Install AlloyDB AI in AlloyDB Omni

This page shows you how to install AlloyDB Omni and integrate AlloyDB AI in it.

AlloyDB AI is a suite of features included with AlloyDB Omni that let you build enterprise generative AI applications. For more information about the ML functionality of AlloyDB, see Build generative AI applications.

AlloyDB Omni with AlloyDB AI lets you query remote ML models to work with online predictions and text embeddings generated from those models. It can also process vector embeddings from other content, such as images, if you use the google_ml.predict_row interface and do the translation yourself in the query.

The following steps apply to AlloyDB Omni running on Kubernetes and managed by the AlloyDB Omni operator.

Configure your AlloyDB Omni instance to query remote models

You can query remote models using model endpoint management by enabling googleMLExtension in your database cluster manifest.

If you also want to query Vertex AI models, you must grant the AlloyDB service account access to Vertex AI, create a Kubernetes secret from the service account key, and reference that secret in the database cluster manifest.

Optional: Add Vertex AI permissions to the AlloyDB service account

To configure AlloyDB Omni to query remote Vertex AI models, follow these steps (a combined example with sample values follows the list):

  1. Create a service account with Google Cloud.

  2. Create a service account key, save it in JSON format as the private-key.json file, and download it.

  3. Store the key in a permanent location on your file system. It resides at this location for the lifetime of your AlloyDB Omni server.

    Note its location on your file system; you need it for the subsequent steps.

  4. Add Vertex AI Identity and Access Management (IAM) permissions to the appropriate project and service account.

       gcloud projects add-iam-policy-binding PROJECT_ID \
           --member="serviceAccount:SERVICE_ACCOUNT_ID" \
           --role="roles/aiplatform.user"
    

    Replace the following:

    • PROJECT_ID: the ID of your Google Cloud project.

    • SERVICE_ACCOUNT_ID: the ID of the service account that you created in the previous step. This includes the full @PROJECT_ID.iam.gserviceaccount.com suffix—for example, my-service@my-project.iam.gserviceaccount.com.
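
If it helps to see the whole flow in one place, the following is a minimal sketch of the preceding steps with sample values. The project ID, service account name, and key path are placeholders for illustration only; substitute your own values.

    # Sample values; replace them with your own project and service account name.
    PROJECT_ID=my-project
    SA_NAME=alloydb-ai-sa
    SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

    # Step 1: create the service account.
    gcloud iam service-accounts create "${SA_NAME}" --project="${PROJECT_ID}"

    # Steps 2 and 3: create a JSON key and store it in a permanent location.
    gcloud iam service-accounts keys create /path/to/keys/private-key.json \
        --iam-account="${SA_EMAIL}"

    # Step 4: grant the Vertex AI user role on the project.
    gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
        --member="serviceAccount:${SA_EMAIL}" \
        --role="roles/aiplatform.user"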

Optional: Create a Kubernetes secret using the service account key

To create a Kubernetes secret based on the service account key downloaded in the preceding steps, run the following command:

   kubectl create secret generic SECRET_NAME \
   --from-file=PATH_TO_SERVICE_ACCOUNT_KEY/private-key.json \
   -n NAMESPACE

Replace the following:

  • SECRET_NAME: the name of the secret used when you create a DBCluster manifest to enable AlloyDB Omni to access AlloyDB AI features. For example, vertex-ai-key-alloydb.

  • PATH_TO_SERVICE_ACCOUNT_KEY: the path to the location where you downloaded the private-key.json service account key.

  • NAMESPACE: the namespace of the database cluster.
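
For example, with the sample secret name from this page and placeholder values for the key path and namespace, the command might look like the following:

   kubectl create secret generic vertex-ai-key-alloydb \
   --from-file=/path/to/keys/private-key.json \
   -n alloydb-ai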

Install the AlloyDB Omni operator

Install the AlloyDB Omni operator by following the steps listed in Install the AlloyDB Omni operator.

Create a database cluster with AlloyDB AI

  1. Create a database cluster with AlloyDB AI.

    Setting enabled to true under the googleMLExtension field lets you query remote models. If you want to query Vertex AI models, also set vertexAIKeyRef to the Kubernetes secret that you created earlier.

        apiVersion: v1
        kind: Secret
        metadata:
          name: db-pw-DB_CLUSTER_NAME
        type: Opaque
        data:
          DB_CLUSTER_NAME: "ENCODED_PASSWORD"
        ---
        apiVersion: alloydbomni.dbadmin.goog/v1
        kind: DBCluster
        metadata:
          name: DB_CLUSTER_NAME
        spec:
          databaseVersion: "15.7.0"
          primarySpec:
            adminUser:
              passwordRef:
                name: db-pw-DB_CLUSTER_NAME
            features:
              googleMLExtension:
                enabled: true
                config:
                  vertexAIKeyRef: VERTEX_AI_SECRET_NAME
                  vertexAIRegion: VERTEX_AI_REGION
            resources:
              cpu: CPU_COUNT
              memory: MEMORY_SIZE
              disks:
              - name: DataDisk
                size: DISK_SIZE
                storageClass: standard
    

    Replace the following:

    • DB_CLUSTER_NAME: the name of this database cluster—for example, my-db-cluster.

    • VERTEX_AI_SECRET_NAME (Optional): the name of the Kubernetes secret, containing the Vertex AI service account key, that you created in the preceding steps. You must set this option if you want to call Vertex AI models.

    • VERTEX_AI_REGION (Optional): the Vertex AI regional endpoint that you want to send your request to—for example, us-west4. The default value is us-central1.

    • ENCODED_PASSWORD: the database login password for the default postgres user role, encoded as a base64 string—for example, Q2hhbmdlTWUxMjM= for ChangeMe123.

    • CPU_COUNT: the number of CPUs available to each database instance in this database cluster.

    • MEMORY_SIZE: the amount of memory per database instance of this database cluster. We recommend setting this to 8 gigabytes per CPU. For example, if you set cpu to 2 earlier in this manifest, then we recommend setting memory to 16Gi.

    • DISK_SIZE: the disk size per database instance—for example, 10Gi.

  2. Apply the manifest. After it's applied, you can check the cluster's status as shown after this list.

    kubectl apply -f DB_CLUSTER_YAML
    

    Replace the following:

    • DB_CLUSTER_YAML: the name of this database cluster manifest file—for example, alloydb-omni-db-cluster.yaml.
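
After the manifest is applied, you can watch the database cluster come up. The commands below assume that the DBCluster resource is served by the alloydbomni.dbadmin.goog API group used in the manifest; adjust the namespace to the one where you created the cluster.

    # Watch the database cluster resource until it reports a ready state.
    kubectl get dbclusters.alloydbomni.dbadmin.goog -n NAMESPACE -w

    # If the cluster doesn't become ready, inspect its conditions and events.
    kubectl describe dbclusters.alloydbomni.dbadmin.goog DB_CLUSTER_NAME -n NAMESPACE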

Verify AlloyDB Omni with AlloyDB AI installation
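
The verification queries below run in a psql session against the new database cluster. If you don't already have a session open, one way to connect from inside the Kubernetes cluster is to open psql in the database pod; the pod name here is a placeholder for whatever pod the AlloyDB Omni operator created for your cluster.

    # Find the pod that runs the database for your cluster.
    kubectl get pods -n NAMESPACE

    # Open an interactive psql session as the postgres user.
    kubectl exec -it DATABASE_POD_NAME -n NAMESPACE -- psql -h localhost -U postgres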

To verify that the installation is successful and that model prediction works, run the following statements:

   CREATE EXTENSION google_ml_integration CASCADE;

   SELECT array_dims(embedding( 'textembedding-gecko@003', 'AlloyDB AI')::real[]);

The output looks similar to the following:

      array_dims
      ------------
      [1:768]
      (1 row) 

In the previous query, the embedding() call generates an embedding for the input text AlloyDB AI, and array_dims returns the dimensions of the array that embedding() returns. Because the pre-registered textembedding-gecko@003 model returns an output with 768 dimensions, the output is [1:768], which indicates a one-dimensional array with 768 elements.

What's next