Register and call remote AI models in AlloyDB Omni

To invoke predictions or generate embeddings using a model, register the model with model endpoint management.

For more information about the google_ml.create_model function, see model endpoint management reference.

Before you begin

Before you register a model with model endpoint management, you must enable the google_ml_integration extension and set up authentication based on the model provider, if your model requires authentication.

Make sure that you access your database with the default postgres username.

Enable the extension

You must add and enable the google_ml_integration extension before you can start using the associated functions. Model endpoint management requires that the google_ml_integration version 1.3 extension is installed.

  1. Connect to your database using psql.

  2. Optional. If the google_ml_integration extension is already installed, update it to version 1.3:

        ALTER EXTENSION google_ml_integration UPDATE TO '1.3';
    
  3. Add the google_ml_integration version 1.3 extension using psql:

      CREATE EXTENSION google_ml_integration VERSION '1.3';
    
  4. Optional. Grant permission to a non-super PostgreSQL user to manage model metadata:

      GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA google_ml TO NON_SUPER_USER;
    

    Replace NON_SUPER_USER with the non-super PostgreSQL username.

  5. Enable model endpoint management on your database:

      ALTER SYSTEM SET google_ml_integration.enable_model_support=on;
      SELECT pg_reload_conf();
    
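After reloading the configuration, you can confirm that the setting took effect from the same psql session:

```sql
SHOW google_ml_integration.enable_model_support;
```

The output should be `on` if model endpoint management is enabled.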

Set up authentication

The following sections show how to set up authentication before adding a Vertex AI model or models by other providers.

Set up authentication for Vertex AI

To use the Google Vertex AI models, you must add Vertex AI permissions to the service account that you used while installing AlloyDB Omni. For more information, see Configure your AlloyDB Omni installation to query cloud-based models.

Set up authentication for other model providers

For all models except Vertex AI models, you can store your API keys or bearer tokens in Secret Manager. This step is optional if your model doesn't handle authentication through Secret Manager—for example, if your model uses HTTP headers to pass authentication information or doesn't use authentication at all.

This section explains how to set up authentication if you are using Secret Manager.

To create and use an API key or a bearer token, complete the following steps:

  1. Create the secret in Secret Manager. For more information, see Create a secret and access a secret version.

    The secret name and the secret path are used in the google_ml.create_sm_secret() SQL function.

  2. Download the service account key for the service account in JSON format.

  3. Grant permissions to the AlloyDB cluster to access the secret.

      gcloud secrets add-iam-policy-binding 'SECRET_ID' \
          --member="serviceAccount:SERVICE_ACCOUNT_ID" \
          --role="roles/secretmanager.secretAccessor"
    

    Replace the following:

    • SECRET_ID: the secret name in Secret Manager.
    • SERVICE_ACCOUNT_ID: the ID of the service account that you used during AlloyDB Omni installation. This includes the full PROJECT_ID.iam.gserviceaccount.com suffix. For example: my-service@my-project.iam.gserviceaccount.com.

      You can also grant this role to the service account at the project level. For more information, see Add Identity and Access Management policy binding.

Models with built-in support

This section shows how to register models that the model endpoint management provides built-in support for.

Vertex AI embedding models

The model endpoint management provides built-in support for all versions of the textembedding-gecko model by Vertex AI. Set the model version by using the qualified name: either textembedding-gecko@001 or textembedding-gecko@002. Because the textembedding-gecko@001 model is pre-registered with model endpoint management, you can generate embeddings immediately using textembedding-gecko@001 as the model ID. For these models, the extension automatically sets up default transform functions.

For AlloyDB Omni, make sure that you set up AlloyDB Omni to query cloud-based Vertex AI models.

To register the textembedding-gecko@002 model version, complete the following steps:

  1. Connect to your database using psql.

  2. Create and enable the google_ml_integration extension.

  3. Call the create model function to add the textembedding-gecko model:

    CALL
      google_ml.create_model(
        model_id => 'MODEL_ID',
        model_provider => 'google',
        model_qualified_name => 'MODEL_QUALIFIED_NAME',
        model_type => 'text_embedding',
        model_auth_type => 'alloydb_service_agent_iam');
    

    Replace the following:

    • MODEL_ID: a unique ID for the model that you define. This model ID is referenced for metadata that the model needs to generate embeddings or invoke predictions.
    • MODEL_QUALIFIED_NAME: the fully qualified name of the model version. Set it to either textembedding-gecko@001 or textembedding-gecko@002.

    The request URL that the function generates refers to the project associated with the AlloyDB Omni service account. If you want to refer to another project, then ensure that you specify the model_request_url explicitly.
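As a sketch, registering the @002 version under a hypothetical model ID and then generating an embedding with it (assuming Vertex AI access is configured as described earlier) might look like the following:

```sql
CALL
  google_ml.create_model(
    model_id => 'textembedding-gecko@002',
    model_provider => 'google',
    model_qualified_name => 'textembedding-gecko@002',
    model_type => 'text_embedding',
    model_auth_type => 'alloydb_service_agent_iam');

-- Generate an embedding with the newly registered model.
SELECT google_ml.embedding(
    model_id => 'textembedding-gecko@002',
    content => 'AlloyDB is a PostgreSQL-compatible database.');
```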

OpenAI text embedding model

The model endpoint management provides built-in support for the text-embedding-ada-002 model by OpenAI. The google_ml_integration extension automatically sets up default transform functions and invokes calls to the remote model.

The following example adds the text-embedding-ada-002 OpenAI model.

  1. Connect to your database using psql.
  2. Create and enable the google_ml_integration extension.
  3. Add the OpenAI API key as a secret to the Secret Manager for authentication.
  4. Call the secret stored in the Secret Manager:

    CALL
    google_ml.create_sm_secret(
      secret_id => 'SECRET_ID',
      secret_path => 'projects/PROJECT_ID/secrets/SECRET_ID/versions/VERSION_NUMBER');
    

    Replace the following:

    • SECRET_ID: the secret name in Secret Manager, for example, key1.
    • PROJECT_ID: the ID of your Google Cloud project.
    • VERSION_NUMBER: the version number of the secret.
  5. Call the create model function to register the text-embedding-ada-002 model:

    CALL
      google_ml.create_model(
        model_id => 'MODEL_ID',
        model_provider => 'open_ai',
        model_type => 'text_embedding',
        model_qualified_name => 'text-embedding-ada-002',
        model_auth_type => 'secret_manager',
        model_auth_id => 'SECRET_ID');
    

    Replace the following:

    • MODEL_ID: a unique ID for the model that you define. This model ID is referenced for metadata that the model needs to generate embeddings or invoke predictions.
    • SECRET_ID: the secret name in Secret Manager.
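Once the model is registered, you can generate embeddings with the google_ml.embedding() function. A sketch, assuming the hypothetical model ID openai-ada-002 was used during registration:

```sql
SELECT google_ml.embedding(
    model_id => 'openai-ada-002',
    content => 'The quick brown fox jumps over the lazy dog.');
```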

To generate embeddings, see how to generate embeddings for models with built-in support.

Other text embedding models

This section shows how to register any custom hosted text embedding model or text embedding models provided by model hosting providers. Based on your model metadata, you might need to add transform functions, generate HTTP headers, or define endpoints.

Custom hosted text embedding model

This section shows how to register a custom hosted model along with creating transform functions, and optionally, custom HTTP headers. AlloyDB Omni supports all custom hosted models regardless of where they are hosted.


The following example adds the custom-embedding-model custom model hosted by Cymbal. The cymbal_text_input_transform and cymbal_text_output_transform transform functions are used to transform the input and output format of the model to the input and output format of the prediction function.

To register custom-hosted text embedding models, complete the following steps:

  1. Connect to your database using psql.
  2. Create and enable the google_ml_integration extension.
  3. Add the API key as a secret to the Secret Manager for authentication.
  4. Call the secret stored in the Secret Manager:

    CALL
      google_ml.create_sm_secret(
        secret_id => 'SECRET_ID',
        secret_path => 'projects/PROJECT_ID/secrets/SECRET_ID/versions/VERSION_NUMBER');
    

    Replace the following:

    • SECRET_ID: the secret name in Secret Manager, for example, key2.
    • PROJECT_ID: the ID of your Google Cloud project.
    • VERSION_NUMBER: the version number of the secret.
  5. Create the input and output transform functions with the following signatures. The function bodies are specific to your model's request and response formats:

    CREATE OR REPLACE FUNCTION INPUT_TRANSFORM_FUNCTION(model_id VARCHAR(100), input_text TEXT) RETURNS JSON;
    
    CREATE OR REPLACE FUNCTION OUTPUT_TRANSFORM_FUNCTION(model_id VARCHAR(100), response_json JSON) RETURNS real[];
    

    Replace the following:

    • INPUT_TRANSFORM_FUNCTION: required. The function to transform input of the corresponding prediction function to the model-specific input, for example, cymbal_text_input_transform.
    • OUTPUT_TRANSFORM_FUNCTION: required. The function to transform the model-specific output to the prediction function output, for example, cymbal_text_output_transform.
  6. Call the create model function to register the custom embedding model:

    CALL
      google_ml.create_model(
        model_id => 'MODEL_ID',
        model_request_url => 'REQUEST_URL',
        model_type => 'text_embedding',
        model_auth_type => 'secret_manager',
        model_auth_id => 'SECRET_ID',
        model_qualified_name => 'MODEL_QUALIFIED_NAME',
        model_in_transform_fn => 'INPUT_TRANSFORM_FUNCTION',
        model_out_transform_fn => 'OUTPUT_TRANSFORM_FUNCTION');
    

    Replace the following:

    • MODEL_ID: required. A unique ID for the model that you define, for example, custom-embedding-model. This model ID is referenced for metadata that the model needs to generate embeddings or invoke predictions.
    • REQUEST_URL: required. The model-specific endpoint when adding custom text embedding and generic models, for example, https://cymbal.com/models/text/embeddings/v1.
    • MODEL_QUALIFIED_NAME: required if your model uses a qualified name. The fully qualified name in case the model has multiple versions.
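To illustrate, the following is a minimal sketch of what the two transform functions might look like for the fictional Cymbal endpoint, assuming it accepts a request of the form {"prompt": "..."} and returns a response of the form {"embeddings": {"values": [...]}}. Both request and response shapes are hypothetical; adapt the bodies to your model's actual formats:

```sql
-- Hypothetical input transform: wrap the text in the request body
-- format that the fictional Cymbal endpoint expects.
CREATE OR REPLACE FUNCTION cymbal_text_input_transform(model_id VARCHAR(100), input_text TEXT)
RETURNS JSON
AS $$
BEGIN
  RETURN json_build_object('prompt', input_text);
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- Hypothetical output transform: extract the embedding array from the
-- assumed response shape {"embeddings": {"values": [...]}}.
CREATE OR REPLACE FUNCTION cymbal_text_output_transform(model_id VARCHAR(100), response_json JSON)
RETURNS real[]
AS $$
BEGIN
  RETURN ARRAY(
    SELECT json_array_elements_text(response_json -> 'embeddings' -> 'values')
  )::real[];
END;
$$ LANGUAGE plpgsql IMMUTABLE;
```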

OpenAI Text Embedding 3 Small and Large models

You can register the OpenAI text-embedding-3-small and text-embedding-3-large models using the embedding prediction function and transform functions specific to the model. The following example shows how to register the OpenAI text-embedding-3-small model.


To register the text-embedding-3-small embedding model, do the following:

  1. Connect to your database using psql.
  2. Create and enable the google_ml_integration extension.
  3. Add the OpenAI API key as a secret to the Secret Manager for authentication. If you have already created a secret for any other OpenAI model, you can reuse the same secret.
  4. Call the secret stored in the Secret Manager:

    CALL
      google_ml.create_sm_secret(
        secret_id => 'SECRET_ID',
        secret_path => 'projects/PROJECT_ID/secrets/SECRET_ID/versions/VERSION_NUMBER');
    

    Replace the following:

    • SECRET_ID: the secret name in Secret Manager, for example, openai_key.
    • PROJECT_ID: the ID of your Google Cloud project.
    • VERSION_NUMBER: the version number of the secret.
  5. Create the input and output transform functions with the following signatures. The function bodies are specific to your model's request and response formats:

    CREATE OR REPLACE FUNCTION INPUT_TRANSFORM_FUNCTION(model_id VARCHAR(100), input_text TEXT) RETURNS JSON;
    
    CREATE OR REPLACE FUNCTION OUTPUT_TRANSFORM_FUNCTION(model_id VARCHAR(100), response_json JSON) RETURNS real[];
    

    Replace the following:

    • INPUT_TRANSFORM_FUNCTION: the function to transform input of the corresponding prediction function to the model-specific input, for example, google_ml.openai_text_embedding_input_transform.
    • OUTPUT_TRANSFORM_FUNCTION: the function to transform the model-specific output to the prediction function output, for example, google_ml.openai_text_embedding_output_transform.
  6. Call the create model function to register the text-embedding-3-small embedding model:

    CALL
      google_ml.create_model(
        model_id => 'MODEL_ID',
        model_provider => 'open_ai',
        model_type => 'text_embedding',
        model_auth_type => 'secret_manager',
        model_auth_id => 'SECRET_ID',
        model_qualified_name => 'text-embedding-3-small',
        model_in_transform_fn => 'INPUT_TRANSFORM_FUNCTION',
        model_out_transform_fn => 'OUTPUT_TRANSFORM_FUNCTION');
    

    Replace the following:

    • MODEL_ID: a unique ID for the model that you define, for example, openai-te-3-small. This model ID is referenced for metadata that the model needs to generate embeddings or invoke predictions.

To generate embeddings, see how to generate embeddings for other text embedding models.

Generic models

This section shows how to register any generic model that is available on a hosted model provider, such as Hugging Face, OpenAI, Vertex AI, or any other provider. As examples, it registers a generic model hosted on Hugging Face and the generic gemini-pro model from Vertex AI Model Garden, which doesn't have built-in support.

You can register any generic model as long as the input and output is in the JSON format. Based on your model metadata, you might need to generate HTTP headers or define endpoints.

Generic model on Hugging Face

The following example adds the facebook/bart-large-mnli custom classification model hosted on Hugging Face.

  1. Connect to your database using psql.
  2. Create and enable the google_ml_integration extension.
  3. Add the bearer token as a secret to the Secret Manager for authentication.
  4. Call the secret stored in the Secret Manager:

    CALL
      google_ml.create_sm_secret(
        secret_id => 'SECRET_ID',
        secret_path => 'projects/PROJECT_ID/secrets/SECRET_ID/versions/VERSION_NUMBER');
    

    Replace the following:

    • SECRET_ID: the secret name in Secret Manager, for example, key3.
    • PROJECT_ID: the ID of your Google Cloud project.
    • VERSION_NUMBER: the version number of the secret.
  5. Call the create model function to register the facebook/bart-large-mnli model:

    CALL
      google_ml.create_model(
        model_id => 'MODEL_ID',
        model_request_url => 'REQUEST_URL',
        model_qualified_name => 'MODEL_QUALIFIED_NAME',
        model_auth_type => 'secret_manager',
        model_auth_id => 'SECRET_ID');
    

    Replace the following:

    • MODEL_ID: a unique ID for the model that you define, for example, custom-classification-model. This model ID is referenced for metadata that the model needs to generate embeddings or invoke predictions.
    • REQUEST_URL: the model-specific endpoint when adding custom text embedding and generic models, for example, https://api-inference.huggingface.co/models/facebook/bart-large-mnli.
    • MODEL_QUALIFIED_NAME: the fully qualified name of the model version, for example, facebook/bart-large-mnli.
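Once registered, you can invoke the model with the google_ml.predict_row() function, passing a request body in the format that the Hugging Face Inference API expects for zero-shot classification. A sketch, assuming the hypothetical model ID custom-classification-model:

```sql
SELECT google_ml.predict_row(
    model_id => 'custom-classification-model',
    request_body => '{"inputs": "AlloyDB Omni runs anywhere.",
        "parameters": {"candidate_labels": ["positive", "negative", "neutral"]}}');
```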

Gemini model

Ensure that you set up AlloyDB Omni to query cloud-based Vertex AI models.

The following example adds the gemini-1.0-pro model from the Vertex AI Model Garden.

  1. Connect to your database using psql.
  2. Create and enable the google_ml_integration extension.
  3. Call the create model function to register the gemini-1.0-pro model:

    CALL
      google_ml.create_model(
        model_id => 'MODEL_ID',
        model_request_url => 'https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-1.0-pro:streamGenerateContent',
        model_provider => 'google',
        model_auth_type => 'alloydb_service_agent_iam');
    

    Replace the following:

    • MODEL_ID: a unique ID for the model that you define, for example, gemini-1. This model ID is referenced for metadata that the model needs to generate embeddings or invoke predictions.
    • PROJECT_ID: the ID of your Google Cloud project.
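As a sketch, you can then call the registered model with the google_ml.predict_row() function, using the Vertex AI generateContent request format. The model ID gemini-1 here is hypothetical:

```sql
SELECT google_ml.predict_row(
    model_id => 'gemini-1',
    request_body => '{"contents": [{"role": "user",
        "parts": [{"text": "Summarize AlloyDB Omni in one sentence."}]}]}');
```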

To invoke predictions, see how to invoke predictions for generic models.

What's next