To invoke predictions or generate embeddings using a model, register the model with model endpoint management.
For more information about the google_ml.create_model
function, see model endpoint management reference.
Before you begin
Before you register a model with model endpoint management, you must enable the google_ml_integration
extension and set up authentication based on the model provider, if your model requires authentication.
Make sure that you access your database with the postgres
default username.
Enable the extension
You must add and enable the google_ml_integration
extension before you can start using the associated functions. Model endpoint management requires that the google_ml_integration
version 1.3 extension is installed.
Set the
google_ml_integration.enable_model_support
database flag toon
for an instance. For more information about setting database flags, see Configure an instance's database flags.Connect to your database using
psql
.Optional. If the
google_ml_integration
extension is already installed, alter it to update the version to1.3
:ALTER EXTENSION google_ml_integration UPDATE TO '1.3'
Add the
google_ml_integration
version 1.3 extension using psql:CREATE EXTENSION google_ml_integration VERSION '1.3';
Optional. Grant permission to a non-super PostgreSQL user to manage model metadata:
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA google_ml TO NON_SUPER_USER;
Replace
NON_SUPER_USER
with the non-super PostgreSQL username.
Set up authentication
The following sections show how to set up authentication before adding a Vertex AI model or models hosted within Google Cloud.
Set up authentication for Vertex AI
To use the Google Vertex AI models, you must add Vertex AI permissions to the IAM-based AlloyDB service account you use to connect to the database. For more information about integrating with Vertex AI, see Integrate with Vertex AI.
Set up authentication for other model providers
For all models except Vertex AI models, you can store your API keys or bearer tokens in Secret Manager. This step is optional if your model doesn't handle authentication through Secret Manager—for example, if your model uses HTTP headers to pass authentication information or doesn't use authentication at all.
This section explains how to set up authentication if you are using Secret Manager.
To create and use an API key or a bearer token, complete the following steps:
Create the secret in Secret Manager. For more information, see Create a secret and access a secret version.
The secret name and the secret path is used in the
google_ml.create_sm_secret()
SQL function.Download the service account key for the service account in JSON format.
Grant permissions to the AlloyDB cluster to access the secret.
gcloud secrets add-iam-policy-binding 'SECRET_ID' \ --member="serviceAccount:SERVICE_ACCOUNT_ID" \ --role="roles/secretmanager.secretAccessor"
Replace the following:
SECRET_ID
: the secret name in Secret Manager.SERVICE_ACCOUNT_ID
: the ID of the IAM-based service account in theserviceAccount:service-PROJECT_ID@gcp-sa-alloydb.iam.gserviceaccount.com
format—for example,service-my-project@gcp-sa-alloydb.iam.gserviceaccount.com
.You can also grant this role to the service account at the project level. For more information, see Add Identity and Access Management policy binding
Models with built-in support
This section shows how to register models that the model endpoint management provides built-in support for.
Vertex AI embedding models
The model endpoint management provides built-in support for all versions of the
text-embedding-gecko
model by Vertex AI. Use the qualified name to set the model version to either
textembedding-gecko@001
or textembedding-gecko@002
. Since the textembedding-gecko@001
model is pre-registered with model endpoint management, you can generate embeddings using textembedding-gecko@001
as the model ID. For these models, the
extension automatically sets up default transform functions.
To register the textembedding-gecko@002
model version, complete the following steps:
Connect to your database using
psql
.Call the create model function to add the
textembedding-gecko
model:CALL google_ml.create_model( model_id => 'MODEL_ID', model_provider => 'google', model_qualified_name => 'MODEL_QUALIFIED_NAME', model_type => 'text_embedding', model_auth_type => 'alloydb_service_agent_iam');
Replace the following:
MODEL_ID
: a unique ID for the model that you define. This model ID is referenced for metadata that the model needs to generate embeddings or invoke predictions.MODEL_QUALIFIED_NAME
: the fully qualified name of the model version. Set it to eithertextembedding-gecko@001
ortextembedding-gecko@002
.
Custom hosted models
This section shows how to register custom models hosted in networks within Google Cloud. Adding custom hosted text embedding models involves creating transform functions, and optionally, custom HTTP headers.
Adding custom hosted generic models involves optionally generating custom HTTP headers and setting the model request URL.
Before you begin
The following example adds the custom-embedding-model
text embedding model hosted by
Cymbal, which is hosted within Google Cloud. The cymbal_text_input_transform
and cymbal_text_output_transform
transform functions are used to transform the input and output format of the
model to the input and output format of the prediction function.
To register custom-hosted text embedding models, complete the following steps:
- Connect to your database using
psql
. - Create and enable the
google_ml_integration
extension. - Add the API key as a secret to the Secret Manager for authentication.
Call the secret stored in the Secret Manager:
CALL google_ml.create_sm_secret( secret_id => 'SECRET_ID', secret_path => 'projects/project-id/secrets/secret-ID/versions/VERSION_NUMBER');
Replace the following:
SECRET_ID
: the secret name in Secret Manager-for example,key2
.PROJECT_ID
: the ID of your Google Cloud project.VERSION_NUMBER
: the version number of the secret ID.
Create the input and output transform functions:
CREATE OR REPLACE FUNCTION INPUT_TRANSFORM_FUNCTION(model_id VARCHAR(100), input_text TEXT) RETURNS JSON; CREATE OR REPLACE FUNCTION OUTPUT_TRANSFORM_FUNCTION(model_id VARCHAR(100), response_json JSON) RETURNS real[];
Replace the following:
INPUT_TRANSFORM_FUNCTION
: required. The function to transform input of the corresponding prediction function to the model specific input-for example,cymbal_text_input_transform
.OUTPUT_TRANSFORM_FUNCTION
: required. The function to transform model specific output to the prediction function output-for example,cymbal_text_output_transform
.
Call the create model function to register the custom embedding model:
CALL google_ml.create_model( model_id => 'MODEL_ID', model_request_url => 'REQUEST_URL', model_type => 'text_embedding', model_auth_type => 'secret_manager', model_auth_id => 'SECRET_ID', model_qualified_name => 'MODEL_QUALIFIED_NAME', model_in_transform_fn => 'INPUT_TRANSFORM_FUNCTION', model_out_transform_fn => 'OUTPUT_TRANSFORM_FUNCTION');
Replace the following:
MODEL_ID
: required. A unique ID for the model that you define-for examplecustom-embedding-model
. This model ID is referenced for metadata that the model needs to generate embeddings or invoke predictions.REQUEST_URL
: required. The model specific endpoint when adding custom text embedding and generic models-for example,https://cymbal.com/models/text/embeddings/v1
.MODEL_QUALIFIED_NAME
: required if your model uses a qualified name. The fully qualified name in case the model has multiple versions.
Generic models
This section shows how to register a generic
gemini-pro
model from Vertex AI
Model Garden, which doesn't have built-in support. You can register any
generic model that is hosted within Google Cloud.
AlloyDB only supports models that are available through Vertex AI Model Garden and models hosted in networks within Google Cloud.
Gemini model
The following example adds the gemini-1.0-pro
model from the Vertex AI Model Garden.
- Connect to your database using
psql
. - Create and enable the
google_ml_integration
extension. Call the create model function to register the
gemini-1.0-pro
model:CALL google_ml.create_model( model_id => 'MODEL_ID', model_request_url => 'https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-1.0-pro:streamGenerateContent', model_provider => 'google', model_auth_type => 'alloydb_service_agent_iam');
Replace the following:
MODEL_ID
: a unique ID for the model that you define-for example,gemini-1
. This model ID is referenced for metadata that the model needs to generate embeddings or invoke predictions.PROJECT_ID
: the ID of your Google Cloud project.
To invoke predictions, see how to invoke predictions for generic models.
What's next
- Learn about the model endpoint management reference.