To invoke predictions or generate embeddings using a model, register the model endpoint with model endpoint management.
Before you begin
Before you register a model endpoint with model endpoint management, you must enable the google_ml_integration extension and set up authentication based on the model provider, if your model endpoint requires authentication.
Make sure that you access your database with the postgres default username.
Enable the extension
You must add and enable the google_ml_integration extension before you can start using the associated functions. Model endpoint management requires that the google_ml_integration extension is installed.
- Verify that the google_ml_integration.enable_model_support database flag is set to on for an instance. For more information about setting database flags, see Configure an instance's database flags.
- Connect to your database using psql or AlloyDB for PostgreSQL Studio.
- Optional: If the google_ml_integration extension is already installed, alter it to update to the latest version:
  ALTER EXTENSION google_ml_integration UPDATE;
- Add the google_ml_integration extension using psql:
  CREATE EXTENSION IF NOT EXISTS google_ml_integration;
- Optional: Grant permission to a non-super PostgreSQL user to manage model metadata:
  GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA google_ml TO NON_SUPER_USER;
  Replace NON_SUPER_USER with the non-super PostgreSQL username.
- Ensure that outbound IP is enabled to access models hosted outside of your VPC, such as third-party models. For more information, see Add outbound connectivity.
Set up authentication
The following sections show how to set up authentication before registering a model endpoint.
Set up authentication for Vertex AI
To use the Google Vertex AI model endpoints, you must add Vertex AI permissions to the IAM-based AlloyDB service account you use to connect to the database. For more information about integrating with Vertex AI, see Integrate with Vertex AI.
Set up authentication using Secret Manager
This section explains how to set up authentication if you are using Secret Manager to store authentication details for third-party providers.
This step is optional if your model endpoint doesn't handle authentication through Secret Manager—for example, if your model endpoint uses HTTP headers to pass authentication information or doesn't use authentication at all.
To create and use an API key or a bearer token, complete the following steps:
- Create the secret in Secret Manager. For more information, see Create a secret and access a secret version.
  The secret path is used in the google_ml.create_sm_secret() SQL function.
- Grant permissions to the AlloyDB cluster to access the secret.
  gcloud secrets add-iam-policy-binding 'SECRET_NAME' \
      --member="serviceAccount:SERVICE_ACCOUNT_ID" \
      --role="roles/secretmanager.secretAccessor"
  Replace the following:
  - SECRET_NAME: the secret name in Secret Manager.
  - SERVICE_ACCOUNT_ID: the ID of the IAM-based service account in the service-PROJECT_ID@gcp-sa-alloydb.iam.gserviceaccount.com format, for example, service-212340152456@gcp-sa-alloydb.iam.gserviceaccount.com. The command already adds the serviceAccount: prefix.
  You can also grant this role to the service account at the project level. For more information, see Add Identity and Access Management policy binding.
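The two strings used in these steps follow fixed patterns, so they can be assembled programmatically. The following is a minimal client-side sketch in Python, not part of the extension: the function names and the sample project and secret names are hypothetical, and the patterns are copied from the placeholders shown on this page.

```python
def secret_version_path(project_id: str, secret_name: str, version: str) -> str:
    """Build the value passed as secret_path to google_ml.create_sm_secret()."""
    return f"projects/{project_id}/secrets/{secret_name}/versions/{version}"


def alloydb_service_account_member(project_number: str) -> str:
    """Build the --member value for the gcloud IAM policy binding."""
    account = f"service-{project_number}@gcp-sa-alloydb.iam.gserviceaccount.com"
    return f"serviceAccount:{account}"


if __name__ == "__main__":
    # "my-project" and "openai-key" are placeholder names for illustration.
    print(secret_version_path("my-project", "openai-key", "1"))
    print(alloydb_service_account_member("212340152456"))
```

Keeping the path in one helper makes it less likely that the secret name granted in IAM and the secret path registered in the database drift apart.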
Set up authentication using headers
The following example shows how to set up authentication using a function. The function returns a JSON object that contains the headers required to make a request to the embedding model.
CREATE OR REPLACE FUNCTION HEADER_GEN_FUNCTION(
model_id VARCHAR(100),
input_text TEXT
)
RETURNS JSON
LANGUAGE plpgsql
AS $$
#variable_conflict use_variable
DECLARE
api_key VARCHAR(255) := 'API_KEY';
header_json JSON;
BEGIN
header_json := json_build_object(
'Content-Type', 'application/json',
'Authorization', 'Bearer ' || api_key
);
RETURN header_json;
END;
$$;
Replace the following:
- HEADER_GEN_FUNCTION: the name of the header generation function that you can use when registering a model.
- API_KEY: the API key of the model provider.
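To make the shape of the returned headers concrete, here is a small Python mirror of the SQL function above. It is only an illustration of the JSON object the function produces; the Python function name is hypothetical and the key value is a placeholder.

```python
import json


def header_gen(model_id: str, input_text: str, api_key: str = "API_KEY") -> str:
    """Return the same header object as the SQL function, as a JSON string.

    model_id and input_text are accepted to match the SQL signature,
    but, as in the SQL version, they are not used to build the headers.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    return json.dumps(headers)


if __name__ == "__main__":
    print(header_gen("custom-embedding-model", "some text"))
```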
Text embedding models
This section shows how to register model endpoints with model endpoint management.
Model endpoint management supports some text embedding and generic Vertex AI models as pre-registered model endpoints. You can directly use the model ID to generate embeddings or invoke predictions, based on the model type. For more information about supported pre-registered models, see Pre-registered Vertex AI models.
For example, to call the pre-registered textembedding-gecko model, you can directly call the model using the embedding function:
SELECT
  google_ml.embedding(
    model_id => 'textembedding-gecko',
    content => 'AlloyDB is a managed, cloud-hosted SQL database service');
Similarly, to call the pre-registered gemini-1.5-pro:generateContent model, you can directly call the model using the prediction function:
SELECT
  json_array_elements(
    google_ml.predict_row(
      model_id => 'gemini-1.5-pro:generateContent',
      request_body => '{
        "contents": [
          {
            "role": "user",
            "parts": [
              {
                "text": "For TPCH database schema as mentioned here https://www.tpc.org/TPC_Documents_Current_Versions/pdf/TPC-H_v3.0.1.pdf , generate a SQL query to find all supplier names which are located in the India nation."
              }
            ]
          }
        ]
      }'))-> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text';
To generate embeddings, see how to generate embeddings for pre-registered model endpoints. To invoke predictions, see how to invoke predictions for pre-registered model endpoints.
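The prediction query above sends a JSON request body and then walks a fixed path through each element of the response. The following Python sketch shows the same structure on the client side. The helper names are hypothetical, and the sample response element is invented for illustration; it only assumes the candidates, content, parts, text nesting that the SQL path expression relies on.

```python
import json


def build_request_body(prompt: str) -> str:
    """Build the request_body string with a single user turn."""
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    return json.dumps(body)


def extract_text(response_element: dict) -> str:
    """Follow the same path as the SQL expression:
    'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text'."""
    return response_element["candidates"][0]["content"]["parts"][0]["text"]


# Hypothetical response element, shaped only as far as the path above needs.
sample_element = {
    "candidates": [
        {"content": {"parts": [{"text": "SELECT s_name FROM supplier;"}]}}
    ]
}

if __name__ == "__main__":
    print(build_request_body("Generate a SQL query."))
    print(extract_text(sample_element))
```

Building the body with a JSON library rather than by string concatenation avoids quoting mistakes when the prompt itself contains double quotes.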
Text embedding models with built-in support
Model endpoint management provides built-in support for some models by Vertex AI and OpenAI. For the list of models with built-in support, see Models with built-in support.
For models with built-in support, set the model qualified name to the model's qualified name and specify the request URL. Model endpoint management automatically identifies the model and sets up default transform functions.
Vertex AI embedding models
The following steps show how to register Vertex AI models with built-in support. The text-embedding-005 and text-multilingual-embedding-002 model endpoints are used as examples.
- Ensure that both the AlloyDB cluster and the Vertex AI model you are querying are in the same region.
- Call the create model function to add the model endpoint:
  text-embedding-005
  CALL
    google_ml.create_model(
      model_id => 'text-embedding-005',
      model_request_url => 'publishers/google/models/text-embedding-005',
      model_provider => 'google',
      model_qualified_name => 'text-embedding-005',
      model_type => 'text_embedding',
      model_auth_type => 'alloydb_service_agent_iam');
  text-multilingual-embedding-002
  CALL
    google_ml.create_model(
      model_id => 'text-multilingual-embedding-002',
      model_request_url => 'publishers/google/models/text-multilingual-embedding-002',
      model_provider => 'google',
      model_qualified_name => 'text-multilingual-embedding-002',
      model_type => 'text_embedding',
      model_auth_type => 'alloydb_service_agent_iam',
      model_in_transform_fn => 'google_ml.vertexai_text_embedding_input_transform',
      model_out_transform_fn => 'google_ml.vertexai_text_embedding_output_transform');
If the model is stored in a different project and region than your AlloyDB cluster, set the request URL to projects/PROJECT_ID/locations/REGION_ID/publishers/google/models/MODEL_ID, where REGION_ID is the region where your model is hosted and MODEL_ID is the qualified model name.
In addition, grant the Vertex AI User (roles/aiplatform.user) role to the AlloyDB service account of the project where the AlloyDB instance resides so that AlloyDB can access the model hosted in the other project.
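The two request URL forms described above differ only in the project and location prefix. The following Python sketch builds either form; the function name and the sample project and region values are hypothetical, and the URL patterns are taken from the text above.

```python
from typing import Optional


def vertex_request_url(
    model_id: str,
    project_id: Optional[str] = None,
    region_id: Optional[str] = None,
) -> str:
    """Return the model_request_url value for a Vertex AI publisher model.

    With no project and region, return the short form used when the model
    is in the same project and region as the AlloyDB cluster.
    """
    short_form = f"publishers/google/models/{model_id}"
    if project_id is None and region_id is None:
        return short_form
    if project_id is None or region_id is None:
        raise ValueError("project_id and region_id must be given together")
    return f"projects/{project_id}/locations/{region_id}/{short_form}"


if __name__ == "__main__":
    print(vertex_request_url("text-embedding-005"))
    # "other-project" and "us-central1" are placeholder values.
    print(vertex_request_url("text-embedding-005", "other-project", "us-central1"))
```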
OpenAI text embedding model
The google_ml_integration extension automatically sets up default transform functions and invokes calls to the remote OpenAI models. For the list of OpenAI models with built-in support, see Models with built-in support.
The following example adds the text-embedding-ada-002 OpenAI model endpoint. You can register the OpenAI text-embedding-3-small and text-embedding-3-large model endpoints using the same steps and setting the model qualified names specific to the models.
- Connect to your database using psql.
- Create and enable the google_ml_integration extension.
- Add the OpenAI API key as a secret to the Secret Manager for authentication.
- Call the secret stored in the Secret Manager:
  CALL
    google_ml.create_sm_secret(
      secret_id => 'SECRET_ID',
      secret_path => 'projects/PROJECT_ID/secrets/SECRET_MANAGER_SECRET_ID/versions/VERSION_NUMBER');
  Replace the following:
  - SECRET_ID: the secret ID that you set and is subsequently used when registering a model endpoint, for example, key1.
  - SECRET_MANAGER_SECRET_ID: the secret ID set in Secret Manager when you created the secret.
  - PROJECT_ID: the ID of your Google Cloud project.
  - VERSION_NUMBER: the version number of the secret ID.
- Call the create model function to register the text-embedding-ada-002 model endpoint:
  CALL
    google_ml.create_model(
      model_id => 'MODEL_ID',
      model_provider => 'open_ai',
      model_type => 'text_embedding',
      model_qualified_name => 'text-embedding-ada-002',
      model_auth_type => 'secret_manager',
      model_auth_id => 'SECRET_ID');
  Replace the following:
  - MODEL_ID: a unique ID for the model endpoint that you define. This model ID is referenced for metadata that the model endpoint needs to generate embeddings or invoke predictions.
  - SECRET_ID: the secret ID you used earlier in the google_ml.create_sm_secret() procedure.
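The secret registration and the model registration are linked only by the SECRET_ID string, so it can help to generate both statements from the same value. The following Python sketch renders the two CALL statements as text. The helper names and sample values are hypothetical; the quoting simply doubles single quotes, which assumes PostgreSQL's default standard_conforming_strings setting, and a real application might prefer to run the statements through a database driver instead.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def render_call(procedure: str, **named_args: str) -> str:
    """Render CALL procedure(name => 'value', ...) using named notation."""
    args = ", ".join(
        f"{name} => {sql_literal(value)}" for name, value in named_args.items()
    )
    return f"CALL {procedure}({args});"


if __name__ == "__main__":
    secret_id = "key1"  # placeholder; reused in both statements
    print(render_call(
        "google_ml.create_sm_secret",
        secret_id=secret_id,
        secret_path="projects/my-project/secrets/openai-key/versions/1",
    ))
    print(render_call(
        "google_ml.create_model",
        model_id="my-ada-endpoint",
        model_provider="open_ai",
        model_type="text_embedding",
        model_qualified_name="text-embedding-ada-002",
        model_auth_type="secret_manager",
        model_auth_id=secret_id,
    ))
```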
To generate embeddings, see how to generate embeddings for model endpoints with built-in support.
Custom-hosted text embedding model
This section shows how to register a custom-hosted model endpoint along with creating transform functions, and optionally, custom HTTP headers. All custom-hosted model endpoints are supported regardless of where they are hosted.
The following example adds the custom-embedding-model custom model endpoint hosted by Cymbal. The cymbal_text_input_transform and cymbal_text_output_transform transform functions are used to transform the input and output format of the model to the input and output format of the prediction function.
To register custom-hosted text embedding model endpoints, complete the following steps:
- Optional: Add the API key as a secret to the Secret Manager for authentication.
- Call the secret stored in the Secret Manager:
  CALL
    google_ml.create_sm_secret(
      secret_id => 'SECRET_ID',
      secret_path => 'projects/PROJECT_ID/secrets/SECRET_MANAGER_SECRET_ID/versions/VERSION_NUMBER');
  Replace the following:
  - SECRET_ID: the secret ID that you set and is subsequently used when registering a model endpoint, for example, key1.
  - SECRET_MANAGER_SECRET_ID: the secret ID set in Secret Manager when you created the secret.
  - PROJECT_ID: the ID of your Google Cloud project.
  - VERSION_NUMBER: the version number of the secret ID.
- Create the input and output transform functions based on the following signature for the prediction function for text embedding model endpoints. For more information about how to create transform functions, see Transform functions example.
  The following are example transform functions that are specific to the custom-embedding-model text embedding model endpoint:
  -- Input Transform Function corresponding to the custom model endpoint
  CREATE OR REPLACE FUNCTION cymbal_text_input_transform(model_id VARCHAR(100), input_text TEXT)
  RETURNS JSON
  LANGUAGE plpgsql
  AS $$
  DECLARE
    transformed_input JSON;
    model_qualified_name TEXT;
  BEGIN
    -- Wrap the input text in the {"prompt": [...]} shape the endpoint expects
    SELECT json_build_object('prompt', json_build_array(input_text))::JSON INTO transformed_input;
    RETURN transformed_input;
  END;
  $$;
  -- Output Transform Function corresponding to the custom model endpoint
  CREATE OR REPLACE FUNCTION cymbal_text_output_transform(model_id VARCHAR(100), response_json JSON)
  RETURNS REAL[]
  LANGUAGE plpgsql
  AS $$
  DECLARE
    transformed_output REAL[];
  BEGIN
    -- Read the first element of the response array as the embedding vector
    SELECT ARRAY(SELECT json_array_elements_text(response_json->0)) INTO transformed_output;
    RETURN transformed_output;
  END;
  $$;
- Call the create model function to register the custom embedding model endpoint:
  CALL
    google_ml.create_model(
      model_id => 'MODEL_ID',
      model_request_url => 'REQUEST_URL',
      model_provider => 'custom',
      model_type => 'text_embedding',
      model_auth_type => 'secret_manager',
      model_auth_id => 'SECRET_ID',
      model_qualified_name => 'MODEL_QUALIFIED_NAME',
      model_in_transform_fn => 'cymbal_text_input_transform',
      model_out_transform_fn => 'cymbal_text_output_transform');
  Replace the following:
  - MODEL_ID: required. A unique ID for the model endpoint that you define, for example, custom-embedding-model. This model ID is referenced for metadata that the model endpoint needs to generate embeddings or invoke predictions.
  - REQUEST_URL: required. The model-specific endpoint when adding custom text embedding and generic model endpoints, for example, https://cymbal.com/models/text/embeddings/v1. Ensure that the model endpoint is accessible through an internal IP address. Model endpoint management doesn't support public IP addresses.
  - MODEL_QUALIFIED_NAME: required if your model endpoint uses a qualified name. The fully qualified name in case the model endpoint has multiple versions.
  - SECRET_ID: the secret ID you used earlier in the google_ml.create_sm_secret() procedure.
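To see what the two Cymbal transform functions do to the data, the following Python sketch mirrors their logic outside the database. It is an illustration only: the sample response is invented, and it assumes, as the SQL output transform does, that the endpoint returns a JSON array whose first element is the embedding vector.

```python
from typing import List


def cymbal_text_input_transform(model_id: str, input_text: str) -> dict:
    """Mirror of the SQL input transform: wrap the text as {"prompt": [text]}."""
    return {"prompt": [input_text]}


def cymbal_text_output_transform(model_id: str, response_json: list) -> List[float]:
    """Mirror of the SQL output transform: take element 0 of the response
    array and convert each value to a float (REAL[] in the SQL version)."""
    return [float(value) for value in response_json[0]]


if __name__ == "__main__":
    request = cymbal_text_input_transform("custom-embedding-model", "hello")
    print(request)
    # Hypothetical endpoint response: an array holding one embedding vector.
    sample_response = [[0.5, -0.25, 1.0]]
    print(cymbal_text_output_transform("custom-embedding-model", sample_response))
```

If your endpoint nests the vector differently, only the indexing in the output transform needs to change.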
Generic models
This section shows how to register any generic model endpoint that is available on a hosted model provider such as Hugging Face, OpenAI, Vertex AI, Anthropic, or any other provider. This section shows examples of registering a generic model endpoint hosted on Hugging Face, a generic gemini-pro model from Vertex AI Model Garden, and the claude-haiku model endpoint.
You can register any generic model endpoint as long as the input and output are in JSON format. Based on your model endpoint metadata, you might need to generate HTTP headers or define request URLs.
For more information about pre-registered generic models and models with built-in support, see Supported models.
Gemini model
Since some gemini-pro models are pre-registered, you can directly call the model ID to invoke predictions.
The following example uses the gemini-1.5-pro:generateContent model endpoint from the Vertex AI Model Garden.
- Connect to your database using psql.
- Create and enable the google_ml_integration extension.
- Invoke predictions using the pre-registered model ID:
  SELECT
    json_array_elements(
      google_ml.predict_row(
        model_id => 'gemini-1.5-pro:generateContent',
        request_body => '{
          "contents": [
            {
              "role": "user",
              "parts": [
                {
                  "text": "For TPCH database schema as mentioned here https://www.tpc.org/TPC_Documents_Current_Versions/pdf/TPC-H_v3.0.1.pdf , generate a SQL query to find all supplier names which are located in the India nation."
                }
              ]
            }
          ]
        }'))-> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text';
Generic model on Hugging Face
The following example adds the facebook/bart-large-mnli custom classification model endpoint hosted on Hugging Face.
- Connect to your database using psql.
- Create and enable the google_ml_integration extension.
- Add the Hugging Face API key as a secret to the Secret Manager for authentication. If you have already created a secret for any other Hugging Face model, you can reuse the same secret.
- Call the secret stored in the Secret Manager:
  CALL
    google_ml.create_sm_secret(
      secret_id => 'SECRET_ID',
      secret_path => 'projects/PROJECT_ID/secrets/SECRET_MANAGER_SECRET_ID/versions/VERSION_NUMBER');
  Replace the following:
  - SECRET_ID: the secret ID that you set and is subsequently used when registering a model endpoint.
  - SECRET_MANAGER_SECRET_ID: the secret ID set in Secret Manager when you created the secret.
  - PROJECT_ID: the ID of your Google Cloud project.
  - VERSION_NUMBER: the version number of the secret ID.
- Call the create model function to register the facebook/bart-large-mnli model endpoint:
  CALL
    google_ml.create_model(
      model_id => 'MODEL_ID',
      model_provider => 'hugging_face',
      model_request_url => 'REQUEST_URL',
      model_qualified_name => 'MODEL_QUALIFIED_NAME',
      model_auth_type => 'secret_manager',
      model_auth_id => 'SECRET_ID');
  Replace the following:
  - MODEL_ID: a unique ID for the model endpoint that you define, for example, custom-classification-model. This model ID is referenced for metadata that the model endpoint needs to generate embeddings or invoke predictions.
  - REQUEST_URL: the model-specific endpoint when adding custom text embedding and generic model endpoints, for example, https://api-inference.huggingface.co/models/facebook/bart-large-mnli.
  - MODEL_QUALIFIED_NAME: the fully qualified name of the model endpoint version, for example, facebook/bart-large-mnli.
  - SECRET_ID: the secret ID you used earlier in the google_ml.create_sm_secret() procedure.
Anthropic generic model
The following example adds the claude-3-opus-20240229 model endpoint. Model endpoint management provides the header function required for registering Anthropic models.
- Connect to your database using psql.
- Create and enable the google_ml_integration extension.
Secret Manager
- Add the bearer token as a secret to the Secret Manager for authentication.
- Call the secret stored in the Secret Manager:
  CALL
    google_ml.create_sm_secret(
      secret_id => 'SECRET_ID',
      secret_path => 'projects/PROJECT_ID/secrets/SECRET_MANAGER_SECRET_ID/versions/VERSION_NUMBER');
  Replace the following:
  - SECRET_ID: the secret ID that you set and is subsequently used when registering a model endpoint.
  - SECRET_MANAGER_SECRET_ID: the secret ID set in Secret Manager when you created the secret.
  - PROJECT_ID: the ID of your Google Cloud project.
  - VERSION_NUMBER: the version number of the secret ID.
- Call the create model function to register the claude-3-opus-20240229 model endpoint.
  CALL
    google_ml.create_model(
      model_id => 'MODEL_ID',
      model_provider => 'anthropic',
      model_request_url => 'REQUEST_URL',
      model_auth_type => 'secret_manager',
      model_auth_id => 'SECRET_ID',
      generate_headers_fn => 'google_ml.anthropic_claude_header_gen_fn');
  Replace the following:
  - MODEL_ID: a unique ID for the model endpoint that you define, for example, anthropic-opus. This model ID is referenced for metadata that the model endpoint needs to generate embeddings or invoke predictions.
  - REQUEST_URL: the model-specific endpoint when adding custom text embedding and generic model endpoints, for example, https://api.anthropic.com/v1/messages.
Auth header
- Use the google_ml.anthropic_claude_header_gen_fn default header generation function or create a header generation function.
  CREATE OR REPLACE FUNCTION anthropic_sample_header_gen_fn(model_id VARCHAR(100), request_body JSON)
  RETURNS JSON
  LANGUAGE plpgsql
  AS $$
  #variable_conflict use_variable
  BEGIN
    RETURN json_build_object('x-api-key', 'ANTHROPIC_API_KEY', 'anthropic-version', 'ANTHROPIC_VERSION')::JSON;
  END;
  $$;
  Replace the following:
  - ANTHROPIC_API_KEY: the Anthropic API key.
  - ANTHROPIC_VERSION (Optional): the specific Anthropic API version you want to use, for example, 2023-06-01.
- Call the create model function to register the claude-3-opus-20240229 model endpoint.
  CALL
    google_ml.create_model(
      model_id => 'MODEL_ID',
      model_provider => 'anthropic',
      model_request_url => 'REQUEST_URL',
      generate_headers_fn => 'google_ml.anthropic_claude_header_gen_fn');
  Replace the following:
  - MODEL_ID: a unique ID for the model endpoint that you define, for example, anthropic-opus. This model ID is referenced for metadata that the model endpoint needs to generate embeddings or invoke predictions.
  - REQUEST_URL: the model-specific endpoint when adding custom text embedding and generic model endpoints, for example, https://api.anthropic.com/v1/messages.
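For reference, the following Python sketch shows the header object that the sample header generation function returns, together with a minimal request body for the registered endpoint. The header keys come from the function above. The request body shape (model, max_tokens, messages) is an assumption based on Anthropic's Messages API and is not defined on this page, and the helper names are hypothetical.

```python
import json


def anthropic_headers(api_key: str, version: str = "2023-06-01") -> dict:
    """Mirror of anthropic_sample_header_gen_fn: the two headers it returns."""
    return {"x-api-key": api_key, "anthropic-version": version}


def anthropic_request_body(
    prompt: str,
    model: str = "claude-3-opus-20240229",
    max_tokens: int = 256,
) -> str:
    """Build a JSON request body with one user message.

    Assumption: this follows the Anthropic Messages API shape; adjust it
    to whatever your endpoint actually expects.
    """
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(body)


if __name__ == "__main__":
    print(anthropic_headers("ANTHROPIC_API_KEY"))
    print(anthropic_request_body("Summarize AlloyDB in one sentence."))
```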
For more information, see how to invoke predictions for generic model endpoints.
What's next
- Learn about the model endpoint management reference
- Use sample templates for registering model endpoints