This page shows how to invoke Vertex AI online predictions from an AlloyDB database, including the steps required to enable this capability.
About invoking online predictions
AlloyDB lets you get Vertex AI online predictions in your SQL code by calling the ml_predict_row function, so that you don't have to manually create a custom integration with Vertex AI.
Vertex AI online prediction is a service that is optimized to run data through hosted models with low latency. You send small batches of data to the service and it returns your predictions in the response.
For the limitations that apply to AlloyDB Vertex AI online predictions, see Vertex AI quotas and limits.
For more information about Vertex AI, see Introduction to Vertex AI.
How to enable and invoke online predictions
After database administrators enable database access to Vertex AI online predictions and grant database users access to predictions, the database users can invoke predictions using the ml_predict_row SQL function.

- Enable database access to predictions.

  To enable AlloyDB database access to Vertex AI online predictions, you grant the AlloyDB service agent IAM permissions to access Vertex AI, and then you create the google_ml_integration extension in the database containing the data you want to run predictions on.

- Grant database users access to predictions.

  In the database where you've created the extension, grant the database users who are going to run predictions the right to execute the ml_predict_row function.

- Invoke predictions using SQL.

  Authorized database users invoke the ml_predict_row SQL function to run Vertex AI online predictions against database data.
Enable database access to predictions
To enable AlloyDB database access to Vertex AI online predictions,
you grant the AlloyDB service agent IAM permissions to access
Vertex AI, and then you create the google_ml_integration
extension
in the database containing the data you want to run predictions on.
- Add Vertex AI permissions to the AlloyDB service agent
for the project where the AlloyDB database's cluster is located:
Console
- Get the project number of the project that has AlloyDB clusters or instances by following the instructions available in Identifying projects.
- In the Google Cloud console, go to the IAM page.
- Select the project that has Vertex AI endpoints.
- Enable the Include Google-provided role grants checkbox.
- Click Add.
- In the New principals field, enter:
service-PROJECT_NUMBER@gcp-sa-alloydb.iam.gserviceaccount.com
where PROJECT_NUMBER is the project number that you got earlier.
- In the Role field, enter Vertex AI User.
- Click Save.
gcloud
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-alloydb.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"

Replace the following:
- PROJECT_ID: The ID of the project that has the Vertex AI endpoint.
- PROJECT_NUMBER: The project number of the project that has AlloyDB clusters or instances.
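To confirm that the binding took effect, one option (a sketch; it assumes the gcloud CLI is installed and authenticated against the project) is to list the roles granted to the service agent:

```shell
# List the role bindings held by the AlloyDB service agent on the project.
# PROJECT_ID and PROJECT_NUMBER are the same placeholders as above.
gcloud projects get-iam-policy PROJECT_ID \
    --flatten="bindings[].members" \
    --filter="bindings.members:service-PROJECT_NUMBER@gcp-sa-alloydb.iam.gserviceaccount.com" \
    --format="value(bindings.role)"
```

The output should include roles/aiplatform.user once the binding has propagated.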
- Create the google_ml_integration extension in the database containing the data you want to run predictions on:
  - Connect a psql client to the cluster's primary instance, as described in Connect a psql client to an instance.
  - At the psql command prompt, connect to the database and create the extension:

        \c DB_NAME
        CREATE EXTENSION IF NOT EXISTS google_ml_integration;

    DB_NAME: The name of the database on which the extension should be created.
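To confirm that the extension was created, you can query the standard PostgreSQL catalog while still connected to the database; this is a minimal check, not part of the required setup:

```sql
-- Check that the google_ml_integration extension is installed,
-- and show its installed version.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'google_ml_integration';
```

The query returns one row when the extension exists, and no rows otherwise.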
Grant database users access to invoke predictions
Grant database users permission to execute the ml_predict_row function so that they can run predictions:

- Connect a psql client to the cluster's primary instance, as described in Connect a psql client to an instance.
- At the psql command prompt, connect to the database and grant permissions:

      \c DB_NAME
      GRANT EXECUTE ON FUNCTION ml_predict_row TO USER_NAME;

  Replace the following:
  - DB_NAME: The name of the database in which to grant the permissions.
  - USER_NAME: The name of the user to grant the permissions to.
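As a quick sanity check, you can confirm the grant with PostgreSQL's built-in has_function_privilege function. This is a sketch: the (text, json) signature used here is an assumption based on the function syntax shown in the next section.

```sql
-- Returns true if USER_NAME can execute ml_predict_row.
-- The (text, json) argument list is assumed from the documented syntax.
SELECT has_function_privilege(
  'USER_NAME',
  'ml_predict_row(text, json)',
  'EXECUTE'
);
```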
Invoke predictions using SQL
Call the ml_predict_row SQL function to run Vertex AI online predictions against database data. The function has the following syntax:
ML_PREDICT_ROW (MODEL_ENDPOINT TEXT, ARGS JSON) RETURNS JSON
- MODEL_ENDPOINT: The Vertex AI online prediction endpoint to send the request to, specified in this format:

      projects/PROJECT_ID/locations/REGION_ID/endpoints/ENDPOINT_NUMBER

  For more information about this endpoint format, see Path parameters.
- ARGS: The content to run the prediction on, along with additional prediction-specific parameters, specified as a JSON request in this format:

      {
        "instances": [ value ],
        "parameters": value
      }

  For more information about this request format, see Online predictions with the API.
- RETURN VALUE: The predictions, as a JSON response in this format:

      {
        "predictions": [ value ],
        "model": string,
        "modelDisplayName": string
      }

  For more information about this format, see Response body.
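For example, a call against a hypothetical endpoint might look like the following. This is a sketch, not a definitive invocation: the project, region, endpoint number, and feature names are placeholders, and the shape of the instances payload depends entirely on what your deployed model expects.

```sql
-- Run a prediction for one input and extract the first prediction
-- from the JSON response. All identifiers here are placeholders.
SELECT ml_predict_row(
  'projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_NUMBER',
  '{ "instances": [ { "feature_1": 0.5, "feature_2": "abc" } ] }'
) -> 'predictions' -> 0 AS prediction;
```

Because the function returns JSON, you can use PostgreSQL's JSON operators (such as -> and ->>) to pull individual fields out of the response directly in SQL.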