This page shows how to invoke online predictions from an AlloyDB for PostgreSQL database.
AlloyDB provides you the ability to get online predictions in
your SQL code by calling the ml_predict_row()
function. For more information about using machine learning (ML) models with AlloyDB, see Build generative AI
applications.
Before you begin
Before you can invoke online predictions from an AlloyDB database, you must prepare your database and select an appropriate ML model.
Prepare your database
Grant permission for database users to execute the
ml_predict_row()
function to run predictions:Connect a
psql
client to the cluster's primary instance, as described in Connect a psql client to an instance.At the psql command prompt, connect to the database and grant permissions:
\c DB_NAME GRANT EXECUTE ON FUNCTION ml_predict_row TO USER_NAME;
Replace the following:
DB_NAME: the name of the database on which the permissions should be granted
USER_NAME: the name of the user for whom the permissions should be granted
Select an ML model
When you call the ml_predict_row()
function, you must specify the location of an ML model.
The model that you specify can be one of these:
A model that's running in the Vertex AI Model Garden.
The
ml_predict_row()
function supports invoking predictions only on tabular or custom models.A Vertex AI model with an active endpoint that you have Identity and Access Management (IAM) permission to access.
AlloyDB doesn't support private endpoints for getting online predictions.
Invoke online predictions
Use the ml_predict_row()
SQL function to invoke online predictions
against your data.
The format of the function's initial argument depends on whether the ML model that you want to use is in the Vertex AI Model Garden or is an endpoint running in a Google Cloud project.
Use a model in the Vertex AI Model Garden
To invoke an online prediction using an ML model that's running in the Vertex AI
Model Garden, use the following syntax for the ml_predict_row()
SQL function:
SELECT ml_predict_row('projects/PROJECT_ID/locations/REGION_ID/publishers/google/models/MODEL_ID', '{ "instances": [ INSTANCES ], "parameters":
PARAMETERS }');
Replace the following:
PROJECT_ID
: the ID of your Google Cloud projectREGION_ID
: the ID of the Google Cloud region that the model is located in—for example,us-central1
for PaLM 2 for TextMODEL_ID
: the ID of the ML model to use—for example,text-bison
for PaLM 2 for TextINSTANCES
: the inputs to the prediction call, in JSON formatPARAMETERS
: the parameters to the prediction call, in JSON format
If the ML model is stored in the same project and region as your AlloyDB cluster, then you can abbreviate this function's first argument:
SELECT ml_predict_row('publishers/google/models/MODEL_ID', '{ "instances": [ INSTANCES ], "parameters":
PARAMETERS }');
For information about the model's JSON response messages, see Generative AI foundational model reference.
For examples, see Example invocations.
Use a Vertex AI model endpoint
To invoke an online prediction using a Vertex AI model endpoint, use the following syntax for the ml_predict_row()
SQL function:
SELECT ml_predict_row('projects/PROJECT_ID/locations/REGION_ID/endpoints/ENDPOINT_ID', '{ "instances": [ INSTANCES ], "parameters":
PARAMETERS }');
Replace the following:
PROJECT_ID
: the ID of the Google Cloud project that the model is located inREGION_ID
: the ID of the Google Cloud region the model is located in—for example,us-central1
ENDPOINT_ID
: the ID of the model endpointINSTANCES
: the inputs to the prediction call, in JSON formatPARAMETERS
: the parameters to the prediction call, in JSON format
If the endpoint is located in the same project and region as your AlloyDB cluster, then you can abbreviate this function's first argument:
SELECT ml_predict_row('endpoints/ENDPOINT_ID', '{ "instances": [ INSTANCES ], "parameters":
PARAMETERS }');
For information about the model's JSON response messages, see PredictResponse.
Example invocations
The following example uses PaLM 2 for Text, available in the Model Garden, to generate text based on a short prompt that is provided as a literal argument
to ml_predict_row()
:
select ML_PREDICT_ROW('projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-bison', '{"instances":[{"prompt": "What are three advantages of using AlloyDB as my SQL database server?"}], "parameters":{"maxOutputTokens":1024, "topK": 40, "topP":0.8, "temperature":0.2}}');
The response is a JSON object. For more information about the format of the object, see Response body.
The next example modifies the previous one in the following ways:
The example uses the contents of the current database's
messages.message
column as input.The example demonstrates the use of the
json_build_object()
function as an aid to formatting the function parameters.
select ML_PREDICT_ROW('projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-bison', json_build_object('instances', json_build_object('prompt', message), 'parameters', json_build_object('maxOutputTokens', 1024,'topK', 40,'topP', 0.8,'temperature', 0.2))) from messages;
The returned JSON object now contains one entry in its predictions
array for every row in the messages
table.
Because the response is a JSON object, you can pull specific fields from it using the PostgreSQL arrow operator:
SELECT ML_PREDICT_ROW('projects/PROJECT_ID/locations/us-central1/publishers/google/models/text-bison',json_build_object('instances', json_build_object('prompt',message), 'parameters', json_build_object('maxOutputTokens', 1024,'topK', 40,'topP', 0.8,'temperature', 0.2)))->'predictions'->0->'content' FROM messages;
For more example arguments to ml_predict_row()
, see Quickstart using the Vertex AI
API.