This page describes how to integrate Cloud SQL with Vertex AI. This integration lets you apply large language models (LLMs), which are hosted in Vertex AI, to a Cloud SQL for PostgreSQL database, version 12 and later.
By integrating Cloud SQL with Vertex AI, you can apply the semantic and predictive power of machine learning (ML) models to your data. This integration extends the PostgreSQL syntax with two functions for querying models:
- Invoke predictions to call a model using SQL within a transaction.
- Generate embeddings to have an embedding model translate text prompts into numerical vectors. You can then apply these vector embeddings as inputs to pgvector functions. This includes methods to compare and sort samples of text according to their relative semantic distance.
As a result, you can make real-time predictions and gain valuable insights directly within the database, streamlining your workflows and enhancing your decision-making capabilities.
For more information about Vertex AI, see Introduction to Vertex AI.
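Once the integration is enabled, both functions are called from ordinary SQL. The following shell sketch only prints the statements that you might pass to psql; it doesn't contact a database. The embedding function name and the argument order of both functions are assumptions that this page doesn't state; ml_predict_row is the function named in the error messages in the Troubleshoot section, and textembedding-gecko is the model named there. The helper names (sql_quote, embedding_query, predict_query) are hypothetical.

```shell
#!/bin/sh
# Sketch only: prints SQL text, it does not run anything against a database.

# Wraps a string in single quotes for SQL, doubling embedded single quotes.
sql_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# $1: embedding model ID, $2: text to embed
# Assumption: the extension exposes embedding(model_id, content).
embedding_query() {
  printf 'SELECT embedding(%s, %s);\n' "$(sql_quote "$1")" "$(sql_quote "$2")"
}

# $1: Vertex AI endpoint path, $2: JSON request body
# Assumption: ml_predict_row takes the endpoint first, then the request.
predict_query() {
  printf 'SELECT ml_predict_row(%s, %s);\n' "$(sql_quote "$1")" "$(sql_quote "$2")"
}
```

For example, embedding_query textembedding-gecko 'Cloud SQL is a managed database' prints a single SELECT statement that you can review and then pipe to psql.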
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the necessary Google Cloud APIs. Use either the Google Cloud console or the gcloud CLI.
  Console
  - Go to the APIs & Services page.
  - From the projects list, select your project.
  - If the API Library isn't open, then from the navigation menu, select Library.
  - Click the APIs that you want to enable. For this procedure, enable the Cloud SQL Admin API and Vertex AI API.
  - After selecting each API, click Enable.
  gcloud
  - Open Cloud Shell, which provides command-line access to your Google Cloud resources directly from the browser.
  - To enable the required APIs, use the gcloud services enable command:
        gcloud services enable sqladmin.googleapis.com \
          aiplatform.googleapis.com
    This command enables the following APIs:
    - Cloud SQL Admin API
    - Vertex AI API
- Grant the Cloud SQL service account Identity and Access Management (IAM) permissions to access Vertex AI.
  To add Vertex AI permissions to the Cloud SQL service account for the project where the Cloud SQL instance is located, use the gcloud projects add-iam-policy-binding command:
      gcloud projects add-iam-policy-binding PROJECT_ID \
        --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" \
        --role="roles/aiplatform.user"
  Make the following replacements:
  - PROJECT_ID: the ID of the project that has the Vertex AI endpoint. Cloud SQL uses this endpoint to access the LLM that's hosted in Vertex AI.
  - SERVICE_ACCOUNT_EMAIL: the email address of the Cloud SQL service account. To find this email address, use the gcloud sql instances describe INSTANCE_NAME command and replace INSTANCE_NAME with the name of the Cloud SQL instance. The value that appears next to the serviceAccountEmailAddress parameter is the email address.
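The last two prerequisites can be scripted. The following sketch assumes that the gcloud CLI is installed and authenticated, and that the Cloud SQL instance and the Vertex AI endpoint are in the same project. The function names are hypothetical. It reports which required APIs are still disabled and grants the Vertex AI User role to the instance's service account.

```shell
#!/bin/sh
# Sketch: check the required APIs and grant Vertex AI access.

REQUIRED_APIS="sqladmin.googleapis.com aiplatform.googleapis.com"

# Reads enabled service names (one per line) on stdin and prints each
# required API that is not in that list.
missing_apis() (
  enabled=$(cat)
  for api in $REQUIRED_APIS; do
    printf '%s\n' "$enabled" | grep -qxF "$api" || printf '%s\n' "$api"
  done
)

# $1: project ID, $2: instance name
# Looks up the instance's service account, then grants it roles/aiplatform.user.
# Assumption: the instance and the Vertex AI endpoint share one project.
grant_vertex_ai_user() {
  sa=$(gcloud sql instances describe "$2" --project="$1" \
        --format="value(serviceAccountEmailAddress)") || return 1
  gcloud projects add-iam-policy-binding "$1" \
    --member="serviceAccount:${sa}" \
    --role="roles/aiplatform.user"
}

# Example usage (requires gcloud):
#   gcloud services list --enabled --format="value(config.name)" | missing_apis
#   grant_vertex_ai_user PROJECT_ID INSTANCE_NAME
```

If missing_apis prints nothing, both APIs are already enabled.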
Enable database integration with Vertex AI
To enable database integration with Vertex AI, complete the following steps:
- Create or update a Cloud SQL instance so that the instance can integrate with Vertex AI.
Create the instance (gcloud)
To create the Cloud SQL instance, use the gcloud sql instances create command:
    gcloud sql instances create INSTANCE_NAME \
      --database-version=DATABASE_VERSION \
      --tier=MACHINE_TYPE \
      --region=REGION_NAME \
      --enable-google-ml-integration \
      --database-flags cloudsql.enable_google_ml_integration=on
Make the following replacements:
- INSTANCE_NAME: the name of the instance
- DATABASE_VERSION: the database version for the instance (for example, POSTGRES_13)
- MACHINE_TYPE: the machine type for the instance
- REGION_NAME: the region name for the instance
Update the instance (gcloud)
To update the instance, use the gcloud sql instances patch command:
    gcloud sql instances patch INSTANCE_NAME \
      --enable-google-ml-integration \
      --database-flags cloudsql.enable_google_ml_integration=on
If this update modifies a value that requires a restart, then you see a prompt to proceed with the change or cancel.
Create the instance (REST v1)
Use this example to create the instance. For a complete list of parameters for this call, see the instances:insert page. For information about instance settings, including valid values for a region, see Instance settings.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID or project number of the Google Cloud project that contains the instance
- INSTANCE_NAME: the name of the instance
- REGION_NAME: the region name for the instance
- DATABASE_VERSION: enum string of the database version (for example, POSTGRES_13)
- PASSWORD: the password for the root user
- MACHINE_TYPE: enum string of the machine (tier) type, in the form db-custom-[CPUS]-[MEMORY_MBS]
- EDITION_TYPE: your Cloud SQL edition
You must also include the enableGoogleMlIntegration object in the request. Set the following parameters, as required:
- enableGoogleMlIntegration: when this parameter is set to true, Cloud SQL instances can connect to Vertex AI to pass requests for real-time predictions and insights to the AI
- cloudsql.enable_google_ml_integration: when this parameter is set to on, Cloud SQL can integrate with Vertex AI
HTTP method and URL:
    POST https://sqladmin.googleapis.com/v1/projects/PROJECT_ID/instances
Request JSON body:
    {
      "name": "INSTANCE_NAME",
      "region": "REGION_NAME",
      "databaseVersion": "DATABASE_VERSION",
      "rootPassword": "PASSWORD",
      "settings": {
        "tier": "MACHINE_TYPE",
        "edition": "EDITION_TYPE",
        "enableGoogleMlIntegration": true,
        "databaseFlags": [
          {
            "name": "cloudsql.enable_google_ml_integration",
            "value": "on"
          }
        ]
      }
    }
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file named request.json, and execute the following command:
    curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
      -d @request.json \
      "https://sqladmin.googleapis.com/v1/projects/PROJECT_ID/instances"
PowerShell (Windows)
Save the request body in a file named request.json, and execute the following command:
    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }
    Invoke-WebRequest `
      -Method POST `
      -Headers $headers `
      -ContentType: "application/json; charset=utf-8" `
      -InFile request.json `
      -Uri "https://sqladmin.googleapis.com/v1/projects/PROJECT_ID/instances" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
    {
      "kind": "sql#operation",
      "targetLink": "https://sqladmin.googleapis.com/v1/projects/PROJECT_ID/instances/INSTANCE_ID",
      "status": "PENDING",
      "user": "user@example.com",
      "insertTime": "2019-09-25T22:19:33.735Z",
      "operationType": "CREATE",
      "name": "OPERATION_ID",
      "targetId": "INSTANCE_ID",
      "selfLink": "https://sqladmin.googleapis.com/v1/projects/PROJECT_ID/operations/OPERATION_ID",
      "targetProject": "PROJECT_ID"
    }
Update the instance (REST v1)
Use this example to update the instance. For a complete list of parameters for this call, see the instances.patch page.
If this update modifies a value that requires a restart, then you see a prompt to proceed with the change or cancel.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID or project number of the Google Cloud project that contains the instance
- INSTANCE_NAME: the name of the instance
HTTP method and URL:
    PATCH https://sqladmin.googleapis.com/v1/projects/PROJECT_ID/instances/INSTANCE_NAME
Request JSON body:
    {
      "settings": {
        "enableGoogleMlIntegration": true,
        "databaseFlags": [
          {
            "name": "cloudsql.enable_google_ml_integration",
            "value": "on"
          }
        ]
      }
    }
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file named request.json, and execute the following command:
    curl -X PATCH \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
      -d @request.json \
      "https://sqladmin.googleapis.com/v1/projects/PROJECT_ID/instances/INSTANCE_NAME"
PowerShell (Windows)
Save the request body in a file named request.json, and execute the following command:
    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }
    Invoke-WebRequest `
      -Method PATCH `
      -Headers $headers `
      -ContentType: "application/json; charset=utf-8" `
      -InFile request.json `
      -Uri "https://sqladmin.googleapis.com/v1/projects/PROJECT_ID/instances/INSTANCE_NAME" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
    {
      "kind": "sql#operation",
      "targetLink": "https://sqladmin.googleapis.com/v1/projects/PROJECT_ID/instances/INSTANCE_NAME",
      "status": "PENDING",
      "user": "user@example.com",
      "insertTime": "2020-01-16T02:32:12.281Z",
      "operationType": "UPDATE",
      "name": "OPERATION_ID",
      "targetId": "INSTANCE_NAME",
      "selfLink": "https://sqladmin.googleapis.com/v1/projects/PROJECT_ID/operations/OPERATION_ID",
      "targetProject": "PROJECT_ID"
    }
Create the instance (REST v1beta4)
Use this example to create the instance. For a complete list of parameters for this call, see the instances:insert page. For information about instance settings, including valid values for a region, see Instance settings.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID or project number of the Google Cloud project that contains the instance
- INSTANCE_NAME: the name of the instance
- REGION_NAME: the region name for the instance
- DATABASE_VERSION: enum string of the database version (for example, POSTGRES_13)
- PASSWORD: the password for the root user
- MACHINE_TYPE: enum string of the machine (tier) type, in the form db-custom-[CPUS]-[MEMORY_MBS]
- EDITION_TYPE: your Cloud SQL edition
You must also include the enableGoogleMlIntegration object in the request. Set the following parameters, as required:
- enableGoogleMlIntegration: when this parameter is set to true, Cloud SQL instances can connect to Vertex AI to pass requests for real-time predictions and insights to the AI
- cloudsql.enable_google_ml_integration: when this parameter is set to on, Cloud SQL can integrate with Vertex AI
HTTP method and URL:
    POST https://sqladmin.googleapis.com/v1beta4/projects/PROJECT_ID/instances
Request JSON body:
    {
      "name": "INSTANCE_NAME",
      "region": "REGION_NAME",
      "databaseVersion": "DATABASE_VERSION",
      "rootPassword": "PASSWORD",
      "settings": {
        "tier": "MACHINE_TYPE",
        "edition": "EDITION_TYPE",
        "enableGoogleMlIntegration": true,
        "databaseFlags": [
          {
            "name": "cloudsql.enable_google_ml_integration",
            "value": "on"
          }
        ]
      }
    }
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file named request.json, and execute the following command:
    curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
      -d @request.json \
      "https://sqladmin.googleapis.com/v1beta4/projects/PROJECT_ID/instances"
PowerShell (Windows)
Save the request body in a file named request.json, and execute the following command:
    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }
    Invoke-WebRequest `
      -Method POST `
      -Headers $headers `
      -ContentType: "application/json; charset=utf-8" `
      -InFile request.json `
      -Uri "https://sqladmin.googleapis.com/v1beta4/projects/PROJECT_ID/instances" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
    {
      "kind": "sql#operation",
      "targetLink": "https://sqladmin.googleapis.com/v1beta4/projects/PROJECT_ID/instances/INSTANCE_ID",
      "status": "PENDING",
      "user": "user@example.com",
      "insertTime": "2019-09-25T22:19:33.735Z",
      "operationType": "CREATE",
      "name": "OPERATION_ID",
      "targetId": "INSTANCE_ID",
      "selfLink": "https://sqladmin.googleapis.com/v1beta4/projects/PROJECT_ID/operations/OPERATION_ID",
      "targetProject": "PROJECT_ID"
    }
Update the instance (REST v1beta4)
Use this example to update the instance. For a complete list of parameters for this call, see the instances.patch page.
If this update modifies a value that requires a restart, then you see a prompt to proceed with the change or cancel.
Before using any of the request data, make the following replacements:
- PROJECT_ID: the ID or project number of the Google Cloud project that contains the instance
- INSTANCE_NAME: the name of the instance
HTTP method and URL:
    PATCH https://sqladmin.googleapis.com/v1beta4/projects/PROJECT_ID/instances/INSTANCE_NAME
Request JSON body:
    {
      "settings": {
        "enableGoogleMlIntegration": true,
        "databaseFlags": [
          {
            "name": "cloudsql.enable_google_ml_integration",
            "value": "on"
          }
        ]
      }
    }
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file named request.json, and execute the following command:
    curl -X PATCH \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
      -d @request.json \
      "https://sqladmin.googleapis.com/v1beta4/projects/PROJECT_ID/instances/INSTANCE_NAME"
PowerShell (Windows)
Save the request body in a file named request.json, and execute the following command:
    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }
    Invoke-WebRequest `
      -Method PATCH `
      -Headers $headers `
      -ContentType: "application/json; charset=utf-8" `
      -InFile request.json `
      -Uri "https://sqladmin.googleapis.com/v1beta4/projects/PROJECT_ID/instances/INSTANCE_NAME" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
    {
      "kind": "sql#operation",
      "targetLink": "https://sqladmin.googleapis.com/v1beta4/projects/PROJECT_ID/instances/INSTANCE_NAME",
      "status": "PENDING",
      "user": "user@example.com",
      "insertTime": "2020-01-16T02:32:12.281Z",
      "operationType": "UPDATE",
      "name": "OPERATION_ID",
      "targetId": "INSTANCE_NAME",
      "selfLink": "https://sqladmin.googleapis.com/v1beta4/projects/PROJECT_ID/operations/OPERATION_ID",
      "targetProject": "PROJECT_ID"
    }
- Install the google_ml_integration extension in a database of the primary Cloud SQL instance. This database contains data on which you want to run predictions.
  - Connect a psql client to the primary instance, as described in Connect using a psql client.
  - At the psql command prompt, connect to the database:
        \c DB_NAME
    Replace DB_NAME with the name of the database on which you want to install the extension.
  - Install the extension:
        CREATE EXTENSION IF NOT EXISTS google_ml_integration CASCADE;
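The two steps above can be combined into one script. The following is a minimal sketch for an existing instance, assuming that the gcloud CLI and a psql client are installed and that psql can reach the primary instance directly. The function names and the ML_APPLY variable are hypothetical. By default the script only prints each command so that you can review it first.

```shell
#!/bin/sh
# Sketch: enable the integration on an existing instance, then install the
# extension. Commands are only printed unless ML_APPLY=1 is set.

run() {
  if [ "${ML_APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"   # dry run: show the command (quoting is not preserved)
  fi
}

# $1: instance name
enable_ml_integration() {
  # Caution: --database-flags sets the complete flag list for the instance,
  # so also list any flags that are already set on it.
  run gcloud sql instances patch "$1" \
    --enable-google-ml-integration \
    --database-flags cloudsql.enable_google_ml_integration=on
}

# $1: host, $2: user, $3: database
install_extension() {
  run psql -h "$1" -U "$2" -d "$3" \
    -c "CREATE EXTENSION IF NOT EXISTS google_ml_integration CASCADE;"
}

# $1: host, $2: user, $3: database; prints the installed version, if any
extension_version() {
  run psql -h "$1" -U "$2" -d "$3" -At \
    -c "SELECT extversion FROM pg_extension WHERE extname = 'google_ml_integration';"
}
```

Run enable_ml_integration INSTANCE_NAME and install_extension HOST USER DB_NAME once without ML_APPLY to inspect the commands, then again with ML_APPLY=1 to apply them. An empty result from extension_version suggests that the extension isn't installed in that database.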
Troubleshoot
This section contains information about issues associated with integrating Cloud SQL with Vertex AI along with steps for troubleshooting the issues.
Issue | Troubleshooting |
---|---|
Error message: Google ML integration API is supported only on Postgres version 12 or above. | To enable the Vertex AI integration in Cloud SQL, you must have a Cloud SQL for PostgreSQL database, version 12 or later. To upgrade your database to this version, see Upgrade the database major version in-place. |
Error message: Google ML Integration API is not supported on shared core instance. Please upsize your machine type. | If you selected a shared core for the machine type of your instance, then you can't enable the Vertex AI integration in Cloud SQL. Upgrade your machine type to dedicated core. For more information, see Machine Type. |
Error message: Google ML Integration is unsupported for this maintenance version. Please follow https://cloud.google.com/sql/docs/postgres/self-service-maintenance to update the maintenance version of the instance. | To enable the Vertex AI integration in Cloud SQL, the maintenance version of your instance must be R20240130 or later. To upgrade your instance to this version, see Self-service maintenance. |
Error message: Cannot invoke ml_predict_row if 'cloudsql.enable_google_ml_integration' is off. | The cloudsql.enable_google_ml_integration database flag is turned off, so Cloud SQL can't integrate with Vertex AI. To turn this flag on, use the gcloud sql instances patch command: `gcloud sql instances patch INSTANCE_NAME --database-flags cloudsql.enable_google_ml_integration=on`. Replace INSTANCE_NAME with the name of the primary Cloud SQL instance. |
Error message: Failed to connect to remote host: Connection refused. | The integration between Cloud SQL and Vertex AI isn't enabled. To enable this integration, use the gcloud sql instances patch command: `gcloud sql instances patch INSTANCE_NAME --enable-google-ml-integration`. Replace INSTANCE_NAME with the name of the primary Cloud SQL instance. |
Error message: Vertex AI API has not been used in project PROJECT_ID before or it is disabled. Enable it by visiting /apis/api/aiplatform.googleapis.com/overview?project=PROJECT_ID then retry. | The Vertex AI API isn't enabled. For more information on enabling this API, see Before you begin. |
Error message: Permission 'aiplatform.endpoints.predict' denied on resource. | Vertex AI permissions aren't added to the Cloud SQL service account for the project where the Cloud SQL instance is located. For more information on adding these permissions to the service account, see Before you begin. |
Error message: Publisher Model `projects/PROJECT_ID/locations/REGION_NAME/publishers/google/models/MODEL_NAME` not found. | The machine learning model or the LLM doesn't exist in Vertex AI. |
Error message: Resource exhausted: grpc: received message larger than max. | The size of the request that Cloud SQL passes to Vertex AI exceeds the gRPC limit of 4 MB per request. |
Error message: Cloud SQL attempts to send a request to Vertex AI. However, the instance is in the %s region, but the Vertex AI endpoint is in the %s region. Make sure the instance and endpoint are in the same region. | The instance is in one region, but the Vertex AI endpoint is in a different region. To resolve this issue, both the instance and endpoint must be in the same region. |
Error message: The Vertex AI endpoint isn't formatted properly. | The Vertex AI endpoint isn't formatted properly. For more information, see Use private endpoints for online prediction. |
Error message: Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: textembedding-gecko. | The number of requests that Cloud SQL passes to Vertex AI exceeds the limit of 1,500 requests per minute per region per model per project. |
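Several of these errors can be caught before you enable the integration. The following sketch turns the version, machine type, maintenance version, and request size requirements from the table into string checks. The function names are hypothetical, and the checks rest on labeled assumptions: that db-f1-micro and db-g1-small are the shared-core tiers, that the maintenance version contains a token such as R20240130, and that 4 MB means 4 * 1024 * 1024 bytes.

```shell
#!/bin/sh
# Pre-flight checks that mirror requirements in the Troubleshoot table.
# Each function is a pure string check; none of them calls Google Cloud.

# Succeeds when a databaseVersion such as POSTGRES_13 is version 12 or later.
is_supported_version() (
  case "$1" in
    POSTGRES_*) ;;
    *) exit 1 ;;
  esac
  major=${1#POSTGRES_}
  major=${major%%_*}          # POSTGRES_9_6 -> 9
  [ "$major" -ge 12 ] 2>/dev/null
)

# Succeeds unless the tier is a shared-core machine type.
# Assumption: db-f1-micro and db-g1-small are the shared-core tiers.
is_dedicated_core_tier() {
  case "$1" in
    db-f1-micro|db-g1-small) return 1 ;;
    *) return 0 ;;
  esac
}

# Prints the YYYYMMDD part of the maintenance release.
# Assumption: the value contains a token such as R20240130.
maintenance_date() {
  printf '%s\n' "$1" | sed -n 's/.*R\([0-9]\{8\}\).*/\1/p'
}

# Succeeds when the maintenance release is R20240130 or later.
is_supported_maintenance() (
  released=$(maintenance_date "$1")
  [ -n "$released" ] && [ "$released" -ge 20240130 ]
)

# Succeeds when a request body fits in 4 MB (assumed to be 4194304 bytes).
request_fits_grpc_limit() (
  bytes=$(printf '%s' "$1" | wc -c | tr -d '[:space:]')
  [ "$bytes" -le 4194304 ]
)
```

The inputs can be read from gcloud sql instances describe INSTANCE_NAME with --format="value(databaseVersion)", and likewise for settings.tier and maintenanceVersion, assuming that those field names match your instance description.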