This page describes how to import data from a CSV file stored in a Cloud Storage bucket into an AlloyDB for PostgreSQL cluster.
You can cancel the import of data into AlloyDB clusters. For more information, see Cancel an import operation.
Before you begin
Before you begin an import operation:
- Ensure that your database has adequate free space.
- Import operations use database resources, but they don't interfere with normal database operations unless the cluster is under-provisioned.
Required roles and permissions for importing to AlloyDB clusters
To import data from Cloud Storage into AlloyDB, the user initiating the import must have one of the following roles:
- The AlloyDB Admin role
- A custom role that includes the following permissions:
  - alloydb.clusters.get
  - alloydb.clusters.import
Additionally, the service account for the AlloyDB cluster must have one of the following roles:
- The storage.objectViewer IAM role
- A custom role that includes the storage.objects.get permission
For help with IAM roles, see Identity and Access Management.
Import data from a CSV file to AlloyDB clusters
- The database and table you are importing into must exist on your AlloyDB cluster. For help with creating a database, see Create a database.
- CSV files must have one line for each row of data and use comma-separated fields.
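For example, a minimal CSV file that meets these requirements for a hypothetical three-column table might look like the following; the file name and values are illustrative only.

```shell
# Create a sample CSV file: one line per row, comma-separated fields.
cat > entries.csv <<'EOF'
1,alpha,10.5
2,beta,20.0
3,gamma,30.25
EOF

# Each line is one table row; count the fields per row.
awk -F',' '{ print NF }' entries.csv   # prints 3 for each of the 3 rows
```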
To import data to an AlloyDB cluster using a CSV file, follow these steps:
gcloud
- Create a Cloud Storage bucket.
- Upload the CSV file to your bucket. For help with uploading files to buckets, see Uploading objects.
Use gcloud storage buckets add-iam-policy-binding to grant the storage.objectViewer IAM role to the AlloyDB service account for the bucket. The service account has the following format:
service-PROJECT_NUMBER@gcp-sa-alloydb.iam.gserviceaccount.com
For help with setting IAM permissions, see Using IAM permissions.
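As a sketch of this step, the following shell snippet builds the service account address and shows the grant command. The project number and bucket name are placeholders; the gcloud command is commented out because it requires an authenticated gcloud session and an existing bucket.

```shell
# Build the AlloyDB service account address for your project.
# PROJECT_NUMBER below is a placeholder; substitute your own project number.
PROJECT_NUMBER="123456789012"
SERVICE_ACCOUNT="service-${PROJECT_NUMBER}@gcp-sa-alloydb.iam.gserviceaccount.com"
echo "${SERVICE_ACCOUNT}"

# Grant the role on the bucket (uncomment to run; BUCKET_NAME is a placeholder):
# gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
#   --member="serviceAccount:${SERVICE_ACCOUNT}" \
#   --role=roles/storage.objectViewer
```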
Import the file:
gcloud alloydb clusters import CLUSTER_NAME \
  --region=REGION \
  --gcs-uri=gs://BUCKET_NAME/FILE_NAME \
  --database=DATABASE_NAME \
  --user=USER \
  --csv \
  --table=TABLE_NAME
Make the following replacements:
- CLUSTER_NAME: the name of the cluster.
- REGION: the region in which the AlloyDB cluster is deployed.
- BUCKET_NAME: the Cloud Storage bucket name.
- FILE_NAME: the name of the CSV file.
- DATABASE_NAME: the name of a database inside the cluster.
- USER: the user for the import operation.
- TABLE_NAME: the table inside the database.
For more information about using the import command, see the alloydb import command reference page.
If you don't need to keep the IAM permissions that you set previously, remove them using gcloud storage buckets remove-iam-policy-binding.
REST v1
- Create a Cloud Storage bucket.
- Upload the CSV file to your bucket. For help with uploading files to buckets, see Uploading objects.
Give the service account permissions to the Cloud Storage bucket for the import operation. Use the service account format to identify the service account for the project you're importing into. The format for the service account is as follows:
service-PROJECT_NUMBER@gcp-sa-alloydb.iam.gserviceaccount.com
Use gcloud storage buckets add-iam-policy-binding to grant the storage.objectViewer IAM role to the AlloyDB cluster service account for the bucket. For help with setting IAM permissions, see Using IAM permissions.
Import the file.
Use the following HTTP method and URL:
POST https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/clusters/CLUSTER_ID:import
Before you use any of the request data, make the following replacements:
- PROJECT_ID: the project ID.
- LOCATION_ID: the region in which the AlloyDB cluster is deployed.
- CLUSTER_ID: the cluster ID.
- BUCKET_NAME: the Cloud Storage bucket name.
- PATH_TO_CSV_FILE: the path to the CSV file.
- USER: the user for the import operation.
- DATABASE_NAME: the name of a database inside the AlloyDB cluster.
- TABLE_NAME: the table inside the database.
- COLUMNS (Optional): the columns to import.
- ESCAPE_CHARACTER (Optional): the character that must appear before a data character that needs to be escaped. The value of this argument must be a character in Hex ASCII Code. For example, 22 represents a double quotation mark.
- QUOTE_CHARACTER (Optional): the character that encloses values from columns that have a string data type. The value of this argument must be a character in Hex ASCII Code. For example, 22 represents a double quotation mark.
- FIELD_DELIMITER (Optional): the character that splits column values. The value of this argument must be a character in Hex ASCII Code. For example, 2C represents a comma.
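If you need to look up the hex ASCII code for a character, you can do it from the shell: POSIX printf treats an argument with a leading single quote as the numeric value of the character that follows it. Note that printf emits lowercase hex; the examples in this page show the codes in uppercase.

```shell
# Print the hex ASCII code of common CSV formatting characters.
# A leading single quote tells printf to use the next character's numeric value.
printf '%x\n' "'\""   # double quotation mark -> 22
printf '%x\n' "','"   # comma                 -> 2c
printf '%x\n' "'\\"   # backslash             -> 5c
```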
Request JSON body:
{
  "gcsUri": "gs://BUCKET_NAME/PATH_TO_CSV_FILE",
  "database": "DATABASE_NAME",
  "user": "USER",
  "csvImportOptions": {
    "table": "TABLE_NAME",
    "columns": ["COLUMN1", "COLUMN2"],
    "fieldDelimiter": "FIELD_DELIMITER",
    "quoteCharacter": "QUOTE_CHARACTER",
    "escapeCharacter": "ESCAPE_CHARACTER"
  }
}
To send your request, expand one of these options:
curl (Linux, macOS, or Cloud Shell)
Save the request body in a file named request.json and execute the following command:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/clusters/CLUSTER_ID:import"
PowerShell (Windows)
Save the request body in a file named request.json and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
  -Method POST `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -InFile request.json `
  -Uri "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/clusters/CLUSTER_ID:import" | Select-Object -Expand Content
You receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.alloydb.v1.OperationMetadata",
    "createTime": "2024-09-17T06:05:31.244428646Z",
    "target": "projects/project-id/locations/location-id/clusters/target-cluster",
    "verb": "import",
    "requestedCancellation": false,
    "apiVersion": "v1"
  },
  "done": false
}
If you don't need to retain the IAM permissions that you set previously, remove them now.
For the complete list of parameters for the request, see clusters:import.
Customize CSV format and import the file
You can customize the CSV format when the CSV file uses a non-default field delimiter, quote character, or escape character. To import a CSV file with a non-default format, you must provide the matching formatting options in the request.
Use the following gcloud CLI and REST API examples to help you customize your CSV file format.
gcloud
gcloud alloydb clusters import CLUSTER_NAME \
  --region=REGION \
  --database=DATABASE_NAME \
  --gcs-uri='gs://BUCKET_NAME/PATH_TO_CSV_FILE' \
  --user=USERNAME \
  --table=TABLE_NAME \
  --columns=COLUMNS \
  --field-delimiter='2C' \
  --quote-character='22' \
  --escape-character='5C' \
  --csv
REST v1
The equivalent REST API request body looks similar to the following:
{
  "gcsUri": "gs://BUCKET_NAME/PATH_TO_CSV_FILE",
  "database": "DATABASE_NAME",
  "user": "USER",
  "csvImportOptions": {
    "table": "TABLE_NAME",
    "columns": ["COLUMN1", "COLUMN2"],
    "escapeCharacter": "5C",
    "quoteCharacter": "22",
    "fieldDelimiter": "2C"
  }
}
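Before sending the request, you can write the body to request.json and confirm locally that it is well-formed JSON; a common mistake is a trailing comma after the last field. The bucket, database, user, table, and column values below are placeholders.

```shell
# Write a sample request body with explicit CSV formatting options.
cat > request.json <<'EOF'
{
  "gcsUri": "gs://example-bucket/data/entries.csv",
  "database": "example_db",
  "user": "postgres",
  "csvImportOptions": {
    "table": "entries",
    "columns": ["id", "name"],
    "escapeCharacter": "5C",
    "quoteCharacter": "22",
    "fieldDelimiter": "2C"
  }
}
EOF

# Fail fast on malformed JSON (for example, a trailing comma)
# before sending the request with curl.
python3 -m json.tool request.json > /dev/null && echo "request.json is valid JSON"
```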
To see how the underlying REST API request is constructed for this task, see the APIs Explorer on the clusters:import page.
If you get an error like INVALID_ARGUMENT, make sure that the table exists. If the table exists, confirm that you have the correct permissions on the bucket.
For help configuring access control in Cloud Storage, see
Create and Manage Access Control Lists.
Check the status of an import operation
To check the status of an import operation, follow these steps:
gcloud
Use the gcloud alloydb operations describe command:
gcloud alloydb operations describe OPERATION_ID --region=REGION
The status of the operation appears in the STATUS field.
You can also list details of a specific operation or cancel a specific operation. For more information about this command, see the gcloud alloydb operations command reference page.
REST v1
Use the GET method and the following URL:

GET https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/operations/OPERATION_ID

For more information, see get.
Before you use any of the request data, make the following replacements:
- REGION: the region in which the AlloyDB cluster is deployed.
- PROJECT_ID: the project ID.
- OPERATION_ID: the ID of the import operation. The ID appears in the name field of the response that's returned when you start the import.
To send your request, expand one of these options:
curl (Linux, macOS, or Cloud Shell)
Execute the following command:
curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/operations/OPERATION_ID"
PowerShell (Windows)
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
  -Method GET `
  -Headers $headers `
  -Uri "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/operations/OPERATION_ID" | Select-Object -Expand Content
If the request is successful, the response body contains an instance of Operation.
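If you save the response body to a file, you can check the done field with a quick grep; the import has finished when it reads true. This sketch uses an illustrative response.json with placeholder project, location, and operation values.

```shell
# A sample long-running Operation response body (illustrative values only).
cat > response.json <<'EOF'
{
  "name": "projects/my-project/locations/us-central1/operations/operation-123",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.alloydb.v1.OperationMetadata",
    "verb": "import"
  },
  "done": false
}
EOF

# Extract the "done" field from the saved response.
grep -o '"done": *[a-z]*' response.json   # prints "done": false
```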
What's next
- Learn how to import a SQL file.
- Learn how to cancel an import operation.
- Check the status of an import operation.