Import a CSV file

This page describes how to import data from a CSV file stored in a Cloud Storage bucket into an AlloyDB for PostgreSQL cluster.

You can cancel the import of data into AlloyDB clusters. For more information, see Cancel an import operation.

Before you begin

Before you begin an import operation:

  • Ensure that your database has adequate free space.
  • Import operations use database resources, but they don't interfere with normal database operations unless the cluster is under-provisioned.
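
For example, one way to check the current size of the database that you're importing into is from a psql session. This is a minimal sketch; INSTANCE_IP, USER, and DATABASE_NAME are placeholders for your connection values:

# INSTANCE_IP, USER, and DATABASE_NAME are placeholders
psql -h INSTANCE_IP -U USER -d DATABASE_NAME \
    -c "SELECT pg_size_pretty(pg_database_size(current_database()));"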

Required roles and permissions for importing to AlloyDB clusters

To import data from Cloud Storage into AlloyDB, the user initiating the import must have one of the following roles:

  • The AlloyDB Admin role
  • A custom role that includes the following permissions:
    • alloydb.clusters.get
    • alloydb.clusters.import

Additionally, the service account for the AlloyDB cluster must have one of the following roles:

  • The storage.objectViewer IAM role
  • A custom role that includes the following permissions:
    • storage.objects.get

For help with IAM roles, see Identity and Access Management.
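
For example, a minimal sketch of creating such a custom role with the gcloud CLI (csvImportRole is a hypothetical role ID; PROJECT_ID is a placeholder):

# csvImportRole is a hypothetical role ID
gcloud iam roles create csvImportRole --project=PROJECT_ID \
    --permissions=alloydb.clusters.get,alloydb.clusters.import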

Import data from a CSV file to AlloyDB clusters

  • The database and table you are importing into must exist on your AlloyDB cluster. For help with creating a database, see Create a database.
  • CSV files must have one line for each row of data and use comma-separated fields.
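
For example, a file for a hypothetical employees table with id and name columns could contain the following rows:

1,Alice
2,Bob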

To import data to an AlloyDB cluster using a CSV file, follow these steps:

gcloud

  1. Create a Cloud Storage bucket.
  2. Upload the CSV file to your bucket. For help with uploading files to buckets, see Uploading objects.
  3. Use gcloud storage buckets add-iam-policy-binding to grant the storage.objectViewer IAM role for the bucket to the AlloyDB service account:

     service-PROJECT_NUMBER@gcp-sa-alloydb.iam.gserviceaccount.com

    For help with setting IAM permissions, see Using IAM permissions.
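
    For example, the command takes the following form, where BUCKET_NAME and PROJECT_NUMBER are placeholders for your values:

     # BUCKET_NAME and PROJECT_NUMBER are placeholders
     gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
         --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-alloydb.iam.gserviceaccount.com \
         --role=roles/storage.objectViewer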

  4. Import the file:

    gcloud alloydb clusters import CLUSTER_NAME \
        --region=REGION \
        --gcs-uri=gs://BUCKET_NAME/FILE_NAME \
        --database=DATABASE_NAME \
        --user=USER \
        --csv \
        --table=TABLE_NAME

    Make the following replacements:

    • CLUSTER_NAME: the name of the cluster.
    • REGION: the region in which the AlloyDB cluster is deployed.
    • BUCKET_NAME: the Cloud Storage bucket name.
    • FILE_NAME: the name of the CSV file.
    • DATABASE_NAME: the name of a database inside the cluster.
    • USER: the user for the import operation.
    • TABLE_NAME: the table inside the database.

    For more information about using the import command, see the alloydb import command reference page.
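
    For example, with hypothetical values filled in, the command looks similar to this:

     # my-cluster, my-bucket, employees.csv, hr, postgres, and employees are example values
     gcloud alloydb clusters import my-cluster --region=us-central1 \
         --gcs-uri=gs://my-bucket/employees.csv --database=hr \
         --user=postgres --csv --table=employees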

  5. If you don't need to keep the IAM permissions that you set previously, remove them using gcloud storage buckets remove-iam-policy-binding.
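
    For example:

     # Same placeholders as in the granting step
     gcloud storage buckets remove-iam-policy-binding gs://BUCKET_NAME \
         --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-alloydb.iam.gserviceaccount.com \
         --role=roles/storage.objectViewer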

REST v1

  1. Create a Cloud Storage bucket.
  2. Upload the CSV file to your bucket. For help with uploading files to buckets, see Uploading objects.
  3. Give the service account permissions on the Cloud Storage bucket for the import operation. The service account for the project that you're importing into has the following format:

    service-PROJECT_NUMBER@gcp-sa-alloydb.iam.gserviceaccount.com
  4. Use gcloud storage buckets add-iam-policy-binding to grant the storage.objectViewer IAM role for the bucket to the AlloyDB cluster service account. For help with setting IAM permissions, see Using IAM permissions.

  5. Import the file.

    Use the following HTTP method and URL:

    POST https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/clusters/CLUSTER_ID:import

    Before you use any of the request data, make the following replacements:

    • PROJECT_ID: the project ID.
    • LOCATION_ID: the region in which the AlloyDB cluster is deployed.
    • CLUSTER_ID: the cluster ID.
    • BUCKET_NAME: the Cloud Storage bucket name.
    • PATH_TO_CSV_FILE: the path to the CSV file.
    • USER: the user for the import operation.
    • DATABASE_NAME: the name of a database inside the AlloyDB cluster.
    • TABLE_NAME: the table inside the database.
    • COLUMNS (Optional): the columns to import.
    • ESCAPE_CHARACTER (Optional): the character that must appear before a data character that needs to be escaped. Specify the character as its hexadecimal ASCII code; for example, 22 represents a double quotation mark.
    • QUOTE_CHARACTER (Optional): the character that encloses values from columns that have a string data type. Specify the character as its hexadecimal ASCII code; for example, 22 represents a double quotation mark.
    • FIELD_DELIMITER (Optional): the character that splits column values. Specify the character as its hexadecimal ASCII code; for example, 2C represents a comma.

    Request JSON body:

    {
      "gcsUri": "gs://BUCKET_NAME/PATH_TO_CSV_FILE",
      "database": "DATABASE_NAME",
      "user": "USER",
      "csvImportOptions": {
        "table": "TABLE_NAME",
        "columns": ["COLUMN1", "COLUMN2"],
        "fieldDelimiter": "FIELD_DELIMITER",
        "quoteCharacter": "QUOTE_CHARACTER",
        "escapeCharacter": "ESCAPE_CHARACTER"
      }
    }
    

    To send your request, use one of the following options:

    curl (Linux, macOS, or Cloud Shell)

    Save the request body in a file named request.json and execute the following command:

       curl -X POST \
             -H "Authorization: Bearer $(gcloud auth print-access-token)" \
             -H "Content-Type: application/json; charset=utf-8" \
             -d @request.json \
             "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/clusters/CLUSTER_ID:import"
    

    PowerShell (Windows)

    Save the request body in a file named request.json and execute the following command:

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }
    
    Invoke-WebRequest `
      -Method POST `
      -Headers $headers `
      -ContentType: "application/json; charset=utf-8" `
      -InFile request.json `
      -Uri "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/clusters/CLUSTER_ID:import"| Select-Object -Expand Content
    

    You receive a JSON response similar to the following:

    {
     "name": "projects/PROJECT_ID/locations/LOCATION_ID/operations/OPERATION_ID",
     "metadata": {
      "@type": "type.googleapis.com/google.cloud.alloydb.v1.OperationMetadata",
      "createTime": "2024-09-17T06:05:31.244428646Z",
      "target": "projects/project-id/locations/location-id/clusters/target-cluster",
      "verb": "import",
      "requestedCancellation": false,
      "apiVersion": "v1"
     },
     "done": false
    }
    
  6. If you don't need to retain the IAM permissions that you set previously, remove them now.

    For the complete list of parameters for the request, see clusters:import.

Customize CSV format and import the file

You can customize the CSV format when your CSV file uses a non-default field delimiter, quote character, or escape character. To import a CSV file that uses a non-default format, you must provide the matching formatting options in the import request.

Use the following gcloud CLI and REST API examples to help you customize your CSV file format.

gcloud

gcloud alloydb clusters import CLUSTER_NAME \
    --region=REGION \
    --database=DATABASE_NAME \
    --gcs-uri='gs://BUCKET_NAME/PATH_TO_CSV_FILE' \
    --user=USERNAME \
    --table=TABLE_NAME \
    --columns=COLUMNS \
    --field-delimiter='2C' \
    --quote-character='22' \
    --escape-character='5C' \
    --csv
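
In this example, 2C, 22, and 5C are the hexadecimal ASCII codes for a comma (,), a double quotation mark ("), and a backslash (\), respectively.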

REST v1

The equivalent REST API request body looks similar to the following:

{
  "gcsUri": "gs://BUCKET_NAME/PATH_TO_CSV_FILE",
  "database": "DATABASE_NAME",
  "user": "USER",
  "csvImportOptions": {
    "table": "TABLE_NAME",
    "columns": ["COLUMN1", "COLUMN2"],
    "escapeCharacter": "5C",
    "quoteCharacter": "22",
    "fieldDelimiter": "2C"
  }
}

To see how the underlying REST API request is constructed for this task, see the APIs Explorer on the clusters:import page.

If you get an error like INVALID_ARGUMENT, make sure that the table exists. If the table exists, confirm that you have the correct permissions on the bucket. For help configuring access control in Cloud Storage, see Create and Manage Access Control Lists.
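
For example, to confirm that the service account holds the storage.objectViewer role on the bucket, you can inspect the bucket's IAM policy (BUCKET_NAME is a placeholder):

# BUCKET_NAME is a placeholder for your bucket
gcloud storage buckets get-iam-policy gs://BUCKET_NAME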

Check the status of an import operation

To check the status of an import operation, follow these steps:

gcloud

Use the gcloud alloydb operations describe command:

gcloud alloydb operations describe OPERATION_ID --region=REGION

The status of these operations is in the STATUS field.

You can also list details of a specific operation or cancel a specific operation. For more information about this command, see the gcloud alloydb operations command reference page.
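
For example, if you don't have the operation ID, you can list recent operations for the cluster first (CLUSTER_NAME and REGION are placeholders):

# The import operation's ID appears in the command's output
gcloud alloydb operations list --cluster=CLUSTER_NAME --region=REGION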

REST v1

Use the GET method and the following URL:

GET https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/operations/OPERATION_ID

For more information, see get.

Before you use any of the request data, make the following replacements:

  • REGION: the region in which the AlloyDB cluster is deployed.
  • PROJECT_ID: the project ID.
  • OPERATION_ID: the ID of the import operation. The ID appears in the name field of the response that's returned when you start the import.

To send your request, use one of the following options:

curl (Linux, macOS, or Cloud Shell)

Execute the following command:

  curl -X GET \
       -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/operations/OPERATION_ID
 

PowerShell (Windows)

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
  -Method GET `
  -Headers $headers `
  -Uri "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/operations/OPERATION_ID | Select-Object -Expand Content

If the request is successful, the response body contains an instance of Operation.
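
For example, a completed import operation looks similar to the following (the field values shown are illustrative):

{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.alloydb.v1.OperationMetadata",
    "createTime": "2024-09-17T06:05:31.244428646Z",
    "endTime": "2024-09-17T06:07:01.123456789Z",
    "target": "projects/project-id/locations/location-id/clusters/target-cluster",
    "verb": "import",
    "requestedCancellation": false,
    "apiVersion": "v1"
  },
  "done": true
}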

What's next