Import a CSV file

This page describes how to import a CSV file into an AlloyDB database.

The procedure to perform the import involves these tasks:

  1. Upload the CSV file to a Cloud Storage bucket.

  2. Prepare a client host to perform the import operation.

  3. Import the CSV file into the database.

  4. Clean up the resources created to perform the procedure.

Before you begin

  • You must have the Owner (roles/owner) or Editor (roles/editor) basic IAM role in the Google Cloud project you are using, or you must have these predefined IAM roles:
    • AlloyDB Admin (roles/alloydb.admin) or AlloyDB Viewer (roles/alloydb.viewer)
    • Storage Admin (roles/storage.admin)
    • Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1)

Upload the CSV file

To upload the CSV file, you create a Cloud Storage bucket and then upload the CSV file to that bucket.

  1. Create a standard storage, regional storage bucket in the project and region where your AlloyDB database is located.

  2. Upload the CSV file to the storage bucket you created.

Prepare a client host

To prepare a client host to perform the import operation, you create a Compute Engine VM that can connect to the AlloyDB primary instance where your database is located, and install the psql and gsutil tools on that VM.

  1. Follow the instructions Connect a psql client to an instance to create a Compute Engine VM with the proper connectivity and the psql tool installed. When following these instructions, make sure to allocate enough local storage to the Compute Engine VM to accommodate the CSV file you are importing.

  2. Install the gsutil tool to provide command-line access to the CSV file in the Cloud Storage bucket.

Import the CSV file

To import the CSV file, you get the IP address of the AlloyDB primary instance where your database is located and then use the psql tool to create a table to contain the CSV data and then import it.

  1. Get the IP address of the AlloyDB primary instance where your database is located by viewing its details.
  2. SSH into the Compute Engine VM.

    Console

    1. In the Google Cloud console, go to the VM instances page.

      Go to VM instances

    2. In the list of virtual machine instances, click SSH in the row of the instance you created.

      SSH button next to instance name.

    gcloud

    Use the gcloud compute ssh command to connect to the instance you created.

    gcloud compute ssh --project=PROJECT_ID --zone=ZONE VM_NAME

    Replace the following:

    • PROJECT_ID: The ID of the project that contains the instance.
    • ZONE: The name of the zone in which the instance is located.
    • VM_NAME: The name of the instance.
  3. Copy the CSV file to the client host's local file system:
    gsutil cp gs://BUCKET_NAME/CSV_FILE_NAME .
    
  4. Import the CSV file:
    1. Run the psql client tool and then, at the psql prompt, connect to the database:
      psql -h IP_ADDRESS -U postgres
      \c DB_NAME
      

      You will be prompted to enter the password of the postgres user.

      • IP_ADDRESS: The IP address of the primary instance.
      • DB_NAME: The name of the database you want to store the CSV data in.
    2. Create a table to contain the CSV data; for example:
      CREATE TABLE weather (
        city     varchar(80),
        temp_lo  int,
        temp_hi  int,
        prcp     real,
        date     date
      );
      

      The CREATE TABLE statement above is for illustration purposes only; you create a table whose columns match your CSV data.

    3. Import the data from the CSV file:
    \copy TABLE_NAME(COLUMN_LIST)
      FROM 'CSV_FILE_NAME'
      DELIMITER ','
      CSV HEADER
    ;
    
    • TABLE_NAME: The name of the table you created in the previous step.
    • (COLUMN_LIST): A comma-separated list of the names of the table columns to import data into, in the order the data appears in the CSV file.
    • CSV_FILE_NAME: The CSV file.
    • HEADER: Indication that the CSV file contains a header line to be ignored. If it does not, omit this parameter.

    The \copy command in psql takes the same options and arguments as PostgreSQL's COPY command for controlling the data import operation.

Clean up resources

After successfully importing the CSV file, you can delete the Cloud Storage bucket and delete the Compute Engine VM you used during the import procedure.