Load sample data

This document provides instructions on how to load a small amount of data in the CSV file format into Spanner. You can load sample data before performing a production data migration to test schemas, queries, and applications.

Before you begin

  1. Install the Google Cloud CLI or use Cloud Shell, which has all the necessary tools pre-installed.

  2. To get the permissions that you need to export BigQuery data to Spanner, ask your administrator to grant you the following IAM roles on your project:

    • Export data from a BigQuery table: BigQuery Data Viewer (roles/bigquery.dataViewer)
    • Run an export job: BigQuery User (roles/bigquery.user)
    • Write data to a Spanner table: Spanner Database User (roles/spanner.databaseUser)

    For more information about granting roles, see Manage access to projects, folders, and organizations.

    You might also be able to get the required permissions through custom roles or other predefined roles.

Load sample data to Spanner

The following instructions are performed using the BigQuery reverse ETL workflow and the Google Cloud CLI.

  1. Set a default project on the gcloud CLI using the following command:

     gcloud config set project PROJECT_ID
    
  2. Export the source data in the CSV file format. Consider using pg_dump for PostgreSQL databases or mysqldump for MySQL databases tools to convert your sample data into the CSV file format.

  3. Load the data into BigQuery by using the following bq commands:

    1. Create a BigQuery dataset.

      bq mk BQ_DATASET
      
    2. Batch load the data into a new BigQuery table.

      bq load \
      --source_format=CSV \
      --autodetect \
      --allow_quoted_newlines \
      BQ_DATASET.BQ_TABLE /path/to/file
      

      Alternatively, you can batch load the data from a Cloud Storage file.

      bq load \
      --source_format=CSV \
      --autodetect \
      --allow_quoted_newlines \
      BQ_DATASET.BQ_TABLE gs://BUCKET/FILE
      
  4. Create a Spanner schema that matches the imported BQ_TABLE by using the following command:

     gcloud spanner databases ddl update SPANNER_DATABASE \
     --instance=SPANNER_INSTANCE \
     --ddl="CREATE TABLE SPANNER_TABLE ..."
    

    For more information, see Update Spanner schema.

  5. Export data from BigQuery to Spanner by using the following command:

       bq --use_legacy_sql=false 'EXPORT DATA OPTIONS(
         uri="https://spanner.googleapis.com/projects/PROJECT_ID/instances/SPANNER_INSTANCE/databases/SPANNER_DATABASE"
         format='CLOUD_SPANNER'
         spanner_options="""{ "table": "SPANNER_TABLE" }"""
         ) AS
         SELECT *
         FROM BQ_DATASET.BQ_TABLE;'
     

What's next