This page describes exporting and importing files into Cloud SQL instances in parallel.
Before you begin
Before you begin an export or import operation:
- Ensure that your database has adequate free space.
- Follow the best practices for exporting and importing data.
- After completing an import operation, verify the results.
Export and import operations use database resources, but they don't interfere with typical database operations unless the instance is under-provisioned.
Export data from Cloud SQL for PostgreSQL to multiple files in parallel
The following sections contain information about exporting data from Cloud SQL for PostgreSQL to multiple files in parallel.
Required roles and permissions for exporting data from Cloud SQL for PostgreSQL to multiple files in parallel
To export data from Cloud SQL into Cloud Storage, the user initiating the export must have one of the following roles:
- The Cloud SQL Editor role
- A custom role,
including the following permissions:
cloudsql.instances.get
cloudsql.instances.export
Additionally, the service account for the Cloud SQL instance must have one of the following roles:
- The
storage.objectAdmin
Identity and Access Management (IAM) role - A custom role, including the following permissions:
storage.objects.create
storage.objects.list
(for exporting files in parallel only)storage.objects.delete
(for exporting files in parallel only)
For help with IAM roles, see Identity and Access Management.
Export data to multiple files in parallel
You can export data in parallel from multiple files that reside in Cloud SQL to Cloud Storage. To do this, use thepg_dump
utility with the --jobs
option.
If you plan to import your data into Cloud SQL, then follow the instructions provided in Exporting data from an external database server so that your files are formatted correctly for Cloud SQL.
gcloud
To export data from Cloud SQL to multiple files in parallel, complete the following steps:
- Create a Cloud Storage bucket.
- To find the service account for the Cloud SQL instance that you're exporting
files from, use the
gcloud sql instances describe
command.gcloud sql instances describe INSTANCE_NAME
- To grant the
storage.objectAdmin
IAM role to the service account, use thegcloud storage buckets add-iam-policy-binding
command. For help with setting IAM permissions, see Use IAM permissions. - To export data from Cloud SQL to multiple files in parallel, use the
gcloud sql export sql
command:gcloud sql export sql INSTANCE_NAME gs://BUCKET_NAME/BUCKET_PATH/FOLDER_NAME \ --offload \ --parallel \ --threads=THREAD_NUMBER \ --database=DATABASE_NAME \ --table=TABLE_EXPRESSION
Make the following replacements:
- INSTANCE_NAME: the name of the Cloud SQL instance from which you're exporting files in parallel.
- BUCKET_NAME: the name of the Cloud Storage bucket.
- BUCKET_PATH: the path to the bucket where the export files are stored.
- FOLDER_NAME: the folder where the export files are stored.
- THREAD_NUMBER: the number of threads that Cloud SQL uses to export files in parallel. For example, if you want to export three files at a time in parallel, then specify
3
as the value for this parameter. - DATABASE_NAME: the name of the database inside of the Cloud SQL instance from which the export is made. You must specify only one database.
- TABLE_EXPRESSION: the tables to export from the specified database.
The
export sql
command doesn't contain triggers or stored procedures, but does contain views. To export triggers or stored procedures, use a single thread for the export. This thread uses thepg_dump
tool.After the export completes, you should have files in a folder in the Cloud Storage bucket in the
pg_dump
directory format. - If you don't need the IAM role that you set in Required roles and permissions for exporting from Cloud SQL for PostgreSQL, then revoke it.
Replace INSTANCE_NAME with the name of your Cloud SQL instance.
In the output, look for the value that's associated with the serviceAccountEmailAddress
field.
REST v1
To export data from Cloud SQL to multiple files in parallel, complete the following steps:
- Create a Cloud Storage bucket:
gcloud storage buckets create gs://BUCKET_NAME --project=PROJECT_NAME --location=LOCATION_NAME
Make the following replacements:- BUCKET_NAME: the name of the bucket, subject to naming requirements. For example,
my-bucket
. - PROJECT_NAME: the name of the Google Cloud project that contains the Cloud Storage bucket you're creating.
- LOCATION_NAME: the location of the bucket where you want to store the files you're exporting. For example,
us-east1
.
- BUCKET_NAME: the name of the bucket, subject to naming requirements. For example,
- Provide your instance with the
legacyBucketWriter
IAM role for your bucket. For help with setting IAM permissions, see Use IAM permissions. -
Export data from Cloud SQL to multiple files in parallel:
Before using any of the request data, make the following replacements:
- PROJECT_NAME: the name of the Google Cloud project that contains the Cloud Storage bucket you created.
- INSTANCE_NAME: the name of the Cloud SQL instance from which you're exporting files in parallel.
- BUCKET_NAME: the name of the Cloud Storage bucket.
- BUCKET_PATH: the path to the bucket where the export files are stored.
- FOLDER_NAME: the folder where the export files are stored.
- DATABASE_NAME: the name of the database inside of the Cloud SQL instance from which the export is made. You must specify only one database.
- THREAD_NUMBER: the number of threads that Cloud SQL uses to export files in parallel. For example, if you want to export three files at a time in parallel, then specify
3
as the value for this parameter.
HTTP method and URL:
POST https://sqladmin.googleapis.com/v1/projects/PROJECT_NAME/instances/INSTANCE_NAME/export
Request JSON body:
{ "exportContext": { "fileType": "SQL", "uri": "gs://BUCKET_NAME/BUCKET_PATH/FOLDER_NAME", "databases": ["DATABASE_NAME"], "offload": [TRUE|FALSE], "sqlExportOptions": { "parallel": [TRUE|FALSE], "threads": [THREAD_NUMBER] } } }
To send your request, expand one of these options:
You should receive a JSON response similar to the following:
- If you don't need the IAM role that you set in Required roles and permissions for exporting from Cloud SQL for PostgreSQL, then revoke it.
After the export completes, you should have files in a folder in the Cloud Storage bucket in the pg_dump
directory format.
REST v1beta4
To export data from Cloud SQL to multiple files in parallel, complete the following steps:
- Create a Cloud Storage bucket:
gcloud storage buckets create gs://BUCKET_NAME --project=PROJECT_NAME --location=LOCATION_NAME
Make the following replacements:- BUCKET_NAME: the name of the bucket, subject to naming requirements. For example,
my-bucket
. - PROJECT_NAME: the name of the Google Cloud project that contains the Cloud Storage bucket you're creating.
- LOCATION_NAME: the location of the bucket where you want to store the files you're exporting. For example,
us-east1
.
- BUCKET_NAME: the name of the bucket, subject to naming requirements. For example,
- Provide your instance with the
storage.objectAdmin
IAM role for your bucket. For help with setting IAM permissions, see Use IAM permissions. -
Export data from Cloud SQL to multiple files in parallel:
Before using any of the request data, make the following replacements:
- PROJECT_NAME: the name of the Google Cloud project that contains the Cloud Storage bucket you created.
- INSTANCE_NAME: the name of the Cloud SQL instance from which you're exporting files in parallel.
- BUCKET_NAME: the name of the Cloud Storage bucket.
- BUCKET_PATH: the path to the bucket where the export files are stored.
- FOLDER_NAME: the folder where the export files are stored.
- DATABASE_NAME: the name of the database inside of the Cloud SQL instance from which the export is made. You must specify only one database.
- THREAD_NUMBER: the number of threads that Cloud SQL uses to export files in parallel. For example, if you want to export three files at a time in parallel, then specify
3
as the value for this parameter.
HTTP method and URL:
POST https://sqladmin.googleapis.com/sql/v1beta4/projects/PROJECT_NAME/instances/INSTANCE_NAME/export
Request JSON body:
{ "exportContext": { "fileType": "SQL", "uri": "gs://BUCKET_NAME/BUCKET_PATH/FOLDER_NAME", "databases": ["DATABASE_NAME"], "offload": [TRUE|FALSE], "sqlExportOptions": { "parallel": [TRUE|FALSE], "threads": [THREAD_NUMBER] } } }
To send your request, expand one of these options:
You should receive a JSON response similar to the following:
- If you don't need the IAM role that you set in Required roles and permissions for exporting from Cloud SQL for PostgreSQL, then revoke it.
After the export completes, you should have files in a folder in the Cloud Storage bucket in the pg_dump
directory format.
Import data from multiple files in parallel to Cloud SQL for PostgreSQL
The following sections contain information about importing data from multiple files in parallel to Cloud SQL for PostgreSQL.
Required roles and permissions for importing data from multiple files in parallel to Cloud SQL for PostgreSQL
To import data from Cloud Storage into Cloud SQL, the user initiating the import must have one of the following roles:
- The Cloud SQL Admin role
- A custom role,
including the following permissions:
cloudsql.instances.get
cloudsql.instances.import
Additionally, the service account for the Cloud SQL instance must have one of the following roles:
- The
storage.objectAdmin
IAM role - A custom role, including the following permissions:
storage.objects.get
storage.objects.list
(for importing files in parallel only)
For help with IAM roles, see Identity and Access Management.
Import data to Cloud SQL for PostgreSQL
You can import data in parallel from multiple files that reside in Cloud Storage to your database. To do this, use the pg_restore
utility with the --jobs
option.
gcloud
To import data from multiple files in parallel into Cloud SQL, complete the following steps:
- Create a Cloud Storage bucket.
Upload the files to your bucket.
For help with uploading files to buckets, see Upload objects from files.
- To find the service account for the Cloud SQL instance that you're importing
files to, use the
gcloud sql instances describe
command.gcloud sql instances describe INSTANCE_NAME
- To grant the
storage.objectAdmin
IAM role to the service account, use thegcloud storage buckets add-iam-policy-binding
utility. For help with setting IAM permissions, see Use IAM permissions. - To import data from multiple files in parallel into Cloud SQL, use the
gcloud sql import sql
command:gcloud sql import sql INSTANCE_NAME gs://BUCKET_NAME/BUCKET_PATH/FOLDER_NAME \ --offload \ --parallel \ --threads=THREAD_NUMBER \ --database=DATABASE_NAME
Make the following replacements:
- INSTANCE_NAME: the name of the Cloud SQL instance to which you're importing files in parallel.
- BUCKET_NAME: the name of the Cloud Storage bucket.
- BUCKET_PATH: the path to the bucket where the import files are stored.
- FOLDER_NAME: the folder where the import files are stored.
- THREAD_NUMBER: the number of threads that Cloud SQL uses to import files in parallel. For example, if you want to import three files at a time in parallel, then specify
3
as the value for this parameter. - DATABASE_NAME: the name of the database inside of the Cloud SQL instance from which the import is made. You must specify only one database.
If the command returns an error like
ERROR_RDBMS
, then review the permissions; this error is often due to permissions issues. - If you don't need the IAM permissions that you
set in Required roles and permissions for importing to Cloud SQL for PostgreSQL, then use
gcloud storage buckets remove-iam-policy-binding
to remove them.
Replace INSTANCE_NAME with the name of your Cloud SQL instance.
In the output, look for the value that's associated with the serviceAccountEmailAddress
field.
REST v1
To import data from multiple files in parallel into Cloud SQL, complete the following steps:
- Create a Cloud Storage bucket:
gcloud storage buckets create gs://BUCKET_NAME --project=PROJECT_NAME --location=LOCATION_NAME
Make the following replacements:- BUCKET_NAME: the name of the bucket, subject to naming requirements. For example,
my-bucket
. - PROJECT_NAME: the name of the Google Cloud project that contains the Cloud Storage bucket you're creating.
- LOCATION_NAME: the location of the bucket where you want to store the files you're importing. For example,
us-east1
.
- BUCKET_NAME: the name of the bucket, subject to naming requirements. For example,
Upload the files to your bucket.
For help with uploading files to buckets, see Upload objects from files.
- Provide your instance with the
storage.objectAdmin
IAM role for your bucket. For help with setting IAM permissions, see Use IAM permissions. Import data from multiple files in parallel into Cloud SQL:
Before using any of the request data, make the following replacements:
- PROJECT_NAME: the name of the Google Cloud project that contains the Cloud Storage bucket you created.
- INSTANCE_NAME: the name of the Cloud SQL instance to which you're importing files in parallel.
- BUCKET_NAME: the name of the Cloud Storage bucket.
- BUCKET_PATH: the path to the bucket where the import files are stored.
- FOLDER_NAME: the folder where the import files are stored.
- DATABASE_NAME: the name of the database inside of the Cloud SQL instance from which the import is made. You must specify only one database.
- THREAD_NUMBER: the number of threads that Cloud SQL uses to import files in parallel. For example, if you want to import three files at a time in parallel, then specify
3
as the value for this parameter.
HTTP method and URL:
POST https://sqladmin.googleapis.com/v1/projects/PROJECT_NAME/instances/INSTANCE_NAME/import
Request JSON body:
{ "importContext": { "fileType": "SQL", "uri": "gs://BUCKET_NAME/BUCKET_PATH/FOLDER_NAME", "databases": ["DATABASE_NAME"], "offload": [TRUE|FALSE], "sqlImportOptions": { "parallel": [TRUE|FALSE], "clean": [TRUE|FALSE], "ifExists": [TRUE|FALSE], "threads": [THREAD_NUMBER] } } }
To send your request, expand one of these options:
You should receive a JSON response similar to the following:
To use a different user for the import, specify the
For the complete list of parameters for the request, see the Cloud SQL Admin API page.importContext.importUser
property.- If you don't need the IAM permissions that you
set in Required roles and permissions for importing to Cloud SQL for PostgreSQL, then use
gcloud storage buckets remove-iam-policy-binding
to remove them.
REST v1beta4
To import data from multiple files in parallel into Cloud SQL, complete the following steps:
- Create a Cloud Storage bucket:
gcloud storage buckets create gs://BUCKET_NAME --project=PROJECT_NAME --location=LOCATION_NAME
Make the following replacements:- BUCKET_NAME: the name of the bucket, subject to naming requirements. For example,
my-bucket
. - PROJECT_NAME: the name of the Google Cloud project that contains the Cloud Storage bucket you're creating.
- LOCATION_NAME: the location of the bucket where you want to store the files you're importing. For example,
us-east1
.
- BUCKET_NAME: the name of the bucket, subject to naming requirements. For example,
Upload the files to your bucket.
For help with uploading files to buckets, see Upload objects from files.
- Provide your instance with the
storage.objectAdmin
IAM role for your bucket. For help with setting IAM permissions, see Use IAM permissions. Import data from multiple files in parallel into Cloud SQL:
Before using any of the request data, make the following replacements:
- PROJECT_NAME: the name of the Google Cloud project that contains the Cloud Storage bucket you created.
- INSTANCE_NAME: the name of the Cloud SQL instance from which you're importing files in parallel.
- BUCKET_NAME: the name of the Cloud Storage bucket.
- BUCKET_PATH: the path to the bucket where the import files are stored.
- FOLDER_NAME: the folder where the import files are stored.
- DATABASE_NAME: the name of the database inside of the Cloud SQL instance from which the import is made. You must specify only one database.
- THREAD_NUMBER: the number of threads that Cloud SQL uses to import files in parallel. For example, if you want to import three files at a time in parallel, then specify
3
as the value for this parameter.
HTTP method and URL:
POST https://sqladmin.googleapis.com/sql/v1beta4/projects/PROJECT_NAME/instances/INSTANCE_NAME/import
Request JSON body:
{ "importContext": { "fileType": "SQL", "uri": "gs://BUCKET_NAME/BUCKET_PATH/FOLDER_NAME", "databases": ["DATABASE_NAME"], "offload": [TRUE|FALSE], "sqlImportOptions": { "parallel": [TRUE|FALSE], "clean": [TRUE|FALSE], "ifExists": [TRUE|FALSE], "threads": [THREAD_NUMBER] } } }
To send your request, expand one of these options:
You should receive a JSON response similar to the following:
To use a different user for the import, specify the
For the complete list of parameters for the request, see the Cloud SQL Admin API page.importContext.importUser
property.- If you don't need the IAM permissions that you
set in Required roles and permissions for importing to Cloud SQL for PostgreSQL, then use
gcloud storage buckets remove-iam-policy-binding
to remove them.
Limitations
- If you specify too many threads when you import or export data from multiple files in parallel, then you might use more memory than your Cloud SQL instance has. If this occurs, then an internal error message appears. Check the memory usage of your instance and increase the instance's size, as needed. For more information, see About instance settings.
- When performing an export, commas in database names or table names in the
databases
ortables
fields aren't supported. - Make sure that you have enough disk space for the initial dump file download. Otherwise, a
no space left on disk
error appears. - If your instance has only one virtual CPU (vCPU), then you can't import or export multiple files in parallel. The number of vCPUs for your instance can't be smaller than the number of threads that you're using for the import or export operation, and the number of threads must be at least two.
- The
pg_dump
utility can't chunk any tables that you export. Therefore, if you have one very large table, then it can become a bottleneck for the speed of the export operation.
What's next
- Learn how to check the status of import and export operations.
- Learn how to cancel the import and export of data.
- Learn about best practices for importing and exporting data.
- Learn about known issues for imports and exports.