Best practices for importing and exporting
The following are best practices to consider when importing and exporting data:
- Don't use Cloud Storage Requester Pays buckets
- Compress data to reduce cost
- Reduce long-running import and export processes
- Use the bcp utility for importing and exporting data
- Use bulk insert for importing data
- Use SqlPackage for importing and exporting data
- Use striped import and export
- Verify the imported database
Don't use Cloud Storage Requester Pays buckets
You cannot use a Cloud Storage bucket that has Requester Pays enabled for imports and exports from Cloud SQL.
Compress data to reduce cost
Cloud SQL supports importing and exporting both compressed and uncompressed files. Compression can save significant storage space on Cloud Storage and reduce your storage costs, especially when you are exporting large instances.
When you export a BAK file, use a .gz file extension to compress the data. When you import a file with an extension of .gz, it is decompressed automatically.
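As a rough illustration of preparing a compressed file for import, the following Python sketch gzip-compresses a local file before you upload it to Cloud Storage. The function name and file names are hypothetical, and the upload step itself is left out.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(source: Path) -> Path:
    """Write a gzip-compressed copy of source next to it and return its path."""
    # Appending ".gz" keeps the original extension visible, e.g. mydb.bak.gz.
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        # Stream in chunks so a large backup file doesn't have to fit in memory.
        shutil.copyfileobj(src, dst)
    return target
```

For example, `gzip_file(Path("mydb.bak"))` writes `mydb.bak.gz` alongside the original file.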
Reduce long-running import and export processes
Imports into Cloud SQL and exports out of Cloud SQL can take a long time to complete, depending on the size of the data being processed. This can have the following impacts:
- You can't stop a long-running Cloud SQL instance operation.
- You can perform only one import or export operation at a time for each instance, and a long-running import or export blocks other operations, such as daily automated backups.
You can decrease the amount of time it takes to complete each operation by using the Cloud SQL import or export functionality with smaller batches of data.
For whole database migrations, you generally should use BAK files rather than SQL files for imports. Generally, importing from a SQL file takes much longer than importing from a BAK file.
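One possible way to work with smaller batches, when the data is loaded from flat files (for example with bcp or bulk insert), is to split a large file into several smaller ones and load them one at a time. The Python sketch below assumes a CSV file without a header row; the function name and the part-file naming scheme are illustrative only.

```python
import csv
from pathlib import Path
from typing import List


def split_csv(source: Path, rows_per_file: int) -> List[Path]:
    """Split source into numbered part files of at most rows_per_file rows."""
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    parts: List[Path] = []
    out = None
    writer = None
    with source.open(newline="") as src:
        for index, row in enumerate(csv.reader(src)):
            if index % rows_per_file == 0:
                # Start a new part file every rows_per_file rows.
                if out:
                    out.close()
                part = source.with_name(
                    f"{source.stem}.part{len(parts):04d}{source.suffix}"
                )
                out = part.open("w", newline="")
                writer = csv.writer(out)
                parts.append(part)
            writer.writerow(row)
    if out:
        out.close()
    return parts
```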
Use SqlPackage for importing and exporting data
You can import and export data in Cloud SQL by using SqlPackage. It enables you to export a SQL database, including database schema and user data, to a BACPAC file (.bacpac) and to import the schema and table data from a BACPAC file into a new user database.
SqlPackage uses your credentials to connect to SQL Server to perform database imports and exports. It makes migrations available for all Cloud SQL users. To perform import and export operations, you must have the following:
- A workstation that is connected to your instance, where you can run SqlPackage. To learn more about connectivity options, see About connection options.
- SqlPackage installed on your system. To learn more about downloading and installing SqlPackage, see the Microsoft documentation.
- Credentials set up to access your instance. To learn more about setting up credentials, see How to authenticate to Cloud SQL.
Examples
Import
To import data into the AdventureWorks2017 database, run the following command:
c:\Program Files\Microsoft SQL Server\160\DAC\bin>SqlPackage /Action:Import /tsn:myTargetServer /tdn:AdventureWorks2017 /tu:myUsername /sf:mySourceFile /TargetTrustServerCertificate:True /tp:myPassword
Here,
- mySourceFile is a source file that you want to use as the source of action from local storage. If you use this parameter, no other source parameter is valid.
- myTargetServer is the name of the server hosting the target database.
- myUsername is the SQL Server username that you want to use to access the target database.
- myPassword is your password in the credentials.
To learn more, see the Microsoft documentation.
Export
To export data from the AdventureWorks2017 database, run the following command:
c:\Program Files\Microsoft SQL Server\160\DAC\bin>SqlPackage /Action:Export /TargetFile:"myTargetFile" /ssn:mySourceServer /su:myUsername /sdn:AdventureWorks2017 /SourceTrustServerCertificate:True /sp:myPassword
Here,
- myTargetFile is the target file (a .bacpac file) that you want to use as the target of action instead of a database. If you use this parameter, no other target parameter is valid. This parameter is invalid for actions that only support database targets.
- myUsername is the SQL Server username that you want to use to access the source database.
- mySourceServer is the name of the server hosting the source database.
- myPassword is your password in the credentials.
To learn more, see the Microsoft documentation.
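If you script these commands, a small helper can assemble the arguments so that the import and export variants stay consistent. The following Python sketch only builds the argument list, using the same switches as the commands above; the function name is hypothetical, and how you supply the password (for example, from an environment variable) is up to you.

```python
from typing import List


def sqlpackage_args(action: str, server: str, database: str,
                    user: str, password: str, file_path: str) -> List[str]:
    """Return a SqlPackage argument list for an Import or Export action."""
    if action == "Import":
        # Import reads a local BACPAC file and writes to the target database.
        return [
            "SqlPackage", "/Action:Import",
            f"/tsn:{server}", f"/tdn:{database}", f"/tu:{user}",
            f"/sf:{file_path}", "/TargetTrustServerCertificate:True",
            f"/tp:{password}",
        ]
    if action == "Export":
        # Export reads the source database and writes a local BACPAC file.
        return [
            "SqlPackage", "/Action:Export", f"/TargetFile:{file_path}",
            f"/ssn:{server}", f"/su:{user}", f"/sdn:{database}",
            "/SourceTrustServerCertificate:True", f"/sp:{password}",
        ]
    raise ValueError(f"unsupported action: {action}")
```

You could pass the returned list to `subprocess.run(args, check=True)` on a workstation where SqlPackage is on the PATH. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.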
Use the bcp utility for importing and exporting data
Another option to import and export data in Cloud SQL is using the bulk copy program (bcp) utility. By using the bcp utility, you can export data from a SQL Server database into a data file and import data from a data file into a SQL Server database. The bcp utility uses your credentials to connect to SQL Server to perform database imports and exports. It makes transfers available for all Cloud SQL users. To perform import and export operations, you must have the following:
- A workstation where you can run the bcp utility, and that has connectivity to your Cloud SQL instance. To learn more about connectivity options, see About connection options.
- The bcp utility installed on your system. To learn more about downloading and installing bcp, see the Microsoft documentation.
- Credentials set up to access your instance. To learn more about setting up credentials, see How to authenticate to Cloud SQL.
Examples
Import
To import data from the person.csv file to the Person table of the AdventureWorks2017 database, run the following command:
bcp Person.Person in "person.csv" -d AdventureWorks2017 -U myLoginID -S myServer
Here,
- myLoginID is the login ID used to connect to SQL Server.
- myServer is the instance of SQL Server to which you want to connect. If you don't specify a server, the bcp utility connects to the default instance of SQL Server on the local computer.
To learn more, see the Microsoft documentation.
Export
To export data from the Person table of the AdventureWorks2017 database to the person.dat file, run the following command:
bcp Person.Person out "person.dat" -U myLoginID -S myServer -d AdventureWorks2017
Here,
- myLoginID is the login ID used to connect to SQL Server.
- myServer is the instance of SQL Server to which you want to connect. If you don't specify a server, the bcp utility connects to the default instance of SQL Server on the local computer.
To learn more, see the Microsoft documentation.
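The two bcp commands differ only in direction and file name, so a script can build both from one helper. This Python sketch is a minimal illustration that returns the argument list; the function name and the `extra` parameter are hypothetical. The example in the usage note passes `-c` (character mode) and `-t` (field terminator), which are standard bcp options, but confirm the options you need in the Microsoft documentation.

```python
from typing import List, Sequence


def bcp_args(table: str, direction: str, data_file: str, database: str,
             login: str, server: str, extra: Sequence[str] = ()) -> List[str]:
    """Return a bcp argument list for an import ("in") or export ("out")."""
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    return [
        "bcp", table, direction, data_file,
        "-d", database, "-U", login, "-S", server,
        *extra,  # optional additional bcp switches supplied by the caller
    ]
```

For example, `bcp_args("Person.Person", "in", "person.csv", "AdventureWorks2017", "myLoginID", "myServer", extra=("-c", "-t", ","))` describes a character-mode import of a comma-separated file.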
Use bulk insert for importing data
Bulk insert lets you import data into your Cloud SQL for SQL Server database from a file stored in Cloud Storage.
This section describes the following:
- Required roles and permissions
- Considerations when using bulk insert
- Perform bulk insert
- View the imported data
- Disable bulk insert
Required roles and permissions
To configure bulk insert, you need the following:
- The CONTROL permission on the database where you want to import the data.
- An HMAC access key and a secret mapped to an IAM account with the following permissions:
  - storage.buckets.get
  - storage.objects.create and storage.multipartUploads.create for writing error logs and examples of bad data.
  Alternatively, you can also use the following roles:
  - Storage Object Viewer
  - Storage Object Creator for writing error logs and examples of bad data.
To use bulk insert, you need the following:
- The EXECUTE permission on the msdb.dbo.gcloudsql_bulk_insert stored procedure. Cloud SQL creates the stored procedure after bulk insert is enabled on the instance. Cloud SQL grants the EXECUTE permission to the sqlserver admin account by default.
- The INSERT permission on the object where you want to import the data.
For more information on creating users for bulk insert, see Create and manage users.
Considerations when using bulk insert
This section has recommendations for handling security, performance, and reliability on instances while using bulk insert.
Security
Cloud SQL encrypts and stores the HMAC access key and secret in an instance as a database scoped credential. Their values cannot be accessed after they are saved. You can delete the key and secret from an instance by dropping the database scoped credential using a T-SQL command. If you take any backup while the key and secret are stored on the instance, then that backup would contain that key and secret. You can also render the key invalid by deactivating and deleting the HMAC key.
The following operations can inadvertently transfer the access key and secret and make them available:
- Cloning the instance: the key and secret are available on the cloned instance.
- Creating a read replica: the key and secret are available on the created read replica.
- Restoring from a backup: the key and secret are available on the instance restored from a backup.
We recommend that you drop the key and secret from the target instance after performing these operations.
Bulk insert can write data that it can't parse to a file stored in a Cloud Storage bucket. If you want to protect data that bulk insert has access to, then configure VPC Service Controls.
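To make the clean-up after a clone, replica creation, or restore repeatable, you could generate the T-SQL statements from a short script. The Python sketch below only produces statement text. It assumes that external data sources referencing the credential should be dropped before the credential itself, so verify that order for your setup; the function names are hypothetical.

```python
from typing import Iterable, List


def quote_name(name: str) -> str:
    """Bracket-quote a SQL Server identifier, escaping closing brackets."""
    return "[" + name.replace("]", "]]") + "]"


def credential_cleanup_sql(credential: str,
                           data_sources: Iterable[str]) -> List[str]:
    """Return T-SQL statements that drop data sources, then the credential."""
    # Assumption: data sources that use the credential are removed first.
    statements = [
        f"DROP EXTERNAL DATA SOURCE {quote_name(source)};"
        for source in data_sources
    ]
    statements.append(
        f"DROP DATABASE SCOPED CREDENTIAL {quote_name(credential)};"
    )
    return statements
```

Run the resulting statements in the database that holds the credential, on each instance that received a copy of it.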
Performance
We recommend doing the following to mitigate performance impacts while using bulk insert:
- Test and set an appropriate value for @batchsize because by default, all data is imported in a single batch.
- For large inserts, disable indexes temporarily to speed up data insertion.
- If possible, use the @tablock option because this can reduce contention and increase data load performance.
- Use the @ordercolumnsjson parameter to specify data sorted in the order of the clustered index. This helps with better instance performance.
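The @ordercolumnsjson value is a JSON array of objects with name and order keys, as in the example later on this page. A minimal Python sketch for building that string from a list of columns follows; the function name is hypothetical, and accepting "desc" in addition to "asc" is an assumption, because this page only shows ascending order.

```python
import json
from typing import List, Tuple


def order_columns_json(columns: List[Tuple[str, str]]) -> str:
    """Build an @ordercolumnsjson value from (column name, order) pairs."""
    for name, order in columns:
        # Assumption: only "asc" and "desc" are meaningful order values.
        if order not in ("asc", "desc"):
            raise ValueError(f"unexpected order {order!r} for column {name}")
    return json.dumps([{"name": n, "order": o} for n, o in columns])
```

List the columns in the same sequence as the clustered index key so that the declared order matches how the data file is actually sorted.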
Reliability
We recommend doing the following to mitigate impact on instance reliability while using bulk insert:
- If a failure occurs and @batchsize is used, this can lead to partially loaded data. You might need to manually clean up this data on your instance.
- Use the @errorfile option to keep a log of errors and examples of bad data detected during the load process. This makes it easier to identify rows that have failed to load.
Perform bulk insert
You can perform the bulk insert operation using the following stored procedure:
msdb.dbo.gcloudsql_bulk_insert
For more information, see Stored procedure for using bulk insert.
Example: Import data from a file in Cloud Storage and specify an error file
1. Enable bulk insert
To enable bulk insert on your instance, enable the cloud sql enable bulk insert flag.
gcloud sql instances patch INSTANCE_NAME --database-flags="cloud sql enable bulk insert"=on
Replace INSTANCE_NAME with the name of the instance that you want to use for bulk insert.
For more information, see configure database flags.
After you enable this flag on your instance, Cloud SQL installs the bulk insert stored procedure on your instance and gives the sqlserver admin account permissions to execute.
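Because the flag name contains spaces, it is easy to get the quoting wrong in scripts. The following Python sketch builds the gcloud argument list so that the flag and its value travel as a single argument; the function and constant names are hypothetical. Be aware that --database-flags generally replaces the flags already set on the instance, so include any existing flags that you want to keep.

```python
from typing import List

# Flag name as it appears in the enable command on this page.
BULK_INSERT_FLAG = "cloud sql enable bulk insert"


def bulk_insert_flag_command(instance: str, enabled: bool) -> List[str]:
    """Return the gcloud arguments that turn the bulk insert flag on or off."""
    value = "on" if enabled else "off"
    return [
        "gcloud", "sql", "instances", "patch", instance,
        # One list element, so no shell quoting of the spaces is needed.
        f"--database-flags={BULK_INSERT_FLAG}={value}",
    ]
```

You could run the result with `subprocess.run(bulk_insert_flag_command("my-instance", True), check=True)`, where `my-instance` is a placeholder instance name.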
2. Create an HMAC key
You require an HMAC key to access your Cloud Storage bucket. We recommend that you create an HMAC key for a service account and grant the service account permissions to the buckets that you want to use for bulk insert. For more information and security considerations, see Considerations when using bulk insert.
3. Create sample data to import
Using a text editor, create a file with ANSI or UTF-16 encoding that has the following sample data. Save the file in your Cloud Storage bucket and name it as bulkinsert.bcp, for example.
1,Elijah,Johnson,1962-03-21
2,Anya,Smith,1982-01-15
3,Daniel,Jones,1990-05-21
Create a format file using the following sample data. Save the file in your Cloud Storage bucket and name it as bulkinsert.fmt, for example. For more information about XML and non-XML format files in SQL Server, see Create a Format File.
13.0
4
1 SQLCHAR 0 7 "," 1 PersonID ""
2 SQLCHAR 0 25 "," 2 FirstName SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 30 "," 3 LastName SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 11 "\r\n" 4 BirthDate ""
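If you prefer to generate the two sample files with a script instead of a text editor, the Python sketch below writes them to a local directory, ready to upload to your bucket. It assumes ASCII content and ends each data row with \r\n to match the row terminator declared in the format file; the function name is hypothetical.

```python
from pathlib import Path

ROWS = [
    (1, "Elijah", "Johnson", "1962-03-21"),
    (2, "Anya", "Smith", "1982-01-15"),
    (3, "Daniel", "Jones", "1990-05-21"),
]

# Non-XML format file: version, column count, then one line per field.
FORMAT_LINES = [
    "13.0",
    "4",
    '1 SQLCHAR 0 7 "," 1 PersonID ""',
    '2 SQLCHAR 0 25 "," 2 FirstName SQL_Latin1_General_CP1_CI_AS',
    '3 SQLCHAR 0 30 "," 3 LastName SQL_Latin1_General_CP1_CI_AS',
    r'4 SQLCHAR 0 11 "\r\n" 4 BirthDate ""',
]


def write_sample_files(directory: Path) -> None:
    """Write bulkinsert.bcp and bulkinsert.fmt into directory."""
    data = "".join(
        ",".join(str(value) for value in row) + "\r\n" for row in ROWS
    )
    # Write bytes so the explicit \r\n terminators are not translated.
    (directory / "bulkinsert.bcp").write_bytes(data.encode("ascii"))
    fmt = "\n".join(FORMAT_LINES) + "\n"
    (directory / "bulkinsert.fmt").write_bytes(fmt.encode("ascii"))
```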
4. Execute the stored procedure
Connect to your instance using the sqlserver user and create a sample database and table for bulk insert.
USE MASTER
GO
-- create test database
DROP DATABASE IF EXISTS bulktest
CREATE DATABASE bulktest
GO
-- create table to insert
USE bulktest;
GO
CREATE TABLE dbo.myfirstimport(
  PersonID smallint,
  FirstName varchar(25),
  LastName varchar(30),
  BirthDate Date
);
Create a database master key, a database scoped credential, and an external data source. Set the identity as S3 Access Key.
-- create master key
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'YourStrongPassword1';
-- create database scoped credential
CREATE DATABASE SCOPED CREDENTIAL GCSCredential
WITH IDENTITY = 'S3 Access Key',
SECRET = '<Access key>:<Secret>';
-- create external data source
CREATE EXTERNAL DATA SOURCE GCSStorage
WITH ( TYPE = BLOB_STORAGE,
  LOCATION = 's3://storage.googleapis.com/bulk-insert-demo/',
  CREDENTIAL = GCSCredential );
CREATE EXTERNAL DATA SOURCE GCSStorageError
WITH ( TYPE = BLOB_STORAGE,
  LOCATION = 's3://storage.googleapis.com/bulk-insert-demo/',
  CREDENTIAL = GCSCredential );
Execute the bulk insert stored procedure to import the sample data.
EXEC msdb.dbo.gcloudsql_bulk_insert
  @database = 'bulktest',
  @schema = 'dbo',
  @object = 'myfirstimport',
  @file = 's3://storage.googleapis.com/bulk-insert-demo/bulkinsert.bcp',
  @formatfile = 's3://storage.googleapis.com/bulk-insert-demo/bulkinsert.fmt',
  @fieldquote = '"',
  @formatfiledatasource = 'GCSStorage',
  @ROWTERMINATOR = '0x0A',
  @fieldterminator = ',',
  @datasource = 'GCSStorage',
  @errorfiledatasource = 'GCSStorageError',
  @errorfile = 's3://storage.googleapis.com/oom-data/bulkinsert/bulkinsert_sampleimport.log',
  @ordercolumnsjson = '[{"name": "PersonID","order": " asc "},{"name": "BirthDate","order": "asc"}]'
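When the same import runs for many files, you might generate the EXEC statement rather than edit it by hand. The Python sketch below renders string-valued parameters into T-SQL and doubles embedded single quotes. It is a text-building illustration only: the helper names are hypothetical, it handles only string parameters like those in the example above, and it does not validate parameter names against the stored procedure.

```python
from typing import Dict


def sql_literal(value: str) -> str:
    """Quote a string as a T-SQL literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def bulk_insert_exec(params: Dict[str, str]) -> str:
    """Render an EXEC statement for msdb.dbo.gcloudsql_bulk_insert."""
    for name in params:
        # Reject names that could not be simple @parameter identifiers.
        if not name.isidentifier():
            raise ValueError(f"invalid parameter name: {name!r}")
    assignments = ",\n".join(
        f"  @{name} = {sql_literal(value)}" for name, value in params.items()
    )
    return "EXEC msdb.dbo.gcloudsql_bulk_insert\n" + assignments + ";"
```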
View the imported data
You can view the imported data by using one of the following methods:
- Run the following query:
  SELECT * FROM dbo.myfirstimport
- Cloud SQL adds a record of this procedure to the SQL error log. You can view this in Cloud Logging. You can also view this in the SQL error log data on SQL Server Management Studio (SSMS).
Disable bulk insert
To disable bulk insert, remove the cloud sql enable bulk insert flag:
gcloud sql instances patch INSTANCE_NAME --database-flags="cloud sql enable bulk insert"=off
Replace INSTANCE_NAME with the name of the instance where you want to remove bulk insert.
Alternatively, you can run the following command to clear all database flags:
gcloud sql instances patch INSTANCE_NAME --clear-database-flags
Replace INSTANCE_NAME with the name of the instance where you want to remove bulk insert.
Use striped import and export
When you perform a striped import or export, you reduce the time it takes for the operation to complete, and enable databases larger than 5 TB to be imported and exported. For more information, see Export and import using BAK files.
Verify the imported database
After an import operation is complete, connect to your database and run the appropriate database commands to make sure the contents are correct. For example, connect and list the databases, tables, and specific entries.
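One simple check is to compare the number of rows in an imported table with the number you expected from the source. The Python sketch below works with any DB-API cursor (for example, one obtained from an ODBC driver, which is an assumption about your setup); the function names are hypothetical.

```python
import re

# Accept plain or schema-qualified names such as dbo.myfirstimport.
_TABLE_NAME = re.compile(r"^[A-Za-z_]\w*(\.[A-Za-z_]\w*)?$")


def count_rows(cursor, table: str) -> int:
    """Return the number of rows in table using a DB-API cursor."""
    if not _TABLE_NAME.match(table):
        # The name is interpolated into SQL, so only simple names are allowed.
        raise ValueError(f"unexpected table name: {table!r}")
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    return cursor.fetchone()[0]


def row_count_matches(cursor, table: str, expected: int) -> bool:
    """Return True when the imported table holds the expected row count."""
    return count_rows(cursor, table) == expected
```

A matching row count doesn't prove that the contents are correct, so you might still spot-check specific entries.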
Known limitations
For a list of known limitations, see Issues with importing and exporting data.
Automating export operations
Although Cloud SQL doesn't provide a built-in way to automate database exports, you can build your own automation tool using several Google Cloud components. To learn more, see this tutorial.
Troubleshooting
Troubleshooting import operations
Issue | Troubleshooting |
---|---|
HTTP Error 409: Operation failed because another operation was already in progress. | There is already a pending operation for your instance. Only one operation is allowed at a time. Try your request after the current operation is complete. |
The import operation is taking too long. | Too many active connections can interfere with import operations. Close unused operations. Check the CPU and memory usage of your Cloud SQL instance to make sure there are plenty of resources available. The best way to ensure maximum resources for the import is to restart the instance before beginning the operation. |
An import operation can fail when one or more users referenced in the dump file don't exist. | Before importing a dump file, all the database users who own objects or were granted permissions on objects in the dumped database must exist in the target database. If they don't, the import operation fails to recreate the objects with the original ownership or permissions. Create the database users before importing. |
LSN mismatch | The order of the import of transaction log backups is incorrect or the transaction log chain is broken. Import the transaction log backups in the same order as that in the backup set table. |
StopAt too early | This error indicates that the first log in the transaction log file is after the StopAt timestamp. For example, if the first log in the transaction log file is at 2023-09-01T12:00:00 and the StopAt field has value of 2023-09-01T11:00:00, then Cloud SQL returns this error. Ensure that you use the correct StopAt timestamp and the correct transaction log file. |
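The two transaction log errors in the table can be caught before you start an import. The Python sketch below orders log backups by LSN, checks that the chain is unbroken, and tests a StopAt value against the first log time of a file. It assumes that each log backup starts at the LSN where the previous one ended, which is how the backup set table usually records a continuous chain; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class LogBackup:
    name: str
    first_lsn: int
    last_lsn: int
    first_log_time: datetime


def order_log_backups(backups: List[LogBackup]) -> List[LogBackup]:
    """Return backups in import order, or raise if the chain is broken."""
    ordered = sorted(backups, key=lambda backup: backup.first_lsn)
    for previous, current in zip(ordered, ordered[1:]):
        # Assumption: a continuous chain has matching boundary LSNs.
        if current.first_lsn != previous.last_lsn:
            raise ValueError(
                f"LSN mismatch between {previous.name} and {current.name}"
            )
    return ordered


def stop_at_is_valid(backup: LogBackup, stop_at: datetime) -> bool:
    """Return False when StopAt is earlier than the file's first log."""
    return stop_at >= backup.first_log_time
```

With the timestamps from the table, a file whose first log is at 2023-09-01T12:00:00 fails the check for a StopAt of 2023-09-01T11:00:00.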
Troubleshooting export operations
Issue | Troubleshooting |
---|---|
HTTP Error 409: Operation failed because another operation was already in progress. | There is already a pending operation for your instance. Only one operation is allowed at a time. Try your request after the current operation is complete. |
HTTP Error 403: The service account does not have the required permissions for the bucket. | Ensure that the bucket exists and the service account for the Cloud SQL instance (which is performing the export) has the Storage Object Creator role (roles/storage.objectCreator) to allow export to the bucket. See IAM roles for Cloud Storage. |
You want exports to be automated. | Cloud SQL does not provide a way to automate exports. You could build your own automated export system using Google Cloud products such as Cloud Scheduler, Pub/Sub, and Cloud Run functions, similar to this article on automating backups. |
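Because only one operation can run on an instance at a time, scripts that start imports or exports often need to wait and try again after an HTTP 409 response. The following Python sketch shows one way to do that with increasing delays. The OperationInProgress exception is a hypothetical stand-in for however your client library reports the 409 error, and the default delays are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class OperationInProgress(Exception):
    """Hypothetical stand-in for an HTTP 409 'operation in progress' error."""


def retry_when_busy(start: Callable[[], T], attempts: int = 5,
                    delay_seconds: float = 30.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call start(), retrying with a doubling delay while the instance is busy."""
    for attempt in range(attempts):
        try:
            return start()
        except OperationInProgress:
            if attempt == attempts - 1:
                raise  # Give up after the final attempt.
            sleep(delay_seconds * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists so that the waiting behavior can be replaced in tests or swapped for a different scheduling mechanism.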
What's next
- Learn how to import and export data using BAK files.
- Learn how to import data using SQL dump files.
- Learn how to enable automatic backups.
- Learn how to restore from backups.