Best Practices

This page describes practices that help you get the best performance, durability, and availability from Cloud SQL.

If issues occur with your Cloud SQL instance, review the following during troubleshooting:

Instance configuration and administration

Best practice More information
Read and follow the operational guidelines to ensure that your instances are covered by the Cloud SQL SLA.
Configure a maintenance window for your primary instance to control when disruptive updates can occur. See Maintenance window.
If you delete and recreate instances regularly, use a timestamp in the instance ID to increase the likelihood that new instance IDs are usable.
Don't start an administrative operation before the previous operation has completed.

Cloud SQL instances do not accept new operation requests until they have completed the previous operation. If you attempt to start a new operation prematurely, the operation request fails. This includes instance restarts.

The instance status in the Google Cloud console does not reflect whether an operation is running. The green check mark denotes only that the instance is in the RUNNABLE state. To see whether an operation is running, go to the Operations tab and check the status of the most recent operation.
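As a rough illustration of the rule above, the following sketch decides whether a new operation can be started by looking at the most recent operation. It assumes you have already fetched the instance's operations as a list of dictionaries with `status` and `insertTime` fields, and that `insertTime` values are UTC timestamps in one consistent format so that they sort as strings. The helper names and sample data are hypothetical.

```python
# Operation statuses that mean work is still in progress.
IN_PROGRESS_STATUSES = {"PENDING", "RUNNING"}


def latest_operation(operations):
    """Return the most recently inserted operation, or None if there are none."""
    if not operations:
        return None
    # Assumes insertTime strings share one UTC format, so they sort lexicographically.
    return max(operations, key=lambda op: op["insertTime"])


def can_start_new_operation(operations):
    """Return True when the most recent operation is no longer pending or running."""
    latest = latest_operation(operations)
    return latest is None or latest["status"] not in IN_PROGRESS_STATUSES


# Hypothetical sample data for illustration only.
sample_operations = [
    {"name": "op-1", "operationType": "UPDATE", "status": "DONE",
     "insertTime": "2024-05-01T10:00:00.000Z"},
    {"name": "op-2", "operationType": "RESTART", "status": "RUNNING",
     "insertTime": "2024-05-01T10:05:00.000Z"},
]

if __name__ == "__main__":
    print(can_start_new_operation(sample_operations))
```

A caller could poll with a check like this and submit the next request only after it returns True.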

Configure storage to accommodate critical database maintenance.

If the Enable automatic storage increases setting is disabled for the instance, or if an automatic storage increase limit is set, ensure that you have at least 20% of disk space available to accommodate any critical database maintenance operations that Cloud SQL might perform.

To get alerted when available disk space falls below 20%, create a metrics-based alerting policy for the disk utilization metric with a threshold position of Above and a threshold value of 0.8. For more information, see Create metrics-based alert policies.
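The alerting policy described above could be expressed roughly as follows. This is a minimal sketch that only builds the policy body as a Python dictionary; it does not call any API. The function name, the display names, and the 300-second duration are assumptions made for illustration, so adjust them to your own monitoring setup.

```python
import json

DISK_UTILIZATION_METRIC = "cloudsql.googleapis.com/database/disk/utilization"


def disk_alert_policy(threshold=0.8, duration_seconds=300):
    """Build an alert policy body that fires when disk utilization exceeds threshold.

    duration_seconds is an assumed value: how long the metric must stay
    above the threshold before the condition is met.
    """
    return {
        "displayName": f"Cloud SQL disk utilization above {threshold:.0%}",
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "Disk utilization above threshold",
                "conditionThreshold": {
                    "filter": (
                        'resource.type = "cloudsql_database" AND '
                        f'metric.type = "{DISK_UTILIZATION_METRIC}"'
                    ),
                    # "Above" threshold position with a value of 0.8.
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": threshold,
                    "duration": f"{duration_seconds}s",
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(disk_alert_policy(), indent=2))
```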

Prevent over-utilization of your CPU.

You can view the percentage of available CPU that your instance is using on the instance details page in the Google Cloud console. For more information, see Metrics. You can also monitor your CPU usage and receive alerts at a specified threshold. For more information, see Create metric-threshold alerting policies.

To avoid over-utilization, you can increase the number of CPUs for your instance. Changing CPUs requires an instance restart. If your instance is already at the maximum number of CPUs, you must shard your database to multiple instances.

Avoid memory exhaustion.

When looking for signs of memory exhaustion, rely primarily on the usage metric. To avoid out-of-memory errors, we recommend that this metric remain below 90%.

You can also use the total_usage metric to observe the percentage of available memory that your Cloud SQL instance is using, including memory used by the database container and memory allocated by the operating system cache.

By observing the difference between the two metrics, you can identify how much memory is used by processes versus how much is used by the operating system cache. The memory held in this cache can be repurposed when processes need it.

To predict out-of-memory issues, check both metrics and interpret them together. If the metrics appear high, the instance might be low in memory. This can be because of a custom configuration, the instance being undersized for the workload, or a combination of these factors.

Scale your Cloud SQL instance to increase the size of its memory. Changing the instance's memory size requires an instance restart. If your instance is already at the maximum memory size, you must shard your database across multiple instances. To learn more about monitoring both metrics in the Google Cloud console, see Metrics.
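To make the comparison of the two memory metrics concrete, here is a small sketch that interprets them together. It assumes you already have the usage, total usage, and memory quota of the instance as byte counts; how you fetch them is not shown. The function name and the returned keys are hypothetical.

```python
GIB = 1024 ** 3


def memory_report(quota_bytes, usage_bytes, total_usage_bytes, limit=0.90):
    """Summarize memory pressure from the usage and total_usage values.

    limit is the recommended ceiling for the usage metric (90%).
    """
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    usage_fraction = usage_bytes / quota_bytes
    total_fraction = total_usage_bytes / quota_bytes
    return {
        "usage_fraction": usage_fraction,
        "total_usage_fraction": total_fraction,
        # The gap between the two metrics is memory held by the OS cache.
        "cache_bytes": total_usage_bytes - usage_bytes,
        "above_recommended_limit": usage_fraction >= limit,
    }


if __name__ == "__main__":
    report = memory_report(quota_bytes=8 * GIB, usage_bytes=6 * GIB,
                           total_usage_bytes=7 * GIB)
    print(report)
```

If `above_recommended_limit` is true for a sustained period, that is a signal to review the configuration or scale the instance as described above.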

Data architecture

Best practice More information
Split your large instances into smaller instances, where possible. When possible, using many smaller Cloud SQL instances is better than one large instance. Managing a large, monolithic instance presents challenges not posed by a group of smaller instances.

Application implementation

Best practice More information
Use good connection management practices, such as connection pooling and exponential backoff. Using these techniques improves your application's use of resources and helps you stay within Cloud SQL connection limits. For more information and code samples, see Managing database connections.
Test your application's response to maintenance updates, which can happen at any time during the maintenance window. Try self-service maintenance to simulate a maintenance update. During maintenance, your instance becomes unavailable for a brief period, and existing connections are dropped. Testing maintenance rollouts gives you a better understanding of how your application handles scheduled maintenance and how quickly the system can recover.
Test your application's response to failovers, which can happen at any time. You can manually initiate a failover using the Google Cloud console, the gcloud CLI, or the API. See Initiating failover.
Avoid large transactions. Keep transactions small and short. If a large database update is needed, do it in several smaller transactions rather than one large transaction.
If you are using the Cloud SQL Auth Proxy, make sure you are using the most up-to-date version. See Keeping the Cloud SQL Auth Proxy up-to-date.
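The exponential backoff recommended in the connection management practice above can be sketched as follows. This is a generic illustration, not code from the Cloud SQL documentation: `connect` stands for any callable that opens a database connection and raises `ConnectionError` on failure, and the default attempt count and delays are assumed values.

```python
import random
import time


def connect_with_backoff(connect, max_attempts=5, base_delay=0.5, max_delay=30.0,
                         sleep=time.sleep, rng=random.random):
    """Call connect() until it succeeds, waiting longer after each failure.

    sleep and rng are injectable so the behavior can be tested without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            # Exponential growth capped at max_delay, scaled by random jitter
            # so that many clients do not retry at the same instant.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(cap * rng())


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_connect():
        # Hypothetical connection function that fails twice, then succeeds.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ConnectionError("temporarily unavailable")
        return "connection"

    print(connect_with_backoff(flaky_connect, base_delay=0.01))
```

In practice, a connection pool would sit in front of this so that established connections are reused and retries happen only when a new connection is needed.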

Data import and export

Best practice More information
Speed up imports for small instance sizes. For small instances, you can temporarily increase the CPU and RAM of an instance to improve performance when importing large datasets.
If you are exporting data for import into Cloud SQL, be sure to use the proper procedure. See Exporting data from an externally managed database server.

Backup and recovery

Best practice More information
Protect your data with the appropriate Cloud SQL functionality.

Backups and exports are ways to provide data redundancy and protection. They each protect against different scenarios and complement each other in a robust data protection strategy.

Backups are lightweight; they provide a way to restore the data on your instance to its state at the time you took the backup. However, backups have some limitations. If you delete the instance, the backups are also deleted. You can't back up a single database or table. And if the region where the instance is located is unavailable, you cannot restore the instance from that backup, even in an available region.

Exports take longer to create, because an external file is created in Cloud Storage that can be used to recreate your data. Exports are unaffected if you delete the instance. In addition, you can export just a single database or even a single table, depending on the export format you choose.

Protect your instance and backups from accidental deletion.

A Cloud SQL instance that you create in the Google Cloud console or via Terraform enables accidental deletion prevention by default.

Use the export feature in Cloud SQL to export your data for additional protection. Use Cloud Scheduler with the REST API to automate export management. For more advanced scenarios, use Cloud Scheduler with Cloud Functions for automation.
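As a sketch of what an automated export request might look like, the following builds the URL and JSON body for an export call to the Cloud SQL Admin REST API, with a timestamp in the object name so that scheduled runs do not overwrite each other. It only constructs the request and does not send it. The project, instance, bucket, and database names are hypothetical, and the object naming scheme is an assumption.

```python
from datetime import datetime, timezone

SQLADMIN_BASE = "https://sqladmin.googleapis.com/v1"


def build_export_request(project, instance, bucket, database, now=None):
    """Return (url, body) for a SQL export of one database to Cloud Storage."""
    now = now or datetime.now(timezone.utc)
    # Timestamped object name, an assumed convention for this example.
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    url = f"{SQLADMIN_BASE}/projects/{project}/instances/{instance}/export"
    body = {
        "exportContext": {
            "fileType": "SQL",
            "uri": f"gs://{bucket}/{instance}/{database}-{stamp}.sql",
            "databases": [database],
        }
    }
    return url, body


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    url, body = build_export_request("my-project", "my-instance",
                                     "my-export-bucket", "orders")
    print(url)
    print(body)
```

A scheduled job would send this body as an authenticated POST request to the URL. Keep in mind that the export is itself an operation, so it cannot start while another operation on the instance is still running.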

What's next

For more information about general practices by database engine, see: