Operational guidelines

The Cloud SQL SLA excludes outages "caused by factors outside of Google’s reasonable control". This page describes some of the user-controlled configurations that can cause an outage of a Cloud SQL instance to be excluded from SLA coverage.

Introduction

Cloud SQL strives to give you as much control as possible over how your instance is configured. This includes some configurations that increase the risk of instance downtime, depending on the load and other configuration parameters. If your instance goes down, and Cloud SQL determines that it was out of compliance with the operational limits described on this page, then the downtime period is not covered by (or does not count against) the Cloud SQL SLA.

This list of operational limits explains which configurations present these risks, how to avoid inadvertently moving into one of these configurations, and how to mitigate the risks when a configuration is required for your business environment.

Excluded configurations

The excluded configurations fall into the following categories:

  • General configuration requirements
  • Database flag values
  • Resource constraints

General configuration requirements

Only Cloud SQL instances configured for high availability with at least one dedicated CPU are covered by the SLA. Shared-core instances and single-zone instances are not covered by the SLA.
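
As a minimal sketch of the rule above, the following Python function checks whether an instance's settings meet these two requirements. It assumes the instance is described by the availabilityType and tier values reported by the Cloud SQL Admin API, that REGIONAL denotes a high availability configuration, and that db-f1-micro and db-g1-small are the shared-core tiers. The function name and structure are hypothetical and are not part of any Cloud SQL tooling.

```python
# Hypothetical helper. The tier names and the "REGIONAL" value are
# assumptions stated in the text above; verify them for your project.
SHARED_CORE_TIERS = {"db-f1-micro", "db-g1-small"}


def meets_general_requirements(availability_type: str, tier: str) -> bool:
    """Return True if the instance is high availability and not shared-core."""
    is_high_availability = availability_type.upper() == "REGIONAL"
    has_dedicated_cpu = tier.lower() not in SHARED_CORE_TIERS
    return is_high_availability and has_dedicated_cpu
```

For example, meets_general_requirements("ZONAL", "db-n1-standard-2") returns False because a single-zone instance is not configured for high availability. A True result means only that these two settings are acceptable, not that the instance is otherwise within the limits on this page.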

If the instance is configured or used in such a way that the workload overloads it, then the SLA does not apply.

We strongly advise you to set up alerts and monitoring in Cloud Monitoring.

Database flag values

Cloud SQL lets you configure your instance using database flags. Some of these flags can be set in ways that might compromise the stability of the instance or the durability of its data.

Resource constraints

The following resource constraints must be avoided to retain SLA coverage:

Each constraint is listed with its description, detection, remedy, and prevention.

Storage full

  • Description: If your instance runs out of storage, and the automatic storage increase capability is not enabled, your instance goes offline; this outage is not covered by the SLA.
  • Detection: You can view the amount of storage your instance is using on the Instance details page in the Google Cloud console. To monitor your storage usage and receive alerts at a specified threshold, set up a Cloud Monitoring alert.
  • Remedy: Increase the storage size for the instance. Although storage size can be increased, it cannot be decreased.
  • Prevention: Enable automatic storage increase for the instance.

CPU overloaded

  • Description: If CPU utilization is over 98% for 6 hours, your instance is not properly sized for your workload, and it is not covered by the SLA.
  • Detection: You can view the percentage of available CPU your instance is using on the Instance details page in the Google Cloud console. To monitor your CPU usage and receive alerts at a specified threshold, set up a Cloud Monitoring alert.
  • Remedy: Increase the number of CPUs for your instance. Note that changing CPUs requires an instance restart. If your instance is already at the maximum number of CPUs, shard your database across multiple instances.
  • Prevention: Monitor CPU usage and increase the number of CPUs when necessary. Note that changing your instance CPUs requires a restart.

Too many database tables

  • Description: If you have 10,000 or more database tables on a single instance, the instance could become unresponsive or unable to perform maintenance operations, and it is not covered by the SLA.
  • Detection: To see how many tables there are on your instance:

      SELECT COUNT(*) FROM information_schema.tables;

    To see how many tables there are in each database:

      SELECT TABLE_SCHEMA, COUNT(*) FROM information_schema.tables GROUP BY TABLE_SCHEMA;

  • Remedy: Reduce the number of tables to fewer than 10,000. If you cannot immediately reduce the number of tables, you can reduce the likelihood of your instance being impacted by the high table count by setting the innodb_file_per_table flag to OFF; however, this setting does not bring the instance back into SLA compliance.
  • Prevention: If your data architecture requires many tables, split the data across several instances.
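
The three limits in this section can be expressed as simple checks. The sketch below is illustrative only: it assumes you have already collected the values yourself (for example, CPU utilization samples as fractions between 0 and 1, taken at a fixed rate, and a table count from the queries above). It also interprets "over 98% for 6 hours" as an unbroken run of samples above 0.98, which is an assumption. All function and constant names are hypothetical.

```python
from typing import Sequence

CPU_LIMIT = 0.98        # "over 98%"
CPU_WINDOW_HOURS = 6    # "for 6 hours"
TABLE_LIMIT = 10_000    # "10,000 or more database tables"


def cpu_overloaded(samples: Sequence[float], samples_per_hour: int) -> bool:
    """True if utilization stayed above CPU_LIMIT for a full 6-hour run.

    Assumes evenly spaced samples in chronological order.
    """
    needed = CPU_WINDOW_HOURS * samples_per_hour
    run = 0
    for value in samples:
        # Extend the current run, or reset it when a sample is at or below the limit.
        run = run + 1 if value > CPU_LIMIT else 0
        if run >= needed:
            return True
    return False


def storage_full_without_auto_increase(
    used_bytes: int, capacity_bytes: int, auto_increase_enabled: bool
) -> bool:
    """True if storage is exhausted and automatic storage increase is off."""
    return used_bytes >= capacity_bytes and not auto_increase_enabled


def too_many_tables(table_count: int) -> bool:
    """True if the instance holds 10,000 or more tables."""
    return table_count >= TABLE_LIMIT
```

In practice you would alert well before any of these checks returns True, for example at a lower storage or CPU threshold, because by the time a limit is reached the instance is already outside SLA coverage.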