This page explains how maintenance updates occur on Cloud SQL instances, and how you can control the timing of these updates. To get started, see Finding and setting maintenance windows.
As a managed service, Cloud SQL automatically updates instances to ensure that the underlying hardware, operating system, and database engine are reliable, performant, secure, and up-to-date. Most of these updates are performed while your Cloud SQL instance is up and running. However, certain system updates require a brief service interruption to be performed. These updates are called maintenance.
Maintenance updates the operating system and the database engine. Because these updates require the instance to be restarted, they incur some downtime. Maintenance updates deliver the following benefits:
Cloud SQL features. To launch new features, the database engine is updated and new plugins to the database are installed.
Database version upgrades. The PostgreSQL project releases new minor versions several times a year. Each new version brings bug fixes, security patches, performance enhancements, and new database features. You can find the latest minor version that Cloud SQL for PostgreSQL supports by reviewing release notes or Database versions and version policies. Cloud SQL instances are upgraded to the latest database version shortly after release, so that you benefit from running the latest database software.
Operating system patches. We continuously monitor for newly identified security vulnerabilities in the operating system. Upon discovery, we patch the operating system to protect you from new risks.
During a maintenance event, a Cloud SQL for PostgreSQL instance loses connectivity for less than 30 seconds on average.
Downtime might be higher for instances that have high activity at the beginning of maintenance or have very large datasets. Cloud SQL typically schedules maintenance once every few months.
Cloud SQL offers you the ability to configure maintenance updates through a set of maintenance settings.
You can configure maintenance to be scheduled at times when brief downtime causes the lowest impact to your applications. For each Cloud SQL instance, you can configure the following:
Maintenance window. The day of the week and the hour in which Cloud SQL schedules maintenance. Maintenance windows last for one hour. Learn how to configure a maintenance window.
Order of update. Sets the order in which the Cloud SQL instance is updated relative to other instances in the same region. Order of update values include Earlier and Later. Later instances are updated one week after Earlier instances with the same maintenance window in the same region. You set the order of update when you configure a maintenance window.
Deny maintenance period. A block of days in which Cloud SQL does not schedule maintenance. Deny maintenance periods can be up to 90 days long. Learn how to configure a deny maintenance period.
Assume you are a developer at a retailer managing a shopping cart service. You have one Cloud SQL instance for a production environment and a second for a staging environment. You want maintenance to occur at the time when your instance handles the lowest amount of traffic, which is around midnight on Sundays. You also want to skip maintenance during your busy end-of-year holiday shopping season.
In this case, you set your production instance's maintenance settings to:
- Maintenance window: Sundays between 12:00AM and 1:00AM ET
- Order of update: Later
- Deny maintenance period: November 1 through January 15.
The maintenance settings for your staging environment would be identical, except the order of update would be set to Earlier. This ensures you can run operational acceptance tests for a maintenance release in staging at least seven days before maintenance rolls out to production. If something goes wrong in the staging environment, you have time to diagnose and fix the issue so that your production environment is unaffected.
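The retailer example can be sketched in Python as follows. This is only an illustration: `MaintenanceSettings`, `in_deny_period`, and `production_rollout` are hypothetical names, not part of any Cloud SQL API, and the production instance is assumed to use the Later order of update because staging uses Earlier. The sketch shows how a recurring deny period that wraps around the end of the year (November 1 through January 15) can be checked, and how the one-week gap between Earlier and Later instances works out.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class MaintenanceSettings:
    """Hypothetical container for the settings described above (not a Cloud SQL type)."""
    window_day: str        # day of the week, for example "SUN"
    window_hour: int       # start hour (0-23) in the team's chosen time zone
    order_of_update: str   # "Earlier" or "Later"
    deny_start: tuple      # (month, day), assumed to recur every year
    deny_end: tuple        # (month, day), treated as inclusive


def in_deny_period(settings: MaintenanceSettings, day: date) -> bool:
    """Return True if `day` falls inside the recurring deny maintenance period."""
    current = (day.month, day.day)
    start, end = settings.deny_start, settings.deny_end
    if start <= end:
        return start <= current <= end
    # The period wraps around the end of the year (for example Nov 1 to Jan 15).
    return current >= start or current <= end


def production_rollout(staging_rollout: date) -> date:
    """Later instances are updated one week after Earlier instances."""
    return staging_rollout + timedelta(days=7)


# Assumed values taken from the example: Sundays at midnight, Nov 1 through Jan 15.
production = MaintenanceSettings("SUN", 0, "Later", (11, 1), (1, 15))
staging = MaintenanceSettings("SUN", 0, "Earlier", (11, 1), (1, 15))
```

For instance, `in_deny_period(production, date(2024, 12, 25))` is true, while a date of January 16 falls outside the period and maintenance can be scheduled again.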
Upcoming maintenance notifications
You can have a notification about upcoming maintenance sent to your email at least one week before maintenance is scheduled. If you want to set an email filter for notifications, the email title is Upcoming maintenance for your Cloud SQL instance instancename.
Notifications for maintenance are not sent out by default. You need to opt in to maintenance notifications. You must also select a maintenance window before you can receive notifications.
Notifications are sent to the email address associated with your Google account. It's not possible to configure a custom email alias (for instance, a team email alias).
You opt into maintenance notifications for all Cloud SQL instances that have maintenance windows in a given project. You receive one notification per instance. Upcoming maintenance notifications are not sent out for read replicas.
You can also view upcoming maintenance information in the Google Cloud console.
- In the Instances list, in the Maintenance column. If maintenance is scheduled, you see the date and time for when it is scheduled to start. You can filter the instances list using the term Maintenance to find all the instances scheduled for maintenance. The Maintenance column only displays when maintenance is scheduled on one or more instances in the project. If no maintenance is scheduled, the column is hidden.
- On the Instance details page in the Maintenance pane. If maintenance is scheduled, under Upcoming, you see a date and time for when it is scheduled to start.
- On the ACTIVITY page in the Google Cloud console, you can view a list of instances scheduled for maintenance. If maintenance is scheduled, instances have the message SQL Maintenance and the date and time for when it is scheduled to start.
If you have a maintenance window for your instance, you can reschedule maintenance ahead of its currently scheduled time. For example, if a new service launches during your currently scheduled maintenance time, you might want to move maintenance to a few days after the launch.
You can reschedule maintenance multiple times, as long as the new time is no more than 28 days after the originally scheduled time.
You have a few scheduling options for the new maintenance window:
- Apply updates immediately. You can apply the update to your instance immediately instead of waiting for the scheduled maintenance window. In this case, maintenance generally starts within five minutes.
- Reschedule to another time. You can postpone a scheduled maintenance event in two ways:
  - Next available window. This option defers maintenance to the next available maintenance window following the current scheduled maintenance time, which is typically one week later.
  - Specific time. This option lets you choose any specific time within 28 days after the originally scheduled maintenance time.
To reschedule maintenance, see rescheduling planned maintenance.
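As a rough sketch of the rescheduling rules above, the following Python checks whether a requested time stays within the 28-day limit. The function names are hypothetical, and the code assumes a weekly maintenance window so that the next available window is seven days later. It does not call Cloud SQL.

```python
from datetime import datetime, timedelta

# Limit stated above: no more than 28 days after the originally scheduled time.
MAX_DEFERRAL = timedelta(days=28)


def next_available_window(scheduled: datetime) -> datetime:
    """Assumes a weekly maintenance window, so the next one is seven days later."""
    return scheduled + timedelta(days=7)


def is_valid_reschedule(originally_scheduled: datetime, requested: datetime) -> bool:
    """Check the requested time against the 28-day limit.

    The limit is measured from the original schedule, not from the most
    recent reschedule, so repeated deferrals do not extend it.
    """
    return requested <= originally_scheduled + MAX_DEFERRAL
```

Under these assumptions, maintenance originally scheduled for a given Sunday can be deferred to the next window up to four times before the limit is reached.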
How maintenance works
To keep maintenance brief, Cloud SQL uses a maintenance failover workflow that largely resembles our automatic failover workflow for highly available instances.
In short, these are the steps:
- Set up an updated VM with the new software.
- Stop the original VM.
- Switch over the disk and static IP to the updated VM.
- Start up the updated VM.
The following sections describe the workflow in detail, including the state of the instance before and after maintenance.
Before maintenance, the client communicates with the original VM through a static IP address. The data is stored on a persistent disk that is attached to the original VM. In this example, the Cloud SQL instance has high availability configured, which means that another VM is on standby to take over in the event of an unplanned outage. The Cloud SQL instance is serving traffic to the application.
Set up the new VM.
A new virtual machine (VM) is set up with the latest database software and VM operating system (OS). The updated VM OS is started. At this point, the database engine is not yet started. For highly available instances, a new standby VM is also set up.
The total downtime is substantially shortened by installing the software update on another VM while the original Cloud SQL instance is still serving traffic.
Shut down the original VM.
The database engine is shut down so that the disk can be detached from the original VM and attached to the updated VM. Before shutting down, the database engine waits for a few seconds for ongoing transactions to be committed and requests from existing connections to drain. After that, any open or long-running transactions are rolled back. The database stops accepting new connections, and existing connections are dropped. The instance becomes unavailable and maintenance downtime begins.
Switch over to the updated VM.
The disk is detached from the original VM and attached to the updated VM. The static IP address is reconfigured to point to the updated VM. This ensures that the application uses the same IP address after maintenance as before. The database cache is cycled out with the original VM, meaning that the database cache is effectively cleared during maintenance.
Start the updated VM.
The updated database engine is started on the data disk. Using a common data disk ensures that all transactions written to the original instance prior to maintenance are still present on the updated database after maintenance. If any incomplete transactions didn't finish rolling back during database shutdown, the database automatically goes through crash recovery to ensure that the database is restored to a usable state.
After Step 4, the Cloud SQL instance is available to accept connections and it returns to serving traffic to the application.
To the application, apart from the updated software, the Cloud SQL instance looks the same. The application still connects to the Cloud SQL instance using the same static IP address, and the updated VM runs in the same zone as the original VM. All data written to the original database is preserved.
Cloud SQL regularly releases software improvements, such as patches for known vulnerabilities, through new maintenance versions. While Cloud SQL schedules maintenance updates once every few months to ensure you have the latest software, you might want to use self-service maintenance if:
- You need an update sooner than your next scheduled maintenance event.
- You want to catch up to the latest software after skipping your most recent maintenance update.
- Does maintenance downtime count toward the SLA?
- How does maintenance affect read replicas?
- What rescheduling limitations do I need to know about?
- What happens if the maintenance event is cancelled?
- What deny maintenance periods limitations do I need to know about?
- Can I cancel scheduled maintenance?
- How can I minimize the impact of maintenance on my application?
- Can Cloud SQL schedule maintenance outside my maintenance settings?
- Is Cloud SQL maintenance cumulative?
Does maintenance downtime count toward the SLA?
Downtime from normal maintenance does not count towards the SLA. However, Cloud SQL counts time-sensitive maintenance downtime against the SLA.
How does maintenance affect read replicas?
Cloud SQL always maintains read replicas before the primary instance. Replicas are maintained in random order, with no more than a third of them updated in parallel, so if your primary instance has multiple read replicas, some of them might be updated simultaneously. If the primary instance has a maintenance window, read replicas observe the same maintenance window. Read replicas also observe the deny maintenance period set for the primary instance.
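To make the replica ordering concrete, here is a minimal Python sketch of one way such a plan could be built. It is not Cloud SQL's actual scheduler: the function names are hypothetical, and the batch size of `max(1, n // 3)` is an assumption chosen so that at most a third of the replicas share a batch while small fleets still make progress one replica at a time.

```python
import random


def plan_replica_batches(replicas, seed=None):
    """Shuffle replicas and group them so at most a third are updated together."""
    rng = random.Random(seed)
    order = list(replicas)
    rng.shuffle(order)
    # Assumption: with fewer than three replicas, update one at a time.
    batch_size = max(1, len(order) // 3)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


def plan_maintenance(primary, replicas, seed=None):
    """Return the update order: every replica batch first, the primary last."""
    return plan_replica_batches(replicas, seed) + [[primary]]
```

With seven replicas, each batch holds at most two of them, and the primary always appears alone in the final step.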
Can I cancel scheduled maintenance?
You can't cancel a scheduled maintenance window, but you can reschedule it. You can also configure a deny maintenance period that overlaps with the scheduled maintenance time to effectively skip maintenance.
What rescheduling limitations do I need to know about?
There are a few things you need to know about rescheduling:
You must reschedule maintenance at least 24 hours before the originally scheduled maintenance event happens.
You can reschedule maintenance on one or multiple instances in your project. However, you can only reschedule one instance at a time (bulk rescheduling is not available).
You can reschedule maintenance to a time that falls within a deny maintenance period, or even outside the maintenance window, as long as the time falls within the 28-day rescheduling limit.
If a maintenance operation is in progress, rescheduling is delayed until the operation is complete.
What happens if the maintenance event is cancelled?
If Cloud SQL cancels a maintenance event, you receive a notification of the cancellation in advance, when possible.
You receive a new notification of upcoming maintenance when the maintenance event is rescheduled.
Deny maintenance period limitations
There are a few things you need to know about deny maintenance periods:
You can have a deny maintenance period even if you don't have maintenance windows configured for your instance. Deny maintenance periods can span from 1 to 90 days.
The deny maintenance period takes precedence over any scheduled maintenance window. If there is a conflict between the timing of a maintenance window and the deny maintenance period, the deny maintenance period overrides the maintenance window.
Deny maintenance periods and relative scheduling are independent features. A deny maintenance period specified on an Earlier instance has no impact in determining the schedule for the Later instance. Notifications are not sent if the scheduled maintenance falls within a deny maintenance period.
When a deny period is set on a primary instance, maintenance for all replicas associated with the primary instance is also denied. As an example, a primary instance located in region A has three read replicas: two in region A and one in region B. When a deny period is set on the primary instance, none of the replicas, including the replica in region B, receives maintenance until the deny period on the primary instance expires.
If a deny maintenance period is set after maintenance is scheduled such that the deny maintenance period overlaps with the scheduled maintenance time, the update is skipped.
You can set the deny maintenance period to recur every year by not including the year in the start and end date parameters. If the year is specified, the deny maintenance period is set for only that year.
You can set multiple deny maintenance periods in a year. We recommend that you avoid chaining deny periods together to skip consecutive scheduled maintenance events. Staying current on Cloud SQL maintenance is important to ensure that your instance operates reliably. Typically, Cloud SQL maintenance is scheduled once every few months.
In order to ensure service reliability, Cloud SQL may notify users with instances running maintenance releases that are more than 12 months old that the next maintenance rollout is required.
When a deny maintenance period ends, regular maintenance behavior resumes.
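The length limit and the yearly recurrence rule can be checked before you submit a deny maintenance period. The Python below is a sketch under stated assumptions: the helper names are hypothetical, the accepted formats are assumed to be `YYYY-MM-DD` for a one-time period and `MM-DD` for a recurring one, and both the start and end dates are counted as part of the period.

```python
from datetime import date


def parse_deny_date(value: str, default_year: int):
    """Parse 'YYYY-MM-DD' or a yearless 'MM-DD'. Returns (date, recurs_yearly)."""
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 3:
        return date(*parts), False
    if len(parts) == 2:
        return date(default_year, *parts), True
    raise ValueError(f"unrecognized date: {value!r}")


def deny_period_days(start: str, end: str, default_year: int) -> int:
    """Return the period length in days, or raise if it is outside 1 to 90 days."""
    start_date, start_recurs = parse_deny_date(start, default_year)
    end_date, end_recurs = parse_deny_date(end, default_year)
    if start_recurs != end_recurs:
        raise ValueError("specify the year on both dates or on neither")
    if end_date < start_date:
        if not start_recurs:
            raise ValueError("end date is before start date")
        # A recurring period may wrap into the next year (for example Nov 1 to Jan 15).
        end_date = end_date.replace(year=end_date.year + 1)
    days = (end_date - start_date).days + 1  # assumption: both endpoints count
    if not 1 <= days <= 90:
        raise ValueError(f"deny period is {days} days; it must span 1 to 90 days")
    return days
```

For the holiday example used earlier, `deny_period_days("11-01", "01-15", 2024)` returns 76, which is within the 90-day limit.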
Minimize the impact of maintenance
In general, Google Cloud recommends that users running applications in the cloud make their systems resilient to transient errors, which are momentary inter-service communication issues caused by temporary unavailability. Occasional transient errors are unavoidable in the cloud.
Some of the transient errors that occur during maintenance are dropped connections and failed in-flight transactions. If you design your systems and tune your applications to be resilient to transient errors, you're also positioned to minimize impacts due to database maintenance.
To minimize the impact of dropped connections, you can use connection pools. While connections between the pooler and the database are dropped during maintenance, the connections between the application and the pooler are preserved. That way, the work of reestablishing the connections is transparent to the application and offloaded to the connection pooler instead.
To reduce transaction failures, you can limit the number of long-running transactions. Rewriting queries to be smaller and more efficient not only reduces maintenance downtime, but also improves database performance and reliability. You can use Query Insights to identify slow queries.
To recover efficiently from connection drops and transaction failures, you can build connection and query retry logic with exponential backoff into your applications and connection poolers. When a query fails or a connection is dropped, the system waits before retrying, and the wait increases with each subsequent retry. For example, the system might wait just a few seconds for the first retry, but up to a minute for the fourth retry. Following this pattern lets the system recover from these failures without overloading the service.
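A minimal sketch of retry logic with exponential backoff might look like the following Python. The names `TransientError`, `backoff_delays`, and `call_with_retry` are hypothetical, and the default delays (2, 6, 18, and 54 seconds) are assumptions picked to match the "few seconds, then up to a minute" pattern described above. The random jitter is an addition not mentioned in the text; it is a common way to keep many clients from retrying at the same instant.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying, such as a dropped connection."""


def backoff_delays(base=2.0, factor=3.0, cap=60.0, attempts=4):
    """Return the wait before each retry: base, base*factor, and so on, up to `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def call_with_retry(operation, delays=None, sleep=time.sleep, jitter=random.random):
    """Run `operation`, retrying on TransientError with growing waits."""
    delays = backoff_delays() if delays is None else delays
    for delay in delays:
        try:
            return operation()
        except TransientError:
            # Wait between 50% and 100% of the nominal delay before retrying.
            sleep(delay * (0.5 + 0.5 * jitter()))
    # Final attempt: any error now propagates to the caller.
    return operation()
```

In an application, `operation` would wrap the database call, and the code that raises `TransientError` would translate your driver's connection or transaction errors into it. Errors that are not transient, such as a syntax error in a query, should not be retried.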
Other creative solutions can minimize maintenance impacts as well, from using scripts to warm the database cache after maintenance to streamlining the number of tables in databases. We recommend following database management best practices and operational guidelines to ensure that maintenance goes smoothly.
Can Cloud SQL schedule maintenance outside my maintenance settings?
In very rare cases, Cloud SQL might need to schedule maintenance outside of your maintenance settings to patch severe stability issues or vulnerabilities that are time-sensitive. These updates roll out rapidly, and Cloud SQL counts them as downtime against the SLA.
Is Cloud SQL maintenance cumulative?
Maintenance updates are cumulative. There's no need to apply each maintenance that you might have missed. The latest maintenance is applied in the next scheduled maintenance. Or, you can apply the latest maintenance using self-service maintenance.
- See how to opt in to maintenance notifications.
- See how to set a maintenance window.
- See how to view maintenance notifications.