Maintenance overview

AlloyDB clusters and instances rely upon many internal, low-level Google Cloud resources. These include the virtual machine (VM) instances that serve as AlloyDB nodes and load balancers, and the storage volumes that hold your data. Because AlloyDB is a managed service, Google takes care of keeping these internal resources up to date. This helps ensure that your AlloyDB clusters and instances stay reliable, performant, and secure.

Most of these updates require no downtime, but certain system updates require the affected node to restart, which causes a brief service interruption. We refer to these updates as maintenance.

AlloyDB's non-disruptive maintenance operations limit downtime to less than one second for primary and secondary instances, and to zero seconds for read pools. AlloyDB achieves this by preparing a replacement server that has the updates applied, and then switching the database over to that server. Because the replacement is prepared before the switch, the operation time that appears in the logs is longer than the actual downtime.

Reasons for maintenance

Maintenance updates can happen for the following reasons:

  • New AlloyDB features. To launch new features, Google needs to update the AlloyDB software running on the nodes within your cluster. This might also involve updating the PostgreSQL extensions that are included with AlloyDB, or installing new extensions.

  • Database compatibility upgrades. The PostgreSQL community regularly releases minor-version updates to supported major versions of PostgreSQL. Google incorporates these updates into AlloyDB, and applies them to clusters configured for compatibility with the affected major version. For more information, see Database version policies.

  • Operating system patches. Google continuously monitors for security vulnerabilities in the operating systems that run on the internal resources that constitute AlloyDB clusters. Upon discovery, we patch the resources' operating systems to protect you from new risks.

Maintenance timing and maintenance windows

You can set maintenance windows for both primary and secondary AlloyDB clusters. By default, non-emergency maintenance for an AlloyDB cluster can occur at any time except between 6 AM and 10 PM on weekdays, in the local time of the region where the cluster is located.
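As an illustration of the default timing rule, the following Python sketch checks whether a given moment falls within the default maintenance hours. The function name, the `region_tz` parameter, and the treatment of the exact 6 AM and 10 PM boundaries are assumptions made for this example; they aren't part of AlloyDB.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def in_default_maintenance_hours(moment: datetime, region_tz: tzinfo) -> bool:
    """Return True if `moment` falls outside 6 AM to 10 PM on a weekday.

    `moment` must be timezone-aware. `region_tz` is the time zone of the
    cluster's region (hypothetical parameter). Assumption: 6:00 AM is treated
    as excluded and 10:00 PM as allowed; the source text doesn't specify
    how the boundaries are handled.
    """
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    local = moment.astimezone(region_tz)
    if local.weekday() >= 5:  # Saturday (5) or Sunday (6): any hour qualifies
        return True
    return not (6 <= local.hour < 22)


# Example: Wednesday noon versus Wednesday 11 PM, for a region at UTC-5.
region = timezone(timedelta(hours=-5))
print(in_default_maintenance_hours(datetime(2024, 1, 3, 12, 0, tzinfo=region), region))
print(in_default_maintenance_hours(datetime(2024, 1, 3, 23, 0, tzinfo=region), region))
```

The fixed UTC offset keeps the sketch self-contained. For a real region, a named time zone that accounts for daylight saving time is the better fit.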

If your cluster is better served by maintenance timing other than the default, then you can specify a maintenance window. A maintenance window defines your preferred time, in terms of hour-of-day and day-of-week, for your cluster to begin its maintenance events. For example, you can set a cluster to have a maintenance window that begins at 11 AM on Sundays, UTC.

If you set a maintenance window, then AlloyDB schedules future non-emergency maintenance events to begin no later than one hour after the specified time. In addition, if you opt in to receive email notifications about upcoming AlloyDB maintenance events, then you receive an automated notification about the event as soon as it is scheduled. Maintenance events are scheduled at least one week ahead of time.
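To make the timing concrete, the following Python sketch computes the earliest window occurrence that is at least one week after the moment an event is scheduled, along with the latest time that the event would begin (one hour after the window start). The function and constant names are hypothetical, and the sketch doesn't claim that AlloyDB always picks the earliest eligible window; it only shows the constraints described above.

```python
from datetime import datetime, timedelta, timezone

# Day names indexed to match datetime.weekday(), where Monday is 0.
DAYS = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY", "FRIDAY", "SATURDAY", "SUNDAY"]


def earliest_eligible_window(scheduled_at: datetime, day: str, hour: int):
    """Return (window_start, latest_begin) in UTC.

    `scheduled_at` is the timezone-aware moment when the event is scheduled.
    `day` and `hour` describe the maintenance window in UTC.
    """
    if scheduled_at.tzinfo is None:
        raise ValueError("scheduled_at must be timezone-aware")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")

    # Events are scheduled at least one week ahead of time.
    earliest = scheduled_at.astimezone(timezone.utc) + timedelta(days=7)

    # Move forward to the requested day-of-week and hour.
    candidate = earliest.replace(hour=hour, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(DAYS.index(day) - candidate.weekday()) % 7)
    if candidate < earliest:
        candidate += timedelta(days=7)

    # Maintenance begins no later than one hour after the window start.
    return candidate, candidate + timedelta(hours=1)


# Example: an event scheduled on a Wednesday, with a window of Sundays at 11 AM UTC.
start, latest = earliest_eligible_window(
    datetime(2024, 1, 3, 9, 0, tzinfo=timezone.utc), "SUNDAY", 11
)
print(start.isoformat(), latest.isoformat())
```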

You can't set the ending time of a maintenance window, because the total time required for a single maintenance event varies with the complexity of the cluster (that is, the number of read pool instances that require updating) and the nature of the update. While the downtime required for any individual instance can be very brief, the entire maintenance event might take hours. For this reason, you can use a maintenance window to control the general time of day when the instances of your cluster experience maintenance downtime, but you can't specify a to-the-minute downtime window for any instance.

Emergency maintenance events, such as applying urgent security patches, might occur outside of default maintenance times or configured maintenance windows.

Maintenance window best practices

We recommend that you set maintenance windows on your production clusters, and that you don't set maintenance windows on your non-production clusters. This is because of the following broad order of events around a maintenance update:

  1. First, Google updates all of your clusters that don't have maintenance windows.
  2. Next, Google schedules updates for all of your clusters that do have maintenance windows. These updates have at least one week of lead time.
  3. If you have opted in to receive communication about upcoming AlloyDB maintenance events, then Google emails you with notification about the scheduled maintenance.
  4. Google performs the maintenance updates at the scheduled times.

Therefore, a notification of upcoming maintenance also means that the same updates have already been applied to all of your clusters that have no maintenance window set. If you leave your non-production clusters without maintenance windows, then they receive system updates first, and you can use upcoming-maintenance notifications as a prompt to test or preview the updates in a non-production environment.
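The ordering and the recommendation above can be sketched in Python as follows. The `Cluster` type, its fields, and both function names are hypothetical and exist only for this illustration; the sketch models the broad order of events rather than any actual AlloyDB API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cluster:
    name: str
    production: bool
    # (day-of-week, hour in UTC), or None when no maintenance window is set.
    window: Optional[tuple] = None


def rollout_phases(clusters):
    """Split cluster names into (updated first, scheduled with lead time)."""
    first = [c.name for c in clusters if c.window is None]
    scheduled = [c.name for c in clusters if c.window is not None]
    return first, scheduled


def audit_windows(clusters):
    """List clusters whose window setting differs from the recommendation."""
    findings = []
    for c in clusters:
        if c.production and c.window is None:
            findings.append(f"{c.name}: production cluster without a maintenance window")
        elif not c.production and c.window is not None:
            findings.append(f"{c.name}: non-production cluster with a maintenance window")
    return findings


# Example inventory (made-up cluster names).
inventory = [
    Cluster("prod-a", production=True, window=("SUNDAY", 11)),
    Cluster("staging-a", production=False),
    Cluster("prod-b", production=True),
]
print(rollout_phases(inventory))
print(audit_windows(inventory))
```

In this example, `staging-a` receives the update first, which gives you the lead time before the maintenance window of `prod-a` to test the update, while `prod-b` is flagged because it would be updated without that lead time.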

What's next