This page explains how regional clusters work. To learn how to create a regional cluster, refer to Creating a Cluster.
By default, a cluster creates its cluster master and its nodes in a single compute zone that you specify at the time of creation. You can improve the availability of your cluster by creating regional clusters.
A regional cluster provides a single static endpoint for the entire cluster and spreads your cluster's Pods across multiple zones of a given region. This allows you to access the cluster's control plane even during an outage or downtime involving one or more (but not all) individual zones.
Regional clusters distribute Kubernetes resources across multiple zones within a region. A regional cluster's masters and nodes are spread across multiple zones. The default number of masters, default number of nodes per zone, and default number of zones included are all three, but you can reduce or increase the number to achieve the appropriate cluster size and number of zones.
You decide whether your cluster is zonal or regional when you create it. You cannot convert an existing zonal cluster to regional, or vice versa.
- By default, regional clusters consist of nine nodes spread evenly across three zones in a region. This consumes nine IP addresses. You can reduce the number of nodes down to one per zone, if desired. Newly-created Google Cloud Platform accounts are granted only eight IP addresses per region, so you may need to request an increase in your quotas for regional in-use IP addresses, depending on the size of your regional cluster. If you have too few available in-use IP addresses, cluster creation fails.
- For regional clusters that run GPUs, you must either choose a region that has GPUs in three zones, or specify zones using the --node-locations flag. For a complete list of applicable regions and zones, refer to GPUs on Compute Engine. Otherwise, you may see an error like the following:
ERROR: (gcloud.container.clusters.create) ResponseError: code=400, message= (1) accelerator type "nvidia-tesla-k80" does not exist in zone us-west1-c. (2) accelerator type "nvidia-tesla-k80" does not exist in zone us-west1-a.
- You can't create node pools in zones outside of the cluster's zones. However, you can change a cluster's zones, which causes all new and existing nodes to span those zones.
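The node and IP address arithmetic in the first limitation above can be sketched in Python. This is an illustrative calculation only, not a GKE API: the function names and the NEW_ACCOUNT_IP_QUOTA constant are hypothetical, and the sketch assumes one in-use IP address per node, as the text describes.

```python
# Defaults taken from the text: three zones, three nodes per zone.
DEFAULT_ZONES = 3
DEFAULT_NODES_PER_ZONE = 3
# Assumption from the text: new accounts get eight in-use IP addresses per region.
NEW_ACCOUNT_IP_QUOTA = 8


def regional_ip_usage(nodes_per_zone=DEFAULT_NODES_PER_ZONE, zones=DEFAULT_ZONES):
    """Return the number of in-use IP addresses, assuming one per node."""
    if nodes_per_zone < 1 or zones < 1:
        raise ValueError("nodes_per_zone and zones must both be at least 1")
    return nodes_per_zone * zones


def fits_quota(nodes_per_zone, zones, quota=NEW_ACCOUNT_IP_QUOTA):
    """Return True if the cluster's IP usage stays within the regional quota."""
    return regional_ip_usage(nodes_per_zone, zones) <= quota


print(regional_ip_usage())  # 9 (default cluster: 3 nodes x 3 zones)
print(fits_quota(3, 3))     # False (9 addresses exceed a quota of 8)
print(fits_quota(2, 3))     # True (6 addresses fit within a quota of 8)
```

Under these assumptions, a default-sized regional cluster does not fit a new account's quota, while a cluster with two nodes per zone does.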
Regional clusters are offered at no additional charge.
Using regional clusters requires more of your project's regional quotas. Ensure that you understand your quotas and the Google Kubernetes Engine pricing before using regional clusters. If you encounter an "Insufficient regional quota to satisfy request for resource" error, your request exceeds your available quota in the current region.
Additionally, you are charged for node-to-node traffic across zones. For example, if you had a service in one zone that needed to talk to a service in another zone, you would be charged for the cross-zone network traffic. For more information, refer to the "Egress between zones in the same region (per GB)" pricing on the Compute Engine pricing page.
How regional clusters work
Regional clusters replicate cluster masters and nodes across multiple zones within a single region. For example, a regional cluster in the us-east1 region creates masters and nodes in all three us-east1 zones, including us-east1-d. This ensures higher availability of resources and protects clusters from zonal downtime, as regional clusters and their resources do not fail if a single zone fails. In the event of an infrastructure outage, the regional control plane remains available and nodes can be rebalanced manually or using the cluster autoscaler.
Benefits of using regional clusters include:
- Resilience from single zone failure. Regional clusters are available across a region rather than a single zone within a region. If a single zone becomes unavailable, your Kubernetes control plane and your resources are not impacted.
- Zero downtime master upgrades, master resize, and reduced downtime from master failures. Regional clusters provide a high availability control plane, so you can access your control plane even during upgrades.
Persistent storage in regional clusters
Persistent storage disks are zonal resources. When you add persistent storage to your cluster, unless a zone is specified, GKE assigns the disk to a single zone. GKE chooses the zone at random. When using a StatefulSet, the provisioned persistent disks for each replica are spread across zones.
Once a persistent disk is provisioned, any Pods referencing the disk are scheduled to the same zone as the disk.
A read-write persistent disk cannot be attached to multiple nodes.
Autoscaling regional clusters
You can use the cluster autoscaler to automatically scale regional clusters. The following sections offer some considerations for using the cluster autoscaler with regional clusters.
Overprovisioning scaling limits
To maintain capacity in the unlikely event of zonal failure, you can overprovision your scaling limits.
For example, if you overprovision a three-zone cluster by 150%, the cluster can still serve 100% of traffic from the remaining zones when one zone, and with it one-third of the cluster's capacity, is lost. If your workload needs four nodes per zone (twelve nodes in total), you would accomplish this by specifying a maximum of six nodes per zone rather than four. If one zone fails, the cluster scales to twelve nodes across the two remaining zones.
Similarly, if you overprovision a two-zone cluster by 200%, you can ensure that 100% of traffic is rerouted if half of the cluster's capacity is lost.
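The overprovisioning arithmetic in these examples can be sketched as follows. This is a minimal illustration, not part of the cluster autoscaler: both function names are hypothetical, and the sketch assumes nodes are spread evenly across zones and that every surviving zone can scale up to its per-zone maximum.

```python
import math


def overprovision_percent(zones, zones_lost=1):
    """Percentage of required capacity to provision so that the
    surviving zones can absorb the full load."""
    surviving = zones - zones_lost
    if surviving < 1:
        raise ValueError("at least one zone must survive")
    return 100 * zones / surviving


def max_nodes_per_zone(required_per_zone, zones, zones_lost=1):
    """Per-zone maximum that lets the surviving zones host every
    node the workload requires."""
    surviving = zones - zones_lost
    if surviving < 1:
        raise ValueError("at least one zone must survive")
    total_required = required_per_zone * zones
    # Round up so the surviving zones never fall short of the total.
    return math.ceil(total_required / surviving)


print(overprovision_percent(3))   # 150.0
print(max_nodes_per_zone(4, 3))   # 6 (12 nodes spread over 2 surviving zones)
print(overprovision_percent(2))   # 200.0
print(max_nodes_per_zone(4, 2))   # 8
```

The resulting per-zone value is what you would use as the maximum scaling limit for each zone.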
To learn about autoscaling limits for regional clusters, refer to Autoscaling limits.
Balancing across zones
To learn how the cluster autoscaler balances the size of your cluster across multiple zones, refer to Balancing across zones.