Multi-Zone and Regional Clusters

This page provides an overview of multi-zone and regional cluster support in Kubernetes Engine.

Overview

By default, a cluster creates its cluster master and its nodes in a single compute zone that you specify at the time of creation. You can improve your clusters' availability and resilience by creating multi-zone or regional clusters. Multi-zone and regional clusters distribute Kubernetes resources across multiple zones within a region.

Regional clusters:

  • create three cluster masters across three zones
  • by default, create nodes in three zones, or in as many zones as desired

Multi-zone clusters:

  • create a single cluster master in one zone
  • create nodes in multiple zones

The primary difference between regional and multi-zone clusters is that regional clusters create three masters and multi-zone clusters create only one.

You choose between a multi-zone and a regional cluster at the time of cluster creation. However, you cannot later migrate a multi-zone cluster to a regional cluster, or downgrade a regional cluster to a multi-zone cluster.

Limitations

  • For multi-zone or regional clusters that run GPUs, no region currently offers any single GPU type in three zones. If you want to run GPUs in a multi-zone or regional cluster, you need to specify the zones explicitly using the --node-locations flag.

    Attempting to create a GPU cluster that spans three zones returns an error like the following:

    ERROR: (gcloud.container.clusters.create) ResponseError: code=400, message=
              (1) accelerator type "nvidia-tesla-k80" does not exist in zone us-west1-c.
              (2) accelerator type "nvidia-tesla-k80" does not exist in zone us-west1-a.
  • All node pools in a cluster run in the same zones as the cluster. You can't create node pools in zones outside of the cluster's zones. However, you can change a cluster's zones, which causes all new and existing nodes to span those zones.

Pricing

Regional clusters are offered at no additional charge.

Using multi-zone or regional clusters requires more of your project's regional quotas. Ensure that you understand your quotas and the Kubernetes Engine pricing before using multi-zone or regional clusters. If you encounter an "Insufficient regional quota to satisfy request for resource" error, your request exceeds your available quota in the current region.

Additionally, you are charged for node-to-node traffic across zones. For example, if you had a service in one zone that needed to talk to a service in another zone, you would be charged for the cross-zone network traffic. For more information, refer to the "Egress between zones in the same region (per GB)" pricing on the Compute Engine pricing page.

Regional clusters

A regional cluster provides a single static endpoint for the entire cluster, providing you with the ability to access the cluster's control plane regardless of any outage or downtime in an individual zone.

How regional clusters work

Regional clusters replicate cluster masters and nodes across multiple zones within a single region. For example, a regional cluster in the us-east1 region creates masters and nodes in all three us-east1 zones: us-east1-b, us-east1-c, and us-east1-d. This ensures higher availability of resources and protects clusters from zonal downtime, as regional clusters and their resources do not fail if a single zone fails. In the event of an infrastructure outage, the regional control plane remains available and nodes can be rebalanced manually or by using cluster autoscaler.

There are several benefits to using regional clusters, including:

  • Resilience from single zone failure. Because regional clusters are available across a region rather than a single zone within a region, if a single zone goes down, your Kubernetes control plane and your resources are not impacted.
  • Zero downtime master upgrades and reduced downtime from master failures. Because regional clusters use a high availability control plane, the control plane remains available even during upgrades.

Creating a regional cluster

You can create a regional cluster by using the GCP Console or the gcloud command-line tool.

By default, when a regional cluster is created, the cluster's node pools are replicated across three zones.

gcloud

To create a regional cluster, run the following command:

gcloud container clusters create [CLUSTER_NAME] --region [REGION] \
[--node-locations [ZONE,ZONE...]]

where [CLUSTER_NAME] is the name you choose for the regional cluster, and [REGION] is the desired region, such as us-central1. For regions with more than three zones, or in cases where fewer zones are preferred, the optional --node-locations flag overrides the default zones in which the nodes are replicated.

For example, to create a regional cluster with nine nodes in us-west1 (three zones with three nodes each, which is the default):

gcloud container clusters create my-regional-cluster --region us-west1

To create a regional cluster with six nodes (three zones with two nodes each, specified by --num-nodes):

gcloud container clusters create my-regional-cluster --num-nodes 2 \
--region us-west1

To create a regional cluster with six nodes in two zones (two zones, specified by --node-locations, with three nodes each):

gcloud container clusters create my-regional-cluster --region us-central1 \
--node-locations us-central1-b,us-central1-c

Console

To create a regional cluster, perform the following steps:

  1. Visit the Kubernetes Engine menu in the Google Cloud Platform Console.

  2. Click Create cluster.

  3. From Location, select Regional.
  4. From the Region drop-down menu, select the desired region, such as us-central1.
  5. Configure your cluster as desired, then click Create.

Multi-zone clusters

Multi-zone clusters can help improve the availability of your applications by creating nodes in multiple zones. This helps protect against downtime in the unlikely event of a zone-wide outage.

A multi-zone cluster creates nodes in multiple zones within the same region. All nodes in a multi-zone cluster are controlled by the same cluster master.

How multi-zone clusters work

Figure 1. A zonal cluster in a single region. Creating a multi-zone cluster causes its resources to be spread across zones.

When you create a multi-zone cluster, either initially or by adding zones to an existing cluster, Kubernetes Engine makes the resource footprint the same in all zones.

For example, suppose that you request two VM instances with four cores each, and you ask for your cluster to be spread across three zones. In that case, you would get a total of 24 cores, with eight cores in each zone.
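The arithmetic in the example above can be sketched as a small shell function. The function name total_cores is hypothetical and is not part of gcloud; it only multiplies the per-zone footprint by the number of zones.

```shell
# Hypothetical helper (not a gcloud command): total cores in a multi-zone cluster.
# Arguments: VM instances per zone, cores per instance, number of zones.
total_cores() {
  nodes_per_zone=$1
  cores_per_node=$2
  zones=$3
  echo $(( nodes_per_zone * cores_per_node * zones ))
}

# Two four-core VM instances, replicated across three zones.
total_cores 2 4 3   # prints 24
```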

Multi-zone clusters attempt to spread resources evenly across zones to ensure that Pods are scheduled evenly across zones. Doing so improves availability and failure recovery. If computing resources were spread unevenly across zones, the scheduler might not be able to schedule Pods evenly. You can guarantee even distribution of resources by specifying Pod anti-affinity.

The default and any custom node pools in multi-zone clusters automatically have multi-zone availability. Those nodes also have labels applied in Kubernetes that indicate their failure domain, so that they can be taken into account by the Kubernetes scheduler.

Creating a multi-zone cluster

gcloud

To create a multi-zone cluster, use the gcloud container clusters create command. Use --zone to specify the zone for the cluster control plane. Use --node-locations to specify all of the desired zones for nodes:

gcloud container clusters create [CLUSTER_NAME] \
--zone [COMPUTE_ZONE] \
--node-locations [COMPUTE_ZONE,COMPUTE_ZONE,...]

where:

  • [CLUSTER_NAME] is the name you choose for the cluster
  • --zone [COMPUTE_ZONE] is the zone for the cluster control plane
  • --node-locations [COMPUTE_ZONE,COMPUTE_ZONE,...] is all of the zones in which the cluster runs, including the cluster control plane's zone.

For example:

gcloud container clusters create example-cluster \
--zone us-central1-a \
--node-locations us-central1-a,us-central1-b,us-central1-c

When the --num-nodes flag is omitted, the default number of per-zone nodes created by the cluster is three. Because three zones were specified, this command creates a nine-node cluster with three nodes each in us-central1-a, us-central1-b, and us-central1-c.
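As a rough sketch of how the node count above comes about, the following shell functions multiply the per-zone node count by the number of zones in a --node-locations value. The names count_zones and total_nodes are hypothetical helpers for illustration, not gcloud features.

```shell
# Hypothetical helper: count the zones in a comma-separated --node-locations value.
count_zones() {
  # Put each zone on its own line, then count the non-empty lines.
  printf '%s\n' "$1" | tr ',' '\n' | grep -c .
}

# Hypothetical helper: total nodes = per-zone node count * number of zones.
# Arguments: per-zone node count (--num-nodes), --node-locations value.
total_nodes() {
  echo $(( $1 * $(count_zones "$2") ))
}

# Default of three nodes per zone across three zones.
total_nodes 3 us-central1-a,us-central1-b,us-central1-c   # prints 9
```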

Console

To create a multi-zone cluster, perform the following steps:

  1. Visit the Kubernetes Engine menu in the GCP Console.

  2. Click Create cluster.

  3. From Location, ensure that Zonal is selected.
  4. From the Zone drop-down menu, select the desired zone for your cluster control plane, such as us-central1-a.
  5. Configure your cluster as desired, then click More.
  6. From the Additional zones section, select additional zones in which you'd like the cluster to run.
  7. Click Create.

Adding or removing zones in an existing cluster

gcloud

To add or remove zones in an existing cluster, use the gcloud container clusters update command:

gcloud container clusters update [CLUSTER_NAME] \
--zone [COMPUTE_ZONE] \
--node-locations [COMPUTE_ZONE,COMPUTE_ZONE,...]

where:

  • [CLUSTER_NAME] is the name of the existing cluster
  • --zone [COMPUTE_ZONE] is the zone for the cluster control plane
  • --node-locations [COMPUTE_ZONE,COMPUTE_ZONE,...] is all of the desired zones. Include the cluster control plane's zone.

For example, example-cluster runs primarily in us-central1-a. To add two more zones to the cluster, you'd run the following command:

gcloud container clusters update example-cluster \
--zone us-central1-a \
--node-locations us-central1-a,us-central1-b,us-central1-c

As another example, example-cluster runs in us-central1-a, us-central1-b and us-central1-c. If you only want the cluster to run in us-central1-a and us-central1-b, you'd run the following command:

gcloud container clusters update example-cluster \
--zone us-central1-a \
--node-locations us-central1-a,us-central1-b

Console

To add or remove zones in an existing cluster, perform the following steps:

  1. Visit the Kubernetes Engine menu in the GCP Console.

  2. Select the desired cluster, then click Edit.

  3. From the Additional zones section, select some or all of the desired zones.
  4. Click Save.

Persistent storage in multi-zone and regional clusters

Persistent storage disks are zonal resources. When you add persistent storage to your cluster, unless a zone is specified, Kubernetes Engine assigns the disk to a single zone. Kubernetes Engine chooses the zone at random. When using a StatefulSet, the provisioned persistent disks for each replica are spread across zones.

Once a persistent disk is provisioned, any Pods referencing the disk are scheduled to the same zone as the disk.

A read-write persistent disk cannot be attached to multiple nodes.

Autoscaling multi-zone and regional clusters

You can use cluster autoscaler to automatically scale your multi-zone or regional cluster. Cluster autoscaler uses the following mechanics when scaling clusters in more than one zone:

Autoscaling limits

When you autoscale multi-zone and regional clusters, node pool scaling limits are determined by zone availability.

For example, the following command creates an autoscaling multi-zone cluster with six nodes across three zones, with a minimum of one node per zone and a maximum of four nodes per zone:

gcloud container clusters create example-cluster \
--zone us-central1-a \
--node-locations us-central1-a,us-central1-b,us-central1-f \
--num-nodes 2 --enable-autoscaling --min-nodes 1 --max-nodes 4

The total size of this cluster is between three and twelve nodes, spread across three zones. If one of the zones fails, the total size of the cluster becomes between two and eight nodes.
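The bounds above follow from multiplying the per-zone limits by the number of available zones. A minimal sketch, assuming a hypothetical helper named cluster_size_range (not part of gcloud or cluster autoscaler):

```shell
# Hypothetical helper: print "MIN MAX" for the total cluster size.
# Arguments: --min-nodes and --max-nodes (both per zone), number of available zones.
cluster_size_range() {
  echo "$(( $1 * $3 )) $(( $2 * $3 ))"
}

cluster_size_range 1 4 3   # all three zones available: prints "3 12"
cluster_size_range 1 4 2   # one zone has failed: prints "2 8"
```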

Overprovisioning scaling limits

To maintain capacity in the unlikely event of zonal failure, you can overprovision your scaling limits.

For example, if you overprovision a three-zone cluster by 150%, you can ensure that 100% of traffic is routed to available zones if one-third of the cluster's capacity is lost. In the above example, you would accomplish this by specifying a maximum of six nodes per zone rather than four. If one zone fails, the cluster scales to twelve nodes in the remaining zones.

Similarly, if you overprovision a two-zone cluster by 200%, you can ensure that 100% of traffic is rerouted if half of the cluster's capacity is lost.
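The 150% and 200% figures above are consistent with scaling the per-zone maximum by zones / (zones - 1), so that the remaining zones can carry the full load after one zone fails. The sketch below works under that assumption; the function name overprovisioned_max is hypothetical, and the formula is an interpretation of the examples rather than a documented rule.

```shell
# Hypothetical helper: per-zone maximum needed so that the remaining zones
# can absorb the full load after a single zone fails.
# Arguments: per-zone maximum sized for normal operation, number of zones.
overprovisioned_max() {
  base_max=$1
  zones=$2
  if [ "$zones" -lt 2 ]; then
    echo "at least two zones are required" >&2
    return 1
  fi
  total=$(( base_max * zones ))   # capacity needed at full load
  remaining=$(( zones - 1 ))      # zones left after one fails
  # Integer ceiling division, so the result never falls short of the target.
  echo $(( (total + remaining - 1) / remaining ))
}

overprovisioned_max 4 3   # three zones, 150%: prints 6
overprovisioned_max 4 2   # two zones, 200%: prints 8
```

The resulting value would be passed as --max-nodes in place of the original per-zone maximum.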

You can overprovision multi-zone and regional clusters. For more information on cluster autoscaler, refer to the cluster autoscaler documentation or FAQ for autoscaling in the Kubernetes documentation.

Balancing across zones

To learn how cluster autoscaler balances the size of your cluster across multiple zones, refer to the cluster autoscaler documentation.
