Survivability mode

By default, Google Distributed Cloud connected requires a constant connection to Google Cloud for normal operation. This is because the clusters that you create with Google Distributed Cloud connected are, by default, cloud control plane clusters: the Kubernetes control plane that orchestrates the workloads running on such a cluster runs in Google Cloud.

To enhance the reliability of Distributed Cloud connected, you have the option to create Distributed Cloud connected clusters that use a local control plane deployed on your Distributed Cloud connected hardware. When the connection to Google Cloud is lost, such clusters enter survivability mode and your workloads continue to run for up to 7 days.

Only Distributed Cloud connected clusters that have been deployed with a local control plane can enter survivability mode when the connection to Google Cloud is lost. Clusters that have been deployed with a cloud control plane running in Google Cloud cannot enter survivability mode. You cannot reconfigure an existing cluster that's using a cloud control plane to use a local control plane.

When you create a cluster with a local control plane, the following rules apply:

  • You must create local control plane clusters in their own Google Cloud project. Local control plane clusters cannot coexist in the same Google Cloud project with any other type of cluster, including non-Distributed Cloud connected clusters. Mixing local control plane Distributed Cloud clusters with any other type of cluster in the same Google Cloud project can result in data loss.
  • If you reassign a node between Distributed Cloud connected clusters, that node is wiped clean and reset to the default configuration.
  • By default, the local control plane workloads run in high availability mode with three replicas that span three nodes chosen automatically by Distributed Cloud. The exceptions are clusters with fewer than three nodes and clusters that you specifically configure to use one node to run the local control plane workloads. You also have the option to specify the three nodes for high availability mode by using the --control-plane-machine-filter flag. No other node combinations are supported.
  • The nodes that run the local control plane workloads also run your application workloads.
  • The IP addresses of local control plane endpoints are accessible on your local network. You must ensure that your local network's security configuration prevents external access to those IP addresses.

When in survivability mode, a Distributed Cloud connected cluster operates as follows:

  • Control over workloads through the Google Cloud CLI, the kubectl CLI, and the Distributed Cloud Edge Container API is disabled.
  • Distributed Cloud software updates, SLOs, and hardware repair are unavailable.
  • Limited logs and metrics are synchronized with Google Cloud after the connection to Google Cloud is re-established.
  • By default, if a node reboots while the cluster is disconnected from Google Cloud, it cannot rejoin its cluster until the connection to Google Cloud is re-established because its authentication key cannot be refreshed. You have the option to specify an offline reboot window during which a node can rejoin a cluster after rebooting while the cluster is running in survivability mode. For more information, see Create a cluster.

Prerequisites

Before you can create a Distributed Cloud connected cluster with a local control plane, you must enable the required APIs in the target Google Cloud project. To do so, you must have one of the following roles in the Google Cloud project:

  • Owner (roles/owner)
  • Editor (roles/editor)
  • Service Usage Admin (roles/serviceusage.serviceUsageAdmin)

For more information about these roles, see Basic roles.

For information about granting roles, see Grant a single role.
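
As an illustration of granting one of these roles from the command line, the following sketch builds a gcloud projects add-iam-policy-binding command for the Service Usage Admin role. The project ID and member are hypothetical placeholders, and the DRY_RUN switch is a convenience added for this example rather than a gcloud feature. By default the script only prints the command so that you can review it first.

```shell
#!/bin/sh
# Hypothetical placeholders; replace with your own project and principal.
PROJECT_ID="${PROJECT_ID:-my-lcp-project}"
MEMBER="${MEMBER:-user:admin@example.com}"
# DRY_RUN=1 (default) prints the command; DRY_RUN=0 runs it.
DRY_RUN="${DRY_RUN:-1}"

grant_service_usage_admin() {
  # Build the command as positional parameters so arguments stay intact.
  set -- gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    "--member=$MEMBER" \
    "--role=roles/serviceusage.serviceUsageAdmin"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

grant_service_usage_admin
```

Running the script with DRY_RUN=0 applies the binding. The caller needs permission to change the project's IAM policy.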

To create a Distributed Cloud connected cluster with a local control plane, enable the following APIs:

  • anthos.googleapis.com
  • anthosaudit.googleapis.com
  • anthosgke.googleapis.com
  • cloudresourcemanager.googleapis.com
  • connectgateway.googleapis.com
  • container.googleapis.com
  • edgecontainer.googleapis.com
  • gkeconnect.googleapis.com
  • gkehub.googleapis.com
  • gkeonprem.googleapis.com
  • iam.googleapis.com
  • logging.googleapis.com
  • monitoring.googleapis.com
  • opsconfigmonitoring.googleapis.com
  • serviceusage.googleapis.com
  • stackdriver.googleapis.com
  • storage.googleapis.com
  • sts.googleapis.com

For information about enabling APIs, see Enabling services.
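
The APIs listed above can be enabled in a single gcloud services enable call. The following is a minimal sketch that assumes a hypothetical project ID. The DRY_RUN switch is an addition for this example, not part of gcloud: by default the script prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical project ID; use the project dedicated to local control plane clusters.
PROJECT_ID="${PROJECT_ID:-my-lcp-project}"
# DRY_RUN=1 (default) prints the command; DRY_RUN=0 runs it.
DRY_RUN="${DRY_RUN:-1}"

# The APIs required for local control plane clusters, one per line.
REQUIRED_APIS="anthos.googleapis.com
anthosaudit.googleapis.com
anthosgke.googleapis.com
cloudresourcemanager.googleapis.com
connectgateway.googleapis.com
container.googleapis.com
edgecontainer.googleapis.com
gkeconnect.googleapis.com
gkehub.googleapis.com
gkeonprem.googleapis.com
iam.googleapis.com
logging.googleapis.com
monitoring.googleapis.com
opsconfigmonitoring.googleapis.com
serviceusage.googleapis.com
stackdriver.googleapis.com
storage.googleapis.com
sts.googleapis.com"

enable_required_apis() {
  # Word splitting of the newline-separated list is intentional here.
  # shellcheck disable=SC2086
  set -- gcloud services enable $REQUIRED_APIS "--project=$PROJECT_ID"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

enable_required_apis
```

To confirm the result afterwards, gcloud services list --enabled --project=PROJECT_ID shows the APIs that are currently enabled in the project.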

Upgrade your Google Cloud SDK to version 450.0.0 or later

You must upgrade your Google Cloud SDK to version 450.0.0 or later to create local control plane clusters running Distributed Cloud connected software version 1.5.0. With an earlier SDK version, creating such clusters fails.
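
One way to check the installed version before creating a cluster is sketched below. It assumes that the first line of gcloud version output has the form "Google Cloud SDK X.Y.Z"; the helper function names are made up for this example. The script only reports the result and does not update anything.

```shell
#!/bin/sh
MIN_SDK_VERSION="450.0.0"

# Succeeds if dotted numeric version $1 is greater than or equal to $2.
version_at_least() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    n = split(have, h, "."); m = split(want, w, ".")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      a = h[i] + 0; b = w[i] + 0   # missing fields count as 0
      if (a > b) exit 0
      if (a < b) exit 1
    }
    exit 0
  }'
}

# Reports whether the given SDK version meets the minimum.
check_sdk_version() {
  if version_at_least "$1" "$MIN_SDK_VERSION"; then
    echo "OK: Google Cloud SDK $1 meets the $MIN_SDK_VERSION minimum"
  else
    echo "Too old: Google Cloud SDK $1 is earlier than $MIN_SDK_VERSION"
    return 1
  fi
}

# Only query gcloud if it is installed on this machine.
if command -v gcloud >/dev/null 2>&1; then
  check_sdk_version "$(gcloud version 2>/dev/null | awk '/^Google Cloud SDK/ {print $4}')"
fi
```

If the check reports an older version, gcloud components update upgrades an SDK that was installed with the standalone installer. An SDK installed through a system package manager is upgraded through that package manager instead.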

Create a cluster with a local control plane

To create a Distributed Cloud connected cluster with a local control plane, you must pass the following flags when creating the cluster:

  • --control-plane-node-location instructs Distributed Cloud connected to deploy the control plane workloads for this cluster locally. The value is the name of the target Distributed Cloud connected zone.
  • --control-plane-node-count (optional) specifies the number of nodes on which to run the local control plane workloads. Valid values are 3 for high availability and 1 for standard operation. If omitted, defaults to 3.
  • --control-plane-machine-filter (optional) specifies a regex-formatted list of nodes that run the local control plane workloads. If omitted, Distributed Cloud connected selects the nodes automatically at random.
  • --control-plane-shared-deployment-policy (required) specifies whether application workloads can run on the nodes that run the local control plane workloads. The only valid value is ALLOWED. If you omit this flag, cluster creation fails.
  • --external-lb-ipv4-address-pools specifies a comma-delimited list of IPv4 addresses, address ranges, or subnetworks for ingress traffic for Services that run behind the Distributed Cloud connected load balancer.

For more information about creating Distributed Cloud connected clusters, see Create a cluster.
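
The flags above might be combined as in the following sketch. The cluster name, project, region, zone name, and address pool are hypothetical placeholders. The --project and --location flags and the gcloud edge-cloud container clusters create command path are not described in this section and are assumed here, so check them against the Create a cluster page. The script validates the control plane node count and, by default, prints the command rather than running it. The DRY_RUN switch is an addition for this example.

```shell
#!/bin/sh
# All values below are hypothetical placeholders.
CLUSTER_NAME="${CLUSTER_NAME:-lcp-cluster-1}"
PROJECT_ID="${PROJECT_ID:-my-lcp-project}"
REGION="${REGION:-us-central1}"
ZONE="${ZONE:-us-central1-edge-example1}"   # target Distributed Cloud connected zone
CP_NODE_COUNT="${CP_NODE_COUNT:-3}"         # 3 = high availability, 1 = standard
LB_POOLS="${LB_POOLS:-10.100.68.104-10.100.68.110}"
# DRY_RUN=1 (default) prints the command; DRY_RUN=0 runs it.
DRY_RUN="${DRY_RUN:-1}"

create_lcp_cluster() {
  # Only 1 and 3 are valid control plane node counts.
  case "$CP_NODE_COUNT" in
    1|3) ;;
    *) echo "control plane node count must be 1 or 3, got: $CP_NODE_COUNT" >&2
       return 1 ;;
  esac
  # --control-plane-machine-filter is optional and omitted here, so the
  # control plane nodes are selected automatically.
  set -- gcloud edge-cloud container clusters create "$CLUSTER_NAME" \
    "--project=$PROJECT_ID" \
    "--location=$REGION" \
    "--control-plane-node-location=$ZONE" \
    "--control-plane-node-count=$CP_NODE_COUNT" \
    "--control-plane-shared-deployment-policy=ALLOWED" \
    "--external-lb-ipv4-address-pools=$LB_POOLS"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

create_lcp_cluster
```

Setting CP_NODE_COUNT=1 produces a command for a single-node control plane. Any value other than 1 or 3 is rejected before gcloud is invoked.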

What's next