Survivability mode

By default, Google Distributed Cloud Edge requires a constant connection to Google Cloud for normal operation. This is because clusters that you create with Distributed Cloud Edge are, by default, cloud control plane clusters: the Kubernetes control plane that orchestrates the workloads running on such a cluster runs in Google Cloud.

To improve the reliability of Distributed Cloud Edge, you can create clusters that use a local control plane deployed on your Distributed Cloud Edge hardware. When the connection to Google Cloud is lost, such clusters enter survivability mode and your workloads continue to run.

Only Distributed Cloud Edge clusters that have been deployed with a local control plane can enter survivability mode when the connection to Google Cloud is lost. Clusters that have been deployed with a cloud control plane running in Google Cloud cannot enter survivability mode. You cannot reconfigure an existing cluster that's using a cloud control plane to use a local control plane.

When you create a cluster with a local control plane, the following rules apply:

  • You must create local control plane clusters in their own Google Cloud project. Local control plane clusters cannot coexist in the same Google Cloud project with any other type of cluster, including non-Distributed Cloud Edge clusters. Mixing local control plane Distributed Cloud Edge clusters with any other type of cluster in the same Google Cloud project can result in data loss.
  • If you reassign a node between Distributed Cloud Edge clusters, that node is wiped clean and reset to the default configuration.
  • By default, the local control plane workloads run in high availability mode with three replicas on three nodes that Distributed Cloud Edge chooses automatically. The exceptions are clusters that have fewer than three nodes and clusters that you explicitly configure to run the local control plane workloads on one node. You can also choose the three nodes for high availability mode yourself by using the --control-plane-machine-filter flag. No other node combinations are supported.
  • The nodes that run the local control plane workloads also run your application workloads.
  • The IP addresses of local control plane endpoints are accessible on your local network. You must ensure that your local network's security configuration prevents external access to those IP addresses.

When in survivability mode, a Distributed Cloud Edge cluster operates as follows:

  • Control over workloads through the Google Cloud CLI, the kubectl CLI, and the Edge Container API is disabled.
  • Distributed Cloud Edge software updates, SLOs, and hardware repair are unavailable.
  • Limited logs and metrics are synchronized with Google Cloud after the connection to Google Cloud is re-established.
  • If a node reboots while the cluster is disconnected from Google Cloud, it cannot rejoin its cluster until the connection to Google Cloud is re-established because its authentication key cannot be refreshed.

Prerequisites

Before you can create a Distributed Cloud Edge cluster with a local control plane, you must enable the required APIs in the target Google Cloud project. To do so, you must have one of the following roles in the Google Cloud project:

  • Owner (roles/owner)
  • Editor (roles/editor)
  • Service Usage Admin (roles/serviceusage.serviceUsageAdmin)

For more information about these roles, see Basic roles.

For information about granting roles, see Grant a single role.
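
As an illustration, a project administrator could grant the predefined Service Usage Admin role (Owner and Editor are broader basic roles) with gcloud projects add-iam-policy-binding. The following Python sketch only assembles and prints that command and does not run it. The project ID and the principal shown are placeholders, not values from this guide.

```python
import shlex

# Predefined role from the list above that allows enabling APIs.
ROLE = "roles/serviceusage.serviceUsageAdmin"


def build_grant_command(project_id: str, member: str, role: str = ROLE) -> list[str]:
    """Return the gcloud argv that binds `role` to `member` on the project."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member={member}",
        f"--role={role}",
    ]


if __name__ == "__main__":
    # Placeholder project ID and principal; substitute your own.
    print(shlex.join(build_grant_command("my-edge-project", "user:alice@example.com")))
```

Printing the command rather than executing it lets you review the binding before applying it.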

To create a Distributed Cloud Edge cluster with a local control plane, enable the following APIs:

  • anthos.googleapis.com
  • anthosaudit.googleapis.com
  • anthosgke.googleapis.com
  • cloudresourcemanager.googleapis.com
  • connectgateway.googleapis.com
  • container.googleapis.com
  • edgecontainer.googleapis.com
  • gkeconnect.googleapis.com
  • gkehub.googleapis.com
  • gkeonprem.googleapis.com
  • iam.googleapis.com
  • logging.googleapis.com
  • monitoring.googleapis.com
  • opsconfigmonitoring.googleapis.com
  • serviceusage.googleapis.com
  • stackdriver.googleapis.com
  • storage.googleapis.com
  • sts.googleapis.com

For information about enabling APIs, see Enabling services.
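
One way to enable all of the listed APIs in a single step is to pass them together to gcloud services enable. The following Python sketch assembles that command from the list above and prints it without running it. The project ID is a placeholder.

```python
import shlex

# APIs required for local control plane clusters, as listed in this section.
REQUIRED_APIS = [
    "anthos.googleapis.com",
    "anthosaudit.googleapis.com",
    "anthosgke.googleapis.com",
    "cloudresourcemanager.googleapis.com",
    "connectgateway.googleapis.com",
    "container.googleapis.com",
    "edgecontainer.googleapis.com",
    "gkeconnect.googleapis.com",
    "gkehub.googleapis.com",
    "gkeonprem.googleapis.com",
    "iam.googleapis.com",
    "logging.googleapis.com",
    "monitoring.googleapis.com",
    "opsconfigmonitoring.googleapis.com",
    "serviceusage.googleapis.com",
    "stackdriver.googleapis.com",
    "storage.googleapis.com",
    "sts.googleapis.com",
]


def build_enable_command(project_id: str) -> list[str]:
    """Return the gcloud argv that enables every required API in the project."""
    return ["gcloud", "services", "enable", *REQUIRED_APIS, f"--project={project_id}"]


if __name__ == "__main__":
    # "my-edge-project" is a placeholder project ID.
    print(shlex.join(build_enable_command("my-edge-project")))
```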

Upgrade your Google Cloud SDK to version 450.0.0 or later

You must upgrade your Google Cloud SDK to version 450.0.0 or later to create local control plane clusters running Distributed Cloud Edge software version 1.5.0. With an earlier SDK version, creating such clusters fails.
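
The version requirement can be checked before attempting cluster creation. The following Python sketch assumes that the output of gcloud version --format=json reports the SDK version under the "Google Cloud SDK" key. To stay self-contained, it parses a hard-coded sample of that output instead of invoking gcloud.

```python
import json

# Minimum SDK version stated in this section.
MINIMUM_SDK_VERSION = (450, 0, 0)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '450.0.0' into a comparable tuple."""
    parts = [int(part) for part in text.strip().split(".")]
    # Pad short versions such as '450' so they compare as 450.0.0.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def sdk_is_new_enough(version_json: str) -> bool:
    """Check the SDK version in JSON shaped like `gcloud version --format=json`.

    The "Google Cloud SDK" key is an assumption about that output.
    """
    info = json.loads(version_json)
    return parse_version(info["Google Cloud SDK"]) >= MINIMUM_SDK_VERSION


if __name__ == "__main__":
    # Sample input for illustration; not captured from a real installation.
    sample = '{"Google Cloud SDK": "449.0.0"}'
    print("SDK new enough:", sdk_is_new_enough(sample))
```

If the check reports an older version, installations managed by the gcloud component manager can typically be upgraded with gcloud components update.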

Create a cluster with a local control plane

To create a Distributed Cloud Edge cluster with a local control plane, you must pass the following flags when creating the cluster:

  • --control-plane-node-location instructs Distributed Cloud Edge to deploy the control plane workloads for this cluster locally. The value is the name of the target Distributed Cloud Edge zone.
  • --control-plane-node-count (optional) specifies the number of nodes on which to run the local control plane workloads. Valid values are 3 for high availability and 1 for standard operation. If omitted, the value defaults to 3.
  • --control-plane-machine-filter (optional) specifies a regex-formatted list of nodes that run the local control plane workloads. If omitted, Distributed Cloud Edge selects the nodes at random.
  • --control-plane-shared-deployment-policy specifies whether application workloads can run on the nodes that run the local control plane workloads. The only valid value is ALLOWED. If omitted, cluster creation fails.
  • --external-lb-ipv4-address-pools specifies a comma-delimited list of IPv4 addresses, address ranges, or subnetworks for ingress traffic for Services that run behind the Distributed Cloud Edge load balancer.

For more information about creating Distributed Cloud Edge clusters, see Create a cluster.
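
The flags above can be combined into a single cluster creation command. The following Python sketch assembles such a command and rejects unsupported node counts before anything is sent to Google Cloud. It assumes that the base command is gcloud edge-cloud container clusters create with a --location flag for the region, and it treats --external-lb-ipv4-address-pools as required because it is not marked optional above. The cluster name, region, zone, and address range are placeholders, and a real command may need additional flags that are described in Create a cluster.

```python
import shlex
from typing import Optional

# 3 = high availability, 1 = standard operation (values from the flag list above).
VALID_NODE_COUNTS = (1, 3)


def build_create_command(
    cluster: str,
    region: str,
    zone: str,
    lb_pools: list[str],
    node_count: int = 3,
    machine_filter: Optional[str] = None,
) -> list[str]:
    """Return a gcloud argv for creating a local control plane cluster."""
    if node_count not in VALID_NODE_COUNTS:
        raise ValueError(f"node_count must be 1 or 3, got {node_count}")
    if not lb_pools:
        raise ValueError("at least one external load balancer address pool is needed")

    cmd = [
        # Base command and --location are assumptions, not taken from this page.
        "gcloud", "edge-cloud", "container", "clusters", "create", cluster,
        f"--location={region}",
        f"--control-plane-node-location={zone}",
        f"--control-plane-node-count={node_count}",
        # ALLOWED is the only valid value; omitting the flag makes creation fail.
        "--control-plane-shared-deployment-policy=ALLOWED",
        f"--external-lb-ipv4-address-pools={','.join(lb_pools)}",
    ]
    if machine_filter is not None:
        cmd.append(f"--control-plane-machine-filter={machine_filter}")
    return cmd


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    command = build_create_command(
        cluster="my-cluster",
        region="us-central1",
        zone="my-edge-zone",
        lb_pools=["10.0.0.1-10.0.0.10"],
    )
    print(shlex.join(command))
```

Leaving machine_filter unset matches the default behavior, in which Distributed Cloud Edge selects the control plane nodes itself.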

What's next