How Distributed Cloud works

This page describes how Google Distributed Cloud works, including information about its infrastructure, hardware, storage, and networking capabilities.

Google Distributed Cloud consists of the following components:

  • The Distributed Cloud infrastructure. Google provides, deploys, and maintains the Distributed Cloud hardware, including remote management by a dedicated Google team.
  • The Distributed Cloud service. This service allows you to manage your Distributed Cloud clusters and node pools by using the Google Cloud CLI and the Distributed Cloud Edge Container API. The Distributed Cloud clusters are registered in your Fleet, and you can use the Kubernetes kubectl CLI tool to interact with them.

Distributed Cloud infrastructure

Google provides, deploys, operates, and maintains a rack of dedicated hardware that runs your Distributed Cloud zone. This hardware consists of rack-mounted server machines and two top-of-rack (ToR) switches that interconnect the machines with your local network. The Distributed Cloud nodes that execute your workloads run exclusively on this hardware.

The hardware runs a number of nodes grouped into node pools, which you can assign to clusters within your Distributed Cloud zone. You can configure your network so that workloads running on Distributed Cloud clusters are available only to your local users or are also accessible from the internet. You can also configure your network to allow only Distributed Cloud nodes to use local resources or to communicate with workloads running in a Virtual Private Cloud (VPC) network, such as Compute Engine virtual machine (VM) instances and Kubernetes Pods, over a secure Cloud VPN connection.

Distributed Cloud management

Distributed Cloud nodes are not standalone resources and must remain connected to Google Cloud for control plane management and monitoring purposes. The Distributed Cloud control plane nodes are hosted in the designated Google Cloud region. The on-premises Distributed Cloud nodes require a constant network connection to Google Cloud.

Google remotely manages the physical machines and ToR switches that constitute your Distributed Cloud installation. This includes installing software updates and security patches and resolving configuration issues. Your network administrator can also monitor the health and performance of Distributed Cloud clusters and nodes and work with Google to resolve any issues.

After Google has successfully deployed the Distributed Cloud hardware in your designated location, your cluster administrator can begin configuring the Distributed Cloud cluster in a way that's similar to a conventional Kubernetes cluster. They can assign machines to node pools, and node pools to clusters, and grant application owners access as required by their roles. The cluster administrator must, however, keep in mind the processing and storage limitations of the machines in your Distributed Cloud rack and plan cluster and workload configuration accordingly.

Distributed Cloud provides an API for configuring clusters and node pools.

Access to the Distributed Cloud zone

You can configure your network to allow the desired level of access to your Distributed Cloud zone, both from your local network and the internet.

You can also grant your Distributed Cloud zone access to Google Cloud services by connecting it to your VPC network. Distributed Cloud uses Cloud VPN to connect to Google service endpoints. Your network administrator must configure your network to allow this.

Distributed Cloud personas

The following personas are involved in the deployment and operation of your Distributed Cloud zone:

  • Google field technician. Delivers, installs, and activates the Distributed Cloud hardware in your designated location. Your network administrator works with the Google technicians to connect the hardware to your power source and to your network.

  • Google site reliability engineer (SRE). Monitors and manages the Distributed Cloud hardware. This includes resolving configuration issues, installing patches and updates, and maintaining security.

  • Network administrator. Configures and maintains network connectivity and access control between the Distributed Cloud hardware and your local network. This includes configuring your routing and firewall rules to ensure that all required types of network traffic can freely flow between the Distributed Cloud hardware, Google Cloud, the clients that consume your Distributed Cloud workloads, internal and external data repositories, and so on. The network administrator must have access to the Google Cloud console to monitor the status of your Distributed Cloud machines. The network administrator also configures Distributed Cloud networking features.

  • Cluster administrator. Deploys and maintains Distributed Cloud clusters within your organization. This includes configuring permissions, logging, and provisioning workloads for each cluster. The cluster administrator assigns nodes to node pools, and node pools to Distributed Cloud clusters. The cluster administrator must understand the operational differences between the Distributed Cloud cluster and a traditional Kubernetes cluster, such as the processing and storage capabilities of the Distributed Cloud hardware, in order to correctly configure and deploy your workloads.

  • Application owner. A software engineer responsible for developing, deploying, or monitoring an application running on a Distributed Cloud cluster. Application owners must understand the limitations on the size and location of the clusters as well as the ramifications of deploying an application at the edge, such as performance and latency.

Distributed Cloud hardware

Figure 1 depicts a typical Distributed Cloud configuration.

Figure 1. Distributed Cloud components.

The components of a Distributed Cloud installation are as follows:

  • Google Cloud. Traffic between your Distributed Cloud installation and Google Cloud includes hardware management traffic, control plane traffic, and Cloud VPN traffic to Google Cloud services and any workloads that you are running there. It can also include VPC traffic, if applicable.

  • Internet. Encrypted management and control plane traffic between your Distributed Cloud installation and Google Cloud travels over the internet. Distributed Cloud does not support proxied internet connections.

  • Local network. The local network external to the Distributed Cloud rack that connects the peering edge routers to the internet.

  • Peering edge routers. Your local network routers that interface with the Distributed Cloud ToR switches. Depending on the physical location that you choose for your Distributed Cloud installation, the peering edge routers can be owned and maintained by your organization or your co-location facility. You must configure these routers to use Border Gateway Protocol (BGP) to peer with the ToR switches and advertise a default route to your Distributed Cloud hardware. You must also configure these routers, as well as any corresponding firewalls, to allow Google's device management traffic, the Distributed Cloud control plane traffic, and Cloud VPN traffic, if applicable.

    Depending on your business requirements, you can configure these routers as follows:

    • Let your Distributed Cloud nodes access the internet by using public network address translation (NAT) or direct exposure to public IP addresses.
    • Allow a VPN connection to your VPC network and any desired Google Cloud services.
  • Top-of-rack (ToR) switches. The Layer 3 switches that connect the machines within the rack and interface with your local network. These switches are BGP speakers and handle network traffic between the Distributed Cloud rack and your local network equipment. They connect to peering edge routers by using Link Aggregation Control Protocol (LACP) bundles.

  • Machines. The physical machines that run Distributed Cloud software and execute your workloads. Each physical machine is a node within the Distributed Cloud cluster.

Distributed Cloud service

The Distributed Cloud service runs on Google Cloud and serves as a control plane for the nodes and clusters running on your Distributed Cloud hardware. Distributed Cloud must be able to connect to Google Cloud at all times and cannot function without that connection.

This control plane instantiates and configures your Distributed Cloud zone. The specific Google data center to which your Distributed Cloud hardware connects for management is chosen according to its proximity to your Distributed Cloud installation.

A Distributed Cloud zone contains one Distributed Cloud machine for each physical machine installed in your Distributed Cloud rack. You can assign these machines, instantiated as Kubernetes nodes, to a node pool, and the node pool to a Distributed Cloud cluster.

Figure 2 depicts the logical organization of Distributed Cloud entities.

Figure 2. Distributed Cloud entities.

The entities are as follows:

  • Google Cloud region. The Google Cloud region for your Distributed Cloud zone is determined by the location of the Google data center that is the closest to your Distributed Cloud installation.

  • Kubernetes cloud control plane. The Kubernetes control plane for each Distributed Cloud cluster by default runs remotely in a Google data center in the Google Cloud region to which your Distributed Cloud cluster is assigned. This allows Distributed Cloud to benefit from a secure and highly available control plane without taking up processing capacity on the Distributed Cloud physical machines.

  • Kubernetes local control plane. Starting with Google Distributed Cloud version 1.5.0, you have the option to configure a Distributed Cloud cluster to use a local control plane instead of the default cloud control plane. A local control plane cluster can enter survivability mode when the connection to Google Cloud is temporarily lost, allowing your workloads to continue running until the connection is restored. For more information, see Survivability mode.

  • Distributed Cloud zone. A logical abstraction that represents the Distributed Cloud hardware installed in your Distributed Cloud rack. A Distributed Cloud zone covers a single rack of Distributed Cloud hardware. The physical machines in the zone are instantiated as Distributed Cloud machines in the Google Cloud console. The machines in a Distributed Cloud zone share a single network fabric or a single fault domain. Google creates your machines before delivering your Distributed Cloud hardware. You cannot create, delete, or modify Distributed Cloud machines.

  • Node. A node is a Kubernetes resource that instantiates a Distributed Cloud physical machine into the Kubernetes realm when you create a node pool, making it available to run workloads by assigning the node pool to a Distributed Cloud cluster. The Kubernetes control plane for each node runs on Google Cloud.

  • Node pool. A logical grouping of Distributed Cloud nodes within a single Distributed Cloud zone that allows you to assign Distributed Cloud nodes to Distributed Cloud clusters.

  • Cluster. A Distributed Cloud cluster that consists of a control plane and one or more node pools.

  • VPN connection. A VPN tunnel to a VPC network running in a Google Cloud project. This tunnel allows your Distributed Cloud workloads to access Compute Engine resources connected to that VPC network. You must create at least one node pool in your zone before you can create a VPN connection.
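The relationship between zones, machines, node pools, and clusters described above can be sketched as a small data model. This is only an illustration of the hierarchy, not the Distributed Cloud Edge Container API: the class names, the `create_node_pool` helper, and the rule that a machine belongs to at most one node pool at a time are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Zone:
    """One rack of hardware. Google creates the machines; you cannot change them."""
    name: str
    machines: frozenset[str]
    # Hypothetical bookkeeping: machine name -> node pool name.
    assigned: dict[str, str] = field(default_factory=dict)


@dataclass
class NodePool:
    """A grouping of nodes within a single zone."""
    name: str
    zone: Zone
    nodes: list[str] = field(default_factory=list)


@dataclass
class Cluster:
    """A control plane plus one or more node pools."""
    name: str
    node_pools: list[NodePool] = field(default_factory=list)


def create_node_pool(zone: Zone, name: str, machines: list[str]) -> NodePool:
    """Instantiate machines as nodes by grouping them into a node pool."""
    for machine in machines:
        if machine not in zone.machines:
            raise ValueError(f"{machine} is not a machine in zone {zone.name}")
        if machine in zone.assigned:
            # Assumption for this sketch: one node pool per machine.
            raise ValueError(f"{machine} already belongs to {zone.assigned[machine]}")
    for machine in machines:
        zone.assigned[machine] = name
    return NodePool(name=name, zone=zone, nodes=list(machines))
```

For example, a zone with three machines could place two of them in one node pool and attach that pool to a cluster with `Cluster("cluster-1", [create_node_pool(zone, "pool-1", ["m1", "m2"])])`, leaving the third machine unassigned.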

Storage

Distributed Cloud provides 3.3 TiB of storage per physical machine in the Distributed Cloud rack. This storage is configured as Linux logical volumes. When you create a cluster, Distributed Cloud creates one or more PersistentVolumes and exposes them as block volumes that you can assign to a workload by using PersistentVolumeClaims. Keep in mind that these PersistentVolumes do not provide data durability and are only suitable for ephemeral data. For information about working with block volumes, see PersistentVolumeClaim requesting a Raw Block Volume.
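As a rough planning aid, the sketch below multiplies the 3.3 TiB per-machine figure by a machine count and builds a PersistentVolumeClaim manifest that requests a raw block volume. The `volumeMode: Block` field is standard Kubernetes; the claim name, requested size, and storage class name are placeholders chosen for the example, and the computed total is raw capacity, not necessarily what is available to workloads.

```python
TIB_PER_MACHINE = 3.3  # per-machine storage stated in this section


def rack_raw_capacity_tib(machine_count: int) -> float:
    """Return the total raw storage in TiB for a rack with machine_count machines."""
    if machine_count < 0:
        raise ValueError("machine_count must be non-negative")
    return round(machine_count * TIB_PER_MACHINE, 1)


def raw_block_pvc(name: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim manifest that requests a raw block volume.

    storage_class is a placeholder; use the class your cluster actually exposes.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "volumeMode": "Block",  # raw block device instead of a filesystem
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }
```

Because these volumes do not provide data durability, a claim like `raw_block_pvc("scratch", "100Gi", "example-local-block")` is only appropriate for data that the workload can afford to lose.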

Storage security

Distributed Cloud uses LUKS to encrypt local machine storage and supports customer-managed encryption keys (CMEK). For more information, see Security best practices.

Symcloud Storage integration

You can configure Distributed Cloud to use Rakuten Symcloud Storage, which acts as a local storage abstraction layer on each Distributed Cloud node and makes its local storage available to workloads running on other Distributed Cloud nodes.

For more information, see Configure Distributed Cloud for Symcloud Storage.

Networking

This section describes the network connectivity requirements and features of Distributed Cloud.

Google pre-configures some of the Distributed Cloud virtual networking components for your installation before shipping the Distributed Cloud hardware to you. You cannot modify the pre-configured settings after the hardware is delivered.

Figure 3 depicts the topology of the Distributed Cloud virtual network.

Figure 3. Distributed Cloud networking components.

The components of the Distributed Cloud virtual network are as follows:

  • Network. A virtual network with a private address space in your Distributed Cloud zone. A network is Layer 3-isolated from other virtual networks within the zone and can contain one or more subnetworks. The virtual network spans all the physical machines in the Distributed Cloud rack. A single Distributed Cloud zone supports a maximum of 20 networks.

  • Subnetwork. A Layer 2 and Layer 3 VLAN subnetwork within a Distributed Cloud network. A subnetwork has its own broadcast domain and one or more IPv4 address ranges of your choice. Subnetworks within the same network are Layer 2-isolated but can communicate with each other through Layer 3. Nodes on different subnetworks within the same network can communicate with each other by using their IP addresses. However, nodes on subnetworks within different networks cannot communicate with each other.

  • Router. A virtual router instance that governs traffic within a Distributed Cloud network. Your network administrator uses a router to configure a BGP peering session over an interconnect attachment between a Distributed Cloud network and your local network so that Distributed Cloud Pods can advertise their network prefixes on your local network. By default, routers re-advertise the routes received from Distributed Cloud subnetworks. Distributed Cloud supports one router per network.

  • Interconnect. A bundled logical link between a Distributed Cloud network and your local network. An interconnect consists of one or more physical links. During initial start-up, Google creates the interconnects that you requested when you ordered Distributed Cloud. Interconnects cannot be created, modified, or removed after the Distributed Cloud rack is up and running. By default, Google creates four interconnects to provide high availability for your installation.

  • Interconnect attachment. A virtual link between an interconnect and a router that isolates the corresponding Distributed Cloud network from your local network. Traffic flowing through an interconnect attachment can be untagged or tagged with a VLAN ID of your choice. You create interconnect attachments based on your business requirements.
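The isolation rules for networks and subnetworks can be summarized in a few lines of code. The sketch below is a mental model of the rules stated above (a limit of 20 networks per zone, Layer 2 isolation between subnetworks, Layer 3 isolation between networks); the class and function names are illustrative and do not correspond to any Distributed Cloud API.

```python
from dataclasses import dataclass

MAX_NETWORKS_PER_ZONE = 20  # limit stated in this section


@dataclass(frozen=True)
class Subnetwork:
    name: str
    network: str  # name of the parent Distributed Cloud network


def reachability(a: Subnetwork, b: Subnetwork) -> str:
    """Classify how nodes on two subnetworks can reach each other."""
    if a.network != b.network:
        return "isolated"  # different networks are Layer 3-isolated
    if a.name == b.name:
        return "layer2"    # same subnetwork, same broadcast domain
    return "layer3"        # same network, reachable by IP address only


def validate_zone(network_names: list[str]) -> None:
    """Raise ValueError if a zone would exceed the network limit."""
    if len(set(network_names)) > MAX_NETWORKS_PER_ZONE:
        raise ValueError(f"a zone supports at most {MAX_NETWORKS_PER_ZONE} networks")
```

Here, two subnetworks in the same network yield `"layer3"`, while subnetworks in different networks yield `"isolated"` regardless of their address ranges.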

Distributed Cloud networking components share similarities with their Google Cloud equivalents with the following differences:

  • Distributed Cloud networking components are local to the Distributed Cloud zone in which they are instantiated.

  • A Distributed Cloud network does not have direct connectivity to a VPC network.

  • By default, Distributed Cloud networks have no connectivity with each other across different Distributed Cloud zones. You have the option to explicitly configure cross-zone networking.

Your network administrator configures Distributed Cloud networking components, except interconnects, which Google configures before shipping the Distributed Cloud hardware to you.

Your network administrator must have the Edge Network Admin role (roles/edgenetwork.admin) on the target Google Cloud project, while application developers that deploy workloads on Distributed Cloud must have the Edge Network Viewer role (roles/edgenetwork.viewer) on the target Google Cloud project.

Connectivity to your local network

For outbound traffic to resources on your local network, Pods in a Distributed Cloud cluster use the default routes advertised by your peering edge routers. Distributed Cloud uses its built-in NAT to connect Pods to those resources.

For inbound traffic from resources on your local network, your network administrator must configure routing policies that match your business requirements to control access to Pods in each of your Distributed Cloud clusters. This means, at a minimum, completing the steps in Firewall configuration and configuring additional policies as required by your workloads. For example, you can set up allow/deny policies for individual node subnetworks or virtual IP addresses exposed by the built-in load balancer in Distributed Cloud. The Distributed Cloud Pod and Distributed Cloud Service CIDR blocks are not directly accessible.
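When planning allow/deny policies, it can help to check whether a given address falls inside a Pod or Service CIDR block, because those addresses sit behind the built-in NAT and cannot be targeted directly. The following sketch uses Python's standard `ipaddress` module; the function name and the CIDR values in the usage note are hypothetical and must be replaced with the ranges configured for your cluster.

```python
import ipaddress


def is_behind_nat(target: str, pod_cidr: str, service_cidr: str) -> bool:
    """Return True if target lies in the Pod or Service CIDR block.

    Addresses in these blocks are not directly accessible from the local
    network, so inbound policies should target node subnetworks or load
    balancer virtual IP addresses instead.
    """
    address = ipaddress.ip_address(target)
    hidden_blocks = (
        ipaddress.ip_network(pod_cidr),
        ipaddress.ip_network(service_cidr),
    )
    return any(address in block for block in hidden_blocks)
```

With example ranges of `10.10.0.0/16` for Pods and `10.20.0.0/16` for Services, `is_behind_nat("10.10.0.5", "10.10.0.0/16", "10.20.0.0/16")` returns `True`, which signals that a policy written against that address would not match inbound traffic as intended.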

Connectivity to the internet

For outbound traffic to resources on the internet, Pods in a Distributed Cloud cluster use the default route advertised by your routers to the Distributed Cloud ToR switches. To allow this traffic, you must, at a minimum, complete the steps in Firewall configuration and configure additional policies as required by your workloads. Distributed Cloud uses its built-in NAT to connect Pods to those resources. You can optionally configure your own layer of NAT on top of the built-in layer in Distributed Cloud.

For inbound traffic, you must configure your WAN routers according to your business requirements. These requirements dictate the level of access that you need to provide from the public internet to the Pods in your Distributed Cloud clusters. Distributed Cloud uses its built-in NAT for Pod CIDR blocks and Service management CIDR blocks, so those CIDR blocks are not accessible from the internet.

Connectivity to a VPC network

Distributed Cloud includes a built-in VPN solution that allows you to connect a Distributed Cloud cluster directly to a VPC instance if that instance is in the same Google Cloud project as the Distributed Cloud cluster.

If you use Cloud Interconnect to connect your local network to a VPC instance, your Distributed Cloud clusters can reach that instance by using the standard northbound eBGP peering. Your peering edge routers must be able to reach the appropriate VPC prefixes, and your Cloud Interconnect routers must correctly announce your Distributed Cloud prefixes, such as Distributed Cloud load balancer, management, and system subnetworks.

After you establish a VPN connection between your Distributed Cloud cluster and your VPC network, the following connectivity rules apply by default:

  • Your VPC network can access all the Pods in the Distributed Cloud cluster.
  • All the Pods in the Distributed Cloud cluster can access all the Pods in your VPC-native clusters. For routes-based clusters, you must manually configure custom advertised routes.
  • All Pods in the Distributed Cloud cluster can access virtual machine subnetworks in your VPC network.

Connectivity to Google Cloud APIs and services

After you configure a VPN connection to your VPC network, workloads running on your Distributed Cloud installation can access Google Cloud APIs and services.

You can additionally configure the following features if your business requirements call for them:

Network security

Your business requirements and your organization's network security policy dictate the steps necessary to secure network traffic that flows in and out of your Distributed Cloud installation. For more information, see Security best practices.

Other networking features

Distributed Cloud supports the following networking features:

High-performance networking support

Distributed Cloud supports the execution of workloads that require the best possible networking performance. To this end, Distributed Cloud ships with a specialized Network Function operator and a set of Kubernetes CustomResourceDefinitions (CRDs) that implement the features required for high-performance workload execution.

Distributed Cloud also supports virtualizing network interfaces by using SR-IOV.

Virtual machine workload support

Distributed Cloud can run workloads in virtual machines in addition to containers. For more information, see Manage virtual machines.

To learn about how virtual machines serve as an essential component of the Google Distributed Cloud platform, see Extending GKE Enterprise to manage on-premises edge VMs.

GPU workload support

Distributed Cloud can run GPU-based workloads on NVIDIA Tesla T4 GPUs. You must specify this requirement when ordering your Distributed Cloud hardware. For more information, see Manage GPU workloads.
