How Distributed Cloud works

This page describes how Google Distributed Cloud works, including information about its infrastructure, hardware, storage, and networking capabilities.

Google Distributed Cloud consists of the following components:

  • The Distributed Cloud infrastructure. Google provides, deploys, and maintains the Distributed Cloud hardware, including remote management by a dedicated Google team.
  • The Distributed Cloud service. This service allows you to manage your Distributed Cloud clusters and node pools by using the Google Cloud CLI and the Distributed Cloud Edge Container API. The Distributed Cloud clusters are registered in your Fleet, and you can use the Kubernetes kubectl CLI tool to interact with them.

Distributed Cloud form factors

Distributed Cloud is available in one of the following form factors:

  • Distributed Cloud Rack. A rack of six Distributed Cloud Servers and two top-of-rack (ToR) switches. This form factor supports both local control plane and Cloud control plane clusters.
  • Distributed Cloud Server. A standalone Distributed Cloud Server that connects directly to your local network through your own ToR switches. This form factor only supports local control plane clusters. Distributed Cloud Servers can only be deployed in groups of three.

    The following table describes the differences between Distributed Cloud Racks and Distributed Cloud Servers.

    | Functionality | Distributed Cloud Rack | Distributed Cloud Server |
    |---|---|---|
    | Physical form factor | Fully populated rack (2x ToR switch, 6x rackmount machine) | 1RU half-depth rackmount machine (deployed in groups of 3) |
    | Power supply | AC and DC | AC only |
    | Cluster type | Cloud control plane and local control plane | Local control plane only |
    | GPU workloads | Supported | Not supported |
    | Local network connectivity | Layer 3, BGP supported | Layer 2, BGP not supported |
    | EdgeNetwork networks | Fully configurable | Single (default) network only |
    | EdgeNetwork subnetworks | CIDR and VLAN ID | VLAN ID only |
    | EdgeNetwork interconnects | Supported | Not supported |
    | EdgeNetwork interconnect attachments | Supported | Not supported |
    | EdgeNetwork VPN connections | Supported | Not supported |
    | VPC connectivity | Supported | Not supported |
    | Symcloud Storage | Supported | Supported |
    | Network Function Operator | Supported | Not supported |
    | SR-IOV | Supported | Not supported |
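
When planning a deployment, it can help to encode the comparison above as data and check a workload's requirements against it. The following is a minimal sketch, not part of any Distributed Cloud API: the feature keys and the `unsupported_features` helper are illustrative names chosen here, and the values are transcribed from the comparison table.

```python
from enum import Enum


class FormFactor(Enum):
    RACK = "Distributed Cloud Rack"
    SERVER = "Distributed Cloud Server"


# Feature support per form factor, transcribed from the comparison table.
# The keys are illustrative labels, not API identifiers.
SUPPORT = {
    "cloud_control_plane": {FormFactor.RACK: True, FormFactor.SERVER: False},
    "local_control_plane": {FormFactor.RACK: True, FormFactor.SERVER: True},
    "gpu_workloads": {FormFactor.RACK: True, FormFactor.SERVER: False},
    "bgp": {FormFactor.RACK: True, FormFactor.SERVER: False},
    "interconnects": {FormFactor.RACK: True, FormFactor.SERVER: False},
    "interconnect_attachments": {FormFactor.RACK: True, FormFactor.SERVER: False},
    "vpn_connections": {FormFactor.RACK: True, FormFactor.SERVER: False},
    "vpc_connectivity": {FormFactor.RACK: True, FormFactor.SERVER: False},
    "symcloud_storage": {FormFactor.RACK: True, FormFactor.SERVER: True},
    "network_function_operator": {FormFactor.RACK: True, FormFactor.SERVER: False},
    "sr_iov": {FormFactor.RACK: True, FormFactor.SERVER: False},
}


def unsupported_features(form_factor, required):
    """Return the required features that the given form factor lacks."""
    return sorted(f for f in required if not SUPPORT[f][form_factor])


if __name__ == "__main__":
    needs = ["gpu_workloads", "symcloud_storage", "sr_iov"]
    print(unsupported_features(FormFactor.SERVER, needs))
```

An empty result means that every listed requirement is covered by the chosen form factor.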

Distributed Cloud infrastructure

Google provides, deploys, operates, and maintains the dedicated hardware that runs your Distributed Cloud zone. The Distributed Cloud nodes that execute your workloads run exclusively on this hardware.

The hardware runs a number of nodes grouped into node pools, which you can assign to clusters within your Distributed Cloud zone. You can configure your network so that workloads running on Distributed Cloud clusters are available only to your local users or are also accessible from the internet. You can also configure your network so that only Distributed Cloud nodes can use local resources or communicate with workloads, such as Compute Engine virtual machine (VM) instances and Kubernetes Pods, that run in a Virtual Private Cloud (VPC) network. That communication travels over a secure Cloud VPN connection to the VPC network.

Distributed Cloud management

Distributed Cloud nodes are not standalone resources and must remain connected to Google Cloud for control plane management and monitoring purposes. The Distributed Cloud control plane nodes are hosted in the designated Google Cloud region. The on-premises Distributed Cloud nodes require a constant network connection to Google Cloud.

Google remotely manages the physical machines and ToR switches that constitute your Distributed Cloud installation. This includes installing software updates and security patches and resolving configuration issues. Your network administrator can also monitor the health and performance of Distributed Cloud clusters and nodes and work with Google to resolve any issues.

After Google has successfully deployed the Distributed Cloud hardware in your designated location, your cluster administrator can begin configuring the Distributed Cloud cluster in a way that's similar to a conventional Kubernetes cluster. They can assign machines to node pools, and node pools to clusters, and grant application owners access as required by their roles. The cluster administrator must, however, keep in mind the processing and storage limitations of the machines in your Distributed Cloud rack and plan cluster and workload configuration accordingly.

Distributed Cloud provides an API for configuring clusters and node pools.

Access to the Distributed Cloud zone

You can configure your network to allow the desired level of access to your Distributed Cloud zone, both from your local network and the internet.

You can also grant your Distributed Cloud zone access to Google Cloud services by connecting it to your VPC network. Distributed Cloud uses Cloud VPN to connect to Google service endpoints. Your network administrator must configure your network to allow this.

Distributed Cloud personas

The following personas are involved in the deployment and operation of your Distributed Cloud zone:

  • Google field technician. Delivers, installs, and activates the Distributed Cloud hardware in your designated location. Your network administrator works with the Google technicians to connect the hardware to your power source and your network.

  • Google site reliability engineer (SRE). Monitors and manages the Distributed Cloud hardware. This includes resolving configuration issues, installing patches and updates, and maintaining security.

  • Network administrator. Configures and maintains network connectivity and access control between the Distributed Cloud hardware and your local network. This includes configuring your routing and firewall rules to ensure that all required types of network traffic can freely flow between the Distributed Cloud hardware, Google Cloud, the clients that consume your Distributed Cloud workloads, internal and external data repositories, and so on. The network administrator must have access to the Google Cloud console to monitor the status of your Distributed Cloud machines. The network administrator also configures Distributed Cloud networking features.

  • Cluster administrator. Deploys and maintains Distributed Cloud clusters within your organization. This includes configuring permissions, logging, and provisioning workloads for each cluster. The cluster administrator assigns nodes to node pools, and node pools to Distributed Cloud clusters. The cluster administrator must understand the operational differences between the Distributed Cloud cluster and a traditional Kubernetes cluster, such as the processing and storage capabilities of the Distributed Cloud hardware, in order to correctly configure and deploy your workloads.

  • Application owner. A software engineer responsible for developing, deploying, or monitoring an application running on a Distributed Cloud cluster. Application owners must understand the limitations on the size and location of the clusters, as well as the ramifications of deploying an application at the edge, such as performance and latency.

Distributed Cloud Rack hardware

Figure 1 depicts a typical Distributed Cloud Rack configuration.

Figure 1. Distributed Cloud components.

The components of a Distributed Cloud installation are as follows:

  • Google Cloud. Traffic between your Distributed Cloud installation and Google Cloud includes hardware management traffic, control plane traffic, and Cloud VPN traffic to Google Cloud services and any workloads that you are running there. It can also include VPC traffic, if applicable.

  • Internet. Encrypted management and control plane traffic between your Distributed Cloud installation and Google Cloud travels over the internet. Distributed Cloud does not support proxied internet connections.

  • Local network. The local network external to the Distributed Cloud rack that connects the peering edge routers to the internet.

  • Peering edge routers. Your local network routers that interface with the Distributed Cloud ToR switches. Depending on the physical location that you choose for your Distributed Cloud installation, the peering edge routers can be owned and maintained by your organization or your co-location facility. You must configure these routers to use Border Gateway Protocol (BGP) to peer with the ToR switches and advertise a default route to your Distributed Cloud hardware. You must also configure these routers, as well as any corresponding firewalls, to allow Google's device management traffic, the Distributed Cloud control plane traffic, and Cloud VPN traffic, if applicable.

    Depending on your business requirements, you can configure these routers as follows:

    • Let your Distributed Cloud nodes access the internet by using public network address translation (NAT) or direct exposure to public IP addresses.
    • Allow a VPN connection to your VPC network and any desired Google Cloud services.
  • Top-of-rack (ToR) switches. The Layer 3 switches that connect the machines within the rack and interface with your local network. These switches are BGP speakers and handle network traffic between the Distributed Cloud rack and your local network equipment. They connect to peering edge routers by using Link Aggregation Control Protocol (LACP) bundles.

  • Machines. The physical machines that run Distributed Cloud software and execute your workloads. Each physical machine is a node within the Distributed Cloud cluster.

Distributed Cloud Server hardware

Figure 2 depicts a typical Distributed Cloud Server configuration.

Figure 2. Distributed Cloud Server components.

The components of a Distributed Cloud installation are as follows:

  • Google Cloud. Traffic between your Distributed Cloud installation and Google Cloud includes hardware management and audit logging traffic. It can also include VPC traffic, if applicable.

  • Internet. Encrypted management and audit logging traffic between your Distributed Cloud installation and Google Cloud travels over the internet. Distributed Cloud does not support proxied internet connections.

  • Local network. Your local network to which Distributed Cloud Servers connect through your Layer 2 ToR switches.

  • Top-of-rack (ToR) switches. Your Layer 2 switches that connect the Server machines and interface with your local network. Each Distributed Cloud Server machine requires, at a minimum, one in-band and one out-of-band connection to a single ToR switch. Google recommends using two ToR switches and two in-band connections per machine (one per switch) for added reliability. Each Distributed Cloud Server machine connects to your ToR switches as follows:

    • In-band connectivity. Each Distributed Cloud Server machine connects to one or both of your ToR switches for in-band connectivity. These connections carry your workload traffic. You must configure them as 802.1q trunks and the corresponding native VLAN as the network to which the Distributed Cloud management network interfaces belong. If you need additional workload connectivity, you can trunk additional tagged VLANs to your Distributed Cloud Servers.
    • Out-of-band connectivity. Each Distributed Cloud Server also connects to one ToR switch for out-of-band connectivity, which allows your Distributed Cloud Servers to communicate with one another. You must place the out-of-band switch ports within the same VLAN.
  • Machines. The physical Distributed Cloud Server machines that run Distributed Cloud software and execute your workloads. Each physical machine is a node within the Distributed Cloud cluster.
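
The connection requirements above can be checked before installation with a small script. This is a hedged sketch only: the `ServerCabling` record, the switch names, and the VLAN numbers are hypothetical, and the rules encoded are limited to those stated in this section (at least one in-band and one out-of-band connection per machine, two in-band connections on separate switches recommended, and all out-of-band ports in the same VLAN).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServerCabling:
    """Hypothetical record of one Server machine's ToR switch connections."""
    in_band_switches: List[str]        # switches carrying workload traffic
    out_of_band_switch: Optional[str]  # switch used for server-to-server traffic
    out_of_band_vlan: Optional[int]    # VLAN of the out-of-band switch port


def check_cabling(machines):
    """Return findings (errors and notes) for a group of Server machines."""
    findings = []
    for name, cabling in machines.items():
        if not cabling.in_band_switches:
            findings.append(f"{name}: error: no in-band connection")
        elif len(set(cabling.in_band_switches)) < 2:
            findings.append(f"{name}: note: two in-band connections recommended")
        if cabling.out_of_band_switch is None:
            findings.append(f"{name}: error: no out-of-band connection")
    # All out-of-band switch ports must be placed within the same VLAN.
    vlans = {
        c.out_of_band_vlan
        for c in machines.values()
        if c.out_of_band_switch is not None
    }
    if len(vlans) > 1:
        findings.append("error: out-of-band ports are in different VLANs")
    return findings


if __name__ == "__main__":
    group = {
        "server-1": ServerCabling(["tor-a", "tor-b"], "tor-a", 100),
        "server-2": ServerCabling(["tor-a"], "tor-a", 100),
        "server-3": ServerCabling(["tor-a", "tor-b"], "tor-a", 200),
    }
    for finding in check_cabling(group):
        print(finding)
```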

Distributed Cloud service

The Distributed Cloud service runs on either Google Cloud for Cloud control plane clusters, or directly on the Distributed Cloud hardware for local control plane clusters. It serves as a control plane for the nodes and clusters on your Distributed Cloud hardware.

For Cloud control plane clusters, Distributed Cloud must be able to connect to Google Cloud at all times and cannot function without that connection. For local control plane clusters, your workloads continue to run for up to 7 days even if Distributed Cloud cannot connect to Google Cloud. After this period, Distributed Cloud must communicate with Google Cloud to refresh authentication tokens and storage encryption keys, and to synchronize hardware management and audit logging data.
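
The connectivity rule above can be expressed as a simple check, for example in a monitoring script. The sketch below is illustrative only: the function name and the `"cloud"` and `"local"` labels are assumptions made here, and the function reports only whether a cluster is within the documented tolerance, not what happens once that tolerance is exceeded.

```python
from datetime import timedelta

# Documented limit for local control plane clusters (from the text above).
LOCAL_OFFLINE_LIMIT = timedelta(days=7)


def within_documented_offline_tolerance(control_plane, offline_for):
    """Return True if the cluster is within its documented offline tolerance.

    control_plane: "cloud" or "local" (labels assumed for this sketch).
    offline_for: how long the connection to Google Cloud has been down.
    """
    if offline_for <= timedelta(0):
        return True  # still connected
    if control_plane == "cloud":
        return False  # requires a connection at all times
    if control_plane == "local":
        return offline_for <= LOCAL_OFFLINE_LIMIT
    raise ValueError(f"unknown control plane type: {control_plane!r}")


if __name__ == "__main__":
    print(within_documented_offline_tolerance("local", timedelta(days=3)))
    print(within_documented_offline_tolerance("cloud", timedelta(minutes=5)))
```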

This control plane instantiates and configures your Distributed Cloud zone. The specific Google data center to which your Distributed Cloud hardware connects for management is chosen according to its proximity to your Distributed Cloud installation.

A Distributed Cloud zone consists of the machines installed in your Distributed Cloud Rack or of the Distributed Cloud Server machines deployed on your premises. With Distributed Cloud Rack, you can assign these machines, instantiated as Kubernetes nodes, to a node pool, and the node pool to a Distributed Cloud cluster. With Distributed Cloud Servers, node pools are populated automatically and not configurable.

Figure 3 depicts the logical organization of Distributed Cloud entities.

Figure 3. Distributed Cloud entities.

The entities are as follows:

  • Google Cloud region. The Google Cloud region for your Distributed Cloud zone is determined by the location of the Google data center that is the closest to your Distributed Cloud installation.

  • Kubernetes cloud control plane. The Kubernetes control plane for each Distributed Cloud cluster by default runs remotely in a Google data center in the Google Cloud region to which your Distributed Cloud cluster is assigned. This allows Distributed Cloud to benefit from a secure and highly available control plane without taking up processing capacity on the Distributed Cloud physical machines. Cloud control plane clusters are not available on Distributed Cloud Servers.

  • Kubernetes local control plane. Starting with Google Distributed Cloud version 1.5.0, you have the option to configure a Distributed Cloud cluster to use a local control plane instead of the default cloud control plane. A local control plane cluster can enter survivability mode when the connection to Google Cloud is temporarily lost, allowing your workloads to continue running until the connection is restored. This is the only cluster type available on Distributed Cloud Servers. For more information, see Survivability mode.

  • Distributed Cloud zone. A logical abstraction that represents the Distributed Cloud hardware deployed on your premises. A Distributed Cloud zone covers a single Distributed Cloud Rack or all of the Distributed Cloud Server machines deployed at your location. The physical machines in the zone are instantiated as Distributed Cloud machines in the Google Cloud console. The machines in a Distributed Cloud zone share a single network fabric or a single fault domain. Google creates your machines before delivering your Distributed Cloud hardware. You cannot create, delete, or modify Distributed Cloud machines.

  • Node. A Kubernetes resource that represents a Distributed Cloud physical machine. A node is instantiated when you create a node pool, and it becomes available to run workloads when you assign that node pool to a Distributed Cloud cluster.

  • Node pool. A logical grouping of Distributed Cloud nodes within a single Distributed Cloud zone that lets you assign Distributed Cloud nodes to Distributed Cloud clusters. For Distributed Cloud Servers, node pools are instantiated and populated automatically.

  • Cluster. A Distributed Cloud cluster that consists of a control plane and one or more node pools.

  • VPN connection. A VPN tunnel to a VPC network running in a Google Cloud project. This tunnel allows your Distributed Cloud workloads to access Compute Engine resources connected to that VPC network. You must create at least one node pool in your zone before you can create a VPN connection. Distributed Cloud Servers don't support VPN connections.
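
The relationships between these entities can be sketched as a small data model. This is an illustration of the rules stated in the list above, not the Distributed Cloud API: the class names, method names, and the `"rack"`, `"server"`, `"cloud"`, and `"local"` labels are all assumptions made for this example. On Distributed Cloud Servers, node pools are populated automatically, so `create_node_pool` models the Rack workflow only.

```python
from dataclasses import dataclass, field


@dataclass
class NodePool:
    name: str
    machines: list


@dataclass
class Cluster:
    name: str
    control_plane: str  # "cloud" or "local"
    node_pools: list


@dataclass
class Zone:
    name: str
    form_factor: str  # "rack" or "server"
    machines: list    # created by Google; you cannot add or remove them
    node_pools: list = field(default_factory=list)

    def create_node_pool(self, name, machines):
        """Group machines that belong to this zone into a node pool."""
        unknown = set(machines) - set(self.machines)
        if unknown:
            raise ValueError(f"machines not in zone: {sorted(unknown)}")
        pool = NodePool(name, list(machines))
        self.node_pools.append(pool)
        return pool

    def create_cluster(self, name, control_plane, node_pools):
        """A cluster is a control plane plus one or more node pools."""
        if self.form_factor == "server" and control_plane != "local":
            raise ValueError("Servers support local control plane clusters only")
        if not node_pools:
            raise ValueError("a cluster needs at least one node pool")
        return Cluster(name, control_plane, list(node_pools))

    def can_create_vpn_connection(self):
        """VPN connections need a node pool and are unavailable on Servers."""
        return self.form_factor == "rack" and len(self.node_pools) > 0


if __name__ == "__main__":
    zone = Zone("zone-1", "rack", [f"machine-{i}" for i in range(1, 7)])
    print(zone.can_create_vpn_connection())
    pool = zone.create_node_pool("pool-a", ["machine-1", "machine-2"])
    cluster = zone.create_cluster("cluster-a", "cloud", [pool])
    print(zone.can_create_vpn_connection(), cluster.name)
```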

Storage

Distributed Cloud provides 3.3 TiB of usable storage per physical machine in the Distributed Cloud Rack. This storage is configured as Linux logical volumes. When you create a cluster, Distributed Cloud creates one or more PersistentVolumes and exposes them as block volumes that you can assign to a workload by using PersistentVolumeClaims. Keep in mind that these PersistentVolumes don't provide data durability and are only suitable for ephemeral data. For information about working with block volumes, see PersistentVolumeClaim requesting a Raw Block Volume.
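
As an illustration of requesting a block volume, the following sketch builds a standard Kubernetes PersistentVolumeClaim manifest with `volumeMode: Block` and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The claim name and requested size are placeholders, and no storage class is set because this page does not name one; your cluster may require one.

```python
import json


def raw_block_pvc(name, size):
    """Build a PersistentVolumeClaim manifest that requests a raw block volume.

    name and size are placeholders chosen by the caller, for example
    "scratch-block" and "10Gi".
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # Block (rather than the default Filesystem) requests a raw device.
            "volumeMode": "Block",
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(raw_block_pvc("scratch-block", "10Gi"), indent=2))
```

A Pod that consumes such a claim references it under `volumeDevices` with a `devicePath`, instead of `volumeMounts` with a `mountPath`.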

For Distributed Cloud Servers, storage is exclusively abstracted through Rakuten Symcloud Storage. Each Distributed Cloud Server machine provides 1 TB of usable storage.

Storage security

Distributed Cloud uses LUKS to encrypt local machine storage and supports customer-managed encryption keys (CMEK). For more information, see Security best practices.

Symcloud Storage integration

On Distributed Cloud Racks, you can configure Distributed Cloud to use Rakuten Symcloud Storage, which acts as a local storage abstraction layer on each Distributed Cloud Rack node and makes its local storage available to workloads running on other Distributed Cloud nodes. On Distributed Cloud Servers, Symcloud Storage is the default and only storage option available. Distributed Cloud Servers don't expose local storage as Linux logical volumes.

For more information, see Configure Distributed Cloud for Symcloud Storage.

Networking

This section describes the network connectivity requirements and features of Distributed Cloud.

Google pre-configures some of the Distributed Cloud virtual networking components for your installation before shipping the Distributed Cloud hardware to you. You cannot modify the pre-configured settings after the hardware is delivered.

Figure 4 depicts the topology of the Distributed Cloud virtual network.

Figure 4. Distributed Cloud networking components.

The components of the Distributed Cloud virtual network are as follows:

  • Network. A virtual network with a private address space in your Distributed Cloud zone. A network is Layer 3-isolated from other virtual networks within the zone and can contain one or more subnetworks. The virtual network spans all the physical machines in the Distributed Cloud Rack. A single Distributed Cloud zone supports a maximum of 20 networks. Distributed Cloud Servers only support a single network, the default one created when a Distributed Cloud Server cluster is instantiated.

  • Subnetwork. A Layer 2 and Layer 3 VLAN subnetwork within a Distributed Cloud network. A subnetwork has its own broadcast domain and one or more IPv4 address ranges of your choice. Subnetworks within the same network are Layer 2-isolated but can communicate with each other through Layer 3. Nodes on different subnetworks within the same network can communicate with each other by using their IP addresses. However, nodes on subnetworks within different networks cannot communicate with each other. Distributed Cloud Servers only support subnetwork management using VLAN IDs.

  • Router. A virtual router instance that governs traffic within a Distributed Cloud network. Your network administrator uses a router to configure a BGP peering session over an interconnect attachment between a Distributed Cloud network and your local network so that Distributed Cloud Pods can advertise their network prefixes on your local network. By default, routers re-advertise the routes received from Distributed Cloud subnetworks. Distributed Cloud supports one router per network. Distributed Cloud Servers don't support routers.

  • Interconnect. A bundled logical link between a Distributed Cloud network and your local network. An interconnect consists of one or more physical links. During initial start-up, Google creates the interconnects that you requested when you ordered Distributed Cloud. Interconnects cannot be created, modified, or removed after the Distributed Cloud rack is up and running. By default, Google creates four interconnects to provide high availability for your installation. Distributed Cloud Servers don't support interconnects.

  • Interconnect attachment. A virtual link between an interconnect and a router that isolates the corresponding Distributed Cloud network from your local network. Traffic flowing through an interconnect attachment can be untagged or tagged with a VLAN ID of your choice. You create interconnect attachments based on your business requirements. Distributed Cloud Servers don't support interconnect attachments.
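
The isolation rules for networks and subnetworks described above can be sketched in a few lines. This is a simplified model under stated assumptions: the `Subnetwork` record, the example names, VLAN IDs, and address ranges are hypothetical, and each subnetwork is given a single IPv4 range even though a subnetwork can have more than one.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Subnetwork:
    name: str
    network: str  # name of the parent Distributed Cloud network
    vlan_id: int
    cidr: str     # one IPv4 range; a real subnetwork can have several


def find_subnetwork(subnets, network, address):
    """Return the subnetwork of `network` that contains `address`, if any."""
    addr = ip_address(address)
    for subnet in subnets:
        if subnet.network == network and addr in ip_network(subnet.cidr):
            return subnet
    return None


def reachability(a, b):
    """Classify how nodes on two subnetworks can reach each other."""
    if a.network != b.network:
        return "isolated"  # different networks cannot communicate
    if a == b:
        return "layer2"    # same broadcast domain
    return "layer3"        # same network, routed between subnetworks


if __name__ == "__main__":
    subnets = [
        Subnetwork("web", "net-a", 100, "10.0.1.0/24"),
        Subnetwork("db", "net-a", 200, "10.0.2.0/24"),
        Subnetwork("lab", "net-b", 300, "10.0.1.0/24"),
    ]
    web = find_subnetwork(subnets, "net-a", "10.0.1.15")
    print(reachability(web, subnets[1]))
    print(reachability(web, subnets[2]))
```

Because each network has its own private address space, the lookup is scoped by network name: the same range can appear in two different networks without conflict.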

Distributed Cloud networking components share similarities with their Google Cloud equivalents with the following differences:

  • Distributed Cloud networking components are local to the Distributed Cloud zone in which they are instantiated.

  • A Distributed Cloud network does not have direct connectivity to a VPC network.

  • By default, Distributed Cloud networks have no connectivity with each other across different Distributed Cloud zones. You have the option to explicitly configure cross-zone networking.

Your network administrator configures Distributed Cloud networking components, except interconnects, which Google configures before shipping the Distributed Cloud hardware to you.

Your network administrator must have the Edge Network Admin role (roles/edgenetwork.admin) on the target Google Cloud project, while application developers that deploy workloads on Distributed Cloud must have the Edge Network Viewer role (roles/edgenetwork.viewer) on the target Google Cloud project.

Connectivity to your local network

For outbound traffic to resources on your local network, Pods in a Distributed Cloud cluster use the default routes advertised by your peering edge routers. Distributed Cloud uses its built-in NAT to connect Pods to those resources.

For inbound traffic from resources on your local network, your network administrator must configure routing policies that match your business requirements to control access to Pods in each of your Distributed Cloud clusters. This means, at a minimum, completing the steps in Firewall configuration and configuring additional policies as required by your workloads. For example, you can set up allow/deny policies for individual node subnetworks or virtual IP addresses exposed by the built-in load balancer in Distributed Cloud. The Distributed Cloud Pod and Distributed Cloud Service CIDR blocks are not directly accessible.

Connectivity to the internet

For outbound traffic to resources on the internet, Pods in a Distributed Cloud cluster use the default route advertised by your routers to the Distributed Cloud ToR switches. To allow this traffic, you must, at a minimum, complete the steps in Firewall configuration and configure additional policies as required by your workloads. Distributed Cloud uses its built-in NAT to connect Pods to those resources. You can optionally configure your own layer of NAT on top of the built-in layer in Distributed Cloud.

For inbound traffic, you must configure your WAN routers according to your business requirements. These requirements dictate the level of access that you need to provide from the public internet to the Pods in your Distributed Cloud clusters. Distributed Cloud uses its built-in NAT for Pod CIDR blocks and Service management CIDR blocks, so those CIDR blocks are not accessible from the internet.

Connectivity to a VPC network

Distributed Cloud includes a built-in VPN solution that allows you to connect a Distributed Cloud cluster directly to a VPC instance if that instance is in the same Google Cloud project as the Distributed Cloud cluster.

If you use Cloud Interconnect to connect your local network to a VPC instance, your Distributed Cloud clusters can reach that instance by using the standard northbound eBGP peering. Your peering edge routers must be able to reach the appropriate VPC prefixes, and your Cloud Interconnect routers must correctly announce your Distributed Cloud prefixes, such as Distributed Cloud load balancer, management, and system subnetworks.

After you establish a VPN connection between your Distributed Cloud cluster and your VPC network, the following connectivity rules apply by default:

  • Your VPC network can access all the Pods in the Distributed Cloud cluster.
  • All the Pods in the Distributed Cloud cluster can access all the Pods in your VPC-native clusters. For routes-based clusters, you must manually configure custom advertised routes.
  • All Pods in the Distributed Cloud cluster can access virtual machine subnetworks in your VPC network.

The functionality described in this section is not available on Distributed Cloud Servers.

Connectivity to Google Cloud APIs and services

After you configure a VPN connection to your VPC network, workloads running on your Distributed Cloud installation can access Google Cloud APIs and services.

You can additionally configure the following features if your business requirements call for them:

VPN connectivity is not available on Distributed Cloud Servers.

Network security

Your business requirements and your organization's network security policy dictate the steps necessary to secure network traffic that flows in and out of your Distributed Cloud installation. For more information, see Security best practices.

Other networking features

Distributed Cloud supports the following networking features:

High-performance networking support

Distributed Cloud Racks support the execution of workloads that require the best possible networking performance. To this end, Distributed Cloud ships with a specialized Network Function operator and a set of Kubernetes CustomResourceDefinitions (CRDs) that implement the features required for high-performance workload execution.

Distributed Cloud Racks also support virtualizing network interfaces by using SR-IOV.

The features described in this section are not available on Distributed Cloud Servers.

Virtual machine workload support

Distributed Cloud can run workloads in virtual machines in addition to containers. For more information, see Manage virtual machines.

To learn about how virtual machines serve as an essential component of the Google Distributed Cloud platform, see Extending GKE Enterprise to manage on-premises edge VMs.

GPU workload support

Distributed Cloud can run GPU-based workloads on NVIDIA Tesla T4 GPUs. You must specify this requirement when ordering your Distributed Cloud hardware. For more information, see Manage GPU workloads.

This functionality is not available on Distributed Cloud Servers.
