Google Distributed Cloud brings Google Cloud's infrastructure and services to diverse physical locations, also known as distributed environments, and can run in on-premises data centers as well as Google's network edge.
Distributed Cloud offers two distinct solutions:
Google Distributed Cloud (GDC) connected brings Google Cloud infrastructure and services closer to where data is being generated and consumed.
Google Distributed Cloud (GDC) air-gapped lets you host, control, and manage infrastructure and services directly on your premises.
Google Distributed Cloud air-gapped does not require connectivity to Google Cloud and helps customers meet compliance and regulatory requirements.

Key advantages of the Distributed Cloud platform include the following:
- Open: Built on open source and commercial prebuilt hardware, tapping into industry innovation and an open ISV ecosystem.
- Intelligent: Based on the Google AI portfolio, enabling real-time decision-making and automation in the platform and as a service.
- Consistent: Provides a consistent application experience across Google Cloud, Google edges, operator and customer edges, and data centers.
- Modern: A modern cloud approach based on Google's leadership in Kubernetes and GKE Enterprise, a leading hybrid-cloud solution.
- Proven: Built on proven best practices and technologies used at scale for Google core services.
- Secure: Security spanning core Google Cloud, the Google global network, Google edge infrastructure, and end-user devices.
Resource hierarchy
This section describes the Distributed Cloud resource hierarchy and how resources are managed in an air-gapped instance.
The purpose of the Distributed Cloud resource hierarchy is twofold:
- Provide a hierarchy of ownership, which binds the lifecycle of a resource to its immediate parent in the hierarchy.
- Provide attach points and inheritance for access control and organization policies.
The Distributed Cloud resource hierarchy resembles the file system found in operating systems as a way of organizing and managing entities hierarchically. Generally, each resource has exactly one parent. This hierarchical organization of resources lets you set access control policies, such as Identity and Access Management (IAM), which are inherited by child resources.
Resource structure in detail
Distributed Cloud resources are organized hierarchically. Every resource in the hierarchy, except the top-level organization resource, has exactly one parent. At the lowest level, service resources are the fundamental components that make up all Distributed Cloud services.
An organization is the top of the Distributed Cloud resource hierarchy, and all resources that belong to an organization are grouped under the organization resource. This provides central visibility and control over every resource that belongs to an organization.
Both projects and clusters are organization-scoped. They can be attached to one another to organize service resources. However, projects and clusters function independently from one another. This flexibility provides many different options for how to organize services and workloads. For example, you can have a cluster dedicated to a single project. Likewise, a cluster can span across multiple projects.
Service resources are entities that must belong to a project or a cluster, and cannot be shared across projects or clusters. Examples of service resources include virtual machines (VMs), databases, storage buckets, and backups. Most of these lower-level resources have project resources as their parents.
The following diagram represents an example Distributed Cloud resource hierarchy:
For more information on best practices for organizing your resource hierarchy, see Resource hierarchy and access control.
Organization
The organization resource represents an organization, such as a company, and is the top-level resource in the Distributed Cloud resource hierarchy. An organization defines a security boundary that encloses infrastructure resources to be administered together so that users can deploy application workloads. Within an organization, service resources such as VMs and storage volumes are logically grouped by projects.
All projects, clusters, and service resources belong to your organization rather than to the users who create them. This means that a resource is not deleted when the user who created it leaves the organization. Instead, all resources follow the organization's lifecycle in Distributed Cloud.
The IAM access control policies applied to the organization resource apply throughout the hierarchy on all resources in the organization. For more information on granting organization-wide policies and permissions, see the Organization policies and IAM sections.
Project
A project is a tenancy unit that every service must integrate with. Projects provide logical grouping of service resources.
Projects enable segmentation of service resources within an organization and provide a lifecycle and policy boundary for managing resources. Service resources inside a project can never outlive the project itself or move between projects, ensuring that control is available for the life of a resource.
A project is implemented as a Kubernetes namespace that spans multiple clusters in an organization. Under the namespace sameness principle, all namespaces of a given name are treated as the same namespace across all clusters within the same organization, and that single namespace has a consistent owner across the set of clusters. Service providers create project-scoped services by creating control plane and data plane components in the namespace.
The namespace for a project hosts the following:
- Project-scoped service APIs.
- Project-level policy configurations, such as roles and role bindings.
You can attach a project to only a subset of clusters in an organization. Users can deploy containerized workloads on these clusters within a project namespace. The namespace sameness concept applies to the project namespace on these clusters. Namespace-scoped policies, such as role-based access control (RBAC) policies, apply to all those namespaces.
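One way to picture a namespace-scoped policy under namespace sameness is a standard Kubernetes RoleBinding manifest. The project namespace, user, and binding name below are hypothetical placeholders, and this is generic Kubernetes RBAC rather than a GDC-specific API:

```yaml
# Hypothetical example: grant a developer read-only access in a
# project namespace. Under namespace sameness, applying the same
# manifest on each attached cluster yields consistent access to the
# project's namespace across the organization.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-project-viewers
  namespace: my-project          # the project namespace (placeholder)
subjects:
- kind: User
  name: dana@example.com         # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view                     # built-in Kubernetes read-only role
  apiGroup: rbac.authorization.k8s.io
```

Because the binding is namespace-scoped, it grants access only inside the project's namespace on each cluster where it is applied.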
For more information on projects, see the Projects overview.
Kubernetes cluster
A Kubernetes cluster is a set of nodes that run containerized workloads as part of GKE on GDC. Users can provision Kubernetes clusters to support the compute requirements of their applications. Clusters are organization-scoped, and must be attached to one or more projects.
Clusters subdivide infrastructure resources into isolated pools to be consumed by projects within an organization. Clusters are also logically separated from each other to provide different failure domains and isolation guarantees. The enforcement of policies per organization ensures clusters can be shared across teams and users while also maintaining performance and resource guarantees. Additionally, this enables VM workloads to run alongside container workloads without introducing operational complexity.
User clusters are needed when users must deploy containerized workloads. However, because GDC also offers the option to deploy VM-based workloads, a user cluster is not always required.
For more information on Kubernetes clusters, see the Manage Kubernetes clusters section.
Service resource
Service resources include the following entities:
- VMs
- Databases
- Storage buckets
- Containerized workloads
- Backups
Service resources must belong to a project, or optionally a cluster for cluster deployments, and they cannot be shared across projects. This means that service resources inside a project can never outlive the project itself, ensuring that control is available for the life of the resource. Service resources are enabled by default and can be disabled using an organization policy.
Personas
The Distributed Cloud architecture is hierarchical and consists of three tiers that map to the following personas:
- Infrastructure Operator (IO): has full access to administer the GDC hardware, but not to customer data or applications. An IO is responsible for managing and maintaining the infrastructure, hardware, and security of the operational system. An IO must refer to the Operator tab.
- Platform Administrator (PA): manages organization resources, policies, and teams. A PA interacts with an IO to secure additional bulk resources, get support, plan for upgrades, and request specific configuration changes. A PA can create and delete clusters on demand for any of the customers. A PA must refer to the Administer section of the Documentation tab.
- Application Operator (AO): has full access to a set of Kubernetes namespaces within a user cluster assigned by a PA. An AO interacts with a PA to secure more resources, get policy exemptions, and troubleshoot larger issues. An AO must refer to the Develop section of the Documentation tab.
The following table introduces major tasks and responsibilities of existing Distributed Cloud personas.
| | Infrastructure Operator | Platform Administrator | Application Operator |
|---|---|---|---|
| Tasks | Manage and maintain the infrastructure, hardware, and security of the operational system. | Manage organization resources, policies, and teams. Create and delete clusters on demand. Work with the IO to secure additional resources, plan upgrades, and request configuration changes. | Deploy and operate workloads in assigned Kubernetes namespaces. Work with the PA to secure more resources, get policy exemptions, and troubleshoot larger issues. |
Prerequisites
Before deploying Distributed Cloud on your premises, Google runs a site survey to verify that your location meets requirements for space, power, cooling, and connectivity.
Based on your requirements, Google generates a solution design and provides a planning document to the cloud hardware provider that ships the required hardware to your data center.
Hardware
Distributed Cloud hardware comes fully integrated into racks and securely delivered to your premises. We partner with OEM hardware vendors to provide customers with the latest, best-in-class enterprise equipment that is backed by comprehensive services and support.
Distributed Cloud can run on minimal hardware to provide flexibility, availability, and performance.
Software
In an isolated environment, you cannot download Distributed Cloud binaries directly to the air-gapped network. Before deploying Distributed Cloud on an air-gapped system, make sure you have the following:
- Internet access to Google Cloud from which to download the Distributed Cloud distribution.
- A portable storage device, such as an external hard drive or a thumb drive, to transfer the downloaded Distributed Cloud files.
- On-premises hardware to which to upload the downloaded files.
- The SHA256 or MD5 checksum to verify the integrity of the downloaded Distributed Cloud software.
- Nodes and clusters with enough CPU, RAM, and storage resources to meet the needs of the clusters and workloads you run, regardless of your Distributed Cloud configuration.
- Downloaded Distributed Cloud documentation for offline use.
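The checksum verification step can be sketched with standard tools. The file names below are placeholders for the actual release artifacts and published checksums:

```shell
# Placeholder stand-in for the real downloaded release archive.
echo "example release payload" > gdc-release.tar.gz

# Record the SHA256 checksum before transferring the file.
sha256sum gdc-release.tar.gz > gdc-release.tar.gz.sha256

# After the transfer, re-verify integrity on the air-gapped side;
# prints "checksum OK" only if the hashes match.
sha256sum --check gdc-release.tar.gz.sha256 && echo "checksum OK"
```

In practice, compare the computed hash against the checksum published alongside the release rather than one you generated yourself.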
Distributed Cloud provides a FIPS 140-2 certified Ubuntu 20.04 long-term support (LTS) operating system (OS) image that runs on Distributed Cloud bare metal servers and virtual machines. The certified OS image meets all security and compliance requirements.
Major technical features
Distributed Cloud delivers a multitude of features that let enterprises use the full functionality of a private isolated environment with no internet access.
Services
The extensive collection of Distributed Cloud services includes data management, artificial intelligence, machine learning, security, observability, and computing services. Distributed Cloud supports both Kubernetes and virtual machine-based workloads.
Storage
To build a robust infrastructure and store data across an air-gapped cloud environment, Distributed Cloud provides block and object storage services. The underlying storage hardware includes high-performing all-flash solutions for block and more cost-efficient solutions for object storage.
High availability and data backup
To conform to data sovereignty requirements, Distributed Cloud delivers an integrated backup solution for data recovery and the ability to control data residency in either a local or a remote data center.

Distributed Cloud is highly available and scalable, enabling enterprises to perform rolling, non-disruptive hardware upgrades as needed.
Networking
Distributed Cloud provides secure networking and high-speed performance for compute services and cloud storage.
Data plane and management plane networks connect all cloud components hosted in an on-premises environment to ensure data sovereignty. The networks secure data and enable customers to scale and optimize their infrastructure.
The network load balancing service distributes TCP and UDP traffic among clusters and gives absolute control over handling traffic in the Distributed Cloud environment.
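As an illustrative sketch only: in a Kubernetes cluster, TCP or UDP load balancing is commonly requested through a Service of type LoadBalancer. The manifest below is a generic Kubernetes example with placeholder names, not a GDC-specific API:

```yaml
# Hypothetical example: expose a TCP workload through a network
# load balancer. The namespace, names, and ports are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: my-app-lb
  namespace: my-project
spec:
  type: LoadBalancer          # request an external load-balanced address
  selector:
    app: my-app               # pods backing this service
  ports:
  - protocol: TCP
    port: 80                  # port exposed by the load balancer
    targetPort: 8080          # port the pods listen on
```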
Support
The Cloud Support API might not be available for use with Distributed Cloud. Consult your account manager for details.
Third-party notices and source code
Third-party notices are provided with each release of GDC. They are provided as a tar file stored in the same Cloud Storage location, with a matching file name that includes `notice`. Alternatively, third-party notices are provided directly in the images included in GDC.
For some of these third-party sources, we also provide copies of the source code. Third-party sources can also be found either in the images or in the following Google-hosted repositories:
For our Ubuntu mirror, we have not modified the packages. To find sources, add the following `deb-src` entry to your APT configuration, replacing VERSION with your Ubuntu release codename, and then run `apt-get source <package-name>`:

```shell
deb-src http://archive.ubuntu.com/ubuntu VERSION main universe
```