
Anthos technical overview

Anthos is Google's cloud-centric container platform for running modern apps anywhere consistently at scale. This guide provides an overview of how Anthos works and how it can help you deliver manageable, scalable, reliable applications.

Why Anthos?

Typically, as organizations embrace cloud-native technologies such as containers, container orchestration, and service meshes, they reach a point where running a single cluster is no longer sufficient. Organizations deploy multiple clusters for a variety of technical and business reasons: separating production from non-production environments, meeting varying regulatory restrictions, or separating services across tiers, locales, or teams. However, using multiple clusters brings its own difficulties and overhead in terms of consistent configuration, security, and management. For example, manually configuring one cluster at a time risks breakages, and it can be challenging to see exactly where errors are happening.

Things can become even more complex (and expensive) when the clusters aren't all in one place. Many organizations using Google Cloud also want or need to run workloads in their own data centers, factory floors, retail stores, and even in other public clouds. However, they don't want to build new container platforms themselves in all these locations, or rethink how they configure, secure, monitor, and optimize container workloads depending on where the workloads run, with the attendant risk of inconsistent environments, security gaps, misconfiguration, and operational toil.

For example:

  • A financial institution is building a digital banking platform on Google Cloud and requires consistent configurations, strong security policy enforcement, and deep visibility into how multiple apps communicate. A large retail company building a modern ecommerce platform has the same requirements. Both companies manage multiple clusters in multiple regions in Google Cloud using GKE.
  • Another global financial institution is building complex risk management apps, inter-bank transfer apps, and many other sensitive workloads, some of which must remain behind the corporate firewall and some of which are deployed on GKE on Google Cloud.
  • A major pharmacy retailer is creating new vaccine scheduling, customer messaging, and digital engagement apps to modernize pharmacy operations and create a more personalized in-store experience. These apps require in-store container platforms that are integrated with Google Cloud-hosted services like BigQuery and Retail Search.
  • A media and entertainment company requires a consistent container environment in 30 ballparks - all connected to and managed from Google Cloud - to gather and analyze terabytes of game statistics and to fuel fan engagement both inside the ballpark and virtually.
  • A hardware manufacturing company needs to test and optimize factory floor product quality and worker safety by analyzing data with very low latency to make decisions in near real-time, while also consolidating data in Google Cloud for longer-term analysis.
  • A software and internet company that offers an integration platform in a software as a service (SaaS) model needs to offer its platform on several major public clouds to run where its customers need proximity to native cloud services. The company needs a unified and consistent way to provision, configure, secure, and monitor container environments in multiple public clouds from one management plane, to avoid the operational overhead of managing each cloud environment with different native management tools.

Anthos can help all these organizations by providing a consistent platform that lets them:

  • Modernize applications and infrastructure in-place
  • Create a unified cloud operating model (single pane of glass) to create, update, and optimize container clusters wherever they are
  • Scale large multi-cluster applications as fleets - logical groupings of similar environments - with consistent security, configuration, and service management
  • Enforce consistent governance and security from a unified control plane

It does this with opinionated tools and features that help them govern, manage, and operate containerized workloads at enterprise scale, enabling them to adopt best practices and principles that we've learned from running services at Google.

Anthos basics

Diagram showing the features of the Anthos platform

Anthos capabilities are built around the idea of the fleet: a logical grouping of Kubernetes clusters that can be managed together. A fleet can be entirely made up of GKE clusters on Google Cloud, or include clusters outside Google Cloud running on-premises and on other public clouds such as AWS and Azure.

Once you have created a fleet, you can use Anthos fleet-enabled features to add further value and simplify working across multiple clusters and infrastructure providers:

  • Configuration and policy management tools help you work more easily at scale, automatically adding and updating the same configuration, features, and security policies consistently across your fleet, wherever your clusters are.
  • Fleet-wide networking features help you manage traffic across your entire fleet, including Multi-Cluster Ingress for applications that span multiple clusters, and service mesh traffic management features.
  • Identity management features help you consistently configure authentication for fleet workloads and users.
  • Observability features let you monitor and troubleshoot your fleet clusters and applications, including their health, resource utilization, and security posture.
  • For microservice-based applications running in your fleet, Anthos Service Mesh provides powerful tools for application security, networking, and observability across your mesh.

You can enable the entire Anthos platform to use all available features, including multi-cloud and hybrid cloud capabilities, or you can create a fleet on Google Cloud only and pay for additional Anthos features as you need them. Anthos uses industry-standard open source technologies, and supports multiple infrastructure providers, providing flexibility to use Anthos in a way that meets your business and organizational needs.

How fleets work

Fleets are how Anthos lets you logically group and normalize Kubernetes clusters, making administration of infrastructure easier. Adopting fleets helps your organization uplevel management from individual clusters to groups of clusters, with a single view on your entire fleet in the Google Cloud console.

A key concept that makes fleets work is sameness. This means that, within a fleet of clusters, some Kubernetes objects such as namespaces in different clusters are treated as if they were the same thing when they have the same name. This normalization makes it simpler to manage many clusters at once and is used by Anthos fleet-enabled features. For example, you can apply a security policy with Policy Controller to all fleet services in namespace foo, regardless of which clusters they happen to be in, or where those clusters are.

Fleets also assume service sameness (all services in a namespace with the same name can be treated as the same service, for example for traffic management purposes) and identity sameness (services and workloads within a fleet can leverage a common identity for authentication and authorization). The fleet sameness principle also provides some strong guidance about how to set up namespaces, services, and identities, following what many organizations and Google already implement themselves as best practices.
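For example, sameness means that a namespace defined identically in every cluster is treated as one logical namespace across the fleet. A minimal sketch, in which the `backend` namespace name and its label are hypothetical:

```yaml
# Applied to every cluster in the fleet (for example, from a shared
# config repository). Under fleet sameness, the "backend" namespace
# in each cluster is treated as the same logical namespace, so
# fleet-wide policies and workload identity can target it as a unit.
apiVersion: v1
kind: Namespace
metadata:
  name: backend
  labels:
    team: payments  # hypothetical label, for illustration only
```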

How you organize your fleets depends on your organizational and technical needs. Each fleet is associated with a specific Google Cloud project, known as your fleet host project, which you use to manage and view your fleet, but can include clusters from other projects. You could, for example, have separate fleets for your prod, test, and dev environments, or separate fleets for different teams or lines of business. Clusters that have large amounts of cross-service communication benefit the most from being managed together in a fleet. Clusters in the same environment (for example, your production environment) should be in the same fleet.




Kubernetes clusters everywhere

Kubernetes is at the core of Anthos, with a variety of Kubernetes cluster options to choose from when building your fleet:

  • Google Kubernetes Engine (GKE) is Google's managed Kubernetes implementation on Google Cloud, with a cloud-hosted control plane and clusters made up of Compute Engine instances. While GKE on its own helps you automatically deploy, scale, and manage Kubernetes, grouping GKE clusters in a fleet lets you work more easily at scale, and allows you to use Anthos features in addition to the powerful cluster management features already offered by GKE.
  • Anthos clusters extends GKE for use with other infrastructure providers, including Azure, AWS, and on your own hardware on-premises (either on VMware or on bare metal). In these options, the Google-provided Kubernetes control plane runs in your data center or cloud provider along with your cluster nodes, with your clusters connected to your fleet host project in Google Cloud.
  • Google Distributed Cloud Edge also lets you add on-premises GKE clusters to your fleet, this time running on Google-provided and maintained hardware and supporting a subset of Anthos features.
  • GKE-based clusters are not your only option. Anthos also provides the ability to register conformant third-party Kubernetes clusters to your fleet, such as EKS and AKS clusters, known as attached clusters. With this option you continue to run existing workloads where they are while adding value with a subset of Anthos features. Anthos does not manage the Kubernetes control plane or node components—only the Anthos services that run on those clusters.

For all GKE-based clusters, including on-premises and public clouds, Anthos provides tools for cluster management and lifecycle (create, update, delete, and upgrade), including command line utilities and, for some cluster types, management from the Cloud console.

Cluster configuration

Wherever your clusters are, Anthos Config Management provides a consistent way to manage cluster configuration across your entire fleet, including attached clusters. Anthos Config Management uses the approach of "configuration as data": the desired state of your environment is defined declaratively, maintained as a single source of truth under version control, and applied directly with repeatable results. Anthos Config Management monitors a central Git repository containing your configuration and automatically applies any changes to its specified target clusters, wherever they happen to be running. Any YAML or JSON that can be applied with kubectl commands can be managed with Anthos Config Management and applied to any Kubernetes cluster.
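As an illustration of "configuration as data", with Config Sync (part of Anthos Config Management) you declare the Git source of truth as a `RootSync` resource. This is a sketch; the repository URL and directory are hypothetical:

```yaml
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  git:
    repo: https://example.com/acme/fleet-config  # hypothetical repository
    branch: main
    dir: clusters/prod    # directory whose contents are synced to these clusters
    auth: none            # public repo; private repos use a credential secret
```

Anthos Config Management then watches the repository and continuously reconciles each target cluster to the declared state.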

Migration and VMs

For organizations that want to migrate their applications to containers and Kubernetes as part of their modernization process, the Anthos platform includes Migrate to Containers, with tools to convert VM-based workloads into containers that run on GKE or Anthos clusters. On bare metal Anthos platforms (Anthos clusters on bare metal and Distributed Cloud Edge), organizations can also use Anthos VM Runtime to run VMs on top of Kubernetes in the same way that they run containers, letting them continue to use existing VM-based workloads as they also develop and run new container-based applications. When they're ready, they can migrate these VM-based workloads to containers and continue using the same Anthos management tools.




Anthos features

The rest of this guide introduces you to the features that Anthos provides to help you manage your fleets and the applications that run on them. You can see a complete list of available features for each supported Kubernetes cluster type in Anthos deployment options.

Networking, authentication, and security

After you have built your fleet, Anthos helps you manage traffic, manage authentication and access control, and consistently enforce security and compliance policies across your fleet.

Connecting to your fleet

To manage the connection to Google in hybrid and multi-cloud fleets, Google provides a Kubernetes deployment called the Connect Agent. Once installed in a cluster as part of fleet registration, the agent establishes a connection between your cluster outside Google Cloud and its Google Cloud fleet host project, letting you manage your clusters and workloads from Google and use Google services.

In on-premises environments, connectivity to Google can use the public internet, a high-availability VPN, Partner Interconnect, or Dedicated Interconnect, depending on your applications' latency, security, and bandwidth requirements when interacting with Google Cloud.




Load balancing

For managing traffic to and within your fleet, Anthos provides the following load balancing solutions:

  • GKE clusters on Google Cloud can use Google Cloud's built-in Cloud Load Balancing options
  • Anthos clusters on-premises let you choose from a variety of load balancing modes to suit your needs, including a bundled MetalLB load balancer and the option to manually configure load balancing to use your existing solutions
  • Distributed Cloud Edge includes bundled MetalLB load balancing
  • Anthos clusters on other public clouds use platform-native load balancers



Authentication and access control

A significant challenge when working with multiple clusters across multiple infrastructure providers is managing authentication and authorization. For authenticating to your fleet's clusters, Anthos provides options for consistent, simple, and secure authentication when interacting with clusters from the command line with kubectl and from the Google Cloud console.

  • Use Google identity: The Connect Gateway lets users and service accounts authenticate to clusters across your fleet with their Google IDs, wherever the clusters live. You can use this feature to connect directly to clusters, or leverage it with build pipelines and other DevOps automation.
  • Use third-party identity: Anthos Identity Service lets you configure authentication with third-party identity providers, letting your teams continue to use existing usernames, passwords, and security groups from OIDC (and LDAP where supported) providers such as Microsoft AD FS and Okta across your entire fleet.

You can configure as many supported identity providers as you want for a cluster.

Once you have set up authentication, you can then use standard Kubernetes role-based access control (RBAC) to authorize authenticated users to interact with your clusters, as well as Identity and Access Management to control access to Google services such as the Connect Gateway.
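For example, once users authenticate with their Google identities through the Connect Gateway, a standard Kubernetes `ClusterRoleBinding` can grant them read-only access. This is a sketch; the user email is hypothetical:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dev-team-view
subjects:
- kind: User
  name: alice@example.com          # hypothetical Google identity
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view                       # built-in read-only ClusterRole
  apiGroup: rbac.authorization.k8s.io
```

Applying the same binding to every cluster in the fleet (for example, through Anthos Config Management) gives users consistent access wherever they connect.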

For workloads running on your clusters, Anthos provides fleet-wide workload identity. This feature lets workloads on fleet member clusters use identities from a fleet-wide workload identity pool when authenticating to external services such as Cloud APIs. This makes it simpler to set up an application's access to these services versus having to configure access cluster by cluster. For example, if you have an application with a backend deployed across multiple clusters in the same fleet, and which needs to authenticate to a Google API, you can configure your application so that all services in the "backend" namespace can use that API.




Policy management

Another challenge when working with multiple clusters is enforcing consistent security and regulatory compliance policies across your fleet. Many organizations have stringent security and compliance requirements, such as those protecting consumer information in financial service applications, and need to be able to meet these at scale.

To help you do this, Policy Controller, which is part of Anthos Config Management, evaluates every Kubernetes API request to the relevant clusters against your defined policies. These policies act as "guardrails", preventing changes to cluster configuration that would violate security, operational, or compliance controls. You can set policies to actively block non-compliant API requests across your fleet, or simply to audit the configuration of your clusters and report violations. Common security and compliance rules can easily be expressed using Policy Controller's built-in set of rules, or you can write your own rules in its extensible policy language, which is based on the open source Open Policy Agent project.
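For example, a constraint based on the library's `K8sRequiredLabels` template can require every namespace to carry an `owner` label. This is a sketch; the constraint name and label key are hypothetical:

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: namespaces-must-have-owner
spec:
  enforcementAction: dryrun   # audit and report only; omit to block violations
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Namespace"]
  parameters:
    labels:
    - key: owner              # every matched namespace must have this label
```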




Application-level security

For applications running on your fleet, Anthos provides defense-in-depth access control and authentication features, including:

  • Binary Authorization, which lets you ensure that only trusted images are deployed on your fleet's clusters.
  • Kubernetes network policy, which lets you specify which Pods are allowed to communicate with each other and other network endpoints.
  • Anthos Service Mesh service access control, which lets you configure fine-grained access control for your mesh services based on service accounts and request contexts.
  • Anthos Service Mesh certificate authority (Mesh CA), which automatically generates and rotates certificates so you can enable mutual TLS authentication (mTLS) easily between your services.
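As a sketch of the Kubernetes network policy option, the following policy allows Pods labeled `app: frontend` to reach Pods labeled `app: backend` on port 8080, and denies other ingress to those Pods (the namespace and labels are hypothetical):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: backend
spec:
  podSelector:
    matchLabels:
      app: backend          # the Pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:          # matches Pods in the same namespace
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
```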

Observability

A key part of operating and managing clusters at scale is being able to easily monitor your fleet's clusters and applications, including their health, resource utilization, and security posture.

Anthos in the Google Cloud console

The Google Cloud console is Google Cloud's web interface that you can use to manage your projects and resources. The Anthos pages in the Google Cloud console provide you with a secured, unified user interface to view and manage your clusters and workloads, including an out-of-the-box structured view of your entire fleet. Dashboard pages let you view high level details, as well as letting you drill down as far as necessary to identify issues.

  • Anthos overview: The top-level Anthos overview provides an overview of your fleet's resource usage based on information provided through Cloud Monitoring, showing CPU, memory, and disk utilization aggregated by fleet and by cluster, as well as fleet-wide Policy Controller coverage.
  • Cluster management: The Anthos cluster management view provides a secure console where you can view the state of all your fleet's clusters (including cluster health), register GKE clusters to your fleet, and create new clusters for your fleet (Google Cloud and Anthos clusters on-premises only). For information about specific clusters, you can drill down from this view into the GKE dashboard to get further details about your cluster nodes and workloads.
  • Service Mesh: If you're using Anthos Service Mesh on Google Cloud, the Service Mesh view provides observability into the health and performance of your services. Anthos Service Mesh collects and aggregates data about each service request and response, meaning you don't have to instrument your code to collect telemetry data or manually set up dashboards and charts. Anthos Service Mesh automatically uploads metrics and logs to Cloud Monitoring and Cloud Logging for all traffic within your cluster. This detailed telemetry lets operators observe service behavior, and empowers them to troubleshoot, maintain, and optimize their applications.
  • Configuration management: The Configuration Management view gives you an at-a-glance overview of the configuration state of all fleet clusters with Anthos Config Management enabled, and lets you quickly add the feature to clusters that haven't been set up yet. You can easily track configuration changes and see which branch and commit tag has been applied to each cluster. Flexible filters make it simple to view configuration rollout status by cluster, branch, or tag.
  • Features: The Features view lets you view the state of and manage Anthos features for your fleet clusters.
  • Security: For clusters on Google Cloud only, the Anthos Security dashboard provides an at-a-glance view of your applications' current security features, as well as a more detailed policy audit view to show you where you can modify security configurations or workloads to improve your security posture.

Logging and monitoring

For more in-depth information about your clusters and their workloads, you can use Cloud Logging and Cloud Monitoring. Cloud Logging provides a unified place to store and analyze log data, while Cloud Monitoring automatically collects and stores performance data, as well as providing data visualization and analysis tools. Most Anthos cluster types send logging and monitoring information for system components (such as workloads in the kube-system and gke-connect namespaces) to Cloud Monitoring and Cloud Logging by default. You can further configure Cloud Monitoring and Cloud Logging to get information about your own application workloads, build dashboards that include multiple types of metrics, create alerts, and more.

Depending on your organization and project needs, Anthos also supports integration with other observability tools, including open source Prometheus and Grafana, and third-party tools such as Elastic and Splunk.




Service management

In Kubernetes, a service is an abstract way to expose an application running on a set of Pods as a network service, with a single DNS address for traffic to the service workloads. In a modern microservices architecture, a single application may consist of numerous services, and each service may have multiple versions deployed concurrently. Service-to-service communication in this kind of architecture occurs over the network, so services must be able to deal with network idiosyncrasies and other underlying infrastructure issues.
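A minimal Service manifest looks like the following sketch (the service name, namespace, and ports are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: checkout
  namespace: backend
spec:
  selector:
    app: checkout     # routes traffic to Pods carrying this label
  ports:
  - port: 80          # port the Service exposes
    targetPort: 8080  # port the Pods listen on
```

Clients inside the cluster can then reach the workload at a stable DNS name such as checkout.backend.svc.cluster.local, regardless of which Pods are currently serving.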

To make it easier to manage services in your fleet, you can use Anthos Service Mesh. Anthos Service Mesh is based on Istio, which is an open-source implementation of a service mesh infrastructure layer. Service meshes factor out common concerns of running a service such as monitoring, networking, and security, with consistent, powerful tools, making it easier for service developers and operators to focus on creating and managing their applications. With Anthos Service Mesh, these functions are abstracted away from the application's primary container and implemented in a common out-of-process proxy delivered as a separate container in the same Pod. This pattern decouples application or business logic from network functions, and enables developers to focus on the features that the business needs. Service meshes also let operations teams and development teams decouple their work from one another.

Anthos Service Mesh provides you with many features along with all of Istio's functionality:

  • Service metrics and logs for all traffic within your mesh's cluster are automatically ingested to Google Cloud.
  • Automatically generated dashboards display in-depth telemetry in the Anthos Service Mesh dashboard, to let you dig deep into your metrics and logs, filtering and slicing your data on a wide variety of attributes.
  • Service-to-service relationships at a glance: understand what connects to each service and the services it depends on.
  • Secure your inter-service traffic: Anthos Service Mesh certificate authority (Mesh CA) automatically generates and rotates certificates so you can enable mutual TLS authentication (mTLS) easily with Istio policies.
  • Quickly see the communication security posture not only of your service, but its relationships to other services.
  • Dig deeper into your service metrics and combine them with other Google Cloud metrics using Cloud Monitoring.
  • Gain clear and simple insight into the health of your service with service level objectives (SLOs), which allow you to easily define and alert on your own standards of service health.

Anthos Service Mesh lets you choose between a fully-managed service mesh control plane in Google Cloud (for meshes running on fleet member clusters on Google Cloud only) or an in-cluster control plane that you install yourself. You can find out more about the features available for each option in the Anthos Service Mesh documentation.
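For example, mutual TLS can be required across the mesh with a single Istio `PeerAuthentication` resource. A sketch, assuming the mesh root namespace is istio-system (applying the resource there makes it mesh-wide):

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # root namespace: the policy applies mesh-wide
spec:
  mtls:
    mode: STRICT            # reject plaintext traffic between mesh services
```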




What's next?