This guide provides best practices, practical considerations, and recommendations for implementing fleets (formerly known as environs) in your organization.
Depending on which fleet-aware Anthos and Google Cloud components your organization wants to use, there are some limitations to consider when implementing fleets. For example, some components might not yet support working with clusters that aren't in the fleet host project.
The following table shows each component's current requirements and limitations.
|Component||Cluster types||Project requirements||VPC network requirements|
|Anthos Config Management||All Anthos and GKE supported clusters||None||None|
|Anthos Service Mesh on Google Cloud||Anthos clusters on Google Cloud||None||N/A|
|Anthos Service Mesh||Anthos clusters on VMware||Cluster must be registered to a fleet.||N/A|
|Multi Cluster Ingress||Anthos clusters on Google Cloud||Ingress resources, GKE clusters, and fleet must share the same project.||Ingress resources and GKE clusters must be in the same VPC network.|
|Workload identity pools||Optimized for Anthos, GKE on Google Cloud, and Anthos clusters on VMware. With Anthos, other Kubernetes clusters are supported, but require manual setup work.||None||None|
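The Multi Cluster Ingress row in the table can be expressed as a simple pre-flight check. The following is a minimal sketch, not an official tool: the `Cluster` type, its fields, and the function name are hypothetical, and the inventory is assumed to be collected by you from your own records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    # Hypothetical inventory record; not a Google Cloud API type.
    name: str
    project: str
    vpc_network: str


def multi_cluster_ingress_violations(fleet_project, clusters):
    """Return a list of messages describing violated requirements.

    Checks the two constraints from the table: clusters must share the
    fleet's project, and all clusters must be in the same VPC network.
    """
    problems = []
    for cluster in clusters:
        if cluster.project != fleet_project:
            problems.append(
                f"{cluster.name}: project {cluster.project} differs from "
                f"fleet host project {fleet_project}"
            )
    networks = {cluster.vpc_network for cluster in clusters}
    if len(networks) > 1:
        problems.append(
            "clusters span multiple VPC networks: " + ", ".join(sorted(networks))
        )
    return problems
```

An empty list means the inventory satisfies both constraints as modeled here; it does not replace checking the current product documentation.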
Organizing projects and VPC networks for fleets
When architecting for fleets, you need to consider two fundamental resources: Google Cloud projects and Virtual Private Cloud (VPC) networks.
As noted in Introducing fleets, each fleet is created within a single project. However (with the limitations noted in the previous table), fleets are intended to work with fleet-aware resources from the fleet host project, another Google Cloud project, other cloud providers, or on-premises.
Although it is not explicitly prevented, we recommend against splitting fleet-aware resources in the same project among different fleets; add them all to the same fleet instead. Splitting resources in the same project across fleets is considered an anti-pattern because the project boundary provides stronger protections for policy and governance purposes.
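One way to spot this anti-pattern is to group registered resources by project and flag any project that appears in more than one fleet. The sketch below assumes you already have a list of memberships as plain tuples; the function name and tuple layout are hypothetical.

```python
from collections import defaultdict


def projects_split_across_fleets(memberships):
    """Find projects whose resources are registered to more than one fleet.

    memberships: iterable of (resource_name, resource_project, fleet_host_project)
    tuples (a hypothetical layout for this sketch).
    Returns {resource_project: sorted list of fleet host projects}.
    """
    fleets_by_project = defaultdict(set)
    for _resource_name, resource_project, fleet_host_project in memberships:
        fleets_by_project[resource_project].add(fleet_host_project)
    return {
        project: sorted(fleets)
        for project, fleets in fleets_by_project.items()
        if len(fleets) > 1
    }
```

For example, if two clusters in project `team-a` are registered to `fleet-1` and `fleet-2` respectively, the result contains `team-a` mapped to both fleets.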
When deciding how to place fleet-aware resources in multiple projects, we anticipate that many organizations will have different tenancy requirements. Consider the following two extremes:
- Some organizations might choose to place all fleet-aware resources in a handful of centrally controlled projects, allocating namespaces to teams.
- Other organizations might choose to give teams their own dedicated clusters or virtual machine (VM) resources within their teams' own projects.
In the first extreme, it is easier to maintain centralized governance over the resources, but it might require additional work to attain the desired isolation. In the second extreme, these tradeoffs are reversed. In some complex cases, your organization might have a mixture of both shared infrastructure resources and dedicated ones, isolated in separate projects. No matter where you end up, as we discuss in our High trust section, maintaining mutual trust over the resources registered to a fleet is important to maintaining the integrity of the fleet.
Closely related to project organization is network organization. Several fleet components, as noted in the component requirements table, require specific connectivity between registered resources in the fleet. Some of these requirements might be relaxed over time. Today, for example, Multi Cluster Ingress requires that pods be in the same VPC network and that the clusters themselves be in the same project as the fleet.
When components can loosen these initial project and VPC network requirements, we anticipate that adopting a Shared VPC model will become a best practice whenever you require multiple projects. In such a model, the fleet can be instantiated in the VPC network's host project with resources registered from their respective service projects. If you require multiple fleets with a Shared VPC, you can nominate other projects to be fleet host projects, one per fleet.
Adding/removing fleet resources (clusters)
Existing fleet-aware resources can be added to a fleet, but special care must be taken to ensure that services are not disrupted as a result of being added. In particular, it is important to ensure that the sameness and trust properties are considered before adding the resource to the fleet. The fleet administrator should pay special attention to how active fleet components use sameness. This might require migrating to consistent naming practices, establishing governance of the resource, or potentially performing other actions before adding the resource to the fleet.
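As an illustration of a sameness review, the following sketch compares namespaces on a candidate cluster with those already in the fleet and reports names that exist on both sides but belong to different owners. It assumes that namespaces with the same name are treated as the same across the fleet, and that you track an owner per namespace; the function name and the owner mapping are hypothetical.

```python
def namespace_conflicts(fleet_namespaces, candidate_namespaces):
    """Return namespaces present on both sides with different owners.

    Both arguments map namespace name -> owning team (hypothetical metadata
    that you maintain yourself). The result maps each conflicting namespace
    to a (fleet_owner, candidate_owner) tuple.
    """
    conflicts = {}
    for namespace, candidate_owner in candidate_namespaces.items():
        fleet_owner = fleet_namespaces.get(namespace)
        if fleet_owner is not None and fleet_owner != candidate_owner:
            conflicts[namespace] = (fleet_owner, candidate_owner)
    return conflicts
```

Each reported namespace is a candidate for renaming or for a governance decision before the cluster is registered.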
Removing resources from a fleet also requires some additional attention. For example, resources that are actively part of a service mesh or targeted as part of a multi-cluster load balancer will be impacted. To prepare for removing the resource, we recommend reviewing each component that you have enabled on your fleet, and taking any necessary steps to drain active service mesh traffic or external traffic.
As fleets evolve, we will provide more in-band guidance when adding and removing fleet resources.
Enabling or reconfiguring fleet components
Enabling or reconfiguring Google Cloud or Anthos components that use fleets also requires some special care. When enabling new components, pay attention to the potential side effects of enabling the component on all clusters. For example, before enabling Anthos Service Mesh, understand which service endpoints are merged across resources, and ensure that this is the desired result.
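To preview which service endpoints might be merged, you could list every service that appears under the same namespace and name in more than one cluster. This is a rough sketch under the assumption that a namespace and service name pair identifies the same service across the fleet; the input format and function name are hypothetical, and the inventory would come from your own cluster listings.

```python
from collections import defaultdict


def merged_services(services_by_cluster):
    """Return services that appear in more than one cluster.

    services_by_cluster: maps cluster name -> iterable of
    (namespace, service_name) pairs (a hypothetical inventory format).
    Returns {(namespace, service_name): sorted list of cluster names}.
    """
    clusters_by_service = defaultdict(set)
    for cluster, services in services_by_cluster.items():
        for namespace, service_name in services:
            clusters_by_service[(namespace, service_name)].add(cluster)
    return {
        service: sorted(clusters)
        for service, clusters in clusters_by_service.items()
        if len(clusters) > 1
    }
```

Reviewing this list with the owning teams before enabling the component helps confirm that each merge is intended.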
We will provide further in-band guidance when configuring fleet-enabled components as we evolve the fleet concept.
- For some hypothetical scenarios that illustrate the considerations described in this guide, see Fleet examples.