This page provides a starting point to help you plan and architect CI/CD GitOps pipelines for Kubernetes, which can help you make the most of Config Sync.
This page is for Admins, architects, and Operators who want to implement GitOps in their environment. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE Enterprise user roles and tasks.
GitOps itself is a universal best practice for organizations managing Kubernetes configuration at scale. But when it comes to architecting that solution, you have many choices. Understanding your options and the benefits and trade-offs of those decisions can help you avoid rewriting your architecture in the future.
You don't need to use every best practice listed on this page. Which best practices you choose to adopt will depend on your unique situation. The goal of this page is to help you make informed decisions when setting up your GitOps architecture.
Use a centralized, private package repository
Using a central repository for public or internal packages (such as Helm or kpt) can help teams find packages more easily. You can use services like Artifact Registry or Git repositories.
The platform team can implement policies where application teams can use packages only from the central repository. Alternatively, they could use the central repository as a set of vetted packages.
You can limit write permissions to the repository to only a small number of engineers. The rest of the organization can have read access. We recommend implementing a process for promoting packages into the central repository and broadcasting updates.
The following table lists the benefits and downsides of using a centralized, private package repository:
Benefits |
Downsides |
|
|
Create wet repositories
Create repositories that contain the YAML output matching the desired state of your cluster or namespace. Changes to a wet (fully hydrated) repository should be easy to review by using a diff. As a good practice, make changes to the wet repository only through a review process (for example, a pull request in GitHub).
The following table lists the benefits and downsides of creating wet repositories:
Benefits |
Downsides |
|
|
Shift left for validating configs
Waiting until Config Sync starts syncing to check for issues can create unnecessary Git commits and a long feedback loop. Many issues can be found before a config is applied to a cluster by using kpt validator functions.
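As a minimal sketch of this idea, a kpt package can declare validator functions in its Kptfile so that they run in CI (for example, with `kpt fn render`) before anything is committed to the synced repository. The package name is hypothetical, and the function image and tag are assumptions; check the kpt function catalog for the validators and versions that fit your environment.

```yaml
# Kptfile at the root of a package (hypothetical package name).
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: example-package
pipeline:
  validators:
    # Validates resources against their Kubernetes schemas.
    # The image tag is an assumption; pin the version you have vetted.
    - image: gcr.io/kpt-fn/kubeval:v0.3
```

Running the pipeline in a presubmit check surfaces schema errors in the review, rather than as a sync error after the commit has merged.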
The following table lists the benefits and downsides of checking for issues before applying a config:
Benefits |
Downsides |
|
|
Use folders instead of branches
Use folders for variants of the configuration instead of branches. With folders, you can use the tree command to see every variant side by side. With branches, by contrast, you can't tell whether the delta between a prod and stage branch is an upcoming change in configuration or a permanent difference between what stage and prod should look like.
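One way to apply this is to keep every variant on the same branch and point each cluster's RootSync at its own folder. The following sketch assumes a hypothetical repository with `clusters/prod` and `clusters/stage` folders; the repository URL and folder names are illustrative only.

```yaml
# RootSync for a production cluster (hypothetical repository and paths).
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  sourceType: git
  git:
    repo: https://github.com/example-org/platform-config  # hypothetical
    branch: main          # one branch shared by every variant
    dir: clusters/prod    # a stage cluster would use clusters/stage
    auth: none            # assumes a public repository
```

Because both folders live on the same branch, a diff between `clusters/prod` and `clusters/stage` shows only the intended, permanent differences between the environments.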
The following table lists the benefits and downsides of using folders instead of branches:
Benefits |
Downsides |
|
|
Minimize use of ClusterSelectors
ClusterSelectors let you apply certain parts of a configuration to a subset of clusters. Instead of configuring a RootSync or RepoSync, you can either modify the resource that is being applied or add labels to the clusters. Over time, however, as the number of ClusterSelectors grows, it can become complicated to understand the final state of the cluster.
Config Sync lets you sync multiple RootSyncs and RepoSyncs at once, meaning you can add the relevant configuration to a separate repository and then sync it to the clusters you want.
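For example, instead of labeling a subset of clusters and selecting them with a ClusterSelector, you might create an additional RootSync only on the clusters that need the extra configuration. This is a sketch under assumptions: the object name and repository URL are hypothetical.

```yaml
# Additional RootSync created only on clusters that need this configuration.
# It runs alongside any RootSync objects that already exist on the cluster.
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: gpu-addons    # hypothetical name
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  sourceType: git
  git:
    repo: https://github.com/example-org/gpu-addons-config  # hypothetical
    branch: main
    auth: none        # assumes a public repository
```

With this layout, the set of RootSync objects on a cluster tells you which configuration it receives, without evaluating selectors against cluster labels.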
The following table lists the benefits and downsides of not using ClusterSelectors:
Benefits |
Downsides |
|
|
Avoid managing Jobs with Config Sync
While Config Sync can apply Jobs for you, Jobs are not well suited for GitOps deployment for the following reasons:
- Immutable fields: Many Job fields are immutable. To change an immutable field, the object must be deleted and recreated. However, Config Sync doesn't delete your object unless you remove it from the source.
- Unintended running of Jobs: If you sync a Job with Config Sync and that Job is then deleted from the cluster, Config Sync treats the deletion as drift from your chosen state and re-creates the Job. If you specify a Job time to live (TTL), the Job is automatically deleted and Config Sync automatically re-creates it, which restarts the Job. This cycle repeats until you delete the Job from the source of truth, which is rarely what you intended.
- Reconciliation issues: Config Sync normally waits for objects to reconcile after being applied. However, Jobs are considered reconciled when they have started running, so Config Sync doesn't wait for the Job to complete before continuing to apply other objects. If the Job later fails, that is considered a failure to reconcile. In some cases, this can block other resources from being synced and cause errors until you fix it. In other cases, the syncing might succeed and only reconciling fails.
For these reasons, we don't recommend syncing Jobs with Config Sync.
In most cases, Jobs and other situational tasks should be managed by a service that handles their lifecycle management. You can then manage that service with Config Sync, instead of the Jobs themselves.
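To make the TTL issue concrete, the following sketch shows the kind of manifest that triggers the re-creation loop when it is synced by Config Sync. It illustrates the pattern to avoid, not a recommended configuration; the Job name, image, and command are hypothetical.

```yaml
# Anti-pattern when stored in a synced repository: after the Job finishes,
# Kubernetes deletes it once the TTL expires, Config Sync sees drift from the
# source of truth, and the Job is re-created and runs again.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-migration    # hypothetical name
  namespace: example
spec:
  ttlSecondsAfterFinished: 300
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: example.com/tools/migrate:1.0.0   # hypothetical image
          command: ["/bin/migrate", "--once"]      # hypothetical command
```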
The following table lists the benefits and downsides of not using Config Sync to manage Jobs:
Benefits |
Downsides |
|
|
Use unstructured repositories
Config Sync supports two structures for organizing a repository: unstructured and hierarchical. Unstructured is the recommended approach because it lets you organize a repository in the way that's most convenient for you. Hierarchical repositories, by comparison, enforce a specific structure. For example, CRDs have to be in a specific directory. This can cause issues when you need to share configs. For example, if one team publishes a package that contains a CRD, another team that needs to use that package would have to move the CRD into a cluster directory, adding more overhead to the process.
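The repository structure is selected on the RootSync object. The fragment below is a sketch that shows only the relevant field; to the best of our understanding, RootSync falls back to the hierarchical format when the field is omitted, so set it explicitly and confirm the default for your Config Sync version.

```yaml
# Fragment of a RootSync spec; other fields are omitted for brevity.
spec:
  # Accepts "unstructured" or "hierarchy".
  sourceFormat: unstructured
```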
The following table lists the benefit and downside of using unstructured repositories:
Benefits |
Downsides |
|
|
To learn how to convert a hierarchical repository, see Convert a hierarchical repository to an unstructured repository.
Separate code and config repositories
As a mono-repository scales up, it requires a build specific to each folder. The permissions and concerns of people working on the code and of people working on the cluster configuration are generally different. By keeping code and config repositories separate, each repository can have its own permissions and structure.
The following table lists the benefits and downsides of separating code and config repositories:
Benefits |
Downsides |
|
|
Use separate repositories to insulate changes
As a mono-repository scales up, different folders require different permissions. Separating repositories lets you create security boundaries between security, platform, and application configuration. It's also a good idea to separate production and non-production repositories.
The following table lists the benefits and downsides of insulating changes in separate repositories:
Benefits |
Downsides |
|
|
Pin package versions
Whether you use Helm or Git, pin the configuration package to a version that can't move forward without an explicit rollout (for example, a specific chart version or a Git commit SHA rather than a branch).
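The following sketch shows what pinning might look like for a Git source and for a Helm source. All repository URLs, names, and version numbers are hypothetical. A Git tag is used here for readability; because tags can be moved, a full commit SHA is the stricter choice.

```yaml
# Git source pinned to a revision instead of tracking a branch.
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: pinned-git
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  sourceType: git
  git:
    repo: https://github.com/example-org/platform-config  # hypothetical
    revision: v1.4.2      # hypothetical tag; a commit SHA is stricter
    dir: clusters/prod
    auth: none
---
# Helm source pinned to an exact chart version.
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: pinned-helm
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  sourceType: helm
  helm:
    repo: https://charts.example.com   # hypothetical chart repository
    chart: example-chart
    version: 1.4.2                     # exact version, not a range
    releaseName: example
    namespace: example
    auth: none
```

Rolling out a new version then becomes an explicit, reviewable change to the `revision` or `version` field.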
The following table lists the benefit and downside of pinning package versions:
Benefits |
Downsides |
|
|
Use Workload Identity Federation for GKE
You can enable Workload Identity Federation for GKE on GKE clusters, which allows Kubernetes workloads to access Google services in a secure and manageable way.
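As one illustration, a common way to link a Kubernetes ServiceAccount to a Google service account is an annotation on the ServiceAccount. This is a sketch under assumptions: the names and project are hypothetical, the Google service account must separately grant the Workload Identity User role to the Kubernetes ServiceAccount, and other linking methods might suit your setup better.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: example-app        # hypothetical name
  namespace: example
  annotations:
    # Hypothetical Google service account that the workload acts as.
    iam.gke.io/gcp-service-account: example-app@example-project.iam.gserviceaccount.com
```

Pods that run as this ServiceAccount can then call Google APIs without a service account key stored in the cluster.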
The following table lists the benefit and downside of using Workload Identity Federation for GKE:
Benefits |
Downsides |
|
|
High-level architecture
At a high level, you likely want at least four types of repositories:
- A package repository where shared configuration is stored. This could also be a Helm chart stored in Artifact Registry.
- A platform repository where the platform team stores fleet-wide configuration for clusters and namespaces.
- An application configuration repository.
- An application code repository.
The following diagram shows the layout of these repositories:
The following diagram shows the flow of configuration from application code into an application configuration repository. Development teams push application code and application configurations into a repository. The code for both apps and configs is stored in the same place, and application teams have control over these repositories. App teams can then push code into a build.