Installing Anthos Config Management

The Config Management Operator is a controller that manages Anthos Config Management in a Kubernetes cluster. Follow these steps to install and configure the Operator in each cluster you want to manage using Anthos Config Management.

Before you begin

This section describes prerequisites you must meet before enrolling supported clusters in Anthos Config Management.

Preparing your local environment

Before you install the Operator, make sure you have prepared your local environment by completing the following tasks:

  • Anthos Config Management requires an active Anthos entitlement. For more information, see Pricing for Anthos.

  • Install the Cloud SDK, which provides the gcloud, gsutil, and kubectl commands used in these instructions.

  • The Cloud SDK does not install kubectl by default. To install kubectl, use the following command:

    gcloud components install kubectl
    
  • Install the nomos command so that you can use the nomos status subcommand to detect any issues during installation and setup.

  • To download components of Anthos Config Management, you must be authenticated to Google Cloud using the gcloud auth login command.

  • You must configure the kubectl command to connect to your clusters. The steps differ for GKE and GKE on-prem clusters.

  • The Anthos API must be enabled for your project:

gcloud

To enable the Anthos API, run the following command:

gcloud services enable anthos.googleapis.com

Console

Enable the Anthos API
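
Before continuing, it can help to confirm that the command-line tools listed above are on your PATH. The following is a minimal sketch, assuming a POSIX shell. The missing_tools function name is illustrative and is not part of the Cloud SDK or Anthos Config Management.

```shell
# Print the name of every tool passed as an argument that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the tools used in these instructions.
missing=$(missing_tools gcloud gsutil kubectl nomos)
if [ -n "$missing" ]; then
  # $missing is intentionally unquoted so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

The check only confirms that the commands exist. It does not verify that you are authenticated or that kubectl points at the intended cluster.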

Clusters

  • Your clusters must be on an Anthos supported platform and version.

  • Your clusters must be registered to an Anthos environ using Connect. Your project's environ provides a unified way to view and manage your clusters and their workloads as part of Anthos, including clusters outside Google Cloud. Anthos charges apply only to your registered clusters. You can find out how to register a cluster in Registering a cluster.

Permissions

The Google Cloud user installing Anthos Config Management needs Identity and Access Management (IAM) permissions to create new Roles in your cluster.

Enrolling a cluster

To enroll a cluster in Anthos Config Management, complete the following steps:

  1. Deploy the Operator
  2. Grant the Operator read-only access to Git
  3. Configure the Operator

Deploying the Operator

If you are installing the Config Management Operator using the Google Cloud Console, the Operator is automatically deployed. Continue to Create the git-creds Secret.

After ensuring that you meet all the prerequisites, you can deploy the Operator by downloading and applying a YAML manifest.

  1. Download the latest version of the Operator CRD using the following command. To download a specific version instead, see Downloads.

    gsutil cp gs://config-management-release/released/latest/config-management-operator.yaml config-management-operator.yaml
    
  2. Apply the CRD:

    kubectl apply -f config-management-operator.yaml

If this fails, see Troubleshooting.
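
Because the Operator must be deployed in each cluster you want to manage, you might wrap the apply step in a loop over kubectl contexts. This is a sketch only: the apply_to_contexts function and the context names cluster-1 and cluster-2 are hypothetical, and it assumes the manifest has already been downloaded.

```shell
# Apply a manifest to every kubectl context passed after the manifest path.
# Stops at the first context where the apply fails.
apply_to_contexts() {
  manifest=$1
  shift
  for ctx in "$@"; do
    echo "Applying $manifest to context $ctx"
    kubectl apply -f "$manifest" --context="$ctx" || return 1
  done
}

# Example invocation (hypothetical context names):
# apply_to_contexts config-management-operator.yaml cluster-1 cluster-2
```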

Granting the Operator read-only access to Git

The Operator needs read-only access to your Git repository (the repo) so it can read the configs committed to the repo and apply them to your clusters. If credentials are required, they are stored in the git-creds Secret on each enrolled cluster.

If the repo does not require any authentication or authorization for read-only access, you do not need to create a git-creds Secret. You can skip to Configuring the Operator and set spec.git.secretType to none.

Most users need to create credentials because read access to their repository is restricted. Anthos Config Management supports the following mechanisms for authentication:

  • SSH keypair
  • cookiefile
  • Token
  • Google service account (for a Google Source Repository)

The mechanism you choose depends on what your repo supports. If all mechanisms are available, using an SSH keypair is recommended. At the time of writing, GitHub, Cloud Source Repositories, and Bitbucket all support using an SSH keypair. If your repo is hosted by your organization and you don't know which authentication methods are supported, contact your administrator.

After you create the credential, you add it to the cluster as a Secret. How you create the secret depends on the method of authentication. For more information, see the section on your authentication method below.

Choose the best method to authenticate to your repo from the options below. The method you choose also determines the secretType you use when configuring the Operator.

Using an SSH keypair

If your repo supports using an SSH keypair for authorization, follow the steps in this section to create the Secret materials. Otherwise, follow the instructions in using a cookiefile instead.

An SSH keypair consists of two files, a public key and a private key. The public key typically has a .pub extension.

  1. Create an SSH keypair to allow the Operator to authenticate to your Git repository. This is necessary if you need to authenticate to the repository in order to clone it or read from it. Skip this step if a security administrator will provide you with a keypair. You can use a single keypair for all clusters, or a keypair per cluster, depending on your security and compliance requirements.

    The following command creates a 4096-bit RSA key. Smaller key sizes are not recommended. Replace [GIT REPOSITORY USERNAME] and /path/to/[KEYPAIR-FILENAME] with the values you want the Operator to use to authenticate to the repository. If you use a third-party Git repository host such as GitHub, a separate account is recommended; if you use Cloud Source Repositories, use a service account.

    ssh-keygen -t rsa -b 4096 \
     -C "[GIT REPOSITORY USERNAME]" \
     -N '' \
     -f /path/to/[KEYPAIR-FILENAME]
    
  2. Configure your repo to recognize the newly-created public key. Refer to the documentation for your Git hosting provider. Instructions for some popular Git hosting providers are included for convenience:

  3. Add the private key to a new Secret in the cluster. Substitute the name of the private key (the one without the .pub suffix) where you see /path/to/[KEYPAIR-PRIVATE-KEY-FILENAME].

    kubectl create secret generic git-creds \
    --namespace=config-management-system \
    --from-file=ssh=/path/to/[KEYPAIR-PRIVATE-KEY-FILENAME]
    
  4. Delete the private key from the local disk or otherwise protect it.

Using a cookiefile

If your repo does not support using an SSH keypair for authorization, you can use a cookiefile or a Personal Access Token.

Follow the steps in this section to create the cookiefile Secret materials.

The process for acquiring a cookiefile depends on the configuration of your repo. For example, see the article about generating static credentials on Cloud Source Repositories. The credentials are usually stored in the .gitcookies file in the user's home directory, or they may be provided to you by a security administrator.

  1. After you create and obtain the cookiefile, add it to a new Secret in the cluster. Replace /path/to/[COOKIEFILE] with the appropriate path and filename.

    kubectl create secret generic git-creds \
    --namespace=config-management-system \
    --from-file=cookie_file=/path/to/[COOKIEFILE]
    
  2. Make sure to protect the contents of the cookiefile if you still need it locally. Otherwise, delete it.

Using a token

If your organization does not permit the use of SSH keys, you might prefer to use a token. With Anthos Config Management you can use GitHub's Personal Access Tokens (PAT) or Bitbucket's App Password as your token.

To create a Secret using your token, complete the following steps:

  1. Create a token using GitHub or Bitbucket.

    GitHub

    Create a PAT. Grant the token the repo scope, so it can read from private repositories. Because you bind a PAT to a GitHub account, we also recommend you create a machine user and bind your PAT to the machine user.

    Bitbucket

    Create an App Password.

  2. After you create and obtain the token, add it to a new Secret in the cluster. Replace [USERNAME] with the desired username and [TOKEN] with the token you created in the previous step.

    kubectl create secret generic git-creds \
        --namespace="config-management-system" \
        --from-literal=username=[USERNAME] \
        --from-literal=token=[TOKEN]
    
  3. Protect the token if you still need it locally. Otherwise, delete it.

Using a Google Service Account with a Google Source Repository

If your repo is in a Google Source Repository, you may use secretType: gcenode to give Anthos Config Management access to a repository in the same project as your managed cluster.

Before you begin, ensure the following prerequisites are met:

  • The Compute Engine default service account for the cluster, [PROJECT_NUMBER]-compute@developer.gserviceaccount.com, must have the source.reader role on the repository:

    gcloud projects add-iam-policy-binding [PROJECT_ID] \
    --member serviceAccount:[PROJECT_NUMBER]-compute@developer.gserviceaccount.com \
    --role roles/source.reader
    
  • Access scopes for the nodes in the cluster must include cloud-source-repos-ro. This can be achieved by including cloud-source-repos-ro in the --scopes list specified at cluster creation time, or by using the cloud-platform scope at cluster creation time:

    gcloud container clusters create example-cluster --scopes=cloud-platform
    

Once these prerequisites are met, set spec.git.syncRepo to the URL of the desired Google Source Repository when you configure the Operator. For example:

gcloud source repos list
REPO_NAME  PROJECT_ID  URL
my-repo    my-project  https://source.developers.google.com/p/my-project/r/my-repo-csr

This repository would require the following setting:

spec.git.syncRepo: https://source.developers.google.com/p/my-project/r/my-repo-csr

Using a Google Source Repository with Workload Identity

If Workload Identity is enabled on your cluster, additional steps are required to use secretType: gcenode. After completing the preceding steps, and configuring the Operator in the following section, create an IAM policy binding between the Kubernetes service account and the Google service account. The Kubernetes service account is not created until you configure the Operator to enable Anthos Config Management for the first time.

This binding allows the Anthos Config Management Kubernetes service account to act as the Compute Engine default service account:

gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:[PROJECT_ID].svc.id.goog[config-management-system/importer]" \
  [PROJECT_NUMBER]-compute@developer.gserviceaccount.com

Finally, add an annotation to the Anthos Config Management Kubernetes service account using the email address of the Compute Engine default service account:

kubectl annotate serviceaccount -n config-management-system importer \
  iam.gke.io/gcp-service-account=[PROJECT_NUMBER]-compute@developer.gserviceaccount.com
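
The two commands above both depend on strings derived from the project ID and project number. The following sketch builds those strings so they can be reused consistently. The function names and the sample values my-project and 123456789012 are hypothetical.

```shell
# Build the IAM member string for the importer Kubernetes service account
# from a project ID.
workload_identity_member() {
  printf 'serviceAccount:%s.svc.id.goog[config-management-system/importer]\n' "$1"
}

# Build the email address of the Compute Engine default service account
# from a project number.
compute_default_sa() {
  printf '%s-compute@developer.gserviceaccount.com\n' "$1"
}

# Example with hypothetical values:
workload_identity_member my-project
compute_default_sa 123456789012
```

If you do not know the project number, gcloud projects describe [PROJECT_ID] --format='value(projectNumber)' prints it.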

Using no authentication

If your repo does not require authentication for read-only access, you can continue to configure the Operator and set spec.git.secretType to none.
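
The kubectl create secret commands in the preceding sections store the credential under a different key for each authentication method. As a quick reference, the following sketch maps each secretType to the Secret keys those commands create. The expected_secret_keys function is illustrative only and is not part of Anthos Config Management.

```shell
# Print the git-creds Secret keys that the commands in this section create
# for a given secretType. gcenode and none do not use a git-creds Secret.
expected_secret_keys() {
  case "$1" in
    ssh)          echo "ssh" ;;
    cookiefile)   echo "cookie_file" ;;
    token)        echo "username token" ;;
    gcenode|none) : ;;
    *)
      echo "unknown secretType: $1" >&2
      return 1
      ;;
  esac
}

# Example:
expected_secret_keys token
```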

Configuring the Operator

You can configure the Operator for your cluster using kubectl or the Google Cloud Console.

kubectl

To configure the behavior of the Operator, create a configuration file for the ConfigManagement CustomResource, then apply it using the kubectl apply command.

For example, create a file named config-management.yaml and copy the following YAML into it.

# config-management.yaml

apiVersion: configmanagement.gke.io/v1
kind: ConfigManagement
metadata:
  name: config-management
spec:
  # clusterName is required and must be unique among all managed clusters
  clusterName: my-cluster
  git:
    syncRepo: git@github.com:my-github-username/csp-config-management.git
    syncBranch: 1.0.0
    secretType: ssh
    policyDir: "foo-corp"

For a complete list of fields that you can add to the spec field, see the following Configuration for the Git repository section. Do not change configuration values outside the spec field.

To apply the configuration, use the kubectl apply command:

kubectl apply -f config-management.yaml

You may need to create separate configuration files for each cluster, or each type of cluster. You can specify the cluster using the --context option.

kubectl apply -f config-management1.yaml --context=cluster-1
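
Because clusterName must be unique for each managed cluster, one way to produce per-cluster files is to generate them from a template. The following is a sketch under stated assumptions: the write_config function is hypothetical, and the Git settings are the sample values from the example above, which you would replace with your own.

```shell
# Write a ConfigManagement manifest for one cluster.
# $1: cluster name, $2: output file path.
write_config() {
  cluster_name=$1
  out_file=$2
  cat > "$out_file" <<EOF
apiVersion: configmanagement.gke.io/v1
kind: ConfigManagement
metadata:
  name: config-management
spec:
  clusterName: $cluster_name
  git:
    syncRepo: git@github.com:my-github-username/csp-config-management.git
    syncBranch: 1.0.0
    secretType: ssh
    policyDir: "foo-corp"
EOF
}

# Example usage (hypothetical names):
# write_config cluster-1 config-management1.yaml
# kubectl apply -f config-management1.yaml --context=cluster-1
```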

Configuration for the Git repository

Key                   Description
spec.git.syncRepo     The URL of the Git repository to use as the source of truth. Required.
spec.git.syncBranch   The branch of the repository to sync from. Default: master.
spec.git.policyDir    The path within the Git repository that represents the top level of the repo to sync. Default: the root directory of the repository.
spec.git.syncWait     Period in seconds between consecutive syncs. Default: 15.
spec.git.syncRev      Git revision (tag or hash) to check out. Default: HEAD.
spec.git.secretType   The type of secret configured for access to the Git repository. One of ssh, cookiefile, token, gcenode, or none. Required.
spec.sourceFormat     When set to unstructured, configures a non-hierarchical repo. Default: hierarchy.

Proxy configuration for the Git repository

If your organization's security policies require you to route traffic through an HTTP(S) proxy, you can configure Anthos Config Management to communicate with your Git host using the proxy's URI.

Key                         Description
spec.git.proxy.httpProxy    Defines the HTTP_PROXY environment variable used to access the Git repo.
spec.git.proxy.httpsProxy   Defines the HTTPS_PROXY environment variable used to access the Git repo.

If both the httpProxy and httpsProxy fields are specified, httpProxy will be ignored.

Configuration for behavior of the ConfigManagement object

Key                Description
spec.clusterName   The user-defined name for the cluster, used by ClusterSelectors to group clusters together. Must be unique within an Anthos Config Management installation.

Configuration for integrations

These fields enable integration with Config Connector and Policy Controller.

Key                                              Description
spec.configConnector.enabled                     If true, installs Config Connector. Defaults to false.
spec.policyController.enabled                    If true, enables Policy Controller. Defaults to false.
spec.policyController.templateLibraryInstalled   If true, installs the constraint template library. Defaults to true.

Console

Limitations

Configuring Anthos Config Management on the Google Cloud Console has the following limitations:

  • You cannot install or configure:
    • Policy Controller
    • Config Connector
    • Hierarchy Controller
  • Only a subset of Config Sync configuration options are supported. Unsupported Config Sync configurations include:
    • spec.git.proxy
    • spec.git.secretType: gcenode
    • spec.git.secretType: token
    • spec.sourceFormat

If you want to use any of these options, you should not use the Cloud Console.

Configuring the Operator

To configure the Operator on the Cloud Console, complete the following steps:

  1. Visit the Anthos Config Management menu in the Google Cloud Console.

  2. Select your registered clusters and click Configure.

  3. In the Git Repository Authentication for ACM section, complete the following:

    1. For the Secret type, select one of the following:

      • None. Select if your repo does not require authentication for read-only access.
      • SSH
      • cookiefile

      To use token or gcenode, configure the Operator using kubectl instead.

    2. Click Continue.

  4. In the ACM Settings for your clusters section, complete the following:

    1. In the URL field, add the URL of the Git repository to use as the source of truth. This field is required.
    2. In the Branch field, add the branch of the repository to sync from. The default is master. This field is required.
    3. In the Tag/Commit field, add the Git revision (tag or hash) to check out. The default is HEAD.
    4. Click Show advanced options.
    5. In the Policy directory field, add the path within the repository to the top of the policy hierarchy to sync. The default is the root directory of the repository.
    6. In the Sync wait field, add the period in seconds between consecutive syncs. The default is 15 seconds.
  5. Click Done. You are taken back to the Anthos Config Management menu. After a few minutes, refresh the page and you should see Synced in the status column next to the clusters you configured.

Verifying the installation

You can use the nomos status command to check whether the Operator is installed successfully. A valid installation with no problems has a status of PENDING or SYNCED. An invalid or incomplete installation has a status of NOT INSTALLED or NOT CONFIGURED. The output also includes any reported errors.
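
In a script, the status values described above can be reduced to a pass or fail result. The following sketch shows only that mapping. The status_is_healthy function is hypothetical, and extracting the status value from the nomos status output is deliberately left out because the output format is not described here.

```shell
# Succeed if the given status is one that a valid installation reports.
status_is_healthy() {
  case "$1" in
    PENDING|SYNCED) return 0 ;;
    *)              return 1 ;;
  esac
}

# Example:
if status_is_healthy "SYNCED"; then
  echo "installation looks valid"
fi
```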

When the Operator is deployed successfully, it runs in a Pod whose name begins with config-management-operator, in the kube-system namespace. The Pod may take a few moments to initialize. Verify that the Pod is running:

kubectl -n kube-system get pods | grep config-management

If the Pod is running, the command's response is similar (but not identical) to the following:

config-management-operator-6f988f5fdd-4r7tr 1/1 Running 0 26s

You can also verify that the config-management-system namespace exists:

kubectl get ns | grep 'config-management-system'

The command's output is similar to the following:

config-management-system Active 1m

If the commands don't return output similar to that shown here, view the logs to see what went wrong:

kubectl -n kube-system logs -l k8s-app=config-management-operator

You can also use kubectl get events to check if Anthos Config Management has created any events.

kubectl get events -n kube-system
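
Instead of repeatedly listing Pods while the Operator initializes, you can block until the Pod is ready. This sketch uses kubectl wait with the same k8s-app=config-management-operator label used in the logs command above. The wait_for_operator function name and the 120s default timeout are assumptions chosen for illustration.

```shell
# Block until the Operator Pod reports Ready, or until the timeout expires.
# $1: optional timeout, such as 60s (defaults to 120s).
wait_for_operator() {
  kubectl -n kube-system wait pod \
    -l k8s-app=config-management-operator \
    --for=condition=Ready \
    --timeout="${1:-120s}"
}

# Example usage:
# wait_for_operator 60s || kubectl -n kube-system logs -l k8s-app=config-management-operator
```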

It is possible to have an invalid configuration that isn't detected right away, such as a missing or invalid git-creds Secret. For troubleshooting steps, see Valid but incorrect ConfigManagement object in the Troubleshooting section of this topic.

Troubleshooting

The following sections help you troubleshoot your Anthos Config Management installation.

Insufficient CPU

The output of kubectl get events might include an event with the type FailedScheduling. The event looks like the following:

LAST SEEN   TYPE      REASON              OBJECT                                             MESSAGE
9s          Warning   FailedScheduling    pod/config-management-operator-74594dc8f6    0/1 nodes are available: 1 Insufficient cpu.

To fix this error, you need to either:

  • Add a node to an existing GKE node pool, or
  • Create a node pool with larger nodes.
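
On GKE, the two fixes above correspond roughly to the following gcloud commands. This is a sketch: the function names, cluster name, pool names, node count, and machine type are placeholders, and it assumes a default zone or region is already set in your gcloud configuration.

```shell
# Resize an existing node pool to the given number of nodes.
# $1: cluster name, $2: node pool name, $3: desired node count.
resize_pool() {
  gcloud container clusters resize "$1" --node-pool="$2" --num-nodes="$3"
}

# Create an additional node pool that uses a larger machine type.
# $1: cluster name, $2: new pool name, $3: machine type.
create_larger_pool() {
  gcloud container node-pools create "$2" --cluster="$1" --machine-type="$3"
}

# Example usage (placeholder values):
# resize_pool example-cluster default-pool 2
# create_larger_pool example-cluster larger-pool e2-standard-4
```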

Error: attempt to grant extra privileges

kubectl apply -f config-management-operator.yaml
Error from server (Forbidden): error when creating "config-management-operator.yaml": clusterroles.rbac.authorization.k8s.io "config-management-operator" is forbidden: attempt to grant extra privileges: [...] ruleResolutionErrors=[]

This error indicates that the current user has fewer permissions than are required for the installation. See the Prerequisites section for role-based access control in GKE.

Valid but incorrect ConfigManagement object

If installation fails because of a problem with the ConfigManagement object other than a YAML or JSON syntax error, the ConfigManagement object may be instantiated in the cluster but may not work correctly. In this situation, you can use the nomos status command to check for errors in the ConfigManagement object.

A valid installation with no problems has a status of PENDING or SYNCED.

An invalid installation has a status of NOT CONFIGURED and lists one of the following errors:

  • missing git-creds Secret
  • missing required syncRepo field
  • git-creds Secret is missing the key specified by secretType

Other errors may be added in the future.

To fix the problem, correct the configuration error. Depending on the type of error, you may need to re-apply the ConfigManagement manifest to the cluster.

If the problem is that you forgot to create the git-creds Secret, the Operator detects the Secret as soon as you create it, and you do not need to re-apply the configuration.
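
To check for a missing Secret or a missing Secret key directly, you can query the git-creds Secret with kubectl. The following is a minimal sketch: the git_creds_has_key function is hypothetical, and the key name you pass must match your authentication method (for example, ssh or cookie_file, as used when the Secret was created).

```shell
# Succeed if the git-creds Secret exists and contains a non-empty value
# for the given key.
git_creds_has_key() {
  value=$(kubectl get secret git-creds \
    --namespace=config-management-system \
    -o "jsonpath={.data.$1}" 2>/dev/null) || return 1
  [ -n "$value" ]
}

# Example usage:
# git_creds_has_key ssh || echo "git-creds is missing or has no ssh key"
```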

Upgrading

This section provides general instructions for upgrading to the current version. Before upgrading, check the release notes for any specific instructions.

Run these commands for each enrolled cluster.

  1. Download the Operator manifest and the nomos command for the new version.

  2. Apply the Operator manifest:

    kubectl apply -f config-management-operator.yaml
    

    This command updates the Operator image. Kubernetes retrieves the new version and restarts the Operator Pod using the new version. When the Operator starts, it runs a reconcile loop that applies the set of manifests bundled in the new image. This updates and restarts each component Pod.

  3. Replace the nomos or nomos.exe command on all clients with the new version. This ensures that the nomos command can always get the status of all enrolled clusters and can validate configs for them.

Uninstalling the Operator from a cluster

Follow these instructions to uninstall the Operator from a cluster. You must follow these steps for each cluster that you no longer want to manage by using Anthos Config Management.

  1. Delete the ConfigManagement object from the cluster:

    kubectl delete configmanagement --all

    The following things happen:

    • Any ClusterRoles and ClusterRoleBindings created in the cluster by Anthos Config Management are deleted from the cluster.
    • Any admission controller configurations installed by Anthos Config Management are deleted.
    • The contents of the config-management-system namespace are deleted, with the exception of the git-creds Secret. Anthos Config Management cannot function without the config-management-system namespace.
    • Any CustomResourceDefinitions created or modified by Anthos Config Management are removed from the clusters where they were created or modified. The CustomResourceDefinitions (CRDs) required to run the Operator still exist, because from the point of view of Kubernetes, they were added by the user who installed the Operator. Information about removing these is covered in the next step.
  2. At this point, the Operator still exists in your cluster, but does nothing. If you are using GKE on-prem, you cannot remove the Operator manually.

    If you are using GKE and decide you no longer want to use Anthos Config Management at all, you can uninstall the Operator by following these steps:

    1. Verify that the config-management-system namespace is empty after deleting the ConfigManagement object in the previous step. Wait until the kubectl -n config-management-system get all command returns No resources found.

    2. Delete the config-management-system namespace:

      kubectl delete ns config-management-system
      
    3. Delete the ConfigManagement CustomResourceDefinition:

      kubectl delete crd configmanagements.configmanagement.gke.io
      
    4. Delete all Anthos Config Management objects from the kube-system namespace:

      kubectl -n kube-system delete all -l k8s-app=config-management-operator
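
The wait in the first GKE substep can be scripted as a polling loop. This is a sketch, assuming the kubectl output contains the text No resources found once the namespace is empty, as described above. The wait_for_empty_namespace function, the default of 30 attempts, and the POLL_SECONDS variable are all hypothetical.

```shell
# Poll until the config-management-system namespace reports no resources.
# $1: optional maximum number of attempts (defaults to 30).
# POLL_SECONDS: optional delay between attempts (defaults to 5).
wait_for_empty_namespace() {
  attempts=${1:-30}
  while [ "$attempts" -gt 0 ]; do
    out=$(kubectl -n config-management-system get all 2>&1)
    case "$out" in
      *"No resources found"*) return 0 ;;
    esac
    attempts=$((attempts - 1))
    sleep "${POLL_SECONDS:-5}"
  done
  return 1
}

# Example usage:
# wait_for_empty_namespace && kubectl delete ns config-management-system
```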
      

What's next