Tutorial: Secure Anthos

Anthos provides a consistent platform for building and delivering secure services, with security features built in at every level that work separately and together to provide defense in depth against security issues. This tutorial introduces you to some of Anthos's security features using the Anthos Sample Deployment on Google Cloud. The Anthos Sample Deployment provides a working hands-on Anthos environment with a GKE cluster, service mesh, and a Bank of Anthos application with multiple microservices.

Objectives

In this tutorial, you're introduced to some of Anthos's security features through the following tasks:

  • Enforce mutual TLS (mTLS) in your service mesh by using Anthos Config Management to ensure end-to-end secure communication.

  • Set up a security guardrail that ensures that pods with privileged containers are not inadvertently deployed.

Costs

Using the Anthos Sample Deployment incurs pay-as-you-go charges for Anthos on Google Cloud as listed on our Pricing page, unless you have an Anthos subscription.

You are also responsible for other Google Cloud costs incurred while running the Anthos Sample Deployment, such as charges for Compute Engine VMs and load balancers. You can see an estimated monthly cost for all these resources on the deployment's Google Cloud Marketplace page.

We recommend cleaning up after finishing the tutorial or exploring the deployment to avoid incurring further charges. The Anthos Sample Deployment is not intended for production use and its components cannot be upgraded.

Before you begin

This tutorial is a follow-up to the Explore Anthos tutorial. Before starting this tutorial, follow the instructions on that page to set up your project and install the Anthos Sample Deployment.

Setting up your Cloud Shell environment

In this tutorial, you will use the Cloud Shell command line and editor to make changes to cluster configuration.

To initialize the shell environment for the tutorial, the Anthos Sample Deployment provides a script that does the following:

  • Installs any missing command-line tools for interactively working with and verifying changes to the deployment.

  • Sets the Kubernetes context for anthos-sample-cluster1.

  • Clones the repository that Anthos Config Management uses for synchronizing your configuration changes to your cluster. Changes that you commit and push to the upstream repository are synchronized to your infrastructure by Anthos Config Management. This is the recommended best practice for applying changes to your infrastructure.

To set up your environment:

  1. Ensure that you have an active Cloud Shell session. You can launch Cloud Shell by clicking Activate Cloud Shell in the Cloud Console in your tutorial project.

  2. Create a directory to work in:

    mkdir tutorial
    cd tutorial
    
  3. Download the initialization script:

    curl -sLO https://github.com/GoogleCloudPlatform/anthos-sample-deployment/releases/latest/download/init-anthos-sample-deployment.env
    
  4. Source the initialization script into your Cloud Shell environment:

    source init-anthos-sample-deployment.env
    

    Output:

    /google/google-cloud-sdk/bin/gcloud
    /google/google-cloud-sdk/bin/kubectl
    Your active configuration is: [cloudshell-13605]
    export PROJECT as anthos-launch-demo-1
    export KUBECONFIG as ~/.kube/anthos-launch-demo-1.anthos-trial-gcp.config
    Fetching cluster endpoint and auth data.
    kubeconfig entry generated for anthos-sample-cluster1.
    Copying gs://config-management-release/released/latest/linux_amd64/nomos...
    \ [1 files][ 40.9 MiB/ 40.9 MiB]
    Operation completed over 1 objects/40.9 MiB.
    Installed nomos into ~/bin.
    Cloned ACM config repo: ./anthos-sample-deployment-config-repo
    
  5. Change the directory to the configuration repository and use it as the working directory for the remainder of this tutorial:

    cd anthos-sample-deployment-config-repo
    

Enforcing mTLS in your service mesh

In anticipation of global expansion, your CIO has mandated that all user data must be encrypted in transit, both to safeguard sensitive information and to comply with regional data privacy and encryption laws.

So is all your traffic currently secure?

  1. Go to the Anthos Service Mesh page in your project where you have the Anthos Sample Deployment deployed:

    Go to the Anthos Service Mesh page

  2. Click transactionhistory in the services list. As you saw in Explore Anthos, the service details page shows all the telemetry available for this service.

  3. On the transactionhistory page, on the Navigation menu, select Connected Services. Here you can see both the Inbound and Outbound connections for the service. An unlocked lock icon indicates that some traffic has been observed on this port that is not using mutual TLS (mTLS).

mTLS is a security protocol that ensures that traffic is secure and trusted in both directions between two services. Each service accepts only encrypted traffic from authenticated services. As you can see, Anthos Service Mesh clearly shows that you have unencrypted traffic in your mesh. Anthos Service Mesh uses color to indicate whether the traffic is a mix of plaintext and mTLS (orange) or plaintext only (red).

With Anthos, you're only a few steps away from being in compliance. Rather than make changes at the source code level and rebuild and redeploy your application to address this situation, you can apply the new encryption policy declaratively through configuration by using Anthos Config Management to automatically deploy your new configuration from a central Git repository.

In this section, you'll do the following:

  1. Adjust the policy configuration in your Git repository to enforce that services use encrypted communications through mTLS.

  2. Rely on Anthos Config Management to automatically pick up the policy change from the repository and adjust the Anthos Service Mesh policy.

  3. Verify that the policy change occurred on your cluster that is configured to sync with the repository.

Confirm Anthos Config Management setup

  1. nomos is a command-line tool that lets you interact with the Config Management Operator and perform other useful Anthos Config Management tasks from your local machine or Cloud Shell. To verify that Anthos Config Management is properly installed and configured on your cluster, run nomos status:

    nomos status
    

    Output:

    Connecting to clusters...
    Current   Context                  Sync Status  Last Synced Token   Sync Branch   Resource Status
    -------   -------                  -----------  -----------------   -----------   ---------------
    *         anthos-sample-cluster1   SYNCED       abef0b01            master        Healthy
    

    The output confirms that Anthos Config Management is configured to sync your cluster to the master branch of your configuration repository. The asterisk in the first column indicates that the current context is set to anthos-sample-cluster1. If you don't see this, switch the current context to anthos-sample-cluster1:

    kubectl config use-context anthos-sample-cluster1
    

    Output:

    Switched to context "anthos-sample-cluster1".
    
  2. Ensure that you're on the master branch:

    git checkout master
    

    Output:

    Already on 'master'
    Your branch is up to date with 'origin/master'.
    
  3. Verify your upstream configuration repository:

    git remote -v
    

    Output:

    origin  https://source.developers.google.com/.../anthos-sample-deployment-config-repo (fetch)
    origin  https://source.developers.google.com/.../anthos-sample-deployment-config-repo (push)
    

You are now ready to commit policy changes to your repository. When you push these commits to your upstream repository (origin), Anthos Config Management ensures that these changes are applied to the cluster that you have configured it to manage.

Update a policy to encrypt all service traffic

Configuration for Anthos Service Mesh is specified declaratively by using YAML files. To encrypt all service traffic, you need to modify both the YAML that specifies the types of traffic that services can accept, and the YAML that specifies the type of traffic that services send to particular destinations.

  1. The first YAML file that you need to look at is namespaces/istio-system/peer-authentication.yaml, which is a mesh-level authentication policy that specifies the types of traffic that all services in your mesh accept by default.

    cat namespaces/istio-system/peer-authentication.yaml
    

    Output:

    apiVersion: "security.istio.io/v1beta1"
    kind: "PeerAuthentication"
    metadata:
      name: "default"
      namespace: "istio-system"
    spec:
      mtls:
        mode: PERMISSIVE
    

    As you can see, the PeerAuthentication mTLS mode is PERMISSIVE, which means that services accept both plaintext HTTP and mTLS traffic.

  2. Modify namespaces/istio-system/peer-authentication.yaml to allow only encrypted communication between services by setting the mTLS mode to STRICT:

    cat <<EOF> namespaces/istio-system/peer-authentication.yaml
    apiVersion: "security.istio.io/v1beta1"
    kind: "PeerAuthentication"
    metadata:
      name: "default"
      namespace: "istio-system"
    spec:
      mtls:
        mode: STRICT
    EOF
    
  3. Next, look at the destination rule in namespaces/istio-system/destination-rule.yaml. A destination rule specifies how traffic is sent to the specified destinations, including whether the traffic is encrypted. Notice that the TLS mode is DISABLE, meaning that traffic is sent in plaintext to all matching hosts.

    cat namespaces/istio-system/destination-rule.yaml
    

    Output:

    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      annotations:
        meshsecurityinsights.googleapis.com/generated: "1561996419000000000"
      name: default
      namespace: istio-system
    spec:
      host: '*.local'
      trafficPolicy:
        tls:
          mode: DISABLE
    
  4. Modify namespaces/istio-system/destination-rule.yaml to have Istio set a traffic policy that enables TLS for all matching hosts in the cluster by setting the TLS mode to ISTIO_MUTUAL:

    cat <<EOF> namespaces/istio-system/destination-rule.yaml
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      annotations:
        meshsecurityinsights.googleapis.com/generated: "1561996419000000000"
      name: default
      namespace: istio-system
    spec:
      host: '*.local'
      trafficPolicy:
        tls:
          mode: ISTIO_MUTUAL
    EOF
    
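Before moving on, you can optionally review both local edits at once. Because the configuration repository is an ordinary Git clone, a plain git diff shows exactly what Anthos Config Management will apply once you push:

```shell
# Review the local edits before committing. You should see two changes:
# mode: PERMISSIVE -> STRICT in peer-authentication.yaml, and
# mode: DISABLE -> ISTIO_MUTUAL in destination-rule.yaml.
git diff
```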

Push your changes to the repository

You are almost ready to push your configuration changes; however, we recommend a few checks before you commit your updates.

  1. Run nomos vet to ensure that your configuration is valid:

    nomos vet
    

    No output indicates that there were no validation errors.

  2. As soon as you push your changes, Anthos Config Management picks them up and applies them to your system. To avoid unexpected results, we recommend checking that the current live state of your configuration hasn't changed since you made your edits. Use kubectl to check that the destinationrule reflects that mTLS is disabled for the cluster:

    kubectl get destinationrule default -n istio-system -o yaml
    

    Output:

    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    ...
    spec:
      host: '*.local'
      trafficPolicy:
        tls:
          mode: DISABLE
    
  3. Now commit and push these changes to the upstream repository. The following command uses a helper function called watchmtls that was sourced into your environment by the init script. This helper function runs a combination of nomos status and the kubectl command that you tried earlier. It watches the cluster for changes until you press Ctrl+C to quit. Monitor the display until you see that the changes are applied and synchronized on the cluster.

    git commit -am "enable mtls"
    git push origin master && watchmtls
    

    You can also see the changes reflected on the Anthos Service Mesh pages in Anthos. If you return to the Connected Services page for transactionhistory (or any other service), you should see that the red unlocked lock icon has changed. The lock icon appears orange (mixed traffic) rather than green (entirely encrypted traffic) because we're looking by default at the last hour with a mix of mTLS and plaintext. If you check back after an hour, you should see a green lock that shows that you have successfully encrypted all the service traffic.
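You can also verify the new modes directly from the command line. The following sketch assumes the sync has completed; it reads back just the mode fields from the two live resources you changed:

```shell
# Confirm the live mesh policy now requires mTLS.
# Expected value after the sync: STRICT
kubectl get peerauthentication default -n istio-system \
  -o jsonpath='{.spec.mtls.mode}'

# Confirm the live destination rule now sends encrypted traffic.
# Expected value after the sync: ISTIO_MUTUAL
kubectl get destinationrule default -n istio-system \
  -o jsonpath='{.spec.trafficPolicy.tls.mode}'
```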

Using Policy Controller to set up guardrails

Your security team is concerned about potential root attacks that might occur when running pods with privileged containers (containers with root access). While the current configuration does not deploy any privileged containers, you want to guard against as many threat vectors as possible that could compromise performance or, even worse, customer data.

Despite the team's diligence, there is still a risk that you could find yourself vulnerable to root attacks unintentionally from future configuration updates through your continuous delivery process. You decide to set up a security guardrail to protect against this danger.

Apply guardrails

Guardrails are automated administrative controls intended to enforce policies that protect your environment. Anthos Config Management includes support for defining and enforcing custom rules not covered by native Kubernetes objects. The Anthos Config Management Policy Controller checks, audits, and enforces guardrails that you apply that correspond to your organization's unique security, regulatory compliance, and governance requirements.

Use Policy Controller

Anthos Config Management Policy Controller is built on an open source policy engine called Gatekeeper that is used to enforce policies each time a resource in the cluster is created, updated, or deleted. These policies are defined by using constraints from the Policy Controller template library or from other Gatekeeper constraint templates.

The Anthos Sample Deployment on Google Cloud already has Policy Controller installed and also has the Policy Controller template library enabled. You can take advantage of this when implementing your guardrail by using an existing constraint for privileged containers from the library.
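If you're curious which constraint templates the library makes available on your cluster, you can list them with a standard Gatekeeper query. The exact template names returned may vary with the library version:

```shell
# List all installed constraint templates, then check that the
# privileged-container template used in the next step is among them.
kubectl get constrainttemplates
kubectl get constrainttemplates | grep -i privileged
```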

Apply a policy constraint for privileged containers

To address your security team's concerns, you apply the K8sPSPPrivilegedContainer constraint. This constraint prevents pods with privileged containers from running.

  1. Using the Cloud Shell terminal, create a new constraint.yaml file with the text from the library constraint, as follows:

    cat <<EOF> ~/tutorial/anthos-sample-deployment-config-repo/cluster/constraint.yaml
    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sPSPPrivilegedContainer
    metadata:
      name: psp-privileged-container
    spec:
      match:
        kinds:
          - apiGroups: [""]
            kinds: ["Pod"]
        excludedNamespaces: ["kube-system"]
    EOF
    
  2. Use nomos vet to verify that the updated configuration is valid before you apply it.

    nomos vet
    

    The command returns silently as long as there are no errors.

  3. Commit and push the changes to apply the policy. You can use nomos status with the watch command to confirm that the changes are applied to your cluster. Press Ctrl+C to exit the watch command when finished.

    git add .
    git commit -m "add policy constraint for privileged containers"
    git push && watch nomos status
    

    Output:

    Connecting to clusters...
    Current   Context                  Sync Status  Last Synced Token   Sync Branch   Resource Status
    -------   -------                  -----------  -----------------   -----------   ---------------
    *         anthos-sample-cluster1   SYNCED       f2898e92            master        Healthy
    

Test your policy

After you've applied the policy, you can test it by attempting to run a pod with a privileged container.

  1. In the Cloud Shell terminal, use the following command to create a new file in the tutorial directory, nginx-privileged.yaml, with the contents from this example spec:

    cat <<EOF> ~/tutorial/nginx-privileged.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-privileged
      labels:
        app: nginx-privileged
    spec:
      containers:
      - name: nginx
        image: nginx
        securityContext:
          privileged: true
    EOF
    
  2. Attempt to launch the pod with kubectl apply.

    kubectl apply -f ~/tutorial/nginx-privileged.yaml
    

    Output:

    Error from server ([denied by psp-privileged-container] Privileged container is not allowed: nginx, securityContext: {"privileged": true}): error when creating "~/nginx-privileged.yaml": admission webhook "validation.gatekeeper.sh" denied the request: [denied by psp-privileged-container] Privileged container is not allowed: nginx, securityContext: {"privileged": true}
    

    The error shows that the Gatekeeper admission controller monitoring your Kubernetes environment enforced your new policy. It prevented the pod's execution due to the presence of a privileged container in the pod's specification.
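As a sanity check, you can also confirm that the constraint blocks only privileged pods and leaves ordinary workloads unaffected. This is an optional sketch; the pod name below is just an example:

```shell
# An unprivileged pod is not matched by the constraint's deny logic
# and should be admitted normally.
kubectl run nginx-unprivileged --image=nginx
kubectl get pod nginx-unprivileged

# Clean up the test pod when you're done.
kubectl delete pod nginx-unprivileged
```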

The concept of version-controlled policies that you can apply to set up guardrails with Anthos Config Management is a powerful one because it standardizes, unifies, and centralizes the governance of your clusters, enforcing your policies through active monitoring of your environment post-deployment.

You can find many other types of policies to use as guardrails for your environment in the Gatekeeper repository.

Exploring the deployment further

While this tutorial has shown you how to work with some Anthos security features, there's still lots more to see and do in Anthos with our deployment. Feel free to try another tutorial or continue to explore the Anthos Sample Deployment on Google Cloud yourself, before following the cleanup instructions in the next section.

Cleaning up

After you've finished exploring the Anthos Sample Deployment, you can clean up the resources that you created on Google Cloud so they don't take up quota and you aren't billed for them in the future. The following sections describe how to delete or turn off these resources.

  • Option 1. You can delete the project. This is the recommended approach. However, if you want to keep the project around, you can use Option 2 to delete the deployment.

  • Option 2. (Experimental) If you're working within an existing but empty project, you may prefer to manually revert all the steps from this tutorial, starting with deleting the deployment.

  • Option 3. (Experimental) If you're an expert on Google Cloud or have existing resources in your cluster, you may prefer to manually clean up the resources that you created in this tutorial.

Delete the project (option 1)

  1. In the Cloud Console, go to the Manage resources page.

    Go to the Manage resources page

  2. In the project list, select the project that you want to delete and then click Delete.
  3. In the dialog, type the project ID and then click Shut down to delete the project.
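If you prefer the command line, the same cleanup can be done with gcloud. Replace PROJECT_ID below with your tutorial project's ID:

```shell
# Shut down the project and schedule all of its resources for deletion.
# gcloud prompts for confirmation before proceeding.
gcloud projects delete PROJECT_ID
```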

Delete the deployment (option 2)

This approach relies on allowing Deployment Manager to undo what it created. Even if the deployment had errors, you can use this approach to undo it.

  1. In the Cloud Console, on the Navigation menu, click Deployment Manager.

  2. Select your deployment, and then click Delete.

  3. Confirm by clicking Delete again.

  4. Even if the deployment had errors, you can still select and delete it.

  5. If clicking Delete doesn't work, as a last resort you can try Delete but preserve resources. If Deployment Manager is unable to delete any resources, you need to note these resources and attempt to delete them manually later.

  6. Wait for Deployment Manager to finish the deletion.

  7. (Temporary step) On the Navigation menu, click Network services > Load balancing, and then delete the forwarding rules created by the anthos-sample-cluster1 cluster.

  8. (Optional) Go to https://source.cloud.google.com/<project_id>. Delete the repository whose name includes config-repo if there is one.

  9. (Optional) Delete the Service Account that you created during the deployment and all of its IAM roles.

Perform a manual cleanup (option 3)

This approach relies on manually deleting the resources from the Google Cloud Console.

  1. In the Cloud Console, on the Navigation menu, click Kubernetes Engine.

  2. Select your cluster and click Delete, and then click Delete again to confirm.

  3. In the Cloud Console, on the Navigation menu, click Compute Engine.

  4. Select the jump server and click Delete, and then click Delete again to confirm.

  5. Follow Steps 7 and 8 of Option 2.

If you plan to redeploy after the manual cleanup, verify that all requirements are met as described in the Before you begin section.

What's next

There's lots more to explore in our Anthos documentation.

Try more tutorials

Learn more about Anthos

Take our survey

When you finish working on this tutorial, please complete our survey. We're interested in hearing about any issues you might have at any point in the tutorial. Thanks for using the survey to submit your feedback.

Thank you!

The Anthos Team