Tutorial: Secure GKE Enterprise


GKE Enterprise provides a consistent platform for building and delivering secure services, with security features built in at every level that work separately and together to provide defense in depth against security issues. This tutorial introduces you to some of GKE Enterprise's powerful security features using the Anthos Sample Deployment on Google Cloud. The Anthos Sample Deployment deploys a complete hands-on GKE Enterprise environment with a GKE cluster, a service mesh, and a Bank of Anthos application that consists of multiple microservices.

Objectives

In this tutorial, you're introduced to some of GKE Enterprise's security features through the following tasks:

  • Enforce mutual TLS (mTLS) in your service mesh by using Config Sync to ensure end-to-end secure communication.

  • Set up a security guardrail that ensures that pods with privileged containers are not inadvertently deployed by using Policy Controller.

Costs

Deploying the Bank of Anthos application will incur pay-as-you-go charges for GKE Enterprise on Google Cloud as listed on our Pricing page, unless you have already purchased a subscription.

You are also responsible for other Google Cloud costs incurred while running the Bank of Anthos application, such as charges for Compute Engine VMs and load balancers.

To avoid incurring further charges, we recommend cleaning up after you finish the tutorial or finish exploring the deployment.

Before you begin

This tutorial is a follow-up to the Explore Anthos tutorial. Before starting this tutorial, follow the instructions on that page to set up your project and install the Anthos Sample Deployment.

Setting up your Cloud Shell environment

In this tutorial, you will use the Cloud Shell command line and editor to make changes to cluster configuration.

To initialize the shell environment for the tutorial, the Anthos Sample Deployment provides a script that does the following (a rough manual equivalent is sketched after the list):

  • Installs any missing command-line tools (such as nomos) for interactively working with and verifying changes to the deployment.

  • Sets the Kubernetes context to anthos-sample-cluster1.

  • Clones the repository that Config Sync uses for synchronizing your configuration changes to your cluster. Changes that you commit and push to the upstream repository are synchronized to your infrastructure by Config Sync. This is the recommended best practice for applying changes to your infrastructure.
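
If you're curious about what the initialization script does, the following is a minimal sketch of a rough manual equivalent, with placeholder values shown in UPPERCASE. Sourcing the script remains the supported path, because it also defines helper functions such as init_git and watchmtls that you use later in this tutorial.

    # Illustrative sketch only; placeholder values are in UPPERCASE.
    gcloud config set project PROJECT_ID
    gcloud container clusters get-credentials anthos-sample-cluster1 --zone CLUSTER_ZONE
    mkdir -p ~/bin
    gsutil cp gs://config-management-release/released/latest/linux_amd64/nomos ~/bin/nomos
    chmod +x ~/bin/nomos
    git clone CONFIG_REPO_URL anthos-sample-deployment-config-repo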

To set up your environment:

  1. Ensure that you have an active Cloud Shell session. You can launch Cloud Shell by clicking Activate Cloud Shell in the Google Cloud console in your tutorial project.

  2. Create a directory to work in:

    mkdir tutorial
    cd tutorial
    
  3. Download the initialization script:

    curl -sLO https://github.com/GoogleCloudPlatform/anthos-sample-deployment/releases/latest/download/init-anthos-sample-deployment.env
    
  4. Source the initialization script into your Cloud Shell environment:

    source init-anthos-sample-deployment.env
    

    Output:

    /google/google-cloud-sdk/bin/gcloud
    /google/google-cloud-sdk/bin/kubectl
    Your active configuration is: [cloudshell-13605]
    export PROJECT as anthos-launch-demo-1
    export KUBECONFIG as ~/.kube/anthos-launch-demo-1.anthos-trial-gcp.config
    Fetching cluster endpoint and auth data.
    kubeconfig entry generated for anthos-sample-cluster1.
    Copying gs://config-management-release/released/latest/linux_amd64/nomos...
    \ [1 files][ 40.9 MiB/ 40.9 MiB]
    Operation completed over 1 objects/40.9 MiB.
    Installed nomos into ~/bin.
    Cloned ACM config repo: ./anthos-sample-deployment-config-repo
    
  5. Change the directory to the configuration repository and use it as the working directory for the remainder of this tutorial:

    cd anthos-sample-deployment-config-repo
    

Enforcing mTLS in your service mesh

In anticipation of global expansion, your CIO has mandated that all user data must be encrypted in transit to safeguard sensitive information and to comply with regional data privacy and encryption laws.

So is all your traffic currently secure?

  1. Go to the Anthos Service Mesh page in your project where you have the Anthos Sample Deployment deployed:

    Go to the Anthos Service Mesh page

  2. Click transactionhistory in the services list. As you saw in Explore GKE Enterprise, the service details page shows all the telemetry available for this service.

  3. On the transactionhistory page, in the navigation menu, select Connected Services. Here you can see both the Inbound and Outbound connections for the service. An unlocked lock icon next to a port indicates that traffic that is not using mutual TLS (mTLS) has been observed on that port.

mTLS is a security protocol that ensures that traffic is secure and trusted in both directions between two services. Each service accepts only encrypted traffic from authenticated services. As you can see, Anthos Service Mesh clearly shows that you have unencrypted traffic in your mesh. Anthos Service Mesh uses different colors to indicate whether the traffic is a mix of plaintext and mTLS (orange) or plaintext only (red).
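
If you prefer the command line, you can also inspect the mesh-wide authentication and traffic settings directly on the cluster. This optional check simply reads the PeerAuthentication and DestinationRule resources that you modify in the next section (it assumes your context is still set to anthos-sample-cluster1):

    kubectl get peerauthentication -n istio-system -o yaml
    kubectl get destinationrule -n istio-system -o yaml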

With GKE Enterprise, you're only a few steps away from being in compliance. Rather than make changes at the source code level and rebuild and redeploy your application to address this situation, you can apply the new encryption policy declaratively through configuration by using Config Sync to automatically deploy your new configuration from a central Git repository.

In this section, you'll do the following:

  1. Adjust the policy configuration in your Git repository to enforce that services use encrypted communications through mTLS.

  2. Rely on Config Sync to automatically pick up the policy change from the repository and adjust the Anthos Service Mesh policy.

  3. Verify that the policy change occurred on your cluster that is configured to sync with the repository.

Confirm Config Sync setup

  1. The nomos command-line tool lets you interact with the Config Management Operator and perform other useful Config Sync tasks from your local machine or Cloud Shell. To verify that Config Sync is properly installed and configured on your cluster, run nomos status:

    nomos status
    

    Output:

    Connecting to clusters...
    Current   Context                  Sync Status  Last Synced Token   Sync Branch   Resource Status
    -------   -------                  -----------  -----------------   -----------   ---------------
    *         anthos-sample-cluster1   SYNCED       abef0b01            master        Healthy
    

    The output confirms that Config Sync is configured to sync your cluster to the master branch of your configuration repository. The asterisk in the first column indicates that the current context is set to anthos-sample-cluster1. If you don't see this, switch the current context to anthos-sample-cluster1:

    kubectl config use-context anthos-sample-cluster1
    

    Output:

    Switched to context "anthos-sample-cluster1".
    
  2. Ensure that you're on the master branch:

    git checkout master
    

    Output:

    Already on 'master'
    Your branch is up to date with 'origin/master'.
    
  3. Verify your upstream configuration repository:

    git remote -v
    

    Output:

    origin  https://source.developers.google.com/.../anthos-sample-deployment-config-repo (fetch)
    origin  https://source.developers.google.com/.../anthos-sample-deployment-config-repo (push)
    
  4. Ensure that you're still in the anthos-sample-deployment-config-repo directory, and then run the following command to check your Git configuration. This helper function is sourced into your environment by the initialization script; it runs git config commands to check whether user.email and user.name are already set, and if they aren't, it sets repo-level defaults based on the currently active Google Cloud account (a rough manual equivalent is sketched after this step).

    init_git
    

    Output (example):

    Configured local git user.email to user@example.com
    Configured local git user.name to user
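
    For reference, a rough manual equivalent of init_git looks like the following; the values are placeholders, and init_git only sets them if they aren't already configured:

    git config --local user.email "$(gcloud config get-value account)"
    git config --local user.name "YOUR_NAME"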
    

You are now ready to commit policy changes to your repository. When you push these commits to your upstream repository (origin), Config Sync ensures that these changes are applied to the cluster that you have configured it to manage.

Update a policy to encrypt all service traffic

Configuration for Anthos Service Mesh is specified declaratively by using YAML files. To encrypt all service traffic, you need to modify both the YAML that specifies the types of traffic that services can accept, and the YAML that specifies the type of traffic that services send to particular destinations.

  1. The first YAML file that you need to look at is namespaces/istio-system/peer-authentication.yaml, which is a mesh-level authentication policy that specifies the types of traffic that all services in your mesh accept by default.

    cat namespaces/istio-system/peer-authentication.yaml
    

    Output:

    apiVersion: "security.istio.io/v1beta1"
    kind: "PeerAuthentication"
    metadata:
      name: "default"
      namespace: "istio-system"
    spec:
      mtls:
        mode: PERMISSIVE
    

    As you can see, the PeerAuthentication mTLS mode is PERMISSIVE, which means that services accept both plaintext HTTP and mTLS traffic.

  2. Modify namespaces/istio-system/peer-authentication.yaml to allow only encrypted communication between services by setting the mTLS mode to STRICT:

    cat <<EOF> namespaces/istio-system/peer-authentication.yaml
    apiVersion: "security.istio.io/v1beta1"
    kind: "PeerAuthentication"
    metadata:
      name: "default"
      namespace: "istio-system"
    spec:
      mtls:
        mode: STRICT
    EOF
    
  3. Next, look at the DestinationRule in namespaces/istio-system/destination-rule.yaml. This file defines rules for traffic that is sent to the specified destinations, including whether that traffic is encrypted. Notice that the TLS mode is DISABLE, meaning that traffic is sent in plaintext to all matching hosts.

    cat namespaces/istio-system/destination-rule.yaml
    

    Output:

    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      annotations:
        meshsecurityinsights.googleapis.com/generated: "1561996419000000000"
      name: default
      namespace: istio-system
    spec:
      host: '*.local'
      trafficPolicy:
        tls:
          mode: DISABLE
    
  4. Modify namespaces/istio-system/destination-rule.yaml to have Istio set a traffic policy that enables TLS for all matching hosts in the cluster by setting the TLS mode to ISTIO_MUTUAL:

    cat <<EOF> namespaces/istio-system/destination-rule.yaml
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      annotations:
        meshsecurityinsights.googleapis.com/generated: "1561996419000000000"
      name: default
      namespace: istio-system
    spec:
      host: '*.local'
      trafficPolicy:
        tls:
          mode: ISTIO_MUTUAL
    EOF
    

Push your changes to the repository

You are almost ready to push your configuration changes; however, we recommend a few checks before you finally commit your updates.
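
An additional optional check, not part of the scripted helpers, is to review exactly what you changed with git diff. You should see only the two mode fields updated: PERMISSIVE to STRICT and DISABLE to ISTIO_MUTUAL.

    git diff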

  1. Run nomos vet to ensure that your configuration is valid:

    nomos vet
    

    If the command returns no output, your configuration is valid.

  2. As soon as you push your changes, Config Sync picks them up and applies them to your system. To avoid unexpected results, we recommend checking that the current live state of your configuration hasn't changed since you made your edits. Use kubectl to check that the destinationrule reflects that mTLS is disabled for the cluster:

    kubectl get destinationrule default -n istio-system -o yaml
    

    Output:

    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    ...
    spec:
      host: '*.local'
      trafficPolicy:
        tls:
          mode: DISABLE
    
  3. Now commit and push these changes to the upstream repository. The following command uses a helper function called watchmtls that was sourced into your environment by the init script. This helper function runs a combination of nomos status and the kubectl command that you tried earlier. It watches the cluster for changes until you press Ctrl+C to quit. Monitor the display until you see that the changes are applied and synchronized on the cluster.

    git commit -am "enable mtls"
    git push origin master && watchmtls
    

    You can also see the changes reflected on the Anthos Service Mesh pages in GKE Enterprise.

    Go to the Anthos Service Mesh page

    You should see that the red unlocked lock icon has changed. The lock icon appears orange (mixed traffic) rather than green (fully encrypted traffic) because the default view shows the last hour, which includes a mix of mTLS and plaintext traffic. If you check back after an hour, you should see a green lock, showing that you have successfully encrypted all the service traffic. You can also confirm the live policy directly from the command line, as shown below.
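
    For example, a minimal check of the live policies (assuming your context is still anthos-sample-cluster1) is to read them back with kubectl; the spec sections should now show mode: STRICT and mode: ISTIO_MUTUAL, respectively:

    kubectl get peerauthentication default -n istio-system -o yaml
    kubectl get destinationrule default -n istio-system -o yaml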

Using Policy Controller to set up guardrails

Your security team is concerned about potential root attacks that might occur when running pods with privileged containers (containers with root access). While the current configuration does not deploy any privileged containers, you want to guard against as many threat vectors as possible that could compromise performance or, even worse, customer data.

Despite the team's diligence, there is still a risk that you could find yourself vulnerable to root attacks unintentionally from future configuration updates through your continuous delivery process. You decide to set up a security guardrail to protect against this danger.

Apply guardrails

Guardrails are automated administrative controls intended to enforce policies that protect your environment. Policy Controller includes support for defining and enforcing custom rules that are not covered by native Kubernetes objects. Policy Controller checks, audits, and enforces the guardrails that you apply to meet your organization's unique security, regulatory compliance, and governance requirements.

Use Policy Controller

Policy Controller is built on an open source policy engine called Gatekeeper that is used to enforce policies each time a resource in the cluster is created, updated, or deleted. These policies are defined by using constraints from the Policy Controller template library or from other Gatekeeper constraint templates.

The Anthos Sample Deployment on Google Cloud already has Policy Controller installed and also has the Policy Controller template library enabled. You can take advantage of this when implementing your guardrail by using an existing constraint for privileged containers from the library.
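
For example, you can list the constraint templates that the library makes available on your cluster. The exact set depends on the installed library version, but it includes K8sPSPPrivilegedContainer, which you use in the next section:

    kubectl get constrainttemplates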

Apply a policy constraint for privileged containers

To address your security team's concerns, you apply the K8sPSPPrivilegedContainer constraint. This constraint prevents pods from running with privileged containers.

  1. Using the Cloud Shell terminal, create a new constraint.yaml file with the text from the library constraint, as follows:

    cat <<EOF> ~/tutorial/anthos-sample-deployment-config-repo/cluster/constraint.yaml
    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sPSPPrivilegedContainer
    metadata:
      name: psp-privileged-container
    spec:
      match:
        kinds:
          - apiGroups: [""]
            kinds: ["Pod"]
        excludedNamespaces: ["kube-system"]
    EOF
    
  2. Use nomos vet to verify that the updated configuration is valid before you apply it.

    nomos vet
    

    The command returns silently as long as there are no errors.

  3. Commit and push the changes to apply the policy. You can use nomos status with the watch command to confirm that the changes are applied to your cluster. Press Ctrl+C to exit the watch command when you're finished. After the sync completes, you can also verify the constraint directly with kubectl, as shown after the output below.

    git add .
    git commit -m "add policy constraint for privileged containers"
    git push && watch nomos status
    

    Output:

    Connecting to clusters...
    Current   Context                  Sync Status  Last Synced Token   Sync Branch   Resource Status
    -------   -------                  -----------  -----------------   -----------   ---------------
    *         anthos-sample-cluster1   SYNCED       f2898e92            master        Healthy
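
    Once the sync status shows SYNCED, you can optionally confirm that the constraint object exists on the cluster; the resource name below matches the kind and name that you defined in constraint.yaml:

    kubectl get k8spspprivilegedcontainer psp-privileged-container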
    

Test your policy

After you've applied the policy, you can test it by attempting to run a pod with a privileged container.

  1. In the Cloud Shell terminal, use the following command to create a new file in the tutorial directory, nginx-privileged.yaml, with the contents from this example spec:

    cat <<EOF> ~/tutorial/nginx-privileged.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-privileged-disallowed
      labels:
        app: nginx-privileged
    spec:
      containers:
      - name: nginx
        image: nginx
        securityContext:
          privileged: true
    EOF
    
  2. Attempt to launch the pod with kubectl apply.

    kubectl apply -f ~/tutorial/nginx-privileged.yaml
    

    Output:

    Error from server ([denied by psp-privileged-container] Privileged container is not allowed: nginx, securityContext: {"privileged": true}): error when creating "~/nginx-privileged.yaml": admission webhook "validation.gatekeeper.sh" denied the request: [denied by psp-privileged-container] Privileged container is not allowed: nginx, securityContext: {"privileged": true}
    

    The error shows that the Gatekeeper admission controller that monitors your Kubernetes environment enforced your new policy: it prevented the pod from being created because the pod's specification contains a privileged container. To confirm that the guardrail blocks only privileged containers, you can try the same spec without the privileged security context, as shown below.
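
    As an optional check that isn't part of the original steps, you can verify that an otherwise identical pod without a privileged container is still admitted. The file name and pod name below are illustrative:

    cat <<EOF> ~/tutorial/nginx-unprivileged.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-unprivileged-allowed
      labels:
        app: nginx-unprivileged
    spec:
      containers:
      - name: nginx
        image: nginx
    EOF
    kubectl apply -f ~/tutorial/nginx-unprivileged.yaml

    This pod should be created successfully. When you're done, delete it with kubectl delete -f ~/tutorial/nginx-unprivileged.yaml so that it doesn't linger in your cluster.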

Version-controlled policies that you apply as guardrails with Policy Controller are powerful because they standardize, unify, and centralize the governance of your clusters, enforcing your policies by actively monitoring your environment after deployment.

You can find many other types of policies to use as guardrails for your environment in the Gatekeeper repository.

Exploring the deployment further

While this tutorial has shown you how to work with some GKE Enterprise security features, there's still much more to see and do in GKE Enterprise with this deployment. Feel free to try another tutorial or continue to explore the Anthos Sample Deployment on Google Cloud yourself before following the cleanup instructions in the next section.

Clean up

After you've finished exploring the Bank of Anthos application, you can clean up the resources that you created on Google Cloud so they don't take up quota and you aren't billed for them in the future.

  • Option 1. You can delete the project. However, if you want to keep the project around, you can use Option 2 to delete the deployment.

  • Option 2. If you want to keep your current project, you can use terraform destroy to delete the sample application and cluster.

Delete the project (option 1)

The easiest way to avoid billing is to delete the project you created for this tutorial.

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete the deployment (option 2)

This approach deletes the Bank of Anthos application and the cluster, but does not delete the project. Run the following commands in Cloud Shell:

  1. Change to the directory that hosts the installation scripts:

    cd bank-of-anthos/iac/tf-anthos-gke
    
  2. Delete the sample and the cluster:

    terraform destroy
    
  3. Enter the project ID when prompted.

If you plan to redeploy, verify that all requirements are met as described in the Before you begin section.

What's next

There's lots more to explore in our GKE Enterprise documentation.

Try more tutorials

Learn more about GKE Enterprise