Anthos security blueprint: Enforcing policies

This document describes how to enforce security policies on Anthos clusters. It includes an overview of how and why you enforce policies, and it describes the Google Cloud controls that you use for this task.

The document is part of a series of security blueprints that provide prescriptive guidance for working with Anthos.

Introduction

You apply policies to your clusters to ensure that your security guardrails and compliance requirements are met. After you apply policies, you need to ensure that they are enforced and that the settings in your Anthos clusters adhere to the policy configurations that you've specified.

Policy enforcement is complementary to auditing your cluster. Auditing tells you the past and current status of your cluster, but does not prevent any actions that would circumvent your policies. In contrast, policy enforcement is a preventive control. You should apply policies at the cluster and namespace levels to meet your requirements.

You need to consider how you can enforce the following requirements:

  • Limiting resource consumption.
  • Restricting network traffic within the cluster.
  • Restricting the capabilities that a pod can run with.
  • Defining custom policies such as enforcing required labels.

Understanding the security controls you need

This section discusses the controls that you need in order to enforce the policies that help you meet your security and compliance requirements.

Namespaces

Labeling resources that should use the same policies

Namespaces let you provide a scope for related resources within a cluster—for example, pods, services, and replication controllers. By using namespaces, you can delegate administration responsibility for the related resources as a unit. Therefore, namespaces are integral to most security patterns.

Namespaces are an important feature for control plane isolation. However, they don't provide node isolation, data plane isolation, or network isolation.

A common approach is to create namespaces for individual applications. For example, you might create the namespace myapp-frontend for the UI component of an application.
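
For example, the following config is a minimal sketch of such a namespace for the UI component. The labels shown (app and tier) are illustrative assumptions; choose labels that match your own conventions.

    apiVersion: v1
    kind: Namespace
    metadata:
      name: myapp-frontend
      labels:
        app: myapp        # illustrative label identifying the application
        tier: frontend    # illustrative label identifying the component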

Resource quotas

Controlling resource consumption

GKE is designed to support multiple applications running in the same cluster that are managed by multiple teams. If no single team is responsible for managing resource usage in a cluster, it's possible that applications running in the cluster might consume more resources than they should. To help prevent this situation, an administrator must configure resource quotas in order to limit aggregated resource consumption for the resources that are defined within a namespace.
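
For example, the following ResourceQuota config is a minimal sketch that caps aggregate CPU, memory, and pod count for the myapp-frontend namespace. The values shown are illustrative, not recommendations.

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: compute-quota
      namespace: myapp-frontend
    spec:
      hard:
        requests.cpu: "4"       # total CPU that pods in the namespace can request
        requests.memory: 8Gi    # total memory that pods in the namespace can request
        limits.cpu: "8"         # total CPU limit across the namespace
        limits.memory: 16Gi     # total memory limit across the namespace
        pods: "20"              # maximum number of pods in the namespace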

Anthos Config Management

Applying configurations to your Anthos clusters

A best practice when you manage Anthos clusters is to use Anthos Config Management, which keeps your enrolled clusters in sync with configs. A config is a YAML or JSON file that's stored in your repository and that contains the same types of configuration details that you can manually apply to a cluster by using the kubectl apply command. Anthos Config Management lets you manage your policies and infrastructure deployments like you do your apps—by adopting a policy-as-code approach.

You use Anthos Config Management in conjunction with a Git repository that acts as the single source of truth for your declared policies. Anthos Config Management can manage access-control policies like RBAC, resource quotas, namespaces, and platform-level infrastructure deployments. Anthos Config Management is declarative; it continuously checks cluster state and applies the state declared in the config in order to enforce policies.
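
For example, the following RoleBinding config is a sketch of an access-control policy that you might store in the repository so that Anthos Config Management applies it to the myapp-frontend namespace on every enrolled cluster. The group name and the repository path in the comment are illustrative assumptions.

    # Illustrative repo path: namespaces/myapp-frontend/rolebinding.yaml
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: frontend-developers
      namespace: myapp-frontend
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: edit          # built-in ClusterRole that grants read/write access to most namespaced objects
    subjects:
    - apiGroup: rbac.authorization.k8s.io
      kind: Group
      name: frontend-dev@example.com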

Network policies

Enforcing network traffic flow within clusters

Network policies enforce Layer 4 network traffic flows by using pod-level firewall rules. Network policies are scoped to a namespace.

By default, access to pods in a cluster is unrestricted, even if network policy enforcement is enabled for the cluster. Enforcement takes effect when at least one NetworkPolicy object selects a pod; at that point, traffic to and from that pod is restricted to what the policies allow.

A best practice is to adopt a least-privilege approach. When you implement network policies, we recommend that you create a default deny-all rule in each namespace that matches all pods. The namespace then blocks all traffic that you haven't explicitly allowed (that is, it fails closed). To allow traffic to flow, you then explicitly set up network policies for each namespace.
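
The following config is a minimal sketch of such a default deny-all rule for the myapp-frontend namespace. The empty podSelector matches every pod in the namespace, and because no ingress or egress rules are listed, all traffic is denied until you add policies that allow specific flows.

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny-all
      namespace: myapp-frontend
    spec:
      podSelector: {}      # selects all pods in the namespace
      policyTypes:
      - Ingress            # no ingress rules listed, so all inbound traffic is denied
      - Egress             # no egress rules listed, so all outbound traffic is denied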

The following diagram shows that by configuring network policies for each namespace, you can implement policies that manage the traffic flow between the namespaces.

Diagram: Using network policies to manage traffic flow between namespaces.

In the example, traffic is permitted to flow bidirectionally between the namespace transactions and the namespace shopfront. However, traffic is permitted to flow only from the namespace shopfront to the logging app.
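
The following config sketches the second rule: it allows ingress to pods in a logging namespace only from pods in the shopfront namespace. It assumes that the logging app runs in its own namespace named logging and that the shopfront namespace carries the label name: shopfront; adjust both to match your own naming and labeling conventions.

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-from-shopfront
      namespace: logging
    spec:
      podSelector: {}            # applies to all pods in the logging namespace
      policyTypes:
      - Ingress
      ingress:
      - from:
        - namespaceSelector:
            matchLabels:
              name: shopfront    # assumes the shopfront namespace is labeled name=shopfront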

For an example NetworkPolicy config that can be applied using Anthos Config Management, see the "NetworkPolicy config" section in Configuring Kubernetes objects.

Anthos Policy Controller

Enforcing compliance with policies

Anthos Policy Controller is a dynamic admission controller for Kubernetes that enforces CustomResourceDefinition-based (CRD-based) policies that are executed by the Open Policy Agent (OPA).

Admission controllers are Kubernetes plugins that intercept requests to the Kubernetes API server before an object is persisted, but after the request is authenticated and authorized. You can use admission controllers to limit how a cluster is used.

To use Policy Controller, you declare policy logic in a constraint template. After the constraint template has been deployed to the cluster, you create individual constraints, which are custom resources whose type is defined by the constraint template and which specify where the policy is enforced.
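
For example, the following constraint is a sketch that uses the K8sRequiredLabels constraint template from the default template library to require an owner label on every namespace. The constraint name and the owner key are illustrative assumptions.

    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sRequiredLabels
    metadata:
      name: namespaces-must-have-owner
    spec:
      match:
        kinds:
        - apiGroups: [""]
          kinds: ["Namespace"]   # apply the policy to Namespace objects
      parameters:
        labels:
        - key: owner             # every namespace must carry an owner label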

The following diagram shows how Policy Controller uses the OPA Constraint Framework to define and enforce policy.

Diagram: The OPA Constraint Framework receives requests and enforces policies for access to other resources.

The diagram shows the following:

  1. Constraints are created from constraint templates.
  2. Policies are enabled on the cluster by applying constraints.
  3. A request comes in and an admission review is triggered, resulting in an allow or deny decision.
  4. A continuous audit evaluates all active objects on the cluster against the policies.

Using Policy Controller, you can enforce custom policies, such as requiring labels. Policy Controller lets you apply most of the constraints that you can apply by using PodSecurityPolicies, but these constraints typically require less operational overhead for the following reasons:

  • Policy Controller includes a default library of constraint templates, which means that you don't need to write your own policies for common cases as you do with PodSecurityPolicies.
  • You don't have to manage RoleBindings as you do when you use PodSecurityPolicies.
  • Policy Controller supports dry run mode so that you can validate the effect of a constraint before you apply it.
  • You can scope policies to namespaces, which lets you ramp up more restrictive policies gradually. This approach is similar to a canary release strategy: you limit the exposure of a policy rollout that might have unanticipated effects. For example, a rollout might reveal that you've restricted a pod's access to a volume that the pod should be able to use. (A sketch that combines dry run mode with namespace scoping follows this list.)
  • Policy Controller provides a single way to apply policies whether they're custom constraints or they're PodSecurityPolicies constraints that are defined in the Gatekeeper repository.
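
The following sketch combines two of these features: enforcementAction: dryrun records violations in audit results without blocking requests, and the match.namespaces field scopes the constraint to a single namespace while you evaluate its effect. The names are illustrative assumptions.

    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sRequiredLabels
    metadata:
      name: pods-must-have-owner-dryrun
    spec:
      enforcementAction: dryrun   # report violations in audit results; don't deny requests
      match:
        kinds:
        - apiGroups: [""]
          kinds: ["Pod"]
        namespaces:
        - myapp-frontend          # scope the policy to one namespace first
      parameters:
        labels:
        - key: owner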

For more information about how to use Policy Controller to enforce policies that you define, see Anthos Config Management Policy Controller.

Anthos Service Mesh

Managing secure communications between services

Anthos Service Mesh helps you monitor and manage an Istio-based service mesh. A service mesh is an infrastructure layer that enables managed, observable, and secure communication across your services.

Anthos Service Mesh helps simplify the management of secure communications across services in the following ways:

  • Managing authentication and encryption of traffic within the cluster by using mutual Transport Layer Security (mTLS). Anthos Service Mesh manages the provisioning and rotation of mTLS keys and certificates for Anthos workloads without disrupting communications. Regularly rotating mTLS keys is a security best practice that helps reduce exposure in the event of an attack. (A sketch of a mesh-wide mTLS policy follows this list.)
  • Allowing you to configure network security policies based on service identity rather than on the IP address of the peer. Anthos Service Mesh is used to configure identity-aware access control (firewall) policies that let you create policies that are independent of the network location of the workload. This simplifies the process of setting up service-to-service communications.
  • Allowing you to configure policies that permit access from certain clients.
  • Managing user authentication by using Identity-Aware Proxy or a custom policy engine. This helps you control access to the applications that you've deployed on Anthos GKE clusters by verifying user identity and the context of the request to determine whether a user should be allowed access.
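
For example, the following PeerAuthentication policy is a sketch of mesh-wide strict mTLS. It assumes that the mesh root namespace is istio-system, which is the default for most installations; adjust the namespace if your mesh uses a different one.

    apiVersion: security.istio.io/v1beta1
    kind: PeerAuthentication
    metadata:
      name: default
      namespace: istio-system   # the mesh root namespace in a default installation
    spec:
      mtls:
        mode: STRICT            # reject plaintext traffic between workloads in the mesh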

In addition to managing secure communications between services, Anthos Service Mesh helps reduce noise in access logs by logging only successful accesses once for each configurable time window. Requests that are denied by a security policy or that result in an error are always logged. Access logs and metrics are available in Google Cloud's operations suite.

For more information on Anthos Service Mesh security features, see the Anthos Service Mesh security overview.

Bringing it all together

The controls discussed earlier apply to both Anthos GKE and Anthos GKE on-prem.

To integrate the controls, map out the scope of the controls discussed in this guide and the stage at which they need to be configured, as described in the steps that follow.

  1. Create your clusters using the guidance in the applicable cluster hardening guide (GKE or Anthos GKE on-prem). When you create your cluster, be sure that you follow the hardening guide and use the --enable-network-policy flag. Network policy enforcement is required for the controls in this guide; enabling it lets you implement firewall rules later that restrict the traffic that flows between pods in a cluster.
  2. Define the namespaces and labels that are required for the pods. This provides a name scope that allows you to work with policies and with Kubernetes service accounts.
  3. Install Policy Controller using Anthos Config Management.
  4. Apply your network policy by using Anthos Config Management.
  5. Collect the pre-built Policy Controller constraint templates that you want to use and map them to your resources by defining constraints.
  6. Apply the constraint template and the constraints by using Anthos Config Management.

You can apply additional controls that focus at the application layer (Layer 7) to further enforce policies by using Anthos Service Mesh as follows:

  • If you didn't enable Istio policy enforcement when you created your cluster, enable it now.
  • Define the policies that you want to enforce at the application layer. The policies are expressed as YAML; a sketch follows this list.
  • Apply your policies using Anthos Config Management.
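
For example, the following AuthorizationPolicy is a sketch of a Layer 7 policy that allows only the shopfront workload's service account to call workloads in the logging namespace, and only with GET and POST requests. The namespace, service account, and method names are illustrative assumptions.

    apiVersion: security.istio.io/v1beta1
    kind: AuthorizationPolicy
    metadata:
      name: shopfront-to-logging
      namespace: logging
    spec:
      action: ALLOW
      rules:
      - from:
        - source:
            principals: ["cluster.local/ns/shopfront/sa/shopfront"]   # identity of the calling workload
        to:
        - operation:
            methods: ["GET", "POST"]                                  # restrict which HTTP methods are allowed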