Version 1.4. This version is no longer supported as outlined in the Anthos version support policy. For the latest patches and updates for security vulnerabilities, exposures, and issues impacting Anthos clusters on VMware (GKE on-prem), upgrade to a supported version. You can find the most recent version here.

Hardening your cluster's security

With the speed of development in Kubernetes, there are often new security features for you to use. This page guides you through implementing our current guidance for hardening your GKE on-prem clusters.

This guide prioritizes high-value security mitigations that require your action at cluster creation time. Less critical features, secure-by-default settings, and those that can be enabled after cluster creation are mentioned further down. For a general overview of security topics, review Security.

Encrypting vSphere VMs

GKE on-prem cluster nodes run on virtual machines (VMs) in your vSphere cluster. Follow VMware's vSphere security best-practice guidance for Encrypting VMs.

Upgrade your infrastructure in a timely fashion

Kubernetes frequently introduces new security features and provides security patches. For information on security patches, refer to GKE on-prem security bulletins.

You are responsible for keeping your GKE on-prem clusters up to date. For each release, review the release notes. Plan to update to new patch releases every month and to new minor versions every three months. Learn how to upgrade your clusters.

You are also responsible for upgrading and securing the vSphere infrastructure.

Configure OpenID Connect

If you want to configure user authentication for your clusters, use OpenID Connect (OIDC).

You should also take advantage of OIDC groups when granting access via role-based access control (RBAC). This removes the need to manually update your RBAC configuration as users change roles.
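For example, a ClusterRoleBinding can grant the built-in view role to an OIDC group instead of to individual users. The group name below is a placeholder; substitute a group claim from your identity provider:

```yaml
# Grants read-only access to every member of an OIDC group.
# "dev-team@example.com" is an illustrative group claim value.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dev-team-view
subjects:
- kind: Group
  name: dev-team@example.com
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
```

When a user joins or leaves the group in your identity provider, their cluster access follows automatically, with no RBAC changes needed.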

Use the principle of least privilege for Google Cloud service accounts

GKE on-prem requires four Google Cloud service accounts:

  • A whitelisted service account for accessing the GKE on-prem software. You create this when you purchase Anthos.
  • A register service account to be used by Connect for registering GKE on-prem clusters with Google Cloud.
  • A connect service account to be used by Connect for establishing a connection between GKE on-prem clusters and Google Cloud.
  • A Cloud Logging service account for collecting cluster logs for use by Cloud Logging.

During installation, you bind Identity and Access Management roles to these service accounts. Those roles grant the service accounts specific privileges within your project. You should configure these service accounts using the principle of least privilege: grant only the privileges required to fulfill the service accounts' respective roles.

Use Kubernetes Namespaces and RBAC to restrict access

To give teams least-privilege access to Kubernetes, create Kubernetes Namespaces or environment-specific clusters. Assign cost centers and appropriate labels to each namespace for accountability and chargeback. Only give developers the level of access to their Namespaces that they need to deploy and manage their applications, especially in production.

Map out the tasks that your users need to undertake against the cluster and define the permissions required to complete each task. To grant cluster- and namespace-level permissions, use Kubernetes RBAC.
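As a sketch of this approach, the following Role and RoleBinding give a developer group deploy-level access scoped to a single namespace. The namespace, group, and resource lists are illustrative; tailor them to the tasks you mapped out:

```yaml
# A namespace-scoped Role allowing developers to manage workloads
# in the "team-a" namespace only (example namespace name).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a
  name: app-deployer
rules:
- apiGroups: ["apps"]
  resources: ["deployments", "replicasets"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: [""]
  resources: ["services", "pods", "configmaps"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
---
# Binds the Role to the team's group within the same namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: team-a
  name: app-deployer-binding
subjects:
- kind: Group
  name: team-a-developers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: app-deployer
  apiGroup: rbac.authorization.k8s.io
```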

Beyond the permissions for Google Cloud service accounts used to install GKE on-prem, IAM does not apply to GKE on-prem clusters.

Restrict cluster discovery RBAC permissions

By default, Kubernetes starts clusters with a permissive set of discovery ClusterRoleBindings, which give broad access to information about a cluster's APIs, including those of CustomResourceDefinitions (CRDs). These permissions are reduced in Kubernetes 1.14, which will be available beginning in GKE on-prem version 1.2. If it's necessary to restrict access, consider configuring your on-premises firewall appropriately.

Secret management

To provide an extra layer of protection for sensitive data, such as Kubernetes Secrets stored in etcd, configure a secrets manager that is integrated with GKE on-prem clusters.

If you are running workloads across multiple environments, you might prefer a solution that works for both Google Kubernetes Engine and GKE on-prem. If you choose to use an external secrets manager, such as HashiCorp Vault, you need to set it up before you create your GKE on-prem clusters.

You have several options for secret management.

  • You can use Kubernetes Secrets natively in GKE on-prem. We expect clusters to be using vSphere encryption for VMs as described earlier, which provides basic encryption-at-rest protection for secrets. Secrets are not further encrypted by default. To encrypt these secrets at the application-layer, you can edit the EncryptionConfig and use a key management service plugin.
  • You can use an external secrets manager, such as HashiCorp Vault. You can authenticate to HashiCorp using either a Kubernetes service account or a Google Cloud service account.
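As a sketch of the first option, an EncryptionConfiguration that routes Secrets through a KMS plugin looks like the following. The plugin name and socket path are placeholders that depend on the key management service plugin you deploy:

```yaml
# Example EncryptionConfiguration for application-layer Secret encryption.
# "myKmsPlugin" and the socket path are illustrative values.
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - kms:
      name: myKmsPlugin
      endpoint: unix:///var/run/kmsplugin/socket.sock
      cachesize: 1000
      timeout: 3s
  - identity: {}   # fallback so existing unencrypted Secrets stay readable
```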

Restrict network access to the control plane and nodes

You should limit exposure of your cluster control plane and nodes to the internet. These choices cannot be changed after cluster creation.

By default, GKE on-prem cluster nodes are created using RFC 1918 addresses, and you should not change this. You should implement firewall rules in your on-premises network to restrict access to the control plane.

Use network policies to restrict traffic among Pods

By default, all Services in a GKE on-prem cluster can communicate with each other. You should control Service-to-Service communication as needed for your workloads.

Restricting network access to Services makes it much more difficult for attackers to move laterally within your cluster, and also offers Services some protection against accidental or deliberate denial of service. Two recommended ways to control traffic are:

  1. To control L7 traffic to your applications' endpoints, use Istio. Choose this if you're interested in load balancing, service authorization, throttling, quota, and metrics.
  2. To control L4 traffic between Pods, use Kubernetes network policies. Choose this if you're looking for the basic access control functionality exposed by Kubernetes.

You can enable both Istio and Kubernetes network policy after you create your GKE on-prem clusters. You can use them together if you need to.
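As an example of the second approach, the following NetworkPolicy allows ingress to Pods labeled app: backend only from Pods labeled app: frontend in the same namespace; all other ingress to those Pods is denied. The labels and namespace are illustrative:

```yaml
# Restricts ingress to backend Pods to traffic from frontend Pods only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: team-a
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
```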

Use Anthos Config Management policy controller

Kubernetes admission controllers are plugins that govern and enforce how a Kubernetes cluster is used. You must enable admission controllers to use some of Kubernetes' advanced security features. Admission controllers are an important part of a defense-in-depth approach to hardening your cluster.

The best practice is to use Anthos Config Management's Policy Controller. Policy Controller uses the OPA Constraint Framework to describe and enforce policy as CRDs. The constraints that you want to apply to your cluster are defined by constraint templates, which are deployed in your clusters.
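For example, a constraint based on the commonly used K8sRequiredLabels template can require every Namespace to carry a cost-center label, supporting the chargeback practice described earlier. This assumes the K8sRequiredLabels constraint template is already installed, and the label key is an example:

```yaml
# Rejects any Namespace created without a "cost-center" label.
# Assumes the K8sRequiredLabels constraint template is installed.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: ns-must-have-cost-center
spec:
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Namespace"]
  parameters:
    labels:
    - key: "cost-center"
```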

Monitor your cluster configuration

You should audit your cluster configurations for deviations from your defined settings. To automatically check these configurations, you should use a solution that works with your GKE on-prem clusters, no matter where they are deployed. See Anthos partners.

Leave legacy client authentication methods disabled (default)

There are several methods of authenticating to the Kubernetes API server. OIDC is the recommended authentication mechanism. Basic authentication is disabled by default and should remain disabled. Do not use x509 client certificates for authentication.

Leave Cloud Logging enabled (default)

To reduce operational overhead and to maintain a consolidated view of your logs, implement a logging strategy that is consistent wherever your clusters are deployed. GKE on-prem is integrated with Google Cloud's operations suite by default. You should enable Google Cloud's operations suite during installation by populating the stackdriver specification in the GKE on-prem configuration file.
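As a rough sketch, the stackdriver block in the GKE on-prem configuration file looks like the following. The field names are based on the version 1.4 configuration format and the values are placeholders; verify both against the installation documentation for your release:

```yaml
# Example stackdriver section of the GKE on-prem configuration file.
# All values shown are placeholders.
stackdriver:
  projectID: "my-logging-project"
  clusterLocation: "us-central1"
  enableVPC: false
  serviceAccountKeyPath: "/path/to/logging-sa-key.json"
```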

All GKE on-prem clusters have Kubernetes audit logging enabled by default. Audit logging keeps a chronological record of calls that have been made to the Kubernetes API server. Audit log entries are useful for investigating suspicious API requests, collecting statistics, and creating monitoring alerts for unwanted API calls.

GKE on-prem clusters integrate Kubernetes audit logging with Google Cloud audit logs and Cloud Logging. GKE on-prem can also export from Google Cloud's operations suite to your own logging system.

Google Cloud's operations suite collects and aggregates logs from your clusters. By enabling Google Cloud's operations suite, you can get better support from Google. For more information, refer to Logging & monitoring.

Leave Kubernetes Dashboard disabled (default)

The Kubernetes Dashboard is backed by a highly privileged Kubernetes service account, and has been exploited in several high-profile attacks on Kubernetes. The Google Cloud Console is the recommended web interface for GKE on-prem. It has much of the same functionality, supports IAM and Kubernetes RBAC without elevated privileges, and provides Anthos functionality like multi-cluster management.

The Kubernetes Dashboard is not included in GKE on-prem.

Leave attribute-based access control disabled (default)

In Kubernetes, you use RBAC to grant permissions to resources at the cluster and namespace level. RBAC allows you to define roles with rules containing a set of permissions.

By default, attribute-based access control (ABAC) is disabled on GKE on-prem clusters and you should not enable it.

What's next