Exploring container security: Navigate the security seas with ease in GKE v1.15
Product Manager, Container Security
Security Engineer, GKE Security
Your container fleet, like a flotilla, needs ongoing maintenance and attention to stay afloat, and stay secure. In the olden days of seafaring, you grounded your ship at high tide and turned it on its side to clean and repair the hull, essentially taking it “offline.” We know that isn’t practical for your container environment, however, as uptime is as important as security for most applications.
Here on the Google Kubernetes Engine (GKE) team, we’re always hard at work behind the scenes to provide you with the latest security patches and features, so you can keep your fleet safe while retaining control and anticipating disruptions.
As GKE moved from v1.12 to v1.15 over the past year, we made a number of security changes to the platform. Here’s an overview of those changes, covering both behind-the-scenes hardening and stronger defaults, as well as the advice we added to the GKE hardening guide.
Behind-the-scenes hardening in GKE
A lot of our security recommendations come down to a simple principle: implement and expose fewer items in your infrastructure, so there’s less for you to secure, maintain, and patch. In GKE, this means paring down controls to only what your application actually needs and removing older implementations or defaults. Let’s take a deeper look at the changes we made this year.
Behind the scenes, we’re continually hardening and improving GKE. A major undertaking in the past several months has been rebasing GKE master and daemonset containers on top of distroless base images. Distroless images are limited to only the application and its runtime dependencies. They’re not a full Linux distribution, so there are no shells or package managers. And because these images are smaller, they’re faster to load and have a smaller attack surface. Moving almost all Kubernetes components to distroless images in Kubernetes 1.15 and 1.16 improves the signal-to-noise ratio in vulnerability scanning, since there are fewer unrelated packages to flag, and makes it simpler to maintain Kubernetes components. By the way, you should also consider moving your container application images to distroless images!
Locking down system:unauthenticated access to clusters
Kubernetes grants certain cluster roles access to cluster information by default, for example, to gather metrics about cluster performance. Previously, this allowed unauthenticated users (who could be from anywhere on the public internet!) to read some unintended information if they could reach the cluster API server. We worked in open source to change this in Kubernetes 1.14, introducing a new discovery role, system:public-info-viewer, explicitly meant for unauthenticated users. We also removed system:unauthenticated access to other API server information.
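If you want to check an older cluster for leftover anonymous access, one approach is to scan its ClusterRoleBindings. Here is a minimal sketch in Python, assuming you have exported the bindings with `kubectl get clusterrolebindings -o json`. The function name and the allow-list are our own illustration and not part of any GKE tooling.

```python
import json

# Subjects that represent callers who have not authenticated.
ANONYMOUS_SUBJECTS = {"system:unauthenticated", "system:anonymous"}

# Assumption: the only role anonymous callers should hold is the
# discovery role introduced in Kubernetes 1.14.
ALLOWED_ROLES = {"system:public-info-viewer"}


def find_anonymous_bindings(binding_list):
    """Return (binding name, role name) pairs that grant anonymous
    subjects a role outside ALLOWED_ROLES."""
    findings = []
    for binding in binding_list.get("items", []):
        role = (binding.get("roleRef") or {}).get("name", "")
        subjects = binding.get("subjects") or []
        names = {subject.get("name") for subject in subjects}
        if names & ANONYMOUS_SUBJECTS and role not in ALLOWED_ROLES:
            findings.append((binding["metadata"]["name"], role))
    return findings


if __name__ == "__main__":
    # Hypothetical export, trimmed to the fields the check reads.
    exported = json.loads("""
    {"items": [
      {"metadata": {"name": "old-discovery"},
       "roleRef": {"name": "system:discovery"},
       "subjects": [{"kind": "Group", "name": "system:unauthenticated"}]}
    ]}
    """)
    for name, role in find_anonymous_bindings(exported):
        print(f"{name}: anonymous subjects are bound to {role}")
```

Any binding this reports is worth a closer look before you remove it, since some workloads may depend on it.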
Ongoing patching and vulnerability response
Our security experts are part of the Kubernetes Product Security Committee, and help manage, develop patches for, and address newly discovered Kubernetes vulnerabilities. On GKE, in addition to Kubernetes vulnerabilities, we handle other security patches and, when appropriate, publish a security bulletin detailing the changes. In the past year, these included critical patches to the Linux kernel, runc, and the Go programming language.
Better defaults in GKE
Among the more visible changes, we’ve also changed the defaults for new clusters in GKE to more secure options, to allow newer clusters to more easily adopt these best practices. In the past several releases, this has included enabling node auto-upgrade by default, removing the Kubernetes dashboard add-on, removing basic authentication and client certs, and removing access to legacy node metadata endpoints. These changes apply to any new GKE clusters you create, and you can still opt to use another option if you prefer.
Defaults for new clusters in GKE have been improving over releases in the past several years, to improve security
Enabling node auto-upgrade
Keeping your version of Kubernetes up to date is one of the simplest things you can do to improve your security. According to the shared responsibility model, we patch and upgrade GKE masters for you, but upgrading the nodes remains your responsibility. Node auto-upgrade automatically provides security patches, bug fixes, and other upgrades to your node pools, and keeps them aligned with your master version to avoid unsupported version skew. As of November, node auto-upgrade is enabled by default for new clusters. Nothing has changed for pre-existing clusters, though, so please consider enabling node auto-upgrade manually, or upgrading regularly yourself and watching the Security Bulletins for information on recommended security patches. With release channels, you can subscribe your cluster to a channel that meets your business needs and infrastructure requirements. Release channels take care of both the masters and nodes, and ensure your cluster is up to date with the latest patch version available in the chosen channel.
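To see which pre-existing node pools still need attention, you can inspect the cluster description. The sketch below assumes you have parsed the JSON output of `gcloud container clusters describe` (with `--format=json`) into a dictionary, and that node pools report the setting under `management.autoUpgrade`. The helper name is hypothetical, and you should confirm the field names against your own output.

```python
def pools_without_auto_upgrade(cluster):
    """List the names of node pools that do not have auto-upgrade enabled.

    `cluster` is a dict parsed from the cluster description JSON.
    A missing `management` block or `autoUpgrade` field is treated
    as "not enabled".
    """
    names = []
    for pool in cluster.get("nodePools", []):
        management = pool.get("management") or {}
        if not management.get("autoUpgrade", False):
            names.append(pool.get("name", "<unnamed>"))
    return names


if __name__ == "__main__":
    # Hypothetical cluster with one upgraded pool and one older pool.
    example = {
        "nodePools": [
            {"name": "default-pool", "management": {"autoUpgrade": True}},
            {"name": "legacy-pool", "management": {}},
        ]
    }
    print(pools_without_auto_upgrade(example))
```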
Locking down the Kubernetes Dashboard
The open-source Kubernetes web UI (Dashboard) is an add-on which provides a web-based interface to interact with your Kubernetes deployment, including information on the state of your clusters and errors that may have occurred. Unfortunately, it is sometimes left publicly accessible or granted sensitive credentials, making it susceptible to attack. Since the Google Cloud Console provides much of the same functionality for GKE, we’ve further locked down the Dashboard to better protect your clusters. For new clusters created with:
- GKE v1.7, the Dashboard does not have admin access by default.
- GKE v1.10, the Dashboard is disabled by default.
- GKE v1.15 and higher, the Kubernetes web UI add-on Dashboard is no longer available in new GKE clusters.
You can still run the dashboard if you wish, following the Kubernetes web UI documentation to install it yourself.
Disabling basic authentication and client certificates
There are several methods of authenticating to the Kubernetes API server. In GKE, the supported methods are OAuth tokens, x509 client certificates, and static passwords (basic authentication). GKE manages authentication via gcloud for you using the OAuth token method, setting up the Kubernetes configuration, getting an access token, and keeping it up to date. Enabling additional authentication methods that your application doesn’t use only creates a larger attack surface. Starting in GKE v1.12, we disabled basic authentication and legacy client certificates by default for new clusters, so that these credentials are not created for your cluster. For older clusters, make sure to remove the static password if you aren’t using it.
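For an older cluster, you can check whether these legacy credentials still exist by looking at the `masterAuth` block of the cluster description. This is a rough sketch, assuming the description has been parsed from `gcloud container clusters describe` JSON output, and that a non-empty `username` indicates basic authentication while a non-empty `clientCertificate` indicates an issued client certificate. The function name is our own.

```python
def legacy_credentials(cluster):
    """Return descriptions of legacy credentials present on a cluster.

    `cluster` is a dict parsed from the cluster description JSON.
    The cluster CA certificate is always present and is not a finding.
    """
    auth = cluster.get("masterAuth") or {}
    findings = []
    if auth.get("username"):
        findings.append("basic authentication (static password)")
    if auth.get("clientCertificate"):
        findings.append("legacy client certificate")
    return findings


if __name__ == "__main__":
    # Hypothetical older cluster that still has a static password.
    older_cluster = {
        "masterAuth": {
            "username": "admin",
            "password": "REDACTED",
            "clusterCaCertificate": "REDACTED",
        }
    }
    print(legacy_credentials(older_cluster))
```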
Disabling metadata server endpoints
Some attacks against Kubernetes use access to the VM’s metadata server to extract the node’s credentials; this can be particularly true for legacy metadata server endpoints. For new clusters starting with GKE v1.12, we disabled these endpoints by default. Note that Compute Engine is in the process of turning down these legacy endpoints. If you haven’t already, you may use the check-legacy-endpoint-access tool to help discover whether your apps should be updated and migrated to the GA v1 metadata endpoints, which include an added layer of security that can help customers protect against vulnerabilities.
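The added layer of security in the v1 endpoints is a required request header, `Metadata-Flavor: Google`, which the legacy v0.1 and v1beta1 endpoints do not enforce. As an illustration of what a migrated call looks like, here is a small sketch using only the Python standard library. The helper names are hypothetical, and the request is built but not sent, so it can be run anywhere.

```python
import urllib.request
from urllib.parse import urlparse

METADATA_HOST = "http://metadata.google.internal"

# Path prefixes of the legacy endpoints that are being turned down.
LEGACY_PREFIXES = ("/0.1/", "/computeMetadata/v1beta1/")


def is_legacy_metadata_url(url):
    """Return True if the URL targets a legacy metadata endpoint."""
    return urlparse(url).path.startswith(LEGACY_PREFIXES)


def v1_metadata_request(path):
    """Build a request for the v1 metadata endpoint.

    The v1 endpoint rejects requests that lack the Metadata-Flavor header.
    """
    url = f"{METADATA_HOST}/computeMetadata/v1/{path.lstrip('/')}"
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


if __name__ == "__main__":
    old_url = f"{METADATA_HOST}/computeMetadata/v1beta1/instance/hostname"
    print(is_legacy_metadata_url(old_url))
    print(v1_metadata_request("instance/hostname").full_url)
```

On a Compute Engine VM or GKE node, passing the built request to `urllib.request.urlopen` would perform the actual call.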
Our latest and greatest hardening guide
Even though we keep making more and more of our security recommendations the default in GKE, they primarily apply to new clusters. This means that even if you’ve been continuously updating an older cluster, you’re not necessarily benefitting from these best practices. To lock down your workloads as well as possible, make sure to follow the GKE hardening guide. We’ve recently updated it with the latest features and made it more practical, with recommendations for new clusters as well as for GKE On-Prem.
It’s worth highlighting some of the newer recommendations in the hardening guide for Workload Identity and Shielded GKE Nodes.
Workload Identity
Workload Identity is a new way to manage credentials for workloads you run in Kubernetes, automating best practices for workload authentication, and removing the need for service account private keys or node credential workarounds. We recommend you use Workload Identity over other options, as it replaces the need to use metadata concealment, and protects sensitive node metadata.
Shielded GKE Nodes
Shielded GKE Nodes is built upon Shielded VMs and further protects node metadata, providing strong, verifiable node identity and integrity for all the GKE nodes in your cluster. If you’re not using third-party kernel modules, we also recommend you enable secure boot to verify the validity of components running on your nodes and get enhanced rootkit and bootkit protections.
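To check where an existing cluster stands, you could look for the Shielded GKE Nodes setting on the cluster and the secure boot setting on each node pool. The sketch below assumes a cluster description parsed from `gcloud container clusters describe` JSON output, with the settings reported under `shieldedNodes.enabled` and `config.shieldedInstanceConfig.enableSecureBoot`. Treat the field names as assumptions to verify against your own output; the helper name is hypothetical.

```python
def shielded_node_findings(cluster):
    """Return human-readable findings about Shielded GKE Nodes settings.

    `cluster` is a dict parsed from the cluster description JSON.
    Missing blocks or fields are treated as "not enabled".
    """
    findings = []
    shielded_nodes = cluster.get("shieldedNodes") or {}
    if not shielded_nodes.get("enabled", False):
        findings.append("cluster: Shielded GKE Nodes not enabled")
    for pool in cluster.get("nodePools", []):
        config = pool.get("config") or {}
        shielded = config.get("shieldedInstanceConfig") or {}
        if not shielded.get("enableSecureBoot", False):
            name = pool.get("name", "<unnamed>")
            findings.append(f"{name}: secure boot not enabled")
    return findings


if __name__ == "__main__":
    # Hypothetical cluster: shielded nodes on, secure boot off in one pool.
    example = {
        "shieldedNodes": {"enabled": True},
        "nodePools": [
            {"name": "pool-a",
             "config": {"shieldedInstanceConfig": {"enableSecureBoot": True}}},
            {"name": "pool-b", "config": {}},
        ],
    }
    for finding in shielded_node_findings(example):
        print(finding)
```

A secure boot finding is not necessarily a problem: as noted above, node pools that rely on third-party kernel modules may need to leave it off.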
The most secure GKE yet
We’ve been working hard on hardening, updating defaults, and delivering new security features to help protect your GKE environment. For the latest and greatest guidance on how to bolster the security of your clusters, we’re always updating the GKE hardening guide.