Securing Kubernetes with GKE and Sysdig Falco
Michael Ducy
Director of Community & Evangelism, Sysdig
Andy Tzou
Strategic Tech Partnerships, Kubernetes
Securing your open-source Kubernetes environment can be a daunting task. Knowing which security elements you can tune and how they affect the overall stack isn’t always straightforward. Fortunately, Google Kubernetes Engine (GKE) and Google Cloud Platform make your job easier by providing options to enhance the security of your Kubernetes clusters. When combined with Sysdig Falco, an open source runtime security project from Sysdig, these options increase your confidence that you’re providing a more secure environment for your development teams.
In this post, we’ll take a look at GKE’s built-in security features, and then at how to secure your environment even further with Falco.
Security features of GKE
A variety of security-related features are available in GKE. The items below are a good starting point as you begin to add additional layers of security to your GKE clusters. You can also follow the latest hardening advice in Hardening your cluster’s security.
Cloud IAM Policies
Google Cloud provides access control mechanisms to specific projects and clusters through Cloud Identity and Access Management (Cloud IAM). Cloud IAM works alongside Kubernetes’ Role Based Access Control (RBAC) to provide a rich set of controls. Cloud IAM and Kubernetes RBAC work like this:
Cloud IAM restricts access to specific projects, as well as specific GKE clusters within those projects.
Within the GKE cluster itself, Kubernetes RBAC is able to restrict access even further. You can restrict access at the cluster level, or even down to the Kubernetes namespace level.
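As a sketch of what namespace-level restriction looks like in practice, the commands below grant a single user read-only access to one namespace. The namespace, role, and user names here are illustrative, not from the original post:

```shell
# Create a namespace-scoped role that can only view common workload
# resources (no create/update/delete verbs).
kubectl create role workload-viewer \
  --namespace=dev-team-a \
  --verb=get,list,watch \
  --resource=pods,services,deployments

# Bind that role to one user account; the account still needs Cloud IAM
# access to the project and cluster before RBAC is even evaluated.
kubectl create rolebinding workload-viewer-binding \
  --namespace=dev-team-a \
  --role=workload-viewer \
  --user=developer@example.com
```

Because Cloud IAM is checked first, this user can reach the cluster only if their Google account also has an appropriate IAM role on the project.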
Credential rotation
Credential rotation and management is a recommended best practice for any system. Credentials leak for many reasons, so regularly regenerating and rotating in new credentials ensures that any credentials left lying around eventually become useless. If someone is using a credential for malicious activity, rotating credentials shuts that actor out of the system and stops the abuse.
Initiating a credential rotation is easy via the `gcloud` command line interface (CLI):
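A sketch of the rotation, using a placeholder cluster name and zone:

```shell
# Begin rotating credentials; this also starts moving the master to a
# new IP address.
gcloud container clusters update CLUSTER_NAME \
  --zone us-central1-a \
  --start-credential-rotation

# Once all clients have picked up the new credentials, complete the
# rotation to invalidate the old credentials and finish the IP move.
gcloud container clusters update CLUSTER_NAME \
  --zone us-central1-a \
  --complete-credential-rotation
```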
Once the credential rotation starts, GKE initiates the process of moving the cluster’s master to a new IP address (IP rotation). New credentials are issued to your application, and to the control plane. Credential rotation also requires developers and administrators to refresh their credentials for tools like `kubectl`. This can be done via the `gcloud` CLI as well.
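Refreshing `kubectl` credentials after a rotation looks like this (cluster name and zone are placeholders):

```shell
# Fetch fresh cluster credentials and update the local kubeconfig entry
# so kubectl talks to the master's new endpoint with the new credentials.
gcloud container clusters get-credentials CLUSTER_NAME --zone us-central1-a
```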
Because rotation also revokes credentials used by tools such as `kubectl`, end users whose access should have been revoked also no longer have access to the cluster.
Automatic node upgrades
Upgrading software — especially for large Kubernetes deployments — can be onerous. Upgrades are often put off or delayed until they become an absolute must. This fear of upgrading often leads to stale, out-of-date software running your critical applications.
GKE already upgrades your masters for you. For your worker nodes, GKE simplifies this by giving you the option to easily roll out upgrades to your clusters. The GKE console notifies you when a new version is available, and upgrades can be rolled out manually from the console or CLI. To enable auto-upgrades for the default node pool, specify the `--enable-autoupgrade` flag when creating the cluster, or turn on auto-upgrades for an existing cluster by updating its node pools.
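Both paths can be sketched as follows; the cluster name, node pool name, and zone are placeholders for your own values:

```shell
# Enable auto-upgrades on the default node pool at cluster creation.
gcloud container clusters create CLUSTER_NAME \
  --zone us-central1-a \
  --enable-autoupgrade

# Or turn auto-upgrades on for a node pool in an existing cluster.
gcloud container node-pools update default-pool \
  --cluster CLUSTER_NAME \
  --zone us-central1-a \
  --enable-autoupgrade
```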
Network policies
By default, all pods and services within your GKE cluster are allowed to communicate with each other. When you start to restrict which pods and services can communicate, network policy becomes essential. While network policy is actually a feature of Kubernetes, GKE makes it simple to get started by integrating Tigera’s Project Calico to implement it.
Network policy allows you to implement rules which prevent certain tiers from communicating with other tiers that aren’t required for the application to work. Often this is referred to as defense in depth: employing multiple layers of security in case one fails. Say for example a front-end pod is compromised. With a network policy, you can prevent that compromised pod from communicating out — either outside the cluster or with other services or pods in your GKE cluster.
To enable network policy on a new cluster, pass the `--enable-network-policy` flag to the `gcloud` command when creating your cluster. To enable network policy on an existing cluster, run the following `gcloud` commands.
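For an existing cluster this is a two-step change (the cluster name is a placeholder; note that enabling enforcement recreates the node pools):

```shell
# Step 1: enable the network policy add-on on the master.
gcloud container clusters update CLUSTER_NAME \
  --update-addons=NetworkPolicy=ENABLED

# Step 2: enable network policy enforcement on the nodes.
gcloud container clusters update CLUSTER_NAME \
  --enable-network-policy
```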
With network policy enabled, you’ll then need to define the actual policies themselves in Kubernetes. Network policy leverages Kubernetes' labeling functionality to determine which pods or services can communicate. If you’ve deployed a Kubernetes service in the past, this will be a familiar concept. For more information on creating network policies, you can refer to the Kubernetes documentation.
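As a minimal sketch of label-based policy, the example below allows only pods labeled `app: frontend` to reach pods labeled `app: backend`; all names and labels here are illustrative:

```shell
# Apply a policy that selects the backend pods and restricts their
# ingress to traffic from frontend pods in the same namespace.
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
EOF
```

Once any policy selects a pod, traffic not explicitly allowed by a policy is denied to that pod.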
Runtime Security with Falco
As stated earlier, a strong defense in depth strategy requires additional layers to catch potential malicious actors in the event that a given layer fails. The open source project Falco provides an additional layer of security that watches the node and container runtime of your GKE cluster to detect abnormal behavior or possible intrusions. Falco instruments the Linux kernel of your cluster’s nodes to create an event stream from the system calls made by containers and the host. Rules are then applied to this event stream to detect abnormal behavior. By watching system calls, Falco can detect abnormal network connections, file modification, processes spawned, and more.
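Because Falco instruments each node’s kernel, it is typically deployed as a DaemonSet so one agent runs per node. One way to do this is with the community Helm chart (shown here as an assumption, not the only install path):

```shell
# Add the falcosecurity community chart repository and install Falco
# as a DaemonSet across the cluster's nodes.
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco
```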
Falco also pulls metadata information from the underlying container runtime (e.g. Docker) and from the Kubernetes API server. This metadata can be used in Falco rules to restrict certain behaviors given the container image, container name, or particular Kubernetes resource such as pod, deployment, service, or namespace.
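A sketch of a custom rule using that Kubernetes metadata: alert whenever a shell is spawned in a container in a hypothetical `production` namespace. This assumes the `spawned_process` and `container` macros from Falco’s bundled rule set, and the local-rules file path used by default Falco installs:

```shell
# Write a local rules file; Falco loads it alongside the default rules.
cat <<'EOF' > /etc/falco/falco_rules.local.yaml
- rule: Shell in production container
  desc: Detect a shell spawned inside a container in the production namespace
  condition: >
    spawned_process and container and
    proc.name in (bash, sh) and
    k8s.ns.name = "production"
  output: >
    Shell spawned in production container
    (user=%user.name pod=%k8s.pod.name image=%container.image.repository)
  priority: WARNING
EOF
```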
When Falco detects an abnormal event, it fires an alert to a destination. Destinations include stdout, a log file, syslog, or running a program (passing the alert as stdin). If you have Stackdriver Logging enabled in your GKE cluster, Falco alerts are collected and stored automatically. Additionally, you can send Falco alerts to Google Cloud Security Command Center (Cloud SCC). You can find more details on how to send Falco alerts to Cloud SCC in this blog post by Sysdig.
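Alert destinations are configured in `falco.yaml`. The fragment below is a sketch that enables file output and pipes each alert into a handler program; the handler path is hypothetical:

```shell
# Append output settings to Falco's configuration. With file_output,
# alerts land in a log file that Stackdriver Logging can collect; with
# program_output, each alert is passed on stdin to the named program.
cat <<'EOF' >> /etc/falco/falco.yaml
file_output:
  enabled: true
  filename: /var/log/falco-events.log

program_output:
  enabled: true
  program: /usr/local/bin/forward-alert.sh
EOF
```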
Conclusion
GKE provides many options to help you maintain a more secure and healthy Kubernetes cluster. With GKE, the burden of mundane and often difficult tasks like credential rotation and upgrades is greatly reduced for the cluster admin. Adding network policy and runtime security with Falco provides additional layers of security that help you implement a more comprehensive security strategy.
Want to learn more about how to secure GKE with tools like Falco and Google Cloud Security Command Center? Join our webinar on October 10th at 10AM PDT.