Digging into Kubernetes 1.12
The Cloud Native Computing Foundation (CNCF) announced Kubernetes 1.12 yesterday. This is a testament to the hard work of the community and especially the Kubernetes Release Team, which every three months releases a new version of what is now the second-largest open source project, behind only Linux. Just like we’ve done with previous releases, 1.11 and 1.10, we'd like to highlight some of the development that Google Cloud is driving in the Kubernetes open source project.
What's new in 1.12
After scaling up a workload, the Kubernetes Horizontal Pod Autoscaler (HPA) used to wait a fixed amount of time before scaling up again, to avoid making decisions based on samples taken while the pods were initializing (and were thus using more resources). Now, HPA instead disregards CPU samples taken while a pod was initializing. This allows HPA to react more appropriately to situations such as gradually increasing load, resulting in multiple, frequent, small scale-ups (rather than a few infrequent, big scale-ups), making the system much more responsive.
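To make the idea concrete, here is a minimal sketch of how a scale-up recommendation changes when samples from initializing pods are left out. It is not the actual controller code: the `PodSample` type, the `initializing` flag, and the numbers are assumptions made for illustration, and the real HPA applies additional rules (tolerances, missing metrics, readiness) that are omitted here.

```python
import math
from dataclasses import dataclass


@dataclass
class PodSample:
    """One CPU sample for a pod (hypothetical type for this sketch)."""
    name: str
    cpu_utilization: float  # percent of the pod's requested CPU
    initializing: bool      # True while the pod is still starting up


def desired_replicas(samples, current_replicas, target_utilization):
    """Recommend a replica count, ignoring samples from initializing pods."""
    usable = [s.cpu_utilization for s in samples if not s.initializing]
    if not usable:
        # Nothing trustworthy to act on, so keep the current size.
        return current_replicas
    average = sum(usable) / len(usable)
    # Scale proportionally to how far average usage is from the target.
    return math.ceil(current_replicas * average / target_utilization)


samples = [
    PodSample("web-1", 60.0, initializing=False),
    PodSample("web-2", 60.0, initializing=False),
    PodSample("web-3", 150.0, initializing=True),  # startup spike
]
print(desired_replicas(samples, current_replicas=3, target_utilization=50.0))
```

In this example the startup spike from `web-3` is ignored, so the recommendation is based only on the two pods that are actually serving load.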
HPA also used to wait for a fixed amount of time after scaling up before scaling down again, to avoid scaling down in response to a short dip in usage. Now HPA scales down only if all the recommendations in the last few minutes were lower than the current size, and it scales down no further than the highest size recommended during that period. This makes HPA more stable, because a single low sample will not trigger a scale-down.
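The scale-down rule can be sketched as a sliding window over recent recommendations, where the applied size is the highest recommendation still inside the window. This is an illustrative sketch only: the `ScaleDownStabilizer` class and the 300-second window are assumptions for the example, not the controller's actual implementation.

```python
from collections import deque


class ScaleDownStabilizer:
    """Keeps recent recommendations and returns the highest one in the window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp_seconds, recommended_replicas)

    def stabilize(self, now, recommendation):
        """Record a recommendation and return the size that should be applied."""
        self._history.append((now, recommendation))
        # Drop recommendations that have fallen out of the window.
        while self._history[0][0] < now - self.window_seconds:
            self._history.popleft()
        # Never go below the highest recommendation seen in the window.
        return max(replicas for _, replicas in self._history)


stabilizer = ScaleDownStabilizer(window_seconds=300)
for timestamp, recommendation in [(0, 10), (60, 4), (120, 6), (301, 5)]:
    print(timestamp, stabilizer.stabilize(timestamp, recommendation))
```

A brief dip (the recommendation of 4 at 60 seconds) does not shrink the workload, because the earlier recommendation of 10 is still inside the window. Once that recommendation ages out, the size drops only to the highest remaining recommendation.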
Affinity scheduling now 100x faster
Kubernetes inter-Pod affinity and anti-affinity used to be complex and slow—about three orders of magnitude slower than the average run-time of other scheduler predicates. This made Kubernetes affinity scheduling impractical in large clusters with thousands of nodes. We implemented a series of optimizations in 1.12 that resulted in over 100x performance improvement in large clusters. With these improvements in place, you can now use inter-Pod affinity scheduling in clusters of all sizes.
Topology-aware storage provisioning
Developers running stateful workloads in multi-zone and regional clusters with dynamic provisioning of Compute Engine Persistent Disks (PDs) and Regional PDs have traditionally encountered many problems. The biggest issue was that PDs were provisioned in zones without knowledge of Pod CPU or memory requirements, node selectors, and Pod affinity/anti-affinity policies, which could leave the Pod in an unschedulable state.
Kubernetes 1.12 introduces topology-aware dynamic provisioning in beta, which greatly improves the regional cluster experience for stateful workloads. Now, Kubernetes understands the inherent zonal restrictions of PDs and Regional PDs, and provisions them in the zone that is best suited to run the Pod. Kubernetes can now handle autoscaling with PDs automatically, providing a seamless experience for regional clusters.
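As a sketch of what this looks like in practice, the snippet below builds a StorageClass manifest that delays volume binding until a Pod using the claim is scheduled, and prints it as JSON that `kubectl apply -f -` can accept. The class name, the disk type, and the zone names are placeholders chosen for the example, and the optional zone restriction uses the zone label key that was current at the time of this release. Treat it as an illustration rather than a recommended configuration.

```python
import json


def build_storage_class(name, zones=None):
    """Build a StorageClass manifest for topology-aware provisioning of PDs."""
    manifest = {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "kubernetes.io/gce-pd",
        "parameters": {"type": "pd-standard"},
        # Wait until a Pod using the claim is scheduled, so the disk is
        # created in a zone that satisfies the Pod's scheduling constraints.
        "volumeBindingMode": "WaitForFirstConsumer",
    }
    if zones:
        # Optionally restrict provisioning to specific zones.
        manifest["allowedTopologies"] = [{
            "matchLabelExpressions": [{
                "key": "failure-domain.beta.kubernetes.io/zone",
                "values": list(zones),
            }]
        }]
    return manifest


# "topology-aware-pd" and the zone names are placeholders for this example.
storage_class = build_storage_class(
    "topology-aware-pd", zones=["us-central1-a", "us-central1-b"]
)
print(json.dumps(storage_class, indent=2))
```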
Portability remains a cornerstone of Kubernetes design, so while this feature improves the experience for Compute Engine PDs, it was also designed to work with any type of storage topology (racks, zones, regions, or any custom way to represent storage availability). Kubernetes 1.12 also introduces topology support for Container Storage Interface (CSI) plugins in alpha. Together, these tools allow you to define your own topology boundaries, such as racks in on-premises clusters, and provide separate pools of storage within each boundary. As leads of both the volume topology and CSI features, we worked with community members representing numerous Kubernetes providers and storage vendors to ensure that the feature is portable and simple to use across multiple clusters, environments and storage systems.
The Advanced Auditing feature moved to GA, extending the previous logging system to provide much richer information about Kubernetes API requests. We enabled this feature for all Google Kubernetes Engine (GKE) clusters starting with version 1.8.3, allowing GKE users to introspect requests to their cluster via an integration with Stackdriver Cloud Audit Logging. Learn more about audit logging in GKE, which is currently in beta and moving to GA soon.
On the horizon
As we work on building new Kubernetes features, we get them out there for users to test as soon as we can, but we take our time to make sure they are ready for your production workloads before calling them Generally Available. Here are some of the more bleeding-edge features we're excited about. If you want to test them—and we'd love it if you did!—you will soon be able to use them on a GKE alpha cluster.
RuntimeClass is an alpha feature introduced in 1.12 for selecting between multiple container runtime configurations, and surfacing properties of those runtimes to the control plane. While it is still early in the development lifecycle, this feature is critical to unblocking new ways of running containers, such as GKE Sandbox powered by gVisor.
Snapshot and restore of volumes
Kubernetes 1.12 also introduced alpha support for volume snapshotting. This feature introduces the ability to create/delete volume snapshots and create new volumes from a snapshot using the Kubernetes API.
Because Kubernetes now provides a standard way to trigger snapshot operations through its API, you can incorporate snapshot operations into your tooling and policy in a cluster-agnostic way, knowing that they will work against arbitrary Kubernetes clusters regardless of the underlying storage.
Additionally, these Kubernetes snapshot primitives act as basic building blocks that unlock the ability to develop advanced, enterprise-grade, storage administration features for Kubernetes such as data protection, data replication, and data migration.
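The snapshot-and-restore flow described above can be sketched as two manifests: a VolumeSnapshot that points at an existing PersistentVolumeClaim, and a new claim that names the snapshot as its data source. The snippet assumes the alpha API shape as we understand it for this release (`snapshot.storage.k8s.io/v1alpha1`); alpha APIs can change between releases, so check the field names against your cluster version. All object names, the snapshot class, the storage class, and the size are placeholders.

```python
import json


def build_snapshot(name, pvc_name, snapshot_class):
    """Build a VolumeSnapshot manifest (alpha API shape, assumed)."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1alpha1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "snapshotClassName": snapshot_class,
            # The existing claim whose volume should be snapshotted.
            "source": {"name": pvc_name, "kind": "PersistentVolumeClaim"},
        },
    }


def build_restore_claim(name, snapshot_name, storage_class, size):
    """Build a PersistentVolumeClaim that is pre-populated from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Every name below is a placeholder for this example.
snapshot = build_snapshot("data-snapshot", "data-pvc", "example-snapclass")
restore = build_restore_claim(
    "data-restored", "data-snapshot", "example-storageclass", "10Gi"
)
print(json.dumps([snapshot, restore], indent=2))
```

Because both objects are ordinary Kubernetes API resources, the same manifests can be generated by tooling and applied to any cluster whose storage driver supports snapshots.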
Container Storage Interface
The Container Storage Interface (CSI) enables third-party storage providers to write and deploy plugins exposing new storage systems in Kubernetes without ever having to touch the core Kubernetes code.
In anticipation of moving Kubernetes support of CSI from beta to GA in the next release, we led the delivery of a number of CSI-related features in 1.12. Most importantly, the Kubelet plugin registration mechanism, which enables Kubelet to discover and register external plugins such as CSI drivers, was promoted to beta. In addition, a number of alpha features were added to support the use of CSI for “local ephemeral volumes”, and alpha support was added for a new CSI driver registry, which simplifies the discovery and customization of installed CSI drivers.
Tune in for more
One of the best ways to stay up to date on what’s happening in the Kubernetes ecosystem is to listen to the Kubernetes Podcast. Every week we release an episode covering news from the community and in-depth interviews with leaders from the Kubernetes and cloud native ecosystem. Subscribe via your favorite podcast platform.
As you can see, Googlers have been busy with contributions to Kubernetes, but we have lots more features and optimizations in the works for the next release. Come join the contributing community on GitHub and chat with us on Slack.
If you haven’t tried GCP and GKE before, you can quickly get started with our $300 free credits.