Improve Kubernetes cost and reliability with the new Policy Controller policy bundle
Andrew Peabody
Technical Solutions Consultant, Google Cloud
Poonam Lamba
Product Manager, Google
Kubernetes teams need to do more than just run their critical workloads; they also have to ensure those workloads are reliable and cost-effective. Our recent State of Kubernetes Cost Optimization Report provides prescriptive best practices for cost optimization, and we’ve incorporated many of them into the new Policy Controller Cost and Reliability policy bundle, which automatically identifies potential workload improvements, so you can achieve greater reliability and cost efficiency.
Google Kubernetes Engine (GKE) Policy Controller lets you enforce fully programmable policies for your clusters. A policy bundle is a pre-built set of constraints that Google Cloud creates and maintains. Policy bundles help you audit your cluster resources against Kubernetes standards, industry standards, or Google Cloud-recommended best practices. Many policy bundles are available, and both new and existing users can apply them without writing a single line of code. You can also view the status of policy bundle coverage and compliance for your fleet of clusters using the Policy Controller dashboard.
The new Cost and Reliability policy bundle
As a Kubernetes administrator, you can review the violations reported by the new Cost and Reliability policy bundle to see how well your applications align with cost and reliability recommendations.
The Cost and Reliability policy bundle checks workloads for the following configuration:
- A PodDisruptionBudget configuration
- cpu and memory requests set according to best practices
- The labels environment, team, and app
- Container image repos that take advantage of image streaming
- A terminationGracePeriodSeconds of 15s or less on gke-spot
For more information on the research and reasoning behind these best practices, see the State of Kubernetes Cost Optimization Report.
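To make the requirements above concrete, here is a minimal Python sketch that checks a parsed Pod manifest against three of them: the required labels, cpu and memory requests, and the grace period on gke-spot. It is an illustration only, not the logic Policy Controller runs. The function name check_pod is hypothetical, and the sketch assumes that Spot workloads are identified by a cloud.google.com/gke-spot node selector and that labels are read from the Pod's own metadata. The PodDisruptionBudget and image streaming checks are left out because they depend on other cluster resources.

```python
from typing import Any

REQUIRED_LABELS = ("environment", "team", "app")
MAX_SPOT_GRACE_SECONDS = 15
DEFAULT_GRACE_SECONDS = 30  # Kubernetes default when the field is unset


def check_pod(pod: dict[str, Any]) -> list[str]:
    """Return human-readable findings for one Pod manifest (a parsed dict)."""
    findings = []

    # Required labels (assumption: read from the Pod's own metadata).
    labels = pod.get("metadata", {}).get("labels") or {}
    for key in REQUIRED_LABELS:
        if key not in labels:
            findings.append(f"missing label: {key}")

    # Every container should declare both cpu and memory requests.
    spec = pod.get("spec", {})
    for container in spec.get("containers", []):
        requests = (container.get("resources") or {}).get("requests") or {}
        for resource in ("cpu", "memory"):
            if resource not in requests:
                name = container.get("name", "?")
                findings.append(f"container {name}: no {resource} request")

    # Assumption: Spot workloads select nodes via this GKE node label.
    node_selector = spec.get("nodeSelector") or {}
    if node_selector.get("cloud.google.com/gke-spot") == "true":
        grace = spec.get("terminationGracePeriodSeconds", DEFAULT_GRACE_SECONDS)
        if grace > MAX_SPOT_GRACE_SECONDS:
            findings.append(
                f"terminationGracePeriodSeconds {grace} exceeds "
                f"{MAX_SPOT_GRACE_SECONDS} on gke-spot"
            )
    return findings
```

A Pod that sets all three labels, declares both requests on every container, and either avoids gke-spot or sets a grace period of 15 seconds or less yields an empty list.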
Using the Cost and Reliability policy bundle
To view the violations identified by the Cost and Reliability policy bundle, first apply the bundle to your cluster. The included policies are configured in "audit" mode by default, so they do not affect any of your existing or new workloads. You can apply the bundle using any of the following methods:
- Install via UI (Preview)
- kubectl
- kpt
- Config Sync
Install via UI (Preview)
For customers with clusters on GKE Enterprise, currently in private preview, we’ve introduced a new UI installation method for the Policy Controller policy bundles. If you’re an existing Google Cloud customer and would like to try GKE Enterprise, talk to your account team to sign up for access. Otherwise, contact a Google Cloud sales specialist.
To install the Policy Controller Cost and Reliability policy bundle with the UI, follow these steps:
- In the Google Cloud console, navigate to Policy under Google Kubernetes Engine Enterprise.
- If Policy Controller (v1.16.1 or higher) isn't already installed on your cluster, you can install it by clicking INSTALL POLICY CONTROLLER.
- Under the SETTINGS tab (image below), click the pencil icon under Edit Configuration.
- Click the enable slider next to Cost and Reliability, and click SAVE CHANGES.
Figure: Installing policy bundles using the Policy Controller dashboard
View violations on the Policy Controller dashboard
Violations on the cluster can be viewed in the UI using the Policy Controller dashboard.
Figure: The Policy Controller dashboard
To remediate the violations, we recommend that you update your cluster and resource YAML. Some guidelines are included: each violation comes with steps to fix it, which you can view from both the CLI and the Policy Controller dashboard.
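For the CLI route, the following Python sketch totals audit violations per constraint kind. It assumes you have saved the output of kubectl get constraints -o json and that each constraint exposes a status.totalViolations field, as Gatekeeper-style constraints do. The function name summarize_violations is hypothetical and not part of Policy Controller.

```python
import json
from collections import Counter


def summarize_violations(constraints_json: str) -> Counter:
    """Count audit violations per constraint kind from a kubectl JSON list."""
    data = json.loads(constraints_json)
    counts = Counter()
    for item in data.get("items", []):
        status = item.get("status") or {}
        # totalViolations can be absent before the first audit run completes.
        counts[item.get("kind", "unknown")] += status.get("totalViolations") or 0
    return counts
```

You could then list the constraint kinds with the most violations using summarize_violations(text).most_common() and share that summary with workload owners.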
Monitoring the cluster(s) for Cost and Reliability policy bundle violations
By default, the Cost and Reliability policy bundle has its enforcement action set to dryrun, which tells Policy Controller to report violations without blocking any changes. This lets you audit your clusters, share any violations with workload owners, and collaborate on resolving them.
All policy violations are automatically recorded in Cloud Logging, where you can find them by filtering in the Logs Explorer.
You can also get notified whenever policy violations occur by setting up log-based alerts using Cloud Monitoring.
Policy Controller includes metrics related to policy usage, such as the number of constraints, constraint templates, and audit violations detected (see the list of metrics exposed). These metrics can be exported to Cloud Monitoring, Prometheus, or both at install time. You can also set up alerts based on these metrics.
Conclusion
Policy Controller enables the enforcement of policy bundles created and maintained by Google, as well as custom policies for your cluster that prevent changes to the Kubernetes API from violating security, operational, or compliance controls. You can also use Policy Controller to analyze configuration before deployment to your Kubernetes cluster.
Get started today
The easiest way to get started is to install Policy Controller and try out some of the other policy bundles that Google creates and maintains.