Policy Controller comes with a default library of constraint templates that can be used with the Cost and Reliability policy bundle. This bundle helps you adopt best practices for running cost-efficient GKE clusters without compromising the performance or reliability of your workloads.
This page contains instructions for manually applying a policy bundle. Alternatively, you can apply policy bundles directly.
Cost and Reliability policy bundle constraints
Constraint Name | Constraint Description |
---|---|
cost-reliability-v2023-pod-disruption-budget | Requires PodDisruptionBudget configuration for Deployments, ReplicaSets, StatefulSets, and ReplicationControllers. |
cost-reliability-v2023-pod-resources-best-practices | Requires that containers set resource requests and follow best practices. |
cost-reliability-v2023-required-labels | Requires all Pods and Controllers (ReplicaSet, Deployment, StatefulSet, and DaemonSet) to have the required labels: environment, team, and app. |
cost-reliability-v2023-restrict-repos | Restricts container images to an allowed list of repositories, so that images are hosted in Artifact Registry and can take advantage of Image streaming. |
cost-reliability-v2023-spotvm-termination-grace | Requires terminationGracePeriodSeconds of 15s or less for Pods and Pod Templates with a nodeSelector or nodeAffinity for gke-spot. |
Before you begin
- Install and initialize the Google Cloud CLI, which provides the gcloud and kubectl commands used in these instructions. If you use Cloud Shell, Google Cloud CLI comes pre-installed.
- Install Policy Controller on your cluster with the default library of constraint templates. You must also enable support for referential constraints, because this bundle contains referential constraints.
Configure Policy Controller for referential constraints
Save the following YAML manifest to a file as policycontroller-config.yaml. The manifest configures Policy Controller to watch specific kinds of objects.

apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: "gatekeeper-system"
spec:
  sync:
    syncOnly:
      - group: ""
        version: "v1"
        kind: "Service"
      - group: "policy"
        version: "v1"
        kind: "PodDisruptionBudget"
Apply the policycontroller-config.yaml manifest:
kubectl apply -f policycontroller-config.yaml
Configure your cluster and workload
- Any pod selected by a service must include a readiness probe.
- All deployment, replicaset, statefulset, and replicationcontroller resources must include a poddisruptionbudget.
- All containers should include cpu and memory requests, and a memory limit equal to the memory request, following best practices.
- Add environment, team, and app labels to all Pods and Pod Templates.
- Host container images using Artifact Registry in the same region as your cluster to enable Image streaming. Allow the relevant Artifact Registry by following the example in cost-reliability-v2023-restrict-repos.
- All Pods and Pod Templates using gke-spot must include a terminationGracePeriodSeconds of 15 seconds or less.
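As a rough illustration of the requirements above, the following manifest sketches a Deployment and a PodDisruptionBudget that would be expected to satisfy them. The names (web, web-pdb), label values, probe path, port, and resource sizes are assumptions made up for this example, and REGION, PROJECT_ID, and REPO_NAME are placeholders for your own Artifact Registry repository. None of these values come from the bundle itself.

```yaml
# Hypothetical example; every name and value below is illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: default
  labels:
    environment: dev
    team: platform
    app: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        environment: dev
        team: platform
        app: web
    spec:
      # The next two settings matter only when scheduling onto Spot VMs.
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      terminationGracePeriodSeconds: 15
      containers:
        - name: web
          # Placeholder Artifact Registry path in the cluster's region.
          image: REGION-docker.pkg.dev/PROJECT_ID/REPO_NAME/web:1.0
          ports:
            - containerPort: 8080
          # Needed when a Service selects these Pods.
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              memory: 256Mi  # equal to the memory request
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
  namespace: default
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: web
```

The image would still be flagged unless its repository is on the allowed list of the cost-reliability-v2023-restrict-repos constraint.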
Audit Cost and Reliability policy bundle
Policy Controller lets you enforce policies for your Kubernetes cluster. To test your workloads for compliance with the Cost and Reliability policies outlined in the preceding table, you can deploy these constraints in "audit" mode. Audit mode reveals violations and, more importantly, gives you a chance to fix them before you enforce the policies on your Kubernetes cluster.
You can apply these policies with spec.enforcementAction set to dryrun using kubectl, kpt, or Config Sync.
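For orientation, a constraint in audit mode has roughly the following shape. This is a trimmed sketch, not the manifest shipped in the bundle: the match section and the repos entry are assumptions (the K8sAllowedRepos template in the Gatekeeper library takes a repos list of image prefixes), and REGION, PROJECT_ID, and REPO_NAME are placeholders.

```yaml
# Illustrative sketch only; the bundle's real manifest may differ.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: cost-reliability-v2023-restrict-repos
  labels:
    policycontroller.gke.io/bundleName: cost-reliability-v2023
spec:
  # dryrun records violations in the constraint status without blocking requests.
  enforcementAction: dryrun
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    repos:
      # Assumed prefix for an Artifact Registry repository you want to allow.
      - "REGION-docker.pkg.dev/PROJECT_ID/REPO_NAME/"
```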
kubectl
(Optional) Preview the policy constraints with kubectl:
kubectl kustomize https://github.com/GoogleCloudPlatform/gke-policy-library.git/anthos-bundles/cost-reliability-v2023
Apply the policy constraints with kubectl:
kubectl apply -k https://github.com/GoogleCloudPlatform/gke-policy-library.git/anthos-bundles/cost-reliability-v2023
The output is the following:
gkespotvmterminationgrace.constraints.gatekeeper.sh/cost-reliability-v2023-spotvm-termination-grace created
k8sallowedrepos.constraints.gatekeeper.sh/cost-reliability-v2023-restrict-repos created
k8spoddisruptionbudget.constraints.gatekeeper.sh/cost-reliability-v2023-pod-disruption-budget created
k8spodresourcesbestpractices.constraints.gatekeeper.sh/cost-reliability-v2023-pod-resources-best-practices created
k8srequiredlabels.constraints.gatekeeper.sh/cost-reliability-v2023-required-labels created
Verify that policy constraints have been installed and check if violations exist across the cluster:
kubectl get constraints -l policycontroller.gke.io/bundleName=cost-reliability-v2023
The output is similar to the following:
NAME                                                                                                  ENFORCEMENT-ACTION   TOTAL-VIOLATIONS
gkespotvmterminationgrace.constraints.gatekeeper.sh/cost-reliability-v2023-spotvm-termination-grace   dryrun               0

NAME                                                                                                         ENFORCEMENT-ACTION   TOTAL-VIOLATIONS
k8spodresourcesbestpractices.constraints.gatekeeper.sh/cost-reliability-v2023-pod-resources-best-practices   dryrun               0

NAME                                                                                            ENFORCEMENT-ACTION   TOTAL-VIOLATIONS
k8spoddisruptionbudget.constraints.gatekeeper.sh/cost-reliability-v2023-pod-disruption-budget   dryrun               0

NAME                                                                              ENFORCEMENT-ACTION   TOTAL-VIOLATIONS
k8sallowedrepos.constraints.gatekeeper.sh/cost-reliability-v2023-restrict-repos   dryrun               0

NAME                                                                                 ENFORCEMENT-ACTION   TOTAL-VIOLATIONS
k8srequiredlabels.constraints.gatekeeper.sh/cost-reliability-v2023-required-labels   dryrun               0
kpt
Install and set up kpt.
kpt is used in these instructions to customize and deploy Kubernetes resources.
Download the Cost and Reliability policy bundle from GitHub using kpt:
kpt pkg get https://github.com/GoogleCloudPlatform/gke-policy-library.git/anthos-bundles/cost-reliability-v2023
Run the set-enforcement-action kpt function to set the policies' enforcement action to dryrun:
kpt fn eval cost-reliability-v2023 -i gcr.io/kpt-fn/set-enforcement-action:v0.1 \
  -- enforcementAction=dryrun
Initialize the working directory with kpt, which creates a resource to track changes:
cd cost-reliability-v2023
kpt live init
Apply the policy constraints with kpt:
kpt live apply
Verify that policy constraints have been installed and check if violations exist across the cluster:
kpt live status --output table --poll-until current
A status of CURRENT confirms successful installation of the constraints.
Config Sync
Install and set up kpt.
kpt is used in these instructions to customize and deploy Kubernetes resources.
Operators using Config Sync to deploy policies to their clusters can use the following instructions:
Change into the sync directory for Config Sync:
cd SYNC_ROOT_DIR
Create .gitignore, or append to it if it already exists, so that it includes resourcegroup.yaml:
echo resourcegroup.yaml >> .gitignore
Create a dedicated policies directory:
mkdir -p policies
Download the Cost and Reliability policy bundle from GitHub using kpt:
kpt pkg get https://github.com/GoogleCloudPlatform/gke-policy-library.git/anthos-bundles/cost-reliability-v2023 policies/cost-reliability-v2023
Run the set-enforcement-action kpt function to set the policies' enforcement action to dryrun:
kpt fn eval policies/cost-reliability-v2023 -i gcr.io/kpt-fn/set-enforcement-action:v0.1 -- enforcementAction=dryrun
(Optional) Preview the policy constraints to be created:
kpt live init policies/cost-reliability-v2023
kpt live apply --dry-run policies/cost-reliability-v2023
If your sync directory for Config Sync uses Kustomize, add policies/cost-reliability-v2023 to your root kustomization.yaml. Otherwise, remove the policies/cost-reliability-v2023/kustomization.yaml file:
rm SYNC_ROOT_DIR/policies/cost-reliability-v2023/kustomization.yaml
Push changes to the Config Sync repo:
git add SYNC_ROOT_DIR/policies/cost-reliability-v2023
git commit -m 'Adding Cost and Reliability policy audit enforcement'
git push
Verify the status of the installation:
watch gcloud beta container fleet config-management status --project PROJECT_ID
A status of SYNCED confirms the installation of the policies.
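If your sync directory uses Kustomize, the root kustomization.yaml entry mentioned in the steps above might look like the following minimal sketch. It assumes the file lives at the top of SYNC_ROOT_DIR and that the bundle was downloaded to policies/cost-reliability-v2023 as shown earlier. Any resources you already list would stay alongside the new entry.

```yaml
# SYNC_ROOT_DIR/kustomization.yaml (assumed location)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Directory that contains the bundle's own kustomization.yaml.
  - policies/cost-reliability-v2023
```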
View policy violations
Once the policy constraints are installed in audit mode, violations on the cluster can be viewed in the UI using the Policy Controller Dashboard.
You can also use kubectl to view violations on the cluster using the following command:
kubectl get constraint -l policycontroller.gke.io/bundleName=cost-reliability-v2023 -o json | jq -cC '.items[]| [.metadata.name,.status.totalViolations]'
If violations are present, a listing of the violation messages per constraint can be viewed with:
kubectl get constraint -l policycontroller.gke.io/bundleName=cost-reliability-v2023 -o json | jq -C '.items[]| select(.status.totalViolations>0)| [.metadata.name,.status.violations[]?]'
Change Cost and Reliability policy bundle enforcement action
Once you've reviewed policy violations on your cluster, you can consider changing the enforcement mode so that the admission controller either issues a warning (warn) or blocks non-compliant resources from being applied to the cluster (deny).
kubectl
Use kubectl to set the policies' enforcement action to warn:
kubectl get constraints -l policycontroller.gke.io/bundleName=cost-reliability-v2023 -o name | xargs -I {} kubectl patch {} --type='json' -p='[{"op":"replace","path":"/spec/enforcementAction","value":"warn"}]'
Verify that the policy constraints' enforcement action has been updated:
kubectl get constraints -l policycontroller.gke.io/bundleName=cost-reliability-v2023
kpt
Run the set-enforcement-action kpt function to set the policies' enforcement action to warn:
kpt fn eval -i gcr.io/kpt-fn/set-enforcement-action:v0.1 -- enforcementAction=warn
Apply the policy constraints:
kpt live apply
Config Sync
Operators using Config Sync to deploy policies to their clusters can use the following instructions:
Change into the sync directory for Config Sync:
cd SYNC_ROOT_DIR
Run the set-enforcement-action kpt function to set the policies' enforcement action to warn:
kpt fn eval policies/cost-reliability-v2023 -i gcr.io/kpt-fn/set-enforcement-action:v0.1 -- enforcementAction=warn
Push changes to the Config Sync repo:
git add SYNC_ROOT_DIR/policies/cost-reliability-v2023
git commit -m 'Adding Cost and Reliability policy bundle warn enforcement'
git push
Verify the status of the installation:
gcloud alpha anthos config sync repo list --project PROJECT_ID
Your repo showing up in the SYNCED column confirms the installation of the policies.
Test policy enforcement
Create a non-compliant resource on the cluster using the following command:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  namespace: default
  name: wp-non-compliant
  labels:
    app: wordpress
spec:
  containers:
    - image: wordpress
      name: wordpress
      ports:
        - containerPort: 80
          hostPort: 80
          name: wordpress
EOF
The admission controller should produce warnings that list the policies this resource violates, as shown in the following example:
Warning: [cost-reliability-v2023-pod-resources-best-practices] Container <wordpress> must set <cpu> request.
Warning: [cost-reliability-v2023-pod-resources-best-practices] Container <wordpress> must set <memory> request.
Warning: [cost-reliability-v2023-required-labels] This app is missing one or more required labels: `environment`, `team`, and `app`.
Warning: [cost-reliability-v2023-restrict-repos] container <wordpress> has an invalid image repo <wordpress>, allowed repos are ["gcr.io/gke-release/", "gcr.io/anthos-baremetal-release/", "gcr.io/config-management-release/", "gcr.io/kubebuilder/", "gcr.io/gkeconnect/", "gke.gcr.io/"]
pod/wp-non-compliant created
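For comparison, a Pod along the following lines would be expected to avoid the warnings shown above. This is a sketch under assumptions: the name wp-compliant, the label values, and the resource sizes are made up for illustration, and IMAGE_PATH stands for an image in a repository that your cost-reliability-v2023-restrict-repos constraint allows.

```yaml
# Hypothetical compliant counterpart to wp-non-compliant.
apiVersion: v1
kind: Pod
metadata:
  namespace: default
  name: wp-compliant
  labels:
    # All three required labels are present.
    environment: dev
    team: web
    app: wordpress
spec:
  containers:
    - image: IMAGE_PATH  # placeholder for an image from an allowed repo
      name: wordpress
      ports:
        - containerPort: 80
          name: wordpress
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          memory: 256Mi  # equal to the memory request
```

If a Service selects this Pod, it would also need a readiness probe to meet the cluster and workload requirements listed earlier.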
Remove Cost and Reliability policy bundle
If needed, the Cost and Reliability policy bundle can be removed from the cluster.
kubectl
Use kubectl to remove the policies:
kubectl delete constraint -l policycontroller.gke.io/bundleName=cost-reliability-v2023
kpt
Remove the policies:
kpt live destroy
Config Sync
Operators using Config Sync to deploy policies to their clusters can use the following instructions:
Push changes to the Config Sync repo:
git rm -r SYNC_ROOT_DIR/policies/cost-reliability-v2023
git commit -m 'Removing Cost and Reliability policies'
git push
Verify the status:
gcloud alpha anthos config sync repo list --project PROJECT_ID
Your repo showing up in the SYNCED column confirms the removal of the policies.