This page explains how you can reduce the resources consumed by cluster add-ons. Use these techniques on small clusters, such as clusters with three or fewer nodes or clusters that use machine types with limited resources.
In addition to your workload, cluster nodes run several add-ons that integrate the node with the cluster master and provide other functionality. As such, there is a disparity between a node's total resources and the resources that are available for your workload. Visit the cluster architecture documentation for more details.
The default configurations of these add-ons are appropriate for typical clusters, but you can fine-tune the resource usage of add-ons depending on your particular cluster configuration. You can also disable some add-ons that aren't required by your use case.
Fine-tuning is especially useful for clusters with limited compute resources, for example clusters with a single node or very few nodes, or clusters that run on low-cost machine types. You can use a small cluster to try out GKE or to experiment with new features. Clusters created using the "Your first cluster" template can also benefit from fine-tuning.
You can configure each individual add-on to reduce its impact on node resources.
Add-ons provide visibility and debugging for your cluster. For instance, if you disable monitoring on a cluster running production workloads, problems will be more difficult to debug.
Stackdriver Logging automatically collects, processes, and stores your container and system logs.
You can disable Stackdriver Logging entirely, or leave it enabled but restrict its resource usage by fine-tuning the Fluentd add-on.
To disable Stackdriver Logging:
gcloud container clusters update [CLUSTER_NAME] --logging-service none
To enable Stackdriver Logging:
gcloud container clusters update [CLUSTER_NAME] --logging-service logging.googleapis.com
Viewing logs when Stackdriver Logging is disabled
When Stackdriver Logging is disabled, you can still view recent log entries. To view the logs for a specific Pod:
kubectl logs -f [POD_NAME]
The -f option streams the logs continuously.
To view the logs from all pods matching a selector:
kubectl logs -l [SELECTOR]
where [SELECTOR] is a label selector, for example "app=frontend".
Fluentd collects logs from your nodes and sends them to Stackdriver. The Fluentd agent is deployed to your cluster using a DaemonSet so that an instance of the agent runs on each node of your cluster. Fluentd's resource requirements depend on your particular workload; more logging requires more resources.
You can tune Fluentd's resource allocation by creating a scaling policy specific to your requirements. A scaling policy defines a Pod's resource requests and limits. Refer to Managing Compute Resources for Containers to learn how the Kubernetes scheduler handles resource requests and limits. For more information about how resource requests and limits affect quality of services (QoS), see Resource Quality of Service in Kubernetes.
The following section explains how to measure Fluentd's resource usage and how to write a custom scaling policy using the measured values.
How to create a custom scaling policy
- Visit the Kubernetes Workloads menu in the Google Cloud Platform Console.
- Use the filter bar to find the Fluentd DaemonSet: remove the "Is System: false" filter, and add a filter for Name: "fluentd-gcp-v".
Note: The version number in the workload's name depends on the currently deployed version of Fluentd. Let the filter bar autocomplete the name.
- Click on the fluentd workload in the list.
- View the resource usage statistics of each Fluentd Pod by clicking the Pod names in the Managed Pods section. There is one Pod per cluster node; the cluster in this example has three nodes.
- Look at each Pod in your cluster and note how much CPU and memory the fluentd-gcp container uses. Note the maximum values of each resource; you will use those values in your custom scaling policy. You can change the timescale of the graphs to view longer-term trends.
In this example, the Pod's peak CPU usage is near 2.0e-2 (0.02) CPU, and its peak memory usage is 135M.
- Use the observed resource usage data to calculate values to use in the scaling policy:
  - Multiply the max CPU usage by 1000 to get a value in mCPU for GKE; for example, 0.02 CPU becomes 20m.
  - Use the max memory usage as the new memory request. To leave room for growth, make the memory limit 1.1x the request.
- Use the values you calculated to write a scaling policy for Fluentd, using the following manifest as a template.
Note that this manifest does not specify a CPU limit, which allows Fluentd to consume additional CPU as needed without being throttled. You can add a CPU limit if Fluentd must not consume more than a specific amount of CPU.
apiVersion: scalingpolicy.kope.io/v1alpha1
kind: ScalingPolicy
metadata:
  name: fluentd-gcp-scaling-policy
  namespace: kube-system
spec:
  containers:
  - name: fluentd-gcp
    resources:
      requests:
      - resource: cpu
        base: [CPU request value from above]
      - resource: memory
        base: [Memory request value from above]
      limits:
      - resource: memory
        base: [Memory limit value from above]
- Apply the scaling policy with kubectl apply. For example, if your scaling policy file is named fluentd_scaling_policy.yaml:
kubectl apply -f fluentd_scaling_policy.yaml
- Confirm that the new values have been applied. Note that it may take a minute for the new values to be applied to the cluster.
kubectl get ds --namespace=kube-system -l k8s-app=fluentd-gcp -o custom-columns=NAME:.metadata.name,CPU_REQUEST:.spec.template.spec.containers[0].resources.requests.cpu,MEMORY_REQUEST:.spec.template.spec.containers[0].resources.requests.memory,MEMORY_LIMIT:.spec.template.spec.containers[0].resources.limits.memory
This command's output should resemble the following, with request and limit values that correspond to the values in your policy:
NAME                 CPU_REQUEST   MEMORY_REQUEST   MEMORY_LIMIT
fluentd-gcp-v3.0.0   20m           135M             200M
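The request/limit calculation above can be sketched in shell arithmetic. The 0.02 CPU and 135M figures are the example peaks observed earlier; substitute the measurements from your own cluster:

```shell
# Sketch: derive scaling-policy values from observed peak usage (example figures).
peak_cpu=0.02      # peak CPU usage in cores, from the Pod graphs
peak_mem_mb=135    # peak memory usage in MB

# CPU request in millicores: multiply cores by 1000.
cpu_request=$(awk -v c="$peak_cpu" 'BEGIN { printf "%dm", c * 1000 }')

# Memory request = observed peak; memory limit = 1.1x the request.
mem_request="${peak_mem_mb}M"
mem_limit=$(awk -v m="$peak_mem_mb" 'BEGIN { printf "%dM", m * 1.1 }')

echo "$cpu_request $mem_request $mem_limit"   # 20m 135M 148M
```

You can round the computed limit up (for example, to 200M as in the sample output above) if you prefer a larger safety margin.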
We recommend that you use Stackdriver Monitoring. However, you can disable monitoring in order to reclaim some resources.
Refer to the overview of GKE monitoring for more information about Stackdriver Monitoring.
You must disable horizontal Pod autoscaling to fully disable Stackdriver Monitoring. A side effect of this is that kubectl top stops working.
To disable Stackdriver Monitoring:
gcloud beta container clusters update [CLUSTER_NAME] --monitoring-service none
gcloud container clusters update [CLUSTER_NAME] --update-addons=HorizontalPodAutoscaling=DISABLED
kubectl --namespace=kube-system scale deployment metrics-server-v0.2.1 --replicas=0
To enable Stackdriver Monitoring:
gcloud beta container clusters update [CLUSTER_NAME] --monitoring-service monitoring.googleapis.com
gcloud container clusters update [CLUSTER_NAME] --update-addons=HorizontalPodAutoscaling=ENABLED
Horizontal Pod autoscaling
Horizontal Pod Autoscaling (HPA) scales the number of replicas in your Deployments based on metrics such as CPU or memory usage. If you don't need HPA and you've already disabled Stackdriver Monitoring, you can also disable HPA.
To disable HPA:
gcloud container clusters update [CLUSTER_NAME] --update-addons=HorizontalPodAutoscaling=DISABLED
To enable HPA:
gcloud container clusters update [CLUSTER_NAME] --update-addons=HorizontalPodAutoscaling=ENABLED
You can disable the Kubernetes Dashboard add-on to conserve cluster resources. There is little or no downside to disabling the dashboard because it is redundant with the GKE dashboards available in the Google Cloud Platform Console.
To disable the dashboard:
gcloud container clusters update [CLUSTER_NAME] \ --update-addons=KubernetesDashboard=DISABLED
To enable the dashboard:
gcloud container clusters update [CLUSTER_NAME] \ --update-addons=KubernetesDashboard=ENABLED
Kube DNS schedules a DNS Deployment and Service in your cluster, and the Pods in your cluster use the Kube DNS service to resolve DNS names to IP addresses for Services, Pods, and Nodes, as well as for public domain names. Kube DNS resolves public domain names like example.com, and it resolves Service names like servicename.namespace.svc.cluster.local. For more information on DNS-based service discovery, see DNS for Services and Pods.
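For example, the cluster-internal name for a hypothetical Service named frontend in the default namespace follows the servicename.namespace.svc.cluster.local pattern:

```shell
# Build the cluster-internal DNS name for a Service (hypothetical names).
service=frontend
namespace=default
fqdn="${service}.${namespace}.svc.cluster.local"
echo "$fqdn"   # frontend.default.svc.cluster.local
```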
By default, several Kube DNS replicas run in order to maintain high availability. If you do not require highly available DNS resolution, you can conserve cluster resources by reducing the number of Kube DNS replicas to one. If you don't require DNS name resolution at all, you can disable Kube DNS entirely.
Reducing Kube DNS replication
If your cluster doesn't require highly available DNS resolution, you can conserve cluster resources by turning off Kube DNS horizontal autoscaling and reducing the number of replicas to one.
To turn off the kube-dns autoscaler and reduce kube-dns to a single replica:
kubectl scale --replicas=0 deployment/kube-dns-autoscaler --namespace=kube-system
kubectl scale --replicas=1 deployment/kube-dns --namespace=kube-system
To enable autoscaling:
kubectl scale --replicas=1 deployment/kube-dns-autoscaler --namespace=kube-system
For more precise control of autoscaling, you can tune the autoscaling parameters.
Disabling Kube DNS
You can completely disable Kube DNS. However, Kube DNS is required by any workload that resolves the DNS name of a dependent service, including public domain names and the names of cluster Services.
To disable Kube DNS:
kubectl scale --replicas=0 deployment/kube-dns-autoscaler --namespace=kube-system
kubectl scale --replicas=0 deployment/kube-dns --namespace=kube-system
To enable Kube DNS:
kubectl scale --replicas=1 deployment/kube-dns --namespace=kube-system
where --replicas is set to your desired number of replicas.
To enable Kube DNS autoscaling:
kubectl scale --replicas=1 deployment/kube-dns-autoscaler --namespace=kube-system
External DNS lookups without Kube DNS
If you disable Kube DNS but still need to resolve external domain names, you can configure a Pod to use an external DNS server directly by setting dnsPolicy: "None" and listing nameservers in dnsConfig. For example:
apiVersion: v1
kind: Pod
metadata:
  namespace: default
  name: dns-example
spec:
  containers:
  - name: test
    image: nginx
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
    - 18.104.22.168
    - 22.214.171.124
Service discovery without Kube DNS
You can use service environment variables as an alternative to DNS-based service discovery. When a Pod is created, service environment variables are automatically created for every service in the same namespace as the Pod. This is more restrictive than Kube DNS because environment variables are only created for services that are created before the Pod is created.
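As a sketch of the naming convention (the Service name my-frontend is hypothetical): Kubernetes upper-cases the Service name, replaces dashes with underscores, and appends suffixes such as _SERVICE_HOST and _SERVICE_PORT:

```shell
# Derive the environment-variable prefix Kubernetes generates for a Service.
svc="my-frontend"                      # hypothetical Service name
prefix=$(printf '%s' "$svc" | tr 'a-z-' 'A-Z_')
echo "${prefix}_SERVICE_HOST"          # MY_FRONTEND_SERVICE_HOST
echo "${prefix}_SERVICE_PORT"          # MY_FRONTEND_SERVICE_PORT
```

Inside a Pod created after the Service, reading $MY_FRONTEND_SERVICE_HOST would yield the Service's ClusterIP, so the Pod can reach the Service without any DNS lookup.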