Customizing node system configuration

This document shows you how to customize your Google Kubernetes Engine (GKE) node configuration using a configuration file called a node system configuration.

Overview

You can customize your node configuration by using various methods. For example, you can specify parameters such as the machine type and minimum CPU platform when you create a node pool.

A node system configuration is a configuration file that provides a way to adjust a limited set of system settings. You can use a node system configuration to specify custom settings for the Kubernetes node agent (the kubelet) and low-level Linux kernel configurations (sysctl) in your node pools.

You can also make additional customizations using the following methods:

Using a node system configuration

To use a node system configuration:

  1. Create a configuration file. This file contains your kubelet and sysctl configurations.
  2. Add the configuration when you create a cluster, or when you create or update a node pool.

Creating a configuration file

Write your node system configuration file in YAML. The following example shows you how to add configurations for the kubelet and sysctl options:

kubeletConfig:
  cpuManagerPolicy: static
linuxConfig:
  sysctl:
    net.core.somaxconn: '2048'
    net.ipv4.tcp_rmem: '4096 87380 6291456'

In this example:

  • cpuManagerPolicy: static configures the kubelet to use the static CPU management policy.
  • net.core.somaxconn: '2048' limits the socket listen() backlog to 2,048 pending connections.
  • net.ipv4.tcp_rmem: '4096 87380 6291456' sets the minimum, default, and maximum sizes of the TCP socket receive buffer to 4,096 bytes, 87,380 bytes, and 6,291,456 bytes respectively.

If you want to add configurations solely for the kubelet or sysctl, only include that section in your configuration file. For example, to add a kubelet configuration, create the following file:

kubeletConfig:
  cpuManagerPolicy: static

For a complete list of the fields that you can add to your configuration file, see the Kubelet configuration options and Sysctl configuration options sections.

Adding the configuration to a node pool

After you create the configuration file, pass it to the Google Cloud CLI with the --system-config-from-file flag. You can add this flag when you create a cluster, or when you create or update a node pool. You cannot add a node system configuration with the Google Cloud console.

To add a node system configuration, run the command for the operation that you want to perform:

Create cluster

gcloud container clusters create CLUSTER_NAME \
    --system-config-from-file=SYSTEM_CONFIG_PATH

Replace the following:

  • CLUSTER_NAME: the name for your cluster
  • SYSTEM_CONFIG_PATH: the path to the file that contains your kubelet and sysctl configurations

After you have applied a node system configuration, the default node pool of the cluster uses the settings that you defined.

Create node pool

gcloud container node-pools create POOL_NAME \
    --cluster=CLUSTER_NAME \
    --system-config-from-file=SYSTEM_CONFIG_PATH

Replace the following:

  • POOL_NAME: the name for your node pool
  • CLUSTER_NAME: the name of the cluster that you want to add a node pool to
  • SYSTEM_CONFIG_PATH: the path to the file that contains your kubelet and sysctl configurations

Update node pool

gcloud container node-pools update POOL_NAME \
    --cluster=CLUSTER_NAME \
    --system-config-from-file=SYSTEM_CONFIG_PATH

Replace the following:

  • POOL_NAME: the name of the node pool that you want to update
  • CLUSTER_NAME: the name of the cluster that you want to update
  • SYSTEM_CONFIG_PATH: the path to the file that contains your kubelet and sysctl configurations

Editing a node system configuration

To edit a node system configuration, you can create a new node pool with the configuration that you want, or update the node system configuration of an existing node pool.

Editing by creating a node pool

To edit a node system configuration by creating a node pool:

  1. Create a configuration file with the configuration that you want.
  2. Add the configuration to a new node pool.
  3. Migrate your workloads to the new node pool.
  4. Delete the old node pool.

Editing by updating an existing node pool

To edit a node system configuration by updating an existing node pool, update the node system configuration with the values that you want. Updating a node system configuration overrides the node pool's system configuration with the new configuration. If you omit any parameters during an update, they are set to their respective defaults.
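As an illustration of this override behavior, suppose that a node pool was created with the example file from the Creating a configuration file section, and that you only want to raise net.core.somaxconn. Because omitted parameters revert to their defaults, the update file would need to restate every setting that should persist. The following is a sketch under that assumption; the value 4096 is an arbitrary choice for illustration:

```yaml
# Update file: restate every setting that you want to keep.
kubeletConfig:
  cpuManagerPolicy: static                    # unchanged, repeated so it is not reset to none
linuxConfig:
  sysctl:
    net.core.somaxconn: '4096'                # the only value being changed (illustrative)
    net.ipv4.tcp_rmem: '4096 87380 6291456'   # unchanged, repeated so it is not reset
```

If this file left out the kubeletConfig section, the statement above implies that cpuManagerPolicy would return to its default of none after the update.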

If you want to reset the node system configuration back to the defaults, update your configuration file with empty values for the kubelet and sysctl. For example:

kubeletConfig: {}
linuxConfig:
  sysctl: {}

Deleting a node system configuration

To remove a node system configuration:

  1. Create a node pool without a node system configuration.
  2. Migrate your workloads to the new node pool.
  3. Delete the node pool that has the old node system configuration.

Kubelet configuration options

The following list shows the kubelet options that you can modify, with the restrictions, default setting, and description for each.

cpuManagerPolicy

  • Restrictions: Value must be none or static
  • Default setting: none
  • Description: This setting controls the kubelet's CPU Manager policy. The default value, none, is the default CPU affinity scheme, which provides no affinity beyond what the OS scheduler does automatically. Setting this value to static allows Pods in the Guaranteed QoS class with integer CPU requests to be assigned exclusive use of CPUs.

cpuCFSQuota

  • Restrictions: Value must be true or false
  • Default setting: true
  • Description: This setting enforces the Pod's CPU limit. Setting this value to false means that the CPU limits for Pods are ignored. Ignoring CPU limits might be desirable in certain scenarios where Pods are sensitive to CPU limits. The risk of disabling cpuCFSQuota is that a rogue Pod can consume more CPU resources than intended.

cpuCFSQuotaPeriod

  • Restrictions: Value must be a duration of time
  • Default setting: "100ms"
  • Description: This setting sets the CPU CFS quota period value, cpu.cfs_period_us, which specifies how often a cgroup's access to CPU resources is reallocated. This option lets you tune the CPU throttling behavior.

podPidsLimit

  • Restrictions: Value must be between 1024 and 4194304
  • Default setting: none
  • Description: This setting sets the maximum number of process IDs (PIDs) that each Pod can use.
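To show how these options fit together in one file, the following sketch sets all four kubelet options at once. The field names come from the options above; the specific values (a 50 ms quota period and a limit of 2,048 PIDs) are illustrative assumptions, not recommendations:

```yaml
# Illustrative values only; tune them for your own workloads.
kubeletConfig:
  cpuManagerPolicy: static    # none (default) or static
  cpuCFSQuota: true           # true (default) enforces Pod CPU limits
  cpuCFSQuotaPeriod: '50ms'   # a duration; the default is "100ms"
  podPidsLimit: 2048          # must be between 1024 and 4194304
```

Any option that you leave out of the file keeps its default setting.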

Sysctl configuration options

To tune the performance of your system, you can modify the following kernel attributes:

  • net.core.busy_poll
  • net.core.busy_read
  • net.core.netdev_max_backlog
  • net.core.rmem_max
  • net.core.wmem_default
  • net.core.wmem_max
  • net.core.optmem_max
  • net.core.somaxconn
  • net.ipv4.tcp_rmem
  • net.ipv4.tcp_wmem
  • net.ipv4.tcp_tw_reuse
  • net.ipv6.conf.all.disable_ipv6
  • net.ipv6.conf.default.disable_ipv6

To learn more about these attributes, see the Linux Kernel sysctl documentation.

Some sysctls are namespaced, so different Linux namespaces might have different values for them, while other sysctls are global for the entire node. Updating sysctl options by using a node system configuration ensures that the sysctl is applied globally on the node and in each namespace, resulting in each Pod having identical sysctl values in each Linux namespace.
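As a sketch of how several of these attributes might be combined, the following file raises network buffer and backlog limits for a node pool that handles many connections. The values are illustrative assumptions rather than tuning advice, and GKE might restrict the accepted range for some attributes, so verify each value before you apply it. As in the earlier example, values are written as quoted strings:

```yaml
# Illustrative network tuning; every value here is an assumption.
linuxConfig:
  sysctl:
    net.core.netdev_max_backlog: '5000'        # packets queued when an interface receives faster than the kernel processes them
    net.core.rmem_max: '16777216'              # maximum socket receive buffer size, in bytes
    net.core.wmem_max: '16777216'              # maximum socket send buffer size, in bytes
    net.ipv4.tcp_rmem: '4096 87380 16777216'   # minimum, default, and maximum TCP receive buffer, in bytes
    net.ipv4.tcp_wmem: '4096 65536 16777216'   # minimum, default, and maximum TCP send buffer, in bytes
    net.ipv4.tcp_tw_reuse: '1'                 # allow reuse of TIME-WAIT sockets for new outgoing connections
```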

Linux cgroup mode configuration options

The kubelet and the container runtime use Linux kernel cgroups for resource management such as limiting how much CPU or memory each container in a Pod can access. There are two versions of the cgroup subsystem in the kernel: cgroupv1 and cgroupv2. Kubernetes support for cgroupv2 was introduced as alpha in Kubernetes v1.18, beta in v1.22, and GA in v1.25. For more details, refer to the Kubernetes cgroups v2 documentation.

Node system configuration lets you customize the cgroup configuration of your node pools. You can use cgroupv1 or cgroupv2. By default, GKE uses cgroupv1. You can use node system configuration to change the setting for a node pool to cgroupv2 instead.

For example, to configure your node pool to use cgroupv2, use a node system configuration file such as:

linuxConfig:
  cgroupMode: 'CGROUP_MODE_V2'

The supported cgroupMode options are:

  • CGROUP_MODE_V1: Use cgroupv1 on the node pool.
  • CGROUP_MODE_V2: Use cgroupv2 on the node pool.
  • CGROUP_MODE_UNSPECIFIED: Use the default GKE cgroup configuration.

To use cgroupv2, the following requirements and limitations apply:

  • Your cluster and node pools must run GKE version 1.24.2-gke.300 or later.
  • You must use the Container-Optimized OS with containerd node image.
  • If any of your workloads depend on reading the cgroup filesystem (/sys/fs/cgroup/...), ensure they are compatible with the cgroupv2 API.
    • Ensure any monitoring or third-party tools are compatible with cgroupv2.
  • If you run Java workloads, use JDK 11.0.16 or later, or JDK 15 or later, which fully support cgroupv2.
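Because cgroupMode sits under linuxConfig alongside sysctl, one file can carry the cgroup mode together with the other settings described in this document. The following is a minimal sketch that assumes you want the kubelet and sysctl values from the first example plus cgroupv2:

```yaml
kubeletConfig:
  cpuManagerPolicy: static
linuxConfig:
  cgroupMode: 'CGROUP_MODE_V2'     # switch the node pool from the default cgroupv1
  sysctl:
    net.core.somaxconn: '2048'     # sysctl settings share the linuxConfig section
```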

What's next