Cluster node machine prerequisites

GKE on Bare Metal supports a wide variety of systems running on the hardware that the target operating system distributions support. A GKE on Bare Metal configuration can run on minimal hardware, or on multiple machines to provide flexibility, availability, and performance.

Regardless of your GKE on Bare Metal configuration, your nodes and clusters must have enough CPU, RAM, and storage resources to meet the needs of clusters and the workloads that you're running.

When you install GKE on Bare Metal, you can create different types of clusters:

  • A user cluster that runs workloads.
  • An admin cluster that creates and controls user clusters to run workloads.
  • A standalone cluster that can manage and run workloads, but can't create or manage user clusters.
  • A hybrid cluster that can manage and run workloads, and can also create and manage additional user clusters.

In addition to cluster type, you can choose from the following installation profiles in terms of resource requirements:

  • Default: The default profile has standard system resource requirements, and you can use it for all cluster types.

  • Edge: The edge profile has significantly reduced system resource requirements. Use of this profile is recommended for edge devices with limited resources. You can only use the edge profile for standalone clusters.

Resource requirements for all cluster types using the default profile

The following table describes the minimum and recommended hardware requirements that GKE on Bare Metal needs to operate and manage admin, hybrid, user, and standalone clusters using the default profile:

Resource        Minimum    Recommended
CPUs / vCPUs*   4 cores    8 cores
RAM             16 GiB     32 GiB
Storage         128 GiB    256 GiB

* GKE on Bare Metal supports CPUs and vCPUs from the x86 processor family only.

Resource requirements for standalone clusters using the edge profile

The following table describes the minimum and recommended hardware requirements that GKE on Bare Metal needs to operate and manage standalone clusters using the edge profile:

Resource        Minimum               Recommended
CPUs / vCPUs*   2 cores               4 cores
RAM             Ubuntu: 4 GiB         Ubuntu: 8 GiB
                CentOS/RHEL: 6 GiB    CentOS/RHEL: 12 GiB
Storage         128 GiB               256 GiB

* GKE on Bare Metal supports CPUs and vCPUs from the x86 processor family only.
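These minimums can be expressed as a quick preflight check. The following sketch is illustrative only, not a GKE on Bare Metal tool; the thresholds come from the edge-profile table above, and the helper name is hypothetical.

```python
# Illustrative preflight check against the edge-profile minimums above.
# The helper name and threshold table are hypothetical, not a GKE tool.
EDGE_MIN = {
    "cpus": 2,
    "ram_gib": {"ubuntu": 4, "rhel": 6},  # RAM minimum depends on the distro
    "storage_gib": 128,
}

def meets_edge_minimums(cpus, ram_gib, storage_gib, distro="ubuntu"):
    """Return True if a node satisfies the edge-profile minimums."""
    return (cpus >= EDGE_MIN["cpus"]
            and ram_gib >= EDGE_MIN["ram_gib"][distro]
            and storage_gib >= EDGE_MIN["storage_gib"])

# A 2-CPU, 4 GiB Ubuntu node meets the edge minimums...
print(meets_edge_minimums(2, 4, 128, "ubuntu"))  # True
# ...but the same node on CentOS/RHEL falls short of the 6 GiB RAM minimum.
print(meets_edge_minimums(2, 4, 128, "rhel"))    # False
```

The same shape of check applies to the default profile, with the 4-CPU, 16 GiB, 128 GiB thresholds from the earlier table.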

To configure standalone clusters using the edge profile, follow these best practices:

  • Run bmctl on a separate workstation. If you must run bmctl on the target cluster node, you need an additional 2 GiB of memory beyond the minimum requirements: for example, 6 GiB for Ubuntu and 8 GiB for CentOS/RHEL.

  • Set MaxPodsPerNode to 110, and ensure the cluster runs no more than 30 user pods per node on average. You might need extra resources for a higher MaxPodsPerNode configuration or to run more than 30 user pods per node.

  • Use containerd as the container runtime. You might need extra resources to run with the Docker container runtime.

  • Kubevirt components are not considered in this minimum resource configuration. Kubevirt requires additional resources depending on the number of VMs deployed in the cluster.
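Applied together, these best practices correspond to a handful of fields in the cluster configuration file. The fragment below is a sketch based on the typical bare metal Cluster spec; verify the field paths against the configuration reference for your release before using them.

```yaml
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: edge-standalone            # hypothetical cluster name
  namespace: cluster-edge-standalone
spec:
  type: standalone                 # the edge profile applies to standalone clusters only
  profile: edge                    # reduced system resource requirements
  nodeConfig:
    podDensity:
      maxPodsPerNode: 110          # recommended ceiling from the best practices above
    containerRuntime: containerd   # recommended runtime; Docker needs extra resources
```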

Additional storage requirements

GKE on Bare Metal doesn't provide any storage resources. You must provision and configure the required storage on your system.

For detailed storage requirements, see the Installation prerequisites overview.

For more information about how to configure the storage required, see Configuring storage for GKE on Bare Metal.

etcd performance

Disk speed is critical to etcd performance and stability. A slow disk increases etcd request latency, which can lead to cluster stability problems. We recommend that you use a solid-state disk (SSD) for your etcd store. The etcd documentation provides additional hardware recommendations for ensuring the best etcd performance when running your clusters in production.

To check your etcd and disk performance, use the following etcd I/O latency metrics in the Metrics Explorer:

  • etcd_disk_backend_commit_duration_seconds: the duration should be less than 25 milliseconds for the 99th percentile (p99).
  • etcd_disk_wal_fsync_duration_seconds: the duration should be less than 10 milliseconds for the 99th percentile (p99).
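As a sanity check on values you read from the Metrics Explorer, the two thresholds above can be encoded directly. The snippet below is an illustrative helper, not a GKE or etcd tool; the sample p99 readings are made up.

```python
# Thresholds from the etcd I/O latency guidance above (seconds at p99).
# Helper name and sample readings are illustrative only.
THRESHOLDS_P99_SECONDS = {
    "etcd_disk_backend_commit_duration_seconds": 0.025,  # < 25 ms
    "etcd_disk_wal_fsync_duration_seconds": 0.010,       # < 10 ms
}

def slow_disk_metrics(p99_readings):
    """Return the metrics whose observed p99 meets or exceeds its threshold."""
    return [name for name, value in p99_readings.items()
            if value >= THRESHOLDS_P99_SECONDS[name]]

# Example: a WAL fsync p99 of 18 ms indicates a disk that is too slow for etcd.
slow = slow_disk_metrics({
    "etcd_disk_backend_commit_duration_seconds": 0.012,
    "etcd_disk_wal_fsync_duration_seconds": 0.018,
})
print(slow)  # ['etcd_disk_wal_fsync_duration_seconds']
```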

For more information about etcd performance, see What does the etcd warning "apply entries took too long" mean? and What does the etcd warning "failed to send out heartbeat on time" mean?.

Node machine prerequisites

The node machines have the following prerequisites:

  • Their operating system is one of the supported Linux distributions. For more information, see Select your operating system.
  • The Linux kernel version is 4.17.0 or newer. Ubuntu 18.04 and 18.04.1 are on Linux kernel version 4.15 and therefore incompatible.
  • Meet the minimum hardware requirements.
  • Internet access.
  • Layer 3 connectivity to all other node machines.
  • Access to the control plane VIP.
  • Properly configured DNS nameservers.
  • No duplicate host names.
  • One of the following NTP services is enabled and working:
    • chrony
    • ntp
    • ntpdate
    • systemd-timesyncd
  • A working package manager: apt, dnf, etc.
  • On Ubuntu, you must disable Uncomplicated Firewall (UFW). Run systemctl stop ufw to disable UFW.
  • On Ubuntu and starting with GKE on Bare Metal 1.8.2, you aren't required to disable AppArmor. If you deploy clusters using earlier releases of GKE on Bare Metal, disable AppArmor with the following command: systemctl stop apparmor

  • If you choose Docker as your container runtime, you must have Docker version 19.03 or later installed. If you don't have Docker installed on your node machines, or have an older version installed, GKE on Bare Metal installs Docker 19.03.13 or later when you create clusters.

  • If you use the default container runtime, containerd, you don't need Docker, and installing Docker can cause issues. For more information, see the known issues.

  • Cluster creation only checks for the free space required for the GKE on Bare Metal system components. This gives you more control over the space you allocate for application workloads. Whenever you install GKE on Bare Metal, ensure that the file systems backing the following directories have the required capacity and meet the following requirements:

    • /: 17 GiB (18,253,611,008 bytes).
    • /var/lib/docker or /var/lib/containerd, depending on the container runtime:
      • 30 GiB (32,212,254,720 bytes) for control plane nodes.
      • 10 GiB (10,737,418,240 bytes) for worker nodes.
    • /var/lib/kubelet: 500 MiB (524,288,000 bytes).
    • /var/lib/etcd: 20 GiB (21,474,836,480 bytes, applicable to control plane nodes only).

    Regardless of cluster version, the preceding lists of directories can be on the same or different partitions. If they are on the same underlying partition, then the space requirement is the sum of the space required for each individual directory on that partition. For all release versions, the cluster creation process creates the directories, if needed.

  • The /var/lib/etcd and /etc/kubernetes directories are either nonexistent or empty.
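The note about shared partitions can be made concrete: when several of the listed directories live on the same file system, the free space that cluster creation checks for is the sum of their individual requirements. A small illustrative calculation, using the byte values from the list above for a control plane node (the partition layout and helper name are made-up examples):

```python
# Required free space per directory for a control plane node, in bytes,
# taken from the list above (containerd runtime assumed).
REQUIRED_BYTES = {
    "/": 18_253_611_008,                     # 17 GiB
    "/var/lib/containerd": 32_212_254_720,   # 30 GiB (control plane nodes)
    "/var/lib/kubelet": 524_288_000,         # 500 MiB
    "/var/lib/etcd": 21_474_836_480,         # 20 GiB (control plane nodes only)
}

def required_per_partition(mounts):
    """Sum the requirements for directories that share a backing partition.

    `mounts` maps each directory to the partition it lives on."""
    totals = {}
    for directory, partition in mounts.items():
        totals[partition] = totals.get(partition, 0) + REQUIRED_BYTES[directory]
    return totals

# Hypothetical layout: /var/lib/etcd on its own disk, everything else on one.
layout = {
    "/": "/dev/sda1",
    "/var/lib/containerd": "/dev/sda1",
    "/var/lib/kubelet": "/dev/sda1",
    "/var/lib/etcd": "/dev/sdb1",
}
totals = required_per_partition(layout)
print(totals["/dev/sda1"])  # 50990153728 bytes (about 47.5 GiB)
print(totals["/dev/sdb1"])  # 21474836480 bytes (20 GiB)
```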

In addition to the prerequisites for installing and running GKE on Bare Metal, customers are expected to comply with relevant standards governing their industry or business segment, such as PCI DSS requirements for businesses that process credit cards or Security Technical Implementation Guides (STIGs) for businesses in the defense industry.

Load balancer machines prerequisites

When your deployment doesn't have a specialized load balancer node pool, you can have worker nodes or control plane nodes form the load balancer node pool. In that case, those nodes have additional prerequisites:

  • Machines are in the same Layer 2 subnet.
  • All VIPs are in the load balancer nodes subnet and routable from the gateway of the subnet.
  • The gateway of the load balancer subnet must listen for gratuitous ARP messages and forward packets to the master load balancer.

What's next