Add and manage node pools


This page shows you how to add and perform operations on node pools in your Google Kubernetes Engine (GKE) Standard clusters. To learn about how node pools work, refer to About node pools.

Clusters can perform operations, such as node auto-provisioning, on multiple node pools in parallel. You can manually create, update, or delete a node pool while another node pool is already being created, updated, or deleted.

These instructions don't apply to Autopilot clusters, where GKE manages the nodes, and there are no node pools for you to manage. To learn more, see the Autopilot overview.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.

Set up IAM service accounts for GKE

GKE uses IAM service accounts that are attached to your nodes to run system tasks like logging and monitoring. At a minimum, these node service accounts must have the Kubernetes Engine Default Node Service Account (roles/container.defaultNodeServiceAccount) role on your project. By default, GKE uses the Compute Engine default service account, which is automatically created in your project, as the node service account.

To grant the roles/container.defaultNodeServiceAccount role to the Compute Engine default service account, complete the following steps:

Console

  1. Go to the Welcome page:

    Go to Welcome

  2. In the Project number field, click Copy to clipboard.
  3. Go to the IAM page:

    Go to IAM

  4. Click Grant access.
  5. In the New principals field, specify the following value:
    PROJECT_NUMBER-compute@developer.gserviceaccount.com
    Replace PROJECT_NUMBER with the project number that you copied.
  6. In the Select a role menu, select the Kubernetes Engine Default Node Service Account role.
  7. Click Save.

gcloud

  1. Find your Google Cloud project number:
    gcloud projects describe PROJECT_ID \
        --format="value(projectNumber)"

    Replace PROJECT_ID with your project ID.

    The output is similar to the following:

    12345678901
    
  2. Grant the roles/container.defaultNodeServiceAccount role to the Compute Engine default service account:
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member="serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
        --role="roles/container.defaultNodeServiceAccount"

    Replace PROJECT_NUMBER with the project number from the previous step.
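
  3. Optionally, confirm that the binding was added. The following is a minimal sketch using standard gcloud output flags; it prints the members that hold the role:

    gcloud projects get-iam-policy PROJECT_ID \
        --flatten="bindings[].members" \
        --filter="bindings.role:roles/container.defaultNodeServiceAccount" \
        --format="value(bindings.members)"

    If the output includes the Compute Engine default service account, the role is in place.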

Add a node pool to a Standard cluster

You can add a new node pool to a GKE Standard cluster using the gcloud CLI, the Google Cloud console, or Terraform. GKE also supports node auto-provisioning, which automatically manages the node pools in your cluster based on scaling requirements.

Best practice:

Create and use a minimally privileged Identity and Access Management (IAM) service account for your node pools instead of the Compute Engine default service account. For instructions to create a minimally privileged service account, refer to Hardening your cluster's security.

gcloud

To create a node pool, run the gcloud container node-pools create command:

gcloud container node-pools create POOL_NAME \
    --cluster CLUSTER_NAME \
    --service-account SERVICE_ACCOUNT

Replace the following:

  • POOL_NAME: the name of the new node pool.
  • CLUSTER_NAME: the name of your existing cluster.
  • SERVICE_ACCOUNT: the IAM service account for your nodes to use, in the format SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com. Replace SERVICE_ACCOUNT_NAME with the name of your service account.

    We strongly recommend that you specify a minimally privileged IAM service account for your nodes instead of the Compute Engine default service account. To learn how to create a minimally privileged service account, see Use a least privilege service account.

For a full list of optional flags you can specify, refer to the gcloud container node-pools create documentation.

The output is similar to the following:

Creating node pool POOL_NAME...done.
Created [https://container.googleapis.com/v1/projects/PROJECT_ID/zones/us-central1/clusters/CLUSTER_NAME/nodePools/POOL_NAME].
NAME: POOL_NAME
MACHINE_TYPE: e2-medium
DISK_SIZE_GB: 100
NODE_VERSION: 1.21.5-gke.1302

In this output, you see details about the node pool, such as the machine type and GKE version running on the nodes.

Occasionally, the node pool is created successfully but the gcloud command times out instead of reporting the status from the server. To check the status of all node pools, including those not yet fully provisioned, use the following command:

gcloud container node-pools list --cluster CLUSTER_NAME
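
As an illustration, the following sketch combines the create command with a few common optional flags. The pool name, machine type, node count, and disk size are illustrative values, not recommendations:

gcloud container node-pools create high-mem-pool \
    --cluster CLUSTER_NAME \
    --machine-type e2-highmem-4 \
    --num-nodes 2 \
    --disk-size 100 \
    --service-account SERVICE_ACCOUNT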

Console

To add a node pool to an existing Standard cluster, perform the following steps:

  1. Go to the Google Kubernetes Engine page in Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the Standard cluster you want to modify.

  3. Click Add node pool.

  4. Configure your node pool.

  5. Optionally, specify a custom IAM service account for your nodes:
    1. In the navigation menu, click Security.
    2. In the Service account menu, select your preferred service account.

    We strongly recommend that you specify a minimally-privileged IAM service account that your nodes can use instead of the Compute Engine default service account. To learn how to create a minimally-privileged service account, see Use a least privilege service account.

  6. Click Create to add the node pool.

Terraform

To add a node pool to an existing Standard cluster using Terraform, refer to the following example:

resource "google_container_node_pool" "default" {
  name    = "gke-standard-regional-node-pool"
  cluster = google_container_cluster.default.name

  node_config {
    service_account = google_service_account.default.email
  }
}

To learn more about using Terraform, see Terraform support for GKE.

View node pools in a Standard cluster

gcloud

To list all the node pools of a Standard cluster, run the gcloud container node-pools list command:

gcloud container node-pools list --cluster CLUSTER_NAME

To view details about a specific node pool, run the gcloud container node-pools describe command:

gcloud container node-pools describe POOL_NAME \
    --cluster CLUSTER_NAME

Replace the following:

  • CLUSTER_NAME: the name of the cluster.
  • POOL_NAME: the name of the node pool to view.
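
Both commands accept the standard gcloud --format flag. For example, the following sketch prints a compact table; the field paths follow the NodePool API resource and might vary across gcloud versions:

gcloud container node-pools list --cluster CLUSTER_NAME \
    --format="table(name,version,config.machineType)"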

Console

To view node pools for a Standard cluster, perform the following steps:

  1. Go to the Google Kubernetes Engine page in Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the Standard cluster.

  3. Click the Nodes tab.

  4. Under Node Pools, click the name of the node pool you want to view.

Scale a node pool

You can scale your node pools up or down to optimize for performance and cost. With GKE Standard node pools, you can scale a node pool horizontally by changing the number of nodes in the node pool, or scale a node pool vertically by changing the machine attribute configuration of the nodes.

Horizontally scale by changing the node count

gcloud

To resize a cluster's node pools, run the gcloud container clusters resize command:

gcloud container clusters resize CLUSTER_NAME \
    --node-pool POOL_NAME \
    --num-nodes NUM_NODES

Replace the following:

  • CLUSTER_NAME: the name of the cluster to resize.
  • POOL_NAME: the name of the node pool to resize.
  • NUM_NODES: the number of nodes in the pool in a zonal cluster. If you use multi-zonal or regional clusters, NUM_NODES is the number of nodes for each zone the node pool is in.

Repeat this command for each node pool. If your cluster has only one node pool, omit the --node-pool flag.
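
For example, the following sketch resizes a hypothetical default-pool to four nodes; the cluster and pool names are illustrative. In a regional cluster that spans three zones, this results in twelve nodes in total:

gcloud container clusters resize example-cluster \
    --node-pool default-pool \
    --num-nodes 4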

Console

To resize a cluster's node pools, perform the following steps:

  1. Go to the Google Kubernetes Engine page in Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the Standard cluster you want to modify.

  3. Click the Nodes tab.

  4. In the Node Pools section, click the name of the node pool that you want to resize.

  5. Click Resize.

  6. In the Number of nodes field, enter how many nodes that you want in the node pool, and then click Resize.

  7. Repeat for each node pool as needed.

Vertically scale by changing the node machine attributes

You can modify the node pool's configured machine type, disk type, and disk size.

When you edit one or more of these machine attributes, GKE updates the nodes to the new configuration using the upgrade strategy configured for the node pool. If you configure the blue-green upgrade strategy, you can migrate the workloads from the original nodes to the new nodes while retaining the ability to roll back to the original nodes if the migration fails. Inspect the upgrade settings of the node pool to ensure that the configured strategy is how you want your nodes to be updated.
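
To inspect the upgrade settings, you can use a sketch like the following; the upgradeSettings field path follows the NodePool API resource:

gcloud container node-pools describe POOL_NAME \
    --cluster CLUSTER_NAME \
    --format="value(upgradeSettings)"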

Update at least one of the machine attribute flags in the following command:

gcloud container node-pools update POOL_NAME \
    --cluster CLUSTER_NAME \
    --machine-type MACHINE_TYPE \
    --disk-type DISK_TYPE \
    --disk-size DISK_SIZE

Omit any flags for machine attributes that you don't want to change. However, you must use at least one machine attribute flag, as the command otherwise fails.

Replace the following:

  • POOL_NAME: the name of the node pool to resize.
  • CLUSTER_NAME: the name of the cluster to resize.
  • MACHINE_TYPE: the type of machine to use for nodes. To learn more, see gcloud container node-pools update.
  • DISK_TYPE: the type of the node VM boot disk: one of pd-standard, pd-ssd, or pd-balanced.
  • DISK_SIZE: the size of the node VM boot disk in GB. The default is 100 GB.
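
For example, the following sketch moves a node pool to a larger machine type and a bigger balanced boot disk. The values are illustrative, not recommendations:

gcloud container node-pools update POOL_NAME \
    --cluster CLUSTER_NAME \
    --machine-type e2-standard-4 \
    --disk-type pd-balanced \
    --disk-size 200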

This change requires recreating the nodes, which can cause disruption to your running workloads. For details about this specific change, find the corresponding row in the manual changes that recreate the nodes using a node upgrade strategy without respecting maintenance policies table. To learn more about node updates, see Planning for node update disruptions.

Upgrade a node pool

By default, a cluster's nodes have auto-upgrade enabled. Node auto-upgrades ensure that your cluster's control plane and node version remain in sync and in compliance with the Kubernetes version skew policy, which ensures that control planes are compatible with nodes up to two minor versions older than the control plane. For example, Kubernetes 1.29 control planes are compatible with Kubernetes 1.27 nodes.

Best practice:

Avoid disabling node auto-upgrades so that your cluster keeps the version sync and skew compliance described in the preceding paragraph.

With GKE node pool upgrades, you can choose between two configurable upgrade strategies: surge upgrades and blue-green upgrades.

Choose a strategy and use the parameters to tune the strategy to best fit your cluster environment's needs.
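
For example, the following sketch shows one way to set each strategy on an existing node pool; the parameter values are illustrative, not recommendations:

# Surge upgrades: create at most one extra node and keep every
# existing node available during the upgrade.
gcloud container node-pools update POOL_NAME \
    --cluster CLUSTER_NAME \
    --max-surge-upgrade 1 \
    --max-unavailable-upgrade 0

# Blue-green upgrades: soak the new nodes for 30 minutes before
# completing the upgrade.
gcloud container node-pools update POOL_NAME \
    --cluster CLUSTER_NAME \
    --enable-blue-green-upgrade \
    --node-pool-soak-duration 1800s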

How node upgrades work

While a node is being upgraded, GKE stops scheduling new Pods onto it, and attempts to schedule its running Pods onto other nodes. This is similar to other events that re-create the node, such as enabling or disabling a feature on the node pool.

During automatic or manual node upgrades, PodDisruptionBudgets (PDBs) and the Pod termination grace period are respected for a maximum of one hour. If Pods running on the node can't be scheduled onto new nodes after one hour, GKE initiates the upgrade anyway. This behavior applies even if you configure your PDBs to always keep all of your replicas available, for example by setting the maxUnavailable field to 0 or 0%, or by setting the minAvailable field to 100% or to the number of replicas. In all of these scenarios, GKE deletes the Pods after one hour so that the node deletion can happen.
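
For example, the following hypothetical PDB asks for zero voluntary disruption; during a node upgrade, GKE honors it only within the one-hour window described earlier:

kubectl create poddisruptionbudget nginx-pdb \
    --selector=app=nginx \
    --max-unavailable=0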

Best practice:

If a workload requires more flexibility with graceful termination, use blue-green upgrades, which provide settings for additional soak time to extend PDB checks beyond the one-hour default.

To learn more about what to expect during node termination in general, see the topic about Pods.


The upgrade is complete only when all nodes have been recreated and the cluster is in the desired state. When a newly upgraded node registers with the control plane, GKE marks the node as schedulable. New node instances run the desired Kubernetes version.

Manually upgrade a node pool

You can manually upgrade a node pool version to match the version of the control plane or to a previous version that is still available and is compatible with the control plane. You can manually upgrade multiple node pools in parallel, whereas GKE automatically upgrades only one node pool at a time.

When you manually upgrade a node pool, GKE removes any labels you added to individual nodes using kubectl. To avoid this, apply labels to node pools instead.

Before you manually upgrade your node pool, consider the following conditions:

  • Upgrading a node pool may disrupt workloads running in that node pool. To avoid this, you can create a new node pool with the desired version and migrate the workload. After migration, you can delete the old node pool.
  • If you upgrade a node pool that has an Ingress in an errored state, the instance group doesn't sync. To work around this issue, first check the status by using the kubectl get ingress command. If the instance group isn't synced, re-apply the manifest that you used to create the Ingress.

You can manually upgrade your node pools to a version compatible with the control plane, using the Google Cloud console or the Google Cloud CLI.

gcloud

The following variables are used in the commands in this section:

  • CLUSTER_NAME: the name of the cluster of the node pool to be upgraded.
  • NODE_POOL_NAME: the name of the node pool to be upgraded.
  • VERSION: the Kubernetes version to which the nodes are upgraded. For example, --cluster-version=1.7.2 or --cluster-version=latest.

Upgrade a node pool:

gcloud container clusters upgrade CLUSTER_NAME \
  --node-pool=NODE_POOL_NAME

To specify a different version of GKE on nodes, use the optional --cluster-version flag:

gcloud container clusters upgrade CLUSTER_NAME \
  --node-pool=NODE_POOL_NAME \
  --cluster-version VERSION

For more information about specifying versions, see Versioning.

For more information, refer to the gcloud container clusters upgrade documentation.
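
Before you upgrade, you might want to check the current node pool version and the versions available in your location. The following is a minimal sketch; the version field path follows the NodePool API resource:

# Current version of the node pool.
gcloud container node-pools describe NODE_POOL_NAME \
    --cluster CLUSTER_NAME \
    --format="value(version)"

# Versions currently offered for nodes and control planes.
gcloud container get-server-config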

Console

To upgrade a node pool using the Google Cloud console, perform the following steps:

  1. Go to the Google Kubernetes Engine page in Google Cloud console.

    Go to Google Kubernetes Engine

  2. Next to the cluster you want to edit, click Actions, then click Edit.

  3. On the Cluster details page, click the Nodes tab.

  4. In the Node Pools section, click the name of the node pool that you want to upgrade.

  5. Click Edit.

  6. Click Change under Node version.

  7. Select the desired version from the Node version drop-down list, then click Change.

It may take several minutes for the node version to change.

Deploy a Pod to a specific node pool

You can explicitly deploy a Pod to a specific node pool by using a nodeSelector in your Pod manifest. nodeSelector schedules Pods onto nodes with a matching label.

All GKE node pools have labels with the following format: cloud.google.com/gke-nodepool: POOL_NAME. Add this label to the nodeSelector field in your Pod as shown in the following example:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    cloud.google.com/gke-nodepool: POOL_NAME

For more information, see Assigning Pods to Nodes.
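
To confirm the scheduling behavior, you can list the nodes that carry the node pool label and check where the Pod landed. A minimal sketch, assuming the nginx Pod from the preceding manifest:

# Nodes that belong to the node pool.
kubectl get nodes -l cloud.google.com/gke-nodepool=POOL_NAME

# The NODE column shows where the Pod was scheduled.
kubectl get pod nginx -o wide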

As an alternative to a node selector, you can use node affinity. Use node affinity if you want a "soft" rule where the Pod attempts to meet the constraint but is still scheduled even if the constraint can't be satisfied. For more information, see Node affinity. You can also specify resource requests for the containers.

Downgrade a node pool

You can downgrade a node pool, for example, to mitigate an unsuccessful node pool upgrade. Review the limitations before downgrading a node pool.

Best practice:

Use the blue-green node upgrade strategy if you need to optimize for risk mitigation when node pool upgrades might impact your workloads. With this strategy, you can roll back an in-progress upgrade to the original nodes if the upgrade is unsuccessful.

  1. Set a maintenance exclusion for the cluster to prevent GKE from automatically upgrading the node pool again after you downgrade it, as shown in the sketch after these steps.
  2. To downgrade a node pool, specify an earlier version while following the instructions to Manually upgrade a node pool.
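
The following sketch covers step 1 and sets a one-week maintenance exclusion; the exclusion name and time window are illustrative placeholders to adapt:

gcloud container clusters update CLUSTER_NAME \
    --add-maintenance-exclusion-name block-upgrades-after-downgrade \
    --add-maintenance-exclusion-start 2024-01-01T00:00:00Z \
    --add-maintenance-exclusion-end 2024-01-08T00:00:00Z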

Delete a node pool

Deleting a node pool deletes the nodes and all running workloads, without respecting PodDisruptionBudget settings. To learn more about how this affects your workloads, including interactions with node selectors, see Deleting node pools.
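
If you need PDBs to be respected, one option is to cordon and drain the nodes yourself before deleting the pool. A minimal sketch, assuming kubectl access to the cluster:

# Cordon and drain every node in the pool (POOL_NAME as elsewhere on this page).
for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=POOL_NAME -o name); do
  kubectl cordon "$node"
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
done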

gcloud

To delete a node pool, run the gcloud container node-pools delete command:

gcloud container node-pools delete POOL_NAME \
    --cluster CLUSTER_NAME

Console

To delete a node pool, perform the following steps:

  1. Go to the Google Kubernetes Engine page in Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the Standard cluster you want to modify.

  3. Click the Nodes tab.

  4. In the Node Pools section, click Delete next to the node pool you want to delete.

  5. When prompted to confirm, click Delete.

Migrate nodes to a different machine type

To learn about different approaches for moving workloads between machine types, for example, to migrate to a newer machine type, see Migrate nodes to a different machine type.

Migrate workloads between node pools

To migrate workloads from one node pool to another node pool, see Migrate workloads between node pools. For example, you can use these instructions if you're replacing an existing node pool with a new node pool and you want to ensure that the workloads move to the new nodes from the existing nodes.

Troubleshoot

For troubleshooting information, see Troubleshoot Standard node pools and Troubleshoot node registration.

What's next