Starting in GKE on-prem version 1.3, you can create a group of nodes in your user cluster that all have the same configuration by defining a node pool in the configuration file of that cluster. You can then manage that pool of nodes separately without affecting any of the other nodes in the cluster. Learn more about node pools.
One or more node pools can be defined in the configuration file of any
user cluster. Creating a node pool creates additional nodes in the user cluster.
Node pool management, including creating, updating, and
deleting node pools in a user cluster, is done through modifying the nodePools
section of your configuration file and deploying those changes to your existing
cluster with the gkectl update cluster
command. Note that deleting a node pool
immediately removes the related nodes, regardless of whether any of those
nodes are running workloads.
Example node pool:
nodePools:
- name: pool-1
  cpus: 4
  memoryMB: 8192
  replicas: 5
Tip for new installations: Create your first user cluster and define your node pools in that cluster. Then use that cluster's configuration file to create additional user clusters with the same node pool settings.
Before you begin
Support:
Only user clusters version 1.3.0 or later are supported.
Node pools in admin clusters are unsupported.
The gkectl update cluster command currently has full support for updating node pools and adding static IPs. It also supports enabling cloud audit logging and enabling or disabling auto repair. All other changes in the configuration file are ignored.
While the nodes in a node pool can be managed separately from other nodes, the nodes of any cluster cannot be separately upgraded. All nodes are upgraded when you upgrade your clusters.
Resources:
Changes to node pool replicas are the only changes you can deploy without interrupting a node's workloads. Important: If you deploy any other node pool configuration change, the nodes in the node pool are recreated. Ensure that no such node pool is running a workload that must not be disrupted.
When you deploy your node pool changes, unwanted nodes are deleted after the desired ones are created or updated. One implication of this policy is that even if the total number of nodes remains the same before and after an update, more resources (for example, IP addresses) may be required during the update. You must verify that enough IP addresses are available for the peak usage.
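To make the IP headroom concrete, the peak address usage across an update can be estimated with a short calculation. The sketch below is illustrative, not the documented update algorithm: it assumes that a pool whose non-replica attributes changed briefly needs addresses for both its old and new nodes, since desired nodes are created before unwanted ones are deleted.

```python
def peak_ip_usage(pools_before, pools_after, recreated):
    """Estimate peak IP usage during a node pool update.

    pools_before / pools_after: dicts mapping pool name -> replica count.
    recreated: names of pools whose non-replica attributes changed, so
    their nodes are recreated.

    Assumption (illustrative only): a recreated pool transiently needs
    addresses for both its old and new generation of nodes.
    """
    peak = 0
    for name in set(pools_before) | set(pools_after):
        old = pools_before.get(name, 0)
        new = pools_after.get(name, 0)
        if name in recreated:
            peak += old + new      # both generations exist briefly
        else:
            peak += max(old, new)  # pure scale up/down
    return peak

# Changing cpus on pool-1 (5 replicas) recreates its nodes, so the
# cluster briefly needs 5 + 5 + 3 = 13 addresses:
before = {"pool-1": 5, "pool-2": 3}
after = {"pool-1": 5, "pool-2": 3}
print(peak_ip_usage(before, after, {"pool-1"}))  # 13
```

Even though the node count is unchanged before and after this update, 13 addresses must be available at the peak.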
Creating and updating node pools
You manage a node pool through modifying and deploying your user cluster's configuration file. You can create and deploy one or more node pools in a user cluster.
To create or update node pools:
In an editor, open the configuration file of the user cluster in which you want to create or update node pools.
Define one or more node pools in the nodePools section of the user cluster configuration file:
Configure the minimum required node pool attributes. You must specify the following attributes for each node pool:
nodePools.name: Specifies a unique name for the node pool. Updating this attribute recreates the node. Example: - name: pool-1
nodePools.cpus: Specifies how many CPUs are allocated to each worker node in the pool. Updating this attribute recreates the node. Example: cpus: 4
nodePools.memoryMB: Specifies how much memory, in megabytes, is allocated to each worker node in the pool. Updating this attribute recreates the node. Example: memoryMB: 8192
nodePools.replicas: Specifies the total number of worker nodes in the pool. The user cluster uses nodes across all the pools to run workloads. You can update this attribute without affecting any nodes or running workloads. Example: replicas: 5
Note that while some of the nodePools attributes are the same as those in the workernode (DHCP | Static IP) section of the old configuration file, the workernode section is still required in the old configuration file of every user cluster. You can't remove the workernode section or replace it with nodePools. In the new user cluster configuration file, there is no workernode section. You must define at least one node pool for a user cluster and ensure that there are enough untainted nodes to replace the default workernode pool of the old configuration files.
Example:
nodePools:
- name: pool-1
  cpus: 4
  memoryMB: 8192
  replicas: 5
See Examples for a sample user cluster configuration file with multiple node pools.
Configure optional node pool attributes. You can add labels and taints to your node pool configuration to steer node workloads. You can also define which vSphere Datastore is used by your node pool.
nodePools.labels: Specifies one or more key : value pairs to uniquely identify your node pools. The key and value must begin with a letter or number, and can contain letters, numbers, hyphens, dots, and underscores, up to 63 characters each. For detailed configuration information, see labels.
Important: You cannot specify the following keys for a label because they are reserved for use by GKE on-prem: kubernetes.io, k8s.io, and googleapis.com.
Example:
labels:
  key1: value1
  key2: value2
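The label rules above can be expressed as a quick check. This is a sketch of the rules exactly as stated here (begin with a letter or number; letters, numbers, hyphens, dots, and underscores; at most 63 characters; a few reserved keys), not Kubernetes' full label validation logic, which is stricter and also covers optional key prefixes.

```python
import re

# Sketch of the stated rules: first character is a letter or number,
# the rest may add hyphens, dots, and underscores, 63 characters max.
LABEL_PART = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]{0,62}$")

# Keys reserved for use by GKE on-prem.
RESERVED = ("kubernetes.io", "k8s.io", "googleapis.com")

def valid_label(key, value):
    if key in RESERVED:
        return False  # reserved key
    return bool(LABEL_PART.match(key)) and bool(LABEL_PART.match(value))

print(valid_label("key1", "value1"))      # True
print(valid_label("-bad", "value"))       # False: must start with a letter or number
print(valid_label("kubernetes.io", "x"))  # False: reserved key
```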
nodePools.taints: Specifies a key, value, and effect to define taints for your node pools. These taints correspond with the tolerations that you configure for your pods.
The key is required and the value is optional. Both must begin with a letter or number, and may contain letters, numbers, hyphens, dots, and underscores, up to 253 characters. Optionally, you can prefix a key with a DNS subdomain followed by a /. For example: example.com/my-app.
Valid effect values are: NoSchedule, PreferNoSchedule, or NoExecute.
For detailed configuration information, see taints.
Example:
taints:
- key: key1
  value: value1
  effect: NoSchedule
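For reference, a pod that should still be scheduled onto nodes carrying the taint above would declare a matching toleration. The following is a standard Kubernetes pod spec fragment; the pod and container names are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app        # illustrative name
spec:
  containers:
  - name: app         # illustrative container
    image: nginx
  tolerations:
  - key: "key1"
    operator: "Equal"
    value: "value1"
    effect: "NoSchedule"
```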
nodePools.bootDiskSizeGB: Specifies the size of the boot disk, in gigabytes, allocated to each worker node in the pool. This configuration is available starting from GKE on-prem version 1.5.0.
Example:
bootDiskSizeGB: 40
nodePools.vsphere.datastore: Specifies the vSphere Datastore on which each node in the pool is created. This overrides the default vSphere Datastore of the user cluster.
Example:
vsphere:
  datastore: datastore_name
See Examples for a configuration example with multiple node pools.
Use the gkectl update cluster command to deploy your changes to the user cluster:
gkectl update cluster --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] --config [USER_CLUSTER_CONFIG_FILE] --dry-run --yes
where:
- [ADMIN_CLUSTER_KUBECONFIG]: Specifies the kubeconfig file of your admin cluster.
- [USER_CLUSTER_CONFIG_FILE]: Specifies the configuration file of your user cluster.
- --dry-run: Optional flag. Add this flag to view the change only. No changes are deployed to the user cluster.
- --yes: Optional flag. Add this flag to run the command silently. The prompt to verify that you want to proceed is disabled.
If you aborted the command prematurely, you can run the same command again to complete the operation and deploy your changes to the user cluster.
If you need to revert your changes, revert them in the configuration file and then redeploy those changes to your user cluster.
Verify that the changes are successful by inspecting all the nodes. Run the following command to list all of the nodes in the user cluster:
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] get nodes -o wide
where [USER_CLUSTER_KUBECONFIG] is the kubeconfig file of your user cluster.
Deleting a node pool
To delete a node pool from a user cluster:
Remove its definition from the nodePools section of the user cluster configuration file.
Ensure that there are no workloads running on the affected nodes.
Deploy your changes by running the gkectl update cluster command:
gkectl update cluster --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] --config [USER_CLUSTER_CONFIG_FILE] --dry-run --yes
where:
- [ADMIN_CLUSTER_KUBECONFIG]: Specifies the kubeconfig file of your admin cluster.
- [USER_CLUSTER_CONFIG_FILE]: Specifies the configuration file of your user cluster.
- --dry-run: Optional flag. Add this flag to view the change only. No changes are deployed to the user cluster.
- --yes: Optional flag. Add this flag to run the command silently. The prompt to verify that you want to proceed is disabled.
Verify that the changes are successful by inspecting all the nodes. Run the following command to list all of the nodes in the user cluster:
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] get nodes -o wide
where [USER_CLUSTER_KUBECONFIG] is the kubeconfig file of your user cluster.
Examples
In the following example configuration, there are five node pools, each with different attributes:
- pool-1: only the minimum required attributes are specified
- pool-2: includes vsphere.datastore
- pool-3: includes bootDiskSizeGB
- pool-4: includes taints and labels
- pool-5: includes all attributes
apiVersion: v1
kind: UserCluster
...
# (Required) List of node pools. The total un-tainted replicas across all node pools
# must be greater than or equal to 3
nodePools:
- name: pool-1
  cpus: 4
  memoryMB: 8192
  replicas: 5
- name: pool-2
  cpus: 8
  memoryMB: 16384
  replicas: 3
  vsphere:
    datastore: my_datastore
- name: pool-3
  cpus: 8
  memoryMB: 8192
  replicas: 3
  bootDiskSizeGB: 40
- name: pool-4
  cpus: 4
  memoryMB: 8192
  replicas: 5
  taints:
  - key: "example-key"
    effect: NoSchedule
  labels:
    environment: production
    app: nginx
- name: pool-5
  cpus: 8
  memoryMB: 16384
  replicas: 3
  taints:
  - key: "my_key"
    value: my_value1
    effect: NoExecute
  labels:
    environment: test
  vsphere:
    datastore: my_datastore
  bootDiskSizeGB: 60
...
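The comment in the example requires at least 3 untainted replicas across all pools. Below is a minimal sketch of that check against the five example pools, assuming "un-tainted" simply means the pool has no taints entry:

```python
# Sketch of the constraint from the example's comment: the total number
# of replicas in pools that carry no taints must be at least 3.
def untainted_replicas(node_pools):
    return sum(p["replicas"] for p in node_pools if not p.get("taints"))

# Replica/taint data from the five example pools:
pools = [
    {"name": "pool-1", "replicas": 5},
    {"name": "pool-2", "replicas": 3},
    {"name": "pool-3", "replicas": 3},
    {"name": "pool-4", "replicas": 5, "taints": [{"key": "example-key"}]},
    {"name": "pool-5", "replicas": 3, "taints": [{"key": "my_key"}]},
]
print(untainted_replicas(pools))  # 11, which satisfies the >= 3 rule
```

pool-4 and pool-5 are excluded from the count because their taints keep ordinary workloads off those nodes.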
Troubleshooting
In general, the gkectl update cluster command provides specifics when it fails. If the command succeeded and you don't see the new nodes, you can troubleshoot with the Diagnosing cluster issues guide.
It is possible that cluster resources are insufficient, for example a lack of available IP addresses during node pool creation or update. See the Resizing a user cluster topic for details about verifying that IP addresses are available.
You can also review the general Troubleshooting guide.
Won't proceed past "Creating node MachineDeployment(s) in user cluster…"
It can take a while to create or update the node pools in your user cluster. However, if the wait time is extremely long and you suspect that something might have failed, you can run the following commands:
- Run kubectl get nodes to obtain the state of your nodes.
- For any nodes that are not ready, run kubectl describe node [node_name] to obtain details.