This document shows how to configure a timeout that limits how long a Pod disruption budget (PDB) violation can block the draining of a cluster node.
When a node is drained, all Pods on the node must be terminated. By default, if the termination of a Pod is in violation of a PDB, the draining of the node is blocked.
In some situations, you might want to configure a maximum time that the draining of a node can be blocked by a PDB violation. For example, you might want to configure a timeout value before you start a cluster update or upgrade. Or you might need to configure a timeout value for a node that is currently blocked from draining by a PDB violation.
Set a timeout value
Each node is represented by a Machine object.
List the Machine objects in the cluster:
kubectl --kubeconfig CLUSTER_KUBECONFIG get machines
Replace CLUSTER_KUBECONFIG with the path of the cluster kubeconfig file.
Example output:
my-node-pool-7f864959cd-cw472
my-node-pool-7f864959cd-kh86m
my-node-pool-7f864959cd-wtpvx
Open a Machine object for editing:
kubectl --kubeconfig CLUSTER_KUBECONFIG edit machine MACHINE_NAME
Replace MACHINE_NAME with the name of the Machine object.
In the editor, add this annotation under metadata.annotations:
onprem.cluster.gke.io/pdb-violation-timeout: "TIMEOUT"
Replace TIMEOUT with a string that specifies the duration of the timeout. Valid time units are "s", "m", "h". Examples of time values are "1h", "1h30m", "10m", and "100s".
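Before you edit the Machine object, it can help to check that the value you plan to use has the expected shape. The following is a minimal sketch, not an official validator: the function name is_valid_timeout is hypothetical, and the pattern assumes that hours, minutes, and seconds appear in that order as whole numbers, which matches the examples above but might be stricter than what the cluster actually accepts.

```shell
# Hypothetical helper: returns 0 if the argument looks like a timeout such as
# "1h", "1h30m", "10m", or "100s", and returns 1 otherwise.
# Assumption: units appear in the order h, m, s, each with a whole number.
is_valid_timeout() {
  # Reject the empty string, which the pattern below would otherwise accept.
  [ -n "$1" ] || return 1
  printf '%s\n' "$1" | grep -Eq '^([0-9]+h)?([0-9]+m)?([0-9]+s)?$'
}
```

For example, is_valid_timeout "1h30m" succeeds, while is_valid_timeout "30" fails because the value has no unit.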
If you set the timeout value to "0s", then PDB violations will never time out. This is the same as the default behavior.
Example:
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  annotations:
    kubelet-version: 1.23.5-gke.1502
    onprem.cluster.gke.io/gke-on-prem-version: 1.12.0-gke.430
    vm-ip-address: 203.0.113.2
    onprem.cluster.gke.io/pdb-violation-timeout: "5m"
Close the editing session.
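As a non-interactive alternative to the editing steps above, the same annotation can likely be set with kubectl annotate. This is a minimal sketch, not the documented procedure: the function name set_pdb_timeout and its argument order are hypothetical, and it assumes that the annotation has the same effect whether it is added in the editor or with kubectl annotate.

```shell
# Hypothetical helper (not part of the product): sets the PDB violation
# timeout annotation on one Machine object without opening an editor.
# Usage: set_pdb_timeout CLUSTER_KUBECONFIG MACHINE_NAME TIMEOUT
set_pdb_timeout() {
  kubeconfig="$1"
  machine="$2"
  timeout="$3"
  # --overwrite replaces an existing annotation value instead of failing.
  kubectl --kubeconfig "$kubeconfig" annotate machine "$machine" \
    "onprem.cluster.gke.io/pdb-violation-timeout=$timeout" --overwrite
}
```

For example, set_pdb_timeout CLUSTER_KUBECONFIG my-node-pool-7f864959cd-cw472 5m would request a five-minute timeout on that Machine object.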
During a rolling update, a new surge machine is created first. Then the old node is drained, and after all the Pods on it have been evicted, the old Machine object and Node object are both deleted. The PDB violation timeout annotation does not persist on the newly created Machine object.
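Because the annotation does not carry over to new Machine objects, it can be useful to check which Machine objects currently have it. The following sketch lists each Machine name next to its timeout value, leaving the value empty where the annotation is absent. The function name list_pdb_timeouts is hypothetical; the command relies on the JSONPath output of kubectl, with the dots in the annotation key escaped.

```shell
# Hypothetical helper: prints one line per Machine object in the form
# NAME<TAB>TIMEOUT, where TIMEOUT is empty if the annotation is not set.
# Usage: list_pdb_timeouts CLUSTER_KUBECONFIG
list_pdb_timeouts() {
  kubeconfig="$1"
  # Dots inside the annotation key are escaped so that JSONPath treats
  # the key as a single field name.
  kubectl --kubeconfig "$kubeconfig" get machines \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.onprem\.cluster\.gke\.io/pdb-violation-timeout}{"\n"}{end}'
}
```

Machine objects that show an empty value would need the annotation to be set again, by following the steps in this document, if a timeout is still wanted.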