Graceful node shutdown in GKE on Azure
======================================

Last updated (UTC): 2025-09-04.

Starting with version 1.26, GKE on Azure automatically enables
[Graceful Node Shutdown](https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta).
This feature manages the graceful termination of Pods during node shutdowns.
Graceful termination lets Pods save their state and release resources before
the node shuts down, which minimizes the risk of data loss. It also minimizes
the risk of interruptions to other Pods and services that rely on or interact
with the Pods being shut down, which makes your clusters more resilient.

How it works
------------

An event such as scheduled maintenance, node scaling, or a hardware issue
triggers a node shutdown. The `kubelet` component detects the event and
initiates the graceful node termination process by instructing `systemd` to
delay the system shutdown for a specified duration. This delay gives the node
time to drain and evict the Pods running on it.

The goal of graceful node termination is to gracefully terminate both non-system
and critical system Pods before the node shuts down. The following default
settings are used:

- `ShutdownGracePeriod`: 30 seconds
- `ShutdownGracePeriodCriticalPods`: 15 seconds

`ShutdownGracePeriod` is the total delay, and `ShutdownGracePeriodCriticalPods`
is the part of that delay reserved for critical Pods. These settings therefore
give non-system Pods 15 seconds (30 minus 15) to gracefully terminate before
they are forcibly stopped. Critical system Pods then have 15 seconds to shut
down after the non-system Pods have terminated. However, because the feature
operates on a best-effort basis, a node might not be able to shut down
gracefully within the designated 30-second period.

Triggers and limitations
------------------------

Events that trigger graceful node shutdown include planned events such as the
following:

- User-commanded shutdowns
- Termination of instances
- Scheduled maintenance
- Scaling down a cluster

In these scenarios, the `kubelet` detects the node shutdown event and initiates
the graceful node shutdown process.

In contrast, graceful node shutdown can't be activated when the shutdown
doesn't trigger the `systemd` inhibitor lock mechanism that the `kubelet`
component relies on. Examples of those kinds of situations include the
following:

- Network disconnections
- Hardware malfunctions
- Insufficient resources such as memory or CPU
- Unexpected power outages

In these cases, the node might shut down abruptly, potentially causing
disruptions or data loss.
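As a rough illustration of the default timing described under "How it works", the following Python sketch splits the 30-second shutdown window into its two phases and estimates how long a given Pod can expect before it is forcibly stopped. The `ShutdownBudget` class and its methods are hypothetical names used only for this example; they are not part of GKE on Azure or the `kubelet`. The sketch also assumes that a Pod's own `terminationGracePeriodSeconds` is capped by the budget of its phase, which reflects upstream Kubernetes behavior as we understand it rather than anything stated on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShutdownBudget:
    """Hypothetical model of the two shutdown settings, in seconds."""

    grace_period: int = 30          # ShutdownGracePeriod (total window)
    critical_pods_period: int = 15  # ShutdownGracePeriodCriticalPods

    def __post_init__(self) -> None:
        # The critical-Pod phase is carved out of the total window.
        if not 0 <= self.critical_pods_period <= self.grace_period:
            raise ValueError("critical Pod period must fit inside the total window")

    @property
    def regular_pods_period(self) -> int:
        # Non-system Pods get whatever the critical Pods do not reserve.
        return self.grace_period - self.critical_pods_period

    def effective_grace(self, pod_grace: int, critical: bool = False) -> int:
        """Seconds a Pod can expect before it is forcibly stopped.

        Assumption: the Pod's own terminationGracePeriodSeconds is capped
        by the budget of the phase the Pod belongs to.
        """
        phase = self.critical_pods_period if critical else self.regular_pods_period
        return min(pod_grace, phase)


if __name__ == "__main__":
    budget = ShutdownBudget()
    print(budget.regular_pods_period)                 # 15
    print(budget.effective_grace(60))                 # 15
    print(budget.effective_grace(10))                 # 10
    print(budget.effective_grace(60, critical=True))  # 15
```

Under these assumptions, a Pod that requests 60 seconds to terminate still gets only 15 seconds during a node shutdown, so any state-saving logic should aim to finish well inside that window.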