[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Translation issue","translationIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated (UTC): 2025-09-04."],[],[],null,["# Configure surge updates of node pools\n\nThis document describes how to enable and manage surge updates of node pools.\nFor information about how surge updates of node pools work, see\n[About surge updates](/kubernetes-engine/multi-cloud/docs/aws/concepts/about-surge-updates).\n\nThings to consider before running surge updates\n-----------------------------------------------\n\nBefore running a surge update, keep in mind the following:\n\n- The additional instances created during a surge update can exceed your AWS instance quota. If you don't have enough quota and these instances can't be provisioned, the update might fail.\n- Even if `max-unavailable-update` is set to 0, workloads can still be disrupted as Pods are evicted and rescheduled onto the newer nodes.\n- The maximum number of nodes that can be updated simultaneously is equal to the sum of `max-surge-update` and `max-unavailable-update`, and is limited to 20.\n\nEnable and configure surge updates\n----------------------------------\n\nTo enable surge updates, contact\n[Google Cloud Support](/kubernetes-engine/multi-cloud/docs/aws/getting-support). 
After the support\nteam enables the feature, you can assign values to the `max-surge-update`\nand `max-unavailable-update` parameters when creating or updating your node\npool:\n\n### Create\n\n    gcloud container aws node-pools create \u003cvar translate=\"no\"\u003eNODE_POOL_NAME\u003c/var\u003e \\\n      --cluster \u003cvar translate=\"no\"\u003eCLUSTER_NAME\u003c/var\u003e \\\n      --location \u003cvar translate=\"no\"\u003eGOOGLE_CLOUD_LOCATION\u003c/var\u003e \\\n      --max-surge-update \u003cvar translate=\"no\"\u003eMAX_SURGE\u003c/var\u003e \\\n      --max-unavailable-update \u003cvar translate=\"no\"\u003eMAX_UNAVAILABLE\u003c/var\u003e\n\n### Update\n\n    gcloud container aws node-pools update \u003cvar translate=\"no\"\u003eNODE_POOL_NAME\u003c/var\u003e \\\n      --cluster \u003cvar translate=\"no\"\u003eCLUSTER_NAME\u003c/var\u003e \\\n      --location \u003cvar translate=\"no\"\u003eGOOGLE_CLOUD_LOCATION\u003c/var\u003e \\\n      --max-surge-update \u003cvar translate=\"no\"\u003eMAX_SURGE\u003c/var\u003e \\\n      --max-unavailable-update \u003cvar translate=\"no\"\u003eMAX_UNAVAILABLE\u003c/var\u003e\n\nReplace the following:\n\n- \u003cvar translate=\"no\"\u003eNODE_POOL_NAME\u003c/var\u003e: the name of the node pool to create or update.\n- \u003cvar translate=\"no\"\u003eCLUSTER_NAME\u003c/var\u003e: the name of the cluster.\n- \u003cvar translate=\"no\"\u003eGOOGLE_CLOUD_LOCATION\u003c/var\u003e: the [supported Google Cloud region](/kubernetes-engine/multi-cloud/docs/aws/reference/supported-regions) that manages your cluster. For example, `us-west1`.\n- \u003cvar translate=\"no\"\u003eMAX_SURGE\u003c/var\u003e: the maximum number of additional nodes that can be temporarily created beyond the current node pool size during an update. By adjusting this value, you can control how many nodes are updated simultaneously. The default setting is 1, but you can set it to 0. 
If you set `max-surge-update` to a value greater than 0, GKE on AWS creates surge nodes; setting it to 0 prevents their creation.\n- \u003cvar translate=\"no\"\u003eMAX_UNAVAILABLE\u003c/var\u003e: the maximum number of nodes that can be unavailable simultaneously during the update process. Increasing this value lets more nodes be updated simultaneously. The default value is 0, but you can increase it.\n\nCheck surge update settings on a node pool\n------------------------------------------\n\nTo see the surge update settings of a node pool, run the following command:\n\n    gcloud alpha container aws node-pools describe \u003cvar translate=\"no\"\u003eNODE_POOL_NAME\u003c/var\u003e \\\n      --cluster \u003cvar translate=\"no\"\u003eCLUSTER_NAME\u003c/var\u003e \\\n      --location \u003cvar translate=\"no\"\u003eGOOGLE_CLOUD_LOCATION\u003c/var\u003e\n\nReplace the following:\n\n- \u003cvar translate=\"no\"\u003eNODE_POOL_NAME\u003c/var\u003e: the name of your node pool.\n- \u003cvar translate=\"no\"\u003eCLUSTER_NAME\u003c/var\u003e: the name of the cluster.\n- \u003cvar translate=\"no\"\u003eGOOGLE_CLOUD_LOCATION\u003c/var\u003e: the [supported Google Cloud region](/kubernetes-engine/multi-cloud/docs/aws/reference/supported-regions) that manages your cluster. For example, `us-west1`.\n\nIf the node pool has surge updates enabled, the output from this command\ndisplays a section labeled `surge_settings`. This `surge_settings` section\ndisplays the values of the `max_surge` and `max_unavailable` parameters.\n\nManage surge updates that are in progress\n-----------------------------------------\n\nYou can cancel an ongoing surge update, perform a rollback of a surge update\nthat failed, or resume an update that's been interrupted.\n\n### Cancel (pause) and resume a surge update\n\nIn GKE on AWS, \"canceling\" a surge update actually means pausing it. 
For\ndetails about how to cancel an update, see\n[Cancel an update operation](/kubernetes-engine/multi-cloud/docs/aws/how-to/update-node-pool#cancel_an_update_operation).\n\nCanceling a surge update doesn't roll back the update. Instead,\nit might leave the node pool in a partially updated state with two autoscaling\ngroups: one with nodes running the previous configuration and one with nodes\nrunning the new configuration. To resolve this state, resume the surge\nupdate by running the update command again, using the same target parameters as\nthe interrupted operation. You can't start an update with different node pool\nparameters until the previous update finishes.\n\n### Perform rollback of failed surge update\n\nYou can roll back a node pool to its original state if a surge update\nwas canceled or failed.\n\n#### Things to consider before rolling back a surge update\n\n- You can roll back a surge-enabled node pool only if it is in a partially updated state (or the `DEGRADED` state).\n- After a rollback starts on a node pool, you can't cancel it.\n- You can't perform other update operations until the rollback operation finishes successfully.\n- You can retry a rollback only if it fails.\n- You can't roll back node pools after they have been successfully updated.\n\n#### How to perform a rollback of a failed surge update\n\nTo roll back an unsuccessful update operation on the node pool, run the following\ncommand:\n\n    gcloud container aws node-pools rollback \u003cvar translate=\"no\"\u003eNODE_POOL_NAME\u003c/var\u003e \\\n      --cluster \u003cvar translate=\"no\"\u003eCLUSTER_NAME\u003c/var\u003e\n\nReplace the following:\n\n- \u003cvar translate=\"no\"\u003eNODE_POOL_NAME\u003c/var\u003e: the name of the node pool to roll back.\n- \u003cvar translate=\"no\"\u003eCLUSTER_NAME\u003c/var\u003e: the name of the cluster.\n\n#### How the rollback works\n\nInitiating a rollback internally starts a new update operation on the 
node pool.\nThis operation runs within the system and doesn't require your intervention. It reverts the node pool\nnodes to their original state on a best-effort basis.\n\nThe nodes belonging to the old autoscaling group are uncordoned, and the\ncluster autoscaler of this group is enabled to allow workloads to be scheduled\non the nodes. Partially updated node pool nodes in the new autoscaling group are\ncordoned, drained, and terminated based on the surge settings you defined\nin your initial surge update attempt.\n\n### Manage unsuccessful surge updates\n\nYou have three options to address a failed update:\n\n1. Continue the update: Resume the failed update by using the same target node pool settings as the initial unsuccessful attempt.\n2. Roll back: Use the rollback command to revert the node pool to its original state.\n3. Modify and restart: If you want to change the parameters for the surge update, you must delete the existing node pool and then recreate it with the new settings. For instructions about how to delete a node pool, see [Delete a node pool](/kubernetes-engine/multi-cloud/docs/aws/how-to/delete-node-pool)."]]