This page shows you how to manage GKE cluster upgrades using rollout sequencing. To learn more, see About cluster upgrades with rollout sequencing.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
- Ensure that you have enabled the required APIs for fleets. These APIs must be enabled in your fleet host projects to create any type of rollout sequence.
- Ensure that you have enabled GKE Enterprise in your fleet host projects if you want to create a team-based rollout sequence (Preview).
- For Terraform instructions, ensure that you use version 5.13.0 or later of the google provider.
Required roles
- Ensure that you have the required IAM permissions for cluster registration. You must grant the following permissions:
- Cluster registration permissions, in your fleet host projects.
- Cluster admin permissions for any GKE clusters to be registered.
- Cross-project cluster registration permissions for any GKE clusters to be registered to a fleet in a different project.
Configure a rollout sequence
This document explains how to create a rollout sequence using groups of clusters organized by fleets or team scopes. This document uses the term group to refer to both fleets and team scopes, because you can create a rollout sequence organized with either grouping method.
You can create a sequence of up to three groups of clusters, and you can choose how much soak testing time you want after cluster upgrades are complete in a group (maximum 30 days). You can include both Autopilot and Standard clusters.
To create a rollout sequence, your clusters must be organized into groups of either fleets or team scopes. For guidance on how to organize your clusters, see the community bank example. After they are organized into groups, you can create a rollout sequence by defining the upstream group relationships and each group's soak time. Upstream, in a rollout sequence, refers to the previous group, and downstream refers to the next group.
Organize your clusters into groups
In a rollout sequence, all clusters in all groups must be enrolled in the same release channel and be on the same minor version. If these requirements are not met and there are version discrepancies between clusters, this can cause issues with the version rollout. For more information, see Rollout eligibility.
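Before you group clusters, it can help to confirm that their versions share a minor version. The following is a minimal shell sketch, not an official tool: minor_of and same_minor are hypothetical helper names, and the version strings are example values only. Collecting each cluster's version is left out.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the gcloud CLI.

# Print the major.minor part of a GKE version such as 1.31.6-gke.500.
minor_of() {
  echo "$1" | cut -d. -f1,2
}

# Return 0 if all versions passed as arguments share one minor version.
same_minor() {
  local first v
  first="$(minor_of "$1")"
  for v in "$@"; do
    if [ "$(minor_of "$v")" != "$first" ]; then
      return 1
    fi
  done
  return 0
}

# Example values only.
if same_minor 1.31.6-gke.500 1.31.4-gke.100; then
  echo "same minor version"
else
  echo "minor versions differ"
fi
```

This check covers only the minor version. It does not verify the release channel, which must also match.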
You can create rollout sequences between fleets, or rollout sequences between a team's team scopes (Preview).
As you saw in About cluster upgrades with rollout sequencing, team scopes are an enterprise fleet-level construct for associating subsets of fleet clusters with specific application teams. You must enable GKE Enterprise to use team scopes. The following limitations apply when using or creating team scopes for rollout sequencing:
- Team-based sequences require single-tenancy clusters: in other words, each individual cluster is only associated with a single team. Shared clusters (which are supported in general fleet team management) are not supported for rollout sequencing.
- Each team scope must be in a different fleet to create a rollout sequence between them. Creating a rollout sequence between different team scopes within the same fleet is unsupported.
If you have already organized your clusters into groups, you can skip the following steps and proceed to Create a rollout sequence.
Fleets
To create a fleet-based rollout sequence, first you must group your clusters into fleets. You can organize your clusters by deployment environments such as Testing, Staging, and Production, as shown in the example fleet-based rollout sequence.
Register each cluster with a fleet based on your chosen grouping.
Teams
To create a team-based rollout sequence, you must group your clusters into team scopes. To do so, first you organize your clusters into fleets by deployment environments such as Testing, Staging, and Production, as shown in the example scope-based rollout sequence. Then, you can further subdivide your clusters into scopes for different teams' clusters.
- For each cluster in the sequence, register your cluster with a fleet. The cluster should be registered to the fleet in the project where you will create the team scope for this cluster. If you want to register a cluster to a fleet in a different host project, ensure you have set the necessary permissions for cross-project registration.
- Create two or three team scopes to organize your clusters. Create each scope in the host project of the team's respective fleet. You can have up to three team scopes in a rollout sequence.
See the reference for gcloud container fleet scopes create for a complete list of flags. When you run the create command, you can also pass the flags described in the instructions to create a rollout sequence.
Create a rollout sequence
A rollout sequence is organized as a linked list with up to three elements.
When you create a rollout sequence, you set the following properties for each group of clusters, either a fleet or team scope:
- Upstream group: The upstream fleet or team scope, which qualifies new versions for the downstream group. You don't set an upstream group for the first group in a sequence.
- Soak time: The soak time for a group is the time between when upgrades complete (or rollout has taken 30 days) and when upgrades can begin on the downstream group. To learn more, see How version qualification works in a rollout sequence.
Fleets - gcloud
The following instructions use the gcloud container fleet clusterupgrade update command; however, you can set the same properties with the gcloud container fleet clusterupgrade create command.
For each of the following commands, replace SOAK_TIME with the soak time for the fleet you are updating.
Create a rollout sequence:
Set the soak time for the first fleet in the sequence:
gcloud container fleet clusterupgrade update \
    --default-upgrade-soaking=SOAK_TIME \
    --project=FIRST_FLEET_PROJECT_ID
Replace FIRST_FLEET_PROJECT_ID with the project ID of the fleet host project.
Set the upstream fleet and the soak time for the second fleet in the sequence:
gcloud container fleet clusterupgrade update \
    --upstream-fleet=FIRST_FLEET_PROJECT_ID \
    --default-upgrade-soaking=SOAK_TIME \
    --project=SECOND_FLEET_PROJECT_ID
Replace FIRST_FLEET_PROJECT_ID with the project ID of the first fleet's host project, and SECOND_FLEET_PROJECT_ID with the project ID of the second fleet's host project.
Optional: If you want to have three fleets in a rollout sequence, set the upstream fleet and the soak time for the third fleet in the sequence:
gcloud container fleet clusterupgrade update \
    --upstream-fleet=SECOND_FLEET_PROJECT_ID \
    --default-upgrade-soaking=SOAK_TIME \
    --project=THIRD_FLEET_PROJECT_ID
Replace SECOND_FLEET_PROJECT_ID with the project ID of the second fleet's host project, and THIRD_FLEET_PROJECT_ID with the project ID of the third fleet's host project.
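The chain of upstream relationships in the preceding commands can be sketched as a loop. This is a minimal dry-run sketch under stated assumptions: print_fleet_sequence is a hypothetical helper, the project IDs are made up, and it applies one soak time to every fleet for brevity. It only prints commands that use the flags shown in this section; it does not run them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the commands that link fleets
# into a rollout sequence.
# Arguments: SOAK_TIME, then two or three fleet host project IDs in order.
print_fleet_sequence() {
  local soak="$1"
  shift
  if [ "$#" -lt 2 ] || [ "$#" -gt 3 ]; then
    echo "expected two or three fleet host project IDs" >&2
    return 1
  fi
  local upstream="" project
  for project in "$@"; do
    if [ -z "$upstream" ]; then
      # First fleet in the sequence: no upstream fleet is set.
      echo "gcloud container fleet clusterupgrade update --default-upgrade-soaking=${soak} --project=${project}"
    else
      echo "gcloud container fleet clusterupgrade update --upstream-fleet=${upstream} --default-upgrade-soaking=${soak} --project=${project}"
    fi
    # The current fleet becomes the upstream of the next one.
    upstream="$project"
  done
}

# Example with hypothetical project IDs.
print_fleet_sequence 7d test-fleet-project staging-fleet-project prod-fleet-project
```

Review the printed commands before running them, and adjust the soak time per fleet if your groups need different values.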
Fleets - console
Go to the Rollout Sequencing page in the Google Cloud console.
Click Create rollout sequence.
In the Create a rollout sequence pane, select the first two fleets in the sequence:
- In the Fleet 1 section, select the first fleet in the sequence.
- In the Soak time for upstream fleet section, set the soak time for the first fleet using the Days, Hours, and Minutes fields.
- In the Fleet 2 section, select the second fleet in the sequence.
- Click Create.
Optional: If you want to have three fleets in this rollout sequence, do the following additional steps:
- In the Rollout graph, click the element for the second fleet.
- Click Add downstream fleet.
- In the Soak time for upstream fleet section, set the soak time for the second fleet using the Days, Hours, and Minutes fields.
- In the Next fleet in the sequence section, select the third fleet in the sequence.
- Click Save.
Fleets - Terraform
This section shows you how to create a fleet-based sequence using Terraform. You can also use this resource to update the sequence. To learn more, see the reference documentation for google_gke_hub_feature.
For each of the following blocks, replace SOAK_TIME with the soak time for the fleet you are updating.
Create a rollout sequence:
Add the following block to your Terraform configuration to set the soak time for the first fleet in the sequence:
resource "google_gke_hub_feature" "feature" {
  name     = "clusterupgrade"
  location = "global"
  spec {
    clusterupgrade {
      upstream_fleets = []
      post_conditions {
        soaking = "SOAK_TIME"
      }
    }
  }
  project = "FIRST_FLEET_PROJECT_ID"
}
Replace FIRST_FLEET_PROJECT_ID with the project ID of the fleet host project.
Add the following block to your Terraform configuration to set the upstream fleet and the soak time for the second fleet in the sequence:
resource "google_gke_hub_feature" "feature" {
  name     = "clusterupgrade"
  location = "global"
  spec {
    clusterupgrade {
      upstream_fleets = ["FIRST_FLEET_PROJECT_ID"]
      post_conditions {
        soaking = "SOAK_TIME"
      }
    }
  }
  project = "SECOND_FLEET_PROJECT_ID"
}
Replace FIRST_FLEET_PROJECT_ID with the project ID of the first fleet's host project, and SECOND_FLEET_PROJECT_ID with the project ID of the second fleet's host project.
Optional: If you want to have three fleets in a rollout sequence, add the following block to your Terraform configuration to set the upstream fleet and the soak time for the third fleet in the sequence:
resource "google_gke_hub_feature" "feature" {
  name     = "clusterupgrade"
  location = "global"
  spec {
    clusterupgrade {
      upstream_fleets = ["SECOND_FLEET_PROJECT_ID"]
      post_conditions {
        soaking = "SOAK_TIME"
      }
    }
  }
  project = "THIRD_FLEET_PROJECT_ID"
}
Replace SECOND_FLEET_PROJECT_ID with the project ID of the second fleet's host project, and THIRD_FLEET_PROJECT_ID with the project ID of the third fleet's host project.
Teams - gcloud
You can set these properties when you create or update a team scope. The following instructions use the gcloud container fleet scopes update command; however, you can set the same properties when you create a team scope with the gcloud container fleet scopes create command.
For each of these commands, replace the following:
- The variables with the respective team scope's name or the team scope's fleet host project ID.
- SOAK_TIME with the soak time for the team scope you are updating.
Create a rollout sequence:
Set the soak time for the first scope in the sequence:
gcloud container fleet scopes update projects/FIRST_SCOPE_PROJECT_ID/locations/global/scopes/FIRST_SCOPE_NAME \
    --default-upgrade-soaking=SOAK_TIME \
    --project=FIRST_SCOPE_PROJECT_ID
Set the upstream scope and the soak time for the second scope in the sequence:
gcloud container fleet scopes update projects/SECOND_SCOPE_PROJECT_ID/locations/global/scopes/SECOND_SCOPE_NAME \
    --upstream-scope=projects/FIRST_SCOPE_PROJECT_ID/locations/global/scopes/FIRST_SCOPE_NAME \
    --default-upgrade-soaking=SOAK_TIME \
    --project=SECOND_SCOPE_PROJECT_ID
Optional: If you want to have three team scopes in a rollout sequence, set the upstream scope for the third scope in the sequence:
gcloud container fleet scopes update projects/THIRD_SCOPE_PROJECT_ID/locations/global/scopes/THIRD_SCOPE_NAME \
    --upstream-scope=projects/SECOND_SCOPE_PROJECT_ID/locations/global/scopes/SECOND_SCOPE_NAME \
    --default-upgrade-soaking=SOAK_TIME \
    --project=THIRD_SCOPE_PROJECT_ID
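Each of the preceding commands repeats the full scope resource name, so a typo in one path is easy to make. The following sketch builds the name in one place. It is an illustration only: scope_resource is a hypothetical helper, and the project IDs, scope names, and the 7d soak time are made-up example values. The final line prints a command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the full resource name of a team scope
# from its fleet host project ID and scope name.
scope_resource() {
  local project_id="$1" scope_name="$2"
  echo "projects/${project_id}/locations/global/scopes/${scope_name}"
}

# Example with hypothetical names: print the command for the second scope.
first="$(scope_resource team-a-test-project team-a-test)"
second="$(scope_resource team-a-staging-project team-a-staging)"
echo "gcloud container fleet scopes update ${second} --upstream-scope=${first} --default-upgrade-soaking=7d --project=team-a-staging-project"
```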
Check status of a rollout sequence
You can check the status of a rollout sequence with either of the following methods:
- Monitor a visual representation of a rollout sequence in the Google Cloud console (Preview, fleet-based rollout sequence only).
- Use the gcloud CLI or GKE Hub API to check the status of a rollout sequence.
Monitor a rollout sequence in the Google Cloud console
Go to the Rollout Sequencing page in the Google Cloud console.
View the sequence in the section Monitor your rollout sequence. If you don't see a rollout sequence, switch to a different rollout sequence, or create a rollout sequence if you haven't already done so.
How to use the console to monitor a rollout sequence
On this page, you can view the rollout sequence associated with your project's fleet. You can do the following to see the progress of a rollout sequence:
- View the entire rollout sequence, or see the statuses of individual fleets and clusters within those fleets, as well as the soak time between fleets. You can also view the sequence where there is no active upgrade, if you want to check the configuration of the sequence.
- Filter by upgrade type (control plane or node upgrade) and specific version (for example, 1.31.6-gke.500).
You can visually monitor your entire rollout sequence while GKE upgrades all the clusters in the sequence, qualifying a new version across environments before upgrading your production environment clusters. While monitoring, you can manage a rollout sequence with the gcloud CLI, making any changes as needed.
Switch to a different rollout sequence
This page shows the fleet-based rollout sequence if the active project in the Google Cloud console is a fleet host project for a fleet that is enrolled in a rollout sequence.
If you want to view a different rollout sequence, select a fleet host project associated with a different rollout sequence from the project picker at the top of the page.
Use the gcloud CLI
Use the commands in the following sections to check how upgrades are progressing in a rollout sequence. To learn more about what details are provided, see Status information for a rollout sequence.
To run these commands, ensure that you have the required permissions for each fleet host project. For example, if the sequence has cross-project scopes in different fleets, you need permissions in each project to describe the sequence.
For the following commands, if you only need information about one fleet or scope in the sequence, replace the --show-linked-cluster-upgrade flag with --show-cluster-upgrade.
Fleets
Check the status of a fleet-based rollout sequence:
gcloud container fleet clusterupgrade describe \
--show-linked-cluster-upgrade --project=FLEET_PROJECT_ID
Replace FLEET_PROJECT_ID with the project ID of the host project for any fleet in the sequence.
See the reference for gcloud container fleet clusterupgrade describe for a complete list of flags.
Teams
Check the status of a team-based rollout sequence:
gcloud container fleet scopes describe SCOPE_NAME \
    --show-linked-cluster-upgrade \
    --project=SCOPE_PROJECT_ID
Replace SCOPE_NAME with the name of any team scope in the rollout sequence and SCOPE_PROJECT_ID with the project ID of this team scope.
See the reference for gcloud container fleet scopes describe for a complete list of flags.
To see the status of individual clusters within a fleet or team scope, run the following command in the fleet host project and see the membershipStates section:
gcloud container fleet features describe clusterupgrade
Status information for a rollout sequence
When you check the status of a version rollout, you can see the progress of each group and cluster within that group.
See the following table for the potential statuses of a cluster or group:
Status | For a single cluster | For a group (fleet or team scope) |
---|---|---|
INELIGIBLE | This cluster is ineligible for this upgrade. | One or more clusters in this group are ineligible for this upgrade. |
PENDING | The upgrade hasn't started or the upgrade is in progress for the cluster. | The upgrade hasn't started on any of the clusters in the group. |
IN_PROGRESS | N/A | The upgrade has started on at least one cluster but hasn't finished on all clusters. |
SOAKING | The upgrade has finished on the cluster and hasn't finished soaking. | The upgrade has finished on all clusters and hasn't finished soaking. |
FORCED_SOAKING | The upgrade took more than the maximum upgrade time (30 days) and therefore we forced it to enter the soaking phase. The upgrade can still continue in the cluster. | The upgrade took more than the maximum upgrade time (30 days) and therefore we forced it to enter the soaking phase. The upgrade can still continue in the clusters. |
COMPLETE | The upgrade is treated as "done", meaning that the upgrade has finished soaking on this cluster. | The upgrade is treated as "done" and ready to be consumed by the downstream group, meaning that the upgrade has finished soaking. |
In the output of these commands, the clusterUpgrade(s).spec and clusterUpgrade(s).state attributes contain additional information about the cluster upgrade such as soaking time, cluster upgrade overrides, and upgrade status.
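If you script around these statuses, a small lookup can turn a status value into a coarse summary. The following is a sketch based only on the preceding table: status_summary is a hypothetical helper and the summary wording is this example's own. Extracting the status value from the command output is not shown, because the exact output layout is not described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize a rollout status value from the table above.
status_summary() {
  case "$1" in
    COMPLETE)
      echo "done: the downstream group can consume this version" ;;
    SOAKING|FORCED_SOAKING)
      echo "waiting: soak time has not finished" ;;
    PENDING|IN_PROGRESS)
      echo "waiting: upgrades have not finished" ;;
    INELIGIBLE)
      echo "blocked: check individual clusters for eligibility issues" ;;
    *)
      echo "unknown status: $1" >&2
      return 1 ;;
  esac
}

# Example call.
status_summary SOAKING
```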
Manage a rollout sequence
You can control automatic cluster upgrades with rollout sequencing in several ways, explained in the following sections.
Change the soak time for a group
You can change the default soak time for a group or change the soak time for when that group upgrades to a specific version. The maximum is 30 days.
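To sanity-check a planned soak time against the 30-day maximum, you can do the arithmetic before you change any settings. The following is a minimal sketch: soak_seconds is a hypothetical helper that takes days, hours, and minutes and prints the total in seconds. Converting that total into the value format that a particular tool expects is an assumption left to you.

```shell
#!/usr/bin/env bash
# 30 days expressed in seconds.
MAX_SOAK_SECONDS=$((30 * 24 * 60 * 60))

# Hypothetical helper: print a soak time in seconds, or fail if it
# exceeds the 30-day maximum. Arguments: DAYS HOURS MINUTES (integers).
soak_seconds() {
  local days="$1" hours="$2" minutes="$3"
  local total=$(( days * 86400 + hours * 3600 + minutes * 60 ))
  if [ "$total" -gt "$MAX_SOAK_SECONDS" ]; then
    echo "soak time exceeds the 30-day maximum" >&2
    return 1
  fi
  echo "$total"
}

# Example: 7 days, 12 hours, 0 minutes.
soak_seconds 7 12 0
```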
Update the default soak time
You can update the default soak time in the Google Cloud console (Preview, fleet-based rollout sequence only) or with the gcloud CLI.
gcloud
To change the default soak time for a group, use the gcloud CLI commands from the instructions to Create a rollout sequence, omitting the flags to set the upstream group.
Fleets - console
Go to the Rollout Sequencing page in the Google Cloud console.
View the sequence in the section Monitor your rollout sequence. If you don't see a rollout sequence, switch to a different rollout sequence, or create a rollout sequence if you haven't already done so.
In the Rollout graph, click the Soak time element after the element of the fleet where you want to update the soak time.
Click Edit soak time.
In the section Set a new soak time, enter a new soak time using the Days, Hours, and Minutes fields.
To save the settings, click Save.
Override the default soak time
You can change the soak time for a specific version rollout to be different from the default soak time for the group. For example, if you have already qualified a new version and are ready for upgrades to begin in the next group, you can set the soak time to zero. You can also override the soak time if you want more time than the default to qualify a specific version.
As the soak time is set on a per-group basis, if you want to override the soak time for other groups in this sequence, update them using this same command with the fleet or scope name replaced, depending on the type of sequence.
For the instructions in this section, replace the following variables:
- SOAK_TIME: the soak time to use other than the default (for example, "0d" if you want to skip the soak time for one version rollout).
- UPGRADE_NAME: the type of upgrade, either k8s_control_plane for control plane upgrades or k8s_node for node upgrades.
- VERSION: the GKE version for which you want to override the default soak time after that version (for example, 1.25.2-gke.400) has been rolled out to this group.
Fleets - gcloud
Run this command in the host project of the fleet where you want to override the soak time used for the version rollout of a specific version.
Change the soak time of a fleet:
gcloud container fleet clusterupgrade update \
    --add-upgrade-soaking-override=SOAK_TIME \
    --upgrade-selector=name=UPGRADE_NAME,version=VERSION
Fleets - Terraform
Add the following gke_upgrades_overrides block to your Terraform configuration within the clusterupgrade block to override the soak time used for the version rollout of a specific version:
gke_upgrades_overrides {
  upgrade {
    name    = "UPGRADE_NAME"
    version = "VERSION"
  }
  post_conditions {
    soaking = "SOAK_TIME"
  }
}
Teams - gcloud
Run this command in the host project of the team scope's fleet. Replace SCOPE_NAME with the name of the team scope for which you want to override the soak time used for the version rollout of a specific version.
Change the soak time of a team scope:
gcloud container fleet scopes update SCOPE_NAME \
--add-upgrade-soaking-override=SOAK_TIME \
--upgrade-selector=name=UPGRADE_NAME,version=VERSION
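Because UPGRADE_NAME accepts only two values, a small check can catch a mistyped selector before you run an override command. The following is a sketch under stated assumptions: upgrade_selector is a hypothetical helper, and the example prints the fleet command from this section instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate UPGRADE_NAME and print the
# --upgrade-selector flag used by the override commands.
upgrade_selector() {
  local upgrade_name="$1" version="$2"
  case "$upgrade_name" in
    k8s_control_plane|k8s_node) ;;
    *)
      echo "UPGRADE_NAME must be k8s_control_plane or k8s_node" >&2
      return 1
      ;;
  esac
  echo "--upgrade-selector=name=${upgrade_name},version=${version}"
}

# Example: skip the soak time for one control plane version rollout.
selector="$(upgrade_selector k8s_control_plane 1.25.2-gke.400)" &&
  echo "gcloud container fleet clusterupgrade update --add-upgrade-soaking-override=0d ${selector}"
```

For a team-based sequence, the same flag would be appended to the gcloud container fleet scopes update SCOPE_NAME command shown earlier.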
Update the groups in a rollout sequence
You can update an existing rollout sequence to add, remove, or change the order of groups in the sequence. To make these changes, update the associations between groups.
You can perform these steps in the Google Cloud console (Preview, fleet-based rollout sequence only) or with the gcloud CLI.
Fleets - gcloud
Use the gcloud container fleet clusterupgrade update command with the --upstream-fleet flag to add or change upstream fleets. Use the --reset-upstream-fleet flag to remove an upstream fleet.
You can do actions such as the following:
- Add another fleet to the start of the rollout sequence by adding an upstream fleet to the first fleet in the sequence.
- Change the order of the fleets in the rollout sequence by changing the upstream fleet associations.
- Remove the first fleet in the rollout sequence by removing the upstream fleet of the second fleet.
Fleets - console
Go to the Rollout Sequencing page in the Google Cloud console.
View the sequence in the section Monitor your rollout sequence. If you don't see a rollout sequence, switch to a different rollout sequence, or create a rollout sequence if you haven't already done so.
In the Rollout graph, click the elements for the existing fleets in the sequence. After you click those elements, you can do some of the following actions to make the changes:
- Click Add downstream fleet.
- Click Add upstream fleet.
- Click Remove fleet.
You can do actions such as the following:
- Add another fleet to the end of the rollout sequence by adding a downstream fleet to the last fleet in the sequence.
- Add another fleet to the start of the rollout sequence by adding an upstream fleet to the first fleet in the sequence.
- Change the order of the fleets in the rollout sequence by removing fleets, then adding the fleets back with a different upstream or downstream fleet.
- Remove the first fleet in the rollout sequence.
- Remove the last fleet in the rollout sequence.
- Remove the middle fleet in the rollout sequence, after removing the first or last fleet in the sequence.
Teams - gcloud
Use the gcloud container fleet scopes update command with the --upstream-scope flag to add or change upstream team scopes. Use the --reset-upstream-scope flag to remove an upstream team scope.
You can do actions such as the following:
- Add another team scope to the start of the rollout sequence by adding an upstream team scope to the first team scope in the sequence.
- Change the order of the team scopes in the rollout sequence by changing the upstream team scope associations.
- Remove the first team scope in the rollout sequence by removing the upstream team scope of the second team scope.
Delay the completion of a group's version rollout
If you need to temporarily prevent a group from completing the rollout of a new version to its clusters, you can add a maintenance exclusion to any of the clusters that have not been upgraded to the target version. This can pause a group from proceeding to its soak time or downstream group for up to 30 days. After 30 days, the group will begin soaking.
You can also change the soak time for that group to 30 days to maximize how long the rollout sequence waits before proceeding to the next group.
If you need to further delay upgrades beginning for the next group, you can use maintenance exclusions for the clusters in the next group.
Switch between fleet-based and team-based rollout sequences
You can switch from fleet-based sequences to team-based sequences, or from team-based sequences to fleet-based sequences. The instructions assume that you are transferring between sequences organized like those illustrated in the example diagrams.
Fleets to teams
To change your clusters from a fleet-based rollout sequence to a team-based rollout sequence, do the following steps:
- Configure maintenance exclusions for all clusters in each of your fleets to prevent any upgrades while you are modifying your configuration.
- Ensure that you have enabled GKE Enterprise in your fleet host projects.
- In each of your fleets, create one or more team scopes for subdividing the group of clusters in that fleet.
- Create one or more rollout sequences between the matching team scopes in each fleet.
- Add your clusters to their new team scopes.
- Remove the maintenance exclusions that you configured for this change.
Teams to fleets
To change your clusters from a team-based rollout sequence to a fleet-based rollout sequence, do the following steps:
- Configure maintenance exclusions for all clusters in each of your fleets to prevent any upgrades while you are modifying your configuration.
- Create a rollout sequence between your fleets.
- Remove your clusters from their team scopes. The clusters are now registered only to their respective fleets, which you joined in a rollout sequence in the previous step.
- Delete the team scopes.
- Remove the maintenance exclusions that you configured for this change.
Delete a sequence
To delete a sequence, remove the upstream association for the second group and, if the rollout sequence has three groups, for the third group.
You can perform these steps in the Google Cloud console (Preview, fleet-based rollout sequence only) or with the gcloud CLI.
Fleets - gcloud
Run the following command in the fleet host project of the second and third fleets in the rollout sequence:
gcloud container fleet clusterupgrade update --reset-upstream-fleet
Fleets - console
Go to the Rollout Sequencing page in the Google Cloud console.
View the sequence in the section Monitor your rollout sequence. If you don't see a rollout sequence, switch to a different rollout sequence, or create a rollout sequence if you haven't already done so.
In the Rollout graph, click the element for the third fleet.
Click Remove fleet.
To remove the fleet, click Remove.
Repeat the previous three steps for the second fleet.
Teams - gcloud
Run the following command in the fleet host project of the second and third team scopes in the rollout sequence:
gcloud container fleet scopes update SCOPE_NAME --reset-upstream-scope
Replace SCOPE_NAME with the names of the second and third scopes, respectively.
Troubleshooting
Troubleshoot rollout eligibility
If all clusters in a rollout sequence don't have the same upgrade target, GKE might not be able to proceed with cluster upgrades. Automatic upgrades cannot proceed if an upstream group does not qualify one upgrade target to pass to the downstream group. Automatic upgrades also cannot proceed if clusters in the upstream group qualify an invalid upgrade target for clusters in the downstream group.
To check if your rollout sequence has any rollout eligibility issues, check the status of the rollout sequence. If a group is ineligible, follow the instructions to see the status of individual clusters in a group.
To immediately advance cluster upgrades, remove any clusters with an INELIGIBLE status following the instructions to Advance partially eligible rollouts.
Fix eligibility in a group
In a group, if a cluster is ineligible because it is on an earlier version (for example, most of the clusters in the group are being upgraded from 1.23 to 1.24 and a cluster is on version 1.22), you can manually upgrade the cluster to 1.24 to resolve the version discrepancy.
In a group, if a cluster is ineligible because it is on a later version (for example, most of the clusters in the group are being upgraded from 1.23 to 1.24 and a cluster is on version 1.25), you cannot manually downgrade the cluster to solve the version discrepancy and need to remove the cluster.
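The two cases above reduce to a comparison of minor versions. The following sketch encodes only those two cases. It is a simplification, not the full eligibility logic: cluster_fix is a hypothetical helper, and minor versions are passed as plain integers (for example, 23 for 1.23).

```shell
#!/usr/bin/env bash
# Hypothetical helper: suggest a fix for one cluster in a group.
# Arguments: CLUSTER_MINOR SOURCE_MINOR TARGET_MINOR, as integers,
# where the group is upgrading from 1.SOURCE_MINOR to 1.TARGET_MINOR.
cluster_fix() {
  local cluster_minor="$1" source_minor="$2" target_minor="$3"
  if [ "$cluster_minor" -lt "$source_minor" ]; then
    # Earlier than the rest of the group: a manual upgrade resolves it.
    echo "upgrade manually to 1.${target_minor}"
  elif [ "$cluster_minor" -gt "$target_minor" ]; then
    # Later than the target: downgrading is not possible.
    echo "remove from group"
  else
    echo "no version discrepancy"
  fi
}

# Example from the text: the group upgrades from 1.23 to 1.24.
cluster_fix 22 23 24
cluster_fix 25 23 24
```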
Fix eligibility between groups
Between groups, if there is a mismatch in upgrade targets where the downstream group is on a newer version (for example, the upstream group upgraded from 1.23 to 1.24 and the clusters in the downstream group are on 1.25), you can manually upgrade the clusters in the upstream group to 1.25 to ensure that upgrades proceed.
Between groups, if there is a mismatch in upgrade targets where the downstream group is on an earlier version (for example, the upstream group upgraded from 1.24 to 1.25 and the clusters in the downstream group are on 1.23), you can manually upgrade the clusters in the downstream group to 1.24 or 1.25 to ensure that upgrades proceed.
Advance partially eligible rollouts
If cluster upgrades in a group will not finish because of issues with rollout eligibility (for example, version discrepancies within a group), you can remove clusters that are ineligible for the group's upgrade target from a group to complete the version rollout and begin the soak time or move on to the next group in the rollout sequence. You can also remove a cluster from a group for other reasons, for example if this cluster's usage is no longer related to the other clusters in the group.
Follow the instructions to unregister a cluster from a fleet or remove clusters from team scopes, depending on the type of rollout sequence.
After you have removed all clusters which are preventing a group's version rollout from being completed, the group's version rollout will complete. Confirm this by following the instructions to Check the status of a version rollout.