Migrate a user cluster to Controlplane V2

This document shows how to migrate a version 1.29 user cluster that uses kubeception to Controlplane V2. If your clusters are at version 1.30 or higher, we recommend that you follow the instructions in Plan cluster migration to recommended features.

1.29: Preview
1.28: Not available

About user cluster control planes

Prior to Google Distributed Cloud version 1.13, the control plane for a user cluster ran on one or more nodes in an admin cluster. This kind of control plane is referred to as kubeception. In version 1.13, Controlplane V2 was introduced for new user clusters. When Controlplane V2 is enabled, the control plane for the user cluster runs in the user cluster itself.

The benefits of Controlplane V2 include the following:

  • Failure isolation. An admin cluster failure does not affect user clusters.

  • Operational separation. An admin cluster upgrade does not cause downtime for user clusters.

  • Deployment separation. You can place the admin and user clusters in different failure domains or geographical sites. For example, a user cluster in an edge location can be in a different geographical site from the admin cluster.

Requirements

To migrate a user cluster to Controlplane V2, the user cluster must meet the following requirements:

  • The user cluster must be version 1.29 or higher. The admin cluster and node pools can be one or two minor versions lower than the user cluster. If needed, upgrade the cluster.

  • The user cluster must have Dataplane V2 enabled. The enableDataplaneV2 field is immutable, so if Dataplane V2 isn't enabled on the cluster, you can't migrate it to Controlplane V2.

  • The user cluster must be configured to use either MetalLB or a manual load balancer. If the user cluster uses the SeeSaw load balancer, you can migrate it to MetalLB.

  • Review the IP address planning document, and ensure that you have enough IP addresses for the user cluster's control plane nodes. The control plane nodes require static IP addresses, and you need an additional IP address for a new control plane virtual IP (VIP).

Prepare for the migration

If always-on secrets encryption has ever been enabled on the user cluster, you must do the steps in Disable always-on secrets encryption and decrypt secrets before starting the migration. Otherwise, the new Controlplane V2 cluster is unable to decrypt secrets.

Before starting the migration, run the following command to see whether always-on secrets encryption has ever been enabled on the user cluster:

kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG \
  get onpremusercluster USER_CLUSTER_NAME \
  -n USER_CLUSTER_NAME-gke-onprem-mgmt \
  -o jsonpath={.spec.secretsEncryption}

If the output of the preceding command is empty, then always-on secrets encryption has never been enabled. You can start the migration.

If the output of the preceding command isn't empty, then always-on secrets encryption was enabled at some point. Before migrating, you must do the steps in the next section to ensure that the new Controlplane V2 cluster can decrypt secrets.

The following example shows non-empty output:

{"generatedKeyVersions":{"keyVersions":[1]}}

Disable always-on secrets encryption and decrypt secrets if needed

To disable always-on secrets encryption and decrypt secrets, perform the following steps:

  1. In the user cluster configuration file, disable always-on secrets encryption by adding a disabled: true field to the secretsEncryption section:

    secretsEncryption:
        mode: GeneratedKey
        generatedKey:
            keyVersion: KEY_VERSION
            disabled: true
    
  2. Update the cluster:

    gkectl update cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG \
        --config USER_CLUSTER_CONFIG
    

    Replace the following:

    • ADMIN_CLUSTER_KUBECONFIG: the path of the admin cluster kubeconfig file
    • USER_CLUSTER_CONFIG: the path of the user cluster configuration file
  3. Do a rolling restart of the kube-apiserver StatefulSet so that the API server picks up the change (you can verify that the restart finished with the optional check after this procedure):

    kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG \
      rollout restart statefulsets kube-apiserver \
      -n USER_CLUSTER_NAME
    
  4. Get the manifests of all the secrets in the user cluster, in YAML format:

    kubectl --kubeconfig USER_CLUSTER_KUBECONFIG \
      get secrets -A -o yaml > SECRETS_MANIFEST.yaml
    
  5. To ensure that all secrets are stored in etcd as plaintext, reapply all of the secrets in the user cluster:

    kubectl --kubeconfig USER_CLUSTER_KUBECONFIG \
      apply -f SECRETS_MANIFEST.yaml
    

    You can now start the migration to Controlplane V2. After the migration completes, you can re-enable always-on secrets encryption on the cluster.
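
Optionally, before starting the migration, you can confirm that the kube-apiserver restart from step 3 has completed and count the secrets that you reapplied in step 5. This is a minimal sketch that assumes the same kubeconfig placeholders used above:

# Wait for the kube-apiserver StatefulSet rollout in the admin cluster to finish.
kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG \
  rollout status statefulset kube-apiserver \
  -n USER_CLUSTER_NAME

# Count the secrets in the user cluster and compare the result with the number
# of documents in SECRETS_MANIFEST.yaml as a rough sanity check.
kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get secrets -A --no-headers | wc -l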

Update the user cluster configuration file

Make the following changes to the existing user cluster configuration file (an illustrative example of the affected fields appears after these steps):

  1. Set enableControlplaneV2 to true.

  2. Optionally, make the control plane for the Controlplane V2 user cluster highly available (HA). To change from a non-HA cluster to an HA cluster, change masterNode.replicas from 1 to 3.

  3. Add the static IP address (or addresses) for the user cluster control plane node(s) to the network.controlPlaneIPBlock.ips section.

  4. Fill in the netmask and gateway in the network.controlPlaneIPBlock section.

  5. If the network.hostConfig section is empty, fill it in.

  6. If the user cluster uses manual load balancing, configure your load balancer as described in the next section.

  7. If the user cluster uses manual load balancing, set loadBalancer.manualLB.controlPlaneNodePort and loadBalancer.manualLB.konnectivityServerNodePort to 0 as they are not required when Controlplane V2 is enabled.

  8. Update the loadBalancer.vips.controlPlaneVIP field with the new IP address for the control plane VIP. The VIP must be in the same VLAN as the control plane node IPs.

  9. All of the preceding fields are immutable except during the cluster update for the migration, so be sure to double-check all of the settings.

  10. Run gkectl diagnose cluster, and fix any issues that the command finds.

    gkectl diagnose cluster --kubeconfig=ADMIN_CLUSTER_KUBECONFIG \
        --cluster-name=USER_CLUSTER_NAME

    Replace the following:

    • ADMIN_CLUSTER_KUBECONFIG: the path of the admin cluster kubeconfig file.

    • USER_CLUSTER_NAME: the name of the user cluster.
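
The following example shows how the affected sections of the user cluster configuration file might look after these changes. All of the IP addresses, the netmask, the gateway, and the DNS and NTP servers are hypothetical values; replace them with values from your own network, and include the manualLB fields only if your cluster uses manual load balancing:

enableControlplaneV2: true

masterNode:
    replicas: 3    # 1 for a non-HA control plane, 3 for HA

network:
    hostConfig:    # fill in if this section is empty
        dnsServers:
        - "203.0.113.5"
        ntpServers:
        - "ntp.example.com"
    controlPlaneIPBlock:
        netmask: "255.255.255.0"
        gateway: "198.51.100.1"
        ips:
        - ip: "198.51.100.11"
        - ip: "198.51.100.12"
        - ip: "198.51.100.13"

loadBalancer:
    vips:
        controlPlaneVIP: "198.51.100.50"    # must be in the same VLAN as the node IPs
    manualLB:
        controlPlaneNodePort: 0
        konnectivityServerNodePort: 0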

Adjust manual load balancer configuration

If your user cluster uses manual load balancing, complete the steps in this section. Otherwise, skip this section.

Just as when you configure your load balancer for any Controlplane V2 user cluster, configure mappings in your load balancer for each of the new control plane node IP addresses that you specified in the network.controlPlaneIPBlock section (three addresses if you configured an HA control plane):

  • (ingressVIP:80) -> (NEW_NODE_IP_ADDRESS:ingressHTTPNodePort)
  • (ingressVIP:443) -> (NEW_NODE_IP_ADDRESS:ingressHTTPSNodePort)
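
For example, with a hypothetical ingress VIP of 203.0.113.20, hypothetical control plane node IP addresses of 198.51.100.11 through 198.51.100.13, and hypothetical node port values of 30243 (ingressHTTPNodePort) and 30879 (ingressHTTPSNodePort), the mappings would be:

  • (203.0.113.20:80) -> (198.51.100.11:30243), (198.51.100.12:30243), (198.51.100.13:30243)
  • (203.0.113.20:443) -> (198.51.100.11:30879), (198.51.100.12:30879), (198.51.100.13:30879)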

Update the cluster

Run the following command to migrate the cluster to Controlplane V2:

gkectl update cluster \
    --kubeconfig ADMIN_CLUSTER_KUBECONFIG \
    --config USER_CLUSTER_CONFIG

Replace the following:

  • ADMIN_CLUSTER_KUBECONFIG: the path of the admin cluster kubeconfig file.

  • USER_CLUSTER_CONFIG: the path of the user cluster configuration file.

The command does the following:

  1. Creates the control plane of a new cluster with Controlplane V2 enabled.

  2. Stops the Kubernetes control plane of the kubeception cluster.

  3. Takes an etcd snapshot of the kubeception cluster.

  4. Powers off the user cluster control plane nodes of the kubeception cluster. The nodes are not deleted until the migration completes, so that the migration can fall back to the kubeception cluster if a failure occurs.

  5. Restores the cluster data in the new control plane from the etcd snapshot.

  6. Connects the node pool nodes of the kubeception cluster to the new control plane, which is reachable at the new controlPlaneVIP.

  7. Reconciles the restored user cluster to meet the end state of a cluster with Controlplane V2 enabled.
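
After the command completes, you can confirm that the control plane now runs in the user cluster itself. A quick check, where USER_CLUSTER_KUBECONFIG is the path of the user cluster kubeconfig file:

kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get nodes -o wide

The output should list the new control plane nodes, with the IP addresses that you added to network.controlPlaneIPBlock, alongside the node pool nodes.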

Notes

  • During the migration, there's no downtime for user cluster workloads.

  • During the migration, there is some downtime for the user cluster control plane. Specifically, the control plane is unavailable between step 2 and the completion of step 6. (In our tests, this downtime was less than 7 minutes, but the actual length depends on your infrastructure.)

  • At the end of the migration, the user cluster control plane nodes of the kubeception cluster are deleted. If the admin cluster has network.ipMode.type set to "static", you can recycle the unused static IP addresses by removing them from the admin cluster configuration file and running gkectl update admin. You can list the admin cluster node objects with kubectl get nodes -o wide to see which IP addresses are still in use, as sketched below.
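
A minimal sketch of that cleanup, where ADMIN_CLUSTER_CONFIG is a placeholder for the path of your admin cluster configuration file:

# See which IP addresses the admin cluster nodes are still using.
kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG get nodes -o wide

# After removing the unused addresses from the admin cluster configuration file,
# apply the change.
gkectl update admin --kubeconfig ADMIN_CLUSTER_KUBECONFIG \
    --config ADMIN_CLUSTER_CONFIG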

After the migration

If you disabled always-on secrets encryption before the migration, do the following steps to re-enable the feature:

  1. In the user cluster configuration file, set secretsEncryption.generatedKey.disabled to false. For example:

    secretsEncryption:
        mode: GeneratedKey
        generatedKey:
            keyVersion: KEY_VERSION
            disabled: false
    
  2. Update the user cluster:

    gkectl update cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG \
        --config USER_CLUSTER_CONFIG