Rotating user cluster certificate authorities

GKE on VMware uses certificates and private keys to authenticate and encrypt connections between system components in user clusters. The admin cluster creates a new set of certificate authorities (CAs) for each user cluster, and uses these CA certificates to issue additional leaf certificates for system components. The admin cluster manages distribution of the public CA certificates and leaf certificate key pairs to system components, so that the components can communicate securely.

The user cluster CA rotation feature allows you to trigger a rotation of the core system certificates in a user cluster. During a rotation, the admin cluster replaces the core system CAs for the user cluster with newly generated CAs, and distributes the new public CA certificates and leaf certificate key pairs to user cluster system components. The rotation happens incrementally, so that system components can continue to communicate during the rotation. Note, however, that workloads and nodes will be restarted during the rotation.

There are three system CAs managed by the admin cluster for each user cluster:

  • The etcd CA secures communication from the API server to the etcd replicas and also traffic between etcd replicas. This CA is self-signed.
  • The cluster CA secures communication between the API server and all internal Kubernetes API clients (kubelets, controllers, schedulers). This CA is self-signed.
  • The front-proxy CA secures communication with aggregated APIs. This CA is self-signed.

You may additionally be using an org CA to sign the certificate configured by the authentication.sni option. This CA and the SNI certificate are used to serve the Kubernetes API to clients outside the cluster. You manage this CA and manually generate the SNI certificate. Neither this CA nor the SNI certificate is affected by the user cluster CA rotation feature.

Limitations

  • The user cluster CA rotation feature is supported in Anthos clusters on VMware version 1.8 and later.
  • The user cluster CA rotation feature is limited to the etcd, cluster, and front-proxy CAs mentioned in the overview. It does not rotate your org CA. The feature is additionally limited to certificates issued automatically by Anthos clusters on VMware. It does not update certificates issued manually by an administrator, even if those certificates are signed by the system CAs.
  • A CA rotation must restart the API server, other control-plane processes, and each node in the cluster multiple times. Each stage of a CA rotation progresses similarly to a cluster upgrade. While the user cluster does remain operational during a CA rotation, you should expect that workloads will be restarted and rescheduled. You should also expect brief periods of control-plane downtime when not using an HA configuration.
  • User cluster kubeconfig files and authentication configuration files for connecting to user clusters must be manually updated and redistributed following a CA rotation. This is because a CA rotation necessarily revokes the old CA, so these credentials will no longer authenticate.
  • Once initiated, a CA rotation cannot be paused or rolled back.
  • A CA rotation may take considerable time to complete, depending on the size of the user cluster.

How to perform a CA rotation

You can initiate a CA rotation and view its current status by running the following gkectl commands.

Rotate CA Certificates

To trigger a CA rotation, execute the following command.

gkectl update credentials certificate-authorities rotate \
--config USER_CONFIG_FILE \
--kubeconfig ADMIN_KUBECONFIG_FILE

Where:

  • USER_CONFIG_FILE is the path to the user cluster configuration file of the user cluster to rotate CAs for.
  • ADMIN_KUBECONFIG_FILE is the path to the kubeconfig file for connecting to the admin cluster that manages the user cluster.

If the CA rotation is started successfully, you will see a message similar to the following:

successfully started the CA rotation with CAVersion 2, use gkectl update credentials certificate-authorities status command to view the current state of CA rotation

If a CA rotation is already in progress, you will see an error message similar to the following:

Exit with error:
admission webhook "vonpremusercluster.onprem.cluster.gke.io" denied the request: requests must not modify CAVersion when cluster is not ready: ready condition is not true: ClusterCreateOrUpdate: Creating or updating user cluster control plane workloads
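If you script the rotation, it can help to classify the command output. The following is a minimal sketch that assumes the output resembles the two sample messages above; the exact wording may differ between versions, and the "not ready" error can also appear for reasons other than an in-progress rotation. The function names extract_ca_version and cluster_not_ready are hypothetical helpers, not part of gkectl.

```shell
# Hypothetical helper: reads gkectl output on stdin and prints the first
# CAVersion number it mentions (for example "2"). Prints nothing if the
# output contains no "CAVersion <number>" text.
extract_ca_version() {
  sed -nE 's/.*CAVersion ([0-9]+).*/\1/p' | head -n 1
}

# Hypothetical helper: reads gkectl output on stdin and succeeds if it
# contains the "cluster is not ready" webhook denial from the sample error.
# The matched text is copied from that sample and is an assumption about
# the format, not a documented interface.
cluster_not_ready() {
  grep -q 'must not modify CAVersion when cluster is not ready'
}
```

For example, you could capture the output of the rotate command in a variable, pipe it to cluster_not_ready to decide whether to wait and retry, and pipe it to extract_ca_version to record the CAVersion of the rotation that was started.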

View CA rotation status

To view the status of a CA rotation, execute the following command. The command reports three things: the CAVersion, an integer that the system automatically increments to differentiate the CAs used before and after each CA rotation; a status (True or False) that indicates whether the CA rotation is complete; and a message describing which CAVersion is currently in use by each component of the system.

gkectl update credentials certificate-authorities status \
--config USER_CONFIG_FILE \
--kubeconfig ADMIN_KUBECONFIG_FILE

If the CA rotation has already completed, you will see a message similar to the following:

State of CARotation with CAVersion 2 is -
status: True,
reason: CARotationCompleted,
message: Control plane has CA bundle [2], certs from CA 2, CA 2 is CSR signer. Data plane has CA bundle [2], CA 2 was CSR signer at last restart.

If the CA rotation is still in progress, you will see a message similar to the following:

State of CARotation with CAVersion 2 is -
status: False,
reason: CARotationProgressed,
message: Control plane has CA bundle [1 2], certs from CA 2, CA 1 is CSR signer. Data plane has CA bundle [1 2], CA 1 was CSR signer at last restart.
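Because a rotation can take a long time, you might want to poll the status command rather than run it by hand. The following is a minimal sketch, assuming the output contains a line of the form "status: True," or "status: False," as in the samples above. The function names, the 60-second interval, and the limit of 120 checks are illustrative assumptions, not gkectl features.

```shell
# Hypothetical helper: reads status output on stdin and succeeds only if
# it contains a line starting with "status: True".
ca_rotation_complete() {
  grep -Eq '^[[:space:]]*status:[[:space:]]*True'
}

# Hypothetical helper: polls the status command until it reports completion.
# Arguments: USER_CONFIG_FILE ADMIN_KUBECONFIG_FILE [interval_seconds] [max_checks]
# Returns 0 when the rotation is reported complete, 1 if the check limit is hit.
wait_for_ca_rotation() {
  local user_config="$1"
  local admin_kubeconfig="$2"
  local interval="${3:-60}"
  local max_checks="${4:-120}"
  local i=1
  while [ "$i" -le "$max_checks" ]; do
    # Merge stderr into stdout because this sketch does not assume which
    # stream gkectl writes the status message to.
    if gkectl update credentials certificate-authorities status \
        --config "$user_config" \
        --kubeconfig "$admin_kubeconfig" 2>&1 | ca_rotation_complete; then
      echo "CA rotation reported complete" >&2
      return 0
    fi
    echo "CA rotation not complete yet (check ${i}/${max_checks})" >&2
    sleep "$interval"
    i=$((i + 1))
  done
  return 1
}
```

A call such as wait_for_ca_rotation USER_CONFIG_FILE ADMIN_KUBECONFIG_FILE blocks until the status is True or the check limit is reached. Any failure of the status command itself is treated the same as "not complete", so inspect the command output directly if the loop never finishes.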

Update User Cluster Credentials

Once a CA rotation completes, a new user cluster kubeconfig file must be downloaded from the admin cluster to replace the old kubeconfig that was previously used for connecting to user clusters. This is because the CA rotation revokes the old CA that the old kubeconfig file was based on. Run the following command after the CA rotation completes to download the new kubeconfig file:

kubectl --kubeconfig ADMIN_KUBECONFIG_FILE get secret admin \
 -n USER_CLUSTER_NAME -o jsonpath='{.data.admin\.conf}' \
 | base64 --decode > USER_CLUSTER_NAME-kubeconfig

If additional kubeconfig files were manually issued to other users of the cluster, they must also be updated.
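To confirm that a downloaded kubeconfig file reflects the rotated CA, you can compare the CA data embedded in the old and new files. The following is a minimal sketch, assuming both files embed the cluster CA in the standard kubeconfig field certificate-authority-data and that openssl is installed. The function names are hypothetical.

```shell
# Hypothetical helper: prints the first base64-encoded
# certificate-authority-data value found in the given kubeconfig file.
kubeconfig_ca_data() {
  awk '$1 == "certificate-authority-data:" { print $2; exit }' "$1"
}

# Hypothetical helper: prints the SHA-256 fingerprint of the CA certificate
# embedded in the given kubeconfig file. Assumes the first certificate in
# the decoded data is the one of interest.
kubeconfig_ca_fingerprint() {
  kubeconfig_ca_data "$1" | base64 --decode \
    | openssl x509 -noout -fingerprint -sha256
}

# Hypothetical helper: succeeds if the two kubeconfig files embed different
# CA data, which is what you would expect after a completed rotation.
kubeconfig_ca_changed() {
  [ "$(kubeconfig_ca_data "$1")" != "$(kubeconfig_ca_data "$2")" ]
}
```

For example, kubeconfig_ca_changed OLD_KUBECONFIG USER_CLUSTER_NAME-kubeconfig should succeed once the rotation is complete, where OLD_KUBECONFIG is a placeholder for your previous file. Running kubectl --kubeconfig USER_CLUSTER_NAME-kubeconfig get nodes is a simple way to check that the new file authenticates against the user cluster.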

Update Authentication Configuration Files

Once the CA rotation completes, authentication configuration files must be updated and redistributed. Follow the instructions for your authentication configuration to update and redistribute these files after the CA rotation.

Troubleshooting a CA rotation

The gkectl diagnose command can check the expected status of a completed CA rotation against a user cluster. For instructions on how to run gkectl diagnose on a user cluster, see Diagnosing cluster issues. If you experience issues with a CA rotation, contact Google support and provide the gkectl diagnose output.