Troubleshoot application-layer secrets encryption


This page shows you how to resolve issues related to application-layer secrets encryption in Google Kubernetes Engine (GKE).

Failed update

When you update the application-layer secrets encryption configuration, GKE must rewrite all Secret objects in the Kubernetes cluster. GKE does this to ensure that all Secrets are encrypted with the new Cloud KMS key, or are written unencrypted if that is what you configured.

This update operation can fail due to any of the following conditions:

  • The Kubernetes control plane is temporarily unavailable while the update is in progress.
  • A user-defined admission webhook prevents GKE from updating Secret objects.
  • The updated or previous Cloud KMS key is disabled before the update operation completes.

Until the update operation succeeds, don't disable or destroy either the updated or the previous Cloud KMS key.

Debugging fields

New GKE clusters running version 1.29 and later contain additional fields that help you track updates to Cluster.DatabaseEncryption and help you recover from failures.

The following steps only apply to clusters where the DatabaseEncryption.CurrentState field is not empty. If the CurrentState field is empty, the feature is not enabled on this cluster version yet.

These fields are output only, which means that you can't set them during cluster create or update requests.

CurrentState field

You can inspect the current status of a DatabaseEncryption update operation by examining the CurrentState field in Cluster.DatabaseEncryption.

Values of CurrentState and their descriptions:

  • CURRENT_STATE_ENCRYPTED, CURRENT_STATE_DECRYPTED: The latest update operation was successful. No further action is needed. You can dispose of any previously used keys.
  • CURRENT_STATE_ENCRYPTION_PENDING, CURRENT_STATE_DECRYPTION_PENDING: The update is in progress.
  • CURRENT_STATE_ENCRYPTION_ERROR, CURRENT_STATE_DECRYPTION_ERROR: There was an error with the most recent update. Don't disable or destroy any previously used Cloud KMS keys, as they might still be in use by GKE. Refer to the LastOperationErrors field for more information.
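As an illustration, the CurrentState values above can be mapped to a coarse status in a small script. The following is a minimal sketch, not an official tool. It assumes that the cluster description has been loaded as a dictionary (for example, from the JSON output of gcloud container clusters describe) and that the field appears in lowerCamelCase as databaseEncryption.currentState. The function name classify_current_state is hypothetical.

```python
# Hypothetical helper: maps CurrentState to the guidance listed above.
# Assumption: the cluster JSON exposes databaseEncryption.currentState.

SUCCESS_STATES = {"CURRENT_STATE_ENCRYPTED", "CURRENT_STATE_DECRYPTED"}
PENDING_STATES = {
    "CURRENT_STATE_ENCRYPTION_PENDING",
    "CURRENT_STATE_DECRYPTION_PENDING",
}
ERROR_STATES = {
    "CURRENT_STATE_ENCRYPTION_ERROR",
    "CURRENT_STATE_DECRYPTION_ERROR",
}


def classify_current_state(cluster: dict) -> str:
    """Return 'success', 'pending', 'error', 'unavailable', or 'unknown'."""
    state = cluster.get("databaseEncryption", {}).get("currentState", "")
    if not state:
        # Empty CurrentState: the debugging fields are not available.
        return "unavailable"
    if state in SUCCESS_STATES:
        return "success"
    if state in PENDING_STATES:
        return "pending"
    if state in ERROR_STATES:
        # Keep all previously used Cloud KMS keys enabled in this case.
        return "error"
    return "unknown"
```

A result of "error" is the signal to leave every previously used key untouched and to inspect LastOperationErrors.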

LastOperationErrors field

When an update operation fails, the underlying error from the GKE control plane is displayed in the output of gcloud container clusters update.

The error messages from the two most recent failed update operations are also available in Cluster.DatabaseEncryption.LastOperationErrors.
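To read these errors programmatically, a script could pull them out of the cluster description. The sketch below assumes the description is available as a dictionary parsed from JSON, that the list appears as databaseEncryption.lastOperationErrors, and that each entry carries keyName, errorMessage, and timestamp fields. Those entry field names are assumptions, so the code falls back to placeholders when they are missing.

```python
# Hypothetical helper: formats recent update errors for display.
# Assumption: each entry has keyName, errorMessage, and timestamp fields.

def last_operation_errors(cluster: dict) -> list[str]:
    """Return one human-readable line per recorded update error."""
    encryption = cluster.get("databaseEncryption", {})
    lines = []
    for entry in encryption.get("lastOperationErrors", []):
        timestamp = entry.get("timestamp", "unknown time")
        key_name = entry.get("keyName", "unknown key")
        message = entry.get("errorMessage", "")
        lines.append(f"{timestamp} [{key_name}] {message}")
    return lines
```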

DecryptionKeys field

The Cloud KMS key used for new encryption operations is shown in DatabaseEncryption.KeyName. Usually this is the only key used by the cluster.

However, DatabaseEncryption.DecryptionKeys contains additional keys that are also used by the cluster if an update is in progress or after a failure.
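Taken together, KeyName and DecryptionKeys describe every key that the cluster might still depend on. The following sketch collects them into one list so that you can check each key before disabling or destroying anything. It assumes a cluster description parsed from JSON with lowerCamelCase fields databaseEncryption.keyName and databaseEncryption.decryptionKeys. The helper name keys_in_use is hypothetical.

```python
# Hypothetical helper: lists every Cloud KMS key the cluster may still use.
# Assumption: fields appear as databaseEncryption.keyName / decryptionKeys.

def keys_in_use(cluster: dict) -> list[str]:
    """Return the encryption key followed by any extra decryption keys."""
    encryption = cluster.get("databaseEncryption", {})
    candidates = [encryption.get("keyName"), *encryption.get("decryptionKeys", [])]
    keys = []
    for name in candidates:
        # Skip missing values and avoid listing the same key twice.
        if name and name not in keys:
            keys.append(name)
    return keys
```

Any key in the returned list should stay enabled until CurrentState reports a successful update.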

Recover from a failed update

To recover from a failed update, do the following:

  1. Examine the error message and address any issues indicated.
  2. Retry the update request by running the failed command, such as gcloud container clusters update ... --database-encryption-key. We recommend that you retry with the same update request that you originally issued, or update the cluster back to the previous state. GKE might not be able to transition to a different key or encryption state if it can't read one or more Secrets.
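For transient causes, such as a control plane that was briefly unavailable, the retry in step 2 can be scripted. The following is a minimal sketch under stated assumptions: retry_update is a hypothetical helper, the attempt count and delay are arbitrary, and argv must be the exact argument list of the command that originally failed. It is not a substitute for step 1. If the error points to a key or webhook problem, fix that first.

```python
import subprocess
import time


def retry_update(argv, attempts=3, delay_seconds=60,
                 run=subprocess.run, sleep=time.sleep):
    """Re-run the same update command until it succeeds or attempts run out.

    Returns (succeeded, stderr_of_last_failed_attempt).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_stderr = ""
    for attempt in range(1, attempts + 1):
        # Always reissue the identical request; don't switch keys mid-recovery.
        result = run(argv, capture_output=True, text=True)
        if result.returncode == 0:
            return True, ""
        last_stderr = result.stderr
        if attempt < attempts:
            sleep(delay_seconds)
    return False, last_stderr
```

The run and sleep parameters exist only so that the function can be exercised without calling a real command.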

The following sections list common reasons for errors.

Cloud KMS key error

If the error message contains a reference to one or more Cloud KMS keys, examine your Cloud KMS key configuration to make sure the relevant key versions are usable.

If the error indicates that a Cloud KMS key has been disabled or destroyed, re-enable the key or key version.
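One way to spot unusable key versions is to scan the version list for any state other than ENABLED. The sketch below assumes the input is a list of key version records parsed from JSON (for example, from gcloud kms keys versions list with JSON output), each with name and state fields. The helper name unusable_versions is hypothetical.

```python
# Hypothetical helper: reports Cloud KMS key versions that are not usable.
# Assumption: each record has "name" and "state" fields.

def unusable_versions(versions: list[dict]) -> list[tuple[str, str]]:
    """Return (name, state) for every version whose state is not ENABLED."""
    problems = []
    for version in versions:
        state = version.get("state", "STATE_UNSPECIFIED")
        if state != "ENABLED":
            problems.append((version.get("name", ""), state))
    return problems
```

An empty result means that every listed version reports ENABLED. It doesn't confirm that GKE has permission to use the key.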

Error: Unable to use CloudKMS key configured for Application Level encryption

The following error message occurs if GKE's default service account can't access the Cloud KMS key:

Cluster problem detected (Kubernetes Engine Service Agent account unable to use CloudKMS key configured for Application Level encryption).

To resolve this issue, re-enable the disabled key.

Unable to update Secret

The following error might occur if an admission webhook causes the Kubernetes API to reject the update request:

error admission webhook WEBHOOK_NAME denied the request

To resolve this error, remove the webhook or modify it so that GKE can update Secrets in all namespaces during key updates.
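To find candidate webhooks, you can scan the webhook configurations for rules that intercept Secret updates. The following sketch assumes the input is a list object parsed from JSON (for example, from kubectl get validatingwebhookconfigurations -o json, or the mutating equivalent). It checks only the rules of each webhook and doesn't evaluate namespaceSelector or objectSelector, so it might report webhooks that never see GKE's requests. The function name is hypothetical.

```python
# Hypothetical helper: finds webhooks whose rules match UPDATE on Secrets.
# Limitation: selectors (namespaceSelector, objectSelector) are ignored.

def _matches(values: list, wanted: str) -> bool:
    """True if the rule list names the wanted value or uses a wildcard."""
    return "*" in values or wanted in values


def webhooks_matching_secret_updates(config_list: dict) -> list[str]:
    """Return names of webhooks with a rule covering Secret updates."""
    names = []
    for config in config_list.get("items", []):
        for hook in config.get("webhooks", []):
            for rule in hook.get("rules", []):
                resources = rule.get("resources", [])
                covers_secrets = _matches(resources, "secrets") or "*/*" in resources
                # Secrets live in the core API group, written as "".
                covers_core = _matches(rule.get("apiGroups", []), "")
                covers_update = _matches(rule.get("operations", []), "UPDATE")
                if covers_secrets and covers_core and covers_update:
                    names.append(hook.get("name", "<unnamed>"))
                    break  # One matching rule is enough for this webhook.
    return names
```

Compare the reported names with the WEBHOOK_NAME in the error message to decide which webhook to remove or modify.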