Rotation of secrets

This is an advanced topic for Secret Manager. Before reading this guide, we recommend reviewing the Platform overview to understand the overall Google Cloud landscape, and the Secret Manager conceptual overview.

Introduction

Periodic rotation helps:

  • Limit the impact in the case of a leaked secret.
  • Ensure individuals who no longer require access to a secret cannot continue to use old secret values.
  • Continuously exercise the rotation flow to reduce the likelihood of an outage in case of an emergency rotation.

Secret Manager has a concept of Secrets, Secret Versions and Rotation Schedules, which provide a foundation for building workloads that support rotated secrets.

This guide outlines recommendations for rotating secrets stored in Secret Manager. The following sections explore the benefits and tradeoffs of different approaches to binding to a secret version, and then walk through rotating a secret.

Binding to a secret version

A secret in Secret Manager can have multiple secret versions. Secret versions contain the immutable payload (the actual secret byte string) and are ordered and numbered. To rotate a secret, add a new secret version to an existing secret.

The most recently added secret version on a secret can be referenced using the latest alias. The latest alias, while convenient for development, can be problematic for some production workloads since a bad value could be immediately rolled out and result in a service-wide outage. Alternate methods for binding to a secret version are described in the following scenarios.
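To make the difference between the alias and a pinned version concrete, the following minimal sketch builds both kinds of resource name. The project ID, secret ID, and helper name are placeholders chosen for illustration.

```python
def version_name(project: str, secret: str, version: str = "latest") -> str:
    """Build a secret version resource name.

    `version` is either the "latest" alias or a concrete version number.
    """
    return f"projects/{project}/secrets/{secret}/versions/{version}"


# Alias: follows whichever version was added most recently.
alias = version_name("my-project", "db-password")
# Pinned: always refers to the same immutable payload.
pinned = version_name("my-project", "db-password", "3")
```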

Gradual rollouts

Gradual rollouts are a guiding principle for the following scenarios. A slower secret rollout lowers the risk of breakage, but it also slows recovery. Some secrets can be invalidated in external systems (such as APIs or databases that track valid secret values) that may or may not be under your control. In these cases, recovery requires a rollout.

It's possible for a bad secret to be rolled out during either manual or automatic rotation. A strong rotation workflow should be able to automatically detect the breakage (for example, from HTTP error rates) and roll back to the older secret version (via a prior configuration deployment).
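One way to picture the detection step is a small decision function that compares the observed error rate against a threshold. This is only a sketch: the function name and the 5% threshold are assumptions, and in a real workflow the rollback itself would be carried out by redeploying the prior configuration.

```python
def version_to_serve(current: str, previous: str,
                     errors: int, requests: int,
                     max_error_rate: float = 0.05) -> str:
    """Return the secret version name the workload should be configured with."""
    if requests == 0:
        return current  # no traffic observed yet, nothing to judge
    error_rate = errors / requests
    # Fall back to the previous version when the new one looks broken.
    return previous if error_rate > max_error_rate else current
```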

The rollout of the new secret version depends on how secrets are bound to your application.

Approach 1: Resolve during an existing release process

Resolve and bind your secret version with your application's release. For most deployments, this involves resolving the latest secret version into the full secret version resource name and rolling it out with the application as a flag or in a configuration file. We recommend resolving the secret version name at rotation time, storing the resource name somewhere durable (for example, commit to Git), and pulling the version name into the deployment configuration during the push to prevent blocked deployments.

At application startup, call Secret Manager with the specific secret version name to access the secret value.

This approach has the following benefits:

  • Your application uses the same secret version across restarts, increasing predictability and reducing operational complexity.
  • Existing change management processes for rollouts and rollbacks can be reused for secret rotation and secret version deployment.
  • The value can be rolled out gradually, reducing the impact of deploying bad values.
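The two halves of this approach could look roughly like the sketch below. It assumes `client` is a `SecretManagerServiceClient` from the google-cloud-secret-manager Python library and that the caller has permission to get and access secret versions. The function names are hypothetical.

```python
def resolve_latest(client, project: str, secret: str) -> str:
    """Run at rotation time: turn the latest alias into a full version name."""
    alias = f"projects/{project}/secrets/{secret}/versions/latest"
    version = client.get_secret_version(request={"name": alias})
    # Store this name somewhere durable, such as a file committed to Git.
    return version.name


def load_secret(client, version_name: str) -> bytes:
    """Run at application startup with the pinned name from configuration."""
    if version_name.endswith("/versions/latest"):
        raise ValueError("expected a pinned version name, not the latest alias")
    response = client.access_secret_version(request={"name": version_name})
    return response.payload.data
```

Rejecting the alias in `load_secret` is a design choice for this sketch: it keeps a misconfigured deployment from silently falling back to the behavior of Approach 2.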

Approach 2: Resolve on application startup

Fetch the latest secret payload on application startup and continue to use the secret for the duration of the application.

The advantage of this approach is that it does not require modifying the CI/CD pipeline to resolve secret versions. However, if a bad secret is rolled out, the application fails to start as instances restart or the service scales up, which may cascade into a service outage.
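A minimal sketch of this pattern follows, again assuming `client` is a `SecretManagerServiceClient` from the google-cloud-secret-manager Python library. The class name is hypothetical.

```python
class StartupSecret:
    """Fetches the latest payload once and reuses it for the process lifetime."""

    def __init__(self, client, project: str, secret: str):
        name = f"projects/{project}/secrets/{secret}/versions/latest"
        response = client.access_secret_version(request={"name": name})
        # Keep the resolved version name so logs show which version is in use.
        self.version_name = response.name
        self.value = response.payload.data
```

Recording the resolved version name is optional, but it makes it easier to tell which instances picked up a bad value after a rotation.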

Approach 3: Resolve continuously

Poll for the latest secret version continuously in the application and use the new secret value immediately.

This approach risks immediate, service-wide outage as there's no gradual adoption of the new secret value.
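For illustration, continuous resolution might be sketched as follows. It assumes `client` is a `SecretManagerServiceClient` from the google-cloud-secret-manager Python library. The class name, function name, and 60-second interval are assumptions.

```python
import time


class PollingSecret:
    """Holds the most recently fetched payload of a secret's latest version."""

    def __init__(self, client, project: str, secret: str):
        self._client = client
        self._name = f"projects/{project}/secrets/{secret}/versions/latest"
        self.version_name = None
        self.value = None
        self.refresh()

    def refresh(self) -> bool:
        """Fetch the latest version; return True if the version changed."""
        response = self._client.access_secret_version(request={"name": self._name})
        changed = response.name != self.version_name
        self.version_name = response.name
        self.value = response.payload.data
        return changed


def poll_forever(secret: PollingSecret, interval_seconds: float = 60.0) -> None:
    """Refresh the secret on a fixed interval (typically in a background thread)."""
    while True:
        time.sleep(interval_seconds)
        secret.refresh()
```

Because every instance polls independently, all of them adopt a new value within roughly one interval, which is why a bad value spreads service-wide so quickly.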

Rotate your secret

If your secret can be dynamically updated (for example, if the external system validating the secret provides an Admin API), we recommend setting up a rotation job that runs periodically. The general steps are outlined in the following section with Cloud Run as a sample compute environment.

Configure a rotation schedule on your Secret

Set up a rotation schedule for your secret. Pub/Sub topics must be configured on the secret to receive notifications when it's time to rotate your secret. See the Event Notifications guide for configuring topics on your secrets.
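As a rough sketch, the resource body for a secret with a topic and a rotation schedule can be assembled as shown below, in the JSON shape used by the Secret Manager REST API. The topic name, dates, 30-day period, and helper name are placeholders, and automatic replication is assumed only to keep the example short.

```python
import json
from datetime import datetime, timedelta, timezone


def rotation_secret_body(topic: str, next_rotation: datetime,
                         period: timedelta) -> dict:
    """Build a Secret resource body with a Pub/Sub topic and rotation schedule."""
    # RFC 3339 timestamp in UTC, for example 2030-01-01T09:00:00Z.
    next_time = next_rotation.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Duration in whole seconds with an "s" suffix, for example 2592000s.
    rotation_period = f"{int(period.total_seconds())}s"
    return {
        "replication": {"automatic": {}},
        "topics": [{"name": topic}],
        "rotation": {
            "nextRotationTime": next_time,
            "rotationPeriod": rotation_period,
        },
    }


body = rotation_secret_body(
    topic="projects/my-project/topics/secret-rotation",
    next_rotation=datetime(2030, 1, 1, 9, 0, tzinfo=timezone.utc),
    period=timedelta(days=30),
)
print(json.dumps(body, indent=2))
```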

Launch a Cloud Run service to create a new secret version

Create and configure a Cloud Run service to receive rotation notifications and execute rotation steps:

  1. Obtain or create a new secret in the external system (e.g. database, API provider).

    Ensure that this does not invalidate existing secrets so that existing workloads are not affected.

  2. Update Secret Manager with the new secret.

    Create a new SecretVersion in Secret Manager. This also updates the latest alias to point to the newly created secret version.
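The two steps above can be sketched as a handler for the Pub/Sub push request body. This assumes a push subscription that delivers the standard JSON envelope, that the notification carries `eventType` and `secretId` attributes (with `SECRET_ROTATE` marking a rotation), and that `client` is a `SecretManagerServiceClient` from the google-cloud-secret-manager Python library. `create_external_secret` is a hypothetical callback standing in for whatever the external system requires.

```python
from typing import Callable, Optional


def handle_rotation_push(envelope: dict,
                         create_external_secret: Callable[[], bytes],
                         client) -> Optional[str]:
    """Process one Pub/Sub push envelope; return the new version name, if any."""
    attributes = envelope.get("message", {}).get("attributes", {})
    if attributes.get("eventType") != "SECRET_ROTATE":
        return None  # not a rotation notification, nothing to do

    # Assumed to be the full secret name: projects/*/secrets/*
    secret_name = attributes["secretId"]

    # Step 1: obtain a new credential without invalidating the existing one.
    new_value = create_external_secret()

    # Step 2: store it as a new secret version.
    version = client.add_secret_version(
        request={"parent": secret_name, "payload": {"data": new_value}}
    )
    return version.name
```

A Cloud Run service would call this from its HTTP handler with the parsed request body and return a success status only after the new version has been added, so that Pub/Sub redelivers the message on failure.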

Retries and Concurrency

Since your rotation process could be terminated at any time, your Cloud Run service must be able to restart the process from where it left off (it must be reentrant).

We recommend configuring retries in Pub/Sub so that failed or interrupted rotations can be re-executed. Additionally, configure the max concurrency and max instances on your Cloud Run service to minimize the probability of concurrent rotation executions interfering with each other.

In order to build a reentrant rotation function, you may find it useful to save state to allow the rotation process to resume. There are two Secret Manager features that help with this:

  • Use labels on Secrets to save state during rotation. Add a label to the Secret to track the number of the last successfully added version during the rotation workflow (for example, ROTATING_TO_NEW_VERSION_NUMBER=3). Once the rotation has completed, remove the rotation tracking label.
  • Use etags to verify that other processes are not concurrently modifying the secret during the rotation workflow. Learn more about secret and secret version etags.
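The sketch below combines both features: it stores rotation state in a label and passes the secret's etag on every update, so a write is rejected if another process modified the secret in the meantime. It assumes `client` is a `SecretManagerServiceClient` from the google-cloud-secret-manager Python library. The helper names are hypothetical, and the label key is written in lowercase on the assumption that label keys must be lowercase.

```python
from typing import Optional

STATE_LABEL = "rotating_to_new_version_number"  # assumed lowercase label key


def _write_labels(client, secret, labels: dict) -> None:
    """Update only the labels, guarded by the etag read alongside them."""
    client.update_secret(
        request={
            "secret": {"name": secret.name, "labels": labels, "etag": secret.etag},
            "update_mask": {"paths": ["labels"]},
        }
    )


def pending_version(client, secret_name: str) -> Optional[int]:
    """Return the version number of an unfinished rotation, or None."""
    secret = client.get_secret(request={"name": secret_name})
    value = dict(secret.labels).get(STATE_LABEL)
    return int(value) if value is not None else None


def record_added_version(client, secret_name: str, version_number: int) -> None:
    """Remember the last successfully added version so a retry can resume."""
    secret = client.get_secret(request={"name": secret_name})
    labels = dict(secret.labels)
    labels[STATE_LABEL] = str(version_number)
    _write_labels(client, secret, labels)


def clear_rotation_state(client, secret_name: str) -> None:
    """Remove the tracking label once the rotation has completed."""
    secret = client.get_secret(request={"name": secret_name})
    labels = dict(secret.labels)
    if labels.pop(STATE_LABEL, None) is not None:
        _write_labels(client, secret, labels)
```

On a retried invocation, the rotation function could call `pending_version` first and skip creating another version when it returns a number.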

Identity and Access Management permissions

Your rotation process requires the secretmanager.versions.add permission to add a new secret version, and may require the secretmanager.versions.access permission to read the previous secret version.

The default Cloud Run service account has the Editor role, which includes permission to add, but not to access, secret versions. To follow the principle of least privilege, we recommend NOT using the default service account. Instead, set up a separate service account for your Cloud Run service and grant it only the Secret Manager roles it needs (one or more of the following):

  • roles/secretmanager.secretVersionAdder
  • roles/secretmanager.secretVersionManager
  • roles/secretmanager.secretAdmin
  • roles/secretmanager.secretAccessor

Roll out the new SecretVersion to workloads

Now that a new, valid SecretVersion has been registered with the external system and is stored in Secret Manager, roll it out to your application. This rollout varies based on your approach to secret binding (see the Binding to a secret version section) and generally does not require manual intervention.

Clean up old secret versions

Once all applications are rotated off the old secret version, it can be safely cleaned up. The cleanup process will depend on the type of secret, but generally:

  1. Ensure that the new secret version has been fully rolled out to all applications.
  2. Disable the old secret version in Secret Manager and verify that applications do not break (wait a reasonable amount of time to allow a human to intervene if disabling breaks a consumer).
  3. Remove or unregister the old secret version from the external system.
  4. Destroy the old secret version in Secret Manager.
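Steps 2 through 4 could be automated along the lines of the sketch below, which should run only after step 1 has been confirmed. It assumes `client` is a `SecretManagerServiceClient` from the google-cloud-secret-manager Python library. `still_healthy` and `unregister_external` are hypothetical callbacks, and the one-hour grace period is an arbitrary placeholder.

```python
import time
from typing import Callable


def retire_version(client, version_name: str,
                   still_healthy: Callable[[], bool],
                   unregister_external: Callable[[], None],
                   grace_seconds: float = 3600.0,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Disable, verify, unregister, and destroy an old secret version.

    Returns True if the version was destroyed, False if it was re-enabled.
    """
    # Step 2: disable first, since disabling is reversible.
    client.disable_secret_version(request={"name": version_name})
    sleep(grace_seconds)  # leave time for consumers to fail and humans to react
    if not still_healthy():
        client.enable_secret_version(request={"name": version_name})
        return False

    # Step 3: remove the old credential from the external system.
    unregister_external()

    # Step 4: destroying a version is permanent.
    client.destroy_secret_version(request={"name": version_name})
    return True
```

Disabling before destroying keeps the operation reversible during the grace period. Once a version is destroyed, its payload cannot be recovered.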