Protect sensitive data using Secret Manager with Batch

This document describes how to protect sensitive data that you want to specify for a Batch job by using Secret Manager secrets.

Secret Manager secrets protect sensitive data through encryption. In a Batch job, you can specify one or more existing secrets to securely pass the sensitive data they contain, which you can use to do following:

Before you begin

  1. If you haven't used Batch before, review Get started with Batch and enable Batch by completing the prerequisites for projects and users.
  2. Create a secret or identify a secret for the sensitive data that you want to securely specify for a job.
  3. To get the permissions that you need to create a job, ask your administrator to grant you the following IAM roles:

    For more information about granting roles, see Manage access to projects, folders, and organizations.

    You might also be able to get the required permissions through custom roles or other predefined roles.

  4. To ensure that the job's service account has the necessary permissions to access secrets, ask your administrator to grant the job's service account the Secret Manager Secret Accessor (roles/secretmanager.secretAccessor) IAM role on the secret.

Securely pass sensitive data to custom environment variables

To securely pass sensitive data from Secret Manager secrets to custom environment variables, you must define each environment variable in the secret variables (secretVariables) subfield for an environment and specify a secret for each value. Whenever you specify a secret in a job, you must format it as a path to a secret version: projects/PROJECT_ID/secrets/SECRET_ID/versions/VERSION.

You can create a job that defines secret variables by using the gcloud CLI, Batch API, Java, or Python. The following example explains how to create a job that defines and uses a secret variable for the environment of all runnables (environment subfield of taskSpec).

gcloud

  1. Create a JSON file that specifies the job's configuration details and include the secretVariables subfield for one or more environments.

    For example, to create a basic script job that uses a secret variable in the environment for all runnables, create a JSON file with the following contents:

    {
      "taskGroups": [
        {
          "taskSpec": {
            "runnables": [
              {
                "script": {
                  "text": "echo This is the secret: ${SECRET_VARIABLE_NAME}"
                }
              }
            ],
            "environment": {
              "secretVariables": {
                "{SECRET_VARIABLE_NAME}": "projects/PROJECT_ID/secrets/SECRET_NAME/versions/VERSION"
              }
            }
          }
        }
      ],
      "logsPolicy": {
        "destination": "CLOUD_LOGGING"
      }
    }
    

    Replace the following:

    • SECRET_VARIABLE_NAME: the name of the secret variable. By convention, environment variable names are capitalized.

      To securely access the sensitive data from the variable's Secret Manager secret, specify this variable name in this job's runnables. The secret variable is accessible to all of the runnables that are in the same environment where you define the secret variable.

    • PROJECT_ID: the project ID of your project.

    • SECRET_NAME: the name of an existing Secret Manager secret.

    • VERSION: the version of the specified secret that contains the data you want to pass to the job. This can be the version number or latest.

  2. To create and run the job, use the gcloud batch jobs submit command:

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.

    • LOCATION: the location of the job.

    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.

API

Make a POST request to the jobs.create method that specifies the secretVariables subfield for one or more environments.

For example, to create a basic script job that uses a secret variable in the environment for all runnables, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo This is the secret: ${SECRET_VARIABLE_NAME}"
            }
          }
        ],
        "environment": {
          "secretVariables": {
            "{SECRET_VARIABLE_NAME}": "projects/PROJECT_ID/secrets/SECRET_NAME/versions/VERSION"
          }
        }
      }
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.

  • LOCATION: the location of the job.

  • JOB_NAME: the name of the job.

  • SECRET_VARIABLE_NAME: the name of the secret variable. By convention, environment variable names are capitalized.

    To securely access the sensitive data from the variable's Secret Manager secret, specify this variable name in this job's runnables. The secret variable is accessible to all of the runnables that are in the same environment where you define the secret variable.

  • SECRET_NAME: the name of an existing Secret Manager secret.

  • VERSION: the version of the specified secret that contains the data you want to pass to the job. This can be the version number or latest.

Java


import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Environment;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.LogsPolicy.Destination;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.Runnable.Script;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateBatchUsingSecretManager {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Google Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";
    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "europe-central2";
    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "JOB_NAME";
    // The name of the secret variable.
    // This variable name is specified in this job's runnables
    // and is accessible to all of the runnables that are in the same environment.
    String secretVariableName = "VARIABLE_NAME";
    // The name of an existing Secret Manager secret.
    String secretName = "SECRET_NAME";
    // The version of the specified secret that contains the data you want to pass to the job.
    // This can be the version number or latest.
    String version = "VERSION";

    createBatchUsingSecretManager(projectId, region,
            jobName, secretVariableName, secretName, version);
  }

  // Create a basic script job to securely pass sensitive data.
  // The data is obtained from Secret Manager secrets
  // and set as custom environment variables in the job.
  public static Job createBatchUsingSecretManager(String projectId, String region,
                                                  String jobName, String secretVariableName,
                                                  String secretName, String version)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {
      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setScript(
                  Script.newBuilder()
                      .setText(
                          String.format("echo This is the secret: ${%s}.", secretVariableName))
                      // You can also run a script from a file. Just remember, that needs to be a
                      // script that's already on the VM that will be running the job.
                      // Using setText() and setPath() is mutually exclusive.
                      // .setPath("/tmp/test.sh")
                      .build())
              .build();

      // Construct the resource path to the secret's version.
      String secretValue = String
              .format("projects/%s/secrets/%s/versions/%s", projectId, secretName, version);

      // Set the secret as an environment variable.
      Environment.Builder environmentVariable = Environment.newBuilder()
          .putSecretVariables(secretVariableName, secretValue);

      TaskSpec task = TaskSpec.newBuilder()
          // Jobs can be divided into tasks. In this case, we have only one task.
          .addRunnables(runnable)
          .setEnvironment(environmentVariable)
          .setMaxRetryCount(2)
          .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
          .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder()
          .setTaskSpec(task)
          .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .putLabels("env", "testing")
              .putLabels("type", "script")
              // We use Cloud Logging as it's an out of the box available option.
              .setLogsPolicy(
                  LogsPolicy.newBuilder().setDestination(Destination.CLOUD_LOGGING))
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());

      return result;
    }
  }
}

Python

from typing import Dict, Optional

from google.cloud import batch_v1


def create_with_secret_manager(
    project_id: str,
    region: str,
    job_name: str,
    secrets: Dict[str, str],
    service_account_email: Optional[str] = None,
) -> batch_v1.Job:
    """
    This method shows how to create a sample Batch Job that will run
    a simple command on Cloud Compute instances with passing secrets from secret manager.
    Note: Job's service account should have the permissions to access secrets.
        - Secret Manager Secret Accessor (roles/secretmanager.secretAccessor) IAM role.

    Args:
        project_id: project ID or project number of the Cloud project you want to use.
        region: name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
        job_name: the name of the job that will be created.
            It needs to be unique for each project and region pair.
        secrets: secrets, which should be passed to the job. Environment variables should be capitalized
        by convention https://google.github.io/styleguide/shellguide.html#constants-and-environment-variable-names
            The format should look like:
                - {'SECRET_NAME': 'projects/{project_id}/secrets/{SECRET_NAME}/versions/{version}'}
            version can be set to 'latest'.
        service_account_email (optional): custom service account email

    Returns:
        A job object representing the job created.
    """
    client = batch_v1.BatchServiceClient()

    # Define what will be done as part of the job.
    task = batch_v1.TaskSpec()
    runnable = batch_v1.Runnable()
    runnable.script = batch_v1.Runnable.Script()
    runnable.script.text = (
        "echo Hello world! from task ${BATCH_TASK_INDEX}."
        + f" ${next(iter(secrets.keys()))} is the value of the secret."
    )
    task.runnables = [runnable]
    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    envable = batch_v1.Environment()
    envable.secret_variables = secrets
    task.environment = envable

    # Tasks are grouped inside a job using TaskGroups.
    # Currently, it's possible to have only one task group.
    group = batch_v1.TaskGroup()
    group.task_count = 4
    group.task_spec = task

    # Policies are used to define on what kind of virtual machines the tasks will run on.
    # Read more about local disks here: https://cloud.google.com/compute/docs/disks/persistent-disks
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    service_account = batch_v1.ServiceAccount()
    service_account.email = service_account_email
    allocation_policy.service_account = service_account

    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "script"}
    # We use Cloud Logging as it's an out of the box available option
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)

Securely access container images requiring Docker registry credentials

To use a container image from a private Docker registry, a runnable must specify login credentials that allow it to access that Docker registry. Specifically, for any container runnable with the image URI (imageUri) field set to a image from a private Docker registry, you must specify any credentials required to access that Docker registry by using the username (username) field and password (password) field.

You can protect any sensitive credentials for a Docker registry by specifying existing secrets that contain the information instead of defining these fields directly. Whenever you specify a secret in a job, you must format it as a path to a secret version: projects/PROJECT_ID/secrets/SECRET_ID/versions/VERSION.

You can create a job that uses container images from a private Docker registry by using the gcloud CLI or Batch API. The following example explains how to create a job that uses a container image from a private Docker registry by specifying the username directly and the password as a secret.

gcloud

  1. Create a JSON file that specifies the job's configuration details. For any container runnables that use images from a private Docker registry, include any credentials required to access it in the username and password fields.

    For example, to create a basic container job that specifies an image from a private Docker registry, create a JSON file with the following contents:

    {
      "taskGroups": [
        {
          "taskSpec": {
            "runnables": [
              {
                "container": {
                  "imageUri": "PRIVATE_IMAGE_URI",
                  "commands": [
                    "-c",
                    "echo This runnable uses a private image."
                  ],
                  "username": "USERNAME",
                  "password": "PASSWORD"
                }
              }
            ],
          }
        }
      ],
      "logsPolicy": {
        "destination": "CLOUD_LOGGING"
      }
    }
    

    Replace the following:

    • PRIVATE_IMAGE_URI: the image URI for a container image from a private Docker registry. If this image requires any other container settings, you must include those also.

    • USERNAME: the username for the private Docker registry, which can be specified as a secret or directly.

    • PASSWORD: the password for the private Docker registry, which can be specified as a secret (recommended) or directly.

      For example, to specify the password as a secret, set PASSWORD to the following:

      projects/PROJECT_ID/secrets/SECRET_ID/versions/VERSION
      

      Replace the following:

  2. To create and run the job, use the gcloud batch jobs submit command:

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.

    • LOCATION: the location of the job.

    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.

API

Make a POST request to the jobs.create method. For any container runnables that use images from a private Docker registry, include any credentials required to access it in the username and password fields.

For example, to create a basic container job that specifies an image from a private Docker registry, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "container": {
              "imageUri": "PRIVATE_IMAGE_URI",
                "commands": [
                  "-c",
                  "echo This runnable uses a private image."
                ],
                "username": "USERNAME",
                "password": "PASSWORD"
            }
          }
        ],
      }
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.

  • LOCATION: the location of the job.

  • JOB_NAME: the name of the job.

  • PRIVATE_IMAGE_URI: the image URI for a container image from a private Docker registry. If this image requires any other container settings, you must include those also.

  • USERNAME: the username for the private Docker registry, which can be specified as a secret or directly.

  • PASSWORD: the password for the private Docker registry, which can be specified as a secret (recommended) or directly.

    For example, to specify the password as a secret, set PASSWORD to the following:

    projects/PROJECT_ID/secrets/SECRET_ID/versions/VERSION
    

    Replace the following:

What's next