Create and run a job

This page describes how to run a batch-processing workload on Google Cloud by creating a Batch job.

Create a job to specify your workload and its requirements. When you finish creating the job, Google Cloud automatically queues, schedules, and executes it. How long a job takes to finish varies from job to job and over time, depending on resource availability. Generally, smaller jobs that require only a few, common resources are likely to run and finish sooner. The example jobs on this page use minimal resources, so they might finish running in as little as a few minutes.

You can create a job through the following options:

  • Create a basic job describes the fundamentals, including how to define a job's tasks using either a script or container image.
  • Create a job that uses environment variables describes how to access and use Batch predefined environment variables or the environment variables you define in your job's resources.
  • Optional: Create a job from a Compute Engine instance template describes how to specify an instance template to define a job's resources. If your job needs to use a specific VM image or a custom machine type, an instance template is required.
  • Optional: Create a job that uses a custom service account describes how to specify a job's service account, which influences the resources and applications that a job's VMs can access.
  • Optional: Create a job that uses MPI for tightly coupled tasks describes how to configure a job with interdependent tasks that communicate with each other across different VMs by using a Message Passing Interface (MPI) library. A common use case for MPI is tightly coupled high-performance computing (HPC) workloads.
  • Optional: Create a job that uses a GPU describes how to define a job that uses a graphics processing unit (GPU). Common use cases for jobs that use GPUs include intensive data processing or machine learning (ML) workloads.
  • Optional: Create a job that uses storage volumes describes how to define a job that can access one or more external storage volumes. Storage options include new or existing persistent disks, new local SSDs, existing Cloud Storage buckets, and an existing network file system (NFS), such as a Filestore file share.

Before you begin

  • If you haven't used Batch before, review Get started with Batch and enable Batch by completing the prerequisites for projects and users.
  • To get the permissions that you need to create a job, ask your administrator to grant you the following IAM roles:

    • Batch Job Editor (roles/batch.jobsEditor) on the project
    • Service Account User (roles/iam.serviceAccountUser) on the job's service account, which by default is the default Compute Engine service account
    • Create a job from a Compute Engine instance template: Compute Viewer (roles/compute.viewer) on the instance template
    • Create a job that uses a Cloud Storage bucket: Storage Object Viewer (roles/storage.objectViewer) on the bucket

    For more information about granting roles, see Manage access.

Create a basic job

This section describes how to create an example job that runs either a script or a container image:

  • If you want to use Batch to run jobs from a container image, see Create a basic container job.
  • Otherwise, if you aren't sure whether you want to use container images, or if you are unfamiliar with containers, creating a script job is recommended.

The example job for both types contains a task group with an array of 4 tasks. Each task prints a message and its index to standard output and Cloud Logging. The job's definition specifies a parallelism of 2, which indicates that the job should run on 2 VMs so that 2 tasks can run at a time.
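
For example, because each task substitutes its own index into the message, task 0 of a 4-task job prints a line like the following:

Hello world! This is task 0. This job has a total of 4 tasks.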

Create a basic container job

You can select or create a container image that provides the code and dependencies for your job to run from any compute environment. For more information, see Working with container images and Running containers on VM instances.

You can create a basic container job using the Google Cloud console, gcloud CLI, Batch API, Go, Java, Node.js, or Python.

Console

To create a basic container job using the Google Cloud console, do the following:

  1. In the Google Cloud console, go to the Job list page.

    Go to Job list

  2. Click Create. The Create batch job page opens.

  3. In the Job name field, enter a job name.

    For example, enter example-basic-job.

  4. In the Region field, select the location for this job.

    For example, select us-central1 (default).

  5. For VM provisioning model, select an option for the provisioning model for this job's VMs:

    • If your job can withstand preemption and you want discounted VMs, select Spot.
    • Otherwise, select Standard.

    For example, select Standard (default).

  6. In the Task count field, enter the number of tasks for this job. The value must be a whole number between 1 and 10000.

    For example, enter 4.

  7. In the Parallelism field, enter the number of tasks to run concurrently. The number cannot be larger than the total number of tasks and must be a whole number between 1 and 1000.

    For example, enter 2.

  8. For Task details, select Container image URL (default).

  9. For the Container image URL field, enter a container image.

    For example, enter the following to use the busybox Docker container image:

    gcr.io/google-containers/busybox
    
  10. Optional: To override the container image's ENTRYPOINT command, enter a new command in the Entry point field.

    For example, enter the following:

    /bin/sh
    
  11. Optional: To also override the container image's CMD command, select the Override container image's CMD command checkbox and enter one or more commands in the field that appears, separating each command with a new line.

    For example, select the Override container image's CMD command checkbox and enter the following commands:

    -c
    echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.
    
  12. For Task resources, specify the amount of VM resources required for each task: in the Cores field, enter the number of vCPUs, and in the Memory field, enter the amount of RAM in GB.

    For example, enter 1 vCPU (default) and 0.5 GB (default).

  13. Click Create.

The Job list page displays the job that you created.

gcloud

To create a basic container job using the gcloud CLI, do the following:

  1. Create a JSON file that specifies your job's configuration details. For example, to create a basic container job, create a JSON file with the following contents. For more information about all the fields you can specify for a job, see the reference documentation for the projects.locations.jobs REST resource.

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "container": {
                                CONTAINER
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": CORES,
                        "memoryMib": MEMORY
                    },
                    "maxRetryCount": MAX_RETRY_COUNT,
                    "maxRunDuration": "MAX_RUN_DURATION"
                },
                "taskCount": TASK_COUNT,
                "parallelism": PARALLELISM
            }
        ]
    }
    

    Replace the following:

    • CONTAINER: the container that each task runs.
    • CORES: Optional. The number of cores (specifically vCPUs, which usually represent half a physical core) to allocate for each task, in milliCPU units. If the cpuMilli field is not specified, the value is set to 2000 (2 vCPUs).
    • MEMORY: Optional. The amount of memory to allocate for each task, in MiB. If the memoryMib field is not specified, the value is set to 2000 (2 GB).
    • MAX_RETRY_COUNT: Optional. The maximum number of retries for a task. The value must be a whole number between 0 and 10. If the maxRetryCount field is not specified, the value is set to 0, which means to not retry the task.
    • MAX_RUN_DURATION: Optional. The maximum time a task is allowed to run before being retried or failing, formatted as a value in seconds followed by s. If the maxRunDuration field is not specified, the value is set to 604800s (7 days), which is the maximum value.
    • TASK_COUNT: Optional. The number of tasks for the job. The value must be a whole number between 1 and 10000. If the taskCount field is not specified, the value is set to 1.
    • PARALLELISM: Optional. The number of tasks the job runs concurrently. The number cannot be larger than the number of tasks and must be a whole number between 1 and 1000. If the parallelism field is not specified, the value is set to 1.
  2. Create a job by using the gcloud batch jobs submit command.

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.

For example, to create a job that runs tasks using the busybox Docker container image:

  1. Create a JSON file in the current directory named hello-world-container.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "container": {
                                "imageUri": "gcr.io/google-containers/busybox",
                                "entrypoint": "/bin/sh",
                                "commands": [
                                    "-c",
                                    "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                                ]
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": 2000,
                        "memoryMib": 16
                    },
                    "maxRetryCount": 2,
                    "maxRunDuration": "3600s"
                },
                "taskCount": 4,
                "parallelism": 2
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "policy": { "machineType": "e2-standard-4" }
                }
            ]
        },
        "labels": {
            "department": "finance",
            "env": "testing"
        },
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    
  2. Run the following command:

    gcloud batch jobs submit example-container-job \
      --location us-central1 \
      --config hello-world-container.json
    

API

To create a basic container job using the Batch API, use the jobs.create method. For more information about all the fields you can specify for a job, see the reference documentation for the projects.locations.jobs REST resource.

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "container": {
                            CONTAINER
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": CORES,
                    "memoryMib": MEMORY
                },
                "maxRetryCount": MAX_RETRY_COUNT,
                "maxRunDuration": "MAX_RUN_DURATION"
            },
            "taskCount": TASK_COUNT,
            "parallelism": PARALLELISM
        }
    ]
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • LOCATION: the location of the job.
  • JOB_NAME: the name of the job.
  • CONTAINER: the container that each task runs.
  • CORES: Optional. The number of cores (specifically vCPUs, which usually represent half a physical core) to allocate for each task, in milliCPU units. If the cpuMilli field is not specified, the value is set to 2000 (2 vCPUs).
  • MEMORY: Optional. The amount of memory to allocate for each task, in MiB. If the memoryMib field is not specified, the value is set to 2000 (2 GB).
  • MAX_RETRY_COUNT: Optional. The maximum number of retries for a task. The value must be a whole number between 0 and 10. If the maxRetryCount field is not specified, the value is set to 0, which means to not retry the task.
  • MAX_RUN_DURATION: Optional. The maximum time a task is allowed to run before being retried or failing, formatted as a value in seconds followed by s. If the maxRunDuration field is not specified, the value is set to 604800s (7 days), which is the maximum value.
  • TASK_COUNT: Optional. The number of tasks for the job, which must be a whole number between 1 and 10000. If the taskCount field is not specified, the value is set to 1.
  • PARALLELISM: Optional. The number of tasks the job runs concurrently. The number cannot be larger than the number of tasks and must be a whole number between 1 and 1000. If the parallelism field is not specified, the value is set to 1.

For example, to create a job that runs tasks using the busybox Docker container image, use the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-container-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "container": {
                            "imageUri": "gcr.io/google-containers/busybox",
                            "entrypoint": "/bin/sh",
                            "commands": [
                                "-c",
                                "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                            ]
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": 2000,
                    "memoryMib": 16
                },
                "maxRetryCount": 2,
                "maxRunDuration": "3600s"
            },
            "taskCount": 4,
            "parallelism": 2
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "policy": { "machineType": "e2-standard-4" }
            }
        ]
    },
    "labels": {
        "department": "finance",
        "env": "testing"
    },
    "logsPolicy": {
        "destination": "CLOUD_LOGGING"
    }
}

where PROJECT_ID is the project ID of your project.

Go

For more information, see the Batch Go API reference documentation.

import (
	"context"
	"fmt"
	"io"

	batch "cloud.google.com/go/batch/apiv1"
	"cloud.google.com/go/batch/apiv1/batchpb"
	durationpb "google.golang.org/protobuf/types/known/durationpb"
)

// Creates and runs a job that runs the specified container
func createContainerJob(w io.Writer, projectID, region, jobName string) error {
	// projectID := "your_project_id"
	// region := "us-central1"
	// jobName := "some-job"

	ctx := context.Background()
	batchClient, err := batch.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %v", err)
	}
	defer batchClient.Close()

	container := &batchpb.Runnable_Container{
		ImageUri:   "gcr.io/google-containers/busybox",
		Commands:   []string{"-c", "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."},
		Entrypoint: "/bin/sh",
	}

	// We can specify what resources are requested by each task.
	resources := &batchpb.ComputeResource{
		// CpuMilli is milliseconds per cpu-second. This means the task requires 2 whole CPUs.
		CpuMilli:  2000,
		MemoryMib: 16,
	}

	taskSpec := &batchpb.TaskSpec{
		Runnables: []*batchpb.Runnable{{
			Executable: &batchpb.Runnable_Container_{Container: container},
		}},
		ComputeResource: resources,
		MaxRunDuration: &durationpb.Duration{
			Seconds: 3600,
		},
		MaxRetryCount: 2,
	}

	// Tasks are grouped inside a job using TaskGroups.
	taskGroups := []*batchpb.TaskGroup{
		{
			TaskCount: 4,
			TaskSpec:  taskSpec,
		},
	}

	// Policies are used to define what kind of virtual machines the tasks will run on.
	// In this case, we tell the system to use "e2-standard-4" machine type.
	// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
	allocationPolicy := &batchpb.AllocationPolicy{
		Instances: []*batchpb.AllocationPolicy_InstancePolicyOrTemplate{{
			PolicyTemplate: &batchpb.AllocationPolicy_InstancePolicyOrTemplate_Policy{
				Policy: &batchpb.AllocationPolicy_InstancePolicy{
					MachineType: "e2-standard-4",
				},
			},
		}},
	}

	// We use Cloud Logging as it's an option available out of the box
	logsPolicy := &batchpb.LogsPolicy{
		Destination: batchpb.LogsPolicy_CLOUD_LOGGING,
	}

	jobLabels := map[string]string{"env": "testing", "type": "container"}

	// The job's parent is the region in which the job will run
	parent := fmt.Sprintf("projects/%s/locations/%s", projectID, region)

	job := batchpb.Job{
		TaskGroups:       taskGroups,
		AllocationPolicy: allocationPolicy,
		Labels:           jobLabels,
		LogsPolicy:       logsPolicy,
	}

	req := &batchpb.CreateJobRequest{
		Parent: parent,
		JobId:  jobName,
		Job:    &job,
	}

	createdJob, err := batchClient.CreateJob(ctx, req)
	if err != nil {
		return fmt.Errorf("unable to create job: %v", err)
	}

	fmt.Fprintf(w, "Job created: %v\n", createdJob)

	return nil
}

Java

For more information, see the Batch Java API reference documentation.


import com.google.cloud.batch.v1.AllocationPolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicyOrTemplate;
import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.ComputeResource;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.LogsPolicy.Destination;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.Runnable.Container;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateWithContainerNoMounting {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";

    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "europe-central2";

    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "JOB_NAME";

    createContainerJob(projectId, region, jobName);
  }

  // This method shows how to create a sample Batch Job that will run a simple command inside a
  // container on Compute Engine instances.
  public static void createContainerJob(String projectId, String region, String jobName)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the `batchServiceClient.close()` method on the client to safely
    // clean up any remaining background resources.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {

      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setContainer(
                  Container.newBuilder()
                      .setImageUri("gcr.io/google-containers/busybox")
                      .setEntrypoint("/bin/sh")
                      .addCommands("-c")
                      .addCommands(
                          "echo Hello world! This is task ${BATCH_TASK_INDEX}. "
                              + "This job has a total of ${BATCH_TASK_COUNT} tasks.")
                      .build())
              .build();

      // We can specify what resources are requested by each task.
      ComputeResource computeResource =
          ComputeResource.newBuilder()
              // In milliseconds per cpu-second. This means the task requires 2 whole CPUs.
              .setCpuMilli(2000)
              // In MiB.
              .setMemoryMib(16)
              .build();

      TaskSpec task =
          TaskSpec.newBuilder()
              // Jobs can be divided into tasks. In this case, each task runs a single runnable.
              .addRunnables(runnable)
              .setComputeResource(computeResource)
              .setMaxRetryCount(2)
              .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
              .build();

      // Tasks are grouped inside a job using TaskGroups.
      TaskGroup taskGroup = TaskGroup.newBuilder().setTaskCount(4).setTaskSpec(task).build();

      // Policies are used to define what kind of virtual machines the tasks will run on.
      // In this case, we tell the system to use "e2-standard-4" machine type.
      // Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
      InstancePolicy instancePolicy =
          InstancePolicy.newBuilder().setMachineType("e2-standard-4").build();

      AllocationPolicy allocationPolicy =
          AllocationPolicy.newBuilder()
              .addInstances(InstancePolicyOrTemplate.newBuilder().setPolicy(instancePolicy).build())
              .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .setAllocationPolicy(allocationPolicy)
              .putLabels("env", "testing")
              .putLabels("type", "container")
              // We use Cloud Logging as it's an option available out of the box.
              .setLogsPolicy(
                  LogsPolicy.newBuilder().setDestination(Destination.CLOUD_LOGGING).build())
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());
    }
  }
}

Node.js

For more information, see the Batch Node.js API reference documentation.

/**
 * TODO(developer): Uncomment and replace these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
/**
 * The region you want the job to run in. The regions that support Batch are listed here:
 * https://cloud.google.com/batch/docs/get-started#locations
 */
// const region = 'us-central1';
/**
 * The name of the job that will be created.
 * It needs to be unique for each project and region pair.
 */
// const jobName = 'YOUR_JOB_NAME';

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

// Define what will be done as part of the job.
const task = new batch.TaskSpec();
const runnable = new batch.Runnable();
runnable.container = new batch.Runnable.Container();
runnable.container.imageUri = 'gcr.io/google-containers/busybox';
runnable.container.entrypoint = '/bin/sh';
// With the /bin/sh entrypoint, pass -c so the string that follows runs as a shell command.
runnable.container.commands = [
  '-c',
  'echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.',
];
task.runnables = [runnable];

// We can specify what resources are requested by each task.
const resources = new batch.ComputeResource();
resources.cpuMilli = 2000; // in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
resources.memoryMib = 16;
task.computeResource = resources;

task.maxRetryCount = 2;
task.maxRunDuration = {seconds: 3600};

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup();
group.taskCount = 4;
group.taskSpec = task;

// Policies are used to define what kind of virtual machines the tasks will run on.
// In this case, we tell the system to use "e2-standard-4" machine type.
// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
const allocationPolicy = new batch.AllocationPolicy();
const policy = new batch.AllocationPolicy.InstancePolicy();
policy.machineType = 'e2-standard-4';
const instances = new batch.AllocationPolicy.InstancePolicyOrTemplate();
instances.policy = policy;
allocationPolicy.instances = [instances];

const job = new batch.Job();
job.name = jobName;
job.taskGroups = [group];
job.allocationPolicy = allocationPolicy;
job.labels = {env: 'testing', type: 'container'};
// We use Cloud Logging as it's an option available out of the box
job.logsPolicy = new batch.LogsPolicy();
job.logsPolicy.destination = batch.LogsPolicy.Destination.CLOUD_LOGGING;

// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateJob() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const response = await batchClient.createJob(request);
  console.log(response);
}

callCreateJob();

Python

For more information, see the Batch Python API reference documentation.

from google.cloud import batch_v1


def create_container_job(project_id: str, region: str, job_name: str) -> batch_v1.Job:
    """
    This method shows how to create a sample Batch Job that will run
    a simple command inside a container on Compute Engine instances.

    Args:
        project_id: project ID or project number of the Cloud project you want to use.
        region: name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
        job_name: the name of the job that will be created.
            It needs to be unique for each project and region pair.

    Returns:
        A job object representing the job created.
    """
    client = batch_v1.BatchServiceClient()

    # Define what will be done as part of the job.
    runnable = batch_v1.Runnable()
    runnable.container = batch_v1.Runnable.Container()
    runnable.container.image_uri = "gcr.io/google-containers/busybox"
    runnable.container.entrypoint = "/bin/sh"
    runnable.container.commands = ["-c", "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."]

    # Jobs can be divided into tasks. In this case, each task runs a single runnable.
    task = batch_v1.TaskSpec()
    task.runnables = [runnable]

    # We can specify what resources are requested by each task.
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 2000  # in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
    resources.memory_mib = 16  # in MiB
    task.compute_resource = resources

    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    # Tasks are grouped inside a job using TaskGroups.
    # Currently, it's possible to have only one task group.
    group = batch_v1.TaskGroup()
    group.task_count = 4
    group.task_spec = task

    # Policies are used to define what kind of virtual machines the tasks will run on.
    # In this case, we tell the system to use "e2-standard-4" machine type.
    # Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "container"}
    # We use Cloud Logging as it's an option available out of the box
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)
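
For example, you might call this sample as follows. This is a minimal usage sketch; my-project, us-central1, and example-container-job are placeholder values to replace with your own:

# Hypothetical usage of create_container_job() from the sample above.
job = create_container_job("my-project", "us-central1", "example-container-job")
print(f"Created job: {job.name}")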

Create a basic script job

You can create a basic script job using the Google Cloud console, gcloud CLI, Batch API, Go, Java, Node.js, or Python.

Console

To create a basic script job using the Google Cloud console, do the following:

  1. In the Google Cloud console, go to the Job list page.

    Go to Job list

  2. Click Create. The Create batch job page opens.

  3. In the Job name field, enter a job name.

    For example, enter example-basic-job.

  4. In the Region field, select the location for this job.

    For example, select us-central1 (default).

  5. For VM provisioning model, select an option for the provisioning model for this job's VMs:

    • If your job can withstand preemption and you want discounted VMs, select Spot.
    • Otherwise, select Standard.

    For example, select Standard (default).

  6. In the Task count field, enter the number of tasks for this job. The value must be a whole number between 1 and 10000.

    For example, enter 4.

  7. In the Parallelism field, enter the number of tasks to run concurrently. The number cannot be larger than the total number of tasks and must be a whole number between 1 and 1000.

    For example, enter 2.

  8. For Task details, select Script.

    Then, in the field that appears, enter a script to run for each task.

    For example, use the following script:

    echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.
    
  9. For Task resources, specify the amount of VM resources required for each task: in the Cores field, enter the number of vCPUs, and in the Memory field, enter the amount of RAM in GB.

    For example, enter 1 vCPU (default) and 0.5 GB (default).

  10. Click Create.

The Job list page displays the job that you created.

gcloud

To create a basic script job using the gcloud CLI, do the following:

  1. Create a JSON file that specifies your job's configuration details. For example, to create a basic script job, create a JSON file with the following contents. For more information about all the fields you can specify for a job, see the reference documentation for the projects.locations.jobs REST resource.

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                SCRIPT
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": CORES,
                        "memoryMib": MEMORY
                    },
                    "maxRetryCount": MAX_RETRY_COUNT,
                    "maxRunDuration": "MAX_RUN_DURATION"
                },
                "taskCount": TASK_COUNT,
                "parallelism": PARALLELISM
            }
        ]
    }
    

    Replace the following:

    • SCRIPT: the script that each task runs.
    • CORES: Optional. The number of cores (specifically vCPUs, which usually represent half a physical core) to allocate for each task, in milliCPU units. If the cpuMilli field is not specified, the value is set to 2000 (2 vCPUs).
    • MEMORY: Optional. The amount of memory to allocate for each task, in MiB. If the memoryMib field is not specified, the value is set to 2000 (2 GB).
    • MAX_RETRY_COUNT: Optional. The maximum number of retries for a task. The value must be a whole number between 0 and 10. If the maxRetryCount field is not specified, the value is set to 0, which means to not retry the task.
    • MAX_RUN_DURATION: Optional. The maximum time a task is allowed to run before being retried or failing, formatted as a value in seconds followed by s. If the maxRunDuration field is not specified, the value is set to 604800s (7 days), which is the maximum value.
    • TASK_COUNT: Optional. The number of tasks for the job. The value must be a whole number between 1 and 10000. If the taskCount field is not specified, the value is set to 1.
    • PARALLELISM: Optional. The number of tasks the job runs concurrently. The number cannot be larger than the number of tasks and must be a whole number between 1 and 1000. If the parallelism field is not specified, the value is set to 1.
  2. Create a job by using the gcloud batch jobs submit command.

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.

For example, to create a job that runs tasks using a script:

  1. Create a JSON file in the current directory named hello-world-script.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": 2000,
                        "memoryMib": 16
                    },
                    "maxRetryCount": 2,
                    "maxRunDuration": "3600s"
                },
                "taskCount": 4,
                "parallelism": 2
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "policy": { "machineType": "e2-standard-4" }
                }
            ]
        },
        "labels": {
            "department": "finance",
            "env": "testing"
        },
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    
  2. Run the following command:

    gcloud batch jobs submit example-script-job \
      --location us-central1 \
      --config hello-world-script.json
    

API

To create a basic script job using the Batch API, use the jobs.create method. For more information about all the fields you can specify for a job, see the reference documentation for the projects.locations.jobs REST resource.

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            SCRIPT
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": CORES,
                    "memoryMib": MEMORY
                },
                "maxRetryCount": MAX_RETRY_COUNT,
                "maxRunDuration": "MAX_RUN_DURATION"
            },
            "taskCount": TASK_COUNT,
            "parallelism": PARALLELISM
        }
    ]
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • LOCATION: the location of the job.
  • JOB_NAME: the name of the job.
  • SCRIPT: the script that each task runs.
  • CORES: Optional. The number of cores (specifically vCPUs, which usually represent half a physical core) to allocate for each task, in milliCPU units. If the cpuMilli field is not specified, the value is set to 2000 (2 vCPUs).
  • MEMORY: Optional. The amount of memory to allocate for each task, in MiB. If the memoryMib field is not specified, the value is set to 2000 (2 GB).
  • MAX_RETRY_COUNT: Optional. The maximum number of retries for a task. The value must be a whole number between 0 and 10. If the maxRetryCount field is not specified, the value is set to 0, which means to not retry the task.
  • MAX_RUN_DURATION: Optional. The maximum time a task is allowed to run before being retried or failing, formatted as a value in seconds followed by s. If the maxRunDuration field is not specified, the value is set to 604800s (7 days), which is the maximum value.
  • TASK_COUNT: Optional. The number of tasks for the job. The value must be a whole number between 1 and 10000. If the taskCount field is not specified, the value is set to 1.
  • PARALLELISM: Optional. The number of tasks the job runs concurrently. The number cannot be larger than the number of tasks and must be a whole number between 1 and 1000. If the parallelism field is not specified, the value is set to 1.

For example, to create a job that runs tasks using a script, use the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-script-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": 2000,
                    "memoryMib": 16
                },
                "maxRetryCount": 2,
                "maxRunDuration": "3600s"
            },
            "taskCount": 4,
            "parallelism": 2
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "policy": { "machineType": "e2-standard-4" }
            }
        ]
    },
    "labels": {
        "department": "finance",
        "env": "testing"
    },
    "logsPolicy": {
        "destination": "CLOUD_LOGGING"
    }
}

where PROJECT_ID is the project ID of your project.

Go

For more information, see the Batch Go API reference documentation.

import (
	"context"
	"fmt"
	"io"

	batch "cloud.google.com/go/batch/apiv1"
	"cloud.google.com/go/batch/apiv1/batchpb"
	durationpb "google.golang.org/protobuf/types/known/durationpb"
)

// Creates and runs a job that executes the specified script
func createScriptJob(w io.Writer, projectID, region, jobName string) error {
	// projectID := "your_project_id"
	// region := "us-central1"
	// jobName := "some-job"

	ctx := context.Background()
	batchClient, err := batch.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %v", err)
	}
	defer batchClient.Close()

	// Define what will be done as part of the job.
	command := &batchpb.Runnable_Script_Text{
		Text: "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.",
	}
	// You can also run a script from a file. Just remember that it needs to be a script
	// that's already on the VM that will be running the job.
	// Using runnable.script.text and runnable.script.path is mutually exclusive.
	// command := &batchpb.Runnable_Script_Path{
	// 	Path: "/tmp/test.sh",
	// }

	// We can specify what resources are requested by each task.
	resources := &batchpb.ComputeResource{
		// CpuMilli is milliseconds per cpu-second. This means the task requires 2 whole CPUs.
		CpuMilli:  2000,
		MemoryMib: 16,
	}

	taskSpec := &batchpb.TaskSpec{
		Runnables: []*batchpb.Runnable{{
			Executable: &batchpb.Runnable_Script_{
				Script: &batchpb.Runnable_Script{Command: command},
			},
		}},
		ComputeResource: resources,
		MaxRunDuration: &durationpb.Duration{
			Seconds: 3600,
		},
		MaxRetryCount: 2,
	}

	// Tasks are grouped inside a job using TaskGroups.
	taskGroups := []*batchpb.TaskGroup{
		{
			TaskCount: 4,
			TaskSpec:  taskSpec,
		},
	}

	// Policies are used to define what kind of virtual machines the tasks will run on.
	// In this case, we tell the system to use "e2-standard-4" machine type.
	// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
	allocationPolicy := &batchpb.AllocationPolicy{
		Instances: []*batchpb.AllocationPolicy_InstancePolicyOrTemplate{{
			PolicyTemplate: &batchpb.AllocationPolicy_InstancePolicyOrTemplate_Policy{
				Policy: &batchpb.AllocationPolicy_InstancePolicy{
					MachineType: "e2-standard-4",
				},
			},
		}},
	}

	// We use Cloud Logging as it's an option available out of the box
	logsPolicy := &batchpb.LogsPolicy{
		Destination: batchpb.LogsPolicy_CLOUD_LOGGING,
	}

	jobLabels := map[string]string{"env": "testing", "type": "script"}

	// The job's parent is the region in which the job will run
	parent := fmt.Sprintf("projects/%s/locations/%s", projectID, region)

	job := batchpb.Job{
		TaskGroups:       taskGroups,
		AllocationPolicy: allocationPolicy,
		Labels:           jobLabels,
		LogsPolicy:       logsPolicy,
	}

	req := &batchpb.CreateJobRequest{
		Parent: parent,
		JobId:  jobName,
		Job:    &job,
	}

	createdJob, err := batchClient.CreateJob(ctx, req)
	if err != nil {
		return fmt.Errorf("unable to create job: %v", err)
	}

	fmt.Fprintf(w, "Job created: %v\n", createdJob)

	return nil
}

Java

For more information, see the Batch Java API reference documentation.


import com.google.cloud.batch.v1.AllocationPolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicyOrTemplate;
import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.ComputeResource;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.LogsPolicy.Destination;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.Runnable.Script;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateWithScriptNoMounting {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";

    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "europe-central2";

    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "JOB_NAME";

    createScriptJob(projectId, region, jobName);
  }

  // This method shows how to create a sample Batch Job that will run
  // a simple command on Compute Engine instances.
  public static void createScriptJob(String projectId, String region, String jobName)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the `batchServiceClient.close()` method on the client to safely
    // clean up any remaining background resources.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {

      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setScript(
                  Script.newBuilder()
                      .setText(
                          "echo Hello world! This is task ${BATCH_TASK_INDEX}. "
                              + "This job has a total of ${BATCH_TASK_COUNT} tasks.")
                      // You can also run a script from a file. Just remember that it needs to be
                      // a script that's already on the VM that will be running the job.
                      // Using setText() and setPath() is mutually exclusive.
                      // .setPath("/tmp/test.sh")
                      .build())
              .build();

      // We can specify what resources are requested by each task.
      ComputeResource computeResource =
          ComputeResource.newBuilder()
              // In milliseconds per cpu-second. This means the task requires 2 whole CPUs.
              .setCpuMilli(2000)
              // In MiB.
              .setMemoryMib(16)
              .build();

      TaskSpec task =
          TaskSpec.newBuilder()
              // Jobs can be divided into tasks. In this case, each task runs a single runnable.
              .addRunnables(runnable)
              .setComputeResource(computeResource)
              .setMaxRetryCount(2)
              .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
              .build();

      // Tasks are grouped inside a job using TaskGroups.
      TaskGroup taskGroup = TaskGroup.newBuilder().setTaskCount(4).setTaskSpec(task).build();

      // Policies are used to define what kind of virtual machines the tasks will run on.
      // In this case, we tell the system to use "e2-standard-4" machine type.
      // Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
      InstancePolicy instancePolicy =
          InstancePolicy.newBuilder().setMachineType("e2-standard-4").build();

      AllocationPolicy allocationPolicy =
          AllocationPolicy.newBuilder()
              .addInstances(InstancePolicyOrTemplate.newBuilder().setPolicy(instancePolicy).build())
              .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .setAllocationPolicy(allocationPolicy)
              .putLabels("env", "testing")
              .putLabels("type", "script")
              // We use Cloud Logging as it's an option available out of the box.
              .setLogsPolicy(
                  LogsPolicy.newBuilder().setDestination(Destination.CLOUD_LOGGING).build())
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());
    }
  }
}

Node.js

For more information, see the Batch Node.js API reference documentation.

/**
 * TODO(developer): Uncomment and replace these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
/**
 * The region you want the job to run in. The regions that support Batch are listed here:
 * https://cloud.google.com/batch/docs/get-started#locations
 */
// const region = 'us-central1';
/**
 * The name of the job that will be created.
 * It needs to be unique for each project and region pair.
 */
// const jobName = 'YOUR_JOB_NAME';

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

// Define what will be done as part of the job.
const task = new batch.TaskSpec();
const runnable = new batch.Runnable();
runnable.script = new batch.Runnable.Script();
runnable.script.text =
  'echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.';
// You can also run a script from a file. Just remember that it needs to be a script that's
// already on the VM that will be running the job. Using runnable.script.text and
// runnable.script.path is mutually exclusive.
// runnable.script.path = '/tmp/test.sh'
task.runnables = [runnable];

// We can specify what resources are requested by each task.
const resources = new batch.ComputeResource();
resources.cpuMilli = 2000; // in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
resources.memoryMib = 16;
task.computeResource = resources;

task.maxRetryCount = 2;
task.maxRunDuration = {seconds: 3600};

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup();
group.taskCount = 4;
group.taskSpec = task;

// Policies are used to define what kind of virtual machines the tasks will run on.
// In this case, we tell the system to use "e2-standard-4" machine type.
// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
const allocationPolicy = new batch.AllocationPolicy();
const policy = new batch.AllocationPolicy.InstancePolicy();
policy.machineType = 'e2-standard-4';
const instances = new batch.AllocationPolicy.InstancePolicyOrTemplate();
instances.policy = policy;
allocationPolicy.instances = [instances];

const job = new batch.Job();
job.name = jobName;
job.taskGroups = [group];
job.allocationPolicy = allocationPolicy;
job.labels = {env: 'testing', type: 'script'};
// We use Cloud Logging as it's an option available out of the box
job.logsPolicy = new batch.LogsPolicy();
job.logsPolicy.destination = batch.LogsPolicy.Destination.CLOUD_LOGGING;

// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateJob() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const response = await batchClient.createJob(request);
  console.log(response);
}

callCreateJob();

Python

For more information, see the Batch Python API reference documentation.

from google.cloud import batch_v1


def create_script_job(project_id: str, region: str, job_name: str) -> batch_v1.Job:
    """
    This method shows how to create a sample Batch Job that will run
    a simple command on Compute Engine instances.

    Args:
        project_id: project ID or project number of the Cloud project you want to use.
        region: name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
        job_name: the name of the job that will be created.
            It needs to be unique for each project and region pair.

    Returns:
        A job object representing the job created.
    """
    client = batch_v1.BatchServiceClient()

    # Define what will be done as part of the job.
    task = batch_v1.TaskSpec()
    runnable = batch_v1.Runnable()
    runnable.script = batch_v1.Runnable.Script()
    runnable.script.text = "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
    # You can also run a script from a file. Just remember that it needs to be a script that's
    # already on the VM that will be running the job. Using runnable.script.text and
    # runnable.script.path is mutually exclusive.
    # runnable.script.path = '/tmp/test.sh'
    task.runnables = [runnable]

    # We can specify what resources are requested by each task.
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 2000  # in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
    resources.memory_mib = 16
    task.compute_resource = resources

    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    # Tasks are grouped inside a job using TaskGroups.
    # Currently, it's possible to have only one task group.
    group = batch_v1.TaskGroup()
    group.task_count = 4
    group.task_spec = task

    # Policies are used to define what kind of virtual machines the tasks will run on.
    # In this case, we tell the system to use "e2-standard-4" machine type.
    # Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
    allocation_policy = batch_v1.AllocationPolicy()
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy.instances = [instances]

    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "script"}
    # We use Cloud Logging as it's an option available out of the box
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)
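
After creating a job, you might want to check its status. The following is a minimal sketch, assuming create_script_job() from the sample above and placeholder project, region, and job name values:

from google.cloud import batch_v1

# Hypothetical follow-up: fetch the job again and print its current state.
client = batch_v1.BatchServiceClient()
job = create_script_job("my-project", "us-central1", "example-script-job")
refreshed = client.get_job(name=job.name)
print(refreshed.status.state)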

Create a job that uses environment variables

You can define environment variables in your job's resources and use them in your job's runnables.

By default, the runnables in your job can use the following environment variables:

  • BATCH_TASK_COUNT: the number of tasks in a task group.
  • BATCH_TASK_INDEX: the index number of a task in a task group. The index numbering starts at 0.
  • BATCH_HOSTS_FILE: Optional. The path to the file listing all the running VM instances in a task group. To use this environment variable, you must set the requireHostsFile field to true, as shown in the sketch after this list.
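
For example, the following is a minimal Python sketch (using the batch_v1 client library from the samples above) of a task group whose script reads the hosts file. The script text and task count here are illustrative only:

from google.cloud import batch_v1

# Hypothetical sketch: a script that lists the VMs in its task group.
runnable = batch_v1.Runnable()
runnable.script = batch_v1.Runnable.Script()
runnable.script.text = "cat ${BATCH_HOSTS_FILE}"

task = batch_v1.TaskSpec()
task.runnables = [runnable]

group = batch_v1.TaskGroup()
group.task_count = 2
group.task_spec = task
# Required so that Batch populates BATCH_HOSTS_FILE on each VM.
group.require_hosts_file = True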

Optionally, you can define custom environment variables in your job's resources and use them as follows:

  • In one runnable that each task runs.
  • In all the runnables that one task runs.

This section provides examples of how to create two jobs that define custom environment variables in their resources. The first example job passes an environment variable to a runnable that each task runs. The second example job passes an array of environment variables, with matching names but different values, to the tasks whose indices match the environment variables' indices in the array.

You can define environment variables for your job using the gcloud CLI or Batch API.

gcloud

If you want to define a job that passes an environment variable to a runnable that each task runs, see the example for how to Define and use an environment variable for a runnable. Otherwise, if you want to define a job that passes a list of environment variables to different tasks based on the task index, see the example for how to Define and use an environment variable for each task.

Define and use an environment variable for a runnable

To create a job that passes environment variables to a runnable using the gcloud CLI, use the gcloud batch jobs submit command and specify the environment variables in the job's configuration file.

For example, to create a script job that defines an environment variable and passes it to the scripts of 3 tasks, do the following:

  1. Create a JSON file in the current directory named hello-world-environment-variables.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello ${VARIABLE_NAME}! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                            },
                            "environment": {
                                "variables": {
                                    "VARIABLE_NAME": "VARIABLE_VALUE"
                                }
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": 2000,
                        "memoryMib": 16
                    }
                },
                "taskCount": 3,
                "parallelism": 1
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "policy": {
                        "machineType": "e2-standard-4"
                    }
                }
            ]
        }
    }
    

    Replace the following:

    • VARIABLE_NAME: the name of the environment variable passed to each task. By convention, environment variable names are capitalized.
    • VARIABLE_VALUE: Optional. The value of the environment variable passed to each task.
  2. Run the following command:

    gcloud batch jobs submit example-environment-variables-job \
      --location us-central1 \
      --config hello-world-environment-variables.json
    

Define and use an environment variable for each task

To create a job that passes environment variables to a task based on task index using the gcloud CLI, use the gcloud batch jobs submit command and specify the taskEnvironments array field in the job's configuration file.

For example, to create a job that includes an array of 3 environment variables with matching names and different values, and passes the environment variables to the scripts of the tasks whose indices match the environment variables' indices in the array:

  1. Create a JSON file in the current directory named hello-world-task-environment-variables.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello ${TASK_VARIABLE_NAME}! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": 2000,
                        "memoryMib": 16
                    }
                },
                "taskCount": 3,
                "taskEnvironments": [
                    {
                        "variables": {
                            "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_0"
                        }
                    },
                    {
                        "variables": {
                            "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_1"
                        }
                    },
                    {
                        "variables": {
                            "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_2"
                        }
                    }
                ]
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "policy": {
                        "machineType": "e2-standard-4"
                    }
                }
            ]
        }
    }
    

    Replace the following:

    • TASK_VARIABLE_NAME: the name of the environment variables passed to the tasks with matching indices. By convention, environment variable names are capitalized.
    • TASK_VARIABLE_VALUE_0: the value of the environment variable passed to the task whose BATCH_TASK_INDEX equals 0.
    • TASK_VARIABLE_VALUE_1: the value of the environment variable passed to the task whose BATCH_TASK_INDEX equals 1.
    • TASK_VARIABLE_VALUE_2: the value of the environment variable passed to the task whose BATCH_TASK_INDEX equals 2.
  2. Run the following command:

    gcloud batch jobs submit example-task-environment-variables-job \
      --location us-central1 \
      --config hello-world-task-environment-variables.json
    

API

If you want to define a job that passes an environment variable to a runnable that each task runs, see the example for how to Define and use an environment variable for a runnable. Otherwise, if you want to define a job that passes a list of environment variables to different tasks based on the task index, see the example for how to Define and use an environment variable for each task.

Define and use an environment variable for a runnable

To create a job that passes environment variables to a runnable using the Batch API, use the jobs.create method and specify the environment variables in the environment field.

For example, to create a job that includes an environment variable and passes it to the scripts of 3 tasks, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-environment-variables-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello ${VARIABLE_NAME}! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                        },
                        "environment": {
                            "variables": {
                                "VARIABLE_NAME": "VARIABLE_VALUE"
                            }
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": 2000,
                    "memoryMib": 16
                }
            },
            "taskCount": 3,
            "parallelism": 1
        }

    ],
    "allocationPolicy": {
        "instances": [
            {
                "policy": {
                    "machineType": "e2-standard-4"
                }
            }
        ]
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • VARIABLE_NAME: the name of the environment variable passed to each task. By convention, environment variable names are capitalized.
  • VARIABLE_VALUE: the value of the environment variable passed to each task.
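
If you use the Python client library instead, the same environment field is available on each runnable. A minimal sketch, assuming the google-cloud-batch library (the variable name and value are placeholders):

    from google.cloud import batch_v1

    runnable = batch_v1.Runnable()
    runnable.script = batch_v1.Runnable.Script()
    runnable.script.text = (
        "echo Hello ${VARIABLE_NAME}! This is task ${BATCH_TASK_INDEX}."
    )
    # Pass a custom environment variable to this runnable in every task.
    runnable.environment = batch_v1.Environment(
        variables={"VARIABLE_NAME": "VARIABLE_VALUE"}
    )

    task = batch_v1.TaskSpec(runnables=[runnable])
    group = batch_v1.TaskGroup(task_spec=task, task_count=3)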

Define and use an environment variable for each task

To create a job that passes environment variables to a task based on task index using Batch API, use the jobs.create method and specify the environment variables in the taskEnvironments array field.

For example, to create a job that includes an array of 3 environment variables with matching names and different values, and passes the environment variables to the scripts of 3 tasks based on their indices, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-task-environment-variables-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello ${TASK_VARIABLE_NAME}! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": 2000,
                    "memoryMib": 16
                }
            },
            "taskCount": 3,
            "taskEnvironments": [
                {
                    "variables": {
                        "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_0"
                    }
                },
                {
                    "variables": {
                        "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_1"
                    }
                },
                {
                    "variables": {
                        "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_2"
                    }
                }
            ]
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "policy": { "machineType": "e2-standard-4" }
            }
        ]
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • TASK_VARIABLE_NAME: the name of the environment variables passed to the tasks with matching indices. By convention, environment variable names are capitalized.
  • TASK_VARIABLE_VALUE_0: the value of the environment variable passed to the task whose BATCH_TASK_INDEX equals 0.
  • TASK_VARIABLE_VALUE_1: the value of the environment variable passed to the task whose BATCH_TASK_INDEX equals 1.
  • TASK_VARIABLE_VALUE_2: the value of the environment variable passed to the task whose BATCH_TASK_INDEX equals 2.

Create a job from a Compute Engine instance template

Optionally, you can use a Compute Engine instance template to define the resources for your job. This is required to create a job that uses non-default VM images or to create a job with a custom machine type.

This section provides examples for how to create a basic script job from an existing instance template. You can create a job from an instance template using the gcloud CLI or Batch API.

gcloud

To create a job from an instance template using the gcloud CLI, use the gcloud batch jobs submit command and specify the instance template in the job's JSON configuration file.

For example, to create a basic script job from an instance template:

  1. Create a JSON file in the current directory named hello-world-instance-template.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": 2000,
                        "memoryMib": 16
                    },
                    "maxRetryCount": 2,
                    "maxRunDuration": "3600s"
                },
                "taskCount": 4,
                "parallelism": 2
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "installGpuDrivers" INSTALL_GPU_DRIVERS,
                    "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
                }
            ]
        },
        "labels": {
            "department": "finance",
            "env": "testing"
        },
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    

    Replace the following:

    • INSTALL_GPU_DRIVERS: Optional. When set to true, Batch fetches the drivers required for the GPU type that you specify in your Compute Engine instance template, and Batch installs them on your behalf. For more information, see how to create a job that uses a GPU.
    • INSTANCE_TEMPLATE_NAME: the name of an existing Compute Engine instance template. Learn how to create and list instance templates.
  2. Run the following command:

    gcloud batch jobs submit example-template-job \
      --location us-central1 \
      --config hello-world-instance-template.json
    

API

To create a basic job using the Batch API, use the jobs.create method and specify an instance template in the allocationPolicy field.

For example, to create a basic script job from an instance template, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-script-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": 2000,
                    "memoryMib": 16
                },
                "maxRetryCount": 2,
                "maxRunDuration": "3600s"
            },
            "taskCount": 4,
            "parallelism": 2
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "installGpuDrivers" INSTALL_GPU_DRIVERS,
                "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
            }
        ]
    },
    "labels": {
        "department": "finance",
        "env": "testing"
    },
    "logsPolicy": {
        "destination": "CLOUD_LOGGING"
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • INSTALL_GPU_DRIVERS: Optional. When set to true, Batch fetches the drivers required for the GPU type that you specify in your Compute Engine instance template, and Batch installs them on your behalf. For more information, see how to create a job that uses a GPU.
  • INSTANCE_TEMPLATE_NAME: the name of an existing Compute Engine instance template. Learn how to create and list instance templates.
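
In the Python client library, you reference the template through the instance_template field of AllocationPolicy.InstancePolicyOrTemplate instead of defining an inline policy. A minimal sketch (the template name is a placeholder):

    from google.cloud import batch_v1

    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    # Reference an existing Compute Engine instance template by name
    # instead of defining an inline InstancePolicy.
    instances.instance_template = "INSTANCE_TEMPLATE_NAME"
    allocation_policy = batch_v1.AllocationPolicy(instances=[instances])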

Create a job that uses a custom service account

Optionally, you can create a job that uses a custom service account instead of the default Compute Engine service account. A job's service account influences which resources and applications the job's VMs can access. The default Compute Engine service account is automatically attached to all VMs by default, so using a custom service account provides greater control in managing a job's permissions and is a recommended best practice for limiting privilege.

Before creating a job that uses a custom service account, make sure that the service account you plan to use has the permissions required to create Batch jobs for your project. For more information, see Enable Batch for a project.

To create a job that uses a custom service account, select one of the following methods:

  • Specify the custom service account in your job's definition, as shown in this section.
  • Use a Compute Engine instance template and specify the custom service account both in your instance template and in your job's definition.

This section provides an example for how to create a job that uses a custom service account. You can create a job that uses a custom service account using the gcloud CLI or Batch API.

gcloud

To create a job that uses a custom service account using the gcloud CLI, use the gcloud batch jobs submit command and specify the custom service account in the job's configuration file.

For example, to create a script job that uses a custom service account:

  1. Create a JSON file in the current directory named hello-world-service-account.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello World! This is task $BATCH_TASK_INDEX."
                            }
                        }
                    ]
                }
            }
        ],
        "allocationPolicy": {
            "serviceAccount": {
                "email": "SERVICE_ACCOUNT_EMAIL"
            }
        }
    }
    

    where SERVICE_ACCOUNT_EMAIL is the email address of your service account. If the serviceAccount field is not specified, the value is set to the default Compute Engine service account.

  2. Run the following command:

    gcloud batch jobs submit example-service-account-job \
      --location us-central1 \
      --config hello-world-service-account.json
    

API

To create a job that uses a custom service account using the Batch API, use the jobs.create method and specify your custom service account in the allocationPolicy field.

For example, to create a script job that uses a custom service account, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-service-account-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello World! This is task $BATCH_TASK_INDEX."
                        }
                    }
                ]
            }
        }
    ],
    "allocationPolicy": {
        "serviceAccount": {
            "email": "SERVICE_ACCOUNT_EMAIL"
        }
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • SERVICE_ACCOUNT_EMAIL: the email address of your service account. If the serviceAccount field is not specified, the value is set to the default Compute Engine service account.
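
In the Python client library, the service account is set through AllocationPolicy.service_account. A minimal sketch (the email address is a placeholder):

    from google.cloud import batch_v1

    allocation_policy = batch_v1.AllocationPolicy()
    # Attach a custom service account to the job's VMs; if omitted, Batch
    # uses the default Compute Engine service account.
    allocation_policy.service_account = batch_v1.ServiceAccount(
        email="SERVICE_ACCOUNT_EMAIL"
    )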

Create a job that uses MPI for tightly coupled tasks

Optionally, you can create a job that uses a Message Passing Interface (MPI) library to let interdependent tasks communicate with each other across different VM instances.

This section provides examples for how to create a job that can use MPI. Notably, the example jobs have 3 runnables:

  • The first runnable is a script that prepares the job for MPI by disabling simultaneous multithreading and installing Intel MPI.
  • The second runnable is an empty barrier runnable (formatted as { "barrier": {} }), which ensures that all tasks finish setting up MPI before continuing to future runnables.
  • The third runnable (and any subsequent runnables) is available for the job's workload.

You can create a job that uses MPI for tightly coupled tasks using the gcloud CLI or Batch API.

gcloud

To create a script job that uses MPI for tightly coupled tasks by using the gcloud CLI, do the following:

  1. Create a JSON configuration file with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "google_mpi_tuning --nosmt; google_install_mpi --intel_mpi;"
                            }
                        },
                        { "barrier": {} },
                        {
                            "script": {
                                SCRIPT
                            }
                        }
                    ]
                },
                "taskCount": TASK_COUNT,
                "taskCountPerNode": TASK_COUNT_PER_NODE,
                "requireHostsFile": REQUIRE_HOSTS_FILE,
                "permissiveSsh": PERMISSIVE_SSH
            }
        ]
    }
    

    Replace the following:

    • SCRIPT: a script runnable for a workload that uses MPI.
    • TASK_COUNT: the number of tasks for the job. The value must be a whole number between 1 and 10000. To use the MPI libraries provided by Batch, this field is required and must be set to 2 or higher.
    • TASK_COUNT_PER_NODE: the number of tasks that a job can run concurrently on a VM instance. To use the MPI libraries provided by Batch, this field is required and must be set to 1, which is equivalent to running one VM instance per task.
    • REQUIRE_HOSTS_FILE: when set to true, the job creates a file listing the VM instances running in a task group. The file path is stored in the BATCH_HOSTS_FILE environment variable. To use the MPI libraries provided by Batch, this field must be set to true.
    • PERMISSIVE_SSH: when set to true, Batch configures SSH to allow passwordless communication among the VM instances running in a task group. To use the MPI libraries provided by Batch, this field must be set to true.
  2. To create the job, use the gcloud batch jobs submit command.

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.

Optionally, you can further configure the job to increase the performance of the MPI libraries provided by Batch.

For example, to create a script job from an instance template that uses MPI and makes 1 task output the hostnames of the 3 tasks in the task group:

  1. Create a JSON file in the current directory named example-job-uses-mpi.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "google_mpi_tuning --nosmt; google_install_mpi --intel_mpi;"
                            }
                        },
                        { "barrier": {} },
                        {
                            "script": {
                                "text":
                                    "if [ $BATCH_TASK_INDEX = 0 ]; then
                                    mpirun -hostfile $BATCH_HOSTS_FILE -np 3 hostname;
                                    fi"
                            }
                        },
                        { "barrier": {} }
                    ]
                },
                "taskCount": 3,
                "taskCountPerNode": 1,
                "requireHostsFile": true,
                "permissiveSsh": true
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "instanceTemplate": "example-template-job-uses-mpi"
                }
            ]
        },
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    
  2. Run the following command:

    gcloud batch jobs submit example-template-job-uses-mpi \
      --location us-central1 \
      --config example-job-uses-mpi.json
    

API

To create a script job that uses MPI for tightly coupled tasks by using the Batch API, use the jobs.create method and specify the permissiveSsh, requireHostsFile, taskCount, and taskCountPerNode fields. For example, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "google_mpi_tuning --nosmt; google_install_mpi --intel_mpi;"
                        }
                    },
                    { "barrier": {} },
                    {
                        "script": {
                            SCRIPT
                        }
                    }
                ]
            },
            "taskCount": TASK_COUNT,
            "taskCountPerNode": TASK_COUNT_PER_NODE,
            "requireHostsFile": REQUIRE_HOSTS_FILE,
            "permissiveSsh": PERMISSIVE_SSH
        }
    ]
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • LOCATION: the location of the job.
  • JOB_NAME: the name of the job.
  • SCRIPT: the script runnable for a workload that uses MPI.
  • TASK_COUNT: the number of tasks for the job. The value must be a whole number between 1 and 10000. To use the MPI libraries provided by Batch, this field is required and must be set to 2 or higher.
  • TASK_COUNT_PER_NODE: the number of tasks that a job can run concurrently on a VM instance. To use the MPI libraries provided by Batch, this field is required and must be set to 1, which is equivalent to running one VM instance per task.
  • REQUIRE_HOSTS_FILE: when set to true, the job creates a file listing the VM instances running in a task group. The file path is stored in the BATCH_HOSTS_FILE environment variable. To use the MPI libraries provided by Batch, this field must be set to true.
  • PERMISSIVE_SSH: when set to true, Batch configures SSH to allow passwordless communication among the VM instances running in a task group. To use the MPI libraries provided by Batch, this field must be set to true.
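
In the Python client library, the same fields are available on TaskGroup. A minimal sketch of the three-runnable pattern described earlier in this section, assuming the google-cloud-batch library:

    from google.cloud import batch_v1

    # Runnable 1: prepare the VMs for MPI.
    setup = batch_v1.Runnable()
    setup.script = batch_v1.Runnable.Script(
        text="google_mpi_tuning --nosmt; google_install_mpi --intel_mpi;"
    )
    # Runnable 2: a barrier so every task finishes setup before continuing.
    barrier = batch_v1.Runnable(barrier=batch_v1.Runnable.Barrier())
    # Runnable 3: the MPI workload itself.
    workload = batch_v1.Runnable()
    workload.script = batch_v1.Runnable.Script(
        text="if [ $BATCH_TASK_INDEX = 0 ]; then mpirun -hostfile $BATCH_HOSTS_FILE -np 3 hostname; fi"
    )

    group = batch_v1.TaskGroup(
        task_spec=batch_v1.TaskSpec(runnables=[setup, barrier, workload]),
        task_count=3,
        task_count_per_node=1,    # one task per VM
        require_hosts_file=True,  # exposes BATCH_HOSTS_FILE
        permissive_ssh=True,      # passwordless SSH between the VMs
    )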

Optionally, you can further configure the job to increase the performance of the MPI libraries provided by Batch.

For example, to create a script job from an instance template that uses MPI and makes 1 task output the hostnames of the 3 tasks in the task group, use the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-template-job-uses-mpi

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "google_mpi_tuning --nosmt; google_install_mpi --intel_mpi;"
                        }
                    },
                    { "barrier": {} },
                    {
                        "script": {
                            "text":
                                "if [ $BATCH_TASK_INDEX = 0 ]; then
                                mpirun -hostfile $BATCH_HOSTS_FILE -np 3 hostname;
                                fi"
                        }
                    },
                    { "barrier": {} }
                ]
            },
            "taskCount": 3,
            "taskCountPerNode": 1,
            "requireHostsFile": true,
            "permissiveSsh": true
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "instanceTemplate": "example-template-job-uses-mpi"
            }
        ]
    },
    "logsPolicy": {
        "destination": "CLOUD_LOGGING"
    }
}

where PROJECT_ID is the project ID of your project.

For a more detailed example of a job that uses MPI for tightly coupled tasks, see Running the Weather Research and Forecasting model with Batch.

Create a job that uses a GPU

Optionally, you can create a job that adds a graphics processing unit (GPU) to the VM instances running in your job's task group.

To install the required GPU drivers for your job, you can either set the installGpuDrivers field to true, as shown in the examples in this section, or install the required drivers yourself.

To add a GPU to your job's resources, select one of the following methods:

  • Define the machine type and the GPU platform in your job's definition, as shown in this section.
  • Define the GPU in your Compute Engine instance template. If you include an instance template in your job's definition, then you must use this method.

This section provides examples for how to create a job that defines a GPU in the job's resources and installs the required drivers for the GPU. Specifically, the first example shows how to add a GPU to a container job that uses the default image. The second example shows how to add a GPU to a script job that uses the default image. The third example shows how to add a GPU to a container and script job that uses the default image.

You can create a job that uses a GPU using the gcloud CLI or Batch API.

gcloud

If you want to add a GPU to a job, see the following examples:

Add a GPU to a container job

To create a container job with a GPU that uses the default image by using the gcloud CLI, use the gcloud batch jobs submit command and specify the following in the job's configuration file:

  • The machine type and GPU platform.
  • The volumes and options fields to mount the GPU to a container job with the default image, as shown in this example.

For example, to create a container job with a GPU:

  1. Create a JSON file in the current directory named hello-world-container-job-gpu.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "container": {
                                "volumes": [
                                    "/var/lib/nvidia/lib64:/usr/local/nvidia/lib64",
                                    "/var/lib/nvidia/bin:/usr/local/nvidia/bin"
                                ],
                                "options": "--privileged",
                                "commands": [
                                    "-c",
                                    "echo Hello world from task ${BATCH_TASK_INDEX}."
                                ]
                            }
                        }
                    ]
                },
                "taskCount": 3,
                "parallelism": 1
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "installGpuDrivers": INSTALL_GPU_DRIVERS,
                    "policy": {
                        "machineType": "MACHINE_TYPE",
                        "accelerators": [
                            {
                                "type": "GPU_TYPE",
                                "count": GPU_COUNT
                            }
                        ]
                    }
                }
            ],
            "location": {
                "allowedLocations": [
                    "ALLOWED_LOCATIONS"
                ]
            }
        }
    }
    

    Replace the following:

    • INSTALL_GPU_DRIVERS: Optional. When set to true, Batch fetches the drivers required for the GPU type that you specify in the policy field from a third-party location, and Batch installs them on your behalf.
    • MACHINE_TYPE: the machine type for your job's VMs, which restricts the type of GPUs you can use. To create a job with a GPU, this field is required.
    • GPU_TYPE: the GPU type. You can view a list of the available GPU types by using the gcloud compute accelerator-types list command. To create a job with a GPU, this field is required.
    • GPU_COUNT: the number of GPUs of the type you specified in the type field. To create a job with a GPU, this field is required.
    • ALLOWED_LOCATIONS: Optional. The locations where the VM instances for your job are allowed to run (for example, regions/us-central1, zones/us-central1-a allows the zone us-central1-a). If you specify an allowed location, you must select the region and, optionally, one or more zones. The locations that you choose must have the GPU type you want for this job. For more information, see the allowedLocations array field.
  2. Run the following command:

    gcloud batch jobs submit example-job-gpu \
      --location us-central1 \
      --config hello-world-container-job-gpu.json
    

Add a GPU to a script job

To create a script job with a GPU that uses the default image by using the gcloud CLI, use the gcloud batch jobs submit command and specify the machine type and GPU platform in the job's configuration file.

For example, to create a script job with a GPU:

  1. Create a JSON file in the current directory named hello-world-script-job-gpu.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world from task ${BATCH_TASK_INDEX}."
                            }
                        }
                    ]
                },
                "taskCount": 3,
                "parallelism": 1
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "installGpuDrivers": INSTALL_GPU_DRIVERS,
                    "policy": {
                        "machineType": "MACHINE_TYPE",
                        "accelerator":
                            {
                                "type": "GPU_TYPE",
                                "count": GPU_COUNT
                            }
                        ]
                    }
                }
            ],
            "location": {
                "allowedLocations": [
                    "ALLOWED_LOCATIONS"
                ]
            }
        }
    }
    

    Replace the following:

    • INSTALL_GPU_DRIVERS: Optional. When set to true, Batch fetches the drivers required for the GPU type that you specify in the policy field from a third-party location, and Batch installs them on your behalf.
    • MACHINE_TYPE: the machine type for your job's VMs, which restricts the type of GPUs you can use. To create a job with a GPU, this field is required.
    • GPU_TYPE: the GPU type. You can view a list of the available GPU types by using the gcloud compute accelerator-types list command. To create a job with a GPU, this field is required.
    • GPU_COUNT: the number of GPUs of the type you specified in the type field. To create a job with a GPU, this field is required.
    • ALLOWED_LOCATIONS: Optional. The locations where the VM instances for your job are allowed to run (for example, regions/us-central1, zones/us-central1-a allows the zone us-central1-a). If you specify an allowed location, you must select the region and, optionally, one or more zones. The locations that you choose must have the GPU type you want for this job. For more information, see the allowedLocations array field.
  2. Run the following command:

    gcloud batch jobs submit example-job-gpu \
      --location us-central1 \
      --config hello-world-script-job-gpu.json
    

Add a GPU to a container and script job

To create a container and script job with a GPU that uses the default image by using the gcloud CLI, use the gcloud batch jobs submit command and specify the following in the job's configuration file:

  • The machine type and GPU platform.
  • The text field for each script runnable, and the imageUri and options fields for each container runnable, to mount the GPU to a container and script job with the default image, as shown in this example.

For example, to create a container and script job with a GPU:

  1. Create a JSON file in the current directory named hello-world-container-script-job-gpu.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text":
                                    "distribution=$(. /etc/os-release;echo $ID$VERSION_ID);
                                    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -;
                                    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list;
                                    sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit;
                                    sudo systemctl restart docker"
                            }
                        },
                        {
                            "container": {
                                "imageUri": "gcr.io/google_containers/cuda-vector-add:v0.1",
                                "options": "--gpus all"
                            }
                        },
                        {
                            "script": {
                                "text": "echo Hello world from task ${BATCH_TASK_INDEX}."
                            }
                        }
                    ]
                },
                "taskCount": 3,
                "parallelism": 1
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "installGpuDrivers": INSTALL_GPU_DRIVERS,
                    "policy": {
                        "machineType": "MACHINE_TYPE",
                        "accelerators": [
                            {
                                "type": "GPU_TYPE",
                                "count": GPU_COUNT
                            }
                        ]
                    }
                }
            ],
            "location": {
                "allowedLocations": [
                    "ALLOWED_LOCATIONS"
                ]
            }
        }
    }
    

    Replace the following:

    • INSTALL_GPU_DRIVERS: Optional. When set to true, Batch fetches the drivers required for the GPU type that you specify in the policy field from a third-party location, and Batch installs them on your behalf.
    • MACHINE_TYPE: the machine type for your job's VMs, which restricts the type of GPUs you can use. To create a job with a GPU, this field is required.
    • GPU_TYPE: the GPU type. You can view a list of the available GPU types by using the gcloud compute accelerator-types list command. To create a job with a GPU, this field is required.
    • GPU_COUNT: the number of GPUs of the type you specified in the type field. To create a job with a GPU, this field is required.
    • ALLOWED_LOCATIONS: Optional. The locations where the VM instances for your job are allowed to run (for example, regions/us-central1, zones/us-central1-a allows the zone us-central1-a). If you specify an allowed location, you must select the region and, optionally, one or more zones. The locations that you choose must have the GPU type you want for this job. For more information, see the allowedLocations array field.
  2. Run the following command:

    gcloud batch jobs submit example-job-gpu \
      --location us-central1 \
      --config hello-world-container-script-job-gpu.json
    

API

If you want to add a GPU to a job, see the following examples:

Add a GPU to a container job

To create a container job with a GPU that uses the default image by using the Batch API, use the jobs.create method and specify the following:

  • The machine type and GPU platform.
  • The volumes and options fields to mount the GPU to a container job with the default image, as shown in this example.

For example, to create a container job with a GPU, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-job-gpu

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "container": {
                            "volumes": [
                                "/var/lib/nvidia/lib64:/usr/local/nvidia/lib64",
                                "/var/lib/nvidia/bin:/usr/local/nvidia/bin"
                            ],
                            "options": "--privileged",
                            "commands": [
                                "-c",
                                "echo Hello world from task ${BATCH_TASK_INDEX}."
                            ]
                        }
                    }
                ]
            },
            "taskCount": 3,
            "parallelism": 1
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "installGpuDrivers": INSTALL_GPU_DRIVERS,
                "policy": {
                    "machineType": "MACHINE_TYPE",
                    "accelerators": [
                        {
                            "type": "GPU_TYPE",
                            "count": GPU_COUNT
                        }
                    ]
                }
            }
        ],
        "location": {
            "allowedLocations": [
                "ALLOWED_LOCATIONS"
            ]
        }
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • INSTALL_GPU_DRIVERS: Optional. When set to true, Batch fetches the drivers required for the GPU type that you specify in the policy field from a third-party location, and Batch installs them on your behalf.
  • MACHINE_TYPE: the machine type for your job's VMs, which restricts the type of GPUs you can use. To create a job with a GPU, this field is required.
  • GPU_TYPE: the GPU type. You can view a list of the available GPU types by using the gcloud compute accelerator-types list command. To create a job with a GPU, this field is required.
  • GPU_COUNT: the number of GPUs of the type you specified in the type field. To create a job with a GPU, this field is required.
  • ALLOWED_LOCATIONS: Optional. The locations where the VM instances for your job are allowed to run (for example, regions/us-central1, zones/us-central1-a allows the zone us-central1-a). If you specify an allowed location, you must select the region and, optionally, one or more zones. The locations that you choose must have the GPU type you want for this job. For more information, see the allowedLocations array field.

Add a GPU to a script job

To create a script job with a GPU that uses the default image by using the Batch API, use the jobs.create method and specify the machine and accelerator types in the instances field.

For example, to create a script job with a GPU, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-job-gpu

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello world from task ${BATCH_TASK_INDEX}."
                        }
                    }
                ]
            },
            "taskCount": 3,
            "parallelism": 1
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "installGpuDrivers": INSTALL_GPU_DRIVERS,
                "policy": {
                    "machineType": "MACHINE_TYPE",
                    "accelerators": [
                        {
                            "type": "GPU_TYPE",
                            "count": GPU_COUNT
                        }
                    ]
                }
            }
        ],
        "location": {
            "allowedLocations": [
                "ALLOWED_LOCATIONS"
            ]
        }
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • INSTALL_GPU_DRIVERS: Optional. When set to true, Batch fetches the drivers required for the GPU type that you specify in the policy field from a third-party location, and Batch installs them on your behalf.
  • MACHINE_TYPE: the machine type for your job's VMs, which restricts the type of GPUs you can use. To create a job with a GPU, this field is required.
  • GPU_TYPE: the GPU type. You can view a list of the available GPU types by using the gcloud compute accelerator-types list command. To create a job with a GPU, this field is required.
  • GPU_COUNT: the number of GPUs of the type you specified in the type field. To create a job with a GPU, this field is required.
  • ALLOWED_LOCATIONS: Optional. The locations where the VM instances for your job are allowed to run (for example, regions/us-central1, zones/us-central1-a allows the zone us-central1-a). If you specify an allowed location, you must select the region and, optionally, one or more zones. The locations that you choose must have the GPU type you want for this job. For more information, see the allowedLocations array field.

Add a GPU to a container and script job

To create a container and script job with a GPU that uses the default image by using the Batch API, use the jobs.create method and specify the following:

  • The machine type and GPU platform.
  • The text field for each script runnable, and the imageUri and options fields for each container runnable, to mount the GPU to a container and script job with the default image, as shown in this example.

For example, to create a container and script job with a GPU, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-job-gpu

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text":
                                "distribution=$(. /etc/os-release;echo $ID$VERSION_ID);
                                curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -;
                                curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list;
                                sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit;
                                sudo systemctl restart docker"
                        }
                    },
                    {
                        "container": {
                            "imageUri": "gcr.io/google_containers/cuda-vector-add:v0.1",
                            "options": "--gpus all"
                        }
                    },
                    {
                        "script": {
                            "text": "echo Hello world from task ${BATCH_TASK_INDEX}."
                        }
                    }
                ]
            },
            "taskCount": 3,
            "parallelism": 1
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "installGpuDrivers": INSTALL_GPU_DRIVERS,
                "policy": {
                    "machineType": "MACHINE_TYPE",
                    "accelerators": [
                        {
                            "type": "GPU_TYPE",
                            "count": GPU_COUNT
                        }
                    ]
                }
            }
        ],
        "location": {
            "allowedLocations": [
                "ALLOWED_LOCATIONS"
            ]
        }
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • INSTALL_GPU_DRIVERS: Optional. When set to true, Batch fetches the drivers required for the GPU type that you specify in the policy field from a third-party location, and Batch installs them on your behalf.
  • MACHINE_TYPE: the machine type for your job's VMs, which restricts the type of GPUs you can use. To create a job with a GPU, this field is required.
  • GPU_TYPE: the GPU type. You can view a list of the available GPU types by using the gcloud compute accelerator-types list command. To create a job with a GPU, this field is required.
  • GPU_COUNT: the number of GPUs of the type you specified in the type field. To create a job with a GPU, this field is required.
  • ALLOWED_LOCATIONS: Optional. The locations where the VM instances for your job are allowed to run (for example, regions/us-central1, zones/us-central1-a allows the zone us-central1-a). If you specify an allowed location, you must select the region and, optionally, one or more zones. The locations that you choose must have the GPU type you want for this job. For more information, see the allowedLocations array field.
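
In the Python client library, the GPU platform maps to InstancePolicy.accelerators and driver installation to the install_gpu_drivers field of InstancePolicyOrTemplate. A minimal sketch (the machine and GPU types are example values; pick a combination that is available in your job's locations):

    from google.cloud import batch_v1

    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "n1-standard-4"  # example machine type

    accelerator = batch_v1.AllocationPolicy.Accelerator()
    accelerator.type_ = "nvidia-tesla-t4"  # example GPU type
    accelerator.count = 1
    policy.accelerators = [accelerator]

    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate(
        policy=policy,
        install_gpu_drivers=True,  # let Batch install the GPU drivers
    )
    allocation_policy = batch_v1.AllocationPolicy(instances=[instances])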

Create a job that uses storage volumes

By default, each Compute Engine VM for a job has a single boot persistent disk that contains the operating system. Optionally, you can create a job that uses additional storage volumes. Specifically, a job's VMs can use one or more of each of the following types of storage volumes. For more information about all of the types of storage volumes and the differences and restrictions for each, see the documentation for Compute Engine VM storage options.

You can allow a job to use each storage volume by including it in your job's definition and specifying its mount path (mountPath) in your runnables. To learn how to create a job that uses storage volumes, see one or more of the following sections:

Use a persistent disk

A job that uses persistent disks has the following restrictions:

  • All persistent disks: Review the restrictions for all persistent disks.
  • Instance templates: If you want to specify an instance template while creating this job, you must attach any persistent disk(s) for this job in the instance template. Otherwise, if you don't want to use an instance template, you must attach any persistent disk(s) directly in the job definition.
  • New versus existing persistent disks: Each persistent disk in a job can be either new (defined in and created with the job) or existing (already created in your project and specified in the job). The supported mount options for how Batch mounts the persistent disks to the job's VMs, as well as the supported location options for your job and its persistent disks, vary between new and existing persistent disks as follows:

    • Mount options:
      • New persistent disks: all options are supported.
      • Existing persistent disks: all options except writing are supported. This is due to restrictions of multi-writer mode.
    • Location options:
      • New persistent disks: you can only create zonal persistent disks, but you can select any location for your job. The persistent disks are created in the zone where your job runs.
      • Existing persistent disks: you can select zonal and regional persistent disks. You must set the job's location (or, if specified, just the job's allowed locations) to only locations that contain all of the job's persistent disks. For example, for a zonal persistent disk, the job's location must be the disk's zone; for a regional persistent disk, the job's location must be either the disk's region or, if specifying zones, one or both of the specific zones where the regional persistent disk is located.

You can create a job that uses a persistent disk using the gcloud CLI or Batch API. The following example describes how to create a job that attaches and mounts an existing persistent disk and a new persistent disk. The job also has 3 tasks that each run a script to create a file in the new persistent disk named output_task_TASK_INDEX.txt where TASK_INDEX is the index of each task: 0, 1, and 2.

gcloud

To create a job that uses persistent disks using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, specify the persistent disks in the instances field and mount the persistent disk in the volumes field.

  1. Create a JSON file.

    • If you are not using an instance template for this job, create a JSON file with the following contents:

      {
          "allocationPolicy": {
              "instances": [
                  {
                      "policy": {
                          "disks": [
                              {
                                  "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
                                  "existingDisk": "projects/PROJECT_ID/EXISTING_PERSISTENT_DISK_LOCATION/disks/EXISTING_PERSISTENT_DISK_NAME"
                              },
                              {
                                  "newDisk": {
                                      "sizeGb":NEW_PERSISTENT_DISK_SIZE,
                                      "type": "NEW_PERSISTENT_DISK_TYPE"
                                  },
                                  "deviceName": "NEW_PERSISTENT_DISK_NAME"
                              }
                          ]
                      }
                  }
              ],
              "location": {
                  "allowedLocations": [
                      "EXISTING_PERSISTENT_DISK_LOCATION"
                  ]
              }
          },
          "taskGroups":[
              {
                  "taskSpec":{
                      "runnables": [
                          {
                              "script": {
                                  "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/NEW_PERSISTENT_DISK_NAME/output_task_${BATCH_TASK_INDEX}.txt"
                              }
                          }
                      ],
                      "volumes": [
                          {
                              "deviceName": "NEW_PERSISTENT_DISK_NAME",
                              "mountPath": "/mnt/disks/NEW_PERSISTENT_DISK_NAME",
                              "mountOptions": "rw,async"
                          },
                          {
                              "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
                              "mountPath": "/mnt/disks/EXISTING_PERSISTENT_DISK_NAME"
                          }
                      ]
                  },
                  "taskCount":3
              }
          ],
          "logsPolicy": {
              "destination": "CLOUD_LOGGING"
          }
      }
      

      Replace the following:

      • PROJECT_ID: the project ID of your project.
      • EXISTING_PERSISTENT_DISK_NAME: the name of an existing persistent disk.
      • EXISTING_PERSISTENT_DISK_LOCATION: the location of an existing persistent disk. For each existing zonal persistent disk, the job's location must be the disk's zone; for each existing regional persistent disk, the job's location must be either the disk's region or, if specifying zones, one or both of the specific zones where the regional persistent disk is located. If you are not specifying any existing persistent disks, you can select any location. Learn more about the allowedLocations field.
      • NEW_PERSISTENT_DISK_SIZE: the size of the new persistent disk in GB. The allowed sizes depend on the type of persistent disk, but the minimum is often 10 GB (10) and the maximum is often 64 TB (64000).
      • NEW_PERSISTENT_DISK_TYPE: the disk type of the new persistent disk, either pd-standard, pd-balanced, pd-ssd, or pd-extreme.
      • NEW_PERSISTENT_DISK_NAME: the name of the new persistent disk.
    • If you are using an instance template for this job, create a JSON file as shown previously, except replace the instances field with the following:

      "instances": [
          {
              "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
          }
      ],
      

      where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses persistent disks, this instance template must define and attach the persistent disks that you want the job to use. For this example, the template must define and attach a new persistent disk named NEW_PERSISTENT_DISK_NAME and attach an existing persistent disk named EXISTING_PERSISTENT_DISK_NAME.

  2. Run the following command:

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.

API

To create a job that uses persistent disks using the Batch API, use the jobs.create method. In the request, specify the persistent disks in the instances field and mount the persistent disk in the volumes field.

  • If you are not using an instance template for this job, make the following request:

    POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME
    
    {
        "allocationPolicy": {
            "instances": [
                {
                    "policy": {
                        "disks": [
                            {
                                "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
                                "existingDisk": "projects/PROJECT_ID/EXISTING_PERSISTENT_DISK_LOCATION/disks/EXISTING_PERSISTENT_DISK_NAME"
                            },
                            {
                                "newDisk": {
                                    "sizeGb":NEW_PERSISTENT_DISK_SIZE,
                                    "type": "NEW_PERSISTENT_DISK_TYPE"
                                },
                                "deviceName": "NEW_PERSISTENT_DISK_NAME"
                            }
                        ]
                    }
                }
            ],
            "location": {
                "allowedLocations": [
                    "EXISTING_PERSISTENT_DISK_LOCATION"
                ]
            }
        },
        "taskGroups":[
            {
                "taskSpec":{
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/NEW_PERSISTENT_DISK_NAME/output_task_${BATCH_TASK_INDEX}.txt"
                            }
                        }
                    ],
                    "volumes": [
                        {
                            "deviceName": "NEW_PERSISTENT_DISK_NAME",
                            "mountPath": "/mnt/disks/NEW_PERSISTENT_DISK_NAME",
                            "mountOptions": "rw,async"
                        },
                        {
                            "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
                            "mountPath": "/mnt/disks/EXISTING_PERSISTENT_DISK_NAME"
                        }
                    ]
                },
                "taskCount":3
            }
        ],
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    

    Replace the following:

    • PROJECT_ID: the project ID of your project.
    • LOCATION: the location of the job.
    • JOB_NAME: the name of the job.
    • EXISTING_PERSISTENT_DISK_NAME: the name of an existing persistent disk.
    • EXISTING_PERSISTENT_DISK_LOCATION: the location of an existing persistent disk. For each existing zonal persistent disk, the job's location must be the disk's zone; for each existing regional persistent disk, the job's location must be either the disk's region or, if specifying zones, one or both of the specific zones where the regional persistent disk is located. If you are not specifying any existing persistent disks, you can select any location. Learn more about the allowedLocations field.
    • NEW_PERSISTENT_DISK_SIZE: the size of the new persistent disk in GB. The allowed sizes depend on the type of persistent disk, but the minimum is often 10 GB (10) and the maximum is often 64 TB (64000).
    • NEW_PERSISTENT_DISK_TYPE: the disk type of the new persistent disk, either pd-standard, pd-balanced, pd-ssd, or pd-extreme.
    • NEW_PERSISTENT_DISK_NAME: the name of the new persistent disk.
  • If you are using an instance template for this job, create a JSON file as shown previously, except replace the instances field with the following:

    "instances": [
        {
            "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
        }
    ],
    

    where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses persistent disks, this instance template must define and attach the persistent disks that you want the job to use. For this example, the template must define and attach a new persistent disk named NEW_PERSISTENT_DISK_NAME and attach an existing persistent disk named EXISTING_PERSISTENT_DISK_NAME.
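
The jobs.create examples on this page show only the HTTP request; you can send such a request from any HTTP client. As a sketch, assuming you save the request body above to a hypothetical file named request.json, you can submit it with curl and an access token from the gcloud CLI:

    curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      -d @request.json \
      "https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME"

The same pattern applies to the other Batch API requests on this page.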

Use a local SSD

A job that uses local SSDs has the following restrictions:

  • You can only attach local SSDs to VMs with a machine type that supports local SSDs.
  • Local SSDs exist only for the duration of the job: the disks and their data are deleted along with the job's VMs, so copy any data that you want to keep to another storage location before the job finishes.

You can create a job that uses a local SSD using the gcloud CLI or Batch API. The following example describes how to create a job that creates, attaches, and mounts a local SSD. The job also has 3 tasks that each run a script to create a file in the local SSD named output_task_TASK_INDEX.txt where TASK_INDEX is the index of each task: 0, 1, and 2.

gcloud

To create a job that uses local SSDs using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, create and attach the local SSDs in the instances field and mount the local SSDs in the volumes field.

  1. Create a JSON file.

    • If you are not using an instance template for this job, create a JSON file with the following contents:

      {
          "allocationPolicy": {
              "instances": [
                  {
                      "policy": {
                          "machineType": MACHINE_TYPE,
                          "disks": [
                              {
                                  "newDisk": {
                                      "sizeGb":LOCAL_SSD_SIZE,
                                      "type": "local_ssd"
                                  },
                                  "deviceName": "LOCAL_SSD_NAME"
                              }
                          ]
                      }
                  }
              ]
          },
          "taskGroups":[
              {
                  "taskSpec":{
                      "runnables": [
                          {
                              "script": {
                                  "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/LOCAL_SSD_NAME/output_task_${BATCH_TASK_INDEX}.txt"
                              }
                          }
                      ],
                      "volumes": [
                          {
                              "deviceName": "LOCAL_SSD_NAME",
                              "mountPath": "/mnt/disks/LOCAL_SSD_NAME",
                              "mountOptions": "rw,async"
                          }
                      ]
                  },
                  "taskCount":3
              }
          ],
          "logsPolicy": {
              "destination": "CLOUD_LOGGING"
          }
      }
      

      Replace the following:

      • MACHINE_TYPE: the machine type of the job's VMs. The allowed number of local SSDs depends on the machine type for your job's VMs.
      • LOCAL_SSD_NAME: the name of a local SSD created for this job.
      • LOCAL_SSD_SIZE: the size of all the local SSDs in GB. Each local SSD is 375 GB, so this value must be a multiple of 375 GB. For example, for 2 local SSDs, set this value to 750 GB.
    • If you are using an instance template for this job, create a JSON file as shown previously, except replace the instances field with the following:

      "instances": [
          {
              "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
          }
      ],
      

      where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses local SSDs, this instance template must define and attach the local SSDs that you want the job to use. For this example, the template must define and attach a local SSD named LOCAL_SSD_NAME.

  2. Run the following command:

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.

API

To create a job that uses local SSDs using the Batch API, use the jobs.create method. In the request, create and attach the local SSDs in the instances field and mount the local SSDs in the volumes field.

  • If you are not using an instance template for this job, make the following request:

    POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME
    
    {
        "allocationPolicy": {
            "instances": [
                {
                    "policy": {
                        "machineType": MACHINE_TYPE,
                        "disks": [
                            {
                                "newDisk": {
                                    "sizeGb":LOCAL_SSD_SIZE,
                                    "type": "local_ssd"
                                },
                                "deviceName": "LOCAL_SSD_NAME"
                            }
                        ]
                    }
                }
            ]
        },
        "taskGroups":[
            {
                "taskSpec":{
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/LOCAL_SSD_NAME/output_task_${BATCH_TASK_INDEX}.txt"
                            }
                        }
                    ],
                    "volumes": [
                        {
                            "deviceName": "LOCAL_SSD_NAME",
                            "mountPath": "/mnt/disks/LOCAL_SSD_NAME",
                            "mountOptions": "rw,async"
                        }
                    ]
                },
                "taskCount":3
            }
        ],
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    

    Replace the following:

    • PROJECT_ID: the project ID of your project.
    • LOCATION: the location of the job.
    • JOB_NAME: the name of the job.
    • MACHINE_TYPE: the machine type of the job's VMs. The allowed number of local SSDs depends on the machine type for your job's VMs.
    • LOCAL_SSD_NAME: the name of a local SSD created for this job.
    • LOCAL_SSD_SIZE: the size of all the local SSDs in GB. Each local SSD is 375 GB, so this value must be a multiple of 375 GB. For example, for 2 local SSDs, set this value to 750 GB.
  • If you are using an instance template for this job, create a JSON file as shown previously, except replace the instances field with the following:

    "instances": [
        {
            "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
        }
    ],
    

    where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses local SSDs, this instance template must define and attach the local SSDs that you want the job to use. For this example, the template must define and attach a local SSD named LOCAL_SSD_NAME.
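
Because a local SSD and its data only exist while the job's VMs exist, you might want the job itself to log the state of the mounted disk. As a sketch, you could add a runnable such as the following before the main script in the runnables array; df -h writes the size and usage of the file system at the mount path to the job's logs:

    {
        "script": {
            "text": "df -h /mnt/disks/LOCAL_SSD_NAME"
        }
    },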

Use a Cloud Storage bucket

To create a job that uses an existing Cloud Storage bucket, select one of the following methods:

  • Mount a bucket directly to your job's compute environment, as shown in this section.
  • Create a basic job with tasks that directly access Cloud Storage, such as through client libraries.

Before you create a job that uses a bucket, create a bucket or identify an existing bucket. For more information, see Create storage buckets and Listing buckets.
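
For example, you can create a new bucket using the gcloud CLI. The following sketch assumes a hypothetical, globally unique bucket name BUCKET_NAME and the us-central1 location:

    gcloud storage buckets create gs://BUCKET_NAME \
      --location us-central1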

You can create a job that uses a Cloud Storage bucket using the gcloud CLI or Batch API.

The following example describes how to create a job that mounts a Cloud Storage bucket. The job also has 3 tasks that each run a script to create a file in the bucket named output_task_TASK_INDEX.txt where TASK_INDEX is the index of each task: 0, 1, and 2.

gcloud

To create a job that uses a Cloud Storage bucket using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, mount the bucket in the volumes field.

For example, to create a job that outputs files to a Cloud Storage bucket:

  1. Create a JSON file in the current directory named hello-world-bucket.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
                            }
                        }
                    ],
                    "volumes": [
                        {
                            "gcs": {
                                "remotePath": "BUCKET_PATH"
                            },
                            "mountPath": "MOUNT_PATH"
                        }
                    ]
                },
                "taskCount": 3
            }
        ],
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }

    Replace the following:

    • BUCKET_PATH: the path of the bucket directory that you want this job to access, which must start with the name of the bucket. For example, for a bucket named BUCKET_NAME, the path BUCKET_NAME represents the root directory of the bucket and the path BUCKET_NAME/subdirectory represents the subdirectory subdirectory.
    • MOUNT_PATH: the mount path that the job's runnables use to access this bucket. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this bucket with a directory named my-bucket, set the mount path to /mnt/disks/my-bucket.

  2. Run the following command:

    gcloud batch jobs submit example-bucket-job \
      --location us-central1 \
      --config hello-world-bucket.json
    

API

To create a job that uses a Cloud Storage bucket using the Batch API, use the jobs.create method and mount the bucket in the volumes field.

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-bucket-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
                        }
                    }
                ],
                "volumes": [
                    {
                        "gcs": {
                            "remotePath": "BUCKET_PATH"
                        },
                        "mountPath": "MOUNT_PATH"
                    }
                ]
            },
            "taskCount": 3
        }
    ],
    "logsPolicy": {
            "destination": "CLOUD_LOGGING"
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • BUCKET_PATH: the path of the bucket directory that you want this job to access, which must start with the name of the bucket. For example, for a bucket named BUCKET_NAME, the path BUCKET_NAME represents the root directory of the bucket and the path BUCKET_NAME/subdirectory represents the subdirectory subdirectory.
  • MOUNT_PATH: the mount path that the job's runnables use to access this bucket. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this bucket with a directory named my-bucket, set the mount path to /mnt/disks/my-bucket.
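
After the job finishes, you can confirm the output by listing the bucket directory's contents. As a sketch, assuming the job wrote to the root of a bucket named BUCKET_NAME:

    gcloud storage ls gs://BUCKET_NAME

For this example, you should see three files, output_task_0.txt through output_task_2.txt, one per task.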

Use a network file system

You can create a job that uses an existing network file system (NFS), such as a Filestore file share, using the gcloud CLI or Batch API.

Before creating a job that uses an NFS, make sure that your network's firewall is properly configured to allow traffic between your job's VMs and the NFS. For more information, see Configuring firewall rules for Filestore.
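
If your NFS is a Filestore file share, you can look up the IP address to use in the job's configuration by describing the Filestore instance. The following is a sketch; INSTANCE_ID and INSTANCE_LOCATION are placeholders for your Filestore instance's name and its zone or region, and the --format expression assumes the instance's first network and IP address:

    gcloud filestore instances describe INSTANCE_ID \
      --location INSTANCE_LOCATION \
      --format="get(networks[0].ipAddresses[0])"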

The following example describes how to create a job that specifies and mounts an NFS. The job also has 3 tasks that each run a script to create a file in the NFS named output_task_TASK_INDEX.txt where TASK_INDEX is the index of each task: 0, 1, and 2.

gcloud

To create a job that uses an NFS using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, mount the NFS in the volumes field.

  1. Create a JSON file with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
                            }
                        }
                    ],
                    "volumes": [
                        {
                            "nfs": {
                                "server": "NFS_IP_ADDRESS",
                                "remotePath": "NFS_PATH"
                            },
                            "mountPath": "MOUNT_PATH"
                        }
                    ]
                },
                "taskCount": 3
            }
        ],
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    

    Replace the following:

    • NFS_IP_ADDRESS: the IP address of the NFS. For example, if your NFS is a Filestore file share, then specify the IP address of the VM hosting the Filestore file share, which you can get by describing the Filestore VM.
    • NFS_PATH: the path of the NFS directory that you want this job to access, which must start with a / followed by the root directory of the NFS. For example, for a Filestore file share named FILE_SHARE_NAME, the path /FILE_SHARE_NAME represents the root directory of the file share and the path /FILE_SHARE_NAME/subdirectory represents the subdirectory subdirectory.
    • MOUNT_PATH: the mount path that the job's runnables use to access this NFS. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this NFS with a directory named my-nfs, set the mount path to /mnt/disks/my-nfs.
  2. Run the following command:

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.

API

To create a job that uses an NFS using the Batch API, use the jobs.create method and mount the NFS in the volumes field.

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
                        }
                    }
                ],
                "volumes": [
                    {
                        "nfs": {
                            "server": "NFS_IP_ADDRESS",
                            "remotePath": "NFS_PATH"
                        },
                        "mountPath": "MOUNT_PATH"
                    }
                ]
            },
            "taskCount": 3
        }
    ],
    "logsPolicy": {
        "destination": "CLOUD_LOGGING"
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • LOCATION: the location of the job.
  • JOB_NAME: the name of the job.
  • NFS_IP_ADDRESS: the IP address of the NFS. For example, if your NFS is a Filestore file share, then specify the IP address of the VM hosting the Filestore file share, which you can get by describing the Filestore VM.
  • NFS_PATH: the path of the NFS directory that you want this job to access, which must start with a / followed by the root directory of the NFS. For example, for a Filestore file share named FILE_SHARE_NAME, the path /FILE_SHARE_NAME represents the root directory of the file share and the path /FILE_SHARE_NAME/subdirectory represents the subdirectory subdirectory.
  • MOUNT_PATH: the mount path that the job's runnables use to access this NFS. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this NFS with a directory named my-nfs, set the mount path to /mnt/disks/my-nfs.
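
After you submit any of the jobs on this page, you can monitor its progress by viewing the job's status and status events, for example:

    gcloud batch jobs describe JOB_NAME \
      --location LOCATION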

What's next