This document explains the execution process and creation options for jobs. Batch jobs let you run batch-processing workloads on Google Cloud. To learn more about jobs, see Get started with Batch.
How job creation and execution works
To use Batch to run a workload, you create a job that specifies your workload and its requirements. When you finish creating the job, the job is automatically queued, scheduled, and executed on the specified resources.
The resources required to run a job, which are a regional managed instance group (MIG) of Compute Engine VMs and any additional resources that you specify, are automatically provisioned and deprovisioned. How long a job takes to finish queueing and running varies between jobs and over time, depending on resource availability. Generally, jobs are more likely to run and finish sooner if they are smaller and require only a few common resources. The example jobs in the Batch documentation typically use minimal resources and might finish running in as little as a few minutes.
After you create a job, you can check its status by describing the job. After a job's state indicates the job has started running, you can also monitor the job by viewing logs. A job's details, history, and logs remain available until you delete the job.
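The status check described above can be sketched in Python. This is a minimal illustration, not part of Batch itself: it assumes that describing a job reports a state string such as QUEUED, SCHEDULED, RUNNING, SUCCEEDED, or FAILED, and the helper names are hypothetical. Verify the full set of states against the Batch API reference before relying on it.

```python
# Job state names assumed from the Batch job status; verify them against
# the API reference, because other states may exist.
IN_PROGRESS_STATES = {"QUEUED", "SCHEDULED", "RUNNING"}
FINISHED_STATES = {"SUCCEEDED", "FAILED"}


def is_finished(state: str) -> bool:
    """Return True if the job has stopped running, successfully or not."""
    return state in FINISHED_STATES


def may_have_logs(state: str) -> bool:
    """Return True if the job has started running, so logs may exist.

    Hypothetical helper: a job that failed before it started running
    might not have produced any logs.
    """
    return state == "RUNNING" or state in FINISHED_STATES


for state in ("QUEUED", "RUNNING", "SUCCEEDED"):
    print(state, "finished:", is_finished(state), "logs:", may_have_logs(state))
```

A polling loop could call a helper like `is_finished` on each described state and stop once it returns True.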
Job creation options
Create and run a basic job explains the fundamentals, including how to define a job's tasks using either a script or container image and use predefined and custom environment variables.
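To make these fundamentals concrete, the following Python sketch assembles a basic job configuration as JSON, using either a script or a container image plus custom environment variables. The function name `build_basic_job` and its parameters are hypothetical; the field names follow the Batch job JSON format as understood here and should be checked against the Batch API reference.

```python
import json


def build_basic_job(script_text=None, image_uri=None, commands=None,
                    variables=None, task_count=1):
    """Build a basic job configuration as a plain dictionary.

    Exactly one of script_text or image_uri must be provided.
    """
    if (script_text is None) == (image_uri is None):
        raise ValueError("Provide exactly one of script_text or image_uri")

    if script_text is not None:
        # Each task runs a shell script.
        runnable = {"script": {"text": script_text}}
    else:
        # Each task runs a container image with optional commands.
        runnable = {"container": {"imageUri": image_uri,
                                  "commands": list(commands or [])}}

    task_spec = {"runnables": [runnable]}
    if variables:
        # Custom environment variables that are visible to every task.
        task_spec["environment"] = {"variables": dict(variables)}

    return {
        "taskGroups": [{"taskSpec": task_spec, "taskCount": task_count}],
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


# BATCH_TASK_INDEX is a predefined variable; GREETING is a custom one.
job = build_basic_job(
    script_text="echo ${GREETING} from task ${BATCH_TASK_INDEX}",
    variables={"GREETING": "hello"},
    task_count=2,
)
print(json.dumps(job, indent=2))
```

If the printed JSON is saved to a file, it could be submitted with a command such as `gcloud batch jobs submit JOB_NAME --location LOCATION --config FILE`, where the uppercase values are placeholders.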
After you understand the fundamentals for job creation, consider using one or more of the following options:
- Define job resources using a VM instance template explains how to specify a Compute Engine VM template to define a job's resources when you create a job. This method is required to create jobs that use non-default VM images.
- Control access for a job using a custom service account explains how to specify a job's service account, which influences the resources and applications that a job's VMs can access. If you do not specify a custom service account, jobs default to using the Compute Engine default service account.
- Configure task communication using an MPI library explains how to configure a job with tightly coupled tasks that communicate with each other across different VMs by using a Message Passing Interface (MPI) library. A common use case for MPI is tightly coupled high-performance computing (HPC) workloads.
- Use GPUs for a job explains how to define a job that uses one or more graphics processing units (GPUs). Common use cases for jobs that use GPUs include intensive data processing or machine learning (ML) workloads.
- Use storage volumes for a job explains how to define a job that can access one or more external storage volumes. Storage options include new or existing persistent disk, new local SSDs, existing Cloud Storage buckets, and an existing network file system (NFS) such as a Filestore file share.
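Several of the options in the preceding list correspond to fields in a job's configuration. The following Python sketch shows one way they might be layered onto an existing configuration dictionary. The function `with_options`, its parameters, and the example values are hypothetical; the field names reflect the Batch job JSON format as understood here, so confirm them in the Batch API reference. MPI configuration is not covered.

```python
import copy


def with_options(job, instance_template=None, service_account_email=None,
                 gpu_type=None, gpu_count=1, bucket_path=None,
                 mount_path="/mnt/disks/share"):
    """Return a copy of a job configuration with optional settings applied."""
    if instance_template and gpu_type:
        # This sketch treats a template and an inline GPU policy as alternatives.
        raise ValueError("Use either instance_template or gpu_type, not both")

    job = copy.deepcopy(job)
    allocation = job.setdefault("allocationPolicy", {})

    if instance_template:
        # Define the job's VMs from a Compute Engine instance template.
        allocation["instances"] = [{"instanceTemplate": instance_template}]
    elif gpu_type:
        # Attach GPUs to each VM and request driver installation.
        allocation["instances"] = [{
            "installGpuDrivers": True,
            "policy": {"accelerators": [{"type": gpu_type, "count": gpu_count}]},
        }]

    if service_account_email:
        # Run the job's VMs as a custom service account.
        allocation["serviceAccount"] = {"email": service_account_email}

    if bucket_path:
        # Mount an existing Cloud Storage bucket for every task group.
        for group in job.get("taskGroups", []):
            volumes = group.setdefault("taskSpec", {}).setdefault("volumes", [])
            volumes.append({"gcs": {"remotePath": bucket_path},
                            "mountPath": mount_path})

    if not allocation:
        # Avoid leaving an empty allocation policy behind.
        del job["allocationPolicy"]
    return job


base = {"taskGroups": [{"taskSpec": {"runnables": [{"script": {"text": "ls"}}]}}]}
configured = with_options(base, gpu_type="nvidia-tesla-t4",
                          bucket_path="example-bucket")
print(configured["allocationPolicy"])
```

Copying the input keeps the base configuration reusable, so the same basic job can be combined with different options.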
What's next
- Create and run a basic job
- Follow a tutorial: Create and run a job using Workflows, which explains how to use Workflows to execute a job's tasks in an order that you define using the Workflows syntax.