This page describes how to get started with Batch for Google Cloud.
Overview
Batch is a fully managed service that lets you schedule, queue, and execute batch processing workloads on Google Cloud resources. Batch provisions resources and manages capacity on your behalf, allowing your batch workloads to run at scale.
Using Batch, you don't need to configure and manage third-party job schedulers, provision and deprovision resources, or request resources one zone at a time. To run a job, you specify parameters for the resources required for your workload, then Batch obtains resources and queues the job for execution. Batch provides native integration with other Google Cloud services to aid in the scheduling, execution, storage, and analysis of batch jobs, so you can focus on submitting a job and consuming the results.
Batch consists of the following components:
Job: A scheduled program that runs a set of tasks to completion without any user interaction, typically for computational workloads. For example, a job might be a single shell script or a complex, multipart computation.
A job is executed through one or more specific actions called tasks. Each Batch job consists of an array of one or more tasks that all run the same runnables, which are the executable script(s) and container(s) for your job. A job's tasks can run in parallel or sequentially on the job's resources.
Tasks: Programmatic actions that are defined as part of a job and executed when the job runs. Each task is part of a job's task group. The job's runnables are run by each task in the job.
Resources: The infrastructure needed to run a job. Each Batch job runs on a regional managed instance group (MIG) of one or more Compute Engine virtual machine (VM) instances based on the job's specified requirements and location. Each VM has dedicated hardware for CPU cores and memory—which affect the performance of your job—and a boot disk—which stores an operating system (OS) image and instructions for running your job. If specified, a job might also use or access additional resources, like GPUs, or additional read/write storage resources, like local SSDs or a Cloud Storage bucket. Some of the factors that determine the number of VMs provisioned for a job include the VM hardware resources required for each task and the job's parallelism: whether you want tasks to run sequentially on one VM or simultaneously on multiple VMs.
In summary, Batch lets you create and run jobs, each of which automatically provisions and uses the resources required to execute its tasks.
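To make the relationship between a job, its tasks, its runnables, and its resources more concrete, the following is a minimal sketch of a job configuration written to a JSON file. The file name, task count, parallelism, and machine type are illustrative assumptions, not values taken from this page. The script runnable prints the task index from the environment variables that Batch sets inside each task.

```shell
#!/bin/sh
# Sketch: write a minimal Batch job configuration to a JSON file.
# The file name, task counts, and machine type are illustrative values.
CONFIG_FILE="${CONFIG_FILE:-hello-job.json}"

# The quoted 'EOF' keeps ${BATCH_TASK_INDEX} literal, so it is expanded
# inside each task at run time rather than by this shell.
cat > "$CONFIG_FILE" <<'EOF'
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello from task ${BATCH_TASK_INDEX} of ${BATCH_TASK_COUNT}"
            }
          }
        ],
        "computeResource": {
          "cpuMilli": 2000,
          "memoryMib": 16
        }
      },
      "taskCount": 4,
      "parallelism": 2
    }
  ],
  "allocationPolicy": {
    "instances": [
      {
        "policy": {
          "machineType": "e2-standard-4"
        }
      }
    ]
  },
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
EOF

echo "Wrote $CONFIG_FILE"

# To submit the job (JOB_NAME and LOCATION are placeholders, and the
# project must already be set up for Batch):
#   gcloud batch jobs submit JOB_NAME --location LOCATION --config hello-job.json
```

In this sketch, the single task group defines four tasks that all run the same script runnable, with at most two tasks running at the same time. The allocation policy names the one machine type that the job's VMs use.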
Pricing
There is no additional cost for using Batch. You are only charged for the cost of the underlying resources required to execute your jobs.
For more information about the costs associated with Batch and how to filter Cloud Billing reports to view Batch costs, see Pricing.
Restrictions
Batch has the following restrictions:
- You cannot exceed the Batch quotas and limits for your project.
- You can only specify one machine type, which can be predefined or custom, per job.
- To use a specific VM image for your job, you must create a job using an instance template.
- You cannot specify more than one task group per job. All jobs have only one task group named group0.
Prerequisites
To start using Batch, complete the following prerequisites:
- If your project has not used Batch before, enable Batch for your project.
- Set up Batch for each new user.
Enable Batch for a project
To start using Batch with a project, do the following:
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Make sure that billing is enabled for your Google Cloud project.
Make sure that Batch is enabled for your project:
Enable the APIs for Batch using the Google Cloud console or the Google Cloud CLI.
Console
Enable the Batch, Compute Engine, and Cloud Logging APIs.
gcloud
Enable the Batch, Compute Engine, and Cloud Logging APIs:
gcloud services enable batch.googleapis.com compute.googleapis.com logging.googleapis.com
To ensure that the service account for each job has the necessary permissions to allow the Batch service agent to create and access resources for jobs, ask your administrator to grant the following IAM roles to any service accounts that your project uses for Batch jobs.
The service account each job uses by default is the Compute Engine default service account, but you can also customize which service account a job uses.
- Batch Agent Reporter (roles/batch.agentReporter) on the project
- To let jobs access a Cloud Storage bucket: Storage Admin (roles/storage.admin) on the bucket
- To let jobs generate logs in Cloud Logging: Logs Writer (roles/logging.logWriter) on the project
For more information about granting roles to service accounts, see Restricting service accounts and Manage access to service accounts.
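As a rough illustration of what an administrator might run to grant these roles, the sketch below prints one gcloud command per role. The project ID, service account email, and bucket name are placeholder assumptions, and the run helper is a hypothetical convenience that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: grant the roles listed above to a job's service account.
# PROJECT_ID, SERVICE_ACCOUNT, and BUCKET are placeholder values.
PROJECT_ID="${PROJECT_ID:-example-project}"
SERVICE_ACCOUNT="${SERVICE_ACCOUNT:-123456789012-compute@developer.gserviceaccount.com}"
BUCKET="${BUCKET:-example-bucket}"

# Hypothetical helper: print each command instead of running it,
# unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

grant_job_roles() {
  # Project-level roles: Batch Agent Reporter and Logs Writer.
  for role in roles/batch.agentReporter roles/logging.logWriter; do
    run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member="serviceAccount:$SERVICE_ACCOUNT" --role="$role"
  done
  # Bucket-level role, needed only if jobs access a Cloud Storage bucket.
  run gcloud storage buckets add-iam-policy-binding "gs://$BUCKET" \
    --member="serviceAccount:$SERVICE_ACCOUNT" --role="roles/storage.admin"
}

grant_job_roles
```

Printing by default lets you review the three bindings before applying them. Rerun with DRY_RUN=0 and your own values once they look right.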
Make sure you are familiar with your project's Batch service agent:
After you create a Batch job, the Batch service agent (a Google-managed service account) is automatically created for your project with the following name:
service-PROJECT_NUMBER@gcp-sa-cloudbatch.iam.gserviceaccount.com
Replace PROJECT_NUMBER with the project number of your project.
The Batch service agent is automatically granted the Google Batch Service Agent (roles/batch.serviceAgent) IAM role. This configuration is required for your project to use Batch.
However, certain use cases (for example, running a job on a Shared VPC network) require you to grant additional permissions to your project's Batch service agent.
For more information, see Service agents.
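If you need the service agent's email address in a script, for example to grant it additional permissions, it can be derived from the project number using the naming pattern above. The following is a small sketch. The function name batch_service_agent and the project number 123456789012 are made up for illustration.

```shell
#!/bin/sh
# Build the Batch service agent's email address from a project number,
# following the naming pattern given above.
batch_service_agent() {
  printf 'service-%s@gcp-sa-cloudbatch.iam.gserviceaccount.com\n' "$1"
}

# 123456789012 is a made-up project number used only for illustration.
batch_service_agent 123456789012

# With the gcloud CLI configured, the real project number can be looked up with:
#   gcloud projects describe PROJECT_ID --format="value(projectNumber)"
```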
Set up Batch for a new user
To start using Batch as a user, do the following:
To get the permissions that you need to use Batch, ask your administrator to grant you the required IAM roles on the project. Refer to the documentation for each task to see its required permissions.
For example, if you want to start learning how to use Batch by creating a basic job, consider requesting roles for the following tasks:
- To create jobs:
  - Batch Job Editor (roles/batch.jobsEditor) on the project
  - Service Account User (roles/iam.serviceAccountUser) on the job's service account, which by default is the default Compute Engine service account
- To list and describe jobs: Batch Job Editor (roles/batch.jobsEditor) or Batch Job Viewer (roles/batch.jobsViewer) on the project
- To view logs for jobs: Logs Viewer (roles/logging.viewer) on the project
- To delete jobs: Batch Job Editor (roles/batch.jobsEditor) on the project
For more information about granting roles, see Manage access.
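For reference, an administrator could grant the example roles above with commands along these lines. This is a sketch under assumptions: the project ID, user email, and service account are placeholders, and the run helper is a hypothetical convenience that prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: grant a user the roles suggested above for working with basic jobs.
# PROJECT_ID, USER_EMAIL, and SERVICE_ACCOUNT are placeholder values.
PROJECT_ID="${PROJECT_ID:-example-project}"
USER_EMAIL="${USER_EMAIL:-alex@example.com}"
SERVICE_ACCOUNT="${SERVICE_ACCOUNT:-123456789012-compute@developer.gserviceaccount.com}"

# Hypothetical helper: print each command instead of running it,
# unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

grant_user_roles() {
  # Batch Job Editor covers creating, listing, describing, and deleting jobs.
  # Logs Viewer covers viewing job logs.
  for role in roles/batch.jobsEditor roles/logging.viewer; do
    run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member="user:$USER_EMAIL" --role="$role"
  done
  # Service Account User is granted on the job's service account itself.
  run gcloud iam service-accounts add-iam-policy-binding "$SERVICE_ACCOUNT" \
    --member="user:$USER_EMAIL" --role="roles/iam.serviceAccountUser"
}

grant_user_roles
```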
If you want to use the command-line examples for Batch, set up the Google Cloud CLI by doing the following. Learn more about authentication for the Google Cloud CLI.
Install the Google Cloud CLI, then initialize it by running the following command:
gcloud init
Recommended: Set a default project using the gcloud config set project command:
gcloud config set project PROJECT_ID
Replace PROJECT_ID with the project ID of your project.
If you want to use the API examples or client library examples for Batch, see Authenticate to Batch.
Get support
You can discuss Batch with the community on Cloud Forums.
If you have issues with Batch, see the troubleshooting documentation.
To get support or provide feedback for Batch, use the following resources:
For billing issues with Google Cloud, contact Billing support.
If you have a paid support package, contact Google Cloud Support directly for issues with Batch.
Google Cloud offers different support packages to meet different needs, such as 24/7 coverage, phone support, and access to a technical support manager. For more information, see Google Cloud Support.
To provide any feedback or feature requests for Batch, or to report issues for Batch without a paid support package, click the Send feedback button, which you can find at the beginning and end of each Batch documentation page. Then, select one of the following:
- For feedback related to Batch documentation, select "Documentation feedback."
- For all other feedback about Batch, select "Product feedback."
What's next
Discover more about Batch:
Learn more about creating a job.
Learn about related Google Cloud products: