This document explains how to create and run a Batch job that uses one or more external storage volumes. External storage options include new or existing persistent disks, new local SSDs, existing Cloud Storage buckets, and an existing network file system (NFS) such as a Filestore file share.
Regardless of whether you add external storage volumes, each Compute Engine VM for a job has a boot disk, which provides storage for the job's operating system (OS) image and instructions. For information about configuring the boot disk for a job, see VM OS environment overview instead.
Before you begin
- If you haven't used Batch before, review Get started with Batch and enable Batch by completing the prerequisites for projects and users.
- To get the permissions that you need to create a job, ask your administrator to grant you the following IAM roles:
  - Batch Job Editor (roles/batch.jobsEditor) on the project
  - Service Account User (roles/iam.serviceAccountUser) on the job's service account, which by default is the default Compute Engine service account
  - Create a job that uses a Cloud Storage bucket: Storage Object Viewer (roles/storage.objectViewer) on the bucket
  For more information about granting roles, see Manage access to projects, folders, and organizations.
  You might also be able to get the required permissions through custom roles or other predefined roles.
Create a job that uses storage volumes
Optionally, a job can use one or more of each of the following types of external storage volumes. For more information about all of the types of storage volumes and the differences and restrictions for each, see the documentation for Compute Engine VM storage options.
- persistent disk: zonal or regional, persistent block storage
- local SSD: high-performance, transient block storage
- Cloud Storage bucket: affordable object storage
- network file system (NFS): distributed file system that follows Network File System protocol—for example, a Filestore file share, which is a high-performance NFS hosted on Google Cloud
You can allow a job to use each storage volume by including it in your job's definition and specifying its mount path (mountPath) in your runnables. To learn how to create a job that uses storage volumes, see one or more of the following sections:
Use a persistent disk
A job that uses persistent disks has the following restrictions:
All persistent disks: Review the restrictions for all persistent disks.
New versus existing persistent disks: Each persistent disk in a job can be either new (defined in and created with the job) or existing (already created in your project and specified in the job). To use a persistent disk, it needs to be formatted and mounted to the job's VMs, which must be in the same location as the persistent disk. Batch mounts any persistent disks that you include in a job and formats any new persistent disks, but you must format and unmount any existing persistent disks that you want a job to use.
The supported location options, format options, and mount options vary between new and existing persistent disks as described in the following table:
| | New persistent disks | Existing persistent disks |
| --- | --- | --- |
| Format options | The persistent disk is automatically formatted with an ext4 file system. | You must format the persistent disk to use an ext4 file system before using it for a job. |
| Mount options | All options are supported. | All options except writing are supported. This is due to restrictions of multi-writer mode. You must detach the persistent disk from any VMs that it is attached to before using it for a job. |
| Location options | You can only create zonal persistent disks. You can select any location for your job. The persistent disks get created in the zone your project runs in. | You can select zonal and regional persistent disks. You must set the job's location (or, if specified, just the job's allowed locations) to only locations that contain all of the job's persistent disks. For example, for a zonal persistent disk, the job's location must be the disk's zone; for a regional persistent disk, the job's location must be either the disk's region or, if specifying zones, one or both of the specific zones where the regional persistent disk is located. |
Instance templates: If you want to use a VM instance template while creating this job, you must attach any persistent disk(s) for this job in the instance template. Otherwise, if you don't want to use an instance template, you must attach any persistent disk(s) directly in the job definition.
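The location rule for existing persistent disks can be sketched in Python as follows. This is only an illustration of the rule stated above, not part of Batch: the function name valid_job_locations is hypothetical, and the zones/ZONE and regions/REGION strings are an assumed format for location values.

```python
def valid_job_locations(disk_location, replica_zones=()):
    """Return the job locations that are compatible with one existing disk.

    disk_location: assumed to look like "zones/us-central1-a" for a zonal
    disk or "regions/us-central1" for a regional disk.
    replica_zones: for a regional disk, the zones where the disk is located.
    """
    kind, _, name = disk_location.partition("/")
    if not name:
        raise ValueError(f"unrecognized disk location: {disk_location!r}")
    if kind == "zones":
        # Zonal disk: the job must run in the disk's zone.
        return {disk_location}
    if kind == "regions":
        # Regional disk: the disk's region, or the specific zones that hold it.
        return {disk_location} | {f"zones/{zone}" for zone in replica_zones}
    raise ValueError(f"unrecognized disk location: {disk_location!r}")


if __name__ == "__main__":
    print(sorted(valid_job_locations("zones/us-central1-a")))
    print(sorted(valid_job_locations(
        "regions/us-central1", ["us-central1-a", "us-central1-b"])))
```

When a job uses several existing disks, the usable locations would be the intersection of the sets returned for each disk.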
You can create a job that uses a persistent disk using the Google Cloud console, gcloud CLI, Batch API, C++, Go, Java, Node.js, or Python.
Console
Using the Google Cloud console, the following example creates a job that runs a script to read a file from an existing zonal persistent disk that is located in the us-central1-a zone. The example script assumes that the job has an existing zonal persistent disk that contains a text file named example.txt in the root directory.
Optional: Create an example zonal persistent disk
If you want to create a zonal persistent disk that you can use to run the example script, do the following before creating your job:
Attach a new, blank persistent disk named example-disk to a Linux VM in the us-central1-a zone and then run commands on the VM to format and mount the disk. For instructions, see Add a persistent disk to your VM.
Do not disconnect from the VM yet.
To create example.txt on the persistent disk, run the following commands on the VM:
To change the current working directory to the root directory of the persistent disk, type the following command:
cd VM_MOUNT_PATH
Replace VM_MOUNT_PATH with the path to the directory where the persistent disk was mounted to this VM in the previous step, for example, /mnt/disks/example-disk.
Press Enter.
To create and define a file named example.txt, type the following command:
cat > example.txt
Press Enter.
Type the contents of the file. For example, type Hello world!.
To save the file, press Ctrl+D (or Command+D on macOS).
When you're done, you can disconnect from the VM.
Detach the persistent disk from the VM.
If you don't need the VM anymore, you can delete the VM, which automatically detaches the persistent disk.
Otherwise, detach the persistent disk. For instructions, see Detaching and reattaching boot disks and detach the example-disk persistent disk instead of the VM's boot disk.
Create a job that uses the existing zonal persistent disk
To create a job that uses existing zonal persistent disks using the Google Cloud console, do the following:
In the Google Cloud console, go to the Job list page.
Click Create. The Create batch job page opens. In the left pane, the Job details page is selected.
Configure the Job details page:
Optional: In the Job name field, customize the job name.
For example, enter example-disk-job.
Configure the Task details section:
In the New runnable window, add at least one script or container for this job to run.
For example, to run a script that prints the contents of a file that is named example.txt and located in the root directory of the persistent disk that this job uses, do the following:
Select the Script checkbox. A text box appears.
In the text box, enter the following script:
echo "Here is the content of the example.txt file in the persistent disk."
cat MOUNT_PATH/example.txt
Replace MOUNT_PATH with the path to where you plan to mount the persistent disk to the VMs for this job, for example, /mnt/disks/example-disk.
Click Done.
In the Task count field, enter the number of tasks for this job.
For example, enter 1 (default).
In the Parallelism field, enter the number of tasks to run concurrently.
For example, enter 1 (default).
Configure the Resource specifications page:
In the left pane, click Resource specifications. The Resource specifications page opens.
Select the location for this job. To use an existing zonal persistent disk, a job's VMs must be located in the same zone.
In the Region field, select a region.
For example, to use the example zonal persistent disk, select us-central1 (Iowa) (default).
In the Zone field, select a zone.
For example, select us-central1-a (Iowa).
Configure the Additional configurations page:
In the left pane, click Additional configurations. The Additional configurations page opens.
For each existing zonal persistent disk that you want to mount to this job, do the following:
In the Storage volume section, click Add new volume. The New volume window appears.
In the New volume window, do the following:
In the Volume type section, select Persistent disk (default).
In the Disk list, select an existing zonal persistent disk that you want to mount to this job. The disk must be located in the same zone as this job.
For example, select the existing zonal persistent disk that you prepared, which is located in the us-central1-a zone and contains the file example.txt.
Optional: If you want to rename this zonal persistent disk, do the following:
Select Customize the device name.
In the Device name field, enter the new name for your disk.
In the Mount path field, enter the mount path (MOUNT_PATH) for this persistent disk:
For example, enter the following:
/mnt/disks/EXISTING_PERSISTENT_DISK_NAME
Replace EXISTING_PERSISTENT_DISK_NAME with the name of the disk. If you renamed the zonal persistent disk, use the new name.
For example, replace EXISTING_PERSISTENT_DISK_NAME with example-disk.
Click Done.
Optional: Configure the other fields for this job.
Optional: To review the job configuration, in the left pane, click Preview.
Click Create.
The Job details page displays the job that you created.
gcloud
Using the gcloud CLI, the following example creates a job that attaches and mounts an existing persistent disk and a new persistent disk. The job has 3 tasks that each run a script to create a file in the new persistent disk named output_task_TASK_INDEX.txt, where TASK_INDEX is the index of each task: 0, 1, and 2.
To create a job that uses persistent disks using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, specify the persistent disks in the instances field and mount the persistent disks in the volumes field.
Create a JSON file.
If you are not using an instance template for this job, create a JSON file with the following contents:
{
  "allocationPolicy": {
    "instances": [
      {
        "policy": {
          "disks": [
            {
              "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
              "existingDisk": "projects/PROJECT_ID/EXISTING_PERSISTENT_DISK_LOCATION/disks/EXISTING_PERSISTENT_DISK_NAME"
            },
            {
              "newDisk": {
                "sizeGb": NEW_PERSISTENT_DISK_SIZE,
                "type": "NEW_PERSISTENT_DISK_TYPE"
              },
              "deviceName": "NEW_PERSISTENT_DISK_NAME"
            }
          ]
        }
      }
    ],
    "location": {
      "allowedLocations": [
        "EXISTING_PERSISTENT_DISK_LOCATION"
      ]
    }
  },
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/NEW_PERSISTENT_DISK_NAME/output_task_${BATCH_TASK_INDEX}.txt"
            }
          }
        ],
        "volumes": [
          {
            "deviceName": "NEW_PERSISTENT_DISK_NAME",
            "mountPath": "/mnt/disks/NEW_PERSISTENT_DISK_NAME",
            "mountOptions": "rw,async"
          },
          {
            "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
            "mountPath": "/mnt/disks/EXISTING_PERSISTENT_DISK_NAME"
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
Replace the following:
- PROJECT_ID: the project ID of your project.
- EXISTING_PERSISTENT_DISK_NAME: the name of an existing persistent disk.
- EXISTING_PERSISTENT_DISK_LOCATION: the location of an existing persistent disk. For each existing zonal persistent disk, the job's location must be the disk's zone; for each existing regional persistent disk, the job's location must be either the disk's region or, if specifying zones, one or both of the specific zones where the regional persistent disk is located. If you are not specifying any existing persistent disks, you can select any location. Learn more about the allowedLocations field.
- NEW_PERSISTENT_DISK_SIZE: the size of the new persistent disk in GB. The allowed sizes depend on the type of persistent disk, but the minimum is often 10 GB (10) and the maximum is often 64 TB (64000).
- NEW_PERSISTENT_DISK_TYPE: the disk type of the new persistent disk, either pd-standard, pd-balanced, pd-ssd, or pd-extreme. The default disk type for non-boot persistent disks is pd-standard.
- NEW_PERSISTENT_DISK_NAME: the name of the new persistent disk.
If you are using a VM instance template for this job, create a JSON file as shown previously, except replace the instances field with the following:
"instances": [
  {
    "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
  }
],
where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses persistent disks, this instance template must define and attach the persistent disks that you want the job to use. For this example, the template must define and attach a new persistent disk named NEW_PERSISTENT_DISK_NAME and attach an existing persistent disk named EXISTING_PERSISTENT_DISK_NAME.
Run the following command:
gcloud batch jobs submit JOB_NAME \
  --location LOCATION \
  --config JSON_CONFIGURATION_FILE
Replace the following:
- JOB_NAME: the name of the job.
- LOCATION: the location of the job.
- JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.
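As an alternative to editing the placeholders by hand, the JSON configuration for this example can be generated with a short script. The following Python sketch uses only the standard library and mirrors the configuration shown in this section; the function name, its parameters, and the example values (my-project, new-disk, and so on) are hypothetical and chosen for illustration.

```python
import json


def build_persistent_disk_job(project_id, disk_location, existing_disk,
                              new_disk, new_disk_size_gb,
                              new_disk_type="pd-standard", task_count=3):
    """Build the persistent disk job configuration as a Python dict."""
    new_mount = f"/mnt/disks/{new_disk}"
    # ${BATCH_TASK_INDEX} is kept literal so the task's shell expands it.
    script = (
        "echo Hello world from task ${BATCH_TASK_INDEX}. >> "
        + new_mount + "/output_task_${BATCH_TASK_INDEX}.txt"
    )
    existing_path = f"projects/{project_id}/{disk_location}/disks/{existing_disk}"
    return {
        "allocationPolicy": {
            "instances": [{"policy": {"disks": [
                {"deviceName": existing_disk, "existingDisk": existing_path},
                {
                    "newDisk": {"sizeGb": new_disk_size_gb, "type": new_disk_type},
                    "deviceName": new_disk,
                },
            ]}}],
            "location": {"allowedLocations": [disk_location]},
        },
        "taskGroups": [{
            "taskSpec": {
                "runnables": [{"script": {"text": script}}],
                "volumes": [
                    # New disk: mounted read-write so the tasks can create files.
                    {"deviceName": new_disk, "mountPath": new_mount,
                     "mountOptions": "rw,async"},
                    # Existing disk: mounted without write options.
                    {"deviceName": existing_disk,
                     "mountPath": f"/mnt/disks/{existing_disk}"},
                ],
            },
            "taskCount": task_count,
        }],
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


if __name__ == "__main__":
    config = build_persistent_disk_job(
        "my-project", "zones/us-central1-a", "example-disk", "new-disk", 10)
    print(json.dumps(config, indent=2))
```

Redirecting the printed output to a file gives a configuration file that can be passed to the --config flag of the gcloud batch jobs submit command.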
API
Using the Batch API, the following example creates a job that attaches and mounts an existing persistent disk and a new persistent disk. The job has 3 tasks that each run a script to create a file in the new persistent disk named output_task_TASK_INDEX.txt, where TASK_INDEX is the index of each task: 0, 1, and 2.
To create a job that uses persistent disks using the Batch API, use the jobs.create method. In the request, specify the persistent disks in the instances field and mount the persistent disks in the volumes field.
If you are not using an instance template for this job, make the following request:
POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
  "allocationPolicy": {
    "instances": [
      {
        "policy": {
          "disks": [
            {
              "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
              "existingDisk": "projects/PROJECT_ID/EXISTING_PERSISTENT_DISK_LOCATION/disks/EXISTING_PERSISTENT_DISK_NAME"
            },
            {
              "newDisk": {
                "sizeGb": NEW_PERSISTENT_DISK_SIZE,
                "type": "NEW_PERSISTENT_DISK_TYPE"
              },
              "deviceName": "NEW_PERSISTENT_DISK_NAME"
            }
          ]
        }
      }
    ],
    "location": {
      "allowedLocations": [
        "EXISTING_PERSISTENT_DISK_LOCATION"
      ]
    }
  },
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/NEW_PERSISTENT_DISK_NAME/output_task_${BATCH_TASK_INDEX}.txt"
            }
          }
        ],
        "volumes": [
          {
            "deviceName": "NEW_PERSISTENT_DISK_NAME",
            "mountPath": "/mnt/disks/NEW_PERSISTENT_DISK_NAME",
            "mountOptions": "rw,async"
          },
          {
            "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
            "mountPath": "/mnt/disks/EXISTING_PERSISTENT_DISK_NAME"
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
Replace the following:
- PROJECT_ID: the project ID of your project.
- LOCATION: the location of the job.
- JOB_NAME: the name of the job.
- EXISTING_PERSISTENT_DISK_NAME: the name of an existing persistent disk.
- EXISTING_PERSISTENT_DISK_LOCATION: the location of an existing persistent disk. For each existing zonal persistent disk, the job's location must be the disk's zone; for each existing regional persistent disk, the job's location must be either the disk's region or, if specifying zones, one or both of the specific zones where the regional persistent disk is located. If you are not specifying any existing persistent disks, you can select any location. Learn more about the allowedLocations field.
- NEW_PERSISTENT_DISK_SIZE: the size of the new persistent disk in GB. The allowed sizes depend on the type of persistent disk, but the minimum is often 10 GB (10) and the maximum is often 64 TB (64000).
- NEW_PERSISTENT_DISK_TYPE: the disk type of the new persistent disk, either pd-standard, pd-balanced, pd-ssd, or pd-extreme. The default disk type for non-boot persistent disks is pd-standard.
- NEW_PERSISTENT_DISK_NAME: the name of the new persistent disk.
If you are using a VM instance template for this job, make the request as shown previously, except replace the instances field with the following:
"instances": [
  {
    "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
  }
],
...
where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses persistent disks, this instance template must define and attach the persistent disks that you want the job to use. For this example, the template must define and attach a new persistent disk named NEW_PERSISTENT_DISK_NAME and attach an existing persistent disk named EXISTING_PERSISTENT_DISK_NAME.
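The request URL used with the jobs.create method follows a fixed pattern, so it can be assembled from the three placeholders. The following Python sketch shows one way to do that with the standard library; the function name create_job_url and the example values are hypothetical, and the sketch only builds the URL (it does not authenticate or send the request).

```python
from urllib.parse import quote, urlencode

# Endpoint root taken from the request shown in this section.
BATCH_API_ROOT = "https://batch.googleapis.com/v1"


def create_job_url(project_id, location, job_name):
    """Return the POST URL for creating a job named job_name."""
    parent = "projects/{}/locations/{}".format(
        quote(project_id, safe=""), quote(location, safe=""))
    # The job name is passed as the job_id query parameter.
    query = urlencode({"job_id": job_name})
    return f"{BATCH_API_ROOT}/{parent}/jobs?{query}"


if __name__ == "__main__":
    print(create_job_url("my-project", "us-central1", "example-disk-job"))
```

The JSON body of the request is sent unchanged to this URL.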
C++
To create a Batch job that uses new or existing persistent disks using the Cloud Client Libraries for C++, use the CreateJob function and include the following:
- To attach persistent disks to the VMs for a job, include one of the following:
  - If you are not using a VM instance template for this job, use the set_remote_path method.
  - If you are using a VM instance template for this job, use the set_instance_template method.
- To mount the persistent disks to the job, use the volumes field with the deviceName and mountPath fields. For new persistent disks, also use the mountOptions field to enable writing.
For a code sample of a similar use case, see Use a Cloud Storage bucket.
Go
To create a Batch job that uses new or existing persistent disks using the Cloud Client Libraries for Go, use the CreateJob function and include the following:
- To attach persistent disks to the VMs for a job, include one of the following:
  - If you are not using a VM instance template for this job, include the AllocationPolicy_AttachedDisk type.
  - If you are using a VM instance template for this job, include the AllocationPolicy_InstancePolicyOrTemplate_InstanceTemplate type.
- To mount the persistent disks to the job, use the Volume type with the Volume_DeviceName type and MountPath field. For new persistent disks, also use the MountOptions field to enable writing.
Java
To create a Batch job that uses new or existing persistent disks using the Cloud Client Libraries for Java, use the CreateJobRequest class and include the following:
- To attach persistent disks to the VMs for a job, include one of the following:
  - If you are not using a VM instance template for this job, include the setDisks method.
  - If you are using a VM instance template for this job, include the setInstanceTemplate method.
- To mount the persistent disks to the job, use the Volume class with the setDeviceName method and setMountPath method. For new persistent disks, also use the setMountOptions method to enable writing.
For example, use the following code sample:
Node.js
To create a Batch job that uses new or existing persistent disks using the Cloud Client Libraries for Node.js, use the createJob method and include the following:
- To attach persistent disks to the VMs for a job, include one of the following:
  - If you are not using a VM instance template for this job, include the AllocationPolicy.AttachedDisk class.
  - If you are using a VM instance template for this job, include the instanceTemplate property.
- To mount the persistent disks to the job, use the Volume class with the deviceName property and mountPath property. For new persistent disks, also use the mountOptions property to enable writing.
Python
To create a Batch job that uses new or existing persistent disks using the Cloud Client Libraries for Python, use the CreateJob function and include the following:
- To attach persistent disks to the VMs for a job, include one of the following:
  - If you are not using a VM instance template for this job, include the AttachedDisk class.
  - If you are using a VM instance template for this job, include the instance_template attribute.
- To mount the persistent disks to the job, use the Volume class with the device_name attribute and mount_path attribute. For new persistent disks, also use the mount_options attribute to enable writing.
For example, use the following code sample:
Use a local SSD
A job that uses local SSDs has the following restrictions:
- All local SSDs: Review the restrictions for all local SSDs.
- Instance templates: If you want to specify a VM instance template while creating this job, you must attach any local SSD(s) for this job in the instance template. Otherwise, if you don't want to use an instance template, you must attach any local SSD(s) directly in the job definition.
You can create a job that uses a local SSD using the gcloud CLI, Batch API, Java, or Python.
The following example describes how to create a job that creates, attaches, and mounts a local SSD. The job also has 3 tasks that each run a script to create a file in the local SSD named output_task_TASK_INDEX.txt, where TASK_INDEX is the index of each task: 0, 1, and 2.
gcloud
To create a job that uses local SSDs using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, create and attach the local SSDs in the instances field and mount the local SSDs in the volumes field.
Create a JSON file.
If you are not using an instance template for this job, create a JSON file with the following contents:
{
  "allocationPolicy": {
    "instances": [
      {
        "policy": {
          "machineType": "MACHINE_TYPE",
          "disks": [
            {
              "newDisk": {
                "sizeGb": LOCAL_SSD_SIZE,
                "type": "local-ssd"
              },
              "deviceName": "LOCAL_SSD_NAME"
            }
          ]
        }
      }
    ]
  },
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/LOCAL_SSD_NAME/output_task_${BATCH_TASK_INDEX}.txt"
            }
          }
        ],
        "volumes": [
          {
            "deviceName": "LOCAL_SSD_NAME",
            "mountPath": "/mnt/disks/LOCAL_SSD_NAME",
            "mountOptions": "rw,async"
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
Replace the following:
- MACHINE_TYPE: the machine type, which can be predefined or custom, of the job's VMs. The allowed number of local SSDs depends on the machine type for your job's VMs.
- LOCAL_SSD_NAME: the name of a local SSD created for this job.
- LOCAL_SSD_SIZE: the size of all the local SSDs in GB. Each local SSD is 375 GB, so this value must be a multiple of 375 GB. For example, for 2 local SSDs, set this value to 750 GB.
If you are using a VM instance template for this job, create a JSON file as shown previously, except replace the instances field with the following:
"instances": [
  {
    "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
  }
],
where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses local SSDs, this instance template must define and attach the local SSDs that you want the job to use. For this example, the template must define and attach a local SSD named LOCAL_SSD_NAME.
Run the following command:
gcloud batch jobs submit JOB_NAME \
  --location LOCATION \
  --config JSON_CONFIGURATION_FILE
Replace the following:
- JOB_NAME: the name of the job.
- LOCATION: the location of the job.
- JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.
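Because each local SSD is 375 GB, the LOCAL_SSD_SIZE value is simple arithmetic on the number of local SSDs. The following Python sketch illustrates that calculation and builds the matching disks entry from the configuration in this section; the function names and the example device name are hypothetical.

```python
LOCAL_SSD_UNIT_GB = 375  # size of one local SSD, as stated in this section


def local_ssd_size_gb(count):
    """Return the total size in GB for the given number of local SSDs."""
    if count < 1:
        raise ValueError("at least one local SSD is required")
    return count * LOCAL_SSD_UNIT_GB


def local_ssd_disk(device_name, size_gb):
    """Build the disks entry for local SSDs, rejecting invalid sizes."""
    if size_gb <= 0 or size_gb % LOCAL_SSD_UNIT_GB != 0:
        raise ValueError(
            f"sizeGb must be a positive multiple of {LOCAL_SSD_UNIT_GB}")
    return {
        "newDisk": {"sizeGb": size_gb, "type": "local-ssd"},
        "deviceName": device_name,
    }


if __name__ == "__main__":
    size = local_ssd_size_gb(2)
    print(size)
    print(local_ssd_disk("example-local-ssd", size))
```

Checking the size before submitting the job catches values such as 500 that are not a multiple of 375.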
API
To create a job that uses local SSDs using the Batch API, use the jobs.create method. In the request, create and attach the local SSDs in the instances field and mount the local SSDs in the volumes field.
If you are not using an instance template for this job, make the following request:
POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
  "allocationPolicy": {
    "instances": [
      {
        "policy": {
          "machineType": "MACHINE_TYPE",
          "disks": [
            {
              "newDisk": {
                "sizeGb": LOCAL_SSD_SIZE,
                "type": "local-ssd"
              },
              "deviceName": "LOCAL_SSD_NAME"
            }
          ]
        }
      }
    ]
  },
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/LOCAL_SSD_NAME/output_task_${BATCH_TASK_INDEX}.txt"
            }
          }
        ],
        "volumes": [
          {
            "deviceName": "LOCAL_SSD_NAME",
            "mountPath": "/mnt/disks/LOCAL_SSD_NAME",
            "mountOptions": "rw,async"
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
Replace the following:
- PROJECT_ID: the project ID of your project.
- LOCATION: the location of the job.
- JOB_NAME: the name of the job.
- MACHINE_TYPE: the machine type, which can be predefined or custom, of the job's VMs. The allowed number of local SSDs depends on the machine type for your job's VMs.
- LOCAL_SSD_NAME: the name of a local SSD created for this job.
- LOCAL_SSD_SIZE: the size of all the local SSDs in GB. Each local SSD is 375 GB, so this value must be a multiple of 375 GB. For example, for 2 local SSDs, set this value to 750 GB.
If you are using a VM instance template for this job, make the request as shown previously, except replace the instances field with the following:
"instances": [
  {
    "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
  }
],
...
where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses local SSDs, this instance template must define and attach the local SSDs that you want the job to use. For this example, the template must define and attach a local SSD named LOCAL_SSD_NAME.
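Switching a job configuration from an inline instance policy to an instance template only changes the instances field. The following Python sketch shows that substitution on a configuration held as a dict; the function name with_instance_template and the example template name are hypothetical, and the sketch assumes the rest of the configuration stays as shown previously.

```python
import copy


def with_instance_template(config, template_name):
    """Return a copy of config whose instances field names a template."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    allocation = updated.setdefault("allocationPolicy", {})
    # Replace the inline policy with a reference to the instance template.
    allocation["instances"] = [{"instanceTemplate": template_name}]
    return updated


if __name__ == "__main__":
    inline = {
        "allocationPolicy": {
            "instances": [{"policy": {"disks": [
                {"newDisk": {"sizeGb": 375, "type": "local-ssd"},
                 "deviceName": "example-local-ssd"},
            ]}}],
        },
    }
    print(with_instance_template(inline, "example-template"))
```

The template itself must still define and attach the disks, because the job definition no longer does.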
Use a Cloud Storage bucket
To create a job that uses an existing Cloud Storage bucket, select one of the following methods:
- Recommended: Mount a bucket directly to your job's VMs by specifying the bucket in the job's definition, as shown in this section. When the job runs, the bucket is automatically mounted to the VMs for your job using Cloud Storage FUSE.
- Create a job with tasks that directly access a Cloud Storage bucket by using the gcloud CLI or client libraries for the Cloud Storage API. To learn how to access a Cloud Storage bucket directly from a VM, see the Compute Engine documentation for Writing and reading data from Cloud Storage buckets.
Before you create a job that uses a bucket, create a bucket or identify an existing bucket. For more information, see Create buckets and List buckets.
You can create a job that uses a Cloud Storage bucket using the Google Cloud console, gcloud CLI, Batch API, C++, Go, Java, Node.js, or Python.
The following example describes how to create a job that mounts a Cloud Storage bucket. The job also has 3 tasks that each run a script to create a file in the bucket named output_task_TASK_INDEX.txt, where TASK_INDEX is the index of each task: 0, 1, and 2.
Console
To create a job that uses a Cloud Storage bucket using the Google Cloud console, do the following:
In the Google Cloud console, go to the Job list page.
Click Create. The Create batch job page opens. In the left pane, the Job details page is selected.
Configure the Job details page:
Optional: In the Job name field, customize the job name.
For example, enter example-bucket-job.
Configure the Task details section:
In the New runnable window, add at least one script or container for this job to run.
For example, do the following:
Select the Script checkbox. A text box appears.
In the text box, enter the following script:
echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt
Replace MOUNT_PATH with the mount path that this job's runnables use to access an existing Cloud Storage bucket. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this bucket with a directory named my-bucket, set the mount path to /mnt/disks/my-bucket.
Click Done.
In the Task count field, enter the number of tasks for this job.
For example, enter 3.
In the Parallelism field, enter the number of tasks to run concurrently.
For example, enter 1 (default).
Configure the Additional configurations page:
In the left pane, click Additional configurations. The Additional configurations page opens.
For each Cloud Storage bucket that you want to mount to this job, do the following:
In the Storage volume section, click Add new volume. The New volume window appears.
In the New volume window, do the following:
In the Volume type section, select Cloud Storage bucket.
In the Storage Bucket name field, enter the name of an existing bucket.
For example, enter the bucket you specified in this job's runnable.
In the Mount path field, enter the mount path of the bucket (MOUNT_PATH), which you specified in the runnable.
Click Done.
Optional: Configure the other fields for this job.
Optional: To review the job configuration, in the left pane, click Preview.
Click Create.
The Job details page displays the job that you created.
gcloud
To create a job that uses a Cloud Storage bucket using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, mount the bucket in the volumes field.
For example, to create a job that outputs files to a Cloud Storage bucket:
Create a JSON file with the following contents:
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
            }
          }
        ],
        "volumes": [
          {
            "gcs": {
              "remotePath": "BUCKET_PATH"
            },
            "mountPath": "MOUNT_PATH"
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
Replace the following:
- BUCKET_PATH: the path of the bucket directory that you want this job to access, which must start with the name of the bucket. For example, for a bucket named BUCKET_NAME, the path BUCKET_NAME represents the root directory of the bucket and the path BUCKET_NAME/subdirectory represents the subdirectory subdirectory.
- MOUNT_PATH: the mount path that the job's runnables use to access this bucket. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this bucket with a directory named my-bucket, set the mount path to /mnt/disks/my-bucket.
Run the following command:
gcloud batch jobs submit JOB_NAME \
  --location LOCATION \
  --config JSON_CONFIGURATION_FILE
Replace the following:
- JOB_NAME: the name of the job.
- LOCATION: the location of the job.
- JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.
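The volumes entry for a bucket pairs a remote path with a mount path, and the mount path has a required prefix. The following Python sketch builds that entry and checks the /mnt/disks/ rule described in this section before the configuration is written; the function name gcs_volume and the example values are hypothetical.

```python
import json

MOUNT_PREFIX = "/mnt/disks/"  # required start of the mount path


def gcs_volume(bucket_path, mount_path):
    """Build the volumes entry that mounts a Cloud Storage bucket path."""
    if not bucket_path:
        raise ValueError("bucket_path must start with the bucket name")
    # The mount path needs the prefix plus a directory of your choice.
    if not mount_path.startswith(MOUNT_PREFIX) or mount_path == MOUNT_PREFIX:
        raise ValueError(
            f"mount_path must start with {MOUNT_PREFIX} followed by a directory")
    return {"gcs": {"remotePath": bucket_path}, "mountPath": mount_path}


if __name__ == "__main__":
    volume = gcs_volume("my-bucket/subdirectory", "/mnt/disks/my-bucket")
    print(json.dumps(volume, indent=2))
```

The same mount path must also appear in the runnable's script so that the tasks write into the mounted bucket.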
API
To create a job that uses a Cloud Storage bucket using the Batch API, use the jobs.create method and mount the bucket in the volumes field.
POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
            }
          }
        ],
        "volumes": [
          {
            "gcs": {
              "remotePath": "BUCKET_PATH"
            },
            "mountPath": "MOUNT_PATH"
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
Replace the following:
- PROJECT_ID: the project ID of your project.
- LOCATION: the location of the job.
- JOB_NAME: the name of the job.
- BUCKET_PATH: the path of the bucket directory that you want this job to access, which must start with the name of the bucket. For example, for a bucket named BUCKET_NAME, the path BUCKET_NAME represents the root directory of the bucket and the path BUCKET_NAME/subdirectory represents the subdirectory subdirectory.
- MOUNT_PATH: the mount path that the job's runnables use to access this bucket. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this bucket with a directory named my-bucket, set the mount path to /mnt/disks/my-bucket.
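A BUCKET_PATH value combines the bucket name with an optional directory inside the bucket. The following Python sketch splits such a value into its two parts, which can help confirm that a path starts with the bucket name; the function name split_bucket_path and the example values are hypothetical.

```python
def split_bucket_path(bucket_path):
    """Split a bucket path into (bucket name, directory inside the bucket).

    The directory is an empty string when the path is only the bucket name,
    which represents the root directory of the bucket.
    """
    bucket_name, _, directory = bucket_path.partition("/")
    if not bucket_name:
        raise ValueError("the path must start with the name of the bucket")
    return bucket_name, directory


if __name__ == "__main__":
    print(split_bucket_path("my-bucket"))
    print(split_bucket_path("my-bucket/subdirectory"))
```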
C++
For more information, see the Batch C++ API reference documentation.
To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Go
For more information, see the Batch Go API reference documentation.
To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
For more information, see the Batch Java API reference documentation.
To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
For more information, see the Batch Node.js API reference documentation.
To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
For more information, see the Batch Python API reference documentation.
To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Use a network file system
You can create a job that uses an existing network file system (NFS), such as a Filestore file share, using the Google Cloud console, gcloud CLI, or Batch API.
Before creating a job that uses an NFS, make sure that your network's firewall is properly configured to allow traffic between your job's VMs and the NFS. For more information, see Configuring firewall rules for Filestore.
The following example describes how to create a job that specifies and mounts an NFS. The job also has 3 tasks that each run a script to create a file in the NFS named output_task_TASK_INDEX.txt, where TASK_INDEX is the index of each task: 0, 1, and 2.
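To make the file naming concrete, the following Python sketch imitates how the BATCH_TASK_INDEX variable in this page's example script expands for each of the 3 tasks. It runs locally and only prints the resulting commands; it doesn't create a job. The function name and the mount path value are assumptions made for illustration.

```python
from string import Template

# Script text used by the example job on this page.
SCRIPT = (
    "echo Hello world from task ${BATCH_TASK_INDEX}. "
    ">> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
)


def expand_script(task_index: int, mount_path: str) -> str:
    """Return the command that a task with this index would effectively run."""
    text = SCRIPT.replace("MOUNT_PATH", mount_path)
    # Template mimics the shell's ${VAR} expansion for this one variable.
    return Template(text).substitute(BATCH_TASK_INDEX=task_index)


if __name__ == "__main__":
    # A task count of 3 gives the task indexes 0, 1, and 2.
    for index in range(3):
        print(expand_script(index, "/mnt/disks/my-nfs"))
```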
Console
To create a job that uses an NFS using the Google Cloud console, do the following:
In the Google Cloud console, go to the Job list page.
Click Create. The Create batch job page opens. In the left pane, the Job details page is selected.
Configure the Job details page:
Optional: In the Job name field, customize the job name.
For example, enter example-nfs-job.
Configure the Task details section:
In the New runnable window, add at least one script or container for this job to run.
For example, do the following:
Select the Script checkbox. A text box appears.
In the text box, enter the following script:
echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt
Replace MOUNT_PATH with the mount path that the job's runnables use to access this NFS. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this NFS with a directory named my-nfs, set the mount path to /mnt/disks/my-nfs.
Click Done.
In the Task count field, enter the number of tasks for this job.
For example, enter 3.
In the Parallelism field, enter the number of tasks to run concurrently.
For example, enter 1 (default).
Configure the Additional configurations page:
In the left pane, click Additional configurations. The Additional configurations page opens.
For each NFS that you want to mount to this job, do the following:
In the Storage volume section, click Add new volume. The New volume window appears.
In the New volume window, do the following:
In the Volume type section, select Network file system.
In the File server field, enter the IP address of the server where the NFS you specified in this job's runnable is located.
For example, if your NFS is a Filestore file share, then specify the IP address of the Filestore instance, which you can get by describing the Filestore instance.
In the Remote path field, enter a path that can access the NFS you specified in the previous step.
The path of the NFS directory must start with a / followed by the root directory of the NFS.
In the Mount path field, enter the mount path to the NFS (MOUNT_PATH), which you specified in the previous step.
Click Done.
Optional: Configure the other fields for this job.
Optional: To review the job configuration, in the left pane, click Preview.
Click Create.
The Job details page displays the job that you created.
gcloud
To create a job that uses an NFS using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, mount the NFS in the volumes field.
Create a JSON file with the following contents:
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
            }
          }
        ],
        "volumes": [
          {
            "nfs": {
              "server": "NFS_IP_ADDRESS",
              "remotePath": "NFS_PATH"
            },
            "mountPath": "MOUNT_PATH"
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
Replace the following:
- NFS_IP_ADDRESS: the IP address of the NFS. For example, if your NFS is a Filestore file share, then specify the IP address of the Filestore instance, which you can get by describing the Filestore instance.
- NFS_PATH: the path of the NFS directory that you want this job to access, which must start with a / followed by the root directory of the NFS. For example, for a Filestore file share named FILE_SHARE_NAME, the path /FILE_SHARE_NAME represents the root directory of the file share and the path /FILE_SHARE_NAME/subdirectory represents the subdirectory named subdirectory.
- MOUNT_PATH: the mount path that the job's runnables use to access this NFS. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this NFS with a directory named my-nfs, set the mount path to /mnt/disks/my-nfs.
Run the following command:
gcloud batch jobs submit JOB_NAME \
  --location LOCATION \
  --config JSON_CONFIGURATION_FILE
Replace the following:
- JOB_NAME: the name of the job.
- LOCATION: the location of the job.
- JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.
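Instead of editing the JSON configuration file by hand, you could generate it. The following Python sketch derives the NFS path for a Filestore file share from the file share name and writes a configuration file with the same structure as the one in this section. The function names, the IP address, the file share name, and the output file name are hypothetical values chosen for illustration.

```python
import json
from pathlib import Path
from typing import Optional


def filestore_remote_path(file_share_name: str, subdirectory: Optional[str] = None) -> str:
    """Return /FILE_SHARE_NAME, or /FILE_SHARE_NAME/subdirectory if one is given."""
    path = "/" + file_share_name.strip("/")
    if subdirectory:
        path += "/" + subdirectory.strip("/")
    return path


def nfs_job_config(nfs_ip_address: str, nfs_path: str, mount_path: str, task_count: int = 3) -> dict:
    """Build a job configuration that mounts the NFS at mount_path."""
    # ${BATCH_TASK_INDEX} stays literal here; the task's shell expands it at run time.
    script = (
        "echo Hello world from task ${BATCH_TASK_INDEX}. >> "
        + mount_path
        + "/output_task_${BATCH_TASK_INDEX}.txt"
    )
    return {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [{"script": {"text": script}}],
                    "volumes": [
                        {
                            "nfs": {"server": nfs_ip_address, "remotePath": nfs_path},
                            "mountPath": mount_path,
                        }
                    ],
                },
                "taskCount": task_count,
            }
        ],
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


def write_config(config: dict, path: str) -> None:
    """Write the configuration as indented JSON."""
    Path(path).write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical IP address, file share name, and output file name.
    config = nfs_job_config("10.0.0.2", filestore_remote_path("vol1"), "/mnt/disks/my-nfs")
    write_config(config, "nfs-job.json")
```

You can then pass the generated file's path as JSON_CONFIGURATION_FILE in the gcloud batch jobs submit command.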
API
To create a job that uses an NFS using the Batch API, use the jobs.create method and mount the NFS in the volumes field.
POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
            }
          }
        ],
        "volumes": [
          {
            "nfs": {
              "server": "NFS_IP_ADDRESS",
              "remotePath": "NFS_PATH"
            },
            "mountPath": "MOUNT_PATH"
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
Replace the following:
- PROJECT_ID: the project ID of your project.
- LOCATION: the location of the job.
- JOB_NAME: the name of the job.
- NFS_IP_ADDRESS: the IP address of the Network File System. For example, if your NFS is a Filestore file share, then specify the IP address of the Filestore instance, which you can get by describing the Filestore instance.
- NFS_PATH: the path of the NFS directory that you want this job to access, which must start with a / followed by the root directory of the NFS. For example, for a Filestore file share named FILE_SHARE_NAME, the path /FILE_SHARE_NAME represents the root directory of the file share and the path /FILE_SHARE_NAME/subdirectory represents a subdirectory.
- MOUNT_PATH: the mount path that the job's runnables use to access this NFS. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this NFS with a directory named my-nfs, set the mount path to /mnt/disks/my-nfs.
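Before sending a request, you might want to check the NFS values against the rules described above. The following Python sketch is a local pre-check only: it mirrors the stated rules (an IP address for the server, a remote path that starts with /, and a mount path under /mnt/disks/) and is not the validation that the Batch service itself performs. The function name and the example values are assumptions made for illustration.

```python
import ipaddress

MOUNT_PREFIX = "/mnt/disks/"


def check_nfs_volume(server: str, remote_path: str, mount_path: str) -> list[str]:
    """Return a list of problems found; an empty list means the values look plausible."""
    problems = []
    try:
        ipaddress.ip_address(server)
    except ValueError:
        problems.append(f"server is not an IP address: {server!r}")
    # The remote path must start with / followed by the root directory of the NFS.
    if not (remote_path.startswith("/") and len(remote_path) > 1):
        problems.append(f"remote path must start with / and name a directory: {remote_path!r}")
    # The mount path must start with /mnt/disks/ followed by a directory or path.
    if not (mount_path.startswith(MOUNT_PREFIX) and len(mount_path) > len(MOUNT_PREFIX)):
        problems.append(f"mount path must start with {MOUNT_PREFIX} plus a directory: {mount_path!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical values, used only for illustration.
    for problem in check_nfs_volume("10.0.0.2", "/vol1", "/mnt/disks/my-nfs"):
        print(problem)
```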
What's next
- If you have issues creating or running a job, see Troubleshooting.
- View jobs and tasks.
- Learn about more job creation options.