This document describes how to limit the run times of tasks and runnables by setting timeouts.
A timeout specifies the amount of time that a task or runnable is permitted to run. Batch doesn't allow jobs to run for longer than 14 days and doesn't set default timeouts for individual tasks and runnables. Consequently, an individual task or runnable can run for as long as 14 days before it automatically fails. However, if your tasks and runnables aren't intended to run for that long, this default behavior might cause unexpected costs and delays. To prevent excessive run times, you can set timeouts for tasks and runnables.
Before you begin
- If you haven't used Batch before, review Get started with Batch and enable Batch by completing the prerequisites for projects and users.
- To get the permissions that you need to create a job, ask your administrator to grant you the following IAM roles:
  - Batch Job Editor (roles/batch.jobsEditor) on the project
  - Service Account User (roles/iam.serviceAccountUser) on the job's service account, which by default is the default Compute Engine service account
  For more information about granting roles, see Manage access to projects, folders, and organizations.
  You might also be able to get the required permissions through custom roles or other predefined roles.
Set timeouts
You can set timeouts for runnables, tasks, or both. The timeout for a runnable specifies the maximum run time for that runnable. The timeout for a task specifies the maximum run time for that task, which is the sum of all the individual run times of its runnables. For example, if a task has 3 runnables that all run at the same time for 1 minute, then the task's run time is 3 minutes, not 1 minute.
If you set overlapping timeouts—such as a timeout for both a runnable and the runnable's task—then only one timeout needs to be exceeded to trigger automatic failure. For example, suppose you set a task's timeout to 60 seconds and the timeout of each of that task's runnables to 120 seconds. Then, this example task and all of its runnables fail when the sum of the run times of its runnables exceeds 60 seconds, and it's impossible to trigger the 120-second timeouts.
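The interaction between the two kinds of timeout can be sketched in a few lines of Python. This is a simplified model for reasoning only, not Batch's implementation: the function name and parameters are hypothetical, and it assumes that a task's run time is the running sum of its runnables' run times, as described above.

```python
from typing import Optional


def first_exceeded_timeout(
    runnable_run_times: list[float],
    runnable_timeout: Optional[float],
    task_timeout: Optional[float],
) -> Optional[str]:
    """Return "task" or "runnable" for the timeout exceeded first, else None.

    Simplified model (an assumption, not Batch's code): the task's run time
    is the running sum of the run times of its runnables, in seconds.
    """
    task_elapsed = 0.0
    for run_time in runnable_run_times:
        # Longest this runnable can run before its own timeout stops it.
        if runnable_timeout is None:
            allowed = run_time
        else:
            allowed = min(run_time, runnable_timeout)
        # The task timeout is reached before the runnable finishes or times out.
        if task_timeout is not None and task_elapsed + allowed > task_timeout:
            return "task"
        if runnable_timeout is not None and run_time > runnable_timeout:
            return "runnable"
        task_elapsed += run_time
    return None


# The example from the text: 60-second task timeout, 120-second runnable timeouts.
print(first_exceeded_timeout([30, 50], runnable_timeout=120, task_timeout=60))  # task
print(first_exceeded_timeout([130], runnable_timeout=120, task_timeout=60))  # task
```

In this model, a runnable timeout can only be the first one exceeded when it is shorter than the time remaining in the task's timeout.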
To choose appropriate timeouts for your job's tasks and runnables, analyze the logs of similar jobs that you have previously run to determine the typical run times of tasks and runnables for comparable workloads.
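Once you have collected run times from earlier jobs, one possible way to turn them into a timeout value is to take the longest observed run time and add some headroom. The following Python sketch illustrates that idea. The function name, the sample run times, and the 1.5 headroom factor are all assumptions for illustration; choose a margin that suits your workload.

```python
import math


def suggest_timeout(past_run_times: list[float], headroom: float = 1.5) -> str:
    """Suggest a timeout string such as "77s" from past run times in seconds.

    Hypothetical heuristic: longest observed run time multiplied by a
    headroom factor, rounded up to a whole second.
    """
    if not past_run_times:
        raise ValueError("at least one past run time is required")
    if headroom < 1:
        raise ValueError("headroom must be at least 1")
    seconds = math.ceil(max(past_run_times) * headroom)
    return f"{seconds}s"


# Made-up run times, in seconds, as they might be read from the logs of similar jobs.
print(suggest_timeout([42.0, 51.3, 47.8]))  # 77s
```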
Set timeout for a task
Use the Google Cloud CLI or REST API to create a job that includes the maxRunDuration field in the taskSpec object of the JSON file:
{
  "taskGroups": [
    {
      "taskSpec": {
        ...
        "maxRunDuration": "TIMEOUT"
      }
    }
  ]
}
Replace TIMEOUT with the maximum number of seconds or fractional seconds that you want to permit the task to run for. For example, 255s.
A job that sets a 255 second timeout for a task would have a JSON configuration file similar to the following:
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
            }
          }
        ],
        "maxRunDuration": "255s"
      },
      "taskCount": 3
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
If the timeout for a task is exceeded, the task automatically fails and the exceeded timeout is indicated by exit code 50005 in the job's status events and logs. For more information about exceeded timeouts, see the troubleshooting documentation for exit code 50005.
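If you generate job configuration files programmatically, the maxRunDuration field can be added before the file is written. The following Python sketch shows one way to do that with the standard library only. The helper names format_duration and with_task_timeout are hypothetical, and the sketch assumes the job is held as a dictionary that mirrors the JSON structure shown above.

```python
import copy
import json


def format_duration(seconds: float) -> str:
    """Format seconds as a duration string such as "255s" or "3.5s"."""
    if seconds <= 0:
        raise ValueError("a timeout must be a positive number of seconds")
    # Fixed-point formatting avoids scientific notation for large values.
    text = f"{seconds:.9f}".rstrip("0").rstrip(".")
    return f"{text}s"


def with_task_timeout(job: dict, seconds: float) -> dict:
    """Return a copy of job with maxRunDuration set in each taskSpec."""
    updated = copy.deepcopy(job)
    for group in updated.get("taskGroups", []):
        group.setdefault("taskSpec", {})["maxRunDuration"] = format_duration(seconds)
    return updated


job = {
    "taskGroups": [
        {
            "taskSpec": {"runnables": [{"script": {"text": "echo Hello world!"}}]},
            "taskCount": 3,
        }
    ],
    "logsPolicy": {"destination": "CLOUD_LOGGING"},
}

# Print the configuration with a 255-second task timeout added.
print(json.dumps(with_task_timeout(job, 255), indent=2))
```

The helper returns a copy rather than changing the input, so the same base configuration can be reused with different timeouts.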
Set timeout for a runnable
Use the Google Cloud CLI or REST API to create a job that includes the timeout field in the runnable object of the JSON file:
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            ...
            "timeout": "TIMEOUT"
          }
        ]
      }
    }
  ]
}
Replace TIMEOUT with the maximum number of seconds or fractional seconds that you want to permit the runnable to run for. For example, 3.5s.
A job that sets a 3.5 second timeout for a runnable would have a JSON configuration file similar to the following:
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
            },
            "timeout": "3.5s"
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
If the timeout for a runnable is exceeded, the runnable automatically fails and the exceeded timeout is indicated by exit code 50005 in the job's status events and logs. For more information about exceeded timeouts, see the troubleshooting documentation for exit code 50005.
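When a configuration sets both maxRunDuration and timeout, a runnable timeout that is longer than its task's timeout can never be triggered, as in the 60-second and 120-second example earlier in this document. The following Python sketch is one way to flag such settings before you create a job. The function names are hypothetical, and the sketch assumes that duration strings use the seconds form shown in this document, such as 255s or 3.5s.

```python
def parse_duration(text: str) -> float:
    """Convert a duration string such as "255s" or "3.5s" to seconds."""
    if not text.endswith("s"):
        raise ValueError(f"expected a value ending in 's', got {text!r}")
    return float(text[:-1])


def unreachable_runnable_timeouts(job: dict) -> list[tuple[int, int]]:
    """Return (task group index, runnable index) pairs whose timeout is
    longer than the maxRunDuration of the task that contains them."""
    found = []
    for g, group in enumerate(job.get("taskGroups", [])):
        spec = group.get("taskSpec", {})
        if "maxRunDuration" not in spec:
            continue  # No task timeout, so nothing can overlap.
        task_limit = parse_duration(spec["maxRunDuration"])
        for r, runnable in enumerate(spec.get("runnables", [])):
            if "timeout" in runnable and parse_duration(runnable["timeout"]) > task_limit:
                found.append((g, r))
    return found


job = {
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [{"script": {"text": "echo Hello"}, "timeout": "120s"}],
                "maxRunDuration": "60s",
            }
        }
    ]
}
print(unreachable_runnable_timeouts(job))  # [(0, 0)]
```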
What's next
- If you have issues creating or running a job, see Troubleshooting.
- View jobs and tasks.
- Learn about more job creation options.
- Learn how to analyze a job using logs.