Running a Job

This page explains how to run Jobs in Kubernetes Engine.

Overview

In Kubernetes Engine, a Job is a controller object that represents a finite task. Jobs differ from other controller objects in that Jobs manage the task as it runs to completion, rather than managing an ongoing desired state (such as the total number of running Pods).

Jobs are useful for large computation and batch-oriented tasks. Jobs can be used to support parallel execution of Pods. You can use a Job to run independent but related work items in parallel: sending emails, rendering frames, transcoding files, scanning database keys, etc. However, Jobs are not designed for closely-communicating parallel processes or for workloads that run continuously, such as ongoing streams of background processes.

In Kubernetes Engine, there are two types of Jobs:

  • Non-parallel Job: A Job which creates only one Pod (which is re-created if the Pod terminates unsuccessfully), and which is completed when the Pod terminates successfully.
  • Parallel Jobs with a completion count: A Job that is completed when a certain number of Pods terminate successfully. You specify the desired number of completions using the completions field.

Jobs are represented by Kubernetes Job objects. When a Job is created, the Job controller creates one or more Pods and ensures that its Pods terminate successfully. As its Pods terminate, a Job tracks how many Pods completed their tasks successfully. Once the desired number of successful completions is reached, the Job is complete.

Similar to other controllers, a Job controller creates a new Pod if one of its Pods fails or is deleted.

Creating a Job

You can create a Job using kubectl apply or kubectl create with a manifest file, or by running kubectl run with the Job's specification.

kubectl create

The following example shows a Job manifest file named config.yaml:

apiVersion: batch/v1
kind: Job
metadata:
  # Unique key of the Job instance
  name: example-job
spec:
  template:
    metadata:
      name: example-job
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl"]
        args: ["-Mbignum=bpi", "-wle", "print bpi(2000)"]
      # Do not restart containers after they exit
      restartPolicy: Never

This Job computes pi to 2000 places then prints it.

The only field that a Job object requires is a Pod template.

To create a Job from this file, run the following command:

kubectl create -f config.yaml
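
Once the Pod has finished, you can read the computed value from its log. The following commands are a minimal sketch; they assume the example-job name from the manifest above, and [POD_NAME] stands for whatever name kubectl reports for the Job's Pod:

# Find the name of the Pod the Job created (completed Pods are hidden by default)
kubectl get pods --show-all

# Print the Pod's output, the first 2000 digits of pi
kubectl logs [POD_NAME]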

kubectl apply

The following example shows a Job manifest file named config.yaml:

apiVersion: batch/v1
kind: Job
metadata:
  # Unique key of the Job instance
  name: example-job
spec:
  template:
    metadata:
      name: example-job
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl"]
        args: ["-Mbignum=bpi", "-wle", "print bpi(2000)"]
      # Do not restart containers after they exit
      restartPolicy: Never

This Job computes pi to 2000 places then prints it.

The only field that a Job object requires is a Pod template.

To create a Job from this file, run the following command:

kubectl apply -f config.yaml

kubectl run

The following example computes pi to 2000 places then prints it:

kubectl run pi --image=perl --restart=OnFailure -- perl -Mbignum=bpi -wle 'print bpi(2000)'

Job completion count

A Job is completed when a specific number of Pods terminate successfully. By default, a non-parallel Job with a single Pod completes as soon as the Pod terminates successfully.

If you have a parallel Job, you can set a completion count using the optional completions field. This field specifies how many Pods must terminate successfully before the Job is complete. The completions field accepts a positive integer.

Omitting completions causes the success of any Pod to signal the success of all Pods.

To specify a completion count, add the completions value to the Job's spec field in the manifest file. For example, the following configuration specifies that there should be eight successful completions:

apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  completions: 8
  template:
    metadata:
      name: example-job
    spec:
      ...

The default value of completions is 1. The parallelism field, described in the next section, also defaults to 1 when it is not set.
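
For reference, the following is one way a complete manifest with a completion count might look. This is only a sketch that reuses the pi example from earlier on this page; any valid Pod template works the same way:

apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  # The Job is complete once eight Pods terminate successfully
  completions: 8
  template:
    metadata:
      name: example-job
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl"]
        args: ["-Mbignum=bpi", "-wle", "print bpi(2000)"]
      # Do not restart containers after they exit
      restartPolicy: Never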

Managing parallelism

By default, Job Pods do not run in parallel. The optional parallelism field specifies the maximum desired number of Pods a Job should run concurrently at any given time.

The actual number of Pods running in a steady state might be less than the parallelism value if the remaining work is less than that value. If you have also set completions, the actual number of Pods running in parallel does not exceed the number of remaining completions. A Job may throttle Pod creation in response to excessive Pod creation failures.

To manage parallelism, add the parallelism value to the Job's spec field in the manifest file. For example, the following manifest specifies that there should be five concurrent Pods running:

apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  parallelism: 5
  template:
    metadata:
      name: example-job
    spec:
      ...

The default value of parallelism is 1 if the field is omitted. If the value is set to 0, the Job is paused until the value is increased.
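
The completions and parallelism fields can be combined. As a sketch of how they interact, the following spec (using the values from the two examples above) runs at most five Pods at any given time until eight Pods have terminated successfully:

apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  # Eight successful Pods are required in total
  completions: 8
  # No more than five Pods run concurrently
  parallelism: 5
  template:
    metadata:
      name: example-job
    spec:
      ...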

Specifying a deadline

By default, a Job creates new Pods forever if its Pods fail continuously. If you prefer not to have a Job retry forever, you can specify a deadline value using the optional activeDeadlineSeconds field.

A deadline grants a Job a specific amount of time, in seconds, to complete its tasks successfully before terminating.

To specify a deadline, add the activeDeadlineSeconds value to the Job's spec field in the manifest file. For example, the following configuration specifies that the Job has 100 seconds to complete successfully:

apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  activeDeadlineSeconds: 100
  template:
    metadata:
      name: example-job
    spec:
      ...

If a Job does not complete successfully before the deadline, the Job ends with the status DeadlineExceeded. Pod creation stops, and existing Pods are deleted.
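
To check whether a Job was ended by its deadline, you can inspect the Job's status conditions. The following command is a sketch that assumes the example-job name used above:

kubectl get job example-job --output=yaml

In the output, a Job that ran out of time reports a condition under status.conditions whose reason is DeadlineExceeded.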

Specifying a Pod selector

Manually specifying a selector is useful if you want to update a Job's Pod template, but you want the Job's current Pods to run under the updated Job.

When a Job is instantiated, the system generates a selector field containing a unique identifier for the Job's Pods. The generated ID does not overlap with any other Job's. Generally, you would not set this field yourself: setting a selector value that overlaps with another Job can cause issues with Pods in the other Job. To set the field yourself, you must specify manualSelector: true in the Job's spec field.

For example, you can run kubectl get job my-job --output=yaml to see the Job's specification, which contains the selector generated for its Pods:

kind: Job
metadata:
  name: my-job
...
spec:
  completions: 1
  parallelism: 1
  selector:
    matchLabels:
      controller-uid: a8f3d00d-c6d2-11e5-9f87-42010af00002
...

When you create a new Job, you can set the manualSelector value to true, then set the selector field's controller-uid value to the existing Job's identifier, like the following:

kind: Job
metadata:
  name: my-new-job
  ...
spec:
  manualSelector: true
  selector:
    matchLabels:
      controller-uid: a8f3d00d-c6d2-11e5-9f87-42010af00002
  ...

Because my-new-job reuses the existing controller-uid value, its selector matches the original Job's Pods, and those Pods run under the updated Job.
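
If you need the existing Job's generated identifier for a manifest like the one above, one way to extract it is with kubectl's jsonpath output. This is a sketch that assumes the my-job name used earlier; the printed value can then be pasted into the new Job's selector:

kubectl get job my-job --output=jsonpath='{.spec.selector.matchLabels.controller-uid}'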

Inspecting a Job

Console

After creating a Job using kubectl, you can inspect it by performing the following steps:

  1. Visit the Kubernetes Engine Workloads menu in GCP Console.

  2. Select the desired workload from the menu.

You can inspect the Job in the following ways:

  • To see the Job's live configuration, click YAML.
  • To see all events related to the Job, click Events.
  • To see the Job's revision history, click Revision history.

kubectl

To check a Job's status, run the following command:

kubectl describe job my-job

To view all Pod resources in your cluster, including Pods created by the Job which have completed, run:

kubectl get pods -a

The -a flag (short for --show-all) includes Pods that have already completed, which kubectl get hides by default.
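
If your cluster runs many Pods, you can narrow the listing to the Pods that belong to a particular Job by filtering on the job-name label, which the Job controller adds to the Pods it creates. For example, a sketch using the my-job name from above:

kubectl get pods -a --selector=job-name=my-job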

Scaling a Job

Console

To scale a Job, perform the following steps:

  1. Visit the Kubernetes Engine Workloads menu in GCP Console.

  2. Select the desired workload from the menu.

  3. Click Scale.
  4. From the Replicas field, enter the desired number of replicas.
  5. Click Scale.

kubectl

To scale a Job, run the following command:

kubectl scale job my-job --replicas=[VALUE]

kubectl scale causes the number of concurrently-running Pods to change. Specifically, it changes the value of parallelism to the [VALUE] you specify.
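
To confirm the change, you can read the parallelism value back from the Job's spec, for example with jsonpath output (a sketch using the my-job name from above):

kubectl get job my-job --output=jsonpath='{.spec.parallelism}'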

Deleting a Job

When a Job completes, the Job stops creating Pods. The Job API object is not removed when it completes, which allows you to view its status. Pods created by the Job are not deleted, but they are terminated. Retention of the Pods allows you to view their logs and to interact with them.

Console

To delete a Job, perform the following steps:

  1. Visit the Kubernetes Engine Workloads menu in GCP Console.

  2. From the menu, select the desired workload.

  3. Click Delete.
  4. From the confirmation dialog menu, click Delete.

kubectl

To delete a Job, run the following command:

kubectl delete job my-job

When you delete a Job, all of its Pods are also deleted.

To delete a Job but retain its Pods, specify the --cascade=false flag:

kubectl delete jobs my-job --cascade=false
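
If you later decide to remove the retained Pods themselves, one option (a sketch, assuming the Pods still carry the job-name label applied when my-job created them) is to delete them by label:

kubectl delete pods --selector=job-name=my-job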
