Batch on GKE reference and APIs

Batch on GKE resources

Batch introduces the following new Kubernetes resources:

BatchJob
Jobs describe the work to be done and the compute resources, data, etc. needed to do it.
BatchQueue
Queues are the central organizing construct in Batch. A Queue is associated with the constraints, budget, and priority described below.
BatchJobConstraint
A constraint specifies the jobs and policies acceptable in a given Queue. For example, a constraint may require jobs with no GPUs and less than 4 hours in duration. A single BatchJobConstraint resource may be assigned to multiple Queues.
BatchBudget
A budget defines how many resources a Queue may use or how much it can spend at a time or over a number of days. The expected usage is that each effort, project or team is assigned a budget. Several Queues may consume the same budget.
BatchPriority
Admins may define multiple different priority levels in the system. A Queue can have one priority level, but the same BatchPriority resource can be attached to multiple Queues.

BatchJobs

A BatchJob defines a single user job. It specifies the work to run, the compute resources that work requires, and the data storage associated with the job. Jobs wait in a Queue. Queues organize and prioritize Jobs based on resources and dependencies.

A BatchJob has to be submitted into a BatchQueue for scheduling and budget control. The resource requirement enables Batch to create and assign VMs to satisfy the BatchJob's computational needs. Once the BatchJob starts to run, the container image in the job spec will be pulled and run to completion.

The following default_job.yaml shows a minimal sample job:

apiVersion: kbatch.k8s.io/v1beta1
kind: BatchJob
metadata:
  generateName: pi-  # generateName allows the system to generate a random name with a set prefix every time this BatchJob is submitted.
  namespace: default
spec:
  batchQueueName: default
  taskGroups:
  - name: main
    maxWallTime: 5m
    template:
      spec:
        containers:
        - name: pi
          # This image has been made public so it can be pulled from any project.
          image: gcr.io/kbatch-images/generate-pi/generate-pi:latest
          resources:
            requests:
              cpu: 1.0
              memory: 2Gi
            limits:
              cpu: 1.0
              memory: 2Gi
          imagePullPolicy: IfNotPresent
        restartPolicy: Never

After a BatchJob is submitted, Batch processes the job and marks the job's latest status in the Phase field of the Job's status. A Job's phases are:

Phase      Description
Queued     Job is admitted into the system and waiting to be scheduled.
Scheduled  Job is scheduled by the job scheduler and is waiting to be assigned to a node.
Ready      Job is assigned to a node and is waiting for the node to be ready.
Running    Job is running.
Failed     Job has failed.
Succeeded  Job has completed successfully.

If a job needs access to some file storage for input and output data, Batch supports mounting a Kubernetes PersistentVolumeClaim (PVC) based volume into the job container.

The following data_job.yaml shows an example:

apiVersion: kbatch.k8s.io/v1beta1
kind: BatchJob
metadata:
  name: data-job
  namespace: default
spec:
  batchQueueName: default
  taskGroups:
  - name: default
    maxWallTime: 10m
    retryLimit: 0
    template:
      spec:
        containers:
        - name: data-util
          image: ubuntu:latest
          volumeMounts:
          - mountPath: /jobdata
            name: data-volume
        restartPolicy: Never
        volumes:
        - name: data-volume
          persistentVolumeClaim:
            claimName: pvc

BatchQueues

Batch introduces a new resource, BatchQueue. In Batch, a BatchQueue is a way to group, manage, and reason about closely related BatchJobs, typically from the same team.

Batch allows you to have multiple BatchQueues in your cluster. BatchQueues are objects for organizing and managing logical efforts in your organization with budgets, priorities and policies. Jobs that use the same budget or should have the same policies should be placed into the same set of BatchQueues. You may allow a user to submit to different BatchQueues.

The following YAML defines a BatchQueue using the BatchBudget default:

apiVersion: kbatch.k8s.io/v1beta1
kind: BatchQueue
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: default
  namespace: default
spec:
  batchPriorityName: high
  batchBudgetName: default
  constraintNames: ["default"]

BatchPriority

The system administrator can create a number of priority objects. Each priority object has an integer priority. One priority object can be specified on each BatchQueue. The jobs and Pods created from this BatchQueue inherit this priority.

The following YAML file shows an example BatchPriority:

apiVersion: kbatch.k8s.io/v1beta1
kind: BatchPriority
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: high
spec:
  value: 100
  description: High Priority

Priorities are not namespaced, so they can be applied to a BatchQueue in any namespace. After you create a priority, attach it to a BatchQueue by setting the batchPriorityName field in the BatchQueue to the name of the priority. The batchPriorityName field can be updated at any time.

When a job runs, the scheduler prioritizes it based on its queue's priority. In general, jobs from a BatchQueue with a priority of 88 run before jobs from a BatchQueue with a priority of 3.
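
The ordering rule can be illustrated with a short Python sketch. This is not Batch's scheduler: the QueuedJob record and the jobs_in_priority_order helper are hypothetical names, and the sketch assumes only that a larger priority value means earlier scheduling, as the example above suggests.

```python
from typing import NamedTuple


class QueuedJob(NamedTuple):
    """Hypothetical record: a job name plus the priority value of its BatchQueue."""
    name: str
    queue_priority: int


def jobs_in_priority_order(jobs):
    """Return jobs sorted so that a higher queue priority comes first.

    sorted() is stable, so jobs with equal priority keep their original order.
    """
    return sorted(jobs, key=lambda job: job.queue_priority, reverse=True)
```

How Batch orders jobs that share the same priority is not stated here; the sketch simply keeps their original order.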

You can delete a BatchPriority only if no BatchQueue references it in its batchPriorityName field.

BatchJobConstraints

BatchJobConstraints are groupings of constraints that can be attached to a BatchQueue. BatchJobConstraints are not namespaced and can be applied to BatchQueues in any namespace. When a BatchJob is submitted to a BatchQueue, the BatchJob is validated against the constraints. The BatchJob is only admitted into the queue if it meets the constraints.

For example, if a WallTime constraint is set to 30m, only jobs shorter than 30m are allowed into the queue.
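
As a rough illustration of that check, the following Python sketch compares a job's maximum wall time with a WallTime limit. The function names are hypothetical, and the sketch assumes durations are written as a whole number followed by s, m, or h; the formats Batch actually accepts may be broader.

```python
import re

# Seconds per unit for the simple duration format assumed by this sketch.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text):
    """Convert a duration such as "30m" or "24h" to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def wall_time_allowed(job_max_wall_time, limit):
    """Return True if the job's maxWallTime is strictly shorter than the limit."""
    return parse_duration(job_max_wall_time) < parse_duration(limit)
```

With a 30m limit, wall_time_allowed("5m", "30m") is True, while a job that asks for exactly 30m or for 1h is rejected.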

The admin may want to require that certain BatchQueues have only certain types of jobs. For example, a BatchQueue meant for quick iteration during development may have a high priority but require that the jobs have shorter maximum wall-clock times and not use a lot of CPUs.

If a BatchQueue specifies a BatchJobConstraint, any job in that BatchQueue will have those requirements applied by default if it doesn't specify a value for them. If the job specifies one of those requirements with an incompatible value then it will fail during submission.

The following YAML file shows an example BatchJobConstraint:

apiVersion: kbatch.k8s.io/v1beta1
kind: BatchJobConstraint
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: default
spec:
  # The system supports the following constraints:
  # Cpu, Memory, WallTime, Gpu, GpuModel, RetryLimit
  # Adding a BatchJobConstraint to a BatchQueue means that the BatchQueue will only accept jobs that satisfy the
  # listed constraints.
  constraints:
    - name: WallTime
      operator: LessThan
      values: ["24h"]

The following table shows the supported fields and the operators available for each field:

Field name  Operators                                        Example values
WallTime    LessThan, GreaterThan, Equal, In, NotIn, Exists  30m, 1h
Memory      LessThan, GreaterThan, Equal, In, NotIn, Exists  128974848, 129e6, 129M, 123Mi
Cpu         LessThan, GreaterThan, Equal, In, NotIn, Exists  0.1, 100m, 1
Gpu         LessThan, GreaterThan, Equal, In, NotIn, Exists  0, 1, 2
GpuModel    Equal, In, NotIn, Exists                         nvidia-tesla-p4, nvidia-tesla-k80, nvidia-tesla-v100, nvidia-tesla-p100, nvidia-tesla-t4
RetryLimit  LessThan, GreaterThan, Equal, In, NotIn, Exists  0, 1, 2

The following table shows the Operators and requirements on Values for each operator:

Operator                      Values specification
LessThan, GreaterThan, Equal  Requires only 1 value
Memory                        Requires only 1 value
In, NotIn                     Requires at least 1 value
Exists                        Requires zero values
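
The operator rules in the table can be expressed as a small validation helper. This Python sketch is only an illustration; check_value_count is a hypothetical name, and it covers the operator rows of the table (LessThan, GreaterThan, Equal, In, NotIn, Exists), not how Batch itself validates a BatchJobConstraint.

```python
def check_value_count(operator, values):
    """Return True if the number of values is valid for the given operator."""
    count = len(values)
    if operator in ("LessThan", "GreaterThan", "Equal"):
        return count == 1  # exactly one value
    if operator in ("In", "NotIn"):
        return count >= 1  # at least one value
    if operator == "Exists":
        return count == 0  # no values
    raise ValueError(f"unknown operator: {operator}")
```

For instance, the constraint in the example above (WallTime, LessThan, ["24h"]) passes, while an Exists constraint that lists any value does not.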

BatchBudgets

A BatchQueue can have an optional batchBudgetName that limits how many resources a BatchQueue can use in a given window. Budgets are useful approximations to shape and prevent excessive spending on CPU, GPUs and memory. BatchBudgets also allow you to limit the impact of a high-priority BatchQueue consuming all available resources.

BatchBudgets cannot account for resources and APIs outside your cluster, such as network traffic, object storage, or third-party APIs. They are also not a representation of your actual bill. To use a budget, you define a cost model (or price sheet) that gives an hourly cost for each resource. As jobs from BatchQueues use those resources, their cost is debited from the budget's total amount.

The following default_budget.yaml shows an example budget:

apiVersion: kbatch.k8s.io/v1beta1
kind: BatchBudget
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: default
spec:
  batchCostModelName: "default"
  # Two durations are currently supported: 24h and 720h (30d). These may be combined, by specifying
  # two budget windows, to enforce both a daily and monthly limit.
  budgetWindows:
    - duration: "24h"
      amount: 100

This example limits the BatchQueue to a spend of 100 units per day. The cost of each resource is defined in the cost model shown below. Once the budget has been exceeded, new and queued jobs are not allowed to run until the budget window refreshes. You can also increase the budget window amount to allow queued jobs to run before the window refreshes.

BatchBudgets use a reservation model. For example, if 1 hour of CPU costs 1 unit and a job has a time limit of 10 hours, the scheduler only allows the job to run if the budget has at least 10 units left for this time period. When the job is marked ready, 10 units of budget are reserved. If the job then runs for only 5 hours and completes, 5 units are refunded to the budget. Using a reservation model prevents jobs from running out of budget mid-computation.
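
The reserve-then-refund arithmetic can be sketched in a few lines of Python. The BudgetWindowLedger class and its method names are hypothetical and exist only for illustration; the sketch assumes a single budget window and a fixed hourly cost per job.

```python
class BudgetWindowLedger:
    """Hypothetical in-memory ledger for one budget window."""

    def __init__(self, amount):
        self.remaining = amount
        self.reserved = {}  # job name -> (hourly cost, reserved amount)

    def try_reserve(self, job, hourly_cost, max_hours):
        """Reserve the worst-case cost; return False if the budget cannot cover it."""
        needed = hourly_cost * max_hours
        if needed > self.remaining:
            return False
        self.remaining -= needed
        self.reserved[job] = (hourly_cost, needed)
        return True

    def complete(self, job, hours_run):
        """Refund the unused part of the reservation and return the refund."""
        hourly_cost, reserved = self.reserved.pop(job)
        refund = max(reserved - hourly_cost * hours_run, 0)
        self.remaining += refund
        return refund
```

With a budget of 100 units, reserving a 10-hour job at 1 unit per hour leaves 90 units; completing it after 5 hours refunds 5 units and leaves 95.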

The following default_costmodel.yaml shows an example costModel:

apiVersion: kbatch.k8s.io/v1beta1
kind: BatchCostModel
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: default
spec:
  resources:
    # These costs are hourly and in USD. For example, based on the numbers below, using 1 CPU,
    # 1 GB of RAM, and 1 Tesla T4 GPU for 2 hours would cost 2 * ($0.031611 + $0.004237 + $0.95).
    # The values are taken from https://cloud.google.com/compute/pricing for
    # region us-west1. The values are accurate as of April 2019. Actual pricing depends on various factors,
    # such as long-term contracts, use of standard machine types, the region, etc.
    cpu: 0.031611  # This is the on-demand price for a single vCPU
    memory: 0.004237  # This is the on-demand price for 1GB of memory
    nvidia.com/gpu: 0.95  # This is the cost for use of the Tesla T4 GPU. Currently the system does not support breaking out prices for specific GPU types, so an average price for the intended GPUs should be used here.
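
The worked example in the comments above can be reproduced with a short Python sketch. The job_cost helper is a hypothetical name, and the sketch assumes that requested quantities are already expressed in the units of the price sheet (vCPUs, GB of memory, and GPUs).

```python
# Hourly prices copied from the example cost model above.
HOURLY_COST = {
    "cpu": 0.031611,
    "memory": 0.004237,
    "nvidia.com/gpu": 0.95,
}


def job_cost(requests, hours, prices=HOURLY_COST):
    """Return the cost of holding the requested resources for the given hours."""
    per_hour = sum(prices[name] * quantity for name, quantity in requests.items())
    return per_hour * hours


# 1 CPU, 1 GB of RAM, and 1 GPU for 2 hours.
cost = job_cost({"cpu": 1, "memory": 1, "nvidia.com/gpu": 1}, hours=2)
print(f"{cost:.6f}")  # prints 1.971696
```

Because the cost model holds a single GPU price, the same rate applies to every GPU type in this calculation.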

Deleting resources

Some resource types depend on other resource types. For example, BatchPriority is used by BatchQueue. Batch makes a best effort to prevent the deletion of a resource that other resources depend on. The first line of defense is the admission controller, which rejects deletions of in-use dependencies.

Due to the nature of Kubernetes, it is not always possible to detect dependencies. In particular, a race condition can exist between the deletion of a dependency and a new use of that dependency. Batch has a second mechanism to handle such cases. If this occurs, the dependency is deleted once Batch detects that it is no longer in use.

User and access management

Batch differentiates between two kinds of users: Admins, who set up, manage, and administer the cluster, and Practitioners, who submit BatchJobs to the system.

Batch abides by the principle of least privilege: it grants Practitioners, and the VMs that run their jobs, the minimal scope and access required. Admins can grant more access to users, but should do so following a well-defined security policy.

Batch on GKE's user management consists of a set of Kubernetes RBACs and a CRD BatchUserContext for each user. Batch provides a script to set up these configurations in one step. For more information, see Managing Batch on GKE users.

By default, Practitioners have no access to Batch on GKE resources. Batch provides tools for user management that support the following use cases:

  • A user can submit batch jobs and access the job results without accidentally interacting with other users' jobs.
  • Batch on GKE users can have their storage data access separated at the PersistentVolumeClaim level and, as an additional level of protection, at the Unix file access control level.
  • System admins can set up Kubernetes security policies and Unix file access policies for users. Batch attaches and validates these policies when a user submits a job.

Batch on GKE user role-based access control

The following table describes the set of RBACs for a Batch on GKE user:

Resources                    Scopes                                                          Verbs
BatchJobs                    All in a namespace                                              Get, List, Watch, Create
Pods                         All in a namespace                                              List
BatchQueue                   All the batch queues in the namespace                           Get, List
BatchPriority                All in cluster                                                  Get, List
BatchBudgets/BatchCostModel  All in cluster                                                  Get, List
PersistentVolumeClaim        Only the ones allowed to be used by this user in the namespace  Get
Secrets                      Only the ones allowed to be used by this user in the namespace  Get

Once a job has been submitted, the creator can terminate the job. Cluster admins have permission to delete BatchJobs. When a BatchJob is deleted, the dynamically granted RBACs associated with the job are deleted.

Limitations during Beta

  • The Beta release of Batch only supports single node (VM) jobs. Multi-node, gang scheduled or MPI jobs are not currently supported.

  • When placing Pods on VMs, GKE accounts for the overhead of its own on-VM agents. For example, it considers a VM with 4 vCPUs as having a smaller number of vCPUs available for work (typically something like 3.8), so it will not place a job that needs 4 vCPUs on that VM.

  • The cost model supports only one GPU type.

  • The username, [NAME]@[PROJECT_NAME].iam.gserviceaccount.com, is limited to 63 characters. That means that both [NAME] and [PROJECT_NAME] are limited to 37 characters.

  • The runAsGroup field in PodSecurityPolicy is not supported on GKE.

  • Batch supports NFS volumes only through PersistentVolumeClaims.

  • The maximum number of concurrent job submissions is 10,000.

Reference

BatchPriority v1beta1 kbatch.k8s.io

BatchPriority is the Schema for the batchpriorities API.

Fields
metadata

Kubernetes meta/v1.ObjectMeta

Refer to the Kubernetes API documentation for the fields of the metadata field.

spec

BatchPrioritySpec

status

BatchPriorityStatus

BatchPrioritySpec

Fields
Value

int32

Description

String

BatchJobConstraint v1beta1 kbatch.k8s.io

BatchJobConstraint is the Schema for the batchjobconstraints API.

Fields
metadata

Kubernetes meta/v1.ObjectMeta

Refer to the Kubernetes API documentation for the fields of the metadata field.

spec

BatchJobConstraintSpec

status

BatchJobConstraintStatus

BatchCostModel v1beta1 kbatch.k8s.io

BatchCostModel is the Schema for the batchcostmodels API

Fields
metadata

Kubernetes meta/v1.ObjectMeta

Refer to the Kubernetes API documentation for the fields of the metadata field.

spec

BatchCostModelSpec

status

BatchCostModelStatus

BatchCostModelSpec v1beta1 kbatch.k8s.io

BatchCostModelSpec allows defining the cost of resources.

Fields
resources

ResourceCostList

Refer to the Kubernetes API documentation for the fields of the metadata field.

ResourceName v1beta1 kbatch.k8s.io

ResourceName is the type of resource.

Fields
ResourceCPU

represents the CPU resource.

ResourceMemory

represents the Memory resource.

ResourceGPU

represents the GPU resource.

BatchBudget v1beta1 kbatch.k8s.io

BatchBudget is the Schema for the batchbudgets API.

Fields
metadata

Kubernetes meta/v1.ObjectMeta

Refer to the Kubernetes API documentation for the fields of the metadata field.

spec

BatchBudgetSpec

status

BatchBudgetStatus

Defines the observed state of BatchBudget.

BatchBudgetSpec v1beta1 kbatch.k8s.io

BatchBudgetSpec defines the desired state of BatchBudget.

Fields
batchCostModelName

String

Name of the BatchCostModel used by this budget.

budgetStartTime

Kubernetes meta/v1.Time

Start Time of the first window in the budget. This is populated as budget creation time if not set.

budgetWindows

[]BudgetWindow

An array of windows each defining a budget amount limit for that window. BudgetWindows[i].Duration must be unique within BudgetWindows.
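
The uniqueness rule for budgetWindows can be checked with a small Python sketch. The validate_budget_windows helper is hypothetical, and the set of supported durations is an assumption taken from the comment in the default_budget.yaml example (24h and 720h), not from this reference.

```python
# Assumption: the two durations named in the default_budget.yaml example.
SUPPORTED_DURATIONS = {"24h", "720h"}


def validate_budget_windows(windows):
    """Return a list of problems found in a budgetWindows list (empty if none)."""
    problems = []
    seen = set()
    for index, window in enumerate(windows):
        duration = window["duration"]
        if duration not in SUPPORTED_DURATIONS:
            problems.append(f"budgetWindows[{index}]: unsupported duration {duration!r}")
        if duration in seen:
            problems.append(f"budgetWindows[{index}]: duplicate duration {duration!r}")
        seen.add(duration)
    return problems
```

A budget with one 24h window and one 720h window yields no problems, while two windows that both use 24h produce a duplicate-duration message.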

BatchBudgetStatus v1beta1 kbatch.k8s.io

BatchBudgetStatus defines the observed state of BatchBudget

Fields
LastUpdated

String

Time at which this status was last updated.

WindowStatusList

[]BudgetWindowStatus

(Optional)

WindowStatusList contains one entry for each window to convey detailed status. WindowStatusList[i].Start <= time.Now() must hold.

BudgetWindow v1beta1 kbatch.k8s.io

BudgetWindow defines the budget amount limit within that window.

Fields
duration

Kubernetes meta/v1.Duration

Duration conveys the window size. At this time only days are supported, so values use only the suffix "d" for days. Support for other duration units, such as hours or weeks, will be added in the future.

amount

[]BudgetWindowStatus

(Optional)

Amount is the limit for budget within this window.

BudgetWindowStatus v1beta1 kbatch.k8s.io

BudgetWindowStatus contains budget consumption information for a window including both current window and past windows.

Fields
duration

Kubernetes meta/v1.Duration

Duration conveys the window size. At this time only days are supported, so values use only the suffix "d" for days. Support for other duration units, such as hours or weeks, will be added in the future.

startTime

Kubernetes meta/v1.Time

Start time of the current window. Once the window ends it is added to the History and a new current window is created.

amount

float64

Amount of budget used within this window.

windowHistory

[]BudgetWindowHistoryRecord

For each complete window of this duration this list tracks the amount of budget used in that window. For Alpha we keep 90 days of history for daily window and 3 records of history for 30-day window.

BudgetWindowHistoryRecord v1beta1 kbatch.k8s.io

BudgetWindowHistoryRecord tracks the amount of budget used within that window.

Fields
windowStartTime

Kubernetes meta/v1.Time

Start time of the window.

windowEndTime

Kubernetes meta/v1.Time

End time of the window.

amount

float64

Amount of budget used within this window.

BatchQueue v1beta1 kbatch.k8s.io

BatchQueue is the Schema for the batchqueues API.

Fields
metadata

Kubernetes meta/v1.ObjectMeta

Refer to the Kubernetes API documentation for the fields of the metadata field.

spec

BatchQueueSpec

status

BatchQueueStatus

BatchQueueSpec v1beta1 kbatch.k8s.io

BatchQueueSpec defines the desired state of BatchQueue.

Fields
BatchPriorityName

String

The name of the BatchPriority resource associated with this BatchQueue. Jobs in this BatchQueue will be prioritized for scheduling based on the BatchPriority set here.

PauseAdmission

bool

(Optional)

If true, the BatchQueue will not accept new jobs. This does not affect scheduling of jobs.

PauseScheduling

bool

(Optional)

If true, the BatchQueue will not schedule jobs. This does not affect admission of jobs.

BatchBudgetName

String

The name of the BatchBudget resource associated with this BatchQueue.

ConstraintNames

[]string

Holds names of BatchJobConstraints to be applied to the queue.

BatchJob v1beta1 kbatch.k8s.io

BatchJob is the Schema for the batchjobs API

Fields
metadata

Kubernetes meta/v1.ObjectMeta

Refer to the Kubernetes API documentation for the fields of the metadata field.

spec

BatchJobSpec

BatchJobSpec defines the desired state of BatchJob

status

BatchJobStatus

BatchJobStatus defines the observed state of BatchJob

Job status. This field has to be a pointer, optional, and omitempty.

BatchJobSpec v1beta1 kbatch.k8s.io

Fields
TaskGroups

[]BatchTaskGroupSpec

Task group spec.

BatchQueueName

string

The name of the BatchQueue under which this BatchJob will run.

Dependencies

[]BatchJobDependency

(Optional)

Dependencies defines multiple BatchJob dependencies. The dependencies are combined with a logical OR. For example, given Dependencies[0] and Dependencies[1], the combined condition is Dependencies[0] || Dependencies[1]. Only one dependency is currently supported.

RetryLimit

int32

(Optional)

RetryLimit defines the per task retry limit. The task execution times will be RetryLimit + 1. If not specified, default is 0.

UserCommand

string

(Optional)

UserCommand of BatchJob. Default is an empty string. If this field is empty, there is no user command from the end user. If it is set to "Terminate" while the BatchJob is running, the BatchJob is terminated.

BatchJobStatus v1beta1 kbatch.k8s.io

BatchJobStatus defines the observed state of BatchJob.

Fields
Phase

BatchJobPhase

(Optional)

The current phase of this job.

BatchPriority

BatchPriority

(Optional)

A copy of the BatchPriority resource associated with this BatchJob (based on the batchPriorityName declared in the BatchQueue associated with this BatchJob). We require this to be a pointer, optional, and omitempty; otherwise, the webhook mutator will not function properly.

BudgetConsumed

float64

(Optional)

Aggregated budget consumed for the job, including all its tasks.

Conditions

[]BatchJobCondition

(Optional)

The latest available observations of the batchjob's current state.

SubmittedBy

string

(Optional)

The user who submitted the job.

GroupStatus

string

(Optional)

Status per task group.
