In contrast to Apps (long-running processes), Tasks run for a finite amount of time and then stop. Tasks run in their own containers based on the configuration of the parent App, and can be configured to use limited resources (e.g. CPU, memory, and ephemeral disk storage).
Use cases for Tasks
- Migrating a database
- Running a batch job (scheduled/unscheduled)
- Sending an email
- Transforming data (ETL)
- Processing data (upload/backup/download)
How Tasks work
Tasks are executed asynchronously and run independently from the parent App or other Tasks running on the same App. An App created for running Tasks does not have routes created or assigned, and the Run lifecycle is skipped. The Source code upload and Build lifecycles still proceed and result in a container image used for running Tasks after pushing the App (see App lifecycles at Deploying an Application).
The lifecycle of a Task is as follows:
- You push an App for running Tasks with the `kf push APP_NAME --task` command.
- You run a Task on the App with the `kf run-task APP_NAME` command. The Task inherits the environment variables, service bindings, resource allocation, start-up command, and security groups bound to the App.
- Kf creates a Tekton PipelineRun with values from the App and parameters from the `kf run-task` command.
- The Tekton PipelineRun creates a Kubernetes Pod which launches a container based on the configuration of the App and Task.
- When Task execution stops (the Task exits or is terminated manually), the underlying Pod is stopped or terminated. Pods of stopped Tasks are preserved, so Task logs remain accessible via the `kf logs APP_NAME --task` command.
- If you terminate a Task before it stops, the Tekton PipelineRun is cancelled (see Cancelling a PipelineRun), and the underlying Pod, together with its logs, is deleted. The logs of terminated Tasks are still delivered to cluster-level logging streams if configured (e.g. Stackdriver, Fluentd).
- If the number of Tasks run on an App exceeds 500, the oldest Tasks are automatically deleted.
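The lifecycle above can be sketched end-to-end with the Kf CLI. The App name, the Task command, and the `--command` flag on `kf run-task` are illustrative assumptions, not values from this page:

```shell
# Push the App in Task-only mode: no routes are created and the Run
# lifecycle is skipped, but the source is still built into a container image.
kf push my-batch-app --task

# Launch a Task on that App; it inherits the App's environment variables,
# service bindings, and resource allocation. (--command overriding the
# start-up command is an assumption about the CLI's flag name.)
kf run-task my-batch-app --command "bundle exec rake db:migrate"

# Stream the output of the App's Tasks.
kf logs my-batch-app --task
```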
Tasks retention policy
Tasks are created as custom resources in the Kubernetes cluster, so it is important not to exhaust the space of the underlying etcd database. By default, Kf keeps only the latest 500 Tasks per App. Once the number of Tasks reaches 500, the oldest Tasks (together with their underlying Pods and logs) are automatically deleted.
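To see how many Tasks an App has accumulated against the 500-Task limit, you can list them; the App name is illustrative and the exact output format depends on your Kf version:

```shell
# List the Tasks recorded for the App; entries beyond the latest 500 are
# removed automatically along with their Pods and logs.
kf tasks my-batch-app
```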
Task logging and execution history
Any data or messages the Task outputs to STDOUT or STDERR are available via the `kf logs APP_NAME --task` command. Cluster-level logging mechanisms (such as Stackdriver or Fluentd) deliver the Task logs to the configured logging destination.
As described above, Tasks can be run asynchronously by using the `kf run-task APP_NAME` command. Alternatively, you can schedule Tasks for execution by first creating a Job with the `kf create-job` command, and then scheduling it with the `kf schedule-job JOB_NAME` command. The scheduled Job automatically runs Tasks on a specified unix-cron schedule.
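For example, a nightly report could be wired up as follows; the Job name, App name, command string, and the exact argument order of `kf create-job` are assumptions for illustration:

```shell
# Create a Job describing the Task to execute (App, Job name, command).
kf create-job my-batch-app nightly-report "bundle exec rake report:generate"

# Schedule the Job with a unix-cron expression: every day at 02:30.
kf schedule-job nightly-report "30 2 * * *"
```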
How Tasks are scheduled
Create and schedule a Job to run the Task. A Job describes the Task to execute and automatically manages Task creation.
Tasks are created on the schedule even if previous executions of the Task are still running. If any executions are missed for any reason, only the most recently missed execution is executed when the system recovers.
Deleting a Job deletes all of its associated Tasks. If any associated Tasks are still in progress, they are forcefully deleted without running to completion.
Tasks created by a scheduled Job are still subject to the Task retention policy.
Differences from PCF Scheduler
PCF Scheduler allows multiple schedules for a single Job while Kf only supports a single schedule per Job. You can replicate the PCF Scheduler behavior by creating multiple Jobs, one for each schedule.
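To mimic a PCF Scheduler Job that has several schedules, create one Kf Job per schedule; the names, command, and cron expressions below are illustrative:

```shell
# Two Jobs wrapping the same command, each with its own cron schedule.
kf create-job my-batch-app cleanup-weekday "run-cleanup"
kf schedule-job cleanup-weekday "0 6 * * 1-5"   # 06:00 on weekdays

kf create-job my-batch-app cleanup-weekend "run-cleanup"
kf schedule-job cleanup-weekend "0 12 * * 6,0"  # 12:00 on weekends
```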