Cluster Scheduled Deletion

To help avoid incurring Google Cloud charges for an inactive cluster, use Dataproc's Cluster Scheduled Deletion feature when you create a cluster. This feature provides options to delete a cluster:

  • after a specified cluster idle period
  • at a specified future time
  • after a specified period that starts from the time of submission of the cluster creation request

Cluster idle time calculation

The dataproc:dataproc.cluster-ttl.consider-yarn-activity cluster property affects the calculation of cluster idle time, as follows:

  • This property is enabled (set to true) by default.
  • When this property is enabled, the cluster idle time starts, and continues to accumulate, only while both YARN and the Dataproc Jobs API are idle.
    • YARN activity includes pending and running YARN applications.
    • Dataproc Jobs API activity includes pending and running jobs submitted to the Dataproc Jobs API.
  • When this property is set to false, only the Dataproc Jobs API must be idle for the cluster idle time to start and continue to accumulate.
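
The idle rule above can be sketched as a small predicate. This is an illustration only, not Dataproc's implementation; the function name and the two activity counters are hypothetical stand-ins for "pending and running YARN applications" and "pending and running Dataproc jobs".

```python
def cluster_is_idle(yarn_apps_active: int,
                    jobs_api_active: int,
                    consider_yarn_activity: bool = True) -> bool:
    """Return True when the cluster counts as idle under the rule above.

    yarn_apps_active: pending + running YARN applications (hypothetical input).
    jobs_api_active:  pending + running Dataproc Jobs API jobs (hypothetical input).
    consider_yarn_activity: mirrors the value of
        dataproc:dataproc.cluster-ttl.consider-yarn-activity (true by default).
    """
    jobs_idle = jobs_api_active == 0
    if not consider_yarn_activity:
        # Property set to false: only Dataproc Jobs API activity matters.
        return jobs_idle
    # Property enabled: both YARN and the Jobs API must be idle.
    return jobs_idle and yarn_apps_active == 0


# A YARN application submitted outside the Jobs API keeps the cluster
# "busy" only when the property is enabled.
print(cluster_is_idle(yarn_apps_active=1, jobs_api_active=0))
print(cluster_is_idle(yarn_apps_active=1, jobs_api_active=0,
                      consider_yarn_activity=False))
```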

Using Cluster Scheduled Deletion

gcloud command

You can create a cluster with the Cluster Scheduled Deletion feature by passing the following scheduled deletion flags to the gcloud dataproc clusters create command.

  • --max-idle
    • Description: The duration from the moment when the cluster enters the idle state to the moment when the cluster starts to delete. Provide the duration in IntegerUnit format, where the unit can be "s", "m", "h", or "d" (seconds, minutes, hours, or days, respectively). Examples: "30m" or "1d" (30 minutes or 1 day from when the cluster becomes idle).
    • Finest granularity: 1 second
    • Min value: 5 minutes
    • Max value: 14 days
  • --expiration-time
    • Description: The time to start deleting the cluster in ISO 8601 datetime format. An easy way to generate the datetime in the correct format is through the Timestamp Generator. For example, "2017-08-22T13:31:48-08:00" specifies an expiration time of 13:31:48 in the UTC -8:00 time zone.
    • Finest granularity: 1 second
    • Min value: 10 minutes from the current time
    • Max value: 14 days from the current time
  • --max-age
    • Description: The duration from the moment of submitting the cluster create request to the moment when the cluster starts to delete. Provide the duration in IntegerUnit format, where the unit can be "s", "m", "h", or "d" (seconds, minutes, hours, or days, respectively). Examples: "30m" (30 minutes from now); "1d" (1 day from now).
    • Finest granularity: 1 second
    • Min value: 10 minutes
    • Max value: 14 days

gcloud dataproc clusters create cluster-name \
    --region=region \
    --max-idle=duration \
    --expiration-time=time \
    ... other flags ...
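
Before running the command, it can help to check a duration against the IntegerUnit format and the documented limits. The sketch below is a local pre-check written for illustration; it is not how gcloud parses these flags, and the names parse_duration, validate_flag, and LIMITS are hypothetical. The limits are copied from the flag descriptions above.

```python
import re
from datetime import timedelta

# Map the documented unit letters to timedelta keyword arguments.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

# (minimum, maximum) per flag, taken from the flag descriptions above.
LIMITS = {
    "--max-idle": (timedelta(minutes=5), timedelta(days=14)),
    "--max-age": (timedelta(minutes=10), timedelta(days=14)),
}


def parse_duration(text: str) -> timedelta:
    """Parse an IntegerUnit string such as "30m" or "1d"."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"not in IntegerUnit format: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def validate_flag(flag: str, text: str) -> timedelta:
    """Return the parsed duration, or raise ValueError if it is out of range."""
    duration = parse_duration(text)
    minimum, maximum = LIMITS[flag]
    if not (minimum <= duration <= maximum):
        raise ValueError(f"{flag}={text} is outside {minimum} .. {maximum}")
    return duration


print(validate_flag("--max-idle", "30m"))
print(validate_flag("--max-age", "1d"))
```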

You can change or remove the scheduled deletion settings of a cluster that was created with the scheduled deletion feature by passing the following scheduled deletion flags to the gcloud dataproc clusters update command. Other cluster update flags cannot be combined with scheduled deletion flags.

  • --max-idle
    • Description: The duration from the moment when the cluster enters the idle state to the moment when the cluster starts to delete. Provide the duration in IntegerUnit format, where the unit can be "s", "m", "h", or "d" (seconds, minutes, hours, or days, respectively). Examples: "30m" or "1d" (30 minutes or 1 day from when the cluster becomes idle).
    • Finest granularity: 1 second
    • Min value: 5 minutes
    • Max value: 14 days
  • --no-max-idle
    • Description: Cancels cluster auto-deletion by cluster idle duration previously set by the max-idle flag.
    • Finest granularity, min value, max value: not applicable
  • --expiration-time
    • Description: The time to start deleting the cluster in ISO 8601 datetime format. An easy way to generate the datetime in the correct format is through the Timestamp Generator. For example, "2017-08-22T13:31:48-08:00" specifies an expiration time of 13:31:48 in the UTC -8:00 time zone.
    • Finest granularity: 1 second
    • Min value: 10 minutes from the current time, and the new time must not be earlier than the previously set time.
    • Max value: 14 days from the current time
  • --max-age
    • Description: The duration from the moment of submitting the cluster update request to the moment when the cluster starts to delete. Provide the duration in IntegerUnit format, where the unit can be "s", "m", "h", or "d" (seconds, minutes, hours, or days, respectively). Examples: "30m" (30 minutes from now); "1d" (1 day from now).
    • Finest granularity: 1 second
    • Min value: 10 minutes, and the updated scheduled deletion time (update time + new max-age duration) must not be earlier than the previously set cluster deletion time.
    • Max value: 14 days
  • --no-max-age
    • Description: Cancels cluster auto-deletion by maximum cluster age previously set by the max-age or expiration-time flag.
    • Finest granularity, min value, max value: not applicable

gcloud dataproc clusters update cluster-name \
    --region=region \
    --max-idle=duration \
    --no-max-age \
    ... other flags
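
The update rule for --max-age (update time + new max-age duration must not be earlier than the previously set deletion time) can be sketched as follows. This is an illustrative check under the limits stated above, not the service's validation logic; all function and constant names are hypothetical, and the timestamps are made-up sample values.

```python
from datetime import datetime, timedelta, timezone

MIN_MAX_AGE = timedelta(minutes=10)  # documented minimum for --max-age
MAX_MAX_AGE = timedelta(days=14)     # documented maximum for --max-age


def updated_deletion_time(update_time: datetime, new_max_age: timedelta) -> datetime:
    """Deletion time implied by an update: update time + new max-age duration."""
    return update_time + new_max_age


def max_age_update_allowed(update_time: datetime,
                           new_max_age: timedelta,
                           previous_deletion_time: datetime) -> bool:
    """True when the new max-age is in range and does not move deletion earlier."""
    if not (MIN_MAX_AGE <= new_max_age <= MAX_MAX_AGE):
        return False
    return updated_deletion_time(update_time, new_max_age) >= previous_deletion_time


# Sample values: deletion was previously scheduled for 19:33:48 UTC,
# and the update request is submitted at 18:00:00 UTC.
previous = datetime(2018, 11, 28, 19, 33, 48, tzinfo=timezone.utc)
now = datetime(2018, 11, 28, 18, 0, 0, tzinfo=timezone.utc)

print(max_age_update_allowed(now, timedelta(hours=2), previous))  # pushes deletion later
print(max_age_update_allowed(now, timedelta(hours=1), previous))  # would delete earlier
```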

REST API

You can create or update a cluster with the Cluster Scheduled Deletion feature by setting the following ClusterLifecycleConfig fields in your clusters.create or clusters.patch API request.

  • idleDeleteTtl
    • Description: The duration from the moment when the cluster enters the idle state to the moment when the cluster starts to delete. Provide a duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s".
    • Finest granularity: 1 second
    • Min value: 5 minutes from the time of creating or updating the cluster. When updating a cluster, the new value must be greater than the previously set value. Submit a clusters.patch request with an empty duration to cancel a previously set idleDeleteTtl value.
    • Max value: 14 days
  • autoDeleteTime
    • Description: The time to start deleting the cluster. Provide a timestamp in RFC 3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z".
    • Finest granularity: 1 second
    • Min value: 10 minutes from the current time. When updating a cluster, the new time must be later than the previously set time.
    • Max value: 14 days from the current time
  • autoDeleteTtl
    • Description: The duration from the moment of submitting the cluster create or update request to the moment when the cluster starts to delete. Provide a duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s".
    • Finest granularity: 1 second
    • Min value: 10 minutes. When updating a cluster, the new scheduled deletion time (update time + new autoDeleteTtl duration) must be later than the previously set cluster deletion time. Submit a clusters.patch request with an empty duration to cancel a previously set autoDeleteTtl value.
    • Max value: 14 days
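
As a rough sketch of how these field values might be produced, the helpers below format a duration as seconds terminated by 's' and a timestamp in RFC 3339 UTC "Zulu" format, then place them in a request body. Assumptions to note: the helper names are hypothetical, the cluster name is a placeholder, the lifecycle fields are assumed to sit under the cluster's config as lifecycleConfig, and Python's datetime carries only microsecond precision (six fractional digits, fewer than the nine the API allows).

```python
import json
from datetime import datetime, timedelta, timezone


def to_duration_string(duration: timedelta) -> str:
    """Format a non-negative timedelta as seconds terminated by 's', e.g. "1800s"."""
    whole_seconds = duration.days * 86400 + duration.seconds
    if duration.microseconds:
        # Keep only the significant fractional digits: 3.500000 -> "3.5s".
        return f"{whole_seconds}.{duration.microseconds:06d}".rstrip("0") + "s"
    return f"{whole_seconds}s"


def to_rfc3339_zulu(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC "Zulu" timestamp."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.%f") + "Z"


# Hypothetical request body; "example-cluster" is a placeholder name.
body = {
    "clusterName": "example-cluster",
    "config": {
        "lifecycleConfig": {
            "idleDeleteTtl": to_duration_string(timedelta(minutes=30)),
            "autoDeleteTime": to_rfc3339_zulu(
                datetime(2018, 11, 28, 19, 33, 48, 146000, tzinfo=timezone.utc)
            ),
        }
    },
}
print(json.dumps(body, indent=2))
```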

Console

  • Open the Dataproc Create a cluster page, then select the Customize cluster panel. Scroll down to the Scheduled deletion section, then select the options to apply to your cluster.

Viewing Scheduled Deletion cluster settings

gcloud command

You can use the gcloud dataproc clusters list command to confirm that a cluster has scheduled deletion enabled.

gcloud dataproc clusters list \
    --region=region
...
NAME         WORKER_COUNT ... SCHEDULED_DELETE
cluster-id   number       ... enabled
...

You can use the gcloud dataproc clusters describe command to check a cluster's LifecycleConfig scheduled deletion settings.

gcloud dataproc clusters describe cluster-name \
    --region=region
...
lifecycleConfig:
  autoDeleteTime: '2018-11-28T19:33:48.146Z'
  idleDeleteTtl: 1800s
  idleStartTime: '2018-11-28T18:33:48.146Z'
...

The autoDeleteTime and idleDeleteTtl values are the scheduled deletion settings that the user previously set on the cluster. Dataproc generates the idleStartTime value, which is the most recent time at which the cluster became idle. Dataproc deletes the cluster if the cluster is still idle at idleStartTime + idleDeleteTtl.
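
The idleStartTime + idleDeleteTtl arithmetic can be worked through with the sample values from the describe output. This is a minimal sketch; the helper names are hypothetical, and it assumes the timestamp and duration formats shown in that output.

```python
from datetime import datetime, timedelta

# Values copied from the sample describe output.
lifecycle_config = {
    "autoDeleteTime": "2018-11-28T19:33:48.146Z",
    "idleDeleteTtl": "1800s",
    "idleStartTime": "2018-11-28T18:33:48.146Z",
}


def parse_zulu(timestamp: str) -> datetime:
    """Parse a "Z"-suffixed UTC timestamp into a timezone-aware datetime."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def parse_seconds(duration: str) -> timedelta:
    """Parse a duration such as "1800s" or "3.5s"."""
    if not duration.endswith("s"):
        raise ValueError(f"expected a duration terminated by 's': {duration!r}")
    return timedelta(seconds=float(duration[:-1]))


def idle_deletion_time(config: dict) -> datetime:
    """Time at which the cluster is deleted if it stays idle."""
    return parse_zulu(config["idleStartTime"]) + parse_seconds(config["idleDeleteTtl"])


idle_deadline = idle_deletion_time(lifecycle_config)
print(idle_deadline.isoformat())
```

With these sample values the idle deadline is 19:03:48.146 UTC, 30 minutes before the configured autoDeleteTime.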

REST API

You can make a clusters.list request to confirm that a cluster has scheduled deletion enabled.
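
Once the response is parsed, one way to pick out clusters with scheduled deletion settings is sketched below. It assumes the response carries a "clusters" array whose entries hold the settings under config.lifecycleConfig; the function name and the sample response are hypothetical and only mimic that shape.

```python
SCHEDULED_DELETION_FIELDS = ("idleDeleteTtl", "autoDeleteTime", "autoDeleteTtl")


def clusters_with_scheduled_deletion(list_response: dict) -> dict:
    """Map cluster name -> scheduled deletion settings found in the response."""
    result = {}
    for cluster in list_response.get("clusters", []):
        lifecycle = cluster.get("config", {}).get("lifecycleConfig", {})
        settings = {
            field: lifecycle[field]
            for field in SCHEDULED_DELETION_FIELDS
            if field in lifecycle
        }
        if settings:
            result[cluster["clusterName"]] = settings
    return result


# Hypothetical, hand-written response used only to exercise the function.
sample_response = {
    "clusters": [
        {
            "clusterName": "cluster-a",
            "config": {"lifecycleConfig": {"idleDeleteTtl": "1800s"}},
        },
        {"clusterName": "cluster-b", "config": {}},
    ]
}
print(clusters_with_scheduled_deletion(sample_response))
```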

Console

You can view the cluster's scheduled deletion settings by selecting the cluster name from the Dataproc Clusters page in the Google Cloud console. From the cluster details page, select the CONFIGURATION tab. Scroll down the cluster configuration list to view the scheduled deletion settings.