To help avoid incurring Google Cloud charges for an inactive cluster, use Dataproc's Cluster Scheduled Deletion feature when you create a cluster. This feature provides options to delete a cluster:
- after a specified cluster idle period
- at a specified future time
- after a specified period that starts from the time of submission of the cluster creation request
Cluster idle time calculation
The dataproc:dataproc.cluster-ttl.consider-yarn-activity cluster property affects the calculation of cluster idle time, as follows:
- This property is enabled (set to true) by default.
- When this property is enabled, both YARN and Dataproc Jobs API activity must be idle to start and continue incrementing the cluster idle time calculation.
  - YARN activity includes pending and running YARN applications.
  - Dataproc Jobs API activity includes pending and running jobs submitted to the Dataproc Jobs API.
- When this property is set to false, only Dataproc Jobs API activity must be idle to start and continue incrementing the cluster idle time calculation.
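The rule above can be modeled as a small predicate. This is only an illustrative sketch of the documented behavior, not Dataproc's implementation; the function name `idle_clock_runs` and its parameters are hypothetical.

```python
def idle_clock_runs(yarn_active: bool, jobs_api_active: bool,
                    consider_yarn_activity: bool = True) -> bool:
    """Return True when the cluster idle time may start or keep incrementing.

    yarn_active: pending or running YARN applications exist.
    jobs_api_active: pending or running Dataproc Jobs API jobs exist.
    consider_yarn_activity: models the
        dataproc:dataproc.cluster-ttl.consider-yarn-activity property.
    """
    if consider_yarn_activity:
        # Property enabled (the default): both sources must be idle.
        return not yarn_active and not jobs_api_active
    # Property set to false: only Jobs API activity is considered.
    return not jobs_api_active
```

For example, a cluster with a running YARN application but no Jobs API jobs counts as idle only when the property is set to false.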
Using Cluster Scheduled Deletion
gcloud command
You can create a cluster with the Cluster Scheduled Deletion feature by passing the following scheduled deletion flags to the gcloud dataproc clusters create command.
Flag | Description | Finest Granularity | Min Value | Max Value |
---|---|---|---|---|
--max-idle 1 | The duration from the moment when the cluster enters the idle state to the moment when the cluster starts to delete. Provide the duration in IntegerUnit format, where the unit can be "s, m, h, d" (seconds, minutes, hours, days, respectively). Examples: "30m" or "1d" (30 minutes or 1 day from when the cluster becomes idle). | 1 second | 5 minutes | 14 days |
--expiration-time 2 | The time to start deleting the cluster in ISO 8601 datetime format. An easy way to generate the datetime in the correct format is through the Timestamp Generator. For example, "2017-08-22T13:31:48-08:00" specifies an expiration time of 13:31:48 in the UTC -8:00 time zone. | 1 second | 10 minutes from the current time | 14 days from the current time |
--max-age 2 | The duration from the moment of submitting the cluster create request to the moment when the cluster starts to delete. Provide the duration in IntegerUnit format, where the unit can be "s, m, h, d" (seconds, minutes, hours, days, respectively). Examples: "30m" (30 minutes from now); "1d" (1 day from now). | 1 second | 10 minutes | 14 days |
    gcloud dataproc clusters create cluster-name \
        --region=region \
        --max-idle=duration \
        --expiration-time=time \
        ... other flags ...
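Before passing a duration to --max-idle or --max-age, it can help to check it locally against the limits in the table above. The following is a minimal sketch that assumes only the single IntegerUnit form described there (for example "30m"); the helper names `parse_duration` and `check_flag` are hypothetical and are not part of gcloud.

```python
import re
from datetime import timedelta

# Unit suffixes from the table: seconds, minutes, hours, days.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

# (minimum, maximum) values copied from the create-flags table.
LIMITS = {
    "max-idle": (timedelta(minutes=5), timedelta(days=14)),
    "max-age": (timedelta(minutes=10), timedelta(days=14)),
}


def parse_duration(text: str) -> timedelta:
    """Parse an IntegerUnit string such as '30m' or '1d'."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"not an IntegerUnit duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def check_flag(flag: str, text: str) -> timedelta:
    """Return the parsed duration, or raise ValueError if it is out of range."""
    duration = parse_duration(text)
    low, high = LIMITS[flag]
    if not low <= duration <= high:
        raise ValueError(f"--{flag}={text} is outside {low} to {high}")
    return duration


print(check_flag("max-idle", "30m"))
```

A value such as "4m" for --max-idle fails this check because it is below the 5 minute minimum.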
To change or remove the scheduled deletion settings of a cluster that was created with the scheduled deletion feature, pass the following scheduled deletion flags to the gcloud dataproc clusters update command. Other cluster update flags cannot be combined with scheduled deletion flags.
Flag | Description | Finest Granularity | Min Value | Max Value |
---|---|---|---|---|
--max-idle 1 | The duration from the moment when the cluster enters the idle state to the moment when the cluster starts to delete. Provide the duration in IntegerUnit format, where the unit can be "s, m, h, d" (seconds, minutes, hours, days, respectively). Examples: "30m" or "1d" (30 minutes or 1 day from when the cluster becomes idle). | 1 second | 5 minutes | 14 days |
--no-max-idle | Cancels cluster auto-deletion by cluster idle duration previously set by the max-idle flag | not applicable | not applicable | not applicable |
--expiration-time 2 | The time to start deleting the cluster in ISO 8601 datetime format. An easy way to generate the datetime in the correct format is through the Timestamp Generator. For example, "2017-08-22T13:31:48-08:00" specifies an expiration time of 13:31:48 in the UTC -8:00 time zone. | 1 second | 10 minutes from the current time, and the new time must not be earlier than the previously set time. | 14 days from the current time |
--max-age 2 | The duration from the moment of submitting the cluster update request to the moment when the cluster starts to delete. Provide the duration in IntegerUnit format, where the unit can be "s, m, h, d" (seconds, minutes, hours, days, respectively). Examples: "30m" (30 minutes from now); "1d" (1 day from now). | 1 second | 10 minutes, and the updated scheduled deletion time (update time + new max-age duration) must not be earlier than the previously set cluster deletion time. | 14 days |
--no-max-age | Cancels cluster auto-deletion by maximum cluster age previously set by the max-age or expiration-time flag | not applicable | not applicable | not applicable |
    gcloud dataproc clusters update cluster-name \
        --region=region \
        --max-idle=duration \
        --no-max-age \
        ... other flags
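The --max-age update constraint in the table above (update time + new max-age duration must not be earlier than the previously set deletion time) is easy to get wrong, so a local pre-check may be useful. This sketch only mirrors the rule as stated in the table; `new_deletion_time` is a hypothetical helper, and the service remains the authority on what it accepts.

```python
from datetime import datetime, timedelta, timezone

# Minimum and maximum --max-age values from the table.
MIN_MAX_AGE = timedelta(minutes=10)
MAX_MAX_AGE = timedelta(days=14)


def new_deletion_time(previous_deletion_time: datetime,
                      update_time: datetime,
                      new_max_age: timedelta) -> datetime:
    """Return the deletion time implied by a --max-age update.

    Raises ValueError if the duration is out of range or if the resulting
    deletion time is earlier than the previously set one.
    """
    if not MIN_MAX_AGE <= new_max_age <= MAX_MAX_AGE:
        raise ValueError("max-age must be between 10 minutes and 14 days")
    candidate = update_time + new_max_age
    if candidate < previous_deletion_time:
        raise ValueError("new deletion time is earlier than the previous one")
    return candidate


previous = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(new_deletion_time(previous, now, timedelta(days=2)).isoformat())
```

With these sample values, an update to a 1 hour max-age would be rejected because it moves the deletion time earlier than the one already set.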
REST API
You can create a cluster with the Cluster Scheduled Deletion feature, or update the scheduled deletion settings of an existing cluster, by setting the following ClusterLifecycleConfig fields in your cluster.create or cluster.patch API request.
Field | Description | Finest Granularity | Min Value | Max Value |
---|---|---|---|---|
idleDeleteTtl 1 | The duration from the moment when the cluster enters the idle state to the moment when the cluster starts to delete. Provide a duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s". | 1 second | 5 minutes from the time of creating or updating the cluster. When updating a cluster, the new value must be greater than the previously set value. Submit a cluster.patch request with an empty duration to cancel a previously set idleDeleteTtl value. | 14 days |
autoDeleteTime 2 | The time to start deleting the cluster. Provide a timestamp in RFC 3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z". | 1 second | 10 minutes from the current time. When updating a cluster, the new time must be later than the previously set time. | 14 days from the current time |
autoDeleteTtl 2 | The duration from the moment of submitting the cluster create or update request to the moment when the cluster starts to delete. Provide a duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s". | 1 second | 10 minutes. When updating a cluster, the new scheduled deletion time (update time + new autoDeleteTtl duration) must be later than the previously set cluster deletion time. Submit a cluster.patch request with an empty duration to cancel a previously set autoDeleteTtl value. | 14 days |
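The field formats in the table can be produced programmatically. The sketch below builds a request body fragment with the duration and timestamp encodings described above. The helper names are hypothetical, and the nesting under `config.lifecycleConfig` is an assumption about the cluster resource layout that should be checked against the REST reference before use.

```python
import json
from datetime import datetime, timedelta, timezone


def duration_field(delta: timedelta) -> str:
    """Format a timedelta as seconds terminated by 's', for example '1800s'."""
    text = f"{delta.total_seconds():.9f}".rstrip("0").rstrip(".")
    return text + "s"


def timestamp_field(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC 'Zulu' timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")


def lifecycle_config(idle_delete_ttl=None, auto_delete_time=None,
                     auto_delete_ttl=None) -> dict:
    """Build a body fragment containing only the fields that were supplied."""
    fields = {}
    if idle_delete_ttl is not None:
        fields["idleDeleteTtl"] = duration_field(idle_delete_ttl)
    if auto_delete_time is not None:
        fields["autoDeleteTime"] = timestamp_field(auto_delete_time)
    if auto_delete_ttl is not None:
        fields["autoDeleteTtl"] = duration_field(auto_delete_ttl)
    # Assumed placement of lifecycleConfig inside the cluster resource.
    return {"config": {"lifecycleConfig": fields}}


body = lifecycle_config(idle_delete_ttl=timedelta(minutes=30))
print(json.dumps(body, indent=2))
```

Python datetimes carry microsecond precision, so `timestamp_field` emits six fractional digits rather than the nine shown in the table's example.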
Console
- Open the Dataproc Create a cluster page, then select the Customize cluster panel. Scroll down to the Scheduled deletion section, then select the options to apply to your cluster.
Viewing Scheduled Deletion cluster settings
gcloud command
You can use the gcloud dataproc clusters list command to confirm that a cluster has scheduled deletion enabled.
    gcloud dataproc clusters list \
        --region=region
    ...
    NAME        WORKER_COUNT ... SCHEDULED_DELETE
    cluster-id  number       ... enabled
    ...
You can use the gcloud dataproc clusters describe command to check a cluster's LifecycleConfig scheduled deletion settings.
    gcloud dataproc clusters describe cluster-name \
        --region=region
    ...
    lifecycleConfig:
      autoDeleteTime: '2018-11-28T19:33:48.146Z'
      idleDeleteTtl: 1800s
      idleStartTime: '2018-11-28T18:33:48.146Z'
    ...
The autoDeleteTime and idleDeleteTtl are the scheduled deletion configuration values previously set by the user on the cluster. Dataproc generates the idleStartTime value, which is the latest cluster idle start time. Dataproc deletes the cluster if the cluster remains idle at idleStartTime + idleDeleteTtl.
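The idleStartTime + idleDeleteTtl calculation can be reproduced from the describe output. The following sketch uses the sample values shown above; the helper names are hypothetical, and the timestamp parser assumes the fractional-second "Z" form that appears in that output.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(text: str) -> datetime:
    """Parse a UTC timestamp such as '2018-11-28T18:33:48.146Z'."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def parse_ttl(text: str) -> timedelta:
    """Parse a duration in seconds terminated by 's', such as '1800s'."""
    if not text.endswith("s"):
        raise ValueError(f"expected a value ending in 's': {text!r}")
    return timedelta(seconds=float(text[:-1]))


def idle_delete_time(idle_start_time: str, idle_delete_ttl: str) -> datetime:
    """Return the time at which a still-idle cluster is scheduled for deletion."""
    return parse_timestamp(idle_start_time) + parse_ttl(idle_delete_ttl)


deadline = idle_delete_time("2018-11-28T18:33:48.146Z", "1800s")
print(deadline.isoformat())  # 2018-11-28T19:03:48.146000+00:00
```

In the sample output, this idle deadline (19:03:48 UTC) falls before the autoDeleteTime (19:33:48 UTC), so the idle setting would apply first if the cluster stays idle.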
REST API
You can make a clusters.list request to confirm that a cluster has scheduled deletion enabled.
Console
You can view the cluster's scheduled deletion settings by selecting the cluster name from the Dataproc Clusters page in the Google Cloud console. From the cluster details page, select the CONFIGURATION tab. Scroll down the cluster configuration list to view the scheduled deletion settings.