Dataproc Serverless staging and temp buckets

Dataproc Serverless creates a Cloud Storage staging bucket and a Cloud Storage temp bucket in your project, or reuses existing staging and temp buckets from previous batch creation requests. These are the same default buckets that Dataproc creates for Dataproc on Compute Engine clusters (see Dataproc staging and temp buckets).

  • Staging bucket: Used to stage workload dependencies, workload output, and config files.

  • Temp bucket: Used to store ephemeral data, such as Spark event log files.

Dataproc Serverless creates regional staging and temp buckets in the Cloud Storage location that corresponds to the Compute Engine region where your workload is deployed, and then manages these project-level, per-location buckets. Staging and temp buckets created by Dataproc Serverless are shared among workloads in the same region. The temp bucket has a TTL of 90 days.
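The 90-day TTL means that an object written to the temp bucket (for example, a Spark event log file) becomes eligible for deletion 90 days after it is created. A minimal sketch of that arithmetic, using only the Python standard library (the function name is illustrative, not part of any Dataproc API):

```python
from datetime import date, timedelta

# Assumed from the text above: temp-bucket objects expire after 90 days.
TEMP_TTL_DAYS = 90

def temp_object_expiry(created: date, ttl_days: int = TEMP_TTL_DAYS) -> date:
    """Day on which a temp-bucket object becomes eligible for deletion."""
    return created + timedelta(days=ttl_days)

# A Spark event log written on 2024-01-01 is eligible for deletion on 2024-03-31.
print(temp_object_expiry(date(2024, 1, 1)))  # 2024-03-31
```

Because staging and temp buckets are shared per project and per region, do not treat temp-bucket contents as durable storage; anything you need beyond the TTL window should be copied to a bucket you manage.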