Serverless for Apache Spark pricing

Note: Dataproc Serverless is now Google Cloud Serverless for Apache Spark. Until updated, some documents will refer to the previous name.

Google Cloud Serverless for Apache Spark offers two tiers, Standard and Premium, so that you can match performance and feature requirements to cost. Compare the two offerings.

Serverless for Apache Spark pricing is based on the number of Data Compute Units (DCUs), the number of accelerators used, and the amount of shuffle storage used. DCUs, accelerators, and shuffle storage are billed per second, with a 1-minute minimum charge for DCUs and shuffle storage, and a 5-minute minimum charge for accelerators.
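
The minimum charges described above can be sketched as a small helper. This is an illustration only, not the billing implementation: the function name and dictionary keys are hypothetical, and it assumes the minimum simply acts as a floor on the billed duration.

```python
# Minimum billed durations in seconds, taken from the text above.
# The keys are hypothetical labels chosen for this sketch.
MIN_BILLED_SECONDS = {
    "dcu": 60,               # 1-minute minimum
    "shuffle_storage": 60,   # 1-minute minimum
    "accelerator": 300,      # 5-minute minimum
}


def billable_seconds(resource: str, runtime_seconds: float) -> float:
    """Return the duration billed for a resource, assuming the minimum is a floor."""
    return max(runtime_seconds, MIN_BILLED_SECONDS[resource])


print(billable_seconds("dcu", 45))          # 60
print(billable_seconds("accelerator", 45))  # 300
print(billable_seconds("dcu", 3600))        # 3600
```

Under this assumption, a 45-second workload with a GPU is billed for 60 seconds of DCU usage and 300 seconds of accelerator usage.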

Each Dataproc vCPU counts as 0.6 DCU. RAM is charged differently below and above 8 GB per vCPU: each gigabyte of RAM up to 8 GB per vCPU counts as 0.1 DCU, and each gigabyte of RAM above 8 GB per vCPU counts as 0.2 DCU. Memory used by Spark drivers and executors, as well as system memory usage, counts towards DCU usage.

By default, each Serverless for Apache Spark batch and interactive workload consumes a minimum of 12 DCUs for the duration of the workload: the driver uses 4 vCPUs and 16GB of RAM and consumes 4 DCUs, and each of the 2 executors uses 4 vCPUs and 16GB of RAM and consumes 4 DCUs. You can customize the number of vCPUs and the amount of memory per vCPU by setting Spark properties. No additional Compute Engine VM or Persistent Disk charges apply.
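
The DCU arithmetic can be checked with a short sketch. The function name is hypothetical, and the handling of the 8 GB threshold (the first 8 GB per vCPU at 0.1 DCU, the remainder at 0.2 DCU) is one reading of the rule stated above, not a confirmed billing formula.

```python
def dcus(vcpus: int, ram_gb: float) -> float:
    """Estimate DCUs for one driver or executor (hypothetical helper)."""
    ram_at_low_rate = min(ram_gb, 8 * vcpus)        # RAM up to 8 GB per vCPU
    ram_at_high_rate = max(ram_gb - 8 * vcpus, 0)   # RAM above 8 GB per vCPU
    return vcpus * 0.6 + ram_at_low_rate * 0.1 + ram_at_high_rate * 0.2


# Default shape from the text: 4 vCPUs and 16 GB of RAM per node.
driver = dcus(4, 16)
default_total = driver + 2 * dcus(4, 16)  # one driver plus two executors
print(round(driver, 2), round(default_total, 2))  # 4.0 12.0
```

With the defaults, each node works out to 2.4 DCUs for vCPUs plus 1.6 DCUs for RAM, which matches the 4 DCUs per node and 12 DCUs per workload stated above.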

Data Compute Unit (DCU) pricing

The DCU rate shown below is an hourly / monthly rate. It is prorated and billed per second, with a 1-minute minimum charge.

Type	Default* (USD)	BigQuery CUD - 1 Year* (USD)	BigQuery CUD - 3 Year* (USD)
Data Compute Unit (Standard)	$0.06 / 1 hour	$0.054 / 1 hour	$0.048 / 1 hour
Data Compute Unit (Premium)	$0.089 / 1 hour	$0.0801 / 1 hour	$0.0712 / 1 hour

* Each consumption model has a unique ID. You must opt in to qualify for consumption model discounts.

If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

Google Cloud Serverless for Apache Spark interactive workloads are charged at the Premium rate.

Shuffle storage pricing

The shuffle storage rate shown below is an hourly / monthly rate. It is prorated and billed per second, with a 1-minute minimum charge for standard shuffle storage and a 5-minute minimum charge for Premium shuffle storage. Premium shuffle storage can only be used with Premium Compute Unit.

Type	Price (USD)
Shuffle Storage (Standard)	$0.000054795 / 1 gibibyte hour
Shuffle Storage (Premium)	$0.000136986 / 1 gibibyte hour

If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

Accelerator pricing

The accelerator rate shown below is an hourly / monthly rate. It is prorated and billed per second, with a 5-minute minimum charge.

Type	Price (USD)
A100 40GB	$3.5206896 / 1 hour
A100 80GB	$4.713696 / 1 hour
L4	$0.672048287 / 1 hour

If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

Pricing example

If the Serverless for Apache Spark batch workload runs with 12 DCUs (spark.driver.cores=4,spark.executor.cores=4,spark.executor.instances=2) for 24 hours in the us-central1 region and consumes 25GB of shuffle storage, the price calculation is as follows.

Notes:

The example assumes a 30-day month. Since the batch workload duration is one day, the monthly shuffle storage rate is divided by 30.
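
As a rough check of the Standard-tier batch example, the following sketch multiplies usage by the hourly rates listed in the tables above. It is an estimate under stated assumptions: it treats the 25GB of shuffle storage as 25 gibibytes held for the full 24 hours, and it uses the hourly shuffle rate rather than a monthly rate divided by 30, so the result can differ slightly from the official calculation. The function and constant names are hypothetical.

```python
# Hourly rates copied from the pricing tables above (USD).
DCU_STANDARD_PER_HOUR = 0.06
SHUFFLE_STANDARD_PER_GIB_HOUR = 0.000054795


def standard_batch_cost(dcus: float, hours: float, shuffle_gib: float) -> dict:
    """Estimate the cost of a Standard-tier batch workload (hypothetical helper)."""
    dcu_cost = dcus * hours * DCU_STANDARD_PER_HOUR
    # Assumption: the shuffle storage is held for the whole run.
    shuffle_cost = shuffle_gib * hours * SHUFFLE_STANDARD_PER_GIB_HOUR
    return {"dcu": dcu_cost, "shuffle": shuffle_cost, "total": dcu_cost + shuffle_cost}


cost = standard_batch_cost(dcus=12, hours=24, shuffle_gib=25)
print({name: round(value, 2) for name, value in cost.items()})
# {'dcu': 17.28, 'shuffle': 0.03, 'total': 17.31}
```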

If the Serverless for Apache Spark batch workload runs with 12 DCUs and 2 L4 GPUs (spark.driver.cores=4,spark.executor.cores=4, spark.executor.instances=2,spark.dataproc.driver.compute.tier=premium, spark.dataproc.executor.compute.tier=premium, spark.dataproc.executor.disk.tier=premium, spark.dataproc.executor.resource.accelerator.type=l4) for 24 hours in the us-central1 region and consumes 25GB of shuffle storage, the price calculation is as follows.

Notes:

The example assumes a 30-day month. Since the batch workload duration is one day, the monthly shuffle storage rate is divided by 30.
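
For the Premium-tier example with two L4 GPUs, a similar sketch adds an accelerator line item. The same caveats apply: it uses the hourly rates from the tables above, treats 25GB as 25 gibibytes held for the full 24 hours, and assumes both GPUs are attached for the entire run. The names used here are hypothetical.

```python
# Hourly rates copied from the pricing tables above (USD).
RATES_PER_HOUR = {
    "dcu_premium": 0.089,
    "shuffle_premium_gib": 0.000136986,
    "l4_gpu": 0.672048287,
}


def premium_gpu_batch_cost(dcus: float, gpus: int, hours: float, shuffle_gib: float) -> dict:
    """Estimate the cost of a Premium-tier batch workload with L4 GPUs (hypothetical helper)."""
    items = {
        "dcu": dcus * hours * RATES_PER_HOUR["dcu_premium"],
        "shuffle": shuffle_gib * hours * RATES_PER_HOUR["shuffle_premium_gib"],
        "accelerator": gpus * hours * RATES_PER_HOUR["l4_gpu"],
    }
    items["total"] = sum(items.values())
    return items


cost = premium_gpu_batch_cost(dcus=12, gpus=2, hours=24, shuffle_gib=25)
for name, value in cost.items():
    print(f"{name}: ${value:.2f}")
```

Under these assumptions the accelerators are the largest line item, followed by the Premium DCUs, while shuffle storage contributes less than a dollar.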

If the Serverless for Apache Spark interactive workload runs with 12 DCUs (spark.driver.cores=4,spark.executor.cores=4,spark.executor.instances=2) for 24 hours in the us-central1 region and consumes 25GB of shuffle storage, the price calculation is as follows:

Notes:

The example assumes a 30-day month. Since the workload duration is one day, the monthly shuffle storage rate is divided by 30.

Pricing estimation example

When a batch workload completes, Serverless for Apache Spark calculates UsageMetrics, which contain an approximation of the total DCU, accelerator, and shuffle storage resources consumed by the completed workload. After running a workload, you can run the gcloud dataproc batches describe BATCH_ID command to view workload usage metrics to help you estimate the cost of running the workload.

Example:

Serverless for Apache Spark runs a workload on an ephemeral cluster with one master and two workers. Each node consumes 4 DCUs (the default with 4 cores per node; see spark.driver.cores) and 400 GB of shuffle storage (the default is 100 GB per core; see spark.dataproc.driver.disk.size). The workload run time is 60 seconds. Each worker also has 1 GPU, for a total of 2 across the cluster.

The user runs gcloud dataproc batches describe BATCH_ID --region REGION to obtain usage metrics. The command output includes the following snippet (milliDcuSeconds: 4 DCUs x 3 VMs x 60 seconds x 1000 = 720000, milliAcceleratorSeconds: 1 GPU x 2 VMs x 60 seconds x 1000 = 120000, and shuffleStorageGbSeconds: 400GB x 3 VMs x 60 seconds = 72000):
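
The metric arithmetic in the parentheses above, and a rough conversion of those metrics into a cost estimate, can be sketched as follows. The metric names come from the text. The helper names are hypothetical, the rates are passed in by the caller, and the estimate ignores minimum charges, so it understates the bill for very short runs such as this 60-second workload.

```python
def usage_metrics(dcus_per_vm, vms, gpus_per_worker, workers, shuffle_gb_per_vm, seconds):
    """Reproduce the usage metric arithmetic from the example (hypothetical helper)."""
    return {
        "milliDcuSeconds": dcus_per_vm * vms * seconds * 1000,
        "milliAcceleratorSeconds": gpus_per_worker * workers * seconds * 1000,
        "shuffleStorageGbSeconds": shuffle_gb_per_vm * vms * seconds,
    }


def estimate_cost(metrics, dcu_rate, accelerator_rate, shuffle_rate):
    """Convert usage metrics to an estimated cost, given hourly rates in USD."""
    dcu_hours = metrics["milliDcuSeconds"] / 1000 / 3600
    accelerator_hours = metrics["milliAcceleratorSeconds"] / 1000 / 3600
    shuffle_gb_hours = metrics["shuffleStorageGbSeconds"] / 3600
    return (dcu_hours * dcu_rate
            + accelerator_hours * accelerator_rate
            + shuffle_gb_hours * shuffle_rate)


metrics = usage_metrics(dcus_per_vm=4, vms=3, gpus_per_worker=1, workers=2,
                        shuffle_gb_per_vm=400, seconds=60)
print(metrics)
# {'milliDcuSeconds': 720000, 'milliAcceleratorSeconds': 120000, 'shuffleStorageGbSeconds': 72000}

# Assumption for illustration only: Standard DCU and shuffle rates and the L4 rate
# from the pricing tables; the example does not say which tier or GPU type was used.
print(estimate_cost(metrics, dcu_rate=0.06, accelerator_rate=0.672048287,
                    shuffle_rate=0.000054795))
```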


Use of other Google Cloud resources

Your Serverless for Apache Spark workload can optionally use other Google Cloud resources, each billed at its own pricing.

What's next

Read the Serverless for Apache Spark documentation.
Get started with Serverless for Apache Spark.
Try the Pricing calculator.

Request a custom quote

With Google Cloud's pay-as-you-go pricing, you only pay for the services you use. Connect with our sales team to get a custom quote for your organization.
