Cloud Data Fusion pricing

This document explains the pricing for Cloud Data Fusion. To see the pricing for other products, read the Pricing documentation.

For pricing purposes, usage is measured as the length of time, in minutes, between the time a Cloud Data Fusion instance is created and the time it is deleted. Although the rate is defined per hour, Cloud Data Fusion is billed by the minute. Usage is expressed in hours (for example, 30 minutes is 0.5 hours) so that hourly pricing can be applied to minute-by-minute use.
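The minute-to-hour conversion can be sketched in a few lines of Python. This is an illustration only: the function name `instance_charge` is hypothetical, and how the final amount is rounded on an invoice is not stated here, so the sketch does not round.

```python
from decimal import Decimal

def instance_charge(minutes_running: int, hourly_rate_usd: Decimal) -> Decimal:
    """Charge for an instance metered by the minute and priced by the hour."""
    # Express the metered minutes in hours (30 minutes -> 0.5 hours).
    hours = Decimal(minutes_running) / Decimal(60)
    return hours * hourly_rate_usd

# 90 minutes at the Developer rate of $0.35 per hour.
print(instance_charge(90, Decimal("0.35")))  # 0.525
```

Using `Decimal` rather than `float` keeps currency arithmetic exact for rates such as 0.35.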

If you pay in a currency other than USD, the prices listed in your currency on Google Cloud SKUs apply.

Pricing overview

Cloud Data Fusion pricing is split across two functions: pipeline development and execution.

Development

For pipeline development, Cloud Data Fusion offers the following three editions:

Cloud Data Fusion Edition	Price (USD)
Developer	$0.35 / 1 hour
Basic	First 120 hours per month per account: Free. Above 120 hours: $1.80 / 1 hour
Enterprise	$4.20 / 1 hour

The Basic edition offers the first 120 hours per month per account free.
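As a rough illustration of the free allowance, the sketch below subtracts the first 120 hours from a month's Basic usage before applying the hourly rate. The names are hypothetical, and it assumes that `hours_used` is the total Basic usage for one account in one month.

```python
from decimal import Decimal

BASIC_FREE_HOURS = Decimal("120")  # free hours per month per account
BASIC_RATE_USD = Decimal("1.80")   # hourly rate above the free allowance

def basic_monthly_charge(hours_used: Decimal) -> Decimal:
    """Monthly Basic edition charge after the free hours are deducted."""
    # Usage at or below the allowance is free, so billable hours never go negative.
    billable_hours = max(Decimal("0"), hours_used - BASIC_FREE_HOURS)
    return billable_hours * BASIC_RATE_USD

print(basic_monthly_charge(Decimal("100")))  # within the free allowance
print(basic_monthly_charge(Decimal("150")))  # 30 billable hours
```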

Execution

For pipeline execution, you are charged for the Managed Service for Apache Spark clusters that Cloud Data Fusion creates to run your pipelines at the current Managed Service for Apache Spark rates.

Comparison of Developer, Basic, and Enterprise editions

Capability	Developer	Basic	Enterprise
Number of concurrent users	2	Limited^*	Limited^*
Workloads	Development, product exploration	Testing, sandbox, PoC	Production
Internal IP support	✓	✓	✓
Role-based access control (RBAC)	🚫	🚫	✓
Visual Designer	✓	✓	✓
Connector ecosystem	✓	✓	✓
Visual transformations	✓	✓	✓
Structured, unstructured, semi-structured	✓	✓	✓
Streaming pipelines	✓	✓	✓
Integration lineage - field and dataset level	✓	✓	✓
Integration with Knowledge Catalog	✓	✓	✓
High Availability	Zonal	Regional	Regional
Create and customize compute profiles	✓	✓	✓
Devops support: REST API, Source Control Management	✓	✓	✓
Triggers and schedules	✓	✓	✓
Execution environment selection	✓	✓	✓
Concurrent pipeline execution	🚫	Limited^**	Limited^**
Developer SDK for extensibility	✓	✓	✓

* Concurrent users: in general, Cloud Data Fusion supports a maximum of 50 users per instance. If RBAC is enabled, the maximum is 25 users.

** Concurrent pipeline execution is limited and based on the instance version being used. For access to scalability details, reach out to a Google Cloud representative.

Usage of other Google Cloud resources

In addition to the development cost of a Cloud Data Fusion instance, you are billed only for the resources that you use to execute your pipelines, such as Managed Service for Apache Spark clusters, Cloud Storage, and BigQuery.

★ For building replication jobs, BigQuery flat-rate pricing is recommended, not on-demand pricing.

Supported regions

Currently, pricing for Cloud Data Fusion is the same for all supported regions.

Region	Location
africa-south1^*	Johannesburg, South Africa
asia-east1	Changhua County, Taiwan
asia-east2	Hong Kong
asia-northeast1	Tokyo, Japan
asia-northeast2	Osaka, Japan
asia-northeast3	Seoul, South Korea
asia-south1	Mumbai, India
asia-south2	Delhi, India
asia-southeast1	Jurong West, Singapore
asia-southeast2	Jakarta, Indonesia
australia-southeast1	Sydney, Australia
europe-north1	Hamina, Finland
europe-southwest1	Madrid, Spain
europe-west1	St. Ghislain, Belgium
europe-west2	London, England, UK
europe-west3	Frankfurt, Germany
europe-west4	Eemshaven, Netherlands
europe-west6	Zürich, Switzerland
europe-west8	Milan, Italy
europe-west9	Paris, France
europe-west12^*	Turin, Italy
me-central1^*	Doha, Qatar
me-central2^*	Dammam, Saudi Arabia
me-west1	Tel Aviv, Israel
northamerica-northeast1	Montréal, Québec, Canada
northamerica-south1	Mexico
southamerica-east1	Osasco (São Paulo), Brazil
southamerica-west1	Santiago, Chile
us-central1	Council Bluffs, Iowa, North America
us-east1	Moncks Corner, South Carolina, North America
us-east4	Ashburn, Northern Virginia, North America
us-east5	Columbus, Ohio, North America
us-south1	Dallas, Texas, North America
us-west1	The Dalles, Oregon, North America
us-west2	Los Angeles, California, North America

* Data Lineage in Cloud Data Fusion isn't supported in africa-south1, me-central1, me-central2, or europe-west12.

Pricing example

Consider a Cloud Data Fusion instance that has been running for 24 hours, with no free hours remaining for the Basic edition. Based on the edition, the instance charge for Cloud Data Fusion is summarized in the following table:

Edition	Cost per hour	Number of hours	Development cost
Developer	$0.35	24	24 * $0.35 = $8.40
Basic	$1.80	24	24 * $1.80 = $43.20
Enterprise	$4.20	24	24 * $4.20 = $100.80
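The development costs in the table can be reproduced with a short script. This is a minimal sketch: the names `HOURLY_RATES_USD` and `development_cost` are illustrative, and the Basic rate assumes the free monthly hours are already used up, as in this example.

```python
from decimal import Decimal

# Hourly development rates in USD, taken from the edition pricing table.
HOURLY_RATES_USD = {
    "Developer": Decimal("0.35"),
    "Basic": Decimal("1.80"),
    "Enterprise": Decimal("4.20"),
}

def development_cost(edition: str, hours: Decimal) -> Decimal:
    """Instance (development) cost for the given edition and running time."""
    return HOURLY_RATES_USD[edition] * hours

for edition in HOURLY_RATES_USD:
    print(f"{edition}: ${development_cost(edition, Decimal(24)):.2f}")
# Developer: $8.40
# Basic: $43.20
# Enterprise: $100.80
```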

★ Note: Cloud Data Fusion instances, once provisioned, always need to be available. After you delete instances, they cannot be recovered and any pipeline data is lost. For estimated monthly costs, refer to the Pricing overview.

During this 24-hour period, you ran a pipeline that read raw data from Cloud Storage, performed transformations, and wrote the data to BigQuery every hour. Each run took approximately 15 minutes to complete. In other words, the Managed Service for Apache Spark clusters that were created for these runs were alive for 15 minutes (0.25 hours) each. Assume that the configuration of each Managed Service for Apache Spark cluster was the following:

Item	Machine Type	Virtual CPUs	Attached Persistent Disk	Number in cluster
Master Node	n1-standard-4	4	500 GB	1
Worker Node	n1-standard-4	4	500 GB	5

The Managed Service for Apache Spark clusters each have 24 virtual CPUs: 4 for the master and 20 spread across the workers. For Managed Service for Apache Spark billing purposes, the pricing for this cluster would be based on those 24 virtual CPUs and the length of time each cluster ran.

Across all runs of your pipeline, the total charge incurred for Managed Service for Apache Spark can be calculated as:

Managed Service for Apache Spark charge = number of vCPUs * number of clusters * hours per cluster * Managed Service for Apache Spark price per vCPU per hour

= 24 * 24 * 0.25 * $0.01

= $1.44
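The same calculation can be written as a small Python sketch. The constant and function names are hypothetical, and the $0.01 rate is the figure used in this example rather than a value looked up from current pricing.

```python
from decimal import Decimal

VCPUS_PER_NODE = 4   # n1-standard-4
MASTER_NODES = 1
WORKER_NODES = 5
PRICE_PER_VCPU_HOUR_USD = Decimal("0.01")  # rate used in this example

def spark_charge(clusters: int, hours_per_cluster: Decimal) -> Decimal:
    """Managed Service for Apache Spark charge across all pipeline runs."""
    # 4 vCPUs for the master plus 20 across the workers = 24 vCPUs per cluster.
    vcpus = VCPUS_PER_NODE * (MASTER_NODES + WORKER_NODES)
    return vcpus * clusters * hours_per_cluster * PRICE_PER_VCPU_HOUR_USD

# One 15-minute (0.25 hour) cluster per hourly run over 24 hours.
print(f"${spark_charge(24, Decimal('0.25')):.2f}")  # $1.44
```

This covers only the Managed Service for Apache Spark fee; it does not include the instance charge or the charges for other Google Cloud products.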

The Managed Service for Apache Spark clusters use other Google Cloud products, which are billed separately. Specifically, these clusters incur charges for Compute Engine and Standard Persistent Disk Provisioned Space. You also incur storage charges for Cloud Storage and BigQuery, depending on the amount of data your pipeline processes.

To determine these additional costs based on current rates, you can use the pricing calculator.

What's next

Read the Cloud Data Fusion documentation.
Get started with Cloud Data Fusion.
Try the Pricing calculator.
