Cloud Data Fusion pricing

This document explains the pricing for Cloud Data Fusion. To see the pricing for other products, read the Pricing documentation.

For pricing purposes, usage is measured as the length of time, in minutes, between the time a Cloud Data Fusion instance is created and the time it is deleted. Although the rate is defined per hour, Cloud Data Fusion is billed by the minute. Usage is expressed in hours (for example, 30 minutes is 0.5 hours) so that hourly pricing can be applied to minute-by-minute use.
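
The minute-to-hour conversion described above can be sketched as follows. This is an illustrative calculation only, not the billing system's actual logic: the function names, the $2.00 hourly rate, and the rounding to whole cents are assumptions made for the example.

```python
def billable_hours(minutes_running: int) -> float:
    """Convert an instance's lifetime in minutes to fractional hours."""
    if minutes_running < 0:
        raise ValueError("minutes_running must be non-negative")
    return minutes_running / 60


def development_charge(minutes_running: int, hourly_rate: float) -> float:
    """Apply an hourly rate to minute-level usage.

    Rounding to cents is an assumption for readability; the source does not
    say how fractional cents are handled.
    """
    return round(billable_hours(minutes_running) * hourly_rate, 2)


# 30 minutes is 0.5 hours, as stated in the text.
print(billable_hours(30))

# 90 minutes at a hypothetical rate of $2.00 per hour.
print(development_charge(90, 2.00))
```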

If you pay in a currency other than USD, the prices listed in your currency on Google Cloud SKUs apply.

Pricing overview

Cloud Data Fusion pricing is split across two functions: pipeline development and execution.

Development

For pipeline development, Cloud Data Fusion offers the following three editions:

Cloud Data Fusion Edition    Price per instance per hour
Developer                    $0.35 (~$250 per month)
Basic                        $1.80 (~$1100 per month)
Enterprise                   $4.20 (~$3000 per month)

The Basic edition offers the first 120 hours per month per account free.
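
As a rough check on the monthly figures, the sketch below estimates a month of continuous use for each edition. It assumes an average month of 730 hours (8,760 hours divided by 12) and applies the 120 free hours only to the Basic edition. The source does not say how its approximate monthly figures were derived, so treat `HOURS_PER_MONTH` and the helper name `monthly_estimate` as assumptions.

```python
# Hourly rates taken from the edition table.
HOURLY_RATES = {"Developer": 0.35, "Basic": 1.80, "Enterprise": 4.20}

# Free hours per month per account; only the Basic edition has any.
FREE_HOURS = {"Basic": 120}

# Assumption: average number of hours in a month (8760 / 12).
HOURS_PER_MONTH = 730


def monthly_estimate(edition: str, hours: float = HOURS_PER_MONTH) -> float:
    """Estimate the development charge for one instance over a month."""
    billable = max(hours - FREE_HOURS.get(edition, 0), 0)
    return round(billable * HOURLY_RATES[edition], 2)


for edition in HOURLY_RATES:
    print(f"{edition}: ${monthly_estimate(edition):,.2f}")
```

Under these assumptions the estimates come out to about $255.50, $1,098, and $3,066, which is in line with the approximate monthly figures in the table.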

Execution

For pipeline execution, you are charged for the Dataproc clusters that Cloud Data Fusion creates to run your pipelines at the current Dataproc rates.

Comparison of Developer, Basic, and Enterprise editions

Capability Developer Basic Enterprise
Number of users 2 (recommended)* Unlimited Unlimited
Workloads Development, Product exploration Testing, Sandbox, PoC Production
Visual Designer
Connector ecosystem
Visual transformations
Developer SDK for extensibility
Data quality and cleansing library
Private IP support
Debugging and testing (programmatic & visual)
Join, blend, aggregate transformations
Structured, unstructured, semi-structured
Streaming pipelines
Integration metadata repository
Integration lineage - field and dataset level
High Availability Zonal Regional Regional
Create and customize compute profiles Versions 6.3 and above
Devops support - REST API
Triggers / schedules
Execution environment selection

* The Developer edition offers the full feature set of Cloud Data Fusion, but has limited reliability and scalability guarantees. If multiple people use it concurrently, performance might degrade.

Usage of other Google Cloud resources

In addition to the development cost of a Cloud Data Fusion instance, you are billed only for any resources that you use to execute your pipelines, such as:

Supported regions

Currently, pricing for Cloud Data Fusion is the same for all supported regions.

Region Location
africa-south1* Johannesburg, South Africa
asia-east1 Changhua County, Taiwan
asia-east2 Hong Kong
asia-northeast1 Tokyo, Japan
asia-northeast2 Osaka, Japan
asia-northeast3 Seoul, South Korea
asia-south1 Mumbai, India
asia-south2 Delhi, India
asia-southeast1 Jurong West, Singapore
asia-southeast2 Jakarta, Indonesia
australia-southeast1 Sydney, Australia
europe-north1 Hamina, Finland
europe-southwest1 Madrid, Spain
europe-west1 St. Ghislain, Belgium
europe-west2 London, England, UK
europe-west3 Frankfurt, Germany
europe-west4 Eemshaven, Netherlands
europe-west6 Zürich, Switzerland
europe-west8 Milan, Italy
europe-west9 Paris, France
europe-west12* Turin, Italy
me-central1* Doha, Qatar
me-central2* Dammam, Saudi Arabia
me-west1 Tel Aviv, Israel
northamerica-northeast1 Montréal, Québec, Canada
southamerica-east1 Osasco (São Paulo), Brazil
southamerica-west1 Santiago, Chile
us-central1 Council Bluffs, Iowa, North America
us-east1 Moncks Corner, South Carolina, North America
us-east4 Ashburn, Northern Virginia, North America
us-east5 Columbus, Ohio, North America
us-south1 Dallas, Texas, North America
us-west1 The Dalles, Oregon, North America
us-west2 Los Angeles, California, North America

* Data Lineage in Cloud Data Fusion isn't supported in africa-south1, me-central1, me-central2, or europe-west12.

Pricing example

Consider a Cloud Data Fusion instance that has been running for 10 hours, with no free hours remaining for the Basic edition. Depending on the edition, the development charge for Cloud Data Fusion is summarized in the following table:

Edition       Cost per hour    Number of hours    Development cost
Developer     $0.35            10                 10 * $0.35 = $3.50
Basic         $1.80            10                 10 * $1.80 = $18.00
Enterprise    $4.20            10                 10 * $4.20 = $42.00

During this 10-hour period, you ran a pipeline that read raw data from Cloud Storage, performed transformations, and wrote the data to BigQuery every hour. Each run took approximately 15 minutes to complete. In other words, the Dataproc clusters that were created for these runs were alive for 15 minutes (0.25 hours) each. Assume that the configuration of each Dataproc cluster was the following:

Item            Machine Type     Virtual CPUs    Attached Persistent Disk    Number in cluster
Master Node     n1-standard-4    4               500 GB                      1
Worker Nodes    n1-standard-4    4               500 GB                      5

The Dataproc clusters each have 24 virtual CPUs: 4 for the master and 20 spread across the workers. For Dataproc billing purposes, the pricing for this cluster would be based on those 24 virtual CPUs and the length of time each cluster ran.
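
The vCPU count above can be reproduced from the cluster configuration table. The sketch below is only an illustration: the `NodeGroup` class and `total_vcpus` function are hypothetical names introduced for this example, not part of any Google Cloud API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class NodeGroup:
    """One row of the cluster configuration table (hypothetical type)."""
    role: str
    machine_type: str
    vcpus_per_node: int
    node_count: int


def total_vcpus(groups: Iterable[NodeGroup]) -> int:
    """Sum the vCPUs across every node in the cluster."""
    return sum(g.vcpus_per_node * g.node_count for g in groups)


cluster = [
    NodeGroup("master", "n1-standard-4", vcpus_per_node=4, node_count=1),
    NodeGroup("worker", "n1-standard-4", vcpus_per_node=4, node_count=5),
]

# 4 vCPUs for the master plus 20 across the workers.
print(total_vcpus(cluster))  # 24
```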

Across all runs of your pipeline, the total charge incurred for Dataproc can be calculated as:

Dataproc charge = number of vCPUs * number of clusters * hours per cluster * Dataproc price per vCPU-hour
                = 24 * 10 * 0.25 * $0.01
                = $0.60
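
The same calculation can be written as a short script, here combined with the Basic edition development charge from the earlier table. This is a sketch using the example's own figures: the function name and the rounding to cents are assumptions, and the total excludes the Compute Engine, Persistent Disk, Cloud Storage, and BigQuery charges that are billed separately.

```python
# Dataproc rate used in this example, in USD per vCPU per hour.
DATAPROC_PRICE_PER_VCPU_HOUR = 0.01


def dataproc_charge(
    vcpus: int,
    clusters: int,
    hours_per_cluster: float,
    price_per_vcpu_hour: float = DATAPROC_PRICE_PER_VCPU_HOUR,
) -> float:
    """Dataproc fee for a set of identical, short-lived clusters."""
    return round(vcpus * clusters * hours_per_cluster * price_per_vcpu_hour, 2)


# 10 hourly runs, each on a 24-vCPU cluster that lived for 15 minutes.
dataproc = dataproc_charge(vcpus=24, clusters=10, hours_per_cluster=0.25)

# Basic edition development charge for the same 10 hours (10 * $1.80).
development = round(10 * 1.80, 2)

print(f"Dataproc charge:    ${dataproc:.2f}")
print(f"Development charge: ${development:.2f}")
print(f"Subtotal:           ${dataproc + development:.2f}")
```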

The Dataproc clusters use other Google Cloud products, which are billed separately. Specifically, these clusters incur charges for Compute Engine and Standard Persistent Disk Provisioned Space. You also incur storage charges for Cloud Storage and BigQuery, depending on the amount of data your pipeline processes.

To determine these additional costs based on current rates, you can use the billing calculator.
