Cloud Data Fusion

Fully managed, code-free data integration at any scale.

Smarter data integration for smarter analytics

Cloud Data Fusion is a fully managed, cloud-native data integration service that helps users efficiently build and manage ETL/ELT data pipelines. With a graphical interface and a broad open source library of preconfigured connectors and transformations, Cloud Data Fusion shifts an organization’s focus away from code and integration to insights and action.

Code-free deployment of data pipelines

Cloud Data Fusion features a visual point-and-click interface that enables the code-free development of ETL pipelines. When combined with its broad library of data transformation blueprints, Cloud Data Fusion empowers a self-service model of data integration that removes expertise-based bottlenecks and accelerates time to insight.

An open core, delivering hybrid and multi-cloud integration

Cloud Data Fusion is built on the open source project CDAP, and this open core ensures data pipeline portability for users. CDAP’s broad integration with on-premises and public cloud platforms gives Cloud Data Fusion users the ability to break down silos and deliver insights that were previously inaccessible.

Get more from Google’s industry-leading big data tools

Cloud Data Fusion’s native integration with Google Cloud simplifies data security and ensures your data is immediately available for analysis. Whether you’re curating a data lake with Cloud Storage and Cloud Dataproc, moving data into BigQuery for data warehousing, or transforming data to land it in a relational store like Cloud Spanner, Cloud Data Fusion’s integration makes development and iteration fast and easy.

Robust data engineering through collaboration and standardization

Cloud Data Fusion offers both preconfigured transformations from an OSS library and the ability to create an internal library of custom connections and transformations that can be validated, shared, and reused across an organization. This lays the foundation for collaborative data engineering and improves productivity. That means less waiting for data engineers and, importantly, less sweating about code quality.


Code-free self-service

Remove bottlenecks by enabling nontechnical users through a code-free graphical interface that delivers point-and-click data integration.

Collaborative data engineering

Cloud Data Fusion offers the ability to create an internal library of custom connections and transformations that can be validated, shared, and reused across an organization.


Fully managed GCP-native architecture unlocks the scalability, reliability, security, and privacy guarantees of Google Cloud.

Enterprise-grade security

Integration with Cloud Identity and Access Management (IAM) and Cloud Identity-Aware Proxy (IAP) provides enterprise security and alleviates risks by ensuring compliance and data protection.

Integration metadata and lineage

Search integrated datasets by technical and business metadata. Track lineage for all integrated datasets at the dataset and field level.

Seamless operations

REST APIs, time-based schedules, pipeline state-based triggers, logs, metrics, and monitoring dashboards make it easy to operate in mission-critical environments.
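
As a rough illustration of operating a pipeline through the REST interface, the sketch below builds (but does not send) a request that would start a batch pipeline. It assumes the instance exposes the open source CDAP lifecycle route for batch pipelines; the endpoint host, namespace, pipeline name, and token are hypothetical placeholders, and the helper name build_start_request is invented for this example.

```python
from urllib.parse import quote
from urllib.request import Request


def build_start_request(api_endpoint: str, namespace: str, pipeline: str, token: str) -> Request:
    """Build a POST request that would start a batch pipeline.

    Assumption: the path follows the CDAP lifecycle route for batch
    pipelines (workflow name DataPipelineWorkflow). Check it against
    your instance before relying on it.
    """
    url = (
        f"{api_endpoint.rstrip('/')}/v3/namespaces/{quote(namespace, safe='')}"
        f"/apps/{quote(pipeline, safe='')}/workflows/DataPipelineWorkflow/start"
    )
    return Request(url, method="POST", headers={"Authorization": f"Bearer {token}"})


# Hypothetical values for illustration only; nothing is sent over the network.
req = build_start_request(
    "https://example-instance.invalid/api", "default", "daily_sales_load", "TOKEN"
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it. In practice the token would come from your Google Cloud credentials rather than a literal string.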

Comprehensive integration toolkit

Built-in connectors to a variety of modern and legacy systems, code-free transformations, conditionals and pre/post processing, alerting and notifications, and error processing provide a comprehensive data integration experience.

Hybrid enablement

Open source provides the flexibility and portability required to build standardized data integration solutions across hybrid and multi-cloud environments.

Industry use cases

Data Fusion lowers the barrier to entry for big data work by providing an intuitive visual interface and pipeline abstraction. This increased accessibility, combined with a growing collection of pre-built ‘connectors’ and transformations, translates to rapid results and in many cases allows data analysts and scientists to ‘self-serve’ without needing help from those with deep cloud or software engineering expertise.

Robert Medeiros, R&D Architect, TELUS Digital


Pricing for the service is broken down into:

  • Cloud Data Fusion instance hours to operate the data integration interface
  • Cloud Dataproc VMs to execute the transformations prescribed by Cloud Data Fusion

Edition    | Price per Cloud Data Fusion instance hour | Number of simultaneous pipelines supported | Number of users supported
Basic      | $1.80                                     | Two                                        | Unlimited
Enterprise | $4.20                                     | Unlimited                                  | Unlimited
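
To make the instance-hour component concrete, here is a minimal sketch that multiplies the listed hourly rates by a number of running hours. It covers only the Cloud Data Fusion instance charge and leaves out Cloud Dataproc VM costs, which depend on the pipelines you run. The 730-hour month and the function name instance_cost are assumptions made for this example.

```python
from decimal import Decimal

# Hourly instance rates in USD, taken from the table above.
HOURLY_RATE_USD = {
    "Basic": Decimal("1.80"),
    "Enterprise": Decimal("4.20"),
}

# Assumption: roughly 730 hours in a month for an always-on instance.
HOURS_PER_MONTH = 730


def instance_cost(edition: str, hours: int) -> Decimal:
    """Return the instance-hour charge only (Dataproc VMs are billed separately)."""
    return HOURLY_RATE_USD[edition] * hours


for edition in HOURLY_RATE_USD:
    print(edition, instance_cost(edition, HOURS_PER_MONTH))
```

Decimal is used instead of float so the currency arithmetic stays exact.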

Take the next step

Get $300 in free credits to learn and build on Google Cloud for up to 12 months.
