Smarter data integration for smarter analytics
Cloud Data Fusion is a fully managed, cloud-native data integration service that helps users efficiently build and manage ETL/ELT data pipelines. With a graphical interface and a broad open-source library of preconfigured connectors and transformations, Data Fusion shifts an organization’s focus away from code and integration to insights and action.
Code-free deployment of data pipelines
Data Fusion features a visual point-and-click interface that enables the code-free development of ETL pipelines. When combined with its broad library of data transformation blueprints, Data Fusion empowers a self-service model of data integration that removes expertise-based bottlenecks and accelerates time to insight.
An open core, delivering hybrid and multi-cloud integration
Data Fusion is built on the open-source project CDAP, and this open core ensures data pipeline portability for users. CDAP’s broad integration with on-premises and public cloud platforms gives Data Fusion users the ability to break down silos and deliver insights that were previously inaccessible.
Get more from Google’s industry-leading big data tools
Data Fusion’s native integration with Google Cloud simplifies data security and ensures your data is immediately available for analysis. Whether you’re curating a data lake with Cloud Storage and Cloud Dataproc, moving data into BigQuery for data warehousing, or transforming data to land it in a relational store like Cloud Spanner, Data Fusion’s integration makes development and iteration fast and easy.
Robust data engineering through collaboration and standardization
Data Fusion offers both preconfigured transformations from an OSS library and the ability to create an internal library of custom connections and transformations that can be validated, shared, and reused across an organization. This lays the foundation for collaborative data engineering and improves productivity. That means less waiting for data engineers and, importantly, less sweating about code quality.
Remove bottlenecks by enabling nontechnical users through a code-free graphical interface that delivers point-and-click data integration.
Collaborative data engineering
Data Fusion offers the ability to create an internal library of custom connections and transformations that can be validated, shared, and reused across an organization.
Fully managed, GCP-native architecture unlocks the scalability, reliability, security, and privacy guarantees of Google Cloud.
Integration with Cloud Identity and Access Management (IAM) and Cloud Identity-Aware Proxy (IAP) provides enterprise security and alleviates risks by ensuring compliance and data protection.
Integration metadata and lineage
Search integrated datasets by technical and business metadata. Track lineage for all integrated datasets at the dataset and field level.
REST APIs, time-based schedules, pipeline state-based triggers, logs, metrics, and monitoring dashboards make it easy to operate in mission-critical environments.
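As a rough illustration of the REST-based operation mentioned above, the sketch below builds (but does not send) a request to start a deployed batch pipeline. It assumes the CDAP lifecycle path `/v3/namespaces/<namespace>/apps/<pipeline>/workflows/DataPipelineWorkflow/start`; confirm that path against the current API reference before relying on it. The function name, endpoint, pipeline name, and token are hypothetical placeholders.

```python
import urllib.request
from urllib.parse import quote


def build_start_request(endpoint, pipeline, token, namespace="default"):
    """Build a POST request that would start a deployed batch pipeline.

    Assumption: batch pipelines run as the CDAP workflow named
    "DataPipelineWorkflow". The endpoint and token are supplied by the
    caller; nothing is sent over the network here.
    """
    url = (
        f"{endpoint.rstrip('/')}/v3/namespaces/{quote(namespace, safe='')}"
        f"/apps/{quote(pipeline, safe='')}"
        "/workflows/DataPipelineWorkflow/start"
    )
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the start call needs no payload in this sketch
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    req = build_start_request(
        "https://example.invalid/api", "my_pipeline", "ACCESS_TOKEN"
    )
    print(req.get_method(), req.full_url)
```

Passing the returned request to `urllib.request.urlopen` would issue the call. Keeping construction separate from sending makes the URL easy to inspect or log before anything runs against a production instance.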
Comprehensive integration toolkit
Built-in connectors to a variety of modern and legacy systems, code-free transformations, conditionals and pre/post processing, alerting and notifications, and error processing provide a comprehensive data integration experience.
Open source provides the flexibility and portability required to build standardized data integration solutions across hybrid and multi-cloud environments.
Industry use cases
Modern, more secure cloud data lakes
Cloud Data Fusion helps users build scalable, distributed data lakes on GCP by migrating data from siloed on-premises platforms. Customers can leverage the scale of the cloud to centralize data and drive more value out of their data as a result. The self-service capabilities of Cloud Data Fusion increase process visibility and lower the overall cost of operational support.
Unified analytics environment
Many users today want to establish a unified analytics environment across a myriad of expensive, on-premises data marts. Integrating data from all these sources using a wide range of disconnected tools and stop-gap measures creates data quality and security challenges. Cloud Data Fusion’s wide variety of connectors, visual interfaces, and abstractions centered on business logic help lower total cost of ownership (TCO), promote self-service and standardization, and reduce repetitive work.
Cloud Data Fusion can help organizations better understand their customers by breaking down data silos, including the traditional silos of online and offline profiles. A trusted, unified view of customer engagement and behavior unlocks the ability to drive a better customer experience, which leads to higher retention and higher revenue per customer.
| Edition | Price per Cloud Data Fusion instance hour | Data Fusion execution (Dataproc VMs) | Number of simultaneous pipelines supported | Number of users supported |
Data Fusion lowers the barrier to entry for big data work by providing an intuitive visual interface and pipeline abstraction. This increased accessibility, combined with a growing collection of pre-built ‘connectors’ and transformations, translates to rapid results and in many cases allows data analysts and scientists to ‘self-serve’ without needing help from those with deep cloud or software engineering expertise.
Robert Medeiros, R&D Architect, TELUS Digital