Process data with Cloud Data Fusion

Cloud Data Fusion provides a Dataplex Source plugin to read data from Dataplex entities (tables) residing on Cloud Storage or BigQuery assets. The Dataplex Source plugin lets you treat data in Cloud Storage assets as tables and filter the data with simple SQL queries.

Get started

To access this Preview feature, submit the access request form. You can access the following links after you have been added to the allowlist for this feature.

To set up your environment for the Dataplex Source plugin, see the Getting started guide (permission is required). The Dataplex plugin artifacts (JAR and JSON files) are available in the same shared folder as the guide.

What's next

You can build a data pipeline in Cloud Data Fusion that reads data from a Dataplex entity, processes and curates the data in Cloud Data Fusion, and pushes the data back into a Dataplex sink. For example, you can build an ETL pipeline that extracts and filters the data from a table on Cloud Storage in a Dataplex raw zone, transforms that data in Cloud Data Fusion, and then loads the data to a BigQuery table in the Dataplex curated zone.
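
The extract, filter, transform, and load stages in the example above can be sketched in plain Python. This is only an illustration of the data flow, not the Cloud Data Fusion or Dataplex plugin API: an in-memory SQLite database stands in for both zones, and the table names (raw_orders, curated_orders), columns, and sample rows are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical raw table (stand-in for a raw zone)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?, ?)",
        [(1, "  Alice ", 1250), (2, "BOB", 0), (3, "Carol", 399)],
    )
    return conn


def run_pipeline(conn):
    """Extract and filter raw rows with SQL, transform them, and load a curated table."""
    # Extract and filter: a simple SQL predicate drops zero-value orders.
    rows = conn.execute(
        "SELECT order_id, customer, amount_cents FROM raw_orders WHERE amount_cents > 0"
    ).fetchall()

    # Transform: normalize customer names and convert cents to currency units.
    curated = [
        (order_id, customer.strip().lower(), amount_cents / 100)
        for order_id, customer, amount_cents in rows
    ]

    # Load: write the cleaned rows to the curated table (stand-in for a curated zone).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS curated_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO curated_orders VALUES (?, ?, ?)", curated)
    conn.commit()
    return len(curated)


if __name__ == "__main__":
    conn = build_demo_db()
    print("rows loaded:", run_pipeline(conn))
    for row in conn.execute("SELECT * FROM curated_orders ORDER BY order_id"):
        print(row)
```

In an actual pipeline, the Dataplex Source plugin would take the place of the SELECT step and a Dataplex sink would take the place of the final INSERT. The plugin configuration itself is described in the guides referenced on this page.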

For details, see the User guide (permission is required). Sample pipeline templates are available in the same shared folder.