Plugins overview

When you create a data pipeline in Cloud Data Fusion, you use a series of stages, known as nodes, to move and manage data as it flows from source to sink. Each node consists of a plugin, a customizable module that extends the capabilities of Cloud Data Fusion.

You can find the plugins in Cloud Data Fusion web interface by going to the Studio page. For more plugins, click Hub.

Plugins in the web interface.

Plugin types

Plugins are categorized into the following categories:

  • Sources
  • Transformations
  • Analytics
  • Sinks
  • Conditions and actions
  • Error handlers and alerts

Sources

Source plugins connect to databases, files, or real-time streams from which your pipeline reads data. You set up sources for your data pipeline using the web interface, so you don't have to worry about coding low-level connections.

Transformations

Transform plugins change data after it's ingested from a source. For example, you can clone a record, change the file format to JSON, or use the Javascript plugin to create a custom transformation.

Analytics

Analytics plugins perform aggregations, such as joining data from different sources and running analytics and machine learning operations.

Sinks

Sink plugins write data to resources, such as Cloud Storage, BigQuery, Spanner, relational databases, file systems, and mainframes. You can query the data that gets written to the sink using the Cloud Data Fusion web interface or REST API.

Conditions and actions

Use condition and action plugins to schedule actions that take place during a workflow that don't directly manipulate data in the workflow. For example:

  • Use the Database plugin to schedule a database command to run at the end of your pipeline.
  • Use the File Move plugin to trigger an action that moves files within Cloud Storage.

Error handlers and alerts

When stages encounter null values, logical errors, or other sources of errors, you can use an error handler plugin to catch errors. Use these plugins to find errors in the output after a transform or analytics plugin. You can write the errors to a database for analysis.

What's next