Dataflow managed I/O

The managed I/O connector is an Apache Beam transform that provides a common API for creating sources and sinks. On the backend, Dataflow treats the managed I/O connector as a service and manages runtime operations for the connector, such as applying updates and tuning performance. You can then focus on the business logic in your pipeline, rather than managing these details.

You create the managed I/O connector using Apache Beam code, just like any other I/O connector. You specify a source or sink to instantiate and pass in a set of configuration parameters. For example, the Apache Iceberg sink requires a catalog_name parameter.

The following example shows how to create the Apache Iceberg sink by passing in a map of configuration parameters:

Java

pipeline.apply(
  Managed.write(ICEBERG)
    .withConfig(ImmutableMap.<String, Object>builder()
      .put("catalog_name", "<catalog_name>")
      .put("warehouse_location", "<warehouse_location>")
      .build()));

You can also put the configuration parameters into a YAML file and provide a URL to the file:

Java

pipeline.apply(
  Managed.write(ICEBERG)
    .withConfigUrl(<config_url>));
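
The YAML file holds the same key-value pairs that you would otherwise pass as a map to withConfig. The following is a minimal sketch of such a file for the Apache Iceberg sink, assuming the same two parameters as the previous example; the placeholder values are illustrative:

```yaml
# Hypothetical configuration file referenced by withConfigUrl.
# Keys mirror the map passed to withConfig in the previous example.
catalog_name: "<catalog_name>"
warehouse_location: "<warehouse_location>"
```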

For more information, see the Managed class in the Apache Beam GitHub repository.