Dataflow managed I/O for Apache Iceberg

Managed I/O supports the following capabilities for Apache Iceberg:

Catalogs
Read capabilities: Batch read
Write capabilities

For BigQuery tables for Apache Iceberg, use the BigQueryIO connector with the BigQuery Storage API. The table must already exist; dynamic table creation is not supported.

Requirements

Managed I/O for Apache Iceberg requires the Apache Beam SDK for Java version 2.58.0 or later.

Configuration

Managed I/O uses the following configuration parameters for Apache Iceberg:

| Read and write configuration | Data type | Description |
|---|---|---|
| table | string | The identifier of the Apache Iceberg table. Example: "db.table1". |
| catalog_name | string | The name of the catalog. Example: "local". |
| catalog_properties | map | A map of configuration properties for the Apache Iceberg catalog. The required properties depend on the catalog. For more information, see CatalogUtil in the Apache Iceberg documentation. |
| config_properties | map | An optional set of Hadoop configuration properties. For more information, see CatalogUtil in the Apache Iceberg documentation. |

| Write configuration | Data type | Description |
|---|---|---|
| triggering_frequency_seconds | integer | For streaming write pipelines, the frequency at which the sink attempts to produce snapshots, in seconds. |
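As an illustration of how these parameters fit together, the following minimal sketch assembles the configuration as plain Java maps. The class name `IcebergManagedConfig`, its helper methods, the Hadoop catalog settings, and the `gs://example-bucket/warehouse` path are hypothetical choices made for this example. The positive-value check on the triggering frequency is a sanity check of this sketch, not a documented rule. The sketch uses only `java.util` so that it runs without Apache Beam on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for this example; not part of Apache Beam.
public class IcebergManagedConfig {

    /** Builds the parameters shared by read and write configurations. */
    public static Map<String, Object> baseConfig(
            String table, String catalogName, Map<String, String> catalogProperties) {
        Map<String, Object> config = new HashMap<>();
        config.put("table", table);                 // for example "db.table1"
        config.put("catalog_name", catalogName);    // for example "local"
        config.put("catalog_properties", catalogProperties);
        return config;
    }

    /** Copies a base configuration and adds the streaming write parameter. */
    public static Map<String, Object> streamingWriteConfig(
            Map<String, Object> base, int triggeringFrequencySeconds) {
        // Assumption of this sketch: reject non-positive frequencies early.
        if (triggeringFrequencySeconds <= 0) {
            throw new IllegalArgumentException(
                "triggering_frequency_seconds must be positive");
        }
        Map<String, Object> config = new HashMap<>(base);
        config.put("triggering_frequency_seconds", triggeringFrequencySeconds);
        return config;
    }

    public static void main(String[] args) {
        // Assumed catalog: a Hadoop catalog with a placeholder warehouse path.
        Map<String, String> catalogProperties = new HashMap<>();
        catalogProperties.put("type", "hadoop");
        catalogProperties.put("warehouse", "gs://example-bucket/warehouse");

        Map<String, Object> readConfig =
            baseConfig("db.table1", "local", catalogProperties);
        Map<String, Object> writeConfig = streamingWriteConfig(readConfig, 60);

        System.out.println("read parameters: " + readConfig.size());
        System.out.println("write parameters: " + writeConfig.size());

        // With the Beam SDK on the classpath, these maps would typically be
        // passed to the managed transforms, for example:
        //   Managed.read(Managed.ICEBERG).withConfig(readConfig)
        //   Managed.write(Managed.ICEBERG).withConfig(writeConfig)
    }
}
```

Keeping the shared parameters in one map and deriving the write configuration from a copy means the read and write sides cannot drift apart on `table` or `catalog_name`. The properties that a given catalog actually requires depend on the catalog type.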

For more information and code examples, see the following topics: