SDK version support status

This page lists the support status for Apache Beam and Dataflow SDK releases.

Apache Beam 2.x SDKs

Apache Beam is an open source, community-led project. Google is part of the community, but we do not own the project or control the release process. We might open bugs or submit patches to the Apache Beam codebase on behalf of Dataflow customers, but we cannot create hotfixes or official releases of Apache Beam on demand. See the Apache Beam policies page for more details about release policies.

Dataflow supports specific Apache Beam SDK components for the SDK releases listed below. These components have been tested thoroughly with Dataflow. Experimental features are not supported.

See the Apache Beam release notes for change information.

Note: Development SDK versions (marked as -SNAPSHOT for Java and .dev for Python) are unsupported.
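
As a rough illustration of the note above, the following Python sketch flags version strings that look like development builds. It assumes that a Java development build ends in -SNAPSHOT and that a Python development build contains a .dev segment; the function name and the sample version strings are hypothetical and not part of any SDK.

```python
def is_development_version(version: str) -> bool:
    """Return True if the version string looks like a development build.

    Assumption: Java development builds end in "-SNAPSHOT" and Python
    development builds contain a ".dev" segment, as the note describes.
    """
    return version.endswith("-SNAPSHOT") or ".dev" in version


# Hypothetical version strings, used only to exercise the check.
for candidate in ("2.23.0", "2.24.0-SNAPSHOT", "2.24.0.dev"):
    if is_development_version(candidate):
        print(f"{candidate}: development build (unsupported)")
    else:
        print(f"{candidate}: release build")
```

A check like this could run in a build script to catch a development SDK before a pipeline is submitted.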

The following tables contain the support status for the Apache Beam 2.x SDKs:

Java

SDK version(s) Status Supported components Details
2.23.0 Supported org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
Google Cloud I/O connectors under module org.apache.beam:beam-runners-google-cloud-dataflow-java: bigquery, bigtable, datastore, healthcare, pubsub, spanner
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

This version will be deprecated on July 29, 2021.

2.22.0 Supported org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
Google Cloud I/O connectors under module org.apache.beam:beam-runners-google-cloud-dataflow-java: bigquery, bigtable, datastore, healthcare, pubsub, spanner
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

This version will be deprecated on June 8, 2021.

2.21.0 Supported org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

This version will be deprecated on May 27, 2021.

2.20.0 Supported org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

This version will be deprecated on April 15, 2021.

2.19.0 Supported org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

This version will be deprecated on February 4, 2021.

2.18.0 Supported org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

This version will be deprecated on January 23, 2021.

2.17.0 Supported org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

This version will be deprecated on January 6, 2021.

2.16.0 Supported org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

This version will be deprecated on October 7, 2020.

2.15.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated on August 23, 2020.

Known issues:
  • Dataflow users who use schema features (including SQL transforms) should not upgrade to 2.15.0 due to a known issue. See the Apache Beam issue tracker for more information.
2.14.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated on August 1, 2020.

2.13.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated on June 6, 2020.

This release adds experimental support for JDK 9 or above. See the Apache Beam issue tracker for more information.

2.12.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated on April 25, 2020.

Known issues:
  • The Dataflow runner has an incorrect logging configuration that might cause all logs to be missing. To work around this issue, add slf4j-jdk14 to your runtime dependencies.
2.11.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated as of March 1, 2020.

Known issues:
  • The Dataflow runner has an incorrect logging configuration that might cause all logs to be missing. To work around this issue, add slf4j-jdk14 to your runtime dependencies.
2.10.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated as of February 11, 2020.

Known issues:
  • SDK 2.10.0 depends on gcsio client library version 1.9.13, which has known issues. To work around these issues, either upgrade to SDK 2.11.0 or override the gcsio client library version to 1.9.16 or later.

  • The Dataflow runner has an incorrect logging configuration that might cause all logs to be missing. To work around this issue, add slf4j-jdk14 to your runtime dependencies.
2.9.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated as of December 13, 2019.

Known issues:
  • Users enabling the Streaming Engine (Beta) experiment should not upgrade to SDK 2.9.0 due to a known issue. If you choose to use SDK 2.9.0, you must also set the enable_conscrypt_security_provider experimental flag to enable Conscrypt, which has known stability issues.
  • In a specific case, users of Apache Beam Java SDKs 2.9.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and Cloud Storage dynamically applies decompressive transcoding to the files.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.8.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated as of October 25, 2019.

Known issues:
  • Pipelines might become stuck due to an issue with the Conscrypt library. If you see errors in Stackdriver logging with stack traces that include Conscrypt-related calls, you might be affected by this issue. To resolve the issue, upgrade to SDK 2.9.0 or higher.
  • In a specific case, users of Apache Beam Java SDKs 2.9.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and Cloud Storage dynamically applies decompressive transcoding to the files.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.7.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated as of October 3, 2019.

Known issues:
  • Pipelines might become stuck due to an issue with the Conscrypt library. If you see errors in Stackdriver logging with stack traces that include Conscrypt-related calls, you might be affected by this issue. To resolve the issue, upgrade to SDK 2.9.0 or higher.
  • In a specific case, users of Apache Beam Java SDKs 2.9.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and Cloud Storage dynamically applies decompressive transcoding to the files.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.6.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated as of August 8, 2019.

Known issues:
  • Pipelines might become stuck due to an issue with the Conscrypt library. If you see errors in Stackdriver logging with stack traces that include Conscrypt-related calls, you might be affected by this issue. To resolve the issue, upgrade to SDK 2.9.0 or higher.
  • In a specific case, users of Apache Beam Java SDKs 2.9.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and Cloud Storage dynamically applies decompressive transcoding to the files.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.5.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management
org.apache.beam:beam-sdks-java-io-kafka

Deprecated as of June 6, 2019.

Known issues:
  • Pipelines might become stuck due to an issue with the Conscrypt library. If you see errors in Stackdriver logging with stack traces that include Conscrypt-related calls, you might be affected by this issue. To resolve the issue, upgrade to SDK 2.9.0 or higher.
  • In a specific case, users of Apache Beam Java SDKs 2.9.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and Cloud Storage dynamically applies decompressive transcoding to the files.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.4.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management

Deprecated as of March 20, 2019.

This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Known issue:
  • In a specific case, users of Apache Beam Java SDKs 2.9.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and Cloud Storage dynamically applies decompressive transcoding to the files.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.3.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management

Deprecated as of January 30, 2019.

This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Known issue:
  • In a specific case, users of Apache Beam Java SDKs 2.9.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and Cloud Storage dynamically applies decompressive transcoding to the files.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.2.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management

Deprecated as of December 2, 2018.

This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Known issue:
  • In a specific case, users of Apache Beam Java SDKs 2.9.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and Cloud Storage dynamically applies decompressive transcoding to the files.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.1.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management

Deprecated as of August 23, 2018.

This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Known issue:
  • In a specific case, users of Apache Beam Java SDKs 2.9.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and Cloud Storage dynamically applies decompressive transcoding to the files.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.0.0 Deprecated org.apache.beam:beam-sdks-java-core
org.apache.beam:beam-sdks-java-io-google-cloud-platform
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core
org.apache.beam:beam-sdks-java-extensions-protobuf
org.apache.beam:beam-runners-direct-java
org.apache.beam:beam-runners-google-cloud-dataflow-java
org.apache.beam:beam-model-pipeline
org.apache.beam:beam-runners-core-construction-java
org.apache.beam:beam-model-job-management

Deprecated as of May 17, 2018.

This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Known issue:
  • In a specific case, users of Apache Beam Java SDKs 2.9.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and Cloud Storage dynamically applies decompressive transcoding to the files.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
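
The deprecation dates in the table above lend themselves to a simple lookup. The following Python sketch copies a few of those dates into a dictionary and reports the status of a version on a given day. It assumes that a version counts as deprecated starting on its listed date; the dictionary covers only a handful of rows, and the names `DEPRECATION_DATES` and `status_on` are hypothetical.

```python
from datetime import date

# Deprecation dates copied from a few rows of the table above (not the full list).
DEPRECATION_DATES = {
    "2.23.0": date(2021, 7, 29),
    "2.22.0": date(2021, 6, 8),
    "2.16.0": date(2020, 10, 7),
    "2.15.0": date(2020, 8, 23),
}


def status_on(version: str, day: date) -> str:
    """Return "Supported" or "Deprecated" for a listed version on a given day.

    Assumption: a version counts as deprecated starting on its listed date.
    Raises KeyError for versions that are missing from DEPRECATION_DATES.
    """
    return "Deprecated" if day >= DEPRECATION_DATES[version] else "Supported"


# Example: compare two versions on the same (arbitrarily chosen) day.
check_day = date(2020, 9, 1)
for sdk_version in ("2.23.0", "2.15.0"):
    print(sdk_version, status_on(sdk_version, check_day))
```

Raising KeyError for unlisted versions is a deliberate choice in this sketch: a version that is missing from the lookup should be checked against the table rather than silently treated as supported.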

Python

SDK version(s) Status Supported components Details
2.23.0 Supported Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp: bigquery, datastore, pubsub

This version will be deprecated on July 29, 2021.

On October 7, 2020, Dataflow will stop supporting pipelines using Python 2. For more information, see the Python 2 support on Google Cloud page.

2.22.0 Supported Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp: bigquery, datastore, pubsub

This version will be deprecated on June 8, 2021.

On October 7, 2020, Dataflow will stop supporting pipelines using Python 2. For more information, see the Python 2 support on Google Cloud page.

2.21.0 Supported Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp

This version will be deprecated on May 27, 2021.

On October 7, 2020, Dataflow will stop supporting pipelines using Python 2. For more information, see the Python 2 support on Google Cloud page.

2.20.0 Supported Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp

This version will be deprecated on April 15, 2021.

On October 7, 2020, Dataflow will stop supporting pipelines using Python 2. For more information, see the Python 2 support on Google Cloud page.

2.19.0 Supported Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp

This version will be deprecated on February 4, 2021.

On October 7, 2020, Dataflow will stop supporting pipelines using Python 2. For more information, see the Python 2 support on Google Cloud page.

2.18.0 Supported Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp

This version will be deprecated on January 23, 2021.

On October 7, 2020, Dataflow will stop supporting pipelines using Python 2. For more information, see the Python 2 support on Google Cloud page.

2.17.0 Supported Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp

This version will be deprecated on January 6, 2021.

On October 7, 2020, Dataflow will stop supporting pipelines using Python 2. For more information, see the Python 2 support on Google Cloud page.

2.16.0 Supported Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp
This version will be deprecated on October 7, 2020.
2.15.0 Deprecated Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp
Deprecated on August 23, 2020.
2.14.0 Deprecated Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp

Deprecated on August 1, 2020.

Known issues:
  • The MongoDB source added in this release has a known issue that can result in data loss. See BEAM-7866 for details.
2.13.0 Deprecated Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp
Deprecated as of June 6, 2020.
2.12.0 Deprecated Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp
Deprecated as of April 25, 2020.
2.11.0 Deprecated Core Python SDK library under module apache_beam: sub-modules coders, dataframe, metrics, options, portability, runners.dataflow, runners.direct, transforms, typehints
File-based sources and sinks and related modules under module apache_beam.io: textio, avroio, parquetio, tfrecordio, gcsfilesystem, localfilesystem
Google Cloud I/O connectors under module apache_beam.io.gcp
Deprecated as of March 1, 2020.
2.10.0 Deprecated Deprecated as of February 11, 2020.
2.9.0 Deprecated Deprecated as of December 13, 2019.
2.8.0 Deprecated Deprecated as of October 25, 2019.
2.7.0 Deprecated Deprecated as of October 3, 2019.
2.6.0 Deprecated Deprecated as of August 8, 2019.
2.5.0 Deprecated Deprecated as of June 6, 2019.
2.4.0 Deprecated Deprecated as of March 20, 2019.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.
2.3.0 Deprecated Deprecated as of January 30, 2019.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.
2.2.0 Deprecated Deprecated as of December 2, 2018.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.
2.1.1, 2.1.0 Deprecated Deprecated as of August 23, 2018.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.
2.0.0 Deprecated Deprecated as of May 17, 2018.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Dataflow 2.x SDKs

Note: Development SDK versions (marked as -SNAPSHOT for Java and .dev for Python) are unsupported.

The following tables contain the support status for the Dataflow 2.x SDKs:

Java

See the Dataflow SDK 2.x for Java release notes for change information.

SDK version(s) Status Details
2.5.0 Deprecated

Deprecated as of June 6, 2019.

Known issue:
  • In a specific case, users of Dataflow Java SDKs 2.5.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and the files are subject to decompressive transcoding by Cloud Storage.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.4.0 Deprecated

Deprecated as of March 20, 2019.

This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Known issue:
  • In a specific case, users of Dataflow Java SDKs 2.5.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and the files are subject to decompressive transcoding by Cloud Storage.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.3.0 Deprecated

Deprecated as of January 30, 2019.

This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Known issue:
  • In a specific case, users of Dataflow Java SDKs 2.5.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and the files are subject to decompressive transcoding by Cloud Storage.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.2.0 Deprecated

Deprecated as of December 2, 2018.

This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Known issue:
  • In a specific case, users of Dataflow Java SDKs 2.5.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and the files are subject to decompressive transcoding by Cloud Storage.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.1.0 Deprecated

Deprecated as of August 23, 2018.

This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Known issue:
  • In a specific case, users of Dataflow Java SDKs 2.5.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and the files are subject to decompressive transcoding by Cloud Storage.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.0.0 Deprecated

Deprecated as of May 17, 2018.

This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.

Known issue:
  • In a specific case, users of Dataflow Java SDKs 2.5.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and the files are subject to decompressive transcoding by Cloud Storage.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
2.0.0-beta3, 2.0.0-beta2, 2.0.0-beta1 Decommissioned

Decommissioned as of February 28, 2018.

Known issue:
  • In a specific case, users of Dataflow Java SDKs 2.5.0 and earlier might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
    • You are reading files with the content-encoding set to gzip, and the files are subject to decompressive transcoding by Cloud Storage.
    • The file size (decompressed) is larger than 2.14 GB.
    • The input stream runs into an error (and is recreated) after 2.14 GB is read.
    As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
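
The conditions in the known issue above can be sketched as a small pre-flight check. This is only an illustrative sketch, not part of any SDK: the function names and the constant are hypothetical, and it assumes that "2.14 GB" means decimal gigabytes. The third condition (a stream error after 2.14 GB is read) cannot be known in advance, so the check only reports whether a file might be affected.

```python
from typing import Optional

# Assumption: "2.14 GB" is read as decimal gigabytes (2.14 * 10**9 bytes).
SIZE_THRESHOLD_BYTES = 2_140_000_000


def may_be_affected(content_encoding: Optional[str],
                    decompressed_size_bytes: int) -> bool:
    """Return True if a file meets the two conditions that can be checked ahead of time.

    The remaining condition (the input stream failing and being recreated
    after the threshold is read) only shows up at run time.
    """
    is_gzip = (content_encoding or "").strip().lower() == "gzip"
    return is_gzip and decompressed_size_bytes > SIZE_THRESHOLD_BYTES


def workaround_object_name(name: str) -> str:
    """Hypothetical helper: give a compressed object the .gz extension."""
    return name if name.endswith(".gz") else name + ".gz"


if __name__ == "__main__":
    print(may_be_affected("gzip", 3_000_000_000))
    print(workaround_object_name("logs/2018-01-01.txt"))
```

A file flagged by this check is a candidate for the workaround described above: clear the content-encoding header and store the object under a name with the gz extension.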

Python

See the Dataflow SDK 2.x for Python release notes for change information.

SDK version(s) Status Details
2.5.0 Deprecated Deprecated as of June 6, 2019.
2.4.0 Deprecated Deprecated as of March 20, 2019.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.
2.3.0 Deprecated Deprecated as of January 30, 2019.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.
2.2.0 Deprecated Deprecated as of December 2, 2018.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.
2.1.1 Deprecated Deprecated as of August 23, 2018.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.
Fixes a compatibility issue with the Python six package.
See the release notes for more information.
2.1.0 Deprecated Deprecated as of August 23, 2018.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.
This release has a compatibility issue with the Python six 1.11.0 package.
See the release notes for more information.
2.0.0 Deprecated Deprecated as of May 17, 2018.
This version will be decommissioned by August 12, 2020 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints.
This release has a compatibility issue with the Python six 1.11.0 package.
See the release notes for more information.
0.6.0, 0.5.5, 0.5.1, 0.4.4, 0.4.3, 0.4.2, 0.4.1, 0.4.0 Decommissioned Decommissioned as of January 29, 2018.
0.2.7 and earlier versions Decommissioned Decommissioned as of March 23, 2017.

Dataflow 1.x SDKs

The following table contains the support status for the Dataflow 1.x SDKs for Java. See the Dataflow SDK 1.x for Java release notes for change information.

SDK version(s) Status Details
1.9.1, 1.9.0 Unsupported Unsupported as of October 16, 2018.
1.8.1, 1.8.0 Unsupported Unsupported as of April 9, 2018.
1.7.0 Unsupported Unsupported as of March 12, 2018.
1.6.1, 1.6.0 Unsupported Unsupported as of January 22, 2018.
1.5.1, 1.5.0, 1.4.0, 1.3.0 Unsupported Unsupported as of October 1, 2017.
1.2.1, 1.2.0, 1.1.0, 1.0.0 Unsupported Unsupported as of February 26, 2017.
Pre-1.0.0 (including 0.4.* and 0.3.*) Unsupported