To get the latest product updates delivered to you, add the URL of this page to your feed reader, or add the feed URL directly: https://cloud.google.com/feeds/cloud-dataflow-release-notes.xml
November 18, 2019
Flexible Resource Scheduling (FlexRS) in Cloud Dataflow is generally available. The service is available in five additional regions:
- us-east1 (South Carolina)
- us-west1 (Oregon)
- asia-east1 (Taiwan)
- asia-northeast1 (Tokyo)
- europe-west4 (Netherlands)
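A minimal sketch of opting a batch job into FlexRS from the Apache Beam SDK for Python follows; the flexrs_goal pipeline option and the project, bucket, and file paths are assumptions, not part of this note:

```python
# Sketch only: submit a batch job with FlexRS cost-optimized scheduling.
# The flexrs_goal option and all resource names below are assumptions.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",                  # hypothetical project ID
    region="us-east1",                     # one of the FlexRS regions above
    temp_location="gs://my-bucket/temp",   # hypothetical bucket
    flexrs_goal="COST_OPTIMIZED",          # request delayed, cost-optimized scheduling
)

with beam.Pipeline(options=options) as p:
    (p
     | "Read" >> beam.io.ReadFromText("gs://my-bucket/input/*.txt")
     | "Count" >> beam.combiners.Count.Globally()
     | "Write" >> beam.io.WriteToText("gs://my-bucket/output/count"))
```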
You can now do the following in Cloud Dataflow SQL:
- Use Cloud Storage filesets as a data source
- Assign schemas to data sources in the Cloud Dataflow SQL UI
- Preview the content of Cloud Pub/Sub messages from the Cloud Dataflow SQL UI
October 31, 2019
Cloud Dataflow Shuffle and Streaming Engine are now available in us-east1 (South Carolina).
October 25, 2019
You can now see audit logs of Cloud KMS key operations and protect Cloud Dataflow Shuffle state using a customer-managed encryption key.
October 08, 2019
Python streaming for Apache Beam SDK 2.16 or higher is generally available. You can now do the following in Python:
- Update and Drain streaming pipelines.
- Enable streaming autoscaling.
- Use Streaming Engine.
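A minimal sketch of a Python streaming pipeline that uses streaming autoscaling and Streaming Engine follows; the enable_streaming_engine, autoscaling_algorithm, and max_num_workers options and the project, topic, and bucket names are assumptions:

```python
# Sketch only: a Python streaming pipeline with streaming autoscaling and
# Streaming Engine enabled. Option values and resource names are assumptions.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/temp",
    streaming=True,
    enable_streaming_engine=True,              # move shuffle/state into the service backend
    autoscaling_algorithm="THROUGHPUT_BASED",  # enable streaming autoscaling
    max_num_workers=10,                        # autoscaling ceiling
)

with beam.Pipeline(options=options) as p:
    (p
     | "ReadPubSub" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/my-input")
     | "Uppercase" >> beam.Map(lambda msg: msg.decode("utf-8").upper().encode("utf-8"))
     | "WritePubSub" >> beam.io.WriteToPubSub(topic="projects/my-project/topics/my-output"))
```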
Python 3 support for Apache Beam SDK 2.16.0 or higher is now generally available. This feature provides support for using Python 3.5, 3.6, and 3.7. You can run any existing Python 2.7 batch and streaming pipelines that use DirectRunner or DataflowRunner. However, you might need to make changes to ensure that your pipeline code is compatible with Python 3. Keyword-only arguments (a syntactic construct introduced in Python 3) are not yet supported by the Apache Beam SDK. For the current status and a summary of recent Python 3-specific improvements, follow updates on the Apache Beam issue tracker.
October 07, 2019
Cloud Dataflow Shuffle and Streaming Engine are now available in two additional regions:
- us-west1 (Oregon)
- asia-east1 (Taiwan)
September 03, 2019
Automatic hot key detection is now enabled in batch pipelines for Apache Beam SDK 2.15.0 or higher.
August 09, 2019
Cloud Dataflow integration with VPC Service Controls is generally available.
August 02, 2019
Using Cloud Dataflow with Cloud Key Management Service is now available in beta. Customer-managed encryption keys (CMEK) allow for encryption of your pipeline state. This feature is limited to Persistent Disks attached to Cloud Dataflow workers and used for Persistent Disk-based shuffle and streaming state storage.
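A minimal sketch of supplying a customer-managed key to a pipeline follows, assuming the dataflow_kms_key pipeline option in the Apache Beam SDK for Python; the project, bucket, and key names are hypothetical:

```python
# Sketch only: protect pipeline state with a customer-managed encryption key.
# The dataflow_kms_key option and all resource names are assumptions.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/temp",
    dataflow_kms_key=(
        "projects/my-project/locations/us-central1/"
        "keyRings/my-key-ring/cryptoKeys/my-key"   # hypothetical Cloud KMS key
    ),
)

with beam.Pipeline(options=options) as p:
    _ = (p
         | beam.Create(["a", "b", "c"])
         | beam.combiners.Count.Globally()
         | beam.io.WriteToText("gs://my-bucket/output/count"))
```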
August 01, 2019
Python 3 support for Apache Beam SDK 2.14.0 or higher is now in beta. This feature provides support for using Python 3.5, 3.6, and 3.7. You can run any existing Python 2.7 batch and streaming pipelines that use DirectRunner or DataflowRunner. However, you might need to make changes to ensure that your pipeline code is compatible with Python 3. Some syntactic constructs introduced in Python 3 are not yet fully supported by the Apache Beam SDK. For details and current status, follow updates on the Apache Beam issue tracker.
May 16, 2019
Cloud Dataflow SQL is now publicly available in alpha. Cloud Dataflow SQL lets you use SQL queries to develop and run Cloud Dataflow jobs from the BigQuery web UI.
April 18, 2019
Cloud Dataflow is now able to use workers in zones in the asia-northeast2 region (Osaka, Japan).
April 10, 2019
Cloud Dataflow Streaming Engine is generally available. The service is available in two additional regions:
- asia-northeast1 (Tokyo)
- europe-west4 (Netherlands)
Note that Streaming Engine requires the Apache Beam SDK for Java, versions 2.10.0 or higher.
Cloud Dataflow Shuffle is now available in two additional regions:
- asia-northeast1 (Tokyo)
- europe-west4 (Netherlands)
Cloud Dataflow provides beta support for Flexible Resource Scheduling (FlexRS) in the us-central1 and europe-west1 regions.
Streaming autoscaling is generally available for pipelines that use Streaming Engine.
April 08, 2019
Apache Beam SDK for Python can only use BigQuery resources in the following regions:
- Regional locations: us-west2, us-east4, europe-north1, europe-west2, europe-west6
- Multi-regional locations: EU and US
Cloud Dataflow provides beta support for Flexible Resource Scheduling (FlexRS) in the us-central1 and europe-west1 regions.
April 01, 2019
Cloud Dataflow provides beta support for VPC Service Controls.
March 24, 2019
The following SDK versions will be decommissioned later in 2019 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints. Note that this change overrides the release note from December 17, which stated that decommissioning was expected to happen in March 2019.
- Apache Beam SDK for Java, versions 2.0.0 to 2.4.0 (inclusive)
- Apache Beam SDK for Python, versions 2.0.0 to 2.4.0 (inclusive)
- Cloud Dataflow SDK for Java, versions 2.0.0 to 2.4.0 (inclusive)
- Cloud Dataflow SDK for Python, versions 2.0.0 to 2.4.0 (inclusive)
See the SDK version support status page for detailed SDK support status.
March 20, 2019
Apache Beam SDK 2.4.0 and Cloud Dataflow SDK 2.4.0 are now deprecated. For detailed support status information, see the SDK version support status table.
March 11, 2019
Cloud Dataflow is now able to use workers in zones in the europe-west6 region (Zürich, Switzerland).
March 06, 2019
Apache Beam SDK 2.10.0 depends on gcsio client library version 1.9.13, which has known issues:
- Reading side inputs can result in sending a large number of requests to Cloud Storage. As a result, Cloud Dataflow jobs can fail with HTTP 429 errors from Cloud Storage.
- Apache Beam ParquetIO fails when reading files from Cloud Storage.
To work around these issues, either upgrade to Apache Beam SDK 2.11.0, or override the gcsio client library version to 1.9.16 or later.
February 25, 2019
You can now view system latency and data freshness metrics for your pipeline in the Cloud Dataflow monitoring interface.
February 20, 2019
Apache Beam SDK 2.10.0 contains fixes for the known issues disclosed on December 20, 2018 and February 4, 2019.
February 04, 2019
In a specific case, users of Apache Beam Java SDKs (2.9.0 and earlier) and Cloud Dataflow Java SDKs (2.5.0 and earlier) might experience data duplication when reading files from Cloud Storage. Duplication might occur when all of the following conditions are true:
- You are reading files with the content-encoding header set to gzip, and the files are dynamically decompressed by Cloud Storage (decompressive transcoding).
- The decompressed file size is larger than 2.14 GB.
- The input stream runs into an error (and is recreated) after 2.14 GB is read.
As a workaround, do not set the content-encoding header, and store compressed files in Cloud Storage with the proper extension (for example, .gz for gzip). For existing files, you can update the content-encoding header and file name with the gsutil tool.
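A minimal sketch of that workaround using the google-cloud-storage Python client rather than gsutil follows; the bucket and object names are hypothetical:

```python
# Sketch only: clear the content-encoding metadata so Cloud Storage no longer
# performs decompressive transcoding, and rename the object with a .gz extension
# so the SDK decompresses it itself. Bucket and object names are hypothetical.
from google.cloud import storage

client = storage.Client(project="my-project")
bucket = client.bucket("my-bucket")

blob = bucket.blob("data/large-file")
blob.reload()                  # fetch current metadata
blob.content_encoding = None   # remove the gzip content-encoding header
blob.patch()                   # persist the metadata change

bucket.rename_blob(blob, "data/large-file.gz")  # keep the compression extension
```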
December 20, 2018
Streaming Engine users should not upgrade to SDK 2.9.0 due to a known issue. If you choose to use SDK 2.9.0, you must also set the enable_conscrypt_security_provider experimental flag to enable Conscrypt, which has known stability issues.
December 17, 2018
The following decommission notice has been changed. For more information, see the release note for March 24, 2019.
The following SDK versions will be decommissioned on March 25, 2019 due to the discontinuation of support for JSON-RPC and Global HTTP Batch Endpoints. Shortly after this date, you will no longer be able to submit new Cloud Dataflow jobs or update running Cloud Dataflow jobs that use the decommissioned SDKs. In addition, existing streaming jobs that use these SDK versions might fail.
- Apache Beam SDK for Java, versions 2.0.0 to 2.4.0 (inclusive)
- Apache Beam SDK for Python, versions 2.0.0 to 2.4.0 (inclusive)
- Cloud Dataflow SDK for Java, versions 2.0.0 to 2.4.0 (inclusive)
- Cloud Dataflow SDK for Python, versions 2.0.0 to 2.4.0 (inclusive)
See the SDK version support status page for detailed SDK support status.
October 22, 2018
Cloud Dataflow is now able to use workers in zones in the asia-east2 region (Hong Kong).
October 16, 2018
Cloud Dataflow SDK 1.x for Java is unsupported as of October 16, 2018. In the near future, the Cloud Dataflow service will reject new Cloud Dataflow jobs that are based on Cloud Dataflow SDK 1.x for Java. See Migrating from Cloud Dataflow SDK 1.x for Java for migration guidance.
October 03, 2018
Cloud Dataflow now has a Public IP parameter that allows you to turn off public IP addresses for your worker nodes.
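A minimal sketch of turning off public worker IP addresses follows, assuming the use_public_ips option (--no_use_public_ips on the command line) in the Apache Beam SDK for Python; the project, subnetwork, and bucket names are hypothetical:

```python
# Sketch only: run workers without public IP addresses. The use_public_ips and
# subnetwork options and all resource names are assumptions; the subnetwork must
# allow Private Google Access so workers can reach Google APIs.
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/temp",
    subnetwork="regions/us-central1/subnetworks/my-subnet",
    use_public_ips=False,   # workers get internal IP addresses only
)
```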
July 16, 2018
Cloud Dataflow Shuffle is now generally available.
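A minimal sketch of opting a batch job into Cloud Dataflow Shuffle follows, assuming the shuffle_mode=service experiment flag; the project and bucket names are hypothetical:

```python
# Sketch only: opt a batch job into the Cloud Dataflow Shuffle service.
# The shuffle_mode=service experiment and resource names are assumptions.
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    flags=["--experiments=shuffle_mode=service"],  # run shuffle in the service backend
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/temp",
)
```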
July 10, 2018
Cloud Dataflow is now able to use workers in zones in the us-west2 region (Los Angeles).
June 14, 2018
Streaming Engine is now publicly available in beta. Streaming Engine moves streaming pipeline execution out of the worker VMs and into the Cloud Dataflow service backend.
June 11, 2018
You can now specify a user-managed controller service account when you run your pipeline job.
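A minimal sketch of specifying a user-managed controller service account follows, assuming the service_account_email pipeline option in the Apache Beam SDK for Python; the account, project, and bucket names are hypothetical:

```python
# Sketch only: run worker VMs as a user-managed controller service account.
# The service_account_email option and all names are assumptions.
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="europe-north1",
    temp_location="gs://my-bucket/temp",
    service_account_email="my-dataflow-sa@my-project.iam.gserviceaccount.com",
)
```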
Cloud Dataflow is now able to use workers in zones in the europe-north1 region (Finland).
April 26, 2018
You can now view side input metrics for your pipeline from the Cloud Dataflow monitoring interface.
February 21, 2018
Cloud Dataflow now supports the following regional endpoints in GA: us-central1, us-east1, europe-west1, asia-east1, and asia-northeast1.
January 10, 2018
Cloud Dataflow is now able to use workers in zones in the northamerica-northeast1 region (Montréal).
Cloud Dataflow is now able to use workers in zones in the europe-west4 region (Netherlands).
October 31, 2017
Cloud Dataflow is now able to use workers in zones in the asia-south1 region (Mumbai).
October 30, 2017
Cloud Dataflow Shuffle is now available in the europe-west1 region.
Cloud Dataflow Shuffle is now available for pipelines using the Apache Beam SDK for Python version 2.1 or later.
October 25, 2017
Cloud Dataflow provides beta support for additional Google-provided templates. To get started with templates, follow the quickstart.
October 12, 2017
Fixed the known issue disclosed on October 2, 2017.
October 02, 2017
Known issue: Cloud Dataflow 2.x pipelines in which the output of a PTransform is consumed by a Flatten and at least one other PTransform produce a malformed graph, leaving the other PTransforms without input.
September 20, 2017
Cloud Dataflow provides beta support for the regional endpoints us-central1 and europe-west1.
September 05, 2017
Cloud Dataflow is now able to use workers in zones in the southamerica-east1 region (São Paulo).
August 01, 2017
Cloud Dataflow is now able to use workers in zones in the europe-west3 region (Frankfurt).
July 20, 2017
You can now access the Stackdriver error report for your pipeline directly from the Dataflow monitoring interface.
June 20, 2017
Cloud Dataflow is now able to use workers in zones in the australia-southeast1 region (Sydney).
June 06, 2017
Cloud Dataflow is now able to use workers in zones in the europe-west2 region (London).
April 25, 2017
Per-step worker logs are now accessible directly in the Cloud Dataflow UI. Consult the documentation for more information.
April 11, 2017
The Cloud Dataflow service will now automatically shut down a streaming job if all steps have reached the maximum watermark. This only affects pipelines in which every source produces only bounded input; for example, streaming pipelines reading from Cloud Pub/Sub are not affected.
April 03, 2017
Improved graph layout in the Cloud Dataflow UI.
September 29, 2016
Autoscaling for streaming pipelines is now publicly available in beta for use with select sources and sinks. See the autoscaling documentation for more details.
September 15, 2016
The default autoscaling ceiling for batch pipelines using the Cloud Dataflow SDK for Java 1.6 or newer has been raised to 10 worker VMs. You can specify an alternate ceiling using the --maxNumWorkers pipeline option. See the autoscaling documentation for more details.
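A minimal sketch of raising the ceiling follows; the note above uses the Java SDK's --maxNumWorkers option, and this shows the Python SDK's equivalent max_num_workers option with hypothetical project and bucket names:

```python
# Sketch only: raise the autoscaling ceiling. The note above is about the Java
# SDK's --maxNumWorkers option; max_num_workers is the Python SDK's equivalent,
# and the resource names are assumptions.
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    temp_location="gs://my-bucket/temp",
    autoscaling_algorithm="THROUGHPUT_BASED",  # enable autoscaling
    max_num_workers=50,                        # allow up to 50 worker VMs
)
```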
August 18, 2016
Autoscaling for batch pipelines using the Cloud Dataflow SDK for Java 1.6 or higher is now enabled by default. This change will be rolled out to projects over the next several days. By default, the Cloud Dataflow service caps the dynamic number of workers at a ceiling of 5 worker VMs. The default autoscaling ceiling may be raised in future service releases. You can specify an alternate ceiling using the --maxNumWorkers pipeline option. See the autoscaling documentation for more details.
July 27, 2016
Announced beta support for the 0.4.0 release of the Cloud Dataflow SDK for Python. Get started and run your pipeline remotely on the service.
Default disk size for pipelines in streaming mode is now 420 GB. This change will be rolled out to projects over the next several days.
March 14, 2016
Scalability and performance improvements available when using Cloud Dataflow SDK for Java version 1.5.0:
- The service now scales to tens of thousands of initial splits when reading from a BoundedSource. This includes TextIO.Read, AvroIO.Read, and BigtableIO.Read, among others.
- The service will now use Avro instead of JSON as a BigQuery export format for BigQueryIO.Read. This change greatly increases the efficiency and performance when reading from BigQuery.
January 29, 2016
Changes to the runtime environment for streaming jobs:
- Files uploaded with --filesToStage were previously downloaded to /dataflow/packages on the workers. With the latest service release, files will now be in the location /var/opt/google/dataflow. This change was a cleanup intended to better follow standard Linux path conventions.
January 19, 2016
Changes to the runtime environment for batch jobs:
- Files uploaded with --filesToStage were previously downloaded to /dataflow/packages on the workers. With the latest service release, files will now be in the location /var/opt/google/dataflow. This change was a cleanup intended to better follow standard Linux path conventions.
November 13, 2015
Usability improvements in the Monitoring UI:
- The Job Log tab has been renamed Logs.
- The View Log button has moved into the Logs tab and has been renamed Worker Logs.
Performance and stability improvements for Streaming pipelines:
- Addressed a condition that caused slowly growing memory usage in streaming workers.
- Large Window buffers no longer need to fit entirely in memory at once.
- Improved disk assignment to avoid data locality hotspots.
- Worker logging is now optimized to avoid filling up the local disk.
August 12, 2015
The Cloud Dataflow Service is now generally available.
August 06, 2015
Monitoring changes:
- Added JOB_STATE_CANCELLED as a possible state value for Cloud Dataflow jobs in the Monitoring UI and command-line interface. This state appears when the user cancels a job.
- Temporarily, as part of the above job state introduction, jobs may show different job states in list view relative to the single job view.
- Added Compute Engine core-hour count field to the monitoring UI and enabled core-hour counting for bounded jobs (field is populated with "-" for unbounded jobs).
Performance improvements to the unbounded runner.
July 28, 2015
Added a check during job creation to ensure active job names are unique within each project. You may no longer create a new job with the same name as an active job. If there are already active jobs with the same name running in the system, they will not be impacted by this change.
April 23, 2015
Improvements to the monitoring UI. Clicking View Log for a stage now defaults to displaying the logs generated by user code on the worker machines.
April 16, 2015
The Cloud Dataflow Service is now in beta.
Improvements to the monitoring UI: The job details page now provides more job information, including job duration and job type. For streaming pipelines, it additionally provides the data watermark.
April 13, 2015
Command-line interface now available for Cloud Dataflow in gcloud alpha.
Default disk size in batch is 250 GB.
April 09, 2015
Improvements to the monitoring UI: Improved organization of pipeline visualization.
Default VM for batch jobs is now n1-standard-1.
Improved resource teardown operations on job completion and cancellations.
Performance improvements for the service.
April 03, 2015
Improvements to the monitoring UI: The list of jobs now includes name, type, start time, and job ID.
March 27, 2015
Improved mechanisms for elastic scaling of compute resources. Batch pipelines can now grow and shrink the worker pool size at different stages of execution.
March 20, 2015
Monitoring changes:
- Jobs summary page now shows the status of the current job.
- Performance improvements to the UI.
March 06, 2015
Workers now use the Java 8 runtime.
March 01, 2015
Dynamic work rebalancing is now available.
Streaming support enabled for all projects participating in alpha.