Introducing regional placement in Dataflow
Yuta Labur
Software Engineer
Efesa Origbo
Product Manager
We’re excited to announce that Dataflow now supports regional placement of workers.
Building upon Auto Zone placement
Dataflow deploys its workers as Compute Engine resources, which are hosted in multiple locations worldwide. Since 2018, Dataflow has supported the Auto Zone feature, which automatically selects the best single zone within a region to run Dataflow workers, based on available zone capacity. While Auto Zone let customers defer zone selection to the Dataflow service, it had two key limitations:
When a zone runs out of compute capacity, jobs created with Auto Zone are susceptible to resource availability errors.
In the event of a zonal failure, jobs created with Auto Zone fail because they are confined to a single zone.
Regional worker placement closes these gaps by enrolling the Dataflow job in all available zones within the relevant region. If a subset of zones runs out of compute capacity, the Dataflow job continues to provision workers from other zones that still have capacity. This improves the scalability and reliability of your Dataflow jobs.
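To make the difference concrete, here is a small toy model in Python. It is only a sketch of the idea, not Dataflow's actual placement algorithm: the function names, the capacity numbers, and the "largest free capacity first" ordering are all assumptions made for illustration.

```python
from typing import Dict, Optional


def place_single_zone(
    capacity: Dict[str, int], zone: str, workers: int
) -> Optional[Dict[str, int]]:
    """Toy single-zone placement: every worker must fit in one chosen zone."""
    if capacity.get(zone, 0) >= workers:
        return {zone: workers}
    # Models a resource availability error: the chosen zone is out of capacity.
    return None


def place_regional(
    capacity: Dict[str, int], workers: int
) -> Optional[Dict[str, int]]:
    """Toy regional placement: draw workers from any zone with free capacity."""
    placement: Dict[str, int] = {}
    remaining = workers
    # Assumed ordering for this sketch: visit zones with the most free capacity first.
    for zone, free in sorted(capacity.items(), key=lambda kv: -kv[1]):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            placement[zone] = take
            remaining -= take
    # Fail only if the region as a whole cannot satisfy the request.
    return placement if remaining == 0 else None


# Illustrative capacity numbers: one zone in the region is exhausted.
capacity = {"us-central1-a": 0, "us-central1-b": 3, "us-central1-c": 4}

single = place_single_zone(capacity, "us-central1-a", 5)
regional = place_regional(capacity, 5)
```

In this toy scenario the single-zone request fails because the chosen zone has no free capacity, while the regional request is satisfied by combining capacity from the two remaining zones.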


Getting started
Regional worker placement is automatically enabled for jobs that use Streaming Engine or Shuffle Service. To learn more, see the documentation page.
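As a rough illustration, the sketch below assembles command-line flags for a streaming pipeline written with the Apache Beam Python SDK. The helper `dataflow_flags` and the project, region, and bucket values are hypothetical. The sketch also assumes that naming only a region, and leaving the worker zone unset, is what lets the service place workers across zones; please confirm the exact behavior in the documentation.

```python
from typing import List, Optional


def dataflow_flags(
    project: str,
    region: str,
    temp_location: str,
    worker_zone: Optional[str] = None,
) -> List[str]:
    """Build Beam pipeline flags for a Streaming Engine job (hypothetical helper)."""
    flags = [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        "--streaming",
        "--enable_streaming_engine",
    ]
    # Assumption: pinning a worker zone would confine workers to that zone,
    # so this sketch only adds the flag when the caller explicitly asks for it.
    if worker_zone is not None:
        flags.append(f"--worker_zone={worker_zone}")
    return flags


# Placeholder values; replace them with your own project, region, and bucket.
flags = dataflow_flags(
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)
```

The resulting list could then be passed to `PipelineOptions(flags)` from `apache_beam.options.pipeline_options` when constructing the pipeline.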