Use Compute Engine reservations with Dataflow

To ensure that VM resources are available when your Dataflow jobs need them, you can use Compute Engine reservations. Reservations provide a high level of assurance in obtaining capacity for Compute Engine zonal resources.

To use Compute Engine reservations with Dataflow, perform the following steps:

  1. Create a Compute Engine reservation. It can be a single-project reservation or a shared reservation. For more information, see the Compute Engine documentation on creating single-project and shared reservations.

    The reservation can include GPU accelerators.

  2. When you submit your Dataflow job, pass one of the following pipeline options, depending on the version of the Apache Beam SDK that you use:

    • Beam version < 2.29: --experiments=skip_gce_quota_verification
    • Beam version >= 2.29: --dataflow_service_options=automatically_use_created_reservation
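The version rule in step 2 can be sketched as a small helper. This is only an illustration: the function name reservation_flag is hypothetical, and the two flag strings are copied from the list above rather than verified against a particular SDK release.

```python
def reservation_flag(beam_version: str) -> str:
    """Return the flag listed in step 2 for the given Beam SDK version.

    Hypothetical helper: the flag strings are taken verbatim from the
    steps above.
    """
    # Compare major and minor numerically so that "2.9" sorts before "2.29".
    major, minor = (int(part) for part in beam_version.split(".")[:2])
    if (major, minor) < (2, 29):
        return "--experiments=skip_gce_quota_verification"
    return "--dataflow_service_options=automatically_use_created_reservation"


if __name__ == "__main__":
    print(reservation_flag("2.28.0"))
    print(reservation_flag("2.50.0"))
```

Comparing the version as a tuple of integers avoids the string-comparison trap where "2.9" would otherwise sort after "2.29".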

To prevent low-priority workloads in the same project from competing for reservations with Dataflow, set the reservation affinity to none when you create VMs for those workloads. For more information, see Consuming reserved instances.
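As a rough sketch of the affinity setting, the following helper assembles a gcloud compute instances create command with --reservation-affinity=none. The helper name, the VM name, and the zone are placeholders chosen for illustration; adapt the remaining flags to your own workload.

```python
import shlex
from typing import List


def create_vm_without_reservation(name: str, zone: str) -> List[str]:
    """Build a gcloud command for a VM that does not consume reservations.

    Hypothetical helper: only the flags shown here are set; a real
    workload would usually add a machine type, image, and so on.
    """
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        # Opt this VM out of automatic reservation consumption.
        "--reservation-affinity=none",
    ]


if __name__ == "__main__":
    # Placeholder VM name and zone.
    print(shlex.join(create_vm_without_reservation("batch-worker-1", "us-central1-a")))
```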

To consume the reservation, the Dataflow workers must match the reservation's configuration. You might need to set the worker machine type for the job. For more information, see Workers.
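One way to keep the job and the reservation aligned is to derive the worker options from the reservation itself. The sketch below assumes a dictionary shaped like the JSON output of gcloud compute reservations describe, and assumes the Python SDK options --worker_machine_type and --worker_zone. The helper name and the sample values are hypothetical, and setting the zone is an addition based on reservations being zonal.

```python
from typing import Any, Dict, List


def worker_flags(reservation: Dict[str, Any]) -> List[str]:
    """Derive worker pipeline options from a reservation description.

    Hypothetical helper: expects the 'zone' URL and the
    'specificReservation.instanceProperties.machineType' field.
    """
    props = reservation["specificReservation"]["instanceProperties"]
    # The zone is reported as a URL; keep only the last path segment.
    zone = reservation["zone"].rsplit("/", 1)[-1]
    return [
        f"--worker_machine_type={props['machineType']}",
        f"--worker_zone={zone}",
    ]


if __name__ == "__main__":
    # Sample values are placeholders, not a real reservation.
    sample = {
        "zone": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-a",
        "specificReservation": {
            "count": "4",
            "instanceProperties": {"machineType": "n1-standard-4"},
        },
    }
    print(worker_flags(sample))
```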

Limitations

All limitations of Compute Engine reservations apply when Dataflow workers consume reservations. See How reservations work.

In addition, Dataflow relies on the default consumption order in Compute Engine. As a result, the following limitations apply:

  • Dataflow does not consume a reservation created with the --require-specific-reservation flag.
  • Other workloads in the same project or organization that do not specify the --reservation flag might compete with Dataflow workloads for project-specific or shared reservations.
  • Dataflow Prime jobs do not consume Compute Engine reservations.
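The first and last limitations above can be expressed as a quick pre-flight check. This sketch assumes a reservation dictionary in the shape of the Compute Engine reservation resource, where the specificReservationRequired field reflects the --require-specific-reservation flag. The function name is hypothetical. The check does not cover competing workloads or whether the workers match the reservation's configuration.

```python
from typing import Any, Dict


def dataflow_can_consume(reservation: Dict[str, Any], uses_dataflow_prime: bool) -> bool:
    """Apply two of the listed limitations to a reservation description.

    Hypothetical helper: returns False when the job uses Dataflow Prime
    or when the reservation requires being targeted by name.
    """
    if uses_dataflow_prime:
        # Dataflow Prime jobs do not consume reservations.
        return False
    # Reservations that must be targeted specifically are skipped,
    # because Dataflow relies on the default consumption order.
    return not reservation.get("specificReservationRequired", False)


if __name__ == "__main__":
    print(dataflow_can_consume({"specificReservationRequired": True}, False))
    print(dataflow_can_consume({"specificReservationRequired": False}, False))
```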

Pricing

While a Dataflow job is running, the reserved Compute Engine VMs that it uses are billed by Dataflow. When Dataflow is not using the reserved VMs, they are billed by Compute Engine.

What's next

To learn more about Compute Engine reservations, see Reservations of Compute Engine zonal resources.