Cloud Dataflow job stuck in the 'WriteFiles' step while writing to Cloud Storage

Problem

A Dataflow job gets stuck while writing data to Cloud Storage, and the following messages appear in the worker logs:

Processing stuck in step Write File(s)...
Operation ongoing in step Write File(s)...

Environment

  • Dataflow
  • Cloud Storage

Solution

  1. Set withNumShards() on the file-based write I/O transform to the number of worker machines to increase write parallelism.

Cause

This issue occurs when too few shards are writing to the sink, or when the available workers cannot keep up with the incoming load. In such cases, increase the number of shards to match the maximum worker pool size so that write parallelism increases. More information can be found here.

Example:

FileIO.Write.withNumShards(5)
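
For context, here is a minimal sketch of how this setting might look in a complete Apache Beam Java pipeline. It assumes String elements written as text files. The class name WriteWithShards, the helper numShardsFor, the constant MAX_NUM_WORKERS, the sample input, and the bucket path gs://example-bucket/output are all hypothetical placeholders, not values from the original job. The shard count is hardcoded here and should match whatever maximum worker count the job actually runs with (for example, the value passed as --maxNumWorkers).

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.FileIO;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class WriteWithShards {
  // Assumed maximum worker pool size (hypothetical value); keep it in sync
  // with the maximum worker count configured for the job.
  static final int MAX_NUM_WORKERS = 5;

  // Hypothetical helper: one shard per worker, never fewer than one.
  // A value of 0 would ask the runner to choose the sharding itself.
  static int numShardsFor(int maxNumWorkers) {
    return Math.max(1, maxNumWorkers);
  }

  public static void main(String[] args) {
    PipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().create();
    Pipeline pipeline = Pipeline.create(options);

    pipeline
        // Sample in-memory input; a real job would read from its own source.
        .apply("CreateSampleLines", Create.of("alpha", "beta", "gamma"))
        .apply(
            "WriteFiles",
            FileIO.<String>write()
                .via(TextIO.sink())
                // Hypothetical output location; replace with a real bucket.
                .to("gs://example-bucket/output")
                .withSuffix(".txt")
                // Fixed shard count equal to the assumed worker pool size.
                .withNumShards(numShardsFor(MAX_NUM_WORKERS)));

    pipeline.run().waitUntilFinish();
  }
}
```

With a fixed shard count, each window of output is split across that many files, so up to that many workers can write in parallel. Values well above the worker count tend to produce many small files without adding throughput.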