Troubleshoot Dataflow networking issues

This page shows you how to resolve issues with Dataflow networking.

Network interface must specify a subnet if the network resource is in custom subnet mode

The following error occurs when you run a Dataflow job:

Workflow failed. Causes: Invalid Error: Message: Invalid value for field
'resource.properties.networkInterfaces[0].subnetwork': ''. Network interface
must specify a subnet if the network resource is in custom subnet mode. HTTP
Code: 400

This issue occurs if the VPC network named default was converted from an auto mode VPC network to a custom mode VPC network.

To resolve this issue, specify the subnetwork parameter when using a custom mode VPC network. For more information, see Specify a network and subnetwork.
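As a sketch, a template-based job launch with an explicit subnetwork might look like the following. The job name, region, template path, and the PROJECT and SUBNETWORK values are placeholders; substitute your own.

```shell
# Launch a job from a Dataflow template, explicitly naming the subnetwork.
# All capitalized values and the job name are placeholders.
gcloud dataflow jobs run my-wordcount-job \
    --gcs-location=gs://dataflow-templates/latest/Word_Count \
    --region=us-central1 \
    --subnetwork=https://www.googleapis.com/compute/v1/projects/PROJECT/regions/us-central1/subnetworks/SUBNETWORK
```

The subnetwork must be in the same region as the job.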

Cross-project references for this resource are not allowed

The following error occurs when you run a Dataflow job on a Shared VPC network:

Invalid value for field 'resource.properties.networkInterfaces[0].subnetwork':
'https://www.googleapis.com/compute/v1/projects/PROJECT/regions/REGION/subnetworks/SUBNETWORK'.
Cross-project references for this resource are not allowed.

This issue occurs if you specify a subnetwork in a Shared VPC network, but the service project isn't attached to the Shared VPC host project.

To resolve this issue, a Shared VPC Admin must attach the service project to the host project.
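For example, a Shared VPC Admin can attach the service project from the host project with a command like the following, where both project IDs are placeholders:

```shell
# Run as a Shared VPC Admin: attach the service project that runs the
# Dataflow job to the Shared VPC host project. Project IDs are placeholders.
gcloud compute shared-vpc associated-projects add SERVICE_PROJECT_ID \
    --host-project=HOST_PROJECT_ID
```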

Network or subnetwork is not accessible to Dataflow service account or does not exist

One of the following errors occurs when you try to run a Dataflow job. The job fails.

Workflow failed. Causes: Network default is not accessible to Dataflow Service
account or does not exist
Workflow failed. Causes: Subnetwork SUBNETWORK is not
accessible to Dataflow Service account or does not exist

This issue can occur for the following reasons:

  • You omit both the subnetwork and network parameters when you create the Dataflow job, but an auto mode VPC network named default doesn't exist in your project. You might not have a default network if the default network was deleted or if an organization policy constraint prevents you from creating a default network.
  • The subnetwork doesn't exist, for example because it was deleted.
  • The subnetwork parameter is specified incorrectly, for example with a malformed path or a region that doesn't match the job's region.
  • The required permissions for the Dataflow service account are missing.

To resolve this issue, follow the guidelines for specifying a network and subnetwork.
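As a sketch of the permissions check, the following commands verify that the subnetwork exists in the job's region and grant the Dataflow service agent the Compute Network User role on it. SUBNETWORK, REGION, PROJECT_ID, and PROJECT_NUMBER are placeholders; the service-account email shown follows the Dataflow service agent naming pattern.

```shell
# Confirm the subnetwork exists in the region where the job runs.
# Capitalized values are placeholders.
gcloud compute networks subnets describe SUBNETWORK \
    --region=REGION --project=PROJECT_ID

# Grant the Dataflow service agent permission to use the subnetwork.
gcloud compute networks subnets add-iam-policy-binding SUBNETWORK \
    --region=REGION --project=PROJECT_ID \
    --member="serviceAccount:service-PROJECT_NUMBER@dataflow-service-producer-prod.iam.gserviceaccount.com" \
    --role="roles/compute.networkUser"
```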

RPC timed out or failed to connect on ports 12345 or 12346

One of the following errors occurs when you run a Dataflow job that doesn't use Streaming Engine or Dataflow Shuffle. The job gets stuck or fails.

For streaming jobs:

Rpc to WORKER_HARNESS:12345 completed with error
UNAVAILABLE: failed to connect to all addresses; last error : UNKNOWN:
ipv4:WORKER_IP_ADDRESS:12345: Failed to connect to remote
host: FD Shutdown

For batch jobs:

(g)RPC timed out when SOURCE_WORKER_HARNESS talking to
DESTINATION_WORKER_HARNESS:12346.

This issue occurs if a firewall rule that allows network traffic on TCP ports 12345 and 12346 is missing. When the job uses multiple workers, the workers aren't able to communicate with each other.

To resolve this issue, see the troubleshooting steps in DEADLINE_EXCEEDED or Server Unresponsive.
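As a sketch, a firewall rule that allows worker-to-worker traffic on these ports might look like the following. The rule name is arbitrary, NETWORK is a placeholder, and the example assumes the workers carry the network tag dataflow.

```shell
# Allow Dataflow workers (tagged "dataflow") to reach each other on
# TCP ports 12345-12346. NETWORK and the tag are placeholders/assumptions.
gcloud compute firewall-rules create allow-dataflow-internal \
    --network=NETWORK \
    --action=allow \
    --direction=ingress \
    --target-tags=dataflow \
    --source-tags=dataflow \
    --rules=tcp:12345-12346
```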

Single worker is repeatedly started and stopped

The following issue occurs when you launch a Dataflow job. On the Dataflow job's Job metrics page, the CPU utilization (All Workers) chart shows that a worker is repeatedly started and then stopped after a few minutes. Only one worker is available at any given time.

CPU utilization chart showing that one worker at a time is repeatedly created and then stopped.

The following error occurs:

The Dataflow job appears to be stuck because no worker activity has been seen
in the last 1h. Please check the worker logs in Stackdriver Logging.

No worker logs are created.

In the job logs, multiple messages similar to the following might appear:

Autoscaling: Raised the number of workers to 1 based on the rate of progress in
the currently running stage(s).

This issue occurs if the VPC network doesn't have a default route to the internet and a default route to the subnetwork.

To resolve this issue, add default routes to your VPC network. For more information, see Internet access for Dataflow.
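As a sketch, a default route to the internet can be recreated with a command like the following. The route name is arbitrary and NETWORK is a placeholder.

```shell
# Recreate a default route (0.0.0.0/0) to the default internet gateway
# for the VPC network. Route and network names are placeholders.
gcloud compute routes create default-internet-route \
    --network=NETWORK \
    --destination-range=0.0.0.0/0 \
    --next-hop-gateway=default-internet-gateway
```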

Subnetwork does not have Private Google Access

The following error occurs when you launch a Dataflow job in which external IP addresses are disabled:

Workflow failed. Causes: Subnetwork SUBNETWORK on project
PROJECT_ID network NETWORK in
region REGION does not have Private Google Access, which
is required for usage of private IP addresses by the Dataflow workers.

This issue occurs if you turn off external IP addresses without enabling Private Google Access.

To resolve this issue, enable Private Google Access for the subnetwork that the Dataflow job uses.
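For example, Private Google Access can be enabled on the subnetwork with a command like the following, where SUBNETWORK and REGION are placeholders:

```shell
# Enable Private Google Access on the subnetwork the Dataflow job uses.
# SUBNETWORK and REGION are placeholders.
gcloud compute networks subnets update SUBNETWORK \
    --region=REGION \
    --enable-private-ip-google-access
```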