Dataproc Serverless for Spark network configuration

The VPC subnetwork that is used to execute Serverless Spark workloads must meet the following requirements:

  • Open subnet connectivity. The subnet must allow communication between VMs in the subnet on all ports. The following gcloud command creates a firewall rule on the subnet's network that allows ingress from the subnet's IP ranges using all protocols on all ports:

    gcloud compute firewall-rules create allow-internal-ingress \
    --network="network-name" \
    --source-ranges="subnetwork internal-IP ranges" \
    --direction="ingress" \
    --action="allow" \
    --rules="all"
    
    Note: The default VPC network in a project, with its default-allow-internal firewall rule, meets this requirement. That rule allows ingress on all ports (tcp:0-65535 and udp:0-65535) and for icmp. However, it also allows ingress from any VM instance on the network.

  • Private Google Access. The subnet must have Private Google Access enabled.

  • External network access. Drivers and executors have internal IP addresses. To let them reach external networks, you can set up Cloud NAT, which allows outbound traffic from internal IPs on your VPC network.
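
The last two requirements can also be configured from the command line. The following is a minimal sketch, not a definitive procedure: it enables Private Google Access on a subnet, then creates a Cloud Router and a Cloud NAT configuration for outbound traffic. The region, network, subnet, router, and NAT names are placeholder assumptions, and the run helper and DRY_RUN variable are hypothetical conveniences that print each command instead of executing it unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: enable Private Google Access and add Cloud NAT for a subnet.
# Every name below is a placeholder assumption; replace with your own values.
set -euo pipefail

REGION="${REGION:-us-central1}"
NETWORK="${NETWORK:-network-name}"
SUBNET="${SUBNET:-subnet-name}"
ROUTER="${ROUTER:-nat-router}"
NAT="${NAT:-nat-config}"

# Hypothetical helper: prints the command by default, runs it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

configure_subnet_egress() {
  # Private Google Access: lets internal-IP workloads reach Google APIs.
  run gcloud compute networks subnets update "$SUBNET" \
    --region="$REGION" \
    --enable-private-ip-google-access

  # Cloud NAT is attached to a Cloud Router in the same network and region.
  run gcloud compute routers create "$ROUTER" \
    --network="$NETWORK" \
    --region="$REGION"

  # NAT for all subnet ranges in the region, with auto-allocated external IPs.
  run gcloud compute routers nats create "$NAT" \
    --router="$ROUTER" \
    --region="$REGION" \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges
}

configure_subnet_egress
```

Running the script as written only prints the three gcloud commands, so they can be reviewed first. Setting DRY_RUN=0 executes them against the active gcloud project. Applying NAT to all subnet ranges is the broadest choice; a narrower configuration may suit networks that host other workloads.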