Regional endpoints

Dataproc supports regional endpoints based on Compute Engine regions. You must specify a region, such as "us-east1" or "europe-west1", when you create a Dataproc cluster. Dataproc isolates cluster resources, such as VM instances, Cloud Storage, and metadata storage, within a zone in the specified region.

You can optionally specify a zone within the cluster region, such as "us-east1-a" or "europe-west1-b", when you create a cluster. If you do not specify a zone, Dataproc Auto Zone Placement chooses a zone within the specified cluster region in which to locate cluster resources.
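
The region/zone relationship described above can be sketched in Python. This is a minimal illustration, not part of any Google library: the helper name zone_in_region is hypothetical, and it assumes that a zone name is the region name followed by a hyphen and a zone suffix, as in the examples above.

```python
def zone_in_region(zone: str, region: str) -> bool:
    """Return True if `zone` (e.g. "us-east1-a") lies in `region` (e.g. "us-east1").

    Assumption: zone names have the form REGION-SUFFIX, as in the examples above.
    """
    # Split on the last hyphen: "us-east1-a" -> ("us-east1", "-", "a").
    zone_region, sep, suffix = zone.rpartition("-")
    return bool(sep) and bool(suffix) and zone_region == region


if __name__ == "__main__":
    print(zone_in_region("us-east1-a", "us-east1"))
    print(zone_in_region("europe-west1-b", "us-east1"))
```

A check like this can catch a mismatched region and zone pair before a create request is sent.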

The regional namespace corresponds to the /regions/REGION segment of Dataproc resource URIs (see, for example, the cluster networkUri).
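
To make the /regions/REGION segment concrete, the following Python sketch pulls the region out of a resource URI. The helper name region_from_uri and the sample URI values are hypothetical and chosen only for illustration.

```python
import re
from typing import Optional

# Matches the "/regions/REGION" segment of a resource URI.
_REGION_SEGMENT = re.compile(r"/regions/([^/]+)")


def region_from_uri(uri: str) -> Optional[str]:
    """Return the REGION part of a URI containing /regions/REGION, else None."""
    match = _REGION_SEGMENT.search(uri)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical cluster URI used only as sample input.
    sample = (
        "https://dataproc.googleapis.com/v1/"
        "projects/my-project/regions/us-central1/clusters/my-cluster"
    )
    print(region_from_uri(sample))
```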

Regional endpoint semantics

Regional endpoint names follow a standard naming convention based on Compute Engine regions. For example, the name of the Central US region is us-central1, and the name of the Western Europe region is europe-west1. Run the gcloud compute regions list command to list the available regions.

Create a cluster

gcloud

When you create a cluster, specify a region using the required --region flag.

gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    other args ...

REST API

Use the REGION URL parameter in a clusters.create request to specify the cluster region.
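
As a sketch of where the REGION parameter sits in the request URL, the following Python helper builds the URL for a clusters.create call. It assumes the v1 REST path layout projects/PROJECT_ID/regions/REGION/clusters on dataproc.googleapis.com; the helper name clusters_create_url is hypothetical, and the sketch does not send the request or handle authentication.

```python
def clusters_create_url(project_id: str, region: str) -> str:
    """Build the URL that a clusters.create request is POSTed to.

    Assumption: the v1 REST path is projects/PROJECT_ID/regions/REGION/clusters.
    """
    if not project_id or not region:
        raise ValueError("project_id and region must be non-empty")
    return (
        "https://dataproc.googleapis.com/v1/"
        f"projects/{project_id}/regions/{region}/clusters"
    )


if __name__ == "__main__":
    print(clusters_create_url("my-project", "us-central1"))
```

An authenticated POST to this URL, with the cluster definition as the JSON request body, would create the cluster in the given region.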

gRPC

Set the client transport address to the regional endpoint using the following pattern:

REGION-dataproc.googleapis.com
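
The pattern above can also be filled in programmatically. The following Python sketch derives the transport address from a region name; the helper name regional_endpoint is hypothetical, and the default port 443 is taken from the client examples on this page.

```python
def regional_endpoint(region: str, port: int = 443) -> str:
    """Return the regional Dataproc address, REGION-dataproc.googleapis.com:PORT."""
    if not region:
        raise ValueError("region must be a non-empty string")
    return f"{region}-dataproc.googleapis.com:{port}"


if __name__ == "__main__":
    print(regional_endpoint("us-central1"))
```

Deriving the address from the same region variable that is passed in the request keeps the endpoint and the request region from drifting apart.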

Python (google-cloud-python) example:

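from google.cloud import dataproc_v1
from google.cloud.dataproc_v1.gapic.transports import cluster_controller_grpc_transport

# Point the gRPC transport at the regional endpoint.
transport = cluster_controller_grpc_transport.ClusterControllerGrpcTransport(
    address='us-central1-dataproc.googleapis.com:443')
client = dataproc_v1.ClusterControllerClient(transport=transport)

project_id = 'my-project'
region = 'us-central1'  # Same region as in the endpoint address.
cluster = {...}

# Start cluster creation and wait for the long-running operation to finish.
operation = client.create_cluster(project_id, region, cluster)
response = operation.result()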

Java (google-cloud-java) example:

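import com.google.cloud.dataproc.v1.Cluster;
import com.google.cloud.dataproc.v1.ClusterControllerClient;
import com.google.cloud.dataproc.v1.ClusterControllerSettings;

// Point the client at the regional endpoint.
ClusterControllerSettings settings =
    ClusterControllerSettings.newBuilder()
        .setEndpoint("us-central1-dataproc.googleapis.com:443")
        .build();
try (ClusterControllerClient clusterControllerClient = ClusterControllerClient.create(settings)) {
  String projectId = "my-project";
  String region = "us-central1";
  // Placeholder: set the cluster name and configuration before sending.
  Cluster cluster = Cluster.newBuilder().build();
  // createClusterAsync returns a future; get() blocks until the operation completes.
  Cluster response =
      clusterControllerClient.createClusterAsync(projectId, region, cluster).get();
}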

Console

Specify a Dataproc region in the Location section of the Set up cluster panel on the Dataproc Create a cluster page in the Google Cloud console.

What's next