Cloud Dataproc supports both a single "global" endpoint and "regional"
endpoints based on Compute Engine zones.
Each Dataproc region constitutes an independent resource namespace constrained to deploying instances into Compute Engine zones inside the region. Specifically, you can specify distinct regions, such as europe-west1, to isolate resources (including VM instances and Cloud Storage) and the metadata storage locations used by Cloud Dataproc within the region you specify. This is possible because the underlying infrastructure for Cloud Dataproc, including its control plane, is deployed in each region. The region parameter corresponds to the /regions/<region> segment of the Cloud Dataproc resource URIs being referenced.
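As a rough illustration of how the region appears in a resource URI, the sketch below pulls the value of the /regions/<region> segment out of a path. The helper name region_of, the sample project and cluster names, and the exact path layout (projects/.../regions/.../clusters/...) are assumptions made for the example, not something this page defines.

```python
import re
from typing import Optional

# Assumed shape: .../regions/<region>/... anywhere in the resource path.
_REGION_SEGMENT = re.compile(r"/regions/([^/]+)")


def region_of(resource_uri: str) -> Optional[str]:
    """Return the value of the /regions/<region> segment, or None if absent."""
    match = _REGION_SEGMENT.search(resource_uri)
    return match.group(1) if match else None


# Hypothetical resource paths, used only to exercise the helper.
regional = "projects/my-project/regions/europe-west1/clusters/my-cluster"
global_ns = "projects/my-project/regions/global/clusters/my-cluster"

print(region_of(regional))   # europe-west1
print(region_of(global_ns))  # global
```

Because "global" occupies the same segment as a named region, code that inspects resource URIs can treat both cases uniformly.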
Unless specified, Cloud Dataproc will default to the "global" region. The "global" region is a special multi-region namespace which is capable of interacting with Cloud Dataproc resources across all Compute Engine zones globally.
There are some situations where specifying a regional endpoint may be useful:
- If you use Cloud Dataproc in multiple regions, specifying an explicit regional endpoint may provide better regional isolation and protection.
- You may notice better performance by selecting specific regions, especially based on geography, compared to the default "global" namespace.
Regional endpoint semantics
- Regional endpoint names follow a standard naming convention based on Compute Engine regions. For example, the name for the Central US region is us-central1 and the name of the Western Europe region is europe-west1. You can run the gcloud compute regions list command to see a listing of available regions.
- When new regions are added to Compute Engine, they will also become available for use with Cloud Dataproc.
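Since a Dataproc region only deploys instances into Compute Engine zones inside that region, it can be handy to work out which region a zone belongs to. The sketch below assumes the usual Compute Engine convention that a zone name is the region name followed by a hyphen and a short suffix (for example, us-central1-a); the function name region_from_zone is hypothetical.

```python
def region_from_zone(zone: str) -> str:
    """Derive a region name by dropping the trailing zone suffix.

    Assumes zone names look like "<region>-<suffix>", e.g. "us-central1-a".
    """
    region, separator, suffix = zone.rpartition("-")
    if not separator or not region or not suffix:
        raise ValueError(f"not a zone name of the form <region>-<suffix>: {zone!r}")
    return region


print(region_from_zone("us-central1-a"))   # us-central1
print(region_from_zone("europe-west1-b"))  # europe-west1
```

This is a string-level convenience only; the gcloud compute regions list command remains the authoritative source for which regions exist.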
Using regional endpoints
You can specify a region when using the gcloud command-line tool by passing the --region parameter with a command.
gcloud dataproc clusters create cluster-name --region region ...
Unless specified, the gcloud command assumes a default of global.
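For scripts that shell out to gcloud, one option is to make the region explicit every time rather than rely on the default. The following is a minimal sketch that only assembles the argument list for the command shown above; the function name create_cluster_command is hypothetical, and the fallback to "global" mirrors the default described on this page rather than anything the sketch verifies.

```python
from typing import List, Optional

# Default namespace as described in this document (an assumption of the sketch).
DEFAULT_REGION = "global"


def create_cluster_command(cluster_name: str, region: Optional[str] = None) -> List[str]:
    """Build the argv for `gcloud dataproc clusters create` with an explicit --region."""
    return [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        "--region", region or DEFAULT_REGION,
    ]


# Join the argv only for display; pass the list itself to subprocess.run.
print(" ".join(create_cluster_command("my-cluster", "europe-west1")))
print(" ".join(create_cluster_command("my-cluster")))
```

Passing the list to subprocess.run instead of a joined string avoids shell quoting issues with cluster names.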
You can specify a Cloud Dataproc region through the Cloud Dataproc REST API. Currently, region is a required parameter. You specify the region you want to use, including global, in this parameter.
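To make the required region parameter concrete, here is a small sketch that builds a clusters collection URL and refuses to proceed without a region. The API root, the v1 version string, and the helper name clusters_url are assumptions for illustration; check the REST reference for the exact paths before relying on them.

```python
# Assumed API root and version; verify against the REST reference.
API_ROOT = "https://dataproc.googleapis.com/v1"


def clusters_url(project_id: str, region: str) -> str:
    """Build a clusters collection URL; the region must always be supplied."""
    if not region:
        raise ValueError("region is required; pass 'global' to use the global namespace")
    return f"{API_ROOT}/projects/{project_id}/regions/{region}/clusters"


print(clusters_url("my-project", "global"))
print(clusters_url("my-project", "us-central1"))
```

Failing early on an empty region keeps a malformed URL from ever reaching the API.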
When you use the Google Cloud Platform Console, you specify a Cloud Dataproc region from the Create a cluster page.