When you create a Dataproc cluster, cluster resources use a regional endpoint based on Compute Engine zones. When you choose a region, you can select a zone within that region, or you can omit the zone to have the Dataproc Auto Zone feature select a zone for you in the region you choose. Once a zone is selected, all nodes for that cluster are deployed to that zone.
Auto Zone and resource reservations
Auto Zone prioritizes creating a cluster in a zone with resource reservations, as follows:
If requested cluster resources can be fully satisfied by reserved, plus, if necessary, on-demand resources in a zone, Auto Zone will consume the reserved and on-demand resources, and create the cluster in that zone.
Auto Zone prioritizes zones for selection according to the total CPU core (vCPU) reservations in a zone.
Example: A cluster creation request specifies 20 n2-standard-2 and 1 n2-standard-64 (40 + 64 vCPUs requested). Auto Zone will prioritize the following zones for selection according to the total vCPU reservations available in the zone:
- zone-c available reservations: 3 n2-standard-2 and 1 n2-standard-64 (70 vCPUs)
- zone-b available reservations: 1 n2-standard-64 (64 vCPUs)
- zone-a available reservations: 25 n2-standard-2 (50 vCPUs)
Assuming each of the above zones has additional on-demand vCPU and other resources sufficient to satisfy the cluster request, Auto Zone will select zone-c for cluster creation.
If requested cluster resources cannot be fully satisfied by reserved plus on-demand resources in a zone, Auto Zone will create the cluster in a zone that is most likely to satisfy the request using on-demand resources.
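The vCPU arithmetic in the reservation example can be checked with a short Python sketch. It only reproduces the ranking by total reserved vCPUs described above; it is not Dataproc's actual selection logic, which also accounts for on-demand capacity and other resources. The helper names (total_vcpus, rank_zones) and the data layout are hypothetical, and the vCPU counts are read off the machine type names.

```python
# Illustrative only: ranks zones by total reserved vCPUs, as in the example.
# The function names and dict layout are assumptions, not a Dataproc API.

# vCPUs per machine type, as implied by the machine type names.
VCPUS = {"n2-standard-2": 2, "n2-standard-64": 64}


def total_vcpus(machines):
    """Sum vCPUs over a {machine_type: count} mapping."""
    return sum(VCPUS[machine_type] * count for machine_type, count in machines.items())


def rank_zones(reservations):
    """Return zone names sorted by total reserved vCPUs, largest first."""
    return sorted(
        reservations,
        key=lambda zone: total_vcpus(reservations[zone]),
        reverse=True,
    )


# The cluster request and per-zone reservations from the example.
request = {"n2-standard-2": 20, "n2-standard-64": 1}
reservations = {
    "zone-a": {"n2-standard-2": 25},
    "zone-b": {"n2-standard-64": 1},
    "zone-c": {"n2-standard-2": 3, "n2-standard-64": 1},
}

requested = total_vcpus(request)
ranking = rank_zones(reservations)

print(requested)  # 104 (40 + 64)
print(ranking)    # ['zone-c', 'zone-b', 'zone-a']
```

None of the three zones reserves the full 104 vCPUs requested, which is why the example also assumes that each zone has enough on-demand capacity to cover the remainder.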
Using Auto Zone placement
Console
To create a Dataproc cluster that uses Auto Zone placement:
- In the Google Cloud console, open the Dataproc "Create a Dataproc cluster on Compute Engine" page. The "Set up cluster" panel is selected.
- In the Location section:
- Select a Region for your cluster.
- Under Zone, select "Any".
gcloud command
To create a Dataproc cluster that uses Auto Zone placement, use the gcloud dataproc clusters create command. Set the --region flag to a region, and omit the --zone flag (or leave the flag empty: --zone= or --zone="").
gcloud dataproc clusters create cluster-name \
    --region=region \
    --zone="" \
    other args ...
REST API
To create a Dataproc cluster that uses Auto Zone placement, construct a JSON clusters.create API request, leaving the gceClusterConfig.zoneUri field empty. In the REST endpoint, https://dataproc.googleapis.com/v1/projects/projectId/regions/region/clusters, insert a region name. Dataproc Auto Zone will choose a zone for the cluster within the specified region.
Use short resource names with Auto Zone placement: When specifying a resource URI, such as machineTypeUri or acceleratorTypeUri, in an Auto Zone placement REST API cluster creation request, use a short resource name without a zone specification, for example, "n1-standard-2" or "nvidia-tesla-t4".
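As a rough illustration of the REST guidance above, the following Python sketch builds the endpoint URL and a JSON request body with an empty gceClusterConfig.zoneUri and short machine type names. The project, region, and cluster names, the instance counts, and the build_create_request helper are all hypothetical placeholders. The sketch only constructs the request and does not send it, because a real call also needs an authenticated HTTP client.

```python
import json


def build_create_request(project_id, region, cluster_name):
    """Build the URL and body for a clusters.create call using Auto Zone.

    Hypothetical helper for illustration; it does not contact the API.
    """
    url = (
        "https://dataproc.googleapis.com/v1/"
        f"projects/{project_id}/regions/{region}/clusters"
    )
    body = {
        "projectId": project_id,
        "clusterName": cluster_name,
        "config": {
            # An empty zoneUri lets Auto Zone pick a zone in the region.
            "gceClusterConfig": {"zoneUri": ""},
            # Short resource names: no zone component in machineTypeUri.
            "masterConfig": {
                "numInstances": 1,
                "machineTypeUri": "n1-standard-2",
            },
            "workerConfig": {
                "numInstances": 2,
                "machineTypeUri": "n1-standard-2",
            },
        },
    }
    return url, body


# Placeholder values; substitute your own project, region, and cluster name.
url, body = build_create_request("my-project", "us-central1", "my-cluster")
print(url)
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of a POST request to the printed URL by an authenticated client.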