Dataproc clusters can be created on Compute Engine sole-tenant nodes. A sole-tenant node is a Compute Engine server that is dedicated to hosting your project's VMs only. Creating a Dataproc cluster on a sole-tenant node keeps the cluster's VMs physically separate from VMs in other projects. The clusters function as standard Dataproc clusters, but with additional hardware isolation to address security and compliance concerns.
Dataproc sole-tenant node clusters are created in a user-specified sole-tenant node group. Each cluster's master, worker, and secondary worker instances will be created within this sole-tenant node group.
First steps
See Before you begin.
Create a sole-tenant node group.
Use autoscaling node groups if you will create autoscaling clusters in the sole-tenant node group.
Node group autoscaling recommendations:
- Make sure the node group's max-nodes is sufficient for the maxInstances of clusters you will create in the sole-tenant node group.
- Use the default or migrate-within-node-group node group maintenance policy; VMs may be unavailable for up to one hour with the restart-in-place policy.
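The recommendations above could be applied when creating the node group, roughly as in the following sketch. The node group name, node template name, zone, and node counts are hypothetical placeholders, and the node template is assumed to exist already. The flags shown are those of gcloud compute sole-tenancy node-groups create as best understood here; confirm them against the gcloud reference for your SDK version, and size max-nodes for your own clusters.

```shell
# All values below are hypothetical placeholders; replace them with your own.
NODE_GROUP="my-node-group"
NODE_TEMPLATE="my-node-template"   # assumed to exist already
ZONE="us-central1-a"
MIN_NODES=1
MAX_NODES=10                       # size this for the maxInstances of your clusters

# By default the command is only printed. Set DRY_RUN to an empty string
# (DRY_RUN= sh script.sh) to actually run it.
DRY_RUN="${DRY_RUN-echo}"

# Create an autoscaling node group that uses the migrate-within-node-group
# maintenance policy rather than restart-in-place.
$DRY_RUN gcloud compute sole-tenancy node-groups create "$NODE_GROUP" \
    --node-template="$NODE_TEMPLATE" \
    --zone="$ZONE" \
    --target-size="$MIN_NODES" \
    --autoscaler-mode=on \
    --min-nodes="$MIN_NODES" \
    --max-nodes="$MAX_NODES" \
    --maintenance-policy=migrate-within-node-group
```

Printing the command first makes it easy to review the autoscaling bounds and maintenance policy before any resources are created.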
Creating a sole-tenant cluster
Before creating a sole-tenant cluster, see the sole-tenant node VM restrictions.
If you create an autoscaling cluster in a sole-tenant node group, it is recommended that the node group also use autoscaling (see Node group autoscaling recommendations).
gcloud Command
To create a sole-tenant cluster, pass the --node-group flag to the gcloud dataproc clusters create command.
Flag notes:
- --region (required): Must match the region of the sole-tenant node group.
- --node-group (required): You can specify the sole-tenant node group name ("node-group-name") or the sole-tenant node group resource URI ("projects/project-id/zones/zone/nodeGroups/node-group-name").
- --zone (required): The cluster zone must match the sole-tenant node group zone.
gcloud dataproc clusters create cluster-name \
    --region=region \
    --zone=zone \
    --node-group=node group resource name or URI \
    ... other args
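Because the region and zone must both match the node group, one way to avoid a mismatch is to derive them from the node group resource URI. The sketch below assumes the URI form given in the flag notes and the usual Compute Engine convention that a zone name is the region name plus a suffix such as "-a". The project, cluster, and node group names are hypothetical.

```shell
# Hypothetical values; replace with your own.
CLUSTER_NAME="my-cluster"
NODE_GROUP_URI="projects/my-project/zones/us-central1-a/nodeGroups/my-node-group"

# Extract the zone segment that follows "/zones/" in the URI.
rest="${NODE_GROUP_URI#*/zones/}"   # us-central1-a/nodeGroups/my-node-group
ZONE="${rest%%/*}"                  # us-central1-a

# Assumption: the region is the zone name without its final "-<suffix>".
REGION="${ZONE%-*}"                 # us-central1

# By default the command is only printed. Set DRY_RUN to an empty string
# to actually run it.
DRY_RUN="${DRY_RUN-echo}"

$DRY_RUN gcloud dataproc clusters create "$CLUSTER_NAME" \
    --region="$REGION" \
    --zone="$ZONE" \
    --node-group="$NODE_GROUP_URI"
```

Deriving both values from a single URI keeps the three required flags consistent; any other cluster arguments would be appended as usual.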
REST API
Create a sole-tenant cluster using a clusters.create request that specifies the NodeGroupAffinity.nodeGroupUri of the sole-tenant node group.
Note: the cluster zone specified in the zoneUri field must match the sole-tenant node group zone.
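As an illustration, a clusters.create request body might look like the sketch below, which places nodeGroupAffinity.nodeGroupUri and zoneUri under config.gceClusterConfig. The project, region, zone, cluster, and node group names are hypothetical, and the body omits all other cluster settings. The commented curl call assumes the v1 regional clusters endpoint and an access token obtained through gcloud.

```shell
# Hypothetical values; replace with your own.
PROJECT_ID="my-project"
REGION="us-central1"
ZONE="us-central1-a"               # must match the node group's zone
CLUSTER_NAME="my-cluster"
NODE_GROUP_URI="projects/${PROJECT_ID}/zones/${ZONE}/nodeGroups/my-node-group"

# Minimal request body: only the fields relevant to sole tenancy are set.
REQUEST_BODY=$(cat <<EOF
{
  "projectId": "${PROJECT_ID}",
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "gceClusterConfig": {
      "zoneUri": "${ZONE}",
      "nodeGroupAffinity": {
        "nodeGroupUri": "${NODE_GROUP_URI}"
      }
    }
  }
}
EOF
)

# Print the body for review.
echo "$REQUEST_BODY"

# To send the request, uncomment the following (assumes an authenticated gcloud):
# curl -X POST \
#   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#   -H "Content-Type: application/json" \
#   -d "$REQUEST_BODY" \
#   "https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"
```

Building the node group URI from the same ZONE variable used for zoneUri keeps the two fields in agreement.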
Console
Currently, creating a sole-tenant Dataproc cluster is not supported in the Google Cloud console.