Dataproc Service Account based Secure Multi-tenancy (called "secure multi-tenancy" below) lets you share a cluster among multiple users, with a set of users mapped to service accounts when the cluster is created. With secure multi-tenancy, users can submit interactive workloads to the cluster with isolated user identities.
When a user submits a job to the cluster, the job:
runs as a specific OS user with a specific Kerberos principal
accesses Google Cloud resources using the mapped service account credentials
Considerations and Limitations
When you create a cluster with secure multi-tenancy enabled:
The cluster is available only to users with mapped service accounts. For example, unmapped users cannot run jobs on the cluster.
The Dataproc Component Gateway is not enabled.
Direct SSH access to the cluster and Compute Engine features, such as the ability to run startup scripts on cluster VMs, are blocked. Also, jobs cannot run with sudo privileges.
Kerberos is enabled and configured on the cluster for secure intra-cluster communication.
Dataproc Workflows are not supported.
Creating a secure multi-tenancy cluster
To create a Dataproc secure multi-tenancy cluster, use the dataproc:dataproc.beta.secure.multi-tenancy.user.mapping cluster property to specify a list of user-to-service-account mappings.
Example:
The following command creates a cluster, with user bob@my-company.com mapped to service account service-account-for-bob@iam.gserviceaccount.com and user alice@my-company.com mapped to service account service-account-for-alice@iam.gserviceaccount.com.
gcloud dataproc clusters create my-cluster \
    --properties="^#^dataproc:dataproc.beta.secure.multi-tenancy.user.mapping=bob@my-company.com:service-account-for-bob@iam.gserviceaccount.com,alice@my-company.com:service-account-for-alice@iam.gserviceaccount.com" \
    --scopes=cloud-platform \
    --service-account=cluster-service-account@iam.gserviceaccount.com \
    --region=region \
    other args ...
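For longer user lists, the property value can be assembled in a script rather than typed by hand. The following is a minimal sketch, not an official tool: build_user_mapping is a hypothetical helper name, and the accounts are the sample ones from the example. It joins user:service-account pairs with commas and prefixes the value with ^#^, which tells gcloud to use # rather than a comma as the list delimiter so that the commas inside the mapping stay part of a single property value.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gcloud): join "user:service-account"
# pairs into one comma-separated string.
build_user_mapping() {
  local IFS=','
  # "$*" joins all arguments using the first character of IFS.
  printf '%s' "$*"
}

mapping="$(build_user_mapping \
  "bob@my-company.com:service-account-for-bob@iam.gserviceaccount.com" \
  "alice@my-company.com:service-account-for-alice@iam.gserviceaccount.com")"

# The "^#^" prefix switches the gcloud list delimiter to "#", so the
# commas in the mapping are not read as separators between properties.
properties="^#^dataproc:dataproc.beta.secure.multi-tenancy.user.mapping=${mapping}"

# Print the value to pass as --properties="..." when creating the cluster.
echo "${properties}"
```

The printed string can then be passed to the --properties flag of the cluster creation command.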
Notes:
As shown in the gcloud dataproc clusters create command above, the cluster --scopes must be set to cloud-platform (necessary for the cluster service account to perform impersonation).
The cluster service account must have permissions to impersonate the service accounts mapped to the users (see Managing service account impersonation).
Recommendation: Use different cluster service accounts for different clusters to allow each cluster service account to impersonate only a limited, intended group of mapped user service accounts.
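One way to grant the impersonation permission described in the notes is to add an IAM binding on each mapped service account. The sketch below only prints the commands for review and does not run them. It assumes that roles/iam.serviceAccountTokenCreator is the appropriate impersonation role for your setup (confirm this against Managing service account impersonation). grant_commands is a hypothetical helper name, and the accounts are the sample ones from the example above.

```shell
#!/usr/bin/env bash
CLUSTER_SA="cluster-service-account@iam.gserviceaccount.com"
MAPPING="bob@my-company.com:service-account-for-bob@iam.gserviceaccount.com,alice@my-company.com:service-account-for-alice@iam.gserviceaccount.com"

# Assumption: this role is what allows the cluster service account to
# impersonate the mapped service accounts; verify it for your project.
ROLE="roles/iam.serviceAccountTokenCreator"

# Hypothetical helper: print one binding command per mapped service account.
# Arguments: cluster service account, user mapping string, role.
grant_commands() {
  printf '%s\n' "$2" | tr ',' '\n' | while IFS= read -r pair; do
    # "${pair#*:}" drops the "user:" prefix, leaving the service account.
    printf 'gcloud iam service-accounts add-iam-policy-binding %s --member=serviceAccount:%s --role=%s\n' \
      "${pair#*:}" "$1" "$3"
  done
}

commands="$(grant_commands "$CLUSTER_SA" "$MAPPING" "$ROLE")"

# Review the printed commands before running any of them.
echo "${commands}"
```

Because each cluster gets its own list of bindings, this approach fits the recommendation to use a separate cluster service account per cluster: each one is granted impersonation only on the service accounts in that cluster's mapping.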