Dataproc Service Account based Secure Multi-tenancy (called "secure multi-tenancy", below) enables you to share a cluster with multiple users, with a set of users mapped to service accounts when the cluster is created. With secure multi-tenancy, users can submit interactive workloads to the cluster with isolated user identities.
When a user submits a job to the cluster, the job:
runs as a specific OS user with a specific Kerberos principal
accesses Google Cloud resources using the mapped service account credentials
Considerations and Limitations
When you create a cluster with secure multi-tenancy enabled:
The cluster is available only to users with mapped service accounts. For example, unmapped users cannot run jobs on the cluster.
The Dataproc Component Gateway is not enabled.
Direct SSH access to the cluster and Compute Engine features, such as the ability to run startup scripts on cluster VMs, are blocked. Also, jobs cannot run with
Kerberos is enabled and configured on the cluster for secure intra-cluster communication.
Dataproc Workflows are not supported.
Creating a secure multi-tenancy cluster
To create a Dataproc secure multi-tenancy cluster, set the user-mapping cluster property (passed through the --properties flag) to a list of user-to-service-account mappings.
The following command creates a cluster with two users, each mapped to a service account:
gcloud dataproc clusters create my-cluster \
    --properties="^#^dataproc:firstname.lastname@example.org:email@example.com,firstname.lastname@example.org:email@example.com" \
    --scopes=cloud-platform \
    --firstname.lastname@example.org \
    --region=region \
    other args ...
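The mapping value is a comma-separated list of user:service-account pairs, so it collides with the comma that gcloud normally uses to separate entries in --properties. The ^#^ prefix switches that separator to #. The following is a minimal sketch of assembling the flag value in shell. The user and service-account names are hypothetical, PROPERTY_KEY is a placeholder that must be replaced with the user-mapping property, and build_user_mapping is an illustrative helper, not part of gcloud.

```shell
# Hypothetical helper: joins "user:service-account" pairs with commas.
build_user_mapping() {
  mapping_out=""
  for pair in "$@"; do
    if [ -z "$mapping_out" ]; then
      mapping_out="$pair"
    else
      mapping_out="$mapping_out,$pair"
    fi
  done
  printf '%s\n' "$mapping_out"
}

# Placeholder identities (assumptions, not taken from the original command).
mapping="$(build_user_mapping \
  "user-a@example.com:sa-a@my-project.iam.gserviceaccount.com" \
  "user-b@example.com:sa-b@my-project.iam.gserviceaccount.com")"

# Placeholder key: replace with the user-mapping cluster property.
PROPERTY_KEY="dataproc:REPLACE_WITH_USER_MAPPING_PROPERTY"

# "^#^" makes "#" the separator between properties, so the commas
# inside the mapping value are passed through unchanged.
properties_arg="^#^${PROPERTY_KEY}=${mapping}"

printf '%s\n' "--properties=${properties_arg}"
```

Generating the value from a list of pairs keeps the mapping in one reviewable place and avoids hand-editing a long quoted string.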
As shown in the above command, the cluster --scopes must be set to cloud-platform (necessary for the cluster service account to perform impersonation).
The cluster service account must have permissions to impersonate the service accounts mapped to the users (see Managing service account impersonation).
Recommendation: Use different cluster service accounts for different clusters to allow each cluster service account to impersonate only a limited, intended group of mapped user service accounts.
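One way to grant the impersonation permission is an IAM policy binding on each mapped user service account, naming the cluster service account as the member. The sketch below only prints the gcloud commands so they can be reviewed before they are run. It assumes that roles/iam.serviceAccountTokenCreator is the role used for impersonation in your project; the service-account names and the print_impersonation_bindings helper are hypothetical.

```shell
# Placeholder cluster service account (assumption).
CLUSTER_SA="cluster-sa@my-project.iam.gserviceaccount.com"

# Hypothetical helper: prints one binding command per mapped user
# service account. It does not call gcloud itself (dry run).
print_impersonation_bindings() {
  cluster_sa="$1"
  shift
  for user_sa in "$@"; do
    printf 'gcloud iam service-accounts add-iam-policy-binding %s --member=serviceAccount:%s --role=roles/iam.serviceAccountTokenCreator\n' \
      "$user_sa" "$cluster_sa"
  done
}

# Placeholder mapped user service accounts (assumptions).
bindings="$(print_impersonation_bindings "$CLUSTER_SA" \
  "sa-a@my-project.iam.gserviceaccount.com" \
  "sa-b@my-project.iam.gserviceaccount.com")"

printf '%s\n' "$bindings"
```

Passing only the service accounts mapped on a given cluster keeps each cluster service account limited to its intended group, in line with the recommendation above.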