By default, Cloud Dataprep can access data within the Google Cloud Platform project from which it is run. To give your Cloud Dataprep project access to a Cloud Storage bucket owned by a different Google Cloud Platform project, you must make the bucket accessible to the service accounts in your Cloud Dataprep project, and then manually enter that Cloud Storage location in the UI.
Finding your project's service accounts
The following Cloud Dataprep-related service accounts are listed on the Google Cloud console IAM & Admin→Permissions page of your Cloud Dataprep project. In each name, project-number stands for your project's number:
- Google Compute Engine:
project-number-compute@developer.gserviceaccount.com
- Cloud Dataprep:
service-project-number@trifacta-gcloud-prod.iam.gserviceaccount.com
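As a minimal sketch, both account names can be built from the project number. The helper name `dataprep_service_accounts` and the sample number are hypothetical, chosen only for illustration; in practice the number could be read with `gcloud projects describe PROJECT_ID --format='value(projectNumber)'`.

```shell
# Print the two service account emails for a given project number.
# dataprep_service_accounts is a hypothetical helper, not part of any CLI.
dataprep_service_accounts() {
  local project_number="$1"
  # Google Compute Engine default service account
  printf '%s-compute@developer.gserviceaccount.com\n' "$project_number"
  # Cloud Dataprep service account
  printf 'service-%s@trifacta-gcloud-prod.iam.gserviceaccount.com\n' "$project_number"
}

# Example with a made-up project number:
dataprep_service_accounts 123456789012
```

The printed names should match the entries shown on the IAM & Admin page; if they differ, use the names from the page.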
Granting service account access to a bucket
You can run Google Cloud CLI gsutil commands to grant your project's service accounts ownership (read/write permission) of both the bucket and its contents.
To grant your Cloud Dataprep project's service accounts access to new objects created in a Cloud Storage bucket in another project, use the following gsutil defacl commands in your shell or terminal window:
gsutil defacl ch -u project-number-compute@developer.gserviceaccount.com:OWNER gs://bucket
gsutil defacl ch -u service-project-number@trifacta-gcloud-prod.iam.gserviceaccount.com:OWNER gs://bucket
To grant your Cloud Dataprep project's service accounts access to a Cloud Storage bucket in another project and to the objects currently in that bucket, use the following gsutil acl commands in your shell or terminal window:
gsutil acl ch -u project-number-compute@developer.gserviceaccount.com:OWNER gs://bucket
gsutil -m acl ch -r -u project-number-compute@developer.gserviceaccount.com:OWNER gs://bucket
gsutil acl ch -u service-project-number@trifacta-gcloud-prod.iam.gserviceaccount.com:OWNER gs://bucket
gsutil -m acl ch -r -u service-project-number@trifacta-gcloud-prod.iam.gserviceaccount.com:OWNER gs://bucket
To grant your Cloud Dataprep project's service accounts access to both current and new objects in a Cloud Storage bucket in another project, run both sets of commands listed above.
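The combined grant can be sketched as a small shell function that loops over the accounts. The function name `grant_bucket_access` and the `GSUTIL` override variable are hypothetical conveniences, not gsutil features; setting `GSUTIL=echo` prints each command instead of running it, which allows a dry run before touching the bucket.

```shell
# Grant OWNER on a bucket, its current objects, and its default object ACL
# to every account passed after the bucket URL.
# grant_bucket_access and GSUTIL are hypothetical names for this sketch.
grant_bucket_access() {
  local bucket="$1"
  shift
  local account
  for account in "$@"; do
    # New objects: change the bucket's default object ACL.
    "${GSUTIL:-gsutil}" defacl ch -u "${account}:OWNER" "$bucket"
    # The bucket itself.
    "${GSUTIL:-gsutil}" acl ch -u "${account}:OWNER" "$bucket"
    # Objects already in the bucket (recursive, multi-threaded).
    "${GSUTIL:-gsutil}" -m acl ch -r -u "${account}:OWNER" "$bucket"
  done
}

# Dry run using the placeholder names from this page.
GSUTIL=echo grant_bucket_access gs://bucket \
  project-number-compute@developer.gserviceaccount.com \
  service-project-number@trifacta-gcloud-prod.iam.gserviceaccount.com
```

Dropping the `GSUTIL=echo` prefix would run the real commands, so the bucket and account names should be replaced with actual values first.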
Entering the bucket path in the Cloud Dataprep UI
You can access a bucket in the Cloud Dataprep UI by manually entering its Cloud Storage path. The UI will not accept the path if you have not given the Cloud Dataprep service account access to the bucket (see Granting service account access to a bucket).
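Before entering the path, it may help to check that the service account actually appears in the bucket ACL. The sketch below assumes that `gsutil acl get` prints the ACL as JSON with an `email` field per entry; the helper name `acl_lists_account` is hypothetical.

```shell
# Succeed if the ACL JSON on stdin contains the given account email.
# acl_lists_account is a hypothetical helper; it only checks that the
# quoted email appears, not which role the account holds.
acl_lists_account() {
  grep -F -q -- "\"$1\""
}

# Intended use (requires gsutil and permission to read the bucket ACL):
#   gsutil acl get gs://bucket | acl_lists_account \
#     service-project-number@trifacta-gcloud-prod.iam.gserviceaccount.com \
#     && echo "service account is listed"
```

A non-zero exit status suggests the grant commands have not been applied to this bucket, or that the ACL output uses a different layout than assumed here.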
Removing service account access to a bucket
If you have granted service account access to a bucket, you can run the following Google Cloud CLI gsutil defacl and acl commands to remove your project's service accounts' ownership (read/write permission) of the bucket and its contents.
gsutil defacl ch -d project-number-compute@developer.gserviceaccount.com gs://bucket
gsutil defacl ch -d service-project-number@trifacta-gcloud-prod.iam.gserviceaccount.com gs://bucket
gsutil acl ch -d project-number-compute@developer.gserviceaccount.com gs://bucket
gsutil -m acl ch -r -d project-number-compute@developer.gserviceaccount.com gs://bucket
gsutil acl ch -d service-project-number@trifacta-gcloud-prod.iam.gserviceaccount.com gs://bucket
gsutil -m acl ch -r -d service-project-number@trifacta-gcloud-prod.iam.gserviceaccount.com gs://bucket