Step 4: Set up components
This page describes the fourth step to deploy Cortex Framework Data Foundation, the core of Cortex Framework. In this step, you set up the Google Cloud services required for deployment.
In this section, you enable the following Google Cloud services in your Google Cloud project:
- BigQuery instance and datasets
- Cloud Build API
- Cloud Storage Buckets
- Service Account (optional)
- Cloud Resource Manager API
Enable these Google Cloud Services using Cloud Shell:
Copy and paste the following command:
gcloud config set project SOURCE_PROJECT

gcloud services enable bigquery.googleapis.com \
  cloudbuild.googleapis.com \
  composer.googleapis.com \
  storage-component.googleapis.com \
  cloudresourcemanager.googleapis.com \
  dataflow.googleapis.com
Replace SOURCE_PROJECT with your source project ID. A success message confirms that the Google Cloud services are enabled.
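To double-check which of these APIs are active, you can list the enabled services and filter for them. A quick sketch, assuming you have already set the project with the preceding command (the filter syntax follows standard gcloud filter expressions):

```shell
# List enabled services, filtered to the APIs required by Data Foundation.
gcloud services list --enabled \
  --filter="config.name:(bigquery.googleapis.com OR cloudbuild.googleapis.com OR composer.googleapis.com OR cloudresourcemanager.googleapis.com OR dataflow.googleapis.com)"
```

Each required API should appear in the output; any missing entry indicates a service that still needs to be enabled.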
Optional. You can enable the following Google Cloud services in your Google Cloud Project:
- Cloud Composer for Change Data Capture (CDC) processing, hierarchy flattening (SAP only), and data replication (Non-SAP only) through Directed Acyclic Graphs (DAGs). To set up an instance, see Cloud Composer documentation.
- Looker for connecting to reporting templates.
- Analytics Hub linked datasets are used for some external sources, such as the Weather DAG. For advanced scenarios, you can populate this structure with any other available source of your choice.
- Dataflow: an integration tool for many of the Marketing datasets, such as Google Ads.
- Dataplex: Used for building a Data Mesh. For more information see the Data Mesh User Guide.
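If you plan to use the optional services, their APIs can be enabled with the same pattern as the earlier command. A sketch; the API endpoint names analyticshub.googleapis.com and dataplex.googleapis.com are the standard service identifiers, not taken from this page, so verify them against your project's needs:

```shell
# Optional: enable the Analytics Hub and Dataplex APIs in the source project.
gcloud services enable \
  analyticshub.googleapis.com \
  dataplex.googleapis.com
```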
Grant permissions to the executing user
To execute the deployment in the project where Cloud Build is triggered, grant the following permissions to the executing user:
- Service Usage Consumer
- Storage Object Viewer for the Cloud Build default bucket or bucket for logs
- Object Writer to the output buckets
- Cloud Build Editor
- Project Viewer or Storage Object Viewer
For more information about granting these permissions, see the following documentation:
- Permissions to run Cloud Build.
- Permissions to storage for the Build Account.
- Permissions for the Cloud Build service account.
- Viewing logs from Builds.
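The project-level grants in the list above can also be made from the CLI. A hedged sketch, where USER_EMAIL is a placeholder for the executing user's account (bucket-level grants such as Storage Object Viewer on the logs bucket are applied on the bucket itself, not shown here):

```shell
# Grant the executing user the project-level roles needed to trigger the build.
# USER_EMAIL and SOURCE_PROJECT are placeholders.
for role in roles/serviceusage.serviceUsageConsumer \
            roles/cloudbuild.builds.editor \
            roles/viewer; do
  gcloud projects add-iam-policy-binding SOURCE_PROJECT \
    --member="user:USER_EMAIL" \
    --role="${role}"
done
```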
Configure the Cloud Build account
Cloud Build uses a service account to execute builds on your behalf. To grant the Cloud Build service account the permissions required to deploy Cortex Framework, follow these steps.
Find the default Cloud Build service account by opening Cloud Shell and executing the following command:
gcloud builds get-default-service-account --project PROJECT_ID
The response is formatted as either of the following:

# Response one
serviceAccountEmail: projects/PROJECT_NUMBER/serviceAccounts/PROJECT_NUMBER-compute@developer.gserviceaccount.com

# Response two
serviceAccountEmail: projects/PROJECT_NUMBER/serviceAccounts/PROJECT_NUMBER@cloudbuild.gserviceaccount.com
In the response, PROJECT_NUMBER is replaced with your project number. Either of these service account emails is your default Cloud Build service account. You can identify this service account in IAM by looking for the @developer.gserviceaccount.com or @cloudbuild.gserviceaccount.com account.
Grant the following permissions to the Cloud Build service account in the source project (and in the target project, if you deploy to a separate target) through the console or the Google Cloud CLI:
- Cloud Build Service Account (roles/cloudbuild.builds.builder)
- Service Account User (roles/iam.serviceAccountUser)
- BigQuery Data Editor (roles/bigquery.dataEditor)
- BigQuery Job User (roles/bigquery.jobUser)
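In the gcloud commands that follow, the service account email (CLOUD_BUILD_SA) can be captured into a shell variable instead of pasted by hand. A sketch, assuming the same PROJECT_ID placeholder as the earlier command:

```shell
# Capture the default Cloud Build service account email for reuse.
CLOUD_BUILD_SA="$(gcloud builds get-default-service-account \
  --project PROJECT_ID \
  --format='value(serviceAccountEmail)')"
# The value may carry a projects/.../serviceAccounts/ prefix; keep only the email.
CLOUD_BUILD_SA="${CLOUD_BUILD_SA##*/}"
echo "${CLOUD_BUILD_SA}"
```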
Console
In the Google Cloud console, go to the IAM page.
Select your source project.
Click Grant access.
Add the default Cloud Build service account from the preceding step as a new principal.
From the Select a role drop-down menu, search for Cloud Build Service Account, then click Cloud Build Service Account.
Repeat the previous step to add the rest of the permissions: Service Account User, BigQuery Data Editor, and BigQuery Job User.
Click Save.
Verify that the service account and the corresponding roles are listed on the IAM page. This confirms that the IAM roles were granted successfully.
gcloud
Use the following command to grant the roles to the Cloud Build service account:
gcloud projects add-iam-policy-binding SOURCE_PROJECT \
  --member="serviceAccount:CLOUD_BUILD_SA" \
  --role="roles/cloudbuild.builds.builder"

gcloud projects add-iam-policy-binding SOURCE_PROJECT \
  --member="serviceAccount:CLOUD_BUILD_SA" \
  --role="roles/iam.serviceAccountUser"

gcloud projects add-iam-policy-binding SOURCE_PROJECT \
  --member="serviceAccount:CLOUD_BUILD_SA" \
  --role="roles/bigquery.dataEditor"

gcloud projects add-iam-policy-binding SOURCE_PROJECT \
  --member="serviceAccount:CLOUD_BUILD_SA" \
  --role="roles/bigquery.jobUser"
Replace the placeholder values in the command with the following:
- SOURCE_PROJECT: the source project ID.
- CLOUD_BUILD_SA: the Cloud Build default service account.
For more information, see Granting a role to the Cloud Build service account using the IAM page and Set and manage IAM policies on buckets.
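To verify that the grants took effect, you can inspect the project's IAM policy filtered to the service account. A sketch, using the same placeholders:

```shell
# List the roles bound to the Cloud Build service account.
gcloud projects get-iam-policy SOURCE_PROJECT \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:CLOUD_BUILD_SA" \
  --format="table(bindings.role)"
```

The output table should include all four roles granted in the previous step.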
Optional steps
To further customize your deployment, consider the following optional steps:
- Data Mesh: If you need to modify default values for Data Mesh for implementing features beyond descriptions, see the Data Mesh concepts and the Data Mesh user guide.
- Service Account for deployment: If you need to enhance security, simplify deployment, and improve auditability, see Create a Service Account for deployment.
Create a Storage bucket for storing DAG related files
A storage bucket is required to store processing DAG scripts and other temporary files generated during deployment. These scripts need to be manually moved into a Cloud Composer or Apache Airflow instance after deployment.
You can create the storage bucket from Google Cloud CLI or Google Cloud console with the following steps.
Console
Go to Cloud Storage.
Create a bucket in the same region as your BigQuery datasets.
Select the created bucket.
Go to the Permissions tab.
Grant the permission Storage Object Creator to the user ID executing the Build command or to the service account you created. For more information, see Set a new condition on a bucket: Console.
gcloud
Create a bucket from the Cloud Shell with the following command:
gcloud storage buckets create gs://DAG_BUCKET_NAME -l REGION/MULTI_REGION
Replace the following:
- DAG_BUCKET_NAME: the name for the new bucket.
- REGION/MULTI_REGION: the same region as your BigQuery datasets.
Use the following command to assign the permission Storage Object Creator to the service account:

gsutil iam ch serviceAccount:CLOUD_BUILD_SA:roles/storage.objectCreator gs://DAG_BUCKET_NAME
Replace the following:
- CLOUD_BUILD_SA: the Cloud Build default service account.
- DAG_BUCKET_NAME: the name for the new bucket.
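gsutil is in maintenance mode; if you prefer the newer gcloud storage surface, an equivalent bucket-level grant would be the following (same placeholders as above):

```shell
# Equivalent grant using gcloud storage instead of gsutil.
gcloud storage buckets add-iam-policy-binding gs://DAG_BUCKET_NAME \
  --member="serviceAccount:CLOUD_BUILD_SA" \
  --role="roles/storage.objectCreator"
```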
Create a Storage bucket for logs
You can create a specific bucket for the Cloud Build process to store the logs. This is useful if you want to restrict data that may be stored in logs to a specific region. You can create the storage bucket for logs from Google Cloud CLI or Google Cloud console.
Console
To create a specific bucket for the logs, follow these steps:
Go to Cloud Storage.
Create a bucket in the same region where the deployment would run.
Select the created bucket.
Go to the Permissions tab.
Grant the permission Storage Object Admin to the user ID executing the Build command or to the service account you created. For more information, see Set a new condition on a bucket: Console.
gcloud
To create a specific bucket for the logs, use the following commands.
Create a bucket from the Cloud Shell with the following command:
gcloud storage buckets create gs://LOGS_BUCKET_NAME -l REGION/MULTI_REGION
Replace the following:
- REGION/MULTI_REGION: the region in which to create the bucket.
- LOGS_BUCKET_NAME: the name for the new bucket.
Use the following command to assign the permission Storage Object Admin to the service account:

gsutil iam ch serviceAccount:CLOUD_BUILD_SA:roles/storage.objectAdmin gs://LOGS_BUCKET_NAME
Replace the following:
- CLOUD_BUILD_SA: the Cloud Build default service account.
- LOGS_BUCKET_NAME: the name for the new bucket.
Next steps
After you complete this step, move on to the following deployment steps:
- Establish workloads.
- Clone repository.
- Determine integration mechanism.
- Set up components (this page).
- Configure deployment.
- Execute deployment.