Configure your Google Cloud project for Vertex Pipelines

Before you use Vertex Pipelines to orchestrate your machine learning (ML) pipelines, you must set up your Google Cloud project. Some resources, such as the metadata store used by Vertex ML Metadata, are created in your Cloud project the first time that you run a pipeline.

Use the following instructions to configure your project for Vertex Pipelines.

  1. Create your Cloud project and configure it for use with Vertex Pipelines.

  2. If you do not specify a service account, Vertex Pipelines uses the Compute Engine default service account to run your pipelines. The Compute Engine default service account has the Project Editor role by default.

    We recommend that you create a service account to run your pipelines and then grant this account granular permissions to the Google Cloud resources that are needed to run your pipeline.

  3. Vertex Pipelines uses Cloud Storage to store the artifacts of your pipeline runs. Create a Cloud Storage bucket and grant your service account access to this bucket.

  4. Vertex Pipelines uses Vertex ML Metadata to store the metadata created by your pipeline runs. The first time that you run a pipeline, Vertex AI creates your project's metadata store.

    If you want your data encrypted using a customer-managed encryption key (CMEK), you must create your metadata store with a CMEK before you run a pipeline.

    After the metadata store has been created, the CMEK that the metadata store uses is independent of the CMEK used in a pipeline run.

Set up your Google Cloud project

Use the following instructions to create a Cloud project and configure it for use with Vertex Pipelines.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the Vertex AI and Cloud Storage APIs.

    Enable the APIs

  5. Install and initialize the Cloud SDK.
  6. Update and install gcloud components:
    gcloud components update &&
    gcloud components install beta

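As a command-line alternative to the console step that enables the APIs, the same result can probably be reached with gcloud. The following is a minimal sketch, not the documented procedure. The project ID my-project, the run helper, and the DRY_RUN variable are hypothetical names introduced for illustration, and the service names are assumptions that you should confirm with gcloud services list --available before relying on them.

```shell
#!/bin/sh
# Hypothetical placeholder; replace with your own project ID.
PROJECT_ID="my-project"

# Illustrative helper: print each command by default,
# and execute it only when DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Enable the Vertex AI and Cloud Storage APIs.
# Assumption: these are the service names for the two APIs.
run gcloud services enable \
    aiplatform.googleapis.com \
    storage-component.googleapis.com \
    --project="$PROJECT_ID"

# List the enabled services to confirm the result.
run gcloud services list --enabled --project="$PROJECT_ID"
```

By default the script only prints the commands so that you can review them; run it with DRY_RUN=0 to execute them.
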
Configure a service account with granular permissions

When you run a pipeline, you can specify a service account. Your pipeline run acts with the permissions of this service account.

If you do not specify a service account, your pipeline run uses the Compute Engine default service account. The Compute Engine default service account has the Project Editor role by default.

  • Use the following instructions to create a service account and grant it granular permissions to Google Cloud resources.

    1. Run the following command to create a service account.

      gcloud iam service-accounts create SERVICE_ACCOUNT_ID \
          --description="DESCRIPTION" \
          --display-name="DISPLAY_NAME" \
          --project=PROJECT_ID
      

      Replace the following values:

      • SERVICE_ACCOUNT_ID: The ID for the service account.
      • DESCRIPTION: (Optional.) A description of the service account.
      • DISPLAY_NAME: The display name for this service account.
      • PROJECT_ID: The project to create your service account in.

      Learn more about creating a service account.

    2. Grant your service account access to Vertex AI.

      gcloud projects add-iam-policy-binding PROJECT_ID \
          --member="serviceAccount:SERVICE_ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com" \
          --role="roles/aiplatform.user"
      

      Replace the following values:

      • PROJECT_ID: The project that your service account was created in.
      • SERVICE_ACCOUNT_ID: The ID for the service account.
    3. If your pipelines use container images hosted in Container Registry, grant your service account access to pull images.

    4. Grant your service account access to any Google Cloud resources that you use in your pipelines.

      gcloud projects add-iam-policy-binding PROJECT_ID \
          --member="serviceAccount:SERVICE_ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com" \
          --role="ROLE_NAME"
      

      Replace the following values:

      • PROJECT_ID: The project that your service account was created in.
      • SERVICE_ACCOUNT_ID: The ID for the service account.
      • ROLE_NAME: The Identity and Access Management role to grant to this service account.
    5. To use Vertex Pipelines to run pipelines with this service account, run the following command to grant your user account the roles/iam.serviceAccountUser role for your service account.

      gcloud iam service-accounts add-iam-policy-binding \
          SERVICE_ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com \
          --member="user:USER_EMAIL" \
          --role="roles/iam.serviceAccountUser"
      

      Replace the following values:

      • SERVICE_ACCOUNT_ID: The ID for the service account.
      • PROJECT_ID: The project that your service account was created in.
      • USER_EMAIL: The email address of the user that runs pipelines as this service account.
  • If you prefer to use the Compute Engine default service account to run your pipelines, enable the Compute Engine API.

    Enable the API
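
Step 3 of the service account instructions above does not include a command. The following sketch shows one way that pull access might be granted, under a stated assumption: Container Registry images on the gcr.io host are stored in a Cloud Storage bucket named artifacts.PROJECT_ID.appspot.com, so read access to that bucket lets the service account pull images. Regional hosts and domain-scoped projects use different bucket names. The values my-project and pipeline-runner, the run helper, and the DRY_RUN variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical placeholders; replace with your own values.
PROJECT_ID="my-project"
SERVICE_ACCOUNT_ID="pipeline-runner"

# Full email address of the service account, as used in the commands above.
SA_EMAIL="${SERVICE_ACCOUNT_ID}@${PROJECT_ID}.iam.gserviceaccount.com"

# Assumption: images are hosted on gcr.io, backed by this bucket.
GCR_BUCKET="gs://artifacts.${PROJECT_ID}.appspot.com"

# Illustrative helper: print each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Grant read access to the image storage bucket so images can be pulled.
run gsutil iam ch \
    "serviceAccount:${SA_EMAIL}:roles/storage.objectViewer" \
    "$GCR_BUCKET"

# Review the project-level roles that the service account now holds.
run gcloud projects get-iam-policy "$PROJECT_ID" \
    --flatten="bindings[].members" \
    --filter="bindings.members:${SA_EMAIL}" \
    --format="table(bindings.role)"
```

The final command is a convenient check that the account holds only the granular roles you intended, for example roles/aiplatform.user, and not a broad role such as Project Editor.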

Configure a Cloud Storage bucket for pipeline artifacts

Vertex Pipelines stores the artifacts of your pipeline runs using Cloud Storage. Use the following instructions to create a Cloud Storage bucket and grant your service account access to read and write objects in that bucket.

  1. Run the following command to create a Cloud Storage bucket in the region that you want to run your pipelines in.

    gsutil mb -p PROJECT_ID -l BUCKET_LOCATION gs://BUCKET_NAME
    

    Replace the following values:

    • PROJECT_ID: The project that your bucket is associated with.
    • BUCKET_LOCATION: The location of your bucket, for example, US-CENTRAL1.
    • BUCKET_NAME: The name that you want to give your bucket, subject to naming requirements. For example, my-bucket.

    Learn more about creating Cloud Storage buckets.

  2. Run the following commands to grant your service account access to read and write pipeline artifacts in the bucket that you created in the previous step.

    gsutil iam ch \
    serviceAccount:SERVICE_ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com:roles/storage.objectCreator \
    gs://BUCKET_NAME
    
    gsutil iam ch \
    serviceAccount:SERVICE_ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com:roles/storage.objectViewer \
    gs://BUCKET_NAME
    

    Replace the following values:

    • SERVICE_ACCOUNT_ID: The ID for the service account.
    • PROJECT_ID: The project that your service account was created in.
    • BUCKET_NAME: The name of the bucket you are granting your service account access to.

    Learn more about controlling access to Cloud Storage buckets.
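
The two steps above can be combined into one script so that the project, bucket, and service account names are defined once. This is a sketch under stated assumptions: all four values below are hypothetical placeholders, and the run helper and DRY_RUN variable are illustrative additions that print the commands instead of executing them by default.

```shell
#!/bin/sh
# Hypothetical placeholders; replace with your own values.
PROJECT_ID="my-project"
SERVICE_ACCOUNT_ID="pipeline-runner"
BUCKET_NAME="my-bucket"
BUCKET_LOCATION="US-CENTRAL1"

# IAM member string for the service account.
MEMBER="serviceAccount:${SERVICE_ACCOUNT_ID}@${PROJECT_ID}.iam.gserviceaccount.com"

# Illustrative helper: print each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1: create the bucket in the region where the pipelines run.
run gsutil mb -p "$PROJECT_ID" -l "$BUCKET_LOCATION" "gs://${BUCKET_NAME}"

# Step 2: grant write (objectCreator) and read (objectViewer) access.
for role in roles/storage.objectCreator roles/storage.objectViewer; do
  run gsutil iam ch "${MEMBER}:${role}" "gs://${BUCKET_NAME}"
done

# Print the bucket's IAM policy to confirm both bindings.
run gsutil iam get "gs://${BUCKET_NAME}"
```

Run the script with DRY_RUN=0 to execute the commands after you have reviewed the printed output.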

Create a metadata store that uses a CMEK

Use the following instructions to create a CMEK and set up a Vertex ML Metadata metadata store that uses this CMEK.

  1. Use Cloud Key Management Service to configure a customer-managed encryption key.

  2. Use the following REST call to create your project's default metadata store using your CMEK.

    Before using any of the request data, make the following replacements:

    • LOCATION: Your region.
    • PROJECT: Your project ID or project number.
    • KEY_RING: The name of the Cloud Key Management Service key ring that your encryption key is on.
    • KEY_NAME: The name of the encryption key that you want to use for this metadata store.

    HTTP method and URL:

    POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores?metadata_store_id=default

    Request JSON body:

    {
      "encryption_spec": {
        "kms_key_name": "projects/PROJECT/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY_NAME"
      }
    }
    


    You should receive a JSON response similar to the following:

    {
      "name": "projects/12345/locations/us-central1/operations/67890",
      "metadata": {
        "@type": "type.googleapis.com/google.cloud.aiplatform.v1beta1.CreateMetadataStoreOperationMetadata",
        "genericMetadata": {
          "createTime": "2021-05-18T18:47:14.494997Z",
          "updateTime": "2021-05-18T18:47:14.494997Z"
        }
      }
    }
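
One way to send the request above is with curl, authenticating with an access token from gcloud auth print-access-token. The following is a minimal sketch rather than the only supported method. The values us-central1, my-project, my-key-ring, and my-key are hypothetical placeholders, the request.json file name is arbitrary, and the DRY_RUN variable is an illustrative addition that prints the request instead of sending it by default.

```shell
#!/bin/sh
# Hypothetical placeholders; replace with your own values.
LOCATION="us-central1"
PROJECT="my-project"
KEY_RING="my-key-ring"
KEY_NAME="my-key"

# URL and key name built from the templates shown in the request above.
URL="https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT}/locations/${LOCATION}/metadataStores?metadata_store_id=default"
KMS_KEY="projects/${PROJECT}/locations/${LOCATION}/keyRings/${KEY_RING}/cryptoKeys/${KEY_NAME}"

# Write the request body to a file in the current directory.
cat > request.json <<EOF
{
  "encryption_spec": {
    "kms_key_name": "${KMS_KEY}"
  }
}
EOF

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Default: show what would be sent without calling the API.
  echo "POST ${URL}"
  cat request.json
else
  # Send the request with the credentials of the active gcloud account.
  curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json; charset=utf-8" \
      -d @request.json \
      "${URL}"
fi
```

With DRY_RUN=0, the response should resemble the long-running operation shown above.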