Customer-managed encryption keys (CMEK)

By default, Google Cloud automatically encrypts data when it is at rest using encryption keys managed by Google. If you have specific compliance or regulatory requirements related to the keys that protect your data, you can use customer-managed encryption keys (CMEK) for your resources.

You can read more about the specific benefits of using CMEK with Vertex AI resources in the following section of this guide. For more information about CMEK in general, including when and why to enable it, see the Cloud Key Management Service documentation.

This guide describes some benefits of using CMEK for Vertex AI resources and walks through how to configure a training job to use CMEK.

CMEK for Vertex AI resources

The following sections describe basic information about CMEK for Vertex AI resources that you must understand before configuring CMEK for your jobs.

Benefits of CMEK

In general, CMEK is most useful if you need full control over the keys used to encrypt your data. With CMEK, you can manage your keys within Cloud KMS. For example, you can rotate or disable a key or you can set up a rotation schedule using the Cloud KMS API. For more information about CMEK in general, including when and why to enable it, see the Cloud KMS documentation.
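As a rough illustration of the rotation schedule mentioned above, the following sketch builds a JSON body that a Cloud KMS key update request could carry. The `rotationPeriod` and `nextRotationTime` field names follow the Cloud KMS REST representation of a key. The helper name `build_rotation_patch` and the 90-day period are assumptions made for this example, and the sketch does not send any request.

```python
import json
from datetime import datetime, timedelta, timezone


def build_rotation_patch(period_days: int, now: datetime) -> dict:
    """Build a rotation-schedule body (hypothetical helper, not part of any SDK)."""
    if period_days < 1:
        raise ValueError("period_days must be at least 1")
    period = timedelta(days=period_days)
    next_rotation = (now + period).astimezone(timezone.utc)
    return {
        # Durations are written as a number of seconds with an "s" suffix.
        "rotationPeriod": f"{int(period.total_seconds())}s",
        # Timestamps are written in RFC 3339 format, in UTC.
        "nextRotationTime": next_rotation.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


# Example values only: rotate every 90 days, counted from 1 January 2024.
body = build_rotation_patch(90, datetime(2024, 1, 1, tzinfo=timezone.utc))
print(json.dumps(body, indent=2))
```

Passing the current time in as an argument keeps the helper deterministic, which makes the schedule easy to check before you apply it.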

When you run an AutoML or custom training job, your code runs on one or more virtual machine (VM) instances managed by Vertex AI. When you enable CMEK for Vertex AI resources, the key that you designate, rather than a key managed by Google, is used to encrypt data on the boot disks of these VMs. The CMEK key encrypts the following kinds of data:

  • The copy of your code on the VMs.
  • Any data that gets loaded by your code.
  • Any temporary data that gets saved to the local disk by your code.
  • AutoML-trained models.
  • Media files (data) uploaded into media datasets.

In general, the CMEK key does not encrypt metadata associated with your operation, like the job's name and region, or a dataset's display name. Metadata associated with operations is always encrypted using Google's default encryption mechanism.

For datasets, when a user imports data into a dataset, the data items and annotations are CMEK-encrypted. The dataset display name is not CMEK-encrypted.

For models, the models stored in the storage system (for example, disk) are CMEK-encrypted. All the model evaluation results are CMEK-encrypted.

For endpoints, all model files used for the model deployment under the endpoint are CMEK-encrypted. This does not include any in-memory data.

For batch prediction, any temporary files (such as model files, logs, and VM disks) used to execute the batch prediction job are CMEK-encrypted. If batch prediction results are written to a user-provided destination, Vertex AI respects that destination's default encryption configuration. Otherwise, the results are also encrypted with CMEK.

For data labeling, any input files (image, text, video, tabular), temporary discussion (for example, questions, feedback) and output (labeling result) are CMEK-encrypted. The annotation spec display names are not CMEK-encrypted.

External keys

You can use Cloud External Key Manager (Cloud EKM) to create external keys that you manage and use them to encrypt data within Google Cloud.

When you use a Cloud EKM key, Google has no control over the availability of your externally-managed key. If you request access to a resource encrypted with an externally-managed key, and the key is unavailable, then Vertex AI will reject the request. There can be a delay of up to 10 minutes before you can access the resource once the key becomes available.
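Because access can lag by up to 10 minutes after an external key becomes available again, a caller might retry a rejected request for a bounded time. The following is a minimal sketch of that idea. This guide does not specify how Vertex AI reports the rejection, so the `is_key_unavailable` predicate is supplied by the caller, and the helper name and the 30-second interval are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_until_available(
    request: Callable[[], T],
    is_key_unavailable: Callable[[Exception], bool],
    max_wait_s: float = 600.0,  # about 10 minutes, matching the delay described above
    interval_s: float = 30.0,   # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request() until it succeeds or the wait budget is used up."""
    waited = 0.0
    while True:
        try:
            return request()
        except Exception as exc:
            # Re-raise unrelated errors at once, and give up when the budget is spent.
            if not is_key_unavailable(exc) or waited >= max_wait_s:
                raise
            sleep(interval_s)
            waited += interval_s
```

The `sleep` parameter is injectable so that the loop can be exercised in tests without waiting.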

For more considerations when using external keys, see Cloud External Key Manager.

Use CMEK with other Google Cloud products

Configuring CMEK for Vertex AI resources does not automatically configure CMEK for other Google Cloud products that you use together with Vertex AI. To use CMEK to encrypt data in other Google Cloud products, additional configuration is required.

Current CMEK-supported resources

The current Vertex AI resources covered by CMEK are as follows. CMEK support for Preview features is in Preview status as well.

Resource | Material encrypted | Documentation links
Dataset
  • All user imported data (for example, text content or videos) for DataItems and Annotations.
  • User created content such as AnnotationSpecs, ColumnSpecs.
Model
  • Uploaded model files.
  • Evaluation results of the trained model.
Endpoint
  • All model files used for the model deployment under the endpoint. This does not include any in-memory data, but the model will be auto-undeployed if the key is disabled.
CustomJob (excludes resources that use a TPU VM)
  • The copy of your code on the VMs used to run the operation.
  • Any data that gets loaded by your code.
  • Any temporary data that gets saved to the local disk by your code.
HyperparameterTuningJob (excludes resources that use a TPU VM)
  • The copy of your code on the VMs used to run the operation.
  • Any data that gets loaded by your code.
  • Any temporary data that gets saved to the local disk by your code.
TrainingPipeline (excludes resources that use a TPU VM)
  • The copy of your code on the VMs used to run the operation.
  • Any data that gets loaded by your code.
  • Any temporary data that gets saved to the local disk by your code.
  • AutoML-trained models.
BatchPredictionJob (excludes AutoML image batchPrediction)
  • Any temporary files (for example, model files, logs, VM disks) used to run the batch prediction job.
  • If the batch prediction results are written to a user-provided destination, they follow that destination's default encryption configuration. Otherwise, they are also encrypted with CMEK.
ModelDeploymentMonitoringJob
  • Any temporary files (for example, training dataset files, logs, VM disks) used in the job to process the model deployment monitoring job.
  • Any data used for detecting monitoring anomalies.
  • If the key is disabled, the model deployment monitoring job will be paused.
PipelineJob
  • The pipeline job and all of its sub-resources.
MetadataStore
  • All content in the metadata store.
TensorBoard
  • All data from the uploaded TensorBoard logs including scalars, histograms, graph defs, images, and text.
Featurestore
  • The featurestore and all content in the featurestore.
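The exclusions noted in the list above can be encoded as a simple pre-flight check. The following sketch is illustrative only: the function name `cmek_supported` and its flags are hypothetical, and the sets reflect just the resource types and exclusions listed in this section.

```python
# Resource types that this section lists as covered by CMEK.
CMEK_RESOURCES = {
    "Dataset", "Model", "Endpoint", "CustomJob", "HyperparameterTuningJob",
    "TrainingPipeline", "BatchPredictionJob", "ModelDeploymentMonitoringJob",
    "PipelineJob", "MetadataStore", "TensorBoard", "Featurestore",
}

# Resource types that are excluded when they use a TPU VM.
TPU_VM_EXCLUDED = {"CustomJob", "HyperparameterTuningJob", "TrainingPipeline"}


def cmek_supported(resource_type: str, uses_tpu_vm: bool = False,
                   automl_image: bool = False) -> bool:
    """Return True if the listed CMEK coverage applies (hypothetical helper)."""
    if resource_type not in CMEK_RESOURCES:
        return False
    if resource_type in TPU_VM_EXCLUDED and uses_tpu_vm:
        return False
    # AutoML image batch prediction is excluded from CMEK coverage.
    if resource_type == "BatchPredictionJob" and automl_image:
        return False
    return True


print(cmek_supported("CustomJob", uses_tpu_vm=True))
```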

CMEK support for Generative AI tuning pipelines

CMEK support is provided in the tuning pipeline of the following models:

  • text-bison for PaLM 2 (GPU)
  • BERT
  • T5
  • image-generation (GPU)

Limitations

CMEK support isn't provided in the following:

  • AutoML image model batch prediction (BatchPredictionJob)
  • TPU tuning

Configure CMEK for your resources

The following sections describe how to create a key ring and key in Cloud Key Management Service, grant Vertex AI encrypter and decrypter permissions for your key, and create resources that use CMEK.

Before you begin

This guide assumes that you use two separate Google Cloud projects to configure CMEK for Vertex AI data:

  • A project for managing your encryption key (referred to as the "Cloud KMS project").
  • A project for accessing Vertex AI data or output in Cloud Storage, and interacting with any other Google Cloud products that you need for your use case (referred to as the "AI Platform project").

This recommended setup supports a separation of duties.

Alternatively, you can use a single Google Cloud project for the whole guide. To do so, use the same project for all of the following tasks that refer to the Cloud KMS project and the tasks that refer to the AI Platform project.

Set up the Cloud KMS project

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Enable the Cloud KMS API.

    Enable the API

Set up the AI Platform project

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Enable the Vertex AI API.

    Enable the API

Set up the Google Cloud CLI

The gcloud CLI is required for some steps in this guide and optional for others.

Install the Google Cloud CLI, then initialize it by running the following command:

gcloud init

Create a key ring and key

To create a key ring and a key, follow the Cloud KMS guide to creating symmetric keys. When you create your key ring, specify a region that supports Vertex AI operations as the key ring's location. Vertex AI training supports CMEK only when your resource and key use the same region. Do not specify a dual-regional, multi-regional, or global location for your key ring.
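The location rules above can be checked before you create any resources. The following sketch is one possible check. The function name is hypothetical, and the regular expression is only a heuristic for single-region names such as us-central1, not an official definition of a region.

```python
import re

# Heuristic for single-region names like "us-central1" (an assumption).
_REGION_PATTERN = re.compile(r"^[a-z]+-[a-z]+\d+$")


def check_key_location(key_location: str, resource_region: str) -> None:
    """Raise ValueError if the key ring location cannot be used with the resource."""
    if key_location == "global":
        raise ValueError("the key ring must not use the global location")
    if not _REGION_PATTERN.match(key_location):
        raise ValueError(
            f"{key_location!r} does not look like a single region; "
            "dual-regional and multi-regional locations are not supported"
        )
    if key_location != resource_region:
        raise ValueError(
            f"key location {key_location!r} must match resource region {resource_region!r}"
        )


# Passes silently because both values name the same single region.
check_key_location("us-central1", "us-central1")
```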

Make sure to create your key ring and key in your Cloud KMS project.

Grant Vertex AI permissions

To use CMEK for your resources, you must grant Vertex AI permission to encrypt and decrypt data using your key. Vertex AI uses a Google-managed service agent to run operations using your resources. This service account is identified by an email address with the following format:

service-AI_PLATFORM_PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com

To find the appropriate service account for your AI Platform project, go to the IAM page in the Google Cloud console and find the member that matches this email address format, with the project number for your AI Platform project replacing the AI_PLATFORM_PROJECT_NUMBER variable. The service account also has the name Vertex AI Service Agent.

Go to the IAM page
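If you already know the project number, you can also derive the address from the format shown above. The following is a minimal sketch, and the function name is hypothetical. Note that the format takes the numeric project number, not the project ID.

```python
def vertex_ai_service_agent(project_number: str) -> str:
    """Return the Vertex AI service agent email for a project number."""
    number = str(project_number)
    if not number.isdigit():
        raise ValueError("expected a numeric project number, not a project ID")
    return f"service-{number}@gcp-sa-aiplatform.iam.gserviceaccount.com"


# "123456789012" is a made-up project number used only for illustration.
agent_email = vertex_ai_service_agent("123456789012")
print(agent_email)
```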

Make note of the email address for this service account, and use it in the following steps to grant it permission to encrypt and decrypt data using your key. You can grant permission by using the Google Cloud console or by using the Google Cloud CLI:

Google Cloud console

  1. In the Google Cloud console, click Security and select Key Management to go to the Cryptographic Keys page, and then select your Cloud KMS project.

    Go to the Cryptographic Keys page

  2. Click on the name of the key ring that you created in a preceding section of this guide to go to the Key ring details page.

  3. Select the checkbox for the key that you created in a preceding section of this guide. If an info panel labeled with the name of your key is not already open, click Show info panel.

  4. In the info panel, click Add member to open the Add members to "KEY_NAME" dialog. In this dialog, do the following:

    1. In the New members box, enter the service account email address that you made a note of in the preceding section: service-AI_PLATFORM_PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com
    2. In the Select a role drop-down list, click Cloud KMS and then select the Cloud KMS CryptoKey Encrypter/Decrypter role.

    3. Click Save.

gcloud

Run the following command:

gcloud kms keys add-iam-policy-binding KEY_NAME \
  --keyring=KEY_RING_NAME \
  --location=REGION \
  --project=KMS_PROJECT_ID \
  --member=serviceAccount:service-AI_PLATFORM_PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com \
  --role=roles/cloudkms.cryptoKeyEncrypterDecrypter

In this command, replace the following placeholders:

  • KEY_NAME: The name of the key that you created in a preceding section of this guide.
  • KEY_RING_NAME: The key ring that you created in a preceding section of this guide.
  • REGION: The region where you created your key ring.
  • KMS_PROJECT_ID: The ID of your Cloud KMS project.
  • AI_PLATFORM_PROJECT_NUMBER: The project number of your AI Platform project, which you noted in the preceding section as part of a service account email address.
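If you script this step, one option is to assemble the same command as an argument list so that each placeholder is filled in one place. The following sketch only builds and prints the command. The function name and all sample values are hypothetical, and you would pass the list to `subprocess.run` yourself to execute it.

```python
import shlex
from typing import List


def build_grant_command(key_name: str, key_ring: str, region: str,
                        kms_project_id: str, ai_platform_project_number: str) -> List[str]:
    """Build the gcloud command shown above as an argument list."""
    member = (
        f"serviceAccount:service-{ai_platform_project_number}"
        "@gcp-sa-aiplatform.iam.gserviceaccount.com"
    )
    return [
        "gcloud", "kms", "keys", "add-iam-policy-binding", key_name,
        f"--keyring={key_ring}",
        f"--location={region}",
        f"--project={kms_project_id}",
        f"--member={member}",
        "--role=roles/cloudkms.cryptoKeyEncrypterDecrypter",
    ]


# Sample values for illustration only.
cmd = build_grant_command("my-key", "my-key-ring", "us-central1",
                          "my-kms-project", "123456789012")
print(shlex.join(cmd))
```

Using an argument list rather than a single shell string avoids quoting problems if a name contains unusual characters.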

Create resources with the KMS key

When you create a new CMEK-supported resource, you can specify your key as one of the create parameters.

Console

When you create a new CMEK-supported resource in the Vertex AI section of the Google Cloud console, you can select your key in the general or advanced options section.

REST & CMD Line

When you create a supported resource, add an encryptionSpec object to your request and set the encryptionSpec.kmsKeyName field to point to your key resource.

For example, when creating a dataset resource you would specify your key in the request body:

{
  "displayName": "DATASET_NAME",
  "metadataSchemaUri": "METADATA_URI",
  "encryptionSpec": {
    "kmsKeyName": "projects/PROJECT_ID/locations/LOCATION_ID/keyRings/KEY_RING_NAME/cryptoKeys/KEY_NAME"
  }
}
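The same request body can be assembled programmatically. The following sketch builds the key resource name and the dataset body with the standard library only. The helper names and sample values are hypothetical, the schema URI is an example value, and the sketch does not send the request or handle authentication.

```python
import json


def kms_key_name(project_id: str, location_id: str, key_ring: str, key: str) -> str:
    """Return the full resource name of a Cloud KMS key."""
    return (
        f"projects/{project_id}/locations/{location_id}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )


def dataset_request_body(display_name: str, metadata_schema_uri: str, key_name: str) -> dict:
    """Return a dataset creation body that carries an encryptionSpec."""
    return {
        "displayName": display_name,
        "metadataSchemaUri": metadata_schema_uri,
        "encryptionSpec": {"kmsKeyName": key_name},
    }


# Sample values for illustration only.
key = kms_key_name("my-kms-project", "us-central1", "my-key-ring", "my-key")
request_body = dataset_request_body(
    "my-dataset",
    "gs://google-cloud-aiplatform/schema/dataset/metadata/image_1.0.0.yaml",
    key,
)
print(json.dumps(request_body, indent=2))
```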

Java

When you create a supported resource, set the EncryptionSpec to point to your key resource. See the Vertex AI client library for Java documentation for more information.

Node.js

When you create a supported resource, set the encryptionSpec parameter to point to your key resource. See the Vertex AI client library for Node.js documentation for more information.

Python

When you create a supported resource, set the encryption_spec parameter to point to your key resource. See the Python Client for Cloud AI Platform documentation for more information.
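As a minimal sketch, assuming the `google-cloud-aiplatform` package is installed, you can set a default key for the session through the `encryption_spec_key_name` argument of `aiplatform.init`. The wrapper and the `key_location` parser below are hypothetical helpers. The parser checks that the key's location matches the resource location before any call is made.

```python
def key_location(kms_key_name: str) -> str:
    """Extract the location segment from a full Cloud KMS key resource name."""
    parts = kms_key_name.split("/")
    expected = ["projects", "locations", "keyRings", "cryptoKeys"]
    if len(parts) != 8 or parts[0::2] != expected:
        raise ValueError(f"not a full key resource name: {kms_key_name!r}")
    return parts[3]


def init_vertex_ai_with_cmek(project: str, location: str, kms_key_name: str) -> None:
    """Initialize the SDK so that new resources default to the given key."""
    if key_location(kms_key_name) != location:
        raise ValueError("the key and the resource must use the same region")
    # Imported lazily so that the helpers above work without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(
        project=project,
        location=location,
        encryption_spec_key_name=kms_key_name,
    )
```

A call such as `init_vertex_ai_with_cmek("my-ai-project", "us-central1", key_name)` would then apply the key to resources created later in the same session, unless a resource overrides it.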

What's next