By default, Google Cloud automatically encrypts data when it is at rest using encryption keys managed by Google. If you have specific compliance or regulatory requirements related to the keys that protect your data, you can use customer-managed encryption keys (CMEK) for your resources.
You can read more about the specific benefits of using CMEK with AI Platform (Unified) resources in the following section of this guide. For more information about CMEK in general, including when and why to enable it, see the Cloud Key Management Service documentation.
This guide describes some benefits of using CMEK for AI Platform (Unified) resources and walks through how to configure a training job to use CMEK.
Understanding CMEK for AI Platform (Unified) resources
The following sections describe basic information about CMEK for AI Platform (Unified) resources that you must understand before configuring CMEK for your jobs.
Benefits of CMEK
In general, CMEK is most useful if you need full control over the keys used to encrypt your data. With CMEK, you can manage your keys within Cloud KMS. For example, you can rotate or disable a key, or you can set up a rotation schedule using the Cloud KMS API.
When you run an AutoML or custom training job, your code runs on one or more virtual machine (VM) instances managed by AI Platform (Unified). When you enable CMEK for AI Platform (Unified) resources, the key that you designate, rather than a key managed by Google, is used to encrypt data on the boot disks of these VMs. The CMEK key encrypts the following kinds of data:
- The copy of your code on the VMs.
- Any data that gets loaded by your code.
- Any temporary data that gets saved to the local disk by your code.
- AutoML-trained models.
- Media files (data) uploaded into media datasets.
In general, the CMEK key does not encrypt metadata associated with your operation, like the job's name and region, or a dataset's display name. Metadata associated with operations is always encrypted using Google's default encryption mechanism.
For datasets, when a user imports data into a dataset, the data items and annotations are CMEK-encrypted. The dataset display name is not CMEK-encrypted.
For models, the models stored in the storage system (for example, disk) are CMEK-encrypted. All the model evaluation results are CMEK-encrypted.
For endpoints, all model files used for the model deployment under the endpoint are CMEK-encrypted. This does not include any in-memory data.
For batch prediction, any temporary files (such as model files, logs, and VM disks) used to execute the batch prediction job are CMEK-encrypted. Batch prediction results are stored in the user-provided destination. Consequently, AI Platform respects the default value of the destination's encryption configuration. Otherwise, results are also encrypted with CMEK.
For data labeling, any input files (image, text, video, etc.), temporary discussion (questions, feedback, etc.), and output (labeling results) are CMEK-encrypted. The annotation spec display names are not CMEK-encrypted.
Using CMEK with other Google Cloud products
Configuring CMEK for AI Platform (Unified) resources does not automatically configure CMEK for other Google Cloud products that you use together with AI Platform (Unified). To use CMEK to encrypt data in other Google Cloud products, additional configuration is required. For example:
Cloud Storage: When you perform custom training, AI Platform usually loads your data from Cloud Storage. When you use a Python training application and a pre-built container for training, AI Platform (Unified) also loads your code from a Cloud Storage bucket. In addition, some training jobs export trained model artifacts (for example, a TensorFlow SavedModel directory) to a Cloud Storage bucket as part of their output.
To ensure that your data in Cloud Storage is encrypted with CMEK, read the Cloud Storage guide to using customer-managed encryption keys. You can set your encryption key as the default key for the Cloud Storage bucket(s) that you use with AI Platform (Unified), or you can use it to encrypt specific objects.
Artifact Registry: When you use a custom container for training, you can configure AI Platform to load your container image from Artifact Registry.
To ensure that your container image is encrypted with CMEK, read the Artifact Registry guide to CMEK.
Container Registry: When you use a custom container for training, you can configure AI Platform to load your container image from Container Registry.
To ensure that your container image is encrypted with CMEK, read the Container Registry guide to CMEK.
Cloud Logging: When you run a training job, AI Platform (Unified) training saves logs to Logging. These logs are not encrypted with CMEK. However, if you use the Logs Router, then you can configure CMEK for certain temporary files that the Logs Router creates.
Current CMEK-supported resources
The current AI Platform (Unified) resources covered by CMEK are as follows:
Resource | Material encrypted | Documentation links |
---|---|---|
Dataset | | |
Model | | |
Endpoint | | |
CustomJob | | |
HyperparameterTuningJob | | |
TrainingPipeline | | |
BatchPredictionJob (excludes AutoML image batchPrediction) | | |
DataLabelingJob | | |
Limitations
You cannot use CMEK with:
- AutoML image model batch prediction (BatchPredictionJob)
Configuring CMEK for your resources
The following sections describe how to create a key ring and key in Cloud Key Management Service, grant AI Platform encrypter and decrypter permissions for your key, and create resources that use CMEK.
Before you begin
This guide assumes that you use two separate Google Cloud projects to configure CMEK for AI Platform data:
- A project for managing your encryption key (referred to as the "Cloud KMS project").
- A project for accessing AI Platform data or output in Cloud Storage, and interacting with any other Google Cloud products that you need for your use case (referred to as the "AI Platform project").
This recommended setup supports a separation of duties.
Alternatively, you can use a single Google Cloud project for the whole guide. To do so, use the same project for all of the following tasks that refer to the Cloud KMS project and the tasks that refer to the AI Platform project.
Setting up the Cloud KMS project
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account.
- In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.
- Enable the Cloud KMS API.
Setting up the AI Platform project
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account.
- In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.
- Enable the AI Platform (Unified) API.
Setting up the gcloud command-line tool
The gcloud tool is required for some steps in this guide and optional for others.
Install and initialize the Cloud SDK.
Creating a key ring and key
Follow the Cloud KMS guide to creating symmetric keys to create a key ring and a key. When you create your key ring, specify a region that supports AI Platform operations as the key ring's location. AI Platform training only supports CMEK when your resource and key use the same region. You must not specify a dual-regional, multi-regional, or global location for your key ring.
Make sure to create your key ring and key in your Cloud KMS project.
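The location rule above can be checked before you create any resources. The following is a minimal sketch, not part of any Google Cloud library: the function name `check_key_location` and the set `NON_REGIONAL_LOCATIONS` are hypothetical, and the listed locations are illustrative examples only, so consult the Cloud KMS locations list for the authoritative values.

```python
# Illustrative, non-exhaustive examples of Cloud KMS locations that are
# NOT single regions (assumption: check the Cloud KMS locations list).
NON_REGIONAL_LOCATIONS = {"global", "us", "europe", "asia", "nam4", "eur4"}


def check_key_location(key_location: str, resource_region: str) -> None:
    """Raise ValueError if the key ring location cannot be used with the resource.

    Hypothetical helper: it only encodes the two rules stated in this guide.
    """
    key_location = key_location.strip().lower()
    resource_region = resource_region.strip().lower()

    # Rule 1: dual-regional, multi-regional, and global locations are not allowed.
    if key_location in NON_REGIONAL_LOCATIONS:
        raise ValueError(
            f"Key ring location {key_location!r} is not a single region."
        )

    # Rule 2: the key and the resource must use the same region.
    if key_location != resource_region:
        raise ValueError(
            f"Key ring location {key_location!r} does not match "
            f"resource region {resource_region!r}."
        )


# Passes silently because both values name the same single region.
check_key_location("us-central1", "us-central1")
```

A check like this only catches configuration mistakes early; the service itself remains the source of truth for which locations are accepted.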
Granting AI Platform permissions
To use CMEK for your resources, you must grant AI Platform permission to encrypt and decrypt data using your key. AI Platform uses a Google-managed service agent to run operations using your resources. This service account is identified by an email address with the following format:
service-AI_PLATFORM_PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com
To find the appropriate service account for your AI Platform project, go to the IAM page in the Google Cloud Console and find the member that matches this email address format, with the project number for your AI Platform project replacing the AI_PLATFORM_PROJECT_NUMBER variable. The service account also has the name AI Platform Service Agent.
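If you already know the project number, the email address can also be derived from the format shown above. This is a small sketch under that assumption; `service_agent_email` is a hypothetical helper name, and the project number in the example is made up.

```python
def service_agent_email(project_number: str) -> str:
    """Build the AI Platform service agent email from a project number.

    Hypothetical helper that follows the email format given in this guide.
    """
    number = str(project_number).strip()
    # Project numbers are numeric; a project ID such as "my-project" is rejected.
    if not number.isdigit():
        raise ValueError("Expected a numeric project number, not a project ID.")
    return f"service-{number}@gcp-sa-aiplatform.iam.gserviceaccount.com"


# Example with a made-up project number.
print(service_agent_email("123456789012"))
```

Confirming the result against the IAM page is still worthwhile, because the service agent appears there only after it has been created for the project.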
Make note of the email address for this service account, and use it in the following steps to grant it permission to encrypt and decrypt data using your key. You can grant permission by using the Google Cloud Console or by using the gcloud command-line tool:
Cloud Console
- In the Cloud Console, go to the Cryptographic Keys page and select your Cloud KMS project.
- Click the name of the key ring that you created in a preceding section of this guide to go to the Key ring details page.
- Select the checkbox for the key that you created in a preceding section of this guide. If an info panel labeled with the name of your key is not already open, click Show info panel.
- In the info panel, click Add member to open the Add members to "KEY_NAME" dialog. In this dialog, do the following:
  - In the New members box, enter the service account email address that you made a note of in the preceding section: service-AI_PLATFORM_PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com
  - In the Select a role drop-down list, click Cloud KMS and then select the Cloud KMS CryptoKey Encrypter/Decrypter role.
  - Click Save.
gcloud
Run the following command:
gcloud kms keys add-iam-policy-binding KEY_NAME \
--keyring=KEY_RING_NAME \
--location=REGION \
--project=KMS_PROJECT_ID \
--member=serviceAccount:service-AI_PLATFORM_PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com \
--role=roles/cloudkms.cryptoKeyEncrypterDecrypter
In this command, replace the following placeholders:
- KEY_NAME: The name of the key that you created in a preceding section of this guide.
- KEY_RING_NAME: The key ring that you created in a preceding section of this guide.
- REGION: The region where you created your key ring.
- KMS_PROJECT_ID: The ID of your Cloud KMS project.
- AI_PLATFORM_PROJECT_NUMBER: The project number of your AI Platform project, which you noted in the preceding section as part of a service account email address.
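When this step is scripted, the placeholders above can be filled in programmatically. The sketch below only assembles the same gcloud command as an argument list and prints it; `build_binding_command` is a hypothetical helper, and every value in the example call is made up.

```python
import shlex
from typing import List


def build_binding_command(
    key_name: str,
    key_ring_name: str,
    region: str,
    kms_project_id: str,
    ai_platform_project_number: str,
) -> List[str]:
    """Return the gcloud command from this guide as an argument list."""
    member = (
        f"serviceAccount:service-{ai_platform_project_number}"
        "@gcp-sa-aiplatform.iam.gserviceaccount.com"
    )
    return [
        "gcloud", "kms", "keys", "add-iam-policy-binding", key_name,
        f"--keyring={key_ring_name}",
        f"--location={region}",
        f"--project={kms_project_id}",
        f"--member={member}",
        "--role=roles/cloudkms.cryptoKeyEncrypterDecrypter",
    ]


# Example with made-up values; nothing is executed here.
command = build_binding_command(
    "my-key", "my-key-ring", "us-central1", "my-kms-project", "123456789012"
)
print(shlex.join(command))
# To run it, pass the list to subprocess.run(command, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems if a value ever contains unusual characters.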
Create resources with the KMS key
When you create a new CMEK-supported resource, you can specify your key as one of the create parameters.
Console
When you create a new CMEK-supported resource in the AI Platform section of the Google Cloud Console, you can select your key in the general or advanced option section.
REST & CMD Line
When you create a supported resource, add an encryptionSpec object to your request and set the encryptionSpec.kmsKeyName field to point to your key resource.
For example, when creating a dataset resource you would specify your key in the request body:
{
  "displayName": "DATASET_NAME",
  "metadataSchemaUri": "METADATA_URI",
  "encryptionSpec": {
    "kmsKeyName": "projects/PROJECT_ID/locations/LOCATION_ID/keyRings/KEY_RING_NAME/cryptoKeys/KEY_NAME"
  }
}
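The request body above can also be assembled in code before it is sent. The following is a minimal sketch that uses only the Python standard library and does not call any API; `kms_key_name` and `dataset_request_body` are hypothetical helper names, and the example values are placeholders rather than real resources.

```python
import json


def kms_key_name(project_id: str, location_id: str, key_ring: str, key: str) -> str:
    """Build the key resource name in the format shown in this guide."""
    return (
        f"projects/{project_id}/locations/{location_id}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )


def dataset_request_body(
    display_name: str, metadata_schema_uri: str, key_name: str
) -> dict:
    """Return a dataset request body that includes an encryptionSpec object."""
    return {
        "displayName": display_name,
        "metadataSchemaUri": metadata_schema_uri,
        "encryptionSpec": {"kmsKeyName": key_name},
    }


# Example with placeholder values; METADATA_URI is left as a placeholder.
key = kms_key_name("my-kms-project", "us-central1", "my-key-ring", "my-key")
body = dataset_request_body("my-dataset", "METADATA_URI", key)
print(json.dumps(body, indent=2))
```

Keeping the key name in one helper makes it easier to reuse the same encryptionSpec across the other supported resource types.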
Java
When you create a supported resource, set the EncryptionSpec to point to your key resource. See the Google AI Platform Client for Java documentation for more information.
Node.js
When you create a supported resource, set the encryptionSpec parameter to point to your key resource. See the AI Platform: Node.js Client documentation for more information.
Python
When you create a supported resource, set the encryption_spec parameter to point to your key resource. See the Python Client for Cloud AI Platform documentation for more information.
What's next
- Learn more about CMEK on Google Cloud.
- Learn how to use CMEK with other Google Cloud products.