Use customer-managed encryption keys

This page explains how to use customer-managed encryption keys (CMEK) to protect Dataproc Metastore services. CMEK encrypts data at rest with a key that you control through Cloud Key Management Service (Cloud KMS). You can store the keys as software keys, in an HSM cluster, or externally.

Before you begin

If you want your Dataproc Metastore service to run inside a VPC Service Controls perimeter, you must add the Cloud Key Management Service (Cloud KMS) API to the perimeter.

Configure CMEK support for Dataproc Metastore

To configure CMEK support for Dataproc Metastore, first grant Cloud KMS key permissions to the Dataproc Metastore and Cloud Storage service accounts. Then create a Dataproc Metastore service that uses a CMEK key.

Grant Cloud KMS key permissions

Use the following commands to grant Cloud KMS key permissions for Dataproc Metastore:

gcloud

  1. Create a CMEK key in Cloud KMS (if one is not already available). The following command is an example of how to create a software key:

    gcloud config set project PROJECT_ID
    gcloud kms keyrings create KEY_RING \
      --project KEY_PROJECT \
      --location=LOCATION
    gcloud kms keys create KEY_NAME \
      --project KEY_PROJECT \
      --location=LOCATION \
      --keyring=KEY_RING \
      --purpose=encryption
    

    Similarly, you can create an HSM key or an EKM key.

  2. Grant permissions to the Dataproc Metastore Service Agent service account:

    gcloud kms keys add-iam-policy-binding KEY_NAME \
      --location LOCATION \
      --keyring KEY_RING \
      --member=serviceAccount:$(gcloud beta services identity create \
      --service=metastore.googleapis.com 2>&1 | awk '{print $4}') \
      --role=roles/cloudkms.cryptoKeyEncrypterDecrypter
    
  3. Grant permissions to the Cloud Storage service account:

    gsutil kms authorize -k projects/KEY_PROJECT/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY_NAME
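
To confirm that the bindings took effect, you can list the members that hold the encrypter/decrypter role on the key. The following is a minimal sketch rather than an official procedure: the project, location, key ring, and key names are placeholder values chosen for illustration, and the gcloud call is skipped when the CLI isn't installed.

```shell
# Placeholder values (assumptions for illustration); replace with your own.
KEY_PROJECT="my-key-project"
LOCATION="us-central1"
KEY_RING="my-key-ring"
KEY_NAME="my-key"

# Full resource name of the key, in the same form that the gsutil command uses.
KMS_KEY="projects/${KEY_PROJECT}/locations/${LOCATION}/keyRings/${KEY_RING}/cryptoKeys/${KEY_NAME}"
echo "${KMS_KEY}"

# List the members that hold the encrypter/decrypter role on the key.
if command -v gcloud >/dev/null 2>&1; then
  gcloud kms keys get-iam-policy "${KEY_NAME}" \
    --project "${KEY_PROJECT}" \
    --location "${LOCATION}" \
    --keyring "${KEY_RING}" \
    --flatten="bindings[].members" \
    --filter="bindings.role:roles/cloudkms.cryptoKeyEncrypterDecrypter" \
    --format="value(bindings.members)" \
    || echo "gcloud call failed; check authentication and key names" >&2
fi
```

If the grants succeeded, the output should include both the Dataproc Metastore service agent and the Cloud Storage service account.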
    

Create a Dataproc Metastore service with a CMEK key

Use the following steps to configure CMEK encryption during service creation:

Console

  1. In the Google Cloud console, open the Dataproc Metastore page:

    Go to Dataproc Metastore

  2. At the top of the Dataproc Metastore page, click Create.

    The Create service page opens.

  3. Configure your service as needed.

  4. Under Encryption, click Use a customer managed encryption key (CMEK).

  5. Select the customer-managed key.

  6. Click Submit.

Verify the service's encryption configuration:

  1. In the Google Cloud console, open the Dataproc Metastore page:

    Go to Dataproc Metastore

  2. On the Dataproc Metastore page, click the name of the service you'd like to view.

    The Service detail page for that service opens.

  3. Under the Configuration tab, verify that the details show CMEK is enabled.

gcloud

  1. Run the gcloud metastore services create command to create a service with CMEK encryption:

    gcloud metastore services create SERVICE \
       --encryption-kms-key=KMS_KEY
    

    Replace the following:

    • SERVICE: The name of the new service.
    • KMS_KEY: The key resource ID, in the form projects/KEY_PROJECT/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY_NAME.
  2. Verify that the creation was successful.
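
One way to verify the result from the command line is to read the key back from the service and compare it with the key you passed at creation. This sketch assumes that the describe output exposes the key as encryptionConfig.kmsKey. The service name, location, key, and the check_cmek helper are hypothetical.

```shell
# Placeholder values (assumptions for illustration); replace with your own.
SERVICE="my-metastore"
LOCATION="us-central1"
EXPECTED_KEY="projects/my-key-project/locations/us-central1/keyRings/my-key-ring/cryptoKeys/my-key"

# Hypothetical helper: succeed only if the reported key matches the expected one.
check_cmek() {
  # $1 = expected key resource ID, $2 = key reported by the service
  if [ -n "$2" ] && [ "$1" = "$2" ]; then
    echo "CMEK configured"
  else
    echo "CMEK missing or different"
    return 1
  fi
}

# Read the key from the service; skipped when the gcloud CLI is not installed.
if command -v gcloud >/dev/null 2>&1; then
  ACTUAL_KEY="$(gcloud metastore services describe "${SERVICE}" \
    --location="${LOCATION}" \
    --format="value(encryptionConfig.kmsKey)" || true)"
  check_cmek "${EXPECTED_KEY}" "${ACTUAL_KEY}" \
    || echo "Verify the service name, location, and key." >&2
fi
```

An empty value for the key field suggests that the service was created without CMEK.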

Dataproc Metastore data protected with Google-provided encryption keys

The Cloud Monitoring database doesn't support CMEK encryption. Instead, Google Cloud uses Google encryption keys to protect the names and service configurations of your Dataproc Metastore services.

Import and export data from and to a CMEK-enabled service

If you want your data to remain encrypted with a customer-managed key during an import, you must set CMEK on the Cloud Storage bucket before importing data from it.

You can import from a non-CMEK protected Cloud Storage bucket. After importing, the data stored in Dataproc Metastore is protected according to the destination service's CMEK settings.

When exporting, the exported database dump is protected according to the destination storage bucket's CMEK settings.
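
To set CMEK on a bucket before an import or export, you can assign a default Cloud KMS key to the bucket. The following is a minimal sketch, assuming a hypothetical bucket and key, and assuming that the Cloud Storage service account is already authorized on the key as described earlier on this page. With -k, gsutil kms encryption sets the bucket's default key; without it, the command prints the current one. The is_key_resource helper is illustrative only.

```shell
# Placeholder values (assumptions for illustration); replace with your own.
BUCKET="gs://my-import-bucket"
KMS_KEY="projects/my-key-project/locations/us-central1/keyRings/my-key-ring/cryptoKeys/my-key"

# Hypothetical helper: check that the value looks like a full key resource ID
# rather than a bare key name.
is_key_resource() {
  case "$1" in
    projects/*/locations/*/keyRings/*/cryptoKeys/*) return 0 ;;
    *) return 1 ;;
  esac
}

if ! is_key_resource "${KMS_KEY}"; then
  echo "KMS_KEY must be a full key resource ID" >&2
elif command -v gsutil >/dev/null 2>&1; then
  # Set the default key on the bucket, then print it to confirm.
  gsutil kms encryption -k "${KMS_KEY}" "${BUCKET}" \
    || echo "Could not set the default key on ${BUCKET}" >&2
  gsutil kms encryption "${BUCKET}" || true
fi
```

A default key applies only to objects written after it is set, so set it before you upload the files to import.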

CMEK caveats for Dataproc Metastore

  • Disabling or deleting the CMEK for a CMEK-enabled service makes the service unusable and unrecoverable. The data is permanently lost.

  • You can't enable customer-managed encryption keys on an existing service.

  • You can't rotate the key used by a CMEK-enabled service.

  • A CMEK-enabled service doesn't support Data Catalog sync. Updating a CMEK-enabled service to enable Data Catalog sync fails. You also can't create a new service with both features enabled.

  • You can't use customer-managed encryption keys to encrypt user data in transit, such as user queries and responses.

  • When you use a Cloud EKM key, Google has no control over the availability of your externally managed key. If the key becomes unavailable while the Dataproc Metastore service is being created, service creation fails. If the key becomes unavailable after the service is created, the service is unavailable until the key becomes available again. For more considerations when using external keys, see Cloud EKM Considerations.
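
Because a disabled or destroyed key makes the service unusable, it can help to check the key's state before you create a service or change the key. The following is a minimal sketch with placeholder names and a hypothetical is_key_usable helper. It reads only the state that Cloud KMS reports for the primary key version, so it can't confirm that an externally managed key is reachable.

```shell
# Placeholder values (assumptions for illustration); replace with your own.
KEY_PROJECT="my-key-project"
LOCATION="us-central1"
KEY_RING="my-key-ring"
KEY_NAME="my-key"

# Hypothetical helper: treat only an ENABLED primary key version as usable.
is_key_usable() {
  [ "$1" = "ENABLED" ]
}

# Read the primary version's state; skipped when the gcloud CLI is not installed.
if command -v gcloud >/dev/null 2>&1; then
  STATE="$(gcloud kms keys describe "${KEY_NAME}" \
    --project "${KEY_PROJECT}" \
    --location "${LOCATION}" \
    --keyring "${KEY_RING}" \
    --format="value(primary.state)" || true)"
  if is_key_usable "${STATE}"; then
    echo "Key is usable"
  else
    echo "Key state: ${STATE:-unknown}" >&2
  fi
fi
```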

What's next