Configure message encryption

This document discusses how to configure customer-managed encryption keys (CMEK) for an Apache Kafka for BigQuery cluster.

Apache Kafka for BigQuery encrypts messages at rest with Google-managed encryption keys by default. No additional setup is required to use Google-managed encryption keys.

About CMEK

CMEKs are encryption keys that you own and that are managed and stored in Cloud Key Management Service (Cloud KMS). If you need more control over the keys that protect Apache Kafka for BigQuery data at rest, you can use CMEKs. Some organizations also mandate the use of CMEKs.

CMEKs give you full control over your encryption keys, letting you manage their lifecycle, rotation, and access policies. When you configure an Apache Kafka for BigQuery cluster with a CMEK, the service automatically encrypts all cluster data at rest using the specified key. Cloud KMS usage for CMEK might incur additional costs depending on your usage patterns.
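For example, if your organization requires periodic key rotation, you can set an automatic rotation schedule on the key in Cloud KMS. The key, key ring, location, and times below are placeholder values:

```shell
# Set a 90-day automatic rotation schedule on an existing Cloud KMS key.
# Replace the key, key ring, location, and rotation times with your own values.
gcloud kms keys update test-key \
    --keyring=test-keyring \
    --location=us-central1 \
    --rotation-period=90d \
    --next-rotation-time=2026-01-01T00:00:00Z
```

Rotation creates new key versions; existing data remains decryptable with the prior versions as long as they stay enabled.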

Configure Apache Kafka for BigQuery cluster for CMEK

You can configure CMEK for an Apache Kafka for BigQuery cluster using the Google Cloud console or the Google Cloud CLI.

Before you begin

Complete the following tasks:

  • Enable the Cloud KMS API.

  • Create a key ring and a key in Cloud KMS. Keys and key rings cannot be deleted. Because Apache Kafka for BigQuery resources are regional, we recommend that you create the CMEK in the same region as the Kafka cluster.

For instructions on how to accomplish these tasks, see the Cloud KMS quickstart guide.
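As a sketch, both setup tasks can also be done with the gcloud CLI. The key ring and key names below are placeholders; use the same region as your planned cluster:

```shell
# Enable the Cloud KMS API in the current project.
gcloud services enable cloudkms.googleapis.com

# Create a key ring and a symmetric encryption key in the cluster's region.
gcloud kms keyrings create test-keyring --location=us-central1
gcloud kms keys create test-key \
    --keyring=test-keyring \
    --location=us-central1 \
    --purpose=encryption
```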

Required roles and permissions to configure CMEK

Apache Kafka for BigQuery uses a Google Cloud service agent to access Cloud KMS. The service agent for your Google Cloud project is automatically created after you create your first Apache Kafka for BigQuery cluster.

The service agent is maintained internally by Apache Kafka for BigQuery for each project, and is not visible on the Service Accounts page in the Google Cloud console by default.

The Apache Kafka for BigQuery service agent has the form service-${PROJECT_NUMBER}@gcp-sa-managedkafka.iam.gserviceaccount.com.

Apache Kafka for BigQuery requires specific permissions to encrypt and decrypt data using CMEK.

Complete the following steps to set up the required access:

  1. Optional: Create the Apache Kafka for BigQuery service agent manually by using the gcloud beta services identity create command.

    If you have previously created a cluster in your project, the Apache Kafka for BigQuery service agent is already created in your project and you can skip this step.

    gcloud beta services identity create \
        --service=managedkafka.googleapis.com \
        --project=PROJECT_ID
    

    Replace PROJECT_ID with your project ID.

  2. Grant the Apache Kafka for BigQuery service agent the Cloud KMS Crypto Key Encrypter/Decrypter (roles/cloudkms.cryptoKeyEncrypterDecrypter) role.

    gcloud kms keys add-iam-policy-binding CLOUD_KMS_KEY_NAME \
        --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-managedkafka.iam.gserviceaccount.com \
        --role=roles/cloudkms.cryptoKeyEncrypterDecrypter
    

    Replace the following:

    • CLOUD_KMS_KEY_NAME: The name of the Cloud KMS key.

      The key is of the format projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/CRYPTO_KEY.

      An example is projects/test-project/locations/us-central1/keyRings/test-keyring/cryptoKeys/test-key.

    • PROJECT_NUMBER: The project number of the Apache Kafka for BigQuery project.

For more information about granting IAM roles, see Granting roles on a resource.
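To verify that the role was granted, you can inspect the key's IAM policy, passing the full key resource name as in the earlier step:

```shell
# List the IAM bindings on the key and confirm that the service agent
# holds roles/cloudkms.cryptoKeyEncrypterDecrypter.
gcloud kms keys get-iam-policy \
    projects/test-project/locations/us-central1/keyRings/test-keyring/cryptoKeys/test-key
```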

Create a cluster with CMEK

You can use the Google Cloud console or the gcloud CLI to add your encryption key when you create your Kafka cluster.

Enable the Apache Kafka for BigQuery, Compute Engine, Cloud DNS, and Cloud KMS APIs.

Before you create a cluster, review the documentation of the cluster properties.

To create a cluster with CMEK, follow these steps:

Console

  1. In the Google Cloud console, go to the Clusters page.

  2. Select Create.

    The Create Kafka cluster page opens.

  3. For the Cluster name, enter a string.

    For more information about how to name a cluster, see Guidelines to name an Apache Kafka for BigQuery resource.

  4. For Location, enter a supported location.

    For more information about supported locations, see Supported Apache Kafka for BigQuery locations.

  5. For Capacity configuration, enter values for Memory and vCPUs.

    The vCPU to memory ratio must be between 1:1 and 1:6.

    For more information about how to size an Apache Kafka for BigQuery cluster, see Estimate vCPUs and memory for your Apache Kafka for BigQuery cluster.

  6. For Network configuration, enter the following details:
    1. Project: The project where the subnetwork is located. The subnet must be located in the same region as the cluster, but the project might be different.
    2. Network: The network to which the subnet is connected.
    3. Subnetwork: The name of the subnet.
    4. Subnet URI path: This field is automatically populated. Or, you can enter the subnet path here. The name of the subnet must be in the format: projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_ID.
    5. Click Done.
  7. (Optional) Add additional subnets by clicking Add a connected subnet.

    You can add up to a maximum of ten subnets.

  8. For Encryption, select Cloud KMS key.
  9. For Key type, select Cloud KMS. For Select a customer-managed key, select the CMEK that you created.
  10. Click Create.

gcloud

    In the Google Cloud console, activate Cloud Shell.

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  1. Run the gcloud beta managed-kafka clusters create command:

    gcloud beta managed-kafka clusters create CLUSTER_ID \
        --location=LOCATION \
        --cpu=CPU \
        --memory=MEMORY \
        --subnets=SUBNETS \
        --encryption-key=CLOUD_KMS_KEY
    

    Replace the following:

    • CLUSTER_ID: The ID or name of the cluster.

      For more information about how to name a cluster, see Guidelines to name an Apache Kafka for BigQuery resource.

    • LOCATION: The location of the cluster.

      For more information about supported locations, see Supported Apache Kafka for BigQuery locations.

    • CPU: The number of virtual CPUs for the cluster. The vCPU to memory ratio must be between 1:1 and 1:6.

      For more information about how to size an Apache Kafka for BigQuery cluster, see Estimate vCPUs and memory for your Apache Kafka for BigQuery cluster.

    • MEMORY: The amount of memory for the cluster. Use "MB", "MiB", "GB", "GiB", "TB", or "TiB" units. For example, "10GiB".

    • SUBNETS: The list of subnets to connect to. Use commas to separate multiple subnet values.

      The format of the subnet is projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_ID.

    • CLOUD_KMS_KEY: The ID of the CMEK to use for the cluster.

      The format is projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/CRYPTO_KEY.
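Putting the command together with hypothetical values (the project, subnet, and key names below are placeholders):

```shell
# Create a CMEK-protected cluster with 3 vCPUs and 12 GiB of memory
# (a 1:4 vCPU-to-memory ratio, within the allowed 1:1 to 1:6 range).
gcloud beta managed-kafka clusters create test-cluster \
    --location=us-central1 \
    --cpu=3 \
    --memory=12GiB \
    --subnets=projects/test-project/regions/us-central1/subnetworks/default \
    --encryption-key=projects/test-project/locations/us-central1/keyRings/test-keyring/cryptoKeys/test-key
```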

Confirm the cluster creation

Confirm that the cluster is configured for CMEK by running the gcloud beta managed-kafka clusters describe command.

gcloud beta managed-kafka clusters describe CLUSTER_ID \
    --location=LOCATION

The output includes the configured CMEK.
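If you want only the key reference, you can filter the output with a format expression. The gcpConfig.kmsKey field path below is an assumption about the response shape; verify it against the full describe output for your cluster:

```shell
# Print only the configured CMEK (field path assumed; check the full
# describe output if this prints nothing).
gcloud beta managed-kafka clusters describe CLUSTER_ID \
    --location=LOCATION \
    --format="value(gcpConfig.kmsKey)"
```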

Audit logs

Cloud KMS produces audit logs when keys are enabled, disabled, or used by Apache Kafka for BigQuery to encrypt and decrypt messages. These logs are useful for debugging issues with publish or delivery availability.

Cloud KMS key IDs are attached to audit logs for Apache Kafka for BigQuery cluster resources. Apache Kafka for BigQuery does not include any other Cloud KMS-related information in audit logs.
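As a sketch, you can review recent Cloud KMS audit log entries with gcloud logging read. Note that Encrypt and Decrypt calls appear only if Data Access audit logs are enabled for Cloud KMS in your project:

```shell
# Read recent Cloud KMS audit log entries for the current project.
gcloud logging read \
    'resource.type="cloudkms_cryptokey" AND protoPayload.serviceName="cloudkms.googleapis.com"' \
    --limit=10
```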

Disable and re-enable CMEK

There are two ways to disable CMEK. Choose one of the following methods:

  • Recommended: Disable the Cloud KMS key that you've associated with the cluster. This approach affects only the Apache Kafka for BigQuery cluster that is associated with that specific key.

  • Revoke the CryptoKey Encrypter/Decrypter role from the Apache Kafka for BigQuery service agent (service-${PROJECT_NUMBER}@gcp-sa-managedkafka.iam.gserviceaccount.com) by using Identity and Access Management (IAM). This approach affects all of the Apache Kafka for BigQuery clusters in the project and the messages encrypted by using CMEK.

Neither operation revokes access instantaneously, but IAM changes generally propagate faster.

For more information, see Cloud KMS resource consistency and Access change propagation.
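For the recommended approach, you can disable and later re-enable the key's versions with the gcloud CLI. The version number 1 below is a placeholder; disable every version that is in use:

```shell
# Disable key version 1, which blocks Apache Kafka for BigQuery from
# using the key until the version is re-enabled.
gcloud kms keys versions disable 1 \
    --key=test-key \
    --keyring=test-keyring \
    --location=us-central1

# Re-enable the key version to restore access.
gcloud kms keys versions enable 1 \
    --key=test-key \
    --keyring=test-keyring \
    --location=us-central1
```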

When Apache Kafka for BigQuery cannot access a Cloud KMS key, message publishing and delivery fails with errors. To resume delivery and publishing, restore access to the Cloud KMS key.

After the Cloud KMS key is accessible to Apache Kafka for BigQuery, publishing is available within 12 hours and message delivery resumes within 2 hours.

Although intermittent Cloud KMS outages of less than a minute are unlikely to significantly interrupt publishing and delivery, extended Cloud KMS unavailability has the same effect as key revocation.

Apache Kafka® is a registered trademark of The Apache Software Foundation or its affiliates in the United States and/or other countries.