Use CMEK with Google Cloud Serverless for Apache Spark
By default, Google Cloud Serverless for Apache Spark encrypts customer content at rest. Serverless for Apache Spark handles encryption for you without any additional actions on your part. This option is called Google default encryption.
If you want to control your encryption keys, you can use customer-managed encryption keys (CMEKs) in [Cloud KMS](/kms/docs) with CMEK-integrated services, including Serverless for Apache Spark. Using Cloud KMS keys gives you control over their protection level, location, rotation schedule, usage and access permissions, and cryptographic boundaries.
Using Cloud KMS also lets you [track key usage](/kms/docs/view-key-usage), view audit logs, and control key lifecycles.
Instead of Google owning and managing the symmetric [key encryption keys (KEKs)](/kms/docs/envelope-encryption#key_encryption_keys) that protect your data, you control and manage these keys in Cloud KMS.
After you set up your resources with CMEKs, the experience of accessing your Serverless for Apache Spark resources is similar to using Google default encryption.
For more information about your encryption options, see [Customer-managed encryption keys (CMEK)](/kms/docs/cmek).
Use CMEK
Follow the steps in this section to use CMEK to encrypt data that Google Cloud Serverless for Apache Spark writes to persistent disk and to the Dataproc staging bucket.
| When you use Google Cloud Serverless for Apache Spark, data is stored on disks on the underlying serverless infrastructure and in a Cloud Storage [staging bucket](/dataproc-serverless/docs/concepts/buckets). This data is encrypted using a Google-generated data encryption key (DEK) and key encryption key (KEK). If you want control of your KEK, you can use a customer-managed encryption key (CMEK) instead of [default encryption at rest](/security/encryption/default-encryption). When you use a CMEK, you create the key and manage access to it, and you can revoke access to it to prevent decryption of your DEKs and data.

| Beginning April 23, 2024:
|
| - Serverless for Apache Spark also uses your CMEK to encrypt batch job arguments. The [Cloud KMS CryptoKey Encrypter/Decrypter](/kms/docs/reference/permissions-and-roles#cloudkms.cryptoKeyEncrypterDecrypter) IAM role must be assigned to the Dataproc Service Agent service account to enable this behavior. If the [Dataproc Service Agent role](/iam/docs/understanding-roles#dataproc.serviceAgent) is not attached to the Dataproc Service Agent service account, then add the `serviceusage.services.use` permission to a custom role attached to the Dataproc Service Agent service account. The Cloud KMS API must be enabled on the project that runs Serverless for Apache Spark resources.
| - [`batches.list`](/dataproc-serverless/docs/reference/rest/v1/projects.locations.batches/list) returns an `unreachable` field that lists any batches with job arguments that couldn't be decrypted. You can issue [`batches.get`](/dataproc-serverless/docs/reference/rest/v1/projects.locations.batches/get) requests to obtain more information on unreachable batches.
| - The key (CMEK) must be located in the same location as the encrypted resource. For example, the CMEK used to encrypt a batch that runs in the `us-central1` region must also be located in the `us-central1` region.

1. Create a key using the [Cloud Key Management Service (Cloud KMS)](/kms/docs/creating-keys).

2. Copy the resource name.

   The resource name is constructed as follows:

   ```
   projects/PROJECT_ID/locations/REGION/keyRings/KEY_RING_NAME/cryptoKeys/KEY_NAME
   ```

3. Enable the Compute Engine, Dataproc, and Cloud Storage Service Agent service accounts to use your key:

   1. See [Protect resources by using Cloud KMS keys > Required Roles](/compute/docs/disks/customer-managed-encryption#required-roles) to assign the [Cloud KMS CryptoKey Encrypter/Decrypter](/kms/docs/reference/permissions-and-roles#cloudkms.cryptoKeyEncrypterDecrypter) role to the [Compute Engine Service Agent service account](/compute/docs/access/service-accounts#compute_engine_service_account). If this service account is not listed on the IAM page in Google Cloud console, click **Include Google-provided role grants** to list it.

   2. Assign the [Cloud KMS CryptoKey Encrypter/Decrypter](/kms/docs/reference/permissions-and-roles#cloudkms.cryptoKeyEncrypterDecrypter) role to the [Dataproc Service Agent service account](/dataproc/docs/concepts/iam/dataproc-principals#service_agent_control_plane_identity). You can use the Google Cloud CLI to assign the role:

      ```
      gcloud projects add-iam-policy-binding KMS_PROJECT_ID \
          --member serviceAccount:service-PROJECT_NUMBER@dataproc-accounts.iam.gserviceaccount.com \
          --role roles/cloudkms.cryptoKeyEncrypterDecrypter
      ```

      Replace the following:

      - KMS_PROJECT_ID: the ID of your Google Cloud project that runs Cloud KMS. This project can also be the project that runs Dataproc resources.
      - PROJECT_NUMBER: the project number (not the project ID) of your Google Cloud project that runs Dataproc resources.

   3. Enable the Cloud KMS API on the project that runs Serverless for Apache Spark resources.

   4. If the [Dataproc Service Agent role](/iam/docs/understanding-roles#dataproc.serviceAgent) is not attached to the [Dataproc Service Agent service account](/dataproc/docs/concepts/iam/dataproc-principals#service_agent_control_plane_identity), then add the `serviceusage.services.use` permission to the custom role attached to the Dataproc Service Agent service account. If the Dataproc Service Agent role is attached to the Dataproc Service Agent service account, you can skip this step.

   5. Follow the steps to [add your key on the bucket](/storage/docs/encryption/using-customer-managed-keys#set-default-key).

4. When you [submit a batch workload](/dataproc-serverless/docs/quickstarts/spark-batch#submit_a_spark_batch_workload):

   1. Specify your key in the batch [`kmsKey`](/dataproc-serverless/docs/reference/rest/v1/EnvironmentConfig#ExecutionConfig.FIELDS.kms_key) parameter.
   2. Specify the name of your Cloud Storage bucket in the batch [`stagingBucket`](/dataproc-serverless/docs/reference/rest/v1/EnvironmentConfig#ExecutionConfig.FIELDS.staging_bucket) parameter.

5. When you [create an interactive session or session template](/dataproc-serverless/docs/guides/create-serverless-sessions-templates):

   1. Specify your key in the session [`kmsKey`](/dataproc-serverless/docs/reference/rest/v1/EnvironmentConfig#ExecutionConfig.FIELDS.kms_key) parameter.
   2. Specify the name of your Cloud Storage bucket in the session [`stagingBucket`](/dataproc-serverless/docs/reference/rest/v1/EnvironmentConfig#ExecutionConfig.FIELDS.staging_bucket) parameter.
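
As a rough illustration of the resource name format, the same-location requirement, and the `kmsKey` and `stagingBucket` parameters described above, the following shell sketch assembles a key resource name, checks that the key's location matches the batch region, and prints a request-body fragment that carries both parameters. The function names and sample values (`my-kms-project`, `my-ring`, `my-key`, `my-staging-bucket`) are hypothetical and not part of any Google Cloud tool. The JSON nesting assumes that both fields sit under `executionConfig` inside `environmentConfig`, as the REST reference anchors linked above suggest.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; they do not call any Google Cloud API.

# Build a Cloud KMS key resource name from its parts.
# $1=project ID, $2=region, $3=key ring name, $4=key name
kms_key_name() {
  printf 'projects/%s/locations/%s/keyRings/%s/cryptoKeys/%s\n' "$1" "$2" "$3" "$4"
}

# Extract the location segment from a key resource name.
kms_key_location() {
  # Drop everything up to and including "/locations/", then everything after the next "/".
  rest=${1#*/locations/}
  printf '%s\n' "${rest%%/*}"
}

# Succeed only if the key is in the same region as the batch.
# $1=key resource name, $2=batch region
check_key_region() {
  [ "$(kms_key_location "$1")" = "$2" ]
}

# Print a request-body fragment carrying kmsKey and stagingBucket.
# Assumption: both fields belong to executionConfig within environmentConfig.
# $1=key resource name, $2=staging bucket name
execution_config_json() {
  printf '{"environmentConfig":{"executionConfig":{"kmsKey":"%s","stagingBucket":"%s"}}}\n' "$1" "$2"
}

# Example usage with placeholder values.
key=$(kms_key_name my-kms-project us-central1 my-ring my-key)
check_key_region "$key" us-central1 || echo "key and batch regions differ" >&2
execution_config_json "$key" my-staging-bucket
```

Checking the region locally before submitting is only a convenience; the service remains the authority on whether a key can be used for a given batch or session.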