Use CMEK with Google Cloud Serverless for Apache Spark
By default, Google Cloud Serverless for Apache Spark encrypts customer content at rest. Serverless for Apache Spark handles encryption for you without any additional action on your part. This option is called Google default encryption.
If you want to control your encryption keys, you can use customer-managed encryption keys
(CMEKs) in Cloud KMS with CMEK-integrated services, including
Serverless for Apache Spark. Using Cloud KMS keys gives you control over their protection level, location, rotation schedule, usage and access permissions, and cryptographic boundaries.
Cloud KMS also lets you track key usage, view audit logs, and control key lifecycles.
Instead of Google owning and managing the symmetric key encryption keys (KEKs) that protect your data, you control and manage these keys in Cloud KMS.
After you set up your resources with CMEKs, the experience of accessing your Serverless for Apache Spark resources is similar to using Google default encryption.
For more information about your encryption options, see Customer-managed encryption keys (CMEK).
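The key properties mentioned above (location, protection level, rotation schedule) are chosen when the key is created. As a minimal sketch, assuming the Google Cloud CLI is installed, the function below prints `gcloud kms` commands for review instead of running them. The project, key ring, and key names, the 90-day rotation period, and the first rotation time are all hypothetical values chosen for illustration.

```shell
#!/bin/sh
# Sketch only: prints gcloud commands for review; it does not execute them.
set -eu

# Arguments: KMS project ID, region, key ring name, key name,
# time of the first automatic rotation (RFC 3339 timestamp).
print_key_setup() {
  cat <<EOF
gcloud kms keyrings create $3 --location=$2 --project=$1
gcloud kms keys create $4 \\
    --keyring=$3 --location=$2 --project=$1 \\
    --purpose=encryption \\
    --protection-level=software \\
    --rotation-period=90d \\
    --next-rotation-time=$5
EOF
}

# Hypothetical values for illustration.
print_key_setup my-kms-project us-central1 spark-keyring spark-cmek 2030-01-01T00:00:00Z
```

Printing the commands first makes it possible to check the region and rotation settings before anything is created; the output can then be pasted into a terminal or piped to `sh`.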
Use CMEK
Follow the steps in this section to use CMEK to encrypt data that Google Cloud Serverless for Apache Spark
writes to persistent disk and to the Dataproc staging bucket.
Note: When you use Google Cloud Serverless for Apache Spark, data is stored on disks on the underlying serverless infrastructure and in a Cloud Storage [staging bucket](/dataproc-serverless/docs/concepts/buckets). This data is encrypted using a Google-generated data encryption key (DEK) and key encryption key (KEK). If you want control of your KEK, you can use a customer-managed encryption key (CMEK) instead of [default encryption at rest](/security/encryption/default-encryption). When you use a CMEK, you create the key and manage access to it, and you can revoke access to it to prevent decryption of your DEKs and data.

Beginning April 23, 2024:

- Serverless for Apache Spark also uses your CMEK to encrypt batch job arguments. To enable this behavior, the [Cloud KMS CryptoKey Encrypter/Decrypter](/kms/docs/reference/permissions-and-roles#cloudkms.cryptoKeyEncrypterDecrypter) IAM role must be assigned to the Dataproc Service Agent service account. If the [Dataproc Service Agent role](/iam/docs/understanding-roles#dataproc.serviceAgent) is not attached to the Dataproc Service Agent service account, add the `serviceusage.services.use` permission to a custom role attached to that service account. The Cloud KMS API must be enabled on the project that runs Serverless for Apache Spark resources.
- [`batches.list`](/dataproc-serverless/docs/reference/rest/v1/projects.locations.batches/list) returns an `unreachable` field that lists any batches with job arguments that couldn't be decrypted. You can issue [`batches.get`](/dataproc-serverless/docs/reference/rest/v1/projects.locations.batches/get) requests to obtain more information on unreachable batches.
- The key (CMEK) must be located in the same location as the encrypted resource. For example, the CMEK used to encrypt a batch that runs in the `us-central1` region must also be located in the `us-central1` region.

1. Create a key using the [Cloud Key Management Service (Cloud KMS)](/kms/docs/creating-keys).

2. Copy the resource name.

   The resource name is constructed as follows:

   ```
   projects/PROJECT_ID/locations/REGION/keyRings/KEY_RING_NAME/cryptoKeys/KEY_NAME
   ```

3. Enable the Compute Engine, Dataproc, and Cloud Storage Service Agent service accounts to use your key:

   1. See [Protect resources by using Cloud KMS keys > Required Roles](/compute/docs/disks/customer-managed-encryption#required-roles) to assign the [Cloud KMS CryptoKey Encrypter/Decrypter](/kms/docs/reference/permissions-and-roles#cloudkms.cryptoKeyEncrypterDecrypter) role to the [Compute Engine Service Agent service account](/compute/docs/access/service-accounts#compute_engine_service_account). If this service account is not listed on the IAM page in Google Cloud console, click **Include Google-provided role grants** to list it.

   2. Assign the [Cloud KMS CryptoKey Encrypter/Decrypter](/kms/docs/reference/permissions-and-roles#cloudkms.cryptoKeyEncrypterDecrypter) role to the [Dataproc Service Agent service account](/dataproc/docs/concepts/iam/dataproc-principals#service_agent_control_plane_identity). You can use the Google Cloud CLI to assign the role:

      ```
      gcloud projects add-iam-policy-binding KMS_PROJECT_ID \
          --member serviceAccount:service-PROJECT_NUMBER@dataproc-accounts.iam.gserviceaccount.com \
          --role roles/cloudkms.cryptoKeyEncrypterDecrypter
      ```

      Replace the following:

      KMS_PROJECT_ID: the ID of your Google Cloud project that runs Cloud KMS. This project can also be the project that runs Dataproc resources.

      PROJECT_NUMBER: the project number (not the project ID) of your Google Cloud project that runs Dataproc resources.

   3. Enable the Cloud KMS API on the project that runs Serverless for Apache Spark resources.

   4. If the [Dataproc Service Agent role](/iam/docs/understanding-roles#dataproc.serviceAgent) is not attached to the Dataproc Service Agent service account, add the `serviceusage.services.use` permission to the custom role attached to the Dataproc Service Agent service account. If the Dataproc Service Agent role is attached to the Dataproc Service Agent service account, you can skip this step.

   5. Follow the steps to [add your key on the bucket](/storage/docs/encryption/using-customer-managed-keys#set-default-key).

4. When you [submit a batch workload](/dataproc-serverless/docs/quickstarts/spark-batch#submit_a_spark_batch_workload):

   1. Specify your key in the batch [`kmsKey`](/dataproc-serverless/docs/reference/rest/v1/EnvironmentConfig#ExecutionConfig.FIELDS.kms_key) parameter.
   2. Specify the name of your Cloud Storage bucket in the batch [`stagingBucket`](/dataproc-serverless/docs/reference/rest/v1/EnvironmentConfig#ExecutionConfig.FIELDS.staging_bucket) parameter.

5. When you [create an interactive session or session template](/dataproc-serverless/docs/guides/create-serverless-sessions-templates):

   1. Specify your key in the session [`kmsKey`](/dataproc-serverless/docs/reference/rest/v1/EnvironmentConfig#ExecutionConfig.FIELDS.kms_key) parameter.
   2. Specify the name of your Cloud Storage bucket in the session [`stagingBucket`](/dataproc-serverless/docs/reference/rest/v1/EnvironmentConfig#ExecutionConfig.FIELDS.staging_bucket) parameter.

Last updated 2025-09-04 UTC.
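Parts of the procedure above can be sketched as a small script. This is an illustration under stated assumptions, not an official tool: it builds the key resource name from step 2, checks that the key is in the same region as the batch, and prints (without running) the commands for steps 3.2, 3.3, 3.5, and 4. It does not cover steps 3.1 and 3.4. The helper names and all example values are hypothetical. The `--kms-key` and `--staging-bucket` flags are assumed to correspond to the `kmsKey` and `stagingBucket` parameters, and the SparkPi class and jar path are an assumed sample workload; check `gcloud dataproc batches submit spark --help` before relying on them.

```shell
#!/bin/sh
# Sketch only: prints the gcloud commands for review instead of running them.
set -eu

# Full key resource name, in the format shown in step 2.
kms_key_name() {
  printf 'projects/%s/locations/%s/keyRings/%s/cryptoKeys/%s\n' "$1" "$2" "$3" "$4"
}

# Extract one "/"-separated field of a key resource name
# (field 2 is the project ID, field 4 is the region).
kms_key_field() {
  printf '%s\n' "$1" | cut -d/ -f"$2"
}

# Arguments: key resource name, batch region, workload project ID,
# workload project number, staging bucket name.
print_cmek_commands() {
  key="$1"; region="$2"; project="$3"; number="$4"; bucket="$5"
  # The CMEK must be in the same location as the batch it encrypts.
  if [ "$(kms_key_field "$key" 4)" != "$region" ]; then
    echo "error: key is not located in $region" >&2
    return 1
  fi
  kms_project="$(kms_key_field "$key" 2)"
  cat <<EOF
gcloud services enable cloudkms.googleapis.com --project=$project
gcloud projects add-iam-policy-binding $kms_project \\
    --member serviceAccount:service-$number@dataproc-accounts.iam.gserviceaccount.com \\
    --role roles/cloudkms.cryptoKeyEncrypterDecrypter
gcloud storage buckets update gs://$bucket --default-encryption-key=$key
gcloud dataproc batches submit spark --region=$region \\
    --kms-key=$key --staging-bucket=$bucket \\
    --class=org.apache.spark.examples.SparkPi \\
    --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar -- 1000
EOF
}

# Hypothetical values for illustration.
print_cmek_commands \
  "$(kms_key_name my-kms-project us-central1 spark-keyring spark-cmek)" \
  us-central1 my-spark-project 123456789012 my-staging-bucket
```

The region check runs before anything is printed, so a key in the wrong location is reported early rather than surfacing later as a failed batch.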