AEAD Encryption Concepts in Standard SQL

This topic explains the concepts behind AEAD encryption in BigQuery. For a description of the different AEAD encryption functions that BigQuery supports, see AEAD encryption functions.

Purpose of AEAD Encryption

BigQuery keeps your data safe by using encryption at rest. BigQuery also supports customer-managed encryption keys (CMEK), which let you encrypt tables using specific encryption keys. In some cases, however, you may want to encrypt individual values within a table.

For example, you might want to keep data for all of your own customers in a common table and encrypt each customer's data using a different key. Or you might have data spread across multiple tables that you want to be able to "crypto-delete". Crypto-deletion, or crypto-shredding, is the process of deleting an encryption key to render unreadable any data encrypted using that key.

AEAD encryption functions allow you to create keysets that contain keys for encryption and decryption, use these keys to encrypt and decrypt individual values in a table, and rotate keys within a keyset.
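The per-customer scenario above can be sketched as a multi-statement query. This is a minimal illustration, not a recommended production layout: the table names (customer_keysets, customer_data), the column names, and the sample values are all hypothetical, and the key type 'AEAD_AES_GCM_256' is just one choice.

```sql
-- Hypothetical table: one keyset per customer, kept apart from the data.
CREATE TEMP TABLE customer_keysets AS
SELECT customer_id, KEYS.NEW_KEYSET('AEAD_AES_GCM_256') AS keyset
FROM UNNEST([1, 2]) AS customer_id;

-- Hypothetical table: each customer's value encrypted with that customer's keyset.
CREATE TEMP TABLE customer_data AS
SELECT
  k.customer_id,
  AEAD.ENCRYPT(k.keyset, d.note, CAST(k.customer_id AS STRING)) AS note_ciphertext
FROM customer_keysets AS k
JOIN (
  SELECT 1 AS customer_id, 'note for customer 1' AS note
  UNION ALL
  SELECT 2, 'note for customer 2'
) AS d
  ON k.customer_id = d.customer_id;

-- Crypto-delete customer 2 by deleting only that customer's keyset.
DELETE FROM customer_keysets WHERE customer_id = 2;

-- Only rows whose keyset still exists can be decrypted.
SELECT
  d.customer_id,
  AEAD.DECRYPT_STRING(k.keyset, d.note_ciphertext, CAST(d.customer_id AS STRING)) AS note
FROM customer_data AS d
JOIN customer_keysets AS k
  ON k.customer_id = d.customer_id;
```

The ciphertext for customer 2 is still stored in customer_data, but without the keyset it cannot be read.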

Keysets

A keyset is a collection of cryptographic keys, one of which is the primary cryptographic key and the rest of which, if any, are secondary cryptographic keys. Each key encodes an algorithm for encryption or decryption; whether the key is enabled, disabled, or destroyed; and, for non-destroyed keys, the key bytes themselves. The primary cryptographic key determines how to encrypt input plaintext. The primary cryptographic key can never be in a disabled state. Secondary cryptographic keys are used only for decryption and can be in either an enabled or a disabled state. A keyset can be used to decrypt any data that it was used to encrypt, provided that the key that performed the encryption is still enabled in the keyset.

BigQuery represents a keyset as a serialized google.crypto.tink.Keyset protocol buffer in BYTES.
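As a minimal sketch, the following script creates a keyset and inspects it. The variable name keyset and the key type 'AEAD_AES_GCM_256' are choices made for this illustration.

```sql
-- KEYS.NEW_KEYSET returns the serialized keyset as BYTES.
DECLARE keyset BYTES DEFAULT KEYS.NEW_KEYSET('AEAD_AES_GCM_256');

SELECT
  keyset,                                      -- raw serialized protocol buffer
  KEYS.KEYSET_LENGTH(keyset) AS key_count,     -- number of keys in the keyset
  KEYS.KEYSET_TO_JSON(keyset) AS keyset_json;  -- readable JSON STRING form
```

Because the serialized keyset contains the key material itself, treat the keyset column and its JSON form as secrets.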

Example

The following is an example of an AEAD keyset with three keys, shown in protocol buffer text format.

primary_key_id: 569259624
key {
  key_data {
    type_url: "type.googleapis.com/google.crypto.tink.AesGcmKey"
    value: ",&\264kh\377\306\217\371\233E<\0350A4\023B-pd\203\277\240\371\212^\210bf\347\256"
    key_material_type: SYMMETRIC
  }
  status: ENABLED
  key_id: 569259624
  output_prefix_type: TINK
}
key {
  key_data {
    type_url: "type.googleapis.com/google.crypto.tink.AesGcmKey"
    value: "\374\336+.\333\245k\364\010`\037\267!\376\233\\3\215\020\356B\236\240O\256U\021\266\217\277\217\271"
    key_material_type: SYMMETRIC
  }
  status: DISABLED
  key_id: 852264701
  output_prefix_type: TINK
}
key {
  status: DESTROYED
  key_id: 237910588
  output_prefix_type: TINK
}

In the above example, the primary cryptographic key has an ID of 569259624 and is the first key listed. There are two secondary cryptographic keys, one with ID 852264701 in a disabled state, and another with ID 237910588 in a destroyed state. When an AEAD encryption function uses this keyset for encryption, the resulting ciphertext encodes the primary cryptographic key's ID of 569259624.

When an AEAD function uses this keyset for decryption, the function chooses the appropriate key for decryption based on the key ID encoded in the ciphertext. In the example above, attempting to decrypt using either key ID 852264701 or key ID 237910588 would result in an error, because key ID 852264701 is disabled and key ID 237910588 is destroyed. Restoring key ID 852264701 to an enabled state would render it usable for decryption.
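To see the key ID that a ciphertext carries, you can look at its leading bytes. The sketch below rests on an assumption that this page does not state: for keys with the TINK output prefix type, Tink's wire format places a 5-byte prefix (one version byte followed by the 4-byte key ID) at the start of the ciphertext. The variable name and sample values are illustrative.

```sql
DECLARE keyset BYTES DEFAULT KEYS.NEW_KEYSET('AEAD_AES_GCM_256');

SELECT
  -- Assumed layout: first 5 bytes = version byte + 4-byte key ID.
  TO_HEX(SUBSTR(AEAD.ENCRYPT(keyset, 'first value', 'ad'), 1, 5)) AS prefix_1,
  TO_HEX(SUBSTR(AEAD.ENCRYPT(keyset, 'second value', 'ad'), 1, 5)) AS prefix_2,
  -- Compare the prefixes with the primary key ID reported here.
  KEYS.KEYSET_TO_JSON(keyset) AS keyset_json;
```

Under that assumption, both prefixes should be identical, because both ciphertexts were produced by the same primary key.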

The key type determines the encryption mode to use with that key.

Encrypting plaintext more than once using the same keyset will generally return different ciphertext values due to different initialization vectors (IVs), which are chosen using the pseudo-random number generator provided by OpenSSL.
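A short sketch of this behavior, using an illustrative plaintext and additional data value:

```sql
DECLARE keyset BYTES DEFAULT KEYS.NEW_KEYSET('AEAD_AES_GCM_256');

SELECT
  c1 != c2 AS ciphertexts_differ,  -- generally TRUE: each call uses a fresh IV
  AEAD.DECRYPT_STRING(keyset, c1, 'ad')
    = AEAD.DECRYPT_STRING(keyset, c2, 'ad') AS plaintexts_match
FROM (
  SELECT
    AEAD.ENCRYPT(keyset, 'hello', 'ad') AS c1,
    AEAD.ENCRYPT(keyset, 'hello', 'ad') AS c2
);
```

One practical consequence is that ciphertext columns cannot be used for equality joins or grouping on the underlying plaintext.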

Advanced Encryption Standard (AES)

AEAD encryption functions use Advanced Encryption Standard (AES) encryption. AES encryption takes plaintext as input, along with a cryptographic key, and returns an encrypted sequence of bytes as output. This sequence of bytes can later be decrypted using the same key as was used to encrypt it. AES uses a block size of 16 bytes, meaning that the plaintext is treated as a sequence of 16-byte blocks. The ciphertext will contain a Tink-specific prefix indicating the key used to perform the encryption. AES encryption supports multiple block cipher modes.

Block cipher modes

Two block cipher modes supported by AEAD encryption functions are GCM and CBC.

GCM

Galois/Counter Mode (GCM) is a mode for AES encryption. The encryption function numbers the blocks sequentially, and then combines each block number with an initialization vector (IV). An initialization vector is a random or pseudo-random value that forms the basis of the randomization of the plaintext data. Next, the function encrypts the combined block number and IV using AES. The function then performs a bitwise logical exclusive or (XOR) operation on the result of the encryption and the plaintext to produce the ciphertext. GCM mode uses a cryptographic key of 128 or 256 bits in length.

CBC

Cipher Block Chaining (CBC) mode “chains” blocks by XORing each block of plaintext with the previous block of ciphertext prior to encrypting it. CBC uses a 16-byte initialization vector as the initial block and XORs this block with the first plaintext block. CBC mode uses a cryptographic key of 128, 192, or 256 bits in length.

Additional data

AEAD encryption functions support the use of an additional_data argument, also known as associated data (AD) or additional authenticated data. Unlike the keyset, this additional data does not enable decryption of the resulting ciphertext by itself. This additional data ensures the authenticity and integrity of the encrypted data, but not its secrecy.

For example, additional_data could be the output of CAST(customer_id AS STRING) when encrypting data for a particular customer. This ensures that when the data is decrypted, it was previously encrypted using the expected customer_id. The same additional_data value is required for decryption. For more information, see RFC 5116.
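The customer_id case can be sketched as follows. The customers data, column names, and email values are hypothetical and exist only to make the example self-contained.

```sql
DECLARE keyset BYTES DEFAULT KEYS.NEW_KEYSET('AEAD_AES_GCM_256');

WITH customers AS (
  SELECT 1 AS customer_id, 'alice@example.com' AS email
  UNION ALL
  SELECT 2, 'bob@example.com'
),
encrypted AS (
  SELECT
    customer_id,
    -- Bind each ciphertext to the customer it belongs to.
    AEAD.ENCRYPT(keyset, email, CAST(customer_id AS STRING)) AS email_ciphertext
  FROM customers
)
SELECT
  customer_id,
  -- Decryption succeeds only with the same additional_data value.
  AEAD.DECRYPT_STRING(keyset, email_ciphertext, CAST(customer_id AS STRING)) AS email
FROM encrypted;
```

If a ciphertext were copied into another customer's row, the additional_data supplied at decryption would no longer match and the call would return an error instead of plaintext.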

Decryption

The output of AEAD.ENCRYPT is ciphertext BYTES. The AEAD.DECRYPT_STRING or AEAD.DECRYPT_BYTES functions can decrypt this ciphertext. These functions must use a keyset that contains the key that was used for encryption. That key must be in an 'ENABLED' state. They must also use the same additional_data as was used in encryption.

When the keyset is used for decryption, the appropriate key is chosen for decryption based on the key ID encoded in the ciphertext.

The output of AEAD.DECRYPT_STRING is a plaintext STRING, whereas the output of AEAD.DECRYPT_BYTES is plaintext BYTES. AEAD.DECRYPT_STRING can decrypt ciphertext that encodes a STRING value; AEAD.DECRYPT_BYTES can decrypt ciphertext that encodes a BYTES value. Using one of these functions to decrypt a ciphertext that encodes the wrong data type, such as using AEAD.DECRYPT_STRING to decrypt ciphertext that encodes a BYTES value, causes undefined behavior and may result in an error.
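The pairing between the plaintext type and the decryption function can be sketched like this. The values are illustrative, and each additional_data value is deliberately given the same type as its plaintext to keep the example unambiguous.

```sql
DECLARE keyset BYTES DEFAULT KEYS.NEW_KEYSET('AEAD_AES_GCM_256');

SELECT
  -- STRING plaintext: decrypt with AEAD.DECRYPT_STRING.
  AEAD.DECRYPT_STRING(
    keyset,
    AEAD.ENCRYPT(keyset, 'plain text', 'ctx'),
    'ctx') AS as_string,
  -- BYTES plaintext: decrypt with AEAD.DECRYPT_BYTES.
  AEAD.DECRYPT_BYTES(
    keyset,
    AEAD.ENCRYPT(keyset, b'\x01\x02\x03', b'ctx'),
    b'ctx') AS as_bytes;
```

Recording the original type alongside each ciphertext column, for example in the column name, makes it easier to choose the matching function later.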

Key Rotation

The primary purpose of rotating encryption keys is to reduce the amount of data encrypted with any particular key, so that a compromised key would give an attacker access to less data.

Keyset rotation involves:

  1. Creating a new primary cryptographic key within every keyset.
  2. Decrypting and re-encrypting all encrypted data.

The KEYS.ROTATE_KEYSET function performs the first step by adding a new primary cryptographic key to a keyset and changing the old primary cryptographic key to a secondary cryptographic key.
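Both rotation steps can be sketched on a single value. The variable names and sample plaintext are illustrative; in practice the second step would be an UPDATE over every table that stores ciphertext produced with the keyset.

```sql
DECLARE old_keyset, new_keyset, ciphertext BYTES;

SET old_keyset = KEYS.NEW_KEYSET('AEAD_AES_GCM_256');
SET ciphertext = AEAD.ENCRYPT(old_keyset, 'secret value', 'ctx');

-- Step 1: add a new primary key; the old primary becomes a secondary key.
SET new_keyset = KEYS.ROTATE_KEYSET(old_keyset, 'AEAD_AES_GCM_256');

SELECT
  KEYS.KEYSET_LENGTH(new_keyset) AS key_count,
  -- Data encrypted before rotation is still readable through the secondary key.
  AEAD.DECRYPT_STRING(new_keyset, ciphertext, 'ctx') AS still_readable,
  -- Step 2: decrypt and re-encrypt so the value is protected by the new primary key.
  AEAD.ENCRYPT(
    new_keyset,
    AEAD.DECRYPT_STRING(new_keyset, ciphertext, 'ctx'),
    'ctx') AS reencrypted;
```

Until the second step has been completed for all data, the old key must stay enabled in the keyset, otherwise values still encrypted with it cannot be decrypted.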
