You can transform your CloudEvents data by writing transformation expressions using CEL. For more information, see Transform received events.
The following are some common use cases and examples that show you how to write CEL expressions to transform your event data.
Standard use cases
The following are some standard use cases when transforming event data.
Data normalization
You need to flatten a nested data structure in your event message to allow for easier processing by a downstream service.
- Scenario:
Given the following CloudEvents data:
{ "data": { "orderId": "12345", "customer": { "firstName": "Alex", "lastName": "Taylor", "address": { "street": "1800 Amphibious Blvd.", "city": "Mountain View" } } } }
You want to write a CEL expression that results in the following output:
{ "data": { "orderId": "12345", "customerFirstName": "Alex", "customerLastName": "Taylor", "customerStreet": "1800 Amphibious Blvd.", "customerCity": "Mountain View" } }
- Solution 1:
Format the output data manually. This lets you list the field names and pick only the elements needed in the output. This is a reasonable approach when the input is predictable and when the number of fields is low. For example:
message.setField("data", { "orderId": message.data.orderId, "customerFirstName": message.data.customer.firstName, "customerLastName": message.data.customer.lastName, "customerStreet": message.data.customer.address.street, "customerCity": message.data.customer.address.city, })
- Solution 2:
Use a function in your expression. The denormalize function flattens deep structures to a list of key and value pairs. Field names are delimited using a period (.) to segment the structure hierarchy. For example:
message.setField("data", message.data.denormalize())
This results in the following output, which differs slightly from the expected payload: each key keeps its full dotted path (for example, customer.address.city instead of customerCity). The advantages are a shorter CEL expression that operates on any input and automatically includes any number of incoming fields.
{ "data": { "orderId": "12345", "customer.firstName": "Alex", "customer.lastName": "Taylor", "customer.address.street": "1800 Amphibious Blvd.", "customer.address.city": "Mountain View" } }
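To preview the flattened shape locally, the behavior described above can be approximated in Python. This is a minimal sketch, not the Eventarc implementation: the function below is a hypothetical stand-in for denormalize that only handles nested maps, and how the real function treats lists is not modeled here.

```python
def denormalize(value, prefix=""):
    """Flatten nested dicts into one dict keyed by dot-delimited paths.

    Hypothetical stand-in for the CEL denormalize function; only nested
    maps are handled, lists are copied through unchanged.
    """
    flat = {}
    for key, item in value.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(item, dict):
            # Recurse into nested maps, extending the dotted path.
            flat.update(denormalize(item, path))
        else:
            flat[path] = item
    return flat


data = {
    "orderId": "12345",
    "customer": {
        "firstName": "Alex",
        "lastName": "Taylor",
        "address": {"street": "1800 Amphibious Blvd.", "city": "Mountain View"},
    },
}

flattened = denormalize(data)
print(flattened)
```

Running the sketch against the scenario input yields the same dotted keys as the output shown above, which makes it a convenient way to check what a downstream consumer will receive.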
Data masking
You need to mask sensitive data in an event payload before it's sent on to a less secure environment.
- Scenario:
Given the following CloudEvents data:
{ "data": { "userId": "user123", "email": "alex@example.com", "creditCardNumber": "1234-5678-9012-3456" } }
You want to write a CEL expression that results in the following output:
{ "data": { "userId": "user123", "email": "a***@example.com", "creditCardNumber": "xxxx-xxxx-xxxx-3456" } }
- Solution:
Use an expression to mask any sensitive information such as the email address and credit card number. For example:
message .setField("data.email", re.extract(message.data.email, "(^.).*@(.*)", "\\1***@\\2")) .setField("data.creditCardNumber", re.extract(message.data.creditCardNumber, "(\\d{4})\\D*$", "xxxx-xxxx-xxxx-\\1"))
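The two patterns can be checked outside of CEL before you deploy the expression. The following Python sketch is an approximation under stated assumptions: the extract helper is hypothetical, it mimics re.extract by expanding a rewrite template against the first match, and raising an error when nothing matches is an assumption rather than documented behavior.

```python
import re


def extract(target, pattern, rewrite):
    """Expand the rewrite template against the first match of pattern.

    Hypothetical local stand-in for the CEL re.extract function.
    """
    match = re.search(pattern, target)
    if match is None:
        # Assumption: treat a non-matching input as an error.
        raise ValueError(f"no match for {pattern!r} in {target!r}")
    return match.expand(rewrite)


# Keep the first character of the local part and the full domain.
masked_email = extract("alex@example.com", r"(^.).*@(.*)", r"\1***@\2")

# Keep only the last four digits of the card number.
masked_card = extract("1234-5678-9012-3456", r"(\d{4})\D*$", r"xxxx-xxxx-xxxx-\1")

print(masked_email)  # a***@example.com
print(masked_card)   # xxxx-xxxx-xxxx-3456
```

Note that the CEL string literals double the backslashes ("\\1", "\\d") because the CEL string itself is unescaped first; the Python raw strings avoid that extra layer.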
Data redaction
You need to remove specific fields from an event payload based upon certain conditions.
- Scenario:
Given the following CloudEvents data:
{ "data": { "orderId": "12345", "customerType": "gold", "discountCode": "VIP" } }
You want to write a CEL expression that results in the following output:
{ "data": { "orderId": "12345", "customerType": "gold" } }
- Solution:
Use an expression that redacts the discountCode field if the customerType is "gold". For example:
message.data.customerType == "gold" ? message.removeFields(["data.discountCode"]) : message
Data conversion
You need to convert data from one format or type to another.
- Scenario:
Given the following CloudEvents data:
{ "data": { "orderDate": "2024-10-31T12:00:00Z", "totalAmount": "1500" } }
You want to write a CEL expression that results in the following output:
{ "data": { "orderDate": 1730376000, "totalAmount": 1500.00 } }
- Solution:
Use an expression which converts orderDate to a UNIX timestamp, and the totalAmount type from a string to a double (floating-point number). For example:
message .setField("data.orderDate", int(timestamp(message.data.orderDate))) .setField("data.totalAmount", double(message.data.totalAmount))
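The same conversion can be reproduced in Python to confirm the values that the expression should produce. This is a sketch for local verification only; the convert function name is hypothetical, and the "Z" suffix is rewritten to "+00:00" so that the code also runs on Python versions earlier than 3.11.

```python
from datetime import datetime


def convert(data):
    """Return a copy of data with orderDate as epoch seconds and totalAmount as a float."""
    # Parse the RFC 3339 timestamp as a timezone-aware UTC datetime.
    order_date = datetime.fromisoformat(data["orderDate"].replace("Z", "+00:00"))
    return {
        **data,
        "orderDate": int(order_date.timestamp()),
        "totalAmount": float(data["totalAmount"]),
    }


converted = convert({"orderDate": "2024-10-31T12:00:00Z", "totalAmount": "1500"})
print(converted)  # {'orderDate': 1730376000, 'totalAmount': 1500.0}
```

The timestamp 2024-10-31T12:00:00Z corresponds to 1730376000 seconds since the UNIX epoch.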
Conditional routing
You need to route events to different destinations based on the event data.
- Scenario:
Given the following CloudEvents data:
{ "data": { "eventType": "order.created", "orderValue": 200 } }
You want to write a CEL expression that results in the following output:
{ "data": { "eventType": "order.created", "orderValue": 200, "routingKey": "highValue" } }
- Solution:
Use an expression that adds a routingKey field with a value of "highValue" if the orderValue is greater than 100; otherwise, the value is "normal". The routingKey field can be used to determine the routing path. For example:
message.data.orderValue > 100 ? message.setField("data.routingKey", "highValue") : message.setField("data.routingKey", "normal")
Default value handling
You need to ensure that certain fields in the event payload have default values if they are not present.
- Scenario:
Given the following CloudEvents data:
{ "data": { "itemName": "Product A" } }
You want to write a CEL expression that results in the following output:
{ "data": { "itemName": "Product A", "quantity": 1 } }
- Solution:
Use an expression that adds a quantity field with a default value of 1 if the field doesn't already exist. For example:
has(message.data.quantity) ? message : message.setField("data.quantity", 1)
String manipulation
You need to extract or modify parts of a string field in the event data.
- Scenario:
Given the following CloudEvents data:
{ "data": { "customerEmail": "alex@example.com" } }
You want to write a CEL expression that results in the following output:
{ "data": { "customerEmail": "alex@example.com", "emailDomain": "example.com" } }
- Solution:
Use an expression that extracts the domain name ("example.com") from the customerEmail field and stores it in a new emailDomain field. For example:
message .setField("data.emailDomain", re.extract(message.data.customerEmail, "(^.*@)(.*)", "\\2"))
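To see which part of the address each capture group receives, the pattern can be tried in Python. This is an illustrative sketch, not part of the transformation itself; the email_domain helper is hypothetical.

```python
import re


def email_domain(email):
    """Return everything after the last "@" using the same two-group pattern."""
    match = re.search(r"(^.*@)(.*)", email)
    if match is None:
        # Assumption: an address without "@" is treated as an error.
        raise ValueError(f"not an email address: {email!r}")
    # Group 1 is the local part including "@"; group 2 is the domain.
    return match.expand(r"\2")


domain = email_domain("alex@example.com")
print(domain)  # example.com
```

Because the first group is greedy, it consumes up to the last "@" in the string, so group 2 always holds the text after the final "@".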
List and map operations
You need to work with lists or maps in the event data.
- Scenario:
Given the following CloudEvents data:
{ "data": { "productIds": [ "product123", "product456" ] } }
You want to write a CEL expression that results in the following output:
{ "data": { "productIds": [ "product123", "product456" ], "productFound": true } }
- Solution:
Use an expression that checks if "product456" exists in the productIds list and stores the result (true or false) in a new productFound field. For example:
message.setField("data.productFound", message.data.productIds.exists(id, id == "product456"))
Error handling
You need to gracefully handle potential errors or unexpected data in the event payload.
- Scenario:
Given the following CloudEvents data:
{ "data": { "quantity": "abc" } }
You want to write a CEL expression that results in the following output:
{ "data": { "quantity": 0, "error": "Invalid quantity" } }
- Solution:
Use an expression that checks whether the quantity field exists and is a string that contains a valid integer. If the check passes, the message is passed through unchanged. If the check fails, set the quantity field to 0, and add a new error field with the value "Invalid quantity". For example:
// Check if data.quantity exists
has(message.data.quantity) &&
// Check if data.quantity is a string
type(message.data.quantity) == string &&
// Check if string consists of digits
message.data.quantity.matches(r'^-?[0-9]+$') ?
// If data.quantity is valid, use message
message :
// If data.quantity is invalid, set to 0 and generate error
message
  .setField("data.quantity", 0)
  .setField("data.error", "Invalid quantity")
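The three checks in the expression (the field exists, it is a string, and it looks like an integer) can be exercised against sample payloads in Python. This is a minimal sketch for testing the logic locally; the validate_quantity function is hypothetical and works on a plain dictionary rather than an event message.

```python
import re


def validate_quantity(data):
    """Return data unchanged when quantity is an integer-like string.

    Otherwise return a copy with quantity reset to 0 and an error field added.
    """
    quantity = data.get("quantity")
    # A missing field yields None, which fails the string check below.
    if isinstance(quantity, str) and re.fullmatch(r"-?[0-9]+", quantity):
        return data
    return {**data, "quantity": 0, "error": "Invalid quantity"}


invalid = validate_quantity({"quantity": "abc"})
valid = validate_quantity({"quantity": "42"})
missing = validate_quantity({})

print(invalid)  # {'quantity': 0, 'error': 'Invalid quantity'}
print(valid)    # {'quantity': '42'}
```

As in the CEL expression, a valid quantity stays a string; converting it to a number would be a separate step.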
Complex use cases
The following are some complex use cases when transforming event data.
Data transformation
You need to perform multiple transformations on nested event data.
- Scenario:
Given the following CloudEvents data:
{ "data": { "orderId": "12345", "customer": { "firstName": "Alex", "lastName": "Taylor", "email": "alex@example.com", "address": { "street": "1800 Amphibious Blvd.", "city": "Mountain View", "state": "CA" } }, "items": [ { "itemId": "item1", "price": 10.00, "quantity": 2 }, { "itemId": "item2", "price": 5.00, "quantity": 1 } ] } }
You want to write a CEL expression that results in the following output:
{ "data": { "orderId": "12345", "customer.firstName": "Alex", "customer.lastName": "Taylor", "customer.email": "a***@example.com", "customer.address.city": "Mountain View", "customer.address.state": "CA" } }
- Solution:
Use an expression that extracts the city and state from the address, and that masks the email address. For example:
message.setField("data",
  message.data
    .setField("customer.address",
      message.data.customer.address
        .map(key, key == "city" || key == "state", { key: message.data.customer.address[key] })
        .toMap())
    .setField("customer.email", re.extract(message.data.customer.email, "(^.).*@(.*)", "\\1***@\\2"))
    .removeFields(["items"])
    .denormalize()
)
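The four steps of this transformation (filter the address, mask the email, drop the items, flatten the result) can be traced in Python to confirm that they produce the expected output. This is a sketch under stated assumptions: the flatten and transform functions are hypothetical stand-ins, and the email pattern captures a single leading character so that the result matches "a***@example.com".

```python
import re


def flatten(value, prefix=""):
    """Flatten nested dicts into dot-delimited keys (stand-in for denormalize)."""
    flat = {}
    for key, item in value.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(item, dict):
            flat.update(flatten(item, path))
        else:
            flat[path] = item
    return flat


def transform(data):
    """Apply the four steps to a copy of the event data."""
    customer = data["customer"]
    # Step 1: keep only the city and state entries of the address.
    address = {k: v for k, v in customer["address"].items() if k in ("city", "state")}
    # Step 2: mask the email; a greedy two-character capture would keep "al".
    email = re.search(r"(^.).*@(.*)", customer["email"]).expand(r"\1***@\2")
    # Step 3: drop the items list.
    trimmed = {k: v for k, v in data.items() if k != "items"}
    trimmed["customer"] = {**customer, "address": address, "email": email}
    # Step 4: flatten the remaining structure.
    return flatten(trimmed)


order = {
    "orderId": "12345",
    "customer": {
        "firstName": "Alex",
        "lastName": "Taylor",
        "email": "alex@example.com",
        "address": {"street": "1800 Amphibious Blvd.", "city": "Mountain View", "state": "CA"},
    },
    "items": [
        {"itemId": "item1", "price": 10.00, "quantity": 2},
        {"itemId": "item2", "price": 5.00, "quantity": 1},
    ],
}

result = transform(order)
print(result)
```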
Data formatting and routing
You need to format event data, add product information, and then route the event message.
- Scenario:
Given the following CloudEvents data:
{ "data": { "productId": "p123", "productName": "Example Product", "category": "electronics" } }
You want to write a CEL expression that results in the following output:
{ "data": { "productId": "electronics-p123", "productName": "EXAMPLE PRODUCT", "category": "electronics", "routingKey": "electronics" } }
- Solution:
Use an expression that formats the product name to uppercase, adds a prefix to the product ID based on its category, and includes a routing key for downstream processing. For example:
message .setField("data.productId", message.data.category + "-" + message.data.productId) .setField("data.productName", message.data.productName.upperAscii()) .setField("data.routingKey", message.data.category)
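Each setField call in the expression reads from the original message, so the prefix is built from the unmodified productId regardless of the order of the calls. A Python sketch of the same mapping makes that explicit. The format_and_route function is hypothetical, and Python's upper() also converts non-ASCII letters, whereas upperAscii only affects ASCII characters.

```python
def format_and_route(data):
    """Return a copy of data with a prefixed ID, an uppercase name, and a routing key."""
    return {
        **data,
        # All values on the right are read from the original input.
        "productId": f"{data['category']}-{data['productId']}",
        "productName": data["productName"].upper(),
        "routingKey": data["category"],
    }


formatted = format_and_route(
    {"productId": "p123", "productName": "Example Product", "category": "electronics"}
)
print(formatted)
```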