YARA-L 2.0 language syntax

This section describes the major elements of the YARA-L syntax. See also Overview of the YARA-L 2.0 language.

Rule structure

For YARA-L 2.0, you must specify variable declarations, definitions, and usages in the following order:

  1. meta
  2. events
  3. match (optional)
  4. outcome (optional)
  5. condition
  6. options (optional)

The following illustrates the generic structure of a rule:

rule <rule Name>
{
  meta:
    // Stores arbitrary key-value pairs of rule details, such as who wrote
    // it, what it detects on, version control, etc.

  events:
    // Conditions to filter events and the relationship between events.

  match:
    // Values to return when matches are found.

  outcome:
    // Additional information extracted from each detection.

  condition:
    // Condition to check events and the variables used to find matches.

  options:
    // Options to turn on or off while executing this rule.
}

Comments

Designate single-line comments with two slash characters (// comment) and multi-line comments with slash-asterisk delimiters (/* comment */), as you would in C.

Constants

Nonnegative integers (without decimal points), string, boolean, and regex constants are supported.

String and regex constants

You can use either of the following quotation characters to enclose strings in YARA-L 2.0. However, quoted text is interpreted differently depending on which one you use.

  1. Double quotes ("): Use for normal strings. Escape sequences are interpreted.
    For example: "hello\tworld" (\t is interpreted as a tab)

  2. Back quotes (`): Use to interpret all characters literally.
    For example: `hello\tworld` (\t is not interpreted as a tab)

For regular expressions, you have two options.

If you want to use regular expressions directly without the re.regex() function, use /regex/ for the regular expression constants.

You can also use string constants as regex constants when you use the re.regex() function. Note that for double quote string constants, you must escape backslash characters with backslash characters, which can look awkward.

For example, the following regular expressions are equivalent:

  • re.regex($e.network.email.from, `.*altostrat\.com`)
  • re.regex($e.network.email.from, ".*altostrat\\.com")
  • $e.network.email.from = /.*altostrat\.com/

Google recommends using back quote characters for strings in regular expressions for ease of readability.
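
As a rough analogy only, the two quoting styles behave much like Python's ordinary and raw string literals. The following Python sketch is not YARA-L; it merely illustrates why the double-quoted form needs a doubled backslash while the back-quoted form does not. The sample sender address is hypothetical.

```python
import re

# Ordinary literal: \t is an escape sequence (a tab), like a YARA-L double-quoted string.
escaped = "hello\tworld"
# Raw literal: the backslash is kept as-is, like a YARA-L back-quoted string.
literal = r"hello\tworld"

# The same regular expression written both ways.
pattern_raw = r".*altostrat\.com"      # analogous to `.*altostrat\.com`
pattern_escaped = ".*altostrat\\.com"  # analogous to ".*altostrat\\.com"

sender = "alice@altostrat.com"  # hypothetical sample value
same_pattern = pattern_raw == pattern_escaped
matches = re.search(pattern_raw, sender) is not None
```

Both spellings produce the identical pattern text, which is why the back-quoted form is usually easier to read.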

Operators

You can use the following operators in YARA-L:

Operator    Description
=           equal/declaration
!=          not equal
<           less than
<=          less than or equal
>           greater than
>=          greater than or equal

Variables

In YARA-L 2.0, all variables are represented as $<variable name>.

You can define the following types of variables:

  • Event variables — Represent groups of events in normalized form (UDM) or entity events. Specify conditions for event variables in the events section. You identify event variables using a name, event source, and event fields. Allowed sources are udm (for normalized events) and graph (for entity events). If the source is omitted, udm is set as the default source. Event fields are represented as a chain of .<field name> (for example, $e.field1.field2). Event field chains always start from the top-level source (UDM or Entity).

  • Match variables — Declare in the match section. Match variables become grouping fields for the query, as one row is returned for each unique set of match variables (and for each time window). When the rule finds a match, the match variable values are returned. Specify what each match variable represents in the events section.

  • Placeholder variables — Declare and define in the events section. Placeholder variables are similar to match variables. However, you can use placeholder variables in the condition section to specify match conditions.

Use match variables and placeholder variables to declare relationships between event fields through transitive join conditions (see Events Section Syntax for more detail).

Functions

This section describes the YARA-L 2.0 functions that Chronicle supports in Detection Engine.

These functions can be used in the events section of a rule and in the BOOL_CLAUSE of a conditional in the outcome section.

String functions

Chronicle supports the following string manipulation functions:

  • strings.concat(a, b)
  • strings.to_lower(stringText)
  • strings.to_upper(stringText)
  • strings.base64_decode(encodedString)

The following sections describe how to use each.

Concatenate strings or integers

Returns the concatenation of two strings, two integers, or a combination of the two.

strings.concat(a, b)

This function takes two arguments, which can be either strings or integers, and returns the two values concatenated as a string. Integers are cast to a string before concatenation. The arguments can be literals or event fields. If both arguments are fields, the two attributes must be from the same event.

The following example includes a string variable and string literal as arguments.

"google-test" = strings.concat($e.principal.hostname, "-test")

The following example includes a string variable and integer variable as arguments. Both principal.hostname and principal.port are from the same event, $e, and are concatenated to return a string.

"google80" = strings.concat($e.principal.hostname, $e.principal.port)

The following example attempts to concatenate principal.port from event $e1 with principal.hostname from event $e2. It returns a compiler error because the arguments come from different event variables.

// returns a compiler error
"test" = strings.concat($e1.principal.port, $e2.principal.hostname)

Convert string to uppercase or lowercase

These functions return string text after changing all characters to either uppercase or lowercase.

  • strings.to_lower(stringText)
  • strings.to_upper(stringText)

"test@google.com" = strings.to_lower($e.network.email.from)
"TEST@GOOGLE.COM" = strings.to_upper($e.network.email.to)

Base64 decode a string

Returns a string containing the base64 decoded version of the encoded string.

strings.base64_decode(encodedString)

This function takes one base64 encoded string as an argument. If encodedString is not a valid base64 encoded string, the function returns encodedString as-is.
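
The fallback behavior described above can be mimicked in Python as follows. This is an illustrative sketch, not Chronicle's implementation: the helper name base64_decode_or_passthrough is hypothetical, and treating decoded bytes that are not valid UTF-8 as invalid input is an assumption.

```python
import base64
import binascii

def base64_decode_or_passthrough(encoded: str) -> str:
    """Return the decoded text, or the input unchanged if it is not valid base64."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(encoded, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        # Assumption: anything that cannot be decoded is returned as-is.
        return encoded
```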

The following example returns True if principal.domain.name is "dGVzdA==", which is the base64 encoding of the string "test".

"test" = strings.base64_decode($e.principal.domain.name)

RegExp functions

Chronicle supports the following regular expression functions:

  • re.regex(stringText, regex)
  • re.capture(stringText, regex)
  • re.replace(stringText, replaceRegex, replacementText)

RegExp match

You can define regular expression matching in YARA-L 2.0 using either of the following syntaxes:

  • Using YARA syntax: Related to events. The following is a generic representation of this syntax: $e.field = /regex/
  • Using YARA-L syntax: As a function taking in the following parameters:
    • Field the regular expression is applied to.
    • Regular expression specified as a string.
    The following is a generic representation of this syntax: re.regex($e.field, `regex`)

You can append the nocase modifier to indicate that the search should ignore capitalization.

Be aware of the following while defining regular expressions in YARA-L 2.0:

  • In either case, the predicate is true if the string contains a substring that matches the regular expression provided. It is unnecessary to add .* to the beginning or at the end of the regular expression.
  • To match the exact string or only a prefix or suffix, include the ^ (starting) and $ (ending) anchor characters in the regular expression. For example, /^full$/ matches "full" exactly, while /full/ could match "fullest", "lawfull", and "joyfully".
  • If the UDM field includes newline characters, the regexp only matches the first line of the UDM field. To enforce full UDM field matching, add a (?s) to the regular expression. For example, replace /.*allUDM.*/ with /(?s).*allUDM.*/.
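
The substring-matching, anchoring, and (?s) behaviors listed above can be illustrated with Python's re module. This is only an analogy: YARA-L's regular expression engine is not Python's, and the multi-line sample value is hypothetical.

```python
import re

samples = ("full", "fullest", "lawfull", "joyfully")

# Unanchored patterns match anywhere in the string, so a leading or trailing .* is unnecessary.
unanchored = [bool(re.search(r"full", s)) for s in samples]

# Anchors restrict the match to the exact string.
anchored = [bool(re.search(r"^full$", s)) for s in samples]

# By default "." does not match a newline; the (?s) flag lets it span lines.
multiline_value = "first line\nallUDM on the second line"
without_flag = bool(re.search(r"^.*allUDM.*$", multiline_value))
with_flag = bool(re.search(r"(?s)^.*allUDM.*$", multiline_value))
```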

RegExp capture

Captures (extracts) data from a string using the regular expression pattern provided in the argument.

re.capture(stringText, regex)

This function takes two arguments:

  • stringText: the original string to search.
  • regex: the regular expression indicating the pattern to search for.

The regular expression can contain 0 or 1 capture groups in parentheses. If the regular expression contains 0 capture groups, the function returns the first entire matching substring. If the regular expression contains 1 capture group, it returns the first matching substring for the capture group. Defining two or more capture groups returns a compiler error.
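
The capture-group rules can be mimicked in Python as shown in this sketch. It is not Chronicle's implementation: the helper name capture is hypothetical, Python's re engine stands in for YARA-L's, and the two-group case raises at run time here rather than at compile time. It also returns an empty string when nothing matches, which is the documented re.capture() result for that case.

```python
import re

def capture(text: str, regex: str) -> str:
    """Mimic the documented re.capture() result for 0 or 1 capture groups."""
    compiled = re.compile(regex)
    if compiled.groups > 1:
        # YARA-L reports this as a compiler error; here it is a runtime error.
        raise ValueError("at most one capture group is allowed")
    match = compiled.search(text)
    if match is None:
        return ""  # no match yields the empty string
    if compiled.groups == 1:
        return match.group(1) or ""  # text of the single capture group
    return match.group(0)  # entire first matching substring
```

For instance, capture("aaa1bbaa2", "a+[1-9]") yields "aaa1" and capture("test@google.com", "@(.*)") yields "google.com".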

In the following example, if $e.principal.hostname contains "aaa1bbaa2", the comparison is True, because the function returns the first match. The regular expression has no capture groups.

"aaa1" = re.capture($e.principal.hostname, "a+[1-9]")

This example captures everything after the @ symbol in an email. If the $e.network.email.from field is test@google.com, the example returns google.com. This example contains one capture group.

"google.com" = re.capture($e.network.email.from, "@(.*)")

If the regular expression does not match any substring in the text, the function returns an empty string. You can omit events where no match occurs by excluding the empty string, which is especially important when you are using re.capture() with an inequality:

// Exclude the empty string to omit events where no match occurs.
"" != re.capture($e.network.email.from, "@(.*)")

// Exclude a specific string with an inequality.
"google.com" != re.capture($e.network.email.from, "@(.*)")

RegExp replacement

Performs a regular expression replacement.

re.replace(stringText, replaceRegex, replacementText)

This function takes three arguments:

  • stringText: the original string.
  • replaceRegex: the regular expression indicating the pattern to search for.
  • replacementText: The text to insert into each match.

Returns a new string derived from the original stringText, where all substrings that match the pattern in replaceRegex are replaced with the value in replacementText. You can use backslash-escaped digits (\1 to \9) within replacementText to insert text matching the corresponding parenthesized group in the replaceRegex pattern. Use \0 to refer to the entire matching text.

The function replaces non-overlapping matches and will prioritize replacing the first occurrence found. For example, re.replace("banana", "ana", "111") returns the string "b111na".

This example replaces com with org in the sender's email address and then compares the result. If the $e.network.email.from field is email@google.com, the function returns email@google.org.

"email@google.org" = re.replace($e.network.email.from, "com", "org")

This example uses backslash-escaped digits in the replacementText argument to reference matches to the replaceRegex pattern. Because both arguments are double-quoted strings, each backslash is itself escaped with a backslash.

"test1.com.google" = re.replace(
                       $e.principal.hostname, // holds "test1.test2.google.com"
                       "test2\\.([a-z]*)\\.([a-z]*)",
                       "\\2.\\1"  // \\1 holds "google", \\2 holds "com"
                     )

Note the following cases when dealing with empty strings and re.replace():

Using empty string as replaceRegex:

// In the function call below, if $e.principal.hostname contains "name",
// the result is: 1n1a1m1e1, because an empty string is found next to
// every character in `stringText`.
re.replace($e.principal.hostname, "", "1")

To replace an empty string, you can use "^$" as replaceRegex:

// In the function call below, if $e.principal.hostname contains the empty
// string, "", the result is: "none".
re.replace($e.principal.hostname, "^$", "none")
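
The replacement cases described in this section can be reproduced with Python's re.sub() as a sanity check. This is an analogy rather than Chronicle's implementation, and Python's replacement syntax differs in places (for example, the whole match is written \g<0> in Python rather than \0).

```python
import re

# Non-overlapping matches, leftmost first: only the first "ana" in "banana" is replaced.
first = re.sub(r"ana", "111", "banana")

# Group references: \2 and \1 insert the text matched by the parenthesized groups.
swapped = re.sub(r"test2\.([a-z]*)\.([a-z]*)", r"\2.\1", "test1.test2.google.com")

# An empty pattern matches at every position, including both ends of the string.
interleaved = re.sub(r"", "1", "name")

# "^$" matches only an empty string, so non-empty values are left untouched.
filled = re.sub(r"^$", "none", "")
unchanged = re.sub(r"^$", "none", "name")
```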

Date functions

Chronicle supports the following date-related functions:

  • timestamp.get_minute(unix_seconds [, time_zone])
  • timestamp.get_hour(unix_seconds [, time_zone])
  • timestamp.get_day_of_week(unix_seconds [, time_zone])
  • timestamp.get_week(unix_seconds [, time_zone])
  • timestamp.current_seconds()

Chronicle supports negative integers as the unix_seconds argument. Negative integers represent times before the Unix epoch. If you provide an invalid integer, for example a value that results in an overflow, the function will return -1. This is an uncommon scenario.

Because YARA-L 2.0 doesn't support negative integer literals, check for this condition using the less than or greater than operator. For example:

0 > timestamp.get_hour(123)

Time extraction

The following function returns an integer in the range [0, 59], representing the minute of the hour.

timestamp.get_minute(unix_seconds [, time_zone])

The following function returns an integer in the range [0, 23], representing the hour of day.

timestamp.get_hour(unix_seconds [, time_zone])

The following function returns an integer in the range [1, 7] representing the day of week starting with Sunday. For example, 1 = Sunday; 2 = Monday, etc.

timestamp.get_day_of_week(unix_seconds [, time_zone])

The following function returns an integer in the range [0, 53] representing the week of the year. Weeks begin with Sunday. Dates before the first Sunday of the year are in week 0.

timestamp.get_week(unix_seconds [, time_zone])

These time extraction functions have the same arguments.

  • unix_seconds is an integer representing the number of seconds past the Unix epoch, such as $e.metadata.event_timestamp.seconds, or a placeholder containing that value.
  • time_zone is optional and is a string representing a time zone. If omitted, the default is "GMT". You can specify time zones using string literals. The options are:
    • The TZ database name, for example "America/Los_Angeles". For more information, see the "TZ Database Name" column from this page.
    • The time zone offset from UTC, in the format (+|-)H[H][:M[M]], for example: "-08:00".

In this example, the time_zone argument is omitted, so it defaults to "GMT".

$ts = $e.metadata.collected_timestamp.seconds

timestamp.get_hour($ts) = 15

This example uses a string literal to define the time_zone.

$ts = $e.metadata.collected_timestamp.seconds

2 = timestamp.get_day_of_week($ts, "America/Los_Angeles")

Here are examples of other valid time_zone specifiers, which you can pass as the second argument to time extraction functions:

  • "America/Los_Angeles", or "-08:00". ("PST" is not supported)
  • "America/New_York", or "-05:00". ("EST" is not supported)
  • "Europe/London"
  • "UTC"
  • "GMT"
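
The documented ranges and week conventions can be reproduced with Python's datetime module. The sketch below is illustrative only: the function names mirror the YARA-L ones but are ordinary Python helpers, and the time zone is given as a whole-hour UTC offset (an assumption made to keep the example free of any time zone database dependency).

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_datetime(unix_seconds: int, utc_offset_hours: int = 0) -> datetime:
    """Convert Unix seconds to a datetime at a fixed UTC offset (default GMT)."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return (EPOCH + timedelta(seconds=unix_seconds)).astimezone(tz)

def get_minute(unix_seconds: int, utc_offset_hours: int = 0) -> int:
    return to_datetime(unix_seconds, utc_offset_hours).minute  # 0..59

def get_hour(unix_seconds: int, utc_offset_hours: int = 0) -> int:
    return to_datetime(unix_seconds, utc_offset_hours).hour  # 0..23

def get_day_of_week(unix_seconds: int, utc_offset_hours: int = 0) -> int:
    # isoweekday() is Monday=1 .. Sunday=7; remap to Sunday=1 .. Saturday=7.
    return to_datetime(unix_seconds, utc_offset_hours).isoweekday() % 7 + 1

def get_week(unix_seconds: int, utc_offset_hours: int = 0) -> int:
    # %U: week of the year with Sunday as the first day of the week;
    # days before the first Sunday of the year fall in week 0.
    return int(to_datetime(unix_seconds, utc_offset_hours).strftime("%U"))
```

For example, Unix time 0 (a Thursday in GMT) gives a day of week of 5 and a week of 0, while the same instant at offset -8 falls on the previous day at 16:00.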

Current timestamp

Returns an integer representing the current time in Unix seconds. This is approximately equal to the detection timestamp and is based on when the rule is run.

timestamp.current_seconds()

The following example returns True if the certificate has been expired for more than 24 hours. It calculates the time difference by subtracting the certificate's not_after time from the current Unix seconds, and then checks whether 86400 (the number of seconds in 24 hours) is less than that difference.

86400 < timestamp.current_seconds() - $e.network.tls.certificate.not_after

Math functions

Absolute value

Returns the absolute value of an integer expression.

math.abs(intExpression)

This example returns True if the events were more than 5 minutes (300 seconds) apart, regardless of which event came first. Notice that the argument to math.abs() is an arithmetic expression.

300 < math.abs($e1.metadata.event_timestamp.seconds
               - $e2.metadata.event_timestamp.seconds
      )

Net functions

Returns true when the given IP address is within the specified subnetwork.

net.ip_in_range_cidr(ipAddress, subnetworkRange)

You can use YARA-L to search for UDM events across all of the IP addresses within a subnetwork using the net.ip_in_range_cidr() statement. Both IPv4 and IPv6 are supported.

To search across a range of IP addresses, specify an IP UDM field and a Classless Inter-Domain Routing (CIDR) range. YARA-L can handle both singular and repeating IP address fields.

IPv4 example:

net.ip_in_range_cidr($e.principal.ip, "192.0.2.0/24")

IPv6 example:

net.ip_in_range_cidr($e.network.dhcp.yiaddr, "2001:db8::/32")

For an example rule using the net.ip_in_range_cidr() statement, see the example rule Single Event within Range of IP Addresses.
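
The membership test itself can be sketched with Python's ipaddress module. This is an illustration of CIDR matching in general, not Chronicle's implementation; the helper handles a single address only, and the non-matching sample address is hypothetical.

```python
import ipaddress

def ip_in_range_cidr(ip: str, cidr: str) -> bool:
    """Return True if the IP address lies inside the CIDR range."""
    # An IPv4 address is never inside an IPv6 network, and vice versa.
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)

in_v4 = ip_in_range_cidr("192.0.2.55", "192.0.2.0/24")
out_v4 = ip_in_range_cidr("198.51.100.1", "192.0.2.0/24")
in_v6 = ip_in_range_cidr("2001:db8::1", "2001:db8::/32")
```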

Function to placeholder assignment

You can assign the result of a function call to a placeholder in the events section. For example:

$placeholder = strings.concat($e.principal.hostname, "my-string")

You can then use the placeholder variables in the match, condition, and outcome sections. However, there are two limitations with function to placeholder assignment:

  1. Every placeholder in function to placeholder assignment must be assigned to an expression containing an event field. For example, the following examples are valid:

    $ph1 = $e.principal.hostname
    $ph2 = $e.src.hostname
    
    // Both $ph1 and $ph2 have been assigned to an expression containing an event field. 
    $ph1 = strings.concat($ph2, ".com") 
    
    $ph1 = $e.network.email.from
    $ph2 = strings.concat($e.principal.hostname, "@gmail.com")
    
    // Both $ph1 and $ph2 have been assigned to an expression containing an event field.
    $ph1 = strings.to_lower($ph2) 
    

    However, the example below is invalid:

    $ph1 = strings.concat($e.principal.hostname, "foo")
    $ph2 = strings.concat($ph1, "bar") // $ph2 has NOT been assigned to an expression containing an event field. 
    
  2. A function call must depend on exactly one event. However, more than one field from the same event can be used in the function call arguments. For example, the following are valid:

    $ph = strings.concat($event.principal.hostname, "string2")

    $ph = strings.concat($event.principal.hostname, $event.src.hostname)

    However, the following is invalid:

    $ph = strings.concat("string1", "string2")

    $ph = strings.concat($event.principal.hostname, $anotherEvent.src.hostname)

Meta section syntax

The meta section is composed of multiple lines, where each line defines a key-value pair. The key must be an unquoted string, and the value must be a quoted string:

<key> = "<value>"

The following is an example of a valid meta section:

meta:
  author = "Chronicle"
  severity = "HIGH"

Events section syntax

In the events section, list the predicates to specify the following:

  • What each match or placeholder variable represents
  • Simple binary expressions as conditions
  • Function expressions as conditions
  • Reference list expressions as conditions
  • Logical operators

Variable declarations

For variable declarations, use the following syntax:

  • <EVENT_FIELD> = <VAR>
  • <VAR> = <EVENT_FIELD>

Both are equivalent, as shown in the following examples:

  • $e.source.hostname = $hostname
  • $userid = $e.principal.user.userid

This declaration indicates that this variable represents the specified field for the event variable. When the event field is a repeated field, the match variable can represent any value in the array. It is also possible to assign multiple event fields to a single match or placeholder variable. This is a transitive join condition.

For example, the following:

  • $e1.source.ip = $ip
  • $e2.target.ip = $ip

Are equivalent to:

  • $e1.source.ip = $ip
  • $e1.source.ip = $e2.target.ip

Every variable that is used must be declared through a variable declaration. Using a variable without declaring it is a compilation error.

Simple binary expressions as conditions

To use a simple binary expression as a condition, use the following syntax:

  • <EXPR> <OP> <EXPR>

An expression can be an event field, variable, constant, or function expression.

For example:

  • $e.source.hostname = "host1234"
  • $e.source.port < 1024
  • 1024 < $e.source.port
  • $e1.source.hostname != $e2.target.hostname
  • $e1.metadata.collected_timestamp.seconds > $e2.metadata.collected_timestamp.seconds
  • $port >= 25
  • $host = $e2.target.hostname
  • "google-test" = strings.concat($e.principal.hostname, "-test")
  • "email@google.org" = re.replace($e.network.email.from, "com", "org")

If both sides are constants, it is regarded as a compilation error.

Function expressions as conditions

Some function expressions return a boolean value and can be used as individual predicates in the events section. Such functions are:

  • re.regex()
  • net.ip_in_range_cidr()

For example:

  • re.regex($e.principal.hostname, `.*\.google\.com`)
  • net.ip_in_range_cidr($e.principal.ip, "192.0.2.0/24")

Reference list expressions as conditions

Use the in operator to check for the existence of UDM values within a reference list based on equality. The in operator can be combined with not to exclude the values of a reference list. The reference list name must be prefixed with the % character.

Currently, only string values are supported:

  • $e.principal.hostname in %hostname_list
  • $e.about.ip in %phishing_site_list
  • $hostname in %whitelisted_hosts

Logical operators

You can use the logical and, or, and not operators in the events section, as shown in the following examples:

  • $e.metadata.event_type = "NETWORK_DNS" or $e.metadata.event_type = "NETWORK_DHCP"
  • ($e.metadata.event_type = "NETWORK_DNS" and $e.principal.ip = "192.0.2.12") or ($e.metadata.event_type = "NETWORK_DHCP" and $e.principal.mac = "AB:CD:01:10:EF:22")
  • not $e.metadata.event_type = "NETWORK_DNS"

By default, the precedence order from highest to lowest is not, and, or.

For example, "a or b and c" is evaluated as "a or (b and c)". You can use parentheses to alter the precedence if needed.
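
Python's boolean operators happen to follow the same not, and, or precedence, so the grouping can be checked exhaustively with a small truth table. This is an analogy in Python, not YARA-L itself.

```python
from itertools import product

# For every combination of inputs, compare the unparenthesized form
# with both possible groupings.
rows = [
    (a, b, c, a or b and c, a or (b and c), (a or b) and c)
    for a, b, c in product([False, True], repeat=3)
]

# The unparenthesized form always agrees with "a or (b and c)" ...
matches_documented_grouping = all(row[3] == row[4] for row in rows)
# ... and disagrees with "(a or b) and c" for at least one input.
differs_from_other_grouping = any(row[3] != row[5] for row in rows)
```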

In the events section, all predicates are combined with and by default.

Operators in events

You can use comparison operators with enumerated types. Doing so can simplify rules and improve performance (use an operator instead of reference lists).

In the following example, 'USER_UNCATEGORIZED' and 'USER_RESOURCE_DELETION' correspond to 15000 and 15014, so the rule will look for all the listed events:

$e.metadata.event_type >= "USER_UNCATEGORIZED" and $e.metadata.event_type <= "USER_RESOURCE_DELETION"

List of events:

  • USER_RESOURCE_DELETION
  • USER_RESOURCE_UPDATE_CONTENT
  • USER_RESOURCE_UPDATE_PERMISSIONS
  • USER_STATS
  • USER_UNCATEGORIZED

Modifiers

nocase

When you have a comparison expression between string values or a regex expression, you can append nocase at the end of the expression to ignore capitalization.

  • $e.principal.hostname != "http-server" nocase
  • $e1.principal.hostname = $e2.target.hostname nocase
  • $e.principal.hostname = /dns-server-[0-9]+/ nocase
  • re.regex($e.target.hostname, `client-[0-9]+`) nocase

The nocase modifier cannot be used when the field type is an enumerated value. The following examples are invalid and generate compilation errors:

  • $e.metadata.event_type = "NETWORK_DNS" nocase
  • $e.network.ip_protocol = "TCP" nocase

Repeated fields

any, all

In UDM and Entity, some fields are labeled as repeated, which indicates they are lists of values or other types of messages. In YARA-L, each element in the repeated field is treated individually. That means that if a repeated field is used in the rule, the rule is evaluated for each element in the field. This can lead to unexpected behavior. For example, if a rule has both $e.principal.ip = "1.2.3.4" and $e.principal.ip = "5.6.7.8" in the events section, the rule never generates any matches, even if both "1.2.3.4" and "5.6.7.8" are in principal.ip.

To evaluate the repeated field as a whole, you can use any and all operators. When any is used, the predicate is evaluated as true if any value in the repeated field satisfies the condition. When all is used, the predicate is evaluated as true if all values in the repeated field satisfy the condition.

  • any $e.target.ip = "127.0.0.1"
  • all $e.target.ip != "127.0.0.1"
  • re.regex(any $e.about.hostname, `server-[0-9]+`)
  • net.ip_in_range_cidr(all $e.principal.ip, "10.0.0.0/8")

The any and all operators can only be used with repeated fields. In addition, they cannot be used when assigning a repeated field to a placeholder variable or joining with a field of another event.

For example, any $e.principal.ip = $ip and any $e1.principal.ip = $e2.principal.ip are not valid syntax. To match or join a repeated field, use $e.principal.ip = $ip. There will be one match variable value or join for each element of the repeated field.

When writing a condition with any or all, be aware that negating the condition with not might not have the same meaning as using the negated operator.

For example:

  • not all $e.principal.ip = "192.168.12.16" checks if not all IP addresses match "192.168.12.16", meaning the rule is checking whether any IP address does not match "192.168.12.16".
  • all $e.principal.ip != "192.168.12.16" checks if all IP addresses do not match "192.168.12.16", meaning the rule is checking that no IP address matches "192.168.12.16".
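
The difference between the two forms can be seen with Python's all() over a list standing in for the repeated field. The list contents are hypothetical sample values.

```python
principal_ips = ["192.168.12.16", "10.0.0.5"]  # hypothetical repeated field values
target = "192.168.12.16"

# "not all ... = target": at least one value differs from the target.
not_all_equal = not all(ip == target for ip in principal_ips)

# "all ... != target": no value equals the target.
all_not_equal = all(ip != target for ip in principal_ips)
```

With one matching and one non-matching address, the first predicate holds and the second does not.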

Event variable join requirements

All event variables used in the rule must be joined with every other event variable in either of the following ways:

  • directly through an equality comparison between event fields of the two joined event variables, for example: $e1.field = $e2.field. The expression must not include arithmetic or function calls.

  • indirectly through a transitive join involving only an event field (see variable declaration for a definition of "transitive join"). The expression must not include arithmetic or function calls.

For example, assuming $e1, $e2, and $e3 are used in the rule, the following events sections are valid.

events:
  $e1.principal.hostname = $e2.src.hostname // $e1 joins with $e2
  $e2.principal.ip = $e3.src.ip // $e2 joins with $e3

events:
  // all of $e1, $e2 and $e3 are transitively joined via the placeholder variable $ip
  $e1.src.ip = $ip
  $e2.target.ip = $ip
  $e3.about.ip = $ip

events:
  $e1.principal.hostname = $e2.src.hostname // $e1 joins with $e2

  // Function to event comparison is not a valid join condition for $e1 and $e2,
  // but the whole events section is valid because we have a valid join condition in the first line.
  re.capture($e1.src.hostname, ".*") = $e2.target.hostname

However, here are examples of invalid events sections.

events:
  // Event to function comparison is an invalid join condition for $e1 and $e2.
  $e1.principal.hostname = re.capture($e2.principal.application, ".*")

events:
  // Event to arithmetic comparison is an invalid join condition for $e1 and $e2.
  $e1.principal.port = $e2.src.port + 1

events:
  $e1.src.ip = $ip
  $e2.target.ip = $ip
  $e3.about.ip = "192.1.2.0" // $e3 is not joined with $e1 or $e2.

events:
  $e1.src.ip = $ip

  // Function to placeholder comparison is an invalid transitive join condition.
  re.capture($e2.target.ip, ".*") = $ip

events:
  $e1.src.port = $port

  // Arithmetic to placeholder comparison is an invalid transitive join condition.
  $e2.principal.port + 800 = $port
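
One way to read the join requirement is that the event variables must form a single connected group once all valid join conditions are taken into account. The Python sketch below checks that property with a small union-find structure. It is an interpretation for illustration only: the helper name all_events_joined is hypothetical, each valid join is modeled as a pair of names, and joins through a placeholder are modeled as pairs of (event variable, placeholder).

```python
def all_events_joined(event_vars, joins):
    """Return True if all event variables are connected through the given joins."""
    parent = {}

    def find(name):
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for left, right in joins:
        parent[find(left)] = find(right)  # merge the two groups

    # Connected means every event variable ends up with the same root.
    return len({find(var) for var in event_vars}) <= 1

events = ["$e1", "$e2", "$e3"]
chained = all_events_joined(events, [("$e1", "$e2"), ("$e2", "$e3")])
via_placeholder = all_events_joined(
    events, [("$e1", "$ip"), ("$e2", "$ip"), ("$e3", "$ip")]
)
missing_e3 = all_events_joined(events, [("$e1", "$ip"), ("$e2", "$ip")])
```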

Match section syntax

In the match section, list the match variables used to group events before checking for match conditions. Those fields are returned with each match.

  • Specify what each match variable represents in the events section.
  • Specify the time range to use to correlate events after the over keyword. Events outside the time range are ignored.
  • Use the following syntax to specify the time range: <number><s/m/h/d>, where s, m, h, and d mean seconds, minutes, hours, and days respectively.
  • The minimum time you can specify is 1 minute.
  • The maximum time you can specify is 48 hours.

The following is an example of a valid match:

$var1, $var2 over 5m

This statement returns $var1 and $var2 (defined in the events section) when the rule finds a match. The time specified is 5 minutes. Events that are more than 5 minutes apart are not correlated and are therefore ignored by the rule.

Here is another example of a valid match:

$user over 1h

This statement returns $user when the rule finds a match. The time window specified is 1 hour. Events that are more than an hour apart are not correlated. The rule does not consider them to be a detection.

Here is another example of a valid match:

$source_ip, $target_ip, $hostname over 2m

This statement returns $source_ip, $target_ip, and $hostname when the rule finds a match. The time window specified is 2 minutes. Events that are more than 2 minutes apart are not correlated. The rule does not consider them to be a detection.

The following examples illustrate invalid match sections:

  • var1, var2 over 5m // invalid variable name
  • $user 1h // missing keyword
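
The time range format and its 1 minute to 48 hour limits can be expressed as a small validator. This Python sketch is illustrative only; the helper names are hypothetical, and it covers just the duration token that follows the over keyword, not the whole match section.

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
MIN_SECONDS = 60          # 1 minute
MAX_SECONDS = 48 * 3600   # 48 hours

def parse_match_window(text: str) -> int:
    """Parse '<number><s/m/h/d>' and return the window length in seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"invalid time range: {text!r}")
    seconds = int(match.group(1)) * UNIT_SECONDS[match.group(2)]
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        raise ValueError(f"time range must be between 1 minute and 48 hours: {text!r}")
    return seconds

def is_valid_match_window(text: str) -> bool:
    """Return True if the text is a well-formed time range within the limits."""
    try:
        parse_match_window(text)
    except ValueError:
        return False
    return True
```

Under these assumptions, "5m" and "2d" are accepted, while "30s" (too short), "3d" (too long), and "1 hour" (wrong format) are rejected.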

Sliding window

By default, YARA-L 2.0 rules are evaluated using hop windows. A time range of enterprise event data is divided into a set of overlapping hop windows, each with the duration specified in the match section. Events are then correlated within each hop window. With hop windows, it is impossible to search for events that happen in a specific order (for example, e1 happens up to 2 minutes after e2). An occurrence of event e1 and an occurrence of event e2 are correlated as long as they are within the hop window duration of each other.

Rules can also be evaluated using sliding windows. With sliding windows, windows with the duration specified in the match section are generated beginning or ending with each occurrence of a specified pivot event variable. Events are then correlated within each sliding window. This makes it possible to search for events that happen in a specific order (for example, e1 happens within 2 minutes of e2). An occurrence of event e1 and an occurrence of event e2 are correlated if event e1 occurs within the sliding window duration after event e2.

Specify sliding windows in the match section of a rule as follows:

<match-var-1>, <match-var-2>, ... over <duration> before|after <pivot-event-var>

The pivot event variable is the event variable that sliding windows are based on. If you use the before keyword, sliding windows are generated, ending with each occurrence of the pivot event. If the after keyword is used, sliding windows are generated beginning with each occurrence of the pivot event.

The following are examples of valid sliding window usages:

  • $var1, $var2 over 5m after $e1
  • $user over 1h before $e2
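
The effect of the before and after keywords can be pictured with a short simulation. This Python sketch is a simplified model rather than the Detection Engine's algorithm: times are plain integers in seconds, only two event streams are considered, the helper name correlated_pairs is hypothetical, and treating the window bounds as inclusive is an assumption.

```python
def correlated_pairs(pivot_times, other_times, duration, mode="after"):
    """Pair each pivot occurrence with the other events inside its sliding window."""
    pairs = []
    for pivot in pivot_times:
        if mode == "after":
            # Window begins with the pivot event.
            start, end = pivot, pivot + duration
        else:
            # "before": window ends with the pivot event.
            start, end = pivot - duration, pivot
        pairs.extend((pivot, t) for t in other_times if start <= t <= end)
    return pairs

# One pivot event at t=100 and a 120-second window.
after_pairs = correlated_pairs([100], [50, 160, 500], 120, mode="after")
before_pairs = correlated_pairs([100], [50, 160, 500], 120, mode="before")
```

Only the event at t=160 follows the pivot within the window, and only the event at t=50 precedes it within the window, which is how ordering becomes expressible.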

Outcome section syntax

In the outcome section, you can define up to 10 outcome variables, with arbitrary names. These outcomes will be stored in the detections generated by the rule. Each detection may have different values for the outcomes.

The outcome name, $risk_score, is special. You can optionally define an outcome with this name, and if you do, it must be an integer type. If populated, the risk_score will be shown in the Enterprise Insights view for alerts that come from rule detections.

Outcome variable data types

Each outcome variable can have a different data type, which is determined by the expression used to compute it. We support the following outcome data types:

  • integer
  • string
  • lists of integers
  • lists of strings

Conditional logic

You can use conditional logic to compute the value of an outcome. Conditionals are specified using the following syntax pattern:

if(BOOL_CLAUSE, THEN_CLAUSE)
if(BOOL_CLAUSE, THEN_CLAUSE, ELSE_CLAUSE)

You can read a conditional expression as "if BOOL_CLAUSE is true, then return THEN_CLAUSE, else return ELSE_CLAUSE".

BOOL_CLAUSE must evaluate to a boolean value. A BOOL_CLAUSE expression takes a similar form as expressions in the events section. For example, it can contain:

  • UDM field names with comparison operator, for example:

    if($context.graph.entity.user.title = "Vendor", 100, 0)

  • placeholder variable that was defined in the events section, for example:

    if($severity = "HIGH", 100, 0)

  • functions that return a boolean, for example:

    if(re.regex($e.network.email.from, `.*altostrat\.com`), 100, 0)

  • look up in a reference list, for example:

    if($u.principal.hostname in %my_reference_list_name, 100, 0)

The THEN_CLAUSE and ELSE_CLAUSE must be the same data type. We support integers and strings.

You can omit the ELSE_CLAUSE if the data type is integer. If omitted, the ELSE_CLAUSE evaluates to 0. For example:

if($e.field = "a", 5) is equivalent to if($e.field = "a", 5, 0)

You must provide the ELSE_CLAUSE if the data type is string.
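
The two cases can be sketched side by side. This fragment assumes a single-event rule with an event variable $e and a hypothetical $severity placeholder defined in the events section; the outcome names are illustrative.

```
outcome:
  // Integer result: ELSE_CLAUSE is omitted, so it evaluates to 0.
  $risk_score = if($e.principal.hostname in %my_reference_list_name, 80)

  // String result: ELSE_CLAUSE is required.
  $triage_label = if($severity = "HIGH", "ESCALATE", "REVIEW")
```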

Mathematical operations

You can use mathematical operations to compute integer data type outcomes. We support addition and subtraction (but not multiplication, division, or modulo). For example:

outcome:
  $risk_score = max(100 + if($severity = "HIGH", 10, 5) - if($severity = "LOW", 20, 0))
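
To make the arithmetic concrete, the comments below work through the expression inside max() for each possible value of $severity. This is an illustration only; it assumes $severity is a string placeholder defined in the events section.

```
// Per-event value of
//   100 + if($severity = "HIGH", 10, 5) - if($severity = "LOW", 20, 0)
//
//   $severity = "HIGH"   ->  100 + 10 - 0  = 110
//   $severity = "LOW"    ->  100 + 5  - 20 = 85
//   any other value      ->  100 + 5  - 0  = 105
//
// max() then returns the largest of these per-event values across the
// events that generated the detection.
```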

Placeholder variables in outcomes

When computing outcome variables, you can use placeholder variables that were defined in the events section of your rule. In the following examples, assume that $email_sent_bytes and $file_size were defined in the events section of the rule:

Single-event example:

// No match section, so this is a single-event rule.

outcome:
  // Use placeholder directly as an outcome value.
  $my_outcome = $email_sent_bytes

  // Use placeholder in a conditional.
  $other_outcome = if($file_size > 1024, "SEVERE", "MODERATE")

condition:
  $e

Multi-event example:

match:
  // This is a multi-event rule with a match section.
  $hostname over 5m

outcome:
  // Use placeholder directly in an aggregation function.
  $max_email_size = max($email_sent_bytes)

  // Use placeholder in a mathematical computation.
  $total_bytes_exfiltrated = sum(
    1024
    + $email_sent_bytes
    + $file_event.principal.file.size
  )

condition:
  $email_event and $file_event

Aggregations

The outcome section can be used in multi-event rules (rules that contain a match section), and in single-event rules (rules that do not contain a match section). Requirements for aggregations are as follows:

  • Multi-event rules (with match section)

    • Expression to compute outcomes is evaluated over all events that generated a particular detection.
    • Expression must be wrapped in an aggregate function
      • Example: $max_email_size = max($e.network.sent_bytes)
      • If the expression contains a repeated field, the aggregate operates over all elements in the repeated field, over all events that generated the detection
  • Single-event rules (without match section)

    • Expression to compute outcomes is evaluated over the single event that generated a particular detection.
    • Must use aggregate function for expressions that involve at least one repeated field
      • Example: $suspicious_ips = array($e.principal.ip)
      • The aggregate operates over all elements in the repeated field
    • Cannot use an aggregate function for expressions that do not involve a repeated field
      • Example: $threat_status = if($e.principal.file.size > 1024, "SEVERE", "MODERATE")
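
The single-event requirements above can be combined into one fragment. This sketch reuses the two examples from the list and assumes an event variable $e defined in the events section.

```
// No match section, so this is a single-event rule.

outcome:
  // principal.ip is a repeated field, so an aggregate function is required.
  $suspicious_ips = array($e.principal.ip)

  // principal.file.size is not a repeated field, so no aggregate is used.
  $threat_status = if($e.principal.file.size > 1024, "SEVERE", "MODERATE")

condition:
  $e
```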

You can use the following aggregation functions:

  • max(): outputs the maximum over all possible values. Only works with integer.
  • min(): outputs the minimum over all possible values. Only works with integer.
  • sum(): outputs the sum over all possible values. Only works with integer.
  • count_distinct(): collects all possible values, then outputs the distinct count of possible values.
  • count(): behaves like count_distinct(), but returns a non-distinct count of possible values.
  • array_distinct(): collects all possible values, then outputs a list of these values. It will truncate the list of values to 25 random elements.
  • array(): behaves like array_distinct(), but returns a non-distinct list of values. It also truncates the list of values to 25 random elements.

The aggregate function is important when a rule includes a condition section that specifies multiple events must exist, because the aggregate function will operate on all the events that generated the detection.

For example, suppose your outcome and condition sections contain the following:

outcome:
  $asset_id_count = count($event.principal.asset_id)
  $asset_id_distinct_count = count_distinct($event.principal.asset_id)

  $asset_id_list = array($event.principal.asset_id)
  $asset_id_distinct_list = array_distinct($event.principal.asset_id)

condition:
  #event > 1

Since the condition section requires there to be more than one event for each detection, the aggregate functions will operate on multiple events. Suppose the following events generated one detection:

event:
  // UDM event 1
  asset_id="asset-a"

event:
  // UDM event 2
  asset_id="asset-b"

event:
  // UDM event 3
  asset_id="asset-b"

Then the values of your outcomes will be:

  • $asset_id_count = 3
  • $asset_id_distinct_count = 2
  • $asset_id_list = ["asset-a", "asset-b", "asset-b"]
  • $asset_id_distinct_list = ["asset-a", "asset-b"]

Notes and restrictions when using the outcome section:

  • The outcome section cannot reference a new placeholder variable which wasn't already defined in the events section.
  • Similarly, the outcome section cannot use event variables that have not been defined in the events section.
  • The outcome section can only correlate events that have already been correlated in the events section.

You can find an example using the outcome section in Overview of the YARA-L 2.0 language. See Create context-aware analytics for details on detection deduplication with the outcome section.

Condition section syntax

In the condition section, you can:

  • specify a match condition over events and placeholders defined in the events section. See the following section, Event and placeholder conditionals, for more details.
  • (optional) use the and keyword to specify a match condition using outcome variables defined in the outcome section. See the following section, Outcome conditionals, for more details.

The following condition patterns are valid:

condition:
  <event/placeholder conditionals>
condition:
  <event/placeholder conditionals> and <outcome conditionals>

Event and placeholder conditionals

List condition predicates for events and placeholder variables here, joined with the keyword and or or.

The following conditions are bounding conditions. They force the associated event variable to exist, meaning that at least one occurrence of the event must appear in any detection.

  • $var // equivalent to #var > 0
  • #var > n // where n >= 0
  • #var >= m // where m > 0

The following conditions are non-bounding conditions. They allow the associated event variable to not exist, meaning that it is possible that no occurrence of the event appears in a detection. This enables the making of non-existence rules, which search for the absence of a variable instead of the presence of a variable.

  • !$var // equivalent to #var = 0
  • #var >= 0
  • #var < n // where n > 0
  • #var <= m // where m >= 0

In the following example, the special character # on a variable (either the event variable or the placeholder variable) represents the count of distinct events or values of that variable.

$e and #port > 50 or #event1 > 2 or #event2 > 1 or #event3 > 0

The following non-existence example is also valid and evaluates to true if there are more than two distinct events from $event1, and zero distinct events from $event2.

#event1 > 2 and !$event2
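
A non-existence rule built around this condition might look like the following sketch. The event types, the $host placeholder, and the 10-minute window are assumptions chosen for illustration.

```
events:
  $event1.metadata.event_type = "USER_LOGIN"
  $event1.principal.hostname = $host

  $event2.metadata.event_type = "PROCESS_LAUNCH"
  $event2.principal.hostname = $host

match:
  $host over 10m

condition:
  // Bounding condition on $event1, non-bounding condition on $event2.
  #event1 > 2 and !$event2
```

The bounding condition on $event1 guarantees that every detection contains at least one event, while !$event2 requires that no matching $event2 exists for the same $host in the window.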

The following are examples of invalid predicates:

  • $e, #port > 50 // incorrect keyword usage
  • $e or #port < 50 // or keyword not supported with non-bounding conditions
  • not $e // not keyword is not allowed for event and placeholder conditions

Outcome conditionals

List condition predicates for outcome variables here, joined with the keyword and or or, or preceded by the keyword not.

Specify outcome conditionals differently depending on the type of the outcome variable:

  • integer: compare against an integer literal with operators =, >, >=, <, <=, !=, for example:

    $risk_score > 10

  • string: compare against a string literal with either = or !=, for example:

    $severity = "HIGH"

  • list of integers or list of strings: specify the condition using the arrays.contains function, for example:

    arrays.contains($event_ids, "id_1234")
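
The following fragment sketches how an outcome conditional is combined with an event conditional, following the second valid condition pattern shown earlier. It assumes an event variable $e and hypothetical $hostname and $severity placeholders defined in the events section.

```
match:
  $hostname over 5m

outcome:
  $risk_score = max(if($severity = "HIGH", 90, 10))

condition:
  // Event conditional first, then the outcome conditional joined with and.
  $e and $risk_score > 50
```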

Rule classification

Specifying an outcome conditional in a rule that has a match section means that the rule will be classified as a multi-event rule for rule quota. Please see single event rule and multiple event rule for more information about single and multiple event classifications.

Count (#) character

The # character is a special character in the condition section. If it is used before any event or placeholder variable name, it represents the number of distinct events or values that satisfy all the events section conditions.

Value ($) character

The $ character is another special character in the condition section. If it is used before any outcome variable name, it represents the value of that outcome.

If it is used before any event or placeholder variable name (for example, $event), it is a shorthand for #event > 0.

Options section syntax

In the options section, you can specify the options for the rule. The syntax for the options section is similar to that of the meta section. However, each key must be one of the predefined option names, and the value is not restricted to the string type.

Currently, the only available option is allow_zero_values.

  • allow_zero_values: If set to true, matches generated by the rule can have zero values as match variable values. Zero values are given to event fields when they are left unpopulated. This option is set to false by default.

The following is a valid options section line:

  • allow_zero_values = true
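
In context, the options section comes last in the rule. The sketch below assumes an event variable $e and a hypothetical $hostname match variable; the other sections are abbreviated.

```
match:
  $hostname over 5m

condition:
  $e

options:
  // Allow matches where $hostname is a zero value (for example, an
  // unpopulated field).
  allow_zero_values = true
```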

Type checking

Chronicle type checks your YARA-L syntax as you create rules within the interface. The type checking errors it displays help you revise the rule so that it works as expected.

The following are examples of invalid predicates:

// $e.target.port is of type integer which cannot be compared to a string.
$e.target.port = "80"

// "LOGIN" is not a valid event_type enum value.
$e.metadata.event_type = "LOGIN"
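
For comparison, these predicates could be corrected as sketched below. The second correction assumes that USER_LOGIN is the event type the rule author intended.

```
// Compare the integer field against an integer constant.
$e.target.port = 80

// Use a valid event_type enum value.
$e.metadata.event_type = "USER_LOGIN"
```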