Overview of the YARA-L 2.0 language

YARA-L 2.0 is a computer language used to create rules for searching through your enterprise log data as it is ingested into your Chronicle account. The YARA-L syntax is derived from the YARA language developed by VirusTotal. The language works in conjunction with the Chronicle Detection Engine and enables you to hunt for threats and other events across large volumes of data. See also YARA-L 2.0 language syntax.

Rule structure

In YARA-L 2.0, variable declarations, definitions, and usages go in rule sections, which you must specify in the following order:

  1. meta
  2. events
  3. match (optional)
  4. condition

The following illustrates the generic structure of a rule:

rule <rule Name>
{
  meta:
    // Stores arbitrary key-value pairs of rule details, such as who wrote
    // it, what it detects, version control, etc.
    // Identical to the meta section in YARA-L.
    //
    // For example:
    // author = "Analyst #2112"
    // date = "08/09/2020"
    // description = "suspicious domain detected"

  events:
    // Conditions to filter events and the relationship between events.

  match:
    // Values to return when matches are found.

  condition:
    // Condition to check events and the variables used to find matches.
}

YARA-L 2.0 example rules

The following examples show rules written in YARA-L 2.0. Each demonstrates how to correlate events within the rule language.

Logins from different cities

The following rule searches for users that have logged in to your enterprise from two or more cities in less than 5 minutes:

rule DifferentCityLogin {
  meta:

  events:
    $udm.metadata.event_type = "USER_LOGIN"
    $udm.principal.user.userid = $user
    $udm.principal.location.city = $city

  match:
    $user over 5m

  condition:
    #city > 1
}

Match variable: $user

Event variable: $udm

Placeholder variables: $city and $user

The following describes how this rule works:

  • Groups events by username ($user) and returns it ($user) when a match is found.
  • The timespan is 5 minutes, meaning only events that are less than 5 minutes apart are correlated.
  • Searches for an event group ($udm) whose event type is USER_LOGIN.
  • For that event group, the rule assigns the user ID to $user and the login city to $city.
  • Returns a match if the number of distinct $city values is greater than 1 in the event group ($udm) within the 5-minute time range.
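
The grouping and counting described above can be modeled outside Chronicle. The following Python sketch only illustrates the semantics and is not how the Detection Engine is implemented. The Login record, the users_with_multiple_cities function, and the simplified windowing (one window opened at each login instead of Chronicle's hop windows) are assumptions made for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Login(NamedTuple):
    """Hypothetical, simplified stand-in for a USER_LOGIN UDM event."""
    user: str
    city: str
    ts: int  # event time in seconds


def users_with_multiple_cities(logins, window_s=300):
    """Return users who logged in from more than one city within window_s.

    Simplification: a window is opened at each login, which only
    approximates how hop windows are laid out.
    """
    by_user = defaultdict(list)
    for login in logins:
        by_user[login.user].append(login)  # match: $user

    matched = set()
    for user, events in by_user.items():
        events.sort(key=lambda e: e.ts)
        for start in events:
            cities = {e.city for e in events
                      if 0 <= e.ts - start.ts < window_s}
            if len(cities) > 1:  # condition: #city > 1
                matched.add(user)
                break
    return matched


logins = [
    Login("alice", "Paris", 0),
    Login("alice", "Tokyo", 120),   # second city 2 minutes later
    Login("bob", "Lima", 0),
    Login("bob", "Quito", 900),     # second city 15 minutes later
]
print(users_with_multiple_cities(logins))  # {'alice'}
```

Only alice is returned, because bob's two cities are more than 5 minutes apart.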

Rapid user creation and deletion

The following rule searches for users that have been created and then deleted within 4 hours:

rule UserCreationThenDeletion {
  meta:

  events:
    $create.target.user.userid = $user
    $create.metadata.event_type = "USER_CREATION"

    $delete.target.user.userid = $user
    $delete.metadata.event_type = "USER_DELETION"

    $create.metadata.event_timestamp.seconds <=
       $delete.metadata.event_timestamp.seconds

  match:
    $user over 4h

  condition:
    $create and $delete
}

Event variables: $create and $delete

Match variable: $user

Placeholder variable: N/A

The following describes how this rule works:

  • Groups events by username ($user) and returns it ($user) when a match is found.
  • The time window is 4 hours, meaning only events separated by less than 4 hours are correlated.
  • Searches for two event groups ($create and $delete, where $create is equivalent to #create >= 1).
  • $create corresponds to USER_CREATION events and assigns the user ID to $user.
  • $delete corresponds to USER_DELETION events and assigns the user ID to $user.
  • $user joins the two groups of events: the rule matches only when the user identifier in both event groups is the same.
  • The rule looks for cases where the $delete event happens at the same time as or later than the $create event, and returns a match when it finds one.
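
The join and ordering logic above can be sketched in Python. This is an illustrative model under stated assumptions, not Chronicle's implementation: UserEvent and created_then_deleted are hypothetical names, and the 4-hour limit is applied directly to each creation and deletion pair rather than through hop windows.

```python
from typing import NamedTuple


class UserEvent(NamedTuple):
    """Hypothetical, simplified stand-in for a UDM event."""
    event_type: str  # "USER_CREATION" or "USER_DELETION"
    user: str
    ts: int          # event time in seconds


def created_then_deleted(events, window_s=4 * 3600):
    """Return users with a creation followed by a deletion within window_s."""
    creates = [e for e in events if e.event_type == "USER_CREATION"]
    deletes = [e for e in events if e.event_type == "USER_DELETION"]
    return {
        c.user
        for c in creates
        for d in deletes
        if c.user == d.user            # join on $user
        and c.ts <= d.ts               # creation is not after deletion
        and d.ts - c.ts < window_s     # both events fall within 4 hours
    }


events = [
    UserEvent("USER_CREATION", "temp", 0),
    UserEvent("USER_DELETION", "temp", 3600),      # deleted after 1 hour
    UserEvent("USER_CREATION", "old", 0),
    UserEvent("USER_DELETION", "old", 5 * 3600),   # deleted after 5 hours
    UserEvent("USER_DELETION", "odd", 0),
    UserEvent("USER_CREATION", "odd", 100),        # deletion came first
]
print(sorted(created_then_deleted(events)))  # ['temp']
```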

Single-event versus multi-event

The following example rules show how you could create a rule to search for a single event and then modify it to search for multiple events.

Here is the single event version of the rule:

rule SingleEventRule {
  meta:
    author = "noone@altostrat.com"

  events:
    $e.metadata.event_type = "USER_LOGIN"

  condition:
    $e
}

This rule simply searches for a user login event and would return the first one it encounters in the enterprise data stored in your Chronicle account.

Here is a multi-event version of the rule:

rule MultiEventRule {
  meta:
    author = "noone@altostrat.com"

  events:
    $e.metadata.event_type = "USER_LOGIN"
    $e.principal.user.userid = $user

  match:
    $user over 10m

  condition:
    #e >= 10
}

The rule searches for a user who has logged in at least 10 times in less than 10 minutes.

Single event within range of IP addresses

The following example shows a single-event rule that matches events where the principal hostname matches either of two specific patterns and a principal IP address falls within a specific CIDR range:

rule OrsAndNetworkRange {
  meta:
    author = "noone@altostrat.com"

  events:
    // Checks CIDR ranges.
    net.ip_in_range_cidr($e.principal.ip, "203.0.113.0/24")

    // Detection when the hostname field matches either value using or.
    $e.principal.hostname = /pbateman/ or $e.principal.hostname = /sspade/

  condition:
    $e
}
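
To make the two predicates concrete, here is a rough Python equivalent built on the standard ipaddress and re modules. It is a sketch, not Chronicle's evaluation logic. The event_matches function is hypothetical, and the use of re.search (substring matching) for the hostname patterns is an assumption about how the regular expression comparison behaves.

```python
import ipaddress
import re

NETWORK = ipaddress.ip_network("203.0.113.0/24")
HOSTNAME_PATTERNS = (re.compile("pbateman"), re.compile("sspade"))


def event_matches(ips, hostname):
    """Model of the two predicates in OrsAndNetworkRange."""
    # net.ip_in_range_cidr($e.principal.ip, "203.0.113.0/24")
    in_range = any(ipaddress.ip_address(ip) in NETWORK for ip in ips)
    # $e.principal.hostname = /pbateman/ or $e.principal.hostname = /sspade/
    host_ok = any(p.search(hostname) for p in HOSTNAME_PATTERNS)
    return in_range and host_ok


print(event_matches(["203.0.113.7"], "pbateman-laptop"))   # True
print(event_matches(["198.51.100.7"], "pbateman-laptop"))  # False
print(event_matches(["203.0.113.7"], "jdoe-laptop"))       # False
```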

Repeated fields

In YARA-L 2.0, repeated fields are represented as arrays.

For example, a host might have multiple IP addresses:

principal.ip [192.168.1.2, 10.3.4.100, 192.168.12.16]

Or an e-mail address might have multiple recipients:

network.email.to ["a@google.com", "b@google.com", "c@google.com"]

Syntax

Repeated fields can be referred to in the same fashion as non-repeated fields. In this case, they are automatically disambiguated, meaning that the conditions are checked against the individual elements of the repeated field.

For example, if you include the following statement within a rule:

$e.principal.ip = "192.168.12.16"

Chronicle searches for an IP address within the array that matches "192.168.12.16". In this example, a matching address would be found and a detection would be returned.

Similarly, if you include the following statement within a rule:

$e.principal.ip = "192.168.12.16" and $e.principal.ip = "10.3.4.100"

Chronicle searches for a single IP address within the array that matches both "192.168.12.16" and "10.3.4.100". Because no single address can equal both values, no matching address would be found and no detection would be returned.
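
The element-by-element behavior can be pictured with a small Python model. It is only an analogy for the semantics described above; matches_elementwise is a hypothetical helper, not a Chronicle function.

```python
def matches_elementwise(values, predicate):
    """True if at least one single element satisfies the whole predicate."""
    return any(predicate(v) for v in values)


principal_ip = ["192.168.1.2", "10.3.4.100", "192.168.12.16"]

# $e.principal.ip = "192.168.12.16"
one_value = matches_elementwise(principal_ip, lambda ip: ip == "192.168.12.16")

# $e.principal.ip = "192.168.12.16" and $e.principal.ip = "10.3.4.100"
both_values = matches_elementwise(
    principal_ip,
    lambda ip: ip == "192.168.12.16" and ip == "10.3.4.100",
)

print(one_value, both_values)  # True False
```

The second check fails because the whole condition is applied to one element at a time.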

Repeated fields example

The following rule searches for events where a source IP address has connected to a target IP address while making requests to over 50 different target ports within a timespan of less than one minute. This is likely a malicious entity searching for an unsecured network port.

rule RepeatedFieldsRuleExample {
  meta:
    author = "noone@google.com"

  events:
    $e.principal.ip = $source_ip
    $e.target.ip = $target_ip
    $e.target.port = $target_port

  match:
    $source_ip, $target_ip over 1m

  condition:
    #target_port > 50
}

any and all operators

Repeated fields can also be referenced with the any and all operators. If so, they are not disambiguated, meaning that each condition is checked against the elements of the repeated field as a whole rather than one element at a time.

For example, if you include the following statement within a rule:

any $e.principal.ip = "192.168.12.16"

Chronicle checks if any IP address within the array matches "192.168.12.16". In this example, the array would satisfy the check and would return a detection.

If you include the following statement within your rule:

all $e.principal.ip = "192.168.12.16"

Chronicle checks if all IP addresses within the array match "192.168.12.16". In this example, the array would not satisfy the check and would not return a detection.

If you include the following statement within your rule:

any $e.principal.ip = "192.168.12.16" and any $e.principal.ip = "10.3.4.100"

Chronicle checks if any IP address within the array matches "192.168.12.16" and if any IP address within the array matches "10.3.4.100". In this example, the array would satisfy the check and it would return a detection.
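
The three statements above map closely onto Python's built-in any() and all(), which gives a quick way to reason about them. This is an analogy only. How Chronicle treats an empty repeated field is not covered here (Python returns False for any([]) and True for all([])).

```python
principal_ip = ["192.168.1.2", "10.3.4.100", "192.168.12.16"]

# any $e.principal.ip = "192.168.12.16"
any_match = any(ip == "192.168.12.16" for ip in principal_ip)

# all $e.principal.ip = "192.168.12.16"
all_match = all(ip == "192.168.12.16" for ip in principal_ip)

# any $e.principal.ip = "192.168.12.16" and any $e.principal.ip = "10.3.4.100"
two_anys = (any(ip == "192.168.12.16" for ip in principal_ip)
            and any(ip == "10.3.4.100" for ip in principal_ip))

print(any_match, all_match, two_anys)  # True False True
```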

The following are examples of valid predicates using the any and all operators:

  • any $e.principal.ip = "192.168.12.16"
  • net.ip_in_range_cidr(any $e.principal.ip, "192.168.12.16/24")
  • all $e.network.email.to = /.*@google\.com/
  • re.regex(all $e.network.email.to, `.*google\.com`)

The following are examples of invalid predicates using the any and all operators:

  • any $ip = "192.168.12.16"
  • any $e.principal.ip = all $e.target.ip
  • any $e.principal.ip in %reference_list

Limitations of the any and all operators

The any and all operators can only be used with repeated fields. In addition, they cannot be used when assigning a repeated field to a placeholder variable or joining with a field of another event.

For example, any $e.principal.ip = $ip and any $e1.principal.ip = $e2.principal.ip are not valid syntax. To match or join a repeated field, use $e.principal.ip = $ip. There will be one match variable value or join for each element of the repeated field.

When writing a condition with any or all, be aware that negating the condition with not might not have the same meaning as using the negated operator.

For example:

  • not all $e.principal.ip = "192.168.12.16" checks if not all IP addresses match "192.168.12.16", meaning the rule is checking whether any IP address does not match "192.168.12.16".
  • all $e.principal.ip != "192.168.12.16" checks if all IP addresses do not match "192.168.12.16", meaning the rule is checking that no IP address matches "192.168.12.16".
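
The difference between the two forms is easiest to see on an array where only some addresses match. The Python functions below are hypothetical stand-ins for the two predicates, written only to illustrate the logic.

```python
def not_all_equal(ips, value):
    """Models: not all $e.principal.ip = value"""
    return not all(ip == value for ip in ips)


def all_not_equal(ips, value):
    """Models: all $e.principal.ip != value"""
    return all(ip != value for ip in ips)


mixed = ["192.168.12.16", "10.3.4.100"]
print(not_all_equal(mixed, "192.168.12.16"))  # True: one address differs
print(all_not_equal(mixed, "192.168.12.16"))  # False: one address matches
```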

any and all rule example

The following rule searches for login events where all source IP addresses do not match an IP address known to be secure within a timespan of 5 minutes.

rule SuspiciousIPLogins {
  meta:
    author = "noone@google.com"

  events:
    $e.metadata.event_type = "USER_LOGIN"

    // Detects if all source IP addresses in an event do not match "100.97.16.0"
    // For example, if an event has source IP addresses
    // ["100.97.16.1", "100.97.16.2", "100.97.16.3"],
    // it will be detected since "100.97.16.1", "100.97.16.2",
    // and "100.97.16.3" all do not match "100.97.16.0".

    all $e.principal.ip != "100.97.16.0"

    // Assigns placeholder variable $ip to the $e.principal.ip repeated field.
    // There will be one detection per source IP address.
    // For example, if an event has source IP addresses
    // ["100.97.16.1", "100.97.16.2", "100.97.16.3"],
    // there will be one detection per address.

    $e.principal.ip = $ip

  match:
    $ip over 5m

  condition:
    $e
}

Regular expressions in a rule

The following YARA-L 2.0 regular expression example searches for events with emails received from the altostrat.com domain. Since nocase has been added to both the $host variable regex comparison and the regex function, both of these comparisons are case-insensitive.

rule RegexRuleExample {
  meta:
    author = "noone@altostrat.com"

  events:
    $e.principal.hostname = $host
    $host = /.*HoSt.*/ nocase
    re.regex($e.network.email.from, `.*altostrat\.com`) nocase

  match:
    $host over 10m

  condition:
    #e > 10
}
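
The effect of nocase on the two per-event filters can be approximated with Python's re.IGNORECASE flag. This is a sketch under assumptions: event_matches is a hypothetical function, re.search is assumed to mirror the matching behavior, and the grouping by $host and the #e > 10 count are left out.

```python
import re

# $host = /.*HoSt.*/ nocase
HOST_RE = re.compile(r".*HoSt.*", re.IGNORECASE)
# re.regex($e.network.email.from, `.*altostrat\.com`) nocase
SENDER_RE = re.compile(r".*altostrat\.com", re.IGNORECASE)


def event_matches(hostname, email_from):
    """True if both case-insensitive filters accept the event."""
    return bool(HOST_RE.search(hostname)) and bool(SENDER_RE.search(email_from))


print(event_matches("MAILHOST01", "Alice@ALTOSTRAT.COM"))  # True
print(event_matches("mailhost01", "alice@example.com"))    # False
print(event_matches("server", "alice@altostrat.com"))      # False
```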

YARA-L sliding windows

By default, YARA-L 2.0 rules are evaluated using hop windows. A time range of enterprise event data is divided into a set of overlapping hop windows, each with the duration specified in the match section. Events are then correlated within each hop window. With hop windows, it is not possible to search for events that happen in a specific order (for example, e1 happens up to 2 minutes after e2). An occurrence of event e1 and an occurrence of event e2 are correlated as long as they are within the hop window duration of each other.

Rules can also be evaluated using sliding windows. With sliding windows, a window of the duration specified in the match section is generated beginning or ending with each occurrence of a specified pivot event variable. Events are then correlated within each sliding window. This makes it possible to search for events that happen in a specific order (for example, e1 happens within 2 minutes after e2). An occurrence of event e1 and an occurrence of event e2 are correlated if event e1 occurs within the sliding window duration after event e2.

Sliding window rule syntax

Specify sliding windows in the match section of a rule as follows:

<match-variable-name-1>, <match-variable-name-2>, ... over <sliding-window-duration> before|after <pivot-event-variable-name>

The pivot event variable is the event variable that sliding windows are based on. If you use the before keyword, sliding windows are generated ending with each occurrence of the pivot event. If you use the after keyword, sliding windows are generated beginning with each occurrence of the pivot event.
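
As a rough picture of how before and after position a window around each pivot event, consider the following Python sketch. The sliding_window function is hypothetical, timestamps are plain seconds, and the exact treatment of window boundaries in Chronicle is not modeled.

```python
def sliding_window(pivot_ts, duration_s, direction):
    """Return (start, end) of the window anchored at one pivot event."""
    if direction == "after":
        return (pivot_ts, pivot_ts + duration_s)
    if direction == "before":
        return (pivot_ts - duration_s, pivot_ts)
    raise ValueError("direction must be 'before' or 'after'")


pivots = [1000, 5000]  # timestamps of two pivot events
print([sliding_window(t, 600, "after") for t in pivots])
# [(1000, 1600), (5000, 5600)]
print([sliding_window(t, 600, "before") for t in pivots])
# [(400, 1000), (4400, 5000)]
```

One window is produced per pivot event occurrence, so two pivot events yield two windows.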

Sliding window rule example

The following YARA-L 2.0 sliding window example searches for the absence of firewall_2 events after firewall_1 events. The after keyword is used with the pivot event variable $e1 to specify that only the 10-minute windows after each firewall_1 event should be checked when correlating events.

rule SlidingWindowRuleExample {
  meta:
    author = "noone@google.com"

  events:
    $e1.metadata.product_name = "firewall_1"
    $e1.principal.hostname = $host

    $e2.metadata.product_name = "firewall_2"
    $e2.principal.hostname = $host

  match:
    $host over 10m after $e1

  condition:
    $e1 and !$e2
}
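
The absence check in this rule can be modeled in Python as follows. This is an illustrative sketch, not the Detection Engine's algorithm: Event and hosts_missing_firewall_2 are hypothetical names, and treating the window as inclusive at both ends is an assumption.

```python
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical, simplified stand-in for a UDM event."""
    product: str
    host: str
    ts: int  # event time in seconds


def hosts_missing_firewall_2(events, window_s=600):
    """Return hosts with a firewall_1 event not followed by firewall_2."""
    e1s = [e for e in events if e.product == "firewall_1"]
    e2s = [e for e in events if e.product == "firewall_2"]
    matched = set()
    for e1 in e1s:  # one window per pivot event: "after $e1"
        followed = any(
            e2.host == e1.host and 0 <= e2.ts - e1.ts <= window_s
            for e2 in e2s
        )
        if not followed:  # condition: $e1 and !$e2
            matched.add(e1.host)
    return matched


events = [
    Event("firewall_1", "host-a", 0),
    Event("firewall_2", "host-a", 300),    # follows within 10 minutes
    Event("firewall_1", "host-b", 0),
    Event("firewall_2", "host-b", 1200),   # follows too late
    Event("firewall_2", "host-c", 50),     # precedes the pivot event
    Event("firewall_1", "host-c", 100),
]
print(sorted(hosts_missing_firewall_2(events)))  # ['host-b', 'host-c']
```

host-c is reported even though it has a firewall_2 event, because that event occurs before the pivot and so falls outside the window.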