YARA-L 2.0 language syntax

This section describes the major elements of the YARA-L syntax. See also Overview of the YARA-L 2.0 language.

Comments

Designate single-line comments with two slash characters (// comment) and multi-line comments with slash asterisk characters (/* comment */), as you would in C.
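
As a minimal sketch, the following rule shows both comment styles in context. The rule name, the meta value, and the event filter are illustrative assumptions, not part of the reference text.

```yara-l
rule comment_styles_example {
  meta:
    description = "Illustrative rule showing comment syntax"

  events:
    /* Multi-line comment:
       the predicate below keeps only DNS events. */
    $e.metadata.event_type = "NETWORK_DNS"

  condition:
    // Single-line comment: at least one matching event must exist.
    $e
}
```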

Constants

Integer, string, float, and regex constants are supported. For string constants, use double quotes. For regex constants, use /regex/.
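
The following events-section fragment is a small sketch of how these constants might appear in predicates. The field names are chosen for illustration only, and a float constant is left out because the text above does not name a float-typed field.

```yara-l
events:
  // String constant in double quotes.
  $e.metadata.event_type = "NETWORK_DNS"

  // Integer constant.
  $e.target.port < 1024

  // Regex constant between slashes.
  $e.network.email.from = /.*altostrat\.com/
```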

Count

The # character is a special character in the condition section. If it is used before any event or placeholder variable name, it represents the number of distinct events or values that satisfy all the events section conditions.
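
To make the count character concrete, here is a hedged sketch of a rule that counts both events and placeholder values. The rule name, thresholds, and the choice of login events are assumptions made for illustration.

```yara-l
rule count_example {
  meta:
    description = "Illustrative: many logins for one user from several IP addresses"

  events:
    $e.metadata.event_type = "USER_LOGIN"
    $e.principal.user.userid = $userid
    $e.principal.ip = $ip

  match:
    $userid over 5m

  condition:
    // #e counts distinct events; #ip counts distinct values of $ip.
    #e > 10 and #ip > 3
}
```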

Classless Inter-Domain Routing

You can use YARA-L to search for UDM events across all of the IP addresses within a subnetwork using the net.ip_in_range_cidr() function. Both IPv4 and IPv6 are supported.

To search across a range of IP addresses, specify an IP UDM field and a Classless Inter-Domain Routing (CIDR) range. YARA-L can handle both singular and repeating IP address fields.

IPv4 example:

net.ip_in_range_cidr($e.principal.ip, "192.0.2.0/24")

IPv6 example:

net.ip_in_range_cidr($e.network.dhcp.yiaddr, "2001:db8::/32")

For an example rule that uses net.ip_in_range_cidr(), see the example rule Single Event within Range of IP Addresses.
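
For quick orientation, the following is a minimal sketch of a single-event rule built around the IPv4 example above. It is not the referenced example rule; the rule name and meta value are hypothetical.

```yara-l
rule ip_in_cidr_example {
  meta:
    description = "Illustrative: any event whose principal IP is in 192.0.2.0/24"

  events:
    // True when the principal IP address falls inside the CIDR range.
    net.ip_in_range_cidr($e.principal.ip, "192.0.2.0/24")

  condition:
    $e
}
```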

Operators

You can use the following operators in YARA-L:

Operator  Description
=         equal
!=        not equal
<         less than
<=        less than or equal
>         greater than
>=        greater than or equal

Quotes for strings

You can use either of the following quotation characters to enclose strings in YARA-L 2.0. However, quoted text is interpreted differently depending on which one you use.

  1. Double quotes ("): Use for normal strings. Escape sequences are interpreted, so special characters such as the backslash must be escaped. For example, in "hello\tworld", \t is interpreted as a tab.

  2. Back quotes (`): Use to interpret all characters literally. For example, in `hello\tworld`, \t is not interpreted as a tab.

For regular expressions, you often need to include the backslash (\) character within strings. However, if you use double quotes, you need to escape backslash characters with backslash characters, which can look awkward.

For example, the following regular expressions are equivalent:

  • re.regex($e.network.email.from, `.*altostrat\.com`)
  • re.regex($e.network.email.from, ".*altostrat\\.com")

We recommend using back quote characters for strings in regular expressions for ease of readability.

Variables

In YARA-L 2.0, all variables are represented as $<variable name>.

You can define the following types of variables:

  • Event variables—Represent groups of events in normalized form (UDM). Specify conditions for event variables in the events section. You identify event variables using the event fields after the variable. Event fields are represented as a chain of .<field name> (for example, $e.field1.field2). Event field chains always start from the top-level UDM.

  • Match variables—Declare in the match section. Match variables become grouping fields for the query, as one row is returned for each unique set of match variables (and for each time window). When the rule finds a match, the match variable values are returned. Specify what each match variable represents in the events section.

  • Placeholder variables—Declare and define in the events section. Placeholder variables are similar to match variables. However, you can use placeholder variables in the condition section to specify match conditions.

Use match variables and placeholder variables to declare relationships between event fields through transitive join conditions (see Events Section Syntax for more detail).
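
The three kinds of variables can be seen side by side in a sketch like the following. The rule name, field choices, and threshold are assumptions made for illustration.

```yara-l
rule variable_kinds_example {
  meta:
    description = "Illustrative: event, match, and placeholder variables"

  events:
    // $e is an event variable; fields are reached through a .<field name> chain.
    $e.metadata.event_type = "NETWORK_DNS"

    // $hostname is listed in the match section, so it is a match variable.
    $e.principal.hostname = $hostname

    // $ip is a placeholder variable: defined here, used in the condition section.
    $e.principal.ip = $ip

  match:
    $hostname over 5m

  condition:
    $e and #ip > 3
}
```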

Events section syntax

In the events section, list the predicates to specify the following:

  • What each match or placeholder variable represents
  • Filter condition on a single event variable
  • Join condition on two event variables

Variable declarations

For variable declarations, use the following syntax:

  • EVENT_FIELD = VAR
  • VAR = EVENT_FIELD

Both are equivalent, as shown in the following examples:

  • $e.source.hostname = $hostname
  • $userid = $e.principal.user.userid

This declaration indicates that this variable represents the specified field for the event variable. When the event field is an array field, the match variable can represent any value in the array. It is also possible to assign multiple event fields to a single match or placeholder variable. This is a transitive join condition.

For example, the following predicates:

  • $e1.source.ip = $ip
  • $e2.target.ip = $ip

are equivalent to:

  • $e1.source.ip = $ip
  • $e1.source.ip = $e2.target.ip

Filter conditions for event variables

For a filter condition on a single event variable, use the following syntax:

  • [EVENT_FIELD] [OP] [CONST]
  • [CONST] [OP] [EVENT_FIELD]

Although both are equivalent, we recommend using the former ([EVENT_FIELD] [OP] [CONST]) for readability.

For example:

  • $e.source.hostname = "host1234"
  • $e.source.port < 1024
  • 1024 < $e.source.port

This predicate acts as a filter on the event variable, meaning that the event group represented by the event variable must satisfy it.

Join conditions for event variables

To represent a join condition for two event variables, use the following syntax:

[EVENT_FIELD] [OP] [EVENT_FIELD]

For example:

  • $e1.source.hostname = $e2.target.hostname
  • $e1.metadata.timestamp < $e2.metadata.timestamp

This predicate is used to join the two event variables with the condition.
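
As a hedged sketch, a two-event rule that uses both an equality join and an ordering join might look like the following. The rule name and event types are illustrative, and the timestamp comparison uses the metadata.event_timestamp.seconds field, which is an assumption about the UDM schema rather than something stated in this section.

```yara-l
rule join_example {
  meta:
    description = "Illustrative: a login followed by a network connection on the same host"

  events:
    $e1.metadata.event_type = "USER_LOGIN"
    $e2.metadata.event_type = "NETWORK_CONNECTION"

    // Equality join: both events involve the same host.
    $e1.principal.hostname = $e2.principal.hostname

    // Ordering join: the login happens before the connection.
    $e1.metadata.event_timestamp.seconds < $e2.metadata.event_timestamp.seconds

    $e1.principal.hostname = $hostname

  match:
    $hostname over 10m

  condition:
    $e1 and $e2
}
```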

The following are examples of invalid predicates:

  • $e.source.hostname != $hostname // comparison over match/placeholder var
  • $hostname != "host1234" // comparison over match/placeholder var
  • $var1 // variable itself does not mean anything

Logical operators

You can use the logical and, or, and not operators in the events section as shown in the following examples:

  • $e.metadata.event_type = "NETWORK_DNS" or $e.metadata.event_type = "NETWORK_DHCP"
  • ($e.metadata.event_type = "NETWORK_DNS" and $e.principal.ip = "192.0.2.12") or ($e.metadata.event_type = "NETWORK_DHCP" and $e.principal.mac = "AB:CD:01:10:EF:22")
  • not $e.metadata.event_type = "NETWORK_DNS"

By default, the precedence order from highest to lowest is not, and, or.

For example, "a or b and c" is evaluated as "a or (b and c)". You can use parentheses to alter the precedence if needed.
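
The effect of precedence can be sketched with two alternative predicates that differ only in their parentheses. They are shown together for comparison and are not meant to appear in the same events section; the field values are taken from the examples above.

```yara-l
events:
  // Alternative 1, no parentheses. Because and binds tighter than or, this reads as:
  //   DNS or (DHCP and principal IP is 192.0.2.12)
  $e.metadata.event_type = "NETWORK_DNS" or $e.metadata.event_type = "NETWORK_DHCP" and $e.principal.ip = "192.0.2.12"

  // Alternative 2, with parentheses. The grouping becomes:
  //   (DNS or DHCP) and principal IP is 192.0.2.12
  ($e.metadata.event_type = "NETWORK_DNS" or $e.metadata.event_type = "NETWORK_DHCP") and $e.principal.ip = "192.0.2.12"
```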

Match section syntax

In the match section, list the match variables used to group events before checking for match conditions. Those fields are returned with each match.

  • Specify what each match variable represents in the events section.
  • Specify the time range to use to correlate events after the over keyword. Events outside the time range are ignored.
  • Use the following syntax to specify the time range: <number><s/m/h/d>, where s, m, h, and d mean seconds, minutes, hours, and days, respectively.
  • The minimum time you can specify is 1 minute.
  • The maximum time you can specify is 48 hours.

The following is an example of a valid match:

$var1, $var2 over 5m

This statement returns $var1 and $var2 (defined in the events section) when the rule finds a match. The time window specified is 5 minutes. Events that are more than 5 minutes apart are not correlated and are therefore ignored by the rule.

Here is another example of a valid match:

$user over 1h

This statement returns $user when the rule finds a match. The time window specified is 1 hour. Events that are more than an hour apart are not correlated. The rule does not consider them to be a detection.

Here is another example of a valid match:

$source_ip, $target_ip, $hostname over 2m

This statement returns $source_ip, $target_ip, and $hostname when the rule finds a match. The time window specified is 2 minutes. Events that are more than 2 minutes apart are not correlated. The rule does not consider them to be a detection.

The following examples illustrate invalid matches:

  • var1, var2 over 5m // invalid variable name
  • $user 1h // missing over keyword
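
To show how a match section fits into a complete rule, here is a minimal sketch that groups events by two match variables. The rule name, event type, and threshold are hypothetical.

```yara-l
rule match_window_example {
  meta:
    description = "Illustrative: repeated connections between the same pair of addresses"

  events:
    $e.metadata.event_type = "NETWORK_CONNECTION"
    $e.principal.ip = $source_ip
    $e.target.ip = $target_ip

  match:
    // One result per unique ($source_ip, $target_ip) pair per 5-minute window.
    $source_ip, $target_ip over 5m

  condition:
    // More than 20 distinct events in the group.
    #e > 20
}
```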

Condition section syntax

In the condition section, specify the match condition over events and variables defined in the events section. List match predicates here, joined with the keyword and or the keyword or.

The following conditions are bounding conditions. They force the associated event variable to exist, meaning that at least one occurrence of the event must appear in any detection.

  • $var // equivalent to #var > 0
  • #var > n // where n is >= 0
  • #var >= m // where m > 0

The following conditions are non-bounding conditions. They allow the associated event variable to not exist, meaning that it is possible that no occurrence of the event appears in a detection. This makes it possible to write non-existence rules, which search for the absence of a variable rather than its presence.

  • !$var // equivalent to #var = 0
  • #var < n // where n is > 0
  • #var <= m // where m >= 0

In the following example, the special character # on a variable (either the event variable or the placeholder variable) represents the count of distinct events or values of that variable:

$e and #port > 50 or #event1 > 2 or #event2 > 1 or #event3 > 0

The following non-existence example is also valid and evaluates to true if there are more than two distinct events from $event1, and zero distinct events from $event2:

#event1 > 2 and !$event2
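
A hedged sketch of a full non-existence rule built around this condition follows. The rule name and the login and logout event types are assumptions chosen to give $event1 and $event2 a plausible meaning.

```yara-l
rule nonexistence_example {
  meta:
    description = "Illustrative: several logins with no logout in the same window"

  events:
    $event1.metadata.event_type = "USER_LOGIN"
    $event1.principal.user.userid = $userid

    $event2.metadata.event_type = "USER_LOGOUT"
    $event2.principal.user.userid = $userid

  match:
    $userid over 1h

  condition:
    // Bounding condition on $event1, non-bounding condition on $event2.
    #event1 > 2 and !$event2
}
```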

The following are examples of invalid predicates:

  • $e, #port > 50 // incorrect keyword usage
  • $e or #port < 50 // or keyword not supported with non-bounding conditions

Regular expressions

You can define regular expressions in YARA-L 2.0 using either of the following syntaxes:

  • Using YARA syntax: Related to events. The following is a generic representation of this syntax: $e.field = /regex/
  • Using YARA-L syntax: As a function that takes the following parameters:
    • Field to apply the regular expression to.
    • Regular expression specified as a string.
    The following is a generic representation of this syntax: re.regex($e.field, `regex`)

You can use the nocase modifier after strings to indicate that the search should ignore capitalization.

To match the exact string or only a prefix or suffix, include the ^ (starting) and $ (ending) anchor characters in the regular expression.

For example, /^full$/ matches "full" exactly, while /full/ could match "fullest", "lawfull", and "joyfully".
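
The two syntaxes, anchors, and the nocase modifier can be combined as in the following sketch. The hostname pattern is hypothetical, and placing nocase after a regex constant is an assumption based on how the modifier is described for strings.

```yara-l
events:
  // YARA syntax: anchored at both ends, so the whole hostname must match.
  $e.principal.hostname = /^host-[0-9]+$/ nocase

  // YARA-L function syntax with a back-quoted pattern:
  // the $ anchor requires the address to end in altostrat.com.
  re.regex($e.network.email.from, `.*altostrat\.com$`) nocase
```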

Raw strings

You can use YARA-L 2.0 to search for a raw string within your enterprise event data.

To search for a raw string, enclose the string in backtick characters (`) instead of double quotes (").

The backtick indicates that the contents of the string are to be interpreted literally (escape characters will not be parsed).