Search best practices

Supported in:

This document describes Google's recommended best practices for using the Search feature in Google Security Operations. Searches can require substantial computational resources if they're not carefully constructed. Performance also varies depending on the size and complexity of the data in your Google SecOps instance.

Use indexed UDM fields for maximum speed

The single most effective way to improve search performance is to build queries using indexed fields. These fields are optimized for fast retrieval. The known Unified Data Model (UDM) fields that are indexed are as follows:

Principal fields

  • principal.asset.hostname
  • principal.asset.ip
  • principal.asset.mac
  • principal.file.md5
  • principal.file.sha1
  • principal.file.sha256
  • principal.hostname
  • principal.ip
  • principal.mac
  • principal.process.file.md5
  • principal.process.file.sha1
  • principal.process.file.sha256
  • principal.process.parent_process.file.md5
  • principal.process.parent_process.file.sha1
  • principal.process.parent_process.file.sha256
  • principal.user.email_addresses
  • principal.user.product_object_id
  • principal.user.userid
  • principal.user.windows_sids

Source fields

  • source.user.userid
  • src.asset.hostname
  • src.hostname
  • src.ip

Target fields

  • target.asset.hostname
  • target.file.md5
  • target.file.sha1
  • target.file.sha256
  • target.hostname
  • target.ip
  • target.process.file.md5
  • target.process.file.sha1
  • target.process.file.sha256
  • target.user.email_addresses
  • target.user.product_object_id
  • target.user.userid
  • target.user.windows_sid

Additional fields

  • about.file.md5
  • about.file.sha1
  • about.file.sha256
  • intermediary.hostname
  • intermediary.ip
  • network.dns.questions.name
  • network.email.from
  • network.email.to
  • observer.hostname
  • observer.ip

Construct effective search queries for performance

Writing optimized queries is key to maximize speed and minimize resource consumption across your security data. All query conditions must strictly adhere to this fundamental structure:

udm-field operator value

For example: principal.hostname = "win-server"

Because Google SecOps can ingest a large amount of data during a search, you must minimize the time range of your query to narrow the scope and improve search performance.

Use regular expressions in search query

You can use standard logical and comparison operators when constructing your UDM search queries to build complex expressions:

  • Logical operators:Use AND, OR, and NOT to combine conditions. AND is assumed if you omit an operator between two conditions.
  • Operator precedence: Use parentheses () to override the default order of precedence. There is a maximum limit of 169 logical operators (OR, AND, NOT) that you can use within parentheses.
  • Comparison operators: Depending on the UDM field type (string, integer, timestamp), field operators can include: =, !=, >=, >, <, <=

Alternatively, for efficient searching of a large set of values, you can use the reference lists.

Use nocase as a search modifier

You can append the nocase modifier to a string comparison condition to make the search case-insensitive, which ignores capitalization.

For example, the following search is invalid:

target.user.userid = "TIM.SMITH" nocase

Avoid using regular expressions in enumerated fields

You can't use regular expressions when searching enumerated fields (fields with a range of predefined values) like metadata.event_type or network.ip_protocol

The following example is an invalid search: metadata.event_type = /NETWORK_*/

Whereas, the following example is a valid search: (metadata.event_type = "NETWORK_CONNECTION" or metadata.event_type = "NETWORK_DHCP")

Use any and all operators in the Events field

In Search, some UDM fields (like principal.ip or target.file.md5) are labeled as repeated, because they can hold a list of values or message types within a single event. Repeated fields are always treated with the any operator by default (there's no option to specify all).

When the any operator is used, the predicate is evaluated as true if any value in the repeated field satisfies the condition. For example, if you search for principal.ip != "1.2.3.4" and events in your search include both principal.ip = "1.2.3.4" and principal.ip = "5.6.7.8", a match is generated. This expands your search to include results that match any of the operators instead of matching all of them.

Each element in the repeated field is treated individually. If the repeated field is found in events in the search, the events are evaluated for each element in the field. This can cause unexpected behavior, especially when searching using the != operator.

When using the any operator, the predicate is evaluated as true if any value in the repeated field satisfies the condition.

Use Unix epoch time for timestamps

Timestamp fields are matched using Unix epoch time (the total number of seconds that have passed since Thursday, 1 January 1970 00:00:00 UTC).

When searching for a specific timestamp, the following (in epoch time) is valid:

metadata.ingested_timestamp.seconds = 1660784400

The following timestamp is invalid:

metadata.ingested_timestamp = "2022-08-18T01:00:00Z"

Exclude fields from filters

The following fields are intentionally excluded from search filters. While they contain crucial metadata, their highly unique values can introduce unnecessary search detail and reduce the overall efficiency and effectiveness of the query engine:

  • metadata.id
  • metadata.product_log_id
  • *.timestamp

Need more help? Get answers from Community members and Google SecOps professionals.