Use joins in Search
Joins help correlate data from multiple sources to provide more context for an investigation. By linking related events, entities, and other data, you can investigate complex attack scenarios.
This document explains how to use the join operation in Google Security Operations. It also covers supported join types, use cases, and best practices.
Create a join
Only statistics-based joins are supported. You must define them in the match
section of a query with a correlation time window of up to 48 hours.
You can create a join by either connecting fields directly (for
example, $e1.hostname = $e2.hostname
) or by using placeholder variables. When
you define a join in the match
section, you must use placeholder variables.
The following example queries joins two fields with an equals sign (=
) and a
shared placeholder variable:
Example 1:
events:
// Assign a value from the first event to the placeholder variable $user
$user = $e1.principal.user.userid
// The second assignment creates an implicit join, linking $e2 to $e1
// where the user ID is the same.
$user = $e2.principal.user.userid
match:
$user over 1h
condition:
$e1 and $e2
Example 2:
$e1.principal.ip = $ip
$e1.metadata.event_type = "USER_LOGIN"
$e1.principal.hostname = $host
$e2.target.ip = $ip
$e2.principal.hostname = "altostrat"
$e2.target.hostname = $host
match:
$ip, $host over 5m
Supported join types
This section describes the different types of joins that you can use.
Event-event join
An event-event join connects two different Universal Data Model (UDM) events.
The following example query links a USER_LOGIN
event with another event to
find the hostname (altostrat
) that the user interacted with, based on a common
IP address:
$e1.principal.ip = $ip
$e1.metadata.event_type = "USER_LOGIN"
$e2.target.ip = $ip
$e2.principal.hostname = "altostrat"
match:
$ip over 5m
Event-ECG join
An Event-ECG join connects a UDM event with an entity from the Entity Context
Graph (ECG). The following example query finds a NETWORK_CONNECTION
event and
an ASSET
from the entity graph that share the same hostname within a 1-hour
window:
events:
$e1.metadata.event_type = "NETWORK_CONNECTION"
$g1.graph.metadata.entity_type = "ASSET"
$e1.principal.asset.hostname = $g1.graph.entity.asset.hostname
$x = $g1.graph.entity.asset.hostname
match:
$x over 1h
condition:
$e1 and $g1
Datatable-event join
A datatable-event join connects UDM events with entries in a custom datatable.
This is useful for checking live event data against a user-defined list, such
as known malicious IP addresses or threat actors. The following example query
joins NETWORK_CONNECTION
events with a datatable to find connections involving
specific IP addresses from that list:
$ip = %DATATABLE_NAME.COLUMN_NAME
$ip = $e1.principal.ip
$e1.metadata.event_type = "NETWORK_CONNECTION"
match:
$ip over 1h
Best practices
Join queries can be resource-intensive because they combine many results. Broad, general filters can cause queries to fail, sometimes after a long delay, for example:
target.ip != null
metadata.event_type = "NETWORK_CONNECTION"
(if this event type is very common in your environment)
We recommend combining general filters with more specific ones to reduce the
total number of events that the query needs to process. A broad filter like
target.ip != null
should be paired with more specific filters to improve the
performance of the query, for example:
$e1.metadata.log_type = $log
$e1.metadata.event_type = "USER_LOGIN"
$e1.target.ip != ""
$e2.metadata.log_type = $log
$e2.principal.ip = "10.0.0.76"
$e2.target.hostname != "altostrat"
match:
$log over 5m
If your query is still slow, you can also reduce the query's overall time range (for example, from 30 days to one week).
For more information, see YARA-L best practices.
Limitations
The following limitations apply when using joins:
You can use a maximum of two UDM events per query.
You can use a maximum of one ECG event per query.
You can use a maximum of two Datatables per query.
You cannot join datatable, UDM, and ECG events together in a single query.
The maximum query time range is 90 days.
The maximum
match
time window is 48 hours.Joins are supported in the user interface and the
EventService.UDMSearch
API, but not in theSearchService.UDMSearch
API.
Common use cases
This section lists some common ways to use joins.
Detect credential theft and use
Goal: Find instances where a user logs in successfully, and then quickly deletes a critical system file. This could suggest an account takeover or malicious insider activity.
Join type: Event-Event join
Description: This query connects two distinct events that aren't suspicious
on their own, but become highly suspicious when they happen together. It first
looks for a USER_LOGIN
event, then a FILE_DELETION
event. These are joined
by the common user.userid
with a short time window.
Sample query:
// Event 1: A user successfully logs in
$e1.metadata.event_type = "USER_LOGIN"
$e1.security_result.action = "ALLOW"
$e1.principal.user.userid = $user
// Event 2: The same user deletes a critical file
$e2.metadata.event_type = "FILE_DELETION"
$e2.target.file.full_path = /etc\/passwd|C:\\Windows\\System32\\/
$e2.principal.user.userid = $user
match:
$user over 10m
condition:
$e1 and $e2
Identify risky connections from critical assets
Goal: Enrich live network data with asset information to find outbound connections from servers that shouldn't communicate with external, low-prevalence domains (for example, a production database server).
Join type: Event-ECG join
Description: A single network connection to a rare domain might not be a
high priority. However, this query increases the importance of that event by
joining it with the Entity Context Graph (ECG). It specifically looks for
NETWORK_CONNECTION
events that come from assets labeled as "Critical Database
Server" in the entity graph.
Sample query:
events:
$e.metadata.event_type = "NETWORK_CONNECTION"
$e.target.domain.prevalence.day_count <= 5
$asset.graph.metadata.entity_type = "ASSET"
$asset.graph.entity.asset.labels.value = "Critical Database Server"
$e.principal.asset.hostname = $asset.graph.entity.asset.hostname
$host = $e.principal.asset.hostname
match:
$host over 1h
condition:
$e and $asset
Hunt for threat actor IOCs
Goal: Actively search for Indicators of Compromise (IoCs) by checking all live DNS queries against a list of domains known to be used by a specific threat actor.
Join type: Datatable-Event join
Description: Your threat intelligence team maintains a datatable called
ThreatActor_Domains
that lists malicious domains. This query joins all
real-time NETWORK_DNS_QUERY
events with this datatable. It immediately shows
any instance where a host in your network tries to resolve a domain from your
threat intelligence list.
Sample query:
// Datatable: Get the list of malicious domains
$domain = $ip = %DATATABLE_NAME.COLUMN_NAME
// Event: A DNS query is made
$e.metadata.event_type = "NETWORK_DNS"
$e.network.dns.questions.name = $domain
match:
$domain over 5m
condition:
$e
Need more help? Get answers from Community members and Google SecOps professionals.