Use data tables
Data tables are multicolumn data constructs that let you input your own data into Google Security Operations. They can act as lookup tables with defined columns and the data stored in rows. You can create or import a data table to your Google SecOps account using the Google SecOps UI, the data tables API, or by using a YARA-L query in rules.
Manage data tables using the Google SecOps user interface
The sections that follow describe how to manage data tables using the user interface, including how to access your data tables, add a new data table and edit its contents, import data to your table, add rows, separate data using either commas or tabs, and how to remove a data table from your account.
Access your data tables
To navigate to the data tables page, go to the left navigation sidebar and click Detections > Data tables. You can search for an existing data table by entering the name in the search field at the top of the sidebar.
Add a new data table
To add a new data table to Google SecOps, complete the following steps:
Click add Add at the top right of the sidebar.
In the Create new data table dialog, give the new table a name and (optional) add a description.
Click Create. The new data table appears in the main window and is ready to accept data.
Import data to your data table
To add data to your data table, you can import it directly into Google SecOps as follows:
Click Import Data.
Select a standard CSV file (only CSV files can be imported to Google SecOps).
Click Open when you're ready to import the data to your data table.
Add a new row of data to your data table
To manually add a new row of data to a data table, you can enter it directly, with the first row serving as the table header. To do this, do the following:
In the Details tab, place the cursor at the end of an existing row and press Enter.
Enter a new row of data:
- Separate the data fields using either commas or tabs.
- Be sure to match each data item to the appropriate data column.
- As you enter a row data in the Details tab, it populates in the Table editor.
Specify whether to use commas or tabs to separate data
To separate data using either commas or tabs, do the following:
Click
Edit separator type next to Import Data.In the Edit separator type dialog, select either Comma or Tab from the Separator menu.
Remove a data table
To remove a data table:
Select a data table from the Data tables list to the left.
Click Delete.
Manage data tables using the Chronicle API
You can also use the REST resources available in the Chronicle API to manage data tables in Google SecOps. The API has functionality equivalent to the user interface, and includes some additional features that let you manage data tables with more performance and greater scale.
Here are the data table REST resources:
Use data tables in Google SecOps
Once you have imported data tables to your Google SecOps instance, you can use them to filter, enhance, and enrich your data using rules. This document includes numerous examples in YARA-L syntax, which you can incorporate into Google SecOps rules within your instance. For more information on how to edit rules in Google SecOps, see Manage rules using Rules Editor.
You can use data tables in conjunction with rules in the following ways:
Filter UDM event or entity data using a data table. You can filter UDM telemetry events and entities by comparing them with entries in a data table.
Join a data table with an event or entity. You can link UDM events to a data table using the equality operator for row-based comparison. This comparison lets you filter the data. For a row-based comparison to evaluate as true, all the conditions in the statement must be matched by a row.
Use a data table as a multicolumn reference list. You can use a data table as a multicolumn reference list. While a reference list can access data in a single dimension, data tables let you access data in multiple dimensions, enabling data filtering.
Filter UDM event and entity data using a data table
You can filter UDM events and entities by comparing them to entries in a data table.
Link UDM events to data tables in either of the following ways:
Using an equality operator (
=, !=, >, >=, <, <=
) for row-based comparison. For example,$<udm_variable>.<field_path> = %<data_table_name>.<column_name>
.Using the
in
keyword for column-based comparison. For example,$<udm_variable>.<field_path> in %<data_table_name>.<column_name>
.`
As with reference lists, the supported
data types for each data table column can be string
, regex
, or CIDR
.
To use a data table column of type CIDR or regular expression for row-based comparison, use the following syntax:
net.ip_in_range_cidr($e.principal.ip, %<data_table_name>.<column_name>)
re.regex($e.principal.hostname, %<data_table_name>.<column_name>)
To use a data table column of type CIDR or regular expression for column-based comparison, use the following syntax:
$e.principal.ip in cidr %cidr_data_table.column_name
$e.principal.hostname in regex %regex_data_table.column_name
If you specified the data type of the column as either CIDR or regular expression, exclude the keyword cidr
or regex
.
You can also use the not
operator with data tables. The following example rule filters out entries where the IP addresses ($e.principal.ip
) don't match any of the CIDR ranges listed in the benign_ip
column in cidr_data_table
:
not $e.principal.ip in cidr %cidr_data_table.benign_ip
Join a data table with a UDM event or entity
You can use a data table to filter UDM events using the equality operator
(=, !=, >, >=, <, <=
) for row-based comparison.
In the following example, the YARA-L rule checks that when a user logs in, their user ID exists in the data table (example_table
). In the same row of the data table where the match occurred, it also checks if the user account is active before the login event was
registered.
// Check if a user exists in a data table and the user is active for all user login events.
rule udm_join_data_table {
meta:
description = "Join data table with UDM event"
event:
$e.metadata.event_type = USER_LOGIN
$e.security_result.action = ALLOW
$e.principal.user.userid = $userid
// Event must match at least one row in the table where the uid in the table
// row is the userid for the event and the active date in the same table
// row is before the event timestamp
%example_table.uid = $userid
$e.principal.hostname = %example_table.hostname
match:
$userid over 1h
condition:
$e
}
The following example illustrates a data table and UDM event data. Based on the logic in the preceding YARA-L rule, a user with user ID 32452 surfaces in the detection as the user's hostname in the system matches the hostname in the data table.
Data table | ||
uid | title | hostname |
32452 | HR | host1 |
64452 | Finance | host2 | 46364 | IT | host3 |
UDM event table | |||
principal | metadata | security_result | principal |
32452 | USER_LOGIN | ALLOW | host1 |
64589 | USER_LOGIN | ALLOW | host9 |
87352 | USER_LOGIN | ALLOW | host4 |
Use a data table as a multicolumn reference list
You can use a data table as a multicolumn reference list. While a reference list can access data in a single dimension, data tables let you access data in multiple dimensions, enabling data filtering.
You can link UDM events to a data table using the in
keyword for column-based
comparison.
Data tables can be referenced as multicolumn reference lists using the same syntax used by reference lists:
%<data_table_name>.<column_name>.
In the following example, alerts are triggered for network connections with
suspicious port and protocol combinations. Port
and protocol
are columns in
the badApps
data table.
rule udm_in_data_table {
meta:
description = "Use data table as multicolumn reference list"
events:
$e.metadata.event_type = NETWORK_CONNECTION
$e.security_result.action = ALLOW
$e.target.asset.asset_id = $assetid
// event port matches at least one value in table column port
$e.target.port in %badApps.port
// event IP matches at least 1 value in table column protocol
$e.target.network.ip in %badApps.ip
match:
$assetid over 1h
condition:
$e
}
Map data types to single data table columns
Map each data type to a single data table column. Any column containing values for multiple fields must be split before they can be used as data tables.
In the following example, the data table column Field_value
contains values
for multiple fields:
Field_value | Field_name |
altostrat.com | FQDN |
192.0.2.135 | IP |
charlie | userid |
example | hostname |
The preceding table is split into four columns with each column mapped to only one field type before it can be used for any of the data table use cases presented in this document.
FQDN | IP | Userid | Hostname |
altostrat.com | 192.0.2.135 | charlie | example |
… | … | … | … |
Map column names to entity fields
When you create a data table, you can map the column names of the data table to entity fields.
In the following example data table, the columns Userid
and Role
are mapped
to entity.user.userid
and entity.user.attribute.role.name
:
Userid
(map to entity.user.userid) |
Role
(map to entity.user.attribute.role.name) |
|
jack | jack123@gmail.com | admin |
tony | tony123@gmail.com | engineer |
The Email
column cannot be mapped to the entity path
entity.user.email_address
since it is a repeated field. You cannot map a
column of a data table to a repeated field.
You can map a column of a data table to an entity proto field using the
mapped_column_path
field of the DataTable resource.
For columns that don't have an entity path defined, such as Email
in this
example table, you must define a data type. As with
reference lists, the supported data
types for data tables are string
, regex
, and cidr
.
You can specify both mapped and unmapped columns in the data table using the join
condition. Unmapped columns go to the additional
field of
the entity (to which the data table is joined) as key-value pairs where the key
is the column name and the value is the row value for that column.
Write results from YARA-L queries to data tables
You can write the results from YARA-L queries to a data table. Using this feature, you can create data tables from your Google SecOps data and use these tables to filter and enhance other data.
You can use the YARA-L rules syntax for data tables for the following:
Define a YARA-L syntax for writing rule results to data tables. You can use the same syntax for search and dashboards.
Use data tables for threat intelligence, incident response, and other security use cases.
Ensure that the data is consistent with YARA-L conventions.
Write detections and alerts to data tables using rules
You can use YARA-L rules to send detections and alerts to data tables.
Using the write_row function, you can overwrite a data table row with the matching key in the data table using the results from a rule. If the key is not found in the table, add a new row instead.
Specify the write_row function in the export section of a YARA-L query. Writing data to the data table must be the final action of the rule. This ensures that the outcome variables are written to the data table.
export:
%<data_table_name>.write_row(
data_table_column_x_name: <value>,
data_table_column_y_name: <value>,
...,
...,
data_table_column_z_name: <value>
)
// depending on the key column(s), the rows will be updated for existing keys
and appended for new keys
Modify a data table using YARA-L
The following shows how to modify a data table using YARA-L:
TableName: ip_user_domain_table
(key columns for the primary key are defined at creation)
ip_address | employee_id* | domain |
192.0.2.10 | Dana | altostrat.com |
192.0.2.20 | Quinn | altostrat.com |
192.0.2.30 | Lee | cymbalgroup.com |
* indicates the primary key.
The following rule captures unique combinations of
principal.ip
, principal.user.employee_id
, and target.domain
. It filters
the results based on the prevalence of the target.domain
:
rule unique_principal_userid_and_ip_and_target_domain_with_low_prevalence {
meta:
author = "GP"
description = "Captures unique combinations of principal.ip,
principal.user.employee_id, and target.domain where
target.domain.prevalence is less than 5"
rule_version = "1.0"
events:
$e.principal.ip = $principal_ip
$e.principal.user.employee_id = $principal_user_employee_id
$e.target.domain.name = $target_domain
$e.target.domain.prevalence.day_count < 5
condition:
$e
}
Rule results:
ip | empid | domain |
192.0.2.10 | Dana | altostrat.com |
192.0.2.30 | Lee | examplepetstore.com |
192.0.2.20 | Quinn | altostrat.com |
Example: Use write_row to write rule output to a data table
rule write_to_data_table {
meta:
author = "GP"
description = "Captures uniqueprincipal.user.employee_id, and target.domain where target.domain.prevalence is less than 5"
rule_version = "1.0"
events:
$e.principal.user.employee_id = $principal_user_employee_id
$e.target.domain.name = $target_domain
$e.target.domain.prevalence.day_count < 5
outcome:
$hostname = $target_domain
$principal_emp_id = $principal_user_employee_id
condition:
$e
export:
%ips_with_hostnames.write_row (
employeeid:$principal_emp_id,
hostname:$hostname,
)
}
Example: Understanding write_row
In the following example, user
and ip
are primary keys. Each detection that persists in the detections table results in one evaluation of the function call in the export section of the rule.
Here is the rule:
rule successful_logins_by_user_to_ip {
meta:
events:
$e.metadata.event_type = "USER_LOGIN"
all $e.security_result.action != "BLOCK"
all $e.security_result.action != "UNKNOWN_ACTION"
$user = $e.principal.user.userid
$ip = $e.target.ip
$ts = $e.metadata.event_timestamp.seconds
match:
$user, $ip over 1h
outcome:
$first_seen = min($ts)
condition:
$e
export:
%successful_logins.write(user:$user, ip:$ip)
}
And here is the event data:
metadata: {
event_type: USER_LOGIN
event_timestamp: { seconds: 1283299200 }
}
principal: {
user: {
userid: "charlie"
}
}
target: {
ip: ["192.0.2.135", "192.0.2.136"]
}
security_result: {
action: ALLOW
}
The following detections are returned:
Detection ID | Match $user | Match $ip |
0 | charlie | 192.0.2.135 |
1 | charlie | 192.0.2.136 |
The data table contains the following:
user | ip |
charlie | 192.0.2.135 |
charlie | 192.0.2.136 |
Enrich entity graph with a data table
You can use data tables to add, remove, or replace the entities presented in
entity graph. Use functions in the rule setup
section to indicate how the
data table should be merged with, appended to, or used to remove entities from
entity events referenced in the events
section.
You can use the following rule template to modify an entity graph:
rule entity_graph_template {
meta:
...
setup:
// import the data table into entity graph
<enrichment_keyword> <join_condition>
events:
...
match:
...
condition:
...
}
You can use the following YARA-L 2.0 functions to enhance entity graph with a data table:
graph_override
: Overwrite the rows in the entity graph that match the join condition with data from the data table.For example:
[graph_override](?tab=t.0#heading=h.v0fps7eke1if)
graph_append
: Append the rows from the data table to the rows in the entity graph. Thegraph_append
operation requires an array that includes a data table variable and an entity event variable rather than a join condition.In the following example,
$g1
is the entity graph variable andexample_table
is the data table:graph_append [$g1, %example_table]
graph_exclude
: Remove the rows in the entity graph that match the join condition.For example:
[graph_exclude](?tab=t.0#heading=h.o0qbb5paki6g)
The join condition must be an equality expression between the data table column
and the entity graph field. For the graph_override
and graph_exclude
functions, the syntax to access a data table is as follows:
<data_table_name>.<column_name>
Any filter specified for the <entity_variable>
in the event section is applied
after its enhancement with the data table.
After the entity in the entity graph is enriched with the entity in the data table, the entity variable in the entity graph must be joined to the UDM entity.
Override entity graph with data from data table
With the graph_override
function, fields present in both the entity
graph and the data table are replaced with fields from the data table. Fields
present in the entity graph and not in the data table remain the same. Fields
not present in the entity graph but present in the data table are included.
Only those columns of the data table that are mapped override the columns
of the entity graph. The columns that are unmapped are added to the additional
field of the entity graph on which the data table is joined.
Example: Match on single join
In the following example, the rows in the entity graph that match the join
condition between the data table column and the entity graph field
($g1.graph.entity.ip = %example_table.my_ip
) are overridden by the data table.
rule rule_override {
meta:
description = "Override entity context with data table before joining with UDM event"
setup:
//Rows in the entity graph that match the join condition are overridden by the data table
graph_override ($g1.graph.entity.ip = %example_table.my_ip)
events:
$e.metadata.event_type = "NETWORK_CONNECTION"
$e.security_result.action = "ALLOW"
// Filter will be applied after graph is overridden by data table
$g1.graph.entity.hostname = "ftp01"
// Accessing unmapped columns
$g1.graph.additional.fields["Owner"] = "alice"
// Joining the UDM event with the enriched entity graph
$e.target.ip = $iocip
$g1.graph.entity.ip = $iocip
match:
$iocip over 1h
condition:
$e and $g1
}
To use an unmapped column (say "Owner") of the data table, then an
equivalent statement for
$g1.graph.entity.owner = "alice" is $g1.graph.additional.fields["Owner"] = "alice"
.
This is because all unmapped columns of the data table go into the
additional
field of the entity graph ($g1)
.
The following tables illustrate an override operation where rows in the entity graph are enriched when the IP field in the data table matches the IP field in the entity graph.
Existing entity graph | ||
Hostname | IP | MAC |
ftp01 | 10.1.1.4 | …:01 |
www01 | 10.1.1.5 | …:02 |
Data table | |||
Hostname | IP | MAC | Owner |
ftp01 | 10.1.1.4 | …:bb | alice |
h1 | 10.1.1.6 | …:cc | bob |
h2 | 10.1.1.7 | …:dd | chris |
h3 | 10.1.1.4 | …:ee | doug |
Enriched entity graph | |||
Hostname | IP | MAC | Owner |
ftp01 | 10.1.1.4 | …:bb | alice |
www01 | 10.1.1.5 | …:02 | |
h3 | 10.1.1.4 | …:ee | doug |
Example: Match on multiple joins
In the following example, the rows in the entity graph that match the multiple
join conditions ($g1.graph.entity.ip = %example_table.my_ip
and $g1.graph.entity.hostname = %example_table.my_hostname
) are overridden by
the data table.
rule rule_override {
meta:
description = "Override Entity context with Data Table before joining with UDM event"
setup:
// example with more than one condition
graph_override ($g1.graph.entity.ip = %example_table.my_ip and
$g1.graph.entity.hostname = %example_table.my_hostname)
events:
$e.metadata.event_type = "NETWORK_CONNECTION"
$e.security_result.action = "ALLOW"
// Filter will be applied after graph is overridden by data table
$g1.graph.entity.hostname = "ftp01"
// joining the UDM event with the enriched entity graph
$e.target.ip = $iocip
$g1.graph.entity.ip = $iocip
match:
$iocip over 1h
condition:
$e and $g1
}
The following tables illustrate an override operation in which the rows of the entity graph are enriched when both the IP field and the hostname field in the data table match the IP field and the hostname field in the entity graph.
Existing entity graph | ||
Hostname | IP | MAC |
ftp01 | 10.1.1.4 | …:01 |
www01 | 10.1.1.5 | …:02 |
Data table | |||
Hostname | IP | MAC | Owner |
ftp01 | 10.1.1.4 | …:bb | alice |
h1 | 10.1.1.5 | …:cc | bob |
h2 | 10.1.1.6 | …:dd | chris |
h3 | 10.1.1.4 | …:ee | doug |
Enriched entity graph | |||
Hostname | IP | MAC | Owner |
ftp01 | 10.1.1.4 | …:bb | alice |
www01 | 10.1.1.5 | …:02 |
Append data from the data table to entity graph
With the graph_append
function, no join condition is required.
In the following example, all rows in the data table are appended to the rows in the entity graph.
rule rule_append {
meta:
description = "Data table append entity"
setup:
graph_append [$g1, %example_table]
events:
// filter UDM events
$e.metadata.event_type = "NETWORK_CONNECTION"
$e.security_result.action = "ALLOW"
// Join the filtered UDM events with the enriched graph
$e.target.ip = $iocip
$g1.graph.entity.ip = $iocip
match:
$iocip over 1h
condition:
$e and $g1
}
The following example table illustrates an append operation where the rows of the data table are appended to the rows in the entity graph:
Existing entity graph | ||
Hostname | IP | MAC |
ftp01 | 10.1.1.4 | …:01 |
www01 | 10.1.1.5 | …:02 |
Data table | |||
Hostname | IP | MAC | Owner |
10.1.1.4 | …:01 | alice | |
10.1.1.6 | …:cc | bob | |
10.1.1.7 | …:dd | chris | |
10.1.1.4 | …:ee | doug |
Enriched entity graph | |||
Hostname | IP | MAC | Owner |
ftp01 | 10.1.1.4 | …:01 | |
www01 | 10.1.1.5 | …:02 | |
10.1.1.4 | …:bb | alice | |
10.1.1.6 | …:cc | bob | |
10.1.1.7 | …:dd | chris | |
10.1.1.4 | …:ee | doug |
Use graph_exclude to remove rows from entity graph
With the graph_exclude
function, rows in the entity graph that
match the join condition are removed from the entity graph.
In the following example, all rows in the entity graph that match the given join condition (between the data table column and the entity graph field) are removed. No rows from the data table are added to the entity graph.
rule rule_exclude {
meta:
setup:
graph_exclude ($g1.graph.entity.ip = %example_table.ip)
events:
$e.metadata.event_type = "NETWORK_CONNECTION"
$e.security_result.action = "ALLOW"
$e.target.ip = $iocip
$g1.graph.entity.ip = $iocip
match:
$iocip over 1h
condition:
$e and $g1
}
The following tables illustrate an exclude operation in which the rows of the entity graph that match the IP field of the data table are removed:
Existing entity graph | ||
Hostname | IP | MAC |
ftp01 | 10.1.1.4 | …:01 |
www01 | 10.1.1.5 | …:02 |
Data table | ||
IP | MAC | Owner |
10.1.1.4 | …:bb | alice |
10.1.1.6 | …:cc | bob |
10.1.1.7 | …:dd | chris |
Enriched entity graph | ||
Hostname | IP | MAC |
www01 | 10.1.1.5 | …:02 |
Limitations
The limits on the number of
in
statements when referencing a reference list in a rule also apply toin
statements in a data table.Only CSV file type is supported for uploads.
Maximum size of a data table is 10 GB.
Maximum aggregate limit of data vloume across data tables in a tenant is 1 TB.
Maximum number of
in
statements in a rule, with or without special operators: 7Maximum number of
in
statements with theregex
operator: 4Maximum number of
in
statements with thecidr
operator: 2Placeholders are not allowed in the new setup section.
Unmapped columns of a data table with data type set to
string
can only be joined with string fields of UDM event or UDM entity.Use only unmapped columns in a data table with a data type set to
cidr
orregex
for CIDR or regular expression.You cannot map a data table's column to a repeated field.
Use data tables with rules
The following limitations apply to data tables when used with rules:
Run frequency
Real-time run frequency is not supported for rules with data tables.
Joins
Unlike entities and UDM, data tables don't support placeholders. This means you cannot apply one set of filters to a data table, join it with a UDM entity and then apply a different set of filters to the same data table and while joining it with another UDM placeholder variable.
For example, a data table named
dt
with 3 columns:my_hostname
,org
, andmy_email
and with the following rule:
events:
$e1.principal.hostname = %dt.my_hostname
%dt.org ="hr"
$e2.principal.email = %dt.my_email
%dt.org !="hr"
All the filters on a data table are applied first and then the filtered rows
from the data table are joined with UDM. In this case, an empty data table is
joined with e1
and e2
because the two filters on data table dt
contradict
each other (%dt.org ="hr" and %dt.org !="hr"
).
Output to data tables
You can only export outcome variables to a data table. You cannot export event path or data table columns directly.
Column lists must include the primary key columns for data tables.
You can have no more than 20 outcomes.
Data table columns don't support repeated values, so all outcome variables written to a data table must be singular values.
If a data table doesn't exist, a new table is created with the default string data type for all columns, following the order specified.
Only one rule can write to a data table at a time. If a rule tries to write to a data table that another rule is already writing to, rule compilation fails.
There is no guarantee that a producer rule can add rows to a data table before a consumer rule for that data table starts.
A rule has a limit on the number of outcomes rows. A 10,000-row limit applies over the result and persisted data. The same limit applies to data tables; a single rule execution can output a maximum of 10,000 rows to a data table.
If a row with the same primary key already exists in the data table, non-primary key columns will be replaced with the new values.
You can apply only one enrichment operation (either
override
,append
, orexclude
) to a single entity graph variable.Each enrichment operation can be performed using only one data table.
You can define a maximum of two enrichment operations of any type in the
setup
section of a YARA-L rule.
In the following example, an override operation is applied to the entity graph
variable $g1
and an append
operation is applied to the entity graph variable
$g2
.
setup:
graph_override($g1.graph.entity.user.userid = %table1.myids)
graph_append [$g2, %table1]
In the preceding example, the same data table (table1
) is used to enhance
different entity graphs. You can also use different data tables to enhance
the different entity graphs, as follows:
setup:
graph_override($g1.graph.entity.user.userid = %table1.myids)
graph_append [$g2, %table2]