How Google Security Operations enriches event and entity data
This document describes how Google Security Operations enriches data and the Unified Data Model (UDM) fields where data is stored.
To enable a security investigation, Google Security Operations ingests contextual data from different sources, performs analysis on the data, and provides additional context about artifacts in a customer environment. Analysts can use contextually enriched data in Detection Engine rules, investigative searches, or reports.
Google Security Operations performs the following types of enrichment:
- Enriches entities by using the entity graph and merging.
- Calculates and enriches each entity with a prevalence statistic that indicates its popularity in the environment.
- Calculates the first and most recent times that certain entity types were seen in the environment.
- Enriches entities with information from Safe Browsing threat lists.
- Enriches events with geolocation data.
- Enriches entities with WHOIS data.
- Enriches events with VirusTotal file metadata.
- Enriches entities with VirusTotal relationship data.
- Ingests and stores Google Cloud Threat Intelligence data.
Enriched data from WHOIS, Safe Browsing, GCTI Threat Intelligence, VirusTotal metadata, and VirusTotal relationship are identified by entity_type, product_name, and vendor_name. When creating a rule that uses this enriched data, we recommend that you include a filter in the rule that identifies the specific enrichment type to include. This filter helps improve performance of the rule.
For example, include the following filter fields in the events section of the rule that joins WHOIS data.
$enrichment.graph.metadata.entity_type = "DOMAIN_NAME"
$enrichment.graph.metadata.product_name = "WHOISXMLAPI Simple Whois"
$enrichment.graph.metadata.vendor_name = "WHOIS"
Enrich entities by using the entity graph and merging
The entity graph identifies relationships between entities and resources in your environment. When entities from different sources are ingested into Google Security Operations, the entity graph maintains an adjacency list based on the relationship between the entities. The entity graph performs context enrichment by performing deduplication and merging.
During deduplication, redundant data is eliminated and intervals are formed to create a common entity. For example, consider two entities e1 and e2 with timestamps t1 and t2 respectively. The entities e1 and e2 are deduplicated, and their differing timestamps are ignored. The following fields are not used during deduplication:
- collected_timestamp
- creation_timestamp
- interval
During merging, relationships between entities are formed for a time interval of one day. For example, consider an entity record of user A who has access to a Cloud Storage bucket. There is another entity record of user A who owns a device. After merging, these two entities result in a single entity user A that has two relations. One relation is that user A has access to the Cloud Storage bucket and the other relation is that user A owns the device.
Google Security Operations performs a five-day lookback when it creates entity context data. This handles late arriving data and creates an implicit time to live on entity context data.
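The deduplication and merging behavior described above can be sketched in Python. This is only an illustration, not the product's implementation: the record layout (plain dictionaries with `user` and `relation` keys) and the function names are hypothetical, and keeping the most recently listed duplicate is an assumption.

```python
# Fields that the text above lists as not used during deduplication.
IGNORED_FIELDS = {"collected_timestamp", "creation_timestamp", "interval"}


def dedup_key(record: dict) -> tuple:
    """Build a hashable key from every field except the ignored ones."""
    return tuple(sorted(
        (name, repr(value)) for name, value in record.items()
        if name not in IGNORED_FIELDS
    ))


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep one record per key (assumption: a later record replaces an earlier one)."""
    unique: dict[tuple, dict] = {}
    for record in records:
        unique[dedup_key(record)] = record
    return list(unique.values())


def merge_relations(records: list[dict]) -> dict[str, list[str]]:
    """Collect the distinct relations of all records that describe the same user."""
    merged: dict[str, list[str]] = {}
    for record in records:
        relations = merged.setdefault(record["user"], [])
        if record["relation"] not in relations:
            relations.append(record["relation"])
    return merged
```

With two records that differ only in `collected_timestamp`, `deduplicate` keeps one of them, and `merge_relations` then yields a single user A entry that carries both the bucket-access relation and the device-ownership relation.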
Google Security Operations uses aliasing to enrich the telemetry data and uses entity graphs to enrich the entities. The detection engine rules join the merged entities against the enriched telemetry data to provide context-aware analytics.
An event that contains an entity noun is considered an entity. Here are some event types and their corresponding entity types:
- ASSET_CONTEXT corresponds to ASSET.
- RESOURCE_CONTEXT corresponds to RESOURCE.
- USER_CONTEXT corresponds to USER.
- GROUP_CONTEXT corresponds to GROUP.
The entity graph distinguishes between contextual data and indicators of compromise (IOC) using the threat information.
When you use contextually enriched data, consider the following entity graph behavior:
- Don't add intervals in the entity, and instead let the entity graph create intervals. This is because intervals are generated during deduplication unless otherwise specified.
- If the intervals are specified, only the same events are deduplicated, and the most recent entity is retained.
- To ensure that live rules and retrohunts work as expected, entities must be ingested at least once daily.
- If entities are ingested less often than daily (for example, only once every two or more days), live rules might still work as expected, but retrohunts might lose context of the event.
- If entities are ingested more than once daily, then the entity is deduplicated to a single entity.
- If the event data is missing for a day, the data of the past day is used temporarily to ensure that live rules work as expected.
The entity graph also merges events having similar identifiers to get a consolidated view of the data. This merging happens based on the following list of identifiers:
Asset
entity.asset.product_object_id
entity.asset.hostname
entity.asset.asset_id
entity.asset.mac
User
entity.user.product_object_id
entity.user.userid
entity.user.windows_sid
entity.user.email_addresses
entity.user.employee_id
Resource
entity.resource.product_object_id
entity.resource.name
Group
entity.group.product_object_id
entity.group.email_addresses
entity.group.windows_sid
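As a rough illustration of identifier-based merging, the following Python sketch folds together asset records that share any value for one of the asset identifiers listed above. The flat dictionary layout, the single-valued identifiers, and the rule that newer values overwrite older ones are all assumptions made for the example; the actual merge logic is not described in this document.

```python
# Asset identifiers from the list above, without the "entity.asset." prefix.
ASSET_IDENTIFIERS = ("product_object_id", "hostname", "asset_id", "mac")


def identifier_pairs(record: dict) -> set[tuple[str, str]]:
    """Return the (identifier, value) pairs that are present on a record."""
    return {(name, record[name]) for name in ASSET_IDENTIFIERS if record.get(name)}


def merge_by_identifiers(records: list[dict]) -> list[dict]:
    """Merge records that share at least one identifier value."""
    groups: list[tuple[set, dict]] = []
    for record in records:
        pairs = identifier_pairs(record)
        merged = dict(record)
        remaining = []
        for group_pairs, group_record in groups:
            if pairs & group_pairs:
                # Shared identifier: fold the existing group into this record.
                pairs = pairs | group_pairs
                merged = {**group_record, **merged}  # newer values win (assumption)
            else:
                remaining.append((group_pairs, group_record))
        remaining.append((pairs, merged))
        groups = remaining
    return [record for _, record in groups]
```

Two records that report the same MAC address but different other identifiers end up as one consolidated record, while a record with no identifier in common stays separate.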
Calculate prevalence statistics
Google Security Operations performs statistical analysis on existing and incoming data and enriches entity context records with prevalence-related metrics.
Prevalence is a numeric value which indicates how popular an entity is. Popularity is defined by the number of assets accessing an artifact, such as a domain, file hash or IP address. The larger the number, the more popular the entity. For example, google.com has high prevalence values because it is accessed frequently. If a domain is accessed infrequently, it will have lower prevalence values. More popular entities are usually less likely to be malicious.
These enriched values are supported for domain, IP, and file (hash). The values are calculated and stored in the following fields.
Prevalence statistics for each entity are updated each day. Values are stored in a separate entity context that can be used by Detection Engine, but is not shown in Google Security Operations investigative views and UDM search.
The following fields can be used when creating Detection Engine rules.
Entity type | UDM fields |
---|---|
Domain | entity.domain.prevalence.day_count entity.domain.prevalence.day_max entity.domain.prevalence.day_max_sub_domains entity.domain.prevalence.rolling_max entity.domain.prevalence.rolling_max_sub_domains |
File (Hash) | entity.file.prevalence.day_count entity.file.prevalence.day_max entity.file.prevalence.rolling_max |
IP address | entity.artifact.prevalence.day_count entity.artifact.prevalence.day_max entity.artifact.prevalence.rolling_max |
The day_max and rolling_max values are calculated differently. The fields are calculated as follows:
- day_max is calculated as the maximum prevalence score for the artifact during the day, where a day is defined as 12:00:00 AM - 11:59:59 PM UTC.
- rolling_max is calculated as the maximum per-day prevalence score (that is, day_max) for the artifact over the previous 10-day window.
- day_count is used to calculate rolling_max and is always the value 10.
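The relationship between day_max and rolling_max can be illustrated with a short Python sketch. The function name and the dictionary of per-day scores are hypothetical, and treating the window as the 10 UTC days ending on (and including) the given day is an assumption; the document does not say whether the current day is part of the window.

```python
from datetime import date, timedelta

WINDOW_DAYS = 10  # mirrors day_count, which the text says is always 10


def rolling_max(day_max_by_date: dict[date, int], as_of: date) -> int:
    """Return the largest day_max in the WINDOW_DAYS days ending on as_of.

    Days with no recorded score count as 0.
    """
    window = (as_of - timedelta(days=offset) for offset in range(WINDOW_DAYS))
    return max(day_max_by_date.get(day, 0) for day in window)
```

For example, a domain with a day_max of 12 on one day keeps a rolling_max of at least 12 until that day falls out of the 10-day window.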
When calculated for a domain, the difference between day_max versus day_max_sub_domains (and rolling_max versus rolling_max_sub_domains) is as follows:
- rolling_max and day_max represent the number of daily unique internal IP addresses accessing a given domain (excluding subdomains).
- rolling_max_sub_domains and day_max_sub_domains represent the number of unique internal IP addresses accessing a given domain (including subdomains).
Prevalence statistics are calculated on newly ingested entity data. Calculations are not performed retroactively on previously ingested data. It takes approximately 36 hours for the statistics to be calculated and stored.
Calculate the first-seen and last-seen time of entities
Google Security Operations performs statistical analysis on incoming data and enriches entity context records with the first-seen and last-seen times of an entity. The first_seen_time field stores the date and time when the entity was first seen in the customer environment. The last_seen_time field stores the date and time of the most recent observation.
Because multiple indicators (UDM fields) can identify an asset or a user, the first-seen time is the first time any of the indicators that identify the user or asset was seen in the customer environment.
All UDM fields that describe an asset are the following:
entity.asset.hostname
entity.asset.ip
entity.asset.mac
entity.asset.asset_id
entity.asset.product_object_id
All UDM fields that describe a user are the following:
entity.user.windows_sid
entity.user.product_object_id
entity.user.userid
entity.user.employee_id
entity.user.email_addresses
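Because any of the listed indicators can identify the same asset or user, the first-seen time is effectively the minimum over all of them. The following Python sketch shows that idea; the function name and the input shape (a mapping from UDM field name to observation times) are hypothetical.

```python
from datetime import datetime
from typing import Optional


def first_seen_time(observations: dict[str, list[datetime]]) -> Optional[datetime]:
    """Return the earliest observation across every indicator of one entity.

    `observations` maps an indicator (for example "entity.asset.hostname")
    to the times at which that indicator was seen.
    """
    all_times = [seen for times in observations.values() for seen in times]
    return min(all_times) if all_times else None
```

If an asset's MAC address was observed before its hostname, the asset's first-seen time is the time of the MAC address observation.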
The first-seen time and last-seen time enable an analyst to correlate certain activity that occurred after a domain, file (hash), asset, user, or IP address was first seen or that stopped occurring after the domain, file (hash), or IP address was last seen.
The first_seen_time and last_seen_time fields are populated with entities that describe a domain, IP address, and file (hash). For entities that describe a user or asset, only the first_seen_time field is populated. These values are not calculated for entities that describe other types, such as a group or resource.
The statistics are calculated for each entity across all namespaces.
Google Security Operations does not calculate the statistics for each entity within individual namespaces.
These statistics are not currently exported to the Google Security Operations events schema in BigQuery.
The enriched values are calculated and stored in the following UDM fields:
Entity type | UDM fields |
---|---|
Domain | entity.domain.first_seen_time entity.domain.last_seen_time |
File (hash) | entity.file.first_seen_time entity.file.last_seen_time |
IP address | entity.artifact.first_seen_time entity.artifact.last_seen_time |
Asset | entity.asset.first_seen_time |
User | entity.user.first_seen_time |
Enrich events with geolocation data
Incoming log data can include external IP addresses without corresponding location information. This is common when an event is logging information about device activity that is not in an enterprise network. For example, a login event to a cloud service would contain a source or client IP address based on the external IP address of a device returned by the carrier NAT.
Google Security Operations provides geolocation-enriched data for external IP addresses to enable more powerful rule detections and greater context for investigations. For example, Google Security Operations might use an external IP address to enrich the event with information about the country (such as the United States), a specific state (such as Alaska), and the network the IP address is in (such as the ASN and carrier name).
Google Security Operations uses location data supplied by Google to provide an approximate geographic location and network information for an IP address. You can write Detection Engine rules against these fields in the events. The enriched event data is also exported to BigQuery where it can be used in Google Security Operations dashboards and reporting.
The following IP addresses are not enriched:
- RFC 1918 private IP address spaces because they are internal to the enterprise network.
- RFC 5771 multicast IP address space because multicast addresses do not belong to a single location.
- IPv6 Unique Local addresses.
- Google Cloud service IP addresses. Exceptions are Google Cloud Compute Engine external IP addresses, which are enriched.
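The first three exclusions in the list above can be checked with Python's standard `ipaddress` module, as in the following sketch. The function name is hypothetical, and the sketch does not model the Google Cloud service IP address exclusion, because those ranges are not given in this document.

```python
import ipaddress

# Address ranges that the list above says are not enriched.
EXCLUDED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # RFC 1918 private space
    ipaddress.ip_network("172.16.0.0/12"),   # RFC 1918 private space
    ipaddress.ip_network("192.168.0.0/16"),  # RFC 1918 private space
    ipaddress.ip_network("224.0.0.0/4"),     # RFC 5771 IPv4 multicast space
    ipaddress.ip_network("fc00::/7"),        # IPv6 Unique Local addresses
]


def is_geo_enrichment_candidate(address: str) -> bool:
    """Return True if the address is outside every excluded range."""
    ip = ipaddress.ip_address(address)
    return not any(
        ip.version == network.version and ip in network
        for network in EXCLUDED_NETWORKS
    )
```

A check like this can help explain why an event that carries only internal addresses has no ip_geo_artifact data.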
Google Security Operations enriches the following UDM fields with geolocation data:
principal
target
src
observer
Type of data | UDM field |
---|---|
Location (for example, United States) | ( principal | target | src | observer ).ip_geo_artifact.location.country_or_region |
State (for example, New York) | ( principal | target | src | observer ).ip_geo_artifact.location.state |
Longitude | ( principal | target | src | observer ).ip_geo_artifact.location.region_coordinates.longitude |
Latitude | ( principal | target | src | observer ).ip_geo_artifact.location.region_coordinates.latitude |
ASN (autonomous system number) | ( principal | target | src | observer ).ip_geo_artifact.network.asn |
Carrier name | ( principal | target | src | observer ).ip_geo_artifact.network.carrier_name |
DNS domain | ( principal | target | src | observer ).ip_geo_artifact.network.dns_domain |
Organization name | ( principal | target | src | observer ).ip_geo_artifact.network.organization_name |
The following example shows the type of geographic information that would be added to a UDM event with an IP address tagged to the Netherlands:
UDM field | Value |
---|---|
principal.ip_geo_artifact.location.country_or_region | Netherlands |
principal.ip_geo_artifact.location.region_coordinates.latitude | 52.132633 |
principal.ip_geo_artifact.location.region_coordinates.longitude | 5.291266 |
principal.ip_geo_artifact.network.asn | 8455 |
principal.ip_geo_artifact.network.carrier_name | schuberg philis |
Inconsistencies
Google proprietary IP geolocation technology uses a combination of networking data and other inputs and methods to provide IP address location and network resolution for our users. Other organizations may use different signals or methods, which might occasionally lead to different results.
If cases arise in which you experience an inconsistency in IP geolocation results that Google provides, please open a customer support case, so that we can investigate and, if appropriate, correct our records moving forward.
Enrich entities with information from Safe Browsing threat lists
Google Security Operations ingests data from Safe Browsing related to file hashes. The data for each file is stored as an entity and provides additional context about the file. Analysts can create Detection Engine rules that query against this entity context data to build context-aware analytics.
The following information is stored with the entity context record.
UDM field | Description |
---|---|
entity.metadata.product_entity_id | A unique identifier for the entity. |
entity.metadata.entity_type | This value is FILE, indicating that the entity describes a file. |
entity.metadata.collected_timestamp | The date and time that the entity was observed or the event occurred. |
entity.metadata.interval | Stores the start time and end time that this data is valid. Because threat list content changes over time, the start_time and end_time reflect the time interval during which the data about the entity is valid. For example, a file hash was observed to be malicious or suspicious between start_time and end_time. |
entity.metadata.threat.category | This is the Google Security Operations SecurityCategory. It is set to one or more SecurityCategory values. |
entity.metadata.threat.severity | This is the Google Security Operations ProductSeverity. If the value is CRITICAL, this indicates the artifact appears malicious. If the value is not specified, there is not enough confidence to indicate that the artifact is malicious. |
entity.metadata.product_name | Stores the value Google Safe Browsing. |
entity.file.sha256 | The SHA256 hash value for the file. |
Enrich entities with WHOIS data
Google Security Operations ingests WHOIS data daily. During the ingestion of incoming customer device data, Google Security Operations evaluates domains in customer data against the WHOIS data. When there is a match, Google Security Operations stores the related WHOIS data with the entity record for the domain. For each entity where entity.metadata.entity_type = DOMAIN_NAME, Google Security Operations enriches the entity with information from WHOIS.
Google Security Operations populates enriched WHOIS data into the following fields in the entity record:
entity.domain.admin.attribute.labels
entity.domain.audit_update_time
entity.domain.billing.attribute.labels
entity.domain.billing.office_address.country_or_region
entity.domain.contact_email
entity.domain.creation_time
entity.domain.expiration_time
entity.domain.iana_registrar_id
entity.domain.name_server
entity.domain.private_registration
entity.domain.registrant.company_name
entity.domain.registrant.office_address.state
entity.domain.registrant.office_address.country_or_region
entity.domain.registrant.email_addresses
entity.domain.registrant.user_display_name
entity.domain.registrar
entity.domain.registry_data_raw_text
entity.domain.status
entity.domain.tech.attribute.labels
entity.domain.update_time
entity.domain.whois_record_raw_text
entity.domain.whois_server
entity.domain.zone
For a description of these fields, see the Unified Data Model field list document.
Ingest and store Google Cloud Threat Intelligence data
Google Security Operations ingests data from Google Cloud Threat Intelligence (GCTI) data sources that provide you with contextual information you can use when investigating activity in your environment. You can query the following data sources:
- GCTI Tor Exit Nodes: IP addresses that are known Tor exit nodes.
- GCTI Benign Binaries: files that are either part of the operating system original distribution or were updated by an official operating system patch. Some official operating system binaries that have been abused by an adversary through activity common in living-off-the-land attacks are excluded from this data source, such as those focused on initial entry vectors.
- GCTI Remote Access Tools: files that have frequently been used by malicious actors. These tools are generally legitimate applications that are sometimes abused to remotely connect to compromised systems.
This contextual data is stored globally as entities. You can query the data using detection engine rules. Include the following UDM fields and values in the rule to query these global entities:
- graph.metadata.vendor_name = Google Cloud Threat Intelligence
- graph.metadata.product_name = GCTI Feed
In this document, the placeholder <variable_name> represents the unique variable name used in a rule to identify a UDM record.
Timed versus timeless Google Cloud Threat Intelligence data sources
Google Cloud Threat Intelligence data sources are either timed or timeless.
Timed data sources have a time range associated with each entry. This means that if a detection is generated on day 1, on any day in the future the same detection is expected to be generated for day 1 during a retro-hunt.
Timeless data sources have no time range associated with them. This is because only the latest set of data is what should be considered. Timeless data sources are frequently used for data such as file hashes that are not expected to change. If no detection is generated on day 1, on day 2 a detection might be generated for day 1 during a retro-hunt because a new entry was added.
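The difference between the two kinds of source can be sketched in Python as two lookup functions. Both function names and the data shapes are hypothetical, and the half-open interval check is an assumption; the sketch only illustrates why a retro-hunt gives stable results against a timed source but can give new results against a timeless one.

```python
from datetime import datetime


def matches_timed(entries, value, event_time: datetime) -> bool:
    """Timed source: match only if the entry was valid at the event time.

    `entries` is a list of (value, start_time, end_time) tuples.
    """
    return any(
        entry_value == value and start <= event_time < end
        for entry_value, start, end in entries
    )


def matches_timeless(latest_values: set, value) -> bool:
    """Timeless source: match against the latest data set, whatever the event time."""
    return value in latest_values
```

With a timed source, re-evaluating a day 1 event later uses the same day 1 interval and returns the same answer. With a timeless source, the answer for the same day 1 event can change once a new entry is added to the latest data set.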
Data about Tor exit node IP addresses
Google Security Operations ingests and stores IP addresses that are known Tor exit nodes. Tor exit nodes are points at which traffic exits the Tor network. Information ingested from this data source is stored in the following UDM fields. Data in this source is timed.
UDM field | Description |
---|---|
<variable_name>.graph.metadata.vendor_name | Stores the value Google Cloud Threat Intelligence. |
<variable_name>.graph.metadata.product_name | Stores the value GCTI Feed. |
<variable_name>.graph.metadata.threat.threat_feed_name | Stores the value Tor Exit Nodes. |
<variable_name>.graph.entity.artifact.ip | Stores the IP address ingested from the GCTI data source. |
Data about benign operating system files
Google Security Operations ingests and stores file hashes from the GCTI Benign Binaries data source. Information ingested from this data source is stored in the following UDM fields. Data in this source is timeless.
UDM field | Description |
---|---|
<variable_name>.graph.metadata.vendor_name | Stores the value Google Cloud Threat Intelligence. |
<variable_name>.graph.metadata.product_name | Stores the value GCTI Feed. |
<variable_name>.graph.metadata.threat.threat_feed_name | Stores the value Benign Binaries. |
<variable_name>.graph.entity.file.sha256 | Stores the SHA256 hash value of the file. |
<variable_name>.graph.entity.file.sha1 | Stores the SHA1 hash value of the file. |
<variable_name>.graph.entity.file.md5 | Stores the MD5 hash value of the file. |
Data about remote access tools
Remote access tools include file hashes for known remote access tools such as VNC clients that have frequently been used by malicious actors. These tools are generally legitimate applications that are sometimes abused to remotely connect to compromised systems. Information ingested from this data source is stored in the following UDM fields. Data in this source is timeless.
UDM field | Description |
---|---|
 | Stores the value Google Cloud Threat Intelligence. |
 | Stores the value GCTI Feed. |
 | Stores the value Remote Access Tools. |
 | Stores the SHA256 hash value of the file. |
 | Stores the SHA1 hash value of the file. |
 | Stores the MD5 hash value of the file. |
Enrich events with VirusTotal file metadata
Google Security Operations enriches file hashes into UDM events and provides additional context during an investigation. UDM events are enriched through hash aliasing in a customer environment. Hash aliasing combines all types of file hashes and provides information about a file hash during a search.
The integration of VirusTotal file metadata and relationship enrichment with Google SecOps can be used to identify patterns of malicious activity and to track malware movements across a network.
A raw log provides limited information about the file. VirusTotal enriches the event with file metadata, providing details about known-bad hashes and the files they identify. The metadata includes information such as filenames, types, imported functions, and tags. You can use this information in UDM search and in the detection engine with YARA-L to understand bad file events and, more generally, during threat hunting. An example use case is to detect any modifications to the original file which would, in turn, import the file metadata for threat detection.
The following information is stored with the record. For a list of all UDM fields, see Unified Data Model field list.
Type of data | UDM field |
---|---|
SHA-256 | ( principal | target | src | observer ).file.sha256 |
MD5 | ( principal | target | src | observer ).file.md5 |
SHA-1 | ( principal | target | src | observer ).file.sha1 |
Size | ( principal | target | src | observer ).file.size |
ssdeep | ( principal | target | src | observer ).file.ssdeep |
vhash | ( principal | target | src | observer ).file.vhash |
authentihash | ( principal | target | src | observer ).file.authentihash |
File type | ( principal | target | src | observer ).file.file_type |
Tags | ( principal | target | src | observer ).file.tags |
Capabilities tags | ( principal | target | src | observer ).file.capabilities_tags |
Names | ( principal | target | src | observer ).file.names |
First-seen time | ( principal | target | src | observer ).file.first_seen_time |
Last-seen time | ( principal | target | src | observer ).file.last_seen_time |
Last modification time | ( principal | target | src | observer ).file.last_modification_time |
Last analysis time | ( principal | target | src | observer ).file.last_analysis_time |
Embedded URLs | ( principal | target | src | observer ).file.embedded_urls |
Embedded IPs | ( principal | target | src | observer ).file.embedded_ips |
Embedded domains | ( principal | target | src | observer ).file.embedded_domains |
Signature information | ( principal | target | src | observer ).file.signature_info |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.verification_message |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.verified |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.signers |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.signers.name |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.signers.status |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.signers.valid_usage |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.signers.cert_issuer |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.x509 |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.x509.name |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.x509.algorithm |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.x509.thumbprint |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.x509.cert_issuer |
Signature information | ( principal | target | src | observer ).file.signature_info.sigcheck.x509.serial_number |
Signature information | ( principal | target | src | observer ).file.signature_info.codesign |
Signature information | ( principal | target | src | observer ).file.signature_info.codesign.id |
Signature information | ( principal | target | src | observer ).file.signature_info.codesign.format |
Signature information | ( principal | target | src | observer ).file.signature_info.codesign.compilation_time |
Exiftool information | ( principal | target | src | observer ).file.exif_info |
Exiftool information | ( principal | target | src | observer ).file.exif_info.original_file |
Exiftool information | ( principal | target | src | observer ).file.exif_info.product |
Exiftool information | ( principal | target | src | observer ).file.exif_info.company |
Exiftool information | ( principal | target | src | observer ).file.exif_info.file_description |
Exiftool information | ( principal | target | src | observer ).file.exif_info.entry_point |
Exiftool information | ( principal | target | src | observer ).file.exif_info.compilation_time |
PDF information | ( principal | target | src | observer ).file.pdf_info |
PDF information | ( principal | target | src | observer ).file.pdf_info.js |
PDF information | ( principal | target | src | observer ).file.pdf_info.javascript |
PDF information | ( principal | target | src | observer ).file.pdf_info.launch_action_count |
PDF information | ( principal | target | src | observer ).file.pdf_info.object_stream_count |
PDF information | ( principal | target | src | observer ).file.pdf_info.endobj_count |
PDF information | ( principal | target | src | observer ).file.pdf_info.header |
PDF information | ( principal | target | src | observer ).file.pdf_info.acroform |
PDF information | ( principal | target | src | observer ).file.pdf_info.autoaction |
PDF information | ( principal | target | src | observer ).file.pdf_info.embedded_file |
PDF information | ( principal | target | src | observer ).file.pdf_info.encrypted |
PDF information | ( principal | target | src | observer ).file.pdf_info.flash |
PDF information | ( principal | target | src | observer ).file.pdf_info.jbig2_compression |
PDF information | ( principal | target | src | observer ).file.pdf_info.obj_count |
PDF information | ( principal | target | src | observer ).file.pdf_info.endstream_count |
PDF information | ( principal | target | src | observer ).file.pdf_info.page_count |
PDF information | ( principal | target | src | observer ).file.pdf_info.stream_count |
PDF information | ( principal | target | src | observer ).file.pdf_info.openaction |
PDF information | ( principal | target | src | observer ).file.pdf_info.startxref |
PDF information | ( principal | target | src | observer ).file.pdf_info.suspicious_colors |
PDF information | ( principal | target | src | observer ).file.pdf_info.trailer |
PDF information | ( principal | target | src | observer ).file.pdf_info.xfa |
PDF information | ( principal | target | src | observer ).file.pdf_info.xref |
PE file metadata | ( principal | target | src | observer ).file.pe_file |
PE file metadata | ( principal | target | src | observer ).file.pe_file.imphash |
PE file metadata | ( principal | target | src | observer ).file.pe_file.entry_point |
PE file metadata | ( principal | target | src | observer ).file.pe_file.entry_point_exiftool |
PE file metadata | ( principal | target | src | observer ).file.pe_file.compilation_time |
PE file metadata | ( principal | target | src | observer ).file.pe_file.compilation_exiftool_time |
PE file metadata | ( principal | target | src | observer ).file.pe_file.section |
PE file metadata | ( principal | target | src | observer ).file.pe_file.section.name |
PE file metadata | ( principal | target | src | observer ).file.pe_file.section.entropy |
PE file metadata | ( principal | target | src | observer ).file.pe_file.section.raw_size_bytes |
PE file metadata | ( principal | target | src | observer ).file.pe_file.section.virtual_size_bytes |
PE file metadata | ( principal | target | src | observer ).file.pe_file.section.md5_hex |
PE file metadata | ( principal | target | src | observer ).file.pe_file.imports |
PE file metadata | ( principal | target | src | observer ).file.pe_file.imports.library |
PE file metadata | ( principal | target | src | observer ).file.pe_file.imports.functions |
PE file metadata | ( principal | target | src | observer ).file.pe_file.resource |
PE file metadata | ( principal | target | src | observer ).file.pe_file.resource.sha256_hex |
PE file metadata | ( principal | target | src | observer ).file.pe_file.resource.filetype_magic |
PE file metadata | ( principal | target | src | observer ).file.pe_file.resource_language_code |
PE file metadata | ( principal | target | src | observer ).file.pe_file.resource.entropy |
PE file metadata | ( principal | target | src | observer ).file.pe_file.resource.file_type |
PE file metadata | ( principal | target | src | observer ).file.pe_file.resources_type_count_str |
PE file metadata | ( principal | target | src | observer ).file.pe_file.resources_language_count_str |
Enrich entities with VirusTotal relationship data
VirusTotal helps analyze suspicious files, domains, IP addresses, and URLs to detect malware and other breaches, and share the findings with the security community. Google Security Operations ingests data from VirusTotal related connections. This data is stored as an entity and provides information about the relation between file hashes and files, domains, IP addresses, and URLs.
Analysts can use this data to determine whether a file hash is bad based on information about the URL or domain from other sources. This information can be used to create Detection Engine rules that query against the entity context data to build context-aware analytics.
This data is only available for certain VirusTotal and Google Security Operations licenses. Check your entitlements with your account manager.
The following information is stored with the entity context record:
UDM field | Description |
---|---|
entity.metadata.product_entity_id | A unique identifier for the entity. |
entity.metadata.entity_type | Stores the value FILE, indicating that the entity describes a file. |
entity.metadata.interval | start_time is the beginning and end_time is the end of the time interval for which this data is valid. |
entity.metadata.source_labels | This field stores a list of key-value pairs of source_id and target_id for this entity. source_id is the file hash and target_id can be hash or value of the URL, domain name, or IP address that this file is related to. You can search for the URL, domain name, IP address, or file at virustotal.com. |
entity.metadata.product_name | Stores the value 'VirusTotal Relationships'. |
entity.metadata.vendor_name | Stores the value 'VirusTotal'. |
entity.file.sha256 | Stores the SHA-256 hash value for the file. |
entity.file.relations | A list of child entities that the parent file entity is related to. |
entity.relations.relationship | This field explains the type of relationship between parent and child entities. The value can be either EXECUTES, DOWNLOADED_FROM, or CONTACTS. |
entity.relations.direction | Stores the value 'UNIDIRECTIONAL' and indicates the direction of relation with the child entity. |
entity.relations.entity.url | The URL that the file in the parent entity contacts (if the relationship between the parent entity and the URL is CONTACTS) or the URL from which the file in the parent entity was downloaded (if the relationship between the parent entity and the URL is DOWNLOADED_FROM). |
entity.relations.entity.ip | A list of IP addresses that the file in the parent entity contacts or was downloaded from. It only contains one IP address. |
entity.relations.entity.domain.name | The domain name which the file in the parent entity contacts or was downloaded from. |
entity.relations.entity.file.sha256 | Stores the SHA-256 hash value for the file in the relation. |
entity.relations.entity_type | This field contains the type of entity in the relation. The value can be URL, DOMAIN_NAME, IP_ADDRESS, or FILE. These fields are populated in accordance with the entity_type. For example, if entity_type is URL, then entity.relations.entity.url is populated. |
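The link between entity_type and the populated relation field can be written down as a small lookup in Python. Only the URL case is stated explicitly in the table; the other three pairings are inferred from the field descriptions and should be treated as assumptions, as should the function name.

```python
# Relation field expected to be populated for each entity_type value.
# Only the URL pairing is stated explicitly; the rest are inferred.
RELATION_FIELD_BY_ENTITY_TYPE = {
    "URL": "entity.relations.entity.url",
    "DOMAIN_NAME": "entity.relations.entity.domain.name",
    "IP_ADDRESS": "entity.relations.entity.ip",
    "FILE": "entity.relations.entity.file.sha256",
}


def populated_relation_field(entity_type: str) -> str:
    """Return the relation field expected to hold the related entity's value."""
    try:
        return RELATION_FIELD_BY_ENTITY_TYPE[entity_type]
    except KeyError:
        raise ValueError(f"unsupported entity_type: {entity_type}") from None
```

A lookup like this can help when deciding which field a rule should compare against for a given relationship type.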
What's next
For information about how to use enriched data with other Google Security Operations features, see the following:
- Use context-enriched data in UDM Search.
- Use context-enriched data in rules.
- Use context-enriched data in reports.
Last updated 2025-01-17 UTC.