This document shows how to export logs from Cloud Logging to destinations such as BigQuery, Chronicle, or a third-party SIEM to meet the security and analytics requirements of your organization's cloud environment. Organizations often use such security analytics tools to help prevent, detect, and respond to threats like malware, phishing, and ransomware, as well as potential asset misconfigurations. To help you meet these security and analytics requirements, this document first introduces the variety of security-relevant logs in your Google Cloud environment, from your application logs to platform logs to audit logs such as Admin Activity and Data Access logs.
Using the log scoping tool, you can first evaluate these security-relevant log types in terms of the visibility and threat detection coverage they provide. The tool maps threat tactics and techniques from the popular MITRE ATT&CK® threat model to the specific Google Cloud log types that help you investigate these common threats. Similarly, it maps Security Command Center's Event Threat Detection modules to the corresponding Google Cloud log sources they rely on.
In this scenario, the exported logs are delivered to a BigQuery dataset that you configure as part of the export, along with a user-defined logging filter. You grant permissions to limit access to the logs as appropriate. You can simplify managing and querying the data by configuring the log sink to export data into partitioned tables, where data is partitioned by day based on the log entry's timestamp field. This approach can help reduce query costs by reducing the amount of data scanned by each query. Another benefit of partitioned tables is that you can set a partition expiration at the individual table level or for the entire dataset, so that you keep the logging data only as long as you find it useful. For example, you might keep audit logging data for 3 years, after which older data is automatically deleted as partitions age out and expire.
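As an illustration of this retention setting, the following sketch uses the bq CLI to set a default partition expiration of 3 years (94,608,000 seconds) on the export dataset; the project and dataset names match the sink example later in this document, and the value is only one possible retention period.
# Illustrative only: apply a 3-year default partition expiration (in seconds)
# to all partitioned tables in the export dataset.
bq update --default_partition_expiration 94608000 compliance-logging-export:gcp_logging_export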
This scenario is part of the series Design patterns for exporting Cloud Logging.
Evaluate which logs to export
Log scoping tool
The following interactive log scoping tool lists valuable security-relevant logs across Google Cloud including Cloud Audit Logs, Access Transparency logs, and several platform logs. To help you evaluate which logs to export and analyze to meet your own security and compliance needs, this tool maps each log type to the corresponding:
- Event Threat Detection modules that rely on that log.
- CIS Google Cloud Platform Foundation Benchmark compliance violations that can be detected in that log.
- MITRE ATT&CK® threat tactics and techniques that can be monitored with that log.
Select the desired log types to automatically generate a suitable log filter to use in the Set up the logging export section.
Record the log filter
In Cloud Shell, create a variable to save the preceding auto-generated log filter. You can also refine the filter further depending on your requirements. For example, you could restrict it to (or exclude) resources in one or more specific projects.
export LOG_FILTER='log-filter'
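As an illustration, the following sketch narrows the filter to audit logs from two hypothetical projects (prod-project-1 and prod-project-2); adapt it to the filter that the tool generated for you.
# Illustrative only: restrict the filter to audit logs from two hypothetical projects.
export LOG_FILTER='logName:"cloudaudit.googleapis.com" AND (resource.labels.project_id="prod-project-1" OR resource.labels.project_id="prod-project-2")'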
Set up the logging export
The following diagram shows the steps for enabling logging export to BigQuery.
- Set up the logging export dataset in BigQuery.
- Turn on audit logging for all Google Cloud services.
- Turn on platform logs for specific Google Cloud services.
- Configure the logging export.
- Set IAM policy permissions for the BigQuery dataset.
Set up the dataset in BigQuery
Follow the instructions to set up a dataset that will host your exported logs. If you are using an aggregated sink, the BigQuery dataset should be located in one of the Google Cloud projects within your organization. If you are using log exports for a single project, the BigQuery dataset should be in that same project.
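For example, a minimal sketch using the bq CLI, assuming a central logging project named compliance-logging-export and a dataset named gcp_logging_export (the same names used in the sink example later in this document):
# Illustrative only: create the export dataset in a central logging project.
bq --location=US mk \
  --dataset \
  --description "Cloud Logging export for security analytics" \
  compliance-logging-export:gcp_logging_export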
Turn on audit logging for all services
Data Access audit logs—except for BigQuery—are disabled by
default. To configure Data Access audit logs directly in the Google Cloud console,
see Configure Data Access audit logs. Alternatively, you can configure Data Access audit logs
with gcloud
using IAM policy objects. For an example
broad IAM policy configuration that enables Data Access audit logs
for all services and all users, see Enable all Data Access audit logs.
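The following sketch outlines the gcloud-based flow at the project level (the organization-level flow is analogous); the project ID and file path are placeholders.
# 1. Download the current IAM policy for the project.
gcloud projects get-iam-policy my-project > /tmp/policy.yaml

# 2. Add an auditConfigs block like the following to /tmp/policy.yaml to enable
#    all Data Access audit log types for all services:
#
#    auditConfigs:
#    - service: allServices
#      auditLogConfigs:
#      - logType: ADMIN_READ
#      - logType: DATA_READ
#      - logType: DATA_WRITE

# 3. Write the updated policy back to the project.
gcloud projects set-iam-policy my-project /tmp/policy.yaml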
Turn on service-specific platform logs
Platform logs must be enabled on a service-by-service basis, typically at the resource level. For example, Cloud DNS logs are enabled at the VPC network level, VPC Flow Logs are enabled at the subnet level for all VMs in the subnet, and Firewall Rules Logging is enabled at the individual firewall rule level. For details on how to enable a specific log type, see the log scoping tool and click the Enable link in each row.
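As an illustration of this resource-level configuration, the following gcloud sketches enable two of these log types; the resource names and region are placeholders.
# Illustrative only: enable VPC Flow Logs on a subnet.
gcloud compute networks subnets update my-subnet \
  --region=us-central1 \
  --enable-flow-logs

# Illustrative only: enable Firewall Rules Logging on an individual firewall rule.
gcloud compute firewall-rules update my-allow-ssh \
  --enable-logging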
Configure the logging export
From the Google Cloud CLI, you use the
gcloud logging sinks create
command or the
organizations.sinks.create
API call to create a sink with the appropriate logging filter. The following
example gcloud
command creates a sink called gcp_logging_sink_bq
for the
organization. The sink includes all child projects and specifies the log filter saved in the $LOG_FILTER variable that you created earlier.
gcloud logging sinks create gcp_logging_sink_bq \
  bigquery.googleapis.com/projects/compliance-logging-export/datasets/gcp_logging_export \
  --use-partitioned-tables \
  --log-filter="$LOG_FILTER" \
  --include-children \
  --organization=324989855333
The command output is similar to the following:
Created [https://logging.googleapis.com/v2/organizations/324989855333/sinks/gcp_logging_sink_bq]. Please remember to grant `serviceAccount:gcp-logging-sink-bq@logging-o324989855333.iam.gserviceaccount.com` the WRITER role on the dataset. More information about sinks can be found at /logging/docs/export/configure_export
The serviceAccount entry in the command output includes the identity gcp-logging-sink-bq@logging-o324989855333.iam.gserviceaccount.com. This identity represents a Google Cloud service account that was created specifically for your log export. Until you grant this identity write access to the BigQuery dataset, log entry exports from this sink will fail. To learn more, see the following section.
Set IAM policy permissions for the BigQuery dataset
By adding the service account gcp-logging-sink-bq@logging-o324989855333.iam.gserviceaccount.com to the gcp_logging_export dataset with the BigQuery Data Editor role, you grant the service account permission to write to the destination. Until you add these permissions, the sink export will fail.
To add the permissions to the service account, follow these steps:
- In the Cloud console, go to BigQuery.
- Open your newly created gcp_logging_export dataset.
- In the Dataset info tab, click the Sharing drop-down menu, and then click Permissions.
- In the Dataset Permissions side panel, click Add Principal.
- In the New principals field, enter the service account. Under the Role drop-down menu, select BigQuery Data Editor.
After you create the logging export by using this filter, log entries begin to populate the BigQuery dataset in the configured project.
For an example set of permissions, you can do the following:
- Remove all nonessential users from the BigQuery dataset permissions.
- Add full control for the BigQuery admin.
- Grant the export service account permissions to write the export logs.
- Grant other individual users viewer access to the Google Cloud logging exports.
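If you prefer to grant the sink's write access from the command line, the following sketch uses the bq CLI; the dataset and service account names match the earlier examples, and the JSON file path is arbitrary.
# Illustrative only: export the dataset metadata, add a WRITER entry for the
# sink's service account to the "access" array, and then update the dataset.
bq show --format=prettyjson compliance-logging-export:gcp_logging_export > /tmp/dataset.json

# Add the following object to the "access" array in /tmp/dataset.json:
#   {"role": "WRITER", "userByEmail": "gcp-logging-sink-bq@logging-o324989855333.iam.gserviceaccount.com"}

bq update --source /tmp/dataset.json compliance-logging-export:gcp_logging_export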
Using the exported logs
When you export logs to a BigQuery dataset, Cloud Logging creates tables to hold the exported log entries. Log entries are placed in tables whose names are based on the entries' log names, and the tables are partitioned based on the entries' timestamps. For example, Data Access audit logs are routed to the cloudaudit_googleapis_com_data_access table.
Both Admin Activity and Data Access logs are loaded into BigQuery using the protoPayload format.
See Fields in exported audit logs
for details on schema conversions done by Cloud Logging before writing to BigQuery.
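To confirm which log tables the sink has created so far, you can list the tables in the export dataset. The following sketch assumes the gcp_logging_export dataset used throughout this document.
SELECT
  table_name
FROM `gcp_logging_export.INFORMATION_SCHEMA.TABLES`
ORDER BY
  table_name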
Sample questions and queries
You can run a broad range of queries against the audit logs. For example, the following queries analyze who is accessing or modifying data in Google Cloud.
Replace gcp_logging_export
with the name of your
BigQuery dataset.
Which users most frequently accessed data in the past week?
The following query uses the Data Access audit logs to find the user identities that most frequently accessed BigQuery tables data over the past week.
SELECT
protopayload_auditlog.authenticationInfo.principalEmail,
COUNT(*) AS COUNTER
FROM `gcp_logging_export.cloudaudit_googleapis_com_data_access`
WHERE
(protopayload_auditlog.methodName = "google.cloud.bigquery.v2.JobService.InsertJob" OR
protopayload_auditlog.methodName = "google.cloud.bigquery.v2.JobService.Query")
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY
1
ORDER BY
2 DESC, 1
LIMIT
100
Which users accessed the data in the "accounts" table last month?
The following query uses the Data Access audit logs to find the user accounts that most frequently accessed the "accounts" table over the past month.
Replace MY_PROJECT_ID
and MY_DATASET
with your project ID and dataset respectively.
SELECT
protopayload_auditlog.authenticationInfo.principalEmail,
COUNT(*) AS COUNTER
FROM `gcp_logging_export.cloudaudit_googleapis_com_data_access`,
UNNEST(protopayload_auditlog.authorizationInfo) authorizationInfo
WHERE
(protopayload_auditlog.methodName = "google.cloud.bigquery.v2.JobService.InsertJob" OR
protopayload_auditlog.methodName = "google.cloud.bigquery.v2.JobService.Query")
AND authorizationInfo.permission = "bigquery.tables.getData"
AND authorizationInfo.resource = "projects/MY_PROJECT_ID/datasets/MY_DATASET/tables/accounts"
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY
1
ORDER BY
2 DESC, 1
LIMIT
100
Which users deleted VMs in the last week?
The following query uses the Admin Activity audit logs to find the user identities that deleted VMs over the past week.
SELECT
timestamp,
resource.labels.instance_id,
protopayload_auditlog.authenticationInfo.principalEmail,
protopayload_auditlog.resourceName,
protopayload_auditlog.methodName
FROM `gcp_logging_export.cloudaudit_googleapis_com_activity`
WHERE
resource.type = "gce_instance"
AND protopayload_auditlog.methodName = "v1.compute.instances.delete"
AND operation.first IS TRUE
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
ORDER BY
timestamp DESC,
resource.labels.instance_id
LIMIT
1000
How often has autoscaling been used in the past month? Is autoscaling trending up or down, or is it holding steady?
The following query uses the Admin Activity audit logs to find how often autoscaling has been employed over the past month.
SELECT
protopayload_auditlog.methodName,
COUNT(*) AS counter
FROM `gcp_logging_export.cloudaudit_googleapis_com_activity`
WHERE
resource.type = "gce_instance_group_manager"
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY
1
ORDER BY
1
LIMIT
1000
Using the following query, you can visualize the trend for Compute Engine instance manager operations over time, broken down by day.
SELECT
TIMESTAMP_TRUNC(timestamp, DAY) AS day,
protopayload_auditlog.methodName AS methodName,
COUNT(*) AS counter
FROM `gcp_logging_export.cloudaudit_googleapis_com_activity`
WHERE
resource.type = "gce_instance_group_manager"
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY
1, 2
ORDER BY
1, 2
You can use the preceding query as a custom query data source in Google Data Studio to visualize the trend over time.
See Data Studio custom queries for more information.
What tables are most frequently accessed and by whom?
The following query uses the Data Access audit logs to find the BigQuery tables whose data was most frequently read and modified over the past month. It displays the associated user identity, along with a breakdown of how many times data was read versus modified.
SELECT
protopayload_auditlog.resourceName,
protopayload_auditlog.authenticationInfo.principalEmail,
COUNTIF(JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.tableDataRead") IS NOT NULL) AS dataReadEvents,
COUNTIF(JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.tableDataChange") IS NOT NULL) AS dataChangeEvents,
COUNT(*) AS totalEvents
FROM `gcp_logging_export.cloudaudit_googleapis_com_data_access`
WHERE
STARTS_WITH(resource.type, 'bigquery') IS TRUE
AND (JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.tableDataRead") IS NOT NULL
OR JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.tableDataChange") IS NOT NULL)
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY
1, 2
ORDER BY
5 DESC, 1, 2
LIMIT 1000
What are the top 10 queries against BigQuery in the past week?
The following query uses the Data Access audit logs to find the most common queries over the past week. It also lists the corresponding users and the referenced tables.
SELECT
COALESCE(
JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson, "$.jobChange.job.jobConfig.queryConfig.query"),
JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson, "$.jobInsertion.job.jobConfig.queryConfig.query")) as query,
STRING_AGG(DISTINCT protopayload_auditlog.authenticationInfo.principalEmail, ',') as users,
ANY_VALUE(COALESCE(
JSON_EXTRACT_ARRAY(protopayload_auditlog.metadataJson, "$.jobChange.job.jobStats.queryStats.referencedTables"),
JSON_EXTRACT_ARRAY(protopayload_auditlog.metadataJson, "$.jobInsertion.job.jobStats.queryStats.referencedTables"))) as tables,
COUNT(*) AS counter
FROM `gcp_logging_export.cloudaudit_googleapis_com_data_access`
WHERE
(resource.type = 'bigquery_project' OR resource.type = 'bigquery_dataset')
AND operation.last IS TRUE
AND (JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.jobChange") IS NOT NULL
OR JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.jobInsertion") IS NOT NULL)
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY
query
ORDER BY
counter DESC
LIMIT 10
What are the most common actions recorded in the audit logs over the past month?
The following query uses all Cloud Audit Logs to find the 100 most frequent actions recorded over the past month.
SELECT
ANY_VALUE(_TABLE_SUFFIX),
protopayload_auditlog.methodName,
protopayload_auditlog.serviceName,
resource.type,
COUNT(*) AS counter
FROM `gcp_logging_export.cloudaudit_googleapis_com_*`
WHERE
timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY
protopayload_auditlog.methodName,
protopayload_auditlog.serviceName,
resource.type
ORDER BY
counter DESC
LIMIT 100
What's next
Look at the other export scenarios in this series.
Explore reference architectures, diagrams, tutorials, and best practices about Google Cloud. Take a look at our Cloud Architecture Center.