Scenarios for exporting logging data: Security and access analytics

This scenario shows how to export logs from Cloud Logging to BigQuery to meet your organization's security and analytics requirements. Organizations often use analytical tools to help identify unauthorized configuration changes and inappropriate access to data. To help you meet these security and analytics requirements, Cloud Logging can capture two types of audit logs: Admin Activity logs and Data Access logs.

In this scenario, the exported logs are delivered to a BigQuery dataset that you configure as part of the export. You grant permissions to limit access to the logs as appropriate. You can simplify managing and querying the data by organizing the exported data into date-partitioned tables. This approach can reduce query costs by reducing the amount of data scanned per query. Another benefit of partitioning is that you can set an expiration on the partitions so that you keep the logging data only as long as you find it useful. For example, you might keep audit logging data for 3 years and then delete it.

This scenario is part of the series Design patterns for exporting Cloud Logging.

Set up the logging export

Enabling logging export to BigQuery involves the following steps, each described in the sections that follow:

  1. Set up the logging export dataset in BigQuery.
  2. Turn on audit logging for all Google Cloud services.
  3. Configure the logging export.
  4. Set IAM policy permissions for the BigQuery dataset.


Set up the dataset in BigQuery

Follow the instructions to set up a dataset that will host your exported logs. If you are using aggregated logs, the BigQuery dataset should be located in one of the Google Cloud projects within your organization. When you are using log exports for a single project, the BigQuery dataset should be in the same project.

Best practice: When you create the table, set a partition expiration to limit the size of the logging export storage and cumulative storage costs over time.
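
For example, the following is a minimal bq sketch that creates the dataset with a 3-year expiration. The project and dataset names match the ones used later in this scenario, and 94608000 is 3 years expressed in seconds; adjust both to your environment.

# Create the export dataset; new tables and partitions age out after 3 years.
bq mk --dataset \
    --default_table_expiration 94608000 \
    --default_partition_expiration 94608000 \
    compliance-logging-export:gcp_logging_export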

Turn on audit logging for all services

Data Access audit logs (except for BigQuery) are disabled by default. To enable all audit logs, follow the instructions to update the IAM policy with the configuration listed in the audit policy documentation. The steps, sketched after this list, include the following:

  • Downloading the current IAM policy as a file.
  • Adding the audit log policy JSON or YAML object to the current policy file.
  • Updating the project with the updated policy file.
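
The following is a minimal gcloud sketch of those steps, where my-project-id is a placeholder project ID:

# Download the project's current IAM policy to a local file.
gcloud projects get-iam-policy my-project-id --format=json > /tmp/policy.json

# Edit /tmp/policy.json to add the auditConfigs object shown below, then
# upload the updated policy.
gcloud projects set-iam-policy my-project-id /tmp/policy.json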

The following is an example JSON object that enables all audit logs for all services.

"auditConfigs": [
    {
        "service": "allServices",
        "auditLogConfigs": [
            { "logType": "ADMIN_READ" },
            { "logType": "DATA_READ"  },
            { "logType": "DATA_WRITE" },
        ]
    },
]

Configure the logging export

After you set up the aggregated export or the single-project log export, refine the logging filters so that they export audit logs. The following logging filter includes the Admin Activity and Data Access audit logs and the logs for specific resource types.

logName:"/logs/cloudaudit.googleapis.com" OR
resource.type:gce OR
resource.type=gcs_bucket OR
resource.type=cloudsql_database OR
resource.type=bigquery_resource
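
Before you create the sink, you can preview which entries the filter matches. The following is a minimal gcloud sketch, where my-project-id is a placeholder:

gcloud logging read \
    'logName:"/logs/cloudaudit.googleapis.com" OR resource.type:gce OR resource.type=gcs_bucket OR resource.type=cloudsql_database OR resource.type=bigquery_resource' \
    --project=my-project-id \
    --limit=10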

To create a sink with the appropriate filters, use the gcloud logging sinks create command or the organizations.sinks.create API call. The following example gcloud command creates a sink called gcp_logging_sink_bq for the organization. The sink includes all child projects and specifies filtering to select specific audit logs.

gcloud logging sinks create gcp_logging_sink_bq \
     bigquery.googleapis.com/projects/compliance-logging-export/datasets/gcp_logging_export \
     --log-filter='logName:"/logs/cloudaudit.googleapis.com"' \
     --include-children \
     --organization=324989855333

The command output is similar to the following:

Created [https://logging.googleapis.com/v2/organizations/324989855333/sinks/gcp_logging_sink_bq].
Please remember to grant `serviceAccount:gcp-logging-sink-bq@logging-o324989855333.iam.gserviceaccount.com` the WRITER role on the dataset.
More information about sinks can be found at /logging/docs/export/configure_export

The command output includes the service account gcp-logging-sink-bq@logging-o324989855333.iam.gserviceaccount.com. This identity is a Google Cloud service account that was created for the export. Until you grant this identity write access to the destination dataset, log entry exports from this sink will fail. For more information, see the next section or the required permissions for a BigQuery dataset.
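
If you need to look up the sink's writer identity again later, you can describe the sink. For example, using the sink and organization from the example above:

gcloud logging sinks describe gcp_logging_sink_bq \
    --organization=324989855333

The writerIdentity field in the output contains the service account to grant access to.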

Set IAM policy permissions for the BigQuery dataset

By adding the service account gcp-logging-sink-bq@logging-o324989855333.iam.gserviceaccount.com to the gcp_logging_export dataset with Editor permissions, you grant the service account permission to write to the destination. Until you add these permissions, the sink export will fail.

To add the permissions to the service account, follow these steps:

  1. In the Cloud Console, go to the BigQuery page.


  2. Next to the gcp_logging_export dataset, click the actions menu, and then click Share Dataset.

  3. In the Add people field, enter the service account, and then grant it Editor permissions.
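
If you prefer to grant the access programmatically, you can edit the dataset's access list with the bq tool. The following is a minimal sketch, assuming the project and dataset names used earlier in this scenario:

# Export the dataset's metadata, including its access list, to a file.
bq show --format=prettyjson \
    compliance-logging-export:gcp_logging_export > dataset.json

# Edit dataset.json and add an entry like the following to the "access" array:
#   {"role": "WRITER",
#    "userByEmail": "gcp-logging-sink-bq@logging-o324989855333.iam.gserviceaccount.com"}

# Apply the updated access list.
bq update --source dataset.json \
    compliance-logging-export:gcp_logging_export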

After you create the logging export by using this filter, log entries begin to populate the BigQuery dataset in the configured project.

Best practice: Implement a policy of least privilege based on your needs. You can configure the dataset permissions for specific Google Cloud user accounts, Google Groups, or Google Cloud service accounts by using IAM.

For an example set of permissions, you can do the following:

  • Remove all nonessential users from the BigQuery dataset permissions.
  • Add full control for the BigQuery admin.
  • Grant the export user permissions to write the export logs.
  • Grant other individual users viewer access to the Google Cloud logging exports.

You can update the IAM permissions for the dataset directly in the Cloud Console, through the bq command-line tool, or through the BigQuery API.

Using the exported logs

When you export logs to a BigQuery dataset, Cloud Logging creates dated tables to hold the exported log entries. Log entries are placed in tables whose names are based on the entries' log names.


The table names include a YYYYMMDD date suffix, for example cloudaudit_googleapis_com_activity_20250101.


Because the exported logs are audit logs, the Admin Activity and Data Access logs are loaded in BigQuery using the protoPayload format.
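
To confirm which dated tables the export has created so far, you can list the dataset's tables. The following is a minimal sketch, assuming the project and dataset names used earlier in this scenario:

bq ls compliance-logging-export:gcp_logging_export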

Granting external access

You might want to grant specific users access to exported logs—for example, security analysts, your DevOps team, and auditors. BigQuery offers a number of options to grant secure access to the logs.

Logs location strategies

There are several options to select which logs should be viewable to users in BigQuery.

  • Create copies of the logs to share.

    Manually or programmatically create a copy of an individual log table or set of log tables, and place the copies in a separate BigQuery dataset, as in the bq cp sketch that follows this list. Then use the separate BigQuery dataset permissions to share logs with specific users as appropriate.

    Advantages: You can limit the amount of data that is exposed to only the copied data.

    Disadvantages: You have to create, share, and manage separate datasets and permissions, which can lead to higher costs.

  • Grant read-only access to all logs.

    Manually or programmatically set viewer permissions to the BigQuery logging export tables, which grants access to all log exports.

    Advantages: Access is easy to grant.

    Disadvantages: You must grant access to all of the logs rather than specific log files.
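
For the copy-based strategy, the following is a minimal bq cp sketch. The shared_audit_logs dataset and the 20250101 date suffix are placeholders, and the dataset names are resolved against your default project:

# Copy one day's Data Access table into a separate, shareable dataset.
bq cp \
    gcp_logging_export.cloudaudit_googleapis_com_data_access_20250101 \
    shared_audit_logs.data_access_20250101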

User access control strategies

There are several options for granting access to the logs in BigQuery.

  • Use a Google Group.

    Create a Google Group such as auditors@example.com with read-only access to the logging export BigQuery dataset. You then manage the list of Google Accounts by adding or removing auditors to the Google Group.

    Advantages: It's easy to manage access through a group; there's clear user-access purpose.

    Disadvantages: It's not possible to tell who has access without looking at the group's membership.

  • Use individual Google Accounts.

    Grant individual Google Account access to the logging export BigQuery dataset for each individual user that requires it.

    Advantages: It's easy to add each user manually or programmatically.

    Disadvantages: It's not possible to discern audit users from other viewers.

Sample questions and queries

You can run a broad range of queries against the audit logs. For example, the following queries help you understand who is accessing data in Google Cloud and how your autoscalers are operating over time.

Which users most frequently accessed data in the past week?

The following query uses the Data Access audit logs to find the user accounts that most frequently appear in those logs over the past week.

SELECT
   protopayload_auditlog.authenticationInfo.principalEmail,
   COUNT(*) AS COUNTER
FROM
(TABLE_DATE_RANGE(
logging_export.cloudaudit_googleapis_com_data_access_,
DATE_ADD(CURRENT_TIMESTAMP(),-7,'DAY'),
CURRENT_TIMESTAMP())
)
WHERE
  protopayload_auditlog.methodName = "jobservice.query"
GROUP BY
   protopayload_auditlog.authenticationInfo.principalEmail
ORDER BY
   COUNTER DESC
LIMIT
  1000
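
You can run this sample, or any of the queries that follow, from the command line by passing it to the bq tool. The following is a minimal sketch; note that these samples are written in legacy SQL:

bq query --use_legacy_sql=true '
SELECT
  protopayload_auditlog.authenticationInfo.principalEmail,
  COUNT(*) AS counter
FROM (TABLE_DATE_RANGE(logging_export.cloudaudit_googleapis_com_data_access_,
                       DATE_ADD(CURRENT_TIMESTAMP(), -7, "DAY"),
                       CURRENT_TIMESTAMP()))
WHERE protopayload_auditlog.methodName = "jobservice.query"
GROUP BY protopayload_auditlog.authenticationInfo.principalEmail
ORDER BY counter DESC
LIMIT 1000'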

Which users accessed the data in the "accounts" table last month?

The following query uses the Data Access audit logs to find the user accounts that most frequently queried the "accounts" table over the past month. Replace your-project-id with your project ID.

SELECT
   protopayload_auditlog.authenticationInfo.principalEmail,
   COUNT(*) AS COUNTER
FROM
(TABLE_DATE_RANGE(
logging_export.cloudaudit_googleapis_com_data_access_,
DATE_ADD(CURRENT_TIMESTAMP(),-30,'DAY'),
CURRENT_TIMESTAMP())
)
WHERE
  protopayload_auditlog.methodName = "jobservice.query" AND
  protopayload_auditlog.authorizationInfo.permission = "bigquery.tables.getData" AND
  protopayload_auditlog.authorizationInfo.resource = "projects/your-project-id/datasets/bqtesting/tables/accounts"
GROUP BY
   protopayload_auditlog.authenticationInfo.principalEmail
ORDER BY
   COUNTER DESC
LIMIT
  1000

Which users deleted virtual machines in the last week?

The following query uses the Admin Activity audit logs to find the user accounts that deleted virtual machines over the past week.

SELECT
  timestamp,
  resource.labels.instance_id,
  protopayload_auditlog.authenticationInfo.principalEmail,
  protopayload_auditlog.resourceName,
  protopayload_auditlog.methodName
FROM
(TABLE_DATE_RANGE(
logging_export.cloudaudit_googleapis_com_activity_,
DATE_ADD(CURRENT_TIMESTAMP(),-7,'DAY'),
CURRENT_TIMESTAMP())
)
WHERE
  resource.type = "gce_instance"
  AND operation.first IS TRUE
  AND protopayload_auditlog.methodName = "v1.compute.instances.delete"
ORDER BY
  timestamp,
  resource.labels.instance_id
LIMIT
  1000

The following follow-up query summarizes these deletions with a simple count per account.

SELECT
  protopayload_auditlog.authenticationInfo.principalEmail,
  count(*) as counter
FROM
(TABLE_DATE_RANGE(
logging_export.cloudaudit_googleapis_com_activity_,
DATE_ADD(CURRENT_TIMESTAMP(),-7,'DAY'),
CURRENT_TIMESTAMP())
)
WHERE
  resource.type = "gce_instance"
  AND operation.first IS TRUE
  AND protopayload_auditlog.methodName = "v1.compute.instances.delete"
GROUP BY
  protopayload_auditlog.authenticationInfo.principalEmail
ORDER BY
  counter DESC
LIMIT
  1000

How often has autoscaling been used over the past month?

The following query uses the Admin Activity audit logs to find how often autoscaling has been used over the past month.

SELECT
  protopayload_auditlog.methodName,
  COUNT(*) AS counter
FROM
(TABLE_DATE_RANGE(
logging_export.cloudaudit_googleapis_com_activity_,
DATE_ADD(CURRENT_TIMESTAMP(),-30,'DAY'),
CURRENT_TIMESTAMP())
)
WHERE
  resource.type = "gce_instance_group_manager"
GROUP BY
  protopayload_auditlog.methodName
ORDER BY
  protopayload_auditlog.methodName
LIMIT
  1000

Using the following query, you can visualize the trend of Compute Engine instance manager operations over time.

SELECT
  timestamp,
  protopayload_auditlog.methodName AS methodName,
  COUNT(*) AS counter
FROM
(TABLE_DATE_RANGE(
logging_export.cloudaudit_googleapis_com_activity_,
DATE_ADD(CURRENT_TIMESTAMP(),-30,'DAY'),
CURRENT_TIMESTAMP())
)
WHERE
  resource.type = "gce_instance_group_manager"
GROUP BY
  timestamp,
  methodName
ORDER BY
  timestamp,
  methodName

You can use the preceding query as a custom query data source in Google Data Studio to visualize the trend over time, as shown in the following graph.

Data Studio visualization.

See Data Studio custom queries for more information.

Which datasets are most frequently accessed, and by whom?

The following query uses the Data Access audit logs to find the BigQuery datasets that are most frequently accessed, and displays the associated user accounts along with the access counts.

SELECT
  protopayload_auditlog.authorizationInfo.resource,
  protopayload_auditlog.authenticationInfo.principalEmail,
  COUNT(*) AS counter
FROM
(TABLE_DATE_RANGE(
logging_export.cloudaudit_googleapis_com_data_access_,
DATE_ADD(CURRENT_TIMESTAMP(),-30,'DAY'),
CURRENT_TIMESTAMP())
)
WHERE
  protopayload_auditlog.authorizationInfo.permission = "bigquery.tables.getData"
GROUP BY
  protopayload_auditlog.authorizationInfo.resource,
  protopayload_auditlog.authenticationInfo.principalEmail
ORDER BY
  protopayload_auditlog.authorizationInfo.resource,
  protopayload_auditlog.authenticationInfo.principalEmail,
  counter DESC
LIMIT
  1000

What are the top 10 queries against BigQuery in the past week?

The following query uses the Data Access audit logs to find the most common queries over the past week.

SELECT
  protopayload_auditlog.servicedata_v1_bigquery.jobQueryRequest.query,
  COUNT(*) AS counter
FROM
(TABLE_DATE_RANGE(
logging_export.cloudaudit_googleapis_com_data_access_,
DATE_ADD(CURRENT_TIMESTAMP(),-7,'DAY'),
CURRENT_TIMESTAMP()))
WHERE
  protopayload_auditlog.methodName = "jobservice.query"
GROUP BY
  protopayload_auditlog.servicedata_v1_bigquery.jobQueryRequest.query
ORDER BY
  counter DESC
LIMIT
  1000

What are the most common actions recorded in the data access log over the past 30 days?

The following query uses the Data Access audit logs to find the most common actions recorded over the past 30 days.

SELECT
  protopayload_auditlog.methodName,
  resource.type,
  COUNT(*) AS counter
FROM
(TABLE_DATE_RANGE(
logging_export.cloudaudit_googleapis_com_data_access_,
DATE_ADD(CURRENT_TIMESTAMP(),-30,'DAY'),
CURRENT_TIMESTAMP())
)
GROUP BY
  protopayload_auditlog.methodName,
  resource.type
ORDER BY
  counter DESC
LIMIT
  1000

What's next