Audit logs migration guide

Overview

BigQuery currently provides two versions of audit logs: an older version that uses AuditData payload, and a new version that uses BigQueryAuditMetadata. This document describes migrating logs interface filters, BigQuery exports, and changes in queries over the exported logs from the old format to the new format.

Logs interface filters

Changing filters might require changing the paths to the fields you want to filter on. BigQuery audit logs overview lists the changes between the old logs and the new logs. The changes include the resource, resourceName, and methodName fields.

For example, the filter to select all the BigQuery-related records changes from:

resource.type = "bigquery_resource"

to:

resource.type = ("bigquery_project" OR "bigquery_dataset")

For the filter table inserts on the method name, the filter changes from:

protoPayload.methodName="tableservice.insert"

to:

protoPayload.methodName = "google.cloud.bigquery.v2.TableService.InsertTable"

You can also easily select all table inserts regardless of the method:

protoPayload.metadata.tableCreation.reason != ""

Or filter on a specific reason for a table creation:

protoPayload.metadata.tableCreation.reason = "JOB"

New logs have separate method names for patch and update, so

protoPayload.methodName = "tableservice.update"

changes to:

protoPayload.methodName = ("google.cloud.bigquery.v2.TableService.UpdateTable" OR "google.cloud.bigquery.v2.TableService.PatchTable")

However, you can find all the table updates by using the protoPayload.metadata.tableChange.reason field.

Finally, with the new logs you can find all the records for a specific table by using the protoPayload.methodName field. For example:

protoPayload.resourceName = "projects/myproject/datasets/mydataset/tables/mytable"

If you're using a resourceName filter with the old logs to find all the side effects for a specific job ID, with the new logs you'll need to filter on a specific event instead. To find all the source tables read records for a specific job:

protoPayload.metadata.tableDataRead.jobName = "projects/myproject/jobs/myjob"

Or to find the destination table change:

protoPayload.metadata.tableChange.jobName = "projects/myproject/jobs/myjob"

Resources representations, such as Job or Table, in the new logs have the structure similar to the old logs. Hovewer, new logs represent events while the old logs focus more on the request and response.

You can compare the new BigQueryAuditMetadata and the old AuditData formats. BigQuery-specific information moved from the serviceData to the metadata field.

For example, to migrate a filter that finds all completed "CREATE TABLE AS SELECT" DDL jobs, change the filter from:

protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.statementType = "CREATE_TABLE_AS_SELECT"

to:

protoPayload.metadata.jobChange.after = "DONE"
protoPayload.metadata.jobChange.job.jobConfig.queryConfig.statementType = "CREATE_TABLE_AS_SELECT"

Logs routing (exports)

For exporting all records for core BigQuery operations, use the metadata type filter:

protoPayload.metadata."@type"="type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"

Querying logs exported to BigQuery

In addition to the basic structure changes described earlier, there are additional changes in the exported data structure that need to be addressed. The metadata field is exported as a single JSON column. To query the contents, you must use BigQuery JSON functions.

Example: DDL queries

For the previous example of filtering all the "CREATE TABLE AS SELECT" DDL jobs, the query over the old logs to count the number of jobs might look like this:

  #standardSQL
  SELECT COUNT(*)
  FROM
    `MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD`
  WHERE
    protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.query.statementType = "CREATE_TABLE_AS_SELECT"

The same query over the new logs might look like this:

  SELECT
    COUNT(*)
  FROM
    `MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD`
  WHERE
    JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson,
      "$.jobChange.job.jobConfig.queryConfig.statementType") = "CREATE_TABLE_AS_SELECT"

Example: Hourly cost breakdown

Old query:

  #standardSQL
  SELECT
    TIMESTAMP_TRUNC(protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.endTime, HOUR) AS time_window,
    FORMAT('%9.2f',5.0 * (SUM(protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes)/POWER(2, 40))) AS Estimated_USD_Cost
  FROM
    `MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD`
  WHERE
    protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.eventName = 'query_job_completed'
  GROUP BY time_window
      ORDER BY time_window DESC

New query:

  SELECT
    TIMESTAMP_TRUNC(TIMESTAMP(JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson,
          "$.jobChange.job.jobStats.endTime")), HOUR) AS time_window,
    FORMAT('%9.2f',5.0 * (SUM(CAST(JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson,
              "$.jobChange.job.jobStats.queryStats.totalBilledBytes") AS INT64))/POWER(2, 40))) AS Estimated_USD_Cost
  FROM
    `MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD`
  WHERE
    JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson,
      "$.jobChange.job.jobConfig.type") = "QUERY"
  GROUP BY
    time_window
  ORDER BY
    time_window DESC