Audit logs migration guide
BigQuery currently provides two versions of audit logs: an older
version that uses AuditData
payload, and a new version that uses
BigQueryAuditMetadata
. This document describes migrating logs interface
filters, BigQuery exports, and changes in queries over the
exported logs from the old format to the new format.
Logs interface filters
Changing filters might require changing the paths to the fields you want to
filter on. BigQuery audit logs overview
lists the changes between the old logs and the new logs. The
changes include the resource
, resourceName
, and methodName
fields.
For example, the filter to select all the BigQuery-related records changes from:
resource.type = "bigquery_resource"
to:
resource.type = ("bigquery_project" OR "bigquery_dataset")
For the filter table inserts on the method name, the filter changes from:
protoPayload.methodName="tableservice.insert"
to:
protoPayload.methodName = "google.cloud.bigquery.v2.TableService.InsertTable"
You can also easily select all table inserts regardless of the method:
protoPayload.metadata.tableCreation.reason != ""
Or filter on a specific reason for a table creation:
protoPayload.metadata.tableCreation.reason = "JOB"
New logs have separate method names for patch and update, so
protoPayload.methodName = "tableservice.update"
changes to:
protoPayload.methodName = ("google.cloud.bigquery.v2.TableService.UpdateTable" OR "google.cloud.bigquery.v2.TableService.PatchTable")
However, you can find all the table updates by using the
protoPayload.metadata.tableChange.reason
field.
Finally, with the new logs you can find all the records for a specific table
by using the protoPayload.methodName
field. For example:
protoPayload.resourceName = "projects/myproject/datasets/mydataset/tables/mytable"
If you're using a resourceName
filter with the old logs to find all the side
effects for a specific job ID, with the new logs you'll need to filter on a
specific event instead. To find all the source tables read records for a
specific job:
protoPayload.metadata.tableDataRead.jobName = "projects/myproject/jobs/myjob"
Or to find the destination table change:
protoPayload.metadata.tableChange.jobName = "projects/myproject/jobs/myjob"
Resources representations, such as Job or Table, in the new logs have the structure similar to the old logs. Hovewer, new logs represent events while the old logs focus more on the request and response.
You can compare the new
BigQueryAuditMetadata
and the old
AuditData
formats. BigQuery-specific information moved from the
serviceData
to the metadata
field.
For example, to migrate a filter that finds all completed "CREATE TABLE AS SELECT" DDL jobs, change the filter from:
protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.statementType = "CREATE_TABLE_AS_SELECT"
to:
protoPayload.metadata.jobChange.after = "DONE"
protoPayload.metadata.jobChange.job.jobConfig.queryConfig.statementType = "CREATE_TABLE_AS_SELECT"
Logs routing (exports)
For exporting all records for core BigQuery operations, use the metadata type filter:
protoPayload.metadata."@type"="type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"
Querying logs exported to BigQuery
In addition to the basic structure changes described earlier, there
are additional changes in the exported data structure that need to be addressed.
The metadata
field is exported as a single JSON column. To query the contents,
you must use
BigQuery JSON functions.
Example: DDL queries
For the previous example of filtering all the "CREATE TABLE AS SELECT" DDL jobs, the query over the old logs to count the number of jobs might look like this:
#standardSQL
SELECT COUNT(*)
FROM
`MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD`
WHERE
protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.query.statementType = "CREATE_TABLE_AS_SELECT"
The same query over the new logs might look like this:
SELECT
COUNT(*)
FROM
`MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD`
WHERE
JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson,
"$.jobChange.job.jobConfig.queryConfig.statementType") = "CREATE_TABLE_AS_SELECT"
Example: Hourly cost breakdown
Old query:
#standardSQL
SELECT
TIMESTAMP_TRUNC(protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.endTime, HOUR) AS time_window,
FORMAT('%9.2f',5.0 * (SUM(protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes)/POWER(2, 40))) AS Estimated_USD_Cost
FROM
`MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD`
WHERE
protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.eventName = 'query_job_completed'
GROUP BY time_window
ORDER BY time_window DESC
New query:
SELECT
TIMESTAMP_TRUNC(TIMESTAMP(JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson,
"$.jobChange.job.jobStats.endTime")), HOUR) AS time_window,
FORMAT('%9.2f',5.0 * (SUM(CAST(JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson,
"$.jobChange.job.jobStats.queryStats.totalBilledBytes") AS INT64))/POWER(2, 40))) AS Estimated_USD_Cost
FROM
`MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD`
WHERE
JSON_EXTRACT_SCALAR(protopayload_auditlog.metadataJson,
"$.jobChange.job.jobConfig.type") = "QUERY"
GROUP BY
time_window
ORDER BY
time_window DESC