Export to BigQuery

This document describes some attributes of the BigQuery dataset that is created when a sink exports traces from Cloud Trace to BigQuery.

Schema

The schema for your table that stores trace data is determined by the Trace V2 API definition of Span.

Trace keeps track of your table columns and patches the table when a new span contains fields that can't be stored in the existing table. A patch operation is required when an arriving span contains entries not previously seen. For example, if an arriving span contains a new Attribute, the table is patched.

Data retention

You configure the data retention policies of your BigQuery tables. For information about managing tables and table data, see Working with tables.

Table type

If you configure a sink to export traces to BigQuery, Trace configures an ingestion-time partitioned table. For detailed information on partitioned tables, including how to create, managed, query, and delete these tables, see Working with partitioned tables.

Sample queries

In the following queries, DATASET is the name of the BigQuery dataset, and MY_TABLE is the name of a table in that dataset.

  • To display all columns in the table for the date of November 20, 2019 while limiting the result to 10 rows, run the query:

    SELECT
      *
    FROM
      `DATASET.MY_TABLE`
    WHERE
      DATE(_PARTITIONTIME) = "2019-11-20" LIMIT 10
    
  • To display all of the partitions available in the table, run the query:

    SELECT
      _PARTITIONTIME as pt
    FROM
      `DATASET.MY_TABLE`
    GROUP BY 1
    

HipsterShop query

HipsterShop is a demo application available on GitHub.

The following is a sample query that illustrates how you can use BigQuery queries to gather information that isn't readily available using the Trace interface.

The inner query finds all spans that match the specified regular expression that were received on December 2, 2019. The outer query selects for display the following:

  • name
  • number of spans that match
  • number of distinct trace IDs
  • 50th, 90th, and 99th quantiles
  • HTTP path
  • Error message

and displays the results sorted by the trace counts:

SELECT t0.span.displayName.value, count(t0.span.spanId) as spanCount, count(distinct traceId) as traceCount,
APPROX_QUANTILES(milliseconds, 100)[OFFSET(50)] as p50,
APPROX_QUANTILES(milliseconds, 100)[OFFSET(95)] as p95,
APPROX_QUANTILES(milliseconds, 100)[OFFSET(99)] as p99,
t0.span.attributes.attributeMap._http_path,
t0.span.attributes.attributeMap._error_message
FROM (
SELECT *,
REGEXP_EXTRACT(span.name, r"./traces/([a-f0-9]+).") as traceId,
TIMESTAMP_DIFF(span.endTime,span.startTime, MILLISECOND) as milliseconds
FROM `hipstershop-demo.Hipstershop_trace_export.cloud_trace`
WHERE DATE(_PARTITIONTIME) = "2019-12-02") AS t0
WHERE t0.span.parentSpanId is NULL
GROUP by t0.span.displayName.value, t0.span.attributes.attributeMap._http_path,t0.span.attributes.attributeMap._error_message
ORDER BY traceCount DESC
LIMIT 1000
 

For one particular installation of this application, the query result is as shown:

Display the response to the previous query.

View trace data

To view your trace data by using the BigQuery interface, select the table with your exported traces.