Visualizing BigQuery Data Using Google Data Studio

BigQuery is a petabyte-scale analytics data warehouse that you can use to run SQL queries over vast amounts of data in near realtime.

Data visualization tools can help you make sense of your BigQuery data and help you analyze the data interactively. You can use visualization tools to help you identify trends, respond to them, and make predictions using your data. In this tutorial, you use Google Data Studio to visualize data in the BigQuery natality sample table.

Objectives

In this tutorial you:

  • Connect to a BigQuery data source
  • Create reports and charts to visualize BigQuery data

Costs

The Data Studio BigQuery connector allows you to access data from your BigQuery tables within Data Studio. BigQuery is a paid product and you will incur BigQuery usage costs when accessing BigQuery through Data Studio. BigQuery query pricing provides the first 1 TB per month free of charge. For more information, see BigQuery pricing.

Before you begin

Before you begin this tutorial, use the Google Cloud Platform Console to create or select a project and enable billing.

  1. Sign in to your Google account.

    If you don't already have one, sign up for a new account.

  2. Select or create a Cloud Platform project.

    Go to the Manage resources page

  3. Enable billing for your project.

    Enable billing

  4. BigQuery is automatically enabled in new projects. To activate BigQuery in a pre-existing project, Enable the BigQuery API.

    Enable the API

Create reports and charts using Google Data Studio and the BigQuery connector

In this section of the tutorial, you use Google Data Studio to visualize data in BigQuery using the BigQuery connector. You create a data source, a report, and charts that visualize data in the natality sample table.

Create a data source

The first step in creating a report in Data Studio is to create a data source for the report. A report may contain one or more data sources. When you create a BigQuery data source, Data Studio uses the BigQuery connector.

You must have the appropriate permissions in order to add a BigQuery data source to a Data Studio report. In addition, the permissions applied to BigQuery datasets will apply to the reports, charts, and dashboards you create in Data Studio. When a Data Studio report is shared, the report components are visible only to users who have appropriate permissions.

To create a data source:

  1. Open Google Data Studio.

  2. On the Reports page, in the Start a new report section, click the Blank template. This creates a new untitled report.

    Blank template

  3. If prompted, click I accept the terms and conditions and then click Accept. You may need to click the Blank template again after agreeing to the terms and conditions.

  4. In the Add a data source window, click Create new data source.

    Add data source

  5. For Connectors, click BigQuery.

  6. For Authorization, click Authorize. This allows Data Studio access to your GCP project.

  7. In the Request for permission dialog, click Allow to give Data Studio the ability to view data in BigQuery. You may not receive this prompt if you previously used Data Studio.

  8. For My Projects, click Public Datasets.

  9. For Dataset, click samples.

  10. For Table, click natality.

    natality table

  11. For Billing Project, select your GCP project.

  12. In the upper right corner of the window, click Connect.

  13. Once Data Studio has connected to the BigQuery data source, the table’s fields are displayed. You can use this page to adjust the field properties or to create new calculated fields. Click Add to report.

    Add to report

  14. When prompted, click Add to report.

  15. In the Request for permission dialog, click Allow to give Data Studio the ability to view and manage files in Google Drive. You may not receive this prompt if you previously used Data Studio.

Create a bar chart using a calculated field

Once you have added the natality data source to the report, the next step is to create a visualization. Begin by creating a bar chart. The bar chart displays the total number of births for each year. To display the births by year, you create a calculated field.

To create a bar chart that displays total births by year:

  1. (Optional) At the top of the page, click Untitled Report to change the report name. For example, type BigQuery tutorial.

  2. When the report editor loads, click Insert > Bar chart.

  3. Using the handle, draw a rectangle on the report to display the chart.

  4. In the Bar chart properties window, on the Data tab, notice the value for Data Source (natality) and the default values for Dimension and Metric.

  5. Because the year field allows null values, change Dimension to source_year. In the Dimension section, click the existing dimension.

    Default chart properties

  6. In the Dimension picker, select source_year. The source year is the four digit year of the child’s birth.

  7. Click the back arrow back arrow to close the Dimension picker.

  8. In the Metric section, click the existing metric.

    Default chart properties

  9. In the Metric picker, click Create new metric.

  10. Click Create a calculated field (Create a calculated field). To display a count of the number of children born each year, create a calculated field. For this tutorial, you count the entries in the is_male field. The value in is_male is true if the child is male and false if the child is female.

  11. For Name, type birth_count.

  12. Leave the ID unchanged.

  13. For Formula, type the following (or use the formula assistant): COUNT(is_male).

  14. Click Create field.

    Create calculated field

  15. Click Done.

  16. In the Metric picker, select birth_count.

  17. Click the back arrow to close the Metric picker. The Dimension should be set to source_year and the Metric should be set to birth_count. Notice the chart is sorted in Descending order by default. The years with the most births are displayed first.

    Chart of total births

  18. To enhance the chart, change the bar labels. In the Bar chart properties window, click the Style tab.

  19. In the Bar chart section, check Show data labels.

    Show labels

    The total number of births is displayed above each bar in the chart.

  20. Sort the data by source_year instead of birth_count. In the Bar chart properties window, on the Data tab, in the Sort section:

    • Click birth_count and select source_year.
    • Click Descending and select Ascending. The display changes to show the number of births by year in ascending order.

    Births in ascending order

Filter the chart

Currently, the bar chart displays the total number of male and female children born each year. Add a filter to display only the female children born each year.

  1. In the Bar chart properties window, click the Data tab.

  2. In the Filter section, click Add a filter.

  3. In the Create filter dialog:

    • For Name, type Female birth count.
    • Click Include and change it to Exclude.
    • Click Select a field and choose is_male.
    • Click Select a condition and choose True.
    • Click Save.

Notice the chart is updated to display only female children born each year. Also notice that the legend does not change. The legend name still reflects the name of the metric - birth_count.

Filtered chart

Create a chart using a custom query

Because Data Studio does not allow aggregations on metrics, some report components are easier to generate using a custom SQL query. The Custom Query option also lets you leverage BigQuery’s full query capabilities such as joins, unions, and analytical functions.

Alternatively, you can leverage BigQuery's full query capabilities by creating a view. A view is a virtual table defined by a SQL query. You can query data in a view by adding the dataset containing the view as a data source. For more information on views, see Using views.

When you specify a SQL query as your BigQuery data source, the results of the query are in table format, which becomes the field definition (schema) for your data source. When you use a custom query as a data source, Data Studio uses your SQL as an inner select statement for each generated query to BigQuery. For more information on custom queries in Data Studio, consult the online help.

Add a bar chart using a custom query

To add a bar chart to your report that uses a custom query data source:

  1. Click Insert > Bar chart.

  2. Using the handle, draw a rectangle on the report to display the chart.

  3. In the Bar chart properties window, on the Data tab, notice the value for Data Source (natality) and the default values for Dimension and Metric are the same as the previous chart. In the Data Source section, click Select
data source icon (Select data source).

    Filtered chart

  4. Click Create new data source.

  5. For Connectors, click BigQuery.

  6. For My Projects, click Custom query.

  7. For Project, select your GCP project.

  8. Type the following in the Enter custom query window:

    SELECT
     source_year,
     sum(case when is_male then 1 else 0 end) as male_births,
     sum(case when is_male then 0 else 1 end) as female_births
    FROM
     `bigquery-public-data.samples.natality`
    GROUP BY
     source_year
    ORDER BY
     source_year
    DESC
    

    This query uses a CASE statement to evaluate the is_male Boolean field to determine how many births were female and how many were male for a given year.

  9. Under Query Options, deselect Use Legacy SQL. This allows you to use BigQuery standard SQL syntax.

  10. At the top of the window, click Untitled data source, change the data source name to Male female case query.

  11. In the upper right corner of the window, click Connect. Once Data Studio has connected to the BigQuery data source, the results of the query are used to determine the table schema.

  12. When the schema is displayed, notice the type and aggregation for each field. By default, source_year is formatted as a Number and the Aggregation is set to Sum. To the right of the source_year field, in the Aggregation column, click Sum and select None.

    Filtered chart

  13. Click Add to report.

  14. When prompted, click Add to report.

  15. Data Studio may be unable to determine the appropriate Dimension and Metrics for the chart. This results in the error: Configuration incomplete - Invalid dimension or metric selected. In the Bar chart properties, on the Data tab, in the Dimension section, click Invalid dimension.

    Filtered chart

  16. In the Dimension picker, select source_year.

  17. Click the back arrow to close the Dimension picker.

  18. In the Bar chart properties window, on the Data tab, in the Metric section, click Invalid metric.

  19. In the Metric picker, select female_births.

  20. Click the back arrow to close the Metric picker.

  21. In the Metric section, click Add a metric.

    Filtered chart

  22. In the Metric picker, select male_births.

  23. Click the back arrow to close the Metric picker. Your chart now displays the number of male and female children born each year.

    Filtered chart

  24. For readability, change the chart styles. In the Bar chart properties, click the Style tab.

  25. In the Bar chart section, deselect Single color.

  26. Notice each bar has a default color based on the order the metrics were added to the chart. Click the second square in the color palette and change it from red to blue.

    Filtered chart

  27. Click the first square in the color palette and change it from blue to red.

    The chart should look like the following:

    Colored bar chart

View your query history

You can view queries submitted via the BigQuery Connector by examining your query history in the BigQuery web interface. Using the query history, you can estimate query costs, and you can save queries for use in other scenarios.

To examine your query history:

  1. Go to the BigQuery web UI.

    Go to the BigQuery web UI

  2. Click Query History.

  3. The list of queries is displayed with the most recent queries first. Click Open Query to view details on the query such as Job ID and Bytes Processed.

    Colored bar chart

Cleaning up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this tutorial:

  1. In the Cloud Platform Console, go to the Projects page.

    Go to the Projects page

  2. In the project list, select the project you want to delete and click Delete project. After selecting the checkbox next to the project name, click
      Delete project
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Deleting your project prevents Data Studio from querying the data because the data source is associated with your project. If you do not want to delete your Cloud Platform project, you can delete the Data Studio report and data source.

To delete the Data Studio resources:

  1. Open Google Data Studio.

  2. On the Reports page, to the right of the BigQuery tutorial report, click the "more" button more button and choose Remove.

  3. Click Data Sources.

  4. To the right of the natality data source, click the "more" button and choose Remove.

  5. To the right of the Male female case query data source, click the "more" button and choose Remove.

What's next

  • Learn more about writing queries for BigQueryQuerying Data in the BigQuery documentation explains how to run queries, create user-defined functions (UDFs), and more.

  • Explore BigQuery syntax — The preferred dialect for SQL queries in BigQuery BigQuery is standard SQL, which is described in the SQL Reference. BigQuery's legacy SQL-like syntax is described in the Query Reference (legacy SQL).

Monitor your resources on the go

Get the Google Cloud Console app to help you manage your projects.

Send feedback about...