Visualizing BigQuery data using Data Studio

BigQuery is a petabyte-scale analytics data warehouse that you can use to run SQL queries over vast amounts of data in near real-time.

Data visualization tools can help you make sense of your BigQuery data and help you analyze the data interactively. You can use visualization tools to help you identify trends, respond to them, and make predictions using your data. In this tutorial, you use Google Data Studio to visualize data in the BigQuery natality sample table.

Objectives

In this tutorial, you can do the following:

  • Connect to a BigQuery data source
  • Create reports and charts to visualize BigQuery data

Costs

The Google Data Studio BigQuery connector lets you access data from your BigQuery tables within Google Data Studio. BigQuery is a paid product and you incur BigQuery usage costs when accessing BigQuery through Google Data Studio. BigQuery query pricing provides the first 1 TB per month free of charge. For more information, see the BigQuery Pricing page.

Before you begin

Before you begin this tutorial, use the Google Cloud console to create or select a project and enable billing.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.

  4. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  5. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.

  6. BigQuery is automatically enabled in new projects. To activate BigQuery in a preexisting project,

    Enable the BigQuery API.

    Enable the API

Create reports and charts using Google Data Studio and the BigQuery connector

In this section of the tutorial, you use Google Data Studio to visualize data in BigQuery using the BigQuery connector. You create a data source, a report, and charts that visualize data in the natality sample table.

Create a data source

The first step in creating a report in Google Data Studio is to create a data source for the report. A report may contain one or more data sources. When you create a BigQuery data source, Google Data Studio uses the BigQuery connector.

You must have the appropriate permissions in order to add a BigQuery data source to a Google Data Studio report. The permissions applied to BigQuery datasets also apply to the reports, charts, and dashboards you create in Google Data Studio. When a Google Data Studio report is shared, the report components are visible only to users who have appropriate permissions.

To create a data source:

  1. Open Google Data Studio.

  2. Click Create > Data source.

  3. In the search box, enter BigQuery.

  4. In the Google Connectors section, click BigQuery.

  5. For My Projects, click Public Datasets.

  6. For Billing Project, click or enter the name of the Cloud project you created previously.

  7. For Public Dataset, click samples.

  8. For Table, click natality.

  9. Click Connect.

  10. BigQuery shows the natality table schema. Next to the year and source_year fields, in the Type column, click Number and choose Text from the drop-down list. Changing the type to text allows you to use these columns as strings in Google Data Studio.

    Field dialog

  11. Click Create report.

  12. In the Request for permission dialog, click Allow to give Data Studio the ability to view and manage files in Google Drive. You might not receive this prompt if you previously used Google Data Studio.

Create a column chart using a calculated field

Once you have added the natality data source to the report, the next step is to create a visualization. Begin by creating a column chart. The column chart displays the total number of births for each year. To display the births by year, you create a calculated field.

To create a column chart that displays total births by year:

  1. (Optional) At the top of the page, click Untitled Report to change the report name. For example, type BigQuery tutorial.

  2. After the report editor loads, click Insert > Column chart.

  3. Using the handle, place the chart on your report.

  4. On the Data tab, notice the value for Data Source (natality) and the default values for Dimension and Metric.

  5. Because you are charting the number of births by year, the dimension should be set to source_year. If source_year is not the default dimension, drag source_year from Available Fields onto the default dimension to replace it.

  6. Change the chart metric as follows:

    1. In the Metric section, click Record count, which is the default metric.

    2. In the metric picker, click Create field.

    3. In the new metric window, for Name, type birth_count.

    4. For Formula, type the following: COUNT(is_male). This formula creates a metric that counts the entries in the is_male field. The value of is_male is true if the child was assigned male at birth.

    5. Click Apply.

    6. Click the report page close the metric picker.

    The chart is sorted by years in descending order by default. The years with the most births are displayed first.

    Chart of total births

  7. To enhance the chart, change the bar labels. In the chart properties window, click the Style tab.

  8. In the Bar chart section, check Show data labels.

    Show labels

    The total number of births is displayed at the top of each bar in the chart.

  9. Sort the data by source_year instead of birth_count. In the chart properties window, click the Setup tab.

    • In the Sort section, click the default sorting field.
    • In the sort picker, select source_year.
    • Select Ascending.

    The display changes to show the number of births by year in ascending order.

    Births in ascending order

Filter the chart

Currently, the chart displays the total number of children born each year. Add a filter to display only the children assigned female at birth each year.

  1. In the chart properties window, click the Setup tab.

  2. In the Filter section, click Add a filter.

  3. In the Create filter window:

    • For Name, type Female birth count.
    • Click Include and change it to Exclude.
    • Click Select a field and choose is_male.
    • Click Select a condition and choose True.
    • Click Save.

The chart updates to apply the filter.

Filtered chart

Create a chart using a custom query

Creating a chart using the Custom Query option lets you leverage BigQuery's full query capabilities such as joins, unions, and analytical functions.

Alternatively, you can leverage BigQuery's full query capabilities by creating a view. A view is a virtual table defined by a SQL query. You can query data in a view by adding the dataset containing the view as a data source. For more information on views, see Using views.

When you specify a SQL query as your BigQuery data source, the results of the query are in table format, which becomes the field definition (schema) for your data source. When you use a custom query as a data source, Google Data Studio uses your SQL as an inner select statement for each generated query to BigQuery. For more information on custom queries in Google Data Studio, consult the online help.

Add a column chart using a custom query

To add a column chart to your report that uses a custom query data source:

  1. From the menu options, choose Page > New page.

  2. Click Insert > Column chart.

  3. Using the handle, place the chart on the report.

  4. In the properties window, on the Setup tab, click natality.

  5. Click Add data.

  6. In the Google Connectors section, click BigQuery.

  7. For My Projects, click Custom query.

  8. For Billing project, select or enter your Google Cloud project.

  9. Enter the following standard SQL query in the Enter custom query window:

    SELECT
     source_year,
     sum(case when is_male then 1 else 0 end) as male_births,
     sum(case when is_male then 0 else 1 end) as female_births
    FROM
     `bigquery-public-data.samples.natality`
    GROUP BY
     source_year
    ORDER BY
     source_year
    DESC
    

    This query uses a CASE statement to evaluate the is_male boolean field to determine how many births were female and how many were male for a given year.

  10. Under Query Options, verify Use Legacy SQL is cleared. This action lets you use BigQuery standard SQL syntax.

    Custom query

  11. Click Add.

  12. In the chart properties window, on the Setup tab, in the Metric section, click Record Count.

  13. In the metric picker, select female_births.

  14. In the Metric section, click Add metric.

  15. In the metric picker, select male_births. Your chart now displays the number of children assigned male and female for each year, using separate bars.

    Filtered chart

  16. Change the sorting field to source_year as you did in the previous chart:

    1. In the Sort section, click female_births.
    2. In the sort picker, choose source_year.

    The chart now shows births that were assigned female and male each year, in descending order.

    Column chart

View your query history

You can view queries submitted through the BigQuery Connector by examining your query history in the BigQuery web interface. Using the query history, you can estimate query costs, and you can save queries for use in other scenarios.

To examine your query history:

Console

  1. In the Cloud console, open the BigQuery page.

    Go to BigQuery

  2. Click Query history in the left navigation.

  3. The list of queries is displayed with the most recent queries first. Click a query to view details on the query such as Job ID and Bytes Processed.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

  1. In the Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Deleting your project prevents Google Data Studio from querying the data because the data source is associated with your project. If you do not want to delete your Google Cloud project, you can delete the Google Data Studio report and data source.

To delete the Google Data Studio resources:

  1. Open Google Data Studio.

  2. On the Reports page, to the right of the BigQuery tutorial report, click the "more" button more button and choose Remove.

  3. Click Data Sources.

  4. To the right of the natality data source, click the "more" button and choose Remove.

  5. To the right of the Male female case query data source, click the "more" button and choose Remove.

What's next

  • Learn more about writing queries for BigQueryQuerying data in the BigQuery documentation explains how to run queries, create user-defined functions (UDFs), and more.

  • Explore BigQuery syntax — The preferred dialect for SQL queries in BigQuery is standard SQL, which is described in the SQL reference. BigQuery's legacy SQL-like syntax is described in the Query reference (legacy SQL).