Visualizing BigQuery Data Using Google Data Studio

BigQuery is a petabyte-scale analytics data warehouse that you can use to run SQL queries over vast amounts of data in near realtime.

Data visualization tools can help you make sense of your BigQuery data and help you analyze the data interactively. You can use visualization tools to help you identify trends, respond to them, and make predictions using your data. In this tutorial, you use Google Data Studio to visualize data in the BigQuery natality sample table.

Objectives

In this tutorial you:

  • Connect to a BigQuery data source
  • Create reports and charts to visualize BigQuery data

Costs

The Data Studio BigQuery connector allows you to access data from your BigQuery tables within Data Studio. BigQuery is a paid product and you will incur BigQuery usage costs when accessing BigQuery through Data Studio. BigQuery query pricing provides the first 1 TB per month free of charge. For more information, see BigQuery pricing.

Before you begin

Before you begin this tutorial, use the Google Cloud Platform Console to create or select a project and enable billing.

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. Select or create a GCP project.

    Go to the Manage resources page

  3. Make sure that billing is enabled for your project.

    Learn how to enable billing

  4. BigQuery is automatically enabled in new projects. To activate BigQuery in a pre-existing project, Enable the BigQuery API.

    Enable the API

Create reports and charts using Google Data Studio and the BigQuery connector

In this section of the tutorial, you use Google Data Studio to visualize data in BigQuery using the BigQuery connector. You create a data source, a report, and charts that visualize data in the natality sample table.

Create a data source

The first step in creating a report in Data Studio is to create a data source for the report. A report may contain one or more data sources. When you create a BigQuery data source, Data Studio uses the BigQuery connector.

You must have the appropriate permissions in order to add a BigQuery data source to a Data Studio report. In addition, the permissions applied to BigQuery datasets will apply to the reports, charts, and dashboards you create in Data Studio. When a Data Studio report is shared, the report components are visible only to users who have appropriate permissions.

To create a data source:

  1. Open Google Data Studio.

  2. On the Reports page, in the Start a new report section, click the Blank template. This creates a new untitled report.

    Blank template

  3. If prompted, complete the Marketing Preferences and the Account and Privacy settings and then click Save. You may need to click the Blank template again after saving your settings.

  4. In the Add a data source window, click Create new data source.

    Add data source

  5. In the Google Connectors section, hover over BigQuery and then click Select.

  6. For Authorization, click Authorize. This allows Data Studio access to your GCP project.

  7. In the Request for permission dialog, click Allow to give Data Studio the ability to view data in BigQuery. You may not receive this prompt if you previously used Data Studio.

  8. For My Projects, click Public Datasets.

  9. For Billing Project, click the name of the GCP project you created previously.

  10. For Public Dataset, click samples.

  11. For Table, click natality.

    natality table

  12. In the upper right corner of the window, click Connect. Once Data Studio connects to the BigQuery data source, the table’s fields are displayed. You can use this page to adjust the field properties or to create new calculated fields.

  13. In the natality table, the year and source_year fields are stored as integers. To use these columns as strings in Data Studio, you change the type for these columns to text. To the right of the year and source_year fields, in the Type column, click Number and choose Text from the drop-down list.

    Field dialog

  14. Click Add to report.

    Add to report

  15. When prompted, click Add to report.

  16. In the Request for permission dialog, click Allow to give Data Studio the ability to view and manage files in Google Drive. You may not receive this prompt if you previously used Data Studio.

Create a bar chart using a calculated field

Once you have added the natality data source to the report, the next step is to create a visualization. Begin by creating a bar chart. The bar chart displays the total number of births for each year. To display the births by year, you create a calculated field.

To create a bar chart that displays total births by year:

  1. (Optional) At the top of the page, click Untitled Report to change the report name. For example, type BigQuery tutorial.

  2. After the report editor loads, click Insert > Bar chart.

  3. Using the handle, draw a rectangle on the report to display the chart.

  4. On the Data tab, notice the value for Data Source (natality) and the default values for Dimension and Metric.

  5. Because you are charting the number of births by year, you need to set the Dimension to source_year. In the Available Fields list, click and drag source_year onto the Add dimension here box. The source year is the four digit year of the child’s birth.

    Default chart properties

  6. After the dimension is added, hover over the default dimension and click the delete icon on the right hand side.

    Default chart properties

  7. In the Metric section, click Add metric here.

  8. In the Metric picker, click Create new metric.

  9. To display a count of the number of children born each year (by gender), you create a calculated field. For this tutorial, you count the entries in the is_male field. The value in is_male is true if the child is male and false if the child is female. In the new metric window, for Name, type birth_count.

  10. Leave the Field ID unchanged.

  11. For Formula, type the following (or use the formula assistant): COUNT(is_male).

  12. Click Save.

    Create calculated field

  13. Click Done.

  14. Click the back arrow to close the Metric picker.

  15. In the Available Fields list, click and drag birth_count onto the Add metric here box.

  16. After the metric is added, hover over the default metric and click the delete icon on the right hand side. The Dimension should now be set to source_year and the Metric should be set to birth_count.

  17. Notice the chart is sorted in Descending order by default. The years with the most births are displayed first.

    Chart of total births

  18. To enhance the chart, change the bar labels. In the bar chart properties window, click the Style tab.

  19. In the Bar chart section, check Show data labels.

    Show labels

    The total number of births is displayed at the top of each bar in the chart.

  20. Sort the data by source_year instead of birth_count. In the bar chart properties window, on the Data tab, in the Sort section:

    • Click the existing field (birth_count) to open the Sort picker.
    • In the Sort picker, select source_year.
    • Click the back arrow to close the Sort picker.
    • Click Descending and select Ascending. The display changes to show the number of births by year in ascending order.

    Births in ascending order

Filter the chart

Currently, the bar chart displays the total number of male and female children born each year. Add a filter to display only the female children born each year.

  1. In the bar chart properties window, click the Data tab.

  2. In the Filter section, click Add a filter.

  3. In the Create filter window:

    • For Name, type Female birth count.
    • Click Include and change it to Exclude.
    • Click Select a field and choose is_male.
    • Click Select a condition and choose True.
    • Click Save.

    Create filter window

Notice the chart is updated to display only female children born each year. Also notice that the legend does not change. The legend name still reflects the name of the metric — birth_count.

Filtered chart

Create a chart using a custom query

Because Data Studio does not allow aggregations on metrics, some report components are easier to generate using a custom SQL query. The Custom Query option also lets you leverage BigQuery’s full query capabilities such as joins, unions, and analytical functions.

Alternatively, you can leverage BigQuery's full query capabilities by creating a view. A view is a virtual table defined by a SQL query. You can query data in a view by adding the dataset containing the view as a data source. For more information on views, see Using views.

When you specify a SQL query as your BigQuery data source, the results of the query are in table format, which becomes the field definition (schema) for your data source. When you use a custom query as a data source, Data Studio uses your SQL as an inner select statement for each generated query to BigQuery. For more information on custom queries in Data Studio, consult the online help.

Add a bar chart using a custom query

To add a bar chart to your report that uses a custom query data source:

  1. From the menu options, choose Page > New page.

  2. Click Insert > Bar chart.

  3. Using the handle, draw a rectangle on the report to display the chart.

  4. In the Bar chart properties window, on the Data tab, notice the Data Source is set to natality. Click natality to open the Select Data Source window.

  5. Click Create new data source.

  6. In the Google Connectors section, hover over BigQuery and the click Select.

  7. For My Projects, click Custom query.

  8. For Project, select your GCP project.

  9. Enter the following standard SQL query in the Enter custom query window:

    SELECT
     source_year,
     sum(case when is_male then 1 else 0 end) as male_births,
     sum(case when is_male then 0 else 1 end) as female_births
    FROM
     `bigquery-public-data.samples.natality`
    GROUP BY
     source_year
    ORDER BY
     source_year
    DESC
    

    This query uses a CASE statement to evaluate the is_male boolean field to determine how many births were female and how many were male for a given year.

  10. Under Query Options, deselect Use Legacy SQL. This allows you to use BigQuery standard SQL syntax.

    Custom query

  11. At the top of the window, click Untitled data source, change the data source name to Male female case query.

  12. In the upper right corner of the window, click Connect. After Data Studio connects to the BigQuery data source, the results of the query are used to determine the table schema.

  13. When the schema is displayed, notice the type and aggregation for each field. By default, source_year is formatted as a Number and the Aggregation is set to None. To the right of the source_year field, in the Type column, click Number and select Text. Verify Aggregation is set to None.

    Filtered chart

  14. Click Add to report.

  15. When prompted, click Add to report.

  16. Click the back arrow to close the Select Data Source window.

  17. Data Studio may be unable to correctly determine the appropriate Metric for the chart. Because you are charting the number of male and female births by year, the values for Metric should be female_births and male_births. In the bar chart properties window, on the Data tab, in the Metric section, click Record Count.

  18. In the Metric picker, select female_births.

  19. Click the back arrow to close the Metric picker.

  20. In the Metric section, click Add metric here.

  21. In the Metric picker, select male_births.

  22. Click the back arrow to close the Metric picker. Your chart now displays the number of male and female children born each year using separate bars.

    Filtered chart

  23. The chart should be sorted by female_births in descending order. Change this to source_year as you did in the previous chart. In the Sort section, click female_births.

  24. In the Sort picker, choose source_year.

  25. Click the back arrow to close the Sort picker. The chart now shows female and male births each year in descending order.

  26. For readability, change the chart styles. In the Bar chart properties, click the Style tab.

  27. Notice each bar has a default color based on the order the metrics were added to the chart. Click the second square in the color palette and change it from red to blue.

    Filtered chart

  28. Click the first square in the color palette and change it from blue to red.

    The chart should look like the following:

    Colored bar chart

View your query history

You can view queries submitted via the BigQuery Connector by examining your query history in the BigQuery web interface. Using the query history, you can estimate query costs, and you can save queries for use in other scenarios.

To examine your query history:

  1. Go to the BigQuery web UI.

    Go to the BigQuery web UI

  2. Click Query History.

  3. The list of queries is displayed with the most recent queries first. Click Open Query to view details on the query such as Job ID and Bytes Processed.

    Colored bar chart

Cleaning up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this tutorial:

  1. In the GCP Console, go to the Projects page.

    Go to the Projects page

  2. In the project list, select the project you want to delete and click Delete project. After selecting the checkbox next to the project name, click
      Delete project
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Deleting your project prevents Data Studio from querying the data because the data source is associated with your project. If you do not want to delete your Cloud Platform project, you can delete the Data Studio report and data source.

To delete the Data Studio resources:

  1. Open Google Data Studio.

  2. On the Reports page, to the right of the BigQuery tutorial report, click the "more" button more button and choose Remove.

  3. Click Data Sources.

  4. To the right of the natality data source, click the "more" button and choose Remove.

  5. To the right of the Male female case query data source, click the "more" button and choose Remove.

What's next

  • Learn more about writing queries for BigQueryQuerying Data in the BigQuery documentation explains how to run queries, create user-defined functions (UDFs), and more.

  • Explore BigQuery syntax — The preferred dialect for SQL queries in BigQuery BigQuery is standard SQL, which is described in the SQL Reference. BigQuery's legacy SQL-like syntax is described in the Query Reference (legacy SQL).

Was this page helpful? Let us know how we did:

Send feedback about...