BigQuery is a petabyte-scale analytics data warehouse that you can use to run SQL queries over vast amounts of data in near realtime.
Data visualization tools can help you make sense of your BigQuery data and help you analyze the data interactively. You can use visualization tools to help you identify trends, respond to them, and make predictions using your data. In this tutorial, you use Google Data Studio to visualize data in the BigQuery natality sample table.
Objectives
In this tutorial you:
- Connect to a BigQuery data source
- Create reports and charts to visualize BigQuery data
Costs
The Google Data Studio BigQuery connector allows you to access data from your BigQuery tables within Google Data Studio. BigQuery is a paid product and you will incur BigQuery usage costs when accessing BigQuery through Google Data Studio. BigQuery query pricing provides the first 1 TB per month free of charge. For more information, see the BigQuery Pricing page.
Before you begin
Before you begin this tutorial, use the Google Cloud Console to create or select a project and enable billing.
-
Sign in to your Google Account.
If you don't already have one, sign up for a new account.
-
In the Cloud Console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project. Learn how to confirm billing is enabled for your project.
- BigQuery is automatically enabled in new projects. To activate BigQuery in a pre-existing project, Enable the BigQuery API.
Create reports and charts using Google Data Studio and the BigQuery connector
In this section of the tutorial, you use Google Data Studio to visualize data in BigQuery using the BigQuery connector. You create a data source, a report, and charts that visualize data in the natality sample table.
Create a data source
The first step in creating a report in Google Data Studio is to create a data source for the report. A report may contain one or more data sources. When you create a BigQuery data source, Google Data Studio uses the BigQuery connector.
You must have the appropriate permissions in order to add a BigQuery data source to a Google Data Studio report. In addition, the permissions applied to BigQuery datasets will apply to the reports, charts, and dashboards you create in Google Data Studio. When a Google Data Studio report is shared, the report components are visible only to users who have appropriate permissions.
To create a data source:
Open Google Data Studio.
On the Reports page, in the Start a new report section, click the Blank template. This creates a new untitled report.
If prompted, complete the Marketing Preferences and the Account and Privacy settings and then click Save. You may need to click the Blank template again after saving your settings.
In the Add a data source window, click Create new data source.
In the Google Connectors section, hover over BigQuery and then click Select.
For Authorization, click Authorize. This allows Google Data Studio access to your GCP project.
In the Request for permission dialog, click Allow to give Google Data Studio the ability to view data in BigQuery. You may not receive this prompt if you previously used Google Data Studio.
For My Projects, click Public Datasets.
For Billing Project, click the name of the GCP project you created previously.
For Public Dataset, click samples.
For Table, click
natality
.In the upper right corner of the window, click Connect. Once Google Data Studio connects to the BigQuery data source, the table’s fields are displayed. You can use this page to adjust the field properties or to create new calculated fields.
In the
natality
table, theyear
andsource_year
fields are stored as integers. To use these columns as strings in Google Data Studio, you change the type for these columns to text. To the right of theyear
andsource_year
fields, in the Type column, click Number and choose Text from the drop-down list.Click Add to report.
When prompted, click Add to report.
In the Request for permission dialog, click Allow to give Data Studio the ability to view and manage files in Google Drive. You may not receive this prompt if you previously used Google Data Studio.
Create a bar chart using a calculated field
Once you have added the natality data source to the report, the next step is to create a visualization. Begin by creating a bar chart. The bar chart displays the total number of births for each year. To display the births by year, you create a calculated field.
To create a bar chart that displays total births by year:
(Optional) At the top of the page, click Untitled Report to change the report name. For example, type
BigQuery tutorial
.After the report editor loads, click Insert > Bar chart.
Using the handle, place the chart on your report.
On the Data tab, notice the value for Data Source (natality) and the default values for Dimension and Metric.
Because you are charting the number of births by year, the dimension should be set to
source_year
. Ifsource_year
is not the default dimension, click and drag source_year from Available Fields onto the default dimension to replace it.In the Metric section, click Add metric.
In the metric picker, click Create field.
To display a count of the number of children born each year (by gender), you create a calculated field. For this tutorial, you count the entries in the
is_male
field. The value inis_male
istrue
if the child is male andfalse
if the child is female. In the new metric window, for Name, typebirth_count
.For Formula, type the following:
COUNT(is_male)
.Click Apply.
Click the report page close the metric picker.
After the metric is added, hover over the default metric and click the delete icon on the right hand side. The Dimension should now be set to
source_year
and the Metric should be set tobirth_count
.Notice the chart is sorted in Descending order by default. The years with the most births are displayed first.
To enhance the chart, change the bar labels. In the bar chart properties window, click the Style tab.
In the Bar chart section, check Show data labels.
The total number of births is displayed at the top of each bar in the chart.
Sort the data by
source_year
instead ofbirth_count
. In the bar chart properties window, on the Data tab, in the Sort section:- Click New field.
- In the sort picker, scroll to Chart Fields and select
source_year
. - Click away from the dialog to close the sort picker.
- Click Descending and select Ascending. The display changes to show the number of births by year in ascending order.
Filter the chart
Currently, the bar chart displays the total number of male and female children born each year. Add a filter to display only the female children born each year.
In the bar chart properties window, click the Data tab.
In the Filter section, click Add a filter.
In the Create filter window:
- For Name, type
Female birth count
. - Click Include and change it to Exclude.
- Click Select a field and choose
is_male
. - Click Select a condition and choose True.
- Click Save.
- For Name, type
Notice the chart is updated to display only female children born each year. Also
notice that the legend does not change. The legend name still reflects the name
of the metric — birth_count
.
Create a chart using a custom query
Creating a chart using the Custom Query option allows you to leverage BigQuery's full query capabilities such as joins, unions, and analytical functions.
Alternatively, you can leverage BigQuery's full query capabilities by creating a view. A view is a virtual table defined by a SQL query. You can query data in a view by adding the dataset containing the view as a data source. For more information on views, see Using views.
When you specify a SQL query as your BigQuery data source, the results of the query are in table format, which becomes the field definition (schema) for your data source. When you use a custom query as a data source, Google Data Studio uses your SQL as an inner select statement for each generated query to BigQuery. For more information on custom queries in Google Data Studio, consult the online help.
Add a bar chart using a custom query
To add a bar chart to your report that uses a custom query data source:
From the menu options, choose Page > New page.
Click Insert > Bar chart.
Using the handle, place the chart on the report.
In the Bar chart properties window, on the Data tab, notice the Data Source is set to
natality
. Clicknatality
to open the Select Data Source window.Click Create new data source.
In the Google Connectors section, hover over BigQuery and the click Select.
For My Projects, click Custom query.
For Project, select your GCP project.
Enter the following standard SQL query in the Enter custom query window:
SELECT source_year, sum(case when is_male then 1 else 0 end) as male_births, sum(case when is_male then 0 else 1 end) as female_births FROM `bigquery-public-data.samples.natality` GROUP BY source_year ORDER BY source_year DESC
This query uses a CASE statement to evaluate the
is_male
boolean field to determine how many births were female and how many were male for a given year.Under Query Options, verify Use Legacy SQL is deselected. This allows you to use BigQuery standard SQL syntax.
At the top of the window, click Untitled data source, change the data source name to
Male female case query
.In the upper right corner of the window, click Connect. After Google Data Studio connects to the BigQuery data source, the results of the query are used to determine the table schema.
When the schema is displayed, notice the type and aggregation for each field. By default,
source_year
is formatted as a Number and the Aggregation is set to None. To the right of thesource_year
field, in the Type column, click Number and select Text. Verify Aggregation is set to None.Click Add to report.
When prompted, click Add to report.
Click the back arrow to close the Select Data Source window.
Google Data Studio may be unable to correctly determine the appropriate Metric for the chart. Because you are charting the number of male and female births by year, the values for Metric should be
female_births
andmale_births
. In the bar chart properties window, on the Data tab, in the Metric section, click Record Count.In the metric picker, select female_births.
In the Metric section, click Add metric.
In the metric picker, select male_births. Your chart now displays the number of male and female children born each year using separate bars.
The chart should be sorted by
female_births
in descending order. Change this tosource_year
as you did in the previous chart. In the Sort section, clickfemale_births
.In the sort picker, choose
source_year
. The chart now shows female and male births each year in descending order.For readability, change the chart styles. In the Bar chart properties, click the Style tab.
Notice each bar has a default color based on the order the metrics were added to the chart. Click the second square in the color palette and change it from red to blue.
Click the first square in the color palette and change it from blue to red.
The chart should look like the following:
View your query history
You can view queries submitted via the BigQuery Connector by examining your query history in the BigQuery web interface. Using the query history, you can estimate query costs, and you can save queries for use in other scenarios.
To examine your query history:
Console
Open the BigQuery web UI in the Cloud Console.
Go to the BigQuery web UIClick Query history in the left navigation.
The list of queries is displayed with the most recent queries first. Click a query to view details on the query such as Job ID and Bytes Processed.
Classic UI
Go to the BigQuery web UI.
Click Query History.
The list of queries is displayed with the most recent queries first. Click Open Query to view details on the query such as Job ID and Bytes Processed.
Cleaning up
To avoid incurring charges to your Google Cloud Platform account for the resources used in this tutorial:
- In the Cloud Console, go to the Manage resources page.
- In the project list, select the project you want to delete and click Delete delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
Deleting your project prevents Google Data Studio from querying the data because the data source is associated with your project. If you do not want to delete your Cloud Platform project, you can delete the Google Data Studio report and data source.
To delete the Google Data Studio resources:
Open Google Data Studio.
On the Reports page, to the right of the BigQuery tutorial report, click the "more" button
and choose Remove.
Click Data Sources.
To the right of the natality data source, click the "more" button and choose Remove.
To the right of the Male female case query data source, click the "more" button and choose Remove.
What's next
Learn more about writing queries for BigQuery — Querying data in the BigQuery documentation explains how to run queries, create user-defined functions (UDFs), and more.
Explore BigQuery syntax — The preferred dialect for SQL queries in BigQuery BigQuery is standard SQL, which is described in the SQL reference. BigQuery's legacy SQL-like syntax is described in the Query reference (legacy SQL).