Analyze with data canvas

To provide feedback or request support for BigQuery data canvas, send an email to datacanvas-feedback@google.com.

BigQuery data canvas lets you discover, transform, query, and visualize data using natural language. BigQuery data canvas provides a graphic interface for your analysis that lets you work with data sources, queries, and visualizations in a directed acyclic graph (DAG), giving you a graphical view of your analysis workflow that maps to your mental model. In BigQuery data canvas, you can iterate on query results and work with multiple branches of inquiry in a single place.

BigQuery data canvas is designed to help you with your data-to-insights journey. You can work with data without needing technical knowledge of specific tools or products. BigQuery data canvas works with Dataplex metadata to identify appropriate tables based on natural language.

BigQuery data canvas uses Gemini to find your data, create SQL, generate charts, and create data summaries.

Capabilities

BigQuery data canvas supports the following capabilities:

  • Find table assets by using keyword search syntax with Dataplex metadata, such as tables, views, or materialized views.

  • Use natural language for simple SQL queries, such as:

    • Queries with SELECT FROM, math functions, arrays, and structs
    • JOIN statements between two tables
  • Common visualizations, including the following chart types:

    • Bar chart
    • Heat map
    • Line graph
    • Pie chart
    • Scatter chart
  • Custom visualizations, where you can use natural language to describe what you want.

  • Automated data insights.

BigQuery data canvas has the following limitations:

  • Natural language commands may have difficulty when working with:

    • BigQuery ML
    • Apache Spark
    • Object tables
    • BigLake
    • INFORMATION_SCHEMA
    • JSON
    • Nested and repeated fields
    • Complex functions and data types, such as DATETIME and TIMEZONE
  • Data visualizations don't work with geomap charts.

Before you begin

To access Gemini in BigQuery features, an administrator must enable Gemini in BigQuery. To learn more, see Set up Gemini in BigQuery.

To get the permissions that you need to use BigQuery data canvas, ask your administrator to grant you the following IAM roles on the project:

For more information about granting roles, see Manage access.

You might also be able to get the required permissions through custom roles or other predefined roles.

Work with BigQuery data canvas

BigQuery data canvas lets you discover, transform, query, and visualize data using a natural-language forward approach. The following examples demonstrate some of the ways you can interact with BigQuery data canvas, including a sample workflow from data discovery to visualization.

To begin working with BigQuery data canvas:

  1. Go to the Google Cloud console.

    Go to BigQuery

  2. In the navigation menu, click BigQuery Studio.

  3. Click Create data canvas. You can also start BigQuery data canvas directly from a table or a query.

  4. Enter your first prompt in natural language in the text field.

Examples

The following examples demonstrate a typical BigQuery data canvas workflow.

Example 1: Work with a single table

Prompt 1: Find data

Chicago taxi trips

Potential result:

BigQuery data canvas generates a list of potential tables based on Dataplex metadata. You can select multiple tables. Each table has the option of Add to canvas, Query, and Details. For this example, we will select the bigquery-public-data.chicago_taxi_trips.taxi_trips table and click Add to canvas.

A table cell for taxi_trips is added to BigQuery data canvas. You can cycle through the tabs in the table cell to view schema information, table details, and a preview of the data.

Click Query to query this table with natural language.

Prompt 2: Create a SQL query with the chosen table

Get me the 100 longest trips

Potential result:

BigQuery data canvas generates a SQL query similar to the following:

SELECT
  taxi_id,
  trip_start_timestamp,
  trip_end_timestamp,
  trip_miles
FROM
  `bigquery-public-data.chicago_taxi_trips.taxi_trips`
ORDER BY
  trip_miles DESC
LIMIT
  100;
  

You can also edit the query, either by manually editing the SQL query or by changing the natural language prompt and regenerating the query. For this example, we are going to edit our natural language prompt to only select trips where the customer paid with cash.

Prompt 3: Edit the query by changing the prompt

Get me the 100 longest trips where the payment type is cash

Potential result

BigQuery data canvas generates a SQL query similar to the following:

SELECT
  taxi_id,
  trip_start_timestamp,
  trip_end_timestamp,
  trip_miles
FROM
  `daui-storage.chicago_taxi_trips_123123.taxi_trips`
WHERE
  payment_type = 'Cash'
ORDER BY
  trip_miles DESC
LIMIT
  100;
  

Click Run to view the results of the query.

Prompt 4: Create a chart

  1. Click Visualize
  2. Click Create bar chart.

Potential result

BigQuery data canvas creates a bar chart showing the most trip miles by trip ID.

Along with providing a chart, BigQuery data canvas summarizes some of the key details of the data backing the visualization. You can modify the chart by clicking Visualization details and editing your chart in the side panel.

If you want to share BigQuery data canvas, click Share, then click Share link to copy BigQuery data canvas link.

To clean up BigQuery data canvas, click Clear canvas. This leaves you with a blank canvas.

Example 2: Work with multiple tables

Prompt 1:

Information about trees

Potential result:

BigQuery data canvas suggests several tables that have information about trees. For this example, we are going to work with two tables: bigquery-public-data.new_york_trees.tree_census_1995 and bigquery-public-data.new_york_trees.tree_census_2015.

After selecting both of these tables, they are displayed on the canvas.

For this example, click Join on the bigquery-public-data.new_york_trees.tree_census_1995 table to join the two tables. BigQuery data canvas suggests tables to join. Select the bigquery-public-data.new_york_trees.tree_census_2015. A new pane is displayed, with connections to each of the tables.

Prompt 2:

Join these tables on their address

Potential result

SELECT
  *
FROM
  `bigquery-public-data.new_york_trees.tree_census_2015` AS t2015
JOIN
  `bigquery-public-data.new_york_trees.tree_census_1995` AS t1995
ON
  t2015.address = t1995.address;

BigQuery data canvas proposes the SQL query to join these two tables on their address. Click Run to run the query and view the results.

BigQuery data canvas lets you export your queries as a notebook. To export to a notebook:

  1. Click the Export as notebook tab.
  2. In the Save Notebook pane, enter the name and region you want to save the notebook as.
  3. Click Save. The notebook is created successfully.
  4. Click Open in BQ Studio to view the created notebook.

Example 3

Prompt 1

Find data about USA names

Potential result

BigQuery data canvas generates a list of tables. For this example, we are going to select the bigquery-public-data.usa_names.usa_1910_current table.

Click Query to query the data. Enter a prompt to query the data.

Potential result

BigQuery data canvas generates the following query:

SELECT
  state,
  gender,
  year,
  name,
  number
FROM
  `bigquery-public-data.usa_names.usa_1910_current`

BigQuery data canvas generates the results of the query. We are going to ask an additional query to filter this data. Click Query these results.

Prompt 2

Get me the top 10 most popular names in 1980

Potential result

BigQuery data canvas generates the following query:

SELECT
  name,
  SUM(number) AS total_count
FROM
  `bigquery-public-data`.usa_names.usa_1910_current
WHERE
  year = 1980
GROUP BY
  name
ORDER BY
  total_count DESC
LIMIT
  10;

After running the query, we get a table with the ten most common names of children born in 1980.

For this example, we are going to visualize these results. Click Visualize. BigQuery data canvas suggests several visualization options, including bar chart, pie chart, line graph, and custom visualization. Click Create bar chart.

BigQuery data canvas creates a bar chart similar to the following:

Top ten names bar chart.

Along with providing a chart, BigQuery data canvas summarizes some of the key details of the data backing the visualization. You can modify the chart by clicking Visualization details and editing your chart in the side panel.

Prompt 3

Create a bar chart sorted high to low, with a gradient

Potential result

BigQuery data canvas creates a bar chart similar to the following:

Top ten names bar chart sorted.

To make further changes, click Visualization details. The Vis details sidebar is displayed. You can edit the chart title, x-axis name, y-axis name, and more. Also, if you click the JSON tab, you can make direct edits to the chart based on the JSON values.

Pricing

You are charged according to the compute- or storage-based pricing model while working with BigQuery data canvas. There are no additional charges for using Gemini with BigQuery data canvas during Preview.

Provide feedback

You can help improve BigQuery data canvas suggestions by submitting feedback to Google. To provide feedback:

  1. In the Google Cloud console, in the toolbar, click Submit feedback.

  2. Optional: Click Copy to copy the DAG JSON information to provide additional context to your feedback.

  3. Click form and fill out the form to provide feedback.

Data sharing settings apply to the entire project and can only be set by a project administrator with the serviceusage.services.enable and serviceusage.services.list IAM permissions. For more information about data use in the Trusted Tester Program, see Gemini in Google Cloud Trusted Tester Program.

You can also contact datacanvas-feedback@google.com to provide direct feedback about this feature.

What's next