Data insights offers an automated way to explore and understand your data. With data insights, Gemini uses metadata to generate natural language questions about your table and the queries to answer them. This helps you uncover patterns, assess data quality, and perform statistical analysis.
This document describes the key features of data insights and how to view them for insightful data exploration.
Before you begin
Data insights are generated using Gemini in BigQuery and can only be generated in BigQuery Studio. First set up Gemini in BigQuery, then generate insights in BigQuery. After you generate insights, you can view them in Dataplex.
Required roles
To get read-only access to the generated insights, ask your administrator to grant you the following IAM role:
- Dataplex DataScan DataViewer (
roles/dataplex.dataScanDataViewer
) on the project containing the BigQuery tables for which you want to view insights.
For more information about granting roles, see Manage access to service accounts.
You might also be able to get the required permissions through custom roles or other predefined roles. To see the exact permissions that are required generate insights, expand the Required permissions section:
Required permissions
dataplex.datascans.get
dataplex.datascans.getData
Enable APIs
To use data insights, enable the following APIs in your project:
For more information about enabling the Gemini for Google Cloud API, see Enable the Gemini for Google Cloud API in a Google Cloud project.
About data insights
When exploring a new, unfamiliar table, data analysts often face the cold-start problem. The problem often involves uncertainties about the data structure, key patterns, and relevant insights in the data, making it hard to get started writing queries.
Data insights addresses the cold-start problem by automatically generating natural language queries and their SQL equivalents based on a table's metadata. Rather than starting with an empty query editor, you can quickly start data exploration with meaningful queries that offer valuable insights. To investigate further, you can ask follow-up questions in data canvas.
Example of an insights run
Consider a table called telco_churn
with the following metadata:
Field name | Type |
---|---|
CustomerID | STRING |
Gender | STRING |
Tenure | INT64 |
InternetService | STRING |
StreamingTV | STRING |
OnlineBackup | STRING |
Contract | STRING |
TechSupport | STRING |
PaymentMethod | STRING |
MonthlyCharges | FLOAT |
Churn | BOOLEAN |
The following are some of the sample queries that data insights generates for this table:
Identify customers who have subscribed to all premium services and have been a customer for more than 50 months.
SELECT CustomerID, Contract, Tenure FROM agentville_datasets.telco_churn WHERE OnlineBackup = 'Yes' AND TechSupport = 'Yes' AND StreamingTV = 'Yes' AND Tenure > 50;
Identify which internet service has the most churned customers.
SELECT InternetService, COUNT(DISTINCT CustomerID) AS total_customers FROM agentville_datasets.telco_churn WHERE Churn = TRUE GROUP BY InternetService ORDER BY total_customers DESC LIMIT 1;
Identify churn rates by segment among high-value customers.
SELECT Contract, InternetService, Gender, PaymentMethod, COUNT(DISTINCT CustomerID) AS total_customers, SUM(CASE WHEN Churn = TRUE THEN 1 ELSE 0 END) AS churned_customers, (SUM(CASE WHEN Churn = TRUE THEN 1 ELSE 0 END) / COUNT(DISTINCT CustomerID)) * 100 AS churn_rate FROM agentville_datasets.telco_churn WHERE MonthlyCharges > 100 GROUP BY Contract, InternetService, Gender, PaymentMethod;
View insights
To view insights for a BigQuery table, access the table entry in Dataplex using Dataplex Search.
In the Google Cloud console, go to the Dataplex Search page.
Search for the table entry in Dataplex.
Click the Insights tab. If the tab is empty, it means that the insights for this table are not generated yet. You can generate data insights in BigQuery Studio.
Pricing
For details about pricing for this feature, see Gemini in BigQuery pricing overview.
Quotas and limits
For information about quotas and limits for this feature, see Quotas for Gemini in BigQuery.
Limitations
- Data insights is available for BigQuery tables, BigLake tables, external tables, and views.
- For multicloud customers, data from other clouds isn't available.
- Data insights doesn't support
Geo
orJSON
column types. - Insights runs don't guarantee the presentation of queries every time. To increase the likelihood of generating more engaging queries, regenerate insights in BigQuery Studio.
What's next
- Learn how to generate insights in BigQuery.
- Learn how to generate a data profiling scan.
- Learn how to write queries with Gemini assistance in BigQuery.