Analyze data with Gemini for Google Cloud assistance

This tutorial describes how you can use Gemini for Google Cloud, an AI-powered collaborator for Google Cloud, to analyze data. In the tutorial, you use Gemini in BigQuery to analyze and predict product sales.

This tutorial assumes that you're familiar with SQL and basic data analytics tasks. Knowledge of Google Cloud products is not assumed. If you're new to BigQuery, see the BigQuery quickstarts.

Objectives

  • Use Gemini to answer your questions about Google Cloud data analytics products and use cases.
  • Prompt Gemini to explain and generate SQL queries in BigQuery.
  • Build a machine learning (ML) model to forecast future periods.

Costs

This tutorial uses the following billable components of Google Cloud:

Use the Pricing Calculator to estimate your costs based on your projected usage.

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  2. Ensure that you have set up Gemini in BigQuery in your Google Cloud project.
  3. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  4. Create a dataset that's named bqml_tutorial. You use the dataset to store database objects, including tables and models.

  5. Enable the Gemini features in BigQuery that you need to complete this tutorial:

    1. To view Gemini features in BigQuery, in the toolbar, click pen_spark Gemini.

    2. In the Gemini in BigQuery SQL editor list, select all of the following options:

      • Auto completion
      • Auto generation
      • Explanation

    To disable Gemini features in BigQuery, deselect the Gemini features that you want to disable.

Use Gemini to analyze your data

Gemini can help you know what data you can access for analysis, and how to analyze that data.

Before you can query data, you need to know what data you can access. Every data product organizes and stores data differently. To get help, you can send Gemini a natural language statement (or prompt) like "How do I view which datasets and tables are available to me in BigQuery?"

If you want to understand the characteristics of different data query systems, you might prompt Gemini for specific product information like the following:

  • "How do I get started with BigQuery?"
  • "What are the benefits of using BigQuery for data analysis?"
  • "How does BigQuery handle auto-scaling for queries?"

Gemini can also provide information about how to analyze your data. For that type of help, you might send Gemini prompts such as the following:

  • "How do I create a time series forecasting model in BigQuery?"
  • "How do I load different types of data into BigQuery?"

To prompt Gemini to answer questions about your data, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the Google Cloud console toolbar, click spark Open Gemini.

  3. In the Gemini pane, enter a prompt like How do I learn which datasets and tables are available to me in BigQuery?.

  4. Click send Send prompt.

    Gemini doesn't use your prompts or its responses as data to train its models. For more information, see How Gemini for Google Cloud uses your data.

    Gemini returns a response similar to the following:

    To learn which datasets and tables are available to you in
    BigQuery, you can use the Google Cloud console, the
    Google Cloud CLI, or the BigQuery API.
    ...
    
  5. Optional: To reset your chat history, in the Gemini pane, click the delete icon, and then click Reset chat.

Use Gemini to understand and write SQL in BigQuery

Gemini can help you work with SQL. For instance, if you work with SQL queries that other people wrote, Gemini in BigQuery can explain a complex query in plain language. Such explanations can help you understand the query syntax, underlying schema, and business context.

Prompt Gemini to explain SQL queries in a sales dataset

To prompt Gemini to explain an example SQL query, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, open or paste the query that you want explained.

    For example, you might want to understand how data tables and queries are related in a sales dataset, and you might want help writing queries that use the dataset. In the following example query, you might understand which tables are being used, but other sections of the query might take you time to parse and understand.

    SELECT u.id as user_id, u.first_name, u.last_name, avg(oi.sale_price) as avg_sale_price
    FROM `bigquery-public-data.thelook_ecommerce.users` as u
    JOIN `bigquery-public-data.thelook_ecommerce.order_items` as oi
    ON u.id = oi.user_id
    GROUP BY 1,2,3
    ORDER BY avg_sale_price DESC
    LIMIT 10
    
  3. Highlight the query that you want Gemini to explain, and then click pen_spark Explain this query.

    The SQL explanation appears in the Gemini pane.

    Using the example query from the previous step, Gemini returns an explanation similar to the following:

    The intent of this query is to find the top 10 users by average sale price.
    The query first joins the users and order_items tables on the user_id
    column. It then groups the results by user_id , first_name , and last_name,
    and calculates the average sale price for each group. The results are then
    ordered by average sale price in descending order, and the top 10 results
    are returned.
    

Generate a SQL query that groups sales by day and product

You can provide Gemini with a prompt to generate a SQL query based on your data's schema. Even if you're starting with no code, a limited knowledge of the data schema, or only a basic knowledge of SQL syntax, Gemini can suggest one or more SQL statements.

In the following example, you generate a query that lists your top products for each day. This type of query is often complex, but using Gemini, you can automatically create a statement. You then use tables in the thelook_ecommerce dataset and prompt Gemini to generate a query to calculate sales by order item and by product name.

To prompt Gemini to generate a query that lists your top products, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the navigation menu, click BigQuery Studio.

  3. Click Compose a new query. The Explorer pane automatically loads the selected database.

  4. In the query editor, enter the following prompt, and then press Enter. The pound character (#) prompts Gemini to generate SQL.

    # select the sum of sales by date and product casted to day from bigquery-public-data.thelook_ecommerce.order_items joined with products
    

    Gemini suggests a SQL query similar to the following:

    SELECT sum(sale_price),
    DATE(created_at),
    product_id
    FROM `bigquery-public-data.thelook_ecommerce.order_items`
    AS t1
    INNER JOIN `bigquery-public-data.thelook_ecommerce.products`
    AS t2
    ON t1.product_id = t2.id
    GROUP BY 2, 3
    
  5. To accept the suggested code, click Tab, and then click Run to execute the SQL statement. You can also scroll through the suggested SQL and accept specific words suggested in the statement.

  6. In the Query results pane, view the query results.

Build a forecasting model and view results

In this section, you use BigQuery ML to do the following:

  • Use a trend query to build a forecasting model.
  • Use Gemini to explain and help you write a query to view results of the forecasting model.

You use the following example query with actual sales, which are used as an input to the model. The query is used as a part of creating the ML model.

  1. To create a forecasting ML model, in the BigQuery SQL editor, run the following SQL:

    CREATE MODEL bqml_tutorial.sales_forecasting_model
    OPTIONS(MODEL_TYPE='ARIMA_PLUS',
    time_series_timestamp_col='date_col',
    time_series_data_col='total_sales',
    time_series_id_col='product_id') AS
    SELECT sum(sale_price) as total_sales,
    DATE(created_at) as date_col,
    product_id
    FROM `bigquery-public-data.thelook_ecommerce.order_items`
    AS t1
    INNER JOIN `bigquery-public-data.thelook_ecommerce.products`
    AS t2
    ON t1.product_id = t2.id
    GROUP BY 2, 3;
    

    You can use Gemini to help you understand this query

    When the model is created, the Results pane displays a message similar to the following:

    This statement will replace the model named sales_forecasting_model.
    Depending on the type of model, this may take several hours to complete.
    
  2. In the Gemini pane, enter a prompt for Gemini to help you write a query to get a forecast from the model when it's completed. For example, enter How can I get a forecast in SQL from the model?

    Based on the context of the prompt, Gemini returns an example of an ML model that forecasts sales:

    To get a forecast in SQL from the model, you can use the following query:
    
    SELECT
    *
    FROM
      ML.FORECAST(MODEL `PROJECT_ID.DATASET_ID.MODEL_NAME`,
    STRUCT(
          7 AS horizon,
          0.95 AS confidence_level
    )
    )
    
  3. In the Gemini pane, copy the SQL query.

  4. In the BigQuery SQL editor, paste and then run the SQL query.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, you can delete the Google Cloud project that you created for this tutorial. Alternatively, you can delete the individual resources.

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete your dataset

Deleting your project removes all datasets and all tables in the project. If you prefer to reuse the project, then you can delete the dataset that you created in this tutorial.

  1. In the Google Cloud console, open the BigQuery page.

    Go to BigQuery

  2. In the navigation, select the bqml_tutorial dataset that you created.

  3. To delete the dataset, the table, and all of the data, click Delete dataset.

  4. To confirm deletion, in the Delete dataset dialog, type the name of your dataset (bqml_tutorial), and then click Delete.

What's next