This tutorial demonstrates how to derive insights from big datasets by using BigQuery, Cloud Run, and the Gemma LLM. In this tutorial, you deploy a sample application to Google Kubernetes Engine (GKE). The sample app leverages BigQuery for data storage and processing, Cloud Run for request handling, and the Gemma LLM for analyzing data and generating predictions based on incoming prompts.
This tutorial is intended for cloud platform administrators and architects, Data and AI specialists, ML engineers, and MLOps (DevOps) professionals. Before reading this page, ensure that you're familiar with Kubernetes and a notebook environment like Jupyter.
As a prerequisite to this tutorial, you must complete the tutorial Serve Gemma open models using GPUs on GKE with Hugging Face TGI. The TGI framework facilitates the model serving process.
Why GKE and BigQuery
BigQuery is a Platform as a Service (PaaS), fully managed, serverless data warehouse that enables scalable analysis over petabytes of data. BigQuery lets you focus on analyzing data to find meaningful insights while using familiar SQL and built-in machine learning.
Using GPUs on GKE with TGI, you can deploy a Gemma language model to analyze and summarize user interactions in natural language. Subsequently, by integrating BigQuery with GKE, you can use BigQuery to efficiently handle massive datasets (like Google Analytics) and the model's natural language understanding capabilities to generate meaningful insights.
For example, as a data scientist or analyst, or a business decision-maker in an ecommerce company, you might want to understand user behavior on your website or app. This insight can help you optimize and personalize user journeys and make informed business decisions to boost sales.
In this scenario, you could take raw Google Analytics data from BigQuery, feed it to the Gemma model, and receive page visit summaries and insights in natural language. The Gemma model, which runs on scalable infrastructure with GPU acceleration from GKE, rapidly processes user journey data, identifying patterns and trends. You could gain insights to pinpoint popular product combinations, reveal common drop-off points in the checkout process, and highlight successful marketing campaigns driving traffic to specific landing pages.
Benefits
This solution offers a streamlined workflow with the following advantages:
- BigQuery integration: use BigQuery to store and process large datasets (like the Google Analytics data in this tutorial). This lets you query and aggregate the data needed for the model's analysis.
- GPU acceleration: run the Gemma model on a GKE cluster with GPU support to speed up the inference process, generating predictions much faster than with the CPU-based processors.
- Reduced cost and time: save time and resources by using the open source, pre-trained Gemma language model, eliminating the need to build a custom model from scratch.
Serve a Gemma model
Go to the tutorial Serve Gemma open models using GPUs on GKE with Hugging Face TGI and follow the instructions starting from Before you begin through Interact with the model using curl to ensure that your Gemma model is deployed successfully and you can interact with it.
For the purpose of this tutorial, deploy the Gemma 2B-it model.
Set up VPC network
Create or use the VPC network
in the us-central1
region so that your remote function can connect to the GKE cluster.
In this tutorial, use the Default
VPC.
To ensure that your BigQuery dataset, remote function, and the underlying
Cloud Run functions are deployed in compatible locations, the VPC
network must be in the same region as your BigQuery remote function.
In this tutorial, when you set BigQuery DataFrames options
while creating a remote function, you specify US
as
a location for your dataset, which defaults to the us-central1
region for your
Cloud Run functions. Therefore, create or use the VPC in the us-central1
region.
Create a load balancer
Follow these instructions to create an internal load balancer in your GKE cluster:
Create the following
tgi-2b-lb-service.yaml
manifest:apiVersion: v1 kind: Service metadata: name: llm-lb-service annotations: networking.gke.io/load-balancer-type: "Internal" spec: selector: app: gemma-server type: LoadBalancer ports: - protocol: TCP port: 8000 targetPort: 8000
Open a new Cloud Shell terminal and run the following command to apply the manifest:
kubectl apply -f tgi-2b-lb-service.yaml
Get the load balancer IP address. You might need to wait for 1-2 minutes before this IP address can be fetched:
kubectl get service llm-lb-service --output yaml | grep ip:
You'll use this IP address to communicate with your gemma-server
application
that's running behind the load balancer.
Create a connector
You use a Serverless VPC Access connector to send and receive requests through your VPC network without using the public internet. For more information, see Serverless VPC Access.
In this tutorial you create a connector with a new, dedicated subnet to avoid
any IP address conflicts with existing resources in the VPC. For instructions,
see the Create a connector
section and follow the gcloud
instructions for the Create a connector and a new subnet
section.
Alternatively, if you want to use an existing subnet, follow the instructions for the Create a connector using an existing subnet section.
For more information, see Connector subnet requirements.
Create a notebook
In this tutorial, you use a Colab Enterprise notebook to run all your code for defining the BigQuery remote function and performing the analysis.
To create a Colab Enterprise notebook by using Google Cloud console:
In Google Cloud console, go to the Colab Enterprise Notebooks page:
In the Region menu, select
us-central1
. This is the same region where you create all your services in this tutorial.Next to Files, click Create a notebook.
Your new notebook appears on the My notebooks tab.
To run code in your new notebook, insert a new code cell in your notebook for every command or a code snippet you want to run.
Create a BigQuery remote function
One of the ways you can define a BigQuery remote function is by using
the bigframes
library. In this section, use bigframes
to create a remote function
called process_incoming
. This remote function takes Google Analytics
data as an input, constructs a prompt, and sends it to your Gemma model
for analysis.
In the Colab Enterprise notebook you created:
- Click + Code to insert a new code cell.
Copy the following code in the new code cell:
# Install the necessary packages on the notebook runtime %pip install --upgrade bigframes --quiet import bigframes.pandas as bpd import os import ast import requests # Replace the following variables # Use the format ip:port # For example, "10.128.05:8000" lb_url = "LOADBALANCER_IP_ADDRESS:8000" # Set BigQuery DataFrames options bpd.options.bigquery.project = "PROJECT_ID" bpd.options.bigquery.location = "US" # Update the VPC connector name with the one you created vpc_connector_name = "VPC_CONNECTOR_NAME" # Create a remote function using bigframes # https://cloud.google.com/bigquery/docs/remote-functions#bigquery-dataframes @bpd.remote_function( dataset="ga_demo", name="ga_explain_example", bigquery_connection="bigframes-rf-conn", reuse=True, packages=["requests"], cloud_function_vpc_connector=VPC_CONNECTOR_NAME, cloud_function_service_account="default", ) def process_incoming(data: str) -> str: ga_data = ast.literal_eval(data) USER_PROMPT = """ 'The following are the results from Google Analytics. They are reverse ranked. reverse_event_number 1 is the last page visited. reverse_event_number 2 is the second last page visited. You are given the following data. {} Can you summarize what was the most popular page people landed on and what page they came from? """.format(ga_data) url = 'http://{}/generate'.format(lb_url) myobj = { "inputs": USER_PROMPT, "temperature": 0.90, "top_p": 0.95, "max_tokens": 2048 } x = requests.post(url, json=myobj) result = x.text return (result) function_name = process_incoming.bigframes_remote_function print (f"The function name is: {function_name}")
Replace the following:
LOADBALANCER_IP_ADDRESS
: the IP address and port of the internal load balancer you created earlier--for example,10.128.05:8000
.PROJECT_ID
: your project ID.VPC_CONNECTOR_NAME
: the name of the Serverless VPC Access connector you created earlier.
In this tutorial, the location of your BigQuery dataset is set to
US
, which defaults tous-central1
region.Click
Run cell.
The output displays the name of the function similar to the following:
The function name is: PROJECT_ID.ga_demo.ga_explain_example
Analyze user behavior
In this section, you analyze user behavior on your website using the process_incoming
remote function in either of the following two ways:
- using BigQuery DataFrames
- using the
bq
command-line tool to run a query directly in SQL.
Use BigQuery DataFrames
To run the remote function using BigQuery DataFrames in the Colab Enterprise notebook you created:
- Click + Code to insert a new code cell.
- Copy the following code in the new code cell, and click Run cell.
# Generate a list of all matchups and their histories as a JSON
grouping_sql = """
with
data_table as (
SELECT
distinct
user_pseudo_id,
events.value.string_value,
event_timestamp,
rank() over (partition by user_pseudo_id order by event_timestamp desc) as reverse_event_number
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131` as events20210131,
unnest (events20210131.event_params) as events
where events.key = 'page_location'
qualify reverse_event_number < 3
)
select
*,TO_JSON_STRING (data_table) as ga_history
from data_table
limit 10;
"""
ga_df = bpd.read_gbq(grouping_sql)
post_processed = ga_df.assign(results=ga_df['ga_history'].apply(process_incoming),axis=1)
post_processed.head(10)
The following output shows the sample results of the query:
user_pseudo_id | string_value | event_timestamp | reverse_event_number | ga_history | results | axis |
---|---|---|---|---|---|---|
0 | 2342103247.0307162928 | https://shop.googlemerchandisestore.com/Google... | 1612096237169825 | 2 | {"user_pseudo_id":"2342103247.0307162928","str... | {"generated_text":"\n 'The following are... |
1 | 48976087.6959390698 | https://www.googlemerchandisestore.com/ | 1612056537823270 | 2 | {"user_pseudo_id":"48976087.6959390698","strin... | {"generated_text":"\n \n ```python\n imp... |
Use bq
command-line tool
Alternatively, you can use the bq
command-line tool to perform analysis directly
using SQL.
To run the remote function using bq
command-line tool in the Colab Enterprise
notebook you created:
- Click + Code to insert a new code cell.
Copy the following code in the new code cell, and replace
PROJECT_ID
with your project ID.# Update with your PROJECT_ID function_name = 'PROJECT_ID.ga_demo.ga_explain_example' new_sql = """'with \ data_table as ( \ SELECT \ distinct \ user_pseudo_id, \ events.value.string_value, \ event_timestamp, \ rank() over (partition by user_pseudo_id order by event_timestamp desc) as reverse_event_number \ FROM \ `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131` as events20210131, \ unnest (events20210131.event_params) as events \ where events.key = "page_location" \ qualify reverse_event_number < 3 \ ) \ select \ *, `{}`(TO_JSON_STRING (data_table)) as result \ from data_table \ limit 10;' \ """.format(function_name) # Run query using bq cli directly in a notebook cell !bq query --use_legacy_sql=false \ {new_sql}
Click
Run cell.
The following output shows the sample results of the query:
user_pseudo_id | string_value | event_timestamp | reverse_event_number | result |
---|---|---|---|---|
86037838.0267811614 | https://shop.googlemerchandisestore.com/Google+Redesign/Apparel/Google+Dino+Game+Tee | 1612128627715585 | 1 | {"generated_text":"Answer:\n The most popular page was https://shop.googlemerchandisestore.com/Google+Redesign/Apparel/Google+Dino+Game+Tee\n The next most popular page was the page they came from.\n\n Explanation:\n\nThe provided data shows that the current user visited Google's merchandise store specifically for the product "Google Dino Game Tee." \n \nImportant Considerations:\n\n* Data Interpretation: You can't definitively say the"} |
4024190.3037653934 | https://shop.googlemerchandisestore.com/Google+Redesign/Apparel/Google+Black+Cloud+Zip+Hoodie | 1612085948486438 | 1 | {"generated_text":"\n ```python\n import pandas as pd\n\n data = {'user_pseudo_id': ['4024190.3037653934', '4024190.3037653934', '4024190.3037653934'],\n 'string_value': ['https://shop.googlemerchandisestore.com"} |