Building a conversational agent in BigQuery using the Conversational Analytics API
David Tamaki Szajngarten
Developer Relations Engineer
Wei Hsia
Developer Advocate
Bringing data into BigQuery centralizes your information, but the real challenge is making that data accessible. Often, technical barriers separate the people with questions — from execs to analysts — from the answers they need.
With the Conversational Analytics API, powered by Gemini, you no longer need intricate systems to get insights. The API is engineered to help you build context-aware agents that can understand natural language, query your BigQuery data, and deliver answers in text, tables, and visual charts.
Now, you can build any solution that interfaces with the API. For example, you can integrate it with the Agent Development Kit (ADK) to build multi-agent systems, or use it to implement data strategies like these:
- Self-service triage for operations: Give teams like Support and Sales an agent that answers data questions instantly. Instead of filing a ticket to ask, "Why did signups drop last week?", they get the answer immediately.
- Differentiate your SaaS product: Embed a powerful chat interface directly into your platform, and let your customers query and visualize their own usage data using plain English.
- Dynamic reporting: Move beyond static PDFs. Automate the core reporting function and enable stakeholders to ask nuanced, follow-up questions for deeper investigation, effectively replacing report versions with real-time conversation.
In this post, we’ll share ways to build a conversational agent in BigQuery using the Conversational Analytics API.
Step one: Configure and create the agent
The deployment of a Data Analytics Agent involves configuring its access, context, and environment before making the final creation call.
In our included example, the Python SDK is used, but the Conversational Analytics API supports many other languages, depending on your preference and environment.
Initialize the client and define BigQuery sources
Begin by instantiating the necessary client (DataAgentServiceClient) to interact with the API. This client is used in conjunction with explicit BigQueryTableReference objects, which authorize the agent's access to specific tables (defined by project_id, dataset_id, and table_id). These individual references are then aggregated into a DatasourceReferences object under the bq field.
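As a rough sketch, the setup above might look like the following. This assumes the google-cloud-geminidataanalytics package; the billing project, location, and table IDs are placeholders, and the helper names (bigquery_parent_path, build_datasource_references) are ours, not the SDK's:

```python
def bigquery_parent_path(billing_project: str, location: str) -> str:
    """Parent resource path used by the agent and conversation calls."""
    return f"projects/{billing_project}/locations/{location}"


def build_datasource_references(tables):
    """Wrap (project_id, dataset_id, table_id) tuples into DatasourceReferences."""
    # Imported lazily so this sketch stays runnable without the SDK installed.
    from google.cloud import geminidataanalytics

    table_refs = [
        geminidataanalytics.BigQueryTableReference(
            project_id=project, dataset_id=dataset, table_id=table
        )
        for project, dataset, table in tables
    ]
    # Individual table references are aggregated under the `bq` field.
    return geminidataanalytics.DatasourceReferences(
        bq=geminidataanalytics.BigQueryTableReferences(table_references=table_refs)
    )


if __name__ == "__main__":
    # The client that talks to the Data Agent service would be created as:
    # data_agent_client = geminidataanalytics.DataAgentServiceClient()
    print(bigquery_parent_path("my-billing-project", "global"))
```

The tables argument would hold tuples naming the BigQuery tables you want the agent to query, such as the Google Trends tables referenced later in this post.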
Set the agent context
Construct the context object by bundling the system_instruction (defining the agent's behavior/role) and the datasource_references (defining its permitted data access). This complete Context is then nested within the DataAnalyticsAgent structure of the final DataAgent object.
While you can provide a string-based system instruction, we recommend using the more robust context object to instruct the agent. You can still add supplemental system instructions to the object for additional guidance.
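A sketch of this nesting, under the same SDK assumption as before; compose_system_instruction is a hypothetical convenience helper of ours, not part of the API:

```python
def compose_system_instruction(role: str, table_notes: dict) -> str:
    """Hypothetical helper: fold a role plus per-table descriptions into one instruction."""
    lines = [role] + [f"- {name}: {note}" for name, note in table_notes.items()]
    return "\n".join(lines)


def build_data_agent(system_instruction: str, datasource_references):
    """Nest the Context inside DataAnalyticsAgent, then inside DataAgent."""
    from google.cloud import geminidataanalytics  # lazy import; requires the SDK

    context = geminidataanalytics.Context(
        system_instruction=system_instruction,
        datasource_references=datasource_references,
    )
    return geminidataanalytics.DataAgent(
        data_analytics_agent=geminidataanalytics.DataAnalyticsAgent(
            published_context=context
        )
    )


if __name__ == "__main__":
    print(compose_system_instruction(
        "You are a Google Trends analyst.",
        {"top_terms": "weekly top search terms per US DMA"},
    ))
```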
Create the agent
Call data_agent_client.create_data_agent. This request includes the parent resource path (projects/{billing_project}/locations/{location}), the unique data_agent_id, and the fully configured data_agent object to complete the deployment.
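The creation call might look roughly like this (same assumptions as above; agent_resource_name is our illustrative helper, and whether the call returns the agent directly or a long-running operation may vary by SDK version):

```python
def agent_resource_name(billing_project: str, location: str, data_agent_id: str) -> str:
    """Full resource name the agent will have after creation."""
    return f"projects/{billing_project}/locations/{location}/dataAgents/{data_agent_id}"


def create_agent(billing_project, location, data_agent_id, data_agent):
    """Issue the create_data_agent request described above."""
    from google.cloud import geminidataanalytics  # lazy import; requires the SDK

    client = geminidataanalytics.DataAgentServiceClient()
    request = geminidataanalytics.CreateDataAgentRequest(
        parent=f"projects/{billing_project}/locations/{location}",
        data_agent_id=data_agent_id,
        data_agent=data_agent,
    )
    # Depending on SDK version this may return a long-running operation;
    # if so, call .result() on it to block until the agent exists.
    return client.create_data_agent(request=request)


if __name__ == "__main__":
    print(agent_resource_name("my-billing-project", "global", "trends-agent"))
```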
Your agent now exists and is defined by that published_context.
Step two: Creating a conversation (stateful vs. stateless)
The Conversational Analytics API can handle conversations in two ways:
- Stateless: You send a question and the agent's context. You must manage the conversation history in your own application and send it with every new request.
- Stateful: You create a "conversation" on the server. The API manages the history for you. This is what allows users to ask follow-up questions.
We'll configure a stateful conversation. We create a conversation object associated with our new agent.
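A sketch of that creation step, with the conversation ID and resource-name helper as our own placeholders:

```python
def conversation_resource_name(billing_project: str, location: str, conversation_id: str) -> str:
    """Full resource name of the server-side conversation."""
    return f"projects/{billing_project}/locations/{location}/conversations/{conversation_id}"


def create_stateful_conversation(billing_project, location, data_agent_id, conversation_id):
    """Create a server-managed conversation associated with our agent."""
    from google.cloud import geminidataanalytics  # lazy import; requires the SDK

    chat_client = geminidataanalytics.DataChatServiceClient()
    parent = f"projects/{billing_project}/locations/{location}"
    conversation = geminidataanalytics.Conversation(
        agents=[f"{parent}/dataAgents/{data_agent_id}"],
    )
    return chat_client.create_conversation(
        request=geminidataanalytics.CreateConversationRequest(
            parent=parent,
            conversation_id=conversation_id,
            conversation=conversation,
        )
    )
```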
Step three: Create a streaming chat loop
To allow for interactive analysis, we implement a function, stream_chat_response, to manage the conversation flow. The Conversational Analytics API is designed to return responses as a stream, which is crucial for delivering updates on the agent's progress in real time.
A typical response stream can include distinct components, such as:
- Schema: Confirmation of table resolution.
- Data (query): The generated SQL query (excellent for debugging and transparency).
- Data (result): The resulting data structure (e.g., a Pandas-like DataFrame).
- Chart: A Vega-Lite JSON specification for data visualization.
- Text: The final, synthesized natural language summary.
Defining the function
The function is defined to accept the user's question. Inside, we initialize the DataChatServiceClient and define a simple flag (chart_generated_flag) to track if a chart needs to be rendered after the stream completes. The user's question is wrapped in a Message object, which is required for the API request.
Processing the stream
The ConversationReference is essential as it ties the current request to the stateful conversation and links it back to the specific data_agent we created earlier. Once the request object is fully assembled with the parent path, messages, and reference, we call data_chat_client.chat.
We then iterate over the returned stream. A utility function, show_message, is used here to parse and appropriately format the different response types (Text, Chart, Data) for the user. Finally, if the chart_generated_flag was set during the stream, a post-processing utility (preview_in_browser) handles the rendering of the visualization.
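Putting the pieces together, stream_chat_response might look roughly like this. The show_message and preview_in_browser utilities are the ones described above and are assumed to be defined elsewhere; the default IDs are placeholders, and the oneof presence check on the system message is a best-effort sketch:

```python
def stream_chat_response(question: str,
                         billing_project: str = "my-billing-project",
                         location: str = "global",
                         data_agent_id: str = "trends-agent",
                         conversation_id: str = "trends-conversation"):
    """Send one question into the stateful conversation and stream the replies."""
    from google.cloud import geminidataanalytics  # lazy import; requires the SDK

    data_chat_client = geminidataanalytics.DataChatServiceClient()
    parent = f"projects/{billing_project}/locations/{location}"

    # Tie this request to the stateful conversation and the agent behind it.
    conversation_reference = geminidataanalytics.ConversationReference(
        conversation=f"{parent}/conversations/{conversation_id}",
        data_agent_context=geminidataanalytics.DataAgentContext(
            data_agent=f"{parent}/dataAgents/{data_agent_id}",
        ),
    )

    # The user's question must be wrapped in a Message object.
    message = geminidataanalytics.Message(
        user_message=geminidataanalytics.UserMessage(text=question)
    )

    request = geminidataanalytics.ChatRequest(
        parent=parent,
        messages=[message],
        conversation_reference=conversation_reference,
    )

    chart_generated_flag = False
    for reply in data_chat_client.chat(request=request):
        # show_message (defined elsewhere) formats Text/Chart/Data responses.
        show_message(reply)
        if "chart" in reply.system_message:  # proto oneof presence check
            chart_generated_flag = True

    if chart_generated_flag:
        preview_in_browser()  # post-processing utility described above
```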
Step four: Talk to the agent
Asking questions
Now for the payoff. We can use our stream_chat_response function to have a conversation.
Checking the context
Let's start by seeing if the agent understands its own context.
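For example (the exact wording of the question is ours; stream_chat_response is the helper from Step three):

```python
question = "What data do you have access to? Briefly describe each table."

# Guarded so this snippet can also be read and run standalone.
if "stream_chat_response" in globals():
    stream_chat_response(question)
```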
The agent will respond with a summary of the top_terms and top_rising_terms tables, using the descriptions we provided in the system_instruction.
Natural language to SQL to Chart
Now for a complex query. Notice we ask for a chart in plain English.
Python
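A question along these lines (again, the phrasing is illustrative) exercises the full SQL-to-chart pipeline:

```python
question = (
    "Show me the top 5 search terms in the New York NY DMA for the most "
    "recent week, as a column chart."
)

# stream_chat_response is the streaming helper from Step three.
if "stream_chat_response" in globals():
    stream_chat_response(question)
```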
The agent will stream its process:
- It will show the SQL query it generated to hit the top_terms table, filtering by dma_name = 'New York NY' and the most recent week.
- It will print the resulting data as a table.
- It will generate a Vega chart specification.
- The preview_in_browser utility will serve this as an index.html file, showing a column chart.
The stateful follow-up
This is where the stateful conversation (Step 2) pays off.
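A follow-up like this (our phrasing) leans entirely on the server-side history — note that it never restates which terms it means:

```python
question = "What was the percent gain for these search terms?"

# stream_chat_response is the streaming helper from Step three.
if "stream_chat_response" in globals():
    stream_chat_response(question)
```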
The agent remembers that "these search terms" refers to the results of the previous question. It will generate a new query, this time INNER JOIN-ing the top_terms and top_rising_terms tables (as guided by our join_instructions) to find the percent_gain for that same list of terms.
Step five: Managing the agent
For more in-depth lifecycle management of agents and messages, visit the Conversational Analytics API documentation for the full range of API requests you can make (HTTP / Python). You will find information on how to manage agents, how to invite new users to collaborate via the SetIAM and GetIAM APIs, and more.
Pro tip: Bridge the gap between data and people
By providing clear system instructions and schema descriptions, you can build an agent that is more than just conversational: it becomes a domain expert. This interactive approach moves beyond static dashboards to provide truly accessible data analysis.


