This page shows you how to use the Python SDK to make requests to the Conversational Analytics API. The sample Python code demonstrates how to complete the following tasks:
- Authenticate and set up your environment
- Specify the billing project and system instructions
- Connect to a Looker, BigQuery, or Looker Studio data source
- Set up context for stateful or stateless chat
- Create a data agent
- Retrieve a data agent
- Create a conversation
- Use the API to ask questions
- Create a stateless multi-turn conversation
- Define helper functions
Authenticate and set up your environment
To use the Python SDK for the Conversational Analytics API, follow the instructions in the Conversational Analytics API SDK Colaboratory notebook to download and install the SDK. Note that the download method and the contents of the SDK Colab are subject to change.
After you've completed the setup instructions in the notebook, you can use the following code to import the required SDK libraries, authenticate your Google Account within a Colaboratory environment, and initialize a client for making API requests:
from google.colab import auth
auth.authenticate_user()
from google.cloud import geminidataanalytics
data_agent_client = geminidataanalytics.DataAgentServiceClient()
data_chat_client = geminidataanalytics.DataChatServiceClient()
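The authentication call in the sample above is specific to Colaboratory. As a minimal sketch, assuming you might also run the same script elsewhere, the following guard checks whether the google.colab module is importable before calling it. The running_in_colab helper is a hypothetical name and is not part of the SDK. Outside Colaboratory, Google Cloud client libraries generally fall back to Application Default Credentials, so the explicit call can usually be skipped there.

```python
import importlib.util


def running_in_colab() -> bool:
    """Return True when the google.colab module can be imported."""
    try:
        return importlib.util.find_spec("google.colab") is not None
    except (ImportError, ValueError):
        # Raised, for example, when the parent "google" package is not installed.
        return False


if running_in_colab():
    # Same call as in the sample above.
    from google.colab import auth
    auth.authenticate_user()
else:
    print("Not running in Colaboratory; skipping auth.authenticate_user().")
```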
Specify the billing project and system instructions
The following sample Python code defines the billing project and system instructions that are used throughout your script:
# Billing project
billing_project = "my_project_name"
# System instructions
system_instruction = "Help the user analyze their data."
Replace the sample values as follows:
- my_project_name: The ID of your billing project that has the required APIs enabled.
- Help the user analyze their data.: System instructions to guide the agent's behavior and customize it for your data needs. For example, you can use system instructions to define business terms, control response length, or set data formatting. Ideally, define system instructions by using the recommended YAML format in Write effective system instructions to provide detailed and structured guidance.
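As a rough illustration of the kinds of guidance mentioned above (business terms and response length), the following sketch assembles a plain-text system instruction from a small glossary. The build_system_instruction helper and the glossary contents are hypothetical, and this plain-text form is not the recommended YAML format.

```python
def build_system_instruction(glossary: dict[str, str], max_sentences: int) -> str:
    """Combine a length rule and business-term definitions into one string."""
    lines = ["Help the user analyze their data."]
    lines.append(f"Keep each answer to at most {max_sentences} sentences.")
    lines.append("Business terms:")
    for term, definition in glossary.items():
        lines.append(f"- {term}: {definition}")
    return "\n".join(lines)


# Hypothetical business term, for illustration only.
system_instruction = build_system_instruction(
    {"churn": "A customer with no orders in the last 90 days."},
    max_sentences=3,
)
print(system_instruction)
```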
Connect to a data source
The following Python code samples show how to define the connection details for the Looker, BigQuery, or Looker Studio data source that your agent will query to answer questions.
Connect to Looker data
The following code examples show how to define the details for a connection to a Looker Explore with either API keys or an access token.
API keys
You can establish a connection with a Looker instance with generated Looker API keys, as described in Authenticate and connect to a data source with the Conversational Analytics API.
looker_client_id = "my_looker_client_id"
looker_client_secret = "my_looker_client_secret"
looker_instance_uri = "https://my_company.looker.com"
lookml_model = "my_model"
explore = "my_explore"
looker_explore_reference = geminidataanalytics.LookerExploreReference()
looker_explore_reference.looker_instance_uri = looker_instance_uri
looker_explore_reference.lookml_model = lookml_model
looker_explore_reference.explore = explore
credentials = geminidataanalytics.Credentials()
credentials.oauth.secret.client_id = looker_client_id
credentials.oauth.secret.client_secret = looker_client_secret
datasource_references = geminidataanalytics.DatasourceReferences()
datasource_references.looker.explore_references = [looker_explore_reference]
Replace the sample values as follows:
- my_looker_client_id: The client ID of your generated Looker API key.
- my_looker_client_secret: The client secret of your generated Looker API key.
- https://my_company.looker.com: The complete URL of your Looker instance.
- my_model: The name of the LookML model that includes the Explore that you want to connect to.
- my_explore: The name of the Looker Explore that you want the data agent to query.
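Hard-coding the client ID and secret, as the sample does for brevity, is easy to leak through version control. One possible alternative, sketched below, reads them from environment variables. The variable names LOOKER_CLIENT_ID and LOOKER_CLIENT_SECRET and the read_looker_api_key helper are assumptions made for this example, not names defined by Looker or the SDK.

```python
import os


def read_looker_api_key(env=os.environ) -> tuple[str, str]:
    """Return (client_id, client_secret) from the environment, or raise."""
    # Hypothetical variable names chosen for this sketch.
    names = ("LOOKER_CLIENT_ID", "LOOKER_CLIENT_SECRET")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["LOOKER_CLIENT_ID"], env["LOOKER_CLIENT_SECRET"]


# Example with an explicit mapping instead of the real environment.
looker_client_id, looker_client_secret = read_looker_api_key(
    {
        "LOOKER_CLIENT_ID": "my_looker_client_id",
        "LOOKER_CLIENT_SECRET": "my_looker_client_secret",
    }
)
```

The two returned values can then be assigned to the credentials fields in the same way as in the sample above.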
Access token
You can establish a connection with a Looker instance by using an access token, as described in Authenticate and connect to a data source with the Conversational Analytics API.
looker_access_token = "my_access_token"
looker_instance_uri = "https://my_company.looker.com"
lookml_model = "my_model"
explore = "my_explore"
looker_explore_reference = geminidataanalytics.LookerExploreReference()
looker_explore_reference.looker_instance_uri = looker_instance_uri
looker_explore_reference.lookml_model = lookml_model
looker_explore_reference.explore = explore
credentials = geminidataanalytics.Credentials()
credentials.oauth.token.access_token = looker_access_token
datasource_references = geminidataanalytics.DatasourceReferences()
datasource_references.looker.explore_references = [looker_explore_reference]
Replace the sample values as follows:
- my_access_token: The access_token value that you generate to authenticate to Looker.
- https://my_company.looker.com: The complete URL of your Looker instance.
- my_model: The name of the LookML model that includes the Explore that you want to connect to.
- my_explore: The name of the Looker Explore that you want the data agent to query.
Connect to BigQuery data
With the Conversational Analytics API, you can connect to and query up to 10 BigQuery tables at a time.
The following sample code defines a connection to a single BigQuery table.
bq_project_id = "my_project_id"
bq_dataset_id = "my_dataset_id"
bq_table_id = "my_table_id"
bigquery_table_reference = geminidataanalytics.BigQueryTableReference()
bigquery_table_reference.project_id = bq_project_id
bigquery_table_reference.dataset_id = bq_dataset_id
bigquery_table_reference.table_id = bq_table_id
# Connect to your data source
datasource_references = geminidataanalytics.DatasourceReferences()
datasource_references.bq.table_references = [bigquery_table_reference]
Replace the sample values as follows:
- my_project_id: The ID of the Google Cloud project that contains the BigQuery dataset and table that you want to connect to. To connect to a public dataset, specify bigquery-public-data.
- my_dataset_id: The ID of the BigQuery dataset. For example, san_francisco.
- my_table_id: The ID of the BigQuery table. For example, street_trees.
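When you connect to several tables, it can help to validate the list before building the references. The following sketch splits fully qualified table IDs into their three parts and enforces the 10-table limit stated above. The TableId class and parse_table_ids helper are hypothetical, and the sketch assumes that none of the three parts contains a dot.

```python
from dataclasses import dataclass

MAX_BQ_TABLES = 10  # Limit stated on this page.


@dataclass(frozen=True)
class TableId:
    project_id: str
    dataset_id: str
    table_id: str


def parse_table_ids(full_ids: list[str]) -> list[TableId]:
    """Split 'project.dataset.table' strings and enforce the table limit."""
    if len(full_ids) > MAX_BQ_TABLES:
        raise ValueError(
            f"At most {MAX_BQ_TABLES} tables are supported, got {len(full_ids)}"
        )
    tables = []
    for full_id in full_ids:
        parts = full_id.split(".")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"Expected project.dataset.table, got {full_id!r}")
        tables.append(TableId(*parts))
    return tables


tables = parse_table_ids(["bigquery-public-data.san_francisco.street_trees"])
print(tables[0])
```

Each TableId could then populate one BigQueryTableReference, and the resulting list could be assigned to datasource_references.bq.table_references as in the single-table sample.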
Connect to Looker Studio data
The following sample code defines a connection to a Looker Studio data source.
studio_datasource_id = "my_datasource_id"
studio_references = geminidataanalytics.StudioDatasourceReference()
studio_references.datasource_id = studio_datasource_id
# Connect to your data source
datasource_references = geminidataanalytics.DatasourceReferences()
datasource_references.studio.studio_references = [studio_references]
In the previous example, replace my_datasource_id with the data source ID.
Set up context for stateful or stateless chat
The Conversational Analytics API supports multi-turn conversations, which let users ask follow-up questions that build on previous context. The following sample Python code demonstrates how to set up context for either stateful or stateless chat:
- Stateful chat: Google Cloud stores and manages the conversation history. Stateful chat is inherently multi-turn, as the API retains context from previous messages. You only need to send the current message for each turn.
- Stateless chat: Your application manages the conversation history. You must include the entire conversation history with each new message. For detailed examples on how to manage multi-turn conversations in stateless mode, see Create a stateless multi-turn conversation.
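The difference between the two modes comes down to what the client sends on each turn. The following sketch illustrates this with plain strings standing in for message objects. The messages_for_turn helper is hypothetical and does not call the API.

```python
def messages_for_turn(mode: str, history: list[str], new_message: str) -> list[str]:
    """Return the messages a client would send for one turn."""
    if mode == "stateful":
        # The service keeps the history, so only the new message is sent.
        return [new_message]
    if mode == "stateless":
        # The application keeps the history and resends all of it.
        return history + [new_message]
    raise ValueError(f"Unknown mode: {mode!r}")


history = ["Which species of tree is most prevalent?", "<agent reply>"]
follow_up = "Can you show me the results as a bar chart?"
print(len(messages_for_turn("stateful", history, follow_up)))   # 1
print(len(messages_for_turn("stateless", history, follow_up)))  # 3
```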
Stateful chat
The following code sample sets up context for stateful chat, where Google Cloud stores and manages the conversation history. You can also optionally enable advanced analysis with Python by including the line published_context.options.analysis.python.enabled = True in the following sample code.
# Set up context for stateful chat
published_context = geminidataanalytics.Context()
published_context.system_instruction = system_instruction
published_context.datasource_references = datasource_references
# Optional: To enable advanced analysis with Python, include the following line:
published_context.options.analysis.python.enabled = True
Stateless chat
The following sample code sets up context for stateless chat, where you must send the entire conversation history with each message. You can also optionally enable advanced analysis with Python by including the line inline_context.options.analysis.python.enabled = True in the following sample code.
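# Set up context for stateless chat
# For a Looker data source, uncomment the following line to pass the credentials that you defined earlier:
# datasource_references.looker.credentials = credentials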
inline_context = geminidataanalytics.Context()
inline_context.system_instruction = system_instruction
inline_context.datasource_references = datasource_references
# Optional: To enable advanced analysis with Python, include the following line:
inline_context.options.analysis.python.enabled = True
Create a data agent
The following sample Python code makes an API request to create a data agent, which you can then use to have a conversation about your data. The data agent is configured with the specified data source, system instructions, and context.
data_agent_id = "data_agent_1"
data_agent = geminidataanalytics.DataAgent()
data_agent.data_analytics_agent.published_context = published_context
data_agent.name = f"projects/{billing_project}/locations/global/dataAgents/{data_agent_id}" # Optional
request = geminidataanalytics.CreateDataAgentRequest(
    parent=f"projects/{billing_project}/locations/global",
    data_agent_id=data_agent_id,  # Optional
    data_agent=data_agent,
)
try:
    data_agent_client.create_data_agent(request=request)
    print("Data Agent created")
except Exception as e:
    print(f"Error creating Data Agent: {e}")
In the previous example, replace the value data_agent_1 with a unique identifier for the data agent.
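The samples on this page repeat the same resource-name strings in several places. As a small convenience sketch, the following helpers build those strings in one place, using the same formats as the samples. The function names are hypothetical and are not part of the SDK.

```python
LOCATION = "global"  # Location used by every sample on this page.


def location_path(project: str) -> str:
    """Return the parent path used in create and chat requests."""
    return f"projects/{project}/locations/{LOCATION}"


def data_agent_path(project: str, data_agent_id: str) -> str:
    """Return the full resource name of a data agent."""
    return f"{location_path(project)}/dataAgents/{data_agent_id}"


def conversation_path(project: str, conversation_id: str) -> str:
    """Return the full resource name of a conversation."""
    return f"{location_path(project)}/conversations/{conversation_id}"


print(data_agent_path("my_project_name", "data_agent_1"))
# projects/my_project_name/locations/global/dataAgents/data_agent_1
```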
Retrieve a data agent
The following sample Python code demonstrates how to make an API request to retrieve a data agent that you previously created.
# Initialize request arguments
data_agent_id = "data_agent_1"
request = geminidataanalytics.GetDataAgentRequest(
    name=f"projects/{billing_project}/locations/global/dataAgents/{data_agent_id}",
)
# Make the request
response = data_agent_client.get_data_agent(request=request)
# Handle the response
print(response)
In the previous example, replace the value data_agent_1 with the unique identifier for the data agent that you want to retrieve.
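If you need the individual parts of a data agent's resource name, for example to log only the agent ID, a regular expression over the name format used in the request above is one option. The sketch below assumes that names always follow that format. The parse_data_agent_name helper is hypothetical.

```python
import re

# Pattern derived from the resource name format used in the samples on this page.
_DATA_AGENT_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/dataAgents/(?P<data_agent_id>[^/]+)$"
)


def parse_data_agent_name(name: str) -> dict[str, str]:
    """Split a data agent resource name into project, location, and ID."""
    match = _DATA_AGENT_NAME.match(name)
    if match is None:
        raise ValueError(f"Not a data agent resource name: {name!r}")
    return match.groupdict()


parts = parse_data_agent_name(
    "projects/my_project_name/locations/global/dataAgents/data_agent_1"
)
print(parts["data_agent_id"])  # data_agent_1
```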
Create a conversation
The following sample Python code makes an API request to create a conversation.
# Initialize request arguments
data_agent_id = "data_agent_1"
conversation_id = "conversation_1"
conversation = geminidataanalytics.Conversation()
conversation.agents = [f'projects/{billing_project}/locations/global/dataAgents/{data_agent_id}']
conversation.name = f"projects/{billing_project}/locations/global/conversations/{conversation_id}"
request = geminidataanalytics.CreateConversationRequest(
    parent=f"projects/{billing_project}/locations/global",
    conversation_id=conversation_id,
    conversation=conversation,
)
# Make the request
response = data_chat_client.create_conversation(request=request)
# Handle the response
print(response)
Replace the sample values as follows:
- data_agent_1: The ID of the data agent, as defined in the sample code block in Create a data agent.
- conversation_1: A unique identifier for the conversation.
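Because the conversation ID must be unique, generating it is often simpler than choosing one by hand. The following sketch appends a random UUID to a prefix. The new_conversation_id helper is hypothetical, and this page does not state which characters or lengths the API accepts, so treat the format as an assumption to check against the API reference.

```python
import uuid


def new_conversation_id(prefix: str = "conversation") -> str:
    """Return an ID made of a prefix and 32 random hexadecimal characters."""
    return f"{prefix}_{uuid.uuid4().hex}"


conversation_id = new_conversation_id()
print(conversation_id)
```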
Use the API to ask questions
After you create a data agent and a conversation, the following sample Python code sends a query to the agent. The code uses the context that you set up for stateful or stateless chat. The API returns a stream of messages that represent the steps that the agent takes to answer the query.
Stateful chat
Send a stateful chat request with a Conversation reference
You can send a stateful chat request to the data agent by referencing a Conversation resource that you previously created.
# Create a request that contains a single user message (your question)
question = "Which species of tree is most prevalent?"
messages = [geminidataanalytics.Message()]
messages[0].user_message.text = question
data_agent_id = "data_agent_1"
conversation_id = "conversation_1"
# Create a conversation_reference
conversation_reference = geminidataanalytics.ConversationReference()
conversation_reference.conversation = f"projects/{billing_project}/locations/global/conversations/{conversation_id}"
conversation_reference.data_agent_context.data_agent = f"projects/{billing_project}/locations/global/dataAgents/{data_agent_id}"
# conversation_reference.data_agent_context.credentials = credentials
# Form the request
request = geminidataanalytics.ChatRequest(
    parent=f"projects/{billing_project}/locations/global",
    messages=messages,
    conversation_reference=conversation_reference,
)
# Make the request
stream = data_chat_client.chat(request=request)
# Handle the response
for response in stream:
    show_message(response)
Replace the sample values as follows:
- Which species of tree is most prevalent?: A natural language question to send to the data agent.
- data_agent_1: The unique identifier for the data agent, as defined in Create a data agent.
- conversation_1: The unique identifier for the conversation, as defined in Create a conversation.
Stateless chat
The following code samples demonstrate how to send a query to the data agent when you've set up context for stateless chat. You can send stateless queries by referencing a previously defined DataAgent resource or by using inline context in the request.
Send a stateless chat request with a DataAgent reference
You can send a query to the data agent by referencing a DataAgent resource that you previously created.
# Create a request that contains a single user message (your question)
question = "Which species of tree is most prevalent?"
messages = [geminidataanalytics.Message()]
messages[0].user_message.text = question
data_agent_id = "data_agent_1"
data_agent_context = geminidataanalytics.DataAgentContext()
data_agent_context.data_agent = f"projects/{billing_project}/locations/global/dataAgents/{data_agent_id}"
# data_agent_context.credentials = credentials
# Form the request
request = geminidataanalytics.ChatRequest(
    parent=f"projects/{billing_project}/locations/global",
    messages=messages,
    data_agent_context=data_agent_context,
)
# Make the request
stream = data_chat_client.chat(request=request)
# Handle the response
for response in stream:
    show_message(response)
Replace the sample values as follows:
- Which species of tree is most prevalent?: A natural language question to send to the data agent.
- data_agent_1: The unique identifier for the data agent, as defined in Create a data agent.
Send a stateless chat request with inline context
The following sample code demonstrates how to use the inline_context parameter to provide context directly within your stateless chat request.
# Create a request that contains a single user message (your question)
question = "Which species of tree is most prevalent?"
messages = [geminidataanalytics.Message()]
messages[0].user_message.text = question
request = geminidataanalytics.ChatRequest(
    inline_context=inline_context,
    parent=f"projects/{billing_project}/locations/global",
    messages=messages,
)
# Make the request
stream = data_chat_client.chat(request=request)
# Handle the response
for response in stream:
    show_message(response)
In the previous example, replace Which species of tree is most prevalent? with a natural language question to send to the data agent.
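The three chat samples on this page differ only in which context field they set: conversation_reference, data_agent_context, or inline_context. The following sketch collects the request arguments in a dictionary and checks that exactly one of those fields is provided. The chat_request_kwargs helper is hypothetical, and the exactly-one rule is an assumption based on how the samples are written.

```python
def chat_request_kwargs(parent, messages, *, conversation_reference=None,
                        data_agent_context=None, inline_context=None):
    """Return keyword arguments for a chat request with exactly one context."""
    contexts = {
        "conversation_reference": conversation_reference,
        "data_agent_context": data_agent_context,
        "inline_context": inline_context,
    }
    provided = {key: value for key, value in contexts.items() if value is not None}
    if len(provided) != 1:
        raise ValueError(
            f"Provide exactly one context, got: {sorted(provided) or 'none'}"
        )
    return {"parent": parent, "messages": messages, **provided}


# Placeholder strings stand in for the SDK objects built in the samples.
kwargs = chat_request_kwargs(
    "projects/my_project_name/locations/global",
    ["<message>"],
    inline_context="<inline context>",
)
print(sorted(kwargs))  # ['inline_context', 'messages', 'parent']
```

With real SDK objects, the resulting dictionary could be unpacked into the request constructor, as in geminidataanalytics.ChatRequest(**kwargs).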
Create a stateless multi-turn conversation
To ask follow-up questions in a stateless conversation, your application must manage the conversation's context by sending the entire message history with each new request. The following example shows how to create a multi-turn conversation by referencing a data agent or by using inline context to provide the data source directly.
# List that is used to track previous turns and is reused across requests
conversation_messages = []
data_agent_id = "data_agent_1"
# Use data agent context
data_agent_context = geminidataanalytics.DataAgentContext()
data_agent_context.data_agent = f"projects/{billing_project}/locations/global/dataAgents/{data_agent_id}"
# data_agent_context.credentials = credentials
# Helper function for calling the API
def multi_turn_Conversation(msg):
    message = geminidataanalytics.Message()
    message.user_message.text = msg
    # Send a multi-turn request by including previous turns and the new message
    conversation_messages.append(message)
    request = geminidataanalytics.ChatRequest(
        parent=f"projects/{billing_project}/locations/global",
        messages=conversation_messages,
        # Use data agent context
        data_agent_context=data_agent_context,
        # Use inline context
        # inline_context=inline_context,
    )
    # Make the request
    stream = data_chat_client.chat(request=request)
    # Handle the response
    for response in stream:
        show_message(response)
        # Record each response so that the next turn includes it
        conversation_messages.append(response)
# Send the first turn request
multi_turn_Conversation("Which species of tree is most prevalent?")
# Send follow-up turn request
multi_turn_Conversation("Can you show me the results as a bar chart?")
In the previous example, replace the sample values as follows:
- data_agent_1: The unique identifier for the data agent, as defined in the sample code block in Create a data agent.
- Which species of tree is most prevalent?: A natural language question to send to the data agent.
- Can you show me the results as a bar chart?: A follow-up question that builds on or refines the previous question.
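The example above keeps the history in a module-level list. An alternative, sketched here, wraps the history in a small class and takes the sending function as a parameter, which makes the bookkeeping testable without calling the API. StatelessSession and fake_send are hypothetical names. In a real integration, the send function would build the chat request and return the response stream.

```python
from typing import Callable, Iterable


class StatelessSession:
    """Keeps the message history for one stateless conversation."""

    def __init__(self, send: Callable[[list], Iterable]):
        self._send = send
        self.history: list = []

    def ask(self, message) -> list:
        """Send the full history plus the new message and record the replies."""
        self.history.append(message)
        # Pass a copy so the sender sees the history as it was at request time.
        responses = list(self._send(list(self.history)))
        self.history.extend(responses)
        return responses


def fake_send(messages):
    # Stand-in for the streaming chat call; yields one reply per request.
    yield f"reply to {messages[-1]!r} ({len(messages)} messages received)"


session = StatelessSession(fake_send)
session.ask("Which species of tree is most prevalent?")
session.ask("Can you show me the results as a bar chart?")
print(len(session.history))  # 4
```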
Define helper functions
The following sample code contains helper function definitions that are used in the previous code samples. Run this code before any sample that calls show_message. These functions help to parse the response from the API and display the results.
from pygments import highlight, lexers, formatters
import pandas as pd
import requests
import json as json_lib
import altair as alt
import IPython
from IPython.display import display, HTML
import proto
from google.protobuf.json_format import MessageToDict, MessageToJson
def handle_text_response(resp):
    parts = getattr(resp, 'parts')
    print(''.join(parts))
def display_schema(data):
    fields = getattr(data, 'fields')
    df = pd.DataFrame({
        "Column": map(lambda field: getattr(field, 'name'), fields),
        "Type": map(lambda field: getattr(field, 'type'), fields),
        "Description": map(lambda field: getattr(field, 'description', '-'), fields),
        "Mode": map(lambda field: getattr(field, 'mode'), fields)
    })
    display(df)
def display_section_title(text):
    display(HTML('<h2>{}</h2>'.format(text)))
def format_looker_table_ref(table_ref):
    return 'lookmlModel: {}, explore: {}, lookerInstanceUri: {}'.format(table_ref.lookml_model, table_ref.explore, table_ref.looker_instance_uri)
def format_bq_table_ref(table_ref):
    return '{}.{}.{}'.format(table_ref.project_id, table_ref.dataset_id, table_ref.table_id)
def display_datasource(datasource):
    source_name = ''
    if 'studio_datasource_id' in datasource:
        source_name = getattr(datasource, 'studio_datasource_id')
    elif 'looker_explore_reference' in datasource:
        source_name = format_looker_table_ref(getattr(datasource, 'looker_explore_reference'))
    else:
        source_name = format_bq_table_ref(getattr(datasource, 'bigquery_table_reference'))
    print(source_name)
    display_schema(datasource.schema)
def handle_schema_response(resp):
    if 'query' in resp:
        print(resp.query.question)
    elif 'result' in resp:
        display_section_title('Schema resolved')
        print('Data sources:')
        for datasource in resp.result.datasources:
            display_datasource(datasource)
def handle_data_response(resp):
    if 'query' in resp:
        query = resp.query
        display_section_title('Retrieval query')
        print('Query name: {}'.format(query.name))
        print('Question: {}'.format(query.question))
        print('Data sources:')
        for datasource in query.datasources:
            display_datasource(datasource)
    elif 'generated_sql' in resp:
        display_section_title('SQL generated')
        print(resp.generated_sql)
    elif 'result' in resp:
        display_section_title('Data retrieved')
        fields = [field.name for field in resp.result.schema.fields]
        d = {}
        for el in resp.result.data:
            for field in fields:
                if field in d:
                    d[field].append(el[field])
                else:
                    d[field] = [el[field]]
        display(pd.DataFrame(d))
def handle_chart_response(resp):
    def _value_to_dict(v):
        if isinstance(v, proto.marshal.collections.maps.MapComposite):
            return _map_to_dict(v)
        elif isinstance(v, proto.marshal.collections.RepeatedComposite):
            return [_value_to_dict(el) for el in v]
        elif isinstance(v, (int, float, str, bool)):
            return v
        else:
            return MessageToDict(v)
    def _map_to_dict(d):
        out = {}
        for k in d:
            if isinstance(d[k], proto.marshal.collections.maps.MapComposite):
                out[k] = _map_to_dict(d[k])
            else:
                out[k] = _value_to_dict(d[k])
        return out
    if 'query' in resp:
        print(resp.query.instructions)
    elif 'result' in resp:
        vegaConfig = resp.result.vega_config
        vegaConfig_dict = _map_to_dict(vegaConfig)
        alt.Chart.from_json(json_lib.dumps(vegaConfig_dict)).display()
def show_message(msg):
    m = msg.system_message
    if 'text' in m:
        handle_text_response(getattr(m, 'text'))
    elif 'schema' in m:
        handle_schema_response(getattr(m, 'schema'))
    elif 'data' in m:
        handle_data_response(getattr(m, 'data'))
    elif 'chart' in m:
        handle_chart_response(getattr(m, 'chart'))
    print('\n')
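Before building a DataFrame, handle_data_response pivots the returned rows into one list per column. The following sketch isolates that step with plain dictionaries so that it can be read and tested on its own. The rows_to_columns helper and the sample rows are hypothetical and are not real query results.

```python
def rows_to_columns(fields: list[str], rows: list[dict]) -> dict[str, list]:
    """Pivot row dictionaries into a mapping of column name to values."""
    columns: dict[str, list] = {}
    for row in rows:
        for field in fields:
            # setdefault creates the list the first time a field is seen.
            columns.setdefault(field, []).append(row[field])
    return columns


# Made-up rows, for illustration only.
rows = [
    {"species": "Sycamore", "count": 3},
    {"species": "Maple", "count": 2},
]
print(rows_to_columns(["species", "count"], rows))
# {'species': ['Sycamore', 'Maple'], 'count': [3, 2]}
```

The resulting dictionary has the same shape as the one that handle_data_response passes to pd.DataFrame.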