
Build a data agent using the Python SDK

This page shows how to use the Python SDK to send requests to the Conversational Analytics API. The sample Python code demonstrates how to complete the following tasks:

Authenticate and set up your environment
Specify the billing project and system instructions
Connect to a Looker, BigQuery, or Looker Studio data source
Set up context for stateful or stateless chat
Create a data agent
Get a data agent
Create a conversation
Use the API to ask questions
Send multi-turn requests
Define helper functions

Authenticate and set up your environment

To use the Python SDK for the Conversational Analytics API, follow the instructions in the Conversational Analytics API SDK Colaboratory notebook to download and install the SDK. Note that the download method and the contents of the SDK Colab are subject to change.

After you complete the setup instructions in the notebook, you can use the following code to import the required SDK libraries, authenticate your Google Account in a Colaboratory environment, and initialize the clients that send API requests:

from google.colab import auth
auth.authenticate_user()

from google.cloud import geminidataanalytics

data_agent_client = geminidataanalytics.DataAgentServiceClient()
data_chat_client = geminidataanalytics.DataChatServiceClient()

Specify the billing project and system instructions

The following sample Python code defines the billing project and the system instructions that are used throughout the script:

# Billing project
billing_project = "my_project_name"

# System instructions
system_instruction = "Help the user analyze their data."

Replace the sample values as follows:

my_project_name: The ID of the billing project in which the required APIs are enabled.
Help the user analyze their data.: System instructions that guide the agent's behavior and tailor it to your needs. For example, you can use system instructions to define business terms (such as what counts as a 'loyal customer'), control response length ('summarize in fewer than 20 words'), or set data formatting ('match company standards').
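The three uses of system instructions listed above (business terms, response length, and data formatting) can be combined into a single string. The following is a minimal sketch, not part of the SDK: build_system_instruction is a hypothetical helper, and the 'loyal customer' definition is an invented placeholder that you would replace with your own.

```python
def build_system_instruction(role, glossary=None, max_words=None, formatting=None):
    """Assemble a system instruction string from optional parts (hypothetical helper)."""
    lines = [role]
    # Business terms that the agent should interpret consistently
    for term, definition in (glossary or {}).items():
        lines.append(f"Definition: '{term}' means {definition}.")
    # Response-length constraint
    if max_words is not None:
        lines.append(f"Summarize answers in fewer than {max_words} words.")
    # Data-formatting guidance
    if formatting:
        lines.append(f"Formatting: {formatting}.")
    return "\n".join(lines)

system_instruction = build_system_instruction(
    "Help the user analyze their data.",
    # Placeholder definition, for illustration only
    glossary={"loyal customer": "a customer with 3 or more orders in the last 12 months"},
    max_words=20,
    formatting="match company standards",
)
print(system_instruction)
```

Keeping the parts separate makes it easier to review or change one rule without rewriting the whole instruction.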

Connect to a data source

The following Python code samples show how to define the connection details for the Looker, BigQuery, or Looker Studio data source that the agent queries to answer questions.

Connect to Looker data

The following code samples show how to define the details of a connection to a Looker Explore by using either API keys or an access token.

API keys

You can establish a connection to a Looker instance by using generated Looker API keys, as described in Authenticate and connect to a data source with the Conversational Analytics API.

looker_client_id = "my_looker_client_id"
looker_client_secret = "my_looker_client_secret"
looker_instance_uri = "https://my_company.looker.com"
lookml_model = "my_model"
explore = "my_explore"

looker_explore_reference = geminidataanalytics.LookerExploreReference()
looker_explore_reference.looker_instance_uri = looker_instance_uri
looker_explore_reference.lookml_model = lookml_model
looker_explore_reference.explore = explore

credentials = geminidataanalytics.Credentials()
credentials.oauth.secret.client_id = looker_client_id
credentials.oauth.secret.client_secret = looker_client_secret

datasource_references = geminidataanalytics.DatasourceReferences()
datasource_references.looker.explore_references = [looker_explore_reference]

Replace the sample values as follows:

my_looker_client_id: The client ID of your generated Looker API key.
my_looker_client_secret: The client secret of your generated Looker API key.
https://my_company.looker.com: The complete URL of your Looker instance.
my_model: The name of the LookML model that includes the Explore that you want to connect to.
my_explore: The name of the Looker Explore that you want the data agent to query.
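Rather than hard-coding the client ID and client secret in a notebook, you might read them from environment variables. The following is a minimal sketch under that assumption: load_looker_api_keys is a hypothetical helper, and LOOKER_CLIENT_ID and LOOKER_CLIENT_SECRET are hypothetical variable names that neither the SDK nor Looker defines.

```python
import os

def load_looker_api_keys(env=os.environ):
    """Read Looker API key values from a mapping of environment variables.

    LOOKER_CLIENT_ID and LOOKER_CLIENT_SECRET are hypothetical variable names.
    """
    required = ("LOOKER_CLIENT_ID", "LOOKER_CLIENT_SECRET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return env["LOOKER_CLIENT_ID"], env["LOOKER_CLIENT_SECRET"]

# Example with an explicit mapping. In practice, call load_looker_api_keys()
# with no arguments so that the values come from the real environment.
looker_client_id, looker_client_secret = load_looker_api_keys(
    {
        "LOOKER_CLIENT_ID": "my_looker_client_id",
        "LOOKER_CLIENT_SECRET": "my_looker_client_secret",
    }
)
```

The returned values can then be assigned to the credentials object in the same way as the hard-coded strings in the sample.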

Access token

You can establish a connection to a Looker instance by using an access token, as described in Authenticate and connect to a data source with the Conversational Analytics API.

looker_access_token = "my_access_token"
looker_instance_uri = "https://my_company.looker.com"
lookml_model = "my_model"
explore = "my_explore"

looker_explore_reference = geminidataanalytics.LookerExploreReference()
looker_explore_reference.looker_instance_uri = looker_instance_uri
looker_explore_reference.lookml_model = lookml_model
looker_explore_reference.explore = explore

credentials = geminidataanalytics.Credentials()
credentials.oauth.token.access_token = looker_access_token

datasource_references = geminidataanalytics.DatasourceReferences()
datasource_references.looker.explore_references = [looker_explore_reference]

Replace the sample values as follows:

my_access_token: The access_token value that you generate to authenticate to Looker.
https://my_company.looker.com: The complete URL of your Looker instance.
my_model: The name of the LookML model that includes the Explore that you want to connect to.
my_explore: The name of the Looker Explore that you want the data agent to query.

Connect to BigQuery data

The following sample code defines a connection to a single BigQuery table.

bq_project_id = "my_project_id"
bq_dataset_id = "my_dataset_id"
bq_table_id = "my_table_id"

bigquery_table_reference = geminidataanalytics.BigQueryTableReference()
bigquery_table_reference.project_id = bq_project_id
bigquery_table_reference.dataset_id = bq_dataset_id
bigquery_table_reference.table_id = bq_table_id

# Connect to your data source
datasource_references = geminidataanalytics.DatasourceReferences()
datasource_references.bq.table_references = [bigquery_table_reference]

Replace the sample values as follows:

my_project_id: The ID of the Google Cloud project that contains the BigQuery dataset and table that you want to connect to. To connect to a public dataset, specify bigquery-public-data.
my_dataset_id: The ID of the BigQuery dataset. For example, san_francisco.
my_table_id: The ID of the BigQuery table. For example, street_trees.
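If you have a table path in the form project.dataset.table, you can split it into the three IDs that the sample expects. The following is a minimal sketch: split_table_path is a hypothetical helper, and it assumes that none of the three IDs contains a dot.

```python
def split_table_path(path):
    """Split 'project.dataset.table' into its three IDs (hypothetical helper)."""
    parts = path.split(".")
    # Reject paths with too few or too many segments, or with empty segments
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected 'project.dataset.table', got {path!r}")
    return tuple(parts)

bq_project_id, bq_dataset_id, bq_table_id = split_table_path(
    "bigquery-public-data.san_francisco.street_trees"
)
print(bq_project_id, bq_dataset_id, bq_table_id)
```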

Connect to Looker Studio data

The following sample code defines a connection to a Looker Studio data source.

studio_datasource_id = "my_datasource_id"

studio_references = geminidataanalytics.StudioDatasourceReference()
studio_references.datasource_id = studio_datasource_id

# Connect to your data source
datasource_references = geminidataanalytics.DatasourceReferences()
datasource_references.studio.studio_references = [studio_references]

In the previous example, replace my_datasource_id with the ID of the data source.

Set up context for stateful or stateless chat

The following sample Python code shows how to set up context for stateful or stateless chat:

Stateful chat: Google Cloud stores and manages the conversation history. You only need to send the current message in each turn.
Stateless chat: You must send the entire conversation history with every message.
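The difference between the two modes comes down to what each request carries. The following sketch illustrates that with plain Python lists and no API calls. messages_to_send is a hypothetical function, and the strings stand in for real message objects.

```python
def messages_to_send(history, new_message, stateful):
    """Return the messages a client would include in one request.

    history: messages from earlier turns (questions and answers)
    stateful: True if the service stores the conversation history
    """
    if stateful:
        # The service already holds the earlier turns; send only the current message
        return [new_message]
    # Stateless: the request must carry every earlier turn plus the new message
    return history + [new_message]

history = ["Which species of tree is most prevalent?", "<agent answer>"]
follow_up = "Can you show me the results as a bar chart?"

print(len(messages_to_send(history, follow_up, stateful=True)))   # 1
print(len(messages_to_send(history, follow_up, stateful=False)))  # 3
```

In stateless mode the request grows with every turn, so the client is responsible for tracking the history.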

Stateful chat

The following sample code sets up context for stateful chat, where Google Cloud stores and manages the conversation history. The sample also includes the optional line published_context.options.analysis.python.enabled = True, which enables advanced analysis with Python. Remove that line if you don't need advanced analysis.

# Set up context for stateful chat
published_context = geminidataanalytics.Context()
published_context.system_instruction = system_instruction
published_context.datasource_references = datasource_references
# Optional: To enable advanced analysis with Python, include the following line:
published_context.options.analysis.python.enabled = True

Stateless chat

The following sample code sets up context for stateless chat, where you must send the entire conversation history with every message. The sample also includes the optional line inline_context.options.analysis.python.enabled = True, which enables advanced analysis with Python. Remove that line if you don't need advanced analysis.

# Set up context for stateless chat
# datasource_references.looker.credentials = credentials
inline_context = geminidataanalytics.Context()
inline_context.system_instruction = system_instruction
inline_context.datasource_references = datasource_references
# Optional: To enable advanced analysis with Python, include the following line:
inline_context.options.analysis.python.enabled = True

Create a data agent

The following sample Python code sends an API request to create a data agent, which you can then use to have a conversation about your data. The data agent is configured with the data source, system instructions, and context that you specified.

data_agent_id = "data_agent_1"

data_agent = geminidataanalytics.DataAgent()
data_agent.data_analytics_agent.published_context = published_context
data_agent.name = f"projects/{billing_project}/locations/global/dataAgents/{data_agent_id}" # Optional

request = geminidataanalytics.CreateDataAgentRequest(
    parent=f"projects/{billing_project}/locations/global",
    data_agent_id=data_agent_id, # Optional
    data_agent=data_agent,
)

try:
    data_agent_client.create_data_agent(request=request)
    print("Data Agent created")
except Exception as e:
    print(f"Error creating Data Agent: {e}")

In the previous example, replace the value data_agent_1 with a unique identifier for the data agent.
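The samples on this page build the same resource name strings in several places. One way to keep them consistent is to centralize the formatting. The following is a minimal sketch: the function names are hypothetical and are not part of the SDK, and the path patterns are copied from the f-strings used in the samples.

```python
LOCATION = "global"  # The location used throughout the samples on this page

def location_path(project):
    """Return the parent path used in create and chat requests."""
    return f"projects/{project}/locations/{LOCATION}"

def data_agent_path(project, data_agent_id):
    """Return the full resource name of a data agent."""
    return f"{location_path(project)}/dataAgents/{data_agent_id}"

def conversation_path(project, conversation_id):
    """Return the full resource name of a conversation."""
    return f"{location_path(project)}/conversations/{conversation_id}"

print(data_agent_path("my_project_name", "data_agent_1"))
# projects/my_project_name/locations/global/dataAgents/data_agent_1
```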

Get a data agent

The following sample Python code shows how to send an API request to retrieve a data agent that you previously created.

# Initialize request arguments
data_agent_id = "data_agent_1"
request = geminidataanalytics.GetDataAgentRequest(
    name=f"projects/{billing_project}/locations/global/dataAgents/{data_agent_id}",
)

# Make the request
response = data_agent_client.get_data_agent(request=request)

# Handle the response
print(response)

In the previous example, replace the value data_agent_1 with the unique identifier of the data agent that you want to retrieve.

Create a conversation

The following sample Python code sends an API request to create a conversation.

# Initialize request arguments
data_agent_id = "data_agent_1"
conversation_id = "conversation_1"

conversation = geminidataanalytics.Conversation()
conversation.agents = [f'projects/{billing_project}/locations/global/dataAgents/{data_agent_id}']
conversation.name = f"projects/{billing_project}/locations/global/conversations/{conversation_id}"

request = geminidataanalytics.CreateConversationRequest(
    parent=f"projects/{billing_project}/locations/global",
    conversation_id=conversation_id,
    conversation=conversation,
)

# Make the request
response = data_chat_client.create_conversation(request=request)

# Handle the response
print(response)

Replace the sample values as follows:

data_agent_1: The ID of the data agent, as defined in the sample code block in Create a data agent.
conversation_1: A unique identifier for the conversation.

Use the API to ask questions

After you create a data agent and a conversation, the following sample Python code sends a query to the agent. The code uses the context that you set up for stateful or stateless chat. The API returns a stream of messages that represent the steps that the agent takes to answer the query. The samples print each message with the show_message helper function, which is defined at the end of this page.

Stateful chat

Send a stateful chat request with a `Conversation` reference

To send a stateful chat request to the data agent, you can reference a Conversation resource that you previously created.

# Create a request that contains a single user message (your question)
question = "Which species of tree is most prevalent?"
messages = [geminidataanalytics.Message()]
messages[0].user_message.text = question

data_agent_id = "data_agent_1"
conversation_id = "conversation_1"

# Create a conversation_reference
conversation_reference = geminidataanalytics.ConversationReference()
conversation_reference.conversation = f"projects/{billing_project}/locations/global/conversations/{conversation_id}"
conversation_reference.data_agent_context.data_agent = f"projects/{billing_project}/locations/global/dataAgents/{data_agent_id}"
# conversation_reference.data_agent_context.credentials = credentials

# Form the request
request = geminidataanalytics.ChatRequest(
    parent = f"projects/{billing_project}/locations/global",
    messages = messages,
    conversation_reference = conversation_reference
)

# Make the request
stream = data_chat_client.chat(request=request)

# Handle the response
for response in stream:
    show_message(response)

Replace the sample values as follows:

Which species of tree is most prevalent?: A natural language question to send to the data agent.
data_agent_1: The unique identifier of the data agent, as defined in Create a data agent.
conversation_1: The unique identifier of the conversation, as defined in Create a conversation.

Stateless chat

The following code samples show how to send a query to the data agent after you set up context for stateless chat. You can send stateless queries either by referencing a previously defined DataAgent resource or by using inline context in the request.

Send a stateless chat request with a `DataAgent` reference

To send a query to the data agent, you can reference a DataAgent resource that you previously created.

# Create a request that contains a single user message (your question)
question = "Which species of tree is most prevalent?"
messages = [geminidataanalytics.Message()]
messages[0].user_message.text = question

data_agent_id = "data_agent_1"

data_agent_context = geminidataanalytics.DataAgentContext()
data_agent_context.data_agent = f"projects/{billing_project}/locations/global/dataAgents/{data_agent_id}"
# data_agent_context.credentials = credentials

# Form the request
request = geminidataanalytics.ChatRequest(
    parent=f"projects/{billing_project}/locations/global",
    messages=messages,
    data_agent_context = data_agent_context
)

# Make the request
stream = data_chat_client.chat(request=request)

# Handle the response
for response in stream:
    show_message(response)

Replace the sample values as follows:

Which species of tree is most prevalent?: A natural language question to send to the data agent.
data_agent_1: The unique identifier of the data agent, as defined in Create a data agent.

Send a stateless chat request with inline context

The following code sample shows how to use the inline_context parameter to provide context directly in a stateless chat request.

# Create a request that contains a single user message (your question)
question = "Which species of tree is most prevalent?"
messages = [geminidataanalytics.Message()]
messages[0].user_message.text = question

request = geminidataanalytics.ChatRequest(
    inline_context=inline_context,
    parent=f"projects/{billing_project}/locations/global",
    messages=messages,
)

# Make the request
stream = data_chat_client.chat(request=request)

# Handle the response
for response in stream:
    show_message(response)

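In the previous example, replace Which species of tree is most prevalent? with the natural language question that you want to send to the data agent.

Send multi-turn requests

To create a multi-turn conversation, you can send follow-up questions to the data agent. The following code sample shows how to send multi-turn requests that build on previous responses to refine the conversation. The sample uses inline_context, so it follows the stateless pattern and resends the accumulated history with each request. For more information, see Create a multi-turn conversation.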

# List that is used to track previous turns and is reused across requests
input_message = []

# Helper function for calling the API
def multi_turn_Conversation(msg):

  message = geminidataanalytics.Message()
  message.user_message.text = msg

  # Send a multi-turn request by including previous turns and the new message
  input_message.append(message)

  request = geminidataanalytics.ChatRequest(
      inline_context=inline_context,
      parent=f"projects/{billing_project}/locations/global",
      messages=input_message,
  )

  # Make the request
  stream = data_chat_client.chat(request=request)

  # Handle the response
  for response in stream:
    show_message(response)
    input_message.append(response)

# Send the first turn request
multi_turn_Conversation("Which species of tree is most prevalent?")

# Send follow-up turn request
multi_turn_Conversation("Can you show me the results as a bar chart?")

In the previous example, replace the sample values as follows:

Which species of tree is most prevalent?: A natural language question to send to the data agent.
Can you show me the results as a bar chart?: A follow-up question that builds on or refines the previous question.
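The multi-turn sample keeps its history in a module-level list. If you prefer to keep that state in an object, the same bookkeeping might look like the following sketch. MultiTurnSession and fake_send are hypothetical names, plain dictionaries stand in for SDK message objects, and fake_send stands in for the streaming chat call so that the sketch runs without the API.

```python
class MultiTurnSession:
    """Keeps the turn history for a stateless conversation (illustrative sketch)."""

    def __init__(self, send):
        # send: callable that takes the full message list and returns an
        # iterable of response messages (stand-in for the streaming chat call)
        self._send = send
        self.messages = []

    def ask(self, text):
        # Add the new user turn, then send the whole history
        self.messages.append({"user": text})
        responses = list(self._send(list(self.messages)))
        # Keep the responses so that the next turn can build on them
        self.messages.extend(responses)
        return responses

def fake_send(messages):
    # Fake service: reports how many messages it received
    yield {"system": f"received {len(messages)} message(s)"}

session = MultiTurnSession(fake_send)
session.ask("Which species of tree is most prevalent?")
reply = session.ask("Can you show me the results as a bar chart?")
print(reply[0]["system"])  # received 3 message(s)
```

The second request carries three messages: the first question, the first response, and the follow-up question.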

Define helper functions

The following sample code contains the helper function definitions that are used in the previous code samples. These functions help to parse the response from the API and display the results.

from pygments import highlight, lexers, formatters
import pandas as pd
import requests
import json as json_lib
import altair as alt
import IPython
from IPython.display import display, HTML

import proto
from google.protobuf.json_format import MessageToDict, MessageToJson

def handle_text_response(resp):
  parts = getattr(resp, 'parts')
  print(''.join(parts))

def display_schema(data):
  fields = getattr(data, 'fields')
  df = pd.DataFrame({
    "Column": map(lambda field: getattr(field, 'name'), fields),
    "Type": map(lambda field: getattr(field, 'type'), fields),
    "Description": map(lambda field: getattr(field, 'description', '-'), fields),
    "Mode": map(lambda field: getattr(field, 'mode'), fields)
  })
  display(df)

def display_section_title(text):
  display(HTML('<h2>{}</h2>'.format(text)))

def format_looker_table_ref(table_ref):
  return 'lookmlModel: {}, explore: {}, lookerInstanceUri: {}'.format(table_ref.lookml_model, table_ref.explore, table_ref.looker_instance_uri)

def format_bq_table_ref(table_ref):
  return '{}.{}.{}'.format(table_ref.project_id, table_ref.dataset_id, table_ref.table_id)

def display_datasource(datasource):
  source_name = ''
  if 'studio_datasource_id' in datasource:
    source_name = getattr(datasource, 'studio_datasource_id')
  elif 'looker_explore_reference' in datasource:
    source_name = format_looker_table_ref(getattr(datasource, 'looker_explore_reference'))
  else:
    source_name = format_bq_table_ref(getattr(datasource, 'bigquery_table_reference'))

  print(source_name)
  display_schema(datasource.schema)

def handle_schema_response(resp):
  if 'query' in resp:
    print(resp.query.question)
  elif 'result' in resp:
    display_section_title('Schema resolved')
    print('Data sources:')
    for datasource in resp.result.datasources:
      display_datasource(datasource)

def handle_data_response(resp):
  if 'query' in resp:
    query = resp.query
    display_section_title('Retrieval query')
    print('Query name: {}'.format(query.name))
    print('Question: {}'.format(query.question))
    print('Data sources:')
    for datasource in query.datasources:
      display_datasource(datasource)
  elif 'generated_sql' in resp:
    display_section_title('SQL generated')
    print(resp.generated_sql)
  elif 'result' in resp:
    display_section_title('Data retrieved')

    fields = [field.name for field in resp.result.schema.fields]
    d = {}
    for el in resp.result.data:
      for field in fields:
        if field in d:
          d[field].append(el[field])
        else:
          d[field] = [el[field]]

    display(pd.DataFrame(d))

def handle_chart_response(resp):
  def _value_to_dict(v):
    if isinstance(v, proto.marshal.collections.maps.MapComposite):
      return _map_to_dict(v)
    elif isinstance(v, proto.marshal.collections.RepeatedComposite):
      return [_value_to_dict(el) for el in v]
    elif isinstance(v, (int, float, str, bool)):
      return v
    else:
      return MessageToDict(v)

  def _map_to_dict(d):
    out = {}
    for k in d:
      if isinstance(d[k], proto.marshal.collections.maps.MapComposite):
        out[k] = _map_to_dict(d[k])
      else:
        out[k] = _value_to_dict(d[k])
    return out

  if 'query' in resp:
    print(resp.query.instructions)
  elif 'result' in resp:
    vegaConfig = resp.result.vega_config
    vegaConfig_dict = _map_to_dict(vegaConfig)
    alt.Chart.from_json(json_lib.dumps(vegaConfig_dict)).display()

def show_message(msg):
  m = msg.system_message
  if 'text' in m:
    handle_text_response(getattr(m, 'text'))
  elif 'schema' in m:
    handle_schema_response(getattr(m, 'schema'))
  elif 'data' in m:
    handle_data_response(getattr(m, 'data'))
  elif 'chart' in m:
    handle_chart_response(getattr(m, 'chart'))
  print('\n')
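The handle_data_response helper turns the returned rows into a dictionary of columns before it builds a DataFrame. The following sketch isolates that pivot so that you can try it without the API. rows_to_columns is a hypothetical name, and plain dictionaries with invented values stand in for the rows of the response.

```python
def rows_to_columns(rows, fields):
    """Pivot a list of row mappings into a dict of column lists."""
    # Start every column as an empty list so that column order follows `fields`
    columns = {field: [] for field in fields}
    for row in rows:
        for field in fields:
            columns[field].append(row[field])
    return columns

# Invented sample rows, for illustration only
rows = [
    {"species": "A", "count": 3},
    {"species": "B", "count": 1},
]
print(rows_to_columns(rows, ["species", "count"]))
# {'species': ['A', 'B'], 'count': [3, 1]}
```

Unlike the loop in handle_data_response, this version creates an empty list for every field up front, so a result with no rows still produces a dictionary that contains all of the column names.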
