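Use a custom agent

Before you begin

This tutorial assumes that you have read and followed the following instructions:

Get an instance of an agent

To query an agent, you first need an instance of it. You can either create a new instance or get an existing instance of an agent.

To get the agent that corresponds to a specific resource ID:

Vertex AI SDK for Python

Run the following code: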

import vertexai

client = vertexai.Client(  # For service interactions via client.agent_engines
    project="PROJECT_ID",
    location="LOCATION",
)

agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")

print(agent)

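where PROJECT_ID is your Google Cloud project ID, LOCATION is the region where the agent is deployed, and RESOURCE_ID is the ID of the deployed agent (a reasoningEngine resource).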

Python requests library

Run the following code:

from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

response = requests.get(
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
)

REST

curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID
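All three variants above address the same resource path. As a minimal sketch, the helpers below assemble that path and the matching REST URL from the placeholders. The function names `reasoning_engine_name` and `reasoning_engine_url` are hypothetical and are not part of the SDK.

```python
def reasoning_engine_name(project_id: str, location: str, resource_id: str) -> str:
    """Return the full resource name, as passed to client.agent_engines.get(name=...)."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/reasoningEngines/{resource_id}"
    )


def reasoning_engine_url(
    project_id: str, location: str, resource_id: str, method: str = ""
) -> str:
    """Return the v1 REST URL; `method` is an optional suffix such as "query"."""
    name = reasoning_engine_name(project_id, location, resource_id)
    url = f"https://{location}-aiplatform.googleapis.com/v1/{name}"
    return f"{url}:{method}" if method else url


# Illustrative values only; substitute your own project, region, and resource ID.
print(reasoning_engine_url("my-project", "us-central1", "1234567890"))
```

Keeping the path in one place avoids the location appearing inconsistently in the host name and in the resource path.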

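When you use the Vertex AI SDK for Python, the agent object corresponds to an AgentEngine class that contains the following:

  • agent.api_resource, which contains information about the deployed agent. You can also call agent.operation_schemas() to return the list of operations that the agent supports. For details, see Supported operations.
  • agent.api_client, which allows synchronous service interactions.
  • agent.async_api_client, which allows asynchronous service interactions.

The rest of this section assumes that you have an instance named agent.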

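List supported operations

When you develop an agent locally, you already know which operations it supports. For a deployed agent, you can enumerate the operations it supports:

Vertex AI SDK for Python

Run the following code: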

print(agent.operation_schemas())

Python requests library

Run the following code:

import json

json.loads(response.content).get("spec").get("classMethods")

REST

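The supported operations are represented in spec.class_methods in the response to the curl request.

The schema for each operation is a dictionary that documents a method of the agent that you can call. The set of supported operations depends on the framework you used to develop the agent.

For example, the following is the schema for the query operation of a LangchainAgent: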

{'api_mode': '',
 'name': 'query',
 'description': """Queries the Agent with the given input and config.
    Args:
        input (Union[str, Mapping[str, Any]]):
            Required. The input to be passed to the Agent.
        config (langchain_core.runnables.RunnableConfig):
            Optional. The config (if any) to be used for invoking the Agent.
    Returns:
        The output of querying the Agent with the given input and config.
""",
 'parameters': {'$defs': {'RunnableConfig': {'description': 'Configuration for a Runnable.',
                                             'properties': {'configurable': {...},
                                                            'run_id': {...},
                                                            'run_name': {...},
                                                            ...},
                                             'type': 'object'}},
                'properties': {'config': {'nullable': True},
                               'input': {'anyOf': [{'type': 'string'}, {'type': 'object'}]}},
                'required': ['input'],
                'type': 'object'}}

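where

  • name is the name of the operation (for example, agent.query for an operation named query).
  • api_mode is the API mode of the operation ("" for synchronous, "stream" for streaming).
  • description is a description of the operation, based on the method's docstring.
  • parameters is the schema of the input arguments, in OpenAPI schema format.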
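The fields described above can be inspected programmatically. The following is a minimal sketch that summarizes a list of operation schemas and checks a call's keyword arguments against the `required` list. `SAMPLE_SCHEMAS` is a trimmed, hand-written sample in the shape of the schema shown earlier (the `stream_query` entry is an assumption for illustration), and both helper functions are hypothetical, not SDK calls.

```python
# Hand-written sample in the shape returned by agent.operation_schemas();
# the stream_query entry is assumed for illustration only.
SAMPLE_SCHEMAS = [
    {
        "api_mode": "",
        "name": "query",
        "description": "Queries the Agent with the given input and config.",
        "parameters": {
            "properties": {"config": {"nullable": True}, "input": {}},
            "required": ["input"],
            "type": "object",
        },
    },
    {
        "api_mode": "stream",
        "name": "stream_query",
        "description": "Streams responses from the Agent.",
        "parameters": {
            "properties": {"input": {}},
            "required": ["input"],
            "type": "object",
        },
    },
]


def summarize_operations(schemas):
    """Map each operation name to its API mode and required parameters."""
    summary = {}
    for schema in schemas:
        params = schema.get("parameters", {})
        summary[schema["name"]] = {
            "api_mode": schema.get("api_mode", ""),
            "required": list(params.get("required", [])),
        }
    return summary


def missing_arguments(schema, kwargs):
    """Return the required parameter names that are absent from kwargs."""
    required = schema.get("parameters", {}).get("required", [])
    return [name for name in required if name not in kwargs]


print(summarize_operations(SAMPLE_SCHEMAS))
print(missing_arguments(SAMPLE_SCHEMAS[0], {"config": None}))
```

With a deployed agent, the same helpers could be applied to the list returned by agent.operation_schemas().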

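Query the agent using supported operations

For custom agents, you can use any of the following query or streaming operations you defined when developing your agent:

Note that certain frameworks only support specific query or streaming operations:

Framework                 Supported query operations
Agent Development Kit     async_stream_query
LangChain                 query, stream_query
LangGraph                 query, stream_query
AG2                       query
LlamaIndex                query

Query the agent

Query the agent using the query operation: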

Vertex AI SDK for Python

agent.query(input="What is the exchange rate from US dollars to Swedish Krona today?")

Python requests library

import json

from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

requests.post(
    f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID:query",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
    data=json.dumps({
        "class_method": "query",
        "input": {
            "input": "What is the exchange rate from US dollars to Swedish Krona today?"
        }
    })
)

REST

curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID:query -d '{
  "class_method": "query",
  "input": {
    "input": "What is the exchange rate from US dollars to Swedish Krona today?"
  }
}'
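In the request bodies above, the outer `input` key holds the keyword arguments of the method named in `class_method`, and the inner `input` is the query method's own parameter. The sketch below builds such a body for any operation. The helper name `build_query_body` is hypothetical.

```python
import json


def build_query_body(class_method: str, **kwargs) -> str:
    """Serialize a request body: the method name plus its keyword arguments."""
    return json.dumps({"class_method": class_method, "input": kwargs})


# Equivalent to the body sent in the examples above.
body = build_query_body(
    "query",
    input="What is the exchange rate from US dollars to Swedish Krona today?",
)
print(body)
```

The resulting string can be passed as the `data` argument of requests.post or after `-d` in curl.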

The query response is a string that is similar to the output of a local application test:

{"input": "What is the exchange rate from US dollars to Swedish Krona today?",
 # ...
 "output": "For 1 US dollar you will get 10.7345 Swedish Krona."}

Stream responses from the agent

Stream a response from the agent using the stream_query operation:

Vertex AI SDK for Python

# `client` is the vertexai.Client instance created earlier.
agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")

for response in agent.stream_query(
    input="What is the exchange rate from US dollars to Swedish Krona today?"
):
    print(response)

Python requests library

import json

from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

requests.post(
    f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID:streamQuery",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {get_identity_token()}",
    },
    data=json.dumps({
        "class_method": "stream_query",
        "input": {
            "input": "What is the exchange rate from US dollars to Swedish Krona today?"
        },
    }),
    stream=True,
)

REST

curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID:streamQuery?alt=sse" -d '{
  "class_method": "stream_query",
  "input": {
    "input": "What is the exchange rate from US dollars to Swedish Krona today?"
  }
}'

Vertex AI Agent Engine streams responses as a sequence of iteratively generated objects. For example, a set of three responses might look like the following:

{'actions': [{'tool': 'get_exchange_rate', ...}]}  # first response
{'steps': [{'action': {'tool': 'get_exchange_rate', ...}}]}  # second response
{'output': 'The exchange rate is 11.0117 SEK per USD as of 2024-12-03.'}  # final response
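When you stream over HTTP with requests, the objects have to be decoded from the raw response lines. The following is a minimal sketch, assuming each object arrives as one JSON document per line (optionally prefixed with `data:` when server-sent events are requested). That framing is an assumption, and the helper name `iter_stream_objects` is hypothetical. The sample lines stand in for `response.iter_lines()`.

```python
import json
from typing import Iterable, Iterator


def iter_stream_objects(lines: Iterable[bytes]) -> Iterator[dict]:
    """Decode each non-empty line of a streamed response into a dict."""
    for raw in lines:
        text = raw.decode("utf-8").strip()
        if text.startswith("data:"):  # tolerate server-sent-event framing
            text = text[len("data:"):].strip()
        if text:
            yield json.loads(text)


# Simulated lines, standing in for response.iter_lines() from requests.
fake_lines = [
    b'{"actions": [{"tool": "get_exchange_rate"}]}',
    b"",
    b'{"output": "The exchange rate is 11.0117 SEK per USD as of 2024-12-03."}',
]
chunks = list(iter_stream_objects(fake_lines))
print(chunks[-1]["output"])
```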

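Query the agent asynchronously

If you defined an async_query operation when developing the agent, the Vertex AI SDK for Python supports client-side asynchronous querying of the agent: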

Vertex AI SDK for Python

# `client` is the vertexai.Client instance created earlier.
agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")

response = await agent.async_query(
    input="What is the exchange rate from US dollars to Swedish Krona today?"
)
print(response)
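A top-level `await` like the one above works in a notebook or an async REPL, but not in a regular Python script. As a minimal sketch, a script can wrap the call in a coroutine and run it with `asyncio.run`. `StubAgent` is a hypothetical stand-in that returns a canned answer so the example runs without a deployed agent.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in for a deployed agent; returns a canned answer."""

    async def async_query(self, *, input: str) -> dict:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return {
            "input": input,
            "output": "For 1 US dollar you will get 10.7345 Swedish Krona.",
        }


async def ask(agent, question: str) -> dict:
    """Await the agent's async_query operation and return its response."""
    return await agent.async_query(input=question)


result = asyncio.run(
    ask(StubAgent(), "What is the exchange rate from US dollars to Swedish Krona today?")
)
print(result["output"])
```

In real use, pass the agent instance obtained earlier instead of `StubAgent()`.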

The query response is a dictionary that is the same as the output of a local test:

{"input": "What is the exchange rate from US dollars to Swedish Krona today?",
 # ...
 "output": "For 1 US dollar you will get 10.7345 Swedish Krona."}

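Stream responses from the agent asynchronously

If you defined an async_stream_query operation when developing the agent, you can use it to stream responses from the agent asynchronously:

Vertex AI SDK for Python

# `client` is the vertexai.Client instance created earlier.
agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")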

async for response in agent.async_stream_query(
    input="What is the exchange rate from US dollars to Swedish Krona today?"
):
    print(response)

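The async_stream_query operation calls the same streamQuery endpoint under the hood and asynchronously streams responses as a sequence of iteratively generated objects. For example, a set of three responses might look like the following: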

{'actions': [{'tool': 'get_exchange_rate', ...}]}  # first response
{'steps': [{'action': {'tool': 'get_exchange_rate', ...}}]}  # second response
{'output': 'The exchange rate is 11.0117 SEK per USD as of 2024-12-03.'}  # final response

The responses should be the same as those generated during local testing.
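Often only the final answer of a stream matters. The following is a minimal sketch that consumes an asynchronous stream and keeps the last `output` value it sees. `StubStreamingAgent` is a hypothetical stand-in that yields canned chunks shaped like the three responses shown above, and `final_output` is an illustrative helper name.

```python
import asyncio


class StubStreamingAgent:
    """Hypothetical stand-in that yields canned chunks like those shown above."""

    async def async_stream_query(self, *, input: str):
        for chunk in (
            {"actions": [{"tool": "get_exchange_rate"}]},
            {"steps": [{"action": {"tool": "get_exchange_rate"}}]},
            {"output": "The exchange rate is 11.0117 SEK per USD as of 2024-12-03."},
        ):
            await asyncio.sleep(0)  # yield control between chunks
            yield chunk


async def final_output(agent, question: str):
    """Consume the stream and return the last 'output' value seen, if any."""
    output = None
    async for chunk in agent.async_stream_query(input=question):
        if "output" in chunk:
            output = chunk["output"]
    return output


answer = asyncio.run(final_output(StubStreamingAgent(), "USD to SEK?"))
print(answer)
```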

What's next