Before you begin
This tutorial assumes that you have read and followed the instructions in:

- Develop a LlamaIndexQueryPipeline agent: to develop agent as an instance of LlamaIndexQueryPipelineAgent.
- User authentication: to authenticate as a user for querying the agent.
- Import and initialize the SDK: to initialize the client for getting a deployed instance (if needed).
Get an instance of an agent
To query a LlamaIndexQueryPipelineAgent, you need to first create a new instance or get an existing instance.

To get the LlamaIndexQueryPipelineAgent corresponding to a specific resource ID:
Vertex AI SDK for Python
Run the following code:
agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")
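The client object here is assumed to come from the SDK initialization step referenced in "Before you begin". If it helps to assemble the fully qualified resource name from its parts first, the following sketch shows the expected format (the placeholder values are hypothetical; substitute your own):

```python
# Hypothetical placeholder values -- substitute your own project details.
PROJECT_ID = "my-project"
LOCATION = "us-central1"
RESOURCE_ID = "1234567890"

# Assemble the fully qualified resource name expected by agent_engines.get().
resource_name = (
    f"projects/{PROJECT_ID}/locations/{LOCATION}"
    f"/reasoningEngines/{RESOURCE_ID}"
)
print(resource_name)
# projects/my-project/locations/us-central1/reasoningEngines/1234567890
```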
Python requests library
Run the following code:
from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

response = requests.get(
    f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
)
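The GET request above returns the reasoning engine resource as JSON. The sketch below parses a trimmed, hypothetical payload to recover the resource ID; the field names are illustrative, so inspect response.json() for the actual shape:

```python
# A trimmed, hypothetical example of the JSON payload returned by the call
# above (field names are illustrative; check response.json() for the real
# resource).
payload = {
    "name": "projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    "displayName": "my-llamaindex-agent",
}

# Pull the resource ID out of the fully qualified resource name.
resource_id = payload["name"].rsplit("/", 1)[-1]
print(resource_id)
# RESOURCE_ID
```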
REST API
curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID
When using the Vertex AI SDK for Python, the agent object corresponds to an AgentEngine class that contains the following:

- an agent.api_resource with information about the deployed agent. You can also call agent.operation_schemas() to return the list of operations that the agent supports. See Supported operations for details.
- an agent.api_client that allows for synchronous service interactions
- an agent.async_api_client that allows for asynchronous service interactions
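To see what a deployed agent can do, you can iterate over the schemas returned by agent.operation_schemas(). The sketch below uses a hypothetical stand-in list (so it runs without a deployed agent); the exact schema shape is an assumption, not the documented API:

```python
# Hypothetical stand-in for the list returned by agent.operation_schemas();
# the dictionary shape here is illustrative only.
sample_schemas = [
    {"name": "query", "description": "Get a response to a query synchronously."},
]

# With a deployed agent you would instead call:
#   sample_schemas = agent.operation_schemas()
supported_ops = [schema["name"] for schema in sample_schemas]
print(supported_ops)
# ['query']
```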
The rest of this section assumes that you have an AgentEngine instance named agent.
Supported operations
The following operations are supported for LlamaIndexQueryPipelineAgent:

- query: for getting a response to a query synchronously.

The query method supports the following type of argument:

- input: the messages to be sent to the agent.
Query the agent
The command:
agent.query(input="What is Paul Graham's life in college?")
is equivalent to the following (in full form):
agent.query(input={"input": "What is Paul Graham's life in college?"})
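The equivalence above amounts to wrapping a bare string in a dictionary keyed by "input". A minimal sketch of that normalization, using a hypothetical helper (not part of the SDK), looks like:

```python
def normalize_input(value):
    """Hypothetical helper mirroring the equivalence shown above:
    a bare string is wrapped into {"input": value}; a dict passes through."""
    if isinstance(value, dict):
        return value
    return {"input": value}

print(normalize_input("What is Paul Graham's life in college?"))
# {'input': "What is Paul Graham's life in college?"}
```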
To customize the input dictionary, see Customize the prompt template.
You can also customize the agent's behavior beyond input by passing additional keyword arguments to query().
response = agent.query(
    input={
        "input": [
            "What is Paul Graham's life in college?",
            "How did Paul Graham's college experience shape his career?",
            "How did Paul Graham's college experience shape his entrepreneurial mindset?",
        ],
    },
    batch=True,  # run the pipeline in batch mode and pass a list of inputs
)
print(response)
See the QueryPipeline.run code for a complete list of available parameters.