Use a LlamaIndex Query Pipeline agent

Before you begin

This tutorial assumes that you have read and followed the instructions in the preceding steps of this series, which cover setting up your environment and developing and deploying a LlamaIndex Query Pipeline agent.

Get an instance of an agent

To query a LlamaIndexQueryPipelineAgent, you first need to create a new instance or retrieve an existing one.

To get the LlamaIndexQueryPipelineAgent corresponding to a specific resource ID:

Vertex AI SDK for Python

Run the following code:

agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")
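
Here, client is assumed to be an already-initialized Vertex AI SDK client. The following is a minimal sketch of that initialization, assuming the vertexai.Client entry point available in recent versions of the SDK (replace the placeholder values with your own):

import vertexai

# A sketch: create the SDK client that exposes client.agent_engines.
client = vertexai.Client(
    project="PROJECT_ID",
    location="LOCATION",
)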

Python requests library

Run the following code:

from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    # Obtain Application Default Credentials and refresh them to get a token.
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

response = requests.get(
f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
)
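
The call returns a requests.Response object. For example, you can check that the request succeeded and read the agent resource that the service returns as JSON:

# Raise an error on a non-2xx status, then print the returned resource.
response.raise_for_status()
print(response.json())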

REST API

curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID

When using the Vertex AI SDK for Python, the agent object is an instance of the AgentEngine class, which contains the following:

  • an agent.api_resource with information about the deployed agent. You can also call agent.operation_schemas() to return the list of operations that the agent supports, as shown in the example below. See Supported operations for details.
  • an agent.api_client that allows for synchronous service interactions
  • an agent.async_api_client that allows for asynchronous service interactions

The rest of this section assumes that you have an AgentEngine instance assigned to a variable named agent.
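
For example, you can list the operations that the deployed agent supports by printing its operation schemas:

# Print the schema of each operation the deployed agent supports.
for schema in agent.operation_schemas():
    print(schema)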

Supported operations

The following operations are supported for LlamaIndexQueryPipelineAgent:

  • query: for getting a response to a query synchronously.

The query method supports the following argument:

  • input: the messages to be sent to the agent.

Query the agent

The command:

agent.query(input="What is Paul Graham's life in college?")

is equivalent to the following (in full form):

agent.query(input={"input": "What is Paul Graham's life in college?"})

To customize the input dictionary, see Customize the prompt template.

You can also customize the agent's behavior beyond input by passing additional keyword arguments to query(). For example, pass batch=True to run the pipeline in batch mode over a list of inputs:

response = agent.query(
    input={
        "input": [
            "What is Paul Graham's life in college?",
            "How did Paul Graham's college experience shape his career?",
            "How did Paul Graham's college experience shape his entrepreneurial mindset?",
        ],
    },
    batch=True,  # Run the pipeline in batch mode over a list of inputs.
)
print(response)
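
The structure of the batch response depends on your pipeline's output component. As a sketch, assuming the response deserializes to a list with one entry per input (an assumption, not something the API guarantees), you can iterate over it:

if isinstance(response, list):
    # One entry per input in the batch (assumed shape).
    for item in response:
        print(item)
else:
    print(response)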

See the QueryPipeline.run code for a complete list of available parameters.

What's next