Use a LlamaIndex Query Pipeline agent

Before you begin

This tutorial assumes that you have read and followed the instructions in the preceding steps of this series, which cover setting up your environment and developing and deploying a LlamaIndex Query Pipeline agent.

Get an instance of an agent

To query a LlamaIndexQueryPipelineAgent, you first need to create a new instance or retrieve an existing one.

To get the LlamaIndexQueryPipelineAgent corresponding to a specific resource ID:

Vertex AI SDK for Python

Run the following code:

agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")
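
Here, client is assumed to be an already-initialized Vertex AI SDK client. The following is a minimal sketch of that initialization, assuming the vertexai.Client entry point available in recent versions of the SDK (replace the placeholder values with your own):

import vertexai

# A sketch: create the SDK client that exposes client.agent_engines.
client = vertexai.Client(
    project="PROJECT_ID",
    location="LOCATION",
)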

Python requests library

Run the following code:

from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    # Obtain Application Default Credentials and refresh them to get a token.
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

response = requests.get(
f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
)
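
The call returns a requests.Response object. For example, you can check that the request succeeded and read the agent resource that the service returns as JSON:

# Raise an error on a non-2xx status, then print the returned resource.
response.raise_for_status()
print(response.json())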

REST API

curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID

When using the Vertex AI SDK for Python, the agent object is an instance of the AgentEngine class, which contains the following:

  • an agent.api_resource with information about the deployed agent. You can also call agent.operation_schemas() to return the list of operations that the agent supports, as shown in the example below. See Supported operations for details.
  • an agent.api_client that allows for synchronous service interactions
  • an agent.async_api_client that allows for asynchronous service interactions

The rest of this section assumes that you have an AgentEngine instance assigned to a variable named agent.
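
For example, you can list the operations that the deployed agent supports by printing its operation schemas:

# Print the schema of each operation the deployed agent supports.
for schema in agent.operation_schemas():
    print(schema)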

Supported operations

The following operations are supported for LlamaIndexQueryPipelineAgent:

  • query: for getting a response to a query synchronously.

The query method supports the following argument:

  • input: the messages to be sent to the agent.

Query the agent

The command:

agent.query(input="What is Paul Graham's life in college?")

is equivalent to the following (in full form):

agent.query(input={"input": "What is Paul Graham's life in college?"})

To customize the input dictionary, see Customize the prompt template.

You can also customize the agent's behavior beyond input by passing additional keyword arguments to query(). For example, pass batch=True to run the pipeline in batch mode over a list of inputs:

response = agent.query(
    input={
        "input": [
            "What is Paul Graham's life in college?",
            "How did Paul Graham's college experience shape his career?",
            "How did Paul Graham's college experience shape his entrepreneurial mindset?",
        ],
    },
    batch=True,  # Run the pipeline in batch mode over a list of inputs.
)
print(response)
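
The structure of the batch response depends on your pipeline's output component. As a sketch, assuming the response deserializes to a list with one entry per input (an assumption, not something the API guarantees), you can iterate over it:

if isinstance(response, list):
    # One entry per input in the batch (assumed shape).
    for item in response:
        print(item)
else:
    print(response)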

See the QueryPipeline.run code for a complete list of available parameters.

What's next