Esta página foi traduzida pela API Cloud Translation.

Desenvolver um agente de pipeline de consulta do LlamaIndex

Nesta página, mostramos como desenvolver um agente usando o modelo LlamaIndex Query Pipelines (a classe LlamaIndexQueryPipelineAgent no SDK da Vertex AI para Python). Ele foi projetado para responder a perguntas usando a geração aumentada de recuperação (RAG, na sigla em inglês), como a seguinte consulta: "Como foi a vida de Paul Graham na faculdade?"

Siga estas etapas para desenvolver um agente usando os pipelines de consulta do LlamaIndex:

Definir e configurar um modelo
Definir e usar um extrator
Definir e usar um sintetizador de respostas
(Opcional) Personalizar o modelo de comando
(Opcional) Personalizar a orquestração

Antes de começar

Verifique se o ambiente está configurado seguindo as etapas em Configurar o ambiente.

Definir e configurar um modelo

Defina e configure um modelo para uso do agente do pipeline de consulta do LlamaIndex.

Defina a versão do modelo:
```
model = "gemini-2.0-flash"
```

(Opcional) Especifique os parâmetros do modelo:

model_kwargs = {
    # vertexai_config (dict): By providing the region and project_id parameters,
    # you can enable model usage through Vertex AI.
    "vertexai_config": {
        "project": "PROJECT_ID",
        "location": "LOCATION"
    },
    # temperature (float): The sampling temperature controls the degree of
    # randomness in token selection.
    "temperature": 0.28,
    # context_window (int): The context window of the model.
    # If not provided, the default context window is 200000.
    "context_window": 200000,
    # max_tokens (int): Token limit determines the maximum
    # amount of text output from one prompt. If not provided,
    # the default max_tokens is 256.
    "max_tokens": 256,
}

Crie um LlamaIndexQueryPipelineAgent usando as seguintes configurações de modelo:

from vertexai.preview import reasoning_engines

agent = reasoning_engines.LlamaIndexQueryPipelineAgent(
    model=model,                # Required.
    model_kwargs=model_kwargs,  # Optional.
)

Se você estiver executando em um ambiente interativo (como o terminal ou o notebook do Colab), poderá consultar o agente:

response = agent.query(input="What is Paul Graham's life in college?")

print(response)

Você receberá uma resposta semelhante a esta:

{'message': {'role': 'assistant',
  'additional_kwargs': {},
  'blocks': [{'block_type': 'text',
    'text': "Unfortunately, there's not a lot of publicly available information about Paul Graham's personal life in college. ..."}]},
  'raw': {'content': {'parts': [{'video_metadata': None,
      'thought': None,
      'code_execution_result': None,
      'executable_code': None,
      'file_data': None,
      'function_call': None,
      'function_response': None,
      'inline_data': None,
      'text': "Unfortunately, there's not a lot of publicly available information about Paul Graham's personal life in college. ..."}],
    'role': 'model'},
    'citation_metadata': None,
    'finish_message': None,
    'token_count': None,
    'avg_logprobs': -0.1468650027438327,
    'finish_reason': 'STOP',
    'grounding_metadata': None,
    'index': None,
    'logprobs_result': None,
    'safety_ratings': [{'blocked': None,
      'category': 'HARM_CATEGORY_HATE_SPEECH',
      'probability': 'NEGLIGIBLE',
      'probability_score': 0.022949219,
      'severity': 'HARM_SEVERITY_NEGLIGIBLE',
      'severity_score': 0.014038086},
    {'blocked': None,
      'category': 'HARM_CATEGORY_DANGEROUS_CONTENT',
      'probability': 'NEGLIGIBLE',
      'probability_score': 0.056640625,
      'severity': 'HARM_SEVERITY_NEGLIGIBLE',
      'severity_score': 0.029296875},
    {'blocked': None,
      'category': 'HARM_CATEGORY_HARASSMENT',
      'probability': 'NEGLIGIBLE',
      'probability_score': 0.071777344,
      'severity': 'HARM_SEVERITY_NEGLIGIBLE',
      'severity_score': 0.024047852},
    {'blocked': None,
      'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT',
      'probability': 'NEGLIGIBLE',
      'probability_score': 0.103515625,
      'severity': 'HARM_SEVERITY_NEGLIGIBLE',
      'severity_score': 0.05102539}],
    'usage_metadata': {'cached_content_token_count': None,
    'candidates_token_count': 222,
    'prompt_token_count': 10,
    'total_token_count': 232}},
  'delta': None,
  'logprobs': None,
  'additional_kwargs': {}}

(Opcional) Personalizar seu modelo

O modelo LlamaIndexQueryPipelineAgent usa Google GenAI por padrão para fornecer acesso a todos os modelos básicos disponíveis em Google Cloud. Para usar um modelo não disponibilizado por Google GenAI, defina model_builder= da seguinte maneira:

from typing import Optional

def model_builder(
    *,
    model_name: str,                      # Required. The name of the model
    model_kwargs: Optional[dict] = None,  # Optional. The model keyword arguments.
    **kwargs,                             # Optional. The remaining keyword arguments to be ignored.
):

Para ver uma lista dos modelos de chat compatíveis com o LlamaIndexQueryPipeline e respectivos recursos, consulte Integrações de LLM disponíveis. Cada modelo de chat usa um conjunto próprio de valores compatíveis para model= e model_kwargs=.

IA generativa do Google

A IA generativa do Google é instalada por padrão quando você configura seu ambiente e é usada automaticamente no modelo LlamaIndexQueryPipelineAgent quando você omite model_builder.

from vertexai.preview import reasoning_engines

agent = reasoning_engines.LlamaIndexQueryPipelineAgent(
    model=model,                # Required.
    model_kwargs=model_kwargs,  # Optional.
)

Anthropic

Siga a documentação da Anthropic para configurar uma conta e instalar o pacote llama-index-llms-anthropic.

Defina model_builder para retornar o modelo Anthropic:

def model_builder(*, model_name: str, model_kwargs = None, **kwargs):
    from llama_index.llms.anthropic import Anthropic

    return Anthropic(model=model_name, **model_kwargs)

Use o modelo da Anthropic no modelo LlamaIndexQueryPipelineAgent:

from vertexai.preview import reasoning_engines

agent = reasoning_engines.LlamaIndexQueryPipelineAgent(
    model="claude-3-opus-20240229",           # Required.
    model_builder=model_builder,              # Required.
    model_kwargs={
        "api_key": "ANTHROPIC_API_KEY",    # Required.
        "temperature": 0.28,                  # Optional.
    },
)

OpenAILike

É possível usar OpenAILike com a API ChatCompletions do Gemini.

Siga a documentação do OpenAILike para instalar o pacote:
```
pip install llama-index-llms-openai-like
```

Defina um model_builder que retorne o modelo OpenAILike:

def model_builder(
    *,
    model_name: str,
    model_kwargs = None,
    project: str,   # Specified via vertexai.init
    location: str,  # Specified via vertexai.init
    **kwargs,
):
    import google.auth
    from llama_index.llms.openai_like import OpenAILike

    # Note: the credential lives for 1 hour by default.
    # After expiration, it must be refreshed.
    creds, _ = google.auth.default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
    auth_req = google.auth.transport.requests.Request()
    creds.refresh(auth_req)

    if model_kwargs is None:
        model_kwargs = {}

    endpoint = f"https://{location}-aiplatform.googleapis.com"
    api_base = f'{endpoint}/v1beta1/projects/{project}/locations/{location}/endpoints/openapi'

    return OpenAILike(
        model=model_name,
        api_base=api_base,
        api_key=creds.token,
        **model_kwargs,
    )

Use o modelo no modelo LlamaIndexQueryPipelineAgent:

from vertexai.preview import reasoning_engines

agent = reasoning_engines.LlamaIndexQueryPipelineAgent(
    model="google/gemini-2.0-flash",  # Or "meta/llama3-405b-instruct-maas"
    model_builder=model_builder,        # Required.
    model_kwargs={
        "temperature": 0,               # Optional.
        "max_retries": 2,               # Optional.
    },
)

Definir e usar um extrator

Depois de definir o modelo, defina o extrator que ele usa para raciocínio. Um recuperador pode ser criado com base em índices, mas também pode ser definido de forma abrangente. Teste seu extrator localmente.

Defina um recuperador que retorne documentos relevantes e pontuações de similaridade:

def retriever_builder(model, retriever_kwargs=None):
    import os
    import requests
    from llama_index.core import (
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )
    from llama_index.core import SimpleDirectoryReader
    from llama_index.embeddings.vertex import VertexTextEmbedding
    import google.auth

    credentials, _ = google.auth.default()
    embed_model = VertexTextEmbedding(
        model_name="text-embedding-005", project="PROJECT_ID", credentials=credentials
    )

    data_dir = "data/paul_graham"
    essay_file = os.path.join(data_dir, "paul_graham_essay.txt")
    storage_dir = "storage"

    # --- Simple Download (if needed) ---
    if not os.path.exists(essay_file):
        os.makedirs(data_dir, exist_ok=True)  # Make sure the directory exists
        essay_url = "https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt"
        try:
            response = requests.get(essay_url)
            response.raise_for_status()  # Check for download errors
            with open(essay_file, "wb") as f:
                f.write(response.content)
            print("Essay downloaded.")
        except requests.exceptions.RequestException as e:
            print(f"Download failed: {e}")

    # --- Build/Load Index ---
    if not os.path.exists(storage_dir):
        print("Creating new index...")
        # --- Load Data ---
        reader = SimpleDirectoryReader(data_dir)
        docs = reader.load_data()

        index = VectorStoreIndex.from_documents(docs, model=model, embed_model=embed_model)
        index.storage_context.persist(persist_dir=storage_dir)
    else:
        print("Loading existing index...")
        storage_context = StorageContext.from_defaults(persist_dir=storage_dir)
        index = load_index_from_storage(storage_context, embed_model=embed_model)

    return index.as_retriever()

Teste o extrator:

from llama_index.llms.google_genai import GoogleGenAI

model = GoogleGenAI(
    model=model,
    **model_kwargs
)
retriever = retriever_builder(model)
retrieved_response = retriever.retrieve("What is Paul Graham's life in College?")

A resposta recuperada será semelhante a esta:

[
  NodeWithScore(
    node=TextNode(
      id_='692a5d5c-cd56-4ed0-8e29-ecadf6eb9933',
      embedding=None,
      metadata={'file_path': '/content/data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75042, 'creation_date': '2025-03-24', 'last_modified_date': '2025-03-24'},
      excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'],
      excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'],
      relationships={
        <NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='3e1c4d73-1e1d-4e83-bd16-2dae24abb231', node_type='4', metadata={'file_path': '/content/data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75042, 'creation_date': '2025-03-24', 'last_modified_date': '2025-03-24'}, hash='0c3c3f46cac874b495d944dfc4b920f6b68817dbbb1699ecc955d1fafb2bf87b'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='782c5787-8753-4f65-85ed-c2833ea6d4d8', node_type='1', metadata={'file_path': '/content/data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75042, 'creation_date': '2025-03-24', 'last_modified_date': '2025-03-24'}, hash='b8e6463833887a8a2b13f1b5a623672819faedc1b725d9565ba003223628db0e'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='f7d2cb7e-fa0c-40bf-b8e7-b888e36b87f9', node_type='1', metadata={}, hash='db7cc1a67fa3afd1e5f24c8c61583781ce6a00c444da8f25a5374468c17b7de0')
      },
      metadata_template='{key}: {value}',
      metadata_separator='\n',
      text='So I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp...',
      mimetype='text/plain',
      start_char_idx=7166,
      end_char_idx=11549,
      metadata_separator='\n',
      text_template='{metadata_str}\n\n{content}'
    ),
    score=0.7403571819090398
  )
]

Para usar o extrator no modelo LlamaIndexQueryPipelineAgent, adicione-o ao argumento retriever_builder=:

from vertexai.preview import reasoning_engines

agent = reasoning_engines.LlamaIndexQueryPipelineAgent(
    model=model,                          # Required.
    model_kwargs=model_kwargs,            # Optional.
    retriever_builder=retriever_builder,  # Optional.
)

Teste o agente localmente executando consultas de teste:

response = agent.query(
    input="What is Paul Graham's life in College?"
)

A resposta é uma lista serializável em JSON de nós com pontuações.

[{'node': {'id_': '692a5d5c-cd56-4ed0-8e29-ecadf6eb9933',
  'embedding': None,
  'metadata': {'file_path': '/content/data/paul_graham/paul_graham_essay.txt',
    'file_name': 'paul_graham_essay.txt',
    'file_type': 'text/plain',
    'file_size': 75042,
    'creation_date': '2025-03-12',
    'last_modified_date': '2025-03-12'},
  'excluded_embed_metadata_keys': ['file_name',
    'file_type',
    'file_size',
    'creation_date',
    'last_modified_date',
    'last_accessed_date'],
  'excluded_llm_metadata_keys': ['file_name',
    'file_type',
    'file_size',
    'creation_date',
    'last_modified_date',
    'last_accessed_date'],
  'relationships': {'1': {'node_id': '07ee9574-04c8-46c7-b023-b22ba9558a1f',
    'node_type': '1',
    'metadata': {},
    'hash': '44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a',
    'class_name': 'RelatedNodeInfo'},
    '2': {'node_id': 'ac7e54aa-6fff-40b5-a15e-89c5eb234936',
    'node_type': '1',
    'metadata': {'file_path': '/content/data/paul_graham/paul_graham_essay.txt',
      'file_name': 'paul_graham_essay.txt',
      'file_type': 'text/plain',
      'file_size': 75042,
      'creation_date': '2025-03-12',
      'last_modified_date': '2025-03-12'},
    'hash': '755327a01efe7104db771e4e6f9683417884ea6895d878da882d2b21a6b66442',
    'class_name': 'RelatedNodeInfo'},
    '3': {'node_id': '3a04be27-ac46-4acd-a8c6-031689508982',
    'node_type': '1',
    'metadata': {},
    'hash': 'db7cc1a67fa3afd1e5f24c8c61583781ce6a00c444da8f25a5374468c17b7de0',
    'class_name': 'RelatedNodeInfo'}},
  'metadata_template': '{key}: {value}',
  'metadata_separator': '\n',
  'text': 'So I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp...',
  'mimetype': 'text/plain',
  'start_char_idx': 7164,
  'end_char_idx': 11547,
  'metadata_separator': '\n',
  'text_template': '{metadata_str}\n\n{content}',
  'class_name': 'TextNode'},
  'score': 0.25325886336265013,
  'class_name': 'NodeWithScore'}
]

Definir e usar um sintetizador de respostas

Depois de definir o modelo e o extrator, defina o response-synthesizer, que gera uma resposta de um LLM usando uma consulta do usuário e um determinado conjunto de blocos de texto. É possível usar o get_response_synthesizer padrão ou configurar o modo de resposta.

Defina um sintetizador de respostas que retorne a resposta:

def response_synthesizer_builder(model, response_synthesizer_kwargs=None):
    from llama_index.core.response_synthesizers import SimpleSummarize

    return SimpleSummarize(llm=model)

Teste a função:

response_synthesizer = response_synthesizer_builder(model=model)
response = response_synthesizer.get_response(
    "What is Paul Graham's life in College?",
    [node.model_dump_json() for node in retrieved_response],
)

A resposta será semelhante a esta:

"While in a PhD program for computer science, he took art classes and worked on a book about Lisp hacking. He applied to art schools, got accepted to RISD, and later got an invitation to take the entrance exam at the Accademia di Belli Arti in Florence. He was accepted to both. He attended the Accademia, but was disappointed by the lack of instruction."

Para usar o sintetizador de respostas no modelo LlamaIndexQueryPipeline, adicione-o ao argumento response_synthesizer_builder=:

from vertexai.preview import reasoning_engines

agent = reasoning_engines.LlamaIndexQueryPipelineAgent(
    model=model,                                                    # Required.
    model_kwargs=model_kwargs,                                      # Optional.
    retriever_builder=retriever_builder,                            # Optional.
    response_synthesizer_builder=response_synthesizer_builder,      # Optional.
)

Teste o pipeline de consulta RAG completo localmente executando consultas de teste:

response = agent.query(
    input="What is Paul Graham's life in College?"
)

A resposta é um dicionário semelhante a este:

{
  'response': "While in college, he was drawn to McCarthy's 1960 Lisp, although he didn't fully grasp the reasons for his interest at the time. He also had a brief encounter with surplus Xerox Dandelions in the computer lab but found them too slow for his liking. \n",
  'source_nodes': [
    '{"node":{"id_":"95889c30-53c7-43d0-bf91-930dbb23bde6"...,"score":0.7077213268404997,"class_name":"NodeWithScore"}'
  ],
  'metadata': {
    '95889c30-53c7-43d0-bf91-930dbb23bde6': {
      'file_path': '/content/data/paul_graham/paul_graham_essay.txt',
      'file_name': 'paul_graham_essay.txt',
      'file_type': 'text/plain',
      'file_size': 75042,
      'creation_date': '2025-03-25',
      'last_modified_date': '2025-03-25'
    }
  }
}

(Opcional) Personalizar o modelo de comando

Os modelos de comando traduzem a entrada do usuário em instruções para o modelo, orientando as respostas para uma saída contextualmente relevante e coerente. Consulte Comandos para mais detalhes.

O modelo de comando padrão é organizado sequencialmente nas seguintes seções:

Seção	Descrição
(Opcional) Instrução do sistema	Instruções para o agente ser aplicado em todas as consultas.
Entrada do usuário	A consulta do usuário que o agente precisa responder.

O modelo de comando padrão será gerado se você criar o agente sem especificar seu próprio modelo de comando e terá o seguinte aspecto:

from llama_index.core import prompts
from llama_index.core.base.llms import types

message_templates = [
  types.ChatMessage(role=types.MessageRole.SYSTEM, content=system_instruction),
  types.ChatMessage(role=types.MessageRole.USER, content="{input}"),
]
prompts.ChatPromptTemplate(message_templates=message_templates)

Você pode usar o modelo de comando completo ao instanciar o agente no exemplo a seguir:

  from vertexai.preview import reasoning_engines

  system_instruction = "I help to find what is Paul Graham's life in College"

  agent = reasoning_engines.LlamaIndexQueryPipelineAgent(
      model=model,
      system_instruction=system_instruction,
  )

É possível substituir o modelo de comando padrão pelo seu próprio modelo e usá-lo ao criar o agente:

prompt_str = "Please answer {question} about {name}"
prompt_tmpl = PromptTemplate(prompt_str)

from vertexai.preview import reasoning_engines
agent = reasoning_engines.LlamaIndexQueryPipelineAgent(
    model = model,
    prompt = prompt_tmpl,
)

agent.query(
    input={
        "name": "Paul Graham",
        "question": "What is the life in college?",
    }
)

(Opcional) Personalizar a orquestração

Todos os componentes LlamaIndexQueryPipeline implementam a interface de componente de consulta, que fornece esquemas de entrada e saída para orquestração. O LlamaIndexQueryPipelineAgent precisa que um executável seja criado para responder a consultas. Por padrão, o LlamaIndexQueryPipelineAgent cria uma cadeia sequencial ou um gráfico acíclico dirigido (DAG) usando Query Pipeline.

Convém personalizar a orquestração se você pretende fazer o seguinte:

Implemente um agente que estenda o pipeline RAG (por exemplo, estendendo um módulo de solicitação, modelo, extrator, sintetizador de respostas para mecanismo de consulta, transformador de consulta, analisadores de saída, pós-processador/reclassificadores ou componente de consulta personalizada).
Solicite que o agente use o ReAct para executar ferramentas e anotar cada etapa com comentários sobre por que ela foi realizada. Para fazer isso, substitua o executável padrão ao criar o LlamaIndexQueryPipelineAgent especificando o argumento runnable_builder=:
```
from typing import Optional
from llama_index.core.llms import function_calling

def runnable_builder(
    model: function_calling.FunctionCallingLLM,
    *,
    system_instruction: Optional[str] = None,
    prompt: Optional[query.QUERY_COMPONENT_TYPE] = None,
    retriever: Optional[query.QUERY_COMPONENT_TYPE] = None,
    response_synthesizer: Optional[query.QUERY_COMPONENT_TYPE] = None,
    runnable_kwargs: Optional[Mapping[str, Any]] = None,
):
```
Em que:
- model corresponde ao modelo de chat retornado pelo model_builder (consulte Definir e configurar um modelo).
- retriever e retriever_kwargs correspondem ao extrator e às configurações a serem usadas (consulte Definir um extrator).
- response_synthesizer e response_synthesizer_kwargs correspondem ao sintetizador de respostas e às configurações a serem usadas (consulte Definir um sintetizador de respostas).
- system_instruction e prompt correspondem à configuração do comando (consulte Personalizar o modelo de comando).
- agent_executor_kwargs e runnable_kwargs são os argumentos de palavra-chave que podem ser usados para personalizar o executável.

É possível personalizar a lógica de orquestração usando um pipeline personalizado ou o ReAct:

Pipeline personalizado

Para fornecer um módulo extra (como um pós-processador) ao agente, substitua runnable_builder por LlamaIndexQueryPipelineAgent.

Defina um pós-processador:

def post_processor_builder():
  from llama_index.core.postprocessor import SimilarityPostprocessor

  # similarity postprocessor: filter nodes below 0.7 similarity score
  return SimilarityPostprocessor(similarity_cutoff=0.7)

def runnable_with_postprocessor_builder(
    model, runnable_kwargs, **kwargs
):
  from llama_index.core.query_pipeline import QueryPipeline

  pipeline = QueryPipeline(**runnable_kwargs)
  pipeline_modules = {
      "retriever": retriever_builder(model),
      "postprocessor": post_processor_builder(),
  }
  pipeline.add_modules(pipeline_modules)
  pipeline.add_link("retriever", "postprocessor")

  return pipeline

agent = reasoning_engines.LlamaIndexQueryPipelineAgent(
  model=model,
  runnable_builder=runnable_with_postprocessor_builder,
)

Consultar o agente:

result = agent.query(input="What is Paul Graham's life in College?")

A saída será semelhante a esta:

[
  {
    'node': {'id_': 'bb7d2942-213d-4fb3-a7cb-1a664642a7ff',
    'embedding': None,
    'metadata': {
      'file_path': '/content/data/paul_graham/paul_graham_essay.txt',
      'file_name': 'paul_graham_essay.txt',
      'file_type': 'text/plain',
      'file_size': 75042,
      'creation_date': '2025-03-25',
      'last_modified_date': '2025-03-25'
    },
    'excluded_embed_metadata_keys': [
      'file_name',
      'file_type',
      'file_size',
      'creation_date',
      'last_modified_date',
      'last_accessed_date'
    ],
    'excluded_llm_metadata_keys': [
      'file_name',
      'file_type',
      'file_size',
      'creation_date',
      'last_modified_date',
      'last_accessed_date'
    ],
    'relationships': {'1': {'node_id': 'c508cee5-5ef2-4fdf-a33d-0427dcb78b5c',
      'node_type': '4',
      'metadata': {'file_path': '/content/data/paul_graham/paul_graham_essay.txt',
        'file_name': 'paul_graham_essay.txt',
        'file_type': 'text/plain',
        'file_size': 75042,
        'creation_date': '2025-03-25',
        'last_modified_date': '2025-03-25'},
      'hash': '0c3c3f46cac874b495d944dfc4b920f6b68817dbbb1699ecc955d1fafb2bf87b',
      'class_name': 'RelatedNodeInfo'},
      '2': {'node_id': '97a84b41-62bf-4959-acae-cfd4bdfbd4d9',
      'node_type': '1',
      'metadata': {'file_path': '/content/data/paul_graham/paul_graham_essay.txt',
        'file_name': 'paul_graham_essay.txt',
        'file_type': 'text/plain',
        'file_size': 75042,
        'creation_date': '2025-03-25',
        'last_modified_date': '2025-03-25'},
      'hash': 'a7dd352be97e47e8e553ceda3d2d2c9e9d5c54adb298063c94da06167938d583',
      'class_name': 'RelatedNodeInfo'},
      '3': {'node_id': 'b984eea1-f0bc-4880-812e-3f49f1e304b8',
      'node_type': '1',
      'metadata': {},
      'hash': 'db7cc1a67fa3afd1e5f24c8c61583781ce6a00c444da8f25a5374468c17b7de0',
      'class_name': 'RelatedNodeInfo'}},
    'metadata_template': '{key}: {value}',
    'metadata_separator': '\n',
    'text': 'So I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp...',
    'mimetype': 'text/plain',
    'start_char_idx': 7166,
    'end_char_idx': 11549,
    'metadata_separator': '\n',
    'text_template': '{metadata_str}\n\n{content}',
    'class_name': 'TextNode'},
    'score': 0.7403571819090398,
    'class_name': 'NodeWithScore'
  },
  {
    'node': {'id_': 'b984eea1-f0bc-4880-812e-3f49f1e304b8...'}
    'score': 0.7297395567513889,
    'class_name': 'NodeWithScore'
  }
]

Agente ReAct

Para fornecer um comportamento de chamada de ferramenta com seu próprio agente ReAct, substitua runnable_builder por LlamaIndexQueryPipelineAgent.

Defina uma função de exemplo que retorne uma taxa de câmbio:

def get_exchange_rate(
  currency_from: str = "USD",
  currency_to: str = "EUR",
  currency_date: str = "latest",
):
  """Retrieves the exchange rate between two currencies on a specified date.

  Uses the Frankfurter API (https://api.frankfurter.app/) to obtain
  exchange rate data.

  Args:
      currency_from: The base currency (3-letter currency code).
          Defaults to "USD" (US Dollar).
      currency_to: The target currency (3-letter currency code).
          Defaults to "EUR" (Euro).
      currency_date: The date for which to retrieve the exchange rate.
          Defaults to "latest" for the most recent exchange rate data.
          Can be specified in YYYY-MM-DD format for historical rates.

  Returns:
      dict: A dictionary containing the exchange rate information.
          Example: {"amount": 1.0, "base": "USD", "date": "2023-11-24",
              "rates": {"EUR": 0.95534}}
  """
  import requests
  response = requests.get(
      f"https://api.frankfurter.app/{currency_date}",
      params={"from": currency_from, "to": currency_to},
  )
  return response.json()

Crie um agente ReAct personalizado com ferramentas:

def runnable_with_tools_builder(model, runnable_kwargs=None, **kwargs):
  from llama_index.core.query_pipeline import QueryPipeline
  from llama_index.core.tools import FunctionTool
  from llama_index.core.agent import ReActAgent

  llama_index_tools = []
  for tool in runnable_kwargs.get("tools"):
      llama_index_tools.append(FunctionTool.from_defaults(tool))
  agent = ReActAgent.from_tools(llama_index_tools, llm=model, verbose=True)
  return QueryPipeline(modules = {"agent": agent})

agent = reasoning_engines.LlamaIndexQueryPipelineAgent(
  model="gemini-2.0-flash",
  runnable_kwargs={"tools": [get_exchange_rate]},
  runnable_builder=runnable_with_tools_builder,
)