Tools

Using tools, you can connect agent apps to external systems. These systems can augment the knowledge of agent apps and empower them to execute complex tasks efficiently.

You can use built-in tools or build customized tools to suit your requirements.

Limitations

The following limitations apply:

  • You must create a data store (or connect an existing data store) when creating a data store tool for an agent app.
  • Apps with both chunked and non-chunked data stores are not supported.

Built-in tools

Built-in tools are hosted by Google. You can activate these tools in agent apps without the need for manual configuration.

The supported built-in tools are:

  • Code Interpreter: A Google first-party tool that combines code generation with code execution, allowing the user to perform various tasks, including data analysis, data visualization, text processing, and solving equations or optimization problems.

Your agent app is optimized to determine how and when these tools should be invoked, but you can provide additional examples to fit your use cases.

Examples should have a schema like the following:

{
  "toolUse": {
    "tool": "projects/PROJECT_ID/locations/LOCATION_ID/agents/AGENT_ID/tools/df-code-interpreter-tool",
    "action": "generate_and_execute",
    "inputParameters": [
      {
        "name": "generate_and_execute input",
        "value": "4 + 4"
      }
    ],
    "outputParameters": [
      {
        "name": "generate_and_execute output",
        "value": {
          "output_files": [
            {
              "name": "",
              "contents": ""
            }
          ],
          "execution_result": "8",
          "execution_error": "",
          "generated_code": "GENERATED_CODE"
        }
      }
    ]
  }
}

OpenAPI tools

An agent app can connect to an external API using an OpenAPI tool by providing the OpenAPI schema. By default, the agent app will call the API on your behalf. Alternatively, you can execute OpenAPI tools on the client side.

Example schema:

openapi: 3.0.0
info:
  title: Simple Pets API
  version: 1.0.0
servers:
  - url: 'https://api.pet-service-example.com/v1'
paths:
  /pets/{petId}:
    get:
      summary: Return a pet by ID.
      operationId: getPet
      parameters:
        - in: path
          name: petId
          required: true
          description: Pet id
          schema:
            type: integer
      responses:
        200:
          description: OK
  /pets:
    get:
      summary: List all pets
      operationId: listPets
      parameters:        
        - name: petName
          in: query
          required: false
          description: Pet name
          schema:
            type: string
        - name: label
          in: query
          description: Pet label
          style: form
          explode: true
          required: false
          schema:
            type: array
            items:
              type: string
        - name: X-OWNER
          in: header
          description: Optional pet owner provided in the HTTP header
          required: false
          schema:
            type: string
        - name: X-SESSION
          in: header
          description: Dialogflow session id
          required: false
          schema:
            $ref: "@dialogflow/sessionId"
      responses:
        '200':
          description: An array of pets
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Pet'
    post:
      summary: Create a new pet
      operationId: createPet
      requestBody:
        description: Pet to add to the store
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Pet'
      responses:
        '201':
          description: Pet created
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Pet'
components:
  schemas:
    Pet:
      type: object
      required:
        - id
        - name        
      properties:
        id:
          type: integer
          format: int64
        name:
          type: string
        owner:
          type: string
        label:
          type: array
          items:
            type: string

You can optionally use the internal schema reference @dialogflow/sessionId as the parameter schema type. With this schema type, the Dialogflow session ID for the current conversation is supplied as the parameter value. For example:

- name: X-SESSION
  in: header
  description: Dialogflow session id
  required: false
  schema:
    $ref: "@dialogflow/sessionId"

OpenAPI tool limitations

The following limitations apply:

  • The supported parameter types are path, query, and header. The cookie parameter type is not supported yet.
  • Parameters defined by OpenAPI schema support the following data types: string, number, integer, boolean, array. The object type is not supported yet.
  • You currently can't specify query parameters in the console example editor.
  • Request and response body must be empty or JSON.

OpenAPI tool API authentication

The following authentication options are supported when calling an external API:

  • Dialogflow Service Agent auth
    • Dialogflow can generate an ID token or access token using the Dialogflow Service Agent. The token is added to the authorization HTTP header when Dialogflow calls an external API.
    • An ID token can be used to access Cloud Functions and Cloud Run services after you grant the roles/cloudfunctions.invoker and roles/run.invoker roles to service-agent-project-number@gcp-sa-dialogflow.iam.gserviceaccount.com. If the Cloud Functions and Cloud Run services are in the same resource project, you don't need additional IAM permissions to call them.
    • An access token can be used to access other Google Cloud APIs after you grant required roles to service-agent-project-number@gcp-sa-dialogflow.iam.gserviceaccount.com.
  • API key
    • You can configure API key authentication by providing the key name, request location (header or query string), and API key so that Dialogflow passes the API key in the request.
  • OAuth
    • The OAuth Client Credentials flow is supported for server-to-server authentication. The client ID, client secret, and token endpoint from your OAuth provider need to be configured in Dialogflow. Dialogflow exchanges these credentials for an OAuth access token and passes it in the auth header of the request.
    • For other OAuth flows, you need to use the Function Tool to integrate with your own sign-in UI to exchange the token.
  • Mutual TLS authentication
  • Custom CA certificate

Data store tools

An agent app can use data store tools to answer end-user questions from your data stores. You can set up one data store of each type per tool, and the tool will query each of these data stores for answers. By default, the agent app will call the data store tool on your behalf. Alternatively, you can execute data store tools on the client side.

The data store type can be one of the following:

  • PUBLIC_WEB: A data store that contains public web content.
  • UNSTRUCTURED: A data store that contains unstructured private data.
  • STRUCTURED: A data store that contains structured data (for example, an FAQ).

The following example shows how to reference a data store:

"dataStoreConnections": [
  {
    "dataStoreType": "PUBLIC_WEB",
    "dataStore": "projects/PROJECT_NUMBER/locations/LOCATION_ID/collections/default_collection/dataStores/DATASTORE_ID"
  },
  {
    "dataStoreType": "UNSTRUCTURED",
    "dataStore": "projects/PROJECT_NUMBER/locations/LOCATION_ID/collections/default_collection/dataStores/DATASTORE_ID"
  },
  {
    "dataStoreType": "STRUCTURED",
    "dataStore": "projects/PROJECT_NUMBER/locations/LOCATION_ID/collections/default_collection/dataStores/DATASTORE_ID"
  }
]

Data store tool responses might also contain snippets about the content source that was used to generate the response. The agent app can further provide instructions on how to proceed with the answer from the data stores or how to respond when there is no answer.

You can override an answer for a specific question by adding an FAQ entry for it.

Examples can be used to further enhance the agent app's behavior. Examples should have the following schema:

{
  "toolUse": {
    "tool": "projects/PROJECT_ID/locations/LOCATION_ID/agents/AGENT_ID/tools/TOOL_ID",
    "action": "TOOL_DISPLAY_NAME",
    "inputParameters": [
      {
        "name": "TOOL_DISPLAY_NAME input",
        "value": {
          "query": "QUERY"
        }
      }
    ],
    "outputParameters": [
      {
        "name": "TOOL_DISPLAY_NAME output",
        "value": {
          "answer": "ANSWER",
          "snippets": [
            {
              "title": "TITLE",
              "text": "TEXT_FROM_DATASTORE",
              "uri": "URI_OF_DATASTORE"
            }
          ]
        }
      }
    ]
  }
}

Create a data store

To create a data store and connect it to your app, you can use the Tools link in the left navigation of the console. Follow the instructions to create a data store.

Additional query parameters

When creating data store tool examples, two optional parameters are available in addition to the required query string: a filter string and a userMetadata structured object.

The filter parameter provides the ability to filter search queries of your structured data or unstructured data with metadata. This string must follow the supported filter expression syntax. Provide multiple examples to instruct the agent LLM on how to populate this parameter. If the filter string is invalid, the filter is ignored when performing the search query.

An example of a filter string to refine search results based on location could look like:

  "filter": "country: ANY(\"Canada\")"

The userMetadata parameter provides information about the end-user. Any key-value pairs can be populated in this parameter. The metadata is passed to the data store tool to better inform the search results and tool response. Provide multiple examples to instruct the agent LLM on how to populate this parameter.

An example of a userMetadata parameter value to refine search results relevant to a specific user could look like:

  "userMetadata": {
    "favoriteColor": "blue",
    ...
  }

If some responses during testing don't meet your expectations, the following customizations are available on the Tool page for a data store tool:

Grounding confidence

For each response generated from the content of your connected data stores, the agent evaluates a confidence level, which gauges the confidence that all information in the response is supported by information in the data stores. You can customize which responses to allow by selecting the lowest confidence level you are comfortable with. Only responses at or above that confidence level will be shown.

There are 5 confidence levels to choose from: VERY_LOW, LOW, MEDIUM, HIGH, and VERY_HIGH.

Summarization configuration

You can select the generative model used by a data store agent for the summarization generative request. If none is selected, a default model is used. The following table contains the available options:

Model identifier                         Language support
text-bison@001                           Available in all supported languages.
text-bison@002                           Available in all supported languages.
text-bison@001 tuned (conversational)    Only English is supported at the moment.
text-bison@001 tuned (informational)     Only English is supported at the moment.
gemini-pro                               Available in all supported languages.

You can also provide your own prompt for the summarization LLM call.

The prompt is a text template that may contain predefined placeholders. The placeholders will be replaced with the appropriate values at runtime and the final text will be sent to the LLM.

The placeholders are as follows:

  • $original-query: The user's query text.
  • $rewritten-query: The agent uses a rewriter module to rewrite the original user query into a more accurate format.
  • $sources: The agent uses Enterprise Search to search for sources based on the user's query. The found sources are rendered in a specific format:

    [1] title of first source
    content of first source
    [2] title of second source
    content of second source
    
  • $conversation: The conversation history is rendered in the following format:

    Human: user's first query
    AI: answer to user's first query
    Human: user's second query
    AI: answer to user's second query
    
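A sketch of the runtime substitution described above follows. Plain str.replace is used because placeholder names such as $original-query contain hyphens, which Python's string.Template would not treat as part of the identifier; the function name is illustrative.

```python
def render_prompt(template, values):
    """Replace $name placeholders in a prompt template with concrete values."""
    # Replace longer names first so a shorter name that is a prefix of a
    # longer one does not clobber it.
    for name in sorted(values, key=len, reverse=True):
        template = template.replace(f"${name}", values[name])
    return template
```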

A custom prompt should instruct the LLM to return "NOT_ENOUGH_INFORMATION" when it cannot provide an answer. The agent will transform this constant into a user-friendly message.

Banned phrases (agent-level configuration)

You have the option to define specific phrases which shouldn't be allowed. These are configured at the agent level and utilized by both the agent LLMs and the data store tools. If the generated response or parts of the LLM prompt, such as the user's inputs, contain any of the banned phrases verbatim, then that response won't be shown.
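The verbatim matching described above amounts to a substring check; a minimal sketch (function name illustrative):

```python
def contains_banned_phrase(text, banned_phrases):
    """True if any banned phrase appears verbatim as a substring of `text`."""
    return any(phrase in text for phrase in banned_phrases)
```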

Function tools

If you have functionality that is accessible to your client code but not to OpenAPI tools, you can use function tools. Function tools are always executed on the client side, not by the agent app.

The process is as follows:

  1. Your client code sends a detect intent request.
  2. The agent app detects that a function tool is required, and the detect intent response contains the name of the tool along with input arguments. The session is paused until another detect intent request is received with the tool result.
  3. Your client code calls the tool.
  4. Your client code sends another detect intent request that provides the tool result as output arguments.

The following example shows the input and output schemas of a function tool:

{
  "type": "object",
  "properties": {
    "location": {
      "type": "string",
      "description": "The city and state, for example, San Francisco, CA"
    }
  },
  "required": [
    "location"
  ]
}
{
  "type": "object",
  "properties": {
    "temperature": {
      "type": "number",
      "description": "The temperature"
    }
  }
}
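Before executing a tool locally, client code may want to check the incoming arguments against the input schema above. The sketch below is a minimal illustrative check (required keys plus a few primitive types); a real implementation would use a full JSON Schema validator.

```python
# Map JSON Schema primitive type names to Python types (illustrative subset).
_TYPES = {"string": str, "number": (int, float), "integer": int,
          "boolean": bool, "object": dict, "array": list}

def validate_args(args, schema):
    """Check `args` against a JSON-schema-like dict: required keys and types."""
    for key in schema.get("required", []):
        if key not in args:
            raise ValueError(f"missing required property: {key}")
    for key, value in args.items():
        expected = schema["properties"].get(key, {}).get("type")
        if expected and not isinstance(value, _TYPES[expected]):
            raise TypeError(f"{key} should be {expected}")
    return True
```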

The following example shows the initial detect intent request and response using REST:

HTTP method and URL:
POST https://REGION_ID-dialogflow.googleapis.com/v3/projects/PROJECT_ID/locations/LOCATION_ID/agents/AGENT_ID/sessions/SESSION_ID:detectIntent
{
  "queryInput": {
    "text": {
      "text": "what is the weather in Mountain View"
    },
    "languageCode": "en"
  }
}
{
  "queryResult": {
    "text": "what is the weather in Mountain View",
    "languageCode": "en",
    "responseMessages": [
      {
        "source": "VIRTUAL_AGENT",
        "toolCall": {
          "tool": "<tool-resource-name>",
          "action": "get-weather-tool",
          "inputParameters": {
            "location": "Mountain View"
          }
        }
      }
    ]
  }
}

The following example shows the second detect intent request, which provides the tool result:

{
  "queryInput": {
    "toolCallResult": {
      "tool": "<tool-resource-name>",
      "action": "get-weather-tool",
      "outputParameters": {
        "temperature": 28.0
      }
    },
    "languageCode": "en"
  }
}
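The client side of the exchange above can be sketched as follows: given a detect intent response containing a toolCall, run the matching local function and build the follow-up queryInput carrying the toolCallResult. Field names mirror the JSON payloads shown above; the helper name and the local_tools mapping are assumptions for illustration.

```python
def handle_tool_call(query_result, local_tools, language_code="en"):
    """Execute the requested local tool and build the follow-up queryInput."""
    for message in query_result.get("responseMessages", []):
        call = message.get("toolCall")
        if call:
            # Dispatch to the local function registered under the action name.
            output = local_tools[call["action"]](**call["inputParameters"])
            return {
                "queryInput": {
                    "toolCallResult": {
                        "tool": call["tool"],
                        "action": call["action"],
                        "outputParameters": output,
                    },
                    "languageCode": language_code,
                }
            }
    return None  # no tool call in this response
```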

Client side execution

Like function tools, OpenAPI and data store tools can be executed on the client side by applying an API override when interacting with the session.

For example:

DetectIntentRequest {
  ...
  query_params {
    playbook_state_override {
      playbook_execution_mode: ALWAYS_CLIENT_EXECUTION
    }
  }
  ...
}

The process is as follows:

  1. Your client code sends a detect intent request that specifies client execution.
  2. The agent app detects that a tool is required, and the detect intent response contains the name of the tool along with input arguments. The session is paused until another detect intent request is received with the tool result.
  3. Your client code calls the tool.
  4. Your client code sends another detect intent request that provides the tool result as output arguments.