Data store agents

Data store agents are a special type of Dialogflow agent that can provide LLM-generated agent responses based on your website content and uploaded data.

You create this type of agent by providing one or more data stores during agent creation.

A data store agent has special state handlers called data store handlers. Using these data store handlers, your data store agent can have conversations about the content with your end-users.

Limitations

The following limitations apply:

  • This feature currently only supports select languages in GA. See the data store column in the language reference.
  • Only the following regions are supported: global, us multi-region, and eu multi-region.
  • The only structured data store type supported is FAQ.
  • Apps with both chunked and non-chunked data stores are not supported.

Access control

If you are the project owner, you have all the permissions needed to create a data store agent. If you are not the project owner, you must have the following roles:

  • Dialogflow Admin
  • Discovery Engine Admin

For more information, see the Dialogflow access control guide.

Create a data store agent

To create a data store agent:

  1. If you have not already, follow the Dialogflow setup instructions.
  2. Go to the Agent Builder console:

    Agent Builder console

  3. Select your project from the console drop-down.

  4. If you have not already activated the API, read and agree to the Terms of Service, then click Continue and activate the API.

  5. Click Create a New App or New App.

  6. Select Chat.

  7. Provide your company name in the Agent configurations section.

  8. Expand the time zone and language settings section.

  9. Select a time zone.

  10. Select a default language.

  11. Provide an agent name in the Your agent name section.

  12. Select a region or multi-region in the Location of your agent section.

  13. Click Continue.

  14. Connect a data store to your agent by doing one of the following:

    • Select an existing data store that you previously created.
    • Create a new data store:
      1. Click Create New Data Store.
      2. Choose a data source.
      3. Provide data and configuration for the data store source you selected. Your data store location should correspond to the agent location.
      4. Click Create to create the data store.
      5. Select your new data store.
  15. Click Create.

  16. Your agent is now created, and you are automatically redirected to the Available data stores page, where you can add more data stores as needed.

  17. If you have created a new data store for a website, you must verify your domain.

  18. To open your agent with Dialogflow CX, click Preview in the left panel. In the Dialogflow CX console, you can edit or add data store handlers, deploy your agent, and optionally add flows that will handle scenarios not covered by the data stores.

Test your agent

You can use the Dialogflow CX simulator to test your agent.

Settings

The following data store agent settings are available.

Grounding

For each response generated from the content of your connected data stores, the agent calculates a confidence level that gauges how well all information in the response is supported by information in the data stores. You can select the lowest confidence level allowed, and the agent won't return responses below that level.

There are 5 confidence levels to choose from: very low, low, medium, high, and very high.

You can also apply a grounding heuristics filter. If enabled, responses containing content that is likely inaccurate based on common hallucinations are suppressed.
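
The minimum confidence setting described above can be pictured as a simple threshold check. The following Python sketch is only an illustration: the Confidence enum, the passes_grounding function, and the sample responses are hypothetical names, and the actual confidence calculation happens inside the service.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """The five confidence levels, ordered from weakest to strongest."""
    VERY_LOW = 1
    LOW = 2
    MEDIUM = 3
    HIGH = 4
    VERY_HIGH = 5


def passes_grounding(response_level: Confidence, lowest_allowed: Confidence) -> bool:
    """Return True if a response meets the configured minimum confidence."""
    return response_level >= lowest_allowed


# Hypothetical responses paired with the confidence computed for each.
candidates = [
    ("Our store opens at 9 AM.", Confidence.HIGH),
    ("We probably ship worldwide.", Confidence.LOW),
]

# With a minimum of MEDIUM, only the HIGH-confidence response is kept.
kept = [text for text, level in candidates
        if passes_grounding(level, Confidence.MEDIUM)]
print(kept)
```

Raising the minimum level trades coverage for precision: fewer responses are returned, but those that remain are better supported by the data stores.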

Data store prompt

You can add information about the agent to improve the quality of answers generated from data store content and make them feel more like your brand:

  • Agent name - What the agent should call itself. If you leave it unset, the default value AI Assistant is used.
  • Agent identity - What the agent persona is. If you leave it unset, the default value AI Assistant is used.
  • Company name - The name of your company. This should already have been set during agent creation, but you can adjust it as needed. Set this field correctly, and in particular do not leave it empty, or the quality of generated answers may suffer.
  • Company description - A short description of what the company does or offers.
  • Agent scope - Where the agent is meant to be used. If you leave it unset, the default value on the company website is used.

Once you've filled out this section partially or fully, the short paragraph derived from these settings appears on the right side, under Your prompt. This paragraph is used as part of answer generation.

Data store model selection and summarization prompt

When a user query is processed, the agent performs a search of the data stores to find good sources. The agent then sends the user query and sources found to the LLM, which performs a summarization.

You can select which model to use for summarization and optionally provide your own prompt.

Select generative model

You can select the generative model used by a data store agent for the summarization generative request. If none gets selected, text-bison@001 is used. The following table contains the available options:

Model identifier                         Language support
text-bison@001                           Available in all supported languages.
text-bison@002                           Available in all supported languages.
text-bison@001 tuned (conversational)    Only English is supported at the moment.
text-bison@001 tuned (informational)     Only English is supported at the moment.
gemini-1.0-pro-001                       Available in all supported languages.

Customize the summarization prompt

You can provide your own prompt for the summarization LLM call. The prompt is a text template that may contain predefined placeholders. The placeholders will be replaced with the appropriate values at runtime and the final text will be sent to the LLM.

The placeholders are as follows:

  • $original-query: The user's query text.
  • $rewritten-query: Dialogflow uses a rewriter module to rewrite the original user query into a more accurate format.
  • $sources: Dialogflow uses Enterprise Search to search for sources based on the user's query. The found sources are rendered in a specific format:

    [1] title of first source
    content of first source
    [2] title of second source
    content of second source
    
  • $conversation: The conversation history is rendered in the following format:

    Human: user's first query
    AI: answer to user's first query
    Human: user's second query
    AI: answer to user's second query
    

A custom prompt should instruct the LLM to return "NOT_ENOUGH_INFORMATION" when it cannot provide an answer. In this case, the agent will invoke a no-match event.

For example:

Given the conversation between a Human and an AI assistant and a list of sources,
write a final answer for the AI assistant.
Follow these guidelines:
+ Answer the Human's query and make sure you mention all relevant details from
  the sources, using exactly the same words as the sources if possible.
+ The answer must be based only on the sources and not introduce any additional
  information.
+ All numbers, like price, date, time or phone numbers must appear exactly as
  they are in the sources.
+ Give as comprehensive an answer as possible given the sources. Include all
  important details, and any caveats and conditions that apply.
+ The answer MUST be in English.
+ Don't try to make up an answer: If the answer cannot be found in the sources,
  you admit that you don't know and you answer NOT_ENOUGH_INFORMATION.
You will be given a few examples before you begin.

Example 1:
Sources:
[1] <product or service> Info Page
Yes, <company> offers <product or service> in various options or variations.

Human: Do you sell <product or service>?
AI: Yes, <company> sells <product or service>. Is there anything else I can
help you with?

Example 2:
Sources:
[1] Andrea - Wikipedia
Andrea is a given name which is common worldwide for both males and females.

Human: How is the weather?
AI: NOT_ENOUGH_INFORMATION


Begin! Let's work this out step by step to be sure we have the right answer.

Sources:
$sources

$conversation
Human: $original-query
AI:
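
To make the placeholder mechanics concrete, the following Python sketch fills a template the way the text describes: sources and conversation history are rendered in the documented formats, and each placeholder is replaced in a single pass. It is a rough illustration rather than the service's implementation. The function names are hypothetical, and treating an output that equals NOT_ENOUGH_INFORMATION after trimming as a no-match is an assumption.

```python
import re

NO_ANSWER = "NOT_ENOUGH_INFORMATION"
PLACEHOLDER = re.compile(r"\$(?:original-query|rewritten-query|sources|conversation)")


def render_sources(sources):
    """Render (title, content) pairs as '[n] title' followed by the content."""
    lines = []
    for index, (title, content) in enumerate(sources, start=1):
        lines.append(f"[{index}] {title}")
        lines.append(content)
    return "\n".join(lines)


def render_conversation(turns):
    """Render (user query, answer) pairs as alternating Human/AI lines."""
    lines = []
    for query, answer in turns:
        lines.append(f"Human: {query}")
        lines.append(f"AI: {answer}")
    return "\n".join(lines)


def fill_prompt(template, original_query, rewritten_query, sources, turns):
    """Replace every placeholder in one pass, so that substituted text
    is never scanned again for further placeholders."""
    values = {
        "$original-query": original_query,
        "$rewritten-query": rewritten_query,
        "$sources": render_sources(sources),
        "$conversation": render_conversation(turns),
    }
    return PLACEHOLDER.sub(lambda match: values[match.group(0)], template)


def is_no_match(llm_output):
    """Assumption: the sentinel alone (ignoring whitespace) means no answer."""
    return llm_output.strip() == NO_ANSWER


# The tail of the example prompt above, used as a small template.
template = "Sources:\n$sources\n\n$conversation\nHuman: $original-query\nAI:"
prompt = fill_prompt(
    template,
    original_query="What is a CVV?",
    rewritten_query="What is a credit card CVV?",
    sources=[("Card help", "The CVV is a 3-digit code.")],
    turns=[("Hi", "Hello!")],
)
print(prompt)
```

The single-pass substitution is a deliberate choice in this sketch: a user query that happens to contain text such as $sources is inserted literally instead of being expanded again.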

Improve agent responses

If you find some responses during testing that don't meet your expectations, try the following.

Deploy your agent

There are many ways to deploy your agent:

  • The simplest option is to use a Dialogflow CX integration, which provides a user interface for your agent. Each integration provides instructions for deployment.

  • The Dialogflow Messenger integration is a particularly good option for data store agents. It has built-in options for generative features.

  • You can create your own user interface and use the Dialogflow CX API for interactions. Your user interface implementation is in control of deployment.

Track your agent's performance

You can monitor your agent's conversation history, and you can use the analytics tool for agent statistics.

Special intents

In addition to handling questions about the content you provide, the data store agent can handle the following types of questions:

  • Agent identification: Handles questions like "Who are you?" or "Are you human?".
  • Escalate to a human agent: Handles questions like "I want to talk to a human" or "I want to talk to a real person".

This is accomplished by automatically generated intents and intent routes.

Hybrid agents

If you have an existing Dialogflow CX agent, you can upgrade this agent to a hybrid agent, which combines the power of precise conversation controls (flows, parameters, intents, conditions, transitions, and so on) with data store handler generative features.

As part of this upgrade, you may want to delete intent routes for certain conversation scenarios from your agent, or temporarily disable them while you test data store handlers, because the data store handlers can handle those scenarios more simply.

The following scenarios are recommended for data store handlers:

  • Questions that can be answered by your organization's documents or website.
  • FAQs that do not require database lookups.

The following scenarios are not recommended for data store handlers:

  • Content that does not have answers to desired questions.
  • Questions that require database lookups or server requests.
  • Scenarios that require data redaction.
  • Scenarios that require deterministic agent responses.

Dialogflow evaluates end-user input in the following order of preference:

  1. Intent match for routes in scope
  2. FAQ data store content
  3. Unstructured data store content

Input evaluation order

Dialogflow evaluates end-user input in the following order for hybrid agents:

  1. Parameter input while form filling.
  2. Intent matches for routes in scope.
  3. Data store handler with FAQ data store content.
  4. Data store handler with unstructured data store content.
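
The ordering above amounts to a first-match-wins chain. The Python sketch below models that idea only. The evaluator functions are hypothetical stand-ins with toy matching rules, not Dialogflow's matching logic.

```python
from typing import Callable, List, Optional, Tuple

# Each evaluator returns a response string, or None if it does not apply.
Evaluator = Callable[[str], Optional[str]]


def evaluate_input(user_input: str,
                   evaluators: List[Tuple[str, Evaluator]]) -> Tuple[str, Optional[str]]:
    """Try evaluators in priority order and return the first that matches."""
    for name, evaluator in evaluators:
        result = evaluator(user_input)
        if result is not None:
            return name, result
    return "no-match", None


def form_parameter(text: str) -> Optional[str]:
    return None  # No form is being filled in this example.


def intent_route(text: str) -> Optional[str]:
    return "Transferring you to a person." if "human" in text.lower() else None


def faq_store(text: str) -> Optional[str]:
    return "A CVV is the security code on your card." if "cvv" in text.lower() else None


def unstructured_store(text: str) -> Optional[str]:
    return None  # No unstructured content matches in this example.


# Same order as the list above.
ORDER = [
    ("parameter", form_parameter),
    ("intent", intent_route),
    ("faq data store", faq_store),
    ("unstructured data store", unstructured_store),
]

print(evaluate_input("What is a CVV?", ORDER))
print(evaluate_input("Can a human explain what a CVV is?", ORDER))
```

In the second call both the intent stand-in and the FAQ stand-in would match, and the intent wins because it is earlier in the order. This is the situation that the guidance on undesired intent matches addresses.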

Add or edit data store handlers for an existing agent

Data store handlers are a special type of Dialogflow state handler. This means that you can apply them to flows or pages, and that they are evaluated using the same scope rules.

To add or edit a data store handler:

  1. Go to the Dialogflow CX Console.
  2. Select your Google Cloud project.
  3. Select the agent.
  4. Select the flow associated with the data store handler. This is commonly the default start flow.
  5. Select the page associated with the data store handler. This is commonly the start page.
  6. Click Add state handler in the page data, then select data store.
  7. If you need to create a data store, you will be taken to the Vertex AI Agent Builder user interface. See the data store information to help you make selections.
  8. If you already have a data store, click Edit data store.
  9. Make updates as needed and save when you are finished. See the information below about data store specific settings.

Agent responses

In the Agent responses section, you can provide custom responses that reference generative answers. Use $request.knowledge.answers[0] in the Agent says section to provide the generative answer.

Data store response options

You can update the Link maximum field to indicate the maximum number of supplemental links that should be provided by the generative answers.

Handle conversation digressions

An end-user may ask clarifying questions during a conversation. For example, during credit card information collection, they may want to clarify what a CVV is. In this case, your agent should answer the question and return to collecting the necessary credit card information. To accomplish this, you can create a data store handler with data stores that answer the question, apply that handler to the flow start page of the flow that handles credit card information collection, and set a transition target for this handler to return to the "current page".

Handle undesired intent matches

If your agent is matching intents when it should be using a data store handler, you can try the following to correct this:

  • Delete or modify training phrases that are vague, so that all of your training phrases precisely handle the desired intention and do not conflict with your data store content.
  • Use negative examples to avoid intent matching.

Data store filtering

In some cases, you may only want certain data stores available for queries, depending on session parameter values. For example, you may have unique data stores for product categories. To accomplish data store filtering for product categories:

  • Set session parameters to product categories.
  • Create condition routes that check the values of the session parameters and transition to a specific page that has the desired data store handler.
  • The data store handler should transition back to the calling page, so that the conversation can continue.
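
The routing idea behind these steps can be sketched in a few lines of Python. This is a conceptual model only: in a real agent the logic is configured as condition routes, and the parameter name product_category and the page names here are hypothetical.

```python
# Hypothetical mapping from a product-category session parameter value
# to the page whose data store handler covers that category.
CATEGORY_PAGES = {
    "phones": "Phones data store page",
    "tablets": "Tablets data store page",
}


def select_page(session_params: dict, calling_page: str) -> str:
    """Mimic condition routes: pick the page for the current category.

    Stays on the calling page when no route condition is satisfied.
    """
    category = session_params.get("product_category")
    return CATEGORY_PAGES.get(category, calling_page)


def answer_and_return(session_params: dict, calling_page: str) -> list:
    """Return the pages visited: the data store page, then the calling page."""
    target = select_page(session_params, calling_page)
    if target == calling_page:
        return [calling_page]
    return [target, calling_page]


print(answer_and_return({"product_category": "phones"}, "Order page"))
```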

Personalization

To make generative answers more relevant to end-users, you can provide Dialogflow with information about users.

This information is provided as JSON. There is no expected schema, so you are free to define the object properties. This JSON is sent to the large language model as-is, so descriptive property names and values lead to the best results.

For example:

{
  "subscription plan": "Business Premium Plus",
  "devices owned": [
    {"model": "Google Pixel 7"},
    {"model": "Google Pixel Tablet"}
  ]
}

Personalizing with the Dialogflow API

You can provide this data to Dialogflow when sending detect intent requests. This information must be provided in every detect intent request, because it is not persisted in the session.

Provide this information in the queryParams.endUserMetadata field in the Sessions.detectIntent method.

Select a protocol and version for the Session reference:

Protocol   V3                   V3beta1
REST       Session resource     Session resource
RPC        Session interface    Session interface
C++        SessionsClient       Not available
C#         SessionsClient       Not available
Go         SessionsClient       Not available
Java       SessionsClient       SessionsClient
Node.js    SessionsClient       SessionsClient
PHP        Not available        Not available
Python     SessionsClient       SessionsClient
Ruby       Not available        Not available
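
As a rough illustration of where the personalization JSON goes, the following Python sketch assembles a REST-style detect intent request body with the data placed under queryParams.endUserMetadata. It only builds and prints the JSON and does not call the API. The queryInput layout is an assumption based on the REST Session resource, and the function name is hypothetical.

```python
import json


def build_detect_intent_body(text: str, language_code: str,
                             end_user_metadata: dict) -> dict:
    """Build a request body that carries end-user metadata.

    The metadata is not persisted in the session, so a caller would
    include it in every request.
    """
    return {
        "queryInput": {
            "text": {"text": text},
            "languageCode": language_code,
        },
        "queryParams": {
            # Free-form JSON: descriptive names and values work best.
            "endUserMetadata": end_user_metadata,
        },
    }


# The example object from the Personalization section.
user_info = {
    "subscription plan": "Business Premium Plus",
    "devices owned": [
        {"model": "Google Pixel 7"},
        {"model": "Google Pixel Tablet"},
    ],
}

body = build_detect_intent_body("Which accessories fit my phone?", "en", user_info)
print(json.dumps(body, indent=2))
```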

Personalizing with Dialogflow Messenger

You can provide this data to the Dialogflow Messenger integration. See the setContext method.

Search configuration

To give you more control over agent behavior and improve the quality of answers, boost and filter search configurations let you boost, bury, and filter documents.

Boost controls enable you to change search result ranking by applying a boost value (greater than zero for higher ranking, less than zero for lower ranking) to specific documents.

Filter controls let you either keep or remove search results based on the specified filter criteria.

This information is provided as JSON to Dialogflow requests. The format of the JSON depends on the search control type.

Boost control

The following search configuration describes a boost control:

"searchConfig": {
  "boostSpecs": [
    {
      "dataStores": [ "DATASTORE_ID" ],
      "spec": [
        {
          "conditionBoostSpecs": {
            "condition": "CONDITION",
            "boost": "1.0"
          }
        }
      ]
    }
  ]
}

Filter control

The following search configuration describes a filter control:

"searchConfig": {
  "filterSpecs": [
    {
      "dataStores": [ "DATASTORE_ID" ],
      "filter": "CONDITION"
    }
  ]
}

Set up search configuration with the Dialogflow API

You can provide this data to Dialogflow when sending detect intent requests. This information must be provided in every detect intent request, because it is not persisted in the session.

Provide this information in the queryParams.searchConfig field in the Sessions.detectIntent method.

Select a protocol and version for the Session reference:

Protocol   V3                   V3beta1
REST       Session resource     Session resource
RPC        Session interface    Session interface
C++        SessionsClient       Not available
C#         SessionsClient       Not available
Go         SessionsClient       Not available
Java       SessionsClient       SessionsClient
Node.js    SessionsClient       SessionsClient
PHP        Not available        Not available
Python     SessionsClient       SessionsClient
Ruby       Not available        Not available
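
The following Python sketch shows one way to assemble the boost and filter structures shown earlier and place them under queryParams.searchConfig. It mirrors the JSON shapes from this page exactly, including the string-valued boost, and keeps DATASTORE_ID and CONDITION as placeholders. The helper names are hypothetical, and nothing is sent to the API.

```python
import json


def boost_spec(data_store_id: str, condition: str, boost: float) -> dict:
    """One boostSpecs entry, shaped like the boost control example."""
    return {
        "dataStores": [data_store_id],
        "spec": [
            {
                "conditionBoostSpecs": {
                    "condition": condition,
                    # Greater than zero ranks higher, less than zero lower.
                    "boost": str(boost),
                }
            }
        ],
    }


def filter_spec(data_store_id: str, condition: str) -> dict:
    """One filterSpecs entry, shaped like the filter control example."""
    return {"dataStores": [data_store_id], "filter": condition}


def build_search_config(boosts=(), filters=()) -> dict:
    """Combine boost and filter entries, omitting empty sections."""
    config = {}
    if boosts:
        config["boostSpecs"] = list(boosts)
    if filters:
        config["filterSpecs"] = list(filters)
    return config


search_config = build_search_config(
    boosts=[boost_spec("DATASTORE_ID", "CONDITION", 1.0)],
    filters=[filter_spec("DATASTORE_ID", "CONDITION")],
)

# The configuration is not persisted in the session, so it would be
# attached to the query parameters of every detect intent request.
query_params = {"searchConfig": search_config}
print(json.dumps(query_params, indent=2))
```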

Set up search configuration with Dialogflow Messenger

You can provide this data to the Dialogflow Messenger integration.

To apply a search control, add the following snippet to the Dialogflow Messenger code when you embed it in a website:

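<script>
  // Wait until the Dialogflow Messenger element has finished loading.
  document.addEventListener('df-messenger-loaded', () => {
    const dfMessenger = document.querySelector('df-messenger');
    // Replace the ellipsis with your boost or filter search configuration.
    const searchConfig = { ... };
    // Pass the configuration to the messenger as query parameters.
    dfMessenger.setQueryParameters(searchConfig);
  });
</script>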

See the setQueryParameters method.