Building a chatbot agent by using Dialogflow (part 1)

This article is part 1 of a multi-part series of tutorials that show you how to build, secure, and scale a chatbot by using Dialogflow on Google Cloud.

This series consists of these parts:

In this tutorial, you extract content from a document to create a knowledge base. Then, you build a chatbot to converse with your users about topics found in the knowledge base. This tutorial uses a human resources (HR) manual as the example document. However, you can apply this use case to any type of document, such as an operations manual, an instruction manual, or a policy document.

The Natural Language Toolkit (NLTK) is used to extract topics and associated policy text from the document. Then, you create a webhook API that queries the text associated with topics. Finally, you create a chatbot in Dialogflow that can carry on a conversation by using text or voice, and that uses fulfillment calls to the webhook API you created.

The following products are used in this tutorial. If you use different products, you might need to make adjustments to the scripts and commands.

  • Datastore
  • Dialogflow API
  • Cloud Natural Language API

This tutorial is for users who want to learn how to build a chatbot and assumes that they are familiar with Cloud Shell, Datastore, Datalab, and Linux.

The following diagram shows the architecture of the chatbot deployment.

Architecture of chatbot deployment

The chatbot deployment is broadly divided into two phases:

  • Data load: In this phase, the HR manual is preprocessed by Datalab and the resulting entities are stored in Datastore. Synonyms are extracted from the topics and stored in Datastore.
  • Service user: Now that data is available in Datastore, in the second phase you create a Flask-based server, which can continuously accept requests from the Dialogflow Fulfillment API and respond by using the stored data.

Objectives

  • Use Datalab, Python, data science libraries, and Natural Language API machine-learning technology to transform an unstructured text document into a structured knowledge base in Datastore.
  • Use Dialogflow to build a conversational chatbot that can respond to questions about the HR manual.
  • Build an API, called a webhook, in Python that Dialogflow calls to get information from external data sources.
  • Integrate your Dialogflow model with a web frontend for a text-chat style interface.
  • Integrate your Dialogflow model with Actions on Google so you can interact with your chatbot through a voice interface on your phone or Google Home.

Costs

This tutorial uses the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For more information, see Cleaning up.

Before you begin

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to the project selector page

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the Compute Engine, Cloud Storage, Cloud Source Repositories, Natural Language API, Datastore, and Dialogflow APIs.

    Enable the APIs

Initializing App Engine to use Datastore API

You use Datastore, a NoSQL database, to store content extracted from the document. Google Cloud projects that use the Datastore API require an active App Engine app.

In this section, you assign default settings for values that are used throughout the tutorial, such as region and zone. In this tutorial, you use us-central1 as the default region and us-central1-b as the default zone.

  1. From the Google Cloud Console, click Activate Cloud Shell. You use Cloud Shell for all of the terminal commands in this tutorial.

    Open Cloud Shell

  2. Create an App Engine app:

    gcloud app create
    
  3. When prompted, select us-central1 as the region in which to create the app.

You have successfully initialized App Engine and Datastore.

Getting started with Datalab

This tutorial uses several Datalab notebooks, each of which has a primary function as described in the following list:

  • Preprocessing notebooks: These files are each run once, in order. They extract information from the HR Manual and create a knowledge base that is used by the chatbot to answer questions.

    • ProcessHandbook.ipynb performs semi-structured analysis on the HR manual text file. It extracts topic headings and associated policy text from the file and stores this information as key-value pairs in Datastore to give the chatbot a basic vocabulary.

    • ProcessSynonyms.ipynb uses several Python data science libraries and the Natural Language API to generate synonyms for topics, which gives the chatbot an expanded vocabulary.

    • DialogFlow.ipynb uses Dialogflow's Entity API to write the topics to Dialogflow's entity module, which makes these words available to the chatbot as a data type.

  • Runtime notebooks: The following notebooks provide a runtime API interface to the knowledge base for the chatbot.

    • webhook.ipynb creates a webhook API that uses Flask, which handles user-generated topic requests by responding with text content responses.

    • ngrok.ipynb creates a public internet-facing HTTPS tunnel for the webhook API process, which Dialogflow can then call over the internet by using HTTPS.

Set up Datalab

  1. In Cloud Shell, initialize your project. Replace [PROJECT_ID] with your Google Cloud project ID.

    gcloud config set core/project [PROJECT_ID]
    
  2. Set the default Compute Engine zone. To spread the load among the different zones in the us-central1 region, replace [ZONE] with the letter of any zone in that region, for example, b for us-central1-b.

    gcloud config set compute/zone us-central1-[ZONE]
    
  3. Create a Datalab instance on a VM in the project and zone specified in the preceding steps. Replace [INSTANCE_NAME] with a name for the Datalab instance.

    datalab create [INSTANCE_NAME]
    

    This command takes a few minutes to complete.

  4. When asked to create ssh keys, type Y.

  5. When prompted for a passphrase for the RSA key, leave the passphrase blank and press Enter twice to confirm.

    Wait for the following text to display:

    Updating project ssh metadata...done. Waiting for SSH key to
    propagate. The connection to Datalab is now open and will
    remain until this command is killed. Click on the
    *Web Preview* (square button at top-right), select
    *Change port > Port 8081*, and start using Datalab.
    
  6. You are now connected to your instance. To open a Datalab notebook listing page, click Web preview, and then click Change port.

  7. In the Port Number field, type 8081, and then click Change and Preview.

The Datalab homepage opens in a new browser tab.

Download lab notebooks

  1. In the Datalab homepage, click Notebooks.

    Notebooks in Datalab

  2. To open a new notebook, click +Notebook. In the notebook cell, enter:

    !git clone https://github.com/GoogleCloudPlatform/dialogflow-chatbot.git
    !cp -r dialogflow-chatbot/notebooks/* .
    
  3. Click Run.

Building a chatbot database

In this section, you build a database with two different entities on Datastore. The first entity contains topics of interest used by the chatbot. These topics are the only phrases to which the chatbot reacts. The second entity stores synonyms for the topics created earlier.

Run the ProcessHandbook.ipynb notebook

This first Python notebook, ProcessHandbook.ipynb, extracts heading topics along with their associated content as action_text from the HR manual text file, and stores the topics and associated content as key-value pairs in Datastore. Only run this notebook once.

  1. In Datalab, go to Datalab > Notebooks > ProcessHandbook.ipynb.

  2. Click the Down arrow next to Clear, and then click Clear all Cells.

  3. Run the cells individually and verify that the entries are created. To move through the cells conveniently, press SHIFT + ENTER and wait for each cell to complete before continuing to the next cell. Code cell completion is indicated by a blue bar to the left of the cell.

  4. After you run the last cell on the page, you are finished running this particular notebook and you have extracted the topics from the HR manual. Close the browser to close this notebook.

    To review the database you created, in the Cloud Console, go to the Entities page.

    Go to the Entities page

    In the Kind list, click Topic. Review the topics that were created. The action_text column is automatically encoded by Datastore. When you query the data, it is automatically decoded for you.

    Kind list
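
The extraction step in this section can be pictured with a minimal sketch. It is not the code in ProcessHandbook.ipynb. It assumes, purely for illustration, that a heading is a line written entirely in upper case and that all text up to the next heading belongs to that heading. The function name extract_topics and the sample lines are hypothetical and are not taken from the HR manual.

```python
from typing import Dict, List, Optional


def extract_topics(lines: List[str]) -> Dict[str, str]:
    """Map each heading (lowercased) to the text that follows it.

    Assumption for this sketch: a heading is a non-empty line written
    entirely in upper case; everything up to the next heading is its text.
    """
    collected: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.isupper():
            # Start a new topic keyed by the lowercased heading.
            current = line.lower()
            collected[current] = []
        elif current is not None:
            # Body text belongs to the most recent heading.
            collected[current].append(line)
    # Join the collected lines into one action_text string per topic.
    return {topic: " ".join(text) for topic, text in collected.items()}


# Illustrative input only; not the real manual.
sample = [
    "ANNUAL SALARY",
    "Salaries are reviewed once a year.",
    "",
    "DISCIPLINE",
    "Managers follow a documented process.",
    "Employees may appeal a decision.",
]
knowledge_base = extract_topics(sample)
```

Each key-value pair in knowledge_base corresponds to what the tutorial stores as a Topic entity with its action_text.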

Run the ProcessSynonyms.ipynb notebook

The ProcessSynonyms.ipynb notebook uses natural language processing (NLP) to create synonyms for the topics extracted.

  1. In Datalab, go to Datalab > Notebooks > ProcessSynonyms.ipynb.

  2. Click the Down arrow next to Clear, and then click Clear all Cells. Run the cells individually and verify that the entries are created. To move through the cells conveniently, press SHIFT + ENTER and wait for each cell to complete before continuing to the next cell. Code cell completion is indicated by a blue bar to the left of the cell.

    The output displays results similar to the following:

    annual salary annual Set([])
    annual salary salary Set([u'wage', u'salary', u'remuneration', u'pay', u'salaries', u'earnings', u'pays', u'wages', u'earning', u'remunerations'])
    compassionate leave compassionate Set([])
    compassionate leave leave Set([u'partings', u'leaves', u'farewells', u'leave', u'farewell', u'parting'])
    disability leave disability Set([u'handicap', u'disabilities', u'disability', u'disablements', u'handicaps', u'disablement', u'impairment',
    u'impairments'])
    disability leave leave Set([u'partings', u'leaves', u'farewells', u'leave', u'farewell', u'parting'])
    discipline discipline Set([u'discipline', u'bailiwicks', u'disciplines', u'fields', u'study', u'field', u'subjects', u'bailiwick', u'corrections',
    u'studies', u'correction', u'subject'])
    

    Close the browser to close this notebook.

  3. To review the database you created, in the Cloud Console, go to the Datastore > Entities page.

  4. In the Kind list, click Synonym. It might take several minutes for the data to appear. Try refreshing the page after waiting a minute or two if the data doesn't show up right away.

    Synonyms
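
The output format shown in this section (topic, word, set of synonyms) can be reproduced with a small sketch. This is not the notebook's code: the THESAURUS dictionary is a hypothetical stand-in for a real lexical resource, and the reverse index is only one possible way a webhook might resolve a synonym back to a topic.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical stand-in for a real thesaurus lookup.
THESAURUS: Dict[str, Set[str]] = {
    "salary": {"salary", "wage", "pay"},
    "leave": {"leave", "farewell", "parting"},
}


def synonyms_for_topic(topic: str) -> List[Tuple[str, str, Set[str]]]:
    """Return one (topic, word, synonyms) row for every word in the topic."""
    return [(topic, word, THESAURUS.get(word, set())) for word in topic.split()]


def build_synonym_index(topics: List[str]) -> Dict[str, Set[str]]:
    """Map each synonym back to the set of topics it can refer to."""
    index: Dict[str, Set[str]] = {}
    for topic in topics:
        for _, _, synonyms in synonyms_for_topic(topic):
            for synonym in synonyms:
                index.setdefault(synonym, set()).add(topic)
    return index


rows = synonyms_for_topic("annual salary")
index = build_synonym_index(
    ["annual salary", "compassionate leave", "disability leave"]
)
```

Words with no thesaurus entry, such as annual, produce an empty set, which matches the empty Set([]) rows in the sample output.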

The knowledge base for your chatbot is now complete.

Creating a chatbot agent

You created the knowledge base for your chatbot, and now you build the chatbot by using Dialogflow.

Create a Dialogflow account

  1. Go to Dialogflow.
  2. In the upper-right corner, click Go to Console.
  3. Click Sign in with Google.
  4. Allow Dialogflow to access your Google Account, and accept the terms of service.

Create a Dialogflow chatbot agent

In this section, you create your chatbot, which Dialogflow calls an agent.

  1. In Dialogflow, click Create Agent.

  2. In the New Agent Form, complete the following fields:

    • In the Agent Name field, enter HR-Chatbot.
    • In the Import Existing Google Cloud Project list, click your Google Cloud project.
  3. Click Create.

Create a topic entity

An entity is essentially a data type in Dialogflow that you can use to parameterize conversations. You create an entity called Topic that encapsulates all possible HR topics this chatbot can discuss.

  1. In Dialogflow, click Entities.
  2. Click Create Entity.
  3. In the Entity name field, type Topic.
  4. Click Allow automated expansion. This lets your chatbot recognize topic values that aren't explicitly listed in your data model.
  5. Clear the Define synonyms checkbox. Your webhook handles synonyms instead.
  6. In the Enter value field, type test. You import more values for topic in the next section, but you have to save the entity with at least one value.
  7. Click Save.

Import topic entities from Datastore to Dialogflow

The third notebook, DialogFlow.ipynb, imports topic entries from Datastore into Dialogflow.

  1. In Datalab, go to Datalab > Notebooks > DialogFlow.ipynb.
  2. Click the Down arrow next to Clear, and then click Clear all Cells.
  3. Run the first cell to install the Dialogflow SDK on Datalab by selecting the cell and pressing SHIFT + ENTER.
  4. You must restart Python on your notebook server. In the Reset list, click Interrupt Execution. Then, in the Reset list, click Restart.
  5. Run the remaining cells individually. These cells make API calls to Dialogflow to upload your topics. A list of topics appears in the notebook's output.
  6. In the Dialogflow console, click Entities.
  7. Click @Topic.

    Your entries from Datastore now populate the topic entity.
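
As a rough illustration of the upload step, the following sketch turns a list of topic names into entity entries. It only builds the payload and makes no API call. The value and synonyms field names follow the entity shape of the Dialogflow ES v2 API as an assumption; the notebook might use a different client or API version. Because Define synonyms is cleared for this entity, each entry lists the topic itself as its only synonym. The function name build_entity_entries is hypothetical.

```python
from typing import Dict, List


def build_entity_entries(topics: List[str]) -> List[Dict[str, object]]:
    """Build one entity entry per unique topic, in sorted order.

    Assumed payload shape: {"value": ..., "synonyms": [...]}.
    Blank names are dropped and names are normalized to lower case.
    """
    unique = sorted({t.strip().lower() for t in topics if t.strip()})
    return [{"value": topic, "synonyms": [topic]} for topic in unique]


entries = build_entity_entries(["Discipline", "annual salary", "discipline", " "])
```

Deduplicating before the upload keeps repeated runs from producing duplicate entries in the entity.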

Training and testing the chatbot

Now that you have all the data required to train the chatbot, you create intents, which capture the questions users might ask about the HR manual. Then, you test the chatbot by using a built-in simulator in Dialogflow.

Create a chatbot intent

An intent in Dialogflow captures a single kind of request and response interaction between your chatbot and your user. For example, the following interaction is modeled as an intent:

  • User: "Hi, I want some information about the HR manual."

    This question activates the HR manual intent.

  • Chatbot: "OK, I'd be happy to help with that. What topic are you interested in?"

To create a chatbot intent:

  1. In Dialogflow, click Intents.

    You only need one intent for your HR chatbot. This intent responds to requests for information about different HR topics. If you want to use this chatbot to search other documents, you need to create an intent for each document.

  2. Click Create Intent.

  3. In the Intent name field, type Topic.

  4. Click Add Parameters and Action.

  5. In the Enter action name field, type lookup.

  6. In the Parameters table, enter the following:

    • In the Parameter Name field, type topic.
    • In the Entity field, type @Topic.
    • In the Value field, type $topic.

    This creates a lookup action that passes a topic parameter to your backend process (webhook), which retrieves information on this topic from the HR manual.

    Lookup action

  7. In the Fulfillment section, click the Down arrow to expand, and then click Enable Fulfillment. Click to turn on Enable webhook call for this intent. Don't turn on Enable webhook call for slot filling.

  8. At the top of the page, click Save.

Train a chatbot intent

  1. Click Add Training Phrases, and complete the following steps:

    1. In the Add user expression field, enter the sample sentence, I'd like to know about discipline.
    2. Select the word discipline.

      Adding training phrases

    3. In the dialog, click @Topic:topic to tell Dialogflow where in your example sentence to find your topic parameter. After specifying the topic parameter in your sentence, press Enter to add the sample sentence.

    4. Clear the Add user expression field and continue to add more examples. For each phrase in the following list, select the word or words shown beneath it and assign them to @Topic:topic. For example, in the phrase Tell me about discipline, select the word discipline.

      List of phrases

      • Tell me about discipline.

        • discipline
      • What are hours of work?

        • hours of work
      • Tell me about annual salary.

        • annual salary
      • Can you look up discipline?

        • discipline
      • I need to know about discipline.

        • discipline
      • I want to know about discipline.

        • discipline
      • What is discipline?

        • discipline
      • Where can I find out about discipline?

        • discipline
  2. Click Save.

    Dialogflow now trains the agent based on your training phrases. Training is complete when a notification message is displayed.

  3. Click to turn on Enable webhook call for this intent.

    Enable webhook call
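
Selecting a word and assigning it to @Topic:topic amounts to splitting the phrase into plain and annotated parts. The sketch below shows that split. The text, entityType, and alias field names mirror the training-phrase parts of the Dialogflow ES v2 API as an assumption, and the function name annotate is hypothetical. The console performs this step for you, so the code is illustrative only.

```python
from typing import Dict, List


def annotate(phrase: str, topic: str) -> List[Dict[str, str]]:
    """Split a phrase into parts, tagging the topic span with @Topic:topic.

    The match is case-insensitive. If the topic is not found, the whole
    phrase is returned as a single untagged part.
    """
    start = phrase.lower().find(topic.lower())
    if start < 0:
        return [{"text": phrase}]
    end = start + len(topic)
    parts: List[Dict[str, str]] = []
    if phrase[:start]:
        parts.append({"text": phrase[:start]})
    # The annotated span keeps the casing used in the phrase.
    parts.append(
        {"text": phrase[start:end], "entityType": "@Topic", "alias": "topic"}
    )
    if phrase[end:]:
        parts.append({"text": phrase[end:]})
    return parts


parts = annotate("What are hours of work?", "Hours of work")
```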

Run a webhook process

The webhook.ipynb notebook contains Python code, written with Flask, that publishes a REST API that conforms to Dialogflow's specification for webhooks. This API takes a topic parameter and queries Datastore for the associated text. The chatbot presents the associated text to the user.

  1. In Datalab, go to Datalab > Notebooks > webhook.ipynb.
  2. Click the Down arrow next to Clear, and then click Clear all cells. Run the cells individually.

    The following output at the bottom of the notebook indicates that your API is running on port 5000 of your local VM.

    *  Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
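
The request handling that such a webhook performs can be sketched as a plain function. This is not the code in webhook.ipynb. It assumes the Dialogflow ES v2 webhook format (queryResult.action, queryResult.parameters, and a fulfillmentText reply), whereas the notebook might target a different version. The TOPICS dictionary is a hypothetical stand-in for the Topic kind in Datastore, and its text is illustrative.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for the Topic kind in Datastore.
TOPICS: Dict[str, str] = {"annual salary": "Salaries are reviewed once a year."}


def handle_webhook(body: dict, lookup: Callable[[str], Optional[str]]) -> dict:
    """Answer a fulfillment request for the lookup action.

    Assumed request shape: {"queryResult": {"action": ..., "parameters": {...}}}.
    Assumed response shape: {"fulfillmentText": ...}.
    """
    query = body.get("queryResult", {})
    if query.get("action") != "lookup":
        return {"fulfillmentText": "Sorry, I can't help with that request."}
    # Normalize the topic the same way the keys are stored.
    topic = str(query.get("parameters", {}).get("topic", "")).strip().lower()
    text = lookup(topic) if topic else None
    if text is None:
        return {"fulfillmentText": "Sorry, I don't have information about that topic."}
    return {"fulfillmentText": text}


request_body = {
    "queryResult": {"action": "lookup", "parameters": {"topic": "Annual Salary"}}
}
reply = handle_webhook(request_body, TOPICS.get)
```

In a Flask route, the parsed JSON body of the POST request would be passed to handle_webhook and the returned dictionary serialized as the JSON response. Keeping the logic in a plain function makes it testable without a running server.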
    

Publish the webhook on the internet

The ngrok.ipynb notebook uses a service called ngrok (https://ngrok.com/) to create an HTTPS tunnel from your service running behind a Google Cloud firewall to a public URL on the internet.

  1. In Datalab, go to Datalab > Notebooks > ngrok.ipynb.
  2. Click the Down arrow next to Clear, and then click Clear all Cells.
  3. Click Run, and then click Run all Cells.

    The output is displayed at the bottom of the notebook. Copy the https://[NAME].ngrok.io URL from the output. This is your webhook's public URL, where [NAME] represents a name randomly generated by ngrok.

    Output with webhook URL

    Ignore any warning messages in the output.

  4. In Dialogflow, click Fulfillment to link your chatbot to your webhook running in Google Cloud.

  5. Click to turn on Webhook.

    Webhook toggle

  6. In the URL field, enter your ngrok URL and then add /webhook/. For example, https://bc123456.ngrok.io/webhook/.

  7. Click Save.
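
If you prefer not to copy the URL by hand, the fulfillment URL can be derived from the tunnel list that ngrok reports. The sketch below assumes that ngrok's local inspection API (typically at http://127.0.0.1:4040/api/tunnels) returns JSON with a tunnels list whose items have a public_url field. That shape is an assumption, the function name is hypothetical, and the sample payload is hand-written rather than captured from ngrok.

```python
import json
from typing import Optional


def webhook_url_from_tunnels(payload: str, path: str = "/webhook/") -> Optional[str]:
    """Return the first HTTPS tunnel URL with the webhook path appended.

    Assumed payload shape: {"tunnels": [{"public_url": "..."}, ...]}.
    Returns None when no HTTPS tunnel is listed.
    """
    tunnels = json.loads(payload).get("tunnels", [])
    for tunnel in tunnels:
        public_url = tunnel.get("public_url", "")
        if public_url.startswith("https://"):
            return public_url.rstrip("/") + path
    return None


# Hand-written sample payload for illustration.
sample_payload = json.dumps(
    {
        "tunnels": [
            {"public_url": "http://bc123456.ngrok.io"},
            {"public_url": "https://bc123456.ngrok.io"},
        ]
    }
)
url = webhook_url_from_tunnels(sample_payload)
```

The result has the same form as the example URL entered in the Fulfillment page.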

Test your chatbot

  • In the Dialogflow console, click Try me now. In the input field, type Tell me about annual salary, and then press ENTER.

    It displays a response from the HR manual.

    Chatbot with response

Test the web demo integration

Dialogflow provides many types of integrations from other services to your chatbot. To view a sample web user interface:

  1. In Dialogflow, click Integrations.
  2. Click to turn on Web Demo.
  3. To open the web demo dialog, click Web Demo.
  4. Click https://bot.dialogflow.com/[GUID] to launch the web demo. [GUID] is an ID generated by Dialogflow.
  5. In the chatbot, type Tell me about annual salary, and then press ENTER.

    The chatbot responds as before.

    Chatbot demo

Cleaning up

If you decide not to continue with part 2 of this tutorial, you can avoid incurring charges to your Google Cloud account for the resources used in this tutorial by deleting the project that you created.

Delete the project

  1. In the Cloud Console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

What's next

  • Part 2 of the tutorial shows you how to deploy the agent securely and add scalability.
  • Learn more about Dialogflow.
  • Try out other Google Cloud features for yourself. Have a look at our tutorials.