This article is part 1 of a multi-part series of tutorials that show you how to build, secure, and scale a chatbot by using Dialogflow on Google Cloud.
This series consists of these parts:
- Overview
- Building a chatbot agent by using Dialogflow (part 1: this tutorial)
- Securing and scaling chatbots for production (part 2)
In this tutorial, you extract content from a document to create a knowledge base. Then, you build a chatbot to converse with your users about topics found in the knowledge base. This tutorial uses a human resources (HR) manual as the example document. However, you can apply this use case to any type of document, such as an operations manual, an instruction manual, or a policy document.
The Natural Language Toolkit (NLTK) is used to extract topics and associated policy text from the document. Then, you create a webhook API that queries the text associated with topics. Finally, you create a chatbot in Dialogflow that can carry on a conversation by using text or voice, and that uses fulfillment calls to the webhook API you created.
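To make the extraction step concrete, the following sketch shows one way topic headings and policy text could be pulled out of a plain-text manual. This is a minimal stand-in, not the tutorial's notebook code, and the heading heuristic (a short title-case line with no trailing period) is an assumption made for illustration:

```python
def extract_topics(manual_text):
    """Split a manual into {topic: policy_text} pairs.

    Heuristic (an assumption for this sketch): a short title-case
    line with no trailing period is a topic heading; everything up
    to the next heading is that topic's policy text.
    """
    topics = {}
    current = None
    for line in manual_text.splitlines():
        line = line.strip()
        if not line:
            continue
        is_heading = (len(line.split()) <= 4
                      and line == line.title()
                      and not line.endswith('.'))
        if is_heading:
            current = line.lower()
            topics[current] = []
        elif current is not None:
            topics[current].append(line)
    return {topic: ' '.join(lines) for topic, lines in topics.items()}

manual = """\
Annual Salary
Salaries are reviewed each April and paid monthly.

Disability Leave
Employees may take paid leave for a qualifying disability.
"""
print(extract_topics(manual))
```

The notebooks in this tutorial perform a more robust version of this analysis with NLTK, but the output has the same shape: topic keys mapped to policy text values, ready to store in Datastore.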
The following products are used in this tutorial. If you use different products, you might need to make adjustments to the scripts and commands.
- Datastore
- Dialogflow API
- Cloud Natural Language API
This tutorial is for users who want to learn how to build a chatbot and assumes that they are familiar with Cloud Shell, Datastore, Datalab, and Linux.
The following diagram shows the architecture of the chatbot deployment.
The chatbot deployment is broadly divided into two phases:
- Data load: In this phase, the HR manual is preprocessed by Datalab and the resulting entities are stored in Datastore. Synonyms are extracted from the topics and stored in Datastore.
- Service user: Now that data is available in Datastore, in the second phase you create a Flask-based server, which can continuously accept requests from the Dialogflow Fulfillment API and respond by using the stored data.
Objectives
- Use Datalab, Python, data science libraries, and Natural Language API machine-learning technology to transform an unstructured text document into a structured knowledge base in Datastore.
- Use Dialogflow to build a conversational chatbot that can respond to questions about the HR manual.
- Build an API, called a webhook, in Python that Dialogflow calls to get information from external data sources.
- Integrate your Dialogflow model with a web frontend for a text-chat style interface.
- Integrate your Dialogflow model with Actions on Google so you can interact with your chatbot through a voice interface on your phone or Google Home.
Costs
This tutorial uses the following billable components of Google Cloud:
- Compute Engine
- Datastore
- Dialogflow
- Natural Language API
- Cloud Source Repositories
- Cloud Storage
- Networking
To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.
When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For more information, see Cleaning up.
Before you begin
- Sign in to your Google Account. If you don't already have one, sign up for a new account.
- In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.
- Enable the Compute Engine, Cloud Storage, Cloud Source Repositories, Natural Language API, Datastore, and Dialogflow APIs.
Initializing App Engine to use the Datastore API
You use Datastore, a NoSQL database, to store content extracted from the document. Google Cloud projects that use the Datastore API require an active App Engine app.
In this section, you assign default settings for values that are used throughout the tutorial, such as region and zone. In this tutorial, you use us-central1 as the default region and us-central1-b as the default zone.
From the Google Cloud Console, click Activate Cloud Shell. You use Cloud Shell for all of the terminal commands in this tutorial.
Create an App Engine app:

gcloud app create

When prompted, select us-central1 as the region in which to create the app.
You have successfully initialized App Engine and Datastore.
Getting started with Datalab
This tutorial uses several Datalab notebooks, each of which has a primary function as described in the following list:
Preprocessing notebooks: These files are each run once, in order. They extract information from the HR manual and create a knowledge base that is used by the chatbot to answer questions.

- ProcessHandbook.ipynb performs semi-structured analysis on the HR manual text file. It extracts topic headings and associated policy text from the file and stores this information as key-value pairs in Datastore to give the chatbot a basic vocabulary.
- ProcessSynonyms.ipynb uses several Python data science libraries and the Natural Language API to generate synonyms for topics, which gives the chatbot an expanded vocabulary.
- DialogFlow.ipynb uses Dialogflow's Entity API to write the topics to Dialogflow's entity module, which makes these words available to the chatbot as a data type.

Runtime notebooks: The following notebooks provide a runtime API interface to the knowledge base for the chatbot.

- webhook.ipynb creates a webhook API that uses Flask, which handles user-generated topic requests by responding with text content responses.
- ngrok.ipynb creates a public internet-facing HTTPS tunnel for the webhook API process, which Dialogflow can then call over the internet by using HTTPS.
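The synonym-generation step can be pictured in miniature with the following sketch. The synonym table here is hand-written purely for illustration (the real ProcessSynonyms.ipynb derives synonyms with Python data science libraries and the Natural Language API), but the (topic, word, synonyms) triples mirror the shape of the rows written to the Synonym kind in Datastore:

```python
def build_synonym_entities(topics, synonym_lookup):
    """For each word in each topic, record the synonyms the chatbot's
    webhook can later match against user input. Returns
    (topic, word, synonyms) triples, mirroring the rows that
    ProcessSynonyms.ipynb writes to the Synonym kind in Datastore.
    """
    rows = []
    for topic in topics:
        for word in topic.split():
            rows.append((topic, word, synonym_lookup.get(word, set())))
    return rows

# Hand-written stand-in for the real synonym lookup.
synonyms = {
    'salary': {'wage', 'pay', 'earnings', 'remuneration'},
    'leave': {'farewell', 'parting'},
}

for row in build_synonym_entities(['annual salary', 'disability leave'],
                                  synonyms):
    print(row)
```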
Set up Datalab
- In Cloud Shell, initialize your project. Replace [PROJECT_ID] with your Google Cloud project ID.

  gcloud config set core/project [PROJECT_ID]

- Spread the load among the different zones in the us-central1 region. Replace [ZONE] with a zone in the us-central1 region.

  gcloud config set compute/zone us-central1-[ZONE]

- Create a Datalab instance on a VM in the project and zone specified in the preceding steps. Replace [INSTANCE_NAME] with a name for the Datalab instance.

  datalab create [INSTANCE_NAME]

  This command takes a few minutes to complete. If you are prompted to enter a passphrase for an ssh key, leave the passphrase blank.

- When asked to create ssh keys, type Y.
- Create an RSA key password and press Enter twice to confirm.
- Wait for the following text to display:

  Updating project ssh metadata...done. Waiting for SSH key to propagate. The connection to Datalab is now open and will remain until this command is killed. Click on the *Web Preview* (square button at top-right), select *Change port > Port 8081*, and start using Datalab.
You are now connected to your instance. To open a Datalab notebook listing page, click Web preview, and then click Change port.
In the Port Number field, type 8081, and then click Change and Preview.
The Datalab homepage opens in a new browser tab.
Download lab notebooks
In the Datalab homepage, click Notebooks.
To open a new notebook, click +Notebook. In the notebook cell, enter:
!git clone https://github.com/GoogleCloudPlatform/dialogflow-chatbot.git
!cp -r dialogflow-chatbot/notebooks/* .
Click Run.
Building a chatbot database
In this section, you build a database with two different entities on Datastore. The first entity contains topics of interest used by the chatbot. These topics are the only phrases to which the chatbot reacts. The second entity stores synonyms for the topics created earlier.
Run the ProcessHandbook.ipynb notebook
This first Python notebook, ProcessHandbook.ipynb, extracts heading topics along with their associated content as action_text from the HR manual text file, and stores the topics and associated content as key-value pairs in Datastore. Run this notebook only once.
- In Datalab, go to Datalab > Notebooks > ProcessHandbook.ipynb.
- Click the Down arrow next to Clear, and then click Clear all Cells.
- Run the cells individually and verify that the entries are created. To move through the cells conveniently, press SHIFT + ENTER and wait for each cell to complete before continuing to the next cell. Code cell completion is indicated by a blue bar to the left of the cell.
- After you run the last cell on the page, you are finished running this particular notebook and you have extracted the topics from the HR manual. Close the browser to close this notebook.
- To review the database you created, in the Cloud Console, go to the Entities page.
- In the Kind list, click Topic. Review the topics that were created. The action_text column is automatically encoded by Datastore. When you query the data, it is automatically decoded for you.
Run the ProcessSynonyms.ipynb notebook
The ProcessSynonyms.ipynb notebook uses natural language processing (NLP) to create synonyms for the extracted topics.
- In Datalab, go to Datalab > Notebooks > ProcessSynonyms.ipynb.
- Click the Down arrow next to Clear, and then click Clear all Cells.
- Run the cells individually and verify that the entries are created. To move through the cells conveniently, press SHIFT + ENTER and wait for each cell to complete before continuing to the next cell. Code cell completion is indicated by a blue bar to the left of the cell.

The output displays results similar to the following:
annual salary annual Set([])
annual salary salary Set([u'wage', u'salary', u'remuneration', u'pay', u'salaries', u'earnings', u'pays', u'wages', u'earning', u'remunerations'])
compassionate leave compassionate Set([])
compassionate leave leave Set([u'partings', u'leaves', u'farewells', u'leave', u'farewell', u'parting'])
disability leave disability Set([u'handicap', u'disabilities', u'disability', u'disablements', u'handicaps', u'disablement', u'impairment', u'impairments'])
disability leave leave Set([u'partings', u'leaves', u'farewells', u'leave', u'farewell', u'parting'])
discipline discipline Set([u'discipline', u'bailiwicks', u'disciplines', u'fields', u'study', u'field', u'subjects', u'bailiwick', u'corrections', u'studies', u'correction', u'subject'])
Close the browser to close this notebook.
To review the database you created, in the Cloud Console, go to the Datastore > Entities page.
In the Kind list, click Synonym. It might take several minutes for the data to appear. Try refreshing the page after waiting a minute or two if the data doesn't show up right away.
The knowledge base for your chatbot is now complete.
Creating a chatbot agent
You created the knowledge base for your chatbot, and now you build the chatbot by using Dialogflow.
Create a Dialogflow account
- Go to Dialogflow.
- In the upper-right corner, click Go to Console.
- Click Sign in with Google.
- Allow Dialogflow to access your Google Account, and accept the terms of service.
Create a Dialogflow chatbot agent
In this section, you create your chatbot, which Dialogflow calls an agent.
In Dialogflow, click Create Agent.
In the New Agent Form, complete the following fields:
- In the Agent Name field, enter HR-Chatbot.
- In the Import Existing Google Cloud Project list, click your Google Cloud project.

Click Create.
Create a topic entity
An entity is essentially a data type in Dialogflow that you can use to parameterize conversations. You create an entity called topic that encapsulates all possible HR topics this chatbot can discuss.
- In Dialogflow, click Entities.
- Click Create Entity.
- In the Entity name field, type Topic.
- Click Allow automated expansion. This lets your chatbot recognize topic values that aren't explicitly listed in your data model.
- Clear the Define synonyms checkbox. Your webhook handles synonyms instead.
- In the Enter value field, type test. You import more values for topic in the next section, but you have to save the entity with at least one value.
- Click Save.
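Because the Define synonyms checkbox is cleared, the webhook is responsible for mapping a user's wording back to a stored topic. The following is a minimal sketch of that lookup, assuming the (topic, word, synonyms) triples that the ProcessSynonyms notebook stores in the Synonym kind; the function and row names here are illustrative, not the notebook's actual code:

```python
from collections import Counter

def resolve_topic(user_phrase, synonym_rows):
    """Score each known topic by how many of its words (or their
    synonyms) appear in the user's phrase; return the best match,
    or None when nothing matches."""
    words = set(user_phrase.lower().split())
    scores = Counter()
    for topic, word, syns in synonym_rows:
        if words & ({word} | syns):
            scores[topic] += 1
    return scores.most_common(1)[0][0] if scores else None

# Illustrative rows, standing in for the Synonym kind in Datastore.
rows = [
    ('annual salary', 'annual', set()),
    ('annual salary', 'salary', {'wage', 'pay', 'earnings'}),
    ('disability leave', 'leave', {'parting', 'farewell'}),
]
print(resolve_topic('what is my pay', rows))  # prints "annual salary"
```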
Import topic entities from Datastore to Dialogflow
The third notebook, DialogFlow.ipynb, imports topic entries from Datastore into Dialogflow.
- In Datalab, go to Datalab > Notebooks > DialogFlow.ipynb.
- Click the Down arrow next to Clear, and then click Clear all Cells.
- Run the first cell to install the Dialogflow SDK on Datalab by selecting the cell and pressing SHIFT + ENTER.
- You must restart Python on your notebook server. In the Reset list, click Interrupt Execution. Then, in the Reset list, click Restart.
- Run the remaining cells individually. These cells make API calls to Dialogflow to upload your topics. A list of topics appears in the notebook's output.
- In the Dialogflow console, click Entities.
- Click @Topic. Your entries from Datastore now populate the topic entity.
Training and testing the chatbot
Now that you have all the data required to train the chatbot, you create intents, which capture the questions users might ask about the HR manual. Then, you test the chatbot by using a built-in simulator in Dialogflow.
Create a chatbot intent
An intent in Dialogflow captures a single kind of request and response interaction between your chatbot and your user. For example, the following interaction is modeled as an intent:
User: "Hi, I want some information about the HR manual."
This question activates the HR manual intent.
Chatbot: "OK, I'd be happy to help with that. What topic are you interested in?"
To create a chatbot intent:
In Dialogflow, click Intents.
You only need one intent for your HR chatbot. This intent responds to requests for information about different HR topics. If you want to use this chatbot to search other documents, you need to create an intent for each document.
Click Create Intent.
In the Intent name field, type Topic.

Click Add Parameters and Action.

In the Enter action name field, type lookup.

In the Parameters table, enter the following:

- In the Parameter Name field, type topic.
- In the Entity field, type @Topic.
- In the Value field, type $topic.

This creates a lookup action that passes a topic parameter to your backend process (webhook), which retrieves information on this topic from the HR manual.

In the Fulfillment section, click the Down arrow to expand, and then click Enable Fulfillment. Click to turn on Enable webhook call for this intent. Don't turn on Enable webhook call for slot filling.

At the top of the page, click Save.
Train a chatbot intent
Click Add Training Phrases, and complete the following steps:

- In the Add user expression field, enter the sample sentence I'd like to know about discipline. Select the word discipline. In the dialog, click @Topic:topic to tell Dialogflow where in your example sentence to find your topic parameter. After specifying the topic parameter in your sentence, press Enter to add the sample sentence.
- Delete the text in the Add user expression textbox and continue to add more examples. In each of the following phrases, select the indicated word or words to add to @Topic:topic. For example, in the phrase Tell me about discipline, select the word discipline to add.
  - Tell me about discipline. (select discipline)
  - What are hours of work? (select Hours of work)
  - Tell me about annual salary. (select annual salary)
  - Can you look up discipline? (select discipline)
  - I need to know about discipline. (select discipline)
  - I want to know about discipline. (select discipline)
  - What is discipline? (select discipline)
  - Where can I find out about discipline? (select discipline)
- Click Save.
Dialogflow now trains the agent based on your example intentions. Training is complete when a notification message is displayed.
Run a webhook process
The webhook.ipynb notebook contains Python code, written with Flask, that publishes a REST API web service conforming to Dialogflow's specification for webhooks. This API takes a topic parameter and queries Datastore for the associated text. The chatbot presents the associated text to the user.
- In Datalab, go to Datalab > Notebooks > webhook.ipynb.
- Click the Down arrow next to Clear, and then click Clear all cells.
- Run the cells individually.

The following output at the bottom of the notebook indicates that your API is running on port 5000 of your local VM:

* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
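Stripped of Flask and Datastore, the core of such a webhook can be sketched as a pure function from a Dialogflow webhook request to a response. This is an illustrative sketch, not the notebook's actual code: the request and response keys below follow the legacy Dialogflow v1 webhook format (result.parameters, speech/displayText), so adjust the shapes if your agent uses the v2 API (queryResult.parameters, fulfillmentText). The knowledge_base dict stands in for the Datastore query:

```python
import json

def handle_fulfillment(request_json, knowledge_base):
    """Pull the 'topic' parameter out of a Dialogflow webhook request
    and answer with the matching policy text."""
    topic = request_json.get('result', {}).get('parameters', {}).get('topic', '')
    text = knowledge_base.get(topic.lower(),
                              "Sorry, I don't know about that topic.")
    return {'speech': text, 'displayText': text}

# Stand-in for the Topic kind in Datastore.
kb = {'annual salary': 'Salaries are reviewed each April and paid monthly.'}

# The shape Dialogflow POSTs to the webhook for the lookup action.
request = {'result': {'action': 'lookup',
                      'parameters': {'topic': 'Annual Salary'}}}
print(json.dumps(handle_fulfillment(request, kb)))
```

In the real notebook, a Flask route receives this JSON over HTTP and the lookup hits Datastore instead of an in-memory dict, but the request-in, response-out contract is the same.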
Publish the webhook on the internet
The ngrok.ipynb notebook uses a service called [ngrok](https://ngrok.com/) to create an HTTPS tunnel from your service, running behind a Google Cloud firewall, to a public URL on the internet.
- In Datalab, go to Datalab > Notebooks > ngrok.ipynb.
- Click the Down arrow next to Clear, and then click Clear all Cells.
- Click Run, and then click Run all Cells. The output is displayed at the bottom of the notebook.
- Copy the https://[NAME].ngrok.io URL from the output. This is your webhook's public URL, where [NAME] represents a name randomly generated by ngrok. Ignore any warning messages in the output.
- In Dialogflow, click Fulfillment to link your chatbot to your webhook running in Google Cloud.
- Click to turn on Webhook.
- In the URL field, enter your ngrok URL and then add /webhook/. For example, https://bc123456.ngrok.io/webhook/.
- Click Save.
Test your chatbot
In the Dialogflow console, click Try me now. In the input field, type Tell me about annual salary, and then press ENTER. The chatbot displays a response from the HR manual.
Test the web demo integration
Dialogflow provides many types of integrations between your chatbot and other services. To view a sample web user interface:
- In Dialogflow, click Integrations.
- Click to turn on Web Demo.
- To open the web demo dialog, click Web Demo.
- Click https://bot.dialogflow.com/[GUID] to launch the web demo. [GUID] is an ID generated by Dialogflow.
- In the chatbot, type Tell me about annual salary, and then press ENTER. The chatbot responds as before.
Cleaning up
If you decide not to continue with part 2 of this tutorial, delete the resources you created to avoid incurring charges to your Google Cloud account:
Delete the project
- In the Cloud Console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
What's next
- Part 2 of the tutorial shows you how to deploy the agent securely and add scalability.
- Learn more about Dialogflow.
- Try out other Google Cloud features for yourself. Have a look at our tutorials.