In this post, I’m going to look at the new Knowledge Connectors feature in Google Dialogflow. As I look at the features in more detail I’m assuming you understand the more common Dialogflow terms and features – agents, intents & entities.
It’s also important to remember this feature is in beta.
We’ve been working on chatbot projects for 2 years now and a large number of our chatbot projects have shared a similar requirement: the ability to answer a large number of questions on a particular subject. This may be to answer technical questions about a product offering or to offer information for a particular service.
Often the information related to these types of questions is held on our chatbot customers’ own websites as FAQ pages, or in specific PDFs or unstructured text documents. These types of knowledge bases can hold large amounts of information, so in principle they could provide answers to thousands of chatbot questions.
The challenge for a successful chatbot is utilising this often unstructured information to understand a question and provide the correct answer. To meet this challenge we can look at 2 approaches: the traditional one, and using the new Dialogflow Knowledge Connectors.
Stepping back a bit, it’s important to briefly go over the traditional approach to creating chatbot conversational ability. There are a number of different chatbot frameworks out there such as Google Dialogflow, IBM Watson, Microsoft Bot Framework, Rasa etc., and they all largely use the same concepts. A user submits a voice or text query, and this utterance is matched to an intent and any entities are extracted. The matched intent either provides a static response or relies on some form of application layer to perform the required action and provide the response to the user.
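To make that flow concrete, here is a toy sketch of matching an utterance to an intent and pulling out entities. It is only an illustration: the word-overlap scoring, the `Intent` class and the example data are all my own assumptions, and real frameworks such as Dialogflow use trained models rather than anything this simple.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Intent:
    name: str
    training_phrases: List[str]
    response: str  # static response returned when the intent matches


def tokens(text: str) -> Set[str]:
    """Lower-case the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def match_intent(utterance: str, intents: List[Intent],
                 threshold: float = 0.5) -> Optional[Intent]:
    """Return the intent whose training phrase overlaps most with the utterance.

    The score is the Jaccard overlap of word sets; below the threshold
    nothing matches and the caller should fall back.
    """
    words = tokens(utterance)
    best, best_score = None, 0.0
    for intent in intents:
        for phrase in intent.training_phrases:
            phrase_words = tokens(phrase)
            union = words | phrase_words
            score = len(words & phrase_words) / len(union) if union else 0.0
            if score > best_score:
                best, best_score = intent, score
    return best if best_score >= threshold else None


def extract_entities(utterance: str,
                     entity_values: Dict[str, List[str]]) -> Dict[str, str]:
    """Find the first known value of each entity type mentioned in the utterance."""
    words = tokens(utterance)
    found = {}
    for entity, values in entity_values.items():
        for value in values:
            if value.lower() in words:
                found[entity] = value
                break
    return found


# Hypothetical example data, not taken from a real agent.
intents = [
    Intent("opening_hours",
           ["what time do you open", "when are you open"],
           "We open at 9am."),
]
entity_values = {"day": ["saturday", "sunday"]}

utterance = "What time do you open on Saturday"
matched = match_intent(utterance, intents)
entities = extract_entities(utterance, entity_values)
print(matched.name if matched else "fallback", entities)

# An utterance with no overlapping words matches nothing.
unmatched = match_intent("tell me a joke", intents)
```

Every new question needs its own entry in `intents`, with its own training phrases, which is where the maintenance cost starts to grow.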
This approach can be easy. However, things can get complex and difficult to manage if the scope of intents is very large and/or the information is constantly being updated. If we want to support questions with knowledge base information then each question needs to be created as an intent and the correct response formulated. This can lead to problems such as:
- Growth of the intent classification model causing more incorrect classifications.
- The amount of effort required to keep adding more training data to the model to ensure that the accuracy of the intent classification remains high. Fortunately, Dialogflow provides a training UI in the web console to help keep track of any misclassified utterances, analyze them and add them to the training data. However, this does take time.
- Creating and managing intents to support new information in document stores.
Enter Knowledge Connectors
Knowledge connectors are a beta feature released in 2019 to complement the traditional intent approach. When your agent doesn’t match an incoming user query to an intent then you can configure your agent to look at the knowledge base(s) for a response.
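The order of lookups described above can be sketched in a few lines. This is a simplified model of the behaviour as I have described it, not Dialogflow's actual implementation; the `respond` function and the two toy lookups are hypothetical names of my own.

```python
from typing import Callable, Optional

# A lookup takes the user query and returns a response, or None if nothing matched.
Lookup = Callable[[str], Optional[str]]


def respond(query: str, match_intent: Lookup, search_knowledge: Lookup,
            fallback: str = "Sorry, I didn't understand that.") -> str:
    """Try the agent's intents first, then the knowledge base, then a fallback."""
    intent_answer = match_intent(query)
    if intent_answer is not None:
        return intent_answer
    knowledge_answer = search_knowledge(query)
    if knowledge_answer is not None:
        return knowledge_answer
    return fallback


# Hypothetical stand-ins for the intent matcher and the knowledge base.
INTENT_RESPONSES = {"hello": "Hi there!"}
KNOWLEDGE_ANSWERS = {"how do i apply": "(answer text from the FAQ document)"}


def toy_intents(query: str) -> Optional[str]:
    return INTENT_RESPONSES.get(query.strip().lower())


def toy_knowledge(query: str) -> Optional[str]:
    return KNOWLEDGE_ANSWERS.get(query.strip().lower())


print(respond("Hello", toy_intents, toy_knowledge))
print(respond("How do I apply", toy_intents, toy_knowledge))
print(respond("What is the weather", toy_intents, toy_knowledge))
```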
The knowledge data source(s) can be a document (currently supported content types are text/csv, text/html, application/pdf and plain text) or a web URL which has been provided to the Dialogflow agent.
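If you are uploading documents from your own tooling, a quick pre-flight check against that list can save a failed import. The sketch below is just a local helper of my own; I am assuming that "plain text" corresponds to the MIME type `text/plain`, and the list may well change while the feature is in beta.

```python
# Content types listed above; "text/plain" is my assumption for "plain text".
SUPPORTED_CONTENT_TYPES = {"text/csv", "text/html", "application/pdf", "text/plain"}


def is_supported(mime_type: str) -> bool:
    """Check a MIME type against the supported list.

    Parameters such as "; charset=utf-8" are ignored and the comparison
    is case-insensitive.
    """
    base_type = mime_type.split(";")[0].strip().lower()
    return base_type in SUPPORTED_CONTENT_TYPES


print(is_supported("text/html; charset=UTF-8"))
print(is_supported("application/msword"))
```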
Using Knowledge Connectors
To be able to use knowledge connectors, you will need to click “Enable beta features and APIs” on your agent’s settings page.
It’s also worth mentioning that knowledge connector settings are not currently included when exporting, importing, or restoring agents. I’m hoping that this is something currently being put in place by the Dialogflow team.
Knowledge connectors can be configured for your agent either through the web console or using the client library, which is available in Java, Node.js & Python. You can also configure them from the command line.
To create a knowledge base from the web console, log in to Dialogflow & then go to the Knowledge tab. The process is fairly straightforward and involves providing a knowledge base name then adding a document to the knowledge base.
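For the non-console route, here is a rough sketch of the two requests involved: one to create a knowledge base and one to add a document to it. It only builds the URL and JSON body and sends nothing. The paths and field names reflect my reading of the v2beta1 REST reference, so treat them as assumptions to verify against the current docs; `my-project`, `KB_ID` and the example.com URL are placeholders.

```python
import json
from typing import Dict, Tuple

# Assumed base URL of the beta REST API.
API_ROOT = "https://dialogflow.googleapis.com/v2beta1"


def create_knowledge_base_request(project_id: str,
                                  display_name: str) -> Tuple[str, Dict]:
    """Build the URL and JSON body for creating a knowledge base."""
    url = f"{API_ROOT}/projects/{project_id}/knowledgeBases"
    return url, {"displayName": display_name}


def create_document_request(knowledge_base_name: str, display_name: str,
                            mime_type: str, knowledge_type: str,
                            content_uri: str) -> Tuple[str, Dict]:
    """Build the URL and JSON body for adding a document to a knowledge base.

    knowledge_base_name is the full resource name returned when the
    knowledge base was created.
    """
    url = f"{API_ROOT}/{knowledge_base_name}/documents"
    body = {
        "displayName": display_name,
        "mimeType": mime_type,
        "knowledgeTypes": [knowledge_type],  # e.g. "FAQ" or "EXTRACTIVE_QA"
        "contentUri": content_uri,
    }
    return url, body


# Placeholder values for illustration only.
kb_url, kb_body = create_knowledge_base_request("my-project", "Admissions FAQ")
doc_url, doc_body = create_document_request(
    "projects/my-project/knowledgeBases/KB_ID",
    "FAQ page",
    "text/html",
    "FAQ",
    "https://example.com/faq",
)

print(kb_url)
print(json.dumps(kb_body))
print(doc_url)
print(json.dumps(doc_body, indent=2))
```

Each pair would then be sent as an authenticated POST, for example with curl and an access token, or you can skip all of this and use one of the client libraries.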
After you’ve done that then you just need to add an intent and return the response. It’s also worth keeping in mind you can send all the usual response types and that means including rich responses which I think is pretty cool.
Trying out knowledge connectors
OK, so it’s time to try out these wondrous new knowledge connectors. There are 2 different knowledge base document types: FAQ & Extractive Question Answering. These choices govern what type of supported content can be used. There are also a number of caveats for each content type, which you can read more about here.
Based on these 2 document types I looked at a couple of common use cases which we often encounter at The Bot Forge and which correlate well with the document types supported:
- Chatbot FAQ functionality using an existing FAQ webpage in a fairly structured format to provide answers from.
- Chatbot FAQ functionality using information in an unstructured format to provide answers from.
I carried out my tests using a blank Dialogflow agent with beta features enabled.
1- An FAQ Knowledge Base (Knowledge Type: FAQ)
For my knowledge base I used the UCAS Frequently asked questions webpage and used the following URL as my data source. Dialogflow processes the URL, which is in the correct format, and creates a series of Question/Answer pairs which can be enabled or disabled in the console, pretty neat!
Giving this a spin, my first test was “how do I apply” and the result was spot on:
"how do I apply"
matchConfidenceLevel: HIGH, matchConfidence: 0.97326803
Different variations on the same question also returned a good result:
"im not sure how to apply"
matchConfidenceLevel: HIGH, matchConfidence: 0.9685159
"can you tell me about how I can apply"
matchConfidenceLevel: HIGH, matchConfidence: 0.968346
Unfortunately, when I tried something a bit less obvious I got an incorrect result, as it matched the wrong intent:
"how do I submit my application"
matchConfidenceLevel: HIGH, matchConfidence: 0.9626459
In this case, it’s matching the “How can I make a change to my application” intent with a high confidence but unfortunately it’s the wrong intent. To fix this we would need to fine-tune the model and re-assign the training phrase (utterance) to the intended intent, but the limitation is that in the knowledge base you can’t fine-tune responses. If you want more control you will need to move this FAQ over to its own intent.
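It is tempting to think a confidence cut-off would catch this, so here is a quick check using the four results from my tests. The `Result` type and the `correct` flag are my own annotations, not something Dialogflow returns.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    query: str
    match_confidence: float
    correct: bool  # my own judgement of whether the right FAQ was matched


# The four FAQ test results reported in this post.
results = [
    Result("how do I apply", 0.97326803, True),
    Result("im not sure how to apply", 0.9685159, True),
    Result("can you tell me about how I can apply", 0.968346, True),
    Result("how do I submit my application", 0.9626459, False),
]


def confidence_margin(results: List[Result]) -> float:
    """Gap between the lowest-scoring correct match and the highest-scoring wrong one."""
    lowest_correct = min(r.match_confidence for r in results if r.correct)
    highest_wrong = max(r.match_confidence for r in results if not r.correct)
    return lowest_correct - highest_wrong


margin = confidence_margin(results)
print(f"margin: {margin:.4f}")  # prints "margin: 0.0057"
```

With a gap that small, and only four data points, any threshold that rejects the wrong match is likely to start rejecting correct ones too, and all four are reported as HIGH anyway.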
This problem is compounded by the fact that the training feature of the console just lists each response intent as “Default Fallback Intent”, so it’s hard to check which questions have been answered incorrectly. One way round this is to look in the History area of the console and look at the Raw interaction log of each response.
One really useful feature is that you can take a specific extracted FAQ from the knowledge document and assign it to an intent. Just click on view detail in the document list, select the question and click the “convert to intents” button. This will create a new intent and, at the same time, disable the current Question/Answer pair. So overall it’s pretty impressive: if you have a webpage or doc of structured FAQs you can use this to power an FAQ chatbot pretty effectively with some monitoring.
2- A more unstructured FAQ Knowledge Base (Knowledge Type: Extractive Question Answering)
In this use case, I wanted to try out the ability of the knowledge connectors to return answers from more unstructured data.
Again there are caveats about what data source you can use, which you can read more about here.
For my test, I used a standard drug leaflet with MIME type PDF covering Priorix, from www.medicines.org.uk. I created a new knowledge base, added a new document and made sure I selected the knowledge type as “Extractive Question Answering”. Once imported, the PDF is listed in the document list. My aim was to validate whether Dialogflow could extract some fairly simple answers from the document. Now for some testing:
"What is Priorix"
matchConfidenceLevel: HIGH, matchConfidence: 0.88257504
answer: "Priorix, powder and solvent for solution for injection in a pre-filled syringe Measles, Mumps and Rubella vaccine (live)"
Unfortunately, although the response had a high confidence and match score it was actually an incorrect response. Ideally, the answer should have been:
“Priorix is a vaccine for use in children from 9 months up, adolescents and adults to protect them against illnesses caused by measles, mumps and rubella viruses.”
I tried another test:
"how is priorix given"
matchConfidenceLevel: HIGH, matchConfidence: 0.8826
answer: "The other ingredients are: Powder: amino acids, lactose (anhydrous), mannitol, sorbitol"
Again this was an incorrect response. I would have expected the correct response to be:
“How Priorix is given
Priorix is injected under the skin or into the muscle, either in the upper arm or in the outer thigh.”
So unfortunately the results were not great when extracting answers from the PDF I used. It would be interesting to look at a selection of other types of documents and corpora.
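If you want to repeat this kind of test on your own documents, a small harness makes it less manual. The sketch below is my own rough approach, not a Dialogflow feature: it counts an answer as correct if it contains a key phrase from the expected answer, which is a crude check. The two cases are the ones from my tests above.

```python
from typing import List, NamedTuple


class Case(NamedTuple):
    question: str
    expected_phrase: str  # key phrase the correct answer should contain
    actual_answer: str    # answer returned by the knowledge connector


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so comparisons are less brittle."""
    return " ".join(text.lower().split())


def is_correct(case: Case) -> bool:
    return normalise(case.expected_phrase) in normalise(case.actual_answer)


def accuracy(cases: List[Case]) -> float:
    """Fraction of cases whose answer contains the expected phrase."""
    if not cases:
        return 0.0
    return sum(is_correct(c) for c in cases) / len(cases)


# The two extractive test cases from this post.
cases = [
    Case(
        "What is Priorix",
        "Priorix is a vaccine",
        "Priorix, powder and solvent for solution for injection in a "
        "pre-filled syringe Measles, Mumps and Rubella vaccine (live)",
    ),
    Case(
        "how is priorix given",
        "injected under the skin or into the muscle",
        "The other ingredients are: Powder: amino acids, lactose "
        "(anhydrous), mannitol, sorbitol",
    ),
]

for case in cases:
    print(case.question, "->", "OK" if is_correct(case) else "WRONG")
print("accuracy:", accuracy(cases))  # prints "accuracy: 0.0"
```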
Do Knowledge Connectors work?
Again it’s important to point out this is a beta feature. There are definitely challenges, and in some functional areas there is much more to be done with Knowledge Connectors. It’s also important to recognise that I looked at 2 different use cases and knowledge base document types which provided very different results, so it’s worth looking at each one separately.
Chatbot FAQ functionality using an existing FAQ webpage in a fairly structured format.
If you want to convert your FAQ page into a chatbot, or if you have a similar structured document such as a PRFAQ for a product or service, then Knowledge Connectors work well.
Just supplying the URL of the FAQ page as a data source to the knowledge connectors is fantastic and provides fairly good results. However, it’s worth keeping in mind there may still be match errors so the history log is invaluable in checking for them. Thankfully it’s fairly easy to manage any question/answer pair which has been handled incorrectly by converting to its own intent.
Chatbot FAQ using a document in an unstructured format.
I found my test results with this use case rather disappointing. The accuracy of the extracted answers was fairly poor for my test case, although for different document sources you may be able to get better results.
The extracted answers look more like a match based on keywords with some additional coverage, and it does not appear to consider the context in which the question is asked. Also, this type of knowledge connector does not give you the control that intents do in terms of context and priority of matching training phrases etc., so there is no way of fixing bad responses. A feature where you can evaluate and train responses would be a great addition to the knowledge base, so hopefully that is in the Dialogflow team’s pipeline.
Should I use Dialogflow Knowledge Connectors?
If you have some FAQ information in a structured format then Knowledge connectors are worth a try with some caveats.
If you have unstructured documents which you want your chatbot to use to extract answers to questions then at the moment knowledge connectors are not a magic bullet. It’s a big ask, but for me, this is where the real value will lie particularly if you want to support large knowledge bases with a chatbot. Knowledge connectors are an experimental feature, so hopefully as the technology advances then they will improve.