What is Conversational UI?

So what is Conversational UI? Historically, user interfaces have been made up of visual elements: buttons, dropdown lists, date pickers, carousels. We now have the technology to provide conversational interfaces as well, via voice- or text-enabled channels such as Amazon Echo or Facebook Messenger. Some applications leverage the best of both worlds, combining traditional UI elements with conversational capability; Facebook Messenger is a good example.

The most important advancement in Conversational UI has been Natural Language Processing (NLP). This is the field of computing that deals with deciphering the exact words a user says and parsing out of them their actual intent and its context. You can read about some of the terminology here. If the bot is the interface, NLP is the brain behind its conversational ability.

Natural Conversational Experiences

The Bot Forge creates natural conversational experiences for voice and text-based applications leveraging the latest AI technology.

Our team is made up of natural language processing (NLP) and natural language understanding (NLU) experts, artificial intelligence specialists, conversational architects, project managers and interaction designers, all focused on forging engaging voice and text-based Conversational UI powered by NLP.

Our conversational interfaces can be deployed on websites, mobile applications, messaging applications and voice-enabled devices.

We use the most advanced machine learning technology to power our solutions so that recognising user intent and context works reliably and seamlessly. Our goal is to create awesome conversational experiences.

Training Data for Chatbots

I’m going to look at the challenges in creating a chatbot that can answer questions about its specific domain effectively. In particular, I’m going to look at the challenges, and possible solutions, in creating a chatbot with reasonable conversational ability at its initial implementation. Every chatbot project is different, but clients often come to us with a large knowledge base which they want a chatbot to support from its release, yet with very little training data.

We are going to concentrate on a Dialogflow project for our examples; however, the challenges and solutions are similar for all of the most well-known NLP engines: Watson, Rasa, LUIS and so on.

The Challenge


One of the key problems with the current generation of chatbots is that they need large amounts of training data.
If you want your chatbot to understand a specific intention, you need to provide it with a large number of phrases that convey that intention. In a Dialogflow agent these example phrases are called training phrases (often referred to generically as utterances), and Dialogflow recommends at least 10 training phrases for each intent.
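As a concrete, purely illustrative sketch, an intent and its training phrases amount to one intention expressed many ways (the phrases below are our own examples, not taken from a real agent):

```python
# Illustrative only: a minimal intent definition with the kind of phrase
# variation an NLP engine needs to generalise from.
faq_intent = {
    "intent": "what_is_an_angel_investor",
    "training_phrases": [
        "what is an angel investor",
        "can you explain what an angel investor is",
        "tell me about angel investors",
        "who counts as an angel investor",
        "define angel investor",
        "what does an angel investor do",
        "angel investor meaning",
        "I'd like to know what angel investors are",
        "explain the term angel investor",
        "what's an angel investor exactly",
    ],
}

# Meets Dialogflow's suggested minimum of 10 training phrases per intent.
assert len(faq_intent["training_phrases"]) >= 10
```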

Depending on the chatbot's field of application, thousands of enquiries in a specific subject area can be required to make it ready for use, with each of these lines of enquiry needing multiple training phrases.

The training process of an AI-powered chatbot means that the bot learns from each new enquiry: the more requests a chatbot has processed, the better trained it is. The NLU (Natural Language Understanding) is continually improved, and the bot’s detection patterns are refined. Unfortunately, a large number of additional queries are necessary to optimise the bot, and working towards a recognition rate approaching 90-100% often means a long bedding-in process of several months.

Data Scarcity

One of the main issues with today’s generation of chatbots is that large amounts of training data are required to meet the challenges described previously: if you want your chatbot to understand a specific intention, you have to give it a large number of phrases that convey that intention.

To date, these large training corpora have had to be generated manually. This can be a time-consuming job with an associated increase in project cost. One of the main issues we have faced is that clients often want to see quick results from a chatbot implementation. These projects are often use cases that provide information across a wide-ranging domain, and there may not be many chat transcripts or emails to work with when creating the initial training model. In these cases there is often not enough training data, so it takes time to achieve decent, accurate match rates.

The Solution



The Bot Forge offers an artificial training data service to automate training phrase creation for your specific domain or chatbot use case. Our process automatically generates intent variation datasets covering the different ways that users from different demographic groups might express the same intent, which can be used as the base training for your chatbot.
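As a rough sketch of how such variation can be generated (this illustrates the idea only, not our actual pipeline), combining interchangeable phrase fragments multiplies a handful of hand-written pieces into many utterances:

```python
import itertools

# Each slot lists interchangeable fragments; their product yields many
# intent variations from a handful of hand-written pieces.
openers = ["what is", "can you explain", "tell me about"]
subjects = ["an angel investor", "angel investing"]
closers = ["", "please", "in simple terms"]

utterances = [
    " ".join(part for part in (o, s, c) if part)
    for o, s, c in itertools.product(openers, subjects, closers)
]

print(len(utterances))  # 3 openers x 2 subjects x 3 closers = 18 variants
```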

Multi NLP platform support
Multi-language support

Our training data is not restricted solely to Dialogflow agents; the output data can be formatted for the following agent types:

  • rasa: Rasa JSON format
  • luis: LUIS JSON format
  • witai: Wit.ai JSON format
  • watson: Watson JSON format
  • lex: Lex JSON format
  • dialogflow: Dialogflow JSON format
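To give a feel for what "formatted for each agent type" means, here is a simplified sketch of the same examples in the general shape of two of those formats (field names are abbreviated and the exact schemas differ):

```python
import json

examples = [
    ("what is an angel investor", "what_is_an_angel_investor"),
    ("do angel investors want control", "do_angel_investors_seek_control"),
]

# Rasa-style NLU JSON: examples nested under a common_examples list.
rasa_style = {
    "rasa_nlu_data": {
        "common_examples": [
            {"text": t, "intent": i, "entities": []} for t, i in examples
        ]
    }
}

# LUIS-style JSON: a flat utterances list labelled with intents.
luis_style = {"utterances": [{"text": t, "intent": i} for t, i in examples]}

print(json.dumps(luis_style, indent=2))
```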

We provide training datasets in 100+ languages

We offer our synthetic training data creation services to our chatbot clients. However, if you already have your own chatbot project and just want to boost its conversational ability we can provide synthetic training data to meet your needs.

Testing the Solution


We wanted to test the effectiveness of our synthetic training data in a Dialogflow chatbot agent by varying the number of synthetic utterances per intent.

Dialogflow test agents

We carried out three different tests (A, B and C) with three separate Dialogflow agents, each with identical agent settings. Each agent had three identical intents providing information about the topic of angel investors:

  • what_is_an_angel_investor
  • what_percentage_do_angel_investors_want
  • do_angel_investors_seek_control

In the first test (A) the chatbot was trained with 2 hand-written training phrases (utterances) per intent. Test B had 10 training phrases per intent from our synthetic training data, and test C had between 25 and 60 training phrases per intent.

The Test

We tested each agent with 12 separate questions similar to but distinct from the ones in the training sets.

We didn’t carry out any training during testing once the chatbots were created.

We recorded the percentage of queries matched to the correct intent, to an incorrect intent, or to no intent at all, together with the intent detection confidence from the agent response, which ranges from 0.0 (completely uncertain) to 1.0 (completely certain).
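The tallying amounted to simple bookkeeping over the agent responses; a sketch in Python (with illustrative values, not the real test data):

```python
# Each test query records the intent the agent matched (None for the
# fallback / no match) and the detection confidence it reported.
results = [
    # (expected_intent, matched_intent, confidence) -- illustrative values
    ("what_is_an_angel_investor", "what_is_an_angel_investor", 0.82),
    ("do_angel_investors_seek_control", "what_percentage_do_angel_investors_want", 0.55),
    ("what_percentage_do_angel_investors_want", None, 0.0),
]

correct = sum(1 for exp, got, _ in results if got == exp)
incorrect = sum(1 for exp, got, _ in results if got not in (None, exp))
no_match = sum(1 for _, got, _ in results if got is None)
avg_confidence = sum(c for *_, c in results) / len(results)

print(f"correct {correct / len(results):.0%}, "
      f"incorrect {incorrect / len(results):.0%}, "
      f"no match {no_match / len(results):.0%}, "
      f"avg confidence {avg_confidence:.2f}")
```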

Overall test results

View the results here

Test | % correct match | % incorrect match | % no match | Average intent detection confidence
Test A (2 utterances) | 50% | 42% | 8% | 0.64
Test B (10 utterances) | 91% | 9% | 0% | 0.76
Test C (25-60 utterances) | 100% | 0% | 0% | 0.86

Test A provided a 50% match rate. We observed a significant improvement in test B with the introduction of our synthetic training data: the match rate improved from 50% to 91%, whilst test C, with 25-60 training phrases per intent, achieved a match rate of 100%. The average intent detection confidence also grew with each test.

In summary, chatbots need a decent amount of training data to provide accurate results. If there is not enough training data then a chatbot’s accuracy suffers, and it can take some time to train it in use to reach acceptable performance levels. At the same time, it can be costly and time-consuming to create training data for a chatbot that needs to handle large numbers of intents.

Our synthetic training data creation service allows us to create large training sets with minimal effort, reducing the initial costs of chatbot creation and improving the usability of a chatbot from the initial release stages. If you only have a limited number of training phrases per intent and have large numbers of intents, our service can generate the rest of the variants needed to go from poor results to a chatbot with much greater accuracy in its responses. We carried out these tests with Dialogflow, but our conclusions are relevant for ML-based bot platforms in general. We can conclude that our artificial training data service can drastically improve the results of chatbot platforms that are highly dependent on training data.

Chatbot Training Never Ends!

I’ve looked at the benefits of using our training data at the early stages of a chatbot project. However, it’s important to note that the key to long-run success is to constantly monitor your chatbot and continue training it so that it gets smarter, either through ongoing manual training or by scheduling regular training cycles that incorporate new utterances and conversations from real users.

If you want to know more about our chatbot training data creation services, get in touch.

Appendix

View the test results here

We build a lot of different types of chatbots at The Bot Forge and deliver them to a variety of platforms such as the web, Facebook Messenger, Slack or WhatsApp. To create our chatbots we often use different AI platforms, choosing whichever offers the most suitable features for a specific project. All the major cloud and open-source providers have adopted similar sets of features for their conversational AI platforms and provide good NLU (Natural Language Understanding). There are also some strong options for open-source, privately hosted systems.

Conversational AI Platform Features

We wanted to look at some of the more popular AI platforms in a bit more depth in this series. To compare them, we have focused on the following specific features:

API and UI

A conversational AI platform should provide User Interface (UI) tools to plan conversational flow and to help train and update the system.

Context

As well as intents and entities, a context object allows the system to keep track of what has been discussed within the conversation, other information about the user’s situation, and where the conversation is up to. This is often the NLP feature that is vital in creating a complex conversation beyond a simple FAQ bot.
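In Dialogflow-style engines the context is literally a small data object attached to the conversation. An illustrative shape (the name and parameters here are our own invention):

```python
# A context carries a name, a lifespan (how many turns it survives) and
# parameters remembered from earlier in the conversation.
active_context = {
    "name": "awaiting_investment_amount",
    "lifespanCount": 5,
    "parameters": {
        "investor_type": "angel",
        "company_stage": "seed",
    },
}

# On each conversational turn the engine decrements the lifespan; the
# context (and the information it carries) expires when it reaches zero.
active_context["lifespanCount"] -= 1
assert active_context["lifespanCount"] == 4
```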

Conversation flow

The current position of a conversation, the context, and the user’s last utterance with its intents and entities all come together as rules to manage the conversational flow. This can be challenging to create and manage, so a platform’s tools, whether a flow engine, code, or a complementary visual tool, can provide advantages depending on the chatbot project itself. Other features such as slot-filling (ensuring that all the entities for an intent are present, and prompting the user for any that are missing) can also be important.
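Slot-filling itself boils down to a simple loop; a minimal sketch of the idea (the intent and slot names are hypothetical):

```python
# Required entities (slots) per intent; prompt for the first one missing.
REQUIRED_SLOTS = {
    "book_meeting": ["date", "time", "attendee"],
}

def next_prompt(intent, filled):
    """Return a prompt for the first missing slot, or None when complete."""
    for slot in REQUIRED_SLOTS.get(intent, []):
        if slot not in filled:
            return f"What {slot} would you like?"
    return None  # all slots filled; the intent can be fulfilled

print(next_prompt("book_meeting", {"date": "tomorrow"}))  # asks for the time
```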

Whilst most platforms fall into this rule-based category, some systems use machine learning to learn from example conversational data and then create a probabilistic model to control flow. These systems rely on large datasets.

Pre-built channel integrations

Having a conversational platform that supports your target channel out of the box can substantially speed up delivery of a chatbot solution and increase your flexibility in reusing the same conversational engine for a different integration. This is one of the reasons we really like Dialogflow’s tooling.

Chatbot Content Types

Whilst the focus of a conversational AI platform is understanding pure text, messaging systems and web interfaces often involve other content, such as buttons, images, emojis, URLs and voice input/output. A platform’s ability to support these features is important for creating a rich user experience and helps to manage the conversational flow.

Integrations

Bot responses can be enhanced by combining information from the user with information from internal or external web services. We use this type of capability a lot in creating our chatbots, and in our opinion it’s one of the most powerful features of a chatbot solution. With this in mind, the ability to configure calls to external services from within a conversation, and to use the responses to manage conversational flow, is important in building chatbot conversations.
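A typical mechanism for this is a webhook: the NLP engine posts the matched intent and parameters to your service, which builds the reply, optionally enriched from an external API. A hedged sketch (the endpoint URL and its response fields are hypothetical; `fulfillmentText` follows the Dialogflow-style response shape):

```python
import json
import urllib.request

RATES_URL = "https://api.example.com/funding-rounds"  # hypothetical endpoint

def fulfill(intent, parameters):
    """Build a webhook response for a matched intent."""
    if intent == "what_percentage_do_angel_investors_want":
        # Enrich the reply with data from an external web service.
        with urllib.request.urlopen(f"{RATES_URL}?stage={parameters['stage']}") as resp:
            data = json.load(resp)
        reply = (f"Angel investors typically seek {data['typical_equity']} "
                 f"equity at the {parameters['stage']} stage.")
    else:
        reply = "Sorry, I can't help with that yet."
    return {"fulfillmentText": reply}
```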

Pre-trained Intents and Entities

Instead of creating entity types such as dates, places or currencies for each project, some systems provide these pre-trained to deal with complex variations. In the same way, common user intents and utterances, such as small talk, are offered pre-trained by some platforms.

Analytics and Logs

The key to creating a successful chatbot is constant training and monitoring. To aid in continuously improving the system once it is launched, the conversational tools should provide a dashboard of user conversations, showing stats for responses, user interactions and other metrics. The ability to export these logs for import into other systems is also useful. Other important AI features make it easy to train missed intents, catch bad sentiment and monitor flow.

Tech Stack

It can be important to take into account which libraries an AI platform provides and in which languages. This may ultimately favour a particular solution if it fits with your current codebase or your team’s skillset. As a full-stack JavaScript software house, we find Node.js to be our server stack of choice when building our bots, and most AI platforms cater for this.

Costs

These are the costs for the cloud hosting and cloud NLU solutions: an important aspect to consider, particularly for large-scale enterprise chatbots handling large volumes of traffic, where NLU costs can reach £1000s a month.

Many providers offer a free tier for their AI platform solutions. A paid-for tier will then normally offer enhanced versions of the service, with enterprise-focused features and support for greater volume and performance. Costs tend to be charged in one of three ways: per API call, per conversation or daily active user, or per monthly active user (subscriptions are normally tiered). We look at the publicly published costs for the paid-for plans suitable for enterprise use in a shared public cloud environment.
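The pricing model chosen can change the bill considerably. A back-of-envelope comparison with entirely hypothetical figures (not any vendor's real pricing):

```python
# The same monthly traffic costed under two common pricing models.
monthly_users = 20_000
requests_per_user = 15

per_call_price = 0.002  # hypothetical price in GBP per text request
per_user_price = 0.02   # hypothetical price in GBP per monthly active user

per_call_total = monthly_users * requests_per_user * per_call_price
per_user_total = monthly_users * per_user_price

print(f"per API call:    £{per_call_total:,.0f}/month")
print(f"per active user: £{per_user_total:,.0f}/month")
```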

The Platforms

Keeping all these feature sets in mind we hope to look at the following AI platforms over the coming posts.

  • Botkit
  • Chatfuel
  • Amazon Lex
  • Microsoft LUIS
  • Google Dialogflow
  • Rasa
  • IBM Watson

Please get in touch if you feel we should look at a platform which we have missed!

Our first AI platform blog post will be coming soon!