Chatbot_Presentation

Implementing Chatbots using Deep Learning
By: Rohan Chikorde
Introduction
What is a CHATBOT?
• A chatbot (chat robot) is a computer program that simulates human
conversation, or chat, through artificial intelligence.
• It is a service, powered by rules and artificial intelligence,
that you interact with via a chat interface.
• The service could be any number of things, ranging from
functional to fun, and it could live in any major chat product
(Facebook Messenger, Slack, Telegram, text messages, etc.).
List of best AI Chatbots:
• Mitsuku (winner of the 2013 Loebner Prize, an AI prize for chatbots)
• Jabberwacky
• PersonalityForge
• Botser
• Cleverbot
* http://www.techstext.com/list-of-best-chatbots-to-converse/
Types of Chatbot
• RETRIEVAL-BASED MODELS
o Use a repository of predefined responses and some kind of heuristic to
pick an appropriate response based on the input and context.
o The heuristic could be as simple as a rule-based expression match, or as
complex as an ensemble of machine learning classifiers.
• GENERATIVE MODELS
o This kind of bot relies on a learned model (an "artificial brain") rather than
a fixed set of responses. You don't have to be overly specific when you
talk to it: it handles language, not just commands.
o It can keep improving as it learns from the conversations it has
with people.
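The simplest retrieval-based heuristic mentioned above, a rule-based expression match, can be sketched as follows. This is only an illustration: the rules, the canned responses, and the function name `retrieve_response` are hypothetical and not taken from the presentation.

```python
import re

# Hypothetical repository of predefined responses, each guarded by a regex rule.
RESPONSES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE), "Hello! How can I help you?"),
    (re.compile(r"\bcheck[- ]?out\b", re.IGNORECASE), "Check-out is at 11 am."),
    (re.compile(r"\bwi-?fi\b", re.IGNORECASE), "Wi-Fi is free in all rooms."),
]
FALLBACK = "Sorry, I did not understand that."


def retrieve_response(user_input):
    """Return the first predefined response whose rule matches the input."""
    for pattern, response in RESPONSES:
        if pattern.search(user_input):
            return response
    return FALLBACK
```

A production system would usually replace the ordered rule list with a learned ranking model, which is what the later slides build toward.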
Open Domain vs. Closed Domain
• In an open domain setting, the user can take the
conversation anywhere. There isn't necessarily a well-defined
goal or intention.
Ex: Conversation about refinancing one's mortgage
• In a closed domain setting, the space of possible inputs and
outputs is somewhat limited because the system is trying to
achieve a very specific goal.
Ex: Hotel's Customer Support or Shopping Assistants
Long vs Short Conversations
• The longer the conversation, the more difficult it is to automate, because the system needs
to keep track of what has been said.
Ex: Customer support conversations.
• In short-text conversations, the goal is to create a single response to a single input.
Ex: What is your name?
Implementing a Retrieval-Based Model in TensorFlow
Architecture of AI Chatbot
Retrieval Based Model
• The vast majority of production systems today are retrieval-based, or a combination of
retrieval-based and generative models.
• Generative models are an active area of research, but they are not yet mature enough for
most production use.
• For building a hotel's customer support bot, the best bet right now is most likely a
retrieval-based model.
The Ubuntu Dialog Corpus
• The Ubuntu Dialog Corpus (UDC) is one of the largest public dialog datasets available.
• It's based on chat logs from the Ubuntu channels on a public IRC network.
• The training data consists of 1,000,000 examples, 50% positive (label 1) and 50% negative
(label 0).
• Each example consists of a context, the conversation up to this point, and an utterance, a
response to the context.
Data Pre-processing
• The dataset originally comes in CSV format. We could work directly with CSVs, but it's better
to convert our data into TensorFlow's own Example format.
• The main benefit of this format is that it allows us to load tensors directly from the input files
and let TensorFlow handle all the shuffling, batching and queuing of inputs.
• As part of the preprocessing, we also create a vocabulary. This means we map each word to an
integer number, e.g. "cat" may become 2631. The TFRecord files that we generate store these
integer numbers instead of the word strings. It's better to save the vocabulary so that we can
map back from integers to words later on.
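The vocabulary step can be sketched in plain Python. This is a minimal illustration, not the preprocessing script used for the talk: the function names, the whitespace tokenizer, and the reserved `<UNK>` id are assumptions.

```python
from collections import Counter


def build_vocabulary(texts, min_frequency=1):
    """Map each word to an integer id; id 0 is reserved for unknown words."""
    counts = Counter(word for text in texts for word in text.lower().split())
    vocab = {"<UNK>": 0}
    # Sorting makes the id assignment deterministic across runs.
    for word, count in sorted(counts.items()):
        if count >= min_frequency:
            vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    """Replace each word with its integer id (0 if the word is unknown)."""
    return [vocab.get(word, 0) for word in text.lower().split()]


def decode(ids, vocab):
    """Map integer ids back to words using the saved vocabulary."""
    id_to_word = {index: word for word, index in vocab.items()}
    return " ".join(id_to_word[index] for index in ids)
```

Keeping the `vocab` dictionary around (for example, saved to disk) is what makes `decode` possible later on.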
Deep Learning Model
• One deep learning model for building a chatbot is the Dual Encoder LSTM network.
• There are many deep learning architectures; it's an active research area.
• For example, the seq2seq model often used in machine translation would probably also do
well on this task.
Implementation…
• tf-idf predictor
o tf-idf stands for "term frequency – inverse document frequency" and measures how important a
word in a document is relative to the whole corpus.
o Documents that have similar content will have similar tf-idf vectors.
o Intuitively, if a context and a response have similar words, they are more likely to be a correct pair.
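A tf-idf predictor of this kind can be sketched with the standard library only. The smoothed idf formula, the whitespace tokenizer, and the function names below are assumptions chosen for illustration; the talk does not specify them.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse tf-idf vector (word -> weight) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            word: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0)
            for word, count in term_freq.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

To score a context against candidate responses, vectorize them together and rank the responses by their cosine similarity to the context vector.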
Dual Encoder LSTM Model
Working of Dual Encoder LSTM
• Both the context and the response text are split into words, and each word is embedded into a
vector. The word embeddings are initialized with Stanford's GloVe vectors and are fine-tuned during
training.
• Both the embedded context and the embedded response are fed into the same Recurrent Neural
Network word by word. The RNN generates a vector representation that, loosely speaking, captures
the "meaning" of the context and of the response (c and r).
• The model then multiplies c with a matrix M to "predict" a response r'. The matrix M is learned
during training.
• It measures the similarity of the predicted response r' and the actual response r by taking the dot
product of these two vectors. A large dot product means the vectors are similar and that the
response should receive a high score.
• Finally, it applies a sigmoid function to convert that score into a probability.
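The scoring step (multiply c by M, take the dot product with r, apply a sigmoid) can be sketched as below. The encodings c and r would come from the LSTM; here they are plain lists of floats, and the function names and the row-vector convention r' = c M are assumptions for illustration.

```python
import math


def predict_response(c, M):
    """Compute r' = c M, treating c as a row vector and M as a list of rows."""
    n_out = len(M[0])
    return [sum(c[i] * M[i][j] for i in range(len(c))) for j in range(n_out)]


def score_pair(c, r, M):
    """Probability that response encoding r is a good reply to context encoding c."""
    r_pred = predict_response(c, M)
    # Dot product of predicted and actual response: large means similar.
    logit = sum(p * a for p, a in zip(r_pred, r))
    # Sigmoid squashes the score into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-logit))
```

In the real model M and the encoder weights are trained jointly so that true (context, response) pairs score close to 1 and random pairs close to 0.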
Creating an Input Function
• In order to use TensorFlow's built-in support for training and evaluation, we need to create an
input function: a function that returns batches of our input data.
• Because our training and test data have different formats, we need different input
functions for them. Each input function should return a batch of features and labels.
Steps:
• At a high level, the function does the following:
o Create a feature definition that describes the fields in our Example file
o Read records from the input_files with tf.TFRecordReader
o Parse the records according to the feature definition
o Extract the training labels
o Batch multiple examples and training labels
o Return the batched examples and training labels
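The steps above can be mirrored in a framework-free sketch. It deliberately does not use tf.TFRecordReader: in-memory dictionaries stand in for Example records, and `FEATURE_SPEC`, `parse_record`, and `input_fn` are hypothetical names chosen for illustration.

```python
# Hypothetical feature definition: field name -> expected Python type.
FEATURE_SPEC = {"context": list, "utterance": list, "label": int}


def parse_record(record, feature_spec=FEATURE_SPEC):
    """Keep only the declared fields and check their types."""
    parsed = {}
    for name, expected_type in feature_spec.items():
        value = record[name]
        if not isinstance(value, expected_type):
            raise TypeError(f"field {name!r} should be {expected_type.__name__}")
        parsed[name] = value
    return parsed


def input_fn(records, batch_size):
    """Yield (features, labels) batches from an iterable of raw records."""
    features, labels = [], []
    for record in records:
        parsed = parse_record(record)
        labels.append(parsed.pop("label"))  # extract the training label
        features.append(parsed)
        if len(features) == batch_size:
            yield features, labels
            features, labels = [], []
    if features:  # final, possibly smaller batch
        yield features, labels
```

The TensorFlow version additionally shuffles and queues the inputs, which this sketch leaves out.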
Creating the Model
• Because the training and evaluation data have different formats, we have to create a function
wrapper that takes care of bringing the data into the right format.
• It takes a model argument, which is a function that actually makes predictions.
• In our case it's the Dual Encoder LSTM, but we could easily swap it out for some other neural
network.
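One way to picture such a wrapper is a function that receives the prediction function as an argument and adapts the batch layout per mode. The batch layouts used here (one utterance per training example, a list of candidates per evaluation example) and all names are assumptions for illustration; `word_overlap_model` is a toy stand-in for the Dual Encoder LSTM.

```python
def create_model_fn(model_impl):
    """Wrap a scoring function so it accepts training and evaluation batches."""

    def model_fn(features, labels=None, mode="train"):
        if mode == "train":
            # One (context, utterance) pair per example, with a 0/1 label.
            probs = [model_impl(f["context"], f["utterance"]) for f in features]
            return probs, labels
        # Evaluation (assumed layout): several candidate utterances per context.
        probs = [
            [model_impl(f["context"], candidate) for candidate in f["candidates"]]
            for f in features
        ]
        return probs, None

    return model_fn


def word_overlap_model(context, utterance):
    """Toy scorer: fraction of utterance words that also occur in the context."""
    context_words = set(context.split())
    utterance_words = set(utterance.split())
    return len(context_words & utterance_words) / max(len(utterance_words), 1)
```

Swapping in a different network only means passing a different function to `create_model_fn`.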
Evaluating the model & making Predictions
• After training the model we can evaluate it on the test set.
• This will run the evaluation metrics on the test set instead of the validation set.
• We will get probability scores for unseen data.
• We could imagine feeding in 100 potential responses to a context and then picking the one
with the highest score.
References
• The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn
Dialogue Systems
o https://arxiv.org/abs/1506.08909
• Artificial Intelligence Markup Language (AIML)
o http://alice.sunlitsurf.com/alice/aiml.html
• Intelligent Chat Bot for Banking System
o http://www.ijettcs.org/Volume4Issue5(2)/IJETTCS-2015-10-09-16.pdf
• WILDML, Deep Learning for Chatbots
o http://www.wildml.com/2016/07/deep-learning-for-chatbots-2-retrieval-based-model-tensorflow/
Thank You