Deep learning enabled Question
Answering models
PROJECT WORK PRESENTATION
Saurabh Saxena
2015HT12604
Introduction
DEEP LEARNING AND QUESTION ANSWERING SYSTEMS
Deep learning
 What is Deep learning?
Deep learning is a new area of machine learning research that uses multi-layered
artificial neural networks. The objective is to learn multiple levels of
representation and abstraction that help to make sense of data such as images,
sound, and text. It is becoming increasingly relevant for three
key reasons:
 An infinitely flexible function – universal function approximation via Neural networks
 All-purpose parameter fitting – using gradient descent and its variants
 Fast and scalable – availability of cheap GPUs for fast matrix multiplications
 Typical applications of Deep learning
 Convolutional Neural Networks (CNN) in computer vision and machine translation
 Recurrent Neural Networks (RNN) such as LSTM/GRU in language modeling
 Tree Neural Networks (TNN) in sentiment analysis
 Reinforcement learning in game playing and intelligent agents
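The "all-purpose parameter fitting" reason can be illustrated with a minimal sketch. It assumes a two-parameter linear model and a mean-squared-error loss, both chosen here for illustration and not taken from the slides; the name `fit_line`, the learning rate and the step count are hypothetical.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w*x + b by plain gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) w.r.t. w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)
```

The same update rule, applied to millions of weights with gradients computed by backpropagation, is what a deep learning library automates.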
Basic Building blocks of deep learning
Most DL networks (including Question Answering models) are composed
of these basic building blocks:
• Fully Connected Network
• Word Embedding
• Convolutional Neural Network
• Recurrent Neural Network
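As a rough illustration of how two of these blocks fit together, the sketch below looks up word embeddings, averages them into a sentence vector, and passes that vector through one fully connected layer. The embedding values, the weights and the function names are all made up for the example; a trained model would learn them.

```python
import math

# Hypothetical 3-dimensional embedding table for a toy vocabulary.
EMBEDDINGS = {
    "where": [0.1, 0.3, -0.2],
    "is": [0.0, 0.1, 0.1],
    "mary": [0.5, -0.4, 0.2],
}


def embed(tokens):
    """Word embedding: look up one vector per token."""
    return [EMBEDDINGS[t] for t in tokens]


def mean_pool(vectors):
    """Average the token vectors into one fixed-size sentence vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def dense(x, weights, bias):
    """Fully connected layer: y = tanh(W x + b)."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


W = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # 2 output units, 3 inputs (assumed)
bias = [0.0, 0.0]
sentence = mean_pool(embed(["where", "is", "mary"]))
hidden = dense(sentence, W, bias)
```

Convolutional and recurrent layers replace the simple averaging step with operations that keep track of word order.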
General Architecture of a Deep model
 What is a Question Answering System?
The basic idea of an automated QA system is to extract information from
documents and, given a user query, provide a short and concise answer that
meets the user’s information needs.
 Traditional QA systems are of two types:
 Information Retrieval (IR) based QA – match-and-rank, broad-domain
QA using mostly unstructured data, e.g. search engines
 Knowledge-based (KB) QA – semantic representation of the query over structured
data such as triple stores or SQL databases, e.g. Freebase, DBpedia, and Wolfram
Alpha
 Question types
 Factoid questions – DeepMind CNN/DailyMail dataset
 Cloze style questions – MCTest dataset and bAbI
 Open domain question answering – WikiQA and LAMBADA
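The match-and-rank idea behind IR-based QA can be sketched in a few lines. This is a deliberately naive illustration that scores candidate sentences by word overlap with the question; real search engines use far richer ranking functions, and the names `tokenize` and `best_sentence` and the sample sentences are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_sentence(question, sentences):
    """Return the sentence sharing the most words with the question."""
    q = tokenize(question)
    return max(sentences, key=lambda s: len(q & tokenize(s)))


docs = [
    "Freebase is a structured knowledge base.",
    "Search engines rank documents against a user query.",
    "Wolfram Alpha answers questions from curated data.",
]
answer = best_sentence("How do search engines rank a query?", docs)
```

A KB-based system would instead translate the question into a structured query over triples or tables.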
QA systems
QA scenarios
Motivations – What can deep learning do for QA systems?
 The traditional QA pipeline relies heavily on manual feature engineering. Deep
learning models aim to eliminate this.
 The aim is to build systems that can directly read documents and then answer
questions based on those documents.
 RNNs have been successful in language modeling and generation but have
not achieved much success in QA, as they cannot store enough context in their
hidden states. To answer complex questions, models require supporting facts
from far back in the past.
 RNNs also suffer from the vanishing gradient problem if too many time-steps are used.
 Solution – incorporate explicit memory in the model and a way to address
that memory for reads and writes.
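The vanishing gradient problem mentioned above can be made concrete with a small numerical sketch. It assumes a scalar RNN with no inputs, h_t = tanh(w * h_{t-1}), which is a simplification chosen for illustration; the function name and the values of `w` and `h0` are hypothetical.

```python
import math


def gradient_through_time(w, steps, h0=0.5):
    """Compute d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: each step multiplies by w * (1 - tanh^2(w * h_prev)).
        grad *= w * (1.0 - h * h)
    return grad


short_range = gradient_through_time(w=0.9, steps=5)
long_range = gradient_through_time(w=0.9, steps=50)
```

Each step multiplies the gradient by a factor no larger than `w`, so with `w = 0.9` the signal from 50 steps back is already below 0.9^50, roughly 0.005. A supporting fact seen that long ago barely influences the weight update.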
Memory networks for QA
AND THEIR VARIANTS
What are Memory Networks?
 A class of models that combine a large memory with a learning component that
can read from and write to it.
 Incorporates reasoning with attention over memory (RAM).
 Most ML models have limited memory, which is more or less all that is needed for
“low level” tasks, e.g. object detection.
 Long-term memory is required to read a story and then, e.g., answer
questions about it.
 It is also required for dialog: to remember previous dialog (short- and
long-term), and respond.
 Models are scalable – they can store and read a large amount of data in memory,
e.g. an entire KB
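A minimal sketch of "attention over memory" follows, assuming dot-product scoring and a softmax, which is a common formulation but not necessarily the exact one used in the models on these slides. The memory vectors and the query are toy values, and `softmax` and `attend` are names introduced here.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Soft read: weight each memory vector by its match with the query."""
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(query)
    read = [sum(w * slot[i] for w, slot in zip(weights, memory)) for i in range(dim)]
    return weights, read


memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]  # three stored vectors (toy values)
weights, read = attend([2.0, 0.0], memory)
```

Because the read is a weighted sum rather than a hard lookup, the whole operation is differentiable and can be trained with gradient descent.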
All MemNNs have four component networks (which may or
may not have shared parameters):
 I: (input feature map) convert incoming data to the internal feature
representation.
 G: (generalization) update memories given new input.
 O: (output feature map) produce new output (in feature representation space) given the
memories.
 R: (response) convert output O into the response seen by the outside world.
Step 1: controller converts incoming data to internal
feature representation (I)
Step 2: write head updates the memories and writes the data
into memory (G)
Step 3: given the external input, the read head reads
the memory and fetches relevant data (O)
Step 4: controller combines the external data with
memory contents returned by read head to generate
output (O, R)
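The four steps above can be sketched as a toy class. Everything here is hand-written and untrained: I is a bag of words, G appends to a list, O picks the memory with the largest word overlap, and R returns a candidate word from that memory. A real MemNN learns these components; the class name `TinyMemNN` and the answer vocabulary are assumptions made for the example.

```python
import re


class TinyMemNN:
    """Toy I/G/O/R pipeline with hand-written, untrained components."""

    def __init__(self, vocabulary):
        self.vocabulary = set(vocabulary)  # candidate one-word answers
        self.memory = []

    def I(self, text):
        # Input feature map: bag of lowercase words.
        return set(re.findall(r"[a-z]+", text.lower()))

    def G(self, features):
        # Generalization: write the new item into the next free slot.
        self.memory.append(features)

    def O(self, question):
        # Output: memory with the largest word overlap (latest wins ties).
        return max(reversed(self.memory), key=lambda m: len(m & question))

    def R(self, question, supporting):
        # Response: a candidate word in the supporting memory, not in the question.
        candidates = (supporting - question) & self.vocabulary
        return sorted(candidates)[0] if candidates else None

    def store(self, sentence):
        self.G(self.I(sentence))

    def answer(self, question):
        q = self.I(question)
        return self.R(q, self.O(q))


net = TinyMemNN(["bathroom", "hallway", "kitchen"])
net.store("Mary moved to the bathroom.")
net.store("John went to the hallway.")
net.store("Mary travelled to the kitchen.")
mary = net.answer("Where is Mary?")
john = net.answer("Where is John?")
```

Preferring the most recent matching memory is what lets the toy model report Mary's latest location rather than her first one.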
State-of-the-art Memory Networks
Datasets to train Deep QA models
BABI, LAMBADA, MCTEST AND MORE…
Datasets available to train/test QA models
 Facebook bAbI Simplequestions – A set of 20 tasks for testing text understanding
and reasoning. For each task, there are 10,000 questions for training and 1,000 for
testing. Each task tests the machine on a specific skill set.
https://research.fb.com/downloads/babi/
 Facebook bAbI Children's Book Test (CBT) – Text passages and corresponding
questions drawn from Project Gutenberg children's books. 669,343 training
questions, 8,000 dev questions and 10,000 test questions
 MCTest – consists of 500 stories and 2,000 questions. Because the stories are fictional,
the answer can typically be found only in the story itself. Requires machines to answer
multiple-choice reading comprehension questions about fictional stories, directly
tackling the high-level goal of open-domain machine comprehension.
http://research.microsoft.com/en-us/um/redmond/projects/mctest/
 Language Modeling Broadened to Account for Discourse Aspects(LAMBADA
dataset) - consists of 10,022 passages, divided into 4,869 development and 5,153
test passages (extracted from 1,331 and 1,332 disjoint novels, respectively). The
average passage consists of 4.6 sentences in the context plus 1 target sentence, for
a total length of 75.4 tokens (dev) / 75 tokens (test).
http://clic.cimec.unitn.it/lambada/
 DeepMind CNN and DailyMail dataset – Collection of news articles and
corresponding cloze queries. The two datasets contain many documents (90k and 197k
respectively), and each document has approximately 4 questions on average. Each
question is a sentence with one missing word/phrase which can be found in the
accompanying document/context
http://cs.nyu.edu/~kcho/DMQA/
 Stanford Question Answering Dataset (SQuAD) – reading comprehension dataset
consisting of questions posed by crowd-workers on a set of Wikipedia articles. The
answer to every question is a segment of text, or span, from the corresponding
reading passage. There are 100,000+ question-answer pairs on 500+ articles.
https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/
 AI2 Science Exams – Elementary science questions from US state and regional
science exams. 170 multi-state and 108 4th grade questions.
http://allenai.org/data/science-exam-questions.html
 WikiQA – 3,047 questions sampled from Bing query logs. Each question is associated
with a Wikipedia page. All sentences in the summary paragraph of the page
become the candidate answers. Only about one third of the questions have a correct
answer in the candidate answer set.
https://www.microsoft.com/en-us/research/publication/wikiqa-a-challenge-dataset-for-open-domain-question-answering/
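To make the cloze format described for the CNN/DailyMail entry concrete, the sketch below builds one query by blanking out a phrase in a sentence. It is a toy illustration, not the procedure used to build the actual dataset; the `@placeholder` token, the function names and the sample sentence are assumptions.

```python
def make_cloze(sentence, answer, placeholder="@placeholder"):
    """Blank out the first occurrence of `answer`; return (query, answer)."""
    if answer not in sentence:
        raise ValueError("answer must occur in the sentence")
    return sentence.replace(answer, placeholder, 1), answer


def answer_in_context(answer, context):
    """Cloze datasets assume the missing phrase appears in the document."""
    return answer in context


context = "Mary moved to the kitchen after lunch. John stayed in the hallway."
query, answer = make_cloze("Mary moved to the kitchen after lunch.", "kitchen")
```

A model is then scored on whether it recovers `answer` given `context` and `query`.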
Facebook bAbI dataset – 20 tasks
• Single supporting fact
• Two supporting facts
• Three supporting facts
• Two argument relations
• Three argument relations
• Yes/No questions
• Counting
• Lists/sets
• Simple Negation
• Indefinite Knowledge
• Basic Coreference
• Conjunction
• Compound Coreference
• Time Reasoning
• Basic Deduction
• Basic Induction
• Positional Reasoning
• Size Reasoning
• Path Finding
• Agent’s Motivations
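A short parser sketch for task files of this kind is given below. It assumes the plain-text layout of the public bAbI release as commonly distributed: each line starts with a number that restarts at 1 for a new story, and question lines carry a tab-separated answer and supporting-fact ids. The layout is stated here as an assumption, and the sample lines are illustrative.

```python
def parse_babi(lines):
    """Yield (story_sentences, question, answer, supporting_ids) tuples."""
    story = []
    for line in lines:
        line_id, text = line.strip().split(" ", 1)
        if int(line_id) == 1:
            story = []  # numbering restarts at 1 for each new story
        if "\t" in text:
            # Question line: question <TAB> answer <TAB> supporting fact ids.
            question, answer, support = text.split("\t")
            ids = [int(i) for i in support.split()]
            yield list(story), question.strip(), answer, ids
        else:
            story.append(text)


sample = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary? \tbathroom\t1",
]
examples = list(parse_babi(sample))
```

The supporting-fact ids make it possible to check whether a model attended to the right sentence, not just whether its answer was correct.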
20 tasks in brief..
End-to-End MemNN
Dynamic MemNN
Key-value MemNN Architecture
Experimental Setup to train deep models
GPU, THEANO, KERAS, CUDA, CUDNN AND MORE…
Component: Description
Operating System: Ubuntu 16.04 VM on an Intel octa-core CPU with 6.5 GB RAM
Graphics Card: NVIDIA Tesla K80 with 12 GB RAM and 2080 CUDA cores
Graphics Toolkit: CUDA 8.0 with cuDNN 6.0
Python Package Manager: Anaconda (Continuum Analytics) for Python 2.7
Deep learning library: Keras v2.0.2 with Theano v0.9.0 backend
Other Python modules:
 Bcolz v1.0.0 for fast saving/loading of trained weights
 NumPy v1.12.1 for all multi-dimensional numeric manipulations
 Scikit-learn v0.18.1 for preprocessing, pipelining, feature extraction, decomposition, dataset
splits and all general non-deep machine learning algorithms
 cPickle for saving models
 NLTK toolkit for traditional linguistic tasks
 Matplotlib v2.0.0 for visualizing data
 Pydot v1.0.28 and GraphViz v2.38.0 for visualizing deep models
 OpenBLAS v0.2.19 for fast linear algebra operations
 Pandas v0.19.2 for structured data manipulation
 Protobuf v3.0.0 for protocol buffers
 Flask v0.12 for web display
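When reproducing a setup like this, it can help to compare installed package versions against the pinned ones. The helper below is a hypothetical sketch, not part of the original project: `check_versions` and the `EXPECTED` mapping are names introduced here, the versions are copied from the list above, and `sklearn` is the import name of Scikit-learn. It is written for Python 3, although the original environment used Python 2.7.

```python
import importlib

# Versions pinned in the setup list (module import name -> expected version).
EXPECTED = {
    "keras": "2.0.2",
    "theano": "0.9.0",
    "numpy": "1.12.1",
    "sklearn": "0.18.1",
    "pandas": "0.19.2",
    "matplotlib": "2.0.0",
    "flask": "0.12",
}


def check_versions(expected=EXPECTED):
    """Return {module: (installed_or_None, expected, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
            found = getattr(module, "__version__", None)
        except Exception:
            # Any import failure is treated as "not installed".
            found = None
        report[name] = (found, wanted, found == wanted)
    return report
```

Calling `check_versions()` with no arguments checks the pinned versions; passing a smaller dictionary restricts the check to the packages of interest.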
Experimental setup in Google Cloud
Compute Engine setup in Google Cloud
GPU details
Training Summary
MODELS, TEST ACCURACY AND MORE…
Model summary for bAbI Task #1
Training summary for bAbI Task #1 – one supporting fact
Training summary for bAbI Task #2 – two supporting facts
Joint training on all 20 tasks simultaneously
Demo on bAbI tasks – Correct answers
Demo – Incorrect answer
Future work
 Train a Dynamic Memory Network on the bAbI dataset
 Train a Key-value Memory Network on the bAbI dataset
 Evaluate the performance of the current models on other datasets such as
LAMBADA and Stanford SQuAD
 Explore the possibility of transfer learning so that models trained on open
source datasets can be applied to corporate datasets with only fine-tuning
 Explore the use of trained models in dialog modeling for helpdesk
question answering
Thanks

Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk

  • 1. Deep learning enabled Question Answering models PROJECT WORK PRESENTATION Saurabh Saxena 2015HT12604
  • 2. Introduction DEEP LEARNING AND QUESTION ANSWERING SYSTEMS
  • 3. Deep learning
      - What is deep learning? Deep learning is a new area of machine learning research that uses multi-layered artificial neural networks. The objective is to learn multiple levels of representation and abstraction that help make sense of data such as images, sound, and text. It is becoming increasingly relevant for three key reasons:
          - An infinitely flexible function – universal function approximation via neural networks
          - All-purpose parameter fitting – using gradient descent and its derivative algorithms
          - Fast and scalable – availability of cheap GPUs for fast matrix multiplications
      - Typical applications of deep learning:
          - Convolutional Neural Networks (CNN) in computer vision and machine translation
          - Recurrent Neural Networks (RNN) such as LSTM/GRU in language modeling
          - Tree Neural Networks (TNN) in sentiment analysis
          - Reinforcement learning in game playing and intelligent agents
  • 4. 4 Basic building blocks of deep learning
      Most DL networks (including Question Answering models) are composed of these basic building blocks:
      - Fully Connected Network
      - Word Embedding
      - Convolutional Neural Network
      - Recurrent Neural Network
  • 5. General Architecture of a Deep model
  • 6. What is a Question Answering System?
      - The basic idea of an automated QA system is to extract information from documents and, given a user query, provide a short and concise answer that meets the user's information needs.
      - Traditional QA systems are basically of two types:
          - Information Retrieval (IR) based QA – match-and-rank, broad-domain QA using mostly unstructured data; example: search engines
          - Knowledge-based (KB) QA – semantic representation of the query using structured data such as triple stores or SQL; examples: Freebase, DBpedia, and Wolfram Alpha
      - Question types:
          - Factoid questions – DeepMind CNN/DailyMail dataset
          - Cloze style questions – MCTest dataset and bAbI
          - Open domain question answering – WikiQA and LAMBADA QA systems
  • 11. Motivations – What can deep learning do for QA systems?
      - The traditional QA pipeline relies heavily on manual feature engineering. Deep learning models aim to eliminate this.
      - The goal is to build systems that can directly read documents and then answer questions based on those documents.
      - RNNs have been successful in language modeling and generation but have not achieved much success in QA, because they cannot store enough context in their hidden states. To answer complex questions, models need supporting facts from far back in the input.
      - RNNs also suffer from the vanishing gradient problem if too many time-steps are used.
      - Solution: incorporate explicit memory in the model, together with a way to address that memory for reads and writes.
  • 12. Memory networks for QA and their variants
  • 13. What are Memory Networks?
      - A class of models that combine a large memory with a learning component that can read from and write to it.
      - They incorporate reasoning with attention over memory (RAM).
      - Most ML models have limited memory, which is more or less all that is needed for "low level" tasks, e.g. object detection.
      - Long-term memory is required to read a story and then, e.g., answer questions about it.
      - It is also required for dialog: to remember previous dialog (short- and long-term) and respond.
      - The models are scalable: they can store and read large amounts of data in memory, even an entire KB.
  • 14. All MemNNs have four component networks (which may or may not have shared parameters):
      - I (input feature map): converts incoming data to the internal feature representation.
      - G (generalization): updates memories given new input.
      - O (output): produces new output (in feature representation space) given the memories.
      - R (response): converts the output O into the response seen by the outside world.
      Step 1: the controller converts incoming data to the internal feature representation (I)
      Step 2: the write head updates the memories and writes the data into memory (G)
      Step 3: given the external input, the read head reads the memory and fetches relevant data (O)
      Step 4: the controller combines the external data with the memory contents returned by the read head to generate output (O, R)
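The four steps above can be sketched in a few lines of plain Python. This is only a toy illustration, not the model trained in this project: the class name `ToyMemoryNetwork`, the bag-of-words features, the word-overlap scoring and the "last word of the fact" response rule are all assumptions made for the example, standing in for components that a real memory network learns from data.

```python
from collections import Counter


def featurize(text):
    """I: map raw text to an internal representation (bag-of-words counts)."""
    cleaned = text.lower().replace("?", " ").replace(".", " ")
    return Counter(cleaned.split())


class ToyMemoryNetwork:
    """Hand-written stand-in for the I/G/O/R components (nothing is learned)."""

    def __init__(self):
        self.memory = []  # each slot holds (sentence, features)

    def write(self, sentence):
        """G: store the new input in the next free memory slot."""
        self.memory.append((sentence, featurize(sentence)))

    def read(self, question):
        """O: fetch the slot sharing the most words with the question.

        Ties are broken in favour of the most recently written slot.
        """
        q = featurize(question)

        def score(index):
            features = self.memory[index][1]
            overlap = sum(q[word] * features[word] for word in q)
            return (overlap, index)

        best = max(range(len(self.memory)), key=score)
        return self.memory[best][0]

    def respond(self, supporting_fact):
        """R: turn the retrieved fact into an answer (toy rule: its last word)."""
        return supporting_fact.rstrip(".").split()[-1]

    def answer(self, question):
        return self.respond(self.read(question))


net = ToyMemoryNetwork()
for sentence in [
    "Mary moved to the bathroom.",
    "John went to the hallway.",
    "Mary travelled to the kitchen.",
]:
    net.write(sentence)

print(net.answer("Where is Mary?"))  # prints "kitchen"
```

In a trained memory network the overlap score is replaced by learned embeddings with attention over the memory slots, and the response step scores candidate answer words rather than applying a fixed rule.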
  • 16. Datasets to train deep QA models: bAbI, LAMBADA, MCTest and more…
  • 17. Datasets available to train/test QA models
      - Facebook bAbI Simplequestions – A set of 20 tasks for testing text understanding and reasoning. For each task, there are 10,000 questions for training and 1,000 for testing. Each task tests the machine on a specific skill set. https://research.fb.com/downloads/babi/
      - Facebook bAbI Children's Book Test (CBT) – Text passages and corresponding questions drawn from Project Gutenberg children's books. 669,343 training questions, 8,000 dev questions and 10,000 test questions.
      - MCTest – Consists of 500 stories and 2,000 questions. Because the stories are fictional, the answer can typically be found only in the story itself. Requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension. http://research.microsoft.com/en-us/um/redmond/projects/mctest/
  • 18. Datasets available to train/test QA models (continued)
      - Language Modeling Broadened to Account for Discourse Aspects (LAMBADA dataset) – Consists of 10,022 passages, divided into 4,869 development and 5,153 test passages (extracted from 1,331 and 1,332 disjoint novels, respectively). The average passage consists of 4.6 sentences in the context plus 1 target sentence, for a total length of 75.4 tokens (dev) / 75 tokens (test). http://clic.cimec.unitn.it/lambada/
      - DeepMind CNN and DailyMail dataset – Collection of news articles and corresponding cloze queries. Each dataset contains many documents (90k and 197k respectively), and each document has approximately 4 questions on average. Each question is a sentence with one missing word/phrase, which can be found in the accompanying document/context. http://cs.nyu.edu/~kcho/DMQA/
  • 19. Datasets available to train/test QA models (continued)
      - Stanford Question Answering Dataset (SQuAD) – Reading comprehension dataset consisting of questions posed by crowd-workers on a set of Wikipedia articles. The answer to every question is a segment of text, or span, from the corresponding reading passage. There are 100,000+ question-answer pairs on 500+ articles. https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/
      - AI2 Science Exams – Elementary science questions from US state and regional science exams. 170 multi-state and 108 4th grade questions. http://allenai.org/data/science-exam-questions.html
      - WikiQA – 3,047 questions sampled from Bing query logs. Each question is associated with a Wikipedia page. All sentences in the summary paragraph of the page become the candidate answers. Only one third of the questions have a correct answer in the candidate answer set. https://www.microsoft.com/en-us/research/publication/wikiqa-a-challenge-dataset-for-open-domain-question-answering/
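Two of the answer formats described in the dataset slides, cloze queries (CNN/DailyMail) and answer spans (SQuAD), can be illustrated with a short sketch. The helper names `make_cloze_query` and `find_answer_span`, the `@placeholder` default token and the sample passage are assumptions chosen for this example; they are not taken from the datasets' own tooling.

```python
def make_cloze_query(sentence, answer, placeholder="@placeholder"):
    """Blank out the first occurrence of `answer` to form a cloze-style query."""
    if answer not in sentence:
        raise ValueError("answer must occur in the sentence")
    return sentence.replace(answer, placeholder, 1)


def find_answer_span(passage, answer):
    """Return (start, end) character offsets of `answer` in `passage`, or None."""
    start = passage.find(answer)
    if start == -1:
        return None
    return start, start + len(answer)


# Invented sample text, used only to show the two formats.
passage = "Mary moved to the bathroom. John went to the hallway."

query = make_cloze_query("John went to the hallway.", "hallway")
print(query)  # prints "John went to the @placeholder."

span = find_answer_span(passage, "hallway")
print(passage[span[0]:span[1]])  # prints "hallway"
```

In both formats the answer must be recoverable from the accompanying passage, which is why the span lookup returns `None` when the answer text does not occur there.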
  • 20. Facebook bAbI dataset – 20 tasks
      - Single supporting fact
      - Two supporting facts
      - Three supporting facts
      - Two argument relations
      - Three argument relations
      - Yes/No questions
      - Counting
      - Lists/sets
      - Simple negation
      - Indefinite knowledge
      - Basic coreference
      - Conjunction
      - Compound coreference
      - Time reasoning
      - Basic deduction
      - Basic induction
      - Positional reasoning
      - Size reasoning
      - Path finding
      - Agent's motivations
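As a rough sketch of how these tasks can be loaded, the snippet below parses text in the layout used by the publicly released bAbI files: every line starts with an ID that restarts at 1 for each new story, and question lines carry the question, the answer and the supporting-fact IDs separated by tabs. That layout is not stated in the slides and is assumed here, and the function name `parse_babi` and the sample lines are illustrative only.

```python
def parse_babi(lines):
    """Group bAbI-style lines into question samples with their story so far."""
    samples = []
    story = {}  # line ID -> sentence, for the story currently being read
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        line_id, text = line.split(" ", 1)
        line_id = int(line_id)
        if line_id == 1:
            story = {}  # an ID of 1 marks the start of a new story
        if "\t" in text:
            # Question line: question <TAB> answer <TAB> supporting fact IDs
            question, answer, support = text.split("\t")
            samples.append({
                "story": [story[i] for i in sorted(story)],
                "question": question.strip(),
                "answer": answer,
                "support": [int(i) for i in support.split()],
            })
        else:
            story[line_id] = text
    return samples


SAMPLE = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary? \tbathroom\t1",
    "4 Daniel went back to the hallway.",
    "5 Where is Daniel? \thallway\t4",
    "1 John moved to the office.",
    "2 Where is John? \toffice\t1",
]

for sample in parse_babi(SAMPLE):
    print(sample["question"], "->", sample["answer"])
```

Each sample keeps every statement seen before the question, so a model can be trained with or without the supporting-fact IDs.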
  • 21. 20 tasks in brief..
  • 27. Experimental setup to train deep models: GPU, Theano, Keras, CUDA, cuDNN and more…
  • 28. Experimental setup in Google Cloud (Component: Description)
      - Operating system: Ubuntu 16.04 VM on an Intel octa-core CPU with 6.5 GB RAM
      - Graphics card: NVIDIA Tesla K80 with 12 GB RAM and 2080 CUDA cores
      - Graphics toolkit: CUDA 8.0 with cuDNN 6.0
      - Python package manager: Anaconda (Continuum Analytics) for Python 2.7
      - Deep learning library: Keras v2.0.2 with Theano v0.9.0 backend
      - Other Python modules:
          - Bcolz v1.0.0 for fast saving/loading of trained weights
          - NumPy v1.12.1 for all multi-dimensional numeric manipulations
          - Scikit-learn v0.18.1 for preprocessing, pipelining, feature extraction, decomposition, dataset splits and all general non-deep machine learning algorithms
          - cPickle for saving models
          - NLTK toolkit for traditional linguistic tasks
          - Matplotlib v2.0.0 for visualizing data
          - Pydot v1.0.28 and GraphViz v2.38.0 for visualizing deep models
          - OpenBLAS 0.2.19 for fast linear algebra operations
          - Pandas v0.19.2 for structured data manipulation
          - Protobuf 3.0.0 for protocol buffers
          - Flask v0.12 for web display
  • 29. Compute Engine setup in Google Cloud
  • 31. Training summary: models, test accuracy and more…
  • 32. Model summary for bAbI Task #1
  • 33. Training summary for bAbI Task #1 (one supporting fact); training summary for bAbI Task #2 (two supporting facts)
  • 34. Joint training on all 20 tasks simultaneously
  • 35. Demo on bAbI tasks: correct answers
  • 38. Future work
      - Train a Dynamic Memory Network on the bAbI dataset
      - Train a Key-Value Memory Network on the bAbI dataset
      - Evaluate the performance of the current models on other datasets such as LAMBADA and Stanford SQuAD
      - Explore transfer learning, so that models trained on open-source datasets can be applied to corporate datasets with only fine-tuning
      - Explore the use of trained models in dialog modeling for helpdesk question answering