What is wrong with apps and web models?
Conversation as an emerging paradigm for mobile UI
Bots as intelligent conversational interface agents
Major types of conversational bots:
• Social ChatBots (e.g. XiaoIce)
• InfoBots
• TaskCompletion Bots (goal-oriented)
• Personal Assistant Bots (all of the above + recommendation)
http://venturebeat.com/2016/08/01/how-deep-reinforcement-learning-can-help-chatbots/
Bots Technology Overview: three generations; latest deep RL
Generation I: Symbolic Rule/Template Based
• Centered on grammatical rule & ontological design by
human experts (early AI approach)
• Easy interpretation, debugging, and system update
• Popular before late 90’s
• Still in use in commercial systems and by bot startups
• Limitations:
• reliance on experts
• hard to scale over domains
• data used only to help design rules, not for learning
• Example system next
Generation II: Data Driven, (shallow) Learning
• Data used not to design rules for NLU and action, but to learn statistical
parameters in dialogue systems
• Reduce cost of hand-crafting complex dialogue manager
• Robustness against speech recog errors in noisy environment
• MDP/POMDP & reinforcement learning for dialogue policy
• Discriminative (CRF) & generative (HMM) methods for NLU
• Popular in academic research until 2014 (before deep learning arrived at the
dialogue world); in parallel with Generation I (BBN, AT&T, CMU, SRI, CU …)
• Limitations:
• Not easy to interpret, debug, and update systems
• Still hard to scale over domains
• Models & representations not powerful enough; no end-to-end learning, hard to scale up
• Remained academic until deep learning arrived
• Example system next
Components of a state-based spoken dialogue system
Generation III: Data-Driven Deep Learning
• Like Generation-II, data used to learn everything in dialogue systems
• Reduce cost of hand-crafting complex dialogue manager
• Robustness against speech recog errors in noisy environment & against NLU errors
• MDP/POMDP & reinforcement learning for dialogue policy (same)
• But, neural models & representations are much more powerful
• End-to-End learning becomes feasible
• Attracted huge research efforts since 2015 (after deep learning’s success in
vision/speech and deep RL’s success in Atari games)
• Limitations:
• Still not easy to interpret, debug, and update systems
• Lack of an interface between continuous neural learning and symbolic natural-language structure for human users
• Active research in scaling over domains via deep transfer learning & RL
• No clear commercial success reported yet
• Deep RL & example research next
What is reinforcement learning (RL)?
• RL in Generation-II ---> not working! (with unnecessarily complex POMDP)
• RL in Generation-III ---> working! (due to deep learning, like NN vs DNN in ASR)
• RL is learning what to do so as to maximize a numerical reward signal
• “What to do” means mapping from a situation in a given environment to an action
• Takes inspiration from biology / psychology
• RL is a characterization of a problem class
• Doesn’t refer to a specific solution
• There are many methods for solving RL problems
• In its most general form, RL problems:
• Have a stateful environment, where actions can change the state of the environment
• Learn by trial and error, not by being shown examples of the right action
• Have delayed rewards, where an action’s value may not be clear until some time after it is taken
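Stateful Model for RL
[Figure: agent and environment; the agent contains a state estimator. Labels: action $a_t$, observation $o_t$, reward $r_t$, state $s_t$]
State: $s_t = \mathrm{Summary}(o_0, a_0, o_1, a_1, \cdots, o_{t-1}, a_{t-1}, o_t)$
Trajectory: $a_0, r_1, s_1, a_1, r_2, s_2, a_2, \cdots$
Return: $\sum_{\tau=t+1}^{\infty} \gamma^{\tau-1} r_\tau, \quad 0 \le \gamma \le 1$
Policy: $\pi(s_t) \rightarrow a_t$
Objective: $\pi^* = \arg\max_{\pi} E\left[ \sum_{\tau=t+1}^{\infty} \gamma^{\tau-1} r_\tau \,\middle|\, \pi \right], \quad \forall s_t$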
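As a small illustration of the Return defined on this slide, the following Python sketch computes a discounted sum of rewards for a finite trajectory. It assumes t = 0, so the list holds r_1, r_2, ... and the first reward has weight 1; the function name `discounted_return` is hypothetical and not part of the original deck.

```python
def discounted_return(rewards, gamma):
    """Return rewards[0] + gamma * rewards[1] + gamma**2 * rewards[2] + ...

    Assumption: `rewards` is the finite list r_1, r_2, ... observed from t = 0.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Work backwards through the trajectory: G_k = r_k + gamma * G_{k+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total
```

With gamma close to 0 the agent values only immediate reward; with gamma = 1 every future reward counts equally.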
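[Figure: RL-based dialogue system. Components: language understanding (user input $o$ to state $s$), dialogue policy $a = \pi(s)$, language (response) generation (action $a$ to response). Training loop: collect rewards $(s, a, r, s')$, optimize $Q(s, a)$]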
Type of Bots | State | Action | Reward
Social ChatBots | Chat history | System response | # of turns maximized; intrinsically motivated reward
InfoBots (interactive Q/A) | User current question + context/history | Answers to current question by system | Relevance of answer; # of turns minimized
Task-Oriented Bots | User current input + context/history | DialogAct w. SlotValue in current turn | Task success rate; # of turns minimized
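To make the reward column concrete for task-oriented bots, here is a minimal Python sketch of one possible reward scheme: a small penalty on every turn (so fewer turns is better) plus a bonus or penalty at the end depending on task success. The function names and all numeric values are illustrative assumptions, not figures from the deck.

```python
def turn_reward(dialogue_over, task_succeeded,
                turn_penalty=1.0, success_bonus=20.0, failure_penalty=10.0):
    """Reward for one turn; the constants are assumed values for illustration."""
    reward = -turn_penalty  # every turn costs something: "# of turns minimized"
    if dialogue_over:
        # Final turn: add the task-success signal.
        reward += success_bonus if task_succeeded else -failure_penalty
    return reward


def episode_return(num_turns, task_succeeded):
    """Undiscounted sum of per-turn rewards over a dialogue of num_turns turns."""
    total = 0.0
    for turn in range(1, num_turns + 1):
        total += turn_reward(turn == num_turns, task_succeeded)
    return total
```

Under this scheme a successful dialogue finished in 5 turns scores higher than one finished in 10, and both score higher than a failed one.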
Q-Learning [Sutton & Barto 98]
• Assume 𝑄(𝑠, 𝑎) for all 𝑠, 𝑎 can be represented in a table
1. Initialize an array 𝑄(𝑠, 𝑎) arbitrarily
2. Choose actions based on 𝑄 such that all actions are taken in all states (infinitely often in the
limit)
3. On each time step, update one element of the array:
$\Delta Q(s_t, a_t) = \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$
• Model-free learning:
• Learning long-term optimal behavior without model of the environment
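The three steps above can be sketched in Python on a toy problem. The four-state chain environment, the epsilon-greedy action choice, and all hyperparameter values below are assumptions made for illustration; only the table update follows the standard tabular Q-learning rule.

```python
import random

N_STATES = 4      # states 0..3; state 3 is terminal (the goal)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy chain: reward 1 only when the goal state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.5,
               seed=0, max_steps=200):
    rng = random.Random(seed)
    # Step 1: initialize the table Q(s, a) (here: all zeros).
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Step 2: epsilon-greedy keeps every action in play in every state.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Step 3: update one table entry towards the bootstrapped target.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned table."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

The learner never looks inside `step`; it only sees sampled transitions, which is what "model-free" means here.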
Function Approximation
• In many tasks, the space of (𝑠, 𝑎) pairs is too large for tabular representation
• Estimate the action-value function approximately as $Q(s, a; \theta)$
• 𝜃: parameters of a linear function (baseline)
• 𝜃: parameters of a DNN, aka Deep Q-Network (DQN)
• Optimize 𝜃 using SGD w.r.t. the loss
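As a rough sketch of the linear baseline, the Python below represents Q as a dot product between a parameter vector and a feature vector, and takes one SGD step on a squared error between Q and a bootstrapped target. The loss formula is not recoverable from the slide, so the squared temporal-difference error used here is an assumption (it is the usual choice); all function names are hypothetical.

```python
def q_value(theta, features):
    """Linear action-value estimate: Q(s, a; theta) = theta . phi(s, a)."""
    return sum(w * x for w, x in zip(theta, features))


def td_target(reward, gamma, next_q_values, done):
    """Bootstrapped target: r if the episode ended, else r + gamma * max_a' Q(s', a')."""
    return reward if done else reward + gamma * max(next_q_values)


def sgd_step(theta, features, target, learning_rate):
    """One SGD step on 0.5 * (target - Q)^2, treating the target as a constant."""
    error = target - q_value(theta, features)
    # For a linear model the gradient of Q w.r.t. theta is the feature vector.
    return [w + learning_rate * error * x for w, x in zip(theta, features)]
```

A DQN replaces the dot product with a neural network and the hand-written gradient with backpropagation; the target and the loss keep the same shape.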
Q-Learning for DQN [DeepMind 15]
• Learning becomes unstable
• Correlations present in the sequence of states
• Small updates to 𝑄 lead to significant changes in the policy and the data distribution
• Correlations between the to-be-learned 𝑄 and the target value $r + \gamma \max_{a'} Q(s', a')$
• Solution
• Experience replay: randomize training samples (𝑠, 𝑎, 𝑟, 𝑠′)
• Use a separate 𝑄 function to compute targets 𝑦
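The two remedies can be sketched in plain Python. This is a minimal illustration under assumptions, not DeepMind's implementation: `ReplayBuffer`, `compute_targets`, and the `target_q` callable (a frozen copy of the Q function that returns one value per action) are hypothetical names.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions (s, a, r, s', done), sampled at random."""

    def __init__(self, capacity, seed=0):
        self._items = deque(maxlen=capacity)  # oldest transitions are dropped first
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self._items.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Random sampling breaks the correlation between consecutive states.
        size = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), size)

    def __len__(self):
        return len(self._items)


def compute_targets(batch, target_q, gamma):
    """Targets y from a separate, frozen Q function.

    y = r for terminal transitions, else r + gamma * max_a' target_q(s')[a'].
    """
    return [
        reward if done else reward + gamma * max(target_q(next_state))
        for (_, _, reward, next_state, done) in batch
    ]
```

In a full training loop the frozen copy would be refreshed from the learned network only every so often, so the targets do not shift with every gradient step.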
User goal – Two tickets for “Deadpool” tomorrow 9PM at AMC Pacific Place 11 theater, Seattle.
RULE BASED SYSTEM
User: Do you have any ticket for movie ‘Deadpool’ at 9:00 pm?
Agent: Which theater do you want?
User: AMC pacific place 11 theater, please!
Agent: How many tickets do you want?
User: Two!
Agent: What date?
User: Tomorrow!
Agent: Which city do you want?
User: Seattle!
Agent: Do you have any specification for zip code?
User: No, I do not care.
Agent: Do you have any distance constraint?
User: No, I do not care.
Agent: How about video format? 2d? Imax 3d?
User: Any kind is okay.
Agent: Do you have any constraint to price?
User: No!
Agent: Okay, I will issue 2 tickets for you, tomorrow 9:00 pm at AMC pacific place 11 theater, Seattle, movie ‘Deadpool’.
User: Okay, thanks.
Agent: Thanks!
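The scripted behaviour in the rule-based transcript above can be sketched as a fixed list of slot questions that the agent walks through in order, asking for every slot the user has not yet filled. The slot names and the helper function are hypothetical, chosen for illustration; the question strings are copied from the transcript.

```python
# Fixed script: (slot name, question). Slot names are assumed for illustration.
SLOT_QUESTIONS = [
    ("theater", "Which theater do you want?"),
    ("ticket_count", "How many tickets do you want?"),
    ("date", "What date?"),
    ("city", "Which city do you want?"),
    ("zip_code", "Do you have any specification for zip code?"),
    ("distance", "Do you have any distance constraint?"),
    ("video_format", "How about video format? 2d? Imax 3d?"),
    ("price", "Do you have any constraint to price?"),
]


def rule_based_questions(known_slots):
    """Ask about every unfilled slot in script order, whether or not it matters."""
    return [question for slot, question in SLOT_QUESTIONS if slot not in known_slots]
```

Because the script has no notion of which slots the user is likely to care about, it asks all eight questions after the opening request, including the four that the user answers with "I do not care" or similar.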
REINFORCEMENT LEARNING SYSTEM
User: I want two tickets for movie ‘Deadpool’.
Agent: What is the starttime you prefer?
User: 9:00 PM
Agent: Which day?
User: Tomorrow
Agent: Which city?
User: Seattle
Agent: Which theater do you prefer?
User: AMC Pacific Place 11 theater.
Agent: Okay, I will issue 2 tickets for you, tomorrow 9:00 pm at AMC pacific place 11 theater, Seattle, movie ‘Deadpool’.
User: Okay, thanks.
Agent: Thanks!
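Annotation on the rule-based dialogue: the questions about zip code, distance, video format, and price are wasteful! (Very likely the user does not care.)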
Extrinsically / Intrinsically motivated RL
• Extrinsic motivation – do something for a specific reward
• RL is driven by task-specific reward
• Learns task-specific skills; does not cope flexibly with new problems
• Intrinsic motivation – do something inherently enjoyable
• RL is driven by curiosity w/o explicit reward
• Develops broad competence, which makes learning task-specific skills easier
[Figure labels: Task Completion Bots, InfoBots, Social Bots]
References on deep-learning dialogue systems (Generation-III technology)
• Li Deng & Yang Liu (eds.). 2017. Deep Learning in Natural Language Processing. Springer, Aug 2017 (scheduled).
• Pararth Shah, Dilek Hakkani-Tür, Larry Heck. 2017. Interactive reinforcement learning for task-oriented dialogue management. arXiv.
• Dilek Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, Yun-Nung Chen, Jianfeng Gao, Li Deng, and Ye-Yi Wang. 2016. Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM. INTERSPEECH.
• Antoine Bordes and Jason Weston. 2016. Learning end-to-end goal-oriented dialog. arXiv.
• Mehdi Fatemi, Layla El Asri, Hannes Schulz, Jing He, Kaheer Suleman. 2016. Policy Networks with Two-Stage Training for Dialogue Systems. SIGDIAL.
• Layla El Asri, Jing He, Kaheer Suleman. 2016. A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems. INTERSPEECH.
• Yun-Nung Chen, Dilek Hakkani-Tur, Gokhan Tur, Jianfeng Gao, and Li Deng. 2016. End-to-end memory networks with knowledge carryover for multi-turn spoken language understanding. INTERSPEECH.
• Bhuwan Dhingra, Lihong Li, Xiujun Li, Jianfeng Gao, Yun-Nung Chen, Faisal Ahmed, Li Deng. 2016. End-to-end reinforcement learning of dialogue agents for information access. To be submitted to ACL.
• Xuesong Yang, Yun-Nung Chen, Dilek Hakkani-Tur, Paul Crook, Xiujun Li, Jianfeng Gao, Li Deng. 2016. End-to-end joint learning of natural language understanding and dialogue manager. arXiv.
• Zachary C. Lipton, Jianfeng Gao, Lihong Li, Xiujun Li, Faisal Ahmed, Li Deng. 2016. Efficient Exploration for Dialogue Policy Learning with BBQ Networks & Replay Buffer Spiking. arXiv.
• Jason D. Williams and Geoffrey Zweig. 2016. End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning. arXiv.
• Tiancheng Zhao and Maxine Eskenazi. 2016. Towards end-to-end learning for dialog state tracking and management using deep reinforcement learning. arXiv preprint.
• Pei-Hao Su, Milica Gasic, Nikola Mrksic, Lina Rojas-Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen and Steve Young. 2016. On-line active reward learning for policy optimisation in spoken dialogue systems. ACL.
• Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Pei-Hao Su, David Vandyke, and Steve Young. 2015. Semantically conditioned LSTM-based natural language generation for spoken dialogue systems. EMNLP.
• Gregoire Mesnil, Yann Dauphin, Kaisheng Yao, Yoshua Bengio, Li Deng, Dilek Hakkani-Tur, Xiaodong He, Larry Heck, Gokhan Tur, Dong Yu, and Geoffrey Zweig. 2015. Using recurrent neural networks for slot filling in spoken language understanding. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 3, pp. 530–539.


 integrated end-to-end design






“This joint paper (2012) from the major speech
recognition laboratories details the first major
industrial application of deep learning.”
Achieving Human Parity in Conversational Speech Recognition
• (CNN + LSTM)-HMM hybrid
• Attentional layer-wise context expansion (LACE)
• Spatial smoothing
• Letter trigrams
• Lowest ASR error rate on SWBD: 5.9% (human SR: 5.9%)
5 areas of potential ASR breakthrough
1. better modeling for end-to-end and other specialized architectures capable of disentangling
mixed acoustic variability factors (e.g. sequential GAN)
2. better integrated signal processing and neural learning to combat difficult far-field acoustic
environments especially with mixed speakers
3. use of neural language understanding to model long-span dependency for semantic
and syntactic consistency in speech recognition outputs, use of semantic understanding in
spoken dialogue systems to provide feedback that makes acoustic speech recognition easier
4. use of naturally available multimodal “labels” such as images, printed text, and handwriting
to supplement the current way of providing text labels to synchronize with the
corresponding acoustic utterances (NIPS Multimodality Workshop)
5. development of ground-breaking deep unsupervised learning methods for exploitation of
potentially unlimited amounts of naturally found acoustic data of speech without the
otherwise prohibitively high cost of labeling based on the current deep supervised learning
paradigm
Here are the key points about deep learning for speech recognition from this paper:
- This 2012 paper from major labs like Microsoft, IBM, Google was one of the first to apply deep learning successfully to large vocabulary speech recognition, achieving a significant reduction in error rates.
- It proposed a hybrid system that combined a deep neural network (DNN or CNN+LSTM) with a traditional Hidden Markov Model (HMM) for acoustic modeling. The DNN provided a representation of the speech signal that was more discriminative than traditional features.
- Training a DNN required a huge amount of labeled speech data, on the order of thousands of hours. This was one of the first works to leverage such large datasets.
- The hybrid system design
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 3: Any-Angl...Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 3: Any-Angl...
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 3: Any-Angl...AI Frontiers
 
Percy Liang at AI Frontiers : Pushing the Limits of Machine Learning
Percy Liang at AI Frontiers : Pushing the Limits of Machine LearningPercy Liang at AI Frontiers : Pushing the Limits of Machine Learning
Percy Liang at AI Frontiers : Pushing the Limits of Machine LearningAI Frontiers
 
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI mission
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI missionIlya Sutskever at AI Frontiers : Progress towards the OpenAI mission
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI missionAI Frontiers
 
Mark Moore at AI Frontiers : Uber Elevate
Mark Moore at AI Frontiers : Uber ElevateMark Moore at AI Frontiers : Uber Elevate
Mark Moore at AI Frontiers : Uber ElevateAI Frontiers
 
Mario Munich at AI Frontiers : Consumer robotics: embedding affordable AI in ...
Mario Munich at AI Frontiers : Consumer robotics: embedding affordable AI in ...Mario Munich at AI Frontiers : Consumer robotics: embedding affordable AI in ...
Mario Munich at AI Frontiers : Consumer robotics: embedding affordable AI in ...AI Frontiers
 
Arnaud Thiercelin at AI Frontiers : AI in the Sky
Arnaud Thiercelin at AI Frontiers : AI in the SkyArnaud Thiercelin at AI Frontiers : AI in the Sky
Arnaud Thiercelin at AI Frontiers : AI in the SkyAI Frontiers
 
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...AI Frontiers
 
Wei Xu at AI Frontiers : Language Learning in an Interactive and Embodied Set...
Wei Xu at AI Frontiers : Language Learning in an Interactive and Embodied Set...Wei Xu at AI Frontiers : Language Learning in an Interactive and Embodied Set...
Wei Xu at AI Frontiers : Language Learning in an Interactive and Embodied Set...AI Frontiers
 
Sumit Gupta at AI Frontiers : AI for Enterprise
Sumit Gupta at AI Frontiers : AI for EnterpriseSumit Gupta at AI Frontiers : AI for Enterprise
Sumit Gupta at AI Frontiers : AI for EnterpriseAI Frontiers
 
Yuandong Tian at AI Frontiers : Planning in Reinforcement Learning
Yuandong Tian at AI Frontiers : Planning in Reinforcement LearningYuandong Tian at AI Frontiers : Planning in Reinforcement Learning
Yuandong Tian at AI Frontiers : Planning in Reinforcement LearningAI Frontiers
 
Alex Ermolaev at AI Frontiers : Major Applications of AI in Healthcare
Alex Ermolaev at AI Frontiers : Major Applications of AI in HealthcareAlex Ermolaev at AI Frontiers : Major Applications of AI in Healthcare
Alex Ermolaev at AI Frontiers : Major Applications of AI in HealthcareAI Frontiers
 
Long Lin at AI Frontiers : AI in Gaming
Long Lin at AI Frontiers : AI in GamingLong Lin at AI Frontiers : AI in Gaming
Long Lin at AI Frontiers : AI in GamingAI Frontiers
 
Melissa Goldman at AI Frontiers : AI & Finance
Melissa Goldman at AI Frontiers : AI & FinanceMelissa Goldman at AI Frontiers : AI & Finance
Melissa Goldman at AI Frontiers : AI & FinanceAI Frontiers
 
Li Deng at AI Frontiers : From Modeling Speech/Language to Modeling Financial...
Li Deng at AI Frontiers : From Modeling Speech/Language to Modeling Financial...Li Deng at AI Frontiers : From Modeling Speech/Language to Modeling Financial...
Li Deng at AI Frontiers : From Modeling Speech/Language to Modeling Financial...AI Frontiers
 
Ashok Srivastava at AI Frontiers : Using AI to Solve Complex Economic Problems
Ashok Srivastava at AI Frontiers : Using AI to Solve Complex Economic ProblemsAshok Srivastava at AI Frontiers : Using AI to Solve Complex Economic Problems
Ashok Srivastava at AI Frontiers : Using AI to Solve Complex Economic ProblemsAI Frontiers
 

Mais de AI Frontiers (20)

Divya Jain at AI Frontiers : Video Summarization
Divya Jain at AI Frontiers : Video SummarizationDivya Jain at AI Frontiers : Video Summarization
Divya Jain at AI Frontiers : Video Summarization
 
Training at AI Frontiers 2018 - LaiOffer Data Session: How Spark Speedup AI
Training at AI Frontiers 2018 - LaiOffer Data Session: How Spark Speedup AI Training at AI Frontiers 2018 - LaiOffer Data Session: How Spark Speedup AI
Training at AI Frontiers 2018 - LaiOffer Data Session: How Spark Speedup AI
 
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 1: Heuristi...
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 1: Heuristi...Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 1: Heuristi...
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 1: Heuristi...
 
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-lecture 2: Incremen...
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-lecture 2: Incremen...Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-lecture 2: Incremen...
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-lecture 2: Incremen...
 
Training at AI Frontiers 2018 - Udacity: Enhancing NLP with Deep Neural Networks
Training at AI Frontiers 2018 - Udacity: Enhancing NLP with Deep Neural NetworksTraining at AI Frontiers 2018 - Udacity: Enhancing NLP with Deep Neural Networks
Training at AI Frontiers 2018 - Udacity: Enhancing NLP with Deep Neural Networks
 
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 3: Any-Angl...
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 3: Any-Angl...Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 3: Any-Angl...
Training at AI Frontiers 2018 - LaiOffer Self-Driving-Car-Lecture 3: Any-Angl...
 
Percy Liang at AI Frontiers : Pushing the Limits of Machine Learning
Percy Liang at AI Frontiers : Pushing the Limits of Machine LearningPercy Liang at AI Frontiers : Pushing the Limits of Machine Learning
Percy Liang at AI Frontiers : Pushing the Limits of Machine Learning
 
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI mission
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI missionIlya Sutskever at AI Frontiers : Progress towards the OpenAI mission
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI mission
 
Mark Moore at AI Frontiers : Uber Elevate
Mark Moore at AI Frontiers : Uber ElevateMark Moore at AI Frontiers : Uber Elevate
Mark Moore at AI Frontiers : Uber Elevate
 
Mario Munich at AI Frontiers : Consumer robotics: embedding affordable AI in ...
Mario Munich at AI Frontiers : Consumer robotics: embedding affordable AI in ...Mario Munich at AI Frontiers : Consumer robotics: embedding affordable AI in ...
Mario Munich at AI Frontiers : Consumer robotics: embedding affordable AI in ...
 
Arnaud Thiercelin at AI Frontiers : AI in the Sky
Arnaud Thiercelin at AI Frontiers : AI in the SkyArnaud Thiercelin at AI Frontiers : AI in the Sky
Arnaud Thiercelin at AI Frontiers : AI in the Sky
 
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...
 
Wei Xu at AI Frontiers : Language Learning in an Interactive and Embodied Set...
Wei Xu at AI Frontiers : Language Learning in an Interactive and Embodied Set...Wei Xu at AI Frontiers : Language Learning in an Interactive and Embodied Set...
Wei Xu at AI Frontiers : Language Learning in an Interactive and Embodied Set...
 
Sumit Gupta at AI Frontiers : AI for Enterprise
Sumit Gupta at AI Frontiers : AI for EnterpriseSumit Gupta at AI Frontiers : AI for Enterprise
Sumit Gupta at AI Frontiers : AI for Enterprise
 
Yuandong Tian at AI Frontiers : Planning in Reinforcement Learning
Yuandong Tian at AI Frontiers : Planning in Reinforcement LearningYuandong Tian at AI Frontiers : Planning in Reinforcement Learning
Yuandong Tian at AI Frontiers : Planning in Reinforcement Learning
 
Alex Ermolaev at AI Frontiers : Major Applications of AI in Healthcare
Alex Ermolaev at AI Frontiers : Major Applications of AI in HealthcareAlex Ermolaev at AI Frontiers : Major Applications of AI in Healthcare
Alex Ermolaev at AI Frontiers : Major Applications of AI in Healthcare
 
Long Lin at AI Frontiers : AI in Gaming
Long Lin at AI Frontiers : AI in GamingLong Lin at AI Frontiers : AI in Gaming
Long Lin at AI Frontiers : AI in Gaming
 
Melissa Goldman at AI Frontiers : AI & Finance
Melissa Goldman at AI Frontiers : AI & FinanceMelissa Goldman at AI Frontiers : AI & Finance
Melissa Goldman at AI Frontiers : AI & Finance
 
Li Deng at AI Frontiers : From Modeling Speech/Language to Modeling Financial...
Li Deng at AI Frontiers : From Modeling Speech/Language to Modeling Financial...Li Deng at AI Frontiers : From Modeling Speech/Language to Modeling Financial...
Li Deng at AI Frontiers : From Modeling Speech/Language to Modeling Financial...
 
Ashok Srivastava at AI Frontiers : Using AI to Solve Complex Economic Problems
Ashok Srivastava at AI Frontiers : Using AI to Solve Complex Economic ProblemsAshok Srivastava at AI Frontiers : Using AI to Solve Complex Economic Problems
Ashok Srivastava at AI Frontiers : Using AI to Solve Complex Economic Problems
 

Último

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Último (20)

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Here are the key points about deep learning for speech recognition from this paper:
- This 2012 paper from major labs like Microsoft, IBM, Google was one of the first to apply deep learning successfully to large vocabulary speech recognition, achieving a significant reduction in error rates.
- It proposed a hybrid system that combined a deep neural network (DNN or CNN+LSTM) with a traditional Hidden Markov Model (HMM) for acoustic modeling. The DNN provided a representation of the speech signal that was more discriminative than traditional features.
- Training a DNN required a huge amount of labeled speech data, on the order of thousands of hours. This was one of the first works to leverage such large datasets.
- The hybrid system design

  • 1.
  • 2.
  • 3.
  • 4.
  • 5. What is wrong with apps and web models?
      Conversation as an emerging paradigm for mobile UI
      Bots as intelligent conversational interface agents
      Major types of conversational bots:
      • Social ChatBots (e.g. XiaoIce)
      • InfoBots
      • Task-Completion Bots (goal-oriented)
      • Personal Assistant Bots (above + recommendation)
      http://venturebeat.com/2016/08/01/how-deep-reinforcement-learning-can-help-chatbots/
      Bots Technology Overview: three generations; latest deep RL
  • 6. Generation I: Symbolic Rule/Template Based
      • Centered on grammatical rule & ontological design by human experts (early AI approach)
      • Easy interpretation, debugging, and system update
      • Popular before late 90’s
      • Still in use in commercial systems and by bots startups
      • Limitations:
        • reliance on experts
        • hard to scale over domains
        • data used only to help design rules, not for learning
      • Example system next
  • 7.
  • 8. Generation II: Data Driven, (shallow) Learning
      • Data used not to design rules for NLU and action, but to learn statistical parameters in dialogue systems
      • Reduce cost of hand-crafting complex dialogue manager
      • Robustness against speech recognition errors in noisy environment
      • MDP/POMDP & reinforcement learning for dialogue policy
      • Discriminative (CRF) & generative (HMM) methods for NLU
      • Popular in academic research until 2014 (before deep learning arrived at the dialogue world); in parallel with Generation I (BBN, AT&T, CMU, SRI, CU …)
      • Limitations:
        • Not easy to interpret, debug, and update systems
        • Still hard to scale over domains
        • Models & representations not powerful enough; no end-to-end, hard to scale up
        • Remained academic until deep learning arrived
      • Example system next
  • 9.
  • 10. Components of a state-based spoken dialogue system
  • 11.
  • 12. Generation III: Data-Driven Deep Learning
      • Like Generation-II, data used to learn everything in dialogue systems
      • Reduce cost of hand-crafting complex dialogue manager
      • Robustness against speech recognition errors in noisy environment & against NLU errors
      • MDP/POMDP & reinforcement learning for dialogue policy (same)
      • But, neural models & representations are much more powerful
      • End-to-End learning becomes feasible
      • Attracted huge research efforts since 2015 (after deep learning’s success in vision/speech and deep RL’s success in Atari games)
      • Limitations:
        • Still not easy to interpret, debug, and update systems
        • Lack of interface between continuous neural learning and symbolic NL structure to human users
        • Active research in scaling over domains via deep transfer learning & RL
        • No clear commercial success reported yet
      • Deep RL & example research next
  • 13. What is reinforcement learning (RL)?
      • RL in Generation-II ---> not working! (with unnecessarily complex POMDP)
      • RL in Generation-III ---> working! (due to deep learning, like NN vs DNN in ASR)
      • RL is learning what to do so as to maximize a numerical reward signal
        • “What to do” means mapping from a situation in a given environment to an action
        • Takes inspiration from biology / psychology
      • RL is a characterization of a problem class
        • Doesn’t refer to a specific solution
        • There are many methods for solving RL problems
      • In its most general form, RL problems:
        • Have a stateful environment, where actions can change the state of the environment
        • Learn by trial and error, not by being shown examples of the right action
        • Have delayed rewards, where an action’s value may not be clear until some time after it is taken
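  • 14. Stateful Model for RL
      [Diagram: Agent, Environment, and State estimator, connected by the signals $s_t$, $a_t$, $o_t$, $r_t$]
      • State estimator: $s_t = \mathrm{Summary}(o_0, a_0, o_1, a_1, \cdots, o_{t-1}, a_{t-1}, o_t)$
      • Trajectory: $a_0, r_1, s_1, a_1, r_2, s_2, a_2, \cdots$
      • Return: $\sum_{\tau=t+1}^{\infty} \gamma^{\tau-1} r_\tau$, with $1 \ge \gamma \ge 0$
      • Policy: $\pi(s_t) \to a_t$
      • Objective: $\pi^* = \arg\max_{\pi} E\left[\sum_{\tau=t+1}^{\infty} \gamma^{\tau-1} r_\tau \mid \pi\right], \ \forall s_t$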
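The return defined on this slide can be made concrete with a small sketch. It evaluates the sum for the case t = 0, so the first reward r_1 carries weight γ^0. The function name `discounted_return` and the example reward values (a per-turn penalty and a final success bonus) are illustrative assumptions, not taken from the slides.

```python
def discounted_return(rewards, gamma):
    """Discounted return for t = 0: sum over tau >= 1 of gamma**(tau - 1) * r_tau.

    rewards[0] is r_1, rewards[1] is r_2, and so on (a finite episode).
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Work backwards through the episode: G = r + gamma * G_next.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical task-completion episode: -1 per turn, +20 on success.
episode_rewards = [-1.0, -1.0, -1.0, 20.0]
print(discounted_return(episode_rewards, gamma=0.9))  # about 11.87
```

With γ < 1, the same success bonus is worth less the later it arrives, which is one way a reward design can favour shorter dialogues.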
  • 15. [Diagram: User input ($o$) → Language understanding → state $s$ → Dialogue Policy $a = \pi(s)$ → action $a$ → Language (response) generation → Response; the system collects rewards $(s, a, r, s')$ and optimizes $Q(s, a)$]

      | Type of Bots | State | Action | Reward |
      |---|---|---|---|
      | Social ChatBots | Chat history | System response | # of turns maximized; intrinsically motivated reward |
      | InfoBots (interactive Q/A) | User current question + context/history | Answers to current question by system | Relevance of answer; # of turns minimized |
      | Task-Oriented Bots | User current input + context/history | DialogAct w. SlotValue in current turn | Task success rate; # of turns minimized |
  • 16. Q-Learning [Sutton & Barto 98]
      • Assume $Q(s, a)$ for all $s, a$ can be represented in a table
        1. Initialize an array $Q(s, a)$ arbitrarily
        2. Choose actions based on $Q$ such that all actions are taken in all states (infinitely often in the limit)
        3. On each time step, update one element of the array: $\Delta Q(s_t, a_t) = \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$
      • Model-free learning:
        • Learning long-term optimal behavior without model of the environment
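The three steps of tabular Q-learning can be sketched as follows, using the standard temporal-difference form of the update, ΔQ(s, a) = α[r + γ max Q(s', a') − Q(s, a)]. Everything concrete here is an assumption for illustration: the names `q_learning` and `chain_step`, the five-state chain environment, the epsilon-greedy exploration rule, and the hyperparameter values.

```python
import random
from collections import defaultdict


def q_learning(step, actions, start_state, episodes=500, alpha=0.1,
               gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(float)  # step 1: Q(s, a) initialised to 0 everywhere

    def greedy(state):
        # Break ties at random so early episodes still explore.
        best = max(q[(state, b)] for b in actions)
        return rng.choice([b for b in actions if q[(state, b)] == best])

    for _ in range(episodes):
        s = start_state
        for _ in range(max_steps):
            # Step 2: keep trying every action in every state.
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            s_next, r, done = step(s, a)
            # Step 3: move Q(s, a) towards r + gamma * max_a' Q(s', a').
            best_next = 0.0 if done else max(q[(s_next, b)] for b in actions)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s_next
            if done:
                break
    return q


def chain_step(state, action):
    """Toy chain 0..4: reaching state 4 ends the episode with reward +1."""
    next_state = max(0, min(4, state + (1 if action == "right" else -1)))
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


ACTIONS = ["left", "right"]
q = q_learning(chain_step, ACTIONS, start_state=0)
policy = {s: max(ACTIONS, key=lambda b: q[(s, b)]) for s in range(4)}
print(policy)
```

The learner never sees the transition rule inside `chain_step`; it only observes sampled `(s, a, r, s')` tuples, which is what "model-free" refers to on the slide.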
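  • 17. Function Approximation
      • In many tasks, the space of $(s, a)$ pairs is too large for a tabular representation
      • Estimate the action-value function approximately as $Q(s, a; \theta) \approx Q^*(s, a)$
        • $\theta$: parameters of a linear function (baseline)
        • $\theta$: parameters of a DNN, aka Deep Q-Network (DQN)
      • Optimize $\theta$ using SGD w.r.t. the loss $L(\theta) = E\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta) - Q(s, a; \theta)\right)^2\right]$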
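For the linear baseline, one SGD step can be sketched as below. It assumes the usual squared temporal-difference loss, with the bootstrapped target treated as a constant (a semi-gradient step). The names `linear_q` and `sgd_step`, the feature vectors, and the learning rate are hypothetical choices for illustration.

```python
def linear_q(theta, features):
    """Q(s, a; theta) = theta . phi(s, a) for a feature vector phi(s, a)."""
    return sum(w * x for w, x in zip(theta, features))


def sgd_step(theta, phi_sa, reward, next_phis, gamma=0.9, lr=0.1, done=False):
    """One semi-gradient step on 0.5 * (y - Q(s, a; theta))**2.

    next_phis holds phi(s', a') for every action a' available in s'.
    """
    target = reward
    if not done:
        target += gamma * max(linear_q(theta, p) for p in next_phis)
    td_error = target - linear_q(theta, phi_sa)
    # The gradient of Q w.r.t. theta is phi(s, a); the target is held fixed.
    return [w + lr * td_error * x for w, x in zip(theta, phi_sa)]


# A terminal transition with reward 1 pulls the active weight towards 1.
theta = sgd_step([0.0, 0.0], phi_sa=[1.0, 0.0], reward=1.0,
                 next_phis=[], done=True)
print(theta)  # [0.1, 0.0]
```

A DQN replaces `linear_q` with a neural network and lets an autodiff library compute the gradient, while the target and the squared-error loss keep the same shape.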
  • 18. Q-Learning for DQN [DeepMind 15]
      • Learning becomes unstable
        • Correlations present in the sequence of states
        • Small updates to $Q$ lead to significant changes of policy and data distribution
        • Correlations between the to-be-learned $Q$ and the target value $r + \gamma \max_{a'} Q(s', a')$
      • Solution
        • Experience replay: randomize training samples $(s, a, r, s')$
        • Use a separate $Q$ function to compute targets $y$
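The two stabilising ideas on this slide can be sketched without any deep learning library. The `ReplayBuffer` class, the `td_targets` helper, the stored `done` flag, and the dictionary standing in for a frozen target network are all illustrative assumptions; a real DQN would use neural networks for both the online and the target $Q$ functions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (s, a, r, s_next, done) transitions."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out
        self.rng = random.Random(seed)

    def add(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between samples.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_targets(batch, target_q, actions, gamma=0.99):
    """y = r + gamma * max_a' Q_target(s', a'), or y = r at episode end."""
    targets = []
    for _s, _a, r, s_next, done in batch:
        if done:
            targets.append(r)
        else:
            targets.append(r + gamma * max(target_q(s_next, b) for b in actions))
    return targets


buffer = ReplayBuffer(capacity=1000)
for t in range(5):
    buffer.add(t, "right", 0.0, t + 1, False)
buffer.add(5, "right", 1.0, 6, True)

# Frozen copy of the online Q-values, refreshed only every so often.
frozen = {}
batch = buffer.sample(4)
print(td_targets(batch, lambda s, a: frozen.get((s, a), 0.0), ["left", "right"]))
```

Because `frozen` changes only when it is explicitly refreshed from the online estimate, the targets $y$ stay fixed between refreshes instead of shifting with every gradient step.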
  • 19. User goal: Two tickets for “Deadpool” tomorrow 9PM at AMC Pacific Place 11 theater, Seattle.

      RULE BASED SYSTEM
      User: Do you have any ticket for movie ‘Deadpool’ at 9:00 pm?
      Agent: Which theater do you want?
      User: AMC pacific place 11 theater, please!
      Agent: How many tickets do you want?
      User: Two!
      Agent: What date?
      User: Tomorrow!
      Agent: Which city do you want?
      User: Seattle!
      Agent: Do you have any specification for zip code?
      User: No, I do not care.
      Agent: Do you have any distance constraint?
      User: No, I do not care.
      Agent: How about video format? 2d? Imax 3d?
      User: Any kind is okay.
      Agent: Do you have any constraint to price?
      User: No!
      Agent: Okay, I will issue 2 tickets for you, tomorrow 9:00 pm at AMC pacific place 11 theater, Seattle, movie ‘Deadpool’.
      User: Okay, thanks.
      Agent: Thanks!

      REINFORCEMENT LEARNING SYSTEM
      User: I want two tickets for movie ‘Deadpool’.
      Agent: What is the starttime you prefer?
      User: 9:00 PM
      Agent: Which day?
      User: Tomorrow
      Agent: Which city?
      User: Seattle
      Agent: Which theater do you prefer?
      User: AMC Pacific Place 11 theater.
      Agent: Okay, I will issue 2 tickets for you, tomorrow 9:00 pm at AMC pacific place 11 theater, Seattle, movie ‘Deadpool’.
      User: Okay, thanks.
      Agent: Thanks!

      Annotation (on the rule-based system's extra questions): Wasteful! (Very likely the user does not care.)
  • 20. Extrinsically / Intrinsically motivated RL
      • Extrinsic motivation: do something for a specific reward
        • RL is driven by task-specific reward
        • Learns task-specific skills; does not cope flexibly with new problems
      • Intrinsic motivation: do something inherently enjoyable
        • RL is driven by curiosity w/o explicit reward
        • Develops broad competence, which makes learning task-specific skills easier
      [Figure labels: Task Completion Bots, InfoBots, Social Bots]
  • 21. References on deep-learning dialogue systems (Generation-III technology)
      • Li Deng & Yang Liu (Edited Book). 2017. Deep Learning in Natural Language Processing. Springer, Aug 2017 (scheduled).
      • Pararth Shah, Dilek Hakkani-Tür, Larry Heck. 2017. Interactive reinforcement learning for task-oriented dialogue management. arXiv.
      • Dilek Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, Yun-Nung Chen, Jianfeng Gao, Li Deng, and Ye-Yi Wang. 2016. Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM. INTERSPEECH.
      • Antoine Bordes and Jason Weston. 2016. Learning end-to-end goal-oriented dialog. arXiv.
      • Mehdi Fatemi, Layla El Asri, Hannes Schulz, Jing He, Kaheer Suleman. 2016. Policy Networks with Two-Stage Training for Dialogue Systems. SIGDIAL.
      • Layla El Asri, Jing He, Kaheer Suleman. 2016. A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems. INTERSPEECH.
      • Yun-Nung Chen, Dilek Hakkani-Tur, Gokhan Tur, Jianfeng Gao, and Li Deng. 2016. End-to-end memory networks with knowledge carryover for multi-turn spoken language understanding. INTERSPEECH.
      • Bhuwan Dhingra, Lihong Li, Xiujun Li, Jianfeng Gao, Yun-Nung Chen, Faisal Ahmed, Li Deng. 2016. End-to-end reinforcement learning of dialogue agents for information access. To submit to ACL.
      • Xuesong Yang, Yun-Nung Chen, Dilek Hakkani-Tur, Paul Crook, Xiujun Li, Jianfeng Gao, Li Deng. 2016. End-to-end joint learning of natural language understanding and dialogue manager. arXiv.
      • Zachary C. Lipton, Jianfeng Gao, Lihong Li, Xiujun Li, Faisal Ahmed, Li Deng. 2016. Efficient Exploration for Dialogue Policy Learning with BBQ Networks & Replay Buffer Spiking. arXiv.
      • Jason D Williams and Geoffrey Zweig. 2016. End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning. arXiv.
      • Tiancheng Zhao and Maxine Eskenazi. 2016. Towards end-to-end learning for dialog state tracking and management using deep reinforcement learning. arXiv preprint.
      • Pei-Hao Su, Milica Gasic, Nikola Mrksic, Lina Rojas-Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen and Steve Young. 2016. On-line active reward learning for policy optimisation in spoken dialogue systems. ACL.
      • Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Pei-Hao Su, David Vandyke, and Steve Young. 2015. Semantically conditioned LSTM-based natural language generation for spoken dialogue systems. EMNLP.
      • Gregoire Mesnil, Yann Dauphin, Kaisheng Yao, Yoshua Bengio, Li Deng, Dilek Hakkani-Tur, Xiaodong He, Larry Heck, Gokhan Tur, Dong Yu, and Geoffrey Zweig. 2015. Using recurrent neural networks for slot filling in spoken language understanding. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 3, pp. 530–539.
  • 22. integrated end-to-end design
  • 23.
  • 24. “This joint paper (2012) from the major speech recognition laboratories details the first major industrial application of deep learning.”
  • 25. Achieving Human Parity in Conversational Speech Recognition
      • (CNN + LSTM)/HMM hybrid
      • attentional layer-wise context expansion (LACE)
      • spatial smoothing
      • letter trigrams
      • Lowest ASR error rate on SWBD: 5.9% (human SR: 5.9%)
  • 26. 5 areas of potential ASR breakthrough
      1. better modeling for end-to-end and other specialized architectures capable of disentangling mixed acoustic variability factors (e.g. sequential GAN)
      2. better integrated signal processing and neural learning to combat difficult far-field acoustic environments, especially with mixed speakers
      3. use of neural language understanding to model long-span dependency for semantic and syntactic consistency in speech recognition outputs; use of semantic understanding in spoken dialogue systems to provide feedback that makes acoustic speech recognition easier
      4. use of naturally available multimodal “labels” such as images, printed text, and handwriting to supplement the current way of providing text labels to synchronize with the corresponding acoustic utterances (NIPS Multimodality Workshop)
      5. development of ground-breaking deep unsupervised learning methods for exploitation of potentially unlimited amounts of naturally found acoustic data of speech without the otherwise prohibitively high cost of labeling based on the current deep supervised learning paradigm