Deep learning
For Natural Language Processing
Presented By: Waziri Shebogholo
University of Dodoma
shebogholo@gmail.com
Overview of the talk
 Introduction to NLP
 Applications of NLP
 Word representations
 Language Model
 RNN model and its variants
 Sentiment analysis (practical)
 Conclusion
What is Natural Language Processing?
Let’s define NLP as:-
The field of study that aims to make computers
understand human language and perform
useful tasks, such as making appointments.
It sits at the intersection of computer science, AI and linguistics.
NLP is difficult, but why?
 Complexity in representing and learning
 Human languages are ambiguous
Why deep learning for NLP?
NLP based on human-designed features is:-
1. Too specific
2. Dependent on domain-specific knowledge
Deep learning instead learns its features from the data.
NLP applications
 Sentiment analysis (today)
 Information extraction
 Dialog agents / chatbots
 Language modelling
 Machine Translation
 Speech recognition
Just to mention a few examples of NLP capabilities
Word Representation
A common way to represent words is with vectors.
That is, vectors encode the meaning of words in NLP.
Approaches to this:-
1. Discrete representation
2. Distributed representation
Discrete representation (one-hot representation)
 Words are regarded as atomic symbols
 Each word is represented using a vector of size |V|
 ‘1’ at the word’s own position and ‘0’ at all others
Example
Corpus: “I love deep learning”, “I love NLP”, “Machine learning is funny”
V = {“I”, “love”, “deep”, “learning”, “NLP”, “Machine”, “is”, “funny”}, so |V| = 8
One-hot representation of “love” (using the above vocabulary)
 (0,1,0,0,0,0,0,0)
Problems with one-hot representation
 Similar words do not get similar vectors: every pair of
one-hot vectors is equally far apart
 High computational cost, since the vector size grows with
the vocabulary (curse of dimensionality)
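A small sketch of the first problem, assuming the dot product is used as the similarity measure (an assumption for illustration; the slides do not name one). The helper names and the index positions are hypothetical and follow the vocabulary order of the earlier example.

```python
def one_hot(index, size):
    """Return a one-hot list of the given size with a 1 at `index`."""
    vector = [0] * size
    vector[index] = 1
    return vector


def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


size = 8                      # |V| from the example vocabulary
deep = one_hot(2, size)       # assumed index of "deep"
machine = one_hot(5, size)    # assumed index of "Machine"

print(dot(deep, machine))  # 0: two different words never overlap
print(dot(deep, deep))     # 1: a word only matches itself
```

Whatever two distinct words are chosen, the result is 0, so the representation carries no notion of relatedness.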
Alternative!
Distributed representation
Represent a word by means of its neighbors
“You shall know a word by the company it keeps.”
J.R. Firth, 1957
All words or just a few words?
1. Full-window approach, e.g. Latent Semantic Analysis
2. Local-window approach, e.g. Word2Vec (our focus)
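To make "represent a word by its neighbors" concrete, here is a rough sketch that counts which words appear within a small window of each other in the example corpus. It is not Word2Vec, which learns dense vectors by prediction; it only illustrates the local-window idea. The window size of 1 and the function name `cooccurrence_counts` are assumptions for illustration.

```python
from collections import Counter

corpus = ["I love deep learning", "I love NLP", "Machine learning is funny"]
window = 1  # assumed: look one word to the left and one to the right


def cooccurrence_counts(sentences, window):
    """Count (center, neighbor) pairs that fall within `window` positions."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, center in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(center, tokens[j])] += 1
    return counts


counts = cooccurrence_counts(corpus, window)
print(counts[("love", "I")])         # 2
print(counts[("learning", "deep")])  # 1
```

The row of counts for a word (all pairs with that word as center) is one simple form of neighbor-based representation.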
Word2Vec
There are two flavors of Word2Vec
1. Skip-gram
2. Continuous bag-of-words (CBOW)
About the two models
1. CBOW
Predict the center word given the surrounding words
2. Skip-gram
Predict the surrounding words given the center word.
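The difference between the two models shows up in how training examples are built from a sentence. The sketch below only generates (input, target) pairs; it does not train anything. The function names and the window size of 1 are assumptions for illustration.

```python
def context_indices(position, length, window):
    """Indices within `window` of `position`, excluding `position` itself."""
    lo = max(0, position - window)
    hi = min(length, position + window + 1)
    return [j for j in range(lo, hi) if j != position]


def skipgram_pairs(tokens, window):
    """(center, neighbor) pairs: the center word is the input, a neighbor the target."""
    return [(tokens[i], tokens[j])
            for i in range(len(tokens))
            for j in context_indices(i, len(tokens), window)]


def cbow_pairs(tokens, window):
    """(neighbors, center) pairs: the neighbors are the input, the center the target."""
    return [([tokens[j] for j in context_indices(i, len(tokens), window)], tokens[i])
            for i in range(len(tokens))]


tokens = "I love deep learning".split()
print(skipgram_pairs(tokens, 1))
print(cbow_pairs(tokens, 1))
```

Skip-gram produces one pair per neighbor, while CBOW produces one pair per position with all neighbors grouped as the input.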
Language Model
Compute the probability of the next word
given the previous words.
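Written out, with notation that is assumed here rather than taken from the slides (w_t for the word at position t, T for the sequence length), predicting each next word from the previous ones gives the probability of a whole sequence through the chain rule:

```latex
P(w_1, \dots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_1, \dots, w_{t-1})
```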
Why do we have to care about LMs?
They’re used in many NLP tasks, including
machine translation, text generation and speech
recognition.
Language Models
1. Count-based language models
Use a fixed-size window of the
previous words to estimate the probabilities of
the upcoming words.
2. Neural network models
Can condition each word on all
of the previous words in the sequence. RNNs are
widely used for this task.
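As a rough sketch of the count-based approach, the code below uses a window of one previous word (a bigram model) on the example corpus from the word-representation slides. The window size, the unsmoothed maximum-likelihood estimate and the name `next_word_probability` are assumptions for illustration.

```python
from collections import Counter, defaultdict

corpus = ["I love deep learning", "I love NLP", "Machine learning is funny"]

# bigram_counts[previous][current] = how often `current` follows `previous`.
bigram_counts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for previous, current in zip(tokens, tokens[1:]):
        bigram_counts[previous][current] += 1


def next_word_probability(previous, current):
    """Estimate P(current | previous) from counts; 0.0 if `previous` was never seen."""
    total = sum(bigram_counts[previous].values())
    if total == 0:
        return 0.0
    return bigram_counts[previous][current] / total


print(next_word_probability("I", "love"))     # 1.0
print(next_word_probability("love", "deep"))  # 0.5
```

Because the window is fixed, anything before the previous word is ignored, which is the limitation the neural models address.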
Recurrent Neural Network (RNN)
In deep learning, many problems fall into one
of two groups:-
1. Fixed topological structure problems
e.g. images … image classification
2. Sequential data problems
e.g. text/audio … speech recognition
RNNs are designed for sequential data.
RNN
A standard feed-forward network makes each
prediction independently: it keeps no relation to
the inputs it has already classified.
Scenario:
While reading a book, you need to remember the
context and what has been discussed throughout the
book.
The same holds in sentiment analysis, where the
algorithm needs to remember the context of words
before classifying a document as negative or positive.
Why are RNNs capable of such a task?
1. Hidden states can store a lot of
information and pass it on effectively
2. Hidden states are updated by
a nonlinear function.
Where do we find RNNs?
1. Chatbots
2. Handwriting recognition
3. Video and audio classification
4. Sentiment analysis
5. Time series analysis
Recurrence
A recurrent function is called at each
time step to model temporal data.
Temporal data: each unit depends on the previous
units of data.
$x_t = x_{t-1} + b$
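The recurrence above can be unrolled with a simple loop, where each value is computed from the one before it. This is a toy sketch; the function name `unroll` and the starting value are assumptions for illustration.

```python
def unroll(x0, b, steps):
    """Apply x_t = x_{t-1} + b for `steps` time steps and return every value."""
    values = [x0]
    for _ in range(steps):
        values.append(values[-1] + b)  # the new value depends only on the previous one
    return values


print(unroll(0, 2, 4))  # [0, 2, 4, 6, 8]
```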
We first initialize the hidden state $h_0$.
Then, for each time step $t$:-
$a_t = U x_t + W h_{t-1} + b$
where $a_t$ is the activation at time step $t$.
Then
$h_t = \tanh(a_t)$
Then:-
$o_t = V h_t + c$, where $c$ is a bias
Finally
$y_t = \mathrm{softmax}(o_t)$
Our parameters are:-
the biases b and c, and the weight matrices U, V and W
U for the input-to-hidden connection
W for the hidden-to-hidden connection
V for the hidden-to-output connection
Note: That was an example network
that maps an input sequence to an output sequence of the same
length.
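The forward pass described in the preceding slides can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the talk: the helper names, the toy sizes (2 inputs, 2 hidden units, 2 outputs) and all weight values are made up, and only the standard library is used so that the steps stay visible.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def add(*vectors):
    """Element-wise sum of equal-length vectors."""
    return [sum(values) for values in zip(*vectors)]


def softmax(scores):
    """Turn scores into probabilities; subtract the max for numerical stability."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_forward(inputs, U, W, V, b, c, h0):
    """Run the recurrence over a sequence; return one output per input and the last hidden state."""
    h = h0
    outputs = []
    for x in inputs:
        a = add(matvec(U, x), matvec(W, h), b)  # a_t = U x_t + W h_{t-1} + b
        h = [math.tanh(value) for value in a]   # h_t = tanh(a_t)
        o = add(matvec(V, h), c)                # o_t = V h_t + c
        outputs.append(softmax(o))              # y_t = softmax(o_t)
    return outputs, h


# Arbitrary toy parameters, chosen only so the example runs.
U = [[0.5, -0.3], [0.1, 0.8]]   # input-to-hidden
W = [[0.2, 0.0], [-0.4, 0.6]]   # hidden-to-hidden
V = [[1.0, -1.0], [0.5, 0.5]]   # hidden-to-output
b = [0.0, 0.1]
c = [0.0, 0.0]
h0 = [0.0, 0.0]

inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, h_final = rnn_forward(inputs, U, W, V, b, c, h0)
print(len(outputs))  # one output distribution per input step
for y in outputs:
    print(y)
```

The same U, W and V are reused at every time step, and the output list has the same length as the input list, matching the note above.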
Bi-directional RNN
Neural Machine Translation (NMT)
Sentiment analysis