SlideShare uma empresa Scribd logo
1 de 11
Attention Mechanism in Language
Understanding and their Applications
by Rajarshee Mitra, Research Engineer, Artifacia
(@rajarshee_mitra)
March 25, 2017
AI Meet|
Agenda
1. What is Seq2Seq ?
2. Challenges in vanilla Seq2Seq
3. Attention - introduction
4. Attention (contd.)
5. Attention - microscopic view
6. Visualizing attention
7. More applications of attention
8. Reference.
AI Meet|
What is Seq2Seq ?
● Contains two different RNNs: An encoder and a decoder.
● Encoder encodes the sentence to form a context vector.
● Decodes decodes the vector to generate language.
● Applications: NMT, Summarization, Conversations etc.
AI Meet|
Challenges in vanilla Seq2Seq
● Hard for encoder to compress the whole source sentence into a
single vector.
● Performance deteriorates rapidly as the length of sentence
increases.
● A single context vector for generating every word in decoder does
not produce the best results.
AI Meet|
Attention
● Does not squash the whole source sentence into a vector.
● Considers a subset of the word vectors in the source more than the
others while generating each target word.
● It is an intuitive process and comparable to how we read.
● While inferring something from a piece of text -- like answering
questions, we pay more attention to some words each time in the
text.
● Eventually, the machine learns where to attend more and where,
less.
● Eg. while translating an english sentence to french, the fourth word
in the french sentence can be highly correlated to the first word in
the english sentence.
● Hence, it is not very useful to consider the whole english sentence
everytime for generating each french word.
AI Meet|
Attention (contd.)
● From a high level point of view, the attention model differs
from a traditional Seq2Seq in sense that in vanilla seq2seq,
we simply feed h_t to softmax and not h_t_
● C_t differs for every time step in the decoder.
AI Meet|
Attention - microscopic view
AI Meet|
Visualizing attention
Top: Translation , Below: Voice recognition
AI Meet|
More applications of attention
1. Neural Machine Translation.
2. Text Summarization.
3. Voice recognition.
4. Generate parse trees of sentences.
5. Chatbots.
Attentional interfaces can be used whenever one wants to
interface with a neural network that has a repeating
structure in its output.
AI Meet|
Reference
1. https://nlp.stanford.edu/pubs/emnlp15_attn.pdf
1. http://distill.pub/2016/augmented-rnns
1. https://arxiv.org/pdf/1409.0473.pdf
THANK YOU
Join
meetup.com/Artifacia-AI-Meet/

Mais conteúdo relacionado

Mais procurados

Word embeddings, RNN, GRU and LSTM
Word embeddings, RNN, GRU and LSTMWord embeddings, RNN, GRU and LSTM
Word embeddings, RNN, GRU and LSTMDivya Gera
 
Survey of Attention mechanism & Use in Computer Vision
Survey of Attention mechanism & Use in Computer VisionSurvey of Attention mechanism & Use in Computer Vision
Survey of Attention mechanism & Use in Computer VisionSwatiNarkhede1
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers Arvind Devaraj
 
Text similarity measures
Text similarity measuresText similarity measures
Text similarity measuresankit_ppt
 
Attention is all you need
Attention is all you needAttention is all you need
Attention is all you needHoon Heo
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTSuman Debnath
 
Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and TransformerArvind Devaraj
 
Natural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionNatural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionAritra Mukherjee
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaAlexey Grigorev
 
NLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishNLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishHemantha Kulathilake
 
Introduction to NP Completeness
Introduction to NP CompletenessIntroduction to NP Completeness
Introduction to NP CompletenessGene Moo Lee
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with PythonBenjamin Bengfort
 

Mais procurados (20)

Word embeddings, RNN, GRU and LSTM
Word embeddings, RNN, GRU and LSTMWord embeddings, RNN, GRU and LSTM
Word embeddings, RNN, GRU and LSTM
 
Survey of Attention mechanism & Use in Computer Vision
Survey of Attention mechanism & Use in Computer VisionSurvey of Attention mechanism & Use in Computer Vision
Survey of Attention mechanism & Use in Computer Vision
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
 
np complete
np completenp complete
np complete
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
 
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
 
Text similarity measures
Text similarity measuresText similarity measures
Text similarity measures
 
Attention is all you need
Attention is all you needAttention is all you need
Attention is all you need
 
NLP_KASHK:Smoothing N-gram Models
NLP_KASHK:Smoothing N-gram ModelsNLP_KASHK:Smoothing N-gram Models
NLP_KASHK:Smoothing N-gram Models
 
Rnn and lstm
Rnn and lstmRnn and lstm
Rnn and lstm
 
Recurrent neural network
Recurrent neural networkRecurrent neural network
Recurrent neural network
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERT
 
LSTM Basics
LSTM BasicsLSTM Basics
LSTM Basics
 
Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
 
Natural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionNatural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - Introduction
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
 
What is word2vec?
What is word2vec?What is word2vec?
What is word2vec?
 
NLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishNLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for English
 
Introduction to NP Completeness
Introduction to NP CompletenessIntroduction to NP Completeness
Introduction to NP Completeness
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with Python
 

Semelhante a Attention Mechanism in Language Understanding and its Applications

Speech To Speech Translation
Speech To Speech TranslationSpeech To Speech Translation
Speech To Speech TranslationIRJET Journal
 
Natural language processing
Natural language processingNatural language processing
Natural language processingBirger Moell
 
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...IRJET Journal
 
IRJET- Applications of Artificial Intelligence in Neural Machine Translation
IRJET- Applications of Artificial Intelligence in Neural Machine TranslationIRJET- Applications of Artificial Intelligence in Neural Machine Translation
IRJET- Applications of Artificial Intelligence in Neural Machine TranslationIRJET Journal
 
IRJET - Storytelling App for Children with Hearing Impairment using Natur...
IRJET -  	  Storytelling App for Children with Hearing Impairment using Natur...IRJET -  	  Storytelling App for Children with Hearing Impairment using Natur...
IRJET - Storytelling App for Children with Hearing Impairment using Natur...IRJET Journal
 
Creating a compiler for your own language
Creating a compiler for your own languageCreating a compiler for your own language
Creating a compiler for your own languageAndrea Tino
 
Dot Net Interview Questions with Answers (Part-1) by conceptserve technologies
Dot Net Interview Questions with Answers (Part-1) by conceptserve technologies Dot Net Interview Questions with Answers (Part-1) by conceptserve technologies
Dot Net Interview Questions with Answers (Part-1) by conceptserve technologies ConceptServe Technologies
 
The Ring programming language version 1.2 book - Part 77 of 84
The Ring programming language version 1.2 book - Part 77 of 84The Ring programming language version 1.2 book - Part 77 of 84
The Ring programming language version 1.2 book - Part 77 of 84Mahmoud Samir Fayed
 
Natural Language Interface for Java Programming: Survey
Natural Language Interface for Java Programming: SurveyNatural Language Interface for Java Programming: Survey
Natural Language Interface for Java Programming: Surveyrahulmonikasharma
 
The Ring programming language version 1.9 book - Part 97 of 210
The Ring programming language version 1.9 book - Part 97 of 210The Ring programming language version 1.9 book - Part 97 of 210
The Ring programming language version 1.9 book - Part 97 of 210Mahmoud Samir Fayed
 
Natural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application TrendsNatural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application TrendsShreyas Suresh Rao
 
NLP Techniques for Machine Translation.docx
NLP Techniques for Machine Translation.docxNLP Techniques for Machine Translation.docx
NLP Techniques for Machine Translation.docxKevinSims18
 
IRJET - Text Optimization/Summarizer using Natural Language Processing
IRJET - Text Optimization/Summarizer using Natural Language Processing IRJET - Text Optimization/Summarizer using Natural Language Processing
IRJET - Text Optimization/Summarizer using Natural Language Processing IRJET Journal
 
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISHA NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISHIRJET Journal
 
Let linguistics guide software analysis
Let linguistics guide software analysisLet linguistics guide software analysis
Let linguistics guide software analysisPooja Rani
 
IRJET - Analysis on Code-Mixed Data for Movie Reviews
IRJET - Analysis on Code-Mixed Data for Movie ReviewsIRJET - Analysis on Code-Mixed Data for Movie Reviews
IRJET - Analysis on Code-Mixed Data for Movie ReviewsIRJET Journal
 
IRJET - Speech to Speech Translation using Encoder Decoder Architecture
IRJET -  	  Speech to Speech Translation using Encoder Decoder ArchitectureIRJET -  	  Speech to Speech Translation using Encoder Decoder Architecture
IRJET - Speech to Speech Translation using Encoder Decoder ArchitectureIRJET Journal
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathiaciijournal
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathiaciijournal
 

Semelhante a Attention Mechanism in Language Understanding and its Applications (20)

Speech To Speech Translation
Speech To Speech TranslationSpeech To Speech Translation
Speech To Speech Translation
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
 
IRJET- Applications of Artificial Intelligence in Neural Machine Translation
IRJET- Applications of Artificial Intelligence in Neural Machine TranslationIRJET- Applications of Artificial Intelligence in Neural Machine Translation
IRJET- Applications of Artificial Intelligence in Neural Machine Translation
 
IRJET - Storytelling App for Children with Hearing Impairment using Natur...
IRJET -  	  Storytelling App for Children with Hearing Impairment using Natur...IRJET -  	  Storytelling App for Children with Hearing Impairment using Natur...
IRJET - Storytelling App for Children with Hearing Impairment using Natur...
 
Creating a compiler for your own language
Creating a compiler for your own languageCreating a compiler for your own language
Creating a compiler for your own language
 
Dot Net Interview Questions with Answers (Part-1) by conceptserve technologies
Dot Net Interview Questions with Answers (Part-1) by conceptserve technologies Dot Net Interview Questions with Answers (Part-1) by conceptserve technologies
Dot Net Interview Questions with Answers (Part-1) by conceptserve technologies
 
The Ring programming language version 1.2 book - Part 77 of 84
The Ring programming language version 1.2 book - Part 77 of 84The Ring programming language version 1.2 book - Part 77 of 84
The Ring programming language version 1.2 book - Part 77 of 84
 
Natural Language Interface for Java Programming: Survey
Natural Language Interface for Java Programming: SurveyNatural Language Interface for Java Programming: Survey
Natural Language Interface for Java Programming: Survey
 
The Ring programming language version 1.9 book - Part 97 of 210
The Ring programming language version 1.9 book - Part 97 of 210The Ring programming language version 1.9 book - Part 97 of 210
The Ring programming language version 1.9 book - Part 97 of 210
 
Natural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application TrendsNatural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application Trends
 
NLP Techniques for Machine Translation.docx
NLP Techniques for Machine Translation.docxNLP Techniques for Machine Translation.docx
NLP Techniques for Machine Translation.docx
 
IRJET - Text Optimization/Summarizer using Natural Language Processing
IRJET - Text Optimization/Summarizer using Natural Language Processing IRJET - Text Optimization/Summarizer using Natural Language Processing
IRJET - Text Optimization/Summarizer using Natural Language Processing
 
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISHA NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
 
Let linguistics guide software analysis
Let linguistics guide software analysisLet linguistics guide software analysis
Let linguistics guide software analysis
 
IRJET - Analysis on Code-Mixed Data for Movie Reviews
IRJET - Analysis on Code-Mixed Data for Movie ReviewsIRJET - Analysis on Code-Mixed Data for Movie Reviews
IRJET - Analysis on Code-Mixed Data for Movie Reviews
 
IRJET - Speech to Speech Translation using Encoder Decoder Architecture
IRJET -  	  Speech to Speech Translation using Encoder Decoder ArchitectureIRJET -  	  Speech to Speech Translation using Encoder Decoder Architecture
IRJET - Speech to Speech Translation using Encoder Decoder Architecture
 
[IJET-V2I1P13] Authors:Shilpa More, Gagandeep .S. Dhir , Deepak Daiwadney and...
[IJET-V2I1P13] Authors:Shilpa More, Gagandeep .S. Dhir , Deepak Daiwadney and...[IJET-V2I1P13] Authors:Shilpa More, Gagandeep .S. Dhir , Deepak Daiwadney and...
[IJET-V2I1P13] Authors:Shilpa More, Gagandeep .S. Dhir , Deepak Daiwadney and...
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 

Último

UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 

Último (20)

UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 

Attention Mechanism in Language Understanding and its Applications

  • 1. Attention Mechanism in Language Understanding and their Applications by Rajarshee Mitra, Research Engineer, Artifacia (@rajarshee_mitra) March 25, 2017
  • 2. AI Meet| Agenda 1. What is Seq2Seq ? 2. Challenges in vanilla Seq2Seq 3. Attention - introduction 4. Attention (contd.) 5. Attention - microscopic view 6. Visualizing attention 7. More applications of attention 8. Reference.
  • 3. AI Meet| What is Seq2Seq ? ● Contains two different RNNs: An encoder and a decoder. ● Encoder encodes the sentence to form a context vector. ● Decodes decodes the vector to generate language. ● Applications: NMT, Summarization, Conversations etc.
  • 4. AI Meet| Challenges in vanilla Seq2Seq ● Hard for encoder to compress the whole source sentence into a single vector. ● Performance deteriorates rapidly as the length of sentence increases. ● A single context vector for generating every word in decoder does not produce the best results.
  • 5. AI Meet| Attention ● Does not squash the whole source sentence into a vector. ● Considers a subset of the word vectors in the source more than the others while generating each target word. ● It is an intuitive process and comparable to how we read. ● While inferring something from a piece of text -- like answering questions, we pay more attention to some words each time in the text. ● Eventually, the machine learns where to attend more and where, less. ● Eg. while translating an english sentence to french, the fourth word in the french sentence can be highly correlated to the first word in the english sentence. ● Hence, it is not very useful to consider the whole english sentence everytime for generating each french word.
  • 6. AI Meet| Attention (contd.) ● From a high level point of view, the attention model differs from a traditional Seq2Seq in sense that in vanilla seq2seq, we simply feed h_t to softmax and not h_t_ ● C_t differs for every time step in the decoder.
  • 7. AI Meet| Attention - microscopic view
  • 8. AI Meet| Visualizing attention Top: Translation , Below: Voice recognition
  • 9. AI Meet| More applications of attention 1. Neural Machine Translation. 2. Text Summarization. 3. Voice recognition. 4. Generate parse trees of sentences. 5. Chatbots. Attentional interfaces can be used whenever one wants to interface with a neural network that has a repeating structure in its output.
  • 10. AI Meet| Reference 1. https://nlp.stanford.edu/pubs/emnlp15_attn.pdf 1. http://distill.pub/2016/augmented-rnns 1. https://arxiv.org/pdf/1409.0473.pdf