Acl reading@2016 10-26
sekizawayuuki
Neural Machine Translation of Rare Words with Subword Units
1.
ACL 2016 reading
Neural Machine Translation of Rare Words with Subword Units
authors: Rico Sennrich, Barry Haddow, Alexandra Birch
presentation: Sekizawa Yuuki, Komachi lab M1
16/10/26
2.
Neural Machine Translation of Rare Words with Subword Units
• NMT: fixed vocabulary
• translation: open-vocabulary → NMT has to handle out-of-vocabulary (OOV) words, such as rare and unknown words
• proposed method
  • encode OOV words as sequences of subword units
• result (BLEU, WMT 2015, compared with baseline)
  • Eng-Ger: +1.1, Eng-Rus: +1.3
• main contributions
  • open-vocabulary NMT by encoding words via subword units
  • adapt byte pair encoding to word segmentation
3.
word categories that can be translated transparently
• named entities
  • copy from source to target
  • need transcription (if alphabets or syllabaries differ)
• cognates, loanwords
  • differ at the character level
• morphologically complex words
  • contain multiple morphemes
  • translate the morphemes separately
4.
related work
• Durrani et al. 2014
  • copy unknown words (if the alphabet is shared)
  • transliteration is required (if alphabets differ)
• Mikolov et al. 2012
  • investigate subword language models
  • propose to use syllables (speech recognition)
5.
byte pair encoding (BPE) (Gage, 1994)
• BPE: simple data compression technique
  • iteratively replace the most frequent pair of bytes in a sequence with a single, unused byte
• this paper
  • merge characters or character sequences
  • most frequent pair ('A', 'B') → 'AB'
  • merges don't cross word boundaries (for efficiency)
  • attention model operates on variable-length units
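The merge-learning loop described on this slide can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the helper names (`get_pair_counts`, `merge_pair`, `learn_bpe`) are hypothetical, the end-of-word marker `</w>` and the toy word frequencies are taken from the BPE example slide, and ties between equally frequent pairs are broken by first occurrence, which is an assumption of this sketch. A different tie-break can reorder merges that have equal counts.

```python
from collections import Counter


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` inside each word with one merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Return the list of learned merge operations, most frequent pair first."""
    # Each word is a tuple of characters plus an end-of-word marker, so
    # pairs are only counted inside a word and never across word boundaries.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        # Assumption: on ties, max() keeps the pair that was counted first.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges


merges = learn_bpe({"low": 5, "lowest": 2, "newer": 6, "wider": 3}, num_merges=4)
print(merges)
```

The number of merge operations is the only hyperparameter of this loop; the final symbol vocabulary is roughly the initial character set plus one new symbol per merge.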
6.
BPE example
• learning
  • word:freq : {low:5, lowest:2, newer:6, wider:3}
• merge & count
  1. 'r' '</w>' : 9 → merge 'r</w>'
  2. 'e' 'r</w>' : 9 → merge 'er</w>'
  3. 'l' 'o' : 7 → merge 'lo'
  4. 'lo' 'w' : 7 → merge 'low'
→ OOV : 'lower' is segmented as 'low er</w>'
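How the four merges from this example segment the unseen word 'lower' can be sketched as below. The function name `apply_bpe` is hypothetical and the code is only an illustration of replaying the merges in the order they were learned, not the authors' tool.

```python
def apply_bpe(word, merges):
    """Segment `word` by replaying learned merge operations in order."""
    symbols = list(word) + ["</w>"]  # characters plus end-of-word marker
    for left, right in merges:
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == left and symbols[i + 1] == right:
                out.append(left + right)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return symbols


# Merge operations in the order listed on the slide.
slide_merges = [("r", "</w>"), ("e", "r</w>"), ("l", "o"), ("lo", "w")]

print(apply_bpe("lower", slide_merges))  # ['low', 'er</w>']
```

Because every word can fall back to single characters, any word whose characters were seen in training can be segmented this way, which is what makes the vocabulary open.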
7.
Evaluation
• data: shared translation task of WMT 2015
  • En-Ge train: 4.2M sentences, 100M tokens
  • En-Ru train: 2.6M sentences, 50M tokens
  • dev: newstest2013, test: newstest2015
• metrics: BLEU and chrF3 (a character n-gram F3 score)
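For readers unfamiliar with a character n-gram F3 score, the idea can be sketched as follows. This is a simplified, single-sentence sketch and not the official chrF scorer: the function names, the maximum n-gram order of 6, and keeping whitespace inside n-grams are all assumptions made for illustration. With β = 3 the F-score weights recall more heavily than precision.

```python
from collections import Counter


def char_ngrams(text, n):
    """Multiset of character n-grams of order n."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def chrf(hypothesis, reference, max_n=6, beta=3.0):
    """Character n-gram F-beta score, averaged over n-gram orders 1..max_n."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # string too short for this n-gram order
        overlap = sum((hyp & ref).values())  # clipped matching n-grams
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


print(chrf("abc", "abc"))  # 1.0
print(chrf("abc", "xyz"))  # 0.0
```

A character-level metric like this gives partial credit when a rare word is translated almost correctly, which word-level BLEU does not.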
8.
segmentation statistics (train) and number of unknown tokens in newstest2013
[table; rows include: segmentation techniques used in SMT, BPE with 59,500 merges, BPE with 89,500 merges, unsegmented words]
9.
result (En-Ge)
• WUnk: word-level model; OOV words are output as UNK
• WDict: WUnk with a back-off dictionary for rare words (baseline)
• C2-50k: character bigrams, with a shortlist of 50,000 unsegmented words
• BPE-J90k: BPE symbols learned jointly on the union of the source and target vocabularies
• BPE-60k: BPE symbols learned separately for each language
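The C2-50k idea (keep shortlisted words whole, split everything else into character bigrams) can be sketched as below. This is a hedged illustration only: the function name `segment_char_ngrams` is hypothetical, and how the paper marks word boundaries or handles a leftover final character is not stated on the slide, so this sketch simply emits the remainder as a shorter unit.

```python
def segment_char_ngrams(tokens, shortlist, n=2):
    """Keep shortlisted words whole; split all other words into character n-grams."""
    out = []
    for tok in tokens:
        if tok in shortlist:
            out.append(tok)
        else:
            # Assumption: non-overlapping chunks; the last one may be shorter.
            out.extend(tok[i:i + n] for i in range(0, len(tok), n))
    return out


# Toy shortlist standing in for the 50,000 most frequent words.
shortlist = {"the", "low"}
print(segment_char_ngrams(["the", "lowest", "lower"], shortlist))
# ['the', 'lo', 'we', 'st', 'lo', 'we', 'r']
```

Unlike BPE, the units here are fixed-length, so frequent longer character sequences are never represented by a single symbol.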
10.
result (En-Ge)
• words: 44,085
• not in top 50,000 words: 2,900
• OOV: 1,168
11.
result (En-Ru)
• words: 55,654
• not in top 50,000 words: 5,442
• OOV: 851
12.
translation examples (En-Ge, En-Ru)
13.
Neural Machine Translation of Rare Words with Subword Units
• main contributions
  • NMT becomes capable of open-vocabulary translation
  • represent OOV words as sequences of subword units
  • using byte pair encoding
  • simpler and more effective than a back-off model
• future work
  • learn the optimal vocabulary size for a translation task
  • e.g. depending on language pair, amount of training data…