SlideShare uma empresa Scribd logo
1 de 28
Baixar para ler offline
Contributions Dataset Models Experiments Conclusion
Weak Semi-Markov CRFs for NP Chunking in
Informal Text
Aldrian Obaja Muis and Wei Lu
Singapore University of Technology and Design
Contributions Dataset Models Experiments Conclusion
Paper Contributions
In this paper, we contributed:
1 Noun phrase-annotated SMS corpus1
1
Tao Chen and Min-Yen Kan (2013). “Creating a live, public short message
service corpus: the NUS SMS corpus”. In: Language Resources and
Evaluation. Vol. 47. Springer Netherlands, pp. 299–335.
2 / 13
Contributions Dataset Models Experiments Conclusion
Paper Contributions
In this paper, we contributed:
1 Noun phrase-annotated SMS corpus1
2 Weak semi-Markov CRF
1
Tao Chen and Min-Yen Kan (2013). “Creating a live, public short message
service corpus: the NUS SMS corpus”. In: Language Resources and
Evaluation. Vol. 47. Springer Netherlands, pp. 299–335.
2 / 13
Contributions Dataset Models Experiments Conclusion
NP-annotated SMS Corpus
3 / 13
Contributions Dataset Models Experiments Conclusion
NP-annotated SMS Corpus
We used Brat Rapid Annotation Tool (BRAT)2 for annotations,
recruiting undergraduate students to annotate the noun phrases.
2
http://brat.nlplab.org/
4 / 13
Contributions Dataset Models Experiments Conclusion
NP-annotated SMS Corpus
We used Brat Rapid Annotation Tool (BRAT)2 for annotations,
recruiting undergraduate students to annotate the noun phrases.
Examples:
2
http://brat.nlplab.org/
4 / 13
Contributions Dataset Models Experiments Conclusion
NP-annotated SMS Corpus
We used Brat Rapid Annotation Tool (BRAT)2 for annotations,
recruiting undergraduate students to annotate the noun phrases.
Examples:
2
http://brat.nlplab.org/
4 / 13
Contributions Dataset Models Experiments Conclusion
Annotations Statistics
64 annotators
5 / 13
Contributions Dataset Models Experiments Conclusion
Annotations Statistics
64 annotators
26,500 SMS messages
5 / 13
Contributions Dataset Models Experiments Conclusion
Annotations Statistics
64 annotators
26,500 SMS messages
76,490 noun phrases
5 / 13
Contributions Dataset Models Experiments Conclusion
Annotations Statistics
64 annotators
26,500 SMS messages
76,490 noun phrases
359,009 tokens
5 / 13
Contributions Dataset Models Experiments Conclusion
Models
6 / 13
Contributions Dataset Models Experiments Conclusion
Models Comparison
n : # words in the sentence, |Y| : # labels, L : max segment length
B B B
I I I
O O O
said Dr Teh
Fig. 1: Linear CRF: O(n|Y|2
)
7 / 13
Contributions Dataset Models Experiments Conclusion
Models Comparison
n : # words in the sentence, |Y| : # labels, L : max segment length
B B B
I I I
O O O
said Dr Teh
Fig. 1: Linear CRF: O(n|Y|2
)
7 / 13
Contributions Dataset Models Experiments Conclusion
Models Comparison
n : # words in the sentence, |Y| : # labels, L : max segment length
B B B
I I I
O O O
said Dr Teh
Fig. 1: Linear CRF: O(n|Y|2
)
7 / 13
Contributions Dataset Models Experiments Conclusion
Models Comparison
n : # words in the sentence, |Y| : # labels, L : max segment length
B B B
I I I
O O O
said Dr Teh
Fig. 1: Linear CRF: O(n|Y|2
)
N N N
O O O
said Dr Teh
Fig. 2: Semi-CRF: O(nL |Y|
2
)
7 / 13
Contributions Dataset Models Experiments Conclusion
Models Comparison
n : # words in the sentence, |Y| : # labels, L : max segment length
B B B
I I I
O O O
said Dr Teh
Fig. 1: Linear CRF: O(n|Y|2
)
N N N
O O O
said Dr Teh
Fig. 2: Semi-CRF: O(nL |Y|
2
)
7 / 13
Contributions Dataset Models Experiments Conclusion
Models Comparison
n : # words in the sentence, |Y| : # labels, L : max segment length
B B B
I I I
O O O
said Dr Teh
Fig. 1: Linear CRF: O(n|Y|2
)
N N N
O O O
said Dr Teh
Fig. 2: Semi-CRF: O(nL |Y|
2
)
N N NN N N
O O OO O O
said Dr Teh
Fig. 3: Weak Semi-CRF: O(n |Y|
2
+ nL |Y|)
7 / 13
Contributions Dataset Models Experiments Conclusion
Models Comparison
n : # words in the sentence, |Y| : # labels, L : max segment length
B B B
I I I
O O O
said Dr Teh
Fig. 1: Linear CRF: O(n|Y|2
)
N N N
O O O
said Dr Teh
Fig. 2: Semi-CRF: O(nL |Y|
2
)
N N NN N N
O O OO O O
said Dr Teh
Fig. 3: Weak Semi-CRF: O(n |Y|
2
+ nL |Y|)
7 / 13
Contributions Dataset Models Experiments Conclusion
Models Comparison
n : # words in the sentence, |Y| : # labels, L : max segment length
B B B
I I I
O O O
said Dr Teh
Fig. 1: Linear CRF: O(n|Y|2
)
N N N
O O O
said Dr Teh
Fig. 2: Semi-CRF: O(nL |Y|
2
)
N N NN N N
O O OO O O
said Dr Teh
Fig. 3: Weak Semi-CRF: O(n |Y|
2
+ nL |Y|)
7 / 13
Contributions Dataset Models Experiments Conclusion
Models Comparison
n : # words in the sentence, |Y| : # labels, L : max segment length
B B B
I I I
O O O
said Dr Teh
Fig. 1: Linear CRF: O(n|Y|2
)
N N N
O O O
said Dr Teh
Fig. 2: Semi-CRF: O(nL |Y|
2
)
N N NN N N
O O OO O O
said Dr Teh
Fig. 3: Weak Semi-CRF: O(n |Y|
2
+ nL |Y|)
7 / 13
Contributions Dataset Models Experiments Conclusion
Empirical Verification
8 / 13
Contributions Dataset Models Experiments Conclusion
F1-Score
Basic features +affixes All features
50
60
70
80
71.19
72.49 72.68
74.37 74.69 74.5874.39 74.60 74.31
F1-Score(%)
Linear CRF Semi-CRF Weak Semi-CRF
9 / 13
Contributions Dataset Models Experiments Conclusion
Training Speed
5,000 10,000 15,000 20,000
0.5
1
1.5
2
# training instances (SMS)
Avg.timeperiteration(s)
Linear-CRF
Semi-CRF
Weak Semi-CRF
10 / 13
Contributions Dataset Models Experiments Conclusion
Conclusion
11 / 13
Contributions Dataset Models Experiments Conclusion
Conclusion
We have created a new NP-annotated dataset on informal text
12 / 13
Contributions Dataset Models Experiments Conclusion
Conclusion
We have created a new NP-annotated dataset on informal text
We can split the decisions of selecting segment length and
segment type to improve the training time, while maintaining
similar accuracy
12 / 13
Contributions Dataset Models Experiments Conclusion
Thank You
Code and data available at:
http://statnlp.org/research/ie/
Aldrian Obaja Muis and Wei Lu
Singapore University of Technology and Design
13 / 13

Mais conteúdo relacionado

Semelhante a Muis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal Text

Phonetic distance based accent
Phonetic distance based accentPhonetic distance based accent
Phonetic distance based accentsipij
 
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURESNAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURESacijjournal
 
Modeling Text Independent Speaker Identification with Vector Quantization
Modeling Text Independent Speaker Identification with Vector QuantizationModeling Text Independent Speaker Identification with Vector Quantization
Modeling Text Independent Speaker Identification with Vector QuantizationTELKOMNIKA JOURNAL
 
Subjective comparison of_speech_enhancement_algori (1)
Subjective comparison of_speech_enhancement_algori (1)Subjective comparison of_speech_enhancement_algori (1)
Subjective comparison of_speech_enhancement_algori (1)Priyanka Reddy
 
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...IJERA Editor
 
NLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology ConstraintsNLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology ConstraintsDimitris Kontokostas
 
Automatic Profiling Of Learner Texts
Automatic Profiling Of Learner TextsAutomatic Profiling Of Learner Texts
Automatic Profiling Of Learner TextsJeff Nelson
 
Comparative performance analysis of channel normalization techniques
Comparative performance analysis of channel normalization techniquesComparative performance analysis of channel normalization techniques
Comparative performance analysis of channel normalization techniqueseSAT Journals
 
2019 acl bio_nlp_nli_surf_poster
2019 acl bio_nlp_nli_surf_poster2019 acl bio_nlp_nli_surf_poster
2019 acl bio_nlp_nli_surf_posterjiin nam
 
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS cscpconf
 
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...mathsjournal
 
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...mathsjournal
 
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...mathsjournal
 
Drosophila Three-Point Test Cross Lab Write-Up Instructions.docx
Drosophila Three-Point Test Cross Lab Write-Up Instructions.docxDrosophila Three-Point Test Cross Lab Write-Up Instructions.docx
Drosophila Three-Point Test Cross Lab Write-Up Instructions.docxharold7fisher61282
 
A combined-conventional-and-differential-evolution-method-for-model-order-red...
A combined-conventional-and-differential-evolution-method-for-model-order-red...A combined-conventional-and-differential-evolution-method-for-model-order-red...
A combined-conventional-and-differential-evolution-method-for-model-order-red...Cemal Ardil
 
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLSSBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLScsandit
 
Genetic Programming for Generating Prototypes in Classification Problems
Genetic Programming for Generating Prototypes in Classification ProblemsGenetic Programming for Generating Prototypes in Classification Problems
Genetic Programming for Generating Prototypes in Classification ProblemsTarundeep Dhot
 
(2013) A Trade-off Between Number of Impressions and Number of Interaction At...
(2013) A Trade-off Between Number of Impressions and Number of Interaction At...(2013) A Trade-off Between Number of Impressions and Number of Interaction At...
(2013) A Trade-off Between Number of Impressions and Number of Interaction At...International Center for Biometric Research
 

Semelhante a Muis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal Text (20)

L3 v2
L3 v2L3 v2
L3 v2
 
Phonetic distance based accent
Phonetic distance based accentPhonetic distance based accent
Phonetic distance based accent
 
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURESNAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
 
Modeling Text Independent Speaker Identification with Vector Quantization
Modeling Text Independent Speaker Identification with Vector QuantizationModeling Text Independent Speaker Identification with Vector Quantization
Modeling Text Independent Speaker Identification with Vector Quantization
 
Subjective comparison of_speech_enhancement_algori (1)
Subjective comparison of_speech_enhancement_algori (1)Subjective comparison of_speech_enhancement_algori (1)
Subjective comparison of_speech_enhancement_algori (1)
 
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...
 
NLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology ConstraintsNLP Data Cleansing Based on Linguistic Ontology Constraints
NLP Data Cleansing Based on Linguistic Ontology Constraints
 
Automatic Profiling Of Learner Texts
Automatic Profiling Of Learner TextsAutomatic Profiling Of Learner Texts
Automatic Profiling Of Learner Texts
 
Comparative performance analysis of channel normalization techniques
Comparative performance analysis of channel normalization techniquesComparative performance analysis of channel normalization techniques
Comparative performance analysis of channel normalization techniques
 
2019 acl bio_nlp_nli_surf_poster
2019 acl bio_nlp_nli_surf_poster2019 acl bio_nlp_nli_surf_poster
2019 acl bio_nlp_nli_surf_poster
 
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS
 
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
 
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
 
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...
 
Drosophila Three-Point Test Cross Lab Write-Up Instructions.docx
Drosophila Three-Point Test Cross Lab Write-Up Instructions.docxDrosophila Three-Point Test Cross Lab Write-Up Instructions.docx
Drosophila Three-Point Test Cross Lab Write-Up Instructions.docx
 
A combined-conventional-and-differential-evolution-method-for-model-order-red...
A combined-conventional-and-differential-evolution-method-for-model-order-red...A combined-conventional-and-differential-evolution-method-for-model-order-red...
A combined-conventional-and-differential-evolution-method-for-model-order-red...
 
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLSSBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS
SBML FOR OPTIMIZING DECISION SUPPORT'S TOOLS
 
Genetic Programming for Generating Prototypes in Classification Problems
Genetic Programming for Generating Prototypes in Classification ProblemsGenetic Programming for Generating Prototypes in Classification Problems
Genetic Programming for Generating Prototypes in Classification Problems
 
(2013) A Trade-off Between Number of Impressions and Number of Interaction At...
(2013) A Trade-off Between Number of Impressions and Number of Interaction At...(2013) A Trade-off Between Number of Impressions and Number of Interaction At...
(2013) A Trade-off Between Number of Impressions and Number of Interaction At...
 
2-IJCSE-00536
2-IJCSE-005362-IJCSE-00536
2-IJCSE-00536
 

Mais de Association for Computational Linguistics

Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...Association for Computational Linguistics
 
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...Association for Computational Linguistics
 
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsDaniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsAssociation for Computational Linguistics
 
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsDaniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsAssociation for Computational Linguistics
 
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...Association for Computational Linguistics
 
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Association for Computational Linguistics
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Association for Computational Linguistics
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopAssociation for Computational Linguistics
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...Association for Computational Linguistics
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...Association for Computational Linguistics
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Association for Computational Linguistics
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Association for Computational Linguistics
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopAssociation for Computational Linguistics
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Association for Computational Linguistics
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Association for Computational Linguistics
 

Mais de Association for Computational Linguistics (20)

Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
Castro - 2018 - A High Coverage Method for Automatic False Friends Detection ...
 
Castro - 2018 - A Crowd-Annotated Spanish Corpus for Humour Analysis
Castro - 2018 - A Crowd-Annotated Spanish Corpus for Humour AnalysisCastro - 2018 - A Crowd-Annotated Spanish Corpus for Humour Analysis
Castro - 2018 - A Crowd-Annotated Spanish Corpus for Humour Analysis
 
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
Muthu Kumar Chandrasekaran - 2018 - Countering Position Bias in Instructor In...
 
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsDaniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
 
Elior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
Elior Sulem - 2018 - Semantic Structural Evaluation for Text SimplificationElior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
Elior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
 
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsDaniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
 
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
 
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
 

Último

Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterMateoGardella
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxnegromaestrong
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.MateoGardella
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Shubhangi Sonawane
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docxPoojaSen20
 

Último (20)

Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 

Muis - 2016 - Weak Semi-Markov CRFs for NP Chunking in Informal Text

  • 1. Contributions Dataset Models Experiments Conclusion Weak Semi-Markov CRFs for NP Chunking in Informal Text Aldrian Obaja Muis and Wei Lu Singapore University of Technology and Design
  • 2. Contributions Dataset Models Experiments Conclusion Paper Contributions In this paper, we contributed: 1 Noun phrase-annotated SMS corpus1 1 Tao Chen and Min-Yen Kan (2013). “Creating a live, public short message service corpus: the NUS SMS corpus”. In: Language Resources and Evaluation. Vol. 47. Springer Netherlands, pp. 299–335. 2 / 13
  • 3. Contributions Dataset Models Experiments Conclusion Paper Contributions In this paper, we contributed: 1 Noun phrase-annotated SMS corpus1 2 Weak semi-Markov CRF 1 Tao Chen and Min-Yen Kan (2013). “Creating a live, public short message service corpus: the NUS SMS corpus”. In: Language Resources and Evaluation. Vol. 47. Springer Netherlands, pp. 299–335. 2 / 13
  • 4. Contributions Dataset Models Experiments Conclusion NP-annotated SMS Corpus 3 / 13
  • 5. Contributions Dataset Models Experiments Conclusion NP-annotated SMS Corpus We used Brat Rapid Annotation Tool (BRAT)2 for annotations, recruiting undergraduate students to annotate the noun phrases. 2 http://brat.nlplab.org/ 4 / 13
  • 6. Contributions Dataset Models Experiments Conclusion NP-annotated SMS Corpus We used Brat Rapid Annotation Tool (BRAT)2 for annotations, recruiting undergraduate students to annotate the noun phrases. Examples: 2 http://brat.nlplab.org/ 4 / 13
  • 7. Contributions Dataset Models Experiments Conclusion NP-annotated SMS Corpus We used Brat Rapid Annotation Tool (BRAT)2 for annotations, recruiting undergraduate students to annotate the noun phrases. Examples: 2 http://brat.nlplab.org/ 4 / 13
  • 8. Contributions Dataset Models Experiments Conclusion Annotations Statistics 64 annotators 5 / 13
  • 9. Contributions Dataset Models Experiments Conclusion Annotations Statistics 64 annotators 26,500 SMS messages 5 / 13
  • 10. Contributions Dataset Models Experiments Conclusion Annotations Statistics 64 annotators 26,500 SMS messages 76,490 noun phrases 5 / 13
  • 11. Contributions Dataset Models Experiments Conclusion Annotations Statistics 64 annotators 26,500 SMS messages 76,490 noun phrases 359,009 tokens 5 / 13
  • 12. Contributions Dataset Models Experiments Conclusion Models 6 / 13
  • 13. Contributions Dataset Models Experiments Conclusion Models Comparison n : # words in the sentence, |Y| : # labels, L : max segment length B B B I I I O O O said Dr Teh Fig. 1: Linear CRF: O(n|Y|2 ) 7 / 13
  • 14. Contributions Dataset Models Experiments Conclusion Models Comparison n : # words in the sentence, |Y| : # labels, L : max segment length B B B I I I O O O said Dr Teh Fig. 1: Linear CRF: O(n|Y|2 ) 7 / 13
  • 15. Contributions Dataset Models Experiments Conclusion Models Comparison n : # words in the sentence, |Y| : # labels, L : max segment length B B B I I I O O O said Dr Teh Fig. 1: Linear CRF: O(n|Y|2 ) 7 / 13
  • 16. Contributions Dataset Models Experiments Conclusion Models Comparison n : # words in the sentence, |Y| : # labels, L : max segment length B B B I I I O O O said Dr Teh Fig. 1: Linear CRF: O(n|Y|2 ) N N N O O O said Dr Teh Fig. 2: Semi-CRF: O(nL |Y| 2 ) 7 / 13
  • 17. Contributions Dataset Models Experiments Conclusion Models Comparison n : # words in the sentence, |Y| : # labels, L : max segment length B B B I I I O O O said Dr Teh Fig. 1: Linear CRF: O(n|Y|2 ) N N N O O O said Dr Teh Fig. 2: Semi-CRF: O(nL |Y| 2 ) 7 / 13
  • 18. Contributions Dataset Models Experiments Conclusion Models Comparison n : # words in the sentence, |Y| : # labels, L : max segment length B B B I I I O O O said Dr Teh Fig. 1: Linear CRF: O(n|Y|2 ) N N N O O O said Dr Teh Fig. 2: Semi-CRF: O(nL |Y| 2 ) N N NN N N O O OO O O said Dr Teh Fig. 3: Weak Semi-CRF: O(n |Y| 2 + nL |Y|) 7 / 13
  • 19. Contributions Dataset Models Experiments Conclusion Models Comparison n : # words in the sentence, |Y| : # labels, L : max segment length B B B I I I O O O said Dr Teh Fig. 1: Linear CRF: O(n|Y|2 ) N N N O O O said Dr Teh Fig. 2: Semi-CRF: O(nL |Y| 2 ) N N NN N N O O OO O O said Dr Teh Fig. 3: Weak Semi-CRF: O(n |Y| 2 + nL |Y|) 7 / 13
  • 20. Contributions Dataset Models Experiments Conclusion Models Comparison n : # words in the sentence, |Y| : # labels, L : max segment length B B B I I I O O O said Dr Teh Fig. 1: Linear CRF: O(n|Y|2 ) N N N O O O said Dr Teh Fig. 2: Semi-CRF: O(nL |Y| 2 ) N N NN N N O O OO O O said Dr Teh Fig. 3: Weak Semi-CRF: O(n |Y| 2 + nL |Y|) 7 / 13
  • 21. Contributions Dataset Models Experiments Conclusion Models Comparison n : # words in the sentence, |Y| : # labels, L : max segment length B B B I I I O O O said Dr Teh Fig. 1: Linear CRF: O(n|Y|2 ) N N N O O O said Dr Teh Fig. 2: Semi-CRF: O(nL |Y| 2 ) N N NN N N O O OO O O said Dr Teh Fig. 3: Weak Semi-CRF: O(n |Y| 2 + nL |Y|) 7 / 13
  • 22. Contributions Dataset Models Experiments Conclusion Empirical Verification 8 / 13
  • 23. Contributions Dataset Models Experiments Conclusion F1-Score Basic features +affixes All features 50 60 70 80 71.19 72.49 72.68 74.37 74.69 74.5874.39 74.60 74.31 F1-Score(%) Linear CRF Semi-CRF Weak Semi-CRF 9 / 13
  • 24. Contributions Dataset Models Experiments Conclusion Training Speed 5,000 10,000 15,000 20,000 0.5 1 1.5 2 # training instances (SMS) Avg.timeperiteration(s) Linear-CRF Semi-CRF Weak Semi-CRF 10 / 13
  • 25. Contributions Dataset Models Experiments Conclusion Conclusion 11 / 13
  • 26. Contributions Dataset Models Experiments Conclusion Conclusion We have created a new NP-annotated dataset on informal text 12 / 13
  • 27. Contributions Dataset Models Experiments Conclusion Conclusion We have created a new NP-annotated dataset on informal text We can split the decisions of selecting segment length and segment type to improve the training time, while maintaining similar accuracy 12 / 13
  • 28. Contributions Dataset Models Experiments Conclusion Thank You Code and data available at: http://statnlp.org/research/ie/ Aldrian Obaja Muis and Wei Lu Singapore University of Technology and Design 13 / 13