Aaron L.-F. Han
Natural Language Processing & Portuguese-Chinese Machine Translation
Laboratory
University of Macau, Macau S.A.R., China
2013.08 @ CUHK, Hong Kong
Email: hanlifengaaron AT gmail DOT com
Homepage: http://www.linkedin.com/in/aaronhan
• The importance of machine translation (MT) evaluation
• Automatic MT evaluation metrics introduction
1. Lexical similarity
2. Linguistic features
3. Metrics combination
• Designed metric: LEPOR Series
1. Motivation
2. LEPOR Metrics Description
3. Performances on international ACL-WMT corpora
4. Publications and Open source tools
• Other research interests and publications
• People of different nationalities are eager to communicate
with each other
– This promotes the development of translation technology
• Rapid development of machine translation
– Machine translation (MT) began as early as the 1950s
(Weaver, 1955)
– Great progress since the 1990s, due to the development of
computers (storage capacity and computational power)
and enlarged bilingual corpora (Marino et al. 2006)
• Some recent works in MT research:
– Och (2003) presents MERT (Minimum Error Rate Training)
for log-linear SMT
– Su et al. (2009) use the Thematic Role Templates model to
improve translation
– Xiong et al. (2011) employ the maximum-entropy model,
etc.
– Data-driven methods, including example-based MT
(Carl and Way, 2003) and statistical MT (Koehn, 2010),
have become the main approaches in the MT literature.
• How well do MT systems perform, and are they
making progress?
• Difficulties of MT evaluation
– language variability means there is no single correct translation
– natural languages are highly ambiguous, and different
languages do not always express the same content in the
same way (Arnold, 2003)
• Traditional manual evaluation criteria:
– intelligibility (measuring how understandable the
sentence is)
– fidelity (measuring how much information the translated
sentence retains as compared to the original) by the
Automatic Language Processing Advisory Committee
(ALPAC) around 1966 (Carroll, 1966)
– adequacy (similar to fidelity), fluency (whether the
sentence is well-formed and fluent) and comprehension
(improved intelligibility) by the Defense Advanced Research
Projects Agency (DARPA) of the US (White et al., 1994)
• Problems of manual evaluations :
– Time-consuming
– Expensive
– Unrepeatable
– Low agreement (Callison-Burch, et al., 2011)
2.1 Lexical similarity
2.2 Linguistic features
2.3 Metrics combination
• Precision-based
BLEU (Papineni et al., 2002 ACL)
• Recall-based
ROUGE (Lin, 2004 WAS)
• Precision and Recall
Meteor (Banerjee and Lavie, 2005 ACL)
• Word-order based
NKT_NSR (Isozaki et al., 2010 EMNLP), Port (Chen
et al., 2012 ACL), ATEC (Wong et al., 2008 AMTA)
• Word-alignment based
AER (Och and Ney, 2003 J.CL)
• Edit distance-based
WER (Su et al., 1992 Coling), PER (Tillmann et al.,
1997 EUROSPEECH), TER (Snover et al., 2006
AMTA)
• Language model
LM-SVM (Gamon et al., 2005 EAMT)
• Shallow parsing
GLEU (Mutton et al., 2007 ACL), TerrorCat (Fishel
et al., 2012 WMT)
• Semantic roles
Named entity, morphological, synonymy,
paraphrasing, discourse representation, etc.
• MTeRater-Plus (Parton et al., 2011 WMT)
– Combines BLEU, TERp (Snover et al., 2009) and Meteor
(Banerjee and Lavie, 2005; Lavie and Denkowski, 2009)
• MPF & WMPBleu (Popovic, 2011 WMT)
– Arithmetic mean of F score and BLEU score
• SIA (Liu and Gildea, 2006 ACL)
– Combines the advantages of n-gram-based metrics and
loose-sequence-based metrics
• LEPOR: an automatic machine translation evaluation
metric that considers Length Penalty, Precision, n-gram
Position difference Penalty and Recall.
• Weaknesses in existing metrics:
– they perform well on certain language pairs but poorly on others,
which we call the language-bias problem;
– they consider either no linguistic information (which leads to
low correlation with human judgments) or too
many linguistic features (which makes them hard to replicate),
which we call the extremism problem;
– they rely on an incomplete set of factors (e.g. BLEU focuses on
precision only).
– What to do?
• To address some of the existing problems:
– Design tunable parameters to address the language-bias
problem;
– Use concise or optimized linguistic features to address the
linguistic extremism problem;
– Design augmented factors.
• Sub-factors:
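• $LP = \begin{cases} \exp\left(1 - \frac{r}{c}\right) & \text{if } c < r \\ 1 & \text{if } c = r \\ \exp\left(1 - \frac{c}{r}\right) & \text{if } c > r \end{cases}$ (1)
• $r$: length of reference sentence
• $c$: length of candidate (system-output) sentence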
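The length penalty in Eq. (1) can be sketched in a few lines of Python. This is only an illustration: the function name `length_penalty` and the check for non-positive lengths are assumptions of this sketch, not part of the slides.

```python
import math


def length_penalty(c: int, r: int) -> float:
    """Length penalty LP of Eq. (1).

    c: length of the candidate (system-output) sentence
    r: length of the reference sentence
    """
    if c <= 0 or r <= 0:
        # Assumption of this sketch: empty sentences are rejected.
        raise ValueError("sentence lengths must be positive")
    if c < r:
        return math.exp(1 - r / c)  # candidate shorter than reference
    if c > r:
        return math.exp(1 - c / r)  # candidate longer than reference
    return 1.0  # equal lengths: no penalty
```

The penalty is symmetric in the length ratio, so a candidate that is twice as long as the reference is penalized as much as one that is half as long.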
• $NPosPenal = \exp(-NPD)$ (2)
• $NPD = \frac{1}{Length_{output}} \sum_{i=1}^{Length_{output}} |PD_i|$ (3)
• $PD_i = |MatchN_{output} - MatchN_{ref}|$ (4)
• $MatchN_{output}$: position of matched token in
output sentence
• $MatchN_{ref}$: position of matched token in reference
sentence
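Given an alignment between output and reference tokens, Eqs. (2) to (4) can be sketched as follows. The alignment itself (Fig. 1) is taken as input here. Two points are assumptions of this sketch rather than statements from the slides: positions are divided by the sentence length before being compared (pass `normalize=False` to use raw positions), and unmatched output tokens contribute zero to the sum. The name `n_pos_penal` is hypothetical.

```python
import math


def n_pos_penal(matches, out_len, ref_len, normalize=True):
    """NPosPenal = exp(-NPD), following Eqs. (2)-(4).

    matches: list of (out_pos, ref_pos) pairs, 1-based positions of
             aligned tokens in the output and reference sentences
    out_len: number of tokens in the output sentence
    ref_len: number of tokens in the reference sentence
    """
    if out_len <= 0 or ref_len <= 0:
        raise ValueError("sentence lengths must be positive")
    total = 0.0
    for out_pos, ref_pos in matches:
        if normalize:
            # Assumption: compare relative positions in each sentence.
            total += abs(out_pos / out_len - ref_pos / ref_len)
        else:
            total += abs(out_pos - ref_pos)
    npd = total / out_len  # Eq. (3); unmatched tokens add nothing
    return math.exp(-npd)  # Eq. (2)
```

A candidate whose matched tokens sit at the same relative positions as in the reference gets NPD = 0 and therefore no penalty.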
Fig. 1. N-gram word alignment algorithm
Fig. 2. Example of n-gram word alignment
Fig. 3. Example of NPD calculation
• N-gram precision and recall:
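• $P_n = \frac{\#\,\text{ngram matched}}{\#\,\text{ngram chunks in system output}}$ (5)
• $R_n = \frac{\#\,\text{ngram matched}}{\#\,\text{ngram chunks in reference}}$ (6)
• $HPR = Harmonic(\alpha R_n, \beta P_n) = \frac{\alpha + \beta}{\frac{\alpha}{R_n} + \frac{\beta}{P_n}}$ (7)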
Fig. 4. Example of bigram matching
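Eqs. (5) to (7) can be illustrated with a short Python sketch. Note the simplification: matched n-grams are counted here with clipped multiset intersection, which is an assumption of this sketch and stands in for the alignment-based matching shown in the figures. The helper names (`ngrams`, `precision_recall`, `hpr`) are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def precision_recall(candidate, reference, n=1):
    """P_n and R_n of Eqs. (5)-(6), using clipped n-gram counts."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    matched = sum((cand & ref).values())  # multiset intersection
    p = matched / sum(cand.values()) if cand else 0.0
    r = matched / sum(ref.values()) if ref else 0.0
    return p, r


def hpr(p, r, alpha=1.0, beta=1.0):
    """Weighted harmonic mean of recall and precision, Eq. (7)."""
    if p == 0 or r == 0:
        return 0.0  # avoid division by zero when nothing matches
    return (alpha + beta) / (alpha / r + beta / p)


p, r = precision_recall("the cat sat".split(), "the cat sat down".split())
score = hpr(p, r)  # with alpha = beta = 1 this is the usual F1 score
```

Raising `alpha` relative to `beta` puts more weight on recall, which is how the metric can be tuned per language pair.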
• LEPOR Metrics:
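• $LEPOR = LP \times NPosPenal \times Harmonic(\alpha R, \beta P)$ (8)
• $hLEPOR = Harmonic(w_{LP} LP, w_{NPosPenal} NPosPenal, w_{HPR} HPR)$
• $= \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{Factor_i}} = \frac{w_{LP} + w_{NPosPenal} + w_{HPR}}{\frac{w_{LP}}{LP} + \frac{w_{NPosPenal}}{NPosPenal} + \frac{w_{HPR}}{HPR}}$ (9)
• $nLEPOR = LP \times NPosPenal \times \exp\left(\sum_{n=1}^{N} w_n \log HPR\right)$ (10)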
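The three combinations in Eqs. (8) to (10) differ only in how the factors are merged, which a small Python sketch makes concrete. It takes already-computed factor values as input. The default weights (all 1.0, and uniform over n-gram orders) are placeholders, not the tuned values used in the experiments, and the sketch assumes one HPR value per n-gram order in Eq. (10).

```python
import math


def lepor(lp, npos_penal, hpr):
    """Eq. (8): plain product of the three factors."""
    return lp * npos_penal * hpr


def hlepor(lp, npos_penal, hpr, w_lp=1.0, w_npp=1.0, w_hpr=1.0):
    """Eq. (9): weighted harmonic mean of the three factors."""
    factors = [(w_lp, lp), (w_npp, npos_penal), (w_hpr, hpr)]
    if any(f <= 0 for _, f in factors):
        return 0.0  # a zero factor drives the harmonic mean to zero
    return sum(w for w, _ in factors) / sum(w / f for w, f in factors)


def nlepor(lp, npos_penal, hpr_by_n, weights=None):
    """Eq. (10): product with a weighted geometric mean over n-gram orders.

    hpr_by_n: HPR computed for n = 1..N (assumption: one value per order)
    """
    if not hpr_by_n:
        raise ValueError("need at least one HPR value")
    if weights is None:
        weights = [1.0 / len(hpr_by_n)] * len(hpr_by_n)  # placeholder
    if any(h <= 0 for h in hpr_by_n):
        return 0.0  # log(0) is undefined; treat as no match
    log_sum = sum(w * math.log(h) for w, h in zip(weights, hpr_by_n))
    return lp * npos_penal * math.exp(log_sum)
```

Because hLEPOR uses a harmonic mean instead of a product, a single weak factor lowers the score less sharply than in Eq. (8), and the weights control how much each factor matters.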
• Example, employment of linguistic features:
Fig. 5. Example of n-gram POS alignment
Fig. 6. Example of NPD calculation
• Combination with linguistic features:
• $hLEPOR_{final} = \frac{1}{w_{hw} + w_{hp}} \left( w_{hw}\, hLEPOR_{word} + w_{hp}\, hLEPOR_{POS} \right)$ (11)
• ℎ𝐿𝐸𝑃𝑂𝑅 𝑃𝑂𝑆 and ℎ𝐿𝐸𝑃𝑂𝑅 𝑤𝑜𝑟𝑑 use the same
algorithm on POS sequence and word sequence
respectively.
• With multiple references:
• Select the alignment that results in the minimum NPD
score.
Fig. 7. N-gram alignment with multiple references
• How reliable is the automatic metric?
• Evaluation criteria for evaluation metrics:
– Human judgments are currently the gold standard that metrics try to approximate.
• Correlation with human judgments:
– System-level correlation
– Segment-level correlation
• System-level correlation:
• Spearman rank correlation coefficient:
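– $\rho_{XY} = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$ (12)
– $X = \{x_1, \dots, x_n\}$, $Y = \{y_1, \dots, y_n\}$, and $d_i$ is the difference between the ranks of $x_i$ and $y_i$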
• Pearson correlation coefficient:
– $\rho_{XY} = \frac{\sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)}{\sqrt{\sum_{i=1}^{n} (x_i - \mu_x)^2}\, \sqrt{\sum_{i=1}^{n} (y_i - \mu_y)^2}}$ (13)
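Both system-level correlations in Eqs. (12) and (13) are easy to compute directly. The sketch below is a plain-Python illustration with hypothetical function names. The rank helper assumes there are no tied scores, which is the condition under which the shortcut formula in Eq. (12) is exact.

```python
import math


def ranks(values):
    """Rank 1 for the largest value; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation, Eq. (12)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists with n >= 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def pearson(x, y):
    """Pearson correlation coefficient, Eq. (13)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists with n >= 2")
    mu_x, mu_y = sum(x) / n, sum(y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mu_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mu_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)
```

In the metric-evaluation setting, `x` would hold the metric's scores for each MT system and `y` the corresponding human scores.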
• Segment-level Kendall’s tau correlation:
• $\tau = \frac{\text{num concordant pairs} - \text{num discordant pairs}}{\text{total pairs}}$ (14)
• The segment unit can be a single sentence or a
fragment that contains several sentences.
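Eq. (14) can be sketched as below, assuming the human judgments come as pairwise preferences between two translations of the same segment. The input format and the tie handling (a metric tie counts toward the total but is neither concordant nor discordant) are assumptions of this sketch, not something the slides specify.

```python
def kendall_tau(pairs):
    """Segment-level Kendall's tau, Eq. (14).

    pairs: list of (score_preferred, score_other), the metric scores of
           the translation humans preferred and of the one they did not.
    """
    if not pairs:
        raise ValueError("need at least one judged pair")
    concordant = sum(1 for a, b in pairs if a > b)  # metric agrees
    discordant = sum(1 for a, b in pairs if a < b)  # metric disagrees
    return (concordant - discordant) / len(pairs)


judged = [(0.8, 0.5), (0.6, 0.7), (0.9, 0.1), (0.4, 0.2)]
tau = kendall_tau(judged)  # 3 concordant, 1 discordant, 4 pairs
```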
• Performances on ACL-WMT 2011 corpora
• Two translation directions:
– English-to-other (Spanish, German, French, Czech)
– Other-to-English
• System-level metrics:
– $LEPOR_A = \frac{1}{num_{sent}} \sum_{i=1}^{num_{sent}} |LEPOR_i|$ (15)
– $LEPOR_B = LP \times NPosPenal \times Harmonic(\alpha R, \beta P)$ (16)
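The two system-level variants in Eqs. (15) and (16) can give different scores for the same output, as this sketch shows. For LEPOR_B it assumes that each system-level factor is the average of the sentence-level factor; the slides do not say how the system-level factors are obtained, so treat that as a guess. Function names are hypothetical.

```python
def lepor_a(sentence_scores):
    """Eq. (15): mean of the sentence-level LEPOR scores."""
    if not sentence_scores:
        raise ValueError("need at least one sentence score")
    return sum(sentence_scores) / len(sentence_scores)


def lepor_b(sentence_factors):
    """Eq. (16): product of system-level factors.

    sentence_factors: list of (lp, npos_penal, hpr) per sentence.
    Assumption: each system-level factor is the sentence-level mean.
    """
    if not sentence_factors:
        raise ValueError("need at least one sentence")
    n = len(sentence_factors)
    lp = sum(f[0] for f in sentence_factors) / n
    npos_penal = sum(f[1] for f in sentence_factors) / n
    hpr = sum(f[2] for f in sentence_factors) / n
    return lp * npos_penal * hpr


factors = [(1.0, 1.0, 0.5), (0.5, 1.0, 1.0)]
score_a = lepor_a([lp * npp * h for lp, npp, h in factors])
score_b = lepor_b(factors)
```

Here both sentences score 0.5 individually, so LEPOR_A is 0.5, while averaging the factors first gives a higher LEPOR_B.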
Table 1. system-level Spearman correlation with human judgment on WMT11 corpora
• Performances on ACL-WMT 2013 corpora
• Two translation directions:
– English-to-other (Spanish, German, French, Czech, and
Russian)
– Other-to-English
• System-level & sentence-level
– LEPOR_v3.1: hLEPOR, nLEPOR_baseline
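– $hLEPOR = \frac{1}{num_{sent}} \sum_{i=1}^{num_{sent}} |hLEPOR_i|$ (17)
– $hLEPOR_{final} = \frac{1}{w_{hw} + w_{hp}} \left( w_{hw}\, hLEPOR_{word} + w_{hp}\, hLEPOR_{POS} \right)$ (18)
– $nLEPOR = LP \times NPosPenal \times \exp\left(\sum_{n=1}^{N} w_n \log HPR\right)$ (19)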
Table 2&3. system-level Pearson correlation with human judgment
Table 4&5. segment-level Kendall’s tau correlation with human judgment
• LEPOR: A Robust Evaluation Metric for Machine
Translation with Augmented Factors
– Aaron L.-F. Han, Derek F. Wong and Lidia S. Chao.
Proceedings of COLING 2012: Posters, pages 441–450,
Mumbai, India.
• Language-independent Model for Machine
Translation Evaluation with Reinforced Factors
– Aaron L.-F. Han, Derek Wong, Lidia S. Chao, Liangye He, Yi
Lu, Junwen Xing, Xiaodong Zeng. Proceedings of MT
Summit 2013. Nice, France.
• Language independent MT evaluation-LEPOR:
https://github.com/aaronlifenghan/aaron-project-lepor
• MT evaluation with linguistic features-hLEPOR:
https://github.com/aaronlifenghan/aaron-project-hlepor
• English-French Phrase tagset mapping and application in
unsupervised MT evaluation-HPPR:
https://github.com/aaronlifenghan/aaron-project-hppr
• Unsupervised English-Spanish MT evaluation-EBLEU:
https://github.com/aaronlifenghan/aaron-project-ebleu
• Projects Homepage: https://github.com/aaronlifenghan
• My research interests:
– Natural Language Processing
– Signal Processing
– Machine Learning
– Artificial Intelligence
– Pattern Recognition
• My past research works:
– Machine Translation Evaluation, Word Segmentation,
Entity Recognition, Multilingual Treebanks
• Other publications:
• A Description of Tunable Machine Translation Evaluation Systems in
WMT13 Metrics Task
– Aaron Li-Feng Han, Derek Wong, Lidia S. Chao, Yi Lu, Yervant
Ho, Yiming Wang, Zhou jiaji. Proceedings of the ACL 2013 EIGHTH
WORKSHOP ON STATISTICAL MACHINE TRANSLATION (ACL-WMT
2013), 8-9 August 2013. Sofia, Bulgaria.
– ACL-WMT13 METRICS TASK:
Our metrics are language independent
English-vs-other (French, Spanish, Czech, German, Russian)
Can perform on both system-level and segment-level
The official results show our metrics have advantages as compared to others.
• Quality Estimation for Machine Translation Using the Joint Method of
Evaluation Criteria and Statistical Modeling
– Aaron Li-Feng Han, Yi Lu, Derek F. Wong, Lidia S. Chao, Yervant
Ho, Anson Xing. Proceedings of the ACL 2013 EIGHTH WORKSHOP ON
STATISTICAL MACHINE TRANSLATION (ACL-WMT 2013), 8-9 August
2013. Sofia, Bulgaria.
– ACL-WMT13 QUALITY ESTIMATION TASK (no reference translation):
Task 1.1: sentence level EN-ES quality estimation
Task 1.2: system selection, EN-ES, EN-DE, new
Task 2: word-level QE, EN-ES, binary classification, multi-class classification, new
We design novel EN-ES POS tagset mapping and metric EBLEU in task 1.1.
We explore the Naïve Bayes and Support Vector Machine in task 1.2.
We achieve the highest F1 score in task 2 using Conditional Random Field.
Designed POS tagset mapping of Spanish (Tree tagger) to universal tagset
(Petrov et al., 2012)
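• $EBLEU = 1 - MLP \times \exp\left(\sum_{n} w_n \log(H(\alpha R_n, \beta P_n))\right)$
– $MLP = \begin{cases} e^{1 - \frac{s}{h}} & \text{if } h < s \\ e^{1 - \frac{h}{s}} & \text{if } h \ge s \end{cases}$
– $P_n = \frac{\#\,\text{common ngram chunk}}{\#\,\text{ngram chunk in target sentence}}$
– $R_n = \frac{\#\,\text{common ngram chunk}}{\#\,\text{ngram chunk in source sentence}}$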
• Bayes’ rule:
– $p(c_i \mid x_1, x_2, \dots, x_n) = \frac{p(x_1, x_2, \dots, x_n \mid c_i)\, p(c_i)}{p(x_1, x_2, \dots, x_n)}$
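As an illustration of how Bayes' rule is used for classification, the sketch below computes class posteriors under the Naive Bayes independence assumption, i.e. p(x_1, ..., x_n | c) is taken as the product of the per-feature likelihoods. The probability tables and class names are made-up toy values, and the function name is hypothetical.

```python
def naive_bayes_posterior(priors, likelihoods, features):
    """Posterior p(c | x_1..x_n) for every class c.

    priors:      {class: p(class)}
    likelihoods: {class: [ {value: p(x_j = value | class)}, ... ]}
    features:    observed values x_1..x_n
    Assumption: features are conditionally independent given the class.
    """
    joint = {}
    for c, prior in priors.items():
        prob = prior
        for j, value in enumerate(features):
            prob *= likelihoods[c][j].get(value, 0.0)
        joint[c] = prob  # numerator of Bayes' rule
    evidence = sum(joint.values())  # denominator p(x_1..x_n)
    if evidence == 0:
        raise ValueError("observation has zero probability in every class")
    return {c: prob / evidence for c, prob in joint.items()}


# Toy example with one binary feature (values are illustrative only).
priors = {"good": 0.5, "bad": 0.5}
likelihoods = {
    "good": [{"a": 0.8, "b": 0.2}],
    "bad": [{"a": 0.2, "b": 0.8}],
}
posterior = naive_bayes_posterior(priors, likelihoods, ["a"])
```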
• SVM: find the point with the smallest margin to the
hyperplane, then maximize this margin:
– $\arg\max_{w, b} \left\{ \min_{n} \left( label \cdot (w^{T} x + b) \right) \cdot \frac{1}{\|w\|} \right\}$
• Conditional random fields:
– $P(Y \mid X) \propto \exp\left( \sum_{e \in E, k} \lambda_k f_k(e, Y|_e, X) + \sum_{v \in V, k} \mu_k g_k(v, Y|_v, X) \right)$
Designed features for CRF & NB algorithms
ACL-WMT13 word level quality estimation task results
• Phrase Tagset Mapping for French and English Treebanks and Its
Application in Machine Translation Evaluation
– Aaron Li-Feng Han, Derek F. Wong, Lidia S. Chao, Yervant Ho, Shuo
Li, Lynn Ling Zhu. In GSCL 2013. LNCS Vol. 8105, Volume Editors: Iryna
Gurevych, Chris Biemann and Torsten Zesch.
– German Society for Computational Linguistics (oral presentation):
To facilitate future research in unsupervised induction of syntactic structures
We design French-English phrase tagset mapping
We propose a universal phrase tagset
Phrase tags extracted from French Treebank and English Penn Treebank
Explore the employment of the proposed mapping in unsupervised MT evaluation
Designed phrase tagset mapping for English and French
Evaluation based on parsing information from syntactic treebanks
Convert the word sequence into universal phrase tagset sequence
• $HPPR = Har(w_{Ps} N_1PsDif, w_{Pr} N_2Pre, w_{Rc} N_3Rec)$
– $N_1PsDif = \frac{1}{n} \sum_{i=1}^{n} N_1PsDif_i$
– $N_2Pre = \exp\left(\sum_{n=1}^{N_2} w_n \log P_n\right)$
– $N_3Rec = \exp\left(\sum_{n=1}^{N_3} w_n \log R_n\right)$
• A Study of Chinese Word Segmentation Based on the Characteristics of
Chinese
– Aaron Li-Feng Han, Derek F. Wong, Lidia S. Chao, Yervant Ho, Lynn Ling
Zhu, Shuo Li. Accepted. In GSCL 2013. LNCS Vol. 8105, Volume Editors:
Iryna Gurevych, Chris Biemann and Torsten Zesch.
– German Society for Computational Linguistics (poster paper):
No word boundary in the Chinese expression
Chinese word segmentation is a difficult problem
Word segmentation is crucial to the word alignment in machine translation
We discuss the characteristics of Chinese and design optimized features
We formalize some problems and issues in Chinese word segmentation
• AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH
INFORMATION
– Aaron Li-Feng Han, Derek F. Wong, Lidia S. Chao, Yervant Ho. In TSD
2013. Plzen, Czech Republic. LNAI Vol. 8082, pp. 121-128. Volume
Editors: I. Habernal and V. Matousek. Springer-Verlag Berlin
Heidelberg.
– Text, Speech and Dialogue 2013 (oral presentation):
We explore the unsupervised machine translation evaluation method
We design hLEPOR algorithm for the first time
We explore the POS usage in unsupervised MT evaluation
Experiments are performed on English vs French, German
• Chinese Named Entity Recognition with Conditional Random Fields in the
Light of Chinese Characteristics
– Aaron Li-Feng Han, Derek Fai Wong and Lidia Sam Chao. In Proceeding
of LP&IIS. M.A. Klopotek et al. (Eds.): IIS 2013, LNCS Vol. 7912, pp. 57–
68, Warsaw, Poland. Springer-Verlag Berlin Heidelberg.
– Intelligent Information System 2013 (oral presentation):
Named entity recognition is important in IR, MT, text analysis, etc.
Chinese named entity recognition is more difficult due to no word boundary
We compare the performance of different algorithms: NB, CRF, SVM, ME
We analyze the characteristics of personal, location and organization names respectively
We show the performance of different features and select the optimized one.
• Ongoing and further works:
– The combination of translation and evaluation, tuning the
translation model using evaluation metrics
– Evaluation models from the perspective of semantics
– The exploration of unsupervised evaluation models,
extracting features from source and target languages
• Evaluation work is closely related to similarity
measurement. So far I have applied it only to MT
evaluation, but it can be extended to other
fields:
– information retrieval
– question answering
– search
– text analysis
– etc.
Q and A
Thanks for your attention!
Aaron L.-F. Han, 2013.08
• 1. Weaver, Warren.: Translation. In William Locke and A. Donald Booth, editors,
• Machine Translation of Languages: Fourteen Essays. John Wiley and Sons, New
• York, pages 15-23 (1955)
• 2. Marino B. Jose, Rafael E. Banchs, Josep M. Crego, Adria de Gispert, Patrik Lambert,
• Jose A. Fonollosa, Marta R. Costa-jussa: N-gram based machine translation,
• Computational Linguistics, Vol. 32, No. 4. pp. 527-549, MIT Press (2006)
• 3. Och, F. J.: Minimum Error Rate Training for Statistical Machine Translation. In
• Proceedings of (ACL-2003). pp. 160-167 (2003)
• 4. Su Hung-Yu and Chung-Hsien Wu: Improving Structural Statistical Machine Translation
• for Sign Language With Small Corpus Using Thematic Role Templates as
• Translation Memory, IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE
• PROCESSING, VOL. 17, NO. 7, SEPTEMBER (2009)
• 5. Xiong D., M. Zhang, H. Li: A Maximum-Entropy Segmentation Model for Statistical
• Machine Translation, Audio, Speech, and Language Processing, IEEE Transactions
• on, Volume: 19, Issue: 8, 2011 , pp. 2494- 2505 (2011)
• 6. Carl, M. and A. Way (eds): Recent Advances in Example-Based Machine Translation.
• Kluwer Academic Publishers, Dordrecht, The Netherlands (2003)
• 7. Koehn P.: Statistical Machine Translation, (University of Edinburgh), Cambridge
• University Press (2010)
• 8. Arnold, D.: Why translation is difficult for computers. In Computers and Translation:
• A translator's guide. Benjamins Translation Library (2003)
• 9. Carroll, J. B.: An experiment in evaluating the quality of translation, Pierce, J.
• (Chair), Languages and machines: computers in translation and linguistics. A report
• by the Automatic Language Processing Advisory Committee (ALPAC), Publication
• 1416, Division of Behavioral Sciences, National Academy of Sciences, National Research
• Council, page 67-75 (1966)
• 10. White, J. S., O'Connell, T. A., and O'Mara, F. E.: The ARPA MT evaluation
• methodologies: Evolution, lessons, and future approaches. In Proceedings of the
• Conference of the Association for Machine Translation in the Americas (AMTA
• 1994). pp 193-205 (1994)
• 11. Su Keh-Yih, Wu Ming-Wen and Chang Jing-Shin: A New Quantitative Quality
• Measure for Machine Translation Systems. In Proceedings of the 14th International
• Conference on Computational Linguistics, pages 433-439, Nantes, France, July
• (1992)
• 12. Tillmann C., Stephan Vogel, Hermann Ney, Arkaitz Zubiaga, and Hassan Sawaf:
• Accelerated DP Based Search For Statistical Translation. In Proceedings of the 5th
• European Conference on Speech Communication and Technology (EUROSPEECH97)
• (1997)
• 13. Papineni, K., Roukos, S., Ward, T. and Zhu, W. J.: BLEU: a method for automatic
• evaluation of machine translation. In Proceedings of the (ACL 2002), pages 311-318,
• Philadelphia, PA, USA (2002)
• 14. Doddington, G.: Automatic evaluation of machine translation quality using ngram
• co-occurrence statistics. In Proceedings of the second international conference
• on Human Language Technology Research(HLT 2002), pages 138-145, San Diego,
• California, USA (2002)
• 15. Turian, J. P., Shen, L. and Melamed, I. D.: Evaluation of machine translation
• and its evaluation. In Proceedings of MT Summit IX, pages 386-393, New Orleans,
• LA, USA (2003)
• 16. Banerjee, S. and Lavie, A.: Meteor: an automatic metric for MT evaluation with
• high levels of correlation with human judgments. In Proceedings of ACL-WMT,
• pages 65-72, Prague, Czech Republic (2005)
• 17. Denkowski, M. and Lavie, A.: Meteor 1.3: Automatic metric for reliable optimization
• and evaluation of machine translation systems. In Proceedings of (ACL-WMT),
• pages 85-91, Edinburgh, Scotland, UK (2011)
• 18. Snover, M., Dorr, B., Schwartz, R., Micciulla, L. and Makhoul, J.: A study of
• translation edit rate with targeted human annotation. In Proceedings of the Conference
• of the Association for Machine Translation in the Americas (AMTA), pages
• 223-231, Boston, USA (2006)
• 19. Chen, B. and Kuhn, R.: Amber: A modified bleu, enhanced ranking metric. In
• Proceedings of (ACL-WMT), pages 71-77, Edinburgh, Scotland, UK (2011)
• 20. Bicici, E. and Yuret, D.: RegMT system for machine translation, system combination,
• and evaluation. In Proceedings ACL-WMT, pages 323-329, Edinburgh,
• Scotland, UK (2011)
• 21. Taylor, J. Shawe and N. Cristianini: Kernel Methods for Pattern Analysis. Cambridge
• University Press 2004.
• 22. Wong, B. T-M and Kit, C.: Word choice and word position for automatic MT
• evaluation. In Workshop: MetricsMATR of the Association for Machine Translation
• in the Americas (AMTA), short paper, 3 pages, Waikiki, Hawai'I, USA (2008)
• 23. Isozaki, H., Hirao, T., Duh, K., Sudoh, K., and Tsukada, H.: Automatic evaluation
• of translation quality for distant language pairs. In Proceedings of the 2010
• Conference on (EMNLP), pages 944-952, Cambridge, MA (2010)
• 24. Talbot, D., Kazawa, H., Ichikawa, H., Katz-Brown, J., Seno, M. and Och, F.: A
• Lightweight Evaluation Framework for Machine Translation Reordering. In Proceedings
• of the Sixth (ACL-WMT), pages 12-21, Edinburgh, Scotland, UK (2011)
• 25. Song, X. and Cohn, T.: Regression and ranking based optimisation for sentence
• level MT evaluation. In Proceedings of the (ACL-WMT), pages 123-129, Edinburgh,
• Scotland, UK (2011)
• 26. Popovic, M.: Morphemes and POS tags for n-gram based evaluation metrics. In
• Proceedings of (ACL-WMT), pages 104-107, Edinburgh, Scotland, UK (2011)
• 27. Popovic, M., Vilar, D., Avramidis, E. and Burchardt, A.: Evaluation without references:
• IBM1 scores as evaluation metrics. In Proceedings of the (ACL-WMT),
• pages 99-103, Edinburgh, Scotland, UK (2011)
• 28. Petrov S., Leon Barrett, Romain Thibaux, and Dan Klein: Learning accurate,
• compact, and interpretable tree annotation. Proceedings of the 21st ACL, pages
• 433-440, Sydney, July (2006)
• 29. Callison-Burch, C., Koehn, P., Monz, C. and Zaidan, O. F.: Findings of the 2011
• Workshop on Statistical Machine Translation. In Proceedings of (ACL-WMT), pages
• 22-64, Edinburgh, Scotland, UK (2011)
• 30. Callison-Burch, C., Koehn, P., Monz, C., Peterson, K., Przybocki, M. and Zaidan,
• O. F.: Findings of the 2010 Joint Workshop on Statistical Machine Translation and
• Metrics for Machine Translation. In Proceedings of (ACL-WMT), pages 17-53, PA,
• USA (2010)
• 31. Callison-Burch, C., Koehn, P., Monz,C. and Schroeder, J.: Findings of the 2009
• Workshop on Statistical Machine Translation. In Proceedings of ACL-WMT, pages
• 1-28, Athens, Greece (2009)
• 32. Callison-Burch, C., Koehn, P., Monz,C. and Schroeder, J.: Further meta-evaluation
• of machine translation. In Proceedings of (ACL-WMT), pages 70-106, Columbus,
• Ohio, USA (2008)
• 33. Avramidis E., Popovic, M., Vilar, D., Burchardt, A.: Evaluate with Confidence
• Estimation: Machine ranking of translation outputs using grammatical features. In
• Proceedings of the Sixth Workshop on Statistical Machine Translation, Association
• for Computational Linguistics (ACL-WMT), pages 65-70, Edinburgh, Scotland, UK
• (2011)

Mais conteúdo relacionado

Mais procurados

Challenges in transfer learning in nlp
Challenges in transfer learning in nlpChallenges in transfer learning in nlp
Challenges in transfer learning in nlpLaraOlmosCamarena
 
Statistical machine translation for indian language copy
Statistical machine translation for indian language   copyStatistical machine translation for indian language   copy
Statistical machine translation for indian language copyNakul Sharma
 
Thomas Wolf "Transfer learning in NLP"
Thomas Wolf "Transfer learning in NLP"Thomas Wolf "Transfer learning in NLP"
Thomas Wolf "Transfer learning in NLP"Fwdays
 
Experiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine TranslationExperiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine Translationkhyati gupta
 
TARGETED ADVERSARIAL EXAMPLES FOR BLACK BOX AUDIO SYSTEMS - Rohan Taori, Amog...
TARGETED ADVERSARIAL EXAMPLES FOR BLACK BOX AUDIO SYSTEMS - Rohan Taori, Amog...TARGETED ADVERSARIAL EXAMPLES FOR BLACK BOX AUDIO SYSTEMS - Rohan Taori, Amog...
TARGETED ADVERSARIAL EXAMPLES FOR BLACK BOX AUDIO SYSTEMS - Rohan Taori, Amog...GeekPwn Keen
 
[slide] A Compare-Aggregate Model with Latent Clustering for Answer Selection
[slide] A Compare-Aggregate Model with Latent Clustering for Answer Selection[slide] A Compare-Aggregate Model with Latent Clustering for Answer Selection
[slide] A Compare-Aggregate Model with Latent Clustering for Answer SelectionSeoul National University
 
LEPOR: an augmented machine translation evaluation metric - Thesis PPT
LEPOR: an augmented machine translation evaluation metric - Thesis PPT LEPOR: an augmented machine translation evaluation metric - Thesis PPT
LEPOR: an augmented machine translation evaluation metric - Thesis PPT Lifeng (Aaron) Han
 
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue
Transformers to Learn Hierarchical Contexts in Multiparty DialogueTransformers to Learn Hierarchical Contexts in Multiparty Dialogue
Transformers to Learn Hierarchical Contexts in Multiparty DialogueJinho Choi
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vecananth
 
Analysis of Similarity Measures between Short Text for the NTCIR-12 Short Tex...
Analysis of Similarity Measures between Short Text for the NTCIR-12 Short Tex...Analysis of Similarity Measures between Short Text for the NTCIR-12 Short Tex...
Analysis of Similarity Measures between Short Text for the NTCIR-12 Short Tex...KozoChikai
 
Text Representations for Deep learning
Text Representations for Deep learningText Representations for Deep learning
Text Representations for Deep learningZachary S. Brown
 
Philippe Langlais - 2017 - Users and Data: The Two Neglected Children of Bili...
Philippe Langlais - 2017 - Users and Data: The Two Neglected Children of Bili...Philippe Langlais - 2017 - Users and Data: The Two Neglected Children of Bili...
Philippe Langlais - 2017 - Users and Data: The Two Neglected Children of Bili...Association for Computational Linguistics
 
Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesValentina Paunovic
 
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRAN
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRANDeep Learning for Machine Translation, by Satoshi Enoue, SYSTRAN
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRANTAUS - The Language Data Network
 

Mais procurados (19)

Deep Learning for Machine Translation
Deep Learning for Machine TranslationDeep Learning for Machine Translation
Deep Learning for Machine Translation
 
Challenges in transfer learning in nlp
Challenges in transfer learning in nlpChallenges in transfer learning in nlp
Challenges in transfer learning in nlp
 
Statistical machine translation for indian language copy
Statistical machine translation for indian language   copyStatistical machine translation for indian language   copy
Statistical machine translation for indian language copy
 
Language models
Language modelsLanguage models
Language models
 
Arabic question answering ‫‬
Arabic question answering ‫‬Arabic question answering ‫‬
Arabic question answering ‫‬
 
Thomas Wolf "Transfer learning in NLP"
Thomas Wolf "Transfer learning in NLP"Thomas Wolf "Transfer learning in NLP"
Thomas Wolf "Transfer learning in NLP"
 
Experiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine TranslationExperiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine Translation
 
TARGETED ADVERSARIAL EXAMPLES FOR BLACK BOX AUDIO SYSTEMS - Rohan Taori, Amog...
TARGETED ADVERSARIAL EXAMPLES FOR BLACK BOX AUDIO SYSTEMS - Rohan Taori, Amog...TARGETED ADVERSARIAL EXAMPLES FOR BLACK BOX AUDIO SYSTEMS - Rohan Taori, Amog...
TARGETED ADVERSARIAL EXAMPLES FOR BLACK BOX AUDIO SYSTEMS - Rohan Taori, Amog...
 
[slide] A Compare-Aggregate Model with Latent Clustering for Answer Selection
[slide] A Compare-Aggregate Model with Latent Clustering for Answer Selection[slide] A Compare-Aggregate Model with Latent Clustering for Answer Selection
[slide] A Compare-Aggregate Model with Latent Clustering for Answer Selection
 
LEPOR: an augmented machine translation evaluation metric - Thesis PPT
LEPOR: an augmented machine translation evaluation metric - Thesis PPT LEPOR: an augmented machine translation evaluation metric - Thesis PPT
LEPOR: an augmented machine translation evaluation metric - Thesis PPT
 
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue
Transformers to Learn Hierarchical Contexts in Multiparty DialogueTransformers to Learn Hierarchical Contexts in Multiparty Dialogue
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue
 
The NLP Muppets revolution!
The NLP Muppets revolution!The NLP Muppets revolution!
The NLP Muppets revolution!
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vec
 
Analysis of Similarity Measures between Short Text for the NTCIR-12 Short Tex...
Analysis of Similarity Measures between Short Text for the NTCIR-12 Short Tex...Analysis of Similarity Measures between Short Text for the NTCIR-12 Short Tex...
Analysis of Similarity Measures between Short Text for the NTCIR-12 Short Tex...
 
Text Representations for Deep learning
Text Representations for Deep learningText Representations for Deep learning
Text Representations for Deep learning
 
Philippe Langlais - 2017 - Users and Data: The Two Neglected Children of Bili...
Philippe Langlais - 2017 - Users and Data: The Two Neglected Children of Bili...Philippe Langlais - 2017 - Users and Data: The Two Neglected Children of Bili...
Philippe Langlais - 2017 - Users and Data: The Two Neglected Children of Bili...
 
Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositories
 
On using monolingual corpora in neural machine translation
On using monolingual corpora in neural machine translationOn using monolingual corpora in neural machine translation
On using monolingual corpora in neural machine translation
 
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRAN
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRANDeep Learning for Machine Translation, by Satoshi Enoue, SYSTRAN
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRAN
 

Destaque

ESR12 Hanna Bechara - EXPERT Summer School - Malaga 2015
ESR12 Hanna Bechara - EXPERT Summer School - Malaga 2015ESR12 Hanna Bechara - EXPERT Summer School - Malaga 2015
ESR12 Hanna Bechara - EXPERT Summer School - Malaga 2015RIILP
 
Challenges and opportunities in research evaluation: toward a better evaluati...
Challenges and opportunities in research evaluation: toward a better evaluati...Challenges and opportunities in research evaluation: toward a better evaluati...
Challenges and opportunities in research evaluation: toward a better evaluati...ORCID, Inc
 
TSD2013.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFORMATION
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 

CUHK intern PPT. Machine Translation Evaluation: Methods and Tools

  • 1. Aaron L.-F. Han Natural Language Processing & Portuguese-Chinese Machine Translation Laboratory University of Macau, Macau S.A.R., China 2013.08 @ CUHK, Hong Kong Email: hanlifengaaron AT gmail DOT com Homepage: http://www.linkedin.com/in/aaronhan
  • 2. • The importance of machine translation (MT) evaluation • Automatic MT evaluation metrics introduction 1. Lexical similarity 2. Linguistic features 3. Metrics combination • Designed metric: LEPOR Series 1. Motivation 2. LEPOR Metrics Description 3. Performances on international ACL-WMT corpora 4. Publications and Open source tools • Other research interests and publications
  • 3. • People of different nationalities are eager to communicate with each other – This promotes translation technology • Rapid development of machine translation – Machine translation (MT) began as early as the 1950s (Weaver, 1955) – Big progress since the 1990s due to the development of computers (storage capacity and computational power) and the enlarged bilingual corpora (Marino et al. 2006)
  • 4. • Some recent works in MT research: – Och (2003) presents MERT (Minimum Error Rate Training) for log-linear SMT – Su et al. (2009) use the Thematic Role Templates model to improve translation – Xiong et al. (2011) employ the maximum-entropy model, etc. – Data-driven methods, including example-based MT (Carl and Way, 2003) and statistical MT (Koehn, 2010), became the main approaches in the MT literature.
  • 5. • How well do MT systems perform, and are they making progress? • Difficulties of MT evaluation – Language variability means there is no single correct translation – Natural languages are highly ambiguous, and different languages do not always express the same content in the same way (Arnold, 2003)
  • 6. • Traditional manual evaluation criteria: – intelligibility (measuring how understandable the sentence is) – fidelity (measuring how much information the translated sentence retains compared to the original) by the Automatic Language Processing Advisory Committee (ALPAC) around 1966 (Carroll, 1966) – adequacy (similar to fidelity), fluency (whether the sentence is well-formed and fluent) and comprehension (improved intelligibility) by the Defense Advanced Research Projects Agency (DARPA) of the US (White et al., 1994)
  • 7. • Problems of manual evaluation: – Time-consuming – Expensive – Unrepeatable – Low agreement (Callison-Burch et al., 2011)
  • 8. 2.1 Lexical similarity 2.2 Linguistic features 2.3 Metrics combination
  • 9. • Precision-based: BLEU (Papineni et al., 2002 ACL) • Recall-based: ROUGE (Lin, 2004 WAS) • Precision and recall: Meteor (Banerjee and Lavie, 2005 ACL)
  • 10. • Word-order based: NKT_NSR (Isozaki et al., 2010 EMNLP), Port (Chen et al., 2012 ACL), ATEC (Wong et al., 2008 AMTA) • Word-alignment based: AER (Och and Ney, 2003 J.CL) • Edit-distance based: WER (Su et al., 1992 Coling), PER (Tillmann et al., 1997 EUROSPEECH), TER (Snover et al., 2006 AMTA)
  • 11. • Language model: LM-SVM (Gamon et al., 2005 EAMT) • Shallow parsing: GLEU (Mutton et al., 2007 ACL), TerrorCat (Fishel et al., 2012 WMT) • Semantic roles Named entity, morphological, synonymy, paraphrasing, discourse representation, etc.
  • 12. • MTeRater-Plus (Parton et al., 2011 WMT) – Combines BLEU, TERp (Snover et al., 2009) and Meteor (Banerjee and Lavie, 2005; Lavie and Denkowski, 2009) • MPF & WMPBleu (Popovic, 2011 WMT) – Arithmetic mean of F score and BLEU score • SIA (Liu and Gildea, 2006 ACL) – Combines the advantages of n-gram-based metrics and loose-sequence-based metrics
  • 13. • LEPOR: an automatic machine translation evaluation metric considering the Length Penalty, Precision, n-gram Position difference Penalty and Recall.
  • 14. • Weaknesses in existing metrics: – they perform well on certain language pairs but poorly on others, which we call the language-bias problem; – they consider either no linguistic information (resulting in low correlation with human judgments) or too many linguistic features (making them hard to replicate), which we call the extremism problem; – they cover an incomplete set of factors (e.g. BLEU focuses on precision only). – What to do?
  • 15. • To address some of the existing problems: – Design tunable parameters to address the language-bias problem; – Use concise or optimized linguistic features for the linguistic extremism problem; – Design augmented factors.
  • 16. • Sub-factors: • $LP = \begin{cases} \exp\left(1 - \frac{r}{c}\right) & c < r \\ 1 & c = r \\ \exp\left(1 - \frac{c}{r}\right) & c > r \end{cases}$ (1) • $r$: length of reference sentence • $c$: length of candidate (system-output) sentence
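As a minimal sketch of the length penalty in Eq. (1), the function below takes the candidate length c and the reference length r. The name `length_penalty` and the rejection of non-positive lengths are assumptions of this sketch, not something the slides specify.

```python
import math


def length_penalty(c: int, r: int) -> float:
    """Length penalty LP of Eq. (1): c = candidate length, r = reference length."""
    if c <= 0 or r <= 0:
        # Assumption: empty sentences are rejected rather than scored.
        raise ValueError("sentence lengths must be positive")
    if c < r:
        return math.exp(1 - r / c)  # candidate shorter than reference
    if c > r:
        return math.exp(1 - c / r)  # candidate longer than reference
    return 1.0                      # equal lengths: no penalty


if __name__ == "__main__":
    for c, r in [(5, 5), (4, 5), (10, 5)]:
        print(c, r, round(length_penalty(c, r), 4))
```

The penalty is at most 1 and shrinks as the two lengths diverge in either direction, so both overly short and overly long outputs are penalized.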
  • 17. • $NPosPenal = \exp(-NPD)$ (2) • $NPD = \frac{1}{Length_{output}} \sum_{i=1}^{Length_{output}} |PD_i|$ (3) • $PD_i = |MatchN_{output} - MatchN_{ref}|$ (4) • $MatchN_{output}$: position of matched token in output sentence • $MatchN_{ref}$: position of matched token in reference sentence
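Eqs. (2)-(4) can be sketched as follows, taking an already computed word alignment as input rather than implementing the alignment algorithm of Fig. 1. Two points are assumptions of this sketch and are not stated on the slide: positions are divided by the sentence length before they are compared, and unmatched output tokens contribute a position difference of zero. The name `n_pos_penalty` is hypothetical.

```python
import math
from typing import List, Tuple


def n_pos_penalty(alignment: List[Tuple[int, int]], out_len: int, ref_len: int) -> float:
    """NPosPenal = exp(-NPD), following Eqs. (2)-(4).

    alignment holds (output_position, reference_position) pairs, 1-based,
    one pair per matched output token.
    """
    if out_len <= 0 or ref_len <= 0:
        raise ValueError("sentence lengths must be positive")
    # Assumption: positions are normalised by sentence length before comparison.
    pd_sum = sum(abs(o / out_len - r / ref_len) for o, r in alignment)
    # Unmatched output tokens add nothing to the sum (assumption).
    npd = pd_sum / out_len
    return math.exp(-npd)


if __name__ == "__main__":
    print(n_pos_penalty([(1, 1), (2, 2)], 2, 2))  # same word order
    print(n_pos_penalty([(1, 2), (2, 1)], 2, 2))  # swapped word order
```

A candidate whose matched words keep the reference order gets no penalty, while reordered matches push the factor below 1.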
  • 18. Fig. 1. N-gram word alignment algorithm
  • 19. Fig. 2. Example of n-gram word alignment
  • 20. Fig. 3. Example of NPD calculation
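  • 21. • N-gram precision and recall: • $P_n = \frac{\#\text{ngram matched}}{\#\text{ngram chunks in system output}}$ (5) • $R_n = \frac{\#\text{ngram matched}}{\#\text{ngram chunks in reference}}$ (6) • $HPR = Harmonic(\alpha R_n, \beta P_n) = \frac{\alpha + \beta}{\frac{\alpha}{R_n} + \frac{\beta}{P_n}}$ (7)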
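A rough sketch of Eqs. (5)-(7) is given below. It counts matched n-grams with a clipped multiset intersection, which is a simplification standing in for the alignment-based matching of the slides, and it returns 0 when either precision or recall is 0. Both choices, and the function names, are assumptions of this sketch.

```python
from collections import Counter
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> Counter:
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def precision_recall(candidate: List[str], reference: List[str], n: int = 1) -> Tuple[float, float]:
    """P_n and R_n of Eqs. (5)-(6), using clipped n-gram counts (assumption)."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    matched = sum((cand & ref).values())  # multiset intersection
    p = matched / sum(cand.values()) if cand else 0.0
    r = matched / sum(ref.values()) if ref else 0.0
    return p, r


def weighted_harmonic(recall: float, precision: float,
                      alpha: float = 1.0, beta: float = 1.0) -> float:
    """HPR of Eq. (7): (alpha + beta) / (alpha / R + beta / P)."""
    if recall == 0.0 or precision == 0.0:
        return 0.0  # assumption: the harmonic mean is 0 if either factor is 0
    return (alpha + beta) / (alpha / recall + beta / precision)


if __name__ == "__main__":
    cand = "the cat sat on the mat".split()
    ref = "the cat is on the mat".split()
    p, r = precision_recall(cand, ref, n=2)
    print(p, r, weighted_harmonic(r, p, alpha=9.0, beta=1.0))
```

Raising alpha relative to beta makes the score track recall more closely, which is how the tunable weights can be adapted per language pair.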
  • 22. Fig. 4. Example of bigram matching
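  • 23. • LEPOR Metrics: • $LEPOR = LP \times NPosPenal \times Harmonic(\alpha R, \beta P)$ (8) • $hLEPOR = Harmonic(w_{LP} LP, w_{NPosPenal} NPosPenal, w_{HPR} HPR) = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{Factor_i}} = \frac{w_{LP} + w_{NPosPenal} + w_{HPR}}{\frac{w_{LP}}{LP} + \frac{w_{NPosPenal}}{NPosPenal} + \frac{w_{HPR}}{HPR}}$ (9) • $nLEPOR = LP \times NPosPenal \times \exp\left(\sum_{n=1}^{N} w_n \log HPR_n\right)$ (10)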
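The three combinations in Eqs. (8)-(10) differ only in how the sub-factors are merged, which the sketch below makes explicit. It takes the factor values (LP, NPosPenal, HPR) as given inputs. The default weights, the uniform n-gram weights, and the choice to return 0 when a factor is 0 are assumptions of this sketch.

```python
import math
from typing import List, Optional


def lepor(lp: float, npos: float, hpr: float) -> float:
    """Eq. (8): plain product of the three factors."""
    return lp * npos * hpr


def hlepor(lp: float, npos: float, hpr: float,
           w_lp: float = 1.0, w_npos: float = 1.0, w_hpr: float = 1.0) -> float:
    """Eq. (9): weighted harmonic mean of the three factors."""
    pairs = [(w_lp, lp), (w_npos, npos), (w_hpr, hpr)]
    if any(f == 0.0 for _, f in pairs):
        return 0.0  # assumption: a zero factor gives a zero score
    return sum(w for w, _ in pairs) / sum(w / f for w, f in pairs)


def nlepor(lp: float, npos: float, hpr_by_n: List[float],
           weights: Optional[List[float]] = None) -> float:
    """Eq. (10): product with a weighted geometric mean of HPR_1..HPR_N."""
    if weights is None:
        # Assumption: uniform weights 1/N when none are given.
        weights = [1.0 / len(hpr_by_n)] * len(hpr_by_n)
    if any(h == 0.0 for h in hpr_by_n):
        return 0.0  # log(0) is undefined; treat as zero score (assumption)
    return lp * npos * math.exp(sum(w * math.log(h) for w, h in zip(weights, hpr_by_n)))


if __name__ == "__main__":
    print(lepor(1.0, 1.0, 0.5), hlepor(1.0, 1.0, 0.5), nlepor(1.0, 1.0, [0.8, 0.2]))
```

With equal weights, the harmonic form of Eq. (9) is less dominated by a single weak factor than the product of Eq. (8): the same inputs give 0.75 instead of 0.5 in the example call.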
  • 24. • Example, employment of linguistic features: Fig. 5. Example of n-gram POS alignment Fig. 6. Example of NPD calculation
  • 25. • Combination with linguistic features: • $hLEPOR_{final} = \frac{1}{w_{hw} + w_{hp}} \left(w_{hw}\, hLEPOR_{word} + w_{hp}\, hLEPOR_{POS}\right)$ (11) • $hLEPOR_{POS}$ and $hLEPOR_{word}$ use the same algorithm on the POS sequence and the word sequence respectively.
  • 26. • With multiple references: • Select the alignment that results in the minimum NPD score. Fig. 7. N-gram alignment with multiple references
  • 27. • How reliable is the automatic metric? • Evaluation criteria for evaluation metrics: – Human judgments are currently the gold standard to approach. • Correlation with human judgments: – System-level correlation – Segment-level correlation
  • 28. • System-level correlation: • Spearman rank correlation coefficient: – $\rho_{XY} = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$ (12) – $X = \{x_1, \dots, x_n\}$, $Y = \{y_1, \dots, y_n\}$ • Pearson correlation coefficient: – $\rho_{XY} = \frac{\sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)}{\sqrt{\sum_{i=1}^{n} (x_i - \mu_x)^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \mu_y)^2}}$ (13)
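The two system-level coefficients of Eqs. (12)-(13) can be sketched in plain Python as below, where x would hold metric scores and y human scores for the same systems. The Spearman version uses the d_i formula directly and therefore assumes there are no tied values; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank 1 for the largest value; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    rank = [0] * len(values)
    for position, index in enumerate(order, start=1):
        rank[index] = position
    return rank


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Eq. (12): 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long sequences with n >= 2")
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Eq. (13): covariance divided by the product of standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long sequences with n >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    metric = [0.61, 0.55, 0.70, 0.48]
    human = [0.30, 0.35, 0.50, 0.10]
    print(spearman(metric, human), pearson(metric, human))
```

For real evaluations with ties, a library routine such as `scipy.stats.spearmanr` is the safer choice, since it handles tied ranks.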
  • 29. • Segment-level Kendall’s tau correlation: • $\tau = \frac{\text{num concordant pairs} - \text{num discordant pairs}}{\text{total pairs}}$ (14) • The segment unit can be a single sentence or a fragment that contains several sentences.
  • 30. • Performances on ACL-WMT 2011 corpora • Two translation directions: – English-to-other (Spanish, German, French, Czech) – Other-to-English • System-level metrics: – $LEPOR_A = \frac{1}{num_{sent}} \sum_{i=1}^{num_{sent}} |LEPOR_i|$ (15) – $LEPOR_B = LP \times NPosPenal \times Harmonic(\alpha R, \beta P)$ (16)
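A small sketch of Eq. (14) follows. It assumes the human judgments are available as pairwise preferences (better segment, worse segment) and that a pair is concordant when the metric scores the preferred segment strictly higher. Treating metric ties as neither concordant nor discordant is a choice made for this sketch only; the slide does not say how ties are handled.

```python
from typing import Dict, List, Tuple


def kendall_tau(human_pairs: List[Tuple[str, str]],
                metric_scores: Dict[str, float]) -> float:
    """Eq. (14): (concordant - discordant) / total pairs.

    human_pairs lists (better_id, worse_id) according to human judges;
    metric_scores maps each segment id to the metric's score.
    """
    if not human_pairs:
        raise ValueError("need at least one human-judged pair")
    concordant = discordant = 0
    for better, worse in human_pairs:
        if metric_scores[better] > metric_scores[worse]:
            concordant += 1
        elif metric_scores[better] < metric_scores[worse]:
            discordant += 1
        # Metric ties count as neither (assumption of this sketch).
    return (concordant - discordant) / len(human_pairs)


if __name__ == "__main__":
    pairs = [("a", "b"), ("b", "c"), ("a", "c")]
    scores = {"a": 0.9, "b": 0.5, "c": 0.7}
    print(kendall_tau(pairs, scores))
```

In the example, the metric agrees with the judges on two of the three pairs and disagrees on one, giving a tau of one third.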
  • 31. Table 1. system-level Spearman correlation with human judgment on WMT11 corpora
  • 32. • Performances on ACL-WMT 2013 corpora • Two translation directions: – English-to-other (Spanish, German, French, Czech, and Russian) – Other-to-English
  • 33. • System-level & sentence-level – LEPOR_v3.1: hLEPOR, nLEPOR_baseline – $hLEPOR = \frac{1}{num_{sent}} \sum_{i=1}^{num_{sent}} |hLEPOR_i|$ (17) – $hLEPOR_{final} = \frac{1}{w_{hw} + w_{hp}} \left(w_{hw}\, hLEPOR_{word} + w_{hp}\, hLEPOR_{POS}\right)$ (18) – $nLEPOR = LP \times NPosPenal \times \exp\left(\sum_{n=1}^{N} w_n \log HPR_n\right)$ (19)
  • 34. Table 2&3. system-level Pearson correlation with human judgment
  • 35. Table 4&5. segment-level Kendall’s tau correlation with human judgment
  • 36. • LEPOR: A Robust Evaluation Metric for Machine Translation with Augmented Factors – Aaron L.-F. Han, Derek F. Wong and Lidia S. Chao. Proceedings of COLING 2012: Posters, pages 441–450, Mumbai, India. • Language-independent Model for Machine Translation Evaluation with Reinforced Factors – Aaron L.-F. Han, Derek Wong, Lidia S. Chao, Liangye He, Yi Lu, Junwen Xing, Xiaodong Zeng. Proceedings of MT Summit 2013. Nice, France.
  • 37.
  • 38.
  • 39. • Language independent MT evaluation-LEPOR: https://github.com/aaronlifenghan/aaron-project-lepor • MT evaluation with linguistic features-hLEPOR: https://github.com/aaronlifenghan/aaron-project-hlepor • English-French Phrase tagset mapping and application in unsupervised MT evaluation-HPPR: https://github.com/aaronlifenghan/aaron-project-hppr • Unsupervised English-Spanish MT evaluation-EBLEU: https://github.com/aaronlifenghan/aaron-project-ebleu • Projects Homepage: https://github.com/aaronlifenghan
  • 40. • My research interests: – Natural Language Processing – Signal Processing – Machine Learning – Artificial Intelligence – Pattern Recognition • My past research works: – Machine Translation Evaluation, Word Segmentation, Entity Recognition, Multilingual Treebanks
  • 41. • Other publications: • A Description of Tunable Machine Translation Evaluation Systems in WMT13 Metrics Task – Aaron Li-Feng Han, Derek Wong, Lidia S. Chao, Yi Lu, Yervant Ho, Yiming Wang, Zhou jiaji. Proceedings of the ACL 2013 EIGHTH WORKSHOP ON STATISTICAL MACHINE TRANSLATION (ACL-WMT 2013), 8-9 August 2013. Sofia, Bulgaria. – ACL-WMT13 METRICS TASK: Our metrics are language independent. English-vs-other (French, Spanish, Czech, German, Russian). Can perform at both system level and segment level. The official results show our metrics have advantages compared to others.
  • 42. • Quality Estimation for Machine Translation Using the Joint Method of Evaluation Criteria and Statistical Modeling – Aaron Li-Feng Han, Yi Lu, Derek F. Wong, Lidia S. Chao, Yervant Ho, Anson Xing. Proceedings of the ACL 2013 EIGHTH WORKSHOP ON STATISTICAL MACHINE TRANSLATION (ACL-WMT 2013), 8-9 August 2013. Sofia, Bulgaria. – ACL-WMT13 QUALITY ESTIMATION TASK (no reference translation): Task 1.1: sentence-level EN-ES quality estimation. Task 1.2: system selection, EN-ES, EN-DE, new. Task 2: word-level QE, EN-ES, binary classification, multi-class classification, new. We design a novel EN-ES POS tagset mapping and the metric EBLEU in task 1.1. We explore Naïve Bayes and Support Vector Machine in task 1.2. We achieve the highest F1 score in task 2 using Conditional Random Field.
  • 43. Designed POS tagset mapping of Spanish (Tree tagger) to universal tagset (Petrov et al., 2012)
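  • 44. • $EBLEU = 1 - MLP \times \exp\left(\sum w_n \log(H(\alpha R_n, \beta P_n))\right)$ – $MLP = \begin{cases} e^{1 - \frac{s}{h}} & \text{if } h < s \\ e^{1 - \frac{h}{s}} & \text{if } h \ge s \end{cases}$ – $P_n = \frac{\#\text{common ngram chunk}}{\#\text{ngram chunk in target sentence}}$ – $R_n = \frac{\#\text{common ngram chunk}}{\#\text{ngram chunk in source sentence}}$
  • 45. • Bayes’ rule: – $p(c_i \mid x_1, x_2, \dots, x_n) = \frac{p(x_1, x_2, \dots, x_n \mid c_i)\, p(c_i)}{p(x_1, x_2, \dots, x_n)}$ • SVM, find the point with smallest margin to the hyperplane and then maximize this margin: – $\arg\max_{w, b} \left\{ \min_n \left( label \cdot (w^T x + b) \right) \cdot \frac{1}{\|w\|} \right\}$ • Conditional random fields: – $P(Y \mid X) \propto \exp\left( \sum_{e \in E, k} \lambda_k f_k(e, Y|_e, X) + \sum_{v \in V, k} \mu_k g_k(v, Y|_v, X) \right)$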
  • 46. Designed features for CRF & NB algorithms
  • 47. ACL-WMT13 word level quality estimation task results
  • 48.
  • 49. • Phrase Tagset Mapping for French and English Treebanks and Its Application in Machine Translation Evaluation – Aaron Li-Feng Han, Derek F. Wong, Lidia S. Chao, Yervant Ho, Shuo Li, Lynn Ling Zhu. In GSCL 2013. LNCS Vol. 8105, Volume Editors: Iryna Gurevych, Chris Biemann and Torsten Zesch. – German Society for Computational Linguistics (oral presentation): To facilitate future research in unsupervised induction of syntactic structures, we design a French-English phrase tagset mapping. We propose a universal phrase tagset. Phrase tags are extracted from the French Treebank and the English Penn Treebank. We explore the use of the proposed mapping in unsupervised MT evaluation.
  • 50. Designed phrase tagset mapping for English and French
  • 51. Evaluation based on parsing information from syntactic treebanks
  • 52. Convert the word sequence into universal phrase tagset sequence
  • 53. • $HPPR = Har(w_{Ps} N_1PsDif, w_{Pr} N_2Pre, w_{Rc} N_3Rec)$ – $N_1PsDif = \frac{1}{n} \sum_{i} N_1PsDif_i$ – $N_2Pre = \exp\left(\sum_{n=1}^{N_2} w_n \log P_n\right)$ – $N_3Rec = \exp\left(\sum_{n=1}^{N_3} w_n \log R_n\right)$
  • 54. • A Study of Chinese Word Segmentation Based on the Characteristics of Chinese – Aaron Li-Feng Han, Derek F. Wong, Lidia S. Chao, Yervant Ho, Lynn Ling Zhu, Shuo Li. Accepted. In GSCL 2013. LNCS Vol. 8105, Volume Editors: Iryna Gurevych, Chris Biemann and Torsten Zesch. – German Society for Computational Linguistics (poster paper): There is no word boundary in Chinese expressions. Chinese word segmentation is a difficult problem. Word segmentation is crucial to word alignment in machine translation. We discuss the characteristics of Chinese and design optimized features. We formalize some problems and issues in Chinese word segmentation.
  • 55.
  • 56. • AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFORMATION – Aaron Li-Feng Han, Derek F. Wong, Lidia S. Chao, Yervant Ho. In TSD 2013. Plzen, Czech Republic. LNAI Vol. 8082, pp. 121-128. Volume Editors: I. Habernal and V. Matousek. Springer-Verlag Berlin Heidelberg. – Text, Speech and Dialogue 2013 (oral presentation): We explore the unsupervised machine translation evaluation method. We design the hLEPOR algorithm for the first time. We explore POS usage in unsupervised MT evaluation. Experiments are performed on English vs French and German.
  • 57.
  • 58. • Chinese Named Entity Recognition with Conditional Random Fields in the Light of Chinese Characteristics – Aaron Li-Feng Han, Derek Fai Wong and Lidia Sam Chao. In Proceedings of LP&IIS. M.A. Klopotek et al. (Eds.): IIS 2013, LNCS Vol. 7912, pp. 57-68, Warsaw, Poland. Springer-Verlag Berlin Heidelberg. – Intelligent Information System 2013 (oral presentation): Named entity recognition is important in IR, MT, text analysis, etc. Chinese named entity recognition is more difficult because there are no word boundaries. We compare the performance of different algorithms: NB, CRF, SVM, ME. We analyze the characteristics of personal, location and organization names respectively. We show the performance of different features and select the optimized one.
  • 59.
  • 60. • Ongoing and further works: – The combination of translation and evaluation, tuning the translation model using evaluation metrics – Evaluation models from the perspective of semantics – The exploration of unsupervised evaluation models, extracting features from source and target languages
  • 61. • Evaluation work is closely related to similarity measurement. So far I have applied it only to MT evaluation, but it can be extended to other areas: – information retrieval – question answering – search – text analysis – etc.
  • 62. Q and A Thanks for your attention! Aaron L.-F. Han, 2013.08
  • 63. • 1. Weaver, Warren.: Translation. In William Locke and A. Donald Booth, editors, Machine Translation of Languages: Fourteen Essays. John Wiley and Sons, New York, pages 15-23 (1955) • 2. Marino B. Jose, Rafael E. Banchs, Josep M. Crego, Adria de Gispert, Patrik Lambert, Jose A. Fonollosa, Marta R. Costa-jussa: N-gram based machine translation, Computational Linguistics, Vol. 32, No. 4. pp. 527-549, MIT Press (2006) • 3. Och, F. J.: Minimum Error Rate Training for Statistical Machine Translation. In Proceedings of (ACL-2003). pp. 160-167 (2003) • 4. Su Hung-Yu and Chung-Hsien Wu: Improving Structural Statistical Machine Translation for Sign Language With Small Corpus Using Thematic Role Templates as Translation Memory, IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 7, SEPTEMBER (2009) • 5. Xiong D., M. Zhang, H. Li: A Maximum-Entropy Segmentation Model for Statistical Machine Translation, Audio, Speech, and Language Processing, IEEE Transactions on, Volume: 19, Issue: 8, 2011, pp. 2494-2505 (2011) • 6. Carl, M. and A. Way (eds): Recent Advances in Example-Based Machine Translation. Kluwer Academic Publishers, Dordrecht, The Netherlands (2003)
  • 64. • 7. Koehn P.: Statistical Machine Translation, (University of Edinburgh), Cambridge University Press (2010) • 8. Arnold, D.: Why translation is difficult for computers. In Computers and Translation: A translator's guide. Benjamins Translation Library (2003) • 9. Carroll, J. B.: An experiment in evaluating the quality of translation, Pierce, J. (Chair), Languages and machines: computers in translation and linguistics. A report by the Automatic Language Processing Advisory Committee (ALPAC), Publication 1416, Division of Behavioral Sciences, National Academy of Sciences, National Research Council, pages 67-75 (1966) • 10. White, J. S., O'Connell, T. A., and O'Mara, F. E.: The ARPA MT evaluation methodologies: Evolution, lessons, and future approaches. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA 1994). pp 193-205 (1994) • 11. Su Keh-Yih, Wu Ming-Wen and Chang Jing-Shin: A New Quantitative Quality Measure for Machine Translation Systems. In Proceedings of the 14th International Conference on Computational Linguistics, pages 433-439, Nantes, France, July (1992)
  • 65. • 12. Tillmann C., Stephan Vogel, Hermann Ney, Arkaitz Zubiaga, and Hassan Sawaf: Accelerated DP Based Search For Statistical Translation. In Proceedings of the 5th European Conference on Speech Communication and Technology (EUROSPEECH97) (1997) • 13. Papineni, K., Roukos, S., Ward, T. and Zhu, W. J.: BLEU: a method for automatic evaluation of machine translation. In Proceedings of the (ACL 2002), pages 311-318, Philadelphia, PA, USA (2002) • 14. Doddington, G.: Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In Proceedings of the second international conference on Human Language Technology Research (HLT 2002), pages 138-145, San Diego, California, USA (2002) • 15. Turian, J. P., Shen, L. and Melamed, I. D.: Evaluation of machine translation and its evaluation. In Proceedings of MT Summit IX, pages 386-393, New Orleans, LA, USA (2003) • 16. Banerjee, S. and Lavie, A.: Meteor: an automatic metric for MT evaluation with high levels of correlation with human judgments. In Proceedings of ACL-WMT, pages 65-72, Prague, Czech Republic (2005)
  • 66. • 17. Denkowski, M. and Lavie, A.: Meteor 1.3: Automatic metric for reliable optimization and evaluation of machine translation systems. In Proceedings of (ACL-WMT), pages 85-91, Edinburgh, Scotland, UK (2011) • 18. Snover, M., Dorr, B., Schwartz, R., Micciulla, L. and Makhoul, J.: A study of translation edit rate with targeted human annotation. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA), pages 223-231, Boston, USA (2006) • 19. Chen, B. and Kuhn, R.: Amber: A modified bleu, enhanced ranking metric. In Proceedings of (ACL-WMT), pages 71-77, Edinburgh, Scotland, UK (2011) • 20. Bicici, E. and Yuret, D.: RegMT system for machine translation, system combination, and evaluation. In Proceedings ACL-WMT, pages 323-329, Edinburgh, Scotland, UK (2011) • 21. Taylor, J. Shawe and N. Cristianini: Kernel Methods for Pattern Analysis. Cambridge University Press 2004. • 22. Wong, B. T-M and Kit, C.: Word choice and word position for automatic MT evaluation. In Workshop: MetricsMATR of the Association for Machine Translation in the Americas (AMTA), short paper, 3 pages, Waikiki, Hawai'i, USA (2008)
  • 67. • 23. Isozaki, H., Hirao, T., Duh, K., Sudoh, K., and Tsukada, H.: Automatic evaluation of translation quality for distant language pairs. In Proceedings of the 2010 Conference on (EMNLP), pages 944-952, Cambridge, MA (2010) • 24. Talbot, D., Kazawa, H., Ichikawa, H., Katz-Brown, J., Seno, M. and Och, F.: A Lightweight Evaluation Framework for Machine Translation Reordering. In Proceedings of the Sixth (ACL-WMT), pages 12-21, Edinburgh, Scotland, UK (2011) • 25. Song, X. and Cohn, T.: Regression and ranking based optimisation for sentence level MT evaluation. In Proceedings of the (ACL-WMT), pages 123-129, Edinburgh, Scotland, UK (2011) • 26. Popovic, M.: Morphemes and POS tags for n-gram based evaluation metrics. In Proceedings of (ACL-WMT), pages 104-107, Edinburgh, Scotland, UK (2011) • 27. Popovic, M., Vilar, D., Avramidis, E. and Burchardt, A.: Evaluation without references: IBM1 scores as evaluation metrics. In Proceedings of the (ACL-WMT), pages 99-103, Edinburgh, Scotland, UK (2011) • 28. Petrov S., Leon Barrett, Romain Thibaux, and Dan Klein: Learning accurate, compact, and interpretable tree annotation. Proceedings of the 21st ACL, pages 433-440, Sydney, July (2006)
  • 68. • 29. Callison-Burch, C., Koehn, P., Monz, C. and Zaidan, O. F.: Findings of the 2011 Workshop on Statistical Machine Translation. In Proceedings of (ACL-WMT), pages 22-64, Edinburgh, Scotland, UK (2011) • 30. Callison-Burch, C., Koehn, P., Monz, C., Peterson, K., Przybocki, M. and Zaidan, O. F.: Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation. In Proceedings of (ACL-WMT), pages 17-53, PA, USA (2010) • 31. Callison-Burch, C., Koehn, P., Monz, C. and Schroeder, J.: Findings of the 2009 Workshop on Statistical Machine Translation. In Proceedings of ACL-WMT, pages 1-28, Athens, Greece (2009) • 32. Callison-Burch, C., Koehn, P., Monz, C. and Schroeder, J.: Further meta-evaluation of machine translation. In Proceedings of (ACL-WMT), pages 70-106, Columbus, Ohio, USA (2008) • 33. Avramidis E., Popovic, M., Vilar, D., Burchardt, A.: Evaluate with Confidence Estimation: Machine ranking of translation outputs using grammatical features. In Proceedings of the Sixth Workshop on Statistical Machine Translation, Association for Computational Linguistics (ACL-WMT), pages 65-70, Edinburgh, Scotland, UK (2011)