Post-Editing of Machine Translation (PEMT)
Developing Requirements and Compensation Schemes
Overview
 MT 101
 Estimation of MT quality
 Definition of post-editing
 PEMT Requirements
 PEMT skills
 Pricing and compensation
© 2013 Luigi Muzii Developing PEMT Requirements and Compensation Schemes 2
Caveat emptor
 Your presenter is an MT enthusiast
 Using MT since 1990
 No bias
 MT is here to stay
 No special knowledge
 Could sound trivial
• Common sense
Methods
 Rule-based Machine Translation (RbMT)
 Transfer
 Interlingua
 Data-driven (stochastic) machine translation
 Example-based Machine Translation (EbMT)
 Statistical Machine Translation (SMT)
 Both methods can work for projects where MT is suitable
 Most commercial systems are now hybrids of some sort
 Post processing and cleanup
Rule-based Machine Translation (RbMT)
 Analytical approach
 Grammatical representation of language
• Morphological analysis
– Inflection and conjugation of words
• Syntactic analysis
– Sequence of words and sentence structure
• Semantic analysis
– Meaning of words in context
 Heavy dependence on bilingual dictionaries
RbMT Issues
 Disambiguation
 Plain, correct, and consistent source
 Constant refinement of rules
 Accurate and comprehensive dictionaries
Example-based Machine Translation (EbMT)
 Bilingual corpus
 Body of reference for similarities
• Combination of segments
– Best approximation
 Fuzzy matching algorithms
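As a rough illustration of the fuzzy matching idea above, the sketch below ranks stored segments by their similarity to a new source segment. It uses Python's standard difflib as a stand-in similarity measure. The corpus, the function name best_match and the 0.7 threshold are hypothetical; real EbMT and translation memory systems use their own algorithms.

```python
from difflib import SequenceMatcher

# Hypothetical bilingual corpus: (source segment, target segment) pairs.
CORPUS = [
    ("Press the power button.", "Premere il pulsante di accensione."),
    ("Press the reset button.", "Premere il pulsante di ripristino."),
    ("Close the cover.", "Chiudere il coperchio."),
]

def best_match(segment, corpus=CORPUS, threshold=0.7):
    """Return (score, source, target) for the most similar stored segment,
    or None if no stored segment reaches the threshold."""
    scored = [
        (SequenceMatcher(None, segment.lower(), src.lower()).ratio(), src, tgt)
        for src, tgt in corpus
    ]
    score, src, tgt = max(scored)
    return (score, src, tgt) if score >= threshold else None
```

A call such as best_match("Press the start button.") would return the closest stored pair together with its similarity score, which a system could then adapt or combine with other segments.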
Statistical Machine Translation (SMT)
 Empirical strategy
 Statistical probability
• Statistical assessment of word and phrase positioning within segments from the corpus
– Brute force computing
Two Models for Learning Data
 Translation model
 Words and word sequences in SL to find the most likely corresponding words in TL
 Target-language model
 The most likely way in which corresponding TL words will be combined
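A minimal sketch of how the two models can work together, assuming the classic SMT setup in which each candidate is scored by the product of a translation-model probability and a target-language-model probability. All probabilities and the function name best_candidate are invented for illustration; a real system estimates these values from corpora.

```python
import math

# Hypothetical toy probabilities, invented for illustration only.
# Translation model: how well each candidate corresponds to the source.
TRANSLATION_MODEL = {
    "the house is small": 0.30,
    "small is the house": 0.30,
    "the home is little": 0.20,
}
# Target-language model: how likely each candidate is as a TL word sequence.
LANGUAGE_MODEL = {
    "the house is small": 0.020,
    "small is the house": 0.001,
    "the home is little": 0.004,
}

def best_candidate(candidates):
    """Pick the candidate with the highest combined log-probability."""
    def score(candidate):
        return (math.log(TRANSLATION_MODEL[candidate])
                + math.log(LANGUAGE_MODEL[candidate]))
    return max(candidates, key=score)
```

Here the first two candidates tie on the translation model, and the target-language model breaks the tie in favor of the more fluent word order.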
EbMT & SMT Issues
 Parallel corpora
 Analysis is challenging
• Problems with large corpora
 Translation memories instead of corpora
 Segmentation for accuracy and matching
• Problems with alignment
Ambiguity in SMT
 Translation models handle word sequences
 Likelihood of reproducing a wrong interpretation if it is in the model
 Collocations (dependencies between words) could be hard to capture
 Target-language models
SWOT Analysis for MT
• Speed
• Volumes
• Consistency
• Complexity
• Error incidence
• Amount of skills, expertise and understanding needed
• Not commonplace
• Least engaging, highly rewarding, non-binding content
• Areas for improvement
• Training and customization
• Language data and rules optimization
• Writing
• Controlled languages
• Reliability
• Problematic ROI
Modes of Use
[Figure: modes of use. Labels: Unrestricted texts, High quality, Fully automatic, Restricted input, Low quality, Interactive, Impractical]
Estimating MT quality
 Biased baseline: MT is always bad
 Most translators do not know much about MT
• Same old jokes about silly mistakes
• A mixture of ignorance and fear
 Automatic metrics
 Hard to interpret
 PEMT effort
 Annotation guidelines
• Assigning 1-5 scores
Automatic Metrics
 BLEU
 No indication of accuracy
 0 ≤ P ≤ 1
• 1 = professional human translation
• 0.65 = human quality
• A score increase does not necessarily mean improved translation quality
 NIST
 Based on BLEU, with some alterations
 METEOR (Metric for Evaluation of Translation with Explicit ORdering)
 Designed to address weaknesses of BLEU
• Harmonic mean of unigram precision and recall
 F-Measure (F1 Score or F-Score)
 A measure of a test’s accuracy used in machine learning
 Harmonic mean of precision and recall
 WER (Word Error Rate)
 Most often used in speech recognition
• P ≥ 0 (0 = no errors; can exceed 1)
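To make two of these metrics concrete, here is a minimal sketch of WER and the F-measure. It assumes whitespace tokenization and a non-empty reference; the function names are hypothetical, and production evaluation tools add normalization steps not shown here.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.
    Assumes a non-empty reference and whitespace tokenization."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because insertions count as errors, a hypothesis much longer than its reference can yield a WER above 1, so lower values are better and 0 means an exact match.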
Prerequisites
 Post-editing throughput must be higher than translation throughput
 Post-editing must be less keyboard intensive than translation
 Post-editing must be less cognitively demanding than translation
PEMT Effort
 System
 RbMT
• Dictionary
• Rules
• Customizability
 Data‐driven
• Suitability of input
• Training data
– Volume
– Domain
 Product captivity
• Different technologies that can or cannot be used within more than just one tool
 Language pair
 Outcome in one language combination cannot be compared with that in another
 Text type
 Domain
PEMT Issues
 System
 RbMT
• Incorrect word/term
• Incorrect attachment
• Meaning not disambiguated
 Data‐driven
• Missing words
• Capitalization
• Punctuation
• Fluency inconsistency
Post-editing
 Gist
 Raw MT
• Disposable (volatile UGC)
• Validation of automatic evaluation
 Light
 Making the translation understandable
• Ignoring all stylistic issues
• Correcting mechanical errors (capitalization and punctuation)
• Replacing unknown words (misspelled in ST)
• Removing redundant words
 Heavy
 Making the translation stylistically appropriate
• Fixing machine‐induced meaning distortion
• Making grammatical and syntactic adjustments
• Checking terminology (new terms)
• Partially or completely rewriting sentences
– Adjusting for target language fluency
Degrees of Post-editing
 User requirements
 Quality expectations
 Perishability
 Volume
 Text function
 Turn‐around time
PEMT Specs
 Type of MT
 In-house vs. outsourced MT
 Type of MT output
 Generic, untrained MT output
 Trained MT output
 Quality guidelines and index for raw translation
 ≥ 40% reusable
• BLEU
– Acceptable: 0.3 to 0.5
– Good: 0.5 to 1
 Request a sample
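The BLEU bands quoted above can be turned into a simple check on a raw MT sample. This is only a sketch: the function name rate_raw_mt and the "below threshold" label are hypothetical, and assigning the shared boundary value 0.5 to "good" is an assumption, since the slide lists it in both bands.

```python
def rate_raw_mt(bleu):
    """Map a BLEU score in [0, 1] to the bands quoted on the slide."""
    if not 0.0 <= bleu <= 1.0:
        raise ValueError("BLEU must be between 0 and 1")
    if bleu >= 0.5:  # assumption: the shared boundary 0.5 counts as good
        return "good"
    if bleu >= 0.3:
        return "acceptable"
    return "below threshold"
```

A score on its own says little about a particular job, which is why the slide also recommends requesting a sample.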
PEMT Specs
 Rationale for MT
 Increased throughput
• More languages
• More content
 Faster turnaround time
 Reduced cost
 Accuracy and consistency
 Target consumer
 Quality of the finished product
 Reprocessing
 Publication
 Amount and type of PEMT
 Gist
 Light
 Heavy
 Participation in ongoing training of engine
Warning
 MT engines are not all equal
 Raw output quality is not consistent from system to system and language to language
 MT error patterns are not consistent from segment to segment
PEMT Instructions
 Clear and concise
 Tools
 Stick to style guide
• Language specific conventions
• Country/region standards
• Grammar, syntax and orthographic conventions
 Retain as much raw translation as possible
 Don’t hesitate too long over a problem
 Don’t worry about style
 Don’t embark on time‐consuming research
 Make changes only where absolutely necessary
• Nonsense
• Wrong words (possibly misspelled in source text)
• Missing words
• Punctuation, capitalization
• Inflection
• Gender
• Word order
• Formatting
PEMT Instructions
 Example for heavy PEMT project
 The expected quality is publishable quality. This means no deletions or omissions in the text, full accuracy and no mistranslations with regard to the source text, compliance with the grammar and spelling rules of the target language, and compliance with the terminology in the glossary provided.
 Make as few edits as possible
• Just correct errors
– Follow the glossary
– If different terminology is found, replace it with the one in the glossary
• Do not introduce preferential changes
• Do not rewrite text, except to correct nonsense
• Do not try to ‘improve’ the text
PEMT skills
 Working knowledge of SL
 Excellent command of TL
 Specialized domain knowledge
 Ability to comply with guidelines
 Unbiased attitude towards MT
Establish your own PEMT academy
Pricing and Compensation
 It will take a year or two more to build out a widely accepted and dominant compensation model
 The final model will be tied to productivity
Pricing and Compensation
 Common approaches
 Paying as for high fuzzy matches
• 85%-94%
• PEMT and post-editing of fuzzy matches are deeply different
– Fuzzy matches are inherently correct segments
 Minor changes (possibly a term or two)
– MT is not necessarily inherently correct
 Even ‘Light PEMT’ could turn out to be heavy
 Paying a time-based fee
Always
 Provide candidate post-editors with MT samples before contracting
 Agree on throughput rates
• 450 to 750 words/hour
 Run a pilot project
 For every domain
 For any new combination
Compensation Grid
 General
 Method
 Type of output
• Generic, trained or untrained
 Quality of output
 Number of references
 Quality expectations
• Threshold
 File formats
• Tagging
 Time-based fee
 Productivity rate
• Productivity differs by post-editor
 Time for filling in QA forms
• For ongoing training of engine
 Use a spreadsheet to track time
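The time-based fee in the grid above comes down to simple arithmetic, sketched below as it might be set up in a spreadsheet. The function name pemt_fee, the hourly rate of 30 and the one hour of QA-form time are hypothetical; only the 450 to 750 words/hour throughput range comes from the slides.

```python
def pemt_fee(word_count, words_per_hour, hourly_rate, qa_hours=0.0):
    """Time-based fee: estimated editing hours plus QA-form time,
    multiplied by the hourly rate."""
    if words_per_hour <= 0:
        raise ValueError("words_per_hour must be positive")
    hours = word_count / words_per_hour + qa_hours
    return round(hours * hourly_rate, 2)

# Fee range for a 9,000-word job at the agreed 450-750 words/hour,
# with a hypothetical rate of 30 per hour and 1 hour of QA forms.
low = pemt_fee(9000, 750, 30.0, qa_hours=1.0)   # fastest agreed throughput
high = pemt_fee(9000, 450, 30.0, qa_hours=1.0)  # slowest agreed throughput
```

Since productivity differs by post-editor, tracking the actual hours per job lets both parties check the agreed throughput against what was measured.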
© 2013 Luigi Muzii Developing PEMT Requirements and Compensation Schemes 30

Mais conteúdo relacionado

Mais procurados

Private Equity Powerpoint Presentation Slides
Private Equity Powerpoint Presentation SlidesPrivate Equity Powerpoint Presentation Slides
Private Equity Powerpoint Presentation SlidesSlideTeam
 
Product or Service Development Process
Product or Service Development Process Product or Service Development Process
Product or Service Development Process Jawwad Jaskani
 
9 Borland Solo Pruebas 2009
9 Borland Solo Pruebas 20099 Borland Solo Pruebas 2009
9 Borland Solo Pruebas 2009Pepe
 
2010 s1-operations managementsession1intro
2010 s1-operations managementsession1intro2010 s1-operations managementsession1intro
2010 s1-operations managementsession1intro1STOUTSOURCE LTD
 
Co-Developing and Implementing a Content Strategy Focued on User Experience R...
Co-Developing and Implementing a Content Strategy Focued on User Experience R...Co-Developing and Implementing a Content Strategy Focued on User Experience R...
Co-Developing and Implementing a Content Strategy Focued on User Experience R...LavaCon
 
Operations Management Processes and Supply Chains 12th Edition Krajewski Test...
Operations Management Processes and Supply Chains 12th Edition Krajewski Test...Operations Management Processes and Supply Chains 12th Edition Krajewski Test...
Operations Management Processes and Supply Chains 12th Edition Krajewski Test...kalotogub
 
Supply chain at hewlett packard
Supply chain at hewlett packardSupply chain at hewlett packard
Supply chain at hewlett packardGaurav Singh
 
Capability Maturity Model PowerPoint Presentation Slides
Capability Maturity Model PowerPoint Presentation Slides Capability Maturity Model PowerPoint Presentation Slides
Capability Maturity Model PowerPoint Presentation Slides SlideTeam
 
Services marketing
Services marketingServices marketing
Services marketingArun Gupta
 
Chap 8 service quality ppt
Chap 8 service quality ppt Chap 8 service quality ppt
Chap 8 service quality ppt kesahv
 
TestEngineer_BasilUmmerKutty
TestEngineer_BasilUmmerKuttyTestEngineer_BasilUmmerKutty
TestEngineer_BasilUmmerKuttyBasil Ummer Kutty
 
Optimizing Prices of Outsourced Services | How to make benchmarking work for ...
Optimizing Prices of Outsourced Services | How to make benchmarking work for ...Optimizing Prices of Outsourced Services | How to make benchmarking work for ...
Optimizing Prices of Outsourced Services | How to make benchmarking work for ...Everest Group
 

Mais procurados (20)

Private Equity Powerpoint Presentation Slides
Private Equity Powerpoint Presentation SlidesPrivate Equity Powerpoint Presentation Slides
Private Equity Powerpoint Presentation Slides
 
Er pvs bestofbreed
Er pvs bestofbreedEr pvs bestofbreed
Er pvs bestofbreed
 
Er pvs bestofbreed
Er pvs bestofbreedEr pvs bestofbreed
Er pvs bestofbreed
 
Product or Service Development Process
Product or Service Development Process Product or Service Development Process
Product or Service Development Process
 
Quality analyst job description
Quality analyst job descriptionQuality analyst job description
Quality analyst job description
 
Offshore Billing Rate Analysis | Offshire Insights
Offshore Billing Rate Analysis | Offshire InsightsOffshore Billing Rate Analysis | Offshire Insights
Offshore Billing Rate Analysis | Offshire Insights
 
9 Borland Solo Pruebas 2009
9 Borland Solo Pruebas 20099 Borland Solo Pruebas 2009
9 Borland Solo Pruebas 2009
 
Chap03
Chap03Chap03
Chap03
 
2010 s1-operations managementsession1intro
2010 s1-operations managementsession1intro2010 s1-operations managementsession1intro
2010 s1-operations managementsession1intro
 
Chap12
Chap12Chap12
Chap12
 
Co-Developing and Implementing a Content Strategy Focued on User Experience R...
Co-Developing and Implementing a Content Strategy Focued on User Experience R...Co-Developing and Implementing a Content Strategy Focued on User Experience R...
Co-Developing and Implementing a Content Strategy Focued on User Experience R...
 
Naresh
NareshNaresh
Naresh
 
Operations Management Processes and Supply Chains 12th Edition Krajewski Test...
Operations Management Processes and Supply Chains 12th Edition Krajewski Test...Operations Management Processes and Supply Chains 12th Edition Krajewski Test...
Operations Management Processes and Supply Chains 12th Edition Krajewski Test...
 
Supply chain at hewlett packard
Supply chain at hewlett packardSupply chain at hewlett packard
Supply chain at hewlett packard
 
Capability Maturity Model PowerPoint Presentation Slides
Capability Maturity Model PowerPoint Presentation Slides Capability Maturity Model PowerPoint Presentation Slides
Capability Maturity Model PowerPoint Presentation Slides
 
Services marketing
Services marketingServices marketing
Services marketing
 
Chap 8 service quality ppt
Chap 8 service quality ppt Chap 8 service quality ppt
Chap 8 service quality ppt
 
TestEngineer_BasilUmmerKutty
TestEngineer_BasilUmmerKuttyTestEngineer_BasilUmmerKutty
TestEngineer_BasilUmmerKutty
 
Optimizing Prices of Outsourced Services | How to make benchmarking work for ...
Optimizing Prices of Outsourced Services | How to make benchmarking work for ...Optimizing Prices of Outsourced Services | How to make benchmarking work for ...
Optimizing Prices of Outsourced Services | How to make benchmarking work for ...
 
Agile Journey to agile
Agile   Journey to agileAgile   Journey to agile
Agile Journey to agile
 

Semelhante a Post-Editing of Machine Translation: Developing Requirements and Compensation Schemes

What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyIconic Translation Machines
 
User Empowered Machine Translation. Dion Wiggins, Asia Online
User Empowered Machine Translation. Dion Wiggins, Asia OnlineUser Empowered Machine Translation. Dion Wiggins, Asia Online
User Empowered Machine Translation. Dion Wiggins, Asia OnlineABBYY Language Serivces
 
Getting the Most from MT + PE
Getting the Most from MT + PEGetting the Most from MT + PE
Getting the Most from MT + PELuigi Muzii
 
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize
 
Thinking Strategically About Content Destined for Machine Translation
Thinking Strategically About Content Destined for Machine TranslationThinking Strategically About Content Destined for Machine Translation
Thinking Strategically About Content Destined for Machine TranslationContent Rules, Inc.
 
Improving Translator Productivity with MT: A Patent Translation Case Study
Improving Translator Productivity with MT: A Patent Translation Case StudyImproving Translator Productivity with MT: A Patent Translation Case Study
Improving Translator Productivity with MT: A Patent Translation Case StudyIconic Translation Machines
 
Lexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLoriThicke
 
Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...kantanmt
 
Seven components of content strategy global swisher
Seven components of content strategy global swisherSeven components of content strategy global swisher
Seven components of content strategy global swisherVal Swisher
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...SDL
 
The Seven Components of a Global Content Strategy
The Seven Components of a Global Content StrategyThe Seven Components of a Global Content Strategy
The Seven Components of a Global Content StrategyContent Rules, Inc.
 
Trends In Technology: Worldware 2010
Trends In Technology:  Worldware 2010Trends In Technology:  Worldware 2010
Trends In Technology: Worldware 2010LoriThicke
 
Language Quality Management: Models, Measures, Methodologies
Language Quality Management: Models, Measures, Methodologies Language Quality Management: Models, Measures, Methodologies
Language Quality Management: Models, Measures, Methodologies Sajan
 
Machine Translation: Latest Innovations and their Impact on Commercial Transl...
Machine Translation: Latest Innovations and their Impact on Commercial Transl...Machine Translation: Latest Innovations and their Impact on Commercial Transl...
Machine Translation: Latest Innovations and their Impact on Commercial Transl...SDL
 
GENEO Software Overview 2015
GENEO Software Overview 2015GENEO Software Overview 2015
GENEO Software Overview 2015Andy Hemingway
 
GENEO Software Overview 2015
GENEO Software Overview 2015GENEO Software Overview 2015
GENEO Software Overview 2015Mark Radley
 
Introducing language technology in the editing process: How to do things righ...
Introducing language technology in the editing process: How to do things righ...Introducing language technology in the editing process: How to do things righ...
Introducing language technology in the editing process: How to do things righ...Loctimize GmbH
 
What Writers Don’t Know About Translation Can Be Costly
What Writers Don’t Know About Translation Can Be CostlyWhat Writers Don’t Know About Translation Can Be Costly
What Writers Don’t Know About Translation Can Be CostlySTC-Philadelphia Metro Chapter
 

Semelhante a Post-Editing of Machine Translation: Developing Requirements and Compensation Schemes (20)

Closing the Gap between Corpora and Termbases, CHAT2013
Closing the Gap between Corpora and Termbases, CHAT2013Closing the Gap between Corpora and Termbases, CHAT2013
Closing the Gap between Corpora and Termbases, CHAT2013
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happy
 
User Empowered Machine Translation. Dion Wiggins, Asia Online
User Empowered Machine Translation. Dion Wiggins, Asia OnlineUser Empowered Machine Translation. Dion Wiggins, Asia Online
User Empowered Machine Translation. Dion Wiggins, Asia Online
 
Getting the Most from MT + PE
Getting the Most from MT + PEGetting the Most from MT + PE
Getting the Most from MT + PE
 
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
 
Thinking Strategically About Content Destined for Machine Translation
Thinking Strategically About Content Destined for Machine TranslationThinking Strategically About Content Destined for Machine Translation
Thinking Strategically About Content Destined for Machine Translation
 
Improving Translator Productivity with MT: A Patent Translation Case Study
Improving Translator Productivity with MT: A Patent Translation Case StudyImproving Translator Productivity with MT: A Patent Translation Case Study
Improving Translator Productivity with MT: A Patent Translation Case Study
 
Lexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLexcelera MT Breaking Compromises
Lexcelera MT Breaking Compromises
 
Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...
 
Seven components of content strategy global swisher
Seven components of content strategy global swisherSeven components of content strategy global swisher
Seven components of content strategy global swisher
 
Linguistic Evaluation of Support Verb Construction Translations by OpenLogos ...
Linguistic Evaluation of Support Verb Construction Translations by OpenLogos ...Linguistic Evaluation of Support Verb Construction Translations by OpenLogos ...
Linguistic Evaluation of Support Verb Construction Translations by OpenLogos ...
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...
 
The Seven Components of a Global Content Strategy
The Seven Components of a Global Content StrategyThe Seven Components of a Global Content Strategy
The Seven Components of a Global Content Strategy
 
Trends In Technology: Worldware 2010
Trends In Technology:  Worldware 2010Trends In Technology:  Worldware 2010
Trends In Technology: Worldware 2010
 
Language Quality Management: Models, Measures, Methodologies
Language Quality Management: Models, Measures, Methodologies Language Quality Management: Models, Measures, Methodologies
Language Quality Management: Models, Measures, Methodologies
 
Machine Translation: Latest Innovations and their Impact on Commercial Transl...
Machine Translation: Latest Innovations and their Impact on Commercial Transl...Machine Translation: Latest Innovations and their Impact on Commercial Transl...
Machine Translation: Latest Innovations and their Impact on Commercial Transl...
 
GENEO Software Overview 2015
GENEO Software Overview 2015GENEO Software Overview 2015
GENEO Software Overview 2015
 
GENEO Software Overview 2015
GENEO Software Overview 2015GENEO Software Overview 2015
GENEO Software Overview 2015
 
Introducing language technology in the editing process: How to do things righ...
Introducing language technology in the editing process: How to do things righ...Introducing language technology in the editing process: How to do things righ...
Introducing language technology in the editing process: How to do things righ...
 
What Writers Don’t Know About Translation Can Be Costly
What Writers Don’t Know About Translation Can Be CostlyWhat Writers Don’t Know About Translation Can Be Costly
What Writers Don’t Know About Translation Can Be Costly
 

Mais de Luigi Muzii

Measuring for success: Goals, performances, and outcomes
Measuring for success: Goals, performances, and outcomesMeasuring for success: Goals, performances, and outcomes
Measuring for success: Goals, performances, and outcomesLuigi Muzii
 
Sharing efforts to get the most from MT+PE
Sharing efforts to get the most from MT+PESharing efforts to get the most from MT+PE
Sharing efforts to get the most from MT+PELuigi Muzii
 
Convegno Unilingue 2017
Convegno Unilingue 2017Convegno Unilingue 2017
Convegno Unilingue 2017Luigi Muzii
 
Standards, terminology and Europe
Standards, terminology and EuropeStandards, terminology and Europe
Standards, terminology and EuropeLuigi Muzii
 
Introduzione alla terminologia
Introduzione alla terminologiaIntroduzione alla terminologia
Introduzione alla terminologiaLuigi Muzii
 
KPIs and Capability Statements
KPIs and Capability StatementsKPIs and Capability Statements
KPIs and Capability StatementsLuigi Muzii
 
Europeo, Feb 1, 1991
Europeo, Feb 1, 1991Europeo, Feb 1, 1991
Europeo, Feb 1, 1991Luigi Muzii
 
Term Mining and Terminology Management in a Corporate Setting Perspective
Term Mining and Terminology Management in a Corporate Setting PerspectiveTerm Mining and Terminology Management in a Corporate Setting Perspective
Term Mining and Terminology Management in a Corporate Setting PerspectiveLuigi Muzii
 
Let's call the whole thing off
Let's call the whole thing offLet's call the whole thing off
Let's call the whole thing offLuigi Muzii
 
Diversità in rete: distanza che si trasforma in ricchezza
Diversità in rete: distanza che si trasforma in ricchezzaDiversità in rete: distanza che si trasforma in ricchezza
Diversità in rete: distanza che si trasforma in ricchezzaLuigi Muzii
 
Terminologia per la traduzione
Terminologia per la traduzioneTerminologia per la traduzione
Terminologia per la traduzioneLuigi Muzii
 
Is quality under pressure? Or is translation?
Is quality under pressure? Or is translation?Is quality under pressure? Or is translation?
Is quality under pressure? Or is translation?Luigi Muzii
 
Is quality under pressure? Or is translation?
Is quality under pressure? Or is translation?Is quality under pressure? Or is translation?
Is quality under pressure? Or is translation?Luigi Muzii
 
Vendor & Project Management
Vendor & Project ManagementVendor & Project Management
Vendor & Project ManagementLuigi Muzii
 
Introduzione alla localizzazione
Introduzione alla localizzazioneIntroduzione alla localizzazione
Introduzione alla localizzazioneLuigi Muzii
 
Perspectives in translator training: knowledge-driven or tech-driven?
Perspectives in translator training: knowledge-driven or tech-driven?Perspectives in translator training: knowledge-driven or tech-driven?
Perspectives in translator training: knowledge-driven or tech-driven?Luigi Muzii
 
Re-thinking pricing and business models for cloud translation
Re-thinking pricing and business models for cloud translationRe-thinking pricing and business models for cloud translation
Re-thinking pricing and business models for cloud translationLuigi Muzii
 

Mais de Luigi Muzii (20)

Measuring for success: Goals, performances, and outcomes
Measuring for success: Goals, performances, and outcomesMeasuring for success: Goals, performances, and outcomes
Measuring for success: Goals, performances, and outcomes
 
Hic et Nunc
Hic et NuncHic et Nunc
Hic et Nunc
 
Sharing efforts to get the most from MT+PE
Sharing efforts to get the most from MT+PESharing efforts to get the most from MT+PE
Sharing efforts to get the most from MT+PE
 
Convegno Unilingue 2017
Convegno Unilingue 2017Convegno Unilingue 2017
Convegno Unilingue 2017
 
White Noise
White NoiseWhite Noise
White Noise
 
Standards, terminology and Europe
Standards, terminology and EuropeStandards, terminology and Europe
Standards, terminology and Europe
 
Introduzione alla terminologia
Introduzione alla terminologiaIntroduzione alla terminologia
Introduzione alla terminologia
 
KPIs and Capability Statements
KPIs and Capability StatementsKPIs and Capability Statements
KPIs and Capability Statements
 
Europeo, Feb 1, 1991
Europeo, Feb 1, 1991Europeo, Feb 1, 1991
Europeo, Feb 1, 1991
 
Term Mining and Terminology Management in a Corporate Setting Perspective
Term Mining and Terminology Management in a Corporate Setting PerspectiveTerm Mining and Terminology Management in a Corporate Setting Perspective
Term Mining and Terminology Management in a Corporate Setting Perspective
 
Let's call the whole thing off
Let's call the whole thing offLet's call the whole thing off
Let's call the whole thing off
 
Diversità in rete: distanza che si trasforma in ricchezza
Diversità in rete: distanza che si trasforma in ricchezzaDiversità in rete: distanza che si trasforma in ricchezza
Diversità in rete: distanza che si trasforma in ricchezza
 
Terminologia per la traduzione
Terminologia per la traduzioneTerminologia per la traduzione
Terminologia per la traduzione
 
Is quality under pressure? Or is translation?
Is quality under pressure? Or is translation?Is quality under pressure? Or is translation?
Is quality under pressure? Or is translation?
 
Is quality under pressure? Or is translation?
Is quality under pressure? Or is translation?Is quality under pressure? Or is translation?
Is quality under pressure? Or is translation?
 
Vendor & Project Management
Vendor & Project ManagementVendor & Project Management
Vendor & Project Management
 
It101
It101It101
It101
 
Introduzione alla localizzazione
Introduzione alla localizzazioneIntroduzione alla localizzazione
Introduzione alla localizzazione
 
Perspectives in translator training: knowledge-driven or tech-driven?
Perspectives in translator training: knowledge-driven or tech-driven?Perspectives in translator training: knowledge-driven or tech-driven?
Perspectives in translator training: knowledge-driven or tech-driven?
 
Re-thinking pricing and business models for cloud translation
Re-thinking pricing and business models for cloud translationRe-thinking pricing and business models for cloud translation
Re-thinking pricing and business models for cloud translation
 

Post-Editing of Machine Translation: Developing Requirements and Compensation Schemes

  • 1. Post-Editing of Machine Translation (PEMT): Developing Requirements and Compensation Schemes
  • 2. Overview
      • MT 101
      • Estimation of MT quality
      • Definition of post-editing
      • PEMT Requirements
      • PEMT skills
      • Pricing and compensation
      © 2013 Luigi Muzii Developing PEMT Requirements and Compensation Schemes
  • 3. Caveat emptor
      • Your presenter is an MT enthusiast
          – Using MT since 1990
      • No bias
          – MT is here to stay
      • No special knowledge
          – Could sound trivial
              • Common sense
  • 4. Methods
      • Rule-based Machine Translation (RbMT)
          – Transfer
          – Interlingua
      • Data-driven (stochastic) machine translation
          – Example-based Machine Translation (EbMT)
          – Statistical Machine Translation (SMT)
      • Both methods can work for projects where MT is suitable
          – Most commercial systems are now all hybrids of some sort
          – Post processing and cleanup
  • 5. Rules-based Machine Translation (RbMT)
      • Analytical approach
          – Grammatical representation of language
              • Morphological analysis
                  – Inflection and conjugation of words
              • Syntactic analysis
                  – Sequence of words and sentence structure
              • Semantic analysis
                  – Meaning of words in context
      • Heavy dependence on bilingual dictionaries
  • 6. RbMT Issues
      • Disambiguation
      • Plain, correct, and consistent source
      • Constant refinement of rules
      • Accurate and comprehensive dictionaries
  • 7. Example-based Machine Translation (EbMT)
      • Bilingual corpus
          – Body of reference for similarities
              • Combination of segments
                  – Best approximation
      • Fuzzy matching algorithms
  • 8. Statistical Machine Translation (SMT)
      • Empirical strategy
          – Statistical probability
              • Statistical assessment of words and phrase positioning within segments from corpus
                  – Brute force computing
  • 9. Two Models for Learning Data
      • Translation model
          – Words and word sequences in SL to find the most likely corresponding words in TL
      • Target-language model
          – The most likely way in which corresponding TL words will be combined
  • 10. EbMT & SMT Issues
      • Parallel corpora
          – Analysis is challenging
              • Problems with large corpora
      • Translation memories instead of corpora
          – Segmentation for accuracy and matching
              • Problems with alignment
  • 11. Ambiguity in SMT
      • Translation models handle word sequences
      • Likeliness of reproducing a wrong interpretation if in model
      • Collocations (dependencies between words) could be hard to capture
      • Target-language models
  • 12. SWOT Analysis for MT
      • Speed
      • Volumes
      • Consistency
      • Complexity
      • Error incidence
      • Amount of skills, expertise and understanding needed
      • Not commonplace
      • Least engaging, highly rewarding, non-binding content
      • Areas for improvement
      • Training and customization
      • Language data and rules optimization
      • Writing
      • Controlled languages
      • Reliability
      • Problematic ROI
  • 13. Modes of Use
      [Diagram; labels: Unrestricted texts, Restricted input, High quality, Low quality, Impractical, Interactive, Fully automatic]
  • 14. Estimating MT quality
      • Biased baseline: MT is always bad
          – Most translators do not know much about MT
              • Same old jokes about silly mistakes
              • A mixture of ignorance and fear
      • Automatic metrics
          – Hard interpretation
      • PEMT effort
          – Annotation guidelines
              • Assigning 1-5 scores
  • 15. Automatic Metrics
      • BLEU
          – No indication of accuracy
          – 0 ≤ P ≤ 1
              • 1 = professional human translation
              • .65 = human quality
              • A score increase does not necessarily mean improved translation quality
      • NIST
          – Based on BLEU, with some alterations
      • METEOR (Metric for Evaluation of Translation with Explicit ORdering)
          – Based on BLEU
              • Harmonic mean of unigram precision and recall
      • F-Measure (F1 Score or F-Score)
          – A measure of a test's accuracy used in machine learning
          – Based on BLEU, with some alterations
      • WER (Word Error Rate)
          – Most often used in speech recognition
              • Usually 0 ≤ P ≤ 1; lower is better
  • 16. Prerequisites
      • Post-editing throughput must be faster than translation
      • Post-editing must be less keyboard intensive than translation
      • Post-editing must be less cognitively demanding than translation
  • 17. PEMT Effort
      • System
          – RbMT
              • Dictionary
              • Rules
              • Customizability
          – Data-driven
              • Suitability of input
              • Training data
                  – Volume
                  – Domain
          – Product captivity
              • Different technologies that can or cannot be used within more than just one tool
      • Language pair
          – Outcome in one language combination cannot be compared with that in another
      • Text type
      • Domain
  • 18. PEMT Issues
      • System
          – RbMT
              • Incorrect word/term
              • Incorrect attachment
              • Meaning not disambiguated
          – Data-driven
              • Missing words
              • Capitalization
              • Punctuation
              • Fluency inconsistency
  • 19. Post-editing
      • Gist
          – Raw MT
              • Disposable (volatile UGC)
              • Validation of automatic evaluation
      • Light
          – Making the translation understandable
              • Ignoring all stylistic issues
              • Adjusting mechanical errors (capitalization and punctuation)
              • Replacing unknown words (misspelled in ST)
              • Removing redundant words
      • Heavy
          – Making the translation stylistically appropriate
              • Fixing machine-induced meaning distortion
              • Making grammatical and syntactic adjustments
              • Checking terminology (new terms)
              • Partially or completely rewriting sentences
                  – Adjusting for target language fluency
  • 20. Degrees of Post-editing
      • User requirements
      • Quality expectations
      • Perishability
      • Volume
      • Text function
      • Turn-around time
  • 21. PEMT Specs
      • Type of MT
          – In house vs. outsourced MT
      • Type of MT output
          – Generic, untrained MT output
          – Trained MT output
      • Quality guidelines and index for raw translation
          – ≥ 40% reusable
              • BLEU
                  – Acceptable: 0.3 to 0.5
                  – Good: 0.5 to 1
      • Request a sample
  • 22. PEMT Specs
      • Rationale for MT
          – Increased throughput
              • More languages
              • More content
          – Faster turnaround time
          – Reduced cost
          – Accuracy and consistency
      • Target consumer
      • Quality of the finished product
          – Reprocessing
          – Publication
      • Amount and type of PEMT
          – Gist
          – Light
          – Heavy
      • Participation in ongoing training of engine
  • 23. Warning
      • MT engines are not all equal
      • Raw output quality is not consistent from system to system and language to language
      • MT error patterns are not consistent from segment to segment
  • 24. PEMT Instructions
      • Clear and concise
      • Tools
      • Stick to style guide
          • Language-specific conventions
          • Country/region standards
          • Grammar, syntax and orthographic conventions
      • Retain as much raw translation as possible
      • Don't hesitate too long over a problem
      • Don't worry about style
      • Don't embark on time-consuming research
      • Make changes only where absolutely necessary
          • Nonsense
          • Wrong words (possibly misspelled in source text)
          • Missing words
          • Punctuation, capitalization
          • Inflection
          • Gender
          • Word order
          • Formatting
  • 25. PEMT Instructions
      • Example for heavy PEMT project
          – The quality expected is publishable quality. This means no deletions or omissions in the text, full accuracy and no mistranslations with regard to the source text, compliance with the grammar and spelling rules of the target language, and compliance with the terminology in the glossary provided.
          – Make as few edits as possible
              • Just correct errors
                  – Follow the glossary
                  – If different terminology is found, replace it with the one in the glossary
              • Do not introduce preferential changes
              • Do not rewrite text, except to correct nonsense
              • Do not try to 'improve' the text
  • 26. PEMT skills
      • Working knowledge of SL
      • Excellent command of TL
      • Specialized domain knowledge
      • Ability to comply with guidelines
      • Unbiased attitude towards MT
      Establish your own PEMT academy
  • 27. Pricing and Compensation
      • It will take a year or two more to build out a widely accepted and dominant compensation model
      • The final model will be tied to productivity
  • 28. Pricing and Compensation
      • Common approaches
          – Paying as for high fuzzy matches
              • 85%-94%
              • PEMT and post-editing of fuzzy matches are deeply different
                  – Fuzzy matches are inherently correct segments
                      • Minor changes (possibly a term or two)
                  – MT is not necessarily inherently correct
                      • Even 'light PEMT' could end up being heavy
          – Paying a time-based fee
  • 29. Always
      • Provide candidate post-editors with MT samples before contracting
      • Agree on throughput rates
          • 450 to 750 words/hour
      • Run a pilot project
          – For every domain
          – For any new combination
  • 30. Compensation Grid
      • Generals
          – Method
          – Type of output
              • Generic, trained or untrained
          – Quality of output
          – Number of references
          – Quality expectations
              • Threshold
          – File formats
              • Tagging
      • Time-based fee
          – Productivity rate
              • Productivity differs by post-editor
          – Time for filling in QA forms
              • For ongoing training of engine
      • Use a spreadsheet to track time
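
As a rough illustration of the WER metric listed on slide 15, the sketch below computes a word-level edit distance (substitutions, insertions and deletions) and divides it by the length of the reference. The function name `word_error_rate` and the plain whitespace tokenization are assumptions made for this example and are not part of the presentation; real evaluation tools typically normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration: tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Because insertions count as errors, the result can exceed 1 when the hypothesis is much longer than the reference. The same function could also be pointed at a raw MT segment and its post-edited version as a crude indicator of post-editing effort, though that is only one possible proxy.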

Editor's Notes

  1. Monolingual post-editors: experts in the domain, but not bilingual. Bilingual post-editors: professional translators with domain expertise who are trained to understand issues with MT and who not only correct the errors in a sentence but also work to create rules for the MT engine to follow. A fully qualified professional translator needs two sets of skills when translating a text: the language skills to understand the source language and to write well in the target language, and the domain knowledge to understand the content of a possibly very specialized technical document. Both skill sets may be hard to find, especially in combination. In fact, it is common practice in the translation industry to differentiate translators according to their qualifications.
  2. While automated metrics such as BLEU and human metrics such as edit distance are useful indicators of quality, by themselves they do not provide enough information. Productivity is the metric that matters most to LSPs, as it relates directly to profit margin. Measuring productivity gives LSPs and post-editors a simple means of determining a fair rate for MT post-editing. A fair rate can be established based on the productivity gain realized through an MT + human approach and the reduced effort required to deliver the same quality of output. If post-editing MT is 3 times faster than a human-only translation approach, then there is justification for reducing rates to about 33% of the regular rate, the point at which hourly earnings are unchanged. Since such a figure usually comes from a small sample, however, it would be wiser to adjust the rate upwards to a level that can accommodate more variance in the MT output. From the translator's perspective, they are paid less per word but more per hour overall.
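
The arithmetic in note 2 can be sketched as follows. The sketch assumes that the break-even per-word rate is the regular rate divided by the measured speed-up (so that hourly earnings stay the same), and that a buffer is then added on top to absorb variance in MT output. The names `pemt_rate` and `hourly_earnings` and all figures (a 0.12 per-word rate, a 250 words/hour translation baseline, a 25% buffer) are hypothetical values chosen for illustration, not numbers from the presentation.

```python
def pemt_rate(regular_rate: float, speedup: float, buffer: float = 0.0) -> float:
    """Per-word post-editing rate derived from a measured productivity gain.

    regular_rate: per-word rate for human-only translation (hypothetical).
    speedup: post-editing throughput divided by translation throughput.
    buffer: fractional uplift on the break-even rate (assumption).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    break_even = regular_rate / speedup  # same hourly earnings as before
    return break_even * (1.0 + buffer)


def hourly_earnings(rate_per_word: float, words_per_hour: float) -> float:
    """Earnings per hour at a given per-word rate and throughput."""
    return rate_per_word * words_per_hour


if __name__ == "__main__":
    regular_rate = 0.12      # hypothetical per-word rate
    translation_wph = 250    # hypothetical human-only throughput
    speedup = 3.0            # post-editing is 3 times faster
    rate = pemt_rate(regular_rate, speedup, buffer=0.25)
    print("PEMT rate per word:", round(rate, 4))
    print("Hourly, translation:", hourly_earnings(regular_rate, translation_wph))
    print("Hourly, post-editing:", hourly_earnings(rate, translation_wph * speedup))
```

With any positive buffer the post-editor earns less per word but more per hour than at the regular translation rate, which is the trade-off the note describes.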