February 23, 2018
Athens, Greece #eliatogether
Sharing Efforts to Get the Most from MT+PE
Luigi Muzii
sQuid
Introduction
Sharing Efforts to Get the Most from MT+PE 2© 2018 Luigi Muzii
Hello
My name is
In this industry since 1982

Working with MT since 1991

Working in telecom until 2002

Freelancing since 2002

University teacher until 2011

Business consultant since 2012

Outline
• Clearing the field
• Laying foundations
• Defining requirements
• Arranging the platform
• Running projects

Clearing the field
Target groups

Practical advice

Guiding principles
• To apply whether you are a freelancer, a project manager, or a translation buyer

Three scenarios
• In the making (freelancers)
• Downstream (customers)
• On constraints (LSPs)

Laying foundations
Devising strategies

Method?

SMT
Already in the past?

NMT?
Still SMT, but… a whole different kettle of fish

Not exactly child’s play
• Tools
• Data
• Knowledge

Foresight
• MT will proliferate
• Good translators will still be lacking

Join forces
Everyone’s needed, no one’s indispensable

3 tips for getting started
• Recap goals and expectations
• Check MT readiness
• Plan for assistance

Defining requirements
Simple and straightforward

Scope
Where available data is larger and quality is higher

Goals
• Reduce labor
• Boost productivity
• Keep consistency

Separate the wheat from the chaff
• Familiarize yourself with the technology
• Strengthen your expertise
• Tackle security issues
• Scrub your data
• Plan for support
• Revise your pricing model

Building a platform
Selection, set-up, training, testing

Givens
• Not all engines are created equal
• Raw output can vary across systems and language pairs
• Errors may not follow a consistent pattern
• Engine performance also varies

Set-up
• Data
  ◦ Maintenance
• Customized engine
  ◦ +100,000 segments
• Tool settings
  ◦ Sub-segment recall
  ◦ Fuzzy match repair

Engine
• Total cost of ownership
• Integration
• Expertise
• Security

Running projects
Best practices

Dos
• Know your data
• Master quality metrics
• Devise a post-editing fee scheme

Don’ts
• Mess with data
• DIY/Rely on vendors
• Expect miracles

In any case, remember:
Tell the customer you are using MT
So you won’t get sued

The fuel
Output is only as good as the data used

Good (effective) data
• Few reliable sources
• Single domain
• Current data
• Same encoding
• No empty segments
• No errors
• Terminologically consistent segments
• Same style
• Same-length segments

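Some of the criteria for good data listed above can be checked mechanically. The sketch below is only an illustration, not a method from the deck: the function name `filter_segments`, the `max_ratio` threshold, and the reading of "same-length segments" as a source/target length ratio are all assumptions. Dropping exact duplicates is an extra assumption as well.

```python
def filter_segments(pairs, max_ratio=3.0):
    """Keep (source, target) pairs that pass a few mechanical checks.

    Hypothetical helper: the checks and the max_ratio default are
    illustrative assumptions, not values taken from the slides.
    """
    seen = set()
    kept = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        # "No empty segments": skip pairs with an empty side.
        if not source or not target:
            continue
        # Assumed reading of "same-length segments": reject pairs whose
        # character lengths differ by more than max_ratio.
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_ratio:
            continue
        # Extra assumption: keep only the first copy of an exact duplicate.
        key = (source, target)
        if key in seen:
            continue
        seen.add(key)
        kept.append(key)
    return kept


if __name__ == "__main__":
    sample = [
        ("Hello world", "Ciao mondo"),
        ("", "vuoto"),
        ("Hello world", "Ciao mondo"),
        ("Hi", "Una frase molto più lunga del necessario"),
    ]
    print(filter_segments(sample))
```

Criteria such as single domain, consistent terminology, and style still need human judgment; a filter like this only removes the most obvious noise.
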
The output
Accept that output is unpredictable

Post-editing: expectations
• Fast
• Unchallenging
• Flowing

Post-editing: measures
• Edit time
  ◦ The time required to get a raw MT output to the desired standard
• Post-editing effort
  ◦ Percentage of edits to be applied to raw MT output to attain the desired standard

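The slides define post-editing effort as a percentage of edits but give no formula. One possible approximation, sketched below, is a word-level edit distance between the raw MT output and the post-edited text, divided by the length of the post-edited text. This is a simplified measure in the spirit of edit-rate metrics (it ignores word shifts), and the function names are hypothetical.

```python
def word_edit_distance(a, b):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete a word
                            curr[j - 1] + 1,      # insert a word
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def post_editing_effort(raw_mt, post_edited):
    """Word edits as a percentage of the post-edited length.

    Assumption: whitespace tokenization and normalization by the
    post-edited length; the deck does not prescribe either choice.
    """
    raw_words = raw_mt.split()
    pe_words = post_edited.split()
    if not pe_words:
        return 0.0 if not raw_words else 100.0
    return 100.0 * word_edit_distance(raw_words, pe_words) / len(pe_words)


if __name__ == "__main__":
    print(post_editing_effort("the cat sat on mat", "the cat sat on the mat"))
```

Because the divisor is the post-edited length, the value can exceed 100 when the raw output is much longer than the final text.
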
Edit time
Can only be computed downstream

Post-editing effort
• Probabilistic forecasts
  ◦ Based on automatic metrics
• Depending on
  ◦ Post-editing level
  ◦ Volume
  ◦ Turn-around time

Post-editing levels
• Gisting
  ◦ Volatile content
  ◦ Automatic scripts to fix mechanical/recurring errors
• Light
  ◦ Continuous delivery
  ◦ Fixing capitalization and punctuation, replacing unknown words, removing redundant words, ignoring stylistic issues
• Full
  ◦ Publishing and engine training
  ◦ Fixing meaning distortion, fixing grammar and syntax, translating untranslated terms (possibly new terms), adjusting fluency

Dos
• Test before operating
• Ask for MT samples for negotiation
• Negotiate throughput rates
• Ask for glossary (with DNT words)
• Ask for instructions
• Be open to feedback

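A glossary with do-not-translate (DNT) words lends itself to a simple automated check. The sketch below assumes a plain list of DNT terms and a case-sensitive substring match; the function name and the product name in the example are hypothetical, and a production check would need proper tokenization.

```python
def missing_dnt_terms(source, target, dnt_terms):
    """Return DNT terms found in the source but not verbatim in the target.

    Assumption: case-sensitive substring matching is good enough for a
    first pass; it can misfire on terms embedded in longer words.
    """
    return [term for term in dnt_terms
            if term in source and term not in target]


if __name__ == "__main__":
    source = "Open the AcmeSync Console and click Save."
    target = "Apri la console AcmeSync e fai clic su Salva."
    # "AcmeSync" survives; "Console" was lowercased in the target.
    print(missing_dnt_terms(source, target, ["AcmeSync", "Console"]))
```

Running such a check on raw MT samples before negotiation gives a rough idea of how much mechanical fixing the output will need.
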
Don’ts
• Use MT to sustain price competition
• Process poor MT outputs
• Treat post-editing as fuzzy matches

Post-editing instructions
• Tool selection
• Environment setup
• General references
• Conventions
• Project details
• Pricing model
• Operating instructions

Pricing and compensation
• Upstream
  ◦ Clear-cut predictive scheme
  ◦ No fuzzy match scheme
  ◦ Fuzzy matches over 85% are inherently correct while MT segments may contain errors and inaccuracies
• Downstream
  ◦ Measurement of actual work

Negotiation grid
• Generals
  ◦ Engine
  ◦ Generic or trained
  ◦ Quality
  ◦ Raw output
  ◦ Expectations
  ◦ Formats and formatting
• Compensation
  ◦ Per-word rate
  ◦ Productivity rate
  ◦ Hourly rate
  ◦ Time tracking

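The compensation items in the grid above are linked by simple arithmetic: an hourly rate and a throughput (productivity rate) imply a per-word rate, and the other way round. The sketch below shows that conversion; the function names and all figures are hypothetical and are not rates suggested by the deck.

```python
def equivalent_per_word_rate(hourly_rate, words_per_hour):
    """Per-word rate implied by an hourly rate at a given throughput."""
    if words_per_hour <= 0:
        raise ValueError("words_per_hour must be positive")
    return hourly_rate / words_per_hour


def implied_hourly_earnings(per_word_rate, words_per_hour):
    """Hourly earnings implied by a per-word rate at a given throughput."""
    return per_word_rate * words_per_hour


if __name__ == "__main__":
    # Hypothetical figures: 40 currency units per hour at 800 words per hour.
    print(equivalent_per_word_rate(40.0, 800))
    # The same per-word rate at a lower measured throughput of 500 words per hour.
    print(implied_hourly_earnings(0.05, 500))
```

Comparing the two directions makes it easier to see whether a proposed per-word rate is sustainable at the throughput actually measured with time tracking.
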
When to say NO
• A considerably low pay rate unrelated to language pair and MT output quality
• MT output quality is lower than that of a generic free online service

Automatic processing
• Pre-processing
  ◦ Empty, untranslated, duplicated segments
  ◦ Normalization
  ◦ Punctuation, diacritics, extra spaces, noise
  ◦ Numbers, dates, weights, measures
  ◦ Terminology
  ◦ Spellcheck
• Post-processing
  ◦ Encoding
  ◦ Normalization
  ◦ Terminology
  ◦ Spellcheck

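The normalization steps listed above can be partly scripted. The following is a minimal sketch covering diacritics, extra spaces, and spacing before punctuation; the function name `normalize_segment` and the specific rules are assumptions for illustration. The punctuation rule in particular is language dependent (French, for example, expects a space before some marks).

```python
import re
import unicodedata


def normalize_segment(text):
    """Apply a few language-agnostic clean-up rules to one segment."""
    # Compose base letters and combining marks into single code points.
    text = unicodedata.normalize("NFC", text)
    # Collapse runs of whitespace (including non-breaking spaces) and trim.
    text = re.sub(r"\s+", " ", text).strip()
    # Assumed rule: no space before common punctuation marks.
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)
    return text


if __name__ == "__main__":
    print(normalize_segment("Cafe\u0301  is   open ."))
```

Numbers, dates, weights, and measures need locale-aware handling and are deliberately left out of this sketch.
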
Thanks
Don’t forget your download card


Editor's Notes

  1. Good morning. Thank you for coming.
  2. My name is Luigi Muzii.
  3. I have been working in the language business since 1982, as a translator, terminologist, technical writer, and localizer.
  4. I have been working with machine translation (MT) since 1991.
  5. I worked in telecommunications until the year 2002, managing translation and localization projects, designing, developing, and managing terminology and documentation, and designing and providing training for customers and in-house staff.
  6. I taught terminology and localization at LUSPIO (now UNINT) university until 2011. Since 2012, I have been a full-time independent business consultant helping customers choose and implement best-suited technologies and redesign their business processes to get the best in multilingual content production, translation, and localization.
  7. I taught terminology and localization at LUSPIO (now UNINT) university until 2011.
  8. Since 2012, I have been a full-time independent business consultant helping customers choose and implement best-suited technologies and redesign their business processes to get the best in multilingual content production, translation, and localization.
  9. We will start by identifying target groups, then continue by laying the foundations for using MT in a production environment. We will also consider the definition of requirements for MT platforms and projects and the setup of an MT platform, and we will finish with running MT and post-editing projects.
  10. This presentation is designed to provide some practical advice about tackling the challenges that freelancers, project managers, and translation buyers face when approaching MT, implementing MT or running MT and post-editing projects.
  11. Three target groups can be identified, one for each of three scenarios. In the making: machine translation provides “suggestions” while a human translator processes a document; this is probably the most common approach today. Downstream: machine translation is used to fully process a document that may then be post-edited; this approach is typically adopted by larger clients with established experience in the field. On constraints: an LSP uses machine translation to finalize a translation job by asking translators to work on suggestions coming from a specialized engine. While scenarios two and three might meet the customer’s need for confidentiality and IP protection through an in-house engine using only the client’s own data, scenario one is becoming more and more common among professional translators, given the astonishing improvement of online engines. At the same time, scenario three is slowly but steadily spreading among LSPs that try to escape price and volume pressures through machine translation and post-editing.
  12. The three scenarios just outlined require different strategies to be devised. The first one involves the machine-translation method.
  13. Given the circumstances, the premises and the many reservations about it—basically, the hype—a question must be asked: Is PEMT already in the past?
  14. In this presentation, the reference method is PB-SMT (Phrase-Based Statistical Machine Translation) because PB-SMT engines are inexpensive and effective, whereas customizable NMT (Neural Machine Translation) engines are still quite pricey and challenging in terms of technical requirements and operational complexity, and thus out of reach for most customers. Translators working on unrestricted documents (scenario one applies here) would generally choose an online NMT engine. For customers requiring confidentiality and IP protection and willing to leverage their own language data (scenarios two and three apply here), a highly customizable PB-SMT engine might be a valid option, especially where no major investment is envisaged in the field, vast and suitable data is available and/or limited volumes are processed.
  15. In general, the main drivers in the adoption of MT are productivity (speed and volumes) and usefulness (consistency and marginal cost), especially for large projects otherwise involving many translators. Unfortunately, MT is not exactly child’s play. MT engines are complex and challenging applications requiring a combination of tools, data, and knowledge. This combination is a rare commodity, especially in a single person, on both sides of a translation project.
  16. Also, in the future, while MT will continue to proliferate, the shortage of good translators will increase.
  17. Therefore, joining forces is important to explore and vet as many solutions as possible. According to a popular saying, everyone is needed and no one is indispensable: anyone can easily be replaced by someone else with a similar profile in the same role. This also means, though, that, to be valuable, everyone’s effort is needed at the highest level of performance all the time. For quite some time now, translation has no longer been a solitary feat but a collaborative effort. This is especially true with the current level of technology.
  18. In this respect, three steps should be completed before starting any MT project. Recap your goals and requirements: what you expect from MT and how much you are willing to rely on it. Check your MT readiness: realistically analyze your knowledge, tools, and data. Plan for assistance: never venture into unfamiliar territory without a guide.
  19. When planning to implement MT, keep scope, goals, and expectations clearly distinct.
  20. Identify one or more items within your scope of business for MT, possibly picking those where the amount of data available is larger, and the quality higher.
  21. Clearly define and prioritize your goals. Major goals may be reducing labor, boosting productivity and keeping consistency, especially in larger projects.
  22. Be realistic in your expectations. Therefore, familiarize yourself with the technology and strengthen your expertise to make the best of it. Tackle any security issues concerning confidentiality, data integrity, and protection of intellectual property. Don’t forget to scrub your data if you plan to train an engine, and plan for any relevant support. Finally, revise your pricing model to encompass MT-related tasks.
  23. When building an MT platform, never forget that MT engines are not all equal: they differ in environment, configuration, and data. Therefore, although the output could be considered somewhat predictable, raw output quality can vary across systems and language combinations, and errors may not follow a consistent pattern. The performance of MT engines also varies.
  24. In data-driven MT, data maintenance is crucial, and it is the first task when setting up an MT platform. Data must be organized, cleaned, and fixed for terminology and style. For a customized engine, at least 100,000 paired segments are necessary, and the cleaner and healthier the better. Another important factor in the effectiveness of an MT strategy is the set of tools used for data preparation, pre-translation, and post-editing. Special attention must be given to translation tool settings to allow for sub-segment recall and fuzzy match repair.
  25. Finally, when choosing the engine, the items to consider are: the total cost of ownership (TCO); integration; expertise; security (especially as to intellectual property and confidentiality).
  26. When running MT projects, best practices may be different depending on whether you are a translation buyer or vendor.
  27. In general, knowing your data and mastering quality metrics is a must. As to post-editing, devise your own compensation scheme.
  28. A common mistake is to consider all content as equal and then mess with data. In the same way, absolutely avoid relying only on your own capabilities or only on vendors. In the end, everything can be summed up in a simple invitation not to expect any miracles.
  29. Never forget to agree with the customer and the content owner about using MT, especially if you are using a SaaS/online platform, to avoid being sued.
  30. Also remember that data is the fuel for any SMT/NMT engine and that the output is only as good as the data used. In fact, these engines perform statistical predictions by inference, and when the amount and quality of data increase, an engine improves.
  31. Collect as much data as possible, but always make sure that it comes from a few reliable sources in a restricted domain, that it contains correct translations with no errors, that it is real and recent, and that it is terminologically and stylistically consistent.
  32. At this point, you must accept that MT output is unpredictable. For this reason, MT quality assessment should be run in such a way as to prevent any subjectivity.
  33. For the same reason, post-editing is and will remain a critical, integrated part of MT usage, and it is expected to be fast, unchallenging, and flowing.
  34. In any case, the amount of post-editing required can be hard to assess. To plan deadlines and allocate a budget for the job, two different measures can be used: the edit time and the post-editing effort. The former is the time required to bring raw MT output to the desired standard, and the latter is the percentage of edits to be applied to raw MT output to attain the desired standard.
  35. The main problem with edit time is that it can only be computed downstream, assuming that the time spent has been entirely devoted to editing.
  36. The post-editing effort can be estimated through probabilistic forecasts based on automatic metrics as a reverse projection of the productivity boost. In fact, translation productivity is measured as the throughput expressed in the number of words per hour, and MT is supposed to improve it by reducing the turnaround time and increasing the workable volumes. However, the post-editing effort and the turnaround time are hard to predict, especially for translation of publishable quality and/or data for incremental training of engines. In fact, both depend on diverse factors such as the quality expectations for the finalized output, the volume of content to process, and the time allotted for the task. Also, the effort required depends on the post-editing level.
  37. The post-editing level is generally restricted to: Gisting; Light; Full. Gisting consists in fixing recurring errors in raw MT output with automatic scripts. It is used for volatile content, e.g. messaging, conversations, etc. Light post-editing consists in fixing capitalization and punctuation errors, replacing unknown words, removing redundant words or inserting missing ones, and ignoring all stylistic issues. It is generally used for continuous delivery and reprocessing. Full post-editing consists in fixing meaning distortion, grammar and syntax, translating untranslated terms (possibly new terms), and adjusting fluency. It is reserved for publishing and engine training.
  38. Finally, always follow a few basic rules before embarking on a post-editing project: test before operating; ask for MT samples for negotiation; negotiate throughput rates; ask for a glossary with the list of DNT (do-not-translate) words; ask for instructions; be open to feedback.
  39. Similarly: never use MT to sustain price competition; never process poor MT output; never treat post-editing like fuzzy-match editing.
  40. Remember that different engines, domains, and language pairs produce different outputs, involve different post-editing efforts, and require different post-editing instructions. These should address tool selection criteria and environment setup guidelines, as well as a style guide, and a comprehensive term base. They should also address conventions and the type and number of project details as well as the general pricing model and the actual operating instructions.
  41. As to pricing and compensation, for light post-editing of very good output when speed is the major concern and the first requirement, a model should be agreed prior to any assessment of the actual MT output, based on a clear-cut predictive scheme. However, do not follow any translation-memory fuzzy-match scheme. In fact, while fuzzy matches over 85% are inherently correct and generally require minor changes, machine-translated segments may contain errors and inaccuracies, and even a supposedly light post-editing may prove challenging. A downstream computation scheme might also be devised for full post-editing, for an accurate measurement of the actual work performed. This is usually done by computing the edit distance and then inferring the percentage of the hourly rate.
  42. A negotiation grid can be helpful to cross-reference the type and nature of engines, the quality of raw output, and all the relevant technical requirements with compensation based on productivity, type of performance, bid (hourly rate), and ancillary services (e.g. filling in QA forms for ongoing training of the engine).
  43. In any case, a strong and clear “No, thanks!” is more than reasonable when a very low pay rate is offered that is unrelated to language pair and MT output quality, and/or when MT output quality is lower than that of a generic free online service.
  44. Lastly, raw MT output should be processed before a post-editing job for automatic removal of empty and/or untranslated segments and duplications, and for fixing of diacritics, punctuation marks, extra spaces, noise, and numbers; terminology should also be checked for consistency, and a spellcheck should be run. A post-processing stage should also be envisaged, involving encoding, normalization, formatting (especially tag injection), a terminology check and, obviously, a spellcheck.
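
Two of the mechanical steps mentioned in these notes can be sketched in a few lines of Python: the clean-up of raw MT output before post-editing (note 44) and the post-editing effort as a percentage of edits (note 34). This is only an illustrative sketch, not the method used in the presentation. The function names are hypothetical, the effort is measured after the fact on an already post-edited sample (not forecast), it is normalized by the longer of the two segments (one possible choice among several), and the punctuation rule assumes a language where no space precedes a comma or full stop.

```python
import re
import unicodedata


def preprocess_mt_output(pairs):
    """Clean (source, raw_mt) pairs before post-editing (hypothetical helper).

    Drops empty, untranslated (target identical to source) and duplicate
    segments; normalizes diacritics, extra spaces and spaces before , and .
    """
    seen = set()
    cleaned = []
    for source, target in pairs:
        # NFC composes a base letter and its combining diacritic into one code point.
        target = unicodedata.normalize("NFC", target)
        # Collapse runs of whitespace and trim both ends.
        target = re.sub(r"\s+", " ", target).strip()
        source = re.sub(r"\s+", " ", source).strip()
        # Assumption: the target language puts no space before a comma or full stop.
        target = re.sub(r" ([,.])", r"\1", target)
        if not target or target == source:
            continue  # empty or untranslated segment
        if (source, target) in seen:
            continue  # duplicate segment
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


def word_edit_distance(a, b):
    """Levenshtein distance between two lists of tokens."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def post_editing_effort(raw_mt, post_edited):
    """Word-level edits as a percentage of the longer segment (0 to 100)."""
    raw, final = raw_mt.split(), post_edited.split()
    longest = max(len(raw), len(final))
    if longest == 0:
        return 0.0
    return 100.0 * word_edit_distance(raw, final) / longest
```

Run on a small sample of segments, `post_editing_effort` gives a rough figure that could feed a negotiation or a downstream compensation scheme; how that percentage maps to a rate is a business decision that the sketch deliberately leaves open.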