WeMT Tools and Processes
TAUS Showcase October 2013
By Olga Beregovaya

copyright © welocalize 2013. all rights reserved. www.welocalize.com
We’ll talk about:

• MT Programs
• Metrics
• Engines
• Language Tools

Current MT Programs
Dell – 27 languages
Autodesk – 11 languages
PayPal – 8 languages
Cisco – 17 languages across 3 tiers
Intuit – 20+ languages
Microsoft (pre-project support)
McAfee (pilot)
… many more in pilot stage
MT Program: Path-to-Success
Components
A set of MT engines – “mix and match”
TMT Selection Mechanisms
Post-editing Environment
Processes and metrics
Data gathering and reporting tool – what, how much, how fast and at what effort
EDUCATION EDUCATION EDUCATION
CHANGE

The recipe for success
Process and Workflow
All aspects of the localization ecosystem are taken into consideration

Selecting the right MT engine
By using our MT engine selection Scorecard we make sure all important KPIs are taken into consideration at selection time

Empowerment through education
Internal, by the use of customized Toolkits; external, through specialised Trainings.

The feedback loop
Constructive communication from post-editor to MT provider

MT KPIs:
• Productivity: Throughputs
• Productivity: Delta
• Quality: LQA
• Quality: Automatic Scores
• Cost
• GlobalSight: Connectivity
• GlobalSight: Tagging
• Human Evaluation
• Customization: Internal/External
• Customization: Time
MT Program Design - Source
o Source content classification (i.e. marketing/UI/UA/UGC)
o Length of the source segment
o Source segment morpho-syntactic complexity
o Presence/absence of pre-defined glossary terms or multi-word glossary elements, UI elements, numeric variables, product lists, ‘do-not-translate’ and transliteration lists
o Tag density - Metadata attributes and their representation in localization industry standard formats (“tags”)
o ROC – quality levels based on content use (“impact”)

3D Model: Expected productivity mapped to desired quality levels and source content complexity
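
A few of the source-side factors listed above (segment length, tag density, glossary term presence) can be measured mechanically. The Python sketch below is only an illustration under stated assumptions: the tag pattern, the feature names and the crude substring match for glossary terms are hypothetical and are not taken from the Welocalize tooling.

```python
import re

# Assumption: anything shaped like <...> or a {0}-style placeholder counts as a tag.
TAG_PATTERN = re.compile(r"<[^>]+>|\{\d+\}")


def source_features(segment, glossary_terms):
    """Return simple per-segment features: length, tag density, glossary hits."""
    tags = TAG_PATTERN.findall(segment)
    text = TAG_PATTERN.sub(" ", segment)      # text with the tags removed
    words = text.split()
    lowered = text.lower()
    # Crude case-insensitive substring match; a real check would respect word boundaries.
    hits = [term for term in glossary_terms if term.lower() in lowered]
    return {
        "word_count": len(words),
        "tag_count": len(tags),
        "tag_density": len(tags) / max(len(words), 1),   # tags per word
        "glossary_hits": hits,
    }


features = source_features("Click <b>Save</b> to store {0} files.", ["Save", "Control Panel"])
print(features)
```

Features like these could feed the kind of source classification the slide describes; how they are weighted against quality levels is not stated in the deck.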

MT Engine Selection Scorecard
Productivity - Throughputs
Number of post-edited words per hour
Productivity - Delta
Percentage difference between translation and post-editing time
Cost
Extrapolation, cost per word
CMS - Connectivity
Is there a connector in place?
Quality/Nature of source
Quality (Final) - LQA
Internal quality verification
Quality (MT) - Automatic Scores
A set of automatic scoring systems is used

We have tested and used different engines so we’ve seen the good, the bad and the ugly; now we can better appreciate what we have
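
The two productivity metrics on the scorecard reduce to simple arithmetic. The sketch below is a minimal Python illustration; the function names are hypothetical, and because the slide does not say which time is the baseline for the percentage, it assumes the delta is expressed relative to the from-scratch translation time.

```python
def throughput_words_per_hour(words, seconds):
    """Post-edited words per hour, given a word count and the time spent in seconds."""
    return words / (seconds / 3600.0)


def productivity_delta(translation_seconds, post_editing_seconds):
    """Percentage of time saved by post-editing.

    Assumption: the baseline (denominator) is the from-scratch translation time.
    A positive value means post-editing was faster.
    """
    saved = translation_seconds - post_editing_seconds
    return 100.0 * saved / translation_seconds


# 1,200 words post-edited in two hours
print(throughput_words_per_hour(1200, 7200))
# a segment that takes 100 s to translate and 60 s to post-edit
print(productivity_delta(100, 60))
```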
Scorecard - Metrics
Overall data
Productivity metrics

Automatic Scoring
Human Evaluation
Toolkits and Trainings
Our experience:
• Most translators know and have experienced post-editing but they have limited knowledge of any other related aspect (automatic scoring, output differences between RBMT and SMT...)
• The majority of people who work in localization have heard about MT but most of them still find it a daunting subject.
Our answer:
• Continuous MT and PE related trainings and documentation for language providers
• Customized Toolkits for different internal departments (Production, Quality, Sales, Vendor Management)
Transparency and Ownership
Theory – knowledge foundations
Practice – customized PE sessions for different client accounts
Transparency – process, engine selection/customization, evaluations
Responsibility – valid evaluations, constructive feedback, quality ownership
Legacy data – best prediction tool
> Statistics from legacy knowledge base

Training helps a lot - After I was told some of the background information and tips and tricks for certain engines/outputs, I was much more relaxed and happy to give MT a go.
The feedback loop
For me the biggest advantage would be the possibility to implement a client terminology list [in SMT]

I wish we could easily fix the corpus for outdated terminology and characters

Teach the engine to properly cope with sentences containing more than one verb and/or verbs in progressive form

Engine retraining improved significantly the handling of tags and spaces around tags, this is a productive achievement as it saves us a lot of manual corrections.
Feedback and Engine Improvement
“Beyond the Engine” Tools
• Teaminology - crowdsourcing platform for centralized term governance; simultaneous concordance search of TMs and term bases => clean training data
• Dispatcher - a global community content translation application that connects user-generated content (UGC), including live chats, social media, forums, comments and knowledge bases, to customized machine translation (MT) engines for real-time translation
• Source Candidate Scorer – scoring of candidate sentences against historically good and bad sentences based on POS and perplexity
• Corpus Preparation Toolkit – set of applications to maximize data preparation for MT engine training
Teaminology

Dispatcher
Source Candidate Scorer

Compares your source content to “the good” and “the bad”
legacy segments and estimates potential suitability for MT
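
One way to picture this comparison is to train one language model on the "good" legacy segments and another on the "bad" ones, then check under which model a new sentence has the lower perplexity. The Python sketch below is a simplified stand-in, not the actual Source Candidate Scorer: it uses add-one smoothed unigram models over lowercased word tokens, leaves out the POS component entirely, and the class and function names are hypothetical.

```python
import math
from collections import Counter


class UnigramModel:
    """Add-one smoothed unigram language model over whitespace tokens."""

    def __init__(self, sentences):
        self.counts = Counter(tok for s in sentences for tok in s.lower().split())
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts) + 1   # one extra slot for unseen tokens

    def perplexity(self, sentence):
        tokens = sentence.lower().split()
        if not tokens:
            return float("inf")
        log_prob = sum(
            math.log((self.counts[t] + 1) / (self.total + self.vocab))
            for t in tokens
        )
        return math.exp(-log_prob / len(tokens))


def mt_suitability(sentence, good_model, bad_model):
    """Positive when the sentence looks more like the 'good' segments than the 'bad' ones."""
    return math.log(bad_model.perplexity(sentence)) - math.log(good_model.perplexity(sentence))


# Toy legacy data, invented for illustration only.
good = UnigramModel(["click the save button", "select the file menu"])
bad = UnigramModel(["lol gonna try dat thing 2morrow", "idk wat 2 do"])
print(mt_suitability("click the file button", good, bad) > 0)
print(mt_suitability("idk wat dat thing", good, bad) < 0)
```

A production scorer would need far more legacy data and a stronger model; the point here is only the good-versus-bad perplexity comparison.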
Corpus Preparation Suite
Variety of tools to prepare corpus for training MT engines such as:
• Deleting formatting tags from TMX
• Removing double spaces
• Removing duplicated punctuation (e.g. commas)
• Deleting segments where source = target
• Deleting segments containing only URLs
• Escaping characters
• Removing duplicate sentences
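
Most of the clean-up steps listed above can be expressed as a short filter over source/target pairs. The Python sketch below is an illustration under stated assumptions, not the Corpus Preparation Suite itself: the regular expressions and function names are hypothetical, character escaping is left out because the slide does not say which characters are escaped, and dropping segments that end up empty is an added assumption.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")                  # inline formatting tags
MULTI_SPACE_RE = re.compile(r" {2,}")            # runs of two or more spaces
DUP_PUNCT_RE = re.compile(r"([,;:!?])\1+")       # e.g. ",," -> "," (periods kept for ellipses)
URL_ONLY_RE = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)


def clean_text(text):
    """Delete tags, collapse duplicated punctuation and double spaces."""
    text = TAG_RE.sub("", text)
    text = DUP_PUNCT_RE.sub(r"\1", text)
    text = MULTI_SPACE_RE.sub(" ", text)
    return text.strip()


def clean_corpus(pairs):
    """Filter (source, target) pairs following the steps on the slide."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        source, target = clean_text(source), clean_text(target)
        if not source or not target:
            continue                              # assumption: drop segments left empty
        if source == target:
            continue                              # source = target
        if URL_ONLY_RE.match(source) or URL_ONLY_RE.match(target):
            continue                              # segment contains only a URL
        if (source, target) in seen:
            continue                              # duplicate sentence pair
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


pairs = [
    ("Click <b>Save</b>,, then  close.", "Cliquez sur <b>Enregistrer</b>, puis fermez."),
    ("Click Save, then close.", "Cliquez sur Enregistrer, puis fermez."),
    ("OK", "OK"),
    ("http://example.com/help", "http://example.com/aide"),
]
print(clean_corpus(pairs))
```

Cleaning before deduplicating matters: the first two pairs only become identical once tags, doubled commas and double spaces are removed.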

Corpus Preparation: TM Creator
Aggregates training data from various relevant sources
Corpus Preparation: TMX Splitter
Extracts the relevant training corpus based on the TMX metadata
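
Metadata-based extraction of this kind can be sketched with the Python standard library. The example below is not the TMX Splitter itself: it assumes the metadata lives in `<prop>` elements directly under each `<tu>`, and the function name and the property type used in the usage note (`x-domain`) are hypothetical, since property types in TMX are defined by the tool that wrote the file.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_pairs(tmx_text, prop_type, prop_value, src_lang, tgt_lang):
    """Return (source, target) pairs from translation units whose <prop> matches."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        props = {p.get("type"): (p.text or "").strip() for p in tu.findall("prop")}
        if props.get(prop_type) != prop_value:
            continue                               # metadata does not match, skip unit
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text that sits around inline elements
                segs[tuv.get(XML_LANG, "").lower()] = "".join(seg.itertext()).strip()
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs
```

Calling `extract_pairs(tmx_text, "x-domain", "UI", "en", "fr")` would keep only English-French units tagged with that hypothetical domain value, which is one way to carve a domain-specific training corpus out of a mixed TM.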
Welocalize Moses Implementation
• Why? Far more control over engine quality since we can control corpus preparation and output post-processing
• Control over metadata handling
• Ties into our company open-source philosophy
• Have experienced personnel in-house
• Can extend and customize Moses functionality as necessary
• Have connector to TMS (GlobalSight)
RESULTS: In our internal tests with Moses/DoMT, we are getting automated scores similar to commercial engines for the languages into which we localize most.
Same feedback received from human evaluators

… And it works!
We are in a position to offer realistic discounts and aggressive timelines while providing quality levels appropriate for the content

“Work-in-progress” Projects

• Ongoing improvements to our adaptation of iOmegaT tool (Welocalize/CNGL)
• Industry Partner in CNGL “Source Content Profiler” project
• Adoption of TMTPrime (CNGL) - MT vs. Fuzzy Match selection mechanism
• Language and content-specific pre-processing for the in-house Moses deployment
• Teaminology – adding linguistic intelligence

Contact
Language_Tools_Group_all@welocalize.com
We speak MT - the language of the future
Welocalize, Inc.
www.welocalize.com
Headquarters
241 East 4th St. Suite 207
Frederick, Maryland 21701 USA
[t] +1.301.668.0330
[t] +1.800.370.9515 Toll Free
[f] +1.301.668.0335
[e] marketing@welocalize.com

Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization World

  • 1. WeMT Tools and Processes TAUS Showcase October 2013 By Olga Beregovaya copyright © welocalize 2013. all rights reserved. www.welocalize.com
  • 2. We’ll talk about: • MT Programs • Metrics • Engines • Language Tools
  • 3. Current MT Programs: • Dell – 27 languages • Autodesk – 11 languages • PayPal – 8 languages • Cisco – 17 languages across 3 tiers • Intuit – 20+ languages • Microsoft (pre-project support) • McAfee (pilot) • … many more in pilot stage
  • 4. MT Program: Path-to-Success Components • A set of MT engines – “mix and match” • TMT Selection Mechanisms • Post-editing Environment • Processes and metrics • Data gathering and reporting tool – what, how much, how fast and at what effort • EDUCATION, EDUCATION, EDUCATION • CHANGE. The recipe for success.
  • 5. Process and Workflow: All aspects of the localization ecosystem are taken into consideration. • Selecting the right MT engine: by using our MT engine selection Scorecard we make sure all important KPIs are taken into consideration at selection time • Empowerment through education: internal, by the use of customized Toolkits; external, through specialised Trainings • The feedback loop: constructive communication from post-editor to MT provider. MT KPIs: • Productivity: Throughputs • Productivity: Delta • Quality: LQA • Quality: Automatic Scores • Cost • GlobalSight: Connectivity • GlobalSight: Tagging • Human Evaluation • Customization: Internal/External • Customization: Time
  • 6. MT Program Design - Source: • Source content classification (e.g. marketing/UI/UA/UGC) • Length of the source segment • Source segment morpho-syntactic complexity • Presence/absence of pre-defined glossary terms or multi-word glossary elements, UI elements, numeric variables, product lists, ‘do-not-translate’ and transliteration lists • Tag density - metadata attributes and their representation in localization industry standard formats (“tags”) • ROC – quality levels based on content use (“impact”). 3D Model: expected productivity mapped to desired quality levels and source content complexity.
  • 7. MT Engine Selection Scorecard: • Productivity - Throughputs: number of post-edited words per hour • Productivity - Delta: percentage difference between translation and post-editing time • Cost: extrapolation, cost per word • CMS - Connectivity: is there a connector in place? • Quality/Nature of source • Quality (Final) - LQA: internal quality verification • Quality (MT) - Automatic Scores: a set of automatic scoring systems is used. We have tested and used different engines so we’ve seen the good, the bad and the ugly; now we can better appreciate what we have.
  • 8. Scorecard - Metrics: • Overall data • Productivity metrics • Automatic Scoring • Human Evaluation
  • 9. Toolkits and Trainings. Our experience: • Most translators know and have experienced post-editing, but they have limited knowledge of any other related aspect (automatic scoring, output differences between RBMT and SMT...) • The majority of people who work in localization have heard about MT, but most of them still find it a daunting subject. Our answer: • Continuous MT- and PE-related trainings and documentation for language providers • Customized Toolkits for different internal departments (Production, Quality, Sales, Vendor Management)
  • 10. Transparency and Ownership: • Theory – knowledge foundations • Practice – customized PE sessions for different client accounts • Transparency – process, engine selection/customization, evaluations • Responsibility – valid evaluations, constructive feedback, quality ownership. “Training helps a lot - After I was told some of the background information and tips and tricks for certain engines/outputs, I was much more relaxed and happy to give MT a go.”
  • 11. Legacy data – best prediction tool > Statistics from legacy knowledge base
  • 12. The feedback loop: • “For me the biggest advantage would be the possibility to implement a client terminology list [in SMT]” • “I wish we could easily fix the corpus for outdated terminology and characters” • “Teach the engine to properly cope with sentences containing more than one verb and/or verbs in progressive form” • “Engine retraining improved significantly the handling of tags and spaces around tags, this is a productive achievement as it saves us a lot of manual corrections.”
  • 13. Feedback and Engine Improvement
  • 14. “Beyond the Engine” Tools • Teaminology - crowdsourcing platform for centralized term governance; simultaneous concordance search of TMs and term bases => clean training data • Dispatcher - a global community content translation application that connects user-generated content (UGC), including live chats, social media, forums, comments and knowledge bases, to customized machine translation (MT) engines for real-time translation • Source Candidate Scorer – scoring of candidate sentences against historically good and bad sentences based on POS and perplexity • Corpus Preparation Toolkit – set of applications to maximize data preparation for MT engine training
  • 17. Source Candidate Scorer: Compares your source content to “the good” and “the bad” legacy segments and estimates potential suitability for MT
  • 18. Corpus Preparation Suite: Variety of tools to prepare corpus for training MT engines, such as: • Deleting formatting tags from TMX • Removing double spaces • Removing duplicated punctuation (e.g. commas) • Deleting segments where source = target • Deleting segments containing only URLs • Escaping characters • Removing duplicate sentences
  • 19. Corpus Preparation: TM Creator. Aggregates training data from various relevant sources
  • 20. Corpus Preparation: TMX Splitter Extracts the relevant training corpus based on the TMX metadata
  • 21. Welocalize Moses Implementation. Why? • Far more control over engine quality since we can control corpus preparation and output post-processing • Control over metadata handling • Ties into our company open-source philosophy • Have experienced personnel in-house • Can extend and customize Moses functionality as necessary • Have connector to TMS (GlobalSight). RESULTS: In our internal tests with Moses/DoMT, we are getting automated scores similar to commercial engines for the languages into which we localize most. Same feedback received from human evaluators.
  • 22. … And it works! We are in a position to offer realistic discounts and aggressive timelines while providing quality levels appropriate for the content.
  • 23. “Work-in-progress” Projects • Ongoing improvements to our adaptation of iOmegaT tool (Welocalize/CNGL) • Industry Partner in CNGL “Source Content Profiler” project • Adoption of TMTPrime (CNGL) - MT vs. Fuzzy Match selection mechanism • Language and content-specific pre-processing for the in-house Moses deployment • Teaminology – adding linguistic intelligence
  • 24. Contact: Language_Tools_Group_all@welocalize.com • We speak MT - the language of the future • Welocalize, Inc. www.welocalize.com • Headquarters: 241 East 4th St. Suite 207, Frederick, Maryland 21701 USA • [t] +1.301.668.0330 • [t] +1.800.370.9515 Toll Free • [f] +1.301.668.0335 • [e] marketing@welocalize.com
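
The clean-up steps listed on slide 18 (Corpus Preparation Suite) could be sketched roughly as follows. This is a minimal illustration, not Welocalize's actual tooling: the function names `clean_text` and `clean_corpus`, the regular expressions, and the assumption that the corpus is already loaded as a list of (source, target) string pairs are all hypothetical. Character escaping is left out because the slides do not say which characters are escaped or how.

```python
import re

# All patterns below are illustrative assumptions, not the original tool's rules.
TAG_RE = re.compile(r"<[^>]+>")          # inline formatting tags such as <b> or <ph x="1"/>
SPACES_RE = re.compile(r" {2,}")         # runs of two or more spaces
PUNCT_RE = re.compile(r"([,;:!?])\1+")   # repeated punctuation, e.g. ",,"
URL_ONLY_RE = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)


def clean_text(text):
    """Delete tags, collapse repeated punctuation and spaces, trim the ends."""
    text = TAG_RE.sub("", text)
    text = PUNCT_RE.sub(r"\1", text)
    text = SPACES_RE.sub(" ", text)
    return text.strip()


def clean_corpus(pairs):
    """Return cleaned (source, target) pairs, dropping unusable ones."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        source, target = clean_text(source), clean_text(target)
        if not source or not target:
            continue  # nothing left after cleaning
        if source == target:
            continue  # segment where source = target
        if URL_ONLY_RE.match(source) or URL_ONLY_RE.match(target):
            continue  # segment containing only a URL
        if (source, target) in seen:
            continue  # duplicate sentence pair
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


if __name__ == "__main__":
    sample = [
        ("Click <b>Save</b>,, then  close.", "Cliquez sur <b>Enregistrer</b>,, puis  fermez."),
        ("OK", "OK"),
        ("http://example.com", "http://example.com/fr"),
        ("Click Save, then close.", "Cliquez sur Enregistrer, puis fermez."),
    ]
    for src, tgt in clean_corpus(sample):
        print(src, "|", tgt)
```

Deleting tags outright can glue neighbouring words together when a tag stood in for whitespace, so a production pipeline would likely treat tag types individually.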

Editor's Notes

  1. Our KPIs are organic: the list can grow or be adapted to a new situation depending on the particular needs.
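
As a worked illustration of the two productivity KPIs, slide 7 defines throughput as post-edited words per hour and delta as the percentage difference between translation and post-editing time. The sketch below assumes the delta is measured relative to the from-scratch translation time; the slides do not state the baseline, and the function names are hypothetical.

```python
def throughput_words_per_hour(words, hours):
    """Post-edited words per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return words / hours


def productivity_delta_percent(translation_hours, post_editing_hours):
    """Time saved by post-editing, as a percentage of translation time.

    Assumption: translation-from-scratch time is the baseline.
    """
    if translation_hours <= 0:
        raise ValueError("translation_hours must be positive")
    return (translation_hours - post_editing_hours) / translation_hours * 100.0


if __name__ == "__main__":
    # Invented figures for illustration only.
    print(throughput_words_per_hour(3000, 4))     # 750.0
    print(productivity_delta_percent(10.0, 7.5))  # 25.0
```

A negative delta under this convention would mean post-editing took longer than translating from scratch.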