This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit.
MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
For the latest updates, follow us on Twitter - #MosesCore
TAUS MT SHOWCASE, Moses in the Mix. A Technology Agnostic Approach to a Winning MT Strategy, Lori Thicke. LexWorks
1. TAUS MACHINE TRANSLATION SHOWCASE
Moses in the Mix: A Technology Agnostic Approach to a Winning MT Strategy
10:50 – 11:10
Wednesday, 12 June 2013
Lori Thicke
LexWorks
2. Moses in the Mix: A Technology Agnostic Approach to the Winning MT Strategy
3. What is LexWorks?
• McKinsey's definition of the T-shaped company
• Language Services Provider (Lexcelera, founded 1986; managing translators & post-editors)
• MT Services Provider (training engines, post-editing, etc.)
• Technology Agnostic
4. • Developing new technologies to help MT work better with community content
5.
6. Other Technology Agnostics
A good MT strategy should be technology-agnostic and look for the most efficient solution on a case-by-case basis. The type of technology that best suits your needs will change depending on the language pair.
7.
8. All approaches - SMT, RBMT, Hybrid - are good when matched to the course
9. The process aims to define best of breed solutions for superior performance
MT is not a tool. MT is an industrial process.
10. 1. Best of breed means raw MT that is perfectly understandable
Sentences | MS Translator (%) | Systran Hybrid (%)
not understandable | 15.65 | 20.87
partly understandable | 20.00 | 34.78
fully understandable | 64.35 | 44.35
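The percentages on this slide can be reproduced from simple tallies of rated sentences. The sketch below is illustrative only: the slide does not state the sample size, and the counts used here (115 sentences per engine) are an assumption, chosen because they reproduce the reported figures. The function and variable names are hypothetical.

```python
def percent_distribution(counts):
    """Convert per-category sentence counts into percentages (2 decimals)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no rated sentences")
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# ASSUMED counts: not given on the slide, picked so that the
# resulting percentages match the reported ones.
ms_translator_counts = {
    "not understandable": 18,
    "partly understandable": 23,
    "fully understandable": 74,
}
systran_hybrid_counts = {
    "not understandable": 24,
    "partly understandable": 40,
    "fully understandable": 51,
}

ms_translator_pct = percent_distribution(ms_translator_counts)
systran_hybrid_pct = percent_distribution(systran_hybrid_counts)

for label in ms_translator_pct:
    print(label, ms_translator_pct[label], systran_hybrid_pct[label])
```

Reporting the underlying counts alongside the percentages makes it easier to judge how much weight a difference between two engines can bear.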
11. Raw MT for FAQs and Forum Content
Score | MS Translator | Systran Hybrid
Average score on FAQ article | 2.6 | 2.4
Average score on forum | 2.31 | 1.97
Overall score | 2.48 | 2.23
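The slide does not say how the overall score is derived from the two content types. If one assumes it is a weighted mean of the FAQ and forum averages (an assumption, not something the slide states), the implied FAQ weight can be backed out from the reported figures. The helper name below is hypothetical, and because the inputs are rounded the result is only approximate.

```python
def implied_weight(overall, score_a, score_b):
    """Solve overall = w * score_a + (1 - w) * score_b for w."""
    if score_a == score_b:
        raise ValueError("weight is undetermined when both scores are equal")
    return (overall - score_b) / (score_a - score_b)


# Figures taken from the slide; the weighted-mean model is ASSUMED.
ms_faq_weight = implied_weight(overall=2.48, score_a=2.6, score_b=2.31)
systran_faq_weight = implied_weight(overall=2.23, score_a=2.4, score_b=1.97)

print(f"MS Translator implied FAQ weight:  {ms_faq_weight:.2f}")
print(f"Systran Hybrid implied FAQ weight: {systran_faq_weight:.2f}")
```

Under that assumption both engines give a FAQ weight of roughly 0.6, which would be consistent with FAQ content making up a little more than half of the scored material.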
12. 2. Best of breed means managing post-editing costs
13. 3. Best of breed means retaining your post-editors
15. Area | Feature | RBMT | SMT
(Where a row carries a single ✓, the transcript does not preserve whether it sits in the RBMT or the SMT column.)
Capability | Add rare language pairs | ✓
Capability | Number of languages it can handle out of the box | 20 | 50
Cost | Free or Open Source version exists | ✓ | ✓
Quality | Respects grammatical rules | ✓
Quality | Handles software tags properly | ✓
Quality | Output is fluent | ✓
Quality | Can handle bad grammar | ✓
Quality | Quality improves with Controlled Authoring | ✓
Quality | Output is predictable | ✓
Quality | Retains corrections to terminology (and applies the correct grammar) | ✓
16. Area | Feature | RBMT | SMT
(Each row carries a single ✓; the transcript does not preserve whether it sits in the RBMT or the SMT column.)
Suitability | Is better for User Generated Content and broad domain material such as patents | ✓
Suitability | Is better suited to on-the-fly translations of short shelf-life content | ✓
Suitability | Is better for documentation and even software | ✓
Suitability | Is suited for rare language pairs | ✓
Suitability | Is better suited to post-editing | ✓
Training | Learns automatically | ✓
Training | Rapid development customization cycle | ✓
Training | Effective with limited training corpus | ✓