Our presentation at #Eliatogether in Athens was well received by many attendees. Will disintermediation become a force to reckon with in the translation industry, as has happened in the hotel and travel industries? What is the role of machine translation in all this? How does neural machine translation work?
2. A few words about…Manuel Herranz
Majored in mechanical engineering & languages (UK)
Joined Giddings & Lewis - Ford, Valencia / Chihuahua, 1993-1996
Rolls Royce Marine & Industrial, Spain / Argentina, 1997-98 and 2000
Joined B.I Corporation Japan 1996-2004
Friendly buy-out 2005: Pangeanic
3. What we do as an industry
Fight in a segmented market
Enable international business
Help people / organizations to communicate
Innovate
Differentiate?
At what speed?
Really?
5. More importantly…
Are you ahead of the game?
95% of LSCs have no iSEO strategy (beyond having a bilingual website) because translation is expensive. Do you invest in the product you sell?
80% have no national SEO strategy.
50% apply / adopt MT (only 25% have MT embedded in their systems / custom engines).
10% have a centralized TM system to leverage past content.
Most operate with hundreds of TMs on a server.
6. Business model in 5 years?
Disintermediation – these companies re-invented business models
Or offer added value
vs
LSCs become language recruitment agencies / HR specialists?
Success: Basic business proposition + value
7. Industry revenues more than doubled from ~$19B in 2005 to ~$40B in 2016. [Common Sense Advisory data]
Translation buyers worry about ever-growing content volumes and more language pairs, but with stable or shrinking budgets.
Management expects that Amazon, Microsoft or Google Translate will take care of “language problems” one day: less complexity, lower cost.
The US Census shows that the number of translation workers has doubled since 2008. [Slator, May 24, 2017]
Automation: project management value replaced by bots?
The number of working translators on LinkedIn has increased by 50+% since 2010. [LinkedIn data]
Market forces squeeze mid-sized companies from both ends: large companies can offer economies of scale, while small ones are specialists, niche or local.
Business model in 5, 4, 3 years…?
8. Disintermediation / Direct clients
Satisfying the 5Bn searches/day and increasing demand for cheaper language services
TM+MT leveraging, CAT agnostic, inexpensive tools
When Google was founded in September 1998, it was serving ten thousand search queries per day (by the end of 2006 that same amount would be served in a single second)
Affordable
Efficiency areas and growth hacks
9. Centralised TMs for advanced leveraging (ActivaTM)
The web as a sales tool (SEO, SEM); online ordering system (Cor).
Machine Translation: SMT -> NMT … some astounding results.
Winning iADAATPA: largest EC MT infrastructure project linking MT vendors to Public Administrations.
The Pangeanic Experience
10. The Database
Elasticsearch-based
All language assets in one database, irrespective of the tool that created them
Deep learning for tag handling
CAT-tool agnostic (solves interoperability issues)
Automatic fuzzy match repair
Matrix (triangulate to create new language pairs)
Statistics on all segment units, words, domains
Remote access, API
Pre-filter prior to MT (TM+MT)
More powerful (strict) fuzzy matching than traditional CAT-tools
Saved +14% in fuzzy matches
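The "pre-filter prior to MT (TM+MT)" idea can be illustrated with a small sketch: look for a close match in the translation memory first, and only fall back to machine translation when nothing scores above a threshold. This is not how ActivaTM is implemented; the function names, the 0.85 threshold, the character-based similarity from Python's `difflib`, and the placeholder `fake_mt` engine are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_score(a: str, b: str) -> float:
    """Character-based similarity in [0, 1]; an assumed stand-in for a real fuzzy-match metric."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_tm_match(segment, tm):
    """Return (score, source, target) for the closest TM entry, or None if the TM is empty."""
    best = None
    for source, target in tm.items():
        score = fuzzy_score(segment, source)
        if best is None or score > best[0]:
            best = (score, source, target)
    return best


def translate(segment, tm, mt, threshold=0.85):
    """Use the TM when the best match reaches the threshold, otherwise call the MT engine."""
    match = best_tm_match(segment, tm)
    if match is not None and match[0] >= threshold:
        return match[2], "TM"
    return mt(segment), "MT"


def fake_mt(segment):
    """Hypothetical MT engine used only to make the sketch runnable."""
    return "[MT] " + segment


# Toy translation memory with a single English -> Spanish unit.
tm = {"Remove the configuration file.": "Elimine el archivo de configuración."}

print(translate("Remove the configuration file.", tm, fake_mt))
print(translate("Hello", tm, fake_mt))
```

In practice a fuzzy match below 100% would still go to a translator or to a repair step before reuse; the sketch only shows the routing decision.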
12. National research project with EU funding
Full platform
Use by Pangeanic, LSPs, 3rd parties
Eases estimation and automates workflows in any translation format (doc or web)
CMS agnostic – extracts text and converts to XLIFF (doc or web)
Translate sections of a website only (batches)
Detect new content or content that has been eliminated to update language versions
The Web
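Detecting new or eliminated content between two crawls of a site can be sketched by fingerprinting each text segment and comparing the two sets. The slides do not describe how the platform does this; the hashing approach and the names `fingerprint` and `diff_content` below are assumptions for illustration.

```python
import hashlib


def fingerprint(segment: str) -> str:
    """Hash a segment after collapsing whitespace, so layout changes alone are ignored."""
    normalized = " ".join(segment.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def diff_content(old_segments, new_segments):
    """Return (added, removed) segments between two versions of a page."""
    old = {fingerprint(s): s for s in old_segments}
    new = {fingerprint(s): s for s in new_segments}
    added = [s for h, s in new.items() if h not in old]
    removed = [s for h, s in old.items() if h not in new]
    return added, removed


previous = ["Welcome to our site.", "Contact us today.", "Old offer ends soon."]
current = ["Welcome to  our site.", "Contact us today.", "New product launched."]

added, removed = diff_content(previous, current)
print("Send for translation:", added)
print("Remove from language versions:", removed)
```

Only the added segments would need to be sent for translation, which is what makes translating a website in batches economical.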
15. Neural Machine Translation - background
Artificial Neural Networks for SMT
History of ANN-based Machine Translation and Language Modelling for SMT:
1997 [Castano & Casacuberta 97] (JAUME I & U.Politécnica): Machine translation using neural networks and finite-state models (PangeaMT: https://www.prhlt.upv.es/wp/research-areas/mt-showcase)
2007 [Schwenk & Costa-jussa 07]: Smooth bilingual n-gram translation.
2012 [Le & Allauzen 12, Schwenk 12]: Continuous space translation models with neural networks.
2014 [Devlin & Zbib 14]: Fast and robust neural networks for SMT
Conventional SMT
Use of statistics has been controversial in computational linguistics:
Chomsky 1969: "... the notion 'probability of a sentence' is an entirely useless one, under any known interpretation of this term."
Considered to be true by most experts in (rule-based) natural language processing and artificial intelligence
History of Statistical Approach to MT
1989-94: IBM's pioneering work
Since 1996: only a few teams favored SMT: U.Politécnica Valencia, RWTH Aachen, HKUST, CMU
2006/2007: Google Translate
2006-2012: Euromatrix
2009: PangeaMT
2016: First trials in NMT
2017: European Commission: iADAATPA project
16. iADAATPA Platform (cloud vs on-premise)
[Architecture diagram; labels recovered from the figure]
Connectors: CMS 1, CMS 2, CKAN, Widget, browser
AT systems: Tilde MT, Pangeanic, KantanMT, Prompsit, eTranslation
Components: Priority, Admin, Lang / Q router, User management / Profile, Conversion of documents (proprietary formats), e-Sens AS4 Profile compliant
- Requests: supporting both synchronous and asynchronous requests
- Many iADAATPA deployments are possible.
- A global instance register is kept by the commercial partner.
- Send translation request
- Receive webhook
- Ask for “request done”
- AT systems receive webhook
- Ask for “content request”
BACKOFFICE:
- Global (instance management)
- Individual (for each instance)
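The asynchronous flow named on the slide (send translation request, receive webhook, ask for “request done”) can be illustrated with an in-memory toy. This is not the iADAATPA API: the class `ToyPlatform`, its method names, the job dictionary and the upper-casing "engine" are all hypothetical, and no network calls are made.

```python
import uuid


class ToyPlatform:
    """In-memory stand-in for a translation platform (hypothetical, for illustration only)."""

    def __init__(self, engine):
        self.engine = engine  # callable that "translates" a string
        self.jobs = {}

    def send_translation_request(self, text, webhook):
        """Register a job and return its id immediately (asynchronous request)."""
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {"text": text, "webhook": webhook, "status": "pending", "result": None}
        return job_id

    def process_pending(self):
        """Simulate the engine finishing its work and notifying each client via its webhook."""
        for job_id, job in self.jobs.items():
            if job["status"] == "pending":
                job["result"] = self.engine(job["text"])
                job["status"] = "done"
                job["webhook"](job_id)

    def request_done(self, job_id):
        """Return the translation if the job is finished, otherwise None."""
        job = self.jobs[job_id]
        return job["result"] if job["status"] == "done" else None


notifications = []  # the client's webhook just records the job ids it is told about
platform = ToyPlatform(engine=str.upper)  # placeholder engine: upper-cases the text

job_id = platform.send_translation_request("hello world", notifications.append)
before = platform.request_done(job_id)   # still pending
platform.process_pending()               # engine runs, webhook fires
after = platform.request_done(job_id)    # result is now available
print(before, notifications == [job_id], after)
```

A synchronous request would simply block until the result is ready; the webhook pattern lets a CMS submit many documents without holding connections open.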
18. This is more realistic: MT in the wild, wild, wild world
Quite an accurate workflow when integrating MT at a company
MT engine
19. Neural Machine Translation –
Is there a future for translation services?
Machine translation will displace only those humans who translate like machines.
(The remaining) translators will focus on tasks that require intelligence.
- Arle Lommel, 2012
20. Neural Machine Translation
Tests in FIGS (French, Italian, German, Spanish), RU and PT point to a very strong preference for NMT fluency (bit.ly/neural-machine-translation-pangeanic).
On average, from a set of 250 sentences, around 85%-92% of NMT outputs were rated good or very good (A or B). ES/PT/IT results were similar to FR.
Evaluation by translation companies and professional freelance translators
EN-DE set of 250 sentences
Rating   NMT          SMT
A        132 (53%)    34 (14%)
B        98 (39%)     95 (38%)
C        14 (6%)      97 (39%)
D        6 (2%)       24 (10%)
Total    250          250
EN-FR set of 250 sentences
Rating   NMT          SMT
A        150 (60%)    39 (16%)
B        76 (30%)     126 (50%)
C        21 (8%)      71 (28%)
D        3 (1%)       14 (6%)
Total    250          250
EN-RU set of 250 sentences
Rating   NMT          SMT
A        128 (51%)    39 (16%)
B        84 (34%)     43 (17%)
C        22 (9%)      60 (24%)
D        16 (6%)      108 (43%)
Total    250          250
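The "good or very good (A or B)" share quoted above can be recomputed from the rating counts in the three tables. The short script below only does that arithmetic; the data structure and the function name `good_share` are illustrative choices.

```python
# Rating counts per language pair, ordered [A, B, C, D], copied from the tables above.
ratings = {
    "EN-DE": {"NMT": [132, 98, 14, 6], "SMT": [34, 95, 97, 24]},
    "EN-FR": {"NMT": [150, 76, 21, 3], "SMT": [39, 126, 71, 14]},
    "EN-RU": {"NMT": [128, 84, 22, 16], "SMT": [39, 43, 60, 108]},
}


def good_share(counts):
    """Percentage of sentences rated A or B, from counts ordered [A, B, C, D]."""
    return 100.0 * (counts[0] + counts[1]) / sum(counts)


for pair, systems in ratings.items():
    shares = {name: round(good_share(counts), 1) for name, counts in systems.items()}
    print(pair, shares)
```

For NMT this gives 92.0% (EN-DE), 90.4% (EN-FR) and 84.8% (EN-RU), consistent with the 85%-92% range on the slide; the corresponding SMT shares are 51.6%, 66.0% and 32.8%.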
23. Feed Forward Neural Machine Translation
Training set / Test / Reference translation
Test: out of the training set we take 2000 sentences to try the system with in-domain text (typical sentences the system may encounter in the future), for example:
Remove any protocol configuration files that are not used for the specified protocol .
These tables are sometimes referred to as " no sync " tables .
This chapter will describe many of those pages and parameters .
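Holding out 2000 in-domain sentences for testing can be sketched as a simple random split of the parallel corpus. The slides do not say how the sentences were selected; random sampling with a fixed seed, the function name `split_corpus` and the synthetic corpus below are assumptions.

```python
import random


def split_corpus(pairs, test_size=2000, seed=0):
    """Shuffle (source, reference) pairs and hold out `test_size` of them as the test set."""
    if test_size >= len(pairs):
        raise ValueError("corpus too small for the requested test set")
    shuffled = list(pairs)                 # copy, so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    return shuffled[test_size:], shuffled[:test_size]


# Synthetic stand-in for a parallel corpus of (source sentence, reference translation) pairs.
corpus = [(f"source sentence {i}", f"reference translation {i}") for i in range(10000)]

train_set, test_set = split_corpus(corpus)
print(len(train_set), len(test_set))
```

The held-out pairs must not be used during training; their reference translations are what the system output is compared against.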
24. Feed Forward Neural Machine Translation
[Training diagram; labels recovered from the figure]
Input Query -> Output
Label (data we already know)
Error function (detects “wrong match”)
Update function, i.e. the “learning process”: New Weights (W) + New Bias (B)
And after many, many training sessions detecting patterns, trial and error and feedback loops…
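The loop in this diagram (input, output, error against a known label, update of weights and bias) can be shown with a single artificial neuron. This is a deliberately tiny sketch, far simpler than a translation network: the linear model, the squared-error gradient step, the learning rate and the toy data are all assumptions chosen so the example runs in plain Python.

```python
def train(samples, lr=0.05, epochs=500):
    """Fit output = w * x + b by repeatedly correcting w and b from the error on each sample."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, label in samples:
            output = w * x + b      # forward pass: input query -> output
            error = output - label  # error function: how far the output is from the known label
            w -= lr * error * x     # update function: new weight
            b -= lr * error         # update function: new bias
    return w, b


# Toy data where every label equals 2 * x + 1 ("data we already know").
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

w, b = train(samples)
print(round(w, 3), round(b, 3))
```

After enough passes the weight and bias settle close to 2 and 1, the values that generated the labels, which is the sense in which the system "learns" from examples.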
25. Feed Forward Neural Machine Translation
[Diagram after training; labels recovered from the figure]
Input Query -> Output
Label (data we already know)
Error function (detects “wrong match”)
No Update!!! (“learning process” completed)
Now we have a system!!!! Input Queries -> Output Labels
80%-85% accuracy!!
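Once training stops, the weights are frozen and the system is only used to produce outputs, which can then be scored against known labels. The sketch below shows that evaluation step for a toy yes/no classifier; the frozen parameters, the test examples and the resulting score are invented for illustration and say nothing about how translation quality was actually measured.

```python
def predict(x, w, b):
    """Apply the frozen model: no weights or biases are updated here."""
    return 1 if w * x + b >= 0 else 0


def accuracy(examples, w, b):
    """Fraction of examples whose predicted label equals the known label."""
    correct = sum(1 for x, label in examples if predict(x, w, b) == label)
    return correct / len(examples)


# Hypothetical parameters left over from an earlier training run.
w, b = 1.0, -2.5

# Hypothetical held-out examples as (input, known label).
test_examples = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (2.2, 1)]

score = accuracy(test_examples, w, b)
print(f"{score:.1%}")
```

Here five of the six examples are classified correctly; the last one falls on the wrong side of the decision boundary, which is the kind of residual error an accuracy figure below 100% reflects.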
26. Recurrent NMT + Attention Models
Attention models tell the system which
encoder states to look at
Encoder input (source): a good and sound agreement
Decoder output (target): un buen y sólido acuerdo
(<s> marks the start of the sentence in the diagram)
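How an attention model decides "which encoder states to look at" can be sketched as follows: score every encoder state against the current decoder state, turn the scores into weights with a softmax, and take the weighted average as context. The slides do not specify the scoring function; dot-product scoring and the hand-made 3-dimensional vectors below are assumptions for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (attention weights, context vector) using dot-product scores."""
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context


source_tokens = ["a", "good", "and", "sound", "agreement"]

# Made-up encoder states, one vector per source token.
encoder_states = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 2.0, 2.0],
]

# Made-up decoder state at the step that should produce "acuerdo".
decoder_state = [0.0, 1.0, 1.0]

weights, context = attend(decoder_state, encoder_states)
focus = source_tokens[weights.index(max(weights))]
print(focus, [round(w, 3) for w in weights])
```

With these toy vectors the largest weight falls on "agreement", mirroring the alignment between "agreement" and "acuerdo" in the slide's example.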
28. Open Questions
Are you working in the same way as 5 years ago? Do you think you will be working in the same way in 2023?
Will translation companies continue to provide translation services only?
New business models: offer translation order automation (management systems), disintermediation, raw MT services?
Will large translation companies consolidate and dominate globally? Can new players emerge with the right tools, selling globally?
Is translation company-to-translation company selling a viable model?