- MT@EC is a machine translation system developed by the European Commission to provide automated translations for all 24 official EU languages.
- It was launched in 2013 to address the growing translation needs of the EU, which far exceed the translation capacity of the Commission.
- MT@EC is used both for understanding texts written in other languages and, for dissemination, as a tool that helps human translators draft translations more efficiently.
- The system continues to be improved through customization pilots with public institutions and by incorporating translator feedback to enhance quality over time.
TAUS MT Showcase, MT@EC for European public administrations and online services, Spyridon Pilos, European Commission
1. Wednesday, 4 June
MT@EC for European Public Administrations and Online Services
Spyridon Pilos, European Commission
TAUS Machine Translation Showcase 2014, Dublin (Ireland)
The research within the project MosesCore leading to these results has received funding from the European Union 7th Framework Programme, grant agreement no 288487
2. MT@EC
European Commission machine translation
for public administrations and digital services
in the European Union
Spyridon Pilos
Head of Language applications, IT unit
Directorate-General for Translation (DGT)
Dublin, 4 June 2014
3. European Commission machine translation
• European Commission and languages
• MT@EC: machine translation for EU users
• What next?
6. Why do we need machine translation?
• The Commission…
• DGT has 1700 translators
• Over 2 M pages translated in 2013
• But…
…just to make europa.eu fully multilingual
almost 6.8 M documents to be translated
or 8 500 translators/year!
The result:
Thousands of non-translated documents
(and this does not include user generated content)
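The figures on this slide can be checked with a few lines of arithmetic. This is only a sketch: the slide counts current output in pages and the backlog in documents, so the two ratios below are not directly comparable, and "over 2 M" is treated as exactly 2 million for illustration.

```python
# Figures taken from the slide; "over 2 M pages" is rounded down to 2 million.
translators = 1700
pages_translated_2013 = 2_000_000
backlog_documents = 6_800_000
translators_needed = 8500

# Average yearly output per DGT translator, in pages.
pages_per_translator = pages_translated_2013 / translators

# Workload implied by the slide's estimate, in documents per translator-year.
documents_per_translator_year = backlog_documents / translators_needed

# How many times larger the workforce would have to be.
staffing_multiple = translators_needed / translators

print(f"{pages_per_translator:.0f} pages per translator per year")
print(f"{documents_per_translator_year:.0f} documents per translator-year")
print(f"{staffing_multiple:.0f}x the current number of translators")
```

The last ratio is the point of the slide: making europa.eu fully multilingual would take five times the translators DGT has.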
7. MT and EC: a long history
Started in the 1970s
• Eurotra (78-92): research, high expectations
• Rule-based ECMT (75-97), costly to develop – not scalable
(18 language pairs in 20 years; coverage of post-2004
languages never attempted; system shut down in 2010)
Data-driven systems (Statistical MT):
• cheap and quick to develop… if you have good data
• EC needs solution for all EU languages… and has good data
EC action plan (2009), Inter-service task force (2010)
• The goal: MT@EC offering machine translation for all
languages to and from English, operational in July 2013
8. MT for understanding (inbound)
[Diagram: MT between L1 and L2, L3, …, Ln]
Robustness, Coverage
Practically unlimited demand; free web-based services cover much of it
Requirements for MT@EC
• Provide MT as a (simple and robust) service
• Optimise quality for understandability (gisting)
• Deal with many domains, document types, formats, …
• Scale to huge volumes
Two Usage Scenarios for MT@EC
9. MT for dissemination (outbound)
Textual quality
[Diagram: MT between L1 and L2, L3, …, Ln]
Publishable quality can only be authored by humans; Translation Memories & CAT-Tools used by professional translators
Requirements for MT@EC
• Provide MT as a tool within a CAT workflow
• Develop new ways to incorporate feedback
  • explicit feedback on MT quality, implicit feedback via TM
  • improvements requiring language-specific knowledge
  • towards hybrid approaches
• Optimise quality for post-editing
10. MT@EC: a European Commission product
• Released : 26 June 2013 (version 1.0)
• Languages: All 24 EU official languages
552 language pairs (61 direct)
• Technology: Statistical machine translation
using open source software Moses co-funded by EU
Framework Programmes for research and innovation
• Development by DGT: 2010-2013
co-funded by the ISA* programme (action 2.8)
* Interoperability solutions for public administrations
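The count of 552 language pairs follows from 24 languages taken as ordered (source, target) pairs. The sketch below reproduces that number and shows one way the non-direct pairs could be served. Two things here are assumptions, not statements from the slides: that indirect pairs are routed through English as a pivot, and that the direct set consists of the English pairs (the slides do not list which 61 pairs are direct). The `route` helper is hypothetical.

```python
from itertools import permutations

# The 24 official EU languages (ISO 639-1 codes).
EU_LANGUAGES = [
    "BG", "CS", "DA", "DE", "EL", "EN", "ES", "ET", "FI", "FR", "GA", "HR",
    "HU", "IT", "LT", "LV", "MT", "NL", "PL", "PT", "RO", "SK", "SL", "SV",
]

# Every ordered (source, target) pair: 24 * 23.
all_pairs = list(permutations(EU_LANGUAGES, 2))

# Pairs to or from English, used here as an illustrative "direct" set.
english_pairs = {(s, t) for s, t in all_pairs if "EN" in (s, t)}


def route(source, target, direct_pairs, pivot="EN"):
    """Hypothetical routing: use a direct engine if one exists,
    otherwise translate in two hops through the pivot language."""
    if (source, target) in direct_pairs:
        return [source, target]
    return [source, pivot, target]


print(len(all_pairs))       # number of ordered pairs
print(len(english_pairs))   # pairs with English on one side
print(len(all_pairs) - 61)  # pairs without a direct engine, per the slide
print(route("FI", "PT", english_pairs))
```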
11. MT@EC description
• Delivery:
  - web user interface (human to machine)
  - web services (machine to machine)
• Security: Host (EC data centre) + access (ECAS) + transfer (sTesta)
• Special features:
  • Source document format/formatting maintained
  • Specific output formats for translation: tmx and xliff
  • Can translate multiple documents to multiple languages
  • Translation can also be returned by email
  • Indication of quality for language pairs
  • Feedback mechanism
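To make the "tmx" output format concrete, here is a minimal sketch that writes source/target segment pairs as a TMX 1.4 document using Python's standard library. It illustrates the file format only; it is not MT@EC's actual output, and the `build_tmx` helper and the header values (tool name, version) are made up for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree spells the reserved xml:lang attribute with its namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang, tgt_lang):
    """Return a TMX 1.4 document (as a string) with one <tu> per segment pair."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-sketch",   # placeholder value
        "creationtoolversion": "0.1",       # placeholder value
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


tmx_text = build_tmx(
    [("Machine translation", "Traduction automatique")], "en", "fr"
)
print(tmx_text)
```

A file like this can be imported into a CAT tool as a translation memory, which is how MT output typically enters the translator's workflow.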
12. Quality evaluation and improvement… with the help of DGT translators
• “Maturity Check” (April-May 2011)
  • Can baseline MT engines already be used as such?
  • Identify main sources of problems for various languages, cluster them across languages
• Real-life trial (July 2011-June 2013)
  • Make first MT results available to translators
  • Auto-MT for 10..19 “best” language pairs (now: all)
  • On-demand MT for others (now all languages get MT)
• Automatic scores
  • BLEU scores for internal tuning and regression testing
  • Can help to identify domains/document types where MT is most useful, but also point to systematic difficulties
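As an illustration of how BLEU can serve regression testing, the sketch below computes a plain corpus-level BLEU (single reference, whitespace tokenisation, no smoothing) and flags a new engine whose score drops below a baseline. It is a simplified textbook version, not DGT's evaluation setup; the `has_regressed` helper and its tolerance are assumptions made for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis and no smoothing."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(
                min(count, ref_counts[gram]) for gram, count in hyp_counts.items()
            )
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


def has_regressed(baseline_score, new_score, tolerance=0.005):
    """Hypothetical regression check: flag a drop larger than the tolerance."""
    return new_score < baseline_score - tolerance


references = ["the commission adopted the proposal on monday"]
baseline = corpus_bleu(["the commission adopted the proposal on monday"], references)
candidate = corpus_bleu(["the commission adopted a proposal on monday"], references)
print(baseline, candidate, has_regressed(baseline, candidate))
```

Running the same test set through each new engine build and comparing scores this way catches quality drops before translators see them, though BLEU alone says nothing about why a score changed.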
13. Maturity check 2011 (EN->X)
[Chart: DGT's SMT maturity check outcome as a useful/useless sentences ratio (0% to 100%) per target language, annotated with morphology. Languages shown: ES, FR, IT, PT, RO (Romance languages); DE, DA, NL, SV (Germanic languages); BG, CS, PL, SK, SL (Slavic languages); EL (Hellenic); MT (Semitic); LT, LV (Baltic lang.); ET, FI, HU (Finno-Ugric). Morphology annotations: analytic, inflected, highly inflected languages, composita, strong agglutination. Heading: Language differences.]
14. From the translator's point of view
+ Aid for typing
+ time savings
+ “original” proposed solution
+ guides the terminological research
- gender/numbers and order of words
- can be "fluent", but with mistranslations
- omissions and additions
- risk of error when incorrect terminology suggested
- quality dependent on the quality of the originals
15. MT@EC is already available to…
• … the staff of European institutions and bodies:
  • European Commission,
  • European Parliament,
  • Council of the European Union,
  • European Court of Justice,
  • Court of Auditors,
  • Economic and Social Committee
  • Committee of the Regions
  • European Central Bank,
  • European Investment Bank
  • Translation Centre
  • … and more
→ DGT took into account the needs of translators and other staff when designing the service
20. MT@EC is also integrated into EC digital services
→ operational
Service Description/URL
IMI Internal Market Information System
http://ec.europa.eu/internal_market/imi-net/index_en.html
SOLVIT SOLVIT is an online problem-solving network concerning
misapplication of Internal Market law by public authorities.
http://ec.europa.eu/solvit/
→ DGT supports and advises for better integration on the customer side
21. Integration into EC digital services
→ under development (indicative list)
Service Description/URL
nLex A common gateway to National Law
http://eur-lex.europa.eu/n-lex/
TED TED (Tenders Electronic Daily) is the online version of the
'Supplement to the Official Journal of the European Union', dedicated
to European public procurement.
http://ted.europa.eu/
e-Justice The future electronic one-stop-shop in the area of justice.
http://e-justice.europa.eu/
Joinup Joinup is an open collaborative platform supporting interoperability in
Europe.
https://joinup.ec.europa.eu/
22. Integration into EC digital services
→ initiated (indicative list)
Service Description/URL
ODR Platform to facilitate the resolution of consumer disputes out-of-court
(Alternative Dispute Resolution)
http://ec.europa.eu/consumers/redress_cons/adr_en.htm
EURES The European Job Mobility portal networking the European
employment services.
https://ec.europa.eu/eures/
EQF The portal supporting the implementation of the European
Qualifications Framework for lifelong learning.
http://ec.europa.eu/eqf/home_en.htm
ESCO The multilingual classification of European Skills, Competences,
Qualifications and Occupations; identifies and categorises skills and
competences, qualifications and occupations in 22 European
languages. Supports EURES and other similar portals.
https://ec.europa.eu/esco/
23. MT@EC for public administrations
Free real-life trial in 2014:
• Staff can have direct free access to the standard MT@EC service (upon request)
• Organisations can participate in a "customisation" pilot project, where DGT builds specific engines with their data (based on bilateral cooperation agreements)
→ DGT to understand better their needs and constraints and develop appropriate service delivery models
24. Customisation pilots
• Pilot A: Connect an information system to the standard
MT@EC service.
• Pilot B: DGT builds custom engines (their data) available
to all through MT@EC
• Pilot C: DGT builds custom engines (their data) available
only to them through MT@EC
• Pilot D: DGT builds custom engines (their data) for them to
run on their own premises
• Pilot E: DGT assists them in building their own custom
engines to run on their own premises
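The five pilots differ along a few dimensions: which engine is used, who builds it, where it runs, and who can use it. The sketch below records them as a small data structure so the options can be compared or filtered. Field names and values are illustrative labels read off the slide, not official terminology; `None` marks something the slide does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pilot:
    code: str
    engine: str                 # "standard" or "custom"
    built_by: str               # who builds the engine
    hosted: str                 # where the engine runs
    shared_with: Optional[str]  # who can use it; None if the slide does not say


PILOTS = [
    Pilot("A", "standard", "DGT", "MT@EC", None),
    Pilot("B", "custom", "DGT", "MT@EC", "all MT@EC users"),
    Pilot("C", "custom", "DGT", "MT@EC", "the organisation only"),
    Pilot("D", "custom", "DGT", "organisation", None),
    Pilot("E", "custom", "organisation, assisted by DGT", "organisation", None),
]

# Example query: which pilots require the organisation to run its own engines?
self_hosted = [p.code for p in PILOTS if p.hosted == "organisation"]
print(self_hosted)
```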
25. MT@EC: right for the EU
Quality:
• built on data derived from EU translations
(Euramis translation memory system: 800 M segments in
24 languages and annual growth rate > 20%)
• designed for EU relevant collaboration
• team of computational linguists working with
translators and linguists in DGT
• work to improve MT for all EU languages
Security
Customer support
26. MT@EC: what next
• CEF (Connecting Europe Facility)
• A funding programme for building and deploying
infrastructures.
• Includes deploying mature technologies to build, enable and
operate pan-European Digital Services.
• Includes an Automated Translation (AT) platform as one of
its core building blocks for digital services.
• A key component of the AT platform is MT@EC.
27. The automated translation platform
• To facilitate cross-border information exchange and
enable cross-border access to online content and
services provided by the digital service infrastructures
of the CEF.
• To offer MT services to EU institutions and public
administrations in the Member States.
• To build on the existing Commission Machine
Translation service (MT@EC)
• Emphasis is placed on secure, quality, customisable
machine translation.
→ Follow this space:
http://ec.europa.eu/digital-agenda/en/connecting-europe-facility