Welcome to the Cloud!
Terminology as a Service
Andrejs Vasiļjevs
Tilde
tekom 2013 / Wiesbaden / 07.11.2013.
Complexity of terminology work
• Term identification in the source text
• Consulting online databases and local files for translation equivalents
• Creating and maintaining terminology glossaries
• Sharing term glossaries and involving others in their polishing
• Structuring data in industry-standard formats
• Integrating term glossaries in CAT and other productivity tools
• Keeping terminology up to date
• etc.
Terminology as a Service

A cloud-based platform for acquiring, cleaning up, sharing, and reusing multilingual terminological data
TaaS User Needs Survey Results:
Importance of terminology work
[Pie chart. Legend: Very important, Quite important, Less important, Not important. Slice values as extracted: 1.8%, 14.8%, 43.5%, 39.9%.]
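TaaS User Needs Survey:
willingness to share
[Chart. Overall split as extracted: 60.5% and 39.5%, with the legend entries "Yes, provided that…" and "No, because…".
Conditions listed for "Yes": Joint contribution to the DB, Access control, Legal aspects, External quality control, Little effort, Anonymity, Other.
Reasons listed for "No": Legal restrictions, Poor quality/Lack of time, Own asset, Risk of misunderstanding.
Slice values as extracted: 16.7%, 8.3%, 24.9%, 6.0%, 4.6%, 16.5%, 48.6%, 7.6%, 19.2%, 11.4%, 14.2%, 22.0%.]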
TaaS Partners

• Tilde, Latvia (Coordinator)
• TAUS, Netherlands
• Kilgray, Hungary
• Cologne University of Applied Sciences, Germany
• University of Sheffield, UK
TaaS Mission
• Simplify the process for language workers to prepare, store, and share task-specific multilingual term glossaries
• Provide instant access to term translation equivalents and translation candidates for professional translators through CAT tools
• Support domain adaptation of statistical machine translation systems through dynamic integration with TaaS-provided terminology data
Key services of TaaS
• Automatic extraction of monolingual term candidates from user-uploaded documents
• Automatic retrieval of translation equivalents from different public and industry terminology databases
• Translation candidate acquisition from multilingual web data
• Facilities for users to clean up automatically acquired terminological data
• Data sharing and integration facilities through APIs and export tools
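
As a rough illustration of the first service in the list, monolingual term candidate extraction can be approximated by counting frequent word n-grams that neither start nor end with a stop word. This is a minimal sketch only and not the TaaS extraction method. The stop-word list, the function name `extract_term_candidates`, and the thresholds are assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use fuller lists).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "is", "are", "with", "by"}


def extract_term_candidates(text, max_len=3, min_freq=2):
    """Return (candidate, frequency) pairs for n-grams seen at least min_freq times.

    A candidate is a 1..max_len word n-gram that neither starts nor ends with
    a stop word. Sentence boundaries are ignored in this simplified version.
    Results are sorted by frequency, then longer phrases first, then alphabetically.
    """
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                continue
            counts[" ".join(gram)] += 1
    candidates = [(term, freq) for term, freq in counts.items() if freq >= min_freq]
    return sorted(candidates, key=lambda item: (-item[1], -len(item[0].split()), item[0]))


if __name__ == "__main__":
    sample = (
        "The brake caliper presses the brake pad against the disc. "
        "Replace the brake pad when the brake pad is worn."
    )
    for term, freq in extract_term_candidates(sample):
        print(freq, term)
```

The output of such a step is only a list of candidates; the clean-up facilities named above are what turn candidates into a usable glossary.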
Focus areas

Research
• Term extraction
• Collection of domain-specific multilingual corpora
• Max(FTC)

Development
• Quality
• Performance
• Scalability
• Interoperability

Usage
• Usability
• Outreach
• Sustainability
TaaS Services
Target Repositories
• TAUS Data: repository of multilingual translation memories
• EuroTermBank: databank of federated multilingual terminology
• IATE: inter-institutional termbank of the European Union
• META-SHARE: distributed pan-European repository of language resources
Integration
• Support for industry-standard formats
• Integration into CAT and productivity tools
• API to integrate TaaS services into various software applications
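
To make the idea of an industry-standard exchange format concrete, the sketch below serialises a small glossary into a TBX-style document with Python's standard library. The element names (`martif`, `martifHeader`, `termEntry`, `langSet`, `tig`, `term`) follow the core structure of TBX (ISO 30042:2008). The function name `glossary_to_tbx`, the entry ids, and the header text are assumptions for illustration. The output is not validated against a TBX schema and should not be read as the actual TaaS export format. The `koks` / `timber` pair is the example that appears later on the slides.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the predefined xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def glossary_to_tbx(entries, source_lang="en"):
    """Build a minimal TBX-style tree from a list of {language code: term} dicts."""
    root = ET.Element("martif", {"type": "TBX", XML_LANG: source_lang})
    header = ET.SubElement(root, "martifHeader")
    file_desc = ET.SubElement(header, "fileDesc")
    source_desc = ET.SubElement(file_desc, "sourceDesc")
    ET.SubElement(source_desc, "p").text = "Illustrative glossary"
    body = ET.SubElement(ET.SubElement(root, "text"), "body")
    for index, entry in enumerate(entries, start=1):
        # One termEntry per concept, one langSet per language.
        term_entry = ET.SubElement(body, "termEntry", {"id": f"e{index}"})
        for lang, term in entry.items():
            lang_set = ET.SubElement(term_entry, "langSet", {XML_LANG: lang})
            tig = ET.SubElement(lang_set, "tig")
            ET.SubElement(tig, "term").text = term
    return root


if __name__ == "__main__":
    tree = glossary_to_tbx([{"en": "timber", "lv": "koks"}])
    print(ET.tostring(tree, encoding="unicode"))
```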
Term identification and annotation
HTML Term Annotation
Term entries for terms identified in EuroTermBank are stored in TBX format
in a <script> element that is placed in the HTML5 document.
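
A minimal sketch of what such an annotated page might look like, and how a consumer could read it back. `its-term` and `its-term-info-ref` are the HTML attribute forms that ITS 2.0 defines for its Terminology data category. The `type` and `id` values on the `<script>` element, the sample sentence, and the names `TermAnnotationParser` and `read_annotations` are assumptions for illustration, not the markup TaaS actually emits.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

TBX_SCRIPT_TYPE = "application/x-tbx+xml"  # assumed value, not taken from the slides

ANNOTATED_HTML = """<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Example</title>
<script id="tbx-entries" type="application/x-tbx+xml">
<termEntry id="e1"><langSet xml:lang="en"><tig><term>timber</term></tig></langSet></termEntry>
</script>
</head>
<body>
<p>The frame is made of <span its-term="yes" its-term-info-ref="#e1">timber</span>.</p>
</body>
</html>"""


class TermAnnotationParser(HTMLParser):
    """Collect ITS 2.0 term spans and the raw content of the TBX <script>.

    Simplification: term spans are assumed to contain no nested elements.
    """

    def __init__(self):
        super().__init__()
        self.terms = []        # (term text, its-term-info-ref value) pairs
        self.tbx_chunks = []   # raw text found inside the TBX <script>
        self._in_tbx_script = False
        self._open_term = None  # info-ref of the span being read, if any
        self._term_text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script" and attributes.get("type") == TBX_SCRIPT_TYPE:
            self._in_tbx_script = True
        elif attributes.get("its-term") == "yes":
            self._open_term = attributes.get("its-term-info-ref", "")
            self._term_text = []

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_tbx_script = False
        elif self._open_term is not None:
            self.terms.append(("".join(self._term_text), self._open_term))
            self._open_term = None

    def handle_data(self, data):
        if self._in_tbx_script:
            self.tbx_chunks.append(data)
        elif self._open_term is not None:
            self._term_text.append(data)


def read_annotations(html_text):
    """Return (term spans, parsed TBX entry) for an annotated page."""
    parser = TermAnnotationParser()
    parser.feed(html_text)
    parser.close()
    entry = ET.fromstring("".join(parser.tbx_chunks).strip())
    return parser.terms, entry


if __name__ == "__main__":
    terms, entry = read_annotations(ANNOTATED_HTML)
    print(terms)
    print(entry.get("id"), entry.findtext("langSet/tig/term"))
```

Keeping the TBX data in a `<script>` data block means browsers do not render it, while tools that understand the annotation can still resolve each marked term to its entry.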
XLIFF Term Annotation
Identifying and marking terms
New W3C standard: Internationalization Tag Set (ITS) 2.0

Showcase
[Diagram. Labels: Web Page; Plaintext; ITS 2.0 enriched content; Terminology Annotation Web Service API; TaaS Terminology Services; ITS 2.0 term-annotated content (export / visualisation); Human users (e.g., translators, terminologists); Machine users (CAT tools, MT systems).]
TaaS Architecture
[Architecture diagram. Recoverable labels:]
• Clients: CAT tools and MT (https, REST); web browsers (http/https, HTML); external TDBs (https, REST)
• Presentation Layer: Public API, Web Page UI
• Application Logic Layer: terminology collection management, user management, terminology collection search, terminology collection creation
• Data Storage Layer (Shared Term Repository)
• Term extraction workflows: full collection creation workflow, monolingual collection creation, translation candidate extraction, statistical DB acquisition, text tagging with terms
• High-performance Computing (HPC) Cluster: File Store, HPC frontend, SGE, CPU nodes
• Databases: Statistical DB, Shared Term Repository DB
• Modules: term extraction (TXT extractor, TWSC, Kilgray Term Extractor, term normalizer); collection creator; parameter retriever; Bilingual Term Extraction System; statistical DB feeding; translation lookup (ETB & STR, IATE, TAUS API, Statistical DB); collection merger; result processing (Collection Importer, Marked Text enrichment)
koks timber

How to instruct SMT
to use the right terms?
Put TaaS in the service of MT
Do-it-yourself MT factory in the cloud
Boost in the quality of machine translation
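
One common way to steer an SMT decoder toward approved terminology is to mark glossary terms in the source sentence and attach the preferred translation before decoding. The sketch below performs only that pre-processing step. The `<term translation="...">` tag shape, the function name `mark_terms`, and the direction of the `timber` / `koks` pair are assumptions for illustration. The slides do not say which mechanism TaaS uses, and a real system would also have to deal with inflected forms.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def mark_terms(sentence, glossary):
    """Wrap glossary terms found in `sentence` with their preferred translation.

    Longer terms are tried first so that a multi-word term wins over a
    shorter term it contains. Matching is case-insensitive and limited to
    whole words. Text outside the markup is XML-escaped.
    """
    if not glossary:
        return escape(sentence)
    lookup = {source.lower(): target for source, target in glossary.items()}
    ordered = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in ordered) + r")\b",
        re.IGNORECASE,
    )
    pieces = []
    position = 0
    for match in pattern.finditer(sentence):
        pieces.append(escape(sentence[position:match.start()]))
        translation = lookup[match.group(1).lower()]
        pieces.append(
            "<term translation=%s>%s</term>"
            % (quoteattr(translation), escape(match.group(1)))
        )
        position = match.end()
    pieces.append(escape(sentence[position:]))
    return "".join(pieces)


if __name__ == "__main__":
    print(mark_terms("Dry timber burns.", {"timber": "koks"}))
```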
Narrow Domain Automotive MT
English – Latvian

DATA
2 M unique parallel sentences
1.9 M monolingual sentences
0.2 M in-domain monolingual

QUALITY
16% improvement from
terminology integration
Come & Try
demo.taas-project.eu
Thank you!
andrejs@tilde.com

The research within the project TaaS leading to these results has received funding from the European
Union Seventh Framework Programme (FP7/2007-2013), Grant Agreement no 296312

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 

Welcome to the Cloud! Terminology as a Service, CHAT2013

  • 1. Welcome to the Cloud! Terminology as a Service Andrejs Vasiļjevs Tilde tekom 2013 / Wiesbaden / 07.11.2013.
  • 2. Complexity of terminology work
      - Term identification in the source text
      - Consulting online databases and local files for translation equivalents
      - Creating and maintaining terminology glossaries
      - Sharing term glossaries and involving others in their polishing
      - Structuring data in the industry standard formats
      - Integrating term glossaries in CAT and other productivity tools
      - Keeping terminology up to date
      - etc.
  • 3. Terminology as a Service: cloud-based platform for acquiring, cleaning up, sharing, and reusing multilingual terminological data
  • 4. TaaS User Needs Survey Results: Importance of terminology work (pie chart)
      - Legend entries: Very important, Quite important, Less important, Not important
      - Chart values as extracted: 1.8%, 14.8%, 43.5%, 39.9%
  • 5. TaaS User Needs Survey: willingness to share (charts)
      - Overall split as extracted: 60.5% / 39.5%
      - Answer groups: "Yes, provided that…" and "No, because…"
      - Legend entries: Joint contribution to the DB, Access control, Legal aspects, External quality control, Little effort, Anonymity, Other, Legal restrictions, Poor quality/Lack of time, Own asset, Risk of misunderstanding
      - Chart values as extracted: 16.7%, 8.3%, 24.9%, 6.0%, 4.6%, 16.5%, 48.6%, 7.6%, 19.2%, 11.4%, 14.2%, 22.0%
  • 6. TaaS Partners
      - Tilde: Latvia (Coordinator)
      - TAUS: Netherlands
      - Kilgray: Hungary
      - Cologne University of Applied Sciences: Germany
      - University of Sheffield: UK
  • 7. TaaS Mission
      - Simplify the process for language workers to prepare, store, and share task-specific multilingual term glossaries
      - Provide instant access to term translation equivalents and translation candidates for professional translators through CAT tools
      - Domain adaptation of statistical machine translation systems by dynamic integration with TaaS-provided terminology data
  • 8. Key services of TaaS
      - Automatic extraction of monolingual term candidates from user-uploaded documents
      - Automatic retrieval of translation equivalents from different public and industry terminology databases
      - Translation candidate acquisition from multilingual web data
      - Facilities for users to clean up automatically acquired terminological data
      - Data sharing and integration facilities through APIs and export tools
  • 9. Focus areas
      - Research: Term extraction; Collection of domain-specific multilingual corpora; Max(FTC)
      - Development: Quality; Performance; Scalability; Interoperability
      - Usage: Usability; Outreach; Sustainability
  • 11. Target Repositories
      - TAUS Data: repository of multilingual translation memories
      - EuroTermBank: databank of federated multilingual terminology
      - IATE: inter-institutional termbank of the European Union
      - META-SHARE: distributed Pan-European repository of language resources
  • 12. Integration
      - Support for industry standard formats
      - Integration into CAT and productivity tools
      - API to integrate TaaS services into various software applications
  • 14. HTML Term Annotation: Term entries for terms identified in EuroTermBank are stored in TBX format in a <script> element that is placed in the HTML5 document.
  • 16. Identifying and marking terms: new W3C standard, Internationalization Tag Set (ITS) 2.0
      - Diagram labels: Showcase Web Page; Terminology Annotation Web Service API; TaaS Terminology Services; Plaintext; ITS 2.0 enriched content; ITS 2.0 term-annotated content (export / visualisation); Human users (e.g., translators, terminologists); Machine users (CAT tools, MT systems)
  • 18. TaaS Architecture (diagram)
      - Clients and interfaces: Web Browsers (http/https, html); CAT tools, MT, and External TDBs (https, REST)
      - Presentation Layer: Web Page UI; Public API
      - Application Logic Layer: Terminology collection management; User management; Terminology collection search; Terminology collection creation; Term extraction workflows; Full collection creation workflow; Monolingual collection creation
      - Data Storage Layer (Shared Term Repository): Shared Term Repository DB; Statistical DB; File Store
      - High-performance Computing (HPC) Cluster: HPC frontend; SGE; CPU nodes
      - Modules: Term extraction; TXT extractor; TWSC; Kilgray Term Extractor; Term normalizer; Collection creator; Statistical DB acquisition; Statistical DB feeding; Text tagging with terms; Parameter retriever; Bilingual Term Extraction System; Translation candidate extraction; Translation lookup (ETB & STR, IATE, TAUS API, Statistical DB); Collection merger; Result processing; Collection Importer; Marked Text enrichment
  • 19. How to instruct SMT to use the right terms? (example term pair: koks / timber)
  • 20. Put TaaS in the service of MT
  • 23. Boost in the quality of machine translation
      - Narrow domain: Automotive MT, English – Latvian
      - Data: 2 M unique parallel sentences; 1.9 M monolingual sentences; 0.2 M in-domain monolingual
      - Quality: 16% improvement from terminology integration
  • 25. Thank you! andrejs@tilde.com The research within the project TaaS leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013), Grant Agreement no 296312
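
Slide 16 refers to marking terms in content with ITS 2.0. As a rough illustration only (this is not the TaaS annotation service or its API), the Python sketch below wraps known glossary terms in HTML spans carrying the ITS 2.0 local attribute `its-term="yes"`. The function name `annotate_terms` and the sample glossary are hypothetical, and simple case-insensitive whole-word matching is an assumption made for brevity; a real term extractor would do considerably more.

```python
import html
import re


def annotate_terms(text, terms):
    """Return `text` as HTML with each glossary term wrapped in
    <span its-term="yes">...</span> (ITS 2.0 Terminology data category).

    `terms` is a hypothetical in-memory glossary (list of strings).
    """
    if not terms:
        return html.escape(text)

    # Try longer terms first so that multi-word terms win over their parts.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )

    pieces = []
    pos = 0
    for match in pattern.finditer(text):
        # Escape the untouched text between matches, then wrap the term.
        pieces.append(html.escape(text[pos:match.start()]))
        pieces.append(
            '<span its-term="yes">' + html.escape(match.group(0)) + "</span>"
        )
        pos = match.end()
    pieces.append(html.escape(text[pos:]))
    return "".join(pieces)


if __name__ == "__main__":
    glossary = ["translation memory", "memory"]
    print(annotate_terms("The translation memory stores segments.", glossary))
```

ITS 2.0 also defines `its-term-info-ref` and `its-term-confidence` for pointing at term information and recording annotator confidence; they are left out here to keep the sketch short.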