Dr. Klemens Waldhör
klemens.waldhoer@heartsome.de




            FOLT Overview, status as of 3 July 2009; Dr. Klemens Waldhör
Overview
  Open TMS Overview
  Architecture
  Implementation
  Current Status




Overview




Goals
    Three translation memory systems for one and the same process?
    Software investments that make translation costs shoot through the roof?
    Exchange formats that put the brakes on productivity?
    FOLT (Forum Open Language Tools) is concerned with the entire process of producing multilingual
    documentation. From the creation of the source text to production in foreign languages, we analyze our
    processes for weaknesses and a lack of standardisation.

    Primary objectives:
        - Sharing experiences of processes using standard industry software
        - Sharing experiences of the use of Open Source software
        - Standardisation of interchange formats
        - Testing new Open Source technologies and improving existing technologies in the translation market
        - Public support for non-proprietary software and software development
        - Publication of aims and results

    www.folt.de




  Development of the Open Source translation memory system OpenTMS

OpenTMS Requirements
 Software
    Web based application
    Server / Client Architecture
    Thin client
    No installation
    No proprietary run time components
    Open source software preferred
    Modular software approach
 Operating system independent
    Windows, Linux, Mac …
 Standard hardware
 Interfaces
    Integration into CMS
    Workflow management should be supported
 Open source database
    In principle, all SQL databases should be supported
 Scalability
    Single-user and multi-user operation


Architecture




Example Work Flow
   Seamless integration of different tools in the translation / localisation workflow

   [Diagram with three numbered steps (1., 2., 3.). Components shown: CMS, Converter, Segmenter, XLIFF, Translation Memory, Machine Translation, Terminology Translation, OpenTMS Editor, Back Converter.]
Architecture based on Standards
  XLIFF
  TMX
  TBX
  SRX
  …




  In general: XML
OpenTMS System Architecture
  [Diagram, top to bottom: Application Model (GUI Model, Interface Model); Security Model; User Model, Document Model, Data Model; Process Model; OpenTMS Core Library.]
For details see Waldhör, K. (2008). OPENTMS SOFTWARE ARCHITECTURE.

Software Structure
  Hierarchy of functions and processes
  Common functions / methods stored in a core library
  Method calls should be transparent
    Running on server or user machine
    Scripting language



  [Diagram: OpenTMS primitive procedure, OpenTMS core library, OpenTMS Process, OpenTMS Network Process.]
Modelling Language
  [Diagram: entities Linguistic Property, General Linguistic Object, Monolingual Object, Multilingual Object, Terminology, Translation Memory and Data Source; relations labelled "N:1", "inherits" and "mapping".]
OpenTMS Processes
  [Diagram: human initiated interactions (Interactive Terminology Translation, Interactive Translation Memory) and OpenTMS initiated interactions (Converter, Segmenter, Pre Translation Memory, OpenTMS Translation Editor, Back Converter), connected to the Data Source.]
Implementation




Programming Language et al
  Java
   Java Coding Standards
   Java Documentation Standard
   Delivered as jar files
  Eclipse
  Data Sources
   SQL DB: Hibernate based
  Documentation UML
   Generated ESS Model


Data Sources
  Language related data are represented as “data sources”
  Idea
     Make the data access interface independent from the data itself
  Not being restricted to SQL databases only
     Also flat data or XML files
        TMX, XLIFF files as a data source
        …
     Machine translation (MT) as data source
     Spreadsheets
        E.g. Excel as terminology lists
     Object-oriented databases
     DMS systems
     “Web sites” (HTTP-based interfaces)
  Define a common interface for all access functions
     Allows adaptation to individual data source properties
        E.g. read-only data sources like MT
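
The idea of one common access interface for very different data sources can be sketched roughly as follows. This is a minimal illustration, not the OpenTMS API: the names `LanguageDataSource`, `InMemoryDataSource` and `StubMtDataSource` and all method signatures are assumptions made for this example. It shows one writable source and one read-only source (standing in for MT) behind the same interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface; all names here are illustrative, not taken from OpenTMS.
interface LanguageDataSource {
    /** Returns the stored target segments whose source equals the query exactly. */
    List<String> lookup(String sourceSegment);

    /** Stores a source/target pair; read-only sources reject this call. */
    void store(String sourceSegment, String targetSegment);

    /** Lets callers adapt to an individual data source property. */
    boolean isReadOnly();
}

/** Writable source kept in memory, standing in for a SQL table or a TMX file. */
final class InMemoryDataSource implements LanguageDataSource {
    private final List<String[]> pairs = new ArrayList<>();

    @Override
    public List<String> lookup(String sourceSegment) {
        List<String> hits = new ArrayList<>();
        for (String[] pair : pairs) {
            if (pair[0].equals(sourceSegment)) {
                hits.add(pair[1]);
            }
        }
        return hits;
    }

    @Override
    public void store(String sourceSegment, String targetSegment) {
        pairs.add(new String[] {sourceSegment, targetSegment});
    }

    @Override
    public boolean isReadOnly() {
        return false;
    }
}

/** Read-only source standing in for an MT service; the "translation" is a stub. */
final class StubMtDataSource implements LanguageDataSource {
    @Override
    public List<String> lookup(String sourceSegment) {
        return Collections.singletonList("[MT] " + sourceSegment);
    }

    @Override
    public void store(String sourceSegment, String targetSegment) {
        throw new UnsupportedOperationException("read-only data source");
    }

    @Override
    public boolean isReadOnly() {
        return true;
    }
}
```

Calling code only sees `LanguageDataSource`, so it can check `isReadOnly()` before writing instead of knowing which kind of source it holds.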


Data Sources
  [Diagram: the OpenTMS software accesses data sources through a standardised interface. The OpenTMS Data Source Layer maps the OpenTMS access functions to the specific data component; data type specific access functions then reach the various data components, such as files.]
Core Data Model Status
  Data Source methods defined
    Are extended depending on needs and requirements
  SQL
    Access optimisation
    Hibernate based
    First version finished
  Other OpenSource databases…
    OODBS
      DB4O partially implemented for testing purposes
  Other data sources
    TMX files
    XLIFF files
    MT
        Google & Microsoft Translator


Data Source Core Functions
  Data Sources
  Create
  Delete
  Import TMX, XLIFF File
  Export TMX, XLIFF File
  Copy between data sources




Fuzzy Search – Core Function of TM
  Step 1: Search in k-d tree
   Restricts the number of strings to search
   Finds possible matching strings
  Step 2: Levenshtein similarity
   Compare the candidates from step 1 to determine their real similarity
  Step 3: Get source and target MOLs / MUL (monolingual / multilingual objects)
   Create translation (alt-trans)
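
Steps 1 and 2 can be illustrated with a small sketch. It is not the OpenTMS implementation: a cheap length filter stands in for the k-d tree lookup of step 1, and the percentage normalisation in `similarity` is an assumption, since the slides do not say how the similarity score is derived from the edit distance. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class FuzzySearchSketch {
    /** Levenshtein edit distance, computed with two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in percent, rounded down (assumed normalisation by the longer string). */
    static int similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 100;
        }
        return 100 * (longest - levenshtein(a, b)) / longest;
    }

    /** Step 1 stand-in (length filter, not a k-d tree) followed by step 2. */
    static List<String> search(String query, List<String> segments, int minQuality) {
        List<String> matches = new ArrayList<>();
        for (String segment : segments) {
            int longest = Math.max(query.length(), segment.length());
            int lengthDiff = Math.abs(query.length() - segment.length());
            // A length difference of d forces at least d edits, which bounds the score from above.
            if (longest > 0 && 100 * (longest - lengthDiff) / longest < minQuality) {
                continue;
            }
            if (similarity(query, segment) >= minQuality) {
                matches.add(segment);
            }
        }
        return matches;
    }
}
```

The pre-filter never discards a segment that step 2 would accept, so it only saves work; a real index such as the k-d tree named above serves the same purpose at larger scale.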




Data Source Configuration
  SQL data source contained in hibernate directory

  Existing data sources contained in database directory
Translation Core Functions
  Convert (to and from XLIFF)
   Currently done externally by Araya
   Complex document formats like WinWord etc. through OpenOffice converters
  Segment
   Currently done externally by Araya
  Translate




Current Data Source Interface




Security



   Managing Security



Security Levels
  Level 0
    No security procedures are applied; data are transferred as they are.
  Level 1
    The communication channel is secured using standard secure protocols.
  Level 2
    Security is applied at the data level: strings are encrypted when they are sent through a communication channel or written to or retrieved from a database. This also covers encrypted XLIFF files (or parts of them).
  Level 4
    GUI-level security
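
Data-level encryption of individual strings (Level 2) might look like the following sketch. The slides do not name a cipher or a key management scheme; AES in GCM mode from the standard Java `javax.crypto` API is an assumption chosen for illustration, and the class name `SegmentCipher` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Illustrative only: the cipher choice and class name are assumptions, not OpenTMS code.
final class SegmentCipher {
    private static final int IV_BYTES = 12;   // nonce size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private final SecretKey key;
    private final SecureRandom random = new SecureRandom();

    SegmentCipher(SecretKey key) {
        this.key = key;
    }

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts one segment; the random nonce is prepended to the ciphertext. */
    String encrypt(String segment) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(segment.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt(); fails if the data was modified or the key is wrong. */
    String decrypt(String encoded) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the result is plain Base64 text, an encrypted segment could be written to a database column or placed inside an XML element in the same way as the unencrypted string.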

Security and Files
  Protection of parts of the document
    Encrypt specific parts of the XML documents
  Additional security when transferring files
    Even if a file gets into the wrong hands, the file cannot be read.
  Secure XLIFF
    Source
    Target
  Secure TBX
  Secure TMX
    TU…

Security



   Eclipse



Eclipse Core Methods




Eclipse RPC Server & Utility Methods




Eclipse GUI Methods




XML-RPC Interface
   openTMX.xml contains access functions
rem %1 = data source name, %2 = source document
call env.bat
call java -Xmx1024m %OPENTMSJAVABASE% ^
  de.folt.rpc.client.OpenTMSClient ^
  message=TranslateDocument ^
  sourceDocument=%2 ^
  sklDocument="%2.skl" ^
  xliffDocument="%2.xlf" ^
  segDocument="%2.seg.xlf" ^
  translatedDocument="%2.trans.xlf" ^
  paragraphBasesSegmentation="yes" ^
  segmentBreakOnCrLf=1 ^
  dataSourceName=%1 ^
  dataSourceMatchQuality=80 ^
  sourceDocumentLanguage=de ^
  targetDocumentLanguage=en ^
  sourceDocumentEncoding=UTF-8 ^
  targetDocumentEncoding=UTF-8 ^
  inputDocumentType=FILE ^
  dataSourceType=sql
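
The call above passes every parameter as a `key=value` argument. How the client reads them is not shown in the slides; the sketch below is one plausible way to collect such arguments into a map, with the hypothetical class name `ArgumentParser`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of OpenTMS.
final class ArgumentParser {
    /** Splits each argument at its first '=' and keeps the order of appearance. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected key=value, got: " + arg);
            }
            params.put(arg.substring(0, eq), arg.substring(eq + 1));
        }
        return params;
    }
}
```

Splitting at the first `=` only means that values such as file paths may themselves contain an equals sign.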




Current Implementation
  openTMS.jar
   Contains compiled classes and source code
  arayaserver-opentms.jar
   Conversion functions
      Compiled classes
   External.jar
      External classes for Araya (parser etc.)
  Hibernate directory
   Hibernate jar files
  Database jdbc driver
   Database driver jar files

Integration Araya XLIFF Editor




Ubuntu VM Distribution




Data Source Editor
  [Screenshot of the editor with these labelled areas: Edit MOL/MOL Properties, Language Specific Segments, Search Functions, Delete & Save Functions.]
Downloads
 http://sourceforge.net/projects/open-tms
 Ubuntu version
 Windows version:
 www.heartsome.de/arayatest/opentmsserver.exe
   In the XLIFF Editor:
   www.heartsome.de/arayatest/araya-freeversion.exe
 YourKit Java Profiler for performance measurements
Possible Contributions
  XML Parser!
    Generalise OpenTMS XML interfaces to support any kind of XML parser (currently JDOM)
    Faster XML parser?!
  Logging Package
    Optimised, line numbers, class names
  Exception Handling
    Improvement
    Localisation approach / String handling
  Test Environment
  XLIFF / TMX package improvements
    TBX reader
    SRX segmentation

Possible Contributions Converters
  Document Converters
   XML
   OpenOffice as central converter for txt, rtf, doc, xls, ppt…
   MIF
   …
  Data Model Converter
   Trados
   Star
   Across
   …

Contact
Heartsome Europe GmbH
Friedrichstr. 17
D-90574 Roßtal


www.heartsome.de



Dr. Klemens Waldhör


T: +49 9127 579001
F: +49 9127 951178
klemens.waldhoer@heartsome.de



Folt - Open TMS - A presentation for universities

  • 1. Dr. Klemens Waldhör klemens.waldhoer@heartsome.de FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör 1
  • 2. Overview Open TMS Overview Architecture Implementation Current Status 2 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 3. Overview FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör 3
  • 4. Goals Three translation memory systems for one and the same process? Software investments that make translation costs shoot through the roof? Exchange formats that put the brakes on productivity? FOLT (Forum Open Language Tools) is concerned with the entire process of producing multilingual documentation. From the creation of the source text to production in foreign languages, we analyze our processes for weaknesses and a lack of standardisation. Primary objectives: - Sharing experiences of processes using standard industry software - Sharing experiences of the use of Open Source software - Standardisation of interchange formats -Testing new Open Source technologies and improving existing technologies in the translation market - Public support for non-proprietary software and software development - Publication of aims and results www.folt.de Development of the OpenSource Translation Memory system OpenTMS 4 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 5. OpenTMS Requirements Software Web based application Server / Client Architecture Thin client No installation No proprietary run time components Preferred open source software Modular software approach OS independent operating system Windows, Linux, Mac … Standard hardware Interfaces Integration into CMS Workflow management should be supported Open source database Basically all SQL da-tabases should be supported Scalability Single and multi user requirement 5 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 6. Architecture FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör 6
  • 7. Example Work Flow Seamless integration of different tools in the translation / localisation workflow Terminology Translation Machine OpenTMS Translation Editor Translation Memory XLIFF Back Converter 2. 3. Segmenter Converter 1. CMS 7 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 8. Architecture based on Standards XLIFF TMX TBX SRX … In general XML 8 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 9. OpenTMS System Architecture Application Model GUI Model Interface Model Security Model User Document Data Model Model Model Process Model OpenTMS Core Library For details see Waldhör, K. (2008). OPENTMS SOFTWARE ARCHITECTURE. 9 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 10. Software Structure Hierarchy of functions and processes Common functions / methods stored in a core library Method calls should be transparent Running on server or user machine Scripting language OpenTMS primitive OpenTMS core procedure library OpenTMS Process OpenTMS Network Process 10 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 11. Modelling Language Linguistic Property N:1 Terminology General Linguistic Object Translation Memory inherits mapping N:1 Monolingual Object Multilingual Object Data Source 11 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 12. OpenTMS Processes Human Initiated Interactions Data Source OpenTMS Initiated Interactions Interactive Interactive Terminology Translation Translation Memory Pre OpenTMS Back Converter Segmenter Translation Translation Converter Memory Editor 12 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 13. Implementation FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör 13
  • 14. Programming Language et al Java Java Coding Standards Java Documentation Standard Delivered as jar files Eclipse Data Sources SQL DB: Hibernate based Documentation UML Generated ESS Model 14 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 15. Data Sources Language related data are represented as “data sources” Idea Make the data access interface independent from the data itself Not being restricted to SQL databases only Also flat data or xml files TMX, XLIFF files as a data source … Machine translation (MT) as data source Spread sheets E.g. Excel as terminology lists Object Oriented Databases DMS systems “Web Sites” (http based interfaces) Define a common interface for all access functions Allows adaption to individual data source properties e.g. read only data sources like MT 15 FOLT Überblick Stand 03.07.2009; Dr. Klemens Waldhör
  • 16. Data Sources: Access to data sources through a standardised interface. Diagram: OpenTMS software → OpenTMS Data Source Layer (data type specific data access functions; maps the OpenTMS access functions to the specific data component) → various data components like files etc.
  • 17. Core Data Model Status: Data Source methods defined; they are extended depending on needs and requirements. SQL access: optimisation; Hibernate based; first version finished; other OpenSource databases… OODBS: DB4O partially implemented for testing purposes. Other data sources: TMX files; XLIFF files; MT (Google & Microsoft Translator).
  • 18. Data Source Core Functions: Data sources: create; delete; import TMX, XLIFF file; export TMX, XLIFF file; copy between data sources.
  • 19. Fuzzy Search – Core Function of TM. Step 1: Search in KD-tree; this restricts the number of strings to search and finds possible matching strings. Step 2: Levenshtein similarity; compare the matches from step 1 to determine real similarity. Step 3: Get source and target MOLs / MUL; create translation (alt-trans).
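
Steps 1 and 2 can be sketched in Java as a two-stage filter. This is a simplified illustration under stated assumptions: the KD-tree of step 1 is replaced here by a much cheaper length-based prefilter (the edit distance is never smaller than the length difference), the similarity is assumed to be a percentage of the longer string, and the class and method names are hypothetical rather than OpenTMS identifiers.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative two-stage fuzzy search; not the OpenTMS implementation. */
final class FuzzySearch {

    /** Classic Levenshtein edit distance using two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in percent (0..100, rounded down) relative to the longer string. */
    static int similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        if (max == 0) return 100;
        return 100 * (max - levenshtein(a, b)) / max;
    }

    /** Returns all segments whose similarity to the query is at least minQuality. */
    static List<String> search(String query, List<String> segments, int minQuality) {
        List<String> hits = new ArrayList<>();
        for (String segment : segments) {
            int max = Math.max(query.length(), segment.length());
            int diff = Math.abs(query.length() - segment.length());
            // Stage 1 (stand-in for the KD-tree): skip candidates that cannot
            // reach minQuality because distance >= length difference.
            if (max > 0 && 100 * (max - diff) / max < minQuality) continue;
            // Stage 2: exact Levenshtein similarity on the remaining candidates.
            if (similarity(query, segment) >= minQuality) hits.add(segment);
        }
        return hits;
    }
}
```

With a threshold of 80 (the value used for `dataSourceMatchQuality` elsewhere in the deck), a query such as "Open the file" would match "Open the file." but not "Close the window". Step 3, building the alt-trans entries from the matched units, is not covered by this sketch.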
  • 20. Data Source Configuration: SQL data source contained in the hibernate directory; existing data sources contained in the database directory.
  • 21. Translation Core Functions: Convert (to and from XLIFF): currently done externally by Araya; complex document formats like WinWord etc. through OpenOffice converters. Segment: currently done externally by Araya. Translate.
  • 22. Current Data Source Interface
  • 23. Security: Managing Security
  • 24. Security Levels. Level 0: No security procedures are applied; data are transferred as they are. Level 1: The communication channel is secured, using standard secure protocols. Level 2: Encoding for security is done on the data level. Basically this means that strings are encrypted when they are communicated through a communication channel or are written to or retrieved from a database. This also involves encrypted XLIFF files (or parts of them). Level 4: GUI level related security.
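
Data-level encryption as described for Level 2 could look like the following Java sketch, which encrypts a single segment string before it is stored or transmitted. The slides do not name a cipher; AES in GCM mode via the standard `javax.crypto` API is an assumption made for this example, and the class `SegmentCrypto` and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Illustrative string-level encryption; cipher choice is an assumption. */
final class SegmentCrypto {
    private static final int IV_BYTES = 12;   // nonce size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    /** Encrypts a segment; the result is Base64 of nonce followed by ciphertext. */
    static String encrypt(String plain, SecretKey key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Reverses encrypt(); fails if the data was tampered with or the key is wrong. */
    static String decrypt(String encoded, SecretKey key) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                        new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }
}
```

The Base64 result is plain text, so under this assumption it could be placed in a database column or in the source or target element of an XLIFF file in place of the readable string.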
  • 25. Security and Files: Protection of parts of the document: encrypt specific parts of the XML documents. Additional security when transferring files: even if a file gets into the wrong hands, it cannot be read. Secure XLIFF (source, target); Secure TBX; Secure TMX (TU…).
  • 26. Security Eclipse
  • 27. Eclipse Core Methods
  • 28. Eclipse RPC Server & Utility Methods
  • 29. Eclipse GUI Methods
  • 30. XML-RPC Interface: openTMX.xml contains access functions. Example batch call:
      call env.bat
      call java -Xmx1024m %OPENTMSJAVABASE% de.folt.rpc.client.OpenTMSClient message=TranslateDocument sourceDocument=%2 sklDocument=%2.skl xliffDocument=%2.xlf segDocument=%2.seg.xlf translatedDocument=%2.trans.xlf paragraphBasesSegmentation=yes segmentBreakOnCrLf=1 dataSourceName=%1 dataSourceMatchQuality=80 sourceDocumentLanguage=de targetDocumentLanguage=en sourceDocumentEncoding=UTF-8 targetDocumentEncoding=UTF-8 inputDocumentType=FILE dataSourceType=sql
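
The client call above passes its parameters as `key=value` arguments. How such arguments might be collected into a parameter map before being sent to the server can be sketched as follows; the class `KeyValueArgs` is hypothetical and does not describe how `de.folt.rpc.client.OpenTMSClient` actually handles its arguments.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative parser for key=value command-line arguments. */
final class KeyValueArgs {

    /** Splits each argument at its first '=' and keeps the original order. */
    static Map<String, String> parse(String... args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected key=value, got: " + arg);
            }
            params.put(arg.substring(0, eq), arg.substring(eq + 1));
        }
        return params;
    }
}
```

Splitting at the first `=` only keeps values that themselves contain an equals sign intact, which matters for file names and similar free-form values.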
  • 31. Current Implementation: openTMS.jar (contains compiled classes and source code); arayaserver-opentms.jar (conversion functions; compiled classes); External.jar (external classes for Araya: parser etc.); Hibernate directory (Hibernate jar files); Database jdbc driver (database driver jar files).
  • 32. Integration Araya XLIFF Editor
  • 33. Ubuntu VM Distribution
  • 34. Data Source Editor: Edit MOL/MUL properties; language specific segments; delete & save functions; search functions.
  • 35. Downloads: http://sourceforge.net/projects/open-tms; Ubuntu version; Windows version: www.heartsome.de/arayatest/opentmsserver.exe; in the XLIFF Editor: www.heartsome.de/arayatest/araya-freeversion.exe. YourKit Java Profiler for performance measurements.
  • 36. Possible Contributions: XML parser: generalise OpenTMS XML interfaces to support any kind of XML parser (currently JDOM); faster XML parser? Logging. Packing: optimised, line numbers, class names. Exception handling improvement. Localisation approach / string handling. Test environment. XLIFF / TMX package improvements. TBX reader. SRX segmentation.
  • 37. Possible Contributions: Converters. Document converters: XML; OpenOffice as central converter for txt, rtf, doc, xls, ppt…; MIF; … Data model converters: Trados; Star; Across; …
  • 38. Contact: Heartsome Europe GmbH, Friedrichstr. 17, D-90574 Roßtal, www.heartsome.de. Dr. Klemens Waldhör, T: +49 9127 579001, F: +49 9127 951178, klemens.waldhoer@heartsome.de