Semantic Analysis and Concept-based Translation for Multilingual Information Systems

Johannes Leveling, Sven Hartrumpf, and Rainer Osswald

Intelligent Information and Communication Systems (IICS)
University of Hagen (FernUniversität in Hagen)
58084 Hagen, Germany
firstname.lastname@fernuni-hagen.de

GAL 2007, Hildesheim, Germany

Outline

1 Concept-based Representation: MultiNet
2 Three Phases for a Concept-Based Multilingual IR System
3 Concept-Based Information Systems
4 Applications
5 Conclusion and Outlook

J. Leveling, S. Hartrumpf, R. Osswald   Semantic Analysis and Concept-based Translation   2 / 27

Motivation for Concept-Based Translation

• Example 1:
    Query expansion in information retrieval (IR) with elements from same synset
  → needs word sense disambiguation (differentiation of concepts), otherwise loss of precision
• Example 2:
    Question answering (QA): questions on relations between concepts (situations, events, etc.)
    Example: Who killed Lee Harvey Oswald?
  → need semantic representation; bag-of-words information retrieval is not enough

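
Example 1 can be illustrated with a small sketch. The synset inventory below is hypothetical toy data (it is not taken from any real wordnet), and `expand` is an illustrative helper, not part of the system described in the slides. It shows why expanding without a disambiguated sense pulls in terms for unrelated meanings and so costs precision.

```python
# Toy synset inventory (hypothetical data, not from any real wordnet).
SYNSETS = {
    "bank": {
        "bank.finance": ["bank", "credit institution"],
        "bank.river": ["bank", "riverside", "shore"],
    },
}


def expand(term, sense=None):
    """Return expansion terms for `term`.

    With a disambiguated `sense`, only that synset is used; without one,
    all synsets are merged, which adds terms for unrelated meanings.
    """
    senses = SYNSETS.get(term, {})
    if sense is not None:
        return list(senses.get(sense, [term]))
    merged = []
    for members in senses.values():
        for member in members:
            if member not in merged:
                merged.append(member)
    return merged or [term]


print(expand("bank", sense="bank.finance"))  # sense-aware expansion
print(expand("bank"))                        # undisambiguated expansion
```

The undisambiguated call also returns "riverside" and "shore", which would retrieve documents unrelated to a financial query.
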
The MultiNet Paradigm

• Meaning and knowledge representation:
    Multilayered Extended Semantic Networks (Helbig, 2001, 2006)
• Semantic network of nodes (concepts) and edges (semantic relations from a fixed set)
• In addition:
    semantic sorts, semantic features, layer information
• Different types of concepts:
    lexicalized vs. non-lexicalized
• Language-independence:
    annotation of English/Czech sentences from the Wall Street Journal with MultiNet (Charles University, Prague)

Selected Semantic Relations

Relation    Description
ASSOC       association
ATTCH       attachment of object to object
CHPA        change of sorts (property → abstract object)
EXP         experiencer
MCONT       an informational process or object
OBJ         neutral object
PRED        predicative concept specifying a plurality
PROP        property relationship
PARS        meronymy
SCAR        carrier of a state
SSPE        state specifier
SUB         conceptual subordination for objects
SUBS        conceptual subordination for situations
SYNO        synonymy
TEMP        temporal restriction for a situation
*ALTN1      an introduction of alternatives

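
The structure described so far (concept nodes, edges labelled with relations from a fixed set) can be sketched as a small data type. This is only an illustration under simplifying assumptions: the class name `SemanticNetwork`, the node names `n1` and `n2`, and the two example edges are invented here, sorts, features and layer information are left out, and `*ALTN1` is omitted to keep the sketch to binary edges.

```python
from dataclasses import dataclass, field

# Relation labels copied from the table of selected relations;
# the full MultiNet inventory is larger.
RELATIONS = frozenset({
    "ASSOC", "ATTCH", "CHPA", "EXP", "MCONT", "OBJ", "PRED", "PROP",
    "PARS", "SCAR", "SSPE", "SUB", "SUBS", "SYNO", "TEMP",
})


@dataclass
class SemanticNetwork:
    """Directed graph of concept nodes with relation-labelled edges."""
    edges: list = field(default_factory=list)

    def add_edge(self, relation, source, target):
        # Reject labels outside the fixed relation set.
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.edges.append((relation, source, target))

    def nodes(self):
        # Collect every node that occurs at either end of an edge.
        found = set()
        for _, source, target in self.edges:
            found.update((source, target))
        return found


net = SemanticNetwork()
# Illustrative edges only; they are not read off any figure in the slides.
net.add_edge("SUB", "n1", "dokument.1.1")
net.add_edge("PROP", "n2", "psychisch.1.1")
print(sorted(net.nodes()))
```
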
The Computational Lexicon – HaGenLex

• Semantically oriented (German) lexical resource (Hartrumpf et al., 2003)
• Consists of multiple lexicons:
    • full syntactico-semantic information (26,000 entries)
    • flat lexicon (50,000 entries)
    • compound lexicon (30,000 entries; structure and semantics)
    • name lexicons (250,000 entries)
• Support for the lexicographer: LIAplus workbench

Sample Concepts (German)

• essen.1.1: (Der Student) (ißt) (eine Schokolade). ‘The student eats a chocolate.’
• essen.1.2: (Der Student) (ißt) sich (satt). ‘The student eats his fill.’
• essen.2.1: Das Kind hat kein Essen bekommen. ‘The child got no food.’
• essen.2.2: Das Essen am Abend dauerte 2 Stunden. ‘The meal in the evening lasted 2 hours.’
• fressen.1.1: (Der Hund) (frißt) (einen Knochen). ‘The dog eats a bone.’
• fressen.1.2: (Die Großmutter) (frißt) (einen Narren) (an den Blumen). ‘The grandmother is besotted with the flowers.’ (idiom)

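
The identifiers above follow a visible pattern: a lemma followed by two numbers. In the examples the first number separates the verb readings (essen.1.x) from the noun readings (essen.2.x), and the second number separates readings within each group. The sketch below assumes that interpretation; the field names `homograph` and `reading` are labels chosen here, not terminology confirmed by the slides.

```python
from typing import NamedTuple


class ConceptId(NamedTuple):
    lemma: str
    homograph: int  # assumed meaning of the first number
    reading: int    # assumed meaning of the second number


def parse_concept_id(text):
    """Split an identifier such as 'essen.1.1' into its three parts."""
    # rsplit keeps any dots inside the lemma itself intact.
    lemma, homograph, reading = text.rsplit(".", 2)
    return ConceptId(lemma, int(homograph), int(reading))


ids = ["essen.1.1", "essen.1.2", "essen.2.1",
       "essen.2.2", "fressen.1.1", "fressen.1.2"]
for concept in map(parse_concept_id, ids):
    print(concept)
```
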
Lexicon Entry (German): essen.1.1

(Feature structure; types are given in parentheses.)

n-sign
  morph
    base       "essen"
    infl-para  i129g
  syn  (v-syn)
    v-type     main
    perf-aux   haben
    v-control  nocontr
  semsel
    sem  (sem)
      entity   nonment-action
    c-id       "essen.1.1"
    select
      • rel  agt
        sel
          syn  (np-syn)
            cat  np
            agr  case nom
          semsel
            sem  (sem)
              entity  human-object
      • rel  aff
        sel
          syn  (np-syn)
            cat  np
            agr  case acc
          semsel
            sem  (sem)
              entity  sort co

Lexicon Entry (German): fressen.1.1

(Feature structure; types are given in parentheses.)

n-sign
  morph
    base       "fressen"
    infl-para  i139g
  syn  (v-syn)
    v-type     main
    perf-aux   haben
    v-control  nocontr
  semsel
    sem  (sem)
      entity   nonment-action
    c-id       "fressen.1.1"
    select
      • rel  agt
        sel
          syn  (np-syn)
            cat  np
            agr  case nom
          semsel
            sem  (sem)
              entity  animal-object ∨ human-object
      • rel  aff
        sel
          syn  (np-syn)
            cat  np
            agr  case acc
          semsel
            sem  (sem)
              entity  sort co

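
The two entries differ mainly in the semantic restriction on the agent (agt) slot: human-object for essen.1.1, and animal-object ∨ human-object for fressen.1.1. A minimal sketch of how such a restriction could filter candidate verb concepts follows. It assumes flat string matching on the entity value and ignores any hierarchy among the values; the function name `admissible_verbs` is made up for this illustration.

```python
# Agent restrictions as shown in the two entries (agt slot only).
AGENT_RESTRICTION = {
    "essen.1.1": {"human-object"},
    "fressen.1.1": {"animal-object", "human-object"},
}


def admissible_verbs(agent_entity):
    """Return the verb concepts whose agent slot accepts `agent_entity`."""
    return sorted(
        concept
        for concept, allowed in AGENT_RESTRICTION.items()
        if agent_entity in allowed
    )


print(admissible_verbs("human-object"))
print(admissible_verbs("animal-object"))
```

With an animal agent only fressen.1.1 remains, while a human agent is compatible with both concepts.
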
Semantic analysis – The WOCADI parser

• Produces semantic network representation from (German) texts (Hartrumpf, 2003):
    • resolves coreferences,
    • analyzes idioms,
    • decompounds nouns and adjectives,
    • identifies metonymy,
    • resolves deictic expressions etc.
• Applied to large corpora, including CLEF-NEWS newspaper corpus (275,000 articles) and German Wikipedia (500,000 articles)

SN Example (German)

[Figure: semantic network for the query below. Lexicalized concept nodes: du.1.1, streß.1.1, psychisch.1.1, dokument.1.1, problem.1.1, prüfling.1.1, kandidat.1.1, prüfungskandidat.1.1, prüfung.1.1, finden.1.1, berichten.2.2; non-lexicalized nodes: c1 to c10. Edge labels: SUB, SUBS, PROP, PRED, EXP, OBJ, MCONT, ATTCH, SCAR, SSPE, ASSOC, *ALTN1.]

Finde Dokumente, die über psychische Probleme oder Stress von Prüfungskandidaten oder Prüflingen berichten. (GIRT topic 116)

SN Example (English)

[Figure: the same semantic network with English concept labels. Lexicalized concept nodes: you, stress, mental, document, problem, examinee, candidate, exam, find, report; non-lexicalized nodes: c1 to c10. Edge labels: SUB, SUBS, PROP, PRED, EXP, OBJ, MCONT, ATTCH, SCAR, SSPE, ASSOC, *ALTN1.]

‘Find documents reporting on mental problems or stress of examination candidates or examinees.’ (GIRT topic 116)

Phase 1: Using Statistical MT and Web Services

• Employ (statistical) machine translation (MT) web service for IR experiments (translation of queries/questions): Systran, Promt, ...
• Problems:
    • translating questions:
      most systems trained on declarative sentences; imperative forms often misunderstood
      (Find documents ... → Fund Dokument ...)
    • named entity recognition:
      not reliable (Neuengland → new narrow country)
• Performance loss from off-the-shelf translation tools for QA@CLEF: 50%
    further examples: Ligozat et al. (2006)

Phase 2: Aligning Concept-based Tools and Resources

• Morphology and syntax are different for different languages
• Semantics is the same (in general)
• Our approach:
    • create lexicons for different languages;
      fast construction parallel to existing lexicon(s), e.g. HaGenLex → HaEnLex
    • develop parser for different languages
    • apply methods from IR/QA on SN representation
• General idea: replace concepts (labels) in semantic network representation (as a form of translation)

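
The general idea, translation by replacing concept labels while leaving the network structure untouched, can be sketched as follows. The concept pairs are the ones that appear at matching positions in the German and English networks for GIRT topic 116; the edge list, the node names `n1` and `n2`, and the function `translate_network` are assumptions made for illustration.

```python
# Concept pairs read off the German and English GIRT topic 116 networks.
DE_TO_EN = {
    "du.1.1": "you", "streß.1.1": "stress", "psychisch.1.1": "mental",
    "dokument.1.1": "document", "problem.1.1": "problem",
    "prüfling.1.1": "examinee", "kandidat.1.1": "candidate",
    "finden.1.1": "find", "berichten.2.2": "report", "prüfung.1.1": "exam",
}


def translate_network(edges, mapping):
    """Relabel lexicalized concepts; relations and unmapped nodes are kept."""
    return [
        (relation, mapping.get(source, source), mapping.get(target, target))
        for relation, source, target in edges
    ]


# Hypothetical edges for illustration (node names and attachments are assumed).
edges = [("SUBS", "n1", "finden.1.1"), ("PRED", "n2", "dokument.1.1")]
print(translate_network(edges, DE_TO_EN))
```

Relations and non-lexicalized nodes pass through unchanged, which reflects the claim that the semantics is (in general) shared across languages.
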
Status of Alignment of Lexical Resources

• German to English dictionaries: about 100,000 word/phrase translations
• Mapping between HaGenLex concepts and GermaNet concepts, plus GermaNet to EuroWordNet mapping: about 14,000 concept translations
• Wikipedia articles (in German and English): about 3,000 proper noun translations for cities, countries, persons, organizations, etc.
• HaEnLex (parallel English version of HaGenLex) with full morphologic, syntactic, semantic description of concepts: about 7,000 English entries

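
The second item chains two mappings (HaGenLex to GermaNet, then GermaNet to EuroWordNet) to obtain concept translations. A minimal sketch of such a composition is given below. All identifiers in the example dictionaries are made up, and the real resources use their own identifier schemes.

```python
def compose(first, second):
    """Chain two concept mappings, dropping entries without a second hop."""
    return {
        source: second[middle]
        for source, middle in first.items()
        if middle in second
    }


# All identifiers below are invented for illustration.
hagenlex_to_germanet = {
    "dokument.1.1": "gn:Dokument#1",
    "prüfung.1.1": "gn:Prüfung#1",
}
germanet_to_eurowordnet = {"gn:Dokument#1": "ewn:document#1"}

print(compose(hagenlex_to_germanet, germanet_to_eurowordnet))
```

Only concepts covered by both mappings survive the composition, which is one reason a chained mapping yields fewer translations than either resource alone.
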
Linguistic Phenomena (1/6)

Compounds (rare in English):

• with regular semantics
    Kinderernährung → nutrition of children
• with irregular semantics
    Frauenzimmer → dame (?); ladies’ room (?)
• borderline cases
    Bankwesen → banking (system) (?)
→ compound-less semantic representation is possible

Linguistic Phenomena (2/6)

Idioms:

• with corresponding idiom:
    in den Sinn kommen (DE) → to start thinking about sth.
    to come to mind (EN) → to start thinking about sth.
• without equivalent idiom:
    to be someone’s cup of tea (EN) → to like
→ semantic representation of idioms

Linguistic Phenomena (3/6)

Metonymy:

• with corresponding metonymy pattern (for regular metonymy):
    The White House agreed that ... (EN)
    → place-for-government
    Das Weiße Haus stimmte zu, dass ... (DE)
    → place-for-government
• without: ?
→ no problems, yet

Linguistic Phenomena (4/6)

Proper nouns:
    • transcriptions and transliterations, historic name variants
    • Böll → Boell;
      Gorbatschow → Gorbatchev, Gorbatchov
  → can be solved using aligned online resources, e.g. Wikipedia
  → treat name variants as elements of the same synset

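Treating name variants as elements of one synset can be sketched as a simple query expansion step. The synsets below contain only the variants named on the slide; the function name and the list-of-lists output format are assumptions made for illustration.

```python
# Name-variant synsets, using the variants given on the slide.
NAME_SYNSETS = [
    {"Böll", "Boell"},
    {"Gorbatschow", "Gorbatchev", "Gorbatchov"},
]


def expand_names(query_terms):
    """Return, for each query term, the term followed by its known spelling variants."""
    expanded = []
    for term in query_terms:
        # Terms that belong to no synset form a singleton group.
        variants = next((s for s in NAME_SYNSETS if term in s), {term})
        # Keep the original term first, then the other variants in sorted order.
        expanded.append([term] + sorted(variants - {term}))
    return expanded
```

Each inner list can then be searched as a disjunction, so a query for one spelling also retrieves documents that use another.
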
Linguistic Phenomena (5/6)

Semantic gaps/lexical gaps:
    • Fohlen (DE) → colt (if male)
    • Fohlen (DE) → filly (if female)
    • Alignment of lexicon entries: morpho-syntactic and syntactic features
      differ between languages; semantic features (in general) do not.
      But: net entries/rules/entailments may be slightly different (?),
      because they already involve other concepts (which have to be
      translated)

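The Fohlen example can be read as a translation rule with a feature condition. The following is a minimal sketch under that reading; the rule format and the feature name "sex" are assumptions, not the lexicon format used by the authors.

```python
from typing import Dict, List

# Conditional translation rules: (source word, feature constraints, target word).
# The feature name "sex" is an assumption for this sketch.
RULES = [
    ("Fohlen", {"sex": "male"}, "colt"),
    ("Fohlen", {"sex": "female"}, "filly"),
]


def translate(word: str, features: Dict[str, str]) -> List[str]:
    """Return all target words whose constraints are compatible with the known features."""
    targets = []
    for source, constraints, target in RULES:
        if source != word:
            continue
        # A constraint fails only if the feature is known and has a different value.
        if all(features.get(name, value) == value for name, value in constraints.items()):
            targets.append(target)
    return targets
```

If the semantic analysis does not determine the sex of the animal, both alternatives are returned, and the choice is left to a later step.
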
Linguistic Phenomena (6/6)

Semantic gaps/lexical gaps:
    essen.1.1 → eat.1.1 and fressen.1.1 → eat.1.1

    n-sign
      morph    base       "eat"
               infl-para  i20
      syn      v-syn
                 v-type   main
      semsel   sem        sem
                            entity  nonment-action
               c-id       "eat.1.1"
               select     rel  agt
                          sel  syn     np-syn
                                         cat  np
                               semsel  sem  sem
                                              entity  animal-object ∨ human-object
                          rel  aff
                          sel  syn     np-syn
                                         cat  np
                               semsel  sem  sem
                                              entity  sort co

Phase 3: Towards a Concept-Based Translation

    • Assumption that the same inventory of relations (about 140
      relations) holds for different languages
    • Natural language generation (for German)
    • Possible solution: English parser, generate natural language from
      semantic network representation

Monolingual Concept-Based IR

    • Techniques of standard IR: stemming and stopword removal
    • Monolingual concept-based IR:
        • represent queries (and documents) as semantic networks
        • (translate concepts)
        • employ methods on semantic network representation
    • Advantages:
        • semantics of compounds (relation to their constituents)
        • semantics of prepositions is typically represented by a
          semantic relation or function (no full translation needed)
        • lemmatizing (instead of stemming)
        • query expansion with elements of synsets

Multilingual Concept-Based IR

    • Three different approaches to supporting a multilingual search:
        1   translate queries into the document language
        2   translate documents into the query language
        3   translate both queries and documents into an interlingua
    • Multilingual concept-based IR: same as monolingual approach, but
      translate concepts (1, 2, or 3)
      → towards an interlingua

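Approach 3 can be sketched at the concept level: queries and documents are both mapped to interlingua concepts and compared there. The interlingua IDs ("il:...") and the concept IDs for "horse" are invented for this sketch; only essen.1.1, fressen.1.1 and eat.1.1 come from the slides, and the overlap score stands in for whatever matching method an actual system uses.

```python
# Hypothetical mapping from language-specific concept IDs to interlingua IDs.
TO_INTERLINGUA = {
    "essen.1.1": "il:eat",
    "fressen.1.1": "il:eat",
    "eat.1.1": "il:eat",
    "pferd.1.1": "il:horse",
    "horse.1.1": "il:horse",
}


def to_interlingua(concepts):
    """Map concept IDs to interlingua IDs; unknown concepts are kept unchanged."""
    return {TO_INTERLINGUA.get(concept, concept) for concept in concepts}


def overlap_score(query_concepts, doc_concepts):
    """Fraction of query concepts found in the document, compared in interlingua space."""
    query = to_interlingua(query_concepts)
    doc = to_interlingua(doc_concepts)
    return len(query & doc) / len(query) if query else 0.0
```

An English query with eat.1.1 and horse.1.1 then fully matches a German document analyzed as fressen.1.1 and pferd.1.1, without translating the text of either side.
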
Projects and Evaluations

    • GeoCLEF (Leveling and Veiel, 2006): Web service for MT (query
      translation)
    • GIRT-4 experiments (Leveling, 2004, 2006a): combined concept and
      word translation
    • NLI-Z39.50 (Leveling, 2006b): replace terminal concepts in SN, then
      treat translation alternatives as a synset for query expansion (no
      decision for a single reading necessary)
    • QA@CLEF (Hartrumpf and Leveling, 2007): Web service for MT, then
      analysis; concept-based translation with rudimentary English parser
      (preliminary experiments)

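The idea of treating translation alternatives as a synset can be illustrated with a Boolean query in which all alternatives of one concept are joined by OR. This is a hedged sketch only: the translation table, the function name, and the query syntax are assumptions and do not reproduce the NLI-Z39.50 query format.

```python
# Hypothetical German-to-English translation alternatives per query concept.
ALTERNATIVES = {
    "Bank": ["bank", "bench"],
    "Fluss": ["river"],
}


def build_boolean_query(concepts):
    """AND over concepts, OR over the translation alternatives of each concept."""
    groups = []
    for concept in concepts:
        # Concepts without a known translation pass through unchanged.
        alternatives = ALTERNATIVES.get(concept, [concept])
        group = " OR ".join(f'"{alt}"' for alt in alternatives)
        groups.append(f"({group})" if len(alternatives) > 1 else group)
    return " AND ".join(groups)
```

Because every alternative is kept, no single reading of an ambiguous term such as "Bank" has to be chosen before retrieval.
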
Conclusion

    • General approach:
        • Parse queries
        • Translate concepts in SN representation
        • Operate on SN representation
    • Aims at multilingual information systems for different purposes:
      IR, QA
    • 3 phases (currently phase 2)

Outlook

    • Create a repository of interlingua concepts:
      allow for a concept-based machine translation of text
      → natural language generation
      → MT
    • Outlook for IR/QA:
      index semantic relations as well

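Indexing semantic relations could look roughly like an inverted index over relation triples instead of single terms. The sketch below flattens a semantic network into (concept, relation, concept) triples, which is a strong simplification; apart from eat.1.1 and the relation names agt and aff, the concept IDs and the sample documents are invented.

```python
from collections import defaultdict


def build_relation_index(docs):
    """Map each (concept, relation, concept) triple to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, triples in docs.items():
        for triple in triples:
            index[triple].add(doc_id)
    return index


def search(index, query_triples):
    """Return the documents that contain every triple of the query."""
    postings = [index.get(triple, set()) for triple in query_triples]
    return set.intersection(*postings) if postings else set()


# Hypothetical documents, already analyzed into relation triples.
DOCS = {
    "d1": [("eat.1.1", "agt", "horse.1.1"), ("eat.1.1", "aff", "hay.1.1")],
    "d2": [("eat.1.1", "agt", "child.1.1"), ("eat.1.1", "aff", "horse.1.1")],
}
```

Both documents contain the concepts eat.1.1 and horse.1.1, so a bag-of-concepts index cannot tell them apart; a query triple with the agt relation matches only d1.
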
References

Hartrumpf, Sven (2003). Hybrid Disambiguation in Natural Language
  Analysis. Osnabrück, Germany: Der Andere Verlag.

Hartrumpf, Sven; Hermann Helbig; and Rainer Osswald (2003). The
  semantically based computer lexicon HaGenLex – Structure and
  technological environment. Traitement automatique des langues,
  44(2):81–105.

Hartrumpf, Sven and Johannes Leveling (2007). Interpretation and
  normalization of temporal expressions for question answering. In
  Evaluation of Multilingual and Multi-modal Information Retrieval: 7th
  Workshop of the Cross-Language Evaluation Forum, CLEF 2006
  (edited by Peters, Carol; Paul Clough; Fredric C. Gey; Jussi Karlgren;
  Bernardo Magnini; Douglas W. Oard; Maarten de Rijke; and
  Maximilian Stempfhuber), volume 4730 of LNCS, pp. 432–439. Berlin:
  Springer.

Helbig, Hermann (2001). Die semantische Struktur natürlicher Sprache:
  Wissensrepräsentation mit MultiNet. Berlin: Springer.

Helbig, Hermann (2006). Knowledge Representation and the Semantics
  of Natural Language. Berlin: Springer.

Leveling, Johannes (2004). University of Hagen at CLEF 2003: Natural
  language access to the GIRT4 data. In Comparative Evaluation of
  Multilingual Information Access Systems: 4th Workshop of the
  Cross-Language Evaluation Forum, CLEF 2003 (edited by Peters,
  Carol; Julio Gonzalo; Martin Braschler; and Michael Kluck), volume
  3237 of LNCS, pp. 412–424. Berlin: Springer.

Leveling, Johannes (2006a). A baseline for NLP in domain-specific
  information retrieval. In Accessing Multilingual Information
  Repositories: 6th Workshop of the Cross-Language Evaluation Forum,
  CLEF 2005 (edited by Peters, Carol; Fredric C. Gey; Julio Gonzalo;
  Gareth J. F. Jones; Michael Kluck; Bernardo Magnini; Henning Müller;
  and Maarten de Rijke), volume 4022 of LNCS, pp. 222–225. Berlin:
  Springer.

Leveling, Johannes (2006b). Formale Interpretation von Nutzeranfragen
  für natürlichsprachliche Interfaces zu Informationsangeboten im
  Internet. Tönning, Germany: Der Andere Verlag.

Leveling, Johannes and Dirk Veiel (2006). University of Hagen at
  GeoCLEF 2006: Experiments with metonymy recognition in
  documents. In Results of the CLEF 2006 Cross-Language System
  Evaluation Campaign, Working Notes for the CLEF 2006 Workshop
  (edited by Nardi, Alessandro; Carol Peters; and José Luis Vicedo).
  Alicante, Spain.

Ligozat, Anne-Laure; Brigitte Grau; Isabelle Robba; and Anne Vilnat
  (2006). Evaluation and improvement of cross-lingual question
  answering strategies. In Proceedings of the EACL 2006 Workshop on
  Multilingual Question Answering (MLQA’06), pp. 23–30. Trento, Italy.

Mais conteúdo relacionado

Semelhante a Semantic Analysis and Concept-based Translation for Multilingual Information Systems

Bridging the Systemic and Semantic Spheres
Bridging the Systemic and Semantic SpheresBridging the Systemic and Semantic Spheres
Bridging the Systemic and Semantic SpheresHelene Finidori
 
HC-4016, Heterogeneous Implementation of Neural Network Algorithms, by Dmitri...
HC-4016, Heterogeneous Implementation of Neural Network Algorithms, by Dmitri...HC-4016, Heterogeneous Implementation of Neural Network Algorithms, by Dmitri...
HC-4016, Heterogeneous Implementation of Neural Network Algorithms, by Dmitri...AMD Developer Central
 
Roeder rocky 2011_46
Roeder rocky 2011_46Roeder rocky 2011_46
Roeder rocky 2011_46Chris Roeder
 
Beyond Word2Vec: Embedding Words and Phrases in Same Vector Space
Beyond Word2Vec: Embedding Words and Phrases in Same Vector SpaceBeyond Word2Vec: Embedding Words and Phrases in Same Vector Space
Beyond Word2Vec: Embedding Words and Phrases in Same Vector SpaceVijay Prakash Dwivedi
 
OpenWN-PT: a Brazilian Wordnet for all
OpenWN-PT: a Brazilian Wordnet for allOpenWN-PT: a Brazilian Wordnet for all
OpenWN-PT: a Brazilian Wordnet for allAlexandre Rademaker
 
Individual Brain Charting, a high-resolution fMRI dataset for cognitive mappi...
Individual Brain Charting, a high-resolution fMRI dataset for cognitive mappi...Individual Brain Charting, a high-resolution fMRI dataset for cognitive mappi...
Individual Brain Charting, a high-resolution fMRI dataset for cognitive mappi...Ana Luísa Pinho
 
Towards a Marketplace of Open Source Software Data
Towards a Marketplace of Open Source Software DataTowards a Marketplace of Open Source Software Data
Towards a Marketplace of Open Source Software DataFernando Silva Parreiras
 
Tweeting beyond Facts – The Need for a Linguistic Perspective
Tweeting beyond Facts – The Need for a Linguistic PerspectiveTweeting beyond Facts – The Need for a Linguistic Perspective
Tweeting beyond Facts – The Need for a Linguistic PerspectiveData Science Society
 
Towards comprehensive syntactic and semantic annotations of the clinical narr...
Towards comprehensive syntactic and semantic annotations of the clinical narr...Towards comprehensive syntactic and semantic annotations of the clinical narr...
Towards comprehensive syntactic and semantic annotations of the clinical narr...Jinho Choi
 
Temporal Hypermap Theory and Application
Temporal Hypermap Theory and ApplicationTemporal Hypermap Theory and Application
Temporal Hypermap Theory and ApplicationAbel Nyamapfene
 
Monotonic Multihead Attention review
Monotonic Multihead Attention reviewMonotonic Multihead Attention review
Monotonic Multihead Attention reviewJune-Woo Kim
 
RNA sequencing analysis tutorial with NGS
RNA sequencing analysis tutorial with NGSRNA sequencing analysis tutorial with NGS
RNA sequencing analysis tutorial with NGSHAMNAHAMNA8
 
Colloquium talk on modal sense classification using a convolutional neural ne...
Colloquium talk on modal sense classification using a convolutional neural ne...Colloquium talk on modal sense classification using a convolutional neural ne...
Colloquium talk on modal sense classification using a convolutional neural ne...Ana Marasović
 

Semelhante a Semantic Analysis and Concept-based Translation for Multilingual Information Systems (20)

NLP
NLPNLP
NLP
 
NLP
NLPNLP
NLP
 
Bridging the Systemic and Semantic Spheres
Bridging the Systemic and Semantic SpheresBridging the Systemic and Semantic Spheres
Bridging the Systemic and Semantic Spheres
 
HC-4016, Heterogeneous Implementation of Neural Network Algorithms, by Dmitri...
HC-4016, Heterogeneous Implementation of Neural Network Algorithms, by Dmitri...HC-4016, Heterogeneous Implementation of Neural Network Algorithms, by Dmitri...
HC-4016, Heterogeneous Implementation of Neural Network Algorithms, by Dmitri...
 
Roeder rocky 2011_46
Roeder rocky 2011_46Roeder rocky 2011_46
Roeder rocky 2011_46
 
Beyond Word2Vec: Embedding Words and Phrases in Same Vector Space
Beyond Word2Vec: Embedding Words and Phrases in Same Vector SpaceBeyond Word2Vec: Embedding Words and Phrases in Same Vector Space
Beyond Word2Vec: Embedding Words and Phrases in Same Vector Space
 
Omsa
OmsaOmsa
Omsa
 
OpenWN-PT: a Brazilian Wordnet for all
OpenWN-PT: a Brazilian Wordnet for allOpenWN-PT: a Brazilian Wordnet for all
OpenWN-PT: a Brazilian Wordnet for all
 
SEASR Text
SEASR TextSEASR Text
SEASR Text
 
Individual Brain Charting, a high-resolution fMRI dataset for cognitive mappi...
Individual Brain Charting, a high-resolution fMRI dataset for cognitive mappi...Individual Brain Charting, a high-resolution fMRI dataset for cognitive mappi...
Individual Brain Charting, a high-resolution fMRI dataset for cognitive mappi...
 
Towards a Marketplace of Open Source Software Data
Towards a Marketplace of Open Source Software DataTowards a Marketplace of Open Source Software Data
Towards a Marketplace of Open Source Software Data
 
eSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and Summarization
eSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and SummarizationeSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and Summarization
eSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and Summarization
 
haenelt.ppt
haenelt.ppthaenelt.ppt
haenelt.ppt
 
Tweeting beyond Facts – The Need for a Linguistic Perspective
Tweeting beyond Facts – The Need for a Linguistic PerspectiveTweeting beyond Facts – The Need for a Linguistic Perspective
Tweeting beyond Facts – The Need for a Linguistic Perspective
 
Towards comprehensive syntactic and semantic annotations of the clinical narr...
Towards comprehensive syntactic and semantic annotations of the clinical narr...Towards comprehensive syntactic and semantic annotations of the clinical narr...
Towards comprehensive syntactic and semantic annotations of the clinical narr...
 
Temporal Hypermap Theory and Application
Temporal Hypermap Theory and ApplicationTemporal Hypermap Theory and Application
Temporal Hypermap Theory and Application
 
Recurrent Neural Network
Recurrent Neural NetworkRecurrent Neural Network
Recurrent Neural Network
 
Monotonic Multihead Attention review
Monotonic Multihead Attention reviewMonotonic Multihead Attention review
Monotonic Multihead Attention review
 
RNA sequencing analysis tutorial with NGS
RNA sequencing analysis tutorial with NGSRNA sequencing analysis tutorial with NGS
RNA sequencing analysis tutorial with NGS
 
Colloquium talk on modal sense classification using a convolutional neural ne...
Colloquium talk on modal sense classification using a convolutional neural ne...Colloquium talk on modal sense classification using a convolutional neural ne...
Colloquium talk on modal sense classification using a convolutional neural ne...
 

Último

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Último (20)

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 

Semantic Analysis and Concept-based Translation for Multilingual Information Systems

  • 1. Semantic Analysis and Concept-based Translation for Multilingual Information Systems
      Johannes Leveling, Sven Hartrumpf, and Rainer Osswald
      Intelligent Information and Communication Systems (IICS)
      University of Hagen (FernUniversität in Hagen), 58084 Hagen, Germany
      firstname.lastname@fernuni-hagen.de
      GAL 2007, Hildesheim, Germany
  • 2. Outline
      1 Concept-based Representation: MultiNet
      2 Three Phases for a Concept-Based Multilingual IR System
      3 Concept-Based Information Systems
      4 Applications
      5 Conclusion and Outlook
  • 3. Motivation for Concept-Based Translation
      • Example 1: Query expansion in information retrieval (IR) with elements from the same synset
        → needs word sense disambiguation (differentiation of concepts), otherwise loss of precision
      • Example 2: Question answering (QA): questions on relations between concepts (situations, events, etc.)
        Example: Who killed Lee Harvey Oswald?
        → need semantic representation; bag-of-words information retrieval is not enough
  • 4. The MultiNet Paradigm
      • Meaning and knowledge representation: Multilayered Extended Semantic Networks (Helbig, 2001, 2006)
      • Semantic network of nodes (concepts) and edges (semantic relations from a fixed set)
      • In addition: semantic sorts, semantic features, layer information
      • Different types of concepts: lexicalized vs. non-lexicalized
      • Language-independence: annotation of English/Czech sentences from the Wall Street Journal with MultiNet (Charles University, Prague)
  • 5. Selected Semantic Relations
      Relation   Description
      ASSOC      association
      ATTCH      attachment of object to object
      CHPA       change of sorts (property → abstract object)
      EXP        experiencer
      MCONT      an informational process or object
      OBJ        neutral object
      PRED       predicative concept specifying a plurality
      PROP       property relationship
      PARS       meronymy
      SCAR       carrier of a state
      SSPE       state specifier
      SUB        conceptual subordination for objects
      SUBS       conceptual subordination for situations
      SYNO       synonymy
      TEMP       temporal restriction for a situation
      *ALTN1     an introduction of alternatives
  • 6. The Computational Lexicon – HaGenLex
      • Semantically oriented (German) lexical resource (Hartrumpf et al., 2003)
      • Consists of multiple lexicons:
          • full syntactico-semantic information (26,000 entries)
          • flat lexicon (50,000 entries)
          • compound lexicon (30,000 entries; structure and semantics)
          • name lexicons (250,000 entries)
      • Support for the lexicographer: LIAplus workbench
  • 7. Sample Concepts (German)
      • essen.1.1: (Der Student) (ißt) (eine Schokolade).
      • essen.1.2: (Der Student) (ißt) sich (satt).
      • essen.2.1: Das Kind hat kein Essen bekommen.
      • essen.2.2: Das Essen am Abend dauerte 2 Stunden.
      • fressen.1.1: (Der Hund) (frißt) (einen Knochen).
      • fressen.1.2: (Die Großmutter) (frißt) (einen Narren) (an den Blumen).
  • 8. Lexicon Entry (German): essen.1.1
      n-sign
        morph    base       "essen"
                 infl-para  i129g
        syn      v-syn
                   v-type     main
                   perf-aux   haben
                   v-control  nocontr
        semsel   sem        sem
                              entity  nonment-action
                 c-id       "essen.1.1"
                 select     1) rel  agt
                               sel  syn     np-syn: cat np, agr: case nom
                                    semsel  sem: entity human-object
                            2) rel  aff
                               sel  syn     np-syn: cat np, agr: case acc
                                    semsel  sem: entity: sort co
  • 9. Lexicon Entry (German): fressen.1.1
      n-sign
        morph    base       "fressen"
                 infl-para  i139g
        syn      v-syn
                   v-type     main
                   perf-aux   haben
                   v-control  nocontr
        semsel   sem        sem
                              entity  nonment-action
                 c-id       "fressen.1.1"
                 select     1) rel  agt
                               sel  syn     np-syn: cat np, agr: case nom
                                    semsel  sem: entity animal-object ∨ human-object
                            2) rel  aff
                               sel  syn     np-syn: cat np, agr: case acc
                                    semsel  sem: entity: sort co
  • 10. Semantic analysis – The WOCADI parser
      • Produces semantic network representation from (German) texts (Hartrumpf, 2003):
          • resolves coreferences,
          • analyzes idioms,
          • decompounds nouns and adjectives,
          • identifies metonymy,
          • resolves deictic expressions, etc.
      • Applied to large corpora, including the CLEF-NEWS newspaper corpus (275,000 articles) and German Wikipedia (500,000 articles)
  • 11. SN Example (German)
      [Figure: MultiNet semantic network for the query below. Lexicalized concept nodes: du.1.1, finden.1.1, dokument.1.1, berichten.2.2, problem.1.1, psychisch.1.1, streß.1.1, prüfungskandidat.1.1, prüfling.1.1, kandidat.1.1, prüfung.1.1. Further nodes: c1 to c10. Edge labels: ASSOC, ATTCH, EXP, MCONT, OBJ, PRED, PROP, SCAR, SSPE, SUB, SUBS, *ALTN1.]
      Finde Dokumente, die über psychische Probleme oder Stress von Prüfungskandidaten oder Prüflingen berichten. (GIRT topic 116)
  • 12. SN Example (English)
      [Figure: the same semantic network with English concept labels. Labeled concept nodes: you, find, document, report, problem, mental, stress, examinee, candidate, exam. Further nodes: c1 to c10. Edge labels: ASSOC, ATTCH, EXP, MCONT, OBJ, PRED, PROP, SCAR, SSPE, SUB, SUBS, *ALTN1.]
      ‘Find documents reporting on mental problems or stress of examination candidates or examinees.’ (GIRT topic 116)
  • 13. Phase 1: Using Statistical MT and Web Services
      • Employ (statistical) machine translation (MT) web service for IR experiments (translation of queries/questions): Systran, Promt, ...
      • Problems:
          • translating questions: most systems are trained on declarative sentences; imperative forms are often misunderstood (Find documents ... → Fund Dokument ...)
          • named entity recognition: not reliable (Neuengland → new narrow country)
      • Performance loss from off-the-shelf translation tools for QA@CLEF: 50%; further examples: Ligozat et al. (2006)
  • 14. Phase 2: Aligning Concept-based Tools and Resources
      • Morphology and syntax are different for different languages
      • Semantics is the same (in general)
      • Our approach:
          • create lexicons for different languages; fast construction parallel to existing lexicon(s), e.g. HaGenLex → HaEnLex
          • develop parser for different languages
          • apply methods from IR/QA on SN representation
      • General idea: replace concepts (labels) in semantic network representation (as a form of translation)
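The general idea on this slide, replacing concept labels while leaving the relations untouched, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Edge` type, the `CONCEPT_TABLE` mapping, and the three sample edges are assumptions made for the example (the edges are only loosely modeled on the GIRT topic 116 network).

```python
from typing import Dict, List, NamedTuple


class Edge(NamedTuple):
    """One labeled edge of a semantic network: (relation, source, target)."""
    relation: str
    source: str
    target: str


# Hypothetical German-to-English concept table (illustration only).
CONCEPT_TABLE: Dict[str, str] = {
    "finden.1.1": "find",
    "dokument.1.1": "document",
}

# Hypothetical edges; the c-nodes carry no lexical label.
GERMAN_EDGES: List[Edge] = [
    Edge("SUBS", "c2", "finden.1.1"),
    Edge("OBJ", "c2", "c3"),
    Edge("PRED", "c3", "dokument.1.1"),
]


def translate_network(edges: List[Edge], table: Dict[str, str]) -> List[Edge]:
    """Replace concept labels via the table; relations and unknown nodes stay as-is."""
    def translate(node: str) -> str:
        return table.get(node, node)

    return [Edge(e.relation, translate(e.source), translate(e.target)) for e in edges]


ENGLISH_EDGES = translate_network(GERMAN_EDGES, CONCEPT_TABLE)
```

Because only node labels change, any IR/QA method that operates on the relation structure can be applied unchanged to the translated network.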
  • 15. Status of Alignment of Lexical Resources
      • German to English dictionaries: about 100,000 word/phrase translations
      • Mapping between HaGenLex concepts and GermaNet concepts, plus GermaNet to EuroWordNet mapping: about 14,000 concept translations
      • Wikipedia articles (in German and English): about 3,000 proper noun translations for cities, countries, persons, organizations, etc.
      • HaEnLex (parallel English version of HaGenLex) with full morphologic, syntactic, semantic description of concepts: about 7,000 English entries
  • 16. Linguistic Phenomena (1/6)
      Compounds (rare in English):
      • with regular semantics: Kinderernährung → nutrition of children
      • with irregular semantics: Frauenzimmer → dame (?); ladies’ room (?)
      • borderline cases: Bankwesen → banking (system) (?)
      → compound-less semantic representation is possible
  • 17. Linguistic Phenomena (2/6)
      Idioms:
      • with corresponding idiom:
          in den Sinn kommen (DE) → to start thinking about sth.
          to come into mind (EN) → to start thinking about sth.
      • without equivalent idiom:
          to be someone’s cup of tea (EN) → to like
      → semantic representation of idioms
  • 18. Linguistic Phenomena (3/6)
      Metonymy:
      • with corresponding metonymy pattern (for regular metonymy):
          The White House agreed that ... (EN) → place-for-government
          Das Weiße Haus stimmte zu, dass ... (DE) → place-for-government
      • without: ?
      → no problems, yet
  • 19. Linguistic Phenomena (4/6)
      Proper nouns:
      • transcriptions and transliterations, historic name variants
      • Böll → Boell; Gorbatschow → Gorbatchev, Gorbatchov
      → can be solved using aligned online resources, e.g. Wikipedia
      → treat name variants as elements of the same synset
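Treating name variants as elements of one synset amounts to a lookup from any variant to the full variant set. The sketch below assumes a small hand-built list of synsets using only the names from the slide; the function names `build_index` and `expand_name` are made up for the illustration, and a real system would presumably fill the list from aligned resources such as Wikipedia.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Name variants taken from the slide; grouping them in a list is an assumption.
NAME_SYNSETS: List[Set[str]] = [
    {"Böll", "Boell"},
    {"Gorbatschow", "Gorbatchev", "Gorbatchov"},
]


def build_index(synsets: Iterable[Set[str]]) -> Dict[str, FrozenSet[str]]:
    """Map every variant (case-insensitively) to its complete synset."""
    index: Dict[str, FrozenSet[str]] = {}
    for synset in synsets:
        frozen = frozenset(synset)
        for name in synset:
            index[name.casefold()] = frozen
    return index


def expand_name(name: str, index: Dict[str, FrozenSet[str]]) -> List[str]:
    """Return all known variants of a name, or just the name if it is unknown."""
    return sorted(index.get(name.casefold(), {name}))


NAME_INDEX = build_index(NAME_SYNSETS)
```

With this index, a query containing "Boell" can be expanded to both spellings before retrieval, so no single transliteration has to be chosen.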
  • 20. Linguistic Phenomena (5/6)
      Semantic gaps/lexical gaps:
      • Fohlen (DE) → colt (if male)
      • Fohlen (DE) → filly (if female)
      • Alignment of lexicon entries: morpho-syntactic and syntactic features differ between languages; semantic features (in general) do not. Open question: net entries, rules, and entailments may still differ slightly, because they already involve other concepts (which have to be translated).
  • 21. Linguistic Phenomena (6/6)
      Semantic gaps/lexical gaps:
      essen.1.1 → eat.1.1 AND fressen.1.1 → eat.1.1
      n-sign
        morph    base       "eat"
                 infl-para  i20
        syn      v-syn
                   v-type     main
        semsel   sem        sem
                              entity  nonment-action
                 c-id       "eat.1.1"
                 select     1) rel  agt
                               sel  syn     np-syn: cat np
                                    semsel  sem: entity animal-object ∨ human-object
                            2) rel  aff
                               sel  syn     np-syn: cat np
                                    semsel  sem: entity: sort co
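The eat.1.1 entry carries the agent restriction animal-object ∨ human-object, which equals the union of the agent restrictions of essen.1.1 and fressen.1.1. One way to read this is that aligning several source concepts with one target concept merges their selectional restrictions by disjunction. The sketch below illustrates that reading only; the flat dictionary layout and the function `merge_entries` are assumptions, and the real lexicon entries are typed feature structures rather than sets.

```python
from typing import Dict, Set

# role -> set of admissible sorts/entity types, simplified from the slides
Restrictions = Dict[str, Set[str]]

GERMAN_ENTRIES: Dict[str, Restrictions] = {
    "essen.1.1": {"agt": {"human-object"}, "aff": {"co"}},
    "fressen.1.1": {"agt": {"animal-object", "human-object"}, "aff": {"co"}},
}

# Concept alignment as given on the slide.
ALIGNMENT: Dict[str, str] = {
    "essen.1.1": "eat.1.1",
    "fressen.1.1": "eat.1.1",
}


def merge_entries(
    entries: Dict[str, Restrictions], alignment: Dict[str, str]
) -> Dict[str, Restrictions]:
    """Union the per-role restrictions of all source concepts mapped to one target."""
    merged: Dict[str, Restrictions] = {}
    for source_id, roles in entries.items():
        target_roles = merged.setdefault(alignment[source_id], {})
        for role, sorts in roles.items():
            target_roles.setdefault(role, set()).update(sorts)
    return merged


ENGLISH_ENTRIES = merge_entries(GERMAN_ENTRIES, ALIGNMENT)
```

The union loses the distinction between the two German verbs, which is exactly the lexical gap the slide points out.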
  • 22. Phase 3: Towards a Concept-Based Translation
      • Assumption that the same inventory of relations (about 140 relations) holds for different languages
      • Natural language generation (for German)
      • Possible solution: English parser, generate natural language from semantic network representation
  • 23. Monolingual Concept-Based IR
      • Techniques of standard IR: stemming and stopword removal
      • Monolingual concept-based IR:
          • represent queries (and documents) as semantic networks
          • (translate concepts)
          • employ methods on semantic network representation
      • Advantages:
          • semantics of compounds (relation to its constituents)
          • semantics of prepositions is typically represented by semantic relation or function (no full translation needed)
          • lemmatizing (instead of stemming)
          • query expansion with elements of synsets
  • 24. Multilingual Concept-Based IR
      • Three different approaches to supporting a multilingual search:
          1 translate queries into the document language
          2 translate documents into the query language
          3 translate both queries and documents into an interlingua
      • Multilingual concept-based IR: same as monolingual approach, but translate concepts (1, 2, or 3)
        → towards an interlingua
  • 25. Projects and Evaluations
      • GeoCLEF (Leveling and Veiel, 2006): Web service for MT (query translation)
      • GIRT-4 experiments (Leveling, 2004, 2006a): combined concept and word translation
      • NLI-Z39.50 (Leveling, 2006b): replace terminal concepts in SN, then treat translation alternatives as a synset for query expansion (no decision for a single reading necessary)
      • QA@CLEF (Hartrumpf and Leveling, 2007): Web service for MT, then analysis; concept-based translation with rudimentary English parser (preliminary experiments)
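The NLI-Z39.50 strategy, keeping all translation alternatives of a concept as one synset instead of committing to a single reading, can be sketched as follows. The translation table, the fallback to the lemma part of the concept identifier, and the AND/OR query syntax are assumptions for this illustration, not details taken from the cited work.

```python
from typing import Dict, List

# Hypothetical concept-to-translations table (illustration only).
TRANSLATIONS: Dict[str, List[str]] = {
    "prüfling.1.1": ["examinee", "candidate"],
    "streß.1.1": ["stress"],
}


def expand_query(concepts: List[str], translations: Dict[str, List[str]]) -> List[List[str]]:
    """Turn each query concept into the sorted group of all its translation alternatives."""
    groups: List[List[str]] = []
    for concept in concepts:
        alternatives = translations.get(concept)
        if not alternatives:
            # Assumed fallback: keep the untranslated lemma, e.g. "prüfung" from "prüfung.1.1".
            alternatives = [concept.split(".")[0]]
        groups.append(sorted(set(alternatives)))
    return groups


def to_boolean_query(groups: List[List[str]]) -> str:
    """Alternatives within a group are OR-ed; the groups themselves are AND-ed."""
    return " AND ".join("(" + " OR ".join(group) + ")" for group in groups)


QUERY = to_boolean_query(expand_query(["prüfling.1.1", "streß.1.1"], TRANSLATIONS))
```

Since every alternative stays in the query, a wrong reading costs some precision but cannot remove the correct translation from the search.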
  • 26. Conclusion
      • General approach:
          • Parse queries
          • Translate concepts in SN representation
          • Operate on SN representation
      • Aims at multilingual information systems for different purposes: IR, QA
      • 3 phases (currently phase 2)
  • 27. Outlook
      • Create a repository of interlingua concepts: allow for a concept-based machine translation of text
        → natural language generation
        → MT
      • Outlook for IR/QA: index semantic relations as well
  • 28. References
      Hartrumpf, Sven (2003). Hybrid Disambiguation in Natural Language Analysis. Osnabrück, Germany: Der Andere Verlag.
      Hartrumpf, Sven; Hermann Helbig; and Rainer Osswald (2003). The semantically based computer lexicon HaGenLex – Structure and technological environment. Traitement automatique des langues, 44(2):81–105.
      Hartrumpf, Sven and Johannes Leveling (2007). Interpretation and normalization of temporal expressions for question answering. In Evaluation of Multilingual and Multi-modal Information Retrieval: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006 (edited by Peters, Carol; Paul Clough; Fredric C. Gey; Jussi Karlgren; Bernardo Magnini; Douglas W. Oard; Maarten de Rijke; and Maximilian Stempfhuber), volume 4730 of LNCS, pp. 432–439. Berlin: Springer.
      Helbig, Hermann (2001). Die semantische Struktur natürlicher Sprache: Wissensrepräsentation mit MultiNet. Berlin: Springer.
      Helbig, Hermann (2006). Knowledge Representation and the Semantics of Natural Language. Berlin: Springer.
      Leveling, Johannes (2004). University of Hagen at CLEF 2003: Natural language access to the GIRT4 data. In Comparative Evaluation of Multilingual Information Access Systems: 4th Workshop of the Cross-Language Evaluation Forum, CLEF 2003 (edited by Peters, Carol; Julio Gonzalo; Martin Braschler; and Michael Kluck), volume 3237 of LNCS, pp. 412–424. Berlin: Springer.
      Leveling, Johannes (2006a). A baseline for NLP in domain-specific information retrieval. In Accessing Multilingual Information Repositories: 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005 (edited by Peters, Carol; Fredric C. Gey; Julio Gonzalo; Gareth J. F. Jones; Michael Kluck; Bernardo Magnini; Henning Müller; and Maarten de Rijke), volume 4022 of LNCS, pp. 222–225. Berlin: Springer.
      Leveling, Johannes (2006b). Formale Interpretation von Nutzeranfragen für natürlichsprachliche Interfaces zu Informationsangeboten im Internet. Tönning, Germany: Der Andere Verlag.
      Leveling, Johannes and Dirk Veiel (2006). University of Hagen at GeoCLEF 2006: Experiments with metonymy recognition in documents. In Results of the CLEF 2006 Cross-Language System Evaluation Campaign, Working Notes for the CLEF 2006 Workshop (edited by Nardi, Alessandro; Carol Peters; and José Luis Vicedo). Alicante, Spain.
      Ligozat, Anne-Laure; Brigitte Grau; Isabelle Robba; and Anne Vilnat (2006). Evaluation and improvement of cross-lingual question answering strategies. In Proceedings of the EACL 2006 Workshop on Multilingual Question Answering (MLQA’06), pp. 23–30. Trento, Italy.