SlideShare a Scribd company logo
1 of 21
Download to read offline
1st Workshop on Digital Information Management
30-31 March 2011, Corfu, Greece




               “Failed Queries:
          a Morpho-Syntactic Analysis
        Based on Transaction Log Files”
    Anna Mastora1, Maria Monopoli2 and Sarantos Kapidakis1
                                    1Laboratory on Digital Libraries & Electronic Publishing,
                 Department of Archives & Library Sciences, Ionian University, 72 Ioannou
                                                        Theotoki str., 49100, Corfu, Greece
                  2Library Section, Economic Research Department, Bank of Greece, 21 El.

                                                    Venizelos ave., 10250, Athens, Greece
                               {mastora, sarantos}@ionio.gr, mmonopoli@bankofgreece.gr

                                    This research has been co-financed by the European Union
                                    (European Social Fund – ESF) and Greek national funds through
                                    the Operational Program "Education and Lifelong Learning" of
                                    the National Strategic Reference Framework (NSRF) - Research
                                    Funding Program: Heracleitus II. Investing in knowledge
                                    society through the European Social Fund.
Presentation outline

    Introduction
    Aims and Objectives
    Related Research
    Definitions and Methodology
    Results
    Conclusions
    Future Work
2     1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Aims and Objectives

    The aim of this study is to elaborate on the
    procedure needed in order to analyze morpho-
    syntactically the typing-error queries submitted
    during the search process
    The objectives of the study are twofold
    – Explore the extent and types of failed queries due to
      typing errors
    – Explore the feasibility of their morpho-syntactic
      analysis


3      1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Related research (i)

    Information Retrieval techniques do not work
    effectively at all times
    “Not working effectively” includes:
    – Not retrieving relevant documents (low Recall), or,
    – Retrieving non relevant documents (low Precision)
    Part of studying what is not retrieved is the
    analysis of “Failed queries” or “Failure analysis”
    Significant interest has been expressed on
    failed queries as the outcome of subject
    searching
4      1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Related research (ii)

    Natural Language Processing is essential,
    especially in highly inflectional languages,
    such as Greek
    Part of Speech (PoS) tagging and
    morpho-syntactic analysis of words are
    valuable tools when mechanisms are
    applied for word disambiguation (either
    morpho-syntactic or word-sense driven)
5      1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Definitions and Limitations (i)

    What constitutes a “failed query”?
    – It depends. Some approaches consider:
         Precision and Recall, applying retrieval
         effectiveness measures
         User satisfaction, applying users’ criteria to
         measure if a query was failed
         Transaction Log files analysis (with or without
         relevance judgments)
         Critical incident technique, using direct
         observation of human behavior


6     1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Definitions and Limitations (ii)

    Failed query: a query with zero hits
    – A bit arbitrary (yet practical) since
          zero hits cannot solidly define a query as failed
          not all non-zero hits queries can be defined as successful
    – For the purpose of this study this is a rather safe
      definition since
          the database was known to include relevant items
          the information need was relevant to the context of the
          database
          we used transaction log files without relevance judgments
    – We analyze the queries morpho-syntactically, i.e. as
      “bag of words”, detached from any semantic
      designation (well… most of the times…)

7      1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Definitions and Limitations (iii)

    Morpho-syntactic analysis:                 cognitive process that
    constitutes an intermediate layer between morphological and
    syntactic analysis and aims to assign unambiguous morpho-
    syntactic information to words of texts
    Morpho-syntactic information: the morphological
    origin and the morpho-syntactic properties of a word (e.g. the
    word ανθρώπου is the genitive singular form of the masculine
    noun [άνθρωπος])
    Inflectional languages:                             with a high morpheme-per-
    word ratio
    Morpheme:                the smallest meaningful linguistic unit


8       1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Methodology: Experimental setting

    In vitro experiment         (27 participants, undergrads, Dpt. of
    Archives & Library Sciences)
    13 given information needs related to
    environmental issues
     – The queries submitted for this purpose formed the terms’ corpus
    Subject search only, Simple search
    Bibliographic database of the Evonymos
    Ecological Library (customized)
    Transaction Log Files (1 xml log file/ per user/ per
    session)
    Language used: Greek                         (highly inflectional)


9        1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Methodology: The Workflow




10    1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Methodology: Analysis specifications

       Manual designations to categories of failed
     queries, all examined one-by-one

       Use of Open Xerox tokenizer and morphological
     analyzer (PoS tags also included)

        Use of MS Word dictionary


     ♦ For more definitions and a full list of citations, please, refer to the full paper
     of this presentation hosted at the workshop’s website


11            1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Results (i)

     1,284 queries were submitted overall
     36% were failed queries (i.e. 459)
     Further categorization of Failed Queries



     Further categorization of Typing errors



12      1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Results (ii)

     Typing error queries: 90
     Result after tokenization: 156 tokens
     Morpho-syntactic analysis of tokens
     (terms as submitted): 20/156 identified
     and analyzed
        *Poor performance in identifying the terms*



13      1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Results (iii)

     Error correction of tokens using MS Word
     dictionary
     – At this point we interfered with the results assigning
       the semantically correct word




14      1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Results (iv)

     Morpho-syntactic analysis of corrected
     tokens: 139/156 identified and analyzed
     *Good performance in identifying the terms*




15      1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
An example




16   1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Conclusions (i)
     36% of the queries submitted returned zero hits
     19.6% of failed queries were due to typing errors
     Typing error queries require more steps in the process
     towards morpho-syntactic analysis
     Tools for morpho-syntactic analysis of the Greek
     language need to be rich in tags in order to work
     adequately. This affects the complexity of the tool but it
     is essential since Greek is a highly inflectional language
     and cannot be analyzed easily.
     Transaction Log files offer a good starting point but
     need further analysis and support from other tools in
     order to be used for successful word-sense
     disambiguation

17       1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Conclusions (ii)

     MS Word dictionary
     – Was consulted in order to correct ~70% of the
       typing errors. The remaining were the part of
       submitted phrases with no typing error.
     – In 45.5% of the cases, first MS Word suggestion
       was correct.
     – Seems to favor at all instances “plural masculine”
       over “singular feminine”, i.e. for “πυρινική”, it first
       suggests “πυρηνικοί” and second “πυρηνική”.
     – Does not recognize named entities

18      1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Conclusions (iii)

     Open Xerox
     – Works well for identifying and characterizing the
       segments
     – Is not enriched with many tags which leads to
       multiple suggestions. The goal in these cases is the
       less possible suggestions for each segment
     – Does not recognize capitalized words or words with
       no accent marks, meaning that the text submitted
       must be preprocessed to meet certain requirements
     – Does not recognize named entities

19      1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Future Work

       Deal with matters such as
      –Named entities recognition
      –Language identification
      –Word-sense disambiguation
     in order to achieve higher rates of morpho-
     syntactic analysis and apply automated
     mechanisms in this process
       Match the analyzed corpus of queries to
     Knowledge Organization Systems to assist query
     expansion
20       1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
Thank you for your attention!

                               Any questions


                                                                Contact: Anna Mastora
                                               Laboratory on Digital Libraries & Electronic Publishing,
                                         Department of Archives & Library Sciences, Ionian University,
                                                     72 Ioannou Theotoki str., 49100, Corfu, Greece

                                                                         mastora@ionio.gr

21   1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011

More Related Content

What's hot

Phd dr mahmoud hassan elgamel
Phd dr mahmoud hassan elgamel Phd dr mahmoud hassan elgamel
Phd dr mahmoud hassan elgamel Dr-mahmoud Algamel
 
A prior case study of natural language processing on different domain
A prior case study of natural language processing  on different domain A prior case study of natural language processing  on different domain
A prior case study of natural language processing on different domain IJECEIAES
 
Bgs120 debopriyo roy
Bgs120 debopriyo royBgs120 debopriyo roy
Bgs120 debopriyo royDebopriyo Roy
 
Digital Humanities and Internet Research: shared methods and perspectives
Digital Humanities and Internet Research: shared methods and perspectivesDigital Humanities and Internet Research: shared methods and perspectives
Digital Humanities and Internet Research: shared methods and perspectivesCornelius Puschmann
 

What's hot (6)

Phd dr mahmoud hassan elgamel
Phd dr mahmoud hassan elgamel Phd dr mahmoud hassan elgamel
Phd dr mahmoud hassan elgamel
 
A prior case study of natural language processing on different domain
A prior case study of natural language processing  on different domain A prior case study of natural language processing  on different domain
A prior case study of natural language processing on different domain
 
Bgs120 debopriyo roy
Bgs120 debopriyo royBgs120 debopriyo roy
Bgs120 debopriyo roy
 
Lit mtap
Lit mtapLit mtap
Lit mtap
 
Digital Humanities and Internet Research: shared methods and perspectives
Digital Humanities and Internet Research: shared methods and perspectivesDigital Humanities and Internet Research: shared methods and perspectives
Digital Humanities and Internet Research: shared methods and perspectives
 
Learn to Program: The Fundamentals
Learn to Program: The FundamentalsLearn to Program: The Fundamentals
Learn to Program: The Fundamentals
 

Viewers also liked

A morphosyntactic-categorization-scheme-for-the-automated-analysis-of-nepali
A morphosyntactic-categorization-scheme-for-the-automated-analysis-of-nepaliA morphosyntactic-categorization-scheme-for-the-automated-analysis-of-nepali
A morphosyntactic-categorization-scheme-for-the-automated-analysis-of-nepaliAshmit Bhattarai
 
LING 100 - Morphosyntactic Categories
LING 100 - Morphosyntactic CategoriesLING 100 - Morphosyntactic Categories
LING 100 - Morphosyntactic CategoriesMeagan Louie
 
Collocation and multi word lexemes
Collocation and multi word lexemesCollocation and multi word lexemes
Collocation and multi word lexemesJon Mills
 
Morphology # Productivity in Word-Formation
Morphology # Productivity in Word-FormationMorphology # Productivity in Word-Formation
Morphology # Productivity in Word-FormationAni Istiana
 
Word vs lexeme by james jamie 2014 presentation assigned by asifa memon lect...
Word vs lexeme  by james jamie 2014 presentation assigned by asifa memon lect...Word vs lexeme  by james jamie 2014 presentation assigned by asifa memon lect...
Word vs lexeme by james jamie 2014 presentation assigned by asifa memon lect...James Jamie
 
Morphological Analysis
Morphological AnalysisMorphological Analysis
Morphological AnalysisInnowiz
 
Words and lexemes ppt
Words and lexemes pptWords and lexemes ppt
Words and lexemes pptAngeline-dbz
 

Viewers also liked (7)

A morphosyntactic-categorization-scheme-for-the-automated-analysis-of-nepali
A morphosyntactic-categorization-scheme-for-the-automated-analysis-of-nepaliA morphosyntactic-categorization-scheme-for-the-automated-analysis-of-nepali
A morphosyntactic-categorization-scheme-for-the-automated-analysis-of-nepali
 
LING 100 - Morphosyntactic Categories
LING 100 - Morphosyntactic CategoriesLING 100 - Morphosyntactic Categories
LING 100 - Morphosyntactic Categories
 
Collocation and multi word lexemes
Collocation and multi word lexemesCollocation and multi word lexemes
Collocation and multi word lexemes
 
Morphology # Productivity in Word-Formation
Morphology # Productivity in Word-FormationMorphology # Productivity in Word-Formation
Morphology # Productivity in Word-Formation
 
Word vs lexeme by james jamie 2014 presentation assigned by asifa memon lect...
Word vs lexeme  by james jamie 2014 presentation assigned by asifa memon lect...Word vs lexeme  by james jamie 2014 presentation assigned by asifa memon lect...
Word vs lexeme by james jamie 2014 presentation assigned by asifa memon lect...
 
Morphological Analysis
Morphological AnalysisMorphological Analysis
Morphological Analysis
 
Words and lexemes ppt
Words and lexemes pptWords and lexemes ppt
Words and lexemes ppt
 

Similar to Failed queries: a morpho-syntactic analysis based on transaction log files

Supporting User's Exploration of Digital Libraries, Suedl 2012 workshop proce...
Supporting User's Exploration of Digital Libraries, Suedl 2012 workshop proce...Supporting User's Exploration of Digital Libraries, Suedl 2012 workshop proce...
Supporting User's Exploration of Digital Libraries, Suedl 2012 workshop proce...pathsproject
 
APPLYING QUALITATIVE RESEARCH IN E-LEARNING DISCUSSION AND FINDINGS FROM THR...
APPLYING QUALITATIVE RESEARCH IN E-LEARNING  DISCUSSION AND FINDINGS FROM THR...APPLYING QUALITATIVE RESEARCH IN E-LEARNING  DISCUSSION AND FINDINGS FROM THR...
APPLYING QUALITATIVE RESEARCH IN E-LEARNING DISCUSSION AND FINDINGS FROM THR...Monica Waters
 
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUESMULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUESijcseit
 
Enhancing Intercultural Communicative Competence through cross-cultural inter...
Enhancing Intercultural Communicative Competence through cross-cultural inter...Enhancing Intercultural Communicative Competence through cross-cultural inter...
Enhancing Intercultural Communicative Competence through cross-cultural inter...Kristi Jauregi Ondarra
 
Tenc Winterschool09 Davinia Slideshare
Tenc Winterschool09 Davinia SlideshareTenc Winterschool09 Davinia Slideshare
Tenc Winterschool09 Davinia Slideshareguest94c824
 
ASK Tools 4 Scaling-Up Open Access 2 Global Education
ASK Tools 4 Scaling-Up Open Access 2 Global EducationASK Tools 4 Scaling-Up Open Access 2 Global Education
ASK Tools 4 Scaling-Up Open Access 2 Global EducationDemetrios G. Sampson
 
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...
Dialectal Arabic sentiment analysis based on tree-based pipeline  optimizatio...Dialectal Arabic sentiment analysis based on tree-based pipeline  optimizatio...
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...IJECEIAES
 
Share.TEC presentation at EdReNe expert workshop, J. Earp
Share.TEC presentation at EdReNe expert workshop, J. EarpShare.TEC presentation at EdReNe expert workshop, J. Earp
Share.TEC presentation at EdReNe expert workshop, J. EarpShare.TEC
 
Share.TEC: Sharing Digital Resources in the Teacher Education Community
Share.TEC: Sharing Digital Resources in the Teacher Education CommunityShare.TEC: Sharing Digital Resources in the Teacher Education Community
Share.TEC: Sharing Digital Resources in the Teacher Education CommunityErik Axdorph
 
Knowledge Management Cultures: A Comparison of Engineering and Cultural Scien...
Knowledge Management Cultures: A Comparison of Engineering and Cultural Scien...Knowledge Management Cultures: A Comparison of Engineering and Cultural Scien...
Knowledge Management Cultures: A Comparison of Engineering and Cultural Scien...Ralf Klamma
 
2015-11-18 research seminar
2015-11-18 research seminar2015-11-18 research seminar
2015-11-18 research seminarifi8106tlu
 
Eurocall2014 SpeakApps Presentation - SpeakApps and Learning Analytics
Eurocall2014 SpeakApps Presentation - SpeakApps and Learning AnalyticsEurocall2014 SpeakApps Presentation - SpeakApps and Learning Analytics
Eurocall2014 SpeakApps Presentation - SpeakApps and Learning AnalyticsSpeakApps Project
 
Digital Systems and Services for Open Access Education and Learning
Digital Systems and Services for Open Access Education and LearningDigital Systems and Services for Open Access Education and Learning
Digital Systems and Services for Open Access Education and LearningDemetrios G. Sampson
 
Structuring Self Organised Language Learning Online and Offline
Structuring Self Organised Language Learning Online and OfflineStructuring Self Organised Language Learning Online and Offline
Structuring Self Organised Language Learning Online and OfflineMonika Anclin
 
Geographic collections development policies and GIS services: a research in U...
Geographic collections development policies and GIS services: a research in U...Geographic collections development policies and GIS services: a research in U...
Geographic collections development policies and GIS services: a research in U...Giannis Tsakonas
 
Training Informatics Teachers in Managing ICT Facilities in Schools
Training Informatics Teachers in Managing ICT Facilities in SchoolsTraining Informatics Teachers in Managing ICT Facilities in Schools
Training Informatics Teachers in Managing ICT Facilities in SchoolsChrysanthi Tziortzioti
 
The student experience: How does it align with policy directives?
The student experience: How does it align with policy directives?The student experience: How does it align with policy directives?
The student experience: How does it align with policy directives?EDEN Digital Learning Europe
 
Conole Policy Eden
Conole Policy EdenConole Policy Eden
Conole Policy Edengrainne
 

Similar to Failed queries: a morpho-syntactic analysis based on transaction log files (20)

Supporting User's Exploration of Digital Libraries, Suedl 2012 workshop proce...
Supporting User's Exploration of Digital Libraries, Suedl 2012 workshop proce...Supporting User's Exploration of Digital Libraries, Suedl 2012 workshop proce...
Supporting User's Exploration of Digital Libraries, Suedl 2012 workshop proce...
 
APPLYING QUALITATIVE RESEARCH IN E-LEARNING DISCUSSION AND FINDINGS FROM THR...
APPLYING QUALITATIVE RESEARCH IN E-LEARNING  DISCUSSION AND FINDINGS FROM THR...APPLYING QUALITATIVE RESEARCH IN E-LEARNING  DISCUSSION AND FINDINGS FROM THR...
APPLYING QUALITATIVE RESEARCH IN E-LEARNING DISCUSSION AND FINDINGS FROM THR...
 
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUESMULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES
 
Enhancing Intercultural Communicative Competence through cross-cultural inter...
Enhancing Intercultural Communicative Competence through cross-cultural inter...Enhancing Intercultural Communicative Competence through cross-cultural inter...
Enhancing Intercultural Communicative Competence through cross-cultural inter...
 
Tenc Winterschool09 Davinia Slideshare
Tenc Winterschool09 Davinia SlideshareTenc Winterschool09 Davinia Slideshare
Tenc Winterschool09 Davinia Slideshare
 
Al pres
Al presAl pres
Al pres
 
ASK Tools 4 Scaling-Up Open Access 2 Global Education
ASK Tools 4 Scaling-Up Open Access 2 Global EducationASK Tools 4 Scaling-Up Open Access 2 Global Education
ASK Tools 4 Scaling-Up Open Access 2 Global Education
 
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...
Dialectal Arabic sentiment analysis based on tree-based pipeline  optimizatio...Dialectal Arabic sentiment analysis based on tree-based pipeline  optimizatio...
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...
 
Share.TEC presentation at EdReNe expert workshop, J. Earp
Share.TEC presentation at EdReNe expert workshop, J. EarpShare.TEC presentation at EdReNe expert workshop, J. Earp
Share.TEC presentation at EdReNe expert workshop, J. Earp
 
Share.TEC: Sharing Digital Resources in the Teacher Education Community
Share.TEC: Sharing Digital Resources in the Teacher Education CommunityShare.TEC: Sharing Digital Resources in the Teacher Education Community
Share.TEC: Sharing Digital Resources in the Teacher Education Community
 
Knowledge Management Cultures: A Comparison of Engineering and Cultural Scien...
Knowledge Management Cultures: A Comparison of Engineering and Cultural Scien...Knowledge Management Cultures: A Comparison of Engineering and Cultural Scien...
Knowledge Management Cultures: A Comparison of Engineering and Cultural Scien...
 
2015-11-18 research seminar
2015-11-18 research seminar2015-11-18 research seminar
2015-11-18 research seminar
 
Eurocall2014 SpeakApps Presentation - SpeakApps and Learning Analytics
Eurocall2014 SpeakApps Presentation - SpeakApps and Learning AnalyticsEurocall2014 SpeakApps Presentation - SpeakApps and Learning Analytics
Eurocall2014 SpeakApps Presentation - SpeakApps and Learning Analytics
 
Digital Systems and Services for Open Access Education and Learning
Digital Systems and Services for Open Access Education and LearningDigital Systems and Services for Open Access Education and Learning
Digital Systems and Services for Open Access Education and Learning
 
Structuring Self Organised Language Learning Online and Offline
Structuring Self Organised Language Learning Online and OfflineStructuring Self Organised Language Learning Online and Offline
Structuring Self Organised Language Learning Online and Offline
 
Geographic collections development policies and GIS services: a research in U...
Geographic collections development policies and GIS services: a research in U...Geographic collections development policies and GIS services: a research in U...
Geographic collections development policies and GIS services: a research in U...
 
The impact of standardized terminologies and domain-ontologies in multilingua...
The impact of standardized terminologies and domain-ontologies in multilingua...The impact of standardized terminologies and domain-ontologies in multilingua...
The impact of standardized terminologies and domain-ontologies in multilingua...
 
Training Informatics Teachers in Managing ICT Facilities in Schools
Training Informatics Teachers in Managing ICT Facilities in SchoolsTraining Informatics Teachers in Managing ICT Facilities in Schools
Training Informatics Teachers in Managing ICT Facilities in Schools
 
The student experience: How does it align with policy directives?
The student experience: How does it align with policy directives?The student experience: How does it align with policy directives?
The student experience: How does it align with policy directives?
 
Conole Policy Eden
Conole Policy EdenConole Policy Eden
Conole Policy Eden
 

More from Giannis Tsakonas

Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο Ιστό
Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο ΙστόΑρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο Ιστό
Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο ΙστόGiannis Tsakonas
 
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...Giannis Tsakonas
 
Increasing traceability of physical library items through Koha: the case of S...
Increasing traceability of physical library items through Koha: the case of S...Increasing traceability of physical library items through Koha: the case of S...
Increasing traceability of physical library items through Koha: the case of S...Giannis Tsakonas
 
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικές
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικέςΑκαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικές
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικέςGiannis Tsakonas
 
We were group no 2: notes for the MLAS2015 workshop
We were group no 2: notes for the MLAS2015 workshopWe were group no 2: notes for the MLAS2015 workshop
We were group no 2: notes for the MLAS2015 workshopGiannis Tsakonas
 
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόητα
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόηταΒιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόητα
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόηταGiannis Tsakonas
 
{Tech}changes: the technological state of Greek Libraries.
{Tech}changes: the technological state of Greek Libraries.{Tech}changes: the technological state of Greek Libraries.
{Tech}changes: the technological state of Greek Libraries.Giannis Tsakonas
 
Affective relationships between users & libraries in times of economic stress
Affective relationships between users & libraries in times of economic stressAffective relationships between users & libraries in times of economic stress
Affective relationships between users & libraries in times of economic stressGiannis Tsakonas
 
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...Giannis Tsakonas
 
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο Κύπρου
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο ΚύπρουΔεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο Κύπρου
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο ΚύπρουGiannis Tsakonas
 
FRBR και Linked Data - Σεμινάριο Αθήνας
FRBR και Linked Data -  Σεμινάριο ΑθήναςFRBR και Linked Data -  Σεμινάριο Αθήνας
FRBR και Linked Data - Σεμινάριο ΑθήναςGiannis Tsakonas
 
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...Giannis Tsakonas
 
Policies for geospatial collections: a research in US and Canadian academic l...
Policies for geospatial collections: a research in US and Canadian academic l...Policies for geospatial collections: a research in US and Canadian academic l...
Policies for geospatial collections: a research in US and Canadian academic l...Giannis Tsakonas
 
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...Giannis Tsakonas
 
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...Giannis Tsakonas
 
Path-based MXML Storage and Querying
Path-based MXML Storage and QueryingPath-based MXML Storage and Querying
Path-based MXML Storage and QueryingGiannis Tsakonas
 
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked Data
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked DataΔεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked Data
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked DataGiannis Tsakonas
 
Open Bibliographic Data and E-LIS
Open Bibliographic Data and E-LISOpen Bibliographic Data and E-LIS
Open Bibliographic Data and E-LISGiannis Tsakonas
 
Dileo Presentation (in English)
Dileo Presentation (in English)Dileo Presentation (in English)
Dileo Presentation (in English)Giannis Tsakonas
 
Evaluation Insights to Key Processes of Digital Repositories
Evaluation Insights to Key Processes of Digital RepositoriesEvaluation Insights to Key Processes of Digital Repositories
Evaluation Insights to Key Processes of Digital RepositoriesGiannis Tsakonas
 

More from Giannis Tsakonas (20)

Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο Ιστό
Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο ΙστόΑρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο Ιστό
Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο Ιστό
 
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...
 
Increasing traceability of physical library items through Koha: the case of S...
Increasing traceability of physical library items through Koha: the case of S...Increasing traceability of physical library items through Koha: the case of S...
Increasing traceability of physical library items through Koha: the case of S...
 
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικές
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικέςΑκαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικές
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικές
 
We were group no 2: notes for the MLAS2015 workshop
We were group no 2: notes for the MLAS2015 workshopWe were group no 2: notes for the MLAS2015 workshop
We were group no 2: notes for the MLAS2015 workshop
 
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόητα
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόηταΒιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόητα
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόητα
 
{Tech}changes: the technological state of Greek Libraries.
{Tech}changes: the technological state of Greek Libraries.{Tech}changes: the technological state of Greek Libraries.
{Tech}changes: the technological state of Greek Libraries.
 
Affective relationships between users & libraries in times of economic stress
Affective relationships between users & libraries in times of economic stressAffective relationships between users & libraries in times of economic stress
Affective relationships between users & libraries in times of economic stress
 
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...
 
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο Κύπρου
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο ΚύπρουΔεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο Κύπρου
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο Κύπρου
 
FRBR και Linked Data - Σεμινάριο Αθήνας
FRBR και Linked Data -  Σεμινάριο ΑθήναςFRBR και Linked Data -  Σεμινάριο Αθήνας
FRBR και Linked Data - Σεμινάριο Αθήνας
 
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...
 
Policies for geospatial collections: a research in US and Canadian academic l...
Policies for geospatial collections: a research in US and Canadian academic l...Policies for geospatial collections: a research in US and Canadian academic l...
Policies for geospatial collections: a research in US and Canadian academic l...
 
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...
 
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...
 
Path-based MXML Storage and Querying
Path-based MXML Storage and QueryingPath-based MXML Storage and Querying
Path-based MXML Storage and Querying
 
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked Data
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked DataΔεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked Data
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked Data
 
Open Bibliographic Data and E-LIS
Open Bibliographic Data and E-LISOpen Bibliographic Data and E-LIS
Open Bibliographic Data and E-LIS
 
Dileo Presentation (in English)
Dileo Presentation (in English)Dileo Presentation (in English)
Dileo Presentation (in English)
 
Evaluation Insights to Key Processes of Digital Repositories
Evaluation Insights to Key Processes of Digital RepositoriesEvaluation Insights to Key Processes of Digital Repositories
Evaluation Insights to Key Processes of Digital Repositories
 

Recently uploaded

Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 

Recently uploaded (20)

Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 

Failed queries: a morpho-syntactic analysis based on transaction log files

  • 1. 1st Workshop on Digital Information Management 30-31 March 2011, Corfu, Greece “Failed Queries: a Morpho-Syntactic Analysis Based on Transaction Log Files” Anna Mastora1, Maria Monopoli2 and Sarantos Kapidakis1 1Laboratory on Digital Libraries & Electronic Publishing, Department of Archives & Library Sciences, Ionian University, 72 Ioannou Theotoki str., 49100, Corfu, Greece 2Library Section, Economic Research Department, Bank of Greece, 21 El. Venizelos ave., 10250, Athens, Greece {mastora, sarantos}@ionio.gr, mmonopoli@bankofgreece.gr This research has been co-financed by the European Union (European Social Fund – ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: Heracleitus II. Investing in knowledge society through the European Social Fund.
  • 2. Presentation outline Introduction Aims and Objectives Related Research Definitions and Methodology Results Conclusions Future Work 2 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 3. Aims and Objectives The aim of this study is to elaborate on the procedure needed in order to analyze morpho- syntactically the typing-error queries submitted during the search process The objectives of the study are twofold – Explore the extent and types of failed queries due to typing errors – Explore the feasibility of their morpho-syntactic analysis 3 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 4. Related research (i) Information Retrieval techniques do not work effectively at all times “Not working effectively” includes: – Not retrieving relevant documents (low Recall), or, – Retrieving non relevant documents (low Precision) Part of studying what is not retrieved is the analysis of “Failed queries” or “Failure analysis” Significant interest has been expressed on failed queries as the outcome of subject searching 4 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 5. Related research (ii) Natural Language Processing is essential, especially in highly inflectional languages, such as Greek Part of Speech (PoS) tagging and morpho-syntactic analysis of words are valuable tools when mechanisms are applied for word disambiguation (either morpho-syntactic or word-sense driven) 5 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 6. Definitions and Limitations (i) What constitutes a “failed query”? – It depends. Some approaches consider: Precision and Recall, applying retrieval effectiveness measures User satisfaction, applying users’ criteria to measure if a query was failed Transaction Log files analysis (with or without relevance judgments) Critical incident technique, using direct observation of human behavior 6 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 7. Definitions and Limitations (ii) Failed query: a query with zero hits – A bit arbitrary (yet practical) since zero hits cannot solidly define a query as failed not all non-zero hits queries can be defined as successful – For the purpose of this study this is a rather safe definition since the database was known to include relevant items the information need was relevant to the context of the database we used transaction log files without relevance judgments – We analyze the queries morpho-syntactically, i.e. as “bag of words”, detached from any semantic designation (well… most of the times…) 7 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 8. Definitions and Limitations (iii) Morpho-syntactic analysis: cognitive process that constitutes an intermediate layer between morphological and syntactic analysis and aims to assign unambiguous morpho- syntactic information to words of texts Morpho-syntactic information: the morphological origin and the morpho-syntactic properties of a word (e.g. the word ανθρώπου is the genitive singular form of the masculine noun [άνθρωπος]) Inflectional languages: with a high morpheme-per- word ratio Morpheme: the smallest meaningful linguistic unit 8 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 9. Methodology: Experimental setting In vitro experiment (27 participants, undergrads, Dpt. of Archives & Library Sciences) 13 given information needs related to environmental issues – The queries submitted for this purpose formed the terms’ corpus Subject search only, Simple search Bibliographic database of the Evonymos Ecological Library (customized) Transaction Log Files (1 xml log file/ per user/ per session) Language used: Greek (highly inflectional) 9 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 10. Methodology: The Workflow 10 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 11. Methodology: Analysis specifications Manual designations to categories of failed queries, all examined one-by-one Use of Open Xerox tokenizer and morphological analyzer (PoS tags also included) Use of MS Word dictionary ♦ For more definitions and a full list of citations, please, refer to the full paper of this presentation hosted at the workshop’s website 11 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 12. Results (i) 1,284 queries were submitted overall 36% were failed queries (i.e. 459) Further categorization of Failed Queries Further categorization of Typing errors 12 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 13. Results (ii) Typing error queries: 90 Result after tokenization: 156 tokens Morpho-syntactic analysis of tokens (terms as submitted): 20/156 identified and analyzed *Poor performance in identifying the terms* 13 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 14. Results (iii) Error correction of tokens using MS Word dictionary – At this point we interfered with the results assigning the semantically correct word 14 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 15. Results (iv) Morpho-syntactic analysis of corrected tokens: 139/156 identified and analyzed *Good performance in identifying the terms* 15 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 16. An example 16 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 17. Conclusions (i) 36% of the queries submitted returned zero hits 19.6% of failed queries were due to typing errors Typing error queries require more steps in the process towards morpho-syntactic analysis Tools for morpho-syntactic analysis of the Greek language need to be rich in tags in order to work adequately. This affects the complexity of the tool but it is essential since Greek is a highly inflectional language and cannot be analyzed easily. Transaction Log files offer a good starting point but need further analysis and support from other tools in order to be used for successful word-sense disambiguation 17 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 18. Conclusions (ii) MS Word dictionary – Was consulted in order to correct ~70% of the typing errors. The remaining were the part of submitted phrases with no typing error. – In 45.5% of the cases, first MS Word suggestion was correct. – Seems to favor at all instances “plural masculine” over “singular feminine”, i.e. for “πυρινική”, it first suggests “πυρηνικοί” and second “πυρηνική”. – Does not recognize named entities 18 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 19. Conclusions (iii) Open Xerox – Works well for identifying and characterizing the segments – Is not enriched with many tags which leads to multiple suggestions. The goal in these cases is the less possible suggestions for each segment – Does not recognize capitalized words or words with no accent marks, meaning that the text submitted must be preprocessed to meet certain requirements – Does not recognize named entities 19 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 20. Future Work Deal with matters such as –Named entities recognition –Language identification –Word-sense disambiguation in order to achieve higher rates of morpho- syntactic analysis and apply automated mechanisms in this process Match the analyzed corpus of queries to Knowledge Organization Systems to assist query expansion 20 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011
  • 21. Thank you for your attention! Any questions Contact: Anna Mastora Laboratory on Digital Libraries & Electronic Publishing, Department of Archives & Library Sciences, Ionian University, 72 Ioannou Theotoki str., 49100, Corfu, Greece mastora@ionio.gr 21 1st Workshop on Digital Information Management, Corfu, Greece, 30-31 March 2011