Automated Focus Extraction for
    Question Answering over Topic Maps

     Rani Pinchuk, Alexander Mikhailian and Tiphaine Dalmas




Automated Focus Extraction for Question Answering over Topic Maps   TMRA’09, Leipzig




       Context: domain-portable Question
          Answering over Topic Maps
•Partly funded by the Flemish government as part of the ITEA2
 project LINDO (ITEA2-06011).
•The research towards domain-portable question answering over
 Topic Maps is done within the Belgian part of the LINDO project.








                            Why Topic Maps?
      • The space industry needs a solution to the knowledge
        retention problem.
      • More structured than mind maps, less formal than
        RDF/OWL.
      • They allow information to be organized in an ontological view.
      • An ISO standard.








                            Why Topic Maps?

                                                 Who is the composer of La Bohème?

                                                      Puccini








         LINDO-BE General Architecture

[Figure: architecture diagram. Components shown: Question, Focus Extractor, Time Exp. Extractor, Anchorer, Graph Reducer, Answer Extractor, Answer, Topic Map Engine.]








                            Question Focus
The focus is the type of the answer, expressed in the terminology of the question.

                                                 Who is the composer of La Bohème?

                                                      Puccini








                                           Focus

The focus takes one of two forms:

• Asking Point (AP), explicit in the question:
  “Who is the librettist of La Tilda?”
• Expected Answer Type (EAT), implicit in the question:
  HUMAN: “Who wrote the libretto for La Tilda?”

EAT classes: TIME, NUMERIC, DEFINITION, LOCATION, HUMAN, OTHER




           Is it difficult to find the focus?
      •   Where was Puccini born?
      •   What is Puccini's place of birth?
      •   What is Puccini's birthplace?
      •   What is the birth place of Puccini?
      •   What city was Puccini born in?
      •   What place was Puccini born in?
      •   Where is Puccini from?

[Figure: topic map fragment. Puccini and Lucca are joined by an edge labelled "person", "born in", "place"; Lucca "is a" City.]








Why should AP take precedence over EAT?
                                                    “Who is the librettist of La Tilda?”

                                                    EAT         =   HUMAN → Person
                                                    AP          =   Librettist








                         Precision and Recall

$$P = \frac{|\{\text{relevant}\} \cap \{\text{retrieved}\}|}{|\{\text{retrieved}\}|}$$

$$R = \frac{|\{\text{relevant}\} \cap \{\text{retrieved}\}|}{|\{\text{relevant}\}|}$$






Why should AP take precedence over EAT?
                                                    “Who is the librettist of La Tilda?”

                                                    EAT         =    HUMAN → Person
                                                    AP          =    Librettist

                                                    P_AP        =    57/57      =  1
                                                    P_EAT       =    57/1165    ≈  0.049
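The two precision values can be checked with a small sketch. It assumes one reading of the slide's counts: the topic map holds 1165 persons, 57 of whom are librettists, and every librettist counts as relevant. The topic identifiers are made up for illustration.

```python
def precision(relevant: set, retrieved: set) -> float:
    """Share of retrieved items that are relevant."""
    return len(relevant & retrieved) / len(retrieved) if retrieved else 0.0


def recall(relevant: set, retrieved: set) -> float:
    """Share of relevant items that were retrieved."""
    return len(relevant & retrieved) / len(relevant) if relevant else 0.0


# Hypothetical identifiers; only the set sizes (1165 and 57) come from the slide.
persons = {f"person-{i}" for i in range(1165)}
librettists = {f"person-{i}" for i in range(57)}

relevant = librettists

# AP "Librettist": retrieve only topics typed as librettist.
p_ap = precision(relevant, librettists)
r_ap = recall(relevant, librettists)

# EAT "HUMAN": retrieve every person in the topic map.
p_eat = precision(relevant, persons)
r_eat = recall(relevant, persons)

print(f"AP:  P={p_ap:.3f} R={r_ap:.3f}")
print(f"EAT: P={p_eat:.3f} R={r_eat:.3f}")
```

Under this reading both strategies reach the same recall, and the coarser EAT pays for it with a much lower precision.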







Why should AP take precedence over EAT?
         Results over 100 annotated questions:

                             Focus type      Precision      Recall
                             AP              0.311          0.30
                             EAT             0.089          0.21








                              Focus Branching








            Focus Extractor Architecture
• Supervised machine learning based on the
  principle of maximum entropy (Maxent).
• 2100 questions have been annotated:
   • 1500 from the Li & Roth corpus
   • 500 from TREC-10
   • 100 asked over the Italian Opera topic map
• The corpus was split into 80% training and
  20% test data. The evaluation was run 10 times,
  each time reshuffling the training and test data.

Pipeline: Question → Tokenizer → POS Tagger → Syntactic Parser → Lexical Analysis → Focus Extractor → Focus
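The evaluation protocol described above (10 runs, each with a fresh 80/20 shuffle) can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, the toy corpus is invented, and a majority-class baseline stands in for the Maxent model.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, str]  # (question, label)


def majority_baseline(train: Sequence[Example]) -> Callable[[str], str]:
    """Stand-in classifier (NOT Maxent): always predicts the most frequent training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda question: most_common


def repeated_holdout(
    corpus: Sequence[Example],
    fit: Callable[[Sequence[Example]], Callable[[str], str]],
    runs: int = 10,
    train_fraction: float = 0.8,
    seed: int = 0,
) -> List[float]:
    """Shuffle, split into train/test, train, and score accuracy; repeat `runs` times."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(runs):
        shuffled = list(corpus)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train, test = shuffled[:cut], shuffled[cut:]
        predict = fit(train)
        correct = sum(predict(question) == label for question, label in test)
        accuracies.append(correct / len(test))
    return accuracies


# Toy corpus with made-up labels, only to exercise the protocol.
corpus = [(f"question {i}", "HUMAN" if i % 3 else "LOCATION") for i in range(100)]
scores = repeated_holdout(corpus, majority_baseline)
print(scores)
```

Swapping `majority_baseline` for a function that trains a real classifier leaves the protocol unchanged.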






                   Question Annotation

Asking Point (AP classifier), one label per token:
          O: What
         AP: opera
          O: did
          O: Puccini
          O: write
          O: ?

Expected Answer Type (EAT classifier), one label per question:
     HUMAN: Who is Puccini?
     DEFINITION: What is Tosca?
     LOCATION: Where did Dante die?
     TIME: When did Puccini die?
     NUMERIC: How many characters have been killed by poisoning?
     OTHER: What did Heinrich Heine write?
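The two annotation schemes differ in granularity, which a short sketch makes concrete. The data structures and the helper name `asking_point_tokens` are assumptions chosen for illustration; the labels and example questions are the ones on the slide.

```python
from typing import List, Tuple

# AP annotation: one tag per token ("AP" marks the asking point, "O" everything else).
ap_annotation: List[Tuple[str, str]] = [
    ("What", "O"),
    ("opera", "AP"),
    ("did", "O"),
    ("Puccini", "O"),
    ("write", "O"),
    ("?", "O"),
]

# EAT annotation: one class per whole question.
eat_annotation: List[Tuple[str, str]] = [
    ("Who is Puccini?", "HUMAN"),
    ("What is Tosca?", "DEFINITION"),
    ("Where did Dante die?", "LOCATION"),
    ("When did Puccini die?", "TIME"),
]


def asking_point_tokens(tagged: List[Tuple[str, str]]) -> List[str]:
    """Collect the tokens tagged as asking point, in question order."""
    return [token for token, tag in tagged if tag == "AP"]


print(asking_point_tokens(ap_annotation))
```

Seen this way, the AP classifier is a per-token tagging task, while the EAT classifier assigns a single label to the question as a whole.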






                                        AP Results

     Class           Precision    Recall    F-Score
     AskingPoint     0.854        0.734     0.789
     Other           0.973        0.987     0.980
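The slides do not state which F-score is used. Assuming the balanced F1 (the harmonic mean of precision and recall), the sketch below recomputes the F-Score column from the other two columns of the AP table.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall (assumed, not stated on the slide)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


ap_rows = {
    "AskingPoint": (0.854, 0.734),
    "Other": (0.973, 0.987),
}

for name, (p, r) in ap_rows.items():
    print(f"{name}: F = {f_score(p, r):.3f}")
```

Both rows come out at the values shown in the table, which is consistent with the F1 assumption.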








                                        EAT Results

      Class          Precision    Recall    F-Score
      DEFINITION     0.887        0.800     0.841
      LOCATION       0.834        0.812     0.821
      HUMAN          0.904        0.753     0.820
      TIME           0.880        0.802     0.838
      NUMERIC        0.943        0.782     0.854
      OTHER          0.746        0.893     0.812







                                   Overall Results
       The overall results are reported as the accuracy
       of the classifier.

         Accuracy = correct instances / all instances

                          Value    Std dev    Std err
       Focus (AP+EAT)     0.827    0.020      0.006
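The relation between the last two columns can be checked with a short calculation. It assumes the standard error is the standard deviation divided by the square root of the number of evaluation runs (10, per the architecture slide); the slides do not spell this out.

```python
import math


def standard_error(std_dev: float, runs: int) -> float:
    """Standard error of the mean over `runs` independent evaluation runs."""
    return std_dev / math.sqrt(runs)


se = standard_error(0.020, 10)
print(f"{se:.4f}")  # 0.0063
```

Rounded to three decimals this gives 0.006, matching the reported value.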








                         Prediction of Accuracy








                                   Conclusions
       • We achieved 82.7% accuracy for focus extraction.
       • The specificity of the focus degrades gracefully (we first try
         to extract the AP, and fall back to the EAT).
       • The focus is identified dynamically instead of relying on
         a static taxonomy of question types.
       • Machine learning techniques were used throughout the
         application stack.
       • The results could be improved with more training data.
       • The whole setting is domain-independent.
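The graceful degradation mentioned above (AP first, EAT as fallback) can be sketched as a small function. The name `extract_focus` and its input format are hypothetical; the per-token tags and the EAT label are assumed to come from the two classifiers, whose internals are not shown here.

```python
from typing import List, Tuple


def extract_focus(tagged_tokens: List[Tuple[str, str]], eat_label: str) -> Tuple[str, str]:
    """Return ("AP", text) when an asking point was tagged, otherwise ("EAT", label)."""
    ap_tokens = [token for token, tag in tagged_tokens if tag == "AP"]
    if ap_tokens:
        return ("AP", " ".join(ap_tokens))
    return ("EAT", eat_label)


# Explicit focus: the tagger marks "librettist" as the asking point.
explicit = [("Who", "O"), ("is", "O"), ("the", "O"), ("librettist", "AP"),
            ("of", "O"), ("La", "O"), ("Tilda", "O"), ("?", "O")]

# Implicit focus: no token is tagged, so the EAT class is used instead.
implicit = [("Who", "O"), ("wrote", "O"), ("the", "O"), ("libretto", "O"),
            ("for", "O"), ("La", "O"), ("Tilda", "O"), ("?", "O")]

print(extract_focus(explicit, "HUMAN"))
print(extract_focus(implicit, "HUMAN"))
```

The first call yields the specific focus "librettist"; the second falls back to the coarser class HUMAN.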







                                     Questions?


                                          Thank you



