From TREC to Watson: is open domain question answering a solved problem?
Constantin Orasan
Research Group in Computational Linguistics, University of Wolverhampton, UK
http://www.wlv.ac.uk/~in6093/

Structure of the talk
4 July 2011 Constantin Orasan - KEPT 2011
Brief introduction to QA
Video 1: Where are we now – IBM Watson
The structure of a QA system
Video 2: Watson vs. humans
Overview of Watson
QA from the point of view of users/companies
Conclusions

Information overload
“Getting information off the Internet is like taking a drink from a fire hydrant” Mitchell Kapor

What is question answering?
A way to address the problem of information overload
Question answering aims at identifying the answer to a question posed in natural language in a large collection of documents
The information provided by QA is more focused than that provided by information retrieval
The output can be the exact answer or a text snippet which contains the answer
The field took off as a result of the introduction of the QA track at TREC, and cross-lingual QA as a result of CLEF

Types of QA systems
Open-domain QA systems
+ can potentially answer any question
- very low accuracy (especially in cross-lingual settings)
Canned QA systems
+ very little language processing necessary
- limited to the answers in the database
Closed-domain QA systems
+ very high accuracy
- can require extensive language processing and limited to one domain

Evolution of the QA domain
Early QA systems
  date back as far as the 1960s and were mainly front ends to databases
  had limited usability
Open-domain QA
  emerged as a result of the increasing amount of data available
  to answer a question, a system needs to find and extract the answer
  developed in the late 1990s as a result of the QA track at the Text REtrieval Conferences (TREC)
  emphasis on factoid questions, but other types of questions were also explored
CLEF competitions have encouraged the development of cross-lingual systems

Where are we now?
IBM and the Jeopardy Challenge
Jeopardy! is an American quiz show where participants are given clues and need to guess the question (e.g. if the clue is "The Father of Our Country; he didn't really chop down a cherry tree", the contestant would respond "Who is George Washington?")
Watson is a QA system developed by IBM
http://www.youtube.com/watch?v=FC3IryWr4c8

Structure of an open domain QA system
A typical open domain QA system consists of:
  Question processor
  Document processor
  Answer extractor (and validation)
Can have components for cross-lingual processing
Has access to several external resources

Question processor
Produces an interpretation of the question
Determines the Question Type (e.g. factoid, definition, procedure, etc.)
Determines the Expected Answer Type (EAT)
Produces a query on the basis of the question
Determines syntactic and semantic relations between the words in the question
Expands the query with synonyms
May translate the keywords in the query in the case of cross-lingual QA

Expected answer type calculation
Relies on the existence of an answer type taxonomy
This taxonomy can be made open-domain by linking to general ontologies such as WordNet
The EAT can be determined using rule-based as well as machine learning approaches
  Who is the president of Romania?
  Where is Paris?
Knowledge of the domain can greatly improve the identification of the EAT and help deal with ambiguities

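A rule-based approach to EAT identification can be illustrated with a minimal sketch. The four-entry taxonomy and the regular-expression rules below are assumptions made for illustration; they are not the taxonomy or rules of any system mentioned in the talk, and real systems use far richer taxonomies.

```python
import re

# Hypothetical, deliberately tiny answer-type taxonomy (an assumption for
# this sketch); real taxonomies are much larger and may link to WordNet.
EAT_RULES = [
    (re.compile(r"^who\b", re.IGNORECASE), "PERSON"),
    (re.compile(r"^where\b", re.IGNORECASE), "LOCATION"),
    (re.compile(r"^when\b", re.IGNORECASE), "DATE"),
    (re.compile(r"^how (many|much)\b", re.IGNORECASE), "QUANTITY"),
]


def expected_answer_type(question: str) -> str:
    """Return the type of the first rule that matches, or OTHER if none does."""
    text = question.strip()
    for pattern, answer_type in EAT_RULES:
        if pattern.search(text):
            return answer_type
    return "OTHER"


print(expected_answer_type("Who is the president of Romania?"))
print(expected_answer_type("Where is Paris?"))
```

Rules keyed only on the question word cannot resolve the ambiguities mentioned on the slide (for instance, which Paris is meant); that is where domain knowledge would come in.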
Query formulation
Produces a query from the question
  As a list of keywords
  As a list of phrases
Identifies entities present in the question
Produces variants of the query by introducing morphological, lexical and semantic variations
Domain knowledge is very important for the identification of entities and the generation of valid variations, and vital in cross-lingual scenarios

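Keyword extraction and lexical variation can be sketched as follows. The stopword list and the synonym table are hand-written assumptions for this example; a real system would draw on a lexicon such as WordNet and on a morphological analyser.

```python
import re
from typing import List

# Assumed toy resources, written by hand for this sketch only.
STOPWORDS = {"what", "who", "where", "when", "is", "the", "of", "a", "an",
             "in", "can", "i", "to", "how", "this", "was"}
SYNONYMS = {"movies": ["films"], "invented": ["created"]}


def keywords(question: str) -> List[str]:
    """Lower-case the question, tokenise it and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [t for t in tokens if t not in STOPWORDS]


def query_variants(question: str) -> List[List[str]]:
    """Return the base keyword query plus one variant per synonym substitution."""
    base = keywords(question)
    variants = [base]
    for i, token in enumerate(base):
        for synonym in SYNONYMS.get(token, []):
            variants.append(base[:i] + [synonym] + base[i + 1:])
    return variants


for variant in query_variants("What movies can I see in Wolverhampton this week?"):
    print(variant)
```

Each variant can be sent to the retrieval engine separately; whether a substitution is valid depends heavily on the domain, which is the point the slide makes.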
Document processing
Uses the query produced in the previous step to retrieve paragraphs which may contain the answer
It is largely domain independent as it relies on text retrieval engines
Ranks results, but this is largely independent of the QA task
For limited collections of texts it is possible to enrich the index with various linguistic information which can help further processing
When the domain is known, characteristics of the input files can improve the retrieval (e.g. presence of metadata)

Answer extraction
Uses a variety of techniques to identify the answer to a question
The answer should have the type given by the EAT
Very often relies on previously created patterns (e.g. "When was the telephone invented?" can be answered if there is a sentence that matches the pattern "The telephone was invented in <date>")
Many patterns can express the same answer (e.g. "the telephone, invented in <date>")
Relations identified in the question between the expected answer and entities from the question can be exploited by patterns

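The two surface patterns quoted above can be sketched with regular expressions. In this sketch `<date>` is approximated by a four-digit year, which is an assumption; a real system would use a proper temporal expression recogniser, and the three passages are toy sentences written for the example.

```python
import re
from typing import List

# <date> is approximated by a four-digit year (an assumption of this sketch).
YEAR = r"(?P<date>\b\d{4}\b)"
PATTERNS = [
    re.compile(r"the telephone was invented in " + YEAR, re.IGNORECASE),
    re.compile(r"the telephone, invented in " + YEAR, re.IGNORECASE),
]


def extract_dates(passages: List[str]) -> List[str]:
    """Collect every date captured by any pattern in any passage."""
    answers = []
    for passage in passages:
        for pattern in PATTERNS:
            answers.extend(m.group("date") for m in pattern.finditer(passage))
    return answers


# Toy passages standing in for the output of the document processor.
passages = [
    "The telephone was invented in 1876 by Alexander Graham Bell.",
    "The telephone, invented in 1876, changed communication.",
    "Bell was born in Edinburgh.",
]
print(extract_dates(passages))
```

Because the named group only accepts a year, every extracted string already has the expected answer type DATE; passages that mention the entity without matching a pattern contribute nothing.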
Answer extraction (II)
Potential answers are ranked according to functions which are usually learned from the data
The ranking and validation of answers can be done using external sources such as the Internet
QA for well defined domains can rely on better patterns
The functions learned usually work well only on the type of data used for training

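Ranking candidate answers with a scoring function can be sketched as a weighted sum of features. The two features and their weights here are assumptions set by hand for illustration; as the slide notes, in practice such functions are usually learned from training data.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Assumed, hand-set feature weights; a real system would learn these.
WEIGHTS = {"frequency": 1.0, "type_match": 2.0}


def rank_answers(candidates: List[Tuple[str, bool]]) -> List[Tuple[str, float]]:
    """Rank candidates given as (answer, matches expected answer type) pairs.

    Score = weighted number of times the answer was extracted, plus a bonus
    if any occurrence matched the expected answer type.
    """
    frequency = Counter(answer for answer, _ in candidates)
    type_match: Dict[str, bool] = {}
    for answer, matches in candidates:
        type_match[answer] = type_match.get(answer, False) or matches
    scores = {
        answer: WEIGHTS["frequency"] * count
        + WEIGHTS["type_match"] * float(type_match[answer])
        for answer, count in frequency.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy candidate list, invented for this example.
candidates = [("1876", True), ("1876", True), ("Edinburgh", False), ("1877", True)]
print(rank_answers(candidates))
```

Weights fitted on one kind of data encode that data's regularities, which is one way to read the slide's remark that learned functions work well only on the type of data used for training.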
Open domain QA - evaluation
Great coverage, but low accuracy
For example:
  The EPHYRA QA system in TREC 2007 reports an accuracy of 0.20 for factoid questions (Schlaefer et al. 2007)
  OpenEphyra was used for a cross-lingual Romanian-English QA system and we obtained 0.11 accuracy for factoid questions (Dornescu et al. 2008), the best performing system for all cross-lingual QA tasks in CLEF 2008
The results are not directly comparable (different QA engines, tuned differently, different collections, different tasks)
But does it make sense to do open domain question answering?

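Accuracy figures of this kind are the fraction of questions answered correctly. A minimal sketch, assuming one correct/incorrect judgement per question (the judgements below are toy values invented for illustration, not results from any evaluation):

```python
from typing import List


def accuracy(judgements: List[bool]) -> float:
    """Fraction of questions whose returned answer was judged correct."""
    if not judgements:
        raise ValueError("no judgements supplied")
    return sum(judgements) / len(judgements)


# Toy judgements for five questions, invented for illustration only.
toy_judgements = [True, False, False, False, False]
print(accuracy(toy_judgements))
```

An accuracy of 0.20 therefore means that, on average, one factoid question in five received a correct answer.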
How did Watson perform?
http://www.youtube.com/watch?v=Puhs2LuO3Zc

How was this achieved?
The starting point was the Practical Intelligent Question Answering Technology (PIQUANT), developed by IBM to participate in TREC
It had been under development at IBM for more than 6 years by a team of 4 full time researchers
It was among the top three to five systems in many TRECs
PIQUANT was performing around 0.33 on the TREC data
PIQUANT used a standard architecture for QA

How was this achieved? (II)
Lots of extra work was put into the system: a core team of 20 researchers working for almost 4 years
The PIQUANT system was enriched with a large number of modules for language processing
The processing was heavily parallelised
Lots of components were developed to deal with specific problems (lots of experts)
Watson tries to combine deep and shallow knowledge
It had access to large data sets and very good hardware

Overview of Watson’s structure

Hardware used
Watson is a workload optimized system designed for complex analytics, made possible by integrating massively parallel POWER7 processors and the IBM DeepQA software to answer Jeopardy! questions in under three seconds. Watson is made up of a cluster of ninety IBM Power 750 servers (plus additional I/O, network and cluster controller nodes in 10 racks) with a total of 2880 POWER7 processor cores and 16 Terabytes of RAM. Each Power 750 server uses a 3.5 GHz POWER7 eight core processor, with four threads per core. The POWER7 processor's massively parallel processing capability is an ideal match for Watson's IBM DeepQA software which is embarrassingly parallel (that is a workload that is easily split up into multiple parallel tasks).
According to John Rennie, Watson can process 500 gigabytes, the equivalent of a million books, per second. IBM's master inventor and senior consultant Tony Pearson estimated Watson's hardware cost at about $3 million and with 80 TeraFLOPs would be placed 94th on the Top 500 Supercomputers list.
From: http://en.wikipedia.org/wiki/Watson_(computer)

Speed of answer
In Jeopardy! an answer needs to be provided in 3-5 seconds
In initial experiments with running Watson on a single processor an answer was obtained in about 2 hours
The system was implemented using Apache UIMA Asynchronous Scaleout
Massively parallel architecture
Indexes used to answer the questions had to be pre-processed using Hadoop

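The speed-up these figures imply can be worked out directly. This is a back-of-the-envelope estimate, assuming that "about 2 hours" means roughly 7200 seconds and taking the 3-5 second window as the target; it says nothing about how the time was actually distributed across components.

```latex
\[
S_{\min} = \frac{7200\ \text{s}}{5\ \text{s}} = 1440,
\qquad
S_{\max} = \frac{7200\ \text{s}}{3\ \text{s}} = 2400
\]
```

The parallel system therefore had to be roughly 1,400 to 2,400 times faster than the single-processor run, which is of the same order as the 2880 POWER7 cores reported for the cluster.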
Watson was not only NLP
Betting strategy
http://www.youtube.com/watch?v=vA9aqAd2iso

To sum up, Watson is:
An amazing engineering project
A massive investment
Research in many domains of NLP
A big PR stunt
A way to improve IBM's position in text analytics
But it is not really a technology ready to be deployed
But was it real progress in open-domain QA?

So is open domain QA a solved problem?
Can we really solve open domain QA?
Do we really need open domain QA?
Do we care?

QA from the user perspective
User questions
Are rarely open domain
Can rarely be formulated in one go
Do not always contain answers from only one source
Companies
Have very well defined needs
Have access to previously asked questions
Need very high accuracy
Most of them cannot afford to invest millions of dollars
The QALL-ME project demonstrators in the domain of tourism can answer questions about cinema/movies and accommodation, e.g.
  What movies can I see in Wolverhampton this week?
  How can I get to Novotel Hotel, Wolverhampton?
The questions can be asked in any of the four languages in the consortium
A small scale demonstrator was built for Romanian

QALL-ME framework

The QALL-ME ontology
All the reasoning and processing is done using a domain ontology
The ontology also provides the means of achieving cross-lingual QA
Determines the way data is stored in the database
Ontologies need to be developed for each domain

Part of the tourism ontology

Evaluation of the QALL-ME prototype
For the cinema domain the accuracy ranged between 60% and 85% depending on the language
The system was tested on real questions posed by users, which were completely independent of the ones used to develop the system
The errors were mainly caused by wrongly identified named entities, missing patterns and mistakes of the entailment engine
In a commercial environment this system can be revised every day in order to obtain much higher performance

Closed domain QA for commercial companies
Closed domain QA has a certain appeal with companies
These companies normally have large databases of questions and answers from customers
The domain can be very clearly defined
In some cases the systems needed are actually canned QA systems

Mais conteúdo relacionado

Mais procurados

Tutorial - Recommender systems meet linked open data - ICWE 2016 - Lugano - 0...
Tutorial - Recommender systems meet linked open data - ICWE 2016 - Lugano - 0...Tutorial - Recommender systems meet linked open data - ICWE 2016 - Lugano - 0...
Tutorial - Recommender systems meet linked open data - ICWE 2016 - Lugano - 0...Polytechnic University of Bari
 
Practical machine learning - Part 1
Practical machine learning - Part 1Practical machine learning - Part 1
Practical machine learning - Part 1Traian Rebedea
 
Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...Lifeng (Aaron) Han
 
Chinese Character Decomposition for Neural MT with Multi-Word Expressions
Chinese Character Decomposition for  Neural MT with Multi-Word ExpressionsChinese Character Decomposition for  Neural MT with Multi-Word Expressions
Chinese Character Decomposition for Neural MT with Multi-Word ExpressionsLifeng (Aaron) Han
 
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.Lifeng (Aaron) Han
 
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerAutomatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerFrancesco Osborne
 
Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking  Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking Mohamed BEN ELLEFI
 
EKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
EKAW 2016 - TechMiner: Extracting Technologies from Academic PublicationsEKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
EKAW 2016 - TechMiner: Extracting Technologies from Academic PublicationsFrancesco Osborne
 
Stream Reasoning: Where we got so far. Oxford 2010.1.18
Stream Reasoning: Where we got so far. Oxford 2010.1.18Stream Reasoning: Where we got so far. Oxford 2010.1.18
Stream Reasoning: Where we got so far. Oxford 2010.1.18Emanuele Della Valle
 
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...Julien PLU
 
Dstc6 an introduction
Dstc6 an introductionDstc6 an introduction
Dstc6 an introductionhkh
 
Crash-course in Natural Language Processing
Crash-course in Natural Language ProcessingCrash-course in Natural Language Processing
Crash-course in Natural Language ProcessingVsevolod Dyomkin
 
An Evolution of Deep Learning Models for AI2 Reasoning Challenge
An Evolution of Deep Learning Models for AI2 Reasoning ChallengeAn Evolution of Deep Learning Models for AI2 Reasoning Challenge
An Evolution of Deep Learning Models for AI2 Reasoning ChallengeTraian Rebedea
 
Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...alessio_ferrari
 
WISS QA Do it yourself Question answering over Linked Data
WISS QA Do it yourself Question answering over Linked DataWISS QA Do it yourself Question answering over Linked Data
WISS QA Do it yourself Question answering over Linked DataAndre Freitas
 

Mais procurados (20)

Tutorial - Recommender systems meet linked open data - ICWE 2016 - Lugano - 0...
Tutorial - Recommender systems meet linked open data - ICWE 2016 - Lugano - 0...Tutorial - Recommender systems meet linked open data - ICWE 2016 - Lugano - 0...
Tutorial - Recommender systems meet linked open data - ICWE 2016 - Lugano - 0...
 
Practical machine learning - Part 1
Practical machine learning - Part 1Practical machine learning - Part 1
Practical machine learning - Part 1
 
Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...
 
Chinese Character Decomposition for Neural MT with Multi-Word Expressions
Chinese Character Decomposition for  Neural MT with Multi-Word ExpressionsChinese Character Decomposition for  Neural MT with Multi-Word Expressions
Chinese Character Decomposition for Neural MT with Multi-Word Expressions
 
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.
 
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerAutomatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
 
Aspects of NLP Practice
Aspects of NLP PracticeAspects of NLP Practice
Aspects of NLP Practice
 
Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking  Profile-based Dataset Recommendation for RDF Data Linking
Profile-based Dataset Recommendation for RDF Data Linking
 
NLP Project Full Cycle
NLP Project Full CycleNLP Project Full Cycle
NLP Project Full Cycle
 
Sybrandt Thesis Proposal Presentation
Sybrandt Thesis Proposal PresentationSybrandt Thesis Proposal Presentation
Sybrandt Thesis Proposal Presentation
 
EKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
EKAW 2016 - TechMiner: Extracting Technologies from Academic PublicationsEKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
EKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
 
Arabic question answering ‫‬
Arabic question answering ‫‬Arabic question answering ‫‬
Arabic question answering ‫‬
 
Stream Reasoning: Where we got so far. Oxford 2010.1.18
Stream Reasoning: Where we got so far. Oxford 2010.1.18Stream Reasoning: Where we got so far. Oxford 2010.1.18
Stream Reasoning: Where we got so far. Oxford 2010.1.18
 
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
 
Dstc6 an introduction
Dstc6 an introductionDstc6 an introduction
Dstc6 an introduction
 
Crash-course in Natural Language Processing
Crash-course in Natural Language ProcessingCrash-course in Natural Language Processing
Crash-course in Natural Language Processing
 
An Evolution of Deep Learning Models for AI2 Reasoning Challenge
An Evolution of Deep Learning Models for AI2 Reasoning ChallengeAn Evolution of Deep Learning Models for AI2 Reasoning Challenge
An Evolution of Deep Learning Models for AI2 Reasoning Challenge
 
Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...
 
NLP & DBpedia
 NLP & DBpedia NLP & DBpedia
NLP & DBpedia
 
WISS QA Do it yourself Question answering over Linked Data
WISS QA Do it yourself Question answering over Linked DataWISS QA Do it yourself Question answering over Linked Data
WISS QA Do it yourself Question answering over Linked Data
 

Destaque

Deep Learning Models for Question Answering
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question AnsweringSujit Pal
 
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...Andre Freitas
 
Presentation of Domain Specific Question Answering System Using N-gram Approach.
Presentation of Domain Specific Question Answering System Using N-gram Approach.Presentation of Domain Specific Question Answering System Using N-gram Approach.
Presentation of Domain Specific Question Answering System Using N-gram Approach.Tasnim Ara Islam
 
Instant Question Answering System
Instant Question Answering SystemInstant Question Answering System
Instant Question Answering SystemDhwaj Raj
 
openQA Hoverboard - Open-source Question Answering Framework
openQA Hoverboard - Open-source Question Answering FrameworkopenQA Hoverboard - Open-source Question Answering Framework
openQA Hoverboard - Open-source Question Answering FrameworkEdgard Marx
 
Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributio...
Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributio...Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributio...
Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributio...Andre Freitas
 
Liam Mc Cormick Architecture
Liam Mc Cormick ArchitectureLiam Mc Cormick Architecture
Liam Mc Cormick ArchitectureDamien Wilson
 
Fear Factor with Outsourcing
Fear Factor with OutsourcingFear Factor with Outsourcing
Fear Factor with OutsourcingBenaud Jacob
 
Amazing wax statues in kolhapur, india,
Amazing wax statues in kolhapur,  india,Amazing wax statues in kolhapur,  india,
Amazing wax statues in kolhapur, india,Heena Modi
 
The Path to Social ROI - Facebook Marketing Success Summit 2012
The Path to Social ROI - Facebook Marketing Success Summit 2012The Path to Social ROI - Facebook Marketing Success Summit 2012
The Path to Social ROI - Facebook Marketing Success Summit 2012Chris Treadaway
 
Introduction to Palm's Mojo SDK
Introduction to Palm's Mojo SDKIntroduction to Palm's Mojo SDK
Introduction to Palm's Mojo SDKBrendan Lim
 
Loving hut - dessert time!
Loving hut - dessert time!Loving hut - dessert time!
Loving hut - dessert time!Heena Modi
 
Some highlights from my stay in gambia
Some highlights from my stay in gambiaSome highlights from my stay in gambia
Some highlights from my stay in gambiaHeena Modi
 
24 Tirthankaras
24 Tirthankaras24 Tirthankaras
24 TirthankarasHeena Modi
 
Biggie And Small
Biggie And SmallBiggie And Small
Biggie And SmallHeena Modi
 

Destaque (20)

Deep Learning Models for Question Answering
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question Answering
 
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
 
Presentation of Domain Specific Question Answering System Using N-gram Approach.
Presentation of Domain Specific Question Answering System Using N-gram Approach.Presentation of Domain Specific Question Answering System Using N-gram Approach.
Presentation of Domain Specific Question Answering System Using N-gram Approach.
 
Instant Question Answering System
Instant Question Answering SystemInstant Question Answering System
Instant Question Answering System
 
openQA Hoverboard - Open-source Question Answering Framework
openQA Hoverboard - Open-source Question Answering FrameworkopenQA Hoverboard - Open-source Question Answering Framework
openQA Hoverboard - Open-source Question Answering Framework
 
Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributio...
Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributio...Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributio...
Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributio...
 
Liam Mc Cormick Architecture
Liam Mc Cormick ArchitectureLiam Mc Cormick Architecture
Liam Mc Cormick Architecture
 
Powerpoint
PowerpointPowerpoint
Powerpoint
 
Fear Factor with Outsourcing
Fear Factor with OutsourcingFear Factor with Outsourcing
Fear Factor with Outsourcing
 
Amazing wax statues in kolhapur, india,
Amazing wax statues in kolhapur,  india,Amazing wax statues in kolhapur,  india,
Amazing wax statues in kolhapur, india,
 
The Path to Social ROI - Facebook Marketing Success Summit 2012
The Path to Social ROI - Facebook Marketing Success Summit 2012The Path to Social ROI - Facebook Marketing Success Summit 2012
The Path to Social ROI - Facebook Marketing Success Summit 2012
 
LectureNotes-01-DSA
LectureNotes-01-DSALectureNotes-01-DSA
LectureNotes-01-DSA
 
Introduction to Palm's Mojo SDK
Introduction to Palm's Mojo SDKIntroduction to Palm's Mojo SDK
Introduction to Palm's Mojo SDK
 
Loving hut - dessert time!
Loving hut - dessert time!Loving hut - dessert time!
Loving hut - dessert time!
 
Kansas sights
Kansas sightsKansas sights
Kansas sights
 
Some highlights from my stay in gambia
Some highlights from my stay in gambiaSome highlights from my stay in gambia
Some highlights from my stay in gambia
 
Agency Profile: Dawson Marketing Group
Agency Profile: Dawson Marketing GroupAgency Profile: Dawson Marketing Group
Agency Profile: Dawson Marketing Group
 
24 Tirthankaras
24 Tirthankaras24 Tirthankaras
24 Tirthankaras
 
Biggie And Small
Biggie And SmallBiggie And Small
Biggie And Small
 
Life Matters
Life MattersLife Matters
Life Matters
 

Semelhante a From TREC to Watson: is open domain question answering a solved problem?

Resource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and FederationResource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and FederationPistoia Alliance
 
QUrdPro: Query processing system for Urdu Language
QUrdPro: Query processing system for Urdu LanguageQUrdPro: Query processing system for Urdu Language
QUrdPro: Query processing system for Urdu LanguageIJERA Editor
 
The Nature of Information
The Nature of InformationThe Nature of Information
The Nature of InformationAdrian Paschke
 
Ontology Based Approach for Semantic Information Retrieval System
Ontology Based Approach for Semantic Information Retrieval SystemOntology Based Approach for Semantic Information Retrieval System
Ontology Based Approach for Semantic Information Retrieval SystemIJTET Journal
 
Architecture of an ontology based domain-specific natural language question a...
Architecture of an ontology based domain-specific natural language question a...Architecture of an ontology based domain-specific natural language question a...
Architecture of an ontology based domain-specific natural language question a...IJwest
 
A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...
A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...
A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...Cemal Ardil
 
Semantic data integration proof of concept
Semantic data integration proof of conceptSemantic data integration proof of concept
Semantic data integration proof of conceptNicolas Bertrand
 
The Exploitation of OpenAPI Documents for the Generation of Web Frontends
The Exploitation of OpenAPI Documents for the Generation of Web FrontendsThe Exploitation of OpenAPI Documents for the Generation of Web Frontends
The Exploitation of OpenAPI Documents for the Generation of Web FrontendsIstvanKoren
 
Ontology-based information extraction in the DERI Reading Group
Ontology-based information extraction in the DERI Reading GroupOntology-based information extraction in the DERI Reading Group
Ontology-based information extraction in the DERI Reading GroupTobias Wunner
 
CNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationCNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationJohn Doove
 
NLP Tasks and Applications.ppt useful in
NLP Tasks and Applications.ppt useful inNLP Tasks and Applications.ppt useful in
NLP Tasks and Applications.ppt useful inKumari Naveen
 
lect36-tasks.ppt
lect36-tasks.pptlect36-tasks.ppt
lect36-tasks.pptHaHa501620
 
Question Focus Recognition in Question Answering Systems
Question Focus Recognition in Question  Answering Systems Question Focus Recognition in Question  Answering Systems
Question Focus Recognition in Question Answering Systems Waheeb Ahmed
 
06 making information pay 2011 -- solomon, madi (pearson)
06   making information pay 2011 -- solomon, madi (pearson)06   making information pay 2011 -- solomon, madi (pearson)
06 making information pay 2011 -- solomon, madi (pearson)bisg
 
1st SEALS evaluation campaign results: a worldwide evaluation of semantic tec...
1st SEALS evaluation campaign results: a worldwide evaluation of semantic tec...1st SEALS evaluation campaign results: a worldwide evaluation of semantic tec...
1st SEALS evaluation campaign results: a worldwide evaluation of semantic tec...SEALS - Semantic Evaluation at Large Scale
 
Manchester Seminar Liberate Your Library October 2009
Manchester Seminar   Liberate Your Library   October 2009Manchester Seminar   Liberate Your Library   October 2009
Manchester Seminar Liberate Your Library October 2009Jonathan Field
 
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMS
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMSA BRIEF SURVEY OF QUESTION ANSWERING SYSTEMS
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMSijaia
 
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMS
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMSA BRIEF SURVEY OF QUESTION ANSWERING SYSTEMS
A BRIEF SURVEY OF QUESTION ANSWERING SYSTEMSgerogepatton
 

From TREC to Watson: is open domain question answering a solved problem?

  • 1. Constantin Orasan, Research Group in Computational Linguistics, University of Wolverhampton, UK. http://www.wlv.ac.uk/~in6093/ From TREC to Watson: is open domain question answering a solved problem?
  • 2. Structure of the talk. 4 July 2011 Constantin Orasan - KEPT 2011. Brief introduction to QA; Video 1: Where are we now (IBM Watson); The structure of a QA system; Video 2: Watson vs. humans; Overview of Watson; QA from the point of view of users/companies; Conclusions.
  • 3. Information overload. “Getting information off the Internet is like taking a drink from a fire hydrant” (Mitchell Kapor).
  • 4. What is question answering? A way to address the problem of information overload. Question answering aims at identifying the answer to a question posed in natural language in a large collection of documents. The information provided by QA is more focused than that provided by information retrieval. The output can be the exact answer or a text snippet which contains the answer. The field took off as a result of the introduction of the QA track in TREC, whilst cross-lingual QA took off as a result of CLEF.
  • 5.
  • 6. Evolution of the QA domain. Early QA systems date back as far as the 1960s, were mainly front ends to databases and had limited usability. Open-domain QA emerged as a result of the increasing amount of data available to answer a question and the need to find and extract the answer. It developed in the late 1990s as a result of the QA track at the Text REtrieval Conferences (TREC), with an emphasis on factoid questions, although other types of questions were also explored. CLEF competitions have encouraged the development of cross-lingual systems.
  • 7. Where are we now? IBM and the Jeopardy! Challenge. Jeopardy! is an American quiz show where participants are given clues and need to guess the question (e.g. if the clue is "The Father of Our Country; he didn't really chop down a cherry tree", the contestant would respond "Who is George Washington?"). Watson is a QA system developed by IBM. http://www.youtube.com/watch?v=FC3IryWr4c8
  • 8. Structure of an open domain QA system. A typical open domain QA system consists of a question processor, a document processor and an answer extractor (with validation). It can have components for cross-lingual processing and has access to several external resources.
  • 9. Question processor. Produces an interpretation of the question. Determines the Question Type (e.g. factoid, definition, procedure). Determines the Expected Answer Type (EAT). On the basis of the question it produces a query. Determines syntactic and semantic relations between the words in the question. Expands the query with synonyms. May translate the keywords in the query in the case of cross-lingual QA.
  • 10. Expected answer type calculation. Relies on the existence of an answer type taxonomy. This taxonomy can be made open-domain by linking it to general ontologies such as WordNet. The EAT can be determined using rule-based as well as machine learning approaches. Examples: "Who is the president of Romania?", "Where is Paris?". Knowledge of the domain can greatly improve the identification of the EAT and help deal with ambiguities.
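
The rule-based route to the EAT mentioned on this slide can be sketched in a few lines of Python. This is only an illustration: the four-entry taxonomy and the regular expressions are assumptions made for the example, not the taxonomy of any system discussed in the talk, and a real system would link its types to an ontology such as WordNet.

```python
import re

# Hypothetical, deliberately tiny answer-type taxonomy (an assumption for
# this sketch); each rule maps a question-initial cue to an answer type.
EAT_RULES = [
    (re.compile(r"^who\b", re.IGNORECASE), "PERSON"),
    (re.compile(r"^where\b", re.IGNORECASE), "LOCATION"),
    (re.compile(r"^when\b", re.IGNORECASE), "DATE"),
    (re.compile(r"^how (many|much)\b", re.IGNORECASE), "QUANTITY"),
]


def expected_answer_type(question: str) -> str:
    """Return the type of the first rule that fires, or OTHER if none does."""
    text = question.strip()
    for pattern, answer_type in EAT_RULES:
        if pattern.search(text):
            return answer_type
    return "OTHER"


if __name__ == "__main__":
    for q in ["Who is the president of Romania?", "Where is Paris?"]:
        print(q, "->", expected_answer_type(q))
```

Rules this shallow cannot resolve ambiguity (a "who" question may expect an organisation rather than a person), which is where the domain knowledge mentioned on the slide would come in.
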
  • 11. Query formulation. Produces a query from the question, as a list of keywords or as a list of phrases. Identifies entities present in the question. Produces variants of the query by introducing morphological, lexical and semantic variations. Domain knowledge is very important for the identification of entities and the generation of valid variations, and is vital in cross-lingual scenarios.
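
As a rough illustration of query formulation, the sketch below turns a question into a keyword list and then generates lexical variants from a synonym table. The stopword list and the synonym table are assumptions invented for the example; a real system would draw variations from lexical resources and domain knowledge.

```python
import re
from typing import Dict, List

# Assumed stopword list and synonym table, for illustration only.
STOPWORDS = {"the", "a", "an", "is", "was", "of", "in",
             "when", "who", "where", "what"}
SYNONYMS: Dict[str, List[str]] = {"invented": ["created", "devised"]}


def keywords(question: str) -> List[str]:
    """Lowercase the question, tokenise it and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [t for t in tokens if t not in STOPWORDS]


def query_variants(question: str) -> List[List[str]]:
    """Return the base keyword query plus one variant per known synonym."""
    base = keywords(question)
    variants = [base]
    for i, word in enumerate(base):
        for synonym in SYNONYMS.get(word, []):
            variants.append(base[:i] + [synonym] + base[i + 1:])
    return variants


if __name__ == "__main__":
    for variant in query_variants("When was the telephone invented?"):
        print(variant)
```

Each variant would be sent to the retrieval engine as a separate query, and morphological or semantic variations could be added in the same way.
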
  • 12. Document processing. Uses the query produced in the previous step to retrieve paragraphs which may contain the answer. It is largely domain independent as it relies on text retrieval engines. Ranks results, but this is largely independent of the QA task. For limited collections of texts it is possible to enrich the index with various kinds of linguistic information which can help further processing. When the domain is known, characteristics of the input files can improve the retrieval (e.g. presence of metadata).
  • 13. Answer extraction. Uses a variety of techniques to identify the answer to a question. The answer should have the type given by the EAT. Very often relies on previously created patterns (e.g. "When was the telephone invented?" can be answered if there is a sentence that matches the pattern "The telephone was invented in <date>"). Many patterns can express the same answer (e.g. "the telephone, invented in <date>"). Relations identified in the question between the expected answer and entities from the question can be exploited by patterns.
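
The two surface patterns quoted on this slide can be written as regular expressions. The sketch below is a minimal illustration, assuming that a <date> is simply a four-digit year and using made-up candidate sentences; real pattern sets are much larger and are usually learned or curated per question type.

```python
import re
from typing import List

# Assumption for this sketch: a <date> slot is just a four-digit year.
DATE = r"(?P<date>\d{4})"


def build_patterns(entity: str) -> List["re.Pattern[str]"]:
    """Instantiate the two patterns from the slide for a given entity."""
    e = re.escape(entity)
    return [
        re.compile(rf"{e} was invented in {DATE}", re.IGNORECASE),
        re.compile(rf"{e}, invented in {DATE}", re.IGNORECASE),
    ]


def extract_answers(entity: str, sentences: List[str]) -> List[str]:
    """Collect every <date> filler matched by any pattern in any sentence."""
    patterns = build_patterns(entity)
    answers = []
    for sentence in sentences:
        for pattern in patterns:
            match = pattern.search(sentence)
            if match:
                answers.append(match.group("date"))
    return answers


if __name__ == "__main__":
    candidates = [
        "The telephone was invented in 1876 by Bell.",
        "The telephone, invented in 1876, changed communication.",
        "Bell lived in Boston.",
    ]
    print(extract_answers("the telephone", candidates))
```

Because both patterns return the same filler, the repeated answer can serve as evidence for ranking candidates.
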
  • 14. Answer extraction (II). Potential answers are ranked according to functions which are usually learned from the data. The ranking and validation of answers can be done using external sources such as the Internet. QA for well defined domains can rely on better patterns. The functions learned usually work well only on the type of data used for training.
  • 15. Open domain QA: evaluation. Great coverage, but low accuracy. For example, the EPHYRA QA system in TREC 2007 reports an accuracy of 0.20 for factoid questions (Schlaefer et al. 2007). OpenEphyra was used for a cross-lingual Romanian-English QA system and we obtained 0.11 accuracy for factoid questions (Dornescu et al. 2008), the best performing system for all cross-lingual QA tasks in CLEF 2008. The results are not directly comparable (different QA engines, tuned differently, different collections, different tasks). But does it make sense to do open domain question answering?
  • 16. How did Watson perform? http://www.youtube.com/watch?v=Puhs2LuO3Zc
  • 17. How was this achieved? The starting point was the Practical Intelligent Question Answering Technology (PIQUANT), developed by IBM to participate in TREC. It had been under development at IBM for more than 6 years by a team of 4 full-time researchers. It was one of the top three to five systems in many TRECs. PIQUANT was performing at around 0.33 on the TREC data. PIQUANT used a standard architecture for QA.
  • 18. How was this achieved? (II) Lots of extra work was put into the system: a core team of 20 researchers worked for almost 4 years. The PIQUANT system was enriched with a large number of modules for language processing. The processing was heavily parallelised. Lots of components were developed to deal with specific problems (lots of experts). Watson tries to combine deep and shallow knowledge. It had access to large data sets and very good hardware.
  • 19. Overview of Watson’s structure
  • 20. Hardware used. Watson is a workload optimized system designed for complex analytics, made possible by integrating massively parallel POWER7 processors and the IBM DeepQA software to answer Jeopardy! questions in under three seconds. Watson is made up of a cluster of ninety IBM Power 750 servers (plus additional I/O, network and cluster controller nodes in 10 racks) with a total of 2880 POWER7 processor cores and 16 Terabytes of RAM. Each Power 750 server uses a 3.5 GHz POWER7 eight core processor, with four threads per core. The POWER7 processor's massively parallel processing capability is an ideal match for Watson's IBM DeepQA software, which is embarrassingly parallel (that is, a workload that is easily split up into multiple parallel tasks). According to John Rennie, Watson can process 500 gigabytes, the equivalent of a million books, per second. IBM's master inventor and senior consultant Tony Pearson estimated Watson's hardware cost at about $3 million, and with 80 TeraFLOPs it would be placed 94th on the Top 500 Supercomputers list. From: http://en.wikipedia.org/wiki/Watson_(computer)
  • 21. Speed of answer. In Jeopardy! an answer needs to be provided in 3-5 seconds. In initial experiments with running Watson on a single processor an answer was obtained in about 2 hours. The system was implemented using Apache UIMA Asynchronous Scaleout. Massively parallel architecture. Indexes used to answer the questions had to be pre-processed using Hadoop.
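
A back-of-the-envelope calculation shows how the figures quoted on the slides fit together: about 2 hours on a single processor, a 3-5 second answer window and 2880 cores. The sketch below assumes perfectly linear scaling and treats the "single processor" of the initial experiment as comparable to one core; both are simplifying assumptions, so the result is only an idealised bound.

```python
# Figures taken from the slides; the linear-scaling model is an assumption.
SINGLE_PROCESSOR_SECONDS = 2 * 60 * 60   # "about 2 hours" on one processor
TARGET_WINDOW_SECONDS = (3, 5)           # Jeopardy! answer window
TOTAL_CORES = 2880                       # POWER7 cores in the cluster


def required_speedup(serial_seconds: float, target_seconds: float) -> float:
    """Speedup needed to bring the serial time down to the target time."""
    return serial_seconds / target_seconds


def ideal_parallel_seconds(serial_seconds: float, cores: int) -> float:
    """Run time under perfectly linear scaling (no communication overhead)."""
    return serial_seconds / cores


if __name__ == "__main__":
    for target in TARGET_WINDOW_SECONDS:
        factor = required_speedup(SINGLE_PROCESSOR_SECONDS, target)
        print(f"{target} s target needs a speedup of about {factor:.0f}x")
    ideal = ideal_parallel_seconds(SINGLE_PROCESSOR_SECONDS, TOTAL_CORES)
    print(f"ideal time on {TOTAL_CORES} cores: {ideal} s")
```

Under these assumptions the required speedup (roughly 1440x to 2400x) is of the same order as the number of cores, which is consistent with the workload being described as embarrassingly parallel.
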
  • 22. Watson was not only NLP. Betting strategy: http://www.youtube.com/watch?v=vA9aqAd2iso
  • 23. To sum up, Watson is: an amazing engineering project; a massive investment; research in many domains of NLP; a big PR stunt; a way to improve IBM's position in text analytics. But it is not really a technology ready to be deployed. And was it real progress in open-domain QA?
  • 24. So is open domain QA a solved problem? Can we really solve open domain QA? Do we really need open domain QA? Do we care?
  • 25.
  • 27. Can rarely be formulated in one go
  • 28. Do not always contain answers from only one source
  • 30. Have very well defined needs
  • 31. Have access to previously asked questions
  • 32. Need very high accuracy
  • 33.
  • 34. The QALL-ME project. Demonstrators in the domain of tourism can answer questions about cinema/movies and accommodation, e.g. "What movies can I see in Wolverhampton this week?", "How can I get to Novotel Hotel, Wolverhampton?". The questions can be asked in any of the four languages in the consortium. A small-scale demonstrator was built for Romanian.
  • 35. QALL-ME framework
  • 36. The QALL-ME ontology. All the reasoning and processing is done using a domain ontology. The ontology also provides the means of achieving cross-lingual QA. It determines the way data is stored in the database. Ontologies need to be developed for each domain.
  • 37. Part of the tourism ontology
  • 38. Evaluation of the QALL-ME prototype. For the cinema domain the accuracy ranged from 60% to 85% depending on the language. The system was tested on real questions posed by users, which were completely independent from the ones used to develop the system. The errors were mainly caused by wrongly identified named entities, missing patterns and mistakes of the entailment engine. In a commercial environment this system can be revised every day in order to obtain much higher performance.
  • 39. Closed domain QA for commercial companies. Closed domain QA has a certain appeal for companies. These companies normally have large databases of questions and answers from customers. The domain can be very clearly defined. In some cases the systems needed are actually canned QA systems.
  • 40. Interactive QA. It is easy to ask "Where can I eat paella tonight?", but what about "What mobile phones are smart phones with a camera, have GPS, have touch screen, are produced by HTC or Nokia, but definitely not Samsung, have 1G internal memory …"? Interactivity is needed to deal with ambiguities, wrong answers, user feedback, etc.
  • 41. Interactive QA. A “process where the user is a continual part of the information loop”. At the intersection of Dialogue Systems and Question Answering. Initiates a dialogue with the user in cases where there are too many or too few answers, or there is some ambiguity in the request. Can suggest follow-up questions. Provides a more natural way of locating the information needed. Companies are interested in developing IQA systems as a way of providing customer services.
  • 42. Hypothetical IQA session
      [1] SYS: Hi!
      [2] USER: Can you find me a smart phone with a camera?
      [3] SYS: There are [NUMBER] [big number] of smart phones featuring a camera, would you like to search for some [BRAND]?
      [4] USER: No, which have GPS?
      [5] SYS: [STILL BIG NUMBER] have GPS, are you interested in ones having TOUCH SCREEN? [Suggesting new constraints]
      [6] USER: Yes, it would be great.
      [7] SYS: [NUMBER] of Nokia phones, [NUMBER] HTC phones, [NUMBER] Samsung phones, [NUMBER] of other brands.
      [8] USER: Ok, what about HTC phones?
      [9] SYS: [NUMBER] of them have [feature], [NUMBER] of them have [feature].
      [10] USER: What are the ones with [feature]?
      [11] SYS: Here you are: [RESULTS].
      (26/05/2011, Knowledge acquisition from Wikipedia for IQA)
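
The session above amounts to narrowing a result set one constraint at a time and, while too many results remain, summarising them and asking for another constraint. The Python sketch below illustrates that loop. The catalogue, the phone names and the feature values are all invented for the example (only the brand names come from the slide), and the wording of the system replies is likewise an assumption.

```python
from collections import Counter
from typing import Dict, List

Phone = Dict[str, object]

# Hypothetical catalogue: names and feature values are made up.
PHONES: List[Phone] = [
    {"name": "Phone-1", "brand": "HTC", "camera": True, "gps": True, "touch_screen": True},
    {"name": "Phone-2", "brand": "HTC", "camera": True, "gps": False, "touch_screen": True},
    {"name": "Phone-3", "brand": "Nokia", "camera": True, "gps": True, "touch_screen": True},
    {"name": "Phone-4", "brand": "Samsung", "camera": True, "gps": True, "touch_screen": False},
    {"name": "Phone-5", "brand": "Nokia", "camera": False, "gps": True, "touch_screen": True},
]


def apply_constraint(phones: List[Phone], key: str, value: object) -> List[Phone]:
    """Keep only the phones whose attribute `key` equals `value`."""
    return [p for p in phones if p.get(key) == value]


def respond(phones: List[Phone], max_results: int = 1) -> str:
    """List the results if few remain, otherwise summarise and ask for more."""
    if not phones:
        return "No phones match; would you like to relax a constraint?"
    if len(phones) > max_results:
        by_brand = Counter(p["brand"] for p in phones)
        summary = ", ".join(f"{n} {brand}" for brand, n in sorted(by_brand.items()))
        return f"{len(phones)} phones match ({summary}). Would you like to add a constraint?"
    return "Here you are: " + ", ".join(str(p["name"]) for p in phones)


if __name__ == "__main__":
    results = PHONES
    for key, value in [("camera", True), ("gps", True),
                       ("touch_screen", True), ("brand", "HTC")]:
        results = apply_constraint(results, key, value)
        print(f"USER adds {key}={value!r}\nSYS: {respond(results)}")
```

A real IQA system would also have to interpret the user's free-text turns and choose which constraint to suggest next; the sketch covers only the narrowing and summarising steps.
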
  • 43. Answers from more than one source. For many complex questions the answer needs to be composed from several sources. List questions: "List all the cantons in Switzerland which border Germany". Sentiment questions: "What features do people like in Vista?". This is part of the new trend in “deep QA”. Even though users probably really need such answers, the technology is still at the stage of research projects.
  • 44. To sum up … Some researchers believe that search is dead and “deep QA” is the future. This belief was largely fuelled by IBM’s Watson winning Jeopardy! Watson is a fantastic QA system, but it does not solve the problem of open domain QA. For real applications we still want to focus on very well defined domains. We still want to have the user in the loop to facilitate asking questions. Watson may have revived the interest in QA.
  • 45. Watson is not always right, but it kind of knows this … http://www.youtube.com/watch?v=7h4baBEi0iA
  • 46. Thank you for your attention