WLV: a question generation system
    for QGSTEC 2010 task B


        Andrea Varga and Le An Ha

    1
        Research Group in Computational Linguistics
              University of Wolverhampton



    18 June 2010 / QGSTEC 2010
Outline


Task B: Question generation from a single sentence


Our previous experience in question generation


Our method used to solve task B


Evaluation results on development data set


Conclusions
Task B: Question generation from a single sentence:

       Input:
           a sentence from Wikipedia, OpenLearn, Yahoo!Answers or similar data
           sources
           a specific target question type (which, what, who, when, where, why, how
           many/long, yes/no)
       Output:
           2 questions generated per question type
       Example:
       <instance id="4">
       <source>K100_2</source>
       <text>In 1996, the trust employed over 7,000 staff and managed another six
       sites in Leeds and the surrounding area.</text>
       <question type="where">Where did the trust employ over 7,000 staff and
       manage another six sites?</question>
       <question type="where" />
       <question type="when">When did the trust employ over 7,000 staff and manage
       another six sites in Leeds and the surrounding area?</question>
       <question type="when" />
       <question type="how many">In 1996, the trust employed how many staff and
       managed another six sites in Leeds and the surrounding area?</question>
       <question type="how many">In 1996, the trust employed 7,000 staff and
       managed how many sites in Leeds and the surrounding area?</question>
       </instance>
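The instance format above can be read with a standard XML parser. A minimal sketch (the `id` attribute name is assumed from the example; element names are taken verbatim from it):

```python
import xml.etree.ElementTree as ET

# One task B instance in the format shown above, abbreviated.
instance_xml = """<instance id="4">
<source>K100_2</source>
<text>In 1996, the trust employed over 7,000 staff and managed another six
sites in Leeds and the surrounding area.</text>
<question type="when">When did the trust employ over 7,000 staff and manage
another six sites in Leeds and the surrounding area?</question>
</instance>"""

root = ET.fromstring(instance_xml)
source = root.findtext("source")                       # "K100_2"
text = root.findtext("text")                           # the input sentence
questions = [(q.get("type"), q.text)
             for q in root.findall("question")]        # (type, text) pairs
```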
Our previous experience in question generation:
Initial multiple-choice question (MCQ) generation system


          Our previous work:
               Mitkov and Ha (2003)
               Mitkov et al. (2006)


          Input: instructive text (textbook chapters and encyclopaedia entries)

          Performed tasks:
               term extraction
               -noun phrases satisfying the [AN]+N or [AN]*NP[AN]*N regular expression
               question generation
                     sentence filtering constraints
                     -the terms occur in the main clauses or subordinate clauses
                     -the sentence has coordinate structure
                     -the sentence contains negations

               distractor selection
          Resources: Corpora and ontologies (WordNet)
          Question types: which, how many
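The [AN]+N term-extraction pattern above can be sketched as a regular expression over a string of coarse part-of-speech tags (A = adjective, N = noun). The tagged input below is hand-made for illustration, not the system's actual tagger output:

```python
import re

# A term is one or more adjectives/nouns followed by a head noun: [AN]+N.
tagged = [("computational", "A"), ("linguistics", "N"),
          ("is", "V"), ("a", "D"), ("research", "N"), ("field", "N")]

tags = "".join(t for _, t in tagged)   # "ANVDNN"
terms = [" ".join(w for w, _ in tagged[m.start():m.end()])
         for m in re.finditer(r"[AN]+N", tags)]
# terms -> ["computational linguistics", "research field"]
```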
Our method used to solve task B:
Modified question generation system



          Input: single sentence

          Performed tasks:
              identification of key phrases
              -noun phrases satisfying the [AN]+N or [AN]*NP[AN]*N regular expression
              - prepositional phrases
              - adverbial phrases

              assignment of semantic types
              - a named entity recognition (NER) module assigns a semantic type to
              the head of each phrase: location; person; time; number; other

              identification of question type

              question generation
              - we added a few more syntactic rules for the missing question types
              - we removed several constraints
          Question types: which, what, who, when, where, why, how many
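The semantic-type assignment step can be sketched as follows. The dictionary stands in for the NER module, and treating the last token as the head is a simplification; both are hypothetical, only the five type labels come from the slide:

```python
# Toy stand-in for the NER module: head word -> semantic type.
TOY_NER = {"leeds": "location", "varga": "person",
           "1996": "time", "7,000": "number"}

def semantic_type(phrase: str) -> str:
    head = phrase.split()[-1].lower()   # naive head = last token
    return TOY_NER.get(head, "other")

types = [semantic_type(p) for p in ["In 1996", "the trust", "Leeds"]]
# types -> ["time", "other", "location"]
```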
Our method used to solve task B:
Question generation: "WhichH VO"; "WhichH do-support SV" (1)




         Input: source clauses that:
              are finite
              contain at least one key phrase
              have subject-verb-object (SVO) or SV structure
         Which and What questions
         -key phrases: all the NPs
              S(key phrase)VO => "WhichH VO" where WhichH is replaced by:
              -"Which" + head of NP (in case of multi-word phrase)
              -"Which" + hypernym of the word from WordNet (in case of single-word
              phrase)

              S(key phrase)VO => "What VO"

              SVO(key phrase) => "WhichH do-support SV"

              SVO(key phrase) => "What do-support SV"
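The "WhichH VO" rule for a subject key phrase can be sketched as below. Clause splitting and head identification are reduced to hand-made inputs and a last-token heuristic; the function name is hypothetical:

```python
def which_question(subject_np: str, rest_of_clause: str) -> str:
    # S(key phrase)VO => "WhichH VO": replace the subject NP with
    # "Which" + its head noun (multi-word phrase case from the slide).
    head = subject_np.split()[-1]        # naive head of the NP
    return f"Which {head.lower()} {rest_of_clause.rstrip('.')}?"

q = which_question("The trust", "employed over 7,000 staff.")
# q -> "Which trust employed over 7,000 staff?"
```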
Our method used to solve task B:
Question generation: "WhichH VO"; "WhichH do-support SV" (2)


         Who, Whose and Whom
         -key phrases: NPs recognised as person names
              for NP in subject position S(key phrase)VO => "Who VO"

              for NP in possessive structure S(key phrase)VO => "Whose VO"

              for NP in any other position S(key phrase)VO => "Whom VO"
         When and Where
         -key phrases for when questions: NPs, PPs and AdvPs (forming the
         extent of a temporal expression)
         -key phrases for where questions: NPs, PPs (the head of the phrases
         recognised as location)
              S(key phrase)VO => When VO

              S(key phrase)VO => Where VO

              SVO(key phrase) => When do-support SV

              SVO(key phrase) => Where do-support SV

              subclauses containing the answer are ignored
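The do-support transformation for a key phrase in object position, e.g. SVO(key phrase) => "Where do-support S V", can be sketched as follows. Tense handling is reduced to a hand-made past-tense lemma lookup; everything here is illustrative, not the system's actual morphology component:

```python
# Hypothetical lemma table standing in for morphological analysis.
PAST_LEMMA = {"employed": "employ", "managed": "manage"}

def where_do_support(subject_np: str, past_verb: str) -> str:
    # SVO(key phrase) => "Where did S V-lemma?"
    lemma = PAST_LEMMA[past_verb]
    return f"Where did {subject_np.lower()} {lemma}?"

q = where_do_support("The trust", "manage d".replace(" ", ""))
```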
Our method used to solve task B:
Question generation: "WhichH VO"; "WhichH do-support SV" (3)




         Why
         -key phrases: NPs
              Why do-support VO; ignoring the subclause containing the answer
         How many
         -key phrases: NPs containing numeric expressions
              S(key phrase)VO => "How many H VO"

              SVO(key phrase) => "How many H do-support SV"

              S(key phrase)VO => "How many percent VO"

              SVO(key phrase) => "How many percent do-support SV"
Evaluation results on development data set:
Manual evaluation results


          115 questions were generated out of 180 because
               we have not built a model to generate yes/no questions
               the transformational rules cannot handle overly complex sentences
               some of the sentences were incorrectly parsed
               the system failed to identify any source clause for some sentences


          kappa agreement on Relevance was 0.21
          kappa agreement on Syntactic Correctness and Fluency was 0.22

                                                 Human One         Human Two
                Relevance(180 questions)            2.45             2.85
                Relevance(115 questions)            1.57             2.20
                Syntactic(180 questions)            2.85             3.10
                Syntactic(115 questions)            2.20             2.64
            Table: average Relevance and Syntactic Correctness and Fluency values
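The kappa figures above are inter-annotator agreement scores. Cohen's kappa can be sketched as below, on hypothetical judge labels rather than the actual study data:

```python
from collections import Counter

def cohens_kappa(a, b):
    # kappa = (p_observed - p_expected) / (1 - p_expected)
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n            # observed agreement
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[k] * cb[k] for k in ca) / (n * n)         # chance agreement
    return (po - pe) / (1 - pe)

# Hypothetical ratings from two judges on six questions.
k = cohens_kappa([1, 2, 2, 3, 1, 2], [1, 2, 3, 3, 2, 2])
```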
Conclusions:




       we presented our question generation system used to generate
       questions from a single sentence:

           115 questions were generated out of the target 180 questions

           for the different question types: which, what, who, when, where, how many

           the generated questions do not score well on either the relevance or the
           syntactic correctness measure

           the agreement between two human judges is quite low
