This paper presents the question generation system used by the University of Wolverhampton in QGSTEC 2010 task B. We have modified our multiple-choice question (MCQ) generation system in order to generate the new types of questions requested by this task. We have also removed several constraints from our original system in order to generate more questions. In this paper, we describe our approach and report manual evaluation results on the development data.
1. WLV: a question generation system
for QGSTEC 2010 task B
Andrea Varga and Le An Ha
Research Group in Computational Linguistics
University of Wolverhampton
18 June 2010 / QGSTEC 2010
2. Outline
Task B: Question generation from a single sentence
Our previous experience in question generation
Our method used to solve task B
Evaluation results on development data set
Conclusions
3. Task B: Question generation from a single sentence:
Input:
a sentence from Wikipedia, OpenLearn, Yahoo!Answers, or similar data sources
a specific target question type (which, what, who, when, where, why, how
many/long, yes/no)
Output:
2 questions generated per question type
Example:
<instance ="4">
<source>K100_2</source>
<text>In 1996, the trust employed over 7,000 staff and managed another six
sites in Leeds and the surrounding area.</text>
<question type="where">Where did the trust employ over 7,000 staff and
manage another six sites?</question>
<question type="where" />
<question type="when">When did the trust employ over 7,000 staff and manage
another six sites in Leeds and the surrounding area?</question>
<question type="when" />
<question type="how many">In 1996, the trust employed how many staff and
managed another six sites in Leeds and the surrounding area?</question>
<question type="how many">In 1996, the trust employed 7,000 staff and
managed how many sites in Leeds and the surrounding area?</question>
</instance>
4. Our previous experience in question generation:
Initial multiple-choice question (MCQ) generation system
Our previous work:
Mitkov and Ha (2003)
Mitkov et al. (2006)
Input: instructive text (textbook chapters and encyclopaedia entries)
Performed tasks:
term extraction
- noun phrases satisfying the [AN]+N or [AN]*NP[AN]*N regular expression (a pattern-matching sketch follows this slide)
question generation
sentence filtering constraints
- the terms occur in the main clause or in subordinate clauses
- the sentence has coordinate structure
- the sentence contains negations
distractor selection
Resources: Corpora and ontologies (WordNet)
Question types: which, how many
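The term-extraction pattern above can be illustrated with a short sketch. This is a minimal, illustrative implementation that assumes NLTK's tokenizer, POS tagger and RegexpParser as stand-ins for the tools actually used by the system; the chunk grammar only approximates the [AN]+N and [AN]*NP[AN]*N expressions over Penn Treebank tags.

# Sketch of the [AN]+N and [AN]*NP[AN]*N term-extraction patterns
# (A = adjective, N = noun, P = preposition), with NLTK as a stand-in
# for the tagger and chunker actually used by the system.
# Requires: nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
import nltk

# Chunk grammar approximating the two patterns over Penn Treebank tags:
#   TERM2: [AN]* N P [AN]* N    TERM1: [AN]+ N
grammar = r"""
  TERM2: {<JJ.*|NN.*>*<NN.*><IN><JJ.*|NN.*>*<NN.*>}
  TERM1: {<JJ.*|NN.*>+<NN.*>}
"""
chunker = nltk.RegexpParser(grammar)

def extract_terms(sentence):
    """Return candidate terms (noun phrases) matching the patterns."""
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    tree = chunker.parse(tagged)
    return [" ".join(token for token, tag in subtree.leaves())
            for subtree in tree.subtrees()
            if subtree.label() in ("TERM1", "TERM2")]

print(extract_terms("The multiple-choice question generation system uses term extraction."))
# e.g. ['multiple-choice question generation system', 'term extraction']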
5. Our method used to solve task B:
Modified question generation system
Input: single sentence
Performed tasks:
identification of key phrases
- noun phrases satisfying the [AN]+N or [AN]*NP[AN]*N regular expression
- prepositional phrases
- adverbial phrases
assignment of semantic types
- a named entity recognition (NER) module assigns a semantic type to the head of each phrase: location, person, time, number, or other (a mapping sketch follows this slide)
identification of question type
question generation
- we added a few more syntactic rules for the missing question types
- we removed several constraints
Question types: which, what, who, when, where, why, how many
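A minimal sketch of the semantic-type assignment step above, using spaCy's named entity recogniser as a stand-in for the NER module; the label-to-type mapping is an assumption, since the slides do not describe the module in detail.

# Sketch of assigning one of five semantic types (location, person, time,
# number, other) to a key phrase, with spaCy NER as a stand-in for the
# system's NER module.  Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

# Hypothetical mapping from spaCy entity labels to the five semantic types.
LABEL_TO_TYPE = {
    "GPE": "location", "LOC": "location", "FAC": "location",
    "PERSON": "person",
    "DATE": "time", "TIME": "time",
    "CARDINAL": "number", "QUANTITY": "number",
    "PERCENT": "number", "MONEY": "number",
}

def semantic_type(phrase):
    """Assign a semantic type based on the named entities found in the phrase."""
    doc = nlp(phrase)
    for ent in reversed(doc.ents):  # prefer the rightmost entity (head of the phrase)
        if ent.label_ in LABEL_TO_TYPE:
            return LABEL_TO_TYPE[ent.label_]
    return "other"

print(semantic_type("in Leeds"))          # location
print(semantic_type("in 1996"))           # time
print(semantic_type("over 7,000 staff"))  # number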
6. Our method used to solve task B:
Question generation: "WhichH VO"; "WhichH do-support SV" (1)
Input: source clauses that:
are finite
contain at least one key phrase
have subject-verb-object (SVO) or SV structure
Which and What questions
- key phrases: all the NPs (a transformation sketch follows this slide)
S(key phrase)VO => "WhichH VO" where WhichH is replaced by:
-"Which" + head of NP (in case of multi-word phrase)
-"Which" + hypernym of the word from WordNet (in case of single-word
phrase)
S(key phrase)VO => "What VO"
SVO(key phrase) => "WhichH do-support SV"
SVO(key phrase) => "What do-support SV"
7. Our method used to solve task B:
Question generation: "WhichH VO"; "WhichH do-support SV" (2)
Who, Whose and Whom
- key phrases: NPs recognised as person names
for NP in subject position S(key phrase)VO => "Who VO"
for NP in possessive structure S(key phrase)VO => "Whose VO"
for NP in any other position S(key phrase)VO => "Whom VO"
When and Where
- key phrases for when questions: NPs, PPs and AdvPs whose extent is a temporal expression
- key phrases for where questions: NPs and PPs whose head is recognised as a location (a question-word selection sketch follows this slide)
S(key phrase)VO => When VO
S(key phrase)VO => Where VO
SVO(key phrase) => When do-support SV
SVO(key phrase) => Where do-support SV
subclauses containing the answer are ignored
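The question-word choices on this and the previous slide can be summarised as a lookup over the key phrase's semantic type and its syntactic role; the sketch below is illustrative and the exact decision logic of the system is an assumption.

# Sketch of choosing the question word from the key phrase's semantic type
# (assigned by the NER step) and its syntactic role in the source clause.
def question_word(semantic_type, role):
    """role: 'subject', 'possessive' or 'other' (e.g. object, PP complement)."""
    if semantic_type == "person":
        if role == "subject":
            return "Who"
        if role == "possessive":
            return "Whose"
        return "Whom"
    if semantic_type == "time":
        return "When"
    if semantic_type == "location":
        return "Where"
    if semantic_type == "number":
        return "How many"
    return "Which"  # other NPs yield which/what questions

print(question_word("person", "subject"))   # Who
print(question_word("time", "other"))       # When
print(question_word("location", "other"))   # Where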
8. Our method used to solve task B:
Question generation: "WhichH VO"; "WhichH do-support SV" (3)
Why
- key phrases: NPs
Why do-support VO; the subclause containing the answer is ignored
How many
- key phrases: NPs containing numeric expressions (a substitution sketch follows this slide)
S(key phrase)VO => "How many H VO"
SVO(key phrase) => "How many H do-support SV"
S(key phrase)VO => "How many percent VO"
SVO(key phrase) => "How many percent do-support SV"
9. Evaluation results on development data set:
Manual evaluation results
115 questions were generated out of the expected 180 because:
we have not built a model to generate yes/no questions
the transformational rules cannot handle sentences that are too complicated
some of the sentences were incorrectly parsed
the system failed to identify any source clause for some sentences
kappa agreement on Relevance was 0.21
kappa agreement on Syntactic Correctness and Fluency was 0.22 (a computation sketch follows the table)
                            Human One   Human Two
Relevance (180 questions)      2.45        2.85
Relevance (115 questions)      1.57        2.20
Syntactic (180 questions)      2.85        3.10
Syntactic (115 questions)      2.20        2.64
Table: average Relevance and Syntactic Correctness and Fluency values
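The agreement figures above can be obtained with Cohen's kappa over the two judges' per-question scores; this is a minimal sketch using scikit-learn's cohen_kappa_score, and both the library choice and the example scores are assumptions, not the real annotations.

# Sketch of computing inter-annotator agreement (Cohen's kappa) between the
# two human judges; the per-question scores below are illustrative only.
from sklearn.metrics import cohen_kappa_score

judge_one = [1, 2, 2, 3, 1, 4, 2, 3]   # hypothetical scores from Human One
judge_two = [1, 3, 2, 3, 2, 4, 1, 3]   # hypothetical scores from Human Two

print(cohen_kappa_score(judge_one, judge_two))
# The slides report kappa = 0.21 for Relevance and 0.22 for
# Syntactic Correctness and Fluency on the development questions.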
10. Conclusions:
we presented our question generation system used to generate
questions from a single sentence:
115 questions were generated out of the target 180 questions
for the different question types: which, what, who, when, where, how many
the generated questions do not score well on either the Relevance or the Syntactic Correctness and Fluency measure
the agreement between the two human judges is quite low