Semantic Analysis in Language Technology
http://stp.lingfil.uu.se/~santinim/sais/2016/sais_2016.htm

Question Answering

Marina Santini
santinim@stp.lingfil.uu.se

Department of Linguistics and Philology
Uppsala University, Uppsala, Sweden

Spring 2016
Previous Lecture: IE – Named Entity Recognition (NER)
•  A very important sub-task: find and classify names in text, for example:
•  The decision by the independent MP Andrew Wilkie to withdraw his support for the minority Labor government sounded dramatic but it should not further threaten its stability. When, after the 2010 election, Wilkie, Rob Oakeshott, Tony Windsor and the Greens agreed to support Labor, they gave just two guarantees: confidence and supply.
Named Entity Recognition (NER)
•  Person
•  Date
•  Location
•  Organization
•  Etc.
NER pipeline
[Figure: representative documents → human annotation → annotated documents → feature extraction → training data → sequence classifiers → NER system]
Encoding classes for sequence labeling

               IO encoding   IOB encoding
   Fred        PER           B-PER
   showed      O             O
   Sue         PER           B-PER
   Mengqiu     PER           B-PER
   Huang       PER           I-PER
   's          O             O
   new         O             O
   painting    O             O
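The table above can be checked with a few lines of code: given entity spans, emit the IOB tags. This is a minimal sketch (the function and span representation are illustrative, not from the lecture); the tokens and spans are the slide's example.

```python
def spans_to_iob(tokens, spans):
    """Convert entity spans [(start, end, type)], end exclusive, to IOB tags:
    B- marks the first token of an entity, I- the rest, O everything else."""
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        tags[start] = "B-" + etype
        for i in range(start + 1, end):
            tags[i] = "I-" + etype
    return tags

tokens = ["Fred", "showed", "Sue", "Mengqiu", "Huang", "'s", "new", "painting"]
spans = [(0, 1, "PER"), (2, 3, "PER"), (3, 5, "PER")]
print(spans_to_iob(tokens, spans))
# → ['B-PER', 'O', 'B-PER', 'B-PER', 'I-PER', 'O', 'O', 'O']
```

Note that IOB keeps the adjacent entities "Sue" and "Mengqiu Huang" distinct, which plain IO encoding cannot.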
  
Features for sequence labeling
•  Words
   •  Current word (essentially like a learned dictionary)
   •  Previous/next word (context)
•  Other kinds of inferred linguistic classification
   •  Part-of-speech tags
•  Other features
   •  Word shapes
   •  etc.
Features: Word shapes
•  Word shapes
•  Map words to a simplified representation that encodes attributes such as length, capitalization, numerals, Greek letters, internal punctuation, etc.

      Varicella-zoster   Xx-xxx
      mRNA               xXXX
      CPA1               XXXd

•  Varicella zoster is a virus
•  Messenger RNA (mRNA) is a large family of RNA molecules
•  CPA1 (Carboxypeptidase A1 (Pancreatic)) is a Protein Coding gene.
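The basic character-class mapping behind word shapes can be sketched in a few lines. This is an assumption about one common convention: real systems additionally collapse long runs of the same symbol (e.g. Varicella-zoster → Xx-xxx); that step is omitted here.

```python
import re

def word_shape(word):
    """Map each character class to a symbol: uppercase -> X, lowercase -> x,
    digit -> d; other characters (hyphens, etc.) are kept as-is.
    Run-collapsing, used by many NER systems for long words, is left out."""
    shape = re.sub(r"[A-Z]", "X", word)
    shape = re.sub(r"[a-z]", "x", shape)
    return re.sub(r"[0-9]", "d", shape)

print(word_shape("mRNA"))  # → xXXX
print(word_shape("CPA1"))  # → XXXd
```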
  
Inspiration figure
Task: Develop a set of regular expressions to recognize the character shape features.
•  Possible set of REs matching the inspiration figure (syntax depends on the programming language):
No need to remember things by heart: once you know what you have to do, find the correct syntax on the web!

The gold standard corpus
There are always many solutions to a research question! You had to make your choice… Basic steps:
1.  Analyse the data (you must know your data well!!!);
2.  Get an idea of the patterns
3.  Choose the way to go…
4.  Report your results
Proposed solutions
•  (Xx*)* regardless of the NE type
•  Complex patterns that could identify approx. 900 lines out of 1316 entities (regardless of NE type)
•  etc…
Some alternatives: create patterns per NE type… (divide and conquer approach :-) )
Ex: person names (283): most person names have the shape (Xx*){2} (presumably you would get high accuracy)
Miles Sindercombe          p:person
Armand de Pontmartin       p:person
Alicia Gorey               p:person
Kim Crosby (singer)        p:person
Edmond Roudnitska          p:person
Shobha Gurtu               p:person
Bert Greene                p:person
Danica McKellar            p:person
Sheila O'Brien             p:person
Martin Day                 p:person
Clive Matthew-Wilson       p:person
Venugopal Dhoot            p:person
Clifford Berry             p:person
Munir Malik                p:person
Mary Sears                 p:person
Charles Wayne "Chuck" Day  p:person
Michael Formanek           p:person
Felix Carlebach            p:person
Alexander Keith, Jr.       p:person
Omer Vanaudenhove          p:person
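The shape (Xx*){2} can be tried out as an ordinary regex over two capitalized words. This is a hedged sketch (the exact regex is an assumption); it shows which of the names above the pattern would and would not cover.

```python
import re

# (Xx*){2} written as a regex: two capital-initial, lowercase-tail
# words separated by a single space.
person_shape = re.compile(r"^[A-Z][a-z]*\s[A-Z][a-z]*$")

names = ["Miles Sindercombe", "Armand de Pontmartin",
         "Clive Matthew-Wilson", "Alicia Gorey"]
print([n for n in names if person_shape.match(n)])
# → ['Miles Sindercombe', 'Alicia Gorey']
```

Three-token names ("Armand de Pontmartin") and hyphenated surnames ("Matthew-Wilson") escape the pattern, which is why coverage must be checked against the corpus before trusting the "high accuracy" guess.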
  
What's the mathematical formalism underlying REs?

DFA
  
Converting the regular expression (a|b)* to a DFA

Converting the regular expression (a*|b*)* to a DFA

Converting the regular expression ab(a|b)* to a DFA
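The result of the last conversion can be sketched as a transition table plus a simulation loop. The state names q0/q1/q2 are illustrative (the slides show the diagrams, not these labels).

```python
# DFA for ab(a|b)*: must read 'a' then 'b', then loop on either symbol.
TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q1", "b"): "q2",
    ("q2", "a"): "q2",
    ("q2", "b"): "q2",
}
START, ACCEPTING = "q0", {"q2"}

def accepts(string):
    state = START
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # missing transition = implicit dead state
            return False
    return state in ACCEPTING

print([accepts(s) for s in ["ab", "abba", "a", "ba"]])
# → [True, True, False, False]
```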
  
Chomsky hierarchy
•  Regular expressions help solve problems that are tractable by "regular grammars".
For example, it is not possible to write an FSM (and consequently regular expressions) that generates the language aⁿbⁿ, i.e. the set of all strings which consist of a (possibly empty) block of as followed by a (possibly empty) block of bs of exactly the same length.
Areas where finite state methods have been shown to be particularly useful in NLP are phonological and morphological processing.

In our case, we must explore and experiment with the NE corpus and see if there are sequences that cannot be captured by a regular language.
For some problems,
•  … the expressive power of REs is exactly what is needed
•  For some other problems, the expressive power of REs is too weak…
•  Additionally, since REs are basically hand-written rules, it is easy to get entangled with rules… at one point you do not know any more how the rules interact with each other… so results might be unpredictable :-)
End of previous lecture
Question Answering

What is Question Answering?
Acknowledgements
Most slides borrowed or adapted from:
Dan Jurafsky and Christopher Manning, Coursera
Dan Jurafsky and James H. Martin (2015)

J&M (2015, draft): https://web.stanford.edu/~jurafsky/slp3/
Question Answering

Question: What do worms eat? (worms – eat – what)
Potential answers, with their dependency structures:
•  Worms eat grass (worms – eat – grass)
•  Grass is eaten by worms (worms – eat – grass)
•  Birds eat worms (birds – eat – worms)
•  Horses with worms eat grass (horses – eat – grass, with – worms)

One of the oldest NLP tasks (punched card systems in 1961).
Simmons, Klein, McConlogue. 1964. Indexing and Dependency Logic for Answering English Questions. American Documentation 15:30, 196-204
Question Answering: IBM's Watson
•  Won Jeopardy on February 16, 2011!
•  IBM's Watson is a Question Answering system.
•  What is Jeopardy?
Jeopardy!
•  Jeopardy! is an American television quiz competition in which contestants are presented with general knowledge clues in the form of answers, and must phrase their responses in the form of questions.
•  The original daytime version debuted on NBC on March 30, 1964.
Watson's performance
•  With the answer: "You just need a nap. You don't have this sleep disorder that can make sufferers nod off while standing up," Watson replied, "What is narcolepsy?"
Question Answering: IBM's Watson
•  The winning reply!

WILLIAM WILKINSON'S "AN ACCOUNT OF THE PRINCIPALITIES OF WALLACHIA AND MOLDOVIA" INSPIRED THIS AUTHOR'S MOST FAMOUS NOVEL

Bram Stoker
Apple's Siri

Wolfram Alpha
Types of Questions in Modern Systems
•  Factoid questions
   •  Who wrote "The Universal Declaration of Human Rights"?
   •  How many calories are there in two slices of apple pie?
   •  What is the average age of the onset of autism?
   •  Where is Apple Computer based?
•  Complex (narrative) questions:
   •  In children with an acute febrile illness, what is the efficacy of acetaminophen in reducing fever?
   •  What do scholars think about Jefferson's position on dealing with pirates?
Commercial systems: mainly factoid questions

Where is the Louvre Museum located?                     In Paris, France
What's the abbreviation for limited partnership?        L.P.
What are the names of Odin's ravens?                    Huginn and Muninn
What currency is used in China?                         The yuan
What kind of nuts are used in marzipan?                 almonds
What instrument does Max Roach play?                    drums
What is the telephone number for Stanford University?   650-723-2300
Paradigms for QA
•  IR-based approaches
   •  TREC; IBM Watson; Google
•  Knowledge-based approaches
   •  Apple Siri; Wolfram Alpha
•  Hybrid approaches
   •  IBM Watson; True Knowledge Evi
Many questions can already be answered by web search
[screenshot]

IR-based Question Answering
[screenshot]
Things change all the time…. :-)
•  Google was a pure IR-based QA, but in 2012 Knowledge Graph was added to Google's search engine.
•  The Knowledge Graph is a knowledge base used by Google to enhance its search engine's search results with semantic-search information gathered from a wide variety of sources.
•  Wikipedia: The goal of KGraph is that users would be able to use this information to resolve their query without having to navigate to other sites and assemble the information themselves. [...] According to some news websites, the implementation of Google's Knowledge Graph has played a role in the page view decline of various language versions of Wikipedia.
IR-based Factoid QA
[Architecture diagram: Question → Question Processing (Query Formulation, Answer Type Detection) → Document Retrieval over an indexed document collection → Relevant Docs → Passage Retrieval → answer passages → Answer Processing → Answer]
IR-based Factoid QA
•  QUESTION PROCESSING
   •  Detect question type, answer type, focus, relations
   •  Formulate queries to send to a search engine
•  PASSAGE RETRIEVAL
   •  Retrieve ranked documents
   •  Break into suitable passages and rerank
•  ANSWER PROCESSING
   •  Extract candidate answers
   •  Rank candidates
      •  using evidence from the text and external sources
Knowledge-based approaches (Siri)
•  Build a semantic representation of the query
   •  Times, dates, locations, entities, numeric quantities
•  Map from this semantics to query structured data or resources
   •  Geospatial databases
   •  Ontologies (Wikipedia infoboxes, dbPedia, WordNet, Yago)
   •  Restaurant review sources and reservation services
   •  Scientific databases
SIRI's main tasks, at a high level, involve:
•  Using ASR (Automatic Speech Recognition) to transcribe human speech (in this case, short utterances of commands, questions, or dictations) into text.
•  Using natural language processing (part of speech tagging, noun-phrase chunking, dependency & constituent parsing) to translate transcribed text into "parsed text".
•  Using question & intent analysis to analyze parsed text, detecting user commands and actions. ("Schedule a meeting", "Set my alarm", ...)
•  Using data technologies to interface with 3rd-party web services such as OpenTable and WolframAlpha, to perform actions, search operations, and question answering.
•  Forwarding utterances SIRI has identified as questions that it cannot directly answer to more general question-answering services such as WolframAlpha.
•  Transforming the output of 3rd-party web services back into natural language text (e.g., today's weather report -> "The weather will be sunny")
•  Using TTS (text-to-speech) technologies to transform the natural language text from the previous step into synthesized speech.
Hybrid approaches (IBM Watson)
•  Build a shallow semantic representation of the query
•  Generate answer candidates using IR methods
   •  Augmented with ontologies and semi-structured data
•  Score each candidate using richer knowledge sources
   •  Geospatial databases
   •  Temporal reasoning
   •  Taxonomical classification
Question Answering

Answer Types and Query Formulation

Factoid Q/A
[Architecture diagram: Question → Question Processing (Query Formulation, Answer Type Detection) → Document Retrieval → Passage Retrieval → Answer Processing → Answer]
Question Processing
Things to extract from the question
•  Answer Type Detection
   •  Decide the named entity type (person, place) of the answer
•  Query Formulation
   •  Choose query keywords for the IR system
•  Question Type classification
   •  Is this a definition question, a math question, a list question?
•  Focus Detection
   •  Find the question words that are replaced by the answer
•  Relation Extraction
   •  Find relations between entities in the question
Question Processing
They're the two states you could be reentering if you're crossing Florida's northern border
•  Answer Type: US state
•  Query: two states, border, Florida, north
•  Focus: the two states
•  Relations: borders(Florida, ?x, north)
Answer Type Detection: Named Entities
•  Who founded Virgin Airlines?
   •  PERSON
•  What Canadian city has the largest population?
   •  CITY
Answer Type Taxonomy
•  6 coarse classes
   •  ABBREVIATION, ENTITY, DESCRIPTION, HUMAN, LOCATION, NUMERIC
•  50 finer classes
   •  LOCATION: city, country, mountain…
   •  HUMAN: group, individual, title, description
   •  ENTITY: animal, body, color, currency…

Xin Li, Dan Roth. 2002. Learning Question Classifiers. COLING'02
Part of Li & Roth's Answer Type Taxonomy
[Figure: LOCATION (country, city, state) · NUMERIC (date, percent, money, size, distance) · ENTITY (food, currency, animal) · HUMAN (individual, title, group) · ABBREVIATION (abbreviation, expression) · DESCRIPTION (definition, reason)]
Answer Types

More Answer Types
Answer types in Jeopardy
•  2500 answer types in a 20,000 Jeopardy question sample
•  The most frequent 200 answer types cover < 50% of data
•  The 40 most frequent Jeopardy answer types:
   he, country, city, man, film, state, she, author, group, here, company, president, capital, star, novel, character, woman, river, island, king, song, part, series, sport, singer, actor, play, team, show, actress, animal, presidential, composer, musical, nation, book, title, leader, game

Ferrucci et al. 2010. Building Watson: An Overview of the DeepQA Project. AI Magazine. Fall 2010. 59-79.
Answer Type Detection
•  Hand-written rules
•  Machine Learning
•  Hybrids
Answer Type Detection
•  Regular expression-based rules can get some cases:
   •  Who {is|was|are|were} PERSON
   •  PERSON (YEAR – YEAR)
•  Other rules use the question headword:
   (the headword of the first noun phrase after the wh-word)
   •  Which city in China has the largest number of foreign financial companies?
   •  What is the state flower of California?
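The "Who {is|was|are|were} → PERSON" rule above can be sketched as a single regex. The exact pattern and the UNKNOWN fallback are assumptions for illustration, not from the slides.

```python
import re

# Toy answer-type detector implementing only the PERSON rule.
person_rule = re.compile(r"^Who\s+(is|was|are|were)\b", re.IGNORECASE)

def detect_answer_type(question):
    if person_rule.match(question):
        return "PERSON"
    return "UNKNOWN"

print(detect_answer_type("Who was Queen Victoria's second son?"))    # → PERSON
print(detect_answer_type("What is the state flower of California?")) # → UNKNOWN
```

Note that "Who founded Virgin Airlines?" already escapes this rule, which is why such regexes only "get some cases" and usually end up as features inside a classifier.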
  
Answer Type Detection
•  Most often, we treat the problem as machine learning classification
   •  Define a taxonomy of question types
   •  Annotate training data for each question type
   •  Train classifiers for each question class using a rich set of features.
      •  features include those hand-written rules!
Features for Answer Type Detection
•  Question words and phrases
•  Part-of-speech tags
•  Parse features (headwords)
•  Named Entities
•  Semantically related words
Factoid Q/A
[Architecture diagram: Question → Question Processing (Query Formulation, Answer Type Detection) → Document Retrieval → Passage Retrieval → Answer Processing → Answer]
Keyword Selection Algorithm
1. Select all non-stop words in quotations
2. Select all NNP words in recognized named entities
3. Select all complex nominals with their adjectival modifiers
4. Select all other complex nominals
5. Select all nouns with their adjectival modifiers
6. Select all other nouns
7. Select all verbs
8. Select all adverbs
9. Select the QFW word (skipped in all previous steps)
10. Select all other words

Dan Moldovan, Sanda Harabagiu, Marius Paşca, Rada Mihalcea, Richard Goodrum, Roxana Girju and Vasile Rus. 1999. Proceedings of TREC-8.
Choosing keywords from the query

Who coined the term "cyberspace" in his novel "Neuromancer"?

cyberspace/1   Neuromancer/1   term/4   novel/4   coined/7

Slide from Mihai Surdeanu
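The cyberspace/1 … coined/7 ranks can be reproduced by a much-simplified sketch of the keyword priorities: only three of the ten steps are implemented (quoted words, nouns collapsed onto the example's step 4, verbs), and the part-of-speech tags are supplied by hand rather than by a tagger. All of that is an assumption for illustration.

```python
def rank_keywords(tagged_words):
    """tagged_words: (word, pos, in_quotes) triples.
    Assign each keyword the number of the first selection step that picks it:
    1 = quoted non-stop word, 4 = noun (simplified), 7 = verb."""
    ranks = {}
    for word, pos, quoted in tagged_words:
        if quoted:
            ranks[word] = 1
        elif pos == "NOUN":
            ranks[word] = 4
        elif pos == "VERB":
            ranks[word] = 7
    return sorted(ranks.items(), key=lambda item: item[1])

question = [("Who", "PRON", False), ("coined", "VERB", False),
            ("the", "DET", False), ("term", "NOUN", False),
            ("cyberspace", "NOUN", True), ("in", "ADP", False),
            ("his", "PRON", False), ("novel", "NOUN", False),
            ("Neuromancer", "NOUN", True)]
print(rank_keywords(question))
# → [('cyberspace', 1), ('Neuromancer', 1), ('term', 4), ('novel', 4), ('coined', 7)]
```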
  
Question Answering

Passage Retrieval and Answer Extraction

Factoid Q/A
[Architecture diagram: Question → Question Processing (Query Formulation, Answer Type Detection) → Document Retrieval → Passage Retrieval → Answer Processing → Answer]
Passage Retrieval
•  Step 1: IR engine retrieves documents using query terms
•  Step 2: Segment the documents into shorter units
   •  something like paragraphs
•  Step 3: Passage ranking
   •  Use answer type to help rerank passages
Features for Passage Ranking
•  Number of Named Entities of the right type in passage
•  Number of query words in passage
•  Number of question N-grams also in passage
•  Proximity of query keywords to each other in passage
•  Longest sequence of question words
•  Rank of the document containing passage

Either in rule-based classifiers or with supervised machine learning
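Two of the listed features, query-word overlap and shared question bigrams, can be combined into a toy passage score. The equal weighting and the example passage are assumptions; a real ranker would learn weights over all the features above.

```python
def passage_score(question_tokens, passage_tokens):
    """Toy score: # query words in passage + # question bigrams in passage."""
    word_overlap = len(set(question_tokens) & set(passage_tokens))
    q_bigrams = set(zip(question_tokens, question_tokens[1:]))
    p_bigrams = set(zip(passage_tokens, passage_tokens[1:]))
    return word_overlap + len(q_bigrams & p_bigrams)

question = "who founded virgin airlines".split()
passage = "richard branson founded virgin airlines in 1970".split()
print(passage_score(question, passage))  # → 3 shared words + 2 shared bigrams = 5
```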
  
Factoid Q/A
[Architecture diagram: Question → Question Processing (Query Formulation, Answer Type Detection) → Document Retrieval → Passage Retrieval → Answer Processing → Answer]
Answer Extraction
•  Run an answer-type named-entity tagger on the passages
   •  Each answer type requires a named-entity tagger that detects it
   •  If answer type is CITY, tagger has to tag CITY
   •  Can be full NER, simple regular expressions, or hybrid
•  Return the string with the right type:
   •  Who is the prime minister of India (PERSON)
      Manmohan Singh, Prime Minister of India, had told left leaders that the deal would not be renegotiated.
   •  How tall is Mt. Everest? (LENGTH)
      The official height of Mount Everest is 29035 feet
Ranking Candidate Answers
•  But what if there are multiple candidate answers!

   Q: Who was Queen Victoria's second son?
•  Answer Type: Person
•  Passage:
   The Marie biscuit is named after Marie Alexandrovna, the daughter of Czar Alexander II of Russia and wife of Alfred, the second son of Queen Victoria and Prince Albert

Apposition is a grammatical construction in which two elements, normally noun phrases, are placed side by side, with one element serving to identify the other in a different way.
Use machine learning:
Features for ranking candidate answers

Answer type match: Candidate contains a phrase with the correct answer type.
Pattern match: Regular expression pattern matches the candidate.
Question keywords: # of question keywords in the candidate.
Keyword distance: Distance in words between the candidate and query keywords
Novelty factor: A word in the candidate is not in the query.
Apposition features: The candidate is an appositive to question terms
Punctuation location: The candidate is immediately followed by a comma, period, quotation marks, semicolon, or exclamation mark.
Sequences of question terms: The length of the longest sequence of question terms that occurs in the candidate answer.
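Two of these features, the question-keyword count and the novelty factor, fit in a minimal sketch; the tokenized candidate and query below are illustrative, not from the slides.

```python
def candidate_features(candidate_tokens, query_tokens):
    """Extract two of the listed features for one candidate answer."""
    cand, query = set(candidate_tokens), set(query_tokens)
    return {
        "question_keywords": len(cand & query),     # query keywords in candidate
        "novelty_factor": int(bool(cand - query)),  # any word not in the query?
    }

features = candidate_features(
    "alfred second son of queen victoria".split(),
    "queen victoria second son".split())
print(features)  # → {'question_keywords': 4, 'novelty_factor': 1}
```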
  
	
  
Candidate Answer scoring in IBM Watson
•  Each candidate answer gets scores from >50 components
   •  (from unstructured text, semi-structured text, triple stores)
   •  logical form (parse) match between question and candidate
   •  passage source reliability
   •  geospatial location
      •  California is "southwest of Montana"
   •  temporal relationships
   •  taxonomic classification
Common Evaluation Metrics
1. Accuracy (does answer match gold-labeled answer?)
2. Mean Reciprocal Rank
   •  For each query return a ranked list of M candidate answers.
   •  Its score is 1/Rank of the first right answer.
   •  Take the mean over all N queries

   MRR = (1/N) · Σ_{i=1..N} 1/rank_i
Common Evaluation Metrics
1. Accuracy (does answer match gold-labeled answer?)
2. Mean Reciprocal Rank:
   •  The reciprocal rank of a query response is the inverse of the rank of the first correct answer.
   •  The mean reciprocal rank is the average of the reciprocal ranks of results for a sample of queries Q

   MRR = (1/N) · Σ_{i=1..N} 1/rank_i
Common Evaluation Metrics: MRR
•  The mean reciprocal rank is the average of the reciprocal ranks of results for a sample of queries Q.
•  (ex adapted from Wikipedia)
   •  3 ranked answers for a query, with the first one being the one it thinks is most likely correct
   •  Given those 3 samples, we could calculate the mean reciprocal rank as (1/3 + 1/2 + 1)/3 = 11/18 or about 0.61.
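The MRR formula and the worked example can be checked directly in a few lines (using None to stand for a query whose returned list contains no correct answer, which scores 0):

```python
def mean_reciprocal_rank(first_correct_ranks):
    """first_correct_ranks[i]: rank of the first correct answer for query i
    (1-based), or None if none of the returned answers is correct."""
    scores = [1.0 / r if r is not None else 0.0 for r in first_correct_ranks]
    return sum(scores) / len(scores)

# The Wikipedia-adapted example: ranks 3, 2 and 1 → (1/3 + 1/2 + 1)/3 = 11/18
print(round(mean_reciprocal_rank([3, 2, 1]), 2))  # → 0.61
```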
  
Common Evaluation Metrics
1. Mean Reciprocal Rank
   •  For each query return a ranked list of M candidate answers.
   •  Query score is 1/Rank of the first correct answer
      •  If first answer is correct: 1
      •  else if second answer is correct: ½
      •  else if third answer is correct: ⅓, etc.
      •  Score is 0 if none of the M answers are correct
   •  Take the mean over all N queries

   MRR = (1/N) · Σ_{i=1..N} 1/rank_i
Use of this metric
•  Mean reciprocal rank is a statistical measure for evaluating any process that produces a list of possible responses to a sample of queries, ordered by probability of correctness.
   •  Machine translation
   •  Question answering
   •  Etc.
  
Question Answering

Advanced: Answering Complex Questions
Answering harder questions
Q: What is water spinach?
A: Water spinach (ipomoea aquatica) is a semi-aquatic leafy green plant with long hollow stems and spear- or heart-shaped leaves, widely grown throughout Asia as a leaf vegetable. The leaves and stems are often eaten stir-fried flavored with salt or in soups. Other common names include morning glory vegetable, kangkong (Malay), rau muong (Viet.), ong choi (Cant.), and kong xin cai (Mand.). It is not related to spinach, but is closely related to sweet potato and convolvulus.
Answering harder questions
Q: In children with an acute febrile illness, what is the efficacy of single medication therapy with acetaminophen or ibuprofen in reducing fever?
A: Ibuprofen provided greater temperature decrement and longer duration of antipyresis than acetaminophen when the two drugs were administered in approximately equal doses. (PubMedID: 1621668, Evidence Strength: A)
Answering harder questions via query-focused summarization
•  The (bottom-up) snippet method
   •  Find a set of relevant documents
   •  Extract informative sentences from the documents (using tf-idf, MMR)
   •  Order and modify the sentences into an answer
•  The (top-down) information extraction method
   •  build specific answerers for different question types:
      •  definition questions,
      •  biography questions,
      •  certain medical questions
The Information Extraction method
•  a good biography of a person contains:
   •  a person's birth/death, fame factor, education, nationality and so on
•  a good definition contains:
   •  genus or hypernym
      •  The Hajj is a type of ritual
•  a medical answer about a drug's use contains:
   •  the problem (the medical condition),
   •  the intervention (the drug or procedure), and
   •  the outcome (the result of the study).
Informa$on	
  that	
  should	
  be	
  in	
  the	
  answer	
  
for	
  3	
  kinds	
  of	
  ques$ons	
  
"What is the Hajj?" (Ndocs=20, Len=8)

•  Document Retrieval: 11 Web documents, 1127 total sentences
•  Predicate Identification: 383 Non-Specific Definitional sentences
•  Data-Driven Analysis: sentence clusters, importance ordering
•  Definition Creation: 9 Genus-Species sentences, e.g.:
The Hajj, or pilgrimage to Makkah (Mecca), is the central duty of Islam.
The Hajj is a milestone event in a Muslim's life.
The hajj is one of five pillars that make up the foundation of Islam.
...

Generated definition:
The Hajj, or pilgrimage to Makkah [Mecca], is the central duty of Islam. More than two million Muslims are expected to take the Hajj this year. Muslims must perform the hajj at least once in their lifetime if physically and financially able. The Hajj is a milestone event in a Muslim's life. The annual hajj begins in the twelfth month of the Islamic year (which is lunar, not solar, so that hajj and Ramadan fall sometimes in summer, sometimes in winter). The Hajj is a week-long pilgrimage that begins in the 12th month of the Islamic lunar calendar. Another ceremony, which was not connected with the rites of the Ka'ba before the rise of Islam, is the Hajj, the annual pilgrimage to 'Arafat, about two miles east of Mecca, toward Mina…
Architecture for complex question answering: definition questions
S. Blair-Goldensohn, K. McKeown and A. Schlaikjer. 2004. Answering Definition Questions: A Hybrid Approach.
  	
  
The end

Lecture: Question Answering

  • 1. Seman&c  Analysis  in  Language  Technology   http://stp.lingfil.uu.se/~santinim/sais/2016/sais_2016.htm 
 
 Question Answering Marina  San(ni   san$nim@stp.lingfil.uu.se     Department  of  Linguis(cs  and  Philology   Uppsala  University,  Uppsala,  Sweden     Spring  2016       1  
  • 2. Previous  Lecture:  IE  –  Named  En$ty  Recogni$on  (NER)   2  
  • 3. •  A  very  important  sub-­‐task:  find  and  classify   names  in  text,  for  example:   •  The  decision  by  the  independent  MP  Andrew   Wilkie  to  withdraw  his  support  for  the  minority   Labor  government  sounded  drama(c  but  it   should  not  further  threaten  its  stability.  When,   aJer  the  2010  elec(on,  Wilkie,  Rob  OakeshoN,   Tony  Windsor  and  the  Greens  agreed  to  support   Labor,  they  gave  just  two  guarantees:   confidence  and  supply.   Named  En$ty  Recogni$on  (NER)   Person   Date   Loca(on   Organiza(on   Etc.      
  • 4. NER  pipeline   4   Representa(ve   documents   Human   annota(on   Annotated   documents   Feature   extrac(on   Training  data  Sequence   classifiers   NER  system  
  • 5. Encoding  classes  for  sequence  labeling        IO  encoding  IOB  encoding      Fred      PER    B-­‐PER    showed    O    O    Sue      PER    B-­‐PER    Mengqiu    PER    B-­‐PER    Huang    PER    I-­‐PER    ‘s      O    O    new      O    O    pain(ng  O    O  
  • 6. Features  for  sequence  labeling   •  Words   •  Current  word  (essen(ally  like  a  learned  dic(onary)   •  Previous/next  word  (context)   •  Other  kinds  of  inferred  linguis(c  classifica(on   •  Part-­‐of-­‐speech  tags   •  Other  features   •  Word  shapes   •  etc.   6  
  • 7. Features: Word shapes •  Word Shapes •  Map words to simplified representation that encodes attributes such as length, capitalization, numerals, Greek letters, internal punctuation, etc. Varicella-zoster Xx-xxx mRNA xXXX CPA1 XXXd •  Varicella  zoster  is  a    virus   •  Messenger  RNA  (mRNA)  is  a  large   family  of  RNA  molecules   •  CPA1  (Carboxypep(dase  A1   (Pancrea(c))  is  a  Protein  Coding  gene.  
  • 8. Inspira$on  figure   Task:  Develop  a  set  of  regular   expressions  to  recognize  the   character  shape  features.     •  Possible  set  of  REs  matching  the   inspira(on  figure  (syntax  dpn  on   prLang):         8   No  need  to  remember  things  by  heart:  once   you  know  what  you  have  to  do,  find  the   correct  syntax  on  the  web!  
  • 9. The  gold  standard  corpus   There  are  always  many   solu(ons  to  a  research   ques(on!  You  had  to  make   your  choice…  Basic  steps:     1.  Analyse  the  data  (you  must   know  your  data  well!!!);     2.  Get  an  idea  of  the  paNerns   3.  Choose  the  way  to  go…   4.  Report  your  results   9  
  • 10. Proposed  solu$ons   •  (Xx*)*  regardless  the  NE   type   •  Complex  paNerns  that   could  iden(fy  approx.  900   lines  out  of  1316  en((es     (regardless  NE  type)   •  etc…   10  
  • 11. Some  alterna$ves:  create  paLerns  per  NE  type…   (divide  and  conquer  approach  J  )   Ex:  person  names  (283):  most   person  names  have  the  shape:   (Xx*){2}  (presumably  you  woud  get   high  accuracy)       Miles  Sindercombe  p:person   Armand  de  Pontmar(n  p:person   Alicia  Gorey  p:person   Kim  Crosby  (singer)  p:person   Edmond  Roudnitska  p:person   Shobha  Gurtu  p:person   Bert  Greene  p:person   Danica  McKellar  p:person   11   Sheila  O'Brien  p:person   Mar(n  Day  p:person   Clive  MaNhew-­‐Wilson  p:person   Venugopal  Dhoot  p:person   Clifford  Berry  p:person   Munir  Malik  p:person   Mary  Sears  p:person   Charles  Wayne  "Chuck"  Day  p:person   Michael  Formanek  p:person   Felix  Carlebach  p:person   Alexander  Keith,  Jr.  p:person   Omer  Vanaudenhove  p:person  
  • 12. What’s  the  mathema$cal  formalism  underlying   REs?   12  
  • 14. Conver$ng  the  regular  expression   (a|b)*  to  a  DFA   14  
  • 15. Conver$ng  the  regular  expression  (a*|b*)*  to  a  DFA   15  
  • 16. Conver$ng  the  regular  expression   ab(a|b)*  to  a  DFA   16  
  • 17. Chomsky  hierarchy   •  Regular  expressions  help  solve  problems  that  are  tractable  by   ”regular  grammars”.       17   For  example,  it  is  not  possible  to  write  an  FSM  (and   consequently  regular  expressions)  that  generates  the   language  an  bn,  i.e.  the  set  of  all  strings  which  consist   of  a  (possibly  empty)  block  of  as  followed  by  a   (possibly  empty)  block  of  bs  of  exactly  the  same   length).       Areas  where  finite  state  methods  have  been  shown  to   be  par(cularly  useful  in  NLP  are  phonological  and   morphological  processing.       In  our  case,  we  must  explore  and  experiment  with  the   NE  corpus  and  see  if  there  are  sequences  that  cannot   be  captured  by  a  regular  language.    
  • 18. For  some  problems,     •  …  the  expressive  power  of  REs  is  exactly  what    is  needed   •  For  some  other  problems,  the  expressive  power  of  REs  is  too   weak…   •  Addionally,  since  REs  a  basically  hand-­‐wriNen  rules,  it  is  easy  to  get   entagled  with  rules…  at  one  point  you  do  not  know  any  more  how  the   rules  interact  with  each  other…  so  results  might  be  unpredictable  J     18  
  • 19. End  of  previous  lecture   19  
  • 21. Acknowledgements Most  slides  borrowed  or  adapted  from:   Dan  Jurafsky  and  Christopher  Manning,  Coursera   Dan  Jurafsky  and  James  H.  Mar(n  (2015)         J&M(2015,  draJ):  hNps://web.stanford.edu/~jurafsky/slp3/              
  • 22. 22   Ques$on  Answering   What do worms eat? worms eat what worms eat grass Worms eat grass worms eat grass Grass is eaten by worms birds eat worms Birds eat worms horses eat grass Horses with worms eat grass with worms Ques%on: Poten%al-Answers: One  of  the  oldest  NLP  tasks  (punched  card  systems  in  1961)   Simmons,  Klein,  McConlogue.  1964.  Indexing  and   Dependency  Logic  for  Answering  English  Ques(ons.   American  Documenta(on  15:30,  196-­‐204  
  • 23. Ques$on  Answering:  IBM’s  Watson   •  Won  Jeopardy  on  February  16,  2011!   •  IBM’s  Watson  is  a  Ques(on  Answering  system.   •  What  is  Jeopardy?   23  
  • 24. Jeopardy!     •  Jeopardy!  is  an  American  television  quiz  compe((on  in  which   contestants  are  presented  with  general  knowledge  clues  in  the   form  of  answers,  and  must  phrase  their  responses  in  the  form  of   ques/ons.     •  The  original  day(me  version  debuted  on  NBC  on  March  30,   1964,     24  
  • 25. Watson’s  performance   •  With  the  answer:  “You  just  need  a  nap.  You  don’t  have  this   sleep  disorder  that  can  make  sufferers  nod  off  while  standing   up,”  Watson  replied,  “What  is  narcolepsy?”   25  
  • 26. Ques$on  Answering:  IBM’s  Watson   •  The  winning  reply!   26   WILLIAM WILKINSON’S “AN ACCOUNT OF THE PRINCIPALITIES OF WALLACHIA AND MOLDOVIA” INSPIRED THIS AUTHOR’S MOST FAMOUS NOVEL Bram  Stoker  
  • 29. 29   Types  of  Ques$ons  in  Modern  Systems   •  Factoid  ques(ons   •  Who  wrote  “The  Universal  Declara/on  of  Human  Rights”?   •  How  many  calories  are  there  in  two  slices  of  apple  pie?   •  What  is  the  average  age  of  the  onset  of  au/sm?   •  Where  is  Apple  Computer  based?   •  Complex  (narra(ve)  ques(ons:   •  In  children  with  an  acute  febrile  illness,  what  is  the                               efficacy  of  acetaminophen  in  reducing  fever?   •  What  do  scholars  think  about  Jefferson’s  posi/on  on                       dealing  with  pirates?  
  • 30. Commercial  systems:     mainly  factoid  ques$ons   Where  is  the  Louvre  Museum  located?   In  Paris,  France   What’s  the  abbrevia(on  for  limited   partnership?   L.P.   What  are  the  names  of  Odin’s  ravens?   Huginn  and  Muninn   What  currency  is  used  in  China?   The  yuan   What  kind  of  nuts  are  used  in  marzipan?   almonds   What  instrument  does  Max  Roach  play?   drums   What  is  the  telephone  number  for  Stanford   University?   650-­‐723-­‐2300  
  • 31. Paradigms  for  QA   •  IR-­‐based  approaches   •  TREC;    IBM  Watson;  Google   •  Knowledge-­‐based     •  Apple  Siri;  Wolfram  Alpha;     •  Hybrid  approaches   •  IBM  Watson;  True  Knowledge  Evi     31  
  • 32. Many  ques$ons  can  already  be  answered   by  web  search   •  a   32  
  • 33. IR-­‐based  Ques$on  Answering   •  a   33  
  • 34. Things  change  all  the  $me….  J   •  Google  was  a  pure  IR-­‐based  QA,  but  in  2012  Knowledge  Graph   was  added  to  Google's  search  engine.     •  The  Knowledge  Graph  is  a  knowledge  base  used  by  Google  to   enhance  its  search  engine's  search  results  with  seman(c-­‐search   informa(on  gathered  from  a  wide  variety  of  sources.     •  Wikipedia:  The  goal  of  KGraph  is  that  users  would  be  able  to  use  this  informa(on  to  resolve  their   query  without  having  to  navigate  to  other  sites  and  assemble  the  informa(on  themselves.  [...]   According  to  some  news  websites,  the  implementa(on  of  Google's  Knowledge  Graph  has  played  a   role  in  the  page  view  decline  of  various  language  versions  of  Wikipedia.   34  
  • 35. 35   IR-­‐based  Factoid  QA   Document DocumentDocument Docume ntDocume ntDocume ntDocume ntDocume nt Question Processing Passage Retrieval Query Formulation Answer Type Detection Question Passage Retrieval Document Retrieval Answer Processing Answer passages Indexing Relevant Docs DocumentDocument Document
  • 36. IR-­‐based  Factoid  QA   •  QUESTION  PROCESSING   •  Detect  ques(on  type,  answer  type,  focus,  rela(ons   •  Formulate  queries  to  send  to  a  search  engine   •  PASSAGE  RETRIEVAL   •  Retrieve  ranked  documents   •  Break  into  suitable  passages  and  rerank   •  ANSWER  PROCESSING   •  Extract  candidate  answers   •  Rank  candidates     •  using  evidence  from  the  text  and  external  sources  
  • 37. Knowledge-­‐based  approaches  (Siri)   •  Build  a  seman(c  representa(on  of  the  query   •  Times,  dates,  loca(ons,  en((es,  numeric  quan((es   •  Map  from  this  seman(cs  to  query  structured  data    or  resources   •  Geospa(al  databases   •  Ontologies  (Wikipedia  infoboxes,  dbPedia,  WordNet,  Yago)   •  Restaurant  review  sources  and  reserva(on  services   •  Scien(fic  databases   37  
• 38. Siri's main tasks, at a high level, involve:
  • Using ASR (automatic speech recognition) to transcribe human speech (in this case, short utterances of commands, questions, or dictations) into text.
  • Using natural language processing (part-of-speech tagging, noun-phrase chunking, dependency & constituent parsing) to translate transcribed text into "parsed text".
  • Using question & intent analysis to analyze parsed text, detecting user commands and actions. ("Schedule a meeting", "Set my alarm", ...)
  • Using data technologies to interface with 3rd-party web services such as OpenTable and WolframAlpha to perform actions, search operations, and question answering.
  • Utterances Siri has identified as questions that it cannot directly answer are forwarded to more general question-answering services such as WolframAlpha.
  • Transforming the output of 3rd-party web services back into natural language text (e.g., today's weather report → "The weather will be sunny").
  • Using TTS (text-to-speech) technologies to transform the natural language text from the step above into synthesized speech.
• 39. Hybrid approaches (IBM Watson)
  • Build a shallow semantic representation of the query
  • Generate answer candidates using IR methods
    • Augmented with ontologies and semi-structured data
  • Score each candidate using richer knowledge sources
    • Geospatial databases
    • Temporal reasoning
    • Taxonomical classification
• 40. Question Answering: Answer Types and Query Formulation
• 41. Factoid Q/A
  [Architecture diagram, repeated: Question → Question Processing (Query Formulation, Answer Type Detection) → Document Retrieval → Passage Retrieval → Answer Processing → Answer]
• 42. Question Processing
  Things to extract from the question:
  • Answer Type Detection
    • Decide the named entity type (person, place) of the answer
  • Query Formulation
    • Choose query keywords for the IR system
  • Question Type classification
    • Is this a definition question, a math question, a list question?
  • Focus Detection
    • Find the question words that are replaced by the answer
  • Relation Extraction
    • Find relations between entities in the question
• 43. Question Processing
  "They're the two states you could be reentering if you're crossing Florida's northern border"
  • Answer Type: US state
  • Query: two states, border, Florida, north
  • Focus: the two states
  • Relations: borders(Florida, ?x, north)
• 44. Answer Type Detection: Named Entities
  • Who founded Virgin Airlines?
    • PERSON
  • What Canadian city has the largest population?
    • CITY
• 45. Answer Type Taxonomy
  • 6 coarse classes
    • ABBREVIATION, ENTITY, DESCRIPTION, HUMAN, LOCATION, NUMERIC
  • 50 finer classes
    • LOCATION: city, country, mountain…
    • HUMAN: group, individual, title, description
    • ENTITY: animal, body, color, currency…
  Xin Li, Dan Roth. 2002. Learning Question Classifiers. COLING'02
• 46. Part of Li & Roth's Answer Type Taxonomy
  [Taxonomy diagram: LOCATION (country, city, state), NUMERIC (date, percent, money, size, distance), HUMAN (individual, title, group), ENTITY (food, currency, animal), ABBREVIATION (expression, abbreviation), DESCRIPTION (definition, reason)]
• 48. More Answer Types
• 49. Answer types in Jeopardy
  • 2500 answer types in a 20,000-question Jeopardy sample
  • The most frequent 200 answer types cover < 50% of the data
  • The 40 most frequent Jeopardy answer types: he, country, city, man, film, state, she, author, group, here, company, president, capital, star, novel, character, woman, river, island, king, song, part, series, sport, singer, actor, play, team, show, actress, animal, presidential, composer, musical, nation, book, title, leader, game
  Ferrucci et al. 2010. Building Watson: An Overview of the DeepQA Project. AI Magazine. Fall 2010. 59-79.
• 50. Answer Type Detection
  • Hand-written rules
  • Machine Learning
  • Hybrids
• 51. Answer Type Detection
  • Regular expression-based rules can get some cases:
    • Who {is|was|are|were} PERSON
    • PERSON (YEAR – YEAR)
  • Other rules use the question headword (the headword of the first noun phrase after the wh-word):
    • Which city in China has the largest number of foreign financial companies?
    • What is the state flower of California?
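Hand-written rules of this kind can be sketched as an ordered list of regular expressions. The specific patterns and type labels below are toy assumptions in the spirit of the slide, not a real system's rule set:

```python
import re

# Ordered (pattern, answer-type) rules; first match wins.
RULES = [
    (re.compile(r"^who\b", re.I), "PERSON"),
    (re.compile(r"^(where|which city|what city)\b", re.I), "LOCATION"),
    (re.compile(r"^(when|what year)\b", re.I), "DATE"),
    (re.compile(r"^how (tall|long|far|high)\b", re.I), "LENGTH"),
    (re.compile(r"^how (many|much)\b", re.I), "NUMERIC"),
]

def answer_type(question):
    for pattern, atype in RULES:
        if pattern.search(question):
            return atype
    return "UNKNOWN"   # fall back to ML classification in a real system

print(answer_type("Who founded Virgin Airlines?"))   # → PERSON
print(answer_type("How tall is Mt. Everest?"))       # → LENGTH
```

Questions the rules miss ("What is the state flower of California?" → UNKNOWN) are exactly the cases where headword features and learned classifiers, described next, take over.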
• 52. Answer Type Detection
  • Most often, we treat the problem as machine learning classification
    • Define a taxonomy of question types
    • Annotate training data for each question type
    • Train classifiers for each question class using a rich set of features
      • Features include those hand-written rules!
• 53. Features for Answer Type Detection
  • Question words and phrases
  • Part-of-speech tags
  • Parse features (headwords)
  • Named Entities
  • Semantically related words
• 54. Factoid Q/A
  [Architecture diagram, repeated: Question → Question Processing → Document Retrieval → Passage Retrieval → Answer Processing → Answer]
• 55. Keyword Selection Algorithm
  1. Select all non-stop words in quotations
  2. Select all NNP words in recognized named entities
  3. Select all complex nominals with their adjectival modifiers
  4. Select all other complex nominals
  5. Select all nouns with their adjectival modifiers
  6. Select all other nouns
  7. Select all verbs
  8. Select all adverbs
  9. Select the QFW word (skipped in all previous steps)
  10. Select all other words
  Dan Moldovan, Sanda Harabagiu, Marius Paşca, Rada Mihalcea, Richard Goodrum, Roxana Girju and Vasile Rus. 1999. Proceedings of TREC-8.
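The ordering above can be sketched as a priority function. This is a deliberately simplified approximation: instead of running a parser, it takes pre-tagged (word, POS, in-quotes) triples as input, collapses steps 3–6 into one "noun" bucket, and uses a toy stopword list — all assumptions of the sketch, not the original algorithm:

```python
STOP = {"the", "a", "an", "in", "his", "who"}

def keyword_priority(word, pos, quoted):
    if quoted and word.lower() not in STOP:
        return 1      # step 1: non-stop words in quotations
    if pos == "NNP":
        return 2      # step 2: proper nouns / named entities
    if pos.startswith("NN"):
        return 4      # steps 3-6 collapsed: nominals and nouns
    if pos.startswith("VB"):
        return 7      # step 7: verbs
    if pos.startswith("RB"):
        return 8      # step 8: adverbs
    return 10         # step 10: everything else

def select_keywords(tagged):
    scored = [(keyword_priority(w, pos, q), w) for w, pos, q in tagged
              if w.lower() not in STOP or q]
    # stable sort keeps original word order within each priority level
    return [w for p, w in sorted(scored, key=lambda x: x[0])]

tagged = [("Who", "WP", False), ("coined", "VBD", False),
          ("the", "DT", False), ("term", "NN", False),
          ("cyberspace", "NN", True), ("in", "IN", False),
          ("his", "PRP$", False), ("novel", "NN", False),
          ("Neuromancer", "NNP", True)]
print(select_keywords(tagged))
# → ['cyberspace', 'Neuromancer', 'term', 'novel', 'coined']
```

The output reproduces the priorities on the next slide: the quoted terms come first, then the nouns, then the verb.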
• 56. Choosing keywords from the query
  Who coined the term "cyberspace" in his novel "Neuromancer"?
  → cyberspace/1, Neuromancer/1, term/4, novel/4, coined/7 (keyword/priority)
  Slide from Mihai Surdeanu
• 58. Factoid Q/A
  [Architecture diagram, repeated: Question → Question Processing → Document Retrieval → Passage Retrieval → Answer Processing → Answer]
• 59. Passage Retrieval
  • Step 1: IR engine retrieves documents using query terms
  • Step 2: Segment the documents into shorter units
    • something like paragraphs
  • Step 3: Passage ranking
    • Use answer type to help rerank passages
• 60. Features for Passage Ranking
  • Number of Named Entities of the right type in passage
  • Number of query words in passage
  • Number of question N-grams also in passage
  • Proximity of query keywords to each other in passage
  • Longest sequence of question words
  • Rank of the document containing passage
  Either in rule-based classifiers or with supervised machine learning
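A subset of these features can be computed directly from token overlap. The sketch below is an assumption-laden toy: the entity count is passed in rather than computed by an NER tagger, and the linear weights in `passage_score` are invented for illustration (real systems learn them):

```python
def passage_features(passage, query_words, answer_type_entities, doc_rank):
    p_words = passage.lower().split()
    q = [w.lower() for w in query_words]
    # longest run of consecutive passage tokens that are question words
    longest = run = 0
    for w in p_words:
        run = run + 1 if w in q else 0
        longest = max(longest, run)
    return {
        "n_right_type_entities": answer_type_entities,
        "n_query_words": sum(w in p_words for w in q),
        "longest_question_sequence": longest,
        "doc_rank": doc_rank,
    }

def passage_score(f):
    # toy linear combination; a learned model would set these weights
    return (2.0 * f["n_right_type_entities"] + 1.0 * f["n_query_words"]
            + 0.5 * f["longest_question_sequence"] - 0.1 * f["doc_rank"])

f = passage_features("The official height of Mount Everest is 29035 feet",
                     ["height", "Mount", "Everest"],
                     answer_type_entities=1, doc_rank=2)
print(passage_score(f))
```

Each candidate passage gets such a score, and the top-ranked ones are handed to answer extraction.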
• 61. Factoid Q/A
  [Architecture diagram, repeated: Question → Question Processing → Document Retrieval → Passage Retrieval → Answer Processing → Answer]
• 62. Answer Extraction
  • Run an answer-type named-entity tagger on the passages
    • Each answer type requires a named-entity tagger that detects it
    • If answer type is CITY, tagger has to tag CITY
    • Can be full NER, simple regular expressions, or hybrid
  • Return the string with the right type:
    • Who is the prime minister of India? (PERSON)
      "Manmohan Singh, Prime Minister of India, had told left leaders that the deal would not be renegotiated."
    • How tall is Mt. Everest? (LENGTH)
      "The official height of Mount Everest is 29035 feet"
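The "simple regular expressions" option can be sketched as a map from answer type to pattern. The patterns below are toy assumptions covering only a few types; a type like PERSON would need a full NER model:

```python
import re

# Each answer type maps to a simple regex "tagger" (illustrative only).
TAGGERS = {
    "LENGTH": re.compile(r"\b\d[\d,]*\s*(?:feet|foot|meters?|metres?|km)\b"),
    "DATE":   re.compile(r"\b(?:1[0-9]|20)\d{2}\b"),
    "MONEY":  re.compile(r"\$\s?\d[\d,.]*"),
}

def extract_by_type(answer_type, passage):
    tagger = TAGGERS.get(answer_type)
    if tagger is None:
        return None          # no regex tagger for this type
    m = tagger.search(passage)
    return m.group(0) if m else None

print(extract_by_type(
    "LENGTH", "The official height of Mount Everest is 29035 feet"))
# → 29035 feet
```

When several strings of the right type survive, the system must rank them — the problem addressed on the next slide.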
• 63. Ranking Candidate Answers
  • But what if there are multiple candidate answers?
    Q: Who was Queen Victoria's second son?
  • Answer Type: Person
  • Passage: "The Marie biscuit is named after Marie Alexandrovna, the daughter of Czar Alexander II of Russia and wife of Alfred, the second son of Queen Victoria and Prince Albert"
  Apposition is a grammatical construction in which two elements, normally noun phrases, are placed side by side, with one element serving to identify the other in a different way.
• 64. Use machine learning: Features for ranking candidate answers
  • Answer type match: candidate contains a phrase with the correct answer type.
  • Pattern match: regular expression pattern matches the candidate.
  • Question keywords: # of question keywords in the candidate.
  • Keyword distance: distance in words between the candidate and query keywords.
  • Novelty factor: a word in the candidate is not in the query.
  • Apposition features: the candidate is an appositive to question terms.
  • Punctuation location: the candidate is immediately followed by a comma, period, quotation marks, semicolon, or exclamation mark.
  • Sequences of question terms: the length of the longest sequence of question terms that occurs in the candidate answer.
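Three of these features — question-keyword overlap, novelty, and punctuation location — can be combined in a toy scorer. The weights are invented for illustration; in particular, keyword overlap is weighted *negatively* here on the (toy) assumption that a factoid answer rarely repeats question terms, which is not the weighting of any published system:

```python
def candidate_features(candidate, question_keywords, passage):
    cand_l = candidate.lower()
    q = [k.lower() for k in question_keywords]
    return {
        "keywords_in_candidate": sum(k in cand_l for k in q),
        "novelty": any(w not in q for w in cand_l.split()),
        "followed_by_comma": (candidate + ",") in passage,
    }

def score(f):
    # toy hand-set weights; a learned ranker would fit these
    return (-0.5 * f["keywords_in_candidate"]
            + 1.0 * f["novelty"]
            + 0.5 * f["followed_by_comma"])

passage = ("Manmohan Singh, Prime Minister of India, had told left "
           "leaders that the deal would not be renegotiated.")
candidates = ["Manmohan Singh", "India"]
keywords = ["prime", "minister", "india"]
best = max(candidates,
           key=lambda c: score(candidate_features(c, keywords, passage)))
print(best)   # → Manmohan Singh
```

"India" loses because it merely echoes a question keyword, while "Manmohan Singh" is novel and sits in the appositive-like comma position.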
• 65. Candidate Answer scoring in IBM Watson
  • Each candidate answer gets scores from >50 components
    • (from unstructured text, semi-structured text, triple stores)
    • logical form (parse) match between question and candidate
    • passage source reliability
    • geospatial location
      • California is "southwest of Montana"
    • temporal relationships
    • taxonomic classification
• 66. Common Evaluation Metrics
  1. Accuracy (does answer match gold-labeled answer?)
  2. Mean Reciprocal Rank
    • For each query return a ranked list of M candidate answers.
    • Its score is 1/rank of the first right answer.
    • Take the mean over all N queries:
      MRR = (1/N) · Σ_{i=1}^{N} 1/rank_i
• 67. Common Evaluation Metrics
  1. Accuracy (does answer match gold-labeled answer?)
  2. Mean Reciprocal Rank:
    • The reciprocal rank of a query response is the inverse of the rank of the first correct answer.
    • The mean reciprocal rank is the average of the reciprocal ranks of results for a sample of queries Q:
      MRR = (1/N) · Σ_{i=1}^{N} 1/rank_i
• 68. Common Evaluation Metrics: MRR
  • The mean reciprocal rank is the average of the reciprocal ranks of results for a sample of queries Q.
  • (example adapted from Wikipedia)
    • Three queries, each answered with a ranked list; the first correct answer appears at rank 3, rank 2, and rank 1 respectively.
    • Given those 3 samples, the mean reciprocal rank is (1/3 + 1/2 + 1)/3 = 11/18, or about 0.61.
• 69. Common Evaluation Metrics
  1. Mean Reciprocal Rank
    • For each query return a ranked list of M candidate answers.
    • Query score is 1/rank of the first correct answer:
      • If the first answer is correct: 1
      • else if the second answer is correct: ½
      • else if the third answer is correct: ⅓, etc.
      • Score is 0 if none of the M answers are correct
    • Take the mean over all N queries:
      MRR = (1/N) · Σ_{i=1}^{N} 1/rank_i
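The definition above translates directly into a few lines of code, with the "no correct answer" case encoded here as `None`:

```python
def mrr(first_correct_ranks):
    # mean over queries of 1/rank of the first correct answer;
    # None means no returned answer was correct (score 0 for that query)
    return sum(0 if r is None else 1.0 / r
               for r in first_correct_ranks) / len(first_correct_ranks)

# the earlier slide's example: first correct answers at ranks 3, 2 and 1
print(mrr([3, 2, 1]))   # → (1/3 + 1/2 + 1) / 3 = 11/18 ≈ 0.611
```

This reproduces the worked example on slide 68 exactly.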
• 70. Use of this metric
  • Mean reciprocal rank is a statistical measure for evaluating any process that produces a list of possible responses to a sample of queries, ordered by probability of correctness.
    • Machine translation
    • Question answering
    • Etc.
• 72. Answering harder questions
  Q: What is water spinach?
  A: Water spinach (Ipomoea aquatica) is a semi-aquatic leafy green plant with long hollow stems and spear- or heart-shaped leaves, widely grown throughout Asia as a leaf vegetable. The leaves and stems are often eaten stir-fried flavored with salt or in soups. Other common names include morning glory vegetable, kangkong (Malay), rau muong (Viet.), ong choi (Cant.), and kong xin cai (Mand.). It is not related to spinach, but is closely related to sweet potato and convolvulus.
• 73. Answering harder questions
  Q: In children with an acute febrile illness, what is the efficacy of single-medication therapy with acetaminophen or ibuprofen in reducing fever?
  A: Ibuprofen provided greater temperature decrement and longer duration of antipyresis than acetaminophen when the two drugs were administered in approximately equal doses. (PubMed ID: 1621668, Evidence Strength: A)
• 74. Answering harder questions via query-focused summarization
  • The (bottom-up) snippet method
    • Find a set of relevant documents
    • Extract informative sentences from the documents (using tf-idf, MMR)
    • Order and modify the sentences into an answer
  • The (top-down) information extraction method
    • Build specific answerers for different question types:
      • definition questions,
      • biography questions,
      • certain medical questions
• 75. The Information Extraction method
  • A good biography of a person contains:
    • the person's birth/death, fame factor, education, nationality and so on
  • A good definition contains:
    • genus or hypernym
      • "The Hajj is a type of ritual"
  • A medical answer about a drug's use contains:
    • the problem (the medical condition),
    • the intervention (the drug or procedure), and
    • the outcome (the result of the study).
• 76. Information that should be in the answer for 3 kinds of questions
• 77. Architecture for complex question answering: definition questions
  [Pipeline diagram for "What is the Hajj?" (Ndocs=20, Len=8): Document Retrieval (11 web documents, 1127 total sentences) → Predicate Identification (9 genus-species sentences, 383 non-specific definitional sentences) → Data-Driven Analysis (sentence clusters, importance ordering) → Definition Creation. Example output: "The Hajj, or pilgrimage to Makkah (Mecca), is the central duty of Islam. The Hajj is a milestone event in a Muslim's life. The hajj is one of five pillars that make up the foundation of Islam. …"]
  S. Blair-Goldensohn, K. McKeown and A. Schlaikjer. 2004. Answering Definition Questions: A Hybrid Approach.