ESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of Watson
- 1. Inside the mind of Watson
Chris Welty
IBM Research
ibmwatson.com
Do Not Record. Do Not Distribute.
© 2011 IBM Corporation
- 2. The Core Technical Team*
Researchers and Engineers in NLP, ML, IR, KR&R and CL at
IBM Labs and a growing number of universities
- 3. Automatic Open-Domain Question Answering
A Long-Standing Challenge in Artificial Intelligence to emulate human expertise
Given
– Rich Natural Language Questions
– Over a Broad Domain of Knowledge
Deliver
– Precise Answers: Determine what is being asked & give precise response
– Accurate Confidences: Determine likelihood answer is correct
– Consumable Justifications: Explain why the answer is right
– Fast Response Time: Precision & Confidence in <3 seconds
- 4. What is Jeopardy?
Jeopardy! is an American quiz show
– 1964 – Today
Answer-and-question format
– contestants are presented with clues in the form of answers
– must phrase their responses in question form
Example
– Category: General Science
– Clue: When hit by electrons, a phosphor gives off electromagnetic energy in this form
– Answer: What is light?
- 5. The Jeopardy! Challenge
Hard for humans, hard for machines. But hard for different reasons.
For people, the challenge is knowing the answer
For machines, the challenge is understanding the question
Broad/Open Domain, Complex Language, High Precision, Accurate Confidence, High Speed
Example clues
– $200: If you are looking at the wainscoting, you are looking in this direction. (What is down?)
– $600: In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus. (What is cytoplasm?)
– $800: The conspirators against this man were wounded by each other while they stabbed at him. (Who is Julius Caesar?)
– $1000: The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author. (Who is D’Artagnan?)
- 6. What It Takes to compete against Top Human Jeopardy! Players
Our Analysis Reveals the Winner’s Cloud
Each dot – actual historical human Jeopardy! games
Top human players are remarkably good.
[Chart: regions labeled Winning Human Performance, Grand Champion Human Performance, and 2007 QA Computer System; annotations More Confident / Less Confident]
- 7. What It Takes to compete against Top Human Jeopardy! Players
Our Analysis Reveals the Winner’s Cloud
Each dot – actual historical human Jeopardy! games
Computers? Not So Good.
In 2007, we committed to making a Huge Leap!
[Chart: regions labeled Winning Human Performance, Grand Champion Human Performance, and 2007 QA Computer System; annotations More Confident / Less Confident]
- 8. Welty’s Trident
A new software paradigm is emerging
– Increasingly, computational tasks require inexact solutions that
combine multiple methods in unpredictable ways
Knowledge is not the destination
– Watson does not answer a question by translating natural language
input into formally represented knowledge and simply running queries
against this knowledge
Machine intelligence is not human intelligence
– The difference is most notable in the mistakes they make
- 10. DeepQA: The Technology Behind Watson
An example of a new software paradigm
DeepQA generates and scores many hypotheses using an extensible collection of
Natural Language Processing, Machine Learning and Reasoning Algorithms.
These gather and weigh evidence over both unstructured and structured content to
determine the answer with the best confidence.
Learned Models help combine and weigh the Evidence
[Diagram: DeepQA architecture. Stages shown: Question; Question & Topic Analysis; Question Decomposition; Hypothesis Generation (Primary Search over Answer Sources, Candidate Answer Generation); Hypothesis and Evidence Scoring (Answer Scoring, Evidence Retrieval over Evidence Sources, Deep Evidence Scoring); Synthesis; Final Confidence Merging & Ranking, with learned Models; Answer & Confidence]
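The generate-then-score idea on this slide can be illustrated with a very small sketch. It is not DeepQA code: the two scorers, the city set, and the hand-picked weights are hypothetical stand-ins, assumed here only to show candidates being scored by independent components and ranked by a combined score.

```python
from typing import Callable, Dict, List, Tuple

# A scorer maps (question, candidate) to a number. All names here are hypothetical.
Scorer = Callable[[str, str], float]


def rank_hypotheses(
    question: str,
    candidates: List[str],
    scorers: Dict[str, Scorer],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Score every candidate with every scorer, then rank by weighted sum."""
    ranked = []
    for candidate in candidates:
        features = {name: scorer(question, candidate) for name, scorer in scorers.items()}
        total = sum(weights[name] * value for name, value in features.items())
        ranked.append((candidate, total))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Toy stand-in for a structured evidence source.
MICHIGAN_CITIES = {"Battle Creek", "Grand Rapids"}


def is_michigan_city(question: str, candidate: str) -> float:
    return 1.0 if candidate in MICHIGAN_CITIES else 0.0


def is_not_in_question(question: str, candidate: str) -> float:
    # An answer that already appears in the clue is unlikely to be correct.
    return 0.0 if candidate.lower() in question.lower() else 1.0


question = "In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city"
candidates = ["General Foods", "Post Foods", "Battle Creek", "C.W. Post"]
scorers = {"geo": is_michigan_city, "novel": is_not_in_question}
weights = {"geo": 2.0, "novel": 0.5}  # hand-picked for illustration, not learned

ranking = rank_hypotheses(question, candidates, scorers, weights)
print(ranking[0][0])
```

In the system described here the weights come from learned models rather than being set by hand.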
- 11. Example Question
Question: In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city
Question Analysis
– Keywords: 1894, C.W. Post, created …
– Lexical Answer Type: (Michigan city)
– Date(1894)
– Relations: Create(Post, cereal drink) …
Primary Search over Related Content (Structured & Unstructured)
Candidate Answer Generation
– General Foods, 1985, Post Foods, Battle Creek, Grand Rapids, …
Evidence Retrieval and Evidence Scoring
– each candidate receives a vector of feature scores, e.g. [0.58 0 -1.3 … 0.97], [0.71 1 13.4 … 0.72], [0.84 1 10.6 … 0.21]
Merging & Ranking
1) Battle Creek (0.85)
2) Post Foods (0.20)
3) 1985 (0.05)
…
- 12. Hypothesis Scoring
Category: MICHIGAN MANIA
Clue: In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city
Answer Scorers can be applied depending on different relations or constraints detected in the question. For example, this question focus with modifiers is “Michigan city.” Watson can detect this as a geospatial relation that indicates the correct answer must be a city spatially located within the state of Michigan.
Scorers shown: TyCor, Temporal, Spatial, Popularity, …
Evidence Feature Scores (Answer Scoring + Passage Scoring), by candidate answer
– General Foods: Doc Rank 0, Pass Rank 1, Ty Cor 0.1, Geo 0
– Post Foods: Doc Rank 2, Pass Rank 1, Ty Cor 0.1, Geo 0
– Battle Creek: Doc Rank 1, Pass Rank 2, Ty Cor 0.8, Geo 1
– Will Keith Kellogg: Doc Rank 3, Ty Cor 0.1, Geo 0
– Grand Rapids: Ty Cor 0.9, Geo 1
– 1895: Pass Rank 0, Ty Cor 0.0, Geo 0
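A constraint-triggered scorer like the geospatial one described on this slide might look roughly as follows. This is only a sketch: the tiny gazetteer and the function names are hypothetical, and the real component would consult far richer geospatial data.

```python
from typing import Dict, List, Optional

# Hypothetical miniature gazetteer standing in for real geospatial knowledge.
CITY_TO_STATE = {
    "Battle Creek": "Michigan",
    "Grand Rapids": "Michigan",
    "Chicago": "Illinois",
}
US_STATES = {"Michigan", "Illinois"}


def detect_state_constraint(focus: str) -> Optional[str]:
    """Return the state named in a focus such as 'Michigan city', if any."""
    words = focus.split()
    if len(words) == 2 and words[1].lower() == "city" and words[0] in US_STATES:
        return words[0]
    return None


def geo_score(candidate: str, state: str) -> int:
    """1 if the candidate is a known city in the given state, else 0."""
    return 1 if CITY_TO_STATE.get(candidate) == state else 0


def score_candidates(focus: str, candidates: List[str]) -> Dict[str, int]:
    """Apply the geo scorer only when the question carries a geospatial constraint."""
    state = detect_state_constraint(focus)
    if state is None:
        return {}
    return {candidate: geo_score(candidate, state) for candidate in candidates}


candidates = ["General Foods", "Post Foods", "Battle Creek",
              "Will Keith Kellogg", "Grand Rapids", "1895"]
scores = score_candidates("Michigan city", candidates)
print(scores)
```

With this toy gazetteer the scores agree with the Geo values listed on the slide: 1 for Battle Creek and Grand Rapids, 0 for the rest.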
- 13. Passage Scoring
Category: MICHIGAN MANIA
Clue: In 1894 C.W. Post created his warm cereal drink Postum in this
Michigan city
In Deep Evidence Scoring, Watson retrieves evidence for each candidate answer, then evaluates the evidence using a
large number of deep evidence scoring analytics. The evidence for a candidate answer may come from the original
document or passage where the candidate answer was generated, or it may come from an evidence retrieval search
performed by taking the keyword search query from Step 2, replacing the focus terms with the candidate answer, and
retrieving the relevant passages that are found. The passages, or “context” in which the candidate answer occurs are
evaluated as evidence to support or refute the candidate answer as the correct answer for the question.
Passages retrieved as evidence for the candidates General Foods, Battle Creek and Post Foods:
– 1895: In Battle Creek, Michigan, C.W. Post made the first POSTUM, a cereal beverage. Post created GRAPE-NUTS cereal in 1897, and POST TOASTIES corn flakes in 1908
– C.W. Post came to the Battle Creek sanitarium to cure his upset stomach. He later created Postum, a cereal-based coffee substitute
– 1854 C. W. Post (Charles William) was born. He founded the Postum Cereal Co. in 1895 (renamed General Foods Corp. in 1922) to manufacture Postum cereal beverage
– General Foods' products go from breakfast (Post's cereals) to warm nightcaps (Postum, Sanka), also wash the pots and pans that its foods are cooked in (S.O.S. Scouring Pads
– The company was incorporated in 1922, having developed from the earlier Postum Cereal Co. Ltd., founded by C.W. Post (1854-1914) in 1895 in Battle Creek, Mich. After a number of experiments, Post marketed his first product-the cereal beverage called Postum-in 1895
– Post Foods, LLC, also known as Post Cereals (formerly Postum Cereals) was founded by C.W. Post. It began in 1895 with the first Postum, a "cereal beverage", developed by Post in Battle Creek, Michigan. The first cereal, Grape-Nuts, was developed in 1897
– It was named after C. W. Post, the founder of the Postum Cereal Company that later became General Foods. The cereal company unit was later sold off and is now Post Foods
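The evidence retrieval step described on this slide (substitute the candidate for the focus, search again, inspect the passages) can be sketched as below. The term-overlap ranking is an assumed stand-in for a real search engine, and the helper names are hypothetical.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def evidence_query(clue: str, focus: str, candidate: str) -> str:
    """Replace the focus phrase in the clue with the candidate answer."""
    return clue.replace(focus, candidate)


def retrieve(query: str, passages: List[str], top_k: int = 2) -> List[Tuple[float, str]]:
    """Rank passages by the fraction of query terms they contain (toy retrieval)."""
    query_terms = tokens(query)
    scored = [(len(query_terms & tokens(p)) / len(query_terms), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


clue = "In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city"
passages = [
    "1895: In Battle Creek, Michigan, C.W. Post made the first POSTUM, a cereal beverage.",
    "General Foods' products go from breakfast (Post's cereals) to warm nightcaps (Postum, Sanka)",
]

query = evidence_query(clue, "this Michigan city", "Battle Creek")
best_score, best_passage = retrieve(query, passages)[0]
print(query)
print(best_passage)
```

The retrieved passages would then be handed to the passage scorers as supporting or refuting context for the candidate.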
- 14. Merging Candidate Answers and Scoring the Confidence
Category: MICHIGAN MANIA
Clue: In 1894 C.W. Post created his warm cereal drink Postum in this …
In the final processing step, Watson detects variants of the same answer and merges their feature scores together.
Watson then computes the final confidence scores for the candidate answers by applying a series of Machine
Learning models that weight all of the feature scores to produce the final confidence scores.
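Evidence Feature Scores, by candidate answer
– General Foods: Doc Rank 0, Pass Rank 1, Ty Cor 0.1, Geo 0, LFACS 0.2, Term Match 22, Temporal 1
– Post Foods: Doc Rank 2, Pass Rank 1, Ty Cor 0.1, Geo 0, LFACS 0.4, Term Match 41, Temporal 1
– Battle Creek: Doc Rank 1, Pass Rank 2, Ty Cor 0.8, Geo 1, LFACS 0.5, Term Match 30, Temporal 0.9
– Will Keith Kellogg: Doc Rank 3, Ty Cor 0.1, Geo 0, LFACS 0, Term Match 23, Temporal 0.5
– Grand Rapids: Ty Cor 0.9, Geo 1, LFACS 0, Term Match 10, Temporal 0.5
– 1895: Pass Rank 0, Ty Cor 0.0, Geo 0, LFACS 0, Term Match 21, Temporal 0.6
Machine Learning Model Application
Final Answers and Confidence
– Battle Creek: 0.946 (Correct Answer)
– Post Foods: 0.152
– 1895: 0.040
– Grand Rapids: 0.033
– General Foods: 0.014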
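As a rough illustration of this final step, the sketch below merges answer variants and turns merged feature scores into a confidence with a logistic function. Several things are assumptions made for the example: the variant "Battle Creek, Michigan" and its scores, the rule of keeping the maximum per feature when merging, and the weights and bias, which a real system would learn. The output is therefore not expected to reproduce the confidences on the slide.

```python
import math
from typing import Dict, List, Tuple

Features = Dict[str, float]


def canonical(answer: str) -> str:
    """Toy variant detection: ignore case and anything after a comma."""
    return answer.split(",")[0].strip().lower()


def merge_variants(scored: Dict[str, Features]) -> Dict[str, Features]:
    """Merge variants of the same answer, keeping the best score per feature (assumed rule)."""
    merged: Dict[str, Features] = {}
    for answer, features in scored.items():
        slot = merged.setdefault(canonical(answer), {})
        for name, value in features.items():
            slot[name] = max(slot.get(name, value), value)
    return merged


def confidence(features: Features, weights: Features, bias: float) -> float:
    """Logistic combination of weighted feature scores."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


scored = {
    "Battle Creek": {"tycor": 0.8, "geo": 1.0},
    "Battle Creek, Michigan": {"tycor": 0.6, "geo": 1.0},  # invented variant
    "General Foods": {"tycor": 0.1, "geo": 0.0},
}
weights = {"tycor": 2.0, "geo": 3.0}  # hypothetical, not learned
bias = -3.0

merged = merge_variants(scored)
ranked: List[Tuple[float, str]] = sorted(
    ((confidence(f, weights, bias), answer) for answer, f in merged.items()),
    reverse=True,
)
for score, answer in ranked:
    print(f"{answer}: {score:.3f}")
```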
- 15. “Minimal” Deep QA Pipeline
Category: MICHIGAN MANIA
Clue: In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city
Question → Question Analysis
– LAT: Michigan City
Primary Search
– Document Search Results (rank, title): 0 General Foods; 1 Battle Creek; 2 Post Foods; 3 Will Keith Kellogg
Hypothesis Generation
– Candidate Answers: General Foods, Post Foods, Battle Creek
Hypothesis and Evidence Scoring
– Evidence Features: General Foods: Ty Cor 0.1, Geo 0; Post Foods: Ty Cor 0.1, Geo 0; Battle Creek: Ty Cor 0.8, Geo 1
Final Confidence Merging & Ranking
– Final Answers and Confidence: Battle Creek 0.946; Post Foods 0.152; 1895 0.040
- 16. A new software paradigm emerging (not that we invented it)
The basic Watson computation is Hypothesis Scoring
How well does an answer fit into a question?
More than 100 different Hypothesis scoring software components
No single scoring component does the whole job
Many of them do very similar jobs
12 typing components, 8 passage alignment components, 10 ngram components, …
These components are not integrated with each other beyond that they
each produce a score for each hypothesis
A machine learning algorithm learns how to combine them to produce
a final score
The development methodology involved an incremental approach of
producing stable baseline systems and testing changes with “follow-ons”
Changes that improve performance according to our metrics are
accepted into the next stable baseline
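How a learner might combine independent component scores can be sketched with a plain logistic regression trained by gradient descent. This is an assumed choice of model for illustration only; the training rows below are invented toy data (two scores per hypothesis, label 1 for a correct answer), not Watson data.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: Sequence[float], bias: float, scores: Sequence[float]) -> float:
    """Final score for one hypothesis from its component scores."""
    return sigmoid(bias + sum(w * s for w, s in zip(weights, scores)))


def train(
    rows: List[Sequence[float]],
    labels: List[int],
    epochs: int = 2000,
    lr: float = 0.5,
) -> Tuple[List[float], float]:
    """Batch gradient descent on the logistic loss."""
    n = len(rows[0])
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n
        grad_b = 0.0
        for scores, label in zip(rows, labels):
            error = predict(weights, bias, scores) - label
            for i, s in enumerate(scores):
                grad_w[i] += error * s
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


# Invented toy data: [typing score, passage score] per hypothesis.
rows = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.7), (0.2, 0.3), (0.1, 0.6), (0.3, 0.1)]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train(rows, labels)
strong = predict(weights, bias, (0.85, 0.85))
weak = predict(weights, bias, (0.15, 0.2))
print(strong > weak)
```

The components stay unaware of each other; only the learned weights decide how much each score counts.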
- 19. Welty’s Trident
A new software paradigm is emerging
– Increasingly, computational tasks require inexact solutions that
combine multiple methods in unpredictable ways
Knowledge is not the destination
– Watson does not answer a question by translating natural language
input into formally represented knowledge and simply running queries
against this knowledge
Machine intelligence is not human intelligence
– The difference is most notable in the mistakes they make
- 20. ClassicQA: NOT The Technology Behind Watson
From the dawn of AI, it was envisioned that question answering would work by having a
process that completely translated natural language (content & questions) into an
unambiguous (logical) representation, and a reasoning process would run on that
representation to produce answers. This vision has never been realized.
[Diagram labels: Question; GOFNLP; Formal Query; Logical Reasoner; Formal Knowledge; Answer Sources; Primary Search; Answer & Confidence]
- 25. Using Structured Evidence
• Exploit wealth of freely available structured information
• e.g. Linked Open Data (LOD)
• Types, Relations, Links
• Complement results from unstructured text analysis
• Classic Precision Vs. Recall Tradeoff
Useful for explanation data
– Precise and reliable evidence (e.g. spatial / temporal constraint match)
- 26. Structured Data and Inference in Watson
Spatial Reasoning
– Containment (“This African country..”)
– Relative direction (“This sea east of Florida..”)
– Border (“This state bordering the Great Lakes..”)
– Relative location (“bldg. near Times Square..”)
– Numeric Properties: area/population/height (“This sea, largest in area,..”)
Temporal Reasoning
– Lifespan, Duration
Relation Detection and Scoring Using Structured KBs
– Q: “This 1997 Titanic hero..” matches <Dicaprio, lead-actor, Titanic>
Answer Typing (Type Coercion)
– LAT: Scottish Inventor; Answer: James Watt
Anti-Type Coercion
– LAT: Country; Candidate: Einstein
Answer In Clue
– Q: “In 2003, ‘Big Blue’ acquired this company..”: downweight IBM
LAT Inference (Using PRISMATIC)
– Q: “Annexation of this in 1803..”: “this” is a Region
Evidence Diffusion
– Q: “Sunan Intl. Airport is in this country”: diffuse evidence from Pyongyang to N Korea
[Diagram: these components are shown attached to the DeepQA pipeline stages: Question & Topic Analysis; Question Decomposition; Primary Search; Candidate Answer Generation; Hypothesis Generation; Evidence Retrieval; Evidence Scoring; Hypothesis and Evidence Scoring; Synthesis; Final Confidence Merging & Ranking; Answer & Confidence, together with Evidence Sources and Models]
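Three of the checks named on this slide (type coercion, anti-type coercion, answer in clue) can be sketched with small lookup tables. The tables, the +1 / -1 / 0 scoring convention, and the candidate "Acme Corp" are all hypothetical, chosen only to mirror the slide's examples; the real components draw on structured sources such as Linked Open Data.

```python
from typing import Dict, Set

# Hypothetical stand-ins for type and alias knowledge from structured sources.
TYPES: Dict[str, Set[str]] = {
    "James Watt": {"person", "inventor", "scottish inventor"},
    "Einstein": {"person", "physicist"},
}
ALIASES: Dict[str, Set[str]] = {"IBM": {"ibm", "big blue"}}


def tycor(candidate: str, lat: str) -> float:
    """+1 if a known type matches the LAT, -1 if types are known but none match, 0 if unknown."""
    types = TYPES.get(candidate)
    if types is None:
        return 0.0
    return 1.0 if lat.lower() in types else -1.0


def answer_in_clue(candidate: str, clue: str) -> bool:
    """True if the candidate, or one of its known aliases, already appears in the clue."""
    names = ALIASES.get(candidate, {candidate.lower()})
    text = clue.lower()
    return any(name in text for name in names)


clue = "In 2003, 'Big Blue' acquired this company"
print(tycor("James Watt", "Scottish Inventor"))  # type coercion supports the candidate
print(tycor("Einstein", "Country"))              # anti-type coercion counts against it
print(answer_in_clue("IBM", clue))               # candidate to be downweighted
```

Each result would be one more feature score per candidate, left for the learned models to weigh.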
- 27. LOD Impact on DeepQA for Typing Answers
[Bar chart: an ensemble of TyCor components; vertical axis from 61.5% to 66.5%; annotation “+ ~10%”]
- 28. Welty’s Trident
A new software paradigm is emerging
– Increasingly, computational tasks require inexact solutions that
combine multiple methods in unpredictable ways
Knowledge is not the destination
– Watson does not answer a question by translating natural language
input into formally represented knowledge and simply running queries
against this knowledge
Machine intelligence is not human intelligence
– The difference is most notable in the mistakes they make
- 29. And the winner is….not human
[Chart: Precision (0% to 100%) against % Answered (0% to 100%)]
- 30. IBM Research
Category: FATHERLY NICKNAMES
Clue: THIS FRENCHMAN WAS "THE FATHER OF BACTERIOLOGY"
Response shown: HOW TASTY WAS MY LITTLE FRENCHMAN
© 2009 IBM Corporation
- 31. IBM Research
Category: THERE'S A FIRST TIME FOR EVERYTHING
Clue: IN 1824 THIS FIRST FOREIGNER TO ADDRESS A JOINT SESSION OF CONGRESS CONGRATULATED THE U.S. ON ITS GROWTH
Response shown: President Bush
- 34. OLYMPIC ODDITIES
Clue: It was the anatomical oddity of U.S. gymnast George Eyser, who won a gold medal on the parallel bars in 1904
Response shown: Had only one hand
- 35. Welty’s Trident
A new software paradigm is emerging
– Increasingly, computational tasks require inexact solutions that
combine multiple methods in unpredictable ways
Knowledge is not the destination
– Watson does not answer a question by translating natural language
input into formally represented knowledge and simply running queries
against this knowledge
Machine intelligence is not human intelligence
– The difference is most notable in the mistakes they make