Inside the mind of Watson
Chris Welty
IBM Research
ibmwatson.com

Do Not Record. Do Not Distribute.
© 2011 IBM Corporation
The Core Technical Team*
Researchers and Engineers in NLP, ML, IR, KR&R and CL at
IBM Labs and a growing number of universities

Automatic Open-Domain Question Answering
A Long-Standing Challenge in Artificial Intelligence to emulate human expertise

• Given
– Rich Natural Language Questions
– Over a Broad Domain of Knowledge

• Deliver
– Precise Answers: Determine what is being asked & give precise response
– Accurate Confidences: Determine likelihood answer is correct
– Consumable Justifications: Explain why the answer is right
– Fast Response Time: Precision & Confidence in <3 seconds

What is Jeopardy?
• Jeopardy! is an American quiz show
– 1964 – Today

• Answer-and-question format
– Contestants are presented with clues in the form of answers
– They must phrase their responses in question form

• Example
– Category: General Science
– Clue: When hit by electrons, a phosphor gives off electromagnetic energy in this form
– Answer: What is light?

The Jeopardy! Challenge
Hard for humans, hard for machines, but hard for different reasons.
Broad/Open Domain, Complex Language, High Precision, Accurate Confidence, High Speed

$1000
The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author.
Who is D’Artagnan?

$200
If you are looking at the wainscoting, you are looking in this direction.
What is down?

$600
In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus
What is cytoplasm?

$800
The conspirators against this man were wounded by each other while they stabbed at him
Who is Julius Caesar?

For people, the challenge is knowing the answer
For machines, the challenge is understanding the question

What It Takes to compete against Top Human Jeopardy! Players
Our Analysis Reveals the Winner’s Cloud

[Figure: scatter plot in which each dot is an actual historical human Jeopardy! game. Labeled regions: Winning Human Performance; Grand Champion Human Performance; 2007 QA Computer System. One axis runs from More Confident to Less Confident.]

Top human players are remarkably good.
Computers? Not So Good.
In 2007, we committed to making a Huge Leap!

Welty’s Trident

• A new software paradigm is emerging
– Increasingly, computational tasks require inexact solutions that combine multiple methods in unpredictable ways

• Knowledge is not the destination
– Watson does not answer a question by translating natural language input into formally represented knowledge and simply running queries against this knowledge

• Machine intelligence is not human intelligence
– The difference is most notable in the mistakes they make

DeepQA: The Technology Behind Watson
An example of a new software paradigm
DeepQA generates and scores many hypotheses using an extensible collection of Natural Language Processing, Machine Learning and Reasoning Algorithms. These gather and weigh evidence over both unstructured and structured content to determine the answer with the best confidence. Learned Models help combine and weigh the Evidence.

[Figure: DeepQA architecture. Question → Question & Topic Analysis → Question Decomposition → Hypothesis Generation (Primary Search over Answer Sources, Candidate Answer Generation) → Hypothesis and Evidence Scoring (Answer Scoring, Evidence Retrieval over Evidence Sources, Deep Evidence Scoring) → Synthesis → Final Confidence Merging & Ranking → Answer & Confidence. Models are attached to the stages.]

Example Question
In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city

Question Analysis
– Keywords: 1894, C.W. Post, created …
– Lexical Answer Type: (Michigan city)
– Date(1894)
– Relations: Create(Post, cereal drink) …

Primary Search
– Related Content (Structured & Unstructured)

Candidate Answer Generation
– General Foods, 1985, Post Foods, Battle Creek, Grand Rapids, …

Evidence Retrieval and Evidence Scoring
[Figure: each candidate answer is paired with an evidence feature vector, e.g. [0.58 0 -1.3 … 0.97].]

Merging & Ranking
1) Battle Creek (0.85)
2) Post Foods (0.20)
3) 1985 (0.05)

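The question analysis step above can be illustrated with a minimal Python sketch. It assumes a deliberately naive heuristic that is not Watson's actual method: the words after "this" are treated as the lexical answer type (LAT), and four-digit numbers are treated as dates. The function name `analyze_clue` is hypothetical.

```python
import re
from typing import Dict


def analyze_clue(clue: str) -> Dict[str, object]:
    """Toy question analysis: pull out a LAT and any year mentions."""
    # Assumed heuristic: "this <Modifier> <noun>" names the answer type.
    match = re.search(r"\bthis ((?:[A-Z][a-z]+ )?[a-z]+)", clue)
    lat = match.group(1) if match else None
    # Assumed heuristic: any 4-digit number starting with 1 or 20 is a year.
    years = [int(y) for y in re.findall(r"\b(1[0-9]{3}|20[0-9]{2})\b", clue)]
    return {"lat": lat, "years": years}


clue = "In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city"
print(analyze_clue(clue))
```

A real analyzer would rely on parsing and many detectors rather than one regular expression; the sketch only shows what kind of output the later stages consume.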
Hypothesis Scoring
Category: MICHIGAN MANIA
Clue: In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city
Answer scorers (TyCor, Temporal, Spatial, Popularity, …) can be applied depending on the relations or constraints detected in the question. For example, the focus of this question, with its modifiers, is “Michigan city.” Watson can detect this as a geospatial relation that indicates the correct answer must be a city spatially located within the state of Michigan.

Evidence Feature Scores (Answer Scoring + Passage Scoring)

Candidate Answers     Doc Rank   Pass Rank   Ty Cor   Geo
General Foods         0          1           0.1      0
Post Foods            2          1           0.1      0
Battle Creek          1          2           0.8      1
Will Keith Kellogg    3                      0.1      0
Grand Rapids                                 0.9      1
1895                                         0.0      0

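As a rough illustration of the geospatial scorer described above, the following Python sketch checks whether a candidate is a city inside the required state. The gazetteer `CITY_STATE` and the function `geo_score` are hypothetical stand-ins; the talk does not say how Watson stores or looks up this information.

```python
from typing import Dict, List

# Hypothetical mini-gazetteer (city -> containing state), for illustration only.
CITY_STATE: Dict[str, str] = {
    "Battle Creek": "Michigan",
    "Grand Rapids": "Michigan",
}


def geo_score(candidate: str, required_state: str) -> int:
    """Return 1 if the candidate is a known city in the required state, else 0."""
    return 1 if CITY_STATE.get(candidate) == required_state else 0


candidates: List[str] = ["General Foods", "Post Foods", "Battle Creek", "Grand Rapids", "1895"]
for name in candidates:
    print(name, geo_score(name, "Michigan"))
```

With this toy gazetteer the scorer gives 1 to Battle Creek and Grand Rapids and 0 to the other candidates, which is the pattern shown in the Geo column.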
Passage Scoring
Category: MICHIGAN MANIA
Clue: In 1894 C.W. Post created his warm cereal drink Postum in this
Michigan city
In Deep Evidence Scoring, Watson retrieves evidence for each candidate answer, then evaluates the evidence using a
large number of deep evidence scoring analytics. The evidence for a candidate answer may come from the original
document or passage where the candidate answer was generated, or it may come from an evidence retrieval search
performed by taking the keyword search query from Step 2, replacing the focus terms with the candidate answer, and
retrieving the relevant passages that are found. The passages, or “context” in which the candidate answer occurs are
evaluated as evidence to support or refute the candidate answer as the correct answer for the question.

General Foods
– 1854 C. W. Post (Charles William) was born. He founded the Postum Cereal Co. in 1895 (renamed General Foods Corp. in 1922) to manufacture Postum cereal beverage
– General Foods' products go from breakfast (Post's cereals) to warm nightcaps (Postum, Sanka), also wash the pots and pans that its foods are cooked in (S.O.S. Scouring Pads)
– The company was incorporated in 1922, having developed from the earlier Postum Cereal Co. Ltd., founded by C.W. Post (1854-1914) in 1895 in Battle Creek, Mich. After a number of experiments, Post marketed his first product, the cereal beverage called Postum, in 1895

Battle Creek
– 1895: In Battle Creek, Michigan, C.W. Post made the first POSTUM, a cereal beverage. Post created GRAPE-NUTS cereal in 1897, and POST TOASTIES corn flakes in 1908
– C.W. Post came to the Battle Creek sanitarium to cure his upset stomach. He later created Postum, a cereal-based coffee substitute

Post Foods
– Post Foods, LLC, also known as Post Cereals (formerly Postum Cereals) was founded by C.W. Post. It began in 1895 with the first Postum, a "cereal beverage", developed by Post in Battle Creek, Michigan. The first cereal, Grape-Nuts, was developed in 1897
– It was named after C. W. Post, the founder of the Postum Cereal Company that later became General Foods. The cereal company unit was later sold off and is now Post Foods

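The evidence retrieval idea described on this slide (replace the focus terms with a candidate, then look at the passages that come back) can be sketched in Python as follows. The function names and the bag-of-words overlap score are assumptions made for illustration; they do not reproduce Watson's passage scorers or the Term Match values shown in the talk.

```python
import re
from typing import Set


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def evidence_query(clue: str, focus: str, candidate: str) -> str:
    """Build an evidence query by substituting the candidate for the focus."""
    return clue.replace(focus, candidate)


def term_match(query: str, passage: str) -> int:
    """Count distinct terms shared by the query and the passage."""
    return len(tokens(query) & tokens(passage))


clue = "In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city"
query = evidence_query(clue, "this Michigan city", "Battle Creek")
passage = "C.W. Post came to the Battle Creek sanitarium to cure his upset stomach."
print(query)
print(term_match(query, passage))
```

A passage that shares many terms with the rewritten query counts as weak support for the candidate; deeper scorers would also look at word order, syntax and relations.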
Merging Candidate Answers and Scoring the Confidence
Category: MICHIGAN MANIA
Clue: In 1894 C.W. Post created his warm cereal drink Postum in this …
In the final processing step, Watson detects variants of the same answer and merges their feature scores together. Watson then computes the final confidence scores for the candidate answers by applying a series of Machine Learning models that weight all of the feature scores to produce the final confidence scores.

Evidence Feature Scores

Candidate Answers     Doc Rank   Pass Rank   Ty Cor   Geo   LFACS   Term Match   Temporal
General Foods         0          1           0.1      0     0.2     22           1
Post Foods            2          1           0.1      0     0.4     41           1
Battle Creek          1          2           0.8      1     0.5     30           0.9
Will Keith Kellogg    3                      0.1      0     0       23           0.5
Grand Rapids                                 0.9      1     0       10           0.5
1895                                         0.0      0     0       21           0.6

Machine Learning Model Application

Final Answers     Confidence
Battle Creek      0.946   (Correct Answer)
Post Foods        0.152
1895              0.040
Grand Rapids      0.033
General Foods     0.014

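The two steps on this slide, merging variants and then applying a model to the merged features, can be sketched in Python. Everything specific here is an assumption for illustration: merging by per-feature maximum, a single logistic model, and the made-up weights and bias. The talk only says that variants are merged and that a series of learned models produces the confidence.

```python
import math
from typing import Dict, List


def merge_variants(rows: List[Dict[str, float]]) -> Dict[str, float]:
    """Merge feature dicts of answer variants by taking each feature's maximum (assumed rule)."""
    merged: Dict[str, float] = {}
    for row in rows:
        for name, value in row.items():
            merged[name] = max(value, merged.get(name, value))
    return merged


def confidence(features: Dict[str, float], weights: Dict[str, float], bias: float) -> float:
    """Logistic combination of weighted feature scores."""
    z = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights; a real system would learn these from training data.
weights = {"tycor": 2.0, "geo": 1.5}
battle_creek = merge_variants([{"tycor": 0.8, "geo": 1.0}, {"tycor": 0.6, "geo": 1.0}])
general_foods = {"tycor": 0.1, "geo": 0.0}
print(confidence(battle_creek, weights, bias=-2.0))
print(confidence(general_foods, weights, bias=-2.0))
```

The numbers printed by this sketch are not meant to match the confidences on the slide; only the ordering (the well-supported candidate ranks higher) is the point.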
“Minimal” Deep QA Pipeline
Category: MICHIGAN MANIA
Clue: In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city

Question Analysis
– LAT: Michigan City

Primary Search: Document Search Results
Rank   Title
0      General Foods
1      Battle Creek
2      Post Foods
3      Will Keith Kellogg

Hypothesis Generation: Candidate Answers
– General Foods, Post Foods, Battle Creek

Hypothesis and Evidence Scoring: Evidence Features
Candidate Answers   Ty Cor   Geo
General Foods       0.1      0
Post Foods          0.1      0
Battle Creek        0.8      1

Final Confidence Merging & Ranking: Final Answers
Final Answers   Confidence
Battle Creek    0.946
Post Foods      0.152
1895            0.040

A new software paradigm emerging (not that we invented it)
• The basic Watson computation is Hypothesis Scoring
• How well does an answer fit into a question?
• More than 100 different Hypothesis scoring software components
• No single scoring component does the whole job
• Many of them do very similar jobs
• 12 typing components, 8 passage alignment components, 10 ngram components, …
• These components are not integrated with each other beyond that they each produce a score for each hypothesis
• A machine learning algorithm learns how to combine them to produce a final score
• The development methodology involved an incremental approach of producing stable baseline systems and testing changes with “follow-ons”
• Changes that improve performance according to our metrics are accepted into the next stable baseline

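To make the "learned combination" point concrete, here is a minimal Python sketch that learns weights for two independent scorers with plain logistic regression trained by gradient descent. The choice of logistic regression, the training data and all names are assumptions for illustration; the talk does not specify the learning algorithm at this level of detail.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(X: List[List[float]], y: List[int], steps: int = 500, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic-regression weights and bias by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def final_score(w: List[float], b: float, scores: List[float]) -> float:
    """Combine one hypothesis's scorer outputs into a single final score."""
    return sigmoid(b + sum(wj * sj for wj, sj in zip(w, scores)))


# Made-up training rows: [scorer_a, scorer_b] per hypothesis; label 1 means correct.
# Scorer A tracks correctness here, scorer B is uninformative.
X = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]]
y = [1, 1, 0, 0]
w, b = train(X, y)
print(final_score(w, b, [1.0, 0.0]), final_score(w, b, [0.0, 1.0]))
```

The scorers never need to know about each other: each contributes one column, and the learner decides how much each column is worth. A follow-on change would be kept only if a metric computed with the retrained model improves over the current baseline.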
Follow-on development
[Figure: follow-on development chart, annotated “+ ~10%”.]

Incremental Baselines
[Figure: Precision (0% to 100%) plotted against % Answered (0% to 100%), with one curve per stable baseline: Baseline, 12/2007, 5/2008, 8/2008, 12/2008, 5/2009, 10/2009, 4/2010, 11/2010.]

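One way to read the Precision vs. % Answered axes: sort questions by the system's confidence, answer only the most confident fraction, and measure precision on that fraction. A minimal Python sketch, assuming each result is a hypothetical `(confidence, was_correct)` pair; the sample data is made up and is not taken from the talk.

```python
from typing import List, Tuple


def precision_at_coverage(results: List[Tuple[float, bool]], fraction: float) -> float:
    """Precision over the most confident `fraction` of questions."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    n = max(1, round(fraction * len(ranked)))
    answered = ranked[:n]
    return sum(1 for _, correct in answered if correct) / n


# Made-up (confidence, was_correct) pairs, for illustration only.
toy = [(0.95, True), (0.90, True), (0.70, False), (0.60, True), (0.20, False)]
print(precision_at_coverage(toy, 0.4))  # answer only the top 40%
print(precision_at_coverage(toy, 1.0))  # answer everything
```

If confidence is well calibrated, precision falls as the fraction answered grows, which is why accurate confidence estimation matters as much as finding the right answer.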
ClassicQA: NOT The Technology Behind Watson
From the dawn of AI, it was envisioned that question answering would work by having a process that completely translated natural language (content & questions) into an unambiguous (logical) representation, and a reasoning process would run on that representation to produce answers. This vision has never been realized.

[Figure: ClassicQA pipeline with components Question, GOFNLP, Formal Query, Primary Search, Answer Sources, Formal Knowledge, Logical Reasoner, Answer & Confidence.]

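For contrast, the classic approach can be sketched in Python as a formal query over hand-built knowledge. The triples, predicate names and the `query` helper are hypothetical; the point of the sketch is that the approach only works when both the content and the question have already been translated into exactly matching formal terms.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical hand-built knowledge base; predicate names are made up.
KB: Set[Triple] = {
    ("Postum", "createdBy", "C.W. Post"),
    ("Postum", "createdIn", "Battle Creek"),
    ("Battle Creek", "locatedIn", "Michigan"),
}


def query(kb: Set[Triple], s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in kb
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o)
    )


print(query(KB, s="Postum", p="createdIn"))   # exact predicate: one answer
print(query(KB, s="Postum", p="inventedIn"))  # slightly different wording: nothing
```

The second call returns an empty list even though the knowledge is there, because the question was mapped to a predicate the knowledge base does not use.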
Into the Gap

[Figure, shown in four steps: a gap between Language and Knowledge. On the language side: NLP, with Recall, Precision and Mentions. On the knowledge side: Semantic Technology, with Scale, Brittleness and Acquisition.]
– Step 1: FAIL
– Step 2: No!
– Step 3: Knowledge is not the destination
– Step 4: IR, LF, NER, ML, Parsing, Crowds and SemTech, applied to Language, feed the Task (e.g. QA)

Using Structured Evidence
• Exploit wealth of freely available structured information
– e.g. Linked Open Data (LOD)
– Types, Relations, Links

• Complement results from unstructured text analysis
– Classic Precision vs. Recall Tradeoff

• Useful for explanation data
– Precise and reliable evidence (e.g. spatial / temporal constraint match)

Structured Data and Inference in Watson

Spatial Reasoning
– Containment (“This African country..”)
– Relative direction (“This sea east of Florida..”)
– Border (“This state bordering the Great Lakes..”)
– Relative location (“bldg. near Times Square..”)
– Numeric Properties: area/population/height (“This sea, largest in area,..”)

Temporal Reasoning
– Lifespan, Duration

Relation Detection and Scoring Using Structured KBs
– Q: “This 1997 Titanic hero..” matches <Dicaprio, lead-actor, Titanic>

Answer Typing (Type Coercion)
– LAT: Scottish Inventor; Answer: James Watt

Anti-Type Coercion
– LAT: Country; Candidate: Einstein

Answer In Clue
– Q: “In 2003, ‘Big Blue’ acquired this company..” → Downweight IBM

LAT Inference
– Q: “Annexation of this in 1803..” (Using PRISMATIC): “this” → Region

Evidence Diffusion
– Q: “Sunan Intl. Airport is in this country”: diffuse evidence from Pyongyang to N Korea

[Figure: the components above are shown attached to stages of the DeepQA architecture diagram.]

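Three of the structured-evidence checks listed above can be sketched as small independent scorers in Python. The data tables (`TRIPLES`, `TYPES`, `DISJOINT`, `ALIASES`) and function names are hypothetical stand-ins for a structured knowledge base; this is a sketch of the idea, not Watson's implementation.

```python
from typing import Dict, Set, Tuple

# Hypothetical structured data, for illustration only.
TRIPLES: Set[Tuple[str, str, str]] = {("DiCaprio", "lead-actor", "Titanic")}
TYPES: Dict[str, Set[str]] = {"Einstein": {"Person"}, "James Watt": {"Person", "Inventor"}}
DISJOINT: Set[Tuple[str, str]] = {("Person", "Country")}
ALIASES: Dict[str, str] = {"Big Blue": "IBM"}


def relation_score(candidate: str, relation: str, obj: str) -> float:
    """1.0 if the knowledge base holds the detected relation for this candidate."""
    return 1.0 if (candidate, relation, obj) in TRIPLES else 0.0


def anti_type_penalty(candidate: str, lat: str) -> float:
    """1.0 if a known type of the candidate is disjoint with the LAT."""
    return 1.0 if any((t, lat) in DISJOINT for t in TYPES.get(candidate, set())) else 0.0


def answer_in_clue(candidate: str, clue: str) -> bool:
    """True if the candidate, or one of its aliases, already appears in the clue."""
    names = {candidate} | {alias for alias, target in ALIASES.items() if target == candidate}
    lowered = clue.lower()
    return any(name.lower() in lowered for name in names)


clue = "In 2003, 'Big Blue' acquired this company"
print(relation_score("DiCaprio", "lead-actor", "Titanic"))
print(anti_type_penalty("Einstein", "Country"))
print(answer_in_clue("IBM", clue))
```

Each function produces one feature; whether a hit should raise or lower the final confidence is left to the learned model that combines the features.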
LOD Impact on DeepQA for Typing Answers
[Figure: chart for an ensemble of TyCor components, with an axis running from 61.5% to 66.5%, annotated “+ ~10%”.]

And the winner is… not human
[Figure: Precision (0% to 100%) plotted against % Answered (0% to 100%).]

Category: FATHERLY NICKNAMES
Clue: THIS FRENCHMAN WAS "THE FATHER OF BACTERIOLOGY"
Response: HOW TASTY WAS MY LITTLE FRENCHMAN

Category: THERE'S A FIRST TIME FOR EVERYTHING
Clue: IN 1824 THIS FIRST FOREIGNER TO ADDRESS A JOINT SESSION OF CONGRESS CONGRATULATED THE U.S. ON ITS GROWTH
Response: President Bush

Category: MUSIC
Clue: WHAT IS THE TEXT OF AN OPERA CALLED?
Response: Michael

Category: HAPPY MEALS
Clue: GRASSHOPPERS EAT PRIMARILY THIS
Response: Kosher

Category: OLYMPIC ODDITIES
Clue: It was the anatomical oddity of U.S. gymnast George Eyser, who won a gold medal on the parallel bars in 1904
Response: Had only one hand

CONFIRMED KEYNOTE:
TOM MALONE, MIT

PAPER DEADLINES:
MID JUNE

iswc2012.semanticweb.org

ESWC SS 2012 - Friday Keynote Chris Welty: Inside the Mind of Watson

  • 1. Inside the mind of Watson Chris Welty IBM Research ibmwatson.com Do Not Record. Do Not Distribute. © 2011 IBM Corporation
  • 2. The Core Technical Team* Researchers and Engineers in NLP, ML, IR, KR&R and CL at IBM Labs and a growing number of universities © 2011 IBM Corporation
  • 3. Automatic Open-Domain Question Answering A Long-Standing Challenge in Artificial Intelligence to emulate human expertise  Given – Rich Natural Language Questions – Over a Broad Domain of Knowledge  Deliver – – – – 3 Precise Answers: Determine what is being asked & give precise response Accurate Confidences: Determine likelihood answer is correct Consumable Justifications: Explain why the answer is right Fast Response Time: Precision & Confidence in <3 seconds © 2011 IBM Corporation
  • 4. What is Jeopardy?  Jeopardy! is an American quiz show – 1964 – Today  answer-and-question format – contestants are presented with clues in the form of answers – must phrase their responses in question form.  Example – Category: General Science – Clue: When hit by electrons, a phosphor gives off electromagnetic energy in this form – Answer: What is light? © 2011 IBM Corporation
  • 5. The Jeopardy! Challenge: hard for humans, hard for machines, but hard for different reasons. Requirements: broad/open domain, complex language, high precision, accurate confidence, high speed. For people, the challenge is knowing the answer. For machines, the challenge is understanding the question. Example clues: $200: "If you are looking at the wainscoting, you are looking in this direction." (What is down?) $1000: "The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author." (Who is D’Artagnan?) $600: "In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus." (What is cytoplasm?) $800: "The conspirators against this man were wounded by each other while they stabbed at him." (Who is Julius Caesar?)
  • 6. What It Takes to Compete against Top Human Jeopardy! Players. Our analysis reveals the Winner’s Cloud. Top human players are remarkably good. [Figure: each dot is an actual historical human Jeopardy! game. Labelled regions: Winning Human Performance, Grand Champion Human Performance, 2007 QA Computer System. Direction labels: More Confident, Less Confident.]
  • 7. What It Takes to Compete against Top Human Jeopardy! Players. Our analysis reveals the Winner’s Cloud. Computers? Not so good. In 2007, we committed to making a Huge Leap! [Figure: each dot is an actual historical human Jeopardy! game. Labelled regions: Winning Human Performance, Grand Champion Human Performance, 2007 QA Computer System. Direction labels: More Confident, Less Confident.]
  • 8. Welty’s Trident. (1) A new software paradigm is emerging: increasingly, computational tasks require inexact solutions that combine multiple methods in unpredictable ways. (2) Knowledge is not the destination: Watson does not answer a question by translating natural language input into formally represented knowledge and simply running queries against this knowledge. (3) Machine intelligence is not human intelligence: the difference is most notable in the mistakes they make.
  • 10. DeepQA: The Technology Behind Watson. An example of a new software paradigm. DeepQA generates and scores many hypotheses using an extensible collection of Natural Language Processing, Machine Learning and Reasoning algorithms. These gather and weigh evidence over both unstructured and structured content to determine the answer with the best confidence. Learned models help combine and weigh the evidence. [Diagram: Question → Question & Topic Analysis → Question Decomposition → Hypothesis Generation (Primary Search over Answer Sources, Candidate Answer Generation) → Hypothesis and Evidence Scoring (Answer Scoring, Evidence Retrieval over Evidence Sources, Deep Evidence Scoring) → Synthesis → Final Confidence Merging & Ranking (learned Models) → Answer & Confidence.]
  • 11. Example Question: "In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city." Question Analysis: Keywords: 1894, C.W. Post, created, …; Lexical Answer Type: (Michigan city); Date(1894); Relations: Create(Post, cereal drink), … Primary Search over related content (structured and unstructured) feeds Candidate Answer Generation: General Foods, 1895, Post Foods, Battle Creek, Grand Rapids, … Evidence Retrieval and Evidence Scoring attach a vector of feature scores to each candidate, for example [0.84 1 10.6 … 0.21]. Merging & Ranking then yields: 1) Battle Creek (0.85), 2) Post Foods (0.20), 3) 1895 (0.05), …
  • 12. Hypothesis Scoring. Category: MICHIGAN MANIA. Clue: "In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city." Answer scorers (TyCor, Temporal, Spatial, Popularity, …) can be applied depending on different relations or constraints detected in the question. For example, this question's focus with modifiers is "Michigan city." Watson can detect this as a geospatial relation that indicates the correct answer must be a city spatially located within the state of Michigan. Candidate answers with evidence feature scores (answer scoring + passage scoring), given as Doc Rank / Pass Rank / TyCor / Geo: General Foods 0 / 1 / 0.1 / 0; Post Foods 2 / 1 / 0.1 / 0; Battle Creek 1 / 2 / 0.8 / 1; Will Keith Kellogg 3 / (none) / 0.1 / 0; Grand Rapids (ranks not recoverable) / 0.9 / 1; 1895 (ranks not recoverable) / 0.0 / 0.
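
The geospatial constraint on this slide can be illustrated with a very small sketch. It assumes a hand-written lookup table in place of the structured geographic sources a real system would consult; `CITY_TO_STATE` and `geo_score` are hypothetical names chosen for this example and are not Watson's actual components.

```python
# Hypothetical mini-gazetteer (an assumption for illustration only);
# a real system would query structured geographic data instead.
CITY_TO_STATE = {
    "Battle Creek": "Michigan",
    "Grand Rapids": "Michigan",
}


def geo_score(candidate: str, required_state: str) -> int:
    """Return 1 if the candidate is a known city in the required state, else 0."""
    return 1 if CITY_TO_STATE.get(candidate) == required_state else 0


candidates = ["General Foods", "Post Foods", "Battle Creek", "Grand Rapids", "1895"]
geo_features = {c: geo_score(c, "Michigan") for c in candidates}
print(geo_features)
```

The resulting 0/1 values play the role of the Geo column in the table: the scorer does not pick an answer, it only contributes one feature per candidate.
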
  • 13. Passage Scoring. Category: MICHIGAN MANIA. Clue: "In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city." In Deep Evidence Scoring, Watson retrieves evidence for each candidate answer, then evaluates the evidence using a large number of deep evidence scoring analytics. The evidence for a candidate answer may come from the original document or passage where the candidate answer was generated, or it may come from an evidence retrieval search performed by taking the keyword search query from Step 2, replacing the focus terms with the candidate answer, and retrieving the relevant passages that are found. The passages, or "context", in which the candidate answer occurs are evaluated as evidence to support or refute the candidate answer as the correct answer for the question. Example passages retrieved for the candidates General Foods, Battle Creek and Post Foods: "1895: In Battle Creek, Michigan, C.W. Post made the first POSTUM, a cereal beverage. Post created GRAPE-NUTS cereal in 1897, and POST TOASTIES corn flakes in 1908"; "C.W. Post came to the Battle Creek sanitarium to cure his upset stomach. He later created Postum, a cereal-based coffee substitute"; "1854 C. W. Post (Charles William) was born. He founded the Postum Cereal Co. in 1895 (renamed General Foods Corp. in 1922) to manufacture Postum cereal beverage"; "General Foods' products go from breakfast (Post's cereals) to warm nightcaps (Postum, Sanka), also wash the pots and pans that its foods are cooked in (S.O.S. Scouring Pads"; "The company was incorporated in 1922, having developed from the earlier Postum Cereal Co. Ltd., founded by C.W. Post (1854-1914) in 1895 in Battle Creek, Mich. After a number of experiments, Post marketed his first product, the cereal beverage called Postum, in 1895"; "Post Foods, LLC, also known as Post Cereals (formerly Postum Cereals) was founded by C.W. Post. It began in 1895 with the first Postum, a 'cereal beverage', developed by Post in Battle Creek, Michigan. The first cereal, Grape-Nuts, was developed in 1897"; "It was named after C. W. Post, the founder of the Postum Cereal Company that later became General Foods. The cereal company unit was later sold off and is now Post Foods".
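
The evidence retrieval step described on this slide (take the keyword query, replace the focus terms with a candidate answer) might be sketched as follows. The function name `build_evidence_query` and the exact keyword list are assumptions made for illustration; the slide does not show how Watson builds its queries.

```python
from typing import List


def build_evidence_query(keywords: List[str], focus_terms: List[str], candidate: str) -> List[str]:
    """Drop the focus terms from the keyword query and add the candidate answer."""
    focus = {term.lower() for term in focus_terms}
    kept = [kw for kw in keywords if kw.lower() not in focus]
    return kept + [candidate]


# Illustrative keyword query for the clue; "Michigan" and "city" form the focus.
keywords = ["1894", "C.W. Post", "created", "warm", "cereal", "drink", "Postum", "Michigan", "city"]
query = build_evidence_query(keywords, ["Michigan", "city"], "Battle Creek")
print(query)
```

One such query would be issued per candidate, and the passages returned become the context that the passage scorers evaluate.
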
  • 14. Merging Candidate Answers and Scoring the Confidence. Category: MICHIGAN MANIA. Clue: "In 1894 C.W. Post created his warm cereal drink Postum in this …" In the final processing step, Watson detects variants of the same answer and merges their feature scores together. Watson then computes the final confidence scores for the candidate answers by applying a series of Machine Learning models that weight all of the feature scores to produce the final confidence scores. Candidate answers with evidence feature scores, given as Doc Rank / Pass Rank / TyCor / Geo / LFACS / Term Match / Temporal: General Foods 0 / 1 / 0.1 / 0 / 0.2 / 22 / 1; Post Foods 2 / 1 / 0.1 / 0 / 0.4 / 41 / 1; Battle Creek (the correct answer) 1 / 2 / 0.8 / 1 / 0.5 / 30 / 0.9; Will Keith Kellogg 3 / (none) / 0.1 / 0 / 0 / 23 / 0.5; Grand Rapids (ranks not recoverable) / 0.9 / 1 / 0 / 10 / 0.5; 1895 (ranks not recoverable) / 0.0 / 0 / 0 / 21 / 0.6. Final answers after machine learning model application, with confidence: Battle Creek 0.946; Post Foods 0.152; 1895 0.040; Grand Rapids 0.033; General Foods 0.014.
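
A minimal sketch of the two steps on this slide, merging answer variants and then turning feature scores into a confidence, is given below. Several things here are assumptions rather than Watson's method: the alias table `CANONICAL`, the choice to merge features by taking the maximum, the single logistic model, and the weights, which are invented for the example.

```python
import math
from typing import Dict, List

# Hypothetical alias table mapping surface variants to one canonical answer.
CANONICAL = {"Battle Creek, Michigan": "Battle Creek", "Battle Creek, MI": "Battle Creek"}


def merge_variants(scored: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Merge variants of the same answer, keeping the max of each feature (an assumption)."""
    merged: Dict[str, List[float]] = {}
    for answer, features in scored.items():
        key = CANONICAL.get(answer, answer)
        if key in merged:
            merged[key] = [max(a, b) for a, b in zip(merged[key], features)]
        else:
            merged[key] = list(features)
    return merged


def confidence(features: List[float], weights: List[float], bias: float) -> float:
    """Logistic combination of feature scores into a value between 0 and 1."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Features are [TyCor, Geo]; all values and weights are illustrative only.
scored = {
    "General Foods": [0.1, 0.0],
    "Post Foods": [0.1, 0.0],
    "Battle Creek": [0.8, 0.0],
    "Battle Creek, Michigan": [0.6, 1.0],
}
merged = merge_variants(scored)
weights, bias = [2.0, 3.0], -3.0
ranked = sorted(merged, key=lambda a: confidence(merged[a], weights, bias), reverse=True)
print(ranked[0], merged["Battle Creek"])
```

Merging before scoring matters because evidence split across spelling variants would otherwise dilute the confidence of what is really a single answer.
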
  • 15. “Minimal” DeepQA Pipeline. Category: MICHIGAN MANIA. Clue: "In 1894 C.W. Post created his warm cereal drink Postum in this Michigan city." Question Analysis: LAT: Michigan city. Primary Search: document search results by rank (R) and title: 0 General Foods; 1 Battle Creek; 2 Post Foods; 3 Will Keith Kellogg. Hypothesis Generation: candidate answers General Foods, Post Foods, Battle Creek. Hypothesis and Evidence Scoring: evidence features TyCor / Geo: General Foods 0.1 / 0; Post Foods 0.1 / 0; Battle Creek 0.8 / 1. Final Confidence Merging & Ranking: final answers with confidence: Battle Creek 0.946; Post Foods 0.152; 1895 0.040.
  • 16. A new software paradigm emerging (not that we invented it). The basic Watson computation is hypothesis scoring: how well does an answer fit into a question? There are more than 100 different hypothesis scoring software components. No single scoring component does the whole job, and many of them do very similar jobs: 12 typing components, 8 passage alignment components, 10 n-gram components, … These components are not integrated with each other beyond that they each produce a score for each hypothesis; a machine learning algorithm learns how to combine them to produce a final score. The development methodology involved an incremental approach of producing stable baseline systems and testing changes with “follow-ons”: changes that improve performance according to our metrics are accepted into the next stable baseline.
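
The idea that independent components each emit a score and a learned model combines them can be sketched with plain logistic regression. This is only an illustration under stated assumptions: the slide does not say which learning algorithm is used, the two feature columns stand in for the output of two hypothetical scoring components, and the training rows are made up.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (component scores, 1 if hypothesis correct else 0)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Combine independent component scores into one final score."""
    return sigmoid(bias + sum(w * f for w, f in zip(weights, features)))


def train(examples: List[Example], epochs: int = 500, lr: float = 0.1) -> Tuple[List[float], float]:
    """Fit logistic-regression weights by batch gradient ascent on the log-likelihood."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in examples:
            error = label - score(features, weights, bias)
            for i, f in enumerate(features):
                grad_w[i] += error * f
            grad_b += error
        weights = [w + lr * g for w, g in zip(weights, grad_w)]
        bias += lr * grad_b
    return weights, bias


# Made-up training rows: [typing score, geo score] from two hypothetical components.
examples: List[Example] = [
    ([0.8, 1.0], 1), ([0.1, 0.0], 0),
    ([0.9, 1.0], 1), ([0.2, 0.0], 0),
    ([0.7, 1.0], 1), ([0.1, 1.0], 0),
]
weights, bias = train(examples)
print(score([0.8, 1.0], weights, bias) > score([0.1, 0.0], weights, bias))
```

Because the components share nothing but this score interface, adding a new scorer in this sketch only means adding a column and retraining, which fits the follow-on methodology described on the slide.
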
  • 17. Follow-on development: + ~10%
  • 20. ClassicQA: NOT The Technology Behind Watson. From the dawn of AI, it was envisioned that question answering would work by having a process that completely translated natural language (content and questions) into an unambiguous (logical) representation, and a reasoning process would run on that representation to produce answers. This vision has never been realized. [Diagram labels: Question, Answer Sources, Primary Search, GOFNLP, Formal Query, Formal Knowledge, Logical Reasoner, Answer & Confidence.]
  • 23. Into the Gap: knowledge is not the destination. [Diagram labels: Language, Knowledge, Scale, Recall, Semantic Technology, NLP, Precision, Mentions, Brittleness, Acquisition.]
  • 25. Using Structured Evidence. Exploit the wealth of freely available structured information, e.g. Linked Open Data (LOD): types, relations, links. Complement results from unstructured text analysis: the classic precision vs. recall tradeoff. Useful for explanation data: precise and reliable evidence (e.g. spatial/temporal constraint match).
  • 26. Structured Data and Inference in Watson. Spatial Reasoning: containment (“This African country..”); relative direction (“This sea east of Florida..”); border (“This state bordering the Great Lakes..”); relative location (“bldg. near Times Square..”). Numeric Properties: area/population/height (“This sea, largest in area,..”). Temporal Reasoning: lifespan, duration. Relation Detection and Scoring Using Structured KBs: Q: “This 1997 Titanic hero..” matches <Dicaprio, lead-actor, Titanic>. Answer Typing (Type Coercion): LAT: Scottish Inventor; Answer: James Watt. Anti-Type Coercion: LAT: Country; Candidate: Einstein. Answer In Clue: Q: “In 2003, ‘Big Blue’ acquired this company..” → down-weight IBM. LAT Inference (using PRISMATIC): Q: “Annexation of this in 1803..”; “this” → Region. Evidence Diffusion: Q: “Sunan Intl. Airport is in this country”; diffuse evidence from Pyongyang to N Korea. [Diagram: the DeepQA pipeline stages, from Question & Topic Analysis through Final Confidence Merging & Ranking to Answer & Confidence.]
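
The "Answer In Clue" item can be illustrated with a short sketch: a candidate that already appears in the clue, directly or under an alias, is unlikely to be the answer, so its score is reduced. The alias table, the penalty factor of 0.5, the starting scores and the second candidate are all assumptions made for this example; the slide only states that IBM is down-weighted.

```python
from typing import Dict, List

# Hypothetical alias table; the slide implies only that "Big Blue" refers to IBM.
ALIASES: Dict[str, List[str]] = {"IBM": ["IBM", "Big Blue"]}


def answer_in_clue_penalty(clue: str, candidate: str, penalty: float = 0.5) -> float:
    """Return a multiplier below 1 when the candidate or one of its aliases occurs in the clue."""
    names = ALIASES.get(candidate, [candidate])
    clue_lower = clue.lower()
    return penalty if any(name.lower() in clue_lower for name in names) else 1.0


clue = "In 2003, 'Big Blue' acquired this company"
# Illustrative candidates with made-up starting scores.
scores = {"IBM": 0.6, "Rational Software": 0.5}
adjusted = {c: s * answer_in_clue_penalty(clue, c) for c, s in scores.items()}
print(adjusted)
```

In a pipeline like the one on slide 10, such a signal would more plausibly be one feature among many than a hard multiplier; the multiplier is used here only to keep the sketch short.
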
  • 27. LOD Impact on DeepQA for Typing Answers: + ~10%. [Chart: an ensemble of TyCor components; axis ticks run from 61.5% to 66.5%.]
  • 29. And the winner is… not human. [Chart: Precision (0% to 100%) plotted against % Answered (0% to 100%).]
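
A precision versus % answered curve of the kind charted here can be computed by ranking questions by confidence and answering only the most confident fraction. The sketch below assumes that reading of the axes; the function name and the five (confidence, correct) pairs are invented for illustration and are not Watson results.

```python
from typing import List, Tuple


def precision_at_answered(results: List[Tuple[float, bool]], fraction: float) -> float:
    """Precision over the most confident `fraction` of questions."""
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    n = max(1, round(fraction * len(ranked)))
    return sum(1 for _, correct in ranked[:n] if correct) / n


# Made-up (confidence, was the top answer correct?) pairs.
results = [(0.95, True), (0.90, True), (0.70, False), (0.60, True), (0.30, False)]
curve = [(f, precision_at_answered(results, f)) for f in (0.2, 0.4, 0.6, 0.8, 1.0)]
for fraction, precision in curve:
    print(f"{fraction:.0%} answered -> precision {precision:.2f}")
```

A system with well-calibrated confidence shows precision falling as it is forced to answer more questions, which is why accurate confidence matters as much as accurate answers.
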
  • 30. IBM Research. Response: HOW TASTY WAS MY LITTLE FRENCHMAN. Category: FATHERLY NICKNAMES. Clue: THIS FRENCHMAN WAS “THE FATHER OF BACTERIOLOGY”. © 2009 IBM Corporation
  • 31. Response: President Bush. Category: THERE'S A FIRST TIME FOR EVERYTHING. Clue: IN 1824 THIS FIRST FOREIGNER TO ADDRESS A JOINT SESSION OF CONGRESS CONGRATULATED THE U.S. ON ITS GROWTH.
  • 32. Response: Michael. Category: MUSIC. Clue: WHAT IS THE TEXT OF AN OPERA CALLED?
  • 33. Response: Kosher. Category: HAPPY MEALS. Clue: GRASSHOPPERS EAT PRIMARILY THIS.
  • 34. Category: OLYMPIC ODDITIES. Response: Had only one hand. Clue: It was the anatomical oddity of U.S. gymnast George Eyser, who won a gold medal on the parallel bars in 1904.
  • 36. CONFIRMED KEYNOTE: TOM MALONE, MIT. PAPER DEADLINES: MID JUNE. iswc2012.semanticweb.org