How can we compare unstructured,
structured and self-structured knowledge
representation?
Fabio Petroni
25 June 2020
Unstructured and Structured KBs @
Knowledge Representation
[Diagram: a spectrum of knowledge representations running from soft to hard, with the regions Unstructured, Self-Structured and Structured. Elements shown: large-scale textual corpora, machine, key-value memory, knowledge graph.]
How can we compare these approaches (+combinations)?
downstream tasks
Current NLP benchmarks
e.g., natural language inference
s1: At the other end of
Pennsylvania Avenue,
people began to line up for
a White House tour.
s2: People formed a
line at the end of
Pennsylvania Avenue.
entailment
- focus on reading comprehension
- emergence of general architectures (e.g., BERT)
- local information is sufficient to solve the task
...
Knowledge Intensive NLP tasks
- require seeking knowledge in a large body of documents to be solved, even for humans
Knowledge Source: Unstructured + Structured
Knowledge Intensive NLP tasks
1 - Slot Filling
2 - Entity Linking
3 - Open Domain QA
4 - Fact Checking
5 - Factual Generations
Knowledge Intensive NLP task 1 - Slot Filling
Giacomo Tedesco
date of birth
place of birth
occupation
position played on team / speciality
TAC-KBP challenges
McNamee and Dang, 2009; Ji et al., 2010; Surdeanu, 2013; Surdeanu and Ji, 2014
collect information on certain relations (or slots) of
entities from large collections of natural language text
Knowledge Intensive NLP task 1 - Slot Filling
several ways to approach the problem
- structured query: <Giacomo Tedesco, position played on team>
- cloze-style question: Giacomo Tedesco plays in _____ position .
- natural question: What position does Giacomo Tedesco play?
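The three query formulations on this slide can be sketched in a few lines of Python. This is only an illustration: the helper names and the string templates are assumptions made for the example, not part of any benchmark or library.

```python
# Hypothetical helpers showing three ways to ask for the same slot value.

def structured_query(subject, relation):
    # A structured query is just a (subject, relation) pair.
    return (subject, relation)

def cloze_question(subject, template, mask="_____"):
    # A cloze-style question leaves a blank for the answer.
    return template.format(subject=subject, mask=mask)

def natural_question(subject, template):
    # A natural question phrases the same request as a question.
    return template.format(subject=subject)

subject = "Giacomo Tedesco"
relation = "position played on team"

queries = {
    "structured query": structured_query(subject, relation),
    "cloze-style question": cloze_question(
        subject, "{subject} plays in {mask} position ."
    ),
    "natural question": natural_question(
        subject, "What position does {subject} play?"
    ),
}

for kind, query in queries.items():
    print(f"{kind}: {query}")
```

Each formulation targets the same slot, so a system's answers can be compared across query types.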
Slot Filling (Petroni et al, 2019-2020)
- single token answers
- T-REx (Elsahar et al, 2018)
- Google-RE: https://code.google.com/archive/p/relation-extraction-corpus/
- https://github.com/facebookresearch/LAMA
[Figure: bar chart of slot-filling accuracy (axis 0 to 100) for RE, RE-ora, BERT, DrQA, BERT-ret, BERT-ora and Wikidata. Group labels: automatic KG (structured), self-structured, indexed, human curated (structured). Query types: structured query, cloze-style question, enriched cloze-style question, natural question. Annotations: "unstructured solution = read all Wikipedia to predict"; soft vs. hard; "hardcoded rules for what knowledge is"; "KB in-parameters". Sources: Sorokin and Gurevych (2017); Petroni et al, 2019-2020.]
LAMA limitations
- Single token might favour BERT
- Wikidata gets all answers right
- Explainability not assessed: both open- and closed-book models can get the answer for the wrong reason
Example: Giacomo Tedesco plays in _____ position .
answer: midfielder
provenance: evidence from the knowledge source
Knowledge Intensive NLP task 2 - Entity Linking
Entity linking is the task of assigning a unique identity to entities mentioned in text.
Example: The most comprehensive photographic handbook for mushroom is authored by Michael Jordan.
Linked entity: wiki/Michael_Jordan_(mycologist)
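dense retrieval with MIPS and bi-encoder
[Diagram: the mention "[…] authored by [SE] Michael Jordan [EE]." and Wikipedia entity descriptions ("is an English mycologist" for Michael Jordan (m), "is former basketball player" for Michael Jordan) are encoded by a BI-ENCODER into a dense space and matched with MIPS. Other points shown: Tiger, Ferrari, Domus Aurea, Carriage, Moon, Jaguar, Dog, Lake Avernus, Mannheim, Diego Maradona, … The dense space over Wikipedia holds 5.9M points.]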
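The retrieval step named on this slide (a bi-encoder followed by maximum inner product search) can be sketched as follows. This is a toy illustration under strong assumptions: the encoder is a bag-of-words counter standing in for a learned neural encoder, the entity descriptions are made up for the example, and the search is brute force rather than an approximate MIPS index.

```python
from collections import Counter

def encode(text, vocab):
    # Toy encoder (assumption): bag-of-words counts over a fixed vocabulary.
    # A real bi-encoder would use two learned neural encoders instead.
    counts = Counter(text.lower().replace(".", " ").split())
    return [float(counts[word]) for word in vocab]

def inner_product(u, v):
    return sum(a * b for a, b in zip(u, v))

def mips(query_vec, index):
    # Brute-force maximum inner product search over all indexed vectors.
    scores = [inner_product(query_vec, vec) for vec in index]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Hypothetical entity descriptions, written only for this example.
entities = {
    "Michael_Jordan_(mycologist)": "English mycologist and author of a mushroom handbook",
    "Michael_Jordan": "former basketball player",
}
vocab = sorted({w for desc in entities.values() for w in desc.lower().split()})
names = list(entities)
index = [encode(entities[name], vocab) for name in names]

mention = ("The most comprehensive photographic handbook for mushroom "
           "is authored by [SE] Michael Jordan [EE].")
best, scores = mips(encode(mention, vocab), index)
print(names[best], scores)
```

The entity side is encoded once and indexed, so only the mention needs encoding at query time; with millions of points the brute-force loop would be replaced by a dedicated MIPS index.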
TAC-KBP 2010 (Ji et al. 2010)

System                       TAC-KBP ks (~700K entities)   all Wikipedia (~5.9M entities)
He et al. (2013)             81.0                          -
Sun et al. (2015)            83.9                          -
Yamada et al. (2016)         85.5                          -
Globerson et al. (2016)      87.2                          -
Sil et al. (2018)            87.4                          -
Nie et al. (2018)            89.1                          -
Raiman and Raiman (2018)     90.9                          -
Cao et al. (2018)            91.0                          -
Gillick et al. (2019)        87.0                          -
Wu et al. (2019)             94.5                          92.8
Févry et al (2020)           94.9                          91.4

Slide labels: dense retrieval, bi-encoder
- each dataset defines a different set of candidate entities
- uniform candidate set - whole Wikipedia: real-world scenario
Knowledge Intensive NLP task 3 - Open Domain QA
What's the highest mountain in Europe?
- TriviaQA (Joshi et al., 2017)
- HotpotQA (Yang et al., 2018)
- Natural Questions (Kwiatkowski et al., 2019)
- ELI5 (Fan et al., 2019)
- ...
dense retrieval with MIPS and bi-encoder
[Diagram: the question "What's the highest mountain in Europe?" and Wikipedia passages ("is the second-highest mountain" for Mont Blanc p1, "Mount Elbrus is a dormant volcano" for Mount Elbrus p1) are encoded by a BI-ENCODER into a dense space and matched with MIPS. Other passages shown: Tiger p5, Ferrari p1, Domus Aurea p5, Carriage p4, Moon p2, Jaguar p4, Dog p6, Lake Avernus p2, Mannheim p1, Diego Maradona p1, … The dense space over Wikipedia holds 21M points.]
Exact Match

Model                          Natural Questions (open dev)   TriviaQA (official test)
Roberts et al. 2020, T5        36.6                           60.5
Guu et al. 2020, REALM         40.4                           -
Karpukhin et al. 2020, DPR     41.5                           -
Lewis et al. 2020, RAG         44.5                           68

Slide labels: Self-Structured; Unstructured + Structured + Self-Structured
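For readers unfamiliar with the metric, Exact Match can be sketched as below. The normalisation (lowercasing, stripping punctuation and English articles, collapsing whitespace) follows a convention commonly used in QA evaluation; whether each cited paper applies exactly these steps is an assumption, and the predictions and references are made-up toy data.

```python
import re
import string

def normalize(text):
    # Common QA answer normalisation (assumed here, not taken from the slides).
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold_answers):
    # A prediction counts if it equals any gold answer after normalisation.
    return any(normalize(prediction) == normalize(gold) for gold in gold_answers)

def exact_match_score(predictions, references):
    # Percentage of questions answered with an exact match.
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)

# Toy data for illustration only.
predictions = ["The Mount Elbrus.", "Mont Blanc"]
references = [["Mount Elbrus", "Elbrus"], ["Mount Elbrus"]]
score = exact_match_score(predictions, references)
print(score)
```

Note that the metric only inspects the answer string, so it says nothing about whether the supporting evidence was correct.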
Limitations
- different split of the data
- explainability not assessed: both open- and closed-book models can get the answer for the wrong reason
Example: What's the highest mountain in Europe?
answer: Mount Elbrus
provenance: evidence from the knowledge source (text, lists, tables, images)
Knowledge Intensive NLP task 4 - Fact Checking
FEVER (Thorne et al., 2018-2019)
claim: Lorelai Gilmore's father is named Robert.
answer (3-way classification): SUPPORTS, REFUTES or NOT ENOUGH INFO
provenance
Label Accuracy

Model                           3-way   2-way
Zhong et al. 2020, DREAM        76.8    -
Thorne et al. 2020, RoBERTa     -       92.2*
Lewis et al. 2019, BART         64.0    81.1
Lewis et al. 2020, RAG          72.5    89.5

* with oracle evidence
Discussion
- FEVER score: an elegant way to combine explainability and downstream performance; points for accuracy are awarded only if the correct evidence is found
- a ton of manual annotations
- FEVER is an artificial task; fact-checking in the real world is another game
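The idea of awarding accuracy points only when the correct evidence is found can be sketched as below. This is a simplified illustration, not the official scorer: the function names, the instance layout and the evidence identifiers are assumptions, and NOT ENOUGH INFO claims are assumed to need no evidence.

```python
def is_strictly_correct(pred_label, pred_evidence, gold_label, gold_evidence_sets):
    # The label must be right first.
    if pred_label != gold_label:
        return False
    # Assumption: claims without enough information need no evidence.
    if gold_label == "NOT ENOUGH INFO":
        return True
    # The predicted evidence must fully cover at least one gold evidence set.
    predicted = set(pred_evidence)
    return any(set(gold) <= predicted for gold in gold_evidence_sets)

def strict_score(instances):
    return sum(is_strictly_correct(**inst) for inst in instances) / len(instances)

def label_accuracy(instances):
    hits = sum(inst["pred_label"] == inst["gold_label"] for inst in instances)
    return hits / len(instances)

# Made-up instances; evidence items are hypothetical (page, sentence index) pairs.
instances = [
    {"pred_label": "REFUTES", "pred_evidence": [("Lorelai_Gilmore", 3)],
     "gold_label": "REFUTES", "gold_evidence_sets": [[("Lorelai_Gilmore", 3)]]},
    {"pred_label": "SUPPORTS", "pred_evidence": [("Some_Page", 0)],
     "gold_label": "SUPPORTS", "gold_evidence_sets": [[("Other_Page", 1)]]},
    {"pred_label": "NOT ENOUGH INFO", "pred_evidence": [],
     "gold_label": "NOT ENOUGH INFO", "gold_evidence_sets": []},
    {"pred_label": "SUPPORTS", "pred_evidence": [("Some_Page", 0)],
     "gold_label": "REFUTES", "gold_evidence_sets": [[("Some_Page", 0)]]},
]

print(label_accuracy(instances), strict_score(instances))
```

In the toy data the second instance has the right label but the wrong evidence, so it counts towards label accuracy but not towards the strict score.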
Knowledge Intensive NLP task 5 - Factual Generations
GPT2 generation
Massarelli et al. (2019)
Princess Margaret, Countess of Snowdon, (Margaret Rose 21
August 1930 - 9 February 2002) was the younger daughter of
King George VI and Queen Elizabeth The Queen Mother and
the only sibling of Queen Elizabeth II.
She married Antony Armstrong-Jones, a photographer, in 1960.
It was the first marriage for the Queen and the first for Prince
Philip, Duke of Edinburgh.
After divorcing Armstrong-Jones in 1978, she married Group
Captain Peter Townsend in June that same year.
She died at the age of 71 on 9 February 2002.
Why did Princess Margaret marry Antony Armstrong-Jones?
prompt
Delayed
Beam
Search
up to 64% of generated sentences with claims are SUPPORTED
Jeopardy Question Generation
Input: The Divine Comedy
BART: This epic poem by Dante is divided into three parts: the Inferno, The Purgatorio & the Purgatorio
RAG: This 14th Century work is divided into 3 sections: “inferno”, “Purgatorio” & “Paradiso”

human evaluation, Lewis et al. (2020)

               Factuality   Specificity
BART better    7.1%         16.8%
RAG better     42.7%        37.4%
both good      11.7%        11.8%
both poor      17.7%        6.9%
no majority    20.8%        20.1%
Jeopardy Question Generation
Input: Hemingway
RAG: “The Sun Also Rises” is a novel by this author of “A Farewell to Arms”
Document 1: his works are considered classics of American literature ... His wartime experiences formed the basis for his novel “A Farewell to Arms” (1929) ...
Document 2: ... artists of the 1920s “Lost Generation” expatriate community. His debut novel, “The Sun Also Rises”, was published in 1926.
[Figure: heatmap with rows Doc 1 to Doc 5 and one column per generated token: BOS ” The Sun Also Rises ” is a novel by this author of ” A Farewell to Arms ”]
Lewis et al. (2020)
Interaction Between Parametric / Non-Parametric Knowledge
Retrieved documents cue correct responses from BART:
- Feed BART with input Hemingway and partial decoding “The Sun
  Completion: “The Sun also Rises” is a novel by this author of “the Sun Also Rises”
- Feed BART with input Hemingway and partial decoding “The Sun Also Rises” is a novel by this author of “A
  Completion: “The Sun also Rises” is a novel by this author of “A Farewell to Arms”
Lewis et al. (2020)
Conclusion
- Encoding units of text with an LM seems a really promising way to build knowledge bases
- We should use a variegated set of knowledge intensive language tasks to evaluate knowledge representation
- The ultimate Knowledge Intensive task: can a model read the web and autonomously write an encyclopedia?
THANK YOU
@Fabio_Petroni
