1. What knowledge bases know
(and what they don't)
Simon Razniewski
Free University of Bozen-Bolzano, Italy
Max Planck Institute for Informatics
(starting November 2017)
2. About myself
• Assistant professor at FU Bozen-Bolzano, South Tyrol, Italy (since 2014)
• PhD from FU Bozen-Bolzano (2014)
• Diplom from TU Dresden, Germany (2010)
• Research visits at UCSD (2012), AT&T Labs-Research (2013),
UQ (2015), MPII (2016)
(Slide images about South Tyrol: a trilingual region, site of the Alps’ oldest criminal case – Ötzi, and grower of 1/8th of the EU’s apples)
3. What do knowledge bases know?
What is a knowledge base?
A collection of general world knowledge
• Common sense:
• Apples are sweet or sour,
• Cats are smaller than cars
• Activities:
• “whisper” and “shout” are implementations of “talk”
• Facts:
• Saarbrücken is the capital of the Saarland
• Ötzi has blood type O
4. Factual KBs: An old dream of AI
• Early manual efforts (CYC, 1980s)
• Structured extraction (YAGO, DBpedia, 2000s)
• Text mining and extraction (NELL, Prospera,
TextRunner, 2000s)
• Back to the roots: Wikidata (2012)
6. KBs are useful (1/2): QA
Q: What is the capital of the Saarland?
Try yourself:
• When was Trump born?
• What is the nickname of Ronaldo?
• Who invented the light bulb?
7. KBs are useful (2/2): Language Generation
• Wikipedia in the world’s most spoken language:
1/10 as many articles as the English Wikipedia
• In the world’s fourth most spoken language: 1/100
Wikidata is intended to help
resource-poor languages
8. KB construction: Current state
• More than 2300 papers with titles containing
“information extraction” in the last 4 years [Google Scholar]
• Large KBs at Google, Microsoft, Alibaba, Bloomberg, …
• Progress visible downstream
• IBM Watson beats humans in trivia game in 2011
• Entity linking systems close to human performance on
popular news corpora
• Systems pass 8th grade science tests
in the AllenAI Science challenge in 2016
• But how good are the KBs themselves?
9. How good are the KBs that we build?
Is what they know true?
(precision or correctness)
Do they know what is true?
(recall or completeness)
10. KBs know much of what is true
• Google Knowledge Graph: 39 out of 48 Tarantino movies
• DBpedia: 167 out of 204 Nobel laureates in Physics
• Wikidata: 2 out of 2 children of Obama
12. KBs know little of what is true
• DBpedia: contains 6 out of 35 Dijkstra Prize winners
• Google Knowledge Graph: “Points of Interest” – completeness?
• Wikidata does not know much about the employees here
14. What previous work says
[Dong et al., KDD 2014]
There are known knowns; there are
things we know we know. We also
know there are known unknowns;
that is to say we know there are some
things we do not know. But there are
also unknown unknowns – the ones
we don't know we don't know.
KB engineers have only tried to
make KBs bigger. The point,
however, is to understand what
they are trying to approximate.
15. Outline – Assessing KB recall
1. Logical foundations
2. Rule mining
3. Information extraction
4. Data presence heuristic
16. Outline – Assessing KB recall
1. Logical foundations
2. Rule mining
3. Information extraction
4. Data presence heuristic
17. Closed- and open-world assumption

worksIn
Name   Department
John   D1
Mary   D2
Bob    D3

                      Closed-world assumption   Open-world assumption
worksIn(John, D1)?    Yes                       Yes
worksIn(Ellen, D3)?   No                        Maybe

• (Relational) databases traditionally employ the closed-world assumption
• KBs necessarily operate under the open-world assumption
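As a minimal illustration (using the worksIn table above, with invented names), the two assumptions differ only in how they treat facts that are absent from the data:

```python
# Minimal sketch: answering a membership query under the closed- vs. open-world
# assumption. The relation instance mirrors the worksIn table on this slide.
works_in = {("John", "D1"), ("Mary", "D2"), ("Bob", "D3")}

def answer_cwa(fact):
    # Closed world: everything not stated is false.
    return "Yes" if fact in works_in else "No"

def answer_owa(fact):
    # Open world: absence of a fact only means "unknown".
    return "Yes" if fact in works_in else "Maybe"

for fact in [("John", "D1"), ("Ellen", "D3")]:
    print(fact, "CWA:", answer_cwa(fact), "OWA:", answer_owa(fact))
```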
18. Open-world assumption
• Q: Hamlet written by Goethe?
KB: Maybe
• Q: Schwarzenegger lives in Dudweiler?
KB: Maybe
• Q: Trump brother of Kim Jong Un?
KB: Maybe
Open-world assumption often too cautious
19. Teaching KBs to say “no”
• Need the power to express both maybe and no
= Partially closed-world assumption
• Approach: Completeness statements [Motro 1989]

Completeness statement: worksIn is complete for employees of D1

worksIn
Name   Department
John   D1
Mary   D2
Bob    D3

worksIn(John, D1)?    Yes
worksIn(Ellen, D1)?   No
worksIn(Ellen, D3)?   Maybe
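A minimal sketch of such a partially closed world, reusing the toy worksIn data: the single completeness statement for D1 is what licenses the “No” answer, while everything else stays open-world.

```python
# Minimal sketch: a partially closed world. A completeness statement says that
# the KB lists *all* employees of a given department; for those departments we
# may answer "No", elsewhere only "Maybe".
works_in = {("John", "D1"), ("Mary", "D2"), ("Bob", "D3")}
complete_for = {"D1"}          # completeness statement: worksIn is complete for D1

def answer(person, dept):
    if (person, dept) in works_in:
        return "Yes"
    if dept in complete_for:   # closed-world reasoning is licensed here
        return "No"
    return "Maybe"             # open-world reasoning everywhere else

for query in [("John", "D1"), ("Ellen", "D1"), ("Ellen", "D3")]:
    print(query, "->", answer(*query))
```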
20. Completeness statements
• Assertions that the available database contains
all information on a certain topic
“worksIn is complete for employees of D1”
• Form constraints between an ideal database and
the available database
∀x: worksInⁱ(x, D1) → worksInᵃ(x, D1)   (i = ideal, a = available)
• Can have expressivity ranging from simple
selections up to first-order logic
21. If you have completeness statements
you can do wonderful things…
• Develop techniques for deciding whether a
conjunctive query answer is complete [VLDB 2011]
• Assign unambiguous semantics to SQL nulls
[CIKM 2012]
• Create an algebra for propagating completeness
[SIGMOD 2015]
• Ensure the soundness of queries with negation
[ICWE 2016]
• ….
22. Where would completeness
statements come from?
• Data creators should pass them along as metadata
• Or editors should add them in curation steps
• Developed plugin and external tool COOL-WD
(Completeness tool for Wikidata)
24. But…
• Requires human effort
• Editors are lazy
• Automatically created KBs do not even have editors
Remainder of this talk:
How to automatically acquire information
about KB completeness/recall
25. Outline – Assessing KB recall
1. Logical foundations
2. Rule mining
3. Information extraction
4. Data presence heuristic
26. Rule mining: Idea (1/2)
Certain patterns in data hint at completeness/incompleteness
• People with a death date but no death place are incomplete for death place
• Movies with a producer are complete for directors
• People with fewer than two parents are incomplete for parents
27. Rule mining: Idea (2/2)
• Examples can be expressed as Horn rules:
dateOfDeath(X, Y) ∧ lessThan1(X, placeOfDeath)
⇒ incomplete(X, placeOfDeath)
movie(X) ∧ producer(X, Z) ⇒ complete(X, director)
lessThan2(X, hasParent) ⇒ incomplete(X, hasParent)
Can such patterns be discovered
with association rule mining?
28. Rule mining: Implementation
• We extended the AMIE association rule mining system
with predicates on
• Complete/incomplete: complete(X, director)
• Object counts: lessThan2(X, hasParent)
• Popularity: popular(X)
• Negated classes: person(X) ∧ ¬adult(X)
• Then mined rules with complete/incomplete in the head
for 20 YAGO/Wikidata relations
• Result: Can predict (in-)completeness
with 46-100% F-score
[Galárraga et al., WSDM 2017]
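To make the idea concrete, here is a toy sketch of the mining step (not the AMIE extension of the paper): a candidate rule body such as movie(x) ∧ producer(x, ·) is scored by support and confidence against gold (in)completeness labels. All entities, labels, and numbers are invented.

```python
# Toy sketch of mining completeness rules (not the AMIE extension itself):
# score candidate rules "body(x) => complete(x, property)" by support and
# confidence over entities with gold completeness labels. Data is invented.
kb = {
    "e1": {"classes": {"movie"}, "producer": ["p1"], "director": ["d1"]},
    "e2": {"classes": {"movie"}, "producer": ["p2"], "director": ["d2"]},
    "e3": {"classes": {"movie"}, "producer": [],     "director": []},
}
gold_complete = {("e1", "director"): True, ("e2", "director"): True,
                 ("e3", "director"): False}

def rule_movie_with_producer(entity):
    # Candidate body: movie(x) AND producer(x, _)
    return "movie" in kb[entity]["classes"] and kb[entity]["producer"]

def score(body, prop):
    covered = [e for e in kb if body(e) and (e, prop) in gold_complete]
    support = len(covered)
    correct = sum(gold_complete[(e, prop)] for e in covered)
    confidence = correct / support if support else 0.0
    return support, confidence

print(score(rule_movie_with_producer, "director"))  # support and confidence
```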
29. Rule mining: Challenges
• Consensus among conflicting rules:
human(x) ⇒ Complete(x, graduatedFrom)
schoolteacher(x) ⇒ Incomplete(x, graduatedFrom)
professor(x) ⇒ Complete(x, graduatedFrom)
John ∈ {human, schoolteacher, professor}
⇒ Complete(John, graduatedFrom)?
• Rare properties require very large training data
• E.g., monks being complete for spouses
• Annotated ~3000 rows at 10 ct/row → 0 monks
30. Outline – Assessing KB recall
1. Logical foundations
2. Rule mining
3. Information extraction
4. Data presence heuristic
32. Information extraction: Implementation
• Developed a CRF-based classifier for identifying
numbers that express relation cardinalities
• Works for a variety of topics, such as
• Family relations: “has 2 siblings”
• Geopolitics: “is composed of seven boroughs”
• Artwork: “consists of three episodes”
• Finds the existence of 178% more children than
currently in Wikidata
[Mirza et al., ISWC 2016 + ACL 2017]
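As a rough illustration of the approach (not the feature set or model of the cited papers), a token-level CRF can be trained to tag numerals that act as relation cardinalities; the sklearn-crfsuite package is assumed, and the two training sentences and labels are invented.

```python
# Minimal sketch of a sequence tagger for cardinality-bearing numerals, using
# sklearn-crfsuite (pip install sklearn-crfsuite). This illustrates the idea
# only; training data, labels, and features are invented.
import sklearn_crfsuite

def features(tokens, i):
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_number": tok.isdigit() or tok.lower() in {"two", "three", "seven"},
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }

sentences = [
    ("She has 2 siblings .".split(), ["O", "O", "CARD", "O", "O"]),
    ("The city is composed of seven boroughs .".split(),
     ["O", "O", "O", "O", "O", "CARD", "O", "O"]),
]
X = [[features(toks, i) for i in range(len(toks))] for toks, _ in sentences]
y = [labels for _, labels in sentences]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, y)

test = "He has 3 children .".split()
pred = crf.predict([[features(test, i) for i in range(len(test))]])[0]
print(list(zip(test, pred)))
```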
33. Information extraction: Challenges
• Cardinalities are frequently expressed non-numerically:
• Nouns: “has twins”, “is a trilogy”
• Indefinite articles: “They have a daughter”
• Negation/adjectives: “has no children”, “is childless”
• Often requires reasoning:
“Has 3 children from Ivana and one from Marla”
• Training (distant supervision) struggles with false positives
• KBs used for training are themselves incomplete
President Garfield: Wikidata knows of only 4 out of 7 children
34. Vision: Make IE recall-aware
Textual information extraction usually gives precision estimates:
“John was born in Malmö, Sweden.” → citizenship(John, Sweden) – precision 95%
“John grew up in Malmö, Sweden.” → citizenship(John, Sweden) – precision 70%
Can we also produce recall estimates?
“John has a son, Tom, and a daughter, Susan.”
→ child(John, Tom), child(John, Susan) – recall 90%
“John brought his children Susan and Tom to school.”
→ child(John, Tom), child(John, Susan) – recall 30%
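A hypothetical sketch of what such recall-aware output could look like: each extraction pattern carries a recall estimate alongside its precision estimate, and extracted triples inherit both. Patterns and numbers are invented for illustration.

```python
# Hypothetical sketch of recall-aware extraction output: each pattern carries
# a precision estimate and an estimate of how exhaustively it enumerates the
# target relation. All patterns and numbers are invented.
PATTERNS = {
    "has a son X and a daughter Y": (0.95, 0.90),            # sounds exhaustive
    "brought his children X and Y to school": (0.95, 0.30),  # likely partial
}

def extract(pattern_id, subject, children):
    precision, recall = PATTERNS[pattern_id]
    return [("child", subject, c, {"precision": precision, "recall": recall})
            for c in children]

print(extract("has a son X and a daughter Y", "John", ["Tom", "Susan"]))
```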
35. Outline – Assessing KB recall
1. Logical foundations
2. Rule mining
3. Information extraction
4. Data presence heuristic
36. Data presence heuristic: Idea
KB: dateOfBirth(John, 17.5.1983)
Q: dateOfBirth(John, 31.12.1999)?
A: Probably not
Single-value properties:
• Having one value ⇒ the property is complete
• Looking at the data alone suffices
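A minimal sketch of this heuristic, with an invented entity record: for functional (single-value) properties, the presence of any value licenses a “probably not” for other values, while non-functional properties stay open-world.

```python
# Minimal sketch of the data-presence heuristic for single-value (functional)
# properties such as dateOfBirth: one value present => treat the property as
# complete and reject other values. The entity record is invented.
FUNCTIONAL = {"dateOfBirth"}
entity = {"dateOfBirth": ["1983-05-17"], "child": ["Tom"]}

def answer(prop, value):
    values = entity.get(prop, [])
    if value in values:
        return "Yes"
    if prop in FUNCTIONAL and values:   # some value is already there
        return "Probably not"
    return "Maybe"                      # non-functional or empty: stay open-world

print(answer("dateOfBirth", "1999-12-31"))  # Probably not
print(answer("child", "Susan"))             # Maybe
```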
37. What are single-value properties?
Extreme case, but…
• Multiple citizenships
• More parents due to adoption
• Several Twitter accounts due to the presidency
38. All hope lost?
• Presence of a value is better than nothing
• Even better: For non-functional attributes,
data is still frequently added in batches
• All clubs Diego Maradona played for
• All ministers of Merkel’s new cabinet
• …
• Checking data presence is a common heuristic
among Wikidata editors
39. Value presence heuristic - example
[https://www.wikidata.org/wiki/Wikidata:Wikivoyage/Lists/Embassies]
40. Data presence heuristic: Challenges
4.1: Which properties to look at?
4.2: How to quantify data presence?
41. 4.1: Which properties to look at? (1/2)
• Complete(Wikidata for Putin)?
• There are more than 3000 properties one can assign to Putin…
• Not all properties are relevant to everyone.
(Think of goals scored or monastic order)
• Are at least all relevant properties there?
• What do you mean by relevant?
42. 4.1: Which properties to look at? (2/2)
• We used crowdsourcing to annotate 350 random
(person, property1, property2) triples with
the human perception of interestingness
• The state-of-the-art approach gets 61% of high-agreement triples right
• It mistakes frequency for interestingness
• Our method, which also uses linguistic similarity, achieves 75%
[Razniewski et al., ADMA 2017]
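As a loose illustration (not the model of the ADMA 2017 paper), property relevance can be scored by combining a global frequency signal with a crude linguistic-similarity signal; frequencies, labels, and the weighting below are invented.

```python
# Loose sketch of ranking candidate properties for an entity by combining a
# frequency signal with a crude linguistic-similarity signal (word overlap
# with the entity's existing facets). All data and weights are invented.
GLOBAL_FREQ = {"goals scored": 0.30, "monastic order": 0.01, "political party": 0.20}

def text_sim(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)          # Jaccard overlap as a stand-in

def relevance(candidate_property, entity_profile, alpha=0.5):
    freq = GLOBAL_FREQ.get(candidate_property, 0.0)
    sim = max(text_sim(candidate_property, p) for p in entity_profile)
    return alpha * freq + (1 - alpha) * sim

profile = ["politician", "political scientist"]   # invented facets of an entity
for prop in GLOBAL_FREQ:
    print(prop, round(relevance(prop, profile), 3))
```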
43. 4.2: How to quantify data presence?
• We have values for 46 out of 77 relevant properties for Putin
→ hard to interpret
• Proposal: Quantify based on comparison with other similar entities
• Ingredients:
• Similarity metric: Who is similar to Trump?
• Data quantification: How much data is good/bad?
• Deployed on Wikidata, but evaluation is difficult
[Ahmeti et al., ESWC 2017]
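A minimal sketch of such a relative quantification, with invented peers and properties: each property contributes to the expectation in proportion to how many similar entities have it, and the target entity is scored against that expectation.

```python
# Minimal sketch of quantifying data presence relative to similar entities:
# an entity's property coverage is compared against how often each property
# is populated among its peers. Entities, peers, and properties are invented.
peers = {
    "peer1": {"birthDate", "party", "spouse", "twitter"},
    "peer2": {"birthDate", "party", "spouse"},
    "peer3": {"birthDate", "party"},
}
target = {"birthDate", "party"}

def relative_completeness(target_props, peer_props):
    # Expected coverage of a property = fraction of peers that have it;
    # the score measures how much of that expectation the target fulfils.
    all_props = set().union(*peer_props.values())
    expected = {p: sum(p in v for v in peer_props.values()) / len(peer_props)
                for p in all_props}
    fulfilled = sum(expected[p] for p in target_props if p in expected)
    return fulfilled / sum(expected.values())

print(round(relative_completeness(target, peers), 2))
```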
47. Summary (1/3)
• Increasing KB quality can to some extent
be noticed downstream
• Precision easy to evaluate
• Recall largely unknown
48. Summary (2/3)
• Ideal is human-curated completeness information
• Created in conjunction with data (COOL-WD tool)
• Not really scalable
• Automated alternatives:
• Association rule mining
• Information extraction
• Looking at existence of data is a useful start
49. Summary (3/3)
• Recall-aware information extraction an open
challenge
• Concepts of relevance and relative completeness
in KBs little understood to date
• I look forward to fruitful collaborations with UdS,
MPI-SWS and MPI-INF
Editor's notes
O-like letter - otto
350 man years to complete, estimate 1986
Google launched 1998 (1995 other name)
First Chinese, fourth Hindi
Marx point: see what you are actually trying to approximate
-> rule mining with constraints?
Here multiple claims, but so when do we have all?
Sl – sitelink yes or no, www yes or no, img yes or no
Coordinate yes or no
Phone yes or no
What is good/bad: Problem could be that very few are good/bad
Question: What are/how to find interesting facets?
Much work on entity and fact ranking, little on predicate ranking