Broad-coverage knowledge bases (KBs) such as Wikipedia, Freebase, Microsoft's Satori and Google's Knowledge Graph contain structured data describing real-world entities.
These data sources have become increasingly important for a wide range of intelligent systems: from information retrieval and question answering, to Facebook's Graph Search, IBM's Watson, and more.
Previous work on learning to populate knowledge bases from text has, for the most part, made the simplifying assumption that facts remain constant over time.
But this assumption is inaccurate: we live in a rapidly changing world.
Knowledge should not be viewed as a static snapshot, but instead as a rapidly evolving set of facts that must change as the world changes.
In this paper we demonstrate the feasibility of accurately identifying entity transition events, from real-time news and social media text streams, that drive changes to a knowledge base.
We use Wikipedia's edit history as distant supervision to learn event extractors, and evaluate the extractors based on their ability to predict online updates.
Our weakly supervised event extractors are able to predict 10 KB revisions per month at 0.8 precision. By lowering our confidence threshold, we can suggest 34.3 correct edits per month at 0.4 precision.
64% of predicted edits were detected before they were added to Wikipedia. The average lead time of our forecasted knowledge revisions over Wikipedia's editors is 40 days, demonstrating the utility of our method for suggesting edits that can be quickly verified and added to the knowledge graph.
1. Learning to Extract Events from Knowledge Base Revisions
Alexander Konovalov
Ohio State University
konovalov.2@osu.edu
Benjamin Strauss
Ohio State University
strauss.105@osu.edu
Alan Ritter
Ohio State University
ritter.1492@osu.edu
Brendan O'Connor
University of Massachusetts Amherst
brenocon@cs.umass.edu
2. Knowledge Bases
Some notable examples:
● Google Knowledge Graph
● Microsoft Satori KB
● Wolfram Alpha
● Wikidata
● DBPedia
● Wikipedia Infoboxes
● Freebase
4. Extracting KBs from Text
Prior work assumes static text corpora and knowledge bases:
● NELL [AAAI 2015]
● Mintz et al. [ACL 2009]
● DeepDive [VLDS 2012]
● Knowledge Vault [KDD 2014]
12. 50% of deaths updated within a couple of days
... but for scientists it takes 31 days to reach 50% coverage!*
Manual editing cannot scale up!
* https://research.googleblog.com/2013/05/distributing-edit-history-of-wikipedia.html
Goal: extract KB revisions from text as soon as public knowledge is available
20. In a nutshell: formally.
● An event is a quadruple of a predicate, two entities, and a time (sketched below).
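A minimal sketch of this representation; the type and the example values are illustrative assumptions, not the authors' code:

```python
from collections import namedtuple
from datetime import date

# An event: (predicate, arg1, arg2, time).
Event = namedtuple("Event", ["predicate", "arg1", "arg2", "time"])

# Hypothetical example: a player joining a team.
e = Event(predicate="CurrentTeam",
          arg1="Jane Doe",        # subject entity
          arg2="Example FC",      # object entity
          time=date(2015, 4, 7))  # when the event happened
```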
21. In a nutshell: formally.
● Features are collected from tweets written before time t (see the sketch below).
[Timeline: the feature window precedes the prediction time t]
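A sketch of that feature window, assuming tweets carry a timestamp; the function name, field name, and window length are assumptions, not the paper's API:

```python
from datetime import timedelta

def tweets_in_feature_window(tweets, t, window=timedelta(days=10)):
    """Keep only tweets written before prediction time t and
    inside the feature window [t - window, t)."""
    return [tw for tw in tweets if t - window <= tw["created_at"] < t]
```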
22. In a nutshell: formally.
● A bag of Mintz et al.-style features with learned weights (sketch after the table):

Attribute    Feature                          Weight
State        arg1 wins in arg2                0.333
State        cnn projects arg1 wins IN arg2   0.115
CurrentTeam  arg1 to arg2                     1.291
CurrentTeam  arg1 joins arg2                  0.513
CurrentTeam  arg1 signs...with arg2           0.419
DeathPlace   arg1 found dead in arg2          0.269
DeathPlace   arg1 found JJ IN arg2            0.260
DeathPlace   arg1 killed in arg2              0.195
DeathPlace   arg1 memorial...in arg2          0.125
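To show how one such lexical feature can be read off a tokenized tweet, here is a simplified stand-in for Mintz-style feature extraction (the function is an assumption, not the authors' code):

```python
def between_pattern(tokens, i1, i2):
    """Build a feature like 'arg1 joins arg2' from the words between
    the two entity mentions at token indices i1 and i2."""
    lo, hi = sorted((i1, i2))
    middle = " ".join(tokens[lo + 1:hi])
    return f"arg1 {middle} arg2" if i1 < i2 else f"arg2 {middle} arg1"

# between_pattern(["Smith", "joins", "Arsenal"], 0, 2) -> "arg1 joins arg2"
```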
23. In a nutshell: formally.
● An L₂-regularized log-linear model; optimize the conditional likelihood with respect to θ (one standard form below).
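Assuming features f(x, y) and weights θ (the notation is mine, not copied from the paper), the objective takes the usual form:

```latex
\max_{\theta}\; \sum_{i} \log p(y_i \mid x_i; \theta) \;-\; \lambda \lVert \theta \rVert_2^2,
\qquad
p(y \mid x; \theta) = \frac{\exp\big(\theta^{\top} f(x, y)\big)}{\sum_{y'} \exp\big(\theta^{\top} f(x, y')\big)}
```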
24. Challenge: Non-Semantic Edits
| spouse = Soumaya Domit (1967–1999; her death); 6 children
| spouse = Soumaya Domit <small>(1967-1999; her death)</small>
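One plausible way to flag such formatting-only edits is to compare field values after stripping markup; this sketch is illustrative (the regexes and names are assumptions):

```python
import re

def normalize(value):
    value = re.sub(r"</?\w+[^>]*>", "", value)  # drop HTML tags such as <small>
    value = value.replace("\u2013", "-")        # unify en dashes and hyphens
    return re.sub(r"\s+", " ", value).strip()   # collapse whitespace

def is_non_semantic_edit(old_value, new_value):
    return normalize(old_value) == normalize(new_value)
```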
26. Challenge: Edits After Event
Revision as of 17:27, 7 April 2015:
➕ | death_date = {{death date and age|2015|02|03|1978|05|29}}
➕ | death_place = [[Valhalla, New York|Valhalla, NY]], U.S.
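The date of the underlying event can be read out of the template itself, so a revision's timestamp (7 April) can be compared against the actual death date (3 February). A sketch, with the regex and function name as assumptions:

```python
import re
from datetime import date

def death_date(infobox_value):
    """Extract the death date from a {{death date and age}} template."""
    m = re.search(r"\{\{death date and age\s*\|(\d{4})\|(\d{1,2})\|(\d{1,2})",
                  infobox_value)
    return date(*map(int, m.groups())) if m else None

# death_date("{{death date and age|2015|02|03|1978|05|29}}")
# -> date(2015, 2, 3), two months before the revision above.
```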
33. Experiments: datasets.
● Intuition: samples near the time of a revision are likely to mention an event that causes the change.
[Diagram: annotated samples; named-entity matching yields matched samples, which split into aligned and unaligned samples]
34. Experiments: datasets.
● Intuition: samples near the time… Use temporal alignment (-10...+3 days; see the sketch below).
[Diagram: as above, with matched samples split by temporal alignment into aligned and unaligned samples]
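A sketch of that alignment test, assuming the window is measured relative to the revision time (the helper name is an assumption):

```python
from datetime import timedelta

def temporally_aligned(sample_time, revision_time,
                       before=timedelta(days=10), after=timedelta(days=3)):
    """True if the sample falls within -10...+3 days of the revision."""
    return revision_time - before <= sample_time <= revision_time + after
```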
35. Experiments: baseline.
● Baseline: a traditional relation extraction system that uses no temporal information.
[Diagram: the same pipeline as the previous slide]
41. Experiments: evaluation.
● Inter-annotator agreement (Fleiss' kappa; formula below):
○ 0.64 on Twitter
○ 0.30 on Gigaword
Context matters:
Much of Wednesday was about pomp and circumstance for Rand and Ron Paul. They were sworn in, [...] But the newly minted senator from Kentucky also tended to some business [...]
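For reference, Fleiss' kappa measures agreement beyond chance across multiple annotators:

```latex
% \bar{P}: mean observed pairwise agreement across items;
% \bar{P}_e: agreement expected by chance.
\kappa = \frac{\bar{P} - \bar{P}_e}{1 - \bar{P}_e}
```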
47. Conclusions
● Revisions are a source of distant supervision for event extraction.
● Use temporal and surface matching to identify reliable infobox edits corresponding to real-world events.
● Automate KB updates by learning predictors from past KB updates.
48. Conclusions
● We generate on average 34.3 edits per month with high precision for 6 attributes.
● We often beat human KB contributors on recall (64% of predicted edits detected before Wikipedia) and lead time (40 days).
50. Future Work
● Align edits to the actual events (which occur with an offset) using a latent-variable alignment model.
51. Q & A
Learning to Extract Events from Knowledge Base Revisions
You can reach me via konovalov.2@osu.edu.
Code and data will be available soon at github.com/alexknvl/dsup.