SwissLink
High-Precision, Context-Free Entity Linking
Exploiting Unambiguous Labels
Roman Prokofyev, Michael Luggen, Djellel Eddine Difallah, Philippe Cudré-Mauroux
eXascale Infolab, University of Fribourg, Switzerland
Entity Linking
“In natural language processing, entity linking, [...] is the task of determining
the identity of entities mentioned in text.”
https://en.wikipedia.org/wiki/Entity_linking
The identity of an entity is commonly defined as an entry in a Knowledge
Base (KB).
Entity linking is usually solved in a multi-step process: Named Entity Recognition
(NER), followed by Candidate Selection, and finally Disambiguation.
Entity Linking
1. Named Entity Recognition (NER)
Distinguish ordinary words from defined concepts, also known as
named entities. Often involves a Part-of-Speech (POS) tagger.
2. Candidate Selection
Selecting possible candidates from the target Knowledge Base (where
entities are defined).
3. Disambiguation
Deciding which candidate is the correct identity corresponding to the
mention of a Named Entity.
Entity Linking
1. Named Entity Recognition (NER)
“It is a blast to visit Adam once more.”
2. Candidate Selection
Adam -> Adam (Name), Adam (City) in Oman, Amsterdam
3. Disambiguation
Adam -> https://en.wikipedia.org/wiki/Amsterdam
Motivation: High-precision context-free entity linking
● Certain applications require high-precision linked entities
○ Interactive applications where humans review results
○ Machine learning: training predictive
models may require high-precision
annotated text (no overfitting)
● Context-free
○ Works with any type of input:
text, tweets, search queries
○ But limited to unambiguous labels
The F1 score is the harmonic mean of precision and recall, so it balances the two.
That balance is not necessarily the best optimization target for the task at hand.
[Figure: precision, recall, and F1 score]
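To make the trade-off concrete, here is a minimal sketch of the F1 computation next to a precision-weighted F-beta score. The function names are illustrative, and the input values 0.95 and 0.45 are used only as an example of a high-precision, moderate-recall system.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta < 1 weights precision more, beta > 1 weights recall more."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0  # both precision and recall are zero
    return (1 + beta ** 2) * precision * recall / denominator


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F-beta with beta = 1)."""
    return f_beta(precision, recall, beta=1.0)


if __name__ == "__main__":
    p, r = 0.95, 0.45  # illustrative values only
    print(round(f1_score(p, r), 3))     # 0.611
    print(round(f_beta(p, r, 0.5), 3))  # 0.777
```

With these inputs F1 is pulled down by the lower recall, while a precision-weighted score rewards the same system more, which is why F1 alone may not reflect what a high-precision application needs.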
Motivation: Categories of links to Wikipedia
What labels are used to link to entities (as Wikipedia pages) on the web?
● Link by the most common label
○ "web browser" -> <Web_browser> (381’623 times)
● Link by context
○ "divided into three subgroups: East, West, and South" -> <East_Slavic_languages>
● Link by reference
○ "Wikipedia" -> <Angelina_Jolie> (16’333 times)
● Erroneous link
○ "Oregon" -> <University_of_Oregon>
○ Incorrectly linked entity even when considering the context
Motivation: Prior probability scores
● Most important feature when not considering context
● Conditional probability P(link|label)
● Problems:
Does not necessarily capture ambiguity
Adam -> Adam (Name), Adam (City) in Oman, Amsterdam
Does not take categories into account
Wikipedia -> Angelina_Jolie [16’333]
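As a rough sketch, the prior P(link|label) can be estimated by counting how often each label (anchor text) points to each target page. The function name and the anchor counts below are invented for illustration and do not come from the slides.

```python
from collections import Counter, defaultdict


def prior_probabilities(anchors):
    """Estimate P(target | label) from an iterable of (label, target) anchor pairs."""
    counts = defaultdict(Counter)
    for label, target in anchors:
        counts[label][target] += 1
    priors = {}
    for label, target_counts in counts.items():
        total = sum(target_counts.values())
        priors[label] = {t: c / total for t, c in target_counts.items()}
    return priors


if __name__ == "__main__":
    # Hypothetical anchor observations; the counts are made up.
    anchors = (
        [("Adam", "Adam_(name)")] * 6
        + [("Adam", "Amsterdam")] * 3
        + [("Adam", "Adam,_Oman")] * 1
    )
    print(prior_probabilities(anchors)["Adam"])
```

Here the most likely target wins with a prior of 0.6, yet the label is clearly ambiguous: the prior alone says nothing about whether the remaining 0.4 is noise or a set of legitimate alternative meanings.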
Method (Problem)
Problem Formulation.
Given an arbitrary textual document $I_D$ as input, identify all named-entity
substrings $\{l_1, \dots, l_k\}$ and link them to their respective entities.
Effectively, our methods return as output a set of label-entity pairs
$O_D = \{(l_1, e_z), \dots, (l_k, e_x)\}$.
Method (Different Overall Approach)
Common
Named entity recognition -> candidate selection -> disambiguation
Context Free
Extract surface forms (KB or annotated corpus) -> clean and catalog -> fast
string matching
Surface form: a string representing an entity in a text.
Annotated corpus: e.g. Wikipedia articles, Common Crawl
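The context-free pipeline can be sketched as a dictionary lookup over the input tokens. This is a minimal illustration, not the authors' implementation: the catalog contents, the lowercasing, the greedy longest-match strategy, and the `max_len` parameter are all assumptions.

```python
def link_entities(text, catalog, max_len=4):
    """Greedy longest-match lookup of surface forms in a label -> entity catalog."""
    tokens = text.split()
    pairs = []
    i = 0
    while i < len(tokens):
        # Try the longest span first so "university of oregon" beats "oregon".
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n]).strip(".,;:!?\"'").lower()
            if candidate in catalog:
                pairs.append((candidate, catalog[candidate]))
                i += n
                break
        else:
            i += 1  # no surface form starts at this token
    return pairs


if __name__ == "__main__":
    # Hypothetical catalog of unambiguous surface forms.
    catalog = {
        "web browser": "Web_browser",
        "university of oregon": "University_of_Oregon",
    }
    text = "A web browser was developed at the University of Oregon."
    print(link_entities(text, catalog))
```

No recognition or disambiguation step runs at link time; all of the effort goes into building a catalog whose labels are safe to match without context. A production system would likely replace the nested loop with a trie or a similar multi-pattern matcher.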
Method (Catalog)
DBpedia
DBpedia labels can be considered as a catalog after the removal of ambiguous
labels. Downside: The labels in DBpedia are rather sparse.
Wikipedia
The internal links of Wikipedia are a good source of surface forms with links to
entities (Wikipedia pages). Downside: Noise is introduced due to the categories of
links.
Method
Ratio
Decide which surface forms have labels that are too ambiguous to be
linked without context.
Percentile method
Removes long tail and then readjusts weights to get better recall
Evaluation
Curated ground truth based on
Wikipedia articles allows us to
compare with manual annotations
in Wikipedia.
(30 randomly sampled articles)
● Ratio method: low recall
● Ratio+Percentile 99: best
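A comparison against manual annotations can be sketched as set overlap between predicted and ground-truth label-entity pairs. This is a simplified illustration with made-up pairs; it ignores mention positions, and the matching rules of the actual evaluation may differ.

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted (label, entity) pairs against gold pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical annotations for one article.
    gold = {
        ("web browser", "Web_browser"),
        ("wikipedia", "Wikipedia"),
        ("adam", "Amsterdam"),
        ("oregon", "Oregon"),
    }
    # A context-free linker skips the ambiguous labels "adam" and "oregon".
    predicted = {
        ("web browser", "Web_browser"),
        ("wikipedia", "Wikipedia"),
    }
    print(precision_recall(predicted, gold))  # (1.0, 0.5)
```

Skipping ambiguous labels keeps every emitted link correct in this toy case, at the cost of missing half of the annotated entities.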
Evaluation (Discussion)
● Increasing the ratio introduces more ambiguous labels -> direct impact on
precision
● The percentile method balances this effect by separating the ambiguity
from the popularity of the entities
● In general, we observe that the Percentile-Ratio method with 99-Percentile
and 10-Ratio strikes a good balance between high-precision results (>95%)
and reasonable recall (45%, 1309 entities)
Links
Ground truth: https://github.com/eXascaleInfolab/Wikipedia30
Methods: https://github.com/eXascaleInfolab/kilogram
Evaluation: http://w3id.org/gerbil/experiment?id=201604300040
Gartner's Data Analytics Maturity Model.pptx
 

SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous Labels

  • 1. SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous Labels. Roman Prokofyev, Michael Luggen, Djellel Eddine Difallah, Philippe Cudré-Mauroux. eXascale Infolab, University of Fribourg, Switzerland.
  • 2. Entity Linking. “In natural language processing, entity linking, [...] is the task of determining the identity of entities mentioned in text.” (https://en.wikipedia.org/wiki/Entity_linking) The identity of an entity is commonly defined as an entry in a Knowledge Base (KB). Entity linking is usually solved in a multi-step process: Named Entity Recognition (NER), followed by Candidate Selection, and finally Disambiguation.
  • 3. Entity Linking. 1. Named Entity Recognition (NER): distinguish ordinary words from defined concepts, also known as named entities. This step often involves a Part of Speech (POS) tagger. 2. Candidate Selection: select possible candidates from the target Knowledge Base (where entities are defined). 3. Disambiguation: decide which candidate is the correct identity for the mention of a named entity.
  • 4. Entity Linking (example). 1. Named Entity Recognition (NER): “It is a blast to visit Adam once more.” 2. Candidate Selection: Adam -> Adam (Name), Adam (City) in Oman, Amsterdam. 3. Disambiguation: Adam -> https://en.wikipedia.org/wiki/Amsterdam
  • 5. Motivation: High-precision context-free entity linking. ● Certain applications require high-precision linked entities: ○ interactive applications where humans review results; ○ machine learning, where training predictive models may require high-precision annotated text (no overfitting). ● Context-free: ○ works with any type of input (text, tweets, search queries); ○ but is limited to unambiguous labels. The F1 score strikes a balance (harmonic mean) between precision and recall. This is not necessarily the best optimization target for the task at hand. [Figure: precision, recall, and F1 score]
  • 6. Motivation: Categories of links to Wikipedia. What labels are used to link to entities (as Wikipedia pages) on the web? ● Link by the most common label: “web browser” -> <Web_browser> (381’623 times). ● Link by context: “divided into three subgroups: East, West, and South” -> <East_Slavic_languages>. ● Link by reference: “Wikipedia” -> <Angelina_Jolie> (16’333 times). ● Erroneous link: “Oregon” -> <University_of_Oregon>, an incorrectly linked entity even when considering the context.
  • 7. Motivation: Prior probability scores. ● The most important feature when not considering context. ● Defined as the conditional probability P(link|label). ● Problems: ○ does not necessarily capture ambiguity, e.g. Adam -> Adam (Name), Adam (City) in Oman, Amsterdam; ○ does not take the categories of links into account, e.g. Wikipedia -> Angelina_Jolie [16’333].
  • 8. Method (Problem). Problem formulation: given an arbitrary textual document $D$ as input, identify all named-entity substrings $\{l_1, \ldots, l_k\}$ and link them to their respective entities. Effectively, our methods return as output a set of label-entity pairs $O_D = \{(l_1, e_z), \ldots, (l_k, e_x)\}$.
  • 9. Method (Different Overall Approach). Common approach: named entity recognition -> candidate selection -> disambiguation. Context-free approach: extract surface forms (from a KB or an annotated corpus) -> clean and catalog -> fast string matching. Surface form: a string representing an entity in a text. Annotated corpus: e.g. Wikipedia articles, Common Crawl.
  • 10. Method (Catalog). DBpedia: DBpedia labels can be considered a catalog after the removal of ambiguous labels. Downside: the labels in DBpedia are rather sparse. Wikipedia: the internal links of Wikipedia are a good source of surface forms with links to entities (Wikipedia pages). Downside: noise is introduced by the different categories of links.
  • 11. Method. Ratio method: decides which surface forms are ambiguous labels that cannot be linked without context. Percentile method: removes the long tail and then readjusts the weights to get better recall.
  • 12. Evaluation. A curated ground truth based on 30 randomly sampled Wikipedia articles allows us to compare with the manual annotations in Wikipedia. ● Ratio method: low recall. ● Ratio+Percentile 99: best.
  • 13. Evaluation (Discussion). ● Increasing the ratio introduces more ambiguous labels, which has a direct impact on precision. ● The percentile method balances this effect by separating the ambiguity of a label from the popularity of the entities. ● In general, we observe that the Percentile-Ratio method with 99-Percentile and 10-Ratio strikes a good balance between high-precision results (>95%) and reasonable recall (45%, 1309 entities).
  • 14. High-Precision, Context-Free Entity Linking Exploiting Unambiguous Labels: Links. Ground truth: https://github.com/eXascaleInfolab/Wikipedia30 Methods: https://github.com/eXascaleInfolab/kilogram Evaluation: http://w3id.org/gerbil/experiment?id=201604300040
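
The context-free pipeline described in the slides (extract surface forms, clean and catalog, fast string matching) can be illustrated with a minimal Python sketch. It is not the authors' implementation (their code is in the kilogram repository linked above). The function names, the toy anchor counts, and the `max_other_share` threshold are all hypothetical. The filter used here keeps a label only when the share of its links pointing to entities other than the most frequent one is small. This is a simplified stand-in for the ratio method, whose exact definition the slides do not give, and the percentile step is left out.

```python
from collections import Counter, defaultdict


def build_catalog(anchors, max_other_share=0.10):
    """Build a label -> (entity, prior) catalog of near-unambiguous labels.

    anchors: iterable of (label, entity) pairs, one per observed link.
    max_other_share: hypothetical threshold (an assumption, not from the
    slides); a label is kept only if at most this fraction of its links
    point to entities other than the most frequent one.
    """
    counts = defaultdict(Counter)
    for label, entity in anchors:
        counts[label.lower()][entity] += 1

    catalog = {}
    for label, entity_counts in counts.items():
        total = sum(entity_counts.values())
        top_entity, top_count = entity_counts.most_common(1)[0]
        other_share = (total - top_count) / total
        if other_share <= max_other_share:
            # Prior probability P(entity | label) of the dominant entity.
            catalog[label] = (top_entity, top_count / total)
    return catalog


def link(text, catalog, max_len=4):
    """Greedy longest-match lookup of token n-grams in the catalog."""
    tokens = text.split()
    pairs = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n]).lower().strip(".,;:!?\"'")
            if candidate in catalog:
                pairs.append((candidate, catalog[candidate][0]))
                i += n
                break
        else:
            # No catalog label starts at this token; move on.
            i += 1
    return pairs


# Toy anchor counts, invented for illustration only.
anchors = (
    [("web browser", "Web_browser")] * 20
    + [("Adam", "Adam_(name)")] * 3
    + [("Adam", "Amsterdam")] * 2
    + [("Oregon", "Oregon")] * 30
    + [("Oregon", "University_of_Oregon")] * 1
)

catalog = build_catalog(anchors)
pairs = link("It is a blast to visit Adam in Oregon with a web browser.", catalog)
print(catalog)
print(pairs)
```

In this toy data the label "Adam" is dropped from the catalog because its links are split between two entities, so the mention stays unlinked. That is the trade-off the slides describe: precision is favoured over recall. Raising `max_other_share` admits more ambiguous labels, which would be expected to lower precision.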