We propose a concentric-based approach for representing the context of a news item. It harmonizes into a single model the representative entities, which can be extracted using information retrieval and natural language processing techniques (Core), and other entities that become prominent according to different dimensions such as informativeness, semantic connectivity, or popularity (Crust).
Concentric Semantic Snapshot
1. THE CONCENTRIC NATURE OF
NEWS SEMANTIC SNAPSHOTS
JOSÉ LUIS REDONDO GARCIA
GIUSEPPE RIZZO
RAPHAËL TRONCY
@peputo / redondo@eurecom.fr
@giusepperizzo / giuseppe.rizzo@eurecom.fr
@rtroncy / raphael.troncy@eurecom.fr
2. Overview
October 8, 2015, 8th International Conference on Knowledge Capture
1. Introducing the Problem:
Contextualizing News Items
o The News Semantic Snapshot (NSS)
2. Previous Work:
o Frequency-based Functions
o Multidimensional Relevancy Approach
3. A Concentric Model for Generating NSS
4. The Problem: Contextualizing News
Wolfgang Schäuble: Finance Minister
Christian Democratic Union: Ruling Party in Germany
5. The Problem: Contextualizing News
Sarah Harrison: WikiLeaks Editor
Sheremetyevo: Airport in Moscow
7. News Semantic Snapshot (NSS) [1]
[1] Redondo et al., Generating the Semantic Snapshot of Newscasts using Entity Expansion, ICWE 2015, Rotterdam.
8. Recreating the NSS
News Semantic Snapshot
[Figure: recreating the News Semantic Snapshot, with steps labeled (1) and (2)]
9. News Semantic Snapshot: Gold Standard
Involving: experts in the news domain + users
Dimensions:
(1) Video Subtitles
(2) Image in the video
(3) Text in the video image
(4) Suggestions of an expert
(5) Related articles
Play with the data and help us to extend it at:
https://github.com/jluisred/NewsConceptExpansion/wiki/Golden-Standard-Creation
11. (1) Bringing in Missing Entities: News Entity Expansion
[Figure: entity expansion pipeline, with steps labeled 1.a) and 1.b)]
Parameters [1]:
Web sites to be crawled:
- Google
- L1: a set of 10 international English-speaking newspapers
- L2: a set of 3 international newspapers used in the GS
Temporal Window:
- 1W:
- 2W:
Annotation filtering:
- Schema.org
12. Recreating the NSS
News Semantic Snapshot
(1) Recall (Entity Expansion) = 0.91
Recall (NER on Subtitles) = 0.42
13. The Selection Problem:
[Figure: the entities from (Entity Expansion), ranked from 0 to N by a function FX(ei), are compared with the (NSS) ranking given by FIdeal(ei); the open question FX(ei) = ? is evaluated with MNDCG]
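The measure named on this slide can be illustrated with a small sketch. It assumes that MNDCG is the mean of per-video NDCG values and that the usual log2 position discount applies; the function names, the cutoff handling and the toy gold scores are illustrative assumptions, not taken from the slides.

```python
import math

def dcg(gains):
    """Discounted cumulative gain with the common log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at(ranked, gold, n=10):
    """NDCG at cutoff n for one ranked entity list against gold scores."""
    gains = [gold.get(e, 0.0) for e in ranked[:n]]
    ideal = sorted(gold.values(), reverse=True)[:n]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

def mean_ndcg(runs, n=10):
    """Average NDCG over (ranked, gold) pairs, one pair per video."""
    return sum(ndcg_at(r, g, n) for r, g in runs) / len(runs)

# Toy gold scores (made up for illustration only).
gold = {"Sarah Harrison": 3.0, "Sheremetyevo": 2.0, "Laura Poitras": 1.0}
perfect = ["Sarah Harrison", "Sheremetyevo", "Laura Poitras"]
reversed_run = list(reversed(perfect))

print(ndcg_at(perfect, gold), ndcg_at(reversed_run, gold))
```

Because of the position discount, the reversed list scores lower than the perfect one even though both contain the same entities, which is the behaviour a ranking function FX(ei) is judged on.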
14. Overview
1. Introducing the Problem:
Contextualizing News Items
o The News Semantic Snapshot (NSS)
2. Previous Work:
o Frequency-based Function
o Multidimensional Relevancy Approach
3. A Concentric Model for Generating NSS
15. 1st: Entity Frequency
SNOW Workshop 2014 [2]
[2] Redondo et al., Describing and Contextualizing Events in TV News Show, SNOW Workshop, WWW 2014, Seoul, Korea.
16. Frequency Based: Results
[Figure: (Expansion) entities ranked from 0 to N by FREQ, compared with the (NSS)]
F(Laura Poitras) = 2
F(Glenn Greenwald) = 1
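A frequency function of this kind can be sketched in a few lines. The sketch assumes the expansion step yields a flat list of entity labels, one per mention; `rank_by_frequency` and the input list are hypothetical names chosen for illustration.

```python
from collections import Counter

def rank_by_frequency(mentions):
    """Rank entity labels by how often they occur in the expansion."""
    counts = Counter(mentions)
    # most_common sorts by descending count; ties keep first-seen order.
    return counts.most_common()

# Toy mention list chosen to match the F values shown on the slide.
mentions = ["Laura Poitras", "Glenn Greenwald", "Laura Poitras"]
ranking = rank_by_frequency(mentions)
print(ranking)
```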
17. Multidimensional Approach
ICWE 2015 [1]
15th International Conference on Web Engineering (ICWE)
(Fr) (FrGaussian)
18. Multidimensional Approach
POPULARITY (FPOP)
- Based on Google Trends
- w = 2 months
- μ + 2*σ (2.5%)
EXPERT RULES (FEXP)
Example:
- [ Location, = 0.48 ]
- [ Person, = 0.74 ]
- [ Organization, = 0.95 ]
- [ < 2 , = 0.0 ]
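One plausible reading of the popularity bullets is a peak test: an entity counts as popular when its interest value around the news date exceeds μ + 2*σ computed over the window. The sketch below follows that reading only; the input series, the function name and the use of the population standard deviation are assumptions, and no real Google Trends API is called.

```python
from statistics import mean, pstdev

def is_popularity_peak(window_values, value_at_news_date, k=2.0):
    """True when the value exceeds mean + k * std of the window."""
    mu = mean(window_values)
    sigma = pstdev(window_values)
    return value_at_news_date > mu + k * sigma

# Made-up interest values standing in for a two-month window.
window = [10, 12, 11, 9, 10, 11, 12, 10]

print(is_popularity_peak(window, 40))  # far above the window's usual level
print(is_popularity_peak(window, 12))  # within the usual range
```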
19. Multidimensionality: Results
- News Entity Expansion + Dimensions generate the News Semantic Snapshot
- Best score: 0.667 in MNDCG at 10, better than BS1/2
• Collection: CSE (Google + 2W + Schema.org)
• Ranking:
• Expert Rules
• Popularity
20. Multidimensionality: Results
[Figure: (Expansion) ranked by FREQ + POP + EXP, compared with the (NSS)]
21. Follow up: Fine-Tuning
1. Exploit Google Relevance (+1.80%)
2. Promote Subtitle Entities (+2.50%)
3. Exploit Named Entity Extractor’s confidence (+0.20%)
4. Interpret Popularity Dimension (+1.40%)
5. Perform Clustering before Filtering (-0.60%)
- NO SIGNIFICANT IMPROVEMENT -
22. No Improvement: Why?
[Figure: tuning Function X over FREQ, POP, EXP; Original vs. Re-Shuffle, compared with the (NSS)]
How many Dimensions?
How to combine them?
23. Overview
1. Introducing the Problem:
Contextualizing News Items
o The News Semantic Snapshot (NSS)
2. Previous Work:
o Frequency-based Function
o Multidimensional Relevancy Approach
3. A Concentric Model for Generating NSS
24. Thinking Outside the Box:
1. Is there room for improvement?
2. Is MNDCG a good measure to evaluate NSS?
3. How to significantly improve the approach?
25. Room for Improvement?
[Figure: GAIN]
26. Room for Improvement?
27. How to Evaluate NSS?
MNDCG:
• Too focused on success at first positions (decay function)
• NSS intends to be flexible; ranking is application-dependent
COMPACTNESS:
• Prioritizes coverage over ranking
• Compromise between Recall and NSS size
• Recall*: positives are weighted according to score in GT (NSS)
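The Recall* bullet can be read as a score-weighted recall: each gold-standard entity that is found contributes its GT score rather than a flat 1. The sketch below implements that reading only; the exact formula is not given on the slide, so the normalization by the total GT score, the function name and the toy scores are assumptions.

```python
def weighted_recall(retrieved, gold_scores):
    """Share of the gold standard's total score covered by the retrieved set."""
    retrieved = set(retrieved)
    total = sum(gold_scores.values())
    if total == 0:
        return 0.0
    found = sum(s for e, s in gold_scores.items() if e in retrieved)
    return found / total

# Made-up GT scores for illustration.
gold_scores = {"Sarah Harrison": 3, "Sheremetyevo": 2, "WikiLeaks": 1}

# Missing the entity with score 2 costs more than missing the one with score 1.
print(weighted_recall(["Sarah Harrison", "WikiLeaks"], gold_scores))
print(weighted_recall(["Sarah Harrison", "Sheremetyevo"], gold_scores))
```

With all GT scores equal, this reduces to plain recall, so the weighting only changes the outcome when the gold standard distinguishes more and less important entities.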
28. Compactness:
Recall: 22/33 ≈ 0.67
Sa = 27
Sb = 33
Sc = 54
[Figure: snapshots A, B, C compared with the (NSS); A > B > C]
29. Re-thinking the Approach: Concentric Snapshot
Duality in News Entity Spectrum:
• REPRESENTATIVE entities:
o Driving the plot of the story, sometimes evident for users.
• RELEVANT entities:
o Related to the former via specific reasons
Exploit the entity semantic relations
Unexpected?
30. Hypothesis: Concentric Snapshot
CORE:
• Representative entities
• Spottable via Frequency dimensions
• High degree of cohesiveness
CRUST:
• Attached to the Core via particular relations
• Agnostic to relevancy nature: informativeness, interestingness, etc.
31. Core Generation
a) Representative entities: Frequency Dimension
b) Cohesiveness (DBpedia)
32. Crust Generation
The number of Web documents talking simultaneously about a particular entity e and the Core: SWEB(e, Core)
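The co-occurrence count described on this slide can be sketched as a query that combines the candidate entity with the Core labels and asks a search backend how many documents match. The backend is passed in as a callable so that no real search API is assumed; `s_web`, `rank_crust`, the query syntax and the toy in-memory corpus are all hypothetical.

```python
from typing import Callable, Iterable

def s_web(entity: str, core: Iterable[str], hit_count: Callable[[str], int]) -> int:
    """Count documents that mention the entity together with the Core."""
    # Quote each label so multi-word names are matched as phrases.
    terms = [entity, *core]
    query = " ".join(f'"{t}"' for t in terms)
    return hit_count(query)

def rank_crust(candidates, core, hit_count):
    """Score every non-Core candidate and sort by descending co-occurrence."""
    core = list(core)
    scored = [(e, s_web(e, core, hit_count)) for e in candidates if e not in core]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy stand-in for a search engine: three made-up documents.
DOCS = [
    "Sarah Harrison of WikiLeaks waited at Sheremetyevo",
    "Sarah Harrison and WikiLeaks released a statement",
    "Sheremetyevo is an airport in Moscow",
]

def toy_hit_count(query):
    phrases = [p for p in query.split('"') if p.strip()]
    return sum(all(p in d for p in phrases) for d in DOCS)

core = ["Sarah Harrison", "WikiLeaks"]
print(rank_crust(["Sheremetyevo", "Moscow"], core, toy_hit_count))
```

Keeping the hit counter pluggable means the same ranking code can be tried against a custom search engine, a local index or a test stub.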
33. Experimental Settings
CORE: (2 configurations)
1. Entity Frequency
• Core1: Jaro-Winkler > 0.9
• Core2: Frequency based on exact string matching
2. Cohesiveness:
• Everything is Connected Engine [3]
• Skb(e1, e2) > 0.125
[3] Everything is Connected Engine: https://github.com/mmlab/eice
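The two Core steps can be sketched as a frequency count with fuzzy label merging followed by a cohesiveness filter. Several points are assumptions rather than the authors' setup: `difflib.SequenceMatcher` is used as a stand-in for Jaro-Winkler (it is a different similarity measure), the minimum frequency of 2 is illustrative, an entity is kept when it is related to at least one other frequent entity, and the `relatedness` callable is a placeholder for a knowledge-base score such as Skb.

```python
from collections import Counter
from difflib import SequenceMatcher

def similar(a, b, threshold=0.9):
    # Stand-in for Jaro-Winkler: difflib's ratio is a different measure.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() > threshold

def fuzzy_frequencies(mentions, threshold=0.9):
    """Count mentions, merging labels that are near-duplicates."""
    counts = Counter()
    for m in mentions:
        match = next((k for k in counts if similar(k, m, threshold)), None)
        counts[match if match is not None else m] += 1
    return counts

def build_core(mentions, relatedness, min_freq=2, min_rel=0.125):
    """Keep frequent entities that are related to another frequent entity."""
    counts = fuzzy_frequencies(mentions)
    frequent = [e for e, c in counts.items() if c >= min_freq]
    return [e for e in frequent
            if any(relatedness(e, o) > min_rel for o in frequent if o != e)]

# Made-up relatedness table standing in for a knowledge-base score.
PAIRS = {frozenset(["WikiLeaks", "Sarah Harrison"]): 0.4}

def toy_relatedness(a, b):
    return PAIRS.get(frozenset([a, b]), 0.0)

mentions = ["WikiLeaks", "Wikileaks", "Sarah Harrison", "Sarah Harrison", "Moscow"]
print(build_core(mentions, toy_relatedness))
```

Passing `threshold=1.0` to `similar` disables the merging in practice, since a ratio can never exceed 1.0; an exact-matching variant would count the raw labels with `Counter(mentions)` instead.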
34. Experimental Settings
CRUST: (2 configurations)
1. Candidates for CRUST generation:
• Ex1: 1st ICWE 2015 by R*(50): L2+Google, F3 1W, Gauss + POP
• Ex2: 2nd ICWE 2015 by R*(50): L2+Google, F3 1W, Freq + POP
2. Function for attaching entities to CORE:
• SWEB(ei, Core) over Google CSE, default configuration
35. Experimental Settings
Projecting CORE and CRUST: (2 configurations)
• Core+Crust:
• CrustOnly:
36. Experimental Settings
Baselines:
BAS01: best run in ICWE 2015 at R*(50)
BAS02: second best run in ICWE 2015 at R*(50)
FREQ POP EXP
37. Results: Compactness
Percentage decrease of 36.9% over BAS01
IdealGT: size of SSN according to Gold Standard
(2*2*2 + 2) Runs
38. Results: Recall* over N
39. Conclusion
• News applications can benefit from the News Semantic Snapshot (NSS)
• Proposed a concentric-based model for generating the NSS:
o Formalizes the duality in entities (Representative vs. Relevant)
o Exploits the entity semantic relations between Core and Crust
o Accommodates different relevancy dimensions into a single model via the notion of web presence (SWeb)
• The concentric model better reproduces the NSS:
o Better Compactness: 36.9% over BAS01
o Similar recall, smaller size
• The concentric model is easier to implement:
o The Core can be reproduced via the Frequency Dimension
o The Crust brings up relevant entities without having to deal with fuzzy dimensions
40. Future
• Extend the number of videos considered in GT: from 5 to 23 (+18), check [4] for more information
• Spot not only relationships between the Crust and the Core but also predicates that characterize them, e.g. Editor in WikiLeaks
[4] https://github.com/jluisred/NewsConceptExpansion/wiki/Golden-Standard-Creation
41. JOSÉ LUIS REDONDO GARCIA
GIUSEPPE RIZZO
RAPHAËL TRONCY
@peputo / redondo@eurecom.fr
@giusepperizzo / giuseppe.rizzo@eurecom.fr
@rtroncy / raphael.troncy@eurecom.fr
http://www.slideshare.net/joseluisredondo/concentric-semantic-snapshot
Visit poster at booth: 34
Editor's Notes
Usage of the NSS ??
Why entities? Introduce the importance of this decision
----- Meeting Notes (6/16/15 11:16) -----
Extending the Repository
Unsupervised