2. Social Tagging Applications
Social tagging applications are used to organize, classify, manage
and share knowledge resources
• Tags are freely chosen keywords attached to resources
• Tags often describe an aspect of the resource
[Figure: tag cloud with example tags such as Keynote, Graz, I-Know, 2012, Events, Semantic Web, Conference]
KOM – Multimedia Communications Lab 2
www.wordle.net
4. Overview
• Motivation
• Basics
  • Folksonomy
  • Folksonomy Extended by Tag Types
  • Graph-based Resource Recommendation
• Challenge: Concept Drift
• AspectScore & InteliScore
• Evaluation Methodology and Metrics
• Results
• Conclusion & Future Work
5. Folksonomy
A folksonomy is a quadruple F := (U, T, R, Y), where
U – Users
T – Tags
R – Resources
Y ⊆ U × T × R – tag assignments
[Figure: example graph with the columns Users, Tags and Resources; node labels include Research, Talk, Ranking Algorithms and Slideshare]
[Hotho et al. 2006]
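The quadruple can be sketched as a small data structure. This is a minimal illustration, not code from the talk: the class name `Folksonomy`, the user names and the sample assignments are hypothetical, and U, T and R are simply derived from Y (so users, tags or resources without any assignment are not represented).

```python
from dataclasses import dataclass, field


@dataclass
class Folksonomy:
    """F := (U, T, R, Y), storing only Y and deriving U, T and R from it."""
    assignments: set = field(default_factory=set)  # Y: (user, tag, resource)

    def add(self, user, tag, resource):
        self.assignments.add((user, tag, resource))

    @property
    def users(self):
        return {u for u, _, _ in self.assignments}

    @property
    def tags(self):
        return {t for _, t, _ in self.assignments}

    @property
    def resources(self):
        return {r for _, _, r in self.assignments}


# Hypothetical sample data
f = Folksonomy()
f.add("alice", "Tango", "News about Tango")
f.add("alice", "Buenos Aires", "Tourism in Argentina")
f.add("bob", "Tango", "News about Tango")
```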
6. Folksonomy Extended by Tag Types
An extended folksonomy FA := (U, T, R, A, Y), where
U – Users
T – Tags
R – Resources
A – Tag Types
Y ⊆ U × T × R × A
A = {Topic, Resource Type, Location, Person, Event, Activity, Other}
[Figure: example graph with the columns Users, Tags and Resources; tag nodes are annotated with the tag types Person, Type, Location, Topic, Event, Activity and Other; node labels include Research, Talk, Ranking Algorithms and Slideshare]
[Böhnstedt et al. 2009]
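A typed tag assignment adds one field to each tuple of Y. The sketch below is illustrative only: the enum name `TagType`, the helper `types_of` and the sample data are hypothetical, while the seven type names are the ones listed in A.

```python
from collections import Counter
from enum import Enum


class TagType(Enum):
    TOPIC = "Topic"
    RESOURCE_TYPE = "Resource Type"
    LOCATION = "Location"
    PERSON = "Person"
    EVENT = "Event"
    ACTIVITY = "Activity"
    OTHER = "Other"


# Y as a set of (user, tag, resource, tag type) tuples; hypothetical sample data
typed_assignments = {
    ("alice", "Buenos Aires", "Tourism in Argentina", TagType.LOCATION),
    ("alice", "Buenos Aires", "News about Tango", TagType.TOPIC),
    ("alice", "Tango", "News about Tango", TagType.TOPIC),
}


def types_of(tag, assignments):
    """Count how often a tag is used with each tag type."""
    return Counter(a for _, t, _, a in assignments if t == tag)


buenos_aires_types = types_of("Buenos Aires", typed_assignments)
```

The same tag string can therefore carry different types in different assignments, which is what the type-based disambiguation relies on.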
8. Adapted PageRank
[Hotho et al. 2006]
[Figure: example folksonomy graph with the nodes Tango, Dancing, Festival and Buenos Aires, annotated with node and edge weights]
PageRank's intelligent surfer model
• The ranking of a node is determined by how often the surfer visits the node
• Adjoining edges are followed with a certain probability, determined by the edge weights
• The query node acts as the starting point and focus, i.e. the surfer returns to this node with a certain probability, determined by the node weights
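The surfer model can be sketched as a weighted PageRank with a preference vector. This is a simplified illustration under stated assumptions, not the exact adapted PageRank of Hotho et al.: the function name, the damping value 0.7, the fixed iteration count and the sample graph are all hypothetical, and every node in `preference` is assumed to occur in the graph.

```python
def personalized_pagerank(edges, preference, damping=0.7, iterations=50):
    """edges: {(a, b): weight} of an undirected graph;
    preference: {node: weight}, concentrated on the query node."""
    # Build a symmetric weighted adjacency map.
    adj = {}
    for (a, b), w in edges.items():
        adj.setdefault(a, {})[b] = w
        adj.setdefault(b, {})[a] = w
    nodes = list(adj)
    total_pref = sum(preference.values())
    pref = {n: preference.get(n, 0.0) / total_pref for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Mass arriving from each neighbour m, proportional to the edge weight.
            incoming = sum(
                rank[m] * w / sum(adj[m].values()) for m, w in adj[n].items()
            )
            # With probability (1 - damping) the surfer jumps back to the query node.
            new_rank[n] = damping * incoming + (1 - damping) * pref[n]
        rank = new_rank
    return rank


# Hypothetical sample graph with a user as query node
edges = {
    ("user", "Tango"): 1.0,
    ("user", "Buenos Aires"): 1.0,
    ("Tango", "News about Tango"): 2.0,
    ("Buenos Aires", "Tourism in Argentina"): 1.0,
}
ranks = personalized_pagerank(edges, preference={"user": 1.0})
```

In this sample the heavier edge between Tango and "News about Tango" pulls more of the surfer's visits towards that resource than towards "Tourism in Argentina".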
9. Challenge: Concept Drift
Concept drift is a challenge for graph-based ranking algorithms
• e.g. ambiguous tags can cause concept drift, as a single tag might represent multiple semantic concepts
[Figure: the tag "football" connected, with question marks on the edges, to the resources News about Messi, FC Barcelona Website and Dallas Cowboys' Website]
11. InteliScore
The semantic information gained from the semantic relatedness
between tags is used to reduce concept drift
[Figure: tag graph with the nodes Dancing, Tango, Festival and Buenos Aires; one edge is labelled "Semantic Relatedness (XESA) 0.005"]
XESA calculates the semantic relatedness between pairs of tokens (tags) using the English Wikipedia as reference corpus
[Scholl et al. 2010]
Source: wikipedia.org
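The slides do not state how the relatedness values enter the ranking. One possible reading, sketched here purely as an assumption, is to scale the weight of every edge that touches a tag by that tag's relatedness to the query tag. The function name `reweight_edges`, the relatedness values, the `default` for unmeasured pairs and the sample graph are all hypothetical; no real XESA API is used.

```python
def reweight_edges(edges, query_tag, relatedness, tags, default=1.0):
    """Scale each edge touching a tag by that tag's relatedness to the query tag.

    relatedness: {frozenset((tag_a, tag_b)): value}; pairs without a measured
    value keep their weight (default factor 1.0, an assumption).
    """
    adapted = {}
    for (a, b), w in edges.items():
        factor = 1.0
        for node in (a, b):
            if node in tags and node != query_tag:
                factor *= relatedness.get(frozenset((query_tag, node)), default)
        adapted[(a, b)] = w * factor
    return adapted


# Hypothetical relatedness values and graph
tags = {"Tango", "Dancing", "football"}
relatedness = {
    frozenset(("Tango", "Dancing")): 0.8,
    frozenset(("Tango", "football")): 0.05,
}
edges = {
    ("Dancing", "r1"): 1.0,
    ("football", "r2"): 1.0,
    ("Tango", "r1"): 1.0,
}
adapted = reweight_edges(edges, "Tango", relatedness, tags)
```

Edges leading through weakly related tags lose weight, so a surfer starting at the query tag is less likely to drift into an unrelated concept.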
12. AspectScore
Tag types help to alleviate concept drift
• Tags are disambiguated with respect to different aspects of a resource that a user may describe while tagging
[Figure: the tag Buenos Aires, typed as topic and as location, linked to the resources Tourism in Argentina and News about Tango]
13. AspectScore
Tag types help to alleviate concept drift
• e.g. by focusing on the tags describing the content of resources
[Figure: the tag Buenos Aires, typed as topic and as location, linked to the resources Tourism in Argentina, News about Tango and Google Map of Buenos Aires]
Assumption:
Tags of type "Topic" describe the content of the resources well, therefore "Topic" tags are given priority.
14. AspectScore: Step 1
1. Transform Query Node into Query Tags
The user's query node is transformed into tag nodes, each weighted by how often the user has used that tag
[Figure: query tags with weights: Tango (3), Buenos Aires (1), Buenos Aires (1)]
Assumption:
Tags of a user describe the user's interests well
[Abel 2011]
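Step 1 can be sketched as counting a user's typed tag usages. The function name `query_tags` and the sample assignments are hypothetical; the data is chosen so that the resulting weights are 3, 1 and 1, like those shown on the slide.

```python
from collections import Counter


def query_tags(user, typed_assignments):
    """Return {(tag, tag_type): usage count} for one user."""
    return Counter((t, a) for u, t, _, a in typed_assignments if u == user)


# Hypothetical (user, tag, resource, tag type) assignments
typed_assignments = {
    ("alice", "Tango", "r1", "Topic"),
    ("alice", "Tango", "r2", "Topic"),
    ("alice", "Tango", "r3", "Topic"),
    ("alice", "Buenos Aires", "r1", "Topic"),
    ("alice", "Buenos Aires", "r4", "Location"),
    ("bob", "Tango", "r1", "Topic"),
}
weights = query_tags("alice", typed_assignments)
```

Keeping the tag type in the key means the same tag string yields separate query tags when the user applied it with different types.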
15. AspectScore: Step 1
1. Transform Query Node into Query Tags
[Figure: folksonomy graph with the nodes Dancing, Tango, Festival and Buenos Aires, marking the Query Node and its Query Tags]
16. AspectScore: Step 2
1. Transform Query Node
into Query Tags
2. Create Folksonomy
Graph for each Query Tag
17. AspectScore: Step 2
1. Transform Query Node into Query Tags
2. Create Folksonomy Graph for each Query Tag
Depending on Ranking Algorithm, e.g. FolkRank
[Figure: folksonomy graph for one query tag, with the nodes Tango, Dancing, Festival and Buenos Aires and their node and edge weights]
18. AspectScore: Step 3
1. Transform Query Node into Query Tags
2. Create Folksonomy Graph for each Query Tag
3. Adapt Edge Weights
Edge weights are adapted (in several iteration steps) depending on the query tag
[Figure: folksonomy graph with the nodes Tango, Dancing, Festival and Buenos Aires and adapted edge weights]
19. AspectScore: Step 4
1. Transform Query Node into Query Tags
2. Create Folksonomy Graph for each Query Tag
3. Adapt Edge Weights
4. Run Ranking Algorithm
Run e.g. FolkRank on the adapted folksonomy graph
[Figure: folksonomy graph with the nodes Tango, Dancing, Festival and Buenos Aires and adapted edge weights]
20. AspectScore: Step 5
1. Transform Query Node into Query Tags
2. Create Folksonomy Graph for each Query Tag
3. Adapt Edge Weights
4. Run Ranking Algorithm
5. Accumulate Results for each Query Node
The resulting rankings are accumulated, giving preference to certain tag types, e.g. topic tags
[Figure: query tags with accumulation weights: Tango 3δ Topic, Buenos Aires 1δ Topic, Buenos Aires 1δ Location]
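The accumulation step can be sketched as a weighted sum of the per-query-tag rankings. The slide only shows weights of the form "usage count times δ per tag type", so the linear combination below is an assumption; the function name `accumulate`, the δ values in `type_weights` and the sample scores are hypothetical.

```python
from collections import defaultdict


def accumulate(rankings, query_weights, type_weights):
    """rankings: {(tag, tag_type): {resource: score}}, one ranking per query tag."""
    total = defaultdict(float)
    for key, ranking in rankings.items():
        _, tag_type = key
        # Usage count of the query tag times the preference for its tag type.
        factor = query_weights[key] * type_weights.get(tag_type, 1.0)
        for resource, score in ranking.items():
            total[resource] += factor * score
    return dict(total)


query_weights = {
    ("Tango", "Topic"): 3,
    ("Buenos Aires", "Topic"): 1,
    ("Buenos Aires", "Location"): 1,
}
type_weights = {"Topic": 2.0, "Location": 1.0}  # hypothetical delta values
rankings = {
    ("Tango", "Topic"): {"News about Tango": 0.5, "Tourism in Argentina": 0.1},
    ("Buenos Aires", "Topic"): {"Tourism in Argentina": 0.4},
    ("Buenos Aires", "Location"): {"Google Map of Buenos Aires": 0.6},
}
scores = accumulate(rankings, query_weights, type_weights)
```

With these sample values the resources reached through frequently used topic tags end up ahead of the one reached only through a location tag.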
22. Evaluation Methodology: LeavePostOut
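A post is the set of all tag assignments of one user u to one resource r: P_{u,r} = {(u, t, r) | (u, t, r) ∈ Y}
[Figure: the example graph with the nodes Dancing, Tango, Festival and Buenos Aires, shown before and after a post is left out]
For LeavePostOut, the recommendation task with a user as input is harder than with a tag as input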
[Jäschke et al. 2007]
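The split can be sketched as removing one post from Y and keeping it as the held-out test item. The function name `leave_post_out` and the sample data are hypothetical; tuples are ordered (user, tag, resource), following Y ⊆ U × T × R.

```python
def leave_post_out(assignments, user, resource):
    """Split Y into (training assignments, held-out post of user and resource)."""
    post = {(u, t, r) for (u, t, r) in assignments if u == user and r == resource}
    return assignments - post, post


# Hypothetical sample data
Y = {
    ("alice", "Tango", "r1"),
    ("alice", "Dancing", "r1"),
    ("alice", "Festival", "r2"),
    ("bob", "Tango", "r1"),
}
train, held_out = leave_post_out(Y, "alice", "r1")
```

After the split, the training data no longer links alice to r1 at all, while the tags Tango and r1 are still connected through bob's assignment.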
23. Evaluation Methodology: LeaveRTOut
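RT_{r,t} is the set of all assignments of one tag t to one resource r, across all users: RT_{r,t} = {(u, t, r) | (u, t, r) ∈ Y}
[Figure: the example graph with the nodes Dancing, Tango, Festival and Buenos Aires, shown before and after a resource-tag pair is left out]
For LeaveRTOut, the recommendation task with a tag as input is harder than with a user as input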
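LeaveRTOut can be sketched in the same way, this time removing every assignment that connects one resource with one tag. The function name `leave_rt_out` and the sample data are hypothetical; tuples are ordered (user, tag, resource).

```python
def leave_rt_out(assignments, resource, tag):
    """Split Y into (training assignments, held-out resource-tag assignments)."""
    held_out = {
        (u, t, r) for (u, t, r) in assignments if r == resource and t == tag
    }
    return assignments - held_out, held_out


# Hypothetical sample data
Y = {
    ("alice", "Tango", "r1"),
    ("alice", "Dancing", "r1"),
    ("bob", "Tango", "r1"),
    ("bob", "Tango", "r2"),
}
train, held_out = leave_rt_out(Y, "r1", "Tango")
```

Here the training data keeps the user-resource links (alice still reaches r1 via Dancing in this sample), but the tag Tango no longer touches r1 directly.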
24. Evaluation Corpus
Bibsonomy corpus with a p-core extraction at level 5 to reduce noise
and to focus on the dense portion of the corpus
Corpus size before and after p-core extraction:
                       Before    After
Users                    7243       69
Bookmark resources     281550        9
Bibtex resources       469654      134
Tags                   216094      179
Tag assignments       2740834     3269
Bookmark posts         330192       51
Bibtex posts           526691      959

Tag assignments per tag type (after extraction):
Tag Type               Count
Topic                   2225
Other                    486
Resource Type            198
Event                    182
Person/Organisation      143
Activity                  35

FReSET – Domínguez García et al 2012
http://www.kom.tu-darmstadt.de/research-results/downloads/software/freset/
Knowledge and Data Engineering Group, University of Kassel: Benchmark Folksonomy Data from Bibsonomy, version of July 7th 2011
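A p-core extraction can be sketched as an iterative filter. The sketch assumes the usual definition for folksonomies, namely that at level k every remaining user, tag and resource occurs in at least k posts; the function name `p_core` and the sample data are hypothetical, and this is not the FReSET implementation.

```python
from collections import Counter


def p_core(assignments, k):
    """Return the largest subset of (user, tag, resource) assignments in which
    every user, tag and resource occurs in at least k posts."""
    current = set(assignments)
    while True:
        posts = {(u, r) for u, _, r in current}  # a post is one (user, resource) pair
        user_posts = Counter(u for u, _ in posts)
        resource_posts = Counter(r for _, r in posts)
        # Each (user, tag, resource) is unique, so this counts posts per tag.
        tag_posts = Counter(t for _, t, _ in current)
        kept = {
            (u, t, r)
            for u, t, r in current
            if user_posts[u] >= k and resource_posts[r] >= k and tag_posts[t] >= k
        }
        if kept == current:
            return kept
        current = kept  # removals can push other entities below k, so repeat


# Hypothetical sample data: user "c" and tag "y" occur in a single post only
Y = {
    ("a", "x", "r1"), ("a", "x", "r2"),
    ("b", "x", "r1"), ("b", "x", "r2"),
    ("c", "y", "r1"),
}
core = p_core(Y, 2)
```

The loop is needed because removing one sparse entity can make its neighbours sparse as well.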
25. Evaluation Metrics
Mean Average Precision: the mean of the Average Precision over several queries Q
\[ \mathrm{MAP}(Q) = \frac{1}{|Q|} \sum_{j=1}^{|Q|} \frac{1}{m_j} \sum_{k=1}^{m_j} \mathrm{Precision}(R_{jk}) \]
[Manning et al 2008]
Mean Normalized Precision: the mean of the normalized Precision at k over several queries Q
\[ \mathrm{MNP}(Q, k) = \frac{1}{|Q|} \sum_{j=1}^{|Q|} \frac{\mathrm{Precision}_j(k)}{\mathrm{Precision}_{\max,j}(k)} \]
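Both metrics can be sketched in a few lines. The function names and the sample queries are hypothetical. For MAP, m_j is taken as the number of relevant items of query j and R_jk as the ranked results down to the k-th relevant item, following Manning et al. For MNP, the slide does not define the maximum precision, so the sketch assumes it is the best achievable Precision at k, i.e. min(k, number of relevant items) / k.

```python
def average_precision(ranked, relevant):
    """Mean of the precision values at the rank of each relevant item."""
    hits, total = 0, 0.0
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(queries):
    """queries: list of (ranked list, set of relevant items)."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


def precision_at_k(ranked, relevant, k):
    return sum(1 for item in ranked[:k] if item in relevant) / k


def mean_normalized_precision(queries, k):
    total = 0.0
    for ranked, relevant in queries:
        best = min(k, len(relevant)) / k  # assumed definition of the maximum
        total += precision_at_k(ranked, relevant, k) / best if best else 0.0
    return total / len(queries)


# Hypothetical sample queries
queries = [
    (["a", "b", "c"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
map_score = mean_average_precision(queries)
mnp_score = mean_normalized_precision(queries, k=2)
```

The normalization matters for queries with fewer than k relevant items: the second sample query can reach a Precision at 2 of at most 0.5, so its normalized value is 1.0.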
26. Visualization of Results with Violin Plots
A violin plot is a combination of a box plot and a density trace
[Figure: violin plot marking the 1st quartile, the median and the 3rd quartile]
[Hintze et al. 1998]
28. Evaluation Results for LeavePostOut
Evaluation results for the recommendation task with a tag as input
Approach       MAP
AspectScore    0.2240
FolkRank       0.2136
InteliScore    0.1801
Popularity     0.0937
30. Evaluation Results for LeaveRTOut
Evaluation results for the recommendation task with a tag as input
Approach       MAP
Popularity     0.0834
AspectScore    0.0589
FolkRank       0.0529
InteliScore    0.0433
31. Conclusion and Future Work
Exploiting semantic information for resource ranking in folksonomies
AspectScore: tag disambiguation, importance of tags (based on type)
InteliScore: based on semantic relatedness between tags, e.g. XESA
[Figure: tag types Person, Type, Location, Topic, Event, Activity, Other]
Limitations
• Manually labeled tag type dataset: error prone, subjective
• XESA based on English Wikipedia: no semantic relatedness measurable for 27% of tags in corpus
Future Work
• Evaluation using CROKODIL corpus, an e-learning application with tag types (www.crokodil.de)
• User Study