Weaving Linked Open Data into User Profiling on the Social Web
1. Weaving Linked Open Data into
User Profiling on the Social Web
MultiA-Pro, Lyon, France, April 16th 2012
Fabian Abel, Claudia Hauff, Geert-Jan Houben, Ke Tao
Web Information Systems, TU Delft, the Netherlands
Delft
University of
Technology
2. What we do: Science and Engineering
for the Personal Web
domains: news social media cultural heritage public data e-learning
Personalized Personalized
Adaptive Systems
Recommendations Search
recommending
points of interest Analysis and
User Modeling
Semantic Enrichment,
Linkage and Alignment
user/usage data
Social Web
Weaving Linked Open Data into User Profiling on the Social Web 2
3. Kyle
Hometown: South
Park, Colorado
Weaving Linked Open Data into User Profiling on the Social Web 3
4. Kyle recently uploaded photos to Flickr
tags: delft, vermeer with
tags: girl
geo: The Hague earring
pearl
geo: The Hague
During his trip to the Netherlands,
he uploaded pictures to Flickr.
Weaving Linked Open Data into User Profiling on the Social Web 4
5. Kyle tweets about his upcoming trip
Looking forward to visit
Paris next week!
Weaving Linked Open Data into User Profiling on the Social Web 5
6. Interests of Kyle?
• Given Kyle’s Flickr and Twitter
activities, can we infer Kyle’s
interests? tags: delft, vermeer with
tags: girl
geo: The Hague earring
pearl
• Knowing that Kyle will visit Paris, geo: The Hague
France, can we recommend him
places that might be interesting
for him?
Looking forward to visit
Paris next week!
Weaving Linked Open Data into User Profiling on the Social Web 6
7. Challenges
• How to create a meaningful profile that supports the given
application?
how to bridge between the Social Web chatter of a user
and the candidate items of a recommender system?
c9
User Profile
c6 Application
c1 concept weight
c5 that demands user
interest profile regarding
? c1
-concepts
c3
d4 d3
d1 d5
Social Web c2 d2 d3
d4
Candidate Items Recommendations
Weaving Linked Open Data into User Profiling on the Social Web 7
8. Challenge of Recommending Points of
Interests (POIs) to Kyle
The astronomer
Johannes Vermeer
Looking forward to dbpedia:Louvre
visit Paris next week!
dbpedia:Paris
Weaving Linked Open Data into User Profiling on the Social Web 8
9. LOD-based User Modeling
concepts that can be extracted User Profile
from the user data
weighting strategies concept weight
c2 c1
c4 c9 0.4
c6 c2
c1 cy 0.1
c5
c3 0.2
cx … …
c3
Application
background knowledge that demands user
user data
(graph structures) interest profile regarding
Social Web -concepts
Linked Data
Weaving Linked Open Data into User Profiling on the Social Web 9
10. User Modeling Building Blocks
Weaving Linked Open Data into User Profiling on the Social Web 10
11. Strategies for exploiting the RDF-based
background knowledge graph
Indirect Mention
Indirect Mention
Direct Mention
@RDF_statement
@RDF_graph
dbpedia:Louvre
tags: girl with
pearl earring
geo: The Hague A B C …
dbpedia:Paris
Artifact
dbpedia:Girl_with_pearl_earring
The The
lacemaker astronomer
Johannes Vermeer
Weaving Linked Open Data into User Profiling on the Social Web 11
12. Weighting Scheme
User Profile
concept weight
c1 374
?
c2 152
?
c3 73
?
… …
Weighting scheme: count the
number of occurrences of a given
graph pattern.
Weaving Linked Open Data into User Profiling on the Social Web 13
13. Source of User Data
• Twitter
• Flickr
Weaving Linked Open Data into User Profiling on the Social Web 14
14. Mining the Geographic Origins of User Data
• Tweets or Flickr images posted by a GPS-enabled device;
• Images geo-tagged manually on Flickr world map
• Otherwise : exploit the title and the tags the users assign
to their images
Weaving Linked Open Data into User Profiling on the Social Web 15
15. Evaluation
Weaving Linked Open Data into User Profiling on the Social Web 16
16. Research Questions
1. How does the source of user data influence the quality in
deducing user preferences for POIs?
2. How does the consideration of background knowledge from
the Linked Open Data Cloud impact the quality of the user
modeling?
3. What (combination of) user modeling strategies allows for
the best quality?
Weaving Linked Open Data into User Profiling on the Social Web 17
17. Dataset
users 394
duration 11 months
tweets 2,489,088
location 11%
pictures 833,441
location 70.6%(within 10km)
Weaving Linked Open Data into User Profiling on the Social Web 18
18. Dataset Characteristics: Profile Sizes
81.4% of the users,
Twitter-based profiles are
bigger than Flickr-based profiles
Weaving Linked Open Data into User Profiling on the Social Web 19
19. Experimental Setup
• Task:
= Recommending POIs
= Predicting POIs which a user will visit
• Ground truth:
• split data into training data (= first nine months) and test data (= last
two months)
• POIs that the user visited in the last two months are considered as
relevant
• Metrics:
• Precision@k, Recall@k and F-Measure@k: precision, recall and f-
measure within the top k of the ranking of recommended items
Weaving Linked Open Data into User Profiling on the Social Web 20
20. Results: Impact of User Data Source
Selection
Combination of Twitter
!#+"
2"#*
and Flickr user data
!#*" allows for best
&' +#$, --* ( . * #, &
!# "
) performance
/01
!# "
(
!# "
' 78 $! "
,
!#&" 98 $! "
!# "
%
! "#$%% ( )*
, 8 $! "
!#$"
!"
, -./01" 23 .4 51" , -./01"6 "
23 .4 51"
Twitter seems to be
more valuable for the
given application
Weaving Linked Open Data into User Profiling on the Social Web 21
21. Results: Impact of Strategies for Exploiting
RDF-based Background Knowledge
The more background
!# "
,
2"#*
information the better
!#+"
the user modeling
&' +#$, --* ( . * 1 #, &
!#*"
!# "
) performance
/0
!# "
(
89 $! "
,
!# "
'
!#&" : 9 $! "
! "#$%% ( )*
!# "
% ; 9 $! "
!#$"
!"
- ./012" .4- ./012" .4- ./012"
3 04564" 3 04564"7 3 04564"7"
" 7
Weaving Linked Open Data into User Profiling on the Social Web 22
22. Results: Combining different Strategies
Combining all background
exploitation strategies improves the
user modeling performance clearly
Weaving Linked Open Data into User Profiling on the Social Web 23
23. Conclusions
What we did:
• LOD-based User Modeling on the Social Web
• Different strategies for exploiting RDF-based background
knowledge
Findings:
1. Combination of different user data sources (Flickr & Twitter) is
beneficial for the user modeling performance
2. User modeling quality increases the more background
knowledge one considers
3. Combination of strategies achieves the best performance
Future work:
• Investigate weighting schemes that weight the different RDF
graph patterns for acquiring background knowledge differently
Weaving Linked Open Data into User Profiling on the Social Web 24
24. Thank you!
Slides : http://goo.gl/Zdg4K
Email: K.Tao@tudelft.nl
Twitter: @wisdelft @taubau
http://persweb.org
Weaving Linked Open Data into User Profiling on the Social Web 25
Notas do Editor
In Web Information Systems Group from TU Delft, we gather user/usage data from social webUpon that…Furthermore, build application for personalized rec, personalized search, and adaptive systems.The domains includes…In this talk, we focus on recommending the points of interest. Before diving into the theoretical model, let’s come to a story.
Kyle is a guy from South Park, Colorado.
Recently, he travelled to the Netherlands, visited a museum in The Haugue, where he took pictures of two works and uploaded to Flickr.
On another platform, he also tweeted about his upcoming trip. So he was planning to visit Paris soon.
As we want to build an application for recommending the POIs, the problem here is to infer the personal interests of Kyle, from two sites of the Social Web.
…Go back to the example… turn
The view of Delft & the girl with pearl earing are works of Johannes Vermeer (a Dutch painter in 17th century). We assume that Kyle like these two work, and will also be interested in other works of Vermeer, like the astronomer and the lacemaker, etc. Explanation of POIs, what do we want to recommendPOIs = The entities with a type of Place on DBpedia, showed in purple in this slides.
To tackle these challenges…Find the concepts in LOD
Too difficult
----- Meeting Notes (11/4/12 15:37) -----Add more contents
For recommending the Point of Interests, we used two social web platforms as sources of user data; one is the most popular microblogging service Twitter, another one is the most popular images sharing site Flickr.
Let us first look at the Flickr and Twitter profiles…
Might be better to remove the Costs column...?be prepared for the question "How do you know that a user has visited a POI?" -> "POIs that the user tweeted about or took a photo of"
MENTION: this means that the two sources complement each other
further explanation of combining different strategies
Our framework extracts typed entities from enriched tweets/news and provides strategies for detecting semantic (trending) relationships between entities. We:investigated the precision and recall of the relation detection strategies,analyzed how the strategies perform for each type of relationships andWhich strategy performs best in detecting relationships between entities?Does the accuracy depend on the type of entities which are involved in a relation?How do the strategies perform for discovering relationships which have temporal constraints, and how fast can the strategies detect (trending) relationships?evaluated the quality and speed for discovering trending relationships that possibly have a limited temporal validity.