Markus Strohmaier presented research on extracting semantics from crowdsourced data. The goals are to utilize online crowd behavior for constructing and maintaining large-scale semantic structures. Methods discussed include analyzing social labeling with hashtags, social tagging patterns like tag relatedness and hierarchies, and extracting knowledge from social navigation. A prototype called HANNE uses inductive concept learning from navigation data to enrich knowledge bases. The research aims to scale semantic extraction to very large groups and online contexts while maintaining semantic quality.
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
Extracting Semantics from Crowds
1. Knowledge Management Institute
Extracting Semantics from Crowds
Stanford Center for Biomedical Informatics Research, Colloquium
Stanford University, Palo Alto, CA
Jan 6th, 2010
Markus Strohmaier
Assistant Professor
Graz University of Technology, Austria
Visiting Scientist
(XEROX) PARC, USA
Markus Strohmaier 2011
1
2. Knowledge Management Institute
with …
D. H li C Kö
D Helic, C. Körner, J. Pöschko, C. W
J Pö hk C Wagner
(Graz University of Technology, Austria)
D. Benz, G
D Benz G. Stumme
(U. of Kassel, Germany)
A. Hotho
(U. of Würzburg, Germany)
S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen
(U. Leipzig,
(U of Leipzig Germany)
Markus Strohmaier 2011
2
3. Knowledge Management Institute
Classification Systems
in Information and Library Sciences
Usually produced and maintained by few
(e.g.
(e g dozens of) domain experts
experts.
but: used by many (potentially millions).
Can a very large group (a crowd) of users
contribute to ontology engineering efforts?
Markus Strohmaier 2011
3
4. Knowledge Management Institute
Crowdsourcing Semantics:
An Experiment (2008)
LoC
L C posted 3000 photos
t d h t
to flickr:
24 hours after launch:
• over 4,000 unique tags
• about 19,000 tags added
Output? noisy,
How can this data contribute to the
unstructured, unqualified,
induction of semantic structures? weak semantics
Markus Strohmaier 2011
4
5. Knowledge Management Institute
Research Agenda
Vision:
Vi i
Utilizing online behavior of crowds for
the construction, maintenance and enrichment
construction
of large-scale semantic structures.
Mission:
• to model behavior of large numbers (millions) of users online
• to develop techniques and algorithms that acquire semantic structures
from users‘ interaction with data
• to influence user behavior and emerging semantics
• to evaluate results
Markus Strohmaier 2011
5
6. Knowledge Management Institute
Extracting Semantics from …
Motivation
Social Labeling
• Hashtag Semantics
Social Tagging
• Tag Relatedness
• Tag Generality
• Tag Hierarchies
Social Navigation
• Navigational Knowledge
Engineering
g g
Markus Strohmaier 2011
6
7. Knowledge Management Institute
Activities of Users Online
Users engage in…
Labeling Tagging Navigation
Can we tap into the outcome of these
activities to extract semantic structures?
Markus Strohmaier 2011
7
8. Knowledge Management Institute
Extracting Semantics from Crowds
Motivation
Social Labeling
• Hashtag Semantics
Social Tagging
• Tag Relatedness
• Tag Generality
• Tag Hierarchies
Social Navigation
• Navigational Knowledge
Engineering
g g
Markus Strohmaier 2011
8
9. Knowledge Management Institute
Social Labeling
Example: Twitter
users label short messages
with concepts (hashtags)
Which hashtags behave as strong identifiers (if any), and could
they be mapped to concept identifiers in the Semantic Web
(URIs)?
[Laniado and Mika 2010]
Markus Strohmaier 2011
9
10. Markus Strohmaier
Knowledge Management Institute
with
and C.Wagner
2011 J. Pöschko
ConceptNet
developed by
C. Wagner, M. Strohmaie The Wisdom in Tw
W er, weetonomies: Acquiri ing Latent Conceptua Structures from So
al ocial Awareness Stre
eams,
10
Semmantic Search 2010 Workshop (SemSearch2
W 2010), in conjunction with the 19th Internatio
w onal World Wide Web Conference (WWW20
C 010),
Rale
eigh, NC, USA, April 26-30, ACM, 2010.
2
11. Markus Strohmaier
Knowledge Management Institute
such a network?
p
with
X is a subconcept of Y
e.g. X is more general than Y,
and C.Wagner
2011 J. Pöschko
ConceptNet
Can we qualify semantic associations in
ent
Ev-
ent
Ev-
developed by
C. Wagner, M. Strohmaie The Wisdom in Tw
W er, weetonomies: Acquiri ing Latent Conceptua Structures from So
al ocial Awareness Stre
eams,
11
Semmantic Search 2010 Workshop (SemSearch2
W 2010), in conjunction with the 19th Internatio
w onal World Wide Web Conference (WWW20
C 010),
Rale
eigh, NC, USA, April 26-30, ACM, 2010.
2
12. Knowledge Management Institute
Extracting Semantics from Crowds
Motivation
Social Labeling
• Hashtag Semantics
Social Tagging
• Tag Relatedness
• Tag Generality
• Tag Hierarchies
Social Navigation
• Navigational Knowledge
Engineering
g g
Markus Strohmaier 2011
12
13. Knowledge Management Institute
Social Tagging
Example: Delicious
Users label and categorize
Resources resources with concepts (tags)
Tags
Users
U
A folksonomy is a tuple F:= (U, T, R, Y) where
• th th
the three di j i t fi it sets U T R correspond t
disjoint, finite t U, T, d to user 1
– a set of persons or users u ∈ U
– a set of tags t ∈ T and
– a set of resources or objects r ∈ R tag 1 res. 1
• Y ⊆ U ×T ×R, called set of tag assignments
Markus Strohmaier 2011
13
14. Knowledge Management Institute
Tag Relatedness
Different
tag similiarity
measures
applied to del.icio.us
C. Cattuto, D. Benz, A. Hotho, G. Stumme, Semantic Grounding of Tag Relatedness in Social Bookmarking Systems, 7th International Semantic Web Conference
ISWC2008, LNCS 5318, 615-631 (2008).
Markus Strohmaier 2011
14
15. Knowledge Management Institute
Tag Generality
frequency
Del.icio.us
with D. Benz
et al.
D. Benz, C. Körner, A. Hotho, G. Stumme, M. Strohmaier, One Tag to bind them all: Measuring Term Abstractness in Social Metadata, Submitted to ESWC 2011.
Markus Strohmaier 2011
15
16. Knowledge Management Institute
Tag Hierarchy
TP..Taxonomic TR..Taxonomic TF..Taxonomic TO…Taxonomic
precision recall F measure overlap
Different folksonomy algorithms
with D. Helic
et al.
M. Strohmaier, D. Helic, D. Benz. Evaluation of Folksonomy Induction Algorithms. Submitted to the ACM Transactions on Intelligent Systems and Technology, (2011).
Markus Strohmaier 2011
16
17. Knowledge Management Institute
Extracting Semantics from Crowds
Motivation
Social Labeling
• Hashtag Semantics
Social Tagging
• Tag Relatedness
• Tag Generality
• Tag Hierarchies
Social Navigation
• Navigational Knowledge
Engineering
g g
Markus Strohmaier 2011
17
18. Knowledge Management Institute
Social Navigation
Users search for and navigate
to certain sets of resources
Can we turn ordinary users into contributors for
ontology enrichment?
t l i h t?
Navigational Knowledge Engineering:
A light-weight methodology for low-cost
knowledge engineering by a massive user base.
Markus Strohmaier 2011
18
19. Knowledge Management Institute
Prototype: HANNE
HANNE: Holistic Application for Navigational
HANNE H li ti A li ti f N i ti l
Knowledge Engineering
http://aksw.org/Projects/NKE
http://aksw org/Projects/NKE
with Sebastian Hellmann et al., University of Leipzig
Markus Strohmaier 2011
19
20. Knowledge Management Institute
Navigational Knowledge Engineering
http://aksw.org/Projects/NKE
Example: Extending DBPedia with NKE
Markus Strohmaier 2011
S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, M. Strohmaier, Navigational Knowledge Engineering, Submitted to WWW 2011. 20
21. Knowledge Management Institute
Navigational Knowledge Engineering
Choose initial positive and
negative examples from the
search result.
Here we are looking for
Football Clubs in Saxony, a
region in German
Germany.
Markus Strohmaier 2011
http://aksw.org/Projects/NKE
21
22. Knowledge Management Institute
Inductive Concept Learning
To test the coverage, the function
Returns the value true if e is covered by H, and false otherwise.
Markus Strohmaier 2011
Nada Lavrac and Saso Dzeroski, Inductive Logic Programming: Techniques and Applications, 1994 22
23. Knowledge Management Institute
HANNE &
The Problem of Inductive Concept Learning
given:
• Background knowledge (OWL/DL knowledge base)
• positive and negative examples of a concept
(instances)
(i t )
find:
fi d
• A hypothesis (expressed as OWL class descriptions)
that covers all positive and no negative examples
Markus Strohmaier 2011
23
24. Knowledge Management Institute
Based on the
extension, ICL
searches for
suitable
hypotheses.
Markus Strohmaier 2011
S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, M. Strohmaier, Navigational Knowledge Engineering, Submitted to WWW 2011. 24
25. Knowledge Management Institute
Concepts Learned
• The Learned C
Concept is shown in Manchester O OWL SSyntax
• The user can retain the concept for later retrieval.
• Saved concepts are displayed as social navigation
suggestions. Can be used to enrich existing knowledge
base.
base
Markus Strohmaier 2011
S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, M. Strohmaier, Navigational Knowledge Engineering, Submitted to WWW 2011. 25
26. Knowledge Management Institute
Result
Useful properties:
• Biased towards high recall
• Scales well: Number of training
examples is more important than the
size of the background knowledge
With only 2 positives and 4 negatives,
it is possible to find 13 more
instances, which are f tb ll clubs
i t hi h football l b
situated close to Saxony, Germany
Markus Strohmaier 2011
S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, M. Strohmaier, Navigational Knowledge Engineering, Submitted to WWW 2011. 26
28. Knowledge Management Institute
Summary
We
W can observe that:
b th t
• semantic structures can be obtained as a byproduct
of online crowd behavior (t i categorizing, navigation)
(tagging, t i i i ti )
• these structures can approximate structures in
reference knowledge bases (DBPedia WordNet etc)
(DBPedia, WordNet,
but:
• different levels of scale mean different degrees of
formality and quality of semantic structures
• pragmatics influences resulting semantics
Markus Strohmaier 2011
28
29. Knowledge Management Institute
Outlook
Current and planned collaborations:
C t d l d ll b ti
(XEROX) PARC, USA and U. Würzburg, Germany:
• PoSTS: Interactions between pragmatics and semantics in social tagging systems
• DFG/FWF coop. project proposal 2011 2014 (~400.000 EUR, under review)
2011-2014 ( 400.000
• Visiting Scholar with (XEROX) PARC in 2010/2011
Media-X, Stanford U., USA:
• CALHWIN: Implementation of a social health information network for California
Semantic analysis of social media
• NSF project proposal 2011-2014 (under review)
U. Of Leipzig, Germany:
• Navigational Knowledge Engineering
Related events I am involved in:
• ACM SIGWEB Working Group on Social Media (Co-chair)
• ACM Hypertext 2011 track „Social Media“ (Co-chair)
• WWW‘2011 Workshop on Usage Analysis and the Web of Data (PC member)
W k h U A l i d th W b f D t b )
Markus Strohmaier 2011
29
30. Knowledge Management Institute
A N What‘s the fknowledge of SAS? h
New A s d for S
What
Agenda Semantic R
ti Research
7 January, 2011
Web Resources
Natural Language Constructs
Real-World Happenings
Utilizing online behavior of crowds for
30
the construction, maintenance and enrichment of
large-scale semantic structures
Markus Strohmaier 2011
http://www.flickr.com/photos/waldoj/722508166/
http://www.flickr.com/photos/matthewfield/2306001896/ 30
31. Knowledge Management Institute
Thank You.
Acknowledgements
collaborators and co-authors:
D. Helic, C. Körner, J. Pöschko, C. Wagner (Graz U. of Technology, Austria)
D. Benz, G. Stumme (U. of Kassel, Germany)
A. Hotho (U. of Würzburg, Germany)
S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen (U. of Leipzig, Germany)
Markus Strohmaier 2011
31