Extracting semantics from crowds

Knowledge Management Institute

Extracting Semantics from Crowds

International Summer School on Semantic Computing (SSSC 2011)
(SSSC’2011)
August 8 - 12, 2011, Berkeley, California

Markus Strohmaier

Assistant Professor
Graz University of Technology, Austria

Visiting Scientist
(XEROX) PARC, USA

with collaborators D. Helic, C. Körner, J. Pöschko, R. Kern, C. Trattner, C. Wagner (Graz University of
Technology, Austria), D. Benz, G. Stumme (U. of Kassel, Germany), A. Hotho (U. of Würzburg, Germany), S.
Hellmann, J. Lehmann, C. Stadler, J. Unbehauen (U. of Leipzig, Germany)
Markus Strohmaier 2011
1


Egypt 2011
Twitter Trends place
event

person

2


Semantic Structures: The DMOZ Project

3


Classification Systems
in Information and Library Sciences

Usually produced and maintained by few
(e.g.
(e g dozens of) domain experts
experts.

but: used by many (potentially millions).

Can a very large group (a crowd) of users
contribute to ontology engineering efforts?

4


Objectives
Provide (some) answers to the following questions:
P id ( ) t th f ll i ti

• What is the difference between extracting semantics from text
vs. extracting semantics from crowds?
• Why should we study crowds and crowd behavior from a
semantic computing perspective?
ti ti ti ?
• How can semantics be extracted from online crowd behavior,
such as
• …from Social Labeling
• …from Social Tagging
• …from Soc a Navigation
o Social a ga o
• What are the implications for semantic computing research?

5


Extracting Semantics from …

Motivation

Social Labeling
• Hashtag Semantics
Social Tagging Extraction
• Tag Relatedness
• Tag Generality
• Tag Hierarchies
Social Navigation
• Navigational Knowledge Interaction
Engineering
g g

6


Ontology engineering vs. learning
• O t l
Ontology engineering
i i
• Expert-driven, small-scale
• Knowledge and concept identification
• Preliminary informal representation
• Formalization (RDF, OWL, etc)
• Evaluation d Maintenance
E l ti and M i t
• Ontology learning
• Data-driven, large-scale
Data driven large scale
• Source selection and data sampling
• Data exploration and probing
• Concept and link learning
• Evaluation and Updating

7


Distributional
Distrib tional Semantics
[Hovy 2011]

• I recent years, people in NLP and IR h
In t l i d have started
t t d
using a different representation for all kinds of
problems: word distributions [Harris 1954]: “words
that occur in the
same contexts tend
to have similar
meanings“

• Statistical NLP operates at word level, frequency distributions are
used as the (de facto) semantics of a word
( )
• Not actual semantics, but captures something of contents
• Not compositional: how to ‘add’ two distributions?
• No explicit theory of semantics
[Hovy 2011] Based on slides by Ed Hovy, USC, Reasoning with Text Workshop 2011 (link) 8


Why
Wh apple is similiar to pear
[based on slides by Hovy 2011 / Pantel 02]

Patrick Pantel and Dekang Lin. 2002. Discovering word senses from text. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining 9
(KDD '02). ACM, New York, NY, USA, 613-619.


Why
Wh apple is not similiar to toothbr sh
toothbrush



Distributional
Distrib tional Semantics



Hearst Patterns
H t P tt
M. Hearst, Automatic Acquisition of Hyponyms from Large Text Corpora 1992

(S1) The bow lute, such as the
Bambara ndang, is plucked and has
an individual
curved neck for each string.

12


Ontology Learning From Text

13


Evaluation

A SURVEY OF ONTOLOGY EVALUATION TECHNIQUES, Janez Brank, Marko Grobelnik, Dunja Mladenić,
Proceedings of the Conference on Data Mining and Data Warehouses (SiKDD 2005), 2005
14


Limitations of Knowledge Extraction from Text
(or why we want to study semantics in crowds)

• Delay
•Time it t k t write (hi h quality) t t on a t i
Ti takes to it (high lit ) text topic
• Population bias and dependence
• Authors´ demographics
Authors
• study language of target groups (e.g. software developers)
• Style
y
•Internal monologue vs. conversation with others
• Topical bias
• General Fiction, Mystery, Science Fiction, Romance, Humor, ..
• Books, newspapers, magazines [Brown Corpus]

Brown Corpus: http://icame.uib.no/brown/bcm.html 15


(i.e. Social Media)

How can we tap into users interaction with data and with each
other for the extraction of semantic structures?

16


Crowdsourcing

“Crowdsourcing represents the act of a company or
institution taking a function once performed by
employees and outsourcing it to an undefined (and
generally large) network of people in the form of an
open call.”
p

[J. Howe 2006]

17


Crowdsourcing Semantics:
An Experiment (2008)

LoC
L C posted 3000 photos
t d h t
to flickr:

24 hours after launch:
• over 4,000 unique tags
• about 19,000 tags added

Output? noisy,
How can this data contribute to the
unstructured, unqualified,
induction of semantic structures? weak semantics

18


Vision:
Vi i
Utilizing online behavior of crowds for
the construction, maintenance and enrichment
construction
of large-scale semantic structures.

Mission:
• to model behavior of large numbers (millions) of users online
• to develop techniques and algorithms that acquire semantic structures
from users‘ interaction with data and with each other
• to influence user behavior and emerging semantics
• to evaluate results

19


Activities of Crowds Online
Users engage in…

Labeling Tagging Navigation

Can we tap into the outcome of these activities
to extract and evaluate semantic structures?
20



Motivation

Social Labeling
Social Tagging
• Tag Relatedness
• Tag Generality
• Tag Hierarchies
Social Navigation
• Navigational Knowledge
Engineering
g g

21


Social Labeling
Example: Twitter
users label short messages
with concepts (hashtags)

Whether hashtags behave as strong identifiers, and if they could be
mapped to concept identifiers in the Semantic Web (URIs)?
[Laniado and Mika 2010]

Craig s
Craig‘s talk this morning: Background knowledge for URLs to explore other URLs
URLs,

22

Markus Strohmaier

with
and C.Wagner
2011 J. Pöschko
ConceptNet
developed by

C. Wagner, M. Strohmaie The Wisdom in Tw
W er, weetonomies: Acquiri ing Latent Conceptua Structures from So
al ocial Awareness Stre
eams,
23

Semmantic Search 2010 Workshop (SemSearch2
W 2010), in conjunction with the 19th Internatio
w onal World Wide Web Conference (WWW20
C 010),
Rale
eigh, NC, USA, April 26-30, ACM, 2010.
2

Markus Strohmaier

such a network?

p
with
X is a subconcept of Y
e.g. X is more general than Y,

and C.Wagner
2011 J. Pöschko
ConceptNet

Can we qualify semantic associations in
ent
Ev-
ent
Ev-
developed by

C. Wagner, M. Strohmaie The Wisdom in Tw
W er, weetonomies: Acquiri ing Latent Conceptua Structures from So
al ocial Awareness Stre
eams,
24

Semmantic Search 2010 Workshop (SemSearch2
W 2010), in conjunction with the 19th Internatio
w onal World Wide Web Conference (WWW20
C 010),
Rale
eigh, NC, USA, April 26-30, ACM, 2010.
2



Motivation

Social Labeling
Social Tagging
• Tag Relatedness
• Tag Generality
• Tag Hierarchies
Social Navigation
Engineering
g g

25


Social Tagging
Example: Delicious
Users label and categorize
Resources resources with concepts (tags)

Tags

Users
U

is a tuple F:= (U, T, R, Y) where
• th th
the three di j i t fi it sets U T R correspond t
disjoint, finite t U, T, d to user 1

– a set of persons or users u ∈ U
– a set of tags t ∈ T and
– a set of resources or objects r ∈ R tag 1 res. 1
• Y ⊆ U ×T ×R, called set of tag assignments

26


Tag Relatedness
ResCont

t1 t2 t3

r1 r2
Different
tag similiarity
measures
applied to del.icio.us

C. Cattuto, D. Benz, A. Hotho, G. Stumme, Semantic Grounding of Tag Relatedness in Social Bookmarking Systems, 7th International Semantic Web Conference
27
ISWC2008, LNCS 5318, 615-631 (2008).


Intuition:
Latent Hierachical Structures
high centrality:
more abstract
[
[Strohmaier et al 2011]
]

[Heyman and Garcia-Molina 2006]

low centrality:
more specific

Note: Betweeness centrality is usually difficult to calculate. Calculating all shortest
p
path is usually O(n2). For all nodes O(n3). We use an approximation.
y ( ) ( ) pp

M. Strohmaier, D. Helic, D. Benz. Evaluation of Folksonomy Induction Algorithms. Submitted to the ACM Transactions on Intelligent Systems and Technology, (2011). 28


Tag Abstractness

frequency

Del.icio.us

with D. Benz
et al.

D. Benz, C. Körner, A. Hotho, G. Stumme, M. Strohmaier, One Tag to bind them all: Measuring Term Abstractness in Social Metadata, Submitted to ESWC 2011. 29


Emergent semantics
through hierarchical clustering
Approaches:
• k-means
• Affinity propagation
• Tag generality

Applications:
• user navigation
i ti
• ontology learning
• disambiguation
on a delicious dataset
Evaluation:
• Semantic grounding to
Golden Standards (e.g.
WordNet)
W dN t)
Study of different tagging systems:
BibSonomy, CiteULike ,Delicious, Flickr, LastFM Semantic and Pragmatic Quality
varying greatly.
i tl
Anon Plangprasopchok, Kristina Lerman, and Lise Getoor (2010), Integrating Structured Metadata via Relational Affinity Propagation, in Proceedings of AAAI Workshop on Statistical
Relational AI
30


Semantic Validation of Folksonomies
Semantic Networks Semantic Groundingg
(Emergent) (Golden Standard)
via e.g. hierarchical clustering WordNet: a lexical DB for English

computers

Map- Synset Hierarchy
Programming ping
programming
distance d1 = 1 distance
d2 = 2
Python
Design
g languages
g g
patterns
abs. difference |d1 - d2| a Semantic
simple p y for the q
p proxy quality
y grounding j
java python
of emergent semantics
Based on slides by A. Hotho 31


Semantically Evaluating
Tag Hierarchies

[Dellschaft, Staab 2006]
TP..Taxonomic TR..Taxonomic TF..Taxonomic TO…Taxonomic
precision recall F measure overlap

Different folksonomy algorithms
Holds with other knowledge bases (Yago, Wikitaxonomy) and datasets (bibsonomy,
citeulike, lastfm less so)
[Dellschaft, Staab 2006] On How to Perform a Gold Standard Based Evaluation of Ontology Learning (2006), by Klaas Dellschaft , Steffen Staab, In Proceedings of the
5th International Semantic Web Conference (ISWC’06)

M. Strohmaier, D. Helic, D. Benz. Evaluation of Folksonomy Induction Algorithms. Submitted to the ACM Transactions on Intelligent Systems and Technology, (2011). 32


Limitations and Opportunities
Pragmatics i fl
P ti influence semantics:
ti

1. Tagging behavior effects preciseness of emerging
semantics [Körner et al. 2010]
2. Social t
2 S i l networks i fl
k influence resulting semantic networks
lti ti t k
[Wang and Groth 2010]
3.
3 Stream types effect stream semantics
[Wagner and Strohmaier 2010]

33


Different motivations for tagging

User A User B

This These
seems to seem to
be a be
category! keywords!

34


How do Semantics Emerge?
Are they Influenced by the Pragmatics of Tagging?
Different styles of tagging:
Diff t t l ft i
To categorize or to describe resources [Strohmaier et al. 2010]
Categorizer (C) Describer (D)

Goal later browsing later search
Change of vocabulary costly cheap
Size f
Si of vocabulary
b l limited
li it d Open
O
Tags subjective objective

Example tag clouds

Semantic Assumption:
Categorizers produce more precise emergent semantics than Describers.

[Strohmaier et al. 2010] M. Strohmaier, C. Koerner, R. Kern, Why do Users Tag? Detecting Users' Motivation for Tagging in Social Tagging Systems, 4th International AAAI Conference on
35
Weblogs and Social Media (ICWSM2010), Washington, DC, USA, May 23-26, 2010.


Example 1
p
Describers outperform categorizers on precision of
emergent tag semantics
Categorizers perform Describers perform
worse than random better than random

Random Random
users users

(C) (D)

C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference
Markus Strohmaier
(WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
2011
36


Example 2
p
Categorizers outperform describers on social classification
accuracy

Describers perform worse
than Categorizers

Supervised multi-class SVM

A. Zubiaga, C. Koerner, and M. Strohmaier. Tags vs. shelves: From social tagging to social classification. In 22nd ACM SIGWEB Conference on Hypertext and Hypermedia 37
(HT 2011), Eindhoven, Netherlands, ACM, 2011.


Example 3
Type of stream aggregation effects semantics
… the type of aggregation:
Hashtag stream aggregations are more robust against external
disturbances than user list streams

Hashtag Stream User List Stream
OR(RUa)S(Rh) OR(RUa)S(RUL)
38



Motivation

Social Labeling
Social Tagging
• Tag Relatedness
• Tag Generality
• Tag Hierarchies
Social Navigation
Engineering
g g

39


Social Navigation
Users search for and navigate
to certain sets of resources

Can we produce ontological constructs as a
byproduct of navigational activities of users ?
b d t f i ti l ti iti f
Navigational Knowledge Engineering:
A light-weight methodology for low-cost
knowledge engineering by a massive user base.

40


Prototype: HANNE
HANNE: Holistic Application for Navigational
HANNE H li ti A li ti f N i ti l
Knowledge Engineering
http://aksw.org/Projects/NKE
http://aksw org/Projects/NKE

with S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, University of Leipzig
41


Navigational Knowledge Engineering

Example: Extending DBPedia with NKE

S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, M. Strohmaier, Navigational Knowledge Engineering, (in preparation) 42


Navigational Knowledge Engineering

Choose initial positive and
negative examples from the
search result.
Here we are looking for
Football Clubs in Saxony, a
region in German
Germany.

43


The Problem of Inductive Concept Learning

U i a universal set of objects
is i l t f bj t
C is a concept: a subset of objects in U

To learn a concept C means to learn to recognize
objects in C t be able t t ll whether
bj t i C, to b bl to tell h th

Example:
U may be the set of all patients in a register, and
f
The set of all patients having a particular disease

44


Inductive Concept Learning

To test the coverage, the function

Returns the value true if e is covered by H, and false otherwise.

Nada Lavrac and Saso Dzeroski, Inductive Logic Programming: Techniques and Applications, 1994 45


HANNE &
The Problem of Inductive Concept Learning

given:
• Background knowledge (OWL/DL knowledge base)
• positive and negative examples of a concept
(instances)
(i t )

find:
fi d
• A hypothesis (expressed as OWL class descriptions)
that covers all positive and no negative examples

46


Based on the
extension, ICL
searches for
suitable
hypotheses.



Concepts Learned

• The Learned C
Concept is shown in Manchester O OWL SSyntax
• The user can retain the concept for later retrieval.
• Saved concepts are displayed as social navigation
suggestions. Can be used to enrich existing knowledge
base.
base


Result
Useful properties:
• Biased towards high recall

• Scales well: Number of training
examples is more important than the
size of the background knowledge

With only 2 positives and 4 negatives,
it is possible to find 13 more
instances, which are f tb ll clubs
i t hi h football l b
situated close to Saxony, Germany



Mock up (1)

50


Mock up (2)

51


Extracting Semantics from …

Motivation

Social Labeling
Social Tagging
• Tag Relatedness
• Tag Generality
• Tag Hierarchies
Social Navigation
Engineering
g g Conclusions
52


Objectives
Provide (some) answers to the following questions:
P id ( ) t th f ll i ti

• What is the difference between extracting semantics from text
vs. extracting semantics from crowds?
• Why should we study crowds and crowd behavior from a
semantic computing perspective?
ti ti ti ?
• How can semantics be extracted from online crowd behavior,
such as
• …from Social Labeling
• …from Social Tagging
• …from Soc a Navigation
o Social a ga o
• What are the implications for semantic computing research?

53


Social Computation for the Web of Data

Crowds provide some advantages over semantic extraction from text.

It is through the process of social computation, i.e.
the combination of social behavior and algorithmic computation,
that
th t semantic structures emerge.
ti t t

In order to extract semantics from crowds understanding users
crowds, users’
behavior and its impact on emerging semantic structures is
important.

Shaping or classifying certain users’ behavior are ways of
increasing p
g preciseness of emerging semantic structures.
g g

54


Crowd S
C d Semantics
ti
based on [Hovy 2011]

[Hovy 2011] Based on slides by Ed Hovy, USC, Reasoning with Text Workshop 2011 (link) 55


Summary
We
W can observe that:
b th t
• semantic structures can be obtained as a byproduct
of online crowd behavior (l b li t i navigation)
(labeling, tagging, i ti )
• these structures can approximate structures in
reference knowledge bases (DBPedia WordNet etc)
(DBPedia, WordNet,
but:
• pragmatics influences resulting semantics
• semantic preciseness remains a challenge

56

An Agenda for Semantic Computing
Research s
R What‘s the knowledge of SAS?
What
h

10 August, 2011
0
Web Resources
Natural Language Constructs
Real-World Happenings
Utilizing online behavior of crowds for

57
the construction, maintenance and enrichment of
large-scale semantic structures
http://www.flickr.com/photos/waldoj/722508166/
http://www.flickr.com/photos/matthewfield/2306001896/ 57


Thank You.

Acknowledgements

co-authors and collaborators

H.P. Grahsl, D. Helic, C. Körner, R. Kern, C. Trattner (TU Graz)
D. Benz, G. Stumme (U. Kassel)
A. Hotho (U. Würzburg)
S. H ll
S Hellmann, J. Lehmann, C. Stadler, J. Unbehauen (U. of Leipzig, Germany)
J L h C St dl J U b h (U f L i i G )

58

Extracting semantics from crowds

Recommended

Recommended

More Related Content

Similar to Extracting semantics from crowds

Similar to Extracting semantics from crowds (20)

Recently uploaded

Recently uploaded (20)

Extracting semantics from crowds