The document discusses analyzing online social networks by representing interactions as networks. It addresses how to define nodes and edges, issues around sampling populations and measuring relationships, and how network statistics can provide insight into how information spreads and social influence functions. Examples are given of different networks constructed from Usenet discussion data and how events are reflected in political discussion volumes over time.
2. The Questions We Will Consider Today:
• How can we represent online interactions as networks?
• Why would we want to do that?
• What are the features we should look for?
3. The Questions We Will Consider Today:
• How can we represent online interactions as networks?
• Why would we want to do that?
• What are the features we should look for?
What we will *not* consider: how to get the data (crawling, screen scraping, querying SQL databases...)
9. First Lesson
Define very carefully
• who are the nodes? – sampling issues
• what are the edges? – measurement issues
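A minimal sketch of these two decisions, using made-up reply records. The record format, the author names and the `build_reply_network` helper are hypothetical. Here nodes are authors and an edge u -> v means "u replied to v", which is only one of several defensible definitions.

```python
from collections import Counter

# Hypothetical reply records: (author_of_reply, author_replied_to).
replies = [
    ("ana", "bob"),
    ("bob", "ana"),
    ("ana", "bob"),
    ("cat", "ana"),
    ("dan", "dan"),  # self-reply, dropped as an edge below
]

def build_reply_network(records):
    """Return (nodes, weighted_edges) from (source, target) reply pairs."""
    nodes = set()
    edges = Counter()
    for source, target in records:
        # Sampling decision: every author seen in the data becomes a node.
        nodes.update((source, target))
        # Measurement decision: self-replies do not count as relationships.
        if source != target:
            edges[(source, target)] += 1  # edge weight = number of replies
    return nodes, dict(edges)

nodes, edges = build_reply_network(replies)
print(sorted(nodes))
print(edges)
```

Changing either decision (for example, keeping only authors with at least two messages, or treating replies as undirected) gives a different network from the same data.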
17. What are the edges? Defining the relationships
[Figure: network diagrams; node categories in the legend: Research Institutes, Health, Charities, NGOs, Media, Religious Organisations, Environment, Education, Security, Activism, Intergovernmental, UN Agencies, Sports Associations, Professional Associations]
Adamic and Adar (2003) Friends and Neighbors on the Web, Social Networks, 25(3): 211-230
Gonzalez-Bailon (2009) Social Factors Underlying the Structure of the Web, Social Networks, 31(4): 271-280
18. What are the edges? Defining the relationships
Three Ways of Measuring Friendship (three types of links)
Lewis et al. (2008) "Tastes, ties, and time: A new social network dataset using Facebook.com." Social Networks 30:330-42
19. What are the edges? Defining the relationships
The Difference between Weak and Strong Ties
Choudhury, M.D., W.A. Mason, J.M. Hofman, and D.J. Watts (2010). "Inferring Relevant Social Networks from Interpersonal Communication" in Proceedings of the 19th International Conference on World Wide Web.
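One simple way to separate weak from strong ties in communication data is to threshold on how often a pair interacts. The sketch below assumes made-up message counts and arbitrary thresholds. The `split_ties` helper is hypothetical and is not presented as the procedure of the paper cited above.

```python
# Hypothetical undirected message counts between pairs of users.
message_counts = {
    ("ana", "bob"): 12,
    ("ana", "cat"): 1,
    ("bob", "cat"): 4,
    ("cat", "dan"): 2,
}

def split_ties(counts, threshold):
    """Label each pair 'strong' if its count >= threshold, otherwise 'weak'."""
    strong = {pair for pair, n in counts.items() if n >= threshold}
    weak = set(counts) - strong
    return strong, weak

# The same data yield different networks depending on the threshold chosen.
for threshold in (1, 3, 5):
    strong, weak = split_ties(message_counts, threshold)
    print(threshold, len(strong), len(weak))
```

The point of the loop is that the "strong tie" network is a measurement choice: it shrinks as the threshold rises, so any network statistic should be checked against several thresholds.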
20. Common Network Statistics
• Mean degree
• Fraction of nodes in largest component
• Geodesic distance
• Clustering coefficient (transitivity)
• Degree correlation coefficient
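The statistics listed above can be sketched on a toy undirected network using only the Python standard library. The edge list and all variable names are hypothetical. The definitions used (transitivity as closed two-paths over all two-paths, degree correlation as a Pearson correlation over edge ends) are the common textbook ones and are assumed here rather than taken from the slides.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected toy network.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def bfs_distances(source):
    """Geodesic (shortest-path) distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

mean_degree = sum(len(nbs) for nbs in adj.values()) / len(adj)

# Largest component: the biggest set of mutually reachable nodes.
largest = max((set(bfs_distances(n)) for n in adj), key=len)
largest_fraction = len(largest) / len(adj)

# Mean geodesic distance over node pairs inside the largest component.
pair_dists = [bfs_distances(u)[v] for u, v in combinations(sorted(largest), 2)]
mean_geodesic = sum(pair_dists) / len(pair_dists)

# Transitivity: share of two-paths (u - n - v) whose end points are also linked.
two_paths = sum(len(nbs) * (len(nbs) - 1) // 2 for nbs in adj.values())
closed = sum(1 for n in adj
             for u, v in combinations(sorted(adj[n]), 2) if v in adj[u])
transitivity = closed / two_paths

# Degree correlation: Pearson correlation between the degrees at the two ends
# of each edge, counting every edge in both directions. Undefined (division
# by zero) if all nodes have the same degree.
xs = [len(adj[u]) for u, v in edges] + [len(adj[v]) for u, v in edges]
ys = [len(adj[v]) for u, v in edges] + [len(adj[u]) for u, v in edges]
mean_x = sum(xs) / len(xs)
cov = sum((x - mean_x) * (y - mean_x) for x, y in zip(xs, ys))
var = sum((x - mean_x) ** 2 for x in xs)
degree_correlation = cov / var

print(mean_degree, largest_fraction, mean_geodesic, transitivity, degree_correlation)
```

For real data a network library would be the usual choice. The hand-written version is only meant to make each definition explicit.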
23. Why Should we Care about these Network Stats?
• Networks shape the flow of information
  • the most central websites and blogs are the most visible
  • the echo-chamber effect
• Networks channel social influence and contagion
  • viral marketing
  • collective action and mobilisations
• Networks allow us to understand social interactions better
24. Why Should we Care about these Network Stats?
• Networks shape the flow of information
  • the most central websites and blogs are the most visible
  • the echo-chamber effect
• Networks channel social influence and contagion
  • viral marketing
  • collective action and mobilisations
• Networks allow us to understand social interactions better
  • Are citizens becoming increasingly isolated?
27. What Surveys Tell us about Personal Networks
McPherson et al. (2006) "Social Isolation in America: Changes in Core Discussion Networks over Two Decades." American Sociological Review 71:353-375.
28. Research Questions
• Are discussion networks really shrinking? Measurement artefact?
• How stable is participation in discussion networks over time?
Gonzalez-Bailon (2010) "The Online Response to Offline Disengagement. The Growth of Internet-Enabled Political Discussion Networks (1999-2005)", under review
29. The Data: Usenet Discussions
comp.*   misc.*   news.*   rec.*   sci.*   soc.*   talk.*
comp.software   rec.music   talk.religion
comp.sys.mac   rec.art.movies   talk.politics
…   …   …
Smith, Marc. 1999. "Invisible Crowds in Cyberspace: Mapping the Social Structure of the Usenet." in Communities in Cyberspace, edited by M. Smith and P. Kollock. London: Routledge.
30. The Data: Usenet Discussions
comp.*   misc.*   news.*   rec.*   sci.*   soc.*   talk.*
comp.software   rec.music   talk.religion
comp.sys.mac   rec.art.movies   talk.politics
…   …   …
[Figure: two word clouds built from the names of the newsgroups sampled. ‘politics’: 935 groups, 670,000 users. ‘politica’: 97 groups, 89,000 users. Terms visible in the clouds include democrats, republicans, libertarian, immigration, nationalism, taxation, guns, clinton, bush, congress, elections, lega-nord, alleanza-nazionale, destra, cattolici, italia and referendum.]
31. Number of discussions started
[Figure: time series of the number of discussions started in (a) politics and (b) politica. x-axis: dates from 19990901 to 20041201, with a tick every three months. y-axis: ticks from 0 to 8, with the scale label extracted as "Thousands x 100 000". Events annotated on the chart, in order: Bush vs Gore, Berlusconi vs Rutelli, invasion Iraq, Madrid bombs, Zapatero vs Rajoy, Bush vs Kerry.]
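A series such as "number of discussions started" can be produced by counting thread start dates per month. The sketch below uses made-up dates, and the `threads_per_month` helper is hypothetical. It only illustrates the aggregation step, not how the original data were processed.

```python
from collections import Counter
from datetime import date

# Hypothetical start dates of discussion threads.
thread_starts = [
    date(2000, 10, 28),
    date(2000, 11, 3),
    date(2000, 11, 8),
    date(2000, 11, 8),
    date(2001, 1, 15),
]

def threads_per_month(starts):
    """Count threads started in each (year, month), in chronological order."""
    counts = Counter((d.year, d.month) for d in starts)
    return sorted(counts.items())

for (year, month), n in threads_per_month(thread_starts):
    print(f"{year}-{month:02d}: {n}")
```

Months with no new threads do not appear in the output, so they would need to be filled with zeros before plotting a continuous series.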
35. The Effects of Online Networks on Length of Commitment
[Figure: coefficient plots for two models. Predictors: Responses received, Discussions started, Transitivity, Size personal network, Messages sent. x-axis from -0.6 to 0.2.]
(a) politics (N~280,000), R^2 = 0.114
(b) politica (N~20,000), R^2 = 0.149
37. Users Don’t Stay for Long, however...
[Figure: survival curves for ‘politics’ (N~336,000) and ‘politica’ (N~23,000). y-axis: Survival Probability. x-axis: Months, from 0 to 60. Each panel shows two curves, labelled "One" and "More than one".]
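Survival curves of this kind are often estimated with the Kaplan-Meier method. The slides do not say which estimator was used, so this is an assumption. The data and the `kaplan_meier` helper below are made up for illustration.

```python
# Hypothetical (months_active, left) pairs; left=False means the user was
# still active when the observation window closed (right-censored).
users = [(1, True), (1, True), (2, True), (3, False), (5, True), (5, False)]

def kaplan_meier(observations):
    """Return [(month, survival_probability)] at each month with a departure."""
    curve = []
    survival = 1.0
    for t in sorted({m for m, left in observations if left}):
        # Users still under observation just before month t.
        at_risk = sum(1 for m, _ in observations if m >= t)
        # Users observed to leave exactly at month t.
        departed = sum(1 for m, left in observations if m == t and left)
        survival *= 1 - departed / at_risk
        curve.append((t, survival))
    return curve

for month, prob in kaplan_meier(users):
    print(month, round(prob, 3))
```

Censored users still count as "at risk" until the month they drop out of observation, which is why they cannot simply be discarded or treated as departures.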
38. Research Questions
• Are discussion networks really shrinking?
No, and online networks also have a positive impact on engagement
• How stable is participation in discussion networks over time?
Not much; it is difficult to tell whether it is more or less stable than offline (we don’t have offline data)
39. Let’s assess the example:
• Who are the nodes?
• What is the meaning of links?
• What is the sampling strategy?
• How is the time dimension dealt with?
• What are the network features analysed?
40. If you need more info...
Hansen, D., B. Shneiderman, and M. Smith (2010). Analyzing Social Media Networks with NodeXL. Morgan Kaufmann
Newman, M.E.J. (2010). Networks: An Introduction, Oxford University Press, Oxford, UK