1. How to utilize ‘big data’ on
SNS for academic purpose?
Virtual Knowledge Studio (VKS)
Asso. Prof. Dr. Han Woo PARK
CyberEmotions Research Institute
Dept. of Media & Communication
YeungNam University
214-1 Dae-dong, Gyeongsan-si,
Gyeongsangbuk-do 712-749
Republic of Korea
http://www.hanpark.net
http://asia-triplehelix.org
A Keynote to the Japanese Society of Socio-Informatics Annual Meeting, September 14, 2012, Maebashi
3. Big data
Big data usually includes data sets with sizes
beyond the ability of commonly-used software tools
to capture, manage, and process the data within a
tolerable elapsed time.
Big data sizes may vary per discipline.
Characteristics: Gartner’s 3Vs plus SAS’s VC
- Volume (amount of data), velocity (speed of data in
and out), variety (range of data types and sources)
- Variability: Data flows can be highly inconsistent
with daily, seasonal, and event-triggered peak data
loads
- Complexity: Multiple data sources requiring cleaning,
linking, and matching the data across systems.
4. Computational Social Science
A minor but growing approach to
the study of society
Focus on the methodological
perspective based on the use
of new digital tools to manage
the data deluge
5. CSS Approach
1. development of webometric tools to automate
social Internet research process (e.g., data
collection and analysis from search engines,
SNS and microblogging sites)
2. experimentation with new types of data
visualization (e.g., HNA and dynamic
geographical mappings using Google)
8. Research tradition of Webometrics
• 1) development of online tools to automate
the Internet research process, such as data
collection and analysis
• 2) experimentation with new types of data
visualization, such as social network and
hyperlink analysis and multimedia and
dynamic mappings
9. Interface
WCU WEBOMETRICS INSTITUTE
INVESTIGATING INTERNET-BASED POLITICS WITH E-RESEARCH TOOLS
The interface is fairly self-explanatory:
-Tick or untick to collect either only hit
number or the title, URL, and description
of the results
- Select which of the search options you
want to include
- Click on the '...' button to select the text file
that contains the queries you wish to run
- Click 'Run Queries'
11. Cyworld Extractor - Overview
Java-based software
tool that, given the
URL of a politician
on Cyworld, extracts
comments given by
citizens along with
related profile
attributes.
The extracted data,
which can amount to
thousands of
records, are stored in
a format suitable for
import into
statistical software
12. Twitter Extractor - Overview
Sharing a similar interface
and extraction mechanism
with the Cyworld
extractor, this application
requires the URL of a user
on Twitter. It is then
possible to collect all
tweets and determine the
attributes of the user’s
follower / following
network
15. OhMyNews vs. Chosun: Emotionality comparison
(Jul 2009 - Feb 2010)
[Figure: mean polarity by topic (e.g., water, France, EU, Obama, climatechange, H1N1, NorthKorea, Uganda) for OhMyNews and Chosun, on an axis from -1.00 to 1.00]
16. • Using sentiment analysis, we are trying to
find differences and similarities in the emotional
polarity of the main topics covered in news stories
by OhMyNews versus Chosun.
• "MEAN POLARITY" represents polarity on a
scale from -1 (negative) to 1 (positive) for 78
popular topics covered in both newspapers.
• For example, the topic "Uganda" tends to be
mentioned in a positive context by OhMyNews,
but in a negative context by Chosun. The topic
"opposition" tends to be neutral in OhMyNews,
but positive in Chosun, and so on.
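The mean polarity comparison described on this slide can be sketched as follows. This is a minimal illustration only: the function name `mean_polarity`, the tuple layout, and the sample scores are assumptions, not the tool or data used in the study.

```python
from collections import defaultdict

def mean_polarity(scored_mentions):
    """Average polarity per (outlet, topic) pair.

    scored_mentions: iterable of (outlet, topic, score) tuples,
    where score lies on the -1 (negative) to 1 (positive) scale.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (outlet, topic) -> [sum, count]
    for outlet, topic, score in scored_mentions:
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
        entry = totals[(outlet, topic)]
        entry[0] += score
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Hypothetical scores for illustration; not taken from the study.
sample = [
    ("OhMyNews", "Uganda", 0.5),
    ("OhMyNews", "Uganda", 0.25),
    ("Chosun", "Uganda", -0.5),
    ("Chosun", "Uganda", -0.25),
]
means = mean_polarity(sample)
```

Comparing `means[("OhMyNews", topic)]` with `means[("Chosun", topic)]` for each shared topic gives the kind of per-topic contrast the chart displays.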
20. Frequently occurring key words in e-science webpages in South Korea
E-science in Asia:
Dreams and realities for social science research
Created on Many Eyes(http://many-eyes.com)
Results
Park, H. W. (2010). Mapping the e-science landscape in South Korea using the webometrics
method. Journal of Computer-Mediated Communication, Vol. 15, No. 2. 211 – 229
21. Websites retrieved more than two times
Note: Websites are larger according to their frequency of retrieval; however, their
colors and locations are randomly chosen for the best visualization
Park, H. W. (2010). Mapping the e-science landscape in South Korea using the webometrics
method. Journal of Computer-Mediated Communication, Vol. 15, No. 2. 211 – 229
22. Why CSS?
• Savage and Burrows (2007, p.
886) lament, “Fifty years ago,
academic social scientists might
be seen as occupying the apex of
the – generally limited – social
science research ‘apparatus’. Now
they occupy an increasingly
marginal position in the huge
research infrastructure.”
Bonacich, P. (2004).
The Invasion of the Physicists. Social Networks 26(3): 285-288
23. <Table 1> Development stage of e-Science, Nentwich (2003)
Type: Traditional Science -------------------------> e-Science
Stage | 1 | 2 | 3 | 4
Information gathering | Libraries; personal conversations | Offline database | Online databases; link collections; discussion lists | Digital libraries; Knowbots
Data production | Interviews; experiments | Electronic text analysis; simulation/modeling | Internet surveys | Distributed computing; virtual reality
Data management | Card files; lists | Hypertextual card files; databases | Networked card files; de-central databases |
Data processing/analysis | With paper and pencil | Electronic data processing; expert systems | Modelling; simulations | Artificial intelligence
25. All models are wrong but some are useful
- Emergence of data author on dataverse
26. “Webometrics refers to a set of research methods that
illustrate texts and their web linkages as a network and
quantitatively examine the spreadable aspects of web-
mediated communication activities of social actors and
issues (Jenkins, 2011), in comparison to traditional
methods (Savage & Burrows, 2007; Salganik & Levy,
2012).” (by Han Woo Park)
28. Seminal publications: * View real-time citation rates
Garton, L., Haythornthwaite, C., & Wellman, B. (1997). Studying online social networks. Journal of Computer-Mediated Communication, 3(1).
Wellman, B. (2001). Computer networks as social networks. Science, 293(14), 2031-2034.
Park, H. W. (2003). Hyperlink network analysis: A new method for the study of social structure on the web. Connections, 25(1), 49-61.
Park, H. W., & Thelwall, M. (2003). Hyperlink analyses of the World Wide Web: A review. Journal of Computer-Mediated Communication, 8(4).
29. Recent special issues related to CSS
Special issues
- Social Science Computer Review, 2011, 29(3)
Theme: Social Networking Activities Across Countries
- Asian Journal of Communication, 2011, 21(5),
Theme: Online Social Capital and Participation in Asia-
Pacific
- Scientometrics, 2012, 90(2)
Theme : Triple Helix and Innovation in Asia using
Scientometrics, Webometrics, and Informetrics
- Journal of Computer-Mediated Communication, 2012, 17(2)
Theme: Hyperlinked Society
30. Selected publications related to CSS
Recent publications
- Park, H. W., Barnett, G. A., & Chung, C. J. (2011). Structural changes in the global hyperlink
network: Centralization or diversification. Global networks. 11 (4). 522–542
- Lim, Y. S., & Park, H. W. (2011). How Do Congressional Members Appear on the Web?:
Tracking the Web Visibility of South Korean Politicians. Government Information Quarterly.
28 (4), 514-521.
- Sandra González-Bailón, Rafael E. Banchs and Andreas Kaltenbrunner (2012). Emotions,
Public Opinion, and U.S. Presidential Approval Rates: A 5-Year Analysis of Online Political
Discussions Human Communication Research
- Sams, S., Park, H. W. (2012 forthcoming). The Presence of Hyperlinks and Messages on Social
Networking Sites: A Case Study of Cyworld in Korea. Journal of Computer-Mediated
Communication
- Nam, Y., Lee, Y.-O., Park, H.W. (2013, March). Can web ecology provide a clearer
understanding of people’s information behavior during election campaigns?. Social Science
Information.
33. Social media refers to a set of online tools that
supports social interaction between users.
34.
35.
36.
37. Cross-Cultural Analysis of
Beehive Status Messages within IBM
Users in high power
distance may use
the status messages
more for indicating
general career
interests and skills,
rather than time-
based updates of
what one is doing or
how one is feeling
41. A cross-cultural comparison of Twitter use between
Korea and Japan
How do cultural attitudes of users influence their Twitter
use?
42. Participants and Their Twitter Use
Measure | Korea (N=286) | Japan (N=283)
Gender: valid no. | 165 | 204
Gender: male | 106 (64%) | 145 (71%)
Gender: female | 59 (36%) | 59 (29%)
Reciprocity | 76.40% | 73.80%
No. of tweets | 4292 | 9347**
No. of followers | 1047** | 323
No. of followings | 980** | 285
Pieces of geographic information | 166 (58%) | 143 (51%)
No. of metropolitans: valid no. | 154 | 143
No. of metropolitans | 111 (72%) | 68 (48%)
* The percentages for gender and no. of metropolitans were calculated using only the valid cases.
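The slides do not state how the reciprocity percentage was computed. One common reading, shown here purely as an assumption, is the share of accounts a user follows that also follow the user back. The function name and the sample account sets are hypothetical.

```python
def reciprocity(followings, followers):
    """Share of a user's followings who follow the user back (assumed definition)."""
    followings = set(followings)
    if not followings:
        return 0.0
    mutual = followings & set(followers)
    return len(mutual) / len(followings)

# Hypothetical account: follows four users, three of whom follow back.
ratio = reciprocity({"a", "b", "c", "d"}, {"a", "b", "c", "x"})
```

Averaging this ratio over all sampled users in a country would yield a single country-level figure comparable in form to the table's reciprocity row.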
43. Differences in Twitter Use between
Korea and Japan 1
No significant difference in the proportion of
reciprocal connections.
A high proportion of reciprocal connections
Face negotiation theory (Ting-Toomey, 1988)
In collectivistic cultures, members consider their partners’ face as
long as this consideration does not conflict with the members’
individual needs.
Korean and Japanese users might not have wanted to
embarrass their followers by providing no response.
44. Differences in Twitter Use between
Korea and Japan 2
Korean users had more followers and followings,
which indicates that Korean users are more tolerant of
in-groups than Japanese users.
Simple versus contextual collectivism
Koreans, who reflect simple collectivism, are
flexible in defining in-groups depending on
situations, and it is common for Koreans to belong
to more than one in-group.
In Japanese culture, which reflects contextual
collectivism, members are likely to maintain a few
confined in-groups throughout their lives regardless
of the situation or context.
45. Differences in Twitter Use between
Korea and Japan 3
Japanese users posted more messages through
their Twitter timeline.
Unexpected result based on cross-cultural theories
The unexpected results may be explained by the
differences in the history of mobile communication
(not in cultural traits) between the two countries.
Japan’s mobile communications industry
46. Types of Tweets: Korea vs. Japan
* IS: Information Sharing; SP: Self-Promotion; OC: Opinions/Complaints; RT: Statements/Random Thoughts; ME: Me
Now; QF: Questions for Followers; PM: Presence Maintenance; AM: Anecdotes-Me; AO: Anecdotes-Others.
* Blue bar: Korea; red bar: Japan.
47. Analysis of Tweets
For Korean users, the primary purpose of using Twitter
was information sharing, a goal-oriented
communication action.
Information sharing is an effective communication
strategy both for facilitating faithful interactions and for
maintaining individuality.
For Japanese users, it was personal graffiti in a
disappearing (or unnecessary) communication context.
Their messages seemed as if they were talking to
themselves in public. Self-disclosure while excluding
out-group members from their personal lives.
Case analysis
49. Conclusion
• Twitter users in Korea tend to embrace their Twitter
connections within the in-group boundary, despite some
differences between Twitter connections and offline in-
group members in terms of the degree of intimacy.
• Twitter users in Japan tend to control their content and
connections to maintain closed social relationships.
• Social media have changed the ways in which
individuals socialize and communicate through the
Internet across countries and cultures.
• By negotiating with local cultures, users develop their
own communication strategies for global social media.
51. Messages tweeted during 15:00 March 11, 2011 to 7:00
March 13, 2011
Earthquake occurred around 14:46 on March 11 in Miyagi,
Fukushima, Iwate, northern Ibaraki and further, Tokyo, Ibaraki, Chiba
(South Kanto area)
During the earthquake, people in the area could not use their cell phones for
calling, but they could access Twitter.
10-hour time period
Beginning 03/11 15:00 to 03/12 01:00
Middle 1 03/12 01:01 to 03/12 11:00
Middle 2 03/12 11:01 to 03/12 21:00
Last 03/12 21:01 to 03/13 07:00
DATA COLLECTION
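Assigning each tweet to one of the four 10-hour periods listed above could be sketched like this. The period boundaries come from the slide; the function name, the use of naive local (Japan) timestamps, and the truncation to whole minutes are assumptions.

```python
from datetime import datetime

# Period boundaries as listed on the slide (assumed to be local Japan time).
PERIODS = [
    ("Beginning", datetime(2011, 3, 11, 15, 0), datetime(2011, 3, 12, 1, 0)),
    ("Middle 1", datetime(2011, 3, 12, 1, 1), datetime(2011, 3, 12, 11, 0)),
    ("Middle 2", datetime(2011, 3, 12, 11, 1), datetime(2011, 3, 12, 21, 0)),
    ("Last", datetime(2011, 3, 12, 21, 1), datetime(2011, 3, 13, 7, 0)),
]

def period_of(timestamp):
    """Return the period label for a tweet timestamp, or None if outside the window."""
    # The boundaries are given to the minute, so seconds are dropped first.
    minute = timestamp.replace(second=0, microsecond=0)
    for label, start, end in PERIODS:
        if start <= minute <= end:
            return label
    return None

label = period_of(datetime(2011, 3, 12, 1, 0, 30))
```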
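52. VALID CASES
Period | Beginning | Middle1 | Middle2 | Last | Total
Valid cases | 112 | 196 | 141 | 119 | 568
Of the Japanese-language messages posted during the data collection period, only earthquake-related messages were analyzed.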
53. TYPE OF TWEET
Type | Description
Information-related | Official information from news reports or government notices
Opinion-related | Opinion or commentary related to the Japan earthquake
Technology/Media-related | Messages in which any type of media (including television and social media) is mentioned
Emotion-related | Personal emotional statements and concern about the situation and sufferers
Action-related | Suggestions or plans for helping sufferers (e.g., suggestions for donation or voluntary service)
Personal experience-based | Information described by sufferers (personal episodes and surroundings)
Other | Unrelated tweets (missing data)
Heverin, T. & Jach, L. (2010
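54. PERCENTAGE OF TWEET TYPE PER 10 HOUR TIME PERIOD
Time period (hours) | Information (%) | Opinion (%) | Technology/Media (%) | Emotion (%) | Action (%) | Personal-based (%)
1-10 | 12.9 | 4.0 | 5.0 | 23.8 | 19.8 | 34.7
11-20 | 16.7 | 9.3 | 6.7 | 16.0 | 26.0 | 25.3
21-30 | 14.1 | 5.1 | 10.3 | 12.8 | 23.1 | 34.6
31-40 | 18.1 | 13.8 | 5.3 | 5.3 | 25.5 | 31.9
• Information based on individuals' experiences (personal-based) was exchanged more than official information from the media or related organizations (information).
• Opinion messages were posted more often after some time had passed than immediately after the event.
• Emotional messages were frequent right after the event (mainly checking on others' safety, worry, anxiety, and fear) and declined over time.
• Messages urging donations or participation in recovery (action) were posted steadily. This suggests that the earthquake was perceived as a national crisis rather than as isolated damage in a particular region.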
55. Percentage of Tweet Type
per 10 hour Time Period
Time period (hours) | Information (%) | Opinion (%) | Technology/Media (%) | Emotion (%) | Action (%) | Personal-based (%)
1-10 | 12.9 | 4.0 | 5.0 | 23.8 | 19.8 | 34.7
11-20 | 16.7 | 9.3 | 6.7 | 16.0 | 26.0 | 25.3
21-30 | 14.1 | 5.1 | 10.3 | 12.8 | 23.1 | 34.6
31-40 | 18.1 | 13.8 | 5.3 | 5.3 | 25.5 | 31.9
Time period (hours) | Info (official + personal) (%) | Opinion (%) | Technology/Media (%) | Emotion (%) | Action (%)
1-10 | 47.6 | 4.0 | 5.0 | 23.8 | 19.8
11-20 | 42.0 | 9.3 | 6.7 | 16.0 | 26.0
21-30 | 48.7 | 5.1 | 10.3 | 12.8 | 23.1
31-40 | 50.0 | 13.8 | 5.3 | 5.3 | 25.5
57. Type of Tweets during Japanese Earthquake (Mar 11 to 13 2011)
[Figure: bar chart of tweet-type percentages (IF, OP, TM, EM, AC, PE) for the Beginning, Middle1, Middle2, and Last periods]
IF: Information-related
OP: Opinion-related
TM: Technology/Media related
EM: Emotion-related
AC: Action-related
PE: Personal information
58. Type of Tweets during Japanese Earthquake (Mar 11 to 13 2011)
[Figure: bar chart of tweet-type percentages for the total period: IF 15.6, OP 8.3, TM 6.6, EM 14.9, AC 23.9, PE 30.7]
IF: Information-related
OP: Opinion-related
TM: Technology/Media related
EM: Emotion-related
AC: Action-related
PE: Personal information
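60. Analysis of URLs included in tweet messages
TLD | Domains | %
jp | 22 | 53.7
com | 11 | 26.8
net | 4 | 9.8
tv | 1 | 2.4
uk | 1 | 2.4
org | 1 | 2.4
biz | 1 | 2.4
Total | 41 | 100
• Web pages within Japan were the most frequently provided sources of information.
• It is noteworthy that there were no URLs of government-related agencies (go.jp or gov), which suggests that seeking information from government homepages or related sites was rare during the crisis.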
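A tally of top-level domains like the one in this table could be produced along these lines. This is a sketch under assumptions: the function names and sample URLs are hypothetical, and it counts every URL it is given, whereas the slide counts distinct domains.

```python
from collections import Counter
from urllib.parse import urlparse

def tld_counts(urls):
    """Count URLs by the last label of their host name (e.g. 'jp', 'com')."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).hostname  # None when the URL has no scheme/host
        if host and "." in host:
            counts[host.rsplit(".", 1)[1]] += 1
    return counts

def has_government_domain(urls):
    """True if any URL points to a go.jp or gov host."""
    for url in urls:
        host = urlparse(url).hostname or ""
        if host.endswith(".go.jp") or host.endswith(".gov"):
            return True
    return False

# Hypothetical URLs for illustration only.
sample_urls = [
    "http://example.jp/news",
    "https://www.example.com/",
    "http://news.example.co.jp/a",
]
counts = tld_counts(sample_urls)
```

To match the slide's unit of analysis, the input list would first be reduced to one URL per distinct domain.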
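61. Category | Type | Number | % | Total
News channel | News providing | 15 | 28.4 | 15 (28.4%)
Official information channel | Weather information | 2 | 3.8 | 8 (15.2%)
Official information channel | Crisis-specialized site for survival confirmation | 2 | 3.8 |
Official information channel | Transportation information | 2 | 3.8 |
Official information channel | Search engine | 1 | 1.9 |
Official information channel | Disaster information for foreigners | 1 | 1.9 |
Personal channel | Video sharing | 7 | 13.3 | 20 (38%)
Personal channel | Personal blog | 7 | 13.3 |
Personal channel | Photo sharing | 4 | 7.6 |
Personal channel | Online community | 2 | 3.8 |
Action | Fundraising | 1 | 1.9 | 1 (1.9%)
Commodity | Online bookstore | 3 | 5.7 | 5 (9.5%)
Commodity | Internet shopping mall | 2 | 3.8 |
Other | | 4 | 7.6 | 4 (7.6%)
Total | | 53 | 100 | 53 (100%)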
62. The status of minihompy
① How active ② How famous ③ How friendly
[Figure: annotated minihompy profile showing gender, name, photo, and minihompy visitor count]
65. Sentiment Analysis of Korean
Politicians’ Cyworld mini-hompy
Chi-square = 11.472, df = 1, p<.01, two-tailed
The results indicate a significant relationship between
gender and online comments.
Comments | Male | Female | Total
Positive | 509 | 491 | 1000
Negative | 247 | 159 | 406
Total | 756 | 650 | 1406
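The reported chi-square value can be reproduced from the cell counts on this slide with the standard Pearson statistic, without a continuity correction. The function name is an assumption; the counts are those shown above.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Rows: positive, negative comments; columns: male, female.
observed = [[509, 491], [247, 159]]
chi2 = chi_square(observed)
```

Rounded to three decimals, the result agrees with the slide's 11.472 for df = 1.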
66. To identify the relationship among gender,
comment type, and user activity, posters were
divided into four groups:
females contributing positive comments (FP),
males contributing positive comments (MP),
females contributing negative comments (FN), and
males contributing negative comments (MN).
The FP group was the most active group, the FN
group’s activity was similar to that of male groups,
and the MP group was more active than the MN
group.
68. Where do Korean users want to take us? - Korea
Category Domain Comments linking to Domain %
Petition agora.media.daum.net 325 17.6
News news.naver.com 150 8.1
SNS cyworld.com 139 7.5
Forum cafe.naver.com 106 5.7
Blog blog.naver.com 72 3.9
Blog blog.daum.net 69 3.7
Blog rokp.tistory.com 61 3.3
NGO bss.or.kr 56 3
Forum cafe.daum.net 51 2.8
Government socialenterprise.go.kr 49 2.7
Total 1078 58.3
Based on 1,078 (58.3%) of 1,849 links to Korean services
69. What makes Korean users hyperlink?
Table 6: Comments categorized by link type from the six groups of gender and political affiliation
Group | Information provision | Network building | Identity/image building | Audience sharing | Message amplification | Spam
Opposition, Female | 1 | 20 | 0 | 0 | 11 | 9
Opposition, Male | 3 | 4 | 1 | 1 | 13 | 8
Opposition, Unknown | 0 | 11 | 1 | 0 | 14 | 2
Ruling, Female | 1 | 6 | 0 | 0 | 29 | 3
Ruling, Male | 1 | 5 | 0 | 0 | 23 | 7
Ruling, Unknown | 0 | 12 | 0 | 1 | 16 | 3
Total | 6 | 58 | 2 | 2 | 106 | 32
% | 3% | 28% | 1% | 1% | 51% | 16%
Based on 206 comments agreed on by both coders from the initial set of 300
70. Sentiment of Korean users to link
[Figure labels: candlelight protest; suicide of ex-president Roh]
71.
72. Political role of the Internet
Normalization perspective:
Internet may reflect the traditional power structure
among individual politicians.
Equalization (Innovation) perspective:
Internet may reform the offline hierarchical
structure of individual politicians.
73. Web Visibility
Web visibility as an indicator of online political power
Presence or appearance of actors or issues being
discussed by the public (Internet users) on the web.
Tracking web visibility is a powerful way to get an insight
into public reactions to actors or issues.
Recent studies indicate a positive relationship
between politicians’ web visibility level and election.
Also, the co-occurrence web visibility between two
politicians represents their hidden online political
relationships based on the public perception.
74. Web Trend Analysis
• Jangan district in Suwon City, Gyeonggi Province
(Park, CS)
(Lee, CY)
(Ahn, DS)
(Yoon, JY)
75. Blogs vs. Votes
• Jangan district in Suwon City, Gyeonggi Province
[Figure: N. of votes and N. of blogs for candidates 박찬숙 (Park, CS), 이찬열 (Lee, CY), 안동섭 (Ahn, DS), and 윤준영 (Yoon, JY); data labels shown: 33,106; 38,187; 5,570; 716]
76. Results
• Correlation Analysis (N. of Blogs & N. of
Votes)
– Pearson r = .586, p < .01 (N=29)
– Spearman rho = .797, p < .01 (N=29)
• Simple Regression Analysis
– N. of Votes = 1,055.56 + 79.99(N. of Blogs)
– R2 = .344 (F = 14.128, p < .01)
– ß = .586 (t = 3.759, p < .01)
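In a simple regression with one predictor, the standardized coefficient equals Pearson's r, R2 equals r squared, and F equals t squared. The sketch below uses those identities to check that the figures on this slide are mutually consistent and to apply the fitted equation. The function names are assumptions; only r, N, and the coefficients are taken from the slide.

```python
from math import sqrt

def t_and_f_from_r(r, n):
    """t and F statistics implied by Pearson's r in a one-predictor regression."""
    r_squared = r * r
    t = r * sqrt((n - 2) / (1 - r_squared))
    f = r_squared * (n - 2) / (1 - r_squared)  # equals t ** 2
    return r_squared, t, f

def predicted_votes(n_blogs):
    """Fitted line from the slide: votes = 1,055.56 + 79.99 * blogs."""
    return 1055.56 + 79.99 * n_blogs

r_squared, t, f = t_and_f_from_r(0.586, 29)
```

The small gaps between these values and the slide's R2 = .344, t = 3.759, and F = 14.128 are what one would expect from r having been rounded to three decimals.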
80. Outlines
Web Ecology - 2011 ICA 5/29/2011
Web ecology
Inter-relationship among websites by the human activity of using the Internet in information ecology
Observing integration and changes of diverse information behavior during the campaign period of the 2010 regional elections in South Korea
[Diagram labels: Web ecology; Public opinion & Campaign Issues]
81. Co-occurrences can take place through either web-mentioning or hyperlinking.
[Diagram: a user or voter (webpage, news, blog, SNS, etc.) hyperlinks to Actor A and Actor B; web-mentioning or hyperlinking both actors in the same source constitutes a co-occurrence]
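Counting web-mentioning co-occurrences between pairs of actors could be sketched as follows. The function name, the plain substring matching, and the sample documents are assumptions made for illustration, not the procedure used in the study.

```python
from collections import Counter
from itertools import combinations

def co_mention_counts(documents, actors):
    """Count, for each pair of actors, the documents that mention both."""
    counts = Counter()
    for text in documents:
        # Plain substring matching; a real study would need name disambiguation.
        present = sorted(actor for actor in actors if actor in text)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts

# Hypothetical documents for illustration only.
docs = [
    "Actor A met Actor B at the rally",
    "Actor A gave a speech",
    "Actor B and Actor A debated",
]
pairs = co_mention_counts(docs, ["Actor A", "Actor B"])
```

The same counting applies to hyperlinking if each document is replaced by the set of actor sites it links to.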
93. Results & Discussion:
Network Analysis (2/7)
Data Collection for Web 1.0
• Official homepages of South Korean Assembly members
• Manual collection: Observation
• Inter-linkage: Who links to whom matrix
• Explicit links excluding links in board
• 2-Year tracking of same Assembly members: 2000-2001
94. Results & Discussion:
Network Analysis (3/7) Web 1.0
2000
2001
‣ 59 isolated in 2000
‣ more centralised in 2001
‣ network of 2001 ➭ a ‘star’ network
- might be affected by political events
➭ presidential election in 2001
95. Results & Discussion:
Network Analysis (4/7)
•Data collection for Web 2.0
• Personal blogs of South Korean Assembly members
• Manual collection: Observation
• Blogroll links: Excluding links in postings
• Inter-linkage: Who links to whom matrix
• 2-Year tracking of same Assembly members: 2005-2006
• Phone interview about usage behaviours
96. Results & Discussion:
Network Analysis (5/7) Web 2.0
2005 2006
‣ hubs disappearing
‣ easy use of blogs
‣ Clear boundaries between different parties
‣ strong presence of GNP Assembly members
➭ party policy on using blogs
97. Twitter
Results & Discussion:
Network Analysis (6/7)
‣ more connection between different parties
‣ the ruling party pays less attention to alternative media
98. Results & Discussion:
Network Analysis (7/7)
Web type | Year | Sum of links (mean) | Density | Centralisation (In) | Centralisation (Out) | Gini coefficient
Web 1.0 (N=245) | 2000 | 373 (1.52) | 0.006 | 1.84 | 69.33 | 0.984
Web 1.0 (N=245) | 2001 | 515 (2.10) | 0.009 | 1.19 | 99.55 | 0.996
Web 2.0 (N=99) | 2005 | 652 (6.59) | 0.067 | 22.07 | 41.66 | 0.759
Web 2.0 (N=99) | 2006 | 589 (5.95) | 0.061 | 20.67 | 35.10 | 0.763
Twitter (N=22) | 2009 | 111 (5.05) | 0.240 | 24.72 | 39.68 | 0.408
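The mean and density columns follow from the link counts and network sizes, assuming the usual density of a directed network without self-links, links / (N * (N - 1)). That formula is an assumption here, though it reproduces the reported values; the function names are illustrative.

```python
def mean_links(links, nodes):
    """Average number of links per node."""
    return links / nodes

def density(links, nodes):
    """Density of a directed network without self-links (assumed formula)."""
    return links / (nodes * (nodes - 1))

# (web type, year, sum of links, N) as reported on the slide.
networks = [
    ("Web 1.0", 2000, 373, 245),
    ("Web 1.0", 2001, 515, 245),
    ("Web 2.0", 2005, 652, 99),
    ("Web 2.0", 2006, 589, 99),
    ("Twitter", 2009, 111, 22),
]
densities = [round(density(links, n), 3) for _, _, links, n in networks]
means = [round(mean_links(links, n), 2) for _, _, links, n in networks]
```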
99. Data collection
Date of collection – from February to April 2010
Homepage – LexiURL searcher to retrieve data from the
Yahoo! database
Blog – Manually collected by visiting Assembly
members’ blog page
Twitter – An automated computer program using Twitter’s
API (Application Programming Interface) to retrieve data
from Twitter
Analysis & Visualization – UciNet
106. Five Politicians’ Following-Based Ego Networks
Diagram 1. Five Politicians’ Following-Based Ego Networks
The size and color of each node correspond to the number of followers as follows:
Size of node | Color of node | Number of followers
1.5 | Yellow | 0 to 10,000
3.0 | Purple | 10,001 to 100,000
3.5 | Pink | 100,001 to 1,000,000
4.0 | Blue | More than 1,000,000
[Figure: ego networks of GG Kang, HR Won, KW Na, DY Chung, and HC Noh; party legend: 한나라당 (Grand National Party), 진보신당 (New Progressive Party), 민주노동당 (Democratic Labor Party), 민주당 (Democratic Party)]
107. Five Politicians’ Follower-Based Ego Networks
Diagram 2. Five Politicians’ Follower-Based Ego Networks
The size and color of each node correspond to the number of followers as follows:
Size of node | Color of node | Number of followers
1.5 | Yellow | 0 to 10,000
3.0 | Purple | 10,001 to 100,000
3.5 | Pink | 100,001 to 1,000,000
4.0 | Blue | More than 1,000,000
[Figure: ego networks of GG Kang, KW Na, HR Won, DY Chung, and HC Noh; party legend: 한나라당 (Grand National Party), 진보신당 (New Progressive Party), 민주노동당 (Democratic Labor Party), 민주당 (Democratic Party)]
108. Overlaps in terms of Twitter Followers
[Figure: overlapping follower sets of HR Won, GG Kang, DY Chung, HC Noh, and KW Na; party legend: 한나라당 (GNP), 진보신당 (PNP), 민주노동당 (MDP), 민주당 (MP)]
109. Overlaps in terms of Twitter Followings
[Figure: overlapping following sets of HR Won, GG Kang, DY Chung, HC Noh, and KW Na; party legend: 한나라당, 진보신당, 민주노동당, 민주당]
110. Data
Nov 2010, API application, 189 Korean Politicians
National Assembly Members
and Political Figures (i.e. Mayors or Governors)
Party | Total | Twitter account holders | Ruling party vs. opposition parties
Grand National Party | 173 | 110 | 110
Democratic Party | 92 | 56 | 79 (all parties below the Grand National Party combined)
Democratic Labor Party | 5 | 5 |
New Progressive Party | 3 | 3 |
Liberty Forward Party | 16 | 4 |
Creative Korea Party | 2 | 2 |
Future Hope Alliance | 8 | 3 |
Federation of Citizen-Centered Party | 1 | 0 |
Citizen Participatory Party | 1 | 1 |
Independent | 8 | 5 |
Total | 309 | 189 |
115. Findings
• The distribution among politicians confirms this.
• The following-follower network shows a linear
function, whereas the mention network is more
similar to a power-law distribution.
• Reciprocity-based connections are, basically, “I link to
people who linked to me,” hence the linear function.
• The gravity toward popular people is “I mention
people who get the most attention,” so the preferential
attachment principle leads to a power-law
function.
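One rough way to see whether an in-degree distribution resembles a power law is to fit a straight line to the degree frequencies on log-log axes. The sketch below does this with ordinary least squares; it is a quick diagnostic under stated assumptions, not a rigorous power-law test and not necessarily the method used in the study. The function name and the synthetic degree list are hypothetical.

```python
from collections import Counter
from math import log

def loglog_slope(degrees):
    """Least-squares slope of log(frequency) against log(degree)."""
    freq = Counter(d for d in degrees if d > 0)
    if len(freq) < 2:
        raise ValueError("need at least two distinct positive degrees")
    xs = [log(k) for k in freq]
    ys = [log(count) for count in freq.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic degrees whose frequencies follow 1000 / k**2 exactly.
synthetic = [1] * 1000 + [2] * 250 + [5] * 40 + [10] * 10
slope = loglog_slope(synthetic)
```

A clearly negative slope with points close to the fitted line is consistent with a power-law-like mention network, whereas a reciprocity-driven network would not show this pattern.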
117. Why Twitaddons.com ?
Upfront, self-identified
motivation for joining a
group
Member list
Relational data of
Twitter activities
118. What Motivation ?
Social motivation
Attachment to a group identity
Informational motivation
Access to information (info
overload theory)
Sharing information (positive
self-evaluation and social
acceptance)
Interpersonal motivation
Friendship and bond-based
attachment to members
Twitaddons.com group
조폐공사, 국민의 명령,
비정규직당, 희망공화국
멘토스당, 똘끼주식당
애플러들의 모임, 아이폰4당
강남당, 강서당
119. Network Structure Measure
Whole network, instead
of ego-centric network
Degree (popularity),
closeness (efficiency),
and betweenness
(control) centralities
Betweenness centrality
as a significant
predictor of leadership
(Mullen & Johnson, 1991)
Reciprocity of ties
Types of retweeted
hyperlinks
Distribution of in-
degree in terms of
mentions
129. Tentative Research Result
Social motivation (조폐공사, 희망공화국, 국민의 명령, 비정규직)
- Higher degree and closeness centralities compared to other groups
- Higher mention sender per receiver ratio
- URLs: civic association homepage, news services
Informational motivation (똘끼주식당, 멘토스 / 아이폰4, 애플러들의 모임)
- Access to information: Higher mention sender per receiver ratio
Larger number of Tweets including URLs
URLs: financial information services
- Sharing information: Higher mention per sender ratio
URLs: application services
Interpersonal motivation (강남당, 강서당)
- Higher reciprocity
- Higher mention per sender ratio
- Smaller number of Tweets including URLs
- URLs: video, picture uploading services
131. The Mutual Information in Two Dimensions:
$T_{ij} = H_i + H_j - H_{ij}$
$T_{ij} \geq 0$
The Mutual Information in Three Dimensions:
$T_{UIG} = H_U + H_I + H_G - H_{UI} - H_{IG} - H_{UG} + H_{UIG}$
$T_{UIG}$ is potentially negative
A negative entropy can be a consequence of
the mutual relations at the network level.
The configuration then reduces the uncertainty.
Asia Triple Helix Society (아시아 학연산 연구회)
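The two formulas on this slide can be evaluated directly from observed samples using Shannon entropy in bits. The sketch below is illustrative: the function names and the toy samples are assumptions, chosen so that the three-dimensional value comes out negative, as the slide says is possible.

```python
from collections import Counter
from math import log2

def entropy(samples):
    """Shannon entropy (bits) of the empirical distribution of samples."""
    samples = list(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def mutual_information_2d(xs, ys):
    """T_ij = H_i + H_j - H_ij (never negative)."""
    return entropy(xs) + entropy(ys) - entropy(zip(xs, ys))

def mutual_information_3d(us, is_, gs):
    """T_UIG = H_U + H_I + H_G - H_UI - H_IG - H_UG + H_UIG (can be negative)."""
    return (entropy(us) + entropy(is_) + entropy(gs)
            - entropy(zip(us, is_)) - entropy(zip(is_, gs)) - entropy(zip(us, gs))
            + entropy(zip(us, is_, gs)))

# Toy samples: the third variable is the exclusive-or of the first two.
u = [0, 0, 1, 1]
i = [0, 1, 0, 1]
g = [0, 1, 1, 0]
t2 = mutual_information_2d(u, u)
t3 = mutual_information_3d(u, i, g)
```

In the toy case no pair of variables shares information, yet the three together are fully determined, which is the kind of network-level configuration that drives the three-dimensional value below zero.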
132. Triple Helix indicators
- Communication Perspective
• TH innovation takes places mainly in three
KCI spaces where the content of
communication (i.e., information-sharing
behavior) is transferred.
• Knowledge space: Scientometrics
• Convergence space: Technometrics
• Innovation space: Webometrics
133. Measuring Twitter-based political
participation by using TH indicators
The absolute entropy values were lower when the
trilateral relationship included the two conservative
politicians: Na and Won. As indicated earlier, the lower
the entropy value, the less stable the communication
system is. Thus, the communication system became
more unbalanced in trilateral relationships that included
the two conservative politicians. On the other hand, in
those trilateral relationships including only one
conservative politician, the entropy values were higher,
and the communication system was more stable. These
results suggest that the level of political deliberation,
expressed in terms of the degree of stability in the
communication system, increases when politicians with
different political orientations form trilateral relationships
134. Who are you going to partner with in terms of TH?
Politician (A B C) A B C AB AC BC ABC
Na, Won, Noh 18000 377 16000 898 118 50 32
Na, Won, Kang 16000 380 4438 898 1 1 1
Na, Won, Chung 16000 357 14000 898 63 68 1
Na, Noh, Kang 18000 15000 3817 118 1 571 0
Na, Noh, Chung 16000 14000 13000 118 63 737 0
Na, Kang, Chung 15000 3618 13000 1 63 280 1
Won, Noh, Kang 9208 19000 10000 50 1 571 0
Won, Noh, Chung 8353 18000 27000 50 68 737 1
Won, Kang, Chung 8154 10000 28000 1 68 280 1
Noh, Kang, Chung 18000 9224 27000 571 737 280 151
Source: Measuring Twitter-Based Political Participation and Deliberation in the South Korean Context by Using Social Network and Triple Helix Indicators
http://www.springerlink.com/content/77w06uv002179062/
135. A comparison of trilateral relationships
of five politicians on Twitter
136. The Webpage and News Media Categories
Figure 2. T-values for the webpage category
Figure 3. T-values for the news media category
137. Using Twitter to explore
communication processes within
online innovation communities
According to Etzkowitz (2008, p. 20-23), the
circulation of individuals belonging to three
institutions can provide each organization with
a new innovation environment.
Circulation of individuals from one
organizational sphere to another can stimulate
hybridization and the creation of new social
formats.
138. Note: B (Blackberry), A (Android), O.H. (Official HTC),
K.H. (Korean HTC)
‘a, b, c’ stand for the actors’ placement, from first to third.
139. Co-construction of online communities
Based on Triple Helix analysis, four T values, made by different combination of three groups, were
generated. Among all negative T values, the T of “Blackberry,” “Android,” and “Official HTC” had
the lowest value, which implies that this trilateral relation had the most synergic effect through shared
members. What is interesting is that the lowest T value of trilateral relations does not include “Korean
HTC,” while the others do. Taking a closer look at bilateral relations, it seems to be clear that the T
values of bilateral relations with “Korean HTC” denote a lesser amount of mutual information shared
between groups. Recalling that the organizer of “Korean HTC” shared more information and
conversation with members than with followers (see Table 1 and Fig. 3), contrary to other innovation
group organizers, it can be speculated that the relative closeness of the members of “Korean HTC”
may have led to fewer exchanges of members with other groups, resulting in a higher T, and finally
might have generated less differentiation and innovation.
141. Conclusion
Mindset shift
• Scholars and researchers in social sciences
need to recognize and acknowledge the
opportunities that are available
– E.g. access to vast data and new modes of
data collection and analysis
• The emerging era of networked research
leads to two possible scenarios
–Education and training programs have to be
put in place to produce a new breed of social
scientists with combined expertise and
knowledge of computational science and
social sciences
–What is more actionable in the shorter term
is to engender and promote collaborative
efforts between these different fields
http://blog.jove.com/wp-content/uploads/2012/05/Publishing.png
142. Conclusion
Mindset shift
• In Korea, there appears to be a lack of desire for either
distance international collaboration through the
Access Grid or the use of high performance
computing facilities among social scientists
– Little demand as social scientists’ current choices for their research
practices are still shaped by offline facilities rather than online
technology capabilities
– Policy-makers and technology developers to involve social scientists
in design and application processes, but change in mindset among
researchers is needed to transform e-science into a reality for social
scientists
Good role model in the West
– Oxford Internet Institute, The Virtual Knowledge Studio for the Humanities
and Social Sciences, The Institute for Quantitative Social Science at
Harvard University
143. Big data and the end of theory?
Does big data have the answers? Maybe some, but not all,
says - Mark Graham
In 2008, Chris Anderson, then editor of Wired, wrote a
provocative piece titled The End of Theory. Anderson was
referring to the ways that computers, algorithms, and big data
can potentially generate more insightful, useful, accurate, or
true results than specialists or
domain experts who traditionally craft carefully targeted
hypotheses and research strategies.
We may one day get to the point where sufficient quantities of
big data can be harvested to answer all of the social questions
that most concern us. I doubt it though. There will always be
digital divides; always be uneven data shadows; and always
be biases in how information and technology are used and
produced.
And so we shouldn't forget the important role of specialists to
contextualise and offer insights into what our data do, and
maybe more importantly, don't tell us.
http://www.guardian.co.uk/news/datablog/2012/mar/09/big-data-
theory
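Editor's Notes
Profile photos. American users more often use photos in which they themselves are clearly visible. Although not shown in the graph above, 69% of Korean users use a third-party photo. The next two slides collect profile photos from American Facebook (some of 58 people) and Korean Cyworld (some of 92 people); these are photos of some of the survey participants who permitted profile content analysis. The photos were collected in fall 2009 (October?).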
So, we want to see whether politicians’ Twitter network is really different from previous studies on politicians’ networks and collected data. We collected 189 politicians and divided them into two political groups, the ruling party and the opposition parties, because there were too many parties in South Korea.
And this is cohesiveness of network table.. ( explanation)
It could be more intuitive to see through graphics. We depicted politicians’ Twitter network. We have drawn the mention network over the following-follower network (explanation, if necessary)