Understanding what attracts users to engage with social media content is important in domains such as market analytics, advertising, and community management.
To date, many pieces of work have examined engagement dynamics in isolated platforms with little consideration or assessment of how these dynamics might vary between disparate social media systems. Additionally, such explorations have often used different features and notions of engagement, thus rendering the cross-platform comparison of engagement dynamics limited. In this paper we define a common framework of engagement analysis and examine and compare engagement dynamics across five social media platforms: Facebook, Twitter, Boards.ie, Stack Overflow and the SAP Community Network. We define a variety of common features (social and content) to capture the dynamics that correlate with engagement in multiple social media platforms, and present an evaluation pipeline intended to enable cross-platform comparison. Our comparison results demonstrate the varying factors at play in different platforms, while also exposing several similarities.
Mining and Comparing Engagement Dynamics Across Multiple Social Media Platforms #websci14
1. Mining and Comparing Engagement Dynamics Across Multiple Social Media Platforms
Matthew Rowe
Lancaster University, UK
Harith Alani
Knowledge Media Institute, UK
@halani
harith-alani
http://people.kmi.open.ac.uk/harith/
ACM Web Science Conference (WebSci) 2014, Bloomington, IN
3. Moving on …
§ How can we move on
from these (micro)
studies?
§ Are results consistent
across datasets, and
platforms?
§ One way forward is:
§ Multiple platforms
§ Multiple topics
4. Publications on "social media analysis"
[Figure: chart of the number of publications on "social media analysis" per year, 2006 to 2013; vertical axis from 0 to 600]
9. Apples and Oranges
§ We mix and compare
different features,
datasets, and platforms
§ Aim is to figure out their
similarities and
differences
10. Contributions
§ Examine replying dynamics as a modality of engagement
§ Define a framework of engagement analysis that fits multiple social platforms
§ Show the varying features at play in different platforms, and where the
similarities and differences are
§ Contrast the role of different features on engagement likelihood across five
social media platforms
§ Compare results to relevant literature on same or different platforms and
engagement indicators
11. 7 datasets from 5 platforms
Platform                                    Posts      Users    Seeds    Non-seeds  Replies
Boards.ie                                   6,120,008  65,528   398,508  81,273     5,640,227
Twitter Random                              1,468,766  753,722  144,709  930,262    390,795
Twitter (Haiti Earthquake)                  65,022     45,238   1,835    60,686     2,501
Twitter (Obama State of the Union Address)  81,458     67,417   11,298   56,135     14,025
SAP                                         427,221    32,926   87,542   7,276      332,403
Server Fault                                234,790    33,285   65,515   6,447      162,828
Facebook                                    118,432    4,745    15,296   8,123      95,013
Seed posts are those that receive a reply
Non-seed posts are those with no replies
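The seed/non-seed distinction can be illustrated with a minimal sketch. It assumes each post is a dict with hypothetical `id` and `reply_to` keys (`reply_to` is `None` for a thread starter), and it treats replies as a third class, which matches how the Posts column above splits into Seeds, Non-seeds and Replies. The slides do not describe the actual extraction code.

```python
from collections import Counter


def label_posts(posts):
    """Label each post as 'seed', 'non-seed' or 'reply'.

    `posts` is a list of dicts with the hypothetical keys 'id' and
    'reply_to' ('reply_to' is None when the post starts a thread).
    """
    # Ids of every post that at least one other post replies to.
    replied_to = {p["reply_to"] for p in posts if p["reply_to"] is not None}
    labels = {}
    for p in posts:
        if p["reply_to"] is not None:
            labels[p["id"]] = "reply"      # the post is itself a reply
        elif p["id"] in replied_to:
            labels[p["id"]] = "seed"       # starter that received a reply
        else:
            labels[p["id"]] = "non-seed"   # starter with no replies
    return labels


posts = [
    {"id": 1, "reply_to": None},
    {"id": 2, "reply_to": 1},
    {"id": 3, "reply_to": None},
    {"id": 4, "reply_to": 1},
]
labels = label_posts(posts)
counts = Counter(labels.values())
```

On platforms where a reply can itself attract replies (Twitter, for example), a different rule may be needed; this sketch always labels such posts as replies.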
12. Data Balancing
Platform                                    Seeds    Non-seeds  Instance Count
Boards.ie                                   398,508  81,273     162,546
Twitter Random                              144,709  930,262    289,418
Twitter (Haiti Earthquake)                  1,835    60,686     3,670
Twitter (Obama State of the Union Address)  11,298   56,135     22,596
SAP                                         87,542   7,276      14,552
Server Fault                                65,515   6,447      12,894
Facebook                                    15,296   8,123      16,246
Total                                                           521,922
For each dataset, an equal number of seed and non-seed
posts is used in the analysis.
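The instance counts in the table equal twice the size of the smaller class. A minimal sketch of one way to obtain such a balanced set is random undersampling of the larger class. Random undersampling is an assumption here: the slides state only that equal numbers are used, not how the posts are selected. The `balance` helper is a hypothetical name.

```python
import random


def balance(seeds, non_seeds, rng=None):
    """Return equally sized random samples of both classes."""
    rng = rng or random.Random(0)          # fixed seed for repeatability
    n = min(len(seeds), len(non_seeds))    # size of the smaller class
    return rng.sample(seeds, n), rng.sample(non_seeds, n)


# Class sizes taken from the Facebook row; the ids are placeholders.
seeds = list(range(15_296))
non_seeds = list(range(8_123))
kept_seeds, kept_non_seeds = balance(seeds, non_seeds)
instance_count = len(kept_seeds) + len(kept_non_seeds)   # 16,246
```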
13. Features
Content Features
§ Post Length: number of words in the post
§ Complexity: measures the cumulative entropy of terms in a post
§ Readability: Gunning Fog index, which gauges how hard the post is to
parse by readers, and LIX readability metric, which determines the
complexity of words based on number of letters
§ Referral Count: number of URLs in the post
§ Informativeness: TF-IDF of the post
§ Polarity: average sentiment polarity of the post (using SentiWordNet)
Social Features
§ In-degree: number of incoming social connections (explicit or implicit)
§ Out-degree: number of outgoing social connections (explicit or implicit)
§ Post Count: number of posts made in the previous 6 months
§ User Age: length of membership in the community, in days
§ Post Rate: number of posts by the user per day
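Four of the content features can be sketched in a few lines of Python. Several choices here are assumptions rather than the authors' definitions: the tokenizer and URL pattern are simplistic, "complexity" is read as the Shannon entropy of the post's term distribution (the slides give no formula), and LIX uses its standard form, words per sentence plus the percentage of words longer than six letters.

```python
import math
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
WORD_RE = re.compile(r"[A-Za-z0-9']+")


def tokenize(text):
    """Lower-cased word tokens, with URLs removed first."""
    return WORD_RE.findall(URL_RE.sub(" ", text).lower())


def post_length(text):
    return len(tokenize(text))


def referral_count(text):
    return len(URL_RE.findall(text))


def term_entropy(text):
    """Shannon entropy (in bits) of the term distribution of the post."""
    terms = tokenize(text)
    if not terms:
        return 0.0
    n = len(terms)
    return -sum((c / n) * math.log2(c / n) for c in Counter(terms).values())


def lix(text):
    """LIX = words per sentence + 100 * (words longer than 6 letters) / words."""
    clean = URL_RE.sub(" ", text)
    words = tokenize(clean)
    sentences = [s for s in re.split(r"[.!?]+", clean) if s.strip()]
    if not words or not sentences:
        return 0.0
    long_words = sum(1 for w in words if len(w) > 6)
    return len(words) / len(sentences) + 100.0 * long_words / len(words)


example = "the cat sat. the cat ran."
features = {
    "post_length": post_length(example),
    "complexity": term_entropy(example),
    "lix": lix(example),
    "referrals": referral_count("see http://a.example/x and https://b.example"),
}
```

Gunning Fog, TF-IDF and SentiWordNet polarity are left out because they need a syllable counter, corpus statistics and a lexicon respectively.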
14. Classification of Posts
§ Binary classification model: seed posts vs. non-seed posts
§ Trained with social, content,
and combined features
§ 80/20 training/testing split
§ Compare results across
platforms, to see how a change
in each feature is associated
with likelihood of engagement
§ Compare engagement
dynamics from our platforms
against the literature
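The setup above can be sketched with a small logistic regression trained by batch gradient descent on an 80/20 split. This is an illustration under stated assumptions, not the authors' pipeline: the data is synthetic (one informative feature and one noise feature, both hypothetical), and the learning rate and epoch count are arbitrary. The sign of each fitted weight plays the role of the coefficient β: a positive weight means that an increase in the feature is associated with a higher likelihood of a post being a seed.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, x):
    """Linear score: bias w[0] plus the weighted features."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def train_logistic(X, y, lr=0.1, epochs=200):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, xi)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    return 1 if sigmoid(score(w, x)) >= 0.5 else 0


# Synthetic, balanced toy data: label 1 stands for seed, 0 for non-seed.
rng = random.Random(42)
data = []
for label in (0, 1):
    for _ in range(200):
        informative = rng.gauss(2.0 if label else -2.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        data.append(([informative, noise], label))
rng.shuffle(data)

split = int(0.8 * len(data))              # 80/20 training/testing split
train, test = data[:split], data[split:]
w = train_logistic([x for x, _ in train], [y for _, y in train])
accuracy = sum(predict(w, x) == y for x, y in test) / len(test)
```

Training once per feature set (social only, content only, combined) and comparing test-set scores gives the kind of comparison the slide describes.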
15. Classification Results
§ Performance of the logistic regression classifier trained over different feature
sets and applied to the test set.
[Seven result tables, one per dataset, in slide order. The per-table dataset labels did not survive as text, apart from the Twitter dataset names: (Random), (Haiti Earthquake), (Obama's State of the Union Address).]

Feature          P      R      F1        P      R      F1        P      R      F1
Social           0.592  0.591  0.591     0.561  0.561  0.560     0.968  0.966  0.966
Content          0.664  0.660  0.658     0.612  0.612  0.611     0.752  0.747  0.747
Social+Content   0.670  0.666  0.665     0.628  0.628  0.628     0.974  0.973  0.973

Feature          P      R      F1        P      R      F1        P      R      F1
Social           0.542  0.540  0.539     0.650  0.631  0.628     0.528  0.380  0.319
Content          0.650  0.642  0.639     0.575  0.541  0.521     0.626  0.380  0.275
Social+Content   0.656  0.649  0.646     0.652  0.632  0.629     0.568  0.407  0.359

Feature          P      R      F1
Social           0.635  0.632  0.632
Content          0.641  0.641  0.641
Social+Content   0.660  0.660  0.660
16. Effect of features on engagement
[Figure: one panel of β coefficients per dataset: Boards.ie, Twitter Random, Twitter Haiti, Twitter Union, Server Fault, SAP, Facebook. Features on the shared axis: In-degree, Out-degree, Post Count, Age, Post Rate, Post Length, Referrals Count, Polarity, Complexity, Readability, Readability Fog, Informativeness.]
Logistic regression coefficients for each platform's features
17. Significance of regression coefficients
[Figure: one panel of p-values (0.0 to 1.0) per dataset: Boards.ie, Twitter Random, Twitter Haiti, Twitter Union, Server Fault, SAP, Facebook. Features on the shared axis: In-degree, Out-degree, Post Count, Age, Post Rate, Post Length, Referrals Count, Polarity, Complexity, Readability, Readability Fog, Informativeness.]
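The slides do not say which test produced the p-values. A common choice for logistic regression coefficients is the Wald test, which divides the coefficient by its standard error and compares the result with a standard normal distribution. The sketch below assumes that test; the coefficient and standard-error values are made up for illustration.

```python
import math


def wald_p_value(beta, std_err):
    """Two-sided p-value for H0: beta = 0 under a normal approximation."""
    z = beta / std_err
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


p_strong = wald_p_value(0.8, 0.2)   # z = 4.0: strong evidence of an effect
p_weak = wald_p_value(0.1, 0.2)     # z = 0.5: little evidence of an effect
```

A small p-value suggests the feature's association with engagement is unlikely to be due to chance in that dataset; a large one means the sign of β should not be over-interpreted.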
21. Summary
§ We tested the consistency and applicability of engagement
patterns across multiple platforms
§ Used 12 social/content features that map to 5 platforms
§ Studied the impact of those features on engagement across these
platforms
§ Compared the impact of our features against generally relevant
studies in the literature
§ Showed that the same features can play different roles in different
platforms, or in different non-random datasets
22. So what’s Next!
§ LOTS!
§ Apply same study to more datasets from the same platforms, and from other
platforms
§ Expand from replies to other engagement indicators
§ Improve classification of seeds/non-seeds with more common features
§ Further study on impact of topics and non-randomness on engagement
dynamics
§ Take user type into account – e.g. posts from news agencies are more likely to
be retweeted than replied to
23. Questions!
1. Why those specific datasets and platforms?
2. What about platform-specific features?
3. Could we ever get a full understanding of these dynamics
across all social platforms?
4. Could these findings be used to increase engagement?
5. Who’s right/wrong when the same feature appears to have
conflicting impact on the same platform?
6. Couldn’t it be the case that the same feature is used
differently in different platforms?
7. How could we study event-specific engagement dynamics?