2. Motivation
Digital Enterprise Research Institute www.deri.ie
• Information cascades are of high interest in marketing, CRM, etc.
• A common approach is to maximise information diffusion by
targeting influential actors
• In the context of many online communities (e.g. discussion
fora) the information is shared with the community as a whole
and not with individual actors
[Figure: two diagrams contrasting the common case (targeting individuals) with the cross-community case (targeting communities).]
Enabling Networked Knowledge
3. Objectives
• Our main hypothesis is that it is possible to efficiently
spread a message over the information flow network by
targeting highly influential communities
• The main problem is then formulated as a prediction of
the set of communities to target such that the message is
spread over the network as much as possible
• Spread over the actors, i.e. user activation fraction
• Spread over the communities, i.e. community
activation fraction
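The two spread measures above can be sketched in code. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the rule that a community counts as activated once at least `min_active` of its members are activated is an assumption, since the slides do not state the exact criterion.

```python
from typing import Dict, Set


def user_activation_fraction(active_users: Set[str], all_users: Set[str]) -> float:
    """Share of all users that ended up activated."""
    if not all_users:
        return 0.0
    return len(active_users & all_users) / len(all_users)


def community_activation_fraction(
    active_users: Set[str],
    communities: Dict[str, Set[str]],
    min_active: int = 1,  # ASSUMPTION: activation criterion is not given in the slides
) -> float:
    """Share of communities with at least `min_active` activated members."""
    if not communities:
        return 0.0
    activated = sum(
        1 for members in communities.values()
        if len(members & active_users) >= min_active
    )
    return activated / len(communities)
```

For example, with three communities and two activated users who belong to two different communities, the user fraction is the activated share of all users and the community fraction is 2/3.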
4. Methods: Definition of Impact
• We propose (Belák et al., ’12) to take two factors into account:
1. degree of community membership of the users
2. centrality of the users within each community
• Impact of community A on community B is defined as the average centrality of
actors from A within B, weighted by their membership in A
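One possible reading of this definition is a membership-weighted mean, sketched below. The data layout (nested dictionaries keyed by user and community), the function names, and the normalisation by the total membership weight in A are all assumptions made for illustration; the exact formula is given in Belák et al. (’12), not on the slide.

```python
from typing import Dict, Iterable

# user -> community -> value (membership degree or centrality); hypothetical layout
UserTable = Dict[str, Dict[str, float]]


def impact(membership: UserTable, centrality: UserTable, a: str, b: str) -> float:
    """Membership-weighted average centrality within b of the users belonging to a."""
    weighted_sum = 0.0
    total_weight = 0.0
    for user, degrees in membership.items():
        weight = degrees.get(a, 0.0)  # degree of membership of the user in a
        if weight <= 0.0:
            continue
        weighted_sum += weight * centrality.get(user, {}).get(b, 0.0)
        total_weight += weight
    # ASSUMPTION: normalise by total membership weight so the result is an average
    return weighted_sum / total_weight if total_weight > 0.0 else 0.0


def impact_matrix(
    membership: UserTable, centrality: UserTable, communities: Iterable[str]
) -> Dict[str, Dict[str, float]]:
    """Impact of every community on every community, as nested dictionaries."""
    names = list(communities)
    return {a: {b: impact(membership, centrality, a, b) for b in names} for a in names}
```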
5. Methods: Targeting Communities
• The level of dispersion (heterogeneity) of the total impact of community i can be
measured as the entropy of the i-th row/column of the impact matrix
• We propose to target communities by means of the product of the total
impact of community i and its entropy: impact focus (IF)
• We simulated the diffusion by extending the Independent Cascade (ICM) and
Linear Threshold (LTM) models (Kempe et al., ’03)
1. Take q target communities and sample s users from each of them
2. Run the original models from the union of sampled users
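• Information diffusion network derived from the reply-to network:
[Diagram: a reply-to edge between users i and j with weight r_ji, and the corresponding information-flow edge between i and j with weight w_ij.]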
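The targeting and simulation steps on this slide can be sketched as follows. This is a hedged illustration under several assumptions, not the authors' code: impact focus is computed from a row of the impact matrix using the natural logarithm, flow weights w_ij are used directly as activation probabilities, all function names are hypothetical, and only the Independent Cascade variant is shown.

```python
import math
import random
from typing import Dict, List, Set


def impact_focus(row: List[float]) -> float:
    """Total impact of a community times the entropy of how that impact is spread."""
    total = sum(row)
    if total <= 0.0:
        return 0.0
    # Entropy of the row normalised to a distribution (natural log is an assumption)
    entropy = -sum((x / total) * math.log(x / total) for x in row if x > 0.0)
    return total * entropy


def top_communities(impact_rows: Dict[str, List[float]], q: int) -> List[str]:
    """The q communities with the highest impact focus."""
    ranked = sorted(impact_rows, key=lambda c: impact_focus(impact_rows[c]), reverse=True)
    return ranked[:q]


def sample_seeds(
    targets: List[str], members: Dict[str, List[str]], s: int, rng: random.Random
) -> Set[str]:
    """Step 1: sample up to s users from each targeted community."""
    seeds: Set[str] = set()
    for community in targets:
        pool = members[community]
        seeds.update(rng.sample(pool, min(s, len(pool))))
    return seeds


def independent_cascade(
    flow: Dict[str, Dict[str, float]], seeds: Set[str], rng: random.Random
) -> Set[str]:
    """Step 2: run the standard ICM from the union of sampled users."""
    active = set(seeds)
    frontier = sorted(seeds)  # sorted for reproducible runs with a seeded rng
    while frontier:
        newly_active = []
        for u in frontier:
            # Each newly activated user gets one chance per inactive neighbour
            for v, probability in flow.get(u, {}).items():
                if v not in active and rng.random() < probability:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active
```

Under this reading, a community whose impact is spread evenly over several communities ranks above one with a larger total impact concentrated on a single community, because the entropy term of the latter is zero.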
6. Evaluation Strategy
• IF was compared with random targeting (R) and group in-degree (GI)
(Everett & Borgatti, ’99)
• The main aim was to investigate robustness of our framework with
respect to:
• Character of the system
• Diffusion models
• User and Community Activation Fractions
• Procedural outline
1. Target q communities using one of the heuristics evaluated on
the data from time-slice t
2. Run the diffusion model on the network from time-slice t+1
3. Compute the average user and community spreads over all
pairs (t, t+1)
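The procedural outline above can be sketched as a small harness. It is an illustration only: the `evaluate` function and its three callables (`choose_seeds`, `diffuse`, `score`) are hypothetical names, and the representation of a time-slice is left to the caller.

```python
from statistics import mean
from typing import Any, Callable, Sequence, Set, Tuple


def evaluate(
    slices: Sequence[Any],
    choose_seeds: Callable[[Any], Set[str]],
    diffuse: Callable[[Any, Set[str]], Set[str]],
    score: Callable[[Any, Set[str]], Tuple[float, float]],
) -> Tuple[float, float]:
    """Average (user spread, community spread) over all consecutive slice pairs."""
    if len(slices) < 2:
        raise ValueError("need at least two time-slices to form a pair (t, t+1)")
    user_spreads = []
    community_spreads = []
    for current, following in zip(slices, slices[1:]):
        seeds = choose_seeds(current)        # 1. target using data from slice t
        active = diffuse(following, seeds)   # 2. run diffusion on slice t+1
        user_spread, community_spread = score(following, active)
        user_spreads.append(user_spread)
        community_spreads.append(community_spread)
    # 3. average both spreads over all pairs (t, t+1)
    return mean(user_spreads), mean(community_spreads)
```

Passing the heuristic and the diffusion model as callables keeps the loop identical for each targeting strategy and each diffusion model being compared.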
7. Evaluation Data-Sets
• 51 weeks of data from the largest Irish
discussion board system (Boards)
• Segmented using a 1-week sliding window
• A 1-week window represents approx. 84% of
cross-fora posting activity
• 540 communities, 5.3k users/snapshot (avg)
• 5 years of data from the technical support fora of SAP
• Used only for the diffusion experiments
• Segmented using a 2-month sliding window
• A 2-month window represents approx. 50% of cross-fora posting
activity
• 33 communities, 2k users/snapshot (avg)
8. User Activation Fraction
One targeted community
[Figure: two panels (q=1, Boards-LTM; q=1, SAP-LTM) plotting mean user activation fraction (u) against user sample size (s) for the IF, GI and R strategies.]
9. Community Activation Fraction
One targeted community
[Figure: two panels (q=1, Boards-LTM; q=1, SAP-LTM) plotting mean community activation fraction (c) against user sample size (s) for the IF, GI and R strategies.]
10. Community Activation Fraction
Five targeted communities
[Figure: two panels (q=5, Boards-LTM; q=5, SAP-LTM) plotting mean community activation fraction (c) against user sample size (s) for the IF, GI and R strategies.]
11. Results Highlights
• The diffusion process saturated at approximately 80% of users
or communities in Boards, and 30% in SAP
• It is more efficient to target a small number of communities
• Impact Focus outperformed the other two strategies with respect to
both user and community activation fractions, specifically for a small
number of targeted communities (q in [1, 2]) and
seed users (s in [1, 20])
• Diminishing returns
• For a high number of targeted communities and seed users, the random
strategy outperformed the other two with respect to the community
activation fraction in the SAP data-set
• The SAP network was fragmented into many small components, which
made it hard to reach peripheral communities
12. Conclusion
• The evaluation demonstrated that the framework
• is able to identify highly influential communities
• can predict which communities to target such that the
message spreads efficiently over both individual users
and communities
• We aim to extend it with content analysis
• E.g. What are the most influential communities with
respect to a particular topic?
• We will also investigate empirically-observed topic
cascades and modify our models accordingly if needed
13. Questions?
References
• V. Belák, S. Lam, and C. Hayes. Cross-Community Influence in Discussion
Fora. ICWSM. AAAI, 2012.
• M. Everett and S. Borgatti. The centrality of groups and classes. J. of
Mathematical Sociology, 23(3):181–201, 1999.
• D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of
influence through a social network. SIGKDD. ACM, 2003.