7. Competitive Diffusion
Key Points
Competitive Diffusion occurs often
Expressive Model for Competitive Diffusion
Scale to large Social Networks
8. Competitive Diffusion
General Diffusion Models
One diffusion model for each problem class
- Time consuming
- Complex
Wouldn’t it be nice to have a general, easy-to-use
language to express such models?
- Like SQL for data access
[Diagram labels: Language, Model, SN Data, Diffusion Process]
9. Competitive Diffusion
General Diffusion Models
Approach: Develop an expressive framework to
represent diffusion models using logical rules
- Logic rules easy to read and write by humans
• Shakarian & Subrahmanian [2010]
- General idea:
“If some condition holds in the network, then some
diffusion is likely to occur with some confidence”
- Based on General Annotated Programming (GAP)
• More expressive than “standard” boolean rules
• Can represent non-linear diffusion processes (arbitrary
functions)
• Kifer & Subrahmanian [92]
10. Competitive Diffusion
Modeling in Practice I
Think of your network as a database of facts
- represented as (ground) atoms
- Can represent un-/directed, multi-labeled data
[Figure: example network over Mary, Bob, Jeff, John and Jane with labeled edges: knows, coworker, best friends, wife, siblings]
wife(John,Mary), knows(John,Jane), siblings(Bob,Jane), etc.
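A minimal sketch of the "network as a database of facts" idea, assuming each ground atom is stored as a (predicate, subject, object) tuple. The three facts are the ones listed above; the `SYMMETRIC` set and the `build_index` helper are illustrative assumptions, not part of the presented framework.

```python
from collections import defaultdict

# Each fact is a ground atom: (predicate, subject, object).
# These three facts are the ones listed on the slide.
FACTS = {
    ("wife", "John", "Mary"),
    ("knows", "John", "Jane"),
    ("siblings", "Bob", "Jane"),
}

# Assumption for illustration: these predicates are undirected,
# so each fact is also indexed with its arguments swapped.
SYMMETRIC = {"siblings"}


def build_index(facts):
    """Index atoms by predicate so rules can look up matching edges."""
    index = defaultdict(set)
    for pred, a, b in facts:
        index[pred].add((a, b))
        if pred in SYMMETRIC:
            index[pred].add((b, a))
    return index


index = build_index(FACTS)
```

Indexing by predicate keeps directed and undirected edge labels in one structure, which is one simple way to hold multi-labeled data.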
11. Competitive Diffusion
Modeling in Practice II
Write a set of rules that describe the
diffusion process you are interested in
- using annotated rules with variables which are
grounded against your network database
[Figure: same example network as on the previous slide]
vote(B,Dem):X ∧ wife(A,B):1 → vote(A,Dem):X
knows(A,Bi):1 ∧ vote(Bi,Dem):Xi → vote(A,Dem):(ΣXi)÷n
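To illustrate how such rules might be grounded against the network, here is a small sketch. It assumes the first rule yields one ground rule per wife fact and that the second rule's head annotation is the mean confidence over the people A knows. The confidence values, the extra `knows` fact for Bob, and both function names are hypothetical.

```python
# Hypothetical mini-network; only wife(John,Mary) and knows(John,Jane)
# appear in the slides.
WIFE = {("John", "Mary")}
KNOWS = {("John", "Jane"), ("John", "Bob")}

# Hypothetical confidence values for vote(person, Dem).
vote_dem = {"Mary": 0.9, "Jane": 0.6, "Bob": 0.2}


def ground_wife_rule(wife_facts):
    """Ground vote(B,Dem):X and wife(A,B):1 implies vote(A,Dem):X."""
    return [
        {"body": [("vote", b, "Dem"), ("wife", a, b)],
         "head": ("vote", a, "Dem")}
        for a, b in sorted(wife_facts)
    ]


def knows_rule_annotation(person, knows_facts, votes):
    """Head annotation of the second rule: mean of Xi over all Bi known."""
    xs = [votes[b] for a, b in sorted(knows_facts)
          if a == person and b in votes]
    return sum(xs) / len(xs) if xs else 0.0


ground_rules = ground_wife_rule(WIFE)
john_annotation = knows_rule_annotation("John", KNOWS, vote_dem)
```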
13. Competitive Diffusion
Competition
Hard competition expressed as constraints
- Example: A person has only one vote
vote(A,Dem) + vote(A,Republican) ≤ 1
Soft competition expressed by rule weights
which represent the relative probability that
the described diffusion will happen
- Example: If person B votes Democratic, then B’s husband
is likely to vote Democratic as well (but not necessarily):
vote(B,Dem):X ∧ wife(A,B):1 → vote(A,Dem):X | 0.8
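The hard constraint can be checked directly on a confidence assignment. The sketch below assumes an interpretation is a dictionary from ground atoms to confidences in [0, 1]; the function name, the tolerance and both example interpretations are hypothetical.

```python
def violates_one_vote(interp, person,
                      parties=("Dem", "Republican"), tol=1e-9):
    """Hard constraint: a person's vote confidences sum to at most 1."""
    total = sum(interp.get(("vote", person, p), 0.0) for p in parties)
    return total > 1.0 + tol


# Hypothetical interpretations (confidence assignments).
ok = {("vote", "John", "Dem"): 0.7, ("vote", "John", "Republican"): 0.3}
bad = {("vote", "John", "Dem"): 0.7, ("vote", "John", "Republican"): 0.5}
```

Soft competition needs no such check: a rule weight such as 0.8 only scales the penalty paid when the rule is not satisfied.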
14. Competitive Diffusion
Diffusion Rules II
Ground Rule
B1:X1 ∧ … ∧ Bn:Xn → H:f(X1,…,Xn) | w
Given Interpretation I:
Satisfaction:
I(H) ≥ f(I(B1),…,I(Bn))
Weighted Distance from Satisfaction:
w · max(0, f(I(B1),…,I(Bn)) − I(H))
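A minimal sketch of the weighted distance from satisfaction for one ground rule, using the weighted voting rule from the competition slide. The interpretation values are hypothetical, and `f` is written so that the head annotation X is taken from the first body atom, as in that rule.

```python
def distance_from_satisfaction(interp, body, head, f, weight):
    """w * max(0, f(I(B1), ..., I(Bn)) - I(H)) for one ground rule."""
    body_values = [interp[atom] for atom in body]
    return weight * max(0.0, f(body_values) - interp[head])


# Ground instance of the weighted rule (weight 0.8):
# body = vote(Mary,Dem):X, wife(John,Mary):1; head = vote(John,Dem):X
body = [("vote", "Mary", "Dem"), ("wife", "John", "Mary")]
head = ("vote", "John", "Dem")


def f(values):
    # The head annotation X is the annotation of the first body atom.
    return values[0]


# Hypothetical interpretation.
interp = {
    ("vote", "Mary", "Dem"): 0.9,
    ("wife", "John", "Mary"): 1.0,
    ("vote", "John", "Dem"): 0.5,
}
d = distance_from_satisfaction(interp, body, head, f, 0.8)
```

Here the rule is unsatisfied because I(H) = 0.5 is below 0.9, so the distance is positive. Raising I(vote(John,Dem)) to 0.9 or above would make it zero.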
15. Competitive Diffusion
Probabilistic Model Semantics
We use the rules and their weights to define a
probability distribution over the space of
“possible unfoldings of the diffusion process”
- i.e. interpretations or confidence assignments
- Exponential family distribution (as used e.g. in p* models)
d(P,I) = ‖d(R,I)‖x = ‖(d(R1,I), …, d(Rn,I))‖x, taken over all ground rules
P = set of rules, Ri = ground rule
Pr(I | P) = 1/Z exp(−d(P,I))
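The sketch below shows how the density could be evaluated up to the normalizer Z. It assumes, purely for illustration, that the per-rule distances are combined with the 1-norm (a plain sum); the function names and the distance values are hypothetical.

```python
import math


def total_distance(rule_distances):
    """d(P,I) under the assumed 1-norm: sum of per-rule distances."""
    return sum(rule_distances)


def unnormalized_density(rule_distances):
    """exp(-d(P,I)); dividing by Z would give Pr(I | P)."""
    return math.exp(-total_distance(rule_distances))


def density_ratio(distances_a, distances_b):
    """Pr(Ia | P) / Pr(Ib | P); the normalizer Z cancels."""
    return math.exp(total_distance(distances_b) - total_distance(distances_a))


# Hypothetical per-rule distances for two interpretations of one program.
d_a = [0.0, 0.12]
d_b = [0.32, 0.12]
ratio = density_ratio(d_a, d_b)
```

Because Z cancels in the ratio, two interpretations can be compared without integrating over the whole space of confidence assignments.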
16. Competitive Diffusion
Most Probable Interpretation
Finding the most probable interpretation
(MPI) is an optimization problem
- Reminiscent of determining least energy state
argmax_I Pr(I | P) = argmin_I d(P,I)
Restricting the GAP annotations to be
convex makes the problem tractable
- We currently focus on conic annotations which give
O(n^3.5) complexity (i.e. SOCP)
- n = number of ground rules
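A toy illustration of why maximizing the probability is the same as minimizing the distance: Z is a positive constant and exp(−d) decreases as d grows. Both ground rules and all numbers are hypothetical, and the grid search stands in for the conic solver mentioned above only to keep the example dependency-free.

```python
import math


# Two hypothetical ground rules sharing the free atom
# x = I(vote(John,Dem)):
#   weight 0.8: evidence vote(Mary,Dem) = 0.9 pushes x up
#   weight 0.3: x is a body value whose head vote(Bob,Dem) is fixed
#               at 0.2, which pushes x down
def d(x):
    return 0.8 * max(0.0, 0.9 - x) + 0.3 * max(0.0, x - 0.2)


grid = [i / 100 for i in range(101)]
mpi_by_distance = min(grid, key=d)
mpi_by_density = max(grid, key=lambda x: math.exp(-d(x)))
```

The distance here is piecewise linear and convex in x, which is the kind of structure that makes the optimization tractable.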
18. Competitive Diffusion
Minimal Grounding
Use fixpoint operator to determine the
minimal non-ground interpretation
- Keep the number of ground atoms small
- Intuition: If there is no evidence for it, we
don’t consider it
• If John and Jane aren’t married, don’t need to
consider rules with wife(John,Jane)
- Implementation: Ground out rules iteratively
until no further ground atoms are added to
the interpretation.
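The iterative grounding can be sketched as a loop that runs until nothing new is added. This is an assumption-laden toy: it supports only rules in which a vote spreads along one edge, and the facts (apart from wife(John,Mary) and knows(John,Jane)), the evidence and all names are hypothetical.

```python
# Hypothetical network facts.
FACTS = {
    ("wife", "John", "Mary"),
    ("knows", "Jeff", "John"),
    ("knows", "John", "Jane"),
}
# Evidence: atoms with a known non-zero confidence.
EVIDENCE = {("vote", "Mary", "Dem")}
# Each listed edge type carries a rule: a vote by B spreads to A
# along edge(A,B).
RULE_EDGES = ["wife", "knows"]


def minimal_grounding(facts, evidence, rule_edges):
    """Ground rules repeatedly until no new atom or rule is added."""
    atoms = set(evidence)
    ground_rules = set()
    changed = True
    while changed:
        changed = False
        for pred, a, b in facts:
            if pred not in rule_edges:
                continue
            body_atom = ("vote", b, "Dem")
            if body_atom not in atoms:
                continue  # no evidence for the body: skip this grounding
            head = ("vote", a, "Dem")
            rule = (body_atom, (pred, a, b), head)
            if rule not in ground_rules:
                ground_rules.add(rule)
                changed = True
            if head not in atoms:
                atoms.add(head)
                changed = True
    return atoms, ground_rules


atoms, rules = minimal_grounding(FACTS, EVIDENCE, RULE_EDGES)
```

In this toy run the vote spreads from Mary to John to Jeff, while the rule over knows(John,Jane) is never grounded because nothing supports vote(Jane,Dem).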
20. Competitive Diffusion
Dependency Graph
[Figure: dependency graph in which vote(Mary,Dem) and vote(Jane,Dem) each feed vote(John,Dem) through one ground rule]
vote(Mary,Dem) ∧ wife(John,Mary) → vote(John,Dem) | 0.8
vote(Jane,Dem) ∧ friend(John,Jane) → vote(John,Dem) | 0.3
Idea: Partition Dependency graph into
strongly connected components and
solve MPI on each independently
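One standard way to find strongly connected components is Tarjan's algorithm, sketched below. It assumes the dependency graph is given as a mapping from each body atom to the head atoms it feeds. The edge from vote(John,Dem) back to vote(Jane,Dem) is a hypothetical addition so that the example contains a cycle.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps a node to the nodes it points to."""
    index, low, on_stack, stack, comps = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return comps


dependency_graph = {
    "vote(Mary,Dem)": ["vote(John,Dem)"],
    "vote(Jane,Dem)": ["vote(John,Dem)"],
    "vote(John,Dem)": ["vote(Jane,Dem)"],  # hypothetical back edge
}
components = strongly_connected_components(dependency_graph)
```

Atoms in the same component influence each other and have to be solved together. The recursive form is adequate for a sketch; very large graphs would need an iterative version.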
21. Competitive Diffusion
Approximate Algorithm
1. Ground out dependency graph with
fixpoint operator
2. Partition dependency graph using a
modularity-maximizing clustering algorithm
- Inspired by Blondel et al. [06]
- Aggregate rule weights
3. Compute MPI on each cluster fixing
confidence values of outside atoms
4. Go to 1 until change in I < Θ
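The outer loop of steps 3 and 4 can be sketched as follows, taking the grounding and the clusters as given inputs. Everything here is an illustrative stand-in: `solve_cluster` uses a coordinate-wise grid search rather than a conic solver, the annotation function is assumed to be the minimum of the body confidences, and the rules, clusters and values are hypothetical.

```python
def rule_distance(interp, rule):
    body, head, weight = rule
    # Assumed annotation function: minimum of the body confidences.
    return weight * max(0.0, min(interp[b] for b in body) - interp[head])


def total_distance(interp, rules):
    return sum(rule_distance(interp, r) for r in rules)


def solve_cluster(interp, cluster, rules, steps=20):
    """Stand-in for the per-cluster MPI solver: grid search over each
    atom of the cluster while all other atoms stay fixed."""
    candidates = [i / steps for i in range(steps + 1)]
    for atom in sorted(cluster):
        def cost(v):
            trial = dict(interp)
            trial[atom] = v
            return total_distance(trial, rules)
        interp[atom] = min(candidates, key=cost)


def approximate_mpi(interp, clusters, rules, theta=1e-3, max_rounds=50):
    """Repeat per-cluster solves until the largest change is below theta."""
    for _ in range(max_rounds):
        before = dict(interp)
        for cluster in clusters:
            solve_cluster(interp, cluster, rules)
        change = max(abs(interp[a] - before[a]) for a in interp)
        if change < theta:
            break
    return interp


rules = [
    (["vote(Mary,Dem)"], "vote(John,Dem)", 0.8),
    (["vote(Jane,Dem)"], "vote(John,Dem)", 0.3),
    (["vote(John,Dem)"], "vote(Jeff,Dem)", 0.5),
]
interp = {"vote(Mary,Dem)": 0.9, "vote(Jane,Dem)": 0.4,
          "vote(John,Dem)": 0.0, "vote(Jeff,Dem)": 0.0}
clusters = [{"vote(John,Dem)"}, {"vote(Jeff,Dem)"}]
result = approximate_mpi(interp, clusters, rules)
```

Evidence atoms are simply left out of every cluster, so their confidences never change. The sketch does not re-run the grounding or the clustering between rounds, and it omits the rule-weight aggregation.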
22. Competitive Diffusion
[Figure: example multi-labeled social network connecting professors, students and other people with papers (“ABC”, “HIJ”, “UVW”, “XYZ”), departments, universities, events (ASONAM 09, KPLLC 09) and countries (USA, Italy, Denmark). Edge labels include author, member, dean, faculty, friends, department, in, attended, presented, submitted, accepted, organized, visited, comment, student of, collaborates and colleagues.]
23. Competitive Diffusion
Experiments
Synthetically generated scale free,
labeled social networks
6 edge types, 7 rules
Used different parameter settings for
convergence condition
Executed on a single 16-core machine
with 256 GB of memory.
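The slides do not say which generator produced the synthetic networks. As one possible illustration, the sketch below builds a scale-free, edge-labeled graph by preferential attachment. The six label names are borrowed from the earlier examples purely as placeholders, and every parameter is hypothetical.

```python
import random

# Six placeholder edge types (the actual labels used are not stated).
EDGE_TYPES = ["knows", "coworker", "best_friends",
              "wife", "siblings", "friend"]


def scale_free_edges(n_nodes, edges_per_node, seed=0):
    """Preferential attachment: each new node links to existing nodes
    with probability proportional to their current degree."""
    rng = random.Random(seed)
    edges = []
    # A node appears in targets once per incident edge; the first
    # edges_per_node nodes are seeded with one entry each.
    targets = list(range(edges_per_node))
    for new in range(edges_per_node, n_nodes):
        chosen = set()
        while len(chosen) < edges_per_node:
            chosen.add(rng.choice(targets))
        for old in sorted(chosen):
            edges.append((new, old, rng.choice(EDGE_TYPES)))
            targets.extend([new, old])
    return edges


edges = scale_free_edges(n_nodes=100, edges_per_node=3)
```

Fixing the seed makes a generated network reproducible across runs, which helps when comparing parameter settings on the same input.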
24. Competitive Diffusion
Scalability
[Figure: Exact vs Approximate Algorithm Running Times. x-axis: # edges in graph (0 to 80,000); y-axis: time in seconds (0 to 16,000). Series: Exact Algorithm; Approximate Algorithm with Parameters A.]
25. Competitive Diffusion
Accuracy
[Figure: Relative Error compared to Exact Inference. x-axis: number of edges (0 to 80,000); y-axis: percentage relative error (0% to 7%). Series: Parameters A, B, C, D, E.]
26. Competitive Diffusion
Runtime
[Figure: Running Time Comparison of Approximate Algorithm. x-axis: number of edges (0 to 80,000); y-axis: time in seconds (0 to 500). Series: Parameters A, B, C, D, E.]
27. Competitive Diffusion
Accuracy on large SN
[Figure: Relative Error on Large Networks. x-axis: number of edges, log scale (3.5E+05 to 5.6E+06); y-axis: percentage relative error (0% to 6%). Series: Parameters B, C, D, E.]
28. Competitive Diffusion
Runtime on large SN
[Figure: Runtime Comparison on Large Networks, log-log scale. x-axis: number of edges (3.5E+05 to 5.6E+06); y-axis: time in seconds (400 to 40,000). Series: Parameters A, B, C, D, E. Annotation: 2M edges in 48 min.]
29. Competitive Diffusion
Conclusion
Expressive language for competitive
diffusion models
Scalable algorithm to compute such
models on large social networks
Verify model on real diffusion data