Using Vector Clocks to Visualize Communication Flow
1. Using Vector Clocks to Visualize Communication Flow
Martin Harrigan
Complex & Adaptive Systems Laboratory (CASL)
University College Dublin
2. Introduction
Communication Flow
Vector Clocks
The Methodology
Experiments
Conclusions & Future Work
3. Introduction
A metaphor for communication: flow in conduits
‘slow and fast conduits’ (Kossinets et al., ’08)
‘transatlantic flows of communication’ (Leskovec & Horvitz, ’08)
‘a series of tubes’ (Stevens, ’06)
It is a tempting metaphor when visualizing communication flow.
A dynamic weighted directed graph?
people ↔ vertices
communications ↔ (weighted) directed edges
time ↔ animation
4. Introduction
Is the graphical design grounded in the substance to be
communicated (Brandes, ’99)?
Is there a discrepancy between the visualization and the
analysis of communication flow?
Visualization | Analysis
crossings     | synchronization
area          | information latency
slopes        | temporal distance
symmetries    | network backbones
mental map    | periodicity
Can we integrate the two?
5. Introduction
Vector clocks are a useful tool when analyzing
communication flow...
Introduced by Fidge (’88) and Mattern (’89).
Aid with the causal ordering of events in
distributed systems.
Provide greater insight into communication flow
than dynamic weighted directed graphs (Kossinets et al.,
’08).
We can use them to visualize communication flow.
8. Introduction
B and D agree
to meet at the
Radisson Blu.
9. Introduction
A checks with B
and C where they
should meet.
D is no longer
reachable but they
know to meet him
at the Radisson
Blu.
10. Introduction
Vector clocks maintain, for each actor, the time of their most
recent communication, either directly or indirectly, from every
other actor.
13. The Methodology
During a time interval [0, T ], we have a set of vertices V and
a set of time-attributed directed edges E.
The instantaneous graph Gt = (V, Et ) at time-slice t is the
graph with vertex set V and edge set
Et = {(u, v)|(u, v, t) ∈ E}.
Input: G0 , . . . , GT (instantaneous graphs)
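The definition of the instantaneous graphs can be illustrated with a minimal Python sketch. The function name `instantaneous_edge_sets` and the `(u, v, t)` tuple layout are assumptions made for this example; the slides do not prescribe a data structure.

```python
from collections import defaultdict

def instantaneous_edge_sets(edges, T):
    """Return [E_0, ..., E_T], where E_t = {(u, v) | (u, v, t) in edges}."""
    by_time = defaultdict(set)
    for u, v, t in edges:
        if 0 <= t <= T:          # ignore edges outside the interval [0, T]
            by_time[t].add((u, v))
    return [by_time[t] for t in range(T + 1)]

# Hypothetical toy input: three communications over time-slices 0..2.
E = [("a", "b", 0), ("b", "c", 1), ("a", "c", 1)]
slices = instantaneous_edge_sets(E, T=2)
```

Each entry of `slices` is the edge set of one instantaneous graph; the vertex set V is shared by all of them.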
14. The Methodology
For each instantaneous graph:
1 Each vertex has a corresponding vector clock which represents
a point in a high-dimensional space.
2 Each time-attributed directed edge (communication) updates
the vector clock of the target (receiving) vertex.
3 We compute the distances between all pairs of vector clocks
using an appropriate metric.
4 We construct a dissimilarity matrix from these distances.
5 We use multidimensional scaling (MDS) to produce a 2-d
visualization of the data points.
15. The Methodology
Output: C0 , . . . , CT (2-d coordinates for each time-slice)
[Figure: four panels labeled t = 0, t = 1, t = 2, t = 3]
Greene et al., ’10; Lancichinetti et al., ’08
17. The Methodology
For each instantaneous graph:
1 Each vertex has a corresponding vector clock which represents
a point in a high-dimensional space.
\[ \phi_{u,t} = (t_1, t_2, \ldots, t_n)^{\mathsf{T}} \]
Initialization
Individual increment at each time-slice
18. The Methodology
For each instantaneous graph:
2 Each time-attributed directed edge (communication) updates
the vector clock of the target (receiving) vertex.
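\[ \phi_{u,t} = (t_1, t_2, \ldots, t_n)^{\mathsf{T}}, \quad \phi_{v,t} = (s_1, s_2, \ldots, s_n)^{\mathsf{T}} \;\Longrightarrow\; \phi_{v,t} = (\max(t_1, s_1), \max(t_2, s_2), \ldots, \max(t_n, s_n))^{\mathsf{T}} \]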
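Steps 1 and 2 can be sketched in Python as follows. Several details are assumptions, because the slides do not spell them out: every component starts at 0, each vertex increments its own component by one per time-slice before the communications of that slice are applied, and sender clocks are snapshotted so that the result does not depend on edge order. The names `init_clocks` and `step` are hypothetical.

```python
def init_clocks(vertices):
    """Assumed initialization: every component of every clock starts at 0."""
    return {v: {w: 0 for w in vertices} for v in vertices}

def step(clocks, edges_t):
    """Advance all clocks by one time-slice and apply its communications."""
    # Individual increment: each vertex advances its own component.
    for v in clocks:
        clocks[v][v] += 1
    # Snapshot the clocks so every sender is read in its pre-communication state.
    before = {v: dict(c) for v, c in clocks.items()}
    for u, v in edges_t:
        # The receiver v takes the component-wise maximum with the sender u.
        for w, s in before[u].items():
            clocks[v][w] = max(clocks[v][w], s)
    return clocks

clocks = init_clocks(["a", "b", "c"])
step(clocks, {("a", "b")})   # a communicates with b
step(clocks, {("b", "c")})   # b communicates with c
```

After the second step, c holds a non-zero entry for a even though a never contacted c directly: the information travelled indirectly via b.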
19. The Methodology
For each instantaneous graph:
3 We compute the distances between all pairs of vector clocks
using an appropriate metric.
\[ d(\phi_{u,t}, \phi_{v,t}) = \sqrt{(t_1 - s_1)^2 + (t_2 - s_2)^2 + \cdots + (t_n - s_n)^2} \]
Which metric?
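As a small worked example, assuming the Euclidean metric is the one shown, the distance between two vector clocks can be computed as below. The Manhattan variant is included only to illustrate that other metrics are possible; it is not taken from the slides, and both function names are hypothetical.

```python
import math

def euclidean(phi_u, phi_v):
    """Euclidean distance between two vector clocks of equal length."""
    return math.sqrt(sum((t - s) ** 2 for t, s in zip(phi_u, phi_v)))

def manhattan(phi_u, phi_v):
    """An alternative metric (sum of absolute differences), for comparison."""
    return sum(abs(t - s) for t, s in zip(phi_u, phi_v))

phi_u = [2, 0, 0]
phi_v = [1, 2, 2]
d_euclid = euclidean(phi_u, phi_v)     # sqrt(1 + 4 + 4)
d_manhattan = manhattan(phi_u, phi_v)  # 1 + 2 + 2
```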
20. The Methodology
For each instantaneous graph:
4 We construct a dissimilarity matrix from these distances.
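\[ M_t = \begin{pmatrix}
0 & d(\phi_{u_1,t}, \phi_{u_2,t}) & \cdots & d(\phi_{u_1,t}, \phi_{u_n,t}) \\
d(\phi_{u_2,t}, \phi_{u_1,t}) & 0 & & \vdots \\
\vdots & & \ddots & \\
d(\phi_{u_n,t}, \phi_{u_1,t}) & \cdots & & 0
\end{pmatrix} \]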
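A minimal sketch of step 4, assuming Euclidean distance and vector clocks stored as equal-length lists (one per vertex). The function name `dissimilarity_matrix` is hypothetical.

```python
import math

def dissimilarity_matrix(clocks):
    """Return the symmetric n x n matrix of pairwise clock distances."""
    n = len(clocks)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(clocks[i], clocks[j])  # Euclidean distance (Python 3.8+)
            M[i][j] = M[j][i] = d                # fill both triangles
    return M

M = dissimilarity_matrix([[2, 0, 0], [1, 2, 0], [1, 2, 2]])
```

Only the upper triangle is computed, which roughly halves the work; the cost is still quadratic in the number of vertices for every time-slice.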
21. The Methodology
For each instantaneous graph:
5 We use multidimensional scaling (MDS) to produce a 2-d
visualization of the data points.
Dynamic MDS?
Procrustes Analysis?
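The slides leave the choice of MDS algorithm open. As one possible illustration, the sketch below implements classical (Torgerson) MDS in plain Python, using double centering followed by power iteration with deflation to obtain the two leading eigenvectors. This is a swapped-in textbook technique, not necessarily the variant used for the figures; the function name, iteration count, and seed are assumptions.

```python
import math
import random

def classical_mds_2d(M, iters=500, seed=0):
    """Embed n points in 2-d from a symmetric n x n dissimilarity matrix M."""
    n = len(M)
    sq = [[M[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    # Double centering: B = -1/2 * J * D^2 * J
    B = [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0, 0.0] for _ in range(n)]
    for k in range(2):
        x = [rng.random() - 0.5 for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
        for _ in range(iters):                      # power iteration
            y = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(v * v for v in y))
            if norm < 1e-12:
                break
            x = [v / norm for v in y]
        Bx = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(x[i] * Bx[i] for i in range(n))   # Rayleigh quotient
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][k] = scale * x[i]
            for j in range(n):
                B[i][j] -= lam * x[i] * x[j]        # deflate the found component
    return coords

# Distances of a 3-4-5 right triangle, which embeds exactly in 2-d.
M = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
C = classical_mds_2d(M)
```

Layouts computed independently per time-slice are only determined up to rotation and reflection, which is one reason the slide raises dynamic MDS and Procrustes analysis as open questions.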
22. Experiments
Four artificial datasets comprising temporal sequences of
communications between 100 actors during a time interval
[0, 99].
The datasets were generated by fixing the set of possible
communications and then selecting a communication from the
set of possible communications at time-slice t with probability
p = 0.005.
Each dataset had a distinct underlying communication
pattern.
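One reading of this generation procedure is that, at every time-slice, each possible communication is selected independently with probability p. The sketch below follows that interpretation, which is an assumption; the function name `sample_communications`, the fixed seed, and the toy set of ten actors are hypothetical.

```python
import random

def sample_communications(possible, T=99, p=0.005, seed=0):
    """Return time-attributed edges (u, v, t) sampled from `possible`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    edges = []
    for t in range(T + 1):
        for u, v in possible:
            if rng.random() < p:       # select this communication at time t
                edges.append((u, v, t))
    return edges

# Toy example with 10 actors where every ordered pair may communicate.
possible = [(u, v) for u in range(10) for v in range(10) if u != v]
E = sample_communications(possible)
```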
23. Experiments
DS1: All communications were possible.
[Figure: four panels labeled 25th, 50th, 75th, 100th]
24. Experiments
DS2: Communications were possible between every pair of actors
in only one direction such that there were no directed cycles of
communication.
[Figure: four panels labeled 25th, 50th, 75th, 100th]
25. Experiments
DS3: The actors were partitioned into four equal subsets and all
intra-subset communications were possible.
[Figure: four panels labeled 25th, 50th, 75th, 100th]
26. Experiments
DS4: The actors were partitioned as in DS3 and intra-subset
communications were possible between every pair of actors in only
one direction such that there were no directed cycles of
communication.
[Figure: four panels labeled 25th, 50th, 75th, 100th]
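The four sets of possible communications (DS1 to DS4) might be constructed as in the sketch below. Three choices are assumptions, since the slides do not state them: self-communications are excluded, the acyclic orientation runs from lower to higher actor index, and the four subsets are contiguous index ranges. All function names are hypothetical.

```python
def ds1(n=100):
    """DS1: every ordered pair of distinct actors may communicate."""
    return {(u, v) for u in range(n) for v in range(n) if u != v}

def ds2(n=100):
    """DS2: one direction per pair; low-to-high orientation has no directed cycles."""
    return {(u, v) for u in range(n) for v in range(u + 1, n)}

def subset_of(u, n=100, k=4):
    """Index of the subset containing actor u (k equal contiguous subsets)."""
    return u // (n // k)

def ds3(n=100, k=4):
    """DS3: all intra-subset communications are possible."""
    return {(u, v) for u, v in ds1(n) if subset_of(u, n, k) == subset_of(v, n, k)}

def ds4(n=100, k=4):
    """DS4: intra-subset communications, one direction per pair, acyclic."""
    return {(u, v) for u, v in ds2(n) if subset_of(u, n, k) == subset_of(v, n, k)}

sizes = [len(ds1()), len(ds2()), len(ds3()), len(ds4())]
```

With 100 actors this gives 9900, 4950, 2400, and 1200 possible communications respectively.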
27. Experiments
We also visualized a VAST 2008 challenge dataset (Grinstein et al.,
’08).
This dataset comprises mobile phone call records over a 10-day
period between 400 unique mobile phones.
We set each time-slice equal to one hour.
[Figure: four panels labeled 25th, 50th, 75th, 100th]
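Mapping call timestamps to one-hour time-slices can be sketched as below. The start date is a placeholder invented for the example and is not the real start of the dataset; the function name `hour_slice` is also hypothetical.

```python
from datetime import datetime

def hour_slice(ts, start):
    """Index of the one-hour time-slice that contains timestamp ts."""
    return int((ts - start).total_seconds() // 3600)

start = datetime(2000, 1, 1)                             # placeholder start of the period
first = hour_slice(datetime(2000, 1, 1, 13, 45), start)  # a call on day 1, 13:45
last = hour_slice(datetime(2000, 1, 10, 23, 59), start)  # a call in the final hour
```

With one-hour slices, a 10-day period yields 240 time-slices, indexed 0 to 239.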
28. Conclusions & Future Work
A novel methodology for visualizing communication flow.
temporal sequence of communications → vector clocks
vector clocks → (distance metric) → dissimilarity matrix
dissimilarity matrix → (MDS) → 2-d visualizations
Actors who have received the same or
communicatively-equivalent communications are placed close
together, whereas actors who have received largely different
communications are placed far apart.
29. Conclusions & Future Work
There is much future work:
Both the distance metric (Bellman, ’61) and the choice of MDS
algorithm need investigation.
Can we extend vector clocks to model the attenuation of
information, the bounded capacity of communication
channels, etc.? Can we visualize synchronicity, network
backbones, periodicity?
Scalability (maintenance of the vector clocks and the
computation of the dissimilarity matrix).