The authors present TRIÈST, a suite of one-pass streaming algorithms to compute unbiased, low-variance, high-quality approximations of the global and local number of triangles in a fully-dynamic graph represented as an adversarial stream of edge insertions and deletions.
TRIEST: Counting Local and Global Triangles in Fully-Dynamic Streams with Fixed Memory Size
1. TRIÈST: Approximating Triangle Counts
in Fully-Dynamic Graph Edge Streams
with Fixed Memory
Matteo Riondato – Labs, Two Sigma Investments
CMU DB Group – October 24, 2016
1 / 26
2. Who am I?
Matteo Riondato
Working at
Labs, Two Sigma Investments (Research Scientist);
CS Dept., Brown U. (Visiting Asst. Prof.);
Doing research in algorithmic data science
(used to be data mining, but somehow we forgot about algorithms...);
algorithmic data science = (theory × practice)^(theory × practice)
Tweeting @teorionda;
“Living” at http://matteo.rionda.to.
2 / 26
3. What am I going to talk about?
TRIÈST: a suite of algorithms for approximately counting triangles in fully-dynamic
edge streams, using a fixed amount of storage/space/memory.
Joint work with:
• Lorenzo De Stefani (Brown);
• Alessandro Epasto (Google Research);
• Eli Upfal (Brown);
Best student paper award at ACM KDD’16;
Journal version under submission to ACM TKDD,
available from http://bit.ly/triestkdd;
TRIÈST: Counting Local and Global Triangles in Fully-Dynamic Streams with Fixed Memory Size
Lorenzo De Stefani, Brown University, Providence, RI, USA, lorenzo@cs.brown.edu
Alessandro Epasto, Google, New York, NY, USA, aepasto@google.com
Matteo Riondato, Two Sigma Investments, New York, NY, USA, matteo@twosigma.com
Eli Upfal, Brown University, Providence, RI, USA, eli@cs.brown.edu
“Ogni lassada xe persa”
– Proverb from Trieste, Italy.
ABSTRACT
We present TRIÈST, a suite of one-pass streaming algorithms to compute unbiased, low-variance, high-quality approximations of the global and local (i.e., incident to each vertex) number of triangles in a fully-dynamic graph represented as an adversarial stream of edge insertions and deletions.
Our algorithms use reservoir sampling and its variants to exploit the user-specified memory space at all times. This is in contrast with previous approaches, which require hard-to-choose parameters (e.g., a fixed sampling probability) and offer no guarantees on the amount of memory they use. We analyze the variance of the estimations and show novel concentration bounds for these quantities.
Our experimental results on very large graphs demonstrate that TRIÈST outperforms state-of-the-art approaches in accuracy and exhibits a small update time.
1. INTRODUCTION
Exact computation of characteristic quantities of Web-scale networks is often impractical or even infeasible due [...] approximation of these quantities. For efficiency, the algorithms should aim at exploiting the available memory space as much as possible and they should require only one pass over the stream.
We introduce TRIÈST, a suite of sampling-based, one-pass algorithms for adversarial fully-dynamic streams to approximate the global number of triangles and the local number of triangles incident to each vertex. Mining local and global triangles is a fundamental primitive with many applications (e.g., community detection [4], topic mining [10], spam/anomaly detection [3, 27], ego-networks mining [12] and protein interaction networks analysis [29]).
Many previous works on triangle estimation in streams also employ sampling (see Sect. 3), but they usually require the user to specify in advance an edge sampling probability p that is fixed for the entire stream. This approach presents several significant drawbacks. First, choosing a p that allows to obtain the desired approximation quality requires to know or guess a number of properties of the input (e.g., the size of the stream). Second, a fixed p implies that the sample size grows with the size of the stream, which is problematic when the stream size is not known in advance: if the user [...]
3 / 26
7. What are triangles?
Let G = (V , E) be a graph.
[Figure: example graph on vertices 1 to 8]
Triangle: a set of three edges forming a cycle;
Global triangle count ∆G: the no. of triangles in G; E.g., ∆G = 3;
Local triangle count ∆v for v ∈ V : the no. of triangles that v “belongs” to;
E.g., ∆1 = 2, ∆5 = 3, ∆6 = 0, . . .
Applications: community/spam/event detection, link prediction/recommendation,
prototype for more complex patterns, . . .
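The two definitions above can be made concrete with a small sketch. The edge list below is a hypothetical example (the graph drawn on the slide is not recoverable from this transcript), and `triangle_counts` is an illustrative name, not part of the TRIÈST code.

```python
from collections import defaultdict


def triangle_counts(edges):
    """Return (global triangle count, {vertex: local triangle count})."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    local = {v: 0 for v in adj}
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                # Common neighbours of u and v close a triangle with edge (u, v).
                for w in adj[u] & adj[v]:
                    if v < w:  # count each triangle once, as u < v < w
                        total += 1
                        local[u] += 1
                        local[v] += 1
                        local[w] += 1
    return total, local


if __name__ == "__main__":
    # Hypothetical graph: triangles {1, 2, 3} and {2, 3, 4}; vertex 5 is in none.
    edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (4, 5)]
    total, local = triangle_counts(edges)
    print(total, local)  # 2 {1: 1, 2: 2, 3: 2, 4: 1, 5: 0}
```

This exact computation needs the whole graph in memory, which is precisely what the streaming setting rules out.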
4 / 26
21. What are fully-dynamic edge streams?
Discrete time t, starting at t = 0 and never ending;
At each time step, a new edge update (insertion or deletion) is on the stream:
Time:    ...  t∗         t∗+1       t∗+2       t∗+3       t∗+4       t∗+5       ...
Stream:  ...  +, (1, 2)  +, (3, 2)  +, (1, 3)  −, (3, 2)  +, (1, 5)  +, (4, 5)  ...
The order may be fixed in advance by an adversary.
G(t) = (V (t), E(t)): graph induced by the edges inserted and not deleted up to time t.
Example: Time: t∗ + 5; Element on the stream: +, (4, 5)
Graph G(t∗+5): [Figure: graph on vertices 0 to 5 after the update]
The global and local triangle counts change from G(t) to G(t+1);
Our goal: at each time t, give an estimate of ∆G(t) and ∆v, v ∈ V (t).
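A minimal sketch of how G(t) evolves under the six updates listed in the stream above. It assumes, purely for illustration, that the graph is empty just before t∗ (the actual state at that point is not given in the transcript); `replay` is a hypothetical helper name.

```python
def replay(stream):
    """Apply edge updates in order; return the edge set after every update."""
    edges = set()
    snapshots = []
    for op, (u, v) in stream:
        e = (min(u, v), max(u, v))  # undirected: store endpoints in sorted order
        if op == "+":
            edges.add(e)
        elif op == "-":
            edges.discard(e)
        else:
            raise ValueError(f"unknown update type: {op!r}")
        snapshots.append(frozenset(edges))
    return snapshots


if __name__ == "__main__":
    stream = [
        ("+", (1, 2)), ("+", (3, 2)), ("+", (1, 3)),
        ("-", (3, 2)), ("+", (1, 5)), ("+", (4, 5)),
    ]
    for step, snap in enumerate(replay(stream)):
        print(f"t*+{step}:", sorted(snap))
```

Under that assumption, the triangle {1, 2, 3} exists only at time t∗ + 2 and disappears again when (3, 2) is deleted at t∗ + 3, which is the kind of change an estimator has to follow.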
5 / 26
23. Why is working on fully-dynamic edge streams difficult?
The stream is infinite: storing all (or a constant fraction of) the edges is impossible;
→ TRIÈST stores a user-specified, fixed amount M of edges;
There is no end of the stream: post-processing at the end of the stream is impossible;
→ TRIÈST needs no postprocessing.
Updates arrive continuously: re-running an algorithm from scratch after each update
is infeasible; → TRIÈST is incremental and one-pass;
Triangle counts change continuously: spending a long time on each update to get the
exact count is infeasible and illogical; → TRIÈST computes high-quality estimates;
An efficient algorithm for fully-dynamic streams must tackle all these challenges.
TRIÈST does.
6 / 26
24. What is TRIÈST?
(the local dialect name of Trieste, a city in the North-East of Italy, next to Slovenia.)
TRIÈST (TRIangles ESTimation):
A suite of 3 algorithms for approximate triangle counting from edge streams:
• TRIÈST-BASE: baseline algorithm for insertion-only streams;
• TRIÈST-IMPR: improved algorithm for insertion-only streams with reduced variance;
• TRIÈST-FD: algorithm for fully-dynamic streams.
All three algorithms offer unbiased estimators of the local and global triangle counts;
We also present a complete analysis of their variance and give concentration bounds;
7 / 26
25. Aren’t there other algorithms to estimate triangles?
There are many algorithms for estimating triangles from data streams;
Most-recent ones are based on independent edge sampling with fixed probability;
They use an ever-increasing amount of space;
[Table: columns are Work | Single pass | Fixed space | Local counts | Global counts | Fully-dynamic streams; rows are Becchetti et al. 2010; Kolountzakis et al. 2012; Pavan et al. 2013; Jha et al. 2015; Ahmed et al. 2014; Lim et al. 2015; TRIÈST. The per-cell marks are not preserved.]
TRIÈST is the first to tackle all the challenges;
It is based on reservoir sampling, a well-known non-independent sampling scheme;
The analysis is challenging, but the gains are worth the price.
8 / 26
26. What is the general idea behind TRIÈST?
Let’s focus on TRIÈST-BASE for now (i.e., insertion-only streams);
TRIÈST-BASE maintains a collection S of M edges from the stream;
The edges in S induce a graph GS = (VS, S);
TRIÈST-BASE maintains the exact values for
∆GS: the number of triangles in GS; and
∆vS: the number of triangles in GS incident to v ∈ VS.
Maintaining the exact counts ∆GS and ∆vS, v ∈ V (t), after each update is fast;
Estimates for ∆G(t) and ∆v, v ∈ V (t), are obtained from ∆GS and ∆vS by weighting by a probability πt (stay tuned!)
9 / 26
27. How does TRIÈST-BASE work?
TRIÈST-BASE uses a random sampling scheme known as reservoir sampling;
At any time t ≤ M, deterministically insert the edge currently on the stream into S;
At any time t > M, flip a coin with tail-bias M/t;
If the outcome is head, do nothing;
If the outcome is tail:
1) Choose an edge in S uniformly at random (u.a.r.) and replace it with the edge currently on the stream;
2) Decrease ∆GS and ∆vS, v ∈ VS, by the no. of triangles involving the removed edge;
3) Increase ∆GS and ∆vS, v ∈ VS, by the no. of triangles involving the inserted edge;
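The steps above can be sketched as follows. This is an illustrative Python sketch, not the authors' C++ implementation: the class and method names are hypothetical, the stream is assumed to contain no duplicate edges or self-loops, and for t ≤ M the estimate is assumed to be the exact sample count (the whole stream fits in S).

```python
import random
from collections import defaultdict


class TriestBase:
    """Reservoir of at most M edges plus exact triangle counters on the sample."""

    def __init__(self, M, seed=None):
        if M < 3:
            raise ValueError("M must be at least 3 to hold a triangle")
        self.M = M
        self.t = 0
        self.sample = []                # edges currently in S
        self.adj = defaultdict(set)     # adjacency of the sampled graph G_S
        self.global_count = 0           # triangles in G_S
        self.local = defaultdict(int)   # per-vertex triangles in G_S
        self.rng = random.Random(seed)

    def _update(self, u, v, sign):
        # Every common neighbour w of u and v forms a triangle with edge (u, v).
        for w in self.adj[u] & self.adj[v]:
            self.global_count += sign
            for x in (u, v, w):
                self.local[x] += sign

    def _add(self, u, v):
        self._update(u, v, +1)
        self.adj[u].add(v)
        self.adj[v].add(u)
        self.sample.append((u, v))

    def _remove_at(self, i):
        u, v = self.sample[i]
        self.sample[i] = self.sample[-1]  # swap-remove in O(1)
        self.sample.pop()
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        self._update(u, v, -1)

    def insert(self, u, v):
        self.t += 1
        if self.t <= self.M:
            self._add(u, v)                          # deterministic insertion
        elif self.rng.random() < self.M / self.t:    # "tail" with probability M/t
            self._remove_at(self.rng.randrange(len(self.sample)))
            self._add(u, v)

    def estimate(self):
        M, t = self.M, self.t
        if t <= M:
            return float(self.global_count)  # assumption: sample is the whole graph
        pi_t = (M * (M - 1) * (M - 2)) / (t * (t - 1) * (t - 2))
        return self.global_count / pi_t
```

Counting the triangles of an edge through the intersection of the two endpoint neighbourhoods is what keeps each update cheap: its cost depends on the degrees in the sample, not on the length of the stream.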
10 / 26
28. Is an example worth a thousand words?
Memory: M = 8; Time: end of t∗ − 1;
Actions: (none)
Graph GS = (VS, S): [Figure: sampled graph on vertices 0 to 5]
Global triangle count ∆GS: 3
11 / 26
33. Is an example worth a thousand words?
Memory: M = 8; Time: t∗;
Edge on the stream: (2, 5);
Coin bias: M/t∗; Coin flip outcome: tail;
Actions: 1) Remove an edge in GS at random (e.g., (0, 1)); 2) Add (2, 5) to GS; 3) Update ∆GS;
Graph GS = (VS, S): [Figure: sampled graph after replacing (0, 1) with (2, 5)]
Global triangle count ∆GS: 3 − 1 + 1 = 3
11 / 26
35. Is an example worth a thousand words?
Memory: M = 8; Time: t∗ + 1;
Edge on the stream: (2, 4);
Coin bias: M/(t∗ + 1); Coin flip outcome: head;
Actions: Do nothing;
Graph GS = (VS, S): [Figure: sampled graph, unchanged]
Global triangle count ∆GS: 3
11 / 26
39. How does TRIÈST-BASE estimate the number of triangles?
Lemma
The set S ⊆ E(t) is chosen uniformly at random among all subsets of E(t) of size M.
This does not imply/assume that S is a collection of independently sampled edges.
Corollary
The probability that a triangle (a, b, c) of G(t) is in GS at time t is
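$$\pi_t = \frac{\binom{t-3}{M-3}}{\binom{t}{M}}$$
because
$\binom{t}{M}$ is the number of M-subsets of E(t) (|E(t)| = t), and
$\binom{t-3}{M-3}$ is the number of M-subsets of E(t) containing the three edges of (a, b, c).
Hence, TRIÈST-BASE computes the unbiased estimate of ∆G(t):
$$\widehat{\Delta}_{G(t)} = \frac{\Delta_{G_S}}{\pi_t}.$$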
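A short numeric sketch of the corollary, using exact rational arithmetic. The function names are illustrative; the closed form M(M−1)(M−2)/(t(t−1)(t−2)) follows by expanding the two binomial coefficients, and treating πt as 1 when t ≤ M is an assumption made here so the functions are defined for short streams.

```python
from fractions import Fraction
from math import comb


def pi_t(t, M):
    """Probability that the 3 edges of a given triangle are all in S (M >= 3)."""
    if t <= M:
        return Fraction(1)  # assumption: every edge seen so far is in S
    return Fraction(comb(t - 3, M - 3), comb(t, M))


def pi_t_closed(t, M):
    """Same quantity after cancelling the factorials."""
    if t <= M:
        return Fraction(1)
    return Fraction(M * (M - 1) * (M - 2), t * (t - 1) * (t - 2))


def estimate_global(sample_count, t, M):
    """Scale the exact count on the sample by 1 / pi_t."""
    return sample_count / pi_t(t, M)


if __name__ == "__main__":
    print(pi_t(10, 5), pi_t_closed(10, 5))  # 1/12 1/12
    print(estimate_global(3, 10, 5))        # 36
```

With M = 5 and t = 10, a triangle survives in the sample with probability 1/12, so each sampled triangle stands for 12 triangles of G(t).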
12 / 26
41. Where are the theorems?
We give complete analysis of unbiasedness, variance, and novel concentration bounds;
The events “edge a ∈ S at time t“ and “edge b ∈ S at time t” are not independent;
This makes the analysis of variance and concentration bounds quite challenging;
Theorem (Concentration bound, (ε, δ)-approximation)
Let t ≥ 0 and assume |∆(t)| > 0. For any ε, δ ∈ (0, 1), let
$$\Phi = \sqrt[3]{8\varepsilon^{-2}\,\frac{3h(t)+1}{|\Delta(t)|}\,\ln\!\left(\frac{(3h(t)+1)\,e}{\delta}\right)}.$$
If
$$M \ge \max\left\{\, t\Phi\left(1 + \tfrac{1}{2}\ln^{2/3}(t\Phi)\right),\; 12\varepsilon^{-1} + e^{2},\; 25 \,\right\},$$
then |ξ(t)τ(t) − |∆(t)|| < ε|∆(t)| with probability > 1 − δ.
Proving this was fun:
we used results on graph coloring, Poisson approximations, and Chernoff bounds.
13 / 26
42. Ok, but can I show you something?
To derive the exact variance of the TRIÈST-BASE estimator ∆GS:
1) Express the variance as a sum of covariances over pairs of triangles:
$$\mathrm{Var}(\Delta_{G_S}) = \sum_{\text{pairs } (a,b)} \mathrm{Cov}(a, b)$$
2) Explicitly compute covariance formulas:
2.a) For pairs of triangles sharing an edge, compute the probability of 5 edges being in S:
$$\pi_t\,\frac{(M-3)(M-4)}{(t-3)(t-4)}$$
2.b) For pairs of triangles not sharing an edge, compute the probability of 6 edges being in S:
$$\pi_t\,\frac{(M-3)(M-4)(M-5)}{(t-3)(t-4)(t-5)}$$
The variance depends on the real no. of triangles in G(t) and on the no. of triangles in G(t) sharing an edge.
14 / 26
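The two joint probabilities in step 2 can be checked by brute force on a tiny instance. The sketch below enumerates every M-subset of t edges (edges are just the integers 0..t−1 here, a simplification for illustration) and compares the observed fraction with the formulas; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def inclusion_probability(t, M, k):
    """P(k fixed edges are all in a uniform M-subset of t edges), by enumeration."""
    fixed = set(range(k))
    hits = sum(1 for S in combinations(range(t), M) if fixed <= set(S))
    return Fraction(hits, comb(t, M))


def pi(t, M):
    """Closed form of pi_t for t > M >= 3."""
    return Fraction(M * (M - 1) * (M - 2), t * (t - 1) * (t - 2))


def shared_edge(t, M):
    """Two triangles sharing an edge: 5 distinct edges must be in S."""
    return pi(t, M) * Fraction((M - 3) * (M - 4), (t - 3) * (t - 4))


def disjoint(t, M):
    """Two triangles sharing no edge: 6 distinct edges must be in S."""
    return pi(t, M) * Fraction((M - 3) * (M - 4) * (M - 5),
                               (t - 3) * (t - 4) * (t - 5))


if __name__ == "__main__":
    t, M = 10, 7
    print(inclusion_probability(t, M, 5), shared_edge(t, M))  # 1/12 1/12
    print(inclusion_probability(t, M, 6), disjoint(t, M))     # 1/30 1/30
```

Neither value equals the product of the marginals, which is the non-independence that makes the variance analysis harder than for fixed-probability edge sampling.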
44. What is wrong with TRIÈST-BASE?
Weaknesses:
1) -BASE uses the exact value of ∆GS at time t to estimate ∆G(t);
Over time, ∆GS may decrease, and so would the estimate, ...
while ∆G(t) never decreases: ∆G(t′) ≥ ∆G(t) for any t′ > t!
Solution: never decrease the estimate, i.e., use GS only to identify new triangles;
2) -BASE only counts a triangle if all three edges are in S. . . but if two edges are in
S, and the third one is on the stream right now, we may infer that the triangle exists,
so we should count it;
Solution: first increment the counters, then decide whether to insert the edge into S;
TRIÈST-IMPR solves these weaknesses, resulting in estimates with lower variance;
15 / 26
45. How does TRIÈST-IMPR work?
Memory: M = 8; Time: end of t∗ − 1;
Graph GS = (VS, S): [Figure: sampled graph on vertices 0 to 5]
Triangle counter λ (= ∆GS): 3
16 / 26
50. How does TRIÈST-IMPR work?
Memory: M = 8; Time: t∗;
Edge on the stream: (2, 5);
Action: Weighted increment of λ using the number of triangles closed by (2, 5), with weight (t∗ − 1)(t∗ − 2)/(M(M − 1));
Coin bias: M/t∗; Coin flip outcome: tail;
Actions: Remove an edge in GS chosen at random (e.g., (0, 1)); Add (2, 5) to GS;
Graph GS = (VS, S): [Figure: sampled graph after replacing (0, 1) with (2, 5)]
Triangle counter λ (= ∆GS): 3 + (t∗ − 1)(t∗ − 2)/(M(M − 1))
16 / 26
52. How does TRIÈST-IMPR work?
Memory: M = 8; Time: t∗ + 1;
Edge on the stream: (2, 4);
Action: Weighted increment of λ using the number of triangles closed by (2, 4), with weight t∗(t∗ − 1)/(M(M − 1));
Coin bias: M/(t∗ + 1); Coin flip outcome: head;
Actions: Do nothing;
Graph GS = (VS, S): [Figure: sampled graph, unchanged]
Triangle counter λ (= ∆GS): 3 + (t∗ − 1)(t∗ − 2)/(M(M − 1)) + 2 t∗(t∗ − 1)/(M(M − 1))
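The walkthrough above can be sketched in code. As before, this is an illustrative Python sketch rather than the authors' implementation: names are hypothetical, the stream is assumed to be insertion-only without duplicate edges, and the weight is assumed to be clamped to 1 while t ≤ M (when every edge seen so far is still in S).

```python
import random
from collections import defaultdict


class TriestImpr:
    """Counters are incremented before sampling and never decremented."""

    def __init__(self, M, seed=None):
        if M < 2:
            raise ValueError("M must be at least 2")
        self.M = M
        self.t = 0
        self.sample = []
        self.adj = defaultdict(set)
        self.global_est = 0.0             # lambda in the slides
        self.local_est = defaultdict(float)
        self.rng = random.Random(seed)

    def insert(self, u, v):
        self.t += 1
        t, M = self.t, self.M
        # Weight = inverse probability that both other edges are in S (>= 1).
        weight = max(1.0, (t - 1) * (t - 2) / (M * (M - 1)))
        for w in self.adj[u] & self.adj[v]:   # triangles closed by (u, v)
            self.global_est += weight
            for x in (u, v, w):
                self.local_est[x] += weight
        # Only now decide whether (u, v) enters the sample.
        if t <= M:
            self._add(u, v)
        elif self.rng.random() < M / t:
            i = self.rng.randrange(len(self.sample))
            a, b = self.sample[i]
            self.sample[i] = self.sample[-1]
            self.sample.pop()
            self.adj[a].discard(b)
            self.adj[b].discard(a)            # counters are left untouched
            self._add(u, v)

    def _add(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)
        self.sample.append((u, v))

    def estimate(self):
        return self.global_est
```

Because removals never touch the counters, the estimate is non-decreasing over time, which matches the first fix listed among the weaknesses of TRIÈST-BASE.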
16 / 26
54. How does TRIÈST-IMPR estimate the number of triangles?
TRIÈST-IMPR returns λ as the unbiased estimate of ∆G(t).
Corollary
The probability that a triangle of G(t) is “seen” and causes an increment in λ at time t, when the third edge of the triangle is on the stream, is:
$$\rho_t = \frac{\binom{t-3}{M-2}}{\binom{t-1}{M}} = \frac{M(M-1)}{(t-2)(t-1)}.$$
Since ρt > πt, TRIÈST-IMPR’s estimates have lower variance than TRIÈST-BASE’s.
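The comparison between the two probabilities can be made explicit. Writing πt in its closed form M(M−1)(M−2)/(t(t−1)(t−2)), obtained by expanding the binomial coefficients in its definition, and assuming t > M ≥ 3:

```latex
\[
\frac{\rho_t}{\pi_t}
  = \frac{M(M-1)}{(t-1)(t-2)} \cdot \frac{t(t-1)(t-2)}{M(M-1)(M-2)}
  = \frac{t}{M-2} > 1 .
\]
```

So a triangle is roughly t/M times more likely to be observed by TRIÈST-IMPR than to be fully contained in the sample of TRIÈST-BASE, and the gap widens as the stream grows.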
17 / 26
55. Where are the theorems?
The order of the updates on the streams affects the probability of “seeing” a triangle;
This further complicates the analysis of the variance:
Theorem (Upper bound to the variance)
Then, for any time t > M, we have
$$\mathrm{Var}\left[\tau(t)\right] \le |\Delta(t)| \left( \max\left\{1, \frac{(t-1)(t-2)}{M(M-1)}\right\} - 1 \right) + z(t)\,\frac{t-1-M}{M}.$$
We proceed case-by-case: non-intuitive, tedious, pessimistic, inelegant, and loose;
18 / 26
56. What about fully-dynamic edge streams?
Handling deletions is hard;
TRIÈST-FD’s approach is inspired by random pairing (Gemulla et al., 2009).
TRIÈST-FD tracks all deletions, and updates S by removing deleted edges;
This is not enough;
The resulting S is no longer a uniform sample of the non-deleted edges in G(t);
TRIÈST-FD keeps track of the max. number of edges at any time t;
This makes it possible to compute the bias of the current S due to unpaired deletions.
TRIÈST-FD weights ∆GS by the bias, to obtain the estimate for ∆G(t);
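To give a feel for the random pairing idea, here is a minimal sketch of the sampling step alone, written from the general description of the scheme rather than from the TRIÈST-FD code. It does not maintain triangle counters and does not compute the bias-corrected estimate; the class name, the counter names `d_in` and `d_out`, and the use of the current number of live edges in the reservoir step are all assumptions of this sketch.

```python
import random


class RandomPairingSample:
    """Bounded-size edge sample under insertions and deletions (sampling only)."""

    def __init__(self, M, seed=None):
        self.M = M
        self.sample = set()
        self.population = 0   # edges currently in the graph
        self.d_in = 0         # uncompensated deletions of edges that were in S
        self.d_out = 0        # uncompensated deletions of edges that were not in S
        self.rng = random.Random(seed)

    def insert(self, edge):
        self.population += 1
        if self.d_in + self.d_out == 0:
            # No deletion to compensate: ordinary reservoir step.
            if len(self.sample) < self.M:
                self.sample.add(edge)
            elif self.rng.random() < self.M / self.population:
                victim = self.rng.choice(sorted(self.sample))
                self.sample.remove(victim)
                self.sample.add(edge)
        elif self.rng.random() < self.d_in / (self.d_in + self.d_out):
            # Pair the insertion with an earlier deletion that hit the sample.
            self.sample.add(edge)
            self.d_in -= 1
        else:
            # Pair it with an earlier deletion that missed the sample.
            self.d_out -= 1

    def delete(self, edge):
        self.population -= 1
        if edge in self.sample:
            self.sample.remove(edge)
            self.d_in += 1
        else:
            self.d_out += 1
```

The sample may temporarily hold fewer than M edges while deletions are uncompensated, which is why the estimate has to be corrected rather than scaled by a fixed πt.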
19 / 26
57. Where are the experiments?
Implementation: C++. Available from http://bit.ly/triestkdd
Graphs: Last.fm, Patent-Cit, Patent-Coaut, Twitter, Yahoo!, and others
Goals: evaluate variance, runtime, scalability.
Environment: Brown CS computing cluster (single core, max 4GB RAM)
20 / 26
58. How does TRIÈST-IMPR perform?
Yahoo! graph with 1.2 billion edges (computing exact ∆G is infeasible);
Space M = 1 million (about 0.1% of the graph);
[Figure: global triangle count (0 to 8×10^10) vs. time t (0 to 1.2×10^9); curves: max est., min est., avg est.]
Takeaway: The unbiased estimates are highly concentrated around the mean.
21 / 26
59. How does TRIÈST-IMPR perform compared to other methods?
Last.fm graph (40 million edges, 1 billion triangles);
Space M = 100K (0.25% of the graph);
Compared with MASCOT (KDD’15), which uses edge sampling with fixed probability;
[Figure: global triangle count (0 to 1.4×10^9) vs. time t (0 to 3.5×10^7); curves: ground truth, max est. TRIEST-IMPR, min est. TRIEST-IMPR, max est. MASCOT-I, min est. MASCOT-I]
[Figure: std. dev. of the estimation (0 to 1.2×10^8) vs. time t (0 to 3.5×10^7); curves: std dev TRIEST-IMPR, std dev MASCOT-I]
Takeaway: TRIÈST has much more accurate estimations with lower variance.
22 / 26
60. How does TRIÈST-FD perform?
[Figure (c) Patent (Cit.): global triangle count (0 to 1.6×10^6) vs. time t (0 to 3×10^7); curves: ground truth, avg est. + std dev, avg est. − std dev, avg est.]
[Figure (d) LastFm: global triangle count (0 to 1.2×10^8) vs. time t (0 to 8×10^7); curves: ground truth, avg est. + std dev, avg est. − std dev, avg est.]
[Figure (e) Yahoo! Answers: global triangle count (−5×10^9 to 2.5×10^10) vs. time t (0 to 2.5×10^9); curves: avg est. + std dev, avg est. − std dev, avg est.]
Takeaway:
1) The estimations are very accurate;
2) TRIÈST makes it possible to study the evolution of triangles at a level not available before;
E.g., it is possible to detect patterns and anomalies.
23 / 26
61. How scalable is TRIÈST-FD?
We measured the average time to handle an update on the stream;
[Figure: avg. microseconds per update (log scale, 1 to 10000) for patent-cit, patent-coaut, lastfm, yahoo; series: M=200000, M=500000, M=1000000]
Takeaway: between 2 µs/edge and 3 ms/edge
(i.e., between 500k edges/sec. and 300 edges/sec.)
24 / 26
62. What didn’t I tell you?
The Goods:
Concentration results (the one for TRIÈST-BASE is very elegant;)
Theorems for TRIÈST-FD;
TRIÈST for multigraphs (various defs. of triangle counts);
Many more experiments and comparisons with state-of-the-art;
The Bads:
Results on variance are upper bounds, often loose;
Some of the concentration bounds are quite naïve (Chebyshev Ineq.);
The bounds should not depend on the order of the edges on the stream;
The Betters:
We are exploring the use of cube sampling and balanced sampling to solve the issues.
25 / 26
63. What did I talk about?
TRIÈST: three algorithms for triangle counts estimation in fully-dynamic edge streams;
• Uses a fixed, constant amount of memory;
• Is intrinsically incremental;
• Scales to billion-edge graphs and handles tens of thousands of edges per second;
• Uses reservoir sampling in a smart way;
• Gives unbiased, low-variance, highly-concentrated estimates;
Complex analysis due to non-independent sampling, but worth the effort!
Thank you!
EML: matteo@twosigma.com TWTR: @teorionda
WWW: http://matteo.rionda.to
26 / 26