SPIRE 2019: 26th International Symposium on String Processing and Information Retrieval
Fast Identification of Heavy Hitters
by Cached and Packed Group Testing
Hiroki Arimura
Hokkaido University
Japan
Takeaki Uno
NII
Japan
Yusaku Kaneta
Rakuten Mobile, Inc.
Japan
2
The φ-Heavy Hitters Problem
[Cormode, Muthukrishnan, ACM TODS 2005]
§Tracking φ-heavy hitters in a dynamic multiset S of elements.
• φ-heavy hitter: element in universe U = [0..n) with its frequency more than φ|S|.
• Challenges: large alphabet, output sensitivity, high-speed operation
The (ε-approximate) φ-Heavy Hitters Problem in the turnstile model
Input: data stream of pairs (x_i, Δ_i) ∈ U × {±1} and real numbers ε, φ in [0, 1).
Task: Maintain frequency information of elements to support
• QUERY(): return a set R ⊆ U such that R includes (1) all φ-heavy hitters and (2) no element whose frequency is at most (φ − ε)N, where N = Σ_i Δ_i.
• INSERT(x)/DELETE(x): increment/decrement the frequency N_x of element x.
Model of computation: The standard w-bit word RAM
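As a reference point for the problem statement above, here is a minimal Python sketch of the exact (non-sketching) answer. It stores one counter per distinct element, which is precisely the cost the data structures in this talk avoid. The function name `exact_heavy_hitters` and the sample stream are illustrative assumptions, not taken from the slides.

```python
from collections import Counter

def exact_heavy_hitters(stream, phi):
    """Return all elements whose net frequency exceeds phi * N.

    stream: iterable of (x, delta) pairs with delta in {+1, -1}
    (turnstile model); N is the sum of all deltas.
    """
    freq = Counter()
    total = 0
    for x, delta in stream:
        freq[x] += delta   # INSERT(x) when delta = +1, DELETE(x) when delta = -1
        total += delta
    return {x for x, c in freq.items() if c > phi * total}

# Hypothetical stream: 7 ends with frequency 3, 3 with 0, 5 with 1, so N = 4.
stream = [(7, +1), (7, +1), (3, +1), (7, +1), (3, -1), (5, +1)]
print(exact_heavy_hitters(stream, phi=0.5))  # {7}
```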
3
Large Universes in Mobile Networks
The operation time of existing practical methods depends on log |U| = log n (large in practice!):
• Combinatorial Group Testing [Cormode, Muthukrishnan, ACM TODS 2005]
• Hierarchical Count-Min Sketch [Cormode, Muthukrishnan, LATIN 2005]

Examples of universe U                                  log |U| = log n
                                                        IPv4    IPv6
IP addresses                                            32      128
Pairs of IP addresses                                   64      256
Five tuples (source/destination IP/Port + Protocol)     104     296
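The key widths in the table follow from simple arithmetic. The short Python check below assumes 16-bit ports and an 8-bit protocol field, which are the usual header sizes but are not stated on the slide.

```python
# Assumed field widths in bits (not stated on the slide).
PORT_BITS = 16
PROTOCOL_BITS = 8

def five_tuple_bits(ip_bits):
    # source IP + destination IP + source port + destination port + protocol
    return 2 * ip_bits + 2 * PORT_BITS + PROTOCOL_BITS

for name, ip_bits in (("IPv4", 32), ("IPv6", 128)):
    # columns: single address, pair of addresses, five tuple
    print(name, ip_bits, 2 * ip_bits, five_tuple_bits(ip_bits))
```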
Q. Can we eliminate dependency on log n from operation time?
4
Main results
Key technique: Packed Bidirectional Counter Arrays

         This study        CGT                    CGT(b)
Update   O(r) amortized    O(log(n) r)            O(log_b(n) r)
Query    O(r^2/ε)          O((log(n) + r) r/ε)    O((b log_b(n) + r) r/ε)
Space    O(log(n) r/ε)     O(log(n) r/ε)          O(b log_b(n) r/ε)

CGT: Combinatorial Group Testing [Cormode, Muthukrishnan, ACM TODS 2005]
n: size of universe; δ: failure probability; r = log(1/(δφ)); b: any integer in [2..n]
Our paper also proposes a "cached candidate technique" for improving CGT for arbitrary updates.
Model of computation: The standard w-bit word RAM
5
Related Work: Packed Counters
Maintaining an array of m = O(w) counters on the w-bit word RAM.
§Textbook solution for a single counter [Mehlhorn & Sanders, 2008]:
• Ops = inc/test or dec/test: O(1) space; O(m) amortized time.
§Nested counters [Grabowski & Fredriksson, IPL 2008]:
• Ops = inc/test: O(m) space; O(1) amortized time
§Trit counters [Bille & Thorup, SODA 2010]
• Ops = dec/reset/test: O(m) space; O(1) amortized time.
§Bidirectional counters [This talk]
• Ops = inc/dec/test: O(m) space; O(1) time for inc/dec (amortized) and test.
• Naïve bidirectional counters: O(m) space; O(m) time for all operations.
"test": ispositive (C[i] > 0), iszero (C[i] = 0), or isnegative (C[i] < 0)
6
How to improve CGT using
Packed Bidirectional Counter Arrays?
7
CGT: A Practical Data Structure
§Reports all φ-heavy hitters with probability at least 1 – δ for a specified δ.
§Idea: Random partition of U into d = 2/ε subsets via each hash function h_i.
• A φ-heavy hitter x can be identified from each C[i, h_i(x), 0..m] with probability at least 1/2.
• Setting r = log(1/(δφ)) yields the desired failure probability δ of missing any φ-heavy hitter.
Combinatorial Group Testing [CM, ACM TODS 2005]
1. Three-dimensional counter array: C[1..r, 1..d, 0..m]
2. A set of universal hash functions: h_1, ..., h_r: U → [1..d]
Parameters: r = log(1/(δφ)), d = 2/ε, m = 1 + lg n
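To make the parameter choices concrete, the following Python sketch computes the dimensions of the counter array from n, φ, ε and δ. Rounding up to integers and using base-2 logarithms are assumptions made here for illustration. The helper name `cgt_dimensions` is hypothetical.

```python
import math

def cgt_dimensions(n, phi, eps, delta):
    """Return (r, d, m) for the counter array C[1..r, 1..d, 0..m]."""
    r = math.ceil(math.log2(1.0 / (delta * phi)))  # number of hash functions
    d = math.ceil(2.0 / eps)                       # groups per hash function
    m = 1 + (n - 1).bit_length()                   # 1 + ceil(lg n), exact for integers
    return r, d, m

r, d, m = cgt_dimensions(n=2**32, phi=0.25, eps=0.25, delta=0.25)
print(r, d, m)          # 4 8 33
print(r * d * (m + 1))  # total number of counters, index 0 included: 1088
```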
8
CGT: A Practical Data Structure
CGT reduces both QUERY and UPDATE to three basic operations on a bidirectional counter array C[1..m]:
INCREMENT(C, x): C[i] = C[i] + bit(x, i) for every i in [1..m].
DECREMENT(C, x): C[i] = C[i] − bit(x, i) for every i in [1..m].
ISPOSITIVE(C): Return z = Σ_i [C[i] > 0] · 2^i, where [X] is 1 (resp. 0) if X is true (resp. false).
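The three basic operations can be written naïvely in a few lines of Python. This sketch takes O(m) time per operation, which is the baseline the packed counters are meant to beat. It uses 0-based bit positions, an indexing choice made here rather than on the slide.

```python
def increment(C, x):
    """C[i] += bit(x, i) for every counter i (O(m) time)."""
    for i in range(len(C)):
        C[i] += (x >> i) & 1

def decrement(C, x):
    """C[i] -= bit(x, i) for every counter i (O(m) time)."""
    for i in range(len(C)):
        C[i] -= (x >> i) & 1

def ispositive(C):
    """Return the integer whose i-th bit is 1 exactly when C[i] > 0."""
    return sum(1 << i for i, c in enumerate(C) if c > 0)

C = [0] * 4
increment(C, 0b1010)
increment(C, 0b0110)
decrement(C, 0b0100)
print(C, bin(ispositive(C)))  # [0, 2, 0, 1] 0b1010
```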
9
CGT: A Practical Data Structure
Combinatorial Group Testing [CM, ACM TODS 2005]
UPDATE(x, Δ): O(log(n)r) time
1. Add Δ to N
2. if Δ > 0 then: (y, z) ← (x, ~x) else: (y, z) ← (~x, x)
3. for i in [1..r] do:
4.   Add Δ to C[i, h_i(x), 0]
5.   INCREMENT(C[i, h_i(x), 1..m], y)
6.   DECREMENT(C[i, h_i(x), 1..m], z)
QUERY(): O((log(n)+r)r/ε) time
1. for i in [1..r] do:
2.   for j in [1..d] do:
3.     // Here C[i, j, k] holds the value 2C[i, j, k] − C[i, j, 0] of the original CGT counters
4.     x ← ISPOSITIVE(C[i, j, 1..m])
5.     if min_{i'} C[i', h_{i'}(x), 0] > φN then:
6.       report x as a φ-heavy hitter
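The UPDATE and QUERY procedures above can be put together as a small runnable sketch with naïve O(m)-time counters. The class name `NaiveCGT`, the hash family (a·x + b mod p mod d with the prime p = 2^61 − 1), the fixed seed and the test stream are all assumptions chosen for illustration. The slides only require universal hash functions.

```python
import random

class NaiveCGT:
    """Combinatorial group testing with naive bidirectional counters."""

    def __init__(self, bits, r, d, seed=0):
        self.bits, self.r, self.d = bits, r, d
        self.mask = (1 << bits) - 1
        self.N = 0
        self.p = (1 << 61) - 1  # assumed prime modulus; needs bits <= 60
        rng = random.Random(seed)
        self.ab = [(rng.randrange(1, self.p), rng.randrange(self.p))
                   for _ in range(r)]
        self.total = [[0] * d for _ in range(r)]                  # C[i, j, 0]
        self.C = [[[0] * bits for _ in range(d)] for _ in range(r)]

    def _h(self, i, x):
        a, b = self.ab[i]
        return ((a * x + b) % self.p) % self.d

    def update(self, x, delta):
        """delta = +1 inserts x, delta = -1 deletes x."""
        self.N += delta
        not_x = ~x & self.mask
        up, down = (x, not_x) if delta > 0 else (not_x, x)
        for i in range(self.r):
            j = self._h(i, x)          # always hash the original x
            self.total[i][j] += delta
            c = self.C[i][j]
            for k in range(self.bits):  # INCREMENT by up, DECREMENT by down
                c[k] += ((up >> k) & 1) - ((down >> k) & 1)

    def query(self, phi):
        out = set()
        for i in range(self.r):
            for j in range(self.d):
                # ISPOSITIVE: bit k is set when ones outnumber zeros at bit k
                x = sum(1 << k for k, v in enumerate(self.C[i][j]) if v > 0)
                if min(self.total[t][self._h(t, x)]
                       for t in range(self.r)) > phi * self.N:
                    out.add(x)
        return out

sketch = NaiveCGT(bits=16, r=4, d=8)
for _ in range(600):
    sketch.update(42, +1)
for x in range(1000, 1400):
    sketch.update(x, +1)
for _ in range(50):
    sketch.update(7, +1)
for _ in range(50):
    sketch.update(7, -1)
result = sketch.query(phi=0.5)
print(42 in result)  # True: 42 outweighs all other elements in every group
```

In this stream the element 42 has frequency 600 out of N = 1000, so it holds a strict majority in whatever group it hashes to and is decoded correctly in every row regardless of the hash seed. The sketch may still report a few extra candidates.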
10
CGT: A Practical Data Structure
Combinatorial Group Testing [CM, ACM TODS 2005]
Q. Can we implement
INCREMENT/DECREMENT/ISPOSITIVE in o(m) time?
11
Packed Bidirectional Counter Arrays
§Basic idea: Exploiting word-level parallelism of the w-bit word RAM
• Redundant binary representation of C[1..m] using digits {0, ±1, ±2}.
• The corresponding k-th digits of C[1..m] are packed into O(1) words.
• The packed k-th digits of C[1..m] are updated in O(1) time, once every 2^k updates.
INCREMENT(C, x): O(1) amortized time
DECREMENT(C, x): O(1) amortized time
ISPOSITIVE(C): O(1) time
using O(m) space (compact!) for m = O(w)
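The packing idea can be illustrated with a deliberately simplified Python sketch: m unsigned counters stored as bit planes, where plane k holds the k-th binary digit of every counter, so one word operation touches all m counters at once. This is not the redundant {0, ±1, ±2} representation with fixed-schedule carries described in the talk. It handles increments only and uses ordinary ripple carry, whose worst case touches every plane. The class name `BitSlicedCounters` is hypothetical, and a Python integer stands in for a w-bit word.

```python
class BitSlicedCounters:
    """m unsigned counters; plane[k] packs bit k of every counter."""

    def __init__(self, m, planes=16):
        self.m = m
        self.plane = [0] * planes  # counters must stay below 2**planes

    def increment(self, x):
        """Add bit(x, i) to counter i, for all i simultaneously."""
        carry = x & ((1 << self.m) - 1)
        for k in range(len(self.plane)):
            if carry == 0:
                break
            # half adder applied to all m counters with two word operations
            self.plane[k], carry = self.plane[k] ^ carry, self.plane[k] & carry

    def value(self, i):
        """Read counter i by gathering its bit from every plane."""
        return sum(((p >> i) & 1) << k for k, p in enumerate(self.plane))

counters = BitSlicedCounters(m=8)
for x in (0b101, 0b101, 0b101, 0b001):
    counters.increment(x)
print([counters.value(i) for i in range(3)])  # [4, 0, 3]
```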
12
Packed Bidirectional Counter Arrays
§Naïve bidirectional counters C[1], ..., C[i], ..., C[m]: m × w = O(w^2) bits; O(w) time to access.
§Packed bidirectional counter array (counters C[1..m], w digits each): m × O(1) = O(w) bits; O(1) time to access.
13
Packed Bidirectional Counter Arrays
§Packed redundant binary counters using digits {0, ±1, ±2}
• Never overflow.
§Fixed-schedule carry propagation in O(1) amortized time [GF, IPL 2008][BT, SODA 2010]
§Packed orders of magnitude for detecting sign inversion
§The t-th update: 1. Propagate carry bits. 2. Fix orders of magnitude.
• level(t) = max{i | t mod 2^i = 0}; for t = 1, 2, ..., 8 this gives level(t) = 0, 1, 0, 2, 0, 1, 0, 3.
• The k-th digits are updated once every 2^k updates.
[Figure: schedule of digit updates, plotting level(t) against the update count t.]
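The fixed schedule can be checked numerically. The sketch below reads level(t) as the number of trailing zero bits of t (the largest i with 2^i dividing t), which matches the statement that the k-th digits are updated once every 2^k updates. It also assumes that the t-th update touches digit levels 0 through level(t). Both readings are interpretations of the slide rather than quotations from it.

```python
def level(t):
    """Largest i such that 2**i divides t, for t >= 1."""
    return (t & -t).bit_length() - 1

print([level(t) for t in range(1, 9)])  # [0, 1, 0, 2, 0, 1, 0, 3]

# Digit levels touched over T updates, assuming update t touches levels 0..level(t).
T = 1024
touched = sum(level(t) + 1 for t in range(1, T + 1))
print(touched, touched < 2 * T)  # 2047 True
```

Because level k is reached only once every 2^k updates, the total work over T updates is below 2T packed-digit operations, which is where the O(1) amortized bound comes from under these assumptions.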
14
Lemma (Packed Bidirectional Counters)
There exists an O(m)-space data structure for representing
an array C[1..m] of m bidirectional counters supporting
§INCREMENT/DECREMENT in O(1) amortized time
§ISPOSITIVE in O(1) time
on the standard w-bit word RAM with w ≥ m.
15
Theorem
§Plugging packed bidirectional counters into CGT, we obtain:
There exists an O(lg(n)r/ε)-space randomized data structure
for solving the ε-approximate φ-heavy hitters problem with
§INSERT/DELETE in O(r) amortized time
§QUERY in O(r^2/ε) time with probability at least 1 − δ
on the standard w-bit word RAM with w ≥ lg n. Here, n is
the universe size, δ is a failure probability, and r = lg(1/(δφ)).
16
Experiments: Setup
§Data: 14 datasets of 10 M integers
• Universe: [0, 2^64).
• Zipf distribution of skewness z in { 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0 }.
• Threshold φ (= ε) in { 0.0001, 0.0005, 0.001, 0.005, 0.01}.
§Methods:
• Ours [This work]: Our proposed method with #rounds r = 4.
• CGT(b) [CM, TODS 2005]: Combinatorial Group Testing with branching factor b in { 2, 16 }.
• CMH(b) [CM, LATIN 2005]: Hierarchical Count-Min Sketch with branching factor b in { 2, 16 }.
CGT(b) and CMH(b) were configured as in [Cormode, Hadjieleftheriou, PVLDB 2008].
§Hardware:
• MacBook Pro with Intel® Core™ i7-8559 (2.7GHz) and 16GB main memory.
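For readers who want to reproduce a workload of this shape, here is a hedged Python sketch of a Zipf-distributed stream over 64-bit keys. The slides do not say how the datasets were generated, so the finite support size, the random key assignment and the function name `zipf_stream` are all assumptions.

```python
import random
from itertools import accumulate

def zipf_stream(length, support, z, seed=0):
    """Draw `length` keys; the key of rank k has probability proportional to 1 / k**z."""
    rng = random.Random(seed)
    cum = list(accumulate(1.0 / (rank ** z) for rank in range(1, support + 1)))
    keys = [rng.getrandbits(64) for _ in range(support)]  # keys in [0, 2**64)
    ranks = rng.choices(range(support), cum_weights=cum, k=length)
    return [keys[r] for r in ranks]

data = zipf_stream(length=10_000, support=1_000, z=1.2)
print(len(data), len(set(data)) <= 1_000)  # 10000 True
```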
17
Experiments: Precision
§Ours achieved competitive precisions for skewness z ≥ 1.4.
• Ours output more false positives than others for skewness z < 1.4.
• For z < 1.4, ours should have used larger ε to suppress false positives.
• Recalls of all methods were 100%.
[Figure: precision (%) versus skewness z (0.8 to 2.0), one panel per threshold φ in { 0.0001, 0.0005, 0.001, 0.005, 0.01 }. Methods: Ours, CGT(2), CGT(16) [CM, TODS 2005], CMH(2), CMH(16) [CM, LATIN 2005].]
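Precision and recall as used on this slide can be computed from the reported set and the true set of heavy hitters. The short Python sketch below uses the standard definitions. The function name and the convention of returning 1.0 for an empty set are choices made here.

```python
def precision_recall(reported, truth):
    """Precision = TP / |reported|, recall = TP / |truth|."""
    reported, truth = set(reported), set(truth)
    tp = len(reported & truth)
    precision = tp / len(reported) if reported else 1.0
    recall = tp / len(truth) if truth else 1.0
    return precision, recall

# Two true heavy hitters, both reported, plus two false positives.
print(precision_recall(reported={1, 2, 3, 4}, truth={1, 2}))  # (0.5, 1.0)
```

A recall of 100% with lower precision, as reported for small skewness, corresponds to the case shown: every true heavy hitter is found, but extra elements are reported too.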
18
Experiments: Update time
§Ours achieved update throughput competitive with CMH(16).
• CMH(16) achieved the best and most stable update throughput.
• CGT(16) depended heavily on φ in practice, even though it does not in theory.
• CGT(2) and CMH(2) were not competitive.
[Figure: update throughput (updates/msec, 0 to 30000) versus skewness z (0.8 to 2.0), one panel per threshold φ in { 0.0001, 0.0005, 0.001, 0.005, 0.01 }. Methods: Ours, CGT(2), CGT(16) [CM, TODS 2005], CMH(2), CMH(16) [CM, LATIN 2005].]
Note: Median of 5 measured times is reported.
19
Experiments: Query time
§Ours achieved the best query throughput except for φ = 0.0001.
• Note: ε = φ and r = O(1) in our experiments.
• CGT family (including ours) must examine Θ(1/φ) candidates of heavy hitters.
• CMH family is output sensitive: it is fast if # of heavy hitters is less than 1/φ.
[Figure: query throughput (queries/msec) versus skewness z (0.8 to 2.0), one panel per threshold φ in { 0.0001, 0.0005, 0.001, 0.005, 0.01 }, each with its own vertical scale. Methods: Ours, CGT(2), CGT(16) [CM, TODS 2005], CMH(2), CMH(16) [CM, LATIN 2005].]
20
Conclusion
§The φ-Heavy Hitters Problem in the strict turnstile model.
We improved CGT [CM, ACM TODS 2005] in
• Update time: from O(log(n)r) to amortized O(r)
• Query time: from O((log(n)+r)r/ε) to O(r^2/ε)
using the same O(log(n)r/ε) space for a universe of size n and r = log(1/(δφ)).
§Packed Bidirectional Counter Array:
• Extension of [GF, IPL 2008] and [BT, SODA 2010] to bidirectional counters.
• Ops = inc/dec/test: O(1) amortized inc/dec and O(1) test in compact space.
§Future work
• Extension of our method to arbitrary updates.
Fast Identification of Heavy Hitters by Cached and Packed Group Testing

Mais conteúdo relacionado

Mais procurados

19 algorithms-and-complexity-110627100203-phpapp02
19 algorithms-and-complexity-110627100203-phpapp0219 algorithms-and-complexity-110627100203-phpapp02
19 algorithms-and-complexity-110627100203-phpapp02Muhammad Aslam
 
Tpr star tree
Tpr star treeTpr star tree
Tpr star treeWin Yu
 
A Note on Latent LSTM Allocation
A Note on Latent LSTM AllocationA Note on Latent LSTM Allocation
A Note on Latent LSTM AllocationTomonari Masada
 
Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities Gael Varoquaux
 
NTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANsNTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANsMark Chang
 
Speaker Diarization
Speaker DiarizationSpeaker Diarization
Speaker DiarizationHONGJOO LEE
 
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian ProcessesAccelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian ProcessesMatt Moores
 
ZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their ApplicationsZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their ApplicationsAlex Pruden
 
Digit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDigit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDing Li
 
Convolutional Neural Network
Convolutional Neural NetworkConvolutional Neural Network
Convolutional Neural NetworkJun Young Park
 
Computational Linguistics week 5
Computational Linguistics  week 5Computational Linguistics  week 5
Computational Linguistics week 5Mark Chang
 
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...Alex Pruden
 
CPM2013-tabei201306
CPM2013-tabei201306CPM2013-tabei201306
CPM2013-tabei201306Yasuo Tabei
 
Homomorphic Encryption
Homomorphic EncryptionHomomorphic Encryption
Homomorphic EncryptionVictor Pereira
 
Java program-to-calculate-area-and-circumference-of-circle
Java program-to-calculate-area-and-circumference-of-circleJava program-to-calculate-area-and-circumference-of-circle
Java program-to-calculate-area-and-circumference-of-circleUniversity of Essex
 

Mais procurados (20)

Profiling in Python
Profiling in PythonProfiling in Python
Profiling in Python
 
19 algorithms-and-complexity-110627100203-phpapp02
19 algorithms-and-complexity-110627100203-phpapp0219 algorithms-and-complexity-110627100203-phpapp02
19 algorithms-and-complexity-110627100203-phpapp02
 
Tpr star tree
Tpr star treeTpr star tree
Tpr star tree
 
A Note on TopicRNN
A Note on TopicRNNA Note on TopicRNN
A Note on TopicRNN
 
A Note on Latent LSTM Allocation
A Note on Latent LSTM AllocationA Note on Latent LSTM Allocation
A Note on Latent LSTM Allocation
 
Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities
 
NTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANsNTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANs
 
Speaker Diarization
Speaker DiarizationSpeaker Diarization
Speaker Diarization
 
Recursive algorithms
Recursive algorithmsRecursive algorithms
Recursive algorithms
 
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian ProcessesAccelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
 
AML
AMLAML
AML
 
Ch8
Ch8Ch8
Ch8
 
ZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their ApplicationsZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their Applications
 
Digit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDigit recognizer by convolutional neural network
Digit recognizer by convolutional neural network
 
Convolutional Neural Network
Convolutional Neural NetworkConvolutional Neural Network
Convolutional Neural Network
 
Computational Linguistics week 5
Computational Linguistics  week 5Computational Linguistics  week 5
Computational Linguistics week 5
 
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
 
CPM2013-tabei201306
CPM2013-tabei201306CPM2013-tabei201306
CPM2013-tabei201306
 
Homomorphic Encryption
Homomorphic EncryptionHomomorphic Encryption
Homomorphic Encryption
 
Java program-to-calculate-area-and-circumference-of-circle
Java program-to-calculate-area-and-circumference-of-circleJava program-to-calculate-area-and-circumference-of-circle
Java program-to-calculate-area-and-circumference-of-circle
 

Semelhante a Fast Identification of Heavy Hitters by Cached and Packed Group Testing

SIAM - Minisymposium on Guaranteed numerical algorithms
SIAM - Minisymposium on Guaranteed numerical algorithmsSIAM - Minisymposium on Guaranteed numerical algorithms
SIAM - Minisymposium on Guaranteed numerical algorithmsJagadeeswaran Rathinavel
 
ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to under...
ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to under...ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to under...
ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to under...garbervetsky
 
Mm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithmsMm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithmsEellekwameowusu
 
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...Alexander Litvinenko
 
Time-Series Analysis on Multiperiodic Conditional Correlation by Sparse Covar...
Time-Series Analysis on Multiperiodic Conditional Correlation by Sparse Covar...Time-Series Analysis on Multiperiodic Conditional Correlation by Sparse Covar...
Time-Series Analysis on Multiperiodic Conditional Correlation by Sparse Covar...Michael Lie
 
Digital Control Systems Jntu Model Paper{Www.Studentyogi.Com}
Digital Control Systems Jntu Model Paper{Www.Studentyogi.Com}Digital Control Systems Jntu Model Paper{Www.Studentyogi.Com}
Digital Control Systems Jntu Model Paper{Www.Studentyogi.Com}guest3f9c6b
 
D I G I T A L C O N T R O L S Y S T E M S J N T U M O D E L P A P E R{Www
D I G I T A L  C O N T R O L  S Y S T E M S  J N T U  M O D E L  P A P E R{WwwD I G I T A L  C O N T R O L  S Y S T E M S  J N T U  M O D E L  P A P E R{Www
D I G I T A L C O N T R O L S Y S T E M S J N T U M O D E L P A P E R{Wwwguest3f9c6b
 
FPGA based BCH Decoder
FPGA based BCH DecoderFPGA based BCH Decoder
FPGA based BCH Decoderijsrd.com
 
Low-rank tensor approximation (Introduction)
Low-rank tensor approximation (Introduction)Low-rank tensor approximation (Introduction)
Low-rank tensor approximation (Introduction)Alexander Litvinenko
 
Data Structure: Algorithm and analysis
Data Structure: Algorithm and analysisData Structure: Algorithm and analysis
Data Structure: Algorithm and analysisDr. Rajdeep Chatterjee
 
Theoretical and Practical Bounds on the Initial Value of Skew-Compensated Cl...
Theoretical and Practical Bounds on the Initial Value of  Skew-Compensated Cl...Theoretical and Practical Bounds on the Initial Value of  Skew-Compensated Cl...
Theoretical and Practical Bounds on the Initial Value of Skew-Compensated Cl...Xi'an Jiaotong-Liverpool University
 
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...Chiheb Ben Hammouda
 
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by OraclesEfficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by OraclesVissarion Fisikopoulos
 
An Efficient Convex Hull Algorithm for a Planer Set of Points
An Efficient Convex Hull Algorithm for a Planer Set of PointsAn Efficient Convex Hull Algorithm for a Planer Set of Points
An Efficient Convex Hull Algorithm for a Planer Set of PointsKasun Ranga Wijeweera
 

Semelhante a Fast Identification of Heavy Hitters by Cached and Packed Group Testing (20)

SIAM - Minisymposium on Guaranteed numerical algorithms
SIAM - Minisymposium on Guaranteed numerical algorithmsSIAM - Minisymposium on Guaranteed numerical algorithms
SIAM - Minisymposium on Guaranteed numerical algorithms
 
ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to under...
ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to under...ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to under...
ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to under...
 
Mm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithmsMm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithms
 
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
Possible applications of low-rank tensors in statistics and UQ (my talk in Bo...
 
Time-Series Analysis on Multiperiodic Conditional Correlation by Sparse Covar...
Time-Series Analysis on Multiperiodic Conditional Correlation by Sparse Covar...Time-Series Analysis on Multiperiodic Conditional Correlation by Sparse Covar...
Time-Series Analysis on Multiperiodic Conditional Correlation by Sparse Covar...
 
Digitalcontrolsystems
DigitalcontrolsystemsDigitalcontrolsystems
Digitalcontrolsystems
 
Digital Control Systems Jntu Model Paper{Www.Studentyogi.Com}
Digital Control Systems Jntu Model Paper{Www.Studentyogi.Com}Digital Control Systems Jntu Model Paper{Www.Studentyogi.Com}
Digital Control Systems Jntu Model Paper{Www.Studentyogi.Com}
 
D I G I T A L C O N T R O L S Y S T E M S J N T U M O D E L P A P E R{Www
D I G I T A L  C O N T R O L  S Y S T E M S  J N T U  M O D E L  P A P E R{WwwD I G I T A L  C O N T R O L  S Y S T E M S  J N T U  M O D E L  P A P E R{Www
D I G I T A L C O N T R O L S Y S T E M S J N T U M O D E L P A P E R{Www
 
FPGA based BCH Decoder
FPGA based BCH DecoderFPGA based BCH Decoder
FPGA based BCH Decoder
 
chapter1.ppt
chapter1.pptchapter1.ppt
chapter1.ppt
 
chapter1.ppt
chapter1.pptchapter1.ppt
chapter1.ppt
 
Low-rank tensor approximation (Introduction)
Low-rank tensor approximation (Introduction)Low-rank tensor approximation (Introduction)
Low-rank tensor approximation (Introduction)
 
Data Structure: Algorithm and analysis
Data Structure: Algorithm and analysisData Structure: Algorithm and analysis
Data Structure: Algorithm and analysis
 
Theoretical and Practical Bounds on the Initial Value of Skew-Compensated Cl...
Theoretical and Practical Bounds on the Initial Value of  Skew-Compensated Cl...Theoretical and Practical Bounds on the Initial Value of  Skew-Compensated Cl...
Theoretical and Practical Bounds on the Initial Value of Skew-Compensated Cl...
 
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
 
Gate-Cs 2006
Gate-Cs 2006Gate-Cs 2006
Gate-Cs 2006
 
R Language Introduction
R Language IntroductionR Language Introduction
R Language Introduction
 
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by OraclesEfficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
Efficient Volume and Edge-Skeleton Computation for Polytopes Given by Oracles
 
An Efficient Convex Hull Algorithm for a Planer Set of Points
An Efficient Convex Hull Algorithm for a Planer Set of PointsAn Efficient Convex Hull Algorithm for a Planer Set of Points
An Efficient Convex Hull Algorithm for a Planer Set of Points
 
Time complexity.ppt
Time complexity.pptTime complexity.ppt
Time complexity.ppt
 

Mais de Rakuten Group, Inc.

コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話Rakuten Group, Inc.
 
楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のりRakuten Group, Inc.
 
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...Rakuten Group, Inc.
 
DataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組みDataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組みRakuten Group, Inc.
 
大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開Rakuten Group, Inc.
 
楽天における大規模データベースの運用
楽天における大規模データベースの運用楽天における大規模データベースの運用
楽天における大規模データベースの運用Rakuten Group, Inc.
 
楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャー楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャーRakuten Group, Inc.
 
楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割Rakuten Group, Inc.
 
Rakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdfRakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdfRakuten Group, Inc.
 
The Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdfThe Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdfRakuten Group, Inc.
 
Supporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdfSupporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdfRakuten Group, Inc.
 
Making Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdfMaking Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdfRakuten Group, Inc.
 
How We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdfHow We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdfRakuten Group, Inc.
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoRakuten Group, Inc.
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoRakuten Group, Inc.
 
Introduction of GORA API Group technology
Introduction of GORA API Group technologyIntroduction of GORA API Group technology
Introduction of GORA API Group technologyRakuten Group, Inc.
 
100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情Rakuten Group, Inc.
 
社内エンジニアを支えるテクニカルアカウントマネージャー
社内エンジニアを支えるテクニカルアカウントマネージャー社内エンジニアを支えるテクニカルアカウントマネージャー
社内エンジニアを支えるテクニカルアカウントマネージャーRakuten Group, Inc.
 

Mais de Rakuten Group, Inc. (20)

コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
 
楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり
 
What Makes Software Green?
What Makes Software Green?What Makes Software Green?
What Makes Software Green?
 
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
 
DataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組みDataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組み
 
大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開
 
楽天における大規模データベースの運用
楽天における大規模データベースの運用楽天における大規模データベースの運用
楽天における大規模データベースの運用
 
楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャー楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャー
 
楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割
 
Rakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdfRakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdf
 
The Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdfThe Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdf
 
Supporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdfSupporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdf
 
Making Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdfMaking Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdf
 
How We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdfHow We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdf
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech info
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech info
 
OWASPTop10_Introduction
OWASPTop10_IntroductionOWASPTop10_Introduction
OWASPTop10_Introduction
 
Introduction of GORA API Group technology
Introduction of GORA API Group technologyIntroduction of GORA API Group technology
Introduction of GORA API Group technology
 
100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情
 
社内エンジニアを支えるテクニカルアカウントマネージャー
社内エンジニアを支えるテクニカルアカウントマネージャー社内エンジニアを支えるテクニカルアカウントマネージャー
社内エンジニアを支えるテクニカルアカウントマネージャー
 

Fast Identification of Heavy Hitters by Cached and Packed Group Testing

  • 1. SPIRE 2019: 26th International Symposium on String Processing and Information Retrieval
      Fast Identification of Heavy Hitters by Cached and Packed Group Testing
      Hiroki Arimura (Hokkaido University, Japan), Takeaki Uno (NII, Japan), Yusaku Kaneta (Rakuten Mobile, Inc., Japan)
  • 2. The φ-Heavy Hitters Problem [Cormode, Muthukrishnan, ACM TODS 2005]
      § Tracking φ-heavy hitters in a dynamic multiset S of elements.
        • φ-heavy hitter: an element of the universe U = [0..n) whose frequency exceeds φ|S|.
        • Challenges: large alphabet, output sensitivity, high-speed operation.
      The (ε-approximate) φ-Heavy Hitters Problem in the turnstile model
      Input: a data stream of pairs (x_i, Δ_i) ∈ U × {±1} and real numbers ε, φ in [0, 1).
      Task: maintain frequency information of elements to support
        • QUERY(): return a set R ⊆ U such that R includes (1) all φ-heavy hitters and (2) no element whose frequency is at most (φ − ε)N, where N = Σ_i Δ_i.
        • INSERT(x)/DELETE(x): increment/decrement the frequency N_x of element x.
      Model of computation: the standard w-bit word RAM.
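As a point of reference for the problem statement on slide 2, the following is a minimal exact (non-sketching) baseline in Python. It is not the method of the talk: it stores one counter per distinct element, which is exactly what the sketch-based methods avoid. The function names `replay`, `exact_heavy_hitters`, and `is_valid_answer` are hypothetical and introduced only for illustration.

```python
from collections import Counter

def replay(stream):
    """Replay a turnstile stream of (x, delta) pairs; return (frequencies, N)."""
    freq = Counter()
    total = 0
    for x, delta in stream:
        freq[x] += delta      # INSERT(x) is delta = +1, DELETE(x) is delta = -1
        total += delta        # N is the sum of all deltas
    return freq, total

def exact_heavy_hitters(stream, phi):
    """All elements whose frequency exceeds phi * N."""
    freq, total = replay(stream)
    return {x for x, c in freq.items() if c > phi * total}

def is_valid_answer(reported, stream, phi, eps):
    """Check the eps-approximate guarantee: every phi-heavy hitter is reported,
    and no reported element has frequency at most (phi - eps) * N."""
    freq, total = replay(stream)
    must_report = {x for x, c in freq.items() if c > phi * total}
    return must_report <= reported and all(
        freq[x] > (phi - eps) * total for x in reported
    )

# Small example: element 1 ends with frequency 6 out of N = 8.
stream = [(1, +1)] * 6 + [(2, +1)] * 3 + [(3, +1)] + [(2, -1)] * 2
print(exact_heavy_hitters(stream, 0.5))
```

A checker of this kind can serve as ground truth when testing an approximate structure on small inputs.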
  • 3. Large Universes in Mobile Networks
      The operation time of existing practical methods depends on log |U| = log n, which is large in practice:
        • Combinatorial Group Testing [Cormode, Muthukrishnan, ACM TODS 2005]
        • Hierarchical Count-Min Sketch [Cormode, Muthukrishnan, LATIN 2005]
      Examples of universe U, with log |U| = log n in bits:
                                                               IPv4   IPv6
        IP addresses                                             32    128
        Pairs of IP addresses                                    64    256
        Five-tuples (source/destination IP/port + protocol)     104    296
      Q. Can we eliminate the dependency on log n from the operation time?
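The bit widths in the table on slide 3 can be reproduced with simple arithmetic. The sketch below assumes 16-bit port numbers and an 8-bit protocol field, which the slide does not state explicitly; the helper name `key_bits` is hypothetical.

```python
ADDRESS_BITS = {"IPv4": 32, "IPv6": 128}
PORT_BITS = 16      # assumption: TCP/UDP port width
PROTOCOL_BITS = 8   # assumption: IP protocol field width

def key_bits(version):
    """Return log |U| in bits for the three key types listed on the slide."""
    a = ADDRESS_BITS[version]
    return {
        "address": a,
        "address pair": 2 * a,
        "five-tuple": 2 * a + 2 * PORT_BITS + PROTOCOL_BITS,
    }

for version in ("IPv4", "IPv6"):
    print(version, key_bits(version))
```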
  • 4. Main results
      Key technique: Packed Bidirectional Counter Arrays.
      Our paper also proposes a "cached candidate technique" for improving CGT for arbitrary updates.
      Comparison with CGT: Combinatorial Group Testing [Cormode, Muthukrishnan, ACM TODS 2005]:
                  This study        CGT                   CGT with base b
        Update    O(r) amortized    O(log(n) r)           O(log_b(n) r)
        Query     O(r^2/ε)          O((log(n)+r) r/ε)     O((b log_b(n)+r) r/ε)
        Space     O(log(n) r/ε)     O(log(n) r/ε)         O(b log_b(n) r/ε)
      n: size of the universe; δ: failure probability; r = log(1/(δφ)); b: any integer in [2..n].
      Model of computation: the standard w-bit word RAM.
  • 5. Related Work: Packed Counters
      Maintaining an array of m = O(w) counters on the w-bit word RAM.
      § Textbook solution for a single counter [Mehlhorn & Sanders, 2008]:
        • Ops = inc/test or dec/test: O(1) space; O(m) amortized time.
      § Nested counters [Grabowski & Fredriksson, IPL 2008]:
        • Ops = inc/test: O(m) space; O(1) amortized time.
      § Trit counters [Bille & Thorup, SODA 2010]:
        • Ops = dec/reset/test: O(m) space; O(1) amortized time.
      § Bidirectional counters [This talk]:
        • Ops = inc/dec/test: O(m) space; O(1) time for inc/dec (amortized) and test.
        • Naïve bidirectional counters: O(m) space; O(m) time for all operations.
      "test" means ispositive (C[i] > 0), iszero (C[i] = 0), or isnegative (C[i] < 0).
  • 6. How to improve CGT using Packed Bidirectional Counter Arrays?
  • 7. CGT: A Practical Data Structure
      Combinatorial Group Testing [CM, ACM TODS 2005]
        1. Three-dimensional counter array: C[1..r, 1..d, 0..m]
        2. A set of universal hash functions: h_1, ..., h_r: U → [1..d]
        with r = log(1/(δφ)), d = 2/ε, m = 1 + lg n.
      § Reports all φ-heavy hitters with probability at least 1 − δ for a specified δ.
      § Idea: random partition of U into d = 2/ε subsets via each hash function h_i.
        • A φ-heavy hitter x can be identified from each C[i, h_i(x), 0..m] with probability at least 1/2.
        • Setting r = log(1/(δφ)) results in the desired failure probability δ of missing any φ-heavy hitter.
  • 8. CGT: A Practical Data Structure
      Combinatorial Group Testing [CM, ACM TODS 2005]
        1. Three-dimensional counter array: C[1..r, 1..d, 0..m]
        2. A set of universal hash functions: h_1, ..., h_r: U → [1..d]
      § Reports all φ-heavy hitters with probability at least 1 − δ for a specified δ.
      CGT reduces both QUERY and UPDATE to three basic operations on a bidirectional counter array C[1..m]:
        INCREMENT(C, x): C[i] = C[i] + bit(x, i) for every i in [1..m].
        DECREMENT(C, x): C[i] = C[i] − bit(x, i) for every i in [1..m].
        ISPOSITIVE(C): return z = Σ_i [C[i] > 0] · 2^i, where [X] is 1 (resp. 0) if X is true (resp. false).
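The three basic operations on slide 8 can be written down directly. The Python sketch below is the naïve O(m)-time version that the talk sets out to improve, not the packed structure. It assumes 0-based bit positions (the slide indexes from 1), and the function names are hypothetical.

```python
def increment(C, x):
    """C[i] += bit(x, i) for every counter position i."""
    for i in range(len(C)):
        C[i] += (x >> i) & 1

def decrement(C, x):
    """C[i] -= bit(x, i) for every counter position i."""
    for i in range(len(C)):
        C[i] -= (x >> i) & 1

def is_positive(C):
    """Return the integer whose i-th bit is 1 exactly when C[i] > 0."""
    return sum(1 << i for i, c in enumerate(C) if c > 0)

C = [0] * 4
increment(C, 0b1010)
increment(C, 0b0110)
decrement(C, 0b0010)
print(C, bin(is_positive(C)))
```

Each call walks all m counters, which is the O(m) cost per operation that word-level packing is meant to remove.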
  • 9. CGT: A Practical Data Structure
      Combinatorial Group Testing [CM, ACM TODS 2005]
      UPDATE(x, Δ): O(log(n) r) time
        1. Add Δ to N
        2. for i in [1..r] do:
        3.     Add Δ to C[i, h_i(x), 0]
        4.     if Δ < 0 then: x ← ~x
        5.     INCREMENT(C[i, h_i(x), 1..m], x)
        6.     DECREMENT(C[i, h_i(x), 1..m], ~x)
      QUERY(): O((log(n)+r) r/ε) time
        1. for i in [1..r] do:
        2.     for j in [1..d] do:
        3.         // C[i, j, k] 2C[i, j, k] − C[i, j, 0]
        4.         x ← ISPOSITIVE(C[i, j, 1..m])
        5.         if min_i C[i, h_i(x), 0] > φN then:
        6.             report x as a φ-heavy hitter
      Basic operations:
        INCREMENT(C, x): C[i] = C[i] + bit(x, i) for every i in [1..m].
        DECREMENT(C, x): C[i] = C[i] − bit(x, i) for every i in [1..m].
        ISPOSITIVE(C): return z = Σ_i [C[i] > 0] · 2^i, where [X] is 1 (resp. 0) if X is true (resp. false).
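The UPDATE and QUERY pseudocode on slide 9 can be sketched in Python as follows. This is an illustrative, unpacked version under several assumptions that are not from the slides: steps 4 to 6 of UPDATE are read as "add Δ to the counter of every 1-bit of x and subtract Δ from the counter of every 0-bit, always hashing the original x"; the hash family is ((a·x + b) mod P) mod d with a Mersenne prime P; and the class name `CGT` and its parameters are hypothetical.

```python
import random

class CGT:
    """Illustrative combinatorial group testing sketch with plain integer counters."""

    P = (1 << 61) - 1  # Mersenne prime, assumed larger than the universe size

    def __init__(self, n_bits, r, d, seed=0):
        rng = random.Random(seed)
        self.m, self.r, self.d = n_bits, r, d
        self.total = 0  # N in the slides
        # One (a, b) pair per round for h_i(x) = ((a*x + b) mod P) mod d.
        self.hashes = [(rng.randrange(1, self.P), rng.randrange(self.P))
                       for _ in range(r)]
        # C[i][j][0] is the group total; C[i][j][k + 1] tracks bit k.
        self.C = [[[0] * (n_bits + 1) for _ in range(d)] for _ in range(r)]

    def _h(self, i, x):
        a, b = self.hashes[i]
        return ((a * x + b) % self.P) % self.d

    def update(self, x, delta):
        """delta = +1 for INSERT(x), -1 for DELETE(x)."""
        self.total += delta
        for i in range(self.r):
            row = self.C[i][self._h(i, x)]
            row[0] += delta
            for k in range(self.m):
                # 1-bits of x push the counter up, 0-bits push it down.
                row[k + 1] += delta if (x >> k) & 1 else -delta

    def query(self, phi):
        found = set()
        for i in range(self.r):
            for j in range(self.d):
                row = self.C[i][j]
                # ISPOSITIVE: decode the candidate from the counter signs.
                x = sum(1 << k for k in range(self.m) if row[k + 1] > 0)
                # Keep x only if every group it hashes to is heavy enough.
                if min(self.C[t][self._h(t, x)][0]
                       for t in range(self.r)) > phi * self.total:
                    found.add(x)
        return found

sketch = CGT(n_bits=16, r=4, d=8)
for _ in range(600):
    sketch.update(42, +1)
for y in range(1000, 1400):
    sketch.update(y, +1)
print(42 in sketch.query(0.5))
```

In this example element 42 holds 600 of the 1000 insertions, so it is a strict majority in every group it hashes to and is decoded correctly in every round regardless of the hash seeds. The inner loop over the m bit counters is the part that costs O(log n) per round in this unpacked form.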
  • 10. CGT: A Practical Data Structure (continued)
      Q. Can we implement INCREMENT/DECREMENT/ISPOSITIVE in o(m) time?
  • 11. Packed Bidirectional Counter Arrays
      § Basic idea: exploiting the word-level parallelism of the w-bit word RAM.
        • Redundant binary representation of C[1..m] using digits {0, ±1, ±2}.
        • The corresponding k-th digits of C[1..m] are packed into O(1) words.
        • The packed k-th digits of C[1..m] are updated in O(1) time, once every 2^k updates.
      Resulting costs, using O(m) space (compact!) for m = O(w):
        INCREMENT(C, x): C[i] = C[i] + bit(x, i) for every i in [1..m]. O(1) amortized time.
        DECREMENT(C, x): C[i] = C[i] − bit(x, i) for every i in [1..m]. O(1) amortized time.
        ISPOSITIVE(C): return z = Σ_i [C[i] > 0] · 2^i, where [X] is 1 (resp. 0) if X is true (resp. false). O(1) time.
  • 12. Packed Bidirectional Counter Arrays (continued)
      [Figure: layout of counters C[1], ..., C[i], ..., C[m], each w digits long.]
        • Naïve bidirectional counters: m × w = O(w^2) bits; O(w) time to access.
        • Packed bidirectional counter array: m × O(1) = O(w) bits; O(1) time to access.
  • 13. Packed Bidirectional Counter Arrays
      • Fixed-schedule carry propagation in O(1) amortized time [GF, IPL 2008][BT, SODA 2010].
      • Packed redundant binary counters using digits {0, ±1, ±2}.
      • Packed orders of magnitudes for detecting sign inversion.
      The t-th update:
        1. Propagate carry bits.
        2. Fix orders of magnitudes.
      level(t) = max{i | t mod 2^i = 0}. The k-th digits are updated once every 2^k updates. Never overflow.
      [Figure: schedule of level(t) for t = 1, ..., 9, with digits drawn in {0, ±1} or {0, ±1, ±2}.]
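The fixed schedule on slide 13 can be illustrated with a few lines of Python. The sketch reads level(t) as the largest i such that 2^i divides t, that is, the number of trailing zero bits of t, which appears to match the value level(8) = 3 in the slide's figure. It further assumes, for the cost estimate only, that the t-th update touches digit positions 0 through level(t); that detail is an assumption and is not spelled out on the slide. The function names are hypothetical.

```python
def level(t):
    """Largest i such that 2**i divides t (t >= 1): the index of the lowest set bit."""
    return (t & -t).bit_length() - 1

def total_touches(T):
    """Digit positions touched over updates 1..T, assuming update t touches 0..level(t)."""
    return sum(level(t) + 1 for t in range(1, T + 1))

print([level(t) for t in range(1, 10)])
print(total_touches(1024), "touches over 1024 updates")
```

Under that assumption the total stays below 2T for T updates, so with each digit position packed into O(1) words the amortized cost per update is constant, consistent with the O(1) amortized bound stated in the talk.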
  • 14. Lemma (Packed Bidirectional Counters)
      There exists an O(m)-space data structure for representing an array C[1..m] of m bidirectional counters supporting
        § INCREMENT/DECREMENT in O(1) amortized time
        § ISPOSITIVE in O(1) time
      on the standard w-bit word RAM with w ≥ m.
  • 15. Theorem
      § Plugging packed bidirectional counters into CGT, we obtain:
      There exists an O(lg(n) r/ε)-space randomized data structure for solving the ε-approximate φ-heavy hitters problem with
        § INSERT/DELETE in O(r) amortized time
        § QUERY in O(r^2/ε) time
      with probability at least 1 − δ on the standard w-bit word RAM with w ≥ lg n.
      Here, n is the universe size, δ is the failure probability, and r = lg(1/(δφ)).
  • 16. Experiments: Setup
      § Data: 14 datasets of 10 M integers
        • Universe: [0, 2^64).
        • Zipf distribution with skewness z in { 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0 }.
        • Threshold φ (= ε) in { 0.0001, 0.0005, 0.001, 0.005, 0.01 }.
      § Methods:
        • Ours [This work]: our proposed method with number of rounds r = 4.
        • CGT(b) [CM, TODS 2005]: Combinatorial Group Testing with branching factor b in { 2, 16 }.
        • CMH(b) [CM, LATIN 2005]: Hierarchical Count-Min Sketch with branching factor b in { 2, 16 }.
        CGT(b) and CMH(b) were configured as in [Cormode, Hadjieleftheriou, PVLDB 2008].
      § Hardware:
        • MacBook Pro with Intel® Core™ i7-8559 (2.7GHz) and 16GB main memory.
  • 17. Experiments: Precision
      § Our method achieved competitive precision for skewness z ≥ 1.4.
        • Our method output more false positives than the others for skewness z < 1.4.
        • For z < 1.4, our method should have used a larger ε to suppress false positives.
        • The recall of all methods was 100%.
      [Figure: precision (%) versus skewness (0.8 to 2.0), one panel per φ in { 0.0001, 0.0005, 0.001, 0.005, 0.01 }; series: Ours, CGT(2), CGT(16) [CM, TODS 2005], CMH(2), CMH(16) [CM, LATIN 2005].]
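Precision and recall as reported on slide 17 can be computed from the reported set and the true set of heavy hitters. The sketch below uses the standard definitions; the slides do not describe the authors' evaluation code, so the function name `precision_recall` and the convention of returning 1.0 for an empty set are assumptions.

```python
def precision_recall(reported, truth):
    """Standard precision and recall of a reported set against the true heavy hitters."""
    true_positives = len(reported & truth)
    precision = true_positives / len(reported) if reported else 1.0
    recall = true_positives / len(truth) if truth else 1.0
    return precision, recall

# Two true heavy hitters, both reported, plus two false positives.
print(precision_recall({1, 2, 3, 4}, {1, 2}))  # (0.5, 1.0)
```

A recall of 100% with lower precision, as in the example, corresponds to the situation described for skewness z < 1.4: no heavy hitter is missed, but extra elements are reported.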
  • 18. Experiments: Update time
      § Our method achieved update throughput competitive with CMH(16).
        • CMH(16) achieved the best and most stable update throughput.
        • CGT(16) depended heavily on φ in practice, even though it does not in theory.
        • CGT(2) and CMH(2) were not competitive.
      Note: the median of 5 measured times is reported.
      [Figure: updates/msec (0 to 30000) versus skewness (0.8 to 2.0), one panel per φ in { 0.0001, 0.0005, 0.001, 0.005, 0.01 }; series: Ours, CGT(2), CGT(16) [CM, TODS 2005], CMH(2), CMH(16) [CM, LATIN 2005].]
  • 19. Experiments: Query time
      § Our method achieved the best query throughput except for φ = 0.0001.
        • Note: ε = φ and r = O(1) in our experiments.
        • The CGT family (including ours) must examine Θ(1/φ) candidate heavy hitters.
        • The CMH family is output sensitive: it is fast if the number of heavy hitters is less than 1/φ.
      [Figure: queries/msec versus skewness (0.8 to 2.0), one panel per φ in { 0.0001, 0.0005, 0.001, 0.005, 0.01 }, each with its own vertical scale; series: Ours, CGT(2), CGT(16) [CM, TODS 2005], CMH(2), CMH(16) [CM, LATIN 2005].]
  • 20. Conclusion
      § The φ-Heavy Hitters Problem in the strict turnstile model. We improved CGT [CM, ACM TODS 2005] in
        • Update time: from O(log(n) r) to amortized O(r)
        • Query time: from O((log(n)+r) r/ε) to O(r^2/ε)
        using the same O(log(n) r/ε) space for a universe of size n and r = log(1/(δφ)).
      § Packed Bidirectional Counter Array:
        • Extension of [GF, IPL 2008] and [BT, SODA 2010] to bidirectional counters.
        • Ops = inc/dec/test: O(1) amortized inc/dec and O(1) test in compact space.
      § Future work
        • Extension of our method to arbitrary updates.