Ashfaq Munshi, ML7 Fellow, Pepperdata

Classifying Multivariate Time Series Scalably
Ashfaq Munshi, Saeed Bidhendi, Faramarz Munshi
November 10, 2017

• Background and Motivation
• Univariate Time Series (UTS)
• Multivariate Time Series (MTS)
• Conclusion
Overview
© Pepperdata, Inc.

Pepperdata Telemetry Data Scale
Example production deployment:
• 570 nodes
• 20 tasks / node
• 300 metrics / task
• 5-second sampling
• 41 million points / minute
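The points-per-minute figure follows from the other four numbers on the slide. A short check of the arithmetic (variable names are illustrative only):

```python
# Figures taken from the example production deployment above.
nodes = 570
tasks_per_node = 20
metrics_per_task = 300
sample_interval_s = 5

# One sample of every metric on every task on every node.
points_per_sample = nodes * tasks_per_node * metrics_per_task  # 3,420,000

# A 5-second sampling interval gives 12 samples per minute.
samples_per_minute = 60 // sample_interval_s

points_per_minute = points_per_sample * samples_per_minute  # 41,040,000
print(f"{points_per_minute:,} points / minute")
```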

Our Big Data About Production Big Data
• 300 trillion performance data points collected
• 22 thousand production nodes
• 50 million jobs / year

Example Time Series

• Highly variable in length
• 10 data points to 10K+ data points
• Missing data
• Extremely noisy
Characteristics of our TS

Problem
Classify this collection of time series
to give operators a better understanding of
resource utilization on their clusters and to
enable a scheduler to better optimize cluster
resources

• Two recent approaches from the literature
• Transform the TS into an image then use a tiled CNN
[Wang & Oates 2015]
• Transform the TS into a bag of patterns
[Schäfer & Leser 2017]
• Dataset is the UCR data set
• 82 time series data sets
• Number of series < 10K
• Data points per series < 2K
Approaches and Data Set

• Map the time series into
• Gramian Angular Summation Fields
• Gramian Angular Difference Fields
• Markov Transition Fields
• Feed images into a tiled CNN for classification
Time Series and Images
[Wang & Oates, 2015]
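A minimal sketch of a Markov Transition Field, assuming quantile binning and first-order transitions between consecutive points. The function name and the `n_bins` default are illustrative choices, not taken from the slides.

```python
import numpy as np

def markov_transition_field(series, n_bins=4):
    """Return an (n, n) image whose pixel (i, j) is the estimated probability
    of moving from the quantile bin of point i to the quantile bin of point j."""
    x = np.asarray(series, dtype=float)

    # Assign each point to a quantile bin in 0 .. n_bins-1.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    q = np.digitize(x, edges)

    # Count transitions between the bins of consecutive points.
    W = np.zeros((n_bins, n_bins))
    for a, b in zip(q[:-1], q[1:]):
        W[a, b] += 1

    # Normalize each row to probabilities; rows with no transitions stay zero.
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

    # Spread the transition probabilities over all pairs of time steps.
    return W[q[:, None], q[None, :]]
```

For a series of length n the result is an n x n array with values in [0, 1], which can be treated as a single-channel image.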

• Normalize the time series into [-1,1]
• Transform to Polar Coordinates
Gramian Angular Fields
[Wang & Oates, 2015]
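The two steps above can be sketched as follows, assuming min-max rescaling and the usual definitions GASF = cos(φ_i + φ_j) and GADF = sin(φ_i − φ_j), where φ is the polar angle of the rescaled value. The function name is illustrative.

```python
import numpy as np

def gramian_angular_fields(series):
    """Return (GASF, GADF) images for a 1-D series."""
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled to [-1, 1]")

    # Step 1: normalize the series into [-1, 1].
    scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    scaled = np.clip(scaled, -1.0, 1.0)  # guard against rounding error

    # Step 2: polar coordinates; the angle is arccos of the rescaled value.
    phi = np.arccos(scaled)

    # Pairwise angular sums and differences form the two images.
    gasf = np.cos(phi[:, None] + phi[None, :])
    gadf = np.sin(phi[:, None] - phi[None, :])
    return gasf, gadf
```

The GASF is symmetric and the GADF is antisymmetric with a zero diagonal, which gives a quick sanity check on any implementation.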

Example GADF Image
[Wang & Oates, 2015]

• Divide TS into windows
• Fourier Transform TS in window
• Apply low-pass filter
• Quantize the Fourier coefficients
• Map window to words
• Extract features from sentences
• Use Logistic Regression classifier
Time Series and Bag of Patterns
[Schäfer & Leser 2017]
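A simplified sketch of the windowing, Fourier, quantization, and word-counting steps above. It is not the published algorithm: the quantile-based bins, the choice to drop the DC coefficient, and all parameter names and defaults are assumptions made for illustration, and the final logistic regression step is left out.

```python
import numpy as np
from collections import Counter

def bag_of_patterns(series, window=8, n_coefs=2, alphabet="abcd"):
    """Map a 1-D series to a histogram of words, one word per sliding window."""
    x = np.asarray(series, dtype=float)

    # Divide the series into overlapping windows.
    windows = np.lib.stride_tricks.sliding_window_view(x, window)

    # Fourier transform each window; keeping only the first few
    # coefficients acts as a low-pass filter (the DC term is skipped).
    spectrum = np.fft.rfft(windows, axis=1)[:, 1:n_coefs + 1]
    feats = np.concatenate([spectrum.real, spectrum.imag], axis=1)

    # Quantize each coefficient into len(alphabet) equal-frequency bins.
    symbols = np.empty(feats.shape, dtype=int)
    inner = np.linspace(0, 1, len(alphabet) + 1)[1:-1]
    for j in range(feats.shape[1]):
        edges = np.quantile(feats[:, j], inner)
        symbols[:, j] = np.digitize(feats[:, j], edges)

    # One word per window; the bag is the word histogram.
    words = ["".join(alphabet[s] for s in row) for row in symbols]
    return Counter(words)
```

The resulting word counts would then serve as the feature vector for a classifier such as logistic regression.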

• Convert TS into image (GADF)
• Use Google’s pre-trained Inception v3 CNN
• Embed into 2,048-dimensional vector space
• Train MLP
• 2 hidden layers (50 nodes each)
• ReLU activation
• Dropout for regularization (.1, .2)
• Softmax final layer
Our “Off the Shelf” Approach (PD)
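A minimal NumPy sketch of the classifier head described above: a forward pass through two 50-node ReLU hidden layers and a softmax output over a 2,048-dimensional embedding. The weights are random, training is not shown, and the embedding step (the pre-trained CNN) is not reproduced here. Reading "(.1, .2)" as one dropout rate per hidden layer is an assumption, as are the function names and the number of classes.

```python
import numpy as np

def init_mlp(n_in=2048, n_hidden=50, n_classes=5, seed=0):
    """Random weights for a 2048 -> 50 -> 50 -> n_classes network."""
    rng = np.random.default_rng(seed)
    sizes = [n_in, n_hidden, n_hidden, n_classes]
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x, dropout=(0.1, 0.2), rng=None):
    """Class probabilities for a batch x of shape (batch, 2048).

    Pass a NumPy Generator as rng to apply dropout (training mode);
    leave it as None for inference.
    """
    h = x
    for (W, b), p in zip(params[:-1], dropout):
        h = np.maximum(h @ W + b, 0.0)           # ReLU activation
        if rng is not None:                      # inverted dropout
            h = h * (rng.random(h.shape) >= p) / (1.0 - p)
    W, b = params[-1]
    logits = h @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)      # softmax final layer
```

In practice a deep learning framework would supply both the pre-trained embedding and the training loop; this sketch only makes the layer structure concrete.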

Accuracies for a subset of UCR
• BOSS: 91.1%
• PD: 89.8%
• GADF+GASF+MTF: 86.4%

Accuracy on a subset of UCR
[Bar chart, accuracy axis 68% to 86%: WEASEL, 1-NN DTW CV, 1-NN DTW, BOSS, Learning Shapelet (LS), TSBF, ST, EE (PROP), COTE (ensemble), PD]

Training Time Comparison
[Chart comparing training times; includes PD]

• Two recent approaches from the literature
• Use an ESN (“Echo State Network”) to map MTS into
state clouds [Wang, Wang, Liu 2015]
• Use Dynamic Time Warping with Mahalanobis distance
metric [Mei, Liu, Wang, Gao 2016]
• Datasets are from UCI, a small subset of UCR, and others
• Number of series ~ 10K
• Data points per series ~ 200
Approaches and Data Set
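To make the second approach concrete, here is a minimal sketch of Dynamic Time Warping over multivariate series with a squared Mahalanobis local cost. It is not the cited authors' method: how the matrix `M` is obtained is not covered here, so `M` is simply passed in as a hypothetical positive semi-definite matrix, and the function name is illustrative.

```python
import numpy as np

def mahalanobis_dtw(a, b, M):
    """DTW cost between series a (n, d) and b (m, d).

    The local cost between two time steps is the squared Mahalanobis
    distance (a_i - b_j)^T M (a_i - b_j).
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # Pairwise local costs between every time step of a and of b.
    diff = a[:, None, :] - b[None, :, :]                  # shape (n, m, d)
    local = np.einsum("ijk,kl,ijl->ij", diff, M, diff)    # shape (n, m)

    # Standard DTW dynamic program over the cumulative cost matrix.
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = local[i - 1, j - 1] + min(D[i - 1, j],
                                                D[i, j - 1],
                                                D[i - 1, j - 1])
    return D[n, m]
```

With `M` set to the identity matrix this reduces to ordinary DTW with squared Euclidean local cost.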

• Make TS for each variable the same length by zero
padding
• Convert each TS into a GADF image
• Interpolate any missing data points in the image using
linear interpolation on the image
• Stack the images for the five variables
• Use the same process as before for univariate time
series
Our “Off the Shelf” Approach (PD)
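A minimal sketch of the padding, per-variable GADF, and stacking steps above. The interpolation of missing points is omitted, and stacking along a new leading axis is an assumption, since the slides do not say how the images are combined. Function names are illustrative.

```python
import numpy as np

def gadf(x):
    """Gramian Angular Difference Field of a 1-D series."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    # Rescale to [-1, 1]; a constant series maps to all zeros.
    scaled = np.zeros_like(x) if hi == lo else 2.0 * (x - lo) / (hi - lo) - 1.0
    phi = np.arccos(np.clip(scaled, -1.0, 1.0))
    return np.sin(phi[:, None] - phi[None, :])

def stack_mts_images(variables):
    """Zero-pad each variable's series to a common length, convert each
    to a GADF image, and stack the images along a new leading axis."""
    length = max(len(v) for v in variables)
    padded = [np.pad(np.asarray(v, dtype=float), (0, length - len(v)))
              for v in variables]
    return np.stack([gadf(p) for p in padded])
```

For k variables whose longest series has n points, the result has shape (k, n, n).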

5-Fold Cross Validation Error
[Bar chart, error axis 0 to 30, for Robot failure LP1 to LP5: MDDTW Best vs. PD 5-fold]

10-Fold Cross Validation Error
[Bar chart, error axis 0 to 30, for Robot failure LP1 to LP5: Echo Network Best vs. PD 10-fold]

• Four variables:
• CPU, Virtual Memory, HDFS reads, Network Ops
• Each time series collected over one week
• 10 data points to 10K+ data points
• Missing data
• Extremely noisy
• For periods longer than a week, data is much larger
• Sampling rate is the same for all TS
PD Data

Accuracy per Label on PD Dataset G
• Number of TS = 3092
• Lengths per TS = 5 to 8500
• Average Accuracy = 78.14%
[Bar chart: accuracy (0 to 100) per label, labels 1 to 17]

Accuracy per Label on PD Dataset R
• Number of TS = 6715
• Lengths per TS = 5 to 9400
• Average Accuracy = 75.95%
[Bar chart: accuracy per label, labels 1 to 48]

Summary
Our “Off the Shelf” approach is as good as the
best approaches for both UTS and MTS, and
the methodology is the same for both types of
TS.
