Classifying Multivariate Time Series Scalably
Ashfaq Munshi, Saeed Bidhendi, Faramarz Munshi
November 10, 2017
Overview
• Background and Motivation
• Univariate Time Series (UTS)
• Multivariate Time Series (MTS)
• Conclusion
© Pepperdata, Inc.
Background
Pepperdata Telemetry Data Scale
Example production deployment:
• 570 nodes
• 20 tasks / node
• 300 metrics / task
• 5-second sampling
• 41 million points / minute
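The per-minute figure follows from the other four numbers on the slide. A quick check of the arithmetic, assuming every task reports every metric at every sampling interval (the variable names are illustrative, not Pepperdata identifiers):

```python
# Figures taken from the slide.
nodes = 570
tasks_per_node = 20
metrics_per_task = 300
sampling_interval_s = 5

# One time series per (node, task, metric) triple.
series = nodes * tasks_per_node * metrics_per_task
samples_per_minute = 60 // sampling_interval_s  # 12 samples per series per minute
points_per_minute = series * samples_per_minute

print(f"{series:,} concurrent series, {points_per_minute:,} points per minute")
# prints: 3,420,000 concurrent series, 41,040,000 points per minute
```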
Our Big Data About Production Big Data
• 300 trillion performance data points collected
• 22 thousand production nodes
• 50 million jobs / year
Example Time Series
Characteristics of our TS
• Highly variable in length: 10 data points to 10K+ data points
• Missing data
• Extremely noisy
Problem
Classify this collection of time series to give operators a better understanding of resource utilization on their clusters and to enable a scheduler to better optimize cluster resources.
Univariate Time Series
Approaches and Data Set
• Two recent approaches from the literature
• Transform the TS into an image, then use a tiled CNN [Wang & Oates 2015]
• Transform the TS into a bag of patterns [Schäfer & Leser 2017]
• Dataset is the UCR data set
• 82 time series data sets
• Number of series < 10K
• Data points per series < 2K
Time Series and Images [Wang & Oates, 2015]
• Map the time series into three image encodings:
• Gramian Angular Summation Fields (GASF)
• Gramian Angular Difference Fields (GADF)
• Markov Transition Fields (MTF)
• Feed images into a tiled CNN for classification
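Of the three encodings, the Markov Transition Field is the one the slides do not expand on. A minimal sketch, assuming rank-based quantile bins and a first-order transition matrix; the function names and the default bin count are illustrative choices, not taken from the talk:

```python
def quantile_bins(series, n_bins):
    """Assign each point to one of n_bins equal-population bins by rank.

    Ties are broken by position, which is a simplification.
    """
    order = sorted(range(len(series)), key=lambda i: series[i])
    bins = [0] * len(series)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(series)
    return bins


def markov_transition_field(series, n_bins=4):
    """Return an N x N matrix whose (i, j) entry is the estimated probability
    of moving from the bin of point i to the bin of point j in one step."""
    q = quantile_bins(series, n_bins)

    # Count transitions between the bins of consecutive points.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(q, q[1:]):
        counts[a][b] += 1

    # Row-normalize the counts into transition probabilities.
    transition = []
    for row in counts:
        total = sum(row)
        transition.append([c / total if total else 0.0 for c in row])

    # Spread the transition probabilities over every pair of time steps.
    return [[transition[qi][qj] for qj in q] for qi in q]


mtf = markov_transition_field([0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6], n_bins=2)
```

For the alternating series in the usage line, every low value is followed by a high value, so the entry for a (low, high) pair is 1.0 and the entry for a (low, low) pair is 0.0.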
Gramian Angular Fields [Wang & Oates, 2015]
• Normalize the time series into [-1,1]
• Transform to Polar Coordinates
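The two steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: min-max rescaling is assumed for the normalization, only the angular coordinate φ = arccos(x) is used, and the fields are taken as GASF[i][j] = cos(φ_i + φ_j) and GADF[i][j] = sin(φ_i − φ_j).

```python
import math


def rescale(series):
    """Min-max rescale a series into [-1, 1]; a constant series maps to zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [((x - hi) + (x - lo)) / (hi - lo) for x in series]


def gramian_angular_fields(series):
    """Return (GASF, GADF) as nested lists of shape N x N."""
    # Clamp to guard against rounding just outside the domain of acos.
    x = [max(-1.0, min(1.0, v)) for v in rescale(series)]
    phi = [math.acos(v) for v in x]  # angular coordinate of each point
    gasf = [[math.cos(a + b) for b in phi] for a in phi]
    gadf = [[math.sin(a - b) for b in phi] for a in phi]
    return gasf, gadf


gasf, gadf = gramian_angular_fields([1.0, 2.0, 3.0])
```

The GASF is symmetric and the GADF is antisymmetric with a zero diagonal, which is a handy sanity check on any implementation.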
Example GADF Image [Wang & Oates, 2015]
Time Series and Bag of Patterns [Schäfer & Leser 2017]
• Divide TS into windows
• Fourier transform the TS in each window
• Apply low-pass filter
• Quantize the Fourier coefficients
• Map each window to a word
• Extract features from sentences
• Use Logistic Regression classifier
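A simplified sketch of the windowing, Fourier, quantization, and word steps listed above. It assumes fixed quantization edges and a three-letter alphabet for brevity, whereas the published methods learn the bins from training data; the classifier step is omitted, and all names and defaults are illustrative.

```python
import cmath
from collections import Counter


def dft_low_pass(window, n_coeffs):
    """Keep only the first n_coeffs DFT coefficients (a crude low-pass filter)."""
    n = len(window)
    coeffs = []
    for k in range(n_coeffs):
        c = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(window))
        coeffs.append(c / n)
    return coeffs


def to_word(coeffs, edges=(-0.25, 0.25), alphabet="abc"):
    """Quantize the real and imaginary parts into letters (fixed edges assumed)."""
    letters = []
    for c in coeffs:
        for value in (c.real, c.imag):
            letters.append(alphabet[sum(value > e for e in edges)])
    return "".join(letters)


def bag_of_patterns(series, window=8, n_coeffs=2):
    """Histogram of the words produced by every sliding window."""
    words = (to_word(dft_low_pass(series[i:i + window], n_coeffs))
             for i in range(len(series) - window + 1))
    return Counter(words)


bag = bag_of_patterns([0.0] * 8 + [1.0] * 8)
```

The resulting word counts form the feature vector that a linear classifier such as logistic regression would consume.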
Our “Off the Shelf” Approach (PD)
• Convert TS into image (GADF)
• Use Google’s pre-trained Inception v3 CNN
• Embed into a 2,048-dimensional vector space
• Train MLP
• 2 hidden layers (50 nodes each)
• ReLU activation
• Dropout for regularization (rates 0.1 and 0.2)
• Softmax final layer
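The shape of the classifier head can be illustrated with a plain-Python forward pass. This sketch makes several assumptions: the weights are random rather than trained, the 2,048-dimensional input stands in for a CNN embedding, the number of classes (5) is hypothetical, and the slides do not say which dropout rate belongs to which layer, so 0.1 is applied after the first hidden layer and 0.2 after the second.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def dropout(values, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random weights (He-style scaling) and zero biases; untrained placeholder."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def mlp_forward(embedding, layers, rng=None, rates=(0.1, 0.2)):
    """Two ReLU hidden layers, then softmax. Pass rng to enable dropout (training)."""
    hidden = embedding
    for (weights, biases), rate in zip(layers[:-1], rates):
        hidden = relu(dense(hidden, weights, biases))
        if rng is not None:
            hidden = dropout(hidden, rate, rng)
    weights, biases = layers[-1]
    return softmax(dense(hidden, weights, biases))


rng = random.Random(0)
sizes = [2048, 50, 50, 5]  # embedding, two hidden layers, hypothetical class count
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
probs = mlp_forward([rng.random() for _ in range(2048)], layers)
```

Training (loss, backpropagation, optimizer) is left out; in practice a deep learning framework would handle it.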
Accuracies for a subset of UCR
[Figure: chart with accuracy axis from 0% to 100%. Legend: BOSS (91.1), PD (89.8), GADF+GASF+MTF (86.4)]
Accuracy on a subset of UCR
[Figure: chart with accuracy axis from 68% to 86%. Methods compared: WEASEL, 1-NN DTW CV, 1-NN DTW, BOSS, Learning Shapelet (LS), TSBF, ST, EE (PROP), COTE (ensemble), PD]
Training Time Comparison
[Figure: training time chart; only the label "PD" is preserved]
Multivariate Time Series
Approaches and Data Set
• Two recent approaches from the literature
• Use an ESN (“Echo State Network”) to map MTS into state clouds [Wang, Wang, Liu 2015]
• Use Dynamic Time Warping with Mahalanobis distance metric [Mei, Liu, Wang, Gao 2016]
• Dataset is from UCI, a small subset of UCR and others
• Number of series ~ 10K
• Data points per series ~ 200
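For readers unfamiliar with the second baseline, here is a minimal sketch of multivariate Dynamic Time Warping with a Mahalanobis local distance. It assumes the positive semi-definite matrix `M` is supplied by the caller (the cited work learns it from data, which is not shown), and it uses the textbook unconstrained recurrence without a warping window.

```python
import math


def mahalanobis(x, y, M):
    """sqrt((x - y)^T M (x - y)) for a positive semi-definite matrix M."""
    d = [a - b for a, b in zip(x, y)]
    q = sum(d[i] * M[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))
    return math.sqrt(max(q, 0.0))


def dtw_distance(A, B, M):
    """DTW cost between two multivariate series (lists of equal-dimension points)."""
    n, m = len(A), len(B)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = mahalanobis(A[i - 1], B[j - 1], M)
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


identity = [[1.0, 0.0], [0.0, 1.0]]  # with the identity, this is Euclidean DTW
a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
b = [(0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
c = [(0.0, 0.0), (1.0, 1.0), (2.0, 3.0)]
same_shape = dtw_distance(a, b, identity)
shifted_end = dtw_distance(a, c, identity)
```

Series `b` repeats one point of `a`, so warping absorbs the difference and the cost is zero, while `c` differs from `a` by one unit in its last point.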
Our “Off the Shelf” Approach (PD)
• Make the TS for each variable the same length by zero padding
• Convert each TS into a GADF image
• Interpolate any missing data points in the image using linear interpolation on the image
• Stack the images for the five variables
• Use the same process as before for univariate time series
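A rough sketch of the padding, conversion, and stacking steps. It rests on assumptions: trailing zero padding, min-max rescaling inside the GADF, and stacking as separate channels (the slides do not say whether the images are stacked as channels or tiled, nor how the result is fed to the CNN). The interpolation of missing points is not shown, and the function names are illustrative.

```python
import math


def zero_pad(series_by_variable):
    """Right-pad every variable's series with zeros to the longest length."""
    target = max(len(s) for s in series_by_variable)
    return [list(s) + [0.0] * (target - len(s)) for s in series_by_variable]


def gadf(series):
    """Gramian Angular Difference Field of one series, as an N x N nested list."""
    lo, hi = min(series), max(series)
    span = hi - lo
    scaled = [0.0 if span == 0 else ((v - hi) + (v - lo)) / span for v in series]
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    return [[math.sin(a - b) for b in phi] for a in phi]


def stack_gadf_images(series_by_variable):
    """Return a channels x N x N nested list, one GADF image per variable."""
    return [gadf(s) for s in zero_pad(series_by_variable)]


stack = stack_gadf_images([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0], [0.5]])
```

Because every variable is padded to the same length, all images share one size and can be handled as a single multi-channel array.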
5-Fold Cross Validation Error
[Figure: bar chart with error axis from 0 to 30. Data sets: Robot failure LP1, LP2, LP3, LP4, LP5. Series: MDDTW Best, PD 5-fold]
10-Fold Cross Validation Error
[Figure: bar chart with error axis from 0 to 30. Data sets: Robot failure LP1, LP2, LP3, LP4, LP5. Series: Echo Network Best, PD 10-fold]
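For reference, k-fold cross-validation error of the kind plotted in these two charts can be computed as below. This is a generic sketch, not the evaluation code behind the slides: folds are contiguous with no shuffling or stratification, and `fit_predict` is a hypothetical callback that trains on the training split and returns predictions for the test split.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validation_error(samples, labels, k, fit_predict):
    """Mean percentage of misclassified test samples across the k folds."""
    errors = []
    for train, test in k_fold_indices(len(samples), k):
        predictions = fit_predict([samples[i] for i in train],
                                  [labels[i] for i in train],
                                  [samples[i] for i in test])
        wrong = sum(p != labels[i] for p, i in zip(predictions, test))
        errors.append(100.0 * wrong / len(test))
    return sum(errors) / len(errors)


# Toy usage: a "classifier" that always predicts label 0.
always_zero = lambda train_x, train_y, test_x: [0] * len(test_x)
error = cross_validation_error(list(range(10)), [0] * 8 + [1] * 2, 5, always_zero)
```

In the toy usage only the last fold contains label 1, so four folds score 0% error and one scores 100%, giving a mean of 20%.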
PD Data
• Four variables: CPU, Virtual Memory, HDFS reads, Network Ops
• Each time series collected over one week
• 10 data points to 10K+ data points
• Missing data
• Extremely noisy
• For periods longer than a week, data is much larger
• Sampling rate is the same for all TS
Accuracy per Label on PD Dataset G
Number of TS = 3092
Lengths per TS = 5 to 8500
Average Accuracy = 78.14%
[Figure: bar chart of accuracy (axis 0 to 100) for each of labels 1 to 17]
Accuracy per Label on PD Dataset R
Number of TS = 6715
Lengths per TS = 5 to 9400
Average Accuracy = 75.95%
[Figure: bar chart of accuracy (axis 0 to 120) for each of labels 1 to 48]
Summary
Our “Off the Shelf” approach is as good as the best approaches for both UTS and MTS, and the methodology is the same for both types of TS.
Thank You

Ashfaq Munshi, ML7 Fellow, Pepperdata

  • 1. Classifying Multivariate Time Series Scalably Ashfaq Munshi, Saeed Bidhendi, Faramarz Munshi November 10, 2017
  • 2. Overview • Background and Motivation • Univariate Time Series (UTS) • Multivariate Time Series (MTS) • Conclusion
  • 5. Pepperdata Telemetry Data Scale • Example production deployment: 570 nodes, 20 tasks/node, 300 metrics/task, 5-second sampling, 41 million points/minute
  • 6. Our Big Data About Production Big Data • 300 trillion performance data points collected • 22 thousand production nodes • 50 million jobs/year
  • 7. Example Time Series
  • 8. Characteristics of our TS • Highly variable in length • 10 data points to 10K+ data points • Missing data • Extremely noisy
  • 9. Problem • Classify this collection of time series to give operators a better understanding of resource utilization on their clusters and to enable a scheduler to better optimize cluster resources
  • 11. Approaches and Data Set • Two recent approaches from the literature • Transform the TS into an image, then use a tiled CNN [Wang & Oates 2015] • Transform the TS into a bag of patterns [Schäfer & Leser 2017] • Dataset is the UCR data set • 82 time series data sets • Number of series < 10K • Data points per series < 2K
  • 12. Time Series and Images [Wang & Oates, 2015] • Map the time series into • Gramian Angular Summation Fields • Gramian Angular Difference Fields • Markov Transition Fields • Feed images into a tiled CNN for classification
  • 13. Gramian Angular Fields [Wang & Oates, 2015] • Normalize the time series into [-1,1] • Transform to polar coordinates
  • 14. Example GADF Image [Wang & Oates, 2015]
  • 15. Time Series and Bag of Patterns [Schäfer & Leser 2017] • Divide TS into windows • Fourier transform the TS in each window • Apply low-pass filter • Quantize the Fourier coefficients • Map each window to a word • Extract features from sentences • Use a logistic regression classifier
  • 16. Our “Off the Shelf” Approach (PD) • Convert TS into image (GADF) • Use Google’s pre-trained Inception v3 CNN • Embed into 2,048-dimensional vector space • Train MLP • 2 hidden layers (50 nodes each) • ReLU activation • Dropout for regularization (0.1, 0.2) • Softmax final layer
  • 17. Accuracies for a subset of UCR (chart): BOSS 91.1%, PD 89.8%, GADF+GASF+MTF 86.4%
  • 18. Accuracy on a subset of UCR (chart, accuracy axis from 68% to 86%). Methods compared: WEASEL, 1-NN DTW CV, 1-NN DTW, BOSS, Learning Shapelet (LS), TSBF, ST, EE (PROP), COTE (ensemble), PD
  • 19. Training Time Comparison (chart; PD labeled)
  • 21. Approaches and Data Set • Two recent approaches from the literature • Use an ESN (“Echo State Network”) to map MTS into state clouds [Wang, Wang, Liu 2015] • Use Dynamic Time Warping with Mahalanobis distance metric [Mei, Liu, Wang, Gao 2016] • Dataset is from UCI, a small subset of UCR and others • Number of series ~ 10K • Data points per series ~ 200
  • 22. Our “Off the Shelf” Approach (PD) • Make TS for each variable the same length by zero padding • Convert each TS into a GADF image • Interpolate any missing data points in the image using linear interpolation on the image • Stack the images for the five variables • Use the same process as before for univariate time series
  • 23. 5-Fold Cross Validation Error (chart, error axis from 0 to 30): MDDTW Best vs. PD 5-fold on Robot failure LP1 through LP5
  • 24. 10-Fold Cross Validation Error (chart, error axis from 0 to 30): Echo Network Best vs. PD 10-fold on Robot failure LP1 through LP5
  • 25. PD Data • Four variables: CPU, Virtual Memory, HDFS reads, Network Ops • Each time series collected over one week • 10 data points to 10K+ data points • Missing data • Extremely noisy • For periods longer than a week, data is much larger • Sampling rate is the same for all TS
  • 26. Accuracy per Label on PD Dataset G (chart, labels 1 to 17) • Number of TS = 3092 • Lengths per TS = 5 to 8500 • Average Accuracy = 78.14%
  • 27. Accuracy per Label on PD Dataset R (chart, labels 1 to 48) • Number of TS = 6715 • Lengths per TS = 5 to 9400 • Average Accuracy = 75.95%
  • 28. Summary • Our “Off the Shelf” approach is as good as the best approaches for both UTS and MTS, and the methodology is the same for both types of TS.
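
The Gramian Angular Field steps on slide 13 and the stacking step on slide 22 can be sketched with NumPy as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the GASF/GADF formulas follow the usual definitions (cosine of the angle sum, sine of the angle difference), and padding the raw values with zeros before rescaling is an assumption. The missing-data interpolation from slide 22 is left out.

```python
import numpy as np

def rescale(series):
    """Min-max rescale a 1-D series into [-1, 1]."""
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Assumption: a constant series is mapped to all zeros.
        return np.zeros_like(x)
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def gramian_angular_fields(series):
    """Return (GASF, GADF) images for a 1-D series."""
    x = np.clip(rescale(series), -1.0, 1.0)
    phi = np.arccos(x)  # angular coordinate of the polar encoding
    gasf = np.cos(phi[:, None] + phi[None, :])  # summation field
    gadf = np.sin(phi[:, None] - phi[None, :])  # difference field
    return gasf, gadf

def stack_gadf(variables):
    """Zero-pad each variable to a common length and stack the GADF images.

    Returns an array of shape (n_variables, length, length).
    """
    length = max(len(v) for v in variables)
    images = []
    for v in variables:
        padded = np.zeros(length)
        padded[:len(v)] = v  # assumption: pad at the end, before rescaling
        images.append(gramian_angular_fields(padded)[1])
    return np.stack(images)
```

For example, `stack_gadf([cpu, memory])` with series of lengths 3 and 5 yields a 2 × 5 × 5 array, one image per variable.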

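The classifier head described on slide 16 (2,048-dimensional embedding, two hidden layers of 50 nodes, ReLU, dropout, softmax) might look roughly like the NumPy forward pass below. It is a sketch under stated assumptions: the weights are random rather than trained, the number of classes is a placeholder, applying dropout rate 0.1 after the first hidden layer and 0.2 after the second is a guess at what "(.1, .2)" means, and the pre-trained CNN that produces the embedding is not shown.

```python
import numpy as np

def init_mlp(n_in=2048, n_hidden=50, n_classes=10, seed=0):
    """Random (untrained) weights for a 2-hidden-layer MLP; n_classes is hypothetical."""
    rng = np.random.default_rng(seed)
    sizes = [n_in, n_hidden, n_hidden, n_classes]
    return [
        (rng.normal(0.0, np.sqrt(2.0 / a), size=(a, b)), np.zeros(b))
        for a, b in zip(sizes[:-1], sizes[1:])
    ]

def forward(params, x, drop_rates=(0.1, 0.2), train=False, rng=None):
    """Map a batch of embeddings (n, n_in) to class probabilities (n, n_classes)."""
    if rng is None:
        rng = np.random.default_rng()
    h = np.asarray(x, dtype=float)
    for (w, b), rate in zip(params[:-1], drop_rates):
        h = np.maximum(h @ w + b, 0.0)  # ReLU activation
        if train:
            # Inverted dropout: zero units with probability `rate`, rescale the rest.
            keep = rng.random(h.shape) >= rate
            h = h * keep / (1.0 - rate)
    w, b = params[-1]
    logits = h @ w + b
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)  # softmax final layer
```

Dropout is active only when `train=True`; at inference time the network is deterministic for a given set of weights.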
Editor's Notes

  1. Expedia cluster, 3/21-3/24. https://beta-dashboard.pepperdata.com/expedia-chandler-prod/charts#s=2017/03/21-13:07&e=2017/03/24-13:07&tzo=-7&m=basic