Subutai Ahmad, VP of Research, Numenta at MLconf SF - 11/13/15

Real-time Anomaly Detection for Real-time Data Needs: Much of the world's data is becoming streaming, time-series data, where anomalies give significant information in often-critical situations. Examples abound in domains such as finance, IT, security, medical, and energy. Yet detecting anomalies in streaming data is a difficult task, requiring detectors to process data in real time, not batches, and learn while simultaneously making predictions. Are there algorithms up for the challenge? Which are the most capable? The Numenta Anomaly Benchmark (NAB) attempts to provide a controlled and repeatable environment of open-source tools to test and measure anomaly detection algorithms on streaming data. The perfect detector would detect all anomalies as soon as possible, trigger no false alarms, work with real-world time-series data across a variety of domains, and automatically adapt to changing statistics. These characteristics are formalized in NAB, using a custom scoring algorithm to evaluate the detectors on a benchmark dataset with labeled, real-world time-series data. We present these components, and describe the end-to-end scoring process. We give results and analyses for several algorithms to illustrate NAB in action. The goal for NAB is to provide a standard, open-source framework with which we can compare and evaluate different algorithms for detecting anomalies in streaming data.

EVALUATING REAL-TIME ANOMALY DETECTION: THE NUMENTA ANOMALY BENCHMARK
MLConf San Francisco, November 13, 2015
Subutai Ahmad, sahmad@numenta.com
REAL-TIME ANOMALY DETECTION
• Exponential growth in IoT, sensors, and real-time data collection is driving an explosion of streaming data
• The biggest application for machine learning is anomaly detection
• Examples: monitoring IT infrastructure, uncovering fraudulent transactions, tracking vehicles, real-time health monitoring, monitoring energy consumption
• Detection is necessary, but prevention is often the goal
EXAMPLE: PREVENTATIVE MAINTENANCE
(Figure: machine sensor readings showing a planned shutdown, a behavioral change preceding failure, and a catastrophic failure)
YET ANOTHER BENCHMARK?
• A benchmark consists of:
  • Labeled data sets
  • Scoring mechanism
  • Versioning system
• Most existing benchmarks are designed for batch data, not streaming data
• Hard to find benchmarks containing real-world data labeled with anomalies
• We saw a need for a benchmark designed to test anomaly detection algorithms on real-time, streaming data
• A standard community benchmark could spur innovation in real-time anomaly detection algorithms
NUMENTA ANOMALY BENCHMARK (NAB)
• NAB: a rigorous benchmark for anomaly detection in streaming applications
• Real-world benchmark data set
  • 58 labeled data streams (47 real-world, 11 artificial streams)
  • Total of 365,551 data points
• Scoring mechanism
  • Rewards early detection
  • Anomaly windows
  • Scoring function
  • Different "application profiles"
• Open resource
  • AGPL repository contains data, source code, and documentation
  • github.com/numenta/NAB
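Each NAB stream is a plain two-column CSV of timestamps and values, and a label marks an anomaly as a start/end window of timestamps. The sketch below is a simplified, hypothetical illustration of that layout (the data and the window are made up; the repository defines the exact file and label formats):

```python
import csv
import io
from datetime import datetime

# Hypothetical NAB-style stream: two columns, timestamp and value.
STREAM = """timestamp,value
2015-11-13 00:00:00,41.2
2015-11-13 00:05:00,40.8
2015-11-13 00:10:00,97.5
2015-11-13 00:15:00,41.0
"""

# Hypothetical anomaly-window label: a (start, end) timestamp pair.
WINDOW = ("2015-11-13 00:07:30", "2015-11-13 00:12:30")

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")

def in_window(ts, window):
    """True if a timestamp falls inside a labeled anomaly window."""
    start, end = (parse(t) for t in window)
    return start <= parse(ts) <= end

rows = list(csv.DictReader(io.StringIO(STREAM)))
flags = [in_window(r["timestamp"], WINDOW) for r in rows]
```

Here only the 97.5 reading falls inside the labeled window; a detector that fires there is a true positive, anywhere else a false positive.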
EXAMPLE: LOAD BALANCER HEALTH
(Figure: unusually high load balancer latency)
EXAMPLE: HOURLY SERVICE DEMAND
(Figure: a spike in demand and a period of unusually low demand)
EXAMPLE: PRODUCTION SERVER CPU
(Figure: a spike anomaly, after which spiking behavior becomes the new norm)
HOW SHOULD WE SCORE ANOMALIES?
• The perfect detector:
  • Detects every anomaly
  • Detects anomalies as soon as possible
  • Provides detections in real time
  • Triggers no false alarms
  • Requires no parameter tuning
  • Automatically adapts to changing statistics
• Scoring methods in traditional benchmarks are insufficient:
  • Precision/recall does not incorporate the importance of early detection
  • Artificial separation into training and test sets does not handle continuous learning
  • Batch data files allow look-ahead and multiple passes through the data
WHERE IS THE ANOMALY?
NAB DEFINES ANOMALY WINDOWS
SCORING FUNCTION
• The effect of each detection is scaled relative to its position within the window:
  • Detections outside the window are false positives (scored low)
  • Multiple detections within a window are ignored (only the earliest one counts)
• Total score is the sum of scaled detections plus a weighted sum of missed detections
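The position-dependent scaling above can be sketched with a scaled sigmoid in the spirit of NAB's scoring function. The steepness constant 5 and the exact normalization here are illustrative; the real definitions live in the paper and the repository:

```python
import math

def scaled_score(y):
    """Scaled sigmoid over the relative position y of a detection.

    y < 0: inside the anomaly window (earlier detection approaches +1)
    y = 0: right edge of the window (score exactly 0)
    y > 0: past the window, i.e. a false positive (approaches -1)
    The steepness constant 5 is an illustrative choice.
    """
    return 2.0 / (1.0 + math.exp(5.0 * y)) - 1.0

# Early in-window detections score near +1, late false positives near -1.
# Missed windows contribute a separate false-negative penalty when the
# per-stream totals are summed.
early = scaled_score(-3.0)
edge = scaled_score(0.0)
late = scaled_score(3.0)
```

This shape is what rewards early detection: the earlier inside the window a detector fires, the closer its contribution is to the full positive weight.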
OTHER DETAILS
• Application profiles
  • Three application profiles assign different weightings based on the tradeoff between false positives and false negatives: standard, favor low false positives, favor low false negatives
  • Example: for EKG data on a cardiac patient, a false positive is far preferable to a missed anomaly
  • Example: IT / DevOps professionals hate false positives
• NAB emulates practical real-time scenarios
  • No look-ahead allowed for algorithms; detections must be made on the fly
  • No separation between training and test files: invoke the model, start streaming, and go
  • No batch, per-dataset parameter tuning: a single, fully automated set of parameters across all datasets; any further tuning must happen on the fly
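One way to read the application profiles is as weight vectors over true positives, false positives, and false negatives. The weights below are hypothetical stand-ins (the real values are defined in the NAB repository's configuration), but they show how the same set of detections ranks differently under different profiles:

```python
# Hypothetical profile weights for illustration only; the actual values
# are defined in the NAB repository. Each profile reweights the cost of
# true positives (tp), false positives (fp), and false negatives (fn).
PROFILES = {
    "standard":      {"tp": 1.0, "fp": -0.11, "fn": -1.0},
    "low_false_pos": {"tp": 1.0, "fp": -0.22, "fn": -1.0},
    "low_false_neg": {"tp": 1.0, "fp": -0.11, "fn": -2.0},
}

def raw_score(counts, profile):
    """Weighted sum of detection outcomes under one application profile."""
    w = PROFILES[profile]
    return sum(counts[k] * w[k] for k in ("tp", "fp", "fn"))

# A chatty detector: 4 hits, 10 false alarms, 1 miss.
counts = {"tp": 4, "fp": 10, "fn": 1}
```

Under the "low_false_pos" profile the ten false alarms cost twice as much, so the same detector ranks worse than it does under "standard"; a profile that punishes misses more would instead penalize the single false negative harder.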
TESTING ALGORITHMS WITH NAB
• NAB is a community effort
  • The goal is to have researchers independently evaluate a large number of algorithms
  • Very easy to plug in and test new algorithms
• Seed results with three algorithms:
  • Hierarchical Temporal Memory (HTM)
    • Numenta's open source streaming anomaly detection algorithm
    • Models temporal sequences in data, continuously learning
  • Etsy Skyline
    • Popular open source anomaly detection technique
    • Mixture of statistical experts, continuously learning
  • Twitter ADVec
    • Open source anomaly detection released earlier this year
    • Robust outlier statistics plus piecewise approximation
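A toy illustration of the kind of detector NAB can evaluate: it consumes one point at a time, keeps running statistics, and emits an anomaly score with no look-ahead. This plain z-score sketch is not one of the three seed algorithms above, just the minimal shape of a streaming detector:

```python
import math

class RunningZScoreDetector:
    """Toy streaming detector: single pass, no look-ahead, learns as it goes."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations (Welford's online algorithm)

    def handle_record(self, value):
        # Score the new point against statistics learned so far...
        if self.n < 2:
            score = 0.0  # not enough history yet
        else:
            std = math.sqrt(self.m2 / (self.n - 1))
            z = abs(value - self.mean) / std if std > 0 else 0.0
            score = min(z / 5.0, 1.0)  # squash into [0, 1]
        # ...then update the statistics, so learning never stops.
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        return score

detector = RunningZScoreDetector()
stream = [10.0, 10.2, 9.9, 10.1, 10.0, 30.0, 10.1]
scores = [detector.handle_record(v) for v in stream]
```

The spike at 30.0 gets the maximum score while the surrounding points score low, and because the spike is folded into the running statistics, a detector like this adapts if spikes become the new norm.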
NAB V1.0 RESULTS (58 FILES)
DETECTION RESULTS: CPU USAGE ON PRODUCTION SERVER
(Figure: detections by Etsy Skyline, Numenta HTM, and Twitter ADVec; red denotes a false positive. All 3 algorithms detect a simple spike; a shift in usage follows.)
DETECTION RESULTS: MACHINE TEMPERATURE READINGS
(Figure: all 3 algorithms detect the catastrophic failure; only HTM detects a purely temporal anomaly. Red denotes a false positive.)
DETECTION RESULTS: TEMPORAL CHANGES IN BEHAVIOR OFTEN PRECEDE A LARGER SHIFT
(Figure: HTM detects the anomaly 3 hours earlier. Red denotes a false positive.)
SUMMARY
• Anomaly detection is the most common application for streaming analytics
• NAB is a community benchmark for streaming anomaly detection
  • Includes a labeled dataset with real data
  • Scoring methodology designed for practical real-time applications
  • Fully open source codebase
• What's next for NAB?
  • We hope to see researchers test additional algorithms
  • We hope to spark improved algorithms for streaming
  • More data sets! Could incorporate the UC Irvine dataset or the Yahoo Labs dataset (not open source); we would love to get more labeled streaming datasets from you
  • Add support for multivariate anomaly detection
NAB RESOURCES
Table 12 at MLConf
Repository: https://github.com/numenta/NAB
Paper: A. Lavin and S. Ahmad, "Evaluating Real-time Anomaly Detection Algorithms – the Numenta Anomaly Benchmark," to appear in the 14th International Conference on Machine Learning and Applications (IEEE ICMLA '15), 2015. Preprint: http://arxiv.org/abs/1510.03336
Contact: sahmad@numenta.com, alavin@numenta.com
THANK YOU! QUESTIONS?
