Tracking the tracker:
Time Series Analysis in Python From First
Principles
Kenneth Emeka Odoh
PyCon APAC
@National University of Singapore
Computing 1 (COM1) / Level 2
13 Computing Drive Singapore 117417
May 31st, 2018 - June 2nd, 2018
Talk Plan
● Bio
● Motivations & Goals
● Theory
● Application: Forecasting
● Application: Anomaly Detection
● Deployment
● Lessons
● Conclusion
Bio
Education
● Masters in Computer Science (University of Regina) 2013 - 2016
● A number of MOOCs in statistics, algorithms, and machine learning
Work
● Research assistant in the field of Visual Analytics 2013 - 2016
○ ( https://scholar.google.com/citations?user=3G2ncgwAAAAJ&hl=en )
● Software Engineer in Vancouver, Canada 2017 -
● Regular speaker at a number of Vancouver AI meetups; organizer of a distributed systems meetup,
a Haskell study group, and an Applied Cryptography study group 2017 -
This talk focuses on two applications in time series analysis:
● Forecasting
○ Can I use the past to predict the future?
● Anomaly Detection
○ Is a given point normal or abnormal?
"Simplicity is the ultimate sophistication"--- Leonardo da Vinci
Motivation
Figure 1: Pole vault showing time series of position
[http://vaultermagazine.com/middle-school-womens-pole-vault-rankings-from-athletic-net-5/]
Figure 2: Missile Defence System
[http://www.bbc.com/news/world-middle-east-43730068]
Figure 3: Bitcoin time series [Yahoo Finance]
Goals
● A real-time model for time series analysis.
● A less computationally intensive model, and therefore a more scalable one.
● Support for multivariate and univariate time series data.
introducing
pySmooth
( https://github.com/kenluck2001/pySmooth)
and
Anomaly
( https://github.com/kenluck2001/anomaly)
Application: Forecasting
Let’s dive into the maths
for some minutes
[http://www.boom-studios.com/2017/12/01/20th-century-fox-buys-girl-detective-adventure-movie-goldie-vance/]
Theory
● Time Series Data
○ $Y_t$, where $t = 1, 2, 3, \dots, n$; $t$ is the index and $n$ is the number of points.
● Time Series Decomposition
$Y_t = V_t + S_t + H_t + \epsilon_t$
Sometimes the relation may be in multiplicative form.
Here $Y_t$ is the predicted time series, $V_t$ is the trend, $S_t$ is the cyclic component, $H_t$ is the holiday component, and $\epsilon_t$ is the residual.
● pth-Order Difference Equation, Autoregressive, AR(p)
$\mathrm{AR}(p) = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \epsilon_t$
where $\phi_i$, for $i = 1, \dots, p$, are the parameters and $\epsilon_t$ is the noise. See EWMA for anomaly detection as an example of AR.
● Moving Average Process, MA(q)
$\mathrm{MA}(q) = \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}$
where $\theta_i$, for $i = 1, \dots, q$, are the parameters and $\epsilon_t$ is the noise.
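The two recursions above can be sketched in a few lines of plain Python. This is only an illustration: the function names `ar_forecast` and `ma_value` and the coefficient values are assumptions made for the example, not part of pySmooth, and the noise term of the AR forecast is left out because it is unknown at prediction time.

```python
def ar_forecast(history, phi):
    """One-step AR(p) forecast: phi_1*Y[t-1] + ... + phi_p*Y[t-p] (noise omitted)."""
    p = len(phi)
    if len(history) < p:
        raise ValueError("need at least p past values")
    # history[-1] is Y[t-1], history[-2] is Y[t-2], and so on.
    return sum(phi[i] * history[-1 - i] for i in range(p))


def ma_value(shocks, theta):
    """MA(q) value as written above: theta_1*e[t-1] + ... + theta_q*e[t-q]."""
    q = len(theta)
    if len(shocks) < q:
        raise ValueError("need at least q past noise terms")
    # shocks[-1] is e[t-1], shocks[-2] is e[t-2], and so on.
    return sum(theta[j] * shocks[-1 - j] for j in range(q))


# Hypothetical coefficients and data, chosen only to make the arithmetic easy.
forecast = ar_forecast([1.0, 2.0, 4.0], phi=[0.5, 0.25])  # 0.5*4.0 + 0.25*2.0
smoothed = ma_value([0.2, -0.4], theta=[0.5, 0.5])        # 0.5*(-0.4) + 0.5*0.2
```

An ARMA(p, q) value is the sum of the two parts plus the current noise term.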
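● Autoregressive Moving Average, ARMA
$\mathrm{ARMA}(p, q) = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}$
where $p$, $q$, $\phi_i$, and $\theta_i$ are the parameters.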
● Autoregressive Integrated Moving Average, ARIMA
ARIMA(p, d, q) = ARMA(p+d,q)
See Box-Jenkins methodology for choosing good parameters.
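The "integrated" part of ARIMA refers to differencing the series d times before fitting the ARMA part. A minimal sketch of that step, with hypothetical helper names `difference` and `integrate` that are not taken from pySmooth:

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y[t] - y[t-1]."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def integrate(first_value, diffs):
    """Undo one round of differencing by cumulative summation."""
    out = [first_value]
    for step in diffs:
        out.append(out[-1] + step)
    return out


squares = [1, 4, 9, 16, 25]
once = difference(squares, d=1)    # linear trend remains
twice = difference(squares, d=2)   # constant: the quadratic trend is gone
restored = integrate(squares[0], once)
```

Each round of differencing shortens the series by one point, so forecasts made on the differenced scale have to be integrated back to the original scale.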
● ARCH
● GARCH (related to the exponentially weighted moving average). See the section on anomaly detection.
● NGARCH
● IGARCH
● EGARCH
…
For more information,
see link ( https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity )
For more practical demonstration, see
(https://www.kaggle.com/jagangupta/time-series-basics-exploring-traditional-ts)
Why Kalman Filters?
● Allows for using measurement signals to augment the prediction of
your next states. State is the target variable.
● Allows for correction using the correct state from the recent past step.
● Allows for online prediction in real time.
It is good to note the following:
● How to identify the appropriate measurement signals?
● Measurement captures the trend of the state.
● Modeling can include domain knowledge, e.g. physics.
Figure 4: How the Kalman filter works
● Kalman filter
where x, y, F, n, and v are the state, the measurement, the model function, the measurement noise, and
the state noise, respectively.
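To make the predict and update cycle concrete, here is a minimal scalar sketch. It assumes a linear model that is not taken from the slides: x[t] = a*x[t-1] + v and y[t] = x[t] + n, with state-noise variance q and measurement-noise variance r. The function name `kalman_step` and all numeric values are hypothetical.

```python
def kalman_step(x, p, y, a=1.0, q=1e-4, r=0.1):
    """One predict/update cycle of a scalar Kalman filter.

    Assumed model: x[t] = a*x[t-1] + v and y[t] = x[t] + n,
    with Var(v) = q (state noise) and Var(n) = r (measurement noise).
    """
    # Predict: push the previous estimate and its variance through the model.
    x_pred = a * x
    p_pred = a * p * a + q
    if y is None:
        # Missing measurement: keep the prediction, uncertainty grows by q.
        return x_pred, p_pred
    # Update: blend prediction and measurement using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (y - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# Track a constant level of 5.0 starting from a poor initial guess.
estimate, variance = 0.0, 1.0
measurements = [5.0] * 20 + [None] + [5.0] * 29
for y in measurements:
    estimate, variance = kalman_step(estimate, variance, y)
```

The gain shrinks as the variance shrinks, so early measurements move the estimate a lot and later ones only nudge it.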
Kalman formulation allows for
● Handling missing data.
● Data fusion
It is a state space model.
Continuous form of the forward algorithm of an HMM [Andrew Fraser, 2008]. [Julier, 2002]
Discrete Kalman Filter (DKF)
● Captures linear relationships in the data.
● Fluctuates around the mean and covariance.
● Eventually converges as the data increases, thereby improving filter performance (Bayesian).
In the presence of a nonlinear relationship, the error in the posterior estimates increases, leading to
suboptimal filter performance (underfitting).
[Welch & Bishop, 2006]
Extended Kalman Filter (EKF)
● Makes use of Jacobians and Hessians.
○ Increased computation cost.
○ Needs a differentiable nonlinear function.
● Linearizes the nonlinear equation using a Taylor series to 1st order.
● Same computational complexity as the unscented Kalman filter.
● Captures nonlinear relationships in the data.
The limited order of the Taylor series can lead to error in the posterior estimates, thereby leading
to suboptimal filter performance.
[Welch & Bishop, 2006]
Unscented Kalman Filter (UKF)
● Better sampling to capture more representative characteristics of the model.
○ By using sigma points.
○ Sigma points are extra data points chosen within the region surrounding the original data. They help to
account for variability by capturing the likely positions the data could take under some perturbation. Sampling
these points provides richer information about the distribution of the data. It is more ergodic.
● Avoids the use of Jacobians and Hessians.
○ Possible to use any form of nonlinear function (not only differentiable functions).
● Approximates the nonlinear function to the 3rd order of the Taylor series.
● Same computational efficiency as the EKF.
[Julier, 2002]
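The sigma-point idea can be tried out in one dimension. The sketch below uses the basic (unscaled) unscented transform with a spread parameter kappa, which is a simplification of the scaled variant in [Julier, 2002]; the function name, the test function x*x, and the input mean and variance are assumptions chosen for illustration.

```python
import math


def unscented_transform_1d(f, mean, var, kappa=2.0):
    """Propagate a scalar mean and variance through f using three sigma points."""
    n = 1  # state dimension
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    ys = [f(x) for x in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var


# Input: mean 1.0, variance 0.25, pushed through the nonlinearity x*x.
ut_mean, ut_var = unscented_transform_1d(lambda x: x * x, mean=1.0, var=0.25)

# A 1st-order linearization around the mean propagates only f(mean).
linearized_mean = 1.0 * 1.0
```

For a Gaussian input with this mean and variance, the exact mean of x*x is mean**2 + var = 1.25. The sigma points recover it, while the first-order linearization reports 1.0 and misses the contribution of the variance.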
Figure 5: Geometric interpretation of covariance matrix
For more info,
[http://www.visiondummy.com/2014/04/geometric-interpretation-covariance-matrix/]
Figure 6: Actual sampling vs. extended Kalman filter and unscented Kalman filter [Julier, 2002]
Another kind of Kalman filter
● Particle filtering
See more tweaks that can be made in the Kalman filter formulation
[https://users.aalto.fi/~ssarkka/pub/cup_book_online_20131111.pdf]
● RLS (Recursive Linear Regression)
Initial model at time t, with an update as new data arrives at time t+1.
$y = \theta(t)\, x_t$
At time t+1, we have data $x_{t+1}$ and $y_{t+1}$, and estimate $\theta(t+1)$ in an incremental manner [Wellstead & Karl, 1991].
[Sherman and Morrison formula]
https://en.wikipedia.org/wiki/Sherman%E2%80%93Morrison_formula
[See closed form of ]
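The incremental estimate can be sketched with the textbook recursive least squares recursion, where the Sherman-Morrison formula turns the matrix inverse into a rank-one update. This is a generic sketch and not the code inside pySmooth; the function name `rls_update`, the forgetting factor `lam`, the initial covariance scale, and the data are all assumptions.

```python
def rls_update(theta, P, x, y, lam=1.0):
    """One recursive least squares step for the model y = theta . x.

    P approximates the inverse of the accumulated X^T X; it is updated
    with a rank-one Sherman-Morrison correction instead of a fresh inverse.
    """
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    error = y - sum(theta[i] * x[i] for i in range(n))
    theta = [theta[i] + gain[i] * error for i in range(n)]
    # P stays symmetric, so x^T P equals (P x)^T.
    P = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(n)] for i in range(n)]
    return theta, P


# Hypothetical noise-free data generated from theta = [2, -3].
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -3.0), ([1.0, 1.0], -1.0),
           ([2.0, 1.0], 1.0), ([1.0, 3.0], -7.0)]
theta = [0.0, 0.0]
P = [[1e4, 0.0], [0.0, 1e4]]  # large initial P means a weak prior on theta
for x, y in samples:
    theta, P = rls_update(theta, P, x, y)
```

Setting `lam` below 1 discounts old samples, which is one way to let the model follow a drifting series.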
Online ARIMA
RLS + 'Vanilla' ARIMA [Kenneth Odoh]
Box-Jenkins Methodology
● Transform the data so that a form of stationarity is attained.
○ E.g. taking the logarithm of the data, or other forms of scaling.
● Guess values of p, d, and q for ARIMA(p, d, q).
● Estimate the parameters needed for future prediction.
● Perform diagnostics using AIC, BIC, or REC (Regression Error Characteristic) curves [Jinbo &
Kristin, 2003].
Probably, a generalization of Occam’s Razor
[https://www.tass.gov.uk/about/facts-figures/]
Stationarity
● Constant statistical properties over time
○ Joint probability distribution is shift-invariant
○ Can occur in different moments (mean, variance, skewness, and kurtosis)
● Why is it useful?
● Mean reverting.
● Markovian assumption
○ Future is independent of the past given the present
For a sequence of states (A, B, C, D), the joint probability factorizes by the chain rule, and then simplifies under the Markov assumption:
P (A, B, C,D) = P (D | A, B, C) * P (C | A, B) * P (B | A) * P (A)
= P (D | C) * P(C | B) * P (B | A) * P (A)
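Under the Markov assumption the joint probability of a sequence needs only the prior and the one-step transition probabilities. A small sketch with made-up numbers (the `prior` and `transition` tables are hypothetical):

```python
def markov_joint_probability(sequence, prior, transition):
    """P(s1, ..., sk) = P(s1) * product of P(s[i+1] | s[i])."""
    prob = prior[sequence[0]]
    for prev, curr in zip(sequence, sequence[1:]):
        prob *= transition[prev][curr]
    return prob


prior = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
transition = {
    "A": {"A": 0.5, "B": 0.5, "C": 0.0, "D": 0.0},
    "B": {"A": 0.25, "B": 0.5, "C": 0.25, "D": 0.0},
    "C": {"A": 0.0, "B": 0.25, "C": 0.25, "D": 0.5},
    "D": {"A": 0.0, "B": 0.0, "C": 0.5, "D": 0.5},
}

# P(A) * P(B | A) * P(C | B) * P(D | C)
p_abcd = markov_joint_probability("ABCD", prior, transition)
```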
● n: number of states, m: number of observations
● S: set of states, Y: set of measurements
■ $S = \{s_1, s_2, \dots, s_n\}$, $Y = \{y_1, y_2, \dots, y_m\}$
■ $P_{S(1)} = [p_1, p_2, \dots, p_n]$ -- prior probability
■ $T = P(s(t+1) \mid s(t))$ --- transition probability
■ $P_{S(t+1)} = P_{S(t)} \cdot T$ --- stationarity
● Mapping states to measurement, $P(y \mid s)$
● Makes it easy to compose more relationships using Bayes' theorem in the Kalman filtering
formulation (continuous states and measurements).
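The relation between the state distribution at t and t+1 can be iterated until it stops changing, which gives the stationary distribution of the chain. A minimal sketch with a hypothetical two-state transition matrix:

```python
def step(dist, T):
    """Row vector times matrix: the state distribution one time step later."""
    n = len(dist)
    return [sum(dist[i] * T[i][j] for i in range(n)) for j in range(n)]


def stationary_distribution(prior, T, iterations=200):
    """Repeatedly apply the transition matrix to the prior."""
    dist = list(prior)
    for _ in range(iterations):
        dist = step(dist, T)
    return dist


T = [[0.9, 0.1],   # transitions out of state 0
     [0.5, 0.5]]   # transitions out of state 1
pi = stationary_distribution([1.0, 0.0], T)
```

For this matrix the balance condition 0.1 * pi[0] = 0.5 * pi[1] gives pi = [5/6, 1/6], and applying `step` once more leaves it unchanged.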
● There are methods for converting data from non-stationary to
stationary
( implicit in algorithm formulation or explicit in data preprocessing )
● Detrending
● Differencing
● Avoid using algorithms that assume stationarity on non-stationary data.
● Testing for stationarity
● Formally, use a unit root test [https://faculty.washington.edu/ezivot/econ584/notes/unitroot.pdf]
● Informally, compare means or other statistical measures between disjoint batches.
For a fuller treatment on stationarity, see [http://www.phdeconomics.sssup.it/documents/Lesson4.pdf]
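The informal check can be sketched as follows: split the series into disjoint batches, compute each batch mean, and look at how far apart the means are. The helper names, the number of batches, and the toy series are assumptions; this is not a substitute for a unit root test.

```python
def batch_means(series, batches=4):
    """Means of equally sized, disjoint, consecutive batches."""
    size = len(series) // batches
    if size == 0:
        raise ValueError("series is too short for this many batches")
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(batches)]


def mean_spread(series, batches=4):
    """Gap between the largest and smallest batch mean."""
    means = batch_means(series, batches)
    return max(means) - min(means)


flat = [1.0, 2.0] * 50                      # oscillates around a fixed level
trend = [float(t) for t in range(100)]      # mean keeps growing
detrended = [b - a for a, b in zip(trend, trend[1:])]  # first difference

flat_spread = mean_spread(flat)
trend_spread = mean_spread(trend)
detrended_spread = mean_spread(detrended)
```

A spread that is large relative to the scale of the data hints at non-stationarity in the mean, and differencing the trending series brings the spread back to zero.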
Why stocks are hard to predict
● If the efficient market hypothesis holds, then the stock time series, $X_t$, is i.i.d.
This means that the past cannot be used to predict the future.
$P(X_t \mid X_{t-1}) = P(X_t)$
● Efficient market hypothesis
● Stocks are always at the right price or value
● Cannot profit by purchasing undervalued or overvalued asset.
● Stock selection is not guaranteed to increase returns.
● Profit can only be achieved by seeking out riskier investments.
In summary, it is “impossible to beat the market”. Very Controversial
● "Essentially, all models are wrong, but some are useful." --- George Box
○ As such, there is still hope with prediction, albeit a faint one.
○ Avoid transient spurious correlation. (Domain knowledge, causal inference, & statistical models ).
Other Things to Consider in your Model
● Outliers
● Collinearity
● Heteroscedasticity
● Underfitting vs overfitting
API
● Time difference model
● Online ARIMA
● Discrete kalman filter
● Extended kalman filter
● Unscented kalman filter
API: Online ARIMA
import numpy as np
from RecursiveARIMA import RecursiveARIMA
X = np.random.rand(10, 5)   # initial batch: 10 rows, 5 columns
recArimaObj = RecursiveARIMA(p=6, d=0, q=6)
recArimaObj.init(X)
x = np.random.rand(1, 5)    # one new row
recArimaObj.update(x)
print("Next state")
print(recArimaObj.predict())
API: Discrete Kalman Filter
import numpy as np
from DiscreteKalmanFilter import DiscreteKalmanFilter
X = np.random.rand(2, 2)
Z = np.random.rand(2, 2)
dkf = DiscreteKalmanFilter()
dkf.init(X, Z)  # training phase
dkf.update()
x = np.random.rand(1, 2)
z = np.random.rand(1, 2)
print("Next state")
print(dkf.predict(x, z))
API: Extended Kalman Filter
import numpy as np
from ExtendedKalmanFilter import ExtendedKalmanFilter
X = np.random.rand(2, 2)
Z = np.random.rand(2, 15)
ekf = ExtendedKalmanFilter()
ekf.init(X, Z)  # training phase
ekf.update()
x = np.random.rand(1, 2)
z = np.random.rand(1, 15)
print("Next state")
print(ekf.predict(x, z))
API: Unscented Kalman filter
import numpy as np
from UnscentedKalmanFilter import UnscentedKalmanFilter
X = np.random.rand(2, 2)
Z = np.random.rand(2, 15)
ukf = UnscentedKalmanFilter()
ukf.init(X, Z)  # training phase
ukf.update()
x = np.random.rand(1, 2)
z = np.random.rand(1, 15)
print("Next state")
print(ukf.predict(x, z))
Other Approaches
● LSTM / RNN
● HMM
● Traditional ML methods
Popular libraries: a number of R packages, Facebook's Prophet
Application: Anomaly Detection
Anomaly Detection
It is ideal for unbalanced data sets
● A small number of negative samples and a large number of positive samples, or vice versa.
What is normal data?
● This is the holy grail of a number of anomaly detection methods.
Figure 7: Types of signal changes: abrupt transient shift (A), abrupt distributional shift (B), and
gradual distributional shift (C) [Kevin & William, 2012].
Data Stream
● Time and space constraints.
● Online algorithms
○ Detecting concept drift.
○ Forgetting unnecessary history.
○ Revision of model after significant change in distribution.
● Time delay to prediction.
Handling Data Stream
There are a number of window techniques :
● Fixed window
● Adaptive window (ADWIN)
● Landmark window
● Damped window
[Zhu & Shasha, 2002]
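Three of these windows can be sketched as running means over a stream. The function names, window size, landmark index, and decay value are assumptions for illustration; ADWIN is left out because its adaptive cut logic does not fit in a short example.

```python
from collections import deque


def fixed_window_mean(stream, size):
    """Mean over only the most recent `size` items."""
    window = deque(maxlen=size)  # older items fall off automatically
    for x in stream:
        window.append(x)
    return sum(window) / len(window)


def landmark_window_mean(stream, landmark):
    """Mean over everything from the landmark index up to now."""
    tail = list(stream)[landmark:]
    return sum(tail) / len(tail)


def damped_window_mean(stream, decay):
    """Exponentially damped mean: older items count for less and less."""
    items = iter(stream)
    mean = next(items)
    for x in items:
        mean = decay * mean + (1.0 - decay) * x
    return mean


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fixed = fixed_window_mean(data, size=3)            # mean of 4, 5, 6
landmark = landmark_window_mean(data, landmark=2)  # mean of 3, 4, 5, 6
damped = damped_window_mean(data, decay=0.5)
```

The damped window needs constant memory, which is why it pairs naturally with the exponentially weighted estimators used for anomaly detection.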
Basic Primer on Statistics
Expected value, E[X], of X is the weighted average of the values that X can take, with each value
weighted according to the probability of the event occurring. This is the long-run mean.
Variance, Var[X], is the square of the standard deviation. It is the expectation of the squared
deviation of X from its mean, and it measures the spread of the data.
$\mathrm{Var}[X] = E[X^2] - (E[X])^2$
Standard deviation, $\sigma$, is a measure of dispersion.
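As a quick check of the variance identity, the sketch below computes both forms for a fair six-sided die using exact fractions. The `expectation` helper is hypothetical and introduced only for this example.

```python
from fractions import Fraction


def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(p * g(x) for x, p in pmf.items())


die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = expectation(die)                          # E[X]
second_moment = expectation(die, lambda x: x * x)  # E[X^2]
variance = second_moment - mean ** 2             # E[X^2] - (E[X])^2
variance_direct = expectation(die, lambda x: (x - mean) ** 2)
std_dev = float(variance) ** 0.5
```

Both routes give the same variance, which is why a streaming algorithm only has to maintain running estimates of E[X] and E[X^2].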
Algorithm 1: Probabilistic Exponentially Weighted Moving Average
[Kevin & William, 2012].
Probabilistic Exponentially Weighted Moving Average (PEWMA)
● Retains the forgetting parameter of EWMA and adds an extra parameter to include the probability of the
data.
● Works with every kind of shift.
Exponentially Weighted Moving Average (EWMA)
● Adds a forgetting parameter to weight recent items more or less.
● Volatile to abrupt transient change and bad with distributional shifts.
Annotations on Algorithm 1: weighted moving average; weighted moving E(X^2); moving standard deviation;
weighting factors; normal distribution pdf; Z-score.
Figure 8: Normal distribution curve, with the threshold marked.
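The quantities annotated on Algorithm 1 can be put together in a short sketch. It follows my reading of the PEWMA update in Carter & Streilein (2012) and is not the code of the anomaly library: the class name, the default values of `alpha`, `beta`, `training`, and `threshold`, and the symmetric threshold rule are all assumptions.

```python
import math


class PEWMA:
    """Sketch of a probabilistic EWMA anomaly detector (assumed defaults)."""

    def __init__(self, alpha=0.98, beta=0.5, training=10, threshold=3.0):
        self.alpha = alpha          # forgetting parameter (weight on history)
        self.beta = beta            # how strongly probability adjusts alpha
        self.training = training    # number of warm-up points
        self.threshold = threshold  # |z| above this is flagged
        self.t = 0
        self.s1 = 0.0               # weighted moving average, E[X]
        self.s2 = 0.0               # weighted moving E[X^2]

    def update(self, x):
        """Score x against the current estimates, then fold it in."""
        self.t += 1
        mean = self.s1
        std = math.sqrt(max(self.s2 - self.s1 ** 2, 0.0))  # moving std dev
        z = (x - mean) / std if std > 0 else 0.0           # Z-score
        p = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # normal pdf
        if self.t <= self.training:
            a = 1.0 - 1.0 / self.t           # plain running average at first
        else:
            a = (1.0 - self.beta * p) * self.alpha  # weighting factor
        self.s1 = a * self.s1 + (1.0 - a) * x
        self.s2 = a * self.s2 + (1.0 - a) * x * x
        is_anomaly = self.t > self.training and abs(z) > self.threshold
        return z, is_anomaly


# Hypothetical stream: small wobble around 10.0, then one spike.
stream = [10.0 + 0.1 * ((i % 5) - 2) for i in range(50)] + [25.0]
detector = PEWMA()
results = [detector.update(x) for x in stream]
zscores = [z for z, _ in results]
flags = [flag for _, flag in results]
```

A one-sided detector would compare z itself, rather than abs(z), against the threshold.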
Anomaly: A machine learning library for
Anomaly detection
● A new library in Python named anomaly
○ ( https://github.com/kenluck2001/anomaly )
○ Written by Kenneth Emeka Odoh
● It makes use of a Redis store to ensure persistence of summary
parameters as new data arrives.
● It is capable of handling multiple data streams with time series data.
● The current implementation does not support multivariate time series.
Exercise [ Group Activity ]
● Modify Algorithm 1 to work for the
multivariate case.
● Customize the threshold (symmetric vs.
one-sided outlier).
Deployment
This aims to show a streaming architecture for serving content.
The goal is that the deployment should not be AWS-specific, to avoid being locked in to a
proprietary vendor.
● Flask
● Redis
● Gevent
● Scikit-learn / keras / numpy
● PostgreSQL
Machine learning as a service
● Web Server
○ Nginx (web server), Flask (lightweight web framework), Gunicorn (WSGI server; balances load across
worker processes), and Gevent (non-blocking I/O)
● Caching Server
○ Redis (as a transportation medium and caching)
● ML Server
○ Custom algorithm and third party libraries
● Database
○ PostgreSQL
Figure 9: An example of a deployment architecture.
Database
● Authentication
● Metadata
● User management
○ Subscription
○ Managing rate limits
Other Considerations
● Rather than serving JSON from your REST API, consider using Google Protocol Buffers (protobuf).
● Use streaming frameworks like Apache Kafka for building the pipeline.
● Flask is an easy-to-use web framework; consider it for an MVP.
● Use a load balancer and a replicated database.
● Keep your models under version control to enhance reproducibility.
○ If necessary, map each prediction interval to a model.
Lessons
● "Prediction is very difficult, especially if it's about the future." -- Niels Bohr.
● "If you make a great number of predictions, the ones that were wrong
will soon be forgotten, and the ones that turn out to be true will make
you famous." -- Malcolm Gladwell.
● "I have seen the future and it is very much like the present, only
longer." -- Kehlog Albran.
Conclusion
● Recursive ARIMA
○ Not ideal for chaotic or irregularly spaced time series.
● Higher-order moments are necessary in the model, making the unscented Kalman filter
ideal for chaotic time series.
● The time difference model tends to overfit.
● MLE-based methods tend to underestimate the variance.
● ARIMA requires domain knowledge to choose ideal parameters.
● Regularizers can be used in the RLS to reduce overfitting.
● Kalman gain ⇔ damped window
Thanks for listening
@kenluck2001
Kenneth Emeka Odoh
https://www.linkedin.com/in/kenluck2001/
kenneth.odoh@gmail.com
https://github.com/kenluck2001?tab=repositories
References
1. Castanon, D., & Karl, C. W. SC505: Stochastic processes. Retrieved 06/15, 2017, from http://www.mit.edu/people/hmsallum/GradSchool/sc505notes.pdf
2. Wellstead, P. E. & Karl, C. W. (1991). Self-tuning Systems: Control and Signal Processing. Chichester, United Kingdom: John Wiley & Sons Ltd.
3. Hamilton, J. D. (1994). Time series analysis. Princeton, NJ: Princeton University Press.
4. Julier, S. J. The scaled unscented transformation. Retrieved 06/15, 2017, from https://www.cs.unc.edu/~welch/kalman/media/pdf/ACC02-IEEE1357.PDF
5. Terejanu, G. A. Extended Kalman filter tutorial. Retrieved 06/15, 2017, from https://www.cse.sc.edu/~terejanu/files/tutorialEKF.pdf
6. Wan, E. A., & Merwe, R. (2000). The unscented Kalman filters for nonlinear estimation. IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium, pp. 153-158.
7. Kevin M. Carter & William W. Streilein (2012). Probabilistic reasoning for streaming anomaly detection. In Proceedings of the Statistical Signal Processing Workshop, pp. 377–380.
8. Y. Zhu & D. Shasha (2002). Statstream: Statistical monitoring of thousands of data streams in real time. In Proceedings of the 28th international conference on Very Large Data Bases, 358–369. VLDB Endowment.
9. Welch, G., & Bishop, G. An introduction to the Kalman filter. Retrieved 06/15, 2017, from https://www.cs.unc.edu/~welch/media/pdf/kalman_intro.pdf
10. Andrew Fraser (2008). Hidden Markov models and Dynamical System. Retrieved 06/15, 2017, from http://www.siam.org/books/ot107/
11. Jinbo Bi & Kristin P. Bennett (2003). Regression Error Characteristic Curves. Retrieved 06/15, 2017, from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.566.4289&rep=rep1&type=pdf

Mais conteúdo relacionado

Mais procurados

Data Structures and Algorithm - Week 3 - Stacks and Queues
Data Structures and Algorithm - Week 3 - Stacks and QueuesData Structures and Algorithm - Week 3 - Stacks and Queues
Data Structures and Algorithm - Week 3 - Stacks and QueuesFerdin Joe John Joseph PhD
 
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...ijcsa
 
Data Structures and Algorithm - Week 9 - Search Algorithms
Data Structures and Algorithm - Week 9 - Search AlgorithmsData Structures and Algorithm - Week 9 - Search Algorithms
Data Structures and Algorithm - Week 9 - Search AlgorithmsFerdin Joe John Joseph PhD
 
Data Structures and Algorithm - Week 4 - Trees, Binary Trees
Data Structures and Algorithm - Week 4 - Trees, Binary TreesData Structures and Algorithm - Week 4 - Trees, Binary Trees
Data Structures and Algorithm - Week 4 - Trees, Binary TreesFerdin Joe John Joseph PhD
 
Data Structures and Algorithm - Week 8 - Minimum Spanning Trees
Data Structures and Algorithm - Week 8 - Minimum Spanning TreesData Structures and Algorithm - Week 8 - Minimum Spanning Trees
Data Structures and Algorithm - Week 8 - Minimum Spanning TreesFerdin Joe John Joseph PhD
 
Optimization problems and algorithms
Optimization problems and  algorithmsOptimization problems and  algorithms
Optimization problems and algorithmsAboul Ella Hassanien
 
Beginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBenjamin Bengfort
 
Data Structures and Algorithm - Week 5 - AVL Trees
Data Structures and Algorithm - Week 5 - AVL TreesData Structures and Algorithm - Week 5 - AVL Trees
Data Structures and Algorithm - Week 5 - AVL TreesFerdin Joe John Joseph PhD
 
A Variant of Modified Diminishing Increment Sorting: Circlesort and its Perfo...
A Variant of Modified Diminishing Increment Sorting: Circlesort and its Perfo...A Variant of Modified Diminishing Increment Sorting: Circlesort and its Perfo...
A Variant of Modified Diminishing Increment Sorting: Circlesort and its Perfo...CSCJournals
 
Comparative Performance Analysis & Complexity of Different Sorting Algorithm
Comparative Performance Analysis & Complexity of Different Sorting AlgorithmComparative Performance Analysis & Complexity of Different Sorting Algorithm
Comparative Performance Analysis & Complexity of Different Sorting Algorithmpaperpublications3
 
Incremental and Multi-feature Tensor Subspace Learning applied for Background...
Incremental and Multi-feature Tensor Subspace Learning applied for Background...Incremental and Multi-feature Tensor Subspace Learning applied for Background...
Incremental and Multi-feature Tensor Subspace Learning applied for Background...ActiveEon
 
Chap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsChap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsYoung-Geun Choi
 
Artificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesArtificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesDr. C.V. Suresh Babu
 
presentation
presentationpresentation
presentationjie ren
 
Algorithm analysis in fundamentals of data structure
Algorithm analysis in fundamentals of data structureAlgorithm analysis in fundamentals of data structure
Algorithm analysis in fundamentals of data structureVrushali Dhanokar
 
Matrix Factorisation (and Dimensionality Reduction)
Matrix Factorisation (and Dimensionality Reduction)Matrix Factorisation (and Dimensionality Reduction)
Matrix Factorisation (and Dimensionality Reduction)HJ van Veen
 

Mais procurados (20)

Data Structures and Algorithm - Week 3 - Stacks and Queues
Data Structures and Algorithm - Week 3 - Stacks and QueuesData Structures and Algorithm - Week 3 - Stacks and Queues
Data Structures and Algorithm - Week 3 - Stacks and Queues
 
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
 
Data Structures and Algorithm - Week 9 - Search Algorithms
Data Structures and Algorithm - Week 9 - Search AlgorithmsData Structures and Algorithm - Week 9 - Search Algorithms
Data Structures and Algorithm - Week 9 - Search Algorithms
 
Data Structures and Algorithm - Week 4 - Trees, Binary Trees
Data Structures and Algorithm - Week 4 - Trees, Binary TreesData Structures and Algorithm - Week 4 - Trees, Binary Trees
Data Structures and Algorithm - Week 4 - Trees, Binary Trees
 
Data Structures and Algorithm - Week 8 - Minimum Spanning Trees
Data Structures and Algorithm - Week 8 - Minimum Spanning TreesData Structures and Algorithm - Week 8 - Minimum Spanning Trees
Data Structures and Algorithm - Week 8 - Minimum Spanning Trees
 
Optimization problems and algorithms
Optimization problems and  algorithmsOptimization problems and  algorithms
Optimization problems and algorithms
 
Beginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix Factorization
 
Data Structures and Algorithm - Week 5 - AVL Trees
Data Structures and Algorithm - Week 5 - AVL TreesData Structures and Algorithm - Week 5 - AVL Trees
Data Structures and Algorithm - Week 5 - AVL Trees
 
Introduction to data structure and algorithms
Introduction to data structure and algorithmsIntroduction to data structure and algorithms
Introduction to data structure and algorithms
 
A Variant of Modified Diminishing Increment Sorting: Circlesort and its Perfo...
A Variant of Modified Diminishing Increment Sorting: Circlesort and its Perfo...A Variant of Modified Diminishing Increment Sorting: Circlesort and its Perfo...
A Variant of Modified Diminishing Increment Sorting: Circlesort and its Perfo...
 
Comparative Performance Analysis & Complexity of Different Sorting Algorithm
Comparative Performance Analysis & Complexity of Different Sorting AlgorithmComparative Performance Analysis & Complexity of Different Sorting Algorithm
Comparative Performance Analysis & Complexity of Different Sorting Algorithm
 
Incremental and Multi-feature Tensor Subspace Learning applied for Background...
Incremental and Multi-feature Tensor Subspace Learning applied for Background...Incremental and Multi-feature Tensor Subspace Learning applied for Background...
Incremental and Multi-feature Tensor Subspace Learning applied for Background...
 
Unit1 pg math model
Unit1 pg math modelUnit1 pg math model
Unit1 pg math model
 
Chap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsChap 8. Optimization for training deep models
Chap 8. Optimization for training deep models
 
Artificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesArtificial Intelligence Searching Techniques
Artificial Intelligence Searching Techniques
 
presentation
presentationpresentation
presentation
 
Algorithm analysis in fundamentals of data structure
Algorithm analysis in fundamentals of data structureAlgorithm analysis in fundamentals of data structure
Algorithm analysis in fundamentals of data structure
 
Matrix Factorisation (and Dimensionality Reduction)
Matrix Factorisation (and Dimensionality Reduction)Matrix Factorisation (and Dimensionality Reduction)
Matrix Factorisation (and Dimensionality Reduction)
 
How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...
How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...
How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...
 
505 260-266
505 260-266505 260-266
505 260-266
 

Semelhante a Tracking the tracker: Time Series Analysis in Python from First Principles

On selection of periodic kernels parameters in time series prediction
On selection of periodic kernels parameters in time series predictionOn selection of periodic kernels parameters in time series prediction
On selection of periodic kernels parameters in time series predictioncsandit
 
On Selection of Periodic Kernels Parameters in Time Series Prediction
On Selection of Periodic Kernels Parameters in Time Series Prediction On Selection of Periodic Kernels Parameters in Time Series Prediction
On Selection of Periodic Kernels Parameters in Time Series Prediction cscpconf
 
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTIONON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTION
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTIONcscpconf
 
safe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningsafe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningRyo Iwaki
 
Computational Intelligence for Time Series Prediction
Computational Intelligence for Time Series PredictionComputational Intelligence for Time Series Prediction
Computational Intelligence for Time Series PredictionGianluca Bontempi
 
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North SeaMachine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North SeaYohanes Nuwara
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingAdam Doyle
 
A unique sorting algorithm with linear time & space complexity
A unique sorting algorithm with linear time & space complexityA unique sorting algorithm with linear time & space complexity
A unique sorting algorithm with linear time & space complexityeSAT Journals
 
presentation.ppt
presentation.pptpresentation.ppt
presentation.pptWasiqAli28
 
Ch24 efficient algorithms
Ch24 efficient algorithmsCh24 efficient algorithms
Ch24 efficient algorithmsrajatmay1992
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...Kusano Hitoshi
 
Development of deep reinforcement learning for inverted pendulum
Development of deep reinforcement learning for inverted  pendulumDevelopment of deep reinforcement learning for inverted  pendulum
Development of deep reinforcement learning for inverted pendulumIJECEIAES
 
ASYMTOTIC NOTATIONS BIG O OEMGA THETE NOTATION.pptx
ASYMTOTIC NOTATIONS BIG O OEMGA THETE NOTATION.pptxASYMTOTIC NOTATIONS BIG O OEMGA THETE NOTATION.pptx
ASYMTOTIC NOTATIONS BIG O OEMGA THETE NOTATION.pptxsunitha1792
 
Scalable trust-region method for deep reinforcement learning using Kronecker-...
Scalable trust-region method for deep reinforcement learning using Kronecker-...Scalable trust-region method for deep reinforcement learning using Kronecker-...
Scalable trust-region method for deep reinforcement learning using Kronecker-...Willy Marroquin (WillyDevNET)
 
Increasing electrical grid stability classification performance using ensemb...
Increasing electrical grid stability classification performance  using ensemb...Increasing electrical grid stability classification performance  using ensemb...
Increasing electrical grid stability classification performance using ensemb...IJECEIAES
 

Tracking the tracker: Time Series Analysis in Python from First Principles

  • 1. Tracking the tracker: Time Series Analysis in Python From First Principles Kenneth Emeka Odoh PyCon APAC @National University of Singapore Computing 1 (COM1) / Level 2 13 Computing Drive Singapore 117417 May 31st, 2018 - June 2nd, 2018
  • 2. Talk Plan ● Bio ● Motivations & Goals ● Theory ● Application: Forecasting ● Application: Anomaly Detection ● Deployment ● Lessons ● Conclusion
  • 3. Bio Education ● Master's in Computer Science (University of Regina) 2013 - 2016 ● A number of MOOCs in statistics, algorithms, and machine learning Work ● Research assistant in the field of Visual Analytics 2013 - 2016 ○ ( https://scholar.google.com/citations?user=3G2ncgwAAAAJ&hl=en ) ● Software Engineer in Vancouver, Canada 2017 - ● Regular speaker at a number of Vancouver AI meetups; organizer of a distributed systems meetup, a Haskell study group, and an Applied Cryptography study group 2017 -
  • 4. This talk focuses on two applications in time series analysis: ● Forecasting ○ Can I use the past to predict the future? ● Anomaly Detection ○ Is a given point normal or abnormal? "Simplicity is the ultimate sophistication" --- Leonardo da Vinci
  • 5. Motivation Figure 1: Pole vault showing time series of position [http://vaultermagazine.com/middle-school-womens-pole-vault-rankings-from-athletic-net-5/]
  • 6. Figure 2: Missile Defence System [http://www.bbc.com/news/world-middle-east-43730068]
  • 7. Figure 3: Bitcoin time series [yahoo finance]
  • 8. Goals ● A real-time model for time series analysis. ● A less computationally intensive model, and therefore a more scalable one. ● Support for both multivariate and univariate time series data. Introducing pySmooth ( https://github.com/kenluck2001/pySmooth ) and Anomaly ( https://github.com/kenluck2001/anomaly )
  • 10. Let’s dive into the maths for some minutes [http://www.boom-studios.com/2017/12/01/20th-century-fox-buys-girl-detective-adventure-movie-goldie-vance/]
  • 11. Theory ● Time Series Data ○ $Y_t$ for $t = 1, 2, 3, \ldots, n$, where $t$ is the index and $n$ is the number of points. ● Time Series Decomposition ○ Additive form: $Y_t = V_t + S_t + H_t + \epsilon_t$ ○ Sometimes the relation may be in multiplicative form: $Y_t = V_t \cdot S_t \cdot H_t \cdot \epsilon_t$ ○ Here $Y_t$ is the predicted time series, $V_t$ is the trend, $S_t$ is the cyclic component, $H_t$ is the holiday component, and $\epsilon_t$ is the residual.
  • 12. ● pth-Order Difference Equation, Autoregressive, AR(p): $\mathrm{AR}(p) = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \ldots + \phi_p Y_{t-p} + \epsilon_t$, where $\phi_i$ (for $i = 1, \ldots, p$) are the parameters and $\epsilon_t$ is noise. See EWMA for anomaly detection as an example of AR. ● Moving Average Process, MA(q): $\mathrm{MA}(q) = \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \ldots + \theta_q \epsilon_{t-q}$, where $\theta_i$ (for $i = 1, \ldots, q$) are the parameters and $\epsilon_t$ is noise.
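To make the AR(p) recurrence concrete, here is a minimal sketch in plain Python. It is not taken from pySmooth; the function names `simulate_ar` and `predict_ar`, the zero initial conditions, and the Gaussian noise are assumptions made for illustration only.

```python
import random


def simulate_ar(phi, n, noise_std=1.0, seed=0):
    """Simulate n points of an AR(p) process with coefficients phi[0..p-1]."""
    rng = random.Random(seed)
    p = len(phi)
    y = [0.0] * p  # zero initial conditions (an assumption of this sketch)
    for _ in range(n):
        eps = rng.gauss(0.0, noise_std)
        # phi[i] multiplies the value observed i + 1 steps back
        y.append(sum(phi[i] * y[-1 - i] for i in range(p)) + eps)
    return y[p:]


def predict_ar(phi, history):
    """One-step forecast: the noise term has zero mean, so it drops out."""
    return sum(phi[i] * history[-1 - i] for i in range(len(phi)))


series = simulate_ar([0.6, -0.2], n=200)
forecast = predict_ar([0.6, -0.2], series)
```

The forecast only needs the last p values, which is what makes this family of models cheap to run on a stream.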
  • 13. ● Autoregressive Moving Average, ARMA: $\mathrm{ARMA}(p, q) = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \ldots + \phi_p Y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \ldots + \theta_q \epsilon_{t-q}$, where $p$, $q$, $\phi$, and $\theta$ are the parameters. ● Autoregressive Integrated Moving Average, ARIMA: $\mathrm{ARIMA}(p, d, q) = \mathrm{ARMA}(p + d, q)$. See the Box-Jenkins methodology for choosing good parameters.
  • 14. ● ARCH ● GARCH (related to the exponentially weighted moving average; see the section on anomaly detection) ● NGARCH ● IGARCH ● EGARCH … For more information, see ( https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity ). For a more practical demonstration, see ( https://www.kaggle.com/jagangupta/time-series-basics-exploring-traditional-ts ).
  • 15. Why Kalman Filters? ● Allows for using measurement signals to augment the prediction of your next states. The state is the target variable. ● Allows for correction using the correct state from the most recent step. ● Allows for online prediction in real time. It is good to note the following: ● How do you identify the appropriate measurement signals? ● The measurement captures the trend of the state. ● Modeling can include domain knowledge, e.g. physics.
  • 16. Figure 4: How the Kalman filter works
  • 17. ● Kalman filter: a state space model in which x, y, F, n, and v are the states, measurement, function, measurement noise, and state noise respectively [Julier, 2002]. It is the continuous form of the forward algorithm of an HMM [Andrew Fraser, 2008]. The Kalman formulation allows for ● Handling missing data. ● Data fusion.
  • 18. Discrete Kalman Filter (DKF) ● Captures linear relationships in the data. ● Fluctuates around the mean and covariance. ● Eventual convergence as the data increases, thereby improving filter performance (Bayesian). In the presence of a nonlinear relationship, the error in the posterior estimates increases, leading to suboptimal filter performance (underfitting). [Welch & Bishop, 2006]
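The predict and correct cycle of a discrete Kalman filter can be sketched in the scalar case as follows. This is an illustrative sketch, not the pySmooth implementation: the function name `kalman_1d`, the random-walk state model, and the default noise variances `q` and `r` are all assumptions.

```python
def kalman_1d(measurements, q=1e-4, r=0.1, x0=0.0, p0=1.0):
    """Scalar discrete Kalman filter for a slowly varying (random-walk) state.

    q is the process-noise variance and r the measurement-noise variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state model is x_k = x_{k-1} + noise, so only
        # the uncertainty grows.
        p = p + q
        # Correct: blend prediction and measurement through the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


estimates = kalman_1d([5.0] * 50)
```

As more measurements arrive the gain shrinks and the estimate settles, which is the convergence behaviour described in the slide.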
  • 19. Extended Kalman Filter (EKF) ● Makes use of Jacobians and Hessians. ○ Increases computation cost. ○ Needs a differentiable nonlinear function. ● Linearizes the nonlinear equation using a Taylor series to 1st order. ● Same computational complexity as the unscented Kalman filter. ● Captures nonlinear relationships in the data. The limited order of the Taylor series can lead to error in the posterior estimates, thereby leading to suboptimal filter performance. [Welch & Bishop, 2006]
  • 20. Unscented Kalman Filter (UKF) ● Better sampling to capture more representative characteristics of the model. ○ By using sigma points. ○ Sigma points are extra data points chosen within the region surrounding the original data. This helps to account for variability by capturing the likely positions the data could take under some perturbation. Sampling these points provides richer information about the distribution of the data. It is more ergodic. ● Avoids the use of Jacobians and Hessians. ○ Possible to use any form of nonlinear function (not only differentiable functions). ● Captures the nonlinear function accurately to the 3rd order of its Taylor series expansion (for Gaussian inputs). ● Same computational efficiency as the EKF. [Julier, 2002]
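The sigma-point idea can be sketched with NumPy using the scaled unscented transform. This is a sketch under stated assumptions rather than the pySmooth code: the function name `sigma_points` is hypothetical, and the defaults `alpha=1.0`, `beta=2.0`, `kappa=0.0` are chosen only to keep the example numerically simple.

```python
import numpy as np


def sigma_points(mean, cov, alpha=1.0, beta=2.0, kappa=0.0):
    """Return 2n + 1 sigma points with their mean and covariance weights."""
    n = mean.shape[0]
    lam = alpha**2 * (n + kappa) - n
    # Columns of the Cholesky factor give the directions in which to spread
    root = np.linalg.cholesky((n + lam) * cov)
    points = np.vstack([mean, mean + root.T, mean - root.T])
    w_mean = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    w_cov = w_mean.copy()
    w_mean[0] = lam / (n + lam)
    w_cov[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    return points, w_mean, w_cov


mean = np.array([1.0, 2.0])
cov = np.array([[2.0, 0.3], [0.3, 1.0]])
points, w_mean, w_cov = sigma_points(mean, cov)

# The weighted points reproduce the original mean and covariance
rec_mean = w_mean @ points
diff = points - rec_mean
rec_cov = (w_cov[:, None] * diff).T @ diff
```

In a full filter each sigma point would be pushed through the nonlinear function and the same weights would be used to recover the transformed mean and covariance, so no Jacobian is needed.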
  • 21. Figure 5: Geometric interpretation of covariance matrix. For more info, see [http://www.visiondummy.com/2014/04/geometric-interpretation-covariance-matrix/]
  • 22. Figure 6: Actual sampling vs extended Kalman filter and unscented Kalman filter [Julier, 2002]
  • 23. Another kind of Kalman filter ● Particle filtering. See more tweaks that can be made in the Kalman filter formulation { https://users.aalto.fi/~ssarkka/pub/cup_book_online_20131111.pdf }
  • 24. ● RLS (Recursive Least Squares, a recursive form of linear regression): start from an initial model at time $t$ and update it as new data arrives at time $t+1$. The model is $y = \theta(t) \, x_t$. At time $t+1$, we have data $x_{t+1}$ and $y_{t+1}$, and estimate $\theta(t+1)$ in an incremental manner [Wellstead & Karl, 1991]. [Sherman and Morrison formula] https://en.wikipedia.org/wiki/Sherman%E2%80%93Morrison_formula [See closed form of $\theta$]
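The incremental estimate can be sketched as a standard recursive least squares update, where the Sherman-Morrison formula avoids re-inverting a matrix at every step. The class name `RecursiveLeastSquares` and the initial scale `delta` are assumptions for illustration; the update in pySmooth may differ in detail.

```python
import numpy as np


class RecursiveLeastSquares:
    def __init__(self, n_features, delta=1e3):
        self.theta = np.zeros(n_features)
        # P approximates (X^T X)^{-1}; a large delta means a weak prior
        self.P = delta * np.eye(n_features)

    def update(self, x, y):
        """Fold one observation (x, y) into the estimate of theta."""
        Px = self.P @ x
        gain = Px / (1.0 + x @ Px)
        self.theta = self.theta + gain * (y - x @ self.theta)
        # Sherman-Morrison rank-one update of the inverse
        self.P = self.P - np.outer(gain, Px)
        return self.theta


rng = np.random.default_rng(0)
true_theta = np.array([2.0, -1.0])
rls = RecursiveLeastSquares(n_features=2)
for _ in range(50):
    x = rng.normal(size=2)
    rls.update(x, x @ true_theta)  # noise-free targets for clarity
```

Each update costs a few matrix-vector products, so the model can be refreshed as every new point arrives instead of being refit on the full history.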
  • 25. Online ARIMA: RLS + ‘Vanilla’ ARIMA [Kenneth Odoh]
  • 26. Box-Jenkins Methodology ● Transform the data so that a form of stationarity is attained. ○ E.g. taking the logarithm of the data, or other forms of scaling. ● Guess values of p, d, and q for ARIMA(p, d, q). ● Estimate the parameters needed for future prediction. ● Perform diagnostics using AIC, BIC, or REC (Regression Error Characteristic Curves) [Jinbo & Kristin, 2003]. Probably a generalization of Occam’s Razor. [https://www.tass.gov.uk/about/facts-figures/]
  • 27. Stationarity ● Constant statistical properties over time ○ The joint probability distribution is shift-invariant ○ Can occur in different moments (mean, variance, skewness, and kurtosis) ● Why is it useful? ○ Mean reverting. ○ Markovian assumption: the future is independent of the past given the present. For states (A, B, C, D) occurring in sequence, $P(A, B, C, D) = P(D \mid A, B, C) \, P(C \mid A, B) \, P(B \mid A) \, P(A) = P(D \mid C) \, P(C \mid B) \, P(B \mid A) \, P(A)$
  • 28. ● n: number of states, m: number of observations ● S: set of states, Y: set of measurements ■ $S = \{s_1, s_2, \ldots, s_n\}$, $Y = \{y_1, y_2, \ldots, y_m\}$ ■ $P(S(1)) = [p_1, p_2, \ldots, p_n]$ -- prior probability ■ $T = P(s(t+1) \mid s(t))$ -- transition probability ■ $P(S(t+1)) = P(S(t)) \cdot T$ -- stationarity ● Mapping states to measurements, $P(y \mid s)$ ● This makes it easy to compose more relationships using Bayes' theorem in the Kalman filtering formulation (continuous states and measurements).
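The Markov factorization and the propagation rule for the state distribution can be checked numerically. The two-state transition matrix below is made up for illustration, and `sequence_probability` is a hypothetical helper name.

```python
import numpy as np

# Rows are the current state, columns the next state; each row sums to 1.
T = np.array([[0.9, 0.1],
              [0.5, 0.5]])
prior = np.array([1.0, 0.0])  # P(S(1)): start in state 0 with certainty

# Propagate P(S(t+1)) = P(S(t)) * T until the distribution stops changing
p = prior.copy()
for _ in range(200):
    p = p @ T


def sequence_probability(states, prior, T):
    """P(s1, ..., sk) under the Markov assumption."""
    prob = prior[states[0]]
    for prev, nxt in zip(states, states[1:]):
        prob *= T[prev, nxt]
    return prob


prob_001 = sequence_probability([0, 0, 1], prior, T)
```

After enough steps `p` no longer changes when multiplied by `T`, which is the stationary distribution of this chain.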
  • 29. ● There are methods for converting data from non-stationary to stationary (implicit in the algorithm formulation or explicit in data preprocessing) ○ Detrending ○ Differencing ● Avoid using algorithms based on the stationarity assumption with non-stationary data. ● Testing for stationarity ○ Formally, use a unit root test [https://faculty.washington.edu/ezivot/econ584/notes/unitroot.pdf] ○ Informally, compare means or other statistical measures between different disjoint batches. For a fuller treatment of stationarity, see [http://www.phdeconomics.sssup.it/documents/Lesson4.pdf]
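Differencing and the informal batch comparison can be sketched with the standard library alone. The helper names `difference` and `batch_means` and the synthetic linear trend are assumptions made for this example; a formal check would use a unit root test instead.

```python
import statistics


def difference(series, d=1):
    """Apply d rounds of first-order differencing."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def batch_means(series, n_batches=4):
    """Means of disjoint, equally sized batches (leftover points dropped)."""
    size = len(series) // n_batches
    return [statistics.mean(series[i * size:(i + 1) * size])
            for i in range(n_batches)]


trend = [0.5 * t for t in range(100)]   # linear trend: not stationary
print(batch_means(trend))               # batch means keep climbing
print(batch_means(difference(trend)))   # constant after differencing
```

If the batch means (or variances) drift apart, an algorithm that assumes stationarity should not be applied to the raw series.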
  • 30. Why stocks are hard to predict ● If the efficient market hypothesis holds, then the stock time series $X_t$ is i.i.d., which means that the past cannot be used to predict the future: $P(X_t \mid X_{t-1}) = P(X_t)$ ● Efficient market hypothesis ○ Stocks are always at the right price or value. ○ Cannot profit by purchasing undervalued or overvalued assets. ○ Stock selection is not guaranteed to increase returns. ○ Profit can only be achieved by seeking out riskier investments. In summary, it is “impossible to beat the market”. Very controversial. ● "Essentially, all models are wrong, but some are useful." --- George Box ○ As such, there is still hope with prediction, albeit a faint one. ○ Avoid transient spurious correlations (domain knowledge, causal inference, and statistical models).
  • 31. Other Things to Consider in your Model ● Outliers ● Collinearity ● Heteroscedasticity ● Underfitting vs overfitting
  • 32. API ● Time difference model ● Online ARIMA ● Discrete Kalman filter ● Extended Kalman filter ● Unscented Kalman filter
  • 33. API: Online ARIMA
        import numpy as np
        from RecursiveARIMA import RecursiveARIMA
        X = np.random.rand(10, 5)              # initial batch of observations
        recArimaObj = RecursiveARIMA(p=6, d=0, q=6)
        recArimaObj.init(X)
        x = np.random.rand(1, 5)               # newly arrived observation
        recArimaObj.update(x)
        print("Next state")
        print(recArimaObj.predict())
  • 34. API: Discrete Kalman Filter
        import numpy as np
        from DiscreteKalmanFilter import DiscreteKalmanFilter
        X = np.random.rand(2, 2)               # states
        Z = np.random.rand(2, 2)               # measurements
        dkf = DiscreteKalmanFilter()
        dkf.init(X, Z)                         # training phase
        dkf.update()
        x = np.random.rand(1, 2)
        z = np.random.rand(1, 2)
        print("Next state")
        print(dkf.predict(x, z))
  • 35. API: Extended Kalman Filter
        import numpy as np
        from ExtendedKalmanFilter import ExtendedKalmanFilter
        X = np.random.rand(2, 2)               # states
        Z = np.random.rand(2, 15)              # measurements
        ekf = ExtendedKalmanFilter()
        ekf.init(X, Z)                         # training phase
        ekf.update()
        x = np.random.rand(1, 2)
        z = np.random.rand(1, 15)
        print("Next state")
        print(ekf.predict(x, z))
  • 36. API: Unscented Kalman Filter
        import numpy as np
        from UnscentedKalmanFilter import UnscentedKalmanFilter
        X = np.random.rand(2, 2)               # states
        Z = np.random.rand(2, 15)              # measurements
        ukf = UnscentedKalmanFilter()
        ukf.init(X, Z)                         # training phase
        ukf.update()
        x = np.random.rand(1, 2)
        z = np.random.rand(1, 15)
        print("Next state")
        print(ukf.predict(x, z))
  • 37. Other Approaches ● LSTM / RNN ● HMM ● Traditional ML methods Popular libraries: a number of R packages, Facebook’s Prophet
  • 39. Anomaly Detection It is ideal for unbalanced data sets ● a small number of negative samples and a large number of positive samples, or vice versa. What is normal data? ● This is the holy grail of a number of anomaly detection methods. Figure 7: Types of signal changes: abrupt transient shift (A), abrupt distributional shift (B), and gradual distributional shift (C) [Carter & Streilein, 2012].
  • 40. Data Stream ● Time and space constraints. ● Online algorithms ○ Detecting concept drift. ○ Forgetting unnecessary history. ○ Revision of the model after a significant change in distribution. ● Time delay to prediction.
  • 41. Handling Data Stream There are a number of window techniques: ● Fixed window ● Adaptive window (ADWIN) ● Landmark window ● Damped window [Zhu & Shasha, 2002]
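Two of the window techniques listed above can be sketched as running means. The class names `FixedWindowMean` and `DampedWindowMean` and the decay value are assumptions for illustration; adaptive and landmark windows are not shown.

```python
from collections import deque


class FixedWindowMean:
    """Mean over the most recent `size` items only."""

    def __init__(self, size):
        self.window = deque(maxlen=size)

    def update(self, x):
        self.window.append(x)  # the oldest item is evicted automatically
        return sum(self.window) / len(self.window)


class DampedWindowMean:
    """Mean over the whole history, with older items decayed geometrically."""

    def __init__(self, decay=0.9):
        self.decay = decay
        self.weighted_sum = 0.0
        self.weight = 0.0

    def update(self, x):
        self.weighted_sum = self.decay * self.weighted_sum + x
        self.weight = self.decay * self.weight + 1.0
        return self.weighted_sum / self.weight


fixed = FixedWindowMean(size=3)
fixed_means = [fixed.update(x) for x in [1.0, 2.0, 3.0, 4.0]]

damped = DampedWindowMean(decay=0.5)
damped_means = [damped.update(x) for x in [0.0, 10.0]]
```

The fixed window needs memory proportional to its size, while the damped window keeps only two numbers, which matters under the time and space constraints of a data stream.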
  • 42. Basic Primer on Statistics ● Expected value, $E[X]$, is the weighted average of the values that $X$ can take, with each value weighted by the probability of it occurring. This is the long-run mean. ● Variance, $\mathrm{Var}[X]$, is the square of the standard deviation: the expectation of the squared deviation of $X$ from its mean. It measures the spread of the data. $\mathrm{Var}[X] = E[X^2] - (E[X])^2$ ● Standard deviation, $\sigma$, is a measure of dispersion.
  • 43. Exponentially Weighted Moving Average (EWMA) ● Adds a forgetting parameter to weight recent items more or less. ● Volatile to abrupt transient changes and bad with distributional shifts. Probabilistic Exponentially Weighted Moving Average (PEWMA) ● Retains the forgetting parameter of EWMA and adds an extra parameter to include the probability of the data. ● Works with every kind of shift. Algorithm 1: Probabilistic Exponential Moving Average [Carter & Streilein, 2012]. The steps annotated in the algorithm are: weighted moving average; weighted moving $E[X^2]$; moving standard deviation; weighting factor (two branches); normal distribution pdf; Z-score.
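The annotated steps of Algorithm 1 can be sketched as a small generator. This follows the published PEWMA update as I understand it and is not the code of the anomaly library: the function name `pewma`, the default parameters, and flagging by a Z-score threshold (rather than by a probability threshold) are assumptions.

```python
import math


def pewma(stream, alpha=0.98, beta=0.5, training=10, threshold=3.0):
    """Yield (x, z_score, is_anomaly) for each point of the stream.

    Setting beta = 0 reduces the update to plain EWMA.
    """
    s1 = s2 = 0.0      # weighted moving average of X and of X^2
    mean = std = 0.0
    for t, x in enumerate(stream, start=1):
        if t == 1 or std == 0.0:
            z = 0.0                                    # no spread estimate yet
        else:
            z = (x - mean) / std                       # Z-score
        p = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # normal pdf
        # Weighting factor: plain running mean during training, then
        # probability-adjusted so unlikely points move the average less.
        a = 1.0 - 1.0 / t if t <= training else (1.0 - beta * p) * alpha
        s1 = a * s1 + (1.0 - a) * x
        s2 = a * s2 + (1.0 - a) * x * x
        mean = s1
        std = math.sqrt(max(s2 - s1 * s1, 0.0))        # moving std deviation
        yield x, z, t > training and abs(z) > threshold


stream = [10.0 + 0.1 * ((i % 5) - 2) for i in range(50)] + [25.0]
results = list(pewma(stream))
```

Only `s1`, `s2`, and the step counter need to be kept between points, which is why a small key-value store is enough to persist the state of each stream.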
  • 44. Figure 8: Normal distribution curve (with the threshold marked).
  • 45. Anomaly: A machine learning library for anomaly detection ● A new library in Python named anomaly ○ ( https://github.com/kenluck2001/anomaly ) ○ Written by Kenneth Emeka Odoh ● It makes use of a Redis store to ensure persistence of summary parameters as new data arrives. ● It is capable of handling multiple data streams with time series data. ● The current implementation does not support multivariate time series.
  • 46. Exercise [ Group Activity ] ● Modify Algorithm 1 to work for the multivariate case. ● Customize the threshold (symmetric vs one-sided outlier).
  • 47. Deployment This aims to show a streaming architecture for serving content. The goal is that the deployment should not be AWS-specific, to avoid being locked in to a proprietary vendor. ● Flask ● Redis ● Gevent ● Scikit-learn / Keras / NumPy ● PostgreSQL
  • 48. Machine learning as a service ● Web Server ○ Nginx (web server), Flask (lightweight web framework), Gunicorn (WSGI application server), and Gevent (non-blocking I/O) ● Caching Server ○ Redis (as a transport medium and cache) ● ML Server ○ Custom algorithms and third-party libraries ● Database ○ PostgreSQL
  • 49. Figure 9: An example of a deployment architecture.
  • 50. Database ● Authentication ● Metadata ● User management ○ Subscription ○ Managing rate limits
  • 51. Other Considerations ● Rather than serving JSON in your REST API, think of using Google Protocol Buffers (protobuf). ● Use streaming frameworks like Apache Kafka for building the pipeline. ● Flask is an easy-to-use web framework; think about using it in an MVP. ● Load balancer and replicated database. ● Version-control your model to enhance reproducibility. ○ If necessary, map each prediction interval to a model.
  • 52. Lessons ● "Prediction is very difficult, especially if it's about the future." -- Niels Bohr. ● "If you make a great number of predictions, the ones that were wrong will soon be forgotten, and the ones that turn out to be true will make you famous." -- Malcolm Gladwell. ● "I have seen the future and it is very much like the present, only longer." -- Kehlog Albran.
  • 53. Conclusion ● Recursive ARIMA ○ Not ideal for chaotic or irregularly spaced time series. ● Higher orders and moments are necessary in the model, making the unscented Kalman filter ideal for chaotic time series. ● The time difference model tends to overfit. ● MLE-based methods tend to underestimate the variance. ● ARIMA requires domain knowledge to choose ideal parameters. ● Regularizers can be used in the RLS to reduce overfitting. ● Kalman gain ⇔ damped window
  • 54. Thanks for listening. Kenneth Emeka Odoh, @kenluck2001, https://www.linkedin.com/in/kenluck2001/, kenneth.odoh@gmail.com, https://github.com/kenluck2001?tab=repositories
  • 55. References
        1. Castanon, D., & Karl, C. W. SC505: Stochastic processes. Retrieved 06/15, 2017, from http://www.mit.edu/people/hmsallum/GradSchool/sc505notes.pdf
        2. Wellstead, P. E., & Karl, C. W. (1991). Self-tuning Systems: Control and Signal Processing. Chichester, United Kingdom: John Wiley & Sons Ltd.
        3. Hamilton, J. D. (1994). Time Series Analysis. Princeton, NJ: Princeton University Press.
        4. Julier, S. J. The scaled unscented transformation. Retrieved 06/15, 2017, from https://www.cs.unc.edu/~welch/kalman/media/pdf/ACC02-IEEE1357.PDF
        5. Terejanu, G. A. Extended Kalman filter tutorial. Retrieved 06/15, 2017, from https://www.cse.sc.edu/~terejanu/files/tutorialEKF.pdf
        6. Wan, E. A., & Merwe, R. (2000). The unscented Kalman filter for nonlinear estimation. IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium, pp. 153-158.
        7. Carter, K. M., & Streilein, W. W. (2012). Probabilistic reasoning for streaming anomaly detection. In Proceedings of the Statistical Signal Processing Workshop, pp. 377-380.
        8. Zhu, Y., & Shasha, D. (2002). StatStream: Statistical monitoring of thousands of data streams in real time. In Proceedings of the 28th International Conference on Very Large Data Bases, pp. 358-369. VLDB Endowment.
        9. Welch, G., & Bishop, G. An introduction to the Kalman filter. Retrieved 06/15, 2017, from https://www.cs.unc.edu/~welch/media/pdf/kalman_intro.pdf
        10. Fraser, A. (2008). Hidden Markov Models and Dynamical Systems. Retrieved 06/15, 2017, from http://www.siam.org/books/ot107/
        11. Bi, J., & Bennett, K. P. (2003). Regression Error Characteristic Curves. Retrieved 06/15, 2017, from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.566.4289&rep=rep1&type=pdf