Rob J Hyndman
Automatic time series
forecasting
Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Exponential smoothing
5 ARIMA modelling
6 Automatic nonlinear forecasting?
7 Time series with complex seasonality
8 Recent developments
Motivation
1 It is common in business to have over 1000 products that need forecasting at least monthly.
2 Forecasts are often required by people who are untrained in time series analysis.

Specifications
Automatic forecasting algorithms must:
- determine an appropriate time series model;
- estimate the parameters;
- compute the forecasts with prediction intervals.
Example: Asian sheep

[Figure: Numbers of sheep in Asia, 1960-2010 (millions of sheep), shown with automatic ETS forecasts.]
Example: Corticosteroid sales

[Figure: Monthly corticosteroid drug sales in Australia, 1995-2010 (total scripts, millions), shown with automatic ARIMA forecasts.]
Makridakis and Hibon (1979)
This was the first large-scale empirical evaluation of time series forecasting methods. It was highly controversial at the time.
Difficulties:
- How to measure forecast accuracy?
- How to apply methods consistently and objectively?
- How to explain unexpected results?
Common thinking was that the more sophisticated mathematical models (ARIMA models at the time) were necessarily better. If results showed that ARIMA models were not best, it must have been because the analyst was unskilled.

From the discussion:
I do not believe that it is very fruitful to attempt to classify series according to which forecasting techniques perform “best”. The performance of any particular technique when applied to a particular series depends essentially on (a) the model which the series obeys; (b) our ability to identify and fit this model correctly and (c) the criterion chosen to measure the forecasting accuracy. — M.B. Priestley

. . . the paper suggests the application of normal scientific experimental design to forecasting, with measures of unbiased testing of forecasts against subsequent reality, for success or failure. A long overdue reform. — F.H. Hansford-Miller
Modern man is fascinated with the subject of forecasting. — W.G. Gilchrist

It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors. — W.G. Gilchrist
I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods. — C. Chatfield

Why do empirical studies sometimes give different answers? It may depend on the selected sample of time series, but I suspect it is more likely to depend on the skill of the analyst and on their individual interpretations of what is meant by Method X. — C. Chatfield

. . . these authors are more at home with simple procedures than with Box-Jenkins. — C. Chatfield
There is a fact that Professor Priestley must accept: empirical evidence is in disagreement with his theoretical arguments. — S. Makridakis & M. Hibon

Dr Chatfield expresses some personal views about the first author . . . It might be useful for Dr Chatfield to read some of the psychological literature quoted in the main paper, and he can then learn a little more about biases and how they affect prior probabilities. — S. Makridakis & M. Hibon
Consequences of MH (1979)
As a result of this paper, researchers started to:
- consider how to automate forecasting methods;
- study what methods give the best forecasts;
- be aware of the dangers of over-fitting;
- treat forecasting as a different problem from time series analysis.

Makridakis & Hibon followed up with a new competition in 1982:
- 1001 series
- Anyone could submit forecasts (avoiding the charge of incompetence)
- Multiple forecast measures used.
M-competition
Main findings (taken from Makridakis & Hibon, 2000):
1 Statistically sophisticated or complex methods do not necessarily provide more accurate forecasts than simpler ones.
2 The relative ranking of the performance of the various methods varies according to the accuracy measure being used.
3 Combinations of methods outperform, on average, the individual methods being combined, and do very well in comparison to other methods.
4 The accuracy of the various methods depends upon the length of the forecasting horizon involved.
M3 competition
Makridakis and Hibon (2000)
“The M3-Competition is a final attempt by the authors to settle the accuracy issue of various time series methods. . . The extension involves the inclusion of more methods/researchers (in particular in the areas of neural networks and expert systems) and more series.”
- 3003 series
- All data from business, demography, finance and economics.
- Series length between 14 and 126.
- Either non-seasonal, monthly or quarterly.
- All time series positive.

Makridakis & Hibon claimed that the M3-competition supported the findings of their earlier work. However, the best performing methods were far from “simple”.
Best methods:

Theta
- A very confusing explanation.
- Shown by Hyndman and Billah (2003) to be the average of linear regression and simple exponential smoothing with drift, applied to seasonally adjusted data.
- Later, the original authors claimed that their explanation was incorrect.

Forecast Pro
- A commercial software package with an unknown algorithm.
- Known to fit either exponential smoothing or ARIMA models using the BIC.
M3 results (recalculated)

Method         MAPE   sMAPE  MASE
Theta          17.42  12.76  1.39
ForecastPro    18.00  13.06  1.47
ForecastX      17.35  13.09  1.42
Automatic ANN  17.18  13.98  1.53
B-J automatic  19.13  13.72  1.54

Caveats:
- Calculations do not match the published paper.
- Some contestants apparently submitted multiple entries, but only the best ones were published.
Forecasting the PBS
The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme.
- Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs.
- The cost to government is determined by the number and types of drugs purchased.
- Currently nearly 1% of GDP ($14 billion).
- The total cost is budgeted based on forecasts of drug usage.
In 2001: $4.5 billion budget, under-forecasted by $800 million.
- Thousands of products. Seasonal demand.
- Subject to covert marketing, volatile products, uncontrollable expenditure.
- Although monthly data are available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts.
- All forecasts being done with the FORECAST function in MS-Excel!
PBS data

[Figures: Total cost in $ thousands, 1995-2008, for five PBS groups: A03 concession safety net; A05 general copayments; D01 general copayments; S01 general copayments; R03 general copayments.]
Exponential smoothing methods

                              Seasonal component
Trend component               N (None)   A (Additive)   M (Multiplicative)
N (None)                      N,N        N,A            N,M
A (Additive)                  A,N        A,A            A,M
Ad (Additive damped)          Ad,N       Ad,A           Ad,M
M (Multiplicative)            M,N        M,A            M,M
Md (Multiplicative damped)    Md,N       Md,A           Md,M

N,N:  Simple exponential smoothing
A,N:  Holt’s linear method
Ad,N: Additive damped trend method
M,N:  Exponential trend method
Md,N: Multiplicative damped trend method
A,A:  Additive Holt-Winters’ method
A,M:  Multiplicative Holt-Winters’ method

There are 15 separate exponential smoothing methods. Each can have an additive or multiplicative error, giving 30 separate models. Only 19 of these models are numerically stable.

General notation: E,T,S (ExponenTial Smoothing), where E = Error, T = Trend, S = Seasonal.

Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt’s linear method with additive errors
M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors
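The three-letter code carries over directly to software. As a minimal sketch, assuming the forecast package for R (model and damped are genuine ets() arguments; h02, the monthly corticosteroid sales series used later in this talk, stands in as example data):

library(forecast)
# Fit a specific model by its ETS code: "MAM" with damped = TRUE gives
# multiplicative error, damped additive trend, multiplicative seasonality.
fit <- ets(h02, model = "MAM", damped = TRUE)
# "Z" means choose automatically; "ZZZ" searches all admissible models.
fit_auto <- ets(h02, model = "ZZZ")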
Innovations state space models
- All ETS models can be written in innovations state space form (IJF, 2002).
- Additive and multiplicative versions give the same point forecasts but different prediction intervals.

[Diagram: ETS state space model. Each state $x_t = (\text{level}, \text{slope}, \text{seasonal})$ is updated from $x_{t-1}$ and the innovation $\varepsilon_t$, and generates the observation $y_t$.]

Estimation: compute the likelihood $L$ from $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_T$, and optimize $L$ with respect to the model parameters.

Let $x_t = (\ell_t, b_t, s_t, s_{t-1}, \ldots, s_{t-m+1})$ and $\varepsilon_t \stackrel{\text{iid}}{\sim} \mathrm{N}(0, \sigma^2)$. Then
$$y_t = \underbrace{h(x_{t-1})}_{\mu_t} + \underbrace{k(x_{t-1})\,\varepsilon_t}_{e_t} \qquad \text{(observation equation)}$$
$$x_t = f(x_{t-1}) + g(x_{t-1})\,\varepsilon_t \qquad \text{(state equation)}$$

Additive errors: $k(x_{t-1}) = 1$, so $y_t = \mu_t + \varepsilon_t$.
Multiplicative errors: $k(x_{t-1}) = \mu_t$, so $y_t = \mu_t(1 + \varepsilon_t)$, where $\varepsilon_t = (y_t - \mu_t)/\mu_t$ is a relative error.
Estimation
$$L^*(\theta, x_0) = n \log\!\left(\sum_{t=1}^{n} \varepsilon_t^2 / k^2(x_{t-1})\right) + 2\sum_{t=1}^{n} \log |k(x_{t-1})| = -2\log(\text{Likelihood}) + \text{constant}$$
Minimize with respect to $\theta = (\alpha, \beta, \gamma, \phi)$ and the initial states $x_0 = (\ell_0, b_0, s_0, s_{-1}, \ldots, s_{-m+1})$.
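Once a model has been fitted (the ets() function is introduced below), the estimated states, innovations and parameters can be inspected. A small sketch, assuming the forecast package (states, residuals() and coef() are standard accessors for ets objects):

library(forecast)
fit <- ets(livestock)
head(fit$states)      # the state vector x_t over time (level, plus slope/seasonal if present)
head(residuals(fit))  # the innovations used in the likelihood
coef(fit)             # estimated parameters theta and initial states x0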
Q: How to choose between the 19 different ETS models?
Cross-validation

[Diagram: Traditional evaluation splits the series once into training data followed by test data. Standard cross-validation holds out randomly chosen observations. Time series cross-validation uses a sequence of training sets, each extending one observation further in time, with forecasts evaluated on the subsequent data.]

Also known as “evaluation on a rolling forecast origin”.
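A minimal sketch of rolling-origin evaluation, assuming a recent version of the forecast package (its tsCV() helper post-dates this talk; the wrapper name fets is illustrative):

library(forecast)
# Refit an ETS model at each forecast origin.
fets <- function(y, h) forecast(ets(y), h = h)
# One-step-ahead errors from a rolling forecast origin.
e <- tsCV(livestock, fets, h = 1)
# Cross-validated MSE, ignoring the NAs at the end of the series.
mean(e^2, na.rm = TRUE)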
Akaike’s Information Criterion
$$\text{AIC} = -2\log(L) + 2k$$
where $L$ is the likelihood and $k$ is the number of estimated parameters in the model.
- This is a penalized likelihood approach.
- If $L$ is Gaussian, then $\text{AIC} \approx c + T\log(\text{MSE}) + 2k$, where $c$ is a constant, MSE is computed from one-step forecasts on the training set, and $T$ is the length of the series.
- Minimizing the Gaussian AIC is asymptotically equivalent (as $T \to \infty$) to minimizing the MSE of one-step forecasts on the test set via time series cross-validation.
Corrected AIC
For small $T$, the AIC tends to over-fit. A bias-corrected version is
$$\text{AIC}_c = \text{AIC} + \frac{2(k+1)(k+2)}{T-k}$$

Bayesian Information Criterion
$$\text{BIC} = \text{AIC} + k[\log(T) - 2]$$
- BIC penalizes extra terms more heavily than AIC.
- Minimizing the BIC is consistent if there is a true model.
What to use?
Choice: AIC, AICc, BIC, or CV-MSE.
- CV-MSE is too time-consuming for most automatic forecasting purposes, and requires large T.
- As T → ∞, BIC selects the true model if there is one. But that is never true!
- AICc focuses on forecasting performance, can be used on small samples, and is very fast to compute (see the sketch below).
- Empirical studies in forecasting show AIC is better than BIC for forecast accuracy.
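In the forecast package, the selection criterion can be set explicitly. A small sketch (ic is a genuine ets() argument, and "aicc" is already its default):

library(forecast)
fit_aicc <- ets(livestock, ic = "aicc")     # select among ETS models by AICc
fit_bic  <- ets(livestock, ic = "bic")      # or by BIC instead
c(AICc = fit_aicc$aicc, BIC = fit_bic$bic)  # criteria of the chosen models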
ets algorithm in R

Based on Hyndman, Koehler, Snyder & Grose (IJF 2002):
- Apply each of the 19 models that are appropriate to the data. Optimize parameters and initial values using MLE.
- Select the best method using AICc.
- Produce forecasts using the best method.
- Obtain prediction intervals using the underlying state space model.
Exponential smoothing

fit <- ets(livestock)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ETS(M,A,N) for the Asian sheep data, 1960-2010 (millions of sheep).]
fit <- ets(h02)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ETS(M,Md,M) for the corticosteroid sales data, 1995-2010 (total scripts, millions).]
> fit
ETS(M,Md,M)

Smoothing parameters:
  alpha = 0.3318
  beta  = 4e-04
  gamma = 1e-04
  phi   = 0.9695

Initial states:
  l = 0.4003
  b = 1.0233
  s = 0.8575 0.8183 0.7559 0.7627 0.6873 1.2884
      1.3456 1.1867 1.1653 1.1033 1.0398 0.9893

sigma: 0.0651

       AIC       AICc        BIC
-121.97999 -118.68967  -65.57195
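The prediction intervals plotted above come from this underlying state space model. A brief sketch (h and level are standard forecast() arguments):

# Two years of monthly forecasts with 80% and 95% prediction intervals.
fcast <- forecast(fit, h = 24, level = c(80, 95))
fcast$lower  # lower interval bounds
fcast$upper  # upper interval bounds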
M3 comparisons

Method         MAPE   sMAPE  MASE
Theta          17.42  12.76  1.39
ForecastPro    18.00  13.06  1.47
ForecastX      17.35  13.09  1.42
Automatic ANN  17.18  13.98  1.53
B-J automatic  19.13  13.72  1.54
ETS            18.06  13.38  1.52
Exponential smoothing

Further details: www.OTexts.com/fpp
ARIMA models

[Diagram: An autoregression (AR) model takes lagged observations $y_{t-1}, y_{t-2}, y_{t-3}$ as inputs and produces $y_t$ as output. An autoregression moving average (ARMA) model also takes lagged errors $\varepsilon_{t-1}, \varepsilon_{t-2}$ as inputs. An ARIMA model is an ARMA model applied to the differences of the series.]

Estimation: compute the likelihood $L$ from $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_T$, and use an optimization algorithm to maximize $L$.
Auto ARIMA

fit <- auto.arima(livestock)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ARIMA(0,1,0) with drift for the Asian sheep data, 1960-2010 (millions of sheep).]
fit <- auto.arima(h02)
fcast <- forecast(fit)
plot(fcast)

[Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12] for the corticosteroid sales data, 1995-2010 (total scripts, millions).]
> fit
Series: h02
ARIMA(3,1,3)(0,1,1)[12]

Coefficients:
          ar1      ar2     ar3      ma1     ma2     ma3     sma1
      -0.3648  -0.0636  0.3568  -0.4850  0.0479  -0.353  -0.5931
s.e.   0.2198   0.3293  0.1268   0.2227  0.2755   0.212   0.0651

sigma^2 estimated as 0.002706: log likelihood=290.25
AIC=-564.5  AICc=-563.71  BIC=-538.48
How does auto.arima() work?

A non-seasonal ARIMA process:
$$\phi(B)(1 - B)^d y_t = c + \theta(B)\varepsilon_t$$
We need to select appropriate orders p, q, d, and whether to include c.

Hyndman & Khandakar (JSS, 2008) algorithm:
- Select the number of differences d via the KPSS unit root test.
- Select p, q, c by minimising the AICc.
- Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.

Algorithm choices driven by forecast accuracy.
A seasonal ARIMA process:
$$\Phi(B^m)\,\phi(B)\,(1 - B)^d (1 - B^m)^D y_t = c + \Theta(B^m)\,\theta(B)\,\varepsilon_t$$
We need to select appropriate orders p, q, d, P, Q, D, and whether to include c.

Hyndman & Khandakar (JSS, 2008) algorithm:
- Select the number of differences d via the KPSS unit root test.
- Select D using the OCSB unit root test.
- Select p, q, P, Q, c by minimising the AIC.
- Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants (options controlling this search are sketched below).
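As referenced above, the search can be tuned from R. A sketch using arguments that exist in the forecast package's auto.arima() (defaults may differ across versions):

library(forecast)
fit <- auto.arima(h02,
  stepwise = FALSE,       # search the full model space, not just nearby variants
  approximation = FALSE,  # compare models using exact likelihoods
  d = NA, D = NA)         # let the KPSS and OCSB tests choose the differencing
summary(fit)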
M3 comparisons

Method         MAPE   sMAPE  MASE
Theta          17.42  12.76  1.39
ForecastPro    18.00  13.06  1.47
B-J automatic  19.13  13.72  1.54
ETS            18.06  13.38  1.52
AutoARIMA      19.04  13.86  1.47
ETS-ARIMA      17.92  13.02  1.44
M3 conclusions

MYTHS
- Simple methods do better.
- Exponential smoothing is better than ARIMA.

FACTS
- The best methods are hybrid approaches.
- ETS-ARIMA (the simple average of ETS-additive and AutoARIMA) is the only fully documented method that is comparable to the M3 competition winners (see the sketch below).
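As referenced above, a minimal sketch of such a combination, assuming the forecast package (additive.only is a genuine ets() argument restricting the search to additive models; the equal-weight average illustrates the idea rather than reproducing the exact M3 setup):

library(forecast)
h  <- 18
f1 <- forecast(ets(h02, additive.only = TRUE), h = h)  # ETS, additive models only
f2 <- forecast(auto.arima(h02), h = h)                 # automatic ARIMA
combined <- (f1$mean + f2$mean) / 2                    # simple average of point forecasts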
Automatic nonlinear forecasting
- Automatic ANN in the M3 competition did poorly.
- Linear methods did best in the NN3 competition!
- Very few machine learning methods get published in the IJF, because authors cannot demonstrate that their methods give better forecasts than linear benchmark methods, even on supposedly nonlinear data.
- There is some good recent work by Kourentzes and Crone (Neurocomputing, 2010) on automated ANNs for time series.
- Watch this space!
Examples

[Figures: US finished motor gasoline products, weekly, 1992-2004 (thousands of barrels per day); number of calls to a large American bank, 7am-9pm at 5-minute intervals, 3 March to 12 May (number of call arrivals); Turkish electricity demand, daily, 2000-2008 (GW).]
TBATS model

TBATS:
- Trigonometric terms for seasonality
- Box-Cox transformations for heterogeneity
- ARMA errors for short-term dynamics
- Trend (possibly damped)
- Seasonal (including multiple and non-integer periods)

Automatic algorithm described in AM De Livera, RJ Hyndman, and RD Snyder (2011). “Forecasting time series with complex seasonal patterns using exponential smoothing”. Journal of the American Statistical Association 106(496), 1513-1527.
TBATS model
y_t = observation at time t

$$y_t^{(\omega)} =
\begin{cases}
(y_t^{\omega} - 1)/\omega & \text{if } \omega \neq 0; \\
\log y_t & \text{if } \omega = 0
\end{cases}$$
(Box-Cox transformation)

$$y_t^{(\omega)} = \ell_{t-1} + \phi b_{t-1} + \sum_{i=1}^{M} s^{(i)}_{t-m_i} + d_t$$
(M seasonal periods)

$$\ell_t = \ell_{t-1} + \phi b_{t-1} + \alpha d_t$$
$$b_t = (1 - \phi)b + \phi b_{t-1} + \beta d_t$$
(global and local trend)

$$d_t = \sum_{i=1}^{p} \varphi_i d_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t$$
(ARMA error)

$$s^{(i)}_t = \sum_{j=1}^{k_i} s^{(i)}_{j,t}$$
$$s^{(i)}_{j,t} = s^{(i)}_{j,t-1} \cos\lambda^{(i)}_j + s^{*(i)}_{j,t-1} \sin\lambda^{(i)}_j + \gamma^{(i)}_1 d_t$$
$$s^{*(i)}_{j,t} = -s^{(i)}_{j,t-1} \sin\lambda^{(i)}_j + s^{*(i)}_{j,t-1} \cos\lambda^{(i)}_j + \gamma^{(i)}_2 d_t$$
(Fourier-like seasonal terms)

TBATS: Trigonometric, Box-Cox, ARMA, Trend, Seasonal
Automatic time series forecasting Time series with complex seasonality 61
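The slides leave the seasonal frequencies undefined; in the De Livera, Hyndman and Snyder (2011) paper cited above they are

$$\lambda^{(i)}_j = \frac{2\pi j}{m_i}, \qquad j = 1, \dots, k_i,$$

so each seasonal component i is a sum of k_i harmonics of its period m_i. Because m_i enters only through these frequencies, non-integer periods such as m_i = 365.25 are handled naturally.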
Examples
library(forecast)  # provides tbats() and forecast()
fit <- tbats(gasoline)
fcast <- forecast(fit)
plot(fcast)
Automatic time series forecasting Time series with complex seasonality 62
[Figure: Forecasts from TBATS(0.999, {2,2}, 1, {52.1785714285714, 8}). x-axis: Weeks (1995–2005 plus forecasts); y-axis: Thousands of barrels per day.]
The fitted specification reads TBATS(ω, {p,q}, φ, {⟨m1,k1⟩, ...}): Box-Cox parameter ω = 0.999, ARMA(2,2) errors, damping parameter φ = 1 (undamped), and a single seasonal period m1 = 365.25/7 ≈ 52.18 weeks (non-integer) with k1 = 8 Fourier terms.
Examples
fit <- tbats(callcentre)
fcast <- forecast(fit)
plot(fcast)
Automatic time series forecasting Time series with complex seasonality 63
[Figure: Forecasts from TBATS(1, {3,1}, 0.987, {⟨169,5⟩, ⟨845,3⟩}). x-axis: 5-minute intervals, 3 March to 9 June; y-axis: Number of call arrivals.]
Two seasonal periods are fitted: 169 five-minute intervals per day (5 Fourier terms) and 845 intervals per five-day week (3 Fourier terms).
Examples
fit <- tbats(turk)
fcast <- forecast(fit)
plot(fcast)
Automatic time series forecasting Time series with complex seasonality 64
[Figure: Forecasts from TBATS(0, {5,3}, 0.997, {⟨7,3⟩, ⟨354.37,12⟩, ⟨365.25,4⟩}). x-axis: Days (2000–2010); y-axis: Electricity demand (GW).]
Three seasonal periods are fitted: weekly (7 days, 3 Fourier terms), annual on the Hijri calendar (354.37 days, 12 terms) and annual on the Gregorian calendar (365.25 days, 4 terms). Here ω = 0, corresponding to a log transformation.
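As a usage note (a sketch, not code from the talk; the data vector x below is hypothetical), series with multiple seasonal periods are declared with msts() from the forecast package before calling tbats():

library(forecast)
# Hypothetical half-hourly series: daily period 48, weekly period 336.
x <- rnorm(336 * 4)                          # placeholder data, four weeks long
y <- msts(x, seasonal.periods = c(48, 336))  # attach both periods to the series
fit <- tbats(y)                              # TBATS models both periods at once
fc <- forecast(fit, h = 48)                  # forecast one day ahead
plot(fc)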
Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Exponential smoothing
5 ARIMA modelling
6 Automatic nonlinear forecasting?
7 Time series with complex seasonality
8 Recent developments
Automatic time series forecasting Recent developments 65
Further competitions
1 2011 tourism forecasting competition.
2 Kaggle and other forecasting platforms.
3 GEFCom 2012: Point forecasting of
electricity load and wind power.
4 GEFCom 2014: Probabilistic forecasting
of electricity load, electricity price,
wind energy and solar energy.
Automatic time series forecasting Recent developments 66
Forecasts about forecasting
1 Automatic algorithms will become more
general — handling a wide variety of time
series.
2 Model selection methods will take account
of multi-step forecast accuracy as well as
one-step forecast accuracy.
3 Automatic forecasting algorithms for
multivariate time series will be developed.
4 Automatic forecasting algorithms that
include covariate information will be
developed.
Automatic time series forecasting Recent developments 67
For further information
robjhyndman.com
Slides and references for this talk.
Links to all papers and books.
Links to R packages.
A blog about forecasting research.
Automatic time series forecasting Recent developments 68

Mais conteúdo relacionado

Mais procurados

Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Matt Hansen
 
Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)Matt Hansen
 
Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)Matt Hansen
 
Discovering Statistics Using IBM SPSS Statistics 4th Edition Field Test Bank
Discovering Statistics Using IBM SPSS Statistics 4th Edition Field Test BankDiscovering Statistics Using IBM SPSS Statistics 4th Edition Field Test Bank
Discovering Statistics Using IBM SPSS Statistics 4th Edition Field Test Bankguzofahug
 
Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)Matt Hansen
 
Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)Matt Hansen
 
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)Matt Hansen
 
Forecasting using data - Deliver 2016
Forecasting using data  - Deliver 2016Forecasting using data  - Deliver 2016
Forecasting using data - Deliver 2016Troy Magennis
 
Ways to evaluate a machine learning model’s performance
Ways to evaluate a machine learning model’s performanceWays to evaluate a machine learning model’s performance
Ways to evaluate a machine learning model’s performanceMala Deep Upadhaya
 
Session 9 intro_of_topics_in_hypothesis_testing
Session 9 intro_of_topics_in_hypothesis_testingSession 9 intro_of_topics_in_hypothesis_testing
Session 9 intro_of_topics_in_hypothesis_testingGlory Codilla
 
AN ALTERNATIVE APPROACH FOR SELECTION OF PSEUDO RANDOM NUMBERS FOR ONLINE EXA...
AN ALTERNATIVE APPROACH FOR SELECTION OF PSEUDO RANDOM NUMBERS FOR ONLINE EXA...AN ALTERNATIVE APPROACH FOR SELECTION OF PSEUDO RANDOM NUMBERS FOR ONLINE EXA...
AN ALTERNATIVE APPROACH FOR SELECTION OF PSEUDO RANDOM NUMBERS FOR ONLINE EXA...cscpconf
 
DoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolDoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolAmit Sharma
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldPyData
 
Prescriptive Analytics: A Hands-on Introduction to Getting Actionable Insight...
Prescriptive Analytics: A Hands-on Introduction to Getting Actionable Insight...Prescriptive Analytics: A Hands-on Introduction to Getting Actionable Insight...
Prescriptive Analytics: A Hands-on Introduction to Getting Actionable Insight...Barton Poulson
 

Mais procurados (16)

Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
 
Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)
 
Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)
 
Discovering Statistics Using IBM SPSS Statistics 4th Edition Field Test Bank
Discovering Statistics Using IBM SPSS Statistics 4th Edition Field Test BankDiscovering Statistics Using IBM SPSS Statistics 4th Edition Field Test Bank
Discovering Statistics Using IBM SPSS Statistics 4th Edition Field Test Bank
 
Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)
 
Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)
 
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
 
Forecasting using data - Deliver 2016
Forecasting using data  - Deliver 2016Forecasting using data  - Deliver 2016
Forecasting using data - Deliver 2016
 
Ways to evaluate a machine learning model’s performance
Ways to evaluate a machine learning model’s performanceWays to evaluate a machine learning model’s performance
Ways to evaluate a machine learning model’s performance
 
Hypothesis testing 1.0
Hypothesis testing 1.0Hypothesis testing 1.0
Hypothesis testing 1.0
 
Test for proportion
Test for proportionTest for proportion
Test for proportion
 
Session 9 intro_of_topics_in_hypothesis_testing
Session 9 intro_of_topics_in_hypothesis_testingSession 9 intro_of_topics_in_hypothesis_testing
Session 9 intro_of_topics_in_hypothesis_testing
 
AN ALTERNATIVE APPROACH FOR SELECTION OF PSEUDO RANDOM NUMBERS FOR ONLINE EXA...
AN ALTERNATIVE APPROACH FOR SELECTION OF PSEUDO RANDOM NUMBERS FOR ONLINE EXA...AN ALTERNATIVE APPROACH FOR SELECTION OF PSEUDO RANDOM NUMBERS FOR ONLINE EXA...
AN ALTERNATIVE APPROACH FOR SELECTION OF PSEUDO RANDOM NUMBERS FOR ONLINE EXA...
 
DoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolDoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End tool
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
 
Prescriptive Analytics: A Hands-on Introduction to Getting Actionable Insight...
Prescriptive Analytics: A Hands-on Introduction to Getting Actionable Insight...Prescriptive Analytics: A Hands-on Introduction to Getting Actionable Insight...
Prescriptive Analytics: A Hands-on Introduction to Getting Actionable Insight...
 

Semelhante a Automatic time series forecasting

Automatic algorithms for time series forecasting
Automatic algorithms for time series forecastingAutomatic algorithms for time series forecasting
Automatic algorithms for time series forecastingRob Hyndman
 
What is scenario
What is scenarioWhat is scenario
What is scenarioAdil Khan
 
Greg Wilson - We Know (but ignore) More Than We Think
Greg Wilson - We Know (but ignore) More Than We ThinkGreg Wilson - We Know (but ignore) More Than We Think
Greg Wilson - We Know (but ignore) More Than We Think#DevTO
 
Undergraduated Thesis
Undergraduated ThesisUndergraduated Thesis
Undergraduated ThesisVictor Li
 
Modelling Innovation – some options from probabilistic to radical
Modelling Innovation – some options from probabilistic to radicalModelling Innovation – some options from probabilistic to radical
Modelling Innovation – some options from probabilistic to radicalBruce Edmonds
 
Qnt 565 Effective Communication / snaptutorial.com
Qnt 565    Effective Communication / snaptutorial.comQnt 565    Effective Communication / snaptutorial.com
Qnt 565 Effective Communication / snaptutorial.comBaileyabh
 
Machine learning in medicine: calm down
Machine learning in medicine: calm downMachine learning in medicine: calm down
Machine learning in medicine: calm downBenVanCalster
 
Selection of Research Material relating to RiskMetrics Group CDO Manager
Selection of Research Material relating to RiskMetrics Group CDO ManagerSelection of Research Material relating to RiskMetrics Group CDO Manager
Selection of Research Material relating to RiskMetrics Group CDO Managerquantfinance
 
Study of different approaches to Out of Distribution Generalization
Study of different approaches to Out of Distribution GeneralizationStudy of different approaches to Out of Distribution Generalization
Study of different approaches to Out of Distribution GeneralizationMohamedAmineHACHICHA1
 
ThesisDraftGarbageIn-GarbageOutSimulatingInternalConsistency.docx
ThesisDraftGarbageIn-GarbageOutSimulatingInternalConsistency.docxThesisDraftGarbageIn-GarbageOutSimulatingInternalConsistency.docx
ThesisDraftGarbageIn-GarbageOutSimulatingInternalConsistency.docxThomas Goodheart
 
A good response to others is not something like I agree. Please .docx
A good response to others is not something like I agree. Please .docxA good response to others is not something like I agree. Please .docx
A good response to others is not something like I agree. Please .docxransayo
 
A Pyramid ofDecision ApproachesPaul J.H. Schoemaker J. E.docx
A Pyramid ofDecision ApproachesPaul J.H. Schoemaker J. E.docxA Pyramid ofDecision ApproachesPaul J.H. Schoemaker J. E.docx
A Pyramid ofDecision ApproachesPaul J.H. Schoemaker J. E.docxransayo
 
FPP 1. Getting started
FPP 1. Getting startedFPP 1. Getting started
FPP 1. Getting startedRob Hyndman
 
Confounding and Directed Acyclic Graphs
Confounding and Directed Acyclic GraphsConfounding and Directed Acyclic Graphs
Confounding and Directed Acyclic GraphsDarren L Dahly PhD
 
QNT 565 Exceptional Education - snaptutorial.com
QNT 565  Exceptional Education - snaptutorial.comQNT 565  Exceptional Education - snaptutorial.com
QNT 565 Exceptional Education - snaptutorial.comDavisMurphyB48
 
Mixing ABM and policy...what could possibly go wrong?
Mixing ABM and policy...what could possibly go wrong?Mixing ABM and policy...what could possibly go wrong?
Mixing ABM and policy...what could possibly go wrong?Bruce Edmonds
 
QNT 565 Education Specialist / snaptutorial.com
QNT 565 Education Specialist / snaptutorial.comQNT 565 Education Specialist / snaptutorial.com
QNT 565 Education Specialist / snaptutorial.comMcdonaldRyan169
 

Semelhante a Automatic time series forecasting (20)

Automatic algorithms for time series forecasting
Automatic algorithms for time series forecastingAutomatic algorithms for time series forecasting
Automatic algorithms for time series forecasting
 
What is scenario
What is scenarioWhat is scenario
What is scenario
 
Greg Wilson - We Know (but ignore) More Than We Think
Greg Wilson - We Know (but ignore) More Than We ThinkGreg Wilson - We Know (but ignore) More Than We Think
Greg Wilson - We Know (but ignore) More Than We Think
 
Undergraduated Thesis
Undergraduated ThesisUndergraduated Thesis
Undergraduated Thesis
 
Modelling Innovation – some options from probabilistic to radical
Modelling Innovation – some options from probabilistic to radicalModelling Innovation – some options from probabilistic to radical
Modelling Innovation – some options from probabilistic to radical
 
Qnt 565 Effective Communication / snaptutorial.com
Qnt 565    Effective Communication / snaptutorial.comQnt 565    Effective Communication / snaptutorial.com
Qnt 565 Effective Communication / snaptutorial.com
 
Machine learning in medicine: calm down
Machine learning in medicine: calm downMachine learning in medicine: calm down
Machine learning in medicine: calm down
 
Selection of Research Material relating to RiskMetrics Group CDO Manager
Selection of Research Material relating to RiskMetrics Group CDO ManagerSelection of Research Material relating to RiskMetrics Group CDO Manager
Selection of Research Material relating to RiskMetrics Group CDO Manager
 
Study of different approaches to Out of Distribution Generalization
Study of different approaches to Out of Distribution GeneralizationStudy of different approaches to Out of Distribution Generalization
Study of different approaches to Out of Distribution Generalization
 
Unit2-RM.pptx
Unit2-RM.pptxUnit2-RM.pptx
Unit2-RM.pptx
 
Mm23
Mm23Mm23
Mm23
 
ThesisDraftGarbageIn-GarbageOutSimulatingInternalConsistency.docx
ThesisDraftGarbageIn-GarbageOutSimulatingInternalConsistency.docxThesisDraftGarbageIn-GarbageOutSimulatingInternalConsistency.docx
ThesisDraftGarbageIn-GarbageOutSimulatingInternalConsistency.docx
 
A good response to others is not something like I agree. Please .docx
A good response to others is not something like I agree. Please .docxA good response to others is not something like I agree. Please .docx
A good response to others is not something like I agree. Please .docx
 
A Pyramid ofDecision ApproachesPaul J.H. Schoemaker J. E.docx
A Pyramid ofDecision ApproachesPaul J.H. Schoemaker J. E.docxA Pyramid ofDecision ApproachesPaul J.H. Schoemaker J. E.docx
A Pyramid ofDecision ApproachesPaul J.H. Schoemaker J. E.docx
 
FPP 1. Getting started
FPP 1. Getting startedFPP 1. Getting started
FPP 1. Getting started
 
Confounding and Directed Acyclic Graphs
Confounding and Directed Acyclic GraphsConfounding and Directed Acyclic Graphs
Confounding and Directed Acyclic Graphs
 
QNT 565 Exceptional Education - snaptutorial.com
QNT 565  Exceptional Education - snaptutorial.comQNT 565  Exceptional Education - snaptutorial.com
QNT 565 Exceptional Education - snaptutorial.com
 
Mixing ABM and policy...what could possibly go wrong?
Mixing ABM and policy...what could possibly go wrong?Mixing ABM and policy...what could possibly go wrong?
Mixing ABM and policy...what could possibly go wrong?
 
Demand forcasting
Demand forcastingDemand forcasting
Demand forcasting
 
QNT 565 Education Specialist / snaptutorial.com
QNT 565 Education Specialist / snaptutorial.comQNT 565 Education Specialist / snaptutorial.com
QNT 565 Education Specialist / snaptutorial.com
 

Mais de Rob Hyndman

Exploring the feature space of large collections of time series
Exploring the feature space of large collections of time seriesExploring the feature space of large collections of time series
Exploring the feature space of large collections of time seriesRob Hyndman
 
Exploring the boundaries of predictability
Exploring the boundaries of predictabilityExploring the boundaries of predictability
Exploring the boundaries of predictabilityRob Hyndman
 
MEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demandMEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demandRob Hyndman
 
Visualization of big time series data
Visualization of big time series dataVisualization of big time series data
Visualization of big time series dataRob Hyndman
 
Probabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demandProbabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demandRob Hyndman
 
Visualization and forecasting of big time series data
Visualization and forecasting of big time series dataVisualization and forecasting of big time series data
Visualization and forecasting of big time series dataRob Hyndman
 
Academia sinica jan-2015
Academia sinica jan-2015Academia sinica jan-2015
Academia sinica jan-2015Rob Hyndman
 
Coherent mortality forecasting using functional time series models
Coherent mortality forecasting using functional time series modelsCoherent mortality forecasting using functional time series models
Coherent mortality forecasting using functional time series modelsRob Hyndman
 
Forecasting Hierarchical Time Series
Forecasting Hierarchical Time SeriesForecasting Hierarchical Time Series
Forecasting Hierarchical Time SeriesRob Hyndman
 
Forecasting using R
Forecasting using RForecasting using R
Forecasting using RRob Hyndman
 
SimpleR: tips, tricks & tools
SimpleR: tips, tricks & toolsSimpleR: tips, tricks & tools
SimpleR: tips, tricks & toolsRob Hyndman
 
R tools for hierarchical time series
R tools for hierarchical time seriesR tools for hierarchical time series
R tools for hierarchical time seriesRob Hyndman
 
Demographic forecasting
Demographic forecastingDemographic forecasting
Demographic forecastingRob Hyndman
 
Forecasting electricity demand distributions using a semiparametric additive ...
Forecasting electricity demand distributions using a semiparametric additive ...Forecasting electricity demand distributions using a semiparametric additive ...
Forecasting electricity demand distributions using a semiparametric additive ...Rob Hyndman
 

Mais de Rob Hyndman (14)

Exploring the feature space of large collections of time series
Exploring the feature space of large collections of time seriesExploring the feature space of large collections of time series
Exploring the feature space of large collections of time series
 
Exploring the boundaries of predictability
Exploring the boundaries of predictabilityExploring the boundaries of predictability
Exploring the boundaries of predictability
 
MEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demandMEFM: An R package for long-term probabilistic forecasting of electricity demand
MEFM: An R package for long-term probabilistic forecasting of electricity demand
 
Visualization of big time series data
Visualization of big time series dataVisualization of big time series data
Visualization of big time series data
 
Probabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demandProbabilistic forecasting of long-term peak electricity demand
Probabilistic forecasting of long-term peak electricity demand
 
Visualization and forecasting of big time series data
Visualization and forecasting of big time series dataVisualization and forecasting of big time series data
Visualization and forecasting of big time series data
 
Academia sinica jan-2015
Academia sinica jan-2015Academia sinica jan-2015
Academia sinica jan-2015
 
Coherent mortality forecasting using functional time series models
Coherent mortality forecasting using functional time series modelsCoherent mortality forecasting using functional time series models
Coherent mortality forecasting using functional time series models
 
Forecasting Hierarchical Time Series
Forecasting Hierarchical Time SeriesForecasting Hierarchical Time Series
Forecasting Hierarchical Time Series
 
Forecasting using R
Forecasting using RForecasting using R
Forecasting using R
 
SimpleR: tips, tricks & tools
SimpleR: tips, tricks & toolsSimpleR: tips, tricks & tools
SimpleR: tips, tricks & tools
 
R tools for hierarchical time series
R tools for hierarchical time seriesR tools for hierarchical time series
R tools for hierarchical time series
 
Demographic forecasting
Demographic forecastingDemographic forecasting
Demographic forecasting
 
Forecasting electricity demand distributions using a semiparametric additive ...
Forecasting electricity demand distributions using a semiparametric additive ...Forecasting electricity demand distributions using a semiparametric additive ...
Forecasting electricity demand distributions using a semiparametric additive ...
 

Último

Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the ClassroomPooky Knightsmith
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 

Último (20)

Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 

Automatic time series forecasting

  • 1. Rob J Hyndman Automatic time series forecasting
  • 2. Outline 1 Motivation 2 Forecasting competitions 3 Forecasting the PBS 4 Exponential smoothing 5 ARIMA modelling 6 Automatic nonlinear forecasting? 7 Time series with complex seasonality 8 Recent developments Automatic time series forecasting Motivation 2
  • 3. Motivation Automatic time series forecasting Motivation 3
  • 4. Motivation Automatic time series forecasting Motivation 3
  • 5. Motivation Automatic time series forecasting Motivation 3
  • 6. Motivation Automatic time series forecasting Motivation 3
  • 7. Motivation Automatic time series forecasting Motivation 3
  • 8. Motivation 1 Common in business to have over 1000 products that need forecasting at least monthly. 2 Forecasts are often required by people who are untrained in time series analysis. Specifications Automatic forecasting algorithms must: ¯ determine an appropriate time series model; ¯ estimate the parameters; ¯ compute the forecasts with prediction intervals. Automatic time series forecasting Motivation 4
  • 9. Motivation 1 Common in business to have over 1000 products that need forecasting at least monthly. 2 Forecasts are often required by people who are untrained in time series analysis. Specifications Automatic forecasting algorithms must: ¯ determine an appropriate time series model; ¯ estimate the parameters; ¯ compute the forecasts with prediction intervals. Automatic time series forecasting Motivation 4
  • 10. Example: Asian sheep Automatic time series forecasting Motivation 5 Numbers of sheep in Asia Year millionsofsheep 1960 1970 1980 1990 2000 2010 250300350400450500550
  • 11. Example: Asian sheep Automatic time series forecasting Motivation 5 Automatic ETS forecasts Year millionsofsheep 1960 1970 1980 1990 2000 2010 250300350400450500550
  • 12. Example: Cortecosteroid sales Automatic time series forecasting Motivation 6 Monthly cortecosteroid drug sales in Australia Year Totalscripts(millions) 1995 2000 2005 2010 0.40.60.81.01.21.4
  • 13. Example: Cortecosteroid sales Automatic time series forecasting Motivation 6 Automatic ARIMA forecasts Year Totalscripts(millions) 1995 2000 2005 2010 0.40.60.81.01.21.4
  • 14. Outline 1 Motivation 2 Forecasting competitions 3 Forecasting the PBS 4 Exponential smoothing 5 ARIMA modelling 6 Automatic nonlinear forecasting? 7 Time series with complex seasonality 8 Recent developments Automatic time series forecasting Forecasting competitions 7
  • 15. Makridakis and Hibon (1979) Automatic time series forecasting Forecasting competitions 8
  • 16. Makridakis and Hibon (1979) Automatic time series forecasting Forecasting competitions 8
  • 17. Makridakis and Hibon (1979) This was the first large-scale empirical evaluation of time series forecasting methods. Highly controversial at the time. Difficulties: How to measure forecast accuracy? How to apply methods consistently and objectively? How to explain unexpected results? Common thinking was that the more sophisticated mathematical models (ARIMA models at the time) were necessarily better. If results showed ARIMA models not best, it must be because analyst was unskilled. Automatic time series forecasting Forecasting competitions 9
  • 18. Makridakis and Hibon (1979) I do not believe that it is very fruitful to attempt to classify series according to which forecasting techniques perform “best”. The performance of any particular technique when applied to a particular series depends essentially on (a) the model which the series obeys; (b) our ability to identify and fit this model correctly and (c) the criterion chosen to measure the forecasting accuracy. — M.B. Priestley . . . the paper suggests the application of normal scientific experimental design to forecasting, with measures of unbiased testing of forecasts against subsequent reality, for success or failure. A long overdue reform. — F.H. Hansford-Miller Automatic time series forecasting Forecasting competitions 10
  • 19. Makridakis and Hibon (1979) I do not believe that it is very fruitful to attempt to classify series according to which forecasting techniques perform “best”. The performance of any particular technique when applied to a particular series depends essentially on (a) the model which the series obeys; (b) our ability to identify and fit this model correctly and (c) the criterion chosen to measure the forecasting accuracy. — M.B. Priestley . . . the paper suggests the application of normal scientific experimental design to forecasting, with measures of unbiased testing of forecasts against subsequent reality, for success or failure. A long overdue reform. — F.H. Hansford-Miller Automatic time series forecasting Forecasting competitions 10
  • 20. Makridakis and Hibon (1979) Modern man is fascinated with the subject of forecasting — W.G. Gilchrist It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors. — W.G. Gilchrist Automatic time series forecasting Forecasting competitions 11
  • 21. Makridakis and Hibon (1979) Modern man is fascinated with the subject of forecasting — W.G. Gilchrist It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors. — W.G. Gilchrist Automatic time series forecasting Forecasting competitions 11
  • 22. Makridakis and Hibon (1979) I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods — C. Chatfield Why do empirical studies sometimes give different answers? It may depend on the selected sample of time series, but I suspect it is more likely to depend on the skill of the analyst and on their individual interpretations of what is meant by Method X. — C. Chatfield . . . these authors are more at home with simple procedures than with Box-Jenkins. — C. Chatfield Automatic time series forecasting Forecasting competitions 12
  • 23. Makridakis and Hibon (1979) I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods — C. Chatfield Why do empirical studies sometimes give different answers? It may depend on the selected sample of time series, but I suspect it is more likely to depend on the skill of the analyst and on their individual interpretations of what is meant by Method X. — C. Chatfield . . . these authors are more at home with simple procedures than with Box-Jenkins. — C. Chatfield Automatic time series forecasting Forecasting competitions 12
  • 24. Makridakis and Hibon (1979) I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods — C. Chatfield Why do empirical studies sometimes give different answers? It may depend on the selected sample of time series, but I suspect it is more likely to depend on the skill of the analyst and on their individual interpretations of what is meant by Method X. — C. Chatfield . . . these authors are more at home with simple procedures than with Box-Jenkins. — C. Chatfield Automatic time series forecasting Forecasting competitions 12
  • 25. Makridakis and Hibon (1979) There is a fact that Professor Priestley must accept: empirical evidence is in disagreement with his theoretical arguments. — S. Makridakis M. Hibon Dr Chatfield expresses some personal views about the first author . . . It might be useful for Dr Chatfield to read some of the psychological literature quoted in the main paper, and he can then learn a little more about biases and how they affect prior probabilities. — S. Makridakis M. Hibon Automatic time series forecasting Forecasting competitions 13
  • 26. Makridakis and Hibon (1979) There is a fact that Professor Priestley must accept: empirical evidence is in disagreement with his theoretical arguments. — S. Makridakis M. Hibon Dr Chatfield expresses some personal views about the first author . . . It might be useful for Dr Chatfield to read some of the psychological literature quoted in the main paper, and he can then learn a little more about biases and how they affect prior probabilities. — S. Makridakis M. Hibon Automatic time series forecasting Forecasting competitions 13
  • 27. Consequences of MH (1979) As a result of this paper, researchers started to: ¯ consider how to automate forecasting methods; ¯ study what methods give the best forecasts; ¯ be aware of the dangers of over-fitting; ¯ treat forecasting as a different problem from time series analysis. Makridakis Hibon followed up with a new competition in 1982: 1001 series Anyone could submit forecasts (avoiding the charge of incompetence) Multiple forecast measures used. Automatic time series forecasting Forecasting competitions 14
  • 28. Consequences of MH (1979) As a result of this paper, researchers started to: ¯ consider how to automate forecasting methods; ¯ study what methods give the best forecasts; ¯ be aware of the dangers of over-fitting; ¯ treat forecasting as a different problem from time series analysis. Makridakis Hibon followed up with a new competition in 1982: 1001 series Anyone could submit forecasts (avoiding the charge of incompetence) Multiple forecast measures used. Automatic time series forecasting Forecasting competitions 14
  • 29. M-competition Automatic time series forecasting Forecasting competitions 15
  • 30. M-competition Main findings (taken from Makridakis Hibon, 2000) 1 Statistically sophisticated or complex methods do not necessarily provide more accurate forecasts than simpler ones. 2 The relative ranking of the performance of the various methods varies according to the accuracy measure being used. 3 The accuracy when various methods are being combined outperforms, on average, the individual methods being combined and does very well in comparison to other methods. 4 The accuracy of the various methods depends upon the length of the forecasting horizon involved. Automatic time series forecasting Forecasting competitions 16
  • 31. M3 competition Automatic time series forecasting Forecasting competitions 17
  • 32. Makridakis and Hibon (2000) “The M3-Competition is a final attempt by the authors to settle the accuracy issue of various time series methods. . . The extension involves the inclusion of more methods/ researchers (in particular in the areas of neural networks and expert systems) and more series.” 3003 series All data from business, demography, finance and economics. Series length between 14 and 126. Either non-seasonal, monthly or quarterly. All time series positive. MH claimed that the M3-competition supported the findings of their earlier work. However, best performing methods far from “simple”. Automatic time series forecasting Forecasting competitions 18
  • 33. Makridakis and Hibon (2000) Best methods: Theta A very confusing explanation. Shown by Hyndman and Billah (2003) to be average of linear regression and simple exponential smoothing with drift, applied to seasonally adjusted data. Later, the original authors claimed that their explanation was incorrect. Forecast Pro A commercial software package with an unknown algorithm. Known to fit either exponential smoothing or ARIMA models using BIC. Automatic time series forecasting Forecasting competitions 19
  • 34. M3 results (recalculated) Method MAPE sMAPE MASE Theta 17.42 12.76 1.39 ForecastPro 18.00 13.06 1.47 ForecastX 17.35 13.09 1.42 Automatic ANN 17.18 13.98 1.53 B-J automatic 19.13 13.72 1.54 Automatic time series forecasting Forecasting competitions 20
  • 35. M3 results (recalculated) Method MAPE sMAPE MASE Theta 17.42 12.76 1.39 ForecastPro 18.00 13.06 1.47 ForecastX 17.35 13.09 1.42 Automatic ANN 17.18 13.98 1.53 B-J automatic 19.13 13.72 1.54 Automatic time series forecasting Forecasting competitions 20 ® Calculations do not match published paper. ® Some contestants apparently submitted multiple entries but only best ones published.
  • 36. Outline 1 Motivation 2 Forecasting competitions 3 Forecasting the PBS 4 Exponential smoothing 5 ARIMA modelling 6 Automatic nonlinear forecasting? 7 Time series with complex seasonality 8 Recent developments Automatic time series forecasting Forecasting the PBS 21
  • 37. Forecasting the PBS Automatic time series forecasting Forecasting the PBS 22
  • 38. Forecasting the PBS The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme. Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs. The cost to government is determined by the number and types of drugs purchased. Currently nearly 1% of GDP ($14 billion). The total cost is budgeted based on forecasts of drug usage. Automatic time series forecasting Forecasting the PBS 23
  • 39. Forecasting the PBS The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme. Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs. The cost to government is determined by the number and types of drugs purchased. Currently nearly 1% of GDP ($14 billion). The total cost is budgeted based on forecasts of drug usage. Automatic time series forecasting Forecasting the PBS 23
  • 40. Forecasting the PBS The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme. Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs. The cost to government is determined by the number and types of drugs purchased. Currently nearly 1% of GDP ($14 billion). The total cost is budgeted based on forecasts of drug usage. Automatic time series forecasting Forecasting the PBS 23
  • 41. Forecasting the PBS The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme. Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs. The cost to government is determined by the number and types of drugs purchased. Currently nearly 1% of GDP ($14 billion). The total cost is budgeted based on forecasts of drug usage. Automatic time series forecasting Forecasting the PBS 23
  • 42. Forecasting the PBS In 2001: $4.5 billion budget, under-forecasted by $800 million. Thousands of products. Seasonal demand. Subject to covert marketing, volatile products, uncontrollable expenditure. Although monthly data available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts. All forecasts being done with the FORECAST function in MS-Excel! Automatic time series forecasting Forecasting the PBS 24
  • 43. Forecasting the PBS In 2001: $4.5 billion budget, under-forecasted by $800 million. Thousands of products. Seasonal demand. Subject to covert marketing, volatile products, uncontrollable expenditure. Although monthly data available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts. All forecasts being done with the FORECAST function in MS-Excel! Automatic time series forecasting Forecasting the PBS 24
  • 44. Forecasting the PBS In 2001: $4.5 billion budget, under-forecasted by $800 million. Thousands of products. Seasonal demand. Subject to covert marketing, volatile products, uncontrollable expenditure. Although monthly data available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts. All forecasts being done with the FORECAST function in MS-Excel! Automatic time series forecasting Forecasting the PBS 24
  • 45. Forecasting the PBS In 2001: $4.5 billion budget, under-forecasted by $800 million. Thousands of products. Seasonal demand. Subject to covert marketing, volatile products, uncontrollable expenditure. Although monthly data available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts. All forecasts being done with the FORECAST function in MS-Excel! Automatic time series forecasting Forecasting the PBS 24
  • 46. Forecasting the PBS In 2001: $4.5 billion budget, under-forecasted by $800 million. Thousands of products. Seasonal demand. Subject to covert marketing, volatile products, uncontrollable expenditure. Although monthly data available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts. All forecasts being done with the FORECAST function in MS-Excel! Automatic time series forecasting Forecasting the PBS 24
  • 47. PBS data Automatic time series forecasting Forecasting the PBS 25 Total cost: A03 concession safety net group Time $thousands 1995 2000 2005 020040060080010001200
  • 48. PBS data Automatic time series forecasting Forecasting the PBS 25 Total cost: A05 general copayments group Time $thousands 1995 2000 2005 050100150200
  • 49. PBS data Automatic time series forecasting Forecasting the PBS 25 Total cost: D01 general copayments group Time $thousands 1995 2000 2005 0100200300400500600700
  • 50. PBS data Automatic time series forecasting Forecasting the PBS 25 Total cost: S01 general copayments group Time $thousands 1995 2000 2005 050010001500
  • 51. PBS data Automatic time series forecasting Forecasting the PBS 25 Total cost: R03 general copayments group Time $thousands 1995 2000 2005 10002000300040005000
  • 52. Outline 1 Motivation 2 Forecasting competitions 3 Forecasting the PBS 4 Exponential smoothing 5 ARIMA modelling 6 Automatic nonlinear forecasting? 7 Time series with complex seasonality 8 Recent developments Automatic time series forecasting Exponential smoothing 26
  • 53. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M Automatic time series forecasting Exponential smoothing 27
  • 54–63. Exponential smoothing methods

  Trend component                Seasonal component
                                 N (None)   A (Additive)   M (Multiplicative)
  N  (None)                      N,N        N,A            N,M
  A  (Additive)                  A,N        A,A            A,M
  Ad (Additive damped)           Ad,N       Ad,A           Ad,M
  M  (Multiplicative)            M,N        M,A            M,M
  Md (Multiplicative damped)     Md,N       Md,A           Md,M

  N,N:  Simple exponential smoothing
  A,N:  Holt's linear method
  Ad,N: Additive damped trend method
  M,N:  Exponential trend method
  Md,N: Multiplicative damped trend method
  A,A:  Additive Holt-Winters' method
  A,M:  Multiplicative Holt-Winters' method

  There are 15 separate exponential smoothing methods. Each can have an additive or multiplicative error, giving 30 separate models. Only 19 models are numerically stable.
  • 64–70. Exponential smoothing methods

  General notation: ETS(Error, Trend, Seasonal) — ExponenTial Smoothing.

  Examples:
  A,N,N: Simple exponential smoothing with additive errors
  A,A,N: Holt's linear method with additive errors
  M,A,M: Multiplicative Holt-Winters' method with multiplicative errors
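  The three-letter code maps directly to the model argument of ets() in the forecast package. A minimal sketch, using the built-in AirPassengers series as an illustrative stand-in:

  # library(forecast)
  # Request a specific model by its ETS code: here M,A,M, the
  # multiplicative Holt-Winters' method with multiplicative errors.
  fit <- ets(AirPassengers, model = "MAM")
  summary(fit)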
  • 71–80. ETS state space model

  [Diagram: the state x_t = (level, slope, seasonal) evolves recursively. Each observation y_t is generated from the previous state x_{t−1} and an innovation ε_t, after which the state is updated to x_t; the chain continues through y_{t+1}, ε_{t+1}, x_{t+1}, and so on.]

  Estimation: compute the likelihood L from ε_1, ε_2, ..., ε_T, then optimize L with respect to the model parameters.
  • 81. Innovations state space models

  Let x_t = (ℓ_t, b_t, s_t, s_{t−1}, ..., s_{t−m+1}) and ε_t iid ~ N(0, σ²).

  Observation equation:  y_t = h(x_{t−1}) + k(x_{t−1}) ε_t,  where µ_t = h(x_{t−1}) and e_t = k(x_{t−1}) ε_t.
  State equation:        x_t = f(x_{t−1}) + g(x_{t−1}) ε_t.

  Additive errors:       k(x_{t−1}) = 1, so y_t = µ_t + ε_t.
  Multiplicative errors: k(x_{t−1}) = µ_t, so y_t = µ_t(1 + ε_t);
                         ε_t = (y_t − µ_t)/µ_t is a relative error.
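  A minimal sketch of the simplest case, ETS(A,N,N), where h(x_{t−1}) = ℓ_{t−1}, k(·) = 1 and the state update is ℓ_t = ℓ_{t−1} + αε_t; the parameter values below are illustrative:

  # Simulate ETS(A,N,N): simple exponential smoothing with additive errors.
  set.seed(1)
  n     <- 200
  alpha <- 0.3
  sigma <- 1
  level <- numeric(n + 1)
  y     <- numeric(n)
  level[1] <- 10                            # initial state l0
  for (t in 1:n) {
    eps          <- rnorm(1, 0, sigma)      # innovation epsilon_t
    y[t]         <- level[t] + eps          # observation: y_t = l_{t-1} + eps_t
    level[t + 1] <- level[t] + alpha * eps  # state update: l_t = l_{t-1} + alpha*eps_t
  }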
  • 82–87. Innovations state space models

  All models can be written in state space form. Additive and multiplicative versions give the same point forecasts but different prediction intervals.

  Estimation
  L*(θ, x_0) = n log( Σ_{t=1}^{n} ε_t² / k²(x_{t−1}) ) + 2 Σ_{t=1}^{n} log |k(x_{t−1})|
             = −2 log(Likelihood) + constant.
  Minimize with respect to θ = (α, β, γ, φ) and the initial states x_0 = (ℓ_0, b_0, s_0, s_{−1}, ..., s_{−m+1}).

  Q: How to choose between the 19 different ETS models?
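  A minimal sketch of this estimation for ETS(A,N,N), where k(·) = 1 (so the second term of L* vanishes) and θ reduces to α; the optimizer and starting values are illustrative:

  # Estimate alpha and the initial level l0 by minimizing L*(theta, x0).
  ses_crit <- function(par, y) {
    alpha <- par[1]; level <- par[2]
    eps <- numeric(length(y))
    for (t in seq_along(y)) {
      eps[t] <- y[t] - level            # innovation: observed minus one-step forecast
      level  <- level + alpha * eps[t]  # state update
    }
    length(y) * log(sum(eps^2))         # L* with k(.) = 1
  }
  y   <- as.numeric(AirPassengers)
  fit <- optim(c(0.5, y[1]), ses_crit, y = y, method = "L-BFGS-B",
               lower = c(1e-4, -Inf), upper = c(0.9999, Inf))
  fit$par  # estimated alpha and l0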
  • 88–92. Cross-validation

  [Diagram: training data vs test data over time. Traditional evaluation uses a single training/test split. Standard cross-validation holds out scattered observations. Time series cross-validation uses a sequence of training sets, each extending one observation further, with the forecast evaluated on the following observation.]

  Also known as "evaluation on a rolling forecast origin".
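  A minimal sketch of a rolling forecast origin in R, assuming the forecast package is loaded; the minimum training size and the ETS(A,N,N) forecaster are illustrative choices:

  # library(forecast)
  # One-step forecasts from an expanding training window.
  y <- as.numeric(AirPassengers)
  k <- 48                        # minimum training-set size (illustrative)
  e <- rep(NA_real_, length(y) - 1)
  for (i in k:(length(y) - 1)) {
    fit  <- ets(ts(y[1:i], frequency = 12), model = "ANN")
    e[i] <- y[i + 1] - forecast(fit, h = 1)$mean[1]   # one-step error
  }
  mean(e^2, na.rm = TRUE)        # cross-validated one-step MSE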
  • 93–97. Akaike's Information Criterion

  AIC = −2 log(L) + 2k,
  where L is the likelihood and k is the number of estimated parameters in the model. This is a penalized likelihood approach.

  If L is Gaussian, then AIC ≈ c + T log(MSE) + 2k, where c is a constant, MSE is computed from one-step forecasts on the training set, and T is the length of the series.

  Minimizing the Gaussian AIC is asymptotically equivalent (as T → ∞) to minimizing the MSE of one-step forecasts on the test set via time series cross-validation.
  • 98–100. Akaike's Information Criterion

  AIC = −2 log(L) + 2k

  Corrected AIC
  For small T, AIC tends to over-fit. A bias-corrected version:
  AICc = AIC + 2(k+1)(k+2)/(T − k).

  Bayesian Information Criterion
  BIC = AIC + k[log(T) − 2].
  BIC penalizes terms more heavily than AIC. Minimizing BIC is consistent if there is a true model.
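  Expressed in R, with illustrative values of AIC, k and T:

  # The corrections as functions of AIC, k and T.
  aicc <- function(aic, k, T) aic + 2 * (k + 1) * (k + 2) / (T - k)
  bic  <- function(aic, k, T) aic + k * (log(T) - 2)
  aicc(-122, 17, 204)   # correction shrinks towards zero as T grows
  bic(-122, 17, 204)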
  • 101–105. What to use?

  Choice: AIC, AICc, BIC, CV-MSE.
  CV-MSE is too time-consuming for most automatic forecasting purposes, and requires large T.
  As T → ∞, BIC selects the true model if there is one. But that is never true!
  AICc focuses on forecasting performance, can be used on small samples, and is very fast to compute.
  Empirical studies in forecasting show AIC is better than BIC for forecast accuracy.
  • 106–109. ets algorithm in R

  Based on Hyndman, Koehler, Snyder & Grose (IJF, 2002):
  Apply each of the 19 models that are appropriate to the data.
  Optimize parameters and initial values using MLE.
  Select the best model using AICc.
  Produce forecasts using the best model.
  Obtain prediction intervals using the underlying state space model.
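  Usage sketch, assuming the forecast package is loaded; AICc is already the default criterion, but ic makes the choice explicit:

  # library(forecast)
  # Automatic selection over the ETS class; 'ic' can be "aicc", "aic" or "bic".
  fit <- ets(AirPassengers, ic = "aicc")
  summary(fit)                   # selected model and parameter estimates
  plot(forecast(fit, h = 24))    # forecasts with prediction intervals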
  • 110–111. Exponential smoothing

  fit <- ets(livestock)
  fcast <- forecast(fit)
  plot(fcast)

  [Figure: Forecasts from ETS(M,A,N) — millions of sheep vs Year, 1960–2010.]
  • 112–113. Exponential smoothing

  fit <- ets(h02)
  fcast <- forecast(fit)
  plot(fcast)

  [Figure: Forecasts from ETS(M,Md,M) — total scripts (millions) vs Year, 1995–2010.]
  • 114. Exponential smoothing

  > fit
  ETS(M,Md,M)

  Smoothing parameters:
    alpha = 0.3318
    beta  = 4e-04
    gamma = 1e-04
    phi   = 0.9695

  Initial states:
    l = 0.4003
    b = 1.0233
    s = 0.8575 0.8183 0.7559 0.7627 0.6873 1.2884
        1.3456 1.1867 1.1653 1.1033 1.0398 0.9893

  sigma: 0.0651

         AIC       AICc        BIC
  -121.97999 -118.68967  -65.57195
  • 115. M3 comparisons

  Method          MAPE   sMAPE   MASE
  Theta           17.42  12.76   1.39
  ForecastPro     18.00  13.06   1.47
  ForecastX       17.35  13.09   1.42
  Automatic ANN   17.18  13.98   1.53
  B-J automatic   19.13  13.72   1.54
  ETS             18.06  13.38   1.52
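  MASE scales out-of-sample errors by the in-sample mean absolute error of naive forecasts (seasonal naive for seasonal data). A minimal sketch for the non-seasonal case:

  # Mean Absolute Scaled Error, non-seasonal version.
  mase <- function(train, test, forecasts) {
    scale <- mean(abs(diff(train)))        # in-sample MAE of naive forecasts
    mean(abs(test - forecasts)) / scale
  }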
  • 116–118. Exponential smoothing

  Further reading: www.OTexts.com/fpp
  • 119. Outline

  1 Motivation
  2 Forecasting competitions
  3 Forecasting the PBS
  4 Exponential smoothing
  5 ARIMA modelling
  6 Automatic nonlinear forecasting?
  7 Time series with complex seasonality
  8 Recent developments
  • 120–124. ARIMA models

  [Diagram: lagged values y_{t−1}, y_{t−2}, y_{t−3} and lagged errors ε_t, ε_{t−1}, ε_{t−2} as inputs; y_t as output.]

  Autoregression (AR) model: y_t regressed on its own lagged values.
  Autoregression moving average (ARMA) model: lagged errors added as inputs.

  Estimation: compute the likelihood L from ε_1, ε_2, ..., ε_T and use an optimization algorithm to maximize L.

  ARIMA model: an ARMA model applied to differences of the series.
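  A minimal sketch of ARMA estimation by maximum likelihood in base R, using the built-in lh series as an illustrative example:

  # Fit an ARMA(1,1) by maximum likelihood, then forecast.
  fit <- arima(lh, order = c(1, 0, 1))
  fit$loglik                        # maximized log-likelihood
  predict(fit, n.ahead = 5)$pred    # point forecasts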
  • 128–129. Auto ARIMA

  fit <- auto.arima(livestock)
  fcast <- forecast(fit)
  plot(fcast)

  [Figure: Forecasts from ARIMA(0,1,0) with drift — millions of sheep vs Year, 1960–2010.]
  • 130–131. Auto ARIMA

  fit <- auto.arima(h02)
  fcast <- forecast(fit)
  plot(fcast)

  [Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12] — total scripts (millions) vs Year, 1995–2010.]
  • 132. Auto ARIMA

  > fit
  Series: h02
  ARIMA(3,1,3)(0,1,1)[12]

  Coefficients:
            ar1      ar2     ar3      ma1     ma2     ma3     sma1
        -0.3648  -0.0636  0.3568  -0.4850  0.0479  -0.353  -0.5931
  s.e.   0.2198   0.3293  0.1268   0.2227  0.2755   0.212   0.0651

  sigma^2 estimated as 0.002706: log likelihood = 290.25
  AIC = -564.5   AICc = -563.71   BIC = -538.48
  • 133–135. How does auto.arima() work?

  A non-seasonal ARIMA process:
  φ(B)(1 − B)^d y_t = c + θ(B)ε_t.
  Need to select appropriate orders p, q, d, and whether to include c.

  Hyndman & Khandakar (JSS, 2008) algorithm:
  Select the number of differences d via the KPSS unit root test.
  Select p, q, c by minimizing the AICc.
  Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.

  Algorithm choices are driven by forecast accuracy.
  • 136. How does auto.arima() work?

  A seasonal ARIMA process:
  Φ(B^m) φ(B) (1 − B)^d (1 − B^m)^D y_t = c + Θ(B^m) θ(B) ε_t.
  Need to select appropriate orders p, q, d, P, Q, D, and whether to include c.

  Hyndman & Khandakar (JSS, 2008) algorithm:
  Select the number of differences d via the KPSS unit root test.
  Select D using the OCSB unit root test.
  Select p, q, P, Q, c by minimizing the AICc.
  Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
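  The pieces of this algorithm are exposed in the forecast package; a usage sketch, with AirPassengers standing in for h02:

  # library(forecast)
  d <- ndiffs(AirPassengers, test = "kpss")   # first differences via the KPSS test
  D <- nsdiffs(AirPassengers)                 # seasonal differences (the talk uses the
                                              # OCSB test; defaults vary across versions)
  fit <- auto.arima(AirPassengers, stepwise = TRUE, ic = "aicc")
  fit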
  • 137–138. M3 comparisons

  Method          MAPE   sMAPE   MASE
  Theta           17.42  12.76   1.39
  ForecastPro     18.00  13.06   1.47
  B-J automatic   19.13  13.72   1.54
  ETS             18.06  13.38   1.52
  AutoARIMA       19.04  13.86   1.47
  ETS-ARIMA       17.92  13.02   1.44
  • 139–143. M3 conclusions

  MYTHS
  Simple methods do better.
  Exponential smoothing is better than ARIMA.

  FACTS
  The best methods are hybrid approaches.
  ETS-ARIMA (the simple average of ETS-additive and AutoARIMA) is the only fully documented method that is comparable to the M3 competition winners.
  • 144. Outline

  1 Motivation
  2 Forecasting competitions
  3 Forecasting the PBS
  4 Exponential smoothing
  5 ARIMA modelling
  6 Automatic nonlinear forecasting?
  7 Time series with complex seasonality
  8 Recent developments
  • 145–149. Automatic nonlinear forecasting

  Automatic ANN in the M3 competition did poorly.
  Linear methods did best in the NN3 competition!
  Very few machine learning methods get published in the IJF because authors cannot demonstrate that their methods give better forecasts than linear benchmark methods, even on supposedly nonlinear data.
  Some good recent work by Kourentzes and Crone (Neurocomputing, 2010) on automated ANNs for time series.
  Watch this space!
  • 150. Outline

  1 Motivation
  2 Forecasting competitions
  3 Forecasting the PBS
  4 Exponential smoothing
  5 ARIMA modelling
  6 Automatic nonlinear forecasting?
  7 Time series with complex seasonality
  8 Recent developments
  • 151. Examples

  [Figure: US finished motor gasoline products — thousands of barrels per day, weekly, 1992–2004.]

  • 152. Examples

  [Figure: Number of calls to a large American bank (7am–9pm) — call arrivals per 5-minute interval, 3 March to 12 May.]

  • 153. Examples

  [Figure: Turkish electricity demand — daily demand (GW), 2000–2008.]
  • 154. TBATS model

  TBATS:
  Trigonometric terms for seasonality
  Box-Cox transformations for heterogeneity
  ARMA errors for short-term dynamics
  Trend (possibly damped)
  Seasonal (including multiple and non-integer periods)

  Automatic algorithm described in A.M. De Livera, R.J. Hyndman and R.D. Snyder (2011). "Forecasting time series with complex seasonal patterns using exponential smoothing". Journal of the American Statistical Association 106(496), 1513–1527.
  • 155–161. TBATS model

  y_t = observation at time t

  Box-Cox transformation:
  y_t^(ω) = (y_t^ω − 1)/ω if ω ≠ 0;  log y_t if ω = 0.

  Measurement equation (M seasonal periods):
  y_t^(ω) = ℓ_{t−1} + φ b_{t−1} + Σ_{i=1}^{M} s^{(i)}_{t−m_i} + d_t

  Global and local trend:
  ℓ_t = ℓ_{t−1} + φ b_{t−1} + α d_t
  b_t = (1 − φ) b + φ b_{t−1} + β d_t

  ARMA error:
  d_t = Σ_{i=1}^{p} φ_i d_{t−i} + Σ_{j=1}^{q} θ_j ε_{t−j} + ε_t

  Fourier-like seasonal terms:
  s^{(i)}_t = Σ_{j=1}^{k_i} s^{(i)}_{j,t}
  s^{(i)}_{j,t}  =  s^{(i)}_{j,t−1} cos λ^{(i)}_j + s*^{(i)}_{j,t−1} sin λ^{(i)}_j + γ^{(i)}_1 d_t
  s*^{(i)}_{j,t} = −s^{(i)}_{j,t−1} sin λ^{(i)}_j + s*^{(i)}_{j,t−1} cos λ^{(i)}_j + γ^{(i)}_2 d_t

  TBATS: Trigonometric, Box-Cox, ARMA, Trend, Seasonal.
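  Usage sketch, assuming the forecast package; taylor is the half-hourly electricity demand series shipped with the package, and the forecast horizon is illustrative:

  # library(forecast)
  # Declare multiple seasonal periods with msts() before fitting tbats().
  y   <- msts(taylor, seasonal.periods = c(48, 336))  # daily and weekly seasonality
  fit <- tbats(y)                 # can be slow; fits trigonometric seasonal terms
  plot(forecast(fit, h = 48 * 7)) # forecast one week ahead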
  • 162. Examples

  fit <- tbats(gasoline)
  fcast <- forecast(fit)
  plot(fcast)

  [Figure: Forecasts from TBATS(0.999, {2,2}, 1, {52.1785714285714,8}) — thousands of barrels per day vs Weeks.]

  • 163. Examples

  fit <- tbats(callcentre)
  fcast <- forecast(fit)
  plot(fcast)

  [Figure: Forecasts from TBATS(1, {3,1}, 0.987, {169,5, 845,3}) — call arrivals per 5-minute interval, 3 March to 9 June.]

  • 164. Examples

  fit <- tbats(turk)
  fcast <- forecast(fit)
  plot(fcast)

  [Figure: Forecasts from TBATS(0, {5,3}, 0.997, {7,3, 354.37,12, 365.25,4}) — electricity demand (GW) vs Days, 2000–2010.]
  • 165. Outline

  1 Motivation
  2 Forecasting competitions
  3 Forecasting the PBS
  4 Exponential smoothing
  5 ARIMA modelling
  6 Automatic nonlinear forecasting?
  7 Time series with complex seasonality
  8 Recent developments
  • 166–169. Further competitions

  1 2011 tourism forecasting competition.
  2 Kaggle and other forecasting platforms.
  3 GEFCom 2012: point forecasting of electricity load and wind power.
  4 GEFCom 2014: probabilistic forecasting of electricity load, electricity price, wind energy and solar energy.
  • 170–173. Forecasts about forecasting

  1 Automatic algorithms will become more general — handling a wide variety of time series.
  2 Model selection methods will take account of multi-step forecast accuracy as well as one-step forecast accuracy.
  3 Automatic forecasting algorithms for multivariate time series will be developed.
  4 Automatic forecasting algorithms that include covariate information will be developed.
  • 174. For further information

  robjhyndman.com
  Slides and references for this talk.
  Links to all papers and books.
  Links to R packages.
  A blog about forecasting research.