8. Motivation

It is common in business to have over 1000 products that need
forecasting at least monthly. Forecasts are often required by people
who are untrained in time series analysis.

Specifications
Automatic forecasting algorithms must:
- determine an appropriate time series model;
- estimate the parameters;
- compute the forecasts with prediction intervals.

14. Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Time series forecasting
5 Evaluating forecast accuracy
6 Exponential smoothing
7 ARIMA modelling
8 Automatic nonlinear forecasting?
9 Time series with complex seasonality
10 Forecasts about automatic forecasting
17. Makridakis and Hibon (1979)

This was the first large-scale empirical evaluation of time series
forecasting methods. It was highly controversial at the time.

Difficulties:
- How to measure forecast accuracy?
- How to apply methods consistently and objectively?
- How to explain unexpected results?

Common thinking was that the more sophisticated mathematical models
(ARIMA models at the time) were necessarily better. If the results
showed ARIMA models were not best, it must be because the analyst
was unskilled.

18. Makridakis and Hibon (1979)

"I do not believe that it is very fruitful to attempt to classify
series according to which forecasting techniques perform 'best'.
The performance of any particular technique when applied to a
particular series depends essentially on (a) the model which the
series obeys; (b) our ability to identify and fit this model
correctly and (c) the criterion chosen to measure the forecasting
accuracy."
— M.B. Priestley

". . . the paper suggests the application of normal scientific
experimental design to forecasting, with measures of unbiased
testing of forecasts against subsequent reality, for success or
failure. A long overdue reform."
— F.H. Hansford-Miller

20. Makridakis and Hibon (1979)

"Modern man is fascinated with the subject of forecasting."
— W.G. Gilchrist

"It is amazing to me, however, that after all this exercise in
identifying models, transforming and so on, that the autoregressive
moving averages come out so badly. I wonder whether it might be
partly due to the authors not using the backwards forecasting
approach to obtain the initial errors."
— W.G. Gilchrist

22. Makridakis and Hibon (1979)

"I find it hard to believe that Box-Jenkins, if properly applied,
can actually be worse than so many of the simple methods."
— C. Chatfield

"Why do empirical studies sometimes give different answers? It may
depend on the selected sample of time series, but I suspect it is
more likely to depend on the skill of the analyst and on their
individual interpretations of what is meant by Method X."
— C. Chatfield

". . . these authors are more at home with simple procedures than
with Box-Jenkins."
— C. Chatfield

25. Makridakis and Hibon (1979)

"There is a fact that Professor Priestley must accept: empirical
evidence is in disagreement with his theoretical arguments."
— S. Makridakis & M. Hibon

"Dr Chatfield expresses some personal views about the first author
. . . It might be useful for Dr Chatfield to read some of the
psychological literature quoted in the main paper, and he can then
learn a little more about biases and how they affect prior
probabilities."
— S. Makridakis & M. Hibon

27. Consequences of M&H (1979)

As a result of this paper, researchers started to:
- consider how to automate forecasting methods;
- study what methods give the best forecasts;
- be aware of the dangers of over-fitting;
- treat forecasting as a different problem from time series analysis.

Makridakis & Hibon followed up with a new competition in 1982:
- 1001 series
- Anyone could submit forecasts (avoiding the charge of incompetence)
- Multiple forecast measures used.

30. M-competition

Main findings (taken from Makridakis & Hibon, 2000):
1 Statistically sophisticated or complex methods do not necessarily
  provide more accurate forecasts than simpler ones.
2 The relative ranking of the performance of the various methods
  varies according to the accuracy measure being used.
3 The accuracy when various methods are being combined outperforms,
  on average, the individual methods being combined and does very
  well in comparison to other methods.
4 The accuracy of the various methods depends upon the length of
  the forecasting horizon involved.

32. Makridakis and Hibon (2000)

"The M3-Competition is a final attempt by the authors to settle the
accuracy issue of various time series methods. . . The extension
involves the inclusion of more methods/researchers (in particular
in the areas of neural networks and expert systems) and more series."
- 3003 series
- All data from business, demography, finance and economics.
- Series length between 14 and 126.
- Either non-seasonal, monthly or quarterly.
- All time series positive.

M&H claimed that the M3-competition supported the findings of their
earlier work. However, the best performing methods were far from
"simple".

33. Makridakis and Hibon (2000)

Best methods:

Theta
- A very confusing explanation.
- Shown by Hyndman and Billah (2003) to be the average of linear
  regression and simple exponential smoothing with drift, applied
  to seasonally adjusted data.
- Later, the original authors claimed that their explanation was
  incorrect.

Forecast Pro
- A commercial software package with an unknown algorithm.
- Known to fit either exponential smoothing or ARIMA models using
  the BIC.

34. M3 results (recalculated)

Method          MAPE   sMAPE   MASE
Theta           17.42  12.76   1.39
ForecastPro     18.00  13.06   1.47
ForecastX       17.35  13.09   1.42
Automatic ANN   17.18  13.98   1.53
B-J automatic   19.13  13.72   1.54

- Calculations do not match the published paper.
- Some contestants apparently submitted multiple entries but only
  the best ones were published.

38. Forecasting the PBS

The Pharmaceutical Benefits Scheme (PBS) is the Australian
government's drug subsidy scheme.
- Many drugs bought from pharmacies are subsidised to allow more
  equitable access to modern drugs.
- The cost to government is determined by the number and types of
  drugs purchased. Currently nearly 1% of GDP ($14 billion).
- The total cost is budgeted based on forecasts of drug usage.

42. Forecasting the PBS

- In 2001: a $4.5 billion budget, under-forecast by $800 million.
- Thousands of products. Seasonal demand.
- Subject to covert marketing, volatile products, uncontrollable
  expenditure.
- Although monthly data are available for 10 years, the data are
  aggregated to annual values, and only the first three years are
  used in estimating the forecasts.
- All forecasts were being done with the FORECAST function in
  MS-Excel!

47. Total cost: A03 concession safety net group

[Figure: PBS data; total cost in $ thousands, 1995-2005]

49. Total cost: D01 general copayments group

[Figure: PBS data; total cost in $ thousands, 1995-2005]

55. Time series forecasting

[Diagram: inputs yt−1, yt−2, yt−3 and errors εt, εt−1, εt−2 feed
into the output yt]

Autoregressive moving average (ARMA) model.

Estimation
- Compute the likelihood L from ε1, ε2, . . . , εT.
- Use an optimization algorithm to maximize L.

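To make the estimation step concrete, here is a minimal sketch in
base R (WWWusage is a built-in example series, not one of the talk's
datasets). arima() builds the Gaussian likelihood from the one-step
errors ε1, . . . , εT and maximizes it numerically:

    y <- diff(WWWusage)                  # example series, differenced
    fit <- arima(y, order = c(2, 0, 1))  # ARMA(2,1) with a mean term
    fit$loglik                           # maximized log-likelihood L
    predict(fit, n.ahead = 10)$pred     # point forecasts
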
57. Time series forecasting

State space model
[Diagram: the unobserved state sequence xt−1 → xt → xt+1 → . . .
generates the observations; at each step, the previous state and the
error εt produce yt and the new state xt.]
- xt is unobserved.

Estimation
- Compute the likelihood L from ε1, ε2, . . . , εT.
- Use an optimization algorithm to maximize L.

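A hedged sketch of the innovations recursion for the simplest such
model, a local level model (plain R; the function name is
illustrative, not from any package). Each observation yields a
one-step error εt, the state is updated, and the Gaussian
log-likelihood is accumulated from ε1, . . . , εT:

    # y_t = x_{t-1} + e_t,   x_t = x_{t-1} + alpha * e_t
    loglik_local_level <- function(alpha, x0, y) {
      x <- x0
      e <- numeric(length(y))
      for (t in seq_along(y)) {
        e[t] <- y[t] - x          # one-step forecast error (innovation)
        x <- x + alpha * e[t]     # state update
      }
      s2 <- mean(e^2)             # innovation variance, profiled out
      -0.5 * length(y) * (log(2 * pi * s2) + 1)
    }
    # Maximize L over (alpha, x0) with a generic optimizer:
    # optim(c(0.5, y[1]), function(p) -loglik_local_level(p[1], p[2], y))
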
73. Akaike's Information Criterion

AIC = −2 log(L) + 2k

where L is the likelihood and k is the number of estimated
parameters in the model.
- This is a penalized likelihood approach.
- If L is Gaussian, then AIC ≈ c + T log(MSE) + 2k, where c is a
  constant, MSE is from one-step forecasts on the training set, and
  T is the length of the series.
- Minimizing the Gaussian AIC is asymptotically equivalent (as
  T → ∞) to minimizing the MSE from one-step forecasts on the test
  set via time series cross-validation.

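As a quick check of the definition, this sketch computes the AIC by
hand for an ARIMA fit and compares it with R's built-in AIC() (base
R; note that k here counts the innovation variance as an estimated
parameter, which is how R's logLik treats ARIMA fits):

    fit <- arima(diff(WWWusage), order = c(2, 0, 1))
    k <- length(fit$coef) + 1     # +1 for the innovation variance
    -2 * fit$loglik + 2 * k       # AIC = -2 log(L) + 2k, by hand
    AIC(fit)                      # should agree
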
78. Akaike's Information Criterion

AIC = −2 log(L) + 2k

Corrected AIC
For small T, the AIC tends to over-fit. Bias-corrected version:
AICc = AIC + 2(k+1)(k+2) / (T − k)

Bayesian Information Criterion
BIC = AIC + k [log(T) − 2]
- BIC penalizes terms more heavily than AIC.
- Minimizing BIC is consistent if there is a true model.

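The slide's formulas translate directly into code. A minimal sketch
(the function name is illustrative; T is the series length, k the
number of estimated parameters):

    info_criteria <- function(loglik, k, T) {
      aic  <- -2 * loglik + 2 * k                     # AIC
      aicc <- aic + 2 * (k + 1) * (k + 2) / (T - k)   # bias-corrected AIC
      bic  <- aic + k * (log(T) - 2)                  # = -2 log(L) + k log(T)
      c(AIC = aic, AICc = aicc, BIC = bic)
    }
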
81. What to use?

Choice: AIC, AICc, BIC, CV-MSE
- CV-MSE is too time consuming for most automatic forecasting
  purposes. It also requires a large T.
- As T → ∞, BIC selects the true model if there is one. But there
  is never a true model!
- AICc focuses on forecasting performance, can be used on small
  samples, and is very fast to compute.
- Empirical studies in forecasting show that AIC is better than BIC
  for forecast accuracy.

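For comparison, CV-MSE can be computed with tsCV() from the forecast
package (a later addition to the package than this talk; WWWusage is
again a stand-in example series):

    library(forecast)
    # one-step time series cross-validation errors for an ETS forecaster
    e <- tsCV(WWWusage, function(y, h) forecast(ets(y), h = h), h = 1)
    mean(e^2, na.rm = TRUE)       # CV-MSE
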
89. Exponential smoothing methods

                              Seasonal Component
Trend Component               N (None)  A (Additive)  M (Multiplicative)
N  (None)                     N,N       N,A           N,M
A  (Additive)                 A,N       A,A           A,M
Ad (Additive damped)          Ad,N      Ad,A          Ad,M
M  (Multiplicative)           M,N       M,A           M,M
Md (Multiplicative damped)    Md,N      Md,A          Md,M

N,N:   Simple exponential smoothing
A,N:   Holt's linear method
Ad,N:  Additive damped trend method
M,N:   Exponential trend method
Md,N:  Multiplicative damped trend method
A,A:   Additive Holt-Winters' method
A,M:   Multiplicative Holt-Winters' method

- There are 15 separate exponential smoothing methods.
- Each can have an additive or multiplicative error, giving 30
  separate models.
- Only 19 models are numerically stable.

100. Exponential smoothing methods

General notation: E T S (Error, Trend, Seasonal), for
"ExponenTial Smoothing".

Examples:
A,N,N:  Simple exponential smoothing with additive errors
A,A,N:  Holt's linear method with additive errors
M,A,M:  Multiplicative Holt-Winters' method with multiplicative errors

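The three-letter code maps directly onto the model argument of ets()
in the forecast package. A small sketch using the built-in
AirPassengers series (not one of the talk's examples):

    library(forecast)
    fit <- ets(AirPassengers, model = "MAM")  # ETS(M,A,M): multiplicative
                                              # Holt-Winters, mult. errors
    summary(fit)
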
106. Exponential smoothing methods

Innovations state space models
- All ETS models can be written in innovations state space form
  (IJF, 2002).
- Additive and multiplicative versions give the same point forecasts
  but different prediction intervals.

107. ETS state space models

[Diagram: unobserved states xt−1 → xt → xt+1 → . . . generating the
observations yt, yt+1, . . . with errors εt, εt+1, . . .]

State space model
xt = (ℓt, bt, st, st−1, . . . , st−m+1)

Estimation
Optimize L with respect to θ = (α, β, γ, φ) and the initial states
x0 = (ℓ0, b0, s0, s−1, . . . , s−m+1).

108. ets algorithm in R

Based on Hyndman & Khandakar (IJF, 2008):
- Apply each of the 19 models that are appropriate to the data.
  Optimize parameters and initial values using MLE.
- Select the best model using the AICc.
- Produce forecasts using the best model.
- Obtain prediction intervals using the underlying state space model.

113. Exponential smoothing

fit <- ets(livestock)
fcast <- forecast(fit)
plot(fcast)

[Figure: forecasts from ETS(M,A,N); millions of sheep, 1960-2010]

115. Exponential smoothing

fit <- ets(h02)
fcast <- forecast(fit)
plot(fcast)

[Figure: forecasts from ETS(M,Md,M); total scripts (millions),
1995-2010]

116. M3 comparisons

Method          MAPE   sMAPE   MASE
Theta           17.42  12.76   1.39
ForecastPro     18.00  13.06   1.47
ForecastX       17.35  13.09   1.42
Automatic ANN   17.18  13.98   1.53
B-J automatic   19.13  13.72   1.54
ETS             18.06  13.38   1.52

118. Time series forecasting

[Diagram: inputs yt−1, yt−2, yt−3 and errors εt, εt−1, εt−2 feed
into the output yt]

ARIMA model
An autoregressive moving average (ARMA) model applied to differences.

Estimation
- Compute the likelihood L from ε1, ε2, . . . , εT.
- Use an optimization algorithm to maximize L.

123. Auto ARIMA

fit <- auto.arima(livestock)
fcast <- forecast(fit)
plot(fcast)

[Figure: forecasts from ARIMA(0,1,0) with drift; millions of sheep,
1960-2010]

125. Auto ARIMA

fit <- auto.arima(h02)
fcast <- forecast(fit)
plot(fcast)

[Figure: forecasts from ARIMA(3,1,3)(0,1,1)[12]; total scripts
(millions), 1995-2010]

127. How does auto.arima() work?

A non-seasonal ARIMA process:
φ(B)(1 − B)^d yt = c + θ(B)εt

We need to select appropriate orders p, q, d, and whether to
include c.

Hyndman & Khandakar (JSS, 2008) algorithm:
- Select the number of differences d via the KPSS unit root test.
- Select p, q, c by minimising the AICc.
- Use a stepwise search to traverse the model space, starting with
  a simple model and considering nearby variants.

Algorithm choices are driven by forecast accuracy.

129. How does auto.arima() work?

A seasonal ARIMA process:
Φ(B^m)φ(B)(1 − B)^d (1 − B^m)^D yt = c + Θ(B^m)θ(B)εt

We need to select appropriate orders p, q, d, P, Q, D, and whether
to include c.

Hyndman & Khandakar (JSS, 2008) algorithm:
- Select the number of differences d via the KPSS unit root test.
- Select D using the OCSB unit root test.
- Select p, q, P, Q, c by minimising the AICc.
- Use a stepwise search to traverse the model space, starting with
  a simple model and considering nearby variants.

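A sketch of the corresponding calls in the forecast package;
ndiffs() and nsdiffs() expose the same unit-root tests the algorithm
uses internally (the built-in co2 series is a stand-in, not from the
talk):

    library(forecast)
    ndiffs(co2, test = "kpss")     # d: first differences, via KPSS test
    nsdiffs(co2, test = "ocsb")    # D: seasonal differences, via OCSB test
    fit <- auto.arima(co2, stepwise = TRUE)  # stepwise AICc search
    fit
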
132. M3 conclusions

MYTHS
- Simple methods do better.
- Exponential smoothing is better than ARIMA.

FACTS
- The best methods are hybrid approaches.
- ETS-ARIMA (the simple average of ETS-additive and AutoARIMA) is
  the only fully documented method that is comparable to the M3
  competition winners.

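A sketch of that kind of combination using the forecast package
(AirPassengers is a stand-in series; note the talk's ETS-ARIMA
restricts ETS to additive models, which is not imposed here):

    library(forecast)
    h <- 12
    f1 <- forecast(ets(AirPassengers), h = h)$mean
    f2 <- forecast(auto.arima(AirPassengers), h = h)$mean
    combo <- (f1 + f2) / 2      # equally weighted forecast combination
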
138. Automatic nonlinear forecasting

- The automatic ANN in the M3 competition did poorly.
- Linear methods did best in the NN3 competition!
- Very few machine learning methods get published in the IJF because
  authors cannot demonstrate that their methods give better
  forecasts than linear benchmark methods, even on supposedly
  nonlinear data.
- Some good recent work by Kourentzes and Crone (Neurocomputing,
  2010) on automated ANNs for time series.
- Watch this space!

144. Examples

[Figure: US finished motor gasoline products; thousands of barrels
per day, weekly data, 1992-2004]

145. Examples

[Figure: number of calls to a large American bank (7am-9pm);
5-minute intervals, 3 March-12 May]

147. TBATS model

TBATS
- Trigonometric terms for seasonality
- Box-Cox transformations for heterogeneity
- ARMA errors for short-term dynamics
- Trend (possibly damped)
- Seasonal (including multiple and non-integer periods)

Automatic algorithm described in A.M. De Livera, R.J. Hyndman, and
R.D. Snyder (2011). "Forecasting time series with complex seasonal
patterns using exponential smoothing". Journal of the American
Statistical Association 106(496), 1513-1527.

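Multiple seasonal periods are declared with msts() before calling
tbats(). A sketch using the taylor half-hourly electricity demand
series that ships with the forecast package (not one of the slides'
datasets):

    library(forecast)
    y <- msts(taylor, seasonal.periods = c(48, 336))  # daily and weekly cycles
    fit <- tbats(y)
    plot(forecast(fit, h = 336))                      # one week ahead
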
148. Examples

fit <- tbats(gasoline)
fcast <- forecast(fit)
plot(fcast)

[Figure: forecasts from TBATS(0.999, {2,2}, 1, {52.1785714285714,8});
thousands of barrels per day, 1995-2005]

149. Examples

fit <- tbats(callcentre)
fcast <- forecast(fit)
plot(fcast)

[Figure: forecasts from TBATS(1, {3,1}, 0.987, {169,5, 845,3});
call arrivals per 5-minute interval, 3 March-9 June]

150. Examples

fit <- tbats(turk)
fcast <- forecast(fit)
plot(fcast)

[Figure: forecasts from TBATS(0, {5,3}, 0.997, {7,3, 354.37,12,
365.25,4}); daily electricity demand (GW), 2000-2010]

152. Forecasts about forecasting

1 Automatic algorithms will become more general, handling a wide
  variety of time series.
2 Model selection methods will take account of multi-step forecast
  accuracy as well as one-step forecast accuracy.
3 Automatic forecasting algorithms for multivariate time series
  will be developed.
4 Automatic forecasting algorithms that include covariate
  information will be developed.

156. For further information

robjhyndman.com
- Slides and references for this talk.
- Links to all papers and books.
- Links to R packages.
- A blog about forecasting research.