Arthur CHARPENTIER - Sales forecasting.

Sales forecasting # 2
Arthur Charpentier
arthur.charpentier@univ-rennes1.fr
Agenda

Qualitative and quantitative methods, a very general introduction
• Series decomposition
• Short- versus long-term forecasting
• Regression techniques

Regression and econometric methods
• Box & Jenkins ARIMA time series method
• Forecasting with ARIMA series

Practical issues: forecasting with MS Excel
Time series decomposition

[Figure: A13 Highway traffic series]
Time series decomposition

[Figure: A13 Highway, detrended series (trend removed)]
Time series decomposition, modeling the random part

[Figure: histogram and density of residuals (v2)]
Time series decomposition, modeling the seasonal component

[Figure: A13 Highway: trend and cycle]
Modeling the random component

The unpredictable random component is the key element when forecasting: most of the uncertainty comes from this random component ε_t.

The lower its variance, the smaller the uncertainty on forecasts.

The general theoretical framework for the randomness of time series is weak stationarity.
Defining stationarity

A time series (X_t) is weakly stationary if
• for all t, E(X_t²) < +∞,
• for all t, E(X_t) = µ, a constant independent of t,
• for all t and all h, cov(X_t, X_{t+h}) = E([X_t − µ][X_{t+h} − µ]) = γ(h), independent of t.

The function γ(·) is called the autocovariance function.

Given a stationary series (X_t), define the autocovariance function as
h → γ_X(h) = cov(X_t, X_{t−h}) = E(X_t X_{t−h}) − E(X_t)·E(X_{t−h}),
and define the autocorrelation function as
h → ρ_X(h) = corr(X_t, X_{t−h}) = cov(X_t, X_{t−h}) / √(V(X_t)·V(X_{t−h})) = γ_X(h) / γ_X(0).
Defining stationarity

A process (X_t) is said to be strongly stationary if for all t_1, ..., t_n and all h we have the following equality in law:
L(X_{t_1}, ..., X_{t_n}) = L(X_{t_1+h}, ..., X_{t_n+h}).

A time series (ε_t) is a white noise if all autocovariances are null, i.e. γ(h) = 0 for all h ≠ 0. Thus, a process (ε_t) is a white noise if it is stationary, centered and uncorrelated, i.e.
E(ε_t) = 0, V(ε_t) = σ², and ρ_ε(h) = 0 for any h ≠ 0.
Statistical issues

Consider a set of observations {X_1, ..., X_T}.

The empirical mean is defined as
X̄_T = (1/T) Σ_{t=1}^T X_t.

The empirical autocovariance function is defined as
γ̂_T(h) = (1/(T−h)) Σ_{t=h+1}^T (X_t − X̄_T)(X_{t−h} − X̄_T),
while the empirical autocorrelation function is defined as
ρ̂_T(h) = γ̂_T(h) / γ̂_T(0).

Note that these estimators can be biased, but they are asymptotically unbiased: more precisely, γ̂_T(h) → γ(h) and ρ̂_T(h) → ρ(h) as T → ∞.
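The empirical estimators above can be sketched in a few lines of code (a minimal Python illustration, not part of the original slides):

```python
# Empirical mean, autocovariance and autocorrelation of a series x (a list),
# following the definitions above (pure Python, no libraries).
def empirical_mean(x):
    return sum(x) / len(x)

def empirical_autocov(x, h):
    """gamma_hat(h) = 1/(T-h) * sum over t of (x_t - mean)(x_{t-h} - mean)."""
    T = len(x)
    m = empirical_mean(x)
    return sum((x[t] - m) * (x[t - h] - m) for t in range(h, T)) / (T - h)

def empirical_autocorr(x, h):
    return empirical_autocov(x, h) / empirical_autocov(x, 0)
```

For instance, a perfectly alternating series has empirical autocorrelation −1 at lag 1.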
Backward and forward operators

Define the lag operator L (or B, for backward) as the linear operator
L : X_t → L(X_t) = L X_t = X_{t−1},
and the forward operator F as
F : X_t → F(X_t) = F X_t = X_{t+1}.

Note that L ◦ F = F ◦ L = I (the identity operator), and further F = L^{−1} and L = F^{−1}.

These operators can be composed: L² = L ◦ L and, more generally, L^p = L ◦ L ◦ ... ◦ L for p ∈ N, with the convention L⁰ = I. Note that L^p(X_t) = X_{t−p}.

Let A denote a polynomial, A(z) = a_0 + a_1 z + a_2 z² + ... + a_p z^p. Then A(L) is the operator
A(L) = a_0 I + a_1 L + a_2 L² + ... + a_p L^p = Σ_{k=0}^p a_k L^k.

Let (X_t) denote a time series. The series (Y_t) defined by Y_t = A(L) X_t satisfies
Y_t = A(L) X_t = Σ_{k=0}^p a_k X_{t−k},
or, more generally, assuming that we can formally take the limit,
A(z) = Σ_{k=0}^∞ a_k z^k and A(L) = Σ_{k=0}^∞ a_k L^k.
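As an illustration (a sketch, not from the slides), applying A(L) to a series is just a weighted sum of lagged values:

```python
# Apply the lag-polynomial operator A(L) to a series:
# Y_t = A(L) X_t = sum_{k=0}^p a_k X_{t-k}, defined here for t >= p = deg(A).
def apply_lag_poly(a, x):
    p = len(a) - 1
    return [sum(a[k] * x[t - k] for k in range(p + 1)) for t in range(p, len(x))]

# Example: A(z) = 1 - z gives the first-difference operator (1 - L) X_t.
diffs = apply_lag_poly([1, -1], [1, 2, 4, 7])  # differences 2-1, 4-2, 7-4
```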
Backward and forward operators

Note that for all moving averages A and B:
A(L) + B(L) = (A + B)(L),
αA(L) = (αA)(L) for α ∈ R,
A(L) ◦ B(L) = (AB)(L) = B(L) ◦ A(L).

The moving average C = AB = BA satisfies
(Σ_{k=0}^∞ a_k L^k) ◦ (Σ_{k=0}^∞ b_k L^k) = Σ_{i=0}^∞ c_i L^i, where c_i = Σ_{k=0}^i a_k b_{i−k}.
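The coefficients c_i of the product are a discrete convolution, which can be sketched directly (an illustration, not from the slides):

```python
# Coefficients of the product C = AB of two (finite) lag polynomials,
# via the convolution c_i = sum_{k=0}^{i} a_k b_{i-k}.
def lag_poly_product(a, b):
    c = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# Example: (1 - L)(1 + L) = 1 - L^2.
prod = lag_poly_product([1, -1], [1, 1])
```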
Geometry and probability

Recall that it is possible to define an inner product on L² (the space of square-integrable variables, i.e. with finite variance),
⟨X, Y⟩ = E([X − E(X)]·[Y − E(Y)]) = cov(X, Y).
The associated norm is then ||X||² = E([X − E(X)]²) = V(X).

Two random variables are orthogonal if ⟨X, Y⟩ = 0, i.e. cov(X, Y) = 0.

Hence the conditional expectation is simply a projection in L²: E(X|Y) is the projection of the random variable X onto the space generated by Y, i.e. E(X|Y) = φ(Y) such that
X − φ(Y) ⊥ h(Y) for any function h, i.e. ⟨X − φ(Y), h(Y)⟩ = 0,
φ(Y) = Z* = argmin{||X − Z||² : Z = h(Y)},
with E(φ(Y)²) < ∞.
Linear projection

The conditional expectation E(X|Y) is a projection in the set of all functions {h(Y)}.

In linear regression, the projection is made in the subset of linear functions h(·). We call this linear function the conditional linear expectation, or linear projection, denoted EL(X|Y).

In purely endogenous models, the best forecast for X_{T+1} given past information {X_T, X_{T−1}, X_{T−2}, ..., X_{T−h}, ...} is
X̂_{T+1} = E(X_{T+1} | X_T, X_{T−1}, X_{T−2}, ...) = φ(X_T, X_{T−1}, X_{T−2}, ...).

Since estimating a nonlinear function is difficult (especially in high dimension), we focus on linear functions, i.e. autoregressive models,
X̂_{T+1} = EL(X_{T+1} | X_T, X_{T−1}, X_{T−2}, ...) = α_0 X_T + α_1 X_{T−1} + α_2 X_{T−2} + ...
Defining partial autocorrelations

Given a stationary series (X_t), define the partial autocorrelation function h → ψ_X(h) as
ψ_X(h) = corr(X̃_t, X̃_{t−h}),
where
X̃_{t−h} = X_{t−h} − EL(X_{t−h} | X_{t−1}, ..., X_{t−h+1}),
X̃_t = X_t − EL(X_t | X_{t−1}, ..., X_{t−h+1}).
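The definition above is in terms of linear projections; in practice the partial autocorrelations are usually computed from the autocorrelations. One standard way (an assumption here, not shown on the slides) is the Durbin-Levinson recursion:

```python
# Partial autocorrelations psi(1..H) from autocorrelations rho[0..H],
# computed with the Durbin-Levinson recursion (a standard algorithm;
# equivalent to the projection definition, but not derived on the slides).
def pacf_from_acf(rho, H):
    psi = [0.0] * (H + 1)
    phi_prev = {}
    for k in range(1, H + 1):
        if k == 1:
            num, den = rho[1], 1.0
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - j] for j in range(1, k))
            den = 1.0 - sum(phi_prev[j] * rho[j] for j in range(1, k))
        psi[k] = num / den
        phi_cur = {k: psi[k]}
        for j in range(1, k):
            phi_cur[j] = phi_prev[j] - psi[k] * phi_prev[k - j]
        phi_prev = phi_cur
    return psi[1:]
```

For an AR(1) with ρ(h) = φ^h, this returns ψ(1) = φ and ψ(h) = 0 for h ≥ 2, as stated later in the deck.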
Time series decomposition, modeling the random part

[Figure: autocorrelations (ACF) of residuals (v2)]
Time series decomposition, modeling the random part

[Figure: partial autocorrelations (PACF) of residuals (v2)]
Time series decomposition, modeling the detrended series

[Figure: A13 Highway, detrended series]
Time series decomposition, modeling the detrended series

[Figure: autocorrelations (ACF) of the detrended series]
Time series decomposition, modeling the detrended series

[Figure: partial autocorrelations (PACF) of the detrended series]
Time series decomposition, modeling Y_t = X_t − X_{t−12}

[Figure: A13 Highway, lagged detrended series]
Time series decomposition, modeling Y_t = X_t − X_{t−12}

[Figure: autocorrelations (ACF) of the lagged detrended series]
Time series decomposition, modeling Y_t = X_t − X_{t−12}

[Figure: partial autocorrelations (PACF) of the lagged detrended series]
Estimating autocorrelations with MS Excel
A white noise

A white noise is defined as a process that is centered (E(ε_t) = 0) and stationary (V(ε_t) = σ²), such that cov(ε_t, ε_{t−h}) = 0 for all h ≠ 0.

The so-called Box-Pierce test can be used to test
H_0: ρ(1) = ρ(2) = ... = ρ(h) = 0,
H_a: there exists i such that ρ(i) ≠ 0.

The idea is to use the statistic
Q_h = T Σ_{k=1}^h ρ̂_k²,
where h is the number of lags and T the total number of observations.

Under H_0, Q_h has a χ² distribution with h degrees of freedom.
A white noise

Another statistic with better properties is a modified version of Q, known as the Ljung-Box statistic,
Q'_h = T(T + 2) Σ_{k=1}^h ρ̂_k² / (T − k).

Most software packages return Q_h for h = 1, 2, ..., together with the associated p-value. If p exceeds 5% (the standard significance level) we can feel confident in accepting H_0, while if p is less than 5%, we should reject H_0.
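Both statistics are simple sums of squared sample autocorrelations; a minimal sketch (the p-value step, comparing against a χ²_h quantile, is left to statistical software):

```python
# Box-Pierce statistic Q_h and its Ljung-Box refinement Q'_h, computed
# from the sample autocorrelations rho_hat = [rho(1), ..., rho(h)].
def box_pierce(rho_hat, T):
    return T * sum(r * r for r in rho_hat)

def ljung_box(rho_hat, T):
    return T * (T + 2) * sum(r * r / (T - k) for k, r in enumerate(rho_hat, start=1))
```

For finite T the Ljung-Box statistic is slightly larger than the Box-Pierce one, since each term is inflated by (T + 2)/(T − k).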
A white noise

[Figure: a simulated white noise, its autocorrelations (ACF) and partial autocorrelations (PACF)]
Time series decomposition, testing for white noise

[Figure: Box-Pierce statistics and p-values, testing for white noise on the lagged detrended series]
Time series decomposition, testing for white noise

[Figure: Box-Pierce statistics and p-values, testing for white noise on residuals (v2)]
Autoregressive process AR(p)

We call an autoregressive process of order p, denoted AR(p), a stationary process (X_t) satisfying the equation
X_t − Σ_{i=1}^p φ_i X_{t−i} = ε_t for all t ∈ Z, (1)
where the φ_i's are real-valued coefficients and (ε_t) is a white noise process with variance σ². Equation (1) can be written equivalently
Φ(L) X_t = ε_t, where Φ(L) = I − φ_1 L − ... − φ_p L^p.
Autoregressive process AR(1), order 1

The general expression for an AR(1) process is
X_t − φ X_{t−1} = ε_t for all t ∈ Z,
where (ε_t) is a white noise with variance σ².

If φ = ±1, the process (X_t) is not stationary. E.g. if φ = 1, X_t = X_{t−1} + ε_t (called a random walk) can be written
X_t − X_{t−h} = ε_t + ε_{t−1} + ... + ε_{t−h+1},
and thus E(X_t − X_{t−h})² = hσ².

But it is possible to prove that for any stationary process, E(X_t − X_{t−h})² ≤ 4V(X_t). Since it is impossible to have hσ² ≤ 4V(X_t) for every h, the process cannot be stationary.
Autoregressive process AR(1), order 1

If |φ| < 1, it is possible to invert the polynomial in the lag operator:
X_t = (1 − φL)^{−1} ε_t = Σ_{i=0}^∞ φ^i ε_{t−i} (as a function of the past noise (ε_t)). (2)

For a stationary process, the autocorrelation function is given by ρ(h) = φ^h. Further, ψ(1) = φ and ψ(h) = 0 for h ≥ 2.
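The inversion (2) can be checked numerically: truncating the infinite sum at N terms, the reconstructed series satisfies the AR(1) equation up to a remainder of order φ^{N+1} (a sketch with an arbitrary bounded input sequence, not data from the slides):

```python
# Numerical check of the inversion (2): with X_t = sum_{i=0}^{N} phi^i eps_{t-i}
# (a truncation of the infinite sum), X_t - phi X_{t-1} recovers eps_t
# up to a remainder of order phi^(N+1).
phi, N, T = 0.5, 40, 200
eps = [(-1) ** t * (1.0 + 0.01 * t) for t in range(T)]   # any bounded input works
X = [sum(phi ** i * eps[t - i] for i in range(N + 1)) for t in range(N, T)]
resid = [X[s] - phi * X[s - 1] - eps[s + N] for s in range(1, len(X))]
max_resid = max(abs(r) for r in resid)   # of order phi^(N+1), here ~1e-12
```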
Autoregressive process AR(2), order 2

These processes are also called Yule processes, and they satisfy
(1 − φ_1 L − φ_2 L²) X_t = ε_t,
where the roots of Φ(z) = 1 − φ_1 z − φ_2 z² are assumed to lie outside the unit circle, i.e.
1 − φ_1 − φ_2 > 0,
1 + φ_1 − φ_2 > 0,
φ_1² + 4φ_2 < 0.
Autoregressive process AR(2), order 2

The autocorrelation function satisfies the recursion
ρ(h) = φ_1 ρ(h−1) + φ_2 ρ(h−2) for any h ≥ 2,
and the partial autocorrelation function satisfies
ψ(h) = ρ(1) for h = 1,
ψ(h) = [ρ(2) − ρ(1)²] / [1 − ρ(1)²] for h = 2,
ψ(h) = 0 for h ≥ 3.
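A quick numerical sanity check (a sketch; the starting value ρ(1) = φ_1/(1 − φ_2) is the standard Yule-Walker result, assumed here rather than derived on the slide). For an AR(2), the ψ(2) formula above should recover φ_2 exactly:

```python
# The AR(2) autocorrelation recursion, started from the Yule-Walker value
# rho(1) = phi1/(1 - phi2) (a standard result, not derived on the slide).
phi1, phi2 = 0.5, -0.3
rho = [1.0, phi1 / (1 - phi2)]
for h in range(2, 10):
    rho.append(phi1 * rho[h - 1] + phi2 * rho[h - 2])

# The slide's formula for psi(2) should give back phi2 for an AR(2):
psi2 = (rho[2] - rho[1] ** 2) / (1 - rho[1] ** 2)
```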
Moving average process MA(q)

We call a moving average process of order q, denoted MA(q), a stationary process (X_t) satisfying the equation
X_t = ε_t + Σ_{i=1}^q θ_i ε_{t−i} for all t ∈ Z, (3)
where the θ_i's are real-valued coefficients and (ε_t) is a white noise process with variance σ². Processes (3) can be written equivalently
X_t = Θ(L) ε_t, where Θ(L) = I + θ_1 L + ... + θ_q L^q.

The autocovariance function satisfies
γ(h) = E(X_t X_{t−h})
= E([ε_t + θ_1 ε_{t−1} + ... + θ_q ε_{t−q}][ε_{t−h} + θ_1 ε_{t−h−1} + ... + θ_q ε_{t−h−q}])
= (θ_h + θ_{h+1} θ_1 + ... + θ_q θ_{q−h}) σ² if 1 ≤ h ≤ q,
= 0 if h > q.
Moving average process MA(q)

If h = 0, then γ(0) = (1 + θ_1² + θ_2² + ... + θ_q²) σ². These equations can be written compactly as
γ(k) = σ² Σ_{j=0}^{q−k} θ_j θ_{j+k}, with the convention θ_0 = 1.

The autocorrelation function satisfies
ρ(h) = (θ_h + θ_{h+1} θ_1 + ... + θ_q θ_{q−h}) / (1 + θ_1² + θ_2² + ... + θ_q²) if 1 ≤ h ≤ q,
and ρ(h) = 0 if h > q.
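The theoretical MA(q) autocorrelations translate directly into code (a minimal sketch of the formula above):

```python
# Theoretical MA(q) autocorrelation rho(h) for X_t = eps_t + sum theta_i eps_{t-i},
# with theta = [theta_1, ..., theta_q] and the convention theta_0 = 1.
def ma_acf(theta, h):
    th = [1.0] + list(theta)
    q = len(th) - 1
    if h > q:
        return 0.0
    num = sum(th[j] * th[j + h] for j in range(q - h + 1))
    den = sum(t * t for t in th)
    return num / den
```

For an MA(1) with θ = 0.5 this gives ρ(1) = 0.5/1.25 = 0.4 and ρ(h) = 0 beyond lag 1, consistent with the next slide.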
Moving average process MA(1), order 1

The general expression of an MA(1) process is
X_t = ε_t + θ ε_{t−1} for all t ∈ Z,
where (ε_t) is a white noise with variance σ². The autocorrelations are given by
ρ(1) = θ / (1 + θ²), and ρ(h) = 0 for h ≥ 2.

Note that −1/2 ≤ ρ(1) ≤ 1/2: MA(1) processes only have small autocorrelations.

The partial autocorrelation of order h is given by
ψ(h) = (−1)^h θ^h (θ² − 1) / (1 − θ^{2(h+1)}).
Autoregressive moving average process ARMA(p, q)

We call an autoregressive moving average process of orders p and q, denoted ARMA(p, q), a stationary process (X_t) satisfying the equation
X_t = Σ_{j=1}^p φ_j X_{t−j} + ε_t + Σ_{i=1}^q θ_i ε_{t−i} for all t ∈ Z, (4)
where the φ_j's and θ_i's are real-valued coefficients and (ε_t) is a white noise process with variance σ². Processes (4) can be written equivalently
Φ(L) X_t = Θ(L) ε_t,
where Φ(L) = I − φ_1 L − ... − φ_p L^p and Θ(L) = I + θ_1 L + ... + θ_q L^q.
Autoregressive moving average process ARMA(p, q)

Note that, under some technical assumptions, one can write
X_t = Φ^{−1}(L) ◦ Θ(L) ε_t,
i.e. the ARMA(p, q) process is also an MA(∞) process, and
Φ(L) ◦ Θ^{−1}(L) X_t = ε_t,
i.e. the ARMA(p, q) process is also an AR(∞) process.

Wold's theorem states that any stationary process (satisfying further technical conditions) can be written as an MA process.

More generally, in practice, a stationary series can be modeled either by an AR(p) process, an MA(q) process, or an ARMA(p′, q′) with p′ < p and q′ < q.
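The MA(∞) weights ψ_j of an ARMA(p, q) can be obtained by a simple recursion, ψ_j = θ_j + Σ_i φ_i ψ_{j−i} (a standard computation, sketched here; the recursion itself is not on the slides):

```python
# Expand an ARMA(p, q) into its MA(infinity) weights psi_0, psi_1, ..., psi_n,
# via psi_0 = 1 and psi_j = theta_j + sum_{i=1}^{min(j,p)} phi_i psi_{j-i}
# (with theta_j = 0 for j > q).
def arma_to_ma(phi, theta, n):
    psi = [1.0]
    for j in range(1, n + 1):
        t = theta[j - 1] if j - 1 < len(theta) else 0.0
        psi.append(t + sum(phi[i] * psi[j - 1 - i] for i in range(min(j, len(phi)))))
    return psi
```

For a pure AR(1) with φ = 0.5 this reproduces the expansion (2), ψ_j = φ^j; for a pure MA(1) the weights stop after lag 1.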
Forecasting with AR(1) processes

Consider an AR(1) process, X_t = µ + φ X_{t−1} + ε_t, and let X*_{T+h|T} denote the forecast of X_{T+h} made at date T. Then
• X*_{T+1|T} = µ + φ X_T,
• X*_{T+2|T} = µ + φ X*_{T+1|T} = µ + φ[µ + φ X_T] = µ[1 + φ] + φ² X_T,
• X*_{T+3|T} = µ + φ X*_{T+2|T} = µ[1 + φ + φ²] + φ³ X_T,
and recursively X*_{T+h|T} can be written
X*_{T+h|T} = µ + φ X*_{T+h−1|T} = µ[1 + φ + φ² + ... + φ^{h−1}] + φ^h X_T,
or equivalently
X*_{T+h|T} = µ/(1 − φ) + φ^h [X_T − µ/(1 − φ)] = µ (1 − φ^h)/(1 − φ) + φ^h X_T.
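The recursion can be checked against the closed form numerically (a sketch; mu, phi and xT below are arbitrary illustrative values):

```python
# h-step AR(1) forecast by iterating f <- mu + phi * f, to be compared
# with the closed form mu (1 - phi^h)/(1 - phi) + phi^h X_T.
def ar1_forecast(mu, phi, xT, h):
    f = xT
    for _ in range(h):
        f = mu + phi * f
    return f

mu, phi, xT = 2.0, 0.8, 5.0
closed = [mu * (1 - phi ** h) / (1 - phi) + phi ** h * xT for h in range(1, 6)]
```

Note that if X_T already equals the long-run mean µ/(1 − φ), the forecast stays there at every horizon.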
Forecasting with AR(1) processes

The forecasting error made at time T for horizon h is
∆_{T,h} = X*_{T+h|T} − X_{T+h} = X*_{T+h|T} − [φ X_{T+h−1} + µ + ε_{T+h}]
= ...
= X*_{T+h|T} − [φ^h X_T + (φ^{h−1} + ... + φ + 1) µ] − [ε_{T+h} + φ ε_{T+h−1} + ... + φ^{h−1} ε_{T+1}], (6)

thus ∆_{T,h} = −[ε_{T+h} + φ ε_{T+h−1} + ... + φ^{h−1} ε_{T+1}], with variance
V = (1 + φ² + φ⁴ + ... + φ^{2h−2}) σ², where V(ε_t) = σ².

Thus the variance of the forecast error increases with the horizon.
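The error-variance formula is a finite geometric sum, increasing in h toward the unconditional variance σ²/(1 − φ²) (a minimal sketch with illustrative parameter values):

```python
# Variance of the h-step AR(1) forecast error:
# V_h = sigma^2 (1 + phi^2 + phi^4 + ... + phi^(2h-2)),
# which increases with h toward sigma^2 / (1 - phi^2).
def forecast_error_var(phi, sigma2, h):
    return sigma2 * sum(phi ** (2 * k) for k in range(h))

phi, sigma2 = 0.8, 1.0
V = [forecast_error_var(phi, sigma2, h) for h in range(1, 6)]
```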