Cointegration and error correction models are used to analyze the relationship between non-stationary time series variables. The Dickey-Fuller test determines if variables contain a unit root and are non-stationary. If two non-stationary variables have a stationary linear combination, they are cointegrated, indicating a long-run equilibrium relationship. An error correction model represents the short-run dynamic adjustment between cointegrated variables back to their long-run equilibrium when shocked.
2. Introduction
• Assess the importance of stationary variables
when running OLS regressions.
• Describe the Dickey-Fuller test for stationarity
• Explain the concept of Cointegration with a bi-
variate model
• Discuss the importance of error correction
models and their relationship to cointegration.
• Describe how to test a given theory using
cointegration.
3. OLS Regression with I(1) data
• The following results were produced when
output was regressed against stock prices:
$$\hat{y}_t = \underset{(0.4)}{0.6} + \underset{(0.1)}{0.4}\, s_t, \qquad R^2 = 0.9, \quad DW = 0.3 \quad (R^2 > DW)$$
4. OLS Regression with I(1) data
• The results on the previous slide cannot be interpreted,
as there is clear evidence of autocorrelation (the DW statistic is far below 2).
• However, the explanatory power is very high, which on its own
would suggest a very good result.
• In this case the drifts in the two variables are related but not
explicitly modelled, which causes the autocorrelation. Because the
drifts in the two variables are related, the explanatory
power is high.
• This produces the case where the R-squared statistic is
larger than the DW statistic, often referred to as an
indirect test for cointegration.
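The comparison in the last bullet can be checked mechanically. The sketch below is a minimal, dependency-free illustration: `durbin_watson` and `r_squared_exceeds_dw` are hypothetical helper names rather than library functions, the residuals are made-up numbers, and the rule is only a warning sign, not a formal test.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared changes in the residuals
    divided by the sum of squared residuals. It is about 2 with no
    first-order autocorrelation and near 0 under strong positive
    autocorrelation."""
    changes = sum((residuals[t] - residuals[t - 1]) ** 2
                  for t in range(1, len(residuals)))
    return changes / sum(e ** 2 for e in residuals)


def r_squared_exceeds_dw(r_squared, dw):
    """Rule-of-thumb flag: R-squared larger than the DW statistic."""
    return r_squared > dw


# Slowly drifting residuals (made-up numbers) give a DW far below 2.
persistent = [1.0, 1.1, 1.2, 1.1, 1.0, 0.9]
print(round(durbin_watson(persistent), 4))

# R-squared of 0.9 and DW of 0.3, as read from the regression output above.
print(r_squared_exceeds_dw(0.9, 0.3))  # True
```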
5. Difference Stationary and Trend Stationary
• The main method for inducing stationarity is
to difference the data. For instance, the
random walk becomes stationary on
differencing:
$$y_t = y_{t-1} + u_t$$
$$\Delta y_t = y_t - y_{t-1} = u_t$$
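As a small numerical illustration of differencing a random walk, the sketch below builds a walk from simulated white-noise shocks and checks that the first difference returns those shocks. The seed, sample size and variable names are arbitrary choices made for the illustration.

```python
import random

random.seed(0)  # arbitrary seed so the run is repeatable

# White-noise shocks u_t and the random walk y_t = y_{t-1} + u_t, with y_0 = 0.
shocks = [random.gauss(0.0, 1.0) for _ in range(200)]
walk = []
level = 0.0
for u in shocks:
    level += u
    walk.append(level)

# First difference: y_t - y_{t-1}.
diffs = [walk[t] - walk[t - 1] for t in range(1, len(walk))]

# The differences reproduce the shocks (up to floating-point rounding).
max_gap = max(abs(d - u) for d, u in zip(diffs, shocks[1:]))
print(max_gap < 1e-9)  # True
```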
6. Trend Stationary
• A series is said to be trend stationary when
it is stationary around a trend:
$$y_t = \beta_0 + \beta_1 t + u_t$$
where t is the trend.
7. Differenced Variables
• If, in a bi-variate model, both variables are
difference-stationary, then one way
around the problem is to run a model with
differenced variables instead of level
variables:
$$\Delta y_t = \beta_0 + \beta_1 \Delta x_t + u_t$$
8. Differenced Variables
• However, this option may not be acceptable because:
- The variables in this form may not be in
accordance with the original theory.
- The model could be omitting important
long-run information, since differenced
variables are usually thought of as
representing the short run.
- The model may not have the correct
functional form.
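The regression in differences, and one concrete sense in which it drops long-run information, can be sketched as follows. The data are made up so that the levels obey y = 3 + 2x exactly, and `ols_line` and `first_difference` are hypothetical helper names. The differenced regression recovers the slope, but the level constant of 3 no longer appears anywhere in the estimates.

```python
def ols_line(x, y):
    """Intercept and slope from a bivariate OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    return y_bar - slope * x_bar, slope


def first_difference(series):
    """Change between consecutive observations."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


# Made-up levels data with an exact long-run relation y = 3 + 2x.
x = [1.0, 2.0, 4.0, 7.0, 8.0, 12.0]
y = [3.0 + 2.0 * v for v in x]

intercept, slope = ols_line(first_difference(x), first_difference(y))
print(round(slope, 6))           # slope of the differenced regression
print(abs(intercept) < 1e-9)     # the level constant has dropped out
```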
9. Stationary data
• One of the most important tests for stationarity is
the Dickey-Fuller Test or Augmented Dickey-
Fuller Test (ADF).
• The test is based on a random walk and the fact
that a random walk has a unit root.
• If the variable in question follows a random walk,
it is therefore not stationary.
• This is why when testing to determine if a
variable is stationary, it is said to be testing for a
‘unit root’.
10. Dickey-Fuller Test for Stationarity
• The test is based on the following
regression. The coefficient on the lagged
level variable is tested against zero using
its t-ratio, computed in the same way as for a t-test:
$$\Delta y_t = \gamma y_{t-1} + u_t$$
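A minimal sketch of the mechanics of this regression, with no constant or trend, is given below. `dickey_fuller_stat` is a hypothetical helper name and the four-point series is made up purely to show the arithmetic. The resulting statistic would have to be compared with Dickey-Fuller critical values, not with ordinary t tables.

```python
import math


def dickey_fuller_stat(series):
    """Regress the first difference on the lagged level (no constant, no
    trend) and return (coefficient, standard error, test statistic)."""
    lagged = series[:-1]
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    sxx = sum(v * v for v in lagged)
    coef = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    residuals = [d - coef * v for v, d in zip(lagged, diffs)]
    # One estimated parameter, so len(diffs) - 1 degrees of freedom.
    sigma2 = sum(e * e for e in residuals) / (len(diffs) - 1)
    se = math.sqrt(sigma2 / sxx)
    return coef, se, coef / se


# Tiny made-up series, only to show the mechanics.
coef, se, stat = dickey_fuller_stat([1.0, 2.0, 4.0, 3.0])
print(round(coef, 4), round(se, 4), round(stat, 4))
```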
11. Dickey-Fuller Test
• This test assumes that the error term (u) satisfies
the Gauss-Markov assumptions.
• Under the null hypothesis the test statistic does not follow the
t-distribution, so critical values have been
produced specifically for this test.
• A constant and a trend can also be included in
the test regression. The test is still
whether the coefficient on the lagged level
variable equals zero, although the critical values change.
• The null hypothesis is a unit root and the alternative is no
unit root. If the null is not rejected, the variable needs to be
differenced at least once to induce stationarity.
12. Augmented Dickey-Fuller Test (ADF)
• The error term in the Dickey-Fuller test
usually has autocorrelation, which needs
to be removed if the result is to be valid.
The main way is to add lags of the differenced dependent
variable until the autocorrelation has
been mopped up.
• The test is the same as before in that it is
the coefficient on the lagged level of the dependent
variable that is tested.
13. Augmented Dickey-Fuller Test
• The test is as follows, where the number of
lagged differences of the dependent variable (N) is determined
by an information criterion:
$$\Delta y_t = \gamma y_{t-1} + \sum_{i=1}^{N} \delta_i \Delta y_{t-i} + u_t$$
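The augmented regression can be sketched without external libraries as below. This is an illustration under stated assumptions, not a reference implementation: `ols_with_se` and `adf_stat` are hypothetical helper names, the lag length is fixed by hand instead of being chosen by an information criterion, no constant or trend is included, and the two series are simulated. The statistic would still need to be compared with Dickey-Fuller critical values.

```python
import math
import random


def ols_with_se(X, y):
    """OLS coefficients and standard errors from the normal equations."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [xtx[i] + [float(i == j) for j in range(k)] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [v - sum(b * x for b, x in zip(beta, r)) for r, v in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    return beta, [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]


def adf_stat(series, n_lags):
    """t-ratio on the lagged level when the first difference is regressed on
    the lagged level and n_lags lagged differences (no constant or trend)."""
    d = [series[t] - series[t - 1] for t in range(1, len(series))]
    rows, target = [], []
    for j in range(n_lags, len(d)):
        # d[j] is the change ending at series[j + 1], so the lagged level is
        # series[j] and the lagged differences are d[j-1], ..., d[j-n_lags].
        rows.append([series[j]] + [d[j - i] for i in range(1, n_lags + 1)])
        target.append(d[j])
    beta, se = ols_with_se(rows, target)
    return beta[0] / se[0]


# Simulated data: a stationary AR(1) series and a random walk, same shocks.
random.seed(1)
stationary, walk = [0.0], [0.0]
for _ in range(300):
    shock = random.gauss(0.0, 1.0)
    stationary.append(0.5 * stationary[-1] + shock)
    walk.append(walk[-1] + shock)

stationary_stat = adf_stat(stationary, 2)
walk_stat = adf_stat(walk, 2)
print(round(stationary_stat, 2), round(walk_stat, 2))
```

With zero lags the function reduces to the plain Dickey-Fuller regression, which is a convenient sanity check.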
14. I(2) Variables
• When a variable contains two unit roots, it is said
to be I(2) and needs to be differenced twice to
induce stationarity.
• When using the ADF test, the data is first tested
to determine if it contains a unit root, i.e. it is I(1)
and not I(0)
• If it is not I(0), it could be I(1), I(2) or have a
higher order of unit roots
• In this case the ADF test needs to be conducted
on the differenced variable to determine if it is
I(1) or I(2). (It is very rare to find I(3) or higher
orders).
15. Dickey-Fuller Test
• Most tests using the Dickey-Fuller (DF) and
Augmented Dickey-Fuller (ADF) technique are
considered to have low power: they fail to reject the null of a
unit root more often than they should. Known limitations include:
• The power depends on the time span of the data rather than the number of
observations.
• If the autoregressive coefficient is close to, but not exactly, one, the ADF
test may still indicate a non-stationary process.
• These tests assume a single unit root, but many time
series are I(2) or higher.
• The tests fail to account for structural breaks in the
time series.
16. Engle-Granger Approach to Cointegration
• This is essentially a bi-variate approach and is
based on the Augmented Dickey-Fuller test for
stationarity.
• If we have two non-stationary variables
containing a unit root (i.e. I(1) variables), then
we describe them as being cointegrated if the
error term from a regression of one on the other is stationary (i.e. I(0)).
• We test for the stationarity of the error term
using the ADF test in the same way as for the
individual variables.
17. Cointegration
• When we have an I(0) error term with two
I(1) variables, in effect the drift processes in
the I(1) variables have cancelled each
other out to produce an error term with no
drift.
• If there is evidence of cointegration
between X and Y, we say that there is a
long-run equilibrium relationship between
X and Y.
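One way to see the cancellation, offered as an illustration and not as part of the original material, is to assume that both variables are driven by a single shared random-walk component. The symbols w_t, v_t, e_{x,t}, e_{y,t} and β are introduced here only for this illustration, with the e terms assumed to be I(0):

```latex
\begin{aligned}
x_t &= w_t + e_{x,t}, \qquad w_t = w_{t-1} + v_t,\\
y_t &= \beta w_t + e_{y,t},\\
y_t - \beta x_t &= e_{y,t} - \beta e_{x,t}.
\end{aligned}
```

Both x_t and y_t inherit the unit root of w_t, but the combination y_t - βx_t contains only the stationary terms, so it is I(0).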
18. Granger Representation Theorem
• According to Granger, if there is evidence of
cointegration between two or more variables,
then a valid error correction model should also
exist between them.
• The error correction model is then a
representation of the short-run dynamic
relationship between X and Y, in which the error
correction term incorporates the long-run
information about X and Y into our model.
• This implies that the error correction term will be
significant, if cointegration exists.
19. Engle-Granger Two-Step Method
• The method involves first estimating the
cointegrating relationship and testing for
cointegration.
• The second stage involves forming the
error correction model, where the error
correction term is the residual from the
cointegrating relationship, lagged once.
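The two steps can be sketched on simulated data as follows. Everything specific here is an assumption made for the illustration: the data-generating process (x is a random walk, y = 1 + 2x + e with a stationary AR(1) error e), the seed, and the hypothetical helpers `solve_ols` and `diff`. The residual test statistic would have to be judged against critical values appropriate for a residual-based test.

```python
import math
import random


def solve_ols(X, y):
    """OLS coefficients from the normal equations, by Gauss-Jordan elimination."""
    k = len(X[0])
    aug = [[sum(r[i] * r[j] for r in X) for j in range(k)]
           + [sum(r[i] * v for r, v in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def diff(series):
    """Change between consecutive observations."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


# Simulated data (an assumption for illustration).
random.seed(42)
x, e = [0.0], [0.0]
for _ in range(499):
    x.append(x[-1] + random.gauss(0.0, 1.0))
    e.append(0.5 * e[-1] + random.gauss(0.0, 1.0))
y = [1.0 + 2.0 * xv + ev for xv, ev in zip(x, e)]

# Step 1: cointegrating regression in levels, then a Dickey-Fuller style
# regression of the change in the residual on its lagged level.
b0, b1 = solve_ols([[1.0, xv] for xv in x], y)
u_hat = [yv - b0 - b1 * xv for xv, yv in zip(x, y)]
du, lag_u = diff(u_hat), u_hat[:-1]
sxx = sum(v * v for v in lag_u)
rho = sum(v * d for v, d in zip(lag_u, du)) / sxx
sigma2 = sum((d - rho * v) ** 2 for v, d in zip(lag_u, du)) / (len(du) - 1)
df_stat = rho / math.sqrt(sigma2 / sxx)

# Step 2: ECM in differences, with the residual lagged once as the
# error correction term.
rows = [[1.0, dx, u_lag] for dx, u_lag in zip(diff(x), u_hat[:-1])]
a0, a1, a2 = solve_ols(rows, diff(y))

print(f"long-run slope {b1:.3f}, residual DF statistic {df_stat:.2f}")
print(f"ECM: short-run effect {a1:.3f}, error correction coefficient {a2:.3f}")
```

Under this simulated process the error correction coefficient should come out negative, since the AR(1) error closes about half of any gap each period.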
20. Cointegration Example
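• The following cointegrating relationship was estimated, the
residual was then tested to determine if it was
stationary, and the error correction model (ECM) was
formed (SE in parentheses; the ADF critical value is -2.89):
$$y_t = \beta_0 + \beta_1 s_t + u_t$$
$$\Delta \hat{u}_t = \underset{(0.24)}{-0.78}\, \hat{u}_{t-1}$$
$$\Delta y_t = \alpha_0 + \alpha_1 \Delta s_t + \alpha_2 \hat{u}_{t-1} + \varepsilon_t \qquad \text{(ECM)}$$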
21. Cointegration Example
• On the previous slide, to determine whether the
variables are cointegrated, the ADF test has
been conducted on the residual, giving a test
statistic of (-0.78/0.24) = -3.25. This is more
negative than the -2.89 critical value, so we reject
the null hypothesis of no cointegration.
• The ECM is then formed using the residual
lagged one time period as the error correction
term.
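The decision rule used above can be written out as a short check. `unit_root_test` is a hypothetical helper name; the coefficient, standard error and critical value are the ones quoted in the example.

```python
def unit_root_test(coefficient, std_error, critical_value):
    """Return the test statistic and whether it rejects the unit-root null.
    Rejection requires a statistic more negative than the critical value."""
    statistic = coefficient / std_error
    return statistic, statistic < critical_value


statistic, reject = unit_root_test(-0.78, 0.24, -2.89)
print(round(statistic, 2), reject)  # -3.25 True
```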
22. Error Correction Models
• An error correction model includes only I(0)
variables.
• This requires all our non-stationary variables to
be first-differenced, to produce stationary
variables
• The error correction term is the residual from the
cointegrating relationship, lagged one time
period. This too will be I(0) if the variables are
cointegrated.
• The error correction model can include a
number of lags on both variables.
23. Error Correction Models
• The ECM models the short-run dynamics of the
model.
• As with short-run models including lags, it can
be used for forecasting.
• The coefficient on the error correction term can
be used as a further test for cointegration. This is
called the Banerjee ECM test and requires a
separate set of critical values to determine whether
cointegration is present.
24. Error Correction Term
• The coefficient on the error correction term tells us the speed with which
our model returns to equilibrium following an exogenous
shock.
• It should be negatively signed, indicating a move back
towards equilibrium; a positive sign indicates movement
away from equilibrium.
• In absolute value the coefficient should lie between 0 and 1: 0 suggests
no adjustment one time period later, 1 indicates full
adjustment.
• The error correction term can be either the difference
between the dependent and explanatory variable (lagged
once) or the error term (lagged once); they are in effect
the same thing.
25. Example of ECM
• The following ECM was formed, using 60
observations (SE in parentheses; u is the residual from a cointegrating relationship):
$$\Delta \hat{y}_t = \underset{(0.56)}{0.78} + \underset{(0.12)}{0.24}\, \Delta x_t - \underset{(0.08)}{0.32}\, u_{t-1}$$
26. Example of an ECM
• The error correction term has a t-statistic
of -4 (-0.32/0.08), which is highly significant, supporting
the cointegration result.
• The coefficient on the error correction term
is negative, so the model is stable.
• The coefficient of -0.32 suggests a 32%
movement back towards equilibrium
following a shock to the model, one time
period later.
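The arithmetic behind these bullets can be sketched as below. The extension beyond one period is an assumption added for illustration: it supposes that the gap shrinks by the same proportion every period and ignores any other dynamics in the ECM. `remaining_gap` is a hypothetical helper name.

```python
def remaining_gap(ec_coefficient, periods):
    """Share of an initial disequilibrium still present after `periods`,
    assuming the gap shrinks by the same proportion every period."""
    return (1.0 + ec_coefficient) ** periods


# t-statistic of the error correction term: coefficient / standard error.
t_stat = -0.32 / 0.08
print(round(t_stat, 2))  # -4.0

# With a coefficient of -0.32, 68% of the gap remains after one period.
for k in (1, 2, 4):
    print(k, round(remaining_gap(-0.32, k), 3))
```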
27. Potential Problems with Cointegration
• The ADF test often fails to reject the null
hypothesis (no cointegration) when in fact cointegration
is present.
• The ADF test works best when we have a long time span of
data, rather than a large number of observations over a
short time span. This can be a problem with financial
data, which tend to cover only a couple of years but at
high frequency (e.g. daily data).
• The approach is mainly used for bi-variate cointegration tests.
It can be used for multivariate models, but a
different set of critical values is required.
28. Multivariate Approach to Cointegration
• A different approach to testing for cointegration
is generally required when we have more than two
variables in the model.
• If we assume all the variables are endogenous,
we can construct a VAR and then test for
cointegration.
• One of the most common approaches to
multivariate cointegration is the Johansen
Maximum Likelihood (ML) test.
• This test involves testing the characteristic roots
or eigenvalues of the π matrix (coefficients on
the lagged dependent variable).
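The link between the characteristic roots of the π matrix and the number of cointegrating relationships can be illustrated with a two-variable toy case. This is not the Johansen procedure itself, which estimates the roots from the data and tests them against its own critical values. The adjustment speeds `alpha`, the cointegrating vector `beta` and the helper `eigenvalues_2x2` are hypothetical choices made for the illustration.

```python
import math


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant
    (assumes the roots are real)."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


# Hypothetical adjustment speeds (alpha) and cointegrating vector (beta).
alpha = [-0.5, 0.1]
beta = [1.0, -2.0]
pi = [[a * b for b in beta] for a in alpha]  # pi = alpha * beta'

roots = eigenvalues_2x2(pi)
# In this example the number of non-zero roots equals the rank of pi,
# i.e. one cointegrating relationship between the two variables.
rank = sum(abs(r) > 1e-9 for r in roots)
print(rank)  # 1
```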
29. Steps in Testing for Cointegration
1) Test all the variables to determine if they are I(0), I(1)
or I(2) using the ADF test.
2) If both variables are I(1), then carry out the test for
cointegration
3) If there is evidence of cointegration, use the residual to
form the error correction term in the corresponding
ECM
4) Add in a number of lags of both explanatory and
dependent variables to the ECM
5) Omit those lags that are insignificant to form a
parsimonious model
6) Use the ECM for dynamic forecasting of the dependent
variable and assess the accuracy of the forecasts.
30. Conclusion
• The Dickey-Fuller or Augmented Dickey-Fuller
tests test for stationarity, based on the test for a
random walk.
• The Engle-Granger approach to cointegration in
a bi-variate model, involves testing for
stationarity of the residual using the ADF test.
• According to the Granger representation
theorem, if there is cointegration between our
two variables, we should be able to form the
appropriate error correction model.