Presenting Climate Change Models that estimate and forecast global temperature levels associated with, or caused by, CO2 concentration (ppm) levels. These models also replicate IPCC scenarios.
2.
Content
1. Introduction
2. Data
3. Baseline trend models
4. CO2 models
5. Out-of-sample forecasts
6. Replicating IPCC scenarios
7. Granger Causality, VAR, IRFs
8. VAR Forecast
3.
1. Introduction
This presentation describes the modeling of global temperature* associated with, or caused
by, a rising concentration of CO2 in parts per million (ppm). Other variables will also be
explored and tested for inclusion within these Climate Change models.
The objectives are:
a) To assess the information imparted by CO2 concentration to models
estimating and predicting temperature;
b) To test the accuracy of such models in fitting the historical temperature data and in
forecasting temperature within an out-of-sample testing framework;
c) To replicate the most recent IPCC scenarios;
d) To better understand the relationship between CO2 concentration and
temperature and to attempt to demonstrate causality of CO2 -> temperature.
* Measured as temperature anomaly over the 1850 – 1900 average global temperature.
6. Annual CO2 concentration in parts per million from 1880 to 2020.
Data from 1880 to 1958 is derived from ice core analysis, constructed through a cooperative effort between three different scientific teams from Australia and France.
Data from 1958 to 2020 is from the NOAA.
7. We understand that comparing two level variables, without detrending them, can lead to spurious correlations and regressions.
However, when two level variables are cointegrated, the above caveat no longer applies. We will present cointegration testing for these two variables later.
As observed, this scatter plot shows a pretty strong correlation between the two variables.
8. The relationship between CO2 and temperature can be split over two periods. The first one (1880 – 1970), with CO2 concentration ranging from 290 to 325 ppm, is associated with a not so strong linear relationship between the two variables. The second one (1971 – 2020), with CO2 concentration ranging from 325 to close to 420 ppm, is associated with a very strong linear relationship. For the purpose of our modeling, we will not split the data, as the related regression parameters are pretty stable (intercept and slope of the regression equations shown on the scatter plots).
9. Checking the Autocorrelation of the Residuals of the Ordinary Least Squares (OLS)
Cointegration Regression: Temperature ~ CO2
Given that we are using level variables, the residual autocorrelation levels, as captured by the ACF and PACF graphs, are reasonably low. At the outset, this suggests that these two variables (CO2 and temperature) may indeed be cointegrated.
The PACF graph at the bottom is the one used to select the number of yearly lags for the unit root testing that checks whether these residuals are indeed stationary (do not have a unit root).
Even though within the PACF graph only lag 1 crosses the line of statistical significance ( > 0.2), we will use up to lag 4 to be more conservative.
10.
Testing the residuals of the OLS Cointegration Regression
Temperature ~ CO2 for stationarity
Test              p-value   Interpretation confirming residuals are stationary
ADF test          0.01      Reject the null hypothesis that residuals are nonstationary
Phillips Perron   0.01      Reject the null hypothesis that residuals are nonstationary
KPSS              > 0.1     Fail to reject the null hypothesis that residuals are stationary
We used 4 lags for each of the above tests. In each case, the test confirmed that the Cointegration Regression residuals were stationary. This confirmation allows us to proceed in modeling the relationship between CO2 and temperature using level variables, knowing that the tests support these two variables being cointegrated.
Further residual model testing often includes testing for autocorrelation, heteroskedasticity, and normal distribution. However, any related residual issues do not bias the regression coefficients. They may affect the reliability of the regression coefficients' confidence intervals and their statistical significance. However, if such regression coefficients are associated with t-stats > 2.5 or 3.0, statistical significance is typically not an issue (even after adjusting with Robust Standard Errors). Additionally, in some cases, as we'll see, we are not explicitly concerned with levels of statistical significance, as long as the variable makes good sense in terms of explaining how the climate system works, and the variable's regression coefficient has the appropriate sign.
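The two-step residual check described on this slide can be sketched as follows. This is a minimal illustration and not the presentation's actual code: the data are synthetic stand-ins, the helper names `ols` and `adf_t_stat` are made up for this example, and only the ADF-style regression is shown (no Phillips Perron or KPSS). The resulting t-stat should be compared against Engle-Granger/MacKinnon critical values rather than standard t tables.

```python
import numpy as np

def ols(X, y):
    """OLS fit of y on X (X must already contain its constant column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def adf_t_stat(u, lags=4):
    """t-stat on the lagged level in an ADF-style regression without constant."""
    du = np.diff(u)                      # du[t] = u[t+1] - u[t]
    rows = range(lags, len(du))
    # Each row pairs du[t] with the level u[t] and the previous `lags` differences.
    X = np.array([[u[t]] + [du[t - i] for i in range(1, lags + 1)] for t in rows])
    y = du[lags:]
    beta, resid = ols(X, y)
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se_rho = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[0, 0])
    return beta[0] / se_rho

# Synthetic stand-in data (NOT the presentation's series): 141 yearly points.
rng = np.random.default_rng(0)
co2 = np.linspace(290.0, 414.0, 141)
temp = -3.2 + 0.010 * co2 + rng.normal(0.0, 0.1, size=141)

# Step 1: cointegration regression in levels, Temperature ~ CO2.
X = np.column_stack([np.ones_like(co2), co2])
(intercept, slope), residuals = ols(X, temp)

# Step 2: unit root regression on the residuals, using 4 lags as in the text.
t_stat = adf_t_stat(residuals, lags=4)
print(f"slope={slope:.4f}  ADF-style t-stat={t_stat:.2f}")
```

A strongly negative t-stat points toward stationary residuals, which is the condition the slide relies on to keep working with level variables.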
11.
3. Baseline trend models
Within this section we will develop models that do not use CO2 as an exogenous variable but simply various trend variables (counting 1, 2, 3, 4, …). This tests whether the mere passing of time is the driving trend rather than CO2 as a causal factor.
This is a pretty good test of whether your original level-based model is truly valid and not another example of a spurious regression using level variables.
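As a rough sketch of what such baseline models look like, the snippet below fits a linear trend and a trend plus trend-squared model by OLS. The target series is synthetic: it reuses the 0.000085 trend-squared coefficient quoted later in the deck, while the intercept and the helper name `fit_trend` are assumptions made for illustration.

```python
import numpy as np

def fit_trend(y, degree):
    """OLS fit of y on 1, t, ..., t**degree where t counts 1, 2, 3, ..."""
    t = np.arange(1, len(y) + 1, dtype=float)
    X = np.column_stack([t ** p for p in range(degree + 1)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X @ beta

# Synthetic stand-in for 141 yearly temperature anomalies (hypothetical intercept).
t = np.arange(1, 142, dtype=float)
y = -0.3 + 0.000085 * t ** 2

beta_trend, fit_trend1 = fit_trend(y, degree=1)   # Trend model
beta_trend2, fit_trend2 = fit_trend(y, degree=2)  # Trend 2 model (adds trend squared)

sse_trend = float(np.sum((y - fit_trend1) ** 2))
sse_trend2 = float(np.sum((y - fit_trend2) ** 2))
print(beta_trend2, sse_trend, sse_trend2)
```

Because the squared trend reaches values near 20,000 by the end of the sample, its coefficient is necessarily tiny even when it carries most of the curvature.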
13. The Trend Model residuals are pretty awful looking
[Figure: Trend Model residuals, shown in two panels: Residual vs. CO2 concentration (ppm), and Residual vs. Model Estimate.]
A good model should have a residual curve (red dashed line) that is flat, straight, and sits at the 0.00 level. This would
indicate residuals that are stationary and mean reverting around the 0.00 level. These residuals are far away from
meeting that standard. They are clearly nonstationary.
15. The Trend 2 Model residuals are far better looking
The residual trend line (red) relative to Model Estimates on the x-axis within the right-hand graph is now perfectly flat, straight, and on the 0.00 line, as it should be. Even that same residual trend line when using CO2 concentration on the x-axis is reasonably flat. This model appears to capture a good deal of the information imparted by the CO2 variable. We now have a pretty competitive Baseline Trend model to assess the validity of our upcoming CO2 models.
16. Description of the Trend 2 model
The square of the trend (trend2) is a very large number, so the resulting regression coefficient is very small: 0.000085.
All the Goodness-of-fit measures are very high. And, the resulting model errors are pretty low. This is kind of amazing given that we have just used trend variables to fit the temperature history starting back in 1880.
17.
4. CO2 Models
We will introduce two CO2 based models to estimate and forecast temperature.
The first one will be our simple linear OLS Cointegration Regression using CO2 as our standalone exogenous variable.
The second one will be a more complete model that will also include the influence on temperature from the Pacific Decadal Oscillation, with warm years due to El Nino and cold years due to La Nina. This model will also include another intervention variable covering the years from 1940 to 1970, before sulfate aerosols were heavily regulated. Sulfates have a lowering effect on temperature that partly counters the rising effect of CO2.
18. CO2 model description
Notice the extremely high t-stat of the CO2 coefficient, leaving no doubt as to the statistical significance of this variable.
19. The CO2 model has very good looking residuals (flat red lines)
20. The more complete CO2 based model
The El Nino variable has a p-value of 0.155, not statistically significant at the Alpha < 0.10 level. However, within a sports betting market, this same p-value would correspond to one team being favored with odds close to 6-to-1 of winning. That would be a pretty good bet.
In view of the above, we are comfortable including the El Nino variable in our model. It also makes sense to include both the years that have a positive impact on temperature (El Nino) and the ones that have a negative impact (La Nina).
21. The complete Model residuals are still reasonably good looking (fairly flat red curves)
22.
Model Competition regarding the fit of Temperature history
                     CO2 model   Model   Trend 2
Adjusted R Square    0.891       0.917   0.888
Predicted R Square   0.890       0.913   0.885
RMSE                 0.117       0.102   0.119
MAE                  0.095       0.082   0.095
Whether looking at measures of variance explanation (Adjusted R Square), one-observation prediction (Predicted R Square) or model errors (RMSE and Mean Absolute Error), the three models are very close.
The CO2 model and the Trend 2 model are just about dead even on all counts. The Model that includes the other variables such as El Nino and La Nina is fractionally more accurate.
If we stopped our analysis now, one could prematurely conclude that the trend (including the trend square variable) just about explains everything regarding the progressive increase in temperature from 1880 to 2020, and that the two CO2 based models really do not add much information, if any, beyond capturing this trend. This could lead one to assess our CO2 based models as "spurious." Additional analysis will show otherwise, supporting that including a CO2 variable greatly improves the prediction accuracy of such models. Fitting the historical data is one thing. Making reasonably accurate predictions is far more challenging and useful.
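For readers who want to see how the four fit measures in the comparison table relate to each other, here is a minimal sketch on synthetic data. The function name `fit_metrics` and the data are assumptions. Predicted R Square is computed here from the PRESS statistic (leave-one-out residuals), and RMSE divides by n; some software instead reports a residual standard error that divides by n minus the number of coefficients, so the presentation's figures may follow a slightly different convention.

```python
import numpy as np

def fit_metrics(X, y):
    """Goodness-of-fit measures for an OLS fit; X includes the intercept column."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = float(resid @ resid)
    sst = float(np.sum((y - y.mean()) ** 2))
    # Leverage values: diagonal of the hat matrix X (X'X)^-1 X'.
    leverage = np.einsum("ij,ji->i", X, np.linalg.pinv(X))
    press = float(np.sum((resid / (1.0 - leverage)) ** 2))
    return {
        "r2": 1.0 - sse / sst,
        "adj_r2": 1.0 - (sse / (n - k)) / (sst / (n - 1)),
        "pred_r2": 1.0 - press / sst,
        "rmse": float(np.sqrt(sse / n)),
        "mae": float(np.mean(np.abs(resid))),
    }

# Synthetic stand-in data (NOT the presentation's series).
rng = np.random.default_rng(0)
co2 = np.linspace(290.0, 414.0, 141)
temp = -3.2 + 0.010 * co2 + rng.normal(0.0, 0.1, size=141)

metrics = fit_metrics(np.column_stack([np.ones_like(co2), co2]), temp)
print(metrics)
```

Predicted R Square can never exceed the plain R Square, since each leave-one-out residual is at least as large in magnitude as the corresponding in-sample residual.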
24. [Figure: Temperature Anomaly. Historical Fit since 1990. Series: Actual, CO2 model, Model, Trend 2.]
Focusing on the more recent period from 1990 until 2020, we can observe a similarly good fit across the three models.
The more complete Model has a slightly better fit by better capturing the temperature oscillations associated with El Nino/La Nina.
However, notice how the Trend 2 model starts to underestimate the temperature level starting in 2014. This may be the first indication that CO2 does impart some valuable information to this temperature model.
25.
5. Out-of-sample forecasts
Fitting historical data is one thing. One way or another, it is often relatively easy, even in the case of fitting historical temperature levels from 1880 to 2020, as we have seen. Predicting observations using out-of-sample forecasts, also called Hold Out testing, is far more difficult and is a far more relevant test of a model's predictive accuracy.
With such models, you often run into a situation where a model fits the historical data really well but predicts really poorly (in Hold Out testing). This is a classic situation of model overfitting. It happens all the time.
Within this section we will test whether our models are overfit, or if instead they do provide predictive information.
26.
Cross Validation test
Mean Absolute Error
            History   Cross-val.   C.V./History
CO2 model   0.095     0.099        1.05
Model       0.082     0.089        1.09
Trend 2     0.095     0.104        1.10
Cross validation is a rigorous form of out-of-sample forecast testing. In our case, we removed 14 observations from the data to create a 14-year prediction window. And, we did this exercise 10 times to cover the 141 yearly observations within our complete data set.
So, the first prediction window went from 1880 to 1893. We used a model with history from 1894 to 2020 to attempt to predict the 1880 – 1893 years.
The second prediction window was from 1894 to 1907. We used a model with history in all other years outside the prediction window. And, we continued this process until using the most recent 14 years as our prediction window.
The table first shows the Mean Absolute Error (MAE) of each of our three models when using the entire data set to fit the history. Next, it shows the average MAE of the 10 cross validation prediction windows. Finally, it shows the ratio of the cross validation MAE divided by the MAE during history. The cross validation MAE is typically expected to be higher than the MAE during history. If that multiple is greater than 1.5, you may be dealing with a model that is overfit. As shown above, all three of our models perform well on this count, with very little deterioration during the cross validation test. Again, the complete Model is a bit better than the other two. And, at the margin, our CO2 model did a bit better than the Trend 2 model during cross validation.
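The cross validation procedure above can be sketched as a blocked (contiguous window) cross validation. This is an illustration on synthetic data, not the presentation's code. One assumption to flag: 141 observations do not divide evenly into 10 windows of 14, so this sketch lets `np.array_split` create one window of 15 years; how the original analysis handled the extra year is not stated.

```python
import numpy as np

def mae(actual, predicted):
    return float(np.mean(np.abs(actual - predicted)))

def fit_predict(x_train, y_train, x_new):
    """Fit y = intercept + slope * x by OLS and predict at x_new."""
    slope, intercept = np.polyfit(x_train, y_train, 1)
    return intercept + slope * x_new

# Synthetic stand-in data (NOT the presentation's series): 141 yearly points.
rng = np.random.default_rng(0)
co2 = np.linspace(290.0, 414.0, 141)
temp = -3.2 + 0.010 * co2 + rng.normal(0.0, 0.1, size=141)

# MAE when the whole history is used for fitting.
history_mae = mae(temp, fit_predict(co2, temp, co2))

# Ten contiguous prediction windows; each is predicted from all other years.
all_idx = np.arange(len(temp))
folds = np.array_split(all_idx, 10)
fold_maes = []
for test_idx in folds:
    train_idx = np.setdiff1d(all_idx, test_idx)
    pred = fit_predict(co2[train_idx], temp[train_idx], co2[test_idx])
    fold_maes.append(mae(temp[test_idx], pred))

cv_mae = float(np.mean(fold_maes))
ratio = cv_mae / history_mae
print(f"history MAE={history_mae:.3f}  CV MAE={cv_mae:.3f}  ratio={ratio:.2f}")
```

The ratio printed at the end plays the role of the C.V./History column in the table, with 1.5 as the rule-of-thumb warning level mentioned in the text.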
27. 2006 – 2020 Out-of-sample Hold Out Test
Temperature Anomaly estimate 2006 - 2020
                Actual   CO2 model   Model   Trend2
2005            0.68     0.68        0.68    0.68
2006            0.64     0.63        0.64    0.56
2007            0.65     0.65        0.62    0.58
2008            0.54     0.67        0.54    0.59
2009            0.65     0.68        0.69    0.61
2010            0.72     0.71        0.68    0.63
2011            0.60     0.73        0.60    0.64
2012            0.65     0.75        0.72    0.66
2013            0.68     0.78        0.75    0.67
2014            0.75     0.80        0.81    0.69
2015            0.92     0.82        0.69    0.71
2016            1.01     0.85        0.82    0.72
2017            0.92     0.88        0.85    0.74
2018            0.84     0.90        0.87    0.76
2019            0.97     0.93        0.90    0.78
2020            1.00     0.95        0.92    0.79
Temp increase   0.33     0.28        0.25    0.12
MAE                      0.07        0.06    0.11
[Figure: Temperature Anomaly. Hold Out 2006 - 2020. Series: Actual, CO2 model, Model, Trend2.]
When we attempt to forecast the recent period (2006 – 2020) using historical data (1880 – 2005), the Trend 2 model greatly underestimates the increase in temperature over the recent period ( + 0.12 vs. + 0.33 for actuals). The two CO2 based models do a lot better, with respective temperature increases ranging from + 0.25 to + 0.28.
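As a worked example of the two Hold Out measures, the snippet below recomputes the MAE and the begin-to-end temperature increase for the CO2 model from the two-decimal values shown in the table. Because the inputs are the rounded table values, results can differ by about 0.01 from the figures quoted on the slide, which were presumably computed from unrounded data.

```python
import numpy as np

# Values for 2005 (starting point) through 2020, copied from the table.
actual = np.array([0.68, 0.64, 0.65, 0.54, 0.65, 0.72, 0.60, 0.65,
                   0.68, 0.75, 0.92, 1.01, 0.92, 0.84, 0.97, 1.00])
co2_model = np.array([0.68, 0.63, 0.65, 0.67, 0.68, 0.71, 0.73, 0.75,
                      0.78, 0.80, 0.82, 0.85, 0.88, 0.90, 0.93, 0.95])

# MAE over the 15 Hold Out years (2006 - 2020); 2005 is only the anchor year.
mae_co2 = float(np.mean(np.abs(actual[1:] - co2_model[1:])))

# Begin-to-end increase: 2020 value minus 2005 value.
increase_actual = float(actual[-1] - actual[0])
increase_co2 = float(co2_model[-1] - co2_model[0])

print(f"MAE={mae_co2:.3f}  actual increase={increase_actual:.2f}  "
      f"CO2 model increase={increase_co2:.2f}")
```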
28. 1990 – 2005 Out-of-sample Hold Out Test
[Figure: Temperature Anomaly. Hold Out 1990 - 2005. Series: Actual, CO2 Model, Model, Trend2.]
This is the exact same pattern as the prior Hold Out Test. On a begin-to-end point basis, the Trend 2 model greatly underestimates the temperature increase over the 1990 – 2005 period.
Notice how the simpler CO2 model does better than the more complete Model on a begin-to-end point basis. This was also true in the Hold Out test on the previous slide.
The repeated relative failure of the Trend 2 model is not so surprising. Polynomial regressions are notoriously good at fitting historical data, but often not so good the minute you do some out-of-sample testing.
29. 1982 – 2020 Out-of-sample Hold Out Test
[Figure: Temperature Anomaly. Hold Out 1982 - 2020. Series: Actual, CO2 Model, Model, Trend 2.]
This is an unusually long Hold Out test where we removed the 39 most recent years of the data (1982 – 2020).
Just by knowing the CO2 concentration level, we would have come up with an excellent begin-to-end point estimation of the overall temperature increase over this 39 year period (CO2 Model). And, that estimation is far superior to the estimation from the other two models.
30. Why is the complete Model a distant second to the simpler CO2 based model?
It is because the Pacific Decadal Oscillation that captures the El Nino (+) and La Nina (-) is not so decadal. It is very volatile and is captured as a 3-month moving average that can often fluctuate between an El Nino (+) and a La Nina (-) phenomenon within the same year. Therefore, the yearly based capture of those phenomena is highly inaccurate.
31. Attempt to improve Hold Out with a Robust Quantile Regression
[Figure: Temperature Anomaly. Hold Out 1982 - 2020. Series: Actual, CO2 Model, Robust Model.]
Linear regressions such as our CO2 models can be affected by outliers in either the Y variable (temperature), due to variables not included in the model (El Nino/La Nina, influence of other greenhouse gases, etc.), or the X variable (nonlinear changes or random jumps in the CO2 concentration variable).
To remedy the issue of regression coefficients being influenced or distorted by outliers in the historical data, we use robust regressions that are more resistant to the influence of such outliers. A common robust regression method is Quantile Regression, which regresses to the Median instead of the Mean and therefore much reduces the influence of outliers.
However, as shown, such a Robust Quantile Regression did not improve the Hold Out performance of our regular CO2 model. To the contrary, the regular CO2 model generated a better set of predictions over this Hold Out period. This gives us some comfort that this CO2 model is pretty well specified, not overly influenced by outliers within its historical data, and able to make really pretty good predictions over a 39 year period. Prediction success over such a long period (just assuming we know the accurate value of CO2 concentration) is very rare for such time series models.
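To illustrate why a median regression resists outliers, here is a small sketch comparing OLS with a least absolute deviations (LAD) fit. Note the substitution: instead of a full quantile regression solver, this example approximates the median regression with iteratively reweighted least squares (IRLS), which is a different algorithm from whatever the presentation used. The data and the function name `lad_fit` are made up for the example.

```python
import numpy as np

def lad_fit(x, y, n_iter=100, eps=1e-8):
    """Approximate median (LAD) regression via iteratively reweighted least squares."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)          # start from the OLS fit
    for _ in range(n_iter):
        # Weight each point by 1/|residual| so large residuals count less.
        w = 1.0 / np.maximum(np.abs(y - X @ beta), eps)
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Toy data: an exact line y = 1 + 2x with one large outlier at the last point.
x = np.arange(20.0)
y = 1.0 + 2.0 * x
y[-1] += 30.0

ols_slope, ols_intercept = np.polyfit(x, y, 1)
lad_intercept, lad_slope = lad_fit(x, y)
print(f"OLS slope={ols_slope:.3f}  LAD slope={lad_slope:.3f}")
```

In this toy case the single outlier pulls the OLS slope well away from 2, while the LAD fit stays on the line through the other 19 points.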
33. IPCC Scenarios
Within its most recent assessment, the IPCC has developed 5 different scenarios. The most benign one is called SSP1-1.9, whereby CO2 concentration would remain relatively flat between 400 and 450 ppm and the temperature anomaly would remain close to + 1.5 degrees Celsius. The most severe one is called SSP5-8.5, whereby CO2 concentration would continue increasing rapidly to 1100 ppm by the end of the century and the temperature anomaly would reach about + 4.4 degrees Celsius.
Source: IPCC Technical Summary 2021. The large gray letters are part of the following statement: "accepted version subject to final editing."
34.
Replicating IPCC Scenarios
              CO2 model   LN(CO2) model
Intercept     -3.2        -19.8
Coefficient   0.010       3.43
Temperature anomaly estimates (deg. Celsius)
CO2 ppm   CO2 model   LN(CO2) model
300       -0.20       -0.21
400       0.80        0.78
500       1.81        1.54
600       2.81        2.17
700       3.82        2.70
800       4.82        3.15
900       5.83        3.56
1000      6.84        3.92
1100      7.84        4.25
1200      8.85        4.55
[Figure: Replicating IPCC Scenarios. Temperature Anomaly in deg. Celsius vs. CO2 Concentration (ppm). Series: CO2 model, LN(CO2) model.]
Attempting to replicate the IPCC scenarios
Our linear CO2 model appears to greatly overshoot IPCC scenarios when using true out-of-sample CO2 concentrations that are far higher than what the model was trained on (much greater than 420 ppm and going up to 1200 ppm).
However, using a very similar model structure and simply using LN(CO2) generates a curve that looks like it may very well replicate the IPCC scenarios. We will look at that in greater detail on the next slide.
Note how the two models are very close when using CO2 concentrations that the linear CO2 model was trained on, ranging from 300 to 400 ppm.
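The two model equations can be evaluated directly from the intercepts and coefficients shown on this slide. Since those coefficients are displayed with limited precision, the sketch below only approximates the table (differences of a few hundredths of a degree are expected). The function names are assumptions for illustration.

```python
import numpy as np

def linear_co2_model(ppm):
    """Linear model using the rounded slide coefficients."""
    return -3.2 + 0.010 * ppm

def ln_co2_model(ppm):
    """Logarithmic model using the rounded slide coefficients."""
    return -19.8 + 3.43 * np.log(ppm)

ppm_levels = np.arange(300, 1300, 100)
for ppm in ppm_levels:
    print(f"{ppm:5d} ppm  linear={linear_co2_model(ppm):5.2f}  "
          f"log={ln_co2_model(ppm):5.2f}")
```

Within the 300 to 400 ppm range the two curves nearly coincide, while at 1100 ppm the linear form is several degrees above the logarithmic one, which is the divergence the slide describes.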
35. As shown on the graph, the
LN(CO2) model temperature
estimates with CO2 concentration
up to 1200 ppm come very close to
the ones generated by the IPCC
scenarios.
The graph highlights the
temperature estimates for the most
benign IPCC scenario, SSP1-1.9, and
the most severe one, SSP5-8.5. The
model slightly underestimates the
former; and, is pretty much right on
the money for the latter (the most
severe scenario).
36. Why did we not use LN(CO2) instead of CO2 to estimate and forecast temperature earlier? It is for a simple reason. When CO2 is < 420 ppm, historically there is a very strong linear relationship between CO2 and temperature. That linear relationship is much stronger and better fitting than a logarithmic relationship between the two variables.
We tested a logarithmic model with LN(CO2). It was pretty good, but it came a distant second to the linear CO2 model when conducting out-of-sample Hold Out testing.
Over the longer term, going forward, and with true out-of-sample CO2 concentration levels (much above 420 ppm), the scientific community within the IPCC assesses that the CO2 vs. temperature relationship follows a logarithmic curve. That's a very good thing. If the relationship continued to be linear, our survival would become increasingly unlikely.
37. Description of the CO2 Model vs. LN(CO2) one
As shown below, regarding the historical fit of the temperature data, both models are very close. The Adjusted R Squares are nearly even at 0.89. And, the respective model Standard Errors of 0.117 and 0.118 degree Celsius are also very close.
38. 1982 – 2020 Out-of-sample Hold Out Test. CO2 Model vs. LN(CO2) Model
As shown on our long out-of-sample Hold Out test (1982 – 2020), the CO2 model performs much better than the LN(CO2) model. This is especially true if we look at it from a begin-to-endpoint perspective.
The CO2 model just about meets the endpoint in 2020 when the temperature anomaly is + 1.00 degree Celsius. Meanwhile, the LN(CO2) model misses it by almost 0.2 degree Celsius.
[Figure: Temperature Anomaly. Hold Out 1982 - 2020. Series: Actual, CO2 Model, LN(CO2) Model.]
39.
7. Granger Causality, VAR, IRFs
We will use the mentioned statistical methods to attempt to assess the causality of the CO2 concentration on temperature. Based on the work presented so far, we already know there is a very strong association, or correlation, between the two. But is this association truly causal? Demonstrating causality in any such models is most often extremely challenging. Often one can't demonstrate true causality or even Granger causality (a less absolute definition of causality that merely entails that one variable is the chronological predecessor of another without necessarily causing the other).
40.
The steps to evaluate Granger Causality in this particular case
1) Does CO2 Granger cause temperature? Run Granger Causality test: CO2 -> temperature.
2) Test in which direction this causality manifests itself. Run Granger Causality test in the reverse causal direction: temperature -> CO2. This sounds absurd, but there may be ecosystem explanations supporting why this may be so. The math is agnostic on stuff like that. Granger Causality just checks if A causes B more than B causes A to confirm the causality direction.
3) What sign does this causality have? Obviously, we want CO2 concentration to cause rising temperatures, not declining ones. To check that, we will observe the signs of the CO2 variables' regression coefficients embedded in the underlying Vector Autoregression (VAR) model. If they sum up to a strong positive value, you have confirmed your hypothesis that CO2 causes rising temperatures. Otherwise, you have not.
4) Next, check out the Impulse Response Function (IRF) graphs to visualize how an unanticipated shock in CO2 concentration reverberates on temperature increase over the next 10 years.
5) Next, explore the Forecast Error Variance Decomposition (FEVD) to evaluate how much information CO2 truly imparts to these VAR models.
Only once you have completed all five steps will you have drawn a complete picture of the Granger causality between two variables. Many practitioners stop after the very first step in a hurry to confirm their hypothesis, while being less than enthusiastic about pursuing the next steps that may not confirm their hypothesis.
41. Does CO2 Granger cause Temperature? Yes, it does
We ran a set of Granger Causality tests. You start with a baseline autoregressive model that just includes 1 yearly lag of the temperature to estimate the temperature history. Next, you develop a second model by adding the 1 year lag of CO2 to also estimate the temperature history. Finally, you test with an F test and a Chi Square test whether the residuals of the second model including the CO2 lag are much lower than the residuals of the baseline autoregressive model. If they are indeed lower at a statistically significant level, you conclude that CO2 does Granger cause temperature.
You repeat this procedure up to including 4 yearly lags (we did not contemplate using more lags; beyond 4 yearly lags, we would likely start overfitting the model on the autoregressive properties of the respective time series). As shown in the table, both the series of F tests and Chi Square tests using models with up to 4 lags all confirm that CO2 clearly Granger causes temperature. Indeed, in all cases the resulting p-values are essentially zero, allowing us to reject the null hypothesis that there is no statistically significant difference between the two sets of residuals (baseline autoregressive model vs. model including the CO2 lags).
CO2 Granger causes Temperature testing
            F test              Chi Square test
# of lags   Value   p-value     Value   p-value
1           39.9    0.00        40.7    0.00
2           21.4    0.00        44.4    0.00
3           10.0    0.00        31.6    0.00
4           5.7     0.00        24.4    0.00
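The baseline vs. augmented model comparison behind these tests can be sketched as below. This is a simplified illustration on synthetic series where x drives y by construction. It computes only the F statistic (no p-value and no Chi Square variant), and the helper names `lagged`, `sse` and `granger_f` are assumptions rather than the functions used for the slide.

```python
import numpy as np

def lagged(series, lags):
    """Columns series[t-1], ..., series[t-lags], aligned with t = lags .. n-1."""
    n = len(series)
    return np.column_stack([series[lags - i: n - i] for i in range(1, lags + 1)])

def sse(X, y):
    """Sum of squared residuals of the OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def granger_f(cause, effect, lags):
    """F statistic: do lags of `cause` reduce residuals beyond lags of `effect`?"""
    y = effect[lags:]
    baseline = np.column_stack([np.ones(len(y)), lagged(effect, lags)])
    augmented = np.column_stack([baseline, lagged(cause, lags)])
    sse_base, sse_aug = sse(baseline, y), sse(augmented, y)
    dof = len(y) - augmented.shape[1]
    return ((sse_base - sse_aug) / lags) / (sse_aug / dof)

# Synthetic series (NOT climate data): x feeds into y with a one-period delay.
rng = np.random.default_rng(1)
n = 141
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + rng.normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.2 * rng.normal()

f_x_to_y = {p: granger_f(x, y, p) for p in range(1, 5)}
f_y_to_x = {p: granger_f(y, x, p) for p in range(1, 5)}
print(f_x_to_y)
print(f_y_to_x)
```

Running the test in both directions, as done here, mirrors steps 1 and 2 of the procedure: the F values are far larger in the direction where the dependence was built in.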
42. Does CO2 Granger cause Temperature… more than Temperature Granger causing CO2? Yes, it does
CO2 Granger causes Temperature testing
            F test              Chi Square test
# of lags   Value   p-value     Value   p-value
1           39.9    0.00        40.7    0.00
2           21.4    0.00        44.4    0.00
3           10.0    0.00        31.6    0.00
4           5.7     0.00        24.4    0.00
Temperature Granger causes CO2 testing
            F test              Chi Square test
# of lags   Value   p-value     Value   p-value
1           1.9     0.17        1.9     0.17
2           3.9     0.02        8.2     0.02
3           2.9     0.04        9.2     0.03
4           1.8     0.14        7.6     0.11
When you run all the Granger causality tests in the other direction, all the F test and Chi Square test values are a lot lower, and the resulting p-values are much higher. In several of these Granger causality tests, we can't reject the null hypothesis that any difference in residuals between the baseline autoregressive model and the model that adds the lags of the other variable is just due to randomness.
43. # of lags selection for the VAR models using Information Criteria
The models described earlier that include lags of both CO2 and temperature to establish causality in either direction are essentially unrestricted Vector Autoregression (VAR) models. When used for other purposes, on a standalone basis, such models are also called Autoregressive Distributed Lag (ARDL) models, a popular model structure in social sciences and econometrics.
As a side note, when using level variables one should typically use other forms of VAR (not unrestricted). But, given that the residuals of our unrestricted VAR models are uncorrelated, we should be ok to proceed as is.
To select the best number of lags for our VAR models, we will check the output of information criteria generated by an R function. The lower the information criterion value, the better the model fit and specification.
                # of Lags
Info Criteria   1         2         3         4
AIC             -6.66     -6.85     -6.80     -6.87
HQ              -6.61     -6.76     -6.68     -6.72
SC              -6.54     -6.63     -6.51     -6.49
FPE             0.00128   0.00106   0.00111   0.00104
As shown above, two of the information criteria select the VAR models with 2 lags. And, the other two select the VAR models with 4 lags. But, notice that all four models (with lags ranging from 1 up to 4 yearly lags) have very close information criteria values. In essence, they are very competitive with each other. So, we will often look at all four models.
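The selection rule (pick the lag count with the lowest criterion value) can be checked mechanically. In the snippet below, the AIC and FPE rows are as they appear on the slide. The HQ and SC rows are garbled in this transcript, so their column alignment is an assumption, chosen to be consistent with the statement that two criteria select 2 lags and two select 4 lags.

```python
# Information criteria by number of lags (1 to 4).
criteria = {
    "AIC": [-6.66, -6.85, -6.80, -6.87],
    "HQ":  [-6.61, -6.76, -6.68, -6.72],   # alignment assumed (see note above)
    "SC":  [-6.54, -6.63, -6.51, -6.49],   # alignment assumed (see note above)
    "FPE": [0.00128, 0.00106, 0.00111, 0.00104],
}

# For each criterion, select the lag count with the lowest value.
selected = {
    name: min(range(len(values)), key=values.__getitem__) + 1
    for name, values in criteria.items()
}
print(selected)
```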
44. Does the CO2 vs. Temperature causal relationship have the appropriate positive sign? … well, here it gets a bit foggy
Model equation causal direction: CO2 causes temperature
Model           CO2 Lags    Coefficient   t stat   p-value
VAR w/ 1 lag    CO2 lag 1   0.005         6.32     0.00
VAR w/ 2 lags   CO2 lag 1   -0.049        -2.17    0.03
                CO2 lag 2   0.055         2.40     0.02
                Sum         0.006
VAR w/ 3 lags   CO2 lag 1   -0.045        -1.79    0.07
                CO2 lag 2   0.051         1.22     0.22
                CO2 lag 3   0.000         0.01     1.00
                Sum         0.006
VAR w/ 4 lags   CO2 lag 1   -0.044        -1.72    0.09
                CO2 lag 2   0.058         1.35     0.18
                CO2 lag 3   -0.016        -0.37    0.71
                CO2 lag 4   0.008         0.29     0.77
                Sum         0.006
Observing the signs of the CO2 lags' regression coefficients leads us to answer the above question with much nuance.
The VAR models with 2 and 3 lags both have one CO2 coefficient with the wrong negative sign. The VAR with 4 lags has two coefficients with the wrong sign.
In some cases, we can accept coefficients with the wrong sign considering that the CO2 -> temperature relationship may have some mean-reverting properties that would cause this reversal in coefficient signs.
Yet, when we look at the overall Granger causality effect of CO2 on temperature (associated with an unexpected upward shock in CO2), this net effect seems very small at around 0.005 to 0.006 regardless of the VAR we use. We derive this net effect by summing the CO2 lags' regression coefficients. But, at least this net effect is positive.
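The net effect is simply the sum of the CO2 lag coefficients within each VAR. A short check using the coefficients from the table on this slide:

```python
# CO2 lag coefficients in the temperature equation, keyed by number of VAR lags.
co2_lag_coefs = {
    1: [0.005],
    2: [-0.049, 0.055],
    3: [-0.045, 0.051, 0.000],
    4: [-0.044, 0.058, -0.016, 0.008],
}

# Net effect per model, plus a count of coefficients with a negative sign.
net_effect = {p: round(sum(coefs), 3) for p, coefs in co2_lag_coefs.items()}
negative_count = {p: sum(c < 0 for c in coefs) for p, coefs in co2_lag_coefs.items()}
print(net_effect)
print(negative_count)
```

For the models with 2 or more lags, the small positive sum is the difference between offsetting coefficients that are roughly ten times larger in magnitude, which helps explain why the sign question is foggy.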
45. Impulse Response Functions
The cumulative Impulse Response Function over the next 10 yearly periods, describing the impact on temperature in response to an unanticipated upward shock of a one unit increase in CO2 concentration, is rather unsettling. When using a VAR model with only 1 lag, the IRF graph makes much sense, as it illustrates CO2 having a positive impact on temperature level (left graph). But, the graph on the right, which describes the same IRF for a VAR with 2 lags, suggests that an upward shock in CO2 would have a negative impact on temperature level. The IRF graphs for VAR with 3 and 4 lags looked nearly identical to the VAR with 2 lags IRF graph (right hand graph) with the negative sign.
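For intuition on how a cumulative IRF is built, here is a minimal sketch for a VAR with 1 lag, where the response at horizon h is the h-th power of the coefficient matrix. Only the 0.005 entry (CO2 lag 1 in the temperature equation) comes from the presentation. The other matrix entries are hypothetical placeholders, and the responses are non-orthogonalized, which may differ from how the slide's graphs were produced.

```python
import numpy as np

def cumulative_irf(A, horizon):
    """Cumulative responses sum_{h=0..H} A**h for a VAR(1): y_t = c + A y_{t-1} + e_t."""
    k = A.shape[0]
    response = np.eye(k)            # horizon 0: the shock itself
    total = np.zeros((k, k))
    out = []
    for _ in range(horizon + 1):
        total = total + response
        out.append(total.copy())
        response = A @ response     # propagate the shock one more period
    return np.array(out)

# Variable order: [temperature, CO2]. Entry [0, 1] is the CO2 lag 1 coefficient
# from the 1-lag VAR; the remaining entries are hypothetical.
A = np.array([[0.60, 0.005],
              [0.00, 0.990]])

irf = cumulative_irf(A, horizon=10)
temp_response_to_co2 = irf[:, 0, 1]   # cumulative temperature response to a unit CO2 shock
print(np.round(temp_response_to_co2, 4))
```

In a VAR with more lags, the first-period response of temperature to a CO2 shock equals the CO2 lag 1 coefficient, so a negative lag 1 coefficient starts the response path below zero.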
46.
Forecast Error Variance Decomposition (FEVD)
For the VAR with 1 lag model fitting temperature, the table indicates that the autoregressive lag of temperature provides the vast majority of the information to fit temperature as the Y dependent variable, and that the exogenous CO2 lag 1 variable provides very little information to the model.
All the other VAR models with up to 4 lags had the same FEVD profile, with the lags of the temperature variable providing over 99% of the information to the model and the exogenous CO2 lags providing very little information.
VAR with just 1 lag
CO2 causes temperature
Period   temperature   co2
1        1.000         0.000
2        1.000         0.000
3        0.999         0.001
4        0.998         0.002
5        0.997         0.003
6        0.996         0.004
7        0.995         0.005
8        0.994         0.006
9        0.992         0.008
10       0.991         0.009
47.
Why did some of the later steps of our Granger Causality Analysis show ambivalent results?
The first couple of steps showed pretty convincing mathematical results that CO2 does Granger cause temperature. However, as shown, the later steps ranged from ambivalent to disproving.
The above is probably due to a couple of phenomena.
The first one is generic to these types of analysis. It is common to confirm Granger causality through the first couple of steps of such analysis. But confirmation through all 5 steps is much less common.
The second phenomenon, potentially specific to this modeling exercise, is that the temperature level variable has a very high level of autocorrelation. Within VAR models, this strong autocorrelation of temperature has probably much reduced the explanatory impact of CO2. Thus, the temperature lags partly crowded out the CO2 ones in terms of estimating temperature levels with VAR models. More specifically, the temperature autocorrelation at lag 1 is 0.9518, a bit higher than the CO2 vs. temperature correlation at lag 1 of 0.9453. One would think we could resolve this situation by detrending the variables and dealing with yearly changes in temperature and CO2 concentration. But, there is too much volatility in the yearly change variables to demonstrate any explicit relationship between the two variables. I had done such an exercise years ago. And, it would only serve as a means to demonstrate that there is no Granger causal relationship between the two variables.
48.
8. VAR Forecast
Here we will revisit forecasting the temperature anomaly over the 1982 – 2020 period using a model trained on 1880 – 1981 data. But, using VAR structures, we will now attempt to conduct this forecast with no information whatsoever (no info regarding prospective CO2 concentration levels).
This type of forecast testing is so challenging that it borders on the absurd. Imagine actually forecasting a time series variable (S&P 500, GDP, CPI, etc.) over the next 39 years without any exogenous information over those prospective years. That would probably be close to impossible.
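What "no information whatsoever" means mechanically is that a VAR forecast is iterated forward from its last observed values, feeding each predicted year back in as the input for the next. A minimal sketch for a VAR with 1 lag follows. Every number in it is a hypothetical placeholder chosen for illustration (including the starting values and the coefficient matrix), not an estimate from the 1880 – 1981 data.

```python
import numpy as np

def var1_forecast(c, A, y_last, steps):
    """Iterate y_t = c + A @ y_{t-1} forward, using only the last observed values."""
    y = np.asarray(y_last, dtype=float)
    path = []
    for _ in range(steps):
        y = c + A @ y            # each forecast becomes the next step's input
        path.append(y)
    return np.array(path)

# Variable order: [temperature anomaly, LN(CO2)]. All values are hypothetical.
c = np.array([-7.92, 0.005])          # constants; 0.005 acts as a yearly drift in LN(CO2)
A = np.array([[0.6, 1.372],
              [0.0, 1.000]])
y_1981 = np.array([0.30, np.log(340.0)])

path = var1_forecast(c, A, y_1981, steps=39)   # 1982 - 2020
print(np.round(path[[0, -1]], 3))
```

No actual CO2 values from 1982 onward enter the calculation: the LN(CO2) path is itself forecast by the model, which is what separates this exercise from the earlier Hold Out tests.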
49. Revisiting our best 1982 – 2020 forecast with the CO2 Model
This was our best temperature
anomaly forecast so far over the 1982
– 2020 period using data from 1880
to 1981 to train our CO2 based
model.
As shown, this is a remarkably good
forecast. It implies that if you had
known the CO2 concentration over
this period (1982 – 2020), you could
have generated a pretty good
estimate of the temperature anomaly
over this same period.
Notice that all the CO2 model
estimates of the temperature
anomaly fall well within the 95%
Prediction Interval. This is a rather
unusually good situation.
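For readers who want to see how such a band can be built, here is a minimal sketch of a 95% prediction interval for a one-variable OLS regression, plus a check of how many hold-out observations fall inside it. It uses the textbook simple-regression formula with a normal quantile as an approximation to the Student t quantile. The presentation does not state how its interval was computed, so this is an assumption, and all names and numbers below are hypothetical.

```python
from math import sin, sqrt
from statistics import NormalDist

def fit_line(x, y):
    """Simple OLS fit y = intercept + slope * x, keeping what the interval needs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return {"intercept": intercept, "slope": slope, "n": n,
            "mean_x": mean_x, "sxx": sxx, "resid_se": sqrt(sse / (n - 2))}

def prediction_interval(fit, x0, level=0.95):
    """Return (lower, estimate, upper) for a new observation at x0."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation to Student t
    estimate = fit["intercept"] + fit["slope"] * x0
    half_width = z * fit["resid_se"] * sqrt(
        1 + 1 / fit["n"] + (x0 - fit["mean_x"]) ** 2 / fit["sxx"])
    return estimate - half_width, estimate, estimate + half_width

def coverage(fit, x_new, y_new, level=0.95):
    """Share of new observations that fall inside their prediction interval."""
    inside = 0
    for x0, y0 in zip(x_new, y_new):
        lower, _, upper = prediction_interval(fit, x0, level)
        inside += lower <= y0 <= upper
    return inside / len(x_new)

# Toy training and hold-out data (hypothetical, NOT the presentation's data)
train_co2 = [290 + 0.35 * t for t in range(102)]
train_temp = [0.01 * (c - 290) + 0.1 * sin(t) for t, c in enumerate(train_co2)]
hold_co2 = [341 + 1.8 * t for t in range(39)]
hold_temp = [0.01 * (c - 290) + 0.1 * sin(t) for t, c in enumerate(hold_co2)]

model = fit_line(train_co2, train_temp)
print("hold-out coverage:", coverage(model, hold_co2, hold_temp))
```

The interval widens as the new CO2 value moves away from the training mean, which matters here because the hold-out CO2 levels lie beyond the range seen in training.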
[Chart: Temperature Anomaly. Hold Out 1982 - 2020. With 95% Prediction Interval. Series: Actual, CO2 Model, Lower, Upper. X axis: years 1982 - 2020. Y axis: 0.0 to 1.2.]
50. A VAR model w/ 1 lag using LN(CO2) can predict with no info whatsoever!
[Chart: Temperature Anomaly. VAR w/ LN(CO2) forecast 1 lag, 1982 - 2020. P.I. 95%. Series: Actual, VAR fcst, Lower, Upper. X axis: years 1982 - 2020. Y axis: 0.0 to 1.2.]
Just using LN(CO2) instead of CO2 as
our second Z variable within a VAR
model with 1 lag generates a
surprisingly good forecast of the
temperature anomaly over the 1982
– 2020 period with no information
whatsoever regarding this period!
This is rather astonishing.
As shown, the VAR forecast
overestimates temperature by only
about 0.1 degree Celsius at the onset
in 1982 and at the end in 2020. That is
a very small error given that the model
is not fed any information.
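The mechanics of such a forecast can be sketched as follows: a VAR with 1 lag on temperature and LN(CO2) is fitted equation by equation with OLS, then iterated forward from the last training observation, feeding each forecast back in as the next input. This is a minimal pure-Python sketch under those assumptions, not the presentation's actual model or software. The function names and the toy data are hypothetical.

```python
from math import log, sin

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

def fit_var1(temp, ln_co2):
    """Fit both VAR(1) equations on [constant, temp(t-1), ln_co2(t-1)]."""
    rows = [[1.0, temp[t - 1], ln_co2[t - 1]] for t in range(1, len(temp))]
    return ols(rows, temp[1:]), ols(rows, ln_co2[1:])

def forecast_var1(coefs, last_temp, last_ln_co2, steps):
    """Iterate the fitted system forward with no information on the future."""
    b_temp, b_co2 = coefs
    t, c = last_temp, last_ln_co2
    path = []
    for _ in range(steps):
        t, c = (b_temp[0] + b_temp[1] * t + b_temp[2] * c,
                b_co2[0] + b_co2[1] * t + b_co2[2] * c)
        path.append((t, c))
    return path

# Toy series standing in for the 1880 - 1981 training data (hypothetical numbers)
co2, temp = [290.0], [0.0]
for year in range(1, 102):
    co2.append(co2[-1] * (1.001 + 0.0005 * sin(2.3 * year)))
    temp.append(0.6 * temp[-1] + 1.2 * (log(co2[-2]) - log(290.0))
                + 0.05 * sin(1.7 * year))
ln_co2 = [log(c) for c in co2]

coefs = fit_var1(temp, ln_co2)
path = forecast_var1(coefs, temp[-1], ln_co2[-1], steps=39)  # 39 years ahead
print("first and last forecast (temp, LN(CO2)):", path[0], path[-1])
```

Because the model forecasts LN(CO2) and temperature jointly, any error in the projected CO2 path carries into the temperature path.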
51. Comparing our CO2 Model vs. VAR (with LN(CO2)) forecasts
[Chart: Temperature Anomaly. Hold Out 1982 - 2020. Series: Actual, CO2 Model, VAR. X axis: years 1982 - 2020. Y axis: 0.0 to 1.2.]
Temperature anomaly over the 1982 - 2020 period
           Actual    CO2 Model    VAR
Average    0.537     0.561        0.617
Median     0.555     0.533        0.581
Max        1.005     0.975        1.100
Min        0.140     0.225        0.295
Range      0.865     0.751        0.804
Ok, the VAR model does
overestimate the temperature
anomaly a bit relative to the OLS
Cointegration Regression (CO2
Model). But the VAR
overestimation is really pretty
small considering that the VAR
model generated a 39-year forecast
with no info whatsoever. By
contrast, the CO2 model was fed
the precise CO2 concentration level
over that entire period. That is a
huge difference.
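The size of the overestimation can be read directly off the averages in the summary table. The short sketch below does that arithmetic and adds two generic helpers for a year-by-year comparison. The helper names are hypothetical, and the yearly series themselves are not reproduced here.

```python
from math import sqrt

# Averages copied from the summary table (1982 - 2020, degrees Celsius)
actual_avg, co2_model_avg, var_avg = 0.537, 0.561, 0.617

co2_model_bias = co2_model_avg - actual_avg
var_bias = var_avg - actual_avg
print(f"CO2 model average bias: {co2_model_bias:+.3f}")  # +0.024
print(f"VAR average bias:       {var_bias:+.3f}")        # +0.080

def mean_error(actual, forecast):
    """Average of forecast minus actual; positive means overestimation."""
    return sum(f - a for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean squared error of a forecast against actual values."""
    return sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))
```

On average the VAR forecast runs about 0.08 degree Celsius above actual, versus about 0.02 for the CO2 model.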
52. Why did the VAR (w/ LN(CO2)) overestimate temperature?
[Chart: CO2 (ppm). VAR (w/ LN(CO2)) forecast 1 lag, 1982 - 2020 with P.I. 95%. Series: Actual, VAR fcst, lower, upper. X axis: years 1982 - 2020. Y axis: 340 to 440 ppm.]
This question is a little perplexing because
we observed earlier that using LN(CO2)
instead of CO2 within our earlier OLS
regressions resulted in the LN(CO2) model
underestimating temperature over the Hold
Out (1982 – 2020) by quite a bit.
But when we use this same LN(CO2)
variable within this VAR model, instead of
underestimating temperature, it actually
overestimates it by a little bit.
Part of the reason is that this same VAR
model does overestimate CO2
concentration.
Remember that in the earlier Hold Out tests with the standard OLS regressions, the models were fed the actual CO2
concentration over the 1982 – 2020 period, while being trained over the 1880 – 1981 period. With this VAR
model, we are dealing with a rather extraordinary situation: it was trained over the 1880 – 1981 period and was
not provided any information over the Hold Out period (1982 – 2020). Yet it was asked to forecast temperature over that
same period. That is a very challenging situation.
53. Conclusion
Using CO2 concentration to estimate and forecast temperature anomaly levels was on many counts
surprisingly successful.
More complex models using additional variables associated with the Pacific Decadal Oscillation (El Nino (+); La
Nina (-)) proved not so successful. They could fit the historical data. But, they turned out inferior in
forecasting compared to the simpler model just using CO2 concentration.
Using the natural log of CO2 as an independent variable was surprisingly successful for replicating the IPCC
scenarios and also in forecasting the temperature anomaly over the 1982 – 2020 period with no info
whatsoever using a VAR model with one lag.
When it came to a full-fledged Granger causality analysis, our results were much humbler. We could confirm
Granger causality through the first two steps (Granger causality and its relationship direction). But, the
subsequent steps turned out to be rather ambivalent (VAR regression coefficients signs, IRFs, FEVD).