Backtesting Value at Risk and Expected Shortfall with
Underlying Volatility Clustering and Fat Tails
by
Stefano Bochicchio Estival BSc
A thesis submitted in conformity with the requirements
for the degree of Master of Science
Department of Mathematics
Faculty of Mathematical & Physical Sciences
University College London
September, 2016
Disclaimer
I, Stefano Bochicchio Estival, confirm that the work presented in this thesis is
my own. Where information has been derived from other sources, I confirm that this
has been indicated in the thesis.
Signature
Date
Abstract
Since the financial crisis in 2008, risk management has become one of the most
important topics in finance. The need to accurately assess the risk exposure of a
financial entity has ignited a discussion between academics and regulators to search
for the most accurate and reliable way to measure risk. The most prominent risk
measures are Value at Risk (VaR) and Expected Shortfall (ES). Furthermore, backtesting has become an important tool to verify the performance of risk measures.
In the context of the behaviour of financial time series, “volatility clustering” and “fat tails” are the most important properties [15, 36]. This motivates the following
question: What is the effect of these properties on the backtesting procedure of VaR
and ES?
The objective of this thesis is to investigate and analyse the backtesting procedure
of VaR and ES when exposed to data enriched with the properties of “fat tails” and
“volatility clustering”.
The structure of this thesis is as follows. First, the GARCH(1,1) model is proposed as a reliable tool that embodies the property of “volatility clustering” and the Student's t distribution as a trustworthy model that captures the “fat tails” property. Second, the parameters of the proposed GARCH(1,1) model and the Student's t distribution are estimated using the JPMM stock data, and random simulations are generated in order to obtain the “in-sample” and “out-of-sample” subsets. Third, the VaR and ES estimates for both models are computed using the “in-sample” subset. Fourth, VaR is backtested using Christoffersen's [12] tests, while ES is backtested using Acerbi and Szekely's [1] Tests I and II on the “out-of-sample” dataset. Finally, the corresponding p-values of the tests are calculated in order to conclude whether the estimated risk measures pass the backtest.
In conclusion, the following results were obtained. Regarding the GARCH(1,1) model, the VaR estimate overestimated the expected VaR violations. Additionally, these violations were not independent due to the “volatility clustering” property. Furthermore, the ES estimate passed the backtest but suggested that the “real” risk was overestimated.
Regarding the Student's t distribution, the VaR estimate passed the backtest as the VaR violations were in line with the estimation. Moreover, the violations proved to be independent. Likewise, the ES estimate passed the backtest. Hence, the “fat tails” property did not affect the backtesting procedure for either risk measure. Finally, further lines of investigation are recommended in order to study this topic with a different focus.
This thesis was completed under the supervision of Professor Johannes Ruf and
Professor Alejandro Gómez.
Acknowledgments
Firstly, I would like to thank Prof. Alejandro Gómez for his unconditional support and excellent supervision; I highly appreciate the dedication he showed to this thesis. Secondly, I would like to thank Prof. Johannes Ruf for his teachings during this whole year; I am deeply grateful for the help that he provided throughout my Master's studies and during this thesis. Thirdly, I would like to thank la bandita. Last but not least, I would like to thank my parents and Vanessa for their continuous support.
Stefano Bochicchio Estival, University College London, September 2016
To my family and to Vanessa, thanks for all the unconditional support.
Contents
Disclaimer
Abstract
Acknowledgments
List of Tables
List of Figures
1 Introduction
2 Properties of Financial Time Series
2.1 Volatility Clustering
2.1.1 GARCH Model
2.2 Fat Tails
2.2.1 Student's t Distribution
3 Properties of Risk Measures and Introduction to VaR and ES
3.1 Properties of Risk Measures
3.1.1 Coherence
3.1.2 Elicitability
3.1.3 Robustness
3.2 Introduction to Value at Risk
3.3 Introduction to Expected Shortfall
4 Theoretical Background for Backtesting VaR and ES
4.1 Statistical Background
4.2 Backtesting Value at Risk
4.2.1 Regulatory Framework
4.2.2 Statistical Framework
4.2.3 Unconditional Coverage Tests
4.2.3.1 Violation Ratio
4.2.3.2 Failure Test
4.2.3.3 Proportion of Failures (POF)
4.2.3.4 Christoffersen's Unconditional Coverage Test
4.2.4 Independence Tests
4.2.4.1 Christoffersen's Independence Test (Markov Test)
4.2.4.2 Christoffersen and Pelletier's Duration Test
4.2.5 Conditional Coverage Tests
4.2.5.1 Joint Markov Test
4.2.5.2 Christoffersen's Conditional Coverage Joint Test
4.3 Backtesting Expected Shortfall
4.3.1 Quantile Approximation
4.3.2 Acerbi and Szekely Test
4.3.2.1 Test I
4.3.2.2 Test II
4.3.2.3 Test III
5 Backtesting VaR and ES with the Generated Data
5.1 Data Generation
5.1.1 Volatility Clustering
5.1.2 Fat Tails
5.2 Computation of Risk Measures
5.2.1 Computation of Value at Risk
5.2.2 Computation of Expected Shortfall
5.3 Backtesting Value at Risk and Expected Shortfall Using Selected Tests
5.3.1 Backtesting Value at Risk
5.3.2 Backtesting Expected Shortfall
5.3.2.1 Acerbi and Szekely Test I
5.3.2.2 Acerbi and Szekely Test II
6 Conclusions, limitations and further research
A Statistical Tests For VaR Backtesting
Bibliography
List of Tables
2.1 Kurtosis of the FTSE Index with Fitted Normal Distribution
2.2 Kurtosis of the FTSE Index with Fitted Student's t Distribution
3.1 Summary of Properties for VaR and ES
4.1 Hypothesis Testing Summary Table
4.2 Contingency Table for Christoffersen's Markov Test
5.1 Statistical Test for the Residuals of the Returns Data Series of JPMM with α = 0.05
5.2 Parameter Estimates, Standard Error and Test Statistic for the Fitted GARCH(1,1) Model
5.3 Estimates and Standard Error of the Fitted Student's t Distribution
5.4 Kurtosis of the Simulated Student's t Distribution
5.5 Estimates of VaR for the GARCH(1,1) Model and the Student's t Distribution with the Corresponding α
5.6 Estimates of VaR and ES for Both Methods with Various Confidence Levels α
5.7 Test Statistic and p-value for the Z1(X) Test
5.8 Test Statistic and p-value for the Z2(X) Test
A.1 Statistical Test for the GARCH(1,1) Model with α = 0.05
A.2 Statistical Test for the GARCH(1,1) Model with α = 0.025
A.3 Statistical Test for the GARCH(1,1) Model with α = 0.01
A.4 Statistical Test for the Student's t Distribution with α = 0.05
A.5 Statistical Test for the Student's t Distribution with α = 0.025
A.6 Statistical Test for the Student's t Distribution with α = 0.01
List of Figures
2.1 FTSE Daily Returns
2.2 Empirical and Fitted Normal Distribution of the FTSE Index
2.3 Rescale of Empirical and Fitted Normal Distribution of the FTSE Index
2.4 Rescale of Empirical and Fitted Student's t Distribution of the FTSE Index
3.1 VaR and ES for a Loss Function that is Normally Distributed with µ = 0 and σ² = 1
5.1 Daily Returns of JPMM
5.2 Sample Autocorrelation Function and Sample Partial Autocorrelation Function
5.3 Simulated Conditional Variance and Returns for the Fitted GARCH(1,1) Model for a Selected Path
5.4 Sample Autocorrelations for the Conditional Variance (up) and Returns (down)
5.5 Fitted Student's t Distribution vs. Empirical Distribution
5.6 Cumulative Mean for the VaR of the GARCH(1,1) Model with Selected Values of α
5.7 Moving Average with a Window of 1,000 Observations for the VaR of the GARCH(1,1) Model with Selected Values of α
5.8 Cumulative Mean for the VaR of the Student's t Distribution with Selected Values of α in Conjunction with the Quantile Values Derived from the Distribution
5.9 Cumulative Mean for the ES of the GARCH(1,1) Model with Selected Values of α
5.10 Moving Average with a Window of 1,000 Observations for the ES of the GARCH(1,1) Model with Selected Values of α
5.11 Cumulative Mean for the ES of the Student's t Distribution with Selected Values of α
5.12 Backtesting VaR with Student's t Distribution Generated Data with α = 0.05
5.13 Backtesting VaR with Student's t Distribution Generated Data with α = 0.025
5.14 Backtesting VaR with Student's t Distribution Generated Data with α = 0.01
5.15 Backtesting VaR with GARCH(1,1) Generated Data with α = 0.05
5.16 Backtesting VaR with GARCH(1,1) Generated Data with α = 0.025
5.17 Backtesting VaR with GARCH(1,1) Generated Data with α = 0.01
5.18 Contribution to the Z2(X) Test Statistic for the GARCH(1,1) Model with α = 0.05
5.19 Contribution to the Z2(X) Test Statistic for the GARCH(1,1) Model with α = 0.025
5.20 Contribution to the Z2(X) Test Statistic for the GARCH(1,1) Model with α = 0.01
5.21 Contribution to the Z2(X) Test Statistic for the Student's t Distribution with α = 0.05
5.22 Contribution to the Z2(X) Test Statistic for the Student's t Distribution with α = 0.025
5.23 Contribution to the Z2(X) Test Statistic for the Student's t Distribution with α = 0.01
Chapter 1
Introduction
Since the financial crisis in 2008, risk management has become one of the most
important topics in finance. The need to accurately assess the risk exposure of a
financial entity has ignited a discussion between academics and regulators to search
for the most accurate and reliable way to measure risk. The most common type
of risk is market risk, which measures the sensitivity of the value of a portfolio
with respect to changes in the price of the underlying financial products. Another
important manifestation of risk is credit risk, which embodies the risk of not receiving outstanding payments from a financial counterparty due to a default. Within the realm of credit risk there exists a subset called counterparty credit risk, which is mainly incurred when trading OTC¹ derivatives, as the fulfilment of future cash flows
depends directly on a financial counterparty. Moreover, liquidity risk corresponds to
the risk that arises when financial positions cannot be opened or closed at the desired
prices due to a lack of trading activity in the market. Operational risk measures the
risk associated with partial or complete failure of internal processes such as human
or computational systems [39]. Within operational risk, legal risk corresponds to the
unexpected losses attributed to a defective transaction related to a dispute or legal
action against a certain financial entity [38] (for more information about other types of risk, see McNeil et al. [39]).

¹ OTC stands for “over the counter”, which denotes the non-standardized contracts that are not traded on exchanges but directly between counterparties.
A natural question that arises when dealing with risk is: How can risk be measured?
In their seminal article, Emmer et al. [24] mention that the concept of risk measurement is fundamental to the correct management of risk. Specifically, Kou et al. [30] indicate that “a risk measure attempts to assign a single numerical value to the random loss of a portfolio of assets.”
The modern history of risk measurement starts with Markowitz in 1952 [37], when he introduced the concept of risk together with the return of a financial product. In his work, Markowitz defines risk as the “standard deviation” of returns [24].
At the end of 1974, the Basel Committee on Banking Supervision (BCBS) was
established by the members of the Group of Ten (G-10) countries. Its objective is to
ensure global financial stability by setting the minimum regulatory framework for the
supervision of the banking industry. In the second agreement of the BCBS in 2004
[4], Value at Risk (VaR) was adopted as the benchmark downside risk measure to
quantify the market risk for financial institutions (for more information about other
downside risk measures please see Nawrocki [40]) [6, 31, 39]. In October 2013, the
BCBS [5] proposed a change in its regulations and, therefore, introduced Expected
Shortfall (ES) as a suggested financial risk measure to capture unexpected losses
incurred in financial distress [1].
In the financial regulatory framework, risk measures need to be backtested to accurately assess the capital that needs to be set aside in order to cover extreme portfolio losses. When it comes to backtesting VaR, certain standardized tests can be implemented in order to cross-check the current capital requirements, as explained by Campbell [9] and Kupiec [31].
On the other hand, Gneiting [26] and Carver [10] mention that ES is not backtestable because it does not fulfil the property of elicitability (see Section 3.1.2). Nevertheless, Acerbi and Szekely [1], Kerkhof et al. [29] and Costanzino et al. [18] argue that elicitability is not a necessary condition for a risk measure to be backtestable. As a consequence, these authors introduce standardized non-parametric backtesting procedures for ES.
In the context of the behaviour of financial time series, the property of “volatility clustering” is frequently exhibited by financial assets, as first shown by Mandelbrot [36] and studied by Cont [16]. Moreover, after the financial crisis of 2008, the property
of “fat tails” in the probability distribution of prices has manifested in the dynamics
of financial markets (for further information please see Dash [20]). Now the following
question can be asked: What is the effect of these properties on the backtesting
procedure of VaR and ES?
The objective of this thesis is to investigate and analyse the backtesting procedure
of VaR and ES when exposed to data enriched with the properties of “fat tails” and
“volatility clustering”.
In Chapter 2, the properties of “volatility clustering” and “fat tails” are presented. The GARCH(1,1) model is proposed as a tool that embodies the property of “volatility clustering” and the Student's t distribution is taken as a trustworthy model that captures the “fat tails” property. Moreover, the fulfilment of these properties is empirically evidenced in a specific financial time series, namely the FTSE index. In Chapter 3, a background on the properties that are important for risk measures is given. Additionally, VaR and ES are introduced. In Chapter 4, a thorough analysis of the backtesting procedures available for VaR and ES is undertaken.
In Chapter 5, the methodology of the thesis is introduced. First, the parameters of the proposed GARCH(1,1) model and the Student's t distribution are estimated using the JPMM stock data, and random simulations are generated in order to obtain the “in-sample” and “out-of-sample” subsets. As a next step, the VaR and ES estimates for both models are computed using the “in-sample” subset. Afterwards, VaR is backtested using Christoffersen's [12] tests, while ES is backtested using Acerbi and Szekely's [1] Tests I and II on the “out-of-sample” dataset. Finally, the corresponding p-values of the tests are calculated in order to conclude whether the estimated risk measures pass the backtest.
Chapter 2
Properties of Financial Time Series
In this chapter, a theoretical background introduces the two most common properties of financial time series: “volatility clustering” and “fat tails”. Furthermore, two models are introduced as catalysts of these properties. In particular, the GARCH(1,1) model is used to capture the “volatility clustering” property and the Student's t distribution is chosen to embody the property of “fat tails”.
2.1 Volatility Clustering
The volatility clustering phenomenon was first described by Mandelbrot [36]: “large changes tend to be followed by large changes, of either sign, and small changes tend to be followed by small changes.” In other words, when volatility is high during a certain period it tends to remain high in subsequent periods, and vice versa. Moreover, Cont [16] indicates that the volatility clustering effect corresponds to the fact that the returns of financial time series exhibit non-linear dependence across time.
As Figure 2.1 illustrates, large returns arrive in consecutive clusters in the FTSE Index. This is a clear manifestation of the volatility clustering effect on financial instruments.
Figure 2.1: FTSE Daily Returns¹
2.1.1 GARCH Model
The GARCH (Generalized Autoregressive Conditional Heteroscedasticity) model
was developed by Engle [25] and generalized by Bollerslev [8]. This model is a
popular reference in modelling the dynamic variability of time series. Due to the
¹ Price data obtained from www.yahoofinance.com.
fact that prices fluctuate during periods of financial stress, conditional variances are
non-constant.
GARCH models have proven to be interesting tools to embody the volatility
clustering effect on financial time series [8, 25]. This is due to the fact that in the
GARCH model, the present level of volatility is dependent on the volatility of one
period before. For example, if volatility is high in the previous time step, it is likely to remain high in the next time step. Therefore, in the realm of finance, the GARCH model is an appealing option for modelling financial time series [44, 50].
Cont [16] even calls the volatility clustering feature the “GARCH effect”. However, the author mentions that this feature is non-parametric and is not intrinsically tied to the GARCH(1,1) model specification.
Definition 2.1.1 GARCH process. The process X_t follows a GARCH(p, q) process with q past squared innovations (X²_{t−i}) and p past conditional variances (σ²_{t−i}) if
\[
\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i X_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2,
\qquad
X_t = \sigma_t \epsilon_t,
\tag{2.1}
\]
where ω ∈ ℝ, α_i, β_i ≥ 0 and ε_t ∼ N(0, 1).
Due to both its usefulness and importance in the financial industry [44, 46], this thesis focuses on the GARCH(1,1) process, which takes one lag for the past conditional variance (σ²_{t−1}) and one lag for the past squared innovation (X²_{t−1}). The GARCH(1,1) model can be represented using Definition 2.1.1 with p = 1 and q = 1:
\[
\sigma_t^2 = \omega + \alpha X_{t-1}^2 + \beta \sigma_{t-1}^2, \tag{2.2}
\]
where ω ∈ ℝ and α, β ≥ 0.
For the purpose of this thesis, the returns X_t of the selected financial time series follow a GARCH(1,1) process with X_t ∼ N(0, σ²_t), where σ²_t satisfies Equation 2.2. Additionally, in order for the GARCH(1,1) model to have a stationary solution, the following condition needs to hold:
\[
\alpha + \beta < 1. \tag{2.3}
\]
Lindner [34] argues that the process X_t has a finite variance if, and only if, Equation 2.3 is fulfilled.
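The recursion in Equation 2.2 and the stationarity condition in Equation 2.3 can be illustrated with a short simulation. The sketch below assumes Python with NumPy, and the parameter values are illustrative only, not the fitted JPMM estimates of Chapter 5:

```python
import numpy as np

def simulate_garch11(omega, alpha, beta, n, seed=0):
    """Simulate n returns from a GARCH(1,1) process with N(0,1) innovations."""
    assert alpha + beta < 1, "stationarity condition (Eq. 2.3) violated"
    rng = np.random.default_rng(seed)
    sigma2 = np.empty(n)
    x = np.empty(n)
    # start at the unconditional variance omega / (1 - alpha - beta)
    sigma2[0] = omega / (1 - alpha - beta)
    x[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, n):
        sigma2[t] = omega + alpha * x[t - 1] ** 2 + beta * sigma2[t - 1]
        x[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return x, sigma2

# illustrative (hypothetical) parameters with high persistence alpha + beta = 0.98
x, sigma2 = simulate_garch11(omega=1e-6, alpha=0.08, beta=0.90, n=10_000)
print(x.var(), 1e-6 / (1 - 0.08 - 0.90))  # sample vs. unconditional variance
```

Persistence (α + β close to 1) produces the visible volatility clusters: a large |X_{t−1}| raises σ²_t, which in turn makes another large return more likely.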
For the estimation of the parameters, the maximum likelihood approach is usually used to produce the estimated parameters of the model. It is known that
\[
X_t \sim N(0, \sigma_t^2), \qquad \sigma_t^2 = \omega + \alpha X_{t-1}^2 + \beta \sigma_{t-1}^2, \tag{2.4}
\]
so in order to find the estimated coefficient vector ν = (ω, α, β)^T, the following quantities are computed:
\[
\nabla L(\theta) = 0.5 \sum_{t=2}^{n} \left( \frac{X_t^2}{\sigma_t^2} - 1 \right) \frac{1}{\sigma_t} \frac{\partial \sigma_t}{\partial \nu}, \tag{2.5}
\]
\[
J = -0.5 \sum_{t=2}^{n} \mathbb{E}\left[ \frac{1}{\sigma_t^2} \frac{\partial \sigma_t}{\partial \nu} \frac{\partial \sigma_t}{\partial \nu^T} \right], \tag{2.6}
\]
where ∇L(θ) is the gradient of the log-likelihood function and J is Fisher's information matrix. Consequently, the estimated parameters can be found using the iterative scheme of Newton's optimization method (see Yang [54]).
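As an illustration of the maximum likelihood step, the sketch below fits a GARCH(1,1) model to simulated data by direct numerical minimization of the negative log-likelihood. It assumes Python with NumPy/SciPy and, for simplicity, replaces the Newton scheme based on Equations 2.5 and 2.6 with a generic derivative-free optimizer; all parameter values are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, x):
    """Negative Gaussian log-likelihood of a GARCH(1,1) model, up to a constant."""
    omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return np.inf          # enforce positivity and stationarity (Eq. 2.3)
    sigma2 = np.empty(len(x))
    sigma2[0] = x.var()        # a common initialization choice
    for t in range(1, len(x)):
        sigma2[t] = omega + alpha * x[t - 1] ** 2 + beta * sigma2[t - 1]
    return 0.5 * np.sum(np.log(sigma2) + x ** 2 / sigma2)

# simulate data from known (illustrative) parameters, then try to recover them
rng = np.random.default_rng(1)
omega0, alpha0, beta0, n = 1e-6, 0.08, 0.90, 5_000
s2 = omega0 / (1 - alpha0 - beta0)
x = np.empty(n)
for t in range(n):
    if t > 0:
        s2 = omega0 + alpha0 * x[t - 1] ** 2 + beta0 * s2
    x[t] = np.sqrt(s2) * rng.standard_normal()

res = minimize(neg_loglik, x0=[1e-5, 0.10, 0.80], args=(x,), method="Nelder-Mead")
omega_hat, alpha_hat, beta_hat = res.x
print(alpha_hat, beta_hat)   # compare with the true values 0.08 and 0.90
```

In practice a Newton-type scheme with the analytic gradient and information matrix, as in Equations 2.5 and 2.6, converges faster; the derivative-free route above simply avoids coding those derivatives.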
2.2 Fat Tails
The property of fat tails² in time series has been labelled a stylized fact³ of financial assets, as evidenced by [15, 21]. This feature refers to the property that the data possesses extreme values that tend to lie far from the mean of the distribution. In other words, the data is underestimated by a normal distribution, as it assigns a low probability to events far from the mean. Therefore, a better treatment can be provided by heavy-tailed distributions such as the Student's t distribution. Cont [15] mentions that the precise behaviour of the tails may sometimes be difficult to determine.
From a mathematical viewpoint, the property of fat tails can be represented as
\[
P(X > x) \sim x^{-\alpha}, \qquad \alpha > 0. \tag{2.7}
\]
In other words, the asymptotic density of the extreme events decays polynomially with exponent α > 0. By contrast, the density of a normal distribution decays exponentially in x², which is faster than any polynomial decay, whereas the tail of the Student's t distribution does decay polynomially.
A useful indicator of the property of fat tails in the data is the kurtosis (normalized fourth moment) of the distribution, which is defined as
\[
K_X = \frac{\mathbb{E}[(X - \mu)^4]}{\left(\mathbb{E}[(X - \mu)^2]\right)^2}, \tag{2.8}
\]
where µ corresponds to the mean of the random variable X.

² The terms fat tails and heavy tails are used interchangeably in the literature and in this thesis.
³ Cont [15] mentions that a stylized fact is defined as “a common denominator among the properties observed in studies of different markets and instruments.”
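Equation 2.8 can be checked numerically on simulated samples: for a normal distribution the kurtosis is 3, while a fat-tailed Student's t sample gives a markedly larger value. A sketch assuming Python with NumPy:

```python
import numpy as np

def kurtosis(x):
    """Normalized fourth moment K_X from Equation 2.8 (equals 3 for a normal law)."""
    d = x - x.mean()
    return np.mean(d ** 4) / np.mean(d ** 2) ** 2

rng = np.random.default_rng(0)
normal_sample = rng.standard_normal(1_000_000)
t_sample = rng.standard_t(df=5, size=1_000_000)  # fat-tailed alternative

print(kurtosis(normal_sample))  # close to 3
print(kurtosis(t_sample))       # well above 3 (theoretical value is 9 for df = 5)
```

The excess kurtosis of the t sample mirrors what Table 2.1 shows for the FTSE data: extreme observations inflate the fourth moment far beyond the normal benchmark of 3.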
Distribution Kurtosis
Empirical Distribution 12.5354
Fitted Normal Distribution 3.0018
Table 2.1: Kurtosis of the FTSE Index with Fitted Normal Distribution
Table 2.1 shows that the kurtosis of the empirical data extracted from the FTSE index is higher than that of the fitted normal distribution. Hence, as already mentioned, the excess kurtosis observed in the FTSE index cannot be accurately modelled by a normal distribution.
Moreover, Figures 2.2 and 2.3 present the empirical distribution in conjunction with the fitted normal distribution for the daily compounded returns of the FTSE index. As can be seen in the graphs, the empirical distribution assigns more probability to extreme events than the fitted normal distribution. Hence, this suggests that the empirical data may possess the property of fat tails, as also observed in Table 2.1.
Figure 2.3 presents the events that are higher than µ + 3σ for the empirical distribution. It is known that the fitted normal distribution covers about 99.7% of its area in the interval (µ − 3σ, µ + 3σ); therefore, events higher than µ + 3σ are extremely unlikely. To put this into perspective, if a normal random number were drawn every day, the event that a draw lies outside the interval (µ − 6σ, µ + 6σ) would occur about once every 1.38 million years. On the contrary, Figure 2.3 shows that this event happened more than once in the last 32 years of FTSE index data.
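The waiting-time figure quoted above can be reproduced directly; a sketch assuming Python with SciPy:

```python
from scipy.stats import norm

# probability that a normal draw falls outside (mu - 6*sigma, mu + 6*sigma)
p_six_sigma = 2 * norm.sf(6)          # sf(6) = 1 - Phi(6)
days_between = 1 / p_six_sigma        # expected waiting time with one draw per day
years_between = days_between / 365.25
print(f"p = {p_six_sigma:.3e}, once every {years_between:,.0f} years")
```

With one draw per day, the expected waiting time for a 6σ event comes out at roughly 1.4 million years, in line with the figure quoted above.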
2.2.1 Student’s t Distribution
As evidenced in Section 2.2, the normal distribution may underestimate the true underlying behaviour of the returns of a financial time series. Therefore, when it comes to modelling extreme returns in financial time series, the use of fat-tailed distributions seems appropriate.

Figure 2.2: Empirical and Fitted Normal Distribution of the FTSE Index
Figure 2.3: Rescale of Empirical and Fitted Normal Distribution of the FTSE Index
One of the most famous fat-tailed distributions is the Student's t distribution. Stoyanov [48] mentions that this distribution is so widespread because of its simplicity and the ease of implementing numerical methods for its application. Therefore, the Student's t distribution is used as the catalyst to generate the dataset enriched with the property of fat tails.
Definition 2.2.1 Student's t distribution. Let X be a random variable. X follows a Student's t distribution with ν degrees of freedom if it has the following probability density function:
\[
f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}
\left( 1 + \frac{1}{\nu}\left(\frac{t-\mu}{\sigma}\right)^2 \right)^{-\frac{\nu+1}{2}}, \tag{2.9}
\]
where µ and σ correspond to the location and the scaling parameters respectively. Moreover, Γ corresponds to the Gamma function, which is defined in Equation 2.10:
\[
\Gamma(t) = \int_0^{\infty} x^{t-1} e^{-x}\, dx. \tag{2.10}
\]
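The location-scale form in Equation 2.9 corresponds to the loc and scale parameters of scipy.stats.t. The following sketch (Python with NumPy/SciPy, hypothetical data) fits both a Student's t and a normal distribution to a heavy-tailed sample and compares the probability each fitted model assigns to a large loss:

```python
import numpy as np
from scipy.stats import norm, t

rng = np.random.default_rng(0)
# hypothetical daily returns drawn from a Student's t law with 4 degrees of
# freedom, scaled so the standard deviation is about 1% (t variance is nu/(nu-2))
nu = 4.0
scale = 0.01 / np.sqrt(nu / (nu - 2))
returns = t.rvs(df=nu, loc=0.0, scale=scale, size=100_000, random_state=rng)

# fit both candidate models by maximum likelihood
df_hat, loc_t, scale_t = t.fit(returns)
loc_n, scale_n = norm.fit(returns)

# probability of a daily loss worse than -5% under each fitted model
p_t = t.cdf(-0.05, df_hat, loc=loc_t, scale=scale_t)
p_n = norm.cdf(-0.05, loc=loc_n, scale=scale_n)
print(p_t, p_n)
```

The normal fit matches the bulk of the sample but assigns orders of magnitude less probability to the 5% move, which is precisely the underestimation of tail risk discussed above.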
Returning to the example concerning the FTSE index, Figure 2.4 shows the same information as Figure 2.3, except that a fitted Student's t distribution is used instead. As the graph shows, this distribution gives more weight to the extreme values than the fitted normal distribution. Finally, Table 2.2 reports the kurtosis of the fitted Student's t distribution compared to the empirical distribution. Clearly, the Student's t distribution matches the actual kurtosis of the empirical data better than the normal fit of Table 2.1.
Figure 2.4: Rescale of Empirical and Fitted Student's t Distribution of the FTSE Index

Distribution Kurtosis
Empirical Distribution 12.5354
Fitted Student's t Distribution 20.2836
Table 2.2: Kurtosis of the FTSE Index with Fitted Student's t Distribution
Chapter 3
Properties of Risk Measures and
Introduction to VaR and ES
In this chapter, the theoretical background of risk measures is presented. Moreover, the standard risk measures proposed by regulators and the industry are introduced, namely VaR and ES.
3.1 Properties of Risk Measures
As already mentioned in Chapter 1, “a risk measure attempts to assign a single numerical value to the random loss of a portfolio of assets” (Kou et al. [30]). In this section, a formal definition is given in terms of the desired properties that a risk measure must possess. This section is based on the layout presented by Emmer et al. [24].
3.1.1 Coherence
The concept of coherence is important as it groups various mathematical properties that should be taken into account in order to select a suitable risk measure [24]. Specifically, Artzner et al. [3] propose the following four key properties that need to be fulfilled in order for a risk measure to be coherent.
Definition 3.1.1 Homogeneity. A certain risk measure ζ(·) is called homogeneous
if for all loss variables L and h ≥ 0 it holds that
ζ(hL) = hζ(L)
Definition 3.1.2 Subadditivity. A certain risk measure ζ(·) is called subadditive
if for all loss variables L and K it holds that
ζ(L + K) ≤ ζ(L) + ζ(K)
Definition 3.1.3 Monotonicity. A certain risk measure ζ(·) is called monotonic
if for all loss variables L and K it holds that
L ≤ K =⇒ ζ(L) ≤ ζ(K)
Definition 3.1.4 Translation Invariance. A certain risk measure ζ(·) is called
translation invariant if for all loss variables L and δ ∈ R it holds that
ζ(L − δ) = ζ(L) − δ
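Homogeneity and translation invariance can be verified numerically for the empirical quantile, which underlies the VaR measure introduced in Section 3.2. A sketch assuming Python with NumPy; the sample and the values of h and δ are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
L = rng.standard_normal(1_000_000)     # a sample of losses

def empirical_var(losses, alpha=0.99):
    """Empirical alpha-quantile of the loss sample (the VaR of Section 3.2)."""
    return np.quantile(losses, alpha)

h, delta = 2.5, 0.7
v = empirical_var(L)
print(np.isclose(empirical_var(h * L), h * v))          # homogeneity holds
print(np.isclose(empirical_var(L - delta), v - delta))  # translation invariance holds
```

Subadditivity, in contrast, can fail for quantile-based measures, which is the classical objection to VaR as a coherent risk measure.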
3.1.2 Elicitability

Elicitability [26, 32, 43] plays an important role in the determination of an appropriate risk measure. Before formalizing the definition of elicitability, the following definitions need to be introduced.
Definition 3.1.5 Scoring function. A scoring function is a map
s : ℝ × ℝ → [0, ∞), (x, y) ↦ s(x, y),
where x and y correspond to the forecast and the realization respectively. In words, a scoring function assigns a numerical score based on the distance between the forecasted value and the realized value. For example, this difference could be measured by the squared error s(x, K) = (x − K)² or the absolute error s(x, K) = |x − K|.
Definition 3.1.6 Consistency. Let τ be a functional on a class of probability measures P on ℝ:
τ : P → 2^ℝ, Q ↦ τ(Q) ⊂ ℝ.
A scoring function s : ℝ × ℝ → [0, ∞) is consistent for the functional τ relative to the class P if and only if, for all Q ∈ P, t ∈ τ(Q) and x ∈ ℝ, with L the loss random variable defined on (Ω, F, Q),
E_Q[s(t, L)] ≤ E_Q[s(x, L)].

Definition 3.1.7 Strict Consistency. A scoring function s is strictly consistent if and only if it is consistent and
E_Q[s(t, L)] = E_Q[s(x, L)] =⇒ x ∈ τ(Q).
Finally, the definition of elicitability can be introduced.
Definition 3.1.8 Elicitability. The functional τ is elicitable relative to P if and
only if there exists a scoring function S which is strictly consistent for τ relative to
P.
This definition is used by Emmer et al. [24]. Moreover, the authors mention that
elicitability is a very helpful property for the determination of optimal point forecasts.
Hence, if there exists a strictly consistent scoring function s for a functional τ, then
the elicited statistic can be written as (a formulation also used by Acerbi and Szekely [1])
ι(K) = arg min_x E[s(x, K)]   (3.1)
where s is a scoring function and ι(K) is a statistic of the random variable K. One
of the most important properties of elicitability is that it can be utilized to assess the
performance of forecast models [26]. It is worth noting that usually elicitability refers
to the risk measure itself and not to a functional with respect to the risk measure.
In this sense, a “weak” second-order notion of elicitability can be defined as follows
Definition 3.1.9 Conditional Elicitability. A functional τ of Q is called condi-
tionally elicitable if there exist functionals γ : Q → 2^R and κ : D → 2^R with
D ⊂ Q × 2^R that satisfy the following
• γ is elicitable relative to Q
• (P, γ(P)) ∈ D ∀P ∈ Q
• ∀c ∈ γ(Q) the functional κ_c : Q_c → 2^R, P ↦ κ(P, c) ⊂ R is elicitable relative
to Q_c = {P ∈ Q : (P, c) ∈ D}
The property of conditional elicitability is relevant when forecasting risk measures
that are not elicitable.
3.1.3 Robustness
Robustness refers to the sensitivity of a model to changes in its underlying
parameters. A robust risk measure, in a strict sense, is not significantly affected by
external or internal shocks. In the risk context, Emmer et al. [24] mention that
without robustness, results may lose their relevance, since small measurement errors
can lead to large changes in the estimated risk measure.
Furthermore, Cont et al. [17] define robustness with a different focus: instead of
attributing the sensitivity to measurement errors, they attribute it to the inflow of
new data used to estimate the model.
When analysing the robustness of a certain risk measure, a distance must be
defined. Emmer et al. [24] propose the following definition
Definition 3.1.10 Wasserstein distance. The Wasserstein distance between two
probability measures P and Q is
D_ws(P, Q) = inf{E(|X − Y |) : X ∼ P, Y ∼ Q}   (3.2)
Using Definition 3.1.10, the definition of robustness can be introduced.
Definition 3.1.11 Robustness. A risk measure µ is called robust with respect to
the Wasserstein distance if
lim_{n→∞} D_ws(X_n, X) = 0 ⇒ lim_{n→∞} |µ(X_n) − µ(X)| = 0   (3.3)
where X_n ∼ P_n, n ∈ N, each P_n corresponds to a probability measure and µ
corresponds to a certain risk measure.
3.2 Introduction to Value at Risk
VaR is the most widespread risk measure in finance. It was developed by J.P.
Morgan in 1994 in their publication of the RiskMetrics framework, which established
it as a benchmark risk measure in the industry [28, 41]. Afterwards, as already
mentioned in Chapter 1, the Basel Committee on Banking Supervision introduced
VaR as the internal benchmark for banks to calculate their capital requirements.
Definition 3.2.1 Value at Risk (VaR). A portfolio’s Value at Risk (VaR) corre-
sponds to the α quantile of the profit and loss distribution X [9]
VaR_t(α) = −F^{−1}(α),  where F^{−1}(α) = inf{x ∈ R : F(x) ≥ α}   (3.4)
and F^{−1}(α) is the quantile function (inverse CDF¹) of the profit and loss distribu-
tion.
When defining VaR, a confidence level (1 − α) and a time interval (t) must be given.
Specifically, t determines the horizon of the distribution, and from the risk metric
perspective this parameter is introduced for convenience and clarity.
As an illustrative example, let t = 1 and α = 0.01. If the VaR corresponds to the
value of $1,000, then under the mathematical definition the loss incurred by the
portfolio in one day exceeds $1,000 only 1% of the time; 99% of the time it does not.
In spite of the popularity and simplicity of VaR, some shortcomings have been
diagnosed in this risk measure. As a first important drawback, VaR does not provide
any information regarding the magnitude of the excess loss beyond the α level. This
is an important pitfall, as the VaR could underestimate the actual loss of the portfo-
lio [42,45,53]. Second, VaR is criticized for its lack of subadditivity (see Definition
3.1.2) and therefore its lack of coherence (see Definition 3.1.1). This result is
disturbing because diversification may bring no direct benefit: the VaR of a combined
portfolio can actually be higher than the sum of the individual VaRs [39,53]. Never-
theless, there are some cases in which VaR is subadditive. For example, according
to Haugh [27], VaR is indeed subadditive when dealing with elliptical distributions
as well as with distributions that are continuous and symmetric.
¹ Assuming F is continuous and the inverse function exists.
Regarding the computation of VaR, there exist three major techniques that are
commonly implemented.
1. Variance-Covariance Approach. The variance-covariance approach is based
on the assumption that returns are normally distributed. Historical data is
therefore used to estimate the parameters of the normal distribution (µ, σ²).
Consequently, computing VaR quantiles reduces to reading them off the fitted
normal distribution.
A strong advantage of this method is that it is very flexible and simple to use.
Moreover, it facilitates the inclusion of stress scenarios to analyse the sensitivity
of the results when parameters are changed [51].
However, the most important pitfall of this technique is that the returns of
the portfolio are assumed to be normally distributed. As already explained
in Section 2.2, the normal distribution may sometimes underestimate the true
behaviour of financial assets.
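As a minimal sketch of this approach (the return sample and parameter choices below are hypothetical), the quantile calculation under normality reduces to a few lines:

```python
import statistics

def variance_covariance_var(returns, alpha=0.01):
    """One-day VaR under the normality assumption: the alpha quantile of
    the fitted N(mu, sigma^2) return distribution, reported as a loss."""
    mu = statistics.mean(returns)
    sigma = statistics.stdev(returns)
    z = statistics.NormalDist().inv_cdf(alpha)  # about -2.326 for alpha = 0.01
    return -(mu + z * sigma)

# Hypothetical daily return sample
returns = [0.001, -0.004, 0.002, -0.012, 0.005, -0.003, 0.007, -0.008]
var_99 = variance_covariance_var(returns, alpha=0.01)
```

The whole method amounts to estimating two parameters and evaluating one normal quantile, which is what makes it so cheap and so sensitive to the normality assumption.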
2. Historical Simulation. As mentioned by O’Brien et al. [42], the Historical
Simulation technique is the most popular approach for calculating VaR.
This technique is mainly based on the historical information of the financial
products that compose a specific portfolio. It is assumed that the weights
of the financial instruments in the portfolio do not change for the observation
period. In this case, VaR is obtained by inspecting the quantiles of the empirical
distribution generated by the historical prices.
The key advantage of this technique lies in the fact that it is non-parametric.
In other words, there is no need to estimate any kind of parameters as the
distribution is based on the historical prices. Moreover, Nieppola [41] mentions
that using historical data series can account for the property of “heavy tails”
of the distribution.
One of the most important pitfalls of this model is that it assumes that the
behaviour of past prices is a good model for its behaviour in the future; “driving
by looking in the rearview mirror”. Therefore, it assumes that history could
repeat itself in the future. For example, Dowd [22] mentions that if the data is
unusually quiet, the VaR calculated under the Historical Simulation approach
could underestimate the “true risk”. Moreover, another important shortcoming
of the model is that, as past prices are its most important input, a long history
of data is needed. That could pose a problem for financial instruments that have a
short-lived history [41] (for more information regarding the Historical Simulation
approach refer to Dowd [22]).
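A minimal sketch of the Historical Simulation approach, assuming a hypothetical return history and a simple left-tail index convention for the empirical quantile:

```python
def historical_var(returns, alpha=0.01):
    """Historical Simulation VaR: the empirical alpha quantile of the
    observed return series, reported as a loss (sign flipped)."""
    ordered = sorted(returns)
    k = int(alpha * len(ordered))  # simple left-tail index convention
    return -ordered[k]

# Hypothetical history of 100 daily returns between -0.50 and 0.49
history = [i / 100 for i in range(-50, 50)]
var_95 = historical_var(history, alpha=0.05)  # 6th-worst return, sign flipped
```

No distributional parameters are estimated: the whole "model" is the sorted history itself, which is exactly why the method inherits both the heavy tails and the quiet periods of the sample.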
3. Monte Carlo Simulation. The Monte Carlo Simulation approach, despite
being a very powerful VaR calculation technique, is the most challenging one
to implement [22]. The Monte Carlo method relies on the simulation of financial
variables whose models are estimated from market data. Specifically, price
paths are simulated at various times to obtain the implied distribution from
which VaR estimates can be computed.
One of the most important disadvantages of the Monte Carlo approach is its
extremely high computational cost. Many simulated paths need to be generated
in order to obtain a robust result, which requires substantial computational
memory. This can be crucial when trading in a high-frequency environment or
when estimating the risk of the whole portfolio of a large bank (for more
information about the Monte Carlo technique see Nieppola [41]).
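A minimal sketch of the Monte Carlo idea, assuming a fitted one-day normal return model N(µ, σ²) instead of full price-path dynamics; all parameters are hypothetical:

```python
import random

def monte_carlo_var(mu, sigma, alpha=0.01, n_paths=100_000, seed=42):
    """Monte Carlo VaR: simulate one-day returns from the fitted model and
    read the alpha quantile off the simulated distribution."""
    rng = random.Random(seed)
    simulated = sorted(rng.gauss(mu, sigma) for _ in range(n_paths))
    return -simulated[int(alpha * n_paths)]

# With mu = 0 and sigma = 1%, the true 1% VaR is about 2.33%
var_mc = monte_carlo_var(mu=0.0, sigma=0.01, alpha=0.01)
```

Even this toy version needs 100,000 draws to pin the tail quantile down, which illustrates the computational-cost point above: a realistic multi-asset portfolio multiplies that effort by the number of instruments and horizons.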
3.3 Introduction to Expected Shortfall
As already shown in Section 3.2, one of the main drawbacks of VaR is that it
does not take into account the magnitude of the loss. As a consequence, Expected
Shortfall² was introduced as an enhanced risk measure.
Definition 3.3.1 Expected Shortfall. Let X be a profit and loss random variable
such that E(X) < ∞ with probability density function f_X. The ES can be defined as
follows
ES_t(α) = (1/(1 − α)) ∫_α^1 g_u(f_X) du = −E[X | X ≤ −VaR_t(α)]   (3.5)
where g(·) corresponds to the quantile function, i.e. the inverse cumulative distribution
function of the profit and loss distribution.
Put into words, the Expected Shortfall as denoted in Equation 3.5 averages the
tail of the loss distribution over the losses that exceed the VaR threshold. As a
consequence, the following relationship holds
|ES_t(α)| ≥ |VaR_t(α)|   (3.6)
2
Expected Shortfall is also defined with a different nomenclature. Across the literature, it is also
called Expected Tail Loss, Conditional VaR, Tail VaR, Tail Conditional Expectation, and Worst
Conditional Expectation. For more information refer to Acerbi and Tasche [2].
Figure 3.1: VaR and ES for a Loss Function that is Normally Distributed with µ = 0 and
σ2 = 1.
Figure 3.1 depicts the values of VaR and ES for a normally distributed loss
random variable with µ = 0 and σ² = 1. As can be observed in the graph,
Equation 3.6 holds.
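The relationship in Equation 3.6 can be checked in closed form for the standard normal case of Figure 3.1; the formula ES(α) = φ(z_α)/α is the standard normal tail-expectation result, not taken from the text:

```python
import math
from statistics import NormalDist

def normal_var_es(alpha=0.01):
    """Closed-form VaR and ES when the P&L is standard normal:
    VaR = -z_alpha and ES = pdf(z_alpha) / alpha, z_alpha the alpha quantile."""
    z = NormalDist().inv_cdf(alpha)  # about -2.326 for alpha = 0.01
    var = -z
    es = math.exp(-z * z / 2) / (math.sqrt(2 * math.pi) * alpha)
    return var, es

var_99, es_99 = normal_var_es(0.01)  # Equation 3.6: |ES| >= |VaR|
```

For α = 0.01 this gives VaR ≈ 2.33 and ES ≈ 2.67, matching the ordering shown in Figure 3.1.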
The key advantage of ES with respect to VaR is that it takes into account the
magnitude of the loss beyond the VaR threshold and therefore proves to be a more
precise measure of the actual exposure to market risk. Moreover, from a mathemati-
cal standpoint, ES fulfils all the properties of a coherent risk measure (see Definition
3.1.1) as shown by Artzner et al. [3]. Also, Dowd [22] defines ES as “the most
attractive coherent risk measure”.
Nonetheless, the most troublesome drawback of ES arises from its lack of the
elicitability property (see Definition 3.1.8). Specifically, some authors such as
Gneiting [26] and Carver [10] mention that ES is not backtestable due to the fact
that it is not elicitable. Nevertheless, Acerbi and Szekely [1], Kerkhof et al. [29],
and other authors have done substantial work in developing non-parametric
standardized backtesting procedures to test ES.
In summary, Table 3.1 shows the properties that hold for VaR and ES.
Property                     VaR    ES
Coherence                    No     Yes
Robustness (Wasserstein)     Yes    Yes
Elicitability                Yes    No
Conditional Elicitability    Yes    Yes
Table 3.1: Summary of Properties for VaR and ES³
³ Table taken from Emmer et al. [24]
Chapter 4
Theoretical Background for
Backtesting VaR and ES
The concept of backtesting risk measures is crucial for the validation of risk
models. Backtesting is essential for employers and risk managers who need to
assess whether risk measures are well calibrated [22]. Backtesting consists of
statistical and quantitative tests that verify whether a certain risk measure (in this
case VaR and ES) is consistent with the assumptions of the model.
In this chapter the statistical background of hypothesis testing is introduced and
a variety of backtesting procedures from the academic literature is exhibited for
VaR and ES.
Chapter 4. Theoretical Background for Backtesting VaR and ES 40
4.1 Statistical Background
When backtesting risk measures, the procedure of hypothesis testing is crucial to
assess their performance.
A hypothesis test normally defines two types of hypotheses: a null hypothesis
(H0) and an alternative hypothesis (Ha). Usually, the objective of hypothesis
testing is to verify whether the null hypothesis can be rejected.
Decision           H0 True              H0 False
Reject H0          Type I error (α)     correct decision
Not Reject H0      correct decision     Type II error (β)
Table 4.1: Hypothesis Testing Summary Table
Table 4.1 describes the different cases when testing the null hypothesis. The most
serious error that can be made is rejecting the null hypothesis when in fact it is
true. This is called a Type I error, and its probability is the significance level (α).
Another name found in the literature is “false positive.” Under normal circumstances,
this error level is set at the beginning of the test, with values normally ranging from
0.01 to 0.05. The probabilistic interpretation of α is the chance of rejecting the null
hypothesis when it is actually true. Moreover, another possible error when performing
a hypothesis test is the so-called Type II error, or β. Specifically, β corresponds
to the probability of accepting the null hypothesis when it is actually false. In
the statistical literature, this is called a “false negative.” Finally, in the statistical
literature, the quantity 1 − β is called the “power” of the test; that is, the probability
of rejecting the null hypothesis when it is indeed false. Hence, it is desirable to obtain
the highest possible power when performing hypothesis testing [11].
As the significance level α decreases, it becomes harder to reject the null hypothesis,
and therefore the probability that the “true” model is rejected decreases (Type I
error). Nevertheless, this implies that it becomes more probable to incorrectly
accept a “false” model (Type II error) [22].
When performing hypothesis testing, a test statistic is crucial. A test statistic is
a function of a given data sample that is used to judge whether the null hypothesis
is true or not. Specifically, it is compared to a critical value determined by α in
order to check whether the null hypothesis can be rejected.
The usual way of evaluating the test statistic is via the p-value. The p-value (ρ)
corresponds to the probability of observing a value at least as extreme as the realized
test statistic under the null hypothesis. Hence, if ρ ≤ α the null hypothesis is
rejected, and if ρ > α then it is not possible to reject the null hypothesis.
In order to clarify the aforementioned definitions, an illustrative example is shown.
Let us suppose that there is a sample of a certain random variable X.¹ The null
hypothesis is that these data points are normally distributed with mean equal to
0 (µ = 0) and a certain known standard deviation (σ). Therefore, H0 corresponds to
X ∼ N(0, σ²) and for Ha it can be said that X ∼ N(φ, σ²) where φ > 0.
Now the following test statistic is taken in order to verify whether µ = 0
Z = X̄ / (σ/√n)   (4.1)
where n denotes the sample size, X̄ the sample mean and σ the given standard
deviation. This is known as the “z-test.”
Now considering the case where X̄ = 3, n = 10 and σ = 4, then Z ≈ 2.37. There-
fore, by setting the significance to α = 0.01, the p-value is
ρ = P(Z ≥ 2.37) = 1 − 0.9911 ≈ 0.0089   (4.2)
¹ Samples of the random variable X are assumed to be identically distributed and independent.
Hence ρ < α: under the null hypothesis it is highly unlikely to observe Z ≥ 2.37,
and therefore the null hypothesis can be rejected.
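The worked example can be sketched as follows; note that σ = 4 is used here, which is the value consistent with a statistic of roughly 2.4 given X̄ = 3 and n = 10:

```python
import math
from statistics import NormalDist

def one_sided_z_test(sample_mean, sigma, n, alpha=0.01):
    """Z-test of H0: mu = 0 against Ha: mu > 0 with known sigma.
    Returns the statistic, the p-value and the rejection decision."""
    z = sample_mean / (sigma / math.sqrt(n))
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value, p_value <= alpha

z, p, reject = one_sided_z_test(sample_mean=3, sigma=4, n=10, alpha=0.01)
```

The decision rule is exactly the ρ ≤ α comparison described above, applied to the one-sided upper tail.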
4.2 Backtesting Value at Risk
The following section depicts different popular backtesting models used for VaR.
Specifically, the regulatory framework as explained by Campbell [9] is exposed as
the principal method used by the BCBS. Moreover, the unconditional coverage tests
proposed by Li [33], Jorion [28], Kupiec [31] and Christoffersen [12] are shown. As a
complement to the unconditional coverage tests, two independence tests are intro-
duced: Christoffersen’s Markov test [12] and Christoffersen and Pelletier’s duration
test [13]. Finally, the conditional coverage tests, which cover both the independence
and the unconditional coverage properties, proposed by Christoffersen [12] and
Christoffersen and Pelletier [13], are described (for more backtesting tests refer to
Nieppola [41] and Campbell [9]).
4.2.1 Regulatory Framework
The regulatory guidelines require banks to calculate the capital needed to be
set aside in order to cover non-conventional losses. The amount that should be
reserved in the regulatory nomenclature is denoted as the market risk capital (MRC).
The MRC is a function of the internal VaR that the financial institution calculates.
Specifically, the MRC takes the highest of the following two factors. First, the
traditional 1% VaR calculated over a 10 day horizon. Second, the 60 day average
of the previous reported 1% VaR adjusted by a factor (st). In a mathematical
perspective, it is defined with the following formula
MRC_t = max( VaR_t(0.01), s_t · (1/60) Σ_{i=0}^{59} VaR_{t−i}(0.01) ) + c_t   (4.3)
where ct corresponds to the credit risk associated with the bank’s portfolio. Moreover
st is a multiplication factor determined by the times of VaR violations in the previous
250 trading days. Or more specifically
s_t =
  3                  if N ≤ 4
  3 + 0.2(N − 4)     if 5 ≤ N ≤ 9
  4                  if N ≥ 10
(4.4)
where N denotes the number of violations exceeding VaR.
Put into words, when the factor s_t increases (more violations of VaR in the 250
testing days), the second argument of the maximum in Equation 4.3 grows and
therefore the MRC increases. This is logical, as more violations of the VaR point out
that the current VaR calculation model may not be accurate and that the MRC level
should therefore be adjusted upwards. Campbell [9] calls this technique the
“traffic light” approach as the multiplicative factor st is divided into three different
sets: the “green” light, which logically means the least amount of VaR violations,
the “amber” light which accounts for a higher amount of VaR violations and the
“red” light which is the maximum value taken by st.
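Equation 4.4 translates directly into code; the zone names in the comments follow the “traffic light” terminology described above:

```python
def basel_multiplier(n_violations):
    """Basel 'traffic light' multiplication factor s_t, driven by the
    number of 1% VaR violations over the previous 250 trading days."""
    if n_violations <= 4:                      # green zone
        return 3.0
    if n_violations <= 9:                      # amber zone
        return 3.0 + 0.2 * (n_violations - 4)
    return 4.0                                 # red zone
```

The multiplier then scales the averaged VaR term inside the maximum of Equation 4.3.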
4.2.2 Statistical Framework
In this section, the statistical terms common to the various backtesting procedures
are introduced.
A key term in VaR backtesting is the “hit” function which counts how many
times the profit and loss realizations during a certain time exceed the VaR estimate.
Put into a mathematical context
I_{t+1}(α) =
  1 if X_{t,t+1} ≤ −VaR_t(α)
  0 if X_{t,t+1} > −VaR_t(α)
(4.5)
where Xt,t+1 denotes the profit and loss over the period (t, t + 1). In his work from
1998, Christoffersen [12] mentions that the VaR accuracy can be determined by
inspecting whether the “hit” sequence fulfils the following properties.
• Unconditional Coverage Property. The property of unconditional cover-
age requires that the probability that a realized loss exceeds the VaR estimate
should be α · 100%. In other words, P(I_{t+1} = 1) = α. As an illustrative
example, let α = 0.05. In this case, 5 VaR violations would be expected for
every 100 realized returns if the VaR estimate is accurate. However, if there
were more VaR violations, then the VaR estimate may underestimate the “real”
risk. On the other hand, if there were fewer than 5 VaR violations, then the
VaR estimate may overestimate the “real” risk.
• Independence Property. This property analyses how VaR violations occur.
Specifically, the independence property states that two arbitrary elements of
the “hit” sequence have to be strictly independent of each other. In other
words, the prior history of the “hit” sequence should not convey any kind of
information on whether the future “hit” sequence occurs. As an illustrative
example, if there is clustering in the data, it may be expected that the “hit”
sequence clusters on that same period. In this case, the evidence may suggest
that the times of the “hit” sequence are not independent.
• Conditional Coverage Property. This property is mainly a joint test that
considers the unconditional coverage property as well as the independence prop-
erty simultaneously. Campbell [9] synthesizes this property with the following
statement
It(α) ∼ B(α) i.i.d. (4.6)
where B(α) denotes the Bernoulli distribution with probability α.
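The “hit” function of Equation 4.5, on which all three properties are formulated, can be sketched as follows (the P&L and VaR series are hypothetical):

```python
def hit_sequence(pnl, var_estimates):
    """Equation 4.5: I_t = 1 when realized P&L falls at or below -VaR_t."""
    return [1 if x <= -v else 0 for x, v in zip(pnl, var_estimates)]

# Hypothetical P&L series against a constant VaR estimate of 2.0
pnl = [0.5, -1.2, 0.3, -2.5, 0.1]
hits = hit_sequence(pnl, [2.0] * 5)
```

Only the day with a 2.5 loss breaches the 2.0 VaR, so the sequence contains a single hit; the tests in the following sections all operate on sequences of this form.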
4.2.3 Unconditional Coverage Tests
In this section the unconditional coverage tests proposed by Li [33], Jorion [28],
Kupiec [31] and Christoffersen [12] are discussed.
4.2.3.1 Violation Ratio
The following test taken from Li [33] is composed of the following test statistic.
ζ = ( Σ_{t=1}^T I_t(α) ) / (T · α)   (4.7)
Put into words, if the VaR estimate were accurate, then the numerator of Equation
4.7 would be close to the denominator: the number of realized “hits” would be
similar to the theoretically expected number of VaR violations. The rule of thumb
used to verify the result is that 0.8 < ζ < 1.2.
4.2.3.2 Failure Test
This test, exposed by Jorion [28], records the failure rate, which is calculated as
the proportion of time in which VaR violations occur. Let N be the number of
exceptions and T the total number of days analysed. Hence, N/T denotes the
failure rate. Ideally, α̂ = N/T should be an unbiased estimator of α, the violation
probability implied by the confidence level of the test. The set-up for this test is
exactly the testing framework of Bernoulli trials. That is, under the null hypothesis
the number of exceptions N follows the binomial distribution
P(N = k) = C(n, k) α^k (1 − α)^{n−k}   (4.8)
where the mean and variance are nα and nα(1 − α) respectively.
In the case where T is large enough, then using the Central Limit Theorem (CLT)
the binomial distribution can be approximated by the normal distribution
m = (N − αn) / √(α(1 − α)n)  →_d  N(0, 1)   (4.9)
Consequently, m is approximately normally distributed, so the critical values can be
obtained directly. For example, if the test is defined at the 95% level (α = 0.05),
the corresponding critical value is 1.96.
4.2.3.3 Proportion of Failures (POF)
The POF test, proposed by Kupiec [31], is based on the following test statistic
(using the notation in [9])
POF = 2 log( (α̂/α)^{I(α)} ((1 − α̂)/(1 − α))^{T−I(α)} )
I(α) = Σ_{t=1}^T I_t(α)
α̂ = I(α)/T
(4.10)
where T denotes the number of total observations, and It(α) is the “hit” sequence.
By simple inspection of Equation 4.10 it can be seen that, if the empirical prob-
ability of VaR violations (α̂) is exactly the same as α, then the POF test statistic
collapses to the value of zero. Conversely, when the empirical probability of VaR
violations differs from the expected violation rate (α), the POF test statistic may
indicate that the VaR overestimates or underestimates the actual underlying risk.
As an example, for one trading year (i.e. T = 255) with α = 0.03, one would expect
on average 7.65 VaR violations. If the actual number of VaR violations were 12,
then α̂ ≈ 0.047 and the POF statistic would be approximately 2.18.
A normalized version of the POF test can be expressed in the following way (using
the notation of [9]).
z = √T (α̂ − α) / √(α(1 − α))   (4.11)
As the test statistic z is approximately normally distributed, the hypothesis
testing procedure may be undertaken in the traditional way. In other words, the
suitable critical point of a normal distribution would be compared to the realized
test statistic in order to determine the acceptance or rejection of the null hypothesis.
An advantage of this approach is that the statistic z remains well defined even when
there are no VaR violations at all. This fixes an anomaly of the POF statistic in
Equation 4.10, whose likelihood-ratio form runs into log(0) at the boundary [9].
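A sketch of the POF statistic using the worked numbers above (12 violations in 255 days at α = 0.03); as a likelihood-ratio statistic it is non-negative, and these numbers give roughly 2.18:

```python
import math

def kupiec_pof(n_hits, n_obs, alpha):
    """Kupiec POF likelihood-ratio statistic (Equation 4.10); under H0 it is
    asymptotically chi-squared with one degree of freedom."""
    if n_hits == 0 or n_hits == n_obs:
        return float("nan")  # boundary cases hit the log(0) anomaly
    a_hat = n_hits / n_obs
    return 2.0 * (n_hits * math.log(a_hat / alpha)
                  + (n_obs - n_hits) * math.log((1 - a_hat) / (1 - alpha)))

# 12 violations in 255 days against an expected rate of 3 percent
pof = kupiec_pof(12, 255, alpha=0.03)  # roughly 2.18, below the 3.84 cutoff
```

Since 2.18 is below the 5% critical value of a chi-squared distribution with one degree of freedom (3.84), this hypothetical violation count would not be rejected.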
4.2.3.4 Christoffersen’s Unconditional Coverage Test
Christoffersen [12] proposes the following test statistic in order to test the uncon-
ditional coverage property
CUCT = 2 log( α̂^{I(α)} (1 − α̂)^{T−I(α)} ) − 2 log( α^{I(α)} (1 − α)^{T−I(α)} )
I(α) = Σ_{t=1}^T I_t(α)
α̂ = I(α)/T
(4.12)
As T → ∞,
CUCT →_d χ²(1)   (4.13)
For example, if α = 0.05 then the test statistic CUCT would be compared against
the corresponding critical value of a χ² distribution with one degree of freedom.
In spite of the unconditional coverage tests’ simplicity and popularity, they are
haunted by an important pitfall: there is no analysis of whether the VaR violations
occur in a specific fashion (i.e. whether they “cluster” in certain periods or occur in
pairs). As a consequence, a complementary test needs to examine the independence
between VaR violations.
4.2.4 Independence Tests
In this section the independence tests, namely Christoffersen’s Markov test [12] and
Christoffersen and Pelletier’s duration test [13], are introduced.
4.2.4.1 Christoffersen’s Independence Test (Markov Test)
The Markov Test inspects the independence property by implementing the fol-
lowing 2 × 2 contingency table
                  I_t(α) = 0    I_t(α) = 1
I_{t+1}(α) = 0    T1            T3            T1 + T3
I_{t+1}(α) = 1    T2            T4            T2 + T4
                  T1 + T2       T3 + T4       T
Table 4.2: Contingency Table for Christoffersen’s Markov Test 2
where It(α) corresponds to the “hit” sequence as defined in Section 4.2.2. More-
over, T1 and T2 represent the non-violation and violation of the VaR at time t + 1
given that there was no violation in the prior time step, respectively. Conversely, T3
and T4 represent the non-violation and violation of the VaR at time t + 1 given that
there was a violation in the prior time step, respectively.
Ideally if the process It+1(α) is independent, then the following should hold
T2 / (T1 + T2) = T4 / (T3 + T4)   (4.14)
In other words, the proportion of VaR violations given that there was no violation in
the previous time step should be the same as the proportion of VaR violations given
that there was a VaR violation in the previous period. Therefore, the fact that there
was or was not a violation in the previous time step does not provide any kind of
² Table taken from [13]
information to whether there is a VaR violation in the current time step and hence
the independence property holds.
The test statistic is defined as follows
CIT = −2 ln[ (1 − π)^{T1+T3} π^{T2+T4} / ( (1 − π0)^{T1} π0^{T2} (1 − π1)^{T3} π1^{T4} ) ]   (4.15)
where
π0 = T2 / (T1 + T2),   π1 = T4 / (T3 + T4),   π = (T2 + T4) / (T1 + T2 + T3 + T4)   (4.16)
and CIT is asymptotically distributed as a χ² distribution with one degree of freedom.
For example, if α = 0.05 then the test statistic CIT would be compared to the critical
value of a χ² distribution with one degree of freedom.
4.2.4.2 Christoffersen and Pelletier’s Duration Test
Christoffersen and Pelletier [13] proposed in 2004 a different approach in order
to test the independence property in VaR calculations. If VaR violations are
independent of each other, the time elapsed between two VaR violations should
be independent of the time that has elapsed since the last violation. In other words,
Campbell [9] mentions that the time between VaR violations should not present any
type of “duration dependence.”
Despite the sophistication of this approach, it cannot be depicted in a 2 × 2
contingency table as in the Markov test. Therefore, a full statistical model has to
be estimated for the duration between VaR violations. In their work, Christoffersen
and Pelletier [13] propose the exponential distribution as the null distribution of
the duration between VaR violations, as it possesses the memorylessness property.
4.2.5 Conditional Coverage Tests
In order to have a reliable VaR measure, the independence as well as the un-
conditional coverage property need to be fulfilled. The following section covers the
conditional coverage tests proposed by Christoffersen [12] and Christoffersen and
Pelletier [13].
4.2.5.1 Joint Markov Test
The joint Markov test is based on the duration test by Christoffersen and Pel-
letier [13] used in Section 4.2.4.2 combined with the Markov test implemented by
Christoffersen [12]. Invoking Table 4.2, the joint Markov test proposes the following
equality in case the unconditional coverage and the independence property hold
T2 / (T1 + T2) = T4 / (T3 + T4) = α   (4.17)
where α is the confidence level for the test. Specifically, the LHS of the equality corre-
sponds to the independence property and the RHS corresponds to the unconditional
coverage property.
4.2.5.2 Christoffersen’s Conditional Coverage Joint Test
Christoffersen’s conditional coverage test is simply the aggregation of the uncondi-
tional coverage test and the independence (Markov) test, both from [12].
The test statistics of the two tests are added up to create a new test statistic CCCT
CCCT = CIT + CUCT   (4.18)
where CCCT is asymptotically distributed as a χ² with two degrees of freedom.
Therefore, the value of the new test statistic is compared to the corresponding
critical value of a χ² distribution with 2 degrees of freedom.
4.3 Backtesting Expected Shortfall
In the case of backtesting ES, the procedure is not as direct as VaR backtesting,
according to Wimmerstedt [53] and Acerbi and Szekely [1]. Some authors attribute
this difficulty to the fact that ES does not fulfil the property of elicitability (see
Definition 3.1.8) [26].
In this thesis, the method employed by Emmer et al. [24] and the various tests
implemented by Acerbi and Szekely [1] are reviewed (for more backtesting methods
refer to Clift et al. [14]).
4.3.1 Quantile Approximation
The following approach is based on a research paper by Acerbi and Tasche [2].
This method is recognized by its simplicity as it is far less complex than the other
approaches used to backtest ES. As a first step, ES is represented in terms of VaR.
ES_t(α) = (1/(1 − α)) ∫_α^1 VaR_t(k) dk   (4.19)
In the next step, dividing the interval [α, 1] into four subintervals of equal length
∆k = (1 − α)/4, the following is obtained:
k0 = [α, α + (1 − α)/4],   k1 = [α + (1 − α)/4, α + (1 − α)/2],
k2 = [α + (1 − α)/2, α + (3/4)(1 − α)],   k3 = [α + (3/4)(1 − α), 1]
As a next step, approximating the integral in Equation 4.19 by a left Riemann sum,
the following holds:
ES_t(α) ≈ (1/(1 − α)) Σ_{i=1}^{4} VaR_t(k_{i−1}) ∆k   (4.20)
where k_{i−1} denotes the left endpoint of the i-th subinterval.
Finally, by simplifying the expression, the desired result is obtained:
ES_t(α) ≈ (1/4) [VaR_t(α) + VaR_t(0.75α + 0.25) + VaR_t(0.5α + 0.5) + VaR_t(0.25α + 0.75)]   (4.21)
For example, when α = 0.01 the following holds
ES_t(0.01) ≈ (1/4) [VaR_t(0.01) + VaR_t(0.2575) + VaR_t(0.505) + VaR_t(0.7525)]   (4.22)
where the VaR_t(·) terms correspond to the backtested VaR estimates. Therefore, the
various VaR estimates need to be backtested in order to determine whether the ES
passes the backtesting procedure.
A remarkable advantage of this method is that it does not rely on Monte Carlo
simulations [1]. However, due to the fact that this method is based on a linear
approximation of the ES, it may sometimes be difficult to assess how many supporting
points suffice in order to ensure the reliability of the backtesting procedure.
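A sketch of the four-point approximation in Equation 4.21, under the assumption that VaR_t(k) denotes the k-quantile of the loss distribution (the confidence-level convention, so α near 1 addresses the loss tail); the standard normal case allows comparison with the exact ES:

```python
import math
from statistics import NormalDist

def quantile_approx_es(loss_quantile, alpha):
    """Four-point approximation of ES from VaR levels (Equation 4.21):
    ES(alpha) ~ average of the quantiles at the four supporting points."""
    levels = [alpha,
              0.75 * alpha + 0.25,
              0.50 * alpha + 0.50,
              0.25 * alpha + 0.75]
    return sum(loss_quantile(u) for u in levels) / 4.0

# Standard normal loss distribution, confidence-level convention alpha = 0.99
nd = NormalDist()
es_approx = quantile_approx_es(nd.inv_cdf, alpha=0.99)
z = nd.inv_cdf(0.99)
es_exact = math.exp(-z * z / 2) / (math.sqrt(2 * math.pi) * 0.01)
```

Because the quantile function is increasing, the left Riemann sum (about 2.54 here) slightly underestimates the exact ES (about 2.67), which is exactly the supporting-point reliability issue mentioned above.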
4.3.2 Acerbi and Szekely Test
The following collection of non-parametric tests proposed by Acerbi and Szekely
[1] is implemented using Monte Carlo simulations. As the test statistics do not have
predefined distributions, simulations need to be run in order to obtain reliable
empirical distributions.
In this case, the null hypothesis states that the predicted model perfectly fits
the realized model; in that case the estimate of ES passes the backtest.
Finally, it is worth noting that these are one-sided tests. In other words, the null
hypothesis is rejected only if the risk measure underestimates the actual risk. Hence,
the null hypothesis may be accepted even with a risk measure that overestimates
the actual risk.
4.3.2.1 Test I
Invoking Equation 3.5, the following holds
$$\mathrm{ES}_t(\alpha) = -\mathbb{E}\left[X_t \mid X_t + \mathrm{VaR}_t(\alpha) < 0\right] \qquad (4.23)$$
where (X_t)_{t=1}^T corresponds to the series of returns. Rewriting Equation 4.23, the following equation is obtained
$$\mathbb{E}\left[\frac{X_t}{\mathrm{ES}_t(\alpha)} + 1 \,\middle|\, X_t + \mathrm{VaR}_t(\alpha) < 0\right] = 0 \qquad (4.24)$$
Using the definition of the “hit” function I_t(α) from Section 4.2.2, and denoting by T the number of observations and by N_T the number of VaR violations, the following test statistic is defined
$$Z_1(X) = \sum_{t=1}^{T}\frac{X_t I_t}{N_T\,|\mathrm{ES}_t(\alpha)|} + 1 \qquad (4.25)$$
In the next step, the hypothesis testing is implemented by defining the following
$$H_0: P_t^{\alpha} = F_t^{\alpha}, \quad \forall t \qquad (4.26)$$
where P_t^α corresponds to the conditional tail distribution of P_t, which is the predicted distribution of returns (known). Moreover, F_t corresponds to the realized distribution of returns (unknown) and F_t^α denotes its conditional tail distribution.³
For the alternative hypothesis the following holds
$$H_a: \mathrm{ES}_t^{F}(\alpha) \geq \mathrm{ES}_t(\alpha)\;\;\forall t, \qquad \mathrm{VaR}_t^{F}(\alpha) = \mathrm{VaR}_t(\alpha)\;\;\forall t \qquad (4.27)$$
where ES_t^F(α) and VaR_t^F(α) denote the estimated ES and VaR from the realized returns.
Put into words, under the alternative hypothesis the ES is underestimated by the model, while the VaR estimate is not rejected. Therefore, this test is only exposed to the magnitude of the VaR violations and is independent of their frequency [1]. Furthermore,
$$\mathbb{E}_{H_0}[Z_1] = 0 \quad \text{and} \quad \mathbb{E}_{H_1}[Z_1] < 0 \qquad (4.28)$$
Put into words, if the mean of the test statistic Z_1 is 0, then the ES passes the backtest. However, if the mean is significantly below zero, there is enough evidence to show that the ES could be underestimated.

³ Acerbi and Szekely assume that the functions F_t and P_t are continuous.
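The statistic Z_1 can be sketched numerically. The following is a minimal Python illustration (not the authors' code; the thesis uses Matlab), under the assumption that VaR and ES are supplied as positive loss thresholds, so that a violation occurs when X_t + VaR(α) < 0, i.e. X_t < −VaR(α).

```python
import numpy as np

# Minimal sketch of the Acerbi-Szekely Z1 statistic (Equation 4.25).
# Assumption: var and es are positive loss thresholds.
def z1_statistic(returns, var, es):
    hits = returns < -var          # I_t(alpha): VaR violation indicator
    n_t = hits.sum()               # N_T, number of VaR violations
    if n_t == 0:
        return np.nan              # statistic is undefined without violations
    return returns[hits].sum() / (n_t * abs(es)) + 1.0
```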
4.3.2.2 Test II
The second test is based on the unconditional representation of ES, as shown in the following formula
$$\mathrm{ES}_t(\alpha) = -\mathbb{E}\left[\frac{X_t I_t(\alpha)}{\alpha}\right] \qquad (4.29)$$
where (X_t)_{t=1}^T corresponds to the series of returns. Furthermore, I_t(α) corresponds to the “hit” function from Section 4.2.2. After rearranging, the following test statistic holds
$$Z_2(X) = \sum_{t=1}^{T}\frac{X_t I_t}{T\alpha\,|\mathrm{ES}_t(\alpha)|} + 1 \qquad (4.30)$$
As a next step, in order to implement the hypothesis testing the following is defined
$$H_0: P_t^{\alpha} = F_t^{\alpha}, \quad \forall t \qquad (4.31)$$
where P_t^α corresponds to the conditional tail distribution of P_t, which is the predicted distribution of returns (known). Moreover, F_t corresponds to the realized distribution of returns (unknown) and F_t^α denotes its conditional tail distribution.⁴ Put into words, the null hypothesis H_0 describes that the predicted model perfectly fits the realized model. Therefore, the estimate of ES passes the backtest.

⁴ Acerbi and Szekely assume that the functions F_t and P_t are continuous.
For the alternative hypothesis the following holds
$$H_a: \mathrm{ES}_t^{F}(\alpha) \geq \mathrm{ES}_t(\alpha)\;\;\forall t, \qquad \mathrm{VaR}_t^{F}(\alpha) \geq \mathrm{VaR}_t(\alpha)\;\;\forall t \qquad (4.32)$$
where ES_t^F(α) and VaR_t^F(α) denote the estimated ES and VaR from the realized returns. Put into words, the ES is underestimated by the model compared to the realized model. Moreover, the alternative hypothesis rejects ES and VaR jointly. Therefore, this test is affected by both the magnitude and the frequency of the VaR violations.
Additionally,
$$\mathbb{E}_{H_0}[Z_2] = 0 \quad \text{and} \quad \mathbb{E}_{H_1}[Z_2] < 0 \qquad (4.33)$$
Finally, Acerbi and Szekely [1] propose the following relationship between the two test statistics
$$Z_2 = 1 - (1 - Z_1)\,\frac{\sum_{t=1}^{T} I_t(\alpha)}{T\alpha} \qquad (4.34)$$
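As with Z_1, the statistic Z_2 can be sketched numerically. The following minimal Python illustration (not the authors' code; the thesis uses Matlab) assumes VaR and ES are supplied as positive loss thresholds, so a violation occurs when X_t < −VaR(α).

```python
import numpy as np

# Minimal sketch of the Acerbi-Szekely Z2 statistic (Equation 4.30).
# Assumption: var and es are positive loss thresholds.
def z2_statistic(returns, var, es, alpha):
    hits = returns < -var                 # I_t(alpha): VaR violation indicator
    t = len(returns)
    return returns[hits].sum() / (t * alpha * abs(es)) + 1.0
```

For any sample, Z_2 also coincides with 1 − (1 − Z_1)·Σ I_t(α)/(Tα) from Equation 4.34, which can serve as a sanity check of an implementation.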
4.3.2.3 Test III
The following approach is based on Berkowitz [7]. The test analyses whether the observed ranks U_t = P_t(X_t) are i.i.d. U(0, 1). Ideally, P_t(X_t) ∼ U(0, 1).
Acerbi and Szekely [1] use the following definition of ES
$$\mathrm{ES}^{N}_{t}(\alpha) = \mathrm{ES}^{N}_{t,\alpha}(Y) = -\frac{1}{\lfloor N\alpha\rfloor}\sum_{i=1}^{\lfloor N\alpha\rfloor} Y_{i} \qquad (4.35)$$
where N is the number of returns and Y corresponds to the ordered returns. Additionally, the operator ⌊·⌋ corresponds to the floor (lowest integer) operator. In other words, Equation 4.35 corresponds to the average of the ⌊Nα⌋ lowest returns, where Nα is the expected number of exceptions in a sample of size N. Hence, the following test statistic is proposed
$$Z_3 = -\frac{1}{T}\sum_{t=1}^{T}\frac{\mathrm{ES}^{T}_{t,\alpha}\!\left(P_t^{-1}(U)\right)}{\mathbb{E}_V\!\left[\mathrm{ES}^{T}_{t,\alpha}\!\left(P_t^{-1}(V)\right)\right]} + 1 \qquad (4.36)$$
As already stated in Sections 4.3.2.1 and 4.3.2.2, the following holds
$$\mathbb{E}_{H_0}[Z_3] = 0 \quad \text{and} \quad \mathbb{E}_{H_1}[Z_3] < 0 \qquad (4.37)$$
For this case the null hypothesis
$$H_0: P_t = F_t \quad \forall t \qquad (4.38)$$
is tested against the alternative hypothesis
$$H_1: P_t \succeq F_t \quad \forall t \qquad (4.39)$$
where ⪰ stands for weak stochastic dominance.
Chapter 5
Backtesting VaR and ES with the
Generated Data
In this chapter a methodology is proposed in order to analyse the backtesting procedures for VaR and ES using data enriched with the properties of volatility clustering and fat tails. The analysis and results are presented with the potential advantages and drawbacks of dealing with these stylized facts in mind.
Methodology
The methodology is exposed below and is structured as follows.
1. Data generation. As a first step, data is generated based on the GARCH(1,1)
model and Student’s t distribution. Moreover, the generated dataset is divided
in “in-sample” and “out-of-sample”. Specifically, the “in-sample” subset is
used to estimate the VaR and ES and the “out-of-sample” subset is used for
the backtesting procedures.
2. Computation of risk measures. In this step, VaR and ES are estimated based on the “in-sample” subset. Moreover, the use of extra simulations guarantees the robustness of the estimations.
3. Backtesting of risk measures using selected tests. In this step, the
performance of the estimates of VaR and ES calculated in the previous step is
analysed using the “out-of-sample” subset. Specifically, a selection of the tests
exposed in Sections 4.2 and 4.3 are implemented.
4. Analysis of results. As a final step, an analysis is carried out to assess the
statistical significance and viability of the estimations.
5.1 Data Generation
In this section, the stock data of JPMM is used to estimate the parameters for the GARCH(1,1) model and the Student’s t distribution.
Furthermore, the “in-sample” and “out-of-sample” subsets are generated. Specifically, for the “in-sample” set, a total of 10,000 paths, each composed of 7,000 simulations, is calculated to provide a robust workspace for the estimation of VaR and ES in the next section. The “out-of-sample” set is constituted by one path of 10,000 simulations.
It is worth noting that the “out-of-sample” data is independent with respect
to the “in-sample” set. This ensures the independence of the estimation and the
validation of the model.
5.1.1 Volatility Clustering
As already mentioned in Section 2.1.1, the GARCH(1,1) model is used to capture
the volatility clustering effect property that is observed in financial time series.
First, using the daily prices of the stock JPMM from the year 1983 (start of the available financial time series) to 2015¹, the continuously compounded daily returns are calculated as depicted in Figure 5.1.
Figure 5.1: Daily Returns of JPMM
Second, before fitting the GARCH(1,1) process to the JPMM compounded daily
returns, according to Tsay [49], significant autocorrelations need to be eliminated
from the data. In other words, it needs to be tested whether there exist autocorrela-
tions in the JPMM returns. In order to address this, the Ljung-Box test is undertaken
and the graphs of the autocorrelation and partial autocorrelation are inspected (for more information about this test refer to Ljung and Box [35]).

¹ Prices obtained from www.yahoofinance.com.
Moreover, another condition to ensure the suitability of the GARCH(1,1) model
is based on testing whether the residuals are serially correlated as stated by Da
Rocha [19]. Specifically, there needs to be evidence of an outstanding ARCH effect.
This property is proved using Engle’s test (for more information refer to Engle [25]).
Figure 5.2 depicts the sample autocorrelation function as well as the sample partial autocorrelation function. As can be seen, the autocorrelations of the residuals are not significantly different from zero at the significance level α = 0.05. This is corroborated by the Ljung-Box test in Table 5.1. As the p-value of the Ljung-Box test is bigger than the significance level α = 0.05, there is not enough evidence to reject the null hypothesis that the residuals are not serially autocorrelated. Moreover, in the same table, the Engle ARCH effect test presents a p-value of 0 and therefore shows that there exists an ARCH effect in the data. In summary, no further statistical treatment of the JPMM returns series is needed, as it already lacks autocorrelation and possesses the ARCH effect. Given the prior statistical analysis, it is indeed reasonable to use the GARCH(1,1) model to fit the data.
Test statistic Critical value p-value
Ljung-Box test 28.2229 31.4104 0.1042
Engle ARCH effect test 472.3581 3.8415 0
Table 5.1: Statistical Test for the Residuals of the Returns Data Series of JPMM with
α = 0.05
As a next step, the parameters of the GARCH(1,1) model are estimated². Table 5.2 depicts the estimated parameters together with the corresponding standard error.

² The estimation of the parameters is based on the maximum likelihood approach, implemented using the built-in econometrics toolbox in Matlab 2016a.
Figure 5.2: Sample Autocorrelation Function and Sample Partial Autocorrelation Func-
tion
Parameter Estimated value Standard error
ω 0.0321078 0.00404445
β 0.91575 0.0021361
α 0.0830329 0.00164358
Table 5.2: Parameter Estimates, Standard Error and Test Statistic for the Fitted
GARCH(1,1) Model
Figure 5.3 graphs the conditional variance jointly with the returns time series for a selected path. As can be seen from the graph, the volatility clustering effect is present in the simulation of the GARCH(1,1) model³.
Finally, Figure 5.4 graphs the sample correlogram for the conditional variance and
returns respectively for the same path as Figure 5.3. Specifically, for the conditional
variance, there exists a high dependence with respect to the previous variance. That
is expected due to the fact that the GARCH(1,1) model is highly dependent on the
volatility of the previous time step. However, the correlation slowly deteriorates as
the lag between variances increases.
³ The simulations are generated using the Monte Carlo method in the built-in econometrics toolbox in Matlab 2016a.
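The simulation of the fitted model can be sketched as follows. This is a minimal Python illustration (the thesis uses Matlab's built-in toolbox), using the parameter estimates from Table 5.2 and, as a simplifying assumption, standard normal innovations with the recursion started at the unconditional variance.

```python
import numpy as np

# Sketch of simulating the fitted GARCH(1,1) model:
#   r_t = sigma_t * eps_t,  sigma_t^2 = omega + alpha * r_{t-1}^2 + beta * sigma_{t-1}^2
# Parameters from Table 5.2; Gaussian innovations are an assumption of this sketch.
def simulate_garch11(n, omega=0.0321078, alpha=0.0830329, beta=0.91575, seed=0):
    rng = np.random.default_rng(seed)
    r = np.empty(n)
    var = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    for t in range(n):
        r[t] = np.sqrt(var) * rng.standard_normal()
        var = omega + alpha * r[t] ** 2 + beta * var
    return r
```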
Figure 5.3: Simulated Conditional Variance and Returns for the Fitted GARCH(1,1)
Model for a Selected Path
Figure 5.4: Sample Autocorrelations for the Conditional Variance (up) and Returns
(down)
5.1.2 Fat Tails
As already mentioned in Section 2.2 the Student’s t distribution is used to embody
the fat tails property that is observed in financial time series.
First, using the daily compounded returns of the stock JPMM from 1983 to 2015, the parameters of the Student’s t distribution are estimated⁴.
Table 5.3 depicts the estimated values for the fitted Student’s t distribution model.
Figure 5.5 portrays the fitting process of the Student’s t distribution in the ob-
served empirical data. Specifically, the empirical distribution of the JPMM returns
is graphed in conjunction with the fitted Student’s t distribution with the estimated
parameters of Table 5.3.
Parameter Estimated value Standard error
µ 0.0269772 0.0192303
σ 1.40222 0.021351
ν 2.82064 0.104641
Table 5.3: Estimates and Standard Error of the Fitted Student’s t Distribution
Second, Table 5.4 presents the kurtosis of the Student’s t distribution in comparison with that of a normal distribution. This suggests the presence of fat tails in the estimated model. Finally, the “in-sample” data is simulated⁵.
Distribution Kurtosis
Fitted Student’s t distribution 30.4492
Fitted Normal Distribution 3.0797
Table 5.4: Kurtosis of the Simulated Student’s t Distribution
⁴ The parameters are estimated based on the maximum likelihood approach, implemented using the built-in econometrics toolbox in Matlab 2016a.
⁵ The simulations are generated using the Monte Carlo method (inverse CDF approach) in Matlab 2016a.
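The inverse-CDF simulation can be sketched as follows. This is a Python illustration (the thesis uses Matlab), assuming SciPy is available to evaluate the Student's t quantile function, with the location-scale parameters from Table 5.3.

```python
import numpy as np
from scipy import stats

# Sketch of the inverse-CDF (probability integral transform) approach to
# simulate the fitted location-scale Student's t distribution (Table 5.3).
mu, sigma, nu = 0.0269772, 1.40222, 2.82064

def simulate_t_returns(n, seed=0):
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)                    # U(0,1) draws
    return mu + sigma * stats.t.ppf(u, df=nu)  # inverse CDF (quantile) transform
```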
Figure 5.5: Fitted Student’s t Distribution vs. Empirical Distribution
5.2 Computation of Risk Measures
In this section, the estimations of VaR and ES are calculated with “in-sample”
data generated in Section 5.1.
It is worth noting that the significance levels α = 0.05, 0.025, 0.01 are used to compute the risk measures, as these are the ones most used in practice.
5.2.1 Computation of Value at Risk
The approach used for the calculation of VaR is the Monte Carlo method. As
already described in Section 3.2, one of the most important disadvantages of the
Monte Carlo method is that the computational cost is extremely high [41]. However,
this method is really useful for treating complex processes like the GARCH(1,1)
model.
First, for the GARCH(1,1) model, as there is no predefined distribution that
models the process, an iterative simulation approach is implemented in order to ob-
tain a reliable and robust VaR estimate. Specifically, the VaR estimate is calculated
with the following procedure.
1. For every sample path, the 7,000 simulations are sorted from lowest to highest
simulated returns.
2. In order to find the return that corresponds to the VaR, the corresponding index ι_α is computed with the following formula
$$\iota_\alpha = c(N\alpha) \qquad (5.1)$$
where N corresponds to the total number of simulations, in this case 7,000. Moreover, 1 − α stands for the desired confidence level. Furthermore, c(·) denotes the ceiling function, which rounds a given input value up to the nearest integer; therefore ι_α ∈ ℕ.
3. Once ι_α is calculated, it is substituted back into the corresponding sorted vector to obtain the estimated VaR value. In other words,
$$\mathrm{VaR}^{k}_t(\alpha) = \vartheta^{k}(\iota_\alpha) \qquad (5.2)$$
where ϑ^k(·) corresponds to the sorted vector of returns calculated in the first step and k denotes the current simulation path being analysed (k = 1, ..., 10,000) [52].
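The steps above can be sketched as follows. This is a minimal Python illustration (the thesis uses Matlab) of the per-path VaR estimate in Equations 5.1 and 5.2.

```python
import numpy as np

# Sketch of the per-path Monte Carlo VaR estimate (Equations 5.1-5.2):
# sort one path's simulated returns and read off the ceil(N*alpha)-th value.
def path_var(simulated_returns, alpha=0.05):
    sorted_returns = np.sort(simulated_returns)       # lowest to highest
    iota = int(np.ceil(len(sorted_returns) * alpha))  # index iota_alpha = c(N*alpha)
    return sorted_returns[iota - 1]                   # 1-based index into the sorted vector
```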
The simulations in each path of the GARCH(1,1) model are used to calculate a single VaR estimate. By the Law of Large Numbers (LLN) (refer to Durrett [23] for a definition of the Law of Large Numbers), an estimate based on many simulations converges to its mean, in this case the VaR. Moreover, the LLN applies whenever the random variable (in this case the sampling procedure) has bounded variance. In particular, the GARCH(1,1) model has finite variance, as Equation 2.3 holds for the estimated parameters. Figure 5.6 presents the cumulative mean of the VaR estimates for selected values of α. The graph suggests that the cumulative mean stabilizes as more VaR simulations are averaged.
Sinharay [47] proposes running mean plots as a useful way to validate whether the Monte Carlo method converges appropriately. If the running mean plot stabilizes, the algorithm converges. Figure 5.7 depicts the moving average for selected values of α using a window of 1,000 observations, as well as the overall mean. As can be observed, the moving average values remain stationary and close to the overall mean.
Figure 5.6: Cumulative Mean for the VaR of the GARCH(1,1) Model with Selected Values
of α.
Figure 5.7: Moving Average with a Window of 1,000 Observations for the VaR of the
GARCH(1,1) Model with Selected Values of α.
In the case of the Student’s t distribution, the estimation of VaR is less complex. Specifically, as the Student’s t distribution is a predefined probability distribution with specified parameters, the calculation of VaR reduces to finding the corresponding quantile for the given α.
For the sake of completeness, the same simulation procedure is undertaken as in the GARCH(1,1) model in order to verify that the simulations indeed converge to the quantile value determined by the probability distribution. Simulations within each path are generated using the inverse cumulative distribution function (CDF) approach⁶. Furthermore, the convergence test suggested by Sinharay [47] is redundant for the Student’s t distribution, as it is a parametric distribution with a well-defined probability density function.
Figure 5.8 portrays the cumulative mean taken from the Student’s t simulation. As can be observed, the cumulative mean of the simulations converges to the quantile value of the distribution almost immediately.
Finally, Table 5.5 presents the VaR estimates for the GARCH(1,1) model as well as the Student’s t distribution. As can be seen in the table, for every α the magnitude of the VaR value is higher for the GARCH(1,1) model than for the Student’s t distribution. This may be because the implied GARCH(1,1) simulated distribution possesses a higher frequency of extreme values due to the volatility clustering effect. Moreover, for the Student’s t distribution it could be the case that, although extreme values are theoretically present, their frequency is not high enough to affect the VaR estimate.
⁶ For more information regarding this method to generate random numbers with the desired distribution, refer to a standard statistics book, for example, Dowd [22]. Additionally, the uniform random numbers needed are generated with the rand() function in Matlab 2016a.
Figure 5.8: Cumulative Mean for the VaR of the Student’s t Distribution with Selected
Values of α in Conjunction with the Quantile Values Derived from the Distribution
-VaR
GARCH(1,1) α = 0.05 -5.8043
GARCH(1,1) α = 0.025 -8.2585
GARCH(1,1) α = 0.01 -12.2022
Student’s t distribution α = 0.05 -3.3603
Student’s t distribution α = 0.025 -4.6004
Student’s t distribution α = 0.01 -6.6754
Table 5.5: Estimates of VaR for the GARCH(1,1) Model and the Student’s t Distribution
with the Correspondent α.
5.2.2 Computation of Expected Shortfall
In order to calculate the Expected Shortfall, Equation 3.5 is used. The average
value is taken over all losses that exceed the VaR value calculated in Section 5.2.1.
This method is applied to both the GARCH(1,1) model as well as the Student’s t
distribution. In order to find a reliable estimate of ES, for the GARCH(1,1) model
and the Student’s t distribution, the arithmetic mean is proposed. Specifically, the
mean is calculated as the average of the ES values for the different paths.
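This computation can be sketched as follows. The snippet is a minimal Python illustration (the thesis uses Matlab) in which the per-path ES is the average of the simulated returns at or below the VaR threshold, with the threshold expressed as a negative return level as in Table 5.5.

```python
import numpy as np

# Sketch of the per-path ES estimate (Equation 3.5): average of all
# simulated returns in a path that fall at or below the VaR threshold.
def path_es(simulated_returns, var_threshold):
    tail = simulated_returns[simulated_returns <= var_threshold]
    return tail.mean()
```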
Firstly, Figure 5.9 describes the cumulative average of the GARCH(1,1) ES es-
timate. As it can be seen on the graph, the cumulative mean stabilizes after some
iterations. As already mentioned in Section 5.2.1, Figure 5.10 serves as a visual test
to assess if a convergence is reached for the value of the ES. In other words, a moving
average with a time window of 1,000 observations is implemented to test whether
the moving averages vary with respect to each other. After inspecting the graph, the
moving average indeed remains stationary and it is close to the overall mean.
Secondly, the ES estimate for the Student’s t distribution is computed and analysed. Because the Student’s t distribution is a parametric probability distribution, the convergence of the simulations to the “real value” is expected to be reached rapidly. For the sake of completeness, Figure 5.11 presents the cumulative mean for the ES value of the Student’s t distribution. As expected, the convergence occurs quickly.
Table 5.6 shows that the ES values obtained for the GARCH(1,1) model are
greater in magnitude than the ones calculated for the Student’s t distribution.
It can be noted that the “in-sample” calculations for VaR and ES are stationary
and thus do not take into account the “out-of-sample” innovations. This is important
as risk measures should be exposed to data different from the one that was used to
estimate them. The fact that the risk measure is not constantly morphing to adapt
to new data provides better insights into the potential pitfalls of their performance.
Figure 5.9: Cumulative Mean for the ES of the GARCH(1,1) Model with Selected Values
of α
Figure 5.10: Moving Average with a Window of 1,000 Observations for the ES of the
GARCH(1,1) Model with Selected Values of α.
Figure 5.11: Cumulative Mean for the ES of the Student’s t Distribution with Selected
Values of α.
-VaR -ES |VaR − ES|
GARCH(1,1) α = 0.05 -5.8042 -9.6233 3.8191
GARCH(1,1) α = 0.025 -8.2585 -12.5765 4.3180
GARCH(1,1) α = 0.01 -12.2021 -16.9243 4.7222
Student’s t distribution α = 0.05 -3.3606 -5.6972 2.3366
Student’s t distribution α = 0.025 -4.6004 -7.5066 2.9062
Student’s t distribution α = 0.01 -6.6754 -10.6152 3.9398
Table 5.6: Estimates of VaR and ES for Both Methods with Various Confidence Levels α
5.3 Backtesting Value at Risk and Expected Shortfall Using Selected Tests
In the next step, the VaR and ES estimates from Section 5.2 are backtested
with the “out-of-sample” data. Particularly, a selection of the tests presented in
Sections 4.2 and 4.3 is implemented.
5.3.1 Backtesting Value at Risk
In order to backtest the VaR estimations obtained in Section 5.2.1, a selection of the tests introduced in Section 4.2 is used. Specifically, the collection of Christoffersen’s [12] tests is implemented.
The key reasons why this group of tests is chosen are the following. First, the whole collection provides an overall assessment of the most important properties that VaR should fulfil in order to be a reliable market risk estimate. Second, these tests have a predefined distribution for the test statistic, namely the χ² distribution, which makes the hypothesis testing more robust.
In the first step, the unconditional coverage property is tested using the test introduced in Section 4.2.3.4. In the second step, the Markov test from Section 4.2.4.1 is undertaken to test the independence property. Finally, the conditional coverage property, which assesses the overall performance of the backtesting procedure, is tested using the conditional coverage test exposed in Section 4.2.5.2.
Before proceeding with these tests, the “hit” function, as defined in Section 4.2.2,
is analysed in order to provide further insights into the backtesting procedure. Fig-
ures 5.12 to 5.17 illustrate the performance of the VaR estimate with respect to the
“out-of-sample” data. Figures 5.12a, 5.13a, 5.14a, 5.15a, 5.16a and 5.17a show the
“out-of-sample” data analysed in conjunction with the estimated “in-sample” VaR
threshold. Moreover, these graphs present in red the simulated returns that exceed
the estimated VaR threshold. On the other hand, Figures 5.12b, 5.13b, 5.14b, 5.15b,
5.16b and 5.17b present the cumulative sum of the “hit” function.
In Figures 5.15 to 5.17 the GARCH(1,1) generated data is presented. In Fig-
ures 5.15a, 5.16a and 5.17a it can be seen that the returns show volatility clustering
and therefore the VaR is surpassed consecutively. This leads to the “hit” function
having a drastic jump, as depicted in Figures 5.15b, 5.16b and 5.17b.
In Figures 5.12 to 5.14 the Student’s t distribution generated data is presented. Figures 5.12a, 5.13a and 5.14a show that there is no clear volatility clustering effect in the data.
However, some returns are more extreme than the ones observed in the GARCH(1,1)
generated data. Moreover, Figures 5.12b, 5.13b and 5.14b show that VaR violations
arrive in a more uniform fashion when compared to the GARCH(1,1) data. This may
influence the independence property when implementing Christoffersen’s statistical
test further on.
Finally, it is interesting to note for both models that, as α decreases, the plots behave in a more erratic and discontinuous fashion. In the case of the Student’s t distribution, the graph starts to lose the shape of a straight line. Similarly, for the GARCH(1,1) model, the sudden clusters of VaR violations occur in a more accentuated way.
(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.12: Backtesting VaR with Student’s t Distribution Generated Data with α = 0.05
(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.13: Backtesting VaR with Student’s t Distribution Generated Data with α =
0.025
(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.14: Backtesting VaR with Student’s t Distribution Generated Data with α = 0.01
(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.15: Backtesting VaR with GARCH(1,1) Generated Data with α = 0.05
(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.16: Backtesting VaR with GARCH(1,1) Generated Data with α = 0.025
(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.17: Backtesting VaR with GARCH(1,1) Generated Data with α = 0.01
Now, focus is given to the statistical tests proposed at the beginning of this
section.
Tables A.1 to A.6 provide a summary of the selected statistical tests. The test
statistic, as well as the p-value, are presented in order to test whether the null
hypothesis is true or false.
In this collection of statistical tests the null hypothesis stands for the desired
property. For instance, the null hypothesis in the independence test corresponds to
the fact that VaR violations are independent.
Firstly, Tables A.1 to A.3 show the summarized statistics for the GARCH(1,1)
model. Starting with the unconditional coverage analysis, it can be seen that the
p-value indicates that there is enough evidence to reject the null hypothesis for the
various significance levels. Hence, the “hit” function lacks the property of uncon-
ditional coverage. That is, the total number of exceptions embodied by the “hit”
function does not match the expected theoretical exceptions given by Tα. Likewise, using Equation 4.7, a violation ratio of approximately 0.5 is obtained. That means that fewer VaR violations occur than expected. In summary, the VaR estimate of the GARCH(1,1) model overestimates the “true” risk value. For example, Figure 5.15 shows that when α = 0.05, there are almost 300 observations that surpass VaR, compared to the 10,000 · 0.05 = 500 expected in theory. In practice, this would give a very conservative calculation of the actual risk and may not use resources optimally.
Secondly, Tables A.1 to A.3 show there is enough evidence to reject the hypothesis
that the “hit” function is independent. This behaves in line with what is visualized
in Figures 5.15 to 5.17. Therefore, the volatility clustering effect is solid evidence
that the “hit” function is not independent. This is confirmed by the rule of thumb
in Equation 4.14 which shows a big disparity between the probability of the arrival
of a VaR violation given no violation in the previous period and the probability of
the arrival of a VaR violation given a VaR violation in the previous period.
Finally, the conditional coverage test, which jointly tests for both of the above-stated properties, is not relevant as it was already shown that both are rejected for the GARCH(1,1) model. Hence, the conditional coverage property does not hold.
Proceeding with the analysis of the Student’s t distribution in the next step,
Tables A.4 to A.6 present the statistical tests for this model.
First, the analysis of the unconditional coverage property across all the selected
significance levels does not provide enough evidence to reject the null hypothesis.
In other words, the statistical test supports the fact that the arrivals of VaR violations match the ones stated by the model. This is supported by the fact that the violation ratio, as calculated in Equation 4.7, lies between the desired values of 0.8 and 1.2. For example, when α = 0.05, the actual number of violations is very close to 500 as shown in Figure 5.12b, while the number of theoretical violations is 10,000 · 0.05 = 500.
In summary, the expected VaR violations closely match the “out-of-sample” realized
VaR violations.
Second, the independence property for the Student’s t distribution’s “hit” func-
tion holds as there is not enough evidence to reject the hypothesis that the “hit”
function is independent. Similarly, this is indicated by the close magnitude on both
sides of Equation 4.14. In other words, the probability of the arrival of a VaR vi-
olation given no violation in the previous period compared to the arrival of a VaR
violation given a VaR violation in the previous period is similar.
Finally, the analysis of the conditional coverage property does not show enough evidence to reject the hypothesis that the Student’s t distribution fulfils the conditional coverage property. This is expected, as the test is composed of the unconditional coverage as well as the independence property, which indeed hold.
5.3.2 Backtesting Expected Shortfall
In order to backtest ES estimations, a selection of the tests introduced in Section
4.3 is implemented. Specifically, Test I and II from Acerbi and Szekely [1] are used
for the backtesting procedure. Those tests were selected as they are non-parametric
as mentioned in Section 4.3.2 and, therefore, do not assume any kind of return
distribution. Within the tests, the unconditional coverage for ES is tested. The
independence property does not need to be tested as it is equivalent to the one
calculated in the VaR section.
A significant difference of this backtesting method compared to the methods
implemented for backtesting VaR is that the test statistic does not have a predefined
distribution. As a consequence, simulations need to be implemented in order to
propose a reliable empirical distribution for the test statistic.
Acerbi and Szekely [1] propose the following guideline in order to calculate the empirical p-value.
1. Simulate independent and identically distributed samples from the predicted return distribution, R̊_t^j ∼ R_t, ∀t, ∀j = 1, ..., N, where N corresponds to the number of simulated paths.
2. Compute the test statistic Z^j = Z(R̊^j) based on the simulated returns.
3. Assess the test statistic by calculating its respective empirical p-value, which is determined as follows
$$\rho = \frac{1}{N}\sum_{j=1}^{N}\mathbf{1}\{Z^{j} < Z(R)\} \qquad (5.3)$$
where Z(R) corresponds to the “out-of-sample” realized value of the test statistic.
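The guideline above can be sketched as follows. This is a minimal Python illustration (not the authors' code), where `simulate_statistic` is a hypothetical callable that draws one simulated value of the test statistic under H₀.

```python
import numpy as np

# Sketch of the Monte Carlo empirical p-value (Equation 5.3): simulate the
# test statistic under H0 and count how often it falls below the realized value.
def empirical_p_value(simulate_statistic, z_realized, n_sims=1000, seed=0):
    rng = np.random.default_rng(seed)
    z_sims = np.array([simulate_statistic(rng) for _ in range(n_sims)])
    return np.mean(z_sims < z_realized)
```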
Finally, as already mentioned in Section 4.3.2, the null hypothesis of these tests
states that the ES estimate is a good estimate of the market risk and, therefore,
that the ES estimate passes the backtest. Note, however, that these are one-sided
tests. In other words, the null hypothesis is rejected only if the risk measure
underestimates the actual risk. Hence, the null hypothesis may be accepted even
when the risk measure overestimates the actual risk.
5.3.2.1 Acerbi and Szekely Test I
First, Test I, as explained in Section 4.3.2.1, is carried out. Recall the formula for
the test statistic:

$$Z_1(X) = \frac{1}{N_T} \sum_{t=1}^{T} \frac{X_t I_t}{|ES_t(\alpha)|} + 1$$

Rearranging this expression, the following is obtained:

$$Z_1(X) = \frac{\frac{1}{N_T}\sum_{t=1}^{T} X_t I_t}{|ES_t(\alpha)|} + 1$$

Now the numerator embodies the "out-of-sample" estimate of the ES, while the "in-
sample" estimate is represented in the denominator. Hence, if the "out-of-sample"
and "in-sample" estimates of the ES are identical, $Z_1(X) = 0$. This means that
the "in-sample" estimate of the ES passes the backtest successfully. Conversely,
when there is a significant difference between the two estimates, the null hypothesis
is rejected and the ES estimate has therefore underestimated the actual risk.
Acerbi and Szekely [1] mention that in order to perform this test, an estimate
of VaR needs to be available due to the presence of the indicator $I_t(\alpha)$. Furthermore,
as already introduced in Section 4.3, the authors note that $Z_1(X)$ is an average over
the VaR exceptions; it is therefore sensitive to the magnitude of the exceptions but
not to their frequency.
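The $Z_1$ statistic can be sketched as follows. This is a minimal illustration under my own conventions, not the thesis code: VaR and ES estimates are quoted as positive numbers, returns $X_t$ carry losses as negative values, and the function names are mine.

```python
import numpy as np

def z1_statistic(returns, var, es):
    """Acerbi-Szekely Test I statistic:
        Z1 = (1/N_T) * sum_t X_t * I_t / |ES_t(alpha)| + 1,
    where I_t = 1{X_t < -VaR_t} flags the VaR exceptions.

    returns : realized returns X_t (losses negative)
    var, es : per-period VaR and ES estimates, quoted as positive numbers
    """
    returns, var, es = map(np.asarray, (returns, var, es))
    exceptions = returns < -var              # I_t: VaR violation indicator
    n_t = exceptions.sum()
    if n_t == 0:
        raise ValueError("no VaR exceptions observed; Z1 is undefined")
    return float((returns[exceptions] / np.abs(es[exceptions])).sum() / n_t + 1.0)

# If every exception loss exactly matches the ES estimate, Z1 = 0:
x = np.array([0.010, -0.050, 0.002, -0.050])
q = np.full(4, 0.030)                        # VaR estimates
s = np.full(4, 0.050)                        # ES estimates
z1_statistic(x, q, s)  # → 0.0
```

Consistent with the discussion above, exception losses worse than the ES estimate drive $Z_1$ below zero (risk underestimated), while milder exception losses push it above zero.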
  • 1. Backtesting Value at Risk and Expected Shortfall with Underlying Volatility Clustering and Fat Tails by Stefano Bochicchio Estival BSc A thesis submitted in conformity with the requirements for the degree of Master of Science Department of Mathematics Faculty of Mathematical & Physical Sciences University College London September, 2016
  • 2. Disclaimer I, Stefano Bochicchio Estival, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis. Signature Date 2
  • 3. Abstract Since the financial crisis in 2008, risk management has become one of the most important topics in finance. The need to accurately assess the risk exposure of a financial entity has ignited a discussion between academics and regulators to search for the most accurate and reliable way to measure risk. The most prominent risk measures are Value at Risk (VaR) and Expected Shortfall (ES). Furthermore, back- testing has become an important tool to verify the performance of risk measures. In the context of the behaviour of financial time series, “volatility clustering” and “fat tails” are the most important properties. [15,36]. This motivates the following question: What is the effect of these properties on the backtesting procedure of VaR and ES? The objective of this thesis is to investigate and analyse the backtesting procedure of VaR and ES when exposed to data enriched with the properties of “fat tails” and “volatility clustering”. The structure of this thesis is integrated as follows. First, the GARCH(1,1) model is proposed as a reliable tool that embodies the property of “volatility clus- tering” and the Student t’s distribution as trustworthy model that captures the “fat tails” property. Second, the parameters of the proposed GARCH(1,1) model and the Student t’s distribution are estimated using the JPMM stock data and random 3
  • 4. simulations are generated in order to obtain the “in-sample” and “out-of-sample” subsets. Third, the VaR and ES estimates for both models are computed using the “in-sample” subset. Fourth, VaR is backtested using Christoffersen’s [12] tests, while ES is backtested using Acerbi and Szekely’s [1] Test I and II on the “out-sample” dataset. Finally, the correspondent p-values of the tests are calculated in order to conclude whether the estimated risk measures pass the backtest. In conclusion, the following results were obtained. Regarding the GARCH(1,1) model, the VaR estimate overestimated the expected VaR violations. Additionally, these violations were not independent due to the “volatility clustering” property. Furthermore, The ES estimate indeed passed the backtest but suggested that the “real” risk was overestimated. Regarding the Student’s t distribution, the VaR estimate passed the backtest as the VaR violations were in line with the estimation. Moreover, the violations proved to be independent. Likewise, The ES estimate passed the backtest. Hence, the “fat tails” property did not affect the backtesting procedure for both risk measures. Finally, further lines of investigation are recommended in order to study this topic with a different focus. This thesis was completed under the supervision of Professor Johannes Ruf and Professor Alejandro G´omez. 4
  • 5. Acknowledgments Firstly I would like to thank Prof. Alejandro G´omez for his unconditional support and excellent supervision, I highly appreciate the dedication he showed to this thesis. Secondly I would like to thank Prof. Johannes Ruf for his teachings during this whole year, I am deeply grateful to the help that he provided throughout my Master studies and during this thesis. Thirdly I would like to thank la bandita. Last but not least, I would like to thank my parents and Vanessa for their con- tinuous support. Stefano Bochicchio Estival, University College London, September 2016 5
  • 6. To my family and to Vanessa, thanks for all the unconditional support.
  • 7. Contents Disclaimer 2 Abstract 3 Acknowledgments 5 List of Tables 10 List of Figures 12 1 Introduction 14 2 Properties of Financial Time Series 18 2.1 Volatility Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.1.1 GARCH Model . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.2 Fat Tails . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.2.1 Student’s t Distribution . . . . . . . . . . . . . . . . . . . . . 23 3 Properties of Risk Measures and Introduction to VaR and ES 27 3.1 Properties of Risk Measures . . . . . . . . . . . . . . . . . . . . . . . 28 3.1.1 Coherence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 3.1.2 Elicitability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.1.3 Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 3.2 Introduction to Value at Risk . . . . . . . . . . . . . . . . . . . . . . 33 3.3 Introduction to Expected Shortfall . . . . . . . . . . . . . . . . . . . 36 7
  • 8. Contents 8 4 Theoretical Background for Backtesting VaR and ES 39 4.1 Statistical Background . . . . . . . . . . . . . . . . . . . . . . . . . . 40 4.2 Backtesting Value at Risk . . . . . . . . . . . . . . . . . . . . . . . . 42 4.2.1 Regulatory Framework . . . . . . . . . . . . . . . . . . . . . . 42 4.2.2 Statistical Framework . . . . . . . . . . . . . . . . . . . . . . 43 4.2.3 Unconditional Coverage Tests . . . . . . . . . . . . . . . . . . 45 4.2.3.1 Violation Ratio . . . . . . . . . . . . . . . . . . . . . 45 4.2.3.2 Failure Test . . . . . . . . . . . . . . . . . . . . . . . 46 4.2.3.3 Proportion of Failures (POF) . . . . . . . . . . . . . 47 4.2.3.4 Christoffersen’s Unconditional Coverage Test . . . . 48 4.2.4 Independence Tests . . . . . . . . . . . . . . . . . . . . . . . . 49 4.2.4.1 Christoffersen’s Independence Test (Markov Test) . . 49 4.2.4.2 Christoffersen and Pelletier’s Duration Test . . . . . 50 4.2.5 Conditional Coverage Tests . . . . . . . . . . . . . . . . . . . 51 4.2.5.1 Joint Markov Test . . . . . . . . . . . . . . . . . . . 51 4.2.5.2 Christoffersen’s Conditional Coverage Joint Test . . . 51 4.3 Backtesting Expected Shortfall . . . . . . . . . . . . . . . . . . . . . 52 4.3.1 Quantile Approximation . . . . . . . . . . . . . . . . . . . . . 52 4.3.2 Acerbi and Szekely Test . . . . . . . . . . . . . . . . . . . . . 54 4.3.2.1 Test I . . . . . . . . . . . . . . . . . . . . . . . . . . 54 4.3.2.2 Test II . . . . . . . . . . . . . . . . . . . . . . . . . . 56 4.3.2.3 Test III . . . . . . . . . . . . . . . . . . . . . . . . . 57 5 Backtesting VaR and ES with the Generated Data 59 5.1 Data Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 5.1.1 Volatility Clustering . . . . . . . . . . . . . . . . . . . . . . . 61 5.1.2 Fat Tails . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 5.2 Computation of Risk Measures . . . . . . . . . . . . . . . . . . . . . 67 5.2.1 Computation of Value at Risk . . . . . . . . 
. . . . . . . . . . 67 5.2.2 Computation of Expected Shortfall . . . . . . . . . . . . . . . 72 5.3 Backtesting Value at Risk and Expected Shortfall Using Selected Tests 75 5.3.1 Backtesting Value at Risk . . . . . . . . . . . . . . . . . . . . 75
  • 9. Contents 9 5.3.2 Backtesting Expected Shortfall . . . . . . . . . . . . . . . . . 81 5.3.2.1 Acerbi and Szekely Test I . . . . . . . . . . . . . . . 82 5.3.2.2 Acerbi and Szekely Test II . . . . . . . . . . . . . . . 84 6 Conclusions, limitations and further research 90 A Statistical Tests For VaR Backtesting 95 Bibliography 97
  • 10. List of Tables 2.1 Kurtosis of the FTSE Index with Fitted Normal Distribution . . . . . 23 2.2 Kurtosis of the FTSE Index with Fitted Student’s t Distribution . . . 26 3.1 Summary of Properties for VaR and ES 1 . . . . . . . . . . . . . . . . 38 4.1 Hypothesis Testing Summary Table . . . . . . . . . . . . . . . . . . . 40 4.2 Contingency Table for Christoffersen’s Markov Test 2 . . . . . . . . . 49 5.1 Statistical Test for the Residuals of the Returns Data Series of JPMM with α = 0.05 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 5.2 Parameter Estimates, Standard Error and Test Statistic for the Fitted GARCH(1,1) Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 5.3 Estimates and Standard Error of the Fitted Student’s t Distribution . 65 5.4 Kurtosis of the Simulated Student’s t Distribution . . . . . . . . . . . 65 5.5 Estimates of VaR for the GARCH(1,1) Model and the Student’s t Distribution with the Correspondent α. . . . . . . . . . . . . . . . . . 71 5.6 Estimates of VaR and ES for Both Methods with Various Confidence Levels α . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 5.7 Test Statistic and p-value for the Z1 (X) Test . . . . . . . . . . . . . . 83 5.8 Test Statistic and p-value for the Z2 (X) Test . . . . . . . . . . . . . . 85 A.1 Statistical Test for the GARCH(1,1) Model with α = 0.05 . . . . . . 95 A.2 Statistical Test for the GARCH(1,1) Model with α = 0.025 . . . . . . 95 A.3 Statistical Test for the GARCH(1,1) Model with α = 0.01 . . . . . . 95 A.4 Statistical Test for Student t’s Distribution with α = 0.05 . . . . . . . 96 A.5 Statistical Test for Student t’s Distribution with α = 0.025 . . . . . . 96 10
  • 11. A.6 Statistical Test for Student t’s Distribution with α = 0.01 . . . . . . . 96 11
  • 12. List of Figures
2.1 FTSE Daily Returns 19
2.2 Empirical and Fitted Normal Distribution of the FTSE Index 24
2.3 Rescale of Empirical and Fitted Normal Distribution of the FTSE Index 24
2.4 Rescale of Empirical and Fitted Student's t Distribution of the FTSE Index 26
3.1 VaR and ES for a Loss Function that is Normally Distributed with µ = 0 and σ² = 1 37
5.1 Daily Returns of JPMM 61
5.2 Sample Autocorrelation Function and Sample Partial Autocorrelation Function 63
5.3 Simulated Conditional Variance and Returns for the Fitted GARCH(1,1) Model for a Selected Path 64
5.4 Sample Autocorrelations for the Conditional Variance (up) and Returns (down) 64
5.5 Fitted Student's t Distribution vs. Empirical Distribution 66
5.6 Cumulative Mean for the VaR of the GARCH(1,1) Model with Selected Values of α 69
5.7 Moving Average with a Window of 1,000 Observations for the VaR of the GARCH(1,1) Model with Selected Values of α 69
5.8 Cumulative Mean for the VaR of the Student's t Distribution with Selected Values of α in Conjunction with the Quantile Values Derived from the Distribution 71
5.9 Cumulative Mean for the ES of the GARCH(1,1) Model with Selected Values of α 73
5.10 Moving Average with a Window of 1,000 Observations for the ES of the GARCH(1,1) Model with Selected Values of α 73
5.11 Cumulative Mean for the ES of the Student's t Distribution with Selected Values of α 74
5.12 Backtesting VaR with Student's t Distribution Generated Data with α = 0.05 77
5.13 Backtesting VaR with Student's t Distribution Generated Data with α = 0.025 77
5.14 Backtesting VaR with Student's t Distribution Generated Data with α = 0.01 77
5.15 Backtesting VaR with GARCH(1,1) Generated Data with α = 0.05 78
5.16 Backtesting VaR with GARCH(1,1) Generated Data with α = 0.025 78
5.17 Backtesting VaR with GARCH(1,1) Generated Data with α = 0.01 78
5.18 Contribution to the Z2(X) Test Statistic for the GARCH(1,1) Model with α = 0.05 87
5.19 Contribution to the Z2(X) Test Statistic for the GARCH(1,1) Model with α = 0.025 87
5.20 Contribution to the Z2(X) Test Statistic for the GARCH(1,1) Model with α = 0.01 88
5.21 Contribution to the Z2(X) Test Statistic for the Student's t Distribution with α = 0.05 88
5.22 Contribution to the Z2(X) Test Statistic for the Student's t Distribution with α = 0.025 89
5.23 Contribution to the Z2(X) Test Statistic for the Student's t Distribution with α = 0.01 89
  • 14. Chapter 1 Introduction Since the financial crisis in 2008, risk management has become one of the most important topics in finance. The need to accurately assess the risk exposure of a financial entity has ignited a discussion between academics and regulators to search for the most accurate and reliable way to measure risk. The most common type of risk is market risk, which measures the sensitivity of the value of a portfolio with respect to changes in the price of the underlying financial products. Another important manifestation of risk is called credit risk, which embodies the risk of not receiving outstanding payments from a financial counterpart due to a default. Within the realm of credit risk there exists a subset called credit counterparty risk which is mainly incurred when trading OTC1 derivatives, as the fulfilment of future cashflows depends directly on a financial counterparty. Moreover, liquidity risk corresponds to the risk that arises when financial positions cannot be opened or closed at the desired prices due to a lack of trading activity in the market. Operational risk measures the risk associated with partial or complete failure of internal processes such as human or computational systems [39]. Within operational risk, legal risk corresponds to the unexpected losses attributed to a defective transaction related to a dispute or legal action against a certain financial entity [38] (for more information about other types 1 OTC stands for “over the counter” which denotes the non-standardized contracts that are not traded in exchanges but directly between counterparties. 14
  • 15. Chapter 1. Introduction 15 of risk, see McNeil et al. [39]). A natural question that arises when dealing with risk is: How can risk be measured? In their seminal article, Emmer et al. [24] mention that the concept of risk measurement is fundamental to the correct management of risk. Specifically, Kou et al. [30] indicate that “a risk measure attempts to assign a single numerical value to the random loss of a portfolio of assets.” The modern history of risk measurement starts with Markowitz in 1952 [37], when he introduced the concept of risk together with the return of a financial product. In his work, Markowitz defines risk as the “standard deviation” of returns [24]. At the end of 1974, the Basel Committee on Banking Supervision (BCBS) was established by the members of the Group of Ten (G-10) countries. Its objective is to ensure global financial stability by setting the minimum regulatory framework for the supervision of the banking industry. In the second agreement of the BCBS in 2004 [4], Value at Risk (VaR) was adopted as the benchmark downside risk measure to quantify the market risk of financial institutions (for more information about other downside risk measures see Nawrocki [40]) [6, 31, 39]. In October 2013, the BCBS [5] proposed a change in its regulations and introduced Expected Shortfall (ES) as a suggested financial risk measure to capture unexpected losses incurred in financial distress [1]. In the financial regulatory framework, risk measures need to be backtested to accurately assess the capital that needs to be set aside in order to cover extreme portfolio losses. When it comes to backtesting VaR, certain standardized tests can be implemented in order to cross-check the current capital requirements, as explained by Campbell [9] and Kupiec [31]. On the other hand, Gneiting [26] and Carver [10] mention that ES is not backtestable due to the fact that it does not fulfil the property of elicitability (see Section
  • 16. Chapter 1. Introduction 16 3.1.2). Nevertheless, Acerbi and Szekely [1], Kerkhof et al. [29] and Costanzino et al. [18] mention that elicitability is not a necessary factor to determine whether a risk measure is backtestable. As a consequence, these authors introduce standardized non-parametric backtesting procedures for ES. In the context of the behaviour of financial time series, the property of “volatility clustering” is frequently exhibited by financial assets, as shown first by Mandelbrot [36] and studied by Cont [16]. Moreover, after the financial crisis of 2008, the property of “fat tails” in the probability distribution of prices has manifested itself in the dynamics of financial markets (for further information see Dash [20]). Now the following question can be asked: What is the effect of these properties on the backtesting procedure of VaR and ES? The objective of this thesis is to investigate and analyse the backtesting procedure of VaR and ES when exposed to data enriched with the properties of “fat tails” and “volatility clustering”. In Chapter 2, the properties of “volatility clustering” and “fat tails” are presented. The GARCH(1,1) model is proposed as a tool that embodies the property of “volatility clustering” and the Student's t distribution is taken as a trustworthy model that captures the “fat tails” property. Moreover, the fulfilment of these properties is empirically evidenced in a specific financial time series, namely the FTSE index. In Chapter 3, a background on the properties that are important for risk measures is presented. Additionally, VaR and ES are introduced. In Chapter 4, a thorough analysis of the backtesting procedures available for VaR and ES is undertaken. In Chapter 5 the methodology of the thesis is introduced.
First, the parameters of the proposed GARCH(1,1) model and the Student's t distribution are estimated using the JPMM stock data and random simulations are generated in order to obtain the “in-sample” and “out-of-sample” subsets. As a next step, the VaR and ES estimates for both models are computed using the “in-sample” subset. Afterwards,
  • 17. Chapter 1. Introduction 17 VaR is backtested using Christoffersen's [12] tests, while ES is backtested using Acerbi and Szekely's [1] Test I and Test II on the “out-of-sample” dataset. Finally, the corresponding p-values of the tests are calculated in order to conclude whether the estimated risk measures pass the backtest.
  • 18. Chapter 2 Properties of Financial Time Series In this chapter, a theoretical background introduces the role of the two most common properties of financial time series: “volatility clustering” and “fat tails”. Furthermore, two models are introduced as catalysts of these properties. In particular, the GARCH(1,1) model is used to capture the “volatility clustering” property and the Student's t distribution is chosen to embody the property of “fat tails”. 18
  • 19. Chapter 2. Properties of Financial Time Series 19 2.1 Volatility Clustering The volatility clustering phenomenon was first described by Mandelbrot [36] as “large changes tend to be followed by large changes, of either sign, and small changes tend to be followed by small changes.” In other words, when volatility is high during a certain period of time it tends to remain high for subsequent periods, and vice versa. Moreover, Cont [16] indicates that the volatility clustering effect corresponds to the fact that financial time series returns are non-linearly dependent in time. As Figure 2.1 illustrates, large returns arrive in consecutive clusters in the FTSE Index. This is a clear manifestation of the volatility clustering effect on financial instruments. Figure 2.1: FTSE Daily Returns1 2.1.1 GARCH Model The GARCH (Generalized Autoregressive Conditional Heteroscedasticity) model was developed by Engle [25] and generalized by Bollerslev [8]. This model is a popular reference in modelling the dynamic variability of time series. Due to the 1 Price data obtained from www.yahoofinance.com.
  • 20. Chapter 2. Properties of Financial Time Series 20 fact that prices fluctuate during periods of financial stress, conditional variances are non-constant. GARCH models have proven to be interesting tools to embody the volatility clustering effect in financial time series [8, 25]. This is due to the fact that in the GARCH model, the present level of volatility depends on the volatility of the previous period. For example, if volatility is high in a previous time step, it suggests that it will still be high in the next time step. Therefore, in the realm of finance, the GARCH model is an appealing option to model financial time series [44,50]. Cont [16] even calls the volatility clustering feature the “GARCH effect”. However, the author mentions that this phenomenon is non-parametric and is not implicitly linked to the GARCH(1,1) model specification. Definition 2.1.1 GARCH process. The process $X_t$ follows a GARCH process composed of $p$ past conditional variances ($\sigma_{t-i}^2$) and $q$ past squared innovations ($X_{t-i}^2$) if
$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i X_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2, \qquad X_t = \sigma_t \varepsilon_t \qquad (2.1)$$
where $\omega \in \mathbb{R}$, $\alpha_i, \beta_i \geq 0$ and $\varepsilon_t \sim N(0, 1)$. Due to both its usefulness and importance in the financial industry [44,46], this thesis focuses on the GARCH(1,1) process, which takes one lag for the past conditional variance ($\sigma_{t-1}^2$) and one lag for the past squared innovation ($X_{t-1}^2$). The GARCH(1,1) model can be represented using Definition 2.1.1 with p = 1 and q = 1:
$$\sigma_t^2 = \omega + \alpha X_{t-1}^2 + \beta \sigma_{t-1}^2 \qquad (2.2)$$
  • 21. Chapter 2. Properties of Financial Time Series 21 where $\omega \in \mathbb{R}$ and $\alpha, \beta \geq 0$. For the purpose of this thesis, the returns of certain financial time series ($X_t$) follow a GARCH(1,1) process and fulfil $X_t \sim N(0, \sigma_t^2)$, where $\sigma_t^2$ satisfies Equation 2.2. Additionally, in order to have a stationary solution for the GARCH(1,1) model, the following condition needs to hold:
$$\alpha + \beta < 1 \qquad (2.3)$$
Lindner [34] argues that the process $X_t$ has a finite variance if, and only if, Equation 2.3 is fulfilled. For the estimation of the parameters, the maximum likelihood approach is usually used to produce the estimated parameters of the model. It is known that
$$X_t \sim N(0, \sigma_t^2), \qquad \sigma_t^2 = \omega + \alpha X_{t-1}^2 + \beta \sigma_{t-1}^2 \qquad (2.4)$$
so in order to find the estimated coefficient vector $\nu = (\omega, \alpha, \beta)^T$, the following is obtained:
$$\nabla L(\nu) = \frac{1}{2} \sum_{t=2}^{n} \left( \frac{X_t^2}{\sigma_t^2} - 1 \right) \frac{1}{\sigma_t} \frac{\partial \sigma_t}{\partial \nu} \qquad (2.5)$$
$$J = -\frac{1}{2} \sum_{t=2}^{n} E\left[ \frac{1}{\sigma_t^2} \frac{\partial \sigma_t}{\partial \nu} \frac{\partial \sigma_t}{\partial \nu^T} \right] \qquad (2.6)$$
where $\nabla L(\nu)$ is the gradient of the log-likelihood function and $J$ is Fisher's Information Matrix. Consequently, the estimated parameters can be found using the iterative scheme from Newton's optimization method (see Yang [54]).
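The recursion in Equation 2.2 can be sketched with a short simulation. The parameters below (ω = 10⁻⁶, α = 0.08, β = 0.90) are hypothetical values chosen only to satisfy the stationarity condition of Equation 2.3, not estimates from the thesis data; the check at the end illustrates the volatility clustering signature, namely that squared returns are positively autocorrelated.

```python
import numpy as np

def simulate_garch11(omega, alpha, beta, n, seed=0):
    """Simulate X_t = sigma_t * eps_t with
    sigma_t^2 = omega + alpha * X_{t-1}^2 + beta * sigma_{t-1}^2 (Equation 2.2)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    sigma2 = np.empty(n)
    x = np.empty(n)
    sigma2[0] = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    x[0] = np.sqrt(sigma2[0]) * eps[0]
    for t in range(1, n):
        sigma2[t] = omega + alpha * x[t - 1] ** 2 + beta * sigma2[t - 1]
        x[t] = np.sqrt(sigma2[t]) * eps[t]
    return x, sigma2

# Hypothetical parameters with alpha + beta = 0.98 < 1 (Equation 2.3)
x, sigma2 = simulate_garch11(omega=1e-6, alpha=0.08, beta=0.90, n=5000)

def lag1_autocorr(v):
    """Sample lag-1 autocorrelation."""
    v = v - v.mean()
    return float(np.dot(v[:-1], v[1:]) / np.dot(v, v))

# Volatility clustering: squared returns are positively autocorrelated
# even though the returns themselves are conditionally uncorrelated.
acf_sq = lag1_autocorr(x ** 2)
```

A positive `acf_sq` on the simulated path is the non-parametric “GARCH effect” mentioned by Cont [16].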
  • 22. Chapter 2. Properties of Financial Time Series 22 2.2 Fat Tails The property of fat tails2 in time series has been labelled a stylized fact3 of financial assets, as evidenced by [15,21]. This feature refers to the property that data possesses extreme values that tend to be separated from the mean of the distribution. In other words, the data is underestimated by a normal distribution, as it assigns a low probability to events far from the mean. Therefore, a better treatment can be provided with the use of heavy-tailed distributions such as the Student's t distribution. Cont [15] mentions that the precise behaviour of the tails may sometimes be difficult to determine. From a mathematical viewpoint, the property of fat tails can be represented with the following formula:
$$P(X > x) \sim x^{-\alpha}, \qquad \alpha > 0 \qquad (2.7)$$
In other words, the tail probability of extreme events decays polynomially with exponent $\alpha > 0$. By contrast, the density of a normal distribution decays exponentially in $x^2$, which is faster than any polynomial decay, whereas the Student's t distribution does exhibit a polynomial tail. A useful way to detect the property of fat tails in the data is the kurtosis (normalized fourth moment) of the distribution, which is defined as follows:
$$K_X = \frac{E[(X - \mu)^4]}{(E[(X - \mu)^2])^2} \qquad (2.8)$$
2 The terms fat tails and heavy tails are used interchangeably in the literature and in this thesis. 3 Cont [15] mentions that a stylized fact is defined as “a common denominator among the properties observed in studies of different markets and instruments.”
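Equation 2.8 is straightforward to evaluate on a sample. The sketch below uses synthetic draws (not the FTSE data) to contrast the kurtosis of a normal sample, which is close to the theoretical value of 3, with that of a fat-tailed Student's t sample with 5 degrees of freedom, whose theoretical kurtosis is 3 + 6/(ν − 4) = 9.

```python
import numpy as np

def kurtosis(x):
    """Normalized fourth moment K_X = E[(X-mu)^4] / (E[(X-mu)^2])^2 (Equation 2.8)."""
    x = np.asarray(x, dtype=float)
    c = x - x.mean()
    return float(np.mean(c ** 4) / np.mean(c ** 2) ** 2)

rng = np.random.default_rng(42)
k_normal = kurtosis(rng.standard_normal(100_000))    # theory: 3
k_t5 = kurtosis(rng.standard_t(df=5, size=100_000))  # theory: 3 + 6/(5 - 4) = 9
```

The sample kurtosis of the t draws is noticeably larger than that of the normal draws, mirroring the gap between the empirical FTSE kurtosis and the fitted normal kurtosis reported in Table 2.1.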
  • 23. Chapter 2. Properties of Financial Time Series 23 where µ corresponds to the mean of the random variable X.

Distribution | Kurtosis
Empirical Distribution | 12.5354
Fitted Normal Distribution | 3.0018

Table 2.1: Kurtosis of the FTSE Index with Fitted Normal Distribution

Table 2.1 shows that the kurtosis of the empirical data extracted from the FTSE index is higher than that obtained from the fitted normal distribution. Hence, as already mentioned, the excess kurtosis observed in the FTSE index can't be accurately modelled by a normal distribution. Moreover, Figures 2.2 and 2.3 present the empirical distribution in conjunction with the fitted normal distribution for the FTSE index daily compounded returns. As can be seen in the graphs, the empirical distribution assigns more probability to extreme events in comparison to the fitted normal distribution. Hence, this suggests that the empirical data may possess the property of fat tails, as also observed in Table 2.1. Figure 2.3 presents the events that are higher than µ + 3σ for the empirical distribution. It is known that the fitted normal distribution covers about 99.7% of its area in the interval (µ − 3σ, µ + 3σ). Therefore, events higher than µ + 3σ are extremely unlikely. To put it into perspective, if normal random numbers were drawn every day, the event that a trial lies outside the interval (µ − 6σ, µ + 6σ) would occur once every 1.38 million years. On the contrary, Figure 2.3 shows that this event happened more than once in the last 32 years of FTSE index data. 2.2.1 Student's t Distribution As evidenced in Section 2.2, the normal distribution may underestimate the true underlying behaviour of the returns of financial time series. Therefore, when it comes
  • 24. Chapter 2. Properties of Financial Time Series 24 Figure 2.2: Empirical and Fitted Normal Distribution of the FTSE Index Figure 2.3: Rescale of Empirical and Fitted Normal Distribution of the FTSE Index
  • 25. Chapter 2. Properties of Financial Time Series 25 to model excessive returns in financial time series, the use of fat-tailed distributions seems appropriate. One of the most famous fat-tailed distributions is the Student's t distribution. Stoyanov [48] mentions that the underlying reason why this distribution is so widespread is its simplicity and the easy implementation of a numerical method for its application. Therefore, the Student's t distribution is used as the catalyst to generate the dataset enriched with the property of fat tails. Definition 2.2.1 Student's t distribution. Let X be a random variable. X follows a Student's t distribution with ν degrees of freedom if it has the following probability density function:
$$f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sigma\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{1}{\nu}\left(\frac{t-\mu}{\sigma}\right)^2\right)^{-\frac{\nu+1}{2}} \qquad (2.9)$$
where µ and σ correspond to the location and the scaling parameters respectively. Moreover, Γ corresponds to the Gamma function, which is defined in Equation 2.10:
$$\Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\,dx \qquad (2.10)$$
Returning to the example concerning the FTSE index, Figure 2.4 shows the same information as Figure 2.3, with the exception that a fitted Student's t distribution is utilized instead. As the graph shows, this distribution takes the extreme values more into consideration in comparison to the fitted normal distribution. Finally, Table 2.2 illustrates the kurtosis of the fitted Student's t distribution compared to the empirical distribution. Clearly, the Student's t distribution matches the actual kurtosis of the empirical data better than the fitted normal distribution of Table 2.1.
  • 26. Chapter 2. Properties of Financial Time Series 26 Figure 2.4: Rescale of Empirical and Fitted Student's t Distribution of the FTSE Index

Distribution | Kurtosis
Empirical Distribution | 12.5354
Fitted Student's t Distribution | 20.2836

Table 2.2: Kurtosis of the FTSE Index with Fitted Student's t Distribution
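One simple way to fit the parameters of Definition 2.2.1 to data is moment matching; the thesis itself relies on maximum likelihood, so the closed-form sketch below on synthetic returns is only an illustrative alternative. For ν > 4 the kurtosis is K = 3 + 6/(ν − 4), so ν = 4 + 6/(K − 3), and Var(X) = σ²ν/(ν − 2) then recovers the scale σ.

```python
import numpy as np

# Synthetic heavy-tailed "returns" (true df = 10, scale = 0.01); a stand-in
# for an empirical return series, not the thesis data.
rng = np.random.default_rng(7)
returns = 0.01 * rng.standard_t(df=10, size=200_000)

# Moment matching: degrees of freedom from the sample kurtosis (Eq. 2.8),
# scale from the sample variance. Valid only when K > 3 (i.e. nu > 4).
c = returns - returns.mean()
K = float(np.mean(c ** 4) / np.mean(c ** 2) ** 2)
df_hat = 4 + 6 / (K - 3)
sigma_hat = float(np.sqrt(np.var(returns) * (df_hat - 2) / df_hat))
```

For very heavy tails (ν ≤ 4) the kurtosis does not exist and this shortcut breaks down, which is one reason maximum likelihood is preferred in practice.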
  • 27. Chapter 3 Properties of Risk Measures and Introduction to VaR and ES In this chapter the theoretical background of risk measures is presented. Moreover, the standard risk measures proposed by the regulators and the industry are introduced, namely VaR and ES. 27
  • 28. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 28 3.1 Properties of Risk Measures As already mentioned in Chapter 1, “a risk measure attempts to assign a single numerical value to the random loss of a portfolio of assets” (Kou et al. [30]). In this section a formal definition is given in terms of the desired properties that a risk measure must possess. This section is based on the layout presented by Emmer et al. [24]. 3.1.1 Coherence The concept of coherence is important as it groups various mathematical properties that should be taken into account in order to select a suitable risk measure [24]. Specifically, Artzner et al. [3] propose the following four key properties that need to be fulfilled in order for a risk measure to be coherent. Definition 3.1.1 Homogeneity. A certain risk measure ζ(·) is called homogeneous if for all loss variables L and h ≥ 0 it holds that ζ(hL) = hζ(L). Definition 3.1.2 Subadditivity. A certain risk measure ζ(·) is called subadditive if for all loss variables L and K it holds that ζ(L + K) ≤ ζ(L) + ζ(K). Definition 3.1.3 Monotonicity. A certain risk measure ζ(·) is called monotonic if for all loss variables L and K it holds that L ≤ K =⇒ ζ(L) ≤ ζ(K).
  • 29. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 29 Definition 3.1.4 Translation Invariance. A certain risk measure ζ(·) is called translation invariant if for all loss variables L and δ ∈ ℝ it holds that ζ(L − δ) = ζ(L) − δ. 3.1.2 Elicitability Elicitability [26,32,43] plays an important role in the determination of an appropriate risk measure. Before formalizing the definition of elicitability, the following definitions need to be introduced. Definition 3.1.5 Scoring function. A scoring function is defined as s : ℝ × ℝ → [0, ∞), (x, y) ↦ s(x, y), where x and y correspond to the forecast and the realization respectively. Put into words, a scoring function assigns a numerical score in terms of the distance between the forecasted value and the realized value. For example, this difference could be measured by the squared error s(x, K) = (x − K)² or the absolute error s(x, K) = |x − K|. Definition 3.1.6 Consistency. Let τ be a functional on a class of probability measures P on ℝ: τ : P → 2^ℝ, Q ↦ τ(Q) ⊂ ℝ. A scoring function s : ℝ × ℝ → [0, ∞) is consistent for the functional τ relative to
  • 30. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 30 the class P if and only if, ∀Q ∈ P, t ∈ τ(Q), x ∈ ℝ and L being the loss random variable defined on (Ω, F, Q), E_Q[s(t, L)] ≤ E_Q[s(x, L)]. Definition 3.1.7 Strict Consistency. A scoring function s is strictly consistent if and only if it is consistent and E_Q[s(t, L)] = E_Q[s(x, L)] =⇒ x ∈ τ(Q). Finally, the definition of elicitability can be introduced. Definition 3.1.8 Elicitability. The functional τ is elicitable relative to P if and only if there exists a scoring function s which is strictly consistent for τ relative to P. This definition is used by Emmer et al. [24]. Moreover, the authors mention that elicitability is a very helpful property for the determination of optimal point forecasts. Hence, if there exists a strictly consistent scoring function s for a functional τ, then the elicited statistic can be written as follows (also used by Acerbi and Szekely [1]): ι(K) = arg min_x E[s(x, K)] (3.1), where s(·, ·) is a scoring function and ι(K) is a statistic of the random variable K. One of the most important properties of elicitability is that it can be utilized to assess the performance of forecast models [26]. It is worth noting that usually elicitability refers to the risk measure itself and not to a functional with respect to the risk measure. In this sense, a “weak” second-order notion of elicitability can be defined as
  • 31. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 31 follows. Definition 3.1.9 Conditional Elicitability. A functional τ of Q is called conditionally elicitable if there exist functionals κ̃ and κ : D → 2^ℝ with D ⊂ Q × 2^ℝ that satisfy the following: • κ̃ is elicitable relative to Q • (P, κ̃(P)) ∈ D ∀P ∈ Q • ∀c ∈ κ̃(Q) the functional κ_c : Q_c → 2^ℝ, P ↦ κ(P, c) ⊂ ℝ is elicitable relative to Q_c = {P ∈ Q : (P, c) ∈ D}. The property of conditional elicitability is relevant when forecasting risk measures that are not elicitable. 3.1.3 Robustness Robustness refers to the sensitivity of a model to changes in its underlying parameters. A robust risk measure, in a strict sense, is not significantly affected by external or internal shocks. In the risk context, Emmer et al. [24] mention that without robustness, results may not be meaningful, as small measurement errors can lead to big changes in the estimated risk measure. Furthermore, Cont et al. [17] define robustness with a different focus. Specifically, instead of assuming that the sensitivity comes from measurement errors, they attribute it to the actual inflow of new data used to estimate the model. When analysing the robustness of a certain risk measure, a distance should be defined. Emmer et al. [24] propose the following definition.
  • 32. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 32 Definition 3.1.10 Wasserstein distance. The Wasserstein distance between two probability measures P and Q is
$$D_{ws}(P, Q) = \inf\{E(|X - Y|) : X \sim P, Y \sim Q\} \qquad (3.2)$$
Using the distance of Definition 3.1.10, the definition of robustness can be introduced. Definition 3.1.11 Robustness. A risk measure µ is called robust with respect to the Wasserstein distance if
$$\lim_{n\to\infty} D_{ws}(X_n, X) = 0 \implies \lim_{n\to\infty} |\mu(X_n) - \mu(X)| = 0 \qquad (3.3)$$
where X_n ∼ P_n, n ∈ ℕ, P_n corresponds to a probability measure and µ corresponds to a certain risk measure.
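For empirical distributions on the real line, the distance in Definition 3.1.10 has a convenient form: with two equal-size samples, the optimal coupling pairs the order statistics, so D_ws reduces to the mean absolute difference of the sorted samples. The sketch below (synthetic data) shifts a sample by a small constant and confirms that a quantile-based functional such as VaR moves by exactly that amount, in the spirit of Definition 3.1.11.

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.sort(rng.standard_normal(100_000))
y = np.sort(x + 0.01)  # a small shift, so the two samples are close

# For equal-size empirical samples on R, the 1-Wasserstein distance equals
# the mean absolute difference of the order statistics.
d_ws = float(np.mean(np.abs(x - y)))

# The 5% quantile (the VaR level) of the shifted sample moves by the same 0.01.
var_x = -np.quantile(x, 0.05)
var_y = -np.quantile(y, 0.05)
```

Here both `d_ws` and the change in the quantile equal the shift of 0.01, so small perturbations of the data produce comparably small changes in the estimated risk measure.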
  • 33. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 33 3.2 Introduction to Value at Risk VaR is the most widespread risk measure in finance. VaR was developed by J.P. Morgan in 1994 with the publication of the RiskMetrics framework. This catapulted it to the status of benchmark risk measure in the industry [28, 41]. Afterwards, as already mentioned in Chapter 1, the Basel Committee on Banking Supervision introduced VaR as the internal benchmark for banks to calculate their capital requirements. Definition 3.2.1 Value at Risk (VaR). A portfolio's Value at Risk (VaR) corresponds to the α quantile of the profit and loss distribution X [9]:
$$\mathrm{VaR}_t(\alpha) = -F^{-1}(\alpha), \qquad F^{-1}(\alpha) = \inf\{x \in \mathbb{R} : F(x) \geq \alpha\} \qquad (3.4)$$
where F^{-1}(α) is the quantile function (inverse CDF1) of the profit and loss distribution. When defining VaR, a confidence level (1 − α) and a time interval (t) must be given. Specifically, t determines the distribution, and from the risk-metric perspective this parameter is introduced for convenience and clarity. As an illustrative example, let t = 1 and α = 0.01. If VaR corresponds to a value of $1,000, then under the mathematical definition the one-day loss of the portfolio will not exceed $1,000 with 99% probability. In spite of the popularity and simplicity of VaR, some shortcomings have been diagnosed for this risk measure. As a first important drawback, VaR does not provide any information regarding the magnitude of the excess loss beyond the α level. This is an important pitfall, as the VaR could underestimate the actual loss of the portfolio [42,45,53]. Second, VaR is criticized for its lack of subadditivity (see Definition 3.1.2) and therefore its lack of coherence (see Section 3.1.1). This result is
  • 34. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 34 disturbing for the following reason: there may be no direct benefit from diversifying portfolios, as the VaR of the combined portfolio can actually be higher [39,53]. Nevertheless, there are some cases in which VaR is subadditive. For example, according to Haugh [27], VaR is indeed subadditive when dealing with elliptical distributions as well as with distributions that are continuous and symmetric. Regarding the computation of VaR, there exist three major techniques that are commonly implemented. 1. Variance-Covariance Approach. The variance-covariance approach is based on the assumption that returns are normally distributed. Therefore, historical data is taken in order to estimate the parameters of the normal distribution (µ, σ²). Consequently, when quantiles need to be obtained, the calculation simplifies to evaluating the quantiles of the fitted normal distribution. A strong advantage of this method is that it is very flexible and simple to use. Moreover, it facilitates the inclusion of stress scenarios to analyse the sensitivity of the results when parameters are changed [51]. However, the most important pitfall of this technique is that the returns of the portfolio are assumed to be normally distributed. As already explained in Section 2.2, the normal distribution may sometimes underestimate the true behaviour of financial assets. 2. Historical Simulation. As mentioned by O'Brien et al. [42], the Historical Simulation technique is the most popular approach for calculating VaR. This technique is mainly based on the historical information of the financial products that compose a specific portfolio. It is assumed that the weights of the financial instruments in the portfolio do not change over the observation period. In this case, VaR is obtained by inspecting the quantiles of the empirical
  • 35. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 35 distribution generated by the historical prices. The key advantage of this technique lies in the fact that it is non-parametric. In other words, there is no need to estimate any kind of parameters, as the distribution is based on the historical prices. Moreover, Nieppola [41] mentions that using historical data series can account for the property of “heavy tails” of the distribution. One of the most important pitfalls of this model is that it assumes that the behaviour of past prices is a good model for their behaviour in the future; “driving by looking in the rearview mirror”. Therefore, it assumes that history could repeat itself in the future. For example, Dowd [22] mentions that if the data is unusually quiet, the VaR calculated under the Historical Simulation approach could underestimate the “true risk”. Moreover, another important shortcoming of the model is that, as past prices are the most important input for the model, a long history of data is needed. That could pose a problem when taking into account financial instruments that have a short-lived history [41] (for more information regarding the Historical Simulation approach refer to Dowd [22]). 3. Monte Carlo Simulation. The Monte Carlo Simulation approach, despite being a really powerful VaR calculation technique, is the most challenging technique to implement [22]. The Monte Carlo method relies on the simulation of financial variables which are estimated with respect to market data. Specifically, price paths are simulated at various times to calculate the implied distribution from which VaR estimates can be computed. One of the most important disadvantages of the Monte Carlo approach is that the computational cost is extremely high. In other words, multiple simulated paths need to be generated in order to obtain a robust result, requiring substantial computational memory. This can be crucial when trading in a
  • 36. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 36 high-frequency environment or when estimating the risk of the whole portfolio of a large bank (for more information about the Monte Carlo technique see Nieppola [41]). 3.3 Introduction to Expected Shortfall As already shown in Section 3.2, one of the main drawbacks of VaR is that it does not take into account the magnitude of the loss. As a consequence, Expected Shortfall2 was introduced as an enhanced risk measure. Definition 3.3.1 Expected Shortfall. Let X be a profit and loss random variable such that E(X) < ∞ with probability density function f_X. The ES can be defined as follows:
$$\mathrm{ES}_t(\alpha) = \frac{1}{1-\alpha} \int_{\alpha}^{1} g_u(f_X)\,du = -E[X \mid X \leq -\mathrm{VaR}_t(\alpha)] \qquad (3.5)$$
where g(·) corresponds to the quantile function, or inverse cumulative density function, of the profit and loss distribution. Put into words, the Expected Shortfall as denoted in Equation 3.5 weights the probability under the tail of the loss distribution for the losses that exceed the VaR threshold. As a consequence, the following relationship holds:
$$|\mathrm{ES}_t(\alpha)| \geq |\mathrm{VaR}_t(\alpha)| \qquad (3.6)$$
2 Expected Shortfall is also defined with a different nomenclature. Across the literature, it is also called Expected Tail Loss, Conditional VaR, Tail VaR, Tail Conditional Expectation, and Worst Conditional Expectation. For more information refer to Acerbi and Tasche [2].
  • 37. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 37 Figure 3.1: VaR and ES for a Loss Function that is Normally Distributed with µ = 0 and σ² = 1. Figure 3.1 depicts the values of VaR and ES with respect to a normally distributed loss random variable with µ = 0 and σ² = 1. As can be observed in the graph, Equation 3.6 holds. The key advantage of ES with respect to VaR is that it takes into account the magnitude of the loss beyond the VaR threshold and therefore proves to be a more precise measure of the actual exposure to market risk. Moreover, from a mathematical standpoint, ES fulfils all the properties of a coherent risk measure (see Section 3.1.1), as shown by Artzner et al. [3]. Also, Dowd [22] defines ES as “the most attractive coherent risk measure”. Nonetheless, the most troublesome drawback of ES arises from its lack of the elicitability property (see Definition 3.1.8). Specifically, some authors such as Gneiting [26] and Carver [10] mention that ES is not backtestable due to the fact that it is not elicitable. Nevertheless, Acerbi and Szekely [1], Kerkhof et al. [29], and other authors have done substantial work in developing non-parametric standardized backtesting procedures to test ES. In summary, Table 3.1 shows the properties that hold for VaR and ES.
  • 38. Chapter 3. Properties of Risk Measures and Introduction to VaR and ES 38

Property | VaR | ES
Coherence | ✗ | ✓
Robustness | ✓ | ✓
Elicitability | ✓ | ✗
Conditional Elicitability | ✓ | ✓

Table 3.1: Summary of Properties for VaR and ES3 3 Table taken from Emmer et al. [24]
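The elicitability of VaR recorded in Table 3.1 can be made concrete. A quantile is elicited by the “pinball” scoring function s(x, y) = (1{x ≥ y} − α)(x − y), which is strictly consistent for the α-quantile in the sense of Definition 3.1.7; the sketch below (synthetic data) minimizes the average score over a grid of candidate forecasts and recovers the empirical 5% quantile, near −1.645 for a standard normal sample.

```python
import numpy as np

def pinball_score(x, y, alpha):
    """Strictly consistent scoring function for the alpha-quantile:
    s(x, y) = (1{x >= y} - alpha) * (x - y)."""
    return ((x >= y).astype(float) - alpha) * (x - y)

rng = np.random.default_rng(1)
sample = rng.standard_normal(50_000)
alpha = 0.05

# Minimizing the average score over candidate forecasts elicits the quantile
# (Equation 3.1 with the pinball score as s).
candidates = np.linspace(-3.0, 0.0, 601)
avg_scores = [pinball_score(c, sample, alpha).mean() for c in candidates]
best = float(candidates[int(np.argmin(avg_scores))])
```

No such strictly consistent scoring function exists for ES alone, which is precisely the Gneiting [26] obstruction discussed in Section 3.1.2.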
  • 39. Chapter 4 Theoretical Background for Backtesting VaR and ES The concept of backtesting risk measures is crucial for the validation of risk models. Backtesting is essential for supervisors and risk managers who need to assess whether risk measures are well calibrated [22]. Backtesting consists of statistical and quantitative tests that verify whether a certain risk measure (in this case VaR or ES) is consistent with the assumptions of the model. In this chapter the statistical background of hypothesis testing is introduced, and a variety of backtesting procedures from the academic literature is exhibited for VaR and ES. 39
  • 40. Chapter 4. Theoretical Background for Backtesting VaR and ES 40 4.1 Statistical Background When backtesting risk measures, the procedure of hypothesis testing is crucial to assess the performance of risk measures. A hypothesis test normally defines two hypotheses: the null hypothesis (H0) and the alternative hypothesis (Ha). Usually, the objective of hypothesis testing relies on verifying whether the null hypothesis is true.

Decision | H0 True | H0 False
Reject H0 | Type I error (α) | Correct decision
Not Reject H0 | Correct decision | Type II error (β)

Table 4.1: Hypothesis Testing Summary Table

Table 4.1 describes the different cases when testing the null hypothesis. The most troublesome decision that can be made is rejecting the null hypothesis when in fact it is true. This is called a Type I error, and its probability is the significance level (α). Another term that can be found in the literature is “false positive.” Under normal circumstances, this error level is set at the beginning of the test, with values normally ranging from 0.01 to 0.05. The probabilistic interpretation of α is the chance of rejecting the null hypothesis when it is actually true. Moreover, another type of possible error when performing a hypothesis test is the so-called Type II error, or β. Specifically, β corresponds to the probability of accepting the null hypothesis when it is actually false. In the statistical literature, this is called a “false negative”. Finally, the quantity 1 − β is called the “power” of the test, that is, the probability of rejecting the null hypothesis when it is indeed false. Hence, it is desirable to obtain the highest “power” when performing hypothesis testing [11]. As the significance level decreases, it is more probable that the null hypothesis is accepted, and therefore the probability that the “true model” is rejected decreases
(Type I error). Nevertheless, this implies that it is more probable to incorrectly accept the "false" model (Type II error) [22].

When performing hypothesis testing, a test statistic is crucial. A test statistic is basically a function of a given data sample that is used to judge whether the null hypothesis is true or not. Specifically, it is compared to a certain critical value, determined by α, to check whether the null hypothesis can be rejected or not. The usual way of evaluating the test statistic is with the p-value. That is, the p-value (ρ) corresponds to the probability of observing a value at least as extreme as the realized test statistic under the null hypothesis. Hence, if ρ ≤ α holds, the null hypothesis is rejected, and if ρ > α then it is not possible to reject the null hypothesis.

In order to clarify the aforementioned definitions, an illustrative example is shown. Suppose that there is a sample of a certain random variable X.¹ The null hypothesis is that these data points are normally distributed with mean equal to 0 (µ = 0) and a certain standard deviation σ. Therefore, H0 corresponds to X ∼ N(0, σ²), and for Ha it can be said that X ∼ N(φ, σ²) where φ > 0. Now the following test statistic is taken in order to verify whether µ = 0:

z = X̄ / (σ/√n)    (4.1)

where n denotes the sample size, X̄ the sample mean and σ the given standard deviation. This is known as the "z-test." Now considering the case where X̄ = 3, n = 10 and σ = 8, then z = 2.4. Therefore, by setting the significance to α = 0.01, the p-value is

ρ = P(Z > z) = 1 − 0.9918 = 0.0082    (4.2)

¹ That is, samples of the random variable X which are independent and identically distributed.
hence ρ < α. Therefore, under the null hypothesis it is highly unlikely to observe a test statistic larger than 2.4, and hence the null hypothesis can be rejected.

4.2 Backtesting Value at Risk

The following section depicts different popular backtesting models used for VaR. Specifically, the regulatory framework as explained by Campbell [9] is exposed as the principal method used by the BCBS. Moreover, the unconditional coverage tests proposed by Li [33], Jorion [28], Kupiec [31] and Christoffersen [12] are shown. As a complement to the unconditional coverage tests, the independence test in the version of Christoffersen's Markov test [12] as well as Christoffersen and Pelletier's duration test [13] is introduced. Finally, the conditional coverage tests, which cover both the independence and the unconditional coverage property, described by Christoffersen [12] and Christoffersen and Pelletier [13], are presented (for more backtesting tests refer to Nieppola [41] and Campbell [9]).

4.2.1 Regulatory Framework

The regulatory guidelines require banks to calculate the capital needed to be set aside in order to cover non-conventional losses. The amount that should be reserved is denoted in the regulatory nomenclature as the market risk capital (MRC). The MRC is a function of the internal VaR that the financial institution calculates. Specifically, the MRC takes the highest of the following two factors: first, the traditional 1% VaR calculated over a 10-day horizon; second, the 60-day average of the previously reported 1% VaR adjusted by a factor s_t. In a mathematical
perspective, it is defined with the following formula:

MRC_t = max( VaR_t(0.01), s_t · (1/60) Σ_{i=0}^{59} VaR_{t−i}(0.01) ) + c_t    (4.3)

where c_t corresponds to the credit risk associated with the bank's portfolio. Moreover, s_t is a multiplication factor determined by the number of VaR violations in the previous 250 trading days. More specifically,

s_t = 3                  if N ≤ 4
s_t = 3 + 0.2(N − 4)     if 5 ≤ N ≤ 9
s_t = 4                  if N ≥ 10    (4.4)

where N denotes the number of violations exceeding VaR. Put into words, when the factor s_t increases (more violations of VaR in the 250 testing days), the second term inside the maximum in Equation 4.3 augments and therefore the MRC increases. This is logical, as more violations of the VaR point out that the current VaR calculation model may not be accurate and should therefore be adjusted in order to improve the MRC level.

Campbell [9] calls this technique the "traffic light" approach, as the multiplicative factor s_t is divided into three different sets: the "green" light, which logically means the least amount of VaR violations; the "amber" light, which accounts for a higher amount of VaR violations; and the "red" light, which is the maximum value taken by s_t.

4.2.2 Statistical Framework

In this section, the statistical terms that are a common denominator to the backtesting procedures are introduced.

A key term in VaR backtesting is the "hit" function, which counts how many
times the profit and loss realizations during a certain period exceed the VaR estimate. Put into a mathematical context,

I_{t+1}(α) = 1 if X_{t,t+1} ≤ −VaR_t(α)
I_{t+1}(α) = 0 if X_{t,t+1} > −VaR_t(α)    (4.5)

where X_{t,t+1} denotes the profit and loss over the period (t, t + 1). In his work from 1998, Christoffersen [12] mentions that the VaR accuracy can be determined by inspecting whether the "hit" sequence fulfils the following properties.

• Unconditional Coverage Property. The property of unconditional coverage states that the probability that a realized loss exceeds the VaR estimate should be α · 100%. In other words, P(I_{t+1} = 1) = α. As an illustrative example, let α = 0.05. In this case, it would be expected to encounter 5 VaR violations for every 100 realized returns if the VaR estimate is accurate. However, if there were more VaR violations then the VaR estimate may underestimate the "real" risk. On the other hand, if there are fewer than 5 VaR violations, then the VaR estimate may overestimate the "real" risk.

• Independence Property. This property analyses how VaR violations occur. Specifically, the independence property states that two arbitrary elements of the "hit" sequence have to be strictly independent of each other. In other words, the prior history of the "hit" sequence should not convey any kind of information on whether a future "hit" occurs. As an illustrative example, if there is clustering in the data, it may be expected that the "hit" sequence clusters in that same period. In this case, the evidence may suggest that the times of the "hit" sequence are not independent.

• Conditional Coverage Property. This property is mainly a joint test that
considers the unconditional coverage property as well as the independence property simultaneously. Campbell [9] synthesizes this property with the following statement:

I_t(α) ∼ B(α) i.i.d.    (4.6)

where B(α) denotes the Bernoulli distribution with probability α.

4.2.3 Unconditional Coverage Tests

In this section the unconditional coverage tests proposed by Li [33], Jorion [28], Kupiec [31] and Christoffersen [12] are discussed.

4.2.3.1 Violation Ratio

The following test, taken from Li [33], is composed of the following test statistic:

ζ = ( Σ_{t=1}^{T} I_t(α) ) / (T · α)    (4.7)

Put into words, if the VaR estimate were accurate then the numerator of Equation 4.7 would be close to the denominator. Therefore, the sum of the "hit" arrivals would be similar to the theoretically expected number of VaR violations. The rule of thumb used to verify the result is that 0.8 < ζ < 1.2.
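Since the thesis's computations are carried out in Matlab, the following Python sketch is purely illustrative. It implements the "hit" sequence of Equation 4.5 and the violation ratio of Equation 4.7, assuming that realized returns and one-day-ahead VaR forecasts are aligned arrays and that VaR is reported as a positive number; the function names are our own.

```python
import numpy as np

def hit_sequence(returns, var_forecasts):
    """I_t(alpha) from Equation 4.5: 1 when the realized P&L
    falls at or below -VaR_t(alpha), 0 otherwise."""
    return (np.asarray(returns) <= -np.asarray(var_forecasts)).astype(int)

def violation_ratio(returns, var_forecasts, alpha):
    """Equation 4.7: observed violations divided by the expected
    number of violations T * alpha."""
    hits = hit_sequence(returns, var_forecasts)
    return hits.sum() / (len(hits) * alpha)
```

By the rule of thumb quoted above, a ratio between 0.8 and 1.2 is considered acceptable; values above 1.2 point to a VaR model that underestimates risk.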
4.2.3.2 Failure Test

This test, exposed by Jorion [28], records the failure rate, which is calculated as the proportion of the time in which VaR violations occur. Let N be the number of exceptions and T the total number of days analysed. Hence, N/T denotes the failure rate. Ideally, α̂ = N/T should be an unbiased estimator of the violation probability α. The set-up for this test is exactly the testing framework of Bernoulli trials. That is, under the null hypothesis the number of exceptions N is binomially distributed,

P(N = k) = C(T, k) α^k (1 − α)^{T−k}    (4.8)

where the mean and variance are Tα and Tα(1 − α) respectively. In the case where T is large enough, using the Central Limit Theorem (CLT) the binomial distribution can be approximated by the normal distribution:

m = (N − αT) / √(α(1 − α)T)  →d  N(0, 1)    (4.9)

Consequently, m is approximately normally distributed, so the critical values can be obtained directly. For example, if the test is defined with a 95% level (α = 0.05), the correspondent critical value is 1.96.
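The normal approximation of Equation 4.9 can be sketched as follows (Python rather than the thesis's Matlab; the example counts are hypothetical).

```python
import math

def failure_test_statistic(n_exceptions, n_obs, alpha):
    """Equation 4.9: m = (N - alpha*T) / sqrt(alpha*(1-alpha)*T),
    approximately N(0,1) for large T under the null hypothesis."""
    return (n_exceptions - alpha * n_obs) / math.sqrt(alpha * (1 - alpha) * n_obs)

# Hypothetical example: 15 exceptions in 250 trading days at the 5% level
m = failure_test_statistic(15, 250, 0.05)
reject = abs(m) > 1.96   # compare with the 95% critical value
```

Here the observed count of 15 is close to the expected 12.5 exceptions, so the statistic stays well inside the critical bounds and the null hypothesis is not rejected.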
4.2.3.3 Proportion of Failures (POF)

The POF test, proposed by Kupiec [31], is based on the following test statistic (using the notation in [9]):

POF = 2 log( (α̂/α)^{I(α)} · ((1 − α̂)/(1 − α))^{T−I(α)} )

I(α) = Σ_{t=1}^{T} I_t(α),   α̂ = I(α)/T    (4.10)

where T denotes the number of total observations and I_t(α) is the "hit" sequence. By simple inspection of Equation 4.10 it can be seen that, if the empirical probability of VaR violations (α̂) is exactly the same as α, then the POF test statistic collapses to the value of zero. Conversely, when the empirical probability of VaR violations differs from the expected violation rate α, the POF test statistic may indicate that the VaR overestimates or underestimates the actual underlying risk. As an example, for one trading year (i.e. T = 255) with α = 0.03, one would expect to spot on average 7.65 VaR violations. If the actual number of VaR violations were 12, then α̂ = 0.047 and α = 0.03, so the POF statistic would be approximately 2.18.

A normalized version of the POF can be expressed in the following way (using the notation of [9]):

z = √T (α̂ − α) / √(α(1 − α))    (4.11)

As the test statistic z is normally distributed, the hypothesis testing procedure may be undertaken in the traditional way. In other words, the suitable critical point of a normal distribution would be compared to the realized test statistic in order to determine the acceptance or rejection of the null hypothesis.
An advantage of this approach is that the statistic z remains well defined even when there are no VaR violations at all. This fixes an anomaly encountered in the POF statistic of Equation 4.10, which is undefined when there are no VaR violations at all since log(0) is not defined [9].

4.2.3.4 Christoffersen's Unconditional Coverage Test

Christoffersen [12] proposes the following test statistic in order to test the unconditional coverage property:

CUCT = 2 log( α̂^{I(α)} (1 − α̂)^{T−I(α)} ) − 2 log( α^{I(α)} (1 − α)^{T−I(α)} )

I(α) = Σ_{t=1}^{T} I_t(α),   α̂ = I(α)/T    (4.12)

As T → ∞,

CUCT →d χ²(1)    (4.13)

For example, if α = 0.05 then the realized test statistic CUCT would be compared with the critical value of a χ² distribution with one degree of freedom.

In spite of the unconditional coverage tests' simplicity and popularity, they are haunted by an important pitfall: there is no analysis of whether the VaR violations occur in a specific fashion (i.e. they "cluster" in certain periods or occur in pairs). As a consequence, a complementary test needs to exploit the independence between various groups of VaR violations.
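The POF statistic of Equation 4.10 (equivalently, Christoffersen's unconditional coverage statistic of Equation 4.12, which is the same likelihood ratio written as a difference of log-likelihoods) can be sketched in Python as follows. The guard for zero counts mirrors the log(0) remark above, and the function name is our own.

```python
import math

CHI2_1_95 = 3.8415   # 95% critical value of the chi-square distribution, 1 dof

def pof_statistic(n_exceptions, n_obs, alpha):
    """Kupiec's POF likelihood ratio (Equation 4.10), asymptotically
    chi-square(1). Terms with a zero count are dropped, avoiding log(0)."""
    n, t = n_exceptions, n_obs
    p_hat = n / t
    log_lr = 0.0
    if n > 0:
        log_lr += n * math.log(p_hat / alpha)
    if n < t:
        log_lr += (t - n) * math.log((1 - p_hat) / (1 - alpha))
    return 2 * log_lr

# The worked example: 12 violations in T = 255 days with alpha = 0.03
stat = pof_statistic(12, 255, 0.03)
reject = stat > CHI2_1_95
```

With these inputs the statistic falls below the χ²(1) critical value, so the VaR model would not be rejected at the 5% test level.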
4.2.4 Independence Tests

In this section the independence test in the version of Christoffersen's Markov test [12] as well as Christoffersen and Pelletier's duration test [13] is introduced.

4.2.4.1 Christoffersen's Independence Test (Markov Test)

The Markov test inspects the independence property by implementing the following 2 × 2 contingency table:

                   I_t(α) = 0    I_t(α) = 1
I_{t+1}(α) = 0     T1            T3            T1 + T3
I_{t+1}(α) = 1     T2            T4            T2 + T4
                   T1 + T2       T3 + T4       T

Table 4.2: Contingency Table for Christoffersen's Markov Test²

where I_t(α) corresponds to the "hit" sequence as defined in Section 4.2.2. Moreover, T1 and T2 represent the non-violation and violation of the VaR at time t + 1 given that there was no violation in the prior time step, respectively. Conversely, T3 and T4 represent the non-violation and violation of the VaR at time t + 1 given that there was a violation in the prior time step, respectively. Ideally, if the process I_{t+1}(α) is independent, then the following should hold:

T2/(T1 + T2) = T4/(T3 + T4)    (4.14)

In other words, the proportion of VaR violations given that there was no violation in the previous time step should be the same as the proportion of VaR violations given that there was a VaR violation in the previous period. Therefore, the fact that there was or was not a violation in the previous time step does not provide any kind of

² Table taken from [13]
information on whether there is a VaR violation in the current time step, and hence the independence property holds. The test statistic is defined as follows:

CIT = −2 ln[ (1 − π)^{T1+T3} π^{T2+T4} / ( (1 − π0)^{T1} π0^{T2} (1 − π1)^{T3} π1^{T4} ) ]    (4.15)

where

π0 = T2/(T1 + T2),   π1 = T4/(T3 + T4),   π = (T2 + T4)/(T1 + T2 + T3 + T4)    (4.16)

and CIT is asymptotically distributed with a χ² distribution with one degree of freedom. For example, if α = 0.05 then the realized test statistic CIT would be compared with the critical value of a χ² distribution with one degree of freedom.

4.2.4.2 Christoffersen and Pelletier's Duration Test

Christoffersen and Pelletier [13] proposed in 2004 a different approach in order to test the independence property in VaR calculations. If VaR violations are independent of each other, the time elapsed between two VaR violations should be independent of the time that has elapsed since the last violation. In other words, Campbell [9] mentions that the time between VaR violations should not present any type of "duration dependence." Despite the sophistication of this approach, it cannot be depicted in a 2 × 2 contingency table as in the Markov test. Therefore, a whole statistical model has to be estimated for the duration between VaR violations. In their work, Christoffersen and Pelletier [13] propose the exponential distribution as the desired distribution of
the duration between VaR violations, as it possesses the memoryless property.

4.2.5 Conditional Coverage Tests

In order to have a reliable VaR measure, the independence as well as the unconditional coverage property need to be fulfilled. The following section covers the conditional coverage tests proposed by Christoffersen [12] and Christoffersen and Pelletier [13].

4.2.5.1 Joint Markov Test

The joint Markov test is based on the Markov test implemented by Christoffersen [12] (Section 4.2.4.1) combined with the unconditional coverage requirement. Invoking Table 4.2, the joint Markov test proposes the following equality in case the unconditional coverage and the independence property hold:

T2/(T1 + T2) = T4/(T3 + T4) = α    (4.17)

where α is the coverage level of the test. Specifically, the first equality corresponds to the independence property and the equality with α corresponds to the unconditional coverage property.

4.2.5.2 Christoffersen's Conditional Coverage Joint Test

Christoffersen's conditional coverage test is simply the aggregation of the unconditional coverage test [12] and the independence test [12]. The test statistics of both tests are added up to create a new test statistic CCCT:

CCCT = CIT + CUCT    (4.18)
where CCCT is distributed with a χ² distribution with two degrees of freedom. Therefore, the value of the new test statistic is compared to the correspondent critical value of a χ² distribution with 2 degrees of freedom.

4.3 Backtesting Expected Shortfall

In the case of backtesting ES, the procedure is not as direct as VaR backtesting, according to Wimmerstedt [53] and Acerbi and Szekely [1]. Some authors attribute this difficulty to the fact that ES does not fulfil the property of elicitability (see Definition 3.1.2) [26]. In this thesis, the method employed by Emmer et al. [24] and the various tests implemented by Acerbi and Szekely [1] are reviewed (for more backtesting methods refer to Clift et al. [14]).

4.3.1 Quantile Approximation

The following approach is based on a research paper by Acerbi and Tasche [2]. This method is recognized for its simplicity, as it is far less complex than the other approaches used to backtest ES. As a first step, ES is represented in terms of VaR:

ES_t(α) = (1/(1 − α)) ∫_α^1 VaR_t(k) dk    (4.19)
In the next step, dividing the interval [α, 1] into four subintervals of equal length ∆k = (1 − α)/4, the following is obtained:

k0 = [α, α + (1 − α)/4]
k1 = [α + (1 − α)/4, α + (1 − α)/2]
k2 = [α + (1 − α)/2, α + 3(1 − α)/4]
k3 = [α + 3(1 − α)/4, 1]

As a next step, approximating the integral in Equation 4.19 by a left-endpoint Riemann sum over these subintervals, the following holds:

ES_t(α) ≈ (1/(1 − α)) Σ_{i=1}^{4} VaR_t(k_{i−1}) ∆k    (4.20)

where VaR_t(k_{i−1}) denotes the VaR at the left endpoint of the i-th subinterval. Finally, by simplifying the expression (note that ∆k/(1 − α) = 1/4), the desired result is obtained:

ES_t(α) ≈ (1/4) [VaR_t(α) + VaR_t(0.75α + 0.25) + VaR_t(0.5α + 0.5) + VaR_t(0.25α + 0.75)]    (4.21)

For example, when α = 0.01 the following holds:

ES_t(0.01) ≈ (1/4) [VaR_t(0.01) + VaR_t(0.2575) + VaR_t(0.505) + VaR_t(0.7525)]    (4.22)

where the VaR_t(·) terms correspond to the backtested VaR estimates. Therefore, the various VaR estimates need to be backtested in order to determine if the ES passes the backtesting procedure. A remarkable advantage of this method is that it does not rely on Monte Carlo simulations [1]. However, due to the fact that this method is based on a linear approximation of the ES, it may sometimes be difficult to assess how many supporting points suffice in order to ensure the reliability of the backtesting procedure.
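Equation 4.21 reduces to averaging four VaR estimates, which can be sketched as follows (Python, not the thesis's Matlab; `var_fn` is a hypothetical callable returning the backtested VaR estimate at a given level k).

```python
import numpy as np

def es_quantile_approx(var_fn, alpha):
    """Equation 4.21: ES approximated as the mean of the VaR evaluated
    at the four left endpoints of the subintervals of [alpha, 1]."""
    levels = [alpha,
              0.75 * alpha + 0.25,
              0.50 * alpha + 0.50,
              0.25 * alpha + 0.75]
    return np.mean([var_fn(k) for k in levels])
```

With alpha = 0.01 the four levels are 0.01, 0.2575, 0.505 and 0.7525, matching Equation 4.22.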
4.3.2 Acerbi and Szekely Test

The following collection of non-parametric tests proposed by Acerbi and Szekely [1] is implemented using Monte Carlo simulations. As the test statistics do not have a predefined distribution, simulations need to be implemented to obtain a reliable empirical distribution. In this case, the null hypothesis states that the predicted model perfectly fits the realized model; therefore, the estimate of ES passes the backtest. Finally, it is worth noting that these are one-sided tests. In other words, the null hypothesis is rejected only if the risk measure underestimates the actual risk. Hence, the null hypothesis may be accepted with a risk measure that overestimates the actual risk.

4.3.2.1 Test I

Invoking Equation 3.5, the following holds:

ES_t(α) = −E[X_t | X_t + VaR_t(α) < 0]    (4.23)

where (X_t)_{t=1}^{T} corresponds to the series of returns. Rewriting Equation 4.23, the following equation is obtained:

E[ X_t/ES_t(α) + 1 | X_t + VaR_t(α) < 0 ] = 0    (4.24)

Using the definition of the "hit" function I_t(α) from Section 4.2.2, denoting T as the number of observations and N_T as the number of VaR violations, the following test
statistic is defined:

Z1(X) = (1/N_T) Σ_{t=1}^{T} ( X_t I_t(α) / |ES_t(α)| ) + 1    (4.25)

In the next step, the hypothesis testing is implemented by defining the following:

H0: P_t^α = F_t^α,  ∀ t    (4.26)

where P_t^α corresponds to the conditional tail distribution of P_t, which is the predicted distribution of returns (known). Moreover, F_t corresponds to the realized distribution of returns (unknown) and F_t^α denotes its conditional tail distribution.³ For the alternative hypothesis the following holds:

Ha: ES_t^F(α) ≥ ES_t(α) ∀t,  VaR_t^F(α) = VaR_t(α) ∀t    (4.27)

where ES_t^F(α) and VaR_t^F(α) denote the ES and VaR estimated from the realized returns. Put into words, under the alternative hypothesis the ES is underestimated by the model, while the VaR estimate is not rejected. Therefore, this test is only exposed to the magnitude of the violations and is independent of the violations' frequency [1]. Furthermore,

E_{H0}[Z1] = 0 and E_{H1}[Z1] < 0    (4.28)

Put into words, if the mean of the test statistic Z1 is 0 then the ES passes the

³ Acerbi and Szekely assume that the functions F_t and P_t are continuous.
backtest. However, if the mean is different from zero then there is enough evidence to show that the ES could be underestimated.

4.3.2.2 Test II

The second test is based on the unconditional representation of ES, as shown in the following formula:

ES_t(α) = −E[ X_t I_t(α) ] / α    (4.29)

where (X_t)_{t=1}^{T} corresponds to the series of returns and I_t(α) corresponds to the "hit" function from Section 4.2.2. After rearranging, the following test statistic is obtained:

Z2(X) = Σ_{t=1}^{T} ( X_t I_t(α) / (T α |ES_t(α)|) ) + 1    (4.30)

As a next step, in order to implement the hypothesis testing the following is defined:

H0: P_t^α = F_t^α,  ∀ t    (4.31)

where P_t^α corresponds to the conditional tail distribution of P_t, which is the predicted distribution of returns (known). Moreover, F_t corresponds to the realized distribution of returns (unknown) and F_t^α denotes its conditional tail distribution.⁴ Put into words, the null hypothesis H0 states that the predicted model perfectly fits the realized model; therefore, the estimate of ES passes the backtest.

⁴ Acerbi and Szekely assume that the functions F_t and P_t are continuous.
For the alternative hypothesis the following holds:

Ha: ES_t^F(α) ≥ ES_t(α) ∀t,  VaR_t^F(α) ≥ VaR_t(α) ∀t    (4.32)

where ES_t^F(α) and VaR_t^F(α) denote the ES and VaR estimated from the realized returns. Put into words, the ES is underestimated by the model compared to the realized model. Moreover, the alternative hypothesis rejects ES and VaR jointly. Therefore, this test is affected by both the magnitude as well as the frequency of the VaR violations. Additionally,

E_{H0}[Z2] = 0 and E_{H1}[Z2] < 0    (4.33)

Finally, Acerbi and Szekely [1] propose the following relationship between the two test statistics:

Z2 = 1 − (1 − Z1) · ( Σ_{t=1}^{T} I_t(α) ) / (Tα)    (4.34)
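Given series of returns and of (positive) one-day VaR and ES forecasts at level α, the statistics Z1 (Equation 4.25) and Z2 (Equation 4.30) can be sketched as follows. This is a Python sketch with illustrative names and assumed sign conventions; the Monte Carlo simulation of the null distribution needed to turn these statistics into p-values is omitted.

```python
import numpy as np

def acerbi_szekely_z1_z2(returns, var, es, alpha):
    """Z1 (Eq. 4.25) and Z2 (Eq. 4.30). `var` and `es` hold positive
    VaR/ES forecasts; a hit occurs when the return breaches -VaR."""
    returns, var, es = map(np.asarray, (returns, var, es))
    hits = returns <= -var                    # I_t(alpha)
    n_t = int(hits.sum())
    t = len(returns)
    tail = returns[hits] / es[hits]           # X_t / |ES_t(alpha)|, negative terms
    z1 = tail.sum() / n_t + 1 if n_t > 0 else float("nan")
    z2 = tail.sum() / (t * alpha) + 1
    return z1, z2

# Tiny illustration: one violation in four observations at alpha = 0.25
z1, z2 = acerbi_szekely_z1_z2([-3.0, 1.0, -0.5, 2.0], [1.0] * 4, [2.0] * 4, 0.25)
```

Values of Z1 or Z2 well below zero indicate an underestimated ES, and the identity of Equation 4.34, Z2 = 1 − (1 − Z1) · N_T/(Tα), can be checked directly on the output.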
4.3.2.3 Test III

The following approach is based on Berkowitz [7]. The test analyses whether the observed ranks U_t = P_t(X_t) are i.i.d. U(0, 1); ideally, P_t(X_t) ∼ U(0, 1). Acerbi and Szekely [1] use the following definition of ES:

ES_{t,α}^N(Y) = −(1/⌊Nα⌋) Σ_{i=1}^{⌊Nα⌋} Y_i    (4.35)

where N is the number of returns and Y corresponds to the ordered returns. Additionally, the operator ⌊·⌋ corresponds to the floor function, i.e. the largest integer not exceeding its argument. In other words, Equation 4.35 corresponds to the average of the ⌊Nα⌋ lowest returns, where Nα is the expected number of exceptions in a sample of size N. Hence, the following test statistic is proposed:

Z3 = −(1/T) Σ_{t=1}^{T} ( ES_{t,α}^T(P_t^{-1}(U)) / E_V[ES_{t,α}^T(P_t^{-1}(V))] ) + 1    (4.36)

As already stated for Z1 and Z2 in Sections 4.3.2.1 and 4.3.2.2, the following holds:

E_{H0}[Z3] = 0 and E_{H1}[Z3] < 0    (4.37)

In this case the null hypothesis

H0: P_t = F_t ∀t    (4.38)

is tested against the alternative hypothesis

H1: P_t ⪰ F_t ∀t    (4.39)

where ⪰ stands for weak stochastic dominance.
Chapter 5

Backtesting VaR and ES with the Generated Data

In this chapter a methodology is proposed in order to analyse the backtesting procedures for VaR and ES using data enriched with the properties of volatility clustering and fat tails. The analysis and the results are then presented, bearing in mind the potential advantages and/or drawbacks of dealing with these stylized facts.
Methodology

The methodology is exposed below and its structure is as follows.

1. Data generation. As a first step, data is generated based on the GARCH(1,1) model and the Student's t distribution. Moreover, the generated dataset is divided into an "in-sample" and an "out-of-sample" subset. Specifically, the "in-sample" subset is used to estimate the VaR and ES, and the "out-of-sample" subset is used for the backtesting procedures.

2. Computation of risk measures. In this step, a detailed explanation of the estimation of VaR and ES based on the "in-sample" subset is given. Moreover, the use of extra simulations guarantees the robustness of the estimations.

3. Backtesting of risk measures using selected tests. In this step, the performance of the estimates of VaR and ES calculated in the previous step is analysed using the "out-of-sample" subset. Specifically, a selection of the tests exposed in Sections 4.2 and 4.3 is implemented.

4. Analysis of results. As a final step, an analysis is carried out to assess the statistical significance and viability of the estimations.

5.1 Data Generation

In this section, the data of the JPMM stock is used to estimate the parameters for the GARCH(1,1) model and the Student's t distribution. Furthermore, the "in-sample" and "out-of-sample" subsets are generated. Specifically, for the "in-sample" set a total number of 7,000 paths composed of 10,000 simulations is calculated to provide a robust workspace for the estimation of VaR and ES in the next section. The "out-of-sample" set is constituted by one path of 10,000 simulations.
It is worth noting that the "out-of-sample" data is independent of the "in-sample" set. This ensures the independence of the estimation and the validation of the model.

5.1.1 Volatility Clustering

As already mentioned in Section 2.1.1, the GARCH(1,1) model is used to capture the volatility clustering property that is observed in financial time series.

First, using the daily prices of the JPMM stock from the year 1983 (the official start date of the financial time series) to 2015,¹ the daily continuously compounded returns are calculated, as depicted in Figure 5.1.

Figure 5.1: Daily Returns of JPMM

Second, before fitting the GARCH(1,1) process to the JPMM compounded daily returns, according to Tsay [49], significant autocorrelations need to be eliminated from the data. In other words, it needs to be tested whether there exist autocorrelations in the JPMM returns. In order to address this, the Ljung-Box test is undertaken and the graphs of the autocorrelation and partial autocorrelation are inspected (for

¹ Prices obtained from www.yahoofinance.com.
more information about this test refer to Ljung and Box [35]). Moreover, another condition to ensure the suitability of the GARCH(1,1) model is based on testing whether the residuals are serially correlated, as stated by Da Rocha [19]. Specifically, there needs to be evidence of an outstanding ARCH effect. This property is verified using Engle's test (for more information refer to Engle [25]).

Figure 5.2 depicts the sample autocorrelation function as well as the sample partial autocorrelation function. As can be seen, the sample autocorrelations are not significantly different from zero at a significance level of α = 0.05. This can be corroborated with the Ljung-Box test in Table 5.1. As the p-value of the Ljung-Box test is bigger than the significance level α = 0.05, there is not enough evidence to reject the null hypothesis, which states that the residuals are not serially autocorrelated. Moreover, analysing the same table, the Engle ARCH effect test presents a p-value of 0 and therefore shows that there exists an ARCH effect in the data. In summary, there is no further statistical treatment needed for the return series of JPMM, as it already lacks autocorrelation and possesses the ARCH effect. Given the prior statistical analysis, it is indeed reasonable to use the GARCH(1,1) model to fit the data.

Test                     Test statistic   Critical value   p-value
Ljung-Box test           28.2229          31.4104          0.1042
Engle ARCH effect test   472.3581         3.8415           0

Table 5.1: Statistical Tests for the Residuals of the Return Series of JPMM with α = 0.05

As a next step, the parameters of the GARCH(1,1) model are estimated.² Table 5.2 depicts the estimated parameters together with the correspondent standard errors.

² The estimation of the parameters is calculated based on the maximum likelihood approach using the built-in econometrics toolbox in Matlab 2016a.
Figure 5.2: Sample Autocorrelation Function and Sample Partial Autocorrelation Function

Parameter   Estimated value   Standard error
ω           0.0321078         0.00404445
β           0.91575           0.0021361
α           0.0830329         0.00164358

Table 5.2: Parameter Estimates and Standard Errors for the Fitted GARCH(1,1) Model

Figure 5.3 graphs the conditional variance jointly with the return time series for a selected path. As can be seen from the graph, the volatility clustering effect is present in the simulation of the GARCH(1,1) model.³

Finally, Figure 5.4 graphs the sample correlogram for the conditional variance and the returns, respectively, for the same path as Figure 5.3. Specifically, for the conditional variance there exists a high dependence with respect to the previous variance. That is expected due to the fact that the GARCH(1,1) model is highly dependent on the volatility of the previous time step. However, the correlation slowly deteriorates as the lag between variances increases.

³ The simulations are calculated using the Monte Carlo method in the built-in econometrics toolbox in Matlab 2016a.
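The simulation step can be illustrated outside Matlab by writing out the GARCH(1,1) recursion directly. The following Python sketch uses the point estimates from Table 5.2 with standard normal innovations (an assumption made for simplicity; the thesis's toolbox set-up may differ) and checks for volatility clustering through the first-order autocorrelation of squared returns.

```python
import numpy as np

# Point estimates from Table 5.2
OMEGA, BETA, ALPHA = 0.0321078, 0.91575, 0.0830329

def simulate_garch11(n, rng):
    """Simulate r_t = sigma_t * z_t with
    sigma_t^2 = omega + alpha * r_{t-1}^2 + beta * sigma_{t-1}^2."""
    z = rng.standard_normal(n)
    var = np.empty(n)
    r = np.empty(n)
    var[0] = OMEGA / (1.0 - ALPHA - BETA)   # unconditional variance as the seed
    r[0] = np.sqrt(var[0]) * z[0]
    for t in range(1, n):
        var[t] = OMEGA + ALPHA * r[t - 1] ** 2 + BETA * var[t - 1]
        r[t] = np.sqrt(var[t]) * z[t]
    return r, var

rng = np.random.default_rng(0)
returns, cond_var = simulate_garch11(10_000, rng)
# Volatility clustering appears as positive autocorrelation in squared returns
sq = returns ** 2
acf1 = np.corrcoef(sq[:-1], sq[1:])[0, 1]
```

Since α + β ≈ 0.9988 is close to (but below) one, shocks to the conditional variance are highly persistent, which is exactly the slow decay of the correlogram discussed above.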
Figure 5.3: Simulated Conditional Variance and Returns for the Fitted GARCH(1,1) Model for a Selected Path

Figure 5.4: Sample Autocorrelations for the Conditional Variance (top) and Returns (bottom)
5.1.2 Fat Tails

As already mentioned in Section 2.2, the Student's t distribution is used to embody the fat tails property that is observed in financial time series.

First, using the daily compounded returns of the JPMM stock from the year 1983 to 2015, the parameters of the Student's t distribution are estimated.⁴ Table 5.3 depicts the estimated values for the fitted Student's t distribution.

Figure 5.5 portrays the fit of the Student's t distribution to the observed empirical data. Specifically, the empirical distribution of the JPMM returns is graphed in conjunction with the fitted Student's t distribution with the estimated parameters of Table 5.3.

Parameter   Estimated value   Standard error
µ           0.0269772         0.0192303
σ           1.40222           0.021351
ν           2.82064           0.104641

Table 5.3: Estimates and Standard Errors of the Fitted Student's t Distribution

Second, Table 5.4 presents the kurtosis of the simulated Student's t distribution in comparison with that of a normal distribution. This suggests the presence of fat tails in the estimated model. Finally, the "in-sample" data is simulated from the fitted distribution.⁵

Distribution                      Kurtosis
Fitted Student's t distribution   30.4492
Fitted normal distribution        3.0797

Table 5.4: Kurtosis of the Simulated Distributions

⁴ The estimation of the parameters is calculated based on the maximum likelihood approach using the built-in econometrics toolbox in Matlab 2016a.
⁵ The simulations are calculated using the Monte Carlo method (inverse CDF approach) in Matlab 2016a.
Figure 5.5: Fitted Student's t Distribution vs. Empirical Distribution
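The fat-tail effect summarized in Table 5.4 can be reproduced in outline with the fitted parameters of Table 5.3. This Python sketch uses numpy's Student's t sampler in place of the thesis's inverse-CDF Matlab routine; the exact kurtosis figure will differ from Table 5.4 across runs, since with ν ≈ 2.82 the theoretical kurtosis is infinite and the sample kurtosis varies strongly.

```python
import numpy as np

MU, SIGMA, NU = 0.0269772, 1.40222, 2.82064   # Table 5.3 point estimates

def sample_kurtosis(x):
    """Plain (non-excess) sample kurtosis, as reported in Table 5.4."""
    x = np.asarray(x)
    d = x - x.mean()
    return (d ** 4).mean() / ((d ** 2).mean() ** 2)

rng = np.random.default_rng(1)
t_sample = MU + SIGMA * rng.standard_t(NU, size=100_000)   # fat-tailed returns
normal_sample = rng.standard_normal(100_000)

k_t = sample_kurtosis(t_sample)             # far above the Gaussian benchmark
k_normal = sample_kurtosis(normal_sample)   # close to 3
```

Comparing the two kurtosis values reproduces the qualitative message of Table 5.4: the fitted t model places much more mass in the tails than a Gaussian alternative.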
5.2 Computation of Risk Measures

In this section, the estimates of VaR and ES are calculated with the "in-sample" data generated in Section 5.1. It is worth noting that the significance levels α = 0.05, 0.025, 0.01 are used to compute the risk measures, as these are the ones most used in practice.

5.2.1 Computation of Value at Risk

The approach used for the calculation of VaR is the Monte Carlo method. As already described in Section 3.2, one of the most important disadvantages of the Monte Carlo method is its extremely high computational cost [41]. However, the method is very useful for treating complex processes like the GARCH(1,1) model.

First, for the GARCH(1,1) model, as there is no predefined distribution that models the process, an iterative simulation approach is implemented in order to obtain a reliable and robust VaR estimate. Specifically, the VaR estimate is calculated with the following procedure.

1. For every sample path, the 7,000 simulated returns are sorted from lowest to highest.
2. In order to find the return that corresponds to the VaR, the corresponding index ι_α is computed with the following formula

   ι_α = c(Nα)    (5.1)

   where N corresponds to the total number of simulations, in this case 7,000, and 1 − α stands for the desired confidence level. Furthermore, c(·) denotes the ceiling function, which rounds its input up to the nearest integer, so that ι_α ∈ ℕ.
3. Once ι_α is calculated, it is substituted into the corresponding sorted vector to obtain the estimated VaR value. In other words,

   VaR_t^k(α) = ϑ^k(ι_α)    (5.2)

   where ϑ^k(·) corresponds to the sorted vector of returns calculated in the first step and k denotes the simulation path being analysed (k = 1, ..., 10,000) [52].

The simulations for each path of the GARCH(1,1) model are used to calculate a single VaR estimate. By the Law of Large Numbers (LLN) (refer to Durrett [23] for a definition), if the estimate is averaged over many simulated paths, it converges to its mean, in this case the VaR estimate. Moreover, the LLN applies whenever the random variable (here, the sampling procedure) has a bounded variance. In particular, the GARCH(1,1) model has a finite variance, as Equation 2.3 holds for the estimated parameters.

Figure 5.6 presents the cumulative mean of the VaR estimates for selected values of α. The graph suggests that the cumulative mean stabilizes as more VaR simulations are averaged.

Sinharay [47] proposes running mean plots as a useful way to validate whether the Monte Carlo method converges appropriately: if the running mean plot stabilizes, the algorithm has converged. Figure 5.7 depicts the moving average for selected values of α using a window of 1,000 observations as well as the overall mean. As can be observed, the moving average remains stationary and close to the overall mean.
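The three-step procedure above can be sketched as follows. This is an illustrative Python version (the thesis implementation is in Matlab 2016a), and the toy simulated path is a hypothetical stand-in for one GARCH(1,1) sample path.

```python
import math
import numpy as np

def mc_var(simulated_returns, alpha):
    """Monte Carlo VaR for one sample path, following steps 1-3:
    sort the N simulated returns, compute the ceiling index
    i_alpha = c(N * alpha), and read off the corresponding sorted return."""
    sorted_r = np.sort(simulated_returns)   # step 1: lowest to highest
    n = len(sorted_r)
    i_alpha = math.ceil(n * alpha)          # step 2: Equation 5.1
    return sorted_r[i_alpha - 1]            # step 3: Equation 5.2 (1-based index)

rng = np.random.default_rng(1)
# Illustrative stand-in for one path of 7,000 simulated GARCH(1,1) returns;
# the thesis then averages this estimate over k = 1, ..., 10,000 paths.
path = rng.standard_t(df=3, size=7000)
print(f"VaR(0.05) estimate for this path: {mc_var(path, 0.05):.4f}")
```

Note the `i_alpha - 1` shift: Equation 5.1 indexes the sorted vector from 1, while Python arrays are 0-based.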
Figure 5.6: Cumulative Mean for the VaR of the GARCH(1,1) Model with Selected Values of α.

Figure 5.7: Moving Average with a Window of 1,000 Observations for the VaR of the GARCH(1,1) Model with Selected Values of α.
In the case of the Student's t distribution, the estimation of VaR is less complex. Specifically, as the Student's t distribution is a predefined probability distribution with specified parameters, the calculation of VaR reduces to finding the corresponding quantile for a given α.

For the sake of completeness, the same simulation procedure is undertaken as for the GARCH(1,1) model, in order to verify that the simulations indeed converge to the quantile value given by the probability distribution. Simulations within each path are generated using the inverse cumulative distribution function (CDF) approach⁶. Furthermore, the convergence test suggested by Sinharay [47] is redundant when implemented for the Student's t distribution, as it is a parametric distribution with a well-defined probability density function.

Figure 5.8 portrays the cumulative mean taken from the Student's t simulation. As can be observed, the cumulative mean of the simulations converges to the quantile value of the distribution almost immediately.

Finally, Table 5.5 presents the VaR estimates for the GARCH(1,1) model as well as the Student's t distribution. As can be seen in the table, the magnitude of the VaR values for every α is higher for the GARCH(1,1) model than for the Student's t distribution. This may be due to the fact that the implied GARCH(1,1) simulated distribution possesses a higher frequency of extreme values due to the volatility clustering effect. For the Student's t distribution, by contrast, although extreme values are theoretically present, their frequency is probably not high enough to affect the VaR estimate.

⁶ For more information regarding this method of generating random numbers with the desired distribution, refer to a standard statistics book, for example, Dowd [22].
Additionally, the uniform random numbers needed are generated with the rand() function in Matlab 2016a.
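The inverse-CDF approach described above can be sketched as follows. This is an illustrative Python sketch rather than the Matlab rand()-based implementation, with the fitted parameter values of Table 5.3 assumed.

```python
import numpy as np
from scipy import stats

# Fitted parameters from Table 5.3 (assumed here for illustration).
nu, mu, sigma = 2.82064, 0.0269772, 1.40222
alpha = 0.05

# Inverse-CDF sampling: uniform draws mapped through the t quantile function,
# the analogue of feeding Matlab's rand() output into the inverse CDF.
rng = np.random.default_rng(2)
u = rng.uniform(size=7000)
simulated = stats.t.ppf(u, df=nu, loc=mu, scale=sigma)

# For a parametric distribution, VaR is simply the alpha-quantile ...
var_exact = stats.t.ppf(alpha, df=nu, loc=mu, scale=sigma)
# ... and the empirical quantile of the simulations converges to it.
var_mc = np.quantile(simulated, alpha)
print(f"quantile VaR = {var_exact:.4f}, simulated VaR = {var_mc:.4f}")
```

The exact quantile here is the "real value" the cumulative mean in Figure 5.8 converges to.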
Figure 5.8: Cumulative Mean for the VaR of the Student's t Distribution with Selected Values of α in Conjunction with the Quantile Values Derived from the Distribution

Model                                     -VaR
GARCH(1,1), α = 0.05                      -5.8043
GARCH(1,1), α = 0.025                     -8.2585
GARCH(1,1), α = 0.01                      -12.2022
Student's t distribution, α = 0.05        -3.3603
Student's t distribution, α = 0.025       -4.6004
Student's t distribution, α = 0.01        -6.6754

Table 5.5: Estimates of VaR for the GARCH(1,1) Model and the Student's t Distribution with the Corresponding α.
5.2.2 Computation of Expected Shortfall

In order to calculate the Expected Shortfall, Equation 3.5 is used: the average is taken over all losses that exceed the VaR value calculated in Section 5.2.1. This method is applied to both the GARCH(1,1) model and the Student's t distribution. In order to find a reliable estimate of ES for both models, the arithmetic mean is proposed; specifically, the ES estimate is calculated as the average of the ES values over the different paths.

Firstly, Figure 5.9 describes the cumulative average of the GARCH(1,1) ES estimate. As can be seen in the graph, the cumulative mean stabilizes after some iterations. As already mentioned in Section 5.2.1, Figure 5.10 serves as a visual test of whether convergence is reached for the value of the ES: a moving average with a time window of 1,000 observations is implemented to test whether the moving averages vary with respect to each other. After inspecting the graph, the moving average indeed remains stationary and close to the overall mean.

Secondly, the ES estimate for the Student's t distribution is computed and analysed. Because the Student's t distribution is a parametric probability distribution, convergence of the simulations to the "real value" is expected to be reached rapidly. For the sake of completeness, Figure 5.11 presents the cumulative mean for the ES value of the Student's t distribution. As expected, convergence occurs quickly.

Table 5.6 shows that the ES values obtained for the GARCH(1,1) model are greater in magnitude than those calculated for the Student's t distribution.

It can be noted that the "in-sample" calculations of VaR and ES are kept fixed and thus do not take into account the "out-of-sample" innovations.
This is important, as risk measures should be exposed to data different from the data used to estimate them. The fact that the risk measure is not constantly adapting to new data provides better insight into the potential pitfalls of its performance.
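The per-path ES computation and the averaging across paths described above can be sketched as follows. This is an illustrative Python sketch (the thesis uses Matlab), with 10 short toy paths standing in for the 10,000 paths of 7,000 draws.

```python
import numpy as np

def mc_es(simulated_returns, alpha):
    """ES per Equation 3.5: the average of all simulated losses that
    exceed the VaR threshold obtained as in Section 5.2.1."""
    var = np.quantile(simulated_returns, alpha)       # VaR threshold
    tail = simulated_returns[simulated_returns <= var]  # losses beyond VaR
    return tail.mean()

rng = np.random.default_rng(4)
# Per-path ES values averaged across paths, as proposed in the text
# (toy dimensions: 10 paths of 7,000 simulated returns each).
paths = rng.standard_t(df=3, size=(10, 7000))
es_per_path = np.array([mc_es(p, 0.05) for p in paths])
print(f"averaged ES estimate: {es_per_path.mean():.4f}")
```

Because ES averages the whole tail beyond VaR, each per-path ES is necessarily more negative than the corresponding per-path VaR.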
Figure 5.9: Cumulative Mean for the ES of the GARCH(1,1) Model with Selected Values of α

Figure 5.10: Moving Average with a Window of 1,000 Observations for the ES of the GARCH(1,1) Model with Selected Values of α.
Figure 5.11: Cumulative Mean for the ES of the Student's t Distribution with Selected Values of α.

Model                                  -VaR       -ES        |VaR − ES|
GARCH(1,1), α = 0.05                   -5.8042    -9.6233    3.8191
GARCH(1,1), α = 0.025                  -8.2585    -12.5765   4.3180
GARCH(1,1), α = 0.01                   -12.2021   -16.9243   4.7222
Student's t distribution, α = 0.05     -3.3606    -5.6972    2.3366
Student's t distribution, α = 0.025    -4.6004    -7.5066    2.9062
Student's t distribution, α = 0.01     -6.6754    -10.6152   3.9398

Table 5.6: Estimates of VaR and ES for Both Methods with Various Confidence Levels α
5.3 Backtesting Value at Risk and Expected Shortfall Using Selected Tests

In the next step, the VaR and ES estimates from Section 5.2 are backtested with the "out-of-sample" data. Particularly, a selection of the tests presented in Sections 4.2 and 4.3 is implemented.

5.3.1 Backtesting Value at Risk

In order to backtest the VaR estimates obtained in Section 5.2.1, a selection of the tests introduced in Section 4.2 is implemented; specifically, the collection of Christoffersen's [12] tests. The key reasons for choosing this group of tests are the following. First, the collection provides an overall assessment of the most important properties that VaR should fulfil in order to be a reliable market risk estimate. Second, these tests have a predefined distribution for the test statistic, namely the χ² distribution, which makes the hypothesis testing more robust.

In the first step, the unconditional coverage property is tested using the test introduced in Section 4.2.3.4. In the second step, Markov's test from Section 4.2.4.1 is undertaken to assess the independence property. Finally, the conditional coverage property, which assesses the overall performance of the backtesting procedure, is tested using the conditional coverage test presented in Section 4.2.5.2.

Before proceeding with these tests, the "hit" function, as defined in Section 4.2.2, is analysed in order to provide further insight into the backtesting procedure. Figures 5.12 to 5.17 illustrate the performance of the VaR estimate with respect to the "out-of-sample" data. Figures 5.12a, 5.13a, 5.14a, 5.15a, 5.16a and 5.17a show the "out-of-sample" data in conjunction with the estimated "in-sample" VaR
threshold. Moreover, these graphs present in red the simulated returns that exceed the estimated VaR threshold. On the other hand, Figures 5.12b, 5.13b, 5.14b, 5.15b, 5.16b and 5.17b present the cumulative sum of the "hit" function.

Figures 5.15 to 5.17 present the GARCH(1,1) generated data. In Figures 5.15a, 5.16a and 5.17a it can be seen that the returns show volatility clustering and therefore the VaR is surpassed consecutively. This leads to the "hit" function having drastic jumps, as depicted in Figures 5.15b, 5.16b and 5.17b.

Figures 5.12 to 5.14 present the Student's t distribution generated data. Figures 5.12a, 5.13a and 5.14a show that there is no clear volatility clustering effect in the data; however, some returns are more extreme than those observed in the GARCH(1,1) generated data. Moreover, Figures 5.12b, 5.13b and 5.14b show that VaR violations arrive in a more uniform fashion compared to the GARCH(1,1) data. This may influence the independence property when Christoffersen's statistical test is implemented further on.

Finally, it is interesting to note for both models that, as α decreases, the plots behave in a more erratic and discontinuous fashion. In the case of the Student's t distribution, the graph starts to lose the shape of a straight line; similarly, for the GARCH(1,1) model, the sudden clusters of VaR violations occur in a more accentuated way.
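The "hit" function and the cumulative sums plotted in Figures 5.12 to 5.17 can be sketched as follows. This is an illustrative Python sketch, assuming the Table 5.3 parameters for the out-of-sample draws and the Table 5.5 VaR threshold at α = 0.05.

```python
import numpy as np

def hit_function(out_of_sample, var_estimate):
    """'Hit' function of Section 4.2.2: 1 when a return falls below the
    fixed in-sample VaR threshold, 0 otherwise."""
    return (np.asarray(out_of_sample) < var_estimate).astype(int)

rng = np.random.default_rng(3)
# Illustrative out-of-sample Student's t returns (Table 5.3 parameters);
# the cumulative sum of the hits is the (roughly straight) line plotted
# in Figures 5.12b-5.14b.
r = 0.027 + 1.402 * rng.standard_t(df=2.82, size=10000)
hits = hit_function(r, var_estimate=-3.3603)   # Table 5.5, alpha = 0.05
cumulative = np.cumsum(hits)
print(f"violations: {hits.sum()} (about 500 expected at alpha = 0.05)")
```

For GARCH(1,1) data the same function produces the step-like jumps of Figures 5.15b to 5.17b, since clustered volatility makes consecutive hits arrive in bursts.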
(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.12: Backtesting VaR with Student's t Distribution Generated Data with α = 0.05

(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.13: Backtesting VaR with Student's t Distribution Generated Data with α = 0.025

(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.14: Backtesting VaR with Student's t Distribution Generated Data with α = 0.01
(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.15: Backtesting VaR with GARCH(1,1) Generated Data with α = 0.05

(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.16: Backtesting VaR with GARCH(1,1) Generated Data with α = 0.025

(a) Returns and VaR Estimate (b) Sum of VaR Violations
Figure 5.17: Backtesting VaR with GARCH(1,1) Generated Data with α = 0.01
Now, focus is given to the statistical tests proposed at the beginning of this section. Tables A.1 to A.6 provide a summary of the selected statistical tests: the test statistic as well as the p-value are presented in order to test whether the null hypothesis can be rejected. In this collection of statistical tests, the null hypothesis stands for the desired property; for instance, the null hypothesis of the independence test is that VaR violations are independent.

Firstly, Tables A.1 to A.3 show the summarized statistics for the GARCH(1,1) model. Starting with the unconditional coverage analysis, the p-value indicates that there is enough evidence to reject the null hypothesis at the various significance levels. Hence, the "hit" function lacks the unconditional coverage property; that is, the total number of exceptions embodied by the "hit" function does not match the expected theoretical number of exceptions given by Tα. Likewise, using Equation 4.7, a violation ratio of approximately 0.5 is obtained, meaning that fewer VaR violations occur than expected. In summary, the VaR estimate of the GARCH(1,1) model overestimates the "true" risk value. For example, Figure 5.15 shows that when α = 0.05, almost 300 observations surpass the VaR, compared to the 10,000 · 0.05 = 500 expected in theory. In practice, this would give a very conservative calculation of the actual risk and may lead to a suboptimal use of resources.

Secondly, Tables A.1 to A.3 show there is enough evidence to reject the hypothesis that the "hit" function is independent. This is in line with what is visualized in Figures 5.15 to 5.17: the volatility clustering effect provides solid evidence that the "hit" function is not independent.
This is confirmed by the rule of thumb in Equation 4.14, which shows a large disparity between the probability of the arrival of a VaR violation given no violation in the previous period and the probability of
the arrival of a VaR violation given a VaR violation in the previous period.

Finally, the conditional coverage test, which jointly tests both of the above-stated properties, is redundant here, as both have already been rejected for the GARCH(1,1) model. Hence, the conditional coverage property does not hold.

Proceeding with the analysis of the Student's t distribution, Tables A.4 to A.6 present the statistical tests for this model. First, the analysis of the unconditional coverage property across all the selected significance levels does not provide enough evidence to reject the null hypothesis. In other words, the statistical test supports the fact that the arrivals of VaR violations match those stated by the model. This is supported by the fact that the violation ratio, as calculated in Equation 4.7, lies between the desired bounds of 0.8 and 1.2. For example, when α = 0.05, the actual number of violations is very close to 500, as shown in Figure 5.12b, while the number of theoretical violations is 10,000 · 0.05 = 500. In summary, the expected VaR violations closely match the "out-of-sample" realized VaR violations.

Second, the independence property of the Student's t distribution's "hit" function holds, as there is not enough evidence to reject the hypothesis that the "hit" function is independent. Similarly, this is indicated by the close magnitude of both sides of Equation 4.14: the probability of the arrival of a VaR violation given no violation in the previous period is similar to that given a VaR violation in the previous period.

Finally, the analysis of the conditional coverage property does not show enough evidence to reject the hypothesis that the Student's t distribution fulfils the conditional coverage property.
This is expected, as the test combines the unconditional coverage and the independence properties, both of which indeed hold.
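The three likelihood-ratio tests used in this subsection can be sketched as follows. This is an illustrative Python implementation in the spirit of Christoffersen's tests (the thesis uses Matlab, and the exact formulations are those of Sections 4.2.3.4, 4.2.4.1 and 4.2.5.2); the periodic toy hit sequence is an assumption for the example.

```python
import numpy as np
from scipy import stats
from scipy.special import xlogy   # xlogy(0, 0) = 0 handles empty cells

def christoffersen_tests(hits, alpha):
    """Unconditional coverage (LR_uc ~ chi2(1)), independence of the
    first-order Markov chain of hits (LR_ind ~ chi2(1)), and conditional
    coverage (LR_cc = LR_uc + LR_ind ~ chi2(2)). Returns (stat, p-value)."""
    hits = np.asarray(hits, dtype=int)
    T, x = len(hits), int(hits.sum())
    pi = x / T
    # LR_uc: observed violation rate pi versus the nominal rate alpha.
    lr_uc = -2 * (xlogy(T - x, (1 - alpha) / (1 - pi)) + xlogy(x, alpha / pi))
    # Transition counts of the Markov chain of hits.
    prev, curr = hits[:-1], hits[1:]
    n00 = np.sum((prev == 0) & (curr == 0)); n01 = np.sum((prev == 0) & (curr == 1))
    n10 = np.sum((prev == 1) & (curr == 0)); n11 = np.sum((prev == 1) & (curr == 1))
    pi01, pi11 = n01 / (n00 + n01), n11 / (n10 + n11)
    pi1 = (n01 + n11) / (T - 1)
    def loglik(p, stay, move):
        return xlogy(stay, 1 - p) + xlogy(move, p)
    # LR_ind: one common transition probability versus two state-dependent ones.
    lr_ind = -2 * (loglik(pi1, n00 + n10, n01 + n11)
                   - loglik(pi01, n00, n01) - loglik(pi11, n10, n11))
    lr_cc = lr_uc + lr_ind
    return {"uc": (lr_uc, stats.chi2.sf(lr_uc, 1)),
            "ind": (lr_ind, stats.chi2.sf(lr_ind, 1)),
            "cc": (lr_cc, stats.chi2.sf(lr_cc, 2))}

# Toy example: hits at exactly the nominal rate, arriving periodically.
hits = np.tile([0] * 19 + [1], 500)
results = christoffersen_tests(hits, alpha=0.05)
print(results)
```

With the violation rate exactly at α, LR_uc is zero (p-value 1), matching the intuition behind the unconditional coverage results discussed above.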
5.3.2 Backtesting Expected Shortfall

In order to backtest the ES estimates, a selection of the tests introduced in Section 4.3 is implemented. Specifically, Tests I and II from Acerbi and Szekely [1] are used for the backtesting procedure. These tests were selected because they are non-parametric, as mentioned in Section 4.3.2, and therefore do not assume any kind of return distribution. Within the tests, the unconditional coverage for ES is tested; the independence property does not need to be tested, as it is equivalent to the one assessed in the VaR section.

A significant difference between this backtesting method and the methods implemented for backtesting VaR is that the test statistic does not have a predefined distribution. As a consequence, simulations need to be implemented in order to obtain a reliable empirical distribution for the test statistic. Acerbi and Szekely [1] propose the following guideline to calculate the empirical p-value.

1. Simulate independent and identically distributed samples of a certain return distribution, R̊_t^j ~ R_t for all t and all j = 1, ..., N, where N corresponds to the number of simulated paths.
2. Compute the test statistic Z^j = Z(R̊^j) based on the simulated returns.
3. Assess the test statistic by calculating its empirical p-value, which is determined as follows:

   ρ = (1/N) Σ_{j=1}^{N} 1{Z^j < Z(←R)}    (5.3)

   where Z(←R) corresponds to the "out-of-sample" realized value of the test statistic.

Finally, as already mentioned in Section 4.3.2, the null hypothesis of these tests stands for the fact that the ES estimate is a good estimate of the market risk and,
therefore, the ES estimate passes the backtest. Nevertheless, this is a one-sided test: the null hypothesis is rejected only if the risk measure underestimates the actual risk. Hence, the null hypothesis may fail to be rejected even when the risk measure overestimates the actual risk.

5.3.2.1 Acerbi and Szekely Test I

First, Test I, as explained in Section 4.3.2.1, is carried out. The test statistic is recalled:

Z₁(X) = Σ_{t=1}^{T} [X_t I_t / |ES_t(α)|] / N_T + 1

Since the "in-sample" ES estimate is held fixed over time, this can be rearranged as

Z₁(X) = [(1/N_T) Σ_{t=1}^{T} X_t I_t] / |ES(α)| + 1

so that the numerator embodies the "out-of-sample" estimate of the ES while the "in-sample" estimate appears in the denominator. Hence, if the "out-of-sample" and "in-sample" estimates of the ES are identical, Z₁(X) = 0, meaning that the "in-sample" estimate of the ES passes the backtest successfully. Conversely, when there is a significant difference between the two estimates, the null hypothesis is rejected and the ES underestimates the actual risk.

Acerbi and Szekely [1] mention that in order to perform this test, an estimate of VaR needs to be available due to the presence of I_t(α). Furthermore, as already introduced in Section 4.3, the authors note that Z₁(X) is an average over the VaR exceptions; it is therefore sensitive to the exceptions' magnitude but not to their frequency.
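Test I and the empirical p-value guideline above can be sketched as follows. This is an illustrative Python sketch: the Student's t null, the fixed VaR and ES thresholds (taken from Tables 5.3 and 5.6), and the path and simulation counts are assumptions for the example, not the thesis's exact configuration.

```python
import numpy as np
from scipy import stats

# Assumed for illustration: Student's t null with Table 5.3 parameters and
# the alpha = 0.05 in-sample estimates of Table 5.6.
NU, MU, SIGMA = 2.82064, 0.0269772, 1.40222
VAR_IN, ES_IN = -3.3606, -5.6972

def z1_statistic(returns, var_in, es_in):
    """Z1(X) = [ (1/N_T) * sum of returns beyond VaR ] / |ES| + 1.
    Equals 0 when the realized tail mean matches the (negative) in-sample ES."""
    exceptions = returns[returns < var_in]
    if exceptions.size == 0:
        return np.nan                 # no VaR exceptions: Z1 undefined
    return exceptions.mean() / abs(es_in) + 1

rng = np.random.default_rng(5)

def simulate_z1():
    """Steps 1-2: one i.i.d. path under H0 and its Z1 statistic."""
    path = stats.t.rvs(df=NU, loc=MU, scale=SIGMA, size=10000, random_state=rng)
    return z1_statistic(path, VAR_IN, ES_IN)

# Step 3: empirical p-value, Equation 5.3 (200 simulated paths here).
realized = stats.t.rvs(df=NU, loc=MU, scale=SIGMA, size=10000, random_state=rng)
z_realized = z1_statistic(realized, VAR_IN, ES_IN)
z_sims = np.array([simulate_z1() for _ in range(200)])
p_value = np.mean(z_sims < z_realized)
print(f"Z1 = {z_realized:.4f}, empirical p-value = {p_value:.3f}")
```

Since the "realized" data here is drawn under the null, Z₁ lands near zero and the p-value is unremarkable; a strongly negative Z₁ with a small p-value would indicate that the ES estimate underestimates the risk.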