
VIETNAM NATIONAL UNIVERSITY
UNIVERSITY OF SCIENCE
FACULTY OF MATHEMATICS, MECHANICS AND INFORMATICS
Dao Minh Phuong
EXTREME VALUE THEORY
AND APPLICATIONS TO
FINANCIAL MARKET
Undergraduate Thesis
Advanced Undergraduate Program in Mathematics
Thesis advisor: Dr. Luu Hoang Duc
Hanoi - 2012
Contents

Chapter 1. Extreme Value Theory
  1.1. Introduction
  1.2. Block Maxima method
    1.2.1. Limiting Behavior of Maxima and Extrema
    1.2.2. Fisher-Tippett Theorem (1928)
  1.3. The Peaks-over-Thresholds (POT) method
    1.3.1. The Generalized Pareto Distribution (GPD)
    1.3.2. The POT method
    1.3.3. Pickands-Balkema-de Haan theorem

Chapter 2. Applications: Some theoretical computations
  2.1. Block Maxima Method
    2.1.1. Maximum Likelihood Estimation
    2.1.2. Hill estimator
    2.1.3. Value at Risk
    2.1.4. Block Maxima Method Approach to VaR
    2.1.5. Multiperiod VaR
    2.1.6. Return Levels
  2.2. The POT method
    2.2.1. The selection of threshold u
    2.2.2. Expected Shortfall
    2.2.3. POT method approach to VaR and Expected Shortfall

Chapter 3. Applications: Empirical computations
  3.1. Stock market
  3.2. Case study: the Crash in 1987
  3.3. Risk measure computations using R: case study for Coca-Cola stock
    3.3.1. Block Maxima Method
    3.3.2. POT method
ACKNOWLEDGMENTS

I am grateful to all those who gave their time and support to this thesis. Foremost among them is my advisor and instructor, Dr. Luu Hoang Duc. Thank you for your dedication, patience, enthusiasm, motivation and immense knowledge. I highly appreciate your encouragement throughout the work on this thesis.

Second, my sincere thanks go to all professors and lecturers of the Faculty of Mathematics, Mechanics and Informatics for their help throughout my university life at the Hanoi University of Science.

Last but not least, I would like to thank my family and my friends from the K53 Advanced Mathematics program, who always supported and advised me during my university life.
Introduction

Conceptually, mathematics and finance have a lot in common: both deal every day with the changes of the economy and the factors that lead to those changes. The subject of "financial mathematics" was created to apply mathematical theories and results in order to model and predict the quantitative changes that might lead to serious damage in the financial world and in economies, and to suggest the best responses to those phenomena.

The purpose of this thesis is to highlight the significant conceptual overlap between mathematics and finance and to demonstrate the possibility of advancing financial research through the application of a mathematical theory called "Extreme Value Theory". Extreme Value Theory is a branch of statistics that studies extreme deviations from the median of probability distributions. It seeks to assess, from a given ordered sample of a given random variable, the probability of events that are more extreme than any observed before. These values are important since they usually describe the times of the greatest gains or losses. The theory provides the solid foundation needed for the statistical modeling of such events and for the computation of extreme financial risk. Devastating floods, tornadoes, earthquakes, market crashes, etc. are all real-world phenomena modeled by extreme value theory.

This thesis consists of three chapters. The first chapter presents Extreme Value Theory and some theoretical results: the block maxima and the Peaks-over-Threshold methods. The next chapter introduces some applications in financial markets, such as Value at Risk, return levels and the choice of threshold. The last chapter contains some empirical computations using R.
CHAPTER 1
Extreme Value Theory
1.1. Introduction
In this chapter, we introduce extreme value theory as it appears in the statistical literature. Basically, extreme value theory, when modeling the maxima of a random variable, plays the same role as the central limit theorem does for modeling sums of variables. Both theories describe limiting distributions.
Let us consider a random variable representing daily returns. Basically, there are two methods for identifying extremes in real data. The first method considers the maxima of the variable in consecutive periods, such as months or years. These selected observations form the extreme events, called block maxima. In the left panel of Figure 1.1, the observations X_2, X_5, X_7, X_11 are the block maxima for four periods of three observations each. In contrast, the second method focuses on the realizations exceeding a given threshold. In the right panel, all the observations X_1, X_2, X_7, X_8, X_9, X_11 exceed the threshold u and compose the extreme events.

The block maxima method is the traditional approach used to analyze data with occasional extremes, as for instance hydrologic data. Nevertheless, the Peaks-over-Thresholds (POT) method uses data more efficiently, so it has become popular in recent applications.

In the following sections, the block maxima and POT methods are presented. They are mostly based on Embrechts [2].
1.2. Block Maxima method
In this part, we will consider the so-called Generalized Extreme Value distribution (GEV for short) and the most important result, the Fisher-Tippett theorem.

Figure 1.1: Block maxima (left panel) and peaks-over-threshold u (right panel)

In order to do that, let us consider the limiting behaviour of maxima and extrema.
1.2.1. Limiting Behavior of Maxima and Extrema
Let X_1, X_2, ... be iid random variables with distribution function (df) F. In risk management applications, these may represent operational losses, financial losses or insurance losses.

Let M_n = max(X_1, ..., X_n) be the worst-case loss in a sample of n losses, and let the range of X_1, ..., X_n be (l, u). Obviously,

P(M_n ≤ x) = P(X_1 ≤ x, ..., X_n ≤ x)  (1.1)
           = ∏_{i=1}^{n} P(X_i ≤ x) = F^n(x).  (1.2)

It can be shown that, almost surely, M_n → x_F as n → ∞,
where x_F := sup{x ∈ R : F(x) < 1} ≤ ∞ is the right endpoint of F.
In practice, the distribution F(x) is unknown and therefore the cumulative distribution function (cdf for short) of M_n is also unknown. Nevertheless, as n goes to infinity, F^n(x) → 0 if x < u and F^n(x) → 1 if x > u; that is, F^n(x) becomes degenerate. This degenerate cdf does not have any practical value. Hence, extreme value theory is interested in finding sequences of real numbers a_n > 0 and b_n such that (M_n − b_n)/a_n, the sequence of normalized maxima, converges in distribution, i.e.

P((M_n − b_n)/a_n ≤ x) = F^n(a_n x + b_n) → H(x) as n → ∞,  (1.3)

for some non-degenerate distribution function H(x).
If this condition holds, then F is in the maximum domain of attraction of H, and we write F ∈ MDA(H). H is determined only up to the location sequence b_n and scale sequence a_n; thus it defines a unique type of distribution.

For more details and a comprehensive treatment of extreme value theory, we refer to Embrechts [2].
Let M_n^* = (M_n − b_n)/a_n. Under the independence assumption, the limiting distribution of the normalized maxima M_n^* is given by

H_ξ(x) = exp(−(1 + ξx)^{−1/ξ})  if ξ ≠ 0,
H_ξ(x) = exp(−e^{−x})  if ξ = 0,  (1.4)

with x such that 1 + ξx > 0, where ξ is the shape parameter. This parametrization is continuous in ξ.

The one-parameter representation in equation (1.4) was suggested by Jenkinson (1955) and von Mises (1954) and is known as the Generalized Extreme Value distribution (GEV). It comprises the three types of limiting distributions of Gnedenko (1943):

ξ > 0: H_ξ corresponds to the classical Fréchet df;
ξ = 0: H_ξ corresponds to the classical Gumbel df;
ξ < 0: H_ξ corresponds to the classical Weibull df.

Below we present the most important and fundamental result, the Fisher-Tippett theorem.
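For readers who want to experiment numerically, the three cases of equation (1.4) are easy to evaluate directly. The following is only an illustrative Python sketch (the thesis's own computations in Chapter 3 use R):

```python
import math

def gev_cdf(x, xi):
    """H_xi(x) from equation (1.4)."""
    if xi == 0.0:
        return math.exp(-math.exp(-x))          # Gumbel case
    t = 1.0 + xi * x
    if t <= 0.0:
        # Outside the support: below the Frechet left endpoint the cdf is 0,
        # above the Weibull right endpoint it is 1.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

# The parametrization is continuous in xi: tiny |xi| is close to the Gumbel case.
assert abs(gev_cdf(1.0, 1e-9) - gev_cdf(1.0, 0.0)) < 1e-6

# Tail ordering at a fixed large point: Frechet (xi > 0) has the heaviest tail.
assert gev_cdf(10.0, 0.5) < gev_cdf(10.0, 0.0) < gev_cdf(10.0, -0.5)
```

The two assertions illustrate the continuity in ξ claimed above and the tail-heaviness ordering of the three cases.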
Figure 1.2: GEV: distribution functions for various ξ
1.2.2. Fisher-Tippett Theorem (1928)
Theorem 1.1. If appropriately normalized maxima converge in distribution to a non-degenerate limit, then the limit distribution must be an extreme value distribution; in short: if F ∈ MDA(H), then H is of type H_ξ for some ξ, where H_ξ is the Generalized Extreme Value distribution.

The Fisher-Tippett theorem essentially says that the GEV is the only possible limiting distribution for normalized block maxima.

One of the main steps in applying the Fisher-Tippett theorem is to determine in which case F ∈ MDA(H_ξ) holds. The following remarks help us to do that.
1. Fréchet case (ξ > 0):
Gnedenko (1943) pointed out that for ξ > 0
Figure 1.3: GEV: densities for various ξ. The solid line is the Gumbel distribution, the dashed line is Fréchet with ξ = 0.9 and the dotted line is Weibull with ξ = −0.5
F ∈ MDA(H_ξ) ⟺ 1 − F(x) = x^{−1/ξ} L(x)

for some slowly varying function L(x). A function L on (0, ∞) is called slowly varying if

lim_{x→∞} L(tx)/L(x) = 1 for all t > 0.
Remark 1.2. The Fréchet distribution has a polynomially decaying tail and is thus well suited to heavy-tailed distributions. So if the tail of the distribution function decays like a power function, then the distribution is in MDA(H_ξ) for some ξ > 0.
Example 1.3. A typical example is the Pareto distribution,

F(x) = 1 − (K/(K + x))^α,  α, K > 0, x ≥ 0,

which is in MDA(H_{1/α}) (Fréchet case) if we take a_n = Kn^{1/α}/α, b_n = Kn^{1/α} − K.

Other heavy-tailed distributions such as the Burr, Cauchy, log-gamma, t-distributions and various mixture models also belong to the Fréchet family. Some of their moments may be infinite.
2. Gumbel case: F ∈ MDA(H_0)
The characterization of this class is more complex. Generally it includes distributions whose tails decay roughly exponentially; we call these thin-tailed or light-tailed distributions. All moments exist for distributions in the Gumbel class.

Example 1.4. The exponential distribution: F(x) = 1 − e^{−λx}, λ > 0, x ≥ 0. If we take a_n = 1/λ, b_n = (log n)/λ, ξ = 0, then F(x) is in MDA(H_0).
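A quick numerical check of this example (not from the thesis, a Python illustration): with these norming constants, F^n(a_n x + b_n) = (1 − e^{−x}/n)^n, which visibly approaches the Gumbel cdf exp(−e^{−x}) as n grows.

```python
import math

lam = 2.0  # arbitrary illustrative rate

def Fn_normalized(x, n):
    """F^n(a_n*x + b_n) for F(x) = 1 - exp(-lam*x), a_n = 1/lam, b_n = log(n)/lam.
    Algebraically this equals (1 - exp(-x)/n)^n."""
    a_n = 1.0 / lam
    b_n = math.log(n) / lam
    return (1.0 - math.exp(-lam * (a_n * x + b_n))) ** n

gumbel = lambda x: math.exp(-math.exp(-x))

# Worst-case deviation from the Gumbel limit over a few test points, for growing n:
errors = [max(abs(Fn_normalized(x, n) - gumbel(x)) for x in (-1.0, 0.0, 1.0, 2.0))
          for n in (10, 100, 10000)]

# The deviation shrinks as n grows, illustrating convergence to H_0.
assert errors[0] > errors[1] > errors[2]
```

Note the rate λ cancels out of the normalized maxima, as expected from the theory.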
3. Weibull case (ξ < 0):
The Weibull distribution is the asymptotic distribution of maxima of distributions with a finite endpoint. Examples are the uniform and beta distributions.
Remark 1.5. • Essentially all commonly encountered continuous distributions are in the maximum domain of attraction of an extreme value distribution.
• From Figure 1.3 we also see that the right tail of the distribution falls exponentially for the Gumbel case and like a power function for the Fréchet case, and it is finite for the Weibull case. In risk measurement we are mostly interested in the Fréchet family, which contains the stable and Student-t distributions.
• The normalizing sequences a_n and b_n can always be chosen so that the limit H_ξ has standard form without rescaling or relocation.
The Fisher-Tippett theorem has two influential implications. First of all, the tail behaviour of the cdf F(x) determines the limiting distribution H_ξ(x) of the normalized maxima; therefore, extreme value theory is applicable to a huge range of distributions. Second, the tail index ξ does not depend on the time interval of M_t, i.e. it is stable under time aggregation.
Next, we introduce location and scale parameters μ and σ > 0 to absorb the unknown norming constants and work with

P(M_n ≤ x) ≈ H_ξ((x − μ)/σ) := H_{ξ,μ,σ}(x),  x ∈ D,  (1.5)

D = (−∞, μ − σ/ξ)  if ξ < 0,
D = (−∞, ∞)  if ξ = 0,
D = (μ − σ/ξ, ∞)  if ξ > 0,

which is the limiting distribution of the unnormalized maxima.
Obviously, H_{ξ,μ,σ} is of type H_ξ. The parameter ξ, also called the tail index, represents the thickness of the tail of the distribution. The Fréchet distribution corresponds to fat-tailed distributions and has been found to be the most suitable for fat-tailed financial data. This result is very useful since the asymptotic distribution of the maximum is always one of these three distributions, no matter what the original distribution is.

In practice, we need to estimate ξ, μ, σ. We collect data on block maxima and then fit the three-parameter form of the GEV. This requires a lot of raw data in order to form sufficiently many, sufficiently large blocks. Methods used to estimate those parameters will be described in the next chapter.
1.3. The Peaks-over-Thresholds (POT) method
From what we have discussed above, we see that the disadvantage of the block maxima method is that if we observe data over a period of a few years and then take the maximum value for each period, we might lose some potential extreme events. For instance, suppose we are interested in modeling the rainfall of Vietnam. Given that heavy rain often occurs in the summer, we expect the most extreme rainfall to take place over a few months in summer. However, if we divide a year into twelve months to observe and then choose the highest points, we will lose some high rainfall values occurring in summer while still choosing low rainfall values in the winter. Hence, we seek an alternative approach to handle this kind of situation.
Another method, called the Peaks-over-Thresholds (POT) method, is to set a threshold above which data are taken as extreme and then collect the exceedances over that threshold. We model these data using the Generalized Pareto distribution, which gives the probability of extreme events surpassing the threshold.
1.3.1. The Generalized Pareto Distribution (GPD)

The GPD is a two-parameter distribution with distribution function

G_{ξ,β}(x) = 1 − (1 + ξx/β)^{−1/ξ}  if ξ ≠ 0,
G_{ξ,β}(x) = 1 − exp(−x/β)  if ξ = 0,  (1.6)

where β > 0, and the support is x ≥ 0 if ξ ≥ 0 and x ∈ [0, −β/ξ] if ξ < 0.

Figure 1.4 below shows the shape of the GPD G_{ξ,σ}(x), where ξ, called the tail index or shape parameter, takes positive, negative and zero values. The scale parameter σ is set to 1.

Figure 1.4: Shape of the GPD G_{ξ,σ} for σ = 1

The tail index ξ gives an indication of the heaviness of the tail: the bigger ξ, the heavier the tail. E(X^k) does not exist for k ≥ 1/ξ. In general, we cannot fix an upper bound for financial losses, so only distributions with shape parameter ξ ≥ 0 are suited to model financial return distributions.

ξ > 0: Pareto (reparametrized version);
ξ = 0: exponential;
ξ < 0: Pareto type II.
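Equation (1.6) is easy to evaluate directly. The short Python sketch below (illustrative only, with arbitrary parameter values) also confirms two properties noted in the text: a larger ξ gives a heavier tail, and ξ → 0 recovers the exponential case continuously.

```python
import math

def gpd_cdf(x, xi, beta):
    """G_{xi,beta}(x) from equation (1.6)."""
    if x < 0.0:
        return 0.0
    if xi == 0.0:
        return 1.0 - math.exp(-x / beta)        # exponential case
    if xi < 0.0 and x >= -beta / xi:
        return 1.0                              # beyond the finite right endpoint
    return 1.0 - (1.0 + xi * x / beta) ** (-1.0 / xi)

# Heavier tail for larger xi: survival probability at a fixed point grows with xi.
surv = [1.0 - gpd_cdf(10.0, xi, 1.0) for xi in (0.1, 0.5, 1.0)]
assert surv[0] < surv[1] < surv[2]

# xi -> 0 recovers the exponential case continuously.
assert abs(gpd_cdf(2.0, 1e-8, 1.0) - gpd_cdf(2.0, 0.0, 1.0)) < 1e-6
```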
1.3.2. The POT method

In this part, we give theoretical foundations for the second approach. The key notions to remember are the excess distribution and the Pickands-Balkema-de Haan theorem.

The excess distribution: Consider an unknown distribution function F of a random variable X, and let u be a high threshold. We are interested in estimating the distribution function F_u of the values of X above the threshold u.

Figure 1.5: Distribution function F and conditional distribution function F_u

The distribution function F_u is called the conditional excess distribution function and is defined as

F_u(x) = P(X − u ≤ x | X > u) = (F(x + u) − F(u))/(1 − F(u)),

for 0 ≤ x ≤ x_F − u, where x_F ≤ ∞ is the right endpoint of F.

The realizations of the random variable X lie primarily between 0 and u, so the estimation of F on this interval generally does not pose any problem. The estimation of the portion F_u, however, can be hard, as we have in general very few observations in this area.
Example 1.6. 1. The exponential distribution:

F(x) = 1 − e^{−λx},  λ > 0, x ≥ 0;  F_u(x) = F(x),  x ≥ 0.

This distribution has the "memory-less" property.

2. The GPD: if F(x) = G_{ξ,β}(x), then

F_u(x) = G_{ξ,β+ξu}(x),

where 0 ≤ x < ∞ if ξ ≥ 0 and 0 ≤ x < −β/ξ − u if ξ < 0.

The excess distribution of a GPD is again a GPD with the same shape parameter; only the scale changes.
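The stability property in part 2 of this example can be verified numerically. The Python sketch below (with arbitrary illustrative values of ξ, β and u) checks that the excess distribution of a GPD equals G_{ξ,β+ξu}:

```python
def gpd_cdf(x, xi, beta):
    # G_{xi,beta} as in equation (1.6); here xi > 0, so the support is x >= 0.
    return 1.0 - (1.0 + xi * x / beta) ** (-1.0 / xi)

xi, beta, u = 0.3, 2.0, 1.5   # illustrative values only

def excess_cdf(x):
    """F_u(x) = (F(x+u) - F(u)) / (1 - F(u)) with F = G_{xi,beta}."""
    F = lambda t: gpd_cdf(t, xi, beta)
    return (F(x + u) - F(u)) / (1.0 - F(u))

# Example 1.6(2): the excess df of a GPD is again a GPD with scale beta + xi*u.
for x in (0.0, 0.5, 1.0, 5.0):
    assert abs(excess_cdf(x) - gpd_cdf(x, xi, beta + xi * u)) < 1e-12
```

The identity holds exactly in the algebra; the tolerance only absorbs floating-point rounding.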
1.3.3. Pickands-Balkema-de Haan theorem

Extreme value theory is very helpful as it provides a powerful result about the conditional excess distribution function, stated in the following theorem.

Theorem 1.7 (Pickands (1975), Balkema and de Haan (1974)). For a large class of underlying distribution functions F and the conditional excess distribution function F_u(x), for u large, we can find a function β(u) such that

lim_{u→x_F} sup_{0≤x<x_F−u} |F_u(x) − G_{ξ,β(u)}(x)| = 0  (1.7)

if and only if F ∈ MDA(H_ξ), ξ ∈ R.
Basically, all the common continuous distributions used in risk management or insurance mathematics are in MDA(H_ξ) for some value of ξ.

The Pickands-Balkema-de Haan theorem explains the importance of the Generalized Pareto Distribution (GPD): the GPD is the natural model for the unknown excess distribution over sufficiently high thresholds. For a large class of underlying distribution functions F, the conditional excess distribution function F_u(x), for u large, is well approximated by

F_u(x) ≈ G_{ξ,β}(x),

for some ξ and β. To estimate these parameters we fit the GPD to the excess amounts over the threshold u. Standard properties of maximum likelihood estimators apply if ξ > −0.5.

In order to implement the POT method, we have to choose a suitable threshold u. There are data-analytic tools such as the mean excess plot to help us, although later simulations will suggest that inference is often robust to the choice of threshold.
CHAPTER 2
Applications: Some theoretical computations
Extreme value theory has many applications in financial markets and risk management. However, within the framework of this thesis, we can only present some applications of extreme value theory to the computation of Value at Risk, return levels and expected shortfall. As mentioned above, two main methods are used to estimate these values.

This chapter is mostly based on McNeil [6] and S. Tsay [8].
2.1. Block Maxima Method
2.1.1. Maximum Likelihood Estimation
Maximum likelihood estimation is a method of estimating the parameters of a statistical model. Suppose we are given block maxima data y = (M_n^(1), ..., M_n^(m)) from m blocks of size n, and we need to estimate θ = (ξ, μ, σ). We build a log-likelihood by assuming that we have independent observations from the GEV with density h_θ:

l(θ; y) = log( ∏_{i=1}^{m} h_θ(M_n^(i)) 1_{{1+ξ(M_n^(i)−μ)/σ > 0}} ),  (2.1)

where

h_θ(M_n^(i)) = (1/σ)(1 + ξ(M_n^(i)−μ)/σ)^{−1/ξ−1} exp(−(1 + ξ(M_n^(i)−μ)/σ)^{−1/ξ})  if ξ ≠ 0 and 1 + ξ(M_n^(i)−μ)/σ > 0,
h_θ(M_n^(i)) = (1/σ) exp(−(M_n^(i)−μ)/σ) exp(−exp(−(M_n^(i)−μ)/σ))  if ξ = 0,  (2.2)

and we maximize this with respect to θ to get the MLE θ̂ = (ξ̂, μ̂, σ̂).
Procedure to find the MLE:
• Define the likelihood function L(θ).
• Take the natural logarithm ln L(θ).
• Differentiate ln L(θ) with respect to θ and equate the derivative to 0.
• Solve for the parameter θ to obtain θ̂.
• Check whether it is a maximizer or a global maximizer.
Let us consider examples of some general distribution functions to understand how it works.
Example 2.1. • For one parameter, let us consider the geometric distribution, which has probability mass function

f(x; p) = p(1 − p)^{x−1},  0 ≤ p ≤ 1.

Hence, the likelihood function is

L(p) = ∏_{i=1}^{n} p(1 − p)^{x_i−1} = p^n (1 − p)^{−n + ∑_{i=1}^{n} x_i}.

Taking the natural logarithm of L(p), we have

ln L = n ln p + (−n + ∑_{i=1}^{n} x_i) ln(1 − p).

Taking the derivative with respect to p, we obtain

d ln L/dp = n/p − (−n + ∑_{i=1}^{n} x_i)/(1 − p).

Equating d ln L/dp to zero and solving for p, we get

p = n/∑_{i=1}^{n} x_i = 1/x̄.

Hence, we get the maximum likelihood estimator of p as

p̂ = n/∑_{i=1}^{n} X_i = 1/X̄.
• For two parameters, let us consider the normal distribution N(μ, σ²). The density function for a normal variable is given by

f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)).

Hence, the likelihood function is

L(μ, σ²) = ∏_{i=1}^{n} (1/(σ√(2π))) exp(−(x_i − μ)²/(2σ²)) = (1/((2π)^{n/2} σ^n)) exp(−∑_{i=1}^{n}(x_i − μ)²/(2σ²)).

Let θ = σ², then take the natural logarithm to get

ln L(μ, θ) = −(n/2) ln(2π) − (n/2) ln θ − ∑_{i=1}^{n}(x_i − μ)²/(2θ),

∂ln L(μ, θ)/∂μ = ∑_{i=1}^{n}(x_i − μ)/θ,

∂ln L(μ, θ)/∂θ = −n/(2θ) + ∑_{i=1}^{n}(x_i − μ)²/(2θ²).

Setting both derivatives equal to zero simultaneously, we obtain

μ̂ = X̄,  σ̂² = θ̂ = ∑_{i=1}^{n}(X_i − X̄)²/n.
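As a quick sanity check (not from the thesis), the closed-form estimates above can be verified numerically on a small hypothetical sample: any perturbation of (μ̂, θ̂) should decrease the log-likelihood.

```python
import math

# Fixed sample; hypothetical data, purely for illustration.
x = [1.2, -0.4, 0.7, 2.1, 0.0, -1.3, 0.9, 1.8]
n = len(x)

mu_hat = sum(x) / n
theta_hat = sum((v - mu_hat) ** 2 for v in x) / n   # MLE of sigma^2

def loglik(mu, theta):
    """ln L(mu, theta) for the normal model, as derived above."""
    return (-0.5 * n * math.log(2 * math.pi) - 0.5 * n * math.log(theta)
            - sum((v - mu) ** 2 for v in x) / (2 * theta))

best = loglik(mu_hat, theta_hat)
# Perturbing either parameter can only lower the log-likelihood.
for dmu in (-0.2, 0.2):
    assert loglik(mu_hat + dmu, theta_hat) < best
for dth in (-0.3, 0.3):
    assert loglik(mu_hat, theta_hat + dth) < best
```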
When applying this to the GEV distribution, bias and variance obviously have to be balanced in defining the blocks: we reduce bias by increasing the block size n, and we reduce variance by increasing the number of blocks m.
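To illustrate the block-maxima fitting idea end to end, the sketch below (Python with the standard library only; the thesis's own empirical work in Chapter 3 uses R) simulates block maxima of exponential data and fits the Gumbel (ξ = 0) member of the GEV by maximum likelihood, using the standard fixed-point iteration for the scale parameter. The fitted location and scale should land near the norming constants log(n)/λ and 1/λ of Example 1.4.

```python
import math, random

rng = random.Random(42)
lam, block, m = 1.0, 500, 2000   # illustrative block size and block count

# Block maxima of iid Exp(lam) samples; by Example 1.4 these are approximately
# Gumbel with location log(block)/lam and scale 1/lam.
maxima = [max(rng.expovariate(lam) for _ in range(block)) for _ in range(m)]

def fit_gumbel(data, iters=100):
    """MLE for the xi = 0 (Gumbel) member of the GEV via the standard
    fixed-point iteration sigma = mean(x) - sum(x*w)/sum(w), w = exp(-x/sigma)."""
    n = len(data)
    mean = sum(data) / n
    # Moment-based starting value: Gumbel sd is sigma*pi/sqrt(6) ~ 1.283*sigma.
    sigma = math.sqrt(sum((v - mean) ** 2 for v in data) / n) * 0.78
    for _ in range(iters):
        w = [math.exp(-v / sigma) for v in data]
        sigma = mean - sum(v * wi for v, wi in zip(data, w)) / sum(w)
    mu = -sigma * math.log(sum(math.exp(-v / sigma) for v in data) / n)
    return mu, sigma

mu_hat, sigma_hat = fit_gumbel(maxima)
assert abs(sigma_hat - 1.0 / lam) < 0.1
assert abs(mu_hat - math.log(block) / lam) < 0.15
```

Increasing m tightens the estimates (lower variance) while increasing the block size reduces the approximation bias, mirroring the tradeoff described above.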
2.1.2. Hill estimator

Another way to compute the shape parameter ξ is to use the Hill estimator. This method can be applied directly to the returns {r_t}_{t=1}^{T}. Let k be a positive integer and r_{(1)} ≤ r_{(2)} ≤ ··· ≤ r_{(T)} the order statistics; the estimator of ξ is defined as

ξ_h(k) = (1/k) ∑_{i=1}^{k} [ln(r_{(T−i+1)}) − ln(r_{(T−k)})].  (2.3)

The Hill estimator can be applied to the Fréchet case only, but it works more efficiently than the Pickands estimator whenever applicable. Goldie and Smith (1987) showed that √k [ξ_h(k) − ξ] is asymptotically normal with mean 0 and variance ξ².
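Equation (2.3) translates directly into code. The following Python sketch (illustrative only) applies it to a simulated standard Pareto sample, whose true shape is known, so the estimate can be checked:

```python
import math, random

def hill(returns, k):
    """Hill estimator of xi from equation (2.3), using the k largest observations."""
    s = sorted(returns)                      # r_(1) <= ... <= r_(T)
    T = len(s)
    log_threshold = math.log(s[T - k - 1])   # ln r_(T-k)
    return sum(math.log(s[T - i]) - log_threshold for i in range(1, k + 1)) / k

# Hypothetical data: a standard Pareto sample with tail index alpha = 2,
# i.e. true shape xi = 1/alpha = 0.5 (Frechet case).
rng = random.Random(1)
alpha = 2.0
sample = [rng.random() ** (-1.0 / alpha) for _ in range(20000)]

xi_hat = hill(sample, k=500)
assert abs(xi_hat - 0.5) < 0.15
```

Consistent with the Goldie-Smith result quoted above, the spread of ξ_h(k) around the true ξ shrinks roughly like ξ/√k.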
2.1.3. Value at Risk

There are various kinds of risk in financial markets; three main kinds are credit risk, market risk and operational risk. The concept of Value at Risk (VaR) is applicable to all types, but it is mostly concerned with market risk. VaR is a single measure of the amount by which a company's position in a risk category might decrease due to general market movements during a given holding period. For instance, if a portfolio of stocks has a one-day 5% VaR of $1 million, there is a 0.05 probability that the portfolio will fall in value by more than $1 million over a one-day period if there is no trading.

The estimate can be used by financial companies to assess their risks or by a regulatory committee to set margin requirements. In both cases, VaR is used to ensure that the financial firms can still maintain their business after a devastating event. From the viewpoint of financial companies, VaR can be defined as the maximal loss of a financial position during a given time period for a given probability; under this perspective, one considers VaR as a measure of the loss associated with an extraordinary event under normal market conditions. From the viewpoint of a regulatory committee, VaR can be defined as the minimal loss under rare market conditions. Both definitions lead to the same VaR estimate despite the different concepts.

Assume a random variable X with continuous distribution function F models losses or negative returns on a certain financial instrument over a certain time horizon. Then VaR_q is defined as the q-th quantile of the distribution F:

VaR_q = F^{−1}(q) = inf{x ∈ R : F(x) ≥ q},  (2.4)

where F^{−1} is called the quantile function.
Let us recall the conditional excess distribution function:

F_u(y) = P(X − u ≤ y | X > u) = (F(y + u) − F(u))/(1 − F(u)) = P(Y ≤ y | X > u),  y ≥ 0,  (2.5)

where X_1, ..., X_n are the random variables exceeding the threshold u and Y_1, ..., Y_n are the series of exceedances (Y_i = X_i − u).

Now we compute the tail of the distribution of the extreme observations X_i. Writing x = u + y,

P(X > u + y) = P(X > u + y | X > u) · P(X > u)  (2.6)
= P(X − u > y | X > u) · P(X > u)  (2.7)
= P(X > u)(1 − F_u(x − u)).  (2.8)
Besides, R. Smith (1987) introduced a tail estimator based on the GPD approximation to the excess distribution. Let

N_u = ∑_{i=1}^{n} 1_{{X_i > u}}  (2.9)

be the random number of observations over the threshold u from the iid sample X_1, ..., X_n. We estimate the exceedance probability P(X > u) empirically by

P(X > u) ≈ (1/n) ∑_{i=1}^{n} 1_{{X_i > u}} = N_u/n,  (2.10)

where n is the total number of observations. For a sufficiently high threshold u, F_u(x − u) ≈ G_{ξ,β(u)}(x − u), and hence we get the tail estimator

P(X > x) ≈ (N_u/n)(1 + ξ̂(x − u)/β̂)^{−1/ξ̂},  x > u.  (2.11)

A high u reduces the bias in estimating the excess function; a low u reduces the variance in estimating the excess function and P(X > u).
Inverting the tail estimator (2.11), for q > F(u) we obtain

VaR_q = u + (β̂/ξ̂)(((n/N_u)(1 − q))^{−ξ̂} − 1).  (2.12)

An asymmetric confidence interval for the quantile VaR_q may be built using the profile likelihood method.
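Equations (2.11) and (2.12) are inverses of each other, which gives a convenient sanity check. The Python sketch below uses hypothetical fitted values (u, ξ̂, β̂, N_u, n are made up for illustration):

```python
# Hypothetical fitted POT quantities, purely for illustration:
u, xi_hat, beta_hat = 0.02, 0.25, 0.008   # threshold and fitted GPD parameters
n, N_u = 2500, 120                        # sample size and number of exceedances

def tail_estimator(x):
    """Estimated tail probability P(X > x) for x > u, equation (2.11)."""
    return (N_u / n) * (1.0 + xi_hat * (x - u) / beta_hat) ** (-1.0 / xi_hat)

def value_at_risk(q):
    """VaR_q from equation (2.12), valid for q > F(u) = 1 - N_u/n."""
    return u + (beta_hat / xi_hat) * (((n / N_u) * (1.0 - q)) ** (-xi_hat) - 1.0)

# Consistency check: (2.12) is the inverse of (2.11), so plugging VaR_q back
# into the tail estimator must return exactly 1 - q.
for q in (0.96, 0.99, 0.999):
    assert abs(tail_estimator(value_at_risk(q)) - (1.0 - q)) < 1e-12
```

Note that the formula is only meaningful for confidence levels q beyond the empirical threshold probability F(u) = 1 − N_u/n.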
2.1.4. Block Maxima Method Approach to VaR

Assume that there are T observations of an asset return available in the sample period. We divide the sample period into k non-overlapping blocks of length n. If T = kn + m with 1 ≤ m ≤ n, then we delete the first m observations from the sample. Now we plug the maximum likelihood estimates into equation (1.5) with x = (r − μ_n)/σ_n to get the quantile for a given probability of the generalized extreme value distribution. Let p∗ be a small upper tail probability that expresses the potential loss and let r_n∗ be the (1 − p∗)-th quantile of the block maxima under the limiting generalized extreme value distribution. Then we have:
1 − p∗ = exp[−(1 + ξ_n(r_n∗ − μ_n)/σ_n)^{−1/ξ_n}]  if ξ_n ≠ 0,
1 − p∗ = exp[−exp(−(r_n∗ − μ_n)/σ_n)]  if ξ_n = 0,  (2.13)

where 1 + ξ_n(r_n∗ − μ_n)/σ_n > 0 for ξ_n ≠ 0. Rewriting this equation as

ln(1 − p∗) = −(1 + ξ_n(r_n∗ − μ_n)/σ_n)^{−1/ξ_n}  if ξ_n ≠ 0,
ln(1 − p∗) = −exp(−(r_n∗ − μ_n)/σ_n)  if ξ_n = 0,  (2.14)

we get the quantile

r_n∗ = μ_n − (σ_n/ξ_n){1 − [−ln(1 − p∗)]^{−ξ_n}}  if ξ_n ≠ 0,
r_n∗ = μ_n − σ_n ln[−ln(1 − p∗)]  if ξ_n = 0.  (2.15)
For the given upper tail probability p∗, the quantile r_n∗ of equation (2.15) is the VaR based on extreme value theory for the block maxima. Now we make explicit the relationship between the block maxima and the observed return series r_t. Under the independence assumption we have:

1 − p∗ = P(r_{n,i} ≤ r_n∗)
= P(r_1 ≤ r_n∗, r_2 ≤ r_n∗, ..., r_n ≤ r_n∗)
= ∏_{t=1}^{n} P(r_t ≤ r_n∗)
= [P(r_t ≤ r_n∗)]^n.  (2.16)

This relationship between probabilities allows us to obtain the VaR from the original asset return series r_t. For a small upper tail probability p, the (1 − p)-th quantile of r_t is r_n∗ if the upper tail probability p∗ of the block maxima is chosen according to equation (2.16), where P(r_t ≤ r_n∗) = 1 − p. Therefore, for a given small upper tail probability p, the VaR of a financial position with log return r_t is

VaR = μ_n − (σ_n/ξ_n){1 − [−n ln(1 − p)]^{−ξ_n}}  if ξ_n ≠ 0,
VaR = μ_n − σ_n ln[−n ln(1 − p)]  if ξ_n = 0,  (2.17)

where n is the length of a block.
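The link between (2.15), (2.16) and (2.17) can be checked mechanically: since p∗ = 1 − (1 − p)^n, we have −ln(1 − p∗) = −n ln(1 − p), so the two formulas must agree exactly. A Python sketch with hypothetical fitted GEV parameters:

```python
import math

def var_bmm(mu_n, sigma_n, xi_n, p, n):
    """VaR from equation (2.17) for the GEV fitted to n-day block maxima."""
    if xi_n == 0.0:
        return mu_n - sigma_n * math.log(-n * math.log(1.0 - p))
    return mu_n - (sigma_n / xi_n) * (1.0 - (-n * math.log(1.0 - p)) ** (-xi_n))

def quantile_bmm(mu_n, sigma_n, xi_n, p_star):
    """r_n* from equation (2.15), the (1 - p*) GEV quantile of the block maxima."""
    if xi_n == 0.0:
        return mu_n - sigma_n * math.log(-math.log(1.0 - p_star))
    return mu_n - (sigma_n / xi_n) * (1.0 - (-math.log(1.0 - p_star)) ** (-xi_n))

# Hypothetical fitted GEV parameters for 21-day block maxima of daily losses:
mu_n, sigma_n, xi_n, n = 0.025, 0.01, 0.2, 21

# Equation (2.16) gives p* = 1 - (1-p)^n, and then (2.15) coincides with (2.17).
p = 0.01
p_star = 1.0 - (1.0 - p) ** n
assert abs(var_bmm(mu_n, sigma_n, xi_n, p, n)
           - quantile_bmm(mu_n, sigma_n, xi_n, p_star)) < 1e-12
```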
2.1.5. Multiperiod VaR

The square root of time rule of the RiskMetrics methodology becomes a special case under extreme value theory. The proper relationship between the l-day and 1-day horizons is

VaR(l) = l^{1/α} VaR = l^ξ VaR,  (2.18)

where α is the tail index and ξ is the shape parameter of the extreme value distribution. This relationship is referred to as the α-root of time rule. Note that α = 1/ξ; it is not the scale parameter σ_n of equation (2.17).
2.1.6. Return Levels

Often when examining extreme data we are concerned with questions such as: "How often should we expect to observe extreme events?", "How often do we expect stock prices to fall?" or "How low will they fall?" In order to answer these questions, we need to estimate the return levels for a given return period. The return period is the amount of time we expect to wait before recording an extreme event, and the return level expresses the intensity of the event that occurs within that period. R_{n,k}, the k n-block return level, is defined by

P(M_n > R_{n,k}) = 1/k.  (2.19)

It is the level expected to be exceeded in one out of k periods of length n on average.

Let H be the distribution of the maxima observed over successive non-overlapping periods of equal length. Using the approximation by the estimated GEV,

R_{n,k} ≈ H_{ξ,μ,σ}^{−1}(1 − 1/k) = μ + (σ/ξ)((−log(1 − 1/k))^{−ξ} − 1).  (2.20)

Substituting the parameters ξ, μ, σ by their estimates ξ̂, μ̂, σ̂, we obtain

R̂_k = μ̂ − (σ̂/ξ̂)(1 − (−ln(1 − 1/k))^{−ξ̂})  if ξ̂ ≠ 0,
R̂_k = μ̂ − σ̂ ln(−ln(1 − 1/k))  if ξ̂ = 0.  (2.21)

For instance, if yearly maxima are modeled, a value R̂_50 = 10 means that the maximum loss observed during a period of one year exceeds 10% once in 50 years on average.

Note that this equation is quite similar to the VaR estimate, except that the value p∗ is replaced by 1/k. The difference between VaR and the return level is that the return level applies only to the block maxima, not to the underlying returns.
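By construction, the k-period return level of (2.21) sits at the (1 − 1/k) quantile of the fitted GEV, which gives a direct numerical check. A Python sketch with hypothetical GEV parameters:

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """H_{xi,mu,sigma}(x) as in equation (1.5), xi != 0 branch."""
    return math.exp(-(1.0 + xi * (x - mu) / sigma) ** (-1.0 / xi))

def return_level(mu, sigma, xi, k):
    """R_k from equation (2.21), xi != 0 branch."""
    return mu - (sigma / xi) * (1.0 - (-math.log(1.0 - 1.0 / k)) ** (-xi))

# Hypothetical GEV parameters for yearly maxima of losses (in percent):
mu, sigma, xi = 5.0, 2.0, 0.3

# The return level is exactly the (1 - 1/k) quantile of the fitted GEV:
for k in (10, 50, 100):
    r = return_level(mu, sigma, xi, k)
    assert abs(gev_cdf(r, mu, sigma, xi) - (1.0 - 1.0 / k)) < 1e-12
```

Longer return periods (larger k) give higher levels, matching the interpretation of R̂_k above.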
2.2. The POT method

The block maxima method is useful for calculating VaR but sometimes encounters difficulties. The point here is that the choice of the block length n is not clearly defined, so it is hard to choose an appropriate length. Moreover, this method of extreme value theory is unconditional; as a result, it may not take into account the effects of other explanatory variables. To tackle these problems, we use the POT method. That is, we care about the exceedances of the measurement over some high threshold and the times at which they take place. For instance, for a long position, suppose we choose u = −2.5% and suppose that the i-th exceedance happens at day t_i, that is, r_{t_i} ≤ u. The new objects of interest are the data (t_i, r_{t_i} − u), where r_{t_i} − u is the exceedance over the threshold u and t_i is the time at which the i-th exceedance occurs. For a short position, we can choose u = 2% and focus on the data (t_i, r_{t_i} − u) for which r_{t_i} ≥ u. The question here is how to choose such a threshold u. We know that different choices of threshold lead to different estimates of the shape parameter and the tail index.
