Let X1, X2, …, Xn be i.i.d. random variables (e.g. account balances, market returns, cash flows, profit/loss results…) with distribution function $F(x) = P(X \le x)$.
If we expect that extreme values of X might occur occasionally (e.g. bank-run withdrawals, extraordinary losses, slumps in asset values…), we can approximate (and extrapolate) the tail of this distribution with the Generalized Pareto distribution.
The tail of the distribution is defined by a specific threshold µ, hence we are looking for "peaks over threshold" µ. The choice of µ is arbitrary but critical: on one side, we should fit the Generalized Pareto distribution on the (heavy) tail only; on the other side, a sufficient number of observations above µ is necessary to estimate its parameters.
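The trade-off in the choice of µ can be illustrated with a small base R sketch. The data, the candidate thresholds and the names `candidates` and `summary_table` are assumptions made up for this illustration; the sketch only counts exceedances and computes their mean excess for each candidate.

```r
# Illustrative sample (made-up data): mostly normal values plus a few extremes
set.seed(42)
x <- c(rnorm(980, mean = 10, sd = 5), runif(20, min = 10, max = 50))

# Hypothetical candidate thresholds
candidates <- c(15, 20, 25, 30)

# For each candidate: number of exceedances and mean excess over the threshold
summary_table <- data.frame(
  threshold   = candidates,
  n_exceed    = sapply(candidates, function(u) sum(x > u)),
  mean_excess = sapply(candidates, function(u) mean(x[x > u] - u))
)
print(summary_table)
```

A higher threshold isolates the tail better but leaves fewer observations for the fit, which is visible in the shrinking `n_exceed` column.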
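In general, the distribution of exceedances y = X - µ over the threshold µ is given, for each y > 0, by the conditional distribution function

$$F_\mu(y) = P(X - \mu \le y \mid X > \mu) = \frac{F(\mu + y) - F(\mu)}{1 - F(\mu)}$$

or, expressed alternatively, as the complementary cumulative distribution function

$$\bar F_\mu(y) = 1 - F_\mu(y) = P(X - \mu > y \mid X > \mu) = \frac{1 - F(\mu + y)}{1 - F(\mu)}$$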
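As a quick numerical check of this definition, the following base R sketch computes the empirical exceedance distribution in two ways: from the empirical distribution function of X, and directly from the exceedances. The data, the threshold `mu` and the evaluation points `y` are assumptions chosen only for illustration.

```r
# Illustrative sample (made-up data)
set.seed(123)
x  <- c(rnorm(980, mean = 10, sd = 5), runif(20, min = 10, max = 50))
mu <- 20                 # assumed threshold
y  <- c(1, 5, 10, 20)    # assumed evaluation points for the exceedance

Fn <- ecdf(x)            # empirical distribution function of X

# (F(mu + y) - F(mu)) / (1 - F(mu)), with F replaced by its empirical version
cond_from_F <- (Fn(mu + y) - Fn(mu)) / (1 - Fn(mu))

# Empirical distribution function of the exceedances X - mu, given X > mu
cond_direct <- ecdf(x[x > mu] - mu)(y)

print(rbind(y, cond_from_F, cond_direct))
```

Both rows should agree, since they count the same observations between µ and µ + y relative to all observations above µ.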
 
For all observations with X > µ we denote the exceedance

$$Y = X - \mu$$
 
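We can then approximate the distribution of Y by the Generalized Pareto distribution with parameters ξ (shape) and β (scale)

$$G_{\xi,\beta}(y) = \begin{cases} 1 - \left(1 + \dfrac{\xi y}{\beta}\right)^{-1/\xi}, & \xi \neq 0 \\ 1 - e^{-y/\beta}, & \xi = 0 \end{cases}$$

where β > 0, y ≥ 0 if ξ ≥ 0, and 0 ≤ y ≤ -β/ξ if ξ < 0.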
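The Generalized Pareto distribution function and its support can be sketched in base R as follows. The helper name `gpd_cdf` and the parameter values in the usage lines are hypothetical; the POT package used in the example further down ships its own `pgpd`, which would normally be preferred.

```r
# Hypothetical helper: GPD distribution function for exceedances y
gpd_cdf <- function(y, shape, scale) {
  stopifnot(scale > 0)
  y <- pmax(y, 0)  # no probability mass below zero
  if (abs(shape) < 1e-12) {
    # limiting case shape = 0: exponential distribution
    return(1 - exp(-y / scale))
  }
  # for shape < 0 the support ends at -scale/shape; pmax() maps points
  # beyond that endpoint to probability 1
  z <- pmax(1 + shape * y / scale, 0)
  1 - z^(-1 / shape)
}

gpd_cdf(c(0, 1, 5), shape = 0.2, scale = 2)   # heavy tail, unbounded support
gpd_cdf(c(1, 4, 10), shape = -0.5, scale = 2) # bounded support, endpoint at 4
```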
 
Under certain conditions (if F is in the domain of attraction of the Fréchet, Gumbel or Weibull distribution) and for µ large enough (close to the maximal observation x_max), we can use the Generalized Pareto distribution to model the exceedance distribution

$$F_\mu(y) \approx G_{\xi,\beta}(y)$$
 
which holds for the complementary distribution functions too

$$1 - F_\mu(y) \approx 1 - G_{\xi,\beta}(y)$$
 
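which, multiplied by the probability 1 - F(µ) of exceeding the threshold, leads to an expression for the tail of the original distribution at x = µ + y > µ

$$1 - F(x) \approx \bigl(1 - F(\mu)\bigr)\bigl(1 - G_{\xi,\beta}(x - \mu)\bigr)$$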
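The multiplication step can be spelled out as a short derivation. It uses only the definition of the conditional exceedance distribution; the symbols $F_\mu$ for the exceedance distribution and $G_{\xi,\beta}$ for the Generalized Pareto distribution function are the notation assumed here, and the last line is written for the case ξ ≠ 0.

```latex
\begin{aligned}
1 - F_\mu(y) &= \frac{1 - F(\mu + y)}{1 - F(\mu)} \\
1 - F(\mu + y) &= \bigl(1 - F(\mu)\bigr)\bigl(1 - F_\mu(y)\bigr)
  \approx \bigl(1 - F(\mu)\bigr)\bigl(1 - G_{\xi,\beta}(y)\bigr) \\
1 - F(x) &\approx \bigl(1 - F(\mu)\bigr)
  \left(1 + \frac{\xi\,(x - \mu)}{\beta}\right)^{-1/\xi},
  \qquad x = \mu + y
\end{aligned}
```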
 
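Any quantile x above the threshold can then be found with the quantile function Q(p) at probability p > F(µ) (for ξ ≠ 0):

$$Q(p) = \mu + \frac{\beta}{\xi}\left[\left(\frac{1 - p}{1 - F(\mu)}\right)^{-\xi} - 1\right]$$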
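The tail quantile can be sketched in base R by inverting the tail approximation 1 - F(x) ≈ (1 - F(µ))(1 + ξ(x - µ)/β)^(-1/ξ). The function name `tail_quantile`, the argument `zeta` (standing for the exceedance probability 1 - F(µ)) and all parameter values below are assumptions for illustration, not fitted results.

```r
# Hypothetical helper: quantile of X in the tail, valid for p > 1 - zeta
tail_quantile <- function(p, threshold, scale, shape, zeta) {
  stopifnot(scale > 0, zeta > 0, zeta <= 1, all(p > 1 - zeta), all(p < 1))
  if (abs(shape) < 1e-12) {
    # limiting case shape = 0 (exponential tail)
    return(threshold - scale * log((1 - p) / zeta))
  }
  threshold + scale / shape * (((1 - p) / zeta)^(-shape) - 1)
}

# Made-up parameters: threshold 20, scale 8, shape 0.1, 3% of data above 20
q <- tail_quantile(0.999, threshold = 20, scale = 8, shape = 0.1, zeta = 0.03)

# Round trip: plugging q back into the tail approximation should return p
1 - 0.03 * (1 + 0.1 * (q - 20) / 8)^(-1 / 0.1)
```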
 
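With the estimated parameters β and ξ of the Generalized Pareto distribution and the estimated probability ζ of exceeding the threshold µ (the share of observations above µ), we estimate the distribution for x > µ as

$$\hat F(x) = 1 - \zeta\left(1 + \frac{\xi\,(x - \mu)}{\beta}\right)^{-1/\xi}$$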
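A minimal base R sketch of this tail estimator follows. The exceedance probability is estimated as the share of observations above the threshold; the function name `tail_cdf`, the sample and the scale and shape values are placeholders assumed for illustration (in practice the two parameters would come from a fit such as the one in the example below).

```r
# Hypothetical helper: estimated distribution function of X for x >= threshold
tail_cdf <- function(x, threshold, scale, shape, zeta) {
  y <- x - threshold
  stopifnot(all(y >= 0), scale > 0)
  if (abs(shape) < 1e-12) {
    return(1 - zeta * exp(-y / scale))  # limiting case shape = 0
  }
  1 - zeta * pmax(1 + shape * y / scale, 0)^(-1 / shape)
}

# Illustrative sample (made-up data)
set.seed(1)
obs <- c(rnorm(980, mean = 10, sd = 5), runif(20, min = 10, max = 50))
mu  <- 20                       # assumed threshold
zeta_hat <- mean(obs > mu)      # share of observations above the threshold

# Placeholder parameters scale = 8 and shape = 0.1 (not fitted)
tail_cdf(c(20, 30, 40), threshold = mu, scale = 8, shape = 0.1, zeta = zeta_hat)
```

At x = µ the estimator equals 1 - ζ, i.e. the empirical share of observations at or below the threshold.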
 
Example in R
library(POT)
par(mfrow=c(3,3))
# generate a normal distribution with some extremes
n=1e3
ext=runif(20,10,50)
# n-length(ext) normal values, so that the sample has n observations in total
x=c(rnorm(n = n-length(ext),mean = 10,sd = 5),ext)
hist(x,n=100)
# find threshold
mrlplot(data=x, nt=20)
tcplot(data=x,nt=20)
# fit gpd on right tail
loc=20
mle=fitgpd(data=x, threshold=loc, est="mle")
print(mle$param)
plot(mle,which = c(1,3)) # show p-p plot and histogram
# generate random values from this distribution
hist(rgpd(n=1e3,loc=loc,scale=mle$param[1],shape=mle$param[2]),n=100)
# calculate the GPD 99.9% quantile and compare with the empirical quantile
# lambda is the share of observations at or below the threshold
qgpd(p=0.999, loc=loc,scale=mle$param[1],shape=mle$param[2],lambda=mean(x<=loc))
quantile(x,0.999)
 