bivariate generalized pareto distribution in...

Introduction Extreme value models Applications

Bivariate generalized Pareto

distribution in practice

Pál Rakonczai

Eötvös Loránd University, Budapest, Hungary

Minisymposium on Uncertainty Modelling27 September 2011, CSASC 2011, Krems, Austria

Pál Rakonczai Bivariate generalized Pareto distribution

Introduction Extreme value models Applications

Outline

I Short summary of extreme value theory (EVT), including:I univariate/multivariate caseI representation of dependence structures within EVTI multivariate generalized Pareto distribution of type II

I Construction of new asymmetric models

I Properties and comparison with well-known models

I Applications on environmental data (wind speed / flood peak)


Introduction Extreme value models Applications Univariate Extremes Multivariate Extremes New models

Introduction

I There is often interest in understanding how the extremes oftwo different processes are related to each other

I One possible solution is using asymptotic approach from EVT

I EVT provides limits for the distribution of extremes for asingle/multivariate stationary process



Limit for Maximum

Theorem (Fisher and Tippet, 1928)

If there exist {an} and {bn} > 0 sequences such that

P(Mn − an

bn≤ z

)

→ G (z) as n → ∞

for some non-degenerate distribution G, then

G (x) = exp

{

−(

1 + ξx − µ

σ

)−1ξ

}

,

where 1 + ξ x−µσ > 0 and in such a case random variable belongs tothe max-domain of attraction of a GEV distribution.

→ Often applied for block maxima (monthly, yearly, ...)



Limit for Threshold Exceedances

How to model large values over a given high threshold?

Theorem (Balkema and de Haan, 1974)

Suppose, that X1, ...,Xni.i.d.∼ F , where F belongs to the

max-domain of attraction of GEV for some ξ, µ and σ > 0. Then

P(Xi − u ≤ z |Xi > u) → H(z) = 1−(

1 +ξz

σ̃

)−1ξ

as u → z+,

where σ̃ = σ + ξ(u − µ).

→ Often applied for threshold exceedances over a highquantile of the obs.



Remarks

I The limit distributions are not GaussianI Strong link between GEV and GPDI See examples below:

−2 0 2 4 6 8 10

0.00.1

0.20.3

0.4

GEV

domain

dens

ity

GumbelFrechetWeibull

2 4 6 8 10

0.00.2

0.40.6

0.81.0

GPD

domain

dens

ity

ξ= 0ξ= 0.5ξ= − 0.3



Limit for Multivariate Maxima

I Component-wise maxima of Xi = (Xi ,1, ...,Xi ,d ) is

Mn =(

n∨

i=1

Xi ,1, ...,

n∨

i=1

Xi ,d

)

.

I If X ∼ F and there exist an and bn > 0 sequences ofnormalizing vectors, such that

P(Mn − an

bn≤ z

)

= F n(bnz+ an) → G (z),

where Gi margins are non-degenerate, then G is said to be amultivariate extreme value distribution (MEVD).



Characterization I. - Resnick (1987)

I MEVD with unit Fréchet margins can be written as

GFréchet(t1, . . . , td ) = exp(

−V (t1, . . . , td ))

I V is called exponent measure

V (t1, . . . , td ) = ν(([0, t1]×· · ·×[0, td ])c ) =

∫

Sd

d∨

i=1

(wi

ti

)

S(dw),

I S spectral measure is a finite measure on the d -dimensionalunit simplex satisfying the equations

∫

Sd

wiS(dw) = 1 for i = 1, ..., d .



Some properties of the characterization I.

I Total mass of S is always S(Sd ) = d

I Density not only on the interior of Sd but also on each of thelower-dimensional subspaces of Sd

I For independent margins S({ei}) = 1 for all ei vertices of thesimplex

I For completely dependent margins S((1/d , ..., 1/d)) = d



Characterization II. (bivariate case) - Pickands (1981)

The dependence function A(t) must satisfy the following threeproperties: (P)

1. A(t) is convex2. max{(1− t), t} ≤ A(t) ≤ t3. A(0) = A(1) = 1

V (t1, t2) =( 1

t1+

1

t2

)

A( t1

t1 + t2

)

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

t

A(t

)

(0,1) (1,1)

(1/2,1/2)

CONVEX



For those who like uniform margins better

The copula of a MEVD G is the distribution function of therandom vector (G1(Y1), ...,Gd (Yd )) and is given by

C (u) = exp(

−V( −1

log u1, . . . ,

−1

log ud

))

,u ∈ [0, 1]d .

Such a copula necessarily satisfies the stability property

C s(u) = C (us1, . . . , usd )

and conversely if this property is satisfied than we get a copula of aMEVD.



Logistic Dependence (Gumbel copula)

A(t) = (tα + (1− t)α)1/α

C (u, v) = exp(−((− log u)α + (− log v)α)1/α)

Copula distribution function

u

v

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

0.0 0.2 0.4 0.6 0.8 1.0

0.00.2

0.40.6

0.81.0

α = 2.5

0.0 0.2 0.4 0.6 0.8 1.0

01

23

45

67

Spectral density

t

(α−1)(

t(1−t))

(α−2) (t

α +(1−

t)α )(1

α−2)

α = 1.5α = 2α = 2.5α = 3α = 3.5

0.0 0.2 0.4 0.6 0.8 1.0

0.40.5

0.60.7

0.80.9

1.0

Dependence function

t(tα

+(1−t)

α )(1α)

α = 1.5α = 2α = 2.5α = 3α = 3.5



Logistic BEVD and its asymmetric extension

A(t) = (1− Φ1)t + (1− Φ2)(1− t) + ((Φ1t)α + (Φ2(1− t))

α)1/α

0 ≤ Φ1,Φ2 ≤ 1

→ S({0, 1}) = 1− Φ2,S({1, 0}) = 1− Φ1

−2 0 2 4 6

−20

24

6

x

y

0.00

0.05

0.10

0.15

0.20

0.25

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.1

6

symmetric, dep=2.5

−2 0 2 4 6

−20

24

6x

y0.00

0.05

0.10

0.15

0.02 0.04

0.06

0.08

0.1 0.1

2 0.14

asy = c(0.5,0.9)

−2 0 2 4 6

−20

24

6

y

0.00

0.05

0.10

0.15

0.02

0.04

0.06 0.08

0.1

0.12

asy = c(0.5,0.5)

−2 0 2 4 6

−20

24

6

y

0.00

0.02

0.04

0.06

0.08

0.10

0.12

0.01

0.02

0.03

0.04 0.05 0.06

0.07

0.08

0.0

9

0.1

0.11

asy = c(0.5,0.1)



MGPD of type II

I Problem: taking maxima hides the time structure within theperiod..

I Might be better investigating all observations over a givenhigh threshold

Let Y = (Y1, ...,Yd ) denote a random vector, u = (u1, ..., ud ) ahigh threshold vector and so X = Y − u the vector of exceedances.The distribution of X � 0 exceedances can be well approximatedby the multivariate generalized Pareto distribution (Rootzén andTajvidi, 2006)

H(x1, . . . , xd ) =−1

logG(

0, . . . , 0) log

G(

x1, . . . , xd)

G(

x1 ∧ 0, . . . , xd ∧ 0) ,

where 0 < G (0, . . . , 0) < 1.



What is type I. then?

I If we claim exceedance in all of the coordinates (BGPD I), weusually get simpler models, with nice properties (marginals areunivariate GPD etc.)

I But we may use less data

Logistic bgpd by ’evd’

Fehmarn m/s

Hann

over

m/s

0 5 10 15 20 25

05

1015

2025

30

density curves

No Inference!

No Inference!No Inference!

0 5 10 15 20 25

05

1015

2025

30

Logistic bgpd by ’mgpd’

Fehmarn m/s

Hann

over

m/s

density curves

No Inference!



Remarks

I H1(x) = H(x ,∞) is not a one dimensional GPD, only theconditional distribution of X1|X1 > 0 is GPD.

I The ξi , µi , σi parameters cannot be considered as the ithmarginal parameters any more

I Motivating problem: asymmetric logistic model does not leadto an absolutely continuous MGPD..

TheoremLet H be a MGPD represented by an absolutely continuous MEVD

G with spectral measure S. The MGPD H is absolutely continuous

if and only if S(int(Sd )) = d holds, i.e all mass is put on theinterior of the simplex.



Overview of parametric dependence models

Model Asym. Density Complications

Sym. Logistic − X −Asym. Logistic (Tawn, 1988) X − −Ψ-logistic∗ X X convexity constr.Sym. Neg. Logistic − X −Asym. Neg. Logistic (Joe, 1990) X − −Ψ-negative logistic∗ X X convexity constr.Bilogistic (Smith, 1990) X X not explicitNeg. Bilogistic (C. & T., 1994) X X not explicitTajvidi’s (Tajvidi, 1996) − X −C-T (C. & T., 1991) X X only s(w) is givenMixed (Tawn, 1988) − if ψ = 1 −Asym. Mixed X − −



Construction of absolutely continuous BGPDs

1. take an absolutely continuous baseline dependence model;

2. take a strictly monotonic transformationΨ(x) : [0, 1] → [0, 1], such that Ψ(0) = 0, Ψ(1) = 1;

3. construct a new dependence model from the baseline modelAΨ(t) = A(Ψ(t));

4. check the constraints: A′Ψ(0) = −1, A′

Ψ(1) = 1 and A′′

Ψ ≥ 0(convexity).



An example for Ψ(x)

I Feasible transformation if

Ψ(t) = t + f (t) ⇒ (A(Ψ(t)))′ = A′(Ψ(t))× [1 + f ′(t)]

so choosing f such that f ′(0) = f ′(1) = 0, the A′Ψ(0) = −1and A′Ψ(1) = 1 are fulfilled.

I

fψ1,ψ2(t) = ψ1(t(1− t))ψ2 , for t ∈ [0, 1],

where ψ1 ∈ R and ψ2 ≥ 1 are asymmetry parameters.



Graphical examples

0.0 0.2 0.4 0.6 0.8 1.0

0.0

1.0

2.0

3.0

δ2tAαLog=2(Ψ(t))

t

Psi−Logisticψ1 = − 0.3ψ1 = − 0.15ψ1 = 0ψ1 = 0.15ψ1 = 0.3

0.0 0.2 0.4 0.6 0.8 1.0

0.0

1.0

2.0

3.0

δ2tAαNegLog=1.2(Ψ(t))

t

Psi−NegLogψ1 = − 0.3ψ1 = − 0.15ψ1 = 0ψ1 = 0.15ψ1 = 0.3

1.4 1.6 1.8 2.0 2.2

−0

.4−

0.2

0.0

0.2

0.4

Constraints of convexityfor psi−logistic models

α

ψ1

ψ2 = 1.8ψ2 = 2ψ2 = 2.2

0.9 1.0 1.1 1.2 1.3 1.4 1.5

−4

−2

02

4

Constraints of convexityfor psi−negative logistic models

α

ψ1

ψ2 = 1.8ψ2 = 2ψ2 = 2.2

Figure: Upper panels: second derivatives of Ψ-logistic and Ψ-negativePál Rakonczai Bivariate generalized Pareto distribution


Remarks

I Of course f and or the baseline dependence function can bechosen differently

I This transformation has particular impact beyond the BGPDmodels as well: applicable for BEVDs, BGPDs of type I or forcopulas of any bivariate distributions

I Although the transformations Ψ was defined now for thebivariate case only, it can be extended for the higherdimensional cases as well as e.g. in the trivariate case it canhave a form of

Ψ(t1, t2) = (t1+ψ1,1(t1([1−t2]−t1))ψ2,1 , t2+ψ2,1(t2([1−t1]−t2))

ψ2,2).



Further models - polinomial transformations

fa,p(t) =64a(p5 − p4 − 2p3 − 2p2 + 5p − 2)p5

(p + 1)2(p − 1)2(−1 + 2p)(1 + 2p2 − 3p)t6

+−32a(5p6 − 2p5 − 13p4 − 12p3 + 15p2 + 6p − 6)p4

(p + 1)2(p − 1)2(−1 + 2p)(1 + 2p2 − 3p)t5

+32a(4p7 + 3p6 − 14p5 − 15p4 + 23p2 − 8p − 2)p3

(p + 1)2(p − 1)2(−1 + 2p)(1 + 2p2 − 3p)t4

+−32a(p7 + 4p6 − 5p5 − 10p4 − 5p3 + 12p2 + 2p − 4)p3

(1 + 2p2 − 3p)(p + 1)2(4p − 1 + 2p3 − 5p2)t3

+32a(p6 − 3p4 − p2 + 4p − 2)p3

(1 + 2p2 − 3p)(p + 1)2(4p − 1 + 2p3 − 5p2)t2.



Further models - polinomial transformations

−4 −2 0 2 4

−4−2

02

4

0.00

0.05

0.10

0.15

0.20

0.25

0.30

−4 −2 0 2 4

−4−2

02

4

0.00

0.05

0.10

0.15

0.20

0.25

0.30

−4 −2 0 2 4

−4−2

02

4

0.00

0.05

0.10

0.15

0.20

0.25

0.30



BEVD case

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7


Introduction Extreme value models Applications Wind data Flood data Software

Wind data

I Daily windspeed values at two locations in northern Germany(Bremerhaven and Schleswig) over the past five decades(1957-2007)

I Originally hourly windspeed was recorded, but in order toreduce the serial correlation within the series, we calculatedthe maxima of the original observations for each day.

I The data for the above cities are rather strongly correlated,due to the relatively short distance between them (Kendall’scorrelation is 0.57) and also can be considered asasymptotically dependent

I 95% quantiles have been used as threshold level (13.8m/s,10.3m/s)

I Observations are considered as being stationary,non-stationarity can also be handled e.g. by choosingtime-dependent model parameters



Wind data

10 15 20 25

510

1520

Exceedances (m/s)

Bremerhaven

Schl

eswi

g

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Copula sample

Bremerhaven

Schl

eswi

g



Parameter estimation

I We found that maximum likelihood methods perform well ingeneral

I However there is strong pairwise correlation between theparameters µi , σi , i = 1, 2

I This can be extinguished in practice by keeping e.g. onelocation parameter fixed during the optimization process

I Estimation within this subclass gives stable convergenceresults

Prediction regions (Hall and Tajvidi,2000) have been computed -the smallest region containing an observation with a given (usuallyhigh) probability..



Prediction regions for the 4 best models(having the smallest χ2-statistics)

The effect of asymmetry parameters - relevant torsion in theregions regions:

5 10 15 20 25 30

510

1520

25

NegLog

Bremerhaven(m/s)

Schle

swig(

m/s)

Regionsγ= 0.99γ= 0.95γ= 0.9

5 10 15 20 25 30

510

1520

25

Psi−NegLog

Bremerhaven(m/s)

Schle

swig(

m/s)

Regionsγ= 0.99γ= 0.95γ= 0.9

5 10 15 20 25 30

510

1520

25

Coles−Tawn

Bremerhaven(m/s)

Schle

swig(

m/s)

Regionsγ= 0.99γ= 0.95γ= 0.9

5 10 15 20 25 30

510

1520

25

Tajvidi’s

Bremerhaven(m/s)

Schle

swig(

m/s)

Regionsγ= 0.99γ= 0.95γ= 0.9



Goodness-of-fit: Bremerhaven és Schleswigγ 0.99 0.95-0.99 0.9-0.95 0.75-0.9 0.5-0.75 0.5 χ2

Exp. 12.6 50.3 62.8 188.6 314.2 628.5 -

Log 7 30 66 216 350 588 21.5Ψ-L 9 33 60 215 349 591 16.9NL 11 31 66 215 334 600 14.0Ψ-NL 11 34 63 210 337 602 10.7Mix 12 46 71 247 331 550 30.3C-T 12 34 65 209 330 607 9.1Tajv. 11 35 62 212 340 597 11.5BL 6 29 62 220 354 586 25.6NBL 13 32 61 220 337 594 15.5



Results

I C-T (χ2CT = 9.1) and Ψ-Neglog models (χ2Ψ-NegLog = 10.7)

perform the best

I In general the new asymmetric (Ψ) models are better thantheir symmetric baseline versions, χ2Log = 21.5 reduces to

χ2Ψ-Log = 16.9 and χ2NegLog = 14 reduces to χ

2Ψ-NegLog = 10.7

I The difference between the log-likelihood values of the Neglogand Ψ-NegLog models is around 10, which shows a significantimprovement in the likelihood ratio as well

I The Tajvidi’s generalized symmetric logistic model(χ2Tajvidi’s = 11.5) is the best among the symmetric ones.



Danube water level data - 100 years of daily flood peaks

Fitted BGPD models for flood data from Budapest and Tass(threshold was chosen at about the 95% quantiles. Black regioncovers 95%, green: 90%, blue: 75%).

Logistic

−400 −200 0 200 400

−400

−200

020

040

0

Psi−Logistic

−400 −200 0 200 400

−400

−200

020

040

0



R package available

I Graphs and computations have been made by the add-onpackage mgpd of R

I Wind data are also availabe

I Updated version is coming up very soon (mgpd v2.0)

Acknowledgements: The European Union and the EuropeanSocial Fund have provided financial support to the project underthe grant agreement no. TÁMOP 4.2.1./B-09/KMR-2010-0003.



Thank you for your attention!


IntroductionExtreme value modelsUnivariate ExtremesMultivariate ExtremesNew models

ApplicationsWind dataFlood data Software

bivariate generalized pareto distribution in...

Documents