
  • Spatial conditional extremes via the Gibbs sampler

    Adrian Casey, School of Mathematics, University of Edinburgh

    April 28, 2020

    1 / 23

  • Data

    - The wave height data consist of 1680 hindcast observations of significant wave height during storms, measured at 150 locations.

    - The sites are arranged on an approximate grid in the North Sea, with axes running NE-SW and SE-NW.

    - The data include the wind direction at each site for each storm. Extremes depend on this variable.

    2 / 23

  • Data

    [Figure: map of the site locations (longitude 6°W to 6°E, latitude 56°N to 63°N). Colour shows the number of extremes at each site (0.975 quantile).]

    3 / 23

  • Wind direction

    [Figure: number of storms with extreme waves vs wind direction at site 1. Axes: wind direction at site 1, in compass sectors from N-NNE round to NNW-N, and number of extremes.]

    4 / 23

  • Heffernan & Tawn

    Suppose that $X$ is a random vector with exponential marginal distributions. We condition on the component $X_k$ being above a high threshold $u$, and let $X_{-k}$ denote the other components of the vector. The conditional model rests on the relatively weak assumption that, given $X_k > u$, there exist vector functions $a_k : \mathbb{R} \to \mathbb{R}^{d-1}$ and $b_k : \mathbb{R} \to \mathbb{R}^{d-1}$ such that

    \[
    \Pr\left\{ X_k - u > y,\ \frac{X_{-k} - a_k(X_k)}{b_k(X_k)} \le z \,\middle|\, X_k > u \right\} \to \exp(-y)\, G^{k}(z), \quad u \to \infty. \tag{1}
    \]

    5 / 23

  • Spatial extremes

    Questions that might be asked:

    - How many sites are likely to experience extreme values simultaneously?

    - Given an extreme value at one location, what is the distribution of values at other sites?

    Particular issues:

    - Asymptotics give no general form for the distribution $G^{k}(z)$. In spatial statistics this distribution is high-dimensional and so difficult to model outside the Gaussian framework.

    - An alternative approach is the graphical extremes method outlined in the recent paper by Engelke & Hitz (2018). This has the disadvantage of assuming asymptotic dependence, which is generally not the case in spatial statistical problems.

    6 / 23

  • Gaussian Markov random fields

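    Let $\{s_i\}$ be a set of sites with associated random variables $\{X(s_i)\}$. Let $\Delta_i$ be the neighbourhood (the Markov blanket) of $X(s_i)$. For a Gaussian distribution $N(0, Q^{-1})$ with precision matrix entries $Q_{ij} = Q(s_i, s_j)$, the conditional mean and variance of $X(s_i)$ given its neighbours are

    \[
    E[X(s_i) \mid X(s_j),\ s_j \in \Delta_i] = -\sum_{s_j \in \Delta_i} \frac{Q_{ij}}{Q_{ii}}\, X(s_j), \tag{2}
    \]

    \[
    \operatorname{Var}[X(s_i) \mid X(s_j),\ s_j \in \Delta_i] = \frac{1}{Q_{ii}}. \tag{3}
    \]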
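
As a minimal sketch of equations (2) and (3), the full conditional of one site can be read directly off the precision matrix. The function name `gmrf_full_conditional` and the three-site precision matrix are hypothetical choices for illustration, not taken from the talk.

```python
def gmrf_full_conditional(Q, x, i):
    """Mean and variance of X_i given all other sites, for X ~ N(0, Q^{-1}).

    Only sites j with Q[i][j] != 0 (the neighbourhood of site i) contribute
    to the sum, which is what makes the conditioning set local.
    """
    q_ii = Q[i][i]
    mean = -sum(Q[i][j] * x[j] for j in range(len(x)) if j != i) / q_ii
    var = 1.0 / q_ii
    return mean, var


# Hypothetical precision matrix for three sites on a line: site 1 neighbours
# sites 0 and 2, while sites 0 and 2 are conditionally independent.
Q = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
x = [1.0, 0.0, 3.0]

mean_1, var_1 = gmrf_full_conditional(Q, x, 1)
print(mean_1, var_1)  # 2.0 0.5
```

The negative off-diagonal entries give positive dependence: the conditional mean of the middle site is the average of its two neighbours.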

    7 / 23

  • ∆i for first order GMRF

    [Figure: site $s_i$ with its four first-order neighbours $X(s_{j_1})$, $X(s_{j_2})$, $X(s_{j_3})$ and $X(s_{j_4})$.]

    8 / 23

  • A local extreme value model

    Assume that, for a range of (positively dependent) MRFs with exponential margins, there exists a scalar function $a_\Delta(X_{\Delta_i})$ acting on the neighbourhood of site $i$ such that

    \[
    Z = \frac{X(s_i) - \alpha\, a_\Delta(X_{\Delta_i})}{\left(a_\Delta(X_{\Delta_i})\right)^{\beta}} \sim G, \tag{4}
    \]

    \[
    a_\Delta(X_{\Delta_i}) > u, \tag{5}
    \]

    where $G$ is a non-degenerate univariate distribution function, with convergence above some high threshold value $u$.

    9 / 23

  • The function a∆

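    We propose a form for the function $a_\Delta$ which is homogeneous in each of the components in the neighbourhood. This function arises in the analysis of asymptotically independent $k$th order Markov chains.

    \[
    a_\Delta(X_{\Delta_i}) = \left( \sum_{j \in \Delta_i} \gamma_j \left(\gamma_j X(s_j)\right)^{\delta} \right)^{1/\delta}, \tag{6}
    \]

    \[
    \sum_{j \in \Delta_i} \gamma_j = 1, \quad \delta > 0. \tag{7}
    \]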
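
The neighbourhood function in (6) and (7) is short enough to write out directly. The sketch below is illustrative only: the name `a_delta` and the input checks are assumptions, and the weights and exponent in the usage lines are the 0.92-quantile estimates from Table 1.

```python
import math


def a_delta(x_nbhd, gamma, delta):
    """Neighbourhood function (6): (sum_j gamma_j * (gamma_j * x_j)**delta)**(1/delta)."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    if abs(sum(gamma) - 1.0) > 1e-9:
        raise ValueError("the weights gamma must sum to 1")
    if any(x < 0 for x in x_nbhd):
        raise ValueError("values on exponential margins must be non-negative")
    return sum(g * (g * x) ** delta for g, x in zip(gamma, x_nbhd)) ** (1.0 / delta)


# Weights and exponent from the 0.92-quantile row of Table 1.
gamma = [0.27, 0.23, 0.28, 0.22]
delta = 0.86
x_nbhd = [3.1, 2.4, 3.8, 2.9]  # hypothetical neighbour values

a = a_delta(x_nbhd, gamma, delta)
a_scaled = a_delta([2.0 * x for x in x_nbhd], gamma, delta)
print(math.isclose(a_scaled, 2.0 * a))  # True: a_delta(c * x) = c * a_delta(x)
```

With four equal weights of 0.25 and equal neighbour values x, the function returns x/4 whatever the value of δ, which may help in reading fitted values of α close to 4.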

    10 / 23

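  • Example: GMRF

    Consider the GMRF from the earlier slide on Gaussian Markov random fields, drop the $i$ suffix and set $\beta_j = Q(s_i, s_j)/Q(s_i, s_i)$ and $\sigma = 1/Q(s_i, s_i)$. Then it can be shown that

    \[
    a_\Delta(X_\Delta) = \left( \sum_{j \in \Delta} \gamma_j \left(\gamma_j X_j\right)^{1/2} \right)^{2}, \tag{8}
    \]

    \[
    \gamma_j = \frac{\beta_j^{2/3}}{\sum_k \beta_k^{2/3}}, \tag{9}
    \]

    \[
    \beta = 0.5, \tag{10}
    \]

    \[
    \alpha = \sum_k \beta_k^{2/3}, \tag{11}
    \]

    \[
    Z \sim N(0, \sqrt{2}\,\sigma). \tag{12}
    \]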

    11 / 23

  • The Gibbs sampler

    Gibbs sampling draws from the full joint distribution by sampling sequentially from a series of conditional distributions. The method follows this pattern:

    Draw $x_1^*$ from $f(X(s_1) \mid X(s_2) = x_2, \ldots, X(s_n) = x_n)$
    Draw $x_2^*$ from $f(X(s_2) \mid X(s_1) = x_1^*, X(s_3) = x_3, \ldots, X(s_n) = x_n)$
    ...

    - This method is often applied to GMRFs because the conditioning set is reduced to the neighbourhood at each point. The existence of the local model suggests a path to sampling from the full joint distribution of $\{X(s_1), \ldots, X(s_n)\}$, given that the value at some site, say $X(s_1)$, is extreme.
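
To make the sweep pattern concrete, here is a small Gibbs sampler for a zero-mean GMRF, using the standard full conditional with mean $-\sum_{j \ne i} (Q_{ij}/Q_{ii})\, x_j$ and variance $1/Q_{ii}$. It is a sketch under stated assumptions: the three-site precision matrix, the sweep counts and the helper names are all hypothetical.

```python
import random


def gibbs_gmrf(Q, n_sweeps, burn_in, seed=0):
    """Gibbs sampler for X ~ N(0, Q^{-1}), updating one site at a time."""
    rng = random.Random(seed)
    n = len(Q)
    x = [0.0] * n
    samples = []
    for sweep in range(burn_in + n_sweeps):
        for i in range(n):
            # Full conditional of site i given the current values elsewhere.
            mean = -sum(Q[i][j] * x[j] for j in range(n) if j != i) / Q[i][i]
            sd = (1.0 / Q[i][i]) ** 0.5
            x[i] = rng.gauss(mean, sd)
        if sweep >= burn_in:
            samples.append(list(x))  # save one sample per full sweep
    return samples


def sample_cov(samples, i, j):
    """Unbiased sample covariance between sites i and j."""
    m = len(samples)
    mean_i = sum(s[i] for s in samples) / m
    mean_j = sum(s[j] for s in samples) / m
    return sum((s[i] - mean_i) * (s[j] - mean_j) for s in samples) / (m - 1)


# Hypothetical tridiagonal precision matrix; its inverse has entries
# 0.75 (sites 0,0), 1.0 (sites 1,1) and 0.25 (sites 0,2).
Q = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]

samples = gibbs_gmrf(Q, n_sweeps=20000, burn_in=1000)
print(sample_cov(samples, 0, 0), sample_cov(samples, 1, 1), sample_cov(samples, 0, 2))
```

The printed covariances should be close to the corresponding entries of $Q^{-1}$, even though the sampler only ever uses the local conditionals.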

    12 / 23

  • The Gibbs sampler algorithm

    Algorithm 1: A Gibbs sampler for a local extreme model

    Data: Regular lattice data
    Result: A distribution that allows effective extrapolation to higher quantiles of conditional extremes

    Initialization: set the lattice values to a random sample from the exponential distribution; one site is set to be above some threshold
    while samples < N do
        while i < n do
            read current lattice values {X(s1), ..., X(sn)};
            if i = 1 then
                sample from u + Exp(1);
            else if the neighbourhood function a∆(X∆i) > u then
                simulate a value from the fitted local extremes model;
                replace X(si) with that value;
            else
                simulate a new value from some model of the bulk density f(X(si) | X(sj) : j ≠ i);
                replace X(si) with that value;
        after one sweep of the lattice, save as sample;
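
The control flow of Algorithm 1 can be sketched as below. Almost everything numerical here is a labelled assumption rather than the fitted model from the talk: the parameter values, the Gaussian choice for the residual distribution G, the separate threshold `u_a` for the neighbourhood function, the wrap-around (toroidal) boundary, the clipping at zero, and the independent Exp(1) draw standing in for the bulk model.

```python
import random


def a_delta(x_nbhd, gamma, delta):
    """Neighbourhood function: (sum_j gamma_j * (gamma_j * x_j)**delta)**(1/delta)."""
    return sum(g * (g * x) ** delta for g, x in zip(gamma, x_nbhd)) ** (1.0 / delta)


def gibbs_local_extremes(m=6, n_samples=200, u=3.0, u_a=0.6, alpha=4.0,
                         beta=0.2, delta=0.9, sigma=0.3, seed=0):
    """Sketch of Algorithm 1 on an m x m lattice; site (0, 0) plays the role of site 1."""
    rng = random.Random(seed)
    gamma = [0.25] * 4  # hypothetical equal weights on the four neighbours
    # Initialise with unit-exponential values, then force the conditioning site above u.
    x = [[rng.expovariate(1.0) for _ in range(m)] for _ in range(m)]
    x[0][0] = u + rng.expovariate(1.0)
    samples = []
    for _ in range(n_samples):
        for r in range(m):
            for c in range(m):
                if r == 0 and c == 0:
                    x[r][c] = u + rng.expovariate(1.0)  # conditioning site: u + Exp(1)
                    continue
                # Four nearest neighbours with wrap-around edges (an assumption).
                nbhd = [x[(r - 1) % m][c], x[(r + 1) % m][c],
                        x[r][(c - 1) % m], x[r][(c + 1) % m]]
                a = a_delta(nbhd, gamma, delta)
                if a > u_a:
                    # Local extremes model: alpha * a + a**beta * Z, with Z ~ N(0, sigma)
                    # assumed here; clipped at 0 to stay on the exponential scale.
                    z = rng.gauss(0.0, sigma)
                    x[r][c] = max(alpha * a + a ** beta * z, 0.0)
                else:
                    # Placeholder bulk model: an independent Exp(1) draw.
                    x[r][c] = rng.expovariate(1.0)
        samples.append([row[:] for row in x])  # save the lattice after each sweep
    return samples


samples = gibbs_local_extremes()
print(len(samples), min(s[0][0] for s in samples) >= 3.0)
```

A working version would replace the placeholder bulk draw with a fitted conditional model and would need a deliberate treatment of the lattice boundary.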

    13 / 23

  • Working through the lattice

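    [Figure: lattice diagram showing sites X1, X2, X3, X4, X5 and X8.]

    Figure 1: $a_\Delta(X_{\Delta_1}) > u$, so the local model is used to sample $X_1$; $a_\Delta(X_{\Delta_2}) < u$, so a bulk model is used to sample $X_2$, here dependent on 8 neighbours.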

    14 / 23

  • Fitting a local extremes model to data

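    Assuming stationarity, we seek a model for the dependence of a site $X(s_i)$ on the four nearest lattice points.

    \[
    X(s_i) \mid X_{\Delta_i} = \alpha\, a_\Delta(X_{\Delta_i}) + a_\Delta(X_{\Delta_i})^{\beta} Z, \quad a_\Delta(X_{\Delta_i}) > u, \tag{13}
    \]

    \[
    a_\Delta(X_{\Delta_i}) = \left( \sum_{j \in \Delta_i} \gamma_j \left(\gamma_j X(s_j)\right)^{\delta} \right)^{1/\delta}, \tag{14}
    \]

    \[
    \sum_{j \in \Delta_i} \gamma_j = 1, \quad \delta > 0, \tag{15}
    \]

    \[
    i = 1, \ldots, n. \tag{16}
    \]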

    15 / 23

  • Fitting a local extremes model to data

    [Figure: site $s_i$ with its four nearest lattice neighbours X1, X2, X3 and X4.]

    16 / 23

  • Fitting a model for the bulk distribution

    We need a model for the conditional distribution applicable in the non-extreme region,

    \[
    p(X(s_i) \mid X(s_j) : j \ne i). \tag{17}
    \]

    The modelling strategy chosen for this step is first to transform the data to normal scale and then to fit a linear model to each full conditional in turn. To avoid over-fitting and to manage the number of parameters in this model, a Lasso is applied.
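
One way to picture this step is the sketch below, which maps unit-exponential values to the normal scale and fits a Lasso regression for one site on its neighbours. The talk does not say how the Lasso was fitted, so the cyclic coordinate-descent solver, the penalty value `lam`, the synthetic data and all function names are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def exp_to_normal(x):
    """Map a unit-exponential value to the standard normal scale."""
    p = 1.0 - math.exp(-x)
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep inv_cdf away from 0 and 1
    return NormalDist().inv_cdf(p)


def soft_threshold(v, lam):
    """Shrink v towards zero by lam, returning 0 when |v| <= lam."""
    return math.copysign(max(abs(v) - lam, 0.0), v)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, kept up to date as b changes
    col_sq = [sum(X[r][j] ** 2 for r in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[r][j] * (resid[r] + X[r][j] * b[j]) for r in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            diff = new_bj - b[j]
            if diff != 0.0:
                for r in range(n):
                    resid[r] -= X[r][j] * diff
                b[j] = new_bj
    return b


# Synthetic illustration: three neighbours on exponential margins, with the
# response on the normal scale depending on the first neighbour only.
rng = random.Random(1)
X = [[exp_to_normal(rng.expovariate(1.0)) for _ in range(3)] for _ in range(500)]
y = [0.8 * row[0] + 0.3 * rng.gauss(0.0, 1.0) for row in X]

coef = lasso_cd(X, y, lam=0.1)
print(coef)
```

In this toy setting the penalty shrinks the first coefficient somewhat below 0.8 and should set the two irrelevant coefficients to zero or very close to it, which is the parameter-management effect described above. No intercept is fitted, on the assumption that normal-scale data are approximately centred.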

    17 / 23

  • Parameter stability

    Quantile   γ1     γ2     γ3     γ4     β      δ      α
    0.92       0.27   0.23   0.28   0.22   0.36   0.86   4.00
    0.95       0.27   0.23   0.28   0.22   0.16   0.91   3.98
    0.975      0.29   0.22   0.28   0.20   0.01   0.75   3.96
    0.99       0.25   0.26   0.26   0.24   0.02   1.43   4.12

    Table 1: Parameter stability in Θ1 across threshold quantiles.

    18 / 23

  • Sample from data

    [Figure: map of the sites (longitude 6°W to 6°E, latitude 56°N to 63°N). Real sample with site 1 near the 0.975 quantile; colour shows the value at each site.]

    19 / 23

  • Realisation from the Gibbs sampler

    [Figure: map of the sites (longitude 6°W to 6°E, latitude 56°N to 63°N). Realisation from the Gibbs sampler with site 1 near the 0.975 quantile; colour shows the value at each site.]

    20 / 23

  • Expected extreme waves in the same storm

    - Simulation based on 3000 Gibbs sampling results with a 1000-sample burn-in.

    - Results

    Extreme quantile   Data     Simulation
    0.95               102.80   108.80
    0.975               90.72    92.64
    0.99                86.85    88.65
    0.995               78.80    68.00
    0.999                  NA    28.80

    Table 2: Expected number of sites with a wave in the extreme quantile, given a similarly extreme wave at site 1.
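
A summary of the kind reported in Table 2 can be computed from any collection of lattice samples, whether storms in the data or sweeps from the sampler. The sketch below is illustrative: the function name, the choice to count the conditioning site itself, and the toy samples are assumptions, not the author's code.

```python
def expected_exceedances(samples, threshold, cond_site=0):
    """Mean number of sites above `threshold` among samples where `cond_site` is above it.

    Each sample is a flat list of site values. The conditioning site is
    included in the count (an assumption about how Table 2 was built).
    """
    counts = [sum(1 for v in s if v > threshold)
              for s in samples if s[cond_site] > threshold]
    if not counts:
        return float("nan")  # no sample is extreme at the conditioning site
    return sum(counts) / len(counts)


# Toy samples on three sites; only the first two are extreme at site 0.
toy_samples = [[3.5, 3.2, 0.4],
               [3.1, 0.2, 0.1],
               [1.0, 4.0, 4.0]]

result = expected_exceedances(toy_samples, threshold=3.0)
print(result)  # 1.5
```

Returning NaN when no sample exceeds the threshold at the conditioning site mirrors the NA entry for the data at the 0.999 quantile, where the empirical estimate is unavailable.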

    21 / 23

  • Conclusions and next steps

    - Generally encouraging results for this method of conditional spatial extremes.

    - Significant problems remain.
      - Boundary conditions.
      - Non-stationarity and wind dependence.
      - The method relies on some knowledge of the distribution of $a_\Delta$.
      - Analysis of the distribution $\{X(s_1), \ldots, X(s_n)\} \mid \max\{X(s_1), \ldots, X(s_n)\} > u$ is computationally demanding.

    22 / 23

  • References

    S. Engelke and A. S. Hitz. Graphical models for extremes. arXiv preprint arXiv:1812.01734, 2018.

    J. E. Heffernan and J. A. Tawn. A conditional approach for multivariate extreme values (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66(3):497–546, 2004.

    I. Papastathopoulos and J. A. Tawn. Extreme events of higher-order Markov chains: hidden tail chains and extremal Yule-Walker equations. arXiv preprint arXiv:1903.04059, 2019.

    M. Reistad, Ø. Breivik, H. Haakenstad, O. J. Aarnes, B. R. Furevik, and J.-R. Bidlot. A high-resolution hindcast of wind and waves for the North Sea, the Norwegian Sea, and the Barents Sea. Journal of Geophysical Research: Oceans, 116(C5), 2011.

    23 / 23
