Supplement of Extreme-coastal-water-level estimation and projection: a comparison of statistical methods

Abstract. Accurate estimates of the probability of extreme sea levels are pivotal for assessing risk and for designing coastal defense structures. This probability is typically estimated by modeling observed sea-level records using one of a few statistical approaches. In this study we comparatively apply the generalized-extreme-value (GEV) distribution, based on block maxima (BM) and peaks-over-threshold (POT) formulations, and the recent metastatistical extreme-value distribution (MEVD) to four long time series of sea-level observations distributed along European coastlines. A cross-validation approach, dividing available data into separate calibration and test sub-samples, is used to compare their performances in high-quantile estimation. To address the limitations posed by the length of the observational time series, we quantify the estimation uncertainty associated with different calibration sample sizes from 5 to 30 years. We study extreme values of the coastal water level – the sum of the water level setup induced by meteorological forcing and of the astronomical tide – and we find that the MEVD framework provides robust quantile estimates, especially when longer sample sizes of 10–30 years are considered. However, differences in performance among the approaches explored are subtle, and a definitive conclusion on an optimal solution independent of the return period of interest remains elusive. Finally, we investigate the influence of end-of-century projected mean sea levels on the frequency of occurrence of extreme total water levels (the sum of the instantaneous water level and the increasing mean sea level). The analyses show that increases in the value of total water levels corresponding to a fixed return period are highly heterogeneous across the locations explored.



Introduction
This supplementary material contains two sections: 1) a general overview of extreme value theory, presented both in its conventional form (block maxima) and in its "threshold" form based on the generalized Pareto distribution (peaks-over-threshold); 2) figures complementary to the main text.

T1. Extreme Value Theory
Extreme Value Theory (EVT) is a statistical framework that quantifies the occurrence probability of unusually high (or low) values of a random variable. The cornerstone of EVT is the three-types theorem introduced by Fisher and Tippett (1928) and later proved by Gnedenko (1943). This result led Gumbel (1958) to introduce a statistical methodology for extreme values. The basic idea is to study the statistical behavior of

M_n = max{X_1, ..., X_n},    (1)

where X_i (for i = 1, ..., n) is a sequence of independent and identically distributed (i.i.d.) random variables having a common distribution function F. The distribution function of M_n is given by the n-th power of F:

Pr{M_n ≤ x} = [F(x)]^n.

Classical EVT focuses on the asymptotic behavior of this distribution. Under certain conditions, it can be shown that there exist scaling constants a_n > 0 and b_n that yield a nondegenerate limit distribution H(x) (i.e., one that is not everywhere 0 or 1), such that

Pr{(M_n − b_n)/a_n ≤ x} → H(x) as n → ∞.

The corresponding normalized variable M*_n = (M_n − b_n)/a_n therefore has a limiting distribution that must be one of the three types of extreme value distributions (Gumbel, Fréchet and Weibull), characterized by different behaviors and shapes of the tail. Von Mises (1936) proposed a single distribution that combines all three asymptotic types into one family, known as the generalized extreme value (GEV) distribution:

H(x; µ, ψ, ξ) = exp{−[1 + ξ(x − µ)/ψ]^(−1/ξ)},    (2)

defined for 1 + ξ(x − µ)/ψ > 0, where: 1. µ is a location parameter; 2. ψ is a scale parameter; 3. ξ is a shape parameter which controls the type of the tail distribution: 1) ξ → 0 defines the light-tailed case (Gumbel type or EV1), characterized by an exponential tail; 2) ξ > 0 identifies the heavy-tailed case (Fréchet type or EV2), described by a power law; 3) ξ < 0 gives the short-tailed case (negative Weibull case or EV3), which has a bounded upper tail.
EVT thus identifies the GEV distribution as a general model for the distribution of extreme events. The three GEV parameters can be estimated using well-known statistical methods: maximum likelihood, probability-weighted moments, or Bayesian methods. In many applications to environmental and hydrological processes, two fundamental approaches to extreme value statistics are widely used, based on 1) maxima over some fixed time period (block maxima) and 2) exceedances over a high threshold (peaks-over-threshold). The following subsections outline the concepts underlying these methods.
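As a minimal sketch of maximum-likelihood estimation of the three GEV parameters (the synthetic sample and "true" parameter values below are illustrative assumptions, not the paper's data):

```python
# Sketch: GEV parameter estimation by maximum likelihood on a synthetic
# sample. All names and values are illustrative assumptions.
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(42)
true_xi, true_mu, true_psi = 0.1, 1.5, 0.3

# scipy's shape parameter c equals -xi in the notation of the text
sample = genextreme.rvs(c=-true_xi, loc=true_mu, scale=true_psi,
                        size=200, random_state=rng)

# genextreme.fit returns (c, loc, scale) maximum-likelihood estimates
c_hat, mu_hat, psi_hat = genextreme.fit(sample)
print(f"xi = {-c_hat:.3f}, mu = {mu_hat:.3f}, psi = {psi_hat:.3f}")
```

With a couple of hundred values the estimates recover the location and scale closely; the shape parameter is typically the noisiest of the three, which is one reason threshold methods (Sect. T3) that use more data per record are attractive.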

T2. Modeling block maxima
The block maxima (BM) approach consists of dividing the observation sample into blocks of fixed time interval, extracting the maximum value from each block, and fitting the GEV distribution (Eq. (2)) to the resulting set of block maxima. The choice of a suitable block size is a preliminary step, amounting to a trade-off between bias and error variance. For most environmental processes a block size of one year is commonly used, leading to the study of the annual maxima time series. For practical applications we are generally interested in estimating the T-year return levels associated with the extreme values. If the GEV distribution is an appropriate model for block maxima, it is possible to estimate the quantile x_T, which is the level expected to be exceeded on average once every T years. Its cumulative probability is given by H(x_T) = 1 − 1/T, and estimates of extreme quantiles of the annual maxima distribution are then obtained by inverting Eq. (2):

x_T = µ + (ψ/ξ)·{[−ln(1 − 1/T)]^(−ξ) − 1}.

The BM method is commonly used both for its simplicity and because annual maxima can safely be assumed to be independent variables. Despite its simplicity, however, the BM method is wasteful: only one value from each block is used, so some important available information is lost.
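The BM workflow described above (block the series, extract the maxima, fit the GEV, invert it for x_T) can be sketched as follows; the synthetic "daily" series, its distribution, and the 50-year record length are illustrative assumptions:

```python
# Minimal block-maxima sketch: annual maxima from a synthetic daily series,
# GEV fit, and T-year return level. Data and record length are assumptions.
import numpy as np
from scipy.stats import genextreme, gumbel_r

rng = np.random.default_rng(0)
daily = gumbel_r.rvs(loc=0.0, scale=0.2, size=365 * 50, random_state=rng)
annual_maxima = daily.reshape(50, 365).max(axis=1)  # one maximum per "year"

c_hat, mu_hat, psi_hat = genextreme.fit(annual_maxima)

T = 100  # return period in years
# Invert the fitted GEV at the non-exceedance probability 1 - 1/T
x_T = genextreme.ppf(1 - 1 / T, c_hat, loc=mu_hat, scale=psi_hat)
print(f"{T}-year return level: {x_T:.2f}")
```

Note how wasteful the approach is: of 18,250 synthetic daily values, only 50 enter the fit.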

T3. Peaks-over-threshold
To overcome the limitations of the previous method, an alternative approach, known as peaks-over-threshold (POT, introduced by Balkema and de Haan (1974) and Pickands (1975)), is widely used to study extreme events. The POT method analyzes all data exceeding a specific threshold value. The idea underlying this approach is to set a high threshold u and to study all the exceedances of u. Suppose X_i (for i = 1, ..., n) is a sequence of i.i.d. random variables with distribution function F, and define the excesses over u as Y_i = X_i − u, conditional on X_i > u. The cumulative distribution of the excesses is then

Pr{Y ≤ y | X > u} = [F(u + y) − F(u)] / [1 − F(u)].

Pickands (1975) established the connection between EVT and the generalized Pareto distribution (GPD). He showed that a GPD approximation is possible if the distribution of the X_i satisfies Pr{M_n ≤ z} ≈ H(z), where M_n and H(z) are given respectively by Eq. (1) and Eq. (2). Moreover, for a very large threshold u, the distribution function of the excesses (Y_i = X_i − u) can be approximated by the generalized Pareto family:

H(y; σ_u, ξ) = 1 − (1 + ξ·y/σ_u)^(−1/ξ),

defined on {y : y > 0 and 1 + ξ·y/σ_u > 0}, with two parameters, scale (σ_u) and shape (ξ), where σ_u = ψ + ξ(u − µ).
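A minimal POT sketch using scipy (the data, the 98th-percentile threshold choice, and all parameter values are illustrative assumptions; unlike genextreme, scipy's genpareto uses the same sign convention for the shape ξ as the text):

```python
# Sketch of the POT step: pick a high threshold u and fit a generalized
# Pareto distribution to the excesses. Data and threshold are assumptions.
import numpy as np
from scipy.stats import genpareto, expon

rng = np.random.default_rng(1)
# Synthetic exponential "parent" series, for which xi ~ 0 and sigma_u ~ 0.5
series = expon.rvs(scale=0.5, size=20_000, random_state=rng)

u = np.quantile(series, 0.98)       # high threshold (an assumed choice)
excesses = series[series > u] - u   # Y_i = X_i - u, given X_i > u

# floc=0 because the excesses are measured from the threshold itself
xi_hat, _, sigma_hat = genpareto.fit(excesses, floc=0)
print(f"xi = {xi_hat:.3f}, sigma_u = {sigma_hat:.3f}")
```

With a 98th-percentile threshold, 2% of the record (here 400 values) enters the fit, many more than the one-per-year values available to the BM approach.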
This result implies that, if block maxima have approximate distribution H, then threshold excesses have a corresponding approximate distribution within the generalized Pareto family. In this case, the parameters of the GPD of threshold excesses are determined by those of the associated GEV distribution of block maxima. In particular, the shape parameter ξ is equal to that of the corresponding GEV distribution and is invariant to the block size n and to the threshold u, while the scale depends on the threshold through σ_u = ψ + ξ(u − µ). As for the GEV distribution, the shape parameter is dominant in determining the behavior of the GPD tail: 1. for ξ > 0 the distribution has no upper limit (equivalent to a Pareto distribution) and the tail satisfies 1 − H(y) ∼ c·y^(−1/ξ) with c > 0, i.e., a power-law (polynomial) decay; 2. for ξ < 0 the distribution of excesses has an upper endpoint at ω_F = σ_u/|ξ|; 3. for ξ = 0 the distribution is unbounded; this case is interpreted as the limit ξ → 0, i.e., an exponential distribution with mean σ_u.
For a fixed threshold u, the number of exceedances is assumed to be a random variable itself and is modeled with a Poisson distribution, leading to the so-called Poisson-GPD model. According to this model, if we assume the number of yearly exceedances to have a Poisson distribution with mean λ and all the exceedances to be independent, GPD-distributed realizations, the probability that the annual maximum of the process is less than a certain value x is

Pr{max ≤ x} = exp{−λ·[1 + ξ(x − u)/σ]^(−1/ξ)}.

This property shows that the probability distribution of the annual maxima of a Poisson-GPD model has the same form as the GEV distribution (see Eq. (2)). The GEV and GPD models are consistent with one another if ξ_GEV = ξ_GPD, σ = ψ + ξ(u − µ) and λ = [1 + ξ(u − µ)/ψ]^(−1/ξ). The POT method thus allows us to estimate the GEV parameters from a greater number of events, whereas the traditional fitting methods consider only the annual maxima, with a consequent distortion in the estimated shape of the tail. However, the optimal threshold selection requires particular attention in order to satisfy the two hypotheses underlying the method: 1) the number of events per year is Poisson-distributed; 2) exceedances over the threshold come from a generalized Pareto distribution (GPD). Threshold choice involves a trade-off between variance, which increases with higher thresholds due to the smaller number of excesses, and bias, which arises when the threshold is too low.

Figure S7. MARSEILLE (FR) - QQ plots of extreme-coastal-water-level quantiles computed for the GEV-based approaches (BM and POT) and MEVD (with parameter estimation on data from the whole calibration sample) for the Marseille station. The plots are obtained as a result of the cross-validation method used to test the global performance of the models and are estimated for 1,000 random realizations and for sample size: a) S = 30 years (in-sample test in the left column); b) V = M − S years (out-of-sample test in the right column). The colours represent the point density around the 45° line (dashed black line) corresponding to the best fit.

Figure S8.
NEWLYN (UK) - QQ plots of extreme-coastal-water-level quantiles computed for the GEV-based approaches (BM and POT) and MEVD for the Newlyn station. The plots are obtained as a result of the cross-validation method used to test the global performance of the models and are estimated for 1,000 random realizations and for sample size: a) S = 30 years (in-sample test in the left column); b) V = M − S years (out-of-sample test in the right column). The colors represent the point density around the 45° line (dashed black line) corresponding to the best fit.
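The Poisson-GPD annual-maximum relation of Sect. T3 can be inverted in closed form for the T-year return level: setting Pr{max ≤ x_T} = 1 − 1/T gives x_T = u + (σ/ξ)·[(λ/p)^ξ − 1] with p = −ln(1 − 1/T). A short numerical sketch (all parameter values are illustrative assumptions):

```python
# Numerical sketch of the Poisson-GPD return level (Sect. T3): lam
# exceedances per year on average, each excess GPD(sigma, xi), inverting
# Pr{max <= x} = exp(-lam * (1 + xi*(x - u)/sigma)**(-1/xi)).
# Parameter values are illustrative assumptions.
import math

def return_level(T, u, sigma, xi, lam):
    """T-year return level of a Poisson-GPD model."""
    p = -math.log(1.0 - 1.0 / T)   # required annual exceedance rate of x_T
    if abs(xi) < 1e-9:             # exponential (xi -> 0) limit
        return u + sigma * math.log(lam / p)
    return u + (sigma / xi) * ((lam / p) ** xi - 1.0)

u, sigma, xi, lam = 1.0, 0.25, 0.1, 3.0
for T in (10, 50, 100):
    print(f"T = {T:3d} yr:  x_T = {return_level(T, u, sigma, xi, lam):.3f}")
```

Substituting x_T back into the Poisson-GPD annual-maximum distribution recovers exactly 1 − 1/T, confirming the inversion.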