Articles | Volume 22, issue 3
Research article
01 Apr 2022
Research article |  | 01 Apr 2022

Extreme-coastal-water-level estimation and projection: a comparison of statistical methods

Maria Francesca Caruso and Marco Marani

Accurate estimates of the probability of extreme sea levels are pivotal for assessing risk and for designing coastal defense structures. This probability is typically estimated by modeling observed sea-level records using one of a few statistical approaches. In this study we comparatively apply the generalized-extreme-value (GEV) distribution, based on block maxima (BM) and peaks-over-threshold (POT) formulations, and the recent metastatistical extreme-value distribution (MEVD) to four long time series of sea-level observations distributed along European coastlines. A cross-validation approach, dividing available data into separate calibration and test sub-samples, is used to compare their performances in high-quantile estimation. To address the limitations posed by the length of the observational time series, we quantify the estimation uncertainty associated with different calibration sample sizes from 5 to 30 years. We study extreme values of the coastal water level – the sum of the water level setup induced by meteorological forcing and of the astronomical tide – and we find that the MEVD framework provides robust quantile estimates, especially when longer sample sizes of 10–30 years are considered. However, differences in performance among the approaches explored are subtle, and a definitive conclusion on an optimal solution independent of the return period of interest remains elusive. Finally, we investigate the influence of end-of-century projected mean sea levels on the probability of occurrence of extreme-total-water-level (the sum of the instantaneous water level and the increasing mean sea level) frequencies. The analyses show that increases in the value of total water levels corresponding to a fixed return period are highly heterogeneous across the locations explored.

1 Introduction

The statistical analysis of extreme values of random variables is of wide conceptual and applicative importance in science and engineering (Coles2001; Beirlant et al.2004; Castillo et al.2005; Finkenstädt and Rootzén2004). Modeling extreme-value probability of occurrence is indeed an active field of theoretical and applied research in many fields, such as hydrology and climatology (Katz et al.2002; Cancelliere2017; Mekonnen et al.2021; Miniussi and Marra2021), ecology (Katz et al.2005; Rypkema et al.2019), ocean wave modeling (Rueda et al.2016; Benetazzo et al.2017; Barbariol et al.2019), transport engineering (Songchitruksa and Tarko2006), geophysical processes (Pisarenko et al.2014a, b; Elvidge and Angling2018; Hosseini et al.2020), biomedical data analysis (De Zea Bermudez and Mendes2012; Chiu et al.2018), insurance and financial applications (Embrechts et al.1997; Chan et al.2022), and many others.

In particular, the reliable estimation of the occurrence probability of coastal flooding events of large magnitude is crucial to environmental hazard evaluation (Coles and Tawn2005; Hamdi et al.2018) and to decision-making and mitigation measure design. In fact, coastal flooding hazard has been increasing on a global scale in recent decades, a trend expected to continue as a result of climate change (Meehl et al.2007; Church et al.2013; Fortunato et al.2016). Several studies highlight that global sea-level rise will continue accelerating in the 21st century as a consequence of climate change (Church and White2006; Jevrejeva et al.2008; Church and White2011; Haigh et al.2014b; Hay et al.2015). Additionally, changes in storminess may have an important role in modifying the frequency and magnitude of water level extremes (Lowe et al.2010; Menéndez and Woodworth2010; Woodworth et al.2011). Much of the current work on extreme-coastal-flooding events is based on the classical extreme-value theory (EVT) (Fréchet1927; Dalrymple1960; Coles2001; Woodworth and Blackman2002; Hamdi et al.2014, 2015; and references therein), which identifies the family of distribution functions known as generalized-extreme-value (GEV) distribution (Von Mises1936) as a general model for the distribution of maxima (or minima) extracted from fixed time periods of equal length (“blocks”, most commonly with a length of 1 year). The GEV, according to its original formulation, arises as a limiting distribution for maxima (or minima, not considered here) of a sequence of independent and identically distributed (i.i.d.) random variables. The peaks-over-threshold (POT) formulation (Balkema and de Haan1974; Pickands1975) extends the original GEV derivation by modeling all events exceeding a high threshold as opposed to considering just yearly maxima as in the GEV-block maxima formulation (GEV–BM). The POT approach again recovers the GEV distribution as the distribution of the annual maxima if two assumptions are valid (Davison and Smith1990): (1) the number of events per year is Poisson-distributed; (2) exceedances over the threshold come from a generalized Pareto distribution (GPD). Under these suitable conditions, in the following we refer to the POT framework as the POT–GPD formulation. For a brief overview of the theory underlying EVT and the two main methods based on the GEV distribution (i.e., BM and POT approaches), the reader can refer to the “Methods” section or the Supplement. The POT–GPD approach is often considered to be superior to GEV–BM in practical applications due to its more efficient use of often scarce observations. For extreme-sea-level studies in particular, Coles and Tawn (2005) and Haigh et al. (2010) recognize two weaknesses in the use of the GEV–BM analysis: (1) sea level is the combination of tide-driven (deterministic) and storm-driven (stochastic) components (the presence of a deterministic component is suggested to violate the i.i.d. assumption required in the GEV–BM derivation); (2) sea-level data are collected frequently (e.g., hourly), while the GEV–BM approach only studies annual maxima, with an extremely inefficient use of the data. The POT framework exploits more of the available information with respect to the BM approach (e.g., Coles2001; Bernardara et al.2014). However, the choice of a suitable threshold to retain a few above-threshold events per year is a critical step, and the estimation uncertainty significantly depends on threshold selection (Önöz and Bayazit2001; Li et al.2012; Solari et al.2017). The selected threshold value implies a balance between bias and estimation error variance (Coles2001). In fact, too low a threshold will violate the independence hypothesis of the framework, leading to bias, while too high a threshold will retain just a few values above the threshold, leading to high error variance.

More generally, GEV-based approaches, by construction, discard most of the observations and do not attempt to optimize the use of the available information (Volpi et al.2019). Furthermore, the traditional extreme-value theory derives the GEV distribution either as the asymptotic distribution when the number of events per block becomes very large or through the ad hoc GPD–Poisson assumptions underlying the POT approach. Whether these hypotheses do apply to the case of sea levels is a matter of discussion, but it seems beneficial to adopt methods that require the least number of a priori assumptions on the properties of the event arrival process. As a contribution to overcoming the limitations of the traditional EVT, here we explore the use of an alternative approach for modeling extreme sea levels, the metastatistical extreme-value distribution (MEVD). This approach was introduced by Marani and Ignaccolo (2015) and has been previously applied to rainfall, flood-frequency analysis, and hurricane intensities. The MEVD models the distribution of yearly maxima starting from the distribution of “ordinary values”, i.e., all the available data, in contrast to just considering annual maxima or a few values above a threshold. Moreover, the MEVD framework (i) is a non-asymptotic extreme-value distribution, which does not require the number of events per year to be large as in the traditional theory, and (ii) makes no a priori assumptions on the properties of the event occurrence process. In previous applications, the MEVD has been shown to significantly reduce estimation uncertainty compared to traditional approaches, especially when considering return periods greater than the sample size used for parameter estimation (Zorzetto et al.2016; Marra et al.2018; Miniussi and Marani2020; Miniussi et al.2020a, b).

Here we comparatively analyze the performance of GEV-based approaches and MEVD in high-quantile estimations with application to extreme sea levels at different observation sites. The aim is to (1) identify the statistical tool affording minimal uncertainty in the estimate of extreme sea levels with assigned probability of exceedance and (2) model and understand how climate change will affect the extreme-sea-level occurrence. To achieve these objectives, we analyze selected sea-level time series along the European coastline and evaluate extreme-sea-level predictive uncertainty by adopting a cross-validation approach in which calibration and test samples are kept separate and independent. Subsequently, we use the optimized estimation method to infer possible changes in coastal flooding hazard under Intergovernmental Panel on Climate Change (IPCC) climate change scenario RCP4.5 and RCP8.5.

The structure of the paper is as follows: Sect. 2 outlines the sea-level data and the methodology used in this application, and results are described in Sect. 3, while the conclusions are given in Sect. 4.

2 Materials and methods

2.1 Data

The analyses were performed using daily and hourly sea-level records from four tide gauge stations (see Table 1) distributed along European coastlines: Venice (Italy), Hornbæk (Denmark), Marseille (France), and Newlyn (United Kingdom). The study sites span a variety of geographical locations, coastal morphologies, and storm regimes.

Venice sea-level data (maximum and minimum daily observations) were obtained from the “Centro Previsioni e Segnalazioni Maree” of the Venice Municipality (, Città di Venezia2020) for the Punta della Salute gauge station. The remaining water level data, all at the hourly scale, were downloaded from the University of Hawaii Sea Level Center (UHSLC) repository (, last access: 10 November 2020; Caldwell et al.2015).

All sea-level datasets span long observational periods: 148 years for Venice, 122 years for Hornbæk, 115 years for Marseille (ca. 19 missing years), and 102 years for Newlyn.

The raw data for all stations were pre-processed to eliminate (1) years with less than 6 months of water level observations and (2) days with less than 24 h of data (for the case of hourly data). This process yields four “cleaned up” time series that were subsequently used in the analyses (see Table 1). Figure 1 shows daily maximum sea levels at the gauge stations explored after pre-processing.

Table 1Information of sea-level data used in this application.

Download Print Version | Download XLSX

Figure 1Daily maximum sea levels at different gauge stations explored after pre-processing: Venice (IT), Hornbæk (DK), Marseille (FR), and Newlyn (UK).


2.2 Methods

2.2.1 Mean-sea-level removal

The sea-level sequence is highly correlated and is generated by a non-stationary process due to long-term trends in mean sea level, the deterministic tidal component, surge seasonality, and interactions between the tide and surge (Dixon and Tawn1999). Tide–surge interactions may change amplitude and phase of the surges, mostly in shallow estuarine areas (Johns and Ali1980; Bernier and Thompson2007; Zhang et al.2010). Therefore, this effect needs to be taken into account when separating the surge and tide components. However, here, we do not attempt to separate these contributions; we only analyze the sum given by the combination of the water level setup, induced by meteorological forcing, and the astronomical tide. Hence, we simply study such sum as the final result of the nonlinear interactions between individual components. Under this premise, for a given site and at any instant of time t, the observed sea level z(t) (after averaging out waves) can be split into three components (Pugh and Vassie1978): mean sea level, m.s.l.(t); astronomically induced tidal level, x(t); and meteorologically induced surge level, y(t):

(1) z ( t ) = m . s . l . ( t ) + x ( t ) + y ( t ) .

The term m.s.l.(t) represents the long-term variations in water levels and of the elevation datum (i.e., possible land subsidence or uplift). Local m.s.l.(t) does not change uniformly over time, and its calculation is affected by many factors, such as tidal phases, long-term wind and atmospheric-pressure patterns, and vertical land motion (subsidence or uplift). The tidal contribution to the instantaneous sea level, x(t), caused by the gravitational forces exerted by the moon and the sun is deterministic in nature and can be predicted with a good degree of accuracy. This tidal variability occurs with characteristic periodicities between 12 h and 18.61 years (Eliot2010; Haigh et al.2011; Pugh and Woodworth2014; Peng et al.2019; Valle-Levinson et al.2021). This latter longest tidal periodicity corresponds to the precession of the lunar nodal cycle. The storm-surge contribution, y(t), is the meteorologically induced change in the water level generated by a combination of factors, such as the magnitude and direction of the wind, spatial gradients in atmospheric pressure, storm size, fetch, bathymetry, and storm duration (Hall and Sobel2013).

Two classes of methods are widely used to estimate the probability of occurrence of extreme sea levels: direct and indirect methods. Indirect methods model separately the deterministic and the stochastic components of z(t), followed by a convolution to obtain the probability distribution of their sum. Examples are the joint probability method (Pugh and Vassie1978, 1980), the revised joint probability method (Tawn and Vassie1989), the exceedance probability method (Middleton and Thompson1986; Hamon and Middleton1989), and the empirical simulation technique (Scheffner et al.1996; Goring et al.2011). Direct methods, such as the one adopted here, analyze observed values compounding the astronomical and stochastic storm-surge component. Direct methods mostly differ based on the analysis approach adopted, such as the annual maxima method (Jenkinson1955; Gumbel1958), the peaks-over-threshold method (Davison and Smith1990), or the r-largest method (Smith1986; Tawn1988). Here, we study the distribution of the sum, h(t), of the contributions from the deterministic tide and the stochastic surge:

(2) h ( t ) = z ( t ) - m . s . l . ( t ) .

From a statistical point of view, this choice is justified by the fact that the random arrival of storms adds a stochastic surge contribution at unpredictable times, thereby causing h(t) to be values from a random variable, even though it contains a deterministic component. The presence of a deterministic component of course does imply a strong auto-correlation in the observed signal, which will be subsequently filtered out by suitable signal processing described below.

Here, m.s.l.(t) is computed as the yearly average of daily levels. The yearly average is chosen rather than the customary 19-year average that eliminates all tidal periodicities, however small in amplitude, to better capture the surge contribution that causes the water level to deviate during a storm with respect to the “current” yearly value of m.s.l.(t). Once h(t) is computed by removing m.s.l.(t) from recorded levels, all local maxima of h(t) or water level peaks are identified, and their values constitute the basis for subsequent analyses of (i) the study of long-term trends of maximum yearly departures from the average mean sea level (two-tail Mann–Kendall test; Mann1945) and (ii) statistical inference of past coastal flooding events and their potential future changes.

In the following discussion, we use the terms “total water level” and “coastal water level” when referring to the quantities z(t) and h(t), respectively.

2.2.2 Extreme-value theory

As highlighted by Serinaldi and Kilsby (2014), the EVT deals with the asymptotic distributional behavior of two types of data modeled with two well-known approaches, namely the so-called block maxima (BM) and peaks-over-threshold (POT) approaches. The first type models the maximum values extracted from blocks of fixed length, whereas the second one models all the exceedances of high threshold. The cornerstones of the EVT are two theorems: the Fisher–Tippett–Gnedenko theorem (also known as the three-types theorem; Fisher and Tippett1928; Gnedenko1943; Gumbel1958) and the Pickands–Balkema–de Haan theorem (also known as the second theorem of EVT; Balkema and de Haan1974; Pickands1975).

According to the three type theorem, there are three possible non-degenerate distribution functions which can arise as limiting distributions of extremes of random samples: (i) the Gumbel distribution, or type I; (ii) the Fréchet distribution, or type II; and (iii) the reverse-Weibull distribution, or type III. The above three limiting distribution laws can be combined into a single family of three-parameter distribution known as the generalized-extreme-value (GEV) distribution given by

(3) G ( x ; μ , ψ , ξ ) = exp { - [ 1 + ξ ψ ( x - μ ) ] } - 1 / ξ ,

defined in the region for which {x:1+ξψ(x-μ)>0}. In Eq. (3) μ(-,+) is a location parameter, ψ>0 is a scale parameter, and ξ(-,+) is a shape parameter which controls the nature of the tail distribution (Fréchet type for ξ>0, Gumbel type for ξ=0, and reverse-Weibull type for ξ<0).

The second theorem of EVT defines a method to model the tail of the distribution above a threshold value (Davison and Smith1990). In particular, the theorem states that for a large enough threshold value, u, the distribution of exceedances of some high threshold (y=X-u, where X is a sequence of i.i.d. random variables) is described by a generalized Pareto distribution (GPD), which has the following cumulative distribution function:

(4) G ( y ; σ u , ξ ) = 1 - ( 1 + ξ σ u y ) - 1 / ξ ,

defined on {y:y>0 and (1+ξσuy>0)}, where σu and ξ are the scale and shape parameters, respectively.
There is a link between these two distributions according to which, if block maxima have approximate GEV distribution, then threshold excesses have corresponding approximate distribution within the generalized Pareto family and vice versa, and GEV can be obtained from GPD under two appropriate conditions (i.e., the occurrences are Poisson-distributed, and excesses over the threshold come from a GPD). The duality between Eqs. (3) and (4) means that the GPD parameters of the excesses are uniquely determined by those of the associated GEV distribution of block maxima (see, e.g., Coles2001). In particular, the shape parameter, ξ, is equal to that of the corresponding GEV distribution, and the scale parameters of the two distributions are related by σu=σ+ξ(u-y).

The interested reader can refer to Coles (2001) for a detailed description of statistical methods for extremes in hydrology or Papalexiou and Koutsoyiannis (2013) for a recent overview of the history of the EVT.

2.2.3 The metastatistical extreme-value distribution

The typical EVT derivation starts from the premise that the maximum value among n realizations of a random variable (Mn) is distributed according to the cumulative distribution function P(Mnx)=G(x)=F(x;θ)n (where, as customary, a capital letter indicates the random variable, and a lower-case letter indicates a value of the random variable). This approach assumes that the n values of the random variable of interest are generated by the same distribution, the “ordinary value” distribution F(x;θ), and are thus independent and identically distributed; n is the number of events in a block, such that G(x) is the cumulative distribution of the block maxima. The classical EVT assumes either that the number of events per block is large (asymptotic hypothesis, leading to the GEV–BM formulation) or that the number of events per block above a high threshold is distributed according to a Poisson distribution (POT–GPD formulation). The recently proposed metastatistical extreme-value distribution (Marani and Ignaccolo2015) is a doubly stochastic approach (Dubey1968; Beck and Cohen2003) that relaxes these hypotheses by considering both the parameters (θ) of the ordinary value probability distribution and the number of events per block to be random variables. Hence, the MEVD cumulative distribution of block maxima (estimated using a much greater sample than just yearly maxima used in the BM approach) is then defined as the compound probability:

(5) G ( x ) = n = 1 + Ω Θ F ( x ; θ ) n g ( n , θ ) d θ ,

where g(n,θ) is the joint probability distribution of the number of events in a block and of the parameter vector (discrete in N and continuous in Θ), and ΩΘ is the population of all possible parameter values.

For practical applications, the MEVD can be approximated by substituting the ensemble average in Eq. (5) with the sample average computed over all the blocks in the time series, obtaining

(6) G ( x ) 1 M j = 1 M F ( x ; θ j ) n j ,

where M is the number of blocks in the historical record, F(x;θj) is the cumulative distribution of ordinary values in the jth block, and nj is the number of events in the jth block. A common choice for the block length is 1 year. Note that the values of the parameter θj may be estimated on an estimation window (EW) with a length that is different from block length. For example, if the block length is 1 year, it may be advantageous to estimate parameter values on longer time slices to ensure, depending on the rate of event occurrence, that a reliable estimation of the parameters is possible. Miniussi and Marani (2020) in applications to daily rainfall extremes find that, when the number of events per year is fewer than 20–25, then the optimal EW length may be greater than 1 year.

It is interesting to note that the POT approach, briefly described above, can be thought of as a particular case of MEVD. In fact, Zorzetto et al. (2016) highlight that if one assumes (i) x to be the excess over a high threshold, (ii) F(x;θj) to be a generalized Pareto distribution (with fixed, deterministic parameters), and (iii) n to be generated by a Poisson distribution, then the GEV distribution is recovered as a particular case of the MEVD by means of the POT approach.

MEVD has been applied in several earth-science contexts. In rainfall extreme estimates, the ordinary value distribution is assumed to be Weibull when applied to point daily rainfall (Marani and Ignaccolo2015; Zorzetto et al.2016; Schellander et al.2019; Miniussi and Marani2020; Miniussi et al.2020b), point sub-daily rainfall (Marra et al.2018), and satellite rainfall estimates (Zorzetto and Marani2019, 2020). For floods across the United States, Miniussi et al. (2020a) propose to adopt a gamma distribution for F(x;θj). Hosseini et al. (2020) describe Atlantic hurricane intensities using a generalized Pareto ordinary value distribution. In all cases the appropriate form for the underlying ordinary value distribution was identified by minimizing the estimation uncertainty within a cross-validation approach, which is also followed here. In this particular application to extreme coastal water levels, three candidate probability distributions for F(x;θj) in Eq. (6) are tested, i.e., the gamma, Weibull, and generalized Pareto distributions. Based on the comparative evaluation of the performance of these distributions, e.g., using diagnostic quantile–quantile scatterplots, the generalized Pareto distribution emerged as the best model for the “ordinary” coastal-water-level values.

In the present context, we define as ordinary values any coastal water elevation (i.e., the maximum water level reached during a storm event) greater than a site-specific threshold value. This threshold is chosen to be as small as possible (differently from the POT approach) to retain as much of the observational information as possible and will be dependent on the magnitude of the local tidal range (sea-level difference between high and low water level over a tidal cycle) and of storm contributions. Additionally, the threshold is set to be large enough to filter out coastal-water-level peaks that are likely fully determined by tidal fluctuation in the absence of any storm contribution. Given the above constraints, we also choose the threshold value that minimizes the estimation error under the MEVD framework.

As suggested by several rainfall applications, ordinary distribution parameters are here estimated using the probability-weighted moments (PWMs) method in non-overlapping estimation windows of 5 years. In the present application, the optimal estimation window length was set to 5 years to obtain a more robust parameter estimation, especially when few values in each year are available. PWM estimation, introduced by Greenwood et al. (1979), is widely applied because of its good performance, particularly in the presence of small sample sizes, and its reduced estimation bias and sensitivity to the presence of outliers in the data (Hosking et al.1985; Hosking and Wallis1987; Hosking1990).

2.2.4 Selection of independent events

The GEV-based approaches are fit on either annual peak maxima (GEV–BM) or on a few water level peaks over a high threshold (POT–GPD), which can be assumed to be realizations of independent stochastic variables. The MEVD requires that all ordinary values (coastal-water-level peaks in this case) within one block may be assumed to be realizations from independent random variables. This hypothesis, in turn, requires that observed peaks are filtered to only retain events that may be considered to be independent, through a declustering process (Coles2001; Ferro and Segers2003; Beirlant et al.2004; Bommier2014; Marra et al.2018). Several criteria have been developed for such processing of the data. A common criterion sets the minimal time separation, or lag (τ), for two events to be considered independent. Intuitively, high-water-level events separated by a sufficiently long time period are reasonably caused by distinct storm events. However, when analyzing the water level with respect to current mean sea level, a quantity that contains the deterministic tidal contribution, dependence may be expected to be present also for large lags. In theory, some dependence is present for lags up to the longest periodicity in the tidal signal (18.61 years). In practice, as the dependence in the tidal signal decreases for increasing lag, one expects that a much shorter threshold time lag will be sufficient to make sure that only independent events are considered. The analysis of the correlograms of selected coastal-water-level peaks shows that some correlation persists also for long time lags and also in the declustered time series. Even though the strength of this correlation is relatively small (the autocorrelation function, ACF, is always less than 0.3), it could impact the ability of the MEVD, which assumes independence, to capture observed extreme behavior. The declustering process does significantly decrease correlation, as may be seen by comparing Fig. S1 (ACF prior to declustering) and Fig. S2 (after declustering). Interestingly, it is seen that the tidal contribution (that generates periodicities in the ACF) is strongly visible in Venice and Newlyn, while it is quite small in Hornbæk and Marseille. The underlying tidally induced correlation becomes more clearly visible after declustering also in Hornbæk and Marseille. We note that the existing literature implementing declustering approaches to coastal level signals normally focuses on studying the storm-surge component only. As a result, it uses threshold time lag values that are smaller than those adopted here because characteristic correlation times of the surge component are significantly smaller than those associated with the sum given by the combination of surge and tidal components. For example, the independence between two consecutive storm surge events in southern Europe has been found to be achieved with a threshold lag of 3 d (Cid et al.2015). A threshold separation of 1 d between consecutive events is imposed by Tebaldi et al. (2012) in their analysis of storm surges along the US coast. Haigh et al. (2010) adopt a threshold lag of 30 h in the English Channel, while Bernardara et al. (2011) assume a 72 h independence criterion. After exploring values between 24 h and several days, we adopt a threshold lag of 30 d, which yielded the minimum estimation error under the MEVD approach and is consistent with the main lunar periodicity. The result of this declustering process is a set of independent events with magnitudes hk, whose number nj in year – or block – j is a realization of a random variable as illustrated in Eqs. (5) and (6).

2.2.5 Cross-validation procedure

Statistical modeling aims to use sample information to infer the probability distribution of the population from which the data are extracted. This inference is uncertain due to imperfect parameter estimates and to the possible inability of the chosen distribution to capture the statistical properties of the underlying population. Although these sources of uncertainty are inherent in any statistical model, their impact can be minimized by a careful choice of the model and by an effective use of all sources of information (Coles2001). In many applications uncertainty is estimated by means of goodness-of-fit measures, which quantify how well the distribution compares to the sample on which it was fitted. However, this procedure does not provide a measure of the predictive uncertainty encountered when trying to estimate the probability of occurrence of the “next” yet unobserved value. In this study, we evaluate the performance in high-quantile estimation associated with the use of the MEVD and the GEV distribution by means of a cross-validation (CV) procedure, in which model predictions of the probability of occurrence are compared to frequencies from data that were not used in the estimation of model parameters. This is possible by dividing observations into two sets of independent data: the estimation set is the sample from which model parameters are estimated, and the test set is the sample with which model predictions are compared.

The procedure can be summarized as follows: (a) we randomly reshuffle the observational years on record while keeping all the independent water level peaks in their original year to (1) preserve both the ordinary value frequency distribution in each year and the distribution of the number of events per year and (2) remove possible non-stationarity and correlation in the time series; (b) we divide the observational sample into two independent sub-samples obtained by randomly selecting S years from the original time series of length M; this sub-sample (in the following “calibration sample”) is used for parameter estimation, while data in the remaining V=M-S years are used for testing (in the following “validation sample” or “test sample”); (c) as usual in frequency analysis, we associate with each observed yearly maximum, xi, an empirical frequency value given by Weibull’s estimator Fi=i/(V+1), where i is the rank of xi in the list of yearly maxima sorted in ascending order, and V=M-S is the sample size in the validation sub-sample; the return period Tr associated with each yearly maximum is then simply Tr,i=1/(1-Fi); (d) we estimate the GEV and MEVD quantiles using the parameter values estimated in step (b) from the calibration sub-sample; (e) focusing on the validation sub-sample, in every realization (for p=1,,Nr; Nr=1000 here) and for a fixed mean recurrence time (Tr), we compute the nondimensional error (NDE) between the estimated and observed quantiles as NDEp(S,Tr)=[h(est,p)(S,Tr)-h(obs,p)(S,Tr)]/h(obs,p)(S,Tr); (f) we repeat the CV scheme above Nr times. This procedure is performed for different calibration sample sizes (S=5,10,20, and 30 years) to evaluate how estimation uncertainty varies with return period and calibration sample size.

2.2.6 Future total water level projections

Future increases in the frequency of extreme total water levels (i.e., the variable previously referred as z(t)) due to climate change will have serious impacts on coastal regions. These impacts will vary temporally and regionally, depending on (i) the local relative mean-sea-level rise (including possible subsidence or uplift), (ii) current storm-surge intensity probability distributions, and (iii) changes in the dominant meteorological dynamics. In this particular application to extreme coastal water levels (i.e., the sum given by the combination of the water level setup, induced by meteorological forcing, and the astronomical tide), only the first two factors are considered.

It is very likely that sea-level rise will continue to accelerate over time, thereby increasing the frequency of extreme-sea-level events, leading to severe flooding in many low-lying coastal cities and small islands (Oppenheimer et al.2019). Various techniques have been used to study possible changes in coastal flooding hazard (e.g., McInnes et al.2013; Vousdoukas et al.2016). Several authors have found that past variations in the frequency of occurrence of extreme sea levels have been primarily determined by changes in mean sea level (e.g., Zhang et al.2000; Woodworth and Blackman2004; Lowe et al.2010; Menéndez and Woodworth2010; Haigh et al.2014b; Wahl et al.2017). This implies that effects of variations in storminess (e.g., magnitude, trajectories, and frequency) have been small in the observational record compared to the dominant effects of mean-sea-level changes (Haigh et al.2014a). This notion is also confirmed by our trend analyses of maximum yearly departures from the average sea level (see Sect. 3.1), which fail to detect trends in the maximum difference between total sea level and concurrent mean sea level except at one of the sites (Venice), where it is smaller (0.7 mm yr−1) than past and projected rates of sea-level rise ( 3.0 and  8.0 mm yr−1, respectively, by the end of the century, according to the RCP8.5 IPCC scenario).

Based on these elements, here we estimate the probability of future total water levels along European coastlines by assuming that changes in the tidal and storm-surge components are negligible with respect to changes in mean sea level, an assumption common to previous approaches (Araújo and Pugh2008; Haigh et al.2010; Tebaldi et al.2012).

To assess how the exceedance probabilities of extreme total water levels might change in the future, the projections of sea-level rise through 2100 from the IPCC’s Fifth Assessment Report (AR5) are used. In particular, we explore an intermediate (RCP4.5) and an extreme scenario (RCP8.5), using CMIP5 model outputs from the “Integrated Climate Data Center” (ICDC) database (University of Hamburg:, Church et al.2013).

For each tide gauge, our approach can be summarized as follows: (1) we infer the probability distribution of extreme coastal water levels (annual maxima) from observed independent events whose intensity (maximum coastal water level attained, hk) is defined with respect to the concurrent mean sea level computed on a yearly basis; (2) we estimate the future probability of extreme total water levels by translating extreme-level quantile estimates upward according to location-specific projections of mean sea level in the year 2100 (thereby implicitly assuming subsidence and uplift to be negligible).

2.2.7 Return period

One of the main objectives of frequency analysis is to calculate the average recurrence interval or return period. It is a widely used concept in hydrological and geophysical risk analysis. If a process is stationary, the return period (Tr) of an event magnitude is defined as the average time elapsing between two consecutive exceedances of this magnitude. Alternatively, it may be said that a magnitude value is expected to be exceeded, on average, in each return period. If the yearly maximum magnitude h is exceeded on average once in Tr years, then its exceedance probability, E(h)=1-G(h), in a given year is


Therefore, the return period of the level value h is the inverse of the probability of exceedance and can be expressed as a function of the cumulative distribution, G(h), of annual maxima, e.g., through the MEVD (Eq. 6):

(7) T r ( h ) = 1 E ( h ) = 1 1 - G ( h ) .

Because for a fixed value of mean sea level there is a one-to-one relation between the value of the sum of the astronomical and the storm surge contribution, h, and the total water level, z=h+m.s.l., one can write Gh(h)=P[H>h]=P[H>z-m.s.l.]=P[Z-m.s.l.>z-m.s.l.]=P[Z>z]=Gz(h) such that Eq. (7) can be used once the cumulative distribution is known and for each (time-dependent) value of m.s.l. to determine the return period of the total water level (at the time when m.s.l. is evaluated):

(8) T r ( z ) = 1 1 - G z ( h ) = 1 1 - G h ( h ) = 1 1 - G ( z - m . s . l . ) .

Based on the hypothesis introduced in Sect. 2.2.6 that mean-sea-level rise is the dominant effect in future coastal flooding, we assume that the characteristics of the extremes (i.e., the parameters of the GPDs defining the MEVD) remain valid in future scenarios. Equation (8) clarifies that the return period of a fixed value z decreases as m.s.l. increases, basically because for higher values of m.s.l. a smaller value of h is needed to achieve the same total water level z. This decrease is nonlinear due to the nonlinear form of the right-hand side in Eq. (8).

3 Results and discussion

3.1 Mann–Kendall trend analysis

We start by computing mean sea level on a yearly basis and by subtracting it from observed total water level. The first question that we want to explore is the presence of long-term trends, unrelated to sea-level rise and associated with other factors (e.g., human-induced factors, morphological variations), in the “cleaned up” signal, i.e., the observed measurements without mean sea level. To answer this question, in this work we focus on the deviation of yearly maxima from yearly mean sea level and test for the presence of a trend by the two-tail Mann–Kendall test (Mann1945). Figure 2 summarizes results for each location explored. From a first visual inspection of Fig. 2, the Venice (1872–2019) and Hornbæk (1891–2012) time series appear to show an increasing trend in the deviations of yearly maxima from yearly mean sea level (blue line) of different magnitudes. In contrast, Marseille sea-level observations (1985–2018) seem to be characterized by a decreasing trend. Finally, the Newlyn historical record (1915–2016) displays a fairly constant signal with no noticeable variations. The application of the Mann–Kendall test reveals a partly different story. The test rejects the hypothesis of the absence of a trend at the 95 % confidence level only for the Venice site (p valueVenice=0.014). This result suggests that the increase in the yearly maximum deviations from yearly mean sea level may be a direct result of the local morphological variations in lagoon channels where the tidal wave propagates (whereby dissipation of the wave is reduced) and/or land subsidence. In contrast, at the remaining locations, the null hypothesis of no trend cannot be rejected (p valueHornbæk=0.352, p valueMarseille=0.110, and p valueNewlyn=0.997). The results obtained from these analyses support the validity of the hypothesis that mean-sea-level rise is the dominant factor in determining the future frequency of coastal flooding (see Sect. 2.2.6). For the tests performed here to compare different extreme-value statistical models, the possible presence of trends (e.g., in Venice) is irrelevant since such tests are performed by first reshuffling observed values, thereby eliminating any existing trend, albeit small.

Figure 2Deviation of yearly maxima from yearly mean sea level (blue line) and 19-year running mean (black line) calculated for Venice (IT), Hornbæk (DK), Marseille (FR), and Newlyn (UK).


3.2 Extreme-value analysis

The MEVD formulation requires the choice of an optimal distribution of ordinary values that can represent the characteristics of the natural phenomenon under analysis. Different candidate distributions for the F(x;θj) in Eq. (6) are evaluated, and the most suitable distribution is selected on the basis of the CV procedure comparing the MEVD-estimated quantiles with the observed ones. As previously introduced in Sect. 2.2.3, according to different tests, the appropriate distribution to model the ordinary sea-level values is the generalized Pareto distribution (GPD). We highlight that the GPD used in the MEVD framework is obtained by imposing a small threshold (differently from the high threshold adopted in the POT–GPD approach) to capture the distribution of the main body of the probability distribution of the ordinary events and does not require the event arrival process to be Poisson (Marani and Zorzetto2019).

As mentioned above (Sect. 2.2.4), the independence between two consecutive coastal-water-level events is guaranteed by imposing a minimum time lag. Firstly, we select daily maxima sea levels from the original record; secondly, we define as independent events those that are separated by at least 30 d. Subsequently, the samples used for statistical inference are built as follows: (1) GEV–BM – the yearly maxima are selected; (2) POT–GPD – as proposed by Coles (2001), the optimal threshold (u) is determined by studying the stability of the GPD shape (ξ) and modified scale (σ*=σu-ξu) parameters estimated using a wide range of values of u; using this method, threshold values of 65 cm (Venice), 50 cm (Hornbæk), 35 cm (Marseille), and 260 cm (Newlyn) were identified; (3) MEVD – all the independent coastal-water-level events above a low threshold are used to fit the probability distributions of ordinary values. The optimal threshold to apply to all the independent events for extrapolating the sample of ordinary values is chosen by testing different threshold values and evaluating the goodness of fit of the distribution using diagnostic graphical plots. According to the selection criteria described in Sect. 2.2.3, the low thresholds adopted at the four study sites are 59 cm for Venice, 40 cm for Hornbæk, 25 cm for Marseille, and 250 cm for Newlyn. For every observed site, Table 2 and Fig. S3 display the gradual increase in the number of independent events (i.e., annual maxima, exceedances over the threshold, and ordinary values) used to infer the distributions when moving from GEV–BM and POT–GPD to MEVD approaches.

Considering the above threshold values, the observed and estimated distributions of coastal water level are compared by plotting their quantiles against each other. By comparing measures of in-sample and out-of-sample test predictive accuracy, the results are presented by means of quantile–quantile (QQ) plots. The reader can refer to Fig. 3 (or Figs. S4, S5, S6, S7, and S8 in the Supplement) to compare the results obtained with the MEVD framework (or the GEV-based approaches – GEV–BM and POT–GPD – vs. the MEVD formulation) for the four sites analyzed. QQ plots are obtained as a result of the CV procedure with 1000 random realizations and sample size: (a) S=30 years (in-sample test in the left column), (b) V=M-S years (out-of-sample test in the right column). The colors represent the density of points around the 45 line (i.e., the line of equality). This highlights how the estimated quantiles are closely comparable with the observed ones for all the three approaches tested and for both the sample sizes explored (S and V). In particular, if the reader looks at Figs. S4–S8 in the Supplement and if out-of-sample performance is considered, it is difficult to quantify which distribution is the best due to a large variability in the estimates. Overall, if only the MEVD performance is investigated, the reader can look to the right column (out-of-sample test) in Fig. 3, where the results display that the MEVD formulation performs similarly for all sites analyzed. In particular, it proves to be a good model for lower and intermediate quantiles but shows variability in the estimates for higher quantiles.

Table 2Total number of independent events and average number of events per year for all the gauge stations explored.

Download Print Version | Download XLSX

Figure 3QQ plots of extreme-coastal-water-level quantiles, computed with the MEVD framework, for the (a) Venice (IT), (b) Hornbæk (DK), (c, d) Marseille (FR), and (e) Newlyn (UK) sites. The MEVD parameter estimations are based on non-overlapping sub-samples of fixed size (5 years), while subplots indicated with the letter (d) display the QQ plots obtained with MEVD parameter estimations based on data from the whole calibration sample size. The plots are obtained as a result of the cross-validation method used to test the global performance of the models and are estimated for 1000 random realizations and for sample size (1) S=30 years (in-sample test in the left column) and (2) V=MS years (out-of-sample test in the right column). The colors represent the point density around the 45 line (dashed black line) corresponding to the best fit.


We now focus on evaluating the performance of the three approaches (GEV–BM, POT–GPD, and MEVD) in high-quantile estimation. We explore the predictive performance of the MEVD and GEV distribution as a function of the NDE (Sect. 2.2.5) computed for the maximum return period, Tr,max=M-S+1, associated with the maximum value in each test sub-sample that we randomly extract in the CV approach. The use of the NDE metric allows us to easily characterize and compare model estimation uncertainty associated with fixed return time of interest and the variation in the calibration sample size (from 5 to 30 years). The results are summarized by means of box plots (Fig. 4) and kernel density estimates computed for a calibration sample size of 30 years (Fig. 5). Table 3 summarizes the main results underlying the chosen evaluation metric. When we focus on the case of a short sample (5 years), different sites display variable results: (I) the GEV and MEVD approaches perform similarly for Venice (Fig. 4a) and Hornbæk (Fig. 4b) with similar interquartile ranges and underestimations of the actual quantile; (II) for the Newlyn gauge station (Fig. 4d) the GEV–BM distribution yields better results, even though the POT and MEVD median errors are also close to zero. In contrast, when we consider longer calibration sample sizes (from 10 to 30 years), the MEVD-based estimates outperform the traditional approaches for most gauge stations explored: (I) results for the Venice site confirm the robustness of the MEVD with respect to the GEV distribution especially for calibration sample size equal to 30 years; in this case, the median error in the MEVD estimates tends to be closer to zero (−0.004), corresponding to approximately unbiased estimates; (II) the Hornbæk station displays similar results to those for Venice, and the MEVD-based estimates become more reliable when we consider a calibration sample size greater than 10–20 years; (III) Newlyn estimation errors show a trade-off between the BM method and MEVD for calibration sample size equal to 20 and 30 years.

Figure 4Distribution of the nondimensional error (NDE) for maximum sample return period (Tr) represented by means of box plots at given gauge stations explored: (a) Venice (IT), (b) Hornbæk (DK), (c) Marseille (FR), (d) Newlyn. In the case of the Marseille (FR) site, MEVD parameter estimation is based (1) on non-overlapping sub-samples of fixed size (5 years; green color) and (2) on data from the whole calibration sample (black color).


Figure 5Kernel density estimates for the nondimensional error (NDE) distributions obtained with a calibration sample size (S) of 30 years and maximum return period (Tr) at given gauge stations explored: (a) Venice (IT), (b) Hornbæk (DK), (c) Marseille (FR), (d) Newlyn (UK). In the case of the Marseille (FR) site, MEVD parameter estimation is based (1) on non-overlapping sub-samples of fixed size (5 years; green color) and (2) on data from the whole calibration sample (black color).


Results for the Marseille site show a peculiar behavior that requires a specific discussion. In this case, the application of the traditional extreme-value theory is advantageous when compared with the MEVD (Fig. 4c). In order to better understand the application to the Marseille site, we performed MEVD parameter estimation using two approaches: (1) estimation based on non-overlapping calibration samples of fixed size (5 years as for the other sites); (2) parameter estimation on data from the whole calibration sample. The comparison of the results from these two setups confirms that when longer time slices are used for estimating GPD parameters (black color in Fig. 4c), the MEVD performance is improved (for example when we consider S=30 years, MEVD median[S-yearwindow]=0.17 vs. MEVD median[5-yearwindow]=0.35), though it does not yet match the results obtained from the GEV–BM approach (GEV–BM median error =0.016). This can be explained by considering that sea-level peaks occur in Marseille about once every year on average. In this case GEV–BM is advantageous because the small number of events per year does not provide a more numerous calibration sample with respect to the sample of annual maxima. This result confirms the conclusion by Miniussi and Marani (2020), according to which the selection of the estimation window size for fitting the ordinary value distribution strongly depends on the average number of extreme events per year.

We also provide a comparative analysis between the three methods to evaluate if the tested extreme-value distributions are representative of the entire range of return times of interest. To achieve this purpose, we evaluate method performance also for intermediate Tr values, greater than the calibration sample size, since for Tr<S the empirical quantiles can be used. We perform this additional analysis for the Venice, Hornbæk, and Newlyn sites. Figure 6 summarizes the results obtained by estimating the probability distribution parameters on 30-year calibration sub-samples. The analyses suggest that when we focus on the median error associated with moderate values of the return period, GEV–BM displays an overall greater robustness (e.g., in the case of Venice and Hornbæk sites) with respect to POT–GPD and MEVD, which exhibit greater fluctuations. In particular, results show that MEVD is a good model for the highest values of the return period but exhibit a greater absolute value of the estimation error for smaller Tr. Overall, the results suggest that no single approach is clearly superior at all values of Tr due to a large variability in the estimates. For example, for the Venice site there is a decrease (in many cases an unbiased estimate) in the MEVD-NDE values for intermediate Tr (between 85 and 105 years), while for greater Tr values (but smaller than Tr,max) the error shows an overestimation of the actual quantile with respect to traditional approaches (which exhibit an underestimation tendency). To be more specific, if Tr>105 years are considered, MEVD yields error estimates between 0 % and <10 %, while errors associated with GEV–BM and POT–GPD lie between 0 % and <-20 %. The Hornbæk site shows similar results to the Venice site, while Newlyn’s results exhibit more fluctuations for large Tr values with much reduced smaller amplitudes and values of the NDE.

Table 3Results of the evaluation metric obtained for all the gauge stations and for calibration sample sizes (S) equal to 5 and 30 years. In the case of the Marseille site, text in bold refers to MEVD parameter estimation based on data from the whole calibration sample size.

Download Print Version | Download XLSX

Figure 6Median of the nondimensional error (NDE) for return period greater than the calibration sample size in test sub-sample for the GEV–BM, POT–GPD, and MEVD approaches (magenta, blue, and green dots, respectively). The results are obtained for the Venice (IT), Hornbæk (DK), and Newlyn (UK) sites and by estimating the distribution parameters on 30-year calibration sub-samples.


Figure 7Future total water level projections, with respect to the current mean sea level, in Venice (IT; panels a and b), Hornbæk (DK; panels c and d), Marseille (FR; panels e and f), and Newlyn (UK; panels g and h). (a, c, e, g) Annual (black line) and future mean sea level until 2100 with RCP4.5 (blue line) and RCP8.5 (red line). Dashed lines represent the 95 % confidence intervals. (b, d, f, h) Return period curves for extreme total water level. The green curve represents the estimates obtained with the observed record; the blue and red curves represent the estimates obtained with the projected sea-level rise in the year 2100 with RCP4.5 (blue) and RCP8.5 (red), respectively; the gray dots indicate the observed annual maxima. The triangle, square, and pentagon highlight the heights of extreme total water levels for a fixed return period equal to 500 years.


3.3 Future projections of extreme total water levels

We next explore how sea-level rise may influence the frequency of extreme total water levels across the sites analyzed. As described in Sect. 2.2.6, we only evaluate the influence of an increased mean sea level; i.e., we do not address possible changes in storm regimes (see, e.g., Tebaldi et al.2012).

We use site-specific sea-level projections from IPCC’s AR5 (Church et al.2013), which indicate an accelerating sea-level rise at all four observation sites (for each gauge station under analysis, the reader can refer to the panels a, c, e, and g in Fig. 7), with expected water level increases by the end of the century (RCP8.5) of 48 cm in Venice, 52 cm in Hornbæk, 59 cm in Newlyn, and 54 cm in Marseille. The panels (b), (d), (f), and (h) in Fig. 7 show observed (green line) and future (blue and red lines) changes in the return period associated with maximum water level events due to sea-level rise. These curves were obtained by using, in Eq. (8), the MEVD with parameters estimated on 5-year sliding windows. As noted above, changes in return levels are nonlinear: relative changes are more significant for smaller extremes than for larger ones. The Tr vs. z curves are concave downward and display varying slopes depending on the site explored. When a fixed return period is considered (e.g., 500 years), the mean-sea-level projections quantify the expected increase in extreme-total-water-level peaks for that particular return period. These changes vary heterogeneously across the different coastal sites explored. By comparing the percentage changes associated with the two emission scenarios and the two return periods (Table 4), Venice and Marseille are seen to experience the greatest changes in extreme total water levels (e.g., with reference to Tr = 100 years and RCP8.5, the variations at the Venice and Marseille sites are approximately 23 % and 29 %, respectively). All sites display greater percent changes for the lower 100-year return period in each scenario; i.e., “less infrequent” extremes will be most impacted by sea-level changes in the near future.

Changes in sea-level extremes can also be studied by focusing on changes in the return period of a fixed value of the total water level. To this end, one can define a sensitivity measure (SM) as

(9) SM = 1 T r d T r d m . s . l . = - 1 T r 1 [ 1 - G ( z - m . s . l . ) ] 2 f ( z - m . s . l . ) = - f ( z - m . s . l . ) T r ,

which is obtained by derivation of Eq. (8) and where f(z)=dGdz is the probability density function associated with G(z). Equation (9) shows that, at a given site and for a set value of z, the relative change in return period grows linearly with Tr. For example, see in Fig. 6b, d, f, and h how, for a given value of z, changes (horizontal spacing between the curves) are greater for Tr=1000 years than for Tr=500 years. The expression for SM also tells us that changes in Tr are more significant, everything else being equal, for values of z-m.s.l. near the mode of the distribution, where f(z-m.s.l.) is maximum (e.g., compare changes at the Venice or Hornbæk sites with those at Newlyn for a same initial value of Tr). Finally, Eq. (9) shows that percentage changes in Tr are highly site-dependent through the shape of f(z-m.s.l.).

Table 4Results of the percentage changes in total water level (Δz) obtained with the two future scenarios (RCP4.5 and RCP8.5) and the return periods (100 and 500 years) for the four sites under analysis.

Download Print Version | Download XLSX

4 Conclusions

The comparative examination of extreme-value distributions applied to observed sea levels at several sites along European coasts provides insights into the predictive performance of traditional and new approaches. Our analyses confirm some practical and conceptual advantages of the MEVD with respect to traditional methods. A cross-validation scheme (with 1000 realizations for each site) was used to compare model performance in high-quantile estimation. The use of two independent sub-samples (calibration and test sample) allows the quantification of actual predictive uncertainty.

We find that the MEVD approach provides reliable estimates of high quantiles for almost all the gauge stations explored, particularly when sufficiently long calibration sample sizes are considered. Differences in performance between the MEVD framework and GEV-based approaches are not large, and a definitive conclusion on an optimal solution independent of the return period of interest remains elusive. However, small differences in the estimation accuracy are relevant for engineering applications when dealing with rare extreme events. If we focus on high-return-period quantile estimation, our analyses show that the MEVD approach provides reliable estimates for almost all the gauge stations explored. Data from the Marseille gauge station exhibit a behavior that deviates from those from other sites, showing an inferior predictive performance of the MEVD with respect to GEV-based approaches. We interpret this fact to be linked to the small average number of sea-level peaks every year: the small sample of yearly ordinary events available prevents the MEVD from adding significant information with respect to GEV–BM and POT–GPD. Conversely, when we evaluate method performance for intermediate return period values, GEV–BM displays an overall greater robustness, and MEVD exhibits a greater absolute value of the estimation error.

Unfortunately, the size of the available datasets does not allow us to explore model performance for greater values of the return period. Future work could investigate if the estimation error can be reduced, with respect to what was found here, by using different approaches, e.g., by assuming “time-invariant” parameters in the ordinary distribution, whose estimation would thus be performed on the entire calibration dataset rather than on relatively short sliding windows. Synthetic water level time series may be produced by one of the several existing numerical models to extend analyses to arbitrarily long return periods.

Finally, we explored projections of the frequency of extreme total water levels driven by changes in mean sea level. The sensitivity of extreme-water-level frequency to sea-level rise is location-dependent, and we find that, at a given site and for a set value of the total water level extreme, the relative change in return time grows linearly with return time itself.

Data availability

All data used are publicly available from sources cited in the main text. Venice sea-level data were obtained from the “Centro Previsioni e Segnalazioni Maree” of the Venice Municipality (, Città di Venezia2020). The remaining water level data were downloaded from the University of Hawaii Sea Level Center (UHSLC) repository (, Caldwell et al.2015). The CMIP5 model outputs used for the future total water level projections are available at the “Integrated Climate Data Center” (ICDC) database of the University of Hamburg (, Church et al.2013).


The supplement related to this article is available online at:

Author contributions

MM designed and coordinated the research. MFC performed the research and analyzed the data. Both authors contributed to the writing and editing of the manuscript.

Competing interests

The contact author has declared that neither they nor their co-author has any competing interests.


Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Maria Francesca Caruso acknowledges the PhD School in Civil Sciences and Environmental Engineering at the University of Padova for funding her PhD. Marco Marani acknowledges the support of the Venezia2021 research program.

Financial support

Scientific activity was performed as part of the research program Venezia2021, coordinated by CORILA, with the contribution of the Provveditorato for the Public Works of Veneto, Trentino Alto Adige, and Friuli Venezia Giulia, provided through the concessionary of State Consorzio Venezia Nuova.

Review statement

This paper was edited by Philip Ward and reviewed by two anonymous referees.


Araújo, I. B. and Pugh, D. T.: Sea levels at Newlyn, 1915–2005: Analysis of trends for future flooding risks, J. Coastal Res., 24, 203–212,, 2008. a

Balkema, A. A. and de Haan, L.: Residual life time at great age, Ann. Probab., 2, 792–804,, 1974. a, b

Barbariol, F., Bidlot, J.-R., Cavaleri, L., Sclavo, M., Thomson, J., and Benetazzo, A.: Maximum wave heights from global model reanalysis, Prog. Oceanogr., 175, 139–160,, 2019. a

Beck, C. and Cohen, E. G. D.: Superstatistics, Physica A: Statistical Mechanics and its Applications, 322, 267–275,, 2003. a

Beirlant, J., Goegebeur, Y., Segers, J. J. J., and Teugels, J.: Statistics of Extremes: Theory and Applications, John Wiley & Sons, Chichester, UK, ISBN 0471976474, 2004. a, b

Benetazzo, A., Ardhuin, F., Bergamasco, F., Cavaleri, L., Guimarães, P. V., Schwendeman, M., Sclavo, M., Thomson, J., and Torsello, A.: On the shape and likelihood of oceanic rogue waves, Sci. Rep, 7, 1–11,, 2017. a

Bernardara, P., Andreewsky, M., and Benoit, M.: Application of regional frequency analysis to the estimation of extreme storm surges, J. Geophys. Res.-Oceans, 116, C02008,, 2011. a

Bernardara, P., Mazas, F., Kergadallan, X., and Hamm, L.: A two-step framework for over-threshold modelling of environmental extremes, Nat. Hazards Earth Syst. Sci., 14, 635–647,, 2014. a

Bernier, N. B. and Thompson, K. R.: Tide-surge interaction off the east coast of Canada and northeastern United States, J. Geophys. Res.-Oceans, 112, C06008,, 2007. a

Bommier, E.: Peaks-over-threshold modelling of environmental data, Master's thesis, Department of Mathematics, Uppsala University, (last access: 7 June 2021), 2014. a

Caldwell, P. C., Merrifield, M. A., and Thompson, P. R.: Sea level measured by tide gauges from global oceans – the Joint Archive for Sea Level holdings (NCEI Accession 0019568), Version 5.5, NOAA National Centers for Environmental Information [data set],, 2015. a, b

Cancelliere, A.: Non Stationary Analysis of Extreme Events, Water Resour. Manag., 31, 3097–3110,, 2017. a

Castillo, E., Hadi, A. S., Balakrishnan, N., and Sarabia, J. M.: Extreme Value and Related Models in Engineering and Science Applications, John Wiley & Sons, New York, ISBN 978-0-471-67172-5, 2005. a

Chan, S., Chu, J., Zhang, Y., and Nadarajah, S.: An extreme value analysis of the tail relationships between returns and volumes for high frequency cryptocurrencies, Research in International Business and Finance, 59, 101541,, 2022. a

Chiu, Y., Chebana, F., Abdous, B., Bélanger, D., and Gosselin, P.: Mortality and morbidity peaks modeling: an extreme value theory approach, Stat. Methods Med. Res., 27, 1498–1512,, 2018. a

Church, J. A. and White, N. J.: A 20th century acceleration in global sea‐level rise, Surv. Geophys., 33, L01602,, 2006. a

Church, J. A. and White, N. J.: Sea-Level Rise from the Late 19th to the Early 21st Century, Surv. Geophys., 32, 585–602,, 2011. a

Church, J. A., Clark, P. U., Cazenave, A., Gregory, J. M., Jevrejeva, S., Levermann, A., Merrifield, M. A., Milne, G. A., Nerem, R. S., Nunn, P. D., Payne, A. J., Pfeffer, W. T., Stammer, D., and Unnikrishnan, A. S.: Sea Level Change, in: Climate Change 2013: The Physical Science Basis, Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by: Stocker, T., Qin, D., Plattner, G.-K., Tignor, M., Allen, S., Boschung, J., Nauels, A., Xia, Y., Bex, V., and Midgley, P., chap. 13, 1137–1216, Cambridge University Press, (last access: 30 June 2021), (data available at:, last access: 16 December 2020) 2013. a, b, c, d

Cid, A., Menéndez, M., Castanedo, S., Abascal, A. J., Méndez, F. J., and Medina, R.: Long-term changes in the frequency, intensity and duration of extreme storm surge events in southern Europe, Clim. Dynam., 46, 1503–1516,, 2015. a

Città di Venezia: Centro Previsioni e Segnalazioni Maree,, last access: 5 March 2020. a, b

Coles, S.: An Introduction to Statistical Modeling of Extreme Values, Springer, London, 1. Edn.,, 2001. a, b, c, d, e, f, g, h, i

Coles, S. and Tawn, J.: Bayesian modelling of extreme surges on the UK east coast, Philos. T. Roy. Soc. A, 363, 1387–1406,, 2005. a, b

Dalrymple, T.: Flood frequency Analysis, Manual of hydrology: Part 3. Flood-flow techniques 1543, Geological Survey Water Supply Paper 1543-A, U.S. Government Publishing Office, Washington, WA, USA,, 1960. a

Davison, A. C. and Smith, R. L.: Models for Exceedances over High Thresholds, J. Roy. Stat. Soc. B-Met., 52, 393–442, 1990. a, b, c

De Zea Bermudez, P. and Mendes, Z.: Extreme value theory in medical sciences: modeling total high cholesterol levels, Journal of Statistical Theory and Practice, 6, 468–491,, 2012. a

Dixon, M. J. and Tawn, J. A.: The effect of non-stationarity on extreme sea-level estimation, J. Roy. Stat. Soc. C-App., 48, 135–151, 1999. a

Dubey, S. D.: A compound weibull distribution, Nav. Res. Logist. Q., 15, 179–188,, 1968. a

Eliot, M.: Influence of interannual tidal modulation on coastal flooding along the Western Australian coast, J. Geophys. Res.-Oceans, 115, C11013,, 2010. a

Elvidge, S. and Angling, M. J.: Using Extreme Value Theory for Determining the Probability of Carrington-Like Solar Flares, Adv. Space Res., 16, 417–421,, 2018. a

Embrechts, P., Klüppelberg, C., and Mikosch, T.: Modelling Extremal Events for insurance and finance, Springer, New York, Applications of Mathematics, Springer, Berlin, Heidelberg, 1 edn.,, 1997. a

Ferro, C. A. T. and Segers, J.: Inference for clusters of extreme values, J. Roy Stat. Soc. B-Met., 65, 545–556,, 2003. a

Finkenstadt, B. and Rootzén, H.: Extreme Values in Finance, Telecommunications and the Environment, Monographs on Statistics & Applied Probability, Chapman & Hall/CRC, 1 Edn.,, 2004. a

Fisher, R. A. and Tippett, L. H. C.: Limiting forms of the frequency distribution of the largest or smallest member of a sample, Math. Proc. Cambridge, 24, 180–190,, 1928. a

Fortunato, A. B., Li, K., Bertin, X., Rodrigues, M., and Miguez, B. M.: Determination of extreme sea levels along the Iberian Atlantic coast, Ocean Eng., 111, 471–482,, 2016. a

Fréchet, M. R.: Sur la loi de probabilité de l’écart maximum, Ann. Soc. Polon. Math., 6, 93–116, 1927. a

Gnedenko, B. V.: Sur la distribution limite du terme maximum d’une serie aleatoire, Ann. Math., 44, 423–453, 1943. a

Goring, D. G., Stephens, S. A., Bell, R. G., and P., P. C.: Estimation of Extreme Sea Levels in a Tide-Dominated Environment Using Short Data Records, J. Waterw. Port. C, 137, 150–159,, 2011. a

Greenwood, J. A., Landwehr, J. M., Matalas, N. C., and Wallis, J. R.: Probability weighted moments: definition and relation to parameters of several distributions expressable in inverse form, Water Resour. Res., 15, 1049–1054,, 1979. a

Gumbel, E. J.: Statistics of Extremes, Columbia University Press, New York, 1958. a, b

Haigh, I. D., Nicholls, R., and Wells, N.: Assessing changes in extreme sea levels: Application to the English Channel, 1900–2006, Cont. Shelf. Res., 30, 1042–1055,, 2010. a, b, c

Haigh, I. D., Eliot, M., and Pattiaratchi, C.: Global influences of the 18.61‐year nodal cycle and 8.85-year cycle of lunar perigee on high tidal levels, J. Geophys. Res.-Oceans, 116, 25249,, 2011. a

Haigh, I. D., MacPherson, L. R., Mason, M. S., Wijeratne, E. M. S., Pattiaratchi, C. B., Crompton, R. P., and George, S.: Estimating present day extreme water level exceedance probabilities around the coastline of Australia: tropical cyclone-induced storm surges, Clim. Dynam., 42, 139–157,, 2014a. a

Haigh, I. D., Wahl, T., Rohling, E. J., Price, R. M., Pattiaratchi, C., Calafat, F. M., and Dangendorf, S.: Timescales for detecting a significant acceleration in sea level rise, Nat. Commun., 5, 3635,, 2014b. a, b

Hall, T. M. and Sobel, A. H.: On the impact angle of Hurricane Sandy's New Jersey landfall, Geophys. Res. Lett., 40, 2312–2315,, 2013. a

Hamdi, Y., Bardet, L., Duluc, C.-M., and Rebour, V.: Extreme storm surges: a comparative study of frequency analysis approaches, Nat. Hazards Earth Syst. Sci., 14, 2053–2067,, 2014. a

Hamdi, Y., Bardet, L., Duluc, C.-M., and Rebour, V.: Use of historical information in extreme-surge frequency estimation: the case of marine flooding on the La Rochelle site in France, Nat. Hazards Earth Syst. Sci., 15, 1515–1531,, 2015. a

Hamdi, Y., Garnier, E., Giloy, N., Duluc, C.-M., and Rebour, V.: Analysis of the risk associated with coastal flooding hazards: a new historical extreme storm surges dataset for Dunkirk, France, Nat. Hazards Earth Syst. Sci., 18, 3383–3402,, 2018. a

Hamon, B. V. and Middleton, J. F.: Return periods of extreme sea levels: the exceedance probability method, Int. Hydrogr. Rev., 66, 165–177, 1989. a

Hay, C., Morrow, E., Kopp, R., and Mitrovica, J. X.: Probabilistic reanalysis of twentieth-century sea-level rise, Nature, 517, 481–484,, 2015. a

Hosking, J. R. M.: L-moments: Analysis and estimation of distributions using linear combinations of order statistics, J. Roy. Stat. Soc. B-Met., 52, 105–124, 1990. a

Hosking, J. R. M. and Wallis, J. R.: Parameter and Quantile Estimation for the Generalized Pareto Distribution, Technometrics, 29, 339–349, 1987. a

Hosking, J. R. M., Wallis, J. R., and Wood, E. F.: Estimation of the generalized extreme-value distribution by the method of probability-weighted moments, Technometrics, 27, 251–261,, 1985. a

Hosseini, S. R., Scaioni, M., and Marani, M.: Extreme Atlantic hurricane probability of occurrence through the Metastatistical Extreme Value Distribution, Geophys. Res. Lett., 47, 2019GL086138,, 2020. a, b

Jenkinson, A.: The frequency distribution of the annual maximum (or minimum) values of meteorological elements, Q. J. Roy. Meteor. Soc., 81, 158–171, 1955. a

Jevrejeva, S., Moore, J. C., Grinsted, A., and Woodworth, P. L.: Recent global sea level acceleration started over 200 years ago?, Geophys. Res. Lett., 35, L08715,, 2008. a

Johns, B. and Ali, M. A.: The numerical modeling of storm surges in the Bay of Bengal, Q. J. Roy. Meteor. Soc., 106, 1–18,, 1980. a

Katz, R. W., Parlange, M. B., and Naveau, P.: Statistics of extremes in hydrology, Adv. Water Resour., 25, 1287–1304,, 2002. a

Katz, R. W., Brush, G. S., and Parlange, M. B.: Statistics of extremes: modeling ecological disturbances, Ecology, 86, 1124–1134,, 2005. a

Li, G., Zhang, X., Zwiers, F., and Wen, Q. H.: Quantification of uncertainty in high-resolution temperature scenarios for North America, J. Climate, 25, 3373–3389,, 2012. a

Lowe, J. A., Woodworth, P. L., Knutson, T., McDonald, R. E., McInnes, K., Woth, K., Von Storch, H., Wolf, J., Swail, V., Bernier, N., Gulev, S., Horsburgh, K., Unnikrishnan, A. S., Hunter, J., and Weisse, R.: Past and future changes in extreme sea levels and waves, in: Understanding Sea‐Level Rise and Variability, edited by: Church, J. A., Woodworth, P. L., Aarup, T., and Wilson, W. S., chap. 11, 326–375, Wiley-Blackwell,, 2010. a, b

Mann, H. B.: Nonparametric Tests Against Trend, Econometrica, 13, 245–259, 1945. a, b

Marani, M. and Ignaccolo, M.: A metastatistical approach to rainfall extremes, Adv. Water Resour., 79, 121–126,, 2015. a, b, c

Marani, M. and Zorzetto, E.: Doubly stochastic distributions of extreme events, arXiv [preprint], arXiv:1902.09862, 2019. a

Marra, F., Nikolopoulos, E. I., Anagnostou, E. N., and Morin, E.: Metastatistical extreme value analysis of hourly rainfall from short records: Estimation of high quantiles and impact of measurement errors, Adv. Water Resour., 117, 27–39,, 2018. a, b, c

McInnes, K. L., Hubbert, G., Macadam, I., and O’Grady, J.: An assessment of current and future vulnerability to coastal inundation due to sea-level extremes in Victoria, southeast Australia, Int. J. Climatol., 33, 33–47,, 2013. a

Meehl, G., Stocker, T., Collins, W. D., Friedlingstein, P., Gaye, A., Gregory, J. M., Kitoh, A., Knutti, R., Murphy, J. M., Noda, A., Raper, S. C. B., Watterson, I. G., Weaver, A. J., and Zhao, Z. C.: Global Climate Projections, in: Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change, edited by: Solomon, S., Qin, D., Manning, M., Chen, Z., Marquis, M., Averyt, K., Tignor, M., and Miller, H., chap. 10, 747–846, Cambridge University Press, (last access: 30 June 2021), 2007. a

Mekonnen, K., Melesse, A. M., and Woldesenbet, T. A.: Effect of temporal sampling mismatches between satellite rainfall estimates and rain gauge observations on modelling extreme rainfall in the Upper Awash Basin, Ethiopia, J. Hydrol., 598, 126467,, 2021. a

Menéndez, M. and Woodworth, P. L.: Changes in extreme high water levels based on a quasi‐global tide‐gauge data set, J. Geophys. Res., 115, 1124–1134,, 2010. a, b

Middleton, J. F. and Thompson, K. R.: Return periods of extreme sea levels from short records, J. Geophys. Res., 91, 11707–11716,, 1986. a

Miniussi, A. and Marani, M.: Estimation of Daily Rainfall Extremes Through the Metastatistical Extreme Value Distribution: Uncertainty Minimization and Implications for Trend Detection, Water Resour. Res., 56, e2019WR026535,, 2020. a, b, c, d

Miniussi, A. and Marra, F.: Estimation of extreme daily precipitation return levels at-site and in ungauged locations using the simplified MEV approach, J. Hydrol., 603, 126946,, 2021. a

Miniussi, A., Marani, M., and Villarini, G.: Metastatistical Extreme Value Distribution applied to floods across the continental United States, Adv. Water Resour., 136, 103498,, 2020a. a, b

Miniussi, A., Villarini, G., and Marani, M.: Analyses through the metastatistical extreme value distribution identify contributions of tropical cyclones to rainfall extremes in the eastern United States, Geophys. Res. Lett., 47, e2020GL087238,, 2020b. a, b

Önöz, B. and Bayazit, M.: Effect of the occurrence process of the peaks over threshold on the flood estimates, J. Hydrol., 244, 86–96,, 2001. a

Oppenheimer, M., Glavovic, B., Hinkel, J., van de Wal, R., Magnan, A., Abd-Elgawad, A., Cai, R., Cifuentes-Jara, M., DeConto, R., Ghosh, T., Hay, J., Isla, F., Marzeion, B., Meyssignac, B., and Sebesvari, Z.: Sea Level Rise and Implications for Low-Lying Islands, Coasts and Communities, in: IPCC Special Report on the Ocean and Cryosphere in a Changing Climate, edited by: Pörtner, H.-O., Roberts, D., Masson-Delmotte, V., Zhai, P., Tignor, M., Poloczanska, E., Mintenbeck, K., Alegría, A., Nicolai, M., Okem, A., Petzold, J., Rama, B., and Weyer, N. M., chap. 4, 321–445, Cambridge University Press, (last access: 30 June 2021), 2019. a

Papalexiou, S. M. and Koutsoyiannis, D.: Battle of extreme value distributions: A global survey on extreme daily rainfall, Water Resour. Res., 49, 187–201,, 2013. a

Peng, D., Hill, E. M., and Meltzner, A. J.and Switzer, A. D.: Tide gauge records show that the 18.61‐year nodal tidal cycle can change high water levels by up to 30 cm, J. Geophys. Res.-Oceans, 124, 736–749,, 2019. a

Pickands III, J.: Statistical inference using extreme order statistics, Ann. Stat., 3, 119–131,, 1975. a, b

Pisarenko, V. F., Rodkin, M. V., and Rukavishnikova, T. A.: Estimation of the probability of strongest seismic disasters based on the extreme value theory, Izvestiya, Physics of the Solid Earth, 50, 311–324,, 2014a. a

Pisarenko, V. F., Sornette, A., Sornette, D., and Rodkin, M. V.: Characterization of the Tail of the Distribution of Earthquake Magnitudes by Combining the GEV and GPD Descriptions of Extreme Value Theory, Pure Appl. Geophys., 171, 1599–1624,, 2014b. a

Pugh, D. T. and Vassie, J. M.: Extreme sea levels from tide and surge probability, in: Proc. 16th International Conference on Coastal Engineering 1978, Hamburg, Germany, chap. 52, 911–930, American Society of Civil Engineers,, 1978. a, b

Pugh, D. T. and Vassie, J. M.: Applications of the joint probability method for extreme sea level computations, in: Proc. Inst. Civ. Eng., Part 2, 959–975,, 1980. a

Pugh, D. T. and Woodworth, P. L.: Sea-Level Science: Understanding Tides, Surges, Tsunamis and Mean Sea-Level Changes, Cambridge University Press,, 2014. a

Rueda, A., Camus, P., Méndez, F. J., Tomás, A., and Luceño, A.: An extreme value model for maximum wave heights based on weather types, J. Geophys. Res.-Oceans, 121, 1262–1273,, 2016. a

Rypkema, D. C., Horvitz, C. C., and Tuljapurkar, S.: How climate affects extreme events and hence ecological population models, Ecology, 100, e02684,, 2019. a

Scheffner, N. W., Borgman, L. E., and Mark, D. J.: Empirical simulation technique based storm surge frequency analyses, J. Waterw. Port Coast, 122, 93–101,, 1996. a

Schellander, H., Lieb, A., and Hell, T.: Error structure of metastatistical and generalized extreme value distributions for modeling extreme rainfall in Austria, Earth and Space Science, 6, 1616–1632,, 2019. a

Serinaldi, F. and Kilsby, C. G.: Rainfall extremes: Toward reconciliation after the battle of distributions, Water Resour. Res., 50, 336–352,, 2014. a

Smith, R. L.: Extreme value theory based on the r largest annual events, J. Hydrol., 86, 27–43,, 1986. a

Solari, S., Egüen, M., Polo, M. J., and Losada, M. A.: Peaks Over Threshold (POT): A methodology for automatic threshold estimation using goodness of fit p-value, Water Resour. Res., 53, 2833–2849,, 2017. a

Songchitruksa, P. and Tarko, A. P.: The extreme value theory approach to safety estimation, Accident Anal. Prev., 38, 811–822,, 2006. a

Tawn, J. A.: An extreme-value theory model for dependent observations, J. Hydrol., 101, 227–250,, 1988. a

Tawn, J. A. and Vassie, J. M.: Extreme sea levels: The joint probability method revisited and revised, in: Proc. Inst. Civ. Eng., Part 2, 429–442,, 1989. a

Tebaldi, C., Strauss, B. H., and Zervas, C. E.: Modelling sea level rise impacts on storm surges along US coasts, Environ. Res. Lett., 7, 014–032,, 2012. a, b, c

Valle-Levinson, A., Marani, M., Carniello, L., D'Alpaos, A., and Lanzoni, S.: Astronomic link to anomalously high mean sea level in the northern Adriatic Sea, Estuar. Coast. Shelf. S., 257, 107418,, 2021. a

Volpi, E., Fiori, A., Grimaldi, S., Lombardo, F., and Koutsoyiannis, D.: Save hydrological observations! Return period estimation without data decimation, J. Hydrol., 571, 782–792,, 2019. a

Von Mises, R.: La distribution de la plus grande de n valuers, Rev. math. Union interbalcanique, 1, 141–160, 1936. a

Vousdoukas, M. I., Voukouvalas, E., Annunziato, A., Giardino, A., and Feyen, L.: Projections of extreme storm surge levels along Europe, Clim. Dynam., 47, 3171–3190,, 2016. a

Wahl, T., Haigh, I. D., Nicholls, R. J., Arns, A., Dangendorf, S., Hinkel, J., and Slangen, A. B.: Understanding extreme sea levels for broad-scale coastal impact and adaptation analysis, Nat. Commun., 8, 1–12,, 2017. a

Woodworth, P. L. and Blackman, D. L.: Changes in high waters at Liverpool since 1768, Int. J. Climatol., 22, 697–7147,, 2002. a

Woodworth, P. L. and Blackman, D. L.: Evidence for systematic changes in extreme high waters since the mid‐1970s, J. Climate, 17, 1190–1197,<1190:EFSCIE>2.0.CO;2, 2004.  a

Woodworth, P. L., Menéndez, M., and Roland Gehrels, W.: Evidence for Century-Timescale Acceleration in Mean Sea Levels and for Recent Changes in Extreme Sea Levels, Surv. Geophys., 32, 603–618,, 2011. a

Zhang, K., Douglas, B. C., and Leatherman, S. P.: Twentieth-century storm activity along the U.S. east coast, J. Climate, 13, 1748–1761,<1748:TCSAAT>2.0.CO;2, 2000. a

Zhang, W.-Z., Shi, F., Hong, H.-S., Shang, S.-P., and Kirby, J. T.: Tide–surge interaction intensified by the Taiwan Strait, J. Geophys. Res.-Oceans, 115, C06012,, 2010. a

Zorzetto, E. and Marani, M.: Downscaling of rainfall extremes from satellite observations, Water Resour. Res., 55, 156–174,, 2019. a

Zorzetto, E. and Marani, M.: Extreme value metastatistical analysis of remotely sensed rainfall in ungauged areas: Spatial downscaling and error modelling, Adv. Water Resour., 135, 103483,, 2020. a

Zorzetto, E., Botter, G., and Marani, M.: On the emergence of rainfall extremes from ordinary events, Geophys. Res. Lett., 43, 8076–8082,, 2016. a, b, c

Short summary
We comparatively evaluate the predictive performance of traditional and new approaches to estimate the probability distributions of extreme coastal water levels. The metastatistical approach maximizes the use of observational information and provides reliable estimates of high quantiles with respect to traditional methods. Leveraging the increased estimation accuracy afforded by this approach, we investigate future changes in the frequency of extreme total water levels.
Final-revised paper