To withstand coastal flooding, protections for coastal facilities and structures must be designed with the most accurate estimates of extreme storm surge return levels (SSRLs). However, because of the paucity of data, local statistical analyses often lead to poor frequency estimations. Regional frequency analysis (RFA) reduces the uncertainties associated with these estimations by extending the dataset from local (only the data available at the target site) to regional (data at all the neighboring sites, including the target site) and by assuming similar extremal behavior at the scale of a region. In this work, the empirical spatial extremogram (ESE) approach is used. The ESE is a graph representing all the coefficients of extremal dependence between a given target site and all the other sites in the whole region. It quantifies the pairwise closeness between sites based on the extremal dependence. The ESE approach, which should give more confidence in the physical homogeneity of the region of interest, is applied to a database of extreme skew storm surges (SSSs) and used to perform a RFA.

To resist flooding hazard in coastal areas, considering the most accurate
frequency estimates of extreme storm surge return levels (SSRLs) (1000-year
return level, for instance) with an appropriate confidence level becomes a
major operational concern when designing protections. The

More recently, Weiss (2014) introduced a physically based method to
delineate homogeneous regions in order to perform a RFA of the extreme SSSs.
This method depends on the storm footprints identified through a
declustering algorithm using a storm propagation probabilistic criterion.
However, if a target site is very close to the limit of the region, the
information at a site located on the other side of that limit
can be wrongly excluded, even though both sites likely offer similar
information and likely have similar asymptotic properties. This problem is
also known as the “border effect”. For instance, in the regions of
interest obtained by Weiss (2014), the two French sites, Boulogne-Sur-Mer
and Calais, are located in two different regions, while they are
geographically close, with a distance of about 30 km. Indeed, despite the
fact that both sites face different seas (North Sea and English Channel),
they have the same climate (according to the climate comparator proposed by
Météo-France:

To address this limitation (the border-effect problem) and to form a physically homogeneous region centered on a target site, we take up in the present paper an approach proposed by Hamdi et al. (2016) that uses the empirical spatial extremogram (ESE), in which the extremal dependence between two observation series becomes a measure of the neighborhood between the two associated sites. A pairwise measure between sites based on the spatial extremal coefficients was defined to carry out a RFA applied to extreme SSSs. The composition of the regions built here is based on the similarity of the sites' attributes. The higher the spatial extremal coefficient between the target site and another site, the greater the dependence of extreme SSSs, indicating that storms impacting the target site tend to also impact the other site, which can therefore be included in the region of the target site. Indeed, in a region of interest, the process generating storms that impact the target site will tend to impact the other sites in the region as well, and vice versa. It is in this sense that the processes generating storms in a region are considered physically homogeneous. It is then assumed that sites with sufficiently high spatial extremal coefficients (with the target site) may be included in the region of influence of the target site (the physically homogeneous region). The region may also be considered a typical storm footprint in the neighborhood of the target site. The dependence between sites must, of course, be taken into account in the statistical analysis. Once a physically homogeneous region (centered on the target site) is formed, the statistical homogeneity is checked and the regional frequency estimation (including, in particular, the dependence model and the calculation of the effective duration of observations) is subsequently performed.
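The pairwise measure described above can be illustrated with a minimal Python sketch (the study itself was carried out in R, and the authors' code is not reproduced here). It assumes that the extremal dependence coefficient is estimated empirically as the proportion of target-site exceedances that coincide with an exceedance at the other site, with both extreme quantiles computed over the common observation period. The function names, the quantile level `p`, and the value of `neighborhood_threshold` are hypothetical choices made for illustration only.

```python
import numpy as np

def extremal_coefficient(target, other, p=0.99):
    """Empirical estimate of P(other > q_other | target > q_target).

    Both series must be aligned in time; missing values are NaN.
    Assumption: quantiles are taken over the common observation period.
    """
    target = np.asarray(target, dtype=float)
    other = np.asarray(other, dtype=float)
    # Keep only the time steps at which both gauges recorded a value
    common = ~np.isnan(target) & ~np.isnan(other)
    x, y = target[common], other[common]
    if x.size == 0:
        return float("nan")
    qx, qy = np.quantile(x, p), np.quantile(y, p)
    exceed_x = x > qx
    if not exceed_x.any():
        return float("nan")
    # Share of target-site extremes that are also extreme at the other site
    return float(np.mean(y[exceed_x] > qy))

def region_of_influence(series_by_site, target_name, p=0.99,
                        neighborhood_threshold=0.3):
    """Names of the sites whose coefficient with the target reaches the threshold.

    The default neighborhood_threshold is a placeholder, not the paper's value.
    """
    target = series_by_site[target_name]
    neighbors = []
    for name, series in series_by_site.items():
        if name == target_name:
            continue
        chi = extremal_coefficient(target, series, p)
        if not np.isnan(chi) and chi >= neighborhood_threshold:
            neighbors.append(name)
    return sorted(neighbors)
```

Plotting the coefficients returned by `extremal_coefficient` for one target site against all other sites gives a graph of the kind the ESE represents; the threshold then selects the candidate sites for the region.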

Our objective in this work is to conduct a RFA using the ESE. The ESE approach should enable us to remove the border effect and the distance impact and to be more confident about the physical homogeneity of the region. The paper is organized as follows. The methods are described in Sect. 2, which also presents the case study and the SSS data for the whole region. In Sect. 3, the ESE is applied to some target sites. The results of the analysis are further discussed in Sect. 4, before the conclusion and perspectives in Sect. 5.

The objective of the present section is to use an approach based on the spatial extremogram to form a homogeneous region to be used in a RFA of extreme SSSs. The region of interest, which is expected to be centered on a given target site, must be physically and statistically homogeneous. The use of the spatial extremogram technique in index-flood-based RFA is the main contribution of the present paper. The spatial-extremogram-based procedure is expected to lead to regions of interest with no border effect and with less residual heterogeneity.

Let

An illustrative example on how the extremal dependence coefficient
is empirically computed. Among the seven extreme twins (

The scheme for obtaining the extremal dependence between a target site and
all the other sites has to be applied differently for each case study. The
first question one can ask is the following: from which value of the extremal dependence
coefficient can two sites begin to be considered neighbors? This leads to
some other questions: when does the extremal correlation begin to be
statistically significant? Or from which value of

Let

Once the homogeneous region of interest centered on the target site is
obtained, the procedure begins by constructing a regional sample of
independent storms. A storm is defined as a physical event that induces
extreme SSSs (i.e., exceeds the extreme quantile

The RFA uses the index-flood principle, which stipulates that within a
statistically homogeneous region, extreme events normalized with a local
index are drawn from a common regional distribution. It is assumed that the
distribution of these extreme normalized SSSs converges to a generalized
Pareto distribution (GPD) and the number of exceedances converges to a
Poisson distribution. The annual SSS quantile was used as a local index to
normalize on-site samples. A further noteworthy statistical setting of the
developed RFA is that it uses a relatively high threshold, allowing the
selection of extreme storms corresponding to an annual rate

In a regional context, the effective duration of observations

The regional frequency model used herein is based on the extreme value theory. As mentioned in Sect. 2.2, the peaks-over-threshold (POT) approach (e.g., Pickands, 1975), in which the excesses are analyzed with the
GPD, is used in the RFA, using a threshold equal to 1 and taking into account
the seasonality. Seasonal effects, considered for other variables in the
literature (e.g., Morton et al., 1997; Méndez et al., 2008), can be modeled through a sinusoid. The regional distribution becomes a discrete mixture of GPD and sinusoid, with a seasonally varying scale parameter

Let us denote

SSS datasets are obtained from the temporal series of hourly observations and predicted tide levels (the astronomical tide), collected at a total of 67 sites located on the Spanish, French (Atlantic and English Channel), and British coasts (see Fig. 2). The French tide gauges are managed by the French oceanographic service (SHOM – Service Hydrographique et Océanographique de la Marine), while Spanish and British ones are managed by IEO (Instituto Español de Oceanografía, Spain) and the British Oceanographic Data Centre (BODC), respectively.

Location of sites used for the study: 67 ports along the Spanish, French, and British coasts. Each site is associated with a number. The table on the right shows the correspondence between numbers and sites. The circled points represent the target sites for which a centered RFA is carried out in this study.

For convenience, the same observation periods as those used by Weiss (2014) were used in the present study. They range from 1846 for Brest (in France) to 2011 (for almost all the sites), and they show an average effective duration of observations of 31 years. In most cases, local series are characterized by the presence of many gaps. It should be noted that the sea levels must be corrected for possible eustatism (a general variation in mean sea level) in order to avoid biasing the calculation of the surges: a correction is applied if annual sea levels (calculated following the Permanent Service for Mean Sea Level recommendations) show significant trends. It is also noteworthy that the impact of climate change on the estimated return levels and associated uncertainties is not covered by this paper. The use of projected sea level rise could, however, be the subject of another paper.

Furthermore, in connection with the choice of the variable of interest, the focus is restricted to SSS series because in regions with strong tidal influence, the coastal flooding hazard is most noticeable around the time of high tide. Indeed, the SSS is a fundamental input for many statistical investigations of coastal hazards. It is defined as the difference between the maximum observed sea level and the one predicted around the time of high tide. Thus, the resulting SSS series have a temporal resolution of approximately 12.4 h. The reader is referred to Bernardara et al. (2011) for a more detailed introduction on skew surges. The developed RFA is performed at many target sites along the French (Atlantic and English Channel) coast. One of the most important features of these target sites is the fact that the region in which they are located has experienced significant storms during the last few decades (1953, 1987, Lothar and Martin in 1999, and Xynthia in 2010). Figure 2 displays the geographic location of the whole region. As depicted in the left-hand side of the figure, three target sites (red empty circles) are selected to perform the developed methodology and estimate the 1000-year return level.
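The definition of the skew surge given above can be sketched in Python as follows. This is a simplified illustration, not the processing chain used for the study: it assumes hourly, time-aligned arrays of observed and predicted levels, detects predicted high tides as local maxima of the predicted series, and searches for the maximum observed level within a window of `half_window` hours around each one. The function name and the window width are hypothetical.

```python
import numpy as np

def skew_surges(observed, predicted, half_window=3):
    """Return (index, skew surge) pairs, one per predicted high tide.

    observed, predicted: hourly levels on the same time axis (NaN = gap).
    half_window: hours searched on each side of the predicted high tide
                 (an illustrative choice, not a value from the paper).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    surges = []
    for i in range(1, len(predicted) - 1):
        # A predicted high tide is a local maximum of the astronomical tide
        is_high_tide = (predicted[i] > predicted[i - 1]
                        and predicted[i] >= predicted[i + 1])
        if not is_high_tide:
            continue
        lo = max(0, i - half_window)
        hi = min(len(observed), i + half_window + 1)
        window = observed[lo:hi]
        if np.all(np.isnan(window)):
            continue  # no observation around this high tide
        # Skew surge: maximum observed level minus predicted high-tide level,
        # whether or not the two maxima occur at the same instant
        surges.append((i, float(np.nanmax(window) - predicted[i])))
    return surges
```

Because one value is produced per high tide, the resulting series has the tidal-cycle resolution of roughly 12.4 h mentioned above.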

All the simulations are carried out within the R environment (open-source
software for statistical computing:

Two types of thresholds are used in the calculation of the empirical spatial
extremogram. The first threshold sets the extreme quantiles to extract
extreme SSSs, and the second one (the neighborhood threshold) sets the
extremal coefficient above which sites are considered neighbors. Since
thresholds that are too high introduce high variance and those that are too low
introduce bias in the results, a trade-off must be made between
variance and bias. Indeed, the asymptotic properties of the marginal SSSs
can be violated when too low of an extreme quantile

One of the most important features of the Calais site is the fact that it is
located close to a border of one of the regions found by Weiss (2014). In
addition, the region in which this site is located has experienced
significant storms during the last 2 decades (Martin in 1999 and Xynthia in 2010). Figure 3 displays the geographic location of five
homogeneous regions according to Weiss (2014). The scheme for obtaining
the pairwise extremal dependence coefficients between Calais as a target
site and all the other sites is applied herein. From the ESE depicted in
Fig. 4a, a geographically coherent region of interest
corresponding to the neighborhood threshold (

Five physically and statistically homogeneous regions (according to Weiss, 2014). The regions are represented by five colors. This figure shows that, for example, site 24 (Calais) is located in the region shown in blue and is very close to a border. Site 23 (Boulogne), although very close to site 24 (Calais), is in another region (shown in red). This separation between site 23 and site 24 may seem artificial.

The ESE for Calais

Physically homogeneous regions (the sites belonging to a region are enclosed in red zones) for Calais, Brest, and La Rochelle. Neighboring sites are represented with green dots (target sites are represented with red dots). Target sites are not close to a border.

The ESE for the whole region with Brest as a target site is shown in Fig. 4b, and the associated region of interest is depicted in the
middle panel of Fig. 5. As shown in this figure, the region of interest
around Brest is larger than the one centered on Calais, with many sites for
which the extremal dependence coefficients are at the limit of the
neighboring threshold

One of the most important features of the La Rochelle site is the fact that
the region in which this site is located recently experienced the
storm Xynthia (2010). It has been the subject of many studies after this
storm (e.g., Hamdi et al., 2015). Figure 4c shows the ESE (with La Rochelle as a target site), and the rightmost panel of Fig. 5 depicts the homogeneous region of interest centered on La Rochelle. As with
the first two target sites, Calais and Brest, it can be seen in the rightmost
panel of Fig. 5 that the region of interest is also smaller than that
obtained by Weiss (2014) but better centered on the La Rochelle site. The
time during which La Rochelle and both Saint-Malo (station number 18) and
Saint-Servan (station number 17) sites operated simultaneously is relatively
small (14 years for Saint-Malo and 2 years for Saint-Servan). The extremal
dependence coefficients for these two sites are equal to 0.29 and 0.4, respectively. The question was raised of whether Saint-Servan and Saint-Malo should be considered inside the region. Since both sites are very close to each other (less than 2 km apart), it seems logical to either add them both to the region or withdraw them both from it. We finally decided to integrate them into the region of interest because the Jersey site is part of the region centered on La Rochelle, with an
extremal dependence coefficient equal to the neighboring threshold (

Once the physically homogeneous regions are formed, the statistical
homogeneity must be verified. As mentioned earlier in this paper, the
L-moment-based homogeneity tests (heterogeneity measure and discrepancy)
are used. The heterogeneity measure

As mentioned earlier in this paper, a regional pooling method is used to estimate
the regional distribution for each homogeneous region. In this method, a
storm that impacts several sites (thus generating intersite dependence)
is considered only once in the regional sample. The
distribution of the maximum regional SSSs (

Comparison of total and effective durations of observations (years) used in the present study with those used by Weiss (2014). The effective duration of observations takes into account the intersite dependence. A total duration of observations is the sum of all on-site durations in the region of interest (without intersite dependence).

A GPD taking into account the seasonality is then fitted to the
regional sample. The distribution parameters are estimated with the
penalized maximum likelihood method (Coles and Dixon, 1999). The most
adequate distributions are selected using the Akaike information criterion (AIC). The Exp

The GPD–sinusoid fitted to regional SSSs (plotting positions, RLs,
and confidence intervals) for the target sites: Exp

In the next RFA step, local quantiles are estimated by multiplying the regional ones by the local indices. The results are summarized in Table 2, presenting a comparison of the 1000-year return levels with associated confidence intervals.
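The scaling step described above can be sketched in Python under simplifying assumptions. The sketch uses the standard return-level expression for a Poisson–GPD peaks-over-threshold model with constant parameters, so it ignores the seasonally varying scale parameter of the regional model used in this paper. The function names are hypothetical, and the parameter values in the usage example are placeholders rather than fitted values; only the normalized threshold of 1 is taken from the text.

```python
import math

def regional_return_level(T, rate, scale, shape, threshold=1.0):
    """T-year return level of the normalized regional variable.

    Standard Poisson-GPD formula with constant parameters (an assumption:
    the seasonal GPD-sinusoid model is not reproduced here).
    rate: mean annual number of exceedances of the threshold.
    """
    m = rate * T  # expected number of exceedances in T years
    if m <= 0:
        raise ValueError("rate * T must be positive")
    if abs(shape) < 1e-12:
        # Exponential limit of the GPD (shape parameter equal to zero)
        return threshold + scale * math.log(m)
    return threshold + scale / shape * (m ** shape - 1.0)

def local_return_level(T, local_index, rate, scale, shape, threshold=1.0):
    """Local quantile = local index x regional (normalized) quantile."""
    return local_index * regional_return_level(T, rate, scale, shape, threshold)

# Usage with placeholder values (not results from this study)
x1000 = local_return_level(T=1000, local_index=0.8, rate=1.0,
                           scale=0.5, shape=0.0)
```

Since the regional quantile is dimensionless, the local return level inherits the unit of the local index (here, the annual SSS quantile at the target site).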

Comparison of the 1000-year return levels obtained herein (with the width of the 70 % confidence interval, 70 % CI, in brackets) with those of Weiss (2014).

Possible models for the fitting of the regional samples.

It is worth noting that better centering the region of interest on the Calais site did not significantly change the quantiles (a decrease of only 7 cm) but narrowed the associated confidence interval by about 12 cm. This outcome applies only to the region of interest around Calais; it is different for the regions centered on the Brest and La Rochelle sites. Overall, the quantiles and associated confidence intervals are roughly the same, but the method presented herein better addresses the uncertainties linked to the border effect, notably through the ESE tool.

One of the most important features of the ESE-based approach used in this paper to form a physically homogeneous region centered on a target site is that it avoids the so-called “border effect”. Moreover, in contrast to the method introduced by Weiss (2014), the extremogram tool seems to prevent sites that are too distant from belonging to the same homogeneous region. This reduces the physical and statistical heterogeneity that could be generated by pairs of sites that are quite far apart. Consequently, the spatial extremogram approach offers the key advantage of a certain geographical consistency. Although the 1000-year return levels and associated confidence intervals obtained in this work are close to those obtained by Weiss (2014), the spatial extremogram method improves the physical homogeneity of the regions of interest and can decrease the effective duration of observations. The findings for the sites of Calais and Dunkerque are of particular interest to us because these sites are no longer close to the border of a region and because they can be representative sites for the Gravelines Nuclear Power Station in France. Furthermore, physical homogeneity may have an impact on statistical homogeneity. Indeed, using the L-moment-based criteria (Hosking and Wallis, 1997), it is concluded that, unlike regions 1 and 2 in Weiss (2014) (which are considered to be possibly statistically homogeneous), all the regions built herein are statistically homogeneous, which is also an improvement.

The ESE-based approach can, nevertheless, be limited by the length of the common pairwise time period (during which data are present at both sites). Indeed, when the tide gauges at two sites have been operational over different periods of time, the common time period used to calculate the spatial extremal coefficient may be short. In such cases, the spatial extremogram may not be relevant, and this shortcoming must be taken into account during the formation of the regions of interest, for instance by removing the site involved. It would thus be interesting to analyze the uncertainties related to the ESE approach in order to obtain more reliable estimates of the extremal dependence between sites.
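One simple way to flag this limitation before computing an extremal coefficient is to measure the length of the common observation period of each pair of sites. The Python sketch below is an illustration only: it assumes two time-aligned skew surge series with NaN marking gaps and one value per tidal cycle of about 12.4 h. The function names and the `min_years` cutoff are hypothetical and do not correspond to a criterion stated in the paper.

```python
import numpy as np

HOURS_PER_YEAR = 365.25 * 24

def common_duration_years(series_a, series_b, step_hours=12.4):
    """Length, in years, of the period during which both sites have data.

    series_a, series_b: time-aligned series with NaN for missing values.
    step_hours: time between consecutive values (about one tidal cycle).
    """
    a = np.asarray(series_a, dtype=float)
    b = np.asarray(series_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("series must be aligned on the same time axis")
    n_common = int(np.sum(~np.isnan(a) & ~np.isnan(b)))
    return n_common * step_hours / HOURS_PER_YEAR

def has_sufficient_overlap(series_a, series_b, min_years=10.0, step_hours=12.4):
    """True if the common period reaches min_years (an illustrative cutoff)."""
    return common_duration_years(series_a, series_b, step_hours) >= min_years
```

A pair that fails such a check could either be discarded or have its coefficient examined case by case, as was done for Saint-Malo and Saint-Servan in the La Rochelle region.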

This study aims to perform regional frequency estimations of SSSs as an alternative to the local frequency analysis. Several ideas and approaches have been proposed in the literature to tackle the delineation of homogeneous regions, which is a main step in a RFA. The present work provides detailed reasoning for the need to use a more robust and reliable method that allows for the delineation of one homogeneous region centered on a target site. It uses a method based on the calculation of pairwise extremal dependence coefficients (the empirical spatial extremogram), introduced by Hamdi et al. (2016), and compares the results with those obtained by Weiss (2014). The regional sample of the independent maximum normalized SSSs is then constructed from the series of the concordant sites in the region of interest. A regional effective duration of observations reflecting the intersite dependence of this sample is subsequently calculated and used in the regional frequency estimations.

Another consideration in this paper is applying and illustrating the ESE approach on a whole region containing sites located on the Spanish, French (Atlantic and English Channel), and British coasts, with three target sites in France (Calais, Brest, and La Rochelle). A regional GPD–sinusoid mixture distribution with a seasonally varying scale parameter is examined, together with the associated confidence intervals. Overall, the results suggest that the regional analysis can be helpful in making a more appropriate assessment of the risk associated with the coastal flooding hazard. The application also demonstrates that the return level (RL) and associated confidence interval estimates for the Calais, Brest, and La Rochelle target sites are close to those obtained in a previous work (Weiss, 2014).

An in-depth study using more physical data and criteria in addition to the ESE (such as the atmospheric pressure or the wind speed and direction) could help to form regions that are more physically homogeneous. The concept of the ESE should find additional applications for the assessment of risk associated with other hazards in other climate and geoscience fields (e.g., extreme temperature and heatwave hazards). Associating confidence intervals with the spatial extremal coefficients could also be interesting. Another possible future endeavor is to perform a RFA using a regional sample containing all the regional SSSs (not only the maximum per storm and without considering the intersite dependence).

The systematic data can be obtained by request addressed to the corresponding author: SHOM (Service Hydrographique et Océanographique de la Marine) for French data, the BODC (British Oceanographic Data Centre) for British data, and the IEO (Instituto Español de Oceanografía) for Spanish data.

Initially, MA and YH agreed at the same time on the idea of carrying out a centered regional frequency analysis of skew storm surge. SG programed the R code for the extremogram calculations from an idea of YH, which was supported by MA. MA followed and guided all of SG's work. These works were presented to YH, PB, and RF, who analyzed them. Their views were taken into account in the study. MA wrote the first version of the paper. YH and MA made the various corrections necessary during the revision process. RF also added some corrections. The conclusion was developed from the analyses of MA, SG, YH, RF, and PB.

The authors declare that they have no conflict of interest.

This article is part of the special issue “Advances in extreme value analysis and application to natural hazards”. It is a result of the Advances in Extreme Value Analysis and application to Natural Hazard (EVAN), Paris, France, 17–19 September 2019.

The permission to publish the results of this ongoing research study was granted by the EDF (Electricité De France). The results in this paper should, of course, be considered to be R & D exercises without any significance or embedded commitments for the real behavior of the EDF power facilities or its regulatory control and licensing. The authors would like to acknowledge the SHOM (Service Hydrographique et Océanographique de la Marine, France), the BODC (British Oceanographic Data Centre, UK), and the IEO (Instituto Español de Oceanografía, Spain) for providing the data used in this study.

This paper was edited by Ira Didenkulova and reviewed by Andreas Sterl and one anonymous referee.