Regional analysis of multivariate compound flooding potential: sensitivity analysis and spatial patterns

. In coastal regions, floods can arise through a combination of multiple drivers, including direct surface run- off, river 15 discharge, storm surge and waves. In this study, we analyse compound flood potential in Europe caused by these four main flooding sources using state-of-the-art databases with homogenous forcing (i.e., ERA5). First, we perform an analysis to assess the sensitivity of the compound flooding potential to several factors: 1) sampling method; 2) time window to select the concurrent event of the conditioned driver; 3) dependence metrics; 4) wave-driven sea level definition. We observe higher correlation coefficients using annual maxima than peaks over threshold. Regarding the other factors, our results show similar 20 spatial distributions of the compound flooding potential. Second, the dependence between the pairs of drivers using the Kendall’s rank correlation coefficient and the joint occurrence are synthesized for coherent patterns of compound flooding potential using a clustering technique. This quantitative multi-driver assessment not only distinguishes where overall compound flooding potential is the highest, but also discriminates which driver combinations are more likely to contribute to compound flooding. We identify hotspots of compound flooding potential located along the southern coast of the North 25 Atlantic Ocean and the northern coast of the Mediterranean Sea. The model bathymetry was updated to use a more recent version of ETOPO2 (NOAA 2006). A new wave advection scheme was introduced in the WAM model with a revised unresolved bathymetry scheme to better account for the propagation along coastlines and to better model the impact of unresolved islands (Bidlot 2012). The slow attenuation of long period swells as well as the impact of shallow water on the wind input was introduced with an overall retuning of the level of dissipation due to white-capping (Bidlot 2012). The atmosphere and ocean 155 are coupled by a two-way interaction: atmosphere generates ocean waves through the surface wind stress, while the waves inﬂuence the atmospheric boundary layer via sea-state dependencies in the surface roughness. Altimeter measurements were used to assimilate information on signiﬁcant wave height. Independent buoy data were used for validation showing significant improvement in the wave height in ERA5 data compared to ERA-Interim (Hersbach et al., 2020). to be considered an independent event, and; 2) the threshold. The independence between extreme events is assured 205 by declustering the events based on the duration of the storms in the study area and selecting the highest event within each storm. The criteria used to select independent events in this study comprise a storm duration of 5 days for river discharge and 3 days for precipitation, storm surge and waves. These values were selected based on an analysis of the duration of highest storms conditioned to each variable in the study domain and following numbers used in previous studies (Ward et al., 2018, Hendry et al., 2019, Marcos et al., 2019, Bevacqua et al., 2019). Many methodologies for an automated threshold selection 210 have been proposed based on graphical methods combined with goodness of fit (e.g., Solari et al., 2017) but such techniques are difficult to implement in regional studies due to the different characteristics of time series of the several drivers involved in coastal flooding. Hence, we decide to apply the POT method with a threshold that guarantees 3 (POT3) or 6 (POT6) events/year to analyse also the effect of the value of the threshold. which can assist to decide the most appropriate methodological approach to perform high-resolution hydrodynamic and impact modelling.

flooding arising from the combination of river discharge and sea level along the UK coast. Such quantifications of compound flooding potential are based on dependence measures (e.g., using correlation coefficients, Wahl et al., 2015; or joint occurrence, Hendry et al., 2019) or bivariate statistical models (e.g., bivariate logistic threshold-excess model, Zheng et al., 2013;or copulas, Wahl et al., 2015, Moftakhari et al., 2017. Furthermore, recent advances in large-scale sea level 70 and river discharge modelling (Muis et al., 2016, Yamazaki et al., 2014 which provide time-series of these drivers over durations of more than 30 years, allow the identification of potential hotspots at country, continental and global scales (Wu et al., 2018;Bevacqua et al., 2019Bevacqua et al., , 2020Couasnon et al., 2020).
Regarding the identification of compound events, conditional sampling is usually applied (Wahl et al., 2015;Ward et al., 2018;Couasnon et al., 2020) which implies that compound events are conditioned to one of the drivers being extreme. For this 75 reason, when limiting to a bivariate characterization of compound events (e.g., when using correlation coefficients), two subsets of extreme events are identified, and the dependence is analysed when one or the other of the variables is extreme.
Another option is to select pairs of high values when both variables exceed individual high percentiles (e.g., 95th percentile, as Bevacqua et al., 2019), but in this case, compound events are defined only when both individual drivers are characterized as being extreme. This issue is similar to what happens when measuring compound flooding potential based on factors or 80 indices that quantify the effect of the dependency using copulas with AND hazard scenarios , Ganguli and Merz, 2019, Couasnon et al., 2020. However, it could be sufficient that only one of the driver variables was extreme to make a bivariate occurrence hazardous (OR hazard scenario, Moftakhari et al., 2017).
In the compound flooding studies summarised above, it is evident that a wide range of different statistical approaches have been used to define compound flooding potential, usually caused by the combination of two drivers. Only Hendry et al. (2021, 85 submitted) have to date considered all four potential drivers of flooding in coastal regions (precipitation, river discharge, sea level and waves). Recently, Ridder et al., 2021 have identified hotspots of compound events that potentially cause high-impact floods related to wet conditions based on the joint occurrence of multiple hazards pairs (precipitation, wind, hail, streamflow and storm surge). In other studies, the wave component has typically been included in the sea level directly (by combining wave height with storm surge and/or astronomical tide) without analysing the correlation with the other drivers (Bevacqua et 90 al., 2019). Similarly precipitation and river discharge have often been considered as an equivalent driver in the analysis of compound coastal flooding in combination with storm surge (Bevacqua et al., 2020). Additionally, different sampling methods to identify compound events and dependence measures to quantify compound flooding potential are used. The net effect of these variations in practice is to complicate comparisons between different studies.
The overall aim of this paper, is to perform a regional analysis along the North Atlantic, Mediterranean and Black Sea 95 coastlines of the compound flood potential caused by pluvial, fluvial and oceanographic sources during 1979-2018, using stateof-the-art model hindcasts with homogenous forcing (i.e., ERA5). In addition to this aim, two specific objectives are defined: 1) to assess the sensitivity of the compound flood potential to several factors that can affect the identification of compound events and the analysis of the spatial distribution of compound flooding potential, and 2) to detect different types of compound events and spatial patterns of compound flooding potential that arise from the combination of the four drivers. The paper is 100 https://doi.org/10.5194/nhess-2021-50 Preprint. Discussion started: 3 March 2021 c Author(s) 2021. CC BY 4.0 License. structured as follows. The datasets and methods are detailed in Section 2s and 3, respectively. The results of the two objectives are discussed in Section 4. Key findings are discussed in Section 5, with conclusions given in Section 6.

Data
We use modelled data to cover the entire coastlines that are focus of this study, using long-term, spatially continuous and temporally consistent gridded data for all four flood source variables, as discussed in each of the sub-sections below. Of note 105 is that although these databases are not available on a common grid and there are differences in their spatial resolution, they are all derived from the European Centre of Medium-Range Weather Forecast (ECMWF)'s latest global atmospheric reanalysis (ERA5, Hersbach et al., 2020), or from hindcasts forced by this atmospheric reanalysis. The new database GloFAS-ERA (Harrigan et al., 2020) is used for the first time to characterize river discharge when studying compound flooding potential.
We do not account for pluvial flooding directly, as pluvial flooding is a much smaller scale process. Instead, following Wahl 110 et al (2015), we use precipitation at each analysis site as a proxy for surface runoff potential. Our four flood source variables are, therefore: precipitation (P), river discharge (Q), storm surge (S) and waves (W), this latter variable being characterized by the significant wave height. Each of the four databases employed are described in the following sub-sections, including how we have selected the locations for the sensitivity analyses and compound event characterization.

Precipitation time-series 115
Precipitation time-series have been extracted from the ERA5 reanalysis, which is based on the Integrated Forecasting System (IFS) Cy41r2, which has been used in the ECMWF operational medium range forecasting system since 2016 and benefits from a decade of developments in model physics, core dynamics and data assimilation (Hersbach et al., 2020). The ERA5 reanalysis replaces the ERA-Interim reanalysis with a significantly enhanced horizontal resolution of 31 km, compared to 80 km for ERA-Interim. Long-term (1998Long-term ( -2018 and monthly average precipitation rates from ERA-Interim and ERA5 have been 120 evaluated by comparing them with values from other datasets (e.g., the version 7 of NASA's TRMM Multi-satellite Precipitation Analysis (TMPA) 3B43 dataset, Huffman et al. 2010) and there is a marked improvement in the estimated precipitation in ERA5 compared to ERA-Interim (Hersbach et al., 2020). The ERA5 hourly dataset spans 1979 onwards and it is currently publicly available at the Copernicus Climate Change Service. Here, accumulated daily precipitation is calculated from hourly data. 125

River discharge time-series
River discharge time-series were extracted from the Global Flood Awareness System (GloFAS)-ERA5 reanalysis (Harrigan et al., 2020). This reanalysis is a global gridded dataset (excluding Antarctica), with a horizontal resolution of 0.1° at a daily time step and with a 40 years long duration starting 1 January 1979. The GloFAS-ERA5 river discharge reanalysis was produced by coupling the land surface model runoff component of the ECMWF ERA5 global reanalysis with the LISFLOOD 130 https://doi.org/10.5194/nhess-2021-50 Preprint. Discussion started: 3 March 2021 c Author(s) 2021. CC BY 4.0 License.
hydrological and channel routing model (van der Knijff et al., 2010). LISFLOOD allows the lateral connectivity of ERA5 runoff grid cells routed through the river channel to produce river discharge. ERA5 runoff is produced from the HTESSEL land surface model (Hydrology Tiled ECMWF Scheme for Surface Exchanges over Land; Balsamo et al., 2009) with an advanced land data assimilation system to assimilate conventional in-situ and satellite observations for land surface variables.
Groundwater and river routing parameters in GloFAS were calibrated against river discharge observations for 1,287 135 catchments globally by Hirpa et al. (2018). A total of 463 of the largest lakes (surface area > 100 km2) and 667 largest reservoirs have been incorporated into the GloFAS river network.

Storm surge time series
Hourly and daily storm surge time-series have been extracted from the Coastal Dataset for the Evaluation of Climate Impact (CoDEC) . The third generation Global Tide and Surge Model (GTSM, Kernkamp et al., 2011), with a 140 coastal resolution of 2.5 km (1.25 km in Europe), was forced with meteorological fields from the ERA5 climate reanalysis to simulate extreme sea levels for the period 1979 to 2017. Besides the increase of the resolution of GTSM v3.0 from 5 km along the coast (50 km in the deep ocean) to 2.5 km along the coast (25 km in the deep ocean), the GTSM v3.0 model performance for tides was also improved by the implementation of additional physical processes. The validation against observed sea levels demonstrated a good performance which reflects not only the more skilful hydrodynamic simulations but also the accuracy of 145 the meteorological forcing. ERA5 represents better the evolution of weather systems due to an increase of the spatial and temporal resolution. The annual maxima had a mean bias 50% lower than the mean bias of the previous Global Tide and Surge reanalysis dataset (Muis et al., 2016).

Wave time series
Hourly wave time-series have been extracted from the ERA5 reanalysis at a spatial resolution of 0.5° with some improvements 150 and updates compared to ERA-Interim (Hersbach et al., 2020). The model bathymetry was updated to use a more recent version of ETOPO2 (NOAA 2006). A new wave advection scheme was introduced in the WAM model with a revised unresolved bathymetry scheme to better account for the propagation along coastlines and to better model the impact of unresolved islands (Bidlot 2012). The slow attenuation of long period swells as well as the impact of shallow water on the wind input was introduced with an overall retuning of the level of dissipation due to white-capping (Bidlot 2012). The atmosphere and ocean 155 are coupled by a two-way interaction: atmosphere generates ocean waves through the surface wind stress, while the waves influence the atmospheric boundary layer via sea-state dependencies in the surface roughness. Altimeter measurements were used to assimilate information on significant wave height. Independent buoy data were used for validation showing significant improvement in the wave height in ERA5 data compared to ERA-Interim (Hersbach et al., 2020). 160 https://doi.org/10.5194/nhess-2021-50 Preprint. Discussion started: 3 March 2021 c Author(s) 2021. CC BY 4.0 License.

Selection of the study locations
The spatial resolution of the four datasets is shown in Figure 1a for the area of Ireland, UK and north-western France. The ERA5 precipitation database has a resolution of 0.25°x0.25°, the ERA5 wave database has a resolution of 0.5°x0.5°, the CODEC storm surge database has a resolution of 2.5 km along the coast, and the GloFAS-ERA5 river discharge database has a resolution of 0.1°x0.1°. The river network data implemented in GloFAS for routing operations was produced using fine-165 scale hydrography inputs from HydroSHEDS (Lehner et al., 2008).
We used HydroSHEDS data to identify the mouths of rivers with a catchment area higher than 1,000 km2 to adjust the distribution of study locations for a regional analysis. The GloFAS grid nodes closest to the locations of the river mouths were checked and we selected the grid node with the highest river discharge. The closest grid nodes of the precipitation, wave and storm surge databases to the selected GloFAS grid nodes were identified and are shown in Figure 1b. Across the whole domain 170 (see Figure S1) we analyse 540 locations. Figure 1: a) Spatial resolution of the precipitation (ERA5), river discharge (GloFAS-ERA5), storm surge (CoDEC) and wave (ERA5) data in the area of Ireland, UK and north-western France. b) Selected river discharge grid nodes at the mouth of rivers with a catchment area > 1000 km2 and the closest precipitation, storm surge and wave grid nodes along these coasts.

Methods
We characterize the compound flooding potential generated by the four flood drivers (P, Q, S, W) by calculating the dependence between all possible pairs of the four main source variables along the coasts of the Eastern North Atlantic, Mediterranean Sea and Black Sea. In addition, we also superimpose linearly the storm surge and wave components into a combined sea level, ignoring the astronomical tidal component of sea level, as it is deterministic. We do this considering two 180 definitions of the wave-driven sea level component: 1) a simplified for the wave contribution to sea level (e.g., Vousdoukas et al., 2017), called this variable SW, and, 2) through the use of semi-empirical formulations (e.g., Vitousek et al., 2017), that also take into account the wave period (Tp) to calculate Setup. We term to this second definition of sea level, as the sum of S and the second one is conditioned on the first (secondary driver) in the two-sided conditional sampling we apply.
In the following sub-sections, we first describe the sensitivity analysis designed to elucidate the extent to which the methodology employed affects the quantification of compound flooding potential. Second, we describe the clustering method used here to identify different types of multivariate compound events and the spatial distribution of compound flooding 190 potential thereby arising across the four flood drivers considered in this study.

Sensitivity analysis
Our first objective is to assess the sensitivity of compound flood potential to different aspects of the underpinning methodology: 1) sampling method used, 2) time window investigated, 3) dependence metrics employed, and, 4) definition of wave-driven sea level. Each of these sensitivity tests is discussed in turn below. 195

Sampling
Evaluation of the dependence between multiple drivers is limited to a bivariate analysis which imposes a two-sided conditional sampling to select multivariate extremes. Compound extreme events are defined in previously published studies are either selected using Annual Maxima (AM) or Peak Over Threshold (POT). Although Ward et al. (2018) performed a sensitivity analysis and compared correlations between river discharge and storm surge using both POT with two thresholds (equal to 200 95th and 99th percentile) and AM sampling, here we build on that prior work to examine also the effect of the sampling method in the dependence between the four coastal flooding drivers considered. The disadvantage of the AM is that events are selected that might not be considered extreme in the dominant variable. On the other hand, the POT approach increases the amount of selected extreme events but introduces two parameters in the selection process: 1) the definition of the time between peaks for each peak to be considered an independent event, and; 2) the threshold. The independence between extreme events is assured 205 by declustering the events based on the duration of the storms in the study area and selecting the highest event within each storm. The criteria used to select independent events in this study comprise a storm duration of 5 days for river discharge and 3 days for precipitation, storm surge and waves. These values were selected based on an analysis of the duration of highest storms conditioned to each variable in the study domain and following numbers used in previous studies , Hendry et al., 2019, Marcos et al., 2019, Bevacqua et al., 2019. Many methodologies for an automated threshold selection 210 have been proposed based on graphical methods combined with goodness of fit (e.g., Solari et al., 2017) but such techniques are difficult to implement in regional studies due to the different characteristics of time series of the several drivers involved in coastal flooding. Hence, we decide to apply the POT method with a threshold that guarantees 3 (POT3) or 6 (POT6) events/year to analyse also the effect of the value of the threshold. https://doi.org/10.5194/nhess-2021-50 Preprint. Discussion started: 3 March 2021 c Author(s) 2021. CC BY 4.0 License.

Time window 215
The conditional sampling introduces another factor that could affect the definition of compound events and this is related to the selection of the concurrent value of the secondary variable to the identified extreme events of the dominant variable.
Specifically, there could be a temporal lag between variables that leads to a potential coastal flooding event. This lag can be implemented after identifying both series of extremes from the two drivers (Hendry et al., 2019) or by a time window when identifying the value of the conditioned variable (Wahl et al., 2015, Coausnon et al., 2019. Once a time 220 window (Δt) is established, the value of the secondary variable is selected as the maximum value within ± Δt days. A variety of temporal windows have been considered, from zero lag (Bevacqua et al., 2020), through ±1 day (Wahl et al., 2015), ±3 days (Couasnon et al., 2020) to ±5 days , Hendry et al., 2019. Furthermore, although river discharge data have been extracted at the river mouth, not all the databases for the four coastal flooding drivers have the same spatial resolution, varying considerably the distance between grid nodes at each location of the study domain. Therefore, here we test the 225 sensitivity of the identification of bivariate compound events to time windows between 1 and 10 days or 1 and 3 days, keeping the lag which provides the highest correlation coefficient between each pair of variables at each location.

Dependence metrics
Several correlation coefficient definitions (e.g., Kendall's tau, Wahl et al., 2015; or Spearman's rho, Couasnon et al., 2020) and other metrics for the characterization of the dependence between events when both drivers are extreme (e.g., joint 230 occurrence, Hendry et al., 2019, or tail dependence, Marcos et al., 2019 have been used in previous studies. Here we analyse the extent to which characterization of compound events could be affected by the selection of these varying dependence metrics. Correlation coefficient: We use the Kendall's rank correlation coefficient tau and Spearman's rank correlation coefficient rho because they are commonly used nonparametric methods of detecting associations between two variables. Significance is 235 assessed at = 0.05 (i.e., 95% confidence level), using corresponding p values. The Spearman's rank correlation between two variables is defined as the covariance of the two variables normalized by the product of their standard deviations between the rank scores of those two variables. The Kendall's tau is also a rank order correlation coefficient which quantifies the difference between the % of concordant and discordant pairs among all possible pairwise events. The Kendall's correlation is considered to be more robust than the Spearman's correlation because it offers better estimates with smaller asymptotic variance and is 240 less susceptible to outliers (Ganguli and Merz, 2019).
Joint occurrence: The 'joint occurrence method' (Hendry et al., 2019) consists of simply counting the number of times extreme events are identified in the two drivers analysed within the time window (Δt) considered.
Tail dependence: The dependence structure in the tails between two variables can be measured by the pair of statistics (χ, ̅ ) (e.g., Coles et al., 2001). Both coefficients χ and ̅ are defined as limit values which tend to 1 if both variables are 245 asymptotically dependent over a certain threshold. The coefficient χ represents the probability of bivariate extreme events https://doi.org/10.5194/nhess-2021-50 Preprint. Discussion started: 3 March 2021 c Author(s) 2021. CC BY 4.0 License. when both variables are extreme and provides a measure of the dependence strength (Marcos et al., 2019), referred to as extremal correlation (Zscheischler et al., 2020a). ̅ is the residual tail dependence coefficient and contains additional information about the association (-1< ̅ <1) between extremes of both variables when they are asymptotically independent (χ=0). We use the function taildep from the R package extRemes (Gilleland and Katz, 2016) to derive these values. 250

Definition of wave-driven sea level
Although non-linear interactions between storm surges and waves could amplify the magnitude of the sea level, the assumption that both contributions may be linearly summed is generally adopted and has often been used as a proxy of coastal flooding driven by oceanographic variables (Rueda et al., 2016, Bevacqua et al., 2019, Marcos et al., 2019. Regarding the wave contribution to sea level, when wind-generated waves approach nearshore and break in the shallow surf zone, they induce 255 variations of the sea level at different time and space scales enhancing coastal flooding. The highest wave-driven contribution to the total water level, called run-up, depends on two dynamically different processes: (1) wave setup, which is a time-average sea level rise occurring over a few hours to several days, and which is determined by local wind sea and swell conditions, and (2) swash which is a high-frequency process by which sea level fluctuates due to individual incident waves, with an additional low-frequency component generated by infragravity waves (related to the presence of groups in incident short waves). The 260 magnitude and expanse of both components depends on the sea-state characteristics (significant wave height, period and spectrum shape; Guza and Fedderson, 2012), as well as nearshore bathymetry and topography. Spatially, setup could extend from tens of meters in steep coastal areas to several kilometres in low-sloping coastal areas, while runup extension varies from a few meters to on the order of a hundred meters in reflective and dissipative environments, respectively (Dodet et al., 2019).
Runup is not usually included in the wave component of the sea level driver in coastal flooding analysis because its temporal 265 duration is on the order of hours and requires local geological characteristics that could artificially inflate the wave contribution in global and regional studies (Aucan et al., 2018). The setup contribution is defined using empirical formulations with different levels of sophistication. Wave setup has been approximated as the significant wave height multiplied by 0.2 (Vousdoukas et al., 2018, Bevacqua et al., 2019, Marcos et al., 2019 or by applying the Stockdon formulation (Stockdon et al., 2006) with different parameterizations , Rueda et al., 2017, Melet et al., 2018. The wave setup contribution to the 270 total water level is very sensitive to this parameterization (Aucan et al., 2018). Furthermore, in the definition of the sea level as the sum of S and 0.2W, here called SW, we have also applied the Stockdon formulation for wave setup for dissipative beaches (as Vitousek et al., 2017) which is known to provide similar results as using a beach slope of 0.02 (~50% of the world's beaches have slopes smaller than 0.02, Aucan et al., 2018), here termed water level (WL), as the sum of S and Setup.

Characterization 275
The evaluation of compound flooding potential due to the combination of four drivers based on a bivariate analysis is a complex problem due to the high dimensionality (e.g., spatial variability of the dependence metrics and relative contribution of each pair of drivers). We apply a two-step cascade classification with two sub-objectives: 1) to analyse the dependence between the https://doi.org/10.5194/nhess-2021-50 Preprint. Discussion started: 3 March 2021 c Author(s) 2021. CC BY 4.0 License. metrics that characterize the bivariate compound flooding potential between the pairs of drivers, and 2) to extract spatial patterns from this dataset in order to identify hot spots of compound flooding potential arising from P-Q-S-W. 280 The two-step classification method consists of the use of self-organizing maps (SOM) as the first step and applying the kmeans algorithm (KMA) as the second step (Rueda et al., 2017). In this study, SOM is applied first to take advantage of the powerful visualization characteristics but not to obtain a reduced number of clusters. The k-means algorithm is a classification technique that divides the high-dimensional data space into a number of clusters, each one defined by a centroid and formed by the data for which the centroid is the nearest . The SOM automatically extract clusters of high-285 dimensional data and projects them into a two-dimensional organized space (2D lattice), allowing an intuitive visualization of the classification (Kohonen, 2000). A SOM algorithm is a version of the KMA in which the centroids are forced with a neighborhood adaptation mechanism to a 2D lattice preserving the original topology of the data and producing that similar patterns in the original space are close in the 2D lattice. The Maximum-Dissimilarity-Algorithm (MDA, Camus et al., 2011) is applied to initialize the KMA to obtain a better distribution of the centroids over the multi-dimensional data space and avoid 290 random initialization. The optimal number of clusters is evaluated using the Davis-Boudin (Davies and Bouldin, 1979) and the gap criteria .

Sensitivity analysis
This section describes the results for the first objective, relating to the sensitivity analysis. Results from each of the four 295 sensitivity tests is described in turn.

Sampling
Two-sided conditional sampling has been applied to the seven pairs of drivers identified when applying the AM and the POT methods with either 3 or 6 events per year. Figure 2 shows the comparison of the correlation coefficients between Q and P using the three approaches when either Q or P are the dominant drivers. Only this pair of variables is shown because it presents 300 the highest correlation and the purpose of this subsection is only to test the sampling method. Locations are divided in five regions (see Figure   315 Figure 3 shows the comparison of the number of co-occurring events between Q and P using the three approaches. The number of events is higher when using POT3 compared to AM and when using POT3 compared to POT6, as expected. The scatter plots follow roughly the 1:3 or 1:2 slope, respectively, indicating an approximate tripling or doubling in the number of events between different approaches. However, the spatial pattern is similar (correlation coefficient between the number of co-320 occurring events is around 0.93-0.98) which means that the three methods identify broadly equivalent areas where prone to compound events with both variables being extreme. Results for all pairs of drivers are shown in Figure S3, which has a similar behaviour, albeit with an overall lower number of co-occurring events.   Figure 4 shows the comparison between the highest correlation coefficient obtained when using a time window (Δt) of ± 3 days versus using Δt = ± 10 days for the pairs of variables Q and P, S, W or SW using POT3, and the join occurrence between Q and P, S, W or SW. Only locations with a significant correlation (p<0.05) are represented in Figure 4. Results indicate there is no major difference in the correlation between drivers when employing the two investigated time windows. Larger differences (~0.1 higher correlation) are obtained when Q is the dominant variable in few locations (9 of the 540) where 335 correlation is moderate overall (~0.30). The joint occurrences tend to be slightly higher (10 number of co-occurring events) with a higher time window but also fewer compound events (by half) are detected in locations with low-medium join occurrence (<~40) using a Δt = ± 3 days (lower left corner of scatter plots in Figure 4b).    Figure 5a shows the comparisons when using the Kendall's versus Spearman's rank coefficients for the pairs of variables (Q-P/P-Q, Q-S/S-Q, Q-W/W-Q, Q-SW/SW-Q, P-S/S-P, P-W/W-P, P-SW/SW-P), using a time window of ± 3 days and across the three approaches (AM, POT3, POT6) considered in the conditional sampling. There is a categorical correspondence between both correlation coefficients, with Kendall's coefficient having a tendency to be smaller than the Spearman's coefficient. 355 Therefore, to characterize compound events in terms of correlation and its spatial distribution, the information provided by both coefficients is equivalent. The number of locations with significant correlation is very similar with both correlation coefficients (see Table 1).
Regarding (χ, ̅ ) statistics, the usual way to decide the threshold involves making a visual examination of the evolution of their empirical estimates for increasing threshold levels (Zscheischler et al., 2020a). Here, we decided to estimate χ at a probability 360 threshold 0.95 after careful examination of the results for different levels. The comparison between χ and the joint occurrences divided by 3 events per year and the number of years (40) for the pair of variables Q-S is shown in Figure 5b. There is high correspondence (correlation coefficient around 0.9) between both dependence metrics mainly because both of them measure the probability of bivariate extremes when both drivers are extreme. The remaining small differences may be due to different sampling processes leading to different extreme subsets. The statistic χ is estimated using the empirical distribution of the daily 365 time series of Q (mean daily values) and S (maximum daily values) and the extremes are selected without any clustering. On the other hand, the number of co-occurring events was calculated using a POT with a threshold that guarantees 3 events per year and with a storm duration of 3 or 5 days to select independent events. A similar relationship between both metrics has been found for other combinations of variable pairs (not shown here).

Definition of wave-driven sea level
The effect of the definition of sea level (oceanographic flooding drivers) in the characterization of compound events is analysed 380 here by comparing the Kendall's correlation coefficients obtained between the pair of variables Q or P when using the two varying sea level definitions (SW and WL) used in this study ( Figure 6). Differences in the obtained correlation coefficients are typically small (mainly around 0.05), except along the southern Atlantic coast where the differences are slightly higher than 0.15. The southern Atlantic coast region presents the highest correlations between pluvial or fluvial sources and oceanographic drivers (correlation coefficients around 0.6-0.7), within the entire study domain meaning that the identification 385 of this region as an area with significance dependence is still preserved. The effect of the sea level definition on the correlation when Q or P are the dominant variables in the conditional sampling is much smaller (around 0.05, not shown here).

Dependence between pairs of the four flooding drivers 400
Here we consider the results obtained using the POT3 method and a time window of ± 3 days to characterize the multivariate compound events along the North Atlantic, Mediterranean and Black Sea coasts. The analysis of the dependence between S-W is performed at an hourly temporal resolution and using a smaller Δt (1 day) because it is considered that both drivers contribute simultaneously to the definition of sea level. The dependence between S-W as calculated using daily data is compared with the results using hourly data ( Figure S4 For Q and W (Figure 7-c), the spatial distribution is quite similar to Q-S with slightly smaller correlation along the most 425 southern Atlantic coast and higher along the west coast of Iberian Peninsula. Correlation only when Q is the dominant variable is even more pronounced in locations along the Mediterranean Sea (e.g., eastern coast of Spain). These spatial patterns are also reflected in the distribution of the correlation coefficient between Q and SW as a combination of both (Figure 7-d).
Spatial distribution of the dependence between P and S, W or SW ( Figure S5

Severity index
Here we define an index, based on driver severity, to be included in the characterization of the spatial patterns of compound 445 flooding potential. The driver severity is calculated as the sum of normalized thresholds of each driver, applied in the conditional sampling (multiplied by 0.2 in the case of W). Q thresholds, which cover a wide range of values, have been categorized into 10 intervals [0-10-25-50-100-250-500-750-1000-5000->25000 m3/s] to avoid skewing the driver severity due to very high discharge magnitude in several locations. Driver severity is divided into 11 scores from 0 to 1. Figure 8d shows the spatial distribution of the severity index (SI). Areas with highest SI are concentrated in the North Sea, the northwest of the 450 Iberian Peninsula, the eastern coast of the Adriatic Sea, the eastern coast of the Black Sea and few locations that represent large rivers. Coastal areas with the lowest SI are mainly concentrated along the southern coast of the Mediterranean Sea and the most southern coast of the Atlantic Ocean of our study domain. The SI spatial distribution indicates that an identical SI ranking can be determined by different combinations of driver extremes. To facilitate this analysis, we classify the thresholds of the four drivers (shown in Figure S6) into 10 clusters to define the main combinations of driver extremeness ( Figure S8a). 455 The probability of occurrence of each cluster (number of locations of the study domain represented by each cluster) associated with each SI rank ( Figure S8b) provides which combinations of the four driver thresholds have an equal SI rank. For example, locations with SI equal to 0 are associated with only one cluster (represented in light green) which is defined by a combination of the lowest thresholds of Q, P and S and low W severity. On the other hand, locations with SI equal to 1 are associated with clusters 1 and 2 (in yellow and orange, respectively) which are characterized mainly by the severity of one driver (Q or S, 460 respectively) but also associated with clusters 3, 7 and 8 (in red, dark and light blue, respectively) with high severity of two or three drivers (Q, P and W, or Q and P, or Q and S, respectively). The spatial distribution of these clusters ( Figure S8c   SIi], where the subscript represents the i-th grid point. Each parameter is normalized to avoid assigning different weights in the classification process. We first use the SOM algorithm to obtain a large collection of centroids (20×20=400) projected onto a 2D organized lattice that helps to analyse the dependence between the 11 parameters. The hexagonal SOM of 20x20 size of the compound flooding potential derived from the 11 metrics outlined above for the study sites is shown in Figure 9. Results are shown in individual panels (Figures 9a-k)  For example, we can observe that the three parameters related with the pair Q-P (ρ1, ρ2, JO; see Figure 9a-c) present a similar distribution in the lattice which means that there is a high dependence between them. Locations with the highest correlation between Q-P(P-Q) (Figures 9a and 9b) also have the highest number of joint occurrences (Figure 9c). Centroids with the highest dependence parameters between Q-SW and P-SW are concentrated in the upper right corner of the lattice (Figure 9d-495 j). They are also characterized by the highest number of joint occurrences between Q-P ( Figure 9c) and high-medium dependence between Q-P (Figures 9a and 9b). Accordingly, the highest joint occurrences between all three variables are also found in the upper right corner (Figure 9i). Other centroids located in the upper left side of the lattice represent locations with high dependence between P-SW (Figure 9g-h) but not between Q-SW and relatively smaller dependence between Q-P.
The severity index (Figure 9k) reveals that locations with highest driver severity do not present dependence between any pair 500 of drivers (lower right corner of each individual panel in Figure 9). These locations are the ones represented principally by clusters with the severity determined by only one driver (clusters 1 and 2 in Figure 8a). The probability of many SOM centroids is null (Figure 9l) because a large lattice size (20x20) is required to guarantee that SOM centroids cover the multidimensional data space defined by the 11 parameters.

510
In the second step, we apply the KMA to find a reduced number of clusters applied to the SOM centroids. We obtain an optimal number of 8 clusters by applying the Davis-Bouldin and gap criteria. Figure 10b shows the KMA classification in 8 groups over the SOM lattice, and highlights the similarity between KMA clusters because neighbouring centroids in the 2D lattice 515 have similar values of the eleven parameters (see black contours of Figure 9 which delimitate the SOM centroids belonging to each KMA cluster). Figure 10a shows the mean value of the eleven parameters associated to each cluster and the variability within each group (25th, 75th, 5th, 95th percentile). The number of locations represented by each cluster is shown in Figure   10c.
The KMA classification reveals two clusters (red and pink groups) that represent locations where more compound flood events 520 can occur. The pink group is characterized by the highest dependence (both correlation and joint occurrences) between Q-SW (SW-Q) and P-SW (SW-Q) and high dependence between Q-P (P-Q), which is reflected in the highest number joint https://doi.org/10.5194/nhess-2021-50 Preprint. Discussion started: 3 March 2021 c Author(s) 2021. CC BY 4.0 License. occurrences between the three drivers. The SI centroid presents a medium-high severity with a wide variability. The red group represents locations with the highest dependence between Q-P (P-Q), and medium correlation, only when extremes are selected conditioned to oceanographic drivers, and high joint occurrence between Q-SW and Q-P. It is characterized by a low-medium 525 severity index because most of the locations present mild meteocean conditions (represented by cluster 4 in Figure S8a). The purple cluster follows the other two in terms of compound flooding potential. It is characterized by a high joint occurrence between Q-SW and P-SW but lower dependence between Q-P and represents locations with a high severity index. Green and blue clusters stand out for the high numbers of compound events resulting mainly from the combination of P-SW. The green cluster is also characterized by significant dependence between Q-P and Q-SW at locations with low severity meteocean 530 climates. In contrast, the blue cluster represents locations without compound events generated by the combination of Q and SW. The dark green cluster represents locations where only compound events from the combination of Q-P can occur. The

Discussion
In this paper, we have analysed the compound flooding potential arising from pluvial, fluvial and oceanographic sources. The assessment is based on a bivariate analysis of the dependence between drivers (P, Q, S and W) using a two-sided conditional sampling. Our first objective is focused on the analysis of the sensitivity of the results to several factors that have been applied 565 indiscriminately in previous studies with the purpose of identifying compound events and characterizing compound flooding potential.
First, we apply AM and POT sampling approaches to analyse how the choice of these approaches affects the computed dependency between variables. It is noteworthy that our results show that a lower statistical dependence is obtained when using the POT method, which is in agreement with Ward et al. (2018), yet a broad consensus has emerged in favour of the use 570 of the POT approach for identifying extreme events (Mazas et al., 2014, Coles, 2001. AM can potentially disregard information on extremes in the remaining data from using only one data point per year (Méndez et al., 2016), or select events https://doi.org/10.5194/nhess-2021-50 Preprint. Discussion started: 3 March 2021 c Author(s) 2021. CC BY 4.0 License.
that are not extreme, as we have observed in several of our study sites. However, the spatial patterns in both approaches are similar and comparable areas are identified as hotspots with relatively higher dependence.
In our second sensitivity analysis, we found that the choice of the time window used has almost a negligible effect on the 575 computed correlation coefficients (at least for the time windows in excess of the 3 days used here). The higher probability of finding more severe events using a longer Δt has only be reflected in a higher correlation in few locations. However, it can result in a lower number of compound events when both drivers are extreme (i.e. less joint occurrences).
In the third sensitivity analysis, we investigated differences in the characterization of the dependence between drivers using different metrics. We found that the correlations obtained are always higher when using the Spearman's rank correlation 580 coefficient as compared to the Kendall's coefficient, but they are unequivocally related, and equivalent spatial distributions are obtained irrespective of the choice of the correlation coefficient. Regarding joint occurrences and tail dependence, we found that both provide comparable quantifications of the dependence between driver variables when both variables are extreme. Moreover, the concept of joint occurrences provides a better measure of the compound flooding potential because it applies a declustering method to select independent events. 585 The last factor we assessed is the definition of wave-driven contribution to the sea level when using the sum of the S and W components. Any differences in correlation emerging as a result of the definition of wave-driven contribution to sea level seem to be explained by a higher dependence between surge and the simplified wave-driven sea level (20% of W) than surge and setup based also on the wave period (see Figure S7). Although the same beach slope is considered in all study locations, our results showed that the combination of W and Tp in the estimation of setup influences more the selection of compound events 590 conditioned to sea level than the semi-empirical formulation itself.
Our second objective was focused on estimating the spatial distribution of compound flooding potential considering the drivers Q, P, S, W, and SW. We observed significant differences in the dependence between the pairs of drivers and even for one individual pair depending on which driver is employed as the dominant one in the selection of the compound events. We find that the correlation coefficient and joint occurrences are not always positively related. Therefore, we considered that 595 combinations of both metrics provide complementary information about the type of compound events and represents different flooding mechanisms (Wahl et al., 2015). The joint occurrence only characterizes compound events when both drivers are extreme. On the other hand, correlation coefficients characterize those compound events generated when one of the drivers is extreme but not necessarily the other, providing information about the relative severity of the secondary driver.
Regarding a comparison with previous global and regional studies of compound flooding potential in the study domain, the 600 hotspots we have identified on the coasts of Portugal, the Strait of Gibraltar and Morocco have also been detected in Couasnon et al. (2020). However, although we found a higher number of joint Q-S occurrences on the south-west and west coasts of the UK than on the east coast, as previously noted by Hendry et al. (2019), the number of co-occurrences is lower in our analysis, as is also the case around the coast of Ireland. Similar high joint occurrences between Q-S in the northern and eastern Mediterranean coasts and in the coast of Tunisia are found, in accordance with Couasnon et al. (2020). We do not observe a 605 predominance of higher correlation when compound events are conditioned to Q as in Coausnon et al. (2020) found. However, https://doi.org/10.5194/nhess-2021-50 Preprint. Discussion started: 3 March 2021 c Author(s) 2021. CC BY 4.0 License. differences in the correlation between the two conditional samples are found between P and S, W or SW. As pointed out by Hendry et al. (2019), storms that generate high precipitation are different to the ones that generate high storm surges.
Specifically, heavy precipitation and extreme surges are driven by deep low-pressure systems, while intense rainfall can also be caused by convection without intense cyclonic activity (Bevacqua et al., 2019). Therefore, there is higher probability of 610 compound events when S or W are the dominant variables (generated by extratropical storms) than when compound events are conditioned to P (convective storms). This effect seems to be less perceptible between river discharge and oceanographic variables because other climatic and non-climatic factors affect the fluvial source driver (Bevacqua et al., 2020), as for example, land use characteristics or snowmelt, evaporation, and accumulated precipitation over previous weeks or months.
Bevacqua et al., 2019 found the lowest joint return periods due to high dependence between P-S were concentrated along the 615 Atlantic coast and in the Mediterranean Sea (particularly in the regions of the Gulf of Valencia (Spain), the northwest Algeria, the Gulf of Lion (France), the Adriatic coast of Balkan Peninsula, the Aegean coast, southern Turkey and the Levante region).
Even though we did not calculate return periods, our results suggest similar areas of higher dependency between P-S. We find a distinct pattern between southern and northern European coasts with more joint occurrences between S-W, especially over the Irish Sea, English Channel and south coasts of the North Sea and Baltic Sea in line with the results of Petroliagkis (2018). 620 Similar regional patterns of dependence between S-W as we find here were reported by Marcos et al. (2019), but we find some additional local areas with strong dependence when both drivers are extreme (characterized by the statistic χ or joint occurrence) such as the western coast of the Iberian Peninsula and certain areas in the Mediterranean Sea, perhaps because we used a higher Δt than in Marcos et al. (2019).
We apply a two-step clustering method to synthesize the high dimensional results of the bivariate characterization of compound 625 flooding potential from the four sources. First, the SOM algorithm allows us to analyse the multivariate dependence between the correlation coefficients and the joint occurrences between the pairs of drivers Q-P, Q-SW and P-SW, enabling the establishment of the degree of contribution of each driver combination to compound flooding in the study area. Moreover, the method also distinguishes if the identified compound events are more likely to occur when both drivers are extreme or when only one driver is extreme. With the second step of the clustering method, we identify a reduced number of types of compound 630 events based on the contribution of each driver combination and the driver severity. The spatial distribution of these types of compound events reveals spatial patterns of compound flooding potential. These patterns allow us to discern locations with the highest overall compound flooding potential and the associated contributions of each pair of drivers. Additionally, we introduce a severity index to distinguish between locations with similar compound flooding potential (from the dependence perspective) but very different driver severity. 635 The main limitation of our study is the identification of the compound events based on drivers. None of the contributing variables have to be necessarily extreme to create a compound flooding event. Therefore, the selection of compound events should ideally be based on an impact function or a risk function that accounts for exposure and vulnerability. However, this function is usually unknown and difficult to derive, especially in regional and global studies. An intermediate approach could be based on the selection of the compound events in the extreme water levels generated by the interaction between oceanographic drivers and riverine drivers. The amplification of the flooding impact has been identified at the local scale (van den Hurk et al. 2015, Kumbier et al. 2018, and recently at global scale (Eilander et al. 2020) using a global coupled rivercoast flood model framework (Ikeuchi et al., 2017).

Conclusions 645
In this paper we have evaluated the compound flooding potential arising from the combination of precipitation, river discharge, storm surge and waves along the coasts of the eastern North Atlantic Ocean, Mediterranean Sea and Black Sea. The paper provides two advances. First, we performed a series of sensitivity analyses to establish how methodological choices affect the identification of compound flood events. Specifically, we investigated: 1) the sampling method, 2) the time window used to match events in the two-way sampling, 3) the use of typical metrics applied in the evaluation of the dependence between 650 drivers, and 4) the definition of the wave-driven sea level contribution. Among these, the sampling approach produces the highest differences in the quantification of the compound flooding potential. However, none of these factors analyzed here introduces significant differences in the spatial distribution of the compound flooding potential which means that similar locations where compound flooding is more likely to occur are identified.
The second major advance forwarded by our work comprises a new regional characterization of compound flood potential 655 using a methodology which aggregates the bivariate dependence between driver combinations. This multivariate characterization reveals three main locations with high compound flooding potential: the southern coast of the North Atlantic, the western coasts of France and UK and the northern coast of the Mediterranean Sea. These locations are characterized by compound events that arise from the combination of the four drivers, albeit with differences related to the driver severity, the contribution of each pair of drivers and the predominance of compound events when drivers are extreme. Other locations of 660 relatively high compound flood potential include the eastern coast of Italy and southern Mediterranean Sea, where compound flooding is mainly driven by combinations of P-SW.
This regional quantitative assessment of multivariate compound flooding potential can be considered as a pre-diagnostic tool for coastal management. Results provide information about which areas are more predisposed to experience compound flooding. In addition, this multivariate flooding potential classification identifies the relevant drivers of coastal compound 665 flooding at each location which can assist to decide the most appropriate methodological approach to perform high-resolution hydrodynamic and impact modelling.