The impact of natural hazards such as storm surges and waves on coastal areas during extreme tropical cyclone (TC) events can be amplified by the cascading effects of multiple hazards. Quantitative estimation of the marginal distribution and joint probability distribution of storm surges and waves is essential to understanding and managing tropical cyclone disaster risks. In this study, the dependence between storm surges and waves is quantitatively assessed using extreme value theory (EVT) and the copula function for the Leizhou Peninsula and the island of Hainan in China, based on numerically simulated surge heights (SHs) and significant wave heights (SWHs) at 30 min intervals from 1949 to 2013. The steps for determining coastal protection standards in scalar values are also demonstrated. Firstly, it is found that the generalized extreme value (GEV) function and the Gumbel copula function are suitable for fitting the marginal and joint distribution characteristics of the SHs and SWHs, respectively, in this study area. Secondly, the SHs are higher at locations closer to the coastline, whereas the SWHs are higher further from the coastline. Lastly, the optimal design values of SHs and SWHs under different joint return periods can be estimated using the nonlinear programming method. This study shows the effectiveness of the bivariate copula function in evaluating the probability of different scenarios, providing a valuable reference for optimizing the design of engineering protection standards.

Tropical cyclone storm surges and waves can cause severe loss of life and property in offshore and coastal areas (Chen and Yu, 2017; Marcos et al., 2019; Wahl et al., 2015). It is therefore of great importance to quantify the intensity–frequency relationship of storm surges and waves in order to understand the joint severity of multi-hazard extreme tropical cyclones (TCs; Zhang and Wang, 2021; Galiatsatou and Prinos, 2016).

In the past, many studies have analyzed single hazard indicators of tropical cyclone storm surges and waves (Lin et al., 2010; Shi et al., 2020; Teena et al., 2012), often with observed time series data or with results simulated by numerical models (Petroliagkis et al., 2016; Bilskie et al., 2016; Huang et al., 2013; Papadimitriou et al., 2020). The intensity of the surge height (SH) or significant wave height (SWH) for a specific return period can be estimated based on extreme value theory (EVT) (Teena et al., 2012; Muraleedharan et al., 2007; Morellato and Benoit, 2010; Niedoroda et al., 2010). Accordingly, the estimated probabilities of single hazards, such as SH or SWH, have been widely applied in setting protection standards for coastal areas (Bomers et al., 2019; Perk et al., 2019; Lee and Jun, 2006).

However, strong storm surges and waves often occur concurrently during tropical cyclone events, and because of the cascading effects of multiple hazards their impact is often greater than that estimated from a single variable alone. For example, when high waves near the coast coincide with strong storm surges, overtopping and overflowing at sea dikes can lead to large areas of inundation and severe damage to coastal facilities (Rao et al., 2012; Hughes and Nadal, 2009; Pan et al., 2019). Similarly, rising sea levels due to storm surges increase the probability of wave overtopping (Pan et al., 2013; Li et al., 2012). The concurrent interaction between storm surges and waves may introduce significant uncertainties into the modeling of multi-hazards. Some studies have investigated the physical interaction of storm surges and waves for specific events through numerical simulation by coupling storm surge and wave models (Xie et al., 2016; Kimf et al., 2016; Brown, 2010).

Statistical tools such as joint probability analysis have been used in multidimensional natural hazard assessment (Hsu et al., 2018). Since the copula function does not restrict the marginal distribution function and can be relatively easily extended to multiple dimensions, it is often used to construct the joint probability of multiple variables (Nelsen, 2006; Chen and Guo, 2019). Copula functions have been applied to a variety of paired hazards, for example, rainfall and storm surge (Jang and Chang, 2022), wind and storm surge (Trepanier et al., 2015), and storm surge and wave (Corbella and Stretch, 2013; Wahl et al., 2012).

In coastal protection standard design, it is essential to analyze and estimate the joint probability of SH and SWH. Chen et al. (2019) used copula functions to analyze the joint probability of extreme significant wave heights (SWHs) and surge heights (SHs) at nine representative stations along China's coasts. Galiatsatou and Prinos (2016) investigated the change over time in the joint probability of extreme wave heights and storm surges using a non-stationary bivariate approach. Marcos et al. (2019) statistically assessed the dependence between extreme storm surges and wind waves along global coastal areas using the outputs of numerical models. Most previous joint probability studies on storm surges and waves focused on specific locations rather than on region-wide analysis. In addition, because the intensities of the two variables and their simultaneous probability form a three-dimensional surface, the cross-section at a given return period is a curve rather than a specific scalar value, so the joint probability of SHs and SWHs alone cannot be used directly as a reference value for engineering protection standards. In order to obtain two specific scalars for SH and SWH, other constraints such as their preferred simultaneous return periods are needed (Xu et al., 2022).

In this study, we aim to explore the joint probability characteristics of tropical cyclone storm surges and waves for large coastal areas and to investigate the methods and steps for selecting the protection standard of sea dikes. Firstly, the marginal distributions and copula functions of the modeling nodes in the study area are fitted based on the long-term numerically simulated tropical cyclone SH and SWH from 1949 to 2013. Next, the optimal copula functions are selected for every modeling node based on the Kolmogorov–Smirnov (K–S) test, Akaike information criterion (AIC) (Cavanaugh and Neath, 2019), and Bayesian information criterion (BIC) (Neath and Cavanaugh, 2012). Then, the correlation between SH and SWH is quantified using the copula function to calculate the probabilities under simultaneous, joint, conditional, and different-level combinations. The change in bivariate occurrence probability after increasing the engineering protection standard for the SHs and SWHs is quantitatively assessed. Finally, with the maximum bivariate simultaneous return period as the objective function and the bivariate joint return period as the constraint, the optimum engineering design values of SHs and SWHs are solved by the nonlinear programming method.

The best-track dataset of historical TCs in the northwestern Pacific (NWP) is obtained from the Tropical Cyclone Data Center of the China Meteorological Administration (CMA). For each TC event since 1949, the CMA records in detail the location (longitude and latitude), time (year, month, day, hour), minimum central pressure, and 2 min average near-center maximum sustained wind speed (MSW) at every 6 h track point (Lu et al., 2021). The landfall of TCs in China is concentrated on the southeast coast, especially in the coastal areas of the South China Sea. Figure 1a shows the spatial distribution of the best track and maximum sustained wind speed of the 86 historical TCs from 1949 to 2013 that were screened for this study.

Best track and MSW of 86 TCs in this study from 1949 to 2013

The SH dataset is obtained from the Ocean University of China, mainly through ADvanced CIRCulation model (ADCIRC) simulations, and includes the SHs of 86 TCs affecting the eastern coast of the Leizhou Peninsula and the island of Hainan from 1949 to 2013 (Liu et al., 2018; Li et al., 2016). The previous study provides a water depth map for the study area (Liu et al., 2018). The ADCIRC model integrates the effects of various boundary conditions and external forcing and uses triangular grids with different resolutions, making it more computationally efficient and applicable in numerical simulations. The simulation results are the total water level after the superposition of the water gain caused by a tropical cyclone and the astronomical tide, and the time step is 30 min.

To improve the simulation accuracy and computing speed of the hot spot area,
the model adopts a triangular grid with nested small- and large-area grids,
and the resolutions of different area grids are set in a gradual resolution
range from 0.0039 to 0.3

The bathymetry of storm surge modeling area

The boundary condition used to force the surge in the subdomain is the time series of the water level at each boundary node, which includes both the tide elevation of eight major constituents (M2, S2, N2, K2, K1, O1, P1, and Q1) in that area from the OSU (Oregon State University) Tidal Prediction Software and the surge elevation extracted from the full domain results (Liu et al., 2018). Comparing the simulated values with the measured surge height at the observation sites, we find that the absolute standard error is 47 cm, the relative standard error is 22 %, and the simulation results are similar to the observed values in most cases. Thus, the dataset can be used to assess the hazard of TC storm surges. Figure 3a shows an example of the simulation results of the surge height of TC Nasha (ID: 1117) at a specific moment.

Distribution of surge height

The SWH dataset is also obtained from the Ocean University of China, mainly through the Simulating WAves Nearshore (SWAN) model, and includes the SWHs of 86 TC events affecting the study area from 1949 to 2013 (Li et al., 2016). The SWAN model has the advantage of high computational accuracy and stability and has been widely used in numerical simulations of offshore waters. The simulation results include indicators such as significant wave height, mean period, and wave direction, and the time step is 1 h.

The model also uses a triangular grid with nested small- and large-area
grids and gradual resolution, but the nodes' scopes and locations differ
from those of the storm surge model. The calculation region for the
large area is 15–22

Comparing the observed data of buoy stations with the simulated values
reveals that the unstructured grid reflects the wave variation
conditions in the sea well. In addition, the mean absolute and root mean square
errors of the simulated results of the locally refined unstructured
triangular grid are the smallest, indicating that the data can effectively
reproduce the wave distribution during tropical cyclones. It should be noted
that the effect of sea level rise due to storm surge was not considered
during the SWH simulation, which will influence the accuracy of SWHs,
especially in intermediate and shallow water. In this paper, we choose the SWH
as an indicator of tropical cyclone wave hazard. Figure 3b shows an example of the significant wave height of TC

Based on the location of the nodes of the triangular grid in the storm surge
(Sect. 2.2) and wave datasets (Sect. 2.3), we select the region with a dense
distribution of both as the study area, and the finalized spatial range is
110–113

Sklar (1973) elucidates the role that copulas play in the
relationship between a multivariate distribution and its univariate marginal
distributions and states that any multivariate joint distribution can be
described by its univariate marginal distribution functions and a copula
describing the dependence structure between the variables (Nelsen, 2006). Let

Fitting a marginal function means constructing the probability density function (PDF) and
cumulative distribution function (CDF) of a single variable by
intensity–frequency analysis to reflect the probability of occurrence of that
variable at different intensities. The method is widely utilized in
assessments of natural hazards such as tropical cyclones, floods, droughts, and
earthquakes. We select five commonly employed marginal functions for
fitting the annual extremes of tropical cyclone storm surges and waves,
including the Gumbel, Weibull, gamma, exponential, and generalized extreme
value (GEV) functions. In this study, the maximum likelihood method is used
to estimate the function parameters, based on which the optimal marginal
functions for SHs and SWHs are screened by the following steps: firstly, the

There are a variety of copula families, including meta-elliptical copulas
(normal and

Formulas and parameter ranges for three types of bivariate Archimedean copula functions.

Note:

The return period (RP) indicates the average recurrence interval of natural
hazard events of a given intensity. It is a crucial indicator for quantifying
the hazard level and is widely utilized in hazard analysis. The formula for
the return period of a single hazard indicator is as follows:

Based on the copula function, the probability that several variables exceed specified thresholds can be estimated quantitatively. The bivariate probability refers to the likelihood that given conditions on the two variables occur together, and the bivariate return period refers to the average time interval between events in which the variables exceed their thresholds together.
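As an illustration of these quantities, the following minimal sketch evaluates three commonly used bivariate exceedance probabilities from a Gumbel copula. The definitions used here (both variables exceed, at least one exceeds, one exceeds given that the other exceeds) are standard textbook forms and are assumed, not confirmed, to correspond to the simultaneous, joint, and conditional probabilities of this study. The function names, the dependence parameter `theta = 2.0`, and the 50-year marginal levels are hypothetical.

```python
import math

def gumbel_copula(u, v, theta):
    """Gumbel copula CDF C(u, v); theta >= 1, u and v in (0, 1]."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))

def bivariate_probabilities(u, v, theta):
    """Exceedance probabilities for marginal CDF values u = F(x), v = G(y).

    Assumed definitions (annual extremes):
      p_and  : X > x and Y > y
      p_or   : X > x or  Y > y
      p_cond : X > x given Y > y
    """
    c = gumbel_copula(u, v, theta)
    p_and = 1.0 - u - v + c
    p_or = 1.0 - c
    p_cond = p_and / (1.0 - v)
    return p_and, p_or, p_cond

def return_period(p_exceed):
    """Return period in years for an annual exceedance probability."""
    return 1.0 / p_exceed

# Hypothetical example: both thresholds set at the 50-year marginal level.
u = v = 1.0 - 1.0 / 50.0
p_and, p_or, p_cond = bivariate_probabilities(u, v, theta=2.0)
print(return_period(p_and), return_period(p_or), p_cond)
```

Under these definitions the "or" return period can never exceed the marginal return periods, and the "and" return period can never be shorter than them, which is why a single joint return period does not by itself fix a pair of design values.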

The definitions of three types of joint probabilities and return periods are
given according to the univariate return period formula. The first type is
when two variables simultaneously reach a given threshold, which will be
defined as the simultaneous probability

To carry out the tropical cyclone storm surge and wave combination scenario
simulation, we classify the SH and SWH into five classes (Table 2) by referring to the “Technical directives for risk assessment and zoning of marine disasters – Part 1: Storm Surge” (MNR, 2019) and “Technical directives for risk assessment and zoning of marine disasters – Part 2: Waves” (MNR, 2021). We calculate the bivariate probabilities for
discretized hazard level combination scenarios based on the marginal and
copula functions of the storm surge and wave. The formula is as follows:
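One common way to obtain the probability of such a discretized combination is the rectangle formula of a copula, sketched below under stated assumptions. It is not necessarily the exact formula used in this study. The class edges are expressed as marginal CDF values, and both the edges and the dependence parameter are hypothetical placeholders, not the thresholds of Table 2.

```python
import math

def gumbel_copula(u, v, theta):
    """Gumbel copula CDF; returns 0 on the lower boundary (u = 0 or v = 0)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))

def cell_probability(u_lo, u_hi, v_lo, v_hi, theta):
    """P(u_lo < F(X) <= u_hi, v_lo < G(Y) <= v_hi) by the rectangle formula."""
    return (gumbel_copula(u_hi, v_hi, theta)
            - gumbel_copula(u_lo, v_hi, theta)
            - gumbel_copula(u_hi, v_lo, theta)
            + gumbel_copula(u_lo, v_lo, theta))

# Hypothetical marginal CDF values at the class boundaries (five classes each).
sh_edges = [0.0, 0.50, 0.80, 0.90, 0.96, 1.0]
swh_edges = [0.0, 0.50, 0.80, 0.90, 0.96, 1.0]
theta = 2.0  # hypothetical Gumbel dependence parameter

# table[i][j]: probability that SH falls in class i+1 and SWH in class j+1.
table = [[cell_probability(sh_edges[i], sh_edges[i + 1],
                           swh_edges[j], swh_edges[j + 1], theta)
          for j in range(5)] for i in range(5)]
total = sum(sum(row) for row in table)
```

Because the 25 cells partition the unit square, their probabilities sum to one, which gives a simple consistency check on any implementation.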

Hazard level classification thresholds for combined scenarios of tropical cyclone surge height and significant wave height.

In actual engineering protection design, if the protection standards of SH
and SWH are appropriately increased or decreased, it can change the
simultaneous bivariate probability

Based on the bivariate copula function, the bivariate joint probability of
extreme storm surges and waves under different joint return periods is
available. To achieve the optimal protection effect, we take the
maximization of the bivariate simultaneous probability of SH and
SWH as the objective function (Eq. 16) and use the joint probability as the constraint
(Eq. 17).
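The idea can be sketched as follows, working in probability space. For a prescribed joint return period, the constraint fixes the copula level C(u, v) = 1 - 1/T, which defines a curve of admissible (u, v) pairs. The sketch then searches along that curve for the pair with the largest simultaneous probability. A plain one-dimensional grid search is used here as a stand-in for the nonlinear programming solver, and a Gumbel copula with a hypothetical parameter `theta` is assumed. The design SH and SWH would follow by inverting the fitted marginal CDFs at the returned `u` and `v`.

```python
import math

def design_point_on_contour(t_joint, theta, n=20001):
    """Search the contour C(u, v) = 1 - 1/t_joint of a Gumbel copula for the
    point maximizing the simultaneous probability 1 - u - v + C(u, v).

    Returns (u, v, p_simultaneous). Grid search, not a general NLP solver.
    """
    p = 1.0 - 1.0 / t_joint              # copula level fixed by the constraint
    a = (-math.log(p)) ** theta
    best = None
    for i in range(1, n):                # u strictly between p and 1
        u = p + (1.0 - p) * i / n
        rest = a - (-math.log(u)) ** theta
        if rest <= 0.0:                  # numerical guard near the end point
            continue
        v = math.exp(-rest ** (1.0 / theta))   # solves C(u, v) = p for v
        p_sim = 1.0 - u - v + p
        if best is None or p_sim > best[2]:
            best = (u, v, p_sim)
    return best

# Hypothetical example: joint return period of 100 years, theta = 2.
u_opt, v_opt, p_sim = design_point_on_contour(100.0, 2.0)
print(u_opt, v_opt, 1.0 / p_sim)
```

For an exchangeable copula such as the Gumbel, the optimum in probability space lies where u and v are equal. The corresponding SH and SWH design values still differ because their marginal distributions differ.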

Diagram of determining design surge height and significant wave
height based on their joint and simultaneous return periods (red curves are
joint return periods (RP

Since the triangular grids of the storm surge and wave models differ in node density and location, we use the storm surge grid nodes as the benchmark and, using the nearest-neighbor method, assign to each storm surge node the simulation result of the closest wave node. In this way, a dataset of storm surges and waves with the same number and location of nodes is reconstructed, containing 1665 nodes in the study area.
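The matching step can be sketched as below. This is a brute-force nearest-neighbor search, and it assumes that distance is measured directly in longitude–latitude degrees, which may differ from the distance measure actually used. The node coordinates are hypothetical.

```python
import math

def nearest_wave_node(surge_node, wave_nodes):
    """Index of the wave node closest to one surge node; nodes are (lon, lat)."""
    return min(range(len(wave_nodes)),
               key=lambda j: math.dist(surge_node, wave_nodes[j]))

def pair_nodes(surge_nodes, wave_nodes):
    """For every surge node, the index of its nearest wave node."""
    return [nearest_wave_node(s, wave_nodes) for s in surge_nodes]

# Hypothetical coordinates (degrees east, degrees north).
surge_nodes = [(110.50, 20.90), (111.20, 19.60)]
wave_nodes = [(111.25, 19.55), (110.48, 20.93), (112.00, 21.00)]
pairs = pair_nodes(surge_nodes, wave_nodes)
print(pairs)  # [1, 0]
```

For grids with many thousands of nodes a spatial index would be faster, but the result of the matching is the same.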

In this paper, based on the reconstructed storm surge and wave simulation
results of historical TC events, we calculate each node's annual extremes of
SH and SWH. Firstly, the time series of the bivariate annual extremes for
all nodes are fitted with five marginal functions, including Gumbel,
Weibull, gamma, exponential, and generalized extreme value (GEV). Next, the

Frequency and percentage of five functions passing the K–S test and the optimal function for all nodes of SH and SWH.

The statistical results show that, for fitting the SH, the GEV function has the highest non-rejection rate in the K–S test, at 100 %, and it is the optimal function at 30.04 % of the nodes, so GEV is set as the optimal marginal function in this study. For the SWH fitting, the number of nodes at which the GEV function is not rejected by the K–S test is 1657, accounting for 99.52 % of the total number of nodes, and the percentage of nodes at which it is the optimal function is also higher than that of the other functions. We apply the GEV function to fit the marginal function of the SH and SWH at all nodes and calculate the PDF, CDF, and RP. Figure 5 shows an example of the PDF and CDF of the SH and SWH for a given node.
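For readers who wish to reproduce this kind of screening, a minimal sketch of the one-sample K–S check against a fitted GEV is given below. The GEV parameters and the annual maxima are hypothetical. The rejection threshold uses the large-sample approximation 1.36/sqrt(n) for the 5 % level, which is an assumption here, since the significance level and the critical values used in this study are not stated in this section. The approximation also ignores the fact that the parameters are estimated from the same sample.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV CDF with location mu, scale sigma, and shape xi (xi = 0: Gumbel)."""
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:                      # outside the support of the distribution
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_not_rejected(sample, cdf, coeff=1.36):
    """Large-sample 5 % criterion: do not reject if D <= coeff / sqrt(n)."""
    return ks_statistic(sample, cdf) <= coeff / math.sqrt(len(sample))

# Hypothetical annual maximum surge heights (m) and GEV parameters at one node.
annual_maxima = [0.62, 0.71, 0.80, 0.85, 0.93, 1.02, 1.10, 1.25, 1.41, 1.78]
mu, sigma, xi = 0.85, 0.28, 0.05
d_stat = ks_statistic(annual_maxima, lambda x: gev_cdf(x, mu, sigma, xi))
print(d_stat, ks_not_rejected(annual_maxima, lambda x: gev_cdf(x, mu, sigma, xi)))
```

Repeating this check at every node and counting the non-rejections gives the kind of frequency and percentage summary reported in the table.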

Fitting results of the PDF and CDF of the surge height and
significant wave height based on the GEV function (using node
(110.5142

Based on the univariate return period formula (Eq. 5), the SH and SWH are estimated for six typical return periods of 5, 10, 20, 50, 100, and 200 years at all nodes. To analyze the distribution characteristics of the univariate return period in this study area, we use the cubic spline interpolation method to interpolate the intensity values at the nodes for each return period into a raster with a resolution of 1 km (Figs. 6 and 7).
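The return levels at a single node can be sketched as follows. The sketch assumes the usual annual-maximum relation between the return period T and the non-exceedance probability, F(x) = 1 - 1/T, and the GEV parameterization in which a positive shape parameter gives a heavy upper tail. The parameter values are hypothetical and are not fitted values from this study.

```python
import math

def gev_return_level(t_years, mu, sigma, xi):
    """Intensity with return period t_years under a GEV fitted to annual maxima."""
    p = 1.0 - 1.0 / t_years           # non-exceedance probability F(x)
    y = -math.log(p)
    if abs(xi) < 1e-12:               # Gumbel limit of the GEV
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)

# Hypothetical GEV parameters for the surge height (m) at one node.
mu, sigma, xi = 1.2, 0.45, 0.10
levels = {t: gev_return_level(t, mu, sigma, xi)
          for t in (5, 10, 20, 50, 100, 200)}
for t, x in levels.items():
    print(f"{t:>3}-year level: {x:.2f} m")
```

Evaluating this function at every node for each of the six return periods yields the point values that are then interpolated to the 1 km raster.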

Spatial distribution of surge heights of tropical cyclones for six typical return periods.

Spatial distribution of significant wave heights of tropical cyclones for six typical return periods.

As shown in Fig. 6, the SH increases markedly toward the coastline. The SHs along the eastern coast of the Leizhou Peninsula are higher than those in most other regions. Frequent TC events, the moving direction of TCs (Fig. 1), and the pocket-shaped coastal topography (Fig. 2) all favor water accumulation in this area. Another area with high SHs is located to the east of the island of Hainan. Besides frequent TCs, this area lies in the transition zone from the continental shelf to the continental slope, where the bathymetry changes rapidly and strong storm surges are readily generated.

As shown in Fig. 7, the SWHs near the shore are generally smaller than those in the open sea, and the SWH decreases markedly toward the coastline. This is mainly attributed to the shallow nearshore depth, island obstruction, wave breaking, and seabed friction attenuation. In particular, the SWHs off the eastern Leizhou Peninsula are lower than those in other sea areas, which is mainly due to the curved, concave coastline and the topography of this shore section. The SWHs are also influenced by the frequency, duration, and intensity of TCs, so the SWH is higher to the east and south of the island of Hainan than to the north. East of the island of Hainan, the dramatic change in seafloor topography from the continental shelf to the continental slope causes wave breaking and dissipation, which results in a steeper gradient in SWH. In addition, it should be noted that errors may be introduced when estimating SWHs with the GEV function because of the limited number of TC events.

The optimal GEV function is utilized as the marginal function for the TC storm surges and waves, based on which three copula functions are applied to the bivariate joint fitting of 1665 nodes. The function parameters are fitted by the maximum likelihood method, and the K–S test is used to determine whether the hypothesis that the sample obeys a certain functional distribution is rejected. Next, we count the number of nodes that pass the K–S test for the three types of copula functions and their percentage of the total number of nodes (Table 4). The statistical results show that the number of nodes passing the K–S test for the Gumbel copula function is 1603, accounting for 96.28 % of all nodes, so it is used as the optimal copula function. The Gumbel copula function is applied to the bivariate joint fitting of SH and SWH for all nodes, and the PDF and CDF are calculated.
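A minimal sketch of a maximum likelihood fit of the Gumbel copula, together with the AIC and BIC of the fitted model, is given below. It uses a simple grid search over the dependence parameter as a stand-in for a numerical optimizer, and the paired values are hypothetical marginal CDF values (pseudo-observations), not data from this study. The density expression is the standard closed form for the Gumbel copula.

```python
import math

def gumbel_density(u, v, theta):
    """Gumbel copula density c(u, v); theta >= 1, u and v in (0, 1)."""
    x, y = -math.log(u), -math.log(v)
    w = x ** theta + y ** theta
    a = w ** (1.0 / theta)
    c_uv = math.exp(-a)                                   # copula CDF value
    return (c_uv * (x * y) ** (theta - 1.0) / (u * v)
            * w ** (1.0 / theta - 2.0) * (a + theta - 1.0))

def log_likelihood(pairs, theta):
    return sum(math.log(gumbel_density(u, v, theta)) for u, v in pairs)

def fit_gumbel(pairs, theta_max=20.0, step=0.01):
    """Grid-search maximum likelihood estimate of theta on [1, theta_max]."""
    n_steps = int(round((theta_max - 1.0) / step))
    grid = [1.0 + step * k for k in range(n_steps + 1)]
    theta_hat = max(grid, key=lambda t: log_likelihood(pairs, t))
    return theta_hat, log_likelihood(pairs, theta_hat)

def aic_bic(loglik, n_obs, n_params=1):
    """AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L."""
    return (2.0 * n_params - 2.0 * loglik,
            n_params * math.log(n_obs) - 2.0 * loglik)

# Hypothetical pseudo-observations (F(SH), G(SWH)) of annual extremes at one node.
pairs = [(0.05, 0.10), (0.15, 0.12), (0.30, 0.35), (0.42, 0.38),
         (0.55, 0.60), (0.68, 0.62), (0.80, 0.85), (0.93, 0.95)]
theta_hat, loglik = fit_gumbel(pairs)
aic, bic = aic_bic(loglik, len(pairs))
print(theta_hat, aic, bic)
```

Fitting each candidate copula family in the same way and comparing the resulting AIC and BIC values, lower being better, is one way to rank the families at a node alongside the K–S test.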

Frequency and percentage of three copula functions passing the K–S test for all nodes of surge height and significant wave height of tropical cyclones.

Simultaneous probabilities of combined scenarios with four typical return periods for surge height and significant wave heights of tropical cyclones.

Based on the optimal marginal function and copula function, we calculate
RP

Joint probabilities of combined scenarios with four typical return periods for surge height and significant wave heights of tropical cyclones.

The simultaneous bivariate probability

The joint bivariate probability

Based on the formula of conditional bivariate probability

Conditional probabilities of bivariate for different return periods of tropical cyclone significant wave heights.

According to the classification thresholds of the hazard indicators (Table 2), SH and SWH are divided into five classes. We calculate the combined scenario probability

Probabilities of combined scenarios with different levels of surge height and significant wave height for tropical cyclones.

Regarding the vertical variation pattern, when the SH hazard level is fixed, as the SWH hazard level increases, the high-value area of the combined scenario probability gradually moves away from the coastline, and the nearshore low-value area gradually expands. This result is consistent with the geographic distribution pattern: the SWH is low nearshore and high offshore. In the horizontal variation pattern, when the hazard level of SWH is fixed, as the hazard level of SH increases, the low-value area of the combined scenario probabilities expands, and its left boundary gradually approaches the coastline. This result is consistent with the geographic distribution of SHs being high nearshore and low offshore. Overall, the maximum value of the probability for each combined scenario tends to decrease as the hazard level of SH or SWH increases. Combinations of large SH and SWH are concentrated in the eastern Leizhou Peninsula at a certain distance from the coast and are less likely to occur in other areas.

Based on the calculated

In the design of storm surge and wave protection standards, if one hazard
indicator is dominant, upgrading the return period for the other variable
can effectively change bivariate

Difference in the simultaneous probability of tropical cyclone surge height and significant wave height for scenarios with elevated return period protection standards.

Difference in joint probability of tropical cyclone surge height and significant wave height for scenarios with elevated return period protection standards.

Differences in the conditional probability of tropical cyclone surge height and significant wave height for scenarios with elevated return period protection standards.

Figure 12 shows the distribution of the reduction values of bivariate

Figure 13 shows the distribution of the reduced values for bivariate

Figure 14 shows the distribution of the reduced values of bivariate

In the engineering protection standard, the appropriate design values of the
SHs and SWHs are set according to the bivariate RP

Design surge heights for six typical joint return period scenarios.

Design significant wave heights for six typical joint return period scenarios.

When RP

When RP

In this study, we aimed to carry out a joint probability analysis of storm surges and waves using copula functions on a large dataset from a wide area and to determine their respective design standards as scalar values of SWH and SH. Our main conclusions are as follows:

The GEV function is the most suitable for fitting the probability distribution characteristics of the annual extremes of tropical cyclone SH and SWH for all nodes in the study area. The Gumbel copula function is appropriate as a bivariate joint distribution function for all nodes in the study area.

The hazard of a single indicator can be characterized by the univariate intensity values with different return periods, which the optimal marginal function can estimate. Our findings show that the SH exhibits a significant increasing trend closer to the coastline, while SWH is higher farther from the shoreline across different return periods. However, we also observe apparent spatial heterogeneity in the distribution, influenced by factors such as the shoreline shape, coastal and submarine topography, and deflection forces.

Bivariate probabilities are utilized in this study to assess the integrated hazard of multiple indicators, including

In the actual design for engineering protection standards, the bivariate

We used the R programming language for the joint probability analysis, and the code is available at

The datasets used in this study are available at

FWH and ZHX conceived the research framework and developed the methodology. ZHX was responsible for the code compilation, data analysis, graphic visualization, and first draft writing. FWH managed the implementation of research activities and revised the manuscript. CM participated in the data collection of this study. All authors discussed the results and contributed to the final version of the paper.

The contact author has declared that none of the authors has any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors acknowledge the financial support of the National Key Research and Development Program of China and the Key Special Project for Introduced Talents Team of Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou). We are grateful to Xing Liu of the Ocean University of China for providing the simulation data of storm surges and waves for historical tropical cyclone events.

This research has been supported by the National Key Research and Development Program of China (grant nos. 2017YFA0604903 and 2018YFC1508803) and the Key Special Project for Introduced Talents Team of the Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) (grant no. GML2019ZD0601).

This paper was edited by Brunella Bonaccorso and reviewed by Francesco Serinaldi and two anonymous referees.