Estimation of synthetic flood design hydrographs using a distributed rainfall – runoff model coupled with a copula-based single storm rainfall generator

In this paper a procedure to derive synthetic flood design hydrographs (SFDHs) is presented, based on a bivariate representation of rainfall forcing (rainfall duration and intensity) via copulas, which describe and model the correlation between two variables independently of the marginal laws involved, coupled with a distributed rainfall–runoff model. Rainfall–runoff (R–R) modelling for estimating the hydrological response at the outlet of a catchment was performed using a conceptual, fully distributed procedure based on the Soil Conservation Service Curve Number method as an excess rainfall model and on a distributed unit hydrograph with climatic dependencies for the flow routing. Travel time computation, required to define the distributed unit hydrograph, was performed by implementing a procedure based on flow paths, determined from a digital elevation model (DEM) and roughness parameters obtained from distributed geographical information. In order to estimate the primary return period of the SFDHs, which provides the probability of occurrence of a flood hydrograph, the peaks and flow volumes obtained through R–R modelling were analysed statistically using copulas. Finally, the shapes of the hydrographs were generated on the basis of historically significant flood events, via cluster analysis. The procedure was applied to the case study of the Imera catchment in Sicily, Italy, and the results are presented.


Introduction
Floods are a global problem and are considered the most frequent natural disaster worldwide. They may have serious socio-economic impacts on a community, causing victims, population displacement and damage to the environment, ecology, landscape and services.
Flood risk analysis and assessment are required to provide information on current and future flood hazard and risk, in order to support flood risk mitigation and to propose, evaluate and select measures to reduce risk. To this end, the European Parliament adopted Directive 2007/60/EC (European Union, 2007), which requires member states to assess whether coastal areas and water courses are at risk from flooding, to produce flood risk maps and to take measures to mitigate the consequent risk. The objective of this directive is to establish a framework for the assessment and management of flood risk in Europe, emphasising both the frequency and magnitude of a flood and its consequences.
Reliable estimates of the likely magnitude of extreme floods are essential in order to reduce future flood damage. Although extreme floods are a problem across Europe, the physical mechanisms responsible for their generation vary between countries and regions. As a result, no standardised European approach to flood frequency estimation exists. Where methods exist, they are often simple, and their ability to accurately predict the effect of environmental change (e.g. urbanisation, land-use change, river training and climate change) is unknown.
Moreover, Mediterranean water courses have specific features compared to other river systems. Mediterranean catchments are, in fact, generally small, with extents of a few hundred km², highly torrential and may generate flash floods (Brigandì and Aronica, 2008; Koutroulis and Tsanis, 2010; Aronica et al., 2012a; Camarasa-Belmonte and Soriano, 2012). Runoff generation in these areas is the final result of numerous complex spatial and temporal processes that take place at the hillslope and catchment scales. The complexity of the processes involved derives from the great heterogeneity of rainfall inputs, surface and subsurface characteristics, and a strong nonlinear dependency on antecedent wetness, which controls the infiltration capacity of the soil surface and the connectivity of surface and subsurface runoff pathways (Nicolau et al., 1996; Candela et al., 2005).
Flood frequency analysis (FFA) estimates how often a specified event will occur and aims to evaluate the flood event in terms of a maximum discharge value corresponding to a given return period and/or the relative volume. The probability of future events can be predicted by fitting past observations to selected probability distributions. Flood event estimation (hydrograph design) requires different methods depending on whether it is enough to know the maximum discharge value or whether the full hydrograph is needed. In both cases, the problem can be solved directly, starting from flow measurements available for the catchment, or indirectly, using recorded rainfall data as input for a rainfall-runoff model. The latter approach is the basis of the derived distributed approach (DDA) methods, which allow one to derive flood hydrographs using rainfall-runoff models. The analytical difficulties associated with this approach are often overcome by adopting numerical Monte Carlo methods. In these cases, a stochastic rainfall generator is used to generate rainfall data for a single event or continuously (Blöschl and Sivapalan, 1997; Loukas, 2002; Rahman et al., 2002; Aronica and Candela, 2007).
Flood frequency analysis is usually based on the derivation of a flood frequency curve (FFC) to define the maximum discharge value corresponding to a given return period only. However, for flood management, information about flood peaks alone is not enough: it is also useful to evaluate the flood volume and hydrograph duration statistically. Since flood peaks and the corresponding flood volumes are variables of the same phenomenon, they should be correlated and, consequently, bivariate statistical analyses should be applied (Serinaldi and Grimaldi, 2011; Aronica et al., 2012b).
In general, bivariate (multivariate) probability models have been limited by the mathematical difficulties of constructing consistent joint laws with ad hoc marginals. Copulas have overcome many of these problems (Salvadori et al., 2007), as they are able to model the dependence structure independently of the marginal distributions.
The aim of this paper is to propose an indirect procedure for the estimation of synthetic flood design hydrographs using a bivariate representation (via copulas) of rainfall (rainfall duration and intensity), used as forcing input to a distributed rainfall-runoff model for assessing the hydrological response of a catchment. Specifically, the proposed procedure is based on a single-event approach and not on continuous simulation.
Several authors have shown that "continuous simulation" schemes, which consist of generating long synthetic rainfall time series and transforming them through a continuous rainfall-runoff model, could be preferable to "event-based" schemes for SFDH definition (Verhoest et al., 2010; Grimaldi et al., 2012b). However, continuous simulation requires complex rainfall-runoff models to simulate the hydrological response, as well as the availability of long, continuous and reliable time series of hydrological variables such as rainfall and discharge. When, as in many real-world cases, the available data are not sufficient to allow the calibration of a continuous (and often structurally complex) rainfall-runoff model, an event-based approach can be preferable, because it is easily applicable and affected by fewer errors and total uncertainties (Wagener et al., 2004).
An important aspect to be considered in the estimation of the SFDH is the choice of an appropriate return period. Several authors (see for instance Salvadori et al., 2011; Gräler et al., 2013; Requena et al., 2013; Volpi and Fiori, 2014) have shown that different return periods estimated by fitted copulas can be considered. Among them, the primary return period, which provides the probability of occurrence of a flood hydrograph, was selected in this study. In order to estimate it, the peaks and flow volumes obtained through R-R modelling were analysed statistically via copulas. Finally, the shape of the hydrograph was generated on the basis of a modelled flood event, via cluster analysis.

Case study
The Imera catchment, with an area of about 2000 km², is located in the southwestern part of Sicily, Italy (Fig. 1). The study focused on one sub-catchment, named Drasi, with an area of 1789 km², whose outlet is about 30 km upstream of the river mouth in the Mediterranean Sea. The "Imera at Drasi" flowgauge station is located at the outlet of this sub-catchment. Its climate is Mediterranean, with hot dry summers and a rainy winter season from October to April, and the hydrological response strongly depends on the initial soil water state, which is highly variable because of the large range of weather conditions. The measurement network (Fig. 1) consists of eight raingauges (Canicattì, Caltanissetta, Delia, Mazzarino, Enna, Riesi, Petralia Sottana, Polizzi Generosa) located within the catchment, characterised by a temporal resolution of 10 min, and of one flowgauge (Drasi). Historical series of rainfall are available from 1960 to date on an hourly basis and from 2001 to date on a 10 min basis, while discharges are available only on an hourly basis. Unfortunately, the hourly series (rainfall and discharges) show many gaps, especially in periods of high flows. A digital elevation model with a resolution of 200 m and a grid of 278 × 399 pixels is also available for this catchment.

Methodology
This section describes in detail the proposed procedure to derive SFDHs, whose layout can be summarised as follows: (1) stochastic generation of rainfall to derive single synthetic rainfall events by a bivariate analysis based on copulas; (2) rainfall-runoff modelling for the estimation of the hydrological response at the outlet of the catchment using a conceptual, fully distributed model; (3) derivation of synthetic hydrographs (for a given design return period) by bivariate analysis of the rainfall-runoff outputs.

Stochastic generation of rainfall
The hydrological input was derived by using a stochastic model to generate single synthetic sub-hourly rainfall events (Brigandì, 2009; Brigandì and Aronica, 2013). The generated rainfall events are totally stochastic, but their characteristics in terms of shape, duration and average intensity have to satisfy the parameters derived from statistical analyses of the available historical records. The stochastic generation of rainfall was based on the implementation of the two following modules:

- Intensity-duration: statistical description and generation of storm characteristics using a bivariate model.
- Temporal pattern: generation of within-storm temporal characteristics, such as time step intensity variations, using simple statistical descriptors.

Intensity-duration module
Since storm duration, average intensity and rainfall volume are variables of the same phenomenon, they should be correlated. Consequently, these variables have to be analysed jointly through bivariate models and, in particular, those based on the theory of copulas.
In a two-dimensional context, a copula C is a function which represents the joint distribution function of two dependent random variables uniformly distributed between 0 and 1:

C(u1, u2) = P(U1 ≤ u1, U2 ≤ u2),     (1)

where u1 and u2 denote realisations. Let X and Y be two random variables with marginal distribution functions F_X(x) and F_Y(y). Through a change of variables it is possible to obtain the bivariate cumulative distribution function, as expressed by the theorem of Sklar (1959):

F_{X,Y}(x, y) = C(F_X(x), F_Y(y)).     (2)

The advantage of the copula method is that no assumption is needed for the variables to be independent or to have the same type of marginal distribution. More information and applications concerning copulas can be found in Nelsen (2006) and Genest and Favre (2007).
A particular subclass of copulas, called Archimedean copulas, is widely used in hydrological applications, given their many useful properties and, in particular, their easy construction. Because storm duration and average rainfall intensity are negatively correlated with each other, the Frank copula was considered for this application because of its ability to model both positive and negative correlations (Favre et al., 2004; Zhang and Singh, 2007). De Michele and Salvadori (2003) showed, in fact, that the Frank copula is the best candidate for modelling the dependence between average rainfall intensity and storm duration in comparison with other families of copulas, and also for reproducing the asymptotic statistical distribution of the storm depth.
In the present study, the Frank family of 2-copulas was considered:

C(u1, u2) = -(1/θ) ln{1 + [(e^(-θu1) - 1)(e^(-θu2) - 1)] / (e^(-θ) - 1)},     (3)

with generating function

φ(t) = -ln[(e^(-θt) - 1) / (e^(-θ) - 1)],     (4)

where θ is the copula parameter, related to the Kendall correlation coefficient τ between X and Y through the expression

τ = 1 + (4/θ) [D1(θ) - 1],     (5)

where D1 is the first-order Debye function:

D1(θ) = (1/θ) ∫₀^θ t / (e^t - 1) dt.     (6)

The first step in determining a copula is to obtain its generating function from bivariate observations. The procedure followed in this study to calculate the generating function and the resulting copula was described by Genest and Rivest (1993). It assumes that, for a random sample of bivariate observations (x1, y1), (x2, y2), ..., (xN, yN), the underlying distribution function H_{X,Y}(x, y) has an associated Archimedean copula C(u1, u2), which can also be regarded as an alternative expression of the joint cumulative distribution function (CDF). The procedure involves the following steps, all implemented through Matlab© routines:

1. Determine Kendall's τ from the observations.
2. Determine the copula parameter θ from the above value of τ using Eqs. (5) and (6). For the Frank copula family introduced above, the θ parameter needs to be determined numerically, since there is no closed-form relation between τ and θ.
3. Obtain the copula from the calculated copula parameter θ. One can also obtain the generating function of the copula, since the generating function is expressed in terms of the copula parameter.
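The numerical determination of θ from τ (step 2 above) can be sketched as follows. The paper used Matlab routines; this is a minimal Python equivalent, assuming the standard Frank copula relations of Eqs. (3), (5) and (6), with the Debye function evaluated by the trapezoidal rule and θ found by bisection.

```python
import math

def debye1(x, n=4000):
    """First-order Debye function D1(x) = (1/x) * int_0^x t/(e^t - 1) dt.
    Trapezoidal rule; the signed step also handles negative x."""
    if x == 0.0:
        return 1.0
    f = lambda t: 1.0 if t == 0.0 else t / math.expm1(t)
    h = x / n
    s = 0.5 * (f(0.0) + f(x)) + sum(f(k * h) for k in range(1, n))
    return (h * s) / x

def kendall_tau(theta):
    """Kendall's tau for the Frank copula, tau = 1 + 4*(D1(theta) - 1)/theta."""
    return 1.0 + 4.0 * (debye1(theta) - 1.0) / theta

def theta_from_tau(tau_target, lo=-50.0, hi=-1e-9, iters=100):
    """Bisection for theta given tau (tau is increasing in theta);
    the bracket here assumes a negative correlation, as for the
    intensity-duration data of the paper."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kendall_tau(mid) < tau_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def frank_copula(u1, u2, theta):
    """Frank 2-copula, Eq. (3)."""
    num = math.expm1(-theta * u1) * math.expm1(-theta * u2)
    return -math.log1p(num / math.expm1(-theta)) / theta
```

For example, `theta_from_tau(-0.381)` returns a negative θ whose implied τ matches the input to numerical precision.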
Moreover, the use of copulas requires the determination of the marginal distributions based on univariate data. Therefore, the fitting of several extreme value distributions was considered, i.e. the exponential, Gamma, Weibull and two-parameter lognormal distribution functions:

F(x) = 1 - e^(-λx),     (7a)
F(x) = γ(α, x/β) / Γ(α),     (7b)
F(x) = 1 - exp[-(x/λ)^k],     (7c)
F(x) = Φ[(ln x - μ)/σ].     (7d)

These are one- or two-parameter distributions, allowing for various degrees of model complexity. Three-parameter distributions were not considered for this application, given the reduced size of the samples.
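Fitting competing marginals and ranking them by AIC, as done later in the paper, can be sketched with closed-form maximum likelihood estimates. This is an illustrative sketch with synthetic data, not the paper's fit; only the exponential and lognormal candidates are shown, since their ML estimates have closed forms.

```python
import math, random

def fit_exponential(data):
    """ML fit of the exponential distribution; returns (lambda, log-likelihood)."""
    lam = 1.0 / (sum(data) / len(data))
    loglik = sum(math.log(lam) - lam * x for x in data)
    return lam, loglik

def fit_lognormal(data):
    """ML fit of the two-parameter lognormal; returns ((mu, sigma), log-likelihood)."""
    logs = [math.log(x) for x in data]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((l - mu) ** 2 for l in logs) / len(logs))
    loglik = sum(-math.log(x * sigma * math.sqrt(2.0 * math.pi))
                 - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2) for x in data)
    return (mu, sigma), loglik

def aic(loglik, k):
    """Akaike information criterion: lower is better."""
    return 2.0 * k - 2.0 * loglik

# Demo on a synthetic lognormal sample (seeded for reproducibility).
random.seed(1)
sample = [random.lognormvariate(0.0, 0.5) for _ in range(500)]
lam, ll_exp = fit_exponential(sample)
(mu, sigma), ll_ln = fit_lognormal(sample)
aic_exp = aic(ll_exp, 1)   # one parameter
aic_ln = aic(ll_ln, 2)     # two parameters
```

On this sample the lognormal fit is, as expected, preferred despite its extra parameter.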

Rainfall temporal pattern submodel
In order to define the temporal pattern of rainfall for each event, the mass curve concept, as implemented similarly by various other authors (Huff, 1967; Garcia-Guzman and Aranda-Oliver, 1993; Chow et al., 1988), was considered here. Following this kind of approach, the variability of precipitation within a rainy period is represented by a dimensionless hyetograph H(d), defined as follows:

H(d) = h(t) / V,     (8)

which identifies the fraction of rainfall accumulated over the time interval [0, d]. In Eq. (8), t (0 ≤ t ≤ D) is a fraction of the total duration D of the considered event and, consequently, d = t/D (0 ≤ d ≤ 1) is the corresponding dimensionless duration; h(t) is the cumulative rainfall depth at time t (0 ≤ h ≤ V), V = I · D is the total storm volume and D the storm duration of the event.
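Normalising a recorded hyetograph into the dimensionless mass curve of Eq. (8) is straightforward; a minimal sketch:

```python
def mass_curve(increments):
    """Turn a hyetograph (rainfall depth per time step, mm) into the
    dimensionless mass curve H(d) of Eq. (8), sampled at
    d = 0, 1/N, 2/N, ..., 1: cumulative depth divided by total volume."""
    total = sum(increments)
    curve = [0.0]
    acc = 0.0
    for p in increments:
        acc += p
        curve.append(acc / total)
    return curve
```

For instance, `mass_curve([2.0, 6.0, 2.0])` yields the non-decreasing curve `[0.0, 0.2, 0.8, 1.0]`.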
The normalised events so obtained are the input for selecting an appropriate probability function for the hyetograph shape. Although any continuous density function could be appropriate to represent the shape at every analysed time step between 0 and 1, here the choice fell on the Beta distribution because it is a very simple model that fits this kind of rainfall data reasonably well.
The Beta cumulative distribution function for a given value x and a given pair of parameters a and b is

F(x; a, b) = (1 / B(a, b)) ∫₀^x t^(a-1) (1 - t)^(b-1) dt,     (9)

where B(a, b) is the Beta function:

B(a, b) = Γ(a) Γ(b) / Γ(a + b).     (10)
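Generating one synthetic dimensionless mass curve from per-step Beta parameters (as fitted in Table 4) can be sketched as follows. The sorting step, which enforces a non-decreasing curve, is an assumption of this sketch rather than a detail stated in the paper.

```python
import random

def sample_mass_curve(beta_params, seed=None):
    """Draw one dimensionless mass curve: at each interior dimensionless
    time step d = 0.1, ..., 0.9, H(d) ~ Beta(a_d, b_d) (Eq. 9); the draws
    are sorted so the curve is non-decreasing (an assumption of this
    sketch), and the endpoints are fixed at H(0) = 0 and H(1) = 1.
    beta_params: list of nine (a, b) pairs, one per interior step."""
    rng = random.Random(seed)
    interior = sorted(rng.betavariate(a, b) for a, b in beta_params)
    return [0.0] + interior + [1.0]
```

With nine (a, b) pairs the returned curve has 11 points, matching the 11 equal time steps used in the application.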

Rainfall-runoff modelling
In Mediterranean areas, the intense weather phenomena responsible for flood events are often characterised by high spatial variability. For this reason, a conceptual, fully distributed model with climatic dependencies for the flow routing, derived from a semi-distributed model presented by Di Lorenzo (1993), was proposed and implemented in this study. This model is characterised by a parsimonious structure (Jakeman and Hornberger, 1993), which was preferred over more complex models because of its simplicity and its ability to approximate the runoff behaviour of catchments characterised by a fast hydrological response, small areas and inadequate hydrological data.
The proposed model represents, as a linear kinematic transfer mechanism, the routing of the outflows coming from the different contributing areas of the catchment, through the definition of a distributed hydrological response array with climatic characteristics.
The rainfall inputs are, hence, distributed in space and varying in time. They are represented by a three-dimensional matrix P of order (A, B, N), where A and B are the number of cells into which the catchment was divided in the x and y directions, and N is the number of intervals into which the rainfall event of duration D (with N = D/Δt) was discretised on a time step Δt basis:

P = [P_{i,j,t}],  i = 1, ..., A;  j = 1, ..., B;  t = 1, ..., N.     (11)

In Eq. (11) the generic term P_{i,j,t} represents the rainfall, expressed in mm, falling on the cell of coordinates i, j at time t.
To transform the gross rainfall into effective rainfall, the SCS-CN method of the USDA Soil Conservation Service (1986) was used here. This method allows one to incorporate information on land-use change, as the CN is a function of soil type, land use, soil cover conditions and the degree of saturation of the soil before the start of the storm.
Since precipitation variable in time was considered, the runoff volume P_{e,i,j,t} was calculated in a dynamic form (Chow et al., 1988; Montaldo et al., 2007; Grimaldi et al., 2012a) as a function of the storm depth P_{i,j,t}, given the initial abstraction I_{a,i,j} = cS_{i,j} and the infiltrated volume F_{i,j,t}, according to the following expression:

P_{e,i,j,t} = (P_{i,j,t} - I_{a,i,j})² / (P_{i,j,t} - I_{a,i,j} + S_{i,j})  if P_{i,j,t} > I_{a,i,j},  0 otherwise,     (12)

with F_{i,j,t} calculated as

F_{i,j,t} = S_{i,j} (P_{i,j,t} - I_{a,i,j}) / (P_{i,j,t} - I_{a,i,j} + S_{i,j}),     (13)

and

S_{i,j} = 25.4 (1000/CN_{i,j} - 10).     (14)

The CN_{i,j} parameter was also defined in a distributed form, starting from a map of its spatial distribution obtained on the basis of the knowledge of soil types, land use and hydrologic soil types. Using Eqs. (12)-(14), the matrix of effective rainfall P_e was obtained with the same structure as the matrix in Eq. (11).
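The SCS-CN transformation for a single cell can be sketched as follows; this is a minimal one-cell version of Eqs. (12)-(14), assuming cumulative rainfall depths in mm and the conventional c = 0.2 initial abstraction ratio.

```python
def scs_cn_runoff(cum_rain, cn, c=0.2):
    """Cumulative effective rainfall (mm) from cumulative gross rainfall
    for one cell, via the SCS-CN method.
    cum_rain: non-decreasing cumulative rainfall depths (mm);
    cn: curve number; c: initial abstraction ratio (assumed 0.2 here)."""
    s = 25.4 * (1000.0 / cn - 10.0)   # potential maximum retention, mm (Eq. 14)
    ia = c * s                        # initial abstraction, mm
    pe = []
    for p in cum_rain:
        if p <= ia:
            pe.append(0.0)            # all rainfall still abstracted
        else:
            pe.append((p - ia) ** 2 / (p - ia + s))   # Eq. (12)
    return pe
```

For example, with CN = 82 (the uncalibrated reference value reported later for the Imera catchment), a 60 mm cumulative depth produces roughly 23 mm of effective rainfall; incremental effective rainfall follows by differencing consecutive values.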
The matrix H of order (Θ, A, B) describes the hydrological response of the catchment and represents the space-time distribution of the contributing areas (isochrone areas). It can be derived starting from the knowledge of the concentration time of each cell within the catchment:

H = [H_{ϑ,i,j}],  ϑ = 1, ..., Θ,     (15)

where H_{ϑ,i,j} represents the surface of the cell of coordinates i and j whose concentration time equals ϑ_n (with ϑ_n = n · ϑ_catch/Θ and n = 1, 2, ..., Θ), and Θ is the number of intervals into which the catchment concentration time ϑ_catch is discretised.
In particular, in order to derive the concentration time at the cell scale, the Wooding (1965) formula was used here:

ϑ_{i,j} = [L_{i,j→out} / (k_{i,j→out} · s_{i,j→out}^{1/2})]^{0.6} · r_{i,j}^{-0.4},     (16)

where L_{i,j→out} [m] is the hydraulic path length between the centroid of the cell of coordinates i, j and the outlet section of the catchment, k_{i,j→out} [m^{1/3} s^{-1}] is the Strickler roughness for the same path, s_{i,j→out} [m m^{-1}] is its slope, and r_{i,j} [m s^{-1}] is the average rainfall intensity of the rainfall event over the cell of coordinates i, j. The matrix of runoff Q is obtained by multiplying the hydrological response matrix H by the effective rainfall matrix P_e:

Q = H · P_e,     (17)

in which the generic term Q_{ϑ,t} represents the available runoff for the ϑ isochrone zone at time t.
In this application, the path lengths and their average slopes to be included in Eq. (16) were extracted from a DEM (Noto and La Loggia, 2007).
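The per-cell travel time of Eq. (16) can be sketched as below. The 0.6 and 0.4 exponents are the standard kinematic-wave result and are assumed here to match the paper's form of the Wooding formula; the climatic dependency of the response matrix enters through the rainfall intensity r.

```python
import math

def wooding_travel_time(length, k_strickler, slope, intensity):
    """Kinematic-wave travel time (s) along a flow path, after Wooding (1965).
    length: hydraulic path length L (m);
    k_strickler: Strickler roughness k (m^(1/3) s^-1);
    slope: path slope s (m/m);
    intensity: average rainfall intensity r (m/s).
    Exponents follow the standard kinematic-wave solution (assumed here)."""
    return (length / (k_strickler * math.sqrt(slope))) ** 0.6 / intensity ** 0.4
```

Note the climatic dependency: doubling the rainfall intensity shortens the travel time by a factor of 2^0.4, so the isochrone pattern changes from event to event.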

Synthetic hydrograph derivation
The final step of the methodology consists in the derivation of flood design hydrographs (FDHs) via synthetic generation, using the output from the rainfall-runoff model. The derivation of the FDHs was carried out following these steps: (a) modelling of the statistical correlation between the flood peak and volume pairs generated by the R-R model via copulas; (b) definition of the normalised hydrograph shape in probabilistic form; (c) final derivation of the FDH by rescaling the selected shape (i.e. for a given joint return period) given the synthetic flood peak and volume values. The core of step (c) is the choice of a joint (bivariate) return period (JRP). Several authors (see for instance Salvadori et al., 2011; Gräler et al., 2013; Requena et al., 2013; Volpi and Fiori, 2014) have shown that different return periods estimated by fitted copulas can be considered. In particular, three types of return period are usually defined: (1) the so-called OR return period, also known as the primary return period, in which the threshold x or the threshold y is exceeded by the respective random variable X or Y; (2) the so-called AND return period, in which the thresholds x and y are both exceeded by the respective random variables X and Y; and (3) the secondary, or Kendall, return period, which is associated with the primary return period and is defined as the mean inter-arrival time of events (called super-critical or dangerous events) more critical than the design event. Although all three return periods can be obtained from copulas thanks to their formulation, in this application the primary return period was used, because it is an intuitive extension of the definition of the univariate return period. Furthermore, as this paper mainly proposes a method for the generation of a design ensemble, the choice of the return period is considered only as an example. The primary return period can easily be calculated by means of a bivariate copula C(u1, u2) as

T = μ_T / [1 - C(u1, u2)],     (18)

where μ_T is the mean inter-arrival time of the events. All couples (u1, u2) that lie on the same contour (corresponding to an isoline p) of the copula C have the same bivariate return period. Hence, for a given design return period T, the corresponding level p = C(u1, u2) can be calculated easily using Eq. (18), and all the pairs (u1, u2) on the isoline p have the same return period. In order to select the single design point (u_{1,T}, u_{2,T}), following Salvadori et al. (2011) the pair on the isoline with the maximum joint probability was chosen. Finally, the corresponding design values Q_{max,T} and V_T can be calculated easily by inverting the marginal CDFs:

Q_{max,T} = F_{Q_max}^{-1}(u_{1,T}),     (19)
V_T = F_V^{-1}(u_{2,T}).     (20)


Application of the proposed methodology

Calibration of rainfall model
Synthetic sub-hourly rainfall events were derived starting from a sample of annual maximum rainfall events, extracted from the series of 10 min rainfall data recorded at the raingauges mentioned above. Following De Michele and Salvadori (2003), two events were considered independent if they were separated by a dry period of at least 7 h. As a consequence, an inter-event time of 7 h was adopted here to derive single rainfall events from the available rainfall series, for which average intensity and duration were derived. Kao and Govindaraju (2007) noted that the definition of annual maximum events for multivariate problems is somewhat ambiguous. As a matter of fact, extreme rainfall events could be defined as those storms that have both high volume and high peak intensity. Therefore, the definition of extreme rainfall based on the events with the annual maximum joint cumulative probability was adopted in this study, where the joint cumulative probabilities of the samples can be estimated directly via the empirical copula C_n, as introduced by Yue et al. (1999).
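The empirical copula used to rank events by joint cumulative probability can be sketched as follows; a minimal version that, for each observed (intensity, duration) pair, counts the fraction of pairs dominated in both components.

```python
def empirical_copula_probs(xs, ys):
    """Joint empirical cumulative probability for each observed pair:
    C_n(i) = (1/n) * #{j : x_j <= x_i and y_j <= y_i}.
    The annual maximum event can then be taken as the pair with the
    largest C_n within each year."""
    n = len(xs)
    return [sum(1 for j in range(n)
                if xs[j] <= xs[i] and ys[j] <= ys[i]) / n
            for i in range(n)]

def annual_maximum_index(xs, ys):
    """Index of the event with the largest joint empirical probability."""
    probs = empirical_copula_probs(xs, ys)
    return max(range(len(probs)), key=probs.__getitem__)
```

On a sample where one event dominates both variables, that event receives C_n = 1 and is selected as the annual maximum.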
Moreover, as all the raingauges are in a hydrologically homogeneous area (Cannarozzo et al., 1995), all subsequent statistical analyses were performed by aggregating all the selected events, obtaining a final sample of 80 rainfall events whose characteristics are reported in Table 1. The procedure described in Sect. 3.1.1 was applied in order to derive the θ parameter of the Frank copula and the generating function. The copula so obtained was characterised by a parameter θ equal to −3.7573, calculated using Eq. (6), where the Kendall correlation coefficient τ was equal to −0.381.
In order to evaluate the goodness of fit of the chosen copula, the empirical copula C_n and the best fitted copula joint distribution were compared in a Q-Q plot (Fig. 2, left). Furthermore, to confirm the goodness of fit, the parametric and nonparametric values of the function K(z), as defined by Genest and Rivest (1993), were calculated and are shown in Fig. 2 (right). These comparisons confirmed that the Frank 2-copula is well suited to describing the dependence structure of the available intensity-duration data.
The parameters of the marginal distributions considered were estimated by applying the maximum likelihood method, and the best fitted distribution was selected using various criteria. In particular, the Akaike information criterion (AIC), the relative root mean square error (RRMSE) and the Anderson-Darling test (Kottegoda and Rosso, 2008) were applied to verify the goodness of the fitting. The results, shown in Tables 2 and 3, returned the lognormal probability distribution as the best marginal distribution for the average storm intensity and the Weibull distribution as the best marginal distribution for the storm duration. Furthermore, Fig. 3 shows the two marginal distributions, defined respectively by Eq. (7c-d), and the empirical exceedance probabilities of the observed data computed using the Gringorten formula.
The historical dimensionless mass curves derived for all the selected events are plotted in Fig. 4. These curves were then sampled at 11 equal time steps (0; 0.1; 0.2; ...; 0.9; 1) and, for each time step considered, the parameters of the Beta distribution were estimated using the ML method (Table 4). In order to test the model's ability to reproduce the rainfall event characteristics, 1000 events were generated using a Monte Carlo procedure. As can be seen in Fig. 5, the generated events reproduce the characteristics of the historical events very well, both in terms of the duration-intensity correlation and in terms of the dimensionless hyetographs.

Calibration of the rainfall-runoff model
Regarding the rainfall-runoff model, calibration is only required for three parameters: c and CN_{i,j} for the effective rainfall module (Eqs. 12-14), and the hydraulic roughness k_{i,j→out} for the transfer module (Eq. 16). The latter two parameters are spatially distributed, and their calibration should account for this spatial variability. To avoid the difficulties arising from considering the spatial correlation in the calibration, a simple and efficient procedure proposed by Candela et al. (2005) was implemented here.
To include its spatial distribution, the CN map was rescaled according to weights w^{CN}_{i,j} allowing for the CN spatial variability in the catchment, with regard to a reference value CN̄ which coincides with the spatially averaged value of CN_{i,j}:

CN_{i,j} = w^{CN}_{i,j} · CN̄.     (21)

In this way, it is not necessary to calibrate each single value of CN, but only the reference value. Hence, the new CN spatial distribution can easily be obtained from Eq. (21) given the spatial distribution of w^{CN}_{i,j}. The values of the weight coefficients were obtained from the CN map for AMC condition II, available for the Imera catchment at 100 m grid resolution (Fig. 6), by using Eq. (21) given the CN reference value. This map was extracted from the CN II map for the whole of Sicily, produced using information from the Corine Land Cover 2000 project (Regione Siciliana, 2004). The uncalibrated value of the reference CN obtained from this map is 82.
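The rescaling of Eq. (21) can be sketched as follows; a minimal version that derives the weights from an existing CN map and applies a calibrated reference value, preserving the relative spatial pattern.

```python
def rescale_cn_map(cn_map, cn_ref):
    """Rescale a gridded CN map so that its spatial mean equals the
    calibrated reference value cn_ref, while preserving the relative
    spatial pattern: w_ij = CN_ij / mean(CN), CN'_ij = w_ij * cn_ref."""
    flat = [v for row in cn_map for v in row]
    mean = sum(flat) / len(flat)
    return [[v / mean * cn_ref for v in row] for row in cn_map]
```

Only the single reference value needs calibrating; the spatial mean of the rescaled map equals it by construction, and ratios between cells are unchanged.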
Similarly, the Strickler roughness coefficients (Fig. 7) were rescaled according to weights w^{k}_{i,j→out}, allowing for the roughness variability in the catchment, with regard to a reference roughness value k̄ which coincides with the spatially averaged value of k_{i,j→out}:

k_{i,j→out} = w^{k}_{i,j→out} · k̄.     (22)
The spatially averaged value of k_{i,j→out} was easily calculated starting from its effective spatial distribution, in relation to soil type and land use, using the modified Engmann table (Engmann, 1986; Candela et al., 2005) (Table 5). Its value was found to be 20.5 m^{1/3} s^{-1}. As for the CN, the weight coefficients were obtained from the roughness coefficient map using Eq. (22). The calibration of the three model parameters was carried out by comparing the observed and predicted discharges for the event of 21 December 1976, recorded at the Imera at Drasi flowgauge station (Fig. 8). This event was chosen for the calibration because it was very significant in terms of duration, flood volume and peak discharge.
In calibration, the problem is not the difficulty of obtaining optimal fits to the observations by adjusting parameter values, but rather that there are many sets of parameter values that give acceptable fits to the data (Beven, 1993). Often, no techniques are available for estimating or measuring the values of the effective parameters required at the grid element or catchment scale. These values will therefore be subject to some uncertainty, especially in semiarid areas, for which data are not always adequate and there is an extreme variability in space and time of all the factors controlling the runoff processes (Yair and Lavee, 1982).
In this study, parameter calibration was performed using the Generalised Likelihood Uncertainty Estimation (GLUE) procedure proposed by Beven and Binley (1992). GLUE is a Monte Carlo technique that transforms the problem of searching for an optimum parameter set into a search for the sets of parameter values that give reliable simulations for a wide range of model inputs. Following this approach, there is no requirement to minimise or maximise any objective function; instead, the performance of each individual parameter set is characterised by a likelihood weight, computed by comparing the predicted with the observed responses using some kind of likelihood measure. GLUE places emphasis on the study of the range of parameter values that have given rise to all of the feasible simulations (Freer et al., 1996).
Table 6 lists parameters required for the model and the ranges assigned to each for the Monte Carlo simulations; in particular, each interval was chosen as wide as possible in order to explore a feasible parameter space.
This analysis was carried out by generating 5000 uniform random sets of parameters and using these sets to perform the model simulations. The results presented in this study use the sum of squared errors as the basic likelihood measure, in the form of the Nash and Sutcliffe (1970) efficiency criterion:

L(θ_i | Y) = 1 - σ²_i / σ²_obs,     (23)

where L(θ_i | Y) is the likelihood measure for the ith model simulation with parameter vector θ_i conditioned on the set of observations Y, σ²_i is the associated error variance for the ith model and σ²_obs is the observed variance for the event under consideration.
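The likelihood measure of Eq. (23) and a GLUE-style behavioural filter can be sketched as follows; the 0.5 threshold and the renormalisation to unit-sum weights are assumptions of this sketch, as the paper does not state a behavioural threshold explicitly.

```python
def nse_likelihood(simulated, observed):
    """Nash-Sutcliffe efficiency used as the GLUE likelihood (Eq. 23):
    L = 1 - (error variance) / (observed variance)."""
    mean_obs = sum(observed) / len(observed)
    err = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    var = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - err / var

def behavioural(parameter_sets, simulations, observed, threshold=0.5):
    """Keep parameter sets whose simulation exceeds the NSE threshold
    (a hypothetical cut-off for this sketch); likelihoods are
    renormalised to weights summing to 1."""
    scored = [(p, nse_likelihood(s, observed))
              for p, s in zip(parameter_sets, simulations)]
    kept = [(p, l) for p, l in scored if l > threshold]
    total = sum(l for _, l in kept)
    return [(p, l / total) for p, l in kept]
```

A perfect simulation scores L = 1; parameter sets below the threshold are discarded rather than optimised against, in keeping with the GLUE philosophy.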
Figure 9 shows scatter plots of the likelihood measure based on Eq. (23) for each of the three parameters to be calibrated. Each dot represents one run of the model with different randomly chosen parameter values within the ranges of Table 6. These plots are projections of the likelihood surface in the three-dimensional parameter space onto the single parameter axes. The scatter plots for the three parameters are very similar to each other, in terms of both the form of the likelihood surface and the level of performance.
It is readily seen from these plots that, consistent with the concepts that underlie the GLUE approach, there is considerable overlap in performance between simulations and that there are many different parameter sets that give acceptable simulations.
Moreover, a best fit parameter set was fixed, corresponding to the maximum efficiency value, and a comparison between the observed and simulated hydrographs is reported in Fig. 10.

Results
The capability of the proposed procedure to reproduce the joint statistics of both peak discharges and the corresponding volumes was tested through the generation of 5000 synthetic hydrographs, starting from 5000 synthetic rainfall events of assigned shape, average intensity and duration. Figure 11 shows the scatter plot of the pairs (Q_max, V) derived from the generated synthetic hydrographs. A comparison with the pairs of observed (Q_max, V) values at the Drasi station (Aronica et al., 2012b) shows a good ability of the procedure to reproduce both the observed values and their correlation structure over the whole range of values.
The first step of the methodology (as outlined in Sect. 3.3) involved the choice of the best copula for the bivariate analysis of the output data from the R-R model. In this study, three copula families (namely Gumbel-Hougaard, Frank and Clayton) were fitted to the 5000 generated pairs of flood peak discharges and volumes. These two series are characterised by a Kendall correlation coefficient equal to 0.932. The parameter of each copula was estimated by inversion of Kendall's tau (Table 7).
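The tau-inversion step admits closed forms for the Gumbel-Hougaard and Clayton families, while the Frank family requires a numerical solve through the first Debye function; a sketch (function names are ours):

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def theta_gumbel(tau):
    """Gumbel-Hougaard: tau = 1 - 1/theta  (valid for 0 < tau < 1)."""
    return 1.0 / (1.0 - tau)

def theta_clayton(tau):
    """Clayton: tau = theta / (theta + 2)  (valid for 0 < tau < 1)."""
    return 2.0 * tau / (1.0 - tau)

def theta_frank(tau):
    """Frank: tau = 1 - (4/theta) * (1 - D1(theta)), with D1 the first
    Debye function; no closed form, so solve numerically."""
    def debye1(t):
        return quad(lambda x: x / np.expm1(x), 0.0, t)[0] / t
    f = lambda th: 1.0 - (4.0 / th) * (1.0 - debye1(th)) - tau
    return brentq(f, 1e-6, 500.0)
```

With the sample value tau = 0.932 reported above, the Gumbel-Hougaard relation gives theta = 1/(1 - 0.932), i.e. roughly 14.7.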
In order to select the copula that best represents the dependence structure of the observed variables, graphical tools and statistical tests were used here. In Fig. 12 the K plot, as defined by Genest and Rivest (1993), is shown for the three copula families considered. To better assess how each copula models the correlation structure, the normalised scatter plot of the 5000 empirical and theoretical pairs is reported in Fig. 13. In addition, more rigorous statistical tests were performed: specifically, the AIC criterion and the log-likelihood test were applied to verify the goodness of fit. Both the graphical tools and the statistical tests indicated the Gumbel-Hougaard copula family as the best choice for describing the dependence structure between flood peaks and volumes.
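The K plot compares the empirical Kendall distribution of the pseudo-observations with the theoretical Kendall function K_C(t) = t − φ(t)/φ′(t) of each Archimedean candidate; a minimal sketch for the Gumbel-Hougaard case (function names are ours):

```python
import numpy as np

def empirical_K(x, y):
    """Ordered pseudo-observations W_i = #{j : x_j < x_i and y_j < y_i}/(n-1),
    whose empirical distribution is compared with K_C(t) in a K plot
    (Genest and Rivest, 1993)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    w = np.array([np.sum((x < x[i]) & (y < y[i])) for i in range(n)])
    return np.sort(w / (n - 1.0))

def K_gumbel(t, theta):
    """Kendall function of the Gumbel-Hougaard copula, obtained from its
    generator phi(t) = (-ln t)^theta:  K(t) = t - t*ln(t)/theta."""
    t = np.asarray(t, dtype=float)
    return t - t * np.log(t) / theta
```

Plotting `empirical_K(x, y)` against `K_gumbel` (and the analogous Kendall functions of the Frank and Clayton generators) reproduces the visual comparison of Fig. 12.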
The parameters of the marginal distributions considered here (exponential, Gamma, Weibull, lognormal and GEV) were estimated by applying the maximum likelihood method, and the best-fitting distribution was selected using several criteria: again, the AIC and the Anderson-Darling test (Kottegoda and Rosso, 2008) were applied to verify the goodness of fit (Tables 8 and 9). The goodness-of-fit criteria returned the Gamma distribution as the best marginal distribution for both variables. Figure 14 shows the marginal distribution defined by Eq. (7b) compared with the exceedance probabilities computed from the empirical data using the Gringorten plotting-position formula.
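This selection step can be sketched with scipy as follows; the candidate set matches the text, while fixing the location of the positive-support families at zero is our simplifying assumption:

```python
import numpy as np
from scipy import stats

# candidate marginal families named in the text (scipy equivalents);
# positive-support families are fitted with location fixed at zero
CANDIDATES = {
    "exponential": (stats.expon, {"floc": 0.0}),
    "gamma": (stats.gamma, {"floc": 0.0}),
    "weibull": (stats.weibull_min, {"floc": 0.0}),
    "lognormal": (stats.lognorm, {"floc": 0.0}),
    "gev": (stats.genextreme, {}),
}

def fit_best_marginal(sample):
    """Fit each candidate by maximum likelihood and rank by
    AIC = 2k - 2 ln L, k = number of estimated parameters."""
    sample = np.asarray(sample, dtype=float)
    results = {}
    for name, (dist, fixed) in CANDIDATES.items():
        params = dist.fit(sample, **fixed)
        k = len(params) - len(fixed)
        loglik = np.sum(dist.logpdf(sample, *params))
        results[name] = (2.0 * k - 2.0 * loglik, params)
    best = min(results, key=lambda n: results[n][0])
    return best, results

def gringorten(sample):
    """Gringorten plotting positions p_i = (r_i - 0.44) / (n + 0.12),
    with r_i the ascending rank of observation i."""
    r = stats.rankdata(sample)
    return (r - 0.44) / (len(sample) + 0.12)
```

The non-exceedance probabilities from `gringorten` provide the empirical points against which the fitted CDF is compared in Fig. 14.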
A comparison between a sample generated from the Gumbel-Hougaard copula and the empirical (Q_max, V) pairs is plotted in Fig. 15 (left). Contours of the fitted copula, representing events with the same probability of occurrence, are also shown (Fig. 15, right).
The second step of the procedure is devoted to the derivation of the shape of the FDH, generated via cluster analysis with Ward's (1963) method using the procedure proposed by Apel et al. (2004, 2006) and Aronica et al. (2012b). The procedure consists of normalising the empirical hydrographs (5000 in this study) in such a way that each non-dimensional hydrograph has a peak flow and a volume both equal to 1. The normalised hydrographs were then grouped into clusters according to Ward's method, the minimum-variance algorithm that, at each step, merges the two clusters whose union gives the smallest increase in the within-cluster sum of squared distances. The results of this cluster analysis are the three hydrograph shapes shown in Fig. 16. According to the number of hydrographs belonging to each cluster, probabilities of about 0.11 (Shape 1), 0.50 (Shape 2) and 0.39 (Shape 3) were assigned to these shapes.
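A sketch of the normalisation and Ward clustering steps; the resampling onto a fixed number of ordinates is our assumption for making the hydrographs comparable point-wise:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def normalise_hydrograph(t, q, n_points=50):
    """Rescale a hydrograph so that its peak flow and volume both equal 1
    (q* = q/Qmax, t* = t*Qmax/V), then resample it onto a common number
    of ordinates so all hydrographs can be compared point-wise."""
    t = np.asarray(t, dtype=float)
    q = np.asarray(q, dtype=float)
    qmax = q.max()
    volume = np.sum(0.5 * (q[1:] + q[:-1]) * np.diff(t))  # trapezoidal rule
    t_nd = t * qmax / volume
    q_nd = q / qmax
    grid = np.linspace(t_nd[0], t_nd[-1], n_points)
    return np.interp(grid, t_nd, q_nd)

def cluster_shapes(nd_hydrographs, n_clusters=3):
    """Group normalised hydrographs with Ward's minimum-variance
    hierarchical clustering; return the labels, the mean shape of each
    cluster and the empirical probability of each shape (its relative
    cluster size)."""
    X = np.vstack(nd_hydrographs)
    labels = fcluster(linkage(X, method="ward"), t=n_clusters,
                      criterion="maxclust")
    shapes = [X[labels == c].mean(axis=0) for c in range(1, n_clusters + 1)]
    probs = [float(np.mean(labels == c)) for c in range(1, n_clusters + 1)]
    return labels, shapes, probs
```

The `probs` output is what yields the shape probabilities (here about 0.11, 0.50 and 0.39 for the three clusters of Fig. 16).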
The final step of the procedure allows one to obtain the SFDH for any return period by merging the non-dimensional hydrographs (for a given probability) and the peak-volume pairs generated from the copula.
In this application, as an example, the SFDH with a design return period of T = 100 years was calculated.For T = 100 years, the copula level p is equal to 0.99, corresponding to a specific isoline (Fig. 17).
Eq. (22) can be solved to find the single design point (u_1,T, u_2,T) with the largest joint probability, i.e. (0.9912, 0.9901). Using the inverse marginal CDFs, the design event pair is obtained: (Q_max,T, V_T) = (4564.8 m³ s⁻¹, 162.5 Mm³). Finally, the design hydrograph can be obtained using Shape 3 and de-normalising time and discharge by multiplying by the values of this pair (Fig. 18).
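The selection of the single design point on the p-level curve can be sketched as a one-dimensional search along the isoline C(u1, u2) = p for the pair with the largest joint density (the "most likely" realisation); the copula parameter and marginals in the test below are placeholders, not the fitted values of the paper:

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize_scalar

def gumbel_cdf(u, v, theta):
    """Gumbel-Hougaard copula C(u, v) = exp(-((-ln u)^t + (-ln v)^t)^(1/t))."""
    return np.exp(-(((-np.log(u)) ** theta
                     + (-np.log(v)) ** theta) ** (1.0 / theta)))

def gumbel_pdf(u, v, theta):
    """Gumbel-Hougaard copula density c(u, v) = d2C/dudv, with
    A = ((-ln u)^theta + (-ln v)^theta)^(1/theta)."""
    x, y = -np.log(u), -np.log(v)
    A = (x ** theta + y ** theta) ** (1.0 / theta)
    return (gumbel_cdf(u, v, theta) * (x * y) ** (theta - 1.0) / (u * v)
            * A ** (1.0 - 2.0 * theta) * (A + theta - 1.0))

def v_on_isoline(u, p, theta):
    """Solve C(u, v) = p for v on the Gumbel isoline (requires u > p)."""
    s = (-np.log(p)) ** theta - (-np.log(u)) ** theta
    return np.exp(-s ** (1.0 / theta))

def most_likely_design_event(p, theta, marg_q, marg_v):
    """Maximise c(u1, u2) f1(q) f2(v) subject to C(u1, u2) = p, where
    marg_q and marg_v are frozen scipy distributions (the marginals)."""
    def neg_density(u1):
        u2 = v_on_isoline(u1, p, theta)
        q = marg_q.ppf(u1)
        v = marg_v.ppf(u2)
        return -(gumbel_pdf(u1, u2, theta) * marg_q.pdf(q) * marg_v.pdf(v))
    res = minimize_scalar(neg_density, bounds=(p + 1e-9, 1.0 - 1e-9),
                          method="bounded")
    u1 = res.x
    u2 = v_on_isoline(u1, p, theta)
    return u1, u2, marg_q.ppf(u1), marg_v.ppf(u2)
```

With the fitted copula parameter and the fitted Gamma marginals, this search returns the (u_1,T, u_2,T) pair, and the inverse marginal CDFs then give the design event (Q_max,T, V_T).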

Conclusions
In this study a procedure to derive flood design hydrographs (FDH) using a bivariate representation of the rainfall forcing (rainfall duration and intensity) described by copulas, coupled with a distributed rainfall-runoff model, was presented. In order to estimate the return period of the SFDH, which gives the probability of occurrence of a flood hydrograph, peaks and flow volumes obtained through R-R modelling were treated statistically via copulas. The choice of copulas was motivated by their strong capability to describe the statistical correlation between variables, which allows one to obtain the return period related to the entire SFDH and not only to a single variable (usually the peak flow) as in univariate analysis. This is of significant importance in all those cases where all the hydrological variables included in a design hydrograph (flood volume, flood peaks, etc.) play an important role (e.g. hazard and risk mapping, design of flood control systems such as reservoirs or storage areas, etc.).
In addition, a statistical label (in terms of probability of occurrence) was also given to the hydrograph shape through the cluster analysis of the R-R model-generated hydrographs.This completes the statistical definition of the FDH, which can be identified with a specific return period (joint return period, JRP).
The procedure described above was applied to the case study of the Imera catchment, and the results show how this approach allows a reliable estimation of the design flood hydrograph in a way that can also be implemented easily in practical situations.
These results, hence, underline the necessity of considering JRP estimation methods in the definition of design events for all practical purposes.
Further research efforts will be devoted to move from single-design-event methods to ensemble-design-event methods by considering uncertainty via a complete application of the GLUE procedure.

Figure 6. Spatial distribution of CN II for the Imera catchment.

Figure 13. Normalised scatter plot for the different copula models considered. The "empirical" points represent the pairs coming from the R-R model.

Table 1. Information and statistics of the rainfall data for 80 events registered from 2010 to 2011.

Table 2. Marginal distribution parameters and goodness-of-fit results for average intensity (I).

Table 3. Marginal distribution parameters and goodness-of-fit results for duration (D).

Table 4. Parameter estimates of the Beta distribution.

Table 5. Strickler roughness coefficient values for the land uses of the Imera catchment, from the modified Engmann table.

Table 6. Ranges of parameters considered for the calibration.

Table 7. Goodness-of-fit results for the (Q_max, V) copulas.

Table 8. Marginal distribution parameters and goodness-of-fit results for flood peaks (Q_max). * The critical value for the Anderson-Darling test is 2.492.

Table 9. Marginal distribution parameters and goodness-of-fit results for flood volumes (V). * The critical value for the Anderson-Darling test is 2.492.