Methodology for flood frequency estimations in small catchments

Estimations of flood frequencies in small catchments are difficult due to a lack of measured discharge data. In the Czech Republic, this problem is usually solved by hydrologic modelling when there is a reason not to use the data provided by the Czech Hydrometeorological Institute, which are quite expensive and have a very low level of accuracy. An alternative is a simple method which provides sufficient estimates of flood frequency based on the available spatial data. A new methodology is being developed which considers all important factors affecting flood formation in small catchments. The relationship between catchment descriptors and flood characteristics was analysed first to get an overview of the importance of each considered descriptor. The results for different descriptors vary from a highly correlated relationship of the expected shape to a relationship opposite to that expected, mainly in the case of land use. The parameterisation of the methodology is also presented, including sensitivity tests on each involved catchment descriptor and cross-validation of the achieved results. In its present form, the methodology achieves an adjusted determination coefficient (R²_adj) of about 0.61 for the 10-year and 0.60 for the 100-year return period.


Introduction
A methodology for the estimation of flood frequency in small catchments is being developed. The main reason for the presented research is that engineers often need quick design flood estimates for different feasibility studies. It usually takes at least one month to obtain such estimates from the official provider, and the data can be relatively expensive. It is also important to consider that the provided design floods have a relatively low level of accuracy. The uncertainty of the data for small catchments, which are usually ungauged, is up to ±60 % in the 4th class of accuracy according to Czech standards (Kulasová and Holík, 1997). This means that any use of the data should take its uncertainty into consideration and that it is appropriate to apply correction coefficients in cases of higher safety demands.
The approach used for the development of the presented methodology applies similarity principles which have been discussed by many authors worldwide (Burn, 1997; Merz and Blöschl, 2005; Wagener et al., 2007; Patil and Stieglitz, 2012). These principles are usually applied in three ways: (i) for the direct estimation of flood quantiles, (ii) for the estimation of probability distribution parameters and (iii) for the estimation of hydrologic model parameters. The methodology proposed in this study adopts the first option in order to be applicable by practising engineers who may be unfamiliar with more advanced statistical analysis methods or hydrologic models.
There are different regression-based methods which are similar to the method proposed by the authors of this paper. These methods adopt different procedures for parameter estimation, such as ordinary least squares regression (OLS), weighted least squares regression (WLS), and generalised least squares regression (GLS), which is discussed by Stedinger and Tasker (1985) and further by Pandey and Nguyen (1999), who involved more parameter estimation methods such as least absolute value regression, robust regression and others. These methods are further improved by the involvement of a Bayesian framework approach and Monte Carlo simulations (Haddad and Rahman, 2012; Haddad et al., 2012; Micevski and Kuczera, 2009).
The proposed method considers in general the power function with shifts in two directions, which makes it different from other similar methods. This approach allows one to choose more suitable parts of the power function (according to its slope and curvature), but on the other hand, it precludes the linearisation of the problem by logarithmic transformation. The general shape can also lead to too high a number of model parameters, which could make the model insufficiently robust; this corresponds to the results presented by Perrin et al. (2001) for continuous models. Thus, the shifts were excluded in the first step of model development presented in this paper, and will be included in the next step.
Published by Copernicus Publications on behalf of the European Geosciences Union.
V. David and T. Davidova: Methodology for flood frequency estimations in small catchments

Overview of the proposed methodology
The proposed methodology is based on the calculation of flood frequencies using catchment descriptors. The procedure is derived using GIS tools and spatial data analysis. The method should be applicable to any small catchment in the Czech Republic for which the input data are available. The initial list of catchment descriptors which are considered important for flood formation is as follows:
- catchment area,
- storm rainfall characteristics,
- slope conditions of the catchment,
This list corresponds in general to the list used by Sefton and Howarth (1998) for their study, and involves both physical geographic and climate properties as discussed by Berger and Entekhabi (2001). Berger and Entekhabi (2001) analysed the long-term basin response, but it can be expected that it is even more necessary to involve both types of information in flood response assessment. The list also contains types of parameters used for the purpose of maximum possible flood calculation by Reed and Field (1992). Eng et al. (2007) analysed the combination of catchment characteristics with the geographic region-of-influence approach, resulting in better model performance. Nezhad et al. (2010) also used geographic information together with climatic characteristics for flood frequency analysis, and applied residual kriging in physiographical space for quantile estimation using canonical correlation analysis. However, geographic location was not included in the presented analysis, because it is considered to have an indirect influence on flood discharge values, and the emphasis was put on catchment descriptors for which a direct influence is expected.
First, the procedure for the calculation of catchment descriptors was defined, including the necessary input data layers. The most important input is the digital elevation model (DEM), which is sufficient input for the calculation of the catchment area, the average slope and the catchment shape. This is a relatively easily available data source, and the procedures for its processing for purposes of hydrologic analyses have been broadly published. The situation is quite different for the other descriptors, for which data acquisition and processing are difficult for several reasons. There are many different sources of land use data which differ in many aspects, mainly the resolution, accuracy and information content. Moreover, there are no layers available containing storm rainfall characteristics or gapless soil maps containing sufficient information for the assessment of infiltration properties. This is why the analysis of the influence of soil properties could not yet be done and why there was a need to prepare rainfall characteristic maps.
The general construction of the proposed methodology, which consists in the multiplication of power functions of catchment descriptors, and the choice of catchment descriptors are similar to those published, for example, by Asquith and Slade (1996) or Olson (2009) for regions in the United States, or by Jaafar et al. (2011) for the southwest of England. In general, the value is calculated as a product of functions of single catchment descriptors as described by the equation

$$Q_N = a_0 \prod_{i=1}^{m} f_i(CD_i) + d_0, \tag{1}$$

where $Q_N$ is flood discharge with the return period of $N$ years, $a_0$ and $d_0$ are correction parameters, $f_i$ are mathematical functions and $CD_i$ are catchment descriptors.
The function for the calculation of each component of Eq. (1) is in general considered a power function, with shifts in both directions in the form of

$$f_i(CD_i) = a_i \left(CD_i + b_i\right)^{c_i} + d_i, \tag{2}$$

where $a_i$, $b_i$, $c_i$ and $d_i$ are parameters of mathematical functions $f_i$. Shifts are driven by parameters $b_i$ and $d_i$. Equation (1) then becomes

$$Q_N = a_0 \prod_{i=1}^{m} \left[ a_i \left(CD_i + b_i\right)^{c_i} + d_i \right] + d_0, \tag{3}$$

which can be rewritten as

$$Q_N = a \prod_{i=1}^{m} \left[ \left(CD_i + b_i\right)^{c_i} + d'_i \right] + d_0, \tag{4}$$

where

$$a = a_0 \prod_{i=1}^{m} a_i, \qquad d'_i = \frac{d_i}{a_i}.$$

When the simple power function is considered, which is usual for similar applications and considered more robust, the values of $b_i$ and $d_i$ (including $d_0$) are equal to 0. With the multiplicative constants $a_i$ absorbed into $a_0$, the simplified form of Eq. (4) then becomes

$$Q_N = a_0 \prod_{i=1}^{m} CD_i^{\,c_i}. \tag{5}$$
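The structure described above can be illustrated with a short sketch. It assumes that the shift `b` is added to the descriptor before exponentiation; the function names, the parameter values and the two example descriptors are hypothetical and are not taken from the calibrated methodology.

```python
from math import prod


def descriptor_term(cd, a=1.0, b=0.0, c=1.0, d=0.0):
    """Shifted power function of one catchment descriptor.

    Assumed form: a * (cd + b) ** c + d. With a = 1 and b = d = 0
    it reduces to the simple power function cd ** c.
    """
    return a * (cd + b) ** c + d


def flood_quantile(descriptors, params, a0=1.0, d0=0.0):
    """Product of descriptor terms, scaled by a0 and shifted by d0."""
    terms = [descriptor_term(cd, **p) for cd, p in zip(descriptors, params)]
    return a0 * prod(terms) + d0


# Hypothetical example: catchment area (km^2) and 24 h precipitation (mm),
# simple power form with made-up exponents.
descriptors = [50.0, 60.0]
simple_params = [{"c": 0.5}, {"c": 1.2}]
q_simple = flood_quantile(descriptors, simple_params, a0=0.01)
print(f"illustrative design discharge: {q_simple:.2f} m^3/s")
```

Each shifted term adds up to four parameters per descriptor, which illustrates why the general form can quickly become over-parameterised compared with the simple power form.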

Assessment of the relationship between catchment descriptors and flood discharges
Discharges with different return periods published by Zítek (1970) were used for the analysis. The data published in this book were derived from time series from 250 gauging stations spread over the whole area of the former Czechoslovakia. The shortest record length considered for the analysis was 25 years, while the median length is 43 years. These data are based only on years containing no gaps, which means that years containing gaps in records were excluded from quantile calculations.
A subset of 196 catchments with a catchment area (A) less than 150 km² located in the Vltava River basin and the Dyje River basin (see Fig. 1), for which the flood frequency data are published by Zítek (1970), was chosen for the analysis. Catchments were delineated using a layer containing fourth-order catchments available from the Water Research Institute and elevation data to get polygons for calculations of catchment descriptor values.
The calculation procedure was different for each considered catchment descriptor. The simplest one is the calculation of the descriptor related to the catchment slope (s), which is calculated as the average slope from the layer containing slopes derived from the DEM. The DEM is also the input for the calculation of the catchment descriptor related to catchment shape. For this purpose, the shape factor (SF) was used, which is defined as the drainage area divided by the square of the longest flow path. This descriptor was chosen based on previous research (David, 2011). The longest flow path for each catchment was calculated using standard GIS procedures (flow direction and flow length), while the catchment area was calculated simply from the geometry of each catchment polygon.
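The definition of the shape factor given above can be written as a one-line calculation. The following is a minimal sketch; the function name and the example values are hypothetical, and area and flow path length must be in consistent units (here km² and km).

```python
def shape_factor(area_km2, longest_flow_path_km):
    """Shape factor SF: drainage area divided by the squared longest flow path."""
    if longest_flow_path_km <= 0:
        raise ValueError("longest flow path must be positive")
    return area_km2 / longest_flow_path_km ** 2


# Hypothetical catchments with the same area but different shapes.
compact = shape_factor(50.0, 10.0)    # short flow path, fan-shaped
elongated = shape_factor(50.0, 20.0)  # long flow path, fern-shaped
print(compact, elongated)
```

For a fixed area, a longer flow path gives a lower SF, so elongated catchments receive lower values than compact ones.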
For the calculation of the descriptor related to storm rainfall, layers containing maximum 24 h precipitation totals (P) for the considered return periods needed to be prepared. They were interpolated from point data digitised from the information for each gauging station published by Šamaj et al. (1985). The descriptor for each catchment was then calculated as an average value over the catchment area.
Land use was analysed using the curve number parameter (CN) published by, among others, Mishra and Singh (2003). This parameter originally combines the information on land use and soil infiltration properties. However, the spatial distribution was considered only with respect to land use, while soil information was considered spatially homogeneous and corresponding to hydrological soil group B, in order to be able to assess the influence of land use on flood discharges separately.
The analyses focusing on the relationship between catchment descriptors and flood discharges were performed individually for each of the considered catchment descriptors. For this paper, flood discharges with return periods of T = 10 years and T = 100 years were selected, as in the study published by Pandey and Nguyen (1999). Correlations between dependent and explanatory variables were analysed using parameter optimisation for Eq. (2).
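A single-descriptor analysis of this kind can be sketched as follows. The sketch does not reproduce the optimisation procedure used in the study: as a stand-in, it fits the simple power function Q = a · CD^c by least squares in log-log space and then evaluates the determination coefficient on the untransformed values. The function names and the sample data are hypothetical.

```python
from math import exp, log


def fit_power_law(x, y):
    """Least-squares fit of log y = log a + c * log x; returns (a, c)."""
    lx = [log(v) for v in x]
    ly = [log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    c = sxy / sxx
    a = exp(mean_y - c * mean_x)
    return a, c


def r_squared(observed, estimated):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical sample: catchment areas (km^2) and 10-year discharges (m^3/s).
areas = [12.0, 25.0, 40.0, 75.0, 110.0, 140.0]
discharges = [9.0, 11.0, 21.0, 19.0, 34.0, 30.0]
a, c = fit_power_law(areas, discharges)
fitted = [a * v ** c for v in areas]
print(f"a = {a:.3f}, c = {c:.3f}, R^2 = {r_squared(discharges, fitted):.2f}")
```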

Analyses of the correlation between catchment descriptors and flood discharges
Basic analyses were performed by relating the value of the catchment descriptor directly to the value of the flood discharge for the given return period. The assumption is that there should be a significant relationship between the catchment area and the peak discharge value and between the precipitation total for a given return period and the peak discharge value related to the same return period.

Catchment area
The catchment area is considered the most important catchment descriptor. It is assumed that the value of flood discharge increases with an increasing catchment area. However, it is usually also assumed that the relationship between the catchment area and flood discharge is not linear in small catchments (Sivapalan et al., 2002). The main reason is the spatial distribution of storm rainfalls, which are the most frequent causes of floods in small catchments in the conditions of the Czech Republic. Areas of catchments involved in the analyses range from 7.7 to 146.4 km² with a mean value of 58.2 km². More than 51 % of catchments are smaller than 50 km², and more than 87 % are smaller than 100 km².
The results show a relatively clear relationship between the catchment area and the peak discharge value. Figure 2 shows plots of the peak discharge values against the catchment area. Fitted lines are also shown, having R² = 0.18 for T = 10 years and R² = 0.20 for T = 100 years. These values are not as high as expected. This is most likely caused by a relatively narrow range of values.

Maximum 24 h precipitation total
The precipitation total, together with the catchment area, is considered the most important factor affecting flood discharge values. The product of the precipitation total and catchment area can be understood as the volume of water available for runoff. Thus, the value of peak discharge is expected to increase with an increasing precipitation total, but the relationship is not expected to be linear, as lower precipitation totals are in general more affected by losses. Values of maximum 24 h precipitation totals for different return periods are the only data source which is available as a continuous map for the whole area of the Czech Republic, and therefore this characteristic was used for the analysis, although floods are usually caused by precipitation events with a duration shorter than 24 h. Interpolated values of maximum 24 h precipitation within the sample range from 51.9 to 92.3 mm in the case of a 10-year return period and from 75.8 to 146.8 mm in the case of a 100-year return period. Average values are 61.9 and 92.6 mm, respectively. However, most catchments have values of a maximum 24 h precipitation total in a very narrow range, which is from 55 to 65 mm for more than 72 % in the case of a 10-year return period, and from 80 to 95 mm in the case of a 100-year return period.
The results show that the relationship between the maximum 24 h precipitation total and peak discharge value for T = 10 years follows almost a straight line (see Fig. 3). Achieved values of R² are higher than for the catchment area, i.e. R² = 0.32 for T = 10 years and R² = 0.27 for T = 100 years.

Average slope of the catchment
Catchment slope conditions are considered important mainly due to their influence on overland flow velocities. A higher slope of the catchment leads to faster runoff concentration and consequently to higher values of peak discharge.
Slope conditions in catchments involved in the analysis vary in a relatively wide range from flat to mountainous areas. The average slope (s) ranges from 1.5 to 18.8 %, but 60 % of catchments have a value in the range from 4 to 10 %.
Results of the performed analyses confirm the assumption of increasing peak discharge values with increasing average slope of the catchment. Results presented in Fig. 4 show the almost straight shape of a fitted curve. Values of the determination coefficient achieved by parameter optimisation are R² = 0.15 for T = 10 years and R² = 0.14 for T = 100 years.

Catchment shape
Catchment shape affects flood discharges through the runoff concentration. It is assumed that wide (fan-shaped) catchments have higher values of peak discharges than narrow oblong (fern-shaped) catchments, as published, among others, by Murthy (2002). According to the definition of SF, the values are higher for fan-shaped catchments than for fern-shaped catchments.
The value of SF can theoretically range from 0 to π, but it usually does not exceed the value of about 0.6. This is also the case for the sample used for the analysis, where the maximum value is SF = 0.57, while the minimum is 0.10. However, most catchments (75 %) have a value of SF below 0.22.
Results obtained by the basic analyses are opposite to those expected. These results are shown in Fig. 5. Fitted curves have a shape representing a decreasing value of flood discharge with an increasing value of SF. However, coefficients of determination are very low: R² = 0.02 for T = 10 years as well as for T = 100 years, which indicates no relationship between this catchment descriptor and peak discharge.
The results shown do not necessarily refute the mentioned principles. They can be caused by a stronger influence of other factors. Therefore, further analyses of the influence of this catchment property had to be performed.

Land use
Land use affects flood discharges in different ways. The most important are precipitation losses caused by interception and infiltration, and the effect of surface roughness on routing speed. For the purposes of the presented analyses, the CN value was chosen as a catchment descriptor. This parameter was designed to calculate direct runoff, which means that it affects flood discharge values through the volume of runoff. It reaches values from 0 to 100. A zero value corresponds to no runoff, while a value of 100 corresponds to the maximum runoff. This means that peak discharge should increase with an increasing CN value. CN values were calculated from maps derived with the whole area considered as covered by hydrological soil group B to exclude the influence of soil conditions.
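The expected direction of the CN influence can be illustrated with the standard SCS curve number runoff equation in metric units. This equation is not stated in the text above and is used here only as an illustration; the initial abstraction ratio of 0.2 is the conventional default and an assumption of this sketch, as are the function name and the example rainfall depth.

```python
def direct_runoff_mm(p_mm, cn, ia_ratio=0.2):
    """Direct runoff depth (mm) from rainfall depth p_mm using the SCS-CN method.

    S = 25400 / CN - 254 is the potential maximum retention in mm,
    Ia = ia_ratio * S is the initial abstraction.
    """
    if not 0 <= cn <= 100:
        raise ValueError("CN must lie between 0 and 100")
    if cn == 0:
        return 0.0  # unlimited retention, no runoff
    s = 25400.0 / cn - 254.0
    ia = ia_ratio * s
    if p_mm <= ia:
        return 0.0  # all rainfall is lost to initial abstraction
    return (p_mm - ia) ** 2 / (p_mm - ia + s)


# Illustrative rainfall depth of 60 mm for the lowest and highest CN in the sample.
for cn in (62.1, 80.9):
    print(cn, round(direct_runoff_mm(60.0, cn), 1))
```

For the same rainfall depth, the higher CN yields the larger runoff depth, which is the behaviour the analysis expected to see reflected in peak discharges.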
In the sample of catchments used in this study, the values of CN range from 62.1 to 80.9.
Results of the basic analyses are again opposite to what the definition of the CN parameter implies. The trend is decreasing in both cases shown in Fig. 6. Moreover, the determination coefficient is relatively high, i.e. R² = 0.28 for T = 10 years and R² = 0.26 for T = 100 years.
There are several possible reasons for such results. First, the influence of land use can be weaker than the influence of other factors, which cannot be avoided in this type of analysis. Second, areas of land use types with low values of CN, such as forests, are usually concentrated in hilly and mountainous areas, which typically have high and intense storm rainfall and consequently high values of flood discharges.

Additional analyses of the correlation between catchment descriptors and flood discharges
Further analyses were performed with the aim of excluding, at least partially, the influence of the two most important factors affecting flood discharges. These are, according to the basic analyses, the catchment area and the maximum 24 h precipitation total. The product of these two descriptors is the total volume of water available for runoff, which was identified by Kjeldsen and Rosbjerg (2002) as a descriptor leading to better estimates than the catchment area alone. Thus, the relationship was assessed between selected catchment descriptors and the flood discharge divided by the product of the catchment area and the precipitation total.
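The normalisation described above can be sketched as follows. The function names and the example discharge are hypothetical; the unit conversion assumes precipitation in mm and area in km², so that 1 mm over 1 km² equals 1000 m³.

```python
def available_volume_m3(p_mm, area_km2):
    """Rainfall volume over the catchment: depth (m) times area (m^2)."""
    return (p_mm * 1e-3) * (area_km2 * 1e6)


def normalised_discharge(q_m3s, p_mm, area_km2):
    """Peak discharge per unit area and unit precipitation total."""
    return q_m3s / (area_km2 * p_mm)


# Sample means reported in the text for T = 10 years: P = 61.9 mm, A = 58.2 km^2.
volume = available_volume_m3(61.9, 58.2)
print(f"available volume: {volume / 1e6:.2f} million m^3")

# Hypothetical peak discharge of 25 m^3/s for the same catchment.
print(normalised_discharge(25.0, 61.9, 58.2))
```

Dividing by the product of area and precipitation removes most of the size and rainfall signal, so any remaining trend can be attributed more directly to the descriptor under study.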

Catchment shape
Results of the comparison of shape factor and flood discharge divided by the product of the catchment area and the precipitation total show a growing trend (see Fig. 7). This corresponds to the assumption that flood discharge values increase with increasing values of the shape factor. The trend obtained by fitting the curve shaped according to Eq. (2) is not very significant, having a value of R² = 0.01 for T = 10 years as well as for T = 100 years, which again indicates no relationship. This suggests that this catchment descriptor probably cannot significantly increase the performance of the proposed methodology for estimations of flood discharges.

Land use
In the case of land use represented by the CN value, the results of the comparison with flood discharge divided by the catchment area and precipitation total are similar to those obtained by the basic analysis. They show an inverse relationship between peak discharge values per unit area and unit precipitation and the value of CN in both cases, which is again opposite to the definition of the CN parameter.
The trend obtained by fitting the curve shaped according to Eq. (2) is relatively significant (see Fig. 8), having a value of R² = 0.30 for T = 10 years as well as for T = 100 years. This corresponds to the best values of previous analyses. However, because the direction of the trend contradicts the definition of CN, it needs to be analysed further.

Available volume
To make a simple check of the influence of the two parameters providing the best performance, the relationship between the available volume of the maximum 24 h precipitation total and peak discharge values was assessed. The volume was calculated as a product of the precipitation total and the catchment area, which were identified as the most important catchment descriptors.
The results of this analysis show that the calculation based on this parameter does not provide an important improvement with respect to the maximum 24 h precipitation total for the given return period. The value of the determination coefficient is a bit higher in the case of the 100-year return period (R² = 0.33). In the case of the 10-year return period, the value of the determination coefficient is lower (R² = 0.30) than for the application of the 24 h precipitation total alone.

Methodology parameterisation and validation
The calibration was carried out for the simpler form of a power function to get the results for a more robust calculation procedure. In this case, the OLS method was used for a log-transformed equation. The shape of the equation after this transformation is as follows:

$$\log Q_N = \log a_0 + \sum_{i=1}^{m} c_i \log CD_i. \tag{6}$$

Parameter values were then estimated using the equation

$$\boldsymbol{\beta} = \left(\mathbf{CD}^{\mathrm{T}} \mathbf{CD}\right)^{-1} \mathbf{CD}^{\mathrm{T}} \mathbf{X}, \tag{7}$$

where $\boldsymbol{\beta}$ is a vector of equation parameter estimators ($\log a_0, c_1, \ldots, c_m$), $\mathbf{CD}$ is a matrix of catchment descriptor logarithms with the first column filled by 1, and $\mathbf{X}$ is a vector of flood discharge value logarithms.
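The estimation step can be sketched in plain Python as a solution of the OLS normal equations for the log-transformed power model. This is an illustrative implementation under stated assumptions, not the code used in the study: natural logarithms are used, the normal equations are solved by Gaussian elimination, and all function names and data are hypothetical.

```python
from math import exp, log


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_log_ols(descriptor_rows, discharges):
    """Return (a0, [c_1, ..., c_m]) for Q = a0 * prod(CD_i ** c_i)."""
    # Design matrix of descriptor logarithms with a leading column of ones.
    cd = [[1.0] + [log(v) for v in row] for row in descriptor_rows]
    x = [log(q) for q in discharges]
    p = len(cd[0])
    # Normal equations: (CD^T CD) beta = CD^T X.
    ctc = [[sum(r[i] * r[j] for r in cd) for j in range(p)] for i in range(p)]
    ctx = [sum(r[i] * xi for r, xi in zip(cd, x)) for i in range(p)]
    beta = solve_linear(ctc, ctx)
    return exp(beta[0]), beta[1:]


# Hypothetical catchments: (area in km^2, 24 h precipitation in mm).
rows = [(10.0, 55.0), (20.0, 60.0), (50.0, 58.0),
        (80.0, 70.0), (120.0, 90.0), (30.0, 65.0)]
# Synthetic discharges generated from known parameters to check the recovery.
q = [0.05 * a ** 0.6 * p ** 1.1 for a, p in rows]
a0, exponents = fit_log_ols(rows, q)
print(a0, exponents)
```

Because the fit is done on logarithms, the back-transformed estimates can be biased, which is why a bias check is part of the validation.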
The methodology was first parameterised using the whole data set for all tested catchment descriptors, including the land use descriptor and shape factor, which did not provide good results when analysed individually. The parameterisation was then carried out again without the least important descriptors (those having the weakest individual correlation to design flood values) to assess whether they can be excluded from the calculation without loss of model performance. Parameters were always recalibrated after removing any descriptor. Land use and shape factor were identified as the least significant descriptors according to the results presented in previous analyses. Thus, each of them was individually removed from the complete set of descriptors, and parameters for all other descriptors were recalibrated. Furthermore, both mentioned descriptors were removed, and finally the catchment slope descriptor was also removed. Values of the t statistic were then calculated for each of the mentioned combinations in order to assess the significance of each catchment descriptor in a quantitative way.
The results of parameterisation were assessed using a determination coefficient (R²) and an adjusted value of the determination coefficient (R²_adj) calculated as

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(Q_{\mathrm{obs},i} - Q_{\mathrm{est},i}\right)^2}{\sum_{i=1}^{n} \left(Q_{\mathrm{obs},i} - \overline{Q}_{\mathrm{obs}}\right)^2}, \tag{8}$$

$$R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right) \frac{n - 1}{n - m - 1}, \tag{9}$$

where $Q_{\mathrm{obs},i}$ is the T = 10-year or T = 100-year design flood value, $Q_{\mathrm{est},i}$ is the estimated design flood value, $\overline{Q}_{\mathrm{obs}}$ is an average of all design flood values in the data set, $n$ is the number of catchments and $m$ is the number of catchment descriptors involved. For all considered catchment descriptors, the results obtained by parameterisation of the methodology can be considered satisfactory, having R²_adj = 0.61 for T = 10 years and R²_adj = 0.60 for T = 100 years. The value of the adjusted determination coefficient decreases with the number of dropped catchment descriptors, as shown in Fig. 15. Exclusion of land use resulted in the value of R²_adj being decreased to 0.56 and 0.53, respectively, while exclusion of the shape factor resulted in the value of R²_adj changing only very little. Exclusion of both resulted in a decrease to values similar to those for the exclusion of land use only, which were 0.55 and 0.53. Further exclusion of the average slope resulted in small decreases in the R²_adj value to 0.53 and 0.49, respectively. These results show that the involvement of all considered catchment descriptors improves the performance of the proposed methodology, but the involvement of SF improves the performance only very little. Moreover, the values of the t statistic ($t_{c_4}$) related to SF are not high enough to reject the null hypothesis ($c_4 = 0$) at a level of 0.05 in both cases where SF is involved in the regression for T = 100 years and in one case for T = 10 years (see Table 4). Thus, this parameter will not be considered for further development.
There is a variety of methods which can be used for validation when validation data are lacking. These are mainly Monte Carlo (Haddad et al., 2013), leave-one-out (Jaafar et al., 2011) or k-fold cross-validation. In this case, the k-fold cross-validation method was used, which is based on splitting the data set into k similarly large subsamples and running the calibration and validation k times, always using one subsample for validation and the others together for calibration. For the purpose of this study, the value of k = 7 was chosen. The data set was divided into folds randomly to avoid distortion of results. Values of the determination coefficient as well as other performance metrics were calculated for both the training and validation data set in the case of each fold. These were then compared to values obtained by the calibration using the whole data set.
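The splitting scheme described above can be sketched with the standard library only. The function names and the fixed random seed are assumptions of this sketch; the study does not state how the random division was implemented.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Randomly split the indices 0..n-1 into k folds of similar size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_validation_splits(n, k, seed=0):
    """Yield (training, validation) index lists, one pair per fold."""
    folds = k_fold_indices(n, k, seed)
    for i, validation in enumerate(folds):
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield training, validation


# 196 catchments and k = 7, as in the study.
for training, validation in train_validation_splits(196, 7):
    print(len(training), len(validation))
```

With 196 catchments and k = 7 the folds are of equal size, so each run calibrates on 168 catchments and validates on the remaining 28.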
As measures of performance, root mean square error (RMSE) and mean absolute error (MAE) values, both discussed by Willmott and Matsuura (2005), were used for each considered combination of catchment descriptors; see Eqs. (10) and (11). Additionally, relative values of RMSE and MAE (% RMSE and % MAE) were used, expressed as shown by Eqs. (12) and (13). Finally, bias was calculated for the calibration using the whole data set as well as for folds to check the bias of estimated values resulting from logarithmic transformation (McCuen et al., 1990).
% RMSE = 100 …
For purposes of validation, values of R²_adj were calculated for each fold for both the training and validation set. The comparison with the values obtained by calibration on the whole data set is shown in Fig. 14.
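The performance measures can be sketched as follows. RMSE and MAE follow their standard definitions. The relative values and the bias are assumptions of this sketch, because their exact definitions are not recoverable from the text: relative values are expressed here as a percentage of the mean observed discharge, and bias as the mean of estimated minus observed values. All function names are hypothetical.

```python
from math import sqrt


def rmse(observed, estimated):
    """Root mean square error."""
    n = len(observed)
    return sqrt(sum((e - o) ** 2 for o, e in zip(observed, estimated)) / n)


def mae(observed, estimated):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(e - o) for o, e in zip(observed, estimated)) / n


def as_percent_of_mean(value, observed):
    """Error measure as a percentage of the mean observed value (assumed form)."""
    return 100.0 * value / (sum(observed) / len(observed))


def bias(observed, estimated):
    """Mean of estimated minus observed values (assumed form)."""
    n = len(observed)
    return sum(e - o for o, e in zip(observed, estimated)) / n


# Hypothetical observed and estimated design discharges (m^3/s).
obs = [10.0, 20.0, 30.0]
est = [12.0, 18.0, 33.0]
print(rmse(obs, est), mae(obs, est), bias(obs, est))
print(as_percent_of_mean(rmse(obs, est), obs), as_percent_of_mean(mae(obs, est), obs))
```

RMSE weights large errors more heavily than MAE, so a large gap between the two points to a few poorly estimated catchments.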

Conclusions and outcomes
There are several conclusions that can be drawn based on the performed analyses. First, the influence of each analysed catchment descriptor is not significant enough for it to be used as the only one explaining peak discharge values. The best results achieve a value of the determination coefficient of about R² = 0.3. This is not a very strong relationship, and the uncertainty would be too high if a single parameter were considered. This outcome was expected because of the nature of the flood phenomenon. It is a very complex process, and there are many factors which play an important role. Second, the most important catchment descriptors are the area and the maximum 24 h precipitation total, which again confirms the initial assumption, although the fit is not as high as expected in the case of the catchment area. Third, the involvement of a land use descriptor (CN) improves the performance of the methodology, even though the initial analysis did not confirm its influence on flood discharges with respect to its definition. Furthermore, this parameter improves the performance even more than the shape factor.
The results of parameterisation and cross-validation show that the concept used for the formulation of the methodology is reasonable and that the methodology provides satisfactory results. Values of optimised method parameters are reported in Table 3. However, the performance of the calculation is not as high as it could possibly be, and therefore the methodology will be developed further. Further research will focus on the manipulation of variables driven by shifts as expressed by the considered general shape of the methodology. This will need to be done carefully to avoid too high a number of model parameters, which could decrease the robustness. The development will also focus on the possible involvement of soil properties as the characteristic which could improve the performance of the methodology, as it is considered an important factor in flood formation.
The values of RMSE and MAE are shown in Table 1; relative values are shown in Table 2. The results for calibration using the whole data set show that in the worst case (only area and precipitation are involved), the relative value of RMSE is about 42 %, which is less than the value given by the standard. When all considered catchment descriptors are involved, the relative values of RMSE are 38 and 37 %, respectively. Scatterplots were drawn for all considered combinations of catchment descriptors to show the fit between observed and estimated values (see Figs. … 3. It shows that both means and medians for folds are very close to values for the whole data set, except for parameter $a_0$. Thus, the final forms of values obtained for the whole data set and the

Figure 1 .
Figure 1.Catchments less than 150 km 2 selected for the analysis.

Figure 2 .
Figure 2. Relationship between catchment area and peak discharge value for T = 10 years (left panel) and T = 100 years (right panel) with fitted lines following Eq.(2).

Figure 3 .
Figure 3. Relationship between catchment average maximum 24 h precipitation total and peak discharge values for T = 10 years (left panel) and T = 100 years (right panel) with fitted lines following Eq.(2).

Figure 4 .
Figure 4. Relationship between catchment average slope and peak discharge value for T = 10 years (left panel) and T = 100 years (right panel) with fitted lines following Eq.(2).

Figure 5 .
Figure 5. Relationship between catchment shape factor and peak discharge value for T = 10 years (left panel) and T = 100 years (right panel) with fitted lines following Eq.(2).

Figure 6 .
Figure 6.Relationship between catchment average CN value and peak discharge value for T = 10 years (left panel) and T = 100 years (right panel) with fitted lines following Eq.(2).

Figure 7 .
Figure 7. Relationship between catchment average shape factor and peak discharge value per unit area and unit precipitation total for T = 10 years (left panel) and T = 100 years (right panel) with fitted lines following Eq.(2).

Figure 8 .
Figure 8. Relationship between catchment average CN value and peak discharge value per unit area and unit precipitation total for T = 10 years (left panel) and T = 100 years (right panel) with fitted lines following Eq.(2).

Figure 9 .
Figure 9. Relationship between catchment 24 h precipitation volume and peak discharge value for T = 10 years (left panel) and T = 100 years (right panel) with fitted lines following Eq.(2).

Figure 10 .
Figure 10.Scatterplot of observed versus estimated discharge values for a combination of catchment area, maximum daily precipitation total, average catchment slope, shape factor and curve number for T = 10 years (left panel) and T = 100 years (right panel).

Figure 11 .
Figure 11.Scatterplot of observed versus estimated discharge values for a combination of catchment area, maximum daily precipitation total, average catchment slope and shape factor for T = 10 years (left panel) and T = 100 years (right panel).

Figure 12 .
Figure 12.Scatterplot of observed versus estimated discharge values for a combination of catchment area, maximum daily precipitation total, average catchment slope and curve number for T = 10 years (left panel) and T = 100 years (right panel).

Figure 13. Scatterplot of observed versus estimated discharge values for a combination of catchment area, maximum daily precipitation total and average catchment slope for T = 10 years (left panel) and T = 100 years (right panel).

Figure 14. Scatterplot of observed versus estimated discharge values for a combination of catchment area and maximum daily precipitation total for T = 10 years (left panel) and T = 100 years (right panel).

Figure 15. Values of the determination coefficient for calibration to the whole data set and for calibration and validation using 7 folds for T = 10 years (left panel) and T = 100 years (right panel).

Figure 16. Values of root mean square error (m³ s⁻¹) for calibration to the whole data set and for calibration and validation using 7 folds for T = 10 years (left panel) and T = 100 years (right panel).

Figure 17. Values of relative root mean square error for calibration to the whole data set and for calibration and validation using 7 folds for T = 10 years (left panel) and T = 100 years (right panel).

Figure 18. Values of mean absolute error (m³ s⁻¹) for calibration to the whole data set and for calibration and validation using 7 folds for T = 10 years (left panel) and T = 100 years (right panel).

Figure 19. Values of relative mean absolute error for calibration to the whole data set and for calibration and validation using 7 folds for T = 10 years (left panel) and T = 100 years (right panel).

Figure 20. Bias for calibration to the whole data set and for calibration and validation using 7 folds for T = 10 years (left panel) and T = 100 years (right panel).

Figure 21. Values of parameter c1 calibrated to the whole data set and to 7 folds for N = 10 years (left panel) and N = 100 years (right panel).

Figure 22. Values of parameter c2 calibrated to the whole data set and to 7 folds for N = 10 years (left panel) and N = 100 years (right panel).

Figure 23. Values of parameter c3 calibrated to the whole data set and to 7 folds for N = 10 years (left panel) and N = 100 years (right panel).

Figure 24. Values of parameter c4 calibrated to the whole data set and to 7 folds for N = 10 years (left panel) and N = 100 years (right panel).

Figure 25.
10-14). The comparison of performance metrics for the calibration using the whole data set and for the calibration using folds is shown in Figs. 16-19. The metric values obtained for the folds are in general slightly worse than those obtained by the calibration for the whole data set, but the differences are small. Calculated bias values show that, in general, the model overestimates flood discharge. Bias values obtained by the calibration for the whole data set and for both training and validation subsets are shown in Fig. 20. Calibrated values of model parameters were compared between the calibration for the whole data set and the calibration using folds. These values are shown in Figs. 21-25. The results show that parameter values for folds do not differ much from those obtained by the calibration for the whole data set. In general, the variance of calibrated parameters increases with decreasing importance of the corresponding catchment descriptors, being highest in the case of shape factors and curve numbers. The comparison of mean and median values of model parameters obtained for folds with values obtained for the whole data set is provided in Table
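The fold-based calibration and validation described in this paragraph can be illustrated by a minimal sketch. It is not the authors' implementation. It assumes a single-descriptor power law Q = a·A^b, fitted by least squares in log space, in place of the full model. It further assumes that the relative metrics are normalised by the mean observed discharge and that bias is the mean of estimated minus observed values, so that a positive bias indicates overestimation. The function names are hypothetical.

```python
import math
import random


def fit_power_law(areas, flows):
    """Fit log(Q) = log(a) + b*log(A) by ordinary least squares; return (a, b)."""
    xs = [math.log(a) for a in areas]
    ys = [math.log(q) for q in flows]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def metrics(observed, estimated):
    """RMSE, MAE, their relative forms (assumed: divided by mean observed) and bias."""
    n = len(observed)
    errors = [e - o for o, e in zip(observed, estimated)]
    mean_obs = sum(observed) / n
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    mae = sum(abs(d) for d in errors) / n
    return {
        "rmse": rmse,
        "mae": mae,
        "rel_rmse": rmse / mean_obs,
        "rel_mae": mae / mean_obs,
        "bias": sum(errors) / n,  # positive means overestimation
    }


def k_fold_cross_validation(areas, flows, k=7, seed=0):
    """Calibrate on k-1 folds, validate on the held-out fold, for each fold."""
    indices = list(range(len(areas)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    results = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        a, b = fit_power_law([areas[i] for i in train],
                             [flows[i] for i in train])
        estimated = [a * areas[i] ** b for i in held_out]
        observed = [flows[i] for i in held_out]
        results.append({"a": a, "b": b, **metrics(observed, estimated)})
    return results
```

Calling `k_fold_cross_validation(areas, flows, k=7)` returns one dictionary per fold. The spread of the `a` and `b` entries across folds corresponds to the parameter comparison discussed above, and the metric entries correspond to the validation values.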

Table 1. Values of RMSE and MAE for all considered combinations of involved catchment descriptors.

Table 2. Values of % RMSE and % MAE for all considered combinations of involved catchment descriptors.

Table 3. Values of calibrated model parameters.

Table 4. t values for the considered parameters (t_{α/2, n−K} = 1.9723 to 1.9725, for α = 0.05, n = 196 and K = 5 to 2).

The final shape of the equations for the calculation of flood discharge values when considering all catchment descriptors except for SF is as shown by Eqs. (14) and (15).