Flood frequency analysis supported by the largest historical flood

The use of non-systematic flood data for statistical purposes depends on the reliability of the assessment of both flood magnitudes and their return period. The earliest known extreme flood year is usually the beginning of the historical record. Even if one properly assesses the magnitudes of historic floods, the problem of their return periods remains unsolved. The matter at hand is that only the largest flood (XM) is known during whole historical period and its occurrence marks the beginning of the historical period and defines its length (L). It is common practice to use the earliest known flood year as the beginning of the record. It means that the L value selected is an empirical estimate of the lower bound on the effective historical length M. The estimation of the return period of XM based on its occurrence ( L), i.e.M̂ = L, gives a severe upward bias. The problem arises that to estimate the time period ( M) representative of the largest observed flood XM. From the discrete uniform distribution with support 1,2, . . . ,M of the probability of theL position of XM, one gets L̂ = M/2. ThereforeM̂ = 2L has been taken as the return period of XM and as the effective historical record length as well this time. As in the systematic period ( N ) all its elements are smaller than XM, one can get M̂ = 2(L + N). The efficiency of using the largest historical flood (XM) for large quantile estimation (i.e. one with return period T = 100 years) has been assessed using the maximum likelihood (ML) method with various length of systematic record (N ) and various estimates of the historical period length M̂ comparing accuracy with the case when systematic records alone (N ) are used only. The simulation procedure used for the purpose incorporates N systematic record and the largest historic flood (XMi) in the periodM, which appeared in the Li year of the historical period. The simulation results for selected two-parameter distributions, values of their parameters, different N andM values are presented in terms of bias and root mean square error (RMSEs) of the quantile of interest are more widely discussed.


Introduction
Flood engineering usually deals with the determination of the flood of a given return period T years, i.e. the flood quantile X T or the design flood.The problems with the assessment of these parameters result from short time series (N < T ), unknown probability distribution function of annual peaks, error corrupted data, the simplifying assumptions as of identical independently distributed (i.i.d.) data and, in particular, the assumption of stationarity of relatively long data series.All these account for high uncertainty of the upper quantile estimate.The effect of sample size is widely documented for various distribution models and estimation methods; thus, it is obvious that due to a short sample the confidence interval of the design flood estimate is already very broad.In addition to flood frequency analysis (FFA) other sources of error would result in increasing uncertainty in the design flood estimate.This feature is not appreciated by the designers as they want to have only one value for designing flood related structures.Conversely, efforts to improve the accuracy of estimates of the hydrologic design value by specifying the various sources of uncertainty and incorporating them in the analysis produce the opposite effect from the one intended.
To improve the accuracy of estimates of upper quantiles, all possible sources of additional information and "statistical tricks" are used: independent peaks above the threshold, seasonal approach, regional analysis, record augmentation by correlation with longer nearby records and, finally, augmentation of the systematic records by historical and palaeo-flood data.
Frequency analysis of flood data arising from systematic, historical, and palaeo-flood records has been proposed by several investigators (a review Stedinger and Baker, 1987;Frances et al., 1994;MacDonald, 2013).The use of nonsystematic flood data for statistical purposes depends on the reliability of assessment both flood magnitudes and their return period.If the historical record is available, the information about the floods larger than the prevailing majority of floods reported in the systematic record can be introduced to the data sets along with, if we are lucky, the unique information about the largest reported floods.Serious difficulties relate to the (un)availability and (non-)exhaustiveness of historical information, the (low) quality and (in)accuracy of historical sources.As if it was not enough, depending on the number of parameters and their method of estimation, the estimates of high quantiles are more or less sensitive to the largest observed floods.
The earliest and simplest procedures for employing historical and palaeo-flood data were based on plotting positions and graphical concepts (Zhang, 1982(Zhang, , 1985;;Bernieur et al., 1986;Wang and Adams, 1984;Hirsch, 1985;Cohn, 1986).The The probability weighted moments (PWM) method and linear moments (L-moments) were introduced by Ding and Yang (1988); Wang (1990Wang ( , 1996) ) and Hosking (1995).To deal with historical and palaeo-floods, Hosking and Wallis (1986a, b) applied the maximum likelihood (ML) as the estimation method.Recently the Bayesian estimation paradigm has been incorporated (Vigilione et al., 2013;Parent and Bernier, 2003;Reis and Stedinger, 2005).It takes into account that the historical floods are known with uncertainty, for instance with lower and upper bounds (in fact the effect of corrupted historical flood magnitudes was investigated by Hosking and Wallis via the maximum likelihood (ML) as mentioned as early as 1986a, b).The subject of historical floods currently constitutes one of the main scientific threads in flood frequency analysis (MacDonald, 2013;Payrastre et al., 2011Payrastre et al., , 2013)).It is important to add that the inclusion of historical information is recommended in a number of national and international policy documents (e.g.EU Flood Directive).The two-parameter distributions -namely log Gumbel, Weibull and Gumbel distributions together with maximum likelihood method -were considered by Frances et al. (1994) to tackle systematic and historical or palaeoflood data in FFA.To assess the potential statistical derived from historical information the asymptotic variances of the quantile estimates from the systematic records alone and the combined time series were compared by means of computer simulation experiments.The study performed to define the length (M) of historical period indicate that value of the historical data for estimating flood quantiles can vary depending on only three factors: the relative magnitudes of the length of the systematic record (N) and the length of the historical period (M); the return period (T ) of the flood quantile of interest; and the probability threshold defining the historical floods.
Most often it is the first historical large flood that is the most remembered (and described in historical sources) and, therefore, it is usually not considered as important (or simply not known) as what had happened before (Girguś and Strupczewski, 1965).In other words, the largest (palaeo-)historical flood is best remembered (and thus recorded) because of its destructive character and its disruptive effect on many lives, and in the case of pre-historical time, the largest inundation swept away any evidence of smaller floods that occurred earlier.The date of the first recorded historical flood is taken as the historical memory length L; i.e.L becomes the duration of a non-systematic period commencing on the date of the large flood.Even if one properly assesses the magnitudes of historic floods, the problem of their return periods remains unsolved.In most literature examples (especially Benson, 1950;Dalrymple, 1960;IACWD, 1982;Zhang, 1982;NERC, 1975, p. 177) one reads that the effective length of the historical record M used for frequency analysis is always taken to be the period from the first extraordinary flood to the beginning of the systematic record, i.e.L.
The matter at hand is that only the largest flood (XM) is known during the entire historical period, and its occurrence marks the beginning of the historical period and defines its length (L) (Fig. 1).That is because the beginning of the historical period was somehow forced by the appearance of the largest flood (XM), but in fact its unusual magnitude corresponds rather to a longer return period than L (or, if in systematic record all observations are smaller than XM, to (L + N )-period); i.e. the probability that the actual return period of XM is longer than the L is greater than fifty percent.The case of the largest historical flood smaller than the systematic record maximum is not considered here.
Attempts to eliminate or lessen this error lead us to estimate the time period (M) representative of the largest observed flood XM as accurately as possible.In order to do so, we will carry out the evaluation of the efficiency of using the largest historical flood (XM) for large quantile estimation and its comparison with the case when systematic records alone (N ) are used.To keep and preserve the unspoiled genuine information contained in the observation (XM, L), the return period ( M) of the largest observed historical flood (XM) should be assessed without data from the systematic record provided that it does not contain elements larger than XM values.
It is obvious that the return period of the historical flood assessed on the base of the year of occurrence (L) represents just the lower limit of its real empirical return period (M).Of course, there is an upper empirical limit as well, which however, can not be estimated unambiguously.This is so because, if the occurrence of a large flood was reported in a given year, surely a similar or more serious flood a year before would have been also noted and commented in historical sources (Hirsch and Stedinger, 1987).The same can be stated for a horizon of 2, 3, 4, etc. years.If we could identify this time span, we would have determined the upper limit of the empirical return period.
The estimation of M based on the date of the first extraordinary flood occurrence exacerbates an already severe imprecision.By defining all floods during the M period as historical floods above a given threshold and taking four different plotting position formulas, Hirsch and Stedinger (1987) calculated (with the use of a Monte Carlo experiment) the magnitude of the upward bias of the plotting position of the largest sample elements occurring when L is taken as the beginning of the historical record.Doing so they noticed that L is a random variable dependent on the flood-producing process itself; this would be a violation of the assumption of the plotting position formulas.
Similarly, Hosking and Wallis (1986a, b) use a Monte Carlo (MC) computer simulation to assess whether a single palaeo-flood estimate, when included in a single-site maximum likelihood (ML) flood frequency analysis procedure, gives a worthwhile increase in the accuracy of estimates of extreme floods.They found that the main factors affecting the utility of this kind of palaeological information are the specification of the fitted flood frequency (whether it has two or three unknown parameters) and the size of the measurement error of palaeo-discharge estimates.Errors in estimating the date of the palaeo-flood are considered to be of minor importance.For distributions with higher coefficient of variation (CV) or skewness (CS), the difference between the effects of the errors of the magnitude of palaeo-flood and its return period is smaller.
Note that the randomness of the systematic records time series of i.i.d.variable can also be sometimes questioned and undermined, e.g. when the largest value XM of a time series intentionally terminates the N -elements' systematic record.Then the XM is the last element of the N-element time series.Such a case may arise when a water gauge was swept away by a heavy flood (XM) and not restored, or when the hydrological station was intentionally moved.As before, the use of such a series in FFA with M = N will lead to an overestimation of large quantiles.

Problem formulation
The object of the paper is to assess by the use of the maximum likelihood (ML) method in order to determine whether there is any impact of the largest flood terminating the time series assuming its magnitude (XM) is known.Therefore, the case of systematic data completed by the largest flood is compared with the case where records contain systematic data only.These two variants are examined by comparing the bias (B) and the root mean square error (RMSE) of flood quantiles.Similarly to Frances et al. (1994), the two twoparameter distributions, namely Gumbel and Weibull were used when applying the simulation experiments.The emphasis is put on the effect of misspecification of the return period (M) of the largest historical (palaeo-) flood (XM) and on the proper assessment of the M estimate on the basis of XM occurrence (L).So far, the results of such research have not been presented in the hydrological literature.We are aware, however, that the results obtained will differ, if a three-parameter distribution (e.g.generalised extreme value, GEV) was involved in calculations.Upper quantiles, which are the values of interest, depend strongly on asymmetry that is easier to manipulate in three (or more) parametric distributions.
The theoretical framework of our research is based on Maximum Likelihood estimation which has been generally found to have desirable properties for both systematic and historical information (Frances et al., 1994;Stedinger and Cohn, 1986;Naulet et al., 2005).It is assumed that the annual maximum floods are independent and identically distributed.
Assessment of the return period M of the XM flood Hirsch and Stedinger (1987) considered that the time of occurrence of the earliest documented historical flood L is the random variable defining a lower bound of the sample size used for computation of plotting positions.The position L of the largest in M period element (XM) (Fig. 1) is the random variable being discretely and uniformly distributed in the M period, i.e. p t = 1/M for t = 1, 2 . . .M. Obviously the magnitude of the largest element (XM) is also a random variable.Within the population, it can correspond to a smaller or larger value of the exceedance probability than 1/M, thereby defining the effective return period (M R ) of XM.Therefore the difference (M R − L) is not restricted in sign.
Assume that the return interval (M) of XM is known.As L is the uniformly distributed variable in the M length time series with support L ∈ [0, 1, . . ., M], one gets E(L) = M/2 and V (L) = M 2 /12.In reality M is not known and its assessment is our goal.Taking the observed L value as the estimate of the expecting value, i.e.L = E(L) we get the M estimate equal M = 2L.Because regardless of the estimation method the quantile estimators are not in general linear function of M, the minimum bias of quantile B xp = E xp M − x p does not necessarily correspond to the zero-bias of M, i.e. to M = 2L.If in the systematic period (N ) all its elements are smaller than XM, one can get M = 2 (L + N).Note that usually N L.

Simulation procedure
The simulation procedure incorporates N systematic record and the largest historic flood (XM) in the period M which appeared in the L year toward the end of the historical period (Fig. 1).Obviously, the systematic record and both magnitude (XM) and year of occurrence (L) randomly vary from simulation to simulation.An estimate of the length of the historical period shall be successively M = L, 2L and the actual value M = M, i.e. the length of the period M in simulation experiment.First, generate a gauged record x 1 , x 2 , . . ., x N of independent random variates from the assumed two-parameter flood-like distribution [F (x)] with parameters chosen to give specified values of CV.Then generate historical series of the same distribution of the length M, i.e. y 1 , y 2 , . . ., y M , and find the maximum event (XM) of the historical series denoting the time (L) of its occurrence.Since the random variables (XM) and L are mutually independent, the XM can be generated from the distribution of the largest element in a M-element series, i.e.F (M) = F 1,M (y) = F M (y), while the corresponding time of its occurrence (L) can be generated from the discrete uniform distribution with support {1, 2, . . ., M}.The use of two-parameter distributions allows us inter alia to concentrate only on the profit one can get while incorporating additional historical information into systematic data without analysing the estimation errors within the context of asymmetry.This is more flexible in three-parameter distributions.
A flood frequency distribution fitted by the method of maximum likelihood has a distribution function F (x, θ ) and a density function f (x, θ), where θ is a vector of unknown parameters, then the likelihood function (L) is taken to be i.e. the use of incomplete data likelihood, where M = L, 2L and M, and for systematic record only (2) Calculate quantile estimates XT = F −1 1 − 1 T ; θ for M = L, 2L and M and the systematic record (N) only (i.e. when M = 0), where F −1 is the inverse distribution function of the fitted flood frequency distribution, θ is the maximum likelihood estimate of θ, and T is the return period of interest.
Repeat the above steps a large number of times (i) and calculate the mean and variance of XT , and hence the relative bias RB and relative RMSE of XT taking Mi = L i , 2 L i and M and the systematic record (N ) only ( M = 0) considered as an estimator of the true quantile X T = F −1 (1 − 1/T ; θ ).If in a generated series one gets max(x 1 , x 2 , . . ., x N ) ≥ XM, such a simulation is ignored, which allows us to assume M = 2L.

Simulation results
The concise frame of this paper forced us to limit the number of models we took into consideration in our calculations.In order to lessen the number of the figures for all the combinations of CS and CV values, we resigned from three-parameter distributions such as generalised extreme value (GEV) and turned into its two-parameter special forms, namely Gumbel (Gu) and Weibull (We).Another cause was also that, however theoretically sound, the GEV working perfectly for large samples often fails in far-from-asymptotic samples which we examine in this study.We scrutinised a number of two-and three-parameter distribution functions in terms of their best fit to hydrological annual and seasonal peak flows in Poland and it turned out that despite the regime of the river, other models were preferred rather than GEV (Strupczewski et al., 2012;Kochanek et al., 2012).However, the crucial argument after the choice of the parent distribution was the pioneering works of Frances et al. (1994) that we wanted to continue and develop.Results of simulation experiments are shown for Gu and We distributions with four values of the coefficient of variation CV = 0.25, 0.5, 0.75, 1.0, with two different lengths of systematic records N = 15, 50 and the length of effective historical period M = N10 (a) where a ∈ [0, 3].Due to the limited capacity of this paper without the loss of generality, only the selected results were presented in Figs.2-5, namely for CV = 0.25 and 1.0; the results for CV = 0.5 and 0.75 locate themselves between those presented in the figures.Results for the correct value of the return period ( M = M) are compared with those got for M = L i , 2L i .For completion the results for the systematic record only (i.e.M = 0) were presented in all figures (solid line).Of course, for this case the results do not depend on M and in consequence on log(M/N ).

Discussion of the results
1.The shorter the gauged record (N) is, the more useful the historical information.
2. Using as the true return period of the largest historical flood (XM) the estimate of the historical memory length (L) results in considerable upward bias RB of 1 % quantile, far exceeding the bias for the systematic record only.Its value increases with C V (and C S ) and with the M/N ratio.
3. Using an ML estimation the M = 2L instead of M = L considerably reduces the bias and further reduction is obtained for the M = M, i.e. for the return period (M) of the largest historical flood XM.
4. Although the use of M = 2L instead of M = L reduces the bias more than twice, it is still ca.40 % larger than the bias of a known return period M of XM, and comparable to the bias from systematic record (N).
5. As far as the relative root mean square error (RRMSE) of 1 % quantile is concerned, for both Gumbel and Weibull models one can notice some reduction (understood as the difference between non-systematic and systematic results) in its values when one uses L, 2L or M return periods in comparison to the systematic sample.The worst reduction of RRMSE one gets for L, better for 2L and the best for M which means that it is worth, at least, considering using a historical measurement XM in upper quantile estimation and then set the return period of XM to 2L rather than L if we do not know M.
6.The reduction in RMSE for both models (Gumbel and Weibull) rises generally with M/N ratio.In other words: the bigger M (compared to N), the higher distance between RRMSE values got for the sample with additional historical information and the systematic series.It goes without saying, that for N = 15 one gets better reduction than for N = 50.
7. For both the Gumbel and Weibull models the reduction in RRMSE compared to systematic samples depends on CV -the larger CV, the larger is the reduction.
8. For Gumbel model in comparison to systematic sample for log(M/N) = 3, CV = 0.25 and N = 15 the reduction gets up to 2.2, 3.6 and 5.3 % for L, 2L and M, respectively.For N = 50, these numbers are roughly three times smaller.9.For Weibull the gain in RRMSE is more spectacular and for log(M/N) = 3, N = 15 and CV = 0.25 equals to 3.4, 4 and 4.9 % for L, 2L and M, respectively (when CV = 1.0 the gain is ca.four times higher).For N = 50 the general trend for Weibull remains the same as for N = 15, but the reduction of RRMSE is smaller.10.To sum up the RRMSE issues, the inclusion of the largest historical flood in FFA with M = 2L (i.e. the effective historical record length) gives a few-percent reduction in RRMSE of extreme flood estimates.However, the reduction is ca.20 up to 60 % lower than if we took Mas the length of simulation period.The true value of M is not available in reality, so one is doomed to use 2L instead.
11. Therefore, to benefit from the largest historical observation every effort should be made to establish M accurately.
12. In the absence of any information about the period preceding the occurrence of XM one should put M equal to 2L or 2(L + N ).
13.The benefit from including the largest historical flood of a given value is measured by the reduction of RRMSE.It depends on: a. the length of systematic record (N ), b. the ratio of the true return period of XM, i.e.M to N , c. the ratio of N to the return period of quantile of interest, d. the C V and skewness of the parent distribution.

Conclusions
Errors in historical data reduce, of course, the utility of the data for improvement of the estimation of flood magnitude at a given return period.In the simulations (Figs.2-5) it was assumed that the magnitude of the largest historical flood (XM) was measured without error and the same was assumed for the systematic record.It is realistic to suppose that the XM flood was measured much less accurately than the gauged record.Error in estimating the largest historical magnitude (XM) is much more important than error in estimating the date of its occurrence (e.g.Hosking and Wallis, 1986a, b).Ironically the tendency to improve the accuracy of estimates of flood quantiles through more realistic assumptions and a fuller use of the information leads to just the opposite effect -to the increase of uncertainty of flood estimates.
The next step should be to refer to the general problem of historical information when the applied distribution model is false, which is always the case (Strupczewski et al., 2002), and/or is the three-parametric function.On the other hand, the uncertainty of the palaeo-historical floods (both in terms of their magnitude and return period) combined with considerable increases in the complexity of the problem (when compared to analysis of systematic data only) provokes a fundamental question, whether the whole operation is worth a candle.Therefore, whether to include the palaeo-historical information or turn a blind eye to it, is a matter of conscience.
All these generate three important practical problems which we leave for further study, namely: 1. What is the theoretical upper limit of accuracy of high quantile estimation when the theoretical value (i.e.taken from the parent distribution) of return period for XM is known?
2. Here in our simulation experiment we assumed the knowledge of the true (parent) distribution function.
The role of historical information when the assumed distribution serves as the model of the true distribution remains, for the time being, unknown.
3. We also assumed that the parent distributions are the Gumbel and Weibull two-parameter distributions which narrows down the ability to manipulate the skewness of the distributions.The obtained results will, of course, differ when a three-parameter distributions are involved in the calculations.
Only the solutions to these three problems completed by the consideration of the observation errors in FFA brings us closer to the answer to the fundamental question stated above, i.e. whether the available palaeo-historical record can provide worthwhile improvement in flood estimates.

Figure 1 .
Figure 1.The case of N systematic and the largest flood in the beginning of historical period.

Figure 2 .
Figure 2. Relative bias (RB) and relative root mean square error (RRMSE) of XT =100 as a function of gauge record length N and historic period M for Mi = 0, L i , 2L i , M. Parent distribution Gumbel with CV equal: 0.25 and 1.0 and N = 15.Fitted distribution Gumbel.

Figure 3 .
Figure 3. RB and RRMSE of XT =100 as a function of gauge record length N and historic period M for Mi = 0, L i , 2L i , M. Parent distribution Gumbel with CV equal 0.25 and 1.0 and N = 50.Fitted distribution Gumbel.

Figure 4 .
Figure 4. RB and RRMSE of XT =100 as a function of gauge record length N and historic period M for Mi = 0, L i , 2L i , M. Parent distribution Weibull with CV equal 0.25 and 1.0 and N = 15.Fitted distribution Weibull.

Figure 5 .
Figure 5. RB and RRMSE of XT =100 as a function of gauge record length N and historic period M for Mi = 0, L i , 2L i , M. Parent distribution Weibull with CV equal 0.25 and 1.0 and N = 50.Fitted distribution Weibull.