Influence of flood frequency on residential building losses

For the purpose of flood risk analysis, reliable loss models are an indispensable need. The most common models use stage-damage functions relating damage to water depth. They are often derived from empirical flood loss data (i.e. loss data collected after a flood event). However, object-specific loss data (e.g. losses of single residential buildings) from recent flood events in Germany showed higher average losses in less probable events, regardless of actual water level. Hence, models that were derived from such data tend to overestimate losses caused by more probable events. Therefore, it is the aim of the study to analyse the relation between flood damage and recurrence interval and to propose a method for considering recurrence interval in flood loss modelling. The survey was based on residential building loss data (n = 2158) of recent flood events in 2002, 2005 and 2006 in Germany and on the on-site recurrence interval of the respective events. We discovered a highly significant positive correlation between loss extent and recurrence interval for classified water levels as well as increasing average losses for longer recurrence intervals within each class. The application of principal component analysis revealed the interrelation between recurrence interval and the factors that influence the damage extent directly or indirectly. No single factor or component could be identified that explained the influence of recurrence interval, which led to the conclusion that recurrence interval cannot substitute for, but can complement, other damage influencing factors in flood loss modelling approaches. Finally, a method was developed to include recurrence interval in typical flood loss models and make them applicable to a wider range of flood events. Validation including statistical error analysis showed that the modified models improve loss estimates in comparison to traditional approaches. The proposed multi-parameter model FLEMOps+r performs particularly well.

Correspondence to: F. Elmer (elmer@gfz-potsdam.de)


Introduction
Flood risk can be defined as the probability and the magnitude of expected losses that result from interactions between flood hazard and vulnerable conditions (UNISDR, 2004). It exemplifies the tension at the interface of society, settlements and the environment and is the price to pay for the benefits of using water resources in multiple ways. The economic and societal implications are considerable. Floods are responsible for 20-30% of the economic losses caused by natural hazards worldwide (Douben and Ratnayake, 2006). Even the death toll due to (freshwater) floods shows a slight increase from 1975 to 2001 (Jonkman, 2005). In Germany, the severe flood event in August 2002 caused monetary losses of more than 11 billion Euros (see Engel, 2004, for an event description) and 21 casualties. In the aftermath, the assessment of flood losses for compensation purposes and, in the medium term, for cost-benefit analyses of flood protection measures and flood risk management in general became an urgent need. Thorough analysis and a subsequent assessment of the risk are indispensable parts of managing the risk and mitigating flood damages, thereby enhancing the benefits of risk management and limiting costs of damage mitigation measures.
Flood damages are usually divided into direct and indirect damages, which are further divided into tangibles and intangibles (Smith and Ward, 1998). Most studies have concentrated on direct tangible losses, as the assessment of indirect damages, while very important, remains methodologically difficult. It is often only approached implicitly, e.g. by using risk aversion concepts to account for the disproportional increase in indirect losses (and intangibles) of extreme events (e.g. Merz et al., 2009). Intangibles are even more difficult to assess, although some recent studies revealed the important implications they can cause (Siegrist and Gutscher, 2006). While we acknowledge the importance of these damages, this study only accounts for tangible direct losses to residential buildings. For the purpose of the study this narrowed approach was suitable, as the influence of flood frequency on direct flood losses could be clearly demonstrated.
Flood risk science can draw on decades of research. Extensive overviews of flood loss assessment studies are given by Smith (1994), Büchele et al. (2006) and Merz (2006). Seminal analyses on flood loss functions started in the UK in the early 1970s. They were summarized in the "Blue Manual" (Penning-Rowsell and Chatterton, 1977) and have been continued ever since ("Multi-coloured Manual", Penning-Rowsell et al., 2005). The data and loss functions are still the standard approach for flood loss analysis and estimation in Great Britain.
In state-of-the-art loss estimation, water level is mostly taken as the main impact parameter in so-called stage-damage curves, which estimate a certain degree of damage or absolute damage based on the inundation depth. Besides the Multi-coloured Manual of the Flood Hazard Research Centre (FHRC) at Middlesex University (Penning-Rowsell et al., 2005), examples of this approach are the flood loss functions developed by the International Commission for the Protection of the Rhine (Egli, 2002), Hydrotec (2002), Emergency Management Australia (E.M.A., 2003) and the Federal Emergency Management Agency (FEMA; Scawthorn et al., 2006).
In Germany, the first studies that dealt with flood loss analysis were closely linked to the flood loss database HOWAS (Kiefer, 1976; Meon et al., 1986; Günther and Schmidtke, 1988) run by the Bavarian Water Agency (Bayerisches Landesamt für Wasserwirtschaft). Buck and Merkel (1999) analysed the datasets and the data quality. They concluded that square-root functions are the best approach to estimate losses. Later works used the HOWAS data sets and advanced the data analysis (e.g. Merz and Gocht, 2001; Merz et al., 2004). Merz et al. (2004) demonstrated the high uncertainty of stage-damage functions that are derived from empirical data (i.e. loss data collected after flood events) and suggested considering more factors, besides water level and building use, in loss modelling.
Several authors, e.g. Penning-Rowsell et al. (2005) and Buck et al. (2008), emphasized the variability of the objects at risk and provided loss functions for a wide range of representative objects that have undergone very detailed analyses in terms of the values at risk and their resilience. These approaches are based on synthetic, i.e. what-if, loss data and analyses.
In order to learn more about damage processes, extensive datasets about flood losses and a range of damage influencing factors were surveyed in Germany after floods in 2002, 2005 and 2006 (Thieken et al., 2005; Kreibich and Thieken, 2008). The influencing factors were divided into impact factors, e.g. water depth and contamination, and resistance factors such as type of building, preparedness and early warning (Thieken et al., 2005). In the aftermath, Thieken et al. (2008) and Kreibich et al. (2010) developed flood loss estimation models for the residential and for the commercial sector (FLEMOps and FLEMOcs, respectively) that include information about the objects at risk and consider water level, flood water contamination and precautionary measures at the object as impact factors. The models were derived from actual data surveyed after the 2002 flood event in the Elbe and Danube catchments (FLEMOps). For the derivation of FLEMOcs, additional empirical data from events in 2005 (Danube) and 2006 (Elbe) were also considered. Model validation demonstrated the improvements and reliability of the new models. However, the transferability of the model applications in space and time is still limited (Thieken et al., 2008). Apel et al. (2009) exemplified that the inherent uncertainties in loss estimation and the huge deviance of estimates have their origins more in the loss modelling approach than in the hydraulic modelling. De Moel and Aerts (2009) showed that this also applies to a comparison of the uncertainties in land use data, water depth data and stage-damage curves: the use of different stage-damage curves caused much higher deviations in loss estimates than different sources of land use or water depth information. Merz and Thieken (2009) preselected plausible loss modelling approaches in terms of applicability to their research area. Under this condition they obtained similar contributions from flood frequency estimations, inundation scenarios and damage estimations to overall uncertainty. Including all available models without preselection heavily extended the uncertainty bounds of the flood risk curves in their work. Hence, putting emphasis on the improvement and validation of loss models might yield the highest gain in risk estimation accuracy.
The amount of damage is determined by the combination of impact and resistance. The challenge in loss modelling is to identify how and to which degree they influence damages. Thorough analysis of flood events, damages and the underlying processes is the way to fulfil this task. Therefore, information about actual damages caused by flooding is necessary and has to be collected, analysed and made accessible (Thieken et al., 2009). A substantial set of data is necessary, because no two flood events are the same. Flood characteristics vary strongly in time and space. Different river basins can be affected in one event, and even for a single flood event within only one catchment the flood characteristics differ. It is important to keep in mind that flood magnitude varies along a river network and also for each time step during the event. A single event never has a homogeneous recurrence interval. Generally, when comparing single events, extreme events cause higher accumulated damages than more likely events, as they affect larger areas and cause a bigger share of deep inundation. Furthermore, for the same reasons they aggravate indirect effects, e.g. business interruptions.
Analyses of data from events in 2002, 2005 and 2006 showed that average building losses for private households were much higher in the flood event in August 2002 in the Elbe catchment than in the other events. This event caused enormous losses and was much more extreme in terms of recurrence intervals than the flooding in the Danube catchment. Associated with this flood magnitude were much higher relative loss averages in the Elbe catchment. The water level class affiliations show a similar pattern but were less distinct than the differences in average loss ratio (Fig. 1). We hypothesise in this paper that floods with high probabilities and low magnitude primarily affect objects whose inhabitants have previously experienced flooding or at least gathered information about flood damage mitigation, and that damages are lower because permanent and temporary resistance are higher due to better preparedness and precautionary measures.
In Fig. 2 this hypothesis is exemplified: a flood that occurs on average every ten years affects buildings 2 and 3 (dark blue). Due to the exposure of the buildings, the inundation depth at each building (stage) is different and hence we expect a higher loss ratio in building 3 than in building 2. In an event with a recurrence interval of 100 years, the cumulative losses are apparently higher, as a bigger number of objects, in this example all four buildings, are affected (light blue). On the object scale, the inundation depth is the same for building 1 in the HQ100 event as for building 2 in the HQ10 event. This also applies to building 3 and building 4, respectively. State-of-the-art loss estimation models apply the same function to affected buildings of the same type (black graph). But we assume that there is a difference in loss extent (ΔD) if these buildings are affected by events of different probability (dark blue and light blue graphs). None of the models mentioned above considers this difference.
Consequently, this study analyses the relation between flood damage and recurrence interval with the intention to propose a method for considering this relation in flood loss modelling.
For this purpose we address the following three main questions in this paper:
(a) Is there a correlation between loss extent and recurrence interval that is not only a result of the greater spatial extent and the higher water levels in extreme events?
(b) Which damage influencing factors are altered by changes in flood frequency?
(c) How can the findings be accounted for in flood loss estimations?
In the first step (a), the basic question had to be answered before further analyses could be conducted. Flood frequency (in terms of recurrence interval or exceedance probability) describes a characteristic of an event but is not an impact or resistance factor itself. In the second step (b) we wanted to find the causes of the differences in the object-specific loss extent in high and low probability flood events. For this purpose, we analysed which damage influencing parameters change with flood probability. Finally, these results were utilised by finding ways to integrate recurrence interval in existing flood loss models (c). The usefulness of this step was judged by validating and comparing the accuracy of losses estimated by the "traditional" and modified loss models.
The paper is structured as follows: in Sect. 2 all data analysed in the survey are described. Section 3 presents the methods and models used and the purposes for applying them for data analyses. The order in this chapter is repeated in Sect. 4 "Results and discussion"; the sequences of presenting the methodologies and showing the respective results coincide.
The section "Conclusions" sums up the most notable findings and translates them into recommendations for future flood loss analyses and closes with a short outlook on future research assignments.

Data
For the analyses, data about flood losses in the residential sector were used. The data set contains information about
- building losses
- building loss ratios
- site-specific recurrence interval of the flood event (values, classified)
- water level (classified) above ground surface
for every single loss case, and additional information about relevant loss influencing parameters.

Flood loss data
In the aftermath of the severe flood in August 2002 that hit the rivers Elbe, Danube and tributaries, 1697 households responded to a standardised questionnaire that contained about 180 questions. The survey addressed the following topics: flood warning, precautionary measures, flood impact, emergency measures, evacuation, contamination and cleaning-up, characteristics of the affected household contents and buildings and losses to contents and buildings, recovery of the households, and information about flood experience as well as socio-economic variables. Many topics were covered by several questions and most questions offered multiple categories of answers. Hence data aggregation was necessary.
The derived indicators and the complete survey are described in detail by Thieken et al. (2005, 2007) and Kreibich et al. (2005).
The data base was further extended by a second survey in 2006. Households affected during the August 2005 flood in the Danube catchment and in the March/April 2006 flood in the Elbe region were surveyed via telephone interviews using a computer-aided questionnaire that is based on the questionnaire used in 2003. A digital version of this questionnaire can be found at http://www.gfz-potsdam.de/portal/gfz/Struktur/Departments/Department+5/sec54/Ressourcen/Dokumente/Questions+MEDIS?binary=true&status=300&language=de.
In this second survey, a total of 461 flood-affected households were interviewed. Of these interviews, 317 addressed households in the Danube catchment and 144 in the Elbe catchment. As there were also minor floods in the Elbe catchment in 2005 and in the Danube catchment in 2006, a limited number of loss cases from these events were also assessed in the 2006 survey. Most interviews took place six to 18 months after the damaging flood event. Only household members who still lived in or had already returned to the damaged building could be contacted. Consequently, heavy or total losses are probably underrepresented in the data set. In reality, these losses are quite uncommon and, given the size of our sample, this underrepresentation should not bias the results significantly.
The analyses required data about building losses and water levels, which could be directly taken from the interview answers.

Additional data
Additional necessary information items to supplement the data sets were loss ratios and object-specific flood recurrence intervals. To calculate loss ratios, actual losses as given in the interviews were divided by the building values, which were estimated as follows. Building values (replacement costs) were estimated by using actuarial valuation methods. The VdS guideline 772 1988-10 (Dietz, 1999), commonly used in the insurance sector, offers a method to estimate absolute building values in "Mark 1914". The results can be transferred to replacement costs for any given year by a correction factor. Necessary information to apply this method was taken from interview answers concerning total floor space, basement area, number of storeys and roof type. To compare all monetary information, the year 2006 was used as a reference. Building values were transferred to replacement costs as of 2006 by applying the building price index published by the German Federal Statistical Office (Statistisches Bundesamt, 2009).
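The loss ratio calculation described above can be sketched as follows. This is a minimal illustration only: the function names and all numbers, including the correction factor, are assumptions made for the example and are not taken from the VdS guideline or from the survey data.

```python
# Illustrative sketch of the loss ratio arithmetic; names and numbers are hypothetical.

def replacement_cost(value_mark_1914: float, correction_factor: float) -> float:
    """Transfer a building value in "Mark 1914" to replacement cost of a reference year."""
    return value_mark_1914 * correction_factor


def loss_ratio(building_loss: float, building_value: float) -> float:
    """Loss ratio = reported building loss divided by building replacement value."""
    if building_value <= 0:
        raise ValueError("building value must be positive")
    return building_loss / building_value


# Assumed example: a building valued at 20 000 Mark 1914 and an
# assumed correction factor of 12.5 for the reference year.
value_2006 = replacement_cost(20_000.0, 12.5)
ratio = loss_ratio(40_000.0, value_2006)
print(value_2006, ratio)
```

Expressing losses relative to building value in this way makes cases with very different building sizes comparable.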
The recurrence intervals of the described flood events at the specific locations of the damaged objects were assigned to the damage cases as the best possible estimation for flood probability. This approach does not treat flood events as temporally and spatially coherent, based on a stationary uniform flood probability (e.g. "Millennium flood"). It considers the variations in return period along the river network, but does not differentiate the probability of being flooded for objects located at the same river stretch. To be as accurate as possible, the recurrence intervals had to be calculated for the highest possible spatial resolution. For all gauges in the study areas where discharge information was obtainable, the annual maximum series (AMS) were derived from the discharge data. An estimation of the flood recurrence interval was done for all time series with records of at least 30 (hydrological) years. The Generalised Extreme Value (GEV) distribution function was fitted to the AMS of more than 120 gauges using the L-moment method. The resulting recurrence intervals for maximum annual discharges in 2002, 2005 and 2006 were assigned to the respective catchments of each gauge. Boundaries for areas or river reaches with the same recurrence interval were defined by the first major tributary downstream of the gauge and include ungauged tributaries that discharge into the river in between. The catchment boundaries were taken from the CCM River and Catchment database for Europe (Vogt et al., 2007). At the confluence of major affected rivers (e.g. Elbe and Mulde rivers at Dessau, Saxony-Anhalt), official reports and other sources were used as complementary information to decide which river contributed most to the reported losses.
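A GEV fit by L-moments and the resulting recurrence interval can be sketched as below. The sketch assumes Hosking's widely used approximation for the shape parameter and a synthetic annual maximum series; the exact estimator settings and the discharge data of the study are not given in the text, so all values here are illustrative.

```python
import math

# Sketch of a GEV fit by L-moments (Hosking's approximation, assumed here).


def sample_l_moments(values):
    """Return (l1, l2, t3) from unbiased probability-weighted moments."""
    x = sorted(values)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three values")
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3 / l2


def fit_gev_lmom(values):
    """Return (location xi, scale alpha, shape k) in Hosking's parametrisation."""
    l1, l2, t3 = sample_l_moments(values)
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    if abs(k) < 1e-6:  # Gumbel limit of the GEV
        alpha = l2 / math.log(2.0)
        return l1 - 0.5772156649 * alpha, alpha, 0.0
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    xi = l1 - alpha * (1.0 - g) / k
    return xi, alpha, k


def recurrence_interval(q, xi, alpha, k):
    """Recurrence interval T = 1 / (1 - F(q)) in years for an annual maximum q."""
    if k == 0.0:
        f = math.exp(-math.exp(-(q - xi) / alpha))
    else:
        term = 1.0 - k * (q - xi) / alpha
        if term <= 0.0:  # q lies outside the support of the fitted GEV
            return math.inf if k > 0 else 1.0
        f = math.exp(-term ** (1.0 / k))
    return math.inf if f >= 1.0 else 1.0 / (1.0 - f)


# Synthetic 60-year series built from Gumbel quantiles (illustrative only).
ams = [500.0 - 150.0 * math.log(-math.log((i + 0.5) / 60.0)) for i in range(60)]
xi, alpha, k = fit_gev_lmom(ams)
median_q = 500.0 - 150.0 * math.log(-math.log(0.5))
t_median = recurrence_interval(median_q, xi, alpha, k)
t_600 = recurrence_interval(600.0, xi, alpha, k)
t_900 = recurrence_interval(900.0, xi, alpha, k)
print(round(t_median, 2), round(t_600, 1), round(t_900, 1))
```

In the study, the peak discharge of 2002, 2005 or 2006 at a gauge would take the place of the example discharges, and the resulting interval would be assigned to the catchment of that gauge.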

Integrated data set
The survey data sets were merged and supplemented with additional information on building values and recurrence intervals. The complete data set contains 2158 residential loss cases. The spatial distribution of gauges and loss cases is illustrated in Fig. 3.
Cases where no information about monetary losses to buildings was given had to be excluded. The dataset for this analysis was thus reduced to 1327 household interviews. The number of cases with building loss information as well as information about water level and recurrence interval can be found in Table 1.

Methods
Analyses by Thieken et al. (2007) showed that water level is the most important factor that determines the extent of building losses. In a given data set that represents loss cases caused by a single flood event, the different water levels form a unique pattern of distribution (distribution curve) that depends on the spatial distribution of settlements in the flooded area and on geomorphology. In different events, this distribution curve can vary due to shifting event characteristics, even if the same region is affected (Merz, 2006). To eliminate these variations and analyse only the influence of recurrence interval, the complete data set is classified by water level and recurrence interval. The five water level classes are consistent with the classification in the model FLEMOps (Büchele et al., 2006; Thieken et al., 2008).

Recurrence interval and building losses
First, building losses and building loss ratios were correlated with recurrence intervals for all 1308 cases. The Spearman-Rho correlation coefficient was used, and its significance was tested. The correlation is interpreted as significant when the p-value does not exceed the level α = 0.05 (two-sided).
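The rank correlation used here can be sketched in a few lines. The data below are invented for illustration, and the p-value uses a large-sample normal approximation, which is an assumption of this sketch and is only adequate for sample sizes such as the one analysed here, not for the tiny example.

```python
import math

# Sketch of Spearman's rank correlation with average ranks for ties.


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed samples."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def approx_two_sided_p(rho, n):
    """Large-sample approximation: the t statistic is treated as standard normal."""
    if abs(rho) >= 1.0:
        return 0.0
    t = rho * math.sqrt((n - 2) / (1.0 - rho * rho))
    return math.erfc(abs(t) / math.sqrt(2.0))


# Hypothetical recurrence intervals (years) and loss ratios.
recurrence = [1, 2, 3, 4, 5, 6]
loss_ratios = [0.02, 0.01, 0.04, 0.03, 0.06, 0.05]
rho = spearman_rho(recurrence, loss_ratios)
p_value = approx_two_sided_p(rho, len(recurrence))
print(round(rho, 4), p_value)
```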
For the classification of recurrence interval we looked for significant breaks in the distribution of loss ratios along all cases, ordered according to their recurrence intervals. By applying the Epanechnikov kernel (local linear regression with 40% of standard bandwidth), potential breakpoints were identified. For the loss cases within each resulting class no further positive trends were found. The class borders were interpreted as significant changes in the influence of flood recurrence interval on loss extent or damage influencing factors.
The Kruskal-Wallis test (H-test) for independence was performed for all water level classes with the significance level set at α = 0.05. Afterwards the Spearman-Rho correlation coefficient between building losses and recurrence intervals was calculated for each water level class. The value of the correlation coefficient and its significance determined whether analysing the influence of the "extremeness" of a flood event on loss extent was reasonable and statistically valid.
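The H-test for one water level class can be sketched as follows. The sketch assumes exactly three recurrence interval classes, so that the chi-square approximation has two degrees of freedom and the p-value has the closed form exp(-H/2); the loss values are invented.

```python
import math

# Sketch of the Kruskal-Wallis H statistic with tie correction.


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic over the pooled ranks of all groups, corrected for ties."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted, pos = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[pos:pos + len(g)])
        weighted += rank_sum ** 2 / len(g)
        pos += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    return h / (1.0 - ties / (n_total ** 3 - n_total))


def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2.0)


# Hypothetical losses for three recurrence interval classes in one water level class.
groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h_stat = kruskal_wallis_h(groups)
p_val = p_value_three_groups(h_stat)
print(round(h_stat, 3), round(p_val, 4), p_val <= 0.05)
```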

Dimensions of recurrence interval (flood frequency)
For a number of items it was tested how they are linked to recurrence interval. The selection of these items followed Thieken et al. (2005). Only attributes that were obtained in both the 2003 and the 2006 survey, or indexes that could be derived from answers in both surveys in an analogous manner, were considered. Table 2 gives an overview of the selected factors, their scale and measure, and whether they are an impact or resistance factor.
Principal Component Analysis (PCA) with varimax rotation was used to aggregate the number of items to a smaller number of dimensions (principal components). The number of components is limited to those components with eigenvalues exceeding one. Recurrence interval is excluded from this kind of analysis. After conducting the PCA, the resulting components were correlated (Spearman-Rho correlation coefficient) with the unclassified recurrence interval data. Significant correlations were interpreted in terms of the mutual connections between flood frequency and damage influencing dimensions.
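The component selection rule (eigenvalues exceeding one) can be illustrated with a short sketch. The eigenvalues below are invented for twelve standardised items and are not the values obtained in the study; for a correlation matrix the eigenvalues sum to the number of items, which gives the share of explained variance.

```python
# Sketch of the eigenvalue-greater-than-one rule; the eigenvalues are hypothetical.


def retain_components(eigenvalues, threshold=1.0):
    """Return the retained eigenvalues and their share of the total variance."""
    retained = [ev for ev in eigenvalues if ev > threshold]
    share = sum(retained) / sum(eigenvalues)
    return retained, share


# Assumed eigenvalues of a 12 x 12 correlation matrix (they sum to 12).
eigenvalues = [2.4, 1.8, 1.4, 1.1, 0.95, 0.9, 0.8, 0.7, 0.6, 0.55, 0.45, 0.35]
retained, explained_share = retain_components(eigenvalues)
print(len(retained), round(explained_share, 3))
```

The rotation step and the correlation of component scores with recurrence interval are not covered by this sketch.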

Flood loss modelling approaches
To evaluate whether including recurrence interval in modelling flood losses really improves the accuracy of results, flood loss estimation models that take into account recurrence interval were developed, validated and compared to models that do not include recurrence interval. Different modelling approaches as well as modified versions of these approaches that consider recurrence interval were analysed.
We used linear, square root and polynomial stage-damage curves. All these approaches are used in practice and their performance has been tested in previous studies (e.g. MURL, 2000; IKSR, 2001; Hydrotec, 2002). The fourth model tested is the flood loss estimation model for the private sector (FLEMOps) (Thieken et al., 2008). In the basic FLEMOps model, five water level classes, three building types and two building quality classes are used as input. In the extended version (FLEMOps+), the combinations of three grades of flood water contamination and three classes describing the extent of private precautionary measures are added to the basic model (Thieken et al., 2008). All models were derived from the integrated data set, according to their basic principles. These loss model types were modified by including recurrence interval as an additional parameter in the respective loss estimation functions.
For the linear, square root and polynomial models, the integration of recurrence interval was done by simply calculating a separate loss function for each recurrence interval class. For this purpose, the loss cases were classified into three recurrence interval classes (see Sect. 3.1). For each class, a regression function was derived that describes the correlation between building loss ratio and recurrence interval for all cases in the class. The naming convention for all models that consider recurrence interval is the name of the basic model and an additional "r" for recurrence interval.
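Deriving one loss function per recurrence interval class can be sketched as below for the square root model. The sketch assumes a stage-damage form D = a * sqrt(h) fitted through the origin by least squares, with water level h as the predictor; the class labels, the functional details and the data are assumptions for illustration and are not the functions derived in the study.

```python
import math

# Sketch: one square-root stage-damage function per recurrence interval class.


def fit_sqrt_coefficient(cases):
    """Least-squares coefficient a for loss_ratio = a * sqrt(water_level)."""
    numerator = sum(math.sqrt(h) * d for h, d in cases)
    denominator = sum(h for h, _ in cases)  # sum of sqrt(h) squared
    return numerator / denominator


def fit_per_class(cases_by_class):
    """Return a dict mapping each recurrence interval class to its coefficient."""
    return {cls: fit_sqrt_coefficient(cases) for cls, cases in cases_by_class.items()}


def estimate_loss_ratio(coefficients, recurrence_class, water_level):
    """Apply the class-specific function to a water level (cm above ground)."""
    return coefficients[recurrence_class] * math.sqrt(water_level)


# Hypothetical (water level in cm, loss ratio) pairs per recurrence class.
cases_by_class = {
    "short": [(25.0, 0.05), (100.0, 0.10)],
    "long": [(25.0, 0.10), (100.0, 0.20)],
}
coefficients = fit_per_class(cases_by_class)
estimate = estimate_loss_ratio(coefficients, "long", 64.0)
print(coefficients, round(estimate, 3))
```

With such class-specific functions, the same water level yields a higher estimated loss ratio in the class of longer recurrence intervals, which is the effect the modified models are meant to capture.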

The new Flood Loss Estimation MOdel FLEMOps+r
For FLEMOps+, a different approach was necessary as this extended model already considers scaling factors for each combination of contamination and precautionary measures.
The parameters recurrence interval, extent of contamination and precautionary measures are not independent from each other, but interrelated. Hence multiplying multiple scaling factors would have exaggerated the influence of these factors. A complete classification includes combinations of all three factors without being biased by their interrelation. For the development of FLEMOps+r, a revision of the model derivation approach was necessary: multiplying a basic function with a scaling factor biases the outcome if the parameters on which the function and the scaling factor are based are not independent. The parameter "water level" in the basic loss function is correlated with each of the parameters which contribute to the scaling factor. Therefore, the scaling factor was modified by calculating the ratio of relative building loss (derived from the interview) to a reference relative building loss estimate for each case. The reference estimates were taken from the application of the basic FLEMOps that only considers water level and building characteristics. Eq. (1) shows the formula used in FLEMOps for estimating the loss ratio D_Ej for a case j with an inundation depth in the water level class h, building type t and building quality q.
The ratios of all cases in each parameter combination class were averaged. The result for the respective class produced the new scaling factor. This approach removed the bias caused by the unequal distribution of water levels in the respective classes. We obtained a new loss estimation model, FLEMOps+r, that considers water level, contamination, precaution and recurrence interval as well as building characteristics without biasing the estimation results due to the correlation of the included parameters, as given in Eq. (2).
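The derivation of the scaling factors described above can be sketched as follows. The data structure, class labels and numbers are hypothetical; only the procedure (ratio of observed to reference loss ratio per case, averaged per parameter combination class) follows the text.

```python
from collections import defaultdict

# Sketch of class-wise scaling factors; all case data are invented.


def scaling_factors(cases):
    """Average observed/reference loss ratio per parameter combination class.

    Each case is (class_key, observed_loss_ratio, reference_estimate), where
    class_key combines contamination, precaution and recurrence interval class
    and the reference estimate comes from the basic model.
    """
    ratios = defaultdict(list)
    for class_key, observed, reference in cases:
        ratios[class_key].append(observed / reference)
    return {key: sum(vals) / len(vals) for key, vals in ratios.items()}


def scaled_estimate(reference, class_key, factors):
    """Scale the basic-model estimate with the factor of the case's class."""
    return reference * factors[class_key]


# Hypothetical class keys: (contamination, precaution, recurrence class).
cases = [
    (("none", "good", "short"), 0.04, 0.05),
    (("none", "good", "short"), 0.06, 0.10),
    (("heavy", "none", "long"), 0.12, 0.10),
    (("heavy", "none", "long"), 0.30, 0.20),
]
factors = scaling_factors(cases)
estimate = scaled_estimate(0.10, ("heavy", "none", "long"), factors)
print(factors, round(estimate, 3))
```

Because each ratio is taken relative to a reference estimate that already accounts for water level, an uneven distribution of water levels across the classes does not carry over into the factors.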

Validation and model comparison
The resulting models were validated by using the leave-one-out cross-validation method. This technique created an effectively independent sample from the existing data set. One loss case was removed and the model was derived from the remaining 1326 loss cases. Then, the parameter combination of the removed case was fed into the model and the relative loss for this case was estimated. This procedure was repeated for each loss case, i.e. 1327 times. The overall model error was calculated from the differences between the estimated values and the actual relative loss from the interview. Afterwards, a comparison of the errors for all models was conducted.
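The leave-one-out procedure can be sketched with a deliberately simple stand-in model. The model below (mean loss ratio per water level class) and the four cases are assumptions for illustration; in the study, each of the nine model variants would be re-derived in the same loop.

```python
# Sketch of leave-one-out cross-validation with a hypothetical class-mean model.


def fit_class_means(cases):
    """Derive a toy model: mean loss ratio per class plus an overall fallback."""
    sums, counts = {}, {}
    for cls, ratio in cases:
        sums[cls] = sums.get(cls, 0.0) + ratio
        counts[cls] = counts.get(cls, 0) + 1
    overall = sum(r for _, r in cases) / len(cases)
    return {"class_means": {c: sums[c] / counts[c] for c in sums}, "overall": overall}


def predict(model, cls):
    """Estimate the loss ratio for a class, falling back to the overall mean."""
    return model["class_means"].get(cls, model["overall"])


def leave_one_out_errors(cases, fit, predict_fn):
    """Estimate each case from a model derived without it; return estimate - observed."""
    errors = []
    for i, (cls, observed) in enumerate(cases):
        training = cases[:i] + cases[i + 1:]
        model = fit(training)
        errors.append(predict_fn(model, cls) - observed)
    return errors


# Hypothetical (water level class, loss ratio) cases.
cases = [("A", 0.1), ("A", 0.3), ("B", 0.5), ("B", 0.7)]
errors = leave_one_out_errors(cases, fit_class_means, predict)
print([round(e, 3) for e in errors])
```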
Nat. Hazards Earth Syst. Sci., 10, 2145-2159, 2010, www.nat-hazards-earth-syst-sci.net/10/2145/2010/
To additionally judge how well the models performed for special areas of interest, e.g. how well they estimated losses for high water levels (cf. Thieken et al., 2008) or low probability events, a bootstrapping algorithm was applied to the building loss values from the interviews. The sampling with replacement was done 10 000 times. A confidence interval of 95% was calculated from the bootstrapped sample for each water level class and each recurrence interval class, respectively. The mean relative losses for the required classes, as estimated by each model, could then be analysed in terms of how well they fitted into the confidence interval, and how close they were to the mean relative loss per class, as taken from the interviews.
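The bootstrapping step can be sketched as a percentile interval for the class mean. The percentile method, the seed and the loss ratios are assumptions of this sketch; the text only specifies 10 000 resamples with replacement and a 95% confidence interval.

```python
import random

# Sketch of a percentile bootstrap interval for the mean loss ratio of one class.


def bootstrap_mean_ci(values, n_resamples=10_000, level=0.95, seed=0):
    """Return (lower, upper) bounds of the bootstrapped mean."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_resamples))
    tail = (1.0 - level) / 2.0
    lower = means[int(tail * n_resamples)]
    upper = means[int((1.0 - tail) * n_resamples) - 1]
    return lower, upper


def within_interval(model_mean, interval):
    """Check whether a model's mean estimate falls inside the interval."""
    return interval[0] <= model_mean <= interval[1]


# Hypothetical loss ratios reported for one water level class.
observed = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.22, 0.30, 0.45]
sample_mean = sum(observed) / len(observed)
interval = bootstrap_mean_ci(observed)
print(interval, within_interval(sample_mean, interval))
```

A model's mean estimate for the class can then be passed to `within_interval` to see whether it lies inside the bootstrapped bounds.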
Nine model variants were compared: the linear, square root and polynomial models in their basic form and with separate functions for each recurrence interval class, and the FLEMOps model family. The latter comprises the basic FLEMOps model based on water level classes and building characteristics, "FLEMOps+" with additional information about contamination and precaution, and the newly developed "FLEMOps+r" with water level, building type and quality, contamination, precaution and recurrence interval as influencing parameters. The model quality was rated by comparing the error statistics of each model. Hence the mean bias error (MBE), the mean absolute error (MAE) and the root mean square error (RMSE) were calculated from the estimation results. Furthermore, the average estimates per recurrence interval class and per water level class were plotted against the respective average losses from the interview answers. Comparing estimated class averages favours models that have separate functions for each of these classes and hence produce hardly any deviations between interview answers and estimates. Multi-parameter models (FLEMOps+ and FLEMOps+r) use only one (scaled) basic function and therefore could only come close to the results of the other models at best.
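The three error statistics can be computed as in the sketch below. The sign convention (estimate minus observation, so that a positive MBE indicates overestimation) is an assumption, and the numbers are invented.

```python
import math

# Sketch of MBE, MAE and RMSE for estimated versus observed loss ratios.


def error_statistics(estimated, observed):
    """Return (MBE, MAE, RMSE) with errors defined as estimate - observation."""
    errors = [e - o for e, o in zip(estimated, observed)]
    n = len(errors)
    mbe = sum(errors) / n
    mae = sum(abs(err) for err in errors) / n
    rmse = math.sqrt(sum(err * err for err in errors) / n)
    return mbe, mae, rmse


# Hypothetical model estimates and interview-based loss ratios.
estimated = [0.1, 0.2, 0.4]
observed = [0.2, 0.2, 0.2]
mbe, mae, rmse = error_statistics(estimated, observed)
print(round(mbe, 4), round(mae, 4), round(rmse, 4))
```

The MBE shows systematic over- or underestimation, while MAE and RMSE measure the spread of the errors, with RMSE weighting large deviations more strongly.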

Recurrence interval and building losses
The Spearman-Rho correlation coefficient between specific recurrence interval per loss case and absolute building loss was 0.333, while the respective value for building loss ratio was 0.344. Both results were significant at the 0.01 level (two-sided). This did not necessarily indicate a special influence of this parameter, as low probability events also cause a higher share of deep inundation. Hence, the correlations were also calculated for separate water level classes to eliminate the influence of inundation depth. Table 1 shows how the data set was split into 15 subgroups by classifying the loss cases by recurrence interval and water level. Most subgroups contained enough cases to allow further analyses. Only the combination of high water levels and high flood frequencies occurred quite rarely.
For monetary losses, the recurrence interval classes were significantly independent in four water level classes; the exception was the class "21-60 cm", where the result was only significant at the 90% level (Table 3). For loss ratios, independence was given for all classes at the 95% level. The Spearman-Rho correlation coefficients for each water level class are not as pronounced as the results for the whole sample. Still, a positive correlation that is significant at the 95% level was found in all classes for both absolute and relative losses (Table 3).
The findings were further analysed by comparing average relative losses (means, medians, quartiles) for the 15 subgroups. Figure 4 illustrates the results.
All subgroups with ten or fewer cases, i.e. the groups with a recurrence interval of less than ten years and a water level of more than 20 cm, did not allow reliable conclusions. These groups were not excluded, but marked by the white bars as not significant. All other groups contained at least 33 and a maximum of 268 loss cases. Mean loss ratios increased with higher water levels, as expected, but there was also an almost steady increase of average building loss ratio in each water level class for increasing recurrence intervals: regardless of water level, average building losses were higher in extreme flood events than in more frequent events.

Dimensions of recurrence interval
The results of a principal component analysis (pairwise deletion, available case approach) with the twelve selected items (loss information and recurrence intervals excluded) were as follows: the first four principal components, with eigenvalues higher than one, explained about 56% of the variance. Although this was not a very good result, we still considered the reduction in dimensionality from twelve to four with an explained variance of nearly 56% as sufficient. The share of explained variance per component was levelled to a certain degree by applying the varimax rotation to the matrix.
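As a rough plausibility check on these figures, and assuming the analysis was run on the correlation matrix of the twelve standardised items (the usual setting for the eigenvalue-greater-than-one criterion, but an assumption here), the eigenvalues sum to twelve, so the four retained components together carry an eigenvalue sum of roughly 6.7:

```latex
\[
\frac{\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4}{\sum_{i=1}^{12} \lambda_i}
  \approx 0.56,
\qquad
\sum_{i=1}^{12} \lambda_i = 12
\;\Longrightarrow\;
\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 \approx 0.56 \times 12 \approx 6.7 .
\]
```

The varimax rotation redistributes this total among the four components but leaves their sum unchanged.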
Table 4 shows the contribution of each parameter to the single components (i.e. the correlation between parameter and component). Values less than .5 were left out of the table for better readability; values higher than .5 describe the characteristics of the respective components and are marked in bold. For item eleven there was no clear picture: the highest component loading was only .458, for component 1. For information, the Spearman-Rho correlation coefficient between each item and recurrence interval is also given in the table.
We created generic terms that denote the significance of each component. The names are based on the contribution of the selected items to the four components as well as on the thematic proximity of items that were closely related to one component:
Component 1: Reaction time (of the river system, the early warning system, the population)
Component 2: Load (object specific flood impact characteristics)
Component 3: Response (mitigation measures)
Component 4: Preparedness (and experience)
These terms were used to interpret the results of the following analysis. There are moderate but highly significant correlations between recurrence interval and three of the four components. Recurrence interval was negatively correlated with components 3 (Response) and 4 (Preparedness), which led to the assumption that mitigation is less pronounced in low probability events. We also concluded that flood experience is related to flood probability, as living in an area affected by frequent floods leads to a high level of flood experience. The positive correlation for component 2 hints at more pronounced flood impact characteristics in less frequent events. Component 1 (Reaction) showed no significant correlation with recurrence interval at all.
Findings by Siegrist and Gutscher (2006) showed the importance of experience in triggering mitigation behaviour. The same authors (Siegrist and Gutscher, 2008) showed that only people who were affected by a flood can realistically assess the consequences of flooding. With regard to these findings, it was surprising that the correlation with component 4 (preparedness/experience), which could have logically explained the influence of flood probability on losses, did not stand out at all. This was further confirmed by the fact that no significant correlation could be found between recurrence interval and the flood experience indicator.
Recurrence interval was not clearly associated with only one component, but almost equally with three; thus, various parameters evidently change with variations in flood probability. For this reason, recurrence interval cannot substitute (or be substituted by) one or a limited number of related parameters in flood loss modelling, but can complement other parameters already included in existing loss estimation models. Hence, the obvious step was to include recurrence interval directly in the loss models.

Flood loss models
The linear, square root and polynomial models feature continuous functions. The stage-damage curves in Fig. 5 show the basic models and the extended models, i.e. those including recurrence interval (marked with r), as derived from the integrated dataset. The plotted building loss cases form the base for model derivation. Their huge variability demonstrates the need to incorporate more parameters than only water level in flood loss modelling.
To integrate recurrence interval in the FLEMOps+ loss estimation model, the influence of combinations of recurrence interval and other important factors was quantified for our dataset: combining precaution, contamination and recurrence interval classes resulted in 27 classes, many of which contained too few cases for deriving reasonable results (Table 5). A closer look at the distribution of loss cases among classes showed that some combinations are much more likely than others. This finding was further supported by the fact that classes with many cases are clustered in the parameter space formed by the three factors. Further analyses were limited to classes with a certain minimum number of cases; these still covered a vast majority of cases and hence will work for most loss estimations.
The limit was set at a minimum of 30 loss cases per class. With this selection, less than 30% of all classes represented 73% of all cases. When a class contained fewer than 30 members, it was aggregated with neighbouring classes based on the values of the combined parameters. All classes in which the expression of only one of the three parameters differed from the value of the same parameter in the original class by ±1 were added to this class (light grey). The selected factor combination classes and the respective scaling factors can be seen in Table 5.
After the aggregation, only three classes (dark grey) remained where, even after adding all cases from neighbouring classes, the number of members was still fewer than 30. We concluded that these parameter combinations (heavy contamination in a frequent event) are highly unlikely. For calculations where such a combination occurred, no scaling factor was applied (i.e. in the model runs the scaling factor is set to 1).
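The aggregation rule can be sketched as follows. Each of the three factors (precaution, contamination, recurrence interval) is coded here as an ordinal level 0, 1 or 2, giving 27 classes; a class with fewer than 30 cases is pooled with every class that differs by ±1 in exactly one factor, and the factor falls back to 1 if the pooled class is still too small. The formula for the scaling factor itself is an assumption made for illustration (pooled mean loss ratio divided by the overall mean loss ratio); the level coding, function names and toy data are hypothetical as well.

```python
from itertools import product
from statistics import mean

MIN_CASES = 30
LEVELS = (0, 1, 2)  # three ordinal levels per factor -> 3**3 = 27 classes


def neighbours(cls):
    """Classes that differ from cls by exactly +/-1 in exactly one factor."""
    result = []
    for axis in range(3):
        for step in (-1, 1):
            level = cls[axis] + step
            if level in LEVELS:
                candidate = list(cls)
                candidate[axis] = level
                result.append(tuple(candidate))
    return result


def scaling_factors(losses_by_class):
    """losses_by_class maps (precaution, contamination, recurrence) levels
    to a list of building loss ratios observed in that class."""
    all_losses = [x for losses in losses_by_class.values() for x in losses]
    overall_mean = mean(all_losses)
    factors = {}
    for cls in product(LEVELS, repeat=3):
        pooled = list(losses_by_class.get(cls, []))
        if len(pooled) < MIN_CASES:
            # Too few cases: add the cases of all neighbouring classes.
            for nb in neighbours(cls):
                pooled.extend(losses_by_class.get(nb, []))
        if len(pooled) < MIN_CASES:
            factors[cls] = 1.0  # still too few cases: no scaling applied
        else:
            # ASSUMPTION: factor = pooled class mean / overall mean.
            factors[cls] = mean(pooled) / overall_mean
    return factors


# Invented toy data (hypothetical), not the values behind Table 5.
toy_losses = {
    (0, 0, 0): [0.1] * 40,
    (1, 0, 0): [0.2] * 10,
    (2, 2, 0): [0.3] * 5,
}
factors = scaling_factors(toy_losses)
```

In this toy example the class (1, 0, 0) borrows the 40 cases of its neighbour (0, 0, 0), while (2, 2, 0) has no populated neighbours and therefore keeps a factor of 1.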

Validation and model comparison
The basic square root, polynomial and linear function models and their respective versions that include recurrence interval, as well as FLEMOps and its extensions, were compared and cross-validated. Each model was derived 1327 times, leaving out one loss case in every run. The error statistics were calculated for all cases that were estimated by all models (Table 6).
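The leave-one-out procedure can be sketched for a single model. The example assumes a simple square root stage-damage function without intercept, loss ratio = a · sqrt(water level), fitted by least squares; this functional form, the function names and the toy data are assumptions for illustration and need not match the models compared in Table 6.

```python
from math import sqrt


def fit_sqrt_model(depths, losses):
    """Least-squares coefficient a for loss = a * sqrt(depth), no intercept."""
    numerator = sum(y * sqrt(h) for h, y in zip(depths, losses))
    denominator = sum(depths)  # equals the sum of sqrt(h) ** 2
    return numerator / denominator


def leave_one_out_errors(depths, losses):
    """Refit the model n times, each time predicting the one held-out case.

    Returns the signed errors (estimate minus observation).
    """
    errors = []
    for i in range(len(depths)):
        train_depths = depths[:i] + depths[i + 1:]
        train_losses = losses[:i] + losses[i + 1:]
        a = fit_sqrt_model(train_depths, train_losses)
        errors.append(a * sqrt(depths[i]) - losses[i])
    return errors


# Invented toy data (hypothetical): losses are exactly 0.01 * sqrt(depth in cm).
toy_depths = [25.0, 64.0, 100.0, 144.0]
toy_losses = [0.05, 0.08, 0.10, 0.12]
loo_errors = leave_one_out_errors(toy_depths, toy_losses)
```

Because every case is predicted by a model that never saw it, the resulting errors describe out-of-sample performance; with 1327 cases the fit is repeated 1327 times per model.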
The newly developed FLEMOps+r had the smallest absolute and root mean square errors. The absolute value of the MBE was smallest for the square root model with separate regression functions per recurrence interval class. The MBE as a signed measure shows whether the models tend to overestimate or underestimate building loss ratios. The MBE for most models was quite small, with the exception of FLEMOps+, which overestimated relative losses to a higher degree than the other models. Models with separate functions for each recurrence interval class in particular, but also all other models, showed largely negligible bias.
Bias is also one component of the mean squared error (and hence of the RMSE), as the MSE equals the variance of the errors plus the square of the mean error. Minimizing the mean squared error therefore implicitly minimizes the sum of squared bias and error variance. We interpreted the RMSE results in Table 6 to mean that FLEMOps+r offered the best "compromise" between reduced error variance and acceptable bias, followed by the square root T model (separate functions per recurrence interval class). As the RMSE is in the same units as the data, i.e. building loss ratios, it gives an impression of the size of a "typical" error. This is similar to the mean absolute error (MAE). The MAE for all models is slightly smaller than the respective RMSE, because the RMSE is more sensitive to outliers and puts more weight on them. In terms of model comparison, the MAE results reproduced the findings from the interpretation of the RMSE.
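The relation between the three error measures can be made concrete with a short sketch. It computes MBE, MAE and RMSE from paired estimates and observations and exposes the decomposition of the MSE into error variance plus squared mean error. The sign convention (estimate minus observation, so that a positive MBE means overestimation), the function name and the toy values are assumptions.

```python
from math import sqrt


def error_statistics(estimates, observations):
    """Return MBE, MAE, RMSE, MSE and the population variance of the errors."""
    # ASSUMPTION: error = estimate - observation (positive = overestimation).
    errors = [e - o for e, o in zip(estimates, observations)]
    n = len(errors)
    mbe = sum(errors) / n
    mae = sum(abs(err) for err in errors) / n
    mse = sum(err * err for err in errors) / n
    variance = sum((err - mbe) ** 2 for err in errors) / n
    return {
        "MBE": mbe,
        "MAE": mae,
        "RMSE": sqrt(mse),
        "MSE": mse,
        "error_variance": variance,
    }


# Invented toy loss ratios (hypothetical), not values from Table 6.
toy_estimates = [0.10, 0.20, 0.30, 0.40]
toy_observations = [0.12, 0.15, 0.30, 0.50]
stats = error_statistics(toy_estimates, toy_observations)
```

For any set of errors, `stats["MSE"]` equals `stats["error_variance"] + stats["MBE"] ** 2` up to rounding, and the MAE never exceeds the RMSE, which is why a single large error widens the gap between the two.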
The average estimation of the relative loss per recurrence interval class and per water level class was calculated from the bootstrapped sample for all models. Figures 6 and 7 show the results for the three recurrence interval classes and the five water level classes, respectively. We plotted the mean estimates per class against the mean relative losses taken from the interviews. The error bars give the 95% confidence intervals. Values outside of this range point to an underestimation or overestimation in the respective classes. All models without recurrence interval underestimated relative losses for events with high recurrence intervals and overestimated losses for more probable events, with the exception of FLEMOps+, which overestimated losses for all recurrence interval classes. This bias could be eliminated by considering recurrence interval. To get a more complete picture of the performance and quality of the different models, the average estimates were also compared to the mean relative losses from the interviews for all water level classes (Fig. 7).
The results for the water level classes did not show as clear-cut a picture as the results for the recurrence interval classes. The linear models overestimated relative losses for low water levels and underestimated them for the 101-150 cm water level class. The square root models as well as the basic polynomial model overestimated relative losses for moderate water levels (61-100 cm). FLEMOps showed the best results. Average estimates by the modified polynomial T model were also within the confidence interval for all classes. The FLEMOps+ model showed a tendency to overestimate for high water levels, while FLEMOps+r was the only model that underestimated losses for low water levels.
The comparison by water level class was closer to a fair judgement of the performance of the loss models in terms of estimating losses, because water level is considered in all models. Still, a cautious interpretation of the results is required because some models, e.g. the basic FLEMOps model, are based on average losses per water level class and therefore were highly favoured by this approach. Statements about model quality that are based on the analysis of the error statistics give less illustrative but more reliable results.

Conclusions
A highly significant positive correlation was found between recurrence interval and loss extent. This correlation could not be fully explained by different water levels. Building loss ratios rise with decreasing probability of the damaging flood event at the object location. Recurrence interval is among the most important damage influences and hence, loss estimations should not apply a uniform loss function to low probability and high probability flood events.
We could not identify a single parameter or a limited range of thematically related parameters that changed with, and therefore could be explained by, recurrence interval. In fact, different parameters contributed to the main principal components that were correlated with recurrence interval. These parameters were rather diverse thematically. Recurrence interval characterises flood impact more generally and therefore cannot replace, but can complement, other main impact and resistance factors in flood loss modelling. Consequently, recurrence interval should be used as an additional parameter in currently available loss models. It is easy to obtain from discharge time series and improves the applicability of models to events of different likelihood.
The estimation of separate mean building loss ratios for five water level classes and three recurrence interval classes, and the comparison to the respective mean loss ratios taken from the interview answers, showed that those models which consider all combinations of both parameters produced the least biased results. Error analysis helped to rate the estimation accuracy of loss models. It showed that including more damage influencing parameters in loss modelling improves the accuracy of the estimations if the interdependencies of the parameters are incorporated in the loss functions. The estimation of building losses can be significantly enhanced if the likelihood of the damaging event is considered in the modelling approach. The proposed multi-parameter model FLEMOps+r performs particularly well.
The basic advantage of including recurrence interval is that the object-oriented estimation of losses is supplemented by a new dimension on the event scale: so far only the impact and resistance parameters that cause or prevent damages at the object had been considered in loss modelling, but now ever-changing flood characteristics are also factored in to some extent. Very often there is a discrepancy between the situation that caused the damages used for model development and the situation the model is applied to. Considering event probability reduces this discrepancy.
Employing the proposed model is especially useful if multiple event scenarios are used, e.g. for comprehensive risk analyses. A set of impact scenarios is often differentiated by recurrence interval to cover the whole range of possible events. The estimation of risks should be more realistic if the loss models consider this differentiation between the impact scenarios.
The application of the proposed model has implications for risk management decisions in terms of cost-benefit analysis, as it decreases the tendency to underestimate the negative consequences of extreme floods and hence increases the weight of these events in risk analyses. On the other hand, the influence of long- and medium-term hazard changes (e.g. climate induced changes in flood frequency and magnitude) is reduced to some degree: under conditions of rising flood hazard, high magnitude events that become more frequent will have lower assigned damages, while under conditions of reduced flood hazard, events of a certain magnitude become less probable and accordingly more damage prone.
Despite the advances in data assessment and model development, there is still room for future research and improvement. Applying advanced statistical methods could help to elaborate in detail the complex interactions of damage influences in general and the connection between flood probability and damage generating parameters in particular. Empirically derived loss models usually suffer from a lack of information about damages caused by infrequent extreme events and hence are not very accurate in estimating the impact of such events. On the other hand, as is the case in our data set, if such an event occurs, it is much more likely that assessment campaigns are set up. In the present data set, frequent events are underrepresented. This data gap could be closed by establishing a framework for continuously assessing flood losses and thereby creating an up-to-date data set that describes flood damages in Germany (or elsewhere) representatively.

Fig. 1. Comparison of flood event averages: building loss ratios, water level classes, recurrence intervals (top to bottom) for cases with loss information.

Fig. 4. Comparison of average building loss ratios per water level and recurrence interval class.

Fig. 6. Model estimates and interview answers for three recurrence interval classes.

Fig. 7. Model estimates and interview answers for five water level classes.

Table 1. Building loss cases per recurrence interval and water level class combination. Italics: <= 10 loss cases in class; no further analyses.

Table 2. Selected items for principal component analysis.

Table 3. Spearman-Rho correlation coefficient for residential building loss and recurrence interval per water level class. Bold values indicate an asymptotic significance in the K-W-test at the 95% level. * Correlation significant at the 0.05 level (two-sided).

Table 4. Rotated component matrix; contribution of items to four principal components and Spearman-Rho correlation of components, items and recurrence interval. Method: principal component analysis; rotation method: varimax with Kaiser normalisation. Rotation converged in six iterations. ** Correlation significant at the 0.01 level (two-sided). * Correlation significant at the 0.05 level (two-sided).

Table 5. Parameter combinations, class averages and scaling factors.

Table 6. Error statistics for relative loss estimates of cases estimated by all models.