Beyond the stage-damage function: Estimating the economic damage on residential buildings from storm surges

Given the predicted global increase in extreme weather events, such as storm surges, the design of effective response strategies requires a very detailed and accurate understanding of the major factors driving damage costs. The costs of climate hazards are usually estimated using engineering approaches, which, based on different levels of building-specific information, link water inundation levels to the costs incurred by building owners. More recently, a number of scientific papers have pointed to the limitations of such approaches because they omit important information about key context-specific factors such as emergency response options and a range of social factors reflecting age and social networks in the affected communities. This study contributes to this growing literature by providing rigorous and detailed econometric estimates of damage costs for residential buildings resulting from a storm surge that impacted large parts of Denmark in December 2013. We collected a comprehensive data set consisting of insurance cost data, the characteristics of individual buildings (size, age, construction materials, heating source and distance from bodies of water), emergency services, previous experience with storm surges in the municipality and socio-economic factors. Our results indicate that the isolated effect of inundation depth on damage costs is highly sensitive to the inclusion of other explanatory variables. In our models the isolated effect of inundation depth is more than halved when our full set of control variables is included. Furthermore, our findings highlight the importance of controlling for spatial effects, such as the level of emergency services and socio-economic conditions. Discussing the transferability of our findings, we highlight key sensitivities when using our damage functions in other contexts.

These inclusions and the consideration of additional explanatory variables have also highlighted the need to consider intangible variables such as adaptive capacity and preparedness (Merz et al., 2004) as important determinants of the damage costs. Overall, the previous findings in the literature highlight the importance of including several explanatory variables when estimating the economic damage costs from flooding, including inundation depth, building construction materials, flood characteristics and the adaptive capacity of the affected population. Furthermore, previous studies also highlight the sensitive nature of damage cost estimates, discussing how transferable damage estimates are between storms, geography and over time (Cammerer et al., 2013; Merz et al., 2014), and highlighting the need to account for model uncertainty (Figueiredo et al., 2017). These studies are carried out especially within the fields of pluvial and fluvial flooding (e.g. Ootegem et al. (2015); Spekkers et al. (2014); Merz et al. (2004, 2010); Amadio et al. (2019); Pistrika et al. (2014); Jongman et al. (2012)). In this paper, the focus is on flooding from storm surges (coastal flooding), an area where the existing literature is scarce. Flooding from storm surges has different characteristics from pluvial and fluvial flooding in that, for example, the water is saltwater, potentially causing different types of damage. Transferring damage models estimated from pluvial flooding to coastal flooding might therefore provide incorrect expected economic damage costs.
To study the importance of geography or the spatial effect on damage costs, in our dataset we have collected storm damage costs from 29 different municipalities in Denmark that were affected by the same storm. This allows us to study how the same storm can give rise to different damage costs across municipalities by controlling for differences in, for example, emergency responses and socio-economic factors. Spatial effects have increasingly been recognized in both theoretical and applied econometric work (Anselin and Arribas-Bel, 2013; Kelejian and Prucha, 2010), but also specifically in relation to damage costs from flood events (Cammerer et al., 2013).
The novelty of this study lies in both its combination of different data sources and the variables it includes in the development of a multivariable econometric model that estimates the expected damage costs from storm surge-induced coastal flooding.
By coupling influencing variables from a comprehensive set of sources and sectors, including new insurance data, national emergency management data, flood simulations and building characteristics, we are able to control rigorously for a great number of parameters. In the literature all these have previously been identified separately as important determinants of damage costs but, to the best of our knowledge, have never been analyzed within the same econometric framework.

Materials and methods
To perform our econometric estimation of damage costs, we rely on several different sources of data, which are presented in the subsections below. The specification of the econometric models is given in Section 2.2.

Data
In this study, we combine variables from six unique datasets. Table 1 presents each of the included variables, their units and levels, and their dataset source. The following sections present each dataset individually.
[...] and sluices are closed during the flood simulations, as would be the case during a storm surge. To quantify the performance of our flood-model simulation, we evaluated accuracies for stand-alone residential buildings by spatially comparing the DSC data (insurance claims) with the flood hazard maps. We selected a hot-spot area near Roskilde Fjord for evaluation, as this represents a large share of the insurance payouts (see Figure 1). A three-step procedure was used for the evaluation. First, we assigned each of the stand-alone buildings to the closest water-level measuring station (three stations are located in this area).
[...] insurance claims are flooded in the model. However, we do not know whether these buildings actually experienced high water levels during the storm surge, but were better protected and therefore did not experience any economic losses. A general concern in using static flood models is that the flooded area may be over-estimated, as the model does not consider the [...] the model does not account for the natural infiltration and filling of drainage systems, which contribute to the over-estimates.
https://doi.org/10.5194/nhess-2020-30 Preprint. Discussion started: 16 March 2020. © Author(s) 2020. CC BY 4.0 License.

Building characteristics
The Danish Building and Housing Register (BBR) contains information on building and housing characteristics for all developed properties in Denmark. The register contains a large variety of variables for each building, such as type, year of construction, number of rooms, size, garage and garden size, as well as technical details like building materials and heating type (BBR, 2019).

Flood experience
As an indicator of the experience of flooding, we use insurance data from the DSC for previous storm surges from 1999-2008, aggregated at the municipality level. As shown in Table 1, the two indicators are i) the number of insurance claims and ii) an index number of the total compensation sum, with index 100 corresponding to approximately €10,700,000, which corresponds [...] insurance claims between municipalities, we use an index instead of the actual insurance claim. To account for multicollinearity when estimating our model, only the number of insurance claims is used.
[...] to uncertainties in these estimates, we only include time spent on a given action in the dataset. The data are at the address level. However, since the emergency services' activities will often benefit larger areas, the data points have been summarized for each Danish municipality. Data on emergency and rescue services therefore relate to the municipal level.
Within economics, multiple regression analysis using Ordinary Least Squares (OLS) is a widely used econometric method of constructing multi-variable models from empirical data (Wooldridge, 2013). The method has been applied in a few studies on the determinants of damage costs for flooding events on buildings (Ootegem et al., 2015), but in some cases the analyses lack sufficient data for the method to be firmly established (Komolafe et al., 2019; Romali et al., 2019). This indicates a need for additional studies using econometric methods to estimate the damage from flooding events. The general model is expressed as Equation 1.
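For reference, a minimal sketch of the general form of a multiple linear regression, assuming standard textbook notation with k explanatory variables (the symbols k and u are assumptions and are not taken from the source):

$$
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u
$$

Here y is the dependent variable, x_1 to x_k are the explanatory variables, \beta_0 is the intercept, \beta_1 to \beta_k are the slope coefficients and u is the error term.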

Social vulnerability indicators
where each of the slope coefficients is a partial derivative of y with respect to the x variable it multiplies. That is, holding all other x's fixed, β_1 = ∂y/∂x_1 (Wooldridge, 2013). Efficient and unbiased estimates for all parameters can be obtained as long as the basic OLS assumptions are respected. To ensure linearity in the relationship between the dependent and the independent variables, the model can be estimated as a log-log or log-linear model, which also makes it possible to assess the relative impact of variation in the independent variables on the dependent variable.

Variable selection and functional form
For each observation, the original dataset included more than eighty variables, with detailed variables for building characteristics, location, storm characteristics, previous storms, insurance claims and the level of emergency services provided. The literature review provided input into the initial selection of variables, identifying those that several studies have found to be significant predictors of the economic damage from a flooding event (Merz et al., 2013; Zhai et al., 2005; Elmer et al., 2010).
In this paper, we apply a two-stage procedure for variable selection and the definition of functional form in our econometric models. In the first step, we used the R software package 'PanJen' (Jensen and Panduro, 2018) to identify the functional form for each potential explanatory variable. A visual investigation of the relationship between the explanatory variables and the dependent variable indicated that some, if not most, of the relationships cannot be characterized as linear. As a result, we applied the semi-parametric tools in PanJen to compare the performance of different functional forms of the explanatory variables.
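The idea of comparing functional forms can be illustrated with a small sketch. This is not the PanJen implementation; it only assumes that candidate transformations of a single explanatory variable are ranked by an information criterion (BIC is chosen here as an assumption). The synthetic data and the helper names `fit_simple_ols`, `bic` and `rank_forms` are hypothetical.

```python
import math

def fit_simple_ols(x, y):
    """Fit y = a + b*x by least squares and return (a, b, rss)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def bic(rss, n, n_params):
    """Gaussian BIC up to an additive constant: n*ln(rss/n) + p*ln(n)."""
    return n * math.log(rss / n) + n_params * math.log(n)

# Candidate transformations of the explanatory variable (hypothetical set).
FORMS = {
    "linear": lambda v: v,
    "log": math.log,
    "sqrt": math.sqrt,
}

def rank_forms(x, y):
    """Return (name, BIC) pairs sorted from best (lowest BIC) to worst."""
    n = len(x)
    scores = []
    for name, transform in FORMS.items():
        _, _, rss = fit_simple_ols([transform(v) for v in x], y)
        scores.append((name, bic(rss, n, 2)))
    return sorted(scores, key=lambda item: item[1])

# Synthetic data generated from a logarithmic relationship plus small noise.
depth = [float(i) for i in range(1, 41)]
damage = [2.0 + 3.0 * math.log(d) + 0.05 * math.sin(i) for i, d in enumerate(depth)]
ranking = rank_forms(depth, damage)
print(ranking[0][0])
```

Because the synthetic data are generated from a logarithmic relationship, the logarithmic form should rank first. With real data the ranking would be used alongside the visual inspection described above, not as a replacement for it.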
To detect the additional predictive power of different types of data, the regression analysis was performed by systematically adding explanatory variables and assessing the resulting statistics, yielding a total of three different models. An overview of the variables included in the final regression models is given in Table B1, while an overview of our a priori variable hypotheses can be found in Table E1.

$$
\begin{aligned}
\mathit{Damage}_i = {} & \beta_0 + \beta_1 \ln(\mathit{depth}_i) + \beta_2 \ln(\mathit{size}_i) + \beta_3\,\mathit{age}_i + \beta_4\,\mathit{renovated}_i \\
& + \beta_5\,\mathit{centralheating}_i + \beta_6\,\mathit{stoveheating}_i + \beta_7\,\mathit{electricheating}_i + \beta_8\,\mathit{heatpumpheating}_i \\
& + \beta_9\,\mathit{lightweightconcrete}_i + \beta_{10}\,\mathit{timbered}_i + \beta_{11}\,\mathit{brick}_i + \beta_{12}\,\mathit{concrete}_i
\end{aligned}
$$

Results and discussion

OLS regression results
All regression models were estimated using Stata 13 (StataCorp, 2013), and the results are presented in Table 2. All mentions of statistical significance in what follows refer to a 5% significance level, although Table 2 also reports significance levels of 1% and 10%. The initial and simple model (Model 1), based on inundation depth alone, was included in the study as a baseline in order to identify the change in predictive power when more explanatory variables are added, as in Models 2 and 3. Across all three models, our results indicate that an increase in water depth significantly increases damage costs, which confirms both intuition and previous findings in the literature (Messner, 2007). Furthermore, our results also indicate that individual building characteristics influence damage costs: for example, larger buildings have significantly greater damage (Model 3), and buildings heated by electric and central heating have significantly larger damage costs (Models 2 and 3). In addition, adding spatial variables, such as municipal characteristics, increases explanatory value (higher adjusted R2) without changing the significance levels of most of the variables. Model 3, which captures most of the spatial effects, shows especially the importance of including municipality characteristics, including social vulnerability, resources, the emergency response, etc.
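Because the raw R2 never decreases when regressors are added, model comparisons of this kind rely on the adjusted R2. A minimal sketch of the standard adjustment follows; the sample size and regressor counts in the usage lines are hypothetical and are not taken from Table 2.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1), with the intercept excluded from k."""
    if n_obs - n_regressors - 1 <= 0:
        raise ValueError("need more observations than estimated parameters")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)

# Hypothetical illustration: the same raw R2 is penalized more as regressors are added.
small_model = adjusted_r2(0.50, n_obs=11, n_regressors=1)
large_model = adjusted_r2(0.50, n_obs=11, n_regressors=2)
print(small_model, large_model)
```

An added variable therefore raises the adjusted R2 only if it improves the fit by more than the penalty for the extra parameter.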
Our preferred model, Model 3, is an extended model including both building characteristics and variables in order to capture specific spatial variations. We include a continuous measure for the return period of the storm surge, rp, and a variable, prevcount, that measures historical exposure to storm surges at the municipal level, as captured by the number of historical insurance claims. In addition, we include variables related to the national Emergency Management Agency's Services (EMAS) during the storm surge, which control for the duration of the effort (EMAS_duration). This variable is linked to the inundation depth and return period, since it is expected that the emergency services have greater representation in areas with higher inundation depths. Furthermore, three spatial variables regarding the socioeconomic characteristics of the municipalities were added. As described earlier, these variables are included to capture any difference in damage costs at the spatial level of the municipality with regard to how vulnerable local communities are. The hypothesis is that the vulnerability of a local community could influence the municipality's overall efforts to reduce the expected damage costs so that municipalities that are more vulnerable have less economic and social capital with which to respond to extreme events such as storm surges. The overall explanatory power of the model as captured by the adjusted R2 is 0.37. The increase in the explanatory power from 0.21 in Model 1 and 0.32 in Model 2 reflects the finding of Ootegem et al. (2015), who also saw an improvement by including variables for socio-economic status and building characteristics.
Table note: Robust standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01. EMAS = Emergency Management Agency Services.
Model 3 shows a statistically significant increase in the resulting damage of 0.38% per 1% increase in inundation depth, a positive relationship also identified previously in the literature (Jonkman et al., 2008). The effect of inundation depth on damage is smaller than in Models 1 and 2, indicating the presence of omitted variable bias in those models.
We find that several of the building characteristics, such as size, age and heating source, significantly affect the size of the damage costs. Larger buildings have statistically significantly greater damage, with a 1% increase in the size of the building leading to an increase of 0.46%. The finding of a positive relationship between size and damage cost is confirmed by a study by Carisi et al. (2018), who also find a significant, positive relationship. Older buildings are subject to lower expected damage costs, with a ten-year increase in a building's age reducing damage by 4%. Interpreting the causal effects of age upon damage costs is not straightforward. It is likely that the observed negative effects of age on damage costs could be due to older buildings having been built in safer locations, for example, in areas not prone to flooding. There has been an increasing tendency in Denmark, as well as globally, for the development of new settlements and neighborhoods to be located closer to flood-prone areas (Seto et al., 2011; Small and Nicholls, 2003), presumably motivated by the amenity benefits gained from proximity to the water. This trend could explain why older residential buildings are expected to have lower damage costs than newer buildings.
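The percentage interpretations above can be reproduced with a short sketch, assuming the dependent variable enters the model in logarithms. The elasticity of 0.46 for size is the value reported in the text; the age coefficient of -0.004 per year is an assumed value, chosen only because it implies roughly the reported 4% reduction per ten years, and the function names are hypothetical.

```python
import math

def loglog_effect(beta, pct_change_x):
    """Exact % change in y for a given % change in x when ln(y) = ... + beta*ln(x)."""
    return ((1.0 + pct_change_x / 100.0) ** beta - 1.0) * 100.0

def loglinear_effect(beta, delta_x):
    """Exact % change in y for a change delta_x in x when ln(y) = ... + beta*x."""
    return (math.exp(beta * delta_x) - 1.0) * 100.0

size_effect = loglog_effect(0.46, 1.0)       # elasticity of 0.46 reported for size
age_effect = loglinear_effect(-0.004, 10.0)  # -0.004 per year is an assumed coefficient
print(round(size_effect, 3), round(age_effect, 2))
```

For small changes the exact effects are close to the usual approximations, namely beta percent for a log-log term and 100 times beta times the change in x for a log-linear term.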
Furthermore, our results show that, compared to buildings heated by district heating, those heated by an electrical heating [...] Table ?? in the Appendix). The reference category, district heating, is only observed for 102 buildings. Nevertheless, this category was chosen as a reference since the four remaining categories are all characterized by being building-specific and thus are potentially more susceptible to damage than a district heating system would be. So, although these percentage effects seem large compared to the other effects we found, they essentially capture the greater sensitivity of a localized heating system compared to a network-based system and highlight the sensitivity of these systems to a storm-surge event. The impact of the heating systems on damage costs could reflect the locations of different heating systems within buildings and thus the costs of repairing damaged heating systems. Consistent with this hypothesis, heat pumps and electric heaters, which we find contribute most to damage costs, are often placed in low-to-ground positions within buildings.

The direction and significance of both the age of a building and whether it has electrical and central heating remain unchanged between Models 2 and 3, suggesting that the relevance of these variables is robust to the inclusion of the specified spatial variables. However, the size of the effect is reduced, and the heat pump does not seem to suffer greater damage than the district heating systems in Model 3. The effect of building size, by contrast, is only significant in Model 3, which indicates that including important spatial variables reduces omitted variable bias. We find no indication in any of the models that exterior construction materials influence damage costs, nor distance from the nearest coast or lake. Also, the results indicate that the storm intensity (rp) and the number and severity of prior storm surges have no significant effect on the damage costs.
Interestingly, the variables regarding emergency responses show a significant decrease in damage costs the more time is spent on emergency actions. There is a larger decrease in damage costs at lower inundation depths, as captured by the partial effect of the interaction term (EMAS_duration + ln(depth)*EMAS_duration). The calculated reduction in damage per extra hour spent on emergency management ranges from 1.16% for the lowest recorded inundation depth to a reduction in damage of 0.16% for the highest recorded inundation depth.
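The depth-dependent effect of emergency services follows from differentiating the two EMAS terms with respect to EMAS_duration. The sketch below illustrates the calculation, assuming a log-transformed damage variable. The coefficient values and the two depths are hypothetical, chosen only to reproduce the sign pattern described in the text, and the function names are not taken from the source.

```python
import math

def emas_partial_effect(gamma_duration, gamma_interaction, depth):
    """d ln(damage) / d EMAS_duration = gamma_duration + gamma_interaction * ln(depth)."""
    return gamma_duration + gamma_interaction * math.log(depth)

def pct_change_per_hour(partial_effect):
    """Convert a change in ln(damage) into a percentage change in damage."""
    return (math.exp(partial_effect) - 1.0) * 100.0

# Hypothetical coefficients and depths, chosen only to reproduce the sign pattern.
GAMMA_DURATION = -0.010
GAMMA_INTERACTION = 0.004
shallow = pct_change_per_hour(emas_partial_effect(GAMMA_DURATION, GAMMA_INTERACTION, 0.1))
deep = pct_change_per_hour(emas_partial_effect(GAMMA_DURATION, GAMMA_INTERACTION, 2.0))
print(shallow, deep)
```

With a negative coefficient on duration and a positive coefficient on the interaction, the reduction per extra hour is largest at shallow depths and shrinks as depth increases.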
We find no effect of either the share of retired persons in a municipality or the average income per capita on damage costs. However, the results indicate that municipalities with greater public expenditure on medical consultations also have statistically lower damage costs. Intuitively, we would have assumed that municipalities with higher public expenditure on medical consultations indicate a vulnerable community, which would suggest that the effect on damage costs should be positive.
On the other hand, this could also be seen as an indication of a resourceful community in which people act on their health problems, which could explain the negative relationship we observe.
To sum up, our results indicate a substantial increase in explanatory power from 21% to 39%, going from a simple linear regression including only inundation depth (Model 1) to a multivariable regression model (Model 3). Furthermore, our study shows that adding explanatory variables besides inundation depth decreases the effect of inundation depth alone: for example, the increase in damage costs associated with a 1% increase in inundation depth falls from 0.57% to 0.37%, a substantial difference. Thus, our study indicates that using only the simulated inundation depth to predict damage costs could lead to these costs being over-estimated. Specifically, the inclusion of spatial variables changes the effect size of several of the explanatory variables, adding to the explanatory power of our model.

If OLS regressions are to provide unbiased and consistent estimates, we have to assume that observations are independent of one another (LeSage, 2014). In this case, we might suspect the presence of neighborhood effects that influence damage costs at a lower level than the municipal level. As pointed out by Anselin and Arribas-Bel (2013), it is only in very restricted cases that fixed spatial effects, such as those we include in Model 3, correctly account for the possible spatial dependence between observations. An example of such a small-scale spatial effect could be the degree of social cohesion among neighbors, which potentially could influence their ability or willingness to help each other during an extreme event such as a storm surge. Not controlling for such a potential effect in the model setup means that spatial autocorrelation might be present, leading to biased and inconsistent OLS estimators. To investigate whether Models 2 or 3 display any spatial autocorrelation, we compare model performance based on Moran's I (see Table F1 in Appendix), which indicates that spatial autocorrelation is present in Model 2 (Moran's I = 0.117***). However, in Model 3, which includes spatial variables at the municipality level, we find that the evidence of significant spatial autocorrelation has decreased (Moran's I = 0.062**). In particular, the test for spatial autocorrelation in the error term is only significant at a 10% level, suggesting that the model accounts for most of the unobserved spatial effects. This again highlights the importance of including variables representing spatial variance and not just focusing on specific house or individual characteristics. The indication that some spatial dependence is still present in the extended model suggests that future studies could explore this further using, for example, spatial econometric methods (LeSage and Pace, 2009; Cliff, 1973).
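For readers unfamiliar with the statistic, the following sketch computes the global Moran's I for a vector of values, such as regression residuals, given a spatial weights matrix. The binary chain-shaped weights and the toy values are assumptions made for illustration and do not reflect the weights used in the study. The sketch does not include the significance test behind the reported stars.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi ** 2 for zi in z)

def chain_weights(n):
    """Binary weights for n units on a line: each unit neighbors the adjacent ones."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]

w = chain_weights(4)
clustered = morans_i([1.0, 2.0, 3.0, 4.0], w)      # similar values sit next to each other
alternating = morans_i([1.0, -1.0, 1.0, -1.0], w)  # neighbors always differ
print(clustered, alternating)
```

Positive values indicate that similar residuals cluster in space, and negative values indicate that neighboring residuals tend to differ. Values near the expected value under no spatial autocorrelation, -1/(n-1), suggest little spatial structure.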
Multivariable damage cost models of storm-surge flooding are essential in supporting cost-effective investments in adaptation options. First, they can highlight individual or combinations of variables that are driving high economic losses, showing which variables are particularly important to address in devising risk-reduction strategies. Secondly, the models can be used to prioritize actions between different geographical areas and provide an estimate of how much society should be willing to invest in adaptive measures. Our results confirm that multivariable models can provide more accurate results than a simple model.
Through the inclusion of a wide set of explanatory variables in the damage costs of coastal flooding, our study particularly highlights the necessity of controlling for spatial variation in estimating damage costs, whether in relation to socio-economic conditions, flooding characteristics or emergency services. Including more variables increases the explanatory power of our model from 21% in the simple model, where inundation depth is the only variable, to 39% in the multivariable model. Several different variables are found to significantly influence damage costs. Besides inundation depth, they include building characteristics (size, age), emergency response efforts and type of heating source. In particular, buildings with electric heating systems appear to be surprisingly sensitive to flooding, as damage costs are found to be much higher for buildings with such systems compared with, for example, houses drawing on district heating. The chief benefit of our approach is the stringent econometric method we have used. In theory, this approach should facilitate transferability to other settings similar to those in our study.
We are aware that the type of data used for this model is not widely accessible in all regions and countries, which is why the results of Model 3 will not easily be transferred to other regions. However, our study still highlights the importance of including variables that can account for differences across municipalities and regions. The results of our study should thus be replicated in other settings before being used as input to adaptation option strategies.
The source of heating might influence the total damage amount, depending on how sensitive the system is towards flooding. Stoves, heat pumps and central heating systems could be more vulnerable to flooding compared to district heating.
distance to coastline (meter): The farther a property is from the coastline, the lower the damages it incurs.
previous insurance claims in area due to storm surge (count): If buildings in the municipality have been flooded before due to prior storm surges, they might be more prepared for the next one and thus have lower damage costs.
previous insurance relief payment in area due to storm surge (index): If buildings in the municipality have earlier had high insurance payments due to storm surges, we could expect building owners to be more prepared for following storms.