Urban pluvial flood risk assessment – data resolution and spatial scale when developing screening approaches on the microscale

Urban development models typically provide simulated building areas in an aggregated form. When using such outputs to parametrize pluvial flood risk simulations in an urban setting, we need to identify ways to characterize imperviousness and flood exposure. We develop data-driven approaches for establishing this link, and we focus on the data resolutions and spatial scales that should be considered. We use regression models linking aggregated building areas to total imperviousness and models that link aggregated building areas and simulated flood areas to flood damage. The data resolutions used for training regression models are demonstrated to have a strong impact on identifiability, with too fine data resolutions preventing the identification of the link between building areas and hydrology and too coarse resolutions leading to uncertain parameter estimates. The optimal data resolution for modeling imperviousness was identified to be 400 m in our case study, while an aggregation of the data to at least 1000 m resolution is required when modeling flood damage. In addition, regression models for flood damage are more robust when considering building data with coarser resolutions of 200 m than with finer resolutions. The results suggest that aggregated building data can be used to derive realistic estimations of flood risk in screening simulations.


Introduction
The development of pluvial flood risk adaptation measures in urban areas typically requires that a variety of combinations of different measures are tested (van Berchum et al., 2018). In addition, flood risk is strongly affected by climate change, urbanization and socioeconomic changes (Di Baldassarre et al., 2015; Hinkel et al., 2014; Muis et al., 2015; Muller, 2007; Semadeni-Davies et al., 2008). Projections of these parameters are subject to substantial uncertainties over infrastructure lifetimes between 30 and 100 years (Cohen, 2004; Granger and Jeon, 2007; Hall et al., 2014; Madsen et al., 2014).
To consider these uncertainties in the design of water infrastructures, scenario assessments are performed. In these assessments, model simulations of the urban layout are linked to water systems models (Urich and Rauch, 2014), and the combined impact of climate change, represented as changing forcing in the water systems model, and changes in exposure, represented by varying simulated urban layouts, is assessed. For example, Löwe et al. (2017, 2018) linked a vector-based urban development model to a 1D-2D hydraulic model of the urban catchment to assess pluvial and coastal flood risk, while Mustafa et al. (2018) implemented a similar setup for fluvial flood risk, considering a cellular automata model for urban development and 2D hydraulic simulations. Other studies have applied cellular automata to study the effect of urbanization on extreme rainfall and resulting flood risk (Huong and Pathirana, 2013) and to quantify changes in coastal flood areas as a result of urbanization (Sekovski et al., 2015).
Raster-based implementations for modeling urban development, such as the ones used by Mustafa et al. (2018) and , have the advantage of short simulation times. Such models can be combined into a flood risk screening setup together with fast flood simulation tools, i.e., a setup which allows for a fast evaluation of flood risk with limited accuracy. Such setups enable testing flood risk adaptation measures in a scenario-based approach, where the combination of various potential measures and different socioeconomic and climate scenarios easily leads to simulation requirements exceeding 10 000 events (Kwakkel et al., 2015; Löwe et al., 2017, 2018; van Berchum et al., 2018). In this context, conceptual flood simulation tools as described by Bermúdez et al. (2018) and Jamali et al. (2018, 2019) may be preferable over machine learning techniques (e.g., Wang et al., 2015), because they allow for a physically interpretable implementation of surface adaptation measures and because they can be linked to conceptual models of the drainage system and thus be used for a combined assessment of flood risk and other environmental impacts of the drainage system.
Published by Copernicus Publications on behalf of the European Geosciences Union.
When applying a linked (possibly conceptual) urban development-hydraulic simulation setup for pluvial flood risk assessment, we need to consider the effects of increasingly impervious areas, leading to increased runoff and thus larger flood hazard, as well as of increasing exposure, resulting from an increase in the potentially flood-prone urban area (Löwe et al., 2017). For both parameters, urban development simulations will frequently not provide a full quantification of the hydrologically relevant variables. For example, impervious surfaces such as terraces, carports or even streets might not be explicitly represented in the urban development model. Similarly, microscale flood damage assessments where simulated flood areas are overlaid with building and infrastructure objects are state of the art in urban hydrology (Hammond et al., 2015), but some building types that are relevant for flood damage assessments (e.g., schools) might not be modeled, and the location of buildings may not exactly reflect reality or may be blurred if (raster-based) cellular automata approaches are applied. In addition, while Bruwier et al. (2018) clearly demonstrated that building data affect urban flood simulations by blocking flow paths, this effect is difficult to consider if an urban development simulation only provides building information in the form of building area density.
For the case where urban development models provide aggregated, raster-based outputs, it is not clear how to link this output to hydrological modeling approaches and subsequent economic pluvial risk assessments. Related work has applied ad hoc definitions (Löwe et al., 2017), guesstimates from planning documents (Bach et al., 2013) and manual tuning of model parameters to predict imperviousness based on modeled building areas.
Data-driven, empirical approaches would be highly attractive to parametrize this link. Our aim is to evaluate such procedures and to characterize the data resolutions and spatial scales for which robust performance can be obtained. Similarly, for damage assessment we would be highly interested in procedures that allow for upscaling of locally derived depth-damage functions, which are likely to provide better damage estimates (Cammerer et al., 2013) and facilitate acceptance among stakeholders. This need was also recognized in the literature (de Moel et al., 2015). Upscaling procedures were previously described by Kreibich et al. (2010) and Thieken et al. (2008) but focused on mesoscale damage assessments rather than assessments on the city, or even neighborhood, scale that we are interested in when performing exploratory modeling for urban flood adaptation.
None of the previous work has explicitly assessed to what extent data resolutions applied in the development of scaling procedures affect the outcome of these procedures and at which spatial scale reasonable predictions can be obtained. A thorough assessment of these issues throughout the pluvial urban flood risk modeling chain is the main contribution of this paper.

Study area and data
We consider the city of Odense, Denmark, as a case study. Odense has approximately 200 000 inhabitants, and it is located in a typical moraine landscape close to the sea.
As base data characterizing the urban form, we were provided with building footprints in vector format by Odense Municipality (Fig. 1). The building footprints included information on the building types that were grouped into the 11 classes shown in Table S1 in the Supplement. In addition, information on the number of residential units and the commercial floor space area in each building was available.
Data on impervious area were provided in vector format. The data were obtained from remote sensing campaigns and grouped into six classes (Fig. 3). The responsible utility VandCenter Syd continuously performs manual, small-scale evaluations of which percentage of each impervious area class is connected to the sewer system. These evaluations were performed for each of the 18 000 subcatchments used in the existing hydrodynamic model for the city's drainage network. We used this processed dataset for our analysis; i.e., impervious area was considered as effective impervious area connected to the pipe system.
A digital elevation model (DEM) was available from the Agency for Data Supply and Efficiency (2018) at a resolution of 0.4 m. The data supplier ensured hydrological validity of the data by removing obstacles for major flow paths such as bridges. The data were averaged to a resolution of 5 m. Figure 1 shows terrain elevations, footprints of the existing buildings and the network of existing major roads. We refer to Löwe et al. (2019) for a detailed evaluation of the characteristics of the urban layout in the case study area.
Figure 2 illustrates the overall problem. Hydrological modeling and flood damage assessment are commonly performed based on polygon data characterizing the urban layout. Fast, raster-based urban development models instead provide information about the building area inside a pixel or the land-use mix inside a pixel, which, through an assumed building density, can be translated into building area. Typically, these models operate with raster resolutions on the order of 100 to 200 m (Fuglsang et al., 2013; Mustafa et al., 2018). Such coarse input data will affect rainfall-runoff simulations, i.e., the location where flood hazards occur, and are likely to be incompatible with flood damage assessments derived from polygon data.
To analyze issues arising in different parts of the pluvial flood risk modeling chain, we performed hydrological assessments considering imaginary urban development model outputs in the form of rasterized building data with resolutions between 25 and 2000 m.
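The aggregation of building data to coarser rasters can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: the function name `aggregate_area`, the synthetic 25 m raster and the cropping of incomplete edge blocks are all assumptions made for the example.

```python
import numpy as np


def aggregate_area(fine: np.ndarray, factor: int) -> np.ndarray:
    """Sum a fine raster of areas (m^2 per cell) into blocks of factor x factor cells."""
    rows, cols = fine.shape
    # Assumption: cells that do not fill a complete block are cropped.
    rows -= rows % factor
    cols -= cols % factor
    blocks = fine[:rows, :cols].reshape(rows // factor, factor, cols // factor, factor)
    return blocks.sum(axis=(1, 3))


# Hypothetical 25 m raster: building footprint area in m^2 per cell (at most 625 m^2).
rng = np.random.default_rng(0)
fine = rng.uniform(0.0, 625.0, size=(80, 80))

coarse = aggregate_area(fine, factor=8)   # 25 m -> 200 m pixels
density = coarse / (200.0 * 200.0)        # m^2 of building per m^2 of pixel
```

Summing areas rather than averaging them keeps the total building area unchanged across resolutions, so the different rasters describe the same urban layout at different levels of detail.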

Methods
We structured our study around the steps illustrated in Fig. 3. Summarized roughly, these steps involved the identification of a regression relationship between rasterized building footprint areas (the assumed urban development modeling output) and impervious area. The identified relationship was subsequently applied to derive a raster of predicted impervious area, which was used to parametrize 2D hydrodynamic simulations of surface water flow. The results of these simulations were used to estimate the extent of flooded building area, which was then used as input to regression models that predicted flood damage derived from a reference simulation. The reasoning behind this approach was the following:
1. Urban development models in general, and fast, raster-based modeling approaches in particular, do not provide detailed information on all impervious areas in a catchment. Thus, we need to estimate empirical relationships between an assumed urban development modeling output (here raster-based building footprint areas for different building types) and measured imperviousness. Fitting the regression relationship to datasets with varying resolutions provides insight into the spatial scale at which the link between urban layout and imperviousness can be identified. Generating predictions at varying resolutions provides insight into the spatial scale at which reasonable predictions can be generated.
2. In a hydrological model, coarse representations of imperviousness affect the runoff volume and location where runoff occurs and will thus lead to different simulations of flood hazards. We performed hydrodynamic 2D flood simulations where the hydrodynamic model was parametrized using impervious areas based on building areas with varying levels of aggregation. Comparing the resulting flood maps to a reference simulation, we can quantify how increasingly coarse representations of the urban layout affect simulated flood hazard.
3. Economic flood damage is an important parameter in decision-making related to flood adaptation. The standard approach for damage estimation in urban hydrology is to overlay high-resolution flood areas and building polygons. If only coarse, raster-based building data are available, flood damage can be derived by establishing a regression relationship between flood damage derived from a reference simulation and the extent of flooded building area as a measure of exposure. Inspecting the validity of this relationship provides insight into the combined impact of coarse representations of the urban layout on hazard and exposure.
In addition to the above, buildings affect simulated flood hazards by obstructing flow paths. This effect cannot be considered when only coarse building data are available. To isolate this effect in our study, we performed an additional baseline simulation where buildings were not included in the DEM (not shown in Fig. 3). We compared simulated flood areas and damage to the reference.

Model setup
Our aim was to predict impervious area in simulated urban developments when the assumed output of an urban development model is building footprint areas for different building types. Linear regression approaches for modeling such relationships were previously documented by Butler and Davies (2011) for detached housing only and by Chabaeva et al. (2009) for a variety of land cover classes derived from satellite observations. To identify a regression relationship, we rasterized the high-resolution polygon data. We modeled, for each pixel $j$ and each of the building types $i$ shown in Table S1, the observed impervious area $A_{\mathrm{imp},j}$ in square meters per square meter as a function of the building footprint area $A_{\mathrm{bf},i,j}$ in square meters per square meter and coefficients $a_i$:
$$A_{\mathrm{imp},j} = \sum_{i} a_i \, A_{\mathrm{bf},i,j} \tag{1}$$
Scatterplots of impervious area versus building area are included in the Supplement (Fig. S1). We have not included an intercept in Eq. (1) to ensure undeveloped areas are assigned an imperviousness of 0 and because the scatterplots did not suggest that an intercept would be necessary. For fine data resolutions this leads to biased regression predictions. While the dataset certainly is subject to spatial autocorrelation, the regression models provided strong predictive performance, and we have therefore not investigated the matter further.
To test the impact of spatial data resolution, we fitted regression models to datasets with 80 different resolutions $x_{\mathrm{fit}}$, ranging from 25 to 2000 m in steps of 25 m. The regression coefficients identified for each resolution were then used to predict imperviousness at 80 different aggregation levels $x_{\mathrm{pred}}$, ranging from 25 to 2000 m. We embedded our tests into a cross-validation setup where 80 % of the dataset was used for calibration and 20 % for model validation. If $x_{\mathrm{pred}} > x_{\mathrm{fit}}$, we sampled from the pixels of the dataset used for prediction and otherwise from the pixels of the fitting dataset. For cross validation, a pixel from the dataset with finer resolution was linked to the pixel of the dataset with coarser resolution with which it shared the greatest overlap. The cross-validation procedure was repeated $k = 1000$ times; i.e., a total of 80 × 80 × 1000 regression models were considered.
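A simplified sketch of the fitting and cross-validation loop is given below. It is illustrative only: the data are synthetic, the helper names `fit_no_intercept` and `cross_validate` are hypothetical, and fitting and prediction use the same resolution, so the linking of pixels between datasets of different resolutions described above is not reproduced.

```python
import numpy as np


def fit_no_intercept(A_bf: np.ndarray, A_imp: np.ndarray) -> np.ndarray:
    """Least-squares coefficients a_i for a linear model without intercept."""
    coef, *_ = np.linalg.lstsq(A_bf, A_imp, rcond=None)
    return coef


def cross_validate(A_bf, A_imp, k=100, train_frac=0.8, seed=0):
    """Repeat random 80/20 splits and return the validation RMSE of each split."""
    rng = np.random.default_rng(seed)
    n = A_bf.shape[0]
    n_train = int(train_frac * n)
    rmse = np.empty(k)
    for it in range(k):
        idx = rng.permutation(n)
        train, test = idx[:n_train], idx[n_train:]
        coef = fit_no_intercept(A_bf[train], A_imp[train])
        pred = A_bf[test] @ coef
        rmse[it] = np.sqrt(np.mean((pred - A_imp[test]) ** 2))
    return rmse


# Synthetic example: 500 pixels, 3 building types, footprint density in m^2/m^2.
rng = np.random.default_rng(1)
A_bf = rng.uniform(0.0, 0.25, size=(500, 3))
a_true = np.array([1.4, 1.1, 1.2])                      # hypothetical coefficients
A_imp = A_bf @ a_true + rng.normal(0.0, 0.01, size=500)

a_hat = fit_no_intercept(A_bf, A_imp)
rmse_k = cross_validate(A_bf, A_imp, k=100)
spread = rmse_k.std()   # analogous to the spread of RMSE over iterations
```

In the full setup, this loop would be run for every combination of fitting and prediction resolution, and the spread of the RMSE over the iterations indicates how reliably the coefficients can be identified.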

Performance assessment
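During each iteration $k$, we computed the root-mean-square error $\mathrm{RMSE}_{A_{\mathrm{imp}},k}$, the coefficient of determination $\mathrm{COD}_{A_{\mathrm{imp}},k}$ and the bias ratio $\mathrm{RBIAS}_{A_{\mathrm{imp}},k}$:
$$\mathrm{RMSE}_{A_{\mathrm{imp}},k} = \sqrt{\frac{1}{N} \sum_{j=1}^{N} \left( A_{\mathrm{imp,pred},j} - A_{\mathrm{imp,obs},j} \right)^2} \tag{2}$$
$$\mathrm{COD}_{A_{\mathrm{imp}},k} = 1 - \frac{\sum_{j} \left( A_{\mathrm{imp,obs},j} - A_{\mathrm{imp,pred},j} \right)^2}{\sum_{j} \left( A_{\mathrm{imp,obs},j} - \overline{A}_{\mathrm{imp,obs}} \right)^2} \tag{3}$$
$$\mathrm{RBIAS}_{A_{\mathrm{imp}},k} = \frac{\sum_{j} A_{\mathrm{imp,pred},j}}{\sum_{j} A_{\mathrm{imp,obs},j}} \tag{4}$$
where $A_{\mathrm{imp,pred},j}$ and $A_{\mathrm{imp,obs},j}$ were predicted and observed impervious areas for a pixel $j$ in the validation dataset, $N$ was the number of pixels in the validation dataset and $\overline{A}_{\mathrm{imp,obs}}$ was the average imperviousness of all pixels $j$ in the validation dataset. We considered the median of $\mathrm{RBIAS}_{A_{\mathrm{imp}},k}$ and $\mathrm{COD}_{A_{\mathrm{imp}},k}$ over all $k$ iterations as measures of goodness of fit and the standard deviation $\sigma(\mathrm{RMSE}_k)$ of $\mathrm{RMSE}_{A_{\mathrm{imp}},k}$ as a measure of how reliably the model could be identified for a given combination of $x_{\mathrm{fit}}$ and $x_{\mathrm{pred}}$.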
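The three scores can be computed as sketched below. The functions follow the common textbook definitions of RMSE, coefficient of determination and a ratio of totals, which is an assumption about the exact form used in the study; the function names and sample values are illustrative.

```python
import numpy as np


def rmse(pred: np.ndarray, obs: np.ndarray) -> float:
    """Root-mean-square error between predictions and observations."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def cod(pred: np.ndarray, obs: np.ndarray) -> float:
    """Coefficient of determination relative to the mean of the observations."""
    sse = np.sum((obs - pred) ** 2)
    sst = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - sse / sst)


def rbias(pred: np.ndarray, obs: np.ndarray) -> float:
    """Bias ratio: total predicted over total observed (below 1 means underprediction)."""
    return float(pred.sum() / obs.sum())


# Illustrative values only (imperviousness in m^2/m^2 for three validation pixels).
obs = np.array([0.2, 0.4, 0.6])
pred_half = 0.5 * obs

scores = {
    "rmse": rmse(pred_half, obs),
    "cod": cod(pred_half, obs),
    "rbias": rbias(pred_half, obs),
}
```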

Model setup
We performed 2D flood simulations of pluvial hazards for 10 different models: a model where imperviousness was determined from the original imperviousness dataset and where buildings were included in the DEM for flow calculation (baseline model); a model where imperviousness was determined from the original imperviousness dataset and where buildings were not included in the DEM for flow calculation (baseline without buildings); and eight models where imperviousness was derived considering the regression relationship shown in the Supplement (Sect. S2) and considering building data aggregated to resolutions $x_b$ of 25, 50, 100, 200, 300, 500, 750 and 1000 m as input. Buildings were not explicitly included in the DEM for flow calculation in the latter case.
Our 2D modeling approach was identical to that used by Kaspersen et al. (2017) for the same case study area. The 2D surface flow model was implemented in MIKE 21 (DHI, 2016) using a grid size of 5 m by 5 m. Simulations were performed for Chicago design storms (CDSs) with return periods of 20 and 100 years and durations of 4 h. Rainfall-runoff computations were performed for each grid cell during each time step of a simulated event, and the runoff created in each cell was then included in the simulation of surface water flows.
As in Kaspersen et al. (2017), runoff $R_t$ in time step $t$ for each 5 m pixel was computed as
$$R_t = P_t - f_t \, (1 - \mathrm{IS}) - P_{t,\mathrm{RP5}} \, \mathrm{IS} \tag{5}$$
where $P_t$ was the rain intensity and IS the ratio of impervious area in a pixel to its total area. The effective infiltration intensity $f_t (1 - \mathrm{IS})$ in a cell was computed based on a constant infiltration rate $f_t = 29.3$ mm h$^{-1}$. On the impervious portions of a pixel, the rain intensity $P_{t,\mathrm{RP5}}$ of a 5-year design storm at the same time step $t$ was subtracted from the rain intensity to simulate the effect of drainage systems. Impervious areas linked to major roads (Fig. 1) were preserved throughout all simulations. In an urban development simulation, main roads would need to be considered explicitly, instead of being lumped into a regression prediction of imperviousness with building areas as the only input. As an example, we included maps of infiltration rates $f_t (1 - \mathrm{IS})$ derived for two building datasets in the Supplement, Sect. S3. The 2D flood model was not calibrated to reflect observed flooding in the catchment. While the simulated flood maps may not coincide with reality, they provide a realistic baseline for the further analysis.
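One reading of the verbal description of the runoff computation is sketched below: infiltration acts on the pervious fraction of a pixel, and the 5-year design rain intensity is subtracted on the impervious fraction. The function name, the argument names and the clipping of negative runoff to zero are assumptions that are not stated in the text.

```python
import numpy as np


def runoff(P_t, P_t_rp5, IS, f_t=29.3, clip=True):
    """Runoff intensity (mm/h) per pixel.

    P_t      rain intensity of the simulated event (mm/h)
    P_t_rp5  rain intensity of the 5-year design storm at the same time step (mm/h)
    IS       impervious fraction of the pixel (0..1)
    f_t      constant infiltration rate (mm/h), 29.3 as stated in the text
    clip     assumption: negative runoff is set to zero
    """
    P_t = np.asarray(P_t, dtype=float)
    R_t = P_t - f_t * (1.0 - IS) - np.asarray(P_t_rp5, dtype=float) * IS
    return np.maximum(R_t, 0.0) if clip else R_t


# Illustrative intensities for one time step and three pixels.
IS = np.array([1.0, 0.0, 0.5])
R = runoff(P_t=50.0, P_t_rp5=30.0, IS=IS)
```

Because the drainage term scales with the impervious fraction, the assumed sewer capacity follows the spatial distribution of impervious area in this formulation.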

Performance assessment
We compared the simulated flood maps to the baseline simulation where true imperviousness percentages were applied for runoff modeling and buildings were included in the DEM. In the comparison, we focused on built-up areas and excluded natural areas and water bodies.
We created contingency tables where we counted in how many pixels both the predicted flood map under scrutiny and the baseline flood map exceeded a water level of 0.1 m (hits) and how often this was the case only for the baseline model (misses) or the tested model (false alarms). Subsequently, we computed the scores hit rate HR, false alarm ratio FAR and critical success index CSI as defined by Bennett et al. (2013). In addition, we evaluated the total area flooded above a water level of 0.1 m.
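The contingency-table scores can be sketched as follows, using the standard definitions HR = hits / (hits + misses), FAR = false alarms / (hits + false alarms) and CSI = hits / (hits + misses + false alarms). The function name, the optional mask argument for built-up areas and the tiny example grids are illustrative assumptions.

```python
import numpy as np


def flood_scores(depth_test, depth_base, threshold=0.1, mask=None):
    """Pixel-by-pixel comparison of a tested flood map against a baseline map."""
    wet_test = np.asarray(depth_test) > threshold
    wet_base = np.asarray(depth_base) > threshold
    if mask is not None:
        # Restrict the comparison, e.g., to built-up areas.
        wet_test, wet_base = wet_test[mask], wet_base[mask]
    hits = int(np.sum(wet_test & wet_base))
    misses = int(np.sum(~wet_test & wet_base))
    false_alarms = int(np.sum(wet_test & ~wet_base))

    def ratio(num, den):
        return num / den if den > 0 else float("nan")

    return {
        "hits": hits,
        "misses": misses,
        "false_alarms": false_alarms,
        "HR": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
    }


# Illustrative 2 x 2 water depth grids in meters.
base = np.array([[0.20, 0.00], [0.30, 0.05]])
test = np.array([[0.15, 0.20], [0.00, 0.00]])
s = flood_scores(test, base)
```

A flood extent that is shifted by a single pixel counts as both a miss and a false alarm in such a comparison, so the scores penalize small displacements twice.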

Flood damage assessment (C)
Based on the 2D flood simulations performed for the baseline situation, we assessed flood damage. The derived damage data were subsequently used as a reference for training and validating the regression models derived in Sect. 3.4.
Direct flood damage in urban areas is commonly assessed by overlaying polygons of exposed objects with high-resolution flood maps. Damage is then assigned to each object (e.g., a building) depending on the greatest adjacent water depth (Hammond et al., 2015). For our assessment, we have focused on direct, tangible flood damage as this is most directly related to the urban form.
We distinguished between two approaches for damage assessment, which we expected might yield different results in terms of which impacts different data resolutions may have in damage assessment. The first type is threshold-based approaches, where a unit damage is assigned to an object if the water level exceeds a defined threshold. In Denmark, such approaches are frequently applied in the context of pluvial risk assessments (Kaspersen and Halsnaes, 2017; Odense Kommune, 2014; Olsen et al., 2015), because water levels are generally low. In the international literature, depth-damage curves are widely applied (Penning-Rowsell et al., 2013; Thieken et al., 2008) where damage potential is assigned to different objects in the urban space. Depending on the flood water level, different portions of the damage potential are realized.
We considered the framework of Olsen et al. (2015) as an example for the unit-damage approach, while the framework of Beckers et al. (2013) was considered as an example for the depth-damage-based approach. The latter builds on damage functions from FLEMO (Thieken et al., 2008;Kreibich et al., 2010). It is the only example we were able to find in the literature where damage potential for residential and commercial properties was published for the same case study. We have therefore selected it for our work. Table 1 summarizes both approaches. We have not considered damage to road structures, because this was of negligible magnitude.
Flood damage was derived by overlaying the simulated flood areas with the building polygons. Damage per square meter was derived for each building, considering the damage functions shown in Table 1. The building polygons were then rasterized to a resolution of 1 m and subsequently aggregated to the different data resolutions used for fitting the regression models detailed in Sect. 3.4.
We have also derived flood damage for the baseline simulation where buildings were not included in the DEM. The damage values were not used for regression but are shown in the results section, as they provide insight into the impact of blocked flow paths on damage assessment.

Model setup
In the regression of flood damage, we considered the building footprint area $A_{\mathrm{flooded},WL[i]}$ flooded with a water level above threshold $WL[i]$ as the main input variable. This area was determined by resampling the building raster data with resolutions $x_b$ of 25, 50, 100, 200, 300, 500, 750 and 1000 m to the same resolution as the flood maps (5 m) and summing up the building areas for all pixels which were flooded above the threshold of interest.
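The computation of flooded building area from coarse building data can be sketched as follows. The sketch assumes that the building area of a coarse pixel is spread evenly over the fine flood-map cells it covers; this uniform disaggregation, the function name and the small example grids are assumptions made for illustration.

```python
import numpy as np


def flooded_building_area(building_coarse, depth_fine, factor, wl):
    """Building area (m^2) located in fine cells with water depth above wl (m).

    building_coarse  building footprint area per coarse pixel (m^2)
    depth_fine       water depth on the fine flood-map grid (m)
    factor           number of fine cells per coarse pixel edge
    """
    # Assumption: uniform distribution of building area within a coarse pixel.
    fine = np.repeat(np.repeat(building_coarse, factor, axis=0), factor, axis=1)
    fine = fine / factor**2
    return float(fine[depth_fine > wl].sum())


# Example: 10 m building pixels (2 x 2) over a 5 m flood map (4 x 4).
building_coarse = np.array([[40.0, 0.0], [0.0, 80.0]])
depth_fine = np.zeros((4, 4))
depth_fine[0, 0] = 0.20      # inside the pixel holding 40 m^2 of buildings
depth_fine[0, 1] = 0.15      # same pixel
depth_fine[0, 2] = 0.50      # pixel without buildings
depth_fine[2:, 2:] = 0.30    # pixel holding 80 m^2 of buildings, fully flooded

area_01 = flooded_building_area(building_coarse, depth_fine, factor=2, wl=0.1)
```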
We reasoned that the regression models should reflect the characteristics of the damage function applied in the original damage assessment. We have therefore considered a model structure based on the three building classes considered in damage assessment. A square-root transformation was applied to both input and output variables to linearize the relationship (see Sect. S6):
$$\sqrt{D} = \sum_{i} \left( b_{1i} \sqrt{A_{\mathrm{flooded,res},WL[i]}} + b_{2i} \sqrt{A_{\mathrm{flooded,comm},WL[i]}} + b_{3i} \sqrt{A_{\mathrm{flooded,pub},WL[i]}} \right) \tag{6}$$
where $D$ denotes flood damage. The flooded building footprint areas for residential ($A_{\mathrm{flooded,res}}$), commercial ($A_{\mathrm{flooded,comm}}$) and public ($A_{\mathrm{flooded,pub}}$) buildings were computed as the total footprint area of the corresponding class that was flooded above water level $WL[i]$ and below $WL[i+1]$, and coefficients $b_{1i}$, $b_{2i}$ and $b_{3i}$ were estimated for each threshold $WL[i]$. The mapping between the 11 building types considered in our case study and three building classes considered for damage assessment is illustrated in Table S1.
For the damage data derived based both on Olsen et al. (2015) and on Beckers et al. (2013), we have applied Eq. (6) with a single damage threshold of 0.1 m, resulting in a model with three input variables that corresponded to the total flooded footprint area for each building type. This approach was named DMOD1 in the following. In addition, for the damage data derived based on Beckers et al. (2013), we also applied a model where all five water level thresholds shown in Table 1 were considered. The result was a regression model with 15 input variables that reflected the building footprint areas flooded above the different water level thresholds considered in the original damage assessment. This approach was called DMOD2.
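A regression of the DMOD1 type, with one flooded footprint area per building class and a square-root transformation of inputs and output, can be sketched as follows. The absence of an intercept, the function names and the synthetic data are assumptions made for this example.

```python
import numpy as np


def fit_dmod1(A_flooded: np.ndarray, damage: np.ndarray) -> np.ndarray:
    """Fit sqrt(damage) as a linear function of sqrt(flooded area) per class.

    A_flooded  array (n_pixels, 3): flooded area of residential, commercial
               and public buildings above a single threshold (m^2)
    damage     array (n_pixels,): reference damage per pixel
    """
    X = np.sqrt(A_flooded)
    y = np.sqrt(damage)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)   # assumption: no intercept
    return b


def predict_damage(A_flooded: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Back-transform the prediction to the original damage scale."""
    return (np.sqrt(A_flooded) @ b) ** 2


# Synthetic data generated from known (hypothetical) coefficients.
rng = np.random.default_rng(2)
A_flooded = rng.uniform(0.0, 5000.0, size=(200, 3))
b_true = np.array([2.0, 3.0, 1.0])
damage = (np.sqrt(A_flooded) @ b_true) ** 2

b_hat = fit_dmod1(A_flooded, damage)
damage_pred = predict_damage(A_flooded, b_hat)
```

A DMOD2-type model would use the same functions with 15 input columns, one per combination of building class and water level threshold.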
Similar to the approach for impervious areas in Sect. 3.1, we fitted the regression models DMOD1 and DMOD2 considering 80 different input data resolutions $x_{\mathrm{fit}}$ between 25 and 2000 m. The flooded building area $A_{\mathrm{bf},WL[i]}$ was always determined at a resolution of 5 m (corresponding to the resolution of the flood map) and was subsequently aggregated to the resolution that should be used for regression fitting.
To distinguish to what extent coarse building data affect damage assessment by creating uncertainty in flood exposure or flood hazard, we derived flooded building areas both from the baseline flood map (considering true imperviousness and buildings included in the DEM for flood simulation) and from the flood map created in a 2D simulation with the aggregated building data which were also considered for damage regression.

Performance assessment
To assess model performance, we performed cross validation. The city was divided into squares with an edge length $x_{\mathrm{pred}}$ of 2000 m (see Sect. S5). We trained the regression model on a random sample of 80 % of the subareas and assessed model performance on the remaining 20 %. This process was repeated $k = 1000$ times.
When the regression models were fitted to datasets with resolutions $x_{\mathrm{fit}}$ finer than 2000 m, we linked the pixels at the lower data resolution to the subarea with which they overlapped most. Regression modeling was then performed at the finer resolution, and predicted damage for each subarea was computed by aggregating the values from the linked pixels. The subdivision into subareas allowed us to evaluate model performance at a constant spatial scale despite applying different data resolutions for model fitting. However, it had the disadvantage that the pixels in the datasets used for regression modeling were not always completely included in a subarea, leading to noise in the computed scores.
www.nat-hazards-earth-syst-sci.net/20/981/2020/ Nat. Hazards Earth Syst. Sci., 20, 981-997, 2020
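To evaluate regression fit, we computed for cross-validation iteration $k$ the COD of damage values $D_{\mathrm{pred},j,k}$ predicted for each subarea $j$ in the validation dataset by comparing to the baseline damage value $D_{\mathrm{baseline},j}$ for the same subarea:
$$\mathrm{COD}_{D,\mathrm{CV2000},k} = 1 - \frac{\sum_{j} \left( D_{\mathrm{baseline},j} - D_{\mathrm{pred},j,k} \right)^2}{\sum_{j} \left( D_{\mathrm{baseline},j} - \overline{D}_{\mathrm{baseline}} \right)^2} \tag{7}$$
where $\overline{D}_{\mathrm{baseline}}$ is the average baseline damage of the subareas in the validation dataset. The index CV2000 indicates that cross validation was performed on a spatial scale of 2000 m. In addition, we computed the total damage ratio $\mathrm{DR}_{\mathrm{tot},k}$ considering all subareas $j$ in the validation dataset as
$$\mathrm{DR}_{\mathrm{tot},k} = \frac{\sum_{j} D_{\mathrm{pred},j,k}}{\sum_{j} D_{\mathrm{baseline},j}} \tag{8}$$
Median values of $\mathrm{COD}_{D,\mathrm{CV2000},k}$ and $\mathrm{DR}_{\mathrm{tot},k}$ were considered in the analysis of results. For the cases where flooded building areas $A_{\mathrm{flooded},WL[i]}$ were derived based on the flood map from the baseline simulation, scores were marked with subscript BF.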
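The evaluation at subarea scale can be sketched as follows: pixel-level damage predictions are summed per subarea, and the coefficient of determination and total damage ratio are computed from the subarea totals. The score definitions follow the common textbook forms, which is an assumption about the exact expressions used; function names and numbers are illustrative.

```python
import numpy as np


def aggregate_to_subareas(pixel_values, subarea_id, n_subareas):
    """Sum pixel values per subarea; subarea_id holds one integer label per pixel."""
    return np.bincount(subarea_id, weights=pixel_values, minlength=n_subareas)


def cod(pred, base):
    """Coefficient of determination of predicted versus baseline subarea damage."""
    sse = np.sum((base - pred) ** 2)
    sst = np.sum((base - base.mean()) ** 2)
    return float(1.0 - sse / sst)


def damage_ratio(pred, base):
    """Total predicted damage divided by total baseline damage."""
    return float(pred.sum() / base.sum())


# Four pixels linked to two subareas (the subarea with the greatest overlap).
pixel_damage = np.array([1.0, 2.0, 3.0, 4.0])
subarea_id = np.array([0, 0, 1, 1])
D_sub = aggregate_to_subareas(pixel_damage, subarea_id, n_subareas=2)

# Compare a hypothetical prediction against baseline damage per subarea.
D_baseline = np.array([3.0, 7.0])
D_pred = np.array([4.0, 6.0])
cod_k = cod(D_pred, D_baseline)
dr_k = damage_ratio(D_pred, D_baseline)
```

The example shows that an unbiased total (damage ratio of 1) can coexist with errors in individual subareas, which is why both scores are reported.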

Results
The results section was structured into the same parts that were also highlighted in Fig. 3. Performance scores related to the simulation of flood hazards and the assessment of flood damage (parts B to D) were collected in Tables 2 and 3, distinguishing between results for building data with varying resolutions $x_b$.
Figure 4 summarizes $\mathrm{COD}_{A_{\mathrm{imp}},k}$, $\mathrm{RBIAS}_{A_{\mathrm{imp}},k}$ and $\mathrm{RMSE}_{A_{\mathrm{imp}},k}$, where regression models for impervious area were fitted for varying data resolutions ($x_{\mathrm{fit}}$) and where the coefficients fitted for one resolution were used to predict impervious areas considering building data aggregated to varying resolutions as input ($x_{\mathrm{pred}}$). When the regression models were fitted to data with resolutions below approximately 250 m, the relationship between building footprint areas and imperviousness could not be identified, because building footprint areas would then not necessarily be located in the same pixels as the associated features of the urban layout (e.g., sidewalks). Regression coefficients approached 1 for the finest data resolutions $x_{\mathrm{fit}}$ and hardly varied during cross validation (not shown). This led to low values for $\mathrm{COD}_{A_{\mathrm{imp}}}$ and an underprediction of the total imperviousness ($\mathrm{RBIAS}_{A_{\mathrm{imp}}} < 1$). Considering the prediction resolution $x_{\mathrm{pred}}$, values of $\mathrm{COD}_{A_{\mathrm{imp}}}$ above 0.95 were achieved at spatial scales above 500 m. For finer spatial scales, there would be random variations in the imperviousness that could not be explained by the extent of building footprint areas alone (see also Fig. S1).

Estimation of impervious areas (A)
While the median predictive performance of the regression models ($\mathrm{COD}_{A_{\mathrm{imp}}}$ and $\mathrm{RBIAS}_{A_{\mathrm{imp}}}$) remained constant for data resolutions $x_{\mathrm{fit}}$ between approximately 250 and 2000 m, the standard deviation of the RMSE values obtained for a fixed prediction resolution was minimal for data resolutions on the order of 400 m (Fig. 4f). For coarser data resolutions there would be a larger portion of the cross-validation iterations where the regression models were not properly identified. This behavior was considered plausible, because coarser data resolutions are accompanied by a loss of information on spatial variability and because the number of data points decreases. For finer data resolutions $x_{\mathrm{fit}}$, the negative bias in predicted imperviousness similarly led to an increase in $\sigma(\mathrm{RMSE}_k)$, because prediction errors varied depending on which areas were sampled for validation. This effect was not observed for $x_{\mathrm{pred}} = 250$ m, because the impervious areas that were not captured during parameter estimation were then also in the validation phase largely located in pixels where the building area was zero (leading to a predicted imperviousness of constant zero).
For our case study, we identified a data resolution $x_{\mathrm{fit}}$ of 400 m as the optimal trade-off between capturing the link between urban layout and imperviousness by aggregating data into large enough pixels on the one hand and avoiding loss of information by blurring the dataset on the other.
Figure 5 shows the total area which was simulated as flooded above different water level thresholds. Results are compared for the baseline model and for a model where imperviousness was specified based on building footprint areas aggregated to a raster resolution $x_b$ of 200 m. The figure suggests that the model based on aggregated building data simulated fewer areas flooded with high water levels for the 20-year event. The reason was that this model did not consider the blockage of surface flow paths by buildings. The effect can also be seen by comparing the flood maps in the lower part of Fig. 5.

2D flood simulation (B)
For the 100-year event, similar total flooded areas were obtained for both models, which can be associated with the greater degree of water movement on the surface and, as a result, the filling of sinks in both models. However, the performance scores shown in Table 3 suggest that there was substantial disagreement between the two models in where flooding occurred. It was difficult to conclude how severely simulated flood maps deviated from the baseline in absolute terms because the performance scores were based on pixel-by-pixel comparisons and thus suffered from double-penalty issues.
For both return periods, the score values in Tables 2 and 3 suggest that the flood maps generated with models based on aggregated building data generally resembled the flood map from the baseline simulation without buildings. An increasingly coarse representation of imperviousness in the model thus had little impact on the simulated flood maps as compared to the effect caused by the blockage of flow paths in the baseline simulation.
A minor effect was noticeable in particular in the total simulated flood areas. Coarse building area resolutions implied that impervious areas would be distributed increasingly evenly over the catchment, leading to the distribution of effective precipitation over larger areas; surface flows with small water levels; and, as a result, fewer areas where water levels would exceed the threshold of 0.1 m. On the other hand, total impervious areas would be underestimated by the regression model for fine building datasets as a result of the regression specification without intercept. In fact, total impervious areas would be underestimated by 10 % with the 25 m building raster set, while the bias would exponentially decrease to under 1 % at a resolution $x_b$ of 300 m. These two competing effects implied that the flood maps obtained based on 25 m building raster data resembled the baseline best in the 20-year event, where runoff depths were comparably small and significant water depths only occurred due to an aggregation of impervious areas. For the 100-year event, raster sets with resolutions of 50 and 100 m yielded the best trade-off between avoiding an underestimation of impervious areas and ensuring sufficient spatial aggregation of impervious areas.
It needs to be emphasized that the effects discussed above were very minor compared to the impact of whether buildings were considered in the DEM applied for 2D simulation or not. The missing impact of increasingly coarse representations of imperviousness is likely to be linked to the fact that sewer systems were considered by reducing effective rainfall in a manner which was proportional to the imperviousness in a pixel (Eq. 5); i.e., the design of the assumed sewer system followed the distribution of impervious areas in space.

Damage assessment (C)
[...] not considered in the 2D simulation of surface flows. In general, the latter approach led to an underestimation of flood damage, because blocked flow paths in the baseline led to higher water levels. The figure also illustrates differences in the results obtained for the two damage frameworks. Considering an aggregation level of 400 m, we noticed individual pixels where damage derived using depth-damage curves (Beckers et al., 2013) was several times greater than for the threshold-based method (Olsen et al., 2015), while damage was of a similar magnitude on an aggregation level of 2000 m. In addition, the approach based on depth-damage curves was subject to stronger variations in and stronger underestimation of total damage. These effects were mainly caused by large commercial buildings which could induce very high damage values when water ponded next to these buildings in the baseline simulation, even though the flooded area would often be small. The threshold-based damage assessment was more robust towards such effects, because a unit damage would be assigned which depended on neither the building size nor the water level.
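To make the contrast between the two damage frameworks concrete, the following minimal Python sketch compares a depth-damage calculation with a threshold-based one for two hypothetical buildings. The curve points, the cost values, the damage threshold and all function names are illustrative assumptions; they are not the values or implementations used by Beckers et al. (2013) or Olsen et al. (2015).

```python
# Illustrative comparison of two damage frameworks (all numbers are hypothetical).

# Assumed piecewise-linear depth-damage curve: (water depth in m, damaged fraction).
CURVE = [(0.0, 0.0), (0.1, 0.1), (0.5, 0.4), (1.0, 0.7), (2.0, 1.0)]
MAX_DAMAGE_PER_M2 = 1000.0   # assumed maximum damage per m2 of building footprint
UNIT_DAMAGE = 100000.0       # assumed fixed damage per flooded building
DEPTH_THRESHOLD = 0.1        # assumed depth (m) above which a building counts as flooded


def damaged_fraction(depth):
    """Linearly interpolate the damaged fraction for a water depth in meters."""
    if depth <= CURVE[0][0]:
        return CURVE[0][1]
    for (d0, f0), (d1, f1) in zip(CURVE, CURVE[1:]):
        if depth <= d1:
            return f0 + (f1 - f0) * (depth - d0) / (d1 - d0)
    return CURVE[-1][1]  # depths beyond the last curve point: maximum fraction


def depth_damage(depth, footprint_m2):
    """Damage grows with both water depth and building size."""
    return damaged_fraction(depth) * footprint_m2 * MAX_DAMAGE_PER_M2


def threshold_damage(depth):
    """Unit damage that depends on neither building size nor water depth."""
    return UNIT_DAMAGE if depth > DEPTH_THRESHOLD else 0.0


# Hypothetical buildings: a small house and a large commercial building with ponding.
house = {"depth": 0.3, "footprint": 150.0}
warehouse = {"depth": 1.0, "footprint": 5000.0}

for name, b in (("house", house), ("warehouse", warehouse)):
    print(name, depth_damage(b["depth"], b["footprint"]), threshold_damage(b["depth"]))
```

With these assumed numbers, the depth-damage result for the large building is roughly 35 times the threshold-based value, whereas both buildings receive the same unit damage under the threshold-based rule. This mirrors the sensitivity to local ponding near large buildings described above.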

Damage regression (D)
Performance scores for damage regression models fitted based on building data with varying aggregation levels were summarized in Tables 2 and 3. The scores shown in the tables were derived considering a data resolution x fit of 1000 m.
The damage regression generally scored high values for COD D,CV2000 (median values obtained in cross validation) and only slightly biased total damage (DR tot ), suggesting that, at aggregation levels of 2000 m and above, the regression models were able to compensate for deviations in both the simulated flood area and aggregated representations of building exposure in the form of raster representations of building footprint areas. In addition, there was little difference in the regression scores when flooded building areas were derived using flood maps created based on the aggregated building data and when the baseline flood map was applied (comparing COD D,CV2000 and COD D,CV2000,BF ), supporting the statement above.
Neither of the above statements held for the cases where damage was derived based on the framework of Beckers et al. (2013) in the 20-year event. Similar to the observations in Sect. 4.2 and 4.3, this effect was tied to local ponding near large buildings in the baseline simulation and the associated large damage assigned by the framework of Beckers et al. (2013). COD D,CV2000,BF was much higher in these cases than COD D,CV2000, which underlines that the regression models were not able to reproduce damage simply because no or insufficient degrees of flooding were simulated in areas where major damage occurred.
Figure 7 illustrates how COD D,CV2000 varied when different data resolutions x fit were applied for regression model fitting and when different building data resolutions x b were considered both for parametrizing imperviousness in the surface flood simulations and for computing flooded building area as input to the regression models. As the computed score values were noisy (see Sect. 3.4), we have displayed smoothed lines (R function loess with parameter span = 0.25; Cleveland et al., 1992; R Core Team, 2018). True values were included as dots to illustrate the level of variation around the smoothed line. Values obtained for the best-performing building data resolution x b = 200 m were colored blue.
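The smoothing of noisy score values can be illustrated with a simplified LOESS-like smoother. The sketch below is a pure-Python stand-in that uses tricube-weighted local linear fits; R's loess fits local quadratic polynomials by default, so the result is not expected to match the R output exactly. The function name, the x fit values and the noisy scores are hypothetical and serve only to show the mechanics.

```python
import math


def local_linear_smooth(x, y, span=0.25):
    """Smooth y over x with tricube-weighted local linear fits (LOESS-like).

    Assumes distinct x values. `span` is the fraction of points used in each local fit.
    """
    n = len(x)
    k = max(3, math.ceil(span * n))  # number of neighbors defining the bandwidth
    fitted = []
    for x0 in x:
        # Bandwidth: distance to the k-th nearest point (including x0 itself).
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0  # fall back to the weighted mean
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical fitting resolutions (m) and noisy score values.
x_fit = list(range(100, 2100, 100))
noisy_scores = [min(1.0, xi / 1000) * 0.9 + (0.05 if i % 2 else -0.05)
                for i, xi in enumerate(x_fit)]
smoothed = local_linear_smooth(x_fit, noisy_scores, span=0.25)
```

Plotting `smoothed` as a line and `noisy_scores` as dots would give a display analogous to the one described above.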
Similar to the results obtained for impervious areas, a minimal data resolution x fit between 200 and 1000 m was required to properly identify the regression models, depending on the damage framework and the resolution of the building data considered. More surprisingly, building data with a resolution x b of 200 m consistently yielded high COD D,CV2000 values, while high-resolution building data only yielded high score values when damage was computed according to the threshold-based approach of Olsen et al. (2015).
For the framework of Beckers et al. (2013) and a return period of 100 years, Fig. 8 illustrates the damage computed in the baseline simulation and compares it to regression predictions generated using building data aggregated to raster resolutions x b of 25, 200 and 750 m. For a building data resolution of 25 m, substantial over- and underpredictions of damage were observed. These effects were mitigated when considering coarser building data with a resolution of 200 m, while the coarsest building dataset with a resolution of 750 m no longer captured the spatial variability in flood damage. Figure 2 illustrates simulated flood areas and building data for the pixels marked as Area 1 and Area 2 in Fig. 8. Similar damage was observed in the baseline simulation for both areas. However, the extent of the flooded area is very different in the two cases. In particular, only very small parts of the building overlap with the flooded area in Area 2 for a building data resolution x b of 25 m. For a resolution x b of 200 m, the spatial averaging of building areas leads to a lower value for the flooded building area in Area 1 and a higher value in Area 2, allowing for a better regression fit. Similar to the discussion in Sect. 4.3, this effect was less pronounced when flood damage was computed according to the framework of Olsen et al. (2015), because the threshold-based method was less prone to creating high damage in individual locations.
Finally, comparing values of COD D,CV2000 and DR tot for DMOD1 and DMOD2 in Tables 2 and 3, little difference could be observed between the two models. In fact, the more complex DMOD2 occasionally yielded lower scores, because more parameters needed to be identified. In addition, the flooded building areas for different level thresholds were correlated, because areas with large water depths would typically also be associated with greater flood extents in general (Fig. 2), and the additional variables thus yielded little additional information in the regression process.
Figure 7. COD D,CV2000 considering flood damage regression models (DMOD1) fitted at different data resolutions (x fit) and considering building data aggregated to different resolutions in meters (lines with varying colors). Lines were smoothed, while dots indicate the true COD D,CV2000 values derived from each combination of fitting resolution and building input data resolution. Dots were colored blue for a building data resolution of 200 m and grey otherwise.

Using aggregated building data for flood risk assessment
The results suggest that the consideration of aggregated building data affected both the simulation of flood hazards and the assessment of flood damage. In terms of the simulated flood hazards, the main effect arose from not considering the blockage of surface flow paths in the 2D flood simulations when considering aggregated building data. Coarse representations of imperviousness and the resulting change in rainfall-runoff behavior had little effect in comparison.
Despite the aggregation of building data, we were able to achieve realistic representations of flood exposure, as illustrated by the high COD D,CV2000 and DR tot values obtained during damage regression. Building data aggregated to resolutions x b on the order of 200 m yielded better regression performance than building data with finer resolutions when considering damage derived using the depth-damage-based framework of Beckers et al. (2013). Performance of the finer and coarser datasets was similar when considering damage derived based on the threshold-based framework of Olsen et al. (2015). These trends were independent of whether the damage regression used the baseline flood map or the flood map simulated based on aggregated building data. Slightly higher aggregation levels of the building raster sets can thus be considered beneficial for flood screening approaches, as they yield a more robust representation of flood exposure.
The damage regression yielded total damage estimates that, for a building data resolution x b of 200 m, differed between 1 % and 10 % from the baseline values. This was considerably better than the total damage values obtained in the baseline simulation where buildings were neglected in the 2D flood simulation but damage assessment was performed using building polygon data. This highlights the need for adjusting damage frameworks developed for high-resolution data to the actual modeling context.

Damage frameworks for pluvial flood risk assessment
The damage assessment approach based on depth-damage curves (Beckers et al., 2013) produced high, localized damage values where flow paths were blocked by large buildings. These situations were difficult to reproduce using aggregated building data, because it was not possible to simulate the local ponding of water, in particular for the smaller event.
It is questionable whether this damage assessment approach is reasonable for pluvial flood risk. It relies on modeled water depths that would be unlikely to occur in this form in reality, because the water would likely enter the building and distribute without causing major structural damage. Damage assessment approaches which are less sensitive to water depths may thus be preferable for pluvial flood risk assessment.
The issue could be mitigated by explicitly considering water flow through buildings in the surface flow model, which, however, poses technical challenges. Alternatively, robust regression approaches are likely to yield better results when performing damage regression in the presence of such issues.

Data resolution in the development of scaling approaches
Very clear dependencies on spatial scale could be identified when developing regression models that predicted impervious areas as a function of building footprint areas. The optimal data resolution x fit for developing these models was identified to be on the order of 400 m. For finer data resolutions, buildings would not necessarily be located in the same pixel as other impervious areas linked to the buildings (e.g., sidewalks), resulting in an underestimation of impervious areas by the regression models. For coarser resolutions, the data would gradually become too aggregated to properly identify the link between the different building types and imperviousness, leading to a stronger variability in the results during cross validation. Reliable predictions of imperviousness could be obtained at spatial scales x pred above 500 m (COD > 0.95).
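The underestimation at fine data resolutions can be reproduced with a toy example. The sketch below assumes a single building type, a regression through the origin, and a hypothetical layout in which each building has paved area amounting to 50 % of its footprint located in the neighboring pixel; the grids and function names are illustrative and do not correspond to the case study data.

```python
# Toy demonstration (hypothetical numbers): pavement linked to a building
# lies in the pixel next to it, not in the building pixel itself.

def block_sum(grid, factor):
    """Aggregate a 2D grid by summing non-overlapping factor x factor blocks."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    return [[sum(grid[r * factor + i][c * factor + j]
                 for i in range(factor) for j in range(factor))
             for c in range(cols)] for r in range(rows)]


def flatten(grid):
    return [value for row in grid for value in row]


def fit_through_origin(x, y):
    """Least-squares slope of y = slope * x (regression without intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def total_ratio(buildings, impervious, factor):
    """Predicted / observed total impervious area for a model fitted at one aggregation level."""
    x = flatten(block_sum(buildings, factor))
    y = flatten(block_sum(impervious, factor))
    slope = fit_through_origin(x, y)
    return slope * sum(x) / sum(y)


# Building footprint area per pixel (m2) on a 4 x 4 fine grid.
buildings = [[100, 0, 60, 0],
             [0, 0, 0, 0],
             [80, 0, 40, 0],
             [0, 0, 0, 0]]
# Total impervious area per pixel: footprints plus pavement one pixel to the right.
impervious = [[100, 50, 60, 30],
              [0, 0, 0, 0],
              [80, 40, 40, 20],
              [0, 0, 0, 0]]

ratio_fine = total_ratio(buildings, impervious, 1)    # fitted on fine pixels
ratio_coarse = total_ratio(buildings, impervious, 2)  # fitted on 2 x 2 blocks
print(f"fine: {ratio_fine:.2f}, coarse: {ratio_coarse:.2f}")
```

In this constructed case the fine-resolution fit recovers only about two thirds of the total impervious area, because the pavement pixels carry no building area and cannot influence a slope without intercept. Once building and pavement fall into the same block, the total is reproduced. The toy example does not show the increased variability at very coarse resolutions.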
In a similar manner, the performance of regression models for flood damage only reached acceptable levels when data resolutions x fit between 500 and 1000 m were considered during parameter estimation, depending on the level of aggregation of the considered building dataset. DR tot approached values near 1 only when data resolutions x fit of 1500 m and coarser were considered (see Fig. S6), suggesting that the data needed to be aggregated to such levels to counterbalance local variations in where flooding was simulated and which buildings were exposed to flooding.
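The counterbalancing effect of aggregation can be sketched with two simple scores. The definitions below are assumptions made for illustration: COD is taken as the coefficient of determination, 1 - SSE/SST, and DR tot as the ratio of total predicted to total observed damage. The damage grids are hypothetical and constructed so that damage is predicted in the wrong pixel but in the right neighborhood.

```python
def block_sum(grid, factor):
    """Aggregate a 2D grid by summing non-overlapping factor x factor blocks."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    return [[sum(grid[r * factor + i][c * factor + j]
                 for i in range(factor) for j in range(factor))
             for c in range(cols)] for r in range(rows)]


def flatten(grid):
    return [value for row in grid for value in row]


def cod(obs, pred):
    """Coefficient of determination (assumed definition: 1 - SSE / SST)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def damage_ratio(obs, pred):
    """Total predicted damage divided by total observed damage (assumed DR tot)."""
    return sum(pred) / sum(obs)


# Hypothetical damage per fine pixel (arbitrary units).
observed = [[4, 0, 2, 2],
            [0, 4, 2, 2],
            [1, 1, 6, 0],
            [1, 1, 0, 6]]
predicted = [[0, 4, 2, 2],
             [4, 0, 2, 3],
             [1, 1, 0, 6],
             [1, 1, 6, 0]]

cod_fine = cod(flatten(observed), flatten(predicted))
cod_coarse = cod(flatten(block_sum(observed, 2)), flatten(block_sum(predicted, 2)))
dr_tot = damage_ratio(flatten(observed), flatten(predicted))
print(f"COD fine: {cod_fine:.2f}, COD coarse: {cod_coarse:.2f}, DR tot: {dr_tot:.2f}")
```

At pixel level the misplaced damage is penalized twice, which yields a negative score, while the block-level score is close to 1 because the local errors cancel within each block. The total damage ratio does not depend on the evaluation scale.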

Limitations
We performed 2D surface flow simulations based on publicly available DEM data where buildings and plants were removed in an automated manner. Our results suggest that the simulated flood maps were very strongly affected by whether the blockage of flow paths through buildings was considered in the DEM or not. Remnants originating from the DEM cleaning process may affect this result and could be an explanation for the rather low performance scores of the simulations where buildings were not included in the DEM. For example, slight misalignments between building polygons and building locations in the DEM may result in artificial sinks in the baseline simulation which would not be possible to reproduce in simulations without buildings.
Our 2D flood modeling approach was a simplified representation of the urban water cycle. This approach was justified as our intention was to evaluate which spatial scales should be considered in the development of flood screening approaches. For detailed assessment of the risk we would recommend 1D-2D calculation methods to more accurately represent where flooding occurs in the catchment.

Generalization and application
The regression parameters for imperviousness are likely to depend on topography and urban layout (e.g., degree of urban creep and density of the urban developments). In addition, the optimal data resolution for identifying regression relationships is likely to depend on the urban layout, with coarser data resolutions being optimal in less densely developed cities. This implies that regression models can be transferred between cities with similar urban layout and topography, but in many cases it will be necessary to identify optimal spatial scales and model parameters for the specific case study.
For flood damage regression, optimal spatial scales and the identified regression models additionally depend on the approach which is used for calculating flood damage. Further, the level of damage incurred by a given extent of flooded area must be expected to depend on the location of sinks and flow paths in the specific case study and the degree to which urban planning was performed in a flood-aware manner. We thus expect that these regression models always have to be identified for the specific case study. Considering the impact of different approaches to land-use planning is an important line of future research in the development of flood screening approaches. This effect can be considered by training regression models on different datasets.
Based on the considerations above, we suggest the following work flow for developing a fast flood risk screening setup in a new case study:
1. Obtain vector-based building data and highly resolved imperviousness data from aerial imagery as base data characterizing the urban layout.
2. Perform hydrodynamic flood simulations (e.g., 1D-2D) for the case study to derive a baseline flood map and compute flood damage.
3. Train regression models for impervious area and identify a suitable data resolution x fit using cross validation as demonstrated in this paper (see computer code in Löwe, 2019).
5. Use the flood map and rasterized building data to train damage regression models. Identify suitable resolutions for training data ( x fit ) and building rasters ( x b ) using cross validation.
6. Apply the setup: simulate urban development in raster format; predict impervious area based on the simulated building areas; use predicted imperviousness for rainfall-runoff calculation in a fast flood simulation tool; and compute flood damage based on the generated flood map, simulated building areas and damage regression model.
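The selection of a fitting resolution by cross validation, as used in the work flow above, can be sketched as follows. This is a minimal illustration under stated assumptions and not the code published by Löwe (2019): it uses a single building type, a regression through the origin, two spatial folds split by rows, a synthetic grid in which pavement lies in the pixel next to each building, and hypothetical function names. Aggregation factors stand in for the resolutions x fit, and scores are evaluated at a fixed coarser scale.

```python
from statistics import median


def block_sum(grid, factor):
    """Aggregate a 2D grid by summing non-overlapping factor x factor blocks."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    return [[sum(grid[r * factor + i][c * factor + j]
                 for i in range(factor) for j in range(factor))
             for c in range(cols)] for r in range(rows)]


def flatten(grid):
    return [value for row in grid for value in row]


def fit_through_origin(x, y):
    """Least-squares slope of y = slope * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def cod(obs, pred):
    """Coefficient of determination (assumed definition: 1 - SSE / SST)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def cv_score(buildings, impervious, f_fit, f_eval, n_folds=2):
    """Median held-out COD at scale f_eval for a model fitted at scale f_fit."""
    fold_rows = len(buildings) // n_folds
    scores = []
    for k in range(n_folds):
        test = set(range(k * fold_rows, (k + 1) * fold_rows))
        train_b = [row for r, row in enumerate(buildings) if r not in test]
        train_i = [row for r, row in enumerate(impervious) if r not in test]
        test_b = [row for r, row in enumerate(buildings) if r in test]
        test_i = [row for r, row in enumerate(impervious) if r in test]
        slope = fit_through_origin(flatten(block_sum(train_b, f_fit)),
                                   flatten(block_sum(train_i, f_fit)))
        x_eval = flatten(block_sum(test_b, f_eval))
        y_eval = flatten(block_sum(test_i, f_eval))
        scores.append(cod(y_eval, [slope * x for x in x_eval]))
    return median(scores)


# Synthetic 8 x 8 grid: buildings in even columns, pavement (50 % of the
# footprint) in the pixel to the right. All values are hypothetical.
buildings = [[20 * (1 + (r + c) % 3) if c % 2 == 0 else 0 for c in range(8)]
             for r in range(8)]
impervious = [[buildings[r][c] + (0.5 * buildings[r][c - 1] if c % 2 == 1 else 0)
               for c in range(8)] for r in range(8)]

candidates = [1, 2, 4]  # aggregation factors standing in for x fit
scores = {f: cv_score(buildings, impervious, f, f_eval=2) for f in candidates}
best = max(candidates, key=scores.get)
print(scores, "selected factor:", best)
```

In this noise-free toy setting all sufficiently coarse factors score equally and the first one is selected. With real data, the variability across folds would be expected to grow at very coarse resolutions, which is why an intermediate resolution is selected in practice.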

Conclusions
We studied how different data resolutions affect the identification of empirical relationships between building data and urban hydrology and at which spatial scales reasonable predictions could be obtained. Based on our results, we draw the following conclusions:
1. The identification of empirical relations between urban layout and urban hydrology is subject to a bias-variance tradeoff. Too fine spatial data resolutions prevent the identification of empirical relationships and lead to biased results, while too coarse resolutions reduce the number of data points and blur out spatial variations, leading to uncertainty in the estimated relationships. The optimal data resolutions are expected to vary for different topographies and urban layouts and must thus be evaluated in the specific case study.
2. Simulated pluvial flood hazards are strongly affected by whether surface flow simulations consider the blockage of flow paths through buildings and less by spatially averaged representations of imperviousness during rainfall-runoff calculations.
3. Water levels are underestimated if local ponding near buildings is not considered in the surface flow simulations, as would be the case when considering aggregated building data as input. Without correction, this effect also leads to an underestimation of flood damage.
4. A simple regression model predicting flood damage in an area as a function of the extent of flooded building area can, to some extent, compensate for deficiencies in the simulated flood area. Building data aggregated to resolutions on the order of 200 m were the preferred input in our case study and performed more robustly than building data with finer resolutions, because they reduced local extrema in flooded building areas.
5. Regression models for flood damage must be expected to depend on whether flood-aware spatial planning was applied in the case study used for model training or not. Different models must thus be trained to consider different land-use management strategies.
6. Local ponding next to large buildings can create rather large water levels in simulations of pluvial flood risk that may be unrealistic. Damage assessment frameworks where damage increases as a function of water levels are vulnerable to this type of error, which is specific to pluvial flood risk.
Code and data availability. Computer code for fitting regression models for imperviousness and flood damage was made available by Löwe (2019, https://doi.org/10.11583/DTU.8863766). Building and imperviousness data were proprietary. These datasets and the derived 2D flood models were therefore not made available. Upon request, the authors will attempt to obtain permission for sharing this data.