A statistical model to estimate the local vulnerability to severe weather

We present a spatial analysis of weather-related fire brigade operations in Berlin. By comparing operation occurrences to insured losses for a set of severe weather events we demonstrate the representativeness and usefulness of such data in the analysis of weather impacts on local scales. We investigate factors influencing the local rate of operation occurrence. While this rate depends on multiple factors, many of which are often not available, we focus on publicly available quantities. These include topographic features, land use information based on satellite data and information on urban structure based on data from the OpenStreetMap project. After identifying suitable predictors such as housing coverage or local density of the road network we set up a statistical model to predict the average occurrence frequency of local fire brigade operations. Such a model can be used to determine potential “hotspots” for weather impacts even in areas or cities where no systematic records are available and can thus serve as a basis for a broad range of tools or applications in emergency management and planning.


Introduction
The Sendai Framework for Disaster Risk Reduction 2015-2030 of the United Nations (UNISDR, 2015) states that the implementation of effective disaster risk reduction measures should be based on an understanding of disaster risk in all its dimensions of vulnerability, capacity, exposure of persons and assets, hazard characteristics and the environment. On local and national levels, this requires systematically evaluating, recording, sharing and publicly accounting for disaster losses to gain an understanding of the impacts in the context of event-specific hazard, exposure and vulnerability information.
While insurance records are a very useful data source and have been used in many analyses of regional weather impacts, their availability is generally limited due to economic interests of insurance providers. Records of local emergency managers (first responders) offer great potential as an alternative database for analysing weather impacts, particularly on local scales. While such records often exist, they mostly lack a systematic and homogeneous data format and quality standards. The definition of such data standards must be regarded as a key prerequisite for scientifically addressing disaster losses as required within the Sendai Framework.
Relating emergency call data to extreme weather, most studies analyse ambulance operation data or emergency department visits in the face of temperature extremes, in particular extreme heat (Bassil et al., 2005; Dolney and Sheridan, 2008; Schaffer et al., 2012; Thornes et al., 2014). Wargon et al. (2009) reviewed studies concerned with the modelling and forecasting of emergency department visits. They found that the number of patient visits at emergency departments or walk-in clinics can be modelled with rather good performance. Mostly based on predictors such as the day of the week or season, these models explain between 31 and 75 % of patient-volume variability. However, including meteorological data apparently failed to improve model performance (Wargon et al., 2009). More recent studies, however, find that weather factors such as temperature and humidity play a role in the demand for ambulance services and demonstrate that including weather forecast data can in fact improve forecasts of daily ambulance demand (Wong and Lai, 2014).
There have been only a few studies making use of spatial information of emergency operation data (i.e. the location of an assistance request) in relation to severe weather events. Two studies by Schuster et al. (2005) and Rossi et al. (2013) compared emergency call data with radar reflectivity data for a severe hailstorm event and found a satisfactory representation of the hailstorm path in the density of emergency calls on the ground. Other studies have tried to utilize similar data, but they faced problems concerning the availability of accurate data. As described in Busch (2008), problems can occur in the case of catastrophic events since the archiving of fire brigade operation data is often limited in such cases. In particular, this means that spatial information on the individual location of operations is not archived, hindering spatial analyses for these events. Pardowitz and Göber (2016) have demonstrated, similar to the studies mentioned above, that a satisfactory correspondence between radar reflectivity for severe thunderstorm events and locations with occurrences of fire brigade operations can be found. However, this occurrence is strongly modulated by other factors such as the density of buildings. This confirms the common understanding that the occurrence and magnitude of impacts are determined by the simultaneous existence of a hazard and vulnerability to this hazard.
Approaches to address the local vulnerability have been developed in flood impact modelling. Apel et al. (2009) and Jongman et al. (2012) describe different modelling approaches to estimate economic damages for flood events (particularly the 2002 flood event in Saxony). Based on data from digital elevation models (DEMs), local damages are estimated as a function of inundation depth. Furthermore, such depth-damage relations can be differentiated, e.g. by considering information on land use, to account for variable vulnerabilities.
In this study we focus exclusively on the estimation of predictors describing the local vulnerability and exposure. We thus neglect temporal variations of weather parameters and investigate the long-term averaged occurrence frequencies (which can be regarded as the equivalent to a local "climatology") of fire brigade operations. We also test whether it is possible to predict these long-term occurrence frequencies for areas in which we might not have actual operation records available.
Based on a new data set of fire brigade operations in Berlin for the period of 2002-2012, this study aims at assessing the latter, namely the vulnerability against hydrometeorological hazards. In a first step we analyse this new data set, particularly with respect to the question of how these impacts are related to building damages induced by windstorms and thunderstorms. It needs to be noted that neither insurance loss data nor the archive of fire brigade operations comprehensively describes all possible weather impacts. Instead it is important to investigate the different causes of weather impacts included in the individual data sets. Within the metropolitan area of Berlin, we then aim to identify factors describing the local vulnerability and thus influencing the local risk for weather impacts as given by the fire brigade operations. Potential factors include topographic features, land use information based on satellite data and information on urban structure based on data from the OpenStreetMap (OSM) project. After identifying suitable predictors such as housing density or local density of the road network we set up a statistical model to predict local operation densities. Such a model can be used to determine potential "hotspots" for weather impacts even in areas or cities where no systematic records are available.
The remainder of this paper is structured as follows. Section 2 describes the various data sets that are used to describe impacts as well as potential predictors for vulnerability. Methodological steps and modelling approaches are described in Sect. 3, while results are shown in Sect. 4. Finally, Sect. 5 provides a discussion of results as well as the major conclusions that can be drawn from this study.

Fire brigade operations
A data set provided by the Berlin fire brigade is analysed, comprising weather-related fire brigade operations for the period 2002-2011. The data set contains location and time of alerts, as well as keywords associated with each operation indicating the type of operation. Keywords indicate "water-related" operations, "tree-related" operations, "traffic obstructions", operations related to "construction elements", operations due to "ice and snow" and a few operations associated with other keywords. The keywords are assigned operationally by the Berlin fire brigade and reflect the type of (technical) operation the fire brigade had to handle. While "water-related" operations consist of flooded basements or other incidents requiring the disposal of water, the keyword "tree-related" refers to operations in which windthrow had to be handled. The keyword "traffic obstruction" includes all operations dealing with the removal of obstacles to restore traffic, while "construction element" refers to the removal of damage caused by destroyed construction components. It needs to be noted that individual keywords might overlap in the sense that more than one keyword applies. However, for the present analysis we focus on the primary keywords assigned to an individual operation. Additional details on the usage of keywords of the Berlin fire brigade can be found in Kox et al. (2015).
Total counts of weather-related fire brigade operations in the period 2002-2011 amounted to slightly above 10 000 per year. This is about 27 % of all operations of the Berlin fire brigade, which, according to the annual reports, amounted to about 37 000 operations per year in the same period. In comparison, fire extinction operations (about 7500 per year) accounted for about 20 % of all operations. Note that ambulance call outs (∼ 245 000 per year) and false alarms (∼ 31 000 per year) have been disregarded here. Most weather-related operations are due to water damages (33 %). Traffic obstruction accounted for 25 % of operations and tree-related operations for about 17 % (Table 1). Operations related to construction elements accounted for about 14 % and ice- and snow-related operations for 2 %. Some other keywords (individually accounting for 1 % or less each) were used, which sum up to about 8 %. Stratifying by season shows that, in total, operations are equally distributed over the winter (October-March) and summer (April-September) half years. This division into half years is made primarily to best discriminate between thunderstorm and winter storm impacts (see e.g. Donat et al., 2011). The individual types of operations, however, partly show distinct differences between summer and winter (Table 1). In particular, tree-related operations occur mainly in summer (73 %), while ice- and snow-related operations naturally occur exclusively in winter.

Building loss data
Insurance data on windstorm and thunderstorm losses to residential buildings were provided by the German insurance association (Gesamtverband der Deutschen Versicherungswirtschaft e.V., GDV). Berlin-wide damages are available on a daily basis for the period 1997-2011, while data at the zip code level (190 within Berlin) are available for a small selection of events only. Direct (liquid) precipitation damages as well as flooding damages are not part of the available data set even though they might be highly relevant in the case of severe precipitation related to thunderstorm events. Still, for investigations of severe weather events, particularly small-scale events such as thunderstorms, insurance loss data are extremely valuable. However, difficulties arise when interpreting the insurance data since the data set does not allow a direct attribution of losses to their cause (i.e. hail or windstorm induced). In addition, faulty attribution of individual insurance claims (both temporal and spatial) can cause inaccurate loss figures. For example, the exact day on which damage occurred is unknown in some cases. In addition, if damage occurs at a house managed by a real estate company, the insurance claim might be attributed to the location of the company's administrative centre instead of the actual origin. For the set of events for which insured loss data are available at the zip code level, an evaluation of the spatial patterns and a comparison to the occurrences of fire brigade operations can be made. Wapler et al. (2015).
The data set comprises area-wide coverage of losses due to windstorm and thunderstorm. Also, to address the temporal correlation of different impacts, Berlin-wide losses are analysed and compared to total operation counts within Berlin.
According to the insurance loss records (covering windstorm and hail damages), EUR 8.12 million in damage is recorded for Berlin per year. The temporal distribution is rather balanced, with 48 % of damage occurring in summer and 52 % in winter. While most damage in winter is related to intense winter windstorms (Klawa and Ulbrich, 2003; Pinto et al., 2010; Donat et al., 2011), a large share of summer damage is due to thunderstorms and in particular due to hailfall (Aller and Kozlowski, 2008; Kunz and Puskeiler, 2010).

OpenStreetMap data
Data from the open-source project OSM (www.openstreetmap.org, last access September 2016) are used to derive predictors for local vulnerability. In particular, we analyse georeferenced information on individual buildings (including their location and extent) as well as information on road networks. As a first predictor, the number of buildings per grid cell on a regular 1 × 1 km grid is derived. Also, by including information on housing extent, the fraction of the grid cell covered by buildings is calculated. As discussed later, even though these quantities are highly correlated, both predictors should be considered to distinguish between the high-density city centre with very large buildings and suburban areas with large numbers of detached houses.
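The two building predictors can be illustrated with a minimal Python sketch. It assumes that buildings are already available as centroid coordinates (in km, in a projected coordinate system) together with their footprint areas, and it assigns each building to the grid cell containing its centroid; this simplification, the function name and the toy data are illustrative assumptions and not taken from the original analysis.

```python
from collections import defaultdict

CELL_KM = 1.0  # grid spacing in km (1 x 1 km cells, as in the text)

def building_predictors(buildings, cell_km=CELL_KM):
    """Return {cell: (building_count, coverage_fraction)}.

    `buildings` is an iterable of (x_km, y_km, footprint_km2) tuples.
    Each building is assigned to the cell containing its centroid
    (assumption: footprints are not split across cell borders).
    """
    count = defaultdict(int)
    area = defaultdict(float)
    for x, y, footprint in buildings:
        cell = (int(x // cell_km), int(y // cell_km))
        count[cell] += 1
        area[cell] += footprint
    cell_area = cell_km * cell_km
    return {c: (count[c], area[c] / cell_area) for c in count}

# Hypothetical toy data: many small houses versus one large block
suburb = [(0.1 * i, 0.5, 0.0001) for i in range(1, 10)]  # 9 houses, 100 m^2 each
centre = [(1.5, 0.5, 0.02)]                              # one 20 000 m^2 building
result = building_predictors(suburb + centre)
```

In this toy example the suburban cell has the higher building count while the central cell has the higher coverage, which is the distinction the two predictors are meant to capture.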
Additionally, the density of the road networks is considered by calculating the total length of road segments within a 1 × 1 km grid cell (specified as a length per grid cell area, thus km km⁻²). The OSM data set contains a classification of the road networks (the major categories being highway, primary, secondary and tertiary road networks), which is why road densities can be assessed individually for these classes. It would be valuable to add the population density as a predictor as well. However, population density is not freely available at the spatial resolution required in this analysis (1 km). Freely available data sets include the CIESIN global gridded data set (with a resolution of about 5 km) or that of the German Federal Statistical Office (DESTATIS), which is available on a district level only.

CORINE land cover data
The CORINE (Coordination of Information on the Environment) land cover (CLC) data set provides European-wide information on land cover and land use, based on a unified classification of the most important types of land usage (CEC, 1994; Bossard et al., 2000; Büttner et al., 2012). More specifically, we used CLC2006, which is based on SPOT-4/5 and IRS P6 LISS III satellite data. The geometric accuracy of the satellite images is specified to be better than 25 m and the resulting minimum mapping units within CLC are specified to be 25 ha, with the geometric accuracy of the CLC data being better than 100 m. In total, 44 land usage classes are used in CLC2006 as subcategories of the main land usage types: "artificial surfaces", "agricultural areas", "forest and seminatural areas", "wetlands" and "water bodies". More details on CLC2006 can be found in Büttner et al. (2012). The original data consist of polygon data in the form of shape files, which have been processed to calculate land use characteristics on a grid-point basis. For this, the area fractions of all 44 CLC types (adding up to 100 %) are calculated on a specified grid. Here we use a regular longitude-latitude grid with a 1 × 1 km resolution. These gridded fields of the area fraction are then used as predictors in the following analyses.

Data from DEM
Data from the digital elevation model dgm200 (GeoBasis-DE / BKG 2016) are also used to derive orographic height and slope. Original data have a horizontal resolution of 200 m and are available for Germany. Alternatively, GTOPO30 has been used, which has a lower horizontal resolution of 30 arcsec (approximately 1 km) but is available globally. The data are used to derive orographic height as well as the slope on a regular 1 × 1 km grid for Germany. In the case of dgm200, which has a finer resolution compared to the target grid, orographic height is calculated as the average height over all original 200 m × 200 m grid boxes within a target grid box. In the case of GTOPO30, orographic height on the target grid is determined by means of a nearest neighbour remapping. Since differences for the Berlin region were negligible, dgm200 has been used in the following. The slope is calculated according to the algorithm proposed by Horn (1981). The algorithm also assesses the aspect, which, in further studies, might be considered as an additional vulnerability predictor. Since the area for which the vulnerabilities are analysed is limited to Berlin (featuring no considerable height variations), topographic features play only a minor role here. In future studies including other investigation areas, however, topographic features may be more important to consider.
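For illustration, the slope estimate of Horn (1981) at the centre of a 3 × 3 window of elevation values can be sketched in Python as follows. This is the commonly used weighted finite-difference form of the algorithm; the function name and the example windows are illustrative assumptions, and the sketch does not reproduce the processing chain actually used for dgm200.

```python
import math

def horn_slope_deg(window, cell_size):
    """Slope in degrees at the centre of a 3 x 3 elevation window.

    `window` is [[a, b, c], [d, e, f], [g, h, i]] with row 0 the northern
    row; `cell_size` is the grid spacing in the same unit as the heights.
    The centre value e does not enter the estimate.
    """
    (a, b, c), (d, _e, f), (g, h, i) = window
    # Weighted differences: the middle row/column gets double weight
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))

# Hypothetical examples: a flat window and a plane rising 1 m per 1 m eastwards
flat = horn_slope_deg([[5, 5, 5], [5, 5, 5], [5, 5, 5]], cell_size=200.0)
ramp = horn_slope_deg([[0, 1, 2], [0, 1, 2], [0, 1, 2]], cell_size=1.0)
```

The flat window yields a slope of 0° and the unit ramp yields 45°, which serves as a quick sanity check of the weighting.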

Comparison of fire brigade operations and building damage data
To assess whether representative spatial information on weather impacts on a sub-city scale can be derived from the data set, and whether there is a temporal correspondence between daily damages and operation numbers, building loss data and fire brigade operations are compared.
While insured losses on residential housing comprise specific impacts caused by windstorms and thunderstorms, fire brigade operations can be caused by additional meteorological phenomena such as flooding (which is not included in the loss data set) or impacts due to freezing rain or road icing. The aim of this comparison is to identify how specific categories of fire brigade operations (e.g. tree-related operations) are related to the wind and hail impacts as described by the insured loss data. It can be expected that other categories (e.g. operations due to road icing) will not relate to insured losses.
For a set of events, including two convectively driven summer events and four windstorm events, a qualitative and quantitative comparison is performed between the spatial patterns of building damages and the occurrence of fire brigade operations. This is done by calculating total operation counts for zip code areas (190 within Berlin) for each of the six events. Besides total operation numbers, counts for operations related to individual keywords are assessed. Resulting maps are compared and spatial correlations calculated. Spatial correlations (i.e. measuring the correlation of spatial variations amongst zip codes) are calculated using the Pearson correlation. Since it cannot be assumed that all considered parameters are Gaussian distributed, the difference in results was tested using the Spearman rank correlation. It was found that results are not qualitatively affected by the chosen correlation method. Correlations are tested for significance using the t distribution of the test statistic derived from Pearson's product-moment correlation. Significance of correlations is assessed by considering the resulting p values. Daily total operation counts for Berlin are furthermore compared to Berlin-wide damages, which are available on a daily level for the period of 2002-2011. Temporal correlations to daily building damages are calculated, again for both total counts and counts for operations related to individual keywords.
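The correlation measures can be sketched in plain Python as follows. The sketch computes the Pearson correlation, the Spearman rank correlation (as the Pearson correlation of average ranks) and the standard test statistic t = r sqrt((n − 2) / (1 − r²)), which under the null hypothesis of no correlation follows a t distribution with n − 2 degrees of freedom. Function names and the toy numbers are illustrative assumptions, not data from the study.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def t_statistic(r, n):
    """Test statistic for H0: no correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical per-zip-code values: insurance claims and operation counts
claims = [1, 2, 3, 4, 5]
operations = [2, 4, 5, 4, 5]
r_pearson = pearson(claims, operations)
r_spearman = spearman(claims, operations)
```

The p value then follows from the t distribution with n − 2 degrees of freedom, for which a statistics library would typically be used.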

Spatial correlation between potential vulnerability predictors and patterns of operation occurrences
To identify predictors for vulnerability, a spatial correlation analysis between numerous quantities derived from the different geospatial data sets and gridded operation densities is performed. Variables include gridded densities of man-made structures (buildings, streets), topographic features (height, slope) and land use information. The latter is pre-processed such that the area fraction of a specific land use type (as specified in the CORINE data set) within each 1 × 1 km grid box is given. Again, spatial correlations are assessed using either operations of one specific category or operations irrespective of their type. As described above, correlations are calculated using the Pearson correlation. Again, using the Spearman rank correlation did not qualitatively affect the results. Significance of correlations is assessed as described in the previous section.

Multiple linear regression model
On the basis of the set of potential vulnerability predictors (as listed in Table 3), a statistical model is set up based on multiple linear regression to analyse the predictability of the spatial distribution of (long-term) operation occurrence rates. Such a model could potentially be used to identify "hotspots" in the local occurrence of operations in areas where no explicit data on operations are available and might be highly relevant in terms of long-term planning of capacities for effective emergency management. In the following, three different types of models are addressed: a linear model, a logarithmic variant (assuming a log-normal distribution of the predictand, modelling the logarithm of operation density) and a Poisson model (typically used to model count variables). To provide robust results and prevent overfitting of the data, an appropriate subset of variables must be chosen from the set of available predictors. This is particularly important since some predictors are highly correlated amongst each other, which is referred to as multicollinearity (Belsley et al., 1980). Even though multicollinearity does not reduce the predictive power of the model, it may strongly affect the interpretation of individual regression coefficients of predictors containing mutual information. Besides preventing overfitting, reducing the number of (correlated) predictors is thus also desirable for a better interpretation of the resulting regression coefficients.
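In generic notation, the three model variants can be written as shown below. The symbols are illustrative assumptions and do not appear in the original: y_i denotes the operation density and N_i the operation count in grid cell i, x_ij the value of predictor j in that cell, β_j the regression coefficients and ε_i a Gaussian error term. The logarithmic link for the Poisson model is the usual choice and is likewise assumed here.

```latex
\begin{align}
  \text{linear:}      \quad & y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i, \\
  \text{logarithmic:} \quad & \ln y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i, \\
  \text{Poisson:}     \quad & N_i \sim \mathrm{Poisson}(\mu_i), \qquad
                              \ln \mu_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}.
\end{align}
```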
To do so we chose an iterative procedure which, starting from an initial model, removes or adds predictor variables stepwise. Which predictor to add to (or remove from) the list of predictors in the model is decided in each iterative step by minimization of the Akaike information criterion (AIC; see Akaike, 1985). The basic idea is to assume a certain penalty for each (additional) predictor within the model. This penalty needs to be balanced against the resulting goodness of the model fit (e.g. by means of R²), leading to an optimization problem between the total penalty and fit quality. The algorithm converges if no predictor can be added or removed to further optimize the model in terms of the AIC. To perform this optimization procedure, the weight of the penalty can be varied by means of the parameter k. While k = 2 corresponds to the classical AIC, higher k results in an increased penalty for additional predictors. Different choices of k will ultimately lead to different optimized models, including more (fewer) predictor variables if k is lower (higher).
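The role of the penalty weight k can be made explicit with the generalized form of the criterion. The notation is an illustrative assumption: L̂ is the maximized likelihood of the fitted model, p the number of estimated parameters, n the number of grid cells and RSS the residual sum of squares.

```latex
\begin{align}
  \mathrm{AIC}_k &= -2 \ln \hat{L} + k\,p, \\
  \mathrm{AIC}_k &= n \ln\!\left(\frac{\mathrm{RSS}}{n}\right) + k\,p + \text{const}
  \qquad \text{(linear model with Gaussian errors).}
\end{align}
```

Lower values indicate the preferred model. An additional predictor is therefore only retained if it reduces the first term by more than k, which is why larger k leads to sparser models.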

Model validation methodology
To assess the predictive skill of the optimized model, a cross-validation is set up. For this, the area of Berlin is divided into four sectors. Then the model, using the set of predictors identified by means of the iterative procedure described above, is fitted four times, each time using all grid points within three of the four sectors. Each model fit is then used for predicting the operation density for grid points within the fourth sector. In this way predictions are obtained for data that have not been used for model fitting. Calculating the mean square error of these model predictions in comparison to observed operation density values results in the cross-validation error, which is used as the criterion for predictive model skill. The optimal choice of k in the iterative optimization procedure described above is not known up front. Different choices of k lead to differing numbers of predictor variables. Thus, to find the optimal model for predictive purposes, we vary k and compare the predictive model skill of the resulting models. The optimal choice is found by maximizing the predictive model skill, i.e. minimizing the cross-validation error.
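The leave-one-sector-out scheme can be sketched in Python as follows. How the four sectors are delineated is not specified in the text; the sketch assumes quadrants around a chosen centre point, and it uses a single-predictor least-squares fit as a stand-in for the full regression model. All names and the toy data are hypothetical.

```python
def sector_of(x, y, cx, cy):
    """Index 0..3 of the quadrant of (x, y) relative to the centre (cx, cy)."""
    return (1 if x >= cx else 0) + (2 if y >= cy else 0)

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in for the full model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def sector_cv_error(cells, cx, cy):
    """Mean square error of predictions for held-out sectors.

    `cells` is a list of (x_km, y_km, predictor, density) tuples.
    """
    squared_errors = []
    for held_out in range(4):
        train = [c for c in cells if sector_of(c[0], c[1], cx, cy) != held_out]
        test = [c for c in cells if sector_of(c[0], c[1], cx, cy) == held_out]
        a, b = fit_line([c[2] for c in train], [c[3] for c in train])
        squared_errors += [(c[3] - (a + b * c[2])) ** 2 for c in test]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical toy grid: density is an exact linear function of the predictor
cells = []
for ix in range(-2, 2):
    for iy in range(-2, 2):
        predictor = float(ix * ix + iy)
        cells.append((ix + 0.5, iy + 0.5, predictor, 2.0 + 3.0 * predictor))
cv_error = sector_cv_error(cells, 0.0, 0.0)
```

Repeating this calculation for models selected with different values of k and keeping the model with the smallest error corresponds to the selection procedure described above.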

Comparison of fire brigade operations and building damage data
Daily operation counts in the period 2002-2011 for the whole of Berlin are correlated to daily building losses in Berlin. Correlations are calculated for total operation counts as well as counts for operations associated with individual alert keywords, additionally stratified by season. The aim of this analysis is to identify how specific categories of fire brigade operations (e.g. tree-related operations) are related to the wind and hail impacts as described by the insured loss data. Particularly because impact data cannot be directly related (e.g. missing flood damages in the insured loss data), it is valuable to analyse the relationships to gain an understanding of the events causing fire brigade operations, which is not readily available. However, the interpretation of correlation results is difficult, because wind, hail and precipitation may occur simultaneously for both winter storms and thunderstorms. It is not directly clear whether a certain correlation means that both data sets contain impacts due to the same meteorological factor (e.g. wind) or whether correlations are due to the simultaneous occurrence of multiple meteorological factors.
Highest correlations are found between tree-related operations and building damages, particularly in winter (0.74). In addition, operations associated with the alert keyword "construction element" show rather high correlations to building damages (0.67). In both cases, winter correlations are higher, which indicates that a large share of these operations are caused by severe wind gusts. Counts of water damage operations in summer do not show any correlation to building losses, which is due to the fact that flooding damages are not contained in the insurance data set available. In winter, however, considerable correlation is found (0.41). It can be assumed that this correlation arises because water-related operations in winter often occur in conjunction with large-scale storm events, which would indicate that precipitation impacts coincide with wind impacts on housing. Correlating tree-related with water-related operations gives further weight to this assumption. While correlation is considerable in winter (0.25), there is low correlation in summer (0.08). Similar results are found when correlating operations related to the keyword "construction elements" with water-related operations. Thus, operations caused by severe winds (tree-related and construction elements) and water-related operations seem to occur mostly independently in summer, while in winter they seem to coincide more often. However, the low correlation between summer damage and water-related operations is still surprising. The fact that flooding damage to housing is not included in the loss data set obviously leads to a non-existing correlation when regarding effects due to rainfall only. Since thunderstorm events are often related to severe precipitation and in some cases to hail, a certain correlation between hail-induced building damage and water-related operations in summer would be expected. The fact that no correlation is found might in turn indicate that either hailfall is too rare to produce a significant effect or hailfall impacts do not play a major role for the occurrence of operations.
Spatial patterns of insured losses and operation occurrences were compared for a set of four windstorm events (Kyrill, Emma, Xynthia and Lothar07) and two convectively driven summer events (Aram and Gunnar). A visual comparison of impacts for the winter storm Kyrill (17-19 January 2007) and the thunderstorms related to the frontal passage of Gunnar (22 June 2011) can be found in Fig. 1. In general, a rather good agreement in the patterns of the number of operations per zip code area and the number of insurance claims is found. For Kyrill, both data sets show considerably higher impacts in the south of Berlin, while central and some northern parts of Berlin featured lower impacts. It can be argued that the size of the zip code areas, which is not homogeneous, has an influence (particularly large areas are found in the south, while particularly small areas are found in central Berlin). However, the consideration of relative numbers (normalizing by the zip code area) did not alter the qualitative findings. For the thunderstorms related to the frontal passage of Gunnar, spatial patterns also show considerable agreement. Affected areas are considerably larger when considering fire brigade operations, while building damages are more concentrated on individual zip code areas. This might be related to localized hailfall that led to localized occurrence of building damage, while precipitation and wind gusts were more widespread, leading to water-related operations and wind-induced tree fall in larger areas. A spatial correlation analysis is performed, correlating the number of insurance claims and the number of operations within each zip code area. We found that using different quantities (e.g. damage ratio and normalized operation densities) does not qualitatively influence the correlations. Also, it must be kept in mind that these spatial correlations are evaluated only for individual events and may thus not be generalized. Resulting spatial correlations for the six events are given in Table 2.
Most prominently, significant correlations are found for tree-related operations in relation to building damages. This suggests that tree-related operations mostly represent wind-induced tree fall, which relates directly to wind-induced building damages. For some events (Kyrill, Lothar07 and Aram), considerable correlation is found for water-related operations, while for the others there is no correlation at all. While no direct water-induced damages are included in the loss data set, there might be an indirect relation. For a specific event, severe precipitation might coincide with hailfall, which can induce damages. For Lothar07 and Aram there are confirmed hail observations in Berlin or surrounding areas. For Kyrill a study indicates that there was thunderstorm activity during the frontal passage, which might have been related to hailfall (Fink et al., 2009). The authors also note that the severe precipitation could have increased damages. This might in turn explain why for Kyrill, Lothar07 and Aram, correlations for tree-related operations and building damages are particularly high.
Thus, the relationship between building damages and fire brigade operations is far from being one-to-one. Spatial patterns can be found to show considerable agreement in some cases, but individual impacts (i.e. different categories of operations) are generated by multiple meteorological variables (severe gusts, precipitation and hail) or even by a complex interplay of those variables. Additional factors might distort the relationship between insured damages and operations. These include the fact that in the case of major events both insurers and emergency services might alter their usual procedural strategies. For instance, insurers forego detailed plausibility checks for individual damage reports in the case of cumulative loss events. Also, emergency services request the public to handle non-life-threatening damage by themselves in certain situations to relieve the workload of first responders. Both reasons might have contributed to the fact that for Kyrill an extremely high insured loss has been recorded (about 10 times higher compared to Lothar07) while the number of fire brigade operations is not as exceptional (comparable to Lothar07).

Spatial correlation between potential vulnerability predictors and patterns of operation occurrences
Patterns of average operation densities (represented by the number of operations per square kilometre and per year) are calculated on a 1 × 1 km grid (Fig. 2).Considering all operations (Fig. 2a), distinct spatial variations can be observed.
In general, high densities are found in central areas of Berlin, while the outskirts feature low densities. However, numerous additional spatial variations can be found, such as particularly low operation densities in less densely settled (or unsettled) areas such as the Grunewald and areas in the south-east of Berlin. Distinct local minima in operation densities are also found in central parts of Berlin, e.g. for the zoo or the former airport Tempelhof. Considering individual alert keywords shows that the spatial patterns of operation densities vary considerably. While water-related operations show a spatial pattern rather similar to that of all operations, operations related to traffic obstructions or tree fall are distributed rather differently. Both are distributed more broadly over the area of Berlin and do not feature the distinct concentration in the centre. Furthermore, for operations related to traffic obstructions a concentration of emergency operations near important junctions can be found (Fig. 2c). For tree-related operations, it seems that maxima of operation occurrence are not found in forest areas themselves but rather at their borders with housing areas (e.g. compare the border areas of the Grunewald in Fig. 2d). This is not unexpected, since major impacts due to tree fall are not expected in wooded areas but rather in areas where trees are present in the direct vicinity of man-made structures (e.g. roadside trees or trees in recreational areas). This implies that only in very few cases can vulnerabilities to (meteorological) hazards be modelled in a univariate fashion. Instead, combinations of multiple factors determine local vulnerability and should consequently be considered.
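The gridding step described above can be illustrated with a minimal sketch. It assumes that operation locations are already available as projected coordinates in kilometres; the function name `operation_density` and the toy coordinates are hypothetical and not taken from the study.

```python
import numpy as np

def operation_density(x_km, y_km, x_edges, y_edges, n_years):
    """Mean yearly operation density (operations per km^2 per year) on a regular grid."""
    # Count operations falling into each grid cell
    counts, _, _ = np.histogram2d(x_km, y_km, bins=[x_edges, y_edges])
    # Cell area in km^2 (equals 1 for a 1 x 1 km grid)
    cell_area = np.outer(np.diff(x_edges), np.diff(y_edges))
    return counts / cell_area / n_years

# Hypothetical example: four operations in a 2 x 2 km domain over 10 years
x = np.array([0.2, 0.7, 0.5, 1.5])
y = np.array([0.3, 0.4, 1.6, 1.2])
edges = np.array([0.0, 1.0, 2.0])
density = operation_density(x, y, edges, edges, n_years=10)
```

Applying the same function to the subset of operations with a given alert keyword would yield keyword-specific maps of the kind compared in the text.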
Examples of the spatial patterns of potential predictors for vulnerability are given in Fig. 3. Even though building density (shown in Fig. 3a) and building coverage (Fig. 3b) are based on the same data (i.e. individual housing information as derived from OpenStreetMap), different information can be extracted. While building density is calculated as the number of houses per square kilometre, building coverage assesses the area fraction covered by buildings. Hence, building density is particularly high in suburban areas with numerous small houses, while building coverage is highest in central areas with concentrated large buildings. Similarly, information on the density of the road network can be derived from OpenStreetMap (Fig. 3c). Additional predictor variables from the CORINE land cover data set are assessed by calculating the fraction of a grid box that is covered by areas of a specific CORINE land use type (as one example, Fig. 3d shows the area fraction of artificial surfaces). Again, quite different information can be gathered, e.g. when considering the different land use types encoded in CORINE. Finally, with respect to the aim of modelling local vulnerabilities, the characteristics of the local urban structure can be described on the basis of not one but many of these predictor variables.
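The difference between the two building-based predictors can be made concrete with a small sketch. It assumes that each building is reduced to a centroid and a footprint area and is assigned to the grid cell containing its centroid (footprints crossing cell boundaries are not split); the function name and the example numbers are hypothetical.

```python
import numpy as np

def building_predictors(cx_km, cy_km, footprint_m2, x_edges, y_edges):
    """Return (building density [buildings per km^2], building coverage [area fraction])."""
    bins = [x_edges, y_edges]
    # Number of building centroids per cell
    n_buildings, _, _ = np.histogram2d(cx_km, cy_km, bins=bins)
    # Summed footprint area per cell (m^2), using footprints as weights
    built_m2, _, _ = np.histogram2d(cx_km, cy_km, bins=bins, weights=footprint_m2)
    cell_km2 = np.outer(np.diff(x_edges), np.diff(y_edges))
    density = n_buildings / cell_km2
    coverage = built_m2 / (cell_km2 * 1e6)
    return density, coverage

# Hypothetical cells: a "suburban" cell with 100 small houses (100 m^2 each)
# and a "central" cell with 10 large buildings (20 000 m^2 each)
cx = np.concatenate([np.full(100, 0.5), np.full(10, 1.5)])
cy = np.full(110, 0.5)
area = np.concatenate([np.full(100, 100.0), np.full(10, 20000.0)])
density, coverage = building_predictors(
    cx, cy, area, np.array([0.0, 1.0, 2.0]), np.array([0.0, 1.0])
)
```

In this toy case the suburban cell has the higher building density while the central cell has the higher building coverage, mirroring the contrast described above.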
For the predictor variables listed in Table 3, the spatial correlation to the gridded operation densities is calculated. For this, only those grid points within Berlin are considered for which data on operations are available. Furthermore, correlation is assessed for individual alert keywords, as well as for all operations combined. The resulting correlations are listed in Table 3, with colours indicating positive correlations (in red) and negative correlations (in blue). Several predictor variables stand out in this table, in particular the building coverage and the area fraction of continuous urban fabric, which have high correlations with spatial patterns of operations regardless of their alert keyword. One exception is tree-related operations, for which the correlation with both building coverage and area fraction of continuous urban fabric is considerably lower. Instead, in this case correlations are rather high for building density and the area fraction of discontinuous urban fabric. It might be assumed that this is due to the fact that, particularly in the outskirts of Berlin (with a high number of small buildings), the vulnerability is increased by the presence of trees in gardens (i.e. in the vicinity of buildings). However, in general it can be deduced that the degree of urbanization (expressed both by the area coverage of housing and by continuous urban fabric areas) plays a major role in determining highly vulnerable areas. Both variables can be interpreted as a proxy for the number of "objects" at risk (e.g. the number of basements or drainage systems in the case of water-related emergencies). For tree-related operations, the picture is quite different, however. Operation densities are particularly high in areas of discontinuous urban fabric and seem to be enhanced in areas of high building densities (i.e. the number of houses per square kilometre). Both indicate that tree-related operations are more likely in less densely covered urban areas, where presumably more roadside trees or trees in recreational areas can be found in close vicinity to building structures. Considering the area fraction of wooded areas (particularly coniferous forests), negative correlations with operations of all alert keywords are found. This can be explained by the fact that this variable is low wherever a high fraction of the area is covered by urban structures. Interestingly, tree-related operations are also negatively correlated with the area fraction of wooded areas. This indicates that it is not areas with many trees which are particularly vulnerable, but instead areas in which trees are found in the vicinity of man-made structures.
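A minimal sketch of such a spatial correlation is given below, assuming that predictor and operation density are stored as arrays on the same grid and that a boolean mask marks the cells with available operation data. The function name, the mask and the toy values are illustrative assumptions.

```python
import numpy as np

def spatial_correlation(predictor, density, valid):
    """Pearson correlation between two gridded fields, using valid cells only."""
    p = predictor[valid].astype(float)
    d = density[valid].astype(float)
    return np.corrcoef(p, d)[0, 1]

# Hypothetical 2 x 2 grid; the last cell lies outside the area with records
building_coverage = np.array([[0.1, 0.4], [0.6, 0.0]])
forest_fraction = 1.0 - building_coverage      # toy "inverse" predictor
operations = np.array([[1.0, 4.0], [6.0, 0.0]])
valid = np.array([[True, True], [True, False]])

r_cov = spatial_correlation(building_coverage, operations, valid)
r_forest = spatial_correlation(forest_fraction, operations, valid)
```

In this constructed example the coverage-like predictor correlates positively and the forest-like predictor negatively with the operation density, which is the sign pattern discussed in the text; the magnitudes carry no meaning.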
Considering the density of the road network, positive correlations with the patterns of each individual alert keyword are found. This holds in particular for the secondary and tertiary road networks. A simple explanation for this is that areas with a high density of secondary and tertiary roads mostly coincide with areas of high building coverage. Additionally, correlations of road density patterns are highest with respect to operations related to traffic obstruction. This is plausible, since traffic obstructions are more likely to occur in areas with a high density of roads. All the above-mentioned findings show that, even though there is no complete correspondence between individual predictors and the occurrence of operations, numerous predictors can be found that explain a share of the spatial variability of weather impacts. This is investigated in the following by building multivariate models to statistically describe the spatial patterns of operation occurrences.

Multivariate modelling of the occurrence of fire brigade operations
The set of predictors described above is used to set up a multivariate model to predict the local occurrence rate of operations. As described in Sect. 3, the iterative procedure consists of the repeated application of a parameter selection algorithm while iteratively increasing the penalty for additional model parameters. The optimal model is then chosen by means of the cross-validation error (to prevent overfitting and to estimate the predictive ability of the resulting model). As an example, the results of this iterative procedure are shown in Fig. 4 for water-related operations. For the linear model, the optimal model has 12 predictor variables (which results from choosing k = 1 as the penalty weight). Choosing a lower weight of k = 0 results in a model that takes into account all 33 possible predictors and overfits severely (the mean square cross-validation error (MSCVE) is about 4000 in this case). The procedure is applied to fire brigade operations associated with individual alert keywords, as well as to all operations together. For the latter, the resulting optimal model includes a set of 12 predictor variables (listed in Table 4), explaining 83 % of the variance in the spatial pattern of operations. In accordance with the correlation analysis, the predictor "building coverage" makes the highest contribution to the explained variation (EV = 59 %), while lower contributions are found for other variables (e.g. "area fraction continuous urban fabric" contributes 11 % and "building density" 6 %). Of course, there is no direct correspondence between the EV of individual predictor variables (Table 4) and the correlations listed in Table 3, since certain predictor variables might be strongly correlated (multicollinearity; see Sect. 3.3). When adding a predictor which is correlated to predictors already in the model, the increase in model performance might be small even though the correlation to the predictand is high.
The results described above apply to a basic linear model. Alternatively, the predictor selection methodology can be applied using alternative models, i.e. a log-normal and a Poisson model. Results showed that, in general, the predictive abilities of the statistical models (in terms of the cross-validation error) are not improved (not shown). In terms of the MSCVE, the linear models appear to perform best. However, the linear model suffers from the disadvantage of predicting negative values for the number of emergency calls in some cases, while both the log-normal and Poisson models do not. It should be noted that the different models may contain a different set and even a different number of predictors. The comparison of individual regression parameters is thus difficult. However, models can be compared in terms of their predictive skill. In the following, results are shown using the linear model, which performs best in terms of the predictive skill (assessed by means of the MSCVE).
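The selection procedure itself is specified in Sect. 3 and is not reproduced here. The following is only a rough sketch of the general idea for the linear model, under stated assumptions: backward elimination with an AIC-like criterion n log(RSS/n) + k p, where p is the number of predictors and k the penalty weight, followed by K-fold cross-validation to pick among the resulting predictor sets. The criterion, the elimination strategy, the fold scheme and all function names are assumptions made for illustration, and the data are synthetic.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    return beta[0] + X @ beta[1:]

def criterion(X, y, cols, k):
    """Assumed penalised criterion: n * log(RSS / n) + k * (number of predictors)."""
    Xs = X[:, cols]
    resid = y - predict(fit_ols(Xs, y), Xs)
    n = len(y)
    return n * np.log(np.sum(resid**2) / n) + k * len(cols)

def backward_select(X, y, k):
    """Drop one predictor at a time for as long as the criterion decreases."""
    cols = list(range(X.shape[1]))
    best = criterion(X, y, cols, k)
    while len(cols) > 1:
        trials = [(criterion(X, y, [c for c in cols if c != d], k), d) for d in cols]
        score, drop = min(trials)
        if score >= best:
            break
        best, cols = score, [c for c in cols if c != drop]
    return cols

def mscve(X, y, cols, n_folds=5, seed=0):
    """Mean square cross-validation error of the linear model on columns `cols`."""
    idx = np.random.default_rng(seed).permutation(len(y))
    errs = []
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)
        beta = fit_ols(X[train][:, cols], y[train])
        errs.append((y[fold] - predict(beta, X[fold][:, cols])) ** 2)
    return float(np.mean(np.concatenate(errs)))

def select_model(X, y, penalties):
    """Run the selection for each penalty weight; keep the set with the lowest MSCVE."""
    results = []
    for k in penalties:
        cols = backward_select(X, y, k)
        results.append((mscve(X, y, cols), k, cols))
    return min(results, key=lambda r: r[0])

# Synthetic data: only the first two of six predictors carry signal
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)
best_mscve, best_k, best_cols = select_model(X, y, penalties=[0, 1, 2, 5, 10])
```

With k = 0 nothing is ever dropped in this sketch, because removing a predictor cannot reduce the residual sum of squares; this corresponds to the full-model case mentioned above. Swapping the least-squares fit for a log-normal or Poisson regression would follow the same outer loop.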
For comparison with the maps shown in Fig. 2, Fig. 5 shows model predictions for the average number of operations on the 1 × 1 km grid. Considering all emergency operations (Fig. 5a) or only water-related operations (Fig. 5b), the model nicely reproduces the concentration of operation occurrences in central parts of Berlin, while especially forest areas such as the Grunewald feature very low occurrence rates. In addition, the amplitude of this variation (ranging from 0 to about 80 operations per km² per year considering all operations) is well captured. Individual hotspots of high operation occurrence rates, however, are only partly reproduced. This is particularly the case for a hotspot to the south-east of Berlin's centre (Fig. 2a and b, corresponding to northern parts of the district Neukölln). In this area, the number of water-related operations in particular is very high. It is possible that this is influenced by an extraordinarily high population density in these areas, information which is only partly (and indirectly) covered by predictor variables such as building coverage. Other factors such as housing conditions or very localized troughs (potentially leading to water accumulation in the case of severe rain) might also affect the occurrence of emergency operations. Such information, however, was not available for this study and thus cannot be taken into account.
Also in the case of traffic- and tree-related operations, predicted patterns (shown in Fig. 5c, d) reproduce observed patterns rather well. In both cases occurrence rates are less concentrated in central parts of Berlin but are more widely distributed across Berlin. Particularly in the case of tree-related operations, model predictions show a rather homogeneous distribution over large parts of Berlin (Fig. 5d), while local maxima in the observed operation density (Fig. 2d) are poorly captured. Considering the EV for the different models, it is confirmed that for tree-related operations the predictive ability of the model is the worst, with an EV of 53 %. In comparison, the model for all operations has an EV of 83 % (Table 4).

Table 4. Resulting optimal models. The first column indicates the number of predictors as well as the total explained variation (EV) of the chosen optimal model (according to the cross-validation error). The leading predictors of each model are shown, indicating whether a positive (+) or negative (−) effect is found. In the last column, the EV in percent is given for these leading predictors.

Conclusions and discussion
A comparison of a new data set containing spatial and temporal information on emergency operations of the Berlin fire brigade with damage data has been performed. Spatial patterns can be derived and correspondences between both impact data sets can be found. However, a complex interplay of meteorological conditions leads to a variety of weather impacts, making it very hard to directly compare the data sets. Instead, the availability of both data sets might be considered particularly valuable for reconstructing the multifaceted impacts of severe weather events. The relation to predictor variables (i.e. the structure of settlement as well as characteristics of land use) has been addressed by means of an analysis of spatial correlations. In particular, the information on the local building coverage shows a rather high influence on the occurrence of operations. Accordingly, areas classified as continuous urban fabric (within the CORINE land cover data set) exhibit high rates of fire brigade operations. By analysing individual alert keywords, other variables turn out to be valuable predictors. For example, in the case of traffic-related operations, these include the local density of the road network. In the case of tree-related operations, the areas classified as discontinuous urban fabric correlate with high occurrence rates. One interpretation is that in these areas a higher number of trees are present in the direct vicinity of man-made structures (e.g. roadside trees or trees in recreational areas).
Multivariate modelling including an iterative predictor selection algorithm has been conducted, with the resulting models being able to predict the local vulnerabilities. Evaluation of the models showed moderate model performance for tree-related operation occurrences (explained variation of 53 %), while for other types of operations (i.e. water-related, traffic-related, or all operations combined) model results were better (explained variation of 70-80 %). In all cases, spatial patterns of operation occurrences can be reproduced well. Except for tree-related operations, the amplitude of variations can also be reproduced. However, individual hotspots with high occurrence rates are only insufficiently predicted, indicating that particular information influencing the local vulnerabilities is not included in the predictor variables available in this study. In the case of water-related operations, these might include housing conditions. Also, information on local tree stocks, particularly in the vicinity of vulnerable structures, might be very valuable to better model tree-related operation occurrences.
Table 4 serves as an overview of the most relevant parameters to describe the local vulnerability to severe weather, which can be the starting point for further investigations. While many details on individual predictor variables and their descriptive power with respect to specific types of operations are available, a common feature is identified, namely that building coverage is by far the most dominant factor describing the local vulnerability.
The model has been developed and tested for the Berlin area due to the availability of fire brigade operation records for Berlin. However, model predictions can be derived for the whole of Germany. Such model predictions might be particularly valuable for regions with no systematic records on weather impacts. However, such extrapolation might suffer from potentially severe limitations. The occurrence of severe weather conditions is not homogeneous over Germany, with storm frequencies being higher in northern regions and thunderstorm frequencies being higher in southern regions. Thus, the distribution of hazards causing local impacts can differ considerably, which will certainly affect the occurrence of emergency operations. Such effects are excluded in the presented modelling approach, which assumes a homogeneous distribution of hazards. For the investigation of Berlin this is certainly a valid assumption. Model predictions for other urban areas might suffer from an offset in terms of the absolute number of operations. Such model predictions can, however, still be very valuable since they can provide information on spatial variation in operation occurrences on a sub-city scale. Still, future work should include meteorological and climatological information on different hazards, which will strongly influence local vulnerability and thus predicted weather impacts.
The presented model to predict the local vulnerability to severe weather can serve as a basis for a broad range of tools or applications in emergency management. These might include tools for the long-term resource planning of local emergency management capacities. Also, handling short-term variations in the demand for local emergency management capacities might be supported by such tools when including actual weather information. In this study, we focussed on data sets which are publicly available (partly open-source community data) for at least the whole of Europe. This yields great potential for the design of national or even pan-European tools and applications in emergency management.

Figure 1. Spatial comparison of the number of insurance claims (a, b) to the number of fire brigade operations (c, d) for winter storm Kyrill in 2007 (a, c) and frontal passage Gunnar in 2011 (b, d).

Figure 2. Mean yearly density of fire brigade operations during 2002-2011 calculated on a 1 × 1 km grid (units: operations per square kilometre per year). Operation recordings are available for Berlin only (boundaries are indicated by black solid lines); i.e. zero values outside of Berlin are due to unavailability of data. Note the different colouring scale due to the fact that the absolute numbers of operations for a certain operation type vary considerably.

Figure 3. Example set of exposure predictors calculated on a 1 × 1 km grid. While building density (a), building coverage (b) and street density for tertiary roads and higher (c) are based on information extracted from OpenStreetMap data, panel (d) shows the area fraction of artificial surfaces as derived from the CORINE land cover data set.

Figure 4. Results of the iterative procedure to optimize the regression model. Increasing the penalty term for additional predictors leads to a model with smaller sets of predictor variables. For each of the resulting predictor sets and the corresponding multiple regression, the mean square cross-validation error (MSCVE) is calculated and plotted here. Blue circles represent validation results using a linear regression, orange circles represent results using the log-normal model and red circles represent results using a Poisson regression.

Figure 5. Modelled mean yearly density of fire brigade operations (units: operations per square kilometre per year). Results are shown for the model including all operations disregarding their type (a), for water-related operations (b), for traffic obstructions (c) and for tree-related operations (d).

Table 1. Distributions of impacts of different types in Berlin for the period 2002-2011, stratified by their cause and by season (i.e. winter and summer half year), and temporal correlations to daily building damages.

Table 2. Spatial correlations for specific events. The correlation is calculated between the number of fire brigade operations and the number of insurance claims within individual zip code areas.

Table 3. Spatial correlation coefficients (Pearson correlation) between yearly averaged operation density and exposure predictors. CORINE classes that do not occur in Berlin, and whose area fraction (AF) is thus 0 everywhere, are excluded from this table. High and low correlations are highlighted in red and blue, respectively.