On the clustering of winter storm loss events over Germany

During the last decades, several windstorm series hit Europe, leading to large aggregated losses. Such storm series are examples of serial clustering of extreme cyclones and present a considerable risk for the insurance industry. Here, the clustering of events and the return periods of storm series for Germany are quantified based on potential losses using empirical models. Two reanalysis data sets and observations from German weather stations are considered for 30 winters. Histograms of events exceeding selected return levels (1-, 2- and 5-year) are derived. Return periods of historical storm series are estimated based on the Poisson and the negative binomial distributions. Over 4000 years of general circulation model (GCM) simulations forced with current climate conditions are analysed to provide a better assessment of the historical return periods. Estimates differ between the distributions, for example 40 to 65 years for the 1990 series. For such less frequent series, estimates obtained with the Poisson distribution clearly deviate from the empirical data. The negative binomial distribution provides better estimates, even though a sensitivity to return level and data set is identified. The consideration of GCM data permits a strong reduction of the uncertainties. The present results support the importance of considering clustering of losses explicitly for an adequate risk assessment in economic applications.


Introduction
Intense extratropical storms are the major weather hazard affecting western and central Europe (Klawa and Ulbrich, 2003; Schwierz et al., 2010; Pinto et al., 2012). Such storms typically hit western Europe when the upper-tropospheric jet stream is intensified and extended towards Europe (e.g. Hanley and Caballero, 2012; Gómara et al., 2014). If these large-scale conditions persist over several days, multiple windstorms may affect Europe within a comparatively short time period (Fink et al., 2009). The occurrence of such "cyclone families" (e.g. Bjerknes and Solberg, 1922) can lead to large socio-economic impacts, cumulative losses (the sum of losses caused by a particular series of events or aggregated over a defined time period) and fatalities. In statistical terms, this effect is known as serial clustering of events, for example of cyclones (Mailier et al., 2006). A recent study showed that clustering of extratropical cyclones over the eastern North Atlantic and western Europe is a robust feature in reanalysis data (Pinto et al., 2013). Furthermore, there is evidence that clustering increases for extreme cyclones, particularly over the North Atlantic storm track area and western Europe (Vitolo et al., 2009; Pinto et al., 2013). In terms of windstorm-associated losses, a general result is that large annual losses can be traced back to multiple storms within a calendar year (MunichRe, 2001). One of the most severe storm series regarding insured losses for the German market occurred in early 1990 and includes the storms "Daria", "Herta", "Nana", "Judith", "Ottilie", "Polly", "Vivian" and "Wiebke", reaching a total cost of ca. EUR 5500 million indexed to 2012 (Aon Benfield, 2013). The cumulative damages associated with the windstorm series of December 1999 and January 2007 rank among the highest of the recent decades, with total costs reaching EUR 1500 million and about EUR 3000 million in terms of insured losses, respectively (Aon Benfield, 2013). Also the winter of 2013/14 was characterised by multiple storms leading to large socio-economic impacts ("Christian" 28 October 2013, "Xaver" 7 December 2013, "Dirk" 23 December 2013, "Anne" 3 January 2014, and "Christina" 5 January 2014), which primarily affected the British Isles.
The estimation of return periods of single storms (event-based losses) and storm series (cumulative losses) is needed to determine the "occurrence loss exceeding probability" (OEP; event loss) and the "aggregate loss exceeding probability" (AEP; accumulated loss per calendar year) for risk assessment and the fulfilment of the Solvency II (Solvency Capital Requirements, QIS5) requirements. As top annual aggregated market losses (like 1990 for Germany) are associated with multiple storms, the importance of clustering has long been discussed within the insurance industry. However, little to no attention has been paid to the clustering of windstorm-related losses in the peer-reviewed literature. In this study, the clustering of estimated potential losses associated with extratropical windstorms is analysed in detail for Germany for recent decades. In particular, the probability of occurrence of multiple storm events per winter over Germany exceeding a certain return level is evaluated with the help of reanalysis and general circulation model (GCM) data.

Data
In principle, it is possible to build a simple storm loss model using either wind gusts or daily maximum 10 m wind speeds. For example, Pinto et al. (2007) gave evidence that loss estimations following the Klawa and Ulbrich (2003) approach provide equivalent results for both variables. For this study, wind gusts are available and considered for the German weather service ("Deutscher Wetterdienst", hereafter DWD) observation data. As no gust data are available for the reanalyses and the GCM, the daily maximum 10 m wind speed is used for those data sets.
Reanalysis data from the National Centers for Environmental Prediction/National Center for Atmospheric Research (hereafter NCEP) as well as from the European Centre for Medium-Range Weather Forecasts (ERA-Interim project, hereafter ERAI) are used in this study. The NCEP data are available on a Gaussian grid with a resolution of T62 (1.875°, roughly 200 km; Kistler et al., 2001), while the ERAI data are available on a reduced Gaussian grid with a resolution of T255 (0.7°, about 80 km over central Europe; Dee et al., 2011). For comparability, ERAI is interpolated to the NCEP grid using a bilinear interpolation method (Fig. 1a shows the relevant grid points for Germany). For both data sets, the 6-hourly instantaneous 10 m wind speed (hereafter wind) is considered. The daily maxima (largest value for each calendar day among 00:00, 06:00, 12:00 and 18:00 UTC) are selected. Based on these daily maxima, the 98th percentiles (see Sect. 3) are calculated for 30 winters (October-March, 1981/82 to 2010/11).
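The extraction of daily maxima and the percentile threshold can be sketched as follows (a minimal NumPy sketch; the synthetic Weibull-distributed winds and the array shapes are illustrative assumptions standing in for the actual reanalysis fields):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for 30 winters of 6-hourly 10 m wind speed at one
# grid point: 182 winter days x 4 synoptic times (00, 06, 12, 18 UTC).
n_winters, n_days, n_times = 30, 182, 4
wind_6h = rng.weibull(2.0, size=(n_winters, n_days, n_times)) * 6.0  # m/s

# Daily maximum: largest of the four instantaneous values per calendar day
daily_max = wind_6h.max(axis=-1)

# Local 98th percentile of all daily maxima over the 30 winters,
# i.e. the loss-model threshold v98 at this grid point (see Sect. 3)
v98 = np.percentile(daily_max, 98)
print(f"v98 = {v98:.2f} m/s")
```

By construction, about 2 % of the winter days at this grid point exceed the threshold, which is the exceedance frequency the loss model operates on.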
In order to obtain statistically robust estimates of the return periods of storm series based on potential losses, a large ensemble of 47 simulations performed with the coupled ECHAM5/MPI-OM1 GCM (European Centre Hamburg Version 5/Max Planck Institute Ocean Model Version 1; Jungclaus et al., 2006; hereafter ECHAM5) is analysed. These simulations have a wide variety of setups, but all are consistent with greenhouse gas forcing conditions between pre-industrial (1860) and near-future (2030) climate conditions. All simulations were performed at T63 resolution (1.875°, roughly 200 km; see grid in Fig. 1b); 37 of them were conducted for the ESSENCE (Ensemble SimulationS of Extreme weather events under Nonlinear Climate changE) project (Sterl et al., 2008). Details of all simulations can be found in Supplement A. Again, the 6-hourly instantaneous 10 m wind speed is used to determine the daily maxima. The 98th percentile for the GCM data is calculated based on the 37 ESSENCE simulations for the winter half year, as this data set is long enough to derive statistically stable estimates.
As the physical cause of building losses can be primarily attributed to peak wind gusts (Della-Marta et al., 2009), a data set of daily maxima of 10 m wind gust observations from DWD is used for comparability and validation purposes. The time series of these data differ in terms of the length of the available period and data quality (e.g. Born et al., 2012). After an evaluation, 112 stations (Fig. 1c) are considered for further analyses. For these stations, wind gusts are available for at least 80 % of the winter days in the period 1981/82 to 2010/11. The 98th percentile at each station is calculated for the winter half year. Then the 10 m wind gust observations are normalised with the 98th percentile at each station. The normalised values are interpolated to the 0.25° grid of the population density (Fig. 1c) using inverse distance weighted interpolation of second order. This method assumes that the interpolated value for each grid box is influenced more by nearby stations and less by distant ones; the second-order weighting gives nearer stations an even higher weight.
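The second-order inverse distance weighting can be sketched as below (the station coordinates and normalised gust values are hypothetical; the actual implementation targets the 0.25° population grid):

```python
import numpy as np

def idw_second_order(station_xy, station_vals, grid_xy, eps=1e-10):
    """Inverse distance weighting with power 2: each grid value is a
    weighted mean of station values with weights w = 1/d^2, so nearby
    stations dominate over distant ones."""
    station_xy = np.asarray(station_xy, float)
    grid_xy = np.asarray(grid_xy, float)
    # pairwise distances, shape (n_grid, n_stations)
    d = np.linalg.norm(grid_xy[:, None, :] - station_xy[None, :, :], axis=-1)
    w = 1.0 / (d**2 + eps)              # second-order weights
    w /= w.sum(axis=1, keepdims=True)   # normalise per grid box
    return w @ np.asarray(station_vals, float)

# Tiny illustration: three stations, one grid box close to the first one
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
values = [1.0, 2.0, 3.0]    # normalised gusts (gust / station v98)
grid = [(0.1, 0.1)]
print(idw_second_order(stations, values, grid))
```

The interpolated value stays close to 1.0, the value of the nearest station, illustrating the strong distance weighting of the second-order scheme.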
The German Insurance Association ("Gesamtverband der Deutschen Versicherungswirtschaft", hereafter GDV) provides a data set of daily residential building losses for private buildings for the period 1984-2008 for the 439 administrative districts of Germany. These data were collected from most of the insurance companies active in the German market and are thus representative of the insured market loss in Germany; they are used here as a reference. Loss ratios, i.e. the ratios between the losses attributed to one event and the total insured value in the affected area, are used. Inflation effects can be neglected, as can other socio-economic factors that may have changed slightly during this period. More information can be found in Donat et al. (2011) and Held et al. (2013).
As insurance data are not available for the whole analysed period, the population of the year 2000 is used as a proxy for the estimation of potential losses. This data set was provided by the Center for International Earth Science Information Network (CIESIN) of Columbia University and the Centro Internacional de Agricultura Tropical (CIAT). The population density is given in inhabitants km−2, with a spatial resolution of 0.25° × 0.25° (Fig. 1, coloured boxes). For grid boxes only partially within the German borders, the percentage of each box lying within Germany is calculated with a geographic information system (GIS).

Methodology
In this section, the potential loss indices based on the approach of Klawa and Ulbrich (2003) and Pinto et al. (2012) are presented. These indices are used to select events exceeding a certain return level. For the chosen events, histograms are analysed, and statistical distributions, namely the Poisson and the negative binomial distribution, are used to estimate return periods of storm series. As the GCM data overestimate the frequency of zonal weather patterns, the approach to calibrate the GCM data towards reanalysis using weather types is also described.

Storm loss indices
The potential loss associated with a storm can be quantified using simple empirical models (Palutikof and Skellern, 1991; Klawa and Ulbrich, 2003; Pinto et al., 2007). Here, calendar-day-based potential damages for Germany are estimated using a modified version of the loss model of Klawa and Ulbrich (2003) for station and gridded data. The general assumptions of the loss model are as follows:

- Losses occur only if a critical wind speed is exceeded. This threshold corresponds to the local 98th percentile (v98) of the daily maximum wind speed (e.g. Palutikof and Skellern, 1991; Klawa and Ulbrich, 2003).
- Above this threshold, the potential damage increases with the cube of the maximum wind speed, as the kinetic energy flux is proportional to the cube of the wind speed. This implies a strong non-linearity in the wind-loss relation.
- Insured losses depend on the amount of insured property values within the affected area. As real insured property values are not available, the local population density (POP) is used as a proxy.

- To each population density grid cell, the wind data (reanalyses, GCM) from the nearest location are allocated (nearest-neighbour approach).
Following these assumptions, the potential loss (LI_raw) per calendar day is defined as the sum over all grid points ij with v_ij exceeding v98_ij, weighted by the population density:

LI_raw = Σ_ij POP_ij (v_ij / v98_ij − 1)^3, for v_ij > v98_ij, (1)

where POP_ij is the population density, v_ij the daily maximum wind speed and v98_ij the 98th percentile at grid point ij.
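In code, the daily loss index could look as follows (a sketch assuming the cubic excess-over-threshold form of the Klawa and Ulbrich (2003) model; the three-point arrays are purely illustrative):

```python
import numpy as np

def li_raw(v, v98, pop):
    """Daily potential loss index: population-weighted sum of the cubed
    relative exceedance over all grid points where v > v98 (cf. Eq. 1)."""
    excess = np.where(v > v98, v / v98 - 1.0, 0.0)
    return float(np.sum(pop * excess**3))

# Tiny illustration: three grid points, only one exceeds its threshold
v = np.array([20.0, 33.0, 15.0])        # daily maximum wind (m/s)
v98 = np.array([25.0, 30.0, 22.0])      # local 98th percentiles (m/s)
pop = np.array([500.0, 2000.0, 100.0])  # inhabitants per km^2
print(li_raw(v, v98, pop))  # only the middle grid point contributes
```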
Following Pinto et al. (2012), a meteorological index (MI_raw) is also considered. MI_raw is defined as the sum over all grid points ij per calendar day where v_ij exceeds v98_ij, without weighting by the population density:

MI_raw = Σ_ij (v_ij / v98_ij − 1)^3, for v_ij > v98_ij. (2)

In this study, the method is modified to identify individual events of high LI_raw (or MI_raw). In a first step, overlapping 3-day sliding time windows of the LI_raw (MI_raw) time series are analysed, as this corresponds to the 72 h event definition often used by insurance companies in reinsurance treaties (Klawa and Ulbrich, 2003). Moreover, given that Germany is a comparatively small area, 3 days are reasonable for separating events. For each 3-day time window, the middle day is defined as an event if it is a local maximum of LI_raw (MI_raw). If no maximum is identified within the 3-day window, the first day after an event (for all LI_raw = 0; considering the last day of the 3-day time window) is defined as an event. The outcome is a time series of events. With this approach, storms like "Vivian" and "Wiebke" (26 and 28 February 1990) can be identified as separate events (see Supplement E).
In a second step, the local details of the identified events are analysed in more detail. In analogy to the above, the value of LI_raw (MI_raw) on the event day is replaced with the maximum value of LI_raw (MI_raw) occurring on the first or the last day of the 3-day time window. In rare cases, events are separated by only 1 day (e.g. Vivian and Wiebke). If a local maximum of LI_raw (MI_raw) is identified on the day between both events (here 27 February 1990), it is allocated to the event with the higher LI_raw (MI_raw). This ensures that each local maximum only counts once. To guarantee spatially coherent wind fields, larger values occurring on the first or third day only substitute the values from the middle day if multiple (spatially contiguous) nearby grid points exceed the 98th percentile.
The method to estimate potential losses of single events can thus be described as

LI_3-D = Σ_ij POP_ij max_{t ∈ {t0−1, t0, t0+1}} (v_ij,t / v98_ij − 1)^3, (3)

with t0 the event day, and analogously, without population weighting,

MI_3-D = Σ_ij max_{t ∈ {t0−1, t0, t0+1}} (v_ij,t / v98_ij − 1)^3. (4)

This new definition has the advantage that single storm events can be well separated. Furthermore, strong potential losses occurring 1 day before or 1 day after an event, which are probably associated with the same event, are incorporated in LI_3-D (MI_3-D).
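The local-maximum step of the event separation can be sketched as follows (a minimal sketch on a synthetic daily LI series; the full method additionally handles the zero-loss fallback, the value substitution from adjacent days and the spatial-coherence check):

```python
import numpy as np

def separate_events(li):
    """Identify event days as local maxima of the daily LI series within
    overlapping 3-day windows (middle day strictly above the previous
    day and at least as large as the following day)."""
    li = np.asarray(li, float)
    events = []
    for t in range(1, len(li) - 1):
        if li[t] > 0 and li[t] > li[t - 1] and li[t] >= li[t + 1]:
            events.append(t)
    return events

# Two storms only two days apart, in the spirit of "Vivian" (26 Feb)
# and "Wiebke" (28 Feb 1990): both are kept as separate events
li = [0.0, 0.2, 5.0, 1.0, 8.0, 0.3, 0.0]
print(separate_events(li))
```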
Hereafter LI_3-D (MI_3-D) is named LI (MI) for simplicity. These formulations are used for the reanalysis, DWD and GCM data. The resulting time series of LI (MI) are ranked and 1-, 2- and 5-year return levels are computed. The selected samples of events exceeding each corresponding threshold (e.g. 30, 15 and 6 events, respectively, for 30 years of reanalysis data) are then assigned to individual winters. The naming is given by the second year; e.g. winter 1989/90 is named 1990.
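The ranking and return-level selection can be sketched as a simple top-N selection (the gamma-distributed LI values and random winter labels are illustrative placeholders for the actual event series):

```python
import numpy as np

def return_level_events(event_li, event_winters, n_years, return_level_years):
    """Select the top-N events so that on average one event per
    `return_level_years` is retained, and count events per winter."""
    n_top = int(round(n_years / return_level_years))  # e.g. 30/5 = 6 events
    order = np.argsort(event_li)[::-1][:n_top]        # rank by LI, keep top N
    winters = np.asarray(event_winters)[order]
    # histogram: number of selected events per winter
    return {int(w): int((winters == w).sum()) for w in sorted(set(winters.tolist()))}

# Hypothetical event list over a 30-year period (LI values, winter labels)
rng = np.random.default_rng(0)
li = rng.gamma(2.0, size=120)
wint = rng.integers(1982, 2012, size=120)
counts = return_level_events(li, wint, n_years=30, return_level_years=5)
print(counts)
```

For the 5-year return level, exactly six events are kept, and the resulting per-winter counts correspond to one bar plot of Fig. 3.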

Statistics
The Poisson distribution is the simplest approach to describe independent events and is often used to model the number of events occurring within a defined time period. This distribution is useful to describe the temporal distribution of events in a certain region and is typically used by insurance companies to estimate losses of winter storms. This discrete distribution depends on one parameter and is a limiting case of the binomial distribution. For the Poisson distribution, the rate parameter λ is equal to both the variance (Var(x)) and the mean (E(x)). For a random variable x, the probability distribution is defined as

P(X = x) = λ^x e^(−λ) / x!, x = 0, 1, 2, ... (5)

Following Mailier et al. (2006), the dispersion statistic (a simple measure of clustering) is defined as

ψ = Var(x)/E(x) − 1. (6)

If Var(x) > E(x) (ψ > 0), the distribution is overdispersive (clustering); for E(x) > Var(x) (ψ < 0) it is underdispersive (regular); and for E(x) = Var(x) (ψ = 0) it corresponds to a random (Poisson) process. Besides the Poisson distribution, the negative binomial distribution is one of the major statistical models used to describe insurance risks. Following Wilks (2006), the probability of the negative binomial distribution is defined as

P(X = x) = Γ(k + x) / (Γ(k) x!) p^k q^x, (7)

with Γ(·) the gamma function, k > 0 an auxiliary parameter (see below) and 0 < q < 1, q = 1 − p, where p is the probability. As in our study E(x) is fixed by the return level of the considered events, q is the only free parameter. The estimation of q is done by a nonlinear least-squares fit using the Gauss-Newton algorithm.
Considering E(x) = kq/(1−q) and Var(x) = kq/(1−q)^2 (Wilks, 2006), the auxiliary parameter follows as

k = E(x) (1 − q) / q. (8)

The dispersion statistic can then also be described as

ψ = Var(x)/E(x) − 1 = q / (1 − q). (9)

For q = 0, the negative binomial distribution is equal to the Poisson distribution; the higher q, the higher the overdispersion and therefore the clustering of events. The return period is defined as the inverse of the probability (Emanuel and Jagger, 2010). The return period of a storm series consisting of x events of a certain return level within 1 year is thus estimated as

T(x) = 1 / P(X = x). (10)
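The moment relations can be inverted to obtain the distribution parameters from a prescribed mean and dispersion, and checked numerically (a sketch using SciPy's nbinom, whose (n, p) parameterisation matches Eq. (7) with n = k; the value ψ = 0.16 is taken from the 1-year return level example in Sect. 4.4):

```python
from scipy.stats import nbinom

def nb_params(mean, psi):
    """Invert E = k q/(1-q) and psi = q/(1-q) (Eqs. 8-9) to obtain the
    negative binomial parameters (k, p) with p = 1 - q."""
    q = psi / (1.0 + psi)
    k = mean * (1.0 - q) / q
    return k, 1.0 - q

k, p = nb_params(mean=1.0, psi=0.16)  # one 1-year event per winter on average
dist = nbinom(k, p)
# mean recovers 1.0; variance = mean*(1+psi) = 1.16, i.e. overdispersed
print(dist.mean(), dist.var())
```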

Calibration of GCM data with circulation weather types
In order to obtain robust estimates of return periods for the historical storm series, the large ensemble of ECHAM5 simulations is considered to enhance the data sample.As the large-scale atmospheric circulation is too zonal over Europe in GCMs (e.g.Sillmann and Croci-Maspoli, 2009), a correction of the model bias towards the reanalysis climatology is necessary.This correction is performed based on weather types, so that the variability of weather patterns over Germany corresponds to the historical time period.The selected weather typing classification is the circulation weather type (CWT) following Lamb (1972) and Jones et al. (1993).
The large-scale flow conditions over Germany are calculated from 00:00 UTC mean sea level pressure fields, using 10 CWTs: eight directional types (N, NE, E, SE, S, SW, W, NW) and two rotational types, cyclonic (C) and anticyclonic (A). If neither rotational nor directional flow dominates, the day is attributed to a hybrid CWT (e.g. anticyclonic-west). The correction is done by adapting the relative frequency of events per CWT in the GCM simulations to the number of events per CWT in the ERAI data (see Sect. 4.3). This is only a first-order correction of the model biases. In fact, differences in the probability density function of extreme losses per weather type may still be present (Pinto et al., 2010).

Results
In this section, the different loss indices (Sect. 4.1) and the event selection (Sect. 4.2) are first analysed for the reanalysis period. Second, results of the calibration of the GCM data based on CWTs are presented in Sect. 4.3. The estimation of return periods for storm series based on reanalysis (Sect. 4.4) and GCM data (Sect. 4.5) follows.

Comparison of loss indices for the reanalysis period
The loss indices described in Sect. 3.1 are now compared based on the different data sets. First, the MIs based on both reanalysis data sets are compared to the MI derived from the DWD data for an illustrative example (the storm series of early 1990).
Results for the period from 15 January to 15 March 1990 are displayed in Fig. 2a. The outcome shows that the timing of the extreme events ("Daria" 25 January 1990, "Herta" 4 February 1990, "Judith" 7 February 1990, "Vivian" 26 February 1990 and "Wiebke" 1 March 1990) is generally well identified in all three data sets. In some cases, a 1-day shift is observed, e.g. for 12 and 15 February. Such shifts are associated with the data assimilation methodology of the respective data set (e.g. the highest winds in NCEP may occur at 18:00 UTC of a certain day, and in ERAI only 6 h later). In case of doubt, the first day is taken (see Sect. 3.1). This means that the split-up of events, and thus accurate event identification, may depend on the data set. Though the timing of the events is well captured, the relative intensity of the events sometimes differs from data set to data set (e.g. "Vivian", 26 February 1990). The results for the LIs (Fig. 2b) are also compared to accumulated potential losses based on the GDV data. With this aim, the latter are also aggregated over time windows of 3 days. The timing of the identified events is predominantly correct. As expected, the findings are similar to those for the MIs, with a good assessment of the timing of the events and differences in the relative intensity between data sets. A calibration of the intensity towards the GDV data is not performed, as a linear calibration (as implemented e.g. in Held et al., 2013) would not change the relative ranking of events within a certain data set. Nevertheless, storms on successive days cannot always be well separated with our methodology. For example, storms "Elvira" (4 March 1998) and "Farah" (5 March 1998) cannot be separated for either reanalysis or DWD data (not shown). However, this is also not possible based on insurance loss data.
On the other hand, our method separates important storms like "Vivian" and "Wiebke" (26 February and 1 March 1990; Fig. 2). The top 30 events for the two reanalysis data sets as well as the DWD observations are shown in Table 1. By definition, these are the events exceeding the 1-year return level for each data set. The most prominent historical storms affecting Germany, like "Kyrill" (18 January 2007), "Vivian" (26 February 1990) and "Daria" (25 January 1990), are identified as top events in all three data sets. However, some differences are found regarding the exceeded return level. For example, storm "Daria" is estimated as a 5-year return level event for NCEP and DWD data and as a 2-year event for ERAI. These differences are partly attributed to the resolution of the data sets and to known caveats. For instance, the relatively weak values for "Lothar" (26 December 1999) in NCEP can be directly attributed to an insufficient representation of this storm in that data set (Ulbrich et al., 2001; see their Fig. 1). Other differences may be associated with data availability or the interpolation to the population density grid for DWD versus the lower-resolution gridded data sets for NCEP and ERAI. In spite of these limitations, the method is able to identify consistent events: 70 % of the events identified in the NCEP data are also found in the ERAI and DWD data, and the same holds between DWD and ERAI. This constitutes a reliable basis for estimating the return periods of storm series in the following.

Comparison of identified events for the reanalysis period
Bar plots for the different data sets and intensities (1-, 2- and 5-year return level events) are now analysed for the 30-year period. For each threshold, the selected LI samples (30, 15 and 6 events, respectively) are shown in Fig. 3. In some cases the number of events per winter differs from data set to data set. Nevertheless, in all three data sets a maximum number of events is found in the winter 1989/90 (Fig. 3a, b, denoted 1990). Differences in the identified number of events at the 1-year return level are found for 11 winters. For example, for ERAI the winter 1983 features four 1-year events, while NCEP features only two. For stronger events exceeding a 2-year (5-year) return level, seven (six) years with a difference in the number of events are identified. For instance, at the 2-year return level for the storm series of 2000 (1999/2000, see Fig. 3a, b), two events are detected for ERAI and none for NCEP. This can be attributed to the limited representation of storms like "Lothar" (26 December 1999) in NCEP (cf. Ulbrich et al., 2001). However, both data sets are generally in good agreement, clearly identifying the winters with well-known storm series like 1990 or 2007. In comparison to the estimations based on the DWD observation data (Fig. 3c), some differences to the reanalysis data are apparent. For example, the storm series of 2002 is not identified in the DWD data. On the other hand, the storm series of 1990 includes six events for the DWD data (1-year return level).
As mentioned in Sect. 4.1, this can be attributed both to known caveats of the data sets (station density vs. gridded data) and to the methodology used to assign the data to the population grid cells. In spite of these deviations, the historical storm series can generally be identified in all data sets. Furthermore, the resulting overall statistics over the 30 years are also similar (Supplement B), as the small permutations of single events balance out.

Calibration of GCM data based on CWTs
In order to enable the calibration of the GCM data, the distribution of events per CWT within the reanalysis period is analysed. Each loss event is assigned to the CWT identified for the corresponding date. In addition to the 1-, 2- and 5-year return levels, a return level of 0.5 years is considered to help with the calibration. The resulting histograms are similar for both reanalysis data sets (Fig. 4a, b). Considering frequent events (0.5-year), most events are identified for the W CWT. This dominance becomes more pronounced for higher return levels; for example, for a return level of 5 years the maximum number of events falls in the westerly CWT for both reanalyses. This predominance of windstorms in the westerly flow type is in line with previous results (e.g. Donat et al., 2010; Pinto et al., 2010). For the GCM data (Fig. 4c) the distribution of events per CWT is different: the most frequent events (e.g. 0.5-year) are identified for the A CWT, and for higher return levels (e.g. 5-year) the events are more equally distributed over all CWTs than in the two reanalyses. This bias is corrected by imposing on the GCM data the same relative frequency of events per CWT as in ERAI. For example, two SW events are identified among the top 30 for ERAI, corresponding to 6.7 % of all considered events. The corresponding number of events in the GCM is 273 (6.7 % of 4092). Thus, the top 273 SW events are included in the event set of the 4092 top events. The resulting distribution is shown in Fig. 4d.
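The frequency adaptation amounts to a per-class quota computation, sketched below (the SW numbers follow the example in the text; the counts for the other CWTs are illustrative placeholders, not the actual ERAI histogram):

```python
def cwt_quotas(erai_counts, n_gcm_events):
    """Scale per-CWT event counts from ERAI to the GCM event set so that
    the relative CWT frequencies match (first-order bias correction)."""
    total = sum(erai_counts.values())
    return {cwt: round(n / total * n_gcm_events)
            for cwt, n in erai_counts.items()}

# 2 of the top 30 ERAI events are SW (6.7 %), so 273 of the 4092 top
# GCM events are drawn from the SW class (other counts are placeholders)
erai = {"W": 14, "SW": 2, "NW": 6, "C": 4, "A": 2, "E": 2}
quotas = cwt_quotas(erai, 4092)
print(quotas["SW"])  # → 273
```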

Estimation of return periods of storm series based on reanalysis
The identified frequencies of events per year for the two reanalysis data sets as well as for the DWD-based data set are almost identical for the considered return levels (see Supplement B1). For succinctness, only results based on ERAI data are discussed in detail in the following. The return period of a storm series with a certain return level is estimated based on the negative binomial and on the Poisson distribution (Supplement D, left). The related return periods are shown in Table 2 (left).
A return period of about 65 years is estimated with the Poisson distribution for a storm series with four 1-year return level events (like 1990; Table 2). For the negative binomial distribution, the assessed return period is ca. 49 years. On the other hand, for a series of two observed 5-year return level events (as in 1990), the estimated return periods are 61 years for the Poisson and about 42 years for the negative binomial distribution. A ψ value of about 0.16 for the 1-year return level and of 0.25 for the 5-year return level is determined for the negative binomial fit, both indicating serial clustering (see Table 3a). The ψ values calculated with Eq. (6) are different, with more clustering for frequent events (0.24 for 1-year return level events) and less clustering for extreme events (0.17 at the 5-year return level; Table 3b). Nevertheless, both methods identify overdispersion for the events. The estimated return periods of storm series with two events per year at the 1-year level (like in 1984) are closer to each other for the negative binomial and the Poisson distribution, with about 5.9 and 5.4 years, respectively (Table 2). In fact, for 1-year events large deviations between the two distributions are only found for four or more events per year. The same is true for 2-year (5-year) occurrences and three (two) or more events per year (Table 2). In these cases, the Poisson distribution clearly overestimates the return period of multiple events per winter.
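These numbers can be reproduced to good approximation with the two distributions (a sketch; here q is obtained from ψ ≈ 0.16 via the moment relation of Eq. (9) rather than by the Gauss-Newton fit, so the negative binomial value comes out near, but not exactly at, the quoted 49 years):

```python
from scipy.stats import poisson, nbinom

lam = 1.0   # 1-year return level: on average one event per winter

# Poisson return period for exactly four events in one winter
rp_poisson = 1.0 / poisson.pmf(4, lam)   # ~65 years

# Negative binomial with the same mean and dispersion psi = 0.16
psi = 0.16
q = psi / (1.0 + psi)         # Eq. (9) inverted
k = lam * (1.0 - q) / q       # Eq. (8)
rp_nbinom = 1.0 / nbinom.pmf(4, k, 1.0 - q)   # close to the quoted ~49 years

print(f"Poisson: {rp_poisson:.0f} yr, neg. binomial: {rp_nbinom:.0f} yr")
```

The clustering built into the negative binomial distribution makes four-event winters markedly less rare than the independence assumption of the Poisson model suggests.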
In order to test the sensitivity to particular storm series like 1990, additional computations were performed based on NCEP and ERAI as above, but with single years (with three and four events) removed. Results show little dependence on the selected years for all data sets (not shown). For comparatively frequent storm series, a relatively small spread is identified; e.g. for the 1-year return level and three events per year, the estimated return period remains between 15 and 16 years. On the other hand, for the 5-year return level and three events the range is much larger, with estimates between 112 and 306 years (not shown). As the estimation of the return period is almost independent of the chosen years, the method is considered reliable for further application.

Estimation of return periods of storm series based on GCM data
The large ensemble of GCM runs is now considered to enhance the estimation of return periods of historical storm series. The corresponding return periods are shown in Table 2 (right). (Note on Table 2: as the propagation of uncertainty is not possible for one event per year at the 1-year return level, those error bars are set equal to the ones for zero events per year. Table 3 shows the ψ values for the different data sets, (a) calculated with Eq. (9) and, with the information of the confidence interval, (b) computed with ψ = Var(X)/E(X) − 1; RL: return level.) The consideration of 4092 years leads to the identification of multiple years with four or more 1-year events. This enables more accurate estimates of the return periods as well as lower uncertainties, calculated with Gaussian error propagation (Table 2). Following the examples given above, a return period of 41 years is assessed for a storm series with four events per year exceeding the 1-year return level (like 1990). This value is lower than for the negative binomial fit based on ERAI data and for the Poisson distribution (49 and 65 years, respectively). The obtained return period for two events per year exceeding the 5-year level is about 48 years. Clear deviations between the Poisson and the negative binomial distribution are again found for four (three/two) or more events at the 1- (2-/5-) year level (see Table 2, Supplement D).
The consideration of GCM data with bias correction (GCM_corr) leads only to small differences in the return periods, notable e.g. for less frequent events and higher return levels (Table 2). The ψ values for the GCM data are in all cases clearly positive, also indicating clustering of the events (Table 3a). The clustering is also positive, but lower or similar, when ψ is calculated with Eq. (6) (Table 3b). However, unlike previous results obtained for extratropical cyclones (Pinto et al., 2013), the ψ value does not increase for larger return levels. For more intense events (5-year return level) the derived ψ becomes smaller (e.g. ψ = 0.11 considering all GCM_corr runs), indicating less deviation from the Poisson distribution than for the 1-year events (ψ = 0.6 considering all GCM_corr runs). The decrease of the ψ values is attributed to the fact that the sample of lower-intensity events also includes the higher-intensity events, and therefore more clusters are expected. For higher return levels the occurrence of clusters is more random and therefore closer to the Poisson distribution. The reason for the differences compared to Pinto et al. (2013) may be that they based their conclusions on lower percentiles (and thus a higher frequency of events). This suggests that the clustering of windstorms and associated losses is quite complex, particularly in terms of intensity variations. Nevertheless, in all cases a clear overestimation of the return period is identified for the GCM data when based on the Poisson distribution. This is an important result, as it indicates that return periods of storm series are better estimated with the negative binomial distribution than with the Poisson distribution, especially for winters with a considerable number of events.
Analogously to the historical data, a sensitivity analysis was performed for the GCM data. In this case, it was analysed how the estimates depend on the choice of GCM runs. With this aim, the computations were repeated for each of the 47 runs (see Supplement A) individually and for combinations of them. As the lengths of the runs differ, this also provides some insight into how sensitive the results are to the length of the time series. For example, the estimated return period of three events per winter above the 1-year return level is ca. 15 to 16 years, depending on whether the whole data set, selected groups of runs or individual runs (see Table 2) are considered. The major difference lies in the uncertainty: while for all GCM corr data 15 ± 0.59 years is estimated, the value is, for example, 15 ± 1.05 years for all ESSENCE corr runs, 16 ± 3.2 years for the PRE corr run and 15 ± 8.24 years for the first ESSENCE corr run (length only 50 years; not included separately in Table 2). PRE corr is different because it is expected to feature more (multi-)decadal variability (505 years of a free-running coupled GCM simulation) than the shorter 50-year runs. These results demonstrate that the estimation of return periods by the negative binomial distribution is robust and depends only little on the length of the data set. The more events per winter are considered, the wider the uncertainty range. For a storm series as in 1990 (four events above the 1-year return level, three above the 2-year return level and two events above the 5-year return level), the negative binomial estimate of the return period lies between 40 and 65 years for all data sets and return levels. In all cases this is a more reliable estimate, compared to the empirical data (see Supplement B2), than the one based on the Poisson distribution, which yields 65 years at the 1-year return level and, for more extreme events at the 2-year (5-year) return level, 79 (61) years. The deviations between the Poisson and the negative binomial distribution are much larger if less frequent series are considered (Table 2).
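The return periods compared here are the reciprocal of the probability of observing exactly k events in a winter under each count model. The following stdlib-only sketch assumes a negative binomial parameterised by the mean μ and dispersion ψ (variance μ(1 + ψ)); the numbers are illustrative, but with μ = 1 (the 1-year return level, by construction) the Poisson case reproduces the 65-year figure for four events quoted above:

```python
import math

def nbinom_pmf(k, n, p):
    # Negative binomial pmf with real-valued shape n (Polya form),
    # evaluated in log space for numerical stability.
    log_pmf = (math.lgamma(k + n) - math.lgamma(n) - math.lgamma(k + 1)
               + n * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

def return_period_exact(k, mu, psi=0.0):
    """Return period (years) of exactly k events per winter.

    mu  : mean number of events per winter
    psi : dispersion (0 -> Poisson; >0 -> negative binomial with
          variance mu * (1 + psi))
    """
    if psi <= 0:
        p_k = math.exp(-mu) * mu**k / math.factorial(k)  # Poisson pmf
    else:
        p = 1.0 / (1.0 + psi)  # success probability
        n = mu / psi           # shape parameter
        p_k = nbinom_pmf(k, n, p)
    return 1.0 / p_k

print(round(return_period_exact(4, 1.0), 1))           # Poisson: → 65.2
print(round(return_period_exact(4, 1.0, psi=0.6), 1))  # neg. binomial, shorter
```

With positive dispersion the same four-event winter is markedly less rare, which is exactly the overestimation by the Poisson distribution described in the text.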
For insurance applications, it is often desirable to consider not exactly a certain number of events, but rather a minimum number, e.g. three or more events per year above the 2-year return level. With this aim, the estimations of Table 2 were computed for cumulative probabilities (Supplement C). The results are in line with the previous ones: for example, the estimated return period for four or more events at the 1-year return level is between 26 and 40 years based on the negative binomial distribution, whereas for the Poisson distribution it is 53 years. For two or more events at the 5-year return level the range is between 42 and 53 years with the negative binomial distribution, while for the Poisson distribution it is 57 years. Also from this perspective, the results clearly indicate the importance of estimates with the negative binomial distribution, which considers explicitly the clustering of events.
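The "k or more events" case replaces the point probability by the survival function, i.e. RP = 1/P(N ≥ k). A sketch under the same illustrative parameterisation (mean μ, dispersion ψ; values hypothetical, though with μ = 1 the Poisson case reproduces the 53-year figure quoted above):

```python
import math

def pois_pmf(k, mu):
    return math.exp(-mu) * mu**k / math.factorial(k)

def nbinom_pmf(k, n, p):
    return math.exp(math.lgamma(k + n) - math.lgamma(n)
                    - math.lgamma(k + 1)
                    + n * math.log(p) + k * math.log(1.0 - p))

def return_period_at_least(k, mu, psi=0.0):
    """Return period (years) of k OR MORE events per winter: 1/P(N >= k)."""
    if psi <= 0:
        tail = 1.0 - sum(pois_pmf(i, mu) for i in range(k))
    else:
        p, n = 1.0 / (1.0 + psi), mu / psi   # variance = mu * (1 + psi)
        tail = 1.0 - sum(nbinom_pmf(i, n, p) for i in range(k))
    return 1.0 / tail

print(round(return_period_at_least(4, 1.0), 1))           # Poisson: → 52.7
print(round(return_period_at_least(4, 1.0, psi=0.6), 1))  # neg. binomial, shorter
```

As with the exact-count case, allowing for overdispersion shortens the estimated return period of multi-event winters.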
Summary and conclusions

For insurance applications, it is important to use reliable methods to estimate the "occurrence loss exceedance probability" (OEP) and the "aggregate loss exceedance probability" (AEP). With this aim, an adequate quantification of clustering is essential. In this study, we analysed different methods to estimate the return period of series of windstorm-related losses exceeding selected return levels. For statistical robustness, a combination of two reanalysis data sets, DWD station observations and an ensemble of over 4000 years of GCM runs was considered. First, the potential loss for Germany was estimated for all data sets using an approach following the storm loss model of Klawa and Ulbrich (2003), complemented by a meteorological index (Pinto et al., 2012). These methods were adapted to separate consecutive potential losses associated with extreme events within 3 days. As Germany is a comparatively small area, this time frame is reasonable for separating events. Moreover, it accords with the 72 h event definition often used by insurance companies in reinsurance treaties (Klawa and Ulbrich, 2003). The estimated events were ranked, and only the top events corresponding to return levels of 1, 2 or 5 years were analysed. The distribution of the number of events per winter was then analysed. This was followed by the estimation of the return period of storm series like that of 1990 (with four storms in ERAI) with the Poisson distribution as well as with the negative binomial distribution. The main conclusion is that, especially for storm series with many events per winter (e.g. four events exceeding the 1-year return level), the Poisson distribution clearly overestimates the return period of storm series, as overdispersion is evident. Deviations from the Poisson distribution are also identified when considering the long GCM data set (over 4000 years), but the results show that mean estimates and uncertainties vary between data sets (see Table 2). In general terms, the negative binomial
distribution provides a good approximation of the empirical data. However, a constant overdispersion factor cannot be identified for storm losses, as the dispersion changes both with event intensity and between data sets. This suggests that the clustering of windstorms and associated losses is a complex phenomenon that needs further discussion. The primary advantage of considering the extended GCM data set is a strong reduction of the uncertainties.
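The 72 h separation of consecutive potential losses described in the conclusions can be illustrated with a simple declustering pass over a daily loss series. This is only a sketch of the general idea, not the authors' implementation; the data structure and field names are assumptions:

```python
from datetime import date

def decluster(daily_losses, window_days=3):
    """Group daily losses into events (hypothetical sketch of a 72 h rule).

    daily_losses : dict mapping datetime.date -> loss value
    Days closer than `window_days` to the previous event day are merged
    into that event; each event accumulates its total loss.
    """
    events = []
    for day, loss in sorted(daily_losses.items()):
        if events and (day - events[-1]["end"]).days < window_days:
            events[-1]["end"] = day        # extend the running event
            events[-1]["loss"] += loss
        else:
            events.append({"start": day, "end": day, "loss": loss})
    return events

# Two nearby loss days merge into one event; a later day starts a new one
losses = {date(1990, 1, 25): 5.0, date(1990, 1, 26): 2.0, date(1990, 2, 3): 4.0}
print(len(decluster(losses)))  # → 2
```

A final ranking of the event losses would then yield the 1-, 2- and 5-year return-level thresholds used throughout the paper.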
As good-quality insurance data or meteorological data (peak gusts) are mostly available only after 1970, it is difficult to classify the year 1990 based on the historical time period alone. According to our evaluation based on 30 years of observational data (NCEP, DWD, ERAI), there is a strong indication that the return period of this event combination (four events with a loss return level of ≥ 1 year) is longer than the available data length (30 years). The negative binomial distribution suggests return periods of about 49 years (ERAI). Nevertheless, the estimated uncertainty is large, as a data basis of only 30 years is clearly too short. By using the 4092 years of GCM data, a strong reduction of the uncertainty estimates was achieved. These results put the historical storm series into a much larger perspective: the estimates indicate that an occurrence of exactly four events like in 1990 takes place once in 40-53 years. If four or more events are considered, the estimated return period is between 26 and 40 years based on the negative binomial distribution.
The results of the present study are potentially helpful for insurance companies to parameterise loss frequency assumptions for severe winter storm events. In Germany, the possible number of significant storm events per year was intensively discussed after the storm series of 1990, which represents the top annual aggregated loss of recent decades (e.g. for insurance of residential buildings in Germany, after inflation correction: GDV, 2012). More than 20 years later, German companies still use the 1990 storm series as an internal benchmark test for their reinsurance cover or capital requirements. A similar discussion took place in France after the events "Lothar" and "Martin" (Ulbrich et al., 2001) hit the country in late 1999.
The present results demonstrate that the negative binomial distribution provides good estimates of return periods for less frequent storm series. Future work should focus on a more detailed analysis of events with different return periods within one winter, as this could improve the results. Furthermore, an investigation of the clustering within single CWTs, especially those with a high frequency of events, could be helpful for a better understanding of the physical aspects of clustering. Another interesting direction would be to perform a similar analysis for further European countries.
3-900051-07-0, http://www.R-project.org/. We thank the German Weather Service (DWD) for the wind gust data. Furthermore, we thank the two referees, who helped to improve the manuscript. Finally, we are grateful to Julia Mömken for the CWT computations and Mark Reyers (both University of Cologne) for discussions.
Edited by: B. Merz Reviewed by: R. Caballero and R. Vitolo

Figure 1. (a) Location of reanalysis grid points (black) over and near Germany and population density (POP, colours) in number of inhabitants km−2 per 0.25° grid cell; (b) same as (a) but for ECHAM5 GCM grid points; (c) same as (a) but for DWD stations. Only stations providing 80 % of the wind gust records for the period 1981/82 to 2010/11 are considered (112 stations). For each 0.25° grid cell, the wind gust is associated using the nearest-neighbour method.

Figure 2. Time series of 3-day accumulated losses between 15 January and 15 March 1990. The values are normalised by the maximum accumulated loss of the period 1981/82 to 2010/11 for each data set. (a) Comparison between MI derived from DWD gust observations (blue), MI estimates based on NCEP (green) and MI obtained from ERAI (orange); (b) same as (a) but for LI, additionally compared to simulated insurance data (GDV, red). Unlike MI, LI is population weighted.

Figure 3. Time series of the number of events per winter exceeding the 1-year return level (red), the 2-year return level (green) and the 5-year return level (blue) between 1981/82 and 2010/11. (a) LI estimated based on NCEP; (b) same as (a) but for ERAI; (c) same as (a) but for DWD gusts. The indicated year corresponds to the second year of a winter (2000 indicates 1999/2000).

Figure 4. (a) Distribution of events exceeding a certain return level depending on the CWT for LI NCEP. Colours denote the different return levels (0.5-, 1-, 2- and 5-year events); (b) same as (a) but for ERAI; (c) same as (a) but for the GCM ensemble; (d) same as (c) but for the corrected frequency of events per weather type based on ERAI. For (a) and (b) the total number of years is 30; for (c) and (d) it is 4092 years.

Table 1. List of the identified top 30 events and the corresponding return level for each event for NCEP, ERAI and DWD gust data. Dates are given as dd.mm.yyyy.

Table 2. Estimated return periods for three different return levels (1-, 2-, 5-year) based on the Poisson distribution (Pois. RP), the empirical data for each data set (eRP), and the negative binomial distribution (Neg. Bin. RP; with uncertainty estimates* using Gaussian error propagation) for NCEP, ERAI and independently selected GCM samples (GCM: all runs; GCM corr; 37 ESSENCE runs: ESS corr; 3 20C runs from MPI: 20C corr; PRE corr from MPI; 3 CSMT runs from MPI: CSMT corr; all runs indexed with corr are bias corrected based on CWTs), considering only the number of years available for each data set, respectively. The number of years is indicated below each data set. For further details see Table B1 in the Supplement.