Assessment of extreme wind speeds from Regional Climate Models – Part 1: Estimation of return values and their evaluation

Frequency and intensity of gust wind speeds associated with severe mid-latitude winter storms are estimated by applying extreme value statistics to data sets from regional climate models (RCM). Maximum wind speeds related to probability are calculated with the classical peaks over threshold method, where a statistical distribution function is fitted to the reduced sample describing the tail of the distribution function. From different sensitivity studies it is found that the Generalized Pareto Distribution in combination with a Maximum-Likelihood estimator provides the most reliable and robust results. For a reference period from 1971 to 2000, the ability of the RCMs to realistically simulate extreme wind speeds is investigated. For this purpose, data from three RCM scenarios, including the REMO-UBA simulations at 10 km resolution and the so-called consortial runs performed with the CCLM at 18 km resolution (two runs), are evaluated against observations and a pre-existing storm hazard map for Germany. It is found that all RCMs tend to underestimate the magnitude of the gusts by between 10 and 30% for a 10-year return period. Averaged over the investigation area, the underestimation is higher for CCLM than for REMO. The spatial distribution of the gusts, on the other hand, is well reproduced, in particular by REMO.


Introduction
Severe winter storms and related destructive wind speeds pose a significant threat to modern societies and their assets. In Central Europe, winter storms are responsible for more than 50% of the total economic loss due to natural hazards (Münchener Rück, 2007). Single extreme storm events with a low probability of occurrence, such as Lothar on 26 December 1999 (Ulbrich et al., 2001; Wernli et al., 2002) or Kyrill on 18 January 2007 (Fink et al., 2009), caused economic losses in excess of 10 billion EUR each. Therefore, accurate assessment of the frequency distribution of extreme near-surface wind speeds is a fundamental prerequisite for engineering, forestry, and risk management purposes.
Correspondence to: M. Kunz (michael.kunz@kit.edu)
In the light of global warming it is still a matter of debate to what extent the frequency and/or intensity of severe winter storms may change as a consequence of the greenhouse gas (GHG) forcing conditions expected in the 21st century (IPCC, 2007). A comprehensive review of the current knowledge on the climatology of mid-latitude cyclones for present and future climate conditions is given by Ulbrich et al. (2009). Several recent studies (e.g. Leckebusch et al., 2006; Pinto et al., 2006; Rockel and Woth, 2007; Pinto et al., 2007b) investigated the relation between the frequency and intensity of cyclones or extreme winds on the basis of global or regional climate models (GCM/RCM). They all found evidence of a slight increase in the frequency of high wind speeds over Europe, in particular at the end of the 21st century, compared to present climate conditions. The studies consider either changes in the number of events over a specific threshold (> Bf 8; Rockel and Woth, 2007) or in the 98th percentiles of the distribution function (Leckebusch et al., 2006; Pinto et al., 2007a; Fink et al., 2009). With such comparatively low thresholds, severe winter storms with high return periods (RP) or, inversely, low probabilities of occurrence are not adequately described.
Accurate assessment of the storm hazard in terms of wind speed related to probability is based on proper statistical description of the underlying data set. Extreme value statistics can be applied specifically to model the behaviour in the tails of the distribution of interest. Basically, two different methods exist for statistically describing a sample of extremes.
Published by Copernicus Publications on behalf of the European Geosciences Union.
One is the classical generalized extreme value (GEV) distribution, which is based on annual maxima. It comprises a family of three different probability distribution functions (Fisher and Tippett, 1928). The other approach is the peaks over threshold (POT) method, where all events above a chosen threshold are selected for the sample (Coles, 2001; Palutikof et al., 1999). This method increases the number of events included in the analysis and, correspondingly, reduces statistical uncertainty (Brabson and Palutikof, 2000). Key criteria for using the POT method are the definition of either a fixed threshold or a number of events constituting the sample, and event independence. The latter is taken into account by implementing a minimum time lag between two consecutive storm events. Once a sample of events has been created, the next steps are (i) identifying a statistical distribution best fitting the data, and (ii) estimating the unknown parameters of the distribution function. In principle, several statistical methods are applicable to model a sample of extremes.
In this paper, statistical methods most appropriate for characterizing a sample of daily maximum wind speeds are evaluated. First, the fits of three different cumulative distribution functions are computed: gamma distribution, exponential distribution, and generalized Pareto distribution (GPD). Next, four different methods are applied to estimate the a priori unknown parameters of the distribution function. The sensitivity and robustness of the results to variable sample sizes are evaluated. Maximum wind speeds obtained from three different RCM realizations for a control period between 1971 and 2000 are evaluated against a pre-existing storm hazard map and observation data. From this, specific characteristics inherent in the realizations and the models are examined.
In Part 2 (Kunz et al., 2010), relative changes in extreme wind speeds projected for the next few decades will be quantified. These studies will be based on results of five RCM runs comprising different model setups, different realizations of the driving GCM, and different emission scenarios.
The present paper is organized as follows: Sect. 2 contains a brief overview of the data sets used in this study. Section 3 provides a short description of the different distribution functions and parameter estimator methods. Section 4 identifies the methods most appropriate for describing extreme wind speeds. Section 5 provides an evaluation of the RCM simulation results for past decades. Discussion and some conclusions follow in Sect. 6.

Data sets
Validation of appropriate statistical methods and estimates of extreme wind climatology are based on the RCMs REMO and CCLM, which are briefly described in the following sections. Maximum wind speeds are examined for a control period from 1971 to 2000 (CTRL) and, for the statistical tests in Sect. 4, a projection period from 2021 to 2050 for the A1B emission scenario (A1B). For evaluation purposes, data of a pre-existing storm hazard map, complemented by synoptic stations (SYNOP) of the German Weather Service (Deutscher Wetterdienst, DWD), are used. As severe synoptic-scale storms in Central Europe are restricted to the winter season, only data for the months between October and March are analyzed. The entire area under investigation ranges from 47° to 55° N and from 5.5° to 15.5° E, covering all of Germany and parts of adjacent countries. The quantitative evaluation is conducted for Germany because high-quality data sets are available only for this region.

ECHAM5 global climate model
The ECHAM5/MPI-OM global model (Roeckner et al., 2003, 2006) is a coupled atmosphere-ocean model (Marsland et al., 2003) developed at the Max-Planck-Institute for Meteorology (MPI-M), Hamburg (Germany). The atmospheric component is the spectral model ECHAM5 run at T63 horizontal resolution, which approximately corresponds to a spatial resolution of 1.87° (∼200 km). The simulations carried out at the MPI-M used historical GHG and aerosol concentrations for the period 1860-2000. The initial conditions of the three realizations are different states of the 500-year pre-industrial control run based on constant GHG concentrations of the year 1860. We used daily maximum near-surface (10 m) wind speeds from the first realization to examine the sensitivity of the statistical distribution function to variable sample sizes.

Regional Model (REMO)
REMO is a hydrostatic RCM based on the Europamodell, the former operational weather prediction model of DWD (Majewski, 1991). Further development and various applications were carried out at the MPI-M (Jacob, 2001).
The REMO experiments used in this study (Jacob, 2005a,b) were commissioned by the German Federal Environment Agency (UBA); they cover Germany, Austria and Switzerland (Jacob et al., 2008) with a very high spatial resolution of 0.088° (∼10 km). In these model runs, REMO employs the physical parameterization schemes of ECHAM4, not those of the Europamodell. In the vertical dimension, a hybrid coordinate system is used with terrain-following model levels. The horizontal grid is a rotated Arakawa C-grid where all variables except the wind velocity refer to the centre of a grid box. The model runs are driven by initial and boundary conditions of ECHAM5/MPI-OM applying a double nesting method with an intermediate step of 0.44° (∼50 km) resolution. Daily maximum gust speeds were examined, which are parameterized according to the Europamodell in terms of turbulent kinetic energy (TKE) in the atmospheric boundary layer (Majewski, 1991). An overview of the RCMs and their setups is presented in Table 1.
The CCLM simulations considered here are documented by Hollweg et al. (2008). These so-called consortial runs (Lautenschlager et al., 2009a,b), hereinafter called CCLM1 and CCLM2, are based on the first two realizations of ECHAM5/MPI-OM (see Table 1 for an overview of the model). Terrain-following coordinates on a rotated Arakawa C-grid are used. The 10 m maximum gusts are determined as the peak levels of both turbulent and convective gusts (Schulz and Heise, 2003). Turbulent gusts are derived from the turbulent state in the atmospheric boundary layer by way of the drag coefficient for momentum and absolute wind speed at the lowest model level. Convective gusts, on the other hand, are parameterized in terms of the wind speed transported from higher to lower levels by the downdraft according to the convection scheme of Tiedtke (1989). In the output of the consortial runs, however, convective gusts are restricted to a maximum value of 30 m s⁻¹. As it is not known whether a value of 30 m s⁻¹ is related to convective or turbulent gusts, direct use of the model gusts is not possible. Therefore, gusts were parameterized subsequently by multiplying mean wind speeds with constant empirical gust factors, f, which depend solely on the roughness length, z_0, of the terrain. The same approach was applied successfully in the storm hazard map presented in the next section (Hofherr and Kunz, 2010). The gust factors do not consider the actual weather situation, in particular the conditions within the boundary layer. Especially in the case of strong convective gusts, this may lead to an underestimation of actual gust speeds near the surface. Daily maximum gusts are computed for each grid point by
$$v_{\max} = f(z_0)\,\sqrt{\bar{u}^2 + \bar{v}^2}, \qquad (1)$$
where ū and v̄ are the horizontal components of the daily maximum mean wind speed. The gust factors, f, were derived originally from the ratio between turbulent fluctuations (averaged over a 3-s period) and mean wind speed (averaged over a 10-min period) for different vegetation characteristics according to the studies of Wieringa (1986). They are listed in Table 2 together with the number of grid points within each category.
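The gust-factor approach described above can be sketched in a few lines of Python. This is only an illustration: the gust factors and roughness classes below are hypothetical placeholders (the values actually used are those of Table 2), and the function names are assumptions introduced here.

```python
import math

# Hypothetical gust factors per roughness-length class (z0 in m).
# Placeholder values for illustration only; the study uses Table 2.
HYPOTHETICAL_GUST_FACTORS = [
    (0.03, 1.4),          # z0 <= 0.03 m
    (0.5, 1.6),           # 0.03 m < z0 <= 0.5 m
    (float("inf"), 1.8),  # z0 > 0.5 m
]


def gust_factor(z0):
    """Return the (hypothetical) gust factor f for a roughness length z0."""
    for upper_z0, factor in HYPOTHETICAL_GUST_FACTORS:
        if z0 <= upper_z0:
            return factor
    raise ValueError("roughness length must be a real number")


def daily_max_gust(u_mean, v_mean, z0):
    """Gust speed as f(z0) times the magnitude of the mean horizontal wind."""
    return gust_factor(z0) * math.hypot(u_mean, v_mean)


# Example: mean wind components of 3 and 4 m/s over smooth terrain.
print(daily_max_gust(3.0, 4.0, 0.01))
```

Because f depends on z_0 only, the resulting gust field inherits its spatial structure entirely from the mean wind and the land-use map, which is the limitation noted in the text.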

Synoptic station data (SYNOP)
Wind measurements at manned and automatic weather stations of DWD were used for quantitative evaluation of the RCM results (see Fig. 1 for station locations). These data comprise hourly means and daily maximum gust speeds.
Since wind measurements over a longer time period may be affected by changes in the environment, relocations or changes in instrumentation, the data were checked thoroughly. Where possible, inhomogeneities in the time series were corrected (Hofherr and Kunz, 2010). Measurement errors as well as local gusts related to thunderstorms were filtered out by comparing daily maximum wind speeds with both hourly observations from the same station and time series from neighbouring stations. Only stations in continuous operation over at least 20 years were used, yielding a total number of 150 stations. As can be seen in Fig. 1, the station sites are spread more or less evenly over the whole domain.
The relative frequency distribution in terms of station height is similar to that of CSHM and the RCMs.

Storm hazard map CSHM
To close the gap between point observations (SYNOP) and area-covering RCM data, wind fields from a pre-existing storm hazard map (hereinafter referred to as CSHM) of the Center for Disaster Management and Risk Reduction Technology (CEDIM) were used additionally (Heneka et al., 2006; Hofherr and Kunz, 2010). The CSHM is based on simulated wind fields of the annual most severe storms between 1971 and 2000 using the numerical Karlsruhe Atmospheric Mesoscale Model (KAMM; Adrian, 1987). The model, initialized by ERA-40 re-analysis data (Simmons and Gibson, 2000), provides mean wind speed at a very high spatial resolution of 1 km × 1 km. This resolution allows various orographic structures relevant for the near-surface winds to be resolved. A nudging technique, i.e. a weak relaxation towards an atmospheric reference state, was employed to adjust the simulated wind fields to the observations (factors limited to a range between 0.7 and 1.3). As KAMM has no internal parameterization scheme for the gusts, they were determined by multiplying mean wind speeds with empirical gust factors depending on land use only (same as for CCLM). Finally, a statistical Gumbel distribution function was fitted to the wind data modelled at each grid point to obtain extreme gust speeds for specific return periods. The CSHM wind fields are quantitatively as well as qualitatively in good agreement with the observations. Only the highest wind speeds, especially over the mountains, are somewhat underestimated.

Statistical methods
Statistical modelling is based on time series for each grid point, from which the strongest events are selected. To obtain a sample of statistically independent events, a minimum time lag of three days between two storms is required (Palutikof et al., 1999). After creation of a sample, the next steps are (i) identification of a probability distribution function (PDF) best fitting the data, and (ii) estimation of the unknown parameters. Several statistical methods, summarized below, are applicable to a sample of extremes. These methods rely on stationary samples which are not influenced by any trend. A check of the REMO data yielded only insignificant trends within the CTRL period. In the case of non-stationary samples, methods such as non-stationary GPD models should be used (see, e.g., the review of Khaliq et al., 2006).
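A minimal Python sketch of such an event selection is given below. It assumes a gap-free daily series in which the list index is the day number, interprets the minimum time lag as "at least three days apart", and uses a greedy selection from the strongest day downwards; the function name and these choices are illustrative assumptions, not the procedure documented in the study.

```python
def select_independent_peaks(daily_max, n_events, min_lag_days=3):
    """Return the day indices of the n_events largest daily maxima that are
    at least min_lag_days apart (greedy, strongest day first)."""
    # Day indices ordered from the strongest to the weakest daily maximum.
    order = sorted(range(len(daily_max)),
                   key=lambda day: daily_max[day], reverse=True)
    chosen = []
    for day in order:
        # Accept a day only if it is far enough from every accepted event.
        if all(abs(day - other) >= min_lag_days for other in chosen):
            chosen.append(day)
            if len(chosen) == n_events:
                break
    return sorted(chosen)


# Toy series of daily maximum wind speeds (m/s).
series = [10, 30, 28, 12, 11, 25, 9, 8, 27, 5]
print(select_independent_peaks(series, 3))
```

In the toy series, day 2 (28 m/s) is rejected because it lies within three days of the stronger day 1, so the next independent peaks are taken instead.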

Generalized Pareto distribution
The GPD describes the behaviour of extreme values above a defined threshold, ζ (Hosking and Wallis, 1987; Palutikof et al., 1999). To prevent the number of events from differing between samples, ζ is not prescribed as a fixed value here but adjusted at each grid point, which ensures uniform sample sizes.
The cumulative distribution function (CDF) of the GPD is defined by
$$F(x) = 1 - \left[1 - \frac{k\,(x-\zeta)}{\alpha}\right]^{1/k}, \qquad (2)$$
where x is the random variable. The shape parameter, k, indicates the width, the scale parameter, α, the slope of the CDF. If k=0, the GPD is reduced to the exponential distribution,
$$F(x) = 1 - \exp\left(-\frac{x-\zeta}{\alpha}\right). \qquad (3)$$
With the crossing rate, λ, as the expected number of peaks above ζ per year, the wind speed as a function of the return period, T, is
$$x_T = \zeta + \frac{\alpha}{k}\left[1 - (\lambda T)^{-k}\right], \quad k \neq 0, \qquad (4)$$
$$x_T = \zeta + \alpha \ln(\lambda T), \quad k = 0. \qquad (5)$$
These functions are called the hazard relations, also defining the hazard curves. For k>0, x_T converges asymptotically towards an upper bound, ζ+α/k, for increasing T. For k≤0, however, no finite upper bound exists. This implies an unbounded increase in wind speed for increasing return periods, which makes no sense physically. The advantage of the exponential distribution (Eq. 3) is that a negative value of k is impossible, as k=0 by definition.
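The hazard relation can be illustrated with a short Python sketch. It assumes the GPD parameterization of Hosking and Wallis (1987), in which k>0 implies an upper bound, and a crossing rate λ given in events per year; the function names and the numerical values in the example are illustrative assumptions only.

```python
import math


def gpd_cdf(x, zeta, alpha, k):
    """CDF of the GPD above the threshold zeta (k = 0: exponential case)."""
    y = (x - zeta) / alpha
    if y < 0.0:
        return 0.0
    if abs(k) < 1e-12:
        return 1.0 - math.exp(-y)
    base = 1.0 - k * y
    if base <= 0.0:  # at or beyond the upper bound (only possible for k > 0)
        return 1.0
    return 1.0 - base ** (1.0 / k)


def return_level(T, zeta, alpha, k, lam):
    """Wind speed exceeded on average once in T years (lam: events/year)."""
    if abs(k) < 1e-12:
        return zeta + alpha * math.log(lam * T)
    return zeta + alpha / k * (1.0 - (lam * T) ** (-k))


# Illustrative numbers: 100 events in 30 years, threshold 25 m/s.
lam = 100.0 / 30.0
x10 = return_level(10.0, 25.0, 3.0, 0.1, lam)
# Consistency check: the exceedance probability times lam * T equals one.
print(x10, (1.0 - gpd_cdf(x10, 25.0, 3.0, 0.1)) * lam * 10.0)
```

For positive k the return level stays below ζ+α/k however large T becomes, which is the bounded behaviour discussed in the text.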

Gamma distribution
The gamma distribution is a two-parameter family of continuous CDFs, which is defined by
$$F(x) = \frac{1}{\Gamma(k)\,\alpha^{k}} \int_0^{x} t^{\,k-1}\, e^{-t/\alpha}\, \mathrm{d}t, \qquad (6)$$
where k and α again are the shape and scale parameters, respectively (Wilks, 1995). The Gamma function, Γ(k), is an analytic function that extends the concept of factorial to the complex numbers. The CDF of the gamma distribution takes on a wide variety of shapes as a function of k. It comprises the exponential distribution (for k=1), the chi-square distribution (for α=2) or the Gaussian distribution (for k→∞).
Because of this wide range of possible shapes, the gamma distribution was tested in addition to the widely used GPD to statistically describe extreme wind speeds.
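As a brief check of the exponential special case, and assuming the gamma density is written in the standard two-parameter form $f(x) = x^{k-1} e^{-x/\alpha} / (\Gamma(k)\,\alpha^{k})$, setting k=1 gives:

```latex
f(x)\big|_{k=1} = \frac{x^{0}\, e^{-x/\alpha}}{\Gamma(1)\,\alpha}
                = \frac{1}{\alpha}\, e^{-x/\alpha},
\qquad
F(x) = \int_0^{x} \frac{1}{\alpha}\, e^{-t/\alpha}\, \mathrm{d}t
     = 1 - e^{-x/\alpha}.
```

This has the same form as the exponential limit of the GPD when x is measured as the exceedance above the threshold. Note that the shape parameter k has a different meaning in the two families: the exponential case corresponds to k=1 for the gamma distribution but to k=0 for the GPD.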

Parameter estimation
Once an appropriate CDF (or PDF) has been adopted, the next step is estimating the unknown parameters, k and α.
Several methods are available, all of them basically applicable to any CDF. According to Hosking and Wallis (1987), for example, the method of moments (MOM) exhibits the highest efficiency in estimating the parameters of the GPD when k≈0, whereas probability weighted moments (PWM) is more appropriate when k≈−0.2. For large sample sizes, those authors judged the maximum-likelihood (ML) method to be the best choice when k>0. Kharin and Zwiers (2000) quantified extreme wind speeds from a GCM by using the GEV and the method of L-moments (LM) to estimate the unknown parameters. For a data set of observed extreme gust speeds, Brabson and Palutikof (2000) examined the sensitivities of the GEV and the GPD by using ML and PWM. In a recent study, Della-Marta et al. (2009) applied the GPD in combination with an ML estimator to a sample of modelled extreme wind speeds from ERA-40 re-analysis.

Method of moments
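The method of moments (MOM) is a parameter estimation technique based on matching the sample moments to the corresponding distribution moments (Hosking and Wallis, 1987). The unknown parameters, k and α, are determined by
$$k = \frac{1}{2}\left(\frac{\bar{X}^2}{s^2} - 1\right), \qquad (7)$$
$$\alpha = \frac{1}{2}\,\bar{X}\left(\frac{\bar{X}^2}{s^2} + 1\right), \qquad (8)$$
where X̄ is the mean and s² is the variance of the sample of threshold exceedances.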
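A minimal Python sketch of the MOM estimator for the GPD follows, using the moment relations of Hosking and Wallis (1987): k = (X̄²/s² − 1)/2 and α = X̄(X̄²/s² + 1)/2. It assumes that the input consists of exceedances above the threshold (x − ζ); the function name is an illustrative assumption.

```python
def gpd_mom(exceedances):
    """Method-of-moments estimates (k, alpha) for the GPD.

    exceedances: values above the threshold, given as x - zeta.
    """
    n = len(exceedances)
    mean = sum(exceedances) / n
    # Unbiased sample variance.
    var = sum((x - mean) ** 2 for x in exceedances) / (n - 1)
    ratio = mean * mean / var
    k = 0.5 * (ratio - 1.0)
    alpha = 0.5 * mean * (ratio + 1.0)
    return k, alpha


# Synthetic check: evenly spaced quantiles of a GPD with k = 0.2, alpha = 5.
n = 2000
sample = [5.0 / 0.2 * (1.0 - (1.0 - (i + 0.5) / n) ** 0.2) for i in range(n)]
print(gpd_mom(sample))
```

Since only the mean and variance enter, the estimator is cheap to evaluate at every grid point, but it gives the same weight to moderate and extreme exceedances.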

Method of maximum-likelihood
The method of ML is a technique identifying the most likely values of k and α for a given sample (Wilks, 1995). The method adopts parameter values by maximizing a likelihood function,
$$L(k,\alpha) = \prod_{i=1}^{n} f(x_i; k, \alpha), \qquad (9)$$
with f as the PDF. Usually, L is expressed by the log-likelihood function,
$$\ln L(k,\alpha) = \sum_{i=1}^{n} \ln f(x_i; k, \alpha). \qquad (10)$$
Taking the first derivatives of Eq. (9) with respect to the parameters of any PDF yields two equations for k and α (Hosking and Wallis, 1987). If necessary, the second derivative can be used to determine the sign of the solution. When k<0.5, the estimators have their familiar properties of consistency, asymptotic normality, and asymptotic efficiency (Hosking and Wallis, 1987). From the asymptotic covariance matrix of the ML estimator, uncertainties in the parameter estimates can be derived.
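One simple way to obtain approximate ML estimates for the GPD without an optimization library is a grid search on the profile log-likelihood, sketched below in Python. The reparameterization θ = k/α, the grid settings, and the restriction to |k| < 0.5 are assumptions made for this illustration; this is not the numerical scheme used in the study.

```python
import math


def gpd_ml_profile(exceedances, n_grid=200, u_min=-10.0):
    """Approximate ML estimates (k, alpha) for the GPD by grid search.

    With theta = k / alpha, the likelihood is maximized over k for fixed
    theta by k = -(1/n) * sum(log(1 - theta * x)), leaving a one-dimensional
    profile log-likelihood in theta. The grid runs over u = theta * x_max in
    [u_min, 1) and is restricted to |k| < 0.5.
    """
    n = len(exceedances)
    x_max = max(exceedances)
    best = None
    for j in range(int(round(u_min * n_grid)), n_grid):
        if j == 0:
            continue  # theta = 0 is the exponential limit, skipped here
        theta = (j / n_grid) / x_max
        s = sum(math.log(1.0 - theta * x) for x in exceedances)
        k = -s / n
        if k == 0.0 or abs(k) >= 0.5:
            continue
        alpha = k / theta
        loglik = -n * (math.log(alpha) + 1.0 - k)
        if best is None or loglik > best[0]:
            best = (loglik, k, alpha)
    if best is None:
        raise ValueError("no admissible parameter pair found on the grid")
    return best[1], best[2]


# Synthetic check: evenly spaced quantiles of a GPD with k = 0.2, alpha = 5.
n = 500
sample = [5.0 / 0.2 * (1.0 - (1.0 - (i + 0.5) / n) ** 0.2) for i in range(n)]
print(gpd_ml_profile(sample))
```

The restriction on k is needed because the GPD likelihood has no global maximum for large positive k; in practice a local maximum is sought, and a refined search around the best grid point would improve the accuracy.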

Method of probability weighted moments
The PWM method calculates the estimators by comparing the first two moments of the population with those of the sample. Unbiased estimators of k and α are given by (Hosking and Wallis, 1987)
$$k = \frac{b_0}{2b_1 - b_0} - 2, \qquad (11)$$
$$\alpha = (1+k)\,b_0, \qquad (12)$$
where the zeroth moment, b_0, is the mean of the sample (X̄), and the first moment, b_1, is given by
$$b_1 = \frac{1}{n}\sum_{j=1}^{n} \frac{j-1}{n-1}\,X_j, \qquad (13)$$
with n as the sample size and X_j as the members of the sample sorted in ascending order. According to Abild et al. (1992) and Palutikof et al. (1999), the estimates of Eqs. (11) and (12) are valid within the range −0.5<k<+0.5 only.
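The PWM estimator can be sketched in Python as follows. The sketch assumes the form k = b_0/(2b_1 − b_0) − 2 and α = (1 + k)b_0 with b_1 computed from the sample sorted in ascending order, and it expects exceedances above the threshold as input; the function name is an illustrative assumption.

```python
def gpd_pwm(exceedances):
    """Probability-weighted-moment estimates (k, alpha) for the GPD.

    exceedances: values above the threshold, given as x - zeta.
    """
    xs = sorted(exceedances)  # ascending order
    n = len(xs)
    b0 = sum(xs) / n
    # enumerate() yields rank - 1 = 0 .. n-1, i.e. the weight (j-1)/(n-1).
    b1 = sum(rank0 * x for rank0, x in enumerate(xs)) / (n * (n - 1))
    k = b0 / (2.0 * b1 - b0) - 2.0
    alpha = (1.0 + k) * b0
    return k, alpha


# Synthetic check: evenly spaced quantiles of a GPD with k = 0.2, alpha = 5.
n = 2000
sample = [5.0 / 0.2 * (1.0 - (1.0 - (i + 0.5) / n) ** 0.2) for i in range(n)]
print(gpd_pwm(sample))
```

For an exponential sample, 2b_1 − b_0 approaches b_0/2, so the estimate of k tends to zero as expected.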

Method of L-moments
The LM are linear combinations of certain probability weighted moments with simple interpretations as measures of location, dispersion and shape of the sample. The method is very robust and less sensitive to outliers in the data, even for small samples (Vogel and Fennessey, 1993). According to Hosking (1990), the first two L-moments for the distribution function are
$$\lambda_1 = \zeta + \frac{\alpha}{1+k}, \qquad (14)$$
$$\lambda_2 = \frac{\alpha}{(1+k)(2+k)}, \qquad (15)$$
where λ_1 and λ_2 depend on the estimated parameters, α and k, and the threshold, ζ (see Sect. 3.1.1).
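Assuming the GPD relations of Hosking (1990), λ_1 = ζ + α/(1+k) and λ_2 = α/((1+k)(2+k)), the two parameters follow by taking the ratio of the two L-moments. The worked inversion below uses sample L-moments l_1 and l_2 as stand-ins for λ_1 and λ_2 (these symbols are introduced here for illustration):

```latex
\frac{\lambda_1 - \zeta}{\lambda_2} = 2 + k
\quad\Longrightarrow\quad
\hat{k} = \frac{l_1 - \zeta}{l_2} - 2,
\qquad
\hat{\alpha} = (1 + \hat{k})\,(l_1 - \zeta).
```

If the sample L-moments of the exceedances are built from the same probability weighted moments, l_1 = b_0 and l_2 = 2b_1 − b_0, these expressions coincide algebraically with the PWM estimators, so any difference between the two methods then stems only from how the sample moments are computed.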

Uncertainty estimate
Application of extreme value theory comprises several statistical uncertainties. Factors contributing most to statistical errors are the infrequent occurrence of extremes, proper description of the tail of the distribution function, and proper estimate of the unknown parameters of a CDF. Even if the uncertainty can be reduced by appropriate statistical modelling of the underlying data set as evaluated in the next section, statistical errors still remain in the results.
To estimate the uncertainty of the results, confidence intervals of the fitted CDF are determined by a bootstrap method (e.g., Efron and Tibshirani, 1997). This method is based on a number of resamples obtained by random resampling with replacement of the original data set (non-parametric bootstrap). In this study, the original samples comprising the most extreme gusts are resampled 1000 times. Return values are estimated from each resample by fitting and inverting the GPD. Two-sided 90% confidence intervals are obtained from the bootstrap samples. Differences between two return values for a specific RP are statistically significant if their 90% confidence intervals do not overlap, which corresponds to the 1% significance level (Kharin and Zwiers, 2000).
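The non-parametric bootstrap can be sketched in Python as follows. To keep the example short and fast, the GPD is refitted to each resample with the method of moments rather than the ML estimator used in the study; this substitution, the percentile-based interval, and all function names are assumptions of this illustration.

```python
import math
import random


def gpd_mom(exceedances):
    """Method-of-moments estimates (k, alpha); stand-in for the ML fit."""
    n = len(exceedances)
    mean = sum(exceedances) / n
    var = sum((x - mean) ** 2 for x in exceedances) / (n - 1)
    ratio = mean * mean / var
    return 0.5 * (ratio - 1.0), 0.5 * mean * (ratio + 1.0)


def return_level(T, zeta, alpha, k, lam):
    """Return value for a return period of T years (lam: events per year)."""
    if abs(k) < 1e-12:
        return zeta + alpha * math.log(lam * T)
    return zeta + alpha / k * (1.0 - (lam * T) ** (-k))


def bootstrap_ci(exceedances, zeta, lam, T, n_boot=1000, level=0.90, seed=0):
    """Two-sided percentile confidence interval of the T-year return value."""
    rng = random.Random(seed)
    n = len(exceedances)
    estimates = []
    for _ in range(n_boot):
        # Resample with replacement, refit, and invert the GPD.
        resample = [exceedances[rng.randrange(n)] for _ in range(n)]
        k, alpha = gpd_mom(resample)
        estimates.append(return_level(T, zeta, alpha, k, lam))
    estimates.sort()
    lo_idx = round((1.0 - level) / 2.0 * n_boot)
    hi_idx = round((1.0 + level) / 2.0 * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Synthetic sample of 100 exceedances (quantiles of a GPD, k=0.2, alpha=5).
sample = [5.0 / 0.2 * (1.0 - (1.0 - (i + 0.5) / 100) ** 0.2) for i in range(100)]
print(bootstrap_ci(sample, zeta=25.0, lam=100.0 / 30.0, T=10.0, n_boot=200))
```

The width of the interval reflects sampling uncertainty only; it does not account for errors in the model data or in the choice of distribution.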

Evaluation of the most appropriate statistical methods
The different methods described in the section above are applied to wind speed data from REMO and ECHAM5 to find the CDF and parameter estimator most suitable for describing the properties of the sample. In Sect. 4.3, we will evaluate the stability of the results for different sample sizes. At each grid point, the 100 highest wind speeds in the 30-year periods are considered without any spatial clustering. Therefore, the days considered for the samples may differ slightly from one grid point to another.

Sensitivity to the probability distribution
The three CDFs (Eqs. 2, 3, and 6) are applied to a sample consisting of maximum gust wind speeds from REMO. From the theoretical distributions, the 90th and 99th percentiles are computed for each grid point and compared with the same percentiles of the sample. For better readability, the number of points in the diagrams is reduced to a test region covering the federal state of Baden-Württemberg (southwest Germany). However, the same results were obtained for the whole area under investigation (not shown).
The two data sets are presented as scatter plots referred to as quantile-quantile plots (Fig. 2). This allows easy examination and evaluation of the underlying CDF. The higher the scatter around the 1:1 line, the lower the skill of the CDF in reproducing the sample. For all three distributions displayed in Fig. 2, most of the points fall close to the 1:1 line. While the results for the GPD are almost unbiased, both the gamma and exponential distributions yield lower gust speeds than the sample. When the focus is on severe storms only, the tail of the distribution as expressed, for example, by the 99th percentile is the most interesting part. In all cases, its scatter is substantially higher than for the other percentile. The lowest scatter, in particular for the 99th percentile values, is obtained when applying the GPD. It should be noted that the unknown parameters, k and α, were estimated by the ML method in Fig. 2. The use of other parameter estimators, however, did not change our analysis of the most appropriate CDF in general (not shown).
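The percentile comparison underlying such a quantile-quantile plot can be sketched in Python. The linear interpolation used for the empirical percentile and the function names are assumptions of this illustration; the GPD percentile is the inverse of the CDF in the Hosking and Wallis (1987) parameterization.

```python
import math


def empirical_percentile(sample, p):
    """Percentile of the sample at probability p (linear interpolation)."""
    xs = sorted(sample)
    pos = p * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def gpd_percentile(p, zeta, alpha, k):
    """Percentile of the fitted GPD at probability p (inverse CDF)."""
    if abs(k) < 1e-12:
        return zeta - alpha * math.log(1.0 - p)
    return zeta + alpha / k * (1.0 - (1.0 - p) ** k)


# One point of a quantile-quantile plot: sample versus fitted 99th percentile.
sample = [25.0 + 25.0 * (1.0 - (1.0 - (i + 0.5) / 100) ** 0.2) for i in range(100)]
print(empirical_percentile(sample, 0.99), gpd_percentile(0.99, 25.0, 5.0, 0.2))
```

Repeating this for every grid point and plotting the pairs against the 1:1 line gives the kind of diagram described in the text.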
As the GPD appears to be the best choice to fit the wind speed data, it will be applied in all subsequent analyses of this study.

Sensitivity to the parameter estimation method
The four parameter estimators, MOM, ML, PWM, and LM, are applied to gust wind speed data from REMO. Results are evaluated in terms of the shape parameter, k, which is decisive in particular for high RPs (see Sect. 3). When k is positive, the GPD approaches an upper bound asymptotically for decreasing probabilities. For winter storms, the upper bound must be far below the physics-based limit of sound velocity, as the pressure gradient is assumed to have an upper limit due to the balance with Coriolis, acceleration, and friction forces according to the Navier-Stokes equations. Besides, a velocity in excess of 300 km h⁻¹ has never been observed in Europe (maximum recorded gust in Germany: 259 km h⁻¹ at the Wendelstein during windstorm Lothar on 26 December 1999). A sign change in k between two time periods is also problematic, because the hazard curves then tend to diverge with increasing RP. This, however, is important only for a low sampling uncertainty, as will be studied in detail in Part 2 (Kunz et al., 2010).
Histograms of k are calculated from all grid points of the investigation area. The four parameter estimators are considered for both periods, CTRL and A1B. As shown in Fig. 3, the histograms are not very sensitive to the estimation method. The requirement of k to be in a range of ±0.5 when applying the GPD is met for a large majority of grid points and for all methods. Only a small fraction lies outside this range (e.g., for ML: CTRL≈0.1%, A1B≈0.5%). Moreover, k is positive for most of the grid points, ensuring physically appropriate asymptotic behaviour of the GPD. This is the case for both the CTRL and the A1B period, in which k tends to assume higher values. Comparing the different methods shows that ML and MOM as well as PWM and LM yield approximately similar results, in particular for CTRL. The small differences between PWM and LM are not surprising since both are based on probability weighted moments of the sample. The histograms of both ML and MOM show a narrower spread and a larger share of positive k values compared to the two other methods. Overall, ML produces the largest number of grid points with k≥0 (approx. 93% in CTRL and 97% in A1B) and yields more pronounced peaks. For these reasons, we decided to apply ML in all subsequent analyses.

Parameter sensitivity to a variable sample size
The definition of either a fixed threshold, ζ (see Eq. 2), or a number of independent events entering the samples seems to be somewhat arbitrary. The samples must be large enough for any statistical analysis, but as the size increases, increasingly weaker wind speeds affect the CDF. In this section we explore the robustness of the statistics with respect to variable sample sizes, which vary between 40 and 150 (i.e., 1.3 and 5 events per annum on average, respectively). To elucidate the sensitivity of the parameters for a larger, representative domain unaffected by local-scale terrain variations, statistics were applied to daily maximum wind speeds from ECHAM5 (run 1).
According to Fig. 4a and b, the climate change signal, ΔV, is almost insensitive to changing sample sizes at the two selected grid points. Only for the high RP of 50 years does ΔV show considerable changes, in particular for the grid point at 49° N/9° E (Fig. 4b). Here, a sample size of 82 events, for example, yields a change of −0.1 m s⁻¹, whereas it is −1.1 m s⁻¹ when 137 events are considered. This confirms that the GPD is also sensitive to lower wind speed data and not only to the most extreme values (e.g., Felici et al., 2007). The high variability in ΔV, however, is within the confidence intervals for the whole range displayed. For the 50-year RP, statistical uncertainty decreases with increasing sample sizes, which is not the case for a 1- or 10-year RP. Note that only the grid point at 51° N/9° E (Fig. 4a) indicates a significant increase in extreme wind speed, as its confidence intervals include only positive values.
The free parameters of the GPD, k and α, were computed by the ML method also for variable samples of the CTRL run. As can be seen in Fig. 4, the two parameters depend on each other. An increase in k is linked with an increase in α, and vice versa. At the two grid points both parameters exhibit a comparatively high variability over the whole range of sizes displayed. No distinct plateau, where k and α are less sensitive to the number of data, can be identified. As discussed above, the resulting climate change signal, ΔV, however, is only marginally affected by this variability.
The hypothesis that 100 independent storm events may be a sound number for the samples is neither supported nor disproved by this sensitivity study. Rather it appears to be a good compromise between focusing on extremes and considering enough random variables to reduce statistical uncertainties. All subsequent examinations in this study rely on the GPD with an ML estimator, and consider an RP of 10 years.

Evaluation of regional climate models
The general ability of one-way nested RCMs to accurately simulate local-scale climate features when driven by large-scale information was demonstrated in the so-called Big-Brother Experiment designed by Denis et al. (2002). They found that small-scale low-level features absent from the initial and lateral boundary conditions are almost fully regenerated by the RCM. Using the same methodology, Diaconescu et al. (2007), however, showed that errors in the driving models are passed to the RCM, suggesting that the large scales precondition the small scales. This is an important constraint in particular for extremes, where the GCMs and, thus, the RCMs have problems describing the heavy tail of the distribution function. In the case of intense and small-scale storms, the low resolution of the driving GCM prevents reliable representation of the maximum intensity (e.g., Ulbrich et al., 2001). Further limitations in the RCMs' storm representation are due to simplifications in model physics and shortcomings in the gust wind parameterization schemes.
This section presents an evaluation of extreme gust wind speeds as obtained from three RCM realizations. By comparing these data with CSHM and station observations, main characteristics and features of the RCMs are identified. Keep in mind that only REMO data are based on a gust wind parameterization scheme in terms of TKE, whereas for CSHM and CCLM data empirical gust wind factors were applied (see Sect. 2). Besides, the CSHM wind fields are based on annual maxima of the most severe storms modelled by a Gumbel distribution function. The resulting hazard curves exhibit a steeper slope than that obtained from the GPD. For the 2- and 10-year RPs considered here, however, the differences between the two methods are only marginal.

Comparison of spatial gust patterns
The gust wind fields for a 10-year RP displayed in Fig. 5 show a distinct spatial variability due to the superposition of atmospheric disturbances induced on different spatial and temporal scales. On the large scale, the wind field is determined by the frequency and intensity of extratropical cyclones, both of which decrease in north-to-south and west-to-east directions (e.g., Della-Marta et al., 2009). On a local scale, the near-surface wind field is modified mainly by the terrain's roughness causing enhanced vertical exchange of horizontal momentum (Wieringa, 1993), and by orographic effects, in particular by flow deflections at orographic structures (Smith, 1979). Consequently, the regions most affected by high wind speeds are the North Sea coast and the crests of the mountains as long as they can be resolved by the models. In contrast, the lowest values are typical of the north-eastern areas and of broad valleys such as the upper Rhine valley in southwest Germany.
Comparison of the different model results reveals the benefit of higher model resolution. On the 1 km grid of CSHM (Fig. 5a), gust speed shows considerable variation on the local scale, which is more or less connected with the terrain elevation. Significant changes are found over distances of just a few kilometers. Unlike CSHM, the results of both RCMs exhibit considerably reduced spatial variability of the gusts, which is due in particular to the lower grid resolution of 10 and 18 km, respectively. All realizations exhibit significantly lower gusts than CSHM. This applies to the whole study area, irrespective of terrain and land use characteristics. Only REMO (Fig. 5b) shows distinct spatial variations of the gusts strongly controlled by the height of the terrain. Enhanced gusts over the crests of the low mountain ranges, a prominent feature in CSHM, are reproduced to a certain degree, but with a considerably lower magnitude of less than 35 m s⁻¹. Very high wind speeds in excess of 40 m s⁻¹ are obtained only over the Alpine regions of Switzerland and Austria.
In the two CCLM runs (Fig. 5c and d), spatial differences are even less pronounced. In general, the wind speed is reduced further compared to both CSHM and REMO. Certainly, one reason for the underestimation is the inability of CCLM to resolve important orographic structures because of the lower resolution of 18 km. Over almost flat terrain, however, such as the northern parts of Germany, where the horizontal resolution is expected to be of minor importance, CCLM still yields lower gust speeds than REMO. Hence, it can be assumed that the simplified gust parameterization based on empirical gust factors (see Sect. 2.3) is another reason for the negative bias. Although driven by different realizations of ECHAM5, the two CCLM runs show very similar structures. Only over the central parts of Germany is wind speed slightly more underestimated in the CCLM2 run than in the first one. On the other hand, both the patterns and magnitudes of the gusts are almost the same over Northern and Southern Germany as well as over the Alpine region.
Histograms of the gusts were drawn (Fig. 6) to quantify all differences of the three RCM runs without giving special consideration to the location. Related statistical values are listed in Table 3. In order to refer to the same domain, only grid points in Germany were considered in this analysis. Because the number of SYNOP stations is limited, both the histogram (moving average of 5·Δv = 0.5 m s⁻¹) and the statistical parameters should be regarded as rough estimates.
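The smoothing of a relative-frequency histogram can be sketched as follows. This is only an illustration under stated assumptions: the moving average is read here as a centred window of five bins of 0.1 m s⁻¹ each, and the function names, the edge handling, and the sample values are hypothetical rather than taken from this study.

```python
def relative_histogram(values, v_min, v_max, bin_width):
    """Relative number of values per bin on [v_min, v_max)."""
    n_bins = int(round((v_max - v_min) / bin_width))
    counts = [0] * n_bins
    for v in values:
        if v_min <= v < v_max:  # values outside the range are ignored
            idx = min(int((v - v_min) / bin_width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def moving_average(series, window=5):
    """Centred moving average; the window is shortened at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical 10-year gust values (m/s) at a handful of grid points
gusts = [24.1, 24.3, 24.35, 25.0, 29.2, 33.4]
hist = relative_histogram(gusts, v_min=20.0, v_max=40.0, bin_width=0.1)
smooth = moving_average(hist, window=5)  # 5 bins x 0.1 m/s = 0.5 m/s
```

Shortening the window at the edges is one of several possible choices; padding with zeros would lower the smoothed values near the range limits.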
Again, the magnitudes of the gusts are seen to be controlled strongly by the resolution of the model. Accordingly, the highest medians of 34.2 and 33.4 m s⁻¹ are obtained from the SYNOP and CSHM data. In contrast, both CCLM runs, which show roughly the same distribution, yield the lowest median of 24.5 m s⁻¹. Compared to CSHM (and SYNOP), the REMO and CCLM gusts are lower by 4.2 and 8.4 m s⁻¹, respectively, which is equal to a relative difference of 12.4 and 25.8%. All models show a distinct second peak due to the increase of wind speed over the oceans and, depending on spatial resolution, the mountains.
Interestingly, the main parts of the histograms of REMO and CCLM1 are shifted relative to each other, while their second peaks exhibit almost the same values of approximately 35 m s⁻¹. The shift between the major parts of the histograms suggests a kind of linear relation between the two data sets, at least on average and irrespective of the location. This can be interpreted as a further indication of weaknesses in the rough gust wind parameterization scheme applied to CCLM data. In Fig. 5, the area of the highest wind speed in both cases is located over the North Sea and the Baltic Sea, where the spatial resolution is not expected to be of major relevance. As these two models are driven by the same GCM realization, this means that the driving global model is decisive for the results of the RCMs as long as the terrain is almost homogeneous. Over complex terrain, on the other hand, the RCM features become more important, as can be seen in the differences between the REMO and the CCLM results.

Comparison at station locations
In the next step, wind speeds of the RCMs are evaluated against observations from DWD weather stations. For direct comparison, the model grid points closest to the observation site were used without any interpolation. It should be noted that wind speed observations are point measurements which may be influenced strongly by terrain and land use features in the immediate vicinity of the site. Model data, on the other hand, represent conditions on a scale similar to the grid size (more than 100 km²). Hence, any comparison is subject to the different representation of terrain characteristics.
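The selection of the closest grid point without interpolation can be illustrated with a minimal sketch. It assumes that grid-point and station positions are available as geographical latitude and longitude in degrees; the function names and coordinates are hypothetical, and the handling of the rotated model grids is not considered here.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_grid_point(station, grid_points):
    """Return (index, distance_km) of the grid point closest to the station.

    station: (lat, lon); grid_points: list of (lat, lon) tuples.
    """
    dists = [haversine_km(station[0], station[1], lat, lon)
             for lat, lon in grid_points]
    idx = min(range(len(dists)), key=dists.__getitem__)
    return idx, dists[idx]


# Hypothetical coordinates for illustration only
grid = [(48.6, 9.1), (48.7, 9.2), (48.8, 9.3)]
station = (48.69, 9.22)
idx, dist_km = nearest_grid_point(station, grid)
```

The returned distance may be used to flag stations whose nearest grid point lies unusually far away, for instance near the domain boundary.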
Gust wind speeds as a function of the return period, referred to as hazard curves, are shown in Fig. 7 at six sites related to specific terrain features and/or geographical locations (see Fig. 1). The stations of Heligoland (4 m a.s.l.) and Rostock (4 m), located on the North Sea and Baltic Sea coasts, respectively, show the highest wind speeds due to exposure to severe storms originating from the North Atlantic. Together with the stations of Teterow (46 m), Berlin (45 m), and Stuttgart (419 m), both the west-to-east and the north-to-south gradients in wind speed are displayed. While the Teterow station is surrounded by almost flat farmland, the Berlin station is representative of a large built-up area where wind speeds are reduced by a higher roughness length. Finally, the station of Hohenpeissenberg (977 m) illustrates the considerable increase of wind speed over the mountains.
Again, it is obvious that the three RCM simulations underestimate the gusts at all stations over the whole range of probability displayed. In particular, the two CCLM runs, which are in good agreement with each other, show the largest bias, except for the station of Heligoland. This indicates that the simplified gust parameterization scheme applied to CCLM data provides reliable results, at least over almost homogeneous, flat terrain. At the station of Berlin, the RCM results are similar to those obtained for Teterow, where the measurements yield higher gusts because of a low roughness length. As already seen in Fig. 5, the smallest discrepancies between observations and simulations appear for the island station of Heligoland, whereas the greatest differences are found for the mountain station of Hohenpeissenberg. Note, however, that the nearest-neighbour method applied in this comparison is problematic over complex terrain, where wind speeds vary considerably even over short distances. In all cases, the characteristics of the hazard curves are similar in terms of slope and asymptotic behaviour for long RPs. The hazard curves of CCLM1 and REMO intersect only at the station of Heligoland, and there only at high RPs.
Plotting positions of the cumulative probability (dots in Fig. 7) are estimated by $F_i = 1 - (i - 0.44)/(n + 0.12)$, where $x_1 \geq \ldots \geq x_n$ are the ordered maxima, $i$ is the rank of event $x_i$, and $n$ is the total number of events, according to Gringorten (1963). This plotting rule was originally derived from a double exponential distribution and is most appropriate for displaying extreme wind speed data. As can be seen in the spread of the maxima observed, severe storms occur only infrequently, which is a general feature of meteorological extremes. For example, when splitting the samples observed at the station of Stuttgart into two intervals of equal width (Δv = 9 m s⁻¹), 94 of the events lie in the first interval (20-29 m s⁻¹), whereas only 6 events are in the second interval (29-38 m s⁻¹). This results in high skewness for the tail of the distribution function. As can be seen in Fig. 7, this skewness is reproduced more or less by all RCM simulations and at all sites. At the grid points nearest to the station of Stuttgart, for example, the upper interval comprises only 14 events according to REMO and 12 and 5 events according to CCLM1 and CCLM2, respectively. Hence, it can be concluded that the RCMs are able to reproduce reasonable tails of the distribution function defining the extremes.
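The Gringorten plotting rule can be illustrated with a minimal sketch. The function name and the sample values are hypothetical; the conversion of the exceedance probability to a return period in years assumes a known mean number of events per year (`events_per_year`), which is an assumption of this sketch and not a value taken from the text.

```python
def gringorten_positions(maxima, events_per_year=1.0):
    """Gringorten plotting positions for a sample of gust maxima.

    Returns a list of (gust, F, return_period_years), ordered from the
    largest to the smallest gust. `events_per_year` is an assumed mean
    event rate used only to express the exceedance probability in years.
    """
    ordered = sorted(maxima, reverse=True)  # x_1 >= ... >= x_n
    n = len(ordered)
    rows = []
    for rank, gust in enumerate(ordered, start=1):
        p_exceed = (rank - 0.44) / (n + 0.12)  # exceedance probability
        cum_prob = 1.0 - p_exceed              # F_i
        return_period = 1.0 / (events_per_year * p_exceed)
        rows.append((gust, cum_prob, return_period))
    return rows


# Hypothetical gust maxima in m/s
sample = [30.0, 25.0, 38.0, 27.0]
positions = gringorten_positions(sample, events_per_year=1.0)
```

The largest event receives the highest cumulative probability and therefore the longest empirical return period, which is how the dots in a hazard-curve plot are positioned along the abscissa.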
The confidence intervals confirm that statistical uncertainty increases with RP. When an RP in the range of the period of investigation is considered without any extrapolation to higher levels, the statistical errors are below 20%. Interestingly, statistical uncertainties of the hazard curves calculated from RCM data are not higher than those from the observations. The staircase-shaped positions in the observations at the sites of Teterow, Rostock, and Berlin are due to rounding of the recorded data until 1990 at some stations of the former German Democratic Republic.
In order to rely not only on a few selected stations but to consider all observations in operation during CTRL, statistical distributions of the gusts were calculated for a 2- and 10-year RP (Fig. 8). Because of the strong influence of the terrain height on wind speed, the results were divided into three different height classes.
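The division into height classes and the computation of box-plot statistics can be sketched as follows. This is a minimal illustration only: the treatment of the class boundaries (200 m and 600 m assigned to the lower class), the quantile method, and the sample records are assumptions, and the function names are hypothetical.

```python
import statistics


def height_class(height_m):
    """Assign a terrain height to one of the three height classes."""
    if height_m <= 200:
        return "1-200 m"
    if height_m <= 600:
        return "200-600 m"
    return ">600 m"


def class_statistics(records):
    """records: iterable of (height_m, gust_m_per_s) pairs.

    Returns a dict mapping each height class to its summary statistics.
    Each class needs at least two values for the quartiles.
    """
    groups = {}
    for height, gust in records:
        groups.setdefault(height_class(height), []).append(gust)
    summary = {}
    for name, gusts in groups.items():
        q25, median, q75 = statistics.quantiles(gusts, n=4, method="inclusive")
        summary[name] = {
            "n": len(gusts),
            "min": min(gusts),
            "q25": q25,
            "median": median,
            "q75": q75,
            "max": max(gusts),
            "mean": statistics.fmean(gusts),
        }
    return summary


# Hypothetical (height in m, 10-year gust in m/s) pairs
records = [(4, 36.0), (46, 30.0), (45, 28.0), (419, 29.0),
           (300, 31.0), (977, 40.0), (800, 30.0)]
stats = class_statistics(records)
```

The same routine can be applied separately to the observations and to each model run so that the resulting distributions are directly comparable per height class.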
In the first class (1-200 m), the gusts observed vary between 24 and 36.5 m s⁻¹ for a 2-year RP, and between 26 and 39 m s⁻¹ for a 10-year RP. This class represents grid points with a great variety of terrain features, including coastal areas and lowlands as well as deep valleys down to the southernmost parts of the study area. As most of the grid points are from northern parts, where synoptic-scale storms occur more frequently than in all other regions, higher values are obtained in this class than in the second height class (200-600 m). All models, including CSHM, underestimate both the magnitude and variability of the gusts.
In the next class, 200-600 m, the RCM results are in better agreement with observations and CSHM. Closest to the distribution observed is REMO, in particular for the 2-year RP, where the median of the gusts is underestimated by only 9.0%. The two CCLM runs, on the other hand, show marginally reduced values, but also lower variability in terms of maxima and differences between the third and first quartiles.
As expected, the highest wind speeds are observed over mountainous regions (>600 m). This increase is adequately reproduced by REMO, but not at all by the two CCLM runs. The total spread of the distribution function from the observations (and CSHM) is also partly reproduced by the REMO results but, again, not by the two CCLM runs, for which only slight differences are obtained compared to the other two height classes. The bias of the model results is higher for the upper than for the lower tail of the distribution. This reveals that the highest wind speeds in the models exhibit a higher bias than the lower gusts. Interestingly, the distribution of the gusts observed extends to both sides in that height class. This means that both the highest and the lowest wind speeds are observed over the low mountain ranges, the latter in deep valleys on the lee side of the mountains.

Summary and conclusions
Extreme value analysis techniques were evaluated and applied to RCM results in order to estimate the skill of the models to reproduce reliable distributions of gust wind speed. First, we tested different distribution functions and methods of parameter estimation as well as the robustness of the methods to variable sample sizes. Maximum gust wind speed data obtained from three RCM realizations were evaluated with observations and a pre-existing storm hazard map for Germany within the climatological period from 1971 to 2000. In this process, the main characteristics inherent in the different models and realizations were examined with respect to location and elevation.
The following conclusions, which are relevant to statistical descriptions of extreme wind speed data from an RCM (or GCM), are drawn from the statistical tests conducted in this study:
- The method found to be most suitable for the statistical description of a sample of extreme wind speed data is the GPD in combination with an ML estimator. Application of this method to RCM data yields the least scatter between the sample data and the values inferred from the fitted distribution, as well as the smallest number of grid points with a negative shape parameter. This finding is in agreement with other studies based on wind speeds from observations or reanalysis data (e.g., Brabson and Palutikof, 2000; Della-Marta et al., 2009).
- For high RPs (>10 a), the results were found to be sensitive to variable sample sizes. A good compromise is a 10-year RP, where related gust speeds are highly destructive but, on the other hand, statistical uncertainty is acceptable.
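How a return value follows from fitted GPD parameters in a peaks over threshold setting can be sketched as below. This is a hedged illustration, not the implementation used in this study: the sign convention of the shape parameter k (positive k giving a bounded upper tail) is assumed, the ML fitting step itself is not shown, and the function name and all parameter values are hypothetical.

```python
import math


def gpd_return_level(threshold, alpha, k, rate_per_year, return_period):
    """Gust speed exceeded on average once per `return_period` years.

    threshold: POT threshold u (m/s)
    alpha: GPD scale parameter (> 0)
    k: GPD shape parameter (assumed convention: k > 0 -> bounded tail)
    rate_per_year: mean number of threshold exceedances per year
    """
    if alpha <= 0 or rate_per_year <= 0 or return_period <= 0:
        raise ValueError("alpha, rate_per_year and return_period must be > 0")
    events = rate_per_year * return_period
    if abs(k) < 1e-9:
        # Limit k -> 0: exponential tail
        return threshold + alpha * math.log(events)
    return threshold + (alpha / k) * (1.0 - events ** (-k))


# Hypothetical parameter values for a single grid point
v10 = gpd_return_level(threshold=20.0, alpha=3.0, k=0.1,
                       rate_per_year=5.0, return_period=10.0)
v50 = gpd_return_level(threshold=20.0, alpha=3.0, k=0.1,
                       rate_per_year=5.0, return_period=50.0)
```

Under the assumed convention, a positive k bounds the return values from above by threshold + alpha/k, whereas a negative k lets them grow without limit as the return period increases, so extrapolation to high RPs becomes particularly sensitive to the fitted shape.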
The climatological wind fields derived from the RCMs of REMO and CCLM show distinct differences in the magnitude of gusts and in spatial distribution. Several conclusions can be drawn from comparisons with observations and the storm hazard map (CSHM):
- The RCMs employed are basically able to reproduce reliable extremes, which occur only infrequently. This is a prerequisite when applying extreme value analysis techniques.
- In general, the spatial distribution of storm climatology is well reproduced by all model runs. However, statistics reveal a systematic underestimation of simulated gusts by 10 to 30% for a 10-year RP, depending on location and height of terrain. Similar trends were found by Leckebusch et al. (2006), who evaluated different RCM results from the EU project PRUDENCE with respect to the upper tail of the distribution function. However, to the best of our knowledge, no evaluation of RCM gust speeds described by extreme value statistics has been conducted so far.
- Higher spatial resolution of the models, such as the 10 km of REMO, permits better representation of the main orographic structures, thus yielding higher spatial variability of the gusts over complex terrain. Reliable representation of the local storm climate, however, requires even higher resolution.
- In this study, the variance of climatological wind fields is determined mostly by the RCM and its gust parameterization scheme. The influence of the driving global model in terms of the first two ECHAM5 realizations is found to be of minor importance, in particular over complex terrain where the spatial resolution and physical parameterizations are most decisive.
- Due to the variability of the model results, an ensemble of different RCM runs is essential for the assessment of future changes in extreme wind speeds. This was considered in the PRUDENCE project for mean wind speeds (Beniston et al., 2007; Rockel and Woth, 2007) and in the EU project ENSEMBLES for gust speed, as well as in a recent paper by Rauthe et al. (2010).
- In order to obtain more realistic wind speeds, it is essential to introduce comprehensive parameterization schemes for the near-surface wind fields and the gusts, for example the physically based gust model of Brasseur (2001). As shown in several studies, high-resolution limited area models are able to reproduce reliable wind gusts if a physically based gust wind parameterization is considered (Goyette et al., 2003; Pinto et al., 2009; Schwierz et al., 2009). The lack of such a parameterization in the CCLM runs is the main reason for the significant underestimation of the gusts compared to REMO.
The underestimation of the gusts can be ascribed to the spatial resolution of the global-regional model chain. As a result of the coarse resolution of approximately 200 km, the GCM is unable to reliably simulate pressure fields and pressure gradients (Della-Marta et al., 2009). Severe storms with a limited spatial extent of only a few hundred kilometres, such as windstorm Lothar in December 1999, are not simulated reliably by a GCM (Wernli et al., 2002). Even if the RCM offers the freedom to allow for the development of internal dynamical structures not imprinted on the lateral boundaries, it cannot be expected that exceptionally strong pressure gradients evolve within the RCM domain. This hypothesis is supported by the fact that CCLM runs on a 7 km grid, driven by ERA-40 reanalysis data, also yield lower pressure gradients and higher core pressures for past severe storms compared to observations. As shown by Hofherr and Kunz (2010), gust wind speeds obtained from these runs for specific RPs are considerably lower compared to both the CSHM and observations. Besides, the higher the resolution of the RCM, the steeper the mountain slopes, which produce flow acceleration and vertical transport of horizontal momentum, both increasing near-surface gusts.
On the other hand, it is found that the models are able to approximately reproduce the spatial variability of the wind fields in comparison with the observations and CSHM. Therefore, it can be assumed that the underestimation of the gusts is a systematic error. As far as the climate change signal is concerned, it can be supposed that the constraints and shortcomings discussed above will remain the same irrespective of the time period considered. Consequently, systematic errors will vanish when relative differences of gust wind speeds between two different time periods are computed.
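The cancellation of a systematic error in relative differences can be made explicit with a short worked relation. It is assumed here, purely for illustration, that the systematic error acts as a constant multiplicative factor $b$ on the simulated return value $v_T$ in both periods; the symbols $b$, $v_T^{\mathrm{ctrl}}$ and $v_T^{\mathrm{fut}}$ are hypothetical notation introduced for this sketch.

```latex
\frac{b\,v_T^{\mathrm{fut}} - b\,v_T^{\mathrm{ctrl}}}{b\,v_T^{\mathrm{ctrl}}}
  = \frac{v_T^{\mathrm{fut}} - v_T^{\mathrm{ctrl}}}{v_T^{\mathrm{ctrl}}}
```

Under this assumption, the factor $b$ drops out of the relative change. If the bias were instead additive or varied between the two periods, the cancellation would hold only approximately.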
The methods found in this study to be most suitable for statistically describing extreme wind speeds will be applied to different RCM scenarios in Part 2 of this study (Kunz et al., 2010). Relative changes in gust wind speed between a past (CTRL) and a future (2021-2050) time period will be examined using an ensemble of RCM runs comprising different models, emission scenarios, and realizations of the GCM.

Fig. 1. Orography of Germany resolved at 1 km × 1 km with locations of surface stations used for evaluation of the RCM results (white dots; black dots used in the comparison of Fig. 7).

Fig. 2. Quantile-quantile plots showing the fit of three theoretical distributions to the 90th and 99th percentiles of wind speed data from REMO for each grid point in the federal state of Baden-Württemberg. Each point represents the data from the sample on the ordinate and the corresponding theoretical estimate on the abscissa from the gamma (a), exponential (b), and generalized Pareto (c) distributions.

Fig. 3. Shape parameter, k, as a function of different parameter estimators (ML, PWM, LM, MOM) for REMO wind speed data.

Fig. 4. Absolute changes in gust wind speed between A1B and CTRL with 90% confidence intervals (dotted lines) for a 1-, 10-, and 50-year RP (a and b) and parameter values, α and k, for a 10-year RP in CTRL (c and d) as a function of sample size at two different grid points of ECHAM5, at 51° N/9° E (a and c) and 49° N/9° E (b and d).

Fig. 6. Relative number of grid points as a function of gust wind speeds for a 10-year return period from CSHM, REMO, CCLM1, CCLM2, and SYNOP. Statistical parameters of the distributions are listed in Table 3.

Fig. 7. Gust wind speed as a function of the return period with 90% confidence intervals for selected observation sites of Heligoland, Teterow, Rostock, Berlin, Stuttgart, and Hohenpeissenberg (station height indicated; see Fig. 1 for the locations) according to observations, REMO, and CCLM data.

Fig. 8. Distribution of gust wind speed for a 2-year (a) and 10-year (b) return period from observations and different RCMs for three height intervals. Indicated are maximum and minimum values, median and mean (star), 25th and 75th percentiles (box), and 1.5 × interquartile range (iqr; vertical lines).

Table 1. Technical description of the REMO and CCLM regional climate models from which results are used in this study.

Table 2. Empirical gust factors as a function of roughness length, z₀, with the corresponding gust factor, f, and the number of grid points (GP) in each category.

Table 3. Statistical parameters of the distribution of the gust wind speeds from SYNOP, CSHM, REMO, CCLM1 and CCLM2 data with mean and 25th, 50th, and 75th percentiles of the distribution function (q25, median, q75) in m s⁻¹.