Natural Hazards and Earth System Sciences Stochastic Bias-correction of Daily Rainfall Scenarios for Hydrological Applications

The accuracy of rainfall predictions provided by climate models is crucial for the assessment of climate change impacts on hydrological processes. In fact, the presence of bias in downscaled precipitation may produce large bias in the assessment of soil moisture dynamics, river flows and groundwater recharge. In this study, a comparison between statistical properties of rainfall observations and model control simulations from a Regional Climate Model (RCM) was performed through a robust and meaningful representation of the precipitation process. The output of the adopted RCM was analysed and re-scaled exploiting the structure of a stochastic model of the point rainfall process. In particular, the stochastic model is able to adequately reproduce the rainfall intermittency at the synoptic scale, which is one of the crucial aspects for the Mediterranean environments. Possible alteration in the local rainfall regime was investigated by means of the historical daily time-series from a dense rain-gauge network, which were also used for the analysis of the RCM bias in terms of dry and wet periods and storm intensity. The result is a stochastic scheme for bias-correction at the RCM-cell scale, which produces a realistic representation of the daily rainfall intermittency and precipitation depths, though a residual bias in the storm intensity of longer storm events persists.


Introduction
Remarkable research efforts have, thus far, been addressed to assess the predictability of the climate system by improving the climate model physics, resolution and parameterization for unresolved processes, which result in the development of high-resolution Global Climate Models (GCMs) and Regional Climate Models (RCMs).Nevertheless, the simulated climate behaviour is still far from being consistent at higher frequency and local scales, which are basically needed to undertake impact studies.As climate models use crude simplifications of complex atmosphere-land processes, their outputs cannot be expected to exactly reproduce the observed local dynamics.Consequently, a basin-scale assessment of climate change impacts may produce large biases in the simulation of river flows if the raw output precipitation from a GCM (or a RCM) is adopted.Particularly, the scale mismatch between climate model output and the spatial resolution (river basin or sub-basin), at which hydrological models are applied (e.g., Wilby et al., 2000;Burlando and Rosso, 2002), is the main limiting factor to the direct use of climate scenarios in impact prediction.Several hydrological impact studies require, in fact, accurate model simulations not only of time-average conditions, but also of the day-to-day (and even sub-daily) variability.In this framework, a rigorous model evaluation of the simulated daily precipitation statistics is an important step in assessing the models' reliability for climate impact applications (Frei et al., 2003).
Various downscaling techniques are used to bridge the mentioned scale gap, as reviewed in Fowler et al. (2007).
They range from simple schemes that use trends in climate variables from GCMs simulation to force the historical climate records, to the widely used dynamical downscaling obtained by nesting some RCM within a GCM (e.g., Giorgi and Mearns, 1991).Other downscaling approaches are based on statistical transfer functions linking local climate to the output of a GCM or RCM (e.g., Wilby et al., 1998), and those classified as climatic analogue-procedures.
Despite the physical consistency of the dynamical downscaling and the significant improvements in the RCMs' simulations of annual and seasonal atmospheric cycles, the bias inherited from the driving GCM (Fowler et al., 2007) inevitably remains (Wood et al., 2004).Moreover, larger errors are found in daily precipitation statistics, such as wetday frequency, precipitation intensity and quantiles of the frequency distribution.In particular, the simulation of prolonged dry summer conditions represents a problem for the climate models in the Mediterranean region as pointed out by Frei et al. (2003).
In the case of predicted daily precipitation over a given study area, comparisons with observational datasets should be focused on the daily behaviour of alternating dry and wet periods, this being one of the crucial aspects for the hydrological processes.So far, this kind of consistency analysis is seldom reported in the validation of global and regional climate models.On the contrary, these models are commonly assessed in terms of their simulated mean climatologies of a few recent decades, against gridded datasets of monthly observations (IPCC, 2007).
In order to obtain realistic, atmospheric forcing from climate models to be used in an impact study, methods for scalebias correction of the output variables are needed (Déqué, 2007).The performance of such methods relies on their ability to reproduce the space-time properties of rainfall fields, which have been recognized as fundamental issues for the improvement of observation and modelling techniques of hydro-climatologic processes (Deidda, 2000;Deidda et al., 2006).
Among several mathematical approaches, those based on point-process were proved particularly suitable when extreme rainfall events are considered (Salson and Garcia-Bartual, 2003).Starting from large scale atmospheric states predicted by global and regional models, a bias correction scheme using a stochastic model of storm arrival, duration and intensity is proposed in this paper.This methodology is proposed in order to project the rainfall regime, including extreme rainfall conditions and prolonged dry periods.

Data and methods
Before any bias correction method is developed and implemented, it is important to investigate the predictive performance of the adopted climate model.We evaluated the RCM's bias at basin scale against daily rainfall records from a rain gauge network.Then, we provided a simple framework to investigate possible alterations of the daily rainfall occurrence and intensity, under climate change, exploiting a stochastic simulation model suitable in investigating both ordinary regimes and extreme climate events.Based on the intermittency features of the daily rainfall process, the investigation has been undertaken using the statistical descriptors of the storm events which characterise the alternating renewal process of the so-called wet-dry spell model (Eagleson, 1978).Then the storm parameterization, based on RCM projections and local observations, has been adopted for the synthetic generation of daily rainfall at the RCM-cell scale.The stochastic rainfall generator was assumed from the rainfall representation proposed by Veneziano and Iacobellis (2002), which combines an "exterior process" at the synoptic scale with a hierarchical pulse model (IRP) for the "interior process".In this paper, only the model of the exterior process is used, which characterises the arrival, duration and average intensity of rainfall events.The within-storm rainfall intensity, which strongly depends on the interior process, is not considered here and is left open to further research.

RCM and observed rainfall series
We analysed the output of a high-horizontal-resolution RCM named EBU-POM (Gualdi et al., 2008), which was specifically developed for Southern Europe, Mediterranean and the Balkan areas by a scientific cooperation between the Italian INGV, the Serbian Republic Hydro Meteorological Service (RHMSS) and the University of Belgrade (UB).The model is a coupled atmosphere-ocean model with an atmospheric resolution of 0.125 • × 0.125 • .The atmospheric model component is the EBU limited area model, a version of the NCEP's Eta model, while the ocean component is the Princeton Ocean Model (POM) (Blumberg and Mellor, 1987).In this study, the boundary conditions for the RCM's simulations are taken from the SINTEX-G global model (Gualdi et al., 2003a, b).An overall validation of the EBU-POM model was done using the CRU CL 1.0 dataset covering 1960-1991 (Gualdi et al., 2008;New et al., 1999New et al., , 2000New et al., , 2002)).
A control run simulation, spanning over a 30-yr period , was adopted, in which the greenhouse gas concentrations of the historical period were specified based on the observations reported in the ENSEMBLES project (http://www.cnrm.meteo.fr/ensembles/public/results/results.html).This period corresponds to the current climatological normals established by the World Meteorological Organization (WMO), which provide a standard reference for many impact studies.Note, however, that in some regions, observations during this time period may exhibit anthropogenic climate changes relative to earlier periods.Nevertheless, the period 1961-1990 is used here as a baseline because it is of a sufficient duration to establish a reliable climatology, yet not too long, nor too contemporary to include any current global climate change signal.On the other hand, the future emission scenarios, adopted as radiative forcing in the global climate model, was assumed according to the Special Report on Emissions Scenarios (SRES) by the Intergovernmental Panel on Climate Change (IPCC).According to the IPCC and, in particular, to the Special Report on Emissions Scenarios (SRES), the radiative forcing is formulated following different hypothetical trajectories of the global socio-economic development, in terms of greenhouse gas emissions.Among the several possible greenhouse emission scenarios, six families of scenarios are discussed in the IPCC Fourth Assessment Report (IPCC, 2007), each containing individual common themes.The SRES A1 family, which is often adopted in future climate studies, is characterised by a rapid economic growth, a global population that reaches 9 billion in 2050 and then gradually declines, a quick spread of new and efficient technologies, a convergent income and way of life between regions and extensive social and cultural interactions worldwide.Within the A1 scenario family, the A1B scenario was adopted as the input to the SINTEX-G model covering the period 2000-2100.The A1B scenario is commonly referred to as a development trajectory with a balanced emphasis on all energy sources.
The 21st century run, available for the adopted RCM, covers the period 2003-2030.Such a period is often adopted as a crucial time-slice for the assessment of possible impacts and consequent adaptation measures, because of its proximity to the historical period and reliability of the greenhouse gas emission scenario.It is worth mentioning that the different scenarios provided by IPCC deterministically encompass some possible dynamic evolutions of the system, which also depend on the application of global strategies for limiting future CO 2 emissions.
The study area refers to seven model cells covering about 2000 km 2 of the Candelaro river basin (Fig. 1), a semi-arid catchment facing the Adriatic coast of Puglia in Southern Italy, with an elevation ranging from 0 to 1150 m a.s.l., surrounded by the Southern Apennine Chain (West) and the Gargano Promontory (North).This basin is part of the rural case study in the CIRCE research project (FP6 Project No. 036961), aiming to assess the climate change scenarios and impacts in the Mediterranean region.Twelve weather stations inside the Candelaro river basin were considered, from which daily rainfall series were extracted for the same period of the model's control run.
In the analysis, we have compared the rain gauge records with the seven model grid-cells, each with a size of 21 × 27 km 2 , covering the study area.To operate this comparison, the daily observations have been averaged over the grid cell domains.Basically, the daily observations have been weighted on representative areas of the rain gauge stations falling into each grid cell.The representative areas were delineated by the Thiessen polygons around each rain gauge station and, therefore, the observations corresponding to area-weighted values were clipped on the RCM grid (Fig. 1).
To analyse the intermittency properties of rainfall series, both the RCM control and scenario, and the observed datasets were decomposed into dry and wet clusters of days and the average daily intensity calculated for each wet period.Statistical tests were also performed on the daily rainfall observations in the study area to assure the stationarity of the reference period.
Results of the analysis of model bias and potential change in the rainfall regime are reported in this paper considering, separately, each of the seven RCM grid cells.Results, concerning the distributional agreement and suitability of the adopted stochastic model, are shown with reference to the data from cell #4.This cell was chosen as the one with the greatest amount of information in terms of the reference gauged stations (see Fig. 1).Moreover, cell #4 covers an area of about 560 km 2 which includes some important upstream sub-basins of the Candelaro river.

Rainfall model and stochastic generation of local precipitation scenarios
We adopt, as a reference rainfall model, a two-step scheme which distinguishes between the arrival, duration and average intensity of synoptic storms and then disaggregates rainfall intensity, in time, within each synoptic event.Following a standard notation in rainfall modelling, Veneziano and Iacobellis (2002) refer to the former as the exterior process and to the latter as the interior process.For the exterior process, a well-known representation of the daily rainfall as an alternating renewal process, with independent mean rainfall intensities for different rainstorms, is considered.The interior process is provided by an Iterated Random Pulse (IRP) representation with a random number, location and intensity of the pulses and multifractal properties at small scales.As a result, the combined stochastic process produces a random pattern of wet and dry periods inside the synoptic events of any preferred time resolution.For this paper, only the exterior process is considered for the bias-correction of the RCM scenario.For this kind of application, in fact, the coarse space-resolution of the RCM-pixel is unappealing for a complete representation of the rainfall space-time interior process (e.g., Veneziano et al., 2006).Size effects of the subdaily disaggregation of temporal rainfall will be the topic of future investigation.On the other hand, a space-time IRP  application would suit the local downscaling of an RCM output, but this goes beyond the topic of this paper.
The exterior model for the representation of storms on a daily scale consists of an alternating sequence of dry and wet periods with independent durations.We assume that the distribution of the dry periods is Weibull and that of the wet periods is exponential.The average rainfall intensities in different wet periods are independent and identically distributed variables, with exponential distribution.Therefore, four parameters are needed to describe the exterior process: mean value m D and exponent k D of the Weibull distribution of the dry periods, mean duration of the wet periods m W and mean value m I of the average rainfall intensity during the synoptic events.According to the Weibull distribution as in Stedinger et al. (1992), in fact, the scale parameter α D is related to m D and k D as: The stochastic modelling of the local rainfall process, through this scheme, explicitly allows for a reparameterization based on projected storm statistics derived from the climate models (Burlando andRosso, 1991, 2002;Kilsby et al., 2007).In particular, the model reparameterization needed to account for the climate scenarios is based on the storm properties of the observations and the structure of the stochastic model.Therefore, under the reasonable hypothesis that the model mismatch is due to, above all, an imperfect parameterization of the precipitation physics, we may derive the expected alteration of the rainfall regime using the change factor, obtained as the ratio between the model parameter for the 21st century run (21 c ) and the control run (20 c ), in Eq. ( 1): where ϑ * i represents a generic parameter to be used in constraining some stochastic rainfall model, while ϑ OBS i and ϑ RCM i are the corresponding parameters, respectively, extracted from the observational dataset and from the RCMsimulated time-series.
The proposed bias-correction methodology exploits the predictive capability of the adopted climate model and the scheme of the stochastic representation of rainfall in order to provide a realistic RCM-cell scale projection of the daily rainfall.

Results
The storm parameters adopted in the rainfall exterior process, namely mean storm intensity (m I ), mean wet duration (m W ), mean length of dry periods (m D ) and the Weibull shape parameter (k D ), were seasonally derived for the reference period  from the corresponding series of the RCM and observations.Exponential and Weibull probability charts were used to evaluate the fit of the respective storm features to the theoretical distributions assumed in the model scheme and compare the climate model output against daily rainfall observations.Figures 2, 3, and 4 show the empirical distributions of the observed and modelled variables for cell #4.They reasonably follow the respective theoretical distributions below the 90th percentile.In the same figures, in order to assess the statistical meaningfulness of distributional differences, the 95 % level confidence bands are represented, derived by using the percentile bootstrapping technique by Kottegoda and Rosso (2008, p. 517-519).According to the confidence bands, the distributional differences between the RCM-modelled and observed data are statistically meaningful for storm intensity in all seasons except spring (Fig. 2c), for wet periods, only in the spring season (Fig. 3c) and for dry periods, in spring and summer seasons (Fig. 4c and d).The RCM simulation for the reference period yielded an underestimation of the storm intensity in all seasons, particularly in autumn and winter, a large overestimation of the dry period lengths in spring and summer, and a good agreement for the wet durations in autumn, winter and summer.The good reproduction of synoptic weather conditions in autumn and winter is also shown in Fig. 5, for the same cell, in terms of empirical frequency distributions of storm occurrences conditional on duration.The analysis of model-observation bias was performed considering the ratio between each model parameter found as maximum likelihood estimates from the RCM and observed rainfall series ϑ RCM i (20 c )/ϑ OBS i (20 c ) .Table 1 shows the parameter estimates for the observed and RCM series relative to cell #4.The model bias is represented in Fig. 6 for all the seven cells.The general behaviour can be summarized into a poor capability to represent the intermittency of the daily rainfall, particularly for spring and summer inter-storm periods, and an overall underestimation of the mean storm intensity.
In the comparison of the storm parameters obtained from the RCM dataset and the observations (namely m I , m W and m D ), the non-parametric Wilconox rank sum method was employed in order to test the statistical meaningfulness of the equality of the means.This test was performed at a 95 % confidence level to the RCM and observed daily rainfall time-series regarding the reference period .The bias in the means was expressed as p-value (probability, under the null hypothesis, to obtain a value of the test statistic as high or higher than the value computed from the sample) in Table 2 for all the seven cells.Statistically, meaningful biases (p-values below 0.05) were found for dry and wet periods in all cells in different seasons, while storm intensity was biased in every cell and season with only one exception (cell 4, summer season).These results highlight the need to operate a model-bias correction in order to adopt the rainfall scenario for any further application.The ability of the rainfall generator to provide a realistic series of daily rainfall at the grid-cell scale was tested by graphical representation of distributional agreement between the observations and the simulated records referred to the calibration period .To this goal, the rainfall duration curves (RDCs) were used to represent the seasonal relationship between the magnitude and frequency of rainy days.In Fig. 7, the seasonal RDCs are reported for the stochastic rainfall generation, using the exterior process, and the station observations, having variable inter-storm intensity.The local scale reproduction of the synoptic rainfall process through the exterior model was able to capture the daily wet/dry alternation regime for all season, though assuming uniform rainfall intensity in each of the generated storms.Details on the distributional differences between the model-generated storm intensities for the reference period  and those evaluated from the observations are reported in Fig. 8 in the form of quantile-quantile maps.We observe that the limitations emerging in Fig. 7 may be dependent on the intrinsic structure of the stochastic generation of daily storms, with average intensity given by independent, exponentially distributed events, therefore, neglecting the superposition of pulses of the interior process, which was not modelled, as well as the inter-storm lacunarity.Nevertheless, the rainfall generator at least improved the reconstruction of the mean storm depth-duration curve modelled by the RCM in the reference period (Fig. 9), thus, assuring a quantitative balance between observed and modelled storm depths.
The expected alteration of the rainfall regime (under the A1B scenario) was assessed through the change factors of the aforesaid exterior model parameters (see Eq. 2) between the 21st century run and the control run (Fig. 10).Compared to the 20th century's observations , the obtained rainfall scenario  for the study area is characterised by longer dry spells in summer, a shorter storm duration in all seasons except spring and stronger rain intensity in summer and autumn.
Results relative to cell #4, according to the RCM rainfall scenario for the 21st century, show that the mean inter-storm period is predicted to increase by 66 % in summer and about 10 % or less in all other seasons; the mean storm duration is predicted to decrease by 15 %, 11 % and 18 %, respectively, in autumn, winter and summer; correspondingly the mean storm intensity is expected to increase by 21 % in autumn and 14 % in summer with smaller changes in the other seasons.The change of the Weibull shape parameter k D in all seasons shows very small variations around 1. With regard to the entire area of investigation, a certain homogeneity is found by comparing results of different cells.Some heterogeneity is shown only by cell #3 whose rainfall regime may be affected by the orographic influence of the Gargano Promontory.On the other hand, this cell covers a karstic area of the Candelaro basins which produces a low runoff contribution.
Finally, according to the proposed bias correction method, the mean storm intensity m I , the mean storm duration m W and the mean dry duration m D , were scaled, using Eq. ( 2), from those estimated from the observations.The Weibull shape parameter k D was kept constant (as it is only slightly altered, as shown in Fig. 10).The re-scaled model parameters were used as input to a Monte Carlo rainfall generator, producing statistically homogeneous time series of the future rainfall scenario (Figs.11 and 12).
The structure of Eq. ( 2) suggests that the behaviour of the change factor, discussed above, is inherited by the biascorrected series.Nevertheless, results in Figs.11 and 12 demonstrate that the impact of the bias-correction is significant with respect to the prediction of the evolution of rainfall regime.Looking at Fig. 11a and c, one might notice that the increased duration of dry spells is present, but strongly attenuated by the stochastic bias-correction with respect to the RCM scenario.On the other hand, the decrease in storm duration is amplified by the bias-correction as shown in Fig. 11b  and d. Figure 12 shows that in all seasons, but mainly in autumn and winter, the bias-corrected series reveals higher rainfall intensities than the observed rainfall data in the reference period, while the RCM scenarios predict an opposite trend.
Finally, as a general comment, the large distance between the raw and post-processed scenarios emerging at the local scale (corresponding to one grid-cell) always underlines the need to apply suitable bias correction methods to obtain more realistic input data to be used in climate change impact studies.Durations [days] Rainfall depth [mm] Observations  Generated rainfall exterior process  Linear regression (Observations) Linear regression (Generated rainfall exterior process )

Conclusions
For many freshwater-dependent ecosystems, the assessment of climate change impacts on hydrological response is driven, above all, by local precipitation patterns.RCMs, in general, tend to underestimate daily rainfall intensity in all seasons, mainly due to a coarse parameterization of process physics which is not able to reproduce the local scale atmospheric conditions causing intense convective precipitation.Also, the temporal dynamics of precipitation characterised by peculiar intermittency behaviours are only roughly captured by climate models.Therefore, appropriate correction procedures are required to obtain useful information from RCMs simulations.
The analysis of the rainfall intermittency on a daily scale provides a clear picture of the capability of climate models to predict daily precipitation in terms of wet and dry alternations and storm intensity against rainfall observations.At the same time, this type of analysis can be summarized into a few statistical parameters representing the process complexity.
The comparison between modelled and observed rainfall, in terms of storm parameters, allows for a bias-free assessment of climate change through the relative measure of alteration in the wet/dry periods and storm intensity.This approach has also immediate application in the development of stochastic weather generators, which are recognized as an effective operational tool to downscale RCM predictions at the local scale.In fact, by assuming a conventional exterior process of rainfall, only four parameters are needed to represent the point rainfall process encompassing both ordinary and extreme weather conditions (i.e.heavy storms and droughts).
The structure of the model adopted for the synthetic generator of rainfall records allows, in fact, to reproduce internally consistent climate records of any required length, in order to evaluate the hydrological impacts that may occur under ordinary and extreme weather conditions.By doing so, the model bias intrinsically affecting the daily rainfall scenarios  is not propagated into the station scale projection.At the present stage, the proposed bias-correction method is suitable in evaluating local scale climate scenarios and to overcome sample length limitations related to the available climate model runs through Monte Carlo simulations.
Concerning the study area and the adopted RCM simulation, the bias-correction method applied to the A1B precipitation scenario was presented in order to illustrate its key role with regard to the prediction of climate change.We clearly state that it was not possible to perform any out of sample validation of the bias-correction scheme because of the scarcity of observations for the 20th century and the stationarity of the considered reference period.Moreover, the validation of the bias-correction scheme cannot be disconnected from the main problem of validating climate change models.This is a crucial, scientific question, but is still open and well beyond the purpose of this paper.Nevertheless, according to results presented here, the rainfall projections obtained after the correction show: -a sensible attenuation of the drying climate scenario obtained by the raw RCM projection; -a stronger decrease of storm duration with respect to the raw RCM projection; -a significant increase in storm intensity which was not predicted by the raw RCM projection.
Based on these results, we believe it is important to acknowledge, at this stage of the research on climate change, the need of exploiting a well-established bias correction scheme in order to reconcile RCM outputs with ground-based observations.This limitation could be tackled in the next years by further development of Regional Climate modelling while only the detection of statistically significant alteration in the rainfall statistics of much longer historical records will strengthen the credibility of climate models and, above all, will validate the trajectories of the emission scenarios adopted to force them.

Fig. 1 .
Fig. 1.The rainfall network in the Candelaro river basin and the corresponding RCM grid-cells.

Fig. 2 .
Fig. 2.Comparison between the RCM simulation for the reference period (squares) and the rainfall observations (circles), using the exponential probability plots of the storm intensity with 95 % confidence bands of their corresponding theoretical distributions for autumn (OND) (a), winter (JFM) (b), spring (AMJ) (c) and summer (JAS) (d).The reported data refers to one single RCM grid-cell (cell #4) and the corresponding rain-gauge observations.

Fig. 5 .
Fig. 5. Storm events (relative to cell #4) ranked according to their durations, coming from the observations and RCM simulated precipitation for the period 1961-1990: (a) autumn season; (b) winter season.

Fig. 6 .
Fig. 6.Analysis of RCM predictive performance through the assessment of the model/observation bias ϑ RCM i

Fig. 9 .
Fig. 9. Average DDF curves relative to cell #4 for observed data (solid line) and generated rainfall exterior process (dotted line) for winter (a) and spring (b).The linear interpolations are based on the hypothesis of a maximum duration of single storms of less than five days.

Fig. 11 .
Fig. 11.Comparison for one between the probability plots of dry and wet periods obtained from the reference observations (circles), the RCM output for the 21st century (squares) and the bias-corrected rainfall (triangles).The two seasons, with the larger alteration in the 21st century, are reported for the summer dry periods (a), the summer wet durations (b), the autumn dry periods (c), the autumn wet durations (d).

Fig. 12 .
Fig. 12.Comparison for one RCM grid-cell (cell #4) between the probability plots of storm intensities obtained from the reference observations (circles), the RCM output for the 21st century (squares) and the bias-corrected rainfall (triangles).Storm intensities are reported for autumn (a), winter (b), spring (c) and summer (d).

Table 1 .
Analysis of RCM predictive performance through the assessment of the model/observation bias (RCM/OBS) of the following storm parameters relative to cell #4: mean storm intensity (m I ), mean wet period (m W ), mean dry period (m D ) and Weibull shape parameter (k D ).The parameters are estimated seasonally for the reference period.Each season of the year is reported according to the initials of the corresponding months.

Table 2 .
P-value results of the Wilconox rank sum test for the difference of means between observed and RCM data (reference period) at a 95 % confidence level.Bold characters are used to remark the passed test.