On the influence of cell size in physically-based distributed hydrological modelling to assess extreme values in water resource planning

This paper studies the influence of changing spatial resolution on the implementation of distributed hydrological modelling for water resource planning in Mediterranean areas. Different cell sizes were used to investigate variations in the basin hydrologic response given by the model WiMMed, developed in Andalusia (Spain), in a selected watershed. The model was calibrated on a monthly basis from the available daily flow data at the reservoir that closes the watershed, for three different cell sizes: 30, 100, and 500 m. The effects of this change on the hydrological response of the basin were analysed by comparing the hydrological variables at different time scales for a 3-yr period, together with the effective values of the calibration parameters obtained for each spatial resolution. The variation in the distribution of the input parameters caused by the different spatial resolutions changed the resulting hydrological networks and produced significant differences in other hydrological variables, both in mean basin-scale values and in values distributed at the cell level. Differences in the magnitude of annual and global runoff, together with other hydrological components of the water balance, became apparent. This study demonstrated the importance of choosing the appropriate spatial scale in the implementation of a distributed hydrological model to reach a balance between the quality of results and the computational cost. Thus, the 30-m and 100-m cell sizes could be chosen for water resource management without a significant decrease in the accuracy of the simulation, but the 500-m cell size resulted in a significant overestimation of runoff and, consequently, could lead to uncertain decisions based on the expected availability of rainfall excess for storage in the reservoirs. Particular values of the effective calibration parameters are also provided for this hydrological model and the study area.


Introduction
Physically-based distributed hydrological models reflect the spatial variability of input data and describe in great detail the processes occurring in the basin. These models are applied to each spatial unit into which the system is divided (normally, the cell of a digital elevation model, DEM) and relay the response of each unit to the point of departure, concatenating these responses over time and space in order to produce outputs at the basin scale (Beven, 1989). In this way, analytical models can be applied in systems with a great territorial extension, but the rigour and accuracy of the results depend on the quality of the spatial distribution of the input parameters and the adequacy of the global and/or analytical model underlying the spatiotemporal scale at which the modelling process is studied (Blöschl and Sivapalan, 1995). The processes involved in the energy and water balance in the basin present scale effects due to the non-linear character of the equations. In the water balance on the Earth's surface, non-linearity and high spatial variability are mainly caused by the key role of the soil water content and the spatial and temporal variability of the soil properties (Wood, 1998).
The spatial scale factor in hydrological models and its effect on their response have been studied since the 1960s. Choosing between the cell, sub-basin or basin as the spatial scale for modelling depends on the scale of the individual processes involved in the hydrological response of the basin of interest. The spatial variability existing within a sub-basin is reflected in the variation of the parameters governing the various processes (Vázquez et al., 2002). At present, it is possible to work at small scales with distributed models thanks to the availability of data from high-resolution remote or airborne sensing and intensive field measurements (Aguilar, 2008). Thus, it is possible to obtain a good approximation of reality with respect to the distribution of the parameters.
The cell size should be such that, firstly, it adequately represents the spatial variability of basin characteristics, and secondly, the modelling of the most significant processes represents reality with a significant degree of approximation. When the cell size is changed, the mean properties of the territory (e.g. slope, aspect, draining areas, etc.) in that area also change. This may have an important effect on the estimates given by the models (Zhang and Montgomery, 1994; Kuo et al., 1999). Therefore, it is expected that the effective parameter values characterizing the modelled processes will vary when the cell size is altered. In general, increasing the level of discretization increases the accuracy of the simulation, as smaller cells best represent the real variability of the basin. However, there is a level beyond which the model response does not improve, because the uncertainty associated with the spatial interpolation of the information cannot be reduced. Furthermore, a reduction in cell size implies an increase in computation time and in the analysis and data processing work (Vázquez et al., 2002).
On the other hand, given the practical impossibility of measuring the parameters required by a model in each cell, the scale effects are of particular relevance when estimating their effective values, in other words, those that give more accurate estimates (Mertens, 2003), especially for the use of physically-based hydrological models as predictive tools (Binley et al., 1989).
The analysis of these spatial scale factors is particularly relevant in the case of the Mediterranean Basin, where the spatial variability of rainfall can be very high, not only on an hourly and daily scale, but also seasonally and annually, due to the combination of rainfall events of a sometimes moderately torrential nature and the existence of significant topographic gradients. In these basins, the scale effects of an adequate spatial representation of rainfall on the hydrological response in the fluvial network can represent a significant fraction of the total set. The latter may include significant deviations in annual discharge and errors in the planning of water resources and their origin, especially in poorly gauged basins, which, unfortunately, is frequently the case. These facts pose a constraint for an adequate simulation of medium- and long-term hydrological performance, especially for the prevention of both drought and flood cycle effects, and for reservoir operation criteria.
This work studies the effects of the spatial scale on the characterization of precipitation in a Mediterranean basin and the influence on the water balance components, particularly on the estimated monthly runoff at the watershed scale, and the consequences for water resources planning. For this purpose, a headwater basin in southwestern Spain has been selected, where an Atlantic influence facilitates significant annual and seasonal wet cycles in spite of the Mediterranean character of the region where it is located.
The spatial scale effects on the hydrological modelling have been analysed through simulation with WiMMed, a physically-based distributed hydrological model developed for Mediterranean watersheds (Polo et al., 2009; Herrero et al., 2010), by comparing three different spatial resolutions commonly used in hydrological modelling: 30, 100, and 500 m. The implications derived for water resource planning, especially for drought effect prevention, are discussed.

Study area
The study area is the watershed contributing to the Zahara reservoir, a sub-basin located in the Guadalete River headwaters, in the southwest of the Iberian Peninsula (Fig. 1a). It is a mountainous basin with an area of about 131 km², with a topographic gradient of up to 1300 m a.s.l. (Fig. 1b). The geological configuration of the watershed presents a great heterogeneity of differentiated lithological units, among which Tertiary clays stand out, with blocks of a variable lithology, age and provenance, and, to a lesser extent, Jurassic limestones and dolomites in the areas of a greater relief (Fig. 2b), and Quaternary deposits associated with the drainage network (Consejería de Medio Ambiente, 2005; Cano and Ruiz, 1981; Moreno et al., 1980) (Fig. 1c). The land cover consists mainly of forest formations and dry-farmed crops without major urban areas (Consejería de Medio Ambiente, 2003) (Fig. 1d), which can be considered as natural in terms of hydrological characteristics. The average rainfall in the basin is about 900 mm·yr⁻¹, and average annual maximum and minimum temperatures are 22 °C and 11.5 °C, respectively.

Overview of the distributed model and input data used
WiMMed (Water Integrated Management for Mediterranean Watersheds) is a physically-based and distributed model at watershed scale developed for integrated management of water resources. The hydrological module was especially conceived to include characteristic aspects of Mediterranean watersheds, where both spatial and temporal variability play a relevant role in their highly variable response. The hydrological response of the watershed at every cell of the digital elevation model (DEM) used is simulated from the definition of an event or non-event situation, associated with the occurrence of rainfall in the watershed or with the period between events, respectively. The energy and water balance at each cell is implemented as a cascade of reservoirs (vegetation cover, snow cover, and vadose zone of the soil), where calculations are generally made on a time step of 1 h under event situations, or 1 day under non-event situations. If snow is present, this time step can be reduced to minutes or even seconds, regardless of the event/non-event situation, due to the requirements of the energy balance submodel (Herrero et al., 2009). The individual interpolation algorithms used to approximate the spatial distribution of each meteorological variable include corrections with height for both rainfall and temperature (Herrero, 2007), and topographic corrections for solar radiation (Aguilar et al., 2010). Interception by the vegetation is estimated by using the Rutter and Gash models from Landsat TM data analysis of cover fraction, and the available forest and crop cartography (Polo et al., 2011).
When snow is present, the snowmelt/accumulation dynamics are simulated by a 1-D thermodynamic model of energy and water balance in the snow cover (Herrero et al., 2009; Herrero et al., 2011b). During events (rainfall/snowmelt) and non-events (snowmelt), the infiltration fluxes are calculated by the Green and Ampt equation, where redistribution is approximated by means of a lag time (Aguilar, 2008); during non-events, the slow drying of the vegetation and soil is calculated by a combination of the Penman-Monteith equation and the Hargreaves equation (Aguilar and Polo, 2011). Water excess in every cell for each time step is routed as runoff along the hillslopes by means of a travel time distribution, estimated from an effective velocity field calculated from the DEM (Aguilar, 2008), which estimates the flow concentration at the selected points of interest in the fluvial network. Baseflow contribution is included from recession curves, where the difference between fractured or aggregated materials is taken into account (Millares et al., 2009). The resulting hydrographs are used for calibration at the available gauged points in the watershed, after time aggregation of results to meet the time scale for which the accuracy of the measured data is optimal for both event and non-event situations. These hydrographs can be used as input data for flow routing along the main river channels in the watershed by means of Muskingum routing or a 1-D high-resolution hydrodynamic model (Ávila, 2007), depending on the desired results and available morphological information.
A detailed description of each process and the associated parameter characterization/calibration can be found in the cited literature. Herrero et al. (2010) and Herrero et al. (2011a) provide the user's manual for the available WiMMed interface for Windows, and the theoretical basis of the hydrological-hydraulic model for water balance and flow calculations.
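As an illustration of the infiltration step mentioned above, the following is a minimal sketch of the classical Green and Ampt equation under continuous ponding. It is not the WiMMed implementation: the lag-time redistribution is omitted, the function names are hypothetical, and the parameter values are invented for the example. The implicit equation F − ψΔθ·ln(1 + F/(ψΔθ)) = K_s·t is solved for the cumulative infiltration F by Newton iteration, and the infiltration capacity is then f = K_s(1 + ψΔθ/F).

```python
import math


def green_ampt_cumulative(ks, psi, dtheta, t, tol=1e-9, max_iter=100):
    """Cumulative infiltration F(t) in mm under continuous ponding.

    ks     : saturated hydraulic conductivity (mm/h)
    psi    : wetting-front suction head, taken as positive (mm)
    dtheta : soil moisture deficit (dimensionless)
    t      : time since the start of ponding (h)
    """
    if t <= 0.0:
        return 0.0
    s = psi * dtheta
    f_cum = ks * t  # starting guess, always below the true root
    for _ in range(max_iter):
        # Residual of the implicit Green-Ampt equation and its derivative.
        residual = f_cum - s * math.log(1.0 + f_cum / s) - ks * t
        slope = f_cum / (s + f_cum)
        f_new = f_cum - residual / slope
        if abs(f_new - f_cum) < tol:
            return f_new
        f_cum = f_new
    return f_cum


def green_ampt_rate(ks, psi, dtheta, f_cum):
    """Infiltration capacity (mm/h) for a given cumulative infiltration."""
    return ks * (1.0 + psi * dtheta / f_cum)


# Hypothetical soil values, chosen only to exercise the functions.
ks, psi, dtheta = 10.0, 100.0, 0.3
F_1h = green_ampt_cumulative(ks, psi, dtheta, t=1.0)
rate_1h = green_ampt_rate(ks, psi, dtheta, F_1h)
print(f"F(1 h) = {F_1h:.2f} mm, f(1 h) = {rate_1h:.2f} mm/h")
```

Because the infiltration capacity is proportional to K_s, a uniform correction factor applied to the conductivity map rescales infiltration, and hence the water excess available as runoff, in every cell at once.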
In the study area, snow processes were negligible, and interception losses were not included because some of the required data were not available. The input data used in the hydrological simulations were:
Topographic features derived from the digital elevation model (DEM): surface drainage system, slope, and orientation.
Physicochemical and hydraulic properties of the soil selected from the available spatial database compiled by Rodríguez (2008), in which thematic maps were obtained for Andalusia at a 250-m resolution: hydraulic conductivity (mm·h⁻¹), saturation and residual moisture values (mm·mm⁻¹), air-entry matric potential (mm), van Genuchten retention parameter (dimensionless) and soil thickness (mm).
Land cover and land use information from different sources: forest and crop cartography (Consejería de Medio Ambiente, 2003), and cover fraction and albedo (dimensionless) from Landsat TM data analyses for the study period.

Grid resizing
Grid resizing was done by selecting different resampling techniques integrated in geographic information systems (GIS), according to the continuous or discrete nature of the data. The original DEM of 30-m cell size was resized to 100 and 500 m using a bilinear option; this technique uses a weighted average of the four nearest cells in the original raster dataset, and provides a smoothed surface at the coarser target scale. The input maps with discrete data, such as soil hydraulic properties, land cover features, and the aquifer regions, were also resized, but using the nearest-neighbour technique, which maintains the original raster values by assigning the value of the cell nearest to the resized cell centre. The variables related to topography, such as slope, aspect, flow direction, and sub-basin limits, were finally calculated from each resized DEM.
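The two resampling options described above can be sketched in a few lines of plain Python. This is an illustrative approximation, not the GIS tool used in the study: the function name `resample`, the grid layout (lists of rows, cell values located at cell centres) and the synthetic grids are assumptions made for the example.

```python
import math


def resample(grid, src_size, dst_size, method="bilinear"):
    """Resample a regular grid from src_size to dst_size (same length unit).

    'bilinear' takes a distance-weighted average of the four nearest source
    cell centres; 'nearest' copies the value of the closest source cell.
    """
    rows, cols = len(grid), len(grid[0])
    n_rows = int(rows * src_size // dst_size)
    n_cols = int(cols * src_size // dst_size)
    scale = dst_size / src_size
    out = []
    for i in range(n_rows):
        # Centre of the target cell expressed in source-index coordinates.
        r = (i + 0.5) * scale - 0.5
        out_row = []
        for j in range(n_cols):
            c = (j + 0.5) * scale - 0.5
            if method == "nearest":
                ri = min(max(int(round(r)), 0), rows - 1)
                ci = min(max(int(round(c)), 0), cols - 1)
                out_row.append(grid[ri][ci])
            else:
                r0 = min(max(int(math.floor(r)), 0), rows - 2)
                c0 = min(max(int(math.floor(c)), 0), cols - 2)
                wr = min(max(r - r0, 0.0), 1.0)
                wc = min(max(c - c0, 0.0), 1.0)
                top = grid[r0][c0] * (1 - wc) + grid[r0][c0 + 1] * wc
                bottom = grid[r0 + 1][c0] * (1 - wc) + grid[r0 + 1][c0 + 1] * wc
                out_row.append(top * (1 - wr) + bottom * wr)
        out.append(out_row)
    return out


# Synthetic 100 x 100 grids at 30 m: a planar "DEM" and a map of soil classes.
dem_30 = [[2.0 * r + 3.0 * c for c in range(100)] for r in range(100)]
soil_30 = [[(r // 10 + c // 10) % 4 for c in range(100)] for r in range(100)]

dem_100 = resample(dem_30, 30, 100, method="bilinear")
soil_100 = resample(soil_30, 30, 100, method="nearest")
print(len(dem_100), len(dem_100[0]))
```

The bilinear option returns interpolated values that may not exist in the source grid, which is acceptable for elevation but would produce meaningless intermediate classes for categorical maps; the nearest option only ever returns values already present in the source.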

Evaluation of model estimates
The WiMMed hydrological model was run at the study basin for the period 2003-2006 with three spatial resolutions: 30, 100, and 500 m of cell size. The scale effects arising from the different spatial definition of input variables and parameters on the annual runoff at the closure point of the watershed (Fig. 1) were analysed from the monthly calibration performed for each case. For this purpose, three different aspects were studied:
1. Differences in watershed morphology: the resulting drainage network from each DEM was analysed in terms of the slope distribution, channel versus hill cell fractions, and the total calculated watershed area.
2. Differences in water input from rainfall analyses: the resulting spatial distribution of the annual precipitation, and average rainfall at watershed scale.
3. Finally, differences in the hydrological simulation: effective values of the calibration parameters.
The hydrological variables analysed in this study were precipitation, direct runoff, infiltration, percolation and runoff. Outputs obtained at daily scale were aggregated into monthly scale in order to magnify spatial scale effects in the results.
WiMMed calibration was performed from the available daily flow data, given as daily contributions to the reservoir located at the basin outlet, during the period from 1 September 2003 to 31 August 2006. Daily data were aggregated on a monthly basis to overcome the evident gap filling made on the series, whose quality did not allow hydrological simulation on a daily basis; this time scale can be considered adequate for medium- and long-term water resource planning, but obviously not for flood simulation. The study period was simulated for each selected cell size, and the effective values of the calibration parameters were obtained by minimizing the root-mean-square error (RMSE) between the observed (subscript O) and simulated (subscript S) monthly series for each cell size, 30, 100, and 500 m (RMSE_O−S30, RMSE_O−S100, and RMSE_O−S500, respectively). The calibration process in WiMMed is performed through the application of a uniform dimensionless correction factor, which affects the original distributed map of each parameter. In this work, following the conclusions of the sensitivity analysis in Aguilar (2008), the following parameters were selected for calibration: surface saturated hydraulic conductivity (K_s1) and soil second-layer saturated hydraulic conductivity (K_s2). Thus, the calibration process produced the optimum values of two factors, f_Ks1 and f_Ks2, respectively.
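The calibration criterion described above can be sketched as follows. The text does not state how the factor space was searched, so the exhaustive grid search used here is an assumption, as are the function names and the candidate factor values; `toy_model` is a purely hypothetical stand-in for a WiMMed run and has no physical meaning.

```python
import math
from itertools import product


def rmse(observed, simulated):
    """Root-mean-square error between two series of equal length."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def calibrate(simulate, observed, f_ks1_values, f_ks2_values):
    """Return (rmse, f_ks1, f_ks2) for the factor pair with the lowest RMSE.

    `simulate(f_ks1, f_ks2)` must return a monthly series comparable with
    `observed`; each factor multiplies a whole parameter map uniformly.
    """
    best = None
    for f1, f2 in product(f_ks1_values, f_ks2_values):
        err = rmse(observed, simulate(f1, f2))
        if best is None or err < best[0]:
            best = (err, f1, f2)
    return best


def toy_model(f_ks1, f_ks2):
    """Hypothetical stand-in for the model: more conductivity, less runoff."""
    monthly_rain = [120.0, 80.0, 40.0, 10.0]
    return [p / (1.0 + 0.5 * f_ks1 + 0.25 * f_ks2) for p in monthly_rain]


# Synthetic "observations" generated with known factors, then recovered.
observed = toy_model(1.0, 0.5)
best_err, best_f1, best_f2 = calibrate(
    toy_model, observed, [0.5, 1.0, 2.0], [0.25, 0.5, 1.0]
)
print(best_err, best_f1, best_f2)
```

In a real application each call to `simulate` is a full model run over the study period, so the number of candidate factor pairs that can be explored depends directly on the cell size.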
Once the model was calibrated for each cell size under analysis, the resulting monthly runoff at the monitoring point was aggregated over each year and over the whole 3-yr period; the corresponding annual and global RMSE were also calculated.
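A minimal sketch of this aggregation is given below, assuming monthly series that start at the beginning of a hydrological year (September) and cover complete years. The series are synthetic and the function names are hypothetical; the example is only meant to show that errors of opposite sign in individual years can cancel in the total for the whole period.

```python
import math


def yearly_totals(monthly, months_per_year=12):
    """Sum a monthly series into consecutive hydrological-year totals."""
    return [
        sum(monthly[i:i + months_per_year])
        for i in range(0, len(monthly), months_per_year)
    ]


def rmse(observed, simulated):
    """Root-mean-square error between two series of equal length."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


# Synthetic monthly volumes (hm3) for three hydrological years.
observed = [1.0] * 12 + [2.0] * 12 + [3.0] * 12
simulated = [1.5] * 12 + [2.0] * 12 + [2.5] * 12

obs_years = yearly_totals(observed)
sim_years = yearly_totals(simulated)

annual_rmse = rmse(obs_years, sim_years)      # error on the yearly totals
global_error = sum(simulated) - sum(observed)  # error on the 3-yr total
print(obs_years, sim_years, annual_rmse, global_error)
```

Here the first year is overestimated and the third underestimated by the same amount, so the annual RMSE is clearly non-zero while the accumulated volume for the whole period matches exactly.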

Results and discussion
By changing the cell size from one scale to another, the mean properties of the territory in the study area changed due to the resizing of the DEM. Thus, the average slope of the basin did not change appreciably between the 30-m and 100-m cell sizes, but decreased from 15.6° to 9.3° when the cell size was increased from 30 m to 500 m; the total basin area obtained also varied, from 131 km² to 139 km² for the 30-m and 500-m cell sizes, respectively, which affects the resulting average rainfall in the basin at different time scales, its spatial distribution, and subsequently the annual water balance, as will be discussed later. Table 1 includes these area values together with the resulting classification of cells in the basin. These variations in the physical characteristics of the terrain have an influence on the design of the hydrological network; the drainage networks obtained with 30-m and 100-m spatial resolution are similar and resemble reality better than the network obtained with the 500-m cell size (Fig. 2). It must be noted that, although the total number of cells in the basin decreases as the cell size increases, the proportion of cells to be considered as channels greatly increases. Thus, for 30 m and 100 m, 1.3 % and 3.3 % of the cells, respectively, correspond to the fluvial bed, while for the 500-m cell size this fraction increases up to 15.5 % (Table 1). Although not considered in this work, this fact implies a significantly higher importance of the flow processes in channels, and consequently will influence the resulting effective value of the parameters, affecting the travel time, which must be taken into account for the simulation of extreme flow values, flood events or erosive thresholds (Millares et al., 2012).
As a result, the basin divides are changed too; this is due to the cell size value itself and, indirectly, to the new values of elevation obtained in the resized DEMs. Thus, the topographic features derived from the DEM, such as aspect, flow direction, drainage channels, and associated sub-basin limits, are also affected. Other input parameters, like the physical properties of soil or ground cover, are also affected by the resizing of the cells, which smoothes the interval range for each parameter as cell size increases.
The change of the cell size led to a change in the rainfall fields, too. Thus, differences were found not only in the averaged values at watershed scale (Table 2), but also in the distributed values at cell scale, which were significant for the study period, with maximum values being the most affected, decreasing as cell size increased (Fig. 2).
On an annual basis, the variation of precipitation was similar for the three years considered, with an increase close to 3 % when using the 500-m cell size instead of 30 m, whereas a decrease of 0.1 % was obtained for the 100-m cell size (Table 2). As expected, significant non-linear effects resulted in the hydrological variables analysed. Hence, the average values of all the variables decreased when using the 100-m cell size instead of 30 m, but with different significance, whereas for the 500-m cell size no trend was observed, with values increasing or decreasing depending on the water balance component (Table 3). Thus, the average direct runoff decreased by 2.5 %, whilst the average infiltration decreased by 0.1 %, using the 100-m cell size. With 500 m, the average direct runoff decreased by 1.5 %, whilst the average infiltration increased by up to 2.9 %. The results reveal the conservative behaviour of the hydrological model, since negligible differences were found between the relative values obtained for each cell size.
Figure 3 represents the simulated monthly runoff for each case together with the observed monthly discharges at the monitoring point (Fig. 3a) and the monthly precipitation (Fig. 3b). All cell sizes reproduced the time evolution of runoff, and also approximated in general the measured values, with some exceptions discussed later; however, the use of the 500-m cell size overestimated runoff during periods with continuous rainfall events, due to the combined effect of the apparent overestimation of rainfall and underestimation of percolation.
Figure 4a shows the accumulated flow volume in each hydrological year, with clear differences among a wet year (2003/2004), a dry year (2004/2005), and a medium year (2005/2006). The simulated flows were lower than measured in the first period, the wettest year, but not in the other two. The 30-m and 100-m cell sizes show similar results, whereas the 500-m result is always higher, but the model exhibits the same trend, over- or underestimating the annual flow, independently of the cell size. However, the total balance is offset at the end of the 3-yr period in the case of the 30-m and 100-m cell sizes, as shown in Fig. 4b, which shows the accumulated flow for the entire series, whereas the 500-m cell size overestimates the global runoff by up to 7 %, which may pose a constraint for safe estimations of water resource availability, and thus for the prevention of drought effects in the medium and long term, especially if only short time series are available.
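The effect of cell size on the number of cells, and on how many of them are treated as channels, can be checked with simple arithmetic from the basin areas and channel fractions quoted in the text. The figures below are approximations derived from those rounded values, not the exact counts of Table 1, and the helper name is hypothetical.

```python
def cell_count(area_km2, cell_size_m):
    """Approximate number of square cells needed to cover a given area."""
    return area_km2 * 1e6 / cell_size_m ** 2


# Basin area (km2) and channel-cell fraction reported for each cell size (m).
cases = {30: (131.0, 0.013), 500: (139.0, 0.155)}

for size, (area, channel_fraction) in cases.items():
    total = cell_count(area, size)
    channels = total * channel_fraction
    print(f"{size:>3} m: ~{total:,.0f} cells, ~{channels:,.0f} channel cells")
```

With these inputs the 30-m grid has roughly 260 times more cells than the 500-m grid, while the share of channel cells is about twelve times smaller, which is why hillslope travel paths shorten so markedly at the coarsest resolution.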
Finally, Table 4 shows the effective values for the calibration parameters used in the described simulations. Calibration of the model at a monthly scale with each cell size was performed on the parameters f_Ks1 and f_Ks2, related to the surface and subsurface soil hydraulic conductivity, as described in Sect. 2. Mean and extreme hydraulic conductivity values in the watershed resulting from the calibration are given in Table 4 for each cell size. The best result, namely the minimum RMSE_O−S value for each cell size, was always obtained with a decrease in the second parameter, which means a decrease of the soil hydraulic conductivity. The lowest RMSE_O−S values were obtained with the 30-m and 100-m cell sizes, with RMSE_O−S100 = 0.83 hm³ (Table 4), closely followed by the 30-m cell size, with RMSE_O−S30 = 0.85 hm³; finally, the 500-m cell size resulted in RMSE_O−S500 = 0.89 hm³. Differences between the simulated values mostly occur during the wetter periods, as previously discussed.

Table 1. Total number of grids, channel grids (absolute value and percentage), watershed area and its variation with increasing cell size, and the calculated mean slope and extremes at 30-m, 100-m, and 500-m cell size.

On an annual basis, RMSE values are 4.22, 4.32, and 4.13 hm³ for the 30, 100, and 500-m cell sizes, respectively. Dispersion in the annual degree of fitting is smoothed by the increase in cell size; however, for the whole study period, which comprises three different years in terms of the rainfall regime, the RMSE increases as cell size does, with values of 1.33, 2.27, and 3.65 hm³ for 30, 100, and 500 m, respectively. From these results, it can be deduced that, under the constraint of the maximum degree of fitness achievable given the initial quality of the flow data series (accuracy and length), the increase in cell size induces a significant overestimation for the whole period, which could lead to riskier decisions if the 500-m model were used for hydrological planning, since drought periods could be expected to be supported by the storage accumulated during the wet years.
Finally, the computer processing time required by the model decreased to about 13 % and 0.4 % of that required with the 30-m cell size for the 100-m and 500-m cell sizes, respectively.

Conclusions
Poorly gauged watersheds are, unfortunately, frequent in many regions. In such cases, decision-makers usually have to choose between using local empirical rainfall-runoff relationships to estimate water runoff and expected storage in the medium and long term, or modelling runoff generation by means of physical approaches. The latter, although subject to the uncertainty imposed by the limited calibration possibilities, allows, on the one hand, the impact of soil use or meteorological variability on the water resources to be assessed and, on the other hand, a more accurate description of the spatiotemporal distribution of rainfall, which is the most important source of variability in the watershed response and is more frequently available at a higher resolution, both in time and space, than flow data. However, especially when good quality data are lacking, a compromise must be made between the spatial scale at which the physical processes are simulated and the associated computational time. This study demonstrated the importance of choosing the appropriate spatial scale in the implementation of a distributed hydrological model for medium-term water resource planning, since the hydrological variables change as the average values of the input variables and parameters do when the spatial resolution is changed. Increasing the size of cells had a direct effect on the runoff volume generated at every time scale. Firstly, the number of cells that the flow must cross to reach a channel cell is reduced and the proportion of channel cells increases. Secondly, the outflow change is also the result of the variation in the spatial distribution of rainfall due to the resizing of the DEM grid, despite the fact that the same meteorological datasets and interpolation techniques are used.
When the spatial resolution changes, a recalibration of the actual parameter values is usually required; however, this process is not always successful, due to the non-linear behaviour of hydrological processes and models. The results show that it is possible to find effective values for the calibration parameters that, on a monthly scale, can overcome the a priori limitation of the available data series and provide the planning strategy with adequate results for the 3-yr study period. The two smaller cell sizes tested produced similar results on a monthly and annual basis. The simulations obtained with low resolution (500 m) were much faster, but the results are less useful for calculating the runoff volumes needed in studies on water resource estimation, as stated in this work, or for other issues not included here but strongly dependent on the scale effects inherent to the hydrological simulation, such as the quantification of soil loss or the transfer of substances through the soil profile, among others. From the modeller's point of view, the definition of the spatial variability of the input parameters, the scale of the models, and the representation of hydrological processes on this scale are of great importance. Therefore, it is important to reach a compromise between spatial scale and the model processing time. The results show that, for the whole period, the 30-m cell size provides the best performance, but also that the 100-m cell size can achieve similar results with significantly lower computational times.
In particular, the issue that many watersheds lack a sufficiently dense monitoring network for surface flow, and that, even then, high-frequency good quality measurements are rarely available, was not addressed in depth in the text. But the use of hydrological models is the only alternative for water resource planning in these cases; particularly in Mediterranean regions, the great variability of both the spatial and temporal distribution of rainfall is the main source of uncertainty, as the results showed, when estimating water resources in the short, medium, and long term. This also affects the capability of predicting extreme maximum values in runoff, and flood prevention, which would not be feasible with the calibration achieved from the available data in this study area, since it requires more definition in the field data.
Fig. 1. (a) Location of Zahara watershed in Spain; (b) DEM with meteorological stations; (c) lithology and (d) land cover.
Fig. 3. (a) Observed and simulated monthly flow volume for each cell size, and (b) monthly rainfall, for the period 2003-2006.

Table 2. Differences in average annual precipitation values at the basin scale between different cell sizes.

Table 3. Differences in average precipitation, direct runoff, infiltration, and percolation at the basin scale, and the corresponding rainfall fraction (in brackets), for the period 2003-2006 between different cell sizes.

Table 4. Calibrated factors for the surface and soil saturated hydraulic conductivity parameters, f_Ks1 and f_Ks2, respectively, and mean, maximum, and minimum individual final values for 30-m, 100-m, and 500-m cell sizes.