Statistical modelling of rainfall-induced shallow landsliding using static predictors and numerical weather predictions: preliminary results

Our study is aimed at estimating the added value provided by Numerical Weather Prediction (NWP) data for the modelling and prediction of rainfall-induced shallow landslides. We implemented a quantitative indirect statistical modelling of such phenomena by using, as input predictors, both geomorphological, geological, climatological information and numerical data obtained by running a limitedarea weather model. Two standard statistical techniques are used to combine the predictor variables: a generalized linear model and Breiman’s random forests. We tested these models for two rainfall events that occurred in 2011 and 2013 in Tuscany region (central Italy). Modelling results are compared with field data and the forecasting skill is evaluated by mean of sensitivity–specificity receiver operating characteristic (ROC) analysis. In the 2011 rainfall event, the random forests technique performs slightly better than generalized linear model with area under the ROC curve (AUC) values around 0.91 vs. 0.84. In the 2013 rainfall event, both models provide AUC values around 0.7. Using the variable importance output provided by the random forests algorithm, we assess the added value carried by numerical weather forecast. The main results are as follows: (i) for the rainfall event that occurred in 2011 most of the NWP data, and in particular hourly rainfall intensities, are classified as “important” and (ii) for the rainfall event that occurred in 2013 only NWP soil moisture data in the first centimetres below ground is found to be relevant for landslide assessment. In the discussions we argue how these results are connected to the type of precipitation observed in the two events.


Introduction
In recent years, Tuscany region and nearby areas (central Italy), have been hit by noticeably heavy rainfall events (Parodi et al., 2012;Sacchi, 2012;Avanzi et al., 2013;Rebora et al., 2013;Fiori et al., 2014;Buzzi et al., 2014;Perna et al., 2015).In particular the northwestern part of the Tuscany region is very prone to frequent and severe rainstorms, due to its geographical position and its steep orography close to the sea, making this area one of the wettest in northern and central Italy (Brunetti et al., 2009).Because of the fragility of the territory (Avanzi et al., 2013), large rainfall amounts and intensities triggered a great number of shallow landslides, causing damage, injuries and human losses.To mention a few of these disastrous events, on 24-25 December 2009 heavy rainfalls and snow melting triggered more than 600 landslides in the Serchio River valley and Apuan Alps (Avanzi et al., 2013).During October 2010 a relatively long wet spell played a significant role in predisposing the slopes to instability.Subsequently the rainstorm that affected the Massa-Carrara area on 31 October 2010 triggered hundreds of landslides (Avanzi et al., 2013).On 25 October 2011 a mesoscale convective system (Rebora et al., 2013)   landslides and debris flows occurred in the area causing huge economic losses and 13 fatalities (Rebora et al., 2013;Galve et al., 2014).
To arrange efficient alarm systems and in order to reduce property damage and hazard for human lives, civil protection plans are based on operational weather forecasting (Regione Toscana, 2006).Nevertheless, prediction of rainfallinduced landslides is problematic since it is determined by rainfall infiltration, soil characteristics, antecedent soil moisture content and rainfall history (Guzzetti et al., 2007).Regarding the area under examination (see Fig. 1), several studies deal with the definition of the critical rainfall thresholds (Guzzetti et al., 2007) for the initiation of shallow landsliding and debris flows.Among these studies, we recall Giannecchini (2005Giannecchini ( , 2006)); Giannecchini et al. (2007Giannecchini et al. ( , 2012)); Rosi et al. (2012) and Segoni et al. (2014b).Another approach for the prediction of landslides is a susceptibility assessment (Chung and Fabbri, 1999;Guzzetti et al., 2005a), since future landslides are likely to occur under the same conditions that produced them in the past (Guzzetti et al., 1999).For northwestern Tuscany and nearby areas, several works deal with landslide susceptibility mapping (hereafter LSM: Catani et al., 2005;Ermini et al., 2005;Federici et al., 2007;Avanzi et al., 2009;Falaschi et al., 2009;Catani et al., 2013).Among these studies, Catani et al. (2013) implemented a landslide susceptibility estimation in the Arno River basin, by using the random forests technique.In particular they considered some issues related to the resolution of the mapping unit and to the optimal number of landslide conditioning variables.They found that a mapping unit of 50 m (pixel size) gives the best results with respect to 10, 20, 100, 250 and 500 m mapping units.They took into account several input predictors such as geomorphology factors (digital elevation model, slope, curvatures, slopes), lithology factors, land cover, distance to roads, rivers and faults and climatology factors.They concluded that the optimal number of predictors for the classification ranges from 9 (mapping unit 10 m) to 24 (mapping unit 20 and 50 m).None of the cited papers make use of numerical weather forecasts as input predictors.
Recently (Schmidt et al., 2008;Segoni et al., 2009;Mercogliano et al., 2013a, b;Segoni et al., 2014a) the Numerical Weather Prediction (NWP) data have been used as a promising and reliable tool for the prediction of shallow landslides triggered by precipitation.In New Zealand, Schmidt et al. (2008) implemented a landslide forecasting system based on three components: (i) regional NWP data (forecast up to 48 h ahead), (ii) a hydrological model fed with NWP data aimed at simulating soil moisture and groundwater levels and (iii) a slope stability model aimed at estimating failure probabilities within a hillslope.The spatial resolution of NWP data is 12 km, whereas the resolution of the hydrological model and of the slope stability model is 30 m.For a specific extreme event that occurred in the lower North Island of New Zealand in February 2004, the authors achieved hit rates of about 70-90 % and false alarm ratios of about 30 %. Segoni et al. (2009) implemented a real-time forecasting chain for rainfall-induced shallow landslide to be used for civil protection purposes.The architecture of the forecasting chain is quite complex and takes advantage of techniques and tools from different fields including meteorology, hydrology, geomorphological and geo-technical modelling, remote sensing and GIS.Regarding the NWP data, the authors downscaled the rainfall forecast provided by a limited-area model using a meta-Gaussian model aimed at reproducing small-scale rainfall fields.The final resolution of the precipitation forecast is 1.75 km in space and 10 min in time.Similarly Mercogliano et al. (2013a, b) implemented a forecasting chain for rainfallinduced shallow landslides.They coupled the NWP regional data, originally produced at 2.8 km of resolution and statistically downscaled to 10 m, with a physically based slope stability simulator (hydrological and geo-technical tool).In Mercogliano et al. (2013b), the authors tested the procedure in a pilot site in the northwestern part of Tuscany (Lucca, Pistoia and Prato administrative provinces) for a specific rainfall event.A quantitative validation of the results was not performed, nevertheless the authors concluded that additional well-documented study cases need to be simulated to better understand the spatial organization of the input parameters and improve the quality of the results.
Our work takes its origin from these latter papers.We considered two heavy rainfall events that occurred in 2011 and 2013, that affected Lunigiana and Garfagnana in the northwestern part of Tuscany region (central Italy).We carried out an analysis including a statistical modelling of spatial landslide occurrence by using two techniques: the generalized linear model (hereafter GLM;McCullagh, 1984;McCullagh and Nelder, 1989) and the random forests classifier (hereafter RF; Breiman, 2001).For both statistical models, we used, as predictors, static geographical layers and dynamical NWP data achieved by running the Weather and Research Forecasting (WRF) model (Skamarock et al., 2005;Skamarock and Klemp, 2008).
The approach we adopted is an attempt to conjugate and integrate the added information carried by a regional numerical weather model which operates at the meso-γ scale ( 2-20 km of spatial resolution), with the micro-γ scale (≤ 20 m of spatial resolution, according to Orlanski, 1975) which is an average value of the mapping unit of landslide size occurring at the basin scale (Guzzetti et al., 1999(Guzzetti et al., , 2005b)).Differently from the cited papers (Schmidt et al., 2008;Mercogliano et al., 2013a, b), this goal is achieved without performing any downscaling of the NWP data (grid point about 3 km) to a finer resolution.In this way we preserve the original information content provided by the numerical model without introducing any artificial knowledge on the precipitation patterns.
The evaluation of the modelling results is performed through the analysis of the receiver operating characteristic (ROC) curve in terms of the underlying area (AUC), a threshold-independent index widely used (Frattini et al., 2010).
Beside the reliability of the statistical modelling of rainfall-induced shallow landslides, the innovative feature of our study is the assessment, by using RF's diagnostics, of the relative importance of NWP data in LSM, since so far there is a lack of studies focusing on this issue.
The positive impact of mesoscale NWP output supports the reliability of numerical forecast and further confirms (Schmidt et al., 2008;Segoni et al., 2009;Mercogliano et al., 2013a, b) its use for the setting-up of a real-time forecasting chain for the prediction of the occurrence of rainfall-induced shallow landslides over large areas (basin catchment scale).

Description of the study cases and of the areas of interest
We implemented the statistical modelling of shallow landsliding induced by precipitation, focusing our attention on two heavy rainfall events that occurred in the northwestern part of Tuscany region (central Italy, see Fig. 1) on 25 October 2011 (Lunigiana) and on 18 March 2013 (Garfagnana).In Sects.
2.1.1 and 2.1.2,we describe the rainfall events from a meteorological point of view and we give a brief description of the area of interest considering geographical and geomorphological features.

Study case 25 October 2011
The first rainfall event, hereafter 25OCT2011, occurred on 25 October 2011 and involved the Lunigiana area belonging to the administrative province of Massa-Carrara (see left inset of Fig. 1).The area is located along the Apennines chain and is mainly mountainous (the highest peaks reach almost 2000 m).It is very close to the Ligurian Sea gulfonly a few kilometres away.Due to its orography and geographical position, the area represents a natural barrier for the Atlantic humid air masses and frequently the precipitation amount reaches or exceeds 3000 mm yr −1 , making this area one of the wettest in Italy (Brunetti et al., 2009).From a hydrological point of view, it is characterized by the presence of one main river basin (Magra basin) having an area of about 992 km 2 (in the administrative province of Massa- Carrara).A detailed study on the critical thresholds able to trigger shallow landslides in this area was carried out by Giannecchini (2006).Avanzi et al. (2013) studied the fragility of the territory by analysing the damage that occurred in two heavy rainfall events in 2009 and 2010 (in this latter paper the study area was slightly larger than that one here under examination).
From a geological point of view, following Di Naccio et al. ( 2013), the Northern Apennines are a NW-SE-trending belt formed by NE-verging tectonic units stacked since the late Oligocene after the collision of the Corsica-Sardinia and Adria continental blocks.The main tectonic units are (i) the Liguride allochthon, (ii) the Subligurian unit and (iii) the Tuscan unit (for a comprehensive synthesis and review see Argnani et al., 2003, and references therein).
From a meteorological point of view, the 25OCT2011 rainfall event was deeply investigated by Buzzi et al. (2014) using a numerical weather model and by Rebora et al. (2013) using the measurements available from a large number of sensors, both ground based and space borne.In this latter paper, the authors concluded that the large-scale features of the event and the complex geographical characteristics of the area determined the conditions for the persistence of heavy precipitations over the same region, i.e. an organized and self-regenerating mesoscale convective system (MCS).In the area of interest, rainfall amounts were registered by the remotely automated weather station network operated by the National Civil Protection Department.The maximum cumulative rainfall was recorded at the Pontremoli rain-gauge (Magra River valley) with maximum rainfall rates of 374 mm mm (24 h) −1 , 317 mm (12 h) −1 , 243 mm (6 h) −1 , 158 mm (3 h) −1 and 67 mm (1 h) −1 .For this rain-gauge, Regione Toscana (2011)   rainfall events for 1, 3, 6, 12 and 24 h duration: 51, 438, > 500, > 500 and 293 years respectively.
A landslide inventory map for this event was created by both field surveys of the expert personnel of the Regional Civil Protection Office and by use of Rapid-Eye images.A semi-automatic detection algorithm based on Rapid-Eye pre-/post-event images (13 October pre-event and 29 October post-event) was applied.The Rapid-Eye data were multispectral 5-band with 5 m of spatial resolution.Shadow areas in satellite images created problems in the lower parts of hillslopes, especially in the less populated and hardest to reach areas, so possible commission/omission errors may occur.Since technical authority surveys were conducted mostly close to populated areas, possible omission errors may occur in particular in forested areas.The final inventory map reported 243 shallow landslides in an area of about 212 km 2 (see the minimum bounding rectangle in the left of Fig. 1), whereas the convex hull of the area has an extent of about 123 km 2 ).Average area of landslides is around 2260 m 2 and average perimeter is around 181 m, whereas the average altitude of landslide initiation points is around 588 m.Estimating how complete is a landslide inventory is a difficult task (Malamud et al., 2004).Nevertheless in Fig. 2 we show the landslide probability density as a function of landslides area (Malamud et al., 2004) which follows the same trend of frequency-area distributions of landslides as found in previous works (Guzzetti et al., 2002;Malamud et al., 2004).

Study case 18 March 2013
The second rainfall event, hereafter 18MAR2013, occurred on 18 March 2013 and involved the Garfagnana area belonging to the administrative province of Lucca (see right inset of Fig. 1).This is mainly a mountainous area (the average elevation of the main catchment is 717 m) and is very close to the Ligurian Sea gulf.Long-time series of precipitation data recorded by local rain-gauges report yearly averages of about 2000-2300 mm (Avanzi et al., 2013).Hydrologically, it is characterized by the presence of one main river basin (Serchio basin) with an area of about 1565 km 2 plus several other minor rivers.
Geological features of the area are very similar to the ones described in Sect.2.1.1 for Lunigiana.For an extensive and deeper analysis see Di Naccio et al. (2013) and references therein.
The 18MAR2013 rainfall event occurred during the month of March 2013, which saw the highest monthly precipitation amounts recorded over the last 30 years (Regione Toscana, 2013) for the northwestern part of Tuscany and the Serchio and Magra basins in particular.During the period 5-19 March 2013, the rain-gauges belonging to the administrative province of Lucca and to the Serchio River basin, registered about 310 mm of precipitation against an average monthly value of about 80 mm (climatology is based on the period 1983-2012).This relevant amount of precipitation was the result of two major rainstorms that affected the area of interest: the first one occurring in the period 11-12 March 2013, the second one occurring on 18 March 2013.Due to the high degree of saturation of the soils and due to the surface runoff, on 18 March 2013, several regional hydrometers exceeded the warning levels and flooding alerts were issued by the local Civil Protection Office (Regione Toscana, 2013) for five rivers (Ombrone Pistoiese, Bisenzio, Serchio, Magra, Cecina).As can be argued from synoptic analysis (Regione Toscana, 2013), the 18MAR2013 rainfall event was determined firstly by a warm front over the northern Tyrrhenian Sea and Ligurian Sea, driven by a deep low over Great Britain (988 hPa at 06:00 UTC).Then in the second part of the day the cold front hit the Tuscany region, while the precipitations ended by the late evening/night.The regional raingauge network registered hourly precipitation intensities up to 31 mm h −1 (rain-gauge located near the Monte Macine peak at 1480 m a.s.l.), whereas the average hourly precipitation intensity among the available pluviometers was about 9 mm h −1 .For this rainfall event, it was not possible to estimate return period due to the lack of a long and consistent time series in the area.Nevertheless, Regione Toscana (2013) evaluated this event as one of the largest of the previous 20 years.
A landslide inventory map for this event was created by field surveys of the expert personnel of the local Genio Civile Office.The map reported 127 shallow landslides in an area of about 2038 km 2 (see the minimum bounding rectangle in the right of Fig. 1), while the convex hull where landslides were observed has an area of about 1416 km 2 .Due to time constraints, it was not possible to integrate field survey observations with pre-/post-event image analyses and thus commission/omission errors are very likely.The field surveys did not collect any information about the size (area and perimeter) of landslides, so we cannot report descriptive statistics on the inventory.

Description of the geographical static predictors
In the following, we list the geographical static predictors.We divide them in four groups: geomorphology, hydrology, geology and climate related predictors.The layers are raster data sets and were produced using GIS technologies.The pixel resolution of each layer is 30 m, if not otherwise specified.
An extensive and exhaustive discussion about the choice of the input parameters (typology and number of predictors) in susceptibility assessment studies can be found in Catani et al. (2013) and we used this work as a main reference.The usefulness of some predictors is still debated and can depend on the methodology adopted or the area of investigation and its landslide features.The number of predictors taken into account is also debated and it has been also found that increasing the number of predisposing factors could lead to a worsening of the prediction accuracy (Floris et al., 2011).For this reason, in landslide susceptibility assessment, it is important to implement an automated procedure for the selection of the meaningful variables.As discussed in more detail in Sect.2.4, we chose two suitable methods: the logistic regression with an AIC selection (the GLM) and the RF algorithm, since it naturally estimates the variable's importance for predictive classification.All the predictors here described are schematically listed in Table 1.
Geomorphology-related predictors: -Elevation (DEM): this data set is a hydrologically corrected 30 m digital elevation model, resampled from an original database produced at 10 m of resolution.Elevation is a very common parameter often taken into account in landslide susceptibility assessments (Catani et al., 2013;Felicísimo et al., 2013), since it is related to several predisposing factors such as average precipitation, vegetation, etc.
-Altitude above channel network (AaCN): the algorithm for producing the AaCN uses the channel network for streams.It measures the altitude for each grid cell of the DEM to the nearest channel network elevation.A splines interpolation surface is created, called Channel Network Base Level, then this value is subtracted from the DEM to obtain the AaCN.This parameter has been used in recent works of LSM by Marjanovic et al. (2011) and Mȃrgȃrint et al. (2013).
-Aspect (ASP): this represents the orientation of each cell with respect to the adjacent cells.It influences the landslide susceptibility because it determines how the terrain is exposed to rainfall and solar radiation (Guzzetti et al., 1999) and thus to soil water content.
-Slope (SLP): this is directly derived from the DEM layer.It controls the driving forces (component of weight of material in the direction of failure) acting on hillslopes.For this reason, it has been widely used in geomorphology and landslide mapping studies (Guzzetti et al., 1999;Goetz et al., 2011;Catani et al., 2013).
-LS factor (LSF): this represents the topographic factor (length-slope factor) from the revised universal soil loss equation (RUSLE) according to Moore and Wilson (1992).Despite the fact that the RUSLE equation is commonly used to predict soil erosion on a cell-bycell basis, recently a high correlation has been found between (R)USLE-based soil erosion map and landslide locations (Pradhan et al., 2012).
-Planar curvature (PLAC): basically this is the second derivative of DEM and corresponds to the concavity/convexity of the land surface measured perpendicular to aspect, i.e. parallel to the contour.Goetz et al. (2011) and Catani et al. (2013) used this parameter in their landslide susceptibility study based on GAM and RF models respectively.
-Profile curvature (PRFC): this is a common morphological layer derived from the DEM.It describes the shape of the relief in the direction of the steepest slope.It corresponds to the concavity/convexity of the land surface measured parallel to aspect, i.e. perpendicular to the contour.It is known to affect the flow velocity of water and influences erosion and deposition.It has been used in several landslide assessment studies among which we recall Goetz et al. (2011) and Catani et al. (2013).
Hydrology-related predictors: -Convergence index (COVI): this index represents the convergence/divergence with respect to overland flow.
It is similar to plan or horizontal curvature, but gives much smoother results.The calculation uses the aspects of surrounding cells and looks to which degree the surrounding cells point to the centre cell.The result is given as percentages; negative values correspond to convergent flow conditions, positive to divergent ones.This predictor has been recently used in LSMs by Nefeslioglu et al. (2011).
-Time of concentration (ToC): this measures the response of a watershed to a rainfall event.It measures the time (in hours) needed by a rainfall drop to reach the closure of a watershed from the farthest point of it.It is a function of the topography, geology, and land use within the watershed.It is considered as one of the most critical factor for the estimation of the duration of the triggering rainfall (D'Odorico and Fagherazzi, 2003).
-Topographic wetness index (TWI): this is commonly used to quantify topographic control on hydrological processes.It is calculated by using the formula where a is the local upslope contributing area and β is the local slope angle.This index is related to the soil moisture (Nefeslioglu et al., 2008;Yilmaz, 2010).
The main limitation of the above formula is that it assumes steady-state conditions and uniform soil properties.However, researchers assert that the formula is applicable in a wide range of cases and it has been used in assessment in LSM (Nefeslioglu et al., 2012).
-Distance from drainage channel network (DfCN): this is the Euclidean distance from the river network.The distances from rivers are evaluated by computing the minimum distance between cells and the nearest watercourse.This layer has been considered in similar works as a predisposing factor, because it takes into account a possible activating mechanism related to erosion along the slope foot (Mossa et al., 2005;Mancini et al., 2010).
Geology-related predictors: this group of predictors includes data from two regional databases produced by the Tuscany administration: the Regional Geological Continuum (scale 1 : 10 000) and the regional pedological database (scale 1 : 50 000).The Regional Geological Continuum is the joint effort of several local institutions (universities, research institutes, private entities, coordinated and led by the regional administration) and was recently updated with extensive field campaigns covering about 70 % of the territory.This database is freely available through web facilities.The regional pedological database (level 2) was revised during the period 2009-2012.It was derived using data collected over sample areas of the territory.On average, the sample area extent was about 15-25 km 2 and 20 to 40 observations were performed with the standard of 2-4 vertical profiles.The controls consisted of soil stratigraphic profiles, described, sampled and analysed from wells or exploratory drillings.In a second stage of the work, an unsupervised classification of the whole territory was performed and further corrected by expert personnel.
-Distance from main tectonic features (DfTF): this is the Euclidean distance from main tectonic features.This layer has been used by Costanzo et al. (2012) for LSM on a large scale, resulting as an effective factor for translational slides.
-Bedrock litho-technical map (BLT): this includes 15 different classes of bedrock based on litho-technical properties derived from the literature.This layer is time invariant and has been considered as a relevant causal factor in predictive hazard models assuming that future landslides are likely to occur at past and present instability sites (Guzzetti et al., 1999).Catani et al. (2005) acknowledged the bedrock lithology as a strong controlling factor on landslide occurrence in their study for the Arno River basin (Tuscany region).
-Landslides main scarps (LMS): this layer represents the exposed portions of the surface of rupture.These features are obtained with automated procedures from landslide crowns and DEM.
-Soil permeability (SKST): this predictor is derived from the regional pedological database and has been deter-mined using the HYRES pedo-transfer function (PDTf).
The term "permeability", as used in soil surveys, indicates saturated hydraulic conductivity (Ksat).In other words, it indicates the rate of water movement, in centimetres per hour, when the soil is saturated.
-Landslides and superficial deposits (LaSD): this layer takes into account the presence of landslide bodies, or areas where superficial formations (debris cones, talus, colluvial and eluvial deposits, etc) outcrop.The use of this layer in susceptibility assessment studies is justified by the hypothesis that future landslides will be likely to occur under the same conditions that led to past landslide events (Varnes et al., 1984;Carrara et al., 1991).
-Slope structural setting (SSS): this represents the relationship between the structural setting and the slope aspect (Cruden and Hu, 1998).This factor is rarely considered in large-scale susceptibility analysis due to the difficulty of data acquisition and its expression in a continuous surface (Atkinson and Massari, 1998;Guzzetti et al., 1999;Donati and Turrini, 2002).In this study the SSS factor was obtained by the spatialization of the attitude data available in the regional database, taking into account all the elements that led to the rupture of the geological substrate continuity.The continuous surface realized was then combined with the slope aspect and slope gradient to obtain information about the relation between landslides and different combinations of SSS.
Since it is a directional variable we considered the sine of the SSS.
Climate-related predictors: recently rainfall climatology has been considered in LSMs as a predisposing factor instead of as a triggering factor (Schicker and Moon, 2012;Catani et al., 2013).In fact the average precipitation values describe the attitude of the territory to be hit by a storm of a given type.In the following, we include a set of variables accounting for the precipitation amount (expressed in mm) of a rainfall event occurring in a defined time interval (expressed in hours) and having a defined return period (expressed in years).In this, our predictor is slightly different from that one considered by Catani et al. (2013) who evaluated the return period of a defined precipitation amount occurring in a defined time interval.
-Rainfall 12, 24, 48, 96 h duration and 100-year return period (R12, R24, R48, R96): this data set is the result of rainfall frequency analysis (Baldi et al., 2014).It estimates the amount of rainfall falling at a given point for a specific duration and return period.In the present study, the duration considered is 12, 24, 48 and 96 hours and the return period is 100 years.It was derived from statistical analysis of rainfall time series of the regional rain-gauge network (30 years minimum), interpolated over the area of interest.

V. Capecchi et al.: Statistical modelling of shallow landslides using static predictors and NWP data
Beside the above-mentioned layers, we included in the static predictors two additional thematic maps: -Corine land cover (COR): land cover provides information on vegetation and takes into account human activity on hillslopes.It is considered a predisposing factor and has been used for landslide probability of occurrence mapping (Varnes et al., 1984;Costanzo et al., 2012;Catani et al., 2013).The original Corine data set (Bossard et al., 2000)  -Vegetation index (EVI): the EVI vegetation index is derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) on board of the Terra and Aqua satellite (Huete et al., 2002).This index is designed for providing accurate measurements of regional/globalscale vegetation dynamics (phenology).Conceptually it is complementary to the well known Normalized Difference Vegetation Index (NDVI) from which differs because it is more responsive to canopy structural variations, including leaf area index (LAI), canopy type, plant physiognomy and canopy architecture (Gao et al., 2000).Formally it is a difference of near-infrared, red and blue atmosphere-corrected surface reluctance.In the present paper we use a layer derived from the temporal climatology of the index, using the available satellite imagery for the time series starting from February 2000 and ending in December 2013.Time series data are aggregated to 16 days to minimize cloud contamination.
The spatial resolution of the layer is 250 m.Vegetation status (density and health) is considered a predisposing factor for shallow landslide and debris flows because it is basically a proxy for wetness.It reflects the variation in subsurface water and the fact that deep-rooted vegetation binds colluvium to bedrock.Vegetation index (namely NDVI and in particular its radiometric signature), has been used as an aid to the visual detection of landslides and for the semi-automatic classification of satellite images into stable or unstable slopes (Borghuis et al., 2007;Mondini et al., 2011;Guzzetti et al., 2012).(Hong et al., 2006) Land-surface option Unified Noah model (Chen et al., 1996)

Description of the NWP model and of the numerical weather predictors
The limited-area numerical model used in this study is the WRF model (Skamarock et al., 2005;Skamarock and Klemp, 2008).It is the result of the joint efforts of US governmental agencies and the University of Oklahoma.It is a fully compressible, Eulerian, non-hydrostatic mesoscale model, specifically designed to provide accurate numerical weather forecasts both for research activities, with the dynamical core Advanced Research WRF (ARW), and for operations, with the dynamical core Non-hydrostatic Mesoscale Model (NMM).In the present work we used the WRF-ARW core updated at version 3.5 (April 2013).The model dynamics, equations and numerical schemes implemented in the WRF-ARW core are fully described in Skamarock et al. (2005), Klemp et al. (2007) and Skamarock and Klemp (2008).The model physics, including the different options available, is described in Chen and Dudhia (2000).
A summary of the model's settings chosen for the present study is shown in Table 2, while the geographical extent of the simulation area is depicted in Fig. 1 (outer rectangle).
Here we briefly recall that the horizontal spatial resolution adopted (3 km) is known to be adequate to resolve explicitly the convective processes (Kain et al., 2008;Bryan and Morrison, 2012).
Initial and lateral boundary conditions were obtained from the ECMWF-IFS (European Centre for Medium-Range Weather Forecasts -Integrated Forecasting System) global model.The spectral resolution of the global model is T1279, which roughly corresponds to 16 km of horizontal resolution; vertical levels are 91.Since one of the main purposes of this work is to investigate the potential ability of the regional numerical model to predict in advance possible landslides triggered by heavy rainfall, as boundary conditions, we used the forecast (not analysis) provided by the ECMWF-IFS model.The analysis time is 00:00 UTC 24 October 2011 for the 25OCT2011 event and 00:00 UTC 17 March 2013 for the 18MAR2013 event.The length of the simulations is 48 h for both events.Regarding the NWP predictors, in this preliminary stage of the investigation, we subjectively decided to include a minimal set of explanatory variables: precipitation amounts accumulated over the rainfall event, and mean and maximum hourly precipitation intensity, mean and maximum soil moisture in four layers below ground.Soil moisture is evaluated in the following four layers: 0-10, 10-40, 40-100 and 100-200 cm below ground.This is the partition of soil implemented in the Noah land surface model (Chen et al., 1996) which is incorporated in the WRF model in relation to the physical processes occurring in the interface between land and the near-surface atmosphere.A summary of the meteorological predictors is reported in Table 1 (bottom part of the table ).
To validate the rainfall forecast provided by the WRF model in terms of quantitative precipitation forecast (QPF), we used the data gathered by the remotely automated weather station network operated by the National Civil Protection Department.Data from 20 automated raingauges recording precipitation every hour were collected for 25OCT2011, while data from 60 rain-gauges were collected for 18MAR2013.Locations of rain-gauges are shown in Fig. 3.For each rain-gauge location, we extracted the predicted values of the numerical simulation and compared them with the observed rainfall amounts registered in a 24 h period, namely from 00:00 UTC 25 October 2011 to 00:00 UTC 26 October 2011 for 25OCT2011 and from 00:00 UTC 18 March 2013 to 00:00 UTC 19 March 2013 for 18MAR2013.The ability of the model to simulate the precipitation amounts was analysed using the contingency tables (Wilks, 2011) for selected rainfall's thresholds and using quantitative indices, namely RMSE and multiplicative bias.

Description of the statistical modelling of landslide hazard
For the two rainfall events under examination, we implemented a landslide hazard modelling based on a quantitative indirect statistical model (Carrara et al., 1991;Guzzetti et al., 1999).In other words, using separately the GLM and the RF classifier, we construct a statistical functional relationship between instability factors (geological, geomorphological, climatological thematic layers and NWP outputs) with the distribution of landslides obtained from the event inventory maps.A consequence of this approach is the mapping unit which is forced to be grid-cells.It is important to repeat that no statistical downscaling was performed to resample, geo-statistically, the NWP outputs (produced at 3 km of horizontal resolution) to the resolution of the static instability factors (produced at 30 m of horizontal resolution).As stated in Sect. 1, this has been done because we want to preserve and evaluate the information provided by NWP data produced at their native scale (meso-γ scale) in a landslide susceptibility framework.The final result of the modelling is a map showing the classification of the area of interest into domains of different hazard degree ranging between 0 (stable slopes) and 1 (unstable slopes).In the literature (Carrara et al., 1999;Guzzetti et al., 1999;Catani et al., 2005) this type of map is also referred to as a landslide hazard map.
The GLM (McCullagh, 1984;McCullagh and Nelder, 1989) is a statistical technique used to model the relation between a response variable L and a set of explanatory variables {X i }, i = 1, . .., n.In the present case, L is the presence/absence of landslide, while the {X i } variables are the static parameters detailed in Sect.2.2 and the NWP outputs detailed in Sect.2.3.The GLM with a logit link function is one of the most frequently used techniques in landslide susceptibility modelling and it has been largely and successfully applied.See the review article by Brenning (2005) and references therein for a list of works using logistic regression.For our purposes GLM is appropriate because (a) it handles both categorical and continuous predictor variables and (b) it does not assume a specific distribution of predictors.In the present study the GLM is taken into account because it is considered a sort of benchmark for LSM.A drawback of GLM is the incapacity to model potential non-linearities in the relationship between response and explanatory variables.In the present work, logistic regression is performed after applying an automatic stepwise forward variable selection based on the Akaike information criterion (AIC).The GLM is available in R's "stats" package.The RF classifier (Breiman, 2001) belongs to the family of machine learning algorithms.It is based on classification trees (Breiman et al., 1984) and on the idea of bagging (i.e.bootstrap-aggregation) predictors (Breiman, 1996).A RF is an ensemble of classification trees, where each tree is constructed from a random subset of the observations (i.e. the dependent variable) and at each node of the tree only a random subset of the predictors (i.e. the independent variables) is used.The data not chosen to construct the tree ("out-ofbag") are used to assess the predictive skill of the tree.The most common classification among all the trees is the prediction of the RF.
Díaz-Uriarte and De Andres (2006) stress some of the features and advantages of RF in bioinformatics classification problems.Schematically RF: (a) handles both continuous and categorical predictors naturally, (b) no formal distributions of the variable's predictors is assumed, (c) has an automatic variable selection, (d) does not need a cross-validation of the results but has a built-in estimate of the model's accuracy, (e) there is little need to fine-tune parameters and (f) incorporates highly non-linear interactions among predictors.This method has been extensively used in the literature in a variety of application fields ranging from bioinformatics (Díaz-Uriarte and De Andres, 2006) to remote sensing (Pal, 2005;Brenning, 2009) and ecology (Cutler et al., 2007) among others.
In the landslide mapping or geomorphological context it was used by Brenning (2009); Stumpf and Kerle (2011a, b); Vorpahl et al. (2012); Catani et al. (2013).In particular Catani et al. (2013) applied the RF algorithm to produce LSMs of the Arno River basin (about 9100 km 2 ) at different mapping units ranging from 10 to 500 m.They found that the RF is feasible and robust provided that a preliminary procedure is performed aimed at assessing the optimal number of trees of a single realization of RF.Brenning (2009), in his study on an automatic rock glacier detection procedure using multispectral remote sensed data, found that RF tends to overfit to the training data, achieving unsatisfactory results in crossvalidation.Marmion et al. (2008) found that RF provides low classification accuracy in a geomorphological application if compared to other statistical methods.In the paper by Brenning (2005), the author discusses how RF prediction is problematic outside the training set and in general RF was found to be prone to overfitting.
Beyond these disadvantages, for our purposes, one of the main features of the RF algorithm is the variable importance output.This measures the importance of each variable to perform a correct classification in the tree when the model is validated on the OOB (out-of-bag) data.Alternatively, the measure is based on the decrease of classification accuracy when the variables in a node of a tree are permuted randomly (Breiman, 2001).A variable's importance can be assessed by means of two different measures: mean decrease in accuracy and mean decrease in node impurity.We chose the mean decrease in accuracy which is meaningful in the present context since AUC is chosen as the performance measure of the predictive skill of the model.As done by Brenning (2009) and  Catani et al. (2013) the number of trees used in this study is 200.
In the present work, we used the R implementation (Liaw and Wiener, 2002) of the original RF code developed by L. Breiman and A. Cutler.In order to remove methodological limitations due to the lack of a performance estimation of the results on an independent test set, we ran both statistical models (GLM and RF) for both study cases (25OCT2011 and 18MAR2013) dividing randomly each event inventory data set in two groups: a training set and a validation set.The first group is 70 % of each event inventory data set, whereas the validation set is the remaining 30 %.The relative size of the two groups (70 and 30 %) is based on Carrara et al. (2008).For model fitting, the number of non-landslide locations equals the number of landslide locations according to Brenning (2005) and Goetz et al. (2011).The non-landslide locations were selected randomly in the two areas of interest.
Due to the small number of elements in the resulting training sets (in particular for 18MAR2013), overfitting may occur in the predictive relationship of the models.So we decided to perform multiple runs of the model (100 runs for each study case), with different random choice of the training and test sets, since Stumpf and Kerle (2011b) found that RF variable importance is almost stable when running multiple classification runs.

Results
Since one of the crucial points to properly forecast the rainfall-triggered landslides is an accurate prediction of spatial patterns and temporal intensity of rainfall (Crozier, 1999), in Sect.3.1 we show the verification of the WRF predictions using rain-gauge observations as ground truth.In Sect.3.2 we present the accuracy of the landslide susceptibility maps in terms of ROC curves and of the corresponding underlying area.In Sect.3.3 we show the variable importance outputs provided by the RF algorithm.

Evaluation of the forecasting skills of NWP outputs
To have an indication of the precipitation patterns simulated by the WRF model for 25OCT2011, in Fig. 4 we show a visual comparison of the rainfall amounts predicted by the model (panel a) and the observed data (panel b).In Table 3 (top), we show the descriptive statistics (average values and percentiles) of the observed and modelled rainfall data.To evaluate the under-estimation of the numerical data, we calculated the False Alarm Rate (FAR) and Probability of Detection (POD) for selected rainfall thresholds (see Fig. 5).
To estimate errors and bias, we computed the RMSE, the multiplicative bias and the correlation coefficient: the RMSE is about 150 mm and the multiplicative bias is about 0.49, whereas the correlation coefficient is about 0.19.To understand whether displacement errors occur in the WRF simulation, we calculated descriptive statistics not only for the numerical data extracted for rain-gauge locations, but also for all the numerical data extracted in a larger area affected by the 25OCT2011 event (inset rectangle in the left of Fig. 1).Descriptive statistics of these data are reported in Table 3 (top).Since in the rain-gauge data set we observed high rainfall intensities (up to 67 mm h −1 in the Pontremoli rain-gauge station), a special attention was devoted to the verification of the maximum rainfall intensities (mm h −1 ) predicted by the model.On average, over the 20 rain-gauges considered, the RMSE of rainfall intensity data is around 19 mm h −1 and the multiplicative bias is about 0.68.Summary descriptive statis-  4.
For the 18MAR2013 event, to have an idea of simulated rainfall patterns, we show in Fig. 4 a visual comparison between the rainfall data simulated by the model (panel c) and the observed rainfall data (panel d).The quantitative evaluation of forecasting skill reported a RMSE of about 45 mm, whereas the multiplicative bias is about 0.59 and the correlation coefficient is about 0.63.To evaluate the distributions of both numerical data and observations we show in Table 3 (bottom) the descriptive statistics.To evaluate over-/underestimation we computed the FAR and POD skill scores (see Fig. 6), where FAR and POD values are computed and plotted against selected thresholds ranging from the rounded minimum (10 mm) to the rounded maximum (130 mm) of the modelled data.
To evaluate the feasibility of an early warning system for rainfall-induced shallow landslide based on NWP data, we reported the computational time required by the WRF simulation.Considering the extent outlined in Fig. 1 and the grid spacing of 3 km, the CPU time is approximately 2 h on an eight-processor quad-core HPC Linux server.

Evaluation of landslide hazard maps
To visualize the modelling effects and to enable a few considerations, outputs of the model prediction have been provided.In Figs. 7 and 8 we show a single output, for each study case, among the 100 runs performed.The results for the study case 25OCT2011 are shown in Fig. 7a for the GLM and Fig. 7b for the RF model.The results for the study case 18MAR2013 are shown in Fig. 8a for the GLM and Fig. 8b for the RF model.In the maps, we have subjectively masked out the pixels where slope is below 6 %.In the maps, values range from 0 (green colour) indicating pixels classified as stable slopes to 1 (light grey colour) indicating pixels classified as unstable slopes.To evaluate the accuracy of landslide hazard maps, we calculated ROC and AUC values of the 100 runs performed on the test sets.Summary statistics of the AUC values achieved by the GLM and RF model runs are reported in Table 5.
To evaluate the feasibility of an early warning system, based on the forecasting chain proposed, we reported the CPU time required by GLM and RF.Considering the extent of the two areas of interest (see Fig. 1), the resolution of both static predictors and NWP outputs and considering the statistical models adopted, landslide hazard maps were produced requiring less than 10 min of CPU time on an eight-processor quad-core HPC Linux server.

Evaluation of importance of predictor variables
The optimability of the parameter set is evaluated on the basis of the out-of-bag error as explained in Catani et al. (2013).
Out of the 34 parameters considered in the analysis, a combination of 22 parameters represents the best configuration for 25OCT2011 on average over the 100 RF runs performed, whereas a combination 12 parameters is the best set, on average, for the 18MAR2013.Since in both cases the cardinality of the optimal set is quite high with respect to the total number of predictors, as indicators for the stability of the variable importance we decided, subjectively, to report (i) which is the average position of each predictor variable and (ii) how many times (in %) a predictor variable was classified as "important" in the first 10 (5) places of the ranking for 25OCT2011 (18MAR2013).In other words, we decided to restrict the optimal set of predictor variables classified as important to the first 10 (5) positions and count the average position of each predictor among the 100 RF runs.For 25OCT2011 elevation, distance from main tectonic features, vegetation index, maximum soil moisture in the layer 40-100 cm, total NWP rainfall amount and mean NWP rainfall hourly intensity are the six most frequent variables with average rank 4.5, 3.7, 3.4, 6.2, 2.6 and 2.6 respectively.AaCN, maximum soil moisture in the layer 10-40 cm and maximum NWP rainfall hourly intensity are the next most  frequent variables in the first 10 positions for 25OCT2011.The other variables approximately occur in the first 10 positions of the optimal set, in no more than one-third of the 100 RF runs performed.
Regarding 18MAR2013 the most frequent variable is elevation, which is always classified in the first position, followed by AaCN, maximum soil moisture in the layer 0-10 cm, SSS and land cover, each one occurring in the optimal set in at least half of the 100 RF runs.
Complete results for 25OCT2011 and 18MAR2013 are reported in Table 6.

Discussions
In an effort to elucidate the impact of NWP data in LSM, we implemented a simple statistical method by using variable  we discuss how the major findings of the work demonstrate the usefulness of NWP in LSM.
In spite the main aim of the work, a few issues need also to be addressed regarding (i) the reliability of NWP data used and (ii) the accuracy of the LSMs produced by the RF and the GLM, this last being a sort of benchmark for LSM.

Reliability of NWP data
Regarding 25OCT2011, from the analysis of Fig. 5 and Table 3 (top), it is clear that the model largely under-estimates the rainfall amounts for high thresholds (broadly greater than 60 mm).This fact is also confirmed by multiplicative bias, RMSE and correlation coefficient achieved (0.49, 150 mm and 0.19 respectively).The failure of the model to predict the exact locations of the deep convection processes can also be visually appreciated from Fig. 4a and b.The general underestimation of NWP model forecast in predicting precipitation maxima during intense precipitation events is a well known issue and was recently addressed for this specific rainfall event by Buzzi et al. (2014).On the other hand, from Table 3 (top), we can state that the model is able to capture the characteristics of a heavy rainfall event in the area of interest.In particular the maximum value achieved by the model in the whole area of interest (218 mm (24 h) −1 ) is similar to that obtained by Buzzi et al. (2014), who analysed the same rainfall event with the ISAC convection-permitting MOLOCH model (Buzzi et al., 2004).In their paper, the authors found a maximum rainfall amount of 286 mm (24 h) −1 .It should be noted, however, that they used the ECMWF-IFS analysis at 12 UTC 24 October 2011 (instead of 00:00 UTC 24 October 2011 as done here) and that their model horizontal resolution is 1.5 km (instead of 3 km as setup here).Similarly, from Table 4, we can assess that the model is also able to reproduce adequately the hourly rainfall intensities yielding moderate RMSE (19 mm h −1 ) and a multiplicative bias around 0.68.Overall we can conclude that despite the under-estimation of precipitation maxima on a point-to-point verification, for 25OCT2011 the numerical weather model provides results as reliable as possible considering the time of integration and the resolution adopted.
Similar considerations can be applied for the verification of 18MAR2013 rainfall forecast.Table 3 (bottom) and Fig. 6 show the over-estimation of WRF rainfall data for low thresholds and under-estimation for high thresholds.This is a feature of the model which was also found in similar studies in Italy (Oberto et al., 2012).However, for this event, quantitative forecasting skills (multiplicative bias and RMSE) are better than those found in Oberto et al. (2012).This mainly because this event was not characterized by deep convection processes (Regione Toscana, 2013) and thus fine-scale variability is represented more adequately.
Considering that high-intensity rainfall events are often spatially very variable, fine-scale variability cannot be properly modelled by the WRF simulations produced with grid point about 3 km.Previous similar studies, in particular Mercogliano et al. (2013b), used statistical techniques to resample, from the meso-γ to the micro-γ scale, rainfall data produced by the NWP model.In general, downscaling statistical methods, by construction, introduce a source of uncertainty because they depend upon the choice of predictors (mainly topographical variables) and they suppose stationarity in the predictor-predictand relationship (Fowler et al., 2007).On the other hand, regarding downscaling dynamical methods, it is well known that increasing horizontal resolution produces more skillful forecasts (Buzzi et al., 2014).Hopefully, in the near future operational NWP data will be produced with grid point spacing about 1 km (Schwartz, 2014) or below.Moreover, it is commonly accepted that predictions of intensive precipitation episodes of convective origin should be improved through data assimilation techniques (Schwitalla and Wulfmeyer, 2014).For these considerations, at the present stage of the work, we prefer to preserve the information content of NWP output produced at the meso-γ scale and to evaluate its value and possible positive impact on landslide modelling.

Accuracy of landslide susceptibility maps
Although AUC values, in general, are not comparable between studies in different study areas, the area under the ROC curve was found to be a good representative index of the accuracy of landslide probabilistic forecast (Frattini et al., 2010).Results here achieved and summarized in Table 5 are similar to those found in recent literature, considering in particular that the areas under examination here do not contain large "easy-to-predict" portions (i.e.flat valley floors or less steep forelands).Catani et al. (2013) in their study investigating the landslide susceptibility in Tuscany, using the RF technique, found AUC values ranging from 0.74 to 0.97 when increasing the number of samples required to calibrate a model.The resolution of their study was 50 m, which is comparable to the one adopted here (30 m) and the performance estimation technique is the same.In the present study, for the 25OCT2011 study case, the AUC values obtained are quite high both for the GLM and, even better, for the RF classifier, and are comparable to those found in Catani et al. (2013).On the other hand, for the 18MAR2013 case, average AUC values are lower for both models.This fact might be related to the potential incompleteness of the inventory map produced which is a limitation of the study case.
In the susceptibility maps produced (see a sample output in Fig. 7), the pixelated shape of some areas is due to the fact that no downscaling is performed to the NWP outputs (produced at 3 km of horizontal resolution) towards the resolution of the static thematic layers (30 m of horizontal resolution).Nevertheless, this coarse approximation does not dramatically affect the results as demonstrated by the AUC values achieved.

V. Capecchi et al.: Statistical modelling of shallow landslides using static predictors and NWP data
As a limiting factor of the study, it has to be stated that, in the context of landslide prediction, the relevance of the maps produced is limited by the fact that the models were fitted to an event landslide inventory and to NWP data simulated for the same date.In the near future and operationally, the idea is to calibrate the models for some study cases (i.e. at least 10-15 rainfall events) and then apply the models to other independent cases, or, on a daily basis, feed the models with NWP data and get the forecast for the following day.In this preliminary stage, it was not possible to consider additional study cases, because it was not possible to collect enough observed landslides.Moreover, in the present work, we did not train the models for 25OCT2011 and then apply them for 18MAR2013, because the two study cases show different characteristics from a meteorological point of view (convective precipitation vs. stratiform precipitation) and because the two events occurred in two different areas even if geographically contiguous.However, the work here presented is not strictly a hindcast exercise since the NWP data are not a numerical description of the rainfall that occurred (that is, we did not use reanalysis data as initial and boundary conditions).In contrast, NWP data were obtained by numerical integration using global analysis and forecast.
In spite of the simplicity of the statistical approach, another drawback of the methods proposed is that, being datadriven, a model built up for one region and for one particular event cannot be applied, without any re-calibration, to a neighbouring area.
It has to be kept in mind that these results were achieved without any algorithm or model calibrations.A possible tuning of the whole forecasting system may rely on the improvement of NWP performances.For example, since antecedent soil condition is one of the crucial factors for determining landslide-triggering rainfall thresholds (Glade et al., 2000), the NWP's initial soil moisture could be better estimated by means of the assimilation of remote sensed data at regional scale (Capecchi and Brocca, 2014).On the other hand, nowadays, global models have sophisticated assimilation algorithms to ingest observed soil water content estimates and to produce "warm" analysis of soil conditions (Dharssi et al., 2011;de Rosnay et al., 2013).As pointed out by Segoni et al. (2009), in order to gain a more accurate temporal and spatial knowledge of the triggering rainfall, EPS (ensemble prediction systems) or RUC (rapid updated cycle) numerical forecasting chains could be adopted.This latter methodology could be further improved by means of the assimilation of radar data (Segoni et al., 2009) to keep antecedent soil moisture conditions as close to reality as possible.
The total time required to produce a single landslide susceptibility map is around 2 h and 10 min.From an operative point of view, since global NWP forecasts are available each day at about 07:00 UTC (when the analysis time is 00:00 UTC), there is room for short-term regional weather forecast and GLM or RF runs, since regional daily assessment of landslide risk for the following day needs to be is-sued approximately before 12:00 UTC.Obviously, this CPU time can be further reduced by using additional or more recent processors to meet specific needs (for example larger areas of interest) of the end users.

Importance of NWP predictor variables
Our study is mainly aimed at testing and evaluating the information content provided by numerical weather predictions in landslide assessment.This aim is justified by the fact that recently NWP data have become a promising and reliable source of information for a real-time forecasting chain of rainfall-induced shallow landslides (Schmidt et al., 2008;Segoni et al., 2009;Mercogliano et al., 2013b).To our knowledge, there is no literature regarding the evaluation of the importance of NWP data on LSM.The present study differs from the cited papers because of the following: (i) we consider not only NWP rainfall amounts but also NWP hourly rainfall intensities and soil moisture in four layers below ground, (ii) we do not perform any statistical downscaling of the NWP data towards the LSM unit and (iii) we combine both static predictors and NWP data into a "blackbox" statistical model instead of using a physically based model.In particular this latter feature allows us to assess objectively (within the limitations associated with RF) the impact of NWP data on LSM by using the variable importance parameter.From Table 6 we can assess that NWP information is crucial for the 25OCT2011 event, whereas it has a more limited impact for the 18MAR2013 event.In particular for 25OCT2011, it is worth noting that not only is the total NWP rainfall amount found to be important, but also the mean, and to some extent the maximum, hourly NWP rainfall intensities are often ranked in the first positions among the 100 RF runs.In particular, the positive impact of hourly rainfall intensities is a relatively innovative finding and here it is found that, as a predisposing factor, it is as important as the total rainfall amount.In the recent literature, only Segoni et al. (2009) used the radar data to extrapolate hourly or sub-hourly rainfall intensities for LSM purposes.In contrast, regarding 18MAR2013, NWP rainfall data are occasionally classified as important in RF runs.In fact, 18MAR2013 was characterized mainly by stratiform precipitation and the role of the prolonged antecedent precipitations played a relevant role in landsliding activity.Not surprisingly, for this event the maximum soil moisture in the first 10 cm below ground is classified as important in almost 75 % of the runs with an average rank of 3.This result confirms the findings of recent studies investigating the strong linkage between soil moisture and landslide occurrence (Ray et al., 2010).In particular, Ponziani et al. (2012) in analysing the role of antecedent soil moisture conditions at the regional scale, assessed the soil moisture content as being as important as rainfall amounts for the triggering of landslides.In addition, this finding suggests the possible use of NWP soil moisture contents for the initialization of hydrological states in modelling, instead of using NWP rainfall data as the main driver of soil moisture patterns (Schmidt et al., 2008).It is also worth noting how, for this rainfall event, NWP root-zone soil moisture contents (namely water content in the layer 100-200 cm below ground) is classified as important in more than 25 % of the 100 RF runs.On can argue that this is the effect of the prolonged antecedent precipitation.Similarly, for 25OCT2011, NWP soil moisture in the layers 10-40 and 40-100 cm below ground are found to be important in 67 and 90 % of the RF cases.This effect needs to be further investigated to understand whether there is a causal relationship between subsurface soil moisture content and land movement or whether RF overfitting occurs.
In general, in spite of RF ease of use, the fact that RF overfitting may occur in geomorphological applications (Brenning, 2005(Brenning, , 2009;;Marmion et al., 2008) is a possible drawback of the present study and any result regarding predictor variables importance should be analysed with care.For this reason we tried to mitigate possible accidental effects in the predictor/predictands relationships by performing multiple runs of the models varying the choice of the training and test sets.
A few issues need to be addressed concerning the importance of the static predictors.This work does not add anything innovative in relation to the relative importance of the static layers in LSM with respect to other similar and recent works (Catani et al., 2013).The common findings with this paper are as follows: (i) elevation is always (or very often) ranked in the first positions, (ii) the slope curvature variables do not seem to have any influence on landslide susceptibility, (iii) the lack of importance that the prediction model assigns to the hydrology-related variable is confirmed.On the other hand, some discrepancies found are as follows: (i) the land cover has a relevant influence on landslide assessment for 18MAR2013, while Catani et al. (2013) found a relatively marginal impact, (ii) the climate-related predictor variables in Catani et al. (2013) are often classified as important in the first positions at several mapping units, whereas in the present paper these variables are classified as important only occasionally.Nevertheless, for these predictors it is interesting to stress how (i) for 25OCT2011 (large precipitation amount in a relatively short time period) the 12 and 24 h duration and 100-year return period occur in the RF rank, while (ii) for 18MR2013 the 48 and 96 h duration and 100-year return period occur, taking into account the tendency of the territory to be affected by long-lasting and prolonged precipitation.

Conclusions
Our study presents the opportunity to integrate NWP information content into a statistical method to take account in advance of the relationship between rainfall, soil moisture and landslide movements.The relative importance of the NWP data is assessed by using RF diagnostics and it is found that rainfall amounts and hourly intensities are relevant for the convective precipitation case, whereas for the prolonged antecedent precipitation case, the NWP surface soil moisture is as important as geomorphology-related predictors.We therefore tried to bridge the gap between the micro-γ scale (≤ 20 m), which is the characteristic scale of geomorphology, hydrology and climate related information, with the meso-γ scale, which is the current typical scale of the NWP forecast.The final output of the statistical modelling has been evaluated by dividing the event inventory map into training sets and test sets and the results are estimated by using the AUC values.To account for possible overfitting in model estimates, or, in other words, to evaluate the stability of variable importance, an "ensemble" of model runs was performed.
A deeper interpretation of the preliminary results obtained is an ongoing task.Further well-documented severe rainfall events need to be addressed and the results need to be validated with independent ground observations to take account of possible overfitting in the procedures and to properly calibrate the model.
caused large precipitations amounts in the Magra River valley; flash floods, Published by Copernicus Publications on behalf of the European Geosciences Union.

Figure 1 .
Figure 1.Map of Italian domain showing the two areas of interest.The left inset shows the Lunigiana area belonging to the administrative province of Massa-Carrara.This area was affected by the 25OCT2011 event.The right inset shows the Garfagnana area belonging to the administrative province of Lucca.This area was affected by the 18MAR2013 event.Both insets show the area (black rectangles) where the statistical models were implemented and tested.The outer black rectangle represents the domain of the WRF simulations.

Figure 2 .
Figure 2. Probability density as a function of landslides area of the inventory map for 25OCT2011.

Figure 3 .
Figure 3. Locations of the rain-gauges for the two study cases: 25OCT2011 (closed circles) and 18MAR2013 (crosses).

Figure 4 .
Figure 4. (a), (b) Modelled and observed precipitation in mm accumulated in the 24 h period starting at 00:00 UTC of 25 October 2011.(c), (d) Modelled and observed precipitation in mm accumulated in the 24 h period starting at 00:00 UTC of 18 March 2013.

Figure 5 .
Figure 5. Study case 25OCT2011: false alarm rate (top) and probability of detection (bottom) skills obtained when validating the accumulated 24 h rainfall predicted by the WRF model with the rainfall data observed at the rain-gauge locations displayed in Fig. 3.

Figure 6 .
Figure 6.Study case 18MAR2013: false alarm rate (top) and probability of detection (bottom) skills obtained when validating the accumulated 24 h rainfall predicted by the WRF model with the rainfall data observed at rain-gauge locations displayed in Fig. 3.

Figure 7 .
Figure 7. 25OCT2011 landslide hazard maps.Results from the GLM (a) and from the RF classifier (b).Crosses points are the event inventory maps produced by field surveys a few days after the rainfall event.

Figure 8 .
Figure 8. 18MAR2013 landslide hazard maps.Results from the GLM (a) and from the RF classifier (b).Crosses points are the event inventory maps produced by field surveys a few days after the rainfall event.

Capecchi et al.: Statistical modelling of shallow landslides using static predictors and NWP data
reported estimates of return period V.

Table 1 .
List of the predictors taken into account in both GLM and RF classifier for landslide hazard mapping.

Table 2 .
Basic options of the numerical WRF simulations.

Table 3 .
Descriptive statistics of the observed and modelled 24 h rainfall amounts for both study cases (unit is mm).

Table 4 .
Descriptive statistics of the observed and modelled 1 h rainfall values for 25OCT2011 (unit is mm).

Table 5 .
AUC values achieved by the GLM and RF models for both study cases.Summary statistics are computed considering the 100 runs performed (δ stands for standard deviation).

Table 6 .
List of predictors classified as "important" by the RF algorithm for 25OCT2011 and 18MAR2013.The average position and the percentage of occurrences in the first 10 (5) positions in 25OCT2011 (18MAR2013) ranking are computed over the 100 runs of RF.