Using online databases for landslide susceptibility assessment: an example from the Veneto Region (northeastern Italy)

In this paper, spatial data available in the Italian portals was used to evaluate the landslide susceptibility of the Euganean Hills Regional Park, located SW of Padua (northeastern Italy). Quality, applicability and possible analysis scales of the online data were investigated. After a brief overview of the WebGIS portals around the world, their contents and tools for natural risk analyses, a susceptibility analysis of the study area was carried out using a simple probabilistic approach that compared landslide distribution and influencing factors. The input factors used in the analysis depended on available data and included landslides, morphometric data (elevation, slope, curvature, profile and plan curvature) and non-morphometric data (land use, distance to roads and distance to rivers). Great attention was paid to the pre-processing step, in particular the re-classification of continuous data, which was performed following objective, geologic and geomorphologic criteria. The results show that the simple probabilistic approach used for the susceptibility evaluation has quite good accuracy and precision (repeatability). However, heuristic, statistical or deterministic methods could be applied to the online data to improve the prediction. The data available online for the Italian territory allows susceptibility assessment at medium and large scales. Morphometric factors, such as elevation and slope angle, are important because they implicitly include information that is not available, such as lithologic and structural data. The main drawback of the Italian online databases is the lack of information on the frequency of landslides; thus, a complete hazard analysis is not possible. Despite the good results achieved to date, collection and sharing of data on natural risks must be improved in Italy and around the world. The creation of spatial data infrastructure and more WebGIS portals is desirable.

Correspondence to: M. Floris (mario.floris@unipd.it)


Introduction
The most widely adopted definition for landslide hazard is "the probability of occurrence of a potentially damaging phenomenon (landslide) within a given area and in a given period of time" (Varnes et al., 1984). This definition implies the evaluation of both the spatial (where) and temporal (when) probability of occurrence (Einstein, 1997) and of the magnitude of a given phenomenon (Aleotti and Choudhury, 1999; Guzzetti et al., 1999).
The evaluation of spatial and temporal landslide hazard is highly dependent on the precision and accuracy of the input factors, in particular landslide identification and characterization (van Westen et al., 2008). Factors influencing spatial landslide hazard (susceptibility) include geological, geomorphological, hydrogeological and tectonic features, geomechanical and geotechnical properties, land use and management and morphometric factors. Determining which factors to use in an analysis depends on the method of analysis, the availability and quality of the input data, including the scale of input data, and the aims of the output (Soeters and van Westen, 1996; Lee and Min, 2001; Cascini et al., 2005; Fell et al., 2008; van Westen et al., 2008). Moreover, the choice can be made a priori on the basis of in-depth territorial knowledge and expert judgment or by performing statistical analyses to identify the significance of each of the influencing factors. In the case of temporal predictions, the evaluation is often more difficult. It can be determined by deducing the frequency of landslide phenomena from historical data or through multi-temporal analysis of aerial and satellite images (Chadwick et al., 2005; Romeo et al., 2006; Zanutta et al., 2006; Devoli et al., 2007) or by relating landslides and their triggering factors (Leroi, 1997), such as earthquakes or heavy rainfall, for which a long time series of measurements is available and statistical and probabilistic analyses can be performed (Ding et al., 2006; Floris and Bozzano, 2008; García-Rodríguez et al., 2008).
Since the 1980s, more than 600 publications on the topic of landslide hazard and risk assessment have been published by many authors around the world. Many methods and techniques, from qualitative to quantitative, have been proposed, some of which use Geographic Information Systems (Chacón et al., 2006 and references therein). GIS techniques offer basic capabilities and specific tools for storing and processing spatial data to evaluate both spatial and temporal landslide hazard. In the last ten years, due to the increase in tools for online sharing, management and communication of geographical data, many web-based Geographical Information Systems (WebGIS) have been created worldwide by local governments, nations, universities and other research centers.
In this paper, the authors evaluate the possibility of using online data to make spatial and temporal predictions. In this context, the quality and applicability of online databases available for the Italian territory were examined. The limitations of using online data, in particular with respect to the scale at which the analysis is carried out (e.g., small/regional, medium, large, detailed), were investigated. After a brief overview of the WebGIS available around the world and of the most important geographic portals in Italy, a landslide hazard analysis of the area of Euganean Hills Regional Park (northeastern Italy, Fig. 2) using available online data is presented. Because of the lack of data useful for the evaluation of temporal hazards, a GIS-based landslide susceptibility analysis of the study area was performed, and the results were tested and discussed.

Online databases
Exposure to risks due to natural hazards is increasing around the world, and thus more tools for risk management and mitigation are needed. The creation of online databases can make an important contribution to sharing data and experiences between the scientific community and authorities. Considering scientific publications on GIS-based landslide hazard assessment, the degree of sharing and the availability of online geo-data from some representative sample nations have been evaluated (Table 1). WebGIS portals with high-level sharing and contents, such as those of Canada, Australia, the United States of America, the United Kingdom and Spain, are unevenly distributed around the world. Asian nations, such as China and South Korea, which are widely affected by hydrological and geological events and have extensive scientific literature, do not yet have efficient online databases. In the near future, an increase in WebGIS portals dedicated to the dissemination of data and advances in natural disaster and risk prevention is desirable.
In Italy, national projects for the creation of WebGIS and web-based databases began in the 1980s and have increased over the years with different territorial extensions: national, interregional and regional (Table 1).
The Italian National Cartographical Portal, the E-GEO project (a complete inventory of Italian geo-thematic maps) and the IFFI project (a detailed picture of the landslide distribution within Italy) represent the major national projects in Environmental and Earth Sciences. During the last decade, data available through these portals has been used for several purposes, such as the analysis of the spatial and temporal distribution of landslides to map landslide-hazard variability (Giardino et al., 2004; Colombo et al., 2005) and the evaluation of landslide volume at regional scales (Marchesini et al., 2009).
The interregional portals, administered by Basin Authorities (public corporations established in 1989 for environmental protection and town-and-country planning), are noteworthy because they allow both free download of available geo-data and browsing, viewing and querying of the Hydro-geological Arrangement Plan (PAI, Piano di assetto idrogeologico) databases.
Portals from the different Italian regions give access to large quantities of spatial data, but the contents and degree of sharing differ among the regions. The regional portals can be ranked on the basis of accessibility, completeness of data and free download; the resulting overview highlights the need for common tools and resources for landslide hazard analyses (Fig. 1). Regional portals that provide all three of the above features are assigned a value of 3; regional portals not showing any of the characteristics receive a value of 0. This classification shows that the portals of regions such as Emilia Romagna, Umbria and Marche have poor accessibility and no downloading tools despite the large quantities of data. However, 15 of the 20 reviewed portals offer good accessibility and completeness, and in nine cases, free download is allowed, which is good for the development of a spatial data infrastructure for landslide hazard evaluation and risk management in Italy.
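The ranking rule described above can be sketched in a few lines. This is only an illustration: the `Portal` class, the `portal_score` function and the three example entries are hypothetical and do not reproduce the values behind Fig. 1.

```python
from dataclasses import dataclass


@dataclass
class Portal:
    """Hypothetical record of the three features used to rank a portal."""
    name: str
    accessible: bool      # data can be browsed and queried easily
    complete: bool        # contents cover what a landslide analysis needs
    free_download: bool   # data can be downloaded at no cost


def portal_score(portal: Portal) -> int:
    """Return a value from 0 to 3: one point for each feature offered."""
    return int(portal.accessible) + int(portal.complete) + int(portal.free_download)


# Made-up entries for illustration only.
portals = [
    Portal("Region A", True, True, True),
    Portal("Region B", True, True, False),
    Portal("Region C", False, False, False),
]
scores = {p.name: portal_score(p) for p in portals}
print(scores)
```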

Case study
The area of Euganean Hills Regional Park (Fig. 2) was chosen as the study area for testing the quality and applicability of online databases in landslide susceptibility analyses. This choice was driven by the in-depth territorial knowledge of this area by our research group, which allowed a critical analysis of the results of this study.
The Euganean Hills area can be subdivided into three main sub-areas (Piccoli et al., 1981). The identified features of these three areas are attributed to structural and mechanical causes. The emplacement of the volcanic bodies produced uplift, bulging and jointing of the sedimentary units, activating and facilitating erosional and gravitational processes that acted in a differential way because of the variation in physical and mechanical properties between the sedimentary rocks and the volcanic units.
The study area is affected by different landslide typologies, mainly soil slips and rotational and translational slides (Cruden and Varnes classification, 1996) involving debris covers and sedimentary deposits (Fig. 3). Other typologies, such as falls, topples and lateral spreads, are rare and involve volcanic deposits. Thus, the susceptibility analysis performed in this study includes only soil slips and rotational and translational slides (the latter two are not subdivided in the online database).

Input datasets and method
The landslide susceptibility was evaluated using data from online spatial databases of the Institute for Environmental Protection and Research (ISPRA, Istituto Superiore per la Protezione e la Ricerca Ambientale), the regional portal of Veneto and the national portal of the Italian Ministry of Environment (Table 1).
Access to the ISPRA portal data on landslides (IFFI project) required a formal request because the data, although viewable online, was not downloadable. The IFFI (Inventario Fenomeni Franosi Italiani, Italian Landslides Inventory) geodatabase consists of information about the location of mass movement, type of movement, state of activity, lithology, land use, information source, description of direct damage, triggering factors and remedial works. For the present analysis, only the spatial location and the type of movement data were used.
The vector terrain data at a 1:5000 scale from the Veneto Region Cartographic Portal is, in contrast, directly downloadable. Based on these data, a Digital Elevation Model (DEM) and a suite of DEM-derived factors influencing landslide occurrence were built. On the same portal, vector land use data could be freely consulted, but a request had to be made to the Veneto Region administration to acquire the data.
The Ministry of Environment's cartographic portal provides useful information for landslide analyses at different scales. The data can be consulted through the online WebGIS or using the ArcIMS service, but they are not downloadable. For this paper, information on the temporal pattern of landslides was obtained from this portal in the form of multi-temporal ortho-photos and radar interferometry data.
The inputs of influencing factors used in the susceptibility analysis depended on the available data. Landslides, morphometric data (elevation, slope, curvature, profile and plan curvature) and non-morphometric data (land use, distance to roads and distance to rivers) were included in the analysis. As stated in the previous section, because the study area is mainly affected by rotational and translational slides and soil slips (Fig. 3), analysis was performed only for these landslide types. Morphometric factors were derived using a 5 × 5 m DEM built utilizing the Topo to Raster tool, available in the ESRI ArcGIS package. This tool permits interpolation of a hydrologically correct surface from point, line, and polygon data. Further morphometric factors could be extracted from the available data (slope direction, slope length/shape, flow direction, flow accumulation, internal relief, drainage density, energy relief, roughness), but they were irrelevant or redundant for the purpose of this work. Non-morphometric factors available in vector format were converted to raster with the same cell spacing as the DEM. The input raster datasets were overlaid exactly for the analysis.
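As an illustration of how a morphometric factor can be derived from a gridded DEM, the sketch below computes slope angle with simple finite differences. It assumes the DEM is held as a NumPy array on a regular 5 m grid; the function name `slope_degrees` is hypothetical, and the ArcGIS tools used in the study may apply a different neighbourhood scheme, so values need not match exactly.

```python
import numpy as np


def slope_degrees(dem: np.ndarray, cell_size: float = 5.0) -> np.ndarray:
    """Slope angle in degrees from a regular-grid DEM (finite differences)."""
    # np.gradient returns the derivative along rows (y) and then columns (x).
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


# Synthetic inclined plane: elevation rises 1 m per 5 m cell along x.
dem = np.tile(np.arange(6, dtype=float), (4, 1))
slope = slope_degrees(dem)
print(slope.round(2))
```

On this synthetic plane every cell has the same gradient (1 m over 5 m), so the computed slope is uniform.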
In most probabilistic and statistical methods used in the evaluation of landslide susceptibility, continuous data needs to be reclassified into ranges or categories for analysis. In this work, elevation and slope angle data were re-classified by taking into account the geomorphic features of the area in different morphological domains, reflecting different geological, structural and geomorphic settings. Thus, elevation and slope angle data were subdivided into ranges that can give information on features not found in the online database.

Fig. 6. Success rate curves showing how prediction images of Fig. 4 fit the instability conditions of the study area. On the x-axis, portions of the accumulated area predicted as susceptible are sorted in descending order. On the y-axis, portions of the area affected by instability phenomena are reported. Thus, as an example, it can be deduced that 30 % of the most susceptible area corresponds to 60 % of landslides, 65 % of rotational and translational slides and 72 % of soil slips.
Re-classified elevation data consists of four categories (Table 4): the first category (0-20 m) includes the sector of alluvial deposits not affected by landslides, the second category (20-80 m) is characterized by calcareous rocks and colluvial deposits, the third category (80-220 m) includes the outcropping of debris deposits, and the fourth and last (220-600 m) includes the outcropping of volcanic rocks. Slope angle was classified into five categories accounting for the natural breaks in the continuous data. Curvature was re-classified using natural breaks that were modified manually to distinguish concave from flat and convex zones. The distance to roads and distance to rivers were re-classified using natural breaks and a geomorphic criterion. Categories were chosen on the basis of the presence of landslides, potential influence on instability of roads and presence of delineated stream networks.
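The elevation re-classification can be sketched as follows, using the four ranges given above. The helper name `reclassify` and the sample values are hypothetical, and the treatment of cells lying exactly on a class boundary (assigned here to the upper class) is an assumption, since the text does not specify it.

```python
import numpy as np

# Upper bounds (m) of elevation categories 1 to 3; category 4 is 220-600 m.
ELEVATION_BREAKS = [20.0, 80.0, 220.0]


def reclassify(values: np.ndarray, breaks: list) -> np.ndarray:
    """Map continuous values to 1-based category ids using class breaks."""
    # np.digitize returns 0 below the first break, so add 1 for 1-based ids.
    return np.digitize(values, breaks) + 1


# Made-up elevations (m), one falling in each category.
elevation = np.array([[5.0, 35.0], [150.0, 410.0]])
categories = reclassify(elevation, ELEVATION_BREAKS)
print(categories)
```

The same helper could be reused for slope angle, curvature and the distance factors once their breaks are chosen.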
The landslide susceptibility assessment was performed first using the entire dataset of landslides (indicated as landslides in Figs. 4, 6 and 7) and then dividing the same dataset on the basis of the landslide type (rotational/translational slides and soil slips).The sub-division in type of movement is important at large or detailed analysis scales, but it is less relevant at medium to regional scales.
The susceptibility was evaluated using a simple probabilistic method that compares the spatial landslide distribution with the influencing factors (Table 2) (Lee and Min, 2001; Lee and Pradhan, 2007). Using this method, the relative influence of each range/category of landslide-related factors can be weighted through the landslide index (b/a). An index of less than one means a low correlation between landslide and range, whereas an index greater than one means medium to high correlation. Landslide indexes for each category of input factors were calculated (Table 4), and then the indexes were summed up cell by cell to obtain the prediction images (i.e., susceptibility maps) of Fig. 4.
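A minimal sketch of the landslide index and the cell-by-cell summation is given below. It assumes that a is the share of the analysed area falling in a category and b is the share of the landslide area falling in the same category; this reading of b/a is an assumption, as Table 2 is not reproduced here. The function names and the two small rasters are hypothetical.

```python
import numpy as np


def landslide_index(categories: np.ndarray, landslide_mask: np.ndarray) -> dict:
    """Return {category id: b/a} for one re-classified factor raster."""
    total_cells = categories.size
    total_slide_cells = landslide_mask.sum()
    index = {}
    for cat in np.unique(categories):
        in_cat = categories == cat
        a = in_cat.sum() / total_cells                           # share of area
        b = (in_cat & landslide_mask).sum() / total_slide_cells  # share of landslides
        index[int(cat)] = float(b / a)
    return index


def prediction_image(factor_rasters: list, landslide_mask: np.ndarray) -> np.ndarray:
    """Sum the landslide indexes of all factors cell by cell."""
    total = np.zeros(landslide_mask.shape, dtype=float)
    for raster in factor_rasters:
        for cat, value in landslide_index(raster, landslide_mask).items():
            total[raster == cat] += value
    return total


# Made-up 2 x 4 rasters for illustration only.
slope_cat = np.array([[1, 1, 2, 2], [1, 1, 2, 2]])
land_use_cat = np.array([[1, 1, 1, 2], [1, 1, 1, 2]])
slides = np.array([[False, False, True, True], [False, False, True, True]])

image = prediction_image([slope_cat, land_use_cat], slides)
print(landslide_index(slope_cat, slides))
print(image.round(2))
```

In this toy case slope category 2 holds all landslides on half of the area, so its index is 2, while category 1 has an index of 0.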
To test the effectiveness of the forecasting model and obtain information on the potential of available data for susceptibility assessment, the area was subdivided into two zones: a training area and a test area (Fig. 5). The training area was used to calculate landslide indexes to apply to the test area.
Finally, a sensitivity analysis was carried out to identify which combination of the influencing factors could be used to refine the forecasting model and to highlight the most related factors and the possibility of excluding redundant inputs.
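The factor combinations compared in the sensitivity analysis can be sketched as below, assuming each factor has already been converted into a raster of landslide indexes. The combination labels follow the captions of Figs. 9 and 10; the `combine` function, the dictionary keys and the one-row rasters are hypothetical, and the example omits curvature sub-types and the distance factors for brevity, so "AIF" here means only the factors present in the example.

```python
import numpy as np

# Factor subsets named as in the figure captions.
COMBINATIONS = {
    "DDF": ["elevation", "slope", "curvature"],   # DEM-derived factors
    "LSD": ["land_use", "slope", "elevation"],
    "LD": ["land_use", "elevation"],
    "LS": ["land_use", "slope"],
}


def combine(index_rasters: dict, factors: list) -> np.ndarray:
    """Sum the landslide-index rasters of the selected factors cell by cell."""
    return np.sum([index_rasters[name] for name in factors], axis=0)


# Made-up landslide-index rasters (one row, two cells) for illustration.
index_rasters = {
    "elevation": np.array([[1.0, 2.0]]),
    "slope": np.array([[0.5, 1.5]]),
    "curvature": np.array([[1.0, 1.0]]),
    "land_use": np.array([[0.0, 2.0]]),
}

images = {name: combine(index_rasters, factors) for name, factors in COMBINATIONS.items()}
images["AIF"] = combine(index_rasters, list(index_rasters))  # all inputs in this example
print({name: img.tolist() for name, img in images.items()})
```

Each resulting image can then be evaluated with a rate curve to see which subset fits the mapped landslides best.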

Results
The success rate curves of Fig. 6 show how the prediction images of Fig. 4 fit the landslides that occurred in the area (Chung and Fabbri, 2003). In all cases, the prediction fits quite well with the observed instability of the area. The 30 % of 5 × 5 m pixels classified as most susceptible cover more than 60 % of the area identified as landslide, and zones without landslides were classified with low and very low indexes.
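A rate curve of this kind can be sketched as follows: cells are sorted from most to least susceptible, and the cumulative share of landslide cells is tracked against the cumulative share of area. The function names and the ten-cell example are hypothetical; cells with equal susceptibility are ordered arbitrarily in this sketch.

```python
import numpy as np


def rate_curve(susceptibility: np.ndarray, landslide_mask: np.ndarray):
    """Return cumulative area fraction (x) and landslide fraction (y)."""
    order = np.argsort(susceptibility.ravel())[::-1]   # most susceptible first
    hits = landslide_mask.ravel()[order]
    area_fraction = np.arange(1, hits.size + 1) / hits.size
    landslide_fraction = np.cumsum(hits) / hits.sum()
    return area_fraction, landslide_fraction


def landslides_captured(area_fraction, landslide_fraction, top: float = 0.30) -> float:
    """Share of landslide cells inside the `top` most susceptible share of area."""
    return float(np.interp(top, area_fraction, landslide_fraction))


# Made-up data: ten cells, three of them mapped as landslide.
susceptibility = np.array([9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.0])
slides = np.array([True, True, False, True, False, False, False, False, False, False])

x, y = rate_curve(susceptibility, slides)
captured = landslides_captured(x, y)
print(round(captured, 3))
```

Evaluating the curve with the landslides used to build the model gives a success rate curve; evaluating it with landslides from a separate test area gives a predictive rate curve.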
The predictive rate curves of Fig. 7 show how the predictions carried out in the training area fit the landslides that occurred in the test area. The results fit well, especially in the case of rotational and translational slides. As for the success rate curves, 30 % of the area classified as most susceptible covers 60 % of the landslides that occurred in the test area. Moreover, the predictive rate curves help in the interpretation of results of the susceptibility analysis and in classifying the prediction images of Fig. 4 (Chung and Fabbri, 2003). The area was classified as having high, medium, low or very low susceptibility on the basis of the shape of these curves. As an example, in the case of the entire dataset of landslides, the most dangerous category (high susceptibility) includes areas with a landslide index greater than 10; these extend over about 30 % of the test area and contain 54 % of the unstable areas (Table 3, Fig. 4). Medium susceptibility is assigned to areas with a landslide index between 8 and 10, which cover 38 % of landslides in the test area; low susceptibility is assigned to areas with an index between 6 and 8, which cover 8 % of landslides; and very low susceptibility is assigned to areas with an index between 4 and 6, which cover no landslides. Hence, the classification of the susceptibility in categories is based on the degree of prediction of each category.
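The index ranges quoted above for the entire landslide dataset can be turned into susceptibility levels as in the sketch below. The function name and the sample values are hypothetical, and the assignment of values lying exactly on a threshold to the upper class is an assumption.

```python
import numpy as np

# Index thresholds separating very low / low / medium / high susceptibility.
THRESHOLDS = [6.0, 8.0, 10.0]
LEVELS = np.array(["very low", "low", "medium", "high"])


def susceptibility_level(index_sum: np.ndarray) -> np.ndarray:
    """Assign a susceptibility level to each summed landslide index."""
    return LEVELS[np.digitize(index_sum, THRESHOLDS)]


# Made-up summed indexes, one per class.
levels = susceptibility_level(np.array([4.5, 6.2, 9.0, 12.3]))
print(levels.tolist())
```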
The limited accuracy of the prediction cannot be attributed to the use of a simple probabilistic model. In fact, the model showed quite good precision (repeatability) when applied to different areas, that is, the training and the test areas. As shown in Fig. 8, in the case of landslides and of translational and rotational slides, the success and predictive rate curves overlap, meaning that the prediction model can be applied to different areas with the same results. However, heuristic, statistical and deterministic models could be applied to the available online data to improve the prediction.
The landslide-related factors used as input layers are significant for the landslide susceptibility analysis of the Euganean Hills. The calculated landslide indexes show that, for each factor, one or more categories are clearly predisposed to landslides (Table 4). In most cases, the categories with higher landslide indexes (>1.5), highlighted in bold red in Table 4, differ between the two landslide typologies, and in some cases, the relation between the distribution of landslides and categories of influencing factors is verified for one of the landslide types and not for the other. As an example, the distance to rivers can be correlated to soil slips but not to translational and rotational slides because, for soil slips, one of the categories has a high landslide index, which implies that zones without a well-defined stream network are mostly predisposed to superficial phenomena rather than rotational and translational slides. Factors most related to landslide occurrence are elevation, slope angle and land use, as highlighted by the sensitivity analysis (Figs. 9 and 10). In fact, these factors implicitly include important information not available in the Italian online databases, such as lithological and structural data. The success rate curves of Fig. 9 show that the best fit of the instability conditions of the study area is reached when using elevation, slope angle and land use as input factors and a good fit is reached when using only the DEM-derived factors. The predictive rate curves of Fig. 10 confirm the high influence of elevation, slope angle and land use in the prediction, whereas DEM-derived factors permit a less accurate prediction similar to that made using all the inputs. This result is important because morphometric data is often the only data available.
The importance of landslide susceptibility evaluations at different scales of the inputs considered in this study is reported in Table 4. All the inputs have a high importance at the large scale and in detailed work. The quite low accuracy of the susceptibility evaluation for the study area and lack of data in available online databases that are crucial in analyses at the detailed scale (van Westen et al., 2008), such as slide type sub-division (rotational and translational), lithology, landslide activity and landslide monitoring, suggest that the possible scale of analysis using available Italian online databases is from large to medium (<1:10 000 and >1:50 000), depending on the purpose of the prediction.

Conclusions
The results of this study show that the availability of online data for landslide susceptibility assessment in Italy is quite good, allowing analyses from the large to medium scales. Nevertheless, in Italy and in the world, sharing of data on natural risks must be improved through the creation of spatial data infrastructure and the increased use of the powerful tools furnished by ever more sophisticated and effective WebGIS systems.
The susceptibility analysis of the Euganean Hills Regional Park highlights the role of DEM-derived factors in landslide susceptibility assessment. Input factors such as elevation and slope angle are important because they help with landslide susceptibility evaluation where lithological and structural data are lacking. However, attention must be paid when using these factors in forecast modeling. Current morphometric features could be the result of landslide processes rather than of a morphometry predisposed to instability phenomena. In this case, DEM-derived factors can help in identifying landslides rather than in the implementation of forecasting models. Nevertheless, the use of morphometric factors in landslide susceptibility evaluation is recommended for information on first-time landslides, and depending on the analysis scale, the accuracy and resolution of DEMs can improve the results (Floris et al., 2010; van Westen et al., 2008). Furthermore, morphometric factors are continuous data; therefore, in most of the forecasting models suggested by different authors, they must be re-classified taking into account objective criteria and also the geologic, geomorphologic and structural settings of the area under investigation.
Finally, one of the main drawbacks of the Italian online databases is their lack of data on the frequency of landslides (i.e., rate of activity), which is needed for a quantitative and complete landslide hazard assessment. In the IFFI database, a field reporting the date of movement of landslides is available, but records are very few. For example, in the Euganean Hills, only 18 % of landslide reports have temporal data associated with them. This is because activation and re-activation are usually recorded only when a landslide damages anthropogenic structures or infrastructure and, as in the whole Italian territory, the percentage of landslides threatening built-up areas is low. Further information on the temporal pattern of landslides can be found in the Ministry of Environment's cartographic portal, where the results of radar interferometry for the last twenty years are reported. The limits of interferometry techniques in detecting ground displacements caused by landslides (Catania et al., 2005; Colesanti and Wasowski, 2006; Crosetto et al., 2005; Rott and Nagler, 2006) allow the assessment of the state of activity in only a few cases. The main limitation relates to the Permanent Scatterers (PS) technique, which is not suitable for detecting ground displacements due to the landslide typologies (shallow and rapid movements) affecting the Euganean Hills and to the geo-environmental characteristics of the area (i.e., land use), although it can be considered very effective in many applications (Ferretti et al., 2001). In this environment, promising results can be achieved by using SBAS (Small Baseline Subset) interferometry techniques (Guzzetti et al., 2009; Lauknes et al., 2010). The integration of results from the two interferometry techniques with remote sensing data from ground and airborne platforms will be the subject of further research leading to a quantitative evaluation of the landslide state of activity of the Euganean Hills and to an engineering geology design for landslide monitoring.

Fig. 1. Classification of regional portals in Italy based on the accessibility, completeness and ability to freely download data for landslide hazard analyses.

Fig. 5. The study area was sub-divided into training and test areas. The training area was used to evaluate landslide, translational and rotational slide and soil slip indexes, and the test area was used to evaluate the performance of the forecasting model.

Fig. 8. Comparison between success and predictive rate curves for landslides (a), translational and rotational slides (b) and soil slips (c), showing the accuracy and precision of the forecasting model. In (a) and (b), the model could be considered quite accurate (30 % of the area predicted as the most hazardous covers 60 % of the landslides that occurred), and the overlapping of the curves demonstrates that the model can be applied to different areas giving similar results (high precision). In (c), the model is quite accurate but shows low precision (different results in different areas).

Fig. 9. Success rate curves showing how the predictions made using the entire set of landslides and different combinations of influencing factors fit the instability conditions of the study area. AIF: all inputs; DDF: DEM-derived factors (elevation, slope angle and curvature); LSD: land use, slope angle and elevation; LD: land use and elevation; and LS: land use and slope angle.

Fig. 10. Predictive rate curves showing how the forecasting models implemented in the training area using the entire set of landslides and different combinations of the most influencing factors fit the instability conditions of the test area. AIF: all inputs; DDF: DEM-derived factors (elevation, slope angle and curvature); LSD: land use, slope angle and elevation.

Table 2. Scheme of the probabilistic method used for the landslide susceptibility analysis.

Table 3. Levels of susceptibility assigned on the basis of the degree of prediction of different landslide index ranges.

Table 4. Landslide, rotational and translational slide and soil slip indexes for each category of input factors. Values indicating medium to high susceptibility to instability phenomena are shown in red. The importance of the factors for landslide susceptibility evaluation at different scales is also indicated.