Assessing the spatial variability of weights of landslide causal factors in different regions from Romania using logistic regression

Introduction
Landslides are widespread gravitational processes, controlled by various factors related to geology, geomorphology, hydrology, climate and land use, and with significant potential impact on the environment and human society. As in the case of any type of risk phenomenon, the analysis of landslide risk assumes the use of "observations about what we know to make predictions about what we don't know" (Paustenbach, 2002). Generally, the evaluation of landslide risk takes into account components such as landslide susceptibility, landslide hazard, landslide vulnerability and, consequently, the elements at risk.
Compared to the other components, landslide susceptibility can be modeled with a relatively high degree of accuracy. It is defined as the occurrence probability of a landslide event in a certain area. The assessment of different probability degrees is based on the assumption that slope failures in the future will be more likely to occur under the conditions that led to past and present slope movements (Varnes, 1984; Carrara et al., 1995; Guzzetti et al., 1999; Ercanoglu, 2008). Because the temporal factor is not taken into account (Dai and Lee, 2002; Zêzere et al., 2002), landslide susceptibility relies on a rather complex knowledge of slope movements and their controlling factors (Ayalew and Yamagishi, 2005). The manner in which these conditions combine spatially and temporally, leading to landslide manifestations, is still in an early stage of exploration.
Landslide susceptibility assessment can be approached by means of qualitative or heuristic methods (which are partially subjective and essentially based on expert knowledge), quantitative methods (based on numerical expressions of the relations between controlling factors and landslide activities), or combinations of qualitative and quantitative (hybrid) methods. The quantitative methods have developed rapidly during the last two decades due to the development and growing accessibility of geoinformation tools, including geographic information systems, remote sensing, digital photogrammetry and global positioning systems (van Westen et al., 2008; Guzzetti et al., 2012). The application of statistical tools and new research techniques facilitates fast and accurate computation and gives more insight into the landsliding process, including its mapping (Guzzetti et al., 1999; van Westen et al., 2006). Statistical methods include bivariate analysis, which approaches the relations between the controlling factors individually, and multivariate analysis, which evaluates the relative importance of each instability factor with respect to the others, allowing a better understanding of the interrelationships between the controlling factors (Falaschi et al., 2009). One of the most popular statistical methods used for landslide susceptibility assessment is binary logistic regression (BLR), with numerous applications for this purpose, especially at regional scales (Süzen and Doyuran, 2004; Zhu and Huang, 2006; Thiery et al., 2007; Mathew et al., 2009; Bai et al., 2010, 2011; Rossi et al., 2010; Van Den Eeckhaut et al., 2010; Atkinson and Massari, 2011; Ercanoglu and Temiz, 2011; Akgun, 2012). The main advantage of this method is its capability to eliminate unrelated causative factors and evaluate the significance of the related ones (Yesilnacar and Topal, 2005; Falaschi et al., 2009; Chauhan et al., 2010; Ghosh et al., 2011).
The identification and selection of the causal parameters play an essential role in landslide susceptibility assessment (Aleotti and Chowdhury, 1999). However, the selection of parameters is far from being "standardized". It usually depends on expert knowledge, size of the area, time, scale, landslide types, applied methodology, budget, data availability and reliability (Glade and Crozier, 2005). BLR provides, as do other multivariate methods, numerical weights for the causal factors, as expressions of the degree to which their spatial combinations influence landslide manifestations. The present study employs this method in order to evaluate the landslide susceptibility in different geographical areas, using roughly the same predictors, and to achieve an accurate image of the spatial variability and range of variation of the causal factors. For this purpose, four sectors belonging to different geographical regions of Romania were chosen, located both in hilly areas (Transylvanian Plateau, Moldavian Plateau) and in lower mountain regions (Subcarpathians). In all these sectors, the landslides, either old or recent, have important extents, constituting the main form of land degradation.

Study areas
As previously mentioned, four sectors were selected for analysis, namely Căpușu de Câmpie, Șipote, Lungani and Helegiu, located in regions of Romania that are representative in terms of the spatial extent of landslides (Fig. 1). Each sector has a square shape with sides of 15 km (225 km²), corresponding to the rectangular grid of the Romanian 1 : 25 000 topographic map. Two of them, Căpușu de Câmpie and Lungani, have already been evaluated as to landslide susceptibility in a previous study (Mărgărint et al., 2011). The first analysed sector, Căpușu de Câmpie, is situated in the central part of the country, within the Transylvanian Depression, which is developed on a series of saliferous domes and brachy-anticlines with mean flank slopes of 3-6° (Irimuș, 1998). The lithology is represented by Sarmatian deposits (Volhynian-Basarabian), including clays and marls with sand intercalations, incorporating loose sandstones and volcanic tuffs.
In the south-western part of the sector, there are more recent deposits of Pannonian age, represented by clays with sand intercalations. The altitude varies between 283 and 572 m, the relief energy is below 150 m and the density of relief fragmentation is relatively high. The mean annual precipitation is around 600-630 mm yr⁻¹, with a peak in its monthly distribution within the April-July period. Agricultural lands dominate the sector (about 90 % of the total surface), the proportion of arable lands reaching 70 %. Landslides are the dominant slope modelling processes, the large deep-seated landslides named glimee being characteristic for the region (Morariu and Gârbacea, 1968). These are rotational landslides, dormant or active, developed during several stages, with deluvium thickness normally more than 30 m, usually showing step-like and hummocky morphology. Many other slope areas from the study sector, mostly the cuesta escarpments, are affected by active shallow landslides.

The next two sectors, Șipote and Lungani, are situated in north-eastern Romania, in the central part of the Moldavian Plateau, belonging to the extensive east-European geostructural platform. The surface deposits present a monoclinic structure with an inclination of 4-8 m km⁻¹ along the NNW-SSE direction (Ionesi, 1994). They are constituted, in the upper part, by an alternating sequence of Sarmatian marls, clays, sandstones and sand complexes. The altitudes vary between 45 and 218 m, while the relief energy and the density of relief fragmentation present similar values to those from the Căpușu de Câmpie sector. The mean annual precipitation is around 530-560 mm yr⁻¹, being unevenly distributed within the year (more than half of the annual quantity falls from May to August). Slope stability is also influenced by land use (deforestation, crops cultivated on slopes, dense network of ponds) and by the growing extent of roads and settlements (Mărgărint et al., 2010). Slide amphitheatres, known as hârtoape, are typical for slope morphology. These are semicircular depressions, shaped through successive landslide and/or erosion processes starting from the origin of torrential valleys. Important areas are associated with old, dormant landslides, which have thicknesses of 10-20 m, but recent shallow reactivations are also present.
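The fourth sector, Helegiu, is situated in the Moldavian Subcarpathians, a complex structural unit bordering the Carpathian Mountains. The structure of nappes and the diverse lithology have conditioned the formation of a fragmented relief, with steep slopes, favouring the great extension of mass movement processes. The Paleogene and Neogene geological strata are represented by marls-clays, clays, sands, gravel and loams, with intercalations of volcanic tuffs and gypsum. The mean annual precipitation varies around 530-670 mm yr⁻¹, heavy rainfalls being characteristic. Apart from slope modelling processes, this sector is characterized by intense hydrographic activity and extended areas which were subject to deforestation during the last two centuries.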

Methodology
The logistic regression method has been selected to fulfil the purpose of the present study. This method belongs to the group of generalized linear models (GLM). The natural logarithm of the odds, that is, of the ratio between the probability for an event to occur and the probability for the event not to occur, ln[P/(1 − P)], is called the logit. If this quantity can be expressed as a linear combination of predictors (x), then the probability for an event to occur can be further derived:

$$P = \frac{1}{1 + e^{-(b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n)}}$$

where $b_0$ is the intercept and $b_1, \dots, b_n$ are the regression coefficients of the predictors $x_1, \dots, x_n$. In this manner, the probability of an event (landslide) to occur is linked to a linear combination of predictors through a logistic function. The regression coefficients are computed using maximum likelihood estimation (Süzen and Doyuran, 2004; Bai et al., 2010). Unlike linear regression, logistic regression has no closed-form solution for its coefficients, which is why the maximum likelihood estimation follows an iterative algorithm. Though the regression coefficients are not readily interpretable, one can use the standardized coefficients to assess the relative importance of predictors.
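The link between the logit and the probability can be illustrated with a minimal Python sketch. The function names, the intercept and the coefficient values below are hypothetical and chosen only for illustration; they are not the coefficients reported in Table 2.

```python
import math


def logistic(z):
    """Map a linear predictor z = b0 + b1*x1 + ... + bn*xn to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Natural logarithm of the odds p / (1 - p); the inverse of logistic()."""
    return math.log(p / (1.0 - p))


def landslide_probability(intercept, coefficients, predictors):
    """Probability of landslide occurrence for a single raster cell."""
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return logistic(z)


# Hypothetical values, for illustration only.
b0 = -2.0
b = [0.15, 0.8]   # e.g. slope angle (degrees) and a land use landslide density
x = [12.0, 1.4]
p = landslide_probability(b0, b, x)
```

Applying `logit` to the resulting probability returns the linear predictor (here 0.92), which is the sense in which the logit is linear in the predictors.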
In the present study, ten predictors were considered to be potential causal factors for landslide occurrence in all four sectors: elevation, slope angle, mean curvature, plan curvature, profile curvature, distance from drainage network, slope aspect, slope height, land use and surface lithology.
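The necessary data for landslide susceptibility computation were acquired from cartographic and aerial photographic materials, the primary basis for spatial data acquisition being the 1 : 25 000 Romanian topographic map, with Gauss-Krüger transverse cylindrical projection, printed in 1984. In a first stage, the landslide inventories were carried out for all sectors, based on interpretation of the 2006 ortho-rectified aerial photos with a spatial resolution of 0.5 m, which were further checked and validated by field campaigns. All types of landslides, dormant or active, were taken into consideration, resulting in total numbers of 528 landslides for Căpușu de Câmpie sector, 284 for Șipote sector, 286 for Lungani sector and 851 for Helegiu sector.

Next, starting from the digitized elevation isolines, the digital elevation model (DEM) of each sector was computed, with a spatial resolution of 20 m × 20 m. The DEMs were further used to derive the thematic layers representing the geomorphometrical predictors required in the analysis. Elevation, slope angle, mean curvature, plan curvature, profile curvature, distance from drainage network and slope aspect were computed using ArcGIS 9.3 software, while slope height, representing the altitude above river channels, was derived in SAGA-GIS 2.0.8 software. The land use layer was created by vectorization of land use polygons on the basis of high resolution ortho-rectified aerial photos (2006), which were georeferenced using the 1 : 5000 topographic maps. The following land use categories were depicted by photointerpretation and named according to Romanian cadastral terminology: arable, pastures, arable and pastures, forest, water, built areas and unproductive land. Then, the surface lithology predictor was acquired from the geological map of Romania at scale 1 : 200 000, other more accurate sources being unavailable for this parameter. At this scale, only the mountainous Helegiu sector is better individualized, because of its higher geological complexity.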
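As an illustration of how a geomorphometrical predictor such as slope angle can be derived from a gridded DEM, the following Python sketch applies a third-order finite difference estimator (Horn's method) to a 3 × 3 neighbourhood. This estimator is widely used in GIS software, but that it matches the exact settings used in the study is an assumption; the function name and the elevation values are hypothetical.

```python
import math

CELL_SIZE = 20.0  # DEM resolution in metres, as stated in the text


def slope_degrees(dem, row, col, cell_size=CELL_SIZE):
    """Slope angle (degrees) at an interior cell from its 3 x 3 neighbourhood."""
    a, b, c = dem[row - 1][col - 1], dem[row - 1][col], dem[row - 1][col + 1]
    d, f = dem[row][col - 1], dem[row][col + 1]
    g, h, i = dem[row + 1][col - 1], dem[row + 1][col], dem[row + 1][col + 1]
    # Weighted differences between the east/west and south/north columns/rows.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Hypothetical 3 x 3 DEM tile: a plane rising 2 m per 20 m cell towards the east.
dem = [
    [300.0, 302.0, 304.0],
    [300.0, 302.0, 304.0],
    [300.0, 302.0, 304.0],
]
slope = slope_degrees(dem, 1, 1)
```

For this inclined plane the gradient is 0.1 (2 m of rise over 20 m), so the sketch returns the arctangent of 0.1 expressed in degrees.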
There are two ways to integrate qualitative predictors in logistic regression models. One approach is to express the classes of each categorical parameter as dummy variables (Guzzetti et al., 1999; Dai and Lee, 2002; Ohlmacher and Davis, 2003; Nefeslioglu et al., 2008; etc.). Another approach is to compute landslide densities for categorical parameters and use them as predictors (Zhu and Huang, 2006; Yilmaz, 2009; Bai et al., 2010). The present study exploits the latter approach in order to avoid the creation of excessively high numbers of dummy variables. Consequently, landslide densities were computed for slope aspect, land use and surface lithology according to the following formula:

$$LD_i = \frac{LA_i / A_i}{LA / A}$$

where $LD_i$ is the landslide density value for class $i$, $LA_i$ and $A_i$ are the landslide area in class $i$ and the total area of class $i$, respectively, and $LA$ and $A$ are the total landslide area in the study region and the total area of the study region, respectively. In order to obtain the landslide density raster layers, the zonal histogram procedure from the ArcGIS 9.3 Spatial Analyst extension was employed, using the landslide polygons as zone dataset. The results were exported and processed in Excel in order to obtain the landslide density values for each class. These values were then recorded into the attribute tables of the qualitative factors, which were further converted into raster layers.

Because a certain amount of redundancy is present among the considered predictors, a selection procedure must be applied. In the present study, the XLSTAT 2010 trial version software was used to apply the logistic regression, and the selection of the relevant predictors was performed by the stepwise (forward) procedure implemented in the logistic regression module. This procedure adds the variables one by one, checking at each step whether the contribution of the new variable, assessed through the Wald chi-square test, is statistically significant. After the third variable is added, the procedure checks whether removing any of the variables improves the model.
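The landslide density computation can be sketched in a few lines of Python. The sketch assumes that the density of a class is the ratio between the landslide proportion within that class and the landslide proportion of the whole study region, which is consistent with the variable definitions given in the text; the function name and the areas are hypothetical and do not reproduce the study's data.

```python
def landslide_density(landslide_area_by_class, area_by_class):
    """Return LD_i = (LA_i / A_i) / (LA / A) for every class i."""
    total_landslide_area = sum(landslide_area_by_class.values())  # LA
    total_area = sum(area_by_class.values())                      # A
    overall = total_landslide_area / total_area
    return {
        name: (landslide_area_by_class.get(name, 0.0) / area) / overall
        for name, area in area_by_class.items()
    }


# Hypothetical areas in km^2 for three land use classes.
area = {"arable": 150.0, "pastures": 50.0, "forest": 25.0}
slides = {"arable": 6.0, "pastures": 10.0, "forest": 2.0}
density = landslide_density(slides, area)
```

Under this assumption, values above 1 mark classes where landslides are over-represented relative to the region as a whole (pastures in the hypothetical example), and values below 1 mark under-represented classes.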
It is generally acknowledged that the application of logistic regression requires fairly equal numbers of presences (1) and absences (0) in the input dataset (Nefeslioglu et al., 2008; Bai et al., 2010; Ayalew and Yamagishi, 2005; García-Rodríguez et al., 2008; Gorum et al., 2008). In the present study, the depletion area of each landslide was identified and mapped. These areas were then randomly sampled and each point was assigned the value of 1 in the attribute database to indicate slope failure occurrence.
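Next, a random sample of the same size was generated outside the landslide depletion areas, each point being coded with 0. In order to test the predictive potential of the models, 20 % of the samples, randomly selected, were used for validation as independent datasets. The application of logistic regression aimed to produce the landslide susceptibility maps for all four sectors. The continuous susceptibility values (from 0 to 1) were further classified using the natural breaks (Jenks) method, which identifies the class breaks that best group similar values and maximize the differences between classes (Fig. 3). Five susceptibility classes were separated: very low, low, medium, high and very high.

There are several ways to test the quality of the logistic regression model. The likelihood ratio is used to compare a given model with the saturated model showing a theoretically perfect fit. The pseudo-coefficients of determination (e.g. McFadden, Cox and Snell, Nagelkerke) indicate the accuracy of fitting associated with the model. Analogously to the determination coefficient used in multiple linear regression, the values of the pseudo-R² vary between 0 and 1, measuring how well the model is adjusted. For model validation, the present study employs the classification accuracy tables, the ROC curve and the AUC parameter.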
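The balanced sampling of presences and absences and the 20 % hold-out described above can be sketched as follows. This is a minimal illustration using Python's standard library; the function names, the cell identifiers and the sample size are hypothetical, and the actual sampling in the study was carried out in GIS software.

```python
import random


def balanced_sample(landslide_cells, stable_cells, n, seed=0):
    """Draw n presence (1) and n absence (0) points at random."""
    rng = random.Random(seed)
    presences = [(cell, 1) for cell in rng.sample(landslide_cells, n)]
    absences = [(cell, 0) for cell in rng.sample(stable_cells, n)]
    return presences + absences


def split_train_validation(samples, validation_share=0.2, seed=0):
    """Randomly hold out a share of the samples as a validation set."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_share)
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical cell identifiers inside and outside the depletion areas.
landslide_cells = list(range(0, 1000))
stable_cells = list(range(1000, 10000))
samples = balanced_sample(landslide_cells, stable_cells, n=500)
training, validation = split_train_validation(samples)
```

Fixing the random seed makes the split reproducible, which helps when accuracy is compared between training and validation samples.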

Results
Through the stepwise filtering procedure of the logistic regression model, the relevant causative factors in landslide occurrence were selected for each of the four analysed sectors. Figure 2a-f displays the spatial distribution of the six predictors in the case of Helegiu sector.
Maps of landslide susceptibility were obtained for each sector, and their values were classified using the natural breaks (Jenks) method (Fig. 3).
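The idea behind a natural breaks classification can be illustrated with a short Python sketch that searches for the class limits minimizing the within-class sum of squared deviations, the criterion usually associated with the Jenks method. This is a simple dynamic-programming formulation written for illustration, not the ArcGIS implementation; the function name and the susceptibility values are hypothetical, and the cubic running time makes it suitable only for small samples.

```python
def natural_breaks(values, n_classes):
    """Return upper class limits minimizing the within-class sum of squares."""
    data = sorted(values)
    n = len(data)
    # Prefix sums give the squared deviations of any slice in constant time.
    prefix = [0.0]
    prefix_sq = [0.0]
    for v in data:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations from the mean of data[i:j]."""
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[c][j]: lowest cost of splitting data[:j] into c classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    start = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for c in range(1, n_classes + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    start[c][j] = i
    # Walk back through the stored split points to recover the class limits.
    limits = []
    j = n
    for c in range(n_classes, 0, -1):
        limits.append(data[j - 1])
        j = start[c][j]
    return limits[::-1]


# Hypothetical susceptibility values in [0, 1], grouped into three classes.
scores = [0.02, 0.03, 0.05, 0.40, 0.42, 0.45, 0.90, 0.93, 0.95]
upper_limits = natural_breaks(scores, 3)
```

For the five susceptibility classes used in the study, the same function would be called with `n_classes=5` on a sample of the continuous susceptibility values.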
Table 1 presents the percentages of susceptibility classes for each sector. Notably, the very low and low susceptibility classes cover 70-75 % of Căpușu de Câmpie, Șipote and Lungani sectors, while these classes represent about 57 % in the case of Helegiu sector. The high and very high susceptibility classes represent 14-18 % in Căpușu de Câmpie, Șipote and Lungani sectors and about 27 % in the case of Helegiu sector.
The logistic regression coefficients are given in Table 2, the predictors being arranged in order of decreasing importance according to the standardized coefficient values.

The Receiver Operating Characteristic (ROC) curves, one of the most useful tools for evaluating the logistic regression model fit (Gorsevski et al., 2006), were computed for training samples. The area under the ROC curves indicates a high degree of accuracy for all landslide susceptibility models (Fig. 4). Notably, Șipote sector, followed by Căpușu de Câmpie and Lungani sectors, present the higher values, of 0.922 and 0.912 respectively, compared to the mountainous sector of Helegiu (0.852).
The percentages of correctly classified points, for a cut-off value of 0.5, achieved for both training and validation samples, indicate good and stable logistic regression models (Table 3). Higher predictive accuracy is noticed as well for the plateau sectors, especially for Șipote and Lungani (with an overall accuracy of 86.86 % and 86.88 %, respectively).
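The two validation measures used above can be sketched in plain Python. The AUC is computed here through its rank interpretation, that is, the probability that a randomly chosen landslide point receives a higher predicted probability than a randomly chosen non-landslide point, with ties counted as one half. The function names, labels and probabilities are hypothetical, and treating a probability equal to the cut-off as a presence is an assumption.

```python
def classification_accuracy(labels, probabilities, cutoff=0.5):
    """Share of points whose predicted class matches the observed class."""
    predicted = [1 if p >= cutoff else 0 for p in probabilities]
    correct = sum(1 for y, y_hat in zip(labels, predicted) if y == y_hat)
    return correct / len(labels)


def auc(labels, probabilities):
    """Area under the ROC curve from pairwise presence/absence comparisons."""
    positives = [p for y, p in zip(labels, probabilities) if y == 1]
    negatives = [p for y, p in zip(labels, probabilities) if y == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical observed classes and predicted probabilities.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probabilities = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
accuracy = classification_accuracy(labels, probabilities)
area = auc(labels, probabilities)
```

In this hypothetical sample, six of the eight points are classified correctly at the 0.5 cut-off, and 14 of the 16 presence/absence pairs are ranked correctly.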
The graphic representations of the standardized coefficients' values are presented in Fig. 5 and prove to be useful for better understanding the relations between the spatial distribution of susceptibility classes (Fig. 3a-d) and the influence of each factor (e.g. Fig. 2a-f for Helegiu sector).

Discussions
The landslide susceptibility in all sectors is generally explained by the slope angle, land use and slope height above the channel network. Other factors play secondary roles, such as profile and plan curvature, elevation, surface lithology and distance from drainage network. The slope aspect parameter was removed from the analysis by the stepwise procedure in the case of the Căpușu de Câmpie, Șipote and Lungani sectors, while the mean curvature parameter was eliminated for all sectors.
Slope angle is the most important factor for Căpușu de Câmpie, Șipote and Lungani sectors. This parameter is almost constantly found among the three most important factors in studies applying a similar methodology for landslide susceptibility assessment at regional scale (Ayalew et al., 2005; Gorsevski et al., 2006; Bai et al., 2010; Chauhan et al., 2010; Dominguez-Cuesta et al., 2010; Pradhan and Lee, 2010; Van Den Eeckhaut et al., 2010; Yalcin et al., 2011). The great influence of the slope factor highlights the high and very high susceptibility classes, which are clearly positioned along the cuesta escarpments.

Land use is the most important factor for Helegiu sector and is placed in the second position in the case of Căpușu de Câmpie and Șipote sectors. The highest landslide density values are associated with pastures, but it is obvious that, in many situations, landslides occurred prior to the change of land use into pastures. From this point of view, the results may be influenced by the consideration of present land use rather than the land use prior to landslide occurrence (Atkinson and Massari, 1998). However, especially for the plateau sectors (Căpușu de Câmpie and Șipote), land degradation processes, including landsliding, were favoured by long-term subsistence agricultural practices with no agrotechnical conservation measures, with a high degree of land property fragmentation, and with tillage along the maximum slope gradient direction. The persistence and the shifting on parallel tracks of agricultural exploitation roads have constituted, in many situations, favourable conditions for the extension of landslides in the affected areas. For Helegiu sector, the land use factor stands out through its much higher weight relative to the other factors, due to the massive deforestation of the last two centuries, which led to the great extension of landslides on terrains currently used as pastures. Yet another possible explanation is the integration of the unproductive land class, which does not appear in the other sectors (Fig. 2a).
For Lungani sector, the lower relative importance of this parameter is explained by the presence of the Bahluieț floodplain (in the central-northern part), which is mostly covered with pastures, but where landslides are missing.
The slope height is the next important factor, being the second in the case of Lungani sector and the third for Căpușu de Câmpie and Șipote sectors. Its significant influence is explained by the high relative altitude of landslide depletion areas on which the models are based.
The lithological factor occupies the fourth position in the predictors' hierarchy in the case of Helegiu sector. The landslide density values reveal the influence of some sequences of marl, sandstone and conglomerate strata in increasing the landslide susceptibility values (formations of Lutetian age) (Fig. 2c). For the other sectors this parameter has a lower influence due to the lack of detailed geological maps and to the relatively high geological uniformity. The other predictors, as already mentioned, proved to be the least important factors in all study sectors.

Conclusions
The scientific literature provides several hierarchies of predictors with respect to their influence on landslide susceptibility assessment, with a large range of variation. In most cases, certain factors occupy the first ranks: slope gradient, lithology, land use, and slope aspect. The present study concurs with these findings, placing the factor weights within the limits that are specified in other similar studies. For all study sectors, high values of predictors' weights are noticed for slope angle, land use and slope height. The influence of lithology, in the case of the mountainous Helegiu sector, also plays an important role, which confirms the fact that, under high geological diversity conditions, the lithological factor has a significant weight in landslide susceptibility.
The positions and weights associated with the other factors show high degrees of variability from one sector to another. It is obvious that the selection of common predictors in landslide susceptibility assessment leads to more generalized analyses. The variation of factor weights may suggest the existence of other factors, with local influences, which are probably considered redundant in some cases, but which should be evaluated as they reflect the regional traits of the landslide manifestation process. This variability could also be related to the spatial scale and to the level of detail of the input materials on the basis of which the data acquisition is performed. It can also be stated that the weights assigned to causal factors by means of logistic regression are capable of revealing some important regional characteristics of landslide manifestations.
plex structural unit bordering the Carpathians Mountains. The structure of nappes and the diverse lithology have conditioned the formation of a fragmented relief, with steep slopes, favouring the great extension of mass movement processes. The Paleogene and Neogene geological strata are represented by marls-clays, clays, sands, gravel, loams, with intercalations of volcanic tuffs and gypsum. The mean annual precipitations vary around 530-670 mm yr⁻¹, heavy rainfalls being characteristic. Apart from slope modelling processes, this sector is characterized by intense hydrographic activity and extended areas which were subject to deforestations during the last two centuries.

tographic and aerial photographic materials, the primary basis for spatial data acquisition being the 1 : 25 000 Romanian topographic map, with Gauss-Krüger transversal cylindric projection, printed in 1984. In a first stage, the landslide inventories were carried out for all sectors, based on interpretation of the 2006 ortho-rectified aerial photos with a spatial resolution of 0.5 m, which were further checked and validated by field campaigns. All types of landslides, dormant or active, were taken into consideration, resulting total numbers of 528 landslides for Căpușu de Câmpie sector, 284 for Șipote sector, 286 for Lungani sector and 851 for Helegiu sector.

classes (Fig. 3). Five susceptibility classes were separated: very low, low, medium, high and very high. There are several ways to test the quality of the logistic regression model. The likelihood ratio is used to compare a given model with the saturated model showing a theoretically perfect fit. The pseudo-coefficients of determination (e.g. McFadden, Cox and Snell, Nagelkerke) indicate the accuracy of fitting associated with the model. Analogously to the determination coefficient used in multiple linear regression, the values of the pseudo-R² vary between 0 and 1, measuring how well the model is adjusted. For models' validation, the present study employs the classification accuracy tables, the ROC curve and AUC parameter.

Table 1. Percentages of landslide susceptibility classes.