Assessment of earthquake-triggered landslide susceptibility in El Salvador based on an Artificial Neural Network model

This paper presents an approach for assessing earthquake-triggered landslide susceptibility using artificial neural networks (ANNs). The computational method used for the training process is a back-propagation learning algorithm. It is applied to El Salvador, one of the most seismically active regions in Central America, where the last severe destructive earthquakes occurred on 13 January 2001 (Mw 7.7) and 13 February 2001 ( Mw 6.6). The first one triggered more than 600 landslides (including the most tragic, Las Colinas landslide) and killed at least 844 people. The ANN is designed and programmed to develop landslide susceptibility analysis techniques at a regional scale. This approach uses an inventory of landslides and different parameters of slope instability: slope gradient, elevation, aspect, mean annual precipitation, lithology, land use, and terrain roughness. The information obtained from ANN is then used by a Geographic Information System (GIS) to map the landslide susceptibility. In a previous work, a Logistic Regression (LR) was analysed with the same parameters considered in the ANN as independent variables and the occurrence or non-occurrence of landslides as dependent variables. As a result, the logistic approach determined the importance of terrain roughness and soil type as key factors within the model. The results of the landslide susceptibility analysis with ANN are checked using landslide location data. These results show a high concordance between the landslide inventory and the high susceptibility estimated zone. Finally, a comparative analysis of the ANN and LR models are made. The advantages and disadvantages of both approaches are discussed using Receiver Operating Characteristic (ROC) curves. Correspondence to: M. J. Garćıa-Rodŕıguez (mjosegr@topografia.upm.es)


Introduction
Landslide susceptibility studies are appropriate for evaluation and mitigation plans in potential landslide areas.There are several techniques available for landslide susceptibility research; however, due to uncertainty as an inherent nature of landslide phenomena, statistical and artificial intelligence approaches provide computational models to assess landslide susceptibility over large regions.In January and February 2001, El Salvador experienced several destructive earthquakes that caused hundreds of landslides of various sizes.In this study, we have used an ANN model to assess the susceptibility of the whole country of El Salvador to earthquakeinduced landslides.
The first of those earthquakes occurred on 13 January 2001, with the epicentre located off the western coast of El Salvador in the subduction zone between the Cocos and Caribbean plates (Fig. 1).The earthquake had a magnitude of M w 7.7 and it produced more than 600 landslides, the most significant of them taking place in Santa Tecla, Las Colinas (Fig. 2), where 585 people died.
The second destructive earthquake occurred one month later, on 13 February, with a magnitude of M w 6.6.The epicentre was located to the west of San Miguel, this earthquake was associated with local faults that cross the country from east to west.
Data collected in situ are composed of numerous evidences of landslides and effects of the earthquakes, as well as with aerial images, topographical and geologic maps.They provide a complete base for the application of landslide susceptibility methodologies and a data source to contrast with the observations of landslides.The calibrated models will allow for the post-evaluation of landslide susceptibility associated with future earthquakes in El Salvador.(El Salvador, 13 January 2001), where 585 people died.The epicenter was located off the western coast of El Salvador in the subduction zone between the Cocos and Caribbean plates with a magnitude of M w 7.7 and a focal depth of 40 km (Benito et al., 2004).

Brief state of the art
Predicting future landslide locations requires a quantitative methodology to model these complex phenomena from past events with data gathered in the field or by other techniques, such as remote sensing, aerial photography or LIDAR.In the last few years, the development and the spread of soft computing applied to model natural phenomena have increased exponentially.The fuzzy set theory, ANN, and genetic algorithms were applied to classification problems with the Geographic Information System (GIS), mainly because of their high capacity for analyzing heterogeneous and uncertain data.
ANNs are non-linear function approximators suitable for the assessment of indirect landslide susceptibility estimations.They provide good predictions even when given noisy and uncertain data.ANNs have been applied successfully to the assessment of landslide susceptibility.Lee et al. (2003a) determined landslide susceptibility by using ANN models and compared neural models with probabilistic and statistical ones.Lee et al. (2004) developed a method to integrate ANNs to calculate the Landslide Susceptibility Index (LSI).The weights of the relative importance of different factors for landslide occurrence were successively found from the ANN model.Lee et al. (2003b) used a multi-layered perceptron (MLP) neural network to estimate landslide susceptibility.The results were checked by ranking the susceptibility index in classes of equal area and the results showed a good agreement between the susceptibility map and the landslide inventory.Ermini et al. (2005) used ANNs to classify terrain units considering hillslope factors by applying two neural architectures: a PNN (Probabilistic Neural Network) and a MLP network; they concluded with a slight preference for the MLP network.Gómez and Kavzoglu (2005) estimated an ANN model for assessing landslide risk with different parameters derived from a Digital Elevation Model (DEM), remote sensing imagery, and documentary data in a MLP.Yesilnacar and Topal (2005) applied logistic regression (LR) analysis and neural networks to prepare a landslide susceptibility map for the rerouting of a pipeline after a segment of natural gas pipeline was damaged due to a landslide in Turkey.The strengths and weaknesses of these techniques were compared and the results found by ANNs were more realistic than those found by LR.

Application to El Salvador: data sources and GIS
The methodology for the assessment of earthquake-triggered landslide susceptibility has involved the integration of data sources into a GIS (Fig. 3).The evaluation of susceptibility requires data input from variables representing physical parameters known to contribute to the initiation of landslides.El Salvador GIS is composed of diverse information including a 1:100 000 geological map, a 1:25 000 Digital Cartography, precipitation data, ground strong motions records, epicentres catalogue, and an inventory of landslides.The landslides inventory used consists of data on slope movement from the 2001 El Salvador earthquakes compiled by the SNET (Servicio Nacional de Estudios Territoriales) of El Salvador.Descriptions and classifications of landslides were mainly based on the system developed by Cruden and Varnes (1996), which takes into consideration the type of movement, materials involved, and the state or activity of the unstable slopes.For our study, the inventory was processed and the debris flows were taken into account and separated from other types of mass movements, such as rock falls.Out of 235 samples, 112 belonged to this inventory as landslides and 123 samples were removed as non-landslides.
All of these layers of information from different sources and formats have been integrated into a GIS.In addition, there are other layers of generated maps and requirements, including Digital Terrain Model (DTM), slope map, aspect, roughness, and precipitation, for the assessment susceptibility (García-Rodríguez et al., 2008, 2009).
The physical properties of slope-forming materials, such as strength and permeability, are related to the lithology factor, which therefore should affect the likelihood of slope failure (Dai et al., 2002).The GIS information of lithology is structured into three types of layers: polygons (geologic, pedogenic, and volcanic classes), lines (faults, escarpments, dikes, paleo-riverbeds, and mineral seams), and points (fumaroles, fossils, and volcanic classes).The surface geologic maps (scale 1:100 000) were digitised and georeferenced to obtain these data.The lithological units shown in the surface geologic maps were reclassified according to the classification by the SNET and a generalized geologic map was produced in four types of rock and soil: hard rock, soft rock, consolidated soil, and unconsolidated soil (García-Rodríguez et al., 2008).There are two lithological categories with relatively high landslide density: hard rock (43.2%) including pyroclastic deposits and associated volcaniclastics and un- A digital terrain model (DTM) can be used to classify the local relief and locate points of maximum and minimum heights.A model with a 100-m cell size was created from 20-m contour lines on the 1:25 000 topographic maps.The cell size was chosen for its suitability for work at a regional scale.Some terrain attributes, such as slope gradient and aspect, were derived from the DTM.Slope gradient was calculated using a 3×3 moving window based on of Horn's (1981) algorithm.For slopes of uniform isotropic material, an increased slope gradient correlates with an increased likelihood of failure.However, the soil thickness and strength may vary over a wide range among sites.Aspect can be defined as the slope direction, which identifies the downslope direction of the maximum rate of elevation change.The aspect of a slope can influence landslide initiation because it affects moisture retention and vegetation cover, which in turn affect soil strength and susceptibility to landslides.The amount of rainfall on a slope may also vary depending on its aspect (Wieczorek et al., 1997).
Terrain roughness is a measure of the undulation or relief of the topographic surface.The analysis of texture within a digital image is closely allied to the geomorphometric measurement of roughness.In fact, the variation of roughness embodies two primary scales: grain (or image resolution, 100 m in our case) and texture.Grain refers to the longest significant wavelength of a terrain surface, while texture refers to the shortest one.In order to calculate the terrain roughness, we applied spatial variability function to the DTM (Mardia, 1972;Band, 1989).The spatial variability function (R) measures the dispersion of the vector perpendicular to the surface; for example, for a nearly flat terrain, the perpendicular vectors to the surface points will be approximately parallel, and this will give a low dispersion value.The unit vector perpendicular to the surface at point i is given by an expression that depends of the slope and aspect at point i ( Upton and Fingleton, 1989).The roughness w is a function of R which is obtained as the square sum of the vector coordinates for neighbouring points and the neighbour environment (n).
For the rainfall factor, we created a mean annual precipitation map with a resolution of 100×100 m.This was done through Kriging interpolation (Isaaks and Srivastava, 1989) from the precipitation database compiled by the SNET for the period 1961-1990.
In addition, the role of vegetation in slope stability is considerable.Some types of land use/cover, especially woody vegetation with large and strong root systems, provide both hydrological and mechanical effects that generally stabilize slopes (Montgomery et al., 2000).In contrast, more landslides may be initiated in unvegetated areas.Therefore, land use/cover map from SNET was reclassified in 13 classes: urban areas, forests, annual, mixed and permanent crops, moisture areas, swamps, mining, grasses, bush vegetation, industrial areas, artificial green, areas and lakes (García-Rodríguez et al., 2008).

Artificial Neural Networks model
ANNs are generic non-linear function approximators that were developed by McCulloch and Pitts (1943) and extensively used for pattern recognition and classification.ANNs are networks of highly interconnected neural computing elements that have the ability to respond to input stimuli and to learn to adapt to the environment.ANN establishes rules during the learning phase and uses these rules to predict outputs.The input neurons in the neural network are intrinsic factors to the slope instability, such as topographic, climatic, geological, and geotechnical data from different sources, and are stored in a GIS.
In order to cope with non-linear separable problems, such as the landslide phenomena, a complex model is needed.ANNs have shown their effectiveness dealing with nonlinearity on several occasions (Ercanoglu et al., 2005).Because ANNs have the ability to handle imprecise and fuzzy data, they can be used to work with continuous, categorical, and binary data without violating any assumptions.These techniques have been successfully applied to many problems, including forecasting and prediction problems (Bishop, 1996).The ANN model is employed to analyze specific elements related to the study area that contributed to landsliding in the past.The resulting information can then be used in the prediction of areas that may face landsliding in the future.The ANNs are programmed with error correction learning, which means that some desired response for the system must be known (Werbos, 1974;Parker, 1985).
The behaviour of an ANN depends on the architecture of the network and on both the weights assigned to the connections and the transfer function (Fig. 4).In our case, we have chosen seven input layers, one hidden layer, and one output layer.First, we present the system with the input data and obtain the output.Second, we adjust the system to make the output closer to what it should be.The first stage is referred to as feed-forward, and the second as back-propagation.

Back-propagation algorithm
The back-propagation algorithm was applied to calculate the weights between the input layer and the hidden layer, and between the hidden layer and the output layer.This was accomplished by modifying the number of hidden nodes, the learning rate and the momentum parameters until the aim was reached (minimum RMS values).The nodes perform nonlinear input-output transformations by means of sigmoid activation functions.This structure of nodes and connections, known as the network topology, together with the weights of the connections, determines the final behaviour of the network.
The sequence of the back-propagation algorithm is presented in Fig. 5.The receiving node sums the weighted signals from all the nodes to which it is connected in the pre-ceding layer.The expression is explicated by the equation h j (Eq. 1) where w ij is the weight between the node i and the node j , and x is the output value from node i.The value produced by the hidden node j is the activation function f evaluated at the sum produced within the nodej (Eq.2): o j is the output value from node j of the hidden layer.In turn, the output value is a function of the weight between the hidden and output layers, and the outputs of the input nodes (Eq.3): Where y is the final response variable of the ANN.The activation function f is normally a non-linear sigmoid function, which is applied to the sum f of the weight of the inputs before proceeding to the next layer.It is used the sigmoid function (Eq.4): which is easy to calculate because of its peculiar characteristic that its derivative is expressed with the same function, as shown in (Eq.5), When developing a neural network for a particular problem, the data set is usually separated into two subsets: training and validation data.In the training set, the selection of samples indicates the most relevant aspects for solving the problem, while the validation data are useful for the evaluation of the neural network.The network architecture can be modified, such as changing the number nodes in the hidden layer, learning rate and momentum parameter.The combination selected was [1, 0.9, 0.1], respectively.From the initial data, 80% was chosen randomly for training and the remaining 20% was used for validation.Initially, the network is provided with random connection weights which represents the input to the network and the expected output.The weights are automatically modified to reduce the output error through an iterative procedure of back-propagation of errors.Consequently, the process is repeated until the tests are satisfactory.
Using an ANN as a linear classifier would mean the use zero hidden layers.In our study, we have used one hidden layer, and tried several neurons within that layer.The more layers and the more neurons within each layer that are used, the more non-linearity that is obtained.Theoretically, the more non-linearity there is, the better the model would be for a natural phenomenon; however, more training samples would be needed to estimate the weights.We have observed that a large number of weights lead to poor generalizations, because we have a limited number of training samples.The training rate is the other parameter of the ANN that we tuned although the convergence of the network was not very sensitive to it.Finally, 1000 iterations were taken for each test.
While better results were seen with the higher the number of iterations, we observed that for most of the trials, stabilization happened around 200 iterations.

Validation process
The total error of the model is defined by the formula (Eq.6): This error is a measure of the discrepancy between the networks output value and the target (desired output) value.
Neural network training is performed by trying to minimize the total error for the training set, as a function of the weights.
The error is back-propagated through the neural network until the system minimized the sum of errors E T by changing the weights between the layers.The adjusted weights can be expressed by the generalized delta equation (Eq.7): where w ij is the incremental difference of weight, and η is the learning rate parameter, positive and less than one (Eq.8): The landslide susceptibility model must measure the accuracy of the results in order to know the scientific significance (Chung andFabbri, 1999, 2003).The validation of the analysis was carried out by comparing the training sites and the estimated landslide map obtained by applying the weights derived from the ANN model.
A confusion matrix (contingency matrix) was calculated to determine the accuracy of a classification result by comparing the location and class of each ground truth pixel with the corresponding location and class in the classification image.The overall accuracy was calculated by summing the number of pixels classified correctly and dividing by the total number of pixels.The ground truth Region of Interest (ROI) defines the true class of the pixels.The pixels classified correctly are found along the diagonal of the confusion matrix (Table 1) which lists the number of pixels that were classified into the correct ground truth class.The results of the ANN process have an overall accuracy (95.1%) and Kappa coefficient (0.9013) from the confusion matrix (Table 1).The Kappa coefficient (k) is another measure of the classification accuracy.It is calculated by the equation (Eq.9): where N is the total number of pixels in all the ground truth classes, x kk is the element of the confusion matrix diagonals, and x k x k is the elements of the ground truth pixels in a class times the sum of the classified pixels in that class summed over all classes.
The producer accuracy (PA) and user accuracy (UA) are shown in Table 1.The PA is a measure indicating the probability that the classifier has labelled an image pixel into Class A, given that the ground truth is Class A. In the confusion matrix example, the region #1 class (landslide) has 2867 ground truth pixels, where 2592 pixels are classified correctly.The producer accuracy is the ratio 2592/2867, or 90.41%.UA is a measure indicating the probability that a pixel is Class A, given that the classifier has labelled the pixel into Class A. The classifier has labelled 2625 pixels as the region #1 class, and 2592 pixels are classified correctly.The user accuracy is the ratio 2592/2625 or 98.74%.
The ROC visualizes a classifier's performance in order to select the proper decision threshold.It provides a probability of detection versus a probability of false alarm curve (Fawcett, 2006).ROCs are equivalent to prediction and success-rate curves proposed by Chung and Fabbri (2003).As can be observed from the ROC analysis (Fig. 6), the LR approach has a better performance than the ANN for a small range (between 0.0 and 0.07) of false alarms; however, when false alarms increase the ANN approach works better than LR.When false alarms are about 0.4, both techniques work similarly.Some authors take the area under the ROC curve (AU-ROC) as a measurement of the model's accuracy (Hosmer and Lemeshow, 2000;Lee, 2005).The area is a measure of discrimination, that is, the ability of the technique to classify those pixels correctly with and without some probability of landslide.This threshold-independent measure of discrimination between both classes takes values between 0.5 (no discrimination) and 1 (perfect discrimination).Therefore, the closer the ROC plot is to the upper left corner, the higher the overall accuracy of the test is.An area of 1 represents a perfect test and an area of 0.5 represents a worthless test.A rough guide for knowing the accuracy of a classifier is 0.5-0.6 for a fail, 0.6-0.7 for poor, 0.7-0.8 for fair, 0.8-0.9 for good, and 0.9-1 for excellent.The area corresponding to the ANN model is 0.963; while in the LR model, it is 0.980.This indicates a very high predictive capacity for both models.Although the value is slightly better for the LR model,  it cannot be said that LR is better than ANN is.What is interesting in the plot is the shape of the curves and how the curves cross, which we will discuss further in Sect.6.

Susceptibility map
The main assumption in slope instability modelling is that the past occurrence of landslides at a specific site is indicative of the potential for future landslides to occur in sites with similar characteristics.The of susceptibility requires data input of variables representing physical parameters known to contribute to the initiation of landslides.An ANN model was estimated by incorporating the physical parameters contributing to the formation of landslides.The terrain conditions of El Salvador were fed into the ANN model to calculate the landslide susceptibility.The susceptibility values obtained by this model were converted to a raster file in a GIS-based regional susceptibility, and a landslide susceptibility map for El Salvador was produced (García-Rodríguez, 2009).
The results are shown in a Susceptibility Map (S) (Fig. 7), which defines the slope stability of an area into categories that range from stable to unstable.The susceptibility map in Fig. 7 is a representation of the histogram of the probability maps with different categories (Ayalew and Yamagishi, 2005).For our study, five susceptibility classes are chosen: very low, low, medium, high, and very high (Table 2).The classification of susceptibility is checked by the examination of the number of landslides locations within each class, the high and very high classes have 83% of landslides which supplies robustness to the ANN approach.For its representation, we used a color scheme that relates warm colors (red, orange, and yellow) to unstable and marginally unstable areas and cool colors (green shades) to stable areas.Similar to other studies, the landslide susceptibility model using ANN (Rossi et al., 2010) has a strong dichotomy in zonations, with a limited number of intermediate susceptibilities (0.2-0.9).

Discussion and conclusions
In previous sections, a landslide susceptibility map of a regional scale of El Salvador was derived by using an ANN and employing a back-propagation learning algorithm (Fig. 7).This ANN was programmed with the landslide inventory (80% training dataset), and the information was included in a Geographic Information System (GIS).The factors considered were slope gradient, slope aspect, elevation, land-use, lithology, mean annual precipitation, and roughness terrain which were integrated on a GIS (García-Rodríguez et al., 2008).The validation process was performed with a confusion matrix for an overall accuracy of 95.1%.
Recently, we have worked with LR techniques using the same input data, and the results illustrated the importance of terrain roughness and soil type as key factors within the model; using only these two variables, the analysis returned a significance level of 89.4% (García-Rodríguez et al., 2008, 2009).In the LR, the was estimated according to the logistic formula, which provided a deterministic model for the data and yielded weighted factors for each contributing factor.It also allowed the calculation of the odds ratio, which represents the degree of risk associated with each factor.However, LR fits the data to a fixed function, so it is less flexible and less capable of solving complex problems compared to ANNs.The advantages of ANNs and its possible application to the evaluation of landslide susceptibility come from the remarkable information processing characteristics of the artificial simulated biological system, including the ability to handle imprecise and fuzzy information, fault and failure tolerance, high parallelism, non-linearity, robustness, capability to generalize, and tolerance to noise data (Basheer and Hajmeer, 2000).On the other hand, a disadvantage is that they are known as black-box methods, since it is not known exactly how ANNs learn particular problems and apply the extracted rules to new cases, or how conclusions can be drawn from the trained networks (Gómez and Kavzoglu, 2005).Although the accuracy produced in our work is of 95.1% for the ANN versus 89.4% for the LR, the black-box characteristic of the former does not allow for the investigation of which variables are more influential to the response variable.However, this is possible with the LR because the weight of each factor is known.
Given that both the ANN and LR methods have great AU-ROC, only the ROC curves provides some subtle differences in the modus operandi of each classifier.For fewer false alarms, the sensibility of LR is superior to ANN, while for higher false alarm the superiority is for the ANN until a value of approximately of 0.4 of false alarms, when both classifiers perform similarly.This characteristic translates into Fig.6, where it can be observed how the ANN gives broader areas of susceptibility than its correspondent for LR, as developed by García-Rodríguez et al. (2008).The LR shows a more clearly defined area of susceptibility that corresponds to a steep ROC curve, while ANN gives a spread-defined area of susceptibility that corresponds to a gradual rise of the ROC curve.
ANNs have been used in areas which were once reserved for multivariate statistical analysis.Owing to this they are often considered to be statistical methods; however, it is necessary to be aware of subtle conceptual differences between these two methods in order to use ANNs effectively.In particular, we are using a specific type of ANN called backpropagation with one hidden layer, this is closely related to the statistic method called projection pursuit regression, which projects the data matrix of explanatory variables in the optimal direction.Compared with the statistical methods, neural networks allow the target classes to be defined with more consideration to their distribution in the corresponding domain of each data source (Zhou, 1999).The use of a GIS where the information is stored in layers facilitates implementation of statistical or computational models, such as we have showed in this paper.
A literature review of recent works reveals that authors found ANN to be superior to other methods; particularly, some works compared LR and ANN using different variables and data; most of them confirm the superiority of ANN over LR.The first to compare these two methods were Yelsilnacar and Topal (2005); we have mentioned it above, they found ANN to be more realistic for landslide susceptibility mapping.More recently, other authors also worked with LR and ANN for landslide susceptibility maps.Falaschi et al. (2009) compared LR and ANN and found ANN to be superior to LR. Yilmaz (2009) found ANN superior to frequency ratio and logistic regression; they both used ROC as the evaluation element.Pradhan and Lee (2010) also found ANN to be superior to frequency ratio and logistic regression for data for the Klang Valley, Malaysia; they performed trials with several different factors; in all cases, ANN was superior to LR.
Given the idiosyncrasy of the specific problem dealt with in this paper, such as scale and inventory, the ANN model is superior to the LR model in general terms, even though the AUROC is slightly better for the LR model, but there are others characteristics that should be taken into account when performing the analysis, such as flexibility of model, explicability of variables, and ROC shape.Finally, it could be said that there is no a best model for all cases, it depends on many factors including the number of training samples, the independent variables being considered, and the scale.

Fig. 1 .
Fig. 1.DEM data together with the main epicentres of 2001 in El Salvador, tectonic features and landslide locations of 13 January 2001 earthquake (from ESRI shaded relief imagery which was developed using GTOPO30, Shuttle Radar Topography Mission (SRTM), and National Elevation Data (NED) data from the USGS).

Fig. 2 .
Fig. 2. Earthquake-triggered landslide in Las Colinas, Santa Tecla (El Salvador, 13 January 2001), where 585 people died.The epicenter was located off the western coast of El Salvador in the subduction zone between the Cocos and Caribbean plates with a magnitude of M w 7.7 and a focal depth of 40 km(Benito et al., 2004).

Fig. 4 .
Fig. 4. Artificial Neural Network Architecture with seven neurons in the input layer to correspond to the independent variables.The w ij and w j k represent the weights, o j represents values for the hidden layer, and y k represents the dependent variable, which gives of the current pixel's degree of susceptibility for a landslide.

Fig. 5 .
Fig. 5.A back-propagation neural network with the sigmoid function used as activation function of the ANN.

Fig. 6 .
Fig. 6.The ROC curve calculated for the ANN model.The red color represents the LR model and blue color represents the ANN.

Fig. 7 .
Fig. 7. Susceptibility map calculated from a methodology based on the ANN model.

Table 1 .
Confusion matrix representing the number of pixels and percentage of ground truth for two regions: landslide (region #1) and non-landslide (region #2) and producer and user accuracy in number of pixels and percentage.

Table 2 .
Number of pixels and percentage of susceptibility map classified into five levels: very low, low, medium, high, and very high.