Assessment of Landslide Susceptibility using Weight of Evidence and Frequency Ratio Model in Shahpur Valley, Eastern Hindu Kush

This study assessed landslide susceptibility using the Weight of Evidence (WoE) and Frequency Ratio (FR) models in Shahpur valley, situated in the eastern Hindu Kush. Here, landslides are a recurrent phenomenon that disrupts the natural environment and causes huge property damage as well as human losses every year. This damage is expected to increase due to the high rate of deforestation in the region, population growth, agricultural expansion and infrastructural development on the slopes. Initially, the landslide inventory map was prepared from a SPOT-5 satellite image and verified through frequent field visits. Seven landslide contributing factors were selected: surface geology, fault lines, slope aspect, slope gradient, land use, proximity to roads and proximity to streams. The WoE and FR models were used to analyze the relationship between landslide occurrence and its causal factors. Based on the WoE and FR models, landslide susceptibility zonation maps were prepared and reclassified into zones ranging from very low to very high susceptibility. Finally, the resultant landslide susceptibility maps were validated using the success rate curve and prediction rate curve approach.


Introduction
Globally, the frequency of geological and hydro-meteorological disasters, along with their devastating consequences, has increased in the last two decades (Rahman et al. 2017). Landslides are among the geological hazards that cause huge damage to human life, property, and infrastructure (Jehan & Ahmad 2006). The Hindu Kush-Himalayan (HKH) is a young mountain system where landslides, avalanches, floods, and earthquakes are very common (A. Rahman & Shaw, 2014; G. Rahman et al., 2017). In this region, landsliding is a recurrent phenomenon and is mostly initiated by seismic activity or rainfall (Kamp et al., 2010; Regmi et al., 2014). Kanungo et al. (2009) reported that landslides accounted for five percent of all natural hazards globally during 1990-2005, and that this share tends to increase in the future because of seismic activity, rainfall intensity and anthropogenic activities on fragile slopes (Pareek et al., 2010; Conforti et al., 2014).
Landsliding is a complex geomorphic process (Nandi & Shakoor, 2010; Allen et al., 2011) mainly triggered by seismic activity, drainage pattern, land cover, slope gradient and rainfall (Sudmeier-Rieux et al., 2012; G. Rahman et al., 2017). Similarly, anthropogenic activities such as road construction, expansion of human settlements, deforestation and agricultural activities on fragile slopes further intensify landslide susceptibility (Rahman et al. 2017).
Attention has also been paid to the role of landslides in disturbing ecological systems. The environmental effects of landslides include changes in agricultural activities, river morphology, and natural ecosystems caused by landslide dams (Nakamura et al., 2000). Other effects are sedimentation in river channels and flash floods due to the breaching of landslide dams. Landslides also disturb the natural habitat of certain endangered species in the susceptible zone and affect the biodiversity of the affected area; therefore, strict forest preservation measures are required to reduce environmental damage (Geertsema & Pojar, 2007).
Landslide susceptibility is basically the geo-spatial probability of slope failure. During the last decade, numerous scientific studies, including Lee (2004), Chen and Wang (2007), Kavzoglu et al. (2014), Bourenane et al. (2016), Ding et al. (2017) and G. Rahman et al. (2017), have been conducted on fragile mountain regions and have developed a wide range of methods for analyzing landslide-susceptible areas. Quantitative, semi-quantitative and qualitative techniques, including statistical and deterministic approaches, have been used in various studies to assess landslide susceptibility (C. J. Van Westen et al., 2008). These methods identify areas having similar characteristics with respect to the geological and geomorphological settings of landslide-prone areas (Kouli et al., 2010).
The spatial probability of landslides can be predicted by applying various quantitative methodologies such as frequency ratio, information value, weight of evidence, fuzzy neural network, logistic regression, and many others. These methods depend on a landslide inventory and thematic maps of its causal factors (Hussin et al., 2016). In recent years, geospatial technology has been widely used for landslide susceptibility mapping, risk identification and management (Akbar & Ha, 2011). Geospatial technologies provide a framework for mapping landslide events and combining the causal factors to produce a landslide susceptibility map, and have therefore become an integral part of landslide susceptibility zonation (LSZ).
The HKH is a seismically active region, and hence most of the landslides have been triggered by seismic activity (Kamp et al., 2010). Developmental work in the HKH region is usually affected by recurrent landslide events. Thus, it is necessary to identify landslide-prone areas to minimize their adverse effects.

The Study Area
The study area, Shahpur valley, lies in the Hindu Raj Mountains. These mountains are considered an offshoot of the Hindu Kush mountain system (Dichter, 1967). The height of these mountains decreases from north to south. The latitudinal extent of the valley is 34° 52′ 31′′ to 35° 9′ 35′′, while the longitudinal extent is 72° 40′ 10′′ to 72° 48′ 44′′, as shown in Figure 1.

Methods and Material
The Shahpur valley was selected for detailed analysis of landslide susceptibility, and for that purpose various landslide causal factors were selected. Data collected from primary and secondary sources were used to achieve the objectives of this study (Figure 3). The past landslide sites were identified and mapped on a SPOT-5 image of April 2013 with 2.5 m resolution. A thorough field study was also conducted to confirm the landslide sites on the ground and to identify the landslide triggering factors with the help of local community knowledge. Seven factors involved in slope instability were identified: surface geology, proximity to fault lines, slope gradient, slope aspect, land use/land cover, proximity to roads and proximity to streams.
Data regarding landslide causal factors were acquired from various sources. Surface geology and tectonics data were taken from the geological map of North Pakistan. The administrative boundaries and settlement shapefiles were prepared from topographic sheets (RF 1:50,000) obtained from the Survey of Pakistan.
Spatial features of the road network were acquired from the office of the Communication and Works Department, Peshawar. The land use/land cover map was obtained by applying supervised classification to the SPOT satellite image using ArcGIS 10.2. ASTER GDEM with 30 m resolution was used for extracting the slope angle, slope aspect and hydrography of the study area. Furthermore, a detailed field survey was conducted to validate the sites of already activated and potentially active landslide areas.
GIS and remote sensing were used for the preparation of spatial databases and the landslide inventory map. The weight of evidence and frequency ratio models are bivariate statistical methodologies in which the importance of each factor, or of combined factors, is individually analyzed with respect to the spatial distribution of existing landslides. The assumption in both models is that the factors that influenced the incidence of landslides in the past will also trigger new landslides in the future.

Weight of Evidence Model
The weight of evidence model (Bonham-Carter et al., 1989; Bonham-Carter, 1994) is based on Eq. 1 and Eq. 2:
$$W^{+} = \ln \frac{P\{B \mid A\}}{P\{B \mid \bar{A}\}} \quad (1)$$
$$W^{-} = \ln \frac{P\{\bar{B} \mid A\}}{P\{\bar{B} \mid \bar{A}\}} \quad (2)$$
In the above equations, $P$ is the probability while $\ln$ is the natural log. $B$ and $\bar{B}$ respectively represent the presence and absence of a potential landslide evidence factor. Likewise, $A$ and $\bar{A}$ are the presence and absence of landslide, respectively. For the calculation of the weight of each causal factor contributing to landslide occurrence, Eq. 3 and Eq. 4 have been used after C. Van Westen et al. (2003):
$$W^{+} = \ln \frac{N_{pix1} / (N_{pix1} + N_{pix2})}{N_{pix3} / (N_{pix3} + N_{pix4})} \quad (3)$$
$$W^{-} = \ln \frac{N_{pix2} / (N_{pix1} + N_{pix2})}{N_{pix4} / (N_{pix3} + N_{pix4})} \quad (4)$$
where $N_{pix1}$ is the number of pixels expressing the existence of both the landslide contributing factor and landslides; $N_{pix2}$ represents the presence of landslide and absence of the landslide contributing factor; and $N_{pix3}$ represents the presence of the landslide contributing factor and absence of landslide.
Similarly, $N_{pix4}$ represents the absence of both landslide and the landslide contributing factor. The final weight, expressed as $W_{c}$, was calculated using Eq. 5:
$$W_{c} = W^{+} - W^{-} \quad (5)$$
where $W_{c}$ is the difference of $W^{+}$ and $W^{-}$. This elucidates the spatial relationship between each landslide contributing factor and landslides.
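As an illustration, the three weights for a single factor class can be computed directly from the four pixel counts described above (landslide and factor both present, landslide only, factor only, neither). The following is a minimal sketch rather than the procedure used in this study; the function name and the pixel counts are hypothetical and are not taken from Table 1.

```python
import math


def weights_of_evidence(npix1, npix2, npix3, npix4):
    """Return (W+, W-, Wc) for one factor class from four pixel counts.

    npix1: landslide present, factor class present
    npix2: landslide present, factor class absent
    npix3: landslide absent,  factor class present
    npix4: landslide absent,  factor class absent
    """
    if min(npix1, npix2, npix3, npix4) <= 0:
        # A zero count would make a logarithm or a ratio undefined.
        raise ValueError("all four pixel counts must be positive")
    w_plus = math.log((npix1 / (npix1 + npix2)) / (npix3 / (npix3 + npix4)))
    w_minus = math.log((npix2 / (npix1 + npix2)) / (npix4 / (npix3 + npix4)))
    return w_plus, w_minus, w_plus - w_minus


# Hypothetical counts for one class (assumption, for illustration only).
w_plus, w_minus, w_c = weights_of_evidence(400, 600, 9600, 89400)
print(f"W+ = {w_plus:.2f}, W- = {w_minus:.2f}, Wc = {w_c:.2f}")
```

A positive contrast indicates that the class is over-represented among landslide pixels, while a class whose share of landslide pixels equals its share of non-landslide pixels yields weights of zero.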

Frequency Ratio Model
The effect of landslide contributing factors on the occurrence of landsliding was also examined through the frequency ratio model. The frequency ratio is the ratio of the area where landslides occurred to the total study area, and is also the ratio of the probability of landslide occurrence to non-occurrence for a given attribute (Bonham-Carter, 1994; Lee & Talib, 2005). In the frequency ratio model, a statistical value is calculated for each class of a factor map using Eq. 6:
$$FR = \frac{N_{pix}(S_{i}) / N_{pix}(N_{i})}{\sum N_{pix}(S_{i}) / \sum N_{pix}(N_{i})} \quad (6)$$
where $N_{pix}(S_{i})$ is the number of landslide pixels containing class $i$, $N_{pix}(N_{i})$ is the total number of pixels of class $i$, $\sum N_{pix}(S_{i})$ is the total number of landslide pixels in the entire study area, and $\sum N_{pix}(N_{i})$ is the total number of pixels in the entire study area.
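The frequency ratio of each class can be sketched as the landslide density of that class divided by the landslide density of the whole study area. The example below is illustrative only; the class labels and pixel counts are hypothetical assumptions, not values from this study.

```python
def frequency_ratio(landslide_pixels, class_pixels):
    """Return {class: FR}, where FR = (landslide px in class / px in class)
    divided by (all landslide px / all px in the study area)."""
    total_landslide = sum(landslide_pixels.values())
    total_pixels = sum(class_pixels.values())
    if total_landslide == 0 or total_pixels == 0:
        raise ValueError("pixel totals must be positive")
    overall_density = total_landslide / total_pixels
    return {
        name: (landslide_pixels.get(name, 0) / count) / overall_density
        for name, count in class_pixels.items()
        if count > 0  # skip empty classes to avoid division by zero
    }


# Hypothetical pixel counts for three buffer classes (assumption).
landslide_px = {"0-100 m": 300, "101-300 m": 500, ">300 m": 200}
class_px = {"0-100 m": 10000, "101-300 m": 30000, ">300 m": 60000}
fr = frequency_ratio(landslide_px, class_px)
for name, value in fr.items():
    print(f"{name}: FR = {value:.2f}")
```

A value above 1 indicates a class with a higher landslide density than the study area as a whole, and a value below 1 indicates a lower density.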

Landslide Susceptibility Index (LSI)
The LSI for both the frequency ratio and weight of evidence models was generated by combining the landslide causal/contributing factors in GIS, based on the $W_{c}$ and $FR$ values, through overlay analysis using Eq. 7:
$$LSI = \sum W_{c} \quad \text{and} \quad LSI = \sum FR \quad (7)$$
where $\sum W_{c}$ is the total derived weight of the weight of evidence model and $\sum FR$ is the total derived weight of the frequency ratio model.
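The overlay step amounts to replacing each class code in every factor raster with its derived weight and summing the layers cell by cell. A minimal sketch with plain nested lists is given below; the two small rasters, their class codes and the weights are hypothetical, and an actual workflow would perform the same sum with GIS raster tools.

```python
def susceptibility_index(factor_rasters, class_weights):
    """Sum the per-class weights of all factor layers, cell by cell.

    factor_rasters: {factor name: 2-D list of class codes}
    class_weights:  {factor name: {class code: weight (Wc or FR)}}
    """
    names = list(factor_rasters)
    rows = len(factor_rasters[names[0]])
    cols = len(factor_rasters[names[0]][0])
    lsi = [[0.0] * cols for _ in range(rows)]
    for name in names:
        weights = class_weights[name]
        grid = factor_rasters[name]
        for r in range(rows):
            for c in range(cols):
                lsi[r][c] += weights[grid[r][c]]
    return lsi


# Two hypothetical 2 x 2 factor rasters with class codes 1 and 2 (assumption).
rasters = {
    "slope": [[1, 2], [2, 1]],
    "road": [[1, 1], [2, 2]],
}
weights = {
    "slope": {1: -0.5, 2: 0.25},
    "road": {1: 0.75, 2: -0.25},
}
lsi = susceptibility_index(rasters, weights)
print(lsi)
```

The same function serves both models: passing contrast weights gives the WoE index, and passing frequency ratios gives the FR index.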

Results and Discussion
In this paper, the frequency ratio and weight of evidence models are used to determine and geo-visualize landslide susceptibility and to produce a susceptibility zonation map, an approach that has been extensively applied in many parts of the world for landslide risk reduction (Shahabi et al., 2015).

Inventory of Landslides in Shahpur Valley
The past landslide sites were marked on the multi-spectral SPOT satellite image of April 2013 and verified through a series of field visits. About three hundred landslides of varying sizes were marked on the satellite image and verified through field investigation (Figure 4). Among these landslides, 50% were debris flow and debris, while 40% consisted of mudflow and mudslide (Figure 4). Only 10% were rockfall and rockslide. This landslide inventory was randomly divided into two groups: training landslides (80%) and validation landslides (20%), as shown in Figure 5. These landslides were then rasterized to find the number of pixels in every class of a factor map for the calculation of the frequency ratio and weight of evidence values.
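A random 80/20 partition of the inventory can be sketched as follows. The inventory size of 300, the integer identifiers and the fixed seed are assumptions made for reproducibility of the example; the study does not state how the random division was implemented.

```python
import random


def split_inventory(landslide_ids, train_fraction=0.8, seed=42):
    """Shuffle the landslide identifiers and split them into
    training and validation groups."""
    ids = list(landslide_ids)
    random.Random(seed).shuffle(ids)  # local generator, reproducible
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


# Hypothetical identifiers 1..300 standing in for the mapped landslides.
training, validation = split_inventory(range(1, 301))
print(len(training), len(validation))
```

Keeping the validation group out of the weight calculation is what allows the prediction rate curve to test the model on landslides it has not seen.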

Landslide Contributing/ causal factors
In this study, surface lithology/geology, stream proximity (stream buffers), land cover, slope aspect, slope gradient, fault lines and the road network were selected as landslide contributing factors (Figure 6). The WoE and FR statistical models, which are based on the correlation between past landslides and causal factors, were used to define the weight of each class of every factor map. In the WoE model the positive weight (W+), negative weight (W−) and contrast weight (Wc) were calculated, while for the FR model the frequency ratio was calculated for each class of a contributing factor map (Table 1).
... brought by the Indus River and its tributaries, derived from the Kohistan island arc terrane (Baig, 1990). Similar results were found in the FR values. The highest negative correlation was found in the geology class Jijal Ultramafics, with a Wc value of -3.64 and an FR of 0.03 (Table 1).

Fault Line
The occurrence of landslides has a strong correlation with fault lines (Korup, 2004; G. Rahman et al., 2019). The existence of fault lines on steep slopes provides favorable settings for slope failure. The study area has a complex tectonic structure, which is considered a causal factor for slope instability. It is evident from the analysis that the tectonic structures have a strong correlation with landslide occurrence. According to the WoE model, the highest positive Wc value (1.56) was found in the buffer zone of 0-250 meters, followed by a Wc value of 0.77 in the 251-500 meters buffer zone, and the lowest (-1.6) was in the area farther than 1000 meters. Similar results were found in the frequency ratio model: the highest FR value (2.87) was in the buffer zone of 0-250 meters and the lowest was in the buffer zone farther than 1000 meters.

Slope Gradient
The slope gradient affects the distribution of population, human activities and natural resources. Likewise, landslide distribution has a close association with slope gradient, which acts as a controlling factor in slope failure. Slope gradient has a direct relation with slope failure, and the chances of landslide incidence escalate with an increase in slope gradient. It was observed during field visits that the high landslide density areas were on the slopes along roads and streams, where lateral cutting was a dominant factor. The map of slope gradient for the study area was generated in GIS from ASTER GDEM with 30 meters spatial resolution (Figure 6c). The analysis of both WoE and FR shows that the role of the 31-45 degree slope class in slope failure is higher, as the highest Wc value (0.29) and FR value (1.14) were found in this class of slope gradient (Table 1), while the 0-5 and 6-15 degree slope classes have a negative correlation with landslides.

Slope Aspect
Slope aspect does not have a direct impact on landslide occurrence but indirectly accelerates the landslide process. The sunlight intensity and duration, amount of rainfall, moisture-holding capacity and distribution of vegetation are all affected by slope direction. The analysis reveals that the south-facing slope has the strongest positive correlation with landslides, as the Wc value (0.53) and FR value (1.55) are highest in this class, followed by the northwest-facing slope (Wc 0.21, FR 1.21) (Table 1). In the study area, the high number of landslides on south-facing slopes may be due to their high exposure to sunlight and the ample rainfall they receive as the windward side.

Land Use/ Land Cover
Forest cover protects mountain slopes from weathering and mass wasting processes, as the roots hold the underlying soil and keep the slope stable. Population growth has increased the demand for wood and for land to grow food, which has disturbed the slopes of almost all the mountainous regions of the world and has led to slope instability. The land cover map of Shahpur valley was developed from the SPOT satellite image (Figure 6a). To analyze the influence of land use/land cover on landslides, a statistical weight for each class of land use was calculated using the WoE and frequency ratio models. The highest weight for both WoE (Wc = 0.80) and FR (2.17) was found for the stream/torrent class. This was because, in the study area, streams and torrents cause high lateral erosion and thus initiate new slides. The second-highest positive correlation with landslides was found for agricultural land. In the study area, forest cover is mostly cleared for agricultural activities. Agriculture is practiced on terraced fields, which also makes the slopes susceptible to landslides. The analysis showed that barren land has a negative correlation with landslides; in the study area the land is barren because of the presence of hard rock masses, which do not support any vegetation on the higher slopes.

Proximity to Road
Road construction often disturbs the slope and expedites weathering and mass wasting processes, thus increasing the probability of landslide occurrence. Roads also provide accessibility and accelerate the process of deforestation. In the current study, proximity to road is used as a causal factor of landslides. The results show a high positive correlation with road proximity up to 300 meters. The highest Wc value (0.68) and FR value (1.88) were found within 0-100 meters of a road. This elucidates that slopes near roads have a higher probability of failure.

Proximity to Stream/torrent
To examine the relationship between streams/torrents and landslides, the WoE and frequency ratio statistical models were applied. The analysis showed that both WoE and FR have higher values near streams, which indicates a high landslide probability in this zone. The highest Wc value (0.89) and FR value (2.08) were found within 0-100 meters of a stream (Table 1). The results show that the zone within 400 meters of a stream has a positive correlation with landslide probability. The highest negative correlation was found in the buffer zone farther than 500 meters from a stream.

Landslide Susceptibility Zonation
Landslides are a common menace to property, human lives and infrastructure in Shahpur valley. For their mitigation, the first important step is to identify highly susceptible areas. The LSZ map divides the region into zones ranging from very low to very high susceptibility, based on the integration of landslide causal factors. GIS provides a framework for the integration of different landslide causal factors to produce the LSZ map. To minimize subjectivity, a quantitative weight was applied to each class of the factor maps, based on the WoE and FR models, for the generation of the LSZ map of Shahpur valley.
The LSZ map was created based on both the WoE and FR models by summing the relative weights of each class of the factor maps using the following expressions:
$$LSI = \sum W_{c}$$
$$LSI = \sum FR$$
where $\sum W_{c}$ is the total derived weight of each class of the factor maps for the WoE model, while $\sum FR$ is the sum of the derived weights of each class of the factor maps for the frequency ratio model. In both cases, the higher the value of LSI, the greater the probability of landslide incidence. Based on LSI, the study area was divided into zones of very high to very low susceptibility (Figure 7).
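Dividing the LSI values into susceptibility zones can be sketched as a simple reclassification. The study does not state which classification scheme was used, so the equal-count (quantile) breaks, the five-class scheme and the three intermediate labels below are assumptions for illustration only, as are the sample values.

```python
import bisect

# Zone labels from lowest to highest LSI; the middle three are assumed.
LABELS = ["very low", "low", "moderate", "high", "very high"]


def quantile_breaks(values, n_classes=5):
    """Return n_classes - 1 break values giving equal-count groups."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * k) // n_classes]
            for k in range(1, n_classes)]


def classify(value, breaks, labels=LABELS):
    """Return the zone label for one LSI value.
    A value equal to a break falls into the higher zone."""
    return labels[bisect.bisect_right(breaks, value)]


# Hypothetical LSI values for 100 cells (assumption).
lsi_values = [v / 10 for v in range(100)]
breaks = quantile_breaks(lsi_values)
zones = [classify(v, breaks) for v in lsi_values]
print(breaks)
```

Other schemes, such as equal intervals or natural breaks, would only change how the break values are chosen; the reclassification step itself stays the same.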

Validation of Landslide Susceptibility Map
The landslide susceptibility map was validated using the success rate curve, based on the training landslides (80% of the total landslide inventory), and the prediction rate curve, based on the validation landslides (20% of the total landslide inventory). The success rate curve and prediction rate curve elucidate the accuracy of WoE and FR in relating the selected causal factors to landslide occurrences (Figure 8). Both curves were calculated using the LSI values, ordered from the very high to the very low susceptibility class and overlaid with the existing layer of landslide area through the geo-statistical tool in GIS. Cumulative percentages were calculated for both susceptibility class and landslide area, and susceptibility class was plotted on the x-axis and landslide area on the y-axis to generate both the success rate curve and the prediction rate curve.
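The construction of such a curve can be sketched by sorting cells from the highest to the lowest LSI and accumulating the percentage of area and the percentage of landslide cells. The function names and the ten-cell example are hypothetical; the area under the curve is added here as a common summary measure and is computed with the trapezoidal rule.

```python
def rate_curve(lsi_values, landslide_flags):
    """Return cumulative (% of area, % of landslide cells) points,
    starting from the most susceptible cell."""
    total_cells = len(lsi_values)
    total_slides = sum(landslide_flags)
    if total_slides == 0:
        raise ValueError("at least one landslide cell is required")
    order = sorted(range(total_cells),
                   key=lambda i: lsi_values[i], reverse=True)
    points = [(0.0, 0.0)]
    found = 0
    for rank, i in enumerate(order, start=1):
        found += landslide_flags[i]
        points.append((100.0 * rank / total_cells,
                       100.0 * found / total_slides))
    return points


def area_under_curve(points):
    """Trapezoidal area under the curve, scaled to the range 0-1."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / 10000.0


# Ten hypothetical cells; 1 marks a landslide cell (assumption).
lsi = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
flags = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
curve = rate_curve(lsi, flags)
auc = area_under_curve(curve)
print(f"AUC = {auc:.3f}")
```

Passing the training landslides as flags gives the success rate curve, and passing the validation landslides gives the prediction rate curve.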

Conclusion
In the current study, the frequency ratio and weight of evidence models were applied to develop landslide susceptibility maps. Initially, past landslides were marked on the SPOT-5 satellite image, validated through consecutive field visits and then plotted on a map. The landslide causal factors identified from the literature review included surface lithology, fault lines, land cover, slope gradient and aspect, and distance from streams and roads. The maps of these factors were prepared for susceptibility analysis. The role of each class of these factor maps in landslide occurrence was analyzed, and the assigned weights were calculated by implementing Bayesian probability models, i.e. weight of evidence and frequency ratio. The required susceptibility maps were generated using the ∑Wc and ∑FR values through overlay analysis in GIS.
The maps of landslide susceptibility were prepared based on both models and then validated using the success rate curve and prediction rate curve. It is further concluded that, in Shahpur valley, the frequency ratio model produced better results than the weight of evidence model for landslide susceptibility studies in the Hindu Kush region. This study can assist the disaster management authorities in developing location-specific mitigation measures for landslide hazards to avoid loss of life and damage to infrastructure in the future. The study concludes that landslide hazards may have negative impacts on agricultural activities, natural ecosystems, river morphology, human lives and infrastructure in the study area. In this regard, proper land use planning and strict forest preservation measures are required to reduce environmental damage.

Declaration
Availability of data and materials: The data were obtained from different government organizations, so it is not possible for us to share them under the rules and regulations governing data acquisition.