Articles | Volume 25, issue 11
https://doi.org/10.5194/nhess-25-4629-2025
Research article | 24 Nov 2025

Decoupling urban and non-urban landslides for susceptibility mapping in transitional landscapes: a case study from Southwestern Constantine, Algeria

Zakaria Matougui, Yacine Mohamed Daksi, Mehdi Dib, and Chaouki Benabbas
Abstract

This study develops a framework for decoupling and investigating urban and non-urban landslide mechanisms, focusing on Constantine, Algeria, a city with complex topography and high landslide susceptibility. The region presents a heterogeneous landscape, where dense urban zones coexist with bare rural areas, influencing slope stability differently. A landslide inventory of 184 events was compiled and classified into urban and non-urban categories. Using geospatial data (topography, hydrology, landcover, lithology) and machine learning models (Random Forest, XGBoost, LightGBM, Multi-Layer Perceptron, and Logistic Regression), landslide susceptibility maps were generated for three datasets: urban, non-urban, and mixed. Model performance was assessed using cross-validation and evaluation metrics (ROC-AUC, F1-score, precision, recall), while SHAP analysis provided insights into factor importance. The results reveal distinct landslide drivers across environments. In urban areas, landslides are primarily influenced by aspect, slope, and proximity to streams, while distance to roads plays a lesser role, likely due to engineered slopes and drainage infrastructure. In non-urban areas, distance to roads is the most critical factor, highlighting the destabilising effects of road cuts in rural landscapes. Slope and proximity to streams remain key determinants, with lithology playing a more significant role in naturally driven failures. This study underscores the importance of context-specific landslide modelling and the potential biases of using mixed urban and non-urban inventories. The findings provide actionable insights for targeted mitigation, land-use planning, and infrastructure design. By distinguishing between urban and non-urban landslides, this research bridges critical gaps in understanding landslide dynamics across diverse landscapes.

1 Introduction

In the face of rapid urbanization and the growing scarcity of land suitable for construction, particularly in hilly and mountainous regions, ensuring the safety of infrastructure has become a crucial challenge due to the prevailing soil instabilities and disorders. Accurately assessing these hazards is essential for informed spatial planning, enabling sustainable economic development and safeguarding the well-being of communities. This context, in which urban areas expand over hilly terrain, raises a number of challenges related to the assessment of soil instability due to the interplay of anthropogenic and natural factors. Urban development alters slope stability through construction activities, modifications to natural drainage (surface and subsurface water circulation) and increased loads, while deforestation and land use changes accelerate erosion and reduce natural stabilization (Benabbas, 2006; Hadji et al., 2013). Furthermore, the higher density of infrastructure in urban areas significantly alters and weakens the natural landscape, potentially intensifying pressure on slope stability and thereby increasing susceptibility to slope failures (Carrión-Mero et al., 2021; El Kechebour, 2015). In contrast, slope stability in rural regions is shaped predominantly by natural factors (Carrión-Mero et al., 2021), such as heavy rainfall, seismic events, and lithological evolution, leading to different instability mechanisms compared to urban settings.

In northern Algeria, many towns are built on largely loose soils (marl and clay) that are often unstable and particularly vulnerable to landslides. The city of Constantine, often referred to by geology and geotechnical specialists as “an open sky museum for landslides”, exemplifies this problem (Bougdal et al., 2007). As Algeria's third-largest urban area and the most important in the eastern region, Constantine frequently experiences landslides that affect not only densely populated urban areas but also neighbouring rural regions (Mezerreg et al., 2019). This duality of urban and non-urban impacts makes Constantine a prime location for studying the interplay of factors driving landslides in different environments. Rapid urbanization in regions like Constantine often results in incomplete or insufficient geospatial data, complicating the integration of urban characteristics into predictive models, particularly in heterogeneous and transitional zones between urban and non-urban areas (buffer zones). This makes it challenging to accurately capture the spatial variability of landslide mechanisms. Moreover, population exposure, environmental degradation, and the increased strain on mitigation infrastructure amplify the risks, underscoring the need for targeted and context-specific hazard assessments (Achour et al., 2017; Manchar et al., 2018; Mezhoud and Benazzouz, 2018).

Although extensive research has been carried out on landslide hazard assessment in urban areas (Bathrellos et al., 2009; Huang et al., 2023; Pascale et al., 2010, 2013), these studies generally examine urban environments in isolation. They rarely investigate how urban and non-urban processes differ, particularly in transitional zones where the two settings interact. At the same time, the broader international literature shows that machine learning (ML) has become a widely adopted tool for landslide susceptibility mapping. However, most applications still treat the landscape as a homogeneous entity, without explicitly distinguishing between urban and non-urban contexts.

Early work illustrates this limitation. For instance, Caniani et al. (2008) applied artificial neural networks in Potenza, Italy, but without differentiating urban from non-urban landslides either in the inventory or during modelling. More recent studies have followed a similar approach. Islam et al. (2025) applied a hybrid ML model to a rapidly urbanizing area in Bangladesh, and Luo et al. (2025) assessed landslide hazards in the central Guizhou urban agglomeration using SVM, DNN, and bagging algorithms; in both cases, inventories did not explicitly separate urban from non-urban events.

This limitation is critical because urban landslides are often shaped by anthropogenic factors, including slope modification, drainage alteration, construction practices, and infrastructure density, that differ markedly from the natural drivers that dominate in non-urban areas. Without explicitly accounting for these differences, susceptibility models risk oversimplifying complex processes and overlooking the distinct mechanisms of slope failure across contrasting environments.

This study aims to address these challenges by bridging a significant gap in the literature: decoupling urban and non-urban instabilities for landslide susceptibility mapping. By separately analysing the unique driving factors in urban and non-urban environments, this research provides a more nuanced understanding of landslide susceptibility, thereby enhancing the accuracy and applicability of predictive models for sustainable urban and rural development. To achieve this, the study employs a comprehensive methodology integrating geospatial data, machine learning algorithms, and advanced analytical techniques. A detailed landslide inventory was constructed through comprehensive field observations and remote sensing imagery, enabling the classification of landslide events into urban and non-urban categories. Causative factors, including topographical, geological, hydrological, and land-use variables, were extracted from sources such as digital elevation models, geological maps, and land cover data. These datasets were standardized and converted into machine-readable formats to ensure consistency and precision. Machine learning models, including Logistic Regression, Random Forest, LightGBM, XGBoost, and Multi-Layer Perceptron, were trained separately for urban, non-urban, and mixed datasets, with hyperparameters optimized using Bayesian techniques. Model performance was evaluated using cross-validation and metrics such as ROC-AUC, F1-score, and recall, while SHAP analysis was employed to interpret the relative importance of each factor. This methodological framework not only produces robust landslide susceptibility maps but also elucidates the distinct mechanisms driving instability in urban and non-urban settings, thereby addressing critical gaps in the existing literature.

2 Materials and methods

2.1 Study area

The territory of the city of Constantine (Fig. 1) is distinguished by a complex morphology, characterized by irregular hills and often deep valleys, which considerably increase its vulnerability to landslides. The old town (three thousand years old) is built on a karstified carbonate rock plateau of Cenomanian–Turonian age, into which the Rhumel river has incised a canyon. We interpret the incision of this canyon as having been driven by the post-Pliocene (Quaternary) uplift of the massif, an uplift that is still ongoing.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f01

Figure 1. Location of the study area, with hill shading representing the topography and the road network, stream network and mapped landslide polygons overlaid.

The region's Mediterranean climate, marked by hot, arid summers and mild, wet winters, leads to episodic intense rainfall events that promote slope saturation and increase the likelihood of mass movements. Furthermore, the predominance of clay-rich and moisture-retentive geological formations significantly amplifies soil instability, particularly under conditions of heavy precipitation and seismic activity.

Constantine has been the focus of numerous studies addressing landslide susceptibility and risk. Early research by Benaissa and Bellouche (1999) examined the geotechnical properties of landslide-prone formations, revealing the instability of terrains within the urban area. Guemache et al. (2011) highlighted the technical difficulties of stabilizing landslides, particularly in the case of the Sidi Rached bridge. Later works, such as that of Bourenane et al. (2015), applied statistical methods and GIS to map susceptibility, while Manchar et al. (2018) used AHP and the interpretation of aerial images for hazard assessment. These foundational studies provide a robust theoretical and methodological base for the current work, which introduces an innovative approach by decoupling urban and non-urban factors in landslide identification and susceptibility mapping.

Rapid and often unplanned urbanization has led to increased impervious surfaces, altered drainage patterns, and intensified land use changes. The expansion of residential, commercial, and infrastructural developments into hilly areas has disrupted natural slope stabilization mechanisms and increased slope loading, further weakening soil structures. Consequently, Constantine's infrastructure, including roads, bridges, and buildings, is frequently threatened by landslides, posing significant risks to public safety and economic stability (Guemache et al., 2011; Schlögl et al., 2019).

Non-urban areas surrounding Constantine are equally vulnerable, though influenced by different factors. These regions are primarily affected by natural processes such as heavy rainfall, seismic activity, and continuous weathering of geological formations (Mounia et al., 2013). Additionally, agricultural practices, land clearing, and minor construction activities in rural zones contribute to soil erosion and slope degradation, although to a lesser extent compared to urban settings. The heterogeneous landscape of Constantine, characterized by the juxtaposition of densely built urban zones and more stable rural areas, presents a unique opportunity to study the differential impacts of urbanization on landslide susceptibility.

2.2 Landslide inventory

Urban and non-urban landslides differ significantly in their triggers and characteristics due to varying environmental, geological, and human factors. In urban areas, landslides are primarily driven by human activities such as construction, excavation, deforestation, and poorly designed drainage systems. These actions disrupt natural slope stability and reduce soil cohesion, increasing the likelihood of landslides near infrastructure. However, urban landslides can also be triggered by natural events like heavy rainfall and earthquakes, which exacerbate existing vulnerabilities caused by urbanization. In contrast, non-urban landslides are mainly caused by natural processes such as intense rainfall, erosion, and neotectonic seismic activity, with vegetation cover playing a vital role in stabilizing the soil. We identified these rural slope failures through a combination of direct field investigation and multi-temporal remote sensing interpretation (Alharbi et al., 2014; Varnes, 1984).

The triggering factors also vary between environments: urban landslides may result from a combination of human-induced disturbances like leaks or vibrations and natural events such as heavy rains or earthquakes (Chen and Wang, 2023; Ma and Wang, 2024). In contrast, non-urban landslides are mainly influenced by natural phenomena. Understanding these differences is crucial for developing accurate landslide susceptibility maps and implementing effective mitigation strategies tailored to each setting.

2.2.1 Landslide characteristics

The lithological diversity observed in the Constantine study area strongly influences the types of instabilities likely to occur. In the western region, predominant formations consist of thick marls and marly clays, which are particularly susceptible to water saturation. This lithological setting promotes deep rotational landslides, due to the substantial thickness of these cohesive and plastic facies, which can become extensively waterlogged, particularly on moderate to steep slopes (Manchar et al., 2018). Conversely, in the northeastern region, where shallow clays overlay a hard substratum, typically composed of indurated limestone or conglomerates, planar translational landslides are common. Such movements are characteristic of stratified terrains, where a clear sliding surface forms between the superficial loose formation and the underlying rigid formation. Additionally, in areas with gentler slopes, solifluction phenomena occur, involving slow movements of saturated fine materials, exhibiting limited acceleration due to low slope inclination. Lastly, the central region presents a complex scenario: water-sensitive marls coexist with densely built environments, placing additional stress on slopes. This context leads to complex landslides triggered by surcharge, modified hydrological flows, and progressive slope degradation. This variety of instabilities necessitates a differentiated typological approach, integrating geological facies, slope geometry, and anthropogenic factors to effectively map and anticipate slope movements (Hungr et al., 2014).

2.2.2 Landslide inventory classification

The landslide inventory was constructed with the objective of distinguishing between urban and non-urban slope failures while maintaining methodological consistency across the study area. Classification was carried out by overlaying mapped landslide polygons with the land-cover dataset. An event was defined as urban when its polygon intersected built-up zones or lay close to infrastructure. Conversely, landslides located entirely outside these zones and in areas characterized by bare or agricultural land cover were classified as non-urban.
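The overlay rule described above can be sketched on a gridded land-cover layer. This is only an illustration under stated assumptions: each landslide polygon is represented as a list of grid cells, built-up land is identified by the ESA WorldCover class code 50, and `buffer_cells` is a hypothetical tolerance standing in for "close to infrastructure", for which the text gives no distance.

```python
BUILT_UP = 50  # ESA WorldCover class code for built-up areas


def classify_landslide(cells, landcover, buffer_cells=0):
    """Return 'urban' if any landslide cell lies on, or within a square
    window of `buffer_cells` cells around, a built-up cell; otherwise
    return 'non-urban'.

    cells        -- iterable of (row, col) indices covered by the polygon
    landcover    -- 2D list of land-cover class codes
    buffer_cells -- hypothetical proximity tolerance (not given in the text)
    """
    n_rows, n_cols = len(landcover), len(landcover[0])
    for r, c in cells:
        row_range = range(max(0, r - buffer_cells), min(n_rows, r + buffer_cells + 1))
        col_range = range(max(0, c - buffer_cells), min(n_cols, c + buffer_cells + 1))
        for rr in row_range:
            for cc in col_range:
                if landcover[rr][cc] == BUILT_UP:
                    return "urban"
    return "non-urban"


# Toy 4x4 grid: 40 = cropland, 50 = built-up, 60 = bare / sparse vegetation
landcover = [
    [40, 40, 60, 60],
    [40, 40, 60, 60],
    [40, 40, 40, 50],
    [40, 40, 50, 50],
]
slide_a = [(0, 0), (0, 1)]  # entirely on cropland
slide_b = [(2, 2), (2, 3)]  # touches a built-up cell

print(classify_landslide(slide_a, landcover))  # non-urban
print(classify_landslide(slide_b, landcover))  # urban
```

With vector data, the same rule would typically be expressed as a polygon intersection test in a GIS; the grid version is used here only to keep the sketch dependency-free.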

2.2.3 Landslide inventory in urbanized areas

To establish a comprehensive landslide inventory within the urbanized portions of the study area, landslide identification was primarily based on in-situ observations. A sequence of indirect indicators was employed to detect ground instabilities associated with landslides. These indicators were systematically observed across various infrastructures, including buildings, roads, and sewerage systems. Given the high density of construction in urbanized areas, the utilization of high-resolution satellite imagery for delineating landslides proved ineffective. Consequently, surveys and field observations were the primary tool adopted in this study.

To differentiate a landslide from phenomena such as uneven settlement or swelling, an event was only included in the inventory if its vicinity exhibited multiple indirect indicators in conjunction with testimonies from inhabitants and expert opinions. This multi-faceted approach ensured the reliability and accuracy of the landslide inventory. The indirect indicators utilized in the landslide inventory for urbanized areas include:

Cracks in Buildings and Roads:

  • Buildings: Horizontal or diagonal cracks in floors or walls were considered key indicators of potential landslide activity (Fig. 2a and b).

  • Roads: Longitudinal and transverse cracks in road surfaces, pavements, and car parks were identified as significant indicators of slow-moving landslides (Fig. 2a).

  • Uneven Settlement in Buildings: Uneven ground settlement associated with subsurface instability manifested as uneven floors within structures (Fig. 2b and f).

  • Deformation of Roadways: Deformations in roadways, such as misalignments, bulges, and depressions, suggest the presence of a slow landslide when associated with other indirect indicators in the road vicinity.

  • Underground utility damage: Damage to underground utilities, including water and sewage systems, when correlated with other nearby indirect indicators, was attributed to soil movement.

  • Failing retaining infrastructures: The failure of retaining walls and other soil reinforcement structures served as indicators of landslide activity (Fig. 2c and d).

  • Leaning trees and other structures: Trees, electricity poles, or other structures that exhibit leaning from their original vertical positions were considered signs of slope movement or ground instability, especially when accompanied by other indirect indicators (Fig. 2b–e).

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f02

Figure 2. Field evidence of slope instability in the study area: (a) Example of cracks in buildings and roads; (b) Shallow landslide scarp in an urban setting, with an adjacent demolished structure; (c) Failure of a concrete retaining wall; (d) Sheet-pile retaining system exhibiting lateral bulging; (e) Leaning tree signalling slope deformation; (f) Example of inclined buildings (© Esri, © Open Street Map contributors, and the GIS user community, Source: Esri, Maxar, Earthstar Geographics, and the GIS User Community).

2.2.4 Landslide inventory in non-urbanized areas

In non-urbanized areas, the landslide inventory was constructed primarily through remote sensing interpretation, complemented by selective field verification. Mapping was guided by morphological criteria following the principles of Varnes (1984), whereby typical geomorphic signatures of mass movements, such as scarps, displaced material, surface cracks, and toe bulges, were visually identified and delineated. This geomorphological approach ensured that landslides in rural sectors were consistently captured and provided a methodological counterpart to the field-based strategy applied in urban areas.

Each landslide was delineated by mapping the entire affected surface, from the main scarp to the toe, thus incorporating both the source and accumulation zones. Due to the predominance of clays and marls overlying hard limestone or conglomerate formations, most slope movements in the study area are characterized by moderate displacements and relatively small deposition areas. The inventory, including both urban and non-urban landslides, was compiled during a comprehensive field and remote sensing interpretation survey conducted between June and December 2024.

This systematic approach to inventory creation, leveraging multiple indirect indicators alongside direct observations and expert assessments, ensures a robust and reliable dataset for subsequent landslide susceptibility mapping. By focusing on both structural and natural signs of instability, the inventory comprehensively captures the multifaceted nature of urban landslide phenomena, effectively distinguishing them from other ground movement events (Bornaetxea et al., 2018).

To further characterize the mapped landslides, descriptive statistics were calculated for the urban, non-urban, and mixed inventories (Table 1). The urban dataset comprises 123 landslides totalling 18.4 ha, while the non-urban dataset includes 61 landslides covering 21.2 ha. Combined, the mixed inventory contains 184 landslides with a total area of 39.6 ha. Despite its larger number of events, the urban dataset represents the smaller total area.

Table 1. Descriptive statistics of mapped landslides in the study area.


2.3 Landslide causative factors

Selecting appropriate landslide conditioning factors is essential for accurate susceptibility modelling. In this study, factors capturing both anthropogenic and natural influences were chosen to distinguish between urban and non-urban landslides. By analysing these factors, we isolate the distinct mechanisms driving slope failures in different environments (Fig. 3), enhancing model precision and interpretability.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f03

Figure 3. Spatial distribution of conditioning variables used in landslide susceptibility modelling for the study area, including: (a) Elevation (m); (b) Slope (°); (c) Distance to streams (m); (d) Aspect (°); (e) Curvature; (f) TWI; (g) Distance to roads (m); (h) Lithology; (i) NDVI; and (j) Land use.

The conditioning factors employed in this study are detailed in Table 2. Topographical and hydrological variables, including Elevation, Slope, Aspect, Curvature, Topographic Wetness Index (TWI) and Distance to streams, were derived from the ALOS PALSAR dataset and DEM-based calculations. These factors are crucial for characterising the terrain, as elevation and slope affect gravitational forces and mass movement potential, while aspect and curvature influence microclimatic conditions and water flow dynamics. TWI quantifies soil moisture accumulation, and distance to streams provides an indication of potential water-induced processes.
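For readers unfamiliar with the TWI, the short sketch below evaluates its standard definition, TWI = ln(a / tan β), where a is the upslope contributing area per unit contour width and β is the local slope. The article does not state which flow-routing algorithm or software produced its TWI layer, so the function name, the input values and the small slope floor used to avoid division by zero on flat cells are all assumptions made for illustration.

```python
import math


def twi(specific_catchment_area_m, slope_deg, min_slope_deg=0.1):
    """Topographic Wetness Index, ln(a / tan(beta)).

    specific_catchment_area_m -- upslope area per unit contour width (m)
    slope_deg                 -- local slope in degrees
    min_slope_deg             -- assumed floor that keeps tan(beta) > 0 on flat cells
    """
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area_m / math.tan(beta))


# Hypothetical cells: a gentle valley floor versus a steep upper slope
valley_floor = twi(specific_catchment_area_m=500.0, slope_deg=5.0)
upper_slope = twi(specific_catchment_area_m=50.0, slope_deg=20.0)
print(valley_floor > upper_slope)  # gentle, high-accumulation cells score higher
```

Higher values flag cells that combine a large contributing area with a gentle gradient, which is why the index is read as a proxy for moisture accumulation.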

Table 2. Summary of landslide conditioning factors used in this study.


The Land Use and Land Cover information was sourced from the ESA WorldCover dataset at a 10 m resolution. This dataset differentiates between natural vegetation and anthropogenic land uses, distinguishing between urban and non-urban areas. Additionally, the Distance to roads, obtained from OpenStreetMap, captures the influence of infrastructure on slope stability by identifying modifications in natural drainage and terrain disturbance.

Geological factors have been incorporated via a Lithology map based on expert field surveys, which provides vector data on soil and rock formations. This allows for the identification of areas with potentially fragile substrates that are more susceptible to landslides. Finally, the NDVI, calculated using Harmonised Sentinel-2 MSI data, offers a quantitative measure of vegetation cover at a 10 m resolution. This factor is important in assessing the role of vegetation in reinforcing soil and mitigating erosion, thereby influencing overall slope stability.
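As a brief illustration of the vegetation index mentioned above, NDVI is conventionally computed as (NIR − Red) / (NIR + Red); for Sentinel-2 MSI the near-infrared and red bands are B8 and B4. The sketch below applies that formula to single reflectance values. The reflectance numbers are hypothetical, and returning `None` for a zero denominator is a design choice of this example rather than something specified in the article.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index for one pixel.

    nir -- near-infrared reflectance (Sentinel-2 band B8)
    red -- red reflectance (Sentinel-2 band B4)
    Returns None when both reflectances are zero (undefined ratio).
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical surface reflectances
dense_vegetation = ndvi(nir=0.45, red=0.05)
built_up_surface = ndvi(nir=0.20, red=0.15)
print(dense_vegetation > built_up_surface)  # vegetated pixels have higher NDVI
```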

Density distributions of landslide conditioning factors

To understand the influence of environmental and anthropogenic factors on slope stability, probability density functions were estimated for each conditioning variable using a kernel density estimator (KDE) (Chen, 2017):

$$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) \tag{1}$$

where $\hat{f}_h(x)$ is the estimated probability density at $x$, $n$ is the number of observations, $h$ is the bandwidth (smoothing parameter), $K$ is the kernel function (e.g., Gaussian), and $x_i$ are the individual observations.
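A direct transcription of Eq. (1) with a Gaussian kernel may help make the estimator concrete. This is a minimal sketch, not the implementation used in the study: the slope values and the bandwidth `h = 2.0` are hypothetical, and the article does not say how its bandwidth was selected.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h):
    """Kernel density estimate at x: (1 / (n h)) * sum_i K((x - x_i) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / h) for xi in samples) / (n * h)


# Hypothetical slope angles (degrees) at landslide cells
slopes = [6.0, 8.5, 9.0, 11.0, 12.5, 15.0, 18.0]
h = 2.0  # assumed bandwidth

# Sanity check: the estimated density should integrate to roughly 1
step = 0.1
grid = [i * step for i in range(-200, 501)]  # from -20 to 50 degrees
area = sum(kde(x, slopes, h) for x in grid) * step
print(round(area, 3))
```

Evaluating `kde` on the same grid for landslide and non-landslide samples, and plotting the two curves together, gives the kind of overlay discussed in the following paragraphs.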

This approach provides a smoothed representation of the distributions of landslide and non-landslide cells, while allowing urban and non-urban landslide occurrences to be analysed separately. By overlaying these density plots (Fig. 4), several insights can be obtained:

  1. Identifying Factor Importance: Large separations in the distributions (e.g., steep slope angles) suggest that certain factors are particularly significant in driving landslides.

  2. Distinguishing Urban vs. Non-Urban Patterns: Overlaying urban and non-urban curves highlights how features interact with landslide occurrence.

  3. Revealing Potential Thresholds: Certain factor ranges (e.g., slope > 5° or elevation between 500 and 600 m) align with higher or lower landslide frequencies, informing both susceptibility modelling and mitigation strategies.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f04

Figure 4. Density distributions of landslide conditioning factors.


  • Land Use: Urban landslide densities (red) concentrate heavily in built-up areas. In non-urban settings, landslides are more common in naturally vegetated or agricultural zones, indicating the influence of mostly natural triggers such as intense rainfall, erosion, and weathering. Nonetheless, agricultural practices (e.g. terracing, irrigation) can also modify slope geometry, soil composition, and water infiltration, which may heighten landslide susceptibility in some areas. Therefore, while urbanisation is often associated with more pronounced slope disturbance and drainage alteration, non-urban slopes remain vulnerable to both natural processes and lower-intensity human activities, highlighting that land use exerts a significant, but context-dependent, influence on slope stability.

  • Lithology: Clayey marl and marly clay stand out as common substrates for both urban and non-urban landslides. Urban areas, however, are characterized by a pronounced presence of these lithological formations, indicating that when inherently weak materials coincide with excavations, leaking infrastructure, or other human-induced alterations, the susceptibility to failure increases dramatically. Geological weaknesses such as low shear strength therefore play an important role, but are intensified in urban contexts by anthropogenic stressors.

  • Aspect: The data indicate that urban landslides tend to be more frequent on slopes facing northeast to northwest, whereas non-urban landslides show a slight preference for north-northwest orientations. Aspect often influences microclimatic conditions such as sunlight exposure and moisture retention; however, in urban settings, these natural patterns can be overshadowed by human-induced modifications to drainage, loading, and vegetation cover.

  • Curvature: Both urban and non-urban landslides appear clustered around low or slightly negative curvature values, corresponding to near-planar or mildly convex slopes. Extremely convex or concave slopes may be less frequently occupied or stabilised by vegetation, which could explain their lower landslide density. Mild curvature zones can conceal latent instabilities, especially when subjected to additional loading or inefficient drainage systems.

  • Elevation: Both urban and non-urban landslides tend to concentrate at mid-range elevations, although urban landslides may favour lower hills where city expansion is more common. Elevation influences climate (rainfall patterns, temperature) and vegetation cover, which in turn affect slope stability. However, urban development decisions, such as building on certain elevation tiers, can interact with these natural processes to increase landslide susceptibility.

  • Slope: In non-urban regions, landslides predominantly occur on steeper slopes (≥8°), where gravity-driven failures are more frequent in the absence of anthropogenic interventions. Urban landslides also manifest on moderate to steep slopes, but can arise on low slopes (<8°) when construction practices, such as excavation and drainage mismanagement, undermine natural stability. Although slope remains a primary driver of landslides across both contexts, urban activities can widen the range of vulnerable gradients.

  • Distance to Stream: Both urban and non-urban landslides cluster nearer to streams, reflecting the erosion and soil saturation that often occur in riparian zones. In urban settings, however, the distribution curve may shift slightly, capturing instances where infrastructure is built near or across waterways, leading to localised alterations in flow patterns. Consequently, water emerges as a critical destabilising agent whether channelled naturally or redirected through anthropogenic interventions.

  • Distance to Roads: Urban landslides commonly occur close to roads, implicating slope excavation, traffic vibrations, and disrupted drainage as potential triggers. By contrast, non-urban landslides are more evenly distributed over different distances from the roads, corresponding to less intense but still present human alteration. Proximity to roads thus represents a strong proxy for anthropogenic disturbance, signalling the need for vigilant maintenance and slope reinforcement measures in areas of dense infrastructure.

  • NDVI: From the density curves, urban landslides are generally associated with lower NDVI values, reflecting the reduced vegetation cover typical of built-up environments. In contrast, non-urban landslides often occur at moderately higher NDVI levels, where vegetation provides only limited root reinforcement. Nevertheless, agricultural and semi-natural areas may still experience slope failures when land management practices such as deforestation or inadequate irrigation degrade vegetation quality. This pattern indicates that vegetation cover alone does not guarantee slope stability.

  • TWI: Although the TWI distributions for all classes appear relatively similar, subtle differences do emerge between landslide and non-landslide cells. Combining these differences with other factors, such as steep slopes or weak lithologies, could provide valuable insights into areas more prone to slope failure.

The density plots reinforce that urban and non-urban landslides share certain triggers (e.g. slope, water presence) but diverge substantially in how anthropogenic factors, like road networks and impervious surfaces, modify the landscape. This underscores the importance of treating urban and non-urban landslides as partially distinct phenomena when developing susceptibility models and risk management strategies.

3 Methodology

This study adopts a multi-stage framework designed to disentangle the distinct mechanisms governing urban and non-urban landslides within the same broader landscape. Additionally, it aims to evaluate the potential bias introduced when relying on a purely urban or purely non-urban landslide inventory in a heterogeneous landscape context. By analysing model performance across different dataset configurations, the study assesses whether predictive capabilities differ significantly depending on the spatial context of the training data, highlighting the implications of dataset composition for susceptibility modelling. To achieve this, the methodology began with the compilation of a comprehensive mixed landslide inventory (Fig. 5), incorporating all documented events. This inventory was then subdivided into Urban and Non-Urban datasets based on land use to enable more focused analyses. The Urban subset comprises 123 landslide events (16.5 % of the total affected surface), while the Non-Urban subset contains 61 events (83.5 % of the total landslide area). By segmenting the data, we can evaluate the occurrence of landslides in each environment.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f05

Figure 5. Flowchart of the methodology adopted.


Environmental factors were integrated according to the specific needs of each dataset. The Mixed dataset retained all covariates, including land use, to reflect the heterogeneity of the entire study area. In contrast, land use was excluded from both the Urban and Non-Urban datasets to avoid redundancy and biasing the models; these subsets inherently represent different land cover categories. This selective factor inclusion ensures that each model targets only those predictors most relevant to its respective environment.

All datasets were standardised to a 10 m spatial resolution for consistency, and raster layers were converted into dataframes so that each grid cell corresponds to a single data point with associated environmental variables. Negative sample selection followed a twofold procedure. First, we applied a knowledge-based method in which experienced local geologists delineated areas considered stable on the basis of comprehensive field investigations, low slope gradients, a 50 m buffer from mapped landslides, and historical records (Fig. 6). Second, within these expert-identified stable zones, we employed random sampling to generate a set of non-landslide points for model training.
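The two-step negative sampling described above can be sketched as follows. This is a minimal illustration only: the small boolean grids, the function name `sample_negatives`, and the square buffer window are hypothetical simplifications, with the 50 m buffer expressed as 5 cells at the 10 m resolution. It is not the authors' implementation.

```python
import random

CELL_SIZE_M = 10      # grid resolution stated in the text
BUFFER_M = 50         # exclusion buffer around landslides stated in the text
BUFFER_CELLS = BUFFER_M // CELL_SIZE_M


def sample_negatives(landslide, stable, n_samples, seed=0):
    """Randomly draw non-landslide cells from expert-defined stable zones.

    landslide, stable: lists of lists of booleans with identical shape.
    Cells within BUFFER_CELLS (square window) of any landslide cell are excluded.
    """
    n_rows, n_cols = len(landslide), len(landslide[0])
    slide_cells = [(r, c) for r in range(n_rows) for c in range(n_cols)
                   if landslide[r][c]]

    def near_landslide(r, c):
        return any(abs(r - sr) <= BUFFER_CELLS and abs(c - sc) <= BUFFER_CELLS
                   for sr, sc in slide_cells)

    candidates = [(r, c) for r in range(n_rows) for c in range(n_cols)
                  if stable[r][c] and not near_landslide(r, c)]
    rng = random.Random(seed)
    return rng.sample(candidates, min(n_samples, len(candidates)))


# Hypothetical 20 x 20 grid: one landslide cell, stable zone in the right half.
landslide = [[False] * 20 for _ in range(20)]
landslide[10][8] = True
stable = [[c >= 10 for c in range(20)] for _ in range(20)]
negatives = sample_negatives(landslide, stable, n_samples=30)
```

Fixing the random seed keeps the negative sample reproducible across the Mixed, Urban, and Non-Urban runs, which is one reasonable design choice when comparing datasets.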

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f06

Figure 6. Landslide inventory and stable areas used to implement the landslide susceptibility models (©Esri, ©Open Street Map contributors, and the GIS user community, Source: Esri, Maxar, Earthstar Geographics, and the GIS User Community).

Each of the three datasets (Mixed, Urban, and Non-Urban) was then subjected to a consistent analytical pipeline. Label encoding and feature scaling were applied to harmonise data for machine learning models. A multicollinearity analysis was performed to identify and remove highly correlated variables. We employed several machine learning algorithms – including LightGBM, XGBoost, Random Forest, Multi-Layer Perceptron (MLP), and Logistic Regression – with hyperparameter tuning via Bayesian Optimisation (Sun et al., 2024; Yang et al., 2023). Table 3 summarises the key machine learning algorithms employed for landslide susceptibility mapping in this study. The table highlights each algorithm's defining characteristics alongside its suitability.

Table 3. Overview of machine learning algorithms for landslide susceptibility analysis.


To evaluate the predictive capabilities of the models, a 10-fold cross-validation strategy was adopted. The performance metrics, including accuracy, recall, F1 score and ROC-AUC (Table 4), were averaged over the ten folds, and the standard deviation was calculated to give a more reliable estimate of out-of-sample performance (Mas et al., 2013; Saha et al., 2020; Tang et al., 2020). Following the initial performance assessment, SHAP (SHapley Additive exPlanations) analysis was conducted to interpret the importance of individual factors in each model (Liu et al., 2024; Lundberg et al., 2019). SHAP values quantify how much each predictor contributes to moving a model's output from a baseline prediction, offering transparent insights into why certain instances were classified as landslides (or non-landslides). Finally, calibration plots were generated to compare the predicted probability distributions across different datasets. These plots gauge how well the predicted probabilities align with the actual frequencies of landslide occurrence, highlighting any systematic over- or under-confidence in the models' predictions (Gerds et al., 2014; Lv et al., 2024). Once the predictive models were finalised, their outputs were transformed into spatially explicit susceptibility maps. Each pixel (or grid cell) in the study area was assigned a probability (or score) indicating its likelihood of experiencing a landslide, based on the combined influence of the chosen conditioning factors.
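To make the fold-wise aggregation concrete, the sketch below computes accuracy, precision, recall and F1 from confusion-matrix counts and reports the mean and standard deviation across folds. The counts are invented for illustration (three folds instead of ten, for brevity) and the helper names are hypothetical. ROC-AUC is omitted because it requires ranked scores rather than counts.

```python
from statistics import mean, stdev


def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def summarise_folds(fold_counts):
    """Average each metric over the folds and report its standard deviation."""
    per_fold = [classification_metrics(*counts) for counts in fold_counts]
    return {name: (mean([m[name] for m in per_fold]),
                   stdev([m[name] for m in per_fold]))
            for name in per_fold[0]}


# Illustrative (tp, fp, tn, fn) counts per fold; not results from the study.
folds = [(40, 10, 45, 5), (38, 12, 44, 6), (42, 8, 46, 4)]
summary = summarise_folds(folds)
for name, (avg, sd) in summary.items():
    print(f"{name}: {avg:.3f} +/- {sd:.3f}")
```

Reporting the standard deviation alongside the mean shows how stable a model is across folds, which matters most for the smaller subsets.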

Table 4. Performance metrics used in model evaluation.


4 Results and discussion

4.1 Multicollinearity assessment

To ensure the reliability of model coefficients and predictions, potential multicollinearity among the landslide conditioning factors was examined using the Variance Inflation Factor (VIF). This statistic quantifies how much the variance of a given regression coefficient is inflated due to correlations with other predictors (Kyriazos and Poga, 2023). In general, a VIF value above 5 indicates moderate multicollinearity, whereas values exceeding 10 suggest a more serious concern that may distort model estimates (O'Brien, 2007).
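For reference, the standard definition of the VIF for predictor $j$ is given below, where $R_j^2$ is the coefficient of determination obtained by regressing predictor $j$ on all the other predictors:

$$
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
$$

Under this definition, a VIF of 5 corresponds to $R_j^2 = 0.8$ and a VIF of 10 to $R_j^2 = 0.9$, so a VIF above 40 implies that more than 97.5 % of the variance of that predictor is explained by the others.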

In Fig. 7, two bar plots present VIF values for the variables under different threshold considerations. For the factors considered, Elevation displays a notably high VIF (>40), implying a strong correlation with other factors, likely because elevation is closely tied to terrain attributes such as slope, TWI and distance to streams. Other variables, including Landuse, TWI, and NDVI, exhibit moderate VIF values near or above 10, suggesting some overlap in how moisture, vegetation, and terrain patterns are captured. By contrast, factors such as Curvature, Distance to roads, and Aspect remain below the threshold, indicating relatively low redundancy with other predictors. For the factors selected, where a more conservative VIF threshold of 5 is applied, a narrower set of factors is highlighted. Once again, Elevation, Landuse, TWI, NDVI, and Lithology are shown to correlate strongly with each other or with the broader suite of predictors. These findings underscore the importance of caution when selecting variables, since high multicollinearity can degrade predictive performance.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f07

Figure 7. VIF analysis of landslide conditioning factors.


4.2 Hyperparameters calibration

In this study, we employed several machine learning algorithms to generate landslide susceptibility models for the Mixed, Urban, and Non-Urban datasets. Before the final evaluation, each algorithm underwent a systematic configuration process, using a Bayesian Optimisation approach (Frazier, 2018). The Bayesian method was preferred over more conventional grid or random search methods due to its efficiency in exploring large, multi-dimensional hyperparameter spaces.

Table 5 shows that some hyperparameters (e.g. the learning rate for XGBoost and LightGBM) remain consistent across the Mixed, Urban, and Non-Urban datasets, indicating robust settings regardless of data composition. Other parameters, such as max_features and min_samples_split in RF, vary considerably, suggesting that each environment demands tailored configurations to capture distinct triggers of slope failure. For instance, LightGBM requires a lower num_leaves in the Urban dataset, while Random Forest benefits from a higher number of trees in that same setting. Meanwhile, the MLP uses a larger alpha in the Non-Urban dataset, suggesting that stronger regularisation is needed to handle more naturally driven landslides. These results underscore the importance of custom tuning for each model–dataset combination to achieve optimal performance.

Table 5. Hyperparameters for each algorithm and dataset.


4.3 Landslide susceptibility maps

Figure 8 presents the landslide susceptibility maps generated under each modelling scenario (Mixed, Non-urban, and Urban) for the five different algorithms. The Landslide Susceptibility Index (LSI) expresses the relative spatial probability of landslide occurrence. It reflects how prone an area is to landslides, without reference to the timing or potential impacts. In this study, LSI values were obtained from the machine learning model outputs, where each grid cell was assigned a continuous score indicating its relative susceptibility. Higher LSI values correspond to greater likelihood of landslide occurrence. For the Mixed and Non-urban scenarios, the maps display broadly similar spatial patterns in high-susceptibility zones, particularly for the tree-based models and Logistic Regression. Nonetheless, minor variations in the extent and intensity of these zones indicate that the Mixed inventory can subtly alter the models' learned relationships.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f08

Figure 8. Landslide susceptibility maps using the various models and the different datasets.

In contrast, the Urban maps generally display fewer red patches. This reduction is likely due to the limited combinations of factors used during model training, leading to an underestimation of landslide susceptibility when generalized across the study area. Conversely, the non-urban maps tend to feature larger, more continuous patches of high susceptibility. Among the algorithms, LightGBM, XGBoost, and Random Forest produce consistent spatial patterns with sharper transitions between high- and low-susceptibility regions. This is attributable to their tree-based structures, which effectively capture nonlinear relationships. In comparison, MLP and Logistic Regression yield smoother, more diffuse transitions between susceptibility levels. Notably, Logistic Regression delineates larger continuous zones at the red–blue interface, likely a consequence of its linear decision boundaries.

Despite these differences, all models consistently highlight similar topographic or geological hotspots prone to landslide failure, although the precise shape and intensity of these hotspots vary slightly among the algorithms. This consistency suggests a common underlying signal, albeit one that is interpreted differently depending on the chosen algorithm. Overall, the observed differences underscore that the choice of algorithm can lead to distinct susceptibility patterns, which may have important implications for risk mapping and resource allocation.

The resulting susceptibility maps not only pinpoint areas that appear obviously prone to slope failure, but also reveal zones of precarious stability that may seem stable at present. Such slopes are highly sensitive to unexamined or unregulated human activities, such as excavation or poorly managed water diversion, which could readily tip them into active instability.

Figure 9 presents the density distributions of LSI values for landslide and stable cells across the different models and datasets, with density values computed according to Eq. (1). A clear separation between the two distributions reflects strong model discrimination, whereas substantial overlap indicates weaker predictive performance.

  • LightGBM. Urban landslides form a tight mode near 0.8–1.0, while urban non-landslide cells concentrate at low LSI, indicating good discrimination. Non-urban landslides also peak at high LSI but with a broader spread, and the non-urban stable curve shows a long high-LSI tail, implying more false positives. The mixed curves closely track the non-urban shapes, suggesting that, in the combined dataset, non-urban signatures dominate, which dilutes urban landslides.

  • XGBoost. Among all models, XGBoost shows the clearest discrimination. In the urban dataset, landslide cells cluster sharply at LSI≈0.95, while non-landslide curves collapse toward 0–0.2 with negligible right tails. In non-urban terrain, separation remains strong and superior to LightGBM, with only a small fraction of stable cells assigned high LSI. By contrast, the mixed dataset exhibits a broader, flatter spread of LSIs, reflecting the heterogeneity of urban and non-urban signatures and the resulting dilution of the decision boundary. This reinforces the benefit of modelling the two environments separately.

  • Random Forest. Urban landslides peak around 0.8 with low urban stable densities at high LSI, evidencing good urban performance. In non-urban areas, landslide densities shift to 0.6–0.8 and overlap more with stable cells, reflecting misclassifications. The mixed curves closely mirror the non-urban shapes, indicating that RF's piecewise partitions are dominated by non-urban conditions, which reduces selectivity when both environments are pooled. Environment-specific calibration (or thresholds) would likely improve non-urban specificity.

  • MLP. The MLP concentrates landslide probabilities at the high end, indicating good sensitivity. Non-landslide curves are mostly compressed below 0.2, yet they retain residual right-tails, more visible for mixed and urban sets, so a small fraction of stable cells is assigned high LSI (false positives). This pattern is consistent with a high-capacity model capturing non-linear interactions but becoming over-confident under heterogeneous predictors.

  • Logistic Regression. Distributions are broader with substantial overlap, but a consistent right-shift of landslide curves remains: landslide modes lie around 0.60–0.75, while non-landslide modes are closer to 0.20–0.45. The density is highest in the urban dataset, although shifted towards low LSI, and it weakens in the non-urban and mixed datasets, indicating the poorest discrimination capability of the five models.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f09

Figure 9. Density distributions of Landslide Susceptibility Index (LSI) values obtained by superposing the landslide inventory with the susceptibility maps produced by five machine learning models.
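The notion of separation versus overlap between landslide and stable LSI distributions can be illustrated with a small sketch. Eq. (1) is not reproduced in this section, so the code uses a plain normalised histogram as a stand-in for the density, together with a simple overlap coefficient. Both function names and all LSI values are hypothetical and are not taken from the study.

```python
def histogram_density(values, n_bins=10):
    """Fraction of values falling in each equal-width bin of [0, 1]."""
    counts = [0] * n_bins
    for v in values:
        idx = min(int(v * n_bins), n_bins - 1)  # put v == 1.0 in the last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]


def overlap_coefficient(dens_a, dens_b):
    """Shared mass of two histograms: 0 = fully separated, 1 = identical."""
    return sum(min(a, b) for a, b in zip(dens_a, dens_b))


# Hypothetical LSI values, not taken from the study.
lsi_landslide = [0.95, 0.9, 0.85, 0.8, 0.92, 0.6]
lsi_stable = [0.05, 0.1, 0.15, 0.2, 0.08, 0.65]

overlap = overlap_coefficient(histogram_density(lsi_landslide),
                              histogram_density(lsi_stable))
print(f"overlap = {overlap:.3f}")
```

A low overlap corresponds to the clear separation described for XGBoost, while a high overlap corresponds to the broad, overlapping curves described for Logistic Regression.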


4.4 Performance evaluation

From Fig. 10, an interesting trend emerges: the urban dataset achieves the highest overall performance despite being the smallest dataset. This is somewhat unexpected, as a limited sample size typically constrains model accuracy. Conversely, the non-urban dataset exhibits the lowest performance across several metrics, which is surprising given that landslides in rural areas are generally assumed to follow more consistent, terrain-driven failure mechanisms. While topographical, geological, and vegetation-related features remain key predictors, the models struggle to distinguish landslide-prone areas as clearly as in urban environments. This suggests that natural slope failure processes may be more complex or influenced by subtle, less directly measurable factors.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f10

Figure 10. Comparison of model performance across different dataset configurations.


Several explanations may account for this counterintuitive result. In urban areas, landslides often create a sharper contrast between unstable and stable cells. As shown in Fig. 4, urban landslides occupy narrower, more distinctive ranges in several predictors, whereas non-urban landslides span broader, overlapping ranges with the stable class. This improves model separability despite the smaller sample size. In addition, negative sample selection biases may further accentuate the contrast, as stable areas were identified through knowledge-based methods, thereby sharpening class distinctions. Finally, urban slopes typically display lower geomorphological complexity and greater uniformity of triggering factors compared with non-urban terrain.

The mixed dataset performs between these two extremes, but its results vary across different metrics. By combining urban and rural characteristics, the dataset benefits from a larger sample size but at the cost of increased heterogeneity, making it harder for models to capture distinct patterns specific to either environment.

When comparing the five algorithms across the datasets (Fig. 11 and Table S1 in the Supplement), a clear difference emerges. The three tree-based methods maintain relatively consistent performance across the different dataset configurations, indicating their robustness in handling a wide range of features and data distributions. However, subtle variations do appear. For instance, LightGBM may take the lead in the non-urban setting, while XGBoost often excels in the Urban dataset. Overall, however, these ensembles exhibit less variability than the other approaches.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f11

Figure 11. Model-wise performance evaluation.


By contrast, MLP and Logistic Regression show greater performance swings between the Mixed, Urban, and Non-urban datasets. MLP displays notable changes in Precision and Recall, reflecting its sensitivity to differences in dataset size, feature distribution, and hyperparameter settings, all of which vary substantially between urban and non-urban landscapes. Logistic Regression experiences the largest performance drop in the non-urban dataset, largely because it relies on linear decision boundaries and struggles with the more complex, non-linear interactions typical of topographically driven environments. This marked variability underscores the importance of selecting models capable of adapting to the interplay of environmental and anthropogenic factors.

Beyond these internal performance metrics, it is also important to situate our findings in relation to previous susceptibility assessments conducted in Constantine Province. Several studies have produced maps using statistical, expert-based, or multi-criteria methods, which provide a useful external reference for comparison with our results.

Landslide susceptibility in Constantine Province has been evaluated in several previous studies using different approaches. For instance, Achour et al. (2017) analyzed a highway road section using statistical methods; however, their study area does not intersect with ours, limiting the relevance of direct comparison. Abdı et al. (2021) applied AHP and Fuzzy-AHP methods in a zone that partially overlaps our study area. Although their validation inventory was compiled at a smaller scale, the main landslide-prone zones they identified correspond closely to areas that our mixed and non-urban models classify as high to very high susceptibility. In contrast, their mapping underrepresents small urban landslides, which may explain why our urban model captures additional events not emphasized in their results. Similarly, Bourenane and Bouhadad (2021) and Bourenane et al. (2015) developed susceptibility maps based on expert judgment and statistical approaches. While their analyses were also conducted at a coarser scale, our non-urban and mixed models broadly agree with their delineation of landslide and highly susceptible areas. Taken together, these consistencies with earlier studies provide an indirect form of external validation, supporting the reliability of our susceptibility models. Despite a smaller study area, this work represents the most comprehensive assessment to date of landslide susceptibility in the Constantine region. It stands out for its spatial scale, the level of detail and reliability of the compiled inventory, the integration of advanced learning methods, and advanced analysis of the findings.

4.5 Calibration of the models

Calibration plots are essential tools for assessing how well a model's predicted probabilities align with actual outcomes. When predictions of models lie above the diagonal, it indicates an underestimation of the true probability of a positive event. Conversely, predictions below the diagonal suggest an overestimation. Ideally, predictions that fall on or near the diagonal demonstrate strong calibration.
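The reading of a calibration plot described above can be sketched with a small binning routine. This is a minimal illustration, assuming equal-width probability bins; the function name `calibration_curve` is used here for a hand-written helper, and the labels and probabilities are hypothetical rather than outputs of the study's models.

```python
def calibration_curve(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed frequency) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        idx = min(int(prob * n_bins), n_bins - 1)
        bins[idx].append((label, prob))
    curve = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(l for l, _ in members) / len(members)
            curve.append((mean_pred, observed))
    return curve


# Hypothetical labels (1 = landslide) and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.1, 0.3, 0.3, 0.5, 0.5, 0.9, 0.9]

for mean_pred, observed in calibration_curve(y_true, y_prob):
    if observed > mean_pred:
        verdict = "above diagonal: underestimation"
    elif observed < mean_pred:
        verdict = "below diagonal: overestimation"
    else:
        verdict = "on diagonal"
    print(f"predicted={mean_pred:.2f} observed={observed:.2f} {verdict}")
```

Each point pairs the mean predicted probability in a bin (horizontal axis) with the observed landslide frequency in that bin (vertical axis), so a point above the diagonal means the model predicted less than what actually occurred.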

According to Fig. 12, the calibration plots for the Urban dataset show that MLP and LightGBM align more closely with the diagonal line of perfect calibration, though minor deviations appear at the lower and upper probability extremes. Random Forest and Logistic Regression exhibit larger discrepancies, especially for higher predicted probabilities, where Logistic Regression underperforms considerably. These patterns suggest that in urban contexts, where anthropogenic factors are particularly influential, the complex algorithms capture probability estimates more consistently, while simpler or more rigid approaches struggle to match actual landslide occurrence. In the Non-urban setting, MLP, XGBoost and LightGBM maintain reasonable calibration across most probability bins, whereas Random Forest underestimates risk at the lower end and overestimates it at higher probabilities. By contrast, Logistic Regression shows better calibration than in the Urban dataset, although it still significantly underestimates landslide occurrence at high susceptibility.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f12

Figure 12. Calibration plots for each model and dataset.


The Mixed dataset yields overall better-calibrated predictions. All models are stable, remaining close to the diagonal over a wider range of probabilities, while Random Forest and Logistic Regression show less deviation than in the purely urban or non-urban subsets. This suggests that pooling diverse samples may help average out some extremes and lead to more balanced probability estimates, albeit at the potential cost of diluting context-specific patterns evident in separate urban or rural analyses.

4.6 Relative importance of the factors for the best model

According to Fig. 13, in the urban model, Aspect emerges as the most variable predictor, suggesting that slope orientation plays a decisive role in urban landslide susceptibility. This effect likely stems from microclimatic differences, such as sunlight exposure and moisture retention, which, when combined with urban construction practices, can either exacerbate or mitigate slope instability. Slope follows closely, reinforcing the idea that steep terrains pose significant risks, even in built-up environments.

https://nhess.copernicus.org/articles/25/4629/2025/nhess-25-4629-2025-f13

Figure 13. SHAP plot for the relative importance of the factors for the best model.


A noteworthy finding is that high distances to streams (i.e., being far from streams) show positive SHAP values, which may appear counterintuitive. In many cities, however, watercourses are often located on flatter, lower-lying land, which might also have received better drainage or protective infrastructure. In contrast, the more intensively built slopes away from streams may lack adequate water management, thereby increasing landslide susceptibility. Areas closer to streams may benefit from slope stabilization measures or less steep terrain, resulting in comparatively lower SHAP values. Regarding lithology, the codes range from:

  • 0 = Alluvium

  • 1 = Limestone

  • 2 = Conglomerate

  • 3 = Brown clayey marl

  • 4 = Ocher-brown marly clay

In urban settings, especially problematic lithologies (3 and 4) may be excavated or heavily engineered, which can reduce their inherent instability, as reflected in more neutral or negative SHAP values. Conversely, if these weak lithologies remain untreated or poorly managed (e.g. adjacent to inadequate construction sites), they tend to produce positive SHAP values, indicating higher landslide risk. Although still important, Distance to roads is not as dominant here as in the non-urban model. This likely reflects the high density of roads in urban areas, where being close to a road may sometimes reduce slope risk (due to engineered supports) or increase it (through cut-and-fill activities). The result is a more “mixed” overall impact on the model. Finally, NDVI (a proxy for vegetation density) is generally low in urban environments. Where NDVI is relatively higher (e.g. parks or green slopes), the model assigns negative SHAP values (less risk). Conversely, extremely low NDVI levels (sparse or no vegetation) tend to correlate with positive SHAP values, implying greater susceptibility to slope failure.

For the non-urban model, Distance to roads stands out as the primary factor. Even limited road networks in rural or natural areas can have disproportionately large destabilizing effects, such as cutting slopes or altering local drainage. Consequently, points nearer to roads show strongly positive SHAP values, signifying elevated landslide risk, in accordance with He et al. (2024). Distance to stream and Slope also have broad SHAP distributions, demonstrating the importance of traditional geomorphological processes in non-urban areas. Being close to a stream often increases undercutting, erosion, and soil saturation, hence positive SHAP values. Areas far from watercourses typically exhibit negative values, indicating reduced landslide likelihood. Aspect and NDVI follow next in importance. High vegetation cover stabilizes slopes (negative SHAP), whereas areas with sparse vegetation show elevated risk. Aspect values of roughly 200–360° (often westerly aspects) correlate with higher landslide probabilities, likely reflecting local climate or sunlight conditions. While Lithology and Curvature show narrower ranges of influence, they still matter. In non-urban areas, weaker lithologies (3 and 4) are less likely to be engineered or reinforced, so they tend to yield positive SHAP values. Conversely, sturdier materials (0 = Alluvium, 1 = Limestone, 2 = Conglomerate) appear less prone to failure, especially on slopes with minimal human disturbance.

In the mixed model, Distance to roads again features prominently, suggesting that roads, through slope cuts and drainage changes, remain a robust driver of landslide occurrence. The model shows that mid-range values of Distance to roads in particular may increase risk, possibly reflecting zones of suburban expansion or partial engineering. Slope and Distance to stream are also influential, consistent with gravity-driven and water-induced failure mechanisms. Aspect and Lithology display moderate but noticeable effects. The lithology results resemble those in the urban dataset, indicating the model may be influenced by the greater representation of urban environments in the sample or by shared lithological units between urban and rural areas. NDVI continues to reduce landslide probabilities where vegetation is dense (e.g., forested slopes), although its stabilizing role is less pronounced than in purely rural contexts. Curvature shows a narrower range of SHAP values overall, but still distinguishes concave zones from convex slopes. This aligns with the SHAP interpretation, as concave areas (negative curvature/low values) are inherently more prone to landslides due to the concentration of runoff and the facilitation of water infiltration.

Across all three datasets, both natural factors (slope, distance to streams, lithology) and anthropogenic factors (particularly roads) emerge as key landslide predictors, with their relative importance shifting depending on the urban or non-urban context. In urban environments, natural drainage patterns are often disrupted by impervious surfaces and redirected through engineered systems. Areas farther from natural streams may lack adequate subsurface drainage infrastructure, leading to groundwater accumulation and increased pore water pressure, a primary trigger for slope instability. In contrast, non-urban terrains follow more common geomorphological logic, with proximity to streams or steep slopes strongly increasing instability. The mixed dataset blends these trends, underscoring that roads, topography, and hydrological factors are consistently significant across diverse landscapes. By comparing these results, decision-makers can better tailor landslide mitigation strategies, focusing on slope stabilization and drainage management in urban expansions, while prioritizing safe road infrastructure and vegetation conservation in more rural settings.
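The SHAP attributions interpreted in this section are based on Shapley values. As a minimal sketch of the underlying idea, the code below computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over all feature orderings. The toy model, the feature names, and the baseline values are hypothetical, and practical SHAP implementations rely on faster, model-specific algorithms rather than this brute-force enumeration.

```python
from itertools import permutations


def toy_model(x):
    """Hypothetical susceptibility score with one interaction term."""
    return (0.2 * x["slope"] + 0.1 * x["near_road"]
            + 0.3 * x["slope"] * x["near_road"])


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    features = list(instance)
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)          # start from the baseline input
        previous = model(current)
        for f in order:
            current[f] = instance[f]      # switch feature f to its actual value
            value = model(current)
            contrib[f] += value - previous
            previous = value
    return {f: total / len(orderings) for f, total in contrib.items()}


instance = {"slope": 1.0, "near_road": 1.0}
baseline = {"slope": 0.0, "near_road": 0.0}
phi = shapley_values(toy_model, instance, baseline)

# Additivity: baseline prediction plus all contributions equals the prediction.
reconstructed = toy_model(baseline) + sum(phi.values())
```

The additivity check at the end is the property that lets positive and negative SHAP values be read as pushing a cell's predicted susceptibility above or below the baseline.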

5 Conclusions

This study addresses the critical challenge of landslide susceptibility assessment in rapidly urbanizing regions, with a focus on the city of Constantine, Algeria, a region known to be vulnerable to landslides due to its complex morpho-geology, clay-rich soils, and intense anthropogenic activities. By decoupling urban and non-urban landslide mechanisms, the research provides a nuanced understanding of the distinct factors driving slope instability in these contrasting environments. The integration of geospatial data, machine learning algorithms, and advanced analytical techniques enabled the development of robust landslide susceptibility models tailored to urban, non-urban, and mixed landscapes.

For the urban dataset, the results seem counter-intuitive, because natural factors outweigh anthropogenic ones in urban landslide susceptibility. The mixed dataset, on the other hand, tends to resemble the non-urban dataset and demonstrated intermediate performance, owing to the larger inventoried landslide area in the non-urban dataset compared with the urban one. This highlights the challenges of modelling heterogeneous landscapes but also the potential for more balanced risk assessments.

Complex algorithms such as XGBoost, LightGBM, and Random Forest consistently outperformed simpler models like Logistic Regression and MLP across all datasets, particularly for the urban dataset. This superior performance underscores the necessity of employing advanced, non-linear models to accurately assess landslide susceptibility, given the inherent complexity of the phenomenon, which involves intricate interactions between environmental and anthropogenic factors. The calibration of these models further supports this conclusion, as it highlights the non-linear nature of the problem, with complex algorithms demonstrating better alignment between predicted probabilities and actual outcomes. In contrast, the LR model, which does not require fine-tuning, exhibited the opposite trend in performance, consistently underperforming compared to the other models. This stark contrast emphasizes the critical role of hyperparameter optimization and the need for more sophisticated modelling approaches to capture the multifaceted dynamics of landslide susceptibility, particularly in urban settings where human activities significantly influence slope stability.

An important outcome of this work is that the different algorithms not only vary in predictive accuracy but also in their estimation tendencies. Boosting methods achieved the strongest discrimination but tended to underestimate susceptibility in marginally unstable zones, Random Forest produced the most balanced delineation of high-susceptibility areas, while Logistic Regression consistently overestimated susceptibility, mapping the largest hazard zones. This gradient of behaviours suggests that model choice should not be guided by accuracy alone but also by the level of security desired in planning and mitigation. In precautionary contexts, an overestimating model may be preferable, while in operational resource management, a balanced or conservative model may be more appropriate.

The study's methodological framework, which incorporates the identification and detailed inventory of landslides, advanced machine learning techniques, interpretable factor analysis, and an uncoupled inventory, represents a significant advancement in landslide susceptibility mapping. By addressing the limitations of existing studies that often treat urban and non-urban environments as homogeneous, this research provides a more accurate and actionable basis for spatial planning and risk mitigation. The findings underscore the importance of context-specific approaches to landslide hazard assessment, particularly in regions undergoing rapid urbanization, and offer valuable insights for policymakers, urban planners, and disaster management authorities aiming to enhance community resilience and sustainable development. Moreover, in our study area, the broader regional context is one of precarious geomorphological stability, where anthropogenic interventions can either stabilise or destabilise the terrain, ultimately interacting with natural processes in a highly complex manner.

Data availability

Some data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.

Supplement

The supplement related to this article is available online at https://doi.org/10.5194/nhess-25-4629-2025-supplement.

Author contributions

Conceptualization, ZM; methodology, MD and ZM; software, ZM; validation, YMD, MD, ZM and CB; formal analysis, YMD, MD and ZM; investigation, YMD, MD, ZM and CB; data curation, ZM and YMD; writing – original draft preparation, ZM; writing – review and editing, ZM, YMD, MD and CB; visualization, ZM; supervision, CB. All authors have read and agreed to the published version of the manuscript.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Review statement

This paper was edited by Daniele Giordan and reviewed by two anonymous referees.

References

Abdı, A., Bouamrane, A., Karech, T., Dahri, N., and Kaouachi, A.: Landslide Susceptibility Mapping Using GIS-based Fuzzy Logic and the Analytical Hierarchical Processes Approach: A Case Study in Constantine (North-East Algeria), Geotechnical and Geological Engineering, 39, 5675–5691, https://doi.org/10.1007/s10706-021-01855-3, 2021. 

Achour, Y., Boumezbeur, A., Hadji, R., Chouabbi, A., Cavaleiro, V., and Bendaoud, E. A.: Landslide susceptibility mapping using analytic hierarchy process and information value methods along a highway road section in Constantine, Algeria, Arabian Journal of Geosciences, 10, https://doi.org/10.1007/s12517-017-2980-6, 2017. 

Alharbi, T., Sultan, M., Sefry, S., ElKadiri, R., Ahmed, M., Chase, R., Milewski, A., Abu Abdullah, M., Emil, M., and Chounaird, K.: An assessment of landslide susceptibility in the Faifa area, Saudi Arabia, using remote sensing and GIS techniques, Nat. Hazards Earth Syst. Sci., 14, 1553–1564, https://doi.org/10.5194/nhess-14-1553-2014, 2014. 

ASF DAAC: ALOS PALSAR High Resolution Radiometric Terrain Corrected Product, NASA Alaska Satellite Facility Distributed Active Archive Center [data set], https://doi.org/10.5067/Z97HFCNKR6VA, 2014. 

Bathrellos, G. D., Kalivas, D. P., and Skilodimou, H. D.: GIS-based landslide susceptibility mapping models applied to natural and urban planning in Trikala, central Greece, Estudios Geologicos, 65, 49–65, https://doi.org/10.3989/egeol.08642.036, 2009. 

Benabbas, C.: Evolution Mio-Plio-Quaternaire Des Bassins Continentaux De L'Algerie Nord Orientale: Apport De La Photogeologie Et Analyse Morphostructurale, University Mentouri, Constantine, 242 pp., https://doi.org/10.13140/RG.2.2.11674.25280, 2006. 

Benaissa, A. and Bellouche, M. A.: Geotechnical properties of some landslide-prone geological formations in the urban area of Constantine (Algeria) [Propriétés géotechniques de quelques formations géologiques propices aux glissements de terrains dans l'agglomération de Constantine (Algérie)], Bulletin of Engineering Geology and the Environment, 57, 301–310, 1999. 

Bornaetxea, T., Rossi, M., Marchesini, I., and Alvioli, M.: Effective surveyed area and its role in statistical landslide susceptibility assessments, Nat. Hazards Earth Syst. Sci., 18, 2455–2469, https://doi.org/10.5194/nhess-18-2455-2018, 2018. 

Bougdal, R., Belhai, D., and Antoine, P.: Géologie détaillée de la ville de Constantine et ses alentours: une donnée de base pour l'étude des glissements de terrain, Bull. Serv. Géol. Nat., 18, 161–187, 2007. 

Bourenane, H. and Bouhadad, Y.: Impact of Land use Changes on Landslides Occurrence in Urban Area: The Case of the Constantine City (NE Algeria), Geotechnical and Geological Engineering, 39, https://doi.org/10.1007/s10706-021-01768-1, 2021. 

Bourenane, H., Bouhadad, Y., Guettouche, M. S., and Braham, M.: GIS-based landslide susceptibility zonation using bivariate statistical and expert approaches in the city of Constantine (Northeast Algeria), Bulletin of Engineering Geology and the Environment, 74, 337–355, https://doi.org/10.1007/s10064-014-0616-6, 2015. 

Breiman, L.: Random Forests, Machine Learning, 45, 5–32, https://doi.org/10.1023/A:1010933404324, 2001. 

Caniani, D., Pascale, S., Sdao, F., and Sole, A.: Neural networks and landslide susceptibility: A case study of the urban area of Potenza, Nat. Hazards, 45, 55–72, https://doi.org/10.1007/s11069-007-9169-3, 2008. 

Carrión-Mero, P., Briones-Bitar, J., Morante-Carballo, F., Stay-Coello, D., Blanco-Torrens, R., and Berrezueta, E.: Evaluation of slope stability in an urban area as a basis for territorial planning: A case study, Applied Sciences (Switzerland), 11, https://doi.org/10.3390/app11115013, 2021. 

Chen, T. and Guestrin, C.: XGBoost: A Scalable Tree Boosting System, 13–17 August, https://doi.org/10.1145/2939672.2939785, 2016. 

Chen, Y.-C.: A tutorial on kernel density estimation and recent advances, Biostatistics and Epidemiology, 1, 161–187, https://doi.org/10.1080/24709360.2017.1396742, 2017. 

Chen, Z. and Wang, G.: Comparison of empirically-based and physically-based analyses of coseismic landslides: A case study of the 2016 Kumamoto earthquake, Soil Dyn. Earthq. Eng., 172, 108009, https://doi.org/10.1016/j.soildyn.2023.108009, 2023. 

Claverie, M., Ju, J., Masek, J. G., Dungan, J. L., Vermote, E. F., Roger, J. C., Skakun, S. V., and Justice, C.: The Harmonized Landsat and Sentinel-2 surface reflectance data set, Remote Sens. Environ., 219, 145–161, https://doi.org/10.1016/j.rse.2018.09.002, 2018. 

Felicísimo, Á. M., Cuartero, A., Remondo, J., and Quirós, E.: Mapping landslide susceptibility with logistic regression, multiple adaptive regression splines, classification and regression trees, and maximum entropy methods: A comparative study, Landslides, 10, 175–189, https://doi.org/10.1007/s10346-012-0320-1, 2013. 

Frazier, P. I.: A Tutorial on Bayesian Optimization, arXiv:1807.02811, https://arxiv.org/abs/1807.02811 (last access: 1 February 2025), 2018. 

Gardner, M. W. and Dorling, S. R.: Artificial neural networks (the multilayer perceptron) – a review of applications in the atmospheric sciences, Atmos. Environ., 32, 2627–2636, https://doi.org/10.1016/S1352-2310(97)00447-0, 1998. 

Gerds, T. A., Andersen, P. K., and Kattan, M. W.: Calibration plots for risk prediction models in the presence of competing risks, Statistics in Medicine, 33, 3191–3203, https://doi.org/10.1002/sim.6152, 2014. 

Guemache, M. A., Chatelain, J. L., Machane, D., Benahmed, S., and Djadia, L.: Failure of landslide stabilization measures: The Sidi Rached viaduct case (Constantine, Algeria), Journal of African Earth Sciences, 59, 349–358, https://doi.org/10.1016/j.jafrearsci.2011.01.005, 2011. 

Hadji, R., Boumazbeur, A. errahmane, Limani, Y., Baghem, M., Chouabi, A. el M., and Demdoum, A.: Geologic, topographic and climatic controls in landslide hazard assessment using GIS modeling: A case study of Souk Ahras region, NE Algeria, Quatern. Int., 302, 224–237, https://doi.org/10.1016/j.quaint.2012.11.027, 2013. 

He, H., Dong, X., Du, S., Guo, H., Yan, Y., and Chen, G.: Study on the Stability of Cut Slopes Caused by Rural Housing Construction in Red Bed Areas: A Case Study of Wanyuan City, China, Sustainability (Switzerland), 16, https://doi.org/10.3390/su16031344, 2024. 

Huang, W., Ding, M., Li, Z., Yu, J., Ge, D., Liu, Q., and Yang, J.: Landslide susceptibility mapping and dynamic response along the Sichuan-Tibet transportation corridor using deep learning algorithms, Catena, 222, 106866, https://doi.org/10.1016/j.catena.2022.106866, 2023. 

Hungr, O., Leroueil, S., and Picarelli, L.: The Varnes classification of landslide types, an update, Landslides, 11, 167–194, https://doi.org/10.1007/s10346-013-0436-y, 2014. 

Islam, M. A., Arrafi, M. A., Peas, M. H., Hossain, T., Hasan, M. M., Murshed, S., and Tania, M. J.: Predicting urban landslides in the hilly regions of Bangladesh leveraging a hybrid machine learning model and CMIP6 climate projections, Geosystems and Geoenvironment, 4, 100354, https://doi.org/10.1016/j.geogeo.2025.100354, 2025. 

Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., and Lui, T.-Y.: LightGBM: A Highly Efficient Gradient Boosting Decision Tree, in: Advances in Neural Information Processing Systems, Curran Associates, Inc., 30, https://proceedings.neurips.cc/paper_files/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf (last access: 1 February 2025), 2017. 

El Kechebour, B.: Relation between Stability of Slope and the Urban Density: Case Study, Procedia Engineering, 114, 824–831, https://doi.org/10.1016/j.proeng.2015.08.034, 2015. 

Kyriazos, T. and Poga, M.: Dealing with Multicollinearity in Factor Analysis: The Problem, Detections, and Solutions, Open Journal of Statistics, 13, 404–424, https://doi.org/10.4236/ojs.2023.133020, 2023. 

Liu, B., Guo, H., Li, J., Ke, X., and He, X.: Application and interpretability of ensemble learning for landslide susceptibility mapping along the Three Gorges Reservoir area, China, Nat. Hazards, 120, 4601–4632, https://doi.org/10.1007/s11069-023-06374-3, 2024. 

Lundberg, S. M., Erion, G. G., and Lee, S.-I.: Consistent Individualized Feature Attribution for Tree Ensembles, arXiv e-prints, arXiv 1802.03888, https://arxiv.org/abs/1802.03888 (last access: 1 February 2025), 2019. 

Luo, J., Zhao, Z., Li, W., Huang, L., and Zhao, W.: Landslide hazard assessment of an urban agglomeration in central Guizhou Province based on an information value method and SVM, bagging, DNN algorithm, Scientific Reports, 15, 1–15, https://doi.org/10.1038/s41598-025-86258-7, 2025. 

Lv, J., Zhang, R., Shama, A., Hong, R., He, X., Wu, R., Bao, X., and Liu, G.: Exploring the spatial patterns of landslide susceptibility assessment using interpretable Shapley method: Mechanisms of landslide formation in the Sichuan-Tibet region, J. Environ. Manage., 366, https://doi.org/10.1016/j.jenvman.2024.121921, 2024. 

Ma, H. and Wang, F.: Inventory of shallow landslides triggered by extreme precipitation in July 2023 in Beijing, China, Scientific Data, 11, 1083, https://doi.org/10.1038/s41597-024-03901-0, 2024. 

Manchar, N., Benabbas, C., Hadji, R., Bouaicha, F., and Grecu, F.: Landslide susceptibility assessment in constantine region (NE Algeria) by means of statistical models, Studia Geotechnica et Mechanica, 40, 208–219, https://doi.org/10.2478/sgem-2018-0024, 2018. 

Mas, J. F., Filho, B. S., Pontius, R. G., Gutiérrez, M. F., and Rodrigues, H.: A suite of tools for ROC analysis of spatial models, ISPRS International Journal of Geo-Information, 2, 869–887, https://doi.org/10.3390/ijgi2030869, 2013. 

Matougui, Z. and Zouidi, M.: A temporal perspective on the reliability of wildfire hazard assessment based on machine learning and remote sensing data, Earth Science Informatics, 18, https://doi.org/10.1007/s12145-024-01501-5, 2025. 

Matougui, Z., Djerbal, L., and Bahar, R.: A comparative study of heterogeneous and homogeneous ensemble approaches for landslide susceptibility assessment in the Djebahia region, Algeria, Environ. Sci. Pollut. R, https://doi.org/10.1007/s11356-023-26247-3, 2023. 

Meena, S. R., Puliero, S., Bhuyan, K., Floris, M., and Catani, F.: Assessing the importance of conditioning factor selection in landslide susceptibility for the province of Belluno (region of Veneto, northeastern Italy), Nat. Hazards Earth Syst. Sci., 22, 1395–1417, https://doi.org/10.5194/nhess-22-1395-2022, 2022. 

Mezerreg, N. E. H., Kessasra, F., Bouftouha, Y., Bouabdallah, H., Bollot, N., Baghdad, A., and Bougdal, R.: Integrated geotechnical and geophysical investigations in a landslide site at Jijel, Algeria, Journal of African Earth Sciences, 160, 103633, https://doi.org/10.1016/j.jafrearsci.2019.103633, 2019. 

Mezhoud, L. and Benazzouz, M.-T.: Évaluation de la susceptibilité à l'aléa «glissement de terrain» par l'utilisation de l'outil SIG: application à la ville de Constantine (Algérie), Sciences & Technologie D, 47, 91–103, 2018. 

Mounia, B., Merzoug, B., Chaouki, B., and Djaouza, A. A.: Physico-Chemical Characterization of Limestones and Sandstones in a Complex Geological Context, Example North-East Constantine: Preliminary Results, International Journal of Engineering and Technology, 114–118, https://doi.org/10.7763/ijet.2013.v5.523, 2013. 

O'Brien, R. M.: A caution regarding rules of thumb for variance inflation factors, Quality and Quantity, 41, 673–690, https://doi.org/10.1007/s11135-006-9018-6, 2007. 

OpenStreetMap contributors: OpenStreetMap – Road network data (2019), https://www.openstreetmap.org (last access: 19 November 2025), 2019. 

Pascale, S., Sdao, F., and Sole, A.: A model for assessing the systemic vulnerability in landslide prone areas, Nat. Hazards Earth Syst. Sci., 10, 1575–1590, https://doi.org/10.5194/nhess-10-1575-2010, 2010. 

Pascale, S., Parisi, S., Mancini, A., Schiattarella, M., Conforti, M., Sole, A., Murgante, B., and Sdao, F.: Landslide susceptibility mapping using artificial neural network in the urban area of Senise and San Costantino Albanese (Basilicata, Southern Italy), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7974 LNCS, 473–488, https://doi.org/10.1007/978-3-642-39649-6_34, 2013. 

Pham, B. T., Tien Bui, D., Prakash, I., and Dholakia, M. B.: Hybrid integration of Multilayer Perceptron Neural Networks and machine learning ensembles for landslide susceptibility assessment at Himalayan area (India) using GIS, Catena, 149, 52–63, https://doi.org/10.1016/j.catena.2016.09.007, 2017. 

Pham, B. T., Nguyen, M. D., Bui, K. T. T., Prakash, I., Chapi, K., and Bui, D. T.: A novel artificial intelligence approach based on Multi-layer Perceptron Neural Network and Biogeography-based Optimization for predicting coefficient of consolidation of soil, Catena, 173, 302–311, https://doi.org/10.1016/j.catena.2018.10.004, 2019. 

Saha, S., Saha, A., Hembram, T. K., Pradhan, B., and Alamri, A. M.: Evaluating the performance of individual and novel ensemble of machine learning and statistical models for landslide susceptibility assessment at Rudraprayag district of Garhwal Himalaya, Applied Sciences (Switzerland), 10, https://doi.org/10.3390/app10113772, 2020. 

Schlögl, M., Richter, G., Avian, M., Thaler, T., Heiss, G., Lenz, G., and Fuchs, S.: On the nexus between landslide susceptibility and transport infrastructure – an agent-based approach, Nat. Hazards Earth Syst. Sci., 19, 201–219, https://doi.org/10.5194/nhess-19-201-2019, 2019. 

Sun, D., Wang, J., Wen, H., Ding, Y. K., and Mi, C.: Landslide susceptibility mapping (LSM) based on different boosting and hyperparameter optimization algorithms: A case of Wanzhou District, China, Journal of Rock Mechanics and Geotechnical Engineering, https://doi.org/10.1016/j.jrmge.2023.09.037, 2024. 

Tang, R. X., Kulatilake, P. H. S. W., Yan, E. C., and Cai, J. Sen: Evaluating landslide susceptibility based on cluster analysis, probabilistic methods, and artificial neural networks, Bulletin of Engineering Geology and the Environment, 79, 2235–2254, https://doi.org/10.1007/s10064-019-01684-y, 2020. 

Tanyu, B. F., Abbaspour, A., Alimohammadlou, Y., and Tecuci, G.: Landslide susceptibility analyses using Random Forest, C4.5, and C5.0 with balanced and unbalanced datasets, Catena, 203, 105355, https://doi.org/10.1016/j.catena.2021.105355, 2021. 

Varnes, D. J.: Landslide hazard zonation: a review of principles and practice, Natural Hazards, United Nations Educational, Scientific and Cultural Organization, Paris, 3, 1–63, https://trid.trb.org/View/281932 (last access: 1 February 2025), 1984. 

Yang, C., Liu, L. L., Huang, F., Huang, L., and Wang, X. M.: Machine learning-based landslide susceptibility assessment with optimized ratio of landslide to non-landslide samples, Gondwana Res., 123, 198–216, https://doi.org/10.1016/j.gr.2022.05.012, 2023. 

Zanaga, D., Van De Kerchove, R., De Keersmaecker, W., Souverijns, N., Brockmann, C., Quast, R., Wevers, J., Grosu, A., Paccini, A., Vergnaud, S., Cartus, O., Santoro, M., Fritz, S., Georgieva, I., Lesiv, M., Carter, S., Herold, M., Li, L., Tsendbazar, N.-E., Ramoino, F., and Arino, O.: ESA WorldCover 10 m 2020 v100 (Version v100), Zenodo [data set], https://doi.org/10.5281/zenodo.5571936, 2021.  

Zehra, K. T., Kursat, O. A., and Candan, G.: Performance Comparison of Landslide Susceptibility Maps Derived from Logistic Regression and Random Forest Models in the Bolaman Basin, Türkiye, Nat. Hazards Rev., 25, 4023054, https://doi.org/10.1061/NHREFO.NHENG-1771, 2024. 

Short summary
In a context of urban expansion and a reduction in the available land for construction, preventive studies against landslides are required. Using field surveys, remote sensing and context-specific models, we studied the risk of landslides in an example of a transitional region. Our models reveal the likelihood of slope failure under varying natural and human pressures, guiding better land management to promote sustainable growth. These insights support safer development in fragile landscapes.