Predicting deep-seated landslide displacement on Taiwan's Lushan through the integration of convolutional neural networks and the Age of Exploration-Inspired Optimizer

Chou, Jui-Sheng; Nguyen, Hoang-Minh; Phan, Huy-Phuong; Wang, Kuo-Lung

doi:https://doi.org/10.5194/nhess-25-119-2025

Articles | Volume 25, issue 1

Research article

06 Jan 2025

Abstract

Deep-seated landslides have caused substantial damage to both human life and infrastructure in the past. Developing an early warning system for this type of disaster is crucial to reduce its impact on society. This research contributes to developing predictive early warning systems for deep-seated landslide displacement by employing advanced computational models for environmental risk management. The novel framework evaluates machine learning, time series deep learning, and convolutional neural networks (CNNs), identifying the most effective models to be enhanced by the Age of Exploration-Inspired Optimizer (AEIO) algorithm. Our approach demonstrates exceptional forecasting capabilities by utilizing 8 years of comprehensive data – including displacement, groundwater levels, and meteorological information from the Lushan (mountainous) region in Taiwan. The AEIO–MobileNet model precisely predicts imminent deep-seated landslide displacement with a mean absolute percentage error (MAPE) of 2.81 %. These advancements significantly enhance geohazard informatics by providing reliable and efficient tools for landslide risk assessment and management. They help safeguard road networks, construction projects, and infrastructure in vulnerable slope areas.

Received: 11 May 2024 – Discussion started: 10 Jun 2024 – Revised: 29 Aug 2024 – Accepted: 11 Oct 2024 – Published: 06 Jan 2025

1 Introduction

Landslides are among the most devastating natural disasters (Huang and Fan, 2013), claiming an average of over 4000 lives annually worldwide between 2004 and 2010 (Petley, 2012). Landslides represent a global hazard, particularly in developing countries, where rapid urbanization, population growth, and significant land use changes occur (Caleca et al., 2024). The identification, management, and monitoring of landslides are made difficult by the diversity of their types (shallow slides, deep-seated slides, rockfalls, rock slides, debris flows) and the complexity of their categorization based on triggers, material composition, movement speed, and other characteristics (Das et al., 2022; Hungr et al., 2014). These issues are further exacerbated in countries with complex geological and climatic conditions.

A deep-seated landslide involves the gradual and persistent displacement of a substantial amount of soil and rock, which can escalate into a sudden and devastating event (Kilburn and Petley, 2003; Geertsema et al., 2006; Chigira, 2009). Unlike shallow landslides, which typically affect surface layers to a few meters deep, deep-seated landslides extend deeper, often exceeding 10 m, and can involve the movement of underlying bedrock (Lin et al., 2013). Predicting these events is challenging and costly (Thai Pham et al., 2019), but extensive efforts have been made to do so (Corominas and Moya, 2008; David and Raymond, 1989; Aleotti and Chowdhury, 1999). One method that has been employed involves thoroughly examining the physical and geological characteristics of the mountainous areas at risk of landslides (Cotecchia et al., 2020). Furthermore, numerous studies have shown that the groundwater level significantly influences the mechanisms behind landslide formation (Miao and Wang, 2023; Preisig, 2020; Iverson and Major, 1987).

In pursuing a generalized approach to landslide forecasting, researchers have determined that the critical factors associated with slope instability exhibit temporal variability, necessitating the use of time series data (Chae et al., 2017). This approach combines slope deformation data collected through sensors drilled deep into the slope bed with data on the natural conditions of the monitoring area, which are collected simultaneously. Upon establishing that the data pertinent to landslide prediction fall within the category of time series data, a formidable challenge in research related to this type of disaster is devising a predictive model capable of forecasting the likelihood of such catastrophes based on related factors.

One of the most effective solutions for constructing models to predict time series data involves applying data-driven techniques. The advancement of computational capabilities has driven the widespread adoption of data-driven machine learning models over physics-based models. This shift is based on the premise that the data used for slope monitoring originate from nonlinear systems (Zhou et al., 2018). However, a significant drawback of traditional machine learning models, such as random forest and support vector machines, is the difficulty they have handling spatiotemporal data. These models struggle to capture the sequential relationships necessary for landslide prediction, resulting in lower performance (T. Zhang et al., 2022; Tehrani et al., 2022).

An increasing array of novel data-driven solutions are being developed to overcome the constraints of traditional machine learning approaches. Among these data-driven solutions, convolutional neural networks (CNNs) have emerged as one of the most effective methods. These CNN models, which excel at automated feature extraction, can enhance efficiency in analyzing complex datasets and improve the accuracy of prediction results (Alzubaidi et al., 2021).

Moreover, there is a noteworthy recent trend in employing metaheuristic optimization algorithms to fine-tune the hyperparameters of artificial intelligence (AI) models, thereby augmenting the models' efficiency. This approach has found applications in geological and construction studies and other fields, showcasing substantial effectiveness. Consequently, the fine-tuning of hyperparameters represents a potent avenue for elevating the efficiency of AI models in research focused on predicting deep-seated landslide displacement.

Leveraging the effective methodologies mentioned above, this study employs AI models optimized by an innovative metaheuristic optimization algorithm to predict deep-seated landslide displacement on the northern slope of Lushan in Ren'ai Township, Nantou County, Taiwan. The geological characteristics of this area have been extensively researched (Wang et al., 2015; Lin et al., 2020). Previous studies have identified varying depths of the shear plane. Specifically, Lin et al. (2020) determined that the depth of the shear plane is 85 and 106 m based on inclinometer data. Our research paper is firmly grounded in empirical evidence meticulously collected over 8 years from extensometers at depths of 70 and 40 m. Our analysis also considers the cumulative impact of storms and heavy rainfall on groundwater levels, utilizing data from four stations measuring groundwater levels in the study area and other weather conditions that potentially trigger landslides. The objectives of our research are as follows:

to analyze the application of machine learning and deep learning methods to time series data to forecast short-term, deep-seated landslide displacement across the Lushan area;
to identify the optimal model and hyperparameters for accurately forecasting deep-seated landslide displacement in the study area;
to evaluate the role of metaheuristic optimization algorithms in fine-tuning the hyperparameters of AI models.

This study represents the first instance of AI models being utilized to predict deep-seated landslides on Lushan. Additionally, it marks the inaugural application of AEIO for fine-tuning AI models in landslide-related research. Our findings serve as a valuable resource for civil engineers, contractors, and inspectors involved in the planning and overseeing of construction projects in landslide-prone areas. Predicting the likelihood of landslide events can help minimize property loss, guide schedule adjustments, improve work safety, and ensure smooth traffic flow during critical periods. Additionally, understanding internal displacement provides engineers with precise data to evaluate the resilience of structures and infrastructure in vulnerable areas, enabling the issuance of prudent warnings.

2 Literature review

2.1 Groundwater levels and the forecasting of deep-seated landslide displacement

Landslide triggers can be attributed to loading, slope geometry, weather conditions, and hydrological conditions (Perkins et al., 2024; van Natijne et al., 2023; Millán-Arancibia and Lavado-Casimiro, 2023; Jones et al., 2023). Among these, hydrological conditions, especially groundwater levels, have been one of the most critical elements considered in studies related to landslide prediction. Numerous studies have substantiated this point. For instance, research by Take et al. (2015) demonstrated that the distance and velocity of landslides triggered under high-antecedent-groundwater conditions are much greater than under drier conditions. Another study has shown that water accumulation at a soil–bedrock contact can develop positive pore water pressures, causing landslides (Matsushi and Matsukura, 2007) (see Fig. 1). Studies of past landslide events have reported similar findings. One example is the Tessina landslide in northeastern Italy, where groundwater conditions triggered movement (Petley et al., 2005). Additionally, the study by Keqiang et al. (2015) on water-induced landslides in the Three Gorges Reservoir project area highlights the significant impact of hydrological conditions on the likelihood of such disasters.

Figure 1. Schematic illustration showing the effects of groundwater on deep-seated slope failure.

Similarly, Preisig (2020) developed a groundwater prediction model for analyzing the stability of a compound slide in the Jura Mountains. Additionally, Srivastava et al. (2020) explored machine learning algorithms to forecast rainfall and established thresholds for landslide probabilities. Although the research by Srivastava et al. (2020) did not directly rely on groundwater levels to predict landslides, it is evident that rainfall, a crucial factor in their study for landslide prediction, also influences hydrological conditions. Therefore, their research further underscores the importance of considering groundwater levels in landslide prediction.

The northern slope in the Lushan area of central Taiwan, the region investigated in this study, exhibits significant gravitational slope deformation, making it prone to landslides during typhoons or heavy-rainfall events. Lin et al. (2020) conducted in-depth studies on the mechanisms of landslide occurrence based on the geological conditions of the area. While successfully providing valuable insights into the evolution of deep-seated gravitational deformations, their study focuses exclusively on employing traditional analytical methods in geological research, such as analyzing data from geotechnical instruments and conducting geological borehole analysis.

Our research aims to adopt a novel approach compared to previous landslide studies at Lushan by utilizing AI models and metaheuristic optimization algorithms. This research will utilize temperature, humidity, and groundwater levels as input data for AI models to predict deep-seated landslide displacement, thus aiding in landslide forecasting in this region.

2.2 Forecasting slope displacements: conventional methods

Several conventional methods are commonly employed to predict deep-seated slope displacement. These methods primarily involve simulating factors affecting slope stability in landslide-prone areas using data collected from ground-based monitoring devices. An early approach to predicting deep-seated slope movements is geotechnical mapping. This technique characterizes rock and soil strength, density, and porosity.

For instance, Crosta and Agliardi (2003) analyzed the geology and rock mass behavior using Voight's semi-empirical failure criterion, incorporating time-dependent factors to generate velocity curves that indicate risk levels. Recently, Xu et al. (2018) utilized real-time remote monitoring systems to measure internal stress, deep displacement, and surface strain. These data were used to formulate forecasting models to assess slope stability, particularly in railway construction. However, a common challenge with this method is the instability of and frequent changes in the terrain and geology of landslide-prone areas. This necessitates constant updates to the computational model, which can be time-consuming and labor-intensive.

Moreover, physically based numerical and laboratory modeling methods are also gaining attention in landslide research. These methods aim to maintain forecasts using various data types while reducing human workload and ensuring high accuracy. For example, Mufundirwa et al. (2010) conducted a laboratory study to examine the effectiveness of the inverse velocity model in predicting rock mass destruction resulting from landslides at depths of 2 and 4 m along the sliding plane. This study utilized historically recorded data from Asamushi, Japan, and the Vajont Reservoir in Italy (Mufundirwa et al., 2010).

Meanwhile, Wu (2010) employed the numerical discontinuous deformation analysis method to simulate a blocky assembly's post-failure behavior, incorporating earthquake seismic data. Following this trend, Jiang et al. (2011) utilized the fluid–solid coupling theory to simulate displacement and capture the interaction between fluid and solid materials. However, both numerical models and laboratory modeling methods require substantial effort from researchers. These approaches demand deep expertise and the development of complex models. More importantly, they rely heavily on assumptions during the simulation process and may not reflect real-world conditions, leading to significant errors in accuracy.

Stability analysis is another commonly used method related to physics, which evaluates the forces acting on slope behavior. Fu and Liao (2010) presented a technique for implementing the nonlinear Hoek–Brown shear strength reduction, determining the correlation between normal and shear stress based on the Hoek–Brown criterion. Subsequently, the micro-units' (micro-units are microscopic components of the rock mass) instantaneous friction angle and cohesive strength under specific stress conditions are calculated.

Although this approach effectively addresses cost and labor issues, it still heavily relies on the researchers' assumptions and is limited by the ability to utilize only a small amount of data from the research area. Additionally, there are several other limitations. For instance, Mebrahtu et al. (2022) indicated that stability analyses become less reliable in seismic-load scenarios. Safaei et al. (2011) also noted that stability analysis necessitates a substantial amount of detailed input data obtained from laboratory tests and field measurements, thereby limiting the areas that can be effectively assessed.

As previously mentioned, using conventional methods poses significant challenges, as their application requires a deep understanding of both the physics involved and the complex behavior of soil. In addition, traditional methods require specific types of input data, highlighting the rigidity and lack of flexibility inherent in these approaches (Safaei et al., 2011). In contrast, AI models can overcome these difficulties by automatically learning to identify mapping functions between input and output data, eliminating the need for users to have specialized knowledge of soil behavior and physics. Additionally, AI models can be updated to incorporate new input variables, offering flexibility to leverage available data based on real-world conditions. Therefore, AI models will be utilized in this research instead of conventional methods.

2.3 Forecasting slope displacements: machine learning and deep learning

In studies employing machine learning and deep learning models for landslide research, a plethora of research utilizes discrete data to train AI models to predict the probability of landslides or to construct maps depicting landslide susceptibility. For instance, Pradhan and Lee (2010) used a geographic information system (GIS), remote sensing, and a neural network model to analyze landslide susceptibility in the Cameron Highlands, Malaysia. Ten factors, including topographic slope and drainage distance, were processed to generate a susceptibility map. The model achieved 83 % accuracy in predicting landslide locations. In a similar study, Pham et al. (2016) used multiple AI models, including support vector machines (SVMs), logistic regression (LR), Fisher's linear discriminant analysis (FLDA), a Bayesian network (BN), and naïve Bayes (NB), for landslide susceptibility assessment in a region within the state of Uttarakhand, India. The SVM model yielded the best prediction results among the models used.

In addition to discrete data, many landslide studies utilize time series data. For forecasting with time series data, machine learning regression models, such as extreme learning machines (ELMs) (Li et al., 2018), least-squares support vector machines (LSSVMs) (Liu et al., 2019), dynamic neural networks (Aggarwal et al., 2020), random forests (RFs) (Hu et al., 2021), SVMs (Zhang et al., 2021), and Gaussian process regression (GPR) (Hu et al., 2019), have proven highly effective at yielding reliable results. These models also provide scalability and the ability to handle larger datasets. However, it is essential to note that machine learning models are sensitive to the white noise that is typical of time series features. This can pose challenges in capturing subtle behaviors and complex interrelationships, particularly when data availability is limited (Zhang et al., 2020). Finally, feature engineering (the process of selecting and transforming input variables to enhance the performance of the models) is computationally intensive and labor-intensive, limiting its applicability when rapid forecasting is required.

Alongside the machine learning models mentioned above, a range of neural network models, from simpler ones like artificial neural networks (ANNs) to more advanced approaches such as deep neural networks (DNNs) and CNNs, are also employed in research related to landslides (Kumar et al., 2017; Zheng et al., 2022). Notably, CNN models have become increasingly popular and are widely used in landslide research. CNN models often yield superior predictive results compared to other models in terms of landslide susceptibility assessment and displacement prediction (He et al., 2024).

Moreover, another research trend in landslide forecasting involves the use of time series deep learning models such as recurrent neural networks (RNNs), long short-term memory (LSTM), and gated recurrent units (GRUs), which use previous information to generate current outputs and provide state feedback (Yang et al., 2019; Xu et al., 2022; Yang et al., 2022; W. Zhang et al., 2022). These time series deep learning models can effectively capture patterns of changes over time, making them highly suitable for time series data in landslide-related studies. However, there has yet to be a comprehensive study that employs a combination of machine learning methods, time series deep learning, and CNN models to compare and determine the most suitable model for predicting landslide displacement. Therefore, our research aims to address this gap.

Another noteworthy research trend involves using AI models to predict landslides based on spatial–temporal data. For instance, Dahal et al. (2024) utilized spatial–temporal data to pinpoint where landslides may occur and predict when they might happen and the expected landslide area density per mapping unit. The ensemble neural network employed in this research yielded promising predictions, demonstrating its potential for forecasting landslides in Nepal's areas affected by the Gorkha earthquake. However, our study only managed to gather temporal data. Consequently, the AI models developed in our research will be trained to learn and forecast time series data.

2.4 Hybrid metaheuristic optimization algorithm and AI models in landslide prediction

In landslide-related research, numerous studies have employed hybrid models, wherein metaheuristic optimization algorithms optimize the hyperparameters of AI models. For example, Balogun et al. (2021) studied landslide susceptibility mapping in western Serbia. This research collected 14 different condition factors to serve as input data for the support vector regression (SVR) model to predict landslide occurrences. The study results indicate that SVR models, with hyperparameters fine-tuned by optimization algorithms such as gray wolf optimization (GWO), the bat algorithm (BA), and the cuckoo optimization algorithm (COA), all yielded better prediction results compared to using a single model.

Hakim et al. (2022) conducted a study utilizing CNN models optimized by GWO and the imperialist competitive algorithm (ICA) for landslide susceptibility mapping from geo-environmental and topo-hydrological factors in Incheon, South Korea. This research demonstrates that GWO and ICA effectively fine-tuned the CNN model, resulting in a highly accurate landslide susceptibility map.

Jaafari et al. (2022) employed an AI model known as the group method of data handling (GMDH) for classification purposes, optimizing it using the cuckoo search algorithm (CSA) and the whale optimization algorithm (WOA). In northwest Iran, the authors aimed to predict landslides based on various factors, including topographical, geomorphological, and other environmental factors. After training and testing, the GMDH–CSA model produced superior prediction results compared to the GMDH–WOA and the standalone GMDH model.

It is evident from numerous past studies on landslides that the application of metaheuristic optimization algorithms significantly enhances the predictive effectiveness of AI models. Therefore, this study also incorporates this approach to ensure the model's accuracy in landslide prediction. This study will employ a recently developed metaheuristic algorithm that includes a clustering technique, which shows promise in effectively fine-tuning hyperparameters for AI models.

3 Methodology

3.1 Machine learning

In addition to the deep learning models discussed earlier, machine learning models will be employed to predict deep-seated landslide displacement in this research. The machine learning models are linear regression (LR) (Stanton, 2001), ANN (McCulloch and Pitts, 2021), SVR (Drucker et al., 1996), classification and regression tree (CART) (Breiman, 1984), radial basis function neural network (RBFNN) (Han et al., 2010), and extreme gradient boosting (XGBoost) (Chen and Guestrin, 2016). These machine learning models will be used to make predictions and will be compared with the deep learning models.

3.2 Deep learning models for time series data

The RNN was introduced by Elman (1990). This model makes predictions based on sequential data, which is crucial for language modeling, document classification, and time series analysis. The architecture of an RNN model can be found in Appendix A. In this study, advanced variants of the RNN, namely LSTM and GRU models, are also utilized, and their effectiveness in predicting deep-seated landslides will be compared.

3.3 Convolutional neural networks

In 1998, LeCun introduced a novel type of DNN, known as CNN, specifically designed for processing data with a grid-like structure, such as images (Lecun et al., 1998). The complex layered system of CNN facilitates the automated extraction of features without extensive preprocessing, making it ideal for object recognition, image classification, and segmentation tasks. The detailed mechanism of the CNN model can be found in Appendix B.

This study will use various CNN models to predict deep-seated slope displacement. The CNN models employed in this research include VGG (Simonyan and Zisserman, 2015), ResNet (He et al., 2016), Inception (Szegedy et al., 2015), Xception (Chollet, 2017), MobileNet (Howard et al., 2017), DenseNet (Huang et al., 2017), and NASNet (Zoph et al., 2018). To clarify, the term “standard CNN models” will refer to models with structures that can be user-defined, while “retrained CNN models” will denote those with architectures that have been researched and developed by other scientists and have been proven to be highly effective.

CNN models are typically used for image processing tasks. However, the input data for this study are in numerical and vector form. Therefore, several transformation steps are required to convert these numerical and vector data into image data suitable for CNN input. Detailed information about these transformation steps can be found in the study of Chou and Nguyen (2023).

3.4 Data management and performance analysis

3.4.1 Data splitting and evaluation strategy

To obtain reliable (i.e., generalizable) evaluation and validation results, it is crucial that the data used for testing do not include the data used for training. Therefore, a dataset must be divided into training, validation, and testing subsets before training the AI model. Training data are used to learn patterns, testing data are used to assess model performance and identify errors, and validation data are used to fine-tune the hyperparameters. In the current study, we opted to refrain from employing cross-validation, which tends to be time-consuming. Instead, we adopted the holdout approach to manage our large dataset with well-represented target variables (Fig. 2). A 90:10 ratio is generally used to split datasets into learning and testing data (Di Nunno et al., 2023). When implementing the holdout method during hyperparameter optimization, 20 % of the learning data are used for validation, and the remaining 80 % are used for training.
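The holdout scheme described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `holdout_split` is hypothetical, and the split is assumed to be chronological (no shuffling), which the text does not state.

```python
def holdout_split(samples, test_frac=0.10, val_frac=0.20):
    """Split an ordered sequence into training, validation, and testing parts.

    Assumption: samples are kept in their original (time) order; the source
    does not say whether the data are shuffled before splitting.
    """
    n = len(samples)
    n_test = round(n * test_frac)
    learning = samples[: n - n_test]            # 90 % learning data
    test = samples[n - n_test:]                 # 10 % testing data
    n_val = round(len(learning) * val_frac)
    train = learning[: len(learning) - n_val]   # 80 % of the learning data
    val = learning[len(learning) - n_val:]      # 20 % of the learning data
    return train, val, test


train, val, test = holdout_split(list(range(1000)))
print(len(train), len(val), len(test))  # 720 180 100
```

With 1000 samples, the 90:10 split followed by the 80:20 split leaves 72 % of all data for training, 18 % for validation, and 10 % for testing.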

Figure 2. Data are split under the proposed holdout scheme.

3.4.2 Performance evaluation metrics

This study utilized three widely recognized performance measures to assess the model's effectiveness in prediction accuracy (Chou and Nguyen, 2023). The measures included mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE).

MAE represents the mean of absolute errors, calculated as the average of the absolute differences between actual and predicted values. Its advantage lies in its simplicity, which provides a straightforward measure of average prediction error. However, a drawback of MAE is that it weights all errors linearly, so it may not effectively highlight differences between models when a few large errors are present. It is defined as

$$ \mathrm{MAE} = \frac{1}{n} \sum_{i = 1}^{n} \left| y_{i} - \hat{y}_{i} \right|, \tag{1} $$

where n is the number of predictions, $y_{i}$ is the ith actual value, and $\hat{y}_{i}$ is the corresponding ith forecasted value.

MAPE is the average of the absolute errors, each expressed as a fraction of the corresponding actual value. It provides a clear metric in percentage terms, facilitating straightforward interpretation across various datasets. However, MAPE is undefined when an actual value is zero and becomes unstable when actual values are close to zero, which limits its utility in such scenarios. The expression for MAPE is as follows:

$$ \mathrm{MAPE} = \frac{1}{n} \sum_{i = 1}^{n} \left| \frac{y_{i} - \hat{y}_{i}}{y_{i}} \right|, \tag{2} $$

where n is the number of predictions, $y_{i}$ is the ith actual value, and $\hat{y}_{i}$ is the corresponding ith forecasted value.

RMSE represents the square root of the average squared error between actual and forecasted values and is widely used for its ability to indicate the dispersion of errors. Because the errors are squared, this measure captures their magnitude but not their direction, making it practical for assessing overall prediction accuracy. However, RMSE tends to be more sensitive to outliers and large errors than MAE due to its squaring of errors during computation. This sensitivity can disproportionately affect its evaluation in datasets with extreme values. The expression for RMSE is as follows:

$$ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} \left( y_{i} - \hat{y}_{i} \right)^{2}}, \tag{3} $$

where n is the number of predictions, $y_{i}$ is the ith actual value, and $\hat{y}_{i}$ is the corresponding ith forecasted value.
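As a small worked illustration of Eqs. (1) to (3), the three measures can be computed as follows. The function names and the sample numbers are hypothetical; the MAPE denominator is taken to be the actual value, following the description of MAPE as an error relative to the actual value.

```python
import math


def mae(actual, predicted):
    """Eq. (1): mean of the absolute errors."""
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Eq. (2): mean absolute error relative to the actual value.

    Returned as a fraction; multiply by 100 to report a percentage.
    Raises ZeroDivisionError if any actual value is zero.
    """
    return sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Eq. (3): square root of the mean squared error."""
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual))


# Made-up numbers, used only to exercise the formulas.
actual = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]
print(mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```

For these values the absolute errors are 1, 2, and 0, so MAE is 1.0, MAPE is (0.1 + 0.1 + 0) / 3 ≈ 0.067, and RMSE is the square root of 5/3 ≈ 1.29. RMSE exceeds MAE because the larger error is weighted more heavily.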

3.5 Age of Exploration-Inspired Optimizer

This study employs a range of AI models to forecast deep-seated landslide displacement in mountainous regions. To enhance the prediction accuracy of these AI models, the study incorporates a novel metaheuristic optimization algorithm known as the Age of Exploration-Inspired Optimizer (AEIO). Developed by Chou and Nguyen in 2024, this algorithm has demonstrated high effectiveness in fine-tuning the hyperparameters of AI models. This algorithm treats each particle in the search domain as an explorer. The movement of particles toward regions with higher fitness values parallels the exploratory activities of the Age of Exploration, where explorers sought ideal locations for establishing colonies. In this study, each particle represents a set of hyperparameters, with the ultimate goal of the search process being to identify the optimal particle or hyperparameter set that minimizes prediction error for AI models. Figure 3 illustrates the AEIO algorithm.

Figure 3. Illustration of the Age of Exploration-Inspired Optimizer.

The strength of the AEIO algorithm lies in its ability to develop specific strategies for particles based on their positions, enabling faster convergence to the optimal point, and in its use of density-based spatial clustering of applications with noise (DBSCAN) for particle clustering. DBSCAN is an unsupervised clustering method that organizes data points by their spatial closeness in high-dimensional spaces (Ester et al., 1996). This algorithm is particularly effective at detecting clusters of different shapes and densities. It relies on two primary parameters: ε (the radius of the neighborhood) and MinPts (the minimum number of points required to form a dense area). Clusters are created by locating neighboring points with enough surrounding points, while those that do not belong to any cluster are classified as noise or outliers.
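To make the role of the two parameters concrete, the following is a compact, pure-Python sketch of the DBSCAN procedure of Ester et al. (1996). It is illustrative only: the parameter values and sample points are made up, and how AEIO configures DBSCAN internally is not specified here.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE (-1)."""
    n = len(points)
    # A point's eps-neighborhood includes the point itself.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE            # may be relabeled later as a border point
            continue
        cluster += 1                     # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])   # j is also a core point: keep expanding
    return labels


# Hypothetical 2-D positions: two tight groups and one isolated point.
positions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(positions, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

In the optimizer's terms, the points labeled 0 or 1 would be explorers inside a cluster, and the point labeled -1 would be an explorer outside any cluster.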

Using the DBSCAN algorithm, the AEIO determines whether particles are in favorable or unfavorable positions, reminiscent of explorers during the Age of Exploration. Proximity to others (within clusters) allows explorers to gather information and move toward optimal locations, thereby enhancing their ability to establish new colonies. In contrast, explorers far apart (outside clusters) adopt different strategies, relying on limited peer guidance or general trends in their quest for new territories.

In each iteration, explorers forecast their next move. If it promises a better position, they relocate. Otherwise, if the new spot is less favorable for colony establishment, they stay put and await the next iteration. The algorithm employs specific mathematical formulas to calculate the movement step of explorers or particles in the AEIO. The exploratory steps of an explorer in the AEIO algorithm will be continuously iterated until the stop condition is satisfied.
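The move-or-stay rule described above amounts to a greedy acceptance test. The sketch below assumes a minimization problem (consistent with minimizing prediction error); the names `accept_if_better` and `sphere` are hypothetical and used only for illustration.

```python
def accept_if_better(current, candidate, objective):
    """Return the candidate position only if it lowers the objective; otherwise stay."""
    if objective(candidate) < objective(current):
        return candidate
    return current


def sphere(x):
    # Toy objective standing in for a model's prediction error.
    return sum(v * v for v in x)


kept = accept_if_better([1.0, 1.0], [2.0, 2.0], sphere)   # worse spot: stay put
moved = accept_if_better([1.0, 1.0], [0.5, 0.5], sphere)  # better spot: relocate
```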

3.5.1 Explorers follow general trends

An explorer choosing this movement type calculates the distance from its location x_i,d(t) to the centroid of all explorers (Meanvl_d(t)) and then attempts to move toward that central point in the hope of finding a better location with the potential to establish a new colony. The following formulas determine the explorer's position after the movement:

$$
x_{i, d} (t + 1) = x_{i, d} (t) + \alpha \cdot \left( \mathrm{Meanvl}_{d} (t) - x_{i, d} (t) \right) \times \mathrm{rand} (0, 1) \times R, \tag{4}
$$

$$
\mathrm{Meanvl}_{d} (t) = \frac{x_{1, d} (t) + x_{2, d} (t) + \dots + x_{n_{Pop}, d} (t)}{n_{Pop}}, \tag{5}
$$

where $d = 1, 2, \dots, D$, with D being the number of dimensions; $i = 1, 2, \dots, n_{Pop}$; n_Pop is the total number of explorers; $t = 1, 2, \dots, MaxIt$; MaxIt is the maximum number of iterations; α is a parameter for adjusting the particle's movement toward the centroid position (usually equal to 3); Meanvl_d(t) is the centroid of all particles in dimension d; rand(0, 1) is a random number in the range [0, 1]; R is a number that equals 1 or 2 depending on the value of rand(0, 1), per the equation $R = round (1 + rand (0, 1))$; x_i,d(t) is the location of particle i in dimension d at iteration t; and $x_{i, d} (t + 1)$ is its location at iteration (t+1).
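Equations (4) and (5) can be sketched as follows. Whether rand(0, 1) and R are drawn once per explorer or once per dimension is not stated here; the sketch assumes independent draws per dimension, and the function names are hypothetical.

```python
import random


def centroid(population):
    """Eq. (5): per-dimension mean of all explorer positions."""
    n_pop = len(population)
    dims = len(population[0])
    return [sum(p[d] for p in population) / n_pop for d in range(dims)]


def move_toward_centroid(x_i, population, alpha=3.0, rng=random):
    """Eq. (4): step from x_i toward the centroid, scaled by alpha, rand(0, 1), and R."""
    mean = centroid(population)
    new_position = []
    for d, x in enumerate(x_i):
        r = rng.random()                 # rand(0, 1), assumed drawn per dimension
        big_r = round(1 + rng.random())  # R equals 1 or 2
        new_position.append(x + alpha * (mean[d] - x) * r * big_r)
    return new_position


population = [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0]]
center = centroid(population)
candidate = move_toward_centroid([0.0, 0.0], population, rng=random.Random(0))
```

With α = 3 and R up to 2, the step can be as large as six times the distance to the centroid, so a candidate may land well beyond it.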

3.5.2 Explorers follow three other peers

Explorers employing this movement method will calculate the average position of three other randomly selected explorers $(\frac{x_{1, d} (t) + x_{2, d} (t) + x_{3, d} (t)}{3})$ and then move toward this newly calculated average position. The explorer's new position is computed using the following formula:

$$
x_{i, d} (t + 1) = x_{i, d} (t) + \left( \frac{x_{1, d} (t) + x_{2, d} (t) + x_{3, d} (t)}{3} - x_{i, d} (t) \right) \times \mathrm{rand} (0, 1) \times R, \tag{6}
$$

where x_1,d(t), x_2,d(t), and x_3,d(t) are the positions of three randomly selected explorers in dimension d at iteration t, with $d = 1, 2, \dots, D$ and D being the number of dimensions; $i = 1, 2, \dots, n_{Pop}$; n_Pop is the total number of explorers; $t = 1, 2, \dots, MaxIt$; and MaxIt is the maximum number of iterations.
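A minimal sketch of Eq. (6) follows. It assumes the three peers are distinct and differ from explorer i, and that the random factors are drawn per dimension; neither detail is specified in the text, and the function name is hypothetical.

```python
import random


def move_toward_three_peers(i, population, rng=random):
    """Eq. (6): step from explorer i toward the mean of three other random explorers."""
    others = [k for k in range(len(population)) if k != i]
    a, b, c = rng.sample(others, 3)  # three distinct peers, none equal to i
    new_position = []
    for d, x in enumerate(population[i]):
        peer_mean = (population[a][d] + population[b][d] + population[c][d]) / 3
        r = rng.random()                 # rand(0, 1)
        big_r = round(1 + rng.random())  # R equals 1 or 2
        new_position.append(x + (peer_mean - x) * r * big_r)
    return new_position


# With four explorers, the peers of explorer 0 are necessarily explorers 1, 2, and 3.
population = [[0.0], [3.0], [6.0], [9.0]]
candidate = move_toward_three_peers(0, population, rng=random.Random(1))
```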

3.5.3 Explorers follow the best one

According to this strategy, the explorer (x_i,d(t)) will move closer to the position of another explorer currently holding the best position (Best_d(t)), as determined by the following formula:

$$
x_{i, d} (t + 1) = x_{i, d} (t) + \left( \mathrm{Best}_{d} (t) - x_{i, d} (t) \right) \times \mathrm{rand} (0, 1) \times R, \tag{7}
$$

where Best_d(t) represents the position of the particle with the best fitness in dimension d at iteration t; the parameters d and t hold the same meaning as defined in Eq. (6).
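Equation (7) can be sketched as below, assuming a minimization objective and per-dimension random draws. The `objective` function and the example population are hypothetical stand-ins for a model's prediction error and a set of hyperparameter vectors.

```python
import random


def move_toward_best(x_i, best, rng=random):
    """Eq. (7): step from x_i toward the best explorer's position."""
    new_position = []
    for x, b in zip(x_i, best):
        r = rng.random()                 # rand(0, 1)
        big_r = round(1 + rng.random())  # R equals 1 or 2
        new_position.append(x + (b - x) * r * big_r)
    return new_position


def objective(x):
    # Toy objective; lower is better.
    return sum(v * v for v in x)


population = [[3.0, -1.0], [0.5, 0.5], [-2.0, 2.0]]
best = min(population, key=objective)
candidate = move_toward_best(population[0], best, rng=random.Random(0))
```

Because rand(0, 1) × R can exceed 1, the explorer may overshoot the best position rather than stop short of it.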

3.5.4 Explorers follow guidance from another explorer

Explorers in favorable positions with access to information can execute this movement strategy. In this scenario, an explorer (x_i,d(t)) consults another explorer. The consulted explorer compares its own direction and distance to the best individual, who holds the most favorable position (Best_d(t)), and guides the inquirer accordingly. The algorithm assumes that the consulted explorer can be any explorer, i.e., a random explorer (x_1,d(t)). The following formula describes how to calculate the new position of the explorer following this strategy:

$$
x_{i, d} (t + 1) = x_{i, d} (t) + \left( \mathrm{Best}_{d} (t) - x_{1, d} (t) \right) \times \mathrm{rand} (0, 1) \times R, \tag{8}
$$

where x_1,d(t) is a random explorer in dimension d at iteration t. The parameters d and t hold the same significance as defined in Eq. (6).
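In Eq. (8) the displacement is measured from the random explorer to the best explorer and then applied to the moving explorer's own position. A minimal sketch under the same per-dimension sampling assumption, with hypothetical names:

```python
import random


def move_with_guidance(x_i, advisor, best, rng=random):
    """Eq. (8): apply the advisor-to-best displacement to x_i's own position."""
    new_position = []
    for x, a, b in zip(x_i, advisor, best):
        r = rng.random()                 # rand(0, 1)
        big_r = round(1 + rng.random())  # R equals 1 or 2
        new_position.append(x + (b - a) * r * big_r)
    return new_position


candidate = move_with_guidance([5.0], advisor=[0.0], best=[1.0], rng=random.Random(0))
unchanged = move_with_guidance([5.0], advisor=[1.0], best=[1.0])
```

If the random explorer already sits at the best position, the displacement is zero and the moving explorer stays where it is.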

3.5.5 Crowd control mechanism

To enhance the efficiency of AEIO in transitioning between exploration and exploitation, a mechanism is employed to adjust the parameters of DBSCAN throughout each cycle, according to the following formulas:

$$
\varepsilon_{d} = \left( 0.1 + \frac{t}{MaxIt} \right) \times \left( \mathrm{Meanvl}_{d} (t) - \mathrm{Best}_{d} (t) \right), \tag{9}
$$

$$
\mathrm{MinPts} = \mathrm{round} \left( 1 + \frac{t}{MaxIt} \times 10 \right). \tag{10}
$$

The exploratory steps in the AEIO algorithm begin by classifying positions using the DBSCAN algorithm. Subsequently, the explorers update the crowd control mechanism according to Eqs. (9) and (10), and move according to various strategies defined by Eqs. (4), (6), (7), and (8). This process is conducted iteratively until the maximum number of iterations is reached.
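The schedule in Eqs. (9) and (10) can be sketched as a small helper that returns the DBSCAN settings for iteration t. The function name is hypothetical, and the sketch evaluates Eq. (9) exactly as written, so the per-dimension radius carries the sign of the centroid-to-best difference; how a negative value is handled in practice is not stated here.

```python
def crowd_control(t, max_it, mean_d, best_d):
    """Return (eps per dimension, MinPts) for iteration t, following Eqs. (9) and (10)."""
    scale = 0.1 + t / max_it
    # Eq. (9): signed difference between centroid and best position, per dimension.
    eps = [scale * (m - b) for m, b in zip(mean_d, best_d)]
    # Eq. (10): the minimum neighborhood size grows from about 1 to 11 over the run.
    min_pts = round(1 + t / max_it * 10)
    return eps, min_pts


eps_mid, min_pts_mid = crowd_control(50, 100, [2.0, 5.0], [1.0, 3.0])
_, min_pts_start = crowd_control(1, 100, [2.0], [1.0])
_, min_pts_end = crowd_control(100, 100, [2.0], [1.0])
```

Both the radius scale and the minimum number of points increase with t, so the criteria for belonging to a cluster change as the search progresses.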

To fine-tune the hyperparameters of AI models, the AEIO algorithm treats each hyperparameter as a variable. Furthermore, the objective function of the AEIO algorithm seeks to minimize the prediction error of AI models, which is quantified by an evaluation metric (MAPE). Figure 4 presents a flowchart illustrating the process by which the AEIO algorithm aids in fine-tuning hyperparameters for AI models.
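For reference, the MAPE used as the objective can be computed as in the sketch below. It uses the standard definition and assumes that no actual value is zero; the function name and the sample numbers are illustrative only.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Two predictions, each off by 10 % of the actual value.
error = mape([100.0, 200.0], [90.0, 220.0])
```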

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f04

Figure 4. Flowchart of the fine-tuning process of AI models by the AEIO algorithm.


4 Lushan hot springs: geography and geology

4.1 Research area

The current study focuses on the northern slope of the Lushan hot springs in Ren'ai Township, Nantou County, Taiwan (Fig. 5), with Nenggao Mountain to the east, the Hehuan Peaks to the north, Zhuoshe Mountain to the south, and the Puli Basins to the west. The terrain features rugged mountain ranges, incipient valleys, and notable river erosion (Lee and Chi, 2011). The Lushan hot springs are located below the hill, and the main access roads for nearby settlements and hot-spring sites include Provincial Highway 14 and County Highway 87.

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f05

In an earlier study of deep landslides in this area, Lin et al. (2020) reported that the Lushan slope exhibits large-scale deep-seated gravitational slope deformation, characterized by a steep scarp, a gently inclined head, and a curving river at its base. Figure 5 shows the distribution of four survey boreholes (G18, G20, G21, and G25) along the slope, and Fig. 6 illustrates the geological details of the research area in these boreholes. Drilling revealed three distinct lithological units: regolith, slate, and meta-sandstone. Additionally, Lin et al. (2020) identified the depths of failure planes in these survey boreholes from inclinometer data. Specifically, boreholes G18 and G25 did not record any failure planes, while boreholes G20 and G21 recorded failure planes at depths of 85 and 106 m, respectively.

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f06

Figure 6. Illustration of geological drilling surveys.


First, the topmost regolith layer was less than 10 m thick. Second, slate predominated, with sporadic weathering that produced brecciated patterns. This material frequently broke into breccia and gouges, particularly along cleavage planes and thin shear zones, indicating its susceptibility to collapse. This geological layer is identified as the primary cause of landslide risk in the area. Finally, meta-sandstone occurred only intermittently in the drilled samples and was characterized by its fragility and fractures.

Previous research has detected signs of brittle deformation in the area. These indications include chevron folds within fractures, visible cracks, and intricate jigsaw-puzzle-like patterns at the head of the rock formations. Overturned and flexural toppling fractures are prevalent toward the toe of the slope. Additionally, kink bands are observable on fractures recently undergoing flexural folding along the eastern boundary. Notably, horizontal fractures near the toe region also exhibit inter-fracture gouges. Further details on this geological information can be found in the study by Lin et al. (2020). These instances highlight the potential for significant geological changes and landslide risk in this region.

4.2 Data collection

In this study, hourly data on deep-seated landslide displacement and groundwater level were collected by a research group from the Department of Civil Engineering, College of Science and Technology, National Chi Nan University, over 8 years from July 2009 to June 2017, yielding 68 317 data points. The installation time points and locations are presented in Table 1 and Fig. 5, respectively.

The data used in this study were collected using an in-hole telescopic gauge (E-2), a multidirectional shape acceleration array sensor (SAA) with an underground displacement gauge, and four groundwater-level gauges (A-17, A-18-2, A-20, and A-24). The transmission, storage, and processing of data are described in detail in the research of Lau et al. (2023).

The operation of the in-hole extensometer entailed the installation of a borehole through the sliding surface. One end of a steel cable was anchored at the bottom, and a displacement gauge was placed at the free end to measure deformations automatically. The fixed stops for E-2 and SAA were situated at depths of 70 and 40 m below the surface, respectively. In addition to groundwater-level data, information regarding significant rainfall events in this area was also measured and is presented in Table 2.

Table 1. Device installation time points. A dash (–) denotes no data.


Based on the collected data, analyses have examined the correlation between groundwater levels and deep-seated landslide displacement at Lushan. To observe this correlation, graphs illustrating the precipitation of recorded heavy rainfall (Fig. 7a), variations in displacement (Fig. 7b and c), and groundwater levels (Fig. 7d) over time have been plotted.

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f07

Figure 7. Unified timeline visualization of data in this study: (a) precipitation of heavy rainfall recorded in the studied area; (b) measured displacements from the SAA extensometer; (c) measured displacements from the E-2 extensometer; and (d) groundwater levels at stations A-17, A-18-2, A-20, and A-24.


Figure 7 shows that the displacement values at both stations often exhibit significant increases coinciding with periods of pronounced fluctuations in groundwater levels. Specifically, in June 2012, there was a notable surge in groundwater levels attributed to heavy rainfall from 8 to 17 June 2012, totaling 1029 mm over 219 h (as indicated in Table 2 and Fig. 7a). The abnormal rise in groundwater levels led to increased pore water pressure, which triggered deep-seated landslide displacement at both stations, namely E-2 and SAA, as evidenced in Fig. 7b and c.

Table 2. Heavy-rainfall events in the study area.


Similar events occurred in November 2017. Heavy rainfall totaling 638.5 mm over 178 h during this period also caused a sudden change in groundwater levels, resulting in significant deep-seated landslide displacement. A comparison shows that there were up to 13 instances of anomalous heavy rainfall during the study period. However, not every heavy-rainfall event produced groundwater-level fluctuations large enough to cause substantial displacement. Hence, groundwater-level data rather than rainfall data are used to predict deep-seated landslide displacement.

In addition to groundwater-level data, weather factors such as temperature and humidity are also utilized as input data for the prediction model. This study includes temperature as an input variable for AI models to predict deep-seated landslide displacement due to its impact on soil structure. Elevated temperatures can cause thermal expansion of soil particles, which can increase pore water pressure and reduce effective frictional resistance forces (Pinyol et al., 2018). Additionally, previous research has shown a relationship between temperature and the likelihood of landslides in clay-rich soils, which are also present in the geological composition of Lushan (Shibasaki et al., 2017; Loche and Scaringi, 2023).

This study collected groundwater-level and displacement data on-site using sensors. Temperature and humidity data were obtained from the following website: https://power.larc.nasa.gov (last access: 8 January 2024). This dataset is part of the Prediction of Worldwide Energy Resources (POWER) project, developed by the National Aeronautics and Space Administration (NASA) of the United States. The POWER solar data are derived from satellite observations, which are used to infer surface insolation values. Meteorological parameters are sourced from the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2), assimilation model. The primary solar data are available with a global resolution of $1^{\circ} \times 1^{\circ}$ latitude × longitude, while the meteorological data are provided at a finer resolution of $\frac{1}{2}^{\circ} \times \frac{5}{8}^{\circ}$ latitude × longitude. Users can download the data at hourly, daily, or monthly resolution through the abovementioned website.

Table 3 displays the input and output variables for AI models to predict deep-seated landslide displacement at Lushan. Two datasets will be generated: one for predicting displacement at the E-2 station and another for predicting displacement at the SAA station. Table 4 outlines the number of data points for each dataset and illustrates how the data are divided into training and testing sets.

Table 3. Input and output variables of a model predicting deep-seated landslide displacement.


Table 4. Number of data points.


4.3 Data preprocessing

First, the data in this study undergo a normalization process that scales all features to a consistent range (typically between 0 and 1). This step prevents features with large numeric ranges from dominating those with small ranges, so that each feature contributes comparably to the model, thereby enhancing overall prediction accuracy (Han et al., 2006).
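A minimal sketch of scaling one feature to the range [0, 1] is given below. It assumes min-max scaling, which is one common way to obtain that range; the exact normalization formula is not given in this section, and the function name and sample values are hypothetical.

```python
def min_max_scale(values):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical groundwater-level readings.
scaled = min_max_scale([10.0, 15.0, 20.0])
```

In practice, the minimum and maximum are usually taken from the training set only and reused for the testing set, so that no information from the test period leaks into training.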

In the current study, the sliding-window technique is implemented after data normalization to organize data according to a specific time frame. This involves using historical data from previous steps to predict the output for subsequent steps (Chou and Ngo, 2016). The forecasting horizon refers to the length of time into the future for which output forecasts are made.

The basic process of the sliding-window technique is illustrated in Fig. 8. To train AI models, this study opts for a window size of 1 week (equivalent to 168 h). This fixed window size is used only for the individual AI models. In the hybrid model, the AEIO algorithm fine-tunes the window size together with the other hyperparameters to determine the most suitable settings.
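The sliding-window construction can be sketched as follows. The sketch assumes that each sample pairs `window` consecutive hourly rows with the target value `horizon` hours after the last row of the window; the exact alignment used in the study is not specified here, and the function name is hypothetical.

```python
def make_windows(features, target, window=168, horizon=24):
    """Pair each window of past rows with the target `horizon` steps after the window ends."""
    samples = []
    last_start = len(features) - window - horizon
    for start in range(last_start + 1):
        end = start + window               # exclusive end index of the window
        x = features[start:end]            # `window` consecutive rows of inputs
        y = target[end - 1 + horizon]      # value `horizon` steps after the last row
        samples.append((x, y))
    return samples


# Toy series of 10 hourly values with a 3 h window and a 2 h horizon.
series = list(range(10))
samples = make_windows(series, series, window=3, horizon=2)
```

With the defaults, a 168 h window and a 24 h horizon correspond to the 1 d ahead setting; setting `horizon=168` gives the 7 d ahead setting.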

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f08

Figure 8. Sliding-window technique.


This study focuses on predicting deep displacement values at two distinct time intervals: 1 d ahead (+24 h) and 7 d ahead (+168 h). These forecast horizons are strategically chosen to provide timely information, enabling management departments to make accurate decisions regarding evacuating people and assets from areas prone to landslides.

Specifically, for valuable assets and machinery that require time for relocation from landslide-prone areas, having advance knowledge of the landslide event 1 week ahead of relocation is crucial. Furthermore, for humans, animals, or other assets that can be evacuated more swiftly, predicting the landslide 1 d in advance is sufficient to ensure safety.

The predicted outputs are quantified in mm d⁻¹, facilitating decision-making for administrators according to the TGS-SLOPEM106 standard (Ruitang et al., 2017). Table 5 outlines suggested actions corresponding to different degrees of deep displacement as per the TGS-SLOPEM106 standard issued by the Taiwan government.

Table 5. Recommendations taken from TGS-SLOPEM106 for addressing displacement values in the early stages of deep sliding.


5 Model development and analysis results

5.1 Model development

Predicting deep-seated landslide displacement at Lushan is highly challenging because such landslides depend on numerous factors. Therefore, multiple methods will be employed to identify the optimal AI model for prediction. These methods include single machine learning, time series deep learning, CNN, and hybrid models.

This study will conduct a testing process to systematically identify the optimal model capable of accurately predicting deep-seated landslides. An illustration of this process can be found in Fig. 9. Initially, the study will sequentially employ various single numerical AI models, such as machine learning models (LR, ANN, SVR, CART, RBFNN, XGBoost) and time series deep learning models (RNN, bidirectional recurrent neural networks (Bi-RNNs), LSTM, bidirectional long short-term memory (Bi-LSTM), GRU, bidirectional gated recurrent units (Bi-GRUs)), to forecast displacement.

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f09

Figure 9. Diagram illustrating the steps for selecting the optimal AI model to predict deep-seated landslide displacement.


Subsequently, the model with the highest prediction accuracy will be selected for integration with the AEIO algorithm, forming a hybrid model. In this hybrid model, the hyperparameters of the best numerical AI model will be fine-tuned by the AEIO algorithm to enhance prediction accuracy.

In addition to the numerical AI models, this study employs individual CNN models for predicting deep-seated landslide displacement. Subsequently, similarly to the approach above, the best CNN model with the highest displacement prediction capability will be fine-tuned by the AEIO algorithm within a hybrid model. In the final step, a comparison process between the two hybrid models – one comprising the best numerical model and the other involving the best CNN model fine-tuned by AEIO – will be conducted to select the optimal model for this study.

5.2 Analysis results

This section will present the experimental results of the steps outlined in Fig. 9, along with relevant metrics and analysis.

5.2.1 Numerical models

(a) Machine learning models

Initially, single machine learning models will predict deep-seated landslide displacement. In this phase, machine learning models will utilize default hyperparameters, as detailed in the research of Chou and Nguyen (2023). The prediction results of these models at both E-2 and SAA stations are displayed in Table 6. These results show that most machine learning models demonstrate a relatively good predictive capability for displacement, particularly the XGBoost model, which exhibits MAPE values ranging from 8.14 % to 9.58 %. Following closely, CART also produces favorable prediction results, with MAPE ranging from 8.53 % to 9.76 %. Regarding prediction accuracy, XGBoost and CART models outperform LR, ANN, SVR, and RBFNN models.

Table 6. Performance results of machine learning models for predicting deep-seated landslide displacement.


Moreover, Table 6 indicates no significant difference between the prediction errors at the E-2 and SAA stations; the error values for the two stations are nearly equal across all machine learning models. Regarding running time, the LR model is the fastest, taking 0.0001 to 0.01 s across all runs, but, as noted above, its prediction accuracy is comparatively low. The slowest machine learning model is SVR, with running times of 136.01 to 346.3 s. Combined with its comparatively poor accuracy, this indicates that the SVR model operates inefficiently on the dataset in this study. Overall, XGBoost is the most suitable machine learning model for predicting deep-seated landslide displacement, exhibiting both high prediction accuracy and a short running time.

(b) Time series deep learning models

Similar to the machine learning models, in this section, the time series deep learning models will also be trained with default hyperparameters, as found in the research of Chou and Nguyen (2023). The performance results of these models are shown in Table 7. Overall, akin to the machine learning models, the time series deep learning models also demonstrate fairly good prediction accuracy, especially the best model – the Bi-GRU model, with MAPE ranging from 7.90 % to 9.13 %.

Table 7. Performance results of time series deep learning models for predicting deep-seated landslide displacement.


The Bi-GRU model outperforms the GRU model because it learns patterns from the time series in both the forward and the backward directions on the timeline, thereby capturing more patterns. With this more complex learning mechanism, it also produces significantly better predictions than the other time series deep learning models. However, the added complexity comes at a cost: the Bi-GRU model requires more processing time than the other time series deep learning models. As Table 7 shows, its running time ranges from 79.81 to 212.75 s.

From the conducted analyses, Bi-GRU has been identified as the best time series deep learning model, owing to its excellent prediction performance. Compared to the best machine learning model, XGBoost (with MAPE ranging from 8.14 % to 9.58 %), the Bi-GRU model (with MAPE ranging from 7.90 % to 9.13 %) demonstrates higher prediction accuracy. Therefore, the Bi-GRU model will be chosen as the best numerical AI model.

5.2.2 Best numerical model fine-tuned by the AEIO algorithm

This section will focus on fine-tuning the hyperparameters of the numerical model to enhance its performance in predicting deep-seated landslide displacement. The AEIO algorithm will fine-tune the hyperparameters of the study's best numerical AI model, the Bi-GRU model. Details regarding the names and search ranges of the hyperparameters are outlined in Table 8. The objective function of the AEIO algorithm during the fine-tuning process is to minimize the MAPE value of the Bi-GRU model.

Table 8. Search ranges of the hyperparameters of the optimal hybrid numerical models (Chou and Nguyen, 2023).


Table 9 illustrates the results of the fine-tuning process. From this table, it is observed that the AEIO algorithm has successfully identified the optimal hyperparameters of the Bi-GRU model, significantly improving the prediction accuracy of this model. For instance, the MAPE in predicting 1 d ahead displacement of the Bi-GRU model before fine-tuning was 7.9 %, but this number decreased to only 3.03 % after fine-tuning.

Table 9. Performance results of hybrid time series deep learning model with AEIO in deep-seated landslide displacement prediction.


Fine-tuning the Bi-GRU model using AEIO will maximize its potential, minimizing the prediction error to the lowest possible level. Therefore, the results obtained in this section reflect the actual quality of the dataset as well as the level of difficulty in prediction. Specifically, based on the results in Table 9, it is observed that the predictions for 1 d ahead displacement (with MAPEs of 3.03 % and 3.94 %) consistently outperform those for 7 d ahead displacement (with MAPEs of 6.38 % and 7.96 %).

The 1 d ahead predictions have a shorter time horizon, which makes them less affected by environmental fluctuations and makes changes easier to predict. Conversely, in the case of 7 d ahead displacement prediction, the time frame is long enough for various factors, such as weather conditions and human interventions, to occur, increasing uncertainty and volatility in the predicted values.

Additionally, Table 9 indicates that predictions for the E-2 station consistently outperform those for the SAA station. Specifically, the MAPEs at the E-2 station are 3.03 % and 6.38 % for the 1 d and 7 d ahead predictions, lower than the corresponding values of 3.94 % and 7.96 % at the SAA station. This is attributed to the dataset collected by the E-2 station being more comprehensive and being gathered over a more extended period than the data of the SAA station (as shown in Table 4).

Table 10 presents the optimal hyperparameters identified by the AEIO algorithm. In terms of running time, most models exhibit longer running times after fine-tuning than the original model. However, this increase is acceptable because the additional running time is minimal and the accuracy gains from fine-tuning, as mentioned above, are significant.

Table 10. Optimal hyperparameters of the time series deep learning model identified by the AEIO algorithm.


5.2.3 Image-based CNN models

This section presents the results of utilizing CNN models, including VGG, ResNet, Inception, Xception, DenseNet, and NASNet, to predict deep-seated landslide displacement. The CNN models in this part use the default settings (Chou and Nguyen, 2023). Table 11 displays the prediction error results of the CNN models for 1 d ahead and 7 d ahead forecasts for both E-2 and SAA stations.

Table 11. Performance results of the CNN models for deep-seated landslide displacement prediction.


The prediction results demonstrate that most CNN models produce highly accurate predictions. Specifically, predictions made by VGG, ResNet, MobileNet, DenseNet, and Inception exhibit MAPE values below 5 %. Among these, MobileNet and DenseNet201 emerge as the two models with the highest accuracy. For 1 d ahead prediction, the best model for predicting displacement at the E-2 station is MobileNet, with a MAPE of 4.11 %, and the best model for predicting displacement at the SAA station is DenseNet201, with a MAPE of 6.36 %. For 7 d ahead prediction, the best model for predicting displacement at the E-2 station is DenseNet201, with a MAPE of 5.3 %, and the best model for predicting displacement at the SAA station is MobileNet, with a MAPE of 6.8 %. These models will be selected accordingly for fine-tuning in the subsequent section.

Regarding running time, the CNN models in this section exhibit significantly longer running times compared to the numerical models in the previous sections. For example, the running time of the best CNN model to predict 1 d ahead displacement at the E-2 station – MobileNet – is 1.21 h. In contrast, the running time of the best single numerical model for predicting this index is 159.97 s.

While CNN models yield better prediction results, considering their extended running times, users need to weigh practical considerations before opting for this type of model. For instance, CNN models should be employed in cases requiring accurate predictions for research and measurement purposes. Conversely, numerical models like Bi-GRU are more suitable for real-time predictions and computations on low-performance devices.

5.2.4 Best CNN models fine-tuned by the AEIO algorithm

The AEIO algorithm will sequentially fine-tune the best CNN models identified in Sect. 5.2.3 to enhance prediction accuracy. Table 12 lists the search ranges of the hyperparameters for the CNN models to be fine-tuned. Table 13 presents the performance results of the CNN models after fine-tuning.

Table 12. Search ranges of the hyperparameters of the optimal hybrid numerical models (Chou and Nguyen, 2023).


Table 13. Performance results of best CNN models with AEIO in deep-seated landslide displacement prediction.


A challenge in this section is that CNN models primarily analyze and learn from image data, so numerical data must be converted into image data before training. This is demanding because current computer hardware may not be fully capable of efficiently converting numerical data into images for each computation. Hence, this study reuses the optimal window sizes previously identified when fine-tuning the numerical models (Table 10) and keeps these window sizes fixed for the CNN models.

The results of the fine-tuning process demonstrate that the AEIO has successfully identified the optimal hyperparameters for the CNN models, enhancing their accuracy. For instance, in the case of the MobileNet model used for 1 d ahead prediction at the E-2 station, the fine-tuning process reduced the MAPE of this model from 4.11 % to 2.81 %. A similar trend is also observed in the remaining prediction scenarios.
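The size of these improvements can be expressed as a relative reduction in MAPE. The short calculation below uses only the E-2, 1 d ahead values quoted in the text (7.9 % to 3.03 % for Bi-GRU and 4.11 % to 2.81 % for MobileNet); the helper name is hypothetical.

```python
def relative_reduction(before, after):
    """Percentage by which an error metric fell, relative to its starting value."""
    return 100.0 * (before - after) / before


bi_gru = relative_reduction(7.90, 3.03)     # Bi-GRU, E-2 station, 1 d ahead
mobilenet = relative_reduction(4.11, 2.81)  # MobileNet, E-2 station, 1 d ahead
print(f"Bi-GRU: {bi_gru:.1f} %, MobileNet: {mobilenet:.1f} %")
```

On these figures, fine-tuning removes about 61.6 % of the Bi-GRU error and about 31.6 % of the MobileNet error, although MobileNet starts from a lower baseline.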

Furthermore, similar to the case of AEIO–Bi-GRU, the CNN models exhibit the same trend, where 1 d ahead predictions are more accurate than 7 d ahead predictions. Similarly, forecasts at the E-2 station demonstrate higher accuracy than predictions at the SAA station. The rationale for this has been explained in Sect. 5.2.2. Lastly, the optimal hyperparameters of each CNN model, identified by the AEIO algorithm, are presented in Table 14. CNN models with optimal hyperparameters are the most effective models in this study for predicting deep-seated landslide displacement.

Table 14. Optimal hyperparameters of the CNN models identified by the AEIO algorithm.


Figure 10 compares the actual deep-seated landslide displacement with the values predicted by representative AI models. Specifically, Fig. 10a plots the predictions of single models against the actual values, while Fig. 10b does the same for hybrid models. The chart shows that hybrid models demonstrate superior predictive capability for deep-seated landslides compared to single models. This is evident from the displacement line of the hybrid models in Fig. 10b, which closely aligns with the actual deep-seated landslide displacement and significantly outperforms the single models depicted in Fig. 10a.

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f10

Figure 10. Graphs comparing the real and predicted deep-seated landslide displacement: (a) prediction results of deep-seated landslide displacement by single AI models; (b) prediction results of deep-seated landslide displacement by AI models optimized using the AEIO algorithm.


5.3 Discussion

This study focuses on landslides on Lushan, Taiwan, intending to develop models to predict deep-seated landslide displacement for both 1 d and 7 d forecasts. These predictive models utilize input data such as the region's groundwater levels, temperature, and humidity. Accurately computing deep-seated landslide displacement offers several benefits. Firstly, it provides timely information for engineers to assess the resilience of structures and infrastructure in at-risk areas, facilitating the issuance of sensible warnings. Secondly, forecasting deep-seated landslide displacement offers insights into the severity of the disaster, aiding in effective evacuation and rescue planning.

Moreover, unlike the AI models in previous studies (Balogun et al., 2021; Hakim et al., 2022; Jaafari et al., 2022), our research incorporates machine learning, time series deep learning, and CNN models, utilizing metaheuristic optimization algorithms to fine-tune their hyperparameters. In particular, the novelty of our study lies in adopting pre-trained models, such as MobileNet, DenseNet, Inception, and VGG, rather than standard CNN models.

By employing various AI models, this study identifies the most effective model for predicting deep-seated landslide displacement and offers a comprehensive overview of the performance of different AI models. Machine learning models exhibited relatively high prediction errors, with MAPEs ranging from 8.14 % to 15.19 %. Their performance was generally lower than that of the time series deep learning models, whose MAPEs ranged from 7.9 % to 14.73 %. The superior performance of the time series deep learning models is attributed to their ability to process sequential data and retain information from previous steps, which enables them to learn patterns from the dataset more effectively than traditional machine learning models.
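The MAPE values quoted above can be reproduced from paired observations and forecasts. The following is a minimal sketch that assumes the standard definition of MAPE (the mean of |actual − predicted| / |actual|, expressed in percent); the function name `mape` and the sample readings are illustrative and are not taken from the Lushan dataset.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for a, p in zip(actual, predicted):
        if a == 0:
            # The ratio is undefined for a zero observation.
            raise ValueError("MAPE is undefined when an actual value is zero")
        total += abs((a - p) / a)
    return 100.0 * total / len(actual)


# Hypothetical displacement readings, for illustration only.
observed = [10.0, 20.0, 40.0]
forecast = [11.0, 19.0, 42.0]
print(round(mape(observed, forecast), 2))
```

Because each error is divided by the observed value, MAPE is scale-free, which makes it convenient for comparing stations with different displacement magnitudes; it cannot be used when an observation is exactly zero.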

Although time series deep learning models perform well, they fall short compared to CNN models. This disparity can be attributed to the CNNs' more advanced learning mechanism. The convolutional and pooling layers in a CNN enable robust feature extraction from input data, with convolutional layers particularly effective at identifying complex patterns and subtle features in time series data, especially when spatial correlations are present. This capability allows CNNs to uncover critical features that other models may overlook.

The models developed in this study demonstrate solid predictive capabilities for deep-seated landslide displacement. Among them, the AEIO–MobileNet model is the most effective, achieving a MAPE of 2.81 %. However, the practical applicability of these models in real-world scenarios still needs to be improved: data collection, processing, and AI model operation are time-consuming, which makes timely predictions challenging. By contrast, other studies have successfully built real-time landslide detection systems (Wang et al., 2023; Das et al., 2020; Prakasam et al., 2021). We acknowledge this limitation of our study, and future research will aim to address it.

The input data used for the AI models were selected because they significantly influence the likelihood of deep-seated landslides, as detailed in Sect. 4.2. However, a limitation of this study is that it does not evaluate the relative importance of each input data type for prediction accuracy. Future research should explore the impact of different combinations of input data on AI model performance. This could help identify the significance of each input type and reveal the optimal combination of inputs to enhance prediction accuracy further.
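One way such a comparison could be organized is to enumerate every non-empty subset of the input types and train one model per subset. The sketch below only generates the subsets; the identifiers in `INPUTS` are illustrative labels for the input types mentioned in this discussion, and the training and scoring step is left as a comment because it depends on the model being evaluated.

```python
from itertools import combinations

# Illustrative labels for the input types named in the discussion.
INPUTS = ["groundwater_level", "temperature", "humidity"]


def input_subsets(inputs):
    """Return every non-empty subset of inputs, smallest subsets first."""
    subsets = []
    for size in range(1, len(inputs) + 1):
        subsets.extend(combinations(inputs, size))
    return subsets


for subset in input_subsets(INPUTS):
    # A real study would retrain a model on this subset and record its error.
    print(subset)
```

With n input types there are 2^n − 1 non-empty subsets, so exhaustive enumeration is only practical when the number of input types is small.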

6 Conclusion

This study addresses the persistent threat of large slow-moving landslides, a primary concern due to their severe impact on lives and property. Employing various AI models, such as machine learning, time series deep learning, CNN models, and metaheuristic optimization algorithms, the research focuses on predicting deep-seated landslides at Lushan in Ren'ai Township, Nantou County. The study aims to enhance early prediction accuracy by utilizing 8 years of displacement and groundwater-level data from Lushan and weather data from the POWER project. The predictions cover 1 d and 7 d intervals, serving diverse purposes in landslide forecasting for timely evacuation. The research explores single and hybrid AI models to determine the most effective approach. The following conclusions are drawn from this research:

a. CNN models optimized by the novel AEIO algorithm yield the best prediction results. In particular, AEIO–MobileNet predicts 1 d ahead displacement at the E-2 station with a MAPE score of only 2.81 %, demonstrating high accuracy.
b. While CNN models boast high prediction accuracy, their computational time is also considerable. Therefore, decisions regarding their usage should also consider real-world constraints.
c. The AEIO–Bi-GRU model also yields reasonably good prediction results, although not on par with CNN models. The best result achieved by the AEIO–Bi-GRU model is a MAPE of 3.03 % for 1 d ahead prediction at the E-2 station.
d. The AEIO algorithm has successfully fine-tuned hyperparameters for AI models. Especially in the case of predicting 1 d ahead displacement, it has aided the MobileNet model in improving its predictive capability by 31.6 %, enabling this model to provide more accurate predictions.
e. The prediction results from the E-2 station consistently outperform those from the SAA station. This is attributed to the fact that data from the E-2 station have been collected over a longer and more comprehensive period.
f. The study results demonstrate that AI models can accurately predict deep-seated landslide displacement and can be implemented in real-world scenarios.

Appendix A: Deep learning models for time series

The architecture of an RNN includes an input layer, a hidden layer with a variable number of RNN cells, and an output layer designed for label identification based on future displacement values. Figure A1 illustrates the structure of simple RNNs.

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f11

Figure A1. Structure of basic RNNs.


Each cell in an RNN acts as a memory cell, which is interconnected to enable the sequential transfer of time-dependent input information within a sliding window. This makes it possible to consider temporal correlations between events that may be widely separated in the time dimension. The following formula presents the hidden unit of standard RNNs at time t:

$$ h_{t} = \tanh (W_{x} \cdot x_{t} + W_{h} \cdot h_{t - 1} + b), \tag{A1} $$

where x_t is the input vector at time t; h_t denotes the output vectors of hidden units for time t; W_x and W_h indicate the input and interconnected weight matrices, respectively, for the output of the hidden layer; b is the bias term; and tanh( ) represents the hyperbolic tangent activation function – i.e., $\tanh (x) = \frac{1 - e^{- 2 x}}{1 + e^{- 2 x}}$ . The mechanism of learning over time steps, stored within cells, enables RNNs to effectively capture complex relationships between cells and time sequences. However, as the duration of dependencies increases, RNN models are susceptible to issues related to vanishing gradients. Therefore, RNNs are best suited to learning time series involving short-term dependencies.
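The recurrence in Eq. (A1) can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: weight matrices are stored as nested lists, the hidden state starts at zero, and the weights and window values are hypothetical. The helper names `rnn_step` and `rnn_forward` are introduced here for illustration only.

```python
import math


def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One step of Eq. (A1): h_t = tanh(W_x . x_t + W_h . h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        pre = b[i]
        pre += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        pre += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(pre))
    return h_t


def rnn_forward(window, W_x, W_h, b):
    """Run the cell over a sliding window, starting from a zero hidden state."""
    h = [0.0] * len(b)
    for x_t in window:
        h = rnn_step(x_t, h, W_x, W_h, b)
    return h


# Hypothetical weights: 2 input features, 2 hidden units.
W_x = [[0.5, -0.2], [0.1, 0.3]]
W_h = [[0.0, 0.1], [0.2, 0.0]]
b = [0.0, 0.0]
# Hypothetical window of three time steps, two features per step.
window = [[1.0, 0.5], [0.8, 0.4], [0.6, 0.9]]
h_final = rnn_forward(window, W_x, W_h, b)
print(h_final)
```

The same weights are reused at every time step, so the final hidden state depends on the whole window through repeated applications of tanh. Gradients propagated back through many such steps are multiplied by the tanh derivative and the recurrent weights each time, which is where the vanishing-gradient issue originates.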

Appendix B: Convolutional neural networks

The architecture of a typical CNN, as illustrated in Fig. B1, comprises an input layer (to receive image data) followed by hidden layers (including convolutional, pooling, and fully connected layers) and concludes with the output layers. As depicted in Fig. B1, the complexity of a CNN progressively increases from the convolutional layer to the fully connected (FC) layer. This design enables the CNN to recognize relatively simple patterns (lines, curves, etc.) before progressing to capture more intricate features (faces, objects, etc.), with the ultimate aim of extracting relevant information for accurate pattern identification.

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f12

Figure B1. Structure of a basic CNN.


As illustrated in Fig. B2, the convolutional layer is responsible for most computations in the network. This involves extracting local features from an image using a set of learnable filters known as kernels. The behavior of the filter in the convolutional layer is influenced by two main factors: stride and padding. Stride is the number of pixels by which the filter shifts across the image at each step, while padding adds pixels around the image border so that information at the edges and corners is preserved. In each iteration, a portion of the image is convolved with a filter to generate a dot product of pixels within its receptive field. This process is replicated across the entire image to produce a feature map. The convolution operation is defined as follows:

$$ C_{i} = b_{i} + \sum_{j = 1}^{d_{i}} I_{j} \cdot F_{i j}, \quad i = 1, \dots, d_{c}, \tag{B1} $$

where C_i is the output of the convolutional layer or feature map, b_i is the bias, d_i is the depth of input, I_j is the input image, F_ij is the filter, and d_c is the depth of the convolutional layer.
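To make the roles of stride and padding concrete, the following sketch slides a single filter over a single-channel image, so the sum over the input depth in Eq. (B1) reduces to one term. It assumes zero padding and the sliding dot product used by most deep learning frameworks (no kernel flip); the function name `conv2d` and the sample image and kernel are illustrative.

```python
def conv2d(image, kernel, bias=0.0, stride=1, padding=0):
    """Slide a kernel over a zero-padded image and return the feature map."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # Zero padding keeps border pixels inside at least one receptive field.
    padded = [[0.0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for r in range(h):
        for c in range(w):
            padded[r + padding][c + padding] = image[r][c]
    out_h = (h + 2 * padding - kh) // stride + 1
    out_w = (w + 2 * padding - kw) // stride + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = bias
            # Dot product between the kernel and its receptive field.
            for i in range(kh):
                for j in range(kw):
                    acc += padded[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
print(conv2d(image, kernel))                       # stride 1, no padding
print(conv2d(image, kernel, stride=2, padding=1))  # stride 2, padding 1
```

A multi-channel layer would repeat this computation for each input channel and add the results before the bias, once per output filter.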

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f13

Figure B2. Processing flow in a convolution layer.


The multiplicative operations are usually followed by an activation function (the final element in the convolutional layer), which introduces nonlinearity and creates intricate mappings between network inputs and outputs. The activation function can be defined as follows:

$$ Y_{i} = f (C_{i}), \tag{B2} $$

where Y_i is the output of the convolutional layer after the activation function and f is the activation function.

A rectified linear unit (ReLU) is a nonlinear activation function with output $f (x) = \max (0, x)$ . A ReLU converts all negative values to zero and returns the original input value when the input exceeds zero. A ReLU is only one of many activation functions; however, it has proven effective in practice and is widely used.
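As a small illustration of Eq. (B2) with f chosen as ReLU, the sketch below applies the activation element-wise to a feature map. The helper name `apply_activation` and the sample values are hypothetical.

```python
def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def apply_activation(feature_map, f=relu):
    """Apply an activation function to every element of a 2-D feature map."""
    return [[f(v) for v in row] for row in feature_map]


feature_map = [[-4.0, 2.5],
               [0.0, -0.1]]
print(apply_activation(feature_map))
```

Passing a different function as `f` swaps in another activation without changing the rest of the layer.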

Pooling layers after the convolution layer can down-sample feature maps by summarizing features within the coverage area of a 2-D filter to reduce sensitivity to feature location, thereby improving resilience to changes in the position of features. Pooling layers also decrease the dimensions of the feature map, reducing the number of parameters to be dealt with, thereby decreasing computational overhead. Output dimensions from the pooling layer are computed as follows:

$$ \frac{c_{w} - f_{w} + 1}{s} \cdot \frac{c_{h} - f_{h} + 1}{s} \cdot c_{n}, \tag{B3} $$

where c_w and c_h are the width and height of the input feature map, c_n is the number of channels in the feature map, f_w and f_h indicate the width and height of the filter, and s is the stride.
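The output size of a pooling layer can be checked numerically. The sketch below is an assumption-laden illustration rather than a transcription of Eq. (B3): it uses the per-axis convention floor((c − f) / s) + 1 that is common in deep learning frameworks, which coincides with Eq. (B3) when s = 1. The function names are hypothetical.

```python
def pooled_size(c, f, s):
    """Output length along one axis: floor((c - f) / s) + 1."""
    if f > c:
        raise ValueError("filter must not be larger than the feature map")
    if s < 1:
        raise ValueError("stride must be a positive integer")
    return (c - f) // s + 1


def pooled_shape(c_w, c_h, c_n, f_w, f_h, s):
    """Width, height, and channel count after pooling (channels unchanged)."""
    return pooled_size(c_w, f_w, s), pooled_size(c_h, f_h, s), c_n


# Hypothetical 4 x 4 feature map with 3 channels, 2 x 2 filter, stride 2.
print(pooled_shape(4, 4, 3, 2, 2, 2))
# Hypothetical 5 x 5 feature map with 1 channel, 2 x 2 filter, stride 1.
print(pooled_shape(5, 5, 1, 2, 2, 1))
```

Pooling acts on each channel independently, so the channel count c_n passes through unchanged while the spatial dimensions shrink.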

Max pooling and average pooling are commonly used in CNNs. Max pooling accentuates salient features by selecting the maximum value within the filter's coverage area. In contrast, average pooling calculates the mean value within the same area, providing a representative feature value. Illustrations of max pooling and average pooling are presented in Fig. B3.
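The difference between the two pooling modes is easiest to see on a small example. The sketch below assumes a square window and no padding; the function name `pool2d` and the sample feature map are illustrative.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Down-sample a 2-D feature map with max or average pooling."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, h - size + 1, stride):
        row = []
        for c in range(0, w - size + 1, stride):
            # Values covered by the pooling window at this position.
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                row.append(max(window))
            elif mode == "avg":
                row.append(sum(window) / len(window))
            else:
                raise ValueError("mode must be 'max' or 'avg'")
        pooled.append(row)
    return pooled


fm = [[1, 3, 2, 4],
      [5, 6, 1, 2],
      [7, 2, 9, 0],
      [1, 4, 3, 8]]
print(pool2d(fm, mode="max"))  # → [[6, 4], [7, 9]]
print(pool2d(fm, mode="avg"))  # → [[3.75, 2.25], [3.5, 5.0]]
```

Max pooling keeps the strongest response in each window, whereas average pooling smooths the responses; both halve each spatial dimension here because the window and stride are 2.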

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f14

Figure B3. Max pooling and average pooling.


The final stage of a CNN comprises a series of fully connected (FC) layers. After the convolution and pooling operations, the feature map is flattened into a one-dimensional vector that connects to the FC layers, resembling an ANN. FC layers identify specific features, each represented by a neuron. In regression tasks, each neuron in the FC layer corresponds to a feature contributing to the final numerical output. The value transmitted by each neuron indicates its significance toward the regression result. FC layers are designed to predict the best continuous value for the target variable by combining and processing these neuron outputs. Figure B4 illustrates the structure of an FC layer.
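The flatten-then-FC stage described above can be sketched as follows. This is a minimal illustration under stated assumptions: one hidden FC layer with ReLU followed by a single linear output neuron for regression, with hypothetical weights; the helper names `flatten` and `dense` are introduced here and do not refer to any particular library.

```python
def flatten(feature_maps):
    """Flatten a list of 2-D feature maps into one 1-D vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: one weighted sum (plus bias) per neuron."""
    outputs = []
    for w_row, bias in zip(weights, biases):
        z = bias + sum(w * x for w, x in zip(w_row, inputs))
        outputs.append(activation(z) if activation else z)
    return outputs


# Hypothetical pooled output: one 2 x 2 feature map.
feature_maps = [[[1.0, 2.0],
                 [3.0, 4.0]]]
x = flatten(feature_maps)
# Hidden FC layer with two neurons (hypothetical weights).
hidden = dense(x, [[0.5, 0.0, 0.0, 0.5], [-1.0, 0.0, 0.0, 0.0]], [0.0, 0.0], relu)
# Linear output neuron producing a single continuous value.
prediction = dense(hidden, [[2.0, 1.0]], [0.5])
print(x, hidden, prediction)
```

The output neuron has no activation, so the prediction can take any real value, which is what a regression target such as displacement requires.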

https://nhess.copernicus.org/articles/25/119/2025/nhess-25-119-2025-f15

Figure B4. Structure of a fully connected layer.


Data availability

The data and source codes supporting this study's findings are available at https://www.researchgate.net/profile/Jui-Sheng-Chou (last access: 17 December 2024) and from the corresponding author upon reasonable request.

Author contributions

JSC: conceptualization, methodology, supervision, manuscript writing, reviewing, and editing. HMN: data processing, coding, and manuscript writing. HPP: data processing, coding, and manuscript writing. KLW: data preparation, supervision, and reviewing.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

The authors extend their gratitude to the National Science and Technology Council (NSTC), Taiwan, for financially supporting this research under NSTC grants 112-2221-E-011-033-MY3 and 111-2221-E-011-037-MY3. We also sincerely thank the Geological Survey and Mining Management Agency, Ministry of Economic Affairs, Taiwan, for providing favorable conditions for conducting this research.

Financial support

This research has been supported by the National Science and Technology Council (grant nos. 112-2221-E-011-033-MY3 and 111-2221-E-011-037-MY3).

Review statement

This paper was edited by Paola Reichenbach and reviewed by Lorenzo Borselli and one anonymous referee.

References

Aggarwal, A., Alshehri, M., Kumar, M., Alfarraj, O., Sharma, P., and Pardasani, K. R.: Landslide data analysis using various time-series forecasting models, Comput. Elect. Eng., 88, 106858, https://doi.org/10.1016/j.compeleceng.2020.106858, 2020.

Aleotti, P. and Chowdhury, R.: Landslide hazard assessment: summary review and new perspectives, Bull. Eng. Geol. Environ., 58, 21–44, https://doi.org/10.1007/s100640050066, 1999.

Alzubaidi, L., Zhang, J., Humaidi, A. J., Al-Dujaili, A., Duan, Y., Al-Shamma, O., Santamaría, J., Fadhel, M. A., Al-Amidie, M., and Farhan, L.: Review of deep learning: concepts, CNN architectures, challenges, applications, future directions, J. Big Data, 8, 53, https://doi.org/10.1186/s40537-021-00444-8, 2021.

Balogun, A. L., Rezaie, F., Pham, Q. B., Gigovic, L., Drobnjak, S., Aina, Y. A., Panahi, M., Yekeen, S. T., and Lee, S.: Spatial prediction of landslide susceptibility in western Serbia using hybrid support vector regression (SVR) with GWO, BAT and COA algorithms, Geosci. Front., 12, 101104, https://doi.org/10.1016/j.gsf.2020.10.009, 2021.

Breiman, L.: Classification and Regression Trees, Taylor & Francis Group, New York, https://doi.org/10.1201/9781315139470, 1984.

Caleca, F., Scaini, C., Frodella, W., and Tofani, V.: Regional-scale landslide risk assessment in Central Asia, Nat. Hazards Earth Syst. Sci., 24, 13–27, https://doi.org/10.5194/nhess-24-13-2024, 2024.

Chae, B.-G., Park, H. J., Catani, F., Simoni, A., and Berti, M.: Landslide prediction, monitoring and early warning: a concise review of state-of-the-art, Geosci. J., 21, 1033–1070, https://doi.org/10.1007/s12303-017-0034-4, 2017.

Chen, T. and Guestrin, C.: XGBoost: A Scalable Tree Boosting System, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 13–17 August 2016, New York, NY, USA, 785–794, https://doi.org/10.1145/2939672.2939785, 2016.

Chigira, M.: September 2005 rain-induced catastrophic rockslides on slopes affected by deep-seated gravitational deformations, Kyushu, southern Japan, Eng. Geol., 108, 1–15, https://doi.org/10.1016/j.enggeo.2009.03.005, 2009.

Chollet, F.: Xception: Deep Learning with Depthwise Separable Convolutions, in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21 July 2017, Honolulu, HI, USA, https://doi.org/10.48550/arXiv.1610.02357, 2017.

Chou, J. S. and Ngo, N. T.: Time series analytics using sliding window metaheuristic optimization-based machine learning system for identifying building energy consumption patterns, Appl. Energ., 177, 751–770, https://doi.org/10.1016/j.apenergy.2016.05.074, 2016.

Chou, J. S. and Nguyen, N. Q.: Forecasting Regional Energy Consumption via Jellyfish Search-Optimized Convolutional-Based Deep Learning, Int. J. Energ. Res., 2023, 3056688, https://doi.org/10.1155/2023/3056688, 2023.

Chou, J. S. and Nguyen, H. M.: Enhancing Energy AI Models: The Age of Exploration-Inspired Optimizer for Superior Power Generation and Consumption Predictions, 2024, in review.

Corominas, J. and Moya, J.: A review of assessing landslide frequency for hazard zoning purposes, Eng. Geol., 102, 193–213, https://doi.org/10.1016/j.enggeo.2008.03.018, 2008.

Cotecchia, F., Santaloia, F., and Tagarelli, V.: Towards A Geo-Hydro-Mechanical Characterization of Landslide Classes: Preliminary Results, Appl. Sci., 10, 7960, https://doi.org/10.3390/app10227960, 2020.

Crosta, G. B. and Agliardi, F.: Failure forecast for large rock slides by surface displacement measurements, Can. Geotech. J., 40, 176–191, https://doi.org/10.1139/t02-085, 2003.

Dahal, A., Tanyas, H., van Westen, C., van der Meijde, M., Mai, P. M., Huser, R., and Lombardo, L.: Space–time landslide hazard modeling via Ensemble Neural Networks, Nat. Hazards Earth Syst. Sci., 24, 823–845, https://doi.org/10.5194/nhess-24-823-2024, 2024.

Das, K., Majumdar, S., Moulik, S., and Fujita, M.: Real-Time Threshold-based Landslide Prediction System for Hilly Region using Wireless Sensor Networks, in: 2020 IEEE International Conference on Consumer Electronics – Taiwan (ICCE-Taiwan), 28–30 September 2020, Taoyuan, Taiwan, https://doi.org/10.1109/ICCE-Taiwan49838.2020.9258181, 2020.

Das, S., Sarkar, S., and Kanungo, D. P.: Rainfall-induced landslide (RFIL) disaster in Dima Hasao, Assam, Northeast India, Landslides, 19, 2801–2808, https://doi.org/10.1007/s10346-022-01962-z, 2022.

David, K. and Raymond, C. W.: Predicting earthquake-induced landslides, with emphasis on arid and semi-arid environments, Landslides in a semi-arid environment, 2(PART 1), 118–149, 1989.

Di Nunno, F., de Marinis, G., and Granata, F.: Short-term forecasts of streamflow in the UK based on a novel hybrid artificial intelligence algorithm, Sci. Rep., 13, 7036, https://doi.org/10.1038/s41598-023-34316-3, 2023.

Drucker, H., Burges, C. J. C., Kaufman, L., Smola, A., and Vapnik, V.: Support vector regression machines, in: NIPS'96: Proceedings of the 9th International Conference on Neural Information Processing Systems, 2–5 December, Denver, Colorado, USA, 155–161, 1996.

Elman, J. L.: Finding Structure in Time, Cognit. Sci., 14, 179–211, https://doi.org/10.1016/0364-0213(90)90002-E, 1990.

Ester, M., Kriegel, H. P., Sander, J., and Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise, in: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 2–4 August, Portland, Oregon, USA, 226–231, 1996.

Fu, W. X. and Liao, Y.: Non-linear shear strength reduction technique in slope stability calculation, Comput. Geotech., 37, 288–298, https://doi.org/10.1016/j.compgeo.2009.11.002, 2010.

Geertsema, M., Hungr, O., Schwab, J. W., and Evans, S. G.: A large rockslide-debris avalanche in cohesive soil at Pink Mountain, northeastern British Columbia, Canada, Eng. Geol., 83, 64–75, https://doi.org/10.1016/j.enggeo.2005.06.025, 2006.

Hakim, W. L., Rezaie, F., Nur, A. S., Panahi, M., Khosravi, K., Lee, C. W., and Lee, S.: Convolutional neural network (CNN) with metaheuristic optimization algorithms for landslide susceptibility mapping in Icheon, South Korea, J. Environ. Manage., 305, 114367, https://doi.org/10.1016/j.jenvman.2021.114367, 2022.

Han, H. G., Chen, Q. L., and Qiao, J. F.: Research on an online self-organizing radial basis function neural network, Neural Comput. Appl., 19, 667–676, https://doi.org/10.1007/s00521-009-0323-6, 2010.

Han, J., Kamber, M., and Pei, J.: Data Mining: Concepts and Techniques, Southeast Asia Edition, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 696 pp., https://doi.org/10.1016/C2009-0-61819-5, 2006.

He, K., Zhang, X., Ren, S., and Sun, J.: Deep Residual Learning for Image Recognition, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition, 27–30 June, Las Vegas, Nevada, USA, https://doi.org/10.48550/arXiv.1512.03385, 2016.

He, R., Zhang, W., Dou, J., Jiang, N., and Zhou, H. X. J.: Application of artificial intelligence in three aspects of landslide risk assessment: A comprehensive review, Rock Mechanics Bulletin, 3, 100144, https://doi.org/10.1016/j.rockmb.2024.100144, 2024.

Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H.: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, CoRR, abs/1704.04861, arXiv [preprint], https://doi.org/10.48550/arXiv.1704.04861, 2017.

Hu, B., Su, G., Jiang, J., Sheng, J., and Li, J.: Uncertain Prediction for Slope Displacement Time-Series Using Gaussian Process Machine Learning, IEEE Access, 7, 27535–27546, https://doi.org/10.1109/ACCESS.2019.2894807, 2019.

Hu, X. L., Wu, S. S., Zhang, G. C., Zheng, W. B., Liu, C., He, C. C., Liu, Z. X., Guo, X. Y., and Zhang, H.: Landslide displacement prediction using kinematics-based random forests method: A case study in Jinping Reservoir Area, China, Eng. Geol., 283, 105975, https://doi.org/10.1016/j.enggeo.2020.105975, 2021.

Huang, G., Liu, Z., v. d. Maaten, L., and Weinberger, K. Q.: Densely Connected Convolutional Networks, in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21–26 July 2017, Honolulu, HI, USA, https://doi.org/10.1109/CVPR.2017.243, 2017.

Huang, R. Q. and Fan, X. M.: The landslide story, Nat. Geosci., 6, 325–326, https://doi.org/10.1038/ngeo1806, 2013.

Hungr, O., Leroueil, S., and Picarelli, L.: The Varnes classification of landslide types, an update, Landslides, 11, 167–194, https://doi.org/10.1007/s10346-013-0436-y, 2014.

Iverson, R. M. and Major, J. J.: Rainfall, ground-water flow, and seasonal movement at Minor Creek landslide, northwestern California: Physical interpretation of empirical relations, Geol. Soc. Am. Bull., 99, 16, https://doi.org/10.1130/0016-7606(1987)99<579:RGFASM>2.0.CO;2, 1987.

Jaafari, A., Panahi, M., Mafi-Gholami, D., Rahmati, O., Shahabi, H., Shirzadi, A., Lee, S., Bui, D. T., and Pradhan, B.: Swarm intelligence optimization of the group method of data handling using the cuckoo search and whale optimization algorithms to model and predict landslides, Appl. Soft. Comput., 116, 108254, https://doi.org/10.1016/j.asoc.2021.108254, 2022.

Jiang, J., Ehret, D., Xiang, W., Rohn, J., Huang, L., Yan, S., and Bi, R.: Numerical simulation of Qiaotou Landslide deformation caused by drawdown of the Three Gorges Reservoir, China, Environ. Earth Sci., 62, 411–419, https://doi.org/10.1007/s12665-010-0536-0, 2011.

Jones, J. N., Bennett, G. L., Abanco, C., Matera, M. A. M., and Tan, F. J.: Multi-event assessment of typhoon-triggered landslide susceptibility in the Philippines, Nat. Hazards Earth Syst. Sci., 23, 1095–1115, https://doi.org/10.5194/nhess-23-1095-2023, 2023.

Keqiang, H., Zhiliang, W., Xiaoyun, M., and Zengtao, L.: Research on the displacement response ratio of groundwater dynamic augment and its application in evaluation of the slope stability, Environ. Earth Sci., 74, 5773–5791, https://doi.org/10.1007/s12665-015-4595-0, 2015.

Kilburn, C. R. J. and Petley, D. N.: Forecasting giant, catastrophic slope collapse: lessons from Vajont, Northern Italy, Geomorphology, 54, 21–32, https://doi.org/10.1016/S0169-555x(03)00052-7, 2003.

Kumar, D., Iakhwan, N., and Rawat, A.: Study and Prediction of Landslide in Uttarkashi, Uttarakhand, India Using GIS and ANN, Am. J. Neural Netw. Appl., 3, 63–74, https://doi.org/10.11648/j.ajnna.20170306.12, 2017.

Lau, Y. M., Wang, K. L., Wang, Y. H., Yiu, W. H., Ooi, G. H., Tan, P. S., Wu, J., Leung, M. L., Lui, H. L., and Chen, C. W.: Monitoring of rainfall-induced landslides at Songmao and Lushan, Taiwan, using IoT and big data-based monitoring system, Landslides, 20, 271–296, https://doi.org/10.1007/s10346-022-01964-x, 2023.

Lecun, Y., Bottou, L., Bengio, Y., and Haffner, P.: Gradient-based learning applied to document recognition, P. IEEE, 86, 2278–2324, https://doi.org/10.1109/5.726791, 1998.

Lee, Y. F. and Chi, Y. Y.: Rainfall-induced landslide risk at Lushan, Taiwan, Eng. Geol., 123, 113–121, https://doi.org/10.1016/j.enggeo.2011.03.006, 2011.

Li, H., Xu, Q., He, Y., and Deng, J.: Prediction of landslide displacement with an ensemble-based extreme learning machine and copula models, Landslides, 15, 2047–2059, https://doi.org/10.1007/s10346-018-1020-2, 2018.

Lin, C. W., Tseng, C. M., Tseng, Y. H., Fei, L. Y., Hsieh, Y. C., and Tarolli, P.: Recognition of large scale deep-seated landslides in forest areas of Taiwan using high resolution topography, J. Asian Earth Sci., 62, 389–400, https://doi.org/10.1016/j.jseaes.2012.10.022, 2013.

Lin, H. H., Lin, M. L., Lu, J. H., Chi, C. C., and Fei, L. Y.: Deep-seated gravitational slope deformation in Lushan, Taiwan: Transformation from cleavage-controlled to weakened rockmass-controlled deformation, Eng. Geol., 264, 105387, https://doi.org/10.1016/j.enggeo.2019.105387, 2020.

Liu, C. Y., Jiang, Z. S., Han, X. S., and Zhou, W. X.: Slope displacement prediction using sequential intelligent computing algorithms, Measurement, 134, 634–648, https://doi.org/10.1016/j.measurement.2018.10.094, 2019.

Loche, M. and Scaringi, G.: Temperature and shear-rate effects in two pure clays: Possible implications for clay landslides, Results Eng., 20, 101647, https://doi.org/10.1016/j.rineng.2023.101647, 2023.

Matsushi, Y. and Matsukura, Y.: Rainfall thresholds for shallow landsliding derived from pressure-head monitoring: cases with permeable and impermeable bedrocks in Boso Peninsula, Japan, Earth Surf. Proc. Land., 32, 1308–1322, https://doi.org/10.1002/esp.1491, 2007.

McCulloch, W. and Pitts, A.: A Logical Calculus of the Ideas Immanent in Nervous Activity (1943), Ideas That Created the Future, Springer, 79–88, https://doi.org/10.1007/BF02478259, 2021.

Mebrahtu, T. K., Heinze, T., Wohnlich, S., and Alber, M.: Slope stability analysis of deep-seated landslides using limit equilibrium and finite element methods in Debre Sina area, Ethiopia, Bull. Eng. Geol. Environ., 81, 403, https://doi.org/10.1007/s10064-022-02906-6, 2022.

Miao, H. B. and Wang, G. H.: Prediction of landslide velocity and displacement from groundwater level changes considering the shear rate-dependent friction of sliding zone soil, Eng. Geol., 327, 107361, https://doi.org/10.1016/j.enggeo.2023.107361, 2023.

Millán-Arancibia, C. and Lavado-Casimiro, W.: Rainfall thresholds estimation for shallow landslides in Peru from gridded daily data, Nat. Hazards Earth Syst. Sci., 23, 1191–1206, https://doi.org/10.5194/nhess-23-1191-2023, 2023.

Mufundirwa, A., Fujii, Y., and Kodama, J.: A new practical method for prediction of geomechanical failure-time, Int. J. Rock Mech. Min., 47, 1079–1090, https://doi.org/10.1016/j.ijrmms.2010.07.001, 2010.

Perkins, J. P., Oakley, N. S., Collins, B. D., Corbett, S. C., and Burgess, W. P.: Characterizing the scale of regional landslide triggering from storm hydrometeorology, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2024-873, 2024.

Petley, D.: Global patterns of loss of life from landslides, Geology, 40, 927–930, https://doi.org/10.1130/G33217.1, 2012.

Petley, D. N., Mantovani, F., Bulmer, M. H., and Zannoni, A.: The use of surface monitoring data for the interpretation of landslide movement patterns, Geomorphology, 66, 133–147, https://doi.org/10.1016/j.geomorph.2004.09.011, 2005.

Pham, B. T., Pradhan, B., Bui, D. T., Prakash, I., and Dholakia, M. B.: A comparative study of different machine learning methods for landslide susceptibility assessment: A case study of Uttarakhand area (India), Environ. Model. Softw., 84, 240–250, https://doi.org/10.1016/j.envsoft.2016.07.005, 2016.

Pinyol, N. M., Alvarado, M., Alonso, E. E., and Zabala, F.: Thermal effects in landslide mobility, Geotechnique, 68, 528–545, https://doi.org/10.1680/jgeot.17.P.054, 2018.

Pradhan, B. and Lee, S.: Regional landslide susceptibility analysis using back-propagation neural network model at Cameron Highland, Malaysia, Landslides, 7, 13–30, https://doi.org/10.1007/s10346-009-0183-2, 2010.

Prakasam, C., Aravinth, R., Kanwar, V. S., and Nagarajan, B.: Design and Development of Real-time landslide early warning system through low cost soil and rainfall sensors, Mater. Today: Proc., 45, 5649–5654, https://doi.org/10.1016/j.matpr.2021.02.456, 2021.

Preisig, G.: Forecasting the long-term activity of deep-seated landslides via groundwater flow and slope stability modelling, Landslides, 17, 1693–1702, https://doi.org/10.1007/s10346-020-01427-1, 2020.

Ruitang, L., Zhaowei, C., Zexiong, W., Zhenghan, Z., Jiahao, L., Zhencheng, G., and Yuchong, C.: Mountain Slope Monitoring Guidelines (TGS-SLOPEM106), Chinese Republic Society of Geotechnical Engineering, 2017.

Safaei, M., Omar, H., Huat, B. B. K., Yousof, Z. B. M., and Ghiasi, V.: Deterministic Rainfall Induced Landslide Approaches, Advantage and Limitation, Elect. J. Geotech. Eng., 16, 1619–1650, 2011.

Shibasaki, T., Matsuura, S., and Hasegawa, Y.: Temperature-dependent residual shear strength characteristics of smectite-bearing landslide soils, J. Geophys. Res.-Solid, 122, 1449–1469, https://doi.org/10.1002/2016jb013241, 2017.

Simonyan, K. and Zisserman, A.: Very Deep Convolutional Networks for Large-Scale Image Recognition, ICLR2015, arXiv [preprint], https://doi.org/10.48550/arXiv.1409.1556, 2015.

Srivastava, S., Anand, N., Sharma, S., Dhar, S., and Sinha, L. K.: Monthly Rainfall Prediction Using Various Machine Learning Algorithms for Early Warning of Landslide Occurrence, in: 2020 International Conference for Emerging Technology (INCET), 5–7 June 2020, Belgaum, India, 1–7, https://doi.org/10.1109/INCET49848.2020.9154184, 2020.

Stanton, J. M.: Galton, Pearson, and the Peas: A Brief History of Linear Regression for Statistics Instructors, J. Stat. Educ., 9, 13 pp., https://doi.org/10.1080/10691898.2001.11910537, 2001.

Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z.: Rethinking the Inception Architecture for Computer Vision, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27–30 June 2016, Las Vegas, NV, USA, 2818–2826, https://doi.org/10.1109/CVPR.2016.308, 2015.

Take, W. A., Beddoe, R. A., Davoodi-Bilesavar, R., and Phillips, R.: Effect of antecedent groundwater conditions on the triggering of static liquefaction landslides, Landslides, 12, 469–479, https://doi.org/10.1007/s10346-014-0496-7, 2015.

Tehrani, F. S., Calvello, M., Liu, Z., Zhang, L., and Lacasse, S.: Machine learning and landslide studies: recent advances and applications, Nat. Hazards, 114, 1197–1245, https://doi.org/10.1007/s11069-022-05423-7, 2022.

Thai Pham, B., Shirzadi, A., Shahabi, H., Omidvar, E., Singh, S. K., Sahana, M., Talebpour Asl, D., Bin Ahmad, B., Kim Quoc, N., and Lee, S.: Landslide Susceptibility Assessment by Novel Hybrid Machine Learning Algorithms, Sustainability, 11, 4386, https://doi.org/10.3390/su11164386, 2019.

van Natijne, A. L., Bogaard, T. A., Zieher, T., Pfeiffer, J., and Lindenbergh, R. C.: Machine-learning-based nowcasting of the Vögelsberg deep-seated landslide: why predicting slow deformation is not so easy, Nat. Hazards Earth Syst. Sci., 23, 3723–3745, https://doi.org/10.5194/nhess-23-3723-2023, 2023.

Wang, K.-L., Lin, M.-L., Lin, J.-T., Huang, S.-C., Liao, R.-T., and Chen, C.-W.: Monitoring of the Evolution of a Deep-Seated Landslide in Lushan Area, Taiwan, Eng. Geol. Soc. Terr., 2, 1317–1320, https://doi.org/10.1007/978-3-319-09057-3_231, 2015.

Wang, Y., Dong, J., Zhang, L., Deng, S. H., Zhang, G. K., Liao, M. S., and Gong, J. Y.: Automatic detection and update of landslide inventory before and after impoundments at the Lianghekou reservoir using Sentinel-1 InSAR, Int. J. Appl. Earth Obs., 118, 103224, https://doi.org/10.1016/j.jag.2023.103224, 2023.

Wu, J. H.: Seismic landslide simulations in discontinuous deformation analysis, Comput. Geotech., 37, 594–601, https://doi.org/10.1016/j.compgeo.2010.03.007, 2010.

Xu, J., Li, H., Du, K., Yan, C., Zhao, X., Li, W., and Xu, X.: Field investigation of force and displacement within a strata slope using a real-time remote monitoring system, Environ. Earth Sci., 77, 552, https://doi.org/10.1007/s12665-018-7729-3, 2018.

Xu, J., Jiang, Y., and Yang, C.: Landslide Displacement Prediction during the Sliding Process Using XGBoost, SVR and RNNs, Appl. Sci., 12, 6056, https://doi.org/10.3390/app12126056, 2022.

Yang, B., Yin, K., Lacasse, S., and Liu, Z.: Time series analysis and long short-term memory neural network to predict landslide displacement, Landslides, 16, 677–694, https://doi.org/10.1007/s10346-018-01127-x, 2019.

Yang, S., Jin, A., Nie, W., Liu, C., and Li, Y.: Research on SSA-LSTM-Based Slope Monitoring and Early Warning Model, Sustainability, 14, 10246, https://doi.org/10.3390/su141610246, 2022.

Zhang, L., Shi, B., Zhu, H., Yu, X. B., Han, H., and Fan, X.: PSO-SVM-based deep displacement prediction of Majiagou landslide considering the deformation hysteresis effect, Landslides, 18, 179–193, https://doi.org/10.1007/s10346-020-01426-2, 2021.

Zhang, T., Li, Y., Wang, T., Wang, H., Chen, T., Sun, Z., Luo, D., Li, C., and Han, L.: Evaluation of different machine learning models and novel deep learning-based algorithm for landslide susceptibility mapping, Geosci. Lett., 9, 26, https://doi.org/10.1186/s40562-022-00236-9, 2022.

Zhang, W., Li, H., Tang, L., Gu, X., Wang, L., and Wang, L.: Displacement prediction of Jiuxianping landslide using gated recurrent unit (GRU) networks, Acta Geotech., 17, 1367–1382, https://doi.org/10.1007/s11440-022-01495-8, 2022.

Zhang, W. G., Zhang, R. H., Wu, C. Z., Goh, A. T. C., Lacasse, S., Liu, Z. Q., and Liu, H. L.: State-of-the-art review of soft computing applications in underground excavations, Geosci. Front., 11, 1095–1106, https://doi.org/10.1016/j.gsf.2019.12.003, 2020.

Zheng, H. Y., Liu, B., Han, S. Y., Fan, X. Y., Zou, T. Y., Zhou, Z. L., and Gong, H.: Research on landslide hazard spatial prediction models based on deep neural networks: a case study of northwest Sichuan, China, Environ. Earth Sci., 81, 258, https://doi.org/10.1007/s12665-022-10369-x, 2022.

Zhou, C., Yin, K., Cao, Y., Ahmed, B., and Fu, X.: A novel method for landslide displacement prediction by integrating advanced computational intelligence algorithms, Sci. Rep., 8, 7287, https://doi.org/10.1038/s41598-018-25567-6, 2018.

Zoph, B., Vasudevan, V., Shlens, J., and Le, Q. V.: Learning Transferable Architectures for Scalable Image Recognition, in: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 18–23 June 2018, Salt Lake City, UT, USA, 8697–8710, https://doi.org/10.1109/CVPR.2018.00907, 2018.

Short summary

This study improves the prediction of deep-seated landslide displacement by combining convolutional neural networks with the Age of Exploration-Inspired Optimizer (AEIO), an optimization algorithm inspired by historical voyages of exploration. Using 8 years of displacement, groundwater level, and meteorological data from Taiwan's Lushan, the AEIO–MobileNet model forecasts displacement with a mean absolute percentage error of 2.81 %. These forecasts support early warning systems that can help protect lives and infrastructure in vulnerable slope areas.