the Creative Commons Attribution 4.0 License.
Prediction of volume of shallow landslides due to rainfall using data-driven models
Abstract. Rainfall-induced landslides are among the most destructive natural disasters, causing property damage, huge financial losses, and human deaths in different parts of the world. To plan for mitigation and resilience, predicting the volume of rainfall-induced landslides is essential for understanding the relationship between the volume of soil debris and its associated predictors. The objective of this research is to construct a model for predicting the volume of landslides due to rainfall, considering geological, geomorphological, and environmental conditions, using advanced data-driven algorithms: ordinary least squares (OLS) linear regression, random forest (RF), support vector machine (SVM), extreme gradient boosting (EGB), generalized linear model (GLM), decision tree (DT), deep neural network (DNN), K-nearest neighbor (KNN), and ridge regression (RR). The models were tested on the Korean landslide dataset to identify the best performer; among the tested algorithms, extreme gradient boosting ranked highest, with a coefficient of determination of R² = 0.85 and a mean absolute error of MAE = 150.421 m³. Landslide volume was strongly influenced by slope length, drainage status, slope angle, aspect, and age of trees. The predicted landslide volume can inform land use allocation and efficient landslide risk management.
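For readers unfamiliar with the two scores quoted in the abstract, the following minimal sketch shows how a coefficient of determination and a mean absolute error are typically computed. The function names and the toy volumes are illustrative assumptions; they are not the authors' code or data.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference, in the same unit as the inputs (e.g. m^3)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical landslide volumes in m^3 (made-up numbers, not from the paper).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
```

Because MAE is expressed in the unit of the target, a value such as 150.421 m³ is easiest to interpret alongside the range of volumes in the inventory.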
Status: final response (author comments only)
-
RC1: 'Comment on nhess-2024-90', Anonymous Referee #1, 31 Jul 2024
I am attaching my full comments in the attached PDF. At the same time, I am summarizing my general comments here for the editor's perusal.
This manuscript presents a valuable reflection of data-driven modelling for robust regional-scale analyses of landslide masses. The authors deserve commendation for their interesting research, which has significant implications for hazard prediction and modelling. However, I have some major comments and concerns. While the study is promising and of great interest to the landslide community, it requires further work. Some aspects of the training and testing regimes are not clear. Furthermore, the choice of certain parameters is not well justified which, in my opinion, must be clarified for readers to understand the logic of choosing said parameters. The English language, particularly in the Introduction, needs improvement. Some sentences read awkwardly and are hard to follow. Improved sentence phrasing is necessary to make the manuscript clearer, especially for non-native English readers. In my opinion, a major revision is required to adapt the manuscript before considering acceptance.
-
AC1: 'Reply on RC1', Sang-Guk Yum, 02 Nov 2024
The comment was uploaded in the form of a supplement: https://nhess.copernicus.org/preprints/nhess-2024-90/nhess-2024-90-AC1-supplement.pdf
-
RC2: 'Comment on nhess-2024-90', Anonymous Referee #2, 02 Sep 2024
General Comments
- In the introduction, the authors should explain more about why volume estimations are crucial for understanding and managing landslide hazards.
- The literature review section should be expanded to incorporate more recent studies on landslide volume prediction models, providing a comprehensive overview of the current state of research in this field.
- The study area section should be enhanced with more detailed information on landslide-triggering factors. Additionally, it would be beneficial to incorporate a figure showing representative rainfall characteristics prior to the recorded landslide events in different parts of the Korean Peninsula. This would help better understand the unique rainfall patterns of the region responsible for landslides.
- Figure 2 needs to be updated. In the predictor variables, the authors should clearly specify which factors are influencing factors and which are triggering factors.
- A more detailed discussion of the input variables considered for volume prediction is required to better understand their roles as influencing and triggering factors. Additionally, the manuscript should provide further justification for the selection of these predictor variables.
- I recommend providing clearer details on the geometry of the landslide inventory.
- A brief discussion on why nine data-driven models were chosen is recommended in the methods section. While these models have become quite common, providing a rationale for their selection will help justify their use in the study.
- The authors mainly use MAE and R² for model validation. It is recommended to consider additional metrics commonly used in data-driven model evaluation. Relying solely on these two statistics may not comprehensively assess model performance.
- The summary of the various data-driven models (Table 3) indicates that the EGB model is the best-performing. However, the variable importance analysis shown in Figure 5 highlights only a subset of predictor variables, raising questions about whether different models utilize different sets of features. Further clarification is needed.
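One way to compare feature use across different model types, as the last comment above asks, is a model-agnostic measure such as permutation importance. The sketch below is only an illustration of that idea under stated assumptions: the helper names, the toy model, and the data are hypothetical, and nothing here reflects how the manuscript computed Figure 5.

```python
import random


def mean_absolute_error(observed, predicted):
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Mean increase in MAE when one feature column is shuffled.

    `predict` is any callable mapping a list of feature rows to predictions,
    so the same routine can be applied to every fitted model.
    """
    rng = random.Random(seed)
    baseline = mean_absolute_error(targets, predict(rows))
    n_features = len(rows[0])
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving all other features untouched.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            increases.append(mean_absolute_error(targets, predict(shuffled)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical stand-in for a fitted model: volume depends on feature 0 only.
def toy_model(rows):
    return [50.0 * row[0] for row in rows]


# Made-up data: [slope_length, unused_feature] for ten slopes.
rows = [[float(i), float((7 * i) % 10)] for i in range(1, 11)]
targets = toy_model(rows)

importances = permutation_importance(toy_model, rows, targets)
```

A feature the model ignores scores zero, while a feature it relies on scores above zero, which makes rankings from different algorithms directly comparable on the same held-out data.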
Specific Comments
- Figure 1(b): The y-axis label is missing.
- In Figure 2, it would be better to use the terms ‘Training and Testing Algorithms’ instead of ‘Run and Test Algorithms’. This terminology more accurately reflects the standard processes involved in model development.
- Line No. 104-107: I recommend that the authors use the acronyms for the different data-driven models here, as they have already been defined earlier in the manuscript. Consistent use of these acronyms throughout the manuscript will improve clarity and readability.
- Line No. 150-152: ‘Thus, planting vegetation is recommended as a better practice to improve soil cohesion and prevent potential landslides due to soil root interaction (Gong et al., 2017; Phillips et al., 2021)’. This is a recommendation, not a description. Please provide appropriate descriptions.
- Why did the authors use 70% of the inventory and 30% for the validation? Why not 50% for each? The authors should state in the methods section why they used these percentages. Further, the authors need to clarify whether training and testing data were chosen randomly or if any specific criteria were used for the analysis.
- Please review the references cited in the text, as there are frequent errors with the use of commas and semicolons between references. This issue occurs multiple times throughout the manuscript and needs correction for proper citation formatting.
- Chen et al. (2015) is not cited correctly in the text. There are two different articles by Chen et al. (2015) listed in the references section. The authors need to distinguish between these references by specifying them as Chen et al. (2015a) and Chen et al. (2015b). The reference section and citation in the text should be updated accordingly to reflect these distinctions.
- Line 133 cites (Kafle, 2022), but this article is either missing from the reference section or is not cited correctly. Please verify and ensure that this reference is properly included and formatted in the reference section.
- Line 215: Chowdhury (2023) is either missing from the reference section or not cited correctly.
- Line 236: (Team, 2022) is either missing from the reference section or not cited correctly.
- Line 239: Jerome et al. (2012) is either missing from the reference section or not cited correctly.
- Please verify the unit of soil depth. Most landslide inventories presented in Table 2 and Figure 1(d) suggest that the landslides are shallow-seated based on their volume distributions; the unit of soil depth does not seem to align with this observation.
- I suggest adding a few lines in the discussion section to highlight the practical applicability of the proposed model. This would provide insight into how the model can be used in real-world scenarios and its potential impact on practice or policy.
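The question above about the 70%/30% partition concerns both the proportions and whether records were assigned at random. As a minimal sketch of a reproducible random split, assuming a simple shuffle with a fixed seed (the function name, the seed, and the toy identifiers are illustrative, and the manuscript's actual procedure is not stated here):

```python
import random


def train_test_split(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of `records` and cut it at `train_fraction`."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical landslide record identifiers (not from the Korean inventory).
landslide_ids = list(range(10))
train_ids, test_ids = train_test_split(landslide_ids)
```

Reporting the seed and the splitting rule lets readers reproduce the partition, and changing `train_fraction` makes it straightforward to check how sensitive the results are to the chosen ratio.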
Citation: https://doi.org/10.5194/nhess-2024-90-RC2
-
AC2: 'Reply on RC2', Sang-Guk Yum, 02 Nov 2024
My co-authors and I would like to express our gratitude to the reviewer for constructive feedback and suggestions for strengthening our research. The changes we have made to the attached file in response to such feedback and suggestions have been highlighted in blue to facilitate their identification.
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
359 | 121 | 24 | 504 | 9 | 13 |
Jérémie Tuganishuri
Chan-Young Yune
Manik Das Adhikari
Seung Woo Lee
Gihong Kim
Sang-Guk Yum
To reduce the consequences of rainfall-induced landslides, such as loss of life, economic losses, and disruption of daily life, this study describes the process of building a machine learning model that can help estimate the volume of landslide material that may occur in a particular region, taking into account antecedent rainfall, soil characteristics, vegetation type, and other factors. The findings can be useful for land use, infrastructure design, and rainfall disaster management.