Thank you for your response, dear authors. I appreciate the revisions made after the previous round of reviews, but I believe the manuscript can still be further improved with additional revisions before it is ready for publication. I have presented my comments and suggestions in this iteration of the review, which I hope the authors will find helpful in enhancing the manuscript further.
1. I will begin the second review and my line of questions starting from the Introduction section. In the last round of review, I raised a general concern about the lack of connection between volume estimation, geomorphological process understanding, and engineering solutions. Unfortunately, I still observe two major issues with the revised version:
A. It appears that the authors have addressed a wide range of topics, including engineering solutions for mitigation, risk assessment, financial compensation, and related aspects. While these are undoubtedly important, the way they are presented lacks a clear and cohesive narrative, making it difficult for readers to follow. As a reader, it feels like I am encountering a series of disconnected bullet points about various applications of volume information for landslides and their societal or biodiversity impacts, without a clear sense of purpose or direction in the text.
Let’s take this for example, “Firstly, to manage landslide risk effectively, the quantification of VLDR can be useful for updating hazard maps to reflect the scale of potential landslides in various regions to facilitate the identification of high-risk zones for monitoring and intervention.” Normally, such statements (especially in a review) are followed by a general explanation of how the volume information can be used directly for “updating hazard maps”, for instance, by illustrating how these updated maps help prioritize areas for additional ground-based investigations, early warning system placements, or resource allocation for slope stabilization efforts. In other words, the statement should detail the direct linkage between volume quantification and the subsequent practical steps that can be taken to mitigate landslide risk, rather than simply asserting that such a connection exists without any further elaboration.
Simple assertions of this kind do not help a reader gauge what is actually going on, particularly if there is no direct link between volumes and the respective impact. Other statements have the same ‘linking’ problem. Moreover, an equally big issue is that the newly added paragraphs all read the same to me; I do not gain any new information from the new text. The authors mention: "mitigation strategies, effective risk management, emergency response, public awareness on safety measures and preparedness, drainage system to control surface runoff, determining expected number of personnel for ‘clean up’ and recovery, establishing ecosystem impacts, habitat restoration, protection of crops and farmlands."
Frankly, these topics are very diverse and complex, spanning multiple engineering, scientific, and social science disciplines. However, I see no clear connection to the manuscript's main narrative. Are the authors implying that their method can address all of these issues simply because it can accurately predict volumes? Does all of South Korea face these problems (more or less) equally? My point is that I cannot discern a clear, concise rationale or storyline explaining why landslide volume estimation is necessary. I recommend rewriting the two paragraphs in the Introduction that deal with volumes and the associated topics more carefully. Please keep the linkage direct and to the point, and cite some examples from the literature.
B. Additionally, the authors did not adequately integrate the volume estimations or predictions into a geomorphological context. This aspect is crucial, as it forms the crux of studies linking sediment transport, material mobilization, and sediment influx into river systems, for example. Omitting this perspective is problematic since it prevents a reader from understanding how the observed volumes relate to underlying geomorphological processes (including hillslope process evolution), ultimately limiting the usefulness of the study’s findings for both scientific insight and practical applications in landscape management and hazard mitigation.
I believe the current structure of the Introduction is not optimal. I encourage the authors to take their time and thoroughly revise this section, especially the two paragraphs related to volumes. I suggest a comprehensive overhaul of the discussion on the importance of volumes, incorporating key works by Montgomery, Jaboyedoff, Korup, and van Westen to strengthen the narrative. Please take this opportunity to revise the text so that the importance of volumes is clear and coherent from an application point of view, from both engineering and geomorphological perspectives.
2. Moving on to the next topic, I want to elaborate on the geographical split testing argument and the application of the model to other locations/regions.
A. Let’s start with the geographical split. The authors state that dividing the data by region would compromise model reliability due to the reduced size of the test set. While I partially agree, the authors also mention that they incorporated altitude as a predictor variable to reflect geographical diversity, citing the influence of orographic rainfall on higher-altitude areas. This reasoning, however, may be oversimplified. Altitude is only one dimension of regional variability and may not fully capture the complexity of geographic differences in landslide susceptibility (and/or by proxy, volumes). Although incorporating altitude could help the model account for some variations associated with elevation, it cannot completely substitute for explicit geographic variability. Regionally distinct factors (such as geology, lithology, vegetation, land use, and soil types) may not be adequately represented by altitude alone. Relying solely on altitude as a proxy for regional variability oversimplifies the spatial heterogeneity inherent in landslide processes.
Please note that, while I do support the approach for splitting the data, I disagree with the notion that altitude alone is sufficient to capture the geographic diversity inherent in South Korea's varied landscapes. In my previous review, my suggestion was to consider performing or evaluating a spatial cross-validation, although a regular 10-fold cross-validation could also suffice (as the authors noted that 60% of the data is concentrated in the northeast), despite the methodological differences between the two approaches. I am interested to hear the authors’ thoughts on the use of altitude as a proxy for geographic diversity.
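To make the distinction concrete, the following is a minimal sketch of what I mean by a spatial (block) cross-validation split, as opposed to a random 10-fold split. It is an illustration only and not the authors' pipeline: the function name `spatial_block_folds`, the grid cell size, and the coordinates are all hypothetical. Each landslide is assigned to a grid cell based on its location, and each cell in turn is held out as the test set, so that the test landslides are never spatially interleaved with the training ones.

```python
from collections import defaultdict


def spatial_block_folds(coords, cell_size):
    """Build leave-one-block-out folds from point coordinates.

    coords: list of (x, y) tuples in a projected coordinate system.
    cell_size: side length of the square grid cells (same units as coords).
    Returns a list of (train_indices, test_indices) pairs, one per occupied cell.
    """
    # Assign every sample index to the grid cell that contains its location.
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        key = (int(x // cell_size), int(y // cell_size))
        blocks[key].append(idx)

    # Hold out one spatial block at a time; train on all remaining samples.
    folds = []
    for key in sorted(blocks):
        test_idx = blocks[key]
        held_out = set(test_idx)
        train_idx = [i for i in range(len(coords)) if i not in held_out]
        folds.append((train_idx, test_idx))
    return folds


# Hypothetical landslide locations (km), made up purely for illustration.
coords = [(1.0, 1.0), (2.0, 3.0), (12.0, 1.5), (13.0, 4.0), (1.5, 14.0), (3.0, 12.0)]
folds = spatial_block_folds(coords, cell_size=10.0)
```

In a regular 10-fold split the indices would instead be shuffled without regard to location, which is why the two approaches can give different performance estimates when the data are spatially clustered (as with the 60% concentration in the northeast).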
B. For the application of the model, the authors mention using an unknown dataset (presumably the independent test set), on which the model achieved an R² value above 0.8, which is quite good. My previous recommendation was to see whether the authors could apply the model elsewhere, ideally in a nearby area (still within South Korea) with new (or even old) landslides for which no volume information is currently available. While ML/DL models often perform well on familiar data, they may produce unpredictable, random, or less meaningful results when applied in a ‘new’ region or context. In this way, the authors could see how the model behaves under different conditions and gain insight into its generalizability and practical applicability beyond the training environment. Of course, the authors will not be able to validate these results (for N landslides) since no ground truth exists, but the exercise will give a good idea of whether the predicted volumes are off the charts (e.g., extremely large or very small). This is important for investigating how random the predictions of the model(s) can be and, beyond that, it provides additional motivation for the authors' work, moving it beyond merely a ‘modelling exercise’.
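As an illustration of the kind of plausibility check I have in mind, here is a minimal sketch under stated assumptions. The function name `tail_fraction`, the 10th/90th percentile thresholds, and all volume values are hypothetical and chosen only for demonstration. The idea is to compare the volumes predicted for the new region against the distribution of observed training volumes, and to report how many predictions fall in the extreme tails. A markedly higher share than in the training data would suggest the model is behaving erratically in the new setting.

```python
def nearest_rank_percentile(sorted_values, pct):
    """Nearest-rank percentile (pct is an integer from 1 to 100)."""
    n = len(sorted_values)
    rank = (pct * n + 99) // 100  # integer ceiling of pct * n / 100
    return sorted_values[rank - 1]


def tail_fraction(predicted, training_volumes, low_pct=10, high_pct=90):
    """Flag predictions outside the [low_pct, high_pct] range of training volumes.

    Returns the indices of flagged predictions and their share of all predictions.
    """
    ordered = sorted(training_volumes)
    low = nearest_rank_percentile(ordered, low_pct)
    high = nearest_rank_percentile(ordered, high_pct)
    flagged = [i for i, v in enumerate(predicted) if v < low or v > high]
    return flagged, len(flagged) / len(predicted)


# Made-up volumes (m^3) for illustration; these are not values from the manuscript.
training_volumes = [20, 50, 80, 120, 200, 300, 450, 800, 1500, 4000]
new_region_predictions = [90, 3000, 15, 400, 2500]

flagged, fraction = tail_fraction(new_region_predictions, training_volumes)
```

Such a check does not validate the predictions, but it gives a quick, quantitative sense of whether the model output in the new region remains within a physically reasonable range.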
3. Regarding the Discussion, the authors stated in their response: “direct comparison with result of existing numerical and statistical models that solely depend on geometrical features of landslide (such as, surface area or runout length) is out of the scope of this investigation”. It seems that the authors may have misinterpreted my suggestions. The recommendation was not to perform a numerical comparison with other methods, such as statistical or numerical models, but rather to review the literature on such methods and highlight why the authors’ approach is reliable.
This suggestion is particularly important because, as the authors themselves mentioned, no previous study has used such a multivariate predictor approach for volume predictions. Therefore, it is important to discuss and review this aspect, as a large part of the literature relies on, for instance, numerical and geometrical methods for volume estimation. Additionally, it is crucial to discuss and review related topics in common geomorphological research (such as sediment transport, landscape evolution, and material mobilization), since these processes rely heavily on volume data for quantification. This connects back to the Introduction section, which I previously noted requires an overhaul. Elements introduced in that section can be expanded upon in the Discussion to emphasize potential applications of the proposed approach. While volume information is undeniably important, focusing only on the influence of ML/DL on model performance significantly undersells the broader implications that the Discussion section could address, given the scope and nature of the study.
4. The ‘Descriptions’ column in Table 1 seems misleading. Descriptions should also include the definition of each variable, not just its ‘influence’. For example, for Slope angle, there is no definition of what it means, but rather a statement explaining the influence of the slope angle (e.g., slopes at 20-30 degrees are more vulnerable to landslides due to rainfall). This is not really a ‘description’. I suggest either changing the column name or first adding a definition for each variable and then explaining its influence on landslides.
Also, based on line 287, it seems that there are only three types of soil: sandy loam, loam, and silt loam. Please add them to the table under soil types as well.
5. The authors have provided a clear explanation of the feature importance for soil depth, and I appreciate their decision to retain it, as it is crucial for volume estimations. The authors noted that soil depth could play a more significant role in different regional settings with varying behaviors or responses, and I agree with this perspective. I have no further comments on this matter.
6. I appreciate the response and explanation regarding the differences between the Random Forest and EGB models in predicting smaller and larger volumes, respectively. Indeed, an iterative process like EGB, guided by gradient descent, is likely to capture the more intricate patterns associated with landslides generating large volumes. Similarly, the 'average' behaviour of the ensemble approach in Random Forest effectively accounts for the prediction of smaller volumes on average. I have no further comments on this matter.
7. The authors have also explained the landslide movement query very well, and I have no further questions in that regard.
In conclusion, my impression of the technical aspects of the work is positive since much of the authors’ clarifications addressed my concerns. However, the justification for the importance of volume information, its applications, and the future scope remains limited, which undersells the contribution of this study. I believe that an additional round of revisions would further enhance the manuscript, making it more accessible and impactful for a broader audience.
I wish the authors good luck with their revisions.