Assessing the importance of feature selection in Landslide Susceptibility for Belluno province (Veneto Region, NE Italy)
- 1Department of Geosciences, University of Padova, Padova, Italy
- 2Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, Enschede, Netherlands
Abstract. In landslide risk science, landslide susceptibility mapping (LSM) is important because it identifies where landslides are likely to occur. This study used a statistical ensemble model (Frequency Ratio and Evidence Belief Function) and two machine learning (ML) models (Random Forest and XGBoost) for LSM in the Belluno province (Veneto Region, NE Italy). We evaluated the importance of the conditioning factors (features) for the overall predictive capability of the statistical and ML models. Using a trial-and-error approach, we eliminated the least "important" features by applying a common threshold. We found that removing the least "important" features does not affect the overall accuracy of the LSM for any of the three models. The most commonly available features, for example the topographic features, yield comparable results once the least "important" ones are removed. This confirms that the required factor maps can be chosen based on the physiography of the region. Across the three models, commonly available feature data proved useful for LSM at regional scale, so the less readily available features can be omitted in most use cases where data are scarce. Regional-scale LSM has implications for understanding landslide phenomena in the region and for post-event relief measures, disaster risk reduction planning, mitigation, and the evaluation of potentially affected areas.
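For readers unfamiliar with the Frequency Ratio named in the abstract, the following is a minimal sketch of its standard definition: for each class of a conditioning factor, the share of landslide pixels falling in that class divided by the share of all pixels in that class. The function name, the toy pixel lists, and the class labels are hypothetical and are not taken from the study.

```python
from collections import Counter

def frequency_ratio(classes, landslide):
    """Return {class: FR} for one conditioning factor.

    classes   -- class label of each pixel (e.g. a slope bin)
    landslide -- 1 if the pixel is a mapped landslide, else 0
    """
    total_pixels = len(classes)
    total_slides = sum(landslide)
    class_pixels = Counter(classes)
    class_slides = Counter(c for c, y in zip(classes, landslide) if y)
    # FR = (landslide share of the class) / (area share of the class)
    return {
        c: (class_slides[c] / total_slides) / (n / total_pixels)
        for c, n in class_pixels.items()
    }

# Hypothetical toy raster flattened to 10 pixels (not study data).
slope_class = ["low"] * 6 + ["steep"] * 4
is_landslide = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
fr = frequency_ratio(slope_class, is_landslide)
```

A value above 1 indicates that landslides are over-represented in that class relative to its area; a value below 1 indicates the opposite.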
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint
Sansar Raj Meena et al.
Interactive discussion
Status: closed
-
RC1: 'Comment on nhess-2021-299', Anonymous Referee #1, 01 Dec 2021
I have reviewed the manuscript "Assessing the importance of feature selection in Landslide Susceptibility for Belluno province (Veneto Region, NE Italy)", submitted by Sansar Raj Meena, Silvia Puliero, Kushanav Bhuyan, Mario Floris, Filippo Catani, which focuses on the importance of feature selection for landslide susceptibility zonation. The manuscript could be interesting for the journal but requires a major revision. Data are not clearly presented, some definitions are confusing, and the structure needs to be reorganized. Moreover, the manuscript is not written in good English and some sentences are difficult to understand. I recommend that the authors submit a revised version of the manuscript after revision by an English-speaking person. Detailed comments are provided throughout the text.
- AC2: 'Reply on RC1', Sansar Raj Meena, 19 Jan 2022
-
RC2: 'Comment on nhess-2021-299', Anonymous Referee #2, 07 Dec 2021
I have gone through the manuscript and found that the quality of the work is very good and applied. I have some observations that need to be addressed before it goes to final publication.
1. The words "features" and "factors" are used interchangeably in the paper. Better to stick to one word.
2. The distinction between pre- and post-predictions is not clear, which sometimes makes the context difficult to read and understand.
3. An image of the affected area, for example, could be very insightful for commenting on the extent of the damage over the area.
4. The scale of the maps used in the experiments is missing.
5. The explanation of the methodology can be better, especially the opening paragraph.
6. The mapping units are not yet defined and must be mentioned.
7. A better explanation of the models is required, mostly in the context of LSM and how these models learn and predict for LSM.
8. The choice of 0.3 as the threshold must be justified better.
9. Graph axes have no labels.
10. There is no definition of the training and testing datasets for model prediction. A section is needed for that.
AC1: 'Reply on RC2', Sansar Raj Meena, 19 Jan 2022
Comment on nhess-2021-299
Anonymous Referee #2
I have gone through the manuscript and found that the quality of the work is very good and applied. I have some observations that need to be addressed before it goes to final publication.
1. The words "features" and "factors" are used interchangeably in the paper. Better to stick to one word.
Ans: Thank you for your suggestions. We use the words “conditioning factors” throughout the manuscript to avoid further confusion.
2. The distinction between pre- and post-predictions is not clear, which sometimes makes the context difficult to read and understand.
Ans: “Pre-predictions” refers to the stage before the model is trained; here we refer to literature that assesses factor importance prior to model training. Similarly, “post-predictions” refers to factor importance assessed after model training, which is what we do in this study. Nonetheless, we have edited these two terms in the document to avoid confusion.
3. An image of the affected area, for example, could be very insightful for commenting on the extent of the damage over the area.
Ans: Thank you for your comment. We have added some images in figure 1.
4. The scale of the maps used in the experiments is missing.
Ans: We have added the scale information in the manuscript.
5. The explanation of the methodology can be better, especially the opening paragraph.
Ans: We have re-arranged the paragraphs along with the conceptual framework diagram to make the methodology more comprehensive and readable.
6. The mapping units are not yet defined and must be mentioned.
Ans: We have added this information in the manuscript. We carried out the analysis at pixel level for our study area.
7. A better explanation of the models is required, mostly in the context of LSM and how these models learn and predict for LSM.
Ans: We have explained the models better, keeping in mind their use in the context of LSM. Please refer to sections 3.2.1 and 3.2.2.
8. The choice of 0.3 as the threshold must be justified better.
Ans: We have given the reasoning for this in lines 398-400. To recall, we tried many cut-off (threshold) values to see which set of conditioning factors gave the best susceptibility accuracy after removing the factors that fell below the cut-off.
9. Graph axes have no labels.
Ans: We have added the y-axis labels in the graphs. We refer you to figures 6 and 10.
10. There is no definition of the training and testing datasets for model prediction. A section is needed for that.
Ans: We have added the definition of the training and testing datasets for model prediction in section 2.2.
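The trial-and-error threshold search described in the answer to comment 8 can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: the discussion does not say what scale the 0.3 cut-off applies to, so the sketch assumes importance scores min-max rescaled to [0, 1]. The factor names, importance values, and the `evaluate` callback (which would train a model on the kept factors and return its accuracy) are all hypothetical.

```python
def select_factors(importance, threshold=0.3):
    """Keep factors whose importance, rescaled to [0, 1], meets the threshold.

    The min-max rescaling is an assumption made for this sketch.
    """
    lo, hi = min(importance.values()), max(importance.values())
    span = (hi - lo) or 1.0  # avoid division by zero if all scores are equal
    scaled = {k: (v - lo) / span for k, v in importance.items()}
    return [k for k, v in scaled.items() if v >= threshold]

def best_threshold(importance, evaluate, candidates):
    """Try each candidate cut-off and return (threshold, score, kept factors)
    for the highest score; ties go to the earliest candidate."""
    results = []
    for t in candidates:
        kept = select_factors(importance, t)
        results.append((t, evaluate(kept), kept))
    return max(results, key=lambda r: r[1])

# Hypothetical importance scores (not values from the study).
importance = {"slope": 0.30, "elevation": 0.22, "lithology": 0.10, "ndvi": 0.04}
kept = select_factors(importance, threshold=0.3)
```

In practice `evaluate` would retrain the susceptibility model on the reduced factor set and score it on the held-out test data, so that the cut-off is judged by accuracy after removal rather than by the importance scores alone.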
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
412 | 184 | 17 | 613 | 12 | 9 |