Potential improvements of landslide prediction by hydro-meteorological thresholds: an investigation based on reanalysis soil moisture data and principal component analysis
- 1Department of Civil Engineering and Architecture, University of Pavia, Italy
- 2Department of Civil Engineering and Architecture, University of Catania, Catania, 95123, Italy
- 3Department of Civil Engineering and Architecture, University of Pavia, Pavia, 27100, Italy
- anow at: Department of Civil Engineering and Architecture, University of Catania, Catania, 95123, Italy
Abstract. In recent times, several efforts have been devoted to understanding the extent to which soil moisture estimates may improve the performance of landslide early warning systems (LEWSs). These systems have traditionally been based on rainfall intensity-duration thresholds. Still, only a limited number of studies explore the possible enhancement of LEWS performance through the identification of hydro-meteorological thresholds. In this study, we propose a methodology for developing regional hydro-meteorological landslide triggering thresholds that couple mean rainfall intensity and soil moisture information. To test the potential improvements in prediction, we use ERA5-Land reanalysis soil moisture data, available at four depth levels and at hourly resolution. Two different instances are investigated, namely the identification of triggering thresholds using rainfall intensity and the soil moisture at each of the four depth levels, and the identification of triggering thresholds using rainfall intensity and a combination of soil moisture at the four depths as obtained by principal component analysis (PCA). We propose thresholds in the form of a piece-wise linear equation. The equation’s parameters are optimized to maximize the ROC True Skill Statistic (TSS) prediction performance metric. The proposed hydro-meteorological thresholds are tested on the case of Sicily Island (southern Italy), and their performance is compared with that obtained through the traditional rainfall intensity-duration (ID) power-law thresholds. Overall, the results show that the soil moisture information adds considerable value to the thresholds’ performance, since the ROC True Skill Statistic increases from 0.50 to 0.71. A similar performance is obtained when the first principal component derived from the PCA is used, proving PCA to be a valuable support tool for the identification of the proposed hydro-meteorological thresholds, as it allows the multi-layer information to be taken into account while keeping the thresholds two-dimensional.
-
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint
Nunziarita Palazzolo et al.
Interactive discussion
Status: closed
-
RC1: 'Comment on nhess-2022-175', Anonymous Referee #1, 27 Jul 2022
General Comments:
This is a mostly well-written preprint that I feel only needs minor revisions. The use of PCA to reduce the dimensionality of the hydrometeorological space is novel, and the thresholds produced are an improvement over other methods. There are some missing details regarding the landslide inventory selection process, described below. These missing details constitute the bulk of my concerns and if addressed, I feel that the paper will tell a more complete story of the authors' methodology.
Specific Comments:
Regarding the landslide selection process described near the end of section 2.1:
The selection of ground truth is a critical decision for this type of analysis, especially if that ground truth is partially derived from other algorithms or datasets, such as what you are doing with CTRL-T. I believe this section needs two additions to help convince readers that what you're doing is scientifically sound.
Firstly, for the "adjustable parameters" of CTRL-T, I would like to see some description of how and why you chose the final parameter values. I believe you briefly mention the separating length of time for rainfall events in wet and dry periods later on in the paper. There is also this sentence in line 135: "Rainfall event parameters were calibrated adopting the monthly soil water balance model and evapotranspiration analysis." But I'm unclear on if this calibration process was done automatically by the program or manually by the authors. A final list of adjustable parameter values, with some brief defense of their selection, would help readers understand what parts of CTRL-T are automated and which are tuned by hand.
Secondly, I would describe briefly in greater detail how you decided that landslides did not have identifiable or uncertain landslide conditions. I presume some threshold on the weights was used. If so, what were those thresholds values and how did you decide them? Or if some other metric was used to quantify the landslide cause as being uncertain, briefly provide and defend those decisions.
Technical Comments and Optional Suggestions:
See attached PDF for grammar corrections and other suggestions for additional figures or figure revisions that I do not feel are mandatory.
-
AC1: 'Reply on RC1', Nunziarita Palazzolo, 05 Oct 2022
Comment on nhess-2022-175
Dear Editor and Reviewers,
we thank the guest editor Francesco Marra for handling our manuscript and the referees for their insightful comments, which will surely help us in improving our work. In the following, we reply point by point to Anonymous Referee #1.
Author’s response to anonymous Referee #1
General Comments:
Referee #1: This is a mostly well-written preprint that I feel only needs minor revisions. The use of PCA to reduce the dimensionality of the hydrometeorological space is novel, and the thresholds produced are an improvement over other methods. There are some missing details regarding the landslide inventory selection process, described below. These missing details constitute the bulk of my concerns and if addressed, I feel that the paper will tell a more complete story of the authors' methodology.
Author’s Response: We thank Referee #1 for the overall appreciation of our work and for recognizing its novel aspects. We will revise the manuscript by adding more details about the landslide inventory, in order to describe the work done more clearly.
Specific Comments:
Referee #1: Regarding the landslide selection process described near the end of section 2.1.
The selection of ground truth is a critical decision for this type of analysis, especially if that ground truth is partially derived from other algorithms or datasets, such as what you are doing with CTRL-T. I believe this section needs two additions to help convince readers that what you're doing is scientifically sound.
Firstly, for the "adjustable parameters" of CTRL-T, I would like to see some description of how and why you chose the final parameter values. I believe you briefly mention the separating length of time for rainfall events in wet and dry periods later on in the paper. There is also this sentence in line 135: "Rainfall event parameters were calibrated adopting the monthly soil water balance model and evapotranspiration analysis." But I'm unclear on if this calibration process was done automatically by the program or manually by the authors. A final list of adjustable parameter values, with some brief defense of their selection, would help readers understand what parts of CTRL-T are automated and which are tuned by hand.
Author’s Response: We thank the reviewer for pointing out that more details should be provided regarding these aspects. For the computation of the regional parameters required by CTRL-T, we referred to a previous application of the algorithm to Sicily Island (Melillo et al., 2015). As explained by the authors, the heuristic approach proposed by Brunetti et al. (2010), and updated by Peruccacci et al. (2012), has been adopted to separate two rainfall events. Specifically, according to this approach, the dry period (no rain) has been set equal to 48 hours (P4, warm) between April and October (warm season, Cw), while it has been set equal to 96 hours (P4, cold) from November to March (cold season, Cc). Indeed, in line with Köppen (1931) and Trewartha (1968), it is reasonable to assume that in Sicily, due to the Mediterranean climate, the warm period is longer than the cold one. On lines 135 and 136 we referred to the parameters representative of the time periods used to remove the irrelevant amounts of rain and to reconstruct rainfall events (P1, P2, P4) and to the irrelevant rainfall sub-events that had to be excluded in the calculation of the final events (P3). In more detail:
- P1 represents the dry interval separating isolated rainfall measurements and it has been set equal to 3 hours for the Cw period, and to 6 hours for Cc period;
- P2 represents the dry interval separating the rainfall sub-event, namely the period of continuous rainfall separated from the immediately preceding and the immediately following sub-events by dry periods with no rain. It has been set equal to 6 hours in the Cw period and equal to 12 hours in the Cc period;
- P3 represents the threshold used to exclude the sub-events whose contribution can be considered irrelevant for the reconstruction of the rainfall events and for the possible initiation of the landslide; it has been reasonably set equal to 1 mm for the Mediterranean climate;
- P4 represents the minimum dry period separating two rainfall events, where a rainfall event is a period of continuous rainfall resulting from the aggregation of single or multiple sub-events. P4 has been set equal to 48 hours for the Cw period and equal to 96 hours for the Cc period.
Additional parameters that need to be set for the reconstruction of the rainfall events are: GS, representing the instrumental sensitivity of the rain gauge; ER, representing the instrumental sensitivity of the rain gauge and the minimum value above which the isolated hourly measurements are considered relevant; and RB, representing the radius of the buffer used to assign each landslide to the closest rain gauge.
A final table of the adjustable parameter values, as reported in the following, will certainly be added to the manuscript in order to better explain their meaning and their role in the algorithm.
Parameter name | Parameter value (Cw) | Parameter value (Cc)
GS [mm] | 0.2 | 0.2
ER [mm] | 0.2 | 0.2
RB [km] | 10 | 10
P1 [h] | 3 | 6
P2 [h] | 6 | 12
P3 [mm] | 1 | 1
P4 [h] | 48 | 96
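The role of the seasonal dry period P4 can be illustrated with a minimal Python sketch. It is not the CTRL-T code: it only applies the P4 separation (48 h in Cw, 96 h in Cc) to hourly records, it skips the P1, P2 and P3 steps, and it assumes that the season of the last wet hour decides which P4 applies. The function names `season` and `split_events` are hypothetical.

```python
from datetime import datetime, timedelta

# Minimum dry gap (hours) separating two rainfall events (P4 values from the reply).
P4_HOURS = {"Cw": 48, "Cc": 96}
WARM_MONTHS = range(4, 11)  # April to October (Cw); the other months are Cc


def season(ts):
    """Return 'Cw' for April-October and 'Cc' otherwise."""
    return "Cw" if ts.month in WARM_MONTHS else "Cc"


def split_events(records):
    """Group hourly (timestamp, rain_mm) records into rainfall events.

    A new event starts when the number of dry hours since the last wet hour
    is at least P4 for the season of that last wet hour (an assumption of
    this sketch). The P1-P3 steps of CTRL-T are not reproduced.
    """
    events, current, last_wet = [], [], None
    for ts, rain in sorted(records):
        if rain <= 0:
            continue  # dry hours only matter through the gap they create
        if last_wet is not None:
            # Dry hours strictly between the two wet hourly records.
            dry_hours = (ts - last_wet).total_seconds() / 3600 - 1
            if dry_hours >= P4_HOURS[season(last_wet)]:
                events.append(current)
                current = []
        current.append((ts, rain))
        last_wet = ts
    if current:
        events.append(current)
    return events
```

With these settings, a 48-hour dry spell closes an event in July but not in January, where 96 dry hours are required.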
Regarding the monthly soil water balance model and evapotranspiration analysis, the calibration process was done automatically by the program, setting the related benchmarks. It was assumed that the evapotranspiration is inversely proportional to the time necessary to dry the soil and, specifically, a factor of 2 between all relevant parameters in the Cw and Cc periods has been adopted (ETR(Cw)≅2⋅ETR(Cc)), as revealed by the analysis of the mean annual evapotranspiration in Italy (Melillo, 2009) using the Thornthwaite–Mather method (Thornthwaite and Mather, 1957) and as adopted by Melillo et al. (2015) for a previous application of the algorithm to Sicily Island.
Referee #1: Secondly, I would describe briefly in greater detail how you decided that landslides did not have identifiable or uncertain rainfall conditions. I presume some threshold on the weights was used. If so, what were those thresholds values and how did you decide them? Or if some other metric was used to quantify the landslide cause as being uncertain, briefly provide and defend those decisions.
Author’s Response: We thank Referee#1 for allowing us to clarify this aspect. The selection of the rainfall events responsible for landslides is performed within the algorithm in two steps. The first step assigns to each landslide a record of rainfall measurements from a single rain gauge, by checking the match between the start and end dates of the rainfall events and the day and time of the landslide occurrence. This approach makes it possible to associate each landslide with a single rainfall event and to discard the landslides for which no time match is found. In the second step, if multiple rainfall conditions that are most likely responsible for the triggering are found, the weighting procedure explained in lines 140-147 of the manuscript is adopted.
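The time-matching step can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the CTRL-T implementation: the function name `candidate_events` and the representation of events as `(start, end)` pairs are assumptions, and the weighting procedure is not reproduced.

```python
from datetime import datetime


def candidate_events(landslide_time, events):
    """Return the rainfall events whose time span contains the landslide.

    `events` is a list of (start, end) datetime pairs for one rain gauge.
    An empty result means no time match, so the landslide would be discarded.
    """
    return [(start, end) for start, end in events
            if start <= landslide_time <= end]
```

A landslide that falls inside no event returns an empty list, while more than one match would call for the weighting procedure mentioned in the reply.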
Technical Comments and Optional Suggestions:
Referee #1: See attached PDF for grammar corrections and other suggestions for additional figures or figure revisions that I do not feel are mandatory.
Author’s Response: We also appreciate the additional grammar corrections and the other suggestions for figure revisions, which will certainly be incorporated into the manuscript together with the additional information and insights reported in the specific comments above.
In particular, we will apply all technical corrections. For the more specific comments in the annotated manuscript, we will modify the manuscript as follows:
- insertion of a more specific overview regarding some statistics related to the cost, damages, and number of casualties due to landslides on a worldwide scale;
- insertion of a grouped bar plot, as a subplot to Figure 4, graphing Eqs. 13, 14, 15, and 16.
-
RC2: 'Comment on nhess-2022-175', Anonymous Referee #2, 29 Jul 2022
The authors compare classical intensity-duration power law thresholds with hydrometeorological thresholds that combine mean intensity with ERA5-Land reanalysis soil moisture. They consider both the soil moisture at 4 different depths and a combination of them obtained with principal component analysis. They find that adding soil moisture information improves the prediction of a rainfall-based threshold.
The manuscript is generally clear and well organized. I only have one major concern with the publication. In fact, the novelty in the work is the use of PCA for combining the soil moisture information at the different depths and keeping the problem 2D (rainfall intensity and soil moisture derived component). While I see the potential of such a methodology and agree with the advantage of keeping the dimensionality of the problem small, I believe some further analyses are required to demonstrate that it is indeed advantageous.
The TSS obtained using intensity and soil moisture at the first or second shallowest depth is pretty much identical to the one using the first component of the PCA. This means that the overall performances are not sufficient to demonstrate the advantage of PCA. The authors should find some alternative way of either demonstrating the advantages directly (with data/results) or demonstrating that the most influential soil layers changes depending on some properties. Can they observe any pattern in which one is more influential than the others? Those could be spatial patterns (e.g., in certain regions the shallower/deeper soil moisture is more important) or relative to landslide properties (e.g., for certain types of landslides the deeper/shallower soil moisture is more important) or relative to the time of the year (e.g., over certain times of the year deeper/shallower soil moisture is more important). Because the PCA piece is the novel piece in this manuscript, I believe it is important to show and demonstrate the advantage compared to just using the shallowest soil moisture.
Besides this, I also have some minor comments:
- I think the method for the definition of rainfall events should be better explained. The definitions of parameters P1-P2-P3-P4 are unclear and so is the sentence in line 148. What does it mean that rainfall conditions were not identified? What are the uncertainties? In rainfall? In landslide properties? Does this mean they were incorrectly identified as “rainfall triggered” in the database ()? Also, what are the values of the parameters chosen (e.g., the maximum radius Rb)? Only information about the interarrival times was provided.
- On soil moisture: could you provide more information about soil moisture? The only information I could find is “association of SM data to the beginning of each rainfall event”, but what does this mean? The first hour of the rainfall event? The hour before its beginning?
- Do rainfall events end at the hour of landslide occurrence or whenever they ended (which could be N hours before or even after the landslide)?
- Would be good to report in Figure 5 also the timing of each landslide (maybe as a vertical black line)
- In Figure 6 you could consider using different plotting techniques to make the figure more readable. In fact, it is impossible from a scatter plot to understand where/how many points there are in the different parts of the plot. You should consider plotting the 2D histogram instead (e.g., see Leonarduzzi et al., 2017). This would allow the reader to better see where most of the events are and which groups of events are “driving” the thresholds. Based on more visible (because it’s less events) distribution of the triggering events, it looks like the not triggering ones are “driving” the threshold. In other words, just by looking at the triggering events, a steeper threshold would improve the performances, which seems to suggest there are a lot of not triggering events in the 20-300h 0.1-1 mm/h region. Would be nice if that could be looked at! What I am suggesting is something similar to what Leonarduzzi et al., (2017) did in Figure 3.
- Would be nice to have information about TPR and FPR also for the ID threshold (maybe as an additional entry in Table 2)
- I have a similar suggestion for Figure 8 as for Figure 6, allowing the reader to see the distribution of the no-triggering events (the triggering sample is small enough that they are visible without much overlapping)
Finally, in the supplementary pdf, some grammar/rewording suggestions are provided.
Leonarduzzi, E., Molnar, P., and McArdell, B. W. (2017), Predictive performance of rainfall thresholds for shallow landslides in Switzerland from gridded daily data, Water Resour. Res., 53, 6612–6625, doi:10.1002/2017WR021044.
-
AC2: 'Reply on RC2', Nunziarita Palazzolo, 05 Oct 2022
Comment on nhess-2022-175
Dear Editor and Reviewers,
we thank the guest editor Francesco Marra for handling our manuscript and the referees for their insightful comments, which will surely help us in improving our work. In the following, we reply point-to-point to anonymous Referee #2.
Author’s response to anonymous Referee #2
General Comments:
Referee#2: The authors compare classical intensity-duration power law thresholds with hydrometeorological thresholds that combine mean intensity with ERA5-Land reanalysis soil moisture. They consider both the soil moisture at 4 different depths and a combination of them obtained with principal component analysis. They find that adding soil moisture information improves the prediction of a rainfall-based threshold.
The manuscript is generally clear and well organized. I only have one major concern with the publication. In fact, the novelty in the work is the use of PCA for combining the soil moisture information at the different depths and keeping the problem 2D (rainfall intensity and soil moisture derived component). While I see the potential of such a methodology and agree with the advantage of keeping the dimensionality of the problem small, I believe some further analyses are required to demonstrate that it is indeed advantageous.
The TSS obtained using intensity and soil moisture at the first or second shallowest depth is pretty much identical to the one using the first component of the PCA. This means that the overall performances are not sufficient to demonstrate the advantage of PCA. The authors should find some alternative way of either demonstrating the advantages directly (with data/results) or demonstrating that the most influential soil layers changes depending on some properties. Can they observe any pattern in which one is more influential than the others? Those could be spatial patterns (e.g., in certain regions the shallower/deeper soil moisture is more important) or relative to landslide properties (e.g., for certain types of landslides the deeper/shallower soil moisture is more important) or relative to the time of the year (e.g., over certain times of the year deeper/shallower soil moisture is more important). Because the PCA piece is the novel piece in this manuscript, I believe it is important to show and demonstrate the advantage compared to just using the shallowest soil moisture.
Author’s Response:
We thank the referee for the comment. We understand the concern that the PCA, which combines the information from the four soil moisture depth layers, does not bring a significant improvement in TSS with respect to the thresholds that use soil moisture from the first or second layer. However, we partly disagree with the referee for two reasons. First, from a general standpoint, testing the PCA approach remains a novel aspect regardless of whether it increases performance compared to thresholds that employ a single layer. Second, we would like to point out that the use of PCA is not the only novel aspect that we introduce. In particular, the novel points include: 1) a deeper look at how the hydro-meteorological method may help to predict landslides, adding to the body of research already available on the issue, which is still underdeveloped; 2) the use of reanalysis datasets for soil moisture information, which are accessible globally at a coarse resolution and contain significant uncertainties (so that an improvement of the thresholds is not a trivial outcome); 3) the questioning of the usual parametric forms of the threshold equation, which is typically a power law (Marino et al., 2020) or a bilinear equation (Mirus et al., 2018; Thomas et al., 2019). In some ways, these points are summarized in the proposed title of the manuscript, which states that we analyze the “potential improvements of landslide prediction by hydro-meteorological thresholds”, and that we just “investigate” the use of “principal component analysis”.
We do, however, recognize that a proposed approach should be worth the increased level of sophistication that it introduces. This could be demonstrated by applying the PCA technique to other case studies, which is beyond the scope of the manuscript, as well as by examining the other aspects that the referee suggests (type of landslides, seasons, etc.).
We will thus amend the manuscript by adding some cautionary sentences that will state the need for more studies to further test the real convenience of applying PCA. However, as stated in the manuscript, PCA remains a valid approach to combining the information of the four available layers and thus avoiding the trial-and-error testing of the use of the various single layers, as the performances obtained are at least as good as those using single-layer information.
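How four soil moisture layers can be condensed into one variable may be easier to see in a short Python sketch. This is an illustration under assumptions, not the authors' code: the layers are standardized before the decomposition (the reply does not say whether this was done), the PCA is computed through a singular value decomposition with NumPy, and the function name is hypothetical.

```python
import numpy as np


def first_principal_component(sm):
    """Project multi-layer soil moisture onto its first principal component.

    `sm` has shape (n_events, n_layers), e.g. four ERA5-Land depth levels.
    Returns the PC1 scores, the PC1 loadings and the explained variance ratio.
    The sign of scores and loadings is arbitrary, as in any PCA.
    """
    sm = np.asarray(sm, dtype=float)
    # Standardize each layer (an assumption of this sketch).
    z = (sm - sm.mean(axis=0)) / sm.std(axis=0, ddof=1)
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = z @ vt[0]
    explained = s[0] ** 2 / np.sum(s ** 2)
    return scores, vt[0], explained
```

The PC1 scores can then play the role of the single soil moisture variable in a two-dimensional threshold, in place of one selected layer.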
Minor Comments:
Referee#2: I think the method for the definition of rainfall events should be better explained. The definitions of parameters P1-P2-P3-P4 are unclear and so is the sentence in line 148. What does it mean that rainfall conditions were not identified? What are the uncertainties? In rainfall? In landslide properties? Does this mean they were incorrectly identified as “rainfall triggered” in the database ()? Also, what are the values of the parameters chosen (e.g., the maximum radius Rb)? Only information about the interarrival times was provided.
Author’s Response: We thank the reviewer for pointing out that more details should be provided regarding the definition of rainfall events. As reported on lines 128-129, the CTRL-T (Calculation of Thresholds for Rainfall induced Landslides-Tool) code (Melillo et al., 2018) is used to identify the rainfall events that were most likely responsible for the observed slope failures. Additional information, also provided to Referee#1, is reported in what follows and should be integrated into the manuscript to better clarify the adopted methods.
For the computation of the regional parameters required by CTRL-T, we referred to a previous application of the algorithm to Sicily Island (Melillo et al., 2015). As explained by the authors, the heuristic approach proposed by Brunetti et al. (2010), and updated by Peruccacci et al. (2012), has been adopted to separate two rainfall events. Specifically, according to this approach, the dry period (no rain) has been set equal to 48 hours (P4, warm) between April and October (warm season, Cw), while it has been set equal to 96 hours (P4, cold) from November to March (cold season, Cc). Indeed, in line with Köppen (1931) and Trewartha (1968), it is reasonable to assume that in Sicily, due to the Mediterranean climate, the warm period is longer than the cold one. On lines 135 and 136 we referred to the parameters representative of the time periods used to remove the irrelevant amount of rain and to reconstruct rainfall events (P1, P2, P4) and to the irrelevant rainfall sub-events that had to be excluded in the calculation of the final events (P3). In more detail:
- P1 represents the dry interval separating isolated rainfall measurements and it has been set equal to 3 hours for the Cw period, and to 6 hours for Cc period;
- P2 represents the dry interval separating the rainfall sub-event, namely the period of continuous rainfall separated from the immediately preceding and the immediately following sub-events by dry periods with no rain. It has been set equal to 6 hours in the Cw period and equal to 12 hours in the Cc period;
- P3 represents the threshold used to exclude the sub-events whose contribution can be considered irrelevant for the reconstruction of the rainfall events and for the possible initiation of the landslide; it has been reasonably set equal to 1 mm for the Mediterranean climate;
- P4 represents the minimum dry period separating two rainfall events, where a rainfall event is a period of continuous rainfall resulting from the aggregation of single or multiple sub-events. P4 has been set equal to 48 hours for the Cw period and equal to 96 hours for the Cc period.
Additional parameters that need to be set for the reconstruction of the rainfall events are: GS, representing the instrumental sensitivity of the rain gauge; ER, representing the instrumental sensitivity of the rain gauge and the minimum value above which the isolated hourly measurements are considered relevant; and RB, representing the radius of the buffer used to assign each landslide to the closest rain gauge.
A final table of the adjustable parameter values, as reported in the following, will certainly be added to the manuscript in order to better explain their meaning and their role in the algorithm.
Parameter name | Parameter value (Cw) | Parameter value (Cc)
GS [mm] | 0.2 | 0.2
ER [mm] | 0.2 | 0.2
RB [km] | 10 | 10
P1 [h] | 3 | 6
P2 [h] | 6 | 12
P3 [mm] | 1 | 1
P4 [h] | 48 | 96
Referee#2: On soil moisture: could you provide more information about soil moisture? The only information I could find is “association of SM data to the beginning of each rainfall event”, but what does this mean? The first hour of the rainfall event? The hour before its beginning?
Author’s Response: We thank Referee#2 for allowing us to clarify this aspect. Both the precipitation data and the soil moisture data are at the hourly scale. Associating SM data with the beginning of each rainfall event therefore means that, for example, a rainfall event beginning at 10:00 am on a given day is associated with the soil moisture value recorded at 10:00 am on that day. This issue will be clarified in the manuscript, together with additional information describing the ERA5-Land dataset in more depth.
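The association rule can be written as a one-line lookup. The Python sketch below assumes that the hourly soil moisture series is available as a dictionary keyed by hourly timestamps; the function name and the data layout are hypothetical and serve only to illustrate the rule.

```python
from datetime import datetime


def soil_moisture_at_event_start(event_start, sm_series):
    """Return the soil moisture value stamped at the hour the event begins.

    `sm_series` maps hourly datetime stamps to soil moisture values
    (assumed layout). The event start is truncated to the full hour.
    """
    key = event_start.replace(minute=0, second=0, microsecond=0)
    return sm_series[key]
```

For an event starting at 10:00, the lookup returns the 10:00 value rather than the 09:00 one.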
Referee#2: Do rainfall events end at the hour of landslide occurrence or whenever they ended (which could be N hours before or even after the landslide)?
Author’s Response: The rainfall events end at the hour of landslide occurrence. Thus, for each triggering rainfall event, the time at which the event begins is taken as the starting time, while the time of the landslide occurrence is taken as the ending time. The following figure, adapted from Peres et al. (2017), makes it clearer:
Should a revised submission be invited, we will add a figure to clarify these aspects.
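The convention described in this reply can be sketched in Python as follows. The sketch assumes that each hourly stamp marks the start of the hour it covers and that mean intensity is the cumulated rainfall divided by the duration; the function name and the dictionary layout are hypothetical.

```python
from datetime import datetime


def triggering_event_stats(hourly_rain, start, landslide_time):
    """Duration (h) and mean intensity (mm/h) of a triggering rainfall event.

    The event runs from `start` to `landslide_time`. `hourly_rain` maps
    hourly datetime stamps to rainfall depth in mm; each stamp is assumed
    to mark the beginning of its hour, so rain stamped at the landslide
    hour itself is left out.
    """
    duration_h = (landslide_time - start).total_seconds() / 3600
    if duration_h <= 0:
        raise ValueError("landslide must occur after the event start")
    total_mm = sum(mm for ts, mm in hourly_rain.items()
                   if start <= ts < landslide_time)
    return duration_h, total_mm / duration_h
```

Rain that falls after the landslide does not enter the duration or the mean intensity of the triggering event.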
Referee#2: Would be good to report in Figure 5 also the timing of each landslide (maybe as a vertical black line)
Author’s Response: We thank Referee#2 for this suggestion, which will be implemented in Figure 5 of the manuscript.
Referee#2: In Figure 6 you could consider using different plotting techniques to make the figure more readable. In fact, it is impossible from a scatter plot to understand where/how many points there are in the different parts of the plot. You should consider plotting the 2D histogram instead (e.g., see Leonarduzzi et al., 2017). This would allow the reader to better see where most of the events are and which groups of events are “driving” the thresholds. Based on more visible (because it’s less events) distribution of the triggering events, it looks like the not triggering ones are “driving” the threshold. In other words, just by looking at the triggering events, a steeper threshold would improve the performances, which seems to suggest there are a lot of not triggering events in the 20-300h 0.1-1 mm/h region. Would be nice if that could be looked at! What I am suggesting is something similar to what Leonarduzzi et al., (2017) did in Figure 3.
Author’s Response: We thank Referee#2 for suggesting a better plotting technique to make the figure more readable. Representing landslide triggering thresholds on a semi-log plane (Figures 6, 7, 8) is the approach adopted in most literature studies. However, the 2D histogram of Leonarduzzi et al. (2017) is an interesting alternative that can help the reader in understanding the thresholds. Thus, we will make every effort to improve the readability of Figures 6, 7, and 8.
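A two-dimensional histogram of this kind can be built with NumPy alone. The sketch below is an illustration, not the procedure of Leonarduzzi et al. (2017): it counts events in logarithmically spaced duration and intensity bins, and the default axis ranges, the bin number and the function name are all assumptions.

```python
import numpy as np


def log_histogram_2d(duration_h, intensity_mm_h, n_bins=20,
                     d_range=(1.0, 1000.0), i_range=(0.01, 100.0)):
    """Count events in log-spaced duration/intensity bins.

    Returns the (n_bins, n_bins) count matrix and the two edge arrays.
    Events outside the given ranges are not counted.
    """
    d_edges = np.geomspace(d_range[0], d_range[1], n_bins + 1)
    i_edges = np.geomspace(i_range[0], i_range[1], n_bins + 1)
    counts, _, _ = np.histogram2d(duration_h, intensity_mm_h,
                                  bins=[d_edges, i_edges])
    return counts, d_edges, i_edges
```

The count matrix can be drawn as a colored mesh for the non-triggering events, with the triggering events overlaid as points.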
Referee#2: Would be nice to have information about TPR and FPR also for the ID threshold (maybe as an additional entry in Table 2)
Author’s Response: The TPR and FPR for the ID threshold will certainly be added as an additional entry in Table 2.
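For reference, the three quantities are linked by TSS = TPR − FPR. A minimal Python sketch computes them from threshold exceedances and observed landslides; the function name and the boolean encoding of the inputs are assumptions made for illustration.

```python
def roc_metrics(predicted, observed):
    """Return (TPR, FPR, TSS) for paired boolean sequences.

    `predicted[k]` is True when event k exceeds the threshold and
    `observed[k]` is True when event k triggered a landslide.
    TSS is the True Skill Statistic, TPR minus FPR.
    """
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)
    fn = sum(1 for p, o in pairs if not p and o)
    fp = sum(1 for p, o in pairs if p and not o)
    tn = sum(1 for p, o in pairs if not p and not o)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return tpr, fpr, tpr - fpr
```

The function assumes that both triggering and non-triggering events are present; otherwise one of the two ratios is undefined.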
Referee#2: I have a similar suggestion for Figure 8 as for Figure 6, allowing the reader to see the distribution of the no-triggering events (the triggering sample is small enough that they are visible without much overlapping)
Author’s Response: As previously stated, we will try to revise all the Figures representing the proposed parametric thresholds with the aim to improve their readability.
Referee#2: Finally, in the supplementary pdf, some grammar/rewording suggestions are provided.
Author’s Response: We thank Referee#2 for having carefully reviewed the manuscript also from a grammar/rewording point of view. All suggestions provided in the supplementary pdf will be implemented in the manuscript.
Peer review completion








Interactive discussion
Status: closed
-
RC1: 'Comment on nhess-2022-175', Anonymous Referee #1, 27 Jul 2022
General Comments:
This is a mostly well-written preprint that I feel only needs minor revisions. The use of PCA to reduce the dimensionality of the hydrometeorological space is novel, and the thresholds produced are an improvement over other methods. There are some missing details regarding the landslide inventory selection process, described below. These missing details constitute the bulk of my concerns and if addressed, I feel that the paper will tell a more complete story of the authors' methodology.
Specific Comments:
Regarding the landslide selection process described near the end of section 2.1:
The selection of ground truth is a critical decision for this type of analysis, especially if that ground truth is partially derived from other algorithms or datasets, such as what you are doing with CTRL-T. I believe this section needs two additions to help convince readers that what you're doing is scientifically sound.
Firstly, for the "adjustable parameters" of CTRL-T, I would like to see some description of how and why you chose the final parameter values. I believe you briefly mention the separating length of time for rainfall events in wet and dry periods later on in the paper. There is also this sentence in line 135: "Rainfall event parameters were calibrated adopting the monthly soil water balance model and evapotranspiration analysis." But I'm unclear on if this calibration process was done automatically by the program or manually by the authors. A final list of adjustable parameter values, with some brief defense of their selection, would help readers understand what parts of CTRL-T are automated and which are tuned by hand.
Secondly, I would describe briefly in greater detail how you decided that landslides did not have identifiable or uncertain landslide conditions. I presume some threshold on the weights was used. If so, what were those threshold values and how did you decide them? Or if some other metric was used to quantify the landslide cause as being uncertain, briefly provide and defend those decisions.
Technical Comments and Optional Suggestions:
See attached PDF for grammar corrections and other suggestions for additional figures or figure revisions that I do not feel are mandatory.
-
AC1: 'Reply on RC1', Nunziarita Palazzolo, 05 Oct 2022
Comment on nhess-2022-175
Dear Editor and Reviewers,
we thank the guest editor Francesco Marra for handling our manuscript and the referees for their insightful comments, which will surely help us in improving our work. In the following, we reply point-by-point to Anonymous Referee #1.
Author’s response to anonymous Referee #1
General Comments:
Referee #1: This is a mostly well-written preprint that I feel only needs minor revisions. The use of PCA to reduce the dimensionality of the hydrometeorological space is novel, and the thresholds produced are an improvement over other methods. There are some missing details regarding the landslide inventory selection process, described below. These missing details constitute the bulk of my concerns and if addressed, I feel that the paper will tell a more complete story of the authors' methodology.
Author’s Response: We thank Referee#1 for the overall appreciation of our work and for recognizing its novel aspects. We will revise the manuscript by adding more details about the landslide inventory, in order to make the description of the work clearer.
Specific Comments:
Referee #1: Regarding the landslide selection process described near the end of section 2.1.
The selection of ground truth is a critical decision for this type of analysis, especially if that ground truth is partially derived from other algorithms or datasets, such as what you are doing with CTRL-T. I believe this section needs two additions to help convince readers that what you're doing is scientifically sound.
Firstly, for the "adjustable parameters" of CTRL-T, I would like to see some description of how and why you chose the final parameter values. I believe you briefly mention the separating length of time for rainfall events in wet and dry periods later on in the paper. There is also this sentence in line 135: "Rainfall event parameters were calibrated adopting the monthly soil water balance model and evapotranspiration analysis." But I'm unclear on if this calibration process was done automatically by the program or manually by the authors. A final list of adjustable parameter values, with some brief defense of their selection, would help readers understand what parts of CTRL-T are automated and which are tuned by hand.
Author’s Response: We thank the reviewer for pointing out that more details should be provided regarding these aspects. For the computation of the regional parameters required by CTRL-T, we referred to a previous application of the algorithm to Sicily Island (Melillo et al., 2015). As explained by the authors, the heuristic approach proposed by Brunetti et al. (2010), and updated by Peruccacci et al. (2012), has been adopted to separate two rainfall events. Specifically, according to this approach, the dry period (no rain) has been set equal to 48 hours (P4, warm) between April and October (warm season, Cw), while it has been set equal to 96 hours (P4, cold) from November to March (cold season, Cc). Indeed, in line with Köppen (1931) and Trewartha (1968), it is reasonable to assume that in Sicily, due to the Mediterranean climate, the warm period is longer than the cold one. On lines 135 and 136 we referred to the parameters representative of the time periods used to remove the irrelevant amounts of rain and to reconstruct rainfall events (P1, P2, P4) and to the irrelevant rainfall sub-events that had to be excluded in the calculation of the final events (P3). In more detail:
- P1 represents the dry interval separating isolated rainfall measurements; it has been set equal to 3 hours for the Cw period and to 6 hours for the Cc period;
- P2 represents the dry interval separating rainfall sub-events, where a sub-event is a period of continuous rainfall separated from the immediately preceding and the immediately following sub-events by dry periods with no rain. It has been set equal to 6 hours in the Cw period and to 12 hours in the Cc period;
- P3 represents the threshold used to exclude the sub-events whose contribution can be considered irrelevant for the reconstruction of the rainfall events and for the possible initiation of the landslide; it has been reasonably set equal to 1 mm for the Mediterranean climate;
- P4 represents the minimum dry period separating two rainfall events, where a rainfall event is a period of continuous rainfall resulting from the aggregation of single or multiple sub-events. P4 has been set equal to 48 hours for the Cw period and to 96 hours for the Cc period.
Additional parameters that need to be set for the reconstruction of the rainfall events are: GS, representing the instrumental sensitivity of the rain gauge; ER, representing the instrumental sensitivity of the rain gauge and the minimum value above which isolated hourly measurements are considered relevant; and RB, representing the radius of the buffer used to assign each landslide to the closest rain gauge.
A final table of the adjustable parameter values, reported below, will certainly be added to the manuscript in order to better explain their meaning and their role in the algorithm.
Parameter name | Parameter value (Cw) | Parameter value (Cc) |
---|---|---|
GS [mm] | 0.2 | 0.2 |
ER [mm] | 0.2 | 0.2 |
RB [km] | 10 | 10 |
P1 [h] | 3 | 6 |
P2 [h] | 6 | 12 |
P3 [mm] | 1 | 1 |
P4 [h] | 48 | 96 |
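As an illustration of the role of P4, the following is a minimal Python sketch of how a seasonal minimum dry period could split an hourly rainfall record into events. It is not the CTRL-T implementation: it ignores P1, P2, P3, GS and ER, and the names `P4_HOURS`, `season` and `split_events` are hypothetical. It also assumes that the season is decided by the last wet hour before the gap.

```python
from datetime import datetime, timedelta

# Minimum dry period separating two rainfall events (P4), in hours:
# 48 h in the warm season (Cw, April-October), 96 h in the cold season (Cc).
P4_HOURS = {"Cw": 48, "Cc": 96}


def season(t):
    """Return 'Cw' for April-October and 'Cc' for November-March."""
    return "Cw" if 4 <= t.month <= 10 else "Cc"


def split_events(series):
    """Group hourly (timestamp, rain_mm) records into rainfall events.

    Assumption of this sketch: a new event starts when the number of dry
    hours since the last wet hour is at least P4 for the season of that
    last wet hour.
    """
    events, current = [], []
    for t, rain in series:
        if rain <= 0:
            continue  # dry hours only matter through the gap they create
        if current:
            last = current[-1][0]
            dry_hours = (t - last).total_seconds() / 3600 - 1
            if dry_hours >= P4_HOURS[season(last)]:
                events.append(current)
                current = []
        current.append((t, rain))
    if current:
        events.append(current)
    return events


# Illustrative record: wet hours at 0, 10 and 59 h after 1 July.
start = datetime(2020, 7, 1)
record = [(start + timedelta(hours=h), 1.0) for h in (0, 10, 59)]
summer_events = split_events(record)
```

With the same wet hours placed in January, the 48 dry hours before the third measurement would fall short of the 96-hour cold-season value, so the record would remain a single event.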
Regarding the monthly soil water balance model and evapotranspiration analysis, the calibration process was done automatically by the program, setting the related benchmarks. It was assumed that the evapotranspiration is inversely proportional to the time necessary to dry the soil and, specifically, a factor of 2 between all relevant parameters in the Cw and Cc periods has been adopted (ETR(Cw)≅2⋅ETR(Cc)), as revealed by the analysis of the mean annual evapotranspiration in Italy (Melillo, 2009) using the Thornthwaite–Mather method (Thornthwaite and Mather, 1957) and as adopted by Melillo et al. (2015) for a previous application of the algorithm to Sicily Island.
Referee #1: Secondly, I would describe briefly in greater detail how you decided that landslides did not have identifiable or uncertain rainfall conditions. I presume some threshold on the weights was used. If so, what were those threshold values and how did you decide them? Or if some other metric was used to quantify the landslide cause as being uncertain, briefly provide and defend those decisions.
Author’s Response: We thank Referee#1 for allowing us to clarify this aspect. The selection of the rainfall events responsible for landslides is performed within the algorithm in two steps. The first step assigns a record of rainfall measurements, given by a single rain gauge, to each landslide by checking the match between the start and end dates of the rainfall events and the day and time of the landslide occurrence. This approach makes it possible to associate each landslide with a single rainfall event, discarding the landslides for which the time match does not fit. If multiple rainfall conditions that are most likely responsible for the triggering are found, the weighting procedure explained in lines 140-147 of the manuscript is adopted.
Technical Comments and Optional Suggestions:
Referee #1: See attached PDF for grammar corrections and other suggestions for additional figures or figure revisions that I do not feel are mandatory.
Author’s Response: We also appreciate the additional grammar corrections and the other suggestions for figure revisions, which will certainly be introduced in the manuscript together with the additional information and insights reported in the specific comments above.
In particular, we will apply all technical corrections. For the more specific comments in the annotated manuscript, we will modify the manuscript as follows:
- insertion of a more specific overview regarding some statistics related to the cost, damages, and number of casualties due to landslides on a worldwide scale;
- insertion of a grouped bar plot, as a subplot to Figure 4, graphing Eqs. 13, 14, 15, and 16.
-
RC2: 'Comment on nhess-2022-175', Anonymous Referee #2, 29 Jul 2022
The authors compare classical intensity-duration power law thresholds with hydrometeorological thresholds that combine mean intensity with ERA5-Land reanalysis soil moisture. They consider both the soil moisture at 4 different depths and a combination of them obtained with principal component analysis. They find that adding soil moisture information improves the prediction of a rainfall-based threshold.
The manuscript is generally clear and well organized. I only have one major concern with the publication. In fact, the novelty in the work is the use of PCA for combining the soil moisture information at the different depths and keeping the problem 2D (rainfall intensity and soil moisture derived component). While I see the potential of such a methodology and agree with the advantage of keeping the dimensionality of the problem small, I believe some further analyses are required to demonstrate that it is indeed advantageous.
The TSS obtained using intensity and soil moisture at the first or second shallowest depth is pretty much identical to the one using the first component of the PCA. This means that the overall performances are not sufficient to demonstrate the advantage of PCA. The authors should find some alternative way of either demonstrating the advantages directly (with data/results) or demonstrating that the most influential soil layers changes depending on some properties. Can they observe any pattern in which one is more influential than the others? Those could be spatial patterns (e.g., in certain regions the shallower/deeper soil moisture is more important) or relative to landslide properties (e.g., for certain types of landslides the deeper/shallower soil moisture is more important) or relative to the time of the year (e.g., over certain times of the year deeper/shallower soil moisture is more important). Because the PCA piece is the novel piece in this manuscript, I believe it is important to show and demonstrate the advantage compared to just using the shallowest soil moisture.
Besides this, I also have some minor comments:
- I think the method for the definition of rainfall events should be better explained. The definitions of parameters P1-P2-P3-P4 are unclear and so is the sentence in line 148. What does it mean that rainfall conditions were not identified? What are the uncertainties? In rainfall? In landslide properties. Does this mean they were incorrectly identified as “rainfall triggered” in the database ()? Also, what are the values of the parameters chosen (e.g., the maximum radius Rb)? Only information about the interarrival times was provided.
- On soil moisture: could you provide more information about soil moisture? The only information I could find is “association of SM data to the beginning of each rainfall event”, but what does this mean? The first hour of the rainfall event? The hour before its beginning?
- Do rainfall events end when the hour of landslide occurrence or whenever they ended (which could be N hours before or even after the landslide)?
- Would be good to report in Figure 5 also the timing of each landslide (maybe as a vertical black line)
- In Figure 6 you could consider using different plotting techniques to make the figure more readable. In fact, it is impossible from a scatter plot to understand where/how many points there are in the different parts of the plot. You should consider plotting the 2D histogram instead (e.g., see Leonarduzzi et al., 2017). This would allow the reader to better see where most of the events are and which groups of events are “driving” the thresholds. Based on more visible (because it’s less events) distribution of the triggering events, it looks like the not triggering ones are “driving” the threshold. In other words, just by looking at the triggering events, a steeper threshold would improve the performances, which seems to suggest there are a lot of not triggering events in the 20-300h 0.1-1 mm/h region. Would be nice if that could be looked at! What I am suggesting is something similar to what Leonarduzzi et al., (2017) did in Figure 3.
- Would be nice to have information also about TPR and FPR also for the ID threshold (maybe as an additional entry in Table 2)
- I have a similar suggestion for Figure 8 as for Figure 6, allowing the reader to see the distribution of the no-triggering events (the triggering sample is small enough that they are visible without much overlapping)
Finally, in the supplementary pdf, some grammar/rewording suggestions are provided.
Leonarduzzi, E., Molnar, P., and McArdell, B. W. (2017), Predictive performance of rainfall thresholds for shallow landslides in Switzerland from gridded daily data, Water Resour. Res., 53, 6612– 6625, doi:10.1002/2017WR021044.
-
AC2: 'Reply on RC2', Nunziarita Palazzolo, 05 Oct 2022
Comment on nhess-2022-175
Dear Editor and Reviewers,
we thank the guest editor Francesco Marra for handling our manuscript and the referees for their insightful comments, which will surely help us in improving our work. In the following, we reply point-by-point to Anonymous Referee #2.
Author’s response to anonymous Referee #2
General Comments:
Referee#2: The authors compare classical intensity-duration power law thresholds with hydrometeorological thresholds that combine mean intensity with ERA5-Land reanalysis soil moisture. They consider both the soil moisture at 4 different depths and a combination of them obtained with principal component analysis. They find that adding soil moisture information improves the prediction of a rainfall-based threshold.
The manuscript is generally clear and well organized. I only have one major concern with the publication. In fact, the novelty in the work is the use of PCA for combining the soil moisture information at the different depths and keeping the problem 2D (rainfall intensity and soil moisture derived component). While I see the potential of such a methodology and agree with the advantage of keeping the dimensionality of the problem small, I believe some further analyses are required to demonstrate that it is indeed advantageous.
The TSS obtained using intensity and soil moisture at the first or second shallowest depth is pretty much identical to the one using the first component of the PCA. This means that the overall performances are not sufficient to demonstrate the advantage of PCA. The authors should find some alternative way of either demonstrating the advantages directly (with data/results) or demonstrating that the most influential soil layers changes depending on some properties. Can they observe any pattern in which one is more influential than the others? Those could be spatial patterns (e.g., in certain regions the shallower/deeper soil moisture is more important) or relative to landslide properties (e.g., for certain types of landslides the deeper/shallower soil moisture is more important) or relative to the time of the year (e.g., over certain times of the year deeper/shallower soil moisture is more important). Because the PCA piece is the novel piece in this manuscript, I believe it is important to show and demonstrate the advantage compared to just using the shallowest soil moisture.
Author’s Response:
We thank the referee for the comment. We understand the concern that the PCA, which combines the information from the four soil moisture depth layers, does not bring a significant improvement in TSS with respect to the thresholds that use soil moisture from the first or second layer. However, we partly disagree with the referee for two reasons. First, from a general standpoint, testing the PCA approach remains a novel aspect regardless of whether it increases performance compared to thresholds that employ a single layer. Second, the use of PCA is not the only novel aspect that we introduce. In particular, the novel points include: 1) a deeper look at how the hydro-meteorological method may help to predict landslides, adding to the still limited body of research on the issue; 2) the use of reanalysis datasets for soil moisture information, which are accessible globally at a coarse resolution and contain significant uncertainties (and for which it is thus not trivial that an improvement of thresholds could be obtained); 3) the questioning of the usual parametric forms of the threshold equation, which is usually a power law (Marino et al., 2020) or a bilinear equation (Mirus et al., 2018; Thomas et al., 2019). In some ways, these points are summarized in the proposed title of the manuscript, which states that we analyze the “potential improvements of landslide prediction by hydro-meteorological thresholds”, and that we just “investigate” the use of “principal component analysis”.
We do, however, recognize that if a certain approach is proposed, its benefit should justify the increased level of sophistication that it introduces. This may be demonstrated by applying the proposed PCA technique to other case studies, which is beyond the scope of the manuscript, as well as by examining the other aspects that the referee suggests (type of landslides, seasons, etc.).
We will thus amend the manuscript by adding some cautionary sentences stating the need for more studies to further test the actual benefit of applying PCA. However, as stated in the manuscript, PCA remains a valid approach for combining the information of the four available layers, thus avoiding trial-and-error testing of the individual layers, as the performances obtained are at least as good as those using single-layer information.
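To make the idea of combining the four layers concrete, the following is a minimal Python sketch of extracting the first principal component from a set of four-layer soil moisture samples. It is an illustration under stated assumptions, not the procedure used in the manuscript: the columns are standardized before the decomposition, the leading eigenvector is found by power iteration, and the function name `first_principal_component` is hypothetical.

```python
import math


def first_principal_component(rows, n_iter=500):
    """Return (loadings, scores) of the first principal component.

    rows: list of samples, each holding one soil moisture value per layer.
    Assumptions of this sketch: every column has non-zero variance, and the
    columns are standardized to zero mean and unit variance beforehand.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(m)
    ]
    z = [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]

    # Covariance of the standardized data, i.e. the correlation matrix.
    cov = [
        [sum(zi[a] * zi[b] for zi in z) / (n - 1) for b in range(m)]
        for a in range(m)
    ]

    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, provided the starting vector is not orthogonal to it.
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Scores: projection of each standardized sample onto the loadings.
    scores = [sum(zi[j] * v[j] for j in range(m)) for zi in z]
    return v, scores


# Illustrative values only: four perfectly correlated layers.
samples = [
    [0.10 + 0.02 * k, 0.15 + 0.01 * k, 0.20 + 0.03 * k, 0.25 + 0.005 * k]
    for k in range(6)
]
loadings, scores = first_principal_component(samples)
```

The resulting score gives one value per event that can take the place of a single-layer soil moisture value on the horizontal axis of a two-dimensional threshold plane.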
Minor Comments:
Referee#2: I think the method for the definition of rainfall events should be better explained. The definitions of parameters P1-P2-P3-P4 are unclear and so is the sentence in line 148. What does it mean that rainfall conditions were not identified? What are the uncertainties? In rainfall? In landslide properties. Does this mean they were incorrectly identified as “rainfall triggered” in the database ()? Also, what are the values of the parameters chosen (e.g., the maximum radius Rb)? Only information about the interarrival times was provided.
Author’s Response: We thank the reviewer for pointing out that more details should be provided regarding the definition of rainfall events. As reported on lines 128-129, the CTRL-T (Calculation of Thresholds for Rainfall induced Landslides-Tool) code (Melillo et al., 2018) is used to identify the rainfall events that were more likely to be responsible for the observed slope failures. Additional information, also provided to Referee#1, is reported below and should be integrated into the manuscript to better clarify the adopted methods.
For the computation of the regional parameters required by CTRL-T, we referred to a previous application of the algorithm to Sicily Island (Melillo et al., 2015). As explained by the authors, the heuristic approach proposed by Brunetti et al. (2010), and updated by Peruccacci et al. (2012), has been adopted to separate two rainfall events. Specifically, according to this approach, the dry period (no rain) has been set equal to 48 hours (P4, warm) between April and October (warm season, Cw), while it has been set equal to 96 hours (P4, cold) from November to March (cold season, Cc). Indeed, in line with Köppen (1931) and Trewartha (1968), it is reasonable to assume that in Sicily, due to the Mediterranean climate, the warm period is longer than the cold one. On lines 135 and 136 we referred to the parameters representative of the time periods used to remove the irrelevant amount of rain and to reconstruct rainfall events (P1, P2, P4) and to the irrelevant rainfall sub-events that had to be excluded in the calculation of the final events (P3). In more detail:
- P1 represents the dry interval separating isolated rainfall measurements; it has been set equal to 3 hours for the Cw period and to 6 hours for the Cc period;
- P2 represents the dry interval separating rainfall sub-events, where a sub-event is a period of continuous rainfall separated from the immediately preceding and the immediately following sub-events by dry periods with no rain. It has been set equal to 6 hours in the Cw period and to 12 hours in the Cc period;
- P3 represents the threshold used to exclude the sub-events whose contribution can be considered irrelevant for the reconstruction of the rainfall events and for the possible initiation of the landslide; it has been reasonably set equal to 1 mm for the Mediterranean climate;
- P4 represents the minimum dry period separating two rainfall events, where a rainfall event is a period of continuous rainfall resulting from the aggregation of single or multiple sub-events. P4 has been set equal to 48 hours for the Cw period and to 96 hours for the Cc period.
Additional parameters that need to be set for the reconstruction of the rainfall events are: GS, representing the instrumental sensitivity of the rain gauge; ER, representing the instrumental sensitivity of the rain gauge and the minimum value above which isolated hourly measurements are considered relevant; and RB, representing the radius of the buffer used to assign each landslide to the closest rain gauge.
A final table of the adjustable parameter values, reported below, will certainly be added to the manuscript in order to better explain their meaning and their role in the algorithm.
Parameter name | Parameter value (Cw) | Parameter value (Cc) |
---|---|---|
GS [mm] | 0.2 | 0.2 |
ER [mm] | 0.2 | 0.2 |
RB [km] | 10 | 10 |
P1 [h] | 3 | 6 |
P2 [h] | 6 | 12 |
P3 [mm] | 1 | 1 |
P4 [h] | 48 | 96 |
Referee#2: On soil moisture: could you provide more information about soil moisture? The only information I could find is “association of SM data to the beginning of each rainfall event”, but what does this mean? The first hour of the rainfall event? The hour before its beginning?
Author’s Response: We thank Referee#2 for allowing us to clarify this aspect. Both the precipitation data and the soil moisture data are at the hourly scale; thus, associating SM data with the beginning of each rainfall event means that, for example, if a rainfall event begins at 10:00 am on a given day, it is associated with the corresponding soil moisture value recorded at 10:00 am on that day. This will be clarified in the manuscript, along with additional information describing the ERA5-Land dataset in more depth.
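The association described above amounts to a lookup by hourly timestamp. A minimal Python sketch follows; the dictionary layout, the function name `soil_moisture_at_event_start` and the numerical values are hypothetical and serve only to illustrate the 10:00 am example.

```python
from datetime import datetime


def soil_moisture_at_event_start(event_start, sm_hourly):
    """Return the soil moisture values for the hour in which the event begins.

    sm_hourly maps hourly timestamps to a tuple with one value per depth
    layer (a hypothetical container, assumed for this sketch).
    """
    hour = event_start.replace(minute=0, second=0, microsecond=0)
    return sm_hourly[hour]


# Illustrative values only, not taken from ERA5-Land.
sm_hourly = {
    datetime(2015, 2, 3, 9): (0.31, 0.30, 0.29, 0.33),
    datetime(2015, 2, 3, 10): (0.32, 0.30, 0.29, 0.33),
}
sm_at_start = soil_moisture_at_event_start(datetime(2015, 2, 3, 10), sm_hourly)
```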
Referee#2: Do rainfall events end when the hour of landslide occurrence or whenever they ended (which could be N hours before or even after the landslide)?
Author’s Response: The rainfall events end at the hour of landslide occurrence; thus, for each triggering rainfall event, the time at which the event begins is taken as the starting time, while the time of the landslide occurrence is taken as the ending time. The following figure, adapted from Peres et al. (2017), makes it clearer:
If a revised submission is invited, we will add a figure to clarify these aspects.
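The convention described in this reply can be sketched in Python as follows: the duration and mean intensity of a triggering event are computed from the event start up to the hour of the landslide. Whether the landslide hour itself is counted is an assumption of this sketch (it is included here), and the function name `triggering_event_stats` is hypothetical.

```python
def triggering_event_stats(hourly_rain, start_idx, landslide_idx):
    """Return (duration_h, mean_intensity_mm_h) of a triggering event.

    hourly_rain: list of hourly rainfall depths in mm.
    The window runs from the event start to the landslide hour, both
    included (assumption of this sketch).
    """
    window = hourly_rain[start_idx:landslide_idx + 1]
    duration = len(window)
    return duration, sum(window) / duration


# Illustrative record: the event starts at index 1, the landslide occurs
# at index 4, and the rain falling afterwards is not counted.
rain = [0.0, 2.0, 4.0, 0.0, 6.0, 1.0]
duration_h, mean_intensity = triggering_event_stats(rain, 1, 4)
```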
Referee#2: Would be good to report in Figure 5 also the timing of each landslide (maybe as a vertical black line)
Author’s Response: We thank Referee#2 for this suggestion that will be surely implemented within Figure 5 of the manuscript.
Referee#2: In Figure 6 you could consider using different plotting techniques to make the figure more readable. In fact, it is impossible from a scatter plot to understand where/how many points there are in the different parts of the plot. You should consider plotting the 2D histogram instead (e.g., see Leonarduzzi et al., 2017). This would allow the reader to better see where most of the events are and which groups of events are “driving” the thresholds. Based on more visible (because it’s less events) distribution of the triggering events, it looks like the not triggering ones are “driving” the threshold. In other words, just by looking at the triggering events, a steeper threshold would improve the performances, which seems to suggest there are a lot of not triggering events in the 20-300h 0.1-1 mm/h region. Would be nice if that could be looked at! What I am suggesting is something similar to what Leonarduzzi et al., (2017) did in Figure 3.
Author’s Response: We thank Referee#2 for suggesting a plotting technique that makes the figure more readable. Representing landslide triggering thresholds (Figures 6, 7, 8) within a semi-log plane is the approach adopted in the majority of literature studies; however, the 2D histogram of Leonarduzzi et al. (2017) is an interesting alternative that can surely help the reader in understanding the thresholds. Thus, we will make all possible efforts to improve the readability of Figures 6, 7, and 8.
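A 2D histogram of the kind suggested by the referee could be computed along these lines. The Python sketch below counts events in logarithmically spaced duration and intensity bins; the bin ranges, the sample values and the function names are hypothetical, and the rendering of the counts as a coloured grid is left to the plotting tool of choice.

```python
import bisect
import math


def log_edges(lo, hi, n_bins):
    """Return n_bins + 1 logarithmically spaced bin edges from lo to hi."""
    step = (math.log10(hi) - math.log10(lo)) / n_bins
    return [10 ** (math.log10(lo) + k * step) for k in range(n_bins + 1)]


def bin_index(x, edges):
    """Index of the bin containing x, or None if x lies outside the edges."""
    if x < edges[0] or x > edges[-1]:
        return None
    # The right-most edge is assigned to the last bin.
    return min(bisect.bisect_right(edges, x) - 1, len(edges) - 2)


def histogram_2d(durations, intensities, d_edges, i_edges):
    """Count events per (duration bin, intensity bin) cell."""
    counts = [[0] * (len(i_edges) - 1) for _ in range(len(d_edges) - 1)]
    for d, i in zip(durations, intensities):
        row, col = bin_index(d, d_edges), bin_index(i, i_edges)
        if row is not None and col is not None:
            counts[row][col] += 1
    return counts


# Illustrative non-triggering events: durations in hours, intensities in mm/h.
d_edges = log_edges(1, 100, 2)
i_edges = log_edges(0.1, 10, 2)
counts = histogram_2d([2, 3, 50], [0.2, 0.5, 5.0], d_edges, i_edges)
```

Computing the counts separately for triggering and non-triggering events would show which group dominates each region of the plane, which is the question raised about the 20-300 h, 0.1-1 mm/h region.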
Referee#2: Would be nice to have information also about TPR and FPR also for the ID threshold (maybe as an additional entry in Table 2)
Author’s Response: The TPR and FPR values for the ID threshold will certainly be added as an additional entry in Table 2.
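For reference, the three scores are related by TSS = TPR − FPR. The Python sketch below computes them for a power-law ID threshold of the form I = a·D^(−b); this parameterization, the values of `a` and `b`, the sample events and the function names are assumptions made for illustration and are not taken from the manuscript.

```python
def id_threshold(duration_h, a, b):
    """Power-law intensity-duration threshold I = a * D**(-b), in mm/h."""
    return a * duration_h ** (-b)


def roc_scores(events, a, b):
    """Return (TPR, FPR, TSS) for a list of rainfall events.

    events: list of (duration_h, mean_intensity_mm_h, triggered) tuples.
    An alarm is issued when the mean intensity is at or above the threshold.
    Assumes at least one triggering and one non-triggering event.
    """
    tp = fp = fn = tn = 0
    for duration, intensity, triggered in events:
        alarm = intensity >= id_threshold(duration, a, b)
        if alarm and triggered:
            tp += 1
        elif alarm:
            fp += 1
        elif triggered:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return tpr, fpr, tpr - fpr


# Illustrative events and parameters only.
events = [
    (10, 5.0, True),
    (10, 0.5, True),
    (10, 5.0, False),
    (10, 0.1, False),
    (100, 0.1, False),
    (1, 2.0, False),
]
tpr, fpr, tss = roc_scores(events, a=10.0, b=0.5)
```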
Referee#2: I have a similar suggestion for Figure 8 as for Figure 6, allowing the reader to see the distribution of the no-triggering events (the triggering sample is small enough that they are visible without much overlapping)
Author’s Response: As previously stated, we will try to revise all the Figures representing the proposed parametric thresholds with the aim to improve their readability.
Referee#2: Finally, in the supplementary pdf, some grammar/rewording suggestions are provided.
Author’s Response: We thank Referee#2 for having carefully reviewed the manuscript also from a grammar/rewording point of view. All suggestions noted in the supplementary pdf will be implemented in the manuscript.
Journal article(s) based on this preprint
Nunziarita Palazzolo et al.
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
409 | 129 | 15 | 553 | 6 | 4 |
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.