Dear authors:
I appreciate the opportunity to review the revised version of your manuscript. It is obvious that you have put a lot of effort into the revisions, and the quality of the manuscript has improved substantially. I really appreciate the addition of the more in-depth discussion of the strengths and weaknesses of the various datasets and the parametric analyses. Despite these substantial improvements, I feel that there are several remaining issues that should be addressed before the manuscript can be published.
Major comments
Description of the steps of the PRA detection method and CLPA processing
I think that the description of the steps of the PRA detection method could be improved by better aligning the description in the text with the graphic presented in Figure 3. Right now, I find the description rather confusing because it talks about three main steps that are not obvious in Fig. 3, and while I understand the reason for the split over the two columns, they do not obviously line up with the description in the text. Furthermore, the presentation of the CLPA processing steps in Fig. 4 is visually very different even though some of the steps are the same as in the PRA detection method. Given these similarities and the fact that the PRA detection and CLPA processing steps are closely tied (as explained in the text several times), I think that a more consistent graphic presentation that highlights these connections more clearly (in either a single figure or two figures) would allow the reader to understand the analysis approach more easily.
Confusion matrix
I am still concerned about the fact that you use the accuracy rate in your study. In reality, you are only looking at the precision/positive predictive value (= true positives/(true positives + false positives)), and your assumption that the true negative rate is 100% artificially produces an accuracy rate value that is halfway between the precision value and 100% without adding any value to the analysis. Similarly, this assumption also creates error rate values that are halfway between the false discovery rate (= false positives/(true positives + false positives)) and 0%. Given your new description of the strengths and weaknesses of the CLPA dataset (which is much appreciated), the assumption of a 100% negative predictive value and a 0% false omission rate seems somewhat bold.
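To make the relationship concrete, here is a minimal numerical sketch (in Python) using invented counts that are not taken from your manuscript; it assumes, as one way of reproducing the "halfway" behaviour described above, that the negative class is taken to be the same size as the set of predicted PRAs and is counted as perfectly classified:

```python
# Minimal sketch with hypothetical counts (invented for illustration only).
# Assumption: the negative class is the same size as the set of predicted
# PRAs and is counted as perfectly classified (TN = TP + FP, FN = 0).

tp, fp = 80, 20                               # hypothetical true/false positives
tn, fn = tp + fp, 0                           # assumed perfect negative class

precision = tp / (tp + fp)                    # positive predictive value
fdr = fp / (tp + fp)                          # false discovery rate
accuracy = (tp + tn) / (tp + tn + fp + fn)    # inflated by the assumed TN
error_rate = (fp + fn) / (tp + tn + fp + fn)

print(f"precision  = {precision:.2f}")        # 0.80
print(f"accuracy   = {accuracy:.2f}")         # 0.90 = (precision + 1) / 2
print(f"FDR        = {fdr:.2f}")              # 0.20
print(f"error rate = {error_rate:.2f}")       # 0.10 = FDR / 2
```

Under this assumption, the accuracy and error rate are simple rescalings of the precision and false discovery rate, which is why reporting them adds no information.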
In my opinion, it would be more accurate and more transparent to base your evaluation on precision/positive predictive value instead of the accuracy rate. This will not affect the results of your analysis at all, but it will describe the focus of your evaluation more honestly and prevent possible confusion with accuracy rates presented in other studies that actually work with the full confusion matrix. Note that you explicitly point out this limitation yourself on L633. I think it would be very useful for you to highlight in the conclusion section that future studies should aim to assess PRA algorithms with the full confusion matrix.
Comparisons in parametric studies
I appreciate that you now explore the robustness of your approach with parametric studies. However, I am a bit confused by the fact that parameter values and ranges were only changed in the PRA algorithm and not in the validation dataset, even though most of them are used the same way in both. It seems obvious that the PRA algorithm that uses the same parameter values as the CLPA processing will naturally perform best! Applying a different slope or elevation filter in the PRA algorithm while keeping the default one for the CLPA processing obviously decreases the performance. I understand that this relates to the challenging task of defining the “ground truth” (which requires some assumptions), but it seems to me that the potential insight from the current approach is limited.
Would it make more sense to also change the parameter values in the CLPA processing, as you did for the DEM resolution analysis (L505)? I have not completely thought this through, but it would keep the assumptions consistent and allow you to compare apples with apples rather than apples with oranges.
Minor comments
L153: If the optimum DEM resolution is examined in the study, shouldn’t all DEM datasets be described in the data section and not just the 25 m one?
L192: I think it would be useful to explicitly explain why you trust the CLPA dataset so much instead of just stating it as a fact.
L237: It is still a bit unclear how the watersheds are actually delineated. Figure S2 shows how the flow direction and accumulation are calculated but does not show how the actual boundaries are drawn. A slightly bigger example with the actual boundaries drawn would be more informative.
L312: The fact that only one pixel of a validation PRA must be identified for a successful match seems like a very low bar and represents a critical assumption of the analysis. It might be worthwhile to justify this choice in more detail and/or explore the effect of different thresholds.
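To illustrate what exploring different thresholds could look like, here is a hypothetical sketch (not your code) in which each PRA is represented as a set of pixel indices and the required overlap fraction is varied; the data and the overlap-fraction criterion are invented for illustration, and your current “at least one pixel” criterion corresponds to any non-zero overlap:

```python
# Hypothetical sketch (invented data, not the authors' method): how the
# required overlap between a validation PRA and the detected PRAs affects
# the number of validation PRAs counted as "identified".

def matched(validation_pra, detected_pixels, min_overlap_fraction):
    """True if the validation PRA overlaps the detected PRAs by at least
    `min_overlap_fraction` of its own pixels (and by at least one pixel)."""
    overlap = len(validation_pra & detected_pixels)
    return overlap > 0 and overlap / len(validation_pra) >= min_overlap_fraction

# Toy data: three validation PRAs and one set of detected PRA pixels.
validation_pras = [
    {(0, 0), (0, 1), (1, 0), (1, 1)},   # fully covered
    {(5, 5), (5, 6), (6, 5), (6, 6)},   # one pixel covered (25 %)
    {(9, 9), (9, 10)},                  # not covered at all
]
detected = {(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)}

for threshold in (0.0, 0.25, 0.5, 1.0):   # 0.0 ~ the current one-pixel criterion
    hits = sum(matched(pra, detected, threshold) for pra in validation_pras)
    print(f"min overlap {threshold:.2f}: {hits}/{len(validation_pras)} PRAs matched")
```

Reporting how the match rate changes with such a threshold would show how sensitive the evaluation is to this assumption.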
L540: I appreciate the honest discussion of the limitations of the performance measure here, but I think this could be addressed, or avoided altogether, by using a more appropriate performance measure earlier in the manuscript that reflects the limitations of the dataset more honestly (see earlier comment).
L585: It would be better to include the suggestion for a full comparison of different PRA algorithms in the conclusion section where you make other suggestions about future research.
L641: I did not read the paper by Giffard-Roisin et al. (2020) in detail, but I think it would be important to briefly mention that while there are benefits to increasing detection power, increasing false positives also comes with costs and challenges.
L651: It is not completely correct that you validated your PRA algorithm over entire massifs, because your performance measures are only based on the areas where CLPA data are available, which cover only fractions of the entire massifs.
L656: Are these suggestions meaningful/realistic given the inherent limitations of the CLPA dataset?
Editorial comments
The manuscript will require detailed editing as the English is still of limited quality. Below are some comments for improving the writing, but there are likely more issues. I assume that the NHESS editorial team will take care of this before the manuscript is published.
Abstract
L17-20: I think the performance measures and values used in this study need to be described more accurately in the abstract. See earlier comment on the performance measures.
Introduction
L 62: “Eventually” is not the right term here. You could say “finally” instead. There are many incorrect uses of “eventually” throughout the manuscript. Please replace them throughout.
L63: The last sentence in the paragraph (As a consequence, …), does not seem properly connected to the rest of the paragraph. Please expand and explain in more detail.
L70: Missing “and” before ii).
L 75: Replace “is very dependent” with “depends”.
L 93: “confront” should be “compare”.
L101: Replace “summed-up as” with “summarized in”. “Summed-up” is used in several locations of the manuscript and should be replaced everywhere.
L103: Replace “remain little used so far” with “have only seen limited use so far.”
L115: Replace “ground” with “build”.
Data
L150 – Table 1: First, this table seems to include results already. This is rather unusual for a table in the methods/data section. Second, the areas are not explicitly introduced in the text. Their purpose is mentioned on L 139 in general, but the actual areas are not described.
L197: “Avalanche extensions”, which is used extensively throughout the manuscript, is not the right term. In this particular case, “avalanche records” would work, but most often it refers to the “accuracy of the recorded extent of observed avalanches.” Please correct this throughout the manuscript.
PRA detection
L 236: Delete “(where flow accumulation is equal to zero)” as it is repetitive.
L 249: Replace “few” with “too little”.
L 260: “e.g.,” should probably be “i.e.,”
Results
L 340 – Caption of Fig. 5: Replace “concordance” with “agreement”.
Discussion
L646: It seems inaccurate to mention the confusion matrix here since you did not use the full confusion matrix. Instead, you should more strongly highlight that you examined the performance with respect to area and number of PRAs, which is more novel.
Conclusion
L664: It is unclear to me what you mean by “and close contexts (see below).”
L668: You should explicitly explain how your results contribute to the field and not leave this up to the reader. They might not see it themselves.
L680: Replace “confronted” with “compared” or “contrasted”. |