Impact of spatial data uncertainty in debris flow susceptibility analysis

Kurilla, Laurie Jayne; Fubelli, Giandomenico

doi:10.5194/nhess-2021-364

Preprints

https://doi.org/10.5194/nhess-2021-364

15 Dec 2021

Status: this preprint has been withdrawn by the authors.

Abstract. In a study of debris flow susceptibility on the European continent, a comparison between the known location and a location accuracy offset for 99 debris flows demonstrates the impact of uncertainty on defining appropriate predisposing factors and on the consequent analysis of areas of susceptibility.

The dominant predisposing environmental factors, as determined through Maximum Entropy modeling, are presented, and analyzed with respect to the values found at debris flow event points versus a buffered distance of locational uncertainty around each point.
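The comparison of factor values at an event point with those inside a buffer of locational uncertainty can be sketched as follows. This is a minimal illustration, not the authors' workflow: the grid, the cell indices, the radius in cell units, and the function names `buffer_values` and `point_vs_buffer` are all hypothetical.

```python
import math


def buffer_values(grid, row, col, radius_cells):
    """Collect grid values whose cell centre lies within radius_cells of (row, col)."""
    values = []
    reach = int(math.ceil(radius_cells))
    for i in range(max(0, row - reach), min(len(grid), row + reach + 1)):
        for j in range(max(0, col - reach), min(len(grid[0]), col + reach + 1)):
            # Euclidean distance between cell centres, in cell units
            if math.hypot(i - row, j - col) <= radius_cells:
                values.append(grid[i][j])
    return values


def point_vs_buffer(grid, row, col, radius_cells):
    """Summarize the value at the point against the spread inside its buffer."""
    vals = buffer_values(grid, row, col, radius_cells)
    return {
        "point": grid[row][col],
        "buffer_min": min(vals),
        "buffer_max": max(vals),
        "buffer_mean": sum(vals) / len(vals),
    }


# Hypothetical slope grid in degrees (assumption, not data from the study)
slope = [
    [5, 8, 12, 20],
    [6, 10, 25, 30],
    [7, 15, 35, 40],
    [9, 18, 38, 45],
]
summary = point_vs_buffer(slope, 1, 1, 1.0)
print(summary)
```

A wide gap between the point value and the buffer range suggests that the predisposing factor assigned to an event depends strongly on where, within the uncertainty radius, the event actually occurred.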

Five Maximum Entropy susceptibility models are developed: one utilizing the original debris flow inventory of points, one utilizing randomly generated points, two utilizing subsets of points with a location uncertainty of 5 km and 1 km, and one utilizing only points with a known location of “exact”. The AUCs are 0.891, 0.893, 0.896, 0.921, and 0.93, respectively. The “exact” model, despite having the highest AUC, is excluded from the final analyses because of its small number of points and their localized distribution; its susceptibility results are therefore likely not representative of the continent.
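For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen presence point receives a higher model score than a randomly chosen background point. The sketch below computes it by direct pairwise comparison; the scores are invented for illustration and the function name `auc` is an assumption, not part of any MaxEnt tooling.

```python
def auc(presence_scores, background_scores):
    """Pairwise AUC: share of (presence, background) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical susceptibility scores (assumption, not values from the study)
presence = [0.9, 0.8, 0.4]
background = [0.5, 0.3, 0.2, 0.1]
print(round(auc(presence, background), 3))
```

With very few presence points, as in the “exact” model, a high value from this statistic rests on very few pairs and is correspondingly fragile.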

Each model is analyzed with respect to the AUC, highest contributing factors, factor classes, susceptibility impact, and comparisons of the susceptibility distributions and susceptibility value differences.

Based on model comparisons, geographic extent and context of this study, the models utilizing points with a location uncertainty of less than or equal to 5 km best represent debris flow susceptibility of the continent of Europe. A novel representation of the uncertainty is expressed, and included in a final susceptibility map, as an overlay of standard deviation and mean of susceptibility values for the two best models, providing additional insight for subsequent action.
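One way to picture the overlay is as two per-cell grids derived from the two model outputs: the mean susceptibility and the standard deviation between the models. The sketch below is a rough illustration under stated assumptions: the 2 x 2 grids are invented, the names `mean_and_std`, `model_a`, and `model_b` are hypothetical, and the population standard deviation is used because the abstract does not say which form the authors chose.

```python
import statistics


def mean_and_std(grid_a, grid_b):
    """Return per-cell mean and population standard deviation of two grids."""
    mean_grid, std_grid = [], []
    for row_a, row_b in zip(grid_a, grid_b):
        mean_grid.append([statistics.mean((a, b)) for a, b in zip(row_a, row_b)])
        # For two values, the population standard deviation equals |a - b| / 2
        std_grid.append([statistics.pstdev((a, b)) for a, b in zip(row_a, row_b)])
    return mean_grid, std_grid


# Hypothetical susceptibility grids from two models (assumption)
model_a = [[0.2, 0.8], [0.5, 0.1]]
model_b = [[0.4, 0.6], [0.5, 0.3]]
mean_grid, std_grid = mean_and_std(model_a, model_b)
print(mean_grid)
print(std_grid)
```

Cells with a high mean and a low standard deviation are those where both models agree on high susceptibility; a high standard deviation flags cells where the choice of location-uncertainty threshold changes the result.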

Received: 27 Nov 2021 – Discussion started: 15 Dec 2021

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Interactive discussion

Status: closed

RC1: 'Comment on nhess-2021-364', Anonymous Referee #1, 17 Jan 2022

Dear authors,

First of all, thank you for the nice read; I enjoyed going through your manuscript.

I have recommended minor revisions to the editor. Below, I first summarize what I understood of the work you proposed and then report my suggestions.

What you present is a susceptibility model at the continental scale. The phenomena you model are debris flows, which you access from the NASA repository. The mapping units you chose are grid cells (did you mention their size in the text?), while the covariates you chose span from climatic to terrain ones, to land use and more. You run this experiment by making use of the locational uncertainty provided in the debris flow metadata. As a result, you can select debris flows with different levels of certainty in their positional accuracy, then run a susceptibility model for each group.

The modeling framework is solid. I have only one suggestion on this. You can remove the term presence-only from the text. It is true that MaxEnt is often referred to as a presence-only model in the ecological literature, but in a landslide context, all the model implementations we run do exactly the same thing that MaxEnt does. For instance, even a logistic regression does exactly what you did here: it starts from a set of locations that you consider your presences, then extracts absences or pseudo-absences at random, with a number equal to what you set here as your background. So, shall we call all the other models presence-only? I think it is more of a philosophical definition, but in the daily life of every susceptibility paper out there, the two frameworks coincide.

Also, your exact model, with only 5 debris flows, is quite difficult to justify. There I would stress the limitations even more in the text.

One thing I have noticed is that you use the natural breaks method to classify your susceptibility. This is something that Lombardo et al. 2020 stress in their work. Authors often use one method or another without justifying the classification they opt for. I would suggest writing a couple of lines on why you chose this over any other criterion.

Ref: Lombardo, L., Opitz, T., Ardizzone, F., Guzzetti, F. and Huser, R., 2020. Space-time landslide predictive modelling. Earth-Science Reviews, p. 103318.

As a last comment, in all figures you abbreviate kilometer as Km. This is incorrect, as the symbol for kilometer in the International System of Units is km. I would suggest changing it across all figures.

Good luck with the progress of your paper.

Kind regards,

Rev

Citation: https://doi.org/10.5194/nhess-2021-364-RC1
- AC1: 'Reply on RC1', Laurie Kurilla, 17 Jan 2022
  
  We greatly appreciate the detailed suggestions provided by the reviewer, and their time spent in providing the feedback. We agree with and will implement recommended modifications for the final submission. We regret that we did not "catch" some of these issues before submission.
  Regarding the discussion on MaxEnt background points constituting "absence" data, it is noted that “presence” is unknown at the MaxEnt background locations (Merow et al. 2013). When using the default setting (as was done in this study), the MaxEnt software selects background locations uniformly at random, and these may include the “known” debris flow sites as well. MaxEnt uses background data primarily to characterize environments in the study region rather than to act as “absence” data (Phillips et al. 2009). Perhaps as further evidence, “A simple strategy to remove sample selection bias is to replace the uniform background data by a random sample of background data drawn from the sampling distribution” (Phillips and Dudik 2008).
  We agree that logistic regression is another statistical method not requiring the input of "absence" data. Pointing out that MaxEnt is a “presence-only” model is just one of the justifications for utilizing this methodology. The discussion and methodologies of “presence-only” vs “presence-absence” will continue to be an important topic for this researcher and your insights are much appreciated.
  Merow, C. et al. 2013. A practical guide to MaxEnt for modeling species’ distributions: what it does, and why inputs and settings matter. doi:10.1111/j.1600-0587.2013.07872.x
  Phillips, S. et al. 2009. Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data. – Ecol. Appl. 19: 181–197.
  Phillips, S. and Dudik, M. 2008. Modeling of species distributions with MaxEnt: new extensions and a comprehensive evaluation. – Ecography 31: 161.
  Respectfully,
  Laurie J. Kurilla
  
  Citation: https://doi.org/10.5194/nhess-2021-364-AC1
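The point made in the exchange above, that background locations drawn uniformly at random can coincide with known event sites, can be illustrated with a small sketch. It is not the MaxEnt implementation: the grid size, the presence cells, the sampling without replacement, and the function name `sample_background` are all assumptions made for illustration.

```python
import random


def sample_background(n_rows, n_cols, n_background, seed=0):
    """Draw distinct background cells uniformly at random from the whole grid.

    No cell is excluded, so known presence cells can be drawn as well.
    """
    rng = random.Random(seed)
    cells = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    return rng.sample(cells, n_background)


# Hypothetical presence cells on a 4 x 5 grid (assumption)
presence_cells = {(0, 0), (2, 3)}
background_cells = sample_background(4, 5, 10)
overlap = presence_cells & set(background_cells)
print(len(background_cells), sorted(overlap))
```

Because the background is drawn from the whole study region, it describes the available environments rather than confirmed absences, which is the distinction the reply draws.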

Viewed

Total article views: 1,490 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,096	323	71	1,490	98	108

Views and downloads (calculated since 15 Dec 2021)

Month	HTML	PDF	XML	Total
Dec 2021	164	30	2	196
Jan 2022	84	15	10	109
Feb 2022	26	6	0	32
Mar 2022	13	2	3	18
Apr 2022	15	9	1	25
May 2022	9	7	1	17
Jun 2022	3	1	1	5
Jul 2022	7	4	0	11
Aug 2022	7	7	0	14
Sep 2022	1	3	0	4
Oct 2022	5	0	1	6
Nov 2022	7	4	0	11
Dec 2022	7	4	0	11
Jan 2023	8	2	0	10
Feb 2023	8	1	0	9
Mar 2023	13	5	0	18
Apr 2023	3	3	0	6
May 2023	14	6	1	21
Jun 2023	7	6	0	13
Jul 2023	12	5	3	20
Aug 2023	7	3	1	11
Sep 2023	15	5	0	20
Oct 2023	27	9	5	41
Nov 2023	15	0	0	15
Dec 2023	13	3	0	16
Jan 2024	11	1	0	12
Feb 2024	19	6	2	27
Mar 2024	10	22	0	32
Apr 2024	10	4	6	20
May 2024	8	5	3	16
Jun 2024	6	1	1	8
Jul 2024	7	3	0	10
Aug 2024	10	6	2	18
Sep 2024	4	0	0	4
Oct 2024	7	3	0	10
Nov 2024	5	0	0	5
Dec 2024	4	9	0	13
Jan 2025	7	5	1	13
Feb 2025	7	1	1	9
Mar 2025	11	4	4	19
Apr 2025	5	6	0	11
May 2025	6	7	2	15
Jun 2025	9	5	0	14
Jul 2025	17	9	1	27
Aug 2025	35	14	1	50
Sep 2025	320	9	2	331
Oct 2025	14	10	1	25
Nov 2025	12	13	6	31
Dec 2025	13	14	1	28
Jan 2026	14	6	3	23
Feb 2026	18	4	2	24
Mar 2026	17	16	3	36

Viewed (geographical distribution)

Total article views: 1,460 (including HTML, PDF, and XML), thereof 1,460 with geography defined and 0 with unknown origin.

Latest update: 30 Mar 2026

Short summary

Debris flow research, at broader geographic coverages, requires the use of inventories of past events. Such information may not have precise event locations, resulting in current and future susceptibility models with a lower confidence level. This research showcases the problems associated with inaccurate locations in identifying the conditions which predispose an area to debris flows and provides a novel approach to presenting such uncertainties to the users of the resulting models.