Uncertainty analysis of the estimation of stony debris ﬂow rainfall threshold: the application to the Backward Dynamical Approach

. Rainfall thresholds, namely rainfall intensity-duration conditions beyond which the probability of debris ﬂow occurrence is considered signiﬁcant, can be used as a forecasting tool in debris-ﬂow early warning system. Many uncertainties may affect the thresholds calibration and, in turn, the reliability and effectiveness of this tool. The purpose of this study is to assess the uncertainty in the determination of the rainfall threshold for stony debris ﬂow based on the Back Dynamical Approach (BDA) (Rosatti et al., 2019), an innovative method to estimate the rainfall duration and averaged intensity strictly related to 5 measured debris ﬂow. The uncertainty analysis has been computed performing two Monte Carlo cascade simulations: (i) to assess the variability in the estimate of rainfall conditions due to the uncertainty of some of the BDA parameters and (ii) to quantify the impact of this variability on the threshold parameters, obtained by using the frequentist method. Then, the deviation between these analysis outcomes and the values obtained in Rosatti et al. (2019) has been examined. The results highlight that the variability in the rainfall condition estimate is strongly related to the debris ﬂow characteristics and the hyetograph 10 shape. Depending on these features, the spreading of the obtained distributions can take both low and high values. Instead, the threshold parameters are characterised by a low statistical spreading. Finally, the consistency between the outcome of this study and the results obtained in Rosatti et al. (2019) has been proved and the critical issues related to the rainfall condition estimation have been discussed.


15
Debris flows are very intense phenomena that affect mountain regions and have a significant impact on the territory in which they occur, causing damages and, in some cases, casualties (Fuchs et al. (2013), Cánovas et al. (2016)). For this reason, debris flow risk management, based on both active and passive mitigation strategies, is crucial to reduce the effects of the phenomenon on the territory. An early warning system is an example of a passive mitigation tool (Huebl and Fiebiger, 2005) as it allows to activate prevention measures (e.g. evacuation sets out in the civil protection plans) before the expected event occurs. 20 In this last framework, the forecast of the possible occurrence of debris flow is mainly based on rainfall thresholds (Chien-Yuan et al. (2005), Segoni et al. (2018)), namely rainfall conditions beyond which the probability of debris flow occurrence is considered significant. Usually, rainfall thresholds are power laws that link the rainfall duration to the rainfall cumulated or intensity  and the relevant coefficients are calibrated on historical data. A considerable literature deals with this topic (e.g. Aleotti (2004), Guzzetti et al. (2007), Jakob et al. (2012), Pan et al. (2018)).

The Backward Dynamical Approach
The Backward Dynamical Approach is an innovative method proposed by Rosatti et al. (2019) that defines the rainfall volume strictly relevant to a stony debris flow in a physical-based way. As described in Rosatti et al. (2019), the BDA starts from the 60 knowledge of the deposited volume V dep occupied by the sediments, surveyed after a debris flow event. Thanks to a simplified volumetric description of the phenomenon dynamic ( Fig.1), the rainfall volume pertaining to the debris flow V DF r , defined as the volume of water necessary to convey downstream as a mixture the volume of sediments V dep , can be express as a function of this latter volume: where c b is the concentration of the sediment in the bed, constant and assumed equal to 0.65 (Takahashi, 2014), and c is a reference volumetric solid concentration of the given debris flow. According to Takahashi (1978), c can be estimated as where i f is the average slope of the last 50 m of the river bed before the deposition zone, ψ is the dynamic friction angle of the sediments and ∆ = (ρ s − ρ l )/ρ l is the sediment relative submerged density where ρ l and ρ s are, respectively, the liquid and 70 solid constant density.
V DF r can also be express as product between the rainfall volume per unit area E and the event basin area A b : from which, a dynamical expression for the rainfall volume per unit area can be obtained: On the other hand, E can be obtained from the forcing of the phenomenon, namely in the hyetograph. Under the assumption of uniform rainfall over the basin, the hydrological expression for E is: where i(t) is the measured rainfall intensity, t 1 and t 2 are the start and end times related to the debris-flow duration. In the absence of event detailed data, the instant of maximum intensity during the event t max is assumed to be the debris flow 80 triggering time (Iadanza et al., 2016) and t 1 and t 2 are assumed to be: Because of the measurement technique, i(t) is a piecewise constant function on time intervals δt. Therefore, the integral in Eq. (5) cannot equal exactly the value of E, and an approximated value must be considered. The values of ∆t 1 = n 1 δt and Figure 1. Conceptual Lagrangian volumetric description of debris flow dynamic from Rosatti et al. (2019). The scheme is divided into three transepts: transept 1 is characterised by the runoff formation; the bed material erosion and the achievement of equilibrium conditions occur in transept 2; transept 3 is characterised by the deposition of sediments with water entrapment. V DF r is the rain volume pertaining to the debris flow, Vero is the bed volume variation related to the erosion, V dep is the deposited volume occupied by the sediments and θ is the inclination angle of the bed with respect to a reference horizontal direction.
∆t 2 = n 2 δt are computed considering the minimum number of intervals (i.e. n 1 and n 2 ) necessary to exceed (or equal) the 85 value of E: For further details on the computation of ∆t 1 and ∆t 2 , please refer to Rosatti et al. (2019).
Therefore, the duration D and the average intensity I related to the event become 2.1 The BDA-based threshold for a study area As described in Rosatti et al. (2019), the BDA methodology has been applied to obtain a stony debris flow rainfall threshold for the Trentino-Alto Adige/Südtirol region (Italy) (Fig. 2). A dataset composed of 84 debris flow events has been considered to 95 calibrate the threshold. Firstly, the (I, D) couples have been computed applying the BDA method. Then, the threshold has been obtained by using the frequentist method (e.g. Brunetti et al. (2010) and Peruccacci et al. (2012)). According to this method,  the threshold is a straight line in the log-log ID plane parallel to the power law obtained fitting the events rainfall conditions. The threshold exponent is therefore equal to b while the intercept is computed 100 by setting a value of the non-exceedance probability of the dataset events, namely by imposing the occurrence probability of debris flows related to rainfall conditions located below the threshold. The threshold equation is then: in which a <â. For more details on the frequentist method, we refer the reader to the above-mentioned references.
In Rosatti et al. (2019) the non-exceedance level has been set equal to 5% and the resulting calibrated threshold is:

Method
As briefly presented in the introduction, to perform a comprehensive study, the uncertainty analysis of the BDA-base threshold estimation is divided into three parts. A first analysis examines the uncertainty propagation in the computation of the BDA Table 1. Uncertainty probability distributions of the input parameters established to perform the LHS. i f,r , A b,r and V dep,r are the reference values (i.e. the values used in the calibration phase) of the average slope, of the basin area and of the sediments respectively.
(2) and E, Eq. (4)) starting from the uncertainty of some input parameters. Subsequently, the effects of this propagation on the frequentist method threshold estimation are analysed.
Finally, a comparison between the results of the two previous analyses and the reference outcomes, obtained in Rosatti et al.
(2019), is carried out. In the following sections, we present the details of each part.

115
The main sources of uncertainties in the BDA outputs are attributed to some of the input parameters and data, namely the average slope i f , the basin area A b , the deposited volume V dep and the dynamical friction angle ψ. As described in Rosatti et al. (2019), during the calibration phase, i f and A b related to each event have been estimated performing GIS analysis based on the available debris flow data (e.g. location of the deposited area, technical reports...) and V dep have been provided by regional agencies. Instead, due to the scarcity of sediments information, ψ has been assumed to be 35 • for all debris flow 120 events. As stressed in the Introduction, the rainfall intensities i(t) associated with the event are assumed to be certain. Future analysis will assess and study also the uncertainties related to this piece of data.
For each event, the uncertainty propagation of i f , A b , V dep and ψ is carried out with MC approach using the Latin Hypercube Sampling (LHS) procedure (McKay et al. (2000), Helton and Davis (2003) and Helton et al. (2006)) for the generation of the sample. The implementation of this procedure is schematized in Fig. 3 and is composed of three main steps. First, the 125 characterization of the uncertainty in the parameters, namely the probability distribution function (pdf) of their values, has to be defined ( Fig. 3(a)). Lacking certain data concerning the pdfs, according to Marino et al. (2008), all the input parameters are assumed to be uniformly distributed. In particular, for each event, the means of the parameter distributions are set equal to the reference values, namely to the values used in the calibration phase (indicated with subscript r). The dynamical friction angle is the only parameter with a validity range. For stony debris flow, according to Lane (1953) and Blijenberg (1995), it varies 130 from 32 • to 38 • . Therefore, assuming 35 • as the mean of the ψ distribution, this range can be obtained by imposing a variation coefficient CV (i.e. the ratio between the standard deviation and the mean) equal to about 5%. Since being greater than zero is the only constraint of the other parameters, for homogeneity this CV value is considered suitable for all parameters. The uncertainties characterization of the parameters is summarized in Table 1.
Second, the parameter samples, namely the ordered sets of parameters values in the form (i f , A b , V dep , ψ), must be obtained.  (2000). This method produces N samples starting with a division of each parameter range into N disjoint intervals of equal probability. Then, one value is randomly selected within every interval, thus obtaining N values for each parameter. These values are then arranged in a matrix composed of N rows and k columns, where k is the number of the parameters (four, in the specific case). In each column, the N values relevant to a single parameter are inserted in random order (Helton et al., 2006). Each row of this matrix 140 gives one of the N parameter samples. According to Marino et al. (2008), to ensure the accuracy, the sample size N should be at least greater than k. In this study, N is set to 100 and the (100 × 4) LHS matrix is generated for each event, based on the previously established pdfs.
Finally, the BDA outputs are obtained starting from each parameter sample, resulting in 100 (I, D) couples for each event (Fig.   3(c)), together with the related concentrations c and rainfall volumes per unit area E.

145
The obtained results are then analysed in term of relative uncertainty, quantified through the computation of the CV of the output distributions. The CV , by definition, is a standardized measure of uncertainty (Håkanson, 2000) and allows to compare the relative variability of the results independently of their measurement units (Abdi, 2010) and of their means. For this reason, the CV is chosen as statistical measure to compare the relative variability between the outputs (between both the same output of different events and different outputs of the same event) and to understand how the uncertainty of an output variable changes depending on the event characteristics. In particular, the CV s are computed for all the BDA outputs.
Moreover, to evaluate how the (I, D) points move in the ID plane, the absolute uncertainty associated with the (I, D) couples estimation is quantified throughout the computation of the 95% uncertainty intervals. The extremes of the latter are defined as the 2.5 and 97.5 percentiles of the D and I distributions. The length of the uncertainty interval allows evaluating the absolute variability of the D and I distributions.

Comparison between MC outputs and reference values
In this analysis, we compare the MC means (I, D) couples, the threshold obtained with the mean values of a and b and the threshold uncertainty bounds with the results of the original calibration phase (reference values) in order to assess the differences between the two approaches.

170
As regards the rainfall conditions, according to Marra (2019), the bias of duration B D and intensity B I are computed for each event as: where the subscripts r and m represent respectively the reference value and the mean of the MC output distribution.
For what concerns the threshold, firstly the differences between the MC intensities I M C,k , where k stands for mean, upper 175 and lower bounds, and the reference threshold ones I t,r (Eq. (12)) are carried out for the same fixed durations used to define the uncertainty bounds: Subsequently, the percentage changes, defined as: are computed to figure out how much the MC results deviate relatively from the calibrated threshold.

Propagation of uncertainty in the BDA outputs
The method described in the previous section has been applied to the 84 debris flow events used to calibrate the threshold for the Trentino-Alto Adige/Südtirol region (Italy) in Rosatti et al. (2019). The values of the CV s for the BDA outputs related to 185 each event are shown in Table 2. The analysis of their variability allows to make the following observations: -D is the output characterised by distributions with the largest and most variable CV s with respect to all the other outputs.
In fact, the events CV D vary between 0% and 157.5%. The reason for this behaviour will be clarified further on; the volume per unit area distributions shows CV E s that vary between 7.1% and 64.6%. It is worth noting that high 200 uncertainty in the estimation of E does not necessarily imply large CV D and/or CV I (e.g. event 12) and vice versa (e.g. event 38).
To better understand the variability of D and I, we classify the events into three categories based on the CV D values: 1. events with zero variability: CV D = 0%; 2. events with low variability: 0% < CV D ≤ 30%;   regardless of the input parameters variability, the concentration is always equal to 0.9 c b . In this case also the variation 210 coefficient of the concentration is equal to zero (e.g. events 1, 13 and 25) and the variation of E is only due to the propagation of the basin area and deposited volume uncertainties (Eq. (4)), namely CV E 7.1%. For these 14 events, such a low variation in E implies the constant computation of the same (I, D) couple; despite CV c is not zero and the CV E is greater than 7.1% (e.g. event 28), the condition of Eq. (7) is satisfied, in all the 100 simulations, with the same time instants of the hyetograph. 34 events fall into this condition.

215
The events that belong to the second category are characterised by I and D distributions relatively low spread around their mean values. For each of these 19 events, the uncertainty of the inputs implies the computation of more (I, D) couples which, however, are relatively close to the mean value.
The third category includes 17 events for which the input parameters uncertainties propagation results in high relative uncertainty in the (I, D) couples computations. In general, these events are related to a hyetograph characterised by an intensity peak 220 and low values around it. This sharp decrease in intensity implies that small variations in E require large variations in the time intervals in order to satisfy Eq. (7). Event 58 is the extreme case of this condition: the duration distribution of this event has an extreme much greater than the mean value (Fig. 5). This is due to the presence of zero intensity temporal instants in the middle of the precipitation event that must be considered to reach the higher values of E. For this reason, for some combinations of the input parameters, D is much greater than the mean value and these extremes entail an increase in the standard deviation, 225 namely in the CV D . The effects of this condition on I are fewer thanks to the mean computation carried out to obtain this  Figure 5. Hyetograph of event 58. In term of D, the extreme value (3.75 h) is much greater than the mean (0.32 h).
As regards the absolute uncertainty associated with the (I, D) couples estimation, the results are shown in Fig. 6: the horizontal and vertical lines represent respectively the 95% uncertainty intervals related to D and I. As evident, the length of 230 the intervals varies greatly depending on the event. In term of intensity, the maximum length is 61.3 mm h −1 and is reached with the event 67 while, for the duration, the event 38 is characterised by the maximum length that is equal to 3.51 h.

Uncertainty in threshold computation
The main statistical quantities concerning the output a and b parameters distributions are given in Table 3. In particular, the variation coefficients are respectively 6.46% and 2.91%. These low values highlight that the relative uncertainties associate to 235 a and b are very small. The low spread nature of the thresholds parameters is also evident in the scatter plot and in the bivariate 3D histogram, respectively shown in Fig. 7(a) and 7(b).
Moreover, the lower and upper bounds of the threshold uncertainty, evaluated as explained in Sect. 3.2, are shown in Fig. 8.  (Table 3). The shaded area represents the threshold uncertainty whose upper and lower bounds have been computed considering the 2.5 and 97.5 percentiles of the intensity distributions for fixed durations.

MC means and reference values
As regards the events rainfall condition, the biases between the reference values and the MC output distributions means (Eq. (13)) are shown in Fig. 9. B D deviates between 0.8 and 1.5 ( Fig. 9(a)) while B I between 0.86 and 1.15 ( Fig. 9(b)). Consistently As regards the threshold, the differences between the mean, lower and upper bounds MC intensities and the reference 250 threshold ones (Eq. (14)) as a function of D are shown in Fig. 10(a). For almost all durations, the threshold obtained with the MC mean values of a and b is slightly lower than the reference one: a positive difference occurs only for the first time interval.
Consistently with the obtained a and b mean values (Table 3) and the B D and B I trends, in the log-log ID plane the mean threshold is respectively slightly more downward translated and clockwise rotated than the reference one. Instead, the upper and lower bounds are respectively always higher and lower than the reference threshold. This means that the latter is contained 255 in the MC threshold uncertainty area. This behaviour is also evident in Figure 10 are plotted: the mean threshold deviates between 0.14% and −5.44%, the upper bound between 8.06% and 12.31% and the lower bound between −8.34% and −22.94% from the reference one. In summary, the results obtained for the BDA outputs analysis has highlighted that the uncertainty of the inputs may result in a both low and high variation of the outputs distributions depending on the event characteristics and on the variable considered.
The main uncertainties are related to the computation of D, as evident from Table 2 and Fig. 6, and are mainly due to the piecewise constant nature and shape of the hyetograph. Indeed, given the discrete nature of i(t), in some cases, even a small 265 change in E may result in a relatively large variation of D (e.g. event 38), namely in the number of temporal instants necessary to achieve the needed rainfall volume. This is especially true if the hyetograph is characterized by time intervals with zero intensity within the event. However, although some events are characterised by a high uncertainty in the estimation of the (I, D) couple, the threshold computation is not affected by high variability. Both the threshold parameters have low variation coefficient (Table 3), are low spread within the ab plane (Fig. 7) and the upper and lower bounds of the uncertainty area are 270 close to the mean threshold (Fig. 8). This low uncertainty is mainly due to zero variability of 48 events outputs: since the (I, D) points of these events are located in the same positions in all the 5000 MC simulations, they propagate zero uncertainty in the threshold computation.
Moreover, the deviation between the MC results and the reference values has been analysed both in term of (I, D) couple estimation and threshold computation. As evident from Fig.9, most events are characterised by B D and B I equal or close to 1.

275
However, the duration bias of some events is high. Despite this, the reference threshold is very close to the threshold computed with the MC mean value of a and b and is contained in the uncertainty area (Fig.10). This means that if the uncertainty analysis has been done during the calibration of the threshold and the mean threshold had been accepted, there would have been no large variations.
As evident, the most critical part of the study is the computation of the (I, D) couples. High variability of some I and D 280 output distributions points out the presence of possible critical issues in the estimation of the rainfall conditions associated with the event, as also highlight in Rosatti et al. (2019). The critical issues may be due to three main conditions. Firstly, some combinations of the input parameters, obtained with the LHS method, may be not representative of the analysed event. This condition leads to inconsistent estimates of the rainfall volume and, consequently, of D and I. Secondly, the analysed debris flow may have had a different dynamic than the simplified one on which the BDA is based. For instance, the hypothesized 285 equilibrium condition may have not been reached by some events due to particular conditions, such as lack of sediments, the presence of non-erodible zones or initiation of a debris flow caused by slope failure. This leads to an overestimation or underestimation of the concentration and, therefore, to a not reliable rainfall condition estimation. Finally, according to Marra et al. (2014), the radar data may be affected by some sources of error. During the calibration phase, to avoid the source of error related to the beam shielding, typical phenomenon for mountain regions (Germann et al., 2006), the events have been 290 filtered out based on the radar signal power. Events with radar signal weakened more than 90% have been excluded (Rosatti et al., 2019). However, other sources of error, such as signal attenuation in heavy rain or wet radome attenuation, have not been considered in the current study. If present, these errors propagate in the computation of V DF r (Eq. (7)) implying an over or underestimation of the available rainfall volume. This results in a wrong estimate of the event (I, D) couple. Further analysis will assess how these three conditions affect the (I, D) couple estimation and, consequently, the threshold calibration.

295
However, the BDA method seems to be robust enough to provide a reliable rainfall threshold: the input parameters uncertainties result in a low variation of the threshold estimation. Further analysis will also assess the robustness and the reliability of the threshold to forecast the possible occurrence of debris flow.
Author contributions. MM designed the experiments, performed the analysis and wrote the paper. DZ and GR supervised the study, analysed the results and wrote the paper.