Empirical atmospheric thresholds for debris flows and flash floods in the southern French Alps

Debris flows and flash floods are often preceded by intense, convective rainfall. The establishment of reliable rainfall thresholds is an important component of quantitative hazard and risk assessment, and of the development of early warning systems. Traditional empirical thresholds based on peak intensity, duration and antecedent rainfall can be difficult to verify because of the localized character of the rainfall and the absence of weather radar or sufficiently dense rain gauge networks in mountainous regions. However, convective rainfall can be strongly linked to regional atmospheric patterns and profiles, and this link can potentially be exploited in empirical threshold analysis. This work develops a methodology to determine robust thresholds for flash floods and debris flows using regional atmospheric conditions derived from ECMWF ERA-Interim reanalysis data, and compares the results with rain-gauge-derived thresholds. The method consists of selecting the appropriate atmospheric indicators, categorizing the potential thresholds, and determining and testing the thresholds. The method is tested in the Ubaye Valley in the southern French Alps (548 km²), which is known for debris flows and flash floods triggered by localized convection. This paper shows that instability of the atmosphere and specific humidity at 700 hPa are the most important atmospheric indicators for debris flows and flash floods in the study area. Furthermore, this paper demonstrates that atmospheric reanalysis data are an important asset, and could replace rainfall measurements in empirical exceedance thresholds for debris flows and flash floods.


Introduction
A key component in risk assessments for natural hazards is quantifying the probability of occurrence in relation to specific intensities of the hazardous events. Intense short-duration precipitation, long-lasting rainfall, and snowmelt are all potential triggers for hydro-meteorological hazards in mountainous areas in Europe (Brunetti et al., 2013; Sene, 2013). However, while rainfall is often an important element in triggering hydro-meteorological hazards, the actual atmospheric conditions are often complex, with very localized rainfall.
In the European Alps and Mediterranean region, debris flows are generally caused by heavy rainfall from either intense convection or sustained heavy frontal rainfall (Tarolli et al., 2012). Antecedent conditions, such as previous rainfall, snowmelt and evaporation, are also important; however, they are often not collected or incorporated into the threshold (Guzzetti et al., 2008). Debris flows can be generated by a number of different causes, such as liquefaction of the toe part of landslides, blocking of channels, and accelerated erosion along gullies. Heavy rainfall may trigger debris flows and flash floods in the same channels filled with sediments (van Asch et al., 2013), and both events can be approached similarly in the threshold analysis. Within this paper we refer to rapid instantaneous events such as debris flows or flash floods as flash events.
The role of rainfall in triggering debris flows and flash floods can be examined using physically based models (e.g. Quan Luna et al., 2011; van Asch et al., 2013). Through the use of hydrologic and stability models, these physical models take into account not only rainfall, but other factors such as pore pressure and slope stability (Aleotti, 2004). However, the models can be computationally costly and require extensive parameterization and calibration. Therefore, the application of such models is often only feasible for relatively small areas, such as a single torrent or a few square kilometres (Brunetti et al., 2013).
For larger areas (tens of square kilometres upwards), empirical rainfall thresholds are more frequently used (e.g. Aleotti, 2004; Giannecchini, 2006; Frattini et al., 2009; Brunetti et al., 2010; Berti et al., 2012). Thresholds define minimum or maximum conditions of one or more triggering factors for a particular hazardous event (Frattini et al., 2009). Recent research in this field has focused on the development of objective and reproducible thresholds (Guzzetti et al., 2008). Methods include Bayesian inference, where the parameters of the threshold are fit using a probability approach (Guzzetti et al., 2007), and a frequentist approach, based on the frequency of conditions that have resulted in landslides (Brunetti et al., 2010). A detailed review of empirical thresholds for debris flows and landslides can be found in Guzzetti et al. (2008).
For debris flows a typical approach is to define a threshold based on the intensity, duration or antecedent rainfall amounts (Guzzetti et al., 2008). The general form of the rainfall threshold is given in Eq. (1), with three examples from Caine (1980) (Eq. 2), Guzzetti et al. (2008) (Eq. 3), and Cepeda et al. (2010) (Eq. 4):
$I = \alpha D^{\beta}$ (1)
$I = 14.82 D^{-0.39}$ (2)
where intensity ($I$) is given in mm h⁻¹, duration ($D$) in hours, and $\alpha$ and $\beta$ are curve parameters.
Empirical rainfall thresholds rely on accurate rainfall measurements, often requiring sub-daily data (e.g. Aleotti, 2004; Giannecchini, 2006; Cepeda et al., 2010). However, as many hydrological and meteorological stations still collect only daily rainfall, fine-resolution data are not always available. In mountainous areas, precipitation can vary greatly with altitude. Without extensive meteorological networks, the effect of orographic processes on the spatial variation of rainfall can be difficult to determine (Tobin et al., 2011). Therefore, in many threshold studies, many hazardous events are excluded from analysis. Brunetti et al. (2013) automatically excluded events where the closest rain gauge was more than 5 km away or there were not sufficient rainfall data, and in Meyer et al. (2012), 20 % were excluded due to insufficient information.
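As a minimal sketch of how an intensity-duration threshold of this kind is applied, the snippet below evaluates the power-law form I = alpha * D**beta and checks whether an observed rainfall event lies above the curve. The default parameters are those of the Caine (1980) curve quoted above; the function names and the example rainfall values are hypothetical and serve only as illustration.

```python
def threshold_intensity(duration_h, alpha=14.82, beta=-0.39):
    """Threshold intensity (mm/h) for a rainfall duration in hours.

    Implements I = alpha * D**beta; the defaults reproduce the
    Caine (1980) curve quoted in the text.
    """
    if duration_h <= 0:
        raise ValueError("duration must be positive")
    return alpha * duration_h ** beta


def exceeds_threshold(mean_intensity_mm_h, duration_h, alpha=14.82, beta=-0.39):
    """True when the observed mean intensity lies above the threshold curve."""
    return mean_intensity_mm_h > threshold_intensity(duration_h, alpha, beta)


if __name__ == "__main__":
    # Hypothetical event: 20 mm/h sustained for 1 h.
    print(exceeds_threshold(20.0, 1.0))
```

Because beta is negative, the threshold intensity decreases with duration, so a long event can exceed the curve at a much lower mean intensity than a short one.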
Other challenges for empirical rainfall thresholds include having a detailed and sufficiently complete inventory of events, and deciding and defining the indicators to use in the thresholds. It is also often not clear how to define a rainfall event (when it starts and finishes), although recent papers have tried to address this (Brunetti et al., 2010; Berti et al., 2012). Finally, many of the empirical methods establish a threshold above which debris flows may occur, without considering non-event observations also above the threshold, as there are many more non-event days. Meyer et al. (2012) used only debris flow events to determine the threshold, then analysed the annual frequency of days above the threshold. As rainfall is not the only factor governing debris flows, there will likely always be uncertainty in the definition of rainfall thresholds (Berti et al., 2012).
One way to approach the significance of a threshold is using Bayesian probability (e.g. Berti et al., 2012). Bayesian probability takes into account the likelihood of an event given certain conditions. However, while Bayes' theorem is useful in determining the probability of an event above a certain threshold, it does not take into account the probability that an event would be below this threshold. So even if the probability of an event occurring above a particular threshold is high, many events may occur below this threshold.
The thresholds above all use rainfall directly. However, it is also possible to analyse the cause of heavy precipitation. Ingredients that can lead to precipitation include mechanisms for uplift of an air mass (such as heating at the surface or orographic lift), increased saturation of the atmosphere, or a mixing of two or more air masses (such as fronts and low pressure systems). Maddox et al. (1979) found for the US that 43 % of flash floods were caused by local convection, while the rest were synoptically driven. Studies in the Mediterranean basin show that heavy precipitation events are often caused by quasi-stationary local convection (e.g. Nuissier et al., 2008). Atmospheric indicators can summarize the principal atmospheric conditions leading to heavy rainfall for a particular area, depending on the different causal mechanisms.
While atmospheric indicators have not had widespread usage in threshold analysis for flash events, they have been used as indicators for heavy rainfall and for downscaling climate projections. Trapp et al. (2009) used the product of convective available potential energy (CAPE) and deep-layer wind shear (DLS) as an indicator for severe thunderstorms. Nuissier et al. (2011) used synoptic (large-scale) weather types based on the Hess-Brezowsky Grosswetterlagen classification, as well as low-level moisture flux and low-level wind direction, to detect heavy precipitation events in southern France. Other examples of using atmospheric indicators for heavy precipitation include Schmidli et al. (2007), Chen et al. (2010) and Jeong et al. (2012). Identification of synoptic atmospheric conditions that lead to flooding has also been undertaken in a number of studies (e.g. Petrow et al., 2009; Parajka et al., 2010).
Atmospheric indicators can be obtained using reanalysis data from physically based models. Using a forecast model combined with observations, reanalysis data are consistent with both atmospheric observations and the laws of physics (Dee et al., 2011). The weighting given to the observations differs depending on the quality of the observations. Less reliable fields, such as precipitation, are less dependent on observations than more reliable fields such as mean sea level pressure (Tapiador et al., 2012). However, the quality of the output is dependent on the skill of the underlying forecasting model. Overall though, reanalysis data provide a wide range of atmospheric variables that are both spatially complete and coherent (Dee et al., 2011).
Rather than rainfall thresholds from local weather stations, this research develops empirical atmospheric thresholds for debris flows and flash floods using atmospheric indicators to identify the potential heavy rainfall events, using 63 flash events in the southern French Alps. The main advantages are that a dense observational rain gauge network is no longer required, and that there is no need to define explicitly what a rainfall event is. Furthermore, atmospheric thresholds can lead to a better understanding of the meteorological conditions that are related to the occurrence of debris flows and flash floods. Empirical atmospheric thresholds therefore may be an alternative to the conventional empirical rainfall thresholds where dense observational networks are not available, or where further investigation is required into the cause of the rainfall.
The structure of the paper is as follows: first an overview of the study area and the data set is given, followed by a description of the methodology to develop atmospheric thresholds. The methodology includes dividing the flash events into those caused by local convection, and those that result from more synoptically driven, widespread rainfall. Thresholds using weather station data are also generated for comparison. The results are then presented and discussed, with a conclusion on the main results and limitations of developing and using empirical atmospheric thresholds for debris flows and flash floods.

Study area and data description
The Ubaye Valley is an east-west oriented valley in the Alpes-de-Haute-Provence, France, with a catchment size of around 548 km² and elevation between 1100 and 3000 m a.s.l. (Fig. 1). The Ubaye Valley has a mountainous Mediterranean climate with snow cover at high altitudes for approximately half of the year (Malet et al., 2007). Previous investigation has found that hydro-meteorological events are generally associated with snowmelt and high-intensity summer storms, although the precise triggering conditions have been difficult to determine (Flageollet et al., 1999).
Four of the five weather stations are located close to the main river channel (Fig. 1). Station 5 (Table 1) is only operational during the summer and hence only used for qualitative comparison with the other locations. Information on elevation, length of measurement series and variables for all the weather stations can be found in Table 1. All stations measure daily precipitation, and station 1 also records temperature. Stations 1 to 4 are homogeneous based on the criteria from Wijngaard et al. (2003) and three homogeneity tests (Pettitt, 1979; Alexandersson, 1986; Wang et al., 2010). Total annual precipitation amounts for stations 1 to 4 vary between 730 and 985 mm, with the mean annual daily maximum precipitation amount between 46 mm (station 1) and 53 mm (station 4). The correlation between station 5 and the other four stations in summer is low: between 0.02 and 0.08, based on Kendall's tau correlation coefficient (Kendall, 1970). The correlation between stations 1 to 4 is higher: between 0.69 and 0.74.
The Ubaye Valley has an extensive landslide, debris flow, and flash flood inventory compiled from historical data in municipal archives, newspapers and technical reports (Flageollet et al., 1999). Historical records provide valuable information on temporal occurrence of larger events, although the events recorded depend on the exposure and awareness of the observers to the hazard (Ibsen and Brunsden, 1996; Carrara et al., 2003).
The historical inventory contains 29 flash floods and 39 debris flows observed between 1979 and 2010, which occurred between March and November (Fig. 2). Tarolli et al. (2012) found a similar seasonal distribution of flash floods, with events generally occurring between August and November in the western Mediterranean. On average, discharge levels between September and November closely follow the mean precipitation intensity, while the discharge increases from March to July mainly due to snowmelt (Fig. 2). As the valley is orientated west-east, north-facing slopes are likely to retain snow longer than south-facing slopes.
Cepeda et al. (2010) developed Eq. (4) for debris flows based on hourly precipitation from station 1. Only seven debris flows were used, as the others occurred before subdaily precipitation measurements were available (1998), or the precipitation or inventory record was deemed to be not sufficient (Cepeda et al., 2010). For the threshold, 86 % of the debris flow events used were correctly predicted, and 5.5 % of rainfall events above the threshold resulted in debris flow. However, no threshold was obtained using only the longer daily rainfall data set. To obtain a threshold for a longer time period, other methods or data sets are therefore required.
ECMWF ERA-Interim reanalysis data are used for analysing the regional atmospheric variables. The data have a spatial resolution of 80 km (T255) and are available for 1979 onwards (Dee et al., 2011). More information about observation and data assimilation and model characteristics for ERA-Interim can be found in Dee et al. (2011). The study area is approximately half of one grid box, so only the grid box containing the study area and those directly beside it are used (nine in total). The variables chosen (Table 2) include commonly used predictors for statistical downscaling of precipitation from Global Climate Models at multiple atmospheric pressure levels (Chen et al., 2010; Jeong et al., 2012). In addition, convective available potential energy (CAPE), deep layer shear (DLS), and soil moisture fields are also included. The first two are added as they might be indicative of convection (Marsh et al., 2009), and soil moisture as part of antecedent conditions. CAPE in particular is an estimate of the energy that a parcel of air at the surface would gain if it were lifted. High positive CAPE values indicate that the air may be unstable and favourable for convection. A brief description of each of the variables is also given in Table 2. Atmospheric indicators at 850 and 700 hPa represent lower tropospheric conditions, while indicators at 500 and 250 hPa represent the upper troposphere. The surface variables are available at 3-hourly time steps, with the others at 6-hourly time steps (Dee et al., 2011). DLS is estimated from the surface wind fields (u10m, v10m) and the 500 hPa wind fields (u500hPa, v500hPa) using the following equation (Seltzer et al., 1985):
$\mathrm{DLS} = \sqrt{(u_{500\,\mathrm{hPa}} - u_{10\,\mathrm{m}})^2 + (v_{500\,\mathrm{hPa}} - v_{10\,\mathrm{m}})^2}$ (5)
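The deep layer shear described above is the magnitude of the vector wind difference between the 500 hPa level and the surface. A minimal sketch is given below, assuming that both wind fields are supplied in m s⁻¹ for the same grid box and time step; the function name and the example wind components are hypothetical.

```python
import math


def deep_layer_shear(u10m, v10m, u500hpa, v500hpa):
    """Magnitude of the wind difference between 500 hPa and 10 m (same units as input)."""
    # math.hypot returns sqrt(du**2 + dv**2) for the two component differences.
    return math.hypot(u500hpa - u10m, v500hpa - v10m)


if __name__ == "__main__":
    # Hypothetical wind components in m/s.
    print(deep_layer_shear(u10m=1.0, v10m=-2.0, u500hpa=12.0, v500hpa=6.0))
```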

Methodology
This section explains a method to establish empirical thresholds for debris flows and flash flood events (flash events) based on regional atmospheric conditions or indicators from the reanalysis data set. Two different thresholds are considered: (1) a probabilistic threshold based on Berti et al. (2012), determining the likelihood of a flash event using a variety of indicators, and (2) a static threshold that takes into account the number of flash events below the threshold as well as the probability of occurrence. Besides defining the threshold, the methodology also examines (a) if the local weather station network was adequately capturing the rainfall causing the event, (b) whether intense convection was the main rainfall source triggering the events, and (c) if other meteorological triggers, such as snowmelt, are relevant to triggering events in the study area. The three steps of the proposed methodology are:
- Section 3.1: categorize events based on potential meteorological triggers
- Section 3.2: select appropriate atmospheric indicators for each category
- Section 3.3: compute the probabilistic and static thresholds and then apply these over a validation period.
Based on the availability of the weather station data and reanalysis data, the period 1979-2010 was chosen as the focus study period. The years from 1989 to 2004 are used for calibration and two validation periods are selected, namely 1979-1988 and 2005-2010. By splitting the validation period into two segments, the effects of changes in data quality, such as measurement techniques or observational coverage, are expected to be reduced while maintaining the longest possible data period. The probabilistic and static thresholds are also established using local weather station data for direct comparison with the empirical regional atmospheric thresholds.

Categorization of events
The proposed categories are based on the governing rainfall generation processes, with a secondary subdivision based on potential antecedent conditions. The four categories are: Ls (locally generated rainfall, spring), Lr (local rainfall, summer), Ss (synoptic rainfall, spring), and Sr (synoptic rainfall, summer). The classification is based on Merz and Blöschl (2008), who identify five categories for river floods based on the type of rainfall and antecedent conditions such as snowmelt and rainfall over several weeks. The categories Ls and Ss assume snowmelt is an antecedent condition, while Lr and Sr assume no snowmelt. For this study, seasonal antecedent conditions (snowmelt and/or rainfall) are based on the average annual discharge pattern in Sect. 2. From Fig. 2, the discharge generally returns to near baseflow levels in July. Added to this, the east-west orientation of the Ubaye Valley means that the south-facing slopes will be snow-free earlier than the north-facing slopes. Therefore, the spring events were defined as flash events between March and June for south-facing slopes, and between March and mid-July for north-facing slopes.
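The seasonal subdivision can be expressed as a simple date rule. The sketch below is illustrative only: it assumes that "mid-July" means 15 July and that "June" means up to and including 30 June, neither of which is specified exactly in the text, and the function name is hypothetical.

```python
from datetime import date


def antecedent_season(day, north_facing):
    """Return 's' (spring, snowmelt assumed) or 'r' (no snowmelt assumed).

    Assumed cut-off dates: 30 June for south-facing slopes and
    15 July for north-facing slopes.
    """
    start = date(day.year, 3, 1)
    end = date(day.year, 7, 15) if north_facing else date(day.year, 6, 30)
    return "s" if start <= day <= end else "r"


if __name__ == "__main__":
    # The same day can fall in different categories depending on slope aspect.
    print(antecedent_season(date(2000, 7, 10), north_facing=True))
    print(antecedent_season(date(2000, 7, 10), north_facing=False))
```

The returned letter is the second character of the category label, to be combined with L or S from the rainfall generation type.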
The rainfall generation processes are split into types where local conditions drive the generation and types governed by synoptic atmospheric processes. Done et al. (2006) estimate the timescale over which CAPE is removed by convective heating as
$t_{\mathrm{CAPE}} = \dfrac{\mathrm{CAPE}}{\mathrm{d}\mathrm{CAPE}/\mathrm{d}t}$ (6)
where $t_{\mathrm{CAPE}}$ is the convective timescale and $\mathrm{d}\mathrm{CAPE}/\mathrm{d}t$ is the rate of change of CAPE removed by convective heating. Done et al. (2006) suggest that with convective timescales shorter than 6 h the synoptic conditions are governing the instability of the atmosphere, while locally driven intense convection occurs when $t_{\mathrm{CAPE}}$ values are high. Non-convective precipitation would also have a low $t_{\mathrm{CAPE}}$ value, as CAPE values are generally low (Molini et al., 2011). Applying the criteria of Molini et al. (2011), flash events with $t_{\mathrm{CAPE}}$ > 6 h are classified as locally convective (L), and those with $t_{\mathrm{CAPE}}$ < 6 h as corresponding to more equilibrium conditions (S). Molini et al. (2011) and Done et al. (2006) further modified Eq. (6) by estimating the latent heat release using the precipitation rate. However, as hourly rainfall rates are not available for any weather station before 1998, and Done et al. (2006) explain that this is just a rough indication of the convective timescale, the version in Eq. (6) is used.
The accuracy of the classification of rainfall generation type is dependent on the accuracy of CAPE from ERA-Interim. Molini et al. (2011) found, when comparing CAPE values from ERA-Interim with those from a nearby radiosonde, that there was only modest correlation, with a coefficient of determination of approximately 60 %. Differences would be expected, however, when comparing the grid box average with a point location.
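The classification by convective timescale can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the CAPE removal rate is taken as a given input (how it is derived from the 6-hourly reanalysis fields is not described here), zero CAPE is assigned a timescale of zero, positive CAPE with no removal is assigned an infinite timescale, and a value of exactly 6 h is placed in the S class. All function names are hypothetical.

```python
def convective_timescale_h(cape, cape_removal_rate_per_h):
    """Convective timescale in hours: CAPE divided by its removal rate.

    Assumptions: returns 0 when there is no CAPE, and infinity when CAPE is
    present but is not being removed.
    """
    if cape <= 0:
        return 0.0
    if cape_removal_rate_per_h <= 0:
        return float("inf")
    return cape / cape_removal_rate_per_h


def classify_trigger(t_cape_h, limit_h=6.0):
    """'L' for locally convective events (timescale above the limit), else 'S'."""
    return "L" if t_cape_h > limit_h else "S"


if __name__ == "__main__":
    # Hypothetical values: CAPE in J/kg, removal rate in J/kg per hour.
    t = convective_timescale_h(cape=1200.0, cape_removal_rate_per_h=100.0)
    print(t, classify_trigger(t))
```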

Indicator selection
Each day in the calibration period 1989-2004 is assigned a label as an event day (a day where one or more flash events were recorded) or non-event day (where no flash event was recorded). The atmospheric indicators that show a distinction between event days and non-event days can then be used in the development of atmospheric thresholds (Sect. 3.3). The silhouette index (SI) is used to identify atmospheric indicators that best differentiate between the clusters of flash events and non-flash events. This index takes into account both the separation between the two clusters as well as the cohesion within the cluster (Rousseeuw, 1987). The index was developed as part of a tool to visualize the distinction between multiple clusters, and as a guide to the validity of the clustering and selection of number of clusters (Rousseeuw, 1987).
It has since been used as a validation tool in classifying atmospheric conditions (e.g. Huth et al., 2008; Kannan and Ghosh, 2011; Kenawy et al., 2013).
An individual silhouette value determines how similar a point is to other points in its own cluster compared to points in other clusters (Rousseeuw, 1987). The SI is then the average of all the silhouette values (Huth et al., 2008), with Eq. (7) valid for two clusters:
$\mathrm{SI} = \dfrac{1}{n_1 + n_2} \sum_{c=1}^{2} \sum_{i=1}^{n_c} \dfrac{b_i - a_i}{\max(a_i, b_i)}$ (7)
where $n_c$ is the number of observations in cluster $c$, $b_i$ is the average Euclidean distance between an observation $i$ and all observations in the other cluster, and $a_i$ is the average Euclidean distance between $i$ and all observations in the same cluster.
The SI varies between −1 and 1. An individual silhouette value of 1 indicates that the observation is correctly classified as a flash or non-flash event, while a near-zero value indicates that the observation could belong to either cluster, and negative values indicate misclassification (Ansari et al., 2011). The highest SI indicates the best clustering (Ansari et al., 2011). An overall SI value of 1 means that the clusters are compact and well separated from each other (Kenawy et al., 2013).
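The silhouette index for two clusters can be computed directly from the definitions of a_i and b_i above. The following is a minimal sketch using only the Python standard library; it assumes that each observation is a tuple of already normalized indicator values, and it assigns a silhouette value of zero to an observation that is alone in its cluster, following the convention of Rousseeuw (1987). The function names and sample points are hypothetical.

```python
import math


def _mean_distance(point, others):
    """Average Euclidean distance from one observation to a list of others."""
    return sum(math.dist(point, other) for other in others) / len(others)


def silhouette_index(cluster_a, cluster_b):
    """Mean silhouette value over all observations in two clusters."""
    values = []
    for own, other in ((cluster_a, cluster_b), (cluster_b, cluster_a)):
        for i, point in enumerate(own):
            rest = own[:i] + own[i + 1:]  # same cluster, excluding the point itself
            if not rest:
                values.append(0.0)  # singleton cluster convention
                continue
            a_i = _mean_distance(point, rest)
            b_i = _mean_distance(point, other)
            largest = max(a_i, b_i)
            values.append((b_i - a_i) / largest if largest > 0 else 0.0)
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical normalized (indicator 1, indicator 2) pairs.
    event_days = [(0.9, 0.8), (0.8, 0.9), (1.0, 0.7)]
    non_event_days = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3)]
    print(silhouette_index(event_days, non_event_days))
```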
A worked example of the SI for floods in the Ubaye River is given (Fig. 3). Days with high discharge values (flood days) are compared with no-flood days. The no-flood days chosen had similar event and antecedent rainfall amounts as the flood days.
The SI value is less reliable for clusters when there is a large difference between the number of flash and non-flash events. Therefore, x days are randomly selected to calculate the SI using the normalized atmospheric variables, where x is the number of flash events. This is repeated multiple times (10 000), with variables with the highest mean SI value selected for threshold analysis. Any atmospheric indicators that had more than 10 % of SI values less than zero were discarded. In Sect. 4.2, only the mean SI value is given.
As conventional thresholds are generally defined using two variables, the analysis is performed with the two best-performing indicators. Furthermore, too many indicators could create noise, or lead to over-fitting of the data (Kenawy et al., 2013). The degree of correlation between atmospheric predictors also reduces the benefit of using many predictors (Hewitson and Crane, 2006). However, where the inventory of flash events is more substantial, three or more atmospheric variables could be used to improve the atmospheric threshold.
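The balanced resampling described above can be sketched as follows. This is one possible reading of the procedure: it assumes that the x randomly selected days are drawn from the non-event days, so that both clusters have the same size in each repetition. The scoring function is passed in as an argument (for example, a silhouette index implementation), and all names are hypothetical.

```python
import random


def resampled_score(event_days, non_event_days, score_fn, n_repeats=10000, seed=0):
    """Mean score over repeated balanced subsamples of the non-event days.

    Returns (mean score, share of negative scores, keep flag). The indicator
    pair is kept only if at most 10 % of the scores are below zero.
    """
    rng = random.Random(seed)
    x = len(event_days)  # number of flash event days
    scores = []
    for _ in range(n_repeats):
        subsample = rng.sample(non_event_days, x)  # x non-event days, no repeats
        scores.append(score_fn(event_days, subsample))
    mean_score = sum(scores) / n_repeats
    negative_share = sum(1 for s in scores if s < 0) / n_repeats
    return mean_score, negative_share, negative_share <= 0.10


if __name__ == "__main__":
    # Hypothetical one-dimensional data and a toy score (difference of means).
    events = [(0.9,), (0.8,), (0.7,)]
    non_events = [(0.1,), (0.2,), (0.3,), (0.4,), (0.5,)]
    toy = lambda ev, non: sum(p[0] for p in ev) / len(ev) - sum(p[0] for p in non) / len(non)
    print(resampled_score(events, non_events, toy, n_repeats=100))
```

Passing the scoring function as an argument keeps the resampling logic independent of the particular separation measure.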

Probabilistic and static thresholds
Bayes' theorem expresses the conditional probability of an event A, such as a flash event, occurring given some condition or conditions, B, such as atmospheric conditions (Eq. 8). It is based on the unconditional probability of A occurring, P(A), the unconditional probability of the condition occurring, P(B), and the conditional probability P(B|A):
$P(A|B) = \dfrac{P(B|A)\,P(A)}{P(B)}$ (8)
Using the two indicators from Sect. 3.2 that had the highest SI value, the probability of a flash event occurring was calculated over the observed range of each of the indicators. This is similar to Berti et al. (2012), although extended to using atmospheric indicators. A limitation of using probability of occurrence is that it does not take into account the percentage of flash events above the threshold. Therefore, a static threshold is also determined considering both the number of events above and below the threshold. A static threshold is taken to be a threshold where the values of the indicators remain constant. The indicators used for the static threshold are the same as for the probabilistic threshold.
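In practice the three probabilities in Bayes' theorem can be estimated from simple day counts. The sketch below assumes that each day in the calibration period has been labelled as an event or non-event day and as satisfying or not satisfying the condition B (for example, both indicators falling within a given range). The function name and the counts in the example are hypothetical and are not taken from the study.

```python
def posterior_event_probability(n_days, n_event_days, n_condition_days, n_both):
    """P(A|B) from day counts, where A is a flash event and B the condition.

    n_both is the number of event days on which the condition also holds.
    """
    if n_days <= 0 or n_event_days <= 0 or n_condition_days <= 0:
        raise ValueError("counts must be positive")
    p_a = n_event_days / n_days          # P(A)
    p_b = n_condition_days / n_days      # P(B)
    p_b_given_a = n_both / n_event_days  # P(B|A)
    return p_b_given_a * p_a / p_b


if __name__ == "__main__":
    # Hypothetical counts: 1000 days, 20 event days, 50 days meeting the
    # condition, 10 of which were event days.
    print(posterior_event_probability(1000, 20, 50, 10))
```

With counts, the expression reduces to n_both / n_condition_days, which provides a quick check on the result.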
A confusion matrix displays the performance of a prediction algorithm, such as a static threshold. The four classifiers in the confusion matrix (Mason and Graham, 1999) are:
- true positives (TP): the number of correctly predicted events
- false positives (FP): the number of events predicted, but where no event occurred
- false negatives (FN): the number of events that were not predicted
- true negatives (TN): the number of days that were correctly predicted as non-events.
These classifiers can then be used to determine the correlation between the predicted and observed results using the Matthews correlation coefficient (MCC; Powers, 2011):
$\mathrm{MCC} = \dfrac{\mathrm{TP} \cdot \mathrm{TN} - \mathrm{FP} \cdot \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}}$ (9)
The MCC is similar to the Pearson product-moment correlation coefficient applied to contingency tables (Powers, 2011). A value of 1 indicates perfect correlation, while zero indicates no relationship and negative values indicate negative correlation. Although to our knowledge the MCC has not been used in rainfall threshold assessment, it has been used in bioinformatics as an assessment tool where there are unequal numbers of events and non-events (Baldi et al., 2000; D'Este and Rahman, 2013).
The MCC is calculated for each combination of atmospheric indicators from the probabilistic threshold. The threshold with the highest MCC value is chosen as the static threshold, with the added condition that at least 50 % of the flash events are also above the threshold. These selection criteria are somewhat subjective, as the optimal threshold will depend on the application.
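The Matthews correlation coefficient follows directly from the four entries of the confusion matrix. A minimal sketch is given below; returning zero when the denominator vanishes (a row or column of the matrix is empty) is a common convention and an assumption here, and the counts in the example are hypothetical.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when the coefficient is undefined
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts with many more non-event days than event days.
    print(matthews_corrcoef(tp=10, fp=40, fn=8, tn=942))
```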
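The selection of the static threshold can be sketched as a search over candidate values of the two indicators. The sketch assumes that a day counts as above the threshold only when both indicators exceed their respective candidate values, and that ties in MCC are resolved by keeping the first candidate found; neither detail is specified in the text. The function names and the sample data are hypothetical.

```python
import math
from itertools import product


def _mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0 when undefined (assumed convention)."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


def select_static_threshold(days, candidates_1, candidates_2, min_hit_rate=0.5):
    """Return (MCC, threshold 1, threshold 2) with the highest MCC.

    days is a list of (indicator 1, indicator 2, is_event) tuples. Candidate
    pairs with fewer than min_hit_rate of the events above them are skipped.
    """
    best = None
    for t1, t2 in product(candidates_1, candidates_2):
        tp = fp = fn = tn = 0
        for x1, x2, is_event in days:
            above = x1 > t1 and x2 > t2
            if above and is_event:
                tp += 1
            elif above:
                fp += 1
            elif is_event:
                fn += 1
            else:
                tn += 1
        if tp + fn == 0 or tp / (tp + fn) < min_hit_rate:
            continue
        score = _mcc(tp, fp, fn, tn)
        if best is None or score > best[0]:
            best = (score, t1, t2)
    return best


if __name__ == "__main__":
    # Hypothetical (1-day rain, 4-day rain, event?) records in mm.
    sample = [(30, 30, True), (25, 40, True), (5, 5, False),
              (10, 3, False), (22, 1, False), (1, 25, False)]
    print(select_static_threshold(sample, [0, 20], [0, 20]))
```

The function returns None when no candidate pair satisfies the minimum hit rate.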

Categorization of events
Table 3 shows the $t_{\mathrm{CAPE}}$ value (Eq. 6) for all separate events in the period 1989-2004. In 66 % of the events, local convection was considered to be the dominant meteorological trigger for flash events in the Ubaye Valley. The observed convective events occurred between 1 June and 23 November (numbers 5 and 13 in Table 3). The synoptic events occurred over a wider range of months, between March and November (numbers 9 and 1 in Table 3).
It is possible that some of the flash events are in the wrong category. Four of the nine synoptic events had no rainfall recorded in at least half of the stations 1-4, which would not be expected with widespread rainfall (numbers 3, 4, 6, 8 in Table 3). However, any misclassification would likely only reduce the efficiency of the clustering (Sect. 4.2), and the significance of the thresholds (Sect. 4.3). Therefore we used the classification as indicated in Table 3 for the subsequent analysis.
[Figure caption (truncated): ... 2 for local convection events using daily values. Any value that was not significant at the p = 0.05 level was given a value of zero.]

Indicator selection
The two best-performing indicators for the local convective events were CAPE and specific humidity at 700 hPa (Q700; Fig. 4). These two indicators showed the highest SI value, 0.32. CAPE especially has been used before as an indicator for intense convection (Marsh et al., 2009), as it indicates atmospheric instability. Q700 is indicative of low-level moisture, which is also necessary for locally generated precipitation.
Comparatively, the U and V winds showed very low SI values, indicating that wind conditions do not separate flash event days from non-event days. This was also true for DLS and soil moisture (SWL). The vertical integral of water vapour flux was also trialled; however, the SI value was also low (not shown). Temperature, vorticity and divergence showed moderate SI values, between 0.1 and 0.25 depending on which other atmospheric indicator each was paired with. The moderate SI values separate the flash events from the non-event days somewhat, but not as much as CAPE and Q700.
Figure 5 (top panel) shows that for all the synoptic events, only 10 indicator combinations were significant (at p = 0.10). To improve the indicator selection, the SI was calculated again, further splitting the events into the Ss and Sr categories (Fig. 5, middle and bottom). However, this meant that there were only 4 to 5 flash events in each group. Therefore, any thresholds were unlikely to be as robust as for the local convection and weather stations, as there were fewer events to both calibrate and validate the thresholds.
[Figure 5 caption (partial): Bottom panel: the SI value for each pair of atmospheric indicators for Sr (using daily value and 8-day average). Any value that was not significant at the p = 0.10 level was given a value of zero.]
Splitting the synoptic events into the Sr and Ss categories showed differences between the atmospheric indicators with the highest SI (Fig. 5, middle and bottom). For Ss events, temperature at multiple pressure levels separated days with flash events from days with no flash events. This was in combination with 8-day average mid-level divergence, temperature, CAPE or specific humidity. The highest SI value of 0.21 was for temperature (3-day) and specific humidity (8-day), both at 700 hPa. These two indicators were then used as the basis of the thresholds in Sect. 4.3. For the Sr flash events, the significant indicators were divergence at 850 hPa (daily), low-level specific humidity, SWL, and 8-day average temperature (Fig. 5). The highest SI of 0.42 for the Sr flash events corresponded to specific humidity and 8-day average temperature. Low-level moisture (Q700 and Q850) again appeared to be a key atmospheric indicator. Low-level temperature was also a key indicator, although only when Ss and Sr events were separated (Fig. 5).
Finally, for the local weather station data, the highest SI value of 0.29 was for the 4-day and daily total rainfall based on the data from station 3. Other stations and combinations of stations were tried, but all had lower SI values. These indicators were similar to those for debris flows in Jaiswal and van Westen (2009). Intensity and duration indicators are not used, as hourly data were not available before 1998. Also, previous attempts using daily data showed that all flash events were below the thresholds in Eqs. (2) and (3).

Weather station thresholds
Based on Bayes' theorem and 1-and 4-day rainfall totals, there was increasing chance of flash events with higher rainfall totals.The highest probability of a flash event was 17 % when the 1-day total is above 80 mm and the 4-day above 96 mm (Fig. 6).This is lower than the maximum probability found in the study by Berti et al. (2012) of 40-60 %.
While Fig. 6 seems reasonable (more precipitation, more likely for a flash event to occur), there are a few limitations.There are 9 days with precipitation totals above 82 mm where no flash event was recorded and hence zero probability of flash occurrence.The lack of recorded events may have been because of low precipitation intensity, or the amount recorded by the rain gauge was much higher than for the rest of the study area.Spatial heterogeneity of rainfall may also be the reason why during the calibration period no precipitation was recorded for one flash event, and less than 10 mm for a further six flash events.From Fig. 1, it can be seen that the related torrents were in some instances more than 10 km away from a rain gauge, which is especially problematic for localized convection where the precipitation is confined to an area of less than 10 km 2 .
For the static threshold, the maximum MCC value during the calibration period, with at least 50 % of events above the threshold, corresponded to the following weather station threshold:
- Thres WS: 1-day precipitation > 20 mm and 4-day antecedent precipitation > 22 mm.
The values for the static threshold are given in Table 4. Only 8.5 % of the total number of days were above Thres WS ((TP + FP)/(FN + TN)), while 55 % of the flash events were above Thres WS (TP/(TP + FN)). Somewhat surprisingly, 45 % of the event days had less than 20 mm of rainfall. The percentage of the total number of days above Thres WS was slightly lower for the two validation periods (7.5 and 6.1 %, respectively), and the percentage of flash events dropped even more (35.7 and 33.3 %, respectively). While the likelihood of a flash event remained higher for days above the static threshold in the validation periods, the drop in the percentage of flash events above the threshold indicates differences in the triggering conditions between the calibration and validation periods. The torrents in which flash events occurred were generally closer to station 3 in the earlier validation period than in the calibration period.
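The selection of a static threshold can be sketched as a grid search over candidate limits that keeps the pair with the highest MCC, subject to at least 50 % of events lying above the threshold. The sketch below assumes the standard Matthews correlation coefficient; the function names, the candidate grids and the ten-day record are hypothetical and serve only to show the mechanics.

```python
import math

def confusion_counts(above, event):
    """Count TP, FP, FN and TN for daily exceedance flags and event flags."""
    tp = sum(a and e for a, e in zip(above, event))
    fp = sum(a and not e for a, e in zip(above, event))
    fn = sum((not a) and e for a, e in zip(above, event))
    tn = sum((not a) and (not e) for a, e in zip(above, event))
    return tp, fp, fn, tn

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def best_static_threshold(x, y, event, x_grid, y_grid, min_hit_rate=0.5):
    """Return (MCC, x limit, y limit) with the highest MCC, or None."""
    best = None
    for xt in x_grid:
        for yt in y_grid:
            above = [xi > xt and yi > yt for xi, yi in zip(x, y)]
            tp, fp, fn, tn = confusion_counts(above, event)
            # Require a minimum share of events above the threshold
            if tp + fn == 0 or tp / (tp + fn) < min_hit_rate:
                continue
            score = mcc(tp, fp, fn, tn)
            if best is None or score > best[0]:
                best = (score, xt, yt)
    return best

# Hypothetical record of ten days (rainfall totals in mm)
r1 = [0, 5, 25, 30, 90, 12, 40, 85, 3, 22]
r4 = [2, 10, 30, 45, 100, 15, 60, 120, 5, 23]
event = [False, False, True, False, False, False, True, False, False, False]

best = best_static_threshold(r1, r4, event, [10, 20, 50], [10, 22, 50])
print(best)
```

Ties are resolved here by keeping the first candidate encountered, which is an arbitrary choice; with a real inventory a tie-breaking rule such as the fewest false positives may be preferable.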
The results for the static threshold are comparable to those from other studies. Cepeda et al. (2010) found for the same study area that their threshold was exceeded on average 8.6 times per year, while 60 % of debris flows were above the threshold (if including all debris flows between 1998 and 2010). While the percentage of correctly predicted events is similar, the percentage of false positives is only a third of the amount using Eq. (4). The better performance of the rainfall threshold using hourly data from station 1 indicates that rainfall intensity, rather than daily amount, is important. The daily total of 20 mm was in the range of Meyer et al. (2012), between 15 and 107 mm day−1. The probability of static threshold exceedance was also similar to Meyer et al. (2012), whose threshold was exceeded between 0 and 77 days in a year (8.5 % corresponds to 31 days a year).
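The conversions between exceedance percentages and days per year used in this comparison are simple arithmetic. The lines below assume a 365-day year and treat each exceedance reported by Cepeda et al. (2010) as one day, which is an assumption made here for illustration:

```latex
\[
0.085 \times 365\ \text{days} \approx 31\ \text{days yr}^{-1},
\qquad
\frac{8.6\ \text{days yr}^{-1}}{365\ \text{days yr}^{-1}} \approx 0.024 = 2.4\,\%.
\]
```

On this basis the hourly threshold would be exceeded on roughly 2.4 % of days, compared with 8.5 % of days for Thres WS.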

Atmospheric thresholds: local convection
Flash events during the summer and autumn period are more likely under high instability (CAPE) and high 700 hPa specific humidity (Fig. 7). As both the instability of the atmosphere and low-level moisture increase in Fig. 7, the probability of a flash event also increases. The highest probability (100 %) corresponds to CAPE values above 1100 J kg−1 and normalized Q700 greater than 1.45, although this has only been observed once between 1989 and 2003.
For the static threshold, the maximum MCC value during the calibration period corresponded to the following threshold:
- Thres L: CAPE > 250 J kg−1 and normalized specific humidity at 700 hPa > 0.40.
The confusion matrix results and MCC values are shown in Table 4. From this table it can be seen that 6.8 % of the days are above Thres L, compared with 75 % of local convective flash events. In the validation periods, the percentage of days above Thres L rises to 7.8 % (validation 1) and 7.3 % (validation 2), with 71 and 80 % of the local convection flash events above the threshold. Compared with the results in Sect. 4.3.1, both the probability threshold and the static threshold perform better for local convection than for the weather station. In both validation periods, more flash events were above Thres L than Thres WS, with an even smaller number of FPs in the first validation period. A lower number of FPs is important for early warning systems, where the number of false alarms should be minimized.
While the CAPE value in Thres L was low for intense convection, similar limits have been found in other studies (e.g. Niall and Walsh, 2005, for hail; Pistotnik et al., 2011, for heavy rainfall). Trapp et al. (2009) also found that availability of low-level water vapour was a key component of changes in severe convection at mid-latitudes.

Atmospheric thresholds: synoptic, spring
Figure 8 shows, for Ss indicators, that with warmer 700 hPa temperatures and higher specific humidity the probability of flash event occurrence increases. Warm low to mid-level temperatures could be associated with melting of snow and high moisture levels could indicate rain. Figure 8 had similar probabilities of occurrence compared to Thres WS, with the highest probability of occurrence of 12.5 %. Similar to Fig. 6, the most extreme days (days with the highest moisture and warmest temperature) were not associated with flash events.
Using the criteria in Sect. 3.3 resulted in the following threshold:
- Thres Ss: 3-day mean temperature at 700 hPa > 271 K and 8-day mean normalized specific humidity at 700 hPa > 0.70.
The values for the confusion matrix and MCC are in Table 4. Only 4.3 % of days are above Thres Ss, and 50 % of the flash events. In the validation periods, the percentage of days above the threshold increased to 7.4 % (validation 1) and 7.2 % (validation 2), while only 1 of the 3 days in the first validation period was above the threshold. In the second validation period, there were no events in this category.
As was the case for Thres Ss, if the 3-day average temperature at 700 hPa (lower troposphere) is above 271 K, then the majority of the study area would be at above-freezing temperatures. While snow could still fall at the highest elevations, it is likely that it would rain in lower regions, and that any snow on the ground may melt. The second requirement of Thres Ss, specific humidity at 700 hPa being higher than normal, also indicated possible rainfall. Therefore, Thres Ss indicated possible snowmelt and rainfall as triggers for flash events.
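The link between a 700 hPa temperature of 271 K and above-freezing conditions over most of the terrain can be checked with a rough estimate of the freezing level. The sketch below is not part of the study's method. It assumes that the 700 hPa surface lies near 3000 m and that temperature increases downwards at a constant 6.5 K km−1; both are standard-atmosphere approximations introduced here, and the function name `freezing_level_m` is hypothetical.

```python
def freezing_level_m(t700_k, z700_m=3000.0, lapse_k_per_m=0.0065):
    """Estimate the height (m) at which the temperature reaches 273.15 K.

    Extrapolates linearly from the 700 hPa temperature using an assumed
    height of the 700 hPa surface and an assumed constant lapse rate.
    """
    return z700_m + (t700_k - 273.15) / lapse_k_per_m

level = freezing_level_m(271.0)
print(f"estimated freezing level: {level:.0f} m")
```

Under these assumptions, terrain below the estimated level would be above freezing when the threshold temperature is met, which is in line with rain in the lower regions and snow only at the highest elevations.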
While both Fig. 8 and Thres Ss made physical sense, the atmospheric threshold for synoptic-spring events did not perform well in the validation periods.This may have been due to the small number of events, and the number of indicators trying to capture the atmospheric triggering conditions.

Atmospheric thresholds: synoptic, summer
Synoptic flash events in summer generally occurred with 8 days of lower than normal temperature at 700 hPa, and increased specific humidity at 850 hPa (Fig. 9). As Sr flash events are associated with colder temperatures, compared to warmer temperatures for Ss flash events, this explains why T700 does not have a significant SI value when Sr and Ss are grouped together (Sect. 4.2). The probability of occurrence for this category was lower than any of the previous groups, with a maximum of 2 %.
The Sr static threshold using the above atmospheric indicators corresponded to the following threshold:
- Thres Sr: normalized specific humidity at 850 hPa > 0.15 and 8-day mean normalized temperature at 700 hPa < −0.40.
The final group of values in Table 4 shows the performance of the above threshold. During the summer (July-November), 9.3 % of days were above Thres Sr, and 60 % of synoptic summer flash events. However, the percentage of days above the threshold dropped in the two validation periods (8.1 and 5.9 %), and no flash events were above the threshold.
Colder temperatures during a summer synoptic flash event are not unreasonable. Lower temperatures in summer may be associated with a front passing or cooler temperatures from prolonged cloud cover (and potentially rainfall). Similar to the other three atmospheric categories, high specific humidity indicated higher atmospheric moisture and more likely rain. However, Thres Sr was unsuccessful in the validation period. It could be that different synoptic conditions lead to flash events in the two validation periods, or that the events were misclassified.
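For reference, the four static thresholds reported in Sect. 4.3 can be written as simple exceedance checks. This is a minimal sketch: the function and argument names are hypothetical, the inputs are assumed to be already averaged and normalized as described for each indicator, and the seasonal applicability of each threshold is not encoded.

```python
def above_thres_ws(p1_mm, p4_mm):
    """Weather station: 1-day precipitation and 4-day antecedent precipitation (mm)."""
    return p1_mm > 20 and p4_mm > 22

def above_thres_l(cape_j_kg, q700_norm):
    """Local convection: CAPE (J kg-1) and normalized specific humidity at 700 hPa."""
    return cape_j_kg > 250 and q700_norm > 0.40

def above_thres_ss(t700_3day_k, q700_norm_8day):
    """Synoptic, spring: 3-day mean T at 700 hPa (K), 8-day mean normalized Q at 700 hPa."""
    return t700_3day_k > 271 and q700_norm_8day > 0.70

def above_thres_sr(q850_norm, t700_norm_8day):
    """Synoptic, summer: normalized Q at 850 hPa, 8-day mean normalized T at 700 hPa."""
    return q850_norm > 0.15 and t700_norm_8day < -0.40

# Hypothetical indicator values for a single day
flags = {
    "WS": above_thres_ws(25.0, 30.0),
    "L": above_thres_l(300.0, 0.5),
    "Ss": above_thres_ss(270.0, 0.8),
    "Sr": above_thres_sr(0.2, -0.5),
}
print(flags)
```

Note that Thres Sr is the only threshold with a "less than" condition, reflecting the association of summer synoptic events with colder than normal 700 hPa temperatures.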

General discussion
As with any empirical threshold, accuracy and completeness of the inventory and weather data are important. During the classification and subsequent threshold analysis, it is possible that flash events were misclassified. The spatial and temporal resolution of ERA-Interim was not fine enough to explicitly resolve convection. Therefore, parameterization schemes are used, with Dee et al. (2011) showing improvements in the convection parameterization from earlier reanalysis products. Furthermore, as the CAPE values take into account instability over the depth of the troposphere, CAPE values may be underestimated when convection is confined to a shallow layer (Niall and Walsh, 2005). As found in Sect. 4.1, it is likely that some events may have been misclassified as local convection or as synoptic.
The number of flash events limits the inferences that can be drawn from the results of this paper. The difficulty of developing atmospheric thresholds with few calibration events was borne out by the synoptic thresholds failing to capture the synoptic flash events in the validation period. However, for the convective flash events, the atmospheric threshold still captured 75 % of events in the validation periods. Furthermore, grouping all 63 flash events together, the atmospheric threshold still performs better than the weather station threshold, although by a smaller margin.
Atmospheric thresholds, like most empirical thresholds, are reliant on near-complete inventories, and only speculations can be made about what may happen under unobserved conditions. Therefore, these methods cannot completely replace physically based models and other threshold analysis techniques. However, for the Ubaye Valley, where local convection appears to be the main meteorological trigger of flash events, the atmospheric threshold improves on the local rainfall threshold. This methodology therefore has the potential to work in other areas where rainfall observations are not available, or not complete enough for the traditional empirical rainfall threshold.

Summary and conclusions
The objective of this research was to develop empirical thresholds for rainfall-triggered debris flows and flash floods using atmospheric indicators for the Ubaye Valley, France. Similar to rainfall thresholds, these thresholds could be used in risk assessment, early warning systems, or climate change projections. Using both atmospheric indicators and weather station data, two types of thresholds were obtained: a probability threshold and a static threshold, based on classification statistics and specifically the MCC value. The main conclusions are as follows:
- In general the atmospheric indicators performed better than the weather station threshold (average MCC value of 0.16 compared with 0.10, and higher probability of occurrence). They also performed better than rainfall thresholds using hourly data.
- The most important atmospheric indicators were CAPE and specific humidity at 700 hPa. Both fit with convective precipitation being the main driver.
- Intense locally driven convection appears to be the main meteorological trigger for flash events in the study area (over 66 % of events). Under these conditions, precipitation can be confined to a small area, which may explain why high precipitation values were not always recorded by the local weather stations.
- Even though the atmospheric thresholds performed better, there was still a high level of uncertainty in both the probabilistic thresholds and the static thresholds. This was especially true for the synoptic rainfall events.
- The number of observed events limits any statistical inference in the thresholds obtained, although this is partly mitigated by using a validation data set.
- The methodology also needs to be trialled in other locations. It may be that intensity-duration thresholds are more suitable in areas where there is a stronger relationship between the local weather station records and rainfall at the location of the flash events.

Figure 1. The study area including the location of rain gauges and a single river gauging station. Red lines depict the affected torrents where debris flows or flash floods occurred between 1979 and 2010 (map based on Breinl et al., 2013).

Figure 2. Running 30-day mean daily precipitation and discharge for the period 1979-2009, for the Barcelonnette weather station and river gauge in the Ubaye River. The bar graph displays the number of flash floods and debris flows observed between 1979 and 2010.

Figure 3. A worked example of calculating SI values. The right-hand panel plots the specific humidity at 850 hPa and U wind at 850 hPa for five flood days and six non-flood days. These values were then used to derive the individual silhouette values in the plot on the left.

Figure 4. The SI value for each pair of atmospheric indicators in Table 2 for local convection events using daily values. Any value that was not significant at p = 0.05 level was given a value of zero.

Figure 5. Top panel: the SI value for each pair of atmospheric indicators for all synoptic events using the daily value and the mean value over 10 days. Middle panel: the SI value for each pair of atmospheric indicators for the Ss events (3-day and 8-day averages). Bottom panel: the SI value for each pair of atmospheric indicators for the Sr events (daily value and 8-day average). Any value that was not significant at p = 0.10 level was given a value of zero.

Figure 6. Probability of a flash event based on 1- and 4-day precipitation totals from a local rain gauge. Dark blue indicates zero probability of occurrence.

Figure 7. Probability of a local convection flash event based on atmospheric indicators CAPE and normalized specific humidity at 700 hPa (between 1989 and 2003).

Figure 8. Probability of a flash event from spring synoptic rainfall based on 8-day mean specific humidity at 700 hPa and 3-day mean temperature at 700 hPa between 1989 and 2003.

Figure 9. Probability of a flash event from summer synoptic rainfall based on 8-day mean temperature at 700 hPa and 1-day normalized specific humidity at 850 hPa between 1989 and 2003. The y axis is inverted to highlight that the figure represents the probability of a flash event given that T700 is less than a particular value and Q850 is greater than a particular value.

T. Turkington et al.: Empirical atmospheric thresholds for debris flows and flash floods

Table 1. Weather station information for the Ubaye Valley. The numbers refer to those in Fig. 1. The usable years show the percentage of years that are homogeneous and have at least 99 % of days where the gauge was working. T = temperature, P = precipitation. The final column is the mean annual total precipitation, where applicable.

Table 2. ERA-Interim variables used in this study, along with abbreviations used. A brief description of each variable is also given.

Table 3. Classification of the flash events in the calibration period 1989-2003. The list contains the date of event, the t CAPE value, and its category: Ls: local rainfall, spring; Lr: local rainfall, summer; Ss: synoptic rainfall, spring; Sr: synoptic rainfall, summer.