Brief communication: Rainfall thresholds based on artificial neural networks can improve landslide early warning

In this communication we show how artificial neural networks (ANNs) can improve the performance of rainfall thresholds for landslide early warning. Results for Sicily (Italy) show that the performance of a traditional power-law threshold based on rainfall event duration and depth, which yields a true skill statistic (TSS) of 0.50, can be improved by ANNs (TSS = 0.59). We then show that ANNs make it easy to add other variables, such as peak rainfall intensity, with a further performance improvement (TSS = 0.64). This may stimulate more research on the use of this powerful tool for deriving landslide early warning thresholds.


Introduction
Landslides triggered by rainfall can cause damage to infrastructure and buildings and, in the worst cases, loss of human life.
Commonly, rainfall thresholds indicating the conditions under which a warning should be issued to protect the population from a possible landslide event are determined using empirical methods that link characteristics of precipitation, such as duration D and mean intensity I or cumulated rainfall H = I D (Guzzetti et al., 2008). Rainfall thresholds are generally determined by assuming a predetermined parametric equation, which in most cases is a power law. Such a constraint can limit the predictive performance of the thresholds, because the information content of the explanatory variables may not be fully exploited. Artificial neural networks (ANNs), a family of artificial intelligence or machine learning techniques, are a very flexible tool that can potentially remove this limitation of predetermined parametric threshold forms, as they are capable of reproducing a vast range of non-linear classifiers (Haykin, 1999).
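As a short worked relation, and assuming a power-law intensity-duration threshold with hypothetical coefficients a and b (symbols introduced here only for illustration), multiplying by the duration gives the equivalent depth-duration form, so that the two formulations carry the same information:

```latex
I = a\,D^{-b}
\quad\Longrightarrow\quad
H = I\,D = a\,D^{1-b}
```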
Up to now, a number of studies have used ANNs and other machine learning techniques in landslide risk analysis. Many of them focus on susceptibility mapping and individual slope instability. For instance, Ermini et al. (2005) created a susceptibility map for Riomaggiore (Italy), comparing two different types of ANNs: multilayer perceptrons (MLP) and probabilistic neural networks (PNN). Melchiorre et al. (2008) used neural networks in combination with cluster analysis for automatically splitting the available dataset into training and validation subsets, with the aim of deriving improved susceptibility maps. Other studies focus on similar applications, taking into account other variables, such as the topographic wetness index (TWI) and stream power index (SPI) (Conforti et al., 2014) or the sediment transport index (STI) and the normalized difference vegetation index (NDVI), using up to 14 different input variables as predisposing factors (He et al., 2019). Finally, in a recent study, Napoli et al. (2021) used ANNs to map landslide susceptibility based on nine predisposing factors, and then combined the results with a simplified model for landslide runout prediction. In other studies, the focus is on the prediction of individual deep-seated landslide displacements by machine learning algorithms using detailed in situ data (Cao et al., 2016; Krkač et al., 2017; Miao et al., 2018). Among these applications, van Natijne et al. (2020) recently described and listed the available parameters for displacement prediction from radar and remote sensing technology (slope, geology, soil moisture, precipitation/snow melt, land use), and how they may be used within a local early warning system based on machine learning techniques, such as ANNs.
As shown in this short literature review, ANNs have been used to create susceptibility maps and/or in local early warning systems, while their application to territorial landslide early warning (Piciullo et al., 2018) has not been investigated so far. In this communication we present our preliminary investigations, showing how ANNs can be used to derive landslide early warning thresholds that perform better than traditional rainfall duration-depth power-law thresholds.

Data and methods
We refer to the case study of Sicily, one of the 20 regions of Italy (Fig. 1). We retrieved hourly rainfall from 306 rain gauges distributed within the region, managed by the Regional Water Observatory (Osservatorio delle Acque, OdA), the SIAS (Sicilian Agro-meteorological Information Service), and the Regional Civil Protection Department (DRPC). Landslide data were taken from the FraneItalia database (Calvello and Pecoraro, 2018), which has been recently updated with landslides that occurred in 2018-2019 (https://franeitalia.wordpress.com/database/, last accessed on 29/06/2021). The information in this database concerns landslides triggered by rain, but also those triggered by anthropogenic causes and earthquakes.

A flow chart of the applied methodology is shown in Fig. 2a. After collecting the data, some preprocessing was carried out. In particular, landslides triggered by causes other than rainfall were removed from our analysis. Suspicious rainfall data were also removed: where hourly rainfall exceeded 250 mm (corresponding to about one third of mean yearly rainfall for Sicily), the series was visually inspected and, in the case of an evident error (rain gauge malfunction), the whole rainfall event surrounding the peak was removed.

https://doi.org/10.5194/nhess-2021-206. Preprint. Discussion started: 12 July 2021. © Author(s) 2021. CC BY 4.0 License.

Pre-processed precipitation and landslide data were input to the CTRL-T (Calculation of Thresholds for Rainfall-induced Landslides-Tool) code. The software, written in R, reconstructs rainfall events and characterizes them by the following variables: duration D, mean intensity I, total depth H = D I and peak intensity $I_p$ (defined as the maximum hourly intensity occurring during a rainfall event). Where multiple rain gauges are available for a given location, the software computes the most probable rainfall conditions associated with each landslide event based on the distance between the rain gauge and the landslide location and on the characteristics of the reconstructed rainfall event. Finally, the code provides power-law H-D thresholds for different levels of non-exceedance frequency of triggering events. The software allows the user to set different values of the parameters used to reconstruct rainfall events in order to take into account seasonality, i.e. different average evapotranspiration rates in different periods of the year. In particular, following the study by Melillo et al. (2016), we assumed that in the warm season $C_W$ (April-October) the minimum dry period separating two rainfall events is $P_{4,\mathrm{warm}}$ = 48 hours, while in the cold season a longer period is assumed ($P_{4,\mathrm{cold}}$ = 96 hours). The rain gauge sensitivity is $G_s$ = 0.1 mm. A binary code has been attributed to each rainfall event, flagging triggering events as a target with a value of 1 and non-triggering events with a value of 0. Application of the CTRL-T software yielded 144 triggering rainfall events and 47398 non-triggering events.
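The event variables described above can be illustrated with a minimal Python sketch. It is not the CTRL-T algorithm, which is written in R and involves further steps; it only assumes that events are separated by a minimum number of consecutive dry hours and that readings below the gauge sensitivity count as dry. The names `RainEvent`, `split_events` and `min_dry_period_h` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RainEvent:
    start_index: int        # position of the first wet hour in the series
    duration_h: int         # D: hours from first to last wet hour, inclusive
    depth_mm: float         # H: total rainfall depth
    mean_intensity: float   # I = H / D, in mm/h
    peak_intensity: float   # I_p: largest hourly value, in mm/h


def min_dry_period_h(month):
    """Minimum dry gap: 48 h in the warm season (April-October), else 96 h."""
    return 48 if 4 <= month <= 10 else 96


def split_events(hourly_mm, min_dry_h, sensitivity_mm=0.1):
    """Split an hourly series into events separated by >= min_dry_h dry hours."""
    # Assumption: readings below the gauge sensitivity are treated as dry.
    clean = [v if v >= sensitivity_mm else 0.0 for v in hourly_mm]
    events = []
    start = last_wet = None
    for i, value in enumerate(clean):
        if value == 0.0:
            continue
        # Close the running event if the dry gap before this hour is long enough.
        if start is not None and i - last_wet - 1 >= min_dry_h:
            events.append(_summarise(clean, start, last_wet))
            start = None
        if start is None:
            start = i
        last_wet = i
    if start is not None:
        events.append(_summarise(clean, start, last_wet))
    return events


def _summarise(clean, start, end):
    """Compute D, H, I and I_p for the hours start..end (inclusive)."""
    window = clean[start:end + 1]
    duration = end - start + 1
    depth = sum(window)
    return RainEvent(start, duration, depth, depth / duration, max(window))
```

For a real series, `split_events(series, min_dry_period_h(month))` would be called with the month of the period being processed; how events spanning the change of season should be handled is left open in this sketch.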
The characteristics of the events were used as input variables to ANNs devised for pattern recognition, as implemented within the Neural Net Pattern Recognition tool in MATLAB. The neural network, characterized by a feed-forward structure (Fig. 2b), is composed of three layers: input, hidden and output. Two different activation functions have been considered: a tan-sigmoid function f(n) for the hidden layer, and a log-sigmoid function for the output layer.

The entire dataset of rainfall events was randomly divided into training, validation and test data sets, in the proportions of 70%, 15% and 15%. This subdivision made it possible to apply the early-stopping criterion to prevent overfitting. According to this criterion, the training of the neural network is stopped when the value of the performance function calculated on the validation data set starts to get worse. The ANNs have been trained through the scaled conjugate gradient backpropagation algorithm, with cross-entropy as the performance function. Denoting the generic ANN output by $y_i$ (taking values in the open interval between 0 and 1) and the binary target by $t_i$, i = 1, 2, ..., N, the cross-entropy function F heavily penalizes inaccurate predictions and assigns minimum penalties to correct predictions.

The ability to distinguish triggering events from non-triggering events was measured using the confusion matrix, from which the true skill statistic (TSS) is computed. Results from ANNs are compared with rainfall duration-depth power-law thresholds derived through the maximization of TSS, i.e., again analysing both triggering and non-triggering events.
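The ingredients named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MATLAB implementation used in the study: the tan-sigmoid and log-sigmoid follow their standard definitions, the cross-entropy is written in its usual binary form (the exact expression used by the authors is not reproduced here), the TSS is taken as true positive rate minus false positive rate, and the 0.5 cutoff on the network output is an assumed value. All function names are hypothetical.

```python
import math


def tansig(n):
    # Tan-sigmoid: 2 / (1 + exp(-2n)) - 1, which equals tanh(n).
    return math.tanh(n)


def logsig(n):
    # Log-sigmoid: 1 / (1 + exp(-n)), with values in the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-n))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer (tansig) followed by a single logsig output neuron."""
    hidden = [
        tansig(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return logsig(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def cross_entropy(targets, outputs):
    """Mean binary cross-entropy (assumed form of the performance function)."""
    total = sum(
        t * math.log(y) + (1 - t) * math.log(1 - y)
        for t, y in zip(targets, outputs)
    )
    return -total / len(targets)


def true_skill_statistic(targets, outputs, cutoff=0.5):
    """TSS = true positive rate - false positive rate."""
    tp = fn = fp = tn = 0
    for t, y in zip(targets, outputs):
        warned = y >= cutoff  # assumed decision rule on the network output
        if t == 1 and warned:
            tp += 1
        elif t == 1:
            fn += 1
        elif warned:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn) - fp / (fp + tn)
```

With a data set as unbalanced as the one described (144 triggering against 47398 non-triggering events), a measure built from the two rates separately, as TSS is, is not dominated by the majority class in the way overall accuracy would be.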

Results and discussion
Application of the CTRL-T software made it possible to build the dataset of triggering and non-triggering events and to derive the threshold according to the so-called frequentist method (based on triggering events only). Considering a non-exceedance frequency for triggering events equal to 5%, the threshold from the software is as follows: [...]

This threshold is lower than the one obtained for Sicily by Gariano et al. (2015), yet comparable with an updated one derived by Melillo et al. (2016) through an earlier version of the algorithm that was then implemented in the CTRL-T software.
Specifically, the thresholds reported in the two studies mentioned are respectively the following (non-exceedance frequency is again 5%): [...]

These thresholds, however, are not comparable with those derived with the proposed ANN approach, because non-triggering events are neglected. We have hence derived the power-law threshold corresponding to the maximum TSS, obtaining the following result:

$$H = 2.40\,D^{0.68} \qquad (10)$$

which has TSS = $\mathrm{TSS}_0$ = 0.50. This threshold has a lower intercept but a higher slope, so that, beyond a duration of about 5 hours, it lies above the one given in Eq. 7.

[...] value, i.e., the one yielding the highest TSS. Table 1 shows the results obtained from the tested neural network configurations. In particular, for each set of input variables, the table reports the optimal number of hidden neurons, i.e. the one corresponding to the maximum TSS for the entire data set (third column). The subsequent columns of the table show the TSS for the training, validation and test data sets, with respect to the reported number of hidden neurons. As can be seen, for most of the input configurations, the TSS for the test and validation data sets is generally quite close to, if not greater than, the TSS for the training data set. This indicates that overfitting has been sufficiently prevented thanks to the early-stopping criterion; otherwise the performance on the training data set would have been significantly higher than that on the test data set. Hence, in the following discussion we refer to the TSS computed on the entire data set.

As can be seen from Table 1, with only one input variable the performance is lower than that obtained with the power-law threshold of Eq. 10; however, for the variable with the highest information content, mean rainfall intensity I, the TSS = 0.45 is quite close to $\mathrm{TSS}_0$ = 0.50. When input variables are used in pairs, performance increases significantly.
Notably, in the case of the pairs D-I and D-H, i.e., the same variables used for the power law, TSS = 0.59, which is significantly higher than $\mathrm{TSS}_0$. The fact that with the same input data the neural network performs significantly better than the power law indicates that the use of a predetermined parametric form for the threshold equation does not fully exploit the information content of the input variables, while the flexibility of ANNs allows a better classification to be achieved.
Finally, adding a third variable (network input D-H-$I_p$), a further improvement is obtained (TSS = 0.64). This result demonstrates how neural networks can help in searching for additional variables that can provide a more reliable dynamic prediction of landslide triggering conditions. In particular, in this case, peak intensity appears to carry important information, an aspect that has perhaps not been sufficiently investigated in the literature.
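The power-law baseline used in this comparison (Eq. 10) is obtained by maximizing the TSS over both triggering and non-triggering events. A minimal sketch of such a search is given below, assuming a threshold of the form H = a D^b, a warning whenever the event depth is at or above the threshold, and a plain grid search over candidate coefficients. The optimization procedure actually used by the authors is not specified here, and the names `tss_of_threshold` and `best_power_law` are hypothetical.

```python
def tss_of_threshold(events, a, b):
    """TSS of the rule 'warn if H >= a * D**b'.

    events: iterable of (duration_h, depth_mm, triggered) with triggered in {0, 1}.
    """
    tp = fn = fp = tn = 0
    for duration_h, depth_mm, triggered in events:
        warned = depth_mm >= a * duration_h ** b
        if triggered and warned:
            tp += 1
        elif triggered:
            fn += 1
        elif warned:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one triggering and one non-triggering event")
    return tp / (tp + fn) - fp / (fp + tn)


def best_power_law(events, a_grid, b_grid):
    """Return (tss, a, b) for the grid point with the highest TSS.

    Ties are resolved in favour of the first grid point encountered.
    """
    best = None
    for a in a_grid:
        for b in b_grid:
            score = tss_of_threshold(events, a, b)
            if best is None or score > best[0]:
                best = (score, a, b)
    return best
```

A call such as `best_power_law(events, a_grid, b_grid)` with finely spaced grids would approximate the TSS-maximizing coefficients; the same `tss_of_threshold` function could also be used to score a frequentist threshold on the full set of events.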

Conclusions
The identification of rainfall thresholds indicating landslide triggering conditions is a key step in implementing territorial landslide early warning systems. Commonly, thresholds are searched for in a limited space, i.e., constrained to a predetermined parametric form, which is generally a power law linking rainfall event duration D and mean intensity I (or total depth H = D I). In this communication we have shown that choosing a predetermined form for the threshold can limit the performance of the empirical model, and that artificial neural networks are a valuable tool to overcome this limitation.
The analysis, referred to the case study of Sicily, has shown that an H-D power-law threshold has a maximum true skill statistic of TSS = $\mathrm{TSS}_0$ = 0.50. On the other hand, the classifier based on neural networks, using the same pair of input variables, yielded a significantly greater TSS = 0.59. It has also been shown that neural networks make it easy to explore the potential information content of other variables, and hence provide a way to improve predictive performance. For instance, the inclusion of peak rainfall intensity as an additional variable led to a further improvement of performance (TSS = 0.64). When training neural networks, it is important that generalization capabilities are ensured, for instance by the early-stopping technique. Overfitting is not an issue for the traditional approach based on the power law, or any other parametric equation, as in general the number of free parameters is very low (2 for a power law). The need to guard against overfitting may be a drawback for neural networks, even though it forces one to consider both triggering and non-triggering events, which is fundamental for obtaining thresholds with acceptable statistical characteristics (Peres and Cancelliere, 2021). Another possible disadvantage of neural networks with respect to predetermined-form thresholds is that it is generally not possible to summarize the neural network classifier as a simple equation. This could hamper the practical implementation of triggering thresholds based on neural networks, which could be perceived as impractical. However, this limitation can potentially be overcome by providing user-friendly software to the end user.

Data availability. Landslide data from the Franeitalia database (Calvello and Pecoraro, 2018) are available from https://franeitalia.wordpress.com/database/ (last accessed on 29/06/2021), while part of the rainfall data is available from the websites of the Servizio Informativo Agrometeorologico Siciliano (SIAS) (http://www.sias.regione.sicilia.it/, last accessed on 05/07/2021) and the Osservatorio delle Acque (http://www.osservatorioacque.it/, last accessed on 05/07/2021).
Acknowledgements. Pierpaolo Distefano's doctoral grant is funded by the "Notice 2/2019 for financing the Ph.D. regional grant in Sicily" as part of the Operational Program of European Social Funding 2014-2020 (PO FSE 2014-2020) CUP E65E19000830002. David J. Peres was supported by the post-doctoral grant on "Sviluppo di modelli per la valutazione di strategie innovative di gestione delle risorse idriche in un contesto di cambiamenti climatici" (Development of models for the evaluation of new strategies for water resources management in a changing climate). The research has been partially conducted within the following projects: LIFE 17 CCA/IT/000115 SimetoRES, funded by the EASME (now CINEA) of the European Commission, and the Programma Operativo Nazionale Governance e Capacità Istituzionale 2014-2020 - Programma per il supporto al rafforzamento della governance in materia di riduzione del rischio ai fini di protezione civile CUP (Program to support the strengthening of governance in the field of risk reduction for civil protection purposes). APCs funded by "fondi di ateneo 2020-2022, Università di Catania, linea Open Access".