Investigating Causal Factors of Shallow Landslides in Grassland Regions of Switzerland

. Mountainous grassland slopes can be severely affected by soil erosion. To better understand the regional differences of soil erosion patterns, we determine the locations of shallow landslides across different sites and aim at identifying their triggering causal factors. Ten sites across Switzerland located in the Alps (8 sites), in foothill regions (1 site), and the Jura mountains (1 site) were selected for statistical evaluations. For the shallow landslide inventory, we used aerial images (0.25 m) with a deep learning approach (U-Net) to map the locations of eroded sites. We used logistic regression with a Group 5 Lasso variable selection method to identify important explanatory variables for predicting the mapped shallow landslides. The set of variables consists of traditional susceptibility modelling factors and climate-related factors to represent local as well as cross-regional conditions. This set of explanatory variables (predictors) are used to develop individual site models (regional evaluation) as well as an all-in-one model (cross-regional evaluation) using all shallow landslide points simultaneously. While the local conditions of the ten sites lead to different variable selections, consistently slope and aspect were selected as the es- 10 sential explanatory variables of shallow landslide susceptibility. Accuracy scores range between 70.2 and 79.8% for individual site models. The all-in-one model conﬁrms these ﬁndings by selecting slope, aspect as well as roughness as the most important explanatory variables (Accuracy = 72.3%). Our ﬁnding suggest that traditional susceptibility variables describing geomorphological and geological conditions yield satisfactory results for all tested regions. However, for two sites with lower model accuracy, important Acknowledgements. This study was funded by the Swiss National Science Foundation (Project No. 167333) as part of the National Re- 340 search Program NRP75 – Big Data. Calculations were performed at sciCORE (http://scicore.unibas.ch/) scientiﬁc computing core facility at University of Basel.

grassland soils showing bare soil areas can be categorised as shallow erosion (Geitner et al., 2021). These shallow erosion sites are mainly triggered by prolonged and intense rainfall events (shallow landslides) or through abrasion by snow (snow gliding, avalanches) (Wiegand and Geitner, 2010;Geitner et al., 2021). However, in many cases, a combination of these processes can lead to shallow erosion sites and triggering processes cannot be distinguished from aerial photos. Therefore, we use the term shallow landslides in our regions and the frame of this study with no implication of the triggering event. The aim of our study is to statistically evaluate shallow landslide occurrence for 10 different sites (between 16 and 54 km 2 ) across Switzerland. In 30 the past, shallow landslide susceptibility studies have mainly focused on one or two study sites while often testing multiple modelling techniques (Gómez and Kavzoglu, 2005;Meusburger and Alewell, 2009;Vorpahl et al., 2012;Tien Bui et al., 2016;Oh and Lee, 2017;Lee et al., 2020;Nhu et al., 2020b) except for Persichillo et al. (2017), who evaluated four sites in different catchments. For our shallow landslide inventory we map the eroded sites on aerial images (0.25 m resolution) using a U-Net deep learning approach (Ronneberger et al., 2015). The U-Net tool was trained by Samarin et al. (2020) to identify 35 and map the extent of soil erosion features on grassland. While this mapping tool is able to distinguish between different erosion processes/appearances (i.e., shallow landslides, livestock trails, sheet erosion and management effects Samarin et al. (2020)), here, we focus on shallow landslides, as we aim to understand their causal factors and spatial patterns better. With the U-Net mapping tool, we can identify locations of shallow landslides in a very efficient and precise manner, increasing the possibilities for mapping but also future model validation of soil erosion studies (Samarin et al., 2020). The mapped shallow 40 landslide sites are subsequently evaluated with a statistical model to identify the most important explanatory variables and gain a better understanding of causal factors as well as regional differences. For this purpose we use the Group Lasso approach for logistic regressions (Tibshirani, 1996;Yuan and Lin, 2006;Meier et al., 2008). The Group Lasso can deal with continuous and categorical variables and is able to estimate coefficients of classes within a categorical variable. In addition to estimating coefficients, the Lasso can do variable selection simultaneously (Section 2.2). The Lasso tends to yield sparse and interpretable 45 models, avoids over-fitting and is tolerant towards possible collinearity of variables (Dormann et al., 2013). Despite these advantages, the Lasso has only been applied a small number of times for landslide susceptibility modelling (Camilo et al., 2017;Lombardo and Mai, 2018;Gao et al., 2020). We evaluate the shallow landslides within each study site (10 models) and across all 10 study sites simultaneously (all-in-one model) and consider only grassland surfaces. Our aim is to identify explanatory variables that have local importance but also identify variables, which may explain regional differences in shallow 50 landslide occurrence. The selected study sites are a combination of alpine (above 1500 m asl), foothill regions (below 1500 m asl) as well as one site in the Jura mountains (below 1500 m asl). The explanatory variables we use are the same for all sites and consist of a combination of classic landslide susceptibility variables (Budimir et al., 2015) as well as climate-related variables (Karger et al., 2017(Karger et al., , 2018, which may aid in explaining regional differences of shallow landslide occurrence (Section 3.2).
To understand how well the selected variables and their coefficients perform, we evaluate the models on held-out test data. 55 We determine Receiver-Operator-Characteristics (ROC) curves and the corresponding Area-Under-Curve (AUC) as well as the Brier score, which is suitable for binary variables (presence/absence shallow landslides) (Section 2.3). asl). The sites located in the Swiss Alps represent a range of alpine (above 1500 m asl) regions as well as foothill regions (Hornbachtal, below 1500 m asl). Val Cluozza is located in the Swiss National park and shows no signs of anthropogenic influences, and also contains only a small amount of grassland area (8%, rest mostly shrubs and rocks). For other sites in the Alps, grassland covers 34-55 % of the valley. The rest of the land-cover consists of forest area, rock/debris area or, in some cases, urban areas. The shallow landslide densities (shallow landslide affected area in relation to total grassland surfaces) range 70 from 0.06% (Baulmes) to 2.31% (Chrauchtal). Figure 1 shows the locations of all study sites within Switzerland and Table 1 summarises important site information.

Shallow Landslides Inventory
To identify the locations of shallow landslides across the 10 study sites, we use a deep learning approach based on the U-Net architecture (Ronneberger et al., 2015). These mapped shallow landslides are then used for statistical evaluations of causal factors (Section 2.2). This fully convolutional neural network approach for semantic segmentation in images allows for objective and efficient mapping. The U-Net model was trained to identify and map erosion sites on aerial images (Swisstopo, 2010) with the aid of digital terrain model information (Swisstopo, 2014), as described in Samarin et al. (2020). The U-Net model was trained on a small area of 9 km 2 and tested on an area of 17 km 2 in the Urseren Valley (Samarin et al., 2020). For this study we use the same U-Net model without further training to map the new study sites and focus only on the erosion 80 class shallow landslides, as defined in the introduction. The mapping results were carefully examined for all study areas and corrected manually when necessary. We only consider shallow landslides of at least 4 m 2 located on grassland grassland (see Figure 3 for example of mapping results).

Logistic Regression with Group Lasso
With the statistical evaluation of the shallow landslide sites, we aim to understand possible causal factors. We evaluate the 85 10 study sites individually (evaluation within each site) as well as across all of the sites simultaneously (all-in-one model).
The aim of this is to test, whether the same causal factors are important on different spatial scales. For each of the 10 sites an Selection Operator (Lasso) (Tibshirani, 1996). The Lasso regression performs variable selection and coefficient estimation simultaneously. This is obtained by applying a penalty term (II.) to the log-likelihood function of the logistic regression (I.) (Hastie et al., 2016): We consider the linear model z β (x) = β 0 + p j=1 β j x j on a data set of size n with p features, i.e. x i ∈ R p , and binary response The penalty term is determined by the parameter λ which is estimated by minimising the model error. The weight of λ determines how many variables are selected, and in turn, the model shrinks coefficients of variables that contribute to the error (Hastie et al., 2009(Hastie et al., , 2016. By shrinking the coefficients of unimportant variables to zero, they are removed from the model and thereby variable selection is performed. To achieve the least complex model in terms of selected variables, we chose λ to be one standard error larger than the minimal mean square error (Hastie et al., 2009). As some of the explanatory variables 100 are categorical (i.e., geology, aspect) we use the Group Lasso approach. All levels within a categorical variable (encoded as dummy variables) are treated as a group and all coefficients within that group become zero (dismissed) or non-zero (selected) simultaneously (Yuan and Lin, 2006;Hastie et al., 2016). This leads to a new objective function with modified penalty term, where α g is a scaling factor depending on the number of parameters in β g and η K = (η T Kη) 1/2 is a norm depending on the 105 group structure of the G different groups. For more details on the mathematical extension of the Group Lasso we refer to Meier et al. (2008). We implement the Group Lasso for logistic regression with the R-package grpreg (Breheny and Huang, 2015).
Due to the spatial relationship of geographic data sets, we divide the data into spatially separated blocks of 1 km 2 , randomly numbered from 1 to 5 (Valavi et al., 2019) (see Figure 2). These blocks are used for 5-fold cross-validation of the model.  (Lombardo and Mai, 2018). The process of coefficient estimation is repeated 20 times (bootstrapping) with different randomly selected blocks, generating 100 estimates of coefficients for every site (20 times 5-fold cross-validation). We assess the model-selected coefficients by evaluating the range of the coefficient estimates (boxplots) as well as their inclusion rate (number of times selected by models) as the number of ideal variables can vary in each fold.

120
To evaluate the accuracy and the predictive ability of the logistic regression models, we use performance measures described in the following. All model performances are based on test set estimations (predictions evaluated on held-out test data blocks). prediction threshold of 50 %. To summarise the accuracy of the models, we assess the magnitude of the error in the probability predictions using the Brier score (BS) (Equation 3) (Brier, 1950;Wilks, 2006).

130
where N are the number of mapped shallow landslides, f t are the predicted probabilities for shallow landslide occurrence (between 0 or 1), and o t are the observed (mapped) of shallow landslides (either no=0 or yes=1). The Brier score (BS) is equivalent to the Mean-squared error, yet is valid for binary observations. A BS of zero indicates perfect model performance, while 1 is the worst possible score (prediction is opposite of observation). Probability predictions that are further away from the observation are penalised more heavily. If the model predicts a 50 % chance of shallow landslide every time (random), a score 135 of 0.25 is achieved for a balanced data set (Steyerberg et al., 2010;Raja et al., 2017). We re-estimate the BS with bootstrapping (500 repetitions, sampled with replacement) to achieve confidence intervals.
3 Data Sets

Shallow Landslide and Non-Landslide Points
To perform the mapping of shallow landslide sites with the U-Net model (Section 2.1), we require aerial (ortho-)images Cluozza, Val Piora). From the DTM, the derivatives slope, aspect and curvature (plan and profile) are required, which are calculated with ArcGIS (10.5). Additionally, we use data set with land-cover information (SwissTLM, Swisstopo (2019)) 145 to assure only sites with grassland are being mapped. For the mapped shallow landslides, we extract the centre points with ArcGIS of sites with a minimum size of 4 m 2 . Non-landslide points were extracted randomly within the grassland area and with a minimum buffer distance to mapped shallow landslides of 5 m. This shallow landslide data set contains an equal number of landslide to non-landslide points for each study site ( Figure 3). climate-related variables that may explain differences between the sites (e.g., strong precipitation events) from the CHELSA data set (Karger et al., 2017(Karger et al., , 2018. Variables related to land-cover and vegetation are not considered as we filter our study sites divergence of surface flow and profile curvature describes the acceleration of the surface flow (Zevenbergen and C., 1987).

Explanatory Variables
Roughness expresses the difference between maximum and minimum elevation values between a cell and all of its neighboring cells (Wilson et al., 2007). Higher roughness values indicate rougher terrain. Based on flow direction (direction of the steepest descent) we determine the Flow Accumulation, which describes the number of cells flowing into a cell. The Topographic Wetness Index TWI gives indications of where water accumulates on slopes and is calculated with ln(α/ tan β), where α is 170 the upslope area draining through a certain point per unit contour length (Flow Accumulation) and β is the slope (Beven and  Kirkby, 1979). Distance to Roads and Road Density are variables that are often included in landslide susceptibility studies, as they represent constructional interference (Meusburger and Alewell, 2009;Nhu et al., 2020b). Distance to Streams and Stream Density can give further information on rainfall drainage and runoff processes (Nhu et al., 2020b). These variables were calculated based on the SwissTLM data set (Swisstopo, 2019), containing information on road and stream locations using 175 the distance and line density tool (search radius of 500m (Meusburger and Alewell, 2009)) of ArcGIS. In addition to these terrain-related variables, we use variables derived from the CHELSA data set, which contains monthly values on temperature and precipitation from which many environmental parameters are derived (Karger et al., 2017(Karger et al., , 2018. We include the strongest precipitation events of the last 5 years and 10 years prior to the recording year of the aerial images, information on snow fall/cover, growing season length and frost change frequency (5-year average of 2009-2013). While these variables have a 180 comparatively low spatial resolution (30 arc sec, approx. 1 km), they may give a good indication of regional differences of shallow landslide occurrence as they are representative of alpine processes often linked to the triggering of shallow landslides (Meusburger and Alewell, 2008;Wiegand and Geitner, 2010;Löbmann et al., 2020;Geitner et al., 2021). Specifics on the  Steeper slopes tend to be more susceptible to shallow landslides, which is in agreement with other studies that have found slope to be one of their top predictors (Budimir et al., 2015;Goetz et al., 2015;Tien Bui et al., 2016;Oh and Lee, 2017;Persichillo et al., 2017;Lombardo and Mai, 2018;Lee et al., 2020;Nhu et al., 2020b, a).
The aspect was selected most times (84-100 %) for all sites except for Arosa (4 %) and Baulmes (0 %) ( Figure 5)    Frequency. However, these variables were disregarded for some of the sites (low inclusion rates or even excluded completely).  Figures 6 and 7). The boxplots of the estimated coefficients for all 10 sites can be found in the supplemental material (Figures S1 -S10). Chrauchtal ( Figure 6)  side of the Alps, while Val Piora (Figure 7) is located on the south side. They have opposing orientations of the main valley axis (N-S and E-W, see Table 1). Chrauchtal is the site with the highest shallow landslide density (2.31 % with 8073 SLS points), which affects the very high inclusion rates for all explanatory variables ( Figure 5). This also affects the spread of the boxplots, which show small variability of the coefficient values ( Figure 6). With the high number of shallow landslides the variability of coefficients decreases, which means that the Lasso regression estimates very similar coefficient values for all 100  are unfavourable. Roughness is negatively correlated for both sites, meaning that rougher terrain is less favourable to shallow landslides. Variables with smaller coefficients may also be selected often by the Lasso regression. However, these variables tend to have different effects depending on local conditions (e.g. Distance to Roads/Road Density, Elevation or TWI).
To assess the prediction skills of the individual site models, we calculate the ROC curves and the corresponding AUC values (Section 2.3). Curves closer to the top left corner of the plot show models with higher predictive skills (e.g., Urseren, AUC=0.865), while curves closer to the diagonal line have lower predictive skill (e.g., Baulmes, AUC=0.733). Confusion 240 matrix scores summarised in Table 3  Whereas for sites such as Urseren and Val Piora, the available explanatory variables are well suited to describe the mapped shallow landslides. Generally, the number of shallow landslides available at a site does not necessarily affect the mean estimated value of coefficients, but the variability of the estimates is smaller, and the inclusion rates are higher for sites with more data points.  which have a strong influence on the occurrence of shallow landslide (Schauer, 1975;Moser and Hohensinn, 1983;Tasser et al., 2003;Meusburger et al., 2010;Wiegand and Geitner, 2013;Höller, 2014;Leitinger et al., 2018).

Performance of Slope-only model
As the slope is always the most important predictor for shallow landslides in terms of coefficient size and model inclusion

Performance of All-in-one Model
With the all-in-one model, we evaluate whether the same explanatory variables are important for cross-regional evaluations as individual site models (74.1 %). The True Positive Rate lies at 76.3 % and the False Positive Rate at 31.6 %, which is slightly higher than all individual site models. Generally, the individual-site models perform better in most cases, as local conditions are important for the overall accuracy of models. However, the variability of the estimated coefficients of the all-in-one model is relatively low (Figure 11) indicating, that the coefficients were estimated similarly when selected.
The most important variables are comparable to the individual site models, with slope and roughness having the largest 285 coefficients for continuous variables (Goetz et al., 2015). The categorical variables aspect and geology show similar behaviour to the individual site models. The CHELSA climatology variables (Max. Precipitation Events, Snow Days/Snow Cover Days, Growing Season Length and Frost Change Frequency) were originally included with the idea that these might have a stronger impact when doing cross-regional evaluations such as this all-in-one model. From these variables, frost change frequency was selected the most times (88 %). Frost change frequency describes the number of daily events for which the temperature en-290 compasses zero (Karger and Zimmermann, 2019), yet the estimated coefficient is very small. This variable was tested as it may represent snow movement processes related to freezing/thawing cycles, yet, it was too ambiguous. Other climate variables were rarely selected. The inclusion of climate variables may prove helpful when comparing different regions in a "bulk" perspective (e.g. average landslide density per site), but seemingly not when explaining locations of individual shallow landslides across different regions. Additionally, the comparatively low spatial resolution of the CHELSA data set (30 arcsec) may not be q q q q q q q q q q q q q q q q q Additionally, shallow landslide causes can be manifold and singular triggering processes are difficult to assign and the timing of the occurrence is often unknown. If possible, it would be useful to differentiate between triggering factors of shallow landslides based on visual appearance, as was suggested by Geitner et al. (2021). With the U-Net approach used to map the shallow landslide sites on aerial images (0.25 m), it is impossible to distinguish between triggering factors (Samarin et al., 2020;300 Zweifel et al., 2019). With higher spatial resolutions of climate variables and a temporal component to the mapped shallow landslides, it may become possible to assign triggering processes with such evaluation techniques. Additional variables such as land-use information (e.g., grassland management) could be of great importance if available in appropriate spatial resolution and high enough accuracy for all regions (Meusburger and Alewell, 2009;Budimir et al., 2015).
While the explanatory variables for this study were chosen based on data availability, this is not an exclusive list of possible 305 predictors. Many studies have worked towards identifying triggering factors in varying Alpine regions, such as the effects of land-use, snow processes, precipitation events or vegetation cover (Newesely et al., 2000;Tasser et al., 2003;Rickli and Graf, 2009;Geitner, 2010, 2013;Meusburger and Alewell, 2008;Meusburger et al., 2013;Von Ruette et al., 2013;Höller, 2014;Ceaglio et al., 2017;Fromm et al., 2018;Geitner et al., 2021). Therefore, it is difficult to fully quantify all ongoing processes simultaneously in such a complex system, as triggering factors are often interlaced (Zweifel et al., 2019). To 310 ideally represent causal factors for statistical evaluations of shallow landslides, these important processes need to be represented with high spatial resolutions and a temporal component needs to be included (Meusburger and Alewell, 2009).
Using the Lasso regression model, we identified the most important explanatory variables for shallow landslides of 10 study sites located on grassland slopes spread across Switzerland. Due to the different local conditions of the varying sites, different 315 explanatory variables were identified as important. Slope and aspect are among the most important variables. Shallow landslides of sites with an east-west orientation of the valley axis as well as alpine sites were better explained by the available explanatory variables (Urseren, Val Piora, Rappetal and Arosa). This concludes that exposition-related processes in mountainous regions are essential for understanding regional patterns (e.g., snowmelt, snow movement). For the remaining sites, the available selection of explanatory variables was not as well suited and, therefore, important processes could be missed. Sites outside of the main while the available climate-related data sets have proven to be less informative on both regional and cross-regional scales.
Nevertheless, data sets representing triggering shallow landslide conditions and processes in appropriate spatial resolutions 330 would likely improve model performance. Studies focusing on understanding small scale processes are therefore of great importance, and with data availability shifting towards open access and higher spatial resolutions as well as large spatial coverage, such statistical evaluations may improve in the future.
Code and data availability. The full code of the U-Net erosion mapping tool is available under the GNU public license (https://github.com/ bmda-unibas/ErosionSegmentation). Geodata sets were obtained from SwissTopo unless stated otherwise.