Mountainous grassland slopes can be severely affected by soil erosion. Shallow
landslides are a crucial erosion process there and indicate slope
instability. We determine the locations of shallow landslides across different
sites to better understand regional differences and to identify their
triggering causal factors. Ten sites across Switzerland located in the Alps
(eight sites), in foothill regions (one site) and the Jura Mountains (one site) were
selected for statistical evaluations. For the shallow-landslide inventory, we
used aerial images (0.25

Soil erosion is an issue affecting many regions of the world and can have
severe consequences for the environment and humanity (e.g. water pollution or
food production)

Images showing examples of shallow landslides. Shallow-landslide sites show
displaced topsoil layers and have a distinct boundary to the
vegetation.

The aim of our study is to statistically evaluate shallow-landslide occurrence
for 10 different sites (between 16 and 54

A total of 10 sites were selected to produce shallow-landslide inventories
(mapping of shallow landslides) and perform subsequent statistical evaluations
of explanatory variables. We only consider grassland areas, which were
identified with the aid of the surface cover information of the product

Map of Switzerland showing the 10 selected study sites (outlined in yellow). Darker colours indicate lower elevations and lighter colours higher elevations. Digital terrain model obtained from © swisstopo.

List of study sites and descriptive information: elevation range, total area of the study site, grassland area within the study site in per cent, average slope of the grassland area, orientation of the main valley axis, number of shallow landslides and shallow-landslide density in grassland areas. The maximum precipitation events (monthly values) of the previous 5 and 10 years are averaged over the study site. Note that both time spans might include the same events. GL: grassland; SLS: shallow landslides;

To identify the locations of shallow landslides across the 10 study sites, we
use a deep learning approach based on the U-Net architecture

With the statistical evaluation of the shallow-landslide sites, we aim to
understand possible causal factors. We evaluate the 10 study sites
individually (evaluation within each site) as well as across all of the sites
simultaneously (all-in-one model). The aim of this is to test whether the
same causal factors are important on different spatial scales. For each of the
10 sites an equal number of shallow-landslide and non-landslide points
constitute the binary response variable (no

We consider the linear model
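As a point of reference only, the textbook form of a linear model and its lasso estimate is sketched below. The notation is assumed here and is not taken from the study: $y$ is the response vector, $X$ the matrix of standardised explanatory variables, $\beta$ the coefficient vector, $\varepsilon$ the error term and $\lambda \ge 0$ the penalty weight. For a binary response such as landslide versus non-landslide points, the squared-error term is commonly replaced by the negative log-likelihood of the logistic model.

```latex
y = X\beta + \varepsilon, \qquad
\hat{\beta}^{\mathrm{lasso}} = \arg\min_{\beta}
\left\{ \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \right\}
```

The $\ell_1$ penalty shrinks some coefficients exactly to zero, which is what allows the lasso to select variables as well as estimate them.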

Spatial blocks for fivefold cross-validation shown with the example of Chrauchtal. Blocks have a size of 1
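The idea of spatially blocked cross-validation can be illustrated with a minimal sketch. It assumes square blocks on a regular grid and a random assignment of whole blocks to folds; the function name `assign_spatial_folds`, the coordinates and the `block_size` value are hypothetical and are not taken from the study.

```python
import math
import random
from collections import defaultdict


def assign_spatial_folds(points, block_size, n_folds=5, seed=0):
    """Assign each (x, y) point to a CV fold via the square block containing it."""
    # Group point indices by the grid block they fall into.
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        key = (math.floor(x / block_size), math.floor(y / block_size))
        blocks[key].append(idx)

    # Shuffle the blocks reproducibly and deal them out to the folds in turn,
    # so that all points of one block always share the same fold.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)
    folds = [None] * len(points)
    for i, key in enumerate(keys):
        for idx in blocks[key]:
            folds[idx] = i % n_folds
    return folds


# Hypothetical coordinates (same length unit as block_size).
points = [(120.0, 80.0), (130.0, 95.0), (1500.0, 200.0), (2600.0, 3100.0)]
folds = assign_spatial_folds(points, block_size=1000.0)
```

Because neighbouring points end up in the same fold, a model is never tested on points that lie directly next to its training points, which is the motivation for blocking in the presence of spatial autocorrelation.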

To evaluate the accuracy and the predictive ability of the logistic regression
models, we use performance measures described in the following. All model
performances are based on test set estimations (predictions evaluated on
held-out test data blocks). The receiver operating characteristic (ROC) curve
is a continuous curve showing the relationship between the true positive rate
(TPR) and false positive rate (FPR) for every probability threshold of the
model predictions

To perform the mapping of shallow-landslide sites with the U-Net model
(Sect.

Example of mapped shallow landslides in the Turbach valley (purple). The centred points (yellow) represent shallow-landslide locations for the lasso model evaluation. Only sites with an area larger than 4

The explanatory variables selected for the statistical evaluation of the
shallow-landslide points are a combination of variables commonly found in
landslide or shallow-landslide susceptibility studies

Table containing the variables used for the logistic regression with information on the type of variable (continuous or categorical), spatial resolution and which data set the variable was originally based on.

For every shallow-landslide and non-landslide point the variables listed in
Table

The lasso regression model selects the relevant explanatory variables and
estimates their regression coefficients to predict the location of shallow
landslides. The statistical evaluation was conducted for all 10 sites
individually and for all sites combined into one large model (all-in-one
model). The same explanatory variables were used for both approaches. Due to
the 20 fivefold cross-validations and random re-samplings (bootstrapping), the coefficients are estimated 100 times. The

The statistical evaluation of the study sites yields one model per site (10
models). We combine the results of all 10 sites in heat maps, showing the
median estimated coefficients (Fig.

Heat map displaying estimates of coefficients (median of 100 estimates) for all 10 sites. Note that not all geological rock classes are present at all sites (grey line). White boxes correspond to coefficients of zero, i.e. variables that were never selected for the models.

Heat map displaying the inclusion rate of variables for all 10 sites. The numbers indicate how often variables were selected for the models out of 100 estimates. Note that not all geological rock classes are present at all sites (grey line). Darker colours show variables selected more often. White boxes indicate which variables were never selected for the models.
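The two quantities shown in the heat maps can be derived from the repeated coefficient estimates as in the following minimal sketch. It assumes that a variable counts as selected whenever its estimated coefficient is non-zero and that the median is taken over all repetitions, zeros included; the function name `summarise_coefficients` and the toy values (four repetitions instead of 100) are hypothetical.

```python
from statistics import median


def summarise_coefficients(estimates):
    """Summarise repeated lasso fits.

    estimates maps a variable name to its list of coefficient estimates,
    one value per repetition (0.0 means the variable was not selected).
    """
    summary = {}
    for name, values in estimates.items():
        n_selected = sum(1 for v in values if v != 0.0)
        summary[name] = {
            "median": median(values),
            # Inclusion rate in per cent of all repetitions.
            "inclusion_rate": 100.0 * n_selected / len(values),
        }
    return summary


# Toy example with made-up coefficients.
estimates = {
    "slope": [0.8, 1.0, 0.9, 1.1],
    "twi": [0.0, 0.0, 0.2, 0.0],
}
summary = summarise_coefficients(estimates)
```

In this toy example, slope is selected in every repetition, whereas TWI is selected only once and therefore has a median coefficient of zero.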

At most sites, slope is the most important variable in terms of both coefficient
value and inclusion rate. Only the sites Baulmes (29

Aspect had high inclusion rates (84 %–100 %) at all sites except
for Arosa (4

Other variables show a high inclusion rate at most sites but often have only small coefficients: roughness, TWI, distance to roads or streams, road or stream density, and frost change frequency. However, these variables were disregarded at some of the sites (low inclusion rates or complete exclusion). Their coefficients may be negative or positive, i.e. negatively or positively associated with shallow-landslide points (SLS points), depending on the site and the local conditions.

Geology is important for most sites, while sedimentary rocks and unconsolidated rocks are either present at the sites or selected for the model from all available classes. Unconsolidated rocks are negatively correlated with shallow-landslide occurrence in most cases. They can often be found near the valley bottom in proximity to streams and lakes, which tend to be located outside of shallow-landslide zones. Sedimentary rocks are positively correlated in most cases but can also show a negative correlation, depending on the site.

Boxplots (with whiskers and outliers) showing the coefficient range over 100 repetitions. Numbers above the variable names indicate how many times each variable was selected for the model. Boxes show the interquartile range (25th to 75th percentile), and the line indicates the median of the coefficients. Chrauchtal and Val Piora are shown as examples from the 10 study sites.

Two sites (Chrauchtal and Val Piora) have been selected as examples to show
detailed results of the models and how the selection of explanatory variables
can differ between sites (Fig.

Confusion matrix derivations using 0.5 for the prediction threshold. Perfect scores are accuracy

To assess the prediction skills of the individual-site models, we calculate
the ROC curves and the corresponding AUC values
(Sect.

ROC performance measure of the models for all 10 sites. Plot displays ROC curves with corresponding AUC values.
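How a ROC curve and its AUC follow from predicted probabilities can be shown with a minimal sketch. It assumes binary labels (1 for shallow-landslide points, 0 for non-landslide points) and uses the trapezoidal rule for the area; the function names and the toy predictions are hypothetical.

```python
def roc_curve(y_true, y_prob):
    """Return (fpr, tpr) lists with one point per distinct probability threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Lowering the threshold step by step equals walking through the
    # predictions in order of decreasing probability.
    order = sorted(range(len(y_true)), key=lambda i: -y_prob[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last prediction of a group of ties.
        is_last = rank == len(order) - 1
        if is_last or y_prob[order[rank + 1]] != y_prob[i]:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
        for k in range(1, len(fpr))
    )


# Toy test-set predictions.
y_true = [1, 1, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_curve(y_true, y_prob)
area = auc(fpr, tpr)
```

In the toy example, eight of the nine possible landslide/non-landslide pairs are ranked correctly, so the AUC is 8/9; a value of 0.5 would correspond to random ranking and 1 to perfect ranking.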

Performance measure expressed with the Brier score for the models for all 10 sites. Plot shows boxplots of Brier scores, where lower Brier scores are indicative of better model performance.
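For a binary response, the Brier score is the mean squared difference between the predicted probability and the observed outcome. The minimal sketch below illustrates this; the function name `brier_score` and the toy values are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy test-set predictions.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.5]
score = brier_score(y_true, y_prob)

# Reference: an uninformative model that always predicts 0.5.
baseline = brier_score(y_true, [0.5] * len(y_true))
```

With an equal number of landslide and non-landslide points, a model that always predicts 0.5 scores 0.25, which gives a convenient reference when reading the boxplots.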

Generally, the number of shallow landslides available at a site does not
necessarily affect the mean estimated value of coefficients, but the
variability in the estimates is smaller, and the inclusion rates are higher
for sites with more data points. The lower-performing models belong to sites
located either outside of the Alpine region (Baulmes, Hornbach) or in the Swiss National
Park (Val Cluozza, only 8

As slope is the most important predictor for shallow landslides at most sites in
terms of coefficient size and model inclusion rates, a slope-only model was
tested for all sites. The application of the slope-only model indicates how
well slope predicts shallow landslides and how important additional
explanatory variables can be. We therefore compare the results of slope-only
models for all sites to the full-variable models based on their Brier scores
(Table

Brier scores for the

With the all-in-one model, we evaluate whether the same explanatory variables are important for cross-regional evaluations as for individual site evaluations. As all sites included in the all-in-one model have different numbers of SLS points, the sites with more points have a stronger influence on the model's outcome.

On the left-hand side, the ROC curve is displayed with the AUC value for the all-in-one model in black (including locations of probability thresholds) superimposed over the individual-site models in grey. On the right-hand side is the bootstrapped Brier score for the all-in-one model.

Boxplots showing the coefficient range over 100 repetitions. Numbers above the variable names indicate how many times each variable was selected for the model.

The all-in-one model places the ROC curve at roughly the centre of the
individual-site models (Fig.

The most important variables are comparable to the individual-site models,
with slope and roughness having the largest coefficients for continuous
variables

Susceptibility maps for the study site Chrauchtal based on the local model and the cross-regional

Additionally, shallow landslides can have manifold causes: single triggering
processes are difficult to assign, and the timing of occurrence is often
unknown. If possible, it would be useful to differentiate between triggering
factors of shallow landslides based on visual appearance, as was suggested by

The calculated coefficients of the logistic regression may be used for spatial
predictions of shallow-landslide occurrence, yielding a susceptibility map of
the region for the remaining grassland areas. These susceptibility maps are
useful for identifying areas that are likely to be affected by shallow landslides in
the future
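The step from regression coefficients to a susceptibility value can be sketched as follows. The sketch assumes a logistic model with an intercept and explanatory variables that are scaled in the same way as during model fitting; the function name `susceptibility`, the coefficient values and the cell values are hypothetical and are not results of the study.

```python
import math


def susceptibility(intercept, coefficients, cell):
    """Predicted probability of shallow-landslide occurrence for one raster cell."""
    # Linear predictor: intercept plus the weighted sum of the cell's variables.
    eta = intercept + sum(beta * cell[name] for name, beta in coefficients.items())
    # The logistic function maps the linear predictor to a value between 0 and 1.
    return 1.0 / (1.0 + math.exp(-eta))


# Made-up coefficients; a zero coefficient stands for a variable the lasso dropped.
coefficients = {"slope": 1.2, "roughness": 0.4, "twi": 0.0}
intercept = -0.5

steep_cell = {"slope": 1.0, "roughness": 0.5, "twi": 0.3}
gentle_cell = {"slope": -1.0, "roughness": 0.5, "twi": 0.3}

p_steep = susceptibility(intercept, coefficients, steep_cell)
p_gentle = susceptibility(intercept, coefficients, gentle_cell)
```

Applying the function to every grassland cell of a raster yields the susceptibility map; variables with a coefficient of zero do not influence the result.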

In this study we located shallow landslides at 10 study sites spread
across Switzerland. We use the term shallow landslides to describe the erosion
sites, which classifies the erosion feature without implications for the
triggering event. Using the lasso regression model, we identified the most
important explanatory variables for these shallow landslides located on
grassland slopes. Due to the different local conditions of the varying sites,
different explanatory variables were identified as important. Slope and aspect
are among the most important variables. Shallow landslides at sites with an
east–west orientation of the valley axis, as well as at alpine sites, were better
explained by the available explanatory variables (Urseren, Val Piora, Rappetal
and Arosa). This suggests that aspect-related processes in mountainous
regions (e.g. snowmelt, snow movement) are essential for understanding regional
patterns. For the remaining sites, the available selection of
explanatory variables was not as well suited, and important
processes may therefore have been missed. Sites outside of the main Alpine region (Baulmes
and Hornbach) or located in the Swiss National Park (Val Cluozza) have a small
number of SLS points, which were not well explained by the available
variables. Performance scores for individual-site models range between
BS

The full code of the U-Net erosion mapping tool is available under the GNU public license (

The supplement related to this article is available online at:

LZ, CA, KM and MS designed the experiments, and LZ carried them out. MS developed the code used for mapping the shallow-landslide sites. LZ performed the mapping, evaluations and calculations. LZ prepared the manuscript with contributions from all co-authors.

The contact author has declared that neither they nor their co-authors have any competing interests.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Calculations were performed at the sciCORE (

This study was funded by the Swiss National Science Foundation (project no. 167333) as part of the National Research Programme NRP75 – Big Data.

This paper was edited by Paolo Tarolli and reviewed by Luigi Lombardo and one anonymous referee.