Comment on nhess-2021-243 Anonymous Referee # 1 Referee comment on " Quantification of meteorological conditions for rockfall triggers in Central Europe

The manuscript of Nissen et al. deals with the set up of a logistic regression model to derive the probability of occurrence of rockfalls in Germany, based on meteorological and hydrological variables. The paper is interesting and the writing is fluent and clear. I enjoyed reading it. Regarding results, the authors are able to quantify the impact of increasing rainfall and increasing subsurface water (i.e., pore water, water in fractures) in terms of variations of probability of occurrence of rockfalls. Despite its general good quality and interesting findings, I believe there are certain aspects that need improvement.


Introduction
Landslides are geomorphological hazards that cause fatalities and damage to built structures (Froude and Petley, 2018). There is scientific consensus that specific weather conditions can strongly influence landslide occurrence (McColl, 2015). Thus, as the effects of climate change become more and more visible, the scientific community tries to understand and predict the consequences for landslides (e.g., Gariano and Guzzetti, 2016; Macciotta et al., 2017; Haque et al., 2019; Bajni et al., 2021). However, specific weather conditions must meet specific ground conditions for landslides to occur. Consequently, meteorological parameters and thresholds are spatially heterogeneous, and results from previous studies on this issue are site-specific. Furthermore, the term "landslides" encompasses multiple mass wasting processes on slopes (e.g., mud flow and rockfall) that each depend on different preconditions and trigger mechanisms (Varnes, 1978; Hungr et al., 2014).
It is therefore sensible to study these different types of processes separately.
Against this background, the present study focuses on multiple rockfall clusters spanning all of Germany. Rockfall is the removal of superficial and individual rocks from a rock cut slope (Robbins et al., 2021). In solid rock, the frequency and size of fissures and cracks are preconditions that promote rockfall, i.e., they represent weak points vulnerable to weathering that may eventually dislodge individual rocks (Erismann and Abele, 2001). Thus, all weathering mechanisms that promote rockfall can also be trigger mechanisms that cause the start of a rockfall event. Weathering mechanisms driven by meteorological events include the wetting and drying of porous (esp. argillaceous) rocks by precipitation and evaporation, carbonate dissolution in carbonate rocks by rainfall, and frost shattering at low temperatures (Souleymane et al., 2008; Krautblatter et al., 2012; Viles, 2013). In the case of frost shattering, weathering and triggering mechanisms may differ, since rockfalls may occur during thawing rather than during cooling periods because the cohesion of the ice-rock interface can be sufficient to hold the rock in place (D'Amato et al., 2016). Moreover, there are weathering mechanisms not directly linked to meteorological events that may promote or trigger rockfalls. Tectonic activity may weather rock through phases of folding, thrusting, strike-slip, and normal faulting (Di Luzio et al., 2020). Tree root growth may expand rock fractures and joints (Dorren et al., 2007). Lastly, anthropogenically induced vibrations and tremors (e.g., from explosions or machine use) or direct constructional interventions may lead to weathering of rock (Gill and Malamud, 2017).
In this context, the question arises whether a statistical model focused on meteorological parameters can accurately predict rockfall occurrence. A first investigation conducted on a monthly basis by Rupp and Damm (2020) already suggests that a relationship between rockfall events, temperature and precipitation is likely to exist in the selected study areas. The present analysis focuses on the quantification of these effects. To allow for the application of the results in climate change studies, it is essential to consider all climatic factors that promote or suppress rockfall together, as they can reinforce or cancel each other (Crozier, 2010). For example, climate projections suggest that heavy precipitation, which both promotes and triggers rockfall, may increase in magnitude and frequency due to the higher moisture holding capacity of warmer air (IPCC, 2012). At the same time, increases in evaporation due to higher temperatures decrease water availability, thus slowing down weathering mechanisms.
To account for this, a statistical model which includes the interaction between the relevant variables was developed. In the following, a logistic regression model describing the probability of rockfall events in Germany in response to meteorological and hydrological conditions is presented.
masses, the mentioned processes are subsumed under the generic term rockfall in this study (Evans and Hungr, 1993; Selby, 1993). Besides an identification number, the location (i.e. coordinates) and the year of occurrence of each rockfall event are stored in the dataset. A total of 642 rockfalls have a recorded year of occurrence; the remaining events are undated. The rockfall occurrences span the period from 1480 to 2018, with the majority (n = 621) dating from 1873 onwards. The dense spatial clustering of rockfall events and the high temporal data homogeneity guided the selection of three study areas (ES = German part of the Elbe Sandstone Mountains, HL = northern Hesse and southern Lower Saxony, HR = western Hesse and Rhineland-Palatinate) (Fig. 1).

Elbe Sandstone cluster
The ES cluster mainly includes the German parts of the Elbe Sandstone Mountains, which are located on both sides of the upper reach of the river Elbe between the Czech city of Děčín and the Saxon city of Pirna. Geologically, the area is dominated by compact Cretaceous sandstones. Fracturing and the formation of cracks and fissures resulted from extensive uplift processes and long-term tectonic stresses. Fluvial incision produced a heavily dissected relief with numerous horizontal cracks, vertical joints and clefts as well as small gorges (Pälchen and Walter, 2008). The climatic conditions are characterised as continental, with warm summers and cold winters. The mean monthly temperature is between -0.8 °C in January and 17.8 °C in July. Between 1946 and 2017 annual precipitation ranges between 398 mm and 1153 mm with an annual average of 758 mm.

Hesse and Lower Saxony cluster
The HL cluster encompasses large parts of the northern German Central Uplands, i.e. the Hesse Highlands and Lower Saxon Hills. Predominantly, the geological conditions are characterised by Middle Lower Triassic Bunter Sandstone. Pronounced dissections were caused by tectonic stresses (Damm et al., 2010). Quaternary sediments, for example periglacial cover beds and loess covers, cover the bedrock in large parts of the area (Wagner, 2011; Damm et al., 2013). The climatic situation can be described as temperate with warm summers and mild winters. The mean monthly temperature is between 0.5 °C in January and 17.3 °C in July. From 1902 to 2017 annual precipitation ranges between 357 mm and 1099 mm with an annual average of 660 mm.

Hesse and Rhineland-Palatinate cluster
The HR cluster comprises large parts of the Hunsrück Hills in Rhineland-Palatinate and a small part of the Taunus Hills in Hesse. Geologically, Devonian bedrock, namely slate and quartzite, is predominantly present in this area. Distinct plateaus alternate with ridges and incised valleys (LBG, 2005). The climatic situation can be described as temperate, with mild winters and warm summers. The mean monthly temperature is between 1.4 °C in January and 18.4 °C in July. Between 1915 and annual precipitation ranges between 324 mm and 853 mm with an annual average of 641 mm.
It is important to note that the rockfall database is not comprehensive. The increase in the number of recorded events with time (Fig. 2) is not due to climatic conditions but reflects the fact that data on rockfall events were more readily available in recent years.

Meteorological and hydrological variables
For this study, datasets with a long record and high horizontal resolution were used in order to identify meteorological and hydrological conditions for as many rockfall events as possible with sufficient accuracy. It was therefore decided to use the gridded REGNIE dataset (Rauthe et al., 2013) for daily precipitation amounts. The dataset is compiled from spatially interpolated gauge measurements of the quality-controlled German weather service (DWD) stations. REGNIE is available from 1931 onwards for western Germany. For the new federal states the time series starts in 1951. The spatial resolution is 1 km².
In order to study precipitation intensities, data from the gridded radar-based climatology RADKLIM (Winterrath et al., 2018) were used. The dataset includes hourly precipitation from radar measurements adjusted to station observations and has a horizontal resolution of 1 km × 1 km. For the present study the daily maximum was extracted. The time series is comparatively short as it only starts in 2001.
For temperature it was decided to use the gridded E-OBS dataset (Cornes et al., 2018) as it extends back to the year 1950. The horizontal resolution of the grid is 0.1° × 0.1°, which corresponds to approximately 8 km in Germany. For the analysis of freeze-thaw cycles the ensemble mean of near-surface atmospheric daily minimum and daily maximum temperatures provided in the v21.0e version of the dataset was used. A freeze-thaw cycle was defined as the transition from a daily minimum temperature below -1 °C to a daily maximum temperature higher than 0 °C.
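The freeze-thaw definition above can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not specify: a cycle is taken to complete on the first day whose maximum temperature exceeds 0 °C on or after a day with a minimum below -1 °C, and the look-back window excludes the current day. The function names are hypothetical.

```python
def freeze_thaw_days(tmin, tmax, freeze_below=-1.0, thaw_above=0.0):
    """Flag days on which a freeze-thaw cycle completes.

    Assumption: a cycle completes on the first day with tmax > thaw_above
    that falls on or after a day with tmin < freeze_below.
    """
    frozen = False
    flags = []
    for lo, hi in zip(tmin, tmax):
        if lo < freeze_below:
            frozen = True  # freezing condition met, wait for thawing
        if frozen and hi > thaw_above:
            flags.append(True)
            frozen = False  # cycle completed, reset the state
        else:
            flags.append(False)
    return flags


def ftc_in_previous_days(flags, window=9):
    """True on day i if a cycle completed within the `window` days before day i."""
    return [any(flags[max(0, i - window):i]) for i in range(len(flags))]
```

With daily series in °C, `ftc_in_previous_days(freeze_thaw_days(tmin, tmax))` would give a binary indicator comparable in spirit to a "cycle in the previous 9 days" predictor.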
The water content of the ground (e.g. soil moisture, cleft water, water in rock pores) is measured only at very few sites. Soil moisture monitoring in Germany, for example, relies on modelled soil moisture. In this study we attempt to utilise modelled soil moisture as a representative for all types of sub-surface water and will use the term pore water in the following. We analyse the results of a simulation with the state-of-the-art, grid-based hydrological model mHM (Samaniego et al., 2010), which was calibrated using gauge measurements. The set-up is based on European datasets as described in Rakovec et al. (2016) and Samaniego et al. (2019). We analysed the relative moisture content (i.e. degree of saturation) for the entire column from the surface to a depth of approximately 1.8 m. It is common practice for this model to further normalise these values using percentiles as the variability of the modelled values is too low.
A challenge in using simulated soil moisture is that it is operationally available for only some climate scenario simulations. In addition, the moisture variables stored and the levels they represent differ between climate models. Therefore, the usage of a pore water proxy (D) as an alternative to simulated soil moisture was tested as a predictor for the logistic regression model. D is defined as the difference between precipitation accumulated over a period of time ($Prec_{accum}$) and the potential evapotranspiration ($PET$) during this period:

$$D = Prec_{accum} - PET \qquad (1)$$
The term D is also the basis for the calculation of the Standardised Precipitation Evapotranspiration Index (SPEI; Vicente-Serrano et al., 2010), which includes a standardisation of D in order to allow comparisons between different climatic regions, which is not necessary here. Many empirical methods exist to determine potential evaporation; which one can be used depends on the availability and quality of the observations at a site. In this study, the method first proposed by Hargreaves (1994) in the version modified by Droogers and Allen (2002) was applied. As input parameters it needs extraterrestrial radiation (which depends on latitude and day of the year), the period mean of maximum and minimum daily temperatures, as well as mean precipitation over the period of interest (which is used as a proxy for cloudiness). Thus, D does not depend on the material of the ground. In order to use D as a proxy for pore water it was accumulated over a period of time. Different accumulation periods were compared and the best result was obtained using a weekly period.
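As an illustration of the proxy described above, the following Python sketch accumulates D over a weekly window and converts a value into an empirical percentile. It assumes that daily potential evapotranspiration is already available as an input series (the Hargreaves-type calculation is not reproduced here); the function names and the simple percentile definition are hypothetical.

```python
def pore_water_proxy(precip, pet, window=7):
    """D per day: precipitation minus PET, each summed over the last `window` days.

    Returns None for days before a full window is available.
    """
    d = []
    for i in range(len(precip)):
        if i + 1 < window:
            d.append(None)
            continue
        start = i + 1 - window
        d.append(sum(precip[start:i + 1]) - sum(pet[start:i + 1]))
    return d


def percentile_rank(values, x):
    """Percentage of entries in `values` that are <= x (empirical percentile)."""
    return 100.0 * sum(v <= x for v in values) / len(values)
```

For an across-site percentile, `values` would pool the D values of all sites; for a local percentile, it would contain only the values of the site in question.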
The following analyses include only data from grid boxes that contain the site(s) of at least one rockfall event. This approach ensures that regions without any predisposition (e.g. flat terrain, no rocks) do not obscure the results. Grid boxes that contain only locations of rockfall events that took place before the start or after the end of the respective record are also not considered.
where X is the independent variable and Y is the binary information whether the event occurred. $f(X \mid Y)$ denotes the conditional probability density function for X if Y is true (=1) or false (=0). In practice, a continuous independent variable (e.g. precipitation amount) is split into bins containing a range of values. The WOE for each bin is then calculated using only days within this range. The WOE depends on the ratio of the fraction of days with a rockfall event to that of uneventful days. For categorical variables the WOE is determined for each category.
A graphical inspection of the result (Fig. 3) already reveals whether a relationship between the independent variable and the probability of rockfall exists. An integral measure for the strength of the relationship between the dependent and independent variable is the Information Value (IV; Siddiqi, 2006). It is calculated as a sum over all bins, with nbins being the number of bins. According to Siddiqi (2006) a predictor is not useful for statistical modelling if the IV is less than 0.02.
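A minimal Python sketch of a binned WOE and IV calculation is given below. The equations themselves are not reproduced in this text, so the sketch assumes the commonly used form: WOE per bin as the natural logarithm of the event share over the non-event share, and IV as the sum over bins of the share difference times the WOE. The sign convention and the function name are assumptions.

```python
import math


def woe_and_iv(event_counts, nonevent_counts):
    """Per-bin weight of evidence and total information value.

    event_counts[i] / nonevent_counts[i]: number of days with / without a
    rockfall event whose predictor value falls into bin i.
    """
    total_events = sum(event_counts)
    total_nonevents = sum(nonevent_counts)
    woe = []
    iv = 0.0
    for ev, non in zip(event_counts, nonevent_counts):
        f_event = ev / total_events          # estimate of f(X in bin | Y = 1)
        f_nonevent = non / total_nonevents   # estimate of f(X in bin | Y = 0)
        w = math.log(f_event / f_nonevent)   # undefined for empty bins
        woe.append(w)
        iv += (f_event - f_nonevent) * w
    return woe, iv
```

Under this convention a positive WOE marks a bin in which rockfall days are over-represented, and the resulting IV can be compared against the 0.02 threshold mentioned above.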
The IV can also be used to rank the variables according to their influence. The highest IV is obtained for daily precipitation, followed by precipitation intensity, soil moisture and freeze-thaw cycles. To take into account that the thawing process might take several days, a time span preceding the event was evaluated. Comparing different time spans, it turned out that the IV associated with a freeze-thaw cycle immediately before the rockfall event (i.e. in the preceding 2 days) was too low to be considered useful for statistical modelling. Extending the analysis period backward in time increased the IV, with the increase flattening out after 9 days. The WOE analysis also confirmed that, in accordance with the findings of D'Amato et al. (2016), thawing increases rockfall probability while freezing decreases it (not shown). It should be noted that in the evaluation depicted in Fig. 3 the number of rockfall events and grid boxes included in the WOE calculations differs depending on the length and spatial resolution of the respective meteorological/soil moisture datasets. A consistency test reducing the number of grid boxes, time steps and events to the subset covered by all datasets (not shown) confirms the ranking.
It is important to keep in mind that, despite the name of the analysis, a statistical relationship is no proof of a cause-and-effect relationship. As rockfall occurrence in Germany exhibits a seasonal cycle with a maximum in January (Rupp and Damm, 2020), it is easy to establish a statistically significant but physically incoherent relationship to any unrelated variable with a similar seasonal cycle. To account for this problem, we only analysed variables for which a physical relationship to rockfall events has already been established in previous studies for other sites.
A well-established statistical method to determine probabilities for a binary event (e.g. rockfall vs. no rockfall) based on the conditions of independent variables is logistic regression. Here, a logistic regression model using precipitation, pore water and freeze-thaw cycles as independent meteorological and hydrological variables is fitted. The consideration of individual rockfall clusters (section 2.1) provides information on possible regional differences of the results.
To fit the logistic regression model we used as many data points n = es × ts as possible, with ts being the number of days for which meteorological/hydrological data are jointly available among all variables used as independent parameters in the model. The number of event sites (es) at which a rockfall event was recorded within the period covered by the meteorological/hydrological observations depends on ts. Unlike in the WOE analysis, we neglected the fact that some grid boxes enclose the site of more than one event. As the grouping of the events depends on the resolution of the individual datasets included in the statistical model, this was necessary to allow a comparison of the different models. Evaluations involving precipitation intensity are carried out based on a much shorter period and a lower number of sites than all other evaluations, which will be considered when comparing the results.
For logistic regression a generalised linear model with a logit link function is fitted (Wilks, 2011). The probability $p_k$ of a rockfall event at a specific time and location is then expressed as a logistic function of the predictors.
The classical score to compare logistic regression models of different complexity is the Brier skill score which, however, becomes unstable for rare events such as ours (Benedetti, 2010). We therefore use the logarithmic skill score (LSS) instead, which behaves similarly to the Brier skill score but performs better for extreme probabilities (Benedetti, 2010; Wilks, 2011).
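To make the logit link concrete, the sketch below evaluates a logistic model with two continuous predictors, their interaction and a binary freeze-thaw flag. The predictor set mirrors the kind of model discussed in this study, but the function name, the dictionary keys and all coefficient values are purely hypothetical and are not the fitted values.

```python
import math


def rockfall_probability(precip_perc, pore_perc, ftc, coef):
    """Probability from a logit-link model: p = 1 / (1 + exp(-eta)).

    eta is the linear predictor with main effects, one interaction term
    and a binary flag; `coef` holds hypothetical regression coefficients.
    """
    eta = (coef["intercept"]
           + coef["precip"] * precip_perc
           + coef["pore"] * pore_perc
           + coef["precip:pore"] * precip_perc * pore_perc
           + coef["ftc"] * ftc)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients, chosen only to illustrate a rare event.
example_coef = {"intercept": -9.0, "precip": 1.0, "pore": 0.2,
                "precip:pore": 1.5, "ftc": 0.3}
p_median = rockfall_probability(0.5, 0.5, 0, example_coef)
p_wet = rockfall_probability(0.9, 0.5, 0, example_coef)
```

The ratio `p_wet / p_median` then expresses how much more likely an event becomes when the precipitation percentile rises, which is the kind of relative statement the model is designed to support.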

The logarithmic skill score quantifies the percentage gain of using the statistical model over just predicting the climatological probability. It is calculated from the logarithmic score of the model, $LS = \frac{1}{n} \sum_{k=1}^{n} LS_k$, and the corresponding score of the climatological forecast, $LS_{clim}$. The value of $LS_k$ is determined for each day and location using the probabilities predicted by the model. $LS_{clim}$ is calculated analogously using the climatological probability $es/n$ (i.e. one rockfall event at each site).
In order to validate the statistical model and ensure that no overfitting took place, it can be tested on a sample of independent data by cross validation. The full event catalogue was divided into five approximately equally sized groups, with events from the different clusters equally distributed between the groups. The statistical model was then trained using only four of the groups and afterwards applied to predict event probabilities in the remaining group. The model is judged by computing the logarithmic scores $LS$ and $LS_{clim}$. The process is repeated for all groups and a mean logarithmic skill score is determined (denoted as LSS cv in Tab. 1). In the following analysis LSS cv was used as an objective measure to select the most suitable combination of predictors for modelling rockfall probabilities.
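The scoring procedure can be sketched in Python as follows. Since the exact equations are not reproduced in this text, the sketch assumes the common definitions: $LS_k$ is the negative logarithm of the probability assigned to the observed outcome, and the skill score is $1 - LS / LS_{clim}$. The function names are hypothetical.

```python
import math


def logarithmic_score(probs, outcomes):
    """Mean negative log-probability of the observed outcomes (lower is better)."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        total += -math.log(p) if y else -math.log(1.0 - p)
    return total / len(probs)


def logarithmic_skill_score(probs, outcomes):
    """Assumed form 1 - LS / LS_clim, using the event frequency as climatology."""
    p_clim = sum(outcomes) / len(outcomes)
    ls = logarithmic_score(probs, outcomes)
    ls_clim = logarithmic_score([p_clim] * len(outcomes), outcomes)
    return 1.0 - ls / ls_clim
```

In a cross-validation setting, the same function would be applied to the held-out group in each fold and the resulting skill scores averaged.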
To find the best performing statistical model, numerous combinations of the potential predictors were compared. Table 1 lists the results for a selection of these tests. Evaluated predictors include daily precipitation (precip_1day), the local percentile of daily precipitation calculated using wet days (precip_1day_lperc), the percentile of simulated total column soil moisture (sm_perc), the percentile of parameterised pore water (D_perc), hourly precipitation (precip_1hr), the local percentile of hourly precipitation calculated using wet hours (precip_1hr_lperc), a categorical predictor denoting to which cluster an event belongs (cluster) and a binary predictor indicating whether a freeze-thaw cycle occurred at the site during the previous 9 days (ftc). As the full notation of the model equation is space consuming, we use the compact symbolic form used in the R programming language (R Core Team, 2018). The operator + denotes adding another predictor term, : marks the product between two predictors, while * indicates that both predictors and their product are included.
Table 1. Symbolic formula for a list of logistic regression models tested in this study and main characteristics associated with these models. The characteristics include the number of coefficients that needed to be determined, the number of rockfall events that could be used for fitting, the logarithmic skill score (LSS), the logarithmic skill score determined by cross validation (LSScv) and the Akaike information criterion (AIC). See text for explanation of symbolic equation notation.

Results
The performance of the statistical models listed in Tab. 1 can be compared with the help of the cross validated logarithmic skill score (LSS cv), with a higher skill score denoting a better performance of the model. In addition, the Akaike information criterion can be used to compare two models which were fitted based on the same number of observations (Akaike, 1974). Here, a lower AIC value is associated with the better model. Comparing models 1-5 confirms that the most important single predictor is daily precipitation, which performs best if included in the form of its local percentile (denoted by the suffix "_lperc"). For hourly precipitation LSS cv indicates that absolute values lead to better results than local percentiles. The LSS cv values for soil moisture and for hourly precipitation are of the same magnitude. Models 6-8 reveal that considering soil moisture in addition to the local percentile of daily precipitation improves the statistical model, with the best result obtained by using both variables individually as well as their interaction term (model 8). LSS cv can be further increased by adding the binary information of the occurrence of a freeze-thaw cycle in the previous 9 days to the set of predictors (model 11). Adding the binary cluster information (models 9, 10, 12) has the effect of fitting a separate model to the event clusters. Customising the model to regions, however, does not improve its performance. Model 10 demonstrates the importance of cross validation. This model exhibits the highest number of regression coefficients, resulting in an LSS higher than for model 8. The LSS cv is, however, lower than for model 8, indicating that the LSS improvement is achieved by overfitting. At first sight it seems that including different accumulation periods for the calculation of D_perc it was found that a 1-week accumulation performs better than a 2-week accumulation period. Across-site percentiles performed better than the absolute values.
In addition to the combinations shown in Tab. 1 it was also investigated whether the regression coefficients depend on the slope angle at the event site. For this we downloaded the Copernicus digital elevation model (DEM) at 25 m horizontal resolution and calculated the slope angle at the rockfall locations using the methodology proposed by Horn (1981). The slopes at the event sites calculated using the DEM data appear plausible at many of the sites. There are, however, also locations for which we determined a slope angle of only a few degrees, which is inconsistent with the occurrence of rockfall events. A possible explanation could be that the slope was altered by the event and is therefore no longer captured in the DEM dataset, which is representative for the year 2011. In addition, for large-scale rockfall events it is difficult to determine the exact location at which the slope needs to be estimated. Overall, including the slope angle calculated using the DEM as an additional parameter in the logistic regression model did not improve the results.
Model 16 (Eq. 7) can now be used to predict changes in rockfall probability valid on average for specified changes of the meteorological conditions and the pore water preconditions. The response of the rockfall probability to variations in the local daily tation set to median values (Fig. 4a) rockfall events can be expected to appear with approximately climatological probability. Less (more) precipitation leads to a probability below (above) the climatological average. Increasing the local precipitation from the median to its 90th percentile approximately doubles the probability for a rockfall event. The amount of precipitation associated with the 50th percentile varies between 1.2 and 6.8 mm and that for the 90th percentile between 6.5 and 31.2 mm depending on the site. The occurrence of a freeze-thaw cycle in the previous days increases the probability for an event. Precipitation becomes more effective when the pore water amount is high (Fig. 4b). When D is at the 95th percentile, increasing precipitation from its median to its 90th percentile makes a rockfall event almost 4 times more likely. The logistic regression model suggests that the influence of pore water on rockfall probability is on average less pronounced than the influence of daily precipitation. At most sites an increase in pore water amount in the absence of strong precipitation has hardly any effect (Fig. 4c).

In this study a statistical model was developed that is able to describe changes in the probability of rockfall events in Germany that can be expected under different meteorological and hydrological conditions. It must be kept in mind, though, that the rockfall database is not comprehensive. Thus, it cannot be used to calculate an absolute baseline probability. In addition, there is no guarantee that the sampling locations are representative for Germany as a whole. In order to investigate to what extent the model depends on the region that is investigated, we defined three study areas characterised by dense spatial clustering and high temporal data homogeneity and evaluated whether the statistical model improves when the regression coefficients are allowed to differ between the clusters (models 9, 10 and 12). It was found that including the cluster information did not improve the model. This provides some reassurance that our approach to develop a statistical model for the entire domain of Germany as well as neighbouring low mountain regions in Central Europe, regardless of the different local conditions in different areas, was reasonable.
The logarithmic skill score used to evaluate the fit of the statistical model describes the percentage improvement over a model that always predicts a climatological probability for rockfall events. The skill score of our model is just over 4% and improves to more than 5% if only the last 20 years are used for model fitting. A value of 4% does not appear to be much, but it has to be interpreted keeping the physics of rockfall events in mind. A rockfall event can only be triggered if the slope is predisposed, after many years of weathering. Because of this, most of the time strong rainfall events in an area with high soil moisture preconditions remain without consequences and the number of false alarms in our model is extremely high. This limitation could only be overcome by including information on the predisposition in the statistical model. Unfortunately, this is not feasible as it would be far too expensive to monitor every slope operationally. Prediction errors (i.e. missed alarms) may stem from events triggered by non-meteorological mechanisms. The model skill obtained using only meteorological/hydrological parameters as predictors, however, suggests that non-meteorological influence is a subordinate factor in the rockfall process for the selected study regions. The improvement of the skill when using only data of the last 20 years could be due to the fact that the rockfall dataset is more comprehensive for the recent past or due to improvements in the quality of meteorological observations.
We found that daily precipitation is the most important factor to trigger rockfall events in Germany. The best fit for the statistical model was obtained when using local percentiles rather than across-site percentiles (not shown) or absolute values. A possible interpretation could be that most rock slopes are balanced under normal climate conditions but can become unstable in the presence of above normal precipitation amounts. Pore water (represented by D or soil moisture), on the other hand, leads to better results when across-site percentiles are used. This might be an indication that in most locations pore water on its own is unlikely to trigger a rockfall event. It can weaken porous material, making it more susceptible to a trigger like precipitation. This process, however, depends only on the material and not on the climate conditions at the location of the event. The fact that both simulated soil moisture and D improved the statistical model confirms that these variables can be used as a first-order substitute for all relevant types of sub-surface moisture such as cleft water and water in rock pores.
Using a rockfall dataset for Germany it was possible to build a statistical model which is able to quantify changes in rockfall probability in response to changes in sub-surface moisture and meteorological factors identified in geophysical studies as potential triggers for rockfall events. The model can be regarded as representative for the low mountain ranges in Central Europe.
The model was developed in order to be applied to climate change simulations with the aim of determining whether the probability of rockfall events can be expected to change in response to global warming. Applying the statistical model to climate simulation output is facilitated by the fact that the model works with percentiles for most predictors. Thus, only temperature for the evaluation of freeze-thaw cycles needs to be bias corrected. In addition, the complex simulation of soil moisture as a representative for pore water can be substituted by a proxy (i.e. accumulated precipitation minus potential evaporation) which can be easily calculated from climate model output.
For the application in climate change studies it is important that the statistical model considers the interaction between the triggering factors, as these are expected to show opposing trends. While heavy precipitation is likely to increase in the future (IPCC, 2012), the number of frost days will decrease with increasing temperatures (IPCC, 2012). Climate projections for aridity in Central Europe depend on location and season (Samaniego et al., 2018). Thus, studies considering only single factors might over- or underestimate the response of rockfall to climate change, as the interaction of the factors can amplify or diminish the signal.