Articles | Volume 24, issue 11
https://doi.org/10.5194/nhess-24-3869-2024
Research article | 12 Nov 2024

Reconstructing hail days in Switzerland with statistical models (1959–2022)

Lena Wilhelm, Cornelia Schwierz, Katharina Schröer, Mateusz Taszarek, and Olivia Martius
Abstract

Hail is one of the costliest natural hazards in Switzerland and causes extensive damage to agriculture, cars, and infrastructure each year. In a warming climate, hail frequency and its patterns of occurrence are expected to change, which is why understanding the long-term variability and its drivers is essential. This study presents new multidecadal daily hail time series for northern and southern Switzerland from 1959 to 2022. Daily radar hail proxies and environmental predictor variables from ERA5 reanalysis are used to build an ensemble statistical model for predicting past hail occurrence. Hail days are identified from operational radar-derived probability of hail (POH) data for two study domains, the north and south of the Swiss Alps. We use data from 2002 to 2022 during the convective season from April to September. A day is defined as a hail day when POH surpasses 80 % for a minimum footprint area of the two domains. Separate logistic regression and logistic generalized additive models (GAMs) are built for each domain and combined in an ensemble prediction to reconstruct the final time series. Overall, the models are able to describe the observed time series well. Historical hail reports are used for comparing years with the most and least hail days. For the northern and southern domains, the time series both show a significant positive trend in yearly aggregated hail days from 1959 to 2022. The trend is still positive and significant when considering only the period of 1979–2022. In all models, the trends are driven by moisture and instability predictors. The last 2 decades show a considerable increase in hail days, which is the strongest in May and June. The seasonal cycle has not shifted systematically across decades. This time series allows us to study the local and remote drivers of the interannual variability and seasonality of Swiss hail occurrence.

1 Introduction

During the convective season, hail causes substantial damage to agriculture, cars, and buildings in Switzerland (BAFU, 2012). One extreme hailstorm on 21 June 2021 caused building damage of CHF 400 million (approx. EUR 415 million) in a single canton alone (Schmid et al., 2024; Kopp et al., 2023). Addressing hail hazard is challenging, as hail is associated with complex interactions of thunderstorm dynamics with microphysical processes that are modulated by synoptic-scale dynamics. Predicting the development and evolution of convective storms is especially challenging in the complex topography of western Europe. Orography such as the Alps and Jura Mountains can initiate or modulate convection, for example by increasing environmental wind shear, which can lead to stronger storm organization (Kaltenböck and Steinheimer, 2015; Kunz et al., 2018). In a changing climate, we may also expect changes in hail frequency and intensity. Although some studies report indications of increasing hail frequency and size (Púčik et al., 2019; Raupach et al., 2023a; Battaglioli et al., 2023a) and hail damage (Willemse, 1995) in Europe, other studies show a negative trend or no trend (Manzato et al., 2022; Augenstein et al., 2023). Trends in damage are not necessarily driven by trends in the hazard. Damage is also linked to exposure and vulnerability and undergoes changes with urban expansion and changes in the built infrastructure.

The pre-Alpine regions north and south of the Alps are regularly affected by severe hailstorms (Nisi et al., 2016; Fluck et al., 2021). Swiss hail occurrence exhibits a strong year-to-year variability and follows a pronounced seasonal cycle (Schröer et al., 2023). Recent studies (Nisi et al., 2018, 2020; Barras et al., 2021; Schröer et al., 2023) have highlighted substantial differences in both interannual and intra-annual hail variability between the northern and southern sides of the Alps. In the northern domain the peak of the convective season typically occurs in June, whereas in the south, it occurs in July (Fig. 2). Moreover, the occurrence of hail-prone and hail-sparse years differs between the two regions.

In contrast to North America, where important drivers of the year-to-year variability of severe convection and hail have been well studied (Tippett et al., 2015; Allen et al., 2020; Taszarek et al., 2020a; Nixon et al., 2023), a thorough examination of the long-term variability of hail in Switzerland is currently lacking. The lack of long-term direct hail observations often hinders the analysis of hail frequency patterns and variability (Martius et al., 2015). To be able to analyze long-term trends and variability in hail occurrence, we need a hail time series longer than anything currently available. Environmental hail proxies derived from sounding, reanalysis, or model data combined with statistical models are typically used to create such extended time series. The primary advantage of reanalysis data is their spatial and temporal coverage and their availability over long time periods. Here, we use ERA5 data to produce a multidecadal daily hail time series for northern and southern Switzerland from 1959 to 2022. ERA5 is considered one of the most reliable reanalyses in representing convective storm environments (Li et al., 2020; Taszarek et al., 2020b; Pilguj et al., 2022; Varga and Breuer, 2022; Wu et al., 2024).

The development of deep moist convection requires an unstable atmosphere, sufficient moisture at low levels, sufficient vertical wind shear, and an initiation mechanism (Johns and Doswell, 1992; Doswell et al., 1996). For hailstones to form in a storm, three additional elements are needed: an embryo particle (typically graupel or frozen drops), an abundance of supercooled liquid water, and sufficient time for the hailstone to grow within the storm's updraft (Allen et al., 2020; Kumjian and Lombardo, 2020; Kumjian et al., 2021). Regional characteristics such as terrain barriers, local wind systems, and warm water surfaces influence the relative importance of these elements necessary for hailstorm development, which is why this study looks at the regions north and south of the Alps separately.

Convection in the region south of the Alps is influenced by the transport of moist and warm air masses originating from the Adriatic and Mediterranean seas during southwesterly or southern flow conditions (e.g., Nisi et al., 2016). These air masses create ideal conditions for convective storm development when coupled with local wind systems such as mountain–plain circulations and valley breezes. Previous studies have highlighted the relevance of anabatic–katabatic wind systems to hail formation in the southern pre-Alpine region and specifically in the Po Valley (Morgan, 1973; Gladich et al., 2011). The southern domain is shielded from northern air masses by the Alpine chain, whereas the northern domain is regularly exposed to frontal systems originating from the west or north (Schemm et al., 2016).

Due to these unique topographic and synoptic conditions, predicting hailstorm formation in Switzerland requires region-specific models that consider individual interactions. Various atmospheric variables have been used in statistical models to predict severe hail-producing thunderstorms in Europe (Groenemeijer and van Delden, 2007; Kunz, 2007; García-Ortega et al., 2012; Manzato, 2012; Mohr and Kunz, 2013; Gascón et al., 2015; Púčik et al., 2015; Tuovinen et al., 2015; Melcón et al., 2017). There are regional differences from the United States (Brooks et al., 2003; Rasmussen, 2003; Johnson and Sugden, 2021; Taszarek et al., 2020a; Nixon et al., 2023) and Australia (Allen et al., 2011; Raupach et al., 2023a). Mohr and Kunz (2013) and Kunz (2007) presented a comprehensive list of hail-relevant meteorological parameters and indices that can be used as environmental proxies for Europe, and Huntrieser et al. (1997) presented a list specifically for Switzerland.

The parameters and indices can be grouped into three categories: instability and moisture, which are both thermodynamic, and kinematic conditions. Latent, conditional, and potential instabilities are captured by indices such as convective available potential energy (CAPE) (Moncrieff and Miller, 1976), the lifted index (Galway, 1956), the vertical totals index (Miller, 1972), the Boyden index (Boyden, 1963), the Showalter index (Showalter, 1953), and the KO index (Andersson et al., 1989). Other indices combine all three instabilities, such as the total totals (Miller, 1972) and K indices (George, 1961). Other indices measure the tropospheric moisture content, such as vertically integrated liquid water (Greene and Clark, 1972), and kinematic conditions, such as the magnitude of the vertical wind shear (Weisman and Klemp, 1982, 1984). Composite parameters that combine kinematic and thermodynamic variables such as the SWISS index (Huntrieser et al., 1997), the significant hail parameter (SHIP), and the hail size index (HSI) also correlate well with the occurrence of large hail (Allen et al., 2015; Czernecki et al., 2019; Gensini et al., 2021; Johnson and Sugden, 2021). The indices are then used in statistical models to estimate the occurrence of hail.

For instance, Mohr et al. (2015a) used a logistic regression approach to estimate the potential for hailstorms in Germany between 1971 and 2000 and between 2021 and 2050. They find that the potential for hail events is projected to increase significantly in 2021–2050 compared to 1971–2000 in the northwest and south of Germany.

Logistic regression has also been used by Billet et al. (1997), Schmeits et al. (2005), Sánchez et al. (2009), and López et al. (2007) to model thunderstorm and hail events. Recently, Battaglioli et al. (2023a) created a logistic generalized additive model for Europe and the United States from European Severe Weather Database (ESWD) reports and ERA5 data to model trends of large hail (>2 and >5 cm) occurrence. They presented a significant increase in hail frequency in northern Italy and parts of southern Switzerland. Allen et al. (2015) developed a Poisson regression from monthly averages to connect monthly hail frequency to the large-scale atmospheric environment in the United States. Madonna et al. (2018) presented a Poisson regression hail model using radar and ERA5 data specifically for northern Switzerland. Their model captured the intra-annual and interannual hail variability well, and their time series showed an increase of 0.5 hail days per month per decade.

We build on the work of Madonna et al. (2018), but in this study, we increase the resolution of the analysis to daily, we additionally include the south of Switzerland, and we extend the time series back to 1959. Unlike Battaglioli et al. (2023a), who used ESWD severe-weather reports, we use Swiss radar data as proxies to model hail day occurrence. Furthermore, we employ an ensemble of two statistical models, a logistic multiple regression and a logistic generalized additive model (GAM), to leverage the best-fitting predictors for each domain individually. Our statistical models are tailored to Switzerland. Our goal is not to build a forecasting model but to produce the best possible reconstruction of past hail days in Switzerland from environmental predictor variables. The statistically modeled time series will then be used to study long-term trends and changes in the frequency, seasonality, and variability of model-derived Swiss hailstorms in past decades.

The paper is structured as follows. Section 2 provides an overview of the datasets used in this study and is followed by a description of methods in Sect. 3. Model building and performance are explained in Sect. 4. Results from time series analyses are presented in Sect. 5, which are discussed in Sect. 6. Conclusions follow in Sect. 7.

2 Data

2.1 Radar-derived probability of hail

This study uses the radar- and model-based probability of hail (POH) product as a proxy for hail. POH is an empirical hail detection algorithm from MeteoSwiss that indicates the probability of hail of any size on the ground from 0 % to 100 %. The estimate follows the method of Foote et al. (2005) and Waldvogel et al. (1979) and is based on the vertical distance between the 45 dBZ echo top height measured by the Swiss radar network and the freezing level height obtained from the COSMO-CH numerical weather forecast model (Baldauf et al., 2011); see Nisi et al. (2016) and Kopp et al. (2024) for a detailed description of the POH algorithm. POH is currently available from 2002 to 2024 in 5 min and daily time intervals on a 1 km × 1 km Cartesian grid. The third-generation Swiss radar network, which from 2002 to 2012 consisted of three single-polarization Doppler C-band radars, was updated to the more advanced fourth-generation dual-polarization Doppler C-band radars in 2012. Subsequently, two additional radars were installed in mountainous regions at high elevations, where orographic beam blocking minimized low-level interference from the other three radars. We use thoroughly quality-checked and reprocessed POH data from the recently published Swiss hail climatology (Trefalt et al., 2023; Schröer et al., 2023) and consider areas within a 140 km radius around the five radar stations (Fig. 1). The 140 km radius limitation helps minimize planar artifacts and ground clutter.

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f01

Figure 1. The dots indicate the locations of the five radars (La Dôle, Albis, Monte Lema, Plaine Morte, and Weissfluh). The shading indicates the two study areas north of the Alps (blue) and south of the Alps (orange). The areas are within a 140 km radius of the five MeteoSwiss weather radars (black circles) overlaid on a digital elevation map (gray shading, source: Federal Office of Topography swisstopo).

The central Alps are excluded from the analysis because hail rarely occurs there (Van Delden, 2001; Giaiotti et al., 2003; Nisi et al., 2016) and radar quality may be lower (Feldmann et al., 2021). The central Alps are delineated from the northern and southern pre-Alps by the boundaries of the official prognosis regions from the Federal Office of Meteorology and Climatology MeteoSwiss. This selection of the study domains allows the climatological regimes north and south of the Alps to be separated and corresponds to those in Barras et al. (2021). Comparing POH data with car insurance loss data, Nisi et al. (2016) showed that a POH threshold of 80 % best represents hail locally. Note that damage occurs to cars with hailstone sizes of around 2 cm and larger. More information on the definition of hail days is provided in Sect. 3.1.

2.1.1 ERA5 environmental predictors

For multidecadal analyses, ERA5 is the best product currently available for Europe. Therefore, we use ERA5 reanalysis data to quantify the hail potential of the atmosphere (Hersbach et al., 2020). In this work, data from 1959 to 2022 at hourly and 6-hourly intervals were used, including model levels (137 levels from 1000 to 1 hPa, 0.5° × 0.5° grid spacing), pressure levels (17 levels, 0.5° × 0.5° grid spacing), and surface data (0.25° × 0.25° grid spacing). We exclude any data before 1959 from our analysis because the quality of ERA5 declines in those years (Bell et al., 2021) and cannot be used to analyze trends. A total of 75 convective parameters were calculated (Table S1 in the Supplement).

Statistical models classifying hail events typically select the ERA5 grid point that is temporally and spatially closest to the hail incident. However, such a selection is not possible for reconstructing past hail events because no information is available on the hail event prior to the observational period. Therefore, to model the occurrence of a hail day, we calculate ERA5 profiles averaged across the entire northern or southern domains at 12:00 UTC. The values at 12:00 UTC exhibited the highest predictive skill, which may be attributed to the fact that most storms in Switzerland occur in the late afternoon (e.g., Nisi et al., 2016, 2018). Thus, the 12:00 UTC value is most likely to capture the atmospheric conditions before storm formation.
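The domain-averaging step can be illustrated with a minimal sketch. It assumes a hypothetical in-memory layout in which each ERA5 grid point is a record holding the hour of the time step, a flag for domain membership, and the value of one convective parameter; the names `GridValue` and `domain_mean_12utc` are illustrative, and the unweighted mean is an assumption, as the text does not state whether grid points are area-weighted.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class GridValue:
    hour_utc: int     # hour of the reanalysis time step (0-23)
    in_domain: bool   # True if the grid point lies inside the study domain
    value: float      # one convective parameter at that grid point


def domain_mean_12utc(values):
    """Unweighted mean over all in-domain grid points at 12:00 UTC (assumption)."""
    selected = [v.value for v in values if v.hour_utc == 12 and v.in_domain]
    if not selected:
        raise ValueError("no in-domain grid points at 12:00 UTC")
    return fmean(selected)
```

Applied once per day and per parameter, this yields one predictor value per domain and day, which is the form the statistical models described later take as input.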

Our definition of hail days focuses on days with more than a single hail cell. The thresholds are set to capture events that led to damage and affected somewhat larger areas (probability of hail ≥ 80 % over a minimum area of 580 km² for the northern domain and 499 km² for the southern domain, as detailed in Sect. 3.1).

2.2 Historic hail data

To check plausibility, we compared the modeled time series to a historical hail dataset that is a qualitative combination of multiple data sources, mainly crop damage reports, extending back to 1825, and early radar data, including research radar data extending back to 1983 (Müller and Schmutz, 2021). Most relevant for our study period, 1959–2022, is the agricultural crop damage data archive by the Swiss agricultural hail insurance company Schweizer Hagel. Radar-based measurements complement the archive after 2002. The historical information is temporally resolved on a daily scale and spatially resolved on a municipality scale. From this information, we derived a time series with binary hail information using a threshold of five affected municipalities. The threshold was selected to best match the annually averaged hail days derived from POH data (see Sect. 3.1). This historical data archive is subject to significant uncertainties, including reporting biases, changing vulnerabilities and exposures of crop cultures, hail prevention measures, the fraction of insurance partition, and mergers of municipalities (Willemse, 1995). Due to these limitations, the historical data cannot be interpreted as a homogeneous time series, and a quantitative comparison is impossible. However, the data contain valuable information on the least and most active hail years and an indication of multiyear variability, which can complement model evaluation.

3 Methods

In this section, we first provide an explanation of how hail days are extracted using the probability of hail (POH) radar proxies and then analyze the distribution of the POH time series.

3.1 POH time series

To identify hail days in northern and southern Switzerland, we use daily POH data from 2002 to 2022 during the hail-prone months of April to September. We use the same domains and area thresholds as Barras et al. (2021). The daily area of POH ≥ 80 % is extracted separately for the domains north and south of the Alps (Fig. 1). To qualify as a hail day, the daily maximum POH must reach or exceed 80 % over an area of at least 580 km² in the northern domain and 499 km² in the southern domain. Barras et al. (2021) determined that these thresholds correlate best with days when car damage was reported across Switzerland from 2002 to 2012. This definition implies hail large enough to cause damage to cars, approximately 2 cm in size. The sensitivity of our models to this threshold was tested by varying the area threshold. We found no significant impact on misses or false alarms, consistent with earlier studies indicating low sensitivity to area thresholds (Madonna et al., 2018). These criteria yield 566 hail days in the northern domain and 560 in the southern domain between 2002 and 2022. The a priori probability of hail days between 1 April and 30 September is 14.7 % in the north and 14.5 % in the south.
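The hail day criterion can be sketched as follows. The sketch assumes the daily maximum POH values of all grid cells inside one domain are available as a flat sequence and that every cell of the 1 km × 1 km grid covers exactly 1 km²; the function and constant names are illustrative and not taken from the study.

```python
# Minimum footprint (km^2) of POH >= 80 % per domain, as stated in the text.
AREA_THRESHOLD_KM2 = {"north": 580.0, "south": 499.0}
POH_THRESHOLD_PERCENT = 80.0
CELL_AREA_KM2 = 1.0  # assumption: each cell of the 1 km x 1 km grid covers 1 km^2


def hail_area_km2(daily_max_poh):
    """Area covered by cells whose daily maximum POH reaches or exceeds 80 %."""
    n_cells = sum(1 for poh in daily_max_poh if poh >= POH_THRESHOLD_PERCENT)
    return n_cells * CELL_AREA_KM2


def is_hail_day(daily_max_poh, domain):
    """True if the POH >= 80 % footprint reaches the domain's area threshold."""
    return hail_area_km2(daily_max_poh) >= AREA_THRESHOLD_KM2[domain]
```

For example, a day on which 580 cells of the northern domain reach a POH of 85 % counts as a hail day, whereas 498 such cells in the southern domain fall just short of the 499 km² threshold.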

On average, 27.0 hail days per year occur in the north and 26.7 in the south. A maximum of 44 hail days was recorded in the north in 2009 and a maximum of 37 in the south in 2018. A minimum of 16 hail days occurred in the northern domain in 2020 and a minimum of 17 hail days in the southern domain in 2007.

There is considerable interannual variability with domain-specific differences during the observation period (Fig. 2a). While the most recent years in the south show a frequency above the average, the opposite is true in the north. Yet, with a time series of only 20 years, we cannot assess or interpret trends in a robust way.

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f02

Figure 2. The number of yearly (a) and monthly (b) hail days for the northern (blue) and southern (orange) domains for 2002–2022.

Hail is a seasonal phenomenon with a strong annual cycle in both domains (Fig. 2b). In the north, hail is most frequent in June, with a total of 166 hail days, followed by July with 157 hail days. In the south, hail is most frequent in July with 189 hail days.

4 Statistical model development and model performance

This section offers an overview of the development of the four statistical models and an evaluation of their performance. We discuss the development and performance of the individual logistic regression models (Sect. 4.1), the generalized additive models (GAMs) (Sect. 4.2), and the ensemble prediction (Sect. 4.3).

4.1 Logistic regression

Applequist et al. (2002) suggest multiple logistic regression as an appropriate tool for a binary classification problem, and logistic regression models have been used effectively in many studies to model the occurrence of hailstorms and thunderstorms (e.g., Billet et al., 1997; Schmeits et al., 2005; Sánchez et al., 2009; Battaglioli et al., 2023a). A multiple logistic regression model predicts the occurrence probability p of hail as a function of several environmental parameters (x1, x2, …, xn) as independent variables (Hosmer and Lemeshow, 2000). A binary variable, here hail yes/no, is defined as a dependent variable y. The occurrence probability p(x) is defined as

$$ y = p(x) = \frac{1}{1 + e^{-g(x)}}, \quad \text{where } 0 \le p(x) \le 1. \tag{1} $$

The model is based on a linear regression:

$$ g(x) = \beta_0 + \beta_1 \times x_1 + \beta_2 \times x_2 + \dots + \beta_n \times x_n. \tag{2} $$
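Equations (1) and (2) can be written out as a minimal sketch. The function names are illustrative, and any coefficient values passed in are placeholders rather than fitted values from this study.

```python
import math


def linear_predictor(intercept, coefficients, predictors):
    """Eq. (2): g(x) = beta_0 + sum_i beta_i * x_i."""
    if len(coefficients) != len(predictors):
        raise ValueError("need exactly one coefficient per predictor")
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))


def hail_probability(g):
    """Eq. (1): p = 1 / (1 + exp(-g)), evaluated without overflow for large |g|."""
    if g >= 0:
        return 1.0 / (1.0 + math.exp(-g))
    e = math.exp(g)
    return e / (1.0 + e)
```

A linear predictor of g = 0 corresponds to a probability of 0.5; increasingly negative values of g push the probability towards 0 and increasingly positive values towards 1.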

We computed the regression coefficients βn in R with the glm function (stats package) using the maximum-likelihood method. The dataset was divided into training and test sets by distributing data points from 2012 to 2022 randomly into 70 % and 30 %. Additionally, we used the POH data from 2002 to 2011 as an independent validation set to prevent overfitting. To estimate the performance of the model, we used 10-fold cross-validation. A total of 75 different convective and meteorological parameters were tested as predictors xn (Table S1). The best models were chosen by comparing multiple performance metrics. We considered the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the critical success index (CSI) or threat score, the probability of detection (POD), the false alarm ratio (FAR), the success ratio (SR), and the Heidke skill score (HSS), as well as the bias, precision, and accuracy values. The metrics were calculated from contingency tables by averaging over the 10 test, training, and validation data subsets. Equations for the contingency table metrics, AIC, and BIC can be found in Table A1.
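The contingency-table metrics can be sketched as follows. The sketch uses the standard textbook definitions, which are assumed here and may differ in detail from those in Table A1; the function names are illustrative, and the scores are only defined when no denominator is zero.

```python
def contingency_table(observed, predicted):
    """Count hits, false alarms, misses, and correct negatives for binary series."""
    hits = false_alarms = misses = correct_negatives = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            hits += 1
        elif pred:
            false_alarms += 1
        elif obs:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def verification_scores(hits, false_alarms, misses, correct_negatives):
    """Standard categorical scores (assumed definitions, see lead-in)."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d
    return {
        "POD": a / (a + c),        # probability of detection
        "FAR": b / (a + b),        # false alarm ratio
        "SR": a / (a + b),         # success ratio = 1 - FAR
        "CSI": a / (a + b + c),    # critical success index (threat score)
        "bias": (a + b) / (a + c),
        "accuracy": (a + d) / n,
        "HSS": 2.0 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
    }
```

Because 1/CSI = 1/POD + 1/SR − 1, a model with both a higher POD and a lower FAR necessarily has a higher CSI when the scores come from the same contingency table.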

We use a combination of multiple metrics to build a model with the optimal balance between over- and underfitting. The correct prediction of hits is of slightly greater importance than the avoidance of false alarms because finding hail days is our main priority. We also avoided multicollinearity between predictor variables by requiring the variance inflation factor (VIF; Mansfield and Helms, 1982) of any predictor to remain below 4. We use a probability threshold of p(hail) ≥ 0.4 for the north and p(hail) ≥ 0.44 for the south to identify hail days. These thresholds were identified by examining ROC curves and plots of modeled vs. observed hail days.
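How the probability thresholds turn modeled probabilities into hail days, and how a single point of a ROC curve is obtained, can be sketched as follows. The threshold values are those given in the text; the function names are illustrative, and `roc_point` assumes that both hail and no-hail days are present in the observations.

```python
# Probability thresholds for identifying hail days, as stated in the text.
PROBABILITY_THRESHOLD = {"north": 0.40, "south": 0.44}


def classify_hail_days(probabilities, domain):
    """Flag a day as a hail day when p(hail) reaches the domain's threshold."""
    threshold = PROBABILITY_THRESHOLD[domain]
    return [p >= threshold for p in probabilities]


def roc_point(observed, probabilities, threshold):
    """Return (POFD, POD) for one candidate threshold, i.e. one point of a ROC curve."""
    hits = misses = false_alarms = correct_negatives = 0
    for obs, p in zip(observed, probabilities):
        pred = p >= threshold
        if obs and pred:
            hits += 1
        elif obs:
            misses += 1
        elif pred:
            false_alarms += 1
        else:
            correct_negatives += 1
    pod = hits / (hits + misses)
    pofd = false_alarms / (false_alarms + correct_negatives)
    return pofd, pod
```

Evaluating `roc_point` over a range of candidate thresholds traces out the ROC curve from which a threshold can be chosen.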

A residual analysis was performed to ensure no systematic errors remained in the model residuals. We looked at the yearly and monthly averaged residuals of the 10 training and test data subsets separately. A strong increase in the variance of yearly residuals was present in data points before 2012, which warranted our decision to only use POH data from 2012 onwards for training. In 2012, the Swiss radar network underwent a major update. Even though we reduced the size of the training dataset, the predictive skill of the models for both domains increased slightly. Furthermore, we introduce a categorical variable, month, as an additive factor in both models, containing the 6 months of April through September. This addition was intended to reduce the nonstationarities associated with a seasonal cycle. Residuals were more regular after the inclusion of the month factor, and the model's predictive skill increased.

To find the optimal number of predictors, we applied a manual stepwise forward method, resulting in five predictors and the month factor for both the northern and the southern models. The best logistic models for each domain are

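$$ g(\mathrm{hail}) = \beta_0 + \beta_1 \times \mathrm{LI} + \beta_2 \times \mathrm{TT} + \beta_3 \times \mathrm{omega\_vint} + \beta_4 \times \mathrm{q\_vint} + \beta_5 \times \mathrm{BI} + \sum_{n=5}^{9} \beta_{n+1} \times \mathbf{1}_{\mathrm{month}=n} \tag{3} $$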

for the northern model and

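$$ g(\mathrm{hail}) = \beta_0 + \beta_1 \times \mathrm{LI} + \beta_2 \times \mathrm{KI} + \beta_3 \times \mathrm{v\_500} + \beta_4 \times \mathrm{SP} + \beta_5 \times \mathrm{TT} + \sum_{n=5}^{9} \beta_{n+1} \times \mathbf{1}_{\mathrm{month}=n} \tag{4} $$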

for the southern model.
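The structure of Eqs. (3) and (4), five continuous predictors plus a categorical month factor, can be sketched as follows. The sketch assumes treatment (dummy) coding with April as the reference level, so that May to September each receive their own coefficient; this coding, the function names, and the ordering of the coefficient vector are assumptions for illustration and not confirmed by the text.

```python
# Assumption: April (4) is the reference level; May-September get indicator terms.
MONTH_LEVELS = (5, 6, 7, 8, 9)


def month_indicators(month):
    """One 0/1 indicator per non-reference month (all zeros for April)."""
    if month != 4 and month not in MONTH_LEVELS:
        raise ValueError("month must lie between April (4) and September (9)")
    return [1.0 if month == m else 0.0 for m in MONTH_LEVELS]


def linear_predictor_with_month(beta, predictors, month):
    """g(hail) for beta = [b0, b1..b5 (continuous predictors), b6..b10 (months)]."""
    if len(beta) != 11 or len(predictors) != 5:
        raise ValueError("expected 11 coefficients and 5 predictors")
    g = beta[0] + sum(b * x for b, x in zip(beta[1:6], predictors))
    g += sum(b * i for b, i in zip(beta[6:], month_indicators(month)))
    return g
```

With this coding, the month term shifts the intercept of the linear predictor by a constant for each month relative to April, leaving the slopes of the continuous predictors unchanged.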

LI is the surface-based lifted index. TT is the total totals index, omega_vint is the vertically integrated vertical velocity, q_vint is the vertically integrated specific humidity, BI is the Boyden index, KI is the K index, v_500 is the meridional component of the wind at 500 hPa, and SP is the mean surface pressure. Descriptions, mean values, and percentiles of all variables can be found in Tables S1 to S3. A detailed evaluation of the performance of the final ensemble prediction is undertaken in Sect. 4.3. Here, we discuss the performance metrics of the logistic models summarized in Table 1.

Table 1. Performance metrics of the logistic model for north and south. Metrics are calculated from k-fold cross-validation and are the averages of the test datasets. For POD, CSI, HSS, AUROC, bias, precision, and accuracy, a value close to 1 indicates good performance, whereas FAR, AIC, and BIC should remain as low as possible.

The northern model has a higher POD, lower FAR, and higher CSI than the southern model. The performance metrics suggest that the northern model can distinguish better between hail and no-hail days and misses fewer hail days than the southern model. Nonetheless, compared with other studies, our models perform either better, with a lower FAR than all studies mentioned in Raupach et al. (2023a), or similarly, as in López et al. (2007) and Gascón et al. (2015).

All coefficients and p values of covariates are listed in Table B1. All model predictors except the categorical month factor are significant. Although the month factor was not significant, the model's performance decreased when the factor was removed. Possible explanations for the months not being significant in our model include that our sample size is too small for the effect to become significant and that there is multicollinearity between months in the model. Only LI and TT are selected in both models, albeit with different coefficients. The z values in Table B1 show that instability and moisture predictors (LI, KI, q_vint) have the highest feature importance in both models. The z value measures how many standard errors the estimated coefficient is from 0; hence, the higher the absolute value, the higher the importance.
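For reference, and assuming Table B1 reports the usual Wald statistic that R prints in the "z value" column of a glm summary, the z value of the j-th coefficient is its estimate divided by its standard error:

$$ z_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)} $$

As a purely hypothetical illustration (these numbers are not from Table B1), an estimate of −0.60 with a standard error of 0.12 gives z = −0.60 / 0.12 = −5.0, which would rank above a predictor with z = 2.0 in this measure of importance.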

To illustrate the modeled relationship between response and predictors, Figs. 3 and 4 show marginal response plots of the logistic models. In both figures, the response is plotted against each independent model covariate xn and against the linear combination of all covariates (bottom-right graph) with LOESS smooth functions. The model, represented by the dashed red line, matches the marginal relationships of the data represented by the solid blue lines, and hence all predictors are well fitted and do not need further modification. The gray points show the distribution of the covariates. Some variables have a stronger influence on the model's predicted probability than others. An LI of −5 K translates to a probability of hail of 60 % (Fig. 3a), whereas the highest probability of any omega_vint value reaches less than 30 % (Fig. 3c). In all models, two to three covariates mainly determine the hail occurrence probability, and the remaining covariates are used for fine-tuning.

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f03

Figure 3. Marginal model plots showing the modeled relationship of each covariate (x axes) to the modeled probability of a hail day (y axes) given all other covariates are held constant at their mean value. The bottom-right graph shows the linear combination of all covariates in their mean function. The model is represented by the dashed red lines, and the marginal relationships of the data are represented by the solid blue lines. The gray points show the distribution of covariates. Some variables have a stronger influence on the model's predicted probability than others.

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f04

Figure 4. As Fig. 3 but for the southern domain.

We next briefly discuss how each selected model predictor is connected to environments favoring hail. The surface-based lifted index (LI) is a measure of stability of the atmosphere and is defined as the difference between the temperature at 500 hPa and the temperature of a parcel that is lifted from the surface to its lifted condensation level (LCL) dry adiabatically and then pseudo-adiabatically to 500 hPa. A negative LI indicates atmospheric instability, which is favorable for the development of convective storms. The lower the LI, the more unstable the atmosphere (hailstorms possible at an LI of approx. −4 K; Kunz, 2007). This relationship matches the models' fitted negative linear relationship in both domains (Figs. 3a and 4a).

The total totals (TT) index combines two components, the vertical totals (VT) and the cross totals (CT). The VT reflects static stability or the lapse rate between 850 and 500 hPa. The CT includes the 850 hPa dew point temperature. As a result, TT increases with decreasing static stability and increasing 850 hPa moisture, but it does not capture the moisture below the 850 hPa level. Additionally, convection may be inhibited despite a high TT value if a significant capping inversion is present. A TT of 50 K or larger usually indicates that hailstorms are possible (Mohr and Kunz, 2013). In the northern and southern models, the probability of hail exceeds 50 % at TT values of approximately 52 K (Figs. 3b and 4e).
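As a minimal sketch, the TT index can be computed from its two components using the commonly cited definitions VT = T850 − T500 and CT = Td850 − T500, which are assumed here because the text does not spell them out; the function names and the input values in the example are illustrative.

```python
def vertical_totals(t850, t500):
    """VT: temperature difference between 850 and 500 hPa."""
    return t850 - t500


def cross_totals(td850, t500):
    """CT: 850 hPa dew point minus 500 hPa temperature."""
    return td850 - t500


def total_totals(t850, td850, t500):
    """TT = VT + CT; all inputs must share one unit (degC or K)."""
    return vertical_totals(t850, t500) + cross_totals(td850, t500)
```

With hypothetical values of T850 = 12 °C, Td850 = 8 °C, and T500 = −16 °C, VT is 28 K, CT is 24 K, and TT is 52 K. Because TT only contains temperature differences, the result is the same whether the inputs are given in °C or K.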

The K index (KI), like the VT, is based on the vertical temperature gradient between 850 and 500 hPa and dew point temperatures at 850 and 700 hPa. Higher humidity at 850 hPa, expressed by higher dew point temperatures at 850 hPa, increases the KI. Furthermore, lower humidity at higher levels (700 hPa) decreases the chance of thunderstorms or hailstorms occurring. The higher the KI, the higher the probability of a hailstorm. A KI above 20–30 K usually indicates possible thunderstorms or hailstorms (Kunz, 2007), which matches our relationship of KI to hail in the southern model (Fig. 4b).
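A minimal sketch of the K index follows, using the commonly cited definition KI = (T850 − T500) + Td850 − (T700 − Td700), which is assumed here because the text describes the ingredients but not the formula; the function name and the example values are illustrative.

```python
def k_index(t850, t700, t500, td850, td700):
    """K index from temperatures and dew points given in degrees Celsius.

    The lapse-rate term (t850 - t500) and the low-level moisture term (td850)
    raise the index; the 700 hPa dew point depression (t700 - td700) lowers it.
    """
    return (t850 - t500) + td850 - (t700 - td700)
```

With hypothetical values of T850 = 12 °C, T500 = −16 °C, Td850 = 8 °C, T700 = 2 °C, and Td700 = −3 °C, the index is 28 + 8 − 5 = 31. Unlike TT, the KI contains an unpaired dew point term, so the inputs must be in °C rather than K.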

The Boyden index (BI) was originally developed to assess the thunderstorm risk in frontal passages. This convective parameter does not include information on humidity. It considers the temperature at 700 hPa and the thickness of the 1000–700 hPa layer, which is proportional to its mean temperature. The higher the value of the BI, the greater the risk of thunderstorms. The threshold value for thunderstorms is approximately 95 (Boyden, 1963), which is slightly higher than what the model learns for the northern domain (50 % probability of hail at a BI greater than approx. 90, Fig. 3e). As mentioned in Sect. 1, on the north side of the Alps, around 20 %–40 % of Swiss hailstorms are associated with fronts, which is probably why that parameter was chosen and why it is highly important in the model.
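A minimal sketch of the Boyden index follows, using the commonly cited form BI = 0.1 × (Z700 − Z1000) − T700 − 200, with geopotential heights in metres and the 700 hPa temperature in °C. This form is assumed here because the text names the ingredients but not the formula; the function name and the example values are illustrative.

```python
def boyden_index(z700_m, z1000_m, t700_c):
    """Boyden index from the 1000-700 hPa thickness (m) and 700 hPa temperature (degC)."""
    thickness_dam = (z700_m - z1000_m) / 10.0  # layer thickness in decametres
    return thickness_dam - t700_c - 200.0
```

With hypothetical heights of Z700 = 3100 m and Z1000 = 120 m and a 700 hPa temperature of 4 °C, the thickness is 298 dam and the index is 298 − 4 − 200 = 94. A thicker (warmer) layer below a colder 700 hPa level therefore raises the index.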

The vertically integrated vertical velocity (omega_vint) denotes the vertical motion of air throughout the atmospheric column and primarily reflects large-scale synoptic ascent or descent. In our model, the highest probabilities of hail occur when omega_vint values are negative (Fig. 3c), signifying large-scale ascent.

The vertically integrated specific humidity (q_vint) quantifies the total amount of water vapor available in the atmospheric column and thus indicates the moisture available for hailstorm development. Consequently, a higher q_vint increases hail day probability (Fig. 3d).

Finally, v_500, the meridional component of the wind at 500 hPa, and the mean surface pressure (SP) might be connected to hailstorm development indirectly. Our model shows that the highest probabilities of hail are achieved with neither very high nor very low pressure (Fig. 4d). A positive sign of v_500 indicates air moving northwards at 500 hPa, which the model translates to higher probabilities of hail in the southern domain (Fig. 4c). This indication could be related to a synoptic situation in the south of Switzerland, where moist, warm air is transported from the Mediterranean towards the Alps (Schemm et al., 2016). The lack of a kinematic predictor in the northern model is discussed further in Sect. 6.1.

All these connections are part of a complex interplay of atmospheric conditions that contribute to hailstorm development. Therefore, we examine combinations of various parameters to assess the likelihood of hailstorms in our models. When the variable combinations from the northern model are applied to the southern domain and vice versa, the coefficients change and the predictive skill declines. This difference in coefficients and predictive skill underlines the necessity of using unique sets of predictors for each domain instead of a single model across all of Switzerland.

Automatic predictor selection procedures such as recursive feature importance and LASSO gave worse-performing models than a manual stepwise approach combined with expert knowledge that was based on earlier considerations of optimal distribution separations of hail vs. no-hail days (Trefalt, 2017) and computed correlations (Figs. S1 and S2 in the Supplement). Further discussions on variable selection and their importance follow in Sect. 6.

4.2 Generalized additive models (GAMs)

As mentioned before, the use of a generalized additive model (GAM) was warranted to account for potential nonlinear and nonparametric correlations in the data that may not be adequately captured by a conventional logistic regression. A GAM is a generalized linear model in which the response variable depends linearly on the smooth functions of the model's predictor variables (Hastie and Tibshirani, 1987). The logistic equation from before (Eq. 2) becomes

(5) $g(x) = \beta_0 + f_1(x_1) + f_2(x_2) + \dots + f_n(x_n).$

The nonparametric form of the functions f_n enhances the flexibility of the model, but it also imposes constraints on additivity, allowing us to interpret the model in a similar manner as the multiple logistic regression. CAPE appeared more often as a model predictor in the GAMs than in the logistic regression models during model training. Nevertheless, the best model for the northern domain preferred the LI over CAPE. The selection of predictors followed the same procedure as in the logistic regression model. For every variable that presented an effective degree of freedom (edf) > 1, a smoothing spline function was applied to allow for nonlinear effects. The model was fitted with the mgcv R package.
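
The additive structure of Eq. (5) can be sketched as follows. This is not the fitted mgcv model: the smooth functions are hypothetical piecewise-linear placeholders on the logit scale, the intercept is invented, and all names are illustrative. The sketch only shows how partial effects add up before the logistic link turns the sum into a probability.

```python
import math


def piecewise_linear(knots):
    """Return a function that linearly interpolates (x, y) knots and clamps outside them."""
    xs, ys = zip(*sorted(knots))

    def f(x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
            if x0 <= x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return f


def gam_probability(beta0, smooths, covariates):
    """Evaluate g(x) = beta0 + sum_j f_j(x_j) and apply the logistic link."""
    g = beta0 + sum(smooths[name](value) for name, value in covariates.items())
    return 1.0 / (1.0 + math.exp(-g))


# Placeholder partial effects (NOT fitted values), hypothetical shapes only.
smooths = {
    "TT": piecewise_linear([(40.0, -2.0), (47.0, 0.0), (55.0, 2.0)]),
    "CAPE": piecewise_linear([(0.0, -1.0), (100.0, 0.0), (500.0, 1.0), (2000.0, 1.2)]),
}

p_low = gam_probability(-1.5, smooths, {"TT": 42.0, "CAPE": 20.0})
p_high = gam_probability(-1.5, smooths, {"TT": 52.0, "CAPE": 800.0})
print(round(p_low, 3), round(p_high, 3))
```

Because the terms are additive, each placeholder curve can be read on its own, which is the property that makes partial dependence plots interpretable.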

The best GAM in the northern domain is

(6) $g(x) = \beta_0 + f_1(\mathrm{LI}) + f_2(\mathrm{KI}) + f_3(\mathrm{TT}) + f_4(\mathrm{z\_0\,^{\circ}C}) + f_5(\mathrm{WS\_06}) + f_6(\mathrm{WS\_36}) + \sum_{n=5}^{9} \beta_n \times \mathbf{1}_{\mathrm{month}=n+1}$

and for the southern domain

(7) $g(x) = \beta_0 + f_1(\mathrm{CAPE}) + f_2(\mathrm{WS\_06}) + f_3(\mathrm{Td\_2m}) + f_4(\mathrm{TT}) + f_5(\mathrm{omega\_500}) + \sum_{n=5}^{9} \beta_n \times \mathbf{1}_{\mathrm{month}=n+1}.$

Here LI is the surface-based lifted index, KI is the K index, TT is the total totals index, z_0 °C is the freezing level, CAPE is the most unstable convective available potential energy computed for parcels departing from model levels below the 350 hPa level, WS_06 is the magnitude of bulk wind shear between 10 m and 6 km, WS_36 is the magnitude of bulk wind shear between 3 and 6 km, Td_2m is the 2 m dew point temperature, and omega_500 is the vertical velocity at 500 hPa. The final five variables do not appear in the logistic regression models. The thresholds for identifying a hail day were set to p(hail)≥0.40 for the north and p(hail)≥0.41 for the south.

In the northern model the combination of LI and TT and in the southern model the combination of CAPE and TT lead to a strong increase in the performance of the model. We therefore allowed composite parameters such as TT in favor of a better predictive performance. The performance measures for the GAMs can be found in Table 2. Both GAMs perform very similarly to the logistic regression models. The northern GAM outperforms the southern model. Table B2 provides the coefficients and their corresponding p values for parametric covariates, and Table B3 details the nonparametric terms. Again, all model predictors except the month factor are significant. The models' explained variances are 63.1 % for the north and 45.5 % for the south.

Table 2. Performance metrics of the GAMs for north and south. Metrics are calculated from k-fold cross-validation and are the average of the test datasets. For POD, CSI, HSS, AUROC, bias, precision, and accuracy, a value close to 1 indicates good performance, whereas FAR, AIC, and BIC should remain as low as possible.

We can visualize the modeled relationship between the response and the covariates once again to reflect how each covariate is connected to hailstorm development. Figures 5 and 6 depict partial dependence plots for both GAMs. Each figure illustrates the partial effect of individual model covariates x_n on the probability of a hail day. The vertical black lines at the bottom represent the distribution of the covariates. The black lines in the gray band are smoothing functions that capture the modeled relationships. The horizontal red lines are the y=0 lines that separate the plot space into a positive and negative partial effect.

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f05

Figure 5. Partial dependence plots for each model covariate in the northern model. The solid black line and gray uncertainty range represent the modeled partial effect of the covariate on the response. The red y=0 lines separate positive from negative effects. The short black vertical lines indicate the covariate distribution.

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f06

Figure 6. The same as Fig. 5 but for the southern model.

In the southern model (Fig. 6a–f), the partial effect on the probability of hail is positive when TT≥47 K, Td_2m≥282 K, WS_06≥10 m s−1, and CAPE≥100 J kg−1 and when omega_500 is negative.

CAPE is a measure of the energy available for convection. Large positive values of CAPE indicate that an ascending air parcel would be much warmer than its surrounding environment and therefore very buoyant. High CAPE values indicate that high updraft speeds can occur within thunderstorms, allowing the sustained lifting of moist air to colder altitudes where hailstones can form and grow. Our model shows a strong positive effect of CAPE at values of approx. 500 J kg−1. The slope of the curve then flattens towards higher values, which are also where uncertainty increases (Fig. 6a).

WS_06 has a very similar relationship in the southern model, where at least 10 m s−1 is needed for a positive effect, but then the partial effect increases only slightly with increasing magnitude of deep-layer shear (Fig. 6b). Shear has the least importance of all predictors in the model.

The dew point temperature at 2 m (Td_2m) quantifies the temperature and moisture at the surface. Higher dew point temperatures imply higher surface temperatures and more moisture in the air. The release of latent heat due to the condensation of moisture enhances buoyancy and thus fosters the development of the strong updrafts necessary for hail formation. Our model shows the highest partial effect for hail occurrence with the highest dew point temperatures (Fig. 6c).

Similar to the vertically integrated vertical velocity omega_vint, the vertical velocity at 500 hPa (omega_500) is a measure of the vertical motion of air, here at the 500 hPa level. Negative values indicate upward motion. The highest positive effect is achieved with the strongest negative vertical velocities (Fig. 6e).

In the northern model, the partial effect on the predicted probability of the model is positive when LI≤0 K, TT≥45 K, and KI≥15 K (Fig. 5a–f). We explain the relationship of LI, KI, and TT to hailstorm development in Sect. 4.1. The GAMs fit similar linear relationships to the logistic regression models, with higher probabilities of hail achieved with increasing KI and TT and decreasing LI.

Notably, the deep-layer shear WS_06 exhibits a nonlinear relationship to the response variable. WS_06 has its most negative effect at values around 0–10 m s−1, transitioning to a positive effect above 15 m s−1 (Fig. 5d). The curve flattens at very high wind shear values, suggesting that higher shear does not further increase the probability of hail. Additionally, the confidence intervals of smoothing functions widen significantly towards the tails of each covariate distribution.

GAMs are not limited by multicollinearity between model terms, which is why both WS_36 and WS_06 were selected in the northern model. The model preferred including both WS_36 and WS_06 over either one of them, as the individual predictors otherwise became insignificant and less important. Surprisingly, WS_36 has a negative linear relationship with hail in northern Switzerland. To gain a deeper understanding of how the WS_36 and WS_06 model terms interact, we further examined contour plots depicting conditional probabilities based on pairs of model predictors (not shown). The highest probabilities of hail are achieved with high WS_06 but low WS_36 in the northern model. Trefalt (2017) also found higher WS_06 and lower WS_36 on hail days vs. non-hail days in northern Switzerland. This atypical relationship could stem from the unique environmental conditions in Switzerland compared to the idealized modeling studies conducted for individual hailstorms in the United States (Dennis and Kumjian, 2017; Nixon et al., 2023). It is plausible that the sensitivities to kinematic variables differ between regions due to varying atmospheric dynamics and topographical features. This is further discussed in Sect. 6.1. Conditional probabilities of hail based on the various predictors (not shown) indicate that WS_06 and WS_36 have very low importance in the GAMs compared to LI and TT.

The freezing level z_0 °C is indicative of the altitude at which freezing occurs in a thunderstorm. A lower freezing level suggests a greater potential for the survival of hail after it is formed due to a longer residence time of hail embryos in the hail growth zone and less melting of hailstones before they reach the surface. However, the model fits a contrasting relation. The probability of hail is the highest at freezing levels between 2500 and 3500 m a.g.l. (Fig. 5f). Punge et al. (2023) also found that at higher elevations (≈2000 m) in South Africa only a very small fraction of satellite-based hail detections and hail damage claims occurred at freezing levels below 2400 m a.g.l.

For freezing levels below 2500 m a.g.l., the model fits a linear relationship with a negative partial effect, indicating that lower values of z_0 °C correspond to lower hail probabilities. This relationship has also been seen before by Kunz (2007) and Trefalt (2017). It suggests that our model does not learn about the melting or growth of hail embryos from the freezing level but instead uses it as a proxy for surface temperature, as both are positively correlated (Table S3). Thus, the negative effect of low freezing levels on hail probability could be related to lower surface temperatures.

4.3 Ensemble prediction

For the final time series, we create an ensemble prediction that combines the outputs of the best logistic regression model (LRM) and the best GAM for each domain. The ensemble prediction is generated by averaging the predicted probabilities of the two models. We again conduct sensitivity tests to determine the best thresholds for discriminating between hail and no hail. These thresholds are identified as 40 % for the northern model and 42 % for the southern model. Overall, the ensemble prediction outperforms individual models across all skill metrics.
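
A minimal sketch of the averaging and thresholding step is given below. It assumes equal weights for the two models and a "greater than or equal" comparison at the threshold; the function name and the daily probabilities are hypothetical.

```python
def ensemble_hail_days(p_lrm, p_gam, threshold):
    """Average two models' daily probabilities and flag days at or above the threshold."""
    if len(p_lrm) != len(p_gam):
        raise ValueError("both models must provide one probability per day")
    p_ens = [(a + b) / 2.0 for a, b in zip(p_lrm, p_gam)]
    return p_ens, [p >= threshold for p in p_ens]


# Hypothetical daily probabilities for four days (illustrative only).
p_lrm = [0.10, 0.35, 0.55, 0.80]
p_gam = [0.20, 0.47, 0.15, 0.90]
p_ens, is_hail_day = ensemble_hail_days(p_lrm, p_gam, threshold=0.40)
print(is_hail_day)
```

On the third day the two models disagree, and the averaged probability falls below the threshold, so the day is not counted as a hail day.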

We evaluate the ability of the ensemble predictions to reproduce hail occurrence and its variability and seasonal cycle. Figure 7a and b show aggregated hail days from the model and from the POH time series over the period of 2002–2022 for the northern domain (Fig. 7a) and the southern domain (Fig. 7b). In both domains, the lines largely overlap, which means that the model reproduces intra-annual and interannual variability well. On closer examination, a mismatch becomes apparent for both domains for the period of 2002–2011. We excluded these data from the model building due to biases.

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f07

Figure 7. Observed and modeled number of hail days for the period of 2002–2022 (April to September) for the northern (a, c) and southern (b, d) domain. The gray lines are the observed number of hail days (POH ≥ 80 % over a minimum of 580 km² in the north and 499 km² in the south). The blue and orange lines are the number of hail days modeled from the ensemble predictions for the northern and southern models, respectively. Plots (a) and (b) show the absolute number of hail days per year, and (c) and (d) show the sum of hail days per month.

Generally, we see intra-annual and interannual variability in the skill of the statistical model in predicting hail days because some years and some months are predicted better than others (Figs. 7a, b and 8a, b). The overall correlation between the hail days per month and year of POH and the model is satisfactory, with 0.91 for the north and 0.87 for the south. Evaluating the model performance only for the years 2012–2022 yields slightly better values.

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f08

Figure 8. The observed number of hail days (POH ≥ 80 % over a minimum of 580 km² in the north and 499 km² in the south) plotted against the number of hail days modeled from the ensemble predictions for the northern domain (a, c) and southern domain (b, d) for the whole observational period of 2002–2022. Plots (a) and (b) show the absolute number of hail days per year, and (c) and (d) show the absolute sum of hail days per month. The black lines are the x=y lines, and the orange and blue lines are the fits to the orange and blue circles, respectively. Boxplots show the distributions of the samples.

Our model reproduces the seasonal pattern in both domains well (Fig. 7c and d). It captures the typical seasonal pattern with very few hail days at the beginning and end of the hail season and a peak during the warm summer months. In the north (Fig. 7c), hail days peak in June and July; in the southern domain (Fig. 7d), the peak is more prominent and occurs mainly in July. The difference in peaks again justifies the use of two separate models to account for the monthly differences in hail frequency. In months with fewer hail days, the models tend to underpredict slightly in both domains (Fig. 8c and d). The correlation between the monthly sum of hail days of the model and the POH is 0.99 for the north and 0.98 for the south.

The ensemble prediction mean POD is 0.77 for the north and 0.61 for the south with an SR (1 − FAR) of 0.77 and 0.63, respectively. CSI is 0.60 and 0.44, and bias is 0.98 and 0.88, respectively. The POD, FAR, CSI, and bias are calculated by averaging the metric values of the test and validation datasets of the ensemble prediction; test and validation performance was very similar. The predictive skill of the ensemble prediction compares well with similar studies, such as those mentioned in Raupach et al. (2023a).
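
The skill scores named above follow from a 2×2 contingency table of predicted vs. observed hail days. The sketch below uses the standard definitions; the function name and the counts are hypothetical and do not reproduce the study's validation data.

```python
def skill_scores(hits, false_alarms, misses, correct_negatives):
    """Standard categorical verification scores from a 2x2 contingency table."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    far = b / (a + b)
    return {
        "POD": a / (a + c),            # probability of detection
        "FAR": far,                    # false alarm ratio
        "SR": 1.0 - far,               # success ratio
        "CSI": a / (a + b + c),        # critical success index
        "bias": (a + b) / (a + c),     # frequency bias
        "HSS": 2.0 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
    }


# Hypothetical counts of days (illustrative only).
scores = skill_scores(hits=30, false_alarms=10, misses=10, correct_negatives=150)
for name, value in scores.items():
    print(name, round(value, 3))
```

A bias below 1 means that fewer hail days are predicted than observed, which is how the reported values of 0.98 and 0.88 should be read.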

5 Analysis of the reconstructed time series

In this section, we present the reconstructed time series from the ensemble prediction and discuss its trends (Sect. 5.1), the drivers of these trends (Sect. 5.2), and changes in the seasonal cycle over time (Sect. 5.3). Finally, we compare our time series with qualitative damage data (Sect. 5.4).

5.1 Modeled long-term trends

Both domains exhibit a significant positive trend in yearly hail day occurrence, with a 45 % increase in modeled hail days in the northern domain and a 48 % increase in the southern domain comparing 1960–1989 to 1990–2019 (Fig. 9). Mann–Kendall's τ in the north is 0.355 with a p value of 4.70 × 10⁻⁵, and in the south τ is 0.369 with a p value of 2.43 × 10⁻⁵. The trend is slightly stronger in the south. The northern model estimates a mean of 18.87 hail days per year during the period of 1959–2022, with a minimum of 6 d in 1962 and 1980 and a maximum of 42 d in 2003 and 2018. In the south, the mean is 20.1 d, with a minimum of 6 d in 1984 and a maximum of 41 d in 2018. In the POH time series, 2003 and 2018 are also the years with the highest number of hail days. The mean number of yearly hail days for the 2002–2022 period is 24.1 d for the northern model and 24.4 d for the southern model. Both estimates are slightly lower than the POH average, with 24.1 hail days per year in the north and 25.3 hail days per year in the south. The variability of yearly or monthly sums of hail days increases over time, with higher variability in the last 2 decades (not shown).

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f09

Figure 9. Modeled yearly aggregated hail days from 1959 to 2022 (black lines) for the northern (a) and southern (b) domains from the ensemble prediction. The dashed black lines represent the mean, and the solid gray lines with confidence intervals are the linear fits to the yearly hail days from 1959 to 2022.

It may be argued that deducing trends from ERA5 data-driven models provides biased results before 1979, when satellite data were first assimilated in ERA5. Therefore, we also performed the Mann–Kendall test limited to the period of 1979–2022; τ is 0.318 with a p value of 2.87 × 10⁻³ in the north and 0.463 with a p value of 1.37 × 10⁻⁵ in the south. This result means the trend is still positive and significant in both domains, although slightly less intense in the north and more pronounced in the south compared to the 1959–2022 period. This discrepancy is caused by the large interannual variability in both time series. The trends for both periods can be compared in Figs. 9 and C1.
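
For readers unfamiliar with the test, a compact sketch of the Mann–Kendall statistic is given below. It assumes the common formulation with a tie-corrected variance, a continuity correction, and a two-sided p value from the normal approximation; τ is computed as S divided by the number of pairs. The function name is hypothetical, and the implementation used in the study may differ in such details.

```python
import math
from collections import Counter


def mann_kendall(values):
    """Return Kendall's tau, the Z score, and the two-sided p value of the Mann-Kendall test."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)     # sign of each pairwise difference
    tau = s / (n * (n - 1) / 2)

    # Variance of S with a correction for groups of tied values.
    var_s = n * (n - 1) * (2 * n + 5)
    for t in Counter(values).values():
        var_s -= t * (t - 1) * (2 * t + 5)
    var_s /= 18.0

    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal p value
    return tau, z, p


# Hypothetical yearly hail-day counts (illustrative only).
tau, z, p = mann_kendall([12, 15, 11, 18, 17, 21, 19, 25, 24, 28])
print(round(tau, 3), round(p, 4))
```

For a fixed series length, a larger absolute τ corresponds to a smaller p value, which is a useful consistency check when comparing the two domains over the same period.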

5.2 Drivers of modeled trends

To investigate the factors driving the positive long-term trends in the models, we employed two techniques: partial Mann–Kendall tests and a detrending method proposed by Raupach et al. (2023b).

Using the Raupach et al. (2023b) approach, we assess the impact of individual model predictors on hail day trends by applying the models to data in which one of the predictors was detrended by removing the trend of the annual mean. We then performed Mann–Kendall tests to compare how the trend changed across the whole reconstructed time series from 1959–2022. To find which variable has the highest influence on the trend of each model, we compared τ values by exchanging one variable at a time with its detrended version for each model. For example, in the southern logistic regression model, detrending the LI resulted in a significant reduction in τ from 0.369 to 0.152, indicating a strong influence of LI on the positive trend. Similarly, detrending only KI reduced τ to 0.295, while τ only changed marginally when detrending other predictors. This result suggests that the positive trends in annual hail days in the southern logistic model are primarily explained by LI and to a lesser degree by KI.
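
The detrending step can be sketched as follows. This is one plausible reading of "removing the trend of the annual mean", not necessarily the exact procedure of Raupach et al. (2023b): a least-squares line is fitted to the annual means of a predictor, and that line is subtracted from the daily values relative to the central year so that the long-term mean is preserved. All names and values are hypothetical.

```python
from statistics import mean


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def detrend_by_annual_mean(years, values):
    """Remove the linear trend of the annual means from daily values.

    years[i] is the calendar year of values[i].
    """
    by_year = {}
    for yr, v in zip(years, values):
        by_year.setdefault(yr, []).append(v)
    unique_years = sorted(by_year)
    annual_means = [mean(by_year[yr]) for yr in unique_years]
    slope, _ = linear_fit(unique_years, annual_means)
    ref_year = mean(unique_years)            # trend removed relative to the central year
    return [v - slope * (yr - ref_year) for yr, v in zip(years, values)]


# Hypothetical predictor: two daily values per year with a rise of 1 unit per year.
years = [2000, 2000, 2001, 2001, 2002, 2002]
values = [1.0, 3.0, 2.0, 4.0, 3.0, 5.0]
print(detrend_by_annual_mean(years, values))
```

The model would then be re-run with the detrended predictor in place of the original one, and the resulting τ compared with the original τ.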

Because τ is independent of the measurement scale, we can compare its values directly to find which predictors contribute most to the modeled trends. Across logistic regression models and GAMs for both domains, the positive trends in annual hail days were primarily driven by instability and moisture variables. To ensure the robustness of these results, we also performed partial Mann–Kendall analyses for each model and each model's predictors. We also performed partial Mann–Kendall tests on the ensemble predictions with a selection of parameters and found the same results. The tests again showed that in all models, the variables that contribute to the trends are primarily instability and moisture. The trend was never fully explained by a single variable but by a combination of both moisture and instability. This finding aligns with the known connection between convective instability and moisture availability.

Finally, we need to stress that the contribution of predictors to the trend depends on the importance of the predictors in the models. Additionally, the trend in the model always comes from the underlying trend in the model's predictors.

5.3 Change in the seasonal cycle over time

This section addresses the seasonal analysis of hail occurrence over time. The last 2 decades exhibit a marked increase in hail days per month, which is the strongest in May and June (Fig. 10, blue and purple curves). We excluded the years 1959 and 2020–2022 to ensure consistency in the number of years per decade. Although the monthly curves display considerable variability, their difference is not significant, and no systematic shift is evident, as illustrated by the cumulative distribution function (CDF) plots in Fig. C2. However, our analysis is confined to the months of April to September and cannot support any statements about potential changes in hail days preceding or following the period modeled here.

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f10

Figure 10. Panels (a) and (b) show the mean number of hail days per month per decade in colored lines with the uncertainty range. The 1960s include the years 1960–1969, the 1970s include the years 1970–1979, and so on. Panel (a) shows the northern domain and (b) the southern domain.

5.4 Plausibility check with historic hail data

Validation of our time series and its trends with observational data was not possible due to the relatively short observational period. Nevertheless, we can conduct plausibility checks with qualitative hail information. As previously noted, these data do not enable any comparison of trends in the modeled time series with historical hail events, as the trends in damage are driven by changes in insurance coverage and exposure and vulnerability of crops. However, it is possible to compare interannual variability. Figure 11 shows the yearly sum of hail days extracted from the historical hail damage dataset in red from 1959 to 2017. The blue line is the yearly sum of both models. Both time series have been detrended and normalized. The correlation between the two time series is 0.43. We did not expect any better results because even for the period of 2012–2022, where we know that our model is closer to the true number of hail events than the historical information, some mismatch is evident. Of the 10 years with the highest number of hail days, 5 (2003, 1994, 1993, 1982, 1971) match, as do the 3 years with the lowest number of hail days (2010, 2005, 1980). Recall that we detrended both time series. When considering the non-detrended and non-normalized yearly time series, both have a similar standard deviation: 5.56 hail days for the model sum and 5.53 for the historical data.
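
The comparison of interannual variability can be sketched as below. The sketch assumes a least-squares linear detrending, a z-score normalization, and a Pearson correlation; the source does not specify these details, and the function names and yearly counts are hypothetical.

```python
from statistics import mean, pstdev


def detrend_and_normalize(years, series):
    """Remove the least-squares linear trend, then scale to zero mean and unit variance."""
    y_bar, s_bar = mean(years), mean(series)
    slope = (sum((y - y_bar) * (s - s_bar) for y, s in zip(years, series))
             / sum((y - y_bar) ** 2 for y in years))
    residuals = [s - (s_bar + slope * (y - y_bar)) for y, s in zip(years, series)]
    spread = pstdev(residuals)
    return [r / spread for r in residuals]


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical yearly counts (illustrative only, not the study's data).
years = list(range(1959, 1969))
modeled = [18, 12, 20, 15, 22, 17, 25, 19, 27, 21]
reported = [9, 5, 8, 7, 12, 6, 11, 10, 13, 8]
r = pearson(detrend_and_normalize(years, modeled),
            detrend_and_normalize(years, reported))
print(round(r, 2))
```

Detrending first ensures that the correlation reflects year-to-year agreement rather than a shared long-term trend, which matters here because the trends in the damage data are driven by non-meteorological factors.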

https://nhess.copernicus.org/articles/24/3869/2024/nhess-24-3869-2024-f11

Figure 11. The modeled number of hail days in blue (sum of northern and southern models) and the number of hail days derived from qualitative agricultural damage data in red (minimum five affected municipalities) for the 1959–2017 period. Both time series have been normalized and detrended.

6 Discussion

We evaluated ERA5 convective parameters and hail occurrence derived from Swiss radar data for the years 2012–2022 to develop a statistical reconstruction of past hail days in Switzerland. Our analysis has yielded several conclusions, among which the most important are discussed below.

6.1 Model predictor selection

The selection of predictors for the logistic regression models and GAMs needs to be discussed, in particular the absence of wind shear from the logistic regression models of both domains. When training our models, wind shear rarely appeared as a skillful predictor, and even when it did, it was not significant in the logistic regression models. Automated feature selection yielded similar results. We see three possible explanations for this. First, shear could be indirectly represented by the variable v_500 (v wind component at 500 hPa) in the southern model.

Second, there is a nonlinear relationship between WS_06 (wind shear from 0 to 6 km) and the response in the GAM, which the logistic model struggles to fit. Additionally, neither v_500 nor the explicit wind shear variables show high feature importance in any of the models. This low feature importance has also been seen by Trefalt (2017) for Switzerland and by Mohr et al. (2015b) for Germany. A potential reason for this could be the prevalence of high-shear but low-CAPE conditions in our domains, which do not always lead to hail. Thus, the wind shear parameters may not be effective in distinguishing between hail days and non-hail days in a statistical model, since there is no statistically significant difference in the distributions of WS_06 and WS_36 on hail days vs. non-hail days in either domain.

Our third point is that high wind shear might be a less important hail model parameter in regions with complex terrain (Punge and Kunz, 2016). Although large shear values are required to form supercells, which are likely to produce hail, hail also develops in lower-shear environments (Schemm et al., 2016; Trefalt, 2017; Kumjian and Lombardo, 2020; Blair et al., 2021). In fact, Feldmann et al. (2023) found that only 10 % of severe hailstorms in Switzerland are supercell-type storms, and Schemm et al. (2016) found average lower-tropospheric shear values at hailstorm initiation locations in Switzerland of less than 10 m s−1. Hail events in low-shear environments can be explained by proximity to mountain ranges, where environmental wind shear is increased by the interaction of the wind field with orography, which is often the case in the Alps (Trefalt, 2017; Kunz et al., 2018).

In such complex terrain, shear might be driven by local conditions, such as Alpine pumping, which are not resolved at ERA5's resolution. Alpine pumping arises from differential heating and cooling of air masses over mountains and plains, which drive daytime winds from plains to mountains and nighttime winds in the opposite direction (Lugauer and Winkler, 2005).

We also tested combinations of shear and CAPE, such as WMAXSHEAR, an important parameter for differentiating between severe and nonsevere weather (Brooks et al., 2003; Craven and Brooks, 2004; Kaltenböck et al., 2009; Púčik et al., 2015; Tuovinen et al., 2015), but for both domains, WMAXSHEAR was not selected in combination with other variables.

The combination of LI and TT including additional variables performed very well and overall better than CAPE and shear. This performance is why three of our models contain the combination of LI and TT. However, TT can be a problematic parameter for several reasons. First, composite parameters are hard to interpret in physical contexts because they combine multiple types of information. One does not know if TT is high because the lapse rates are favorable, because there is plenty of low-level moisture, or because there is a mix of both. This ambiguity is why it is hard to explain why the parameter worked well for our study. In addition, TT takes into account moisture from a single level (850 hPa) and is very sensitive to rapid changes in dew points with height. Consequently, TT can be a case-sensitive parameter that may reach high values in situations when there is no storm and thus create false alarms. This sensitivity is mitigated in the model by information obtained from other predictors.

We do not claim that the combination of LI and TT is better than, for example, CAPE and shear in forecasting individual hail cells or in differentiating between no hail, hail, and large hail. Rather, the specific combinations of approximately five variables in the statistical models worked best for the reconstruction of hail days in the Swiss study areas using the POH radar proxy and low-resolution ERA5 data. Our data-driven approach identified some less common indices; the statistical models leverage these indices effectively within the constraints of our data, and this statistical approach complements our physical understanding. However, our models should not be transferred to other periods or regions without additional verification. Forecasting applications are much better served by the operational COSMO and ICON weather forecast models than by ERA5.

The individual models suffer from rather high false alarm rates, and we were not able to increase the explained variance of the models above approximately 60 % for the northern models and 45 % for the southern models. Hence, the models still lack information for identifying when hail days occur. Notably, the models lack information about convection-triggering mechanisms. We tested including convective inhibition (CIN), but that variable was not a skillful predictor. Therefore, none of the models includes any representation of initiation processes.

This initiation problem is still a challenge for forecasting thunder- and hailstorms (Lock and Houston, 2014) and results in very high false alarm rates of statistical models in many studies. This problem could be addressed by using convective precipitation as a proxy for initiation or producing a model that computes probabilities of hail from the presence of lightning; an example of such a model is the additive regressive convective hazard model, AR-CHaMo, from Rädler et al. (2018) and Battaglioli et al. (2023a). However, even this model explains relatively low percentages of variance (approx. 30 %), implying the absence of information that cannot be captured by conventional convective parameters and coarse-resolution reanalysis data. This absence may be a motivation to look further into storm microphysics and, for instance, the location of the embryos in time and space during hail-favoring situations.

6.2 Comparison with other studies

Several studies have used logistic regressions or GAMs and daily data to model hail (López et al., 2007; Gascón et al., 2015; Mohr et al., 2015a, b; Rädler et al., 2018; Battaglioli et al., 2023a, b). Often, CAPE and shear are used as the main hail model predictors (e.g., Allen et al., 2015; Madonna et al., 2018; Czernecki et al., 2019; Battaglioli et al., 2023a). In our models, the combination of CAPE and shear was only significant in the southern GAM. For the logistic regression models, we found LI to be a better hail predictor than CAPE, which aligns with Kunz (2007), Mohr et al. (2015a, b), and Rädler et al. (2018). López et al. (2007) use TT and the wind at 500 hPa in their model for the Iberian Peninsula. Gascón et al. (2015) also used wind at 500 hPa. Unfortunately, we cannot directly compare the coefficients of their model to ours because of differences in data sources, resolution, and combination of model parameters. These differences must also be considered when comparing our model's performance to that of other studies.

To gauge the predictive capabilities of the models against those in related studies, we use the performance metrics of the ensemble predictions. Our models outperformed those mentioned in Raupach et al. (2023a), as indicated by their lower FAR and higher HSS values. Raupach et al. (2023a) report HSS values ranging from 0.1 to 0.4, compared to our models' HSS of 0.73 (north) and 0.55 (south). Their FAR ranges from 0.57 to 0.8, compared to 0.23 for the north and 0.35 for the south in our models. However, Raupach et al. (2023a) conducted their study over a much larger area encompassing diverse climate zones. Over our domains, hail is a comparatively frequent event with an a priori probability of approximately 15 % in the sample, which eases some of the statistical difficulties of modeling rare events. Other studies have demonstrated comparable performance to ours, such as Battaglioli et al. (2023a) using ESWD hail reports and ERA5 data, López et al. (2007) using radar and radiosonde data, and Gascón et al. (2015) using severe-storm reports and Weather Research and Forecasting (WRF) vertical profiles.
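As an illustration of how the two scores compared here are obtained, the sketch below computes the false alarm ratio and the Heidke skill score from a 2×2 contingency table. It uses the standard textbook definitions, which are assumed to match those listed in Table A1; the function names and the counts are hypothetical and are not taken from the study.

```python
def false_alarm_ratio(tp: int, fp: int) -> float:
    """Fraction of predicted hail days that were not observed: FP / (TP + FP)."""
    return fp / (tp + fp)


def heidke_skill_score(tp: int, fp: int, fn: int, tn: int) -> float:
    """Skill relative to random chance: 1 is perfect, 0 is no better than chance."""
    numerator = 2.0 * (tp * tn - fp * fn)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


# Hypothetical counts for illustration only (not the study's contingency table).
tp, fp, fn, tn = 70, 30, 30, 870
far = false_alarm_ratio(tp, fp)
hss = heidke_skill_score(tp, fp, fn, tn)
print(f"FAR = {far:.2f}, HSS = {hss:.2f}")
```

Because the HSS corrects for hits expected by chance, it depends on the base rate of the event, which is one reason why scores from domains with very different hail frequencies are difficult to compare directly.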

6.3 Trends

Our modeled trends from the ensemble predictions align with the findings of Madonna et al. (2018), Rädler et al. (2018), and Battaglioli et al. (2023a). Madonna et al. (2018) reported an approximately 40 % increase in estimated hail days when comparing the periods of 1980–2001 and 2002–2014. Similarly, Rädler et al. (2018) found a 41 % relative increase in hail cases per year during 1979–2016 in western and central Europe. Battaglioli et al. (2023a) identified an 8 % per decade relative increase in hail hours in northern Italy and parts of southern Switzerland for the period of 1950–2021. In our study, we observe significant positive trends in both the northern and the southern domains, with a 45 % increase in modeled hail days in the northern domain and a 48 % increase in the southern domain when comparing 1960–1989 to 1990–2019. This translates to a relative increase of 7.5 % and 7.9 % per decade, respectively. This trend can be attributed to an increase in hail-favoring environments in ERA5 in recent decades (Taszarek et al., 2021; Pilguj et al., 2022). Numerous studies have also identified positive trends in instability and moisture in ERA5 and rawinsonde data for Europe and parts of Switzerland (Mohr and Kunz, 2013; Rädler et al., 2018, 2019).
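The per-decade figures quoted above are consistent with dividing the relative increase by the six decades spanned by 1960–2019. The text does not spell out the conversion, so the simple linear division in the sketch below is an assumption, and the function name is hypothetical. With the rounded 48 % it yields 8.0 % for the south, slightly above the reported 7.9 %, presumably because unrounded values were used.

```python
def per_decade_increase(total_increase_pct: float, start_year: int, end_year: int) -> float:
    """Spread a relative increase evenly over the decades between two years (inclusive)."""
    n_decades = (end_year - start_year + 1) / 10.0
    return total_increase_pct / n_decades


# Relative increases between 1960-1989 and 1990-2019 as quoted in the text.
north = per_decade_increase(45.0, 1960, 2019)
south = per_decade_increase(48.0, 1960, 2019)
print(f"north: {north:.1f} % per decade, south: {south:.1f} % per decade")
```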

Our modeled trends in hail occurrence are subject to several limitations. First, POH serves as the “truth”, but POH is an indirect observation of hail and does not perfectly reflect the presence of hail on the ground (Kopp et al., 2024). Additionally, the quality of ERA5 data changes over time as more data are assimilated into the reanalysis. The study of Pilguj et al. (2022) comparing ERA5 trends to those extracted from rawinsonde data showed that the reliability of ERA5 has increased in the last 4 decades, implying higher confidence in the positive trends for the 1979–2022 period.

Finally, we want to highlight a limitation in the explanation of our trends. Within our models, the positive trends in annual hail day occurrences are driven by moisture and instability predictors (see Sect. 5.1). We can only quantify the effects of variables that are selected as predictors in the models. Other factors influencing hail occurrence and trends, such as temperature, could still play an important role due to the strong link between temperature, moisture availability, and convective instability.

We assume that the modeled relationship between the predictors and the occurrence of hail is stationary over the period that we investigate. The relationship might break down in a warmer climate, for example with drier summer soils, and the models should not be directly applied to climate change simulations. Studies using Coupled Model Intercomparison Project data have also shown that relative humidity (RH) could decrease over Europe while absolute humidity increases, that is, lower RH but higher dew points (Hoogewind et al., 2017; Chen et al., 2020). Despite larger CAPE, the process of convective development may become more difficult due to lower mid-level RH, which leads to a higher lifting condensation level, a higher level of free convection, and thus more negative buoyancy and larger CIN (Hoogewind et al., 2017; Chen et al., 2020; Taszarek et al., 2021). The rise in the freezing level induced by lower- to mid-tropospheric warming could result in hail melting before reaching the ground (Dessens et al., 2015; Raupach et al., 2023a).

7 Conclusion and outlook

We present a new multidecadal daily hail time series for northern and southern Switzerland from 1959 to 2022, reconstructed from a POH radar hail proxy and ERA5 environmental predictors with statistical models. For each of the two domains, we built an ensemble prediction from a multiple logistic regression model and a logistic GAM. Model development included selecting the most hail-relevant predictors based on multiple performance metrics, residual analysis, and multicollinearity checks, as well as finding the best model settings. Seasonality is explicitly modeled by a categorical factor for the month in each model. Including the month factor reduced systematic biases in the residuals and improved predictive skill. The hail time series was used to analyze long-term trends and changes in the frequency, seasonality, and variability of model-derived Swiss hail days in the past few decades.
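The structure summarized above can be sketched in a few lines. The example below evaluates a logistic model with a categorical month factor and combines its probability with a second probability standing in for the GAM. All coefficients, predictor names, the averaging rule, and the 0.5 threshold are hypothetical choices for illustration; they are not the fitted values or the combination rule of the study.

```python
import math


def logistic_probability(intercept, month_effects, coefficients, month, predictors):
    """Hail-day probability from a linear predictor with a month factor."""
    eta = intercept + month_effects[month]
    eta += sum(coefficients[name] * predictors[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-eta))


def ensemble_probability(probabilities):
    """Unweighted mean of the member probabilities (an assumed combination rule)."""
    return sum(probabilities) / len(probabilities)


# Hypothetical coefficients: April is the reference month of the factor.
month_effects = {"Apr": 0.0, "May": 0.4, "Jun": 0.9, "Jul": 1.0, "Aug": 0.8, "Sep": 0.2}
coefficients = {"lifted_index": -0.5, "dew_point": 0.1}

# Hypothetical predictor values for one day in June.
day = {"lifted_index": -3.0, "dew_point": 14.0}

p_logit = logistic_probability(-3.0, month_effects, coefficients, "Jun", day)
p_gam = 0.6  # stand-in for the probability returned by a fitted GAM
p_ensemble = ensemble_probability([p_logit, p_gam])
is_hail_day = p_ensemble >= 0.5  # assumed classification threshold
print(f"p_logit = {p_logit:.2f}, p_ensemble = {p_ensemble:.2f}, hail day: {is_hail_day}")
```

The month factor shifts the intercept for each month, which is how a seasonal cycle can be represented without adding a continuous time predictor.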

The final ensemble model reproduces the interannual variability and seasonality of hail radar proxies well. The reconstructed hail time series shows a significant positive trend in the number of hail days per year in both domains from 1959 to 2022. Comparing 1960–1989 with 1990–2019, the relative increase amounts to 7.5 % per decade in the north and 7.9 % per decade in the south. The trend is also significant and positive for the period of 1979–2022. The trend is mainly driven by the instability and moisture predictors in all models. The increase in hail days in the last 2 decades is the strongest in May and June. However, the seasonal cycle shows no clear shift towards an earlier start or earlier end, and differences in monthly distributions across decades are not significant. We compared our time series to a historical agricultural insurance data archive. We found agreement in the weakest and strongest hail years and similar interannual variability.

The main purpose of this study is to offer a framework to study intra-annual variability, trends, and past changes in the seasonality of Swiss hail occurrence without long-term direct hail observations. We will use this time series to study local and remote drivers of the intra-annual and interannual variability of Swiss hail. These drivers include sea surface temperature (Jeong et al., 2020; Cheng et al., 2022), soil moisture anomalies (Taylor, 2015; Gaal and Kinter, 2021), and sea ice and snow cover (Wiese, 1924; Budikova, 2009). Similarly, it would be interesting to see whether these large-scale variables are related to specific circulation anomalies or synoptic configurations (Schemm et al., 2016; Piper and Kunz, 2017; Rohrer et al., 2019). This question will be the subject of future work.

Appendix A: Calculation of model performance metrics

The Akaike information criterion (AIC; Akaike, 1974) and the Bayesian information criterion (BIC; Schwarz, 1978) were calculated with the R stats package. The AIC is defined as $\mathrm{AIC} = -2\ln(L) + k\,n_{\mathrm{par}}$, where $\ln(L)$ is the logarithm of the maximum likelihood of the estimated model, $k = 2$, and $n_{\mathrm{par}}$ is the number of fitted model parameters. For glm, $-2\ln(L)$ is the deviance. Using $k = \ln(n)$ provides the BIC instead, where $n$ is the number of observations. The variance inflation factor (VIF) was also calculated in R by $\mathrm{VIF}_i = 1/(1 - R_i^2)$, where $R_i^2$ is the coefficient of determination obtained when regressing the $i$th predictor on the others.
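The three definitions in this paragraph translate directly into code. The sketch below is a plain Python transcription for illustration, not the R implementation used in the study; the function names and the input values are hypothetical.

```python
import math


def information_criterion(log_likelihood: float, n_params: int, k: float = 2.0) -> float:
    """-2 ln(L) + k * npar; k = 2 gives the AIC."""
    return -2.0 * log_likelihood + k * n_params


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Same penalty form as the AIC, with k = ln(n)."""
    return information_criterion(log_likelihood, n_params, k=math.log(n_obs))


def variance_inflation_factor(r_squared: float) -> float:
    """VIF of a predictor from the R^2 of regressing it on the other predictors."""
    return 1.0 / (1.0 - r_squared)


# Hypothetical values: a model with 5 parameters fitted to 1000 days.
aic_value = information_criterion(log_likelihood=-100.0, n_params=5)
bic_value = bic(log_likelihood=-100.0, n_params=5, n_obs=1000)
vif_value = variance_inflation_factor(0.8)
print(f"AIC = {aic_value:.1f}, BIC = {bic_value:.1f}, VIF = {vif_value:.1f}")
```

Since $\ln(n) > 2$ for any sample of more than seven observations, the BIC penalizes additional parameters more strongly than the AIC for daily datasets of this size.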

Figure A1 represents the ensemble predictions' skill in a performance diagram showing the POD (left y axis), SR (lower x axis), CSI (labeled solid contours), and bias scores (labeled dashed lines) (diagram after Roebber, 2009). A perfect prediction would lie in the top-right corner of the performance space, meaning all metrics approach unity, achieving 100 % correct predictions. The nearer the predictions are to the lower-left corner, the more false positives and misses a model produces; the farther they lie from the diagonal, the more biased they are.
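The geometry of the performance diagram follows from the fact that the CSI and the bias are both functions of POD and SR alone. The sketch below computes the four scores from contingency counts using their standard definitions (assumed to match Table A1) and checks the two identities that generate the contours; the counts are hypothetical.

```python
def performance_scores(tp: int, fp: int, fn: int) -> dict:
    """POD, SR, CSI, and frequency bias from hits, false alarms, and misses."""
    return {
        "POD": tp / (tp + fn),          # probability of detection
        "SR": tp / (tp + fp),           # success ratio (1 - false alarm ratio)
        "CSI": tp / (tp + fp + fn),     # critical success index
        "bias": (tp + fp) / (tp + fn),  # predicted events / observed events
    }


# Hypothetical counts for illustration only.
scores = performance_scores(tp=60, fp=20, fn=40)

# Identities behind the diagram: CSI contours and bias lines depend only on POD and SR.
csi_from_axes = 1.0 / (1.0 / scores["SR"] + 1.0 / scores["POD"] - 1.0)
bias_from_axes = scores["POD"] / scores["SR"]
print(scores, csi_from_axes, bias_from_axes)
```

A bias of 1 corresponds to the diagonal POD = SR; points above it overpredict the number of events, and points below it underpredict.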

Figure A1. Performance diagram summarizing the POD, SR, bias, and CSI of each model. The orange cross shows the performance skill of the south and the blue cross that of the north. The crosslines indicate the confidence interval, calculated from bootstrapping. The circles highlight the mean value. The dashed lines represent bias scores with labels on the outward extension of the line. Labeled solid contours are the CSI.

Table A1. Equations and limits for performance metrics that were used to find the best hail models. Performance metrics were calculated from the corresponding contingency tables. TP represents the true positives, FP the false positives, FN the false negatives, and TN the true negatives.

Appendix B: Model coefficients

B1 Logistic models

Table B1. Coefficients, standard errors, z values, and p values of all covariates of the logistic regression models for north and south. Positive signs indicate a positive relationship of the quantitative predictors with modeled hail occurrence and vice versa. Asterisks indicate significance levels of the p values: * 0.01, ** 0.001, and *** 0.000.

B2 GAMs

To examine whether the assumptions necessary for logistic regression are met, we also checked for extreme outliers and for a linear relationship between the explanatory variables $x_n$ and the logit of the response variable $y$. Some variables, such as CAPE, did not have a perfectly linear relationship, which is probably one reason why CAPE was not chosen as a predictor for the final logistic regression models. We also built GAMs to allow for nonlinear relationships and interactions that might be poorly fitted in the logistic regression models (see Sect. 4.2).

Table B2. Coefficients, standard errors, z values, and p values of all parametric covariates of the GAMs for north and south. Positive signs indicate a positive relationship of the quantitative predictors with modeled hail occurrence and vice versa. Asterisks indicate significance levels of the p values: * 0.01, ** 0.001, and *** 0.000. The nonparametric smooth terms are found in Table B3.

Table B3. Significance of nonparametric smooth terms in the GAM for the north and the GAM for the south. Edf is the effective degrees of freedom, ref. df represents the reference degrees of freedom used for the test, and chi. sq. is the chi-square statistic. Asterisks indicate significance levels of the p values: * 0.01, ** 0.001, and *** 0.000.

Appendix C: Additional figures

Figure C1. Similar to Fig. 9 but for the period of 1979–2022. The linear fit is calculated for the yearly hail days from 1979 to 2022.

Figure C2. Cumulative distribution functions of the number of hail days per week per decade (colored lines). Panel (a) shows the northern domain and (b) the southern domain.

Code and data availability

Radar data are available from MeteoSwiss upon request (https://www.meteoschweiz.admin.ch/service-und-publikationen/service.html, Betschart and Hering, 2012) with a licensing requirement for commercial use. For access to the Swiss historical hail damage data archive, please contact Stefan Müller (stefan.mueller@meteotest.ch). ERA5 datasets can be downloaded via API request directly from the ECMWF Climate Data Store (CDS; https://doi.org/10.24381/cds.adbb2d47, Copernicus Climate Change Service, Climate Data Store, 2023a and https://doi.org/10.24381/cds.bd0915c6, Copernicus Climate Change Service, Climate Data Store, 2023b). Convective parameters from ThundeR for Switzerland are available by contacting Lena Wilhelm (lena.wilhelm@unibe.ch) and globally by contacting Mateusz Taszarek (mateusz.taszarek@amu.edu.pl). For code on model building and diagnostics, please contact Lena Wilhelm.

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/nhess-24-3869-2024-supplement.

Author contributions

LW: conceptualization, data curation, methodology, visualization, writing (original draft, review and editing). OM, CS, and KS: supervision, conceptualization, methodology, writing (review and editing). MT: helpful discussions and writing (review and editing).

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

Firstly, we would like to thank MeteoSwiss for providing POH data and Hélène Barras for her insights into her POH thresholds. We would also like to thank Stefan Müller for the provision of and helpful insights into the historical hail data archive. Finally, we thank all scClim (https://scclim.ethz.ch/, last access: 6 November 2024) researchers for valuable inputs throughout the project.

Financial support

This research has been supported by the Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (grant no. CRSII5201792).

Review statement

This paper was edited by Joaquim G. Pinto and reviewed by Julian C. Brimelow and two anonymous referees.

References

Akaike, H.: A new look at the statistical model identification, IEEE Trans. Automat. Control, 19, 716–723, https://doi.org/10.1109/TAC.1974.1100705, 1974. a

Allen, J., Karoly, D., and Mills, G.: A severe thunderstorm climatology for Australia and associated thunderstorm environments, Aust. Meteorol. Oceanogr. J., 61, 143–158, https://doi.org/10.22499/2.6103.001, 2011. a

Allen, J. T., Tippett, M. K., and Sobel, A. H.: An empirical model relating U.S. monthly hail occurrence to large-scale meteorological environment, J. Adv. Model. Earth Syst., 7, 226–243, https://doi.org/10.1002/2014MS000397, 2015. a, b, c

Allen, J. T., Giammanco, I. M., Kumjian, M. R., Jurgen Punge, H., Zhang, Q., Groenemeijer, P., Kunz, M., and Ortega, K.: Understanding Hail in the Earth System, Rev. Geophys., 58, e2019RG000665, https://doi.org/10.1029/2019RG000665, 2020. a, b

Andersson, T., Andersson, M., Jacobsson, C., and Nilsson, S.: Thermodynamic indices for forecasting thunderstorms in southern Sweden, Meteorol. Mag., 118, 141–146, 1989. a

Applequist, S., Gahrs, G. E., Pfeffer, R. L., and Niu, X.-F.: Comparison of Methodologies for Probabilistic Quantitative Precipitation Forecasting, Weather Forecast., 17, 783–799, https://doi.org/10.1175/1520-0434(2002)017<0783:COMFPQ>2.0.CO;2, 2002. a

Augenstein, M., Mohr, S., and Kunz, M.: Trends of thunderstorm activity and relation to large-scale atmospheric conditions in western and central Europe, ESSL, https://doi.org/10.5194/ecss2023-98, 2023. a

BAFU: Umgang mit Naturgefahren in der Schweiz – Bericht des Bundesrats in Erfuellung des Postulats 12.4271 Darbellay vom 14.12.2012, Technischer Bericht, BAFU – Bundesamt fuer Umwelt, https://www.bafu.admin.ch/bafu/de/home/themen/naturgefahren/publikationen-studien/publikationen/umgang-mit-naturgefahren-in-der-schweiz.html (last access: 6 November 2024), 2012. a

Baldauf, M., Seifert, A., Förstner, J., Majewski, D., Raschendorfer, M., and Reinhardt, T.: Operational Convective-Scale Numerical Weather Prediction with the COSMO Model: Description and Sensitivities, Mon. Weather Rev., 139, 3887–3905, https://doi.org/10.1175/MWR-D-10-05013.1, 2011. a

Barras, H., Martius, O., Nisi, L., Schroeer, K., Hering, A., and Germann, U.: Multi-day hail clusters and isolated hail days in Switzerland – large-scale flow conditions and precursors, Weather Clim. Dynam., 2, 1167–1185, https://doi.org/10.5194/wcd-2-1167-2021, 2021. a, b, c, d

Battaglioli, F., Groenemeijer, P., Púčik, T., Taszarek, M., Ulbrich, U., and Rust, H.: Modelled multidecadal trends of lightning and (very) large hail in Europe and North America (1950–2021), J. Appl. Meteorol. Clim., 62, 1627–1653, https://doi.org/10.1175/JAMC-D-22-0195.1, 2023a. a, b, c, d, e, f, g, h, i, j

Battaglioli, F., Groenemeijer, P., Tsonevsky, I., and Púčik, T.: Forecasting large hail and lightning using additive logistic regression models and the ECMWF reforecasts, Nat. Hazards Earth Syst. Sci., 23, 3651–3669, https://doi.org/10.5194/nhess-23-3651-2023, 2023b. a

Bell, B., Hersbach, H., Simmons, A., Berrisford, P., Dahlgren, P., Horányi, A., Muñoz‐Sabater, J., Nicolas, J., Radu, R., Schepers, D., Soci, C., Villaume, S., Bidlot, J., Haimberger, L., Woollen, J., Buontempo, C., and Thépaut, J.: The ERA5 global reanalysis: Preliminary extension to 1950, Q. J. Roy. Meteorol. Soc., 147, 4186–4227, https://doi.org/10.1002/qj.4174, 2021. a

Betschart, M. and Hering, A.: Automatic Hail Detection at MeteoSwiss – Verification of the radar based hail detection algorithms POH, MESHS and HAIL, Arbeitsberichte der MeteoSchweiz, 238, 59 pp., https://www.meteoschweiz.admin.ch/service-und-publikationen/service.html (last access: 6 November 2024), 2012. a

Billet, J., DeLisi, M., Smith, B. G., and Gates, C.: Use of Regression Techniques to Predict Hail Size and the Probability of Large Hail, Weather Forecast., 12, 154–164, https://doi.org/10.1175/1520-0434(1997)012<0154:UORTTP>2.0.CO;2, 1997. a, b

Blair, S. F., Deroche, D. R., Boustead, J. M., Leighton, J. W., Barjenbruch, B. L., and Gargan, W. P.: Radar-Based Assessment of the Detectability of Giant Hail, E-J. Sev. Storms Meteorol., 6, 1–30, https://doi.org/10.55599/ejssm.v6i7.34, 2021. a

Boyden, C. J.: A simple instability index for use as a synoptic parameter, Meteorol. Mag., 92, 198–210, 1963. a, b

Brooks, H. E., Lee, J. W., and Craven, J. P.: The spatial distribution of severe thunderstorm and tornado environments from global reanalysis data, Atmos. Res., 67-68, 73–94, https://doi.org/10.1016/S0169-8095(03)00045-0, 2003. a, b

Budikova, D.: Role of Arctic sea ice in global atmospheric circulation: A review, Global Planet. Change, 68, 149–163, https://doi.org/10.1016/j.gloplacha.2009.04.001, 2009. a

Chen, J., Dai, A., Zhang, Y., and Rasmussen, K. L.: Changes in Convective Available Potential Energy and Convective Inhibition under Global Warming, J. Climate, 33, 2025–2050, https://doi.org/10.1175/JCLI-D-19-0461.1, 2020. a, b

Cheng, K., Harris, L., Bretherton, C., Merlis, T. M., Bolot, M., Zhou, L., Kaltenbaugh, A., Clark, S., and Fueglistaler, S.: Impact of Warmer Sea Surface Temperature on the Global Pattern of Intense Convection: Insights From a Global Storm Resolving Model, Geophys. Res. Lett., 49, e2022GL099796, https://doi.org/10.1029/2022GL099796, 2022. a

Copernicus Climate Change Service, Climate Data Store: ERA5 hourly data on single levels from 1940 to present, Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [data set], https://doi.org/10.24381/cds.adbb2d47, 2023a. a

Copernicus Climate Change Service, Climate Data Store: ERA5 hourly data on pressure levels from 1940 to present, Copernicus Climate Change Service (C3S) Climate Data Store (CDS), https://doi.org/10.24381/cds.bd0915c6, 2023b. a

Craven, J. P. and Brooks, H.: Baseline climatology of sounding derived parameters associated with deep moist convection, Natl. Weather Dig., 28, 13–24, 2004. a

Czernecki, B., Taszarek, M., Marosz, M., Półrolniczak, M., Kolendowicz, L., Wyszogrodzki, A., and Szturc, J.: Application of machine learning to large hail prediction – The importance of radar reflectivity, lightning occurrence and convective parameters derived from ERA5, Atmos. Res., 227, 249–262, https://doi.org/10.1016/j.atmosres.2019.05.010, 2019. a, b

Dennis, E. J. and Kumjian, M. R.: The Impact of Vertical Wind Shear on Hail Growth in Simulated Supercells, J. Atmos. Sci., 74, 641–663, https://doi.org/10.1175/JAS-D-16-0066.1, 2017. a

Dessens, J., Berthet, C., and Sanchez, J.: Change in hailstone size distributions with an increase in the melting level height, Atmos. Res., 158–159, 245–253, https://doi.org/10.1016/j.atmosres.2014.07.004, 2015. a

Doswell, C. A., Brooks, H. E., and Maddox, R. A.: Flash Flood Forecasting: An Ingredients-Based Methodology, Weather Forecast., 11, 560–581, https://doi.org/10.1175/1520-0434(1996)011<0560:FFFAIB>2.0.CO;2, 1996. a

Feldmann, M., Germann, U., Gabella, M., and Berne, A.: A characterisation of Alpine mesocyclone occurrence, Weather Clim. Dynam., 2, 1225–1244, https://doi.org/10.5194/wcd-2-1225-2021, 2021. a

Feldmann, M., Hering, A., Gabella, M., and Berne, A.: Hailstorms and rainstorms versus supercells – a regional analysis of convective storm types in the Alpine region, npj Clim. Atmos. Sci., 6, 19, https://doi.org/10.1038/s41612-023-00352-z, 2023. a

Fluck, E., Kunz, M., Geissbuehler, P., and Ritz, S. P.: Radar-based assessment of hail frequency in Europe, Nat. Hazards Earth Syst. Sci., 21, 683–701, https://doi.org/10.5194/nhess-21-683-2021, 2021. a

Foote, B., Krauss, T. W., and Makitov, V.: Hail metrics using conventional radar, in: Proceedings of the 16th Conference on Planned and Inadvertent Weather Modification, 10 January 2005, San Diego, CA, USA, https://ams.confex.com/ams/Annual2005/techprogram/paper_86773.htm (last access: 6 November 2024), 2005. a

Gaal, R. and Kinter, J. L.: Soil Moisture Influence on the Incidence of Summer Mesoscale Convective Systems in the U.S. Great Plains, Mon. Weather Rev., 149, 3981–3994, https://doi.org/10.1175/MWR-D-21-0140.1, 2021. a

Galway, J. G.: The Lifted Index as a Predictor of Latent Instability, B. Am. Meteorol. Soc., 37, 528–529, https://doi.org/10.1175/1520-0477-37.10.528, 1956. a

García-Ortega, E., Merino, A., López, L., and Sánchez, J.: Role of mesoscale factors at the onset of deep convection on hailstorm days and their relation to the synoptic patterns, Atmos. Res., 114-115, 91–106, https://doi.org/10.1016/j.atmosres.2012.05.017, 2012. a

Gascón, E., Merino, A., Sánchez, J., Fernández-González, S., García-Ortega, E., López, L., and Hermida, L.: Spatial distribution of thermodynamic conditions of severe storms in southwestern Europe, Atmos. Res., 164-165, 194–209, https://doi.org/10.1016/j.atmosres.2015.05.012, 2015. a, b, c, d, e

Gensini, V. A., Converse, C., Ashley, W. S., and Taszarek, M.: Machine learning classification of significant tornadoes and hail in the U.S. using ERA5 proximity soundings, Weather Forecast., 36, 2143–2160, https://doi.org/10.1175/WAF-D-21-0056.1, 2021. a

George, J.: Weather forecasting for aeronautics, Q. J. Roy. Meteorol. Soc., 87, 120–120, https://doi.org/10.1002/qj.49708737120, 1961. a

Giaiotti, D., Nordio, S., and Stel, F.: The climatology of hail in the plain of Friuli Venezia Giulia, Atmos. Res., 67-68, 247–259, https://doi.org/10.1016/S0169-8095(03)00084-X, 2003. a

Gladich, I., Gallai, I., Giaiotti, D., and Stel, F.: On the diurnal cycle of deep moist convection in the southern side of the Alps analysed through cloud-to-ground lightning activity, Atmos. Res., 100, 371–376, https://doi.org/10.1016/j.atmosres.2010.08.026, 2011. a

Greene, D. R. and Clark, R. A.: Vertically Integrated Liquid Water – A New Analysis Tool, Mon. Weather Rev., 100, 548–552, https://doi.org/10.1175/1520-0493(1972)100<0548:VILWNA>2.3.CO;2, 1972. a

Groenemeijer, P. and van Delden, A.: Sounding-derived parameters associated with large hail and tornadoes in the Netherlands, Atmos. Res., 83, 473–487, https://doi.org/10.1016/j.atmosres.2005.08.006, 2007. a

Hastie, T. and Tibshirani, R.: Generalized Additive Models: Some Applications, J. Am. Stat. Assoc., 82, 371–386, https://doi.org/10.1080/01621459.1987.10478440, 1987. a

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz‐Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., De Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.: The ERA5 global reanalysis, Q. J. Roy. Meteorol. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803, 2020. a

Hoogewind, K. A., Baldwin, M. E., and Trapp, R. J.: The Impact of Climate Change on Hazardous Convective Weather in the United States: Insight from High-Resolution Dynamical Downscaling, J. Climate, 30, 10081–10100, https://doi.org/10.1175/JCLI-D-16-0885.1, 2017. a, b

Hosmer, D. W. and Lemeshow, S.: Applied Logistic Regression, in: 1st Edn., Wiley, ISBN 978-0-471-35632-5, ISBN 978-0-471-72214-4, https://doi.org/10.1002/0471722146, 2000. a

Huntrieser, H., Schiesser, H. H., Schmid, W., and Waldvogel, A.: Comparison of Traditional and Newly Developed Thunderstorm Indices for Switzerland, Weather Forecast., 12, 108–125, https://doi.org/10.1175/1520-0434(1997)012<0108:COTAND>2.0.CO;2, 1997. a, b

Jeong, J.-H., Fan, J., Homeyer, C. R., and Hou, Z.: Understanding Hailstone Temporal Variability and Contributing Factors over the U.S. Southern Great Plains, J. Climate, 33, 3947–3966, https://doi.org/10.1175/JCLI-D-19-0606.1, 2020. a

Johns, R. H. and Doswell, C. A.: Severe Local Storms Forecasting, Weather Forecast., 7, 588–612, https://doi.org/10.1175/1520-0434(1992)007<0588:SLSF>2.0.CO;2, 1992. a

Johnson, A. W. and Sugden, K. E.: Evaluation of Sounding-Derived Thermodynamic and Wind-Related Parameters Associated with Large Hail Events, E-J. Sev. Storms Meteorol., 9, 1–42, https://doi.org/10.55599/ejssm.v9i5.57, 2021. a, b

Kaltenböck, R. and Steinheimer, M.: Radar-based severe storm climatology for Austrian complex orography related to vertical wind shear and atmospheric instability, Atmos. Res., 158-159, 216–230, https://doi.org/10.1016/j.atmosres.2014.08.006, 2015. a

Kaltenböck, R., Diendorfer, G., and Dotzek, N.: Evaluation of thunderstorm indices from ECMWF analyses, lightning data and severe storm reports, Atmos. Res., 93, 381–396, https://doi.org/10.1016/j.atmosres.2008.11.005, 2009. a

Kopp, J., Schröer, K., Schwierz, C., Hering, A., Germann, U., and Martius, O.: The summer 2021 Switzerland hailstorms: weather situation, major impacts and unique observational data, Weather, 78, 184–191, https://doi.org/10.1002/wea.4306, 2023. a

Kopp, J., Hering, A., Germann, U., and Martius, O.: Verification of weather-radar-based hail metrics with crowdsourced observations from Switzerland, Atmos. Meas. Tech., 17, 4529–4552, https://doi.org/10.5194/amt-17-4529-2024, 2024. a, b

Kumjian, M. R. and Lombardo, K.: A Hail Growth Trajectory Model for Exploring the Environmental Controls on Hail Size: Model Physics and Idealized Tests, J. Atmos. Sci., 77, 2765–2791, https://doi.org/10.1175/JAS-D-20-0016.1, 2020. a, b

Kumjian, M. R., Lombardo, K., and Loeffler, S.: The Evolution of Hail Production in Simulated Supercell Storms, J. Atmos. Sci., 78, 3417–3440, https://doi.org/10.1175/JAS-D-21-0034.1, 2021. a

Kunz, M.: The skill of convective parameters and indices to predict isolated and severe thunderstorms, Nat. Hazards Earth Syst. Sci., 7, 327–342, https://doi.org/10.5194/nhess-7-327-2007, 2007. a, b, c, d, e, f

Kunz, M., Blahak, U., Handwerker, J., Schmidberger, M., Punge, H. J., Mohr, S., Fluck, E., and Bedka, K. M.: The severe hailstorm in southwest Germany on 28 July 2013: characteristics, impacts and meteorological conditions, Q. J. Roy. Meteorol. Soc., 144, 231–250, https://doi.org/10.1002/qj.3197, 2018. a, b

Li, F., Chavas, D. R., Reed, K. A., and Dawson II, D. T.: Climatology of Severe Local Storm Environments and Synoptic-Scale Features over North America in ERA5 Reanalysis and CAM6 Simulation, J. Climate, 33, 8339–8365, https://doi.org/10.1175/JCLI-D-19-0986.1, 2020. a

Lock, N. A. and Houston, A. L.: Empirical Examination of the Factors Regulating Thunderstorm Initiation, Mon. Weather Rev., 142, 240–258, https://doi.org/10.1175/MWR-D-13-00082.1, 2014. a

López, L., García-Ortega, E., and Sánchez, J. L.: A short-term forecast model for hail, Atmos. Res., 83, 176–184, https://doi.org/10.1016/j.atmosres.2005.10.014, 2007. a, b, c, d, e

Lugauer, M. and Winkler, P.: Thermal circulation in South Bavaria – climatology and synoptic aspects, Meteorol. Z., 14, 15–30, https://doi.org/10.1127/0941-2948/2005/0014-0015, 2005. a

Madonna, E., Ginsbourger, D., and Martius, O.: A Poisson regression approach to model monthly hail occurrence in Northern Switzerland using large-scale environmental variables, Atmos. Res., 203, 261–274, https://doi.org/10.1016/j.atmosres.2017.11.024, 2018. a, b, c, d, e, f

Mansfield, E. R. and Helms, B. P.: Detecting Multicollinearity, Am. Stat., 36, 158–160, https://doi.org/10.2307/2683167, 1982. a

Manzato, A.: Hail in Northeast Italy: Climatology and Bivariate Analysis with the Sounding-Derived Indices, J. Appl. Meteorol. Clim., 51, 449–467, https://doi.org/10.1175/JAMC-D-10-05012.1, 2012. a

Manzato, A., Serafin, S., Miglietta, M. M., Kirshbaum, D., and Schulz, W.: A Pan-Alpine Climatology of Lightning and Convective Initiation, Mon. Weather Rev., 150, 2213–2230, https://doi.org/10.1175/MWR-D-21-0149.1, 2022. a

Martius, O., Kunz, M., Nisi, L., and Hering, A.: Conference Report 1st European Hail Workshop, Meteorol. Z., 24, 441–442, https://doi.org/10.1127/metz/2015/0667, 2015. a

Melcón, P., Merino, A., Sánchez, J. L., López, L., and García-Ortega, E.: Spatial patterns of thermodynamic conditions of hailstorms in southwestern France, Atmos. Res., 189, 111–126, https://doi.org/10.1016/j.atmosres.2017.01.011, 2017. a

Miller, R.: Notes on Analysis and Severe-storm Forecasting Procedures of the Air Force Global Weather Central, Tech. rep., Air Force Global Weather Central, https://archive.org/details/DTIC_AD0744042 (last access: 6 November 2024), 1972. a, b

Mohr, S. and Kunz, M.: Recent trends and variabilities of convective parameters relevant for hail events in Germany and Europe, Atmos. Res., 123, 211–228, https://doi.org/10.1016/j.atmosres.2012.05.016, 2013. a, b, c, d

Mohr, S., Kunz, M., and Geyer, B.: Hail potential in Europe based on a regional climate model hindcast, Geophys. Res. Lett., 42, 10904–10912, https://doi.org/10.1002/2015GL067118, 2015a. a, b, c

Mohr, S., Kunz, M., and Keuler, K.: Development and application of a logistic model to estimate the past and future hail potential in Germany, J. Geophys. Res.-Atmos., 120, 3939–3956, https://doi.org/10.1002/2014JD022959, 2015b. a, b, c

Moncrieff, M. W. and Miller, M. J.: The dynamics and simulation of tropical cumulonimbus and squall lines, Q. J. Roy. Meteorol. Soc., 102, 373–394, https://doi.org/10.1002/qj.49710243208, 1976. a

Morgan, G. M.: A General Description of the Hail Problem in the Po Valley of Northern Italy, J. Appl. Meteorol., 12, 338–353, https://doi.org/10.1175/1520-0450(1973)012<0338:AGDOTH>2.0.CO;2, 1973. a

Müller, S. and Schmutz, M.: Nationales Hagelprojekt. Schlussbericht: Aufbereitung historische Hagel-Daten, Tech. rep., Meteotest AG, Bern, 2021. a

Nisi, L., Martius, O., Hering, A., Kunz, M., and Germann, U.: Spatial and temporal distribution of hailstorms in the Alpine region: a long‐term, high resolution, radar‐based analysis, Q. J. Roy. Meteorol. Soc., 142, 1590–1604, https://doi.org/10.1002/qj.2771, 2016. a, b, c, d, e, f

Nisi, L., Hering, A., Germann, U., and Martius, O.: A 15‐year hail streak climatology for the Alpine region, Q. J. Roy. Meteorol. Soc., 144, 1429–1449, https://doi.org/10.1002/qj.3286, 2018. a, b

Nisi, L., Hering, A., Germann, U., Schroeer, K., Barras, H., Kunz, M., and Martius, O.: Hailstorms in the Alpine region: Diurnal cycle, characteristics, and the nowcasting potential of lightning properties, Q. J. Roy. Meteorol. Soc., 146, 4170–4194, https://doi.org/10.1002/qj.3897, 2020. a

Nixon, C. J., Allen, J. T., and Taszarek, M.: Hodographs and Skew Ts of Hail-Producing Storms, Weather Forecast., 38, 2217–2236, https://doi.org/10.1175/WAF-D-23-0031.1, 2023. a, b, c

Pilguj, N., Taszarek, M., Allen, J. T., and Hoogewind, K. A.: Are Trends in Convective Parameters over the United States and Europe Consistent between Reanalyses and Observations?, J. Climate, 35, 3605–3626, https://doi.org/10.1175/JCLI-D-21-0135.1, 2022. a, b, c

Piper, D. and Kunz, M.: Spatiotemporal variability of lightning activity in Europe and the relation to the North Atlantic Oscillation teleconnection pattern, Nat. Hazards Earth Syst. Sci., 17, 1319–1336, https://doi.org/10.5194/nhess-17-1319-2017, 2017. a

Púčik, T., Groenemeijer, P., Rýva, D., and Kolář, M.: Proximity Soundings of Severe and Nonsevere Thunderstorms in Central Europe, Mon. Weather Rev., 143, 4805–4821, https://doi.org/10.1175/MWR-D-15-0104.1, 2015. a, b

Púčik, T., Castellano, C., Groenemeijer, P., Kühne, T., Rädler, A. T., Antonescu, B., and Faust, E.: Large Hail Incidence and Its Economic and Societal Impacts across Europe, Mon. Weather Rev., 147, 3901–3916, https://doi.org/10.1175/MWR-D-19-0204.1, 2019. a

Punge, H. and Kunz, M.: Hail observations and hailstorm characteristics in Europe: A review, Atmos. Res., 176–177, 159–184, https://doi.org/10.1016/j.atmosres.2016.02.012, 2016. a

Punge, H. J., Bedka, K. M., Kunz, M., Bang, S. D., and Itterly, K. F.: Characteristics of hail hazard in South Africa based on satellite detection of convective storms, Nat. Hazards Earth Syst. Sci., 23, 1549–1576, https://doi.org/10.5194/nhess-23-1549-2023, 2023. a

Rädler, A. T., Groenemeijer, P., Faust, E., and Sausen, R.: Detecting Severe Weather Trends Using an Additive Regressive Convective Hazard Model (AR-CHaMo), J. Appl. Meteorol. Clim., 57, 569–587, https://doi.org/10.1175/JAMC-D-17-0132.1, 2018. a, b, c, d, e, f

Rädler, A. T., Groenemeijer, P. H., Faust, E., Sausen, R., and Púčik, T.: Frequency of severe thunderstorms across Europe expected to increase in the 21st century due to rising instability, npj Clim. Atmos. Sci., 2, 30, https://doi.org/10.1038/s41612-019-0083-7, 2019. a

Rasmussen, E. N.: Refined Supercell and Tornado Forecast Parameters, Weather Forecast., 18, 530–535, https://doi.org/10.1175/1520-0434(2003)18<530:RSATFP>2.0.CO;2, 2003. a

Raupach, T. H., Soderholm, J., Protat, A., and Sherwood, S. C.: An Improved Instability–Shear Hail Proxy for Australia, Mon. Weather Rev., 151, 545–567, https://doi.org/10.1175/MWR-D-22-0127.1, 2023a. a, b, c, d, e, f, g, h

Raupach, T. H., Soderholm, J. S., Warren, R. A., and Sherwood, S. C.: Changes in hail hazard across Australia: 1979–2021, npj Clim. Atmos. Sci., 6, 143, https://doi.org/10.1038/s41612-023-00454-8, 2023b. a, b

Roebber, P. J.: Visualizing Multiple Measures of Forecast Quality, Weather Forecast., 24, 601–608, https://doi.org/10.1175/2008WAF2222159.1, 2009. a

Rohrer, M., Brönnimann, S., Martius, O., Raible, C. C., and Wild, M.: Decadal variations of blocking and storm tracks in centennial reanalyses, Tellus A, 71, 1586236, https://doi.org/10.1080/16000870.2019.1586236, 2019. a

Sánchez, J. L., Marcos, J. L., Dessens, J., López, L., Bustos, C., and García-Ortega, E.: Assessing sounding-derived parameters as storm predictors in different latitudes, Atmos. Res., 93, 446–456, https://doi.org/10.1016/j.atmosres.2008.11.006, 2009. a, b

Schemm, S., Nisi, L., Martinov, A., Leuenberger, D., and Martius, O.: On the link between cold fronts and hail in Switzerland, Atmos. Sci. Lett., 17, 315–325, https://doi.org/10.1002/asl.660, 2016. a, b, c, d, e

Schmeits, M. J., Kok, K. J., and Vogelezang, D. H. P.: Probabilistic Forecasting of (Severe) Thunderstorms in the Netherlands Using Model Output Statistics, Weather Forecast., 20, 134–148, https://doi.org/10.1175/WAF840.1, 2005. a, b

Schmid, T., Portmann, R., Villiger, L., Schröer, K., and Bresch, D. N.: An open-source radar-based hail damage model for buildings and cars, Nat. Hazards Earth Syst. Sci., 24, 847–872, https://doi.org/10.5194/nhess-24-847-2024, 2024. a

Schröer, K., Trefalt, S., Hering, A., Germann, U., and Schwierz, C.: Hagelklima Schweiz: Daten, Ergebnisse und Dokumentation: Fachbericht MeteoSchweiz No. 283, Tech. rep., MeteoSchweiz, https://doi.org/10.18751/PMCH/TR/283.HAGELKLIMA, 2023. a, b, c

Schwarz, G.: Estimating the Dimension of a Model, Ann. Stat., 6, 461–464, 1978. a

Showalter, A. K.: A Stability Index for Thunderstorm Forecasting, B. Am. Meteorol. Soc., 34, 250–252, https://doi.org/10.1175/1520-0477-34.6.250, 1953. a

Taszarek, M., Allen, J. T., Púčik, T., Hoogewind, K. A., and Brooks, H. E.: Severe Convective Storms across Europe and the United States. Part II: ERA5 Environments Associated with Lightning, Large Hail, Severe Wind, and Tornadoes, J. Climate, 33, 10263–10286, https://doi.org/10.1175/JCLI-D-20-0346.1, 2020a. a, b

Taszarek, M., Pilguj, N., Allen, J. T., Gensini, V., Brooks, H. E., and Szuster, P.: Comparison of convective parameters derived from ERA5 and MERRA2 with rawinsonde data over Europe and North America, J. Climate, 34, 3211–3237, https://doi.org/10.1175/JCLI-D-20-0484.1, 2020b. a

Taszarek, M., Allen, J. T., Brooks, H. E., Pilguj, N., and Czernecki, B.: Differing Trends in United States and European Severe Thunderstorm Environments in a Warming Climate, B. Am. Meteorol. Soc., 102, E296–E322, https://doi.org/10.1175/BAMS-D-20-0004.1, 2021. a, b

Taylor, C. M.: Detecting soil moisture impacts on convective initiation in Europe, Geophys. Res. Lett., 42, 4631–4638, https://doi.org/10.1002/2015GL064030, 2015. a

Tippett, M. K., Allen, J. T., Gensini, V. A., and Brooks, H. E.: Climate and Hazardous Convective Weather, Curr. Clim. Change Rep., 1, 60–73, https://doi.org/10.1007/s40641-015-0006-6, 2015. a

Trefalt, S.: Hail and Severe Wind Gusts in the Convective Season in Switzerland, PhD Thesis, Philosophisch-naturwissenschaftliche Fakultät der Universität Bern, 2017. a, b, c, d, e, f

Trefalt, S., Germann, U., Hering, A., Clementi, L., Boscacci, M., Schröer, K., and Schwierz, C.: Hail Climate Switzerland Operational radar hail detection algorithms at MeteoSwiss: quality assessment and improvement: Fachbericht MeteoSchweiz No. 284, Tech. rep., MeteoSchweiz, https://doi.org/10.18751/PMCH/TR/284.HAILCLIMATE, 2023. a

Tuovinen, J.-P., Rauhala, J., and Schultz, D. M.: Significant-Hail-Producing Storms in Finland: Convective-Storm Environment and Mode, Weather Forecast., 30, 1064–1076, https://doi.org/10.1175/WAF-D-14-00159.1, 2015. a, b

Van Delden, A.: The synoptic setting of thunderstorms in western Europe, Atmos. Res., 56, 89–110, https://doi.org/10.1016/S0169-8095(00)00092-2, 2001. a

Varga, A. J. and Breuer, H.: Evaluation of convective parameters derived from pressure level and native ERA5 data and different resolution WRF climate simulations over Central Europe, Clim. Dynam., 58, 1569–1585, https://doi.org/10.1007/s00382-021-05979-3, 2022. a

Waldvogel, A., Federer, B., and Grimm, P.: Criteria for the Detection of Hail Cells, J. Appl. Meteorol., 18, 1521–1525, https://doi.org/10.1175/1520-0450(1979)018<1521:CFTDOH>2.0.CO;2, 1979. a

Weisman, M. L. and Klemp, J. B.: The Dependence of Numerically Simulated Convective Storms on Vertical Wind Shear and Buoyancy, Mon. Weather Rev., 110, 504–520, https://doi.org/10.1175/1520-0493(1982)110<0504:TDONSC>2.0.CO;2, 1982. a

Weisman, M. L. and Klemp, J. B.: The Structure and Classification of Numerically Simulated Convective Storms in Directionally Varying Wind Shears, Mon. Weather Rev., 112, 2479–2498, https://doi.org/10.1175/1520-0493(1984)112<2479:TSACON>2.0.CO;2, 1984. a

Wiese, W.: Polareis Und Atmosphärische Schwankungen, Geograf. Ann., 6, 273–299, https://doi.org/10.1080/20014422.1924.11881099, 1924. a

Willemse, S.: A statistical analysis and climatological interpretation of hailstorms in Switzerland, PhD Thesis, ETH Zurich, https://doi.org/10.3929/ETHZ-A-001486581, 1995. a, b

Wu, J., Guo, J., Yun, Y., Yang, R., Guo, X., Meng, D., Sun, Y., Zhang, Z., Xu, H., and Chen, T.: Can ERA5 reanalysis data characterize the pre-storm environment?, Atmos. Res., 297, 107108, https://doi.org/10.1016/j.atmosres.2023.107108, 2024. a

Short summary
In our study we used statistical models to reconstruct past hail days in Switzerland from 1959 to 2022. This new time series reveals a significant increase in hail day occurrences over the last 7 decades. We link this trend to increases in moisture and instability variables in the models. This time series can now be used to unravel the complexities of Swiss hail occurrence and to understand what drives its year-to-year variability.