Impacts of weather on road accidents have been identified in several studies with a focus mainly on monthly or daily accident counts. This study investigates hourly probabilities of road accidents caused by adverse weather conditions in Germany on the spatial scale of administrative districts using logistic regression models. Including meteorological predictor variables from radar-based precipitation estimates, high-resolution reanalysis and weather forecasts improves the prediction of accident probability compared to models without weather information. For example, the percentage of correctly predicted accidents (hit rate) is increased from 30 % to 70 %, while keeping the percentage of wrongly predicted accidents (false-alarm rate) constant at 20 %. When using ensemble weather forecasts up to 21 h instead of radar and reanalysis data, the decline in model performance is negligible. Accident probability has a nonlinear relationship with precipitation. Given an hourly precipitation sum of 1 mm, accident probabilities are approximately 5 times larger at negative temperatures compared to positive temperatures. The findings are relevant in the context of impact-based warnings for road users, road maintenance, traffic management and rescue forces.

The road transport system is one of the most complex and dangerous systems that people have to deal with on a daily basis

Two types of studies can be distinguished regarding the temporal scales. One type of study aims to relate road accidents to weather on a monthly or seasonal timescale

Meteorological data used in accident studies are often derived from measurement stations. Stations are either used individually

Different weather parameters with a significant impact on road accidents have been identified. Depending on the study's modeling strategy and the specific formulation of variables characterizing weather, magnitude and even the sign of the weather impact can vary between different studies. The most important weather parameter considered in most studies is precipitation. On wet roads the tire contact force is reduced

On a monthly basis, snowfall can lead to a reduction in accident numbers, possibly due to indirect effects like reduced traffic volume or the adaptation of driving habits

Since the first weather impact models for road accidents

This study follows the predictive modeling approach: we build and assess the skill of logistic regression models for hourly probabilities of weather-related road accidents at the scale of administrative districts in Germany. The aim is to assess model performance at small spatial and temporal scales as well as identify relevant meteorological predictor variables for optimizing the predictive skill. We thus seek an adequate functional relationship between hourly precipitation and accident probability under different temperature conditions and district characteristics. Instead of station-based observations, we use a gridded radar-based precipitation product and a new high-resolution regional reanalysis. Additionally, using ensemble weather forecasts, we assess the predictive skill of the accident model for lead times of up to 21 h.

Section

A data set with anonymized information from police reports of all heavy road accidents in Germany from 2007 until 2012 is used (source: Research Data Centre of the Federal Statistical Office and Statistical Offices of the Länder, Statistik der Straßenverkehrsunfälle, 2007–2012, own calculations).
Heavy road accidents include all accidents with injuries, fatalities or write-offs. Minor accidents are not included in the data set. In total 2 392 329 accidents were reported during the 6-year period under investigation. Most accidents were indicated by the police as being caused by driver behavior. However, 7.7 % (184 201) of the accidents were indicated as being caused by adverse road conditions, which includes a wet, snowy or icy road but also mud or dirt on the road. This class of accidents, which we refer to as

Gridded hourly precipitation sums derived from the RADOLAN (Radar-Online-Aneichung) data set

A reanalysis produced by a novel convective-scale regional reanalysis system for central Europe

Weather forecasts from an ensemble prediction system (EPS) are used to study the predictability of accident probabilities. We use the regional high-resolution ensemble forecasting system COSMO-DE-EPS, which was run operationally at the DWD until May 2018 with a spatial resolution of 2.8 km for the area of Germany. The COSMO-DE-EPS is initiated every 3 h, with a lead time

For our study, a postprocessed product of the archived COSMO-DE-EPS forecasts for the years 2011 and 2017 was provided by the DWD. Instead of archiving the forecast data on the original model grid, area averages of

We aggregate the different meteorological variables to the level of administrative districts. For the station-based COSMO-DE-EPS forecasts a weighted mean of all available stations in the vicinity of the districts was calculated using the probability density function of a bivariate circular symmetric normal distribution as the weighting function. A standard deviation of 25 km proved to be most appropriate as it corresponds well to the average district area.
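The distance weighting described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the input layout and the example values are hypothetical, and distances are assumed to be measured from a single reference point of the district (for example its centroid), which the text does not specify. Because the weights are normalized, the constant prefactor of the normal density cancels and only the exponential term is needed.

```python
import math

def gaussian_weighted_mean(values, distances_km, sigma_km=25.0):
    """Weighted mean with weights from a circular bivariate normal density."""
    if len(values) != len(distances_km) or not values:
        raise ValueError("need equally long, non-empty inputs")
    # The density of a circular symmetric bivariate normal at distance d is
    # proportional to exp(-d^2 / (2 sigma^2)); the prefactor cancels below.
    weights = [math.exp(-d * d / (2.0 * sigma_km ** 2)) for d in distances_km]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all stations are too far away to receive a weight")
    return sum(w * v for w, v in zip(weights, values)) / total

# Hypothetical precipitation values (mm/h) at three stations and their
# assumed distances (km) to the district reference point.
district_precip = gaussian_weighted_mean([1.0, 0.2, 0.0], [5.0, 30.0, 80.0])
```

With a standard deviation of 25 km, the station at 80 km contributes almost nothing to the district value, while the station at 5 km dominates.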

For a fair comparison of RADOLAN and COSMO-REA2 with COSMO-DE-EPS forecasts, the same aggregation is applied to the gridded RADOLAN and COSMO-REA2 products: the areal averages around the 758 gauge stations are computed as described in Sect.

Logistic regression models are used to model the probability of a certain event based on independent predictor variables

The parameters of the logistic regression model can be easily converted to the odds ratio

Parameter estimates

Different logistic models are compared with information criteria. The most popular is the Akaike information criterion

The Brier score (BS) is a proper score to measure accuracy of probabilistic forecasts for binary events as they result from a logistic regression model. Based on

By defining a threshold

A skill score SS is a relative measure of how much a forecast

Cross-validation is a technique in which the performance of a statistical model is tested using independent data that have not been used for estimating the model coefficients. Here, we use a yearly cross-validation approach. Model parameters are estimated on a data set with 1 year of data left out, and scores are calculated for the left-out year. This is repeated until a score has been estimated for every year. The scores are then averaged over all years and used for model comparison.
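The leave-one-year-out procedure can be sketched as below. The sketch is an assumption-laden illustration: the record layout, the function names and the stand-in "model" (a constant base rate instead of a logistic regression) are hypothetical and only serve to show the mechanics of holding out each year in turn and averaging the resulting scores. The Brier score is used here as the mean squared difference between forecast probability and binary outcome.

```python
from collections import defaultdict

def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)

def fit_base_rate(train):
    """Stand-in model (assumption): predicts the training accident frequency."""
    rate = sum(outcome for _, outcome in train) / len(train)
    return lambda rows: [rate for _ in rows]

def yearly_cross_validation(records, fit=fit_base_rate, score=brier_score):
    """records: list of (year, outcome). Returns (mean score, per-year scores)."""
    by_year = defaultdict(list)
    for year, outcome in records:
        by_year[year].append((year, outcome))
    scores = {}
    for held_out in sorted(by_year):
        # Estimate the model on all years except the held-out one.
        train = [r for y, rows in by_year.items() if y != held_out for r in rows]
        test = by_year[held_out]
        predict = fit(train)
        # Score the predictions on the held-out year only.
        scores[held_out] = score(predict(test), [o for _, o in test])
    return sum(scores.values()) / len(scores), scores

# Hypothetical toy data: (year, 1 if an accident occurred in that hour else 0)
toy = [(2007, 1), (2007, 0), (2008, 0), (2008, 0), (2009, 1), (2009, 1)]
mean_bs, bs_by_year = yearly_cross_validation(toy)
```

Replacing `fit_base_rate` with a function that fits a logistic regression would leave the cross-validation loop unchanged.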

To understand the behavior of the model, the predicted accident probabilities of the regression models can be compared to nonparametric estimates for accident frequencies within bins of specific parameter ranges. For example, a predicted accident probability for negative temperatures and a precipitation amount of 1 mm h

The models NULL and HOUR predict the accident probabilities for each district without using weather information (see Tables

Descriptions of predictor variables used in different logistic regression models for hourly probabilities of weather-related road accidents in German administrative districts.

Description of different logistic regression models for hourly probabilities of weather-related road accidents in German administrative districts and their degrees of freedom (Df). Formulas are written using the statistical formula notation system as used in programming languages like R and Python, with colons indicating interaction terms. See Table

The model HOUR includes an additional categorical variable

Accident, radar and reanalysis data overlap in time for the years from 2007 to 2012. For this time period, a binary predictor variable with hourly resolution for the near-surface temperature

Additionally, we fit the models to the individual districts, yielding the models RAD_IND and RAD_INT_IND, respectively. (INT refers to the use of interaction terms in the model equation, while IND refers to estimating model parameters for each district individually.) On the one hand, these models capture the district-specific characteristics; on the other hand, the number of available data points for each model is strongly reduced, which complicates the estimation of model parameters, in particular for districts with low accident numbers. These models are used to quantify the benefit of having one model for all districts.

The accident data and the COSMO-DE-EPS data overlap in time for the years 2011 and 2012. For this time period temperature and precipitation are aggregated to the district level as before for all 20 ensemble members. This is done separately for all forecast lead times

The COSMO-DE-EPS provides hourly forecast data but is initialized only every 3 h. Therefore, not all hours are available for all lead times. For example, a lead time of 6 h is only available at 00:00, 03:00, 06:00, 09:00, 12:00, 15:00, 18:00 and 21:00 UTC, while a lead time of 7 h is only available at 01:00, 04:00, 07:00, 10:00, 13:00, 16:00, 19:00 and 22:00 UTC. Furthermore, the logistic regression model uses local time, which has to take into account daylight savings time. Both effects complicate an explicit use of the hour as a predictor variable in combination with COSMO-DE-EPS data. Therefore, to facilitate the incorporation of a diurnal cycle in the model, a two-step procedure is applied. First, the model HOUR is used to forecast the average diurnal cycle of accident probabilities
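The dependence of the available hours on the lead time follows directly from the 3 h initialization cycle. The short sketch below reproduces the two examples given above; the function name is hypothetical, and the conversion from UTC to local time (including daylight savings time) is deliberately left out.

```python
def valid_hours_utc(lead_time_h, init_interval_h=3):
    """UTC hours of the day for which a forecast with the given lead time exists."""
    init_hours = range(0, 24, init_interval_h)
    # A forecast initialized at `init` with lead time L is valid at (init + L) mod 24.
    return sorted({(init + lead_time_h) % 24 for init in init_hours})

hours_lead_6 = valid_hours_utc(6)  # [0, 3, 6, 9, 12, 15, 18, 21]
hours_lead_7 = valid_hours_utc(7)  # [1, 4, 7, 10, 13, 16, 19, 22]
```

For any fixed lead time only 8 of the 24 hours of the day are covered, which is why the hour cannot simply be used as a categorical predictor for each lead time separately.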

Three different ways to incorporate the ensemble information in the models are used.

The time-averaged hourly probability that at least one weather-related accident occurs in an administrative district is referred to as

Verification measures for models using radar and reanalysis data. Akaike information criterion (AIC), area under receiver operating characteristic curve (AUC), true positive rate (TPR), logarithmic score (LS) and Brier score (BS). Scores computed in a yearly cross-validation approach for each administrative district are shown as an average of all districts. Skill scores of AUC, LS and BS are computed with the model HOUR as reference (see Table

In the model HOUR all parameters of the categorical variables

The introduction of temperature and precipitation as direct effects in the model RAD leads to a further improvement of the scores compared to NULL and HOUR. With an AUC of about 0.81 and an AUCSS of 0.49 (HOUR as reference), temperature and precipitation can be considered useful in terms of binary classification of accident events. The TPR increases from 0.3 for HOUR to 0.7 for RAD. The interaction terms in RAD_INT slightly improve all scores except for the TPR.
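How a TPR at a fixed false-alarm rate can be obtained from probabilistic predictions is sketched below. This is an illustrative assumption about the procedure, not the authors' code: the probability threshold is chosen as the one that maximizes the TPR while keeping the false-positive rate (FPR) at or below the target value. The function name and the toy data are hypothetical.

```python
def tpr_at_fixed_fpr(probs, outcomes, max_fpr=0.2):
    """Return (tpr, fpr, threshold) with the highest TPR whose FPR <= max_fpr."""
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one event and one non-event")
    best = (0.0, 0.0, float("inf"))  # predicting no event at all
    for threshold in sorted(set(probs), reverse=True):
        # An event is predicted whenever the probability reaches the threshold.
        hits = sum(1 for p, o in zip(probs, outcomes) if p >= threshold and o == 1)
        false_alarms = sum(1 for p, o in zip(probs, outcomes) if p >= threshold and o == 0)
        tpr = hits / positives
        fpr = false_alarms / negatives
        if fpr <= max_fpr and tpr > best[0]:
            best = (tpr, fpr, threshold)
    return best

# Hypothetical predicted probabilities and observed outcomes (1 = accident)
toy_probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
tpr, fpr, threshold = tpr_at_fixed_fpr(toy_probs, toy_outcomes)
```

Fixing the FPR makes the TPR values of different models directly comparable, because every model is allowed the same share of false alarms.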

Figure

Distribution of the cross-validated area under receiver operating characteristic curve skill score (AUCSS) of 401 administrative districts is shown for different logistic regression models for weather-related accident probabilities. The probability density is smoothed by a kernel density estimator (shading). The median is indicated by vertical dashed lines.

Figure

Comparison of modeled probabilities of weather-related road accidents with nonparametric probability estimates. Probabilities (lines) and 95 % confidence intervals based on standard errors (shading) of the model RAD

The modeled accident probabilities as a function of

The modeled probabilities as a function of

The modeled probabilities as function of

For more detailed insight into the modeling results, we provide the full model coefficients, standard errors and

Next, we compare the models RAD and RAD_INT, which are fitted to all districts simultaneously, to the models RAD_IND and RAD_INT_IND, which are fitted to all districts individually. Figure

Differences in area under receiver operating characteristic curve skill score (AUCSS) values between the models RAD_IND and RAD (red) and RAD_INT_IND and RAD_INT (black). AUCSS differences are shown for each of the 401 administrative districts vs. the average accident probability

Based on the results of this section, we can conclude that RAD_INT should be preferred over RAD since it achieves the best scores and better represents the functional relationship between probability and precipitation as well as the diurnal cycle. Furthermore, RAD_INT performs better than RAD_INT_IND, which is fitted to each district individually.

The model RAD_INT showed the best performance among the models predicting accident probability using radar and reanalysis data (Sect.

The model EPS_RAD_INT uses

Area under receiver operating characteristic curve skill score (AUCSS) values of different models for hourly probabilities of weather-related road accidents using radar, reanalysis and weather forecast data from 2011 to 2012 as a function of lead time.

The model EPS_MEMi_INT is estimated for each of the 20 ensemble members individually, which therefore results in 20 deterministic forecasts with 20 individual AUCSS values per lead time. The AUCSS drops from 0.48 at lead time 1 h to below 0.45 at lead time 21 h (gray lines). The spread between the AUCSS of the different ensemble members increases with increasing lead time. The model EPS_MEAN_INT is based on the ensemble mean of the meteorological variables (meteorology-averaged ensemble) and shows a slightly higher AUCSS (solid black line) than all the deterministic forecasts.

The model EPS_PMEAN_INT, which is based on the ensemble mean of the accident probabilities of the 20 versions of EPS_MEMi_INT (probability-averaged ensemble), shows again a slightly higher AUCSS (dashed black line) than the meteorology-averaged ensemble. As expected, the AUCSS values of all models based on weather forecast data are lower than the AUCSS of EPS_RAD_INT based on radar and reanalysis data. However, the differences are relatively small. The LSS shows a similar behavior regarding the lead time dependence as the AUCSS (not shown).
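The difference between the meteorology-averaged and the probability-averaged ensemble can be illustrated with a small sketch. Everything specific in it is an assumption made for illustration: the coefficients are invented, the model is reduced to precipitation raised to the power of 0.2 plus a freezing indicator, and the ensemble-mean temperature is simply thresholded at 0 degrees Celsius. Because the logistic model is nonlinear, averaging the inputs first and averaging the resulting probabilities generally give different values.

```python
import math

def accident_probability(precip_mm, temp_c, b0=-5.0, b_precip=2.0, b_frost=1.5):
    """Hypothetical logistic model; the coefficients are not taken from the paper."""
    frost = 1.0 if temp_c < 0.0 else 0.0
    eta = b0 + b_precip * precip_mm ** 0.2 + b_frost * frost
    return 1.0 / (1.0 + math.exp(-eta))

def meteorology_averaged(members):
    """Average the meteorological inputs first, then apply the model once."""
    mean_precip = sum(p for p, _ in members) / len(members)
    mean_temp = sum(t for _, t in members) / len(members)
    return accident_probability(mean_precip, mean_temp)

def probability_averaged(members):
    """Apply the model to every member, then average the probabilities."""
    return sum(accident_probability(p, t) for p, t in members) / len(members)

# Two hypothetical ensemble members: (precipitation in mm/h, temperature in deg C)
members = [(0.0, 1.0), (1.0, -1.0)]
p_met = meteorology_averaged(members)
p_prob = probability_averaged(members)
```

In this toy case the meteorology-averaged value ignores the freezing member, because the mean temperature is not below zero, whereas the probability-averaged value retains its contribution.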

The models RAD_INT and EPS_PMEAN_INT are used in a case study with adverse winter weather conditions on 3 December 2012. At temperatures below the freezing point the fronts of a low-pressure system led to snowfall in large parts of Germany.
These weather conditions led to a total of 280 accidents classified by the police as caused by road condition. The majority of the accidents occurred in southern and western Germany.

Due to regulations regarding anonymization and data protection we are not allowed to show accident counts less than three, which prevents us from showing accident counts for single hours or days at the district level.

For the district of Stuttgart, which was located within the affected area, the RADOLAN data show low precipitation amounts in the early morning and higher precipitation amounts of up to 0.3 mm h

Application of the models EPS_RAD_INT and EPS_PMEAN_INT to an adverse winter weather event on 3 December 2012. Time series are shown for the district of Stuttgart using the COSMO-DE-EPS forecast initialized at 00:00 UTC.

The temperature in COSMO-REA2 is below 0

The accident probability of EPS_RAD_INT shows the combined effect of the average diurnal cycle, RADOLAN precipitation and COSMO-REA2 temperature (Fig.

The hourly accident probability

Model results for adverse winter weather conditions on 3 December 2012 at 17:00 LT based on models EPS_RAD_INT (top) and EPS_PMEAN_INT using the COSMO-DE-EPS forecast with a lead time of 16 h initialized at 00:00 UTC (bottom). From left to right: hourly precipitation at district level, fraction of ensemble members with temperatures below 0

On 3 December 2012 at 17:00 LT, the COSMO-DE-EPS overestimates the precipitation amount in large parts of western and southern Germany compared to RADOLAN (Fig.

Police reports of heavy road accidents in Germany were used to construct hourly time series based on weather-related accidents caused by adverse road conditions for German administrative districts. Different meteorological data sets aggregated to the district level were used in logistic regression models to predict hourly accident probabilities. Models of different complexity were compared after calculating different skill scores using a yearly cross-validation approach. The best model with respect to these scores included district-specific average accident probability, the hour of the day, hourly precipitation and temperature, and their interaction terms. The model reached a hit rate (TPR) of 0.7 when the false-alarm rate (FPR) was fixed at 0.2. With the same false-alarm rate, a model without meteorological parameters only reached a hit rate of 0.3. It was shown that the probability of weather-related accidents increases nonlinearly with increasing hourly precipitation. Given an hourly precipitation of 1 mm, the accident probability is approximately 5 times higher at negative temperatures compared to positive temperatures. In a case study it was shown that the model is able to reasonably capture the spatial and temporal development of accident probabilities during adverse winter weather conditions. When using ensemble weather forecasts to predict accident probabilities, the skill of the logistic regression model remains almost constant for a forecast lead time of up to 21 h. Furthermore, the use of ensemble forecasts leads to a higher skill compared to a setting where ensemble members are treated as individual deterministic forecasts. These findings are in line with the results of

The target variable of this study was weather-related road accidents. The accidents included in the analysis were indicated by the police as being caused by adverse road conditions, which includes a wet, snowy or icy road but also mud or dirt on the road. Thus, the categorization of the accident cause is based on the subjective decision of the police at the location of the accident. This might introduce a bias to the results whose direction or extent is hard to estimate. For example, a large number of accidents that occur during adverse weather conditions are likely to be unrelated to the weather but are caused only by inattention of the driver. Police officers might still categorize these accidents as being weather-related in unclear situations. It should be kept in mind that this could lead to an overestimation of weather-related accident probabilities in the models developed in this study.

It is known that the main parameters affecting accident probability are traffic flow and density. In an optimal case one would use measurements of these variables as model predictors for accident probability. However, traffic measurements are not continuously available for all administrative districts. Additionally, measurements of traffic flow are mainly available for highways and federal roads and might not be representative for municipal roads, where the majority of the accidents occur. Furthermore, in an operational setting, where the model is applied for predicting future accident probabilities, traffic measurements are not available. Therefore, we decided not to directly include traffic measurements in the models. Instead, the hour of the day was used as a categorical predictor variable to capture the average diurnal cycle of accident probability. It was shown that this approach is able to reasonably represent the intraday variability of accident probability. The introduction of additional factors like weekends or holidays did not lead to a significant improvement of the model.

It is a challenging task to combine accident data, which are available for the area of administrative districts, with meteorological data, which are usually available in the form of point observations or gridded data. Different ways of aggregating meteorological data to the district level were tested, and the approach based on distance-weighted averaging, which is presented in this study, showed the best results.

The temperature at 2 m height was used in this study to include the effect of negative temperatures in the statistical model in a relatively simple approach. It has the benefit that the temperature at 2 m height is a well-established meteorological parameter, which is measured at most stations and available in all weather forecasting models. However, it might not reflect the conditions at the road surface, which can deviate from the conditions at 2 m height. Also, the choice of 0

In addition to the weather parameters presented in this study, other parameters like snowfall amount or combined measures of cloud cover and sun angle to describe the impact of sun glare were tested as potential predictor variables. Furthermore, advanced predictor selection techniques like genetic algorithms

We found that the probability of weather-related accidents depends on hourly precipitation to the power of 0.2. This exponent should not be understood as a universal relationship. Instead, it is likely to depend on different aspects of the road system (e.g., how fast the water is able to leave the road surface) or the average car characteristics (e.g., the share of cars equipped with assistance systems or the type of tires). It may even change in time as road and car qualities improve.
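The effect of the exponent can be made concrete with a short sketch. It assumes that the transformed precipitation enters the linear predictor of the logistic model with a single coefficient; the coefficient value and the function names are hypothetical and not taken from the fitted models.

```python
import math

def transformed_precip(precip_mm, exponent=0.2):
    """Precipitation predictor after the power transformation."""
    return precip_mm ** exponent

def odds_ratio(precip_a_mm, precip_b_mm, coefficient):
    """Odds ratio between two precipitation amounts, assuming the linear
    predictor contains coefficient * precip ** 0.2 and nothing else changes."""
    delta = transformed_precip(precip_b_mm) - transformed_precip(precip_a_mm)
    return math.exp(coefficient * delta)

# A tenfold increase from 1 to 10 mm/h raises the transformed predictor
# only from 1.0 to about 1.58.
low, high = transformed_precip(1.0), transformed_precip(10.0)
# Hypothetical coefficient of 1.0, used purely for illustration.
example_ratio = odds_ratio(1.0, 10.0, coefficient=1.0)
```

The power transformation strongly compresses large precipitation amounts, so most of the change in the predictor occurs at small hourly sums.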

In this work we showed two ways of modeling probabilities in different districts: first, by creating a model that distinguishes between different districts based on their average accident probability and, second, by creating a model for each district individually. We found that the first approach leads to higher skill scores, particularly for districts with low accident numbers. Including additional district-specific parameters describing the characteristics of the road network or topographic conditions could help to further refine the model.

This study shows that a skillful relationship between meteorological parameters and weather-related road accidents can be established. Forecasts of probabilities of weather-related road accidents, as presented in this study, might be useful for authorities (traffic management, police or emergency services) on the one hand and road users on the other hand. However, it is reasonable to provide the information about accident risk in different, user-specific formats, which are introduced in Sect.

It was shown that impact-based warning can lead to better actions of the recipients

The accident data for Germany were obtained from the Research Data Centre of the Federal Statistical Office and Statistical Offices of the Länder. The RADOLAN data set

The supplement related to this article is available online at:

Data analysis and visualization were done by NB; all authors contributed to writing the manuscript.

The authors declare that they have no conflict of interest.

This research was carried out within the framework of the Hans-Ertel-Centre for Weather Research. This research network of universities, research institutes and the Deutscher Wetterdienst is funded by the Bundesministerium für Verkehr und Digitale Infrastruktur.

This research has been supported by the Bundesministerium für Verkehr und Digitale Infrastruktur (grant no. 4818DWDP3A). We acknowledge support from the Open Access Publication Initiative of Freie Universität Berlin.

This paper was edited by Joaquim G. Pinto and reviewed by three anonymous referees.