the Creative Commons Attribution 4.0 License.
Are heavy rainfall events a major trigger of associated natural hazards along the German rail network?
Abstract. Heavy rainfall events and associated natural hazards pose a major threat to rail transport and infrastructure. In this study, the correlation between heavy rainfall events and three associated natural hazards was investigated using GIS analyses and random-effects logistic models. The spatio-temporal linkage of a damage database of DB Netz AG and the CatRaRE-catalogue of the German Weather Service revealed that almost every part of the German rail network was affected by at least one heavy rainfall event between 2011 and 2021. Twenty-three percent of the flood events, 14 % of the gravitational mass movements and 2 % of the tree fall events occurred after a heavy rainfall event. The random-effects logistic regression models showed that a heavy rainfall event significantly increases the probability of occurrence of a flood (tree fall) by a factor of 34.29 (39.85), respectively, with no significant increase for gravitational mass movements. The heavy rainfall index and the 21-day antecedent precipitation index were determined as characteristics of the heavy rainfall events with the strongest impact on all three natural hazards. The results underline the importance of gaining more precise knowledge about the impact of climate triggers on natural hazard-related disturbances, to make rail transport more resilient.
Status: closed
-
RC1: 'Comment on nhess-2023-196', Anonymous Referee #1, 05 Jan 2024
Review on
"Are heavy rainfall events a major trigger of associated natural hazards along the German rail network?"
by Sonja Szymczak, Frederick Bott, Vigile Marie Fabella and Katharina Fricke
General comments
----------------
The subject is interesting and well suited for NHESS. The article needs, however, additional explanations, equations and a better structure. At the moment it is difficult to completely understand what was done and why it was done. I am missing a more thorough discussion relating the design and the results of the statistical models to the relevant physical mechanisms. Some of my remarks below may be a result of misunderstandings or lack of statistical background. But although I am no expert for the methods used in this paper, some general rules apply to most statistical models. I think it is essential that overfitting is ruled out and that the consequences of using so many correlated predictors are discussed.
Specific comments
-----------------
Introduction:
- - L52-53 The introduction lacks some literature that links precipitation to tree falls (this linkage might be obvious for gravitational mass movements and floods). Does the year 2021 really prove this relationship? Isn't it likely that the heavy rain was accompanied by wind gusts?
Datasets:
- - L79 spatial intersection has not been explained yet. At least a reference to the section where it will be explained is needed.
- - L93-100 spatial intersection should be explained first. This paragraph should be moved into the results section. It is no longer a description of CatRaRE.
- - L136 Why 5km resolution and not 1km (HYRAS-DE-PRE)? Are you aware that daily precipitation in HYRAS is aggregated from 06 UTC to 06 UTC and that a clear assignment to a single date is not possible? Is daily precipitation the explanatory control variable or 30-day antecedent precipitation (L193), or both? A list of all explanatory control variables is needed in this section (not only a description of the raw data that was used to derive them), otherwise it becomes confusing.
- - L139 DWD soil moisture is based on observations and a soil moisture model.
- - In general, I find it difficult to keep track/distinguish between the different terms (e.g. rain and rail events). A complete list or table in this section on how the terms are used and what they encompass would be helpful (event, natural hazard, observation, explanatory control variables).
Methods:
- - L156 I am not convinced. A flood can be the result of long-lasting precipitation that is not categorized as heavy (as can be seen this year in Germany and Britain).
- - L166 Please explain the analysis of natural breaks (in which data set?) and its results and how this supports the choice of the time period.
- - L171 Corresponding to what?
- - L174 Please explain what panel data analysis, cross sectional analysis and random-effects logistic regression are, for what kind of investigations they should be used and why you chose to apply them here. As these methods are mostly used in social and economic science, it shouldn't be assumed that they are known by the audience this article addresses (geological/physical/climatological scientists). I suggest also writing down the equations and explaining all variables and terms in the equations using an example from this study.
- Please homogenize the usage of indices. i is used for route segments, time lags and combinations of route segments with heavy rain events. This makes it difficult to understand the equations.
- - L182 Single point in time (which point in time?) or rather all event time steps together? Please explain in more detail.
- - L185 The route segment description would better fit in the datasets section.
- - L200 Why location of beginning of segment and not the middle?
- - L212 This is a normal logistic regression approach (not random-effects). There seems to be a lot of correlation between the independent variables (rain amount and heavy precipitation event, antecedent precipitation index, 30-day precipitation and soil moisture, topographic index and hazard zone ....). Is that a problem for your analysis? What are the consequences for the interpretation of the results? I assume that the OR for these variables is underestimated because parts of the effect are captured by the correlated variables.
- - L231 I do not understand the reasoning behind the approach to include annual and seasonal dummies. Please explain it in terms of physical mechanisms. The annual and seasonal variability is already captured by the inclusion of the precipitation events and the precipitation amounts that also have a seasonal cycle and annual variability. What additional processes do the annual and seasonal dummies represent? Please also explain that dummy means binary. Suggestion: If you need to capture an annual cycle, harmonic functions are a good approach (e.g. https://doi.org/10.1016/j.spasta.2017.11.007)
- - L237-238 (Eq 3a and 3b) To me it seems that the method you apply is actually a mixed-effects logistic regression model with random effects (mu_i) and fixed effects (all other effects). I would interpret mu_i as a constant offset that affects the mean probability and depends on the rail segment. Without it the equation has the form of an ordinary logistic regression model. I don't understand the idea behind this approach. What are the physical characteristics that differ between segments but are constant within a segment? It seems you already included the relevant geological information as geological control variables (e.g. hazard class, topographic information ...).
- - L245: Please explain from a physical point of view why you investigate these interactions (and why others are not studied). If you include interaction terms more than one regression coefficient is relevant for the variable. I have doubts that you can use the OR (L219) calculated from just one coefficient to compare the importance of the independent variables if you include interaction terms.
- - Table 1 Why don't you create one table including all variables included in the model for a better overview (soil moisture, seasonal dummy, rail segment, 30-day precipitation, .... ). Please also extend the description. What is meant by specified duration? What is the topographic position index?
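The model structure discussed in the comments on L212, L237-238 and L245 can be written out as a short sketch. The notation here is an assumption chosen for illustration and is not copied from the manuscript's Eq. 3a and 3b: y_it is the binary hazard indicator for route segment i on day t, x_it is the vector of fixed-effect predictors, and mu_i is a segment-specific random intercept, assumed to be normally distributed as is common for this model class.

```latex
\begin{align}
\operatorname{logit} P(y_{it} = 1 \mid x_{it}, \mu_i)
  &= \beta_0 + x_{it}^{\top}\beta + \mu_i,
  \qquad \mu_i \sim \mathcal{N}(0, \sigma_\mu^2) \\
\mathrm{OR}_k
  &= \exp(\beta_k)
  \quad \text{(no interaction involving } x_k\text{)} \\
\mathrm{OR}_1(x_2)
  &= \exp(\beta_1 + \beta_{12}\, x_2)
  \quad \text{(model contains the term } \beta_{12}\, x_1 x_2\text{)}
\end{align}
```

With mu_i dropped, the first line reduces to an ordinary logistic regression. The last line illustrates the concern raised for L245: once an interaction term is included, the odds ratio of x_1 depends on the value of x_2, so a single coefficient no longer summarises the effect.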
Results:
- - L284 Does this also hold for the individual processes? Fig. 3 only shows the combined result.
- - L296 ...the higher the value... (of log likelihood or sample size?)
- - L297 The AIC is useful for comparing models of different complexity. However, you can only compare the AIC if the models are fitted using the same number of observations. This is not the case here.
- - Table 2 and Table 4 Please give the full model equations for the 3 hazards. Do you use the same equation for all 3 hazards (e.g. is the hazard indication map for slope and embankment landslides used for all 3 hazards)? How many parameters did you have to fit for each hazard model? As each segment needs one parameter it must be more than 9679. This is a lot compared to the number of natural hazard events that occurred during the analysis period (14461 trees, 1269 floods, 418 gravitational mass movements). Can you rule out overfitting?
- - L340 indicate ... non-linearity ... What brings you to this conclusion?
- - L344 The analysis considers only the interaction terms? Please elaborate what you have done.
- - Figure 4 What does the figure show? The probability or a prediction of the probability using the statistical model? Panels a,d,e: If precipitation becomes strong enough the day should automatically become a heavy rainfall event. Why are there two different curves at high precipitation values? Panels c,f,i: Do these curves make any sense in terms of physics? I assume they are a statistical model artifact. Soil moisture was one of the non-significant parameters.
- - L384 Could this mean that there are no trees if the soil is sealed and therefore the probability for tree fall is low?
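The point made for L297 about comparing AIC values across different sample sizes can be illustrated with a minimal sketch. It uses an intercept-only logistic model, for which the maximised log-likelihood has a closed form; the sample sizes and the event share of 10 % are hypothetical values chosen for illustration and do not come from the manuscript.

```python
import math


def bernoulli_loglik(n_obs: int, event_rate: float) -> float:
    """Maximised log-likelihood of an intercept-only logistic model.

    For n_obs binary observations with an observed event share of
    event_rate, the fitted probability equals event_rate.
    """
    p = event_rate
    return n_obs * (p * math.log(p) + (1 - p) * math.log(1 - p))


def aic(loglik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * loglik


# Same model, same event share, only the number of observations differs
# (hypothetical sample sizes).
aic_small = aic(bernoulli_loglik(1_000, 0.1), n_params=1)
aic_large = aic(bernoulli_loglik(2_000, 0.1), n_params=1)

print(f"AIC with 1000 observations: {aic_small:.1f}")
print(f"AIC with 2000 observations: {aic_large:.1f}")
```

The two fits are equally good per observation, yet the AIC roughly doubles because the log-likelihood is a sum over observations. A difference in AIC between models fitted to different numbers of observations therefore says nothing about which model is better.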
Technical corrections
----------------------
- - L84/85 Is there a difference between events and heavy rainfall events?
- - L177 proximity in space or time?
- - L212 is the prime symbol missing from beta2?
- - L264 is the prime symbol missing for beta?
Citation: https://doi.org/10.5194/nhess-2023-196-RC1
- AC2: 'Reply on RC1', Sonja Szymczak, 09 Apr 2024
-
CC1: 'Comment on nhess-2023-196', John K. Hillier, 09 Jan 2024
Dear Sonja and co-authors,
A very quick comment. I know from my experience that it is often difficult to find impact-based work using infrastructure losses from other countries. I have previously used rail impacts to examine multi-hazard losses in Great Britain, and there is some work on extreme heat. There may be similar work in other countries, but the fact that I'm not readily aware of it highlights the utility of linking such work together. Please consider briefly citing some work from other countries on the use of rail network impact data.
John
Reference 1 (and references to the work on heat in the supplementary material of the paper, S1.1 Data) - John K. Hillier, Tom Matthews, Robert L. Wilby & Conor Murphy: Multi-hazard dependencies can increase or decrease risk. Nature Climate Change volume 10, pages 595–598 (2020) https://www.nature.com/articles/s41558-020-0832-y
Reference 2 - Bloomfield, Hillier, Griffin et al (2023) https://www.sciencedirect.com/science/article/pii/S2212094723000038
Citation: https://doi.org/10.5194/nhess-2023-196-CC1
AC4: 'Reply on CC1', Sonja Szymczak, 09 Apr 2024
Thank you for the reference to the two publications. We will be happy to integrate them into the revised version.
Citation: https://doi.org/10.5194/nhess-2023-196-AC4
-
CC2: 'Comment on nhess-2023-196', Katharina Lengfeld, 12 Jan 2024
Comments on: "Are heavy rainfall events a major trigger of associated natural hazards along the German rail network?"
I have read this preprint with great interest and think that it is an important study for understanding the influence of heavy rainfall events on damage to rail transport and infrastructure. My colleagues and I developed CatRaRE and we highly appreciate the use of the dataset in this study. However, there are a few issues in the description of CatRaRE and the results that I would ask the authors to address. If the authors have any questions or would like to discuss some of the issues mentioned below, please feel free to contact me or my colleague Ewelina Walawender.
- CatRaRE catalogue: In the abbreviation CatRaRE the word “catalogue” is already included, therefore CatRaRE catalogue would mean Catalogue of Radar-based heavy Rainfall Events catalogue. I suggest using either just CatRaRE or catalogue of radar-based heavy rainfall events.
- P.3, L.72: CatRaRE W3 and T5 are DOI referenced datasets. Please use the appropriate reference for the catalogue used in this study, which I guess is the Version 2022.01:
Lengfeld, K., Walawender, E., Winterrath, T., Weigl, E., Becker, A., 2022, Heavy precipitation events version 2022.01 exceeding DWD warning level 3 for severe weather based on RADKLIM-RW version 2017.002, DOI:10.5676/DWD/CatRaRE_W3_Eta_v2022.01.
In case another version is used, please check https://www.dwd.de/DE/leistungen/catrare/catrare_daten.html?nn=16102&lsbId=751876
- P.3, L.73-74: In CatRaRE events with 11 different durations between 1 and 72 hours are listed. In the catalogue W3 we use the lower boundary of warning level 3 as a threshold. Not only the warning levels for 1 hour (25 mm) and 6 hours (35 mm) are used, but also the ones for 12 (40 mm), 24 (50 mm), 48 (60 mm) and 72 hours (90 mm). There are no official warning levels for rainfall events with durations of 2, 3, 4, 9 and 18 hours. Therefore, for these durations we linearly interpolate the official warning levels and get thresholds of 27 mm in 2 hours, 29 mm in 3 hours, 31 mm in 4 hours, 37.5 mm in 9 hours and 45 mm in 18 hours.
- P.4, L.98 and Table 1: The SRI does not describe the speed at which rainfall accumulates within a specific duration of time. The SRI is based on the return period of the rainfall amount for indices 1-7, where 7 corresponds to a return period of 100 years. Indices 8-12 are based on the rainfall amount compared to precipitation with a return period of 100 years. Please clarify the description and see Schmitt (2017) and Schmitt et al. (2018) for more information.
- P.6, L.136: What is the reason for choosing the HYRAS dataset over the climatological radar dataset RADKLIM? RADKLIM would correspond to CatRaRE and has a higher spatial resolution comparable to the soil moisture dataset.
- P.8, L.194: Why do the authors start the period with 1 February 2011? Precipitation data should be available for 2010 as well, allowing also for calculation of 30-day antecedent precipitation for January 2011.
- P.10-11, L.252-256 + Section 3.3.: The CatRaRE event variables the authors have chosen (Tab. 1) are calculated for the whole event area (e.g. as an average over the event zone), which can in extreme cases cover several thousand km². However, the damage data used in this study are available for a given route segment (point location), so the cross-analysis makes sense only if a given rainfall event is undifferentiated within its zone in terms of precipitation characteristics (RR, SRI, V3) and occurs over an area with similar landscape pattern (TPI, VSGL, STRM). A pixel-based analysis would be more appropriate in this case. Also, using ETA as a measure of extremity is not appropriate for a point analysis, as it is calculated on the basis of the event area.
- P.11, L.270: The fact that only 23% of the flooding events are linked to heavy rainfall events seems surprising to me. Did the authors also check for rainfall events in a certain radius around the flooding, since rainfall does not necessarily cause flooding in the region of its occurrence but in the region where the water flows to?
- P.13, L.312-314: It is not clear to me why a rain event should cause a flood one or two days after its occurrence. If there is no more rain in the area, a flood should not occur unless the water comes from another region, e.g. from upstream along a river. But then the flood is probably triggered by another rainfall event that occurred upstream and not by the one that occurred in the area with the flooded railway section. Therefore, not only a temporal but also a spatial buffer should be taken into consideration. In case the flooding occurred one or two days after the rainfall event, I would also suggest checking the HYRAS dataset to see whether there was more rainfall in the damaged area or the surroundings that could have caused the damage but wasn’t classified as an event in CatRaRE. I understand that a detailed analysis of flow paths is beyond the scope of this paper, but the issue as well as the difference between damages caused by a heavy rainfall event and by a flood event should at least be mentioned in the discussion.
- Section 3.3: I am not sure if increasing each parameter by one unit is appropriate. A 1 mm increase in mean precipitation is not comparable to increasing the duration by 1 hour or the SRI by 1. Let’s e.g. assume a precipitation sum of 50 mm in 1 hour has a return period of 100 years, which corresponds to SRI = 7. Increasing the precipitation sum by 1 mm leads to 51 mm in 1 hour, which most probably still has an SRI of 7 because the return period won’t increase that much. Increasing the SRI by one to SRI = 8 would mean according to Schmitt (2017) that the precipitation would be 1.2 to 1.4 times the precipitation sum for a return period of 100 years (which is 50 mm in our case). Therefore, increasing SRI from 7 to 8 would increase the precipitation sum from 50 mm to a value between 60 and 70 mm, which is 10 to 20 times more than the increase of 1 mm that was assumed for investigating the influence of increasing the precipitation by 1 unit. Therefore, the influence of increasing the SRI by one unit is by definition larger than the influence of increasing the mean precipitation by one unit. Also, I don’t quite understand how the duration of precipitation is increased. In my example I had a duration of 1 hour and 50 mm. Does increasing the duration by one unit mean that it will rain 50 mm in 2 hours instead, or 2*50 mm = 100 mm in 2 hours?
References:
Schmitt, T.G., 2017: Ortsbezogene Regenhöhen im Starkregenindexkonzept SRI12 zur Risikokommunikation in der kommunalen Überflutungsvorsorge. Korrespondenz Abwasser, Abfall 63, DOI: 10.3242/kae2016.11.001
Schmitt, T.G., Krüger, M., Pfister S., Becker, M., Mudersbach, C., Fuchs, L, Hoppe H. and Lakes, I., 2018: Einheitliches Konzept zur Bewertung von Starkregenereignissen mittels Starkregenindex. Korrespondenz Wasserwirtschaft 11, DOI: 10.3242/kae2018.02.002
Citation: https://doi.org/10.5194/nhess-2023-196-CC2
- AC5: 'Reply on CC2', Sonja Szymczak, 09 Apr 2024
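The linear interpolation of warning levels described in the CC2 comment on P.3, L.73-74 can be reproduced with a minimal sketch. The official thresholds are the ones quoted in that comment; the function and variable names are hypothetical, and this is not the code used to build CatRaRE.

```python
# Lower boundary of warning level 3 as quoted in the comment:
# duration in hours -> rainfall threshold in mm.
OFFICIAL_THRESHOLDS_MM = {1: 25.0, 6: 35.0, 12: 40.0, 24: 50.0, 48: 60.0, 72: 90.0}


def interpolated_threshold(duration_h: float) -> float:
    """Linearly interpolate the threshold between the two neighbouring official durations."""
    durations = sorted(OFFICIAL_THRESHOLDS_MM)
    if not durations[0] <= duration_h <= durations[-1]:
        raise ValueError("duration must lie between 1 and 72 hours")
    for lower, upper in zip(durations, durations[1:]):
        if lower <= duration_h <= upper:
            lower_mm = OFFICIAL_THRESHOLDS_MM[lower]
            upper_mm = OFFICIAL_THRESHOLDS_MM[upper]
            fraction = (duration_h - lower) / (upper - lower)
            return lower_mm + fraction * (upper_mm - lower_mm)
    raise ValueError("no bracketing interval found")  # defensive, not reached


# Durations without an official warning level, as listed in the comment.
intermediate = {d: interpolated_threshold(d) for d in (2, 3, 4, 9, 18)}
for duration, threshold in intermediate.items():
    print(f"{duration:>2} h: {threshold} mm")
```

The interpolated values match the thresholds stated in the comment: 27 mm in 2 hours, 29 mm in 3 hours, 31 mm in 4 hours, 37.5 mm in 9 hours and 45 mm in 18 hours.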
-
RC2: 'Comment on nhess-2023-196', Ugur Ozturk, 22 Feb 2024
The manuscript demonstrates the integration of damage data from infrastructure operators with climate data from weather services, aiming to discern potential relationships that could enhance proactive management of natural hazards. The authors' perspective, primarily through the lens of a railroad operator, brings a focused approach to understanding and mitigating disruptions to railroad operations caused by climate extremes. This perspective is particularly timely, given the anticipated increase in such disruptions under the impact of climate change, highlighting the urgent need for targeted countermeasures.
As I read the manuscript, I found the unique approach to examining rainfall, associated hazards, and their impact on the rail network from a rail network operator's standpoint to be both enlightening and compelling. The entire analysis bears the imprint of this distinctive viewpoint, offering insights that are both practical and relevant to the field. However, I feel that certain aspects of the study and the choices made therein would benefit from additional elucidation, and there may be room to broaden some analyses to further strengthen the findings and their implications.
My major concerns concentrate around the method choices forming the foundation of the current study. I highlight the line number of a piece from the manuscript in quotation marks which is followed by my comments after the sign of -->.
Line 79: "spatially intersected with the German rail network" --> Is this intersection achieved considering purely spatial overlap, or are rainfall runoff conditions taken into account as well? For instance, rainfall upstream could potentially impact tracks downstream, even in the absence of local precipitation.
Line 97: "one heavy rainfall event" --> Could the authors clarify if they are referring to hydrogeomorphological events, including mass-wasting process? I suspect that the tree falls might relate more to wind than rainfall. If the tree fall process is indeed related to wind, it might be beneficial to consider a term that encompasses all three phenomena. Perhaps including wind events as a factor, or alternatively, reconsidering the inclusion of tree fall cases, might provide a more concise, easy-to-explain analysis.
Line 158: "e.g. shown for shallow landslides" --> While the observation may hold true for shallow landslides, it's important to note that gravitational mass movements also encompass deep-seated landslides, where the lag time could extend significantly, potentially reaching years. Even excluding these exceptional cases, a lag time of 10-15 days appears realistic, as evidenced by Dille et al. (2022; https://doi.org/10.1038/s41561-022-01073-3). Should the focus be on shallow landslides, it would be helpful if this distinction is made clear. The choice of a 2-day lag for considering landslides raises some concerns for me, and I kindly suggest revisiting this aspect for a more nuanced discussion. The term "gravitational mass movements" might cover more processes than the authors intended.
Lines 200–202: "The segment is considered to have been affected by a heavy rainfall event on a given day if a heavy rainfall event from the CatRaRE database has occurred on that day up to a maximum of two days previously." --> As I mentioned in my previous comment, I'm concerned that the proposed time-lag window might not adequately capture the lag time associated with landslides. Later on, in the results (comment below), the authors highlight that only a small fraction of gravitational mass movements were linked to certain rain events. Hence, as previously mentioned, extending this window could offer a more accurate representation of the impact of heavy rainfall on landslide occurrences.
Line 275: "a total of 59 events (14 %)" --> I wonder if the correlation might become more pronounced if the lag time were extended to 15 days or more. This adjustment could potentially offer a more comprehensive analysis of the impact of heavy rainfall on these events.
Lines 278-279: "Of the 14461 tree fall events, a total of 312 (2 %) events can be spatially and temporally linked to a heavy rainfall event." --> This observation might suggest an indirect connection between rainfall and tree falls, potentially implicating other factors such as wind (as discussed by Gardiner et al.; http://dx.doi.org/10.2139/ssrn.4576016) or flooding (Lucia et al., 2018; https://doi.org/10.1016/j.scitotenv.2018.05.186). Further exploration of these factors could enrich the study.
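The effect of the time-lag window discussed for Lines 200–202 and Line 275 can be made concrete with a minimal sketch. It encodes the quoted rule (a damage event counts as linked if a heavy rainfall event affected the segment on the same day or up to a maximum number of days before) with the lag as a parameter. The function name and the dates are hypothetical and are not taken from the DB Netz AG or CatRaRE data.

```python
from datetime import date, timedelta


def linked_to_heavy_rain(damage_day, rain_event_days, max_lag_days=2):
    """Return True if a heavy rainfall event occurred on the damage day
    or up to max_lag_days before it."""
    earliest = damage_day - timedelta(days=max_lag_days)
    return any(earliest <= day <= damage_day for day in rain_event_days)


# Hypothetical example: one heavy rainfall event, and a mass movement
# reported ten days later on the same route segment.
rain_days = [date(2021, 7, 1)]
landslide_day = date(2021, 7, 11)

linked_2d = linked_to_heavy_rain(landslide_day, rain_days, max_lag_days=2)
linked_15d = linked_to_heavy_rain(landslide_day, rain_days, max_lag_days=15)
print(linked_2d, linked_15d)
```

With the 2-day window the hypothetical mass movement is not linked to the rainfall event, while a 15-day window links it. Rerunning the linkage with several window lengths would show how sensitive the reported 14 % share is to this choice.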
My minor concerns primarily revolve around the use of terminology and the occasional absence of detailed explanatory statements that could further enhance the manuscript's readability and comprehension. Clarifying these aspects could improve the overall understandability for the readers. I list the minor comments in the attached file to keep my online comments concise.
- AC1: 'Reply on RC2', Sonja Szymczak, 09 Apr 2024
-
RC3: 'Comment on nhess-2023-196', Anonymous Referee #3, 28 Feb 2024
The paper is well-written, and most of the study details are well documented. It is a useful addition to the many modelling studies that assume deterministic relationships between rainfall, inundation, and flood damage to rail infrastructure. We have a few concerns about the railway incident data though, which we would like to clarify below.
Major comments
1. For the reader it is not easy to gain an impression of what the DB Netz AG database looks like. Because some of the results are somewhat puzzling and surprising (see point 2), we request the authors to provide more information on how this dataset looks. A few concrete suggestions:
   - Include a few example records from the database showing the raw data (i.e. the raw text record that was identified), preferably with examples showing all three types of natural hazards, and with examples that did match with rainfall events as well as examples that did not match.
   - Please give one or more examples of how a well-known flood event (for example the July 2021 floods) was mapped in the data, ideally with some pictures of how the damage looked in the real world. Earlier NHESS articles (e.g. https://doi.org/10.5194/nhess-22-3831-2022) and the authors' own https://doi.org/10.3390/atmos13071118 contain many relevant details to which the records can be linked.
   - Please give examples of records that were tagged as a flood by DB, but for which no extreme rainfall was reported.
2. We find the finding that only a quarter of the flood events could be linked to extreme rainfall events (line 414) very puzzling. In our view, the potential causes of this are insufficiently discussed in the draft version. Please critically reflect on the following possible causes:
   - Is it possible that most of the reported flood events relate to river (fluvial) flooding? On the one hand, one might expect that this is the case, because it would present a natural explanation for the fact that the damage is observed in a different location than the extreme rainfall event. On the other hand, we would be surprised if the ratio between fluvial and pluvial flood events were 3:1. Also, if 75% of the flood cases concerned river flooding, one may wonder if the methodology of the paper is sound. The authors mention this aspect in line 463, but what is missing is a reflection on whether this explains the low correlation in their results.
   - Another take would be that most of the floods are caused by rainfall events that do not qualify as heavy rainfall, which is reasonable given that there is no standardized guideline for defining heavy rainfall (line 72). If this is the case, it would be best to clarify it in the text and highlight it as relevant further research.
   - Is it possible that, for other reasons, what the DB understands as a ‘flood’ is very different from what the radar data show?
   - Are there other possible explanations?
We mainly raise the above points because for modelling studies, the outcomes of the present study may have large implications. Most modelling studies assume a deterministic relationship between rainfall, inundation and damage. The present study seems to suggest that such relationships would only explain 25% of reported flood damage events. We invite the authors to further reflect on this. Are the authors aware of any other empirical studies that looked into this relation? Did they find similar results?
Minor comments
- In the abstract, please list the ‘three associated natural hazards’ upfront. Now it takes a while before the reader knows which hazards you examined, namely: floods, gravitational mass movements / landslides?, tree fall.
- Line 47: and smaller tolerance of risk compared to road transport?
- Line 50: Can the authors provide any additional information on what type of damage is reported in the DB database? Do railtrack characteristics play a role? What part of the track is damaged? Could it also be damage to a pier or abutment of a bridge that supports the track?
- Line 59: do you mean bias or correlation?
- Line 64: One or two figures to illustrate the data sources could be useful for a reader that is unfamiliar with them.
- Line 88: Figure 1 description indicates Monthly and yearly distribution of heavy rainfall, stating it as Yearly and monthly would better match the content of the figure. Figures c-h are also more likely to be consulted by the reader when reading the results, making the figure placement inconvenient (though understandable). Could changing the position or splitting the figure into two make it more readable?
- Line 147: Can you describe in a few lines what the polygon data look like? Is it a polygon indicating a uniform amount of rainfall within that polygon?
- Line 149: Punctuation can be improved for readability.
- Line 163: The selection of 2 days as a time window could be better explained; how often does it take longer than 2 days to record the damage events? During periods of heavy rainfall, it may not be safe to collect the data, for example?
- Line 185-200: The definition of route segments could be further clarified and justified, specifically: (1) What is an “operating point”? (2) What are the implications of such a wide range of lengths (140 m to 12.7 km): is the starting point in a long stretch equally representative as that of a short one? (3) Taking 5-meter segments may be unmanageable, but why not, for example, use 1 km segments as a standard?
- Line 209 (and other locations): I find the term natural hazards a bit ambiguous in this context, because it could be used to indicate either the extreme rainfall event, or the flood/gravitational mass movement/tree fall.
- Line 393: What is meant by “can be spatially overlaid”?
- Line 397: does this conclusion follow from the data, or from other literature, or from common understanding?
- Line 401-402: Is flooding triggered by heavy rainfall events (Line 402) an example of the statement in the previous sentence? This can be clarified. The use of the word connections makes it sound like it is a separate idea or concept.
- Line 410: This idea is not very clear. An additional sentence with a specific example may help. Something like “[…] but to establish connections between the processes through X or Y, for example, by looking at mass land movements triggered by flooding”, if this was indeed the idea.
- Line 451: What is meant by “the different background of the data collections”? Is it their intended or original purpose rather than their background?
Citation: https://doi.org/10.5194/nhess-2023-196-RC3
- AC3: 'Reply on RC3', Sonja Szymczak, 09 Apr 2024
- For the reader it is not easy to gain an impression of what the DB Netz AG database looks like. Because some of the results are somewhat puzzling and surprising (see point 2), we request the authors to provide more information on how this dataset looks. A few concrete suggestions:
Status: closed
-
RC1: 'Comment on nhess-2023-196', Anonymous Referee #1, 05 Jan 2024
Review on
"Are heavy rainfall events a major trigger of associated natural hazards along the German rail network?"
by Sonja Szymczak, Frederick Bott, Vigile Marie Fabella and Katharina FrickeGeneral comments
----------------
The subject is interesting and well suited for NHESS. The article needs, however, additional explanations, equations and a better structure. At the moment it is difficult to completely understand what was done and why it was done. I am missing a more thorough discussion relating the design and the results of the statistical models to the relevant physical mechanisms. Some of my remarks below may be a result of misunderstandings or lack of statistical background. But although I am no expert for the methods used in this paper, some general rules apply to most statistical models. I think it is essential that overfitting is ruled out and that the consequences of using so many correlated predictors are discussed.Specific comments
-----------------
Introduction:- - L52-53 The introduction lacks some literature that links precipitation to tree falls (this linkage might be obvious for gravitational mass movements and floods). Does the year 2021 really proofs this relationship? Isn't it likely that the heavy rain was accompanied by wind gusts?
Datasets:
- L79 spatial intersection has not been explained yet. At least a reference to the section where it will be explained is needed.
- L93-100 spatial intersection should be explained first. This paragraph should be moved into the results section. It is no longer a description of CatRaRE.
- L136 Why 5km resolution and not 1km (HYRAS-DE-PRE)? Are you aware that daily precipitation in HYRAS is aggregated from 06 UTC to 06 UTC and that a clear assignment to a single date is not possible? Is daily precipitation the explanatory control variable or 30-day antecedent precipitation (L193), or both? A list of all explanatory control variables is needed in this section (not only a description of the raw data that was used to derive them), otherwise it becomes confusing.
- L139 DWD soil moisture is based on observations and a soil moisture model.
- In general, I find it difficult to keep track/distinguish between the different terms (e.g. rain and rail events). A complete list or table in this section on how the terms are used and what they encompass would be helpful (event, natural hazard, observation, explanatory control variables).
Methods:
- L156 I am not convinced. A flood can be the result of long-lasting precipitation that is not categorized as heavy (as can be seen this year in Germany and Britain).
- L166 Please explain the analysis of natural breaks (in which data set?) and its results and how this supports the choice of the time period.
- L171 Corresponding to what?
- L174 Please explain what panel data analysis, cross sectional analysis and random-effects logistic regression are, for what kind of investigations they should be used and why you chose to apply them here. As these methods are mostly used in social and economic science, it shouldn't be assumed that they are known by the audience this article addresses (geological/physical/climatological scientists). I suggest also writing down the equations and explaining all variables and terms in the equations using an example from this study.
- Please homogenize the usage of indices. i is used for route segments, time lags and combinations of route segments with heavy rain events. This makes it difficult to understand the equations.
- L182 Single point in time (which point in time?) or rather all event time steps together? Please explain in more detail.
- L185 The route segment description would better fit in the datasets section.
- L200 Why location of beginning of segment and not the middle?
- L212 This is a normal logistic regression approach (not random-effects). There seems to be a lot of correlation between the independent variables (rain amount and heavy precipitation event, antecedent precipitation index, 30-day precipitation and soil moisture, topographic index and hazard zone ....). Is that a problem for your analysis? What are the consequences for the interpretation of the results? I assume that the OR for these variables is underestimated because parts of the effect are captured by the correlated variables.
- L231 I do not understand the reasoning behind the approach to include annual and seasonal dummies. Please explain it in terms of physical mechanisms. The annual and seasonal variability is already captured by the inclusion of the precipitation events and the precipitation amounts that also have a seasonal cycle and annual variability. What additional processes do the annual and seasonal dummies represent? Please also explain that dummy means binary. Suggestion: If you need to capture an annual cycle, harmonic functions are a good approach (e.g. https://doi.org/10.1016/j.spasta.2017.11.007)
- L237-238 (Eq 3a and 3b) To me it seems that the method you apply is actually a mixed-effects logistic regression model with random effects (mu_i) and fixed effects (all other effects). I would interpret mu_i as a constant offset that affects the mean probability and depends on the rail segment. Without it the equation has the form of an ordinary logistic regression model. I don't understand the idea behind this approach. What are the physical characteristics that differ between segments but are constant within a segment? It seems you already included the relevant geological information as geological control variables (e.g. hazard class, topographic information ...).
- L245: Please explain from a physical point of view why you investigate these interactions (and why others are not studied). If you include interaction terms, more than one regression coefficient is relevant for the variable. I have doubts that you can use the OR (L219) calculated from just one coefficient to compare the importance of the independent variables if you include interaction terms.
- Table 1 Why don't you create one table including all variables included in the model for a better overview (soil moisture, seasonal dummy, rail segment, 30-day precipitation, ....)? Please also extend the description. What is meant by specified duration? What is the topographic position index?
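As a reading aid for the comments on L212 to L245 above, a generic random-intercept logistic model with a single interaction term can be written as follows. This is a textbook form under assumed notation (x_1 and x_2 stand for any two predictors, i indexes route segments, t indexes days, mu_i is the segment-level random intercept); it is not the manuscript's Eq. 3a/3b, which are not reproduced in this discussion.

```latex
\begin{align}
  \operatorname{logit} P(y_{it} = 1 \mid x_{it}, \mu_i)
    &= \beta_0 + \beta_1 x_{1,it} + \beta_2 x_{2,it}
       + \beta_3 \, x_{1,it} \, x_{2,it} + \mu_i,
    \qquad \mu_i \sim \mathcal{N}(0, \sigma_\mu^2) \\
  \mathrm{OR}_{x_1}(x_2)
    &= \exp\!\left( \beta_1 + \beta_3 \, x_{2,it} \right)
\end{align}
```

Setting mu_i = 0 recovers ordinary logistic regression. With beta_3 different from zero, the odds ratio for a one-unit increase in x_1 depends on the value of x_2, so exp(beta_1) alone describes the effect of x_1 only at x_2 = 0.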
Results:
- L284 Does this also hold for the individual processes? Fig. 3 only shows the combined result.
- L296 ...the higher the value... (of log likelihood or sample size?)
- L297 The AIC is useful for comparing models of different complexity. However, you can only compare the AIC if the models are fitted using the same number of observations. This is not the case here.
- Table 2 and Table 4 Please give the full model equations for the 3 hazards. Do you use the same equation for all 3 hazards (e.g. is the hazard indication map for slope and embankment landslides used for all 3 hazards)? How many parameters did you have to fit for each hazard model? As each segment needs one parameter it must be more than 9679. This is a lot compared to the number of natural hazard events that occurred during the analysis period (14461 trees, 1269 floods, 418 gravitational mass movements). Can you rule out overfitting?
- L340 indicate ... non-linearity ... What brings you to this conclusion?
- L344 The analysis considers only the interaction terms? Please elaborate what you have done.
- Figure 4 What does the figure show? The probability or a prediction of the probability using the statistical model? Panels a,d,e: If precipitation becomes strong enough the day should automatically become a heavy rainfall event. Why are there two different curves at high precipitation values? Panels c,f,i: Do these curves make any sense in terms of physics? I assume they are a statistical model artifact. Soil moisture was one of the non-significant parameters.
- L384 Could this mean that there are no trees if the soil is sealed and therefore the probability for tree fall is low?
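The AIC remark on L297 above can be illustrated with a minimal sketch. It assumes a hypothetical toy data set and an intercept-only Bernoulli model (neither is taken from the manuscript) and shows that the same model with the same fitted probability yields a different AIC purely because the number of observations differs.

```python
import math


def bernoulli_log_likelihood(outcomes, p):
    """Log-likelihood of 0/1 outcomes under a constant event probability p."""
    return sum(math.log(p) if y else math.log(1.0 - p) for y in outcomes)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical toy data: one event in every four observations.
small = [1, 0, 0, 0] * 25   # 100 observations
large = small * 2           # same pattern, 200 observations

# Maximum-likelihood estimate of p; identical for both samples.
p_hat = sum(small) / len(small)

aic_small = aic(bernoulli_log_likelihood(small, p_hat), n_params=1)
aic_large = aic(bernoulli_log_likelihood(large, p_hat), n_params=1)

# Same model and same fitted probability, yet the AIC values differ
# because the log-likelihood is a sum over observations.
print(aic_small, aic_large)
```

Because the log-likelihood is a sum over observations, AIC values are only comparable between models fitted to the same observations.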
Technical corrections
----------------------
- L84/85 Is there a difference between events and heavy rainfall events?
- L177 proximity in space or time?
- L212 is the prime symbol missing from beta2?
- L264 is the prime symbol missing for beta?
Citation: https://doi.org/10.5194/nhess-2023-196-RC1
- AC2: 'Reply on RC1', Sonja Szymczak, 09 Apr 2024
-
CC1: 'Comment on nhess-2023-196', John K. Hillier, 09 Jan 2024
Dear Sonja and co-authors,
A very quick comment. I know from my experience that it is often difficult to find impact-based work using infrastructure losses from other countries. I have previously used rail impacts to examine multi-hazard losses in Great Britain, and there is some work on extreme heat. There may be some in other countries, but the fact that I'm not readily aware of it highlights the utility of linking such work together. Please consider briefly citing some work from other countries on the use of rail network impact data.
John
Reference 1 - and references to the work on heat in the supplementary material of the paper (S1.1 Data). https://www.nature.com/articles/s41558-020-0832-y
Hillier, J. K., Matthews, T., Wilby, R. L. and Murphy, C.: Multi-hazard dependencies can increase or decrease risk, Nature Climate Change 10, 595–598 (2020)
Reference 2 - Bloomfield, Hillier, Griffin et al (2023) https://www.sciencedirect.com/science/article/pii/S2212094723000038
Citation: https://doi.org/10.5194/nhess-2023-196-CC1
- AC4: 'Reply on CC1', Sonja Szymczak, 09 Apr 2024
Thank you for the reference to the two publications. We will be happy to integrate them into the revised version.
Citation: https://doi.org/10.5194/nhess-2023-196-AC4
-
CC2: 'Comment on nhess-2023-196', Katharina Lengfeld, 12 Jan 2024
Comments on: "Are heavy rainfall events a major trigger of associated natural hazards along the German rail network?"
I have read this preprint with great interest and think that it is an important study for understanding the influence of heavy rainfall events on damage to rail transport and infrastructure. My colleagues and I developed CatRaRE and we highly appreciate the use of the dataset in this study. However, there are a few issues in the description of CatRaRE and the results that I would ask the authors to address. In case the authors have any questions or would like to discuss some of the issues mentioned below, please feel free to contact me or my colleague Ewelina Walawender.
- CatRaRE catalogue: In the abbreviation CatRaRE the word “catalogue” is already included, therefore CatRaRE catalogue would mean Catalogue of Radar-based heavy Rainfall Events catalogue. I suggest using either just CatRaRE or catalogue of radar-based heavy rainfall events.
- P.3, L.72: CatRaRE W3 and T5 are DOI referenced datasets. Please use the appropriate reference for the catalogue used in this study, which I guess is the Version 2022.01:
Lengfeld, K., Walawender, E., Winterrath, T., Weigl, E., Becker, A., 2022, Heavy precipitation events version 2022.01 exceeding DWD warning level 3 for severe weather based on RADKLIM-RW version 2017.002, DOI:10.5676/DWD/CatRaRE_W3_Eta_v2022.01.
In case another version is used, please check https://www.dwd.de/DE/leistungen/catrare/catrare_daten.html?nn=16102&lsbId=751876
- P.3, L.73-74: In CatRaRE events with 11 different durations between 1 and 72 hours are listed. In the catalogue W3 we use the lower boundary of warning level 3 as a threshold. Not only the warning levels for 1 hour (25 mm) and 6 hours (35 mm) are used, but also the ones for 12 (40 mm), 24 (50 mm), 48 (60 mm) and 72 hours (90 mm). There are no official warning levels for rainfall events with durations of 2, 3, 4, 9 and 18 hours. Therefore, for these durations we linearly interpolate the official warning levels and get thresholds of 27 mm in 2 hours, 29 mm in 3 hours, 31 mm in 4 hours, 37.5 mm in 9 hours and 45 mm in 18 hours.
- P.4, L.98 and Table 1: The SRI does not describe the speed at which rainfall accumulates within a specific duration of time. The SRI is based on the return period of the rainfall amount for indices 1-7, where 7 corresponds to a return period of 100 years. Indices 8-12 are based on the rainfall amount compared to precipitation with a return period of 100 years. Please clarify the description and see Schmitt (2017) and Schmitt et al. (2018) for more information.
- P.6, L.136: What is the reason for choosing the HYRAS dataset over the climatological radar dataset RADKLIM? RADKLIM would correspond to CatRaRE and has a higher spatial resolution comparable to the soil moisture dataset.
- P.8, L.194: Why do the authors start the period with 1 February 2011? Precipitation data should be available for 2010 as well, allowing also for calculation of 30-day antecedent precipitation for January 2011.
- P.10-11, L.252-256 + Section 3.3.: The CatRaRE event variables the authors have chosen (Tab. 1) are calculated for the whole event area (e.g. as an average over the event zone), which can in extreme cases cover several thousand km². However, the damage data used in this study are available for a given route segment (point location), so the cross-analysis makes sense only if a given rainfall event is undifferentiated within its zone in terms of precipitation characteristics (RR, SRI, V3) and occurs over an area with similar landscape pattern (TPI, VSGL, STRM). A pixel-based analysis would be more appropriate in this case. Also, using ETA as a measure of extremity is not appropriate for a point analysis, as it is calculated precisely on the basis of the event area.
- P.11, L.270: The fact that only 23% of the flooding events are linked to heavy rainfall events seems surprising to me. Did the authors also check for rainfall events in a certain radius around the flooding, since rainfall does not necessarily cause flooding in the region of its occurrence but in the region where the water flows to?
- P.13, L.312-314: It is not clear to me why a rain event should cause a flood one or two days after its occurrence. If there is no more rain in the area, a flood should not occur unless the water comes from another region, e.g. from upstream along a river. But then the flood is probably triggered by another rainfall event that occurred upstream and not by the one that occurred in the area with the flooded railway section. Therefore, not only a temporal but also a spatial buffer should be taken into consideration. In case the flooding occurred one or two days after the rainfall event, I would also suggest checking the HYRAS dataset to see if there was more rainfall in the damaged area or the surroundings that could have caused the damage but wasn’t classified as an event in CatRaRE. I understand that a detailed analysis of flow paths is beyond the scope of this paper, but the issue as well as the difference between damages caused by a heavy rainfall event and by a flood event should at least be mentioned in the discussion.
- Section 3.3: I am not sure if increasing each parameter by one unit is appropriate. A 1 mm increase in mean precipitation is not comparable to increasing the duration by 1 hour or the SRI by 1. Let’s e.g. assume a precipitation sum of 50 mm in 1 hour has a return period of 100 years, which corresponds to SRI = 7. Increasing the precipitation sum by 1 mm leads to 51 mm in 1 hour, which most probably still has an SRI of 7 because the return period won’t increase that much. Increasing the SRI by one to SRI = 8 would mean, according to Schmitt (2017), that the precipitation would be 1.2 to 1.4 times the precipitation sum for a return period of 100 years (which is 50 mm in our case). Therefore, increasing SRI from 7 to 8 would increase the precipitation sum from 50 mm to a value between 60 and 70 mm, which is 10 to 20 times more than the increase of 1 mm that was assumed for investigating the influence of increasing the precipitation by 1 unit. Therefore, the influence of increasing the SRI by one unit is by definition larger than the influence of increasing the mean precipitation by one unit. Also, I don’t quite understand how the duration of precipitation is increased. In my example I had a duration of 1 hour and 50 mm. Does increasing the duration by one unit mean that it will rain 50 mm in 2 hours instead or 2*50 mm = 100 mm in 2 hours?
References:
Schmitt, T.G., 2017: Ortsbezogene Regenhöhen im Starkregenindexkonzept SRI12 zur Risikokommunikation in der kommunalen Überflutungsvorsorge. Korrespondenz Abwasser, Abfall 63, DOI: 10.3242/kae2016.11.001
Schmitt, T.G., Krüger, M., Pfister, S., Becker, M., Mudersbach, C., Fuchs, L., Hoppe, H. and Lakes, I., 2018: Einheitliches Konzept zur Bewertung von Starkregenereignissen mittels Starkregenindex. Korrespondenz Wasserwirtschaft 11, DOI: 10.3242/kae2018.02.002
Citation: https://doi.org/10.5194/nhess-2023-196-CC2
- AC5: 'Reply on CC2', Sonja Szymczak, 09 Apr 2024
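The numbers quoted in CC2 can be checked with a short sketch. It assumes only the official warning level 3 values and the 1.2 to 1.4 factors stated in the comment; the function names are hypothetical and this is not code from CatRaRE or DWD.

```python
# Official DWD warning level 3 lower bounds (duration in hours -> mm),
# as quoted in the CC2 comment.
OFFICIAL_W3_MM = {1: 25.0, 6: 35.0, 12: 40.0, 24: 50.0, 48: 60.0, 72: 90.0}


def w3_threshold_mm(duration_h):
    """Linearly interpolate the threshold between the two nearest official durations."""
    durations = sorted(OFFICIAL_W3_MM)
    if not durations[0] <= duration_h <= durations[-1]:
        raise ValueError("duration outside the 1 to 72 hour range")
    for lo, hi in zip(durations, durations[1:]):
        if lo <= duration_h <= hi:
            frac = (duration_h - lo) / (hi - lo)
            return OFFICIAL_W3_MM[lo] + frac * (OFFICIAL_W3_MM[hi] - OFFICIAL_W3_MM[lo])
    raise AssertionError("unreachable")


def sri8_range_mm(p100_mm):
    """Rainfall range for SRI = 8, using the 1.2 to 1.4 factors quoted from Schmitt (2017)."""
    return 1.2 * p100_mm, 1.4 * p100_mm


if __name__ == "__main__":
    # Durations without an official warning level.
    for d in (2, 3, 4, 9, 18):
        print(d, "h:", w3_threshold_mm(d), "mm")
    # Example from the comment: 50 mm in 1 hour as the 100-year rainfall.
    print(sri8_range_mm(50.0))
```

For the five interpolated durations this reproduces the thresholds listed in the comment (27, 29, 31, 37.5 and 45 mm), and for a 100-year rainfall of 50 mm it gives the 60 to 70 mm range used in the SRI example.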
-
RC2: 'Comment on nhess-2023-196', Ugur Ozturk, 22 Feb 2024
The manuscript demonstrates the integration of damage data from infrastructure operators with climate data from weather services, aiming to discern potential relationships that could enhance proactive management of natural hazards. The authors' perspective, primarily through the lens of a railroad operator, brings a focused approach to understanding and mitigating disruptions to railroad operations caused by climate extremes. This perspective is particularly timely, given the anticipated increase in such disruptions under the impact of climate change, highlighting the urgent need for targeted countermeasures.
As I read the manuscript, I found the unique approach to examining rainfall, associated hazards, and their impact on the rail network from a rail network operator's standpoint to be both enlightening and compelling. The entire analysis bears the imprint of this distinctive viewpoint, offering insights that are both practical and relevant to the field. However, I feel that certain aspects of the study and the choices made therein would benefit from additional elucidation, and there may be room to broaden some analyses to further strengthen the findings and their implications.
My major concerns concentrate around the method choices forming the foundation of the current study. I highlight the line number of a piece from the manuscript in quotation marks which is followed by my comments after the sign of -->.
Line 79: "spatially intersected with the German rail network" --> Is this intersection achieved considering purely spatial overlap, or are rainfall runoff conditions taken into account as well? For instance, rainfall upstream could potentially impact tracks downstream, even in the absence of local precipitation.
Line 97: "one heavy rainfall event" --> Could the authors clarify if they are referring to hydrogeomorphological events, including mass-wasting processes? I suspect that the tree falls might relate more to wind than rainfall. If the tree fall process is indeed related to wind, it might be beneficial to consider a term that encompasses all three phenomena. Perhaps including wind events as a factor, or alternatively, reconsidering the inclusion of tree fall cases, might provide a more concise, easy-to-explain analysis.
Line 158: "e.g. shown for shallow landslides" --> While the observation may hold true for shallow landslides, it's important to note that gravitational mass movements also encompass deep-seated landslides, where the lag time could extend significantly, potentially reaching years. Even excluding these exceptional cases, a lag time of 10-15 days appears realistic, as evidenced by Dille et al. (2022; https://doi.org/10.1038/s41561-022-01073-3). Should the focus be on shallow landslides, it would be helpful if this distinction is made clear. The choice of a 2-day lag for considering landslides raises some concerns for me, and I kindly suggest revisiting this aspect for a more nuanced discussion. The term "gravitational mass movements" might cover more processes than the authors intended.
Lines 200–202: "The segment is considered to have been affected by a heavy rainfall event on a given day if a heavy rainfall event from the CatRaRE database has occurred on that day up to a maximum of two days previously." --> As I mentioned in my previous comment, I'm concerned that the proposed time-lag window might not adequately capture the lag time associated with landslides. Later on, in the results (comment below), the authors highlight that only a small fraction of gravitational mass movements were linked to certain rain events. Hence, as previously mentioned, extending this window could offer a more accurate representation of the impact of heavy rainfall on landslide occurrences.
Line 275: "a total of 59 events (14 %)" --> I wonder if the correlation might become more pronounced if the lag time were extended to 15 days or more. This adjustment could potentially offer a more comprehensive analysis of the impact of heavy rainfall on these events.
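The linkage rule quoted from Lines 200–202 and the effect of a longer window can be sketched as follows. This is a minimal illustration with hypothetical function names and dates, not the authors' implementation; it only encodes "event on the same day or up to N days before".

```python
from datetime import date


def affected_by_heavy_rain(day, event_days, max_lag_days=2):
    """True if a heavy rainfall event occurred on `day` or up to `max_lag_days` before it."""
    return any(0 <= (day - event_day).days <= max_lag_days for event_day in event_days)


# Hypothetical example: one rainfall event and a damage report six days later
# on the same route segment.
events = [date(2021, 7, 14)]
damage_day = date(2021, 7, 20)

print(affected_by_heavy_rain(damage_day, events))                   # 2-day window
print(affected_by_heavy_rain(damage_day, events, max_lag_days=15))  # extended window
```

With the 2-day window the damage report is not linked to the rainfall event, while a 15-day window links it, which is the sensitivity the comment asks the authors to examine.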
Lines 278-279: "Of the 14461 tree fall events, a total of 312 (2 %) events can be spatially and temporally linked to a heavy rainfall event." --> This observation might suggest an indirect connection between rainfall and tree falls, potentially implicating other factors such as wind (as discussed by Gardiner et al.; http://dx.doi.org/10.2139/ssrn.4576016) or flooding (Lucia et al., 2018; https://doi.org/10.1016/j.scitotenv.2018.05.186). Further exploration of these factors could enrich the study.
My minor concerns primarily revolve around the use of terminology and the occasional absence of detailed explanatory statements that could further enhance the manuscript's readability and comprehension. Clarifying these aspects could improve the overall understandability for the readers. I list the minor comments in the attached file to keep my online comments concise.
- AC1: 'Reply on RC2', Sonja Szymczak, 09 Apr 2024
-
RC3: 'Comment on nhess-2023-196', Anonymous Referee #3, 28 Feb 2024
The paper is well-written, and most of the study details are well documented. It is a useful addition to the many modelling studies that assume deterministic relationships between rainfall, inundation, and flood damage to rail infrastructure. We have a few concerns about the railway incident data though, which we would like to clarify below.
Major comments
- For the reader it is not easy to gain an impression of what the DB Netz AG database looks like. Because some of the results are somewhat puzzling and surprising (see point 2), we request the authors to provide more information on how this dataset looks. A few concrete suggestions:
  - Include a few example records from the database showing the raw data (i.e. the raw text record that was identified), preferably with examples showing all three types of natural hazards; and examples that did match with rainfall events and examples that did not match.
  - Please give one or more examples of how a well-known flood event (for example the July 2021 floods) was mapped in the data, ideally with some pictures of how the damage looked in the real world. Earlier NHESS articles (e.g. https://doi.org/10.5194/nhess-22-3831-2022) and the authors’ https://doi.org/10.3390/atmos13071118 contain many relevant details to which the records can be linked.
  - Please give examples of events that were tagged as a flood by DB, but for which no extreme rainfall was reported.
- We find the finding that only a quarter of the flood events could be linked to extreme rainfall events (line 414) very puzzling. In our view, the potential causes of this are insufficiently discussed in the draft version. Please critically reflect on the following possible causes:
  - Is it possible that most of the reported flood events relate to river (fluvial) flooding? On the one hand, one might expect that this is the case, because it would present a natural explanation for the fact that the damage is observed in a different location than the extreme rainfall event. On the other hand, we would be surprised if the ratio between fluvial and pluvial flood events would be 3:1. Also, if 75% of the flood cases would concern river flooding, one may wonder if the methodology of the paper is sound. The authors mention this aspect in line 463, but what is missing is a reflection on whether this explains the low correlation in their results.
  - Another take would be that most of the floods are caused by rainfall events that do not qualify as heavy rainfall, which is reasonable given that there is no standardized guideline for defining heavy rainfall (line 72). If this is the case, it would be best to clarify it in the text and highlight it as relevant further research.
  - Is it possible that, for other reasons, there is a mismatch between what DB understands as a ‘flood’ and what the radar data show?
  - Are there other possible explanations?
We mainly raise the above points because for modelling studies, the outcomes of the present study may have large implications. Most modelling studies assume a deterministic relationship between rainfall, inundation and damage. The present study seems to suggest that such relationships would only explain 25% of reported flood damage events. We invite the authors to further reflect on this. Are the authors aware of any other empirical studies that looked into this relation? Did they find similar results?
Minor comments
- In the abstract, please list the ‘three associated natural hazards’ upfront. Now it takes a while before the reader knows which hazards you examined, namely: floods, gravitational mass movements / landslides?, tree fall.
- Line 47: and smaller tolerance of risk compared to road transport?
- Line 50: Can the authors provide any additional information on what type of damage is reported in the DB database? Do railtrack characteristics play a role? What part of the track is damaged? Could it also be damage to a pier or abutment of a bridge that supports the track?
- Line 59: do you mean bias or correlation?
- Line 64: One or two figures to illustrate the data sources could be useful for a reader that is unfamiliar with them.
- Line 88: The Figure 1 description indicates "Monthly and yearly distribution of heavy rainfall"; stating it as "Yearly and monthly" would better match the content of the figure. Figures c-h are also more likely to be consulted by the reader when reading the results, making the figure placement inconvenient (though understandable). Could changing the position or splitting the figure into two make it more readable?
- Line 147: Can you describe in a few lines what the polygon data look like? Is it a polygon indicating a uniform amount of rainfall within that polygon?
- Line 149: Punctuation can be improved for readability.
- Line 163: The selection of 2 days as a time window could be better explained; how often does it take longer than 2 days to record the damage events? During periods of heavy rainfall, it may not be safe to collect the data, for example?
- Line 185-200: The definition of route segments could be further clarified and justified, specifically: (1) what is an “operating point”? (2) What are the implications of such a wide range of lengths (140 m to 12.7 km) – is the starting point in a long stretch equally representative as that of a short one? (3) Taking 5-meter segments may be unmanageable, but why not, for example, use 1 km segments as a standard?
- Line 209 (and other locations): I find the term natural hazards a bit ambivalent in this context, because it could be used to indicate either the extreme rainfall event, or the flood/gravitational mass movement/tree fall.
- Line 393: What is meant by “can be spatially overlaid”?
- Line 397: does this conclusion follow from the data, or from other literature, or from common understanding?
- Line 401-402: Is flooding triggered by heavy rainfall events (Line 402) an example of the statement in the previous sentence? This can be clarified. The use of the word connections makes it sound like it is a separate idea or concept.
- Line 410: This idea is not very clear. An additional sentence with a specific example may help. Something like “[…] but to establish connections between the processes through X or Y, for example, by looking at mass land movements triggered by flooding”, if this was indeed the idea.
- Line 451: What is meant by “the different background of the data collections”? Is it their intended or original purpose rather than their background?
Citation: https://doi.org/10.5194/nhess-2023-196-RC3
- AC3: 'Reply on RC3', Sonja Szymczak, 09 Apr 2024