Comparing flood forecasting and early warning systems in northwestern Europe

Busker, Tim; Rodriguez Castro, Daniela; Vorogushyn, Sergiy; Kwadijk, Jaap; Zoccatelli, Davide; Oliveira, Rafaella G. L.; Murdock, Heather J.; Pfister, Laurent; Dewals, Benjamin; Slager, Kymo; Thieken, Annegret H.; Verkade, Jan; Willems, Patrick; Aerts, Jeroen C. J. H.

doi:10.5194/nhess-26-1457-2026

Articles | Volume 26, issue 3

https://doi.org/10.5194/nhess-26-1457-2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | 24 Mar 2026

Download

Final revised paper (published on 24 Mar 2026)
Supplement to the final revised paper
Preprint (discussion started on 14 Mar 2025)
Supplement to the preprint

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2025-828', Anonymous Referee #1, 17 May 2025
Summary and general comments
This study reviews the status of Flood Forecasting and Early Warning Systems (FFEWSs) in transboundary river basins in the Northwestern Europe Countries that were hit by the July 2021 flood (Germany, Luxembourg, Belgium, and The Netherlands). Following the deadly and costly flood event of 2021, such analyses are essential for improving flood risk management and early warning systems chains, and to foster increased regional cooperation in transboundary river basins.
The study uses semi-structured expert interviews and a literature review to analyze and compare FFEWS characteristics in the different countries, including forecast types, warning levels, communication protocols, emergency response plans and institutional coordination. Expert interviews from the region reveal that all systems are undergoing significant and rapid development after the 2021 flood event, which brought attention to some limitations of the FFEWS at the time. The main findings include the identification of key differences between countries and challenges, especially around harmonization and impact-based forecasting, which is still underused in the region, as only Flanders has operational inundation forecasts. Moreover, the authors find a lack of harmonization in protocols and inconsistencies in warning levels and communication protocols, which hinder cross-border coordination in transboundary river basins.
The paper is well written and contains many interesting details about the FFEWS structure in the four Countries studied, which are of interest for the community and the readers of NHESS.
However, it has some analytical limitations that should be addressed to enhance its clarity (in terms of organization of material and description of methods), rigor (in terms of systematic comparisons), and practical relevance (in terms of enhanced discussion and solid recommendations). In terms of clarity, the organization of the material can be improved as some key information is dispersed and difficult to find (see detailed comments below). A better synthesis should be made to be able to better compare all the key FFEWS characteristics across countries and regions (as further detailed in the comments below). Moreover, the discussion of some developments, limitations and barriers of the current systems should be enhanced, to better connect the analytical review of the FFEWS to the recommendations for their improvement.

Major comments
More efforts towards a systematic classification and synthesis of all the important aspects of the FFEWS should be made: some important pieces of information (like the real time data sharing across countries for both river discharge and reservoirs upstream, or the use of forecast post-processing techniques) are scattered in the text and not always reported for all countries; this makes it difficult to find the same level of information for all countries or get a synthetic view across regions. Given that some information is not included in the summary table (Table 1), for some Countries it is difficult to find all details and unclear whether a piece of information is not reported because it is not applicable (e.g. no provisions for data sharing or no forecast post-processing are made) or whether the information was not available or not retrieved. The authors should summarize and list more clearly all the key descriptive pieces of information that are currently scattered in the text and reported only as examples (e.g., L. 545 for data sharing). Possibly, a larger table than the current Table 1 (or additional tables) could make the comparisons of all the interesting aspects clearer, specifying where the information is not available, to facilitate the synthesis and interpretation of all findings.

The review of the FFEWS is presented in an overly descriptive way. More efforts in terms of comparative analysis and discussion should be made to move from a descriptive to a prescriptive analysis as the authors aim to do, providing some good recommendations at the end, which could be enhanced. While the paper documents the current situation in detail, it often lacks critical analysis of why systems differ or what technical or institutional barriers may have shaped them. It also misses an opportunity to theorize and discuss barriers to impact-based forecasting, linking to known problems, e.g. governance fragmentation, model computational cost, lack of high-resolution data, trust, etc. The authors should discuss further why the uptake of operational impact-based forecasts remains so challenging, and what technical or institutional barriers currently limit this. Why is Flanders the only region where such forecasts are provided? Is it for the lack of inundation models with suitable computation time for operational applications? Is it for lack of institutional mandates? Why are static flood hazard maps computed off-line and available as catalogues (for selected return periods, as done in EFAS) not used? Is this known? Some limitations of the available flood maps are mentioned and quite clear (e.g., the large deficiencies of the static flood hazard maps during the 2021 flood and the need for more extreme scenarios, as discussed in Section 4.2), but the barriers for the implementation of the currently available flood maps (e.g. EFAS or other national static flood maps) or the production of more accurate inundation maps (near real-time) are not discussed sufficiently. More generally, more discussion is needed to ensure that the recommendations provided are well motivated by the found patterns and limitations.
For example, the definition of the rainfall warning levels can be further discussed, as they are just given as absolute rainfall values without explaining how they were derived and why they differ (e.g. L. 160). Some more elements of analysis of the causes of the described characteristics could help, highlighting why the existing FFEWS have certain characteristics (e.g., how the rainfall thresholds are computed), and if the reasons behind these patterns (e.g. differences in rainfall thresholds) have been clarified or if the information is available at all, which is not always so clear so far. For example, the authors could clarify whether the definition of the rainfall thresholds uses different or same RPs in different countries, and why a specific RP or way of calculating is adopted. This was done better for the fluvial flood thresholds. Moreover, related to the warning levels for pluvial and fluvial floods, the authors could discuss the representativeness of the warning levels with respect to flood impacts, e.g. whether for the 2021 floods, there is any information on whether the areas exceeding the warning levels for both pluvial and fluvial floods matched well observed impacts.

The changes or progress made in the FFEWS after the July 2021 flood event are not clearly summarized and described, as they are scattered in various parts of the paper and difficult to find, e.g. in Section 3, L. 178-179 (“In Germany and Luxembourg, a cell broadcasting system was installed in response to and after the July 2021 flood.”). Moreover, there are hints at recent moves from deterministic to probabilistic forecasts, and a statement about it in the abstract (“All regions have invested in probabilistic flood forecasting systems”), while in the article these changes are not clearly reported, i.e. where and when such changes have been made.

Table 1 only reports that now all countries have probabilistic hydrological forecasts, but it is unclear when these have been established. It might be beneficial for the sake of clarity to have an additional table or scheme, listing or summarizing all the recent developments in FFEWS, or including some information about recent changes in the current Figure 3 or Table 1. The information to highlight and summarize should include: (i) when and how the probabilistic FFEWS were developed (from deterministic to probabilistic or increase in ensemble size?), this being one of the key findings, (ii) when the online platforms were improved, (iii) when the emergency response plans were updated, (iv) when the communication protocols changed, e.g. national-scale phone-based alerts, etc. This information is only hinted at in different parts of the paper.

The methodologies followed for the literature review and for the analysis of the interviews lack sufficient transparency and should be further clarified in Section 2.2 (Approach).

For the literature review, the criteria for the selected articles (literature) inclusion are vague. A structured review approach (e.g., PRISMA or at least the search strings + database used + inclusion logic and filtering criteria) would improve transparency (and reproducibility). The authors should mention more clearly at least the search strings used, as only very general keywords are now reported in a vague way. Also, they should specify how many research articles were found and selected, how the country-specific reports were selected and how many were retrieved.
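
As an illustration of what reporting the search strings could look like, the sketch below builds every boolean query from groups of keywords. It is only a minimal example: the keyword groups are hypothetical placeholders, not the terms used by the authors, and the function name `build_search_strings` is an assumption made for this sketch.

```python
from itertools import product


def build_search_strings(concept_groups):
    """Combine one term from each concept group into an AND-joined query."""
    queries = []
    for combo in product(*concept_groups):
        # Quote each term so multi-word keywords are searched as phrases.
        queries.append(" AND ".join(f'"{term}"' for term in combo))
    return queries


# Hypothetical keyword groups (placeholders, not the authors' actual terms).
concept_groups = [
    ["flood forecasting", "early warning"],
    ["Meuse", "Rhine"],
]

queries = build_search_strings(concept_groups)
for query in queries:
    print(query)
```

Listing the generated strings in an appendix, together with the database searched and the number of hits per string, would let a reader repeat the search.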

For the semi-structured interviews, the paper briefly mentions that interviews were done with 13 experts, but lacks essential details on the sampling approach (Who was selected and why? How many were invited?). Also, how were the audio recordings of the interviews synthesized (automatically with AI tools or manually)?

Minor comments
Table 1: the last column could report more clearly whether all those online platforms report publicly available forecasts and information; maybe here or in an additional column of the table, it would be interesting to summarize how the forecasts are presented, i.e. in which format (e.g., graph, map, text, etc.); also, the column heading ‘Primary alerting system’ could be clarified (in the heading or in the caption).

L. 62: a 'missed' forecast is less common in this context, usually this would be referred to as a 'missed' event

L. 70: “inaccuracies in spatio-temporal estimations …” are mentioned, but also intensity biases should be recalled

L. 73: the link between the sentence ‘Numerical weather prediction models have improved greatly in recent decades …’ and the following one, starting with 'For example, the Integrated Forecasting System (IFS) operated by …', is not obvious and clear. It should be improved. Probably here it would be interesting to state when the IFS moved to 51 ensemble members, and why it represented an improvement with respect to previous IFS versions (deterministic or lower-size ensemble).

L. 80: after this sentence (‘Over the last decade, ensemble and probabilistic …’) you can add at least one reference on recent reviews on hydrological ensemble forecasting

L. 84: “Thieken et al. (2023b) showed that around a third of the flood-related fatalities in the …” here it would be relevant for the paper arguments to report how (on which basis) they showed this

L. 110: it would be good to clarify immediately how (on which basis) are these recommendations developed

L. 119: I guess that ‘compromises’ is a typo, and it should be ‘comprises’

L. 126-127: the issue of evacuation orders in July 2021 is specified only for Luxembourg; it is unclear and it would be relevant to specify whether to the authors’ knowledge there were no evacuation orders in the other Countries, as now the reader is left to guess so

L. 141-142: the combinations of these keywords and the search strings should be reported in the text or an Appendix for transparency

L. 142-143: it should be clarified how many reports and how were they retrieved and selected

L. 159: when did Germany add this extra warning level (dark purple) to represent events with immediate danger? It would be interesting to know in the context of the paper, highlighting developments after the 2021 flood event

L. 215: is the data exchange in real- or near real-time?

L. 217: the acronym LAWA could be defined here, stating which organisation it refers to (German Working Group on water issues of the Federal States and the Federal Government).

L. 247: the sentence seems to suggest that no ‘simulation exercises (SimEx)’ of emergency preparedness exercises are carried out, is there any information on this?

L. 267-268: “...uses different weather models from neighboring countries (ECMWF, DWD, MeteoFrance)”, ECMWF should not be mentioned here alongside National organisations, as it is not from a single neighbouring country, and actually Luxembourg is a Member State of ECMWF, so mentioning ECMWF here like this is confusing

L. 159 and L. 272: this would be an interesting point to clarify: what exactly is the definition of an ‘imminent’ or ‘immediate’ danger, used to define the purple level in Countries where this is used?

L. 345: in the “European EFAS forecasts”, the acronym of EFAS could be reported and the reference to the Copernicus EMS of which EFAS is part

L. 554: here is the first time that upstream reservoirs are mentioned, in terms of data exchange; this point should be expanded a little bit, given its relevance in transboundary river basins
Citation: https://doi.org/10.5194/egusphere-2025-828-RC1
- AC2: 'Reply on RC1', Tim Busker, 31 Oct 2025
  
  Dear reviewer, many thanks for your useful comments and feedback. You can find the response to your comments in the attached supplement.
  
  Citation: https://doi.org/10.5194/egusphere-2025-828-AC2
RC2: 'Comment on egusphere-2025-828', Anonymous Referee #2, 08 Oct 2025
This paper addresses an important and timely topic on flood forecasting and early warning systems (FFEWS) in transboundary river basins, using the 2021 flood event as a key reference point. The paper contains a wealth of interesting insights from both literature and key informant interviews (KIIs), and the forensic perspective on the 2021 disaster is particularly valuable.
However, the overall flow and structure of the paper could be strengthened to help the main arguments and contributions emerge more clearly. In particular, the introduction and Section 3 would benefit from more explicit framing, stronger transitions between paragraphs, and a clearer delineation between pre- and post-2021 developments. The methodology also appears somewhat light and would benefit from revision to ensure there is a more systematic approach to data collection, content analysis and communication.
Major comments

Overall structure and flow

The literature review provides valuable forensic insights from 2021, but the narrative currently mixes several issues in single paragraphs. More explicit structuring could help the key challenges and gaps stand out.

The introduction highlights important lessons but does not yet bring out the main research gap clearly.

The flow from paragraph to paragraph can be strengthened with more topic sentences and transitions that guide the reader through the logic. The introduction currently mixes a range of issues in one paragraph; separating them more clearly could help highlight the specific challenges the paper aims to address.

It remains unclear whether Section 3 is purely descriptive (based on literature) or includes empirical data from KIIs. Clarifying this distinction is essential.

Definition of research gap and aim

The research gap and aim of the paper should be defined more clearly and earlier on.

The research question and focus area can also be stated more explicitly, ideally near the end of the introduction.

Clarity of arguments in early sections (L83–L95)

From L83 onwards, the paragraph discusses communication issues but then shifts to examples where modelling outputs were inaccurate (e.g., flood zone delineation). These examples appear to relate more to forecast accuracy than to communication.

The discussion around L90–L95 needs better alignment: the statement on flood awareness between in- and out-of-floodzone populations seems inconsistent with earlier points about fatalities outside the delineated zones, suggesting that flood extents exceeded forecasts.

The sentence on adaptation motivation does not connect directly with the statement on flood warning access (L95).

Clarification of key terms and assumptions

L165: Please elaborate on what constitutes a “clearly defined alarm level.” When is this not clearly defined? It seems this may relate to objective levels corresponding to forecast thresholds and expected impacts.

Section 3: Presentation and organization

The paragraphs describing the table and figure are difficult to follow. Consider adding more guiding sentences to help the reader navigate these visuals.

Depth and rigor of the methodology

The methodology section seems quite light: the sample size is small, and there is no indication of systematic coding or content analysis.

For research question (b), a more in-depth analysis of communication materials would strengthen the conclusions.

Treatment of transboundary dynamics

The transboundary challenges could be brought out more clearly, particularly regarding data sharing and alignment of alert levels.

Consider including another figure to illustrate the communication side of these systems.

Evaluation of effectiveness

The paper provides a rich description but limited critical evaluation of the accuracy and effectiveness of the new developments in FFEWS.

It would be useful to reflect on whether improvements have been validated and to synthesize recommendations based on that assessment.

Scope and temporal framing

The content currently mixes pre- and post-2021 developments, leading to ambiguity about the study’s temporal focus. Clarify whether the analysis primarily concerns pre-2021 systems, post-2021 changes, or both.

Minor Comments

L1, L119: Write out abbreviations such as “bn” and “mm.”

L83–L95: Improve alignment between discussion of communication issues and modelling inaccuracies.

L165: Clarify “clearly defined alarm level.”

L492: Clarify the source of the quoted text.

L495: The phrase “cope for decision makers” is awkward—consider revising to “decision-making challenges” or similar.

Throughout: Review paragraph transitions and ensure topic sentences clearly indicate the purpose of each paragraph.
Citation: https://doi.org/10.5194/egusphere-2025-828-RC2
- AC1: 'Reply on RC2', Tim Busker, 31 Oct 2025
  
  Dear reviewer, many thanks for your useful comments and feedback. You can find the response to your comments in the attached supplement.
  
  Citation: https://doi.org/10.5194/egusphere-2025-828-AC1

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

ED: Reconsider after major revisions (further review by editor and referees) (09 Dec 2025) by Robert Sakic Trogrlic

AR by Tim Busker on behalf of the Authors (14 Dec 2025)

ED: Referee Nomination & Report Request started (22 Dec 2025) by Robert Sakic Trogrlic

RR by Anonymous Referee #1 (25 Jan 2026)

Suggestions for revision or reasons for rejection

Overview

The authors have thoroughly addressed the main concerns raised in the previous review. The revised article has been significantly improved in terms of methodological transparency, analytical depth, and clarity of the comparative analysis between the Flood Forecasting and Early Warning Systems (FFEWSs) in the transboundary Meuse and Rhine river basins in Northwestern Europe. The authors have now reported a more systematic classification and synthesis of several important aspects of the FFEWSs, offering more insights into how and why the FFEWSs differ between countries, reflecting differences in forecasting methods (including in models ownership), crisis management procedures, and governance systems (federal vs. centralized mandated agencies). The revised summary table (Table 1) is now far more comprehensive and clearly structured, allowing the readers to find more clearly the key information retrieved by the authors for the different regions. This synthesis supports the comparative analysis of the key aspects of FFEWS across countries and regions that the paper aims to provide. Moreover, the authors added some useful information and insights on the technical improvements of the FFEWSs after the 2021 floods, on the remaining challenges and barriers (e.g., to develop impact-based forecasts), enriching the discussion as suggested. Finally, four insightful recommendations are provided and linked to the main limitations of the FFEWS.

However, the article requires several further minor revisions to enhance its quality, readability and clarity. A summary of the main points is given below, followed by detailed comments.

In the introduction, the two research questions exposed (Section 1.3, Research gap and aim) are too narrow with respect to what the paper actually achieves. They are valid but incomplete, as they do not anticipate the analysis of all the main operational and institutional characteristics of the FFEWSs, as they limit the focus on warning levels, which is only one of the points analysed in detail by the authors. Other findings are not introduced, including how these systems have evolved in response to the July 2021 flood event and which technical limitations remain (e.g., lack of impact-based forecasts). These aspects could be introduced as a third question addressed, to improve alignment between the paper’s introductory framing and its findings.

The authors should reduce overlaps and repetitions between the Introduction and the Case study section (see detailed comments below).

On the methodological side, the interview methodology has been clarified (see Section 2.1), including how the interviewed experts were selected, and their role. The literature review approach is better contextualized, but is still not very transparent in terms of the search procedure and selected materials, neither providing the full set of strings for the search, nor stating explicitly the number of documents found (only 16 are mentioned, and the reader is left to guess if they are all - probably not). The authors should further clarify these points for a fully systematic, reproducible literature search.

To summarize, significant improvements have been made, but further minor revisions are required to make the manuscript suitable for publication, enhancing readability and quality (see detailed comments below). These issues should be fixed through a full revision and proof-editing. If all the recommended minor revisions are addressed, the manuscript will become suitable for publication without further review needed.

In the comments below, line (L.) numbers refer to the revised manuscript with tracked changes.

----------------------------------------

Minor and technical comments

-- Main comments --

- Section 1.3 - Introduction: The two research questions are valid and reflect some in-depth analyses conducted, but are too narrow relative to the broader scope of the paper. The two questions emphasize warning levels, but do not fully capture the comprehensive analysis of operational and institutional characteristics of FFEWSs across the transboundary basins that the authors conducted. Additionally, the paper explores how these systems have evolved after the July 2021 flood and identifies remaining technical limitations, none of which are reflected in the current framing of questions dealt with in the introduction. Including one (or two) additional research question(s) to address these aspects would improve alignment between the study’s aims and its actual contributions.

- Section 2.1 - Case study region and approach: there are some overlaps and references between this section and Section 1.2 (Current challenges in FFEWS) in the Introduction: I think that the challenges in FFEWS in the Introduction (Section 1.2) should be presented in a more synthetic and general way, without details on the 2021 flood event (including early warning challenges) which would better fit Section 2.1. This would help avoid repetitions (e.g., “In the Vesdre, Ahr and Geul catchments, a significant share of the local population did not receive warnings (see Introduction).”)

- Section 2.2 - Approach: The “combinations of keywords” used for the literature search and mentioned here (Lines 205-207) should be reported explicitly (at least as an Appendix or Supplementary information) for full transparency of the methods. Also, the number of resulting documents from this literature search seems to be not reported or it is unclear if it’s only the 16 selected articles mentioned in the following sentence (“for an in-depth analysis”). The number of official reports analysed should be clarified too.

- Section 3.1 and Table 1: Differences in lead times of hydrological forecasts are reported in Table 1 but seem to be not commented on in the text. It would be good to highlight these differences and if available to provide insights on the origin or reasons behind such differences. A spontaneous question arises: Why do forecast lead times differ so much (from 1-2 days in Luxembourg to 15 days in The Netherlands)? Are these large differences somehow linked to catchment sizes?

- Section 3: The spatial resolution of the operational hydrological models would be a relevant piece of information but is not reported in the text (nor in Table 1). Can the authors briefly report if there is any information available on this? It would be interesting to see if there are differences among countries also on that aspect.

- Sections 3 and 4: Several statements describing the current issues of FFEWSs (post-2021 floods), needs and wishes of stakeholders and authorities would benefit from more explicit sourcing of information (a more direct reference or quotes).
To enhance transparency and strengthen the qualitative grounding of the analysis, the authors should consider including anonymized quotes from interviewees or at least indicate whether insights stem from interviews (with which authority) or literature (which report or article). For instance, in Section 3.5 (Hydrological Forecasts – The Netherlands, lines 519–520), the following claim would carry greater weight if supported by a direct quote or attributed source: “Predictions are deterministic, although a strong need and wish exist to deploy probabilistic forecasting approaches”. Similarly, in Section 4, where issues raised by interviewees are summarized in Table 2, brief excerpts from interviews could enrich the discussion and bring practitioner voices more directly into the analysis.

----------------------------------------

Detailed technical comments, presentation quality, and English language issues

- The authors should use consistently either the present tense or the past simple and not alternate between the two as done now in most sections (especially in the Abstract and Results). For example, in the abstract: “Expert interviews across the region reveal that …”; “The assessment of warning systems showed …”; “The interviews also revealed …”. I would suggest sticking to the present when talking about what the interviews and analysis show.

- Abstract: I would suggest briefly reporting in the abstract all the four main recommendations and avenues for further research (those outlined in Section 5.2), as only one out of four does not seem the most appropriate summary of the findings.

- Table 1: The caption should clarify that the information reported is relative to the time of the authors’ analysis (present day), i.e., after the 2021 flood event. Otherwise, given the focus of the paper, it can lead to misunderstanding on whether it reports the state of the systems during the 2021 events. Also, the authors should check and clarify the text in the first column (5th row): "Main hydrological and model(s) …”; probably there is a typo (the word ‘and’ should be removed).

- Section 1 - Introduction: check the use of verb tenses; at the moment, the authors often move from present to past tense in an inconsistent way, even within the same section; for example, at Lines 67-69 when describing general concepts: “For example, uncertainty in the collection and processing of meteorological data (Fig. 1, Left) may lead to a ‘missed’ forecast event, where thresholds in the system were not surpassed, while observed water levels reached extreme heights … By contrast, uncertainties can also lead to so-called false alarms, where the predictions suggest that warning thresholds may be exceeded, while effectively they are not.” Here I would suggest consistently using the present tense everywhere.

- Section 1 - Introduction (Line 44): the authors should revise the following sentence: “The 2021 flood further showcased the potential of early warning systems”. I do not think that this sentence (especially the wording “showcased the potential”) is in line with the analysis provided by the authors and with the literature (e.g., Da Costa et al., 2026). The 2021 flood event seems to expose more significantly the shortcomings in the early warning system chains rather than their potential, given the deadly outcome of the flood event and the fact that the forecasts and early warnings did not lead to timely and effective anticipatory actions like evacuation orders. Recent additional literature on this should be cited, e.g., Da Costa et al. (2026).

- Section 1 - Introduction (Line 87): the sentence introducing post-processing techniques should be improved as post-processing is generally used for other objectives, e.g. improving forecast skill, more than for estimating the uncertainty.

- Section 1 - Introduction (Line 89): the sentence applies also to other variables rather than precipitation, so the authors should explicitly write that heavy precipitation is just a relevant example here: “in case some (or all) ensemble members exceed a certain threshold of heavy precipitation intensity”; for example, the same applies to hydrological ensemble forecasts or temperature.

- Section 1 - Introduction (L. 92): this is an inaccurate statement as it does not reflect the current or most up-to-date situation: ensemble forecasts are already a standard in hydro-meteorology and widely used in operational systems; the statement sounds outdated and indeed it refers to a quite old reference from 2016. Since then, ensemble flood forecasting has gained significant momentum also in operational contexts, e.g., see Wu et al. (2020) and Speight et al. (2021). The authors should revise this sentence and refer to more recent studies than a 2016 one for such a statement aiming to provide a present-day perspective: “It is expected that ensemble forecasts will be widely integrated in operational forecasting chains in the near future (Pappenberger et al., 2016).” Actually, ensemble forecasts are already commonly made at most major operational weather prediction facilities worldwide (Speight et al., 2021). Most operational hydrological forecasting chains in Europe already integrate probabilistic forecasts, as also shown by the authors in this study in Northwestern Europe.

- Section 3.1 - Lines 245-247: the sentence can be improved for clarity and flow, e.g.: "Flanders is the only region where flood inundation forecasts are run operationally for short-term (48-hour) lead times, whereas all other regions rely on discharge forecasts and thresholds."

- Figure 3: The water-level symbol summarizing the number of warning levels for fluvial floods is not very clear. The authors should consider footnoting the symbols or explaining them in the caption to aid readers' comprehension. Moreover, the font size of the text labels beside this symbol (e.g., “>100 year”, etc.) is too small, and the words “return period” or the abbreviation RP (e.g. “>100 year RP”) should be added to these labels for clarity.

- Section 3.5: typo in “a dedicated emergency management plans” - a single plan?

- Section 3.6 (Lines 566-568): improve and clarify the following sentence: “Warnings for a specific color can also be issued if there is a small chance (< 65% or < 25% of the area) on precipitation amounts belonging to the following color.” Does this mean a smaller chance (or area) of precipitation amounts belonging to the next, more severe warning class? How much smaller? I guess there must be a minimum probability threshold. Also, the wording “chance on” should read “chance of” in the following parenthesis: “a small chance on rainfall belonging”.

- Section 3.6 (Line 583): 10-day forecasts should be described as medium-range forecasts, not as ‘long-term’.

- Section 4 - Lines 693-694: It would be important to clarify what the following statement means: “Luxembourg implemented a new alerting system (LU-alert), harmonizing the warning levels of the meteorological forecasts alerts and the crisis management.”

- Section 4.3 - Lines 817-820: the paragraph title “Best practices of impact-based forecasting” could be removed or better phrased to represent the content, as in addition to best practices the plans and initiatives for future developments are reported.

- Section 5 - Line 939: the word 'systems' might be missing after 'national-scale cell broadcasting'

------------------------------------------------------------------------------

References

Da Costa, J., Ebert, E., Hoffmann, D., Cloke, H. L., & Neumann, J. (2026). Signals without action: A value chain analysis of Luxembourg’s 2021 flood disaster. Natural Hazards and Earth System Sciences, 26(1), 343–366. https://doi.org/10.5194/nhess-26-343-2026

Speight, L. J., Cranston, M. D., White, C. J., & Kelly, L. (2021). Operational and emerging capabilities for surface water flood forecasting. WIREs Water, 8, e1517. https://doi.org/10.1002/wat2.1517

Wu, W., Emerton, R., Duan, Q., Wood, A. W., Wetterhall, F., & Robertson, D. E. (2020). Ensemble flood forecasting: Current status and future opportunities. WIREs Water, 7, e1432. https://doi.org/10.1002/wat2.1432

------------------------------------------------------------------------------

ED: Publish subject to minor revisions (review by editor) (02 Feb 2026) by Robert Sakic Trogrlic

AR by Tim Busker on behalf of the Authors (10 Feb 2026) Author's response Author's tracked changes Manuscript

ED: Publish as is (16 Feb 2026) by Robert Sakic Trogrlic

AR by Tim Busker on behalf of the Authors (25 Feb 2026) Manuscript

Short summary

In July 2021, the Netherlands, Luxembourg, Germany, and Belgium were hit by an extreme flood event with over 200 fatalities. Our study provides, for the first time, critical insights into the operational flood early-warning systems in this entire region. Based on 14 expert interviews, we conclude that the systems have improved strongly in all countries. Interviewees stressed the need for operational impact-based forecasts, but emphasized that their operational implementation is challenging.