This work is distributed under the Creative Commons Attribution 4.0 License.
Impact-based flood forecasting in the Greater Horn of Africa
Lorenzo Alfieri
Andrea Libertino
Lorenzo Campo
Francesco Dottori
Simone Gabellani
Tatiana Ghizzoni
Alessandro Masoero
Lauro Rossi
Roberto Rudari
Nicola Testa
Eva Trasforini
Ahmed Amdihun
Jully Ouma
Luca Rossi
Yves Tramblay
Huan Wu
Marco Massabò
Download
- Final revised paper (published on 26 Jan 2024)
- Supplement to the final revised paper
- Preprint (discussion started on 02 May 2023)
- Supplement to the preprint
Interactive discussion
Status: closed
- RC1: 'Comment on egusphere-2023-804', Anonymous Referee #1, 19 May 2023
OVERVIEW
The paper describes a large-scale flood forecasting system developed for the Greater Horn of Africa, covering both the development of the system and its results in terms of flood forecasting and impact assessment.
GENERAL COMMENTS
The paper is well written, well structured and clear. The topic is surely of interest for the readers of Natural Hazards and Earth System Sciences (NHESS), as the paper describes an important effort to develop a large-scale flood forecasting system in an African region. The authors did a great job in developing the system, and I believe the paper deserves to be published.
However, I have some major comments that, in my opinion, need to be addressed before the publication.
MAJOR COMMENTS
- The development of the system has required a number of choices with respect to input data, meteorological forecasts, and hydrological modelling. The paper only describes the system currently running without considering possible alternatives. For instance, why satellite precipitation from GSMaP? Why the GFS forecasting system? Have the authors investigated alternative options? I believe that a discussion on the decisions made to develop the system is needed.
- It is not clear how the system works in real time. If I understand correctly, the hydrological model is run every day with the last day of satellite precipitation from GSMaP (1 day behind "now") and the 5-day GFS forecast. However, the text states that ERA5 is used. Presumably the model is run every day starting from N days before "now", with ERA5 used for as long as it is available; something along these lines appears at the beginning of Section 2.5.1, yet elsewhere it seems that ERA5 is not used at all. This is not specified in the text and should be clarified.
- The criteria used for parameter regionalization should be specified.
- Did the authors check the agreement between the ERA5, GSMaP and GFS precipitation data? This is a very important and critical aspect in the development of a flood forecasting system.
- The impact assessment is carried out by defining several indices. However, it is not clear how the indices are calculated and how they are integrated. I assume that normalised indices have been calculated, but this should be clarified.
- The authors say that correlation is a suitable indicator of the model's capability to detect flood events, particularly when threshold exceedances have to be assessed. I would agree, but this should be demonstrated in the paper. Is the model able to correctly detect flood events in terms of threshold exceedances? A dedicated paragraph should be written on this point.
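To make this request concrete: the categorical verification being asked for could be sketched as follows (a hypothetical illustration, not part of the authors' evaluation), where exceedances of a warning threshold in observed and simulated discharge are scored with a contingency table:

```python
import numpy as np

def exceedance_skill(obs, sim, threshold):
    """Hit rate and false alarm ratio for threshold exceedances
    of observed vs. simulated discharge (illustrative sketch)."""
    obs_ex = np.asarray(obs) > threshold
    sim_ex = np.asarray(sim) > threshold
    hits = np.sum(obs_ex & sim_ex)            # both exceed
    misses = np.sum(obs_ex & ~sim_ex)         # observed only
    false_alarms = np.sum(~obs_ex & sim_ex)   # simulated only
    hit_rate = hits / (hits + misses) if hits + misses else np.nan
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else np.nan
    return hit_rate, far
```

Scores of this kind would directly show whether a high correlation actually translates into correct threshold-exceedance detection.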
SPECIFIC COMMENTS (L: line or lines)
L161: Alfieri et al. (2022a) is missing in the reference list.
L181: GEFS is not defined, please check all the acronyms.
L248: It is not clear how many stations are used for calibration and how many for validation.
L269: The Supplemental Material should be cited more clearly, which figure exactly? Which paragraph?
L325: It would be interesting to show stations located downstream of large reservoirs to assess the reservoir impact.
Figure 3: The figure is too small and hardly readable. Moreover, the stations shown in the figure should be highlighted in the map. The last panel (bottom right) shows a strange behaviour of river discharge; is there any explanation for that?
L364: Do the authors have an estimation of peak river discharge? Can the authors make a comparison between observed and modelled peak discharge?
RECOMMENDATION
On this basis, I find the topic of the paper quite relevant and I suggest a moderate revision before its publication in NHESS.
Citation: https://doi.org/10.5194/egusphere-2023-804-RC1
- AC1: 'Reply on RC1', Lorenzo Alfieri, 26 Jun 2023
- RC2: 'Comment on egusphere-2023-804', Anonymous Referee #2, 13 Jun 2023
Title: Impact-based flood forecasting in the Greater Horn of Africa
Author(s): Lorenzo Alfieri et al.
MS No.: egusphere-2023-804
MS type: Research article
Special issue: Reducing the impacts of natural hazards through forecast-based action: from early warning to early action
General comments
This article presents the development and first evaluation of an impact-based medium-range flood forecasting system for the Greater Horn of Africa (GHA) called Flood-PROOFS East Africa (FPEA). The work presented is of great relevance for the readership of Natural Hazards and Earth System Sciences (NHESS) and for the Special Issue. The authors developed FPEA, a valuable system for impact-based early warning and forecast-based action in eastern Africa, as proven by the fact that the system is already operational and supports the African Union Commission and the IGAD Disaster Operation Center in the early warning chain in Eastern Africa. The authors report a first evaluation of the hydrological reanalysis produced by FPEA and a semi-quantitative assessment of the impact forecasts for a recent flood event.
The paper is generally well written and builds on substantial high-quality work. However, some parts of the methods description and results discussion should be improved to make the paper even more impactful and suitable for publication in NHESS. Some key methodological choices are given with no justification and should be further motivated and discussed. More insights on the consistency between reanalysis and forecast biases and climatology are needed to justify and discuss the event-detection approach followed by the authors. Moreover, the quantitative analysis of the model performance and the discussion could be enhanced, as the basic long-term evaluation relies only on the KGE and its components for the simulation runs (with most emphasis on the correlation) to assess the system's capability in flood event detection. More evaluation based on flood-relevant metrics could be added, or at least the limitations of the current analysis should be discussed further, as the correlation over a multi-year simulation run does not seem sufficient to understand the capability of the system in detecting flood events. On the other hand, the event-based semi-quantitative analysis of the Nile floods of 2020 is very interesting and well presented. Hopefully future work will extend this analysis to more events and to an impact-based quantitative evaluation, as the authors also suggest.
Specific comments
I have some moderate to major comments that the authors should consider to improve the manuscript:
- Motivation of the choice of the model and forcing data: The choice of the hydrological model and of the hydro-meteorological forecast and reanalysis data used in FPEA seems to be pre-determined for unspecified reasons (e.g., it is not clear whether it is based on known performance in the region or on other grounds). Given the plethora of hydrological models available, it would be important to discuss the choice of the selected model (Continuum) and any region-specific or other selection criteria (e.g., the performance of different hydrological models). Similarly, the choice of the GFS forecasts and of the GSMaP/ERA5 reanalyses is not motivated; given the existence of alternative global datasets, it would be important to explain why GFS, GSMaP and ERA5 were used. Also, the choice of using different reanalysis products (GSMaP/ERA5) for precipitation and temperature should be briefly discussed.
- Model calibration and regionalization procedures: some clarifications are needed:
- from the main manuscript it is not clear why the normalized Root Mean Square Error (nRMSE) is used for model calibration in place of the Kling-Gupta Efficiency (KGE) or of other popular choices (e.g., NSE). Only in the Supplement do the authors explain that the nRMSE "enables a good trade-off in achieving low bias and good correlation", but this needs to be recalled explicitly in the manuscript. Moreover, it is not clear whether previous studies in the literature show that the nRMSE enables a better trade-off between low bias and good correlation than the KGE, or whether the authors' choice is based on their trial-and-error calibration tests. Is the trade-off between correlation and bias better ensured by the nRMSE? If there is no previous study on this, these results could be briefly highlighted in the manuscript. Finally, only the Supplement states that the RMSE is normalized by the average flow obtained from long-term records; this needs to be specified in the main manuscript.
- the 3-year calibration period (4 years including warm-up) is quite short compared to calibration periods commonly used in the literature and to the length of data available for this work (21 years are used for validation). It is unclear whether the authors tested the sensitivity of the results to the length of the calibration period. If not, this would be recommended, as the average performance of the long runs in both calibration and validation is very low (e.g., see median KGE_val < -0.41). Readers may wonder whether extending or changing the calibration period could improve model performance (in both calibration and validation), as a few previous studies point to the importance of record length and of including wetter years in the calibration (Anctil et al., 2004; Li et al., 2010). The authors should at least discuss this further.
- in explaining the calibration procedure, the authors mention that the entire calibration process was repeated more than once to fine-tune the choice of the parameter set, the calibration stations and the calibration period, but it is not clear how the 2000 different runs (see L. 235) and the whole process were set up (whether with clear objective rules, which should then be specified).
- for the regionalization, the adopted criteria of proximity and climatic conditions should be further specified, or a reference should be added.
- Flood event detection and ensemble forecast trigger: The adopted methods for event detection and ensemble triggering rely on the consistency of the climatologies of the forecasts (driven by GFS and GEFS) and of the long-term runs driven by the reanalysis (GSMaP and ERA5). If the climatology biases and the relative ranking of flood peaks differ between forecasts and reanalysis, the triggers might be less effective (or not effective at all). Lead-time-dependent biases are often found in hydro-meteorological forecasts, and accounting for them has proved important in the literature (Zsoter et al., 2020). A comparative analysis of the climatology of the forecasts and of the long runs driven by the reanalysis would support the key operational choices adopted for FPEA. Further analysis, or at least more discussion on this point, would be important.
- Basic model evaluation and missing quantitative forecast evaluation: the quantitative evaluation of FPEA is carried out and presented in Section 3.1 (Hydrological model evaluation) only for the simulations in validation mode, with a few basic general metrics. I wonder whether the authors could include some results of a forecast evaluation too, even over a shorter period based on the available hindcasts, as this would be very relevant. Regarding the metrics, using only the KGE and its components to assess the simulations may limit the understanding of the flood simulation capabilities. The authors claim (L. 329-330) that the correlation is a 'suitable indicator for the capability in event detection and in turn of flood early warning based on threshold exceedance'. I agree that the correlation gives useful insight into this capability (more so than the bias, of course), but it is not the most suitable indicator for flood event detection. Other metrics (e.g., flood-event-based metrics, peak errors, hit rates and false alarms, the Brier score, etc.) might be more suitable. The authors could extend their quantitative model evaluation to metrics that better target flood detection capabilities, or should at least further discuss the limitations of the current analysis based on the simple correlation.
- Discussion on modelling assumptions and limitations: It would be important to enhance the discussion on how the assumption of no flood defenses (or of their failure) may lead to an overestimation of flood impacts. Similarly, the choice of including only the largest reservoirs of the region (storage > 300 Mm3), and any assumptions made in the model about their management rules, should be discussed further. In the Section describing the model setup (2.4.1), the modelling of reservoirs and lakes is not explained, even briefly, and no reference on how they are modelled in Continuum seems to be provided, while it would be important to understand how human influences and dams are considered.
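As background to the comment above on calibration objective functions, a minimal sketch of the two metrics under discussion may help (assuming, as the Supplement states, that the RMSE is normalized by the long-term mean observed flow; this is an illustration, not the authors' code):

```python
import numpy as np

def nrmse(obs, sim):
    """RMSE normalized by the long-term mean observed flow; 0 is perfect."""
    return np.sqrt(np.mean((sim - obs) ** 2)) / np.mean(obs)

def kge(obs, sim):
    """Kling-Gupta Efficiency; 1 is perfect."""
    r = np.corrcoef(obs, sim)[0, 1]        # linear correlation
    beta = np.mean(sim) / np.mean(obs)     # bias ratio
    alpha = np.std(sim) / np.std(obs)      # variability ratio
    return 1 - np.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (alpha - 1) ** 2)
```

Contrasting the parameter sets selected by each objective would make explicit whether the nRMSE indeed yields a better bias/correlation trade-off than the KGE.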
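Regarding the consistency of forecast and reanalysis climatologies, the basic idea of deriving like-for-like thresholds (cf. Zsoter et al., 2020) can be sketched as follows; the quantile level and variable names are illustrative assumptions, not the FPEA implementation:

```python
import numpy as np

def consistent_thresholds(reanalysis_clim, forecast_clim, quantile=0.98):
    """Derive the same empirical quantile separately from the reanalysis
    climatology and from the forecast climatology, so that exceedance
    triggers compare each system against its own statistics."""
    return (np.quantile(reanalysis_clim, quantile),
            np.quantile(forecast_clim, quantile))
```

If the forecast climatology is biased relative to the reanalysis, the two thresholds will differ; comparing them (per lead time) would quantify how much the trigger effectiveness depends on that bias.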
Other minor comments:
- A few more references in the introduction are needed to back up some statements on the projected increase in variability of rainfall and higher risk of flooding (e.g., see sentence: “The variability in the seasonal rainfalls is projected to increase, resulting in more frequent wetter and drier years and a higher risk of flood and drought events.”)
- When mentioning in which other countries (e.g., Italy, Bolivia, Mozambique) the system is operational (L. 82-83), it would be interesting to see references, if available. Moreover, it is not clear whether the configuration of the system differs in each country and whether new impact-based components have been included for the first time in East Africa (see L. 84-87).
- The impact forecast methodology needs to be further clarified. In Section 2.5.2, it is not completely clear how the multiple flood-threshold-based inundation maps are combined (e.g., L. 277-280: "In addition to the three warning thresholds used for early warning (i.e., annual frequencies of 1 in 2, 5 and 20 years), we extracted four additional threshold maps with the same annual frequencies as those of the JRC inundation maps (i.e., 1 in 50, 100, 200 and 500 years), to enable impact assessments for a wider spectrum of event magnitude"). Are the static inundation maps corresponding to the six return periods simply overlapped when activated by the dynamic forecasts (with no interpolation)?
- In Section 2.1 (The study region), it would be important to mention which other operational flood forecasting systems may already exist in the GHA region at the Country or regional levels, and how the model developed here fills specific gaps.
- L. 315-320: The discussion of the bias of the simulations and of differences with previous studies (underestimation vs. overestimation) should be improved, specifying that different reanalysis products can lead to different biases and that results generally vary across catchments (even within a same region such as eastern Africa); a few more citations could be added for this. For example, the following sentence can be improved: "The issue of bias in hydrological simulations in Africa was already pointed out in various previous works, yet with a trend of overestimating discharges when atmospheric reanalyses are used as input.". The bias trends are expected to depend on the reanalysis products, the basins and the models used, as a few previous studies showed (Cantoni et al., 2022; Wanzala et al., 2022), and the picture over Africa is expected to be quite complex.
- L. 320-322: After this sentence, it would be good to include a citation to previous work: “it is known that bias does not deteriorate the performance of systems based on threshold exceedance detection, if warning thresholds are consistent with the discharge time series.”
Technical corrections
- L. 139-L. 144: it would be useful to add a link to the mentioned JRC Data Catalog Service webpage and to the source of the Areas of Influence maps
- L.146-149: it would be important to report the sources of the exposure layers in the main manuscript (also for reproducibility and better understanding of what information has been used), while it is OK keeping additional details in the Supplement material.
- L. 160-161: sentence to improve and clarify: “In this work we use vulnerability information from Alfieri et al. (2022a) which values range between 0 and 1 depending on the hazard magnitude…” (check the word ‘which’)
- L. 179: the “respective domain resolutions” (i.e. their range) could be specified here, even if later in the manuscript the model resolutions are mentioned
- L. 191: Silvestro et al. (2013) is missing in the full reference list
- L. 200: it would be interesting to know how the variable grid resolution is fixed for each domain used (is there an objective rule followed to fix it?)
- L. 191-211: the model temporal resolution should be specified in this Section.
- L. 237-242: the form of the sentences reporting the three different key functions of the long-term model simulations should be improved (either as full sentences as the third point, or as a list but using appropriate punctuation).
- L. 269: this and other references to the Supplement material need to be clarified, by adding section title and/or Figure numbers to refer to the exact part of the Supplement Material
- L. 276-277: regarding mapping each pixel of the GloFAS river network to one or more pixels of the Continuum network, the sentence mentioning “automated criteria of proximity and similarity between the drainage areas, followed by manual fitness check” could be clarified reporting more yet brief details on the automated criteria and manual check.
- Equations (1) and (2): units should be reported in the lines below explaining all terms
- L. 296: a link and reference to the ‘myDewetra web interface’ mentioned here for the first time should be reported (maybe introducing briefly what it is)
- L. 304 and Fig 2 caption: bias ratio and variability ratio
- L. 325-326: sentence to correct: “stations immediately downstream large reservoirs (i.e. Victoria Nile downstream ...), which release rules are not easily predictable” – (maybe use ‘for which …’ instead)
- L. 372: check and clarify this sentence as it does not sound clear (“… with a slight but persistent increasing trend resulting from the lamination of flood volumes released by the Sudd Swamps in South Sudan”)
- Figure 4a: it would be good to improve the quality and clarity of the hydrographs (resolution of screenshots)
- Figure 5: caption and figure titles are not consistent (forecasts of affected cropland aggregated vs. population affected)
- Table 1: not clear why the comparison with GloFAS is only carried out for population affected here and not the rest
- L. 430-436: this part of the conclusions could be moved to the introduction or reduced a bit here, while readers would expect a brief summary and highlights of the results presented in the paper in terms of model evaluation
- L. 457-459: check and improve the sentence “forecasts of inundation extent can be benchmarked to satellite acquisitions, which improved latency and availability currently enable almost daily coverage of flood disasters” (maybe better: “which currently enable … thanks to improved latency…”)
References
Anctil, F., Perrin, C., & Andréassian, V. (2004). Impact of the length of observed records on the performance of ANN and of conceptual parsimonious rainfall-runoff forecasting models. Environmental Modelling & Software, 19(4), 357–368. https://doi.org/10.1016/S1364-8152(03)00135-X
Cantoni, E., Tramblay, Y., Grimaldi, S., Salamon, P., Dakhlaoui, H., Dezetter, A., & Thiemig, V. (2022). Hydrological performance of the ERA5 reanalysis for flood modeling in Tunisia with the LISFLOOD and GR4J models. Journal of Hydrology: Regional Studies, 42, 101169. https://doi.org/10.1016/j.ejrh.2022.101169
Li, C., Wang, H., Liu, J., Yan, D., Yu, F., & Zhang, L. (2010). Effect of calibration data series length on performance and optimal parameters of hydrological model. Water Science and Engineering, 3(4), 378–393. https://doi.org/10.3882/j.issn.1674-2370.2010.04.002
Wanzala, M. A., Ficchi, A., Cloke, H. L., Stephens, E. M., Badjana, H. M., & Lavers, D. A. (2022). Assessment of global reanalysis precipitation for hydrological modelling in data-scarce regions: A case study of Kenya. Journal of Hydrology: Regional Studies, 41, 101105. https://doi.org/10.1016/j.ejrh.2022.101105
Zsoter, E., Prudhomme, C., Stephens, E., Pappenberger, F., & Cloke, H. (2020). Using ensemble reforecasts to generate flood thresholds for improved global flood forecasting. Journal of Flood Risk Management, 13(4), e12658. https://doi.org/10.1111/jfr3.12658
Citation: https://doi.org/10.5194/egusphere-2023-804-RC2
- AC2: 'Reply on RC2', Lorenzo Alfieri, 26 Jun 2023