the Creative Commons Attribution 4.0 License.
Subseasonal-to-seasonal forecasts of Heat waves in West African cities
Abstract. Heat waves are one of the most dangerous climatic hazards for human and ecosystem health worldwide. Accurate forecasts of these dramatic events are useful for policy makers and climate services to anticipate risks and develop appropriate responses. Sub-seasonal to seasonal forecasts are of great importance for actions to mitigate the human and health consequences of extreme heat. In this perspective, the present study addresses the predictability of heat waves at sub-seasonal to seasonal time scales in West African cities over the period 2001–2020. Two types of heat waves were analyzed: dry and wet heat waves, using 2-meter temperature (T2m) and wet bulb temperature (Tw), respectively. Two models that are part of the S2S forecasting project, namely the European Centre for Medium-Range Weather Forecasts (ECMWF) and the UK Met Office models, were evaluated using two state-of-the-art reanalysis products, namely the fifth generation ECMWF reanalysis (ERA5) and the Modern-Era Retrospective analysis for Research and Applications. The skill of the models in detecting hot extreme events is evaluated using the Brier score. The models show significant skill in detecting hot days for both short- and long-term forecasts (2- and 5-week lead times, respectively). The predictability of heat waves in the forecast models is assessed by calculating categorical metrics such as the hit rate, the Gilbert score and the false alarm ratio (FAR). The forecast models show significant skill in predicting heat wave days compared to a baseline climatology, mainly for short-term forecasts (two weeks lead time) in three climatic regions in West Africa, but the hit-rate values remain below 50 % on average. We find that nighttime heat waves are more predictable than daytime heat waves. On average, the FAR is excessively high and tends to increase with the lead time.
Only approximately 15 % to 30 % of the predicted heat wave days are actually observed for Week 5 and Week 2, respectively. This suggests that the models overestimate the duration of the heat waves with respect to ERA5. The skill of the models in forecasting dry and wet heat waves is very close. Although the models demonstrate skill in heat wave detection compared to a baseline climatology, they fail in predicting the intensity of heat waves.
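For readers less familiar with the verification scores named in the abstract, the following is a minimal sketch of how the Brier score, hit rate, false alarm ratio and Gilbert score (equitable threat score) are conventionally computed. The function names and the toy series are illustrative assumptions; this is not the authors' code or data.

```python
def brier_score(probabilities, observed):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, observed)) / len(observed)


def categorical_scores(forecast, observed):
    """Hit rate, false alarm ratio and Gilbert score for two binary series."""
    pairs = list(zip(forecast, observed))
    hits = sum(1 for f, o in pairs if f and o)
    false_alarms = sum(1 for f, o in pairs if f and not o)
    misses = sum(1 for f, o in pairs if not f and o)
    total = len(pairs)

    # Hits expected from a random forecast with the same event frequencies.
    hits_random = (hits + misses) * (hits + false_alarms) / total

    nan = float("nan")
    hit_rate = hits / (hits + misses) if hits + misses else nan
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else nan
    denom = hits + misses + false_alarms - hits_random
    gilbert = (hits - hits_random) / denom if denom else nan
    return {"hit_rate": hit_rate, "far": far, "gilbert": gilbert}


# Toy example (hypothetical values): 1 = heat wave day, 0 = no heat wave day.
forecast = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
observed = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
scores = categorical_scores(forecast, observed)
```

In this toy case the FAR is 0.75, so only 25 % of the predicted heat wave days are observed. The fraction of predicted days that verify is always 1 minus the FAR, which is the relation behind the 15 % to 30 % figure quoted in the abstract.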
Status: closed
-
RC1: 'Comment on nhess-2023-144', Anonymous Referee #1, 15 Sep 2023
The comment was uploaded in the form of a supplement: https://nhess.copernicus.org/preprints/nhess-2023-144/nhess-2023-144-RC1-supplement.pdf
-
AC2: 'Reply on RC1', Cedric Gacial Ngoungue Langue, 17 Dec 2023
The comment was uploaded in the form of a supplement: https://nhess.copernicus.org/preprints/nhess-2023-144/nhess-2023-144-AC2-supplement.pdf
-
RC2: 'Comment on nhess-2023-144', Anonymous Referee #2, 22 Sep 2023
Review of nhess-2023-144: Subseasonal-to-seasonal forecasts of Heat waves in West African cities
Review overview
The concept of this study is certainly of interest, and evaluation of extreme events such as heatwaves, which are increasing in frequency, intensity and duration, is important at a range of timescales. I read this paper with interest. This study examines the extended-range/subseasonal timescale, which can be useful for providing early indications of potentially hazardous events, ahead of more detailed forecasts that shorter-range forecasting systems are capable of predicting.
While I find the concept useful and interesting, and I like the range of skill scores used, I did find that several aspects of the methodology are not described clearly and there are some questions around the datasets and methodology used. Much more clarification is required around the forecasts used for this evaluation, and discussion of the potential drawbacks. The authors consistently refer to ‘subseasonal to seasonal’, which, while catchy, is not completely covered here, as the seasonal timeframe is not included. The descriptions should probably be changed to subseasonal throughout. I also had some questions around the identification of heatwaves and the thresholds used, and the method of dealing with ensemble members. The abstract and introduction mention both wet and dry heatwaves, and daytime and nighttime, but these distinctions are not clearly defined and discussed, and appear to be mostly lacking from the rest of the paper and the results and conclusions.
While the research questions and results are interesting, I feel that the structure of the writing could be significantly improved throughout the paper, as it is currently challenging in places to follow the work, and to fully understand the somewhat contrasting conclusions. For a decision-maker, what are the takeaways to help understand how these forecasts could/could not be used in heatwave forecasting and anticipatory action?
I have provided some more specific comments on the text and some of the figures below. I hope these can be useful as the authors consider the revisions and next stages of the manuscript.
Detailed comments
Abstract
- Line 13: Short-term forecasts typically refer to those of <4 days – 2 weeks lead time would typically be classed as medium-range forecasting
- Line 15: Fail is a strong word, without context?
Introduction
- Line 30-35: It could be worth mentioning that often, national meteorological services have a definition of a heatwave used to provide warnings (unless this is not the case in the study region)? Otherwise, there is also a WMO recommended heatwave definition (https://www.un-spider.org/category/disaster-type/extreme-temperature). Same comment at line 41: research paper authors are not the only ones / the authoritative ones to define heatwaves, particularly in a forecasting perspective. Is there a definition used most often by the forecasting services based in the study region?
- Line 39: ‘min, min or max’ – is there a typo here? Min seems repeated
- Line 39: heat stress indices are mentioned, but not really defined anywhere? It may be useful to define here what a heat stress index is and how it differs from the other metrics listed
- Line 45: It is of course of crucial importance for early warning systems to provide information on the occurrence of heatwaves. However, early warning systems are not usually done using seasonal weather forecast models, which often lack the skill and resolution to accurately predict individual extreme events. Typically, an early warning system would refer to a shorter/medium-range lead time, supplemented with advanced information on the potential for hazardous weather using S2S forecasts. The authors go on to make this point about seasonal forecasts providing early indications, which I completely agree with, but early warning systems require a range of lead times, including shorter timescales to account for the fact that forecasts get much more accurate at shorter lead times.
- Line 49-50: citation?
- Line 51-xx: I saw that Vitart (et al) also studied the Pacific Northwest heatwave of 2021, considering the ECMWF subseasonal forecasts and a more recent version of the ECMWF model, and 9 other S2S models. This may be of interest for the authors to include, as it uses a more recent model version than the Russian heatwave studies. https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2021GL097036 Other authors also examined other timeranges of the forecasts for this heatwave.
- Line 114: ERA5 is used to initialise the extended-range reforecasts, but not all of the different aspects of ECMWF’s IFS
Section 2
- Section 2.1: it could be interesting in this section to include some description of heatwaves and their impacts in this region – what have been the impacts of significant past heatwaves? What kinds of temperatures are reached? During what season do impactful heatwaves occur?
- Section 2.1: this section refers to the region having a short wet season followed by a long dry season. But the results later are split into winter/spring/summer/autumn – this should be further explained and justified.
- Section 2.2: A brief discussion on the potential disadvantages of reanalysis datasets might be warranted, if not included later (e.g. they may not always have the resolution to be able to pick up the very highest temperatures during heatwaves)
- Section 2.2.2: The assessment of dry and wet heatwaves seems like it should be a separate section, as at the moment it seems to be that it is only applied to MERRA, as it sits under that subtitle. It is also not clear how the heatwave identification described later takes into account both wet and dry heatwaves, nor is this clear in the results. Wet and dry heatwaves should also be defined (do they refer to the aforementioned wet and dry seasons? to the humidity experienced during a heatwave? Or otherwise?)
- Section 2.3.1 ECMWF forecasts: there are some errors in this section, and I find some of the description unclear. This isn’t helped by the fact that ECMWF very recently upgraded their models so some of this information no longer stands. It may be useful to revisit the description in this section for clarity. More details below:
- The ECMWF IFS has several separate forecasting systems (medium-range (now high-resolution), extended-range, seasonal), and it would be useful to specify which is being used and described here, as other parts of the system have different resolutions, lead times and ensemble members.
- ECMWF provide both extended-range (up to 46 days) and seasonal (up to 6 months) forecasts to the S2S programme. I understand that the authors are using the extended-range forecasts, and it may be useful to refer to the forecasts as this throughout.
- ‘ECMWF ENS’ is often used to refer to the medium-range (up to 15 days and now high-resolution) ensemble, and so could cause confusion – these are not the same exact forecasting system as the extended-range (at least not any more).
- The authors may wish to specify that they are interpolating to a 0.25° grid to match the resolution of ERA5 for evaluation (I presume), with the caveat that this does reduce the resolution of the native forecasts, and that higher resolutions can be beneficial for capturing extremes.
- The IFS is no longer running at CY41r2. If the authors downloaded hindcast data from forecast dates in 2021, the cycles could have been 47r1 (implemented 30 June 2020), 47r2 (implemented 11 May 2021), or 47r3 (implemented 12 October 2021). The authors should confirm which cycle(s) was used. These cycles indeed had 51 ensemble members at 18 and 36km resolution depending on the lead time, as the authors describe. The latest version of the extended-range forecasts (48r1) has 101 members, run at 36km for the full forecast range (days 0 to 46), and is run daily rather than twice a week. (https://www.ecmwf.int/en/about/media-centre/news/2023/model-upgrade-increases-skill-and-unifies-medium-range-resolutions)
- A useful description of hindcasts/reforecasts may be ‘Hindcasts are forecasts produced for past dates using the most recent version of the forecasting system, and allow analysis of how the current system would have performed, alongside a consistent dataset covering a longer time period for evaluation’, as a useful use of these data for the authors’ purposes?
- The authors also use ‘hindcasts’ and ‘reforecasts’ interchangeably. ECMWF typically call them reforecasts, and the authors should be clearer if it is themselves calling them ‘hindcasts’ throughout the study.
- I didn’t understand the explanation for not using the Thursday hindcasts, sorry
- Line 83: the authors may wish to acknowledge that when dealing with extreme events, including different extreme events in the analysis may well result in different conclusions regarding the skill.
- Section 2.3.2: Parts of the UKMO forecast description are also confusing, for example the transition from discussing 4 members to 7 members. Perhaps a table outlining key aspects of both forecasting systems, and the timeframes to which they apply, would be helpful to provide an overview of the system characteristics?
- Section 2.3.2: I believe the description of the concatenation could be simpler. Is an equation necessary, or is it enough to simply state that prior to 2016, hindcasts are used, and after that, the real-time forecasts are used, followed by the details from line 190? I am not completely convinced of the decision to reduce the number of ensemble members in this methodology, thus reducing the uncertainty representation of the forecast. It would also be useful to provide an overview of how other characteristics of the model have changed between the hindcast version and the potentially multiple operational versions used during the period of the real-time forecasts? This could be covered in the aforementioned table.
- Line 197: I believe here the authors are referring to a lack of data from local stations to evaluate the forecasts against. The sentence implies that no data is available from this region for weather forecasts to assimilate in their production – are the authors sure this is the case? Particularly since weather forecasts also use various other sources of observations beyond station data.
- Section 2.4.2:
- Are the daily maximum, daily minimum and wet bulb computed from the hourly data? Or otherwise?
- Are nighttime and daytime heatwaves considered separately, or as one continuous heatwave that does/does not provide relief overnight? This can have implications for heat stress and health, but it is not clear how it is factored into the authors’ definition of a heatwave. I think it is hinted at, but was not entirely clear to me in the definition.
- Is the 90th percentile representative of the health impact of heatwaves on humans / ecosystems? What if the 90th percentile does not reach a temperature likely to cause heat stress? Why not use a temperature or wet bulb threshold known to cause health impacts in this region?
- Line 213-214 states the 90th percentile is computed over the entire period, and then line 215 says it’s calculated for each day of the year. I am left unsure as to which of these is used (or which is used for which analysis, if both are used at different points), and this could be quite impactful for the results.
- Section 2.4.3: the description of these steps could be simplified and clarified further. The two first points may not really be necessary to spell out, and the third could perhaps be simplified, but it is also not clear over which timeframe this is done. Is it done for each day of the time series? And then if the number of hot days in a row is not >=3, the value is returned to 1?
- Line 232-233 isn’t clear to me, apologies.
- Line 236-238: the reasoning behind this, and how this is applied in the methodology, isn’t clear to me. Why the minimum of the daily thresholds? Does this correspond to a value that is certain to have an impact on human health? How does this allow proper assessment of the severity? Please expand on this. This relates to a previous point about using percentile thresholds, when using set values corresponding to heat stress may be both simpler and more effective.
- Line 243: ensemble forecasting does not only account for uncertainties in the physical component of the model, but also uncertainty arising from the chaotic nature of the atmosphere, and from an imperfect observation network and therefore imperfect initial conditions of the forecast
- Line 247-249: By considering the mean, median, warmest, coolest, 1st and 3rd ensemble members, you have identified 6 ‘members’. The Met Office forecasts only have 2 or 7 members, and the ECMWF forecasts 11 members, so I am unsure as to why it is less challenging to use these 6 ‘members’ chosen by the authors, rather than more usefully examining the entire ensemble and therefore the full range of uncertainty represented by the ensemble? It should also be considered that the mean (and quartiles, depending on how these are produced) do not represent an actual forecast scenario or physically likely state of the atmosphere, produced by the model, and so caution is required in assessing this both as a forecast and in evaluating it.
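The questions above on Sections 2.4.2 and 2.4.3 (one threshold for the whole period versus one per calendar day, and the rule of at least three hot days in a row) can be made concrete with a small sketch. It assumes a heat wave day is any day belonging to a run of three or more consecutive hot days; the function names, temperatures and thresholds are hypothetical and are not taken from the manuscript.

```python
def hot_days(temps, thresholds):
    """Compare each day with its threshold (a single value or one per day)."""
    if isinstance(thresholds, (int, float)):
        thresholds = [thresholds] * len(temps)
    return [t > thr for t, thr in zip(temps, thresholds)]


def flag_heat_wave_days(hot, min_length=3):
    """Mark days that belong to a run of at least `min_length` hot days."""
    flags = [False] * len(hot)
    run_start = None
    # The trailing False is a sentinel that closes a run ending on the last day.
    for i, is_hot in enumerate(list(hot) + [False]):
        if is_hot and run_start is None:
            run_start = i
        elif not is_hot and run_start is not None:
            if i - run_start >= min_length:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags


# Hypothetical daily maximum temperatures (degC) for eight days.
temps = [39.0, 41.0, 41.5, 42.0, 38.0, 41.2, 41.3, 39.5]

# Option A: a single threshold for the entire period (hypothetical value).
fixed = flag_heat_wave_days(hot_days(temps, 40.0))

# Option B: a separate threshold for each calendar day (hypothetical values).
daily = [38.5, 40.5, 41.0, 41.5, 38.5, 41.5, 41.5, 39.0]
varying = flag_heat_wave_days(hot_days(temps, daily))
```

With these made-up numbers the two options disagree on the first day: it is part of the heat wave under the day-of-year thresholds but not under the single threshold. This is why the choice between the two needs to be stated explicitly.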
Section 3
- some paragraphs would be helpful for readability in sections 3,4,5
- Section 3.1: the use of ‘hot bias’ and ‘cold bias’ is quite strong wording, as opposed to positive and negative. How large are the biases? It is not mentioned in the text, but some of these ‘hot’ biases may only be a small fraction of a degree, so hot might not be the most appropriate choice of wording?
- Given that the authors state that the results comparing to MERRA are significantly different to those using ERA5, I am surprised not to see some figures included in the main text. How does this discrepancy impact the evaluation results, if the two verification datasets are so different?
- Line 321: the plots are shown in °C, but the text uses K – why refer to it differently between the texts and figures?
- Section 3.2: Why are the Tw results only shown in supplementary material if they make up an important part of the research question and results?
- Section 3.3.2: Why is the mean duration the sum divided by the number of affected years, rather than divided by the number of heatwaves? (what if there is more than one heatwave per year?)
- Line 375: can the authors comment on the representation of convection in both models?
- Section 3.4: could the authors explain further the reasoning behind the 20%, 40% and 60% percentile thresholds? I did not follow the aim and reasoning here. The text and Figure 10 seem to refer to a section of the methods that I was unable to find. Perhaps it refers to the last sentences of section 2.4.4, but I did not follow the link, and further explanation may be required.
- Regarding seasons, are there seasons where there may technically be heatwaves as the temperature exceeds the 90th percentile for the time of year, but they would not cause heat stress or health impacts? Should these be considered in the same way as those during other seasons? Why are winter/spring/summer/autumn used if the region experiences two seasons (dry/wet) – how do these correspond? Some context regarding heatwaves themselves and the temperatures reached and impacts in this region could provide interesting further insight (for example in section 2.1 this could be added).
- From a decision-making perspective, it would be interesting to understand how far in advance these forecasting systems may be able to provide a useful prediction/indication of a heatwave. The results are interesting from a modelling perspective, but I finish reading the results section feeling that I would not really have a confident answer to this question. Could the discussion be expanded to consider the results in this context?
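The point raised for Section 3.3.2 about the denominator of the mean duration can be illustrated numerically. The sketch below uses hypothetical heat wave durations, not values from the manuscript, and compares dividing the total number of heat wave days by the number of affected years with dividing by the number of events.

```python
# Hypothetical heat wave durations (days), grouped by year.
events_by_year = {2010: [3, 5], 2011: [], 2012: [4]}

total_days = sum(sum(durations) for durations in events_by_year.values())
affected_years = sum(1 for durations in events_by_year.values() if durations)
n_events = sum(len(durations) for durations in events_by_year.values())

# Sum of durations divided by the number of years with at least one heat wave.
mean_per_affected_year = total_days / affected_years  # 12 / 2 = 6.0 days

# Sum of durations divided by the number of individual heat waves.
mean_per_event = total_days / n_events  # 12 / 3 = 4.0 days
```

As soon as one year contains more than one heat wave, the per-year figure exceeds the per-event figure, so the two quantities answer different questions and should be labelled accordingly.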
Section 4
- Line 456-460: the names of the convection schemes unfortunately do not mean much to me – what are the key differences and the implications?
- Line 461: could the authors expand on ‘the data and initial conditions are completely different’ ?
- Line 465: did the authors not reduce the resolution of both forecasts? What impact could this have? Particularly on the discussion of all results relating to the spatial variability and the intensity
- Overall, I find the discussion section raises some interesting points, but does not really expand on why or how they influence the results
Section 5
- Line 484-485: it was not clear where the key results were that make any distinction between daytime and nighttime heatwaves and how this was handled in the methodology. An interesting aspect of heatwaves is the drop in temperature overnight, and whether this provides any relief from the daytime heat stress, but this doesn’t factor into the discussion at all. What do the authors consider as a nighttime heatwave, one that only occurs at night and not also in the day? It is a little confusing, and more context and insights could probably be included.
- On a similar note, the difference between a wet and a dry heatwave is not clearly defined, other than the use of different variables. These terms are only really used in the introduction and conclusions, but the link to the results is missing and the methods are not entirely clear.
- Line 491: what is counted as a failure to predict the intensity? At what lead time? This is a very broad statement. It implies the forecasts are not useful at all – do the authors conclude in this paper that extended-range forecasts are not useful for predicting heatwaves? Can they be used or interpreted at all to complement short-range forecasts? Can some information be provided on the skill of short-range forecasts (could be from other studies), to provide context? How do the authors tie in these results, with the earlier statements that based on some skill scores, the models can detect extreme events up to 5 weeks ahead? Detect in what sense?
Figures
Fig 3: The use of (a) and (b) for both the upper and lower panels and the individual panels is a little confusing at first. Perhaps consider (i) and (ii) for the panels? (or just upper and lower?), or split this into two figures.
Fig. 4: The colourscale here is misleading – it should be adjusted so that the white colour falls at 0, with positive values in red and negative in blue, otherwise it is very challenging to properly assess where there is a warm/cold bias, particularly with a gradient rather than discrete colourbar. The colourscale should reach the same value at the positive and negative ends.
Figures 7, 8: Again, it appears that the colourscales are not covering the same range for the positive and negative ends, and therefore the white colour doesn’t represent 0. This can be misleading for the interpretation and should be fixed so that the scale is the same at each end.
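One simple way to obtain the symmetric colour scale requested for Figs. 4, 7 and 8 is to derive both limits from the largest absolute value in the field. The helper below is a hypothetical sketch with made-up bias values, not the authors' plotting code.

```python
def symmetric_limits(values):
    """Return (vmin, vmax) equal in magnitude, so the midpoint of the scale is 0."""
    bound = max(abs(v) for v in values)
    return -bound, bound


# Hypothetical temperature biases (degC) at four grid points.
bias = [-0.4, 0.1, 1.8, -1.1]
vmin, vmax = symmetric_limits(bias)

# With matplotlib, these limits could be passed together with a diverging
# colormap, e.g. pcolormesh(..., cmap="RdBu_r", vmin=vmin, vmax=vmax),
# so that white falls at 0 with red for positive and blue for negative values.
```

Using the same limits across all panels of a figure would also make the panels directly comparable.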
Citation: https://doi.org/10.5194/nhess-2023-144-RC2
AC1: 'Reply on RC2', Cedric Gacial Ngoungue Langue, 17 Dec 2023
The comment was uploaded in the form of a supplement: https://nhess.copernicus.org/preprints/nhess-2023-144/nhess-2023-144-AC1-supplement.pdf
Status: closed
-
RC1: 'Comment on nhess-2023-144', Anonymous Referee #1, 15 Sep 2023
The comment was uploaded in the form of a supplement: https://nhess.copernicus.org/preprints/nhess-2023-144/nhess-2023-144-RC1-supplement.pdf
-
AC2: 'Reply on RC1', Cedric Gacial Ngoungue Langue, 17 Dec 2023
The comment was uploaded in the form of a supplement: https://nhess.copernicus.org/preprints/nhess-2023-144/nhess-2023-144-AC2-supplement.pdf
-
AC2: 'Reply on RC1', Cedric Gacial Ngoungue Langue, 17 Dec 2023
-
RC2: 'Comment on nhess-2023-144', Anonymous Referee #2, 22 Sep 2023
Review of nhess-2023-144: Subseasonal-to-seasonal forecasts of Heat waves in West African cities
Review overview
The concept of this study is certainly of interest, and evaluation of extreme events such as heatwaves, which are increasing in frequency, intensity and duration, is important at a range of timescales. I read this paper with interest. This study examines the extended-range/subseasonal timescale, which can be useful for providing early indications of potentially hazardous events, ahead of more detailed forecasts that shorter-range forecasting systems are capable of predicting.
While I find the concept useful and interesting, and I like the range of skill scores used, I did find that several aspects of the methodology are not described clearly and there are some questions around the datasests and methodology used. Much more clarification is required around the forecasts used for this evaluation, and discussion of the potential drawbacks. The authors consistently refer to ‘subseasonal to seasonal’, which, while catchy, is not completely covered here, as the seasonal timeframe is not included. The descriptions should probably be changed to subseasonal throughout. I also had some questions around the identification of heatwaves and the thresholds used, and the method of dealing with ensemble members. The abstract and introduction mention both wet and dry heatwaves, and daytime and nighttime, but these distinctions are not clearly defined and discussed, and appear to be mostly lacking from the rest of the paper and the results and conclusions.
While the research questions and results are interesting, I feel that the structure of the writing could be significantly improved throughout the paper, as it is currently challenging in places to follow the work, and to fully understand the somewhat contrasting conclusions. For a decision-maker, what are the takeaways to help understand how these forecasts could/could not be used in heatwave forecasting and anticipatory action?
I have provided some more specific comments on the text and some of the figures below. I hope these can be useful as the authors consider the revisions and next stages of the manuscript.
Detailed comments
Abstract
- Line 13: Short-term forecasts typically refers to those of <4 days – 2 weeks lead time would typically be classed as medium-range forecasting
- Line 15: Fail is a strong word, without context?
Introduction
- Line 30-35: It could be worth mentioning that often, national meteorological services have a definition of a heatwave used to provide warnings? (unless it is not the case in the study region, but otherwise, there is also a WMO recommended heatwave definition (https://www.un-spider.org/category/disaster-type/extreme-temperature). Same comment at line 41, research paper authors are not the only ones / the authoritative ones to define heatwaves, particularly in a forecasting perspective. Is there a definition used most often by the forecasting services based in the study region?
- Line 39: ‘min, min or max’ – is there a typo here? Min seems repeated
- Line 39: heat stress indices are mentioned, but not really defined anywhere? (check and come back to) – it may be useful to define here what a heat stress index is and how it differs from the other metrics listed
- Line 45: It is of course of crucial importance for early warning systems to provide information on the occurrence of heatwaves. However, early warning systems are not usually done using seasonal weather forecast models, which often lack the skill and resolution to accurately predict individual extreme events. Typically, an early warning system would refer to a shorter/medium-range lead time, supplemented with advanced information on the potential for hazardous weather using S2S forecasts. The authors go on to make this point about seasonal forecasts providing early indications, which I completely agree with, but early warning systems require a range of lead times, including shorter timescales to account for the fact that forecasts get much more accurate at shorter lead times.
- Line 49-50: citation?
- Line 51-xx: I saw that Vitart (et al) also studied the Pacific Northwest heatwave of 2021, considering the ECMWF subseasonal forecasts and a more recent version of the ECMWF model, and 9 other S2S models. This may be of interest for the authors to include, as it uses a more recent model version than the Russian heatwave studies. https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2021GL097036 Other authors also examined other timeranges of the forecasts for this heatwave.
- Line 114: ERA5 is used to initialise the extended-range reforecasts, but not all of the different aspects of ECMWF’s IFS
Section 2
- Section 2.1: it could be interesting in this section to include some description of heatwaves and their impacts in this region – what have been the impacts of significant past heatwaves? What kinds of temperatures are reached? During what season do impactful heatwaves occur?
- Section 2.1: this section refers to the region having a short wet season followed by a long dry season. But the results later are split into winter/spring/summer/autumn – this should be further explained and justified.
- Section 2.2: A brief discussion on the potential disadvantages of reanalysis datasets might be warranted, if not included later (e.g. they may not always have the resolution to be able to pick up the very highest temperatures during heatwaves)
- Section 2.2.2: The assessment of dry and wet heatwaves seems like it should be a separate section, as at the moment it seems to be that it is only applied to MERRA, as it sits under that subtitle. It is also not clear how the heatwave identification described later takes into account both wet and dry heatwaves, nor is this clear in the results. Wet and dry heatwaves should also be defined (do they refer to the aforementioned wet and dry seasons? to the humidity experienced during a heatwave? Or otherwise?)
-
- Section 2.3.1 ECMWF forecasts: there are some errors in this section, and I find some of the description unclear. This isn’t helped by the fact that ECMWF very recently upgraded their models so some of this information no longer stands. It may be useful to revisit the description in this section for clarity. More details below:
- The ECMWF IFS has several separate forecasting systems (medium-range (now high-resolution), extended-range, seasonal), and it would be useful to specify which is being used and described here, as other parts of the system have different resolutions, lead times and ensemble members.
- ECMWF provide both extended-range (up to 46 days) and seasonal (up to 6 months) forecasts to the S2S programme. I understand that the authors are using the extended-range forecasts, and it may be useful to refer to the forecasts as this throughout.
- ‘ECMWF ENS’ is often used to refer to the medium-range (up to 15 days and now high-resolution) ensemble, and so could cause confusion – these are not the same exact forecasting system as the extended-range (at least not any more).
- The authors may wish to specify that they are interpolating to a 0.25° grid to match the resolution of ERA5 for evaluation (I presume), with the caveat that this does reduce the resolution of the native forecasts, and that higher resolutions can be beneficial for capturing extremes.
- The IFS is no longer running at CY41r2. If the authors downloaded hindcast data from forecast dates in 2021, the cycles could have been 47r1 (implemented 30 June 2020), 47r2 (implemented 11 May 2021), or 47r3 (implemented 12 October 2021). The authors should confirm which cycle(s) was used. These cycles indeed had 51 ensemble members at 18 and 36km resolution depending on the lead time, as the authors describe. The latest version of the extended-range forecasts (48r1) has 101 members, run at 36km for the full forecast range (days 0 to 46), and is run daily rather than twice a week. (https://www.ecmwf.int/en/about/media-centre/news/2023/model-upgrade-increases-skill-and-unifies-medium-range-resolutions)
- A useful description of hindcasts/reforecasts may be ‘Hindcasts are forecasts produced for past dates using the most recent version of the forecasting system, and allow analysis of how the current system would have performed, alongside a consistent dataset covering a longer time period for evaluation’, as a useful use of these data for the authors’ purposes?
- The authors also use ‘hindcasts’ and ‘reforecasts’ interchangeably. ECMWF typically call them reforecasts, and the authors should be clearer if it is themselves calling them ‘hindcasts’ throughout the study.
- I didn’t understand the explanation for not using the Thursday hindcasts, sorry
- Line 83: the authors may wish to acknowledge that when dealing with extreme events, including different extreme events in the analysis may well result in different conclusions regarding the skill.
- Section 2.3.2: Parts of the UKMO forecast description are also confusing, for example the transition from discussing 4 members to 7 members. Perhaps a table outlining key aspects of both forecasting systems, and the timeframes to which they apply, would be helpful to provide an overview of the system characteristics?
- Section 2.3.2: I believe the description of the concatenation could be simpler. Is an equation necessary, or is it enough to simply state that prior to 2016, hindcasts are used, and after that, the real-time forecasts are used, followed by the details from line 190? I am not completely convinced of the decision to reduce the number of ensemble members in this methodology, thus reducing the uncertainty representation of the forecast. It would also be useful to provide an overview of how other characteristics of the model have changed between the hindcast version and the potentially multiple operational versions used during the period of the real-time forecasts. This could be covered in the aforementioned table.
- Line 197: I believe here the authors are referring to a lack of data from local stations to evaluate the forecasts against. The sentence implies that no data is available from this region for weather forecasts to assimilate in their production – are the authors sure this is the case? Particularly since weather forecasts also use various other sources of observations beyond station data.
- Section 2.4.2:
- Are the daily maximum, daily minimum and wet bulb computed from the hourly data? Or otherwise?
- Are nighttime and daytime heatwaves considered separately, or as one continuous heatwave that does/does not provide relief overnight? This can have implications for heat stress and health, but it is not clear how it is factored into the authors’ definition of a heatwave. I think it is hinted at, but was not entirely clear to me in the definition.
- Is the 90th percentile representative of the health impact of heatwaves on humans / ecosystems? What if the 90th percentile does not reach a temperature likely to cause heat stress? Why not use a temperature or wet bulb threshold known to cause health impacts in this region?
- Line 213-214 states the 90th percentile is computed over the entire period, and then line 215 says it’s calculated for each day of the year. I am left unsure as to which of these is used (or which is used for which analysis, if both are used at different points), and this could be quite impactful for the results.
- Section 2.4.3: the description of these steps could be simplified and clarified further. The first two points may not really need to be spelled out, and the third could perhaps be simplified, but it is also not clear over which timeframe this is done. Is it done for each day of the time series? And then if the number of hot days in a row is not >=3, the value is returned to 1?
- Line 232-233 isn’t clear to me, apologies.
- Line 236-238: the reasoning behind this, and how this is applied in the methodology, isn’t clear to me. Why the minimum of the daily thresholds? Does this correspond to a value that is certain to have an impact on human health? How does this allow proper assessment of the severity? Please expand on this. This relates to a previous point about using percentile thresholds, when using set values corresponding to heat stress may be both simpler and more effective.
- Line 243: ensemble forecasting does not only account for uncertainties in the physical component of the model, but also uncertainty arising from the chaotic nature of the atmosphere, and from an imperfect observation network and therefore imperfect initial conditions of the forecast
- Line 247-249: By considering the mean, median, warmest, coolest, 1st and 3rd ensemble members, you have identified 6 ‘members’. The Met Office forecasts only have 2 or 7 members, and the ECMWF forecasts 11 members, so I am unsure why it is less challenging to use these 6 ‘members’ chosen by the authors, rather than more usefully examining the entire ensemble and therefore the full range of uncertainty it represents. It should also be considered that the mean (and quartiles, depending on how these are produced) do not represent an actual forecast scenario or physically likely state of the atmosphere produced by the model, and so caution is required both in assessing this as a forecast and in evaluating it.
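To make the questions on Sections 2.4.2 and 2.4.3 concrete, the following is a minimal sketch of the two possible readings of the 90th percentile threshold (one value for the whole period versus one value per calendar day) and of a run-length rule requiring at least three consecutive hot days. All function names, the interpolation rule for the percentile and the strict `>` comparison are my own assumptions for illustration; this is not the authors' implementation.

```python
import math
from collections import defaultdict


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def fixed_threshold(temps, q=90):
    """Reading 1 (assumed): a single threshold computed over the entire period."""
    t = percentile(temps, q)
    return [t] * len(temps)


def calendar_day_threshold(temps, days_of_year, q=90):
    """Reading 2 (assumed): one threshold per calendar day, pooled across years."""
    by_day = defaultdict(list)
    for t, d in zip(temps, days_of_year):
        by_day[d].append(t)
    per_day = {d: percentile(v, q) for d, v in by_day.items()}
    return [per_day[d] for d in days_of_year]


def heat_wave_days(temps, thresholds, min_len=3):
    """Flag days that belong to a run of at least min_len consecutive hot days."""
    hot = [t > thr for t, thr in zip(temps, thresholds)]
    flags = [False] * len(hot)
    start = None
    for i, is_hot in enumerate(hot + [False]):  # sentinel closes a trailing run
        if is_hot and start is None:
            start = i
        elif not is_hot and start is not None:
            if i - start >= min_len:
                for j in range(start, i):
                    flags[j] = True
            start = None
    return flags


# Toy series: only the three-day run (indices 1 to 3) qualifies as a heat wave.
toy_temps = [30, 36, 37, 38, 30, 36, 36, 30]
toy_flags = heat_wave_days(toy_temps, [35] * len(toy_temps))
```

Because the two threshold functions can flag very different days outside the hottest season, stating which one is used for which analysis would remove the ambiguity.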
Section 3
- Some paragraph breaks would be helpful for readability in sections 3, 4 and 5.
- Section 3.1: the use of ‘hot bias’ and ‘cold bias’ is quite strong wording, as opposed to positive and negative. How large are the biases? It is not mentioned in the text, but some of these ‘hot’ biases may only be a small fraction of a degree, so ‘hot’ might not be the most appropriate choice of wording.
- Given that the authors state that the results comparing to MERRA are significantly different to those using ERA5, I am surprised not to see some figures included in the main text. How does this discrepancy impact the evaluation results, if the two verification datasets are so different?
- Line 321: the plots are shown in °C, but the text uses K – why refer to it differently between the texts and figures?
- Section 3.2: Why are the Tw results only shown in supplementary material if they make up an important part of the research question and results?
- Section 3.3.2: Why is the mean duration the sum divided by the number of affected years, rather than divided by the number of heatwaves? (what if there is more than one heatwave per year?)
- Line 375: can the authors comment on the representation of convection in both models?
- Section 3.4: could the authors explain further the reasoning behind the 20%, 40% and 60% percentile thresholds? I did not follow the aim and reasoning here. The text and Figure 10 seem to refer to a section of the methods that I was unable to find. Perhaps it refers to the last sentences of section 2.4.4, but I did not follow the link, and further explanation may be required.
- Regarding seasons, are there seasons where there may technically be heatwaves as the temperature exceeds the 90th percentile for the time of year, but they would not cause heat stress or health impacts? Should these be considered in the same way as those during other seasons? Why are winter/spring/summer/autumn used if the region experiences two seasons (dry/wet) – how do these correspond? Some context regarding heatwaves themselves and the temperatures reached and impacts in this region could provide interesting further insight (for example in section 2.1 this could be added).
- From a decision-making perspective, it would be interesting to understand how far in advance these forecasting systems may be able to provide a useful prediction/indication of a heatwave. The results are interesting from a modelling perspective, but I finish reading the results section feeling that I would not really have a confident answer to this question. Could the discussion be expanded to consider the results in this context?
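As an illustration of the Section 3.3.2 question on mean duration, here is a minimal sketch contrasting the two possible denominators. The event list and function names are hypothetical and only meant to show how the two definitions diverge when a year contains more than one heat wave.

```python
def mean_duration_per_affected_year(events):
    """events: list of (year, duration_in_days); divide by years with >= 1 event."""
    if not events:
        raise ValueError("no heat wave events")
    affected_years = {year for year, _ in events}
    return sum(duration for _, duration in events) / len(affected_years)


def mean_duration_per_event(events):
    """Same total duration, but divided by the number of heat waves."""
    if not events:
        raise ValueError("no heat wave events")
    return sum(duration for _, duration in events) / len(events)


# Hypothetical record: two heat waves in 2005, one in 2010.
events = [(2005, 3), (2005, 5), (2010, 4)]
per_year = mean_duration_per_affected_year(events)   # 12 days / 2 years = 6.0
per_event = mean_duration_per_event(events)          # 12 days / 3 events = 4.0
```

With the per-year denominator the value reads more like a mean number of heat wave days per affected year than a mean event duration, which is why the definition should be stated explicitly.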
Section 4
- Line 456-460: the names of the convection schemes unfortunately do not mean much to me – what are the key differences and the implications?
- Line 461: could the authors expand on ‘the data and initial conditions are completely different’ ?
- Line 465: did the authors not reduce the resolution of both forecasts? What impact could this have? Particularly on the discussion of all results relating to the spatial variability and the intensity
- Overall, I find the discussion section raises some interesting points, but does not really expand on why or how they influence the results
Section 5
- Line 484-485: it was not clear where the key results were that make any distinction between daytime and nighttime heatwaves and how this was handled in the methodology. An interesting aspect of heatwaves is the drop in temperature overnight, and whether this provides any relief from the daytime heat stress, but this doesn’t factor into the discussion at all. What do the authors consider as a nighttime heatwave, one that only occurs at night and not also in the day? It is a little confusing, and more context and insights could probably be included.
- On a similar note, the difference between a wet and a dry heatwave is not clearly defined, other than through the use of different variables. These terms are only really used in the introduction and conclusions, but the link to the results is missing and the methods are not entirely clear.
- Line 491: what is counted as a failure to predict the intensity? At what lead time? This is a very broad statement. It implies the forecasts are not useful at all – do the authors conclude in this paper that extended-range forecasts are not useful for predicting heatwaves? Can they be used or interpreted at all to complement short-range forecasts? Can some information be provided on the skill of short-range forecasts (could be from other studies), to provide context? How do the authors tie in these results, with the earlier statements that based on some skill scores, the models can detect extreme events up to 5 weeks ahead? Detect in what sense?
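For readers weighing the question of what ‘detect’ means, the categorical metrics named in the abstract (hit-rate, FAR and Gilbert score) can be computed from a 2x2 contingency table of forecast versus observed heat wave days. The sketch below uses the standard textbook definitions; the function name, the binary input format and the toy data are assumptions of mine, and the authors' exact implementation may differ.

```python
def contingency_scores(forecast, observed):
    """Return (hit_rate, far, gilbert) from two equal-length binary sequences."""
    pairs = list(zip(forecast, observed))
    n = len(pairs)
    a = sum(1 for f, o in pairs if f and o)          # hits
    b = sum(1 for f, o in pairs if f and not o)      # false alarms
    c = sum(1 for f, o in pairs if not f and o)      # misses
    nan = float("nan")
    hit_rate = a / (a + c) if (a + c) else nan       # fraction of observed events forecast
    far = b / (a + b) if (a + b) else nan            # fraction of forecast events that failed
    a_random = (a + b) * (a + c) / n if n else nan   # hits expected by chance
    denom = a + b + c - a_random
    gilbert = (a - a_random) / denom if n and denom else nan
    return hit_rate, far, gilbert


# Toy example: 10 days, 3 forecast events, 3 observed events, 2 of them matched.
fc = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
ob = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0]
hit_rate, far, gilbert = contingency_scores(fc, ob)
```

Reporting these scores per lead time, alongside the same scores for a short-range forecast, would help a reader judge how far ahead the forecasts are useful.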
Figures
Fig 3: The use of (a) and (b) for both the upper and lower panels and the individual panels is a little confusing at first. Perhaps consider (i) and (ii) for the panels? (or just upper and lower?), or split this into two figures.
Fig. 4: The colourscale here is misleading – it should be adjusted so that the white colour falls at 0, with positive values in red and negative values in blue, otherwise it is very challenging to properly assess where there is a warm/cold bias, particularly with a gradient rather than a discrete colourbar. The colourscale should reach the same value at the positive and negative ends.
Figures 7, 8: Again, it appears that the colourscales are not covering the same range for the positive and negative ends, and therefore the white colour doesn’t represent 0. This can be misleading for the interpretation and should be fixed so that the scale is the same at each end.
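One simple way to obtain the symmetric scale requested for Figures 4, 7 and 8 is sketched below, assuming the bias field is available as a flat list of values. The helper names are hypothetical. The resulting limits could then be passed to the plotting routine, for example as `vmin` and `vmax` in matplotlib together with a diverging colormap such as `RdBu_r`, so that white sits at 0.

```python
def symmetric_limits(values):
    """Return (-m, m), where m is the largest absolute value in the field."""
    m = max(abs(v) for v in values)
    return -m, m


def diverging_levels(limit, n_per_side):
    """Evenly spaced discrete colourbar boundaries, symmetric about 0."""
    return [limit * k / n_per_side for k in range(-n_per_side, n_per_side + 1)]


# Hypothetical bias values: a skewed field still gets a scale centred on 0.
bias = [-0.4, 0.1, 2.5]
vmin, vmax = symmetric_limits(bias)   # (-2.5, 2.5)
levels = diverging_levels(2.0, 4)     # -2.0, -1.5, ..., 0.0, ..., 1.5, 2.0
```

Using discrete levels rather than a continuous gradient would also make small positive and negative biases easier to tell apart.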
Citation: https://doi.org/10.5194/nhess-2023-144-RC2
AC1: 'Reply on RC2', Cedric Gacial Ngoungue Langue, 17 Dec 2023
The comment was uploaded in the form of a supplement: https://nhess.copernicus.org/preprints/nhess-2023-144/nhess-2023-144-AC1-supplement.pdf