the Creative Commons Attribution 4.0 License.
High-Resolution Data Assimilation for Two Maritime Extreme Weather Events: A comparison between 3DVar and EnKF
Abstract. Populated coastal regions in the Mediterranean are known to be severely affected by extreme weather events. Generally, these events are initiated over maritime regions, where in-situ observations are scarce, hampering the estimation of the initial conditions and hence the forecast accuracy. To face this problem, Data Assimilation (DA) is used to improve the estimation of the initial conditions and their respective forecasts. Although comparisons between different DA methods have been performed at global scales, few studies have been performed at high resolution, focusing on extreme weather events triggered over the sea and enhanced by complex topographic regions. In this study, we investigate the role of assimilating different types of conventional and remote-sensing observations using the variational 3DVar and the ensemble-based EnKF, which are among the most common DA schemes used globally at National Weather Centers. To this aim, two different events are chosen because of both the different areas of occurrence and the triggering mechanisms. Both the 3DVar and the EnKF are used at convection-permitting scales to improve the predictability of these two high-impact coastal extreme weather episodes, which were poorly predicted by numerical weather prediction models: (a) the heavy precipitation event IOP13 and (b) the intense Mediterranean Tropical-like cyclone Qendresa. Results show that the EnKF and 3DVar perform similarly for the IOP13 event for most of the verification metrics, although looking at the ROC and AUC scores, the EnKF clearly outperforms the 3DVar. However, the ensemble mean of the EnKF is in general worse than the 3DVar for Qendresa, although some of the ensemble members of the EnKF individually outperform the 3DVar, providing information on the physics of the event and illustrating the benefits of using an ensemble-based DA scheme.
Status: final response (author comments only)
RC1: 'Comment on nhess-2024-177', Anonymous Referee #1, 01 Dec 2024
Review of "High-Resolution Data Assimilation for Two Maritime Extreme Weather Events: A comparison between 3DVar and EnKF" by Diego Saúl Carrió, Vincenzo Mazzarella, and Rossella Ferretti.
The authors aim at improving the prediction of extreme or high-impact weather events. They investigate two main questions. Which method for data assimilation is better: 3D-Var or EnKF? What is the added value of non-conventional observations (radar reflectivity and atmospheric motion vectors)?
I found this study's questions and approach interesting. However, I recommend major revisions due to a number of inconsistencies and the limited analysis of extreme precipitation. In addition, the overall length of the manuscript could be reduced.
Major comments
1) Absence of NODA in comparisons/figures
Although announced in section 6.3, the NODA is not part of any figure as far as I can see. Therefore, the reader can't be sure that the assimilation improves forecasts. It could be that both methods are not much different from NODA.
2) Inconsistency of verification measures (shown in results) and indicated aims (section 2)
The paper intends to show ... (indicated aims, L161ff)
- the improved prediction of small-scale extreme weather events
- enhanced accuracy of atmospheric conditions in the pre-convective environment
- impact of assimilating in-situ conventional and remote sensing observations
Ad 1: There is no comparison to NODA, thus no improvement is measurable. Moreover, the paper evaluates precipitation FSS(>1mm/h), RMSE(1h/6h), but does not show to what extent the observed extreme precipitation could be forecasted.
Ad 2: Where did you evaluate the pre-convective environment?
3) Inconsistency of conclusions and results
L974-975: "Similar skill" of EnKF and 3DVar in FSS (Fig8) and Taylor diagram (Fig10)
That contradicts what I see in Fig 8 and 10.
L976: "significantly improved the forecast". There was no comparison to NODA. Thus no improvement visible.
L979: "EnKF provides worst results." This is not a disadvantage of EnKF. The ensemble mean of a cyclone pressure field is as useful as the ensemble mean of a precipitation field. As it is not Gaussian, it should not be expected to perform well.
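The point about the ensemble mean of a non-Gaussian field can be illustrated with a minimal sketch. It is not taken from the manuscript: the one-dimensional transect, the Gaussian-shaped low, the 20 hPa depth, the 50 km width and the seven member positions are all assumptions chosen for illustration. Each hypothetical member contains an equally deep cyclone at a slightly different position, and averaging them yields a broad, shallow low that no member actually predicts.

```python
import numpy as np

def cyclone_pressure(x, center, depth=20.0, width=50.0, background=1010.0):
    """Idealized 1-D sea-level pressure profile (hPa) with a Gaussian-shaped low.

    All parameter values are illustrative assumptions, not values from the paper.
    """
    return background - depth * np.exp(-0.5 * ((x - center) / width) ** 2)

# Distance along a hypothetical transect through the cyclone, in km.
x = np.linspace(-500.0, 500.0, 1001)

# Hypothetical ensemble: same cyclone intensity, different positions (km).
centers = np.linspace(-150.0, 150.0, 7)
members = np.array([cyclone_pressure(x, c) for c in centers])

# Grid-point average across members, as used for an ensemble-mean field.
ensemble_mean = members.mean(axis=0)

member_minima = members.min(axis=1)   # central pressure of each member
mean_minimum = ensemble_mean.min()    # central pressure of the mean field

print(f"deepest member minimum:    {member_minima.min():.1f} hPa")
print(f"shallowest member minimum: {member_minima.max():.1f} hPa")
print(f"ensemble-mean minimum:     {mean_minimum:.1f} hPa")
```

In a setup like this, comparing the ensemble-mean central pressure with a deterministic run mostly penalizes the position spread of the members rather than the quality of the individual forecasts, so member-wise or feature-based verification may be more informative.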
Minor comments
Figure 8: The authors employ FSS of precipitation >1 mm/h for three regions (Fig 8a-c).
This score evaluates correct positioning of precipitation in forecasts, but doesn't show improved prediction of extreme events (1 mm/h is hardly extreme).
Figure 8: I don't think that the application of FSS on small areas R1 and can be very meaningful, however, it is also not wrong.
Figure 8: It is unclear what RMSE shows. Is it RMSE of ensemble mean prediction of precipitation? [mm/h]?
Figure 8: Is the FSS of EnKF computed from the ensemble mean forecast or from the whole ensemble?
Figure 10: Which observations have been used for this figure?
The paper is 31 pages. I suggest shortening the text in the interest of the reader and journal guidelines. For example, the introduction is very long and is not always on point (why are particle filters discussed?). The methods contain a revision of DA equations. I don't see how that serves the rest of the paper.
L162-172: These are general points and considering the limited set of observations for verification, such general questions cannot reasonably be answered, as you state in L174-177. Maybe you can edit L161 to include something like "we address these questions for two high-impact cases"
L162, and others: "high-resolution data assimilation": I don't see what the difference between "high-resolution 3DVar" and 3DVar is. Clearly, if applied at high resolution, any method becomes a high-resolution method. Is there any more to it?
L171-172: Isn't this the same as point (a) before?
Missing table: It would be useful to collect the assimilated observation types per case in a table.
L940: high-resolution DA techniques
Well, plain-vanilla 3D-Var is not a high-resolution DA technique, especially without hydrometeor control variables.
Figure 6: The figure is split over two pages. It would be good if it were not separated from the caption. Labels a-f could be replaced by SYN, CNTRL, NODA, if possible.
L566: The year should be 2012 not 2021.
Finally, I would suggest to mention the opportunities from satellite data assimilation for convective-scale forecasting. Future studies could benefit greatly from that.
Citation: https://doi.org/10.5194/nhess-2024-177-RC1
RC2: 'Comment on nhess-2024-177', Anonymous Referee #2, 03 Jan 2025
This study examines the role of assimilating various types of conventional and remote-sensing observations to improve the forecasting of extreme weather events in Mediterranean coastal regions, where the lack of in-situ observations hampers the accuracy of initial condition estimates. Comparing the variational 3DVar and ensemble-based EnKF data assimilation methods, the research focuses on two high-impact events: the heavy precipitation event IOP13 and the Mediterranean Tropical-like cyclone Qendresa. The results indicate that while the EnKF generally outperforms 3DVar for IOP13 in terms of probabilistic metrics, 3DVar provides better overall predictions for Qendresa, though individual EnKF ensemble members offer valuable insights into the event's dynamics, highlighting the benefits of ensemble-based approaches.
The idea behind the study presented holds potential scientific interest. However, there are several aspects in the approach that currently make it unsuitable for publication as it is. I recommend a major revision to improve the analyses conducted, refine the work structure, and address the inconsistencies currently present. Furthermore, the scientific innovation of this paper is positioned within a somewhat limited review of the existing literature on the topic; it would be beneficial to expand and update the literature in the Introduction section to provide a more comprehensive and updated context.
General comments:
- One of the main objectives of the paper is to compare the EnKF and 3DVAR approaches to determine which method provides better forecasts. However, the comparison is based on only two case studies, which is too limited a sample to adequately answer this question. If the paper were introducing a new method, using one or two example case studies might be sufficient. However, for a comparison of two existing methods, a larger number of cases is necessary to draw a robust conclusion.
- The comparison is made between the two methods but does not include the background NODA simulation. To effectively assess the added value of the two methods, especially considering that one is computationally more expensive, it is essential to include the NODA simulation in the comparison. This would allow for an evaluation of how much the simulations have improved over the background using the different data assimilation methods. Moreover, taking the first case, IOP13, as an example, a run starting on the 15th at 00:00 is used as the NODA simulation. While this may be acceptable for assessing improvement from an operational perspective, to evaluate the impact of DA, it would be better to include a simulation with the same initial and boundary conditions as the runs with DA. The same consideration applies to the second case as well.
- For the EnKF, a multi-physics approach is used, while for the 3DVar, a deterministic run is adopted. Has it been evaluated whether the ensemble multi-physics approach provides an advantage for these cases compared to the deterministic run, regardless of the data assimilation? Or, at the very least, has the behavior of the EnKF simulation with the same physical configuration as that used for the 3DVar been analyzed? The paper does not address this consideration, making it impossible to separate the advantage of using a multi-physics ensemble approach rather than a deterministic run from the benefits of the two data assimilation methods, given that the multi-physics ensemble inherently offers numerous advantages.
- In this work, operational evaluation is frequently discussed, but, for example, in the case of IOP13, the rainfall between 00 and 06 is analyzed from a simulation that starts at 00. In reality, this means that data assimilation occurs right up to the beginning of the event. In an operational context, to assimilate data up to 00:00 on the 15th, as shown in the scheme in Figure 6, it would be necessary to wait at least 15-20 minutes (or even more, depending on the observations assimilated) after 00:00 on the 15th to acquire the latest observational data before starting the run. The 3DVar is relatively fast as an approach, especially if, as in this case, the observations assimilated are not very dense over the entire domain. But at what time would the EnKF run be available? The risk with simulations structured in this way is that the forecast becomes available much later. In an operational context, which is repeatedly emphasized as a goal in the paper, evaluating the cumulative rainfall from 00 to 06 on the 15th effectively means validating rainfall that would not be a forecast but, in part, a hindcast. Indeed, in Figure A1, the hourly precipitation between 00 and 01 is evaluated. Similar considerations can be made for the initial part of the second case, which, however, being evaluated over a longer period, is less affected by this issue.
Major comments:
L 79-84: The literature review is lacking, and there are several studies on small-scale convective NWP that impact coastal regions of the Mediterranean, also with data assimilation approaches. It is recommended to expand the literature review on this topic and reconsider the misleading information in these lines.
L149-150 and 155-157: In this case as well, there is a gap in the literature review. Several studies on the Mediterranean region and Italy configure simulations, also with data assimilation and for operational purposes. It is recommended to update the literature review and position the aim of the study and its innovation in relation to the already existing works.
L161-172: Point a) The comparison with the NODA run is missing in order to address this point. Please refer to General Comment 2 concerning the NODA run that should be used to effectively assess the impact of data assimilation. Point b) It is true that assimilation starts in the pre-convective phase of the event, but it extends until the beginning of the validated time window. How does this approach differ from other similar methods that have already been published and are commonly used in operational contexts? Point c) To assess the impact of observations with high spatial and temporal resolution, why was a broader domain coverage not used, provided by other denser and more homogeneous networks available in Italy? Point d) Comparisons with the NODA run are missing.
Section 3: The study presents a potentially interesting comparison, but it is based on only two case studies and two sets of observations that differ significantly. The in-situ observation sets also appear to vary considerably between the two cases, which makes it challenging to draw a general conclusion. While it is true that observations over the sea are generally limited, a more homogeneous correction across the entire domain could still lead to improvements in the initial and boundary conditions, and, consequently, in the simulation. As it stands, the two cases seem somewhat disconnected, making it difficult to draw a broader conclusion.
L447: The period used for constructing the background is only 2 weeks, which is half of the minimum duration recommended in the WRFDA user guide (at least a 1-month dataset). This could negatively impact the comparison between 3DVar and EnKF, potentially disadvantaging the 3DVar. Have sensitivity tests been conducted using B matrices constructed from longer periods?
L503-504: A multi-physics approach for the EnKF, compared to a deterministic run with 3DVar, significantly disadvantages the deterministic simulation, regardless of the effect of data assimilation. A comparison between the ensemble and the deterministic simulation without the use of DA should be conducted to effectively evaluate the impact of the two DA methods.
L530-531 and 547-548: Why is the reference experiment not included in the validation? It is impossible to assess the impact of the assimilation without a reference simulation. Where is the NODA experiment in the analysis of the results?
L553-555: The comparison is certainly of interest. However, it is based on only two case studies, where two different sets of observations were used in the CNTRL simulation for two extreme events with completely different characteristics. This makes it challenging, if not impossible, to draw general conclusions. Typically, the run that uses high spatial and temporal resolution data, such as radar and satellite products, shows a greater improvement compared to one that only uses in-situ observations, which is almost well-established in the literature. Expanding the analysis to include more cases or a more consistent dataset could strengthen the conclusions drawn.
L612-613: As mentioned in general comment 4, if the aim is to assess the applicability to an operational framework, the simulation time required for these configurations should be included in the analysis. This would help determine whether the validation time window chosen for the results analysis is fully available as a forecast or if part of it will be in hindcast considering that in real time the assimilation should wait for the observations availability. For operational applications, it is crucial that the forecast be provided in advance of the event’s time window, ideally by at least a few hours. While this may be negligible in the case of a long-lasting event, as in the second use case presented in the work, it is fundamental for the first use case.
Figure A1: For the evaluation of hourly rainfall, only the 1-hour forecast between 00:00 and 01:00 on the 15th was considered, which corresponds to the moment when the data assimilation has just finished. It would be more appropriate to evaluate all hourly accumulations between 00:00 and 06:00 for consistency with the 6-hourly evaluation and to provide a more comprehensive view of the period during which data assimilation has an impact.
L787-789: The members with poorer performance use the Kain-Fritsch cumulus parametrization, which is the same as the deterministic simulation employed for the 3DVar. Was an attempt made with 3DVar, using the Grell-Freitas cumulus parametrization in combination with the YSU PBL scheme, which appears to be the best combination in the ensemble, to assess the impact of this difference?
Summary and Conclusions section: The conclusions of this work should be revisited in light of the previous comments and the work that still needs to be implemented.
Minor comments:
L137-138: Given the approach proposed for an operational purpose, it is recommended to include a more detailed comparison of computational efforts and timing in some section of the paper. This could be important for further evaluating the choice of using an EnKF approach compared to the deterministic 3DVar.
L174-175: If “this study is not aimed to draw any statistically significant conclusion”, how can points a, b, c, and d be addressed?
L259-266: Italy has a dense national network of both radar and in-situ stations, offering comprehensive coverage across the territory. Could you please clarify why the decision was made not to utilize all available observations, which would provide more homogeneous and dense coverage over the entire domain of interest, potentially enhancing the data assimilation process? Especially in the Qendresa use case, was there no radar available with at least some coverage over the sea?
L437-439: The observation operator used considers only the warm rain process and not all the ice microphysics species. In the case of extreme convective events, ice species are typically an important component of the cloud, as evidenced by the usual presence of lightning. Has the impact of using a warm rain approach on radar data assimilation been evaluated, especially in the 3DVar run, which has a single physical configuration due to its deterministic nature?
L467-478: The domains chosen in this work are quite small and one-way nested, which may limit the duration of the data assimilation effect. Given the operational purpose emphasized in the study, why was a single, national-scale domain not used? This approach could have also facilitated a more homogeneous comparison between the two case studies and potentially allowed for a longer data assimilation impact. Can you also justify the use of a one-way nested approach instead of a two-way one?
L564-565: Several studies have shown that the assimilation of high spatial and temporal resolution observations helps to reduce model spin-up. Could you clarify the reasoning behind the decision to use the first 6 hours of simulation without DA, rather than utilizing the DA to reduce spin-up and potentially shorten the simulation time?
L696: The threshold used for the FSS is 1mm/h. Such a low threshold is not very appropriate to assess extreme precipitation events. It would be more suitable to use progressively higher thresholds to evaluate whether the simulations accurately predict the more intense part of the event.
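The suggestion to evaluate the FSS at progressively higher thresholds can be sketched as follows. This is a minimal NumPy illustration, not the verification code used in the manuscript: the synthetic rain cell, its 8-grid-point displacement, the 5 x 5 neighborhood and the threshold list (1, 5, 10, 20 mm/h) are all assumptions. The score follows the usual definition, FSS = 1 - MSE(P_f, P_o) / (mean(P_f^2) + mean(P_o^2)), where P_f and P_o are neighborhood exceedance fractions of the forecast and the observations.

```python
import numpy as np

def neighborhood_fraction(binary, n):
    """Fraction of exceedance points in an n x n window (zero-padded edges)."""
    if n % 2 == 0:
        raise ValueError("neighborhood size n must be odd")
    pad = n // 2
    padded = np.pad(binary.astype(float), pad, mode="constant")
    # Integral image with a leading row/column of zeros: window sums
    # are obtained from four lookups per grid point.
    csum = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    window_sum = csum[n:, n:] - csum[:-n, n:] - csum[n:, :-n] + csum[:-n, :-n]
    return window_sum / (n * n)

def fss(forecast, observed, threshold, n):
    """Fractions Skill Score for one threshold and one neighborhood size."""
    pf = neighborhood_fraction(forecast >= threshold, n)
    po = neighborhood_fraction(observed >= threshold, n)
    mse = np.mean((pf - po) ** 2)
    ref = np.mean(pf ** 2) + np.mean(po ** 2)
    return float("nan") if ref == 0 else float(1.0 - mse / ref)

def rain_cell(shape, center, peak=40.0, sigma=10.0):
    """Synthetic Gaussian rain cell in mm/h (illustrative values only)."""
    yy, xx = np.indices(shape)
    r2 = (yy - center[0]) ** 2 + (xx - center[1]) ** 2
    return peak * np.exp(-r2 / (2.0 * sigma ** 2))

observed = rain_cell((100, 100), center=(50, 50))
forecast = rain_cell((100, 100), center=(50, 58))   # displaced by 8 grid points

thresholds = (1.0, 5.0, 10.0, 20.0)                 # hypothetical thresholds, mm/h
scores = {t: fss(forecast, observed, t, n=5) for t in thresholds}
for t, s in scores.items():
    print(f"threshold {t:5.1f} mm/h  FSS = {s:.2f}")
```

For a displaced cell such as this one, the score drops as the threshold rises, because the area exceeding the threshold shrinks relative to the displacement. A forecast can therefore look skilful at 1 mm/h while missing the position of the intense core.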
L716-716: The short impact of reflectivity assimilation could be due to factors such as the observation coverage of the domain, the choice of domain size, and the nesting approach. A different setup might allow for a longer impact of the assimilation.
L742-743: The EnKF can benefit from the multi-physics approach when assimilating observations that directly impact the model's microphysical species. For the deterministic 3DVar, was the physical configuration selected the one that performed best for the specific case study?
L748-750: It would be valuable to include an evaluation and comparison with the ensemble run that uses the same physical setup as the 3DVar.
Figure 8: It would be better if RMSE and FSS were applied to the same time window.
L775-776: It would be preferable if the diagrams covered the same time window as the FSS and RMSE.
Citation: https://doi.org/10.5194/nhess-2024-177-RC2