Data impact studies with the AROME-WMED reanalysis of the HyMeX SOP1

This paper presents the results of several observing system experiments (OSEs) performed with AROME-WMED. This model is the HyMeX (Hydrological cycle in the Mediterranean Experiment) dedicated version (Fourrié et al., 2019) of the French operational meso-scale model AROME. The second and final reanalysis assimilated most of the available data for the 2-month period corresponding to the first Special Observation Period (SOP1) of HyMeX. In order to assess the impact of assimilating various observation data sets on the forecasts, several OSEs, also called denial experiments, were carried out. In this study, the impact of a dense reprocessed network of high-quality Global Navigation Satellite System (GNSS) Zenithal Total Delay (ZTD) observations, reprocessed wind profilers, lidar-derived vertical profiles of humidity (ground-based and airborne) and Spanish radar data is discussed. Among the evaluated observations, the ground-based GNSS ZTD data set is found to provide the largest impact on the analyses and the forecasts, as it is an evenly spread and frequent data set providing information at each analysis time over the AROME-WMED domain. The reprocessing of GNSS ZTD data also improves the forecast quality, but this impact is not statistically significant. The assimilation of the Spanish radar data improves the very short-term forecast quality as well as the short-term forecasts, but this impact remains located over Spain. A marginal impact from wind profilers was observed on wind background quality. No impact was found for lidar data, as they represent a very small data set.

on the impact of other observation types (i.e., wind profilers, lidars and Spanish radars). Section 6 focuses on the impact of all these data on the IOP 16a case study. Finally, conclusions are given in Section 7.

Sensitivity study description and validation methodology

AROME-WMED configuration
The different AROME-WMED model configurations are described in Fourrié et al. (2015, 2019) and rely on the operational limited area model AROME (Seity et al., 2011; Brousseau et al., 2016), which has been running at Météo-France since 2008. At the time of the SOP1 campaign, analyses were performed at 2.5 km horizontal resolution every 3 hours with a three dimensional [...]

Table 1 presents the distribution of assimilated data as a function of observation types. Satellite data represent the majority of observations. This can be explained by the fact that the IASI sensor provides 44 channels per observation point. Surface observations provide 15.21% of assimilated data. Aircraft and radiosondes give similar amounts of data (around 8%). GNSS ZTD represent 1.85% of the total and wind profilers 1.17%. Special efforts were made to assimilate non-operational data types such as lidar water vapour profiles and Spanish radar data. Humidity data from lidars contribute very little, with 0.12% of assimilated data. Radar data represent 11.88% of the total amount of assimilated data, and Spanish ones only 0.6%.

Observing System Experiment Description
Special efforts were made to assimilate non-operational observation data. To study the contribution of the observations to the analysis and forecast quality of the heavy precipitating events of the SOP1, denial experiments have been devised. These experiments consist of removing one observation data set and comparing the forecast quality with and without assimilating this data set. Here, four denial experiments were conducted on the following observation types: the ground-based GNSS ZTD, the wind profilers, the lidar humidity profiles and the Spanish radars. The location of these observations is shown in Figure 1. Table 2 summarizes the names of the denial experiments and the observations considered. The largest difference in terms of number of observations is obtained with NOGNSS, which leads to a 1.85% difference in the number of assimilated data.

GNSS Zenithal Total Delay
We considered here data reprocessed homogeneously using a single software and more precise satellite orbit positions and clocks (Bock et al., 2016), which were available for the whole SOP1. Additional data were also considered compared to the operational, real-time data set. NOGNSS is the experiment without the dense reprocessed GNSS network. Another experiment, called OPERGNSS, without the reprocessed data but with the "operational" GNSS ZTD data assimilated in the real-time AROME-WMED version, was also performed to test the impact brought by the reprocessing of the data and by the additional GNSS data.

Wind profilers
These data were available for the whole SOP1 and have been reprocessed after the SOP1 with an improved quality control. Here, observations from 8 wind radars (UHF and VHF) were considered. They are mainly located in the south of France, in Corsica and in Menorca (Figure 1). The experiment without wind profilers is called NOWPROF.

Humidity profiles from Lidars
The experiment without lidar data is called NOLIDAR. During SOP1, ground-based and airborne lidars were operated. The mobile Water vapour and Aerosol Raman LIdar (WALI; Chazette et al., 2016) operates with an emitted wavelength of 354.7 nm. This instrument was operated at a site close to Ciutadella (western part of Menorca, at 39°59'07" N and 3°50'13" E). Mixing ratio profiles were delivered with a resolution of 15 m for the 0-6000 m altitude range. A detailed description of this

Validation protocol
The evaluation of the various denial experiments is made against the reference REANA run, which includes operational and research observations in its 3D-Var assimilation; all assimilated data are described in a companion paper (Fourrié et al., 2019).
As a first step (Analysis and First-Guess sub-section), the performance of the data assimilation system is evaluated by comparing the various Analysis (AN) and First-Guess (FG) values against the assimilated observations (operational data assimilation monitoring procedure); statistics of departure from observations (mean and Root Mean Square (RMS) error) are computed at the assimilated observation locations. Those statistics were also computed using the few available independent data.
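The departure statistics described above can be illustrated with a minimal sketch. It assumes that departures are defined as observation minus model equivalent (the sign convention is not stated here), and the function name `departure_stats` and the ZTD values are hypothetical, chosen for illustration only.

```python
import math


def departure_stats(observations, model_equivalents):
    """Return (mean, RMS) of observation-minus-model departures.

    Assumption: departure = observation - model equivalent (FG or AN),
    with both sequences already collocated at the observation locations.
    """
    if len(observations) != len(model_equivalents) or not observations:
        raise ValueError("need two non-empty sequences of equal length")
    departures = [o - m for o, m in zip(observations, model_equivalents)]
    n = len(departures)
    mean = sum(departures) / n
    rms = math.sqrt(sum(d * d for d in departures) / n)
    return mean, rms


# Hypothetical ZTD values in metres, for illustration only.
obs = [2.40, 2.42, 2.38, 2.45]
first_guess = [2.41, 2.40, 2.39, 2.43]
bias, rms = departure_stats(obs, first_guess)
print(f"mean departure = {bias:.4f} m, RMS departure = {rms:.4f} m")
```

Running the same function on the analysis values instead of the first guess gives the AN statistics, so that the FG and AN departures of each experiment can be compared on identical observation samples.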
The first source comes from the vessel Marfret-Niolon, which was an instrumented commercial ship of opportunity, cruising regularly between the southern France harbour of Marseille and two Algerian harbours (Algiers and Mostaganem). Please refer to Figure [...] Canada (Kouba and Héroux, 2001) and using high-resolution products provided by the International GNSS Service. The second source of independent data comes from wind data obtained from an airborne Doppler cloud-profiler radar named RASTA (Radar Airborne System Tool for Atmosphere; Bouniol et al., 2008; Protat et al., 2009; Delanoë et al., 2013) that flew 45 days [...]

In a second step, the forecast quality (ranges from +3 to +54 hours) is assessed in terms of surface parameters and precipitation scores. The surface parameters (temperature and relative humidity at 2 m and wind at 10 m) come from the HyMeX database, which provides surface synoptic observations available over the AROME-WMED domain, together with additional hourly observations from the Météo-France, AEMET and MeteoCat mesoscale networks. Some of these observations were assimilated to produce surface analyses. For the evaluation of the precipitation quality, the dense rain gauge network available in the HyMeX database (V4 version) has been used. The 3-hourly accumulated precipitation from all analysis times on a given day is summed and compared to the corresponding observed 24-h accumulated precipitation.
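As a small illustration of the last step, the sketch below sums the 3-h accumulations of one day into a 24-h total that can be compared with a rain gauge value. The function name `daily_total_from_3h`, the assumption of eight forecasts per day started at 00, 03, ..., 21 UTC, and all numbers are hypothetical.

```python
def daily_total_from_3h(three_hourly_mm):
    """Sum eight 3-h precipitation accumulations (mm) into a 24-h total."""
    if len(three_hourly_mm) != 8:
        raise ValueError("expected eight 3-h accumulations for one day")
    return sum(three_hourly_mm)


# Hypothetical 3-h accumulations (mm) from the forecasts started at the
# eight analysis times of one day (assumed here to be 00, 03, ..., 21 UTC).
forecast_3h = [0.0, 1.5, 4.0, 12.5, 8.0, 2.0, 0.5, 0.0]
forecast_24h = daily_total_from_3h(forecast_3h)

# Hypothetical 24-h rain gauge total for the same day and location.
observed_24h = 31.0
print(forecast_24h, observed_24h)
```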

Impact of GNSS data on the analysis and first-guess quality
This section investigates the impact of assimilating the ground-based GNSS ZTD data on the numerical weather prediction model analysis and subsequent forecast quality. This data set is the largest of the denied data sets in terms of total amount, even though it represents a small fraction of assimilated data (1.85%). One of the key tools used to evaluate the performance of the assimilation

Impact on moisture field
Comparison to the Integrated Water Vapour (IWV) from the reprocessed GNSS observations (not independent from REANA, as the information from this data set is assimilated in this experiment) indicates that the best correlation, as expected, is obtained for REANA (around 0.99), the second one being OPERGNSS (around 0.975) and the last one NOGNSS (around 0.96), as shown in Figure 3. This result is confirmed when computing the RMS of the differences. A weak diurnal cycle of the scores is noticed, with a maximum correlation around 09 UTC and a minimum around 15 UTC. The standard deviations of the differences are lower during the 3-9 UTC period and larger in the afternoon.

We then discuss the statistics for the analysis and first guess against radiosonde observations, which represents [...]

Radiances from SEVIRI (on board the geostationary satellite Meteosat Second Generation (MSG)) that are sensitive to moisture (channels WV 6.2 µm for the upper troposphere and 7.3 µm for the mid-troposphere) are assimilated in AROME. They are an important source of humidity information, especially over oceans where no information from GNSS or radiosondes is available. Basically no difference between the various experiments is found in the FG and AN statistics for these observations (figure not shown).

The correlation between the various AROME-WMED ZTD analyses and the corresponding independent (not assimilated) Marfret-Niolon observations is slightly and consistently higher for REANA than for NOGNSS and even for OPERGNSS (Figure 5). There is a correlation maximum around 09 UTC and a minimum around 15 UTC. The mean ZTD is quite similar in all experiments, with a maximum at 09 UTC and a minimum around 00 UTC. A moist bias is found in all simulations when compared to the mean observation shown in grey in Figure 5. The magnitude of this relative positive (moist) bias is around 0.5 percent. Although the sample size of the Marfret-Niolon data set is rather small (around 1000 collocations), this is an original result and indicates that the REANA experiment produces the best reanalysis and the best 3-hour forecasts.

[...] Figure 7, where only few wind data from conventional observations are available.
It is worth recalling that the data from this instrument were not assimilated in REANA. This data set thus provides additional independent information for the evaluation of our denial experiments. Table 3 provides the root mean square errors (RMSE) for wind calculated with these data. The RMSE values for background and analysis are lower in REANA than in the other two experiments. The analysis RMSE for OPERGNSS is lower than that for NOGNSS.

[...] for NOGNSS in red and OPERGNSS in blue. Dots indicate that the difference between the curves and the REANA curve, taken as a reference, is statistically significant at a 0.95 confidence threshold using a bootstrap test.

Figure 8 shows that the Equitable Threat Score (ETS) of the 24-h accumulated precipitation, computed with the sum of the 3-h precipitation from the 8 analysis times, is improved with the assimilation of GNSS ZTD data compared to the NOGNSS experiment. This score represents an evaluation of the background quality. The difference is statistically significant for each threshold.
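For readers less familiar with the ETS, a minimal sketch of the standard contingency-table definition is given below. It is not taken from the verification code used in this study: the function name `equitable_threat_score`, the use of `>=` to define an event, and the rainfall values are assumptions made for illustration.

```python
def equitable_threat_score(forecast_mm, observed_mm, threshold_mm):
    """ETS for one precipitation threshold (standard definition).

    ETS = (hits - hits_random) / (hits + misses + false_alarms - hits_random)
    with hits_random = (hits + misses) * (hits + false_alarms) / n.
    Assumption: an event occurs when the amount is >= threshold_mm.
    """
    if len(forecast_mm) != len(observed_mm) or not forecast_mm:
        raise ValueError("need two non-empty sequences of equal length")
    hits = misses = false_alarms = 0
    for fc, ob in zip(forecast_mm, observed_mm):
        fc_event, ob_event = fc >= threshold_mm, ob >= threshold_mm
        if fc_event and ob_event:
            hits += 1
        elif ob_event:
            misses += 1
        elif fc_event:
            false_alarms += 1
    n = len(forecast_mm)
    hits_random = (hits + misses) * (hits + false_alarms) / n
    denominator = hits + misses + false_alarms - hits_random
    # The score is undefined when the denominator vanishes
    # (for example when no event is forecast or observed).
    return (hits - hits_random) / denominator if denominator else float("nan")


# Hypothetical 24-h accumulations (mm) at eight rain gauges.
forecast = [12.0, 0.0, 25.0, 3.0, 0.0, 18.0, 1.0, 30.0]
observed = [15.0, 2.0, 8.0, 0.0, 0.0, 22.0, 11.0, 40.0]
ets_10mm = equitable_threat_score(forecast, observed, 10.0)
print(f"ETS at 10 mm/day: {ets_10mm:.3f}")
```

A perfect forecast gives an ETS of 1, while a forecast with no more hits than expected by chance gives an ETS of 0 or below, which is why the score is convenient for comparing experiments across thresholds.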
When comparing the assimilation of reprocessed data to the real-time ones, the ETS for precipitation is slightly better with the reprocessed data set, but the differences are not significant except for the 40 mm/day threshold. However, this threshold represents only a few cases. Overall, the background quality is improved with the assimilation of GNSS observations, and the data reprocessing brings an improvement in terms of precipitation from the 3-hour forecasts, even though this benefit is not significant.

https://doi.org/10.5194/nhess-2020-153 Preprint. Discussion started: 25 May 2020. © Author(s) 2020. CC BY 4.0 License.

Impact of GNSS data on medium term forecast
The impact of the GNSS data has also been assessed for longer forecast ranges (3 to 54 h). The effect of the assimilation of [...]

Compared to the observed ZTD from the Marfret-Niolon ship, the signal is noisier because of a smaller data set, but overall the correlation is lower and the standard deviations are in general higher for the NOGNSS forecasts (Figure 10).
The forecast quality has also been evaluated against surface data. No impact was found on temperature at 2 m or on 10 m wind. A small impact was found on relative humidity at 2 m (Figure 11). A reduction of the bias is noticed with REANA during the first 9 h of the forecast compared to OPERGNSS and NOGNSS. From 12 h onwards, the results for REANA and OPERGNSS are similar. The standard deviation is smaller for REANA than for NOGNSS and OPERGNSS between 0 and 9 h, and smaller than for NOGNSS between the 21-h and 27-h forecast ranges. This difference represents more than 2% of improvement. For the other forecast ranges, the differences are lower than 1%.
The impact of the assimilation of GNSS data on the 24-h accumulated precipitation from the forecast initialized at 06 UTC is less clear. The GNSS data reprocessing is beneficial compared to the real-time data set for all thresholds except 2 mm/day (where the ETS is better for OPERGNSS), and the improvement is statistically significant for large thresholds (10 and 20 mm/day, Figure 12). The difference between the REANA and NOGNSS ETS values is not significant. When examining scores for precipitation forecasts between 30 h and 54 h, there is a small significant degradation of the ETS for the 2 mm/day threshold with the NOGNSS experiment and a small improvement with OPERGNSS for the 40 mm/day threshold (Figure 13).
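The significance statements above rely on a bootstrap test at a 0.95 confidence threshold whose exact procedure is not detailed here. The sketch below shows one common variant, a paired percentile bootstrap of the ETS difference between two experiments; it is an assumption about how such a test can be built, not the procedure used for the figures. All names (`ets`, `bootstrap_ets_difference`) and all data are hypothetical.

```python
import random


def ets(forecast, observed, threshold):
    """Equitable Threat Score for one threshold (event if value >= threshold)."""
    hits = misses = false_alarms = 0
    for fc, ob in zip(forecast, observed):
        if fc >= threshold and ob >= threshold:
            hits += 1
        elif ob >= threshold:
            misses += 1
        elif fc >= threshold:
            false_alarms += 1
    hits_random = (hits + misses) * (hits + false_alarms) / len(observed)
    denominator = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denominator if denominator else float("nan")


def bootstrap_ets_difference(fc_a, fc_b, observed, threshold,
                             n_resamples=1000, confidence=0.95, seed=0):
    """Percentile interval of ETS(a) - ETS(b) from paired resampling.

    The same resampled indices are used for both experiments, so the
    pairing between forecasts and observations is preserved.
    Returns (lower, upper, significant), where significant means that
    the interval excludes zero.
    """
    rng = random.Random(seed)
    n = len(observed)
    differences = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        obs = [observed[i] for i in idx]
        score_a = ets([fc_a[i] for i in idx], obs, threshold)
        score_b = ets([fc_b[i] for i in idx], obs, threshold)
        if score_a == score_a and score_b == score_b:  # skip undefined (NaN) scores
            differences.append(score_a - score_b)
    if not differences:
        raise ValueError("ETS undefined in every resample")
    differences.sort()
    alpha = 1.0 - confidence
    lower = differences[int(alpha / 2 * len(differences))]
    upper = differences[min(len(differences) - 1,
                            int((1 - alpha / 2) * len(differences)))]
    return lower, upper, not (lower <= 0.0 <= upper)


# Hypothetical data: experiment A matches the observations, B is always wrong.
observed = [0.0, 20.0] * 50
experiment_a = list(observed)
experiment_b = [20.0, 0.0] * 50
low, high, significant = bootstrap_ets_difference(
    experiment_a, experiment_b, observed, threshold=10.0, n_resamples=200)
print(low, high, significant)
```

Resampling whole days rather than individual gauge values would be a natural refinement when errors are correlated in space, but that choice is not specified here either.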

Other impact studies
As previously mentioned, we performed other impact studies with wind profilers, lidar data and Spanish radar data.

Figure 13. Equitable Threat Score of 24-h accumulated precipitation from the 30 to 54 hour forecast range of the long forecast initialized at 00 UTC each day of the period from 5 September to 5 November 2012, computed over the AROME-WMED domain with rain gauges of the HyMeX database (version 4). Dots indicate that the difference between the curves is statistically significant.

Wind profilers
No impact of the assimilation of wind profiler data is found except on the wind field. A small impact is noticed in terms of wind RMS differences of background and analysis departures for radiosondes, aircraft and satellite winds (Figure 14). The largest impact is a decrease of 0.08 m/s for the radiosonde FG RMS differences at 300 hPa. Concerning the AN RMS differences, the improvement (SATOB) or degradation (AIREP and TEMP) is very small. The largest values, obtained at 200 hPa, are due to the small number of data available for the computation. A small but not significant improvement (Figure 15) appears on the ETS of the 24-h precipitation accumulated from the 6 to 30 hour forecast ranges.

Ground-based and airborne lidar data
As discussed in Section 2.2, humidity profiles retrieved from ground-based and airborne lidars have been assimilated in the REANA experiment. In Figure 1, the trajectories of all ATR-42 flights are plotted, together with the location of the two ground-based lidars. The results of the NOLIDAR denial experiment are close to those of the reanalysis, as these data represent very few additional data and are located over the sea, where few observations are available for the comparison. No impact of the lidar data is found when comparing the various analyzed ZTD to the corresponding Marfret-Niolon observations. These results agree

Spanish radars
No significant impact has been noticed over the HyMeX domain. However, when focusing on the scores over the Iberian Peninsula, we obtained a positive and significant impact of the assimilation of Spanish radar data on the ETS for the 24-h accumulated precipitation from the sum of the eight 3-h precipitation background forecasts (Figure 16). This impact persists at longer forecast ranges, as the ETS for the 24-h precipitation accumulation between the 6-h and 30-h forecast ranges is improved with the assimilation of Spanish radars for thresholds between 0.5 and 20 mm/24h (Figure 17). It does not persist at even longer forecast ranges (Figure 18). These results are in good agreement with the study of Wattrelot et al. (2014), which found an improvement of the short-term precipitation forecast scores. However, contrary to the aforementioned study, we obtained a significant improvement of the 24-h precipitation accumulation between the 6-h and 30-h forecast ranges over the Iberian Peninsula.

[...] Dots indicate that the difference between the curves is statistically significant.

[...] western flux is therefore found in the lower troposphere (Figure 19), with a low-level jet observed by the Candillargues radar around 09-12 UTC, associated with a slowly evolving weak pressure low (around 995 hPa) located over southern France at midday on the 26th. Moreover, low-level convergence is reinforced by the complex orography (Cévennes ridge of the Massif Central and the Alps in France), triggering convection. An upper south-westerly wind jet is observed above 500 hPa (Figure 19); in the evening of the 25th, the wind rotates to the west over the CV area, as shown by the Candillargues UHF radar.
During the 25-26 October period, many deep convective systems developed over the northwestern Mediterranean.
Although accumulated surface precipitation from Friday 26th at 18 UTC to Saturday 27th October at 06 UTC over southern France only reached around 150 mm in 24 h, very strong hourly rates (near 50 mm/h) were recorded, with intense river discharges (Ardèche, Gardons and Gapeau rivers, for example). Such intense rainfall amounts led to local flash floods and 2 casualties in the Var region. In fact, as shown in Figure 20, three local precipitation maxima appear on the observed 24-hour accumulated rainfall amount (25th October 06 UTC to 26th October 06 UTC) over the Mediterranean coastal area of France and Italy (Liguria-Tuscany region); a first elongated one in the Cévennes area (more than 150 mm, M1) and a small second one close to the coast (around 100 mm, M2).
Figure 21 shows the 24-h accumulated precipitation between the 6-h and 30-h forecasts for the different experiments considered in this study. The REANA 24-hour accumulated rainfall simulation (+06 to +30 hours forecast range) agrees with the observations for both the M1 and M2 systems. The NOLIDAR experiment is very close to REANA, which is consistent with the fact that the amount of additional lidar data is fairly small in REANA when compared to NOLIDAR. The strongest impact is found when no GNSS data are assimilated (NOGNSS run): M1 and M2 are strongly underestimated; surprisingly the OPERGNSS [...] negative impact is found with the NOWPROF simulation, which misses M2 and does not correctly reproduce M1. Over Italy, the gain brought by the observations is not so evident, but it is well known in data impact studies that the assimilation of observations does not improve the forecast at every analysis time, but rather on average over a period.

The AROME-WMED model was originally developed to study and forecast heavy-precipitating Mediterranean events during the Special Observation Periods (SOPs) of the HyMeX programme. Two reanalyses were undertaken after the HyMeX autumn campaign for the first SOP. A first one was carried out just after the campaign to provide the same model configuration over the whole SOP1 period because a version upgrade of AROME-WMED occurred during the period. A second reanalysis, performed a few years after, accounted for as many data as possible from the experimental campaign (i.e., lidar and dropsonde humidity