Investigating 3D and 4D Variational Rapid-Update-Cycling Assimilation of Weather Radar Reflectivity for a Flash Flood Event in Central Italy

The precipitation forecast over the Mediterranean basin is still a challenge because of the complex orographic region which amplifies the need for local observation to correctly initialize the forecast. In this context the data assimilation techniques 10 play a key role in improving the initial conditions and consequently the timing and position of precipitation pattern. For the first time, the ability of a cycling 4D-Var to reproduce a severe weather event in central Italy, as well as to provide a comparison with the largely used cycling 3D-Var, is evaluated in this study. The radar reflectivity measured by the Italian ground radar network is assimilated in the WRF model to simulate an event occurred on May 3, 2018 in central Italy. In order to evaluate the impact of data assimilation, several simulations are objectively compared by means of a Fraction Skill Score (FSS), which 15 is calculated for several threshold values, and a Receiver Operating Characteristic (ROC) curve. The results suggest that both assimilation methods in cycling mode improve the 1, 3 and 6-hourly quantitative precipitation estimation. More specifically, the cycling 4D-Var with a warm start initialization shows the highest FSS values in the first hours of simulation both with light and heavy precipitation. Finally, the ROC curve confirms the benefit of 4D-Var: the area under the curve is 0.91 compared to the 0.88 of control experiment without data assimilation. 20


Introduction
The precipitation forecast is a key variable in Numerical Weather Prediction (NWP) especially because of the interest of civil protection agencies, business sectors as well as common people to plan their daily activities. Recently, the development of more accurate parametrization of physical processes allowed a significant progress in NWP at high resolution but the prediction of exact location, timing and intensity of a convective event is still a challenge. 25 The Mediterranean basin is prone to flash flood and heavy precipitation events. One of the most relevant peculiarities of this area is the presence of mountain ranges in proximity of coastal area that lift the airflow, favouring the condensation and the triggering of convection. The Italian territory, which is characterized by a complex orography with two relevant mountain chains (Alpes and Apennines) and steep, urbanized, small catchments, is one of the most expose Mediterranean area to the https://doi.org/10.5194/nhess-2020-406 Preprint. Discussion started: 6 January 2021 c Author(s) 2021. CC BY 4.0 License. I think that the most important aspect of the precipitation forecast is the hydrological cycle and not the interest of Civil protection agencies.
citations here.
areas exposed?
hydrogeological risks. About 90% of Italian municipalities are susceptible to floods which have caused 466 deaths between 30 1990 and 2006 alone and over 19000 million of Euro of damages (Llasat et al., 2010).
The Mediterranean region has also been identified as hotspot for climate change because of its high sensitivity to Global Greenhouse Gas (GHG) concentrations (Giorgi, 2006). The heat stress growing, expected in future years, will contribute to a temperature rise, as well as to increase the water vapour amount in the atmosphere. These factors will lead to an intensification of intense and extreme precipitation events (Willet et al., 2008;Giorgi, 2011). In this context, an accurate prediction of rainfall 35 will be crucial to prevent economic and social damages.
The precipitation forecast is strongly dependent on the initial state, that dominates the evolution of the prognostic variables and consequently the development of precipitation. Hence, small errors in the initial conditions produce a significant shift in the positioning and intensity of convective events. Nowadays, the large availability of high frequency (both in space and time) meteorological data, remote sensing observations and in situ measurements, has encouraged many operational centres to use 40 data assimilation techniques for improving the accuracy of initial state. More specifically, the assimilation of ground radar reflectivity and radial velocity with three-dimensional variational (3D-Var) method proves good results in terms of quantitative precipitation forecast (QPF) for several case study in United States and Korea (Xiao and Sun, 2007;Lee et al., 2010;Ha et al., 2011). The assimilation of radar data with Weather Research Forecast (WRF) 3D-Var confirms positive results also in Europe by using WRF model, for the simulation of flash flood events in central (Maiello et al., 2014; and northern Italy (Lagasio 45 et al., 2019), as well as by using Advanced Regional Prediction System (ARPS) and Application of Research to Operations at Mesoscale (AROME) models for two heavy rainfall cases in Croatia and France (Stanesic and Brewster, 2016;Caumont et al. 2009), respectively. Recently, a few works show a positive impact in the prediction of intense precipitations using fourdimensional variational (4D-Var) with radar data and conventional observations if compared to 3D-Var for the simulation of a cyclonic event in Antarctic region (Chu et al., 2013) and a squall line over US Great Plains (Sun and Wang, 2013), 50 respectively. In Europe, the 4D-Var method proves good performance, improving the QFP and reducing the precipitation spinup time (Mazzarella et al., 2017;. But the 4D-Var, due to the high computational cost, is scarcely applied except in big operational centres. The low predictability and the high spatial and temporal variability of convective precipitation pattern requires a rapid update of initial state (analysis) to reduce the errors and to ensure a well balancing and physically consistent initial conditions. Newly, 55 significant efforts have been made by scientific community to improve the temporal and spatial resolution of satellite and ground radar data. This has enabled the first experiments with an update frequency equal or less than 3 hours. In this context, several weather centres have adopted a cycling assimilation with a 3-hourly update frequency by using 3D-Var with promising results (Ballard et al., 2012;Gao et al., 2004;Stephan et al., 2008;Wang et al. 2013;Caumont et al. 2009). Only the High-Resolution Rapid Refresh (HRRR) system, developed by NCAR, assimilates radar data with sub-hourly frequency over USA, 60 but it does not use a variational method (Smith et al., 2008).
The cycling assimilation with 4D-Var is still a challenge because of the high demand of computational resources. A first attempt to apply the cycling 4D-Var in operational mode was made during the London Olympic Game in 2012 using an NWP-https://doi.org/10.5194/nhess-2020-406 Preprint. Discussion started: 6 January 2021 c Author(s) 2021. CC BY 4.0 License.

billion
Are you referring to NWPs?
There are also other experiences in Italy assimilating radar data together with other data and different models that should be cited here.
based nowcast system (Ballard et al., 2016). The work shows the advantages of 4D-Var that ingests more observations than 3D-Var, estimating with a greater accuracy the initial state of the atmosphere. In addition, the weather radar reflectivity over 65 the whole Italian territory, previously used in a LETKF assimilation scheme (Gastaldo et al., 2018) with promising results, are now assimilated for the two variational methods.
This study aims at: • assessing the performance of 1-hour cycling assimilation with 3D-Var and 4D-Var methods using WRF model; • evaluating the impact of radar reflectivity mosaic, acquired by the Italian radar network, in cycling assimilation with 70 variational methods; • quantifying the improvements of assimilation techniques in terms of QPF for a complex orography region in the Mediterranean basin.
In this regard, a heavy rainfall case occurred in Central Italy on May 3, 2018 is used and several experiments are carried out to quantify the impact of the two assimilation methods in cycling mode. To the aim of identifying the best configuration in 75 terms of QPF, two different statistical methods are applied: a filtering (neighbourhood) technique and a Receiver Operating Characteristic (ROC) statistical indicator. The keys novelty of this paper resides in: i) the application of hourly cycling 4D-Var assimilation with WRF model; ii) the comparison between the two variational assimilation methods 3D/4D-Var in cycling mode; iii) the assessment of the precipitation forecast skill of cycling 4D/3D-Var assimilation methods in an orographically complex region. 80 The paper is organized as follows. The case study is described in Section 2, while the assimilated dataset is presented in Section 3. Section 4 contains the methodological approach: more specifically a brief overview of the assimilation and verification methods, while the model's setup and the design of experiments are described in Section 5. The results are summarized in Section 6, lastly, the conclusions are discussed in Section 7.

Case Study 85
On 2 May 2018, the synoptic scenario was characterized by an upper level through at tear off stage extended from Central Europe to northern Africa whose evolution is slowed by the presence of two anticyclonic circulations over Atlantic Ocean and Russia respectively (Fig. 1). The upper level through, with a northwest-southeast oriented axis, brought cool and dry air from the Artic region towards the Mediterranean basin, advecting a moist and warm south-easterly flow over the Adriatic region.  In the following 24 hours the upper level through evolved in a cut off low, producing a deep low-pressure system (992 hPa) on the western side of Sicily (Fig. 2a). The surface depression slowly moved north-eastward, dissipating its energy over the next 12 hours. 95 The mesoscale conditions over Italian peninsula showed a strong and moist south-easterly flow at upper and middle atmospheric levels, whereas a convergency line between the north eastern (Bora) and south-eastern (Sirocco) winds developed at low levels ( Fig. 2b). This produced heavy precipitations on May 3 over the central Italy Adriatic region which was enhanced by the complex orography of this area. Indeed, the highest peaks of the Apennines mountains (Gran Sasso and Majella, 2912 m and 2793 m MSL, respectively) near the coast further increased the atmospheric instability, promoting the triggering of 100 convective cells (Fig. 3).

Datasets Description
To the aim of assessing the impact of cycling 3D-Var and 4D-Var, the composite radar reflectivity provided by the Italian radar network (Vulpiani et al., 2008) are assimilated in WRF for the numerical simulations.
The Italian ground radar network includes 20 C-band and 3 X-band radar, managed by 11 regional administrations. Among these, 7 C-band and the 3 X-band systems (all with dual-polarization capability) are managed directly by the Italian Civil 115 Protection Department (DPC), which is also the developer and distributor of the national precipitation product. The processing architecture is basically composed of two main steps: firstly, the radar measurements are locally processed by a unique software system, secondly all the products are centralized to generate the national level products.
Different sources of error can affect the radar measurements (Collier, 1996): non-weather returns (clutter), partial beam blocking, beam broadening at increasing distances, vertical variability of precipitation, Radio Local Area Network (RLAN) 120 interferences and rain path attenuation. Due to the morphology of the Italian territory, the uncertainty can be mainly associated to the orography-related effects, especially in southern Italy where the radar coverage as well as the radar overlapping is poor. The DPC processing system aims at identifying most of the uncertainty sources in order to compensate them, whenever it is possible, before estimating precipitation. A point-by-point description of the operational radar processing chain can be found in (Petracca et al., 2018). After the processing, some composite products are generated in real time by DPC and disseminated 125 at national and regional level. Among these, the reflectivity Constant Altitude Plan Position Indicator (CAPPI) at 2000, 3000 and 5000 m MSL, which cover the whole Italian territory, are assimilated into WRF model.
It is worth mentioning that the CAPPI gives a horizontal cross-section of the data at constant altitude, it is a two-dimensional areal representation extracted from three-dimensional radar volume scan data.
Moreover, a thinning is applied to the CAPPI reflectivity data in order to ensure uncorrelated observation errors (Chang et al., 130 2014;Liu and Rabier, 2003) and to reduce the computational complexity, especially for 4D-Var method. CAPPI data are thinned over the 3km domain grid (described in section 5.2) by using an 'ad hoc' procedure. The rainfall data to assess the impact of 3D/4D-Var methods in cycling mode are provided by the rain-gauge network of the Italian Civil Protection Department (DPC) It is composed by roughly 3000 tipping bucket gauges with the minimum detectable rain of 0.2 mm. A careful quality check is applied to the data before using them in the statistical analysis. Errors due to instrument failures, 135 connection problems and erroneous values are removed.

Variational Data Assimilation
The Weather Research and Forecasting (WRF) model (Skamarock et al., 2019) is used for the numerical simulations, while the CAPPI radar data are assimilated by using the WRF data assimilation system (WRFDA, Barker et al., 2012). This system contains the algorithms required by 3D/4D-Var variational assimilation methods which are described in the following 140 subsections.
Are the CAPPI the only radar observations assimilated?
Is there any reference about this quality control? Is it applied by the authors? Clarify.

3D-Var and 4D-Var methods
Nowadays, the WRF 3D-Var (Barker et al., 2012) variational assimilation method is widely applied in meteorological and oceanographic modelling to assimilate a large variety of observations and to generate reliable initial conditions. The 3D-Var is a mathematical technique that combines observations and a short-range forecast (first guess) through the minimization of a 145 coast function. The goal of this method is to reduce the misfit between the observation and the background fields to obtain the best estimate of the true state of atmosphere. In general, the cost function with respect an atmospheric state variable x is defined as follow: where B and R are the background and observation error matrices, respectively; y0 represents the observations, x0 the 150 background field or first guess and finally H is the observation operator that converts data from model space to observation space.
The 3D-Var method has the advantage of being computationally cheap even if it misses the time dependency, hence all the observations that are acquired during the assimilation window are considered at its central time. The WRF 4D-Var (Huang et al., 2009) take the time variable into account using a numerical weather forecast as dynamical constraint. More specifically, 155 the method computes the model trajectory that reduce the misfit with the observations distributed in the assimilation window.
The initial atmospheric state is determined by minimizing the following cost function (2): The assimilation window is divided in k discrete sub windows, where x0 is the analysis state vector at time k0; Hk and Mk are the nonlinear forward and observation models, respectively. Finally, B and R are, as already mentioned, the background and 160 observation error matrices.
The 4D-Var has the advantage of assimilating the observation at its exact time compared to 3D-Var, ensuring a more consistent physics and dynamical balance of the initial conditions. Given the high computational complexity, the incremental approach proposed by Courtier et al. (1994) has been adopted. The tangent linear and adjoint model with a simplified physics are used in the inner loop to increase the computational efficiency. Despite that the application of 4D-Var in operational mode is still 165 limited to the major weather centres only.
The B matrix is a key component in the assimilation processes because it weights the errors in the background field adjusting the spatial spreading of observational information. In this context, the National Meteorological Center (NMC) method (Parrish and Derber, 1992), widely use in the data assimilation community, has been adopted for this work. The B matrix is estimated evaluating the difference between forecasts valid at the same time, but one of them is initialized 12 hours after the other. More 170 in detail, the new method (Wang et al., 2013b) considers u, v, temperature, pseudo relative humidity, and surface pressure as control variables, in contrast with the old one which utilizes the streamfunction, the unbalanced velocity potential and the unbalanced temperature. Recent works using the new method, shows a slightly benefit in the precipitation forecast skill as well as an improving of performance when radar data are assimilated (Wang et al., 2013b;Sun et al., 2016). Thus, for this work the https://doi.org/10.5194/nhess-2020-406 Preprint.  Mazzarella et al. (2020), the var_scaling, and len_scaling parameters which adjust the influence of BE matrix over the background field, are set to 1 and 0.5, respectively.
Recent studies show the positive behaviour of NO_ECHO radar assimilation (Min and Kim, 2016) in reducing the overestimation of precipitation. For this reason, the NO_ECHO option is used for this study. The observations with nonprecipitation echo within the radar range are assimilated to correct the precipitation pattern, removing excessive water vapor 180 and hydrometeors (Lee et al., 2020).

Radar observation operator
The assimilation of CAPPI radar reflectivity for both 3D/4D-Var assimilation methods has been performed through the indirect method (Wang et al. 2013c;Gao and Stensrud 2012). The new approach converts the observed reflectivity in the three hydrometeor mixing ratios in contrast to the direct method (Xiao et al. 2007) which only uses the rainwater mixing ratio. 185 The forward reflectivity operator considers the contribution of snow, rain and hail components and it is represented as: where Ze is the equivalent reflectivity factor, α varies linearly between 0 at Tb = −5 °C and 1 at Tb = 5 °C, and Tb is the background temperature from an NWP.
The three hydrometeor mixing ratios, rain, snow, and hail in Eq.
For the hail component the formulation based on (Lin et al. 1983;Gilmore et al. 2004) is adopted: ( ℎ ) = 4.33 × 10 10 ( ℎ ) 1.75 Finally, the equivalent reflectivity (Ze) is converted in Z (dBZ) using: 200 = 10 log 10 The option, allowing the assimilation of in-cloud humidity from radar reflectivity, has been considered for this study (Wang et al., 2013). In this case the observed operator H is defined by: what are these length scales? Could you interpret them from a physical point of view?
As this is an important point, I believe that you should discuss it in more detail.
where qv represents the specific humidity, rh the relative humidity, and qs the saturated specific humidity of water vapor. 205 Hence, this option allows the assimilation of vapor from in cloud region in addition to the three hydrometeor species calculated with the indirect method.

Model setup
The numerical simulations are carried out with the ARW-WRFV4.0 model (Skamarock et al., 2019). WRF is a mesoscale model, supported by National Center for Atmospheric Research (NCAR) and largely used by the atmospheric modelling 210 community. The main features consist of a fully compressible Euler nonhydrostatic equations with hydrostatic option, staggered Arakawa C grid, and a highly portable and flexible code that optimize the use of computational resources.
A two-way nesting configuration with two domains is used for this study: the mother domain with 379x431 grid points, covers the Italian peninsula with a horizontal resolution of 3 km, while the inner domain (340x319 grid points) is centred over the Abruzzo region (central Italy) with a grid spacing of 1 km (Fig. 4a). To avoid compatibility issues with WRFDA, the vertical 215 terrain following coordinates are used instead of terrain-following hybrid. Both domains adopt 40 vertical levels from the ground up 100 hPa. Because of the high spatial resolution, the convection is explicitly resolved. The same physical parameterizations used by the Center of Excellence in Telesensing of Environment and Model Prediction of Severe Events (CETEMPS) meteorological-hydrological forecast system, are set for this work (Ferretti et al., 2014). More specifically, the microphysical processes are parameterized by using the WSM6 scheme (Hong and Lim, 2006), while the MYJ scheme (Janjic 220 et al., 1994) is applied for the PBL. Shortwave and long wave radiation are considered through the Rapid Radiative Transfer Model (RRTM) scheme (Mlawer et al., 1997). Finally, the Noah land surface model is chosen to parametrize the land surface processes (Chen and Dudhia, 2001). The initial and boundary conditions for the mother domain are provided by European Centre for Medium-Range Weather Forecasts (ECMWF) with a 0.1-degree resolution.   0110 UTC, 0210 UTC and 0310 UTC (Fig. 4b, c, d). Moreover, two additional simulations are performed with the aim of evaluating the performance of cycling assimilation methods in warm start mode. For this purpose, a previously numerical forecast, initialized six hours before, is used as background field. The same aforementioned CAPPI are assimilated in these experiments. 240 In table 1 a summary of all the experiments performed.

Verification methodologies 245
To the aim of evaluating the impact of both 3D-Var and 4D-Var in cycling mode, an objective comparison between the observed and forecast precipitation is performed. To this purpose, the rainfall data collected by DPC rain gauge network are spatially interpolated on the model grid. More in detail, the Inverse-Distance-Weighting (IDW) conservative method (Jones,1999) is used to remap the rain data to the 3km domain grid. The statistical analysis is only performed in a restricted area of domain D01 (Fig. 3) that includes Lazio and Abruzzo regions (hereafter LA), because these are the regions where 250 relevant accumulated precipitations was recorded, and where several floods and relevant damages occurred. In addition, the precipitation forecast verification is also performed in the Lazio region to assess the behaviour of assimilation in leeward and windward sides of the Apennines range. Finally, this work represents a preliminary study to investigate the reliability of the cycling assimilation before implementing it in an operational meteorological-hydrological chain as the one at CETEMPS which activity is focused on the meteorological-hydrological forecast in Abruzzo region. 255 To avoid the spatial limitations of using a point by point approach in the evaluation of quantitative precipitation forecast (Roberts, 2003)  In addition, the Receiver Operating Characteristic (ROC) statistical indicator is computed to strengthen the statistical analysis, evaluating how skilful are the cycling assimilation methods in terms of QPF. The ROC curve synthetizes the information obtained with different thresholds in one diagram, just comparing the hit rate against false alarm rate (Swets 1973). Finally, the area under the ROC curve (AUC) which is largely used to quantify the skill of a forecast system (Mason, 1982;Buizza and 265 Palmer, 1998;Storer et al., 2019;Ferretti et al., 2020), is calculated for each simulation.

Results
With the aim of evaluating the performance of 3D-Var and 4D-Var assimilation methods in cycling mode, the FSS and ROC, previously descripted in Section 5.2, have been calculated. The FSS is calculated in LA subregion, where significant precipitation occurred. To this purpose, three threshold values: 1, 3 and 7 mm (10 and 15 mm for 6 h accumulated precipitation) 270 are used to evaluate the impact of cycling assimilation with light, moderate and heavy precipitation. Finally, the ROC and AUC are calculated to reinforce the statistical analysis and summarize the information obtained with the FSS.

FSS evolution
To compare the 4D/3D-Var experiments in warm/cold start and their ability to reproduce the precipitation pattern, the statistical index has been calculated considering the precipitation accumulated over three specified time periods: 1, 3 and 6 hours, 275 respectively. The time series of FSS is presented in the sections below.

Hourly precipitation
The FSS for the 1 mm h -1 accumulated threshold (Fig. 5a) is calculated for all the experiments starting from 0600 UTC to 1500 UTC in LA region. The results are quite similar for all experiments in the first hour of simulation due to the small accumulated precipitation, even if the CTL is starting from higher values than the other experiments. Later, all experiments display higher 280 FSS values than CTL experiment (red dashed line), pointing out the positive feedback of cycling assimilation in the interval from 0700 UTC to 1200 UTC. Moreover, the simulations with warm initialization clearly show (Fig. 5a, blue and green lines) higher values demonstrating a larger impact on the precipitation forecast than those with cold start. On the other hand, the CTL performs better in the last three hours of simulation, suggesting that the impact of radar reflectivity is decreasing in time. The FSS computed for the 3 mm h -1 threshold (moderate precipitation) is shown in Fig. 5b. The simulations with data assimilation 285 confirm the improvements in term of QPF from 0700 UTC to 1200 UTC. In addition, the two experiments with 4D-Var (Fig.   5b, yellow and blue lines) performs better than 3D-Var, except in the first hours of simulation. This behaviour proves the positive impact of 4D-Var with moderate precipitation. However, similarly to the results for 1 mm h -1 threshold, the impact of radar reflectivity decreases in the final hours. Finally, the FSS is calculated for heavy precipitation, using a threshold value greater than 7 mm h -1 (Fig. 5c). According to the results for the 3 mm h -1 threshold, the FSS confirms the good performance 290 https://doi.org/10.5194/nhess-2020-406 Preprint. Discussion started: 6 January 2021 c Author(s) 2021. CC BY 4.0 License.

Is that behavior caused by the spinup?
The information about the time is missing The main problem with the results is that thresholds are too low. A severe convective event can give a rainfall larger than 50 mm/h. Here 7 mm/1h is considered, which is too low. Larger thresholds must be considered, otherwise the paper is not assessing the impact of radar DA for heavy precipitation thresholds.
of 4D-Var in cycling mode. More specifically the CYC4DAVR_warm (Fig. 5c, blue line) shows the highest FSS values in the whole period with heavy precipitation, except for a very short period where the 4D-Var cold start is reaching higher values than the warm start. Conversely, the benefit of 3D-Var simulations is limited to the first hours of simulation.
The statistical analysis with hourly precipitation clearly shows the positive impact of assimilation in cycling mode with both methods in the initial hours of simulation. Moreover, the simulations in warm mode show the best performance compared to 295 those in cold start, especially the CYC4DVAR_warm (Fig. 5c, blue line) when heavy precipitation occurred.

Three hourly precipitation
The statistical analysis is also carried out for the 3-hourly precipitation accumulation interval to investigate the response of cycling data assimilation for precipitation characterized by different mechanisms as for example convection or orographic uplifting. The FSS for 1 mm 3h -1 (Fig. 6a) highlights the benefit of using 3D and 4D-Var with cold start for a short initial period (0600-0730 UTC), suggesting that the background field is probably more accurate than the previous forecast. But the 305 rate of increasing for both the warm start is larger than CTL and cold starts. Indeed, the FSS values for the warm start are higher than the cold starts up to 1200 UTC (Fig. 6a, blue and green lines). Later the improvement reduces, and CTL performs better (Fig. 6a, red dashed line). The score is also computed for 3 mm 3h -1 threshold (Fig. 6b). Similarly to the results for 1 mm threshold the cycling 4D-Var in cold mode (CYC4DVAR_cold) improves the precipitation forecast at the initial time (0600 UTC to 0800 UTC), while the CYC4DVAR_warm mode confirms the higher values from 0900 to 1200 UTC as well as 310 the poor performance in the first hour. Moreover, the reduction in FSS values for both 4D-Var and 3D-Var compared to CTL in the final hours of simulation period proves that the influence of cycling is restricted to a short time range.
Finally, the FSS is also calculated for threshold value of 7 mm 3h -1 in LA region (Fig. 6c). The highest values are associated with CYC4DVAR_warm (blue line), although all experiments show an improvement compared to CTL in the first hours of simulation period. Later, the experiments converge to the control run but the values of FSS decrease due to the small 315 accumulated rainfall.
The statistical analysis performed using the three threshold values (1, 3 and 7 mm), proves the positive impact of cycling assimilation with radar reflectivity in the interval 0600-1500 UTC and confirms the benefit of 4D-Var compared to 3D-Var.
Moreover, the two simulations with a warm start initialization show a low impact at 0600 UTC. The reason for this is probably the poor dynamic balance of the initial fields at the beginning of simulations in warm mode. 320 https://doi.org/10.5194/nhess-2020-406 Preprint. Discussion started: 6 January 2021 c Author(s) 2021. CC BY 4.0 License.
The thresholds considered are low also in this section. Severe weather is characterized by much larger precipitation thresholds that are missed in this analysis.

Six hourly precipitation
The FSS is also calculated using the 6-hourly precipitation in LA region using three threshold values: 7 mm 6h -1 , 10 mm 6h -1 and 15 mm 6h -1 .
The FSS evolution for 7 mm 6h -1 threshold is provided in Figure 7a. At 1200 UTC all experiments show similar performance due to the small rainfall in the first six hours of simulation. Then, the warm start initialization shows the highest FSS values, 330 pointing out the improvement in precipitation forecast for CYC3DVAR_warm (green line) and CYC4DVAR_warm (blue line) when compared to CTL. Nevertheless, also the two simulations in cold start mode show a positive impact in the whole interval, even though with a slight improvement.
The differences between experiments increase with higher threshold values. The FSS calculated for the 10 mm 6h -1 threshold ( Fig. 7b) proves the positive impact of CYC4DVAR_warm compared to the 3D-Var simulations for the whole simulation 335 interval. The results also suggest that the cycling assimilation with warm start performs better than experiments in cold start for both assimilation methods.
According to the results for the previous threshold, the CYC4DVAR_warm confirms the best performance in term of FSS for the 15 mm 6h -1 threshold (Fig. 7c). Both CYC4DVAR_cold and CYC3DVAR_warm display a positive impact in QPF during the whole simulation period. Conversely, the CYC3DVAR_cold shows a worsening at 1800 UTC when compared to CTL and 340 the other experiments.
In conclusion, the time series of FSS points out the benefit of using a cycling assimilation for radar reflectivity. The warm start with 3D/4D-Var assimilation methods confirms the improvement in terms of QPF, for: • the hourly accumulated precipitation, highlighting the good performance in the localization and timing of the onset of the precipitation and for very light precipitation; 345 • the 3-hourly accumulated precipitation highlighting the improvements for convective cells and orographic precipitation; • the 6-hourly accumulated precipitation.
On the other hand, the FSS depends on threshold values, thus a further statistical indicator has been calculated to the aim of endorse the previous results. Hence, the ROC curve, which summarizes the result obtained with several thresholds in one plot 350 allowing for an easy comparison, is built for this study.

Receiver operating characteristic (ROC)
The Receiver Operating Characteristic (ROC), which compare the probability of detection (POD) and false alarm rate (FAR), is calculated to evaluate how the simulations are skilful in precipitation forecast. The 12-hourly precipitation accumulated from 0600 UTC to 1800 UTC for the NWP experiments and the rain gauge network are used to build the curve (Fig. 8) for the LA region. To investigate the ability of cycling assimilation in predicting rainfall with light, medium and heavy intensity the 360 following threshold are chosen: 1 mm, 12 mm, 24 mm, 36 mm, 48 mm and 54 mm. The curves show low POD values for 1mm threshold and a worsening compared to the CTL. On the other hand, the benefit of cycling assimilation is clearly found with higher threshold values. In fact, the steepness of 3D/4D-Var curves is greater than CTL, suggesting a good forecast skill with moderate and heavy precipitation, namely from 12 mm to 48 mm. In this regard, the CYC4DVAR_warm (green line) shows the best performance, while the CYC3DVAR_cold (red line) reduces its impact with high threshold values. In addition,

Conclusions
In this paper the impact of cycling 3D-Var and 4D-Var variational assimilation methods on the forecast of a heavy precipitation 375 event, occurred in an orographically complex region namely the Central Italy, is evaluated. The reflectivity CAPPI (Constant Altitude Plan Position Indicator) obtained by Italian radar network at 2, 3 and 5 km are assimilated every hour in cycling mode into the WRF model. The comparison among the experiments is performed using a filtering approach, the Fraction Skill Score (FSS), for the Lazio-Abruzzo subregion, that is the area where relevant rainfall occurred. In this regard, the statistical analysis is performed considering 1, 3 and 6 hourly accumulated precipitation with three different threshold values in order to evaluate 380 the benefit of cycling assimilation with light, moderate and heavy precipitation. Finally, the ROC is built, and its area is calculated to further evaluate the reliability of cycling assimilation in precipitation forecast.
The FSS time series for the hourly precipitation highlights the positive impact of radar data for both assimilation methods compared to CTL. The benefit of using a cycling assimilation is clearly shown in the results for both light and moderate precipitation. However, the impact reduces in the last hours of the simulation, when all experiments converge to the control 385 run. Conversely, the FSS calculated for 1mm h -1 threshold shows a worsening of both assimilation methods at the start time of the statistical analysis, because of the low cumulated precipitation. To further evaluate the impact of cycling assimilation with 3D and 4D-Var the analysis is also performed considering the 3 h cumulated rainfall. The results for all thresholds confirm the benefits of assimilating reflectivity data. In this regard, the cycling 4D-Var has a greater impact than 3D-Var experiments and consequently higher FSS value. More specifically, the cold start initializations for cycling both 4D-Var and 3D-Var show an 390 improvement in terms of QPF compared to CTL at beginning of analysis, while the experiments in warm start perform better after few hours. This behaviour is probably caused by a slightly unbalanced initial field for the warm start simulations. Different thresholds are used to evaluate the reliability of cycling data assimilation with 6 hourly precipitation. The FSS for 7 mm 6h -1 threshold confirms a slight improvement of warm start simulations compared to CTL and cold initialization. The CYC4DVAR_warm clearly displays the greatest FSS values at 10 and 15 mm thresholds, pointing out the positive impact of 395 radar reflectivity. Also, the 4D-Var in cold start and the 3D-Var with a warm initialization produce an improvement in QPF although it is smaller than CYC4DVAR_warm. On the other hand, the CYC3DVAR_cold shows a worsening in FSS.
Finally, the ability of cycling assimilation to reproduce the 12 h precipitation field is evaluated by using the ROC and the area under the curve. The curves are calculated considering the period from 0600 UTC to 1800 UTC because of the significant rainfall. The comparison between the simulations confirms that cycling 4D-Var in both warm and cold mode is the best 400 technique, indeed the highest value of AUC=0.91 is obtained. The CTL shows lower steepness than the cycling 4D-Var and an AUC of 0.88. Finally, the AUCs for the two simulations with 3D-Var, respectively 0.87 for warm initialization and 0.89 for the cold, are lower than 4D-Var and comparable with CTL simulation. Therefore, the impact of 3D-Var over 12h accumulated precipitation is less clear.
In conclusion, the use of 3D-Var and 4D-Var methods in cycling assimilation with weather radar reflectivity mosaic data 405 improves the reliability of the precipitation forecast, even if the positive impact reduces in time. The two simulations with a https://doi.org/10.5194/nhess-2020-406 Preprint. Discussion started: 6 January 2021 c Author(s) 2021. CC BY 4.0 License. heavy precipitation are not considered.
I cannot understand this sentence: explain better.
warm start initialization produce good results in terms of FSS, but the differences are small compared to the cold simulations that perform better at the initial time. This behaviour suggests that the precipitation spinup time decreases in cycling assimilation with cold start, while for warm initialization this is not true. In addition, the cycling 4D-Var with warm start (CYC4DVAR_warm) shows better performance than 3D-Var over all precipitation accumulation intervals considered for this 410 study. Finally, the ROC cures and the AUC values also confirms the benefit of 4D-Var in warm start.
The next step of this work will be to assimilate the radial velocity to improve the accuracy of wind field and vertical velocity, thus the positioning of convective cells. This opportunity allows us to complete the assessment of weather radar assimilation in a 4D-Var cycling data assimilation. In addition, the impact of a wider data assimilation windows in cycling 4D-Var could be tested, in combination with a strategy with more outer loops. These solutions allow the assimilating of more data and to 415 take into account the non-linear effects, producing significant increments in the analysis field. Lastly, the results of this study are helpful to decide which cycling assimilation methods will be implemented in operationally CETEMPS meteorologicalhydrogeological chain.

Data availability
Rain gauges data are provided by DEWETRA data portal (http://www.mydewetra.org/). The platform is accessible upon 420 request to the Civil Protection Department. Radar data are obtained from the Civil Protection Department. Access to these data are restricted and readers must request it through contacting the lead author.
Department for rain gauges and radar reflectivity data. The first author also thanks the Center of Excellence in Telesensing of Environment and Model Prediction of Severe events (CETEMPS) for the financial resources.