Developing an index for heavy convective rainfall forecasting over a Mediterranean coastal area

Heavy convective rainfall incidents that occurred over western coastal Greece and led to flash floods are analyzed by means of mesoscale analysis for the period from January 2006 to June 2011. The synoptic scale circulation is examined throughout the troposphere along with satellite images, lightning data and synoptic observations of weather stations. Well-known instability indices are calculated and tested against synoptic observations. Taking into account the severity of the incidents, the performance of the indices was not as good as expected. Further detailed analysis resulted in the development of a new index that incorporates formalized experience of local weather and modeled knowledge of the mechanisms of severe thunderstorms. The proposed index, named Local Instability Index (LII), is then evaluated and its performance is found to be quite satisfactory.


Introduction
Thunderstorms accompanied by heavy rainfall often lead to flash flood events with disastrous consequences for the economy and the environment, and in some cases have resulted in fatalities. Although the performance of numerical weather prediction models has improved (Kelley and Källén, 2013), further study of these events remains challenging and necessary because of their impacts.
One of the fundamental conditions for thunderstorm initiation is the existence of an unstable atmosphere. In order to estimate the instability, thermodynamic indices have been created by combining related meteorological parameters (Showalter, 1953; George, 1960; Boyden, 1963; Jefferson, 1963a; Jefferson, 1963b; Miller, 1967; Litynska et al., 1976; Peppler, 1988; Peppler and Lamb, 1989; Jacovides and Yonetani, 1990; Reuter and Aktary, 1993; Tian and Fan, 2013). These indices have not always shown satisfactory results, owing to local effects that are not well represented or to limited datasets.
Correspondence to: Marina Korologou (marina.korologou@gmail.com)
Related studies have been carried out for specific regions of Greece with acceptable results (Dalezios and Papamanolis, 1991; Michalopoulou and Karadana, 1996; Sioutas and Flocas, 2003; Chrysoulakis et al., 2006; Marinaki et al., 2006). The main challenge of these studies was the availability and reliability of observation data, as the existing radiosonde network is rather insufficient. It has been shown that the performance of the indices depends on the season or even the month, the terrain of the area and the type of the thunderstorms (Michalopoulou and Jacovides, 1987; Prezerakos, 1989; Dalezios and Papamanolis, 1991; Haklander and Van Delden, 2003; Tyagi et al., 2011).
Western Peloponnese, being washed by the Ionian Sea, is an area that is frequently affected by severe thunderstorms (Maheras et al., 2003; Metaxas et al., 1999; Ziakopoulos, 2009; Xoplaki, 2002). However, relevant studies have not been performed so far, mainly due to the lack of radiosonde data. The objective of this study is to examine the thermodynamic environment of severe thunderstorms with respect to heavy rainfall occurring in this area for the period from 1-1-2006 to 30-6-2011. An alternative methodological tool is proposed for developing a useful and practical index. This index is intended for forecasting these events without employing radiosonde data, because such data are not available for most hydrological basins of magnitude 5 and above.

Data
The severe thunderstorms with heavy rainfall that occurred in the examined area of northwestern Peloponnese (see Fig. 1), more specifically over the hydrological basin defined by the rivers Peiros, Parapeiros, Vergas and Pinios (almost 2500 km²) (MEECC, 2012), from 1 January 2006 to 30 June 2011 were considered. For this purpose, a mesoscale analysis of the atmosphere with a 6-hour time step was performed for that period. Datasets of dry and dew point temperature at the surface and of geopotential height, temperature and humidity at the isobaric surfaces of 850, 700, 500 and 300 hPa were used. The 6-hourly synoptic scale analysis of the atmosphere derived from the archive of the Hellenic National Meteorological Service (HNMS) and a re-analysis of 0.125 deg resolution from the European Centre for Medium-Range Weather Forecasts (ECMWF) with the same time step were also employed (Veremei et al., 2013). Additionally, the surface synoptic observations (SYNOP) from the stations of Andravida, Araxos, Pyrgos and Zakynthos (see Fig. 1) were employed and merged into 6-hour intervals in order to be compatible with the aforementioned time step (i.e. 00:00-06:00, 06:00-12:00, 12:00-18:00, 18:00-24:00).
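The merging of SYNOP reports into 6-hour intervals can be illustrated with the following minimal sketch. The function names are hypothetical, and the convention that a report at exactly 06:00 belongs to the 06:00-12:00 interval is an assumption, since the study does not state how boundary times were assigned.

```python
from datetime import datetime

def six_hour_interval(ts):
    """Return (date, interval index) where index 0..3 stands for
    00-06, 06-12, 12-18 and 18-24 (boundary convention is an assumption)."""
    return ts.date(), ts.hour // 6

def merge_synop(observations):
    """Group (timestamp, value) pairs by the 6-hour interval they fall in."""
    merged = {}
    for ts, value in observations:
        merged.setdefault(six_hour_interval(ts), []).append(value)
    return merged

# Hypothetical reports, for illustration only
reports = [
    (datetime(2006, 5, 1, 5, 50), "report A"),
    (datetime(2006, 5, 1, 6, 0), "report B"),
    (datetime(2006, 5, 1, 11, 0), "report C"),
]
by_interval = merge_synop(reports)
```

Intervals that end up with no entry in the resulting dictionary correspond to the missing merged SYNOP discussed in the text.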
Missing merged SYNOP were noticed randomly throughout the available dataset, mainly during night hours, weekends and public holidays, representing 2.8%, 3.1%, 51.2% and 29.2% of the records for the stations of Andravida, Araxos, Pyrgos and Zakynthos respectively. For the Dry Temperature, the missing data were classified in three categories. The first category is characterized by 6-hour intervals at Andravida station with no available observations from the nearby stations, and consists of 9 cases. For this category, the Group Method of Data Handling (GMDH) algorithm (Acock, 2000) was employed, with predictor variables derived from the temperature at 850 hPa, the Dry Temperature of the adjacent days and hours, and the 6-hour Wind Runs. The accuracy (within ±1 °C) was found to be as high as 88%.
The second category consists of 113 cases, characterized by available observations at Araxos station at the times of the missing observations at Andravida. In this case, the GMDH algorithm was also employed with one more predictor variable, namely the Dry Temperature of this nearby station. The accuracy (within ±1 °C) reached 90%.
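The accuracy figures quoted above are read here as the fraction of reconstructed temperatures that fall within ±1 °C of the observed value. A minimal sketch of that score, assuming this reading and using a hypothetical function name, is:

```python
def accuracy_within(predicted, observed, tol=1.0):
    """Fraction of predictions whose absolute error does not exceed tol (°C)."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= tol)
    return hits / len(observed)

# Hypothetical values, for illustration only
score = accuracy_within([20.0, 21.5, 18.0, 25.0], [20.4, 20.0, 18.9, 25.0])
```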

The third category was characterized by two or more successive missing observations and consists of 106 cases. In this case, the GMDH algorithm was not selected; a qualitative approach was employed instead, with the aid of the respective values from the nearby weather stations when available, the synoptic analysis and the images from the satellite Meteosat-9, more specifically a combination of the SEVIRI High Resolution Visible channel and the IR10.8 channel, with the aid of the CineSat application.
For the surface relative humidity, the 228 missing merged observations were filled with the aid of a qualitative approach, due to the nature of this parameter. The subjective estimation was based on succeeding and preceding observations, on observations of the nearby stations, on the synoptic analysis and on Meteosat-9 images (a combination of the SEVIRI IR3.9, IR10.8 and IR12.0 channels). The amount of precipitation and the duration of each individual thunderstorm determined its intensity.
If a thunderstorm occurs within a 6-hour interval at at least one of the examined weather stations with intensity greater than 5 mm/min for at least 5 min, then this interval is defined as a 6-hour interval of severe thunderstorm. Correspondingly, 6-hour intervals characterized by more than 10 strokes/hour within the area of lightning data (see Fig. 1) were considered as intervals of severe thunderstorms. These records were merged with the synoptic observations. However, there were cases with recorded strokes but without thunderstorms recorded in the synoptic observations. The identification of these cases was further verified with the aid of satellite images (Meteosat-9) as derived from the channel combination named Convection RGB (WV6.2 - WV7.3, IR3.9 - IR10.8, NIR1.6 and the [...]).
This analysis showed 508 6-hour intervals with thunderstorm events over the examined area, including 143 intervals of severe thunderstorms associated with rainfall intensity greater than 5 mm/min for at least 5 minutes or with more than 10 strokes/hour, according to the criteria above. These events can potentially lead to flash floods. The remaining 365 cases refer either to thunderstorms with no or relatively small amounts of precipitation or to thunderstorms associated with frontal activity, and were excluded from the subsequent analysis. The 143 severe cases occurred from May to October and thus our study was restricted to these months.
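The two severity criteria can be expressed as a simple classification rule. The sketch below is illustrative only: it assumes per-minute rainfall intensities and hourly stroke counts as inputs, and it interprets "for at least 5 min" as five consecutive minutes, which the text does not state explicitly. The function names are hypothetical.

```python
def longest_run_above(values, threshold):
    """Length of the longest run of consecutive values strictly above threshold."""
    longest = current = 0
    for v in values:
        current = current + 1 if v > threshold else 0
        longest = max(longest, current)
    return longest

def is_severe_interval(rain_mm_per_min, strokes_per_hour,
                       rain_thr=5.0, min_minutes=5, stroke_thr=10):
    """Flag a 6-hour interval as severe if either criterion is met.

    rain_mm_per_min: per-minute rainfall intensities at a station (assumed input)
    strokes_per_hour: hourly lightning stroke counts in the interval (assumed input)
    """
    heavy_rain = longest_run_above(rain_mm_per_min, rain_thr) >= min_minutes
    intense_lightning = any(s > stroke_thr for s in strokes_per_hour)
    return heavy_rain or intense_lightning
```

With several stations, the rule would be applied to each station in turn and the interval flagged if any of them satisfies it.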
Due to the limited availability of lightning data, two distinct sub-periods were used. The first period, from 1-5-2006 to 31-10-2007, is characterized by a lack of lightning data. The second one, from 1-6-2008 to 30-6-2011, is considered of higher reliability due to the availability of lightning data. In the first period, 138 6-hour intervals of thunderstorms occurred, including 54 severe thunderstorms. In the second period, 370 thunderstorm events were observed, including 89 severe events.
A set of metadata were aggregated from the first period data i.e.

Methodology
The available data made feasible the calculation of the thermodynamic instability indices KI, HI, TTI and SWEAT. Since these indices refer to a specific geographical point, the Andravida surface weather station was chosen as representative of the examined area, because this station presented the smallest number of missing data. Although these indices are satisfactory in many cases worldwide, for the examined area their performance, evaluated with the HeVeS (Hellenic Verification Scheme) (Petrou et al., 2009) and the Yule Index (Marinaki et al., 2006), was found to be poor (Dimitrova et al., 2009) and thus of no practical value. This performance could be attributed to the fact that the indices take into account neither the synoptic scale weather patterns nor the local flows. Therefore the development of a new instability index is imperative.
Severe thunderstorms cannot be modeled, and consequently predicted, either analytically or synthetically (Holton, 2004). The proposed indices for predicting thunderstorms can be considered as tested hypotheses. These tests were performed for a specific period. Consequently, it is always possible for a proposed index to be rejected if it is applied or tested unsuccessfully on a different period. These validation tests are performed deductively: the proposed index (which actually constitutes the hypothesis) and its application constraints are considered as the prerequisite knowledge for the prediction of the event; if the predicted event is not manifested, the hypothesis is rejected (Trochim, 2000). From a set of proposed indices, the index that has been tested more strictly is preferred. It is rational to accept that if there is an effective index, it will be among those that have withstood criticism and been corroborated.
An index is a successfully tested hypothesis that can be developed from experience, literature or theory, or a combination of these (Graham et al., 2010), i.e. Combined Hypothesis Development. An index that derives from a rich explanatory theoretical framework (content) and a consequently deductive hypothesis, incorporates formalized related experience and has performed successfully in strict validation tests can be considered to capture an important part of the event behavior.
In order to state and support the effectiveness of the new index, it is suggested to use two different sets of data: the first for building the hypothesis, i.e. for finding the patterns and the rules that associate the events with the meteorological parameters for the specific period, and the second for testing and evaluating the hypothesis according to the Modus Tollens rule (Lakatos, 1963). The first sub-period (1-5-2006 to 31-10-2007) was used for building the hypothesis and the second sub-period (1-6-2008 to 30-6-2011) for testing and evaluation, since the recorded thunderstorm events of the latter sub-period are more accurate than those of the former, as explained in the Data section, and the testing of the proposed index (hypothesis) would therefore be stricter.
The factors responsible for forming the Index are inferentially derived from the theoretical and empirical analysis. Data Mining and Optimization techniques are employed to determine the critical values of these factors and not the factors themselves, since the latter would lead to an index with poor informative content, i.e. relations between the event and the parameters with no meteorological meaning. In this study it was attempted to automatically extract association rules and patterns between the events and the data and metadata using the software tools MATLAB and ARMADA for MATLAB (ARMADA, 2011). Data Mining techniques such as Principal Components Analysis, Association Rules and Cluster Analysis were applied to the data and metadata.
However, no useful result was found, mainly due to the sparseness of the phenomena in question. The aforementioned algorithms lose their effectiveness when applied to rare phenomena modeled by high dimensional data with sparse features, as in this case. Overcoming this drawback requires considerable effort and still does not yield results with the precision appropriate for this case (Beyer et al., 1999).
Thus, in this study, the described methodological tool of Combined Hypothesis Development was preferred. The index will have the form of a threshold function that either flags or does not flag a warning for an impending thunderstorm with heavy rainfall. A recall of 100% will be a major constraint on the index, due to the severity of the consequences of the event.
Developing the New Local Instability Index (LII)
In this section, the factors that form the framework of the index development are briefly presented, along with a synoptic description specific to the examined area.
It is well known that thunderstorm initiation requires the presence of three ingredients, namely Energy, Moisture and a Lifting Mechanism. Using these ingredients as guidance, a detailed analysis was conducted of the factors related to thunderstorm events associated with heavy rainfall.
These mechanisms are closely related to the synoptic scale circulation over the examined area. More specifically, during the period from May to August (5th to 8th month) polar air masses arrive over the Mediterranean Sea and, as they have crossed the warm continent of Europe, they have become dry and warm (Xoplaki, 2002). At the same time, the eastern Mediterranean region is affected by tropical dry and warm air masses (Rodwell and Hoskins, 2001; Hoskins, 1996). Thus, heat is transferred from the warm lower atmospheric layers to the upper layers of the sea, causing the temperature of the lower atmosphere to be reduced. These conditions enhance the stability of the atmosphere, which is often associated with a temperature inversion that traps moisture in the lower layers (from the surface up to 3000-5000 ft), and inhibit cyclogenesis or the passage of depressions.
In late summer and especially during September the polar jet stream is shifted to the south. An atmospheric perturbation may interrupt the equatorial flow of the jet, and a part of it usually moves southwards, causing a northwesterly flow. Consequently, cyclonic conditions are created at the lee side of the Alps (Aebischer, 1998; Kljun et al., 2000) and the geopotential heights are reduced. The southeastward movement of that part of the jet is usually enhanced by these conditions. As the jet gains momentum, it moves further to the south, resulting in a further reduction of the geopotential heights and in cyclonic conditions over the area of Boot and the northern Sidra Sea (Trigo et al., 2001). As a consequence, southwesterly winds gradually prevail over the southern Ionian Sea (Brody and Nestor, 1980), enriching even the middle layers of the atmosphere with moisture and breaking the temperature inversion that occurs in the low layers. The examined area is affected by such conditions, as the southwesterly stream in conjunction with the local orography accumulates further moisture in the lower atmosphere, while in the meantime the perturbation has moved eastwards, bringing a cold and dry air mass into the upper layers. The combination of these conditions can be explosive and cause severe storms.

Throughout September and October (9th to 10th month), when a southwesterly flow prevails in the upper atmosphere, orographic clouds and precipitation are caused over the windward areas of western Peloponnese. The shift of winds at 850 hPa to the southwestern sector favours the occurrence of thunderstorms, occasionally severe.
The factors of Energy, Moisture and Lifting are considered as the independent variables of a threshold function that constitutes the Local Instability Index (LII), each requiring a minimum value for the occurrence of a severe thunderstorm.

The analysis was carried out every six hours and consequently the Index provides warning values every six hours, each valid for the next 12 hours. Due to the severity of the phenomena, it is compulsory for the index to predict all or almost all of them (recall 100%) and simultaneously maintain a high and practicable precision.
In order to determine the critical values of the parameters, the precision of the LII was set as the objective function to be maximized. The required parameters were the decision variables of the objective function, constrained to rational values. A further constraint was the value of recall, set to 100% as justified in the previous paragraph. For this purpose, bintprog, the linear programming (LP)-based branch-and-bound algorithm of the Optimization Toolbox of MATLAB (R2010a), was used (Nemhauser and Wolsey, 1988).
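The structure of this optimization problem, maximizing precision subject to a recall of 100%, can be illustrated with a small exhaustive search over candidate thresholds. This is not the branch-and-bound formulation solved with bintprog in the study; it is a simpler stand-in for illustration, and the function names, the parameter names and the rule that a warning is flagged when every parameter reaches its threshold are assumptions.

```python
from itertools import product

def precision_recall(flags, events):
    """Precision and recall of boolean warnings against boolean observed events."""
    tp = sum(1 for f, e in zip(flags, events) if f and e)
    fp = sum(1 for f, e in zip(flags, events) if f and not e)
    fn = sum(1 for f, e in zip(flags, events) if e and not f)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def best_thresholds(samples, events, candidates):
    """Search all threshold combinations; keep the most precise one with recall 1.

    samples: list of dicts mapping parameter name to value
    candidates: dict mapping parameter name to a list of candidate thresholds
    Returns (precision, thresholds) or None if no combination reaches recall 1.
    """
    names = list(candidates)
    best = None
    for combo in product(*(candidates[n] for n in names)):
        flags = [all(s[n] >= t for n, t in zip(names, combo)) for s in samples]
        precision, recall = precision_recall(flags, events)
        if recall == 1.0 and (best is None or precision > best[0]):
            best = (precision, dict(zip(names, combo)))
    return best
```

The exhaustive search grows exponentially with the number of parameters, which is one reason an integer programming solver is preferable for the full problem.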

Energy Term
Instability is a prerequisite for air mass thunderstorms and can be partially indicated by the Convective Available Potential Energy (CAPE) (Moncrieff and Miller, 1976). Although CAPE refers to a synoptic scale air mass, it has been shown that it can be used for smaller scale, local weather diagnosis and prediction (Zverev, 1972). CAPE practically defines how strong the updraughts within the thunderstorm can potentially be; stronger updraughts result in heavier rainfall (Wallace and Hobbs, 2006).

ACAPE Term
Using only the data that are available to operational forecasters in their daily duty, the Energy Term was developed in order to approximate the CAPE. An algorithm was built in MATLAB that accepts the Dry Temperature (T) and the Dew Point (Td) from the weather stations of Andravida, Araxos and Pyrgos as inputs and calculates a mean T and Td (Holton, 2004). The Lifted Condensation Level (LCL) is computed and, by simulating the wet adiabat, the algorithm finally computes the temperature (Tp) that the surface parcel would have if it were raised to the levels of 850, 700, 500 and 300 hPa. The Approximated CAPE (ACAPE) is the difference Tp - T, where T is here the temperature of the environment at the respective level, and refers to the four pressure levels ($ACAPE_{850}$, $ACAPE_{700}$, $ACAPE_{500}$ and $ACAPE_{300}$).

$$ACAPE_{p} = T_{p}(p) - T(p), \qquad p \in \{850, 700, 500, 300\}\ \text{hPa} \quad (1)$$
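A minimal sketch of how such an ACAPE calculation could look is given below. It is not the authors' MATLAB algorithm: the LCL is estimated with Bolton's (1980) approximation, the parcel follows a dry adiabat up to the LCL and a pseudo-adiabat above it (integrated with simple Euler steps), and the surface pressure is an additional input that the text does not mention. All function names and constants are assumptions made for illustration.

```python
import math

RD = 287.05   # gas constant of dry air, J kg^-1 K^-1
CP = 1005.0   # specific heat of dry air at constant pressure, J kg^-1 K^-1
LV = 2.5e6    # latent heat of vaporization, J kg^-1 (taken as constant)
EPS = 0.622   # ratio of the gas constants of dry air and water vapor

def saturation_vapor_pressure(t_k):
    """Saturation vapor pressure in hPa (Bolton, 1980 approximation)."""
    t_c = t_k - 273.15
    return 6.112 * math.exp(17.67 * t_c / (t_c + 243.5))

def lcl(t_k, td_k, p_hpa):
    """Temperature (K) and pressure (hPa) of the lifted condensation level."""
    t_lcl = 1.0 / (1.0 / (td_k - 56.0) + math.log(t_k / td_k) / 800.0) + 56.0
    p_lcl = p_hpa * (t_lcl / t_k) ** (CP / RD)
    return t_lcl, p_lcl

def moist_lapse(t_k, p_hpa):
    """dT/dp along a pseudo-adiabat, in K per hPa."""
    es = saturation_vapor_pressure(t_k)
    rs = EPS * es / (p_hpa - es)  # saturation mixing ratio, kg/kg
    numerator = RD * t_k + LV * rs
    denominator = CP + LV * LV * rs * EPS / (RD * t_k * t_k)
    return numerator / (denominator * p_hpa)

def parcel_temperature(t_k, td_k, p_sfc, p_target, dp=1.0):
    """Temperature (K) of the surface parcel lifted to p_target (hPa)."""
    t_lcl, p_lcl = lcl(t_k, td_k, p_sfc)
    if p_target >= p_lcl:
        # Still unsaturated: dry adiabatic ascent
        return t_k * (p_target / p_sfc) ** (RD / CP)
    steps = max(1, int(round((p_lcl - p_target) / dp)))
    step = (p_lcl - p_target) / steps
    t, p = t_lcl, p_lcl
    for _ in range(steps):
        t -= moist_lapse(t, p) * step
        p -= step
    return t

def acape(t_k, td_k, p_sfc, env_temps_k):
    """Parcel minus environment temperature at each level of env_temps_k."""
    return {p: parcel_temperature(t_k, td_k, p_sfc, p) - t_env
            for p, t_env in env_temps_k.items()}
```

In use, the mean T and Td of the three stations would be passed as `t_k` and `td_k` (in kelvin), and `env_temps_k` would hold the analysed temperatures at 850, 700, 500 and 300 hPa.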
It should be noted that there are many cases of severe thunderstorms with low and sometimes negative CAPE (Curry and Webster, 1999).
Moreover, in the specific case, it can be stated that a large negative $ACAPE_{850}$ is prohibitive for the development of a thunderstorm with heavy rainfall (Peppler, 1988). This finding can be modeled by requiring $ACAPE_{850} \geq -2.5$. At the level of 700 hPa, positive energy ($ACAPE_{700} > 0$) is a prerequisite, especially for the summer period, when the geopotential heights are higher and more energy is needed for heavy rainfall to form within the thunderstorm (Bol, 2006). A threshold of 1.5 was noticed for the summer period ($ACAPE_{700} > 1.5$). For the upper levels, smaller values of ACAPE indicate a smaller possibility of thunderstorm development. Thresholds of -2 and -8 were noted for the levels of 500 and 300 hPa respectively ($ACAPE_{500} > -2$, $ACAPE_{300} > -8$).
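The ACAPE thresholds above can be collected into a single check. The sketch below assumes that all four conditions must hold simultaneously (the text does not state how they are combined) and that the caller decides which dates count as the summer period; the function name is hypothetical.

```python
def energy_term_ok(acape, summer):
    """Check the ACAPE thresholds quoted in the text.

    acape: dict mapping pressure level (850, 700, 500, 300) to the ACAPE value
    summer: True when the stricter summer threshold at 700 hPa applies
            (the exact months are an assumption left to the caller)
    """
    thr_700 = 1.5 if summer else 0.0
    return (acape[850] >= -2.5
            and acape[700] > thr_700
            and acape[500] > -2.0
            and acape[300] > -8.0)
```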

Thickness Term
The thermal properties of the 850 to 500 hPa atmospheric layer are often better represented by the thickness than by the temperature at a single level (Wallace and Hobbs, 2006). The 850 to 500 hPa thickness is a function of the average temperature and the average moisture content of the air through the specific layer, which are the two properties that determine the virtual temperature. Therefore, the thickness between the level $z_1$ (with pressure $p_1$) and the level $z_2$ (with pressure $p_2$) is associated with the virtual temperature ($T_v$), as shown below:
$$z_2 - z_1 = \frac{R_d}{g} \int_{p_2}^{p_1} T_v \, d(\ln p) \quad (2)$$
where $R_d$ is the gas constant of dry air and $g$ is the acceleration due to gravity. The virtual temperature is also used for estimating the convective available potential energy, and its exclusion may lead to relatively important errors (Doswell and Rasmussen, 1994):
$$CAPE = \int_{z_{LFC}}^{z_{EL}} g \, \frac{T_{v,parcel} - T_{v,env}}{T_{v,env}} \, dz \quad (3)$$
where $z_{LFC}$ and $z_{EL}$ are the heights of the level of free convection and of the equilibrium level respectively, $T_{v,parcel}$ is the virtual temperature of the specific parcel, $T_{v,env}$ is the virtual temperature of the environment, and $g$ is the acceleration due to gravity.
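For a layer with a known mean virtual temperature, the hypsometric relation between thickness and virtual temperature reduces to a closed-form expression. The following sketch shows the calculation for the 850 to 500 hPa layer; the constants, the common approximation $T_v \approx T(1 + 0.61 r)$ and the function names are assumptions made for illustration, not taken from the study.

```python
import math

RD = 287.05   # gas constant of dry air, J kg^-1 K^-1
G = 9.80665   # standard acceleration due to gravity, m s^-2

def virtual_temperature(t_k, mixing_ratio):
    """Approximate virtual temperature (K); mixing_ratio in kg/kg."""
    return t_k * (1.0 + 0.61 * mixing_ratio)

def thickness(p_bottom_hpa, p_top_hpa, mean_tv_k):
    """Layer thickness in metres for a layer-mean virtual temperature."""
    return RD * mean_tv_k / G * math.log(p_bottom_hpa / p_top_hpa)

# A layer-mean virtual temperature of 260 K gives roughly 4040 m
dz = thickness(850.0, 500.0, 260.0)
```

A warmer or moister layer has a higher mean virtual temperature and therefore a larger thickness, which is why the thickness carries information on both properties.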

Consequently, the effect of the 850 to 500 hPa thickness on CAPE led us to include this indicator in the formation of the LII. For practical reasons, the seasonality of the thickness was removed by subtracting its moving average. It has been demonstrated that the resulting deseasonalized thickness should be less than 0 for the period from May to August (Eq. 4).
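Removing the seasonality by subtracting a moving average can be sketched as follows. The window length and the use of a centered window that shrinks at the edges of the series are assumptions, since the study does not specify them; the function names are hypothetical.

```python
def moving_average(values, window):
    """Centered moving average; near the edges only available neighbors are used."""
    half = window // 2
    averaged = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        averaged.append(sum(values[lo:hi]) / (hi - lo))
    return averaged

def thickness_anomaly(thickness_series, window=121):
    """Thickness minus its moving average (window is in 6-hour steps, assumed)."""
    trend = moving_average(thickness_series, window)
    return [z - m for z, m in zip(thickness_series, trend)]
```

Under this reading, a negative anomaly means that the layer is thinner, and hence colder or drier on average, than is typical for that time of year.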

Moisture Term
According to previous studies (Humphreys, 1926; Showalter and Fulks, 1943; Fawbush et al., 1951; Appleby, 1954; Whitney and Miller, 1956; Miller, 1967; Schaefer, 1986), low level moisture is a prerequisite for thunderstorm initiation and development. Usually, low level moisture increases instability, as more latent heat is available to the lower atmosphere. In some cases, however, the atmospheric instability can decrease, because moist air is less dense and therefore less able to evaporate precipitation than drier air. The evaporation of precipitation at or beneath cloud level cools the air inside precipitation downdrafts, making it denser, and increases instability, although the amount of precipitation is usually small. The minimum amount of moisture noticed in the recorded events, expressed as relative humidity, was 60% at 850 hPa, 40% at 700 hPa and 120% for their sum (see Fig. 4). Moisture below these thresholds is insufficient for heavy rainfall. Although increasing moisture increases the potential for heavy rainfall within the thunderstorm, moisture in the upper levels may decrease the instability (Bol, 2006) and is not taken into account, even though the values of the upper levels were found to be associated with the thunderstorms.
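The moisture thresholds can be written as a simple predicate. In the sketch below the comparisons are inclusive, on the assumption that the quoted values are the minima observed in the recorded events; the function name is hypothetical.

```python
def moisture_term_ok(rh_850, rh_700):
    """Check the relative humidity minima (in %) at 850 and 700 hPa and their sum."""
    return rh_850 >= 60.0 and rh_700 >= 40.0 and (rh_850 + rh_700) >= 120.0
```

Note that the sum criterion is the binding one: a profile sitting exactly at both single-level minima (60% and 40%) sums to 100% and does not pass.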

Terrain Heating Effect and Local Features
A lifting force is necessary for a rising parcel of air to overcome the convective inhibition, which occurs when a layer of warmer air lies above a particular region of air and hinders the cooler air parcel from ascending into the atmosphere (Mapes, 2000). Thus, a temperature inversion is created and, with it, a stable region of air. The lift mechanism pushes the cooler parcel of air above the inversion, contributing to the thunderstorm development. The main sources of a lift mechanism are associated with terrain features, heating and the sea breeze.

Terrain Heating Effect Term
A cold air mass with respect to the ground can be heated by it, which increases its instability, and vice versa. If an air mass is cooled by the terrain, it becomes denser and unfavorable for thunderstorm development (Kessler, 1983). Taking into account the terrain thermal conductivity with respect to heat storage (terrain heat capacity), the terrain heating effect is suggested to be modeled by a term $TH$ that is a function of $T_0$, $T_{-1}$ and $T_{-2}$, the temperatures of the terrain on the specific day, 1 day before and 2 days before respectively, at the same time. The weight factor for the temperature of the previous day is set to 2 and that for the temperature of 2 days before is set to 1; the weights decrease as the effect decreases with time.
The threshold was estimated at $TH = 2$, since greater values mean reduced instability. In the cases where the current is southwesterly, and is consequently supplying moisture to the area, the factor $TH$ becomes less effective, and this can be modeled by subtracting three degrees.
Locality Term
Apart from the local influences that were modeled within the aforementioned terms, a further term can be introduced for the dissolving effect of the easterly downdraft current due to the high mountains on the eastern parts of the examined area.

The LII was calculated on the basis of the data of the period 1-5-2006 to 31-10-2007. A total of 88 severe thunderstorm events were predicted. It is important to note that the actual number was 54 and all of them were predicted. Although the importance of 100% recall is controversial, the risk of missing a severe thunderstorm warning may prove hazardous, since injuries or fatalities and damage to structures or to the environment may not be prevented. The LII predicted 1418 no-thunderstorm events and the actual number was 1452.
The Consistency Table of the LII for the specific period is shown in Table 1 and its performance is described as follows:
- Precision = (the number of forecasted thunderstorms that occurred / the number of forecasted thunderstorms) = 61%
- Recall = (the number of thunderstorms that occurred and had been forecasted / the number of thunderstorms that occurred) = 100%
- Specificity, the complement of fall-out = (the number of cases with no thunderstorm that had not been forecasted / the number of cases with no thunderstorm) = 98%
The harmonic mean of precision and recall, the traditional F-measure or balanced F-score, is:
$$F = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \quad (7)$$
resulting in F = 76%. The most adequate measure for the case of severe phenomena is $F_{1.2}$, which weights recall 1.2 times as much as precision:
$$F_{1.2} = \frac{(1 + 1.2^2) \cdot Precision \cdot Recall}{1.2^2 \cdot Precision + Recall} \quad (8)$$
The LII was then calculated for the second period, 1-6-2008 to 30-6-2011. A total of 163 severe thunderstorms were predicted. During this period the actual number was 89 and the LII predicted all of them. The LII predicted 2165 no-thunderstorm events and the actual number was 2239. The Consistency Table of the LII for the specific period is shown in Table 2 and its performance is: Precision = 55%, Recall = 100%, Specificity = 97%. The balanced F-score is F = 71% and the weighted F-score, i.e. the total performance, is $F_{1.2}$ = 75%.
The performance of the LII per month is illustrated in Table 3. It is demonstrated that the LII performed very well for the months of May, June, September and October, when unstable weather conditions are more likely to occur. In these cases, most of the thunderstorm events took place around noon or in the afternoon, when the terrain heating effect is stronger.
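The reported scores can be reproduced from the counts given in the text. In the sketch below the contingency counts are derived from the reported totals under the stated 100% recall (for example, 88 forecasts and 54 events in the first period imply 34 false alarms and no misses); the function name is hypothetical, and the F-beta expression is the standard definition, assumed to be the one behind the reported $F_{1.2}$ because it reproduces the 75% of the second period.

```python
def scores(tp, fp, fn, tn, beta=1.2):
    """Verification scores from a 2x2 contingency table."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2.0 * precision * recall / (precision + recall)
    b2 = beta * beta
    f_beta = (1.0 + b2) * precision * recall / (b2 * precision + recall)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1, "f_beta": f_beta}

# First period: 88 forecasts, 54 events (all forecasted), 1418 correct non-events
first = scores(tp=54, fp=34, fn=0, tn=1418)
# Second period: 163 forecasts, 89 events (all forecasted), 2165 correct non-events
second = scores(tp=89, fp=74, fn=0, tn=2165)

for name, result in (("first", first), ("second", second)):
    print(name, {k: round(100 * v) for k, v in result.items()})
```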

The lower levels of the atmosphere were moist enough and the CAPE was suitable. During July and August of the specific period, only two thunderstorm events with heavy rainfall occurred. This was expected, as the atmosphere in the region is generally stable during these months, as previously explained. Although the performance of the LII for July and August is rather low, its use is still beneficial, taking into account the severity of the events and the fact that the recall of the LII is 100%.

Conclusions
This study presents an alternative methodological tool for the prediction of severe thunderstorms occurring over a specific area.The northwestern Peloponnese was chosen to illustrate the proposed tool because many thunderstorms with heavy rainfall have occurred with disastrous impacts.

The parameters used were restricted to those that are easily available to operational forecasters while performing their everyday duties. The statistical correlations of the parameters with the observations were examined. In the cases where the correlations were not justified by the relevant theory, the respective parameters were neglected. The Local Instability Index (LII) was then inferentially derived from the remaining parameters. The LII is a threshold function that consists of the low level moisture, a practical approximation of the CAPE, the terrain heating effect and formalized operational experience. It was found that the LII has a satisfactory total performance (75%) over the northwestern Peloponnese region for the period from 1-6-2008 to 30-6-2011 and predicted all the thunderstorm events with heavy rainfall (recall = 100%).
The future challenge for the further development and optimization of this tool is to test the LII over a longer period and for hydrological basins all around Greece, since in the case of good performance the LII would be placed at the disposal of the operational forecasters of HNMS.
Variables used by the GMDH algorithm for the missing Dry Temperature observations:
- the temperature at 850 hPa at the time of the missing observation ($T_{850,0}$)
- the 24-hour trend of $T_{850}$ at the specific time relative to the same time of the previous and of the next day ($T_{850,0} - T_{850,-24}$ and $T_{850,+24} - T_{850,0}$)
- the Dry Temperature at the same time of the next day ($T_{+24}$) and of the previous day ($T_{-24}$)
- the 24-hour trend of the next 6-hour Dry Temperature relative to the corresponding hour of the next day ($T_{+30} - T_{+6}$)
- the 24-hour trend of the previous 6-hour Dry Temperature relative to the corresponding hour of the previous day ($T_{-6} - T_{-30}$)
- the 6-hour Wind Runs at the same time, 24 hours before and 24 hours after.

Fig. 1. Map of the examined area. The locations of the stations are displayed. The points A, B, C, D define the area of lightning data.