Articles | Volume 20, issue 9
Review article
16 Sep 2020

Review article: A comprehensive review of datasets and methodologies employed to produce thunderstorm climatologies

Leah Hayward, Malcolm Whitworth, Nick Pepin, and Steve Dorling

Thunderstorm and lightning climatological research is conducted with a view to increasing knowledge about the distribution of thunderstorm-related hazards and gaining an understanding of the environmental factors that increase or decrease their frequency. There are three main methodologies used in the construction of thunderstorm climatologies: thunderstorm frequency, thunderstorm tracking and lightning flash density. These approaches utilise a wide variety of underpinning datasets and employ many different methods ranging from correlations with potential influencing factors and mapping the distribution of thunderstorm day frequencies to tracking individual thunderstorm cell movements. Meanwhile, lightning flash density climatologies are produced using lightning data alone, and these studies therefore follow a more standardised format. Whilst lightning flash density climatologies are primarily concerned with the occurrence of cloud-to-ground lightning, the occurrence of any form of lightning confirms the presence of a thunderstorm and can therefore be used in the compilation of a thunderstorm climatology. Regardless of approach, the choice of analysis method is heavily influenced by the coverage and quality (detection efficiency and location accuracy) of available datasets as well as by the controlling factors which are under investigation. The issues investigated must also reflect the needs of the end-use application to ensure that the results can be used effectively to reduce exposure to hazard, improve forecasting or enhance climatological understanding.

1 Introduction

Thunderstorms have the potential to produce hazardous weather. All thunderstorms produce lightning, whilst the presence of other weather hazards such as wind, hail, heavy rain and snow can vary with geographic, climatic and synoptic conditions. The intensity of these hazards may vary by region and time of the year and, indeed, from storm to storm. This hazardous weather can cause flooding; damage to property, infrastructure and crops; disruption to transport and outdoor maintenance; and injury and threat to life (Elsom et al., 2018; Piper et al., 2016). One example was the death of a hiker on a ridge in Glencoe, Scotland, in June 2019 (Halliday, 2019). The July 2019 Latitude Festival in England was halted for an hour for safety reasons due to local lightning risk (BBC, 2019) and in that same month seven deaths, 140 injuries and severe damages were caused by a thunderstorm in Greece with high winds, hail and intense rainfall, overturning cars, felling trees, causing flooding and damaging houses and roofs (Giordano, 2019).

Figure 1 is a Venn diagram of weather hazards in a convective cell. This shows that all thunderstorm convective cells must produce lightning to distinguish them from an ordinary convective cell (Doe, 2016). Where precipitation or wind hazards occur without lightning, they are the result of non-electrical convective activity and beyond the scope of this review.

Figure 1. Venn diagram of the relationship between convective weather hazards and how thunderstorms are distinguished from ordinary convection by electrical hazards.


Thunderstorm climatology research usually falls into one of three categories: thunderstorm frequency, thunderstorm tracking and lightning flash density (lightning strikes per square kilometre per year). Studies may sometimes utilise more than one approach and thus boundaries between the three can be blurred. Whilst thunderstorm frequency and tracking are concerned with the thunderstorm as a whole and all the hazards therein, lightning flash density is usually concerned exclusively with cloud-to-ground lightning hazards. Intra-cloud and cloud-to-cloud lightning strikes are not included because the focus of such work is on the risk to human life, property and industry. Lightning flash density and lightning frequency are, however, a form of thunderstorm climatology, because lightning is the only product of a thunderstorm which is unique to its diagnosis.

Producing and communicating the results of thunderstorm climatologies increases public and expert understanding of thunderstorm hazards and how to best reduce associated risks (Brooks et al., 2018). They provide important information for those who may be most exposed to thunderstorm hazards such as outdoor workers and those pursuing outdoor recreation as well as industries which may be vulnerable to disruption such as the power sector, construction and farming (Elsom and Webb, 2017). Preparedness may take different forms, from planning the most appropriate time of year to conduct outdoor maintenance or the most appropriate time of the day to start a hike to local authorities ensuring that drains and other defences are working efficiently prior to the most active thunderstorm times of year.

Accurately diagnosing the weather hazards that are the direct result of thunderstorms can be a challenge, because other than lightning, some precipitation and wind hazards can also be present without a thunderstorm. To ensure the correct diagnosis of thundery convection and the accurate assessment of the spatial and temporal distribution of thunderstorms, climatologists utilise a variety of datasets and methods. Choosing the most appropriate analysis approach and dataset is key to obtaining results that (a) best reflect the distribution of the hazard concerned and (b) are useful to the intended end user.

The purpose of the paper is to conduct a systematic and comprehensive review of the datasets and methodologies applied to create thunderstorm climatologies. This review aims to assist those at the design stage of their research and those new to the subject area to become familiar with the strengths and weaknesses of the available data types, to consider which climatological approach best fits their research goal, and to identify potential alternative approaches which may not have previously been considered. Whilst there are existing reviews available in this subject area (Betz et al., 2009; Cummins and Murphy, 2009; Ellis and Miller, 2016; Nag et al., 2015), these tend to focus on the analysis of a particular dataset, data type or methodology. This paper, in contrast, fills a gap in the literature by providing an overview of the whole subject area to help the reader to subsequently move on to more specific and detailed examples. Lastly, recommendations for research areas which require development are made.

To fulfil the above purposes, we first review the dataset types in Sect. 2, before then moving on to evaluating how different dataset types have been applied in compiling thunderstorm frequency climatologies (Sect. 3) and thunderstorm tracking (Sect. 4). Section 5 reviews the methods used to produce lightning flash density climatologies, using one dataset type: lightning remote sensing data. This section also includes a review on how lightning flash density results have correlated with potential drivers of thunderstorm formation, such as topography, which thereby introduces further methods and datasets. Recommendations for study design are contained in Sect. 6 and future research areas outlined in Sect. 7.

2 Data

Thunderstorm climatologies have traditionally been compiled and analysed using records kept by spotter networks which report thunder heard and lightning seen in different locations (Enno, 2015). Technology has progressed to include radar, satellite sensing and lightning location networks. As a result, research has developed to include information such as cell movement (Lock and Houston, 2015), hazard intensity (Ellis and Miller, 2016), and spatial and temporal extent (Galanaki et al., 2018). Tables 1 to 4 provide a summary of strengths and weaknesses of the main dataset types discussed below. Figure 2 provides a checklist of issues to consider when choosing an appropriate dataset. In the following discussion, for each of the three main approaches, we consider the use of different dataset types including manual reports, radar and satellite approaches, and model reanalyses.

Figure 2. Checklist of questions to consider when choosing the most appropriate dataset.


2.1 Manual records: spotter networks and archives

Spotter networks can range from professional observations, such as weather records made at airports (Pinto, 2015), to crowdsourcing reports from enthusiasts, experts and members of the public, as undertaken by The Tornado and Storm Research Organisation (TORRO) in the UK. The type of data recorded can include thunder heard, lightning seen, thunderstorm cell movement and severe weather observations. Archive data are similar to spotter networks in that they rely on human observation, but they do not necessarily form part of an organised network and may take many different forms such as academic papers (Gray and Marshall, 1998), newspaper articles and historical diaries (Munzar and Franc, 2003). This kind of data can help verify other observations or extend records back in time but can also suffer from sporadic coverage in both time and space as well as being difficult to gather and classify consistently (Schuster et al., 2005). Satellite and radar technology, where available, is sometimes used in combination with human observations to provide complementary information such as identifying whether observations at different locations are the result of the same thunderstorm (Tippett et al., 2015). Table 1 provides a summary of advantages and disadvantages of manual records for the purposes of compiling lightning and thunderstorm climatologies.

Table 1. Strengths and weaknesses of manual observations used to produce thunderstorm and lightning climatologies.


2.2 Thunderstorm remote sensing: satellite and radar

Satellite and radar data are often used as a primary source of information for compiling thunderstorm distributions. For satellite sensing, in the absence of additional data to confirm whether convection is thundery, cloud-top temperatures are analysed to identify those cold enough to likely be a thunderstorm (Bedka, 2011; Gray and Marshall, 1998). For radar, a thunderstorm is diagnosed by identifying the reflectivity values that are most likely to be attributed to a thunderstorm; examples include 40 dBZ reflectivity value (Haberlie et al., 2016) and 46 dBZ (55 dBZ for a thunderstorm with hail) (Wapler and James, 2015). Diagnosing thunderstorms using satellite and radar data in isolation therefore provides a probable (but not definitive) thunderstorm distribution. Alternative datasets such as ground-based lightning location systems provide absolute confirmation that a convective cloud is a thunderstorm, because lightning is a necessary condition for a thunderstorm (Houston et al., 2015). Lightning information can be used to assess the success of different temperature and reflectivity values in discriminating thunderstorm cells or it can be used in place of temperature or reflectivity values to discriminate thunderstorm cells that can then be tracked by radar once identified. Table 2 provides a summary of advantages and disadvantages of remote sensing data for the purposes of compiling lightning and thunderstorm climatologies.

Table 2. Strengths and weaknesses of thunderstorm remote sensing (satellite and radar) used to produce thunderstorm and lightning climatologies.


2.3 Lightning remote sensing: satellite and ground-based lightning location systems

Lightning location systems were first established several decades ago to collect data on lightning activity. Lightning data quality is primarily assessed by calculating detection efficiency (DE) and location accuracy (LA). Detection efficiency is the percentage of the total number of lightning flashes or strokes a system detects, and location accuracy is the median distance error of detected lightning location. Satellite-based lightning location systems detect lightning using an imaging sensor measuring the near-infrared spectrum over a large field of view (Nag et al., 2015). This type of system is thought to have a high detection efficiency relative to ground-based systems (Bitzer et al., 2016). However, because, until recently, the satellites detecting lightning have been in a low earth orbit, they do not provide continuous temporal coverage, only detecting lightning in an area as the satellite passes over. They also have a relatively low orbital inclination (near the Equator), which means they do not cover higher latitudes (Thompson et al., 2014). High-earth-orbit geostationary satellites in the GOES programme were launched in 2016 and 2017, providing continuous lightning monitoring over the Americas and Pacific and Atlantic oceans (Goodman et al., 2012). Coverage is a function of instrument range and the areas observable from the instrument's position.
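The two quality measures defined above can be illustrated with a minimal sketch. It assumes that a reference dataset of "true" flashes is available and that detected flashes have already been matched to reference flashes; the function names and the sample coordinates are hypothetical and are not taken from any particular lightning location system.

```python
import math
from statistics import median

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def detection_efficiency(n_detected, n_reference):
    """Percentage of reference flashes that the system detected."""
    if n_reference <= 0:
        raise ValueError("n_reference must be positive")
    return 100.0 * n_detected / n_reference

def location_accuracy_km(pairs):
    """Median distance error over (reference, detected) position pairs."""
    errors = [haversine_km(ref[0], ref[1], det[0], det[1]) for ref, det in pairs]
    return median(errors)

# Hypothetical matched flashes: ((ref_lat, ref_lon), (det_lat, det_lon)).
# Three of four reference flashes were detected in this made-up sample.
matched = [
    ((50.00, 0.00), (50.00, 0.00)),
    ((50.00, 0.00), (50.01, 0.00)),
    ((50.00, 0.00), (50.03, 0.00)),
]
de = detection_efficiency(n_detected=len(matched), n_reference=4)
la = location_accuracy_km(matched)
print(f"DE = {de:.1f} %, LA = {la:.2f} km")
```

In practice the difficult step is obtaining the reference dataset and matching flashes to it, which this sketch takes as given.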

Ground-based systems use sensors to detect the electromagnetic waves that propagate through the atmosphere between the ground and the ionosphere (Hudson et al., 2016). Long-range lightning location systems detect electromagnetic waves in the low- and very-low-frequency range. This is because low-frequency waves can travel significant distances (up to 6000 km) without significant attenuation (Said et al., 2010). The lightning strike location and time are determined either by using the arrival times of the wave at several sensors to calculate the distance travelled or by measuring the angle from which the wave arrives at each sensor to triangulate the origin point. These data can be collected continuously and made available in real time. Table 3 provides a summary of advantages and disadvantages of lightning remote sensing for the purposes of compiling lightning and thunderstorm climatologies.
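The arrival-time principle described above can be sketched as a simple grid search. This is an illustrative toy rather than the algorithm of any operational network: it assumes a flat plane with coordinates in kilometres, straight-line propagation at the speed of light, error-free timing and a hypothetical four-sensor layout. For each candidate location it computes the emission time implied by every sensor and keeps the location where those implied times agree best.

```python
import math

C_KM_PER_S = 299_792.458  # speed of light in km/s

def locate_by_arrival_time(sensors, arrival_times, extent_km=500.0, step_km=5.0):
    """Grid search for the source position whose implied emission times agree best.

    sensors: list of (x_km, y_km); arrival_times: seconds, one per sensor.
    Returns (x_km, y_km, emission_time_s).
    """
    best = None
    n = int(extent_km / step_km)
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            x, y = i * step_km, j * step_km
            # Emission time each sensor would imply if the source were at (x, y)
            t0 = [t - math.hypot(x - sx, y - sy) / C_KM_PER_S
                  for (sx, sy), t in zip(sensors, arrival_times)]
            mean_t0 = sum(t0) / len(t0)
            spread = sum((v - mean_t0) ** 2 for v in t0)
            if best is None or spread < best[0]:
                best = (spread, x, y, mean_t0)
    return best[1], best[2], best[3]

# Hypothetical sensor layout and a synthetic strike at (100, -50) km at t = 0 s
sensors = [(-300.0, -300.0), (300.0, -300.0), (0.0, 400.0), (-200.0, 300.0)]
true_x, true_y = 100.0, -50.0
arrivals = [math.hypot(true_x - sx, true_y - sy) / C_KM_PER_S for sx, sy in sensors]
est_x, est_y, est_t0 = locate_by_arrival_time(sensors, arrivals)
print(est_x, est_y, est_t0)
```

Real systems must additionally account for timing noise, Earth curvature and propagation effects, all of which contribute to the location error discussed in Sect. 2.3.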

Table 3. Strengths and weaknesses of lightning remote sensing (satellite and ground-based) used to produce thunderstorm and lightning climatologies.


2.4 Thunderstorm indices (proxy data) utilising reanalysis data

One last dataset type to consider is reanalyses. Reanalyses use climate data from a large array of sources to model changing climate variables over a long time period. This provides a consistent spatial and temporal resolution over multiple decades, allowing climate processes to be studied (Dee et al., 2016). Reanalysis data have been used in conjunction with other thunderstorm climatologies to identify the synoptic conditions that promote thunderstorm formation or which influence their behaviour in particular regions (Wapler and James, 2015). The variables used to classify these synoptic conditions into 29 weather patterns were mean-sea-level pressure, geopotential height at 500 hPa, 500–1000 hPa relative thickness and total column precipitable water. Another approach is to calculate average daily values of relevant reanalysis variables such as 500 and 1000 hPa geopotential heights, 500 hPa air temperature and the instability index known as CAPE (convective available potential energy) for a given temporal resolution (Gatidis et al., 2018). Reanalyses can also be used to obtain a longer climatology of thunderstorms by developing indices as proxies of thunderstorm activity (Kaltenböck et al., 2009; Kunz, 2007). This can also allow models of future thunderstorm trends to be developed (Tippett et al., 2015). Different indices may be more or less successful either in general or in different regions and seasons. An example of a commonly used index is CAPE, which uses two of the three main ingredients for deep moist convection (namely instability, moisture and lift) to evaluate the thunderstorm potential of environmental conditions (Moncrieff and Miller, 1976). The numerical CAPE value indicates the atmospheric potential to produce thunderstorms, either looking at current conditions for forecasting or reconstructing the atmospheric conditions of the past for climatology (Holley et al., 2014). Table 4 provides a summary of advantages and disadvantages of thunderstorm indices for the purposes of compiling lightning and thunderstorm climatologies.
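As an illustration of how a CAPE value is obtained, CAPE is conventionally the vertical integral of the buoyancy of a lifted air parcel over the layer in which the parcel is warmer than its environment. The sketch below assumes that virtual temperature profiles for the parcel and the environment are already available on common height levels (deriving the parcel profile is the harder step and is not shown), and it approximates the integral with the trapezoidal rule after setting negative buoyancy to zero. The function name and the sample profile are hypothetical.

```python
G = 9.81  # gravitational acceleration in m s^-2

def cape_j_per_kg(heights_m, tv_parcel_k, tv_env_k):
    """Approximate CAPE (J/kg) as the trapezoidal integral of positive buoyancy.

    heights_m: increasing heights in metres.
    tv_parcel_k, tv_env_k: virtual temperatures (K) of the lifted parcel and
    of the environment at those heights.
    """
    buoyancy = [G * (tp - te) / te for tp, te in zip(tv_parcel_k, tv_env_k)]
    total = 0.0
    for k in range(len(heights_m) - 1):
        # Only positively buoyant layers contribute to CAPE
        b_lo = max(buoyancy[k], 0.0)
        b_hi = max(buoyancy[k + 1], 0.0)
        total += 0.5 * (b_lo + b_hi) * (heights_m[k + 1] - heights_m[k])
    return total

# Made-up three-level profile: the parcel is 2.8 K warmer at the middle level
heights = [1000.0, 2000.0, 3000.0]
tv_env = [280.0, 280.0, 280.0]
tv_parcel = [280.0, 282.8, 280.0]
cape = cape_j_per_kg(heights, tv_parcel, tv_env)
print(f"CAPE = {cape:.1f} J/kg")
```

With reanalysis data the same calculation would be repeated for every grid point and time step, which is what allows a multi-decadal record of thunderstorm-promoting conditions to be built.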

Table 4. Strengths and weaknesses of proxy datasets used to produce thunderstorm and lightning climatologies.


Given the variety of datasets and the advantages and disadvantages of each, both the method and the use of data must be carefully considered in light of the overall goal of the research and the characteristics of the study area itself. For example, in Australia some regions are so remote that there are no continuous human thunderstorm observation data, making it impossible to achieve a long climatological record using direct observations of thunderstorms (Allen and Karoly, 2014). For the purposes of analysing the effect of ENSO events, a long record is essential, so the method in this event is dictated by the only dataset available in that study area suitable to achieve the goals of the research, namely reanalysis data.

3 Thunderstorm frequency

A wide variety of different methods have been used when creating a climatology of thunderstorms focused on thunderstorm days or thunderstorm frequency. This variation is due to differences in how a thunderstorm day is diagnosed or defined and how different datasets can be employed in this regard. Figure 3 provides a diagrammatic summary of the different variables to consider during the design of a thunderstorm frequency climatology.

Figure 3. Diagrammatic summary of the potential research findings and data utilisation for a thunderstorm climatology created using either thunderstorm frequency or thunderstorm tracking methodologies.


3.1 Manual observation

Human observations and archives produce the longest observational record, and this enables analysis of long-term trends in occurrence and correlation of thunderstorm frequency with long-term cycles and climate signals such as ENSO (Tippett et al., 2015). Correlations with such cycles may help with the predictability of thunderstorm activity. Pinto (2015) was also able to identify increasing thunderstorm activity in areas of urban heat island development from growing cities in Brazil. In the USA, observational records exist for over 100 years and, after checking that any variations in the data were not the result of data collection inconsistencies, long-term fluctuations demonstrated an overall decrease in thunderstorms over a 40-year period (Changnon, 2001). Nevertheless, this inter-annual variability in thunderstorm activity was found to vary regionally within the USA, and six distinct time series were identified with peaks in activity all occurring in different years and showing a marked difference to the overall national trend. This difference highlights the importance of considering different spatial scales when producing a thunderstorm climatology.

Different studies define thunderstorm days, hours and onset times in alternative ways. For example, a thunderstorm day has been defined as thunder heard at least once in a 24 h period (Enno et al., 2013), and a thunderstorm is noted to begin when first observed and to end 15 min after the last thunder is heard (Enno et al., 2013). There is the potential for “false alarms” if there is only one instance of thunder heard because other noises may be mistaken for thunder. When counting the number of thunderstorms in a day, observations must be separated in time and space to ensure that this is done correctly (Bielec-Bąkowska, 2003). If thunderstorms start and end on different days, consideration should be given to the purpose of the research; if this is to identify the probability of days with thunderstorms then both days can be counted. However, if the frequency of thunderstorms is of more importance, attributing each thunderstorm to the single most appropriate day will avoid night-time thunderstorms being counted twice, which would otherwise inflate thunderstorm-day frequency in regions where such storms are common.
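The counting choices described above can be made concrete with a short sketch. It assumes a list of thunder-heard timestamps from a single station and interprets the 15 min rule as follows: a new report within 15 min of the previous one continues the same storm, and the storm ends 15 min after its last report. The switch between counting both calendar days and counting only the onset day is one hypothetical way of choosing the "most appropriate" day; the function names are illustrative.

```python
from datetime import datetime, timedelta

def split_into_storms(thunder_times, gap=timedelta(minutes=15)):
    """Group thunder reports into storms, returned as (start, end) pairs.

    A storm ends `gap` after its last report; a later report starts a new storm.
    """
    storms = []
    for t in sorted(thunder_times):
        if storms and t - storms[-1]["last"] <= gap:
            storms[-1]["last"] = t
        else:
            storms.append({"start": t, "last": t})
    return [(s["start"], s["last"] + gap) for s in storms]

def thunderstorm_days(storms, count_both_days=True):
    """Calendar days with a thunderstorm.

    count_both_days=True suits 'probability of a day with thunder';
    False attributes each storm to its onset day only (an assumption).
    """
    days = set()
    for start, end in storms:
        days.add(start.date())
        if count_both_days:
            days.add(end.date())
    return days

# Hypothetical reports: one storm crossing midnight, one afternoon storm
reports = [
    datetime(2020, 7, 1, 23, 50),
    datetime(2020, 7, 1, 23, 58),
    datetime(2020, 7, 2, 0, 5),
    datetime(2020, 7, 3, 14, 0),
]
storms = split_into_storms(reports)
days_both = thunderstorm_days(storms, count_both_days=True)
days_onset = thunderstorm_days(storms, count_both_days=False)
print(len(storms), len(days_both), len(days_onset))
```

In this made-up record the midnight-crossing storm adds an extra thunderstorm day under the first convention but not under the second, which is the inflation effect described in the text.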

As shown in Table 1, human observations may contain data from multiple stations, potentially over large areas and in some cases continents, which poses issues with regard to bias and inhomogeneity of data (Schuster et al., 2005; Tuovinen et al., 2009). A European study over a 4-year period utilised records from several different countries and showed that there was likely to be a variable bias due to different data collection techniques (van Delden, 2001). To correct for this, the frequency of thunderstorms per 1000 weather reports at each station was calculated in the belief that this would help correct bias incurred by weather stations being manned inconsistently. Other statistical methods used include filling any data gaps using correlation with nearby stations (those that show the closest temporal synchronicity) and testing the homogeneity of the data to help choose which stations to include and to exclude stations which have large data gaps (Enno et al., 2013). Enno et al. (2013) produced a climatology of almost 50 years, which showed clear temporal trends and distributions that could be linked with three main thunderstorm regimes.
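The normalisation per 1000 weather reports amounts to a simple rate, sketched below with made-up station counts. The station names and numbers are hypothetical; the point is only that a station which reports less often is not automatically assigned a lower thunderstorm frequency.

```python
def thunder_per_1000_reports(thunder_reports, total_reports):
    """Thunderstorm reports per 1000 weather reports issued by a station."""
    if total_reports <= 0:
        raise ValueError("total_reports must be positive")
    return 1000.0 * thunder_reports / total_reports

# Hypothetical stations: (reports mentioning thunder, all weather reports)
stations = {"A": (12, 8000), "B": (9, 3000)}
rates = {name: thunder_per_1000_reports(*counts) for name, counts in stations.items()}
print(rates)
```

Station B logs fewer thunder reports than station A in absolute terms, yet its normalised rate is twice as high because it was reporting far less often.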

3.2 Remote sensing: satellite and radar

Radar reflectivity values are used to quantify the severity of convective events including thunderstorms (Tippett et al., 2015) and to diagnose mesoscale convective systems (Punkka and Bister, 2015), catalogue the percentage of thunderstorms that become intense, and identify thunderstorm initiation times and duration (Mohee and Miller, 2010). In Texas, radar was used to establish a link between the presence of human-made reservoirs and thunderstorm initiation, with the caveat that the reflectivity threshold must be sustained for at least 30 min (Haberlie et al., 2016). The benefit of radar data over human observation is increased confidence for establishing onset times, geographical extent and precise location of the storm. In contrast, with radar data it can be more difficult to distinguish a thunderstorm from an ordinary convective cell by only measuring precipitation intensity, because some very heavy precipitation is not associated with thunderstorms. Satellite imagery can be used in much the same way as radar to identify thunderstorms because it shows the convective area through cloud presence (Gray and Marshall, 1998); cloud-top temperatures below −32 °C are used to identify mesoscale convective systems and those below −52 °C are used to classify mesoscale convective complexes (a particularly severe form of mesoscale convective system). Severe weather reports associated with thunderstorms have been matched to convective areas in satellite imagery which are significantly colder than the surrounding cloud area and therefore identified as the updraught from deep moist convection (Bedka, 2011).
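A sustained-threshold test of the kind described for radar data can be sketched as follows. The 40 dBZ threshold and the 30 min duration are the values quoted in the text; the regular 5 min scan interval, the convention that a run of n consecutive scans spans (n − 1) intervals and the function names are assumptions made for illustration.

```python
def longest_run_minutes(dbz_series, step_min, threshold_dbz):
    """Duration (min) of the longest unbroken run at or above the threshold.

    A run of n consecutive scans is taken to span (n - 1) * step_min minutes.
    """
    longest = current = 0
    for value in dbz_series:
        current = current + 1 if value >= threshold_dbz else 0
        longest = max(longest, current)
    return max(longest - 1, 0) * step_min

def is_sustained(dbz_series, step_min=5.0, threshold_dbz=40.0, min_duration_min=30.0):
    """True if the reflectivity threshold is sustained for the required time."""
    return longest_run_minutes(dbz_series, step_min, threshold_dbz) >= min_duration_min

# Hypothetical maximum-reflectivity series for two cells, one value per scan
cell_a = [35, 41, 42, 45, 44, 43, 41, 40, 38]
cell_b = [41, 42, 30, 44, 45]
print(is_sustained(cell_a), is_sustained(cell_b))
```

Requiring persistence in this way filters out brief reflectivity spikes, at the cost of missing short-lived cells.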

3.3 Remote sensing: satellite and ground-based lightning location systems

Lightning data are commonly used in lightning flash density thunderstorm climatologies. However, there can sometimes be an overlap between lightning flash density and thunderstorm frequency, when lightning data are used to identify thunderstorm days (also referred to as lightning days). A thunderstorm day or lightning day is defined by a certain number of lightning events per day and per area. A reasonable minimum threshold of lightning strikes per area is important because a single strike might be the result of false detection. A successful threshold can be verified with alternative datasets such as human observation and radar; Wapler and James (2015) found that a threshold of two lightning strokes within a 15 km radius was the most effective.
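A lightning-day threshold of this kind might be applied as in the sketch below. It assumes a list of located strokes with timestamps and counts, for a single site, the calendar days on which at least two strokes fall within 15 km; the exact way the cited study applied its threshold is not reproduced here, and the function names and sample strokes are hypothetical.

```python
import math
from collections import Counter
from datetime import datetime

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def lightning_days(strokes, site_lat, site_lon, radius_km=15.0, min_strokes=2):
    """Sorted dates with at least `min_strokes` strokes within `radius_km` of the site.

    strokes: iterable of (datetime, lat, lon).
    """
    per_day = Counter(
        when.date()
        for when, lat, lon in strokes
        if haversine_km(site_lat, site_lon, lat, lon) <= radius_km
    )
    return sorted(day for day, n in per_day.items() if n >= min_strokes)

# Hypothetical strokes around a site at 50.0 N, 0.0 E
strokes = [
    (datetime(2019, 6, 1, 14, 0), 50.05, 0.0),   # about 5.6 km away
    (datetime(2019, 6, 1, 14, 10), 50.10, 0.0),  # about 11 km away
    (datetime(2019, 6, 2, 9, 0), 50.02, 0.0),    # single nearby stroke
    (datetime(2019, 6, 3, 12, 0), 50.50, 0.0),   # about 56 km away
    (datetime(2019, 6, 3, 12, 5), 50.50, 0.1),   # also outside the radius
]
days = lightning_days(strokes, 50.0, 0.0)
print(days)
```

Only the first day qualifies: the second has a single stroke, which could be a false detection, and the third has strokes that are too distant.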

Thunderstorm or lightning days can also be used within a lightning flash density study to establish whether a high-lightning area is the result of frequent storms (with attendant high probability of lightning) or less frequent but very intense storms (Soula et al., 2016; Taszarek et al., 2015; Vogt, 2014; Xia et al., 2015). In addition, it can also highlight areas that suffer from frequent thunderstorms which produce only a small amount of lightning, but which may produce other types of hazardous weather such as heavy rain (Xia et al., 2015). It is also useful to ascertain if there are particular regions that favour production of severe thunderstorms (Taszarek et al., 2015). With this in mind, knowing whether there are regions that have a lower detection efficiency (percentage of lightning detected by a lightning location system) can be important. This is because whilst details on storm intensity (number of lightning strikes per storm) are an advantage of lightning data, spatial variations in detection efficiency may bias the results when comparing storms over a large area. Careful validation of results should be undertaken through comparison with other complementary datasets. Also, as lightning location networks have developed more substantially over time, manned thunderstorm observation stations have reduced in number (Enno, 2015) so ascertaining how best to combine manual observations with lightning data may be necessary to maintain a long record. In the USA the two datasets correlate best in areas with high lightning activity (Reap, 2002). For northern Europe it was concluded that the optimum distance for lightning data to correlate with manual records kept by weather stations was in the range of a 9–14 km radius of the observation station depending on the station location (Enno, 2015). It seems that combining two datasets to obtain a long record should be done with caution, and the compatibility of the datasets should be assessed on a case-by-case basis.

Studies use multiple datasets not only to extend the record in time but also to obtain more detail in relation to a thunderstorm climatology. Human observations and records can include details of damage and observations of severe weather events, which when compared to lightning data can be used to classify the severity of a thunderstorm (Kaltenböck et al., 2009). It was noted that this approach is only likely to be successful in populated areas where severe weather and damage were more likely to be recorded and observed.

3.4 Thunderstorm indices (proxy data) utilising reanalysis data

Reanalyses, such as ERA5 European Reanalysis data, have assimilated observational records of land, ocean and atmospheric variables into models from a large variety of observational sources since 1979 and in 2020 will have extended the record back to 1950 (Hersbach et al., 2019). They also have been employed to identify the atmospheric conditions common to regions and seasons of high thunderstorm activity. This does not produce a thunderstorm frequency climatology because there are no direct records of thunderstorm activity. However, they can produce a frequency of thunderstorm-promoting conditions. In Australia, reanalysis data were used to reconstruct a climatology of the atmospheric environment conducive to the development of severe thunderstorms (Allen and Karoly, 2014). This insight assists forecasters in identifying the conditions that have a high probability of generating a hazardous thunderstorm. Indices such as CAPE or LI (lifted index) can be used to predict thunderstorm occurrence based on the atmospheric conditions, and if generated from reanalysis data then a long record can be produced of the potential for thunderstorm formation, which should ideally then be ground-truthed against measurement data. In southwest Germany different indices were tested against severe thunderstorms identified in SYNOP weather station data, radar data and damage reports to ascertain which index or indices work best in which scenarios (Kunz, 2007). This has also been done on a continental scale for the whole of Europe using lightning location system data, severe storm reports and weather forecast model output data to verify the degree to which indices can reliably predict thunderstorms (Kaltenböck et al., 2009). In the USA reanalysis data and indices were used to identify conditions with a high probability of producing severe thunderstorms (defined by hail size, gust speed or tornado damage) (Brooks et al., 2003). These findings were then applied to Europe to produce a climatology of conditions which have the highest probability of producing severe thunderstorms. The results agreed with thunderstorm frequency work that has been done in Europe; however, without a long-term Europe-wide climatology the success of this approach remains uncertain.

4 Thunderstorm tracking

Another useful approach is the reconstruction of thunderstorm tracks, recording the thunderstorm movement which is typical in a specific region, synoptic pattern or time period (season, time of day, month, etc.). This might include data such as thunderstorm life cycle duration (an individual cell or multi-cell thunderstorm), direction of travel, speed and development of intensity (such as lightning or rainfall hazards throughout the life of the storm) and can also include a form of thunderstorm frequency (how often a thunderstorm tracks through a particular area) (Galanaki et al., 2018; Gray and Marshall, 1998). This type of information can help forecasters to identify areas at risk of thunderstorm hazards or assist with nowcasting (predicting the movement of an existing storm based on the previous trajectory of the cell), or general climatology. Figure 3 provides a diagrammatic summary of the different variables to consider during the design stage of thunderstorm tracking research.

4.1 Manual observations

Tracking may be possible using manual observations and archive information but it is problematic to connect thunderstorms from one observation location to another and to confidently identify them as the same storm. Therefore these data are often used in combination with other datasets such as satellite and radar (Gray and Marshall, 1998). This study enabled the reconstruction of mesoscale convective system (MCS) tracks over a 16-year period in the UK. An MCS is a collection of thunderstorm cells which make up a continuous storm area that extends over 100 km in at least one direction (Doe, 2016). The benefit of using this combined dataset in this case was that as the UK experiences infrequent MCSs a long period was required to obtain enough tracks for a climatology. The human observations provided confirmation that satellite and radar data diagnosis of a thunderstorm occurrence is correct. This was later updated for a further 17-year period (Lewis and Gray, 2010) to provide a database of MCS tracks for a total of 23 years for the UK. The climatology is used to identify trends in origin points for storms, duration, and start and end times of storms and to link trends in behaviour to specific synoptic conditions. In this case, inclusion of satellite and radar provided additional confidence, but it was noted that some MCSs may have been diagnosed incorrectly because where only human reports were available, multiple but separate scattered thunderstorms may produce a similar distribution of reports to an MCS.

4.2 Remote sensing: satellite imagery and radar

Radar and satellite imagery are often used to track thunderstorm cells in real time for the purpose of nowcasting (anticipating the next most likely movement of the cell) using 3-D reflectivity profiles to define the extent and structure of a thunderstorm (Dixon and Wiener, 1993; Johnson et al., 1998; del Moral et al., 2018). These tracking algorithms have also been applied to historical thunderstorms to develop a catalogue of thunderstorm movements and severity (Chronis et al., 2015; Farnell and Rigo, 2020). Radar-tracked thunderstorm data can be used by industry responsible for infrastructure such as power lines to develop risk models (Mohee and Miller, 2010) and enhance resilience. Detecting thunderstorms at longer ranges is challenging for radar, a problem which can be overcome by using multiple radar devices (Mohee and Miller, 2010). When using output from multiple radar datasets, they need to be merged into a composite so that thunderstorm clusters can be tracked (Lock and Houston, 2015). The linking of clusters into a track has been achieved using both wind direction data (Lock and Houston, 2015) and the previous motion of the storm (Dixon and Wiener, 1993; Johnson et al., 1998; del Moral et al., 2018). The initiation point of a thunderstorm can be approximated by interpolating backwards using the trajectory of the thunderstorm by a time step of 15 min before it was first detected (Lock and Houston, 2015). This can be useful because it can take thunderstorms time to develop to the point where the reflectivity is high enough to be detected, and the first detection by radar is not necessarily representative of the start location for the storm.
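The backward interpolation of an initiation point can be illustrated with a minimal sketch. It assumes a track given as time-stamped positions on a flat plane (kilometres) and estimates the storm motion from the first two detections only, then steps back by 15 min; how the cited studies derive the motion vector may differ, and the function name and sample track are hypothetical.

```python
def extrapolate_initiation(track, lead_min=15.0):
    """Estimate where a storm was `lead_min` minutes before its first detection.

    track: list of (time_min, x_km, y_km) sorted by time, with at least two fixes.
    The motion vector is taken from the first two fixes (an assumption).
    Returns (time_min, x_km, y_km) of the estimated initiation point.
    """
    if len(track) < 2:
        raise ValueError("need at least two track points")
    (t0, x0, y0), (t1, x1, y1) = track[0], track[1]
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("track times must be strictly increasing")
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt  # km per minute
    return t0 - lead_min, x0 - vx * lead_min, y0 - vy * lead_min

# Hypothetical track: first detected at (10, 20) km, moving north-east
track = [(0.0, 10.0, 20.0), (5.0, 12.0, 21.0), (10.0, 14.0, 22.0)]
t_init, x_init, y_init = extrapolate_initiation(track)
print(t_init, x_init, y_init)
```

The estimate is only as good as the assumption of steady motion before first detection, which may not hold for storms that form in place over topography.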

There may also be a similar detection delay when using satellite data: images are usually only available every 15 min, so start and end times can carry error windows of up to 15 min (Dotzek and Forster, 2011). Finding the origin point of a storm assists in identifying the conditions that contribute to its formation and, in this case, in correlating thunderstorm formation hot spots with topography as well as identifying the overall spatial distribution of thunderstorm formation.

Radar reflectivity values for thunderstorm tracks can also be used to provide information on severity of thunderstorm precipitation and to quantify how this changes as the storm develops and dissipates (Rigo and Pineda, 2016).

4.3 Remote sensing: satellite and ground-based lightning location systems

Thunderstorm intensity changes have also been inferred from lightning activity (Correoso et al., 2006) by analysing the lightning intensity per 100 km² for each 30 min stage of the life cycle of 33 MCSs. It was noted that colder storms and the early stages of storms produced the most lightning. Numerous studies (Chronis et al., 2015; Farnell and Rigo, 2020; Schultz et al., 2009) have identified a “jump” in lightning activity within a thunderstorm (e.g. 2 standard deviations (SD) above the running mean of lightning strokes from the previous 12 min iteration) as a means of identifying storms which can be tied to observations of severe weather. Research in this area is ongoing to establish how a warning system based on lightning intensity can be adapted to different regions, which may produce different patterns of thunderstorm activity (Ellis and Miller, 2016), and to identify the best combination of variables to produce the highest probability of detection whilst maintaining a low false alarm rate (Gatlin and Goodman, 2010).
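The idea of a lightning jump can be sketched as a simple threshold test. The example below is a simplified variant and not the operational algorithm of Schultz et al. (2009): it assumes stroke counts have already been accumulated into 2 min bins for a single tracked storm and flags any bin whose count exceeds the mean plus 2 SD of the preceding six bins (12 min). The bin length, the function name `lightning_jumps` and the use of raw counts rather than rates of change are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def lightning_jumps(counts, window=6, n_sd=2.0):
    """Return indices of bins whose count exceeds mean + n_sd * SD of the
    preceding `window` bins (6 bins of 2 min give a 12 min history)."""
    jumps = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        threshold = mean(history) + n_sd * pstdev(history)
        if counts[i] > threshold:
            jumps.append(i)
    return jumps


# Hypothetical 2 min stroke counts for one storm; the seventh bin is a jump.
print(lightning_jumps([10, 12, 11, 9, 10, 12, 30, 12]))
```

Raising `n_sd` would lower the false alarm rate at the cost of missed jumps, which is the trade-off between probability of detection and false alarms described above.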

Lightning data have also been used for thunderstorm tracking purposes either with or without supporting information from radar, satellite and human observation. The main decisions when using lightning data for tracking are (a) how to define a lightning cluster so that it most closely represents the thunderstorm cell or thunderstorm as a whole and (b) how to connect the clusters to produce an accurate track. Identifying a cluster usually involves counting lightning strikes within a given time interval and within a given radius or grid square. The method for doing so varies depending on whether the study aims to track individual thunderstorm cells or whole thunderstorms (which may include multiple cells). For example, a radius of 10 km and 16 min time interval were chosen (around each lightning strike) as a means of counting strikes that originate from the same storm in a study in the Mediterranean region (Galanaki et al., 2018). These parameters compared well with satellite imagery showing the cloud extent. In another study undertaken in the Alps, a thunderstorm cluster was defined as a minimum of 14 flashes within a 4 km radius and 20 min temporal vicinity. Lightning flashes that did not meet this requirement were discarded because this study wished to exclude “weak storms” from the dataset (Bertram and Mayr, 2004). The difference in cluster size is likely a function of differing thunderstorm activity or size between the study areas, which is therefore also an important consideration when choosing cluster size. Other important considerations for cluster size may be the maximum distance a lightning strike can travel from the convective core and the detection efficiency or location accuracy of the dataset itself.
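A cluster definition of this kind can be sketched as follows. The example assumes strike positions on a local Cartesian grid in kilometres and joins any two strikes that lie within a chosen radius and time window of each other (single-linkage grouping), then discards groups below a minimum size. This is one plausible reading of the radius-and-interval approach, not a reproduction of the method of either study cited above; `cluster_strikes` and its parameters are illustrative names.

```python
import math


def cluster_strikes(strikes, radius_km, window_min, min_size=1):
    """Group strikes given as (t_min, x_km, y_km) tuples.

    Two strikes are linked when they are within radius_km and window_min
    of each other; linked strikes form one cluster. Returns a list of
    clusters, each a list of indices into `strikes`.
    """
    parent = list(range(len(strikes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (ti, xi, yi) in enumerate(strikes):
        for j in range(i + 1, len(strikes)):
            tj, xj, yj = strikes[j]
            close_in_time = abs(ti - tj) <= window_min
            close_in_space = math.hypot(xi - xj, yi - yj) <= radius_km
            if close_in_time and close_in_space:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(strikes)):
        groups.setdefault(find(i), []).append(i)
    # Drop small groups, e.g. to exclude weak storms.
    return [members for members in groups.values() if len(members) >= min_size]
```

Calling `cluster_strikes(strikes, radius_km=4, window_min=20, min_size=14)` would mimic the thresholds quoted for the Alpine study, although the pairwise comparison used here scales quadratically and would need a spatial index for large datasets.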

As with satellite and radar data, connecting the lightning clusters into a track can be challenging because there can be multiple thunderstorm cells or multiple thunderstorms in a similar area (which can also split and merge) (del Moral et al., 2018). This problem has been addressed in a variety of ways. One method is to identify the mean wind direction between 0 and 6 km elevation (Houston et al., 2015) and choose the lightning cluster that most closely matches the trajectory of the gradient wind. It should also be noted that some thunderstorms are large enough to deviate from the flow (del Moral et al., 2018). A different approach was employed in the Alps, specifying that clusters could be connected if their direction of movement was within ±30° of the mean cell motion of that region (Bertram and Mayr, 2004). This required initial data analysis prior to track construction to calculate the mean by connecting cells that are closest to each other over a whole-day period and gathering data for direction and distance of movement. For unusual flow situations the direction can be changed to avoid incorrect tracking (the process is semi-automated to allow this). Lastly, another method of connecting clusters into a track is ensuring that the time iterations are small enough to provide a spatial overlap (Meyer et al., 2013).
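The direction-constrained linking step can be sketched as below. The example assumes cluster centroids on a local Cartesian grid in kilometres and a mean cell motion expressed as a compass bearing; it selects the nearest candidate in the next time step whose displacement lies within the angular tolerance. The maximum search distance `max_dist_km` is an extra assumption added here to keep the search local, and the function names are hypothetical.

```python
import math


def bearing_deg(dx_km, dy_km):
    """Compass bearing (0 = north, clockwise) of a displacement."""
    return math.degrees(math.atan2(dx_km, dy_km)) % 360.0


def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def link_next(prev_xy, candidates, mean_bearing_deg,
              tolerance_deg=30.0, max_dist_km=30.0):
    """Return the index of the nearest candidate centroid whose direction of
    movement from prev_xy is within tolerance_deg of the mean cell motion,
    or None if no candidate qualifies."""
    best_index, best_dist = None, float("inf")
    for index, (x, y) in enumerate(candidates):
        dx, dy = x - prev_xy[0], y - prev_xy[1]
        dist = math.hypot(dx, dy)
        if dist > max_dist_km:
            continue
        # A stationary cluster (dist == 0) has no bearing and is accepted.
        if dist > 0 and angular_difference(bearing_deg(dx, dy), mean_bearing_deg) > tolerance_deg:
            continue
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index
```

For unusual flow situations, `mean_bearing_deg` could simply be overridden for the day in question, which mirrors the semi-automated adjustment described above.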

Some problems with using lightning to track thunderstorms include the fact that lightning may not begin at the convective start of the storm, making the initiation point uncertain, and there is also difficulty detecting cloud-based lightning, which is the dominant lightning type for early thunderstorm stages (Bertram and Mayr, 2004). Thunderstorms that are less electrically active may escape detection.

5 Lightning flash density

Lightning flash density studies use data from lightning location systems, and some standardised analysis methods of best practice have been developed when using these datasets. Whilst most lightning climatologies are produced with the intention of minimising exposure to cloud-to-ground lightning hazards (Finke, 1999), lightning climatology can also be viewed as a form of thunderstorm climatology because lightning can be used to confirm thunderstorm activity. Indeed, there are several avenues of research investigating how lightning might be used as a proxy for other thunderstorm hazards such as heavy precipitation (Ezcurra et al., 2002; Iordanidou et al., 2016; Kochtubajda et al., 2013). Lightning flash density studies can overlap with thunderstorm frequency studies when they include “days with lightning” as part of the climatology.

Whilst high lightning flash density may provide an indication of increased thunderstorm activity, this should be treated with caution because it may not so easily detect low-lightning thunderstorms which, while less electrically active, may still produce other forms of hazardous weather. This may be remedied by analysing thunderstorm or lightning days (see Sect. 3) in conjunction with lightning flash density. Lightning flash density information can support understanding of lightning and thunderstorm distributions amongst industry end users. Ground flash density (Diendorfer, 2008) is used to calculate the risk from lightning to an asset and is relevant to operations such as wind farms, shipping and sailing, sporting events, and transport infrastructure, as well as many other types of industry and outdoor land use, especially where cloud-to-ground lightning poses a hazard to life. Figure 4 provides a diagrammatic summary of the steps involved in producing a lightning flash density (thunderstorm) climatology and the different variables to consider during study design.

Figure 4. Diagrammatic summary of the steps involved in producing a lightning flash density thunderstorm climatology, showing the different variables to consider during the study design stage.


5.1 Lightning flash density method

Whilst thunderstorm frequency studies use different types of datasets and different methods, lightning flash density studies depend only upon lightning datasets, although these vary (lightning location systems differ in detection method, coverage and accuracy). They nevertheless usually follow a relatively standardised methodology, making results easier to compare. Most studies focus on cloud-to-ground lightning because they are primarily concerned with lightning strike damage, but also because most ground-based lightning location systems detect cloud-to-ground strikes most efficiently. These studies often have a shorter timescale than most other climatologies because lightning location networks experience upgrades that limit the period over which they are homogeneous. Some systems operate over a limited time span (Tropical Rainfall Measuring Mission Optical Transient Detector for example: Cecil et al., 2014). For lightning detectors placed on satellites, data collection is limited by the satellite deployment duration. Where lightning flash density is required for industry purposes (to obtain a lightning flash density figure as input, for example, to risk assessment models for construction) but no lightning flash density is available, it has been estimated by multiplying days of thunder heard by 0.1 (DEHN + SÖHNE, 2014). Whether this calculation can be used successfully to convert a long record of days with thunder to lightning flash density, where human observations have been replaced by lightning location systems, to produce a long climatology record remains to be seen.
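As a worked illustration of the rule of thumb quoted above, the sketch below multiplies an annual count of thunder days by 0.1. It assumes that the result is a ground flash density in flashes per square kilometre per year and that the input is the mean number of days with thunder heard per year; the function name is hypothetical.

```python
def flash_density_from_thunder_days(thunder_days_per_year, factor=0.1):
    """Rule-of-thumb ground flash density (assumed units: flashes per km2
    per year) from the annual number of days with thunder heard."""
    if thunder_days_per_year < 0:
        raise ValueError("thunder days cannot be negative")
    return factor * thunder_days_per_year


# A hypothetical site with 20 thunder days per year gives about 2 flashes per km2 per year.
print(flash_density_from_thunder_days(20))
```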

Data often need to be filtered to omit weak events which may not be the result of cloud-to-ground lightning, and individual lightning strokes need to be grouped into lightning flashes (Taszarek et al., 2015). The threshold for excluding weak events may differ depending on the dataset, coverage area and purpose of the study (some may wish to exclude cloud-to-cloud lightning events). Grouping of lightning strokes into flashes is performed by setting an arbitrary time period and spatial area within which strokes that occur together are almost certainly the result of the same lightning event. Most studies follow the definition that a flash is an ensemble of all strokes within 10 km of each other within a 1 s interval (Cummins and Murphy, 2009). It is noted that the temporal element of this is the most important, with 1 s being consistent throughout the literature, but the spatial element is more variable (Drüe et al., 2007) as it does not appear to significantly affect the number of grouped flashes, even up to as much as 50 km.
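The grouping of strokes into flashes can be sketched as follows. This is a simplified reading of the 1 s and 10 km criteria in which each stroke is compared with the first stroke of recent flashes, rather than with every other stroke; it assumes stroke times in seconds and positions on a local Cartesian grid in kilometres, and `group_strokes` is an illustrative name rather than part of any lightning location system software.

```python
import math


def group_strokes(strokes, max_dt_s=1.0, max_dist_km=10.0):
    """Group strokes given as (t_s, x_km, y_km) into flashes.

    A stroke joins the most recent flash whose first stroke occurred no
    more than max_dt_s earlier and no more than max_dist_km away;
    otherwise it starts a new flash. Returns a list of flashes, each a
    list of strokes in time order.
    """
    flashes = []
    for stroke in sorted(strokes):
        t, x, y = stroke
        target = None
        # Flashes are ordered by first-stroke time, so search newest first
        # and stop once a flash is too old.
        for flash in reversed(flashes):
            t0, x0, y0 = flash[0]
            if t - t0 > max_dt_s:
                break
            if math.hypot(x - x0, y - y0) <= max_dist_km:
                target = flash
                break
        if target is None:
            flashes.append([stroke])
        else:
            target.append(stroke)
    return flashes
```

The number of strokes in each returned flash gives its multiplicity, and rerunning with a larger `max_dist_km` is a quick way to check how sensitive the flash count is to the spatial criterion.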

Consideration should also be given to network upgrades, which may affect detection efficiency. Some studies choose timescales and locations which do not include a significant upgrade to obtain homogeneous data (Taszarek et al., 2015) while others apply corrections to homogenise the time series (Huffines and Orville, 1999). Applying corrections may provide a longer timescale for a study than would otherwise be possible. Using longer time series is usually more reliable because it minimises the influence of some biases, such as sensor outages or unusually severe weather events. However, choosing a known homogeneous data collection period may be the safer way forward, even if it limits the length of record available.

Lightning flash density per square kilometre per year is usually calculated throughout the study area on a grid square basis. The grid box should not be smaller than that required to capture a minimum of 80 lightning events (Diendorfer, 2008) to provide an 80 % confidence that the calculated ground flash density is an accurate representation. The requirement of 80 events per grid cell can be met by adjusting either the grid box size or the study duration. For a location accuracy that is between 500 and 1000 m, the grid size should be no smaller than 1 km × 1 km (Diendorfer, 2008). The size of the grid box may also vary depending on the size of the study region and the resolution required to address the research question. One suggested improvement is to use probabilistic methods to obtain a sub-kilometre lightning flash density resolution, which would be better suited to analysing the relationship between lightning and smaller-scale landscape and biological features such as vegetation (Etherington and Perry, 2017). It has been shown to be possible to produce a 100 m × 100 m climatology by using the known location error data from the lightning location system to calculate the radius around a lightning location within which the strike most probably occurred. The probability of a strike occurring within an area of interest can then be calculated. This method produces a detailed map; however, the extra processing required makes it unlikely to be adopted as standard practice.

Once an appropriate grid size is identified, flash density can be calculated per square kilometre per year for each grid box. Temporal and spatial variations in lightning flash density are then analysed and can include investigations of the impact of potential influencing factors such as topographic features, land use, CAPE (Galanaki et al., 2015), synoptic conditions (Gatidis et al., 2018) and aerosols (Coquillat et al., 2013).
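The gridding step can be illustrated with a minimal sketch. It assumes flash positions on a local Cartesian grid in kilometres, square grid boxes and a homogeneous record of known length, and it reports for each occupied box both the flash density and whether the box meets the minimum of 80 events discussed above. The function name and return format are hypothetical.

```python
import math
from collections import Counter


def flash_density(flashes, cell_km, years, min_events=80):
    """Compute flash density per grid box.

    flashes: iterable of (x_km, y_km) flash locations.
    Returns {(col, row): (flashes per km2 per year, meets_min_events)}.
    """
    counts = Counter(
        (math.floor(x / cell_km), math.floor(y / cell_km)) for x, y in flashes
    )
    area_km2 = cell_km ** 2
    return {
        cell: (n / (area_km2 * years), n >= min_events)
        for cell, n in counts.items()
    }
```

Boxes flagged as not meeting the minimum would suggest either enlarging `cell_km` or extending the study period before the density values are interpreted.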

5.2 Global lightning flash density

An advantage of lightning location system data is that some systems operate over very large areas, allowing lightning flash density to be analysed on a global scale. A comparison study was produced using both a ground-based lightning location system (the World Wide Lightning Location Network, WWLLN) and the satellite-based Tropical Rainfall Measuring Mission Lightning Imaging Sensor (TRMM-LIS) and Tropical Rainfall Measuring Mission Optical Transient Detector (TRMM-OTD), to ascertain whether the lower detection efficiency of WWLLN had consequences for its identification of diurnal cycles (Virts et al., 2013). The results showed that WWLLN was able to produce plausible diurnal cycles on regional and global scales. Both datasets picked up the general trends of geographical and seasonal lightning variation, but there were areas where one dataset would detect greater lightning amounts than the other (OTD/LIS detecting more lightning in Africa and the Himalayas vs. WWLLN detecting more over the oceans), reflecting the fact that each lightning location system's performance varies spatially.

Unsurprisingly, global maps of lightning flash density show the most intense lightning activity in the tropics due to the intense solar heating initiating convection. Mountain ranges often show greater lightning activity than their surrounding areas (Cecil et al., 2014) due to sun-facing slopes and forced ascent of air helping to release instability. Lightning hotspots have been ranked and vicinity to populated areas recorded to highlight areas that experience high lightning risk and which are more vulnerable to thunderstorm and lightning hazards (Albrecht et al., 2016). Further studies of vulnerability and lightning flash density could usefully include recreational areas, areas with high risk activities and infrastructure.

5.3 Lightning flash density and topography

Strong correlations between mountain ranges and enhanced lightning activity (in comparison to lightning intensity in surrounding lowlands) are noted in numerous studies globally (Etherington and Perry, 2017; Feudale and Manzato, 2014; Mushtaq et al., 2018; Vogt, 2014; Vogt and Hodanish, 2014, 2016; Xia et al., 2015). More analytical information can be obtained by attributing a mean slope or elevation value to each grid square (Galanaki et al., 2015) and choosing appropriate statistical methods to establish correlation. Another method is to create shape files in a GIS environment for each elevation class and to calculate the lightning flash density for each (Vogt and Hodanish, 2016) or join shape files containing elevation data to a lightning density grid to obtain elevation data for each grid cell environment (Mushtaq et al., 2018). Slope gradient is another element of topography that may influence lightning flash density, for example in Colorado where it was noted that lightning flash density increases more rapidly at higher elevations (steeper slope gradients) than at lower elevations (gentler gradients) (Vogt and Hodanish, 2014).
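The elevation class approach can be sketched without a GIS as a simple aggregation. The example assumes that each grid cell already carries a mean elevation, a flash count and an area, and that elevation classes are defined by a list of lower boundaries in metres; these inputs and the function name are assumptions for illustration and do not reproduce the workflow of the studies cited above.

```python
from collections import defaultdict


def density_by_elevation_class(cells, class_edges_m, years):
    """Aggregate flash density by elevation class.

    cells: iterable of (mean_elevation_m, flash_count, area_km2).
    class_edges_m: ascending class boundaries, e.g. [500, 1000, 2000].
    Returns {class_index: flashes per km2 per year}, where class 0 lies
    below the first boundary.
    """
    flashes = defaultdict(int)
    area = defaultdict(float)
    for elevation, count, cell_area in cells:
        # Class index is the number of boundaries at or below the elevation.
        k = sum(elevation >= edge for edge in class_edges_m)
        flashes[k] += count
        area[k] += cell_area
    return {k: flashes[k] / (area[k] * years) for k in sorted(area)}
```

Dividing by the total area of each class, rather than averaging the per-cell densities, keeps the result comparable between classes that cover very different areas.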

5.4 Lightning flash density and aerosols

There have been several studies examining the influence of aerosols on lightning flash density. A comparison of lightning activity on weekdays with that on weekend days around commuter/urbanised areas showed that anthropogenic emissions (during the week) increased the intensity of lightning activity downwind of Paris, where lightning activity was less intense at weekends (Coquillat et al., 2013). It is argued that natural causes would not change from weekdays to weekends. On a longer timescale, an alternative approach obtained monthly averages of the absorbing aerosol index for each flash density grid cell and calculated the correlation between this and lightning flash density in the Kashmir and Jammu provinces of India. A positive correlation (r=0.61) indicated that aerosols may be an influencing factor in controlling lightning activity in these regions (Mushtaq et al., 2018). Urban heat island temperature has been observed to exhibit a maximum on Fridays and a minimum on the weekend. In the Charlotte, North Carolina, urban heat island it has been observed that there is a slightly higher mean temperature (1 °C) on weekdays than on weekend days (Eastin et al., 2018). Increased temperature during the week may therefore also be a factor influencing increased lightning activity.

5.5 Lightning flash density and land cover

Evaluating the connection between land use/vegetation type and lightning can depend on available datasets. This requires the classification of regions or obtaining land cover classification datasets and attributing this classification to the lightning flash density grid square (Galanaki et al., 2015), or calculating lightning flash density stratified by land use polygons per season. The relationship for an area can then be quantified by scaling the lightning stroke density with the total number of strokes and percentage area of each vegetation/land use category to the total study area. An analysis for different vegetation types in the eastern Mediterranean region (Galanaki et al., 2015) showed that the seasonal pattern of lightning activity differed between them. For example, in summer lightning showed a preference for forested areas, thought to be the result of greater soil moisture and leaf areas permitting more transpiration of moisture into the air. Scrubland showed low lightning activity throughout the year, and in the coldest periods of the year there was increased lightning activity in woodland and wooded grassland.

5.6 Lightning flash density and atmospheric conditions

Correlating lightning activity with meteorological, synoptic or local atmospheric conditions is important to understand how these may affect the distribution of lightning and thunderstorm-related hazards. Analysis of the influence of atmospheric conditions is often undertaken using reanalysis data (e.g. Gatidis et al., 2018). Using factor analysis for lightning flash density across Greece in fortnightly time iterations for each 0.5° grid square, this study was able to identify three main intra-annual distributions of lightning activity, namely high activity occurring (a) over continental mountainous areas in early summer, (b) over the Ionian Sea in early autumn, and (c) over the Aegean Sea in late May and again in mid-autumn. Once the temporal and spatial distributions of the three main peaks in lightning activity were identified, mean atmospheric conditions (average patterns of geopotential heights at 500 and 1000 hPa, air temperature at 500 hPa and CAPE) were obtained on days where there was lightning activity during the peak “season” of activity for each case. This allowed the identification of the atmospheric conditions that were most strongly associated with the lightning activity. The benefit of using factor analysis for fortnightly time periods, rather than a traditional seasonal/monthly analysis, is that it removes the possibility that by parcelling time by human constructs (i.e. months) critical transitions may be missed. Factor analysis ensures objective grouping to identify the main trends (Gatidis et al., 2018).

Thunderstorm indices such as CAPE have been widely evaluated in conjunction with lightning flash density (Galanaki et al., 2015). Convective available potential energy quantifies the atmospheric conditions' potential for deep moist convection. Galanaki et al. (2015) assigned CAPE values into bins for several times of day, and then the lightning activity for each time of day was paired to the corresponding CAPE bin. The results show an increase in lightning activity with increasing CAPE values, with a positive correlation of R>0.87.
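A binning and correlation analysis of this kind can be sketched as below. The example assumes paired samples of CAPE (in J/kg) and flash count, groups them into CAPE bins of fixed width, averages the flash counts in each bin and then computes the Pearson correlation between bin centre and mean flash count. The bin width, sample values and function names are hypothetical and are not those used by Galanaki et al. (2015).

```python
from statistics import mean


def mean_flashes_per_cape_bin(samples, bin_width=500.0):
    """samples: iterable of (cape_j_per_kg, flash_count).
    Returns a list of (bin_centre, mean_flash_count) sorted by CAPE."""
    bins = {}
    for cape, flashes in samples:
        bins.setdefault(int(cape // bin_width), []).append(flashes)
    return [((k + 0.5) * bin_width, mean(v)) for k, v in sorted(bins.items())]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical samples, for illustration only.
samples = [(100, 1), (400, 3), (600, 4), (900, 6), (1200, 9), (1400, 11)]
binned = mean_flashes_per_cape_bin(samples)
r = pearson_r([c for c, _ in binned], [f for _, f in binned])
```

Correlating bin means rather than individual samples smooths out scatter, so coefficients obtained this way tend to be higher than those computed on the raw pairs.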

Research can also include the effects of long-term variations in atmospheric circulation, such as ENSO events and the North Atlantic Oscillation (NAO) (Piper and Kunz, 2017), on thunderstorm day distributions. Lightning activity for ENSO neutral months can be compared to months with El Niño and La Niña events. This has been addressed in the Northwest Pacific region (Zhang et al., 2018). Abnormal lightning activities were identified during both El Niño and La Niña events. Overall, it was found that there was a 10.3 % increase (4.8 % decrease) in lightning days during El Niño (La Niña) events.

6 Recommendations

In order to gain the most comprehensive understanding of the distribution of thunderstorm hazards the following recommendations should be considered.

6.1 Dataset choice

A major consideration when choosing appropriate underpinning datasets is identifying both the availability for the study region concerned and the appropriate temporal and spatial coverage required to achieve the overall research goal. Once potential datasets are identified, investigations should ascertain the reliability of the data and the homogeneity of the recording methods (Schuster et al., 2005). The research project may need to be adapted if dataset limitations constrain the types of analysis that can be performed. For example, in a region where only lightning data are available, a short record length may mean that long-term trends cannot be analysed, and the focus may need to be on the spatial variation in lightning and thunderstorm occurrence.

Since no one dataset is perfect, it can be beneficial to combine complementary datasets to fill data gaps, validate thunderstorm diagnoses (Gray and Marshall, 1998) and extend spatial and temporal coverage (Enno, 2015). Where datasets cannot be confidently combined, repeating the analysis with more than one dataset can provide validation of results or help to identify the main potential sources of uncertainty.

6.2 The benefits of combining different types of approach

There is substantial benefit to incorporating more than one research methodology into a study (thunderstorm frequency, thunderstorm tracking and lightning flash density) to produce robust results. Good examples of this include lightning flash density climatologies which have incorporated aspects of thunderstorm frequency research (e.g. Soula et al., 2016) since not all thunderstorms produce the same amount or form of electrical activity. Thunderstorm frequency can help distinguish regions that are at risk of rare severe storms from those at risk of frequent less severe storms. Furthermore, differences between thunderstorm frequency and lightning flash density may help identify instances where the spatial variation in lightning flash density has been skewed by severe storms as demonstrated by Anderson and Klugmann (2014).

Thunderstorm frequency and lightning flash density studies can provide data relating to thunderstorm hazard distributions in a fixed region during a fixed period of time, but they cannot provide data relating to the movement of thunderstorms. Factors such as storm location origin, thunderstorm life cycle and motion characteristics also provide important information to characterise the potential hazard in a region. It is important to investigate both Eulerian and Lagrangian approaches to thunderstorm distributions to fully understand the risk from thunderstorm hazards and identify causative factors such as atmospheric conditions. Lastly, lightning flash density approaches can be used within thunderstorm tracking to see how lightning flash density changes throughout the life cycle of the storm (Correoso et al., 2006), identifying whether particular thunderstorm types produce more or fewer lightning hazards.

6.3 Identify the end user

Aside from scientific interest, potential end users should be considered, as this will also influence the choice of method and aim of the research. The study may take the form of analysing hazards for a specific group such as forecasters or nowcasters, mountaineers, and outdoor leisure users (Vogt, 2014); a specific industry such as the power sector (Mohee and Miller, 2010); or more general users of warning services amongst the general public. Identifying the target audience is crucial for tailoring the results so that they can be successfully utilised to mitigate the effects of thunderstorm hazards.

The end user will also determine how best to communicate the results in terms of both dissemination pathways and presentation format. Weather advice services, warning services and forecasters will access the results via scientific journal articles, conference papers, presentations and training courses. If the study has been produced for a specific organisation then they may also require tables of results or maps which they may interrogate and apply for their own purposes and integrate into their own decision support systems. Decision makers in industry and government, as well as the general public, will require clear diagrams, summaries and guidance on how to interpret the results. In recent years apps, social media posts and websites have become popular with interested members of the public being able to observe lightning strikes and radar imagery in real time and sign up to receive alerts via social media with regard to weather warnings. Utilising such platforms to deliver information in relation to past hazard distributions and developing apps and websites to do so could provide easy access to information for the public and could be a potential growth area to enable climatologists to distribute the results of their research.

7 Conclusions – priorities for further research

7.1 Low-lightning areas

Research is most often conducted in populated areas of frequent thunderstorm activity, partly because these regions are more at risk from thunderstorm hazards and partly due to enhanced monitoring producing the observational evidence to support more statistically significant and reliable results. In areas which experience fewer thunderstorms, accessing sufficient data to produce statistically significant results or high-resolution spatial distributions can be problematic. For example, producing a lightning flash density map with an 80 % confidence level requires a grid square to have accumulated at least 80 lightning flashes during the study period (Diendorfer, 2008). In low-lightning-activity areas, to obtain a reasonable sample often requires increasing the grid size or the timescale, thus potentially limiting investigations into intra-annual and monthly distributions at high spatial resolutions.

7.2 Dataset combination techniques

More accurate thunderstorm distributions can be achieved by enabling more accurate syntheses of different data sources. This could take the form of developing methodologies and algorithms which support integration and which can be adapted to incorporate different data types, or alternatively by combining datasets of the same type such as lightning data from multiple systems.

7.3 Reanalysis indices

Testing and improving techniques to define indices from reanalyses could provide a long record of probable thunderstorm activity, in regions where records are short or inhomogeneous, as well as being used in areas where there is a lack of thunderstorm observational data. Testing the results against direct observations and identifying the indices which work best in different regions and seasons would increase confidence in utilising this method.

7.4 Hazard communication and warnings

Developing pathways to communicate thunderstorm distributions to laypersons or targeted end users is necessary to help them plan in advance to better avoid or prepare for thunderstorm hazards. Apps and social media provide platforms which are popular and familiar for laypersons, many people now being familiar with real-time lightning websites and radar imagery. Thus, such methods need to be employed more widely to display climatological data in a user-friendly way.

Appendix A: List of abbreviations
CAPE Convective available potential energy
DE Detection efficiency
ENSO El Niño–Southern Oscillation
ERA European Reanalysis Data
GIS Geographic information systems
hPa Hectopascal pressure unit
LA Location accuracy
LI Lifted index
MCS Mesoscale convective systems
NAO North Atlantic Oscillation
Synop Surface synoptic observations
TORRO The Tornado and Storm Research Organisation
TRMM LIS Tropical Rainfall Measuring Mission Lightning Imaging Sensor
TRMM OTD Tropical Rainfall Measuring Mission Optical Transient Detector
WWLLN World Wide Lightning Location Network
Data availability

No datasets were used in this article.

Author contributions

LH conducted the review of the available literature and wrote the manuscript with MW, NP and SD assisting with conceptual development, contributions to text and editing.

Competing interests

The authors declare that they have no conflict of interest.


Acknowledgements

We would like to thank the two reviewers for their helpful comments which led to a much improved manuscript. We would also like to thank Weatherquest for useful discussions and sharing their insights in relation to how climatological data can be utilised.

Review statement

This paper was edited by Maria-Carmen Llasat and reviewed by Elissavet Galanaki and Tomeu Rigo.


Albrecht, R. I., Goodman, S. J., Buechler, D. E., Blakeslee, R. J., and Christian, H. J.: Where are the lightning hotspots on earth?, B. Am. Meteorol. Soc., 97, 2051–2068,, 2016. 

Allen, J. T. and Karoly, D. J.: A climatology of Australian severe thunderstorm environments 1979-2011: inter-annual variability and ENSO influence, Int. J. Climatol., 34, 81–97,, 2014. 

Anderson, G. and Klugmann, D.: A European lightning density analysis using 5 years of ATDnet data, Nat. Hazards Earth Syst. Sci., 14, 815–829,, 2014. 

BBC: Latitude Festival: Lightning brings music to a halt:, last access: 14 August 2019. 

Bedka, K. M.: Overshooting cloud top detections using MSG SEVIRI Infrared brightness temperatures and their relationship to severe weather over Europe, Atmos. Res., 99, 175–189,, 2011. 

Bennett, A., Callaghan, G., Gaffard, C., and Nash, J.: The Effect of Changes in Lightning Waveform Propagation Characteristics on the UK Met Office Long Range Lightning Location Network (ATDnet), in Proceedings of the International Lightning Detection Conference and International Lightning Meteorology Conference (ILDC/ILMC), Orlando, FL, 19–22 April 2010. 

Bertram, I. and Mayr, G. J.: Lightning in the eastern Alps 1993–1999, part I: Thunderstorm tracks, Nat. Hazards Earth Syst. Sci., 4, 501–511, 2004.

Betz, H. D., Schmidt, K., and Oettinger, W. P.: LINET – An International VLF/LF Lightning Detection Network in Europe, in: Lightning: Principles, Instruments and Applications, edited by: Betz, H. D., Schumann, U., and Laroche, P., Springer, Dordrecht, 2009.

Bielec-Bąkowska, Z.: Long-term variability of thunderstorm occurrence in Poland in the 20th century, Atmos. Res., 67–68, 35–52, 2003.

Bitzer, P. M., Burchfield, J. C., and Christian, H. J.: A Bayesian Approach to Assess the Performance of Lightning Detection Systems, J. Atmos. Ocean. Tech., 33, 563–578, 2016.

Brooks, H. E., Lee, J. W., and Craven, J. P.: The spatial distribution of severe thunderstorm and tornado environments from global reanalysis data, Atmos. Res., 67–68, 73–94, 2003.

Brooks, H. E., Doswell, C. A. I., and Kay, M. P.: Climatological Estimates of Hourly Tornado Probability for the United States, Weather Forecast., 33, 59–69, 2018.

Cecil, D. J., Buechler, D. E., and Blakeslee, R. J.: Gridded lightning climatology from TRMM-LIS and OTD: Dataset description, Atmos. Res., 135–136, 404–414, 2014.

Changnon, S. A.: Damaging Thunderstorm Activity in the United States, B. Am. Meteorol. Soc., 82, 597–608, 2001.

Chronis, T., Carey, L. D., Schultz, C. J., Schultz, E. V., Calhoun, K. M., and Goodman, S. J.: Exploring lightning jump characteristics, Weather Forecast., 30, 23–37, 2015.

Coquillat, S., Boussaton, M.-P., Buguet, M., Lambert, D., Ribaud, J.-F., and Berthelot, A.: Lightning ground flash patterns over Paris area between 1992 and 2003: Influence of pollution, Atmos. Res., 122, 77–92, 2013.

Correoso, J. F., Hernández, E., García-Herrera, R., Barriopedro, D., and Paredes, D.: A 3-year study of cloud-to-ground lightning flash characteristics of Mesoscale convective systems over the Western Mediterranean Sea, Atmos. Res., 79, 89–107, 2006.

Cummins, K. L. and Murphy, M. J.: An Overview of Lightning Locating Systems: History, Techniques, and Data Uses, With an In-Depth Look at the U.S. NLDN, IEEE Trans. Electromagn. Compat., 51, 499–518, 2009.

Dee, D., Fasullo, J., Shea, D., Walsh, J., and National Center for Atmospheric Research Staff (Eds.): The Climate Data Guide: Atmospheric Reanalysis: Overview and Comparison Tables (last access: 5 August 2020), 2016.

DEHN + SÖHNE: Lightning Protection Guide, 3rd ed. (last access: 26 July 2019), 2014.

del Moral, A., Rigo, T., and Llasat, M. C.: A radar-based centroid tracking algorithm for severe weather surveillance: identifying split/merge processes in convective systems, Atmos. Res., 213, 110–120, 2018.

Diendorfer, G.: Some comments on the achievable accuracy of local ground flash density values, in: 29th International Conference on Lightning Protection, ICLP, Uppsala, Sweden, 1–6, 2008. 

Dixon, M. and Wiener, G.: TITAN: Thunderstorm Identification, Tracking, Analysis and Nowcasting – A Radar-based Methodology, J. Atmos. Ocean. Tech., 10, 785–797, 1993.

Doe, R. (Ed.): Extreme weather: forty years of the Tornado and Storm Research Organisation (TORRO), John Wiley and Sons, Chichester, UK, 2016. 

Dotzek, N. and Forster, C.: Quantitative comparison of METEOSAT thunderstorm detection and nowcasting with in situ reports in the European Severe Weather Database (ESWD), Atmos. Res., 100, 511–522, 2011.

Drüe, C., Hauf, T., Finke, U., Keyn, S., and Kreyer, O.: Comparison of a SAFIR lightning detection network in northern Germany to the operational BLIDS network, J. Geophys. Res., 112, D18114, 2007.

Eastin, M. D., Baber, M., Boucher, A., Di Bari, S., Hubler, R., Stimac-Spalding, B., and Winesett, T.: Temporal variability of the Charlotte (sub)urban heat Island, J. Appl. Meteorol. Climatol., 57, 81–102, 2018.

Ellis, A. and Miller, P.: The Emergence of Lightning in Severe Thunderstorm Prediction and the Possible Contributions from Spatial Science, Geogr. Compass, 10, 192–206, 2016.

Elsom, D. M. and Webb, J. D. C.: Lightning deaths in the UK: a 30-year analysis of the factors contributing to people being struck and killed, Int. J. Meteorol., 42, 8–26, 2017. 

Elsom, D. M., Enno, S. E., Horseman, A., and Webb, J. D. C.: Compiling lightning counts for the UK land area and an assessment of the lightning risk facing UK inhabitants, Weather, 73, 171–179, 2018.

Enno, S. E.: Comparison of thunderstorm hours registered by the lightning detection network and human observers in Estonia, 2006–2011, Theor. Appl. Climatol., 121, 13–22, 2015.

Enno, S. E., Briede, A., and Valiukas, D.: Climatology of thunderstorms in the Baltic countries, 1951–2000, Theor. Appl. Climatol., 111, 309–325, 2013.

Etherington, T. R. and Perry, G. L. W.: Spatially adaptive probabilistic computation of a sub-kilometer resolution lightning climatology for New Zealand, Comput. Geosci., 98, 38–45, 2017.

Ezcurra, A., Areitio, J., and Herrero, I.: Relationships between cloud-to-ground lightning and surface rainfall during 1992–1996 in the Spanish Basque Country area, Atmos. Res., 61, 239–250, 2002.

Farnell, C. and Rigo, T.: The lightning jump algorithm for nowcasting convective rainfall in Catalonia, Atmosphere, 11, 4, 2020.

Feudale, L. and Manzato, A.: Cloud-to-Ground Lightning Distribution and Its Relationship with Orography and Anthropogenic Emissions in the Po Valley, J. Appl. Meteorol. Climatol., 53, 2651–2670, 2014.

Finke, U.: Space-Time Correlations of Lightning Distributions, Mon. Weather Rev., 127, 1850–1861, 1999. 

Giordano, C.: The Independent: Freak storm kills seven in Greece, including two children, last access: 25 July 2019.

Galanaki, E., Kotroni, V., Lagouvardos, K., and Argiriou, A.: A ten-year analysis of cloud-to-ground lightning activity over the Eastern Mediterranean region, Atmos. Res., 166, 213–222, 2015.

Galanaki, E., Lagouvardos, K., Kotroni, V., Flaounas, E., and Argiriou, A.: Thunderstorm climatology in the Mediterranean using cloud-to-ground lightning observations, Atmos. Res., 207, 136–144, 2018.

Gatidis, C., Lolis, C. J., Lagouvardos, K., Kotroni, V., and Bartzokas, A.: On the seasonal variability and the spatial distribution of lightning activity over the broader Greek area and their connection to atmospheric circulation, Atmos. Res., 208, 180–190, 2018.

Gatlin, P. N. and Goodman, S. J.: A total lightning trending algorithm to identify severe thunderstorms, J. Atmos. Ocean. Tech., 27, 3–22, 2010.

Goodman, S., Koshak, W., Blakeslee, R., and Mach, D.: GLM Lightning Cluster-Filter Algorithm, Algorithm theoretical basis document, NOAA NESDIS Centre for Satellite Applications and Research, Maryland and Washington DC, USA, 2012. 

Gray, M. E. B. and Marshall, C.: Mesoscale convective events over the UK, 1981–97, Weather, 53, 388–395, 1998. 

Halliday, J.: The Guardian: Woman dies after being struck by lightning in Scottish Highlands, last access: 14 August 2019.

Haberlie, A. M., Ashley, W. S., Fultz, A. J., and Eagan, S. M.: The effect of reservoirs on the climatology of warm-season thunderstorms in Southeast Texas, USA, Int. J. Climatol., 36, 1808–1820, 2016.

Hersbach, H., Bell, B., Berrisford, P., Horányi, A., Sabater, J. M., Nicolas, J., Radu, R., Schepers, D., Simmons, A., Soci, C., and Dee, D.: Global reanalysis: goodbye ERA-Interim, hello ERA5, ECMWF Newsl., 159, 17–24, 2019.

Holley, D. M., Dorling, S. R., Steele, C. J., and Earl, N.: A climatology of convective available potential energy in Great Britain, Int. J. Climatol., 34, 3811–3824, 2014.

Houston, A. L., Lock, N. A., Lahowetz, J., Barjenbruch, B. L., Limpert, G., and Oppermann, C.: Thunderstorm Observation by Radar (ThOR): An algorithm to develop a climatology of thunderstorms, J. Atmos. Ocean. Tech., 32, 961–981, 2015.

Hudson, T. S., Horseman, A., and Sugier, J.: Diurnal, Seasonal, and 11-yr Solar Cycle Variation Effects on the Virtual Ionosphere Reflection Height and Implications for the Met Office's Lightning Detection System, ATDnet, J. Atmos. Ocean. Tech., 33, 1429–1441, 2016.

Huffines, G. R. and Orville, R. E.: Lightning Ground Flash Density and Thunderstorm Duration in the Continental United States: 1989–96, J. Appl. Meteorol., 38, 1013–1019, 1999.

Iordanidou, V., Koutroulis, A. G., and Tsanis, I. K.: Investigating the relationship of lightning activity and rainfall: A case study for Crete Island, Atmos. Res., 172–173, 16–27, 2016.

Johnson, J. T., Mackeen, P. L., Witt, A., Mitchell, E. D., Stumpf, G. J., Eilts, M. D., and Thomas, K. W.: The storm cell identification and tracking algorithm: An enhanced WSR-88D algorithm, Weather Forecast., 13, 263–276, 1998.

Kaltenböck, R., Diendorfer, G., and Dotzek, N.: Evaluation of thunderstorm indices from ECMWF analyses, lightning data and severe storm reports, Atmos. Res., 93, 381–396, 2009.

Keogh, S. J., Hibbett, E., Nash, J., and Eyre, J.: The Met Office Arrival Time Difference (ATD) system for thunderstorm detection and lightning location, Technical Report, Met Office, Exeter, UK, 2006. 

Kochtubajda, B., Burrows, W. R., Liu, A., and Patten, J. K.: Surface Rainfall and Cloud-to-Ground Lightning Relationships in Canada, Atmos.-Ocean, 51, 226–238, 2013.

Kunz, M.: The skill of convective parameters and indices to predict isolated and severe thunderstorms, Nat. Hazards Earth Syst. Sci., 7, 327–342, 2007.

Lewis, M. W. and Gray, S. L.: Categorisation of synoptic environments associated with mesoscale convective systems over the UK, Atmos. Res., 97, 194–213, 2010.

Lock, N. A. and Houston, A. L.: Spatiotemporal distribution of thunderstorm initiation in the US Great Plains from 2005 to 2007, Int. J. Climatol., 35, 4047–4056, 2015.

Meyer, V. K., Höller, H., and Betz, H. D.: Automated thunderstorm tracking: utilization of three-dimensional lightning and radar data, Atmos. Chem. Phys., 13, 5137–5150, 2013.

Mohee, F. M. and Miller, C.: Climatology of thunderstorms for North Dakota, 2002–06, J. Appl. Meteorol. Climatol., 49, 1881–1890, 2010.

Moncrieff, M. W. and Miller, M. J.: The dynamics and simulation of tropical cumulonimbus and squall lines, Q. J. R. Meteorol. Soc., 102, 373–394, 1976.

Munzar, J. and Franc, M.: Winter thunderstorms in central Europe in the past and the present, Atmos. Res., 67–68, 501–515, 2003.

Mushtaq, F., Nee Lala, M. G., and Anand, A.: Spatio-temporal variability of lightning activity over J&K region and its relationship with topography, vegetation cover, and absorbing aerosol index (AAI), J. Atmos. Sol.-Terr. Phys., 179, 281–292, 2018.

Nag, A., Murphy, M. J., Schulz, W., and Cummins, K. L.: Lightning locating systems: Insights on characteristics and validation techniques, Adv. Earth Space Sci., 2, 65–93, 2015.

Pinto, O.: Thunderstorm climatology of Brazil: ENSO and Tropical Atlantic connections, Int. J. Climatol., 35, 871–878, 2015.

Piper, D. and Kunz, M.: Spatiotemporal variability of lightning activity in Europe and the relation to the North Atlantic Oscillation teleconnection pattern, Nat. Hazards Earth Syst. Sci., 17, 1319–1336, 2017.

Piper, D., Kunz, M., Ehmele, F., Mohr, S., Mühr, B., Kron, A., and Daniell, J.: Exceptional sequence of severe thunderstorms and related flash floods in May and June 2016 in Germany – Part 1: Meteorological background, Nat. Hazards Earth Syst. Sci., 16, 2835–2850, 2016.

Poelman, D. R., Honoré, F., Anderson, G., and Pedeboy, S.: Comparing a Regional, Subcontinental, and Long-Range Lightning Location System over the Benelux and France, J. Atmos. Ocean. Technol., 30, 2394–2405, 2013a.

Poelman, D. R., Schulz, W., and Vergeiner, C.: Performance Characteristics of Distinct Lightning Detection Networks Covering Belgium, J. Atmos. Ocean. Technol., 30, 942–951, 2013b.

Punkka, A.-J. and Bister, M.: Mesoscale Convective Systems and Their Synoptic-Scale Environment in Finland, Weather Forecast., 30, 182–196, 2015.

Reap, R. M.: The Use of Network Lightning Data to Detect Thunderstorms near Surface Reporting Stations, Mon. Weather Rev., 121, 464–469, 2002.

Rigo, T. and Pineda, N.: Inferring the Severity of a Multicell Thunderstorm Evolving to Supercell, by Means of Radar and Total Lightning, Electron. J. Sev. Storms Meteorol., 11, 1–21, 2016. 

Said, R. K., Inan, U. S., and Cummins, K. L.: Long-range lightning geolocation using a VLF radio atmospheric waveform bank, J. Geophys. Res., 115, D23108, 2010.

Schultz, C. J., Petersen, W. A., and Carey, L. D.: Preliminary development and evaluation of lightning jump algorithms for the real-time detection of severe weather, J. Appl. Meteorol. Climatol., 48, 2543–2563, 2009.

Schuster, S. S., Blong, R. J., and Speer, M. S.: A hail climatology of the greater Sydney area and New South Wales, Australia, Int. J. Climatol., 25, 1633–1650, 2005.

Soula, S., Kasereka, J. K., Georgis, J. F., and Barthe, C.: Lightning climatology in the Congo Basin, Atmos. Res., 178–179, 304–319, 2016.

Taszarek, M., Czernecki, B., and Kozioł, A.: A Cloud-to-Ground Lightning Climatology for Poland, Mon. Weather Rev., 143, 4285–4304, 2015.

Thompson, K. B., Bateman, M. G., and Carey, L. D.: A Comparison of Two Ground-Based Lightning Detection Networks against the Satellite-Based Lightning Imaging Sensor (LIS), J. Atmos. Ocean. Tech., 31, 2191–2205, 2014.

Tippett, M. K., Allen, J. T., Gensini, V. A., and Brooks, H. E.: Climate and Hazardous Convective Weather, Curr. Clim. Chang. Reports, 1, 60–73, 2015.

Tuovinen, J. P., Punkka, A. J., Rauhala, J., Hohti, H., and Schultz, D. M.: Climatology of severe hail in Finland: 1930–2006, Mon. Weather Rev., 137, 2238–2249, 2009.

van Delden, A.: The synoptic setting of thunderstorms in western Europe, Atmos. Res., 56, 89–110, 2001.

Virts, K. S., Wallace, J. M., Hutchins, M. L., and Holzworth, R. H.: Highlights of a New Ground-Based, Hourly Global Lightning Climatology, B. Am. Meteorol. Soc., 94, 1381–1391, 2013.

Vogt, B. J.: Visualizing Summertime Lightning Patterns on Colorado Fourteeners, Prof. Geogr., 66, 41–57, 2014.

Vogt, B. J. and Hodanish, S. J.: A High-Resolution Lightning Map of the State of Colorado, Mon. Weather Rev., 142, 2353–2360, 2014.

Vogt, B. J. and Hodanish, S. J.: A geographical analysis of warm season lightning/landscape interactions across Colorado, USA, Appl. Geogr., 75, 93–103, 2016.

Wapler, K. and James, P.: Thunderstorm occurrence and characteristics in Central Europe under different synoptic conditions, Atmos. Res., 158–159, 231–244, 2015.

Xia, R., Zhang, D. L., and Wang, B.: A 6-yr cloud-to-ground lightning climatology and its relationship to rainfall over central and eastern China, J. Appl. Meteorol. Climatol., 54, 2443–2460, 2015.

Zhang, W., Zhang, Y., Zheng, D., Xu, L., and Lyu, W.: Lightning climatology over the northwest Pacific region: An 11-year study using data from the World Wide Lightning Location Network, Atmos. Res., 210, 41–57, 2018.

Short summary
This review article outlines the state of thunderstorm climatologies, which are underrepresented in the literature. Thunderstorms overlap with lightning and intense precipitation events, both of which create important hazards. This article compiles and evaluates information on datasets, research approaches and methodologies used in quantifying thunderstorm distribution, providing an introduction to the topic and signposting new and established researchers to research articles and datasets.