<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0" article-type="research-article">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">NHESS</journal-id><journal-title-group>
    <journal-title>Natural Hazards and Earth System Sciences</journal-title>
    <abbrev-journal-title abbrev-type="publisher">NHESS</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Nat. Hazards Earth Syst. Sci.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1684-9981</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/nhess-22-577-2022</article-id><title-group><article-title>Nowcasting thunderstorm hazards using machine learning: <?xmltex \hack{\break}?> the impact of data sources on performance</article-title><alt-title>Data sources in thunderstorm nowcasting</alt-title>
      </title-group><?xmltex \runningtitle{Data sources in thunderstorm nowcasting}?><?xmltex \runningauthor{J.~Leinonen et al.}?>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes" rid="aff1">
          <name><surname>Leinonen</surname><given-names>Jussi</given-names></name>
          <email>jussi.leinonen@meteoswiss.ch</email>
        <ext-link>https://orcid.org/0000-0002-6560-6316</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Hamann</surname><given-names>Ulrich</given-names></name>
          
        <ext-link>https://orcid.org/0000-0001-8091-722X</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Germann</surname><given-names>Urs</given-names></name>
          
        <ext-link>https://orcid.org/0000-0002-8539-7080</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2">
          <name><surname>Mecikalski</surname><given-names>John R.</given-names></name>
          
        </contrib>
        <aff id="aff1"><label>1</label><institution>Federal Office of Meteorology and Climatology MeteoSwiss, Locarno-Monti, Switzerland</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>Atmospheric Science Department, University of Alabama in Huntsville, Huntsville, Alabama, USA</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Jussi Leinonen (jussi.leinonen@meteoswiss.ch)</corresp></author-notes><pub-date><day>25</day><month>February</month><year>2022</year></pub-date>
      
      <volume>22</volume>
      <issue>2</issue>
      <fpage>577</fpage><lpage>597</lpage>
      <history>
        <date date-type="received"><day>11</day><month>June</month><year>2021</year></date>
           <date date-type="rev-request"><day>22</day><month>June</month><year>2021</year></date>
           <date date-type="rev-recd"><day>13</day><month>October</month><year>2021</year></date>
           <date date-type="accepted"><day>29</day><month>January</month><year>2022</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2022 </copyright-statement>
        <copyright-year>2022</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://nhess.copernicus.org/articles/.html">This article is available from https://nhess.copernicus.org/articles/.html</self-uri><self-uri xlink:href="https://nhess.copernicus.org/articles/.pdf">The full text article is available as a PDF file from https://nhess.copernicus.org/articles/.pdf</self-uri>
      <abstract><title>Abstract</title>

      <p id="d1e116">In order to aid feature selection in thunderstorm nowcasting, we present an analysis of the utility of various sources of data for machine-learning-based nowcasting of hazards related to thunderstorms. We considered ground-based radar data, satellite-based imagery and lightning observations, forecast data from numerical weather prediction (NWP) and the topography from a digital elevation model (DEM), ending up with 106 different predictive variables. We evaluated machine-learning models to nowcast storm track radar reflectivity (representing precipitation), lightning occurrence, and the 45 dBZ radar echo top height that can be used as an indicator of hail, producing predictions for lead times of up to 60 min. The study was carried out in an area in the Northeastern United States for which observations from the Geostationary Operational Environmental Satellite-16 are available and can be used as a proxy for the upcoming Meteosat Third Generation capabilities in Europe. The benefits of the data sources were evaluated using two complementary approaches: using feature importance reported by the machine learning model based on gradient-boosted trees, and by repeating the analysis using all possible combinations of the data sources. The two approaches sometimes yielded seemingly contradictory results, as the feature importance reported by the gradient-boosting algorithm sometimes disregards certain features that are still useful in the absence of more powerful predictors, while, at times, it overstates the importance of other features. We found that the radar data are the most important predictor overall. The satellite imagery is beneficial for all of the studied predictands, and therefore offers a viable alternative in regions where radar data are unavailable, such as over the oceans and in less-developed areas. The lightning data are very useful for nowcasting lightning but are of limited use for the other hazards. 
While the feature importance ranks NWP data as an important input, the omission of NWP data can be well compensated for by using information in the observational data over the nowcast period. Finally, we did not find evidence that the nowcast benefits from the DEM data.</p>
  </abstract>
    </article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d1e128">Thunderstorms regularly pose a significant risk to human life and cause damage to property through lightning, heavy precipitation, hail and strong winds. These hazards are highly localized and develop within timescales ranging from tens of minutes to a few hours, which makes them difficult to forecast precisely using numerical weather prediction (NWP) models. NWP models can typically forecast a general tendency for thunderstorms in a given region, but not exactly where and when the most severe impacts will occur. Thus, it is better to issue localized short-term warnings of impacts based on <italic>nowcasting</italic>, the statistical prediction of near-term (0–1 h) developments based on the latest available data, in particular observations.</p>
      <?pagebreak page578?><p id="d1e134">Various tracking and nowcasting systems for thunderstorms have been developed since the 1960s; these rely primarily on radar but sometimes also incorporate other information such as lightning detection and location data. One particularly widely used radar-based system is Thunderstorm Identification, Tracking and Nowcasting <xref ref-type="bibr" rid="bib1.bibx10" id="paren.1"><named-content content-type="pre">TITAN;</named-content></xref>, which tracks thunderstorms as objects defined as continuous regions of high radar reflectivity. A review of other methods developed before 1998 was given by <xref ref-type="bibr" rid="bib1.bibx66" id="text.2"/>. More recent radar-based approaches include Cell Model Output Statistics <xref ref-type="bibr" rid="bib1.bibx27" id="paren.3"><named-content content-type="pre">CellMOS;</named-content></xref>, TRACE3D <xref ref-type="bibr" rid="bib1.bibx20" id="paren.4"/>, Thunderstorm Radar Tracking <xref ref-type="bibr" rid="bib1.bibx24 bib1.bibx25 bib1.bibx26" id="paren.5"><named-content content-type="pre">TRT;</named-content></xref> and NowCastMIX <xref ref-type="bibr" rid="bib1.bibx29" id="paren.6"/>, while <xref ref-type="bibr" rid="bib1.bibx60" id="text.7"/> used radar and lightning data in combination. Other algorithms are designed to utilize satellite data instead; prominent examples of these include GOES-R Convective Initiation <xref ref-type="bibr" rid="bib1.bibx41 bib1.bibx43" id="paren.8"/>, the Rapid Developing Thunderstorm <xref ref-type="bibr" rid="bib1.bibx2" id="paren.9"><named-content content-type="pre">RDT;</named-content></xref> algorithm of the Nowcasting Satellite Application Facility (NWCSAF), Cb-TRAM <xref ref-type="bibr" rid="bib1.bibx70 bib1.bibx32" id="paren.10"/> and the work of <xref ref-type="bibr" rid="bib1.bibx6" id="text.11"/> and <xref ref-type="bibr" rid="bib1.bibx5" id="text.12"/>.</p>
      <p id="d1e183">Like many other statistical data analysis and prediction tasks, nowcasting of thunderstorms and related hazards has benefited from the rapid advances in machine learning (ML) techniques in the last decade. ML has been a popular technique for nowcasting precipitation <xref ref-type="bibr" rid="bib1.bibx55 bib1.bibx56 bib1.bibx12 bib1.bibx3 bib1.bibx34 bib1.bibx13" id="paren.13"><named-content content-type="pre">e.g.,</named-content></xref>, and has also been used to develop nowcasting methods for lightning <xref ref-type="bibr" rid="bib1.bibx45 bib1.bibx69" id="paren.14"/>, hail <xref ref-type="bibr" rid="bib1.bibx9 bib1.bibx28" id="paren.15"/> and windstorms <xref ref-type="bibr" rid="bib1.bibx59 bib1.bibx35 bib1.bibx36" id="paren.16"/>. However, studies have so far typically used only one data source, though several have been utilized in some cases. Furthermore, most studies have concentrated on predicting only one variable. The variety of adopted methodologies complicates comparisons between the results from different studies.</p>
      <p id="d1e200">In this study, our objective is to provide a systematic assessment of the value of various data sources for nowcasting hazards caused by thunderstorms using an ML approach. As a particular goal, we seek to understand the impact on thunderstorm nowcasting of the new generation of geostationary satellites, which, compared to the previous generation, provide higher-resolution imagery, additional image channels and lightning data. Of these satellites, Geostationary Operational Environmental Satellite (GOES)-16 and -17 are currently operational, while the first of the Meteosat Third Generation (MTG) satellites is expected to launch in 2022. Therefore, we conduct our study in the Northeastern US, where the climate is similar to Central Europe (the primary focus of research at MeteoSwiss), and where GOES-16 has a clear field of view. We include a variety of ground-based, satellite-based and model-derived data sources that are available for that region, and examine their value for nowcasting thunderstorms. We base our study on interpretable ML using gradient-boosting methods. With our results, we aim to provide guidelines for further research and development such that investigators can acquire and process the most relevant data sources and variables for their particular applications. Our approach is similar to that of <xref ref-type="bibr" rid="bib1.bibx44" id="text.17"/>, but we complement that study with a larger number of samples (approximately 88 000 vs. 2000), the use of gradient boosting rather than random forests, the inclusion of NWP and digital elevation model (DEM) data, and a more detailed analysis achieved by excluding combinations of different data sources.</p>
      <p id="d1e207">This article is organized as follows: Sect. <xref ref-type="sec" rid="Ch1.S2"/> describes the study region and the data sources; Sect. <xref ref-type="sec" rid="Ch1.S3"/> explains the data processing and ML methods used; and Sect. <xref ref-type="sec" rid="Ch1.S4"/> presents the results along with a discussion of their meaning. Finally, Sect. <xref ref-type="sec" rid="Ch1.S5"/> concludes the article by summarizing and synthesizing the results and their implications for future studies.</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>Data</title>
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Study area and period</title>
      <p id="d1e233">Considering the objectives of the research, we chose to focus on a study area in the northeast of the US, shown in Fig. <xref ref-type="fig" rid="Ch1.F1"/>. The study region is a rectangle in azimuthal equidistant projection <xref ref-type="bibr" rid="bib1.bibx58" id="paren.18"/>, centered at 76<inline-formula><mml:math id="M1" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> W, 42<inline-formula><mml:math id="M2" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> N and extending 720 km in the west–east direction and 490 km in the north–south direction. The resolution of the grid is 1 km per pixel.</p>
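      <p>As an illustration of the grid definition above, the following minimal sketch maps a longitude–latitude pair to a pixel of the 720 km by 490 km study grid. It assumes a spherical Earth with a mean radius of 6371 km and a row index that increases southward from the northern edge; the function names <monospace>aeqd_forward</monospace> and <monospace>to_pixel</monospace> are hypothetical, and the sketch is not the projection code used in this study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth with mean radius


def aeqd_forward(lon_deg, lat_deg, lon0_deg=-76.0, lat0_deg=42.0):
    """Spherical azimuthal equidistant projection.

    Returns (x, y) in km, with x pointing east and y pointing north,
    relative to the projection center (76 W, 42 N by default).
    """
    lam = math.radians(lon_deg - lon0_deg)
    phi = math.radians(lat_deg)
    phi0 = math.radians(lat0_deg)
    cos_c = (math.sin(phi0) * math.sin(phi)
             + math.cos(phi0) * math.cos(phi) * math.cos(lam))
    # Angular distance from the center, clamped against rounding errors.
    c = math.acos(max(-1.0, min(1.0, cos_c)))
    k = 1.0 if c < 1e-12 else c / math.sin(c)
    x = EARTH_RADIUS_KM * k * math.cos(phi) * math.sin(lam)
    y = EARTH_RADIUS_KM * k * (math.cos(phi0) * math.sin(phi)
                               - math.sin(phi0) * math.cos(phi) * math.cos(lam))
    return x, y


def to_pixel(x_km, y_km, width_km=720.0, height_km=490.0, res_km=1.0):
    """Convert projected coordinates to (column, row), or None if outside.

    Row 0 is assumed to lie on the northern edge of the rectangle.
    """
    col = int(math.floor((x_km + width_km / 2.0) / res_km))
    row = int(math.floor((height_km / 2.0 - y_km) / res_km))
    if 0 <= col < int(width_km / res_km) and 0 <= row < int(height_km / res_km):
        return col, row
    return None
```
]]></preformat>
      <p>With these conventions, the projection center falls on the pixel at column 360 and row 245, and a point 1<inline-formula><mml:math display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> due north of the center lies roughly 111 km from it.</p>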

      <?xmltex \floatpos{t}?><fig id="Ch1.F1"><?xmltex \currentcnt{1}?><?xmltex \def\figurename{Figure}?><label>Figure 1</label><caption><p id="d1e261">The study area in eastern North America. The blue rectangle indicates the area (720 km <inline-formula><mml:math id="M3" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 490 km), while the orange circles mark the locations of the NEXRAD radars.</p></caption>
          <?xmltex \igopts{width=236.157874pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f01.png"/>

        </fig>

      <?pagebreak page579?><p id="d1e277">The area is centered on the states of New York and Pennsylvania and also covers parts of the states of Connecticut, Massachusetts, New Hampshire, New Jersey, Rhode Island and Vermont, as well as a region of the Atlantic Ocean and a part of the Canadian province of Ontario. Although this region is not as convectively active as, for example, the US Great Plains or the Southeastern US, we chose it because it still experiences considerable thunderstorm activity and the hazard profile of these storms is similar to those in Central Europe: tornadoes are relatively uncommon, and the hazards consist mostly of hail, lightning, wind gusts and heavy precipitation <xref ref-type="bibr" rid="bib1.bibx31 bib1.bibx8 bib1.bibx67" id="paren.19"/>. The latitude of the region is also similar to that of Central and Southern Europe, and consequently the solar radiation profiles and the viewing angles of satellite instruments in geostationary orbit are similar. The main difference between this region and Central Europe is the topography: much of Central Europe is characterized by the Alps, while our study area is generally smoother and most of the variation in elevation is due to the less-prominent Appalachian mountain range.</p>
      <p id="d1e284">We collected data from data archives for the period ranging from April to September 2020, with a time resolution of up to 5 min, depending on the source. The length of the study period and the size of the area were determined as a compromise between gathering an extensive dataset with a large number of samples and keeping the amount of data (already around 7 TB of raw data) that needed to be downloaded and processed manageable.</p>
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>Data sources</title>
      <p id="d1e295">Since the objective of the study was to investigate the utility of different types of data for nowcasting severe thunderstorms, we selected multiple qualitatively different data sources for analysis. In order to constrain the complexity of the study, we tried to avoid unnecessary overlap between the sources; thus, for example, we did not attempt to use similar data from multiple satellites, nor did we obtain ground-based lightning data, as they were already available from a satellite source. Moreover, in order to avoid the complications of data intermittency, we preferred to focus on data sources that are regularly available and avoid sources such as low-Earth-orbiting satellites that typically pass over a given area only 1–2 times per day. The final dataset includes data from a ground-based operational radar network, multispectral imagery and lightning data from a geostationary satellite, and NWP and DEM data. The details needed to obtain the data can be found under “Code and data availability” at the end of the article. The data sources are described in more detail in the following sections.</p>
<sec id="Ch1.S2.SS2.SSS1">
  <label>2.2.1</label><title>Radar data: NEXRAD</title>
      <p id="d1e305">The Next-Generation Radar <xref ref-type="bibr" rid="bib1.bibx22" id="paren.20"><named-content content-type="pre">NEXRAD;</named-content></xref> network is the US operational radar network operated by the National Weather Service (NWS). It consists of S-band Doppler weather radars that cover most of the continental US as well as many other regions of the country. NEXRAD observations from multiple radars are processed by the National Severe Storms Laboratory (NSSL) into composite products using the Multi-Radar/Multi-Sensor System <xref ref-type="bibr" rid="bib1.bibx68 bib1.bibx57" id="paren.21"><named-content content-type="pre">MRMS;</named-content></xref>. Unfortunately, the MRMS data are currently only available in near-real time and are not publicly archived for more than 24 h. Therefore, we needed to process the data from individual radars – whose data are publicly archived in the long term – into a composite ourselves; the PyART library <xref ref-type="bibr" rid="bib1.bibx23" id="paren.22"/> was used for this purpose. Although this solution has the drawback that we cannot expect to match the quality of a well-developed composite product within this study, it has an advantage in that using the full three-dimensional measured radar observations allows us to calculate any radar variable rather than just those available from the MRMS. In this work, we derived the column maximum reflectivity (MAXZ), the echo top heights at threshold reflectivities of 25, 35 and 45 dBZ, as well as the vertically integrated liquid (VIL), calculated as
              <disp-formula id="Ch1.E1" content-type="numbered"><label>1</label><mml:math id="M4" display="block"><mml:mrow><mml:mi mathvariant="normal">VIL</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">3.44</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">6</mml:mn></mml:mrow></mml:msup><mml:msup><mml:mi>Z</mml:mi><mml:mrow><mml:mn mathvariant="normal">4</mml:mn><mml:mo>/</mml:mo><mml:mn mathvariant="normal">7</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M5" display="inline"><mml:mi>Z</mml:mi></mml:math></inline-formula> is the radar reflectivity given in mm<inline-formula><mml:math id="M6" display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">6</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> m<inline-formula><mml:math id="M7" display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> (i.e., <inline-formula><mml:math id="M8" display="inline"><mml:mrow><mml:mi>Z</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:msub><mml:mi>Z</mml:mi><mml:mi mathvariant="normal">dB</mml:mi></mml:msub><mml:mo>/</mml:mo><mml:mn mathvariant="normal">10</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> for <inline-formula><mml:math id="M9" display="inline"><mml:mrow><mml:msub><mml:mi>Z</mml:mi><mml:mi mathvariant="normal">dB</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> in dBZ units) and VIL is in units kg m<inline-formula><mml:math id="M10" display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> <xref ref-type="bibr" rid="bib1.bibx19" id="paren.23"/>. The radar data have a time resolution of 5 min.</p>
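      <p>The VIL computation can be sketched as follows for a single vertical column. The sketch assumes the conventional definition of VIL, in which the expression on the right-hand side of Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) is integrated over height (in meters); here the integral is approximated layer by layer using the mean linear reflectivity of adjacent levels. The function names and the treatment of missing levels are illustrative assumptions rather than a description of the processing used in this study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def dbz_to_linear(z_db):
    """Convert reflectivity from dBZ to linear units (mm^6 m^-3)."""
    return 10.0 ** (z_db / 10.0)


def vil_from_column(z_db_levels, heights_m):
    """Approximate VIL (kg m^-2) for one vertical column.

    z_db_levels: reflectivities in dBZ at each level (None if missing).
    heights_m:   level heights in meters, in ascending order.
    Layers with a missing value at either bound are skipped (assumption).
    """
    vil = 0.0
    for i in range(len(heights_m) - 1):
        z_lo, z_hi = z_db_levels[i], z_db_levels[i + 1]
        if z_lo is None or z_hi is None:
            continue
        z_mean = 0.5 * (dbz_to_linear(z_lo) + dbz_to_linear(z_hi))
        thickness = heights_m[i + 1] - heights_m[i]
        vil += 3.44e-6 * z_mean ** (4.0 / 7.0) * thickness
    return vil
```
]]></preformat>
      <p>For example, a uniform reflectivity of 40 dBZ over a 1 km deep layer yields a VIL of approximately 0.66 kg m<inline-formula><mml:math display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> under these assumptions.</p>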
      <p id="d1e436">Radars were selected such that good data coverage was achieved throughout the study area. The parts of the area that are over ocean and in Canada are within the range of the selected radars, and the entire region is covered with a minimum beam altitude of at most 6000 ft (1800 m), and less than 3000 ft (900 m) across most of the region. NEXRAD radars operate using rather shallow scan elevation angles of 0.5–19.5<inline-formula><mml:math id="M11" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>, and consequently each individual radar is blind to the region of the atmosphere directly above it. This gap must be filled with nearby radars, and therefore we also selected some radars outside the study area in order to ensure adequate 3D data availability within the area. The radars used for the study are listed in Table <xref ref-type="table" rid="App1.Ch1.S2.T1"/>, and their locations are also shown as orange circles in Fig. <xref ref-type="fig" rid="Ch1.F1"/>.</p>
</sec>
<sec id="Ch1.S2.SS2.SSS2">
  <label>2.2.2</label><title>Satellite imagery: GOES ABI</title>
      <p id="d1e460">GOES-16 is a new-generation geostationary satellite with advanced instruments for weather observations <xref ref-type="bibr" rid="bib1.bibx62" id="paren.24"/>. The primary GOES-16 instrument used in this study is the Advanced Baseline Imager (ABI), which includes 16 bands with wavelengths ranging from 470 nm (visible) to 13.3 <inline-formula><mml:math id="M12" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m (thermal infrared) and resolutions ranging from 0.5 to 2 km per pixel in optimal viewing conditions. GOES-16 is located over the Equator at 75.2<inline-formula><mml:math id="M13" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula> W, a longitude near the middle of our study area. The ABI provides a full-disk scan, a variable region of interest (used for hurricanes, for example), and a scan covering only the contiguous US (CONUS) region. For this study, we use the CONUS scan, which is available with a time resolution of 5 min.</p>
      <p id="d1e483">We downloaded the level 1 (L1) data (given as the reflectance or brightness temperature) for the GOES-16 ABI channels <xref ref-type="bibr" rid="bib1.bibx54" id="paren.25"/> as well as the level 2 (L2) cloud products of cloud top height, cloud top pressure and cloud optical depth <xref ref-type="bibr" rid="bib1.bibx21" id="paren.26"/> and the derived stability indices (DSI) product <xref ref-type="bibr" rid="bib1.bibx38" id="paren.27"/>, which includes retrievals of variables such as the<?pagebreak page580?> convective available potential energy (CAPE). We would have preferred to use the cloud-top temperature product as well, but it is available only as a full-disk product, not separately for the CONUS region. Consequently, we omitted the cloud-top temperature because the inferior time resolution (10 min) of the full-disk product would have caused compatibility problems. We also computed the differences of various L1 channels (listed in Table <xref ref-type="table" rid="App1.Ch1.S2.T3"/>) in order to provide better features; see, for example, <xref ref-type="bibr" rid="bib1.bibx42" id="text.28"/> for interpretations of the channel differences from geostationary visible/infrared imagers. The data were projected to our study grid with the PyTroll libraries <xref ref-type="bibr" rid="bib1.bibx51" id="paren.29"/> and corrected for parallax shift using the L2 cloud top height product to determine the appropriate correction.</p>
</sec>
<sec id="Ch1.S2.SS2.SSS3">
  <label>2.2.3</label><title>Lightning data: GOES GLM</title>
      <p id="d1e512">The GOES-16 satellite is also equipped with the Geostationary Lightning Mapper (GLM), which detects lightning strikes <xref ref-type="bibr" rid="bib1.bibx53" id="paren.30"/>. The GLM L2 data consist of the coordinates and properties, such as energy, of individual strikes. Each strike consists of multiple lightning “events,” which are pixel-level detections of lightning; a set of adjacent and simultaneous events is interpreted as a strike. The coordinates and properties of the events are also provided, thus providing information about the spatial extent of each lightning strike.</p>
      <p id="d1e518">We projected the data on the lightning strikes and events, as well as their energies, to the common grid. The original GLM files contain 20 s of data each, but the files were aggregated such that we created derived products with 5 min time resolution. The GLM L2 data are provided with parallax correction already performed, obviating the need for this step.</p>
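      <p>The temporal aggregation described above can be illustrated with a minimal sketch that accumulates gridded strikes into 5 min bins. The record layout (time in seconds, grid row, grid column, energy) and the function name <monospace>aggregate_strikes</monospace> are hypothetical; reading the GLM files and projecting the coordinates to the grid are assumed to have been done beforehand.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def aggregate_strikes(records, bin_seconds=300):
    """Accumulate strike counts and energies per time bin and grid pixel.

    records: iterable of (t_seconds, row, col, energy) tuples, where
             t_seconds is measured from an arbitrary reference time.
    Returns {bin_start_seconds: {(row, col): (count, total_energy)}}.
    """
    bins = defaultdict(lambda: defaultdict(lambda: [0, 0.0]))
    for t, row, col, energy in records:
        # Floor the timestamp to the start of its 5 min bin.
        bin_start = int(t // bin_seconds) * bin_seconds
        cell = bins[bin_start][(row, col)]
        cell[0] += 1
        cell[1] += energy
    return {
        start: {pixel: tuple(values) for pixel, values in cells.items()}
        for start, cells in bins.items()
    }
```
]]></preformat>
      <p>Since each original file covers 20 s, 15 consecutive files contribute to each 5 min bin in this scheme.</p>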
</sec>
<sec id="Ch1.S2.SS2.SSS4">
  <label>2.2.4</label><title>Numerical weather prediction: ECMWF</title>
      <p id="d1e530">We provide the nowcasting system with information about the state of the atmosphere in the study area using the NWP products from the Integrated Forecasting System (IFS) of the European Centre for Medium-Range Weather Forecasts (ECMWF). We chose to use the global IFS rather than a local-area modeling system as our NWP data source because, unlike the satellite and radar data, it is not limited to a particular region, and we expected this to facilitate the adaptation of our methodology and results to Europe and other regions beyond the current study area later. We obtained a collection of 59 different variables provided by ECMWF; the variables are listed in Table <xref ref-type="table" rid="App1.Ch1.S2.T5"/>.</p>
      <p id="d1e535">We use the ECMWF archived forecast product rather than the analysis product in order to only use data that would be available to an operational nowcasting system. We downloaded the ECMWF forecasts at intervals of 12 h in forecast time, and the data in the forecasts have a resolution of 1 h. To each 5 min time step in our common spatiotemporal framework, we assigned the closest 1 h time step from the most recently issued forecast. ECMWF provides the data on a latitude–longitude grid; we used the PyTroll tools to project them onto our study grid.</p>
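      <p>The assignment of forecast data to the 5 min time steps can be sketched as below. The sketch assumes that forecasts are issued at 00:00 and 12:00 UTC, that a forecast is usable from its nominal issue time onward (i.e., any dissemination delay is ignored), and that a time step exactly halfway between two hourly steps is assigned to the later one; the function name is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from datetime import datetime, timedelta


def select_forecast_step(valid_time, issue_interval_h=12):
    """Pick the forecast run and hourly lead time for one 5 min time step.

    Returns (issue_time, lead_hours), where issue_time is the most recent
    nominal issue time not later than valid_time and lead_hours is the
    closest whole-hour forecast step of that run.
    """
    day_start = valid_time.replace(hour=0, minute=0, second=0, microsecond=0)
    hours_into_day = (valid_time - day_start).total_seconds() / 3600.0
    issue_hour = int(hours_into_day // issue_interval_h) * issue_interval_h
    issue_time = day_start + timedelta(hours=issue_hour)
    elapsed_h = (valid_time - issue_time).total_seconds() / 3600.0
    lead_hours = int(elapsed_h + 0.5)  # round to nearest hour, ties upward
    return issue_time, lead_hours
```
]]></preformat>
      <p>Under these assumptions, the time step at 14:25 UTC is matched to the +2 h step of the 12:00 UTC run, whereas the time step at 14:35 UTC is matched to the +3 h step of the same run.</p>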
</sec>
<sec id="Ch1.S2.SS2.SSS5">
  <label>2.2.5</label><title>Digital elevation model: ASTER</title>
      <p id="d1e546">Orography can affect the development of convective storms. In order to enable the nowcasting system to exploit information about the elevation and morphology of the terrain, we obtained the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) global DEM version 3 <xref ref-type="bibr" rid="bib1.bibx1" id="paren.31"/>. The resolution of the ASTER DEM is 30 m (the data are provided at a resolution of 1 arcsec), much finer than our grid pixel size of 1 km. This allows the computation of subpixel properties of the elevation for each grid point. We computed the mean elevation, the elevation gradients and the surface roughness, defined as the root-mean-square (RMS) deviation from the mean, for each pixel in our grid. As a combined variable, we also computed the upslope flow <inline-formula><mml:math id="M14" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula>, defined as the dot product of the elevation gradient and the flow velocity:
              <disp-formula id="Ch1.E2" content-type="numbered"><label>2</label><mml:math id="M15" display="block"><mml:mrow><mml:mi>s</mml:mi><mml:mo>=</mml:mo><mml:mi mathvariant="normal">∇</mml:mi><mml:mi>h</mml:mi><mml:mo>⋅</mml:mo><mml:mi>v</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M16" display="inline"><mml:mi>h</mml:mi></mml:math></inline-formula> is the elevation and <inline-formula><mml:math id="M17" display="inline"><mml:mi>v</mml:mi></mml:math></inline-formula> is the flow velocity, which in this case is derived from the radar motion vectors. A positive <inline-formula><mml:math id="M18" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> indicates that the air flows predominantly uphill, while a negative <inline-formula><mml:math id="M19" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula> corresponds to downhill flow. When we discuss the importance of data sources later, in Sect. <xref ref-type="sec" rid="Ch1.S4.SS2"/> and <xref ref-type="sec" rid="Ch1.S4.SS3"/>, we consider the information content of the upslope flow to be one of the DEM variables.</p>
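      <p>A minimal sketch of the upslope flow in Eq. (<xref ref-type="disp-formula" rid="Ch1.E2"/>) is given below. It assumes that the gradient is estimated with central differences on the 1 km grid, that the row index increases southward, and that elevation is given in meters and velocity in m s<inline-formula><mml:math display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>; these choices and the function names are illustrative and are not taken from the processing code of this study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def elevation_gradient(h, row, col, dx_m=1000.0):
    """Central-difference gradient of elevation at an interior pixel.

    h is a 2D list of mean elevations in meters. Returns (dh/dx, dh/dy)
    with x pointing east (increasing column) and y pointing north
    (decreasing row, by assumption).
    """
    dh_dx = (h[row][col + 1] - h[row][col - 1]) / (2.0 * dx_m)
    dh_dy = (h[row - 1][col] - h[row + 1][col]) / (2.0 * dx_m)
    return dh_dx, dh_dy


def upslope_flow(grad_h, velocity):
    """Dot product of the elevation gradient and the flow velocity.

    velocity is (eastward, northward). The result is positive for
    predominantly uphill flow and negative for downhill flow.
    """
    return grad_h[0] * velocity[0] + grad_h[1] * velocity[1]
```
]]></preformat>
      <p>For terrain that rises by 10 m per kilometer toward the east, an eastward flow of 10 m s<inline-formula><mml:math display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> gives a positive upslope flow, and reversing the flow direction changes its sign.</p>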
</sec>
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Methods</title>
<sec id="Ch1.S3.SS1">
  <label>3.1</label><title>Data processing</title>
      <p id="d1e629">In order to keep the conclusions of the study general and applicable to different operational environments, we built the data processing workflow from general and widely available methods rather than from a particular operational nowcasting system. The workflow starts with the identification of thunderstorm centers in the MAXZ field. The motion of these centers is then tracked backward and forward in time in a Lagrangian framework by integrating the velocity field obtained with the optical flow method. Once the motion of the center has been estimated, features from different data sources and variables are extracted from the neighborhood of the center at each time step. These features are collected in the ML dataset used to train a gradient-boosting model. Below, we describe each step of the workflow in more detail.</p>
<sec id="Ch1.S3.SS1.SSS1">
  <label>3.1.1</label><title>Extraction of storm centers and tracks</title>
      <p id="d1e639">After processing the data into a single grid as described in Sect. <xref ref-type="sec" rid="Ch1.S2.SS2"/>, we identified regions of active thunderstorms in the data, based on the observed radar reflectivity. For each 5 min time step in the data, we located centers of convective activity using the following procedure:
<list list-type="order"><list-item>
      <p id="d1e646">Start with an empty list of storm centers.</p></list-item><list-item>
      <p id="d1e650">Find the pixel with the highest MAXZ, denoted <inline-formula><mml:math id="M20" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="normal">maxZ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>.</p></list-item><list-item>
      <p id="d1e665">If the MAXZ at <inline-formula><mml:math id="M21" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="normal">maxZ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is at least 37 dBZ:
<list list-type="bullet"><list-item>
      <p id="d1e681">Add <inline-formula><mml:math id="M22" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="normal">maxZ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> to the list of storm centers.</p></list-item><list-item>
      <p id="d1e696">Identify the 25 km diameter circular area surrounding <inline-formula><mml:math id="M23" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="normal">maxZ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and exclude the pixels in it from the rest of the search.</p></list-item><list-item>
      <p id="d1e711">Restart the search from step 2.</p></list-item></list>
Otherwise, end the search.</p></list-item></list>
Thus, storms were identified as regions of high radar reflectivity. We chose the 37 dBZ threshold, which corresponds to a convective precipitation rate of approximately 8 mm h<inline-formula><mml:math id="M24" display="inline"><mml:msup><mml:mi/><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>, following several previous studies in which thunderstorms were identified using radar reflectivity thresholds of 30–40 dBZ <xref ref-type="bibr" rid="bib1.bibx39 bib1.bibx65 bib1.bibx52 bib1.bibx46 bib1.bibx24 bib1.bibx32" id="paren.32"/>. To prevent radar artifacts from being identified as storms, we discarded centers that had a valid MAXZ in fewer than one-third of the pixels in the surrounding 25 km circle.</p>
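      <p>The search procedure above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function name <monospace>find_storm_centers</monospace>, the grid spacing parameter <monospace>pixel_km</monospace>, the use of <monospace>None</monospace> for missing data, the evaluation of the valid-pixel fraction over in-domain pixels only, and the choice to exclude the neighborhood of a center rejected by the artifact filter are all hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def find_storm_centers(maxz, threshold_dbz=37.0, diameter_km=25.0,
                       pixel_km=1.0, min_valid_fraction=1.0 / 3.0):
    """Return (row, col) storm centers found in a 2D MAXZ grid.

    maxz is a list of rows; None marks pixels without a valid measurement.
    pixel_km (grid spacing) is an assumed parameter, not taken from the text.
    """
    n_rows, n_cols = len(maxz), len(maxz[0])
    radius_px = 0.5 * diameter_km / pixel_km
    reach = int(radius_px)
    excluded = [[False] * n_cols for _ in range(n_rows)]

    def neighborhood(r0, c0):
        # In-domain pixels inside the circle centered on (r0, c0).
        pixels = []
        for r in range(max(0, r0 - reach), min(n_rows, r0 + reach + 1)):
            for c in range(max(0, c0 - reach), min(n_cols, c0 + reach + 1)):
                if math.hypot(r - r0, c - c0) <= radius_px:
                    pixels.append((r, c))
        return pixels

    centers = []  # step 1: start with an empty list
    while True:
        # Step 2: find the highest MAXZ among pixels not yet excluded.
        best, best_value = None, -math.inf
        for r in range(n_rows):
            for c in range(n_cols):
                value = maxz[r][c]
                if value is None or excluded[r][c]:
                    continue
                if value > best_value:
                    best, best_value = (r, c), value
        # Step 3: stop once the maximum falls below the threshold.
        if best is None or best_value < threshold_dbz:
            return centers
        pixels = neighborhood(*best)
        n_valid = sum(maxz[r][c] is not None for r, c in pixels)
        # Artifact filter: keep the center only if enough pixels are valid.
        if n_valid >= min_valid_fraction * len(pixels):
            centers.append(best)
        # Exclude the circular area from the rest of the search.
        for r, c in pixels:
            excluded[r][c] = True
```
]]></preformat>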
      <p id="d1e731">Once the centers had been identified, we tracked their movement in the domain so that the temporal evolution of the storm was separated from its movement. To estimate the motion, we computed the motion vectors of the reflectivity using the autocorrelation-based optical flow method implemented in the PySteps package <xref ref-type="bibr" rid="bib1.bibx50" id="paren.33"/>. This method yields a single motion vector; to allow the motion field to vary spatially, we computed one such vector for each point in a square grid with a spacing of 97 pixels, using the <inline-formula><mml:math id="M25" display="inline"><mml:mrow><mml:mn mathvariant="normal">200</mml:mn><mml:mo>×</mml:mo><mml:mn mathvariant="normal">200</mml:mn></mml:mrow></mml:math></inline-formula> pixel MAXZ neighborhood of each grid point. The motion vectors were then interpolated to the storm centers. This method produces motion fields with very smooth gradients and is likely to fail to produce the correct motion for regions with high wind shear. Although more advanced methods are available in PySteps, we found these to be more prone to producing artifacts. The procedure described above is more robust, so we found it to be more suitable for the task required in this study: the automated analysis of tens of thousands of samples.</p>
      <p id="d1e749">For each center, we estimated the past location of the corresponding air parcel by backward integrating the motion vectors using Heun's method <xref ref-type="bibr" rid="bib1.bibx61" id="paren.34"><named-content content-type="pre">also known as the improved Euler's method;</named-content></xref>. At each time step, the advected center may be adjusted by up to 2 pixels to align it with the maximum MAXZ in the neighborhood (we found that 2 pixels was sufficient, and that larger adjustments sometimes caused the tracking to drift to the wrong storm center). For future motion, only the data that would be available in a real-time nowcasting scenario were used, and therefore we computed the future tracks using the last available motion vectors at the reference time. Both the past and the future tracks were computed for 60 min from the reference time. Any tracks that extended out of the study area were discarded. An example of centers and tracks extracted in this manner is shown in Fig. <xref ref-type="fig" rid="Ch1.F2"/>.</p>
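      <p>The backward integration with Heun's method can be illustrated with the following sketch. It is a simplified example under stated assumptions: the function name <monospace>heun_backtrack</monospace> and the <monospace>velocity</monospace> callback are hypothetical, the motion field is treated as constant in time, and the adjustment of up to 2 pixels toward the local MAXZ maximum is omitted.</p>
<preformat preformat-type="code"><![CDATA[
```python
def heun_backtrack(position, velocity, dt, n_steps):
    """Integrate a position backward in time with Heun's method.

    velocity(x, y) is a hypothetical callback returning the motion vector
    (u, v) at a point, in pixels per unit time; dt is the time step.
    Returns the list of positions, starting with the initial one.
    """
    x, y = position
    track = [(x, y)]
    for _ in range(n_steps):
        u1, v1 = velocity(x, y)
        # Predictor: a plain Euler step backward in time.
        x_pred, y_pred = x - dt * u1, y - dt * v1
        u2, v2 = velocity(x_pred, y_pred)
        # Corrector: average the slopes at the start and predicted points.
        x -= 0.5 * dt * (u1 + u2)
        y -= 0.5 * dt * (v1 + v2)
        track.append((x, y))
    return track
```
]]></preformat>
      <p>With a 5 min time step, 12 steps cover the 60 min track length used here; a forward track follows from the same scheme with the sign of the step reversed.</p>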

      <?xmltex \floatpos{t}?><fig id="Ch1.F2"><?xmltex \currentcnt{2}?><?xmltex \def\figurename{Figure}?><label>Figure 2</label><caption><p id="d1e762">An example of the extracted centers at time <inline-formula><mml:math id="M26" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula> (orange circles) and tracks from <inline-formula><mml:math id="M27" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">60</mml:mn></mml:mrow></mml:math></inline-formula> to <inline-formula><mml:math id="M28" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">60</mml:mn></mml:mrow></mml:math></inline-formula> min (orange lines). MAXZ in dBZ is shown in the colored map, with coastlines and state borders shown in the background.</p></caption>
            <?xmltex \igopts{width=236.157874pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f02.png"/>

          </fig>

      <p id="d1e807">The storm identification and tracking scheme was implemented with the objectives of robustness and suitability for ML. Therefore, we opted not to use, for instance, the thunderstorm radar tracking (TRT) cell identification method <xref ref-type="bibr" rid="bib1.bibx24 bib1.bibx25 bib1.bibx26" id="paren.35"/>, which produces variable-sized storm cells, thus complicating analysis. Our scheme approximates the tracking of storm centers but may not always perfectly correspond to it. Therefore, one can state the objective of the ML prediction task more precisely as follows: <italic>given the Lagrangian history of a storm centerpoint selected based on a 37 dBZ reflectivity threshold, predict its future Lagrangian evolution</italic>.</p>
</sec>
<?pagebreak page581?><sec id="Ch1.S3.SS1.SSS2">
  <label>3.1.2</label><title>Feature extraction</title>
      <p id="d1e824">The evolution of a storm over time is described by the changes in the variables in the circular neighborhood of the center. For each variable derived from the data sources described in Sect. <xref ref-type="sec" rid="Ch1.S2.SS2"/>, we extracted the neighborhood mean, the standard deviation and the 10th and 90th percentiles. The percentiles are intended as a soft minimum and a soft maximum and are less sensitive to outliers than the exact minimum and maximum. For Boolean variables such as the occurrence of a lightning event, we also computed binary features that were 1 if the variable was true at any pixel in the neighborhood or 0 otherwise.</p><?xmltex \hack{\newpage}?>
</sec>
<?pagebreak page582?><sec id="Ch1.S3.SS1.SSS3">
  <label>3.1.3</label><title>Datasets</title>
      <p id="d1e838">The final dataset collected from the entire study period and study area comprises 87 626 samples that describe the history and future of the detected storm centers. We divided the samples into a training set that was used to train the ML algorithm, a validation set that was used to evaluate the generalization ability during training and a test set that was used for final evaluation. We found that simply sampling these sets randomly from the data made the training prone to overfitting because storm tracks found at a similar time and location had similar evolutions and thus were not independent samples. In order to improve the independence of the training, validation and testing sets, we determined these sets such that the data from each day (00:00–24:00 UTC) were assigned entirely to only one of these sets, mostly eliminating the overlap between them. We sampled the days randomly until at least 10 % of the data were in the validation set and at least another 10 % were in the test set, and assigned the remaining data to the training set. The final datasets were made up of 69 594 training samples, 9160 validation samples and 8872 testing samples.</p>
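      <p>A day-based split of this kind can be sketched as follows. The function name <monospace>split_by_day</monospace>, the fixed random seed and the order in which the validation and test sets are filled are assumptions made for illustration; only the principle of assigning whole days to a single set is taken from the description above.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random
from collections import defaultdict


def split_by_day(sample_days, valid_frac=0.1, test_frac=0.1, seed=0):
    """Assign whole days to the validation, test or training set.

    sample_days[i] is the UTC day label of sample i. Returns a dict that
    maps each set name to a sorted list of sample indices.
    """
    by_day = defaultdict(list)
    for index, day in enumerate(sample_days):
        by_day[day].append(index)
    days = sorted(by_day)
    random.Random(seed).shuffle(days)

    n_total = len(sample_days)
    split = {"valid": [], "test": [], "train": []}
    for day in days:
        # Fill validation first, then test, until each holds its share.
        if len(split["valid"]) < valid_frac * n_total:
            target = "valid"
        elif len(split["test"]) < test_frac * n_total:
            target = "test"
        else:
            target = "train"
        split[target].extend(by_day[day])
    return {name: sorted(indices) for name, indices in split.items()}
```
]]></preformat>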
</sec>
</sec>
<sec id="Ch1.S3.SS2">
  <label>3.2</label><title>Prediction tasks</title>
      <p id="d1e850">The predictands (i.e., the targets of the ML prediction) evaluated in this study were selected based on their relevance for thunderstorm hazard prediction. We examined qualitatively different predictands in order to assess the differences in the contributions of various data sources to the prediction performance.</p>
      <p id="d1e853">The first prediction task we defined was the nowcasting of the evolution of the column maximum reflectivity on the storm track. This variable is highly indicative of thunderstorm development and can function as an indicator of heavy precipitation and hail. In particular, the radar reflectivity <inline-formula><mml:math id="M29" display="inline"><mml:mi>Z</mml:mi></mml:math></inline-formula> can be approximately related to the rain rate <inline-formula><mml:math id="M30" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> by a relation of the form <inline-formula><mml:math id="M31" display="inline"><mml:mrow><mml:mi>Z</mml:mi><mml:mo>=</mml:mo><mml:mi>a</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mi>b</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M32" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M33" display="inline"><mml:mi>b</mml:mi></mml:math></inline-formula> are empirically determined constants. Hereafter, we refer to the task of predicting the evolution of the column maximum reflectivity as MAXZ. We examined the prediction performance for lead times of between 5 and 60 min.</p>
      <p id="d1e901">Another important thunderstorm hazard that we were able to quantify using the available dataset was the occurrence of lightning. We used the GLM measurements to identify lightning, and approached lightning prediction as a binary task of predicting whether or not lightning will be present in the 25 km diameter neighborhood of the storm center during a given time period. We refer to this task as LIGHTNING-OCC.</p>
      <p id="d1e904">For hail, we did not have direct observations of its occurrence. However, the presence of hail has been found to be well indicated by the height difference between the radar 45 dBZ echo top and the freezing height <xref ref-type="bibr" rid="bib1.bibx63 bib1.bibx11 bib1.bibx4" id="paren.36"/>. Since the freezing level is obtained from NWP data, the principal task is to predict the echo top height. This, of course, is dependent on a 45 dBZ reflectivity being present in the vertical column. Thus, we divided this task into two components: predicting whether a 45 dBZ echo top will be present (ECHO45-OCC) and, in cases where it was present, predicting its height (ECHO45-HT). In an operational setting, this model could be used by first evaluating ECHO45-OCC; if it predicted that a 45 dBZ reflectivity would occur, ECHO45-HT would then be predicted and the freezing level would be subtracted from it in order to calculate the hail probability.</p>
</sec>
<sec id="Ch1.S3.SS3">
  <label>3.3</label><title>Machine learning: gradient tree boosting</title>
      <p id="d1e918">For the ML prediction, we used gradient boosting (GB) to learn the relationship between the features and the prediction targets. GB is an ML technique that uses decision trees, with trees trained iteratively such that each successive tree corrects the errors of the previous trees. The decision trees are regularized using several techniques in order to prevent overfitting. A review of GB methods can be found in <xref ref-type="bibr" rid="bib1.bibx48" id="text.37"/>.</p>
      <p id="d1e924">One particular advantage of GB for our study is that it allows the importance of the various input features to be quantified. Thus, it is well suited to our aim of assessing the value of different data sources and variables for the prediction of thunderstorm hazards. The results can later be used to guide the selection of appropriate features for different ML methods such as deep learning, where the feature importance is less straightforward to derive.</p>
      <p id="d1e927">We used the open-source LightGBM implementation of the GB algorithm <xref ref-type="bibr" rid="bib1.bibx30" id="paren.38"/> as our ML framework. LightGBM is designed to be computationally efficient and has a reduced memory footprint, facilitating the analysis of large datasets.</p>
</sec>
<sec id="Ch1.S3.SS4">
  <label>3.4</label><title>Training</title>
      <p id="d1e941">We tuned the GB training for each of the various prediction tasks. As described in Sect. <xref ref-type="sec" rid="Ch1.S3.SS2"/>, the learning tasks can be broadly divided into two categories: regression tasks and binary classification tasks. In regression tasks, the objective is to predict the future value of a variable that can be any real number; in binary classification tasks, the objective is to predict the probability of an event occurring.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F3" specific-use="star"><?xmltex \currentcnt{3}?><?xmltex \def\figurename{Figure}?><label>Figure 3</label><caption><p id="d1e948">Examples of the prediction of MAXZ. The figure shows the curves of observed and predicted MAXZ (the mean over the 25 km diameter region of interest) for four different tracked centers. The solid lines show the development of MAXZ, while the dashed lines show the predictions after <inline-formula><mml:math id="M34" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula>.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f03.png"/>

        </fig>

      <p id="d1e969">After comparing the performance and robustness of the mean square error (MSE) and mean absolute error (MAE), we decided to use MAE as the training objective function for regression tasks, as it tended to give slightly better results with the validation set, with less overfitting. Indeed, when the model was trained with MAE loss, it achieved better MSE in the validation set than an equivalent model trained with MSE loss. <xref ref-type="bibr" rid="bib1.bibx64" id="text.39"/> and <xref ref-type="bibr" rid="bib1.bibx7" id="text.40"/>, among others, have discussed the relative merits of MAE and MSE in the geoscientific context. We also found<?pagebreak page583?> that, compared to training the GB model for the target variable directly, we could achieve superior training performance by first subtracting the bias-corrected persistence prediction (discussed in more detail in Sect. <xref ref-type="sec" rid="Ch1.S4.SS1"/>) and then training the GB model to predict the residual. Binary tasks were trained using the cross entropy as a cost function.</p>
      <p id="d1e981">All tasks were trained using early stopping based on the validation dataset. That is, the training proceeds as long as the training metric keeps improving not only in the training set but also in the validation set, which is not used for training. The early stop limits the overfitting of the GB model.</p>
      <p id="d1e984">The performance with the validation dataset was also used to tune the hyperparameters of the GB model, most importantly the depth of the trees, the number of leaf nodes, the learning rate and various regularization parameters. Although we were able to achieve some improvements by fine-tuning these parameters, the performance on the validation set was not particularly sensitive to changes over a reasonable range of parameters. As the principal goal of this study was not to strictly optimize the performance of the predictions but rather to assess the importance of the various data sources, we considered the hyperparameter tuning to be of secondary importance in this context and were content to use hyperparameters that produce reasonable results after an informal manual search of the parameter space.</p>
      <p id="d1e987">Using the default hyperparameters, approximately 60 min were required on a modern computer with 16 central processing unit (CPU) cores to train all the GB models: 12 models corresponding to different time steps of the MAXZ prediction and two models (0–30 and 30–60 min) for each of the LIGHTNING-OCC, ECHO45-OCC and ECHO45-HT tasks. Thus, one model took approximately 3 min to train. Evaluating all of the above-mentioned models for the entire testing dataset of 8872 samples required a total of 6 s on the same hardware; this is equivalent to 35 <inline-formula><mml:math id="M35" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>s for one sample and one model.</p><?xmltex \hack{\newpage}?>
</sec>
</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Results and discussion</title>
      <p id="d1e1008">The results of the ML experiments are reported and discussed in the sections below. First, in Sect. <xref ref-type="sec" rid="Ch1.S4.SS1"/> we give a general analysis of the prediction performance. Then we assess the importance of different features and data sources using the GB feature importance (Sect. <xref ref-type="sec" rid="Ch1.S4.SS2"/>) and data exclusion analysis (Sect. <xref ref-type="sec" rid="Ch1.S4.SS3"/>). All reported results are for the test dataset unless otherwise mentioned.</p>
<sec id="Ch1.S4.SS1">
  <label>4.1</label><title>Prediction performance</title>
      <p id="d1e1024">Before evaluating the importance of the various data sources, we quantify the performance of the models in the case where all data sources described in Sect. <xref ref-type="sec" rid="Ch1.S2.SS2"/> are available.</p>
      <p id="d1e1029">Figure <xref ref-type="fig" rid="Ch1.F3"/> shows examples of the real and predicted time series for MAXZ. We note that, in this figure, the MAXZ at <inline-formula><mml:math id="M36" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula> may be less than the 37 dBZ threshold because the MAXZ shown is the mean over the 25 km diameter region of interest, while a MAXZ exceeding 37 dBZ in a single pixel is enough for a case to be selected. Meanwhile, Fig. <xref ref-type="fig" rid="Ch1.F4"/> shows the error, averaged over all events and data points, of the MAXZ predictand as a function of the lead time <inline-formula><mml:math id="M37" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula>. In order to provide a more concrete error figure, we also show the corresponding relative error in the rain rate estimated using the relation <inline-formula><mml:math id="M38" display="inline"><mml:mrow><mml:mi>Z</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">300</mml:mn><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">1.4</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M39" display="inline"><mml:mi>Z</mml:mi></mml:math></inline-formula> is the reflectivity on the linear scale, derived for convective precipitation and frequently used with NEXRAD <xref ref-type="bibr" rid="bib1.bibx40" id="paren.41"><named-content content-type="pre">e.g.,</named-content></xref>. As a baseline prediction, we use the persistence assumption in a Lagrangian framework; that is, it is assumed that the variable will remain the same as it was at time <inline-formula><mml:math id="M40" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula>. 
We found that the persistence assumption is biased: the MAXZ at <inline-formula><mml:math id="M41" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>&gt;</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula> is, on average, lower than that at <inline-formula><mml:math id="M42" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula>; this can also be seen in most of the examples in Fig. <xref ref-type="fig" rid="Ch1.F3"/>. This reflectivity bias has two sources: first, sampling bias, which occurs because we select centers of intensive thunderstorms with <inline-formula><mml:math id="M43" display="inline"><mml:mrow><mml:mi mathvariant="normal">MAXZ</mml:mi><mml:mo>&gt;</mml:mo><mml:mn mathvariant="normal">37</mml:mn></mml:mrow></mml:math></inline-formula> dBZ; and second, drifting of the thunderstorm track from the actual center of the storm due to inaccuracies in the tracking procedure. The bias is small at short lead times and reaches 3.5 dBZ at <inline-formula><mml:math id="M44" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">60</mml:mn></mml:mrow></mml:math></inline-formula> min.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F4"><?xmltex \currentcnt{4}?><?xmltex \def\figurename{Figure}?><label>Figure 4</label><caption><p id="d1e1150">Errors (in dBZ) of the prediction of MAXZ as a function of lead time, obtained using all the available data. The solid lines show the mean absolute error (MAE), while the dashed lines show the root-mean-square error (RMSE) as shown on the scale on the left. The scale on the right shows the logarithmic reflectivity error converted to the relative error in rain rate <inline-formula><mml:math id="M45" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>. The blue lines show the result from the GB tree, the orange lines show the Lagrangian persistence assumption and the red lines show the bias-corrected Lagrangian persistence.</p></caption>
          <?xmltex \igopts{width=236.157874pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f04.png"/>

        </fig>

      <p id="d1e1167"><?xmltex \hack{\newpage}?>We can considerably reduce the error of the persistence assumption by correcting for this bias. In contrast, it is rather difficult to improve on the bias-corrected persistence assumption using the GB model, even if we train the GB model on its residual. In Fig. <xref ref-type="fig" rid="Ch1.F4"/>, it is apparent that the bias correction improves the MAE by <?pagebreak page584?>approximately 1.2 dB at <inline-formula><mml:math id="M46" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">60</mml:mn></mml:mrow></mml:math></inline-formula> min, while the GB model only gives a further 0.3 dB of improvement. Nevertheless, the improvement gained with the ML prediction is consistent and increases with longer lead times.</p>
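      <p>The bias-corrected persistence baseline and the residual training target can be illustrated with the following sketch. It assumes, for illustration only, that the bias is a single constant per lead time, estimated as the mean change of the variable in the training set; the text does not specify the exact form of the correction, and all function names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def persistence_bias(train_now, train_future):
    """Mean change from t = 0 to the lead time, estimated on training data."""
    diffs = [future - now for now, future in zip(train_now, train_future)]
    return sum(diffs) / len(diffs)


def bias_corrected_persistence(now, bias):
    """Lagrangian persistence shifted by the estimated mean change."""
    return [value + bias for value in now]


def residual_targets(now, future, bias):
    """Targets for a model trained to predict the remaining residual."""
    return [f - (n + bias) for n, f in zip(now, future)]


def mean_absolute_error(predicted, observed):
    """Mean absolute difference between predictions and observations."""
    total = sum(abs(p - o) for p, o in zip(predicted, observed))
    return total / len(observed)
```
]]></preformat>
      <p>In this arrangement, the final prediction is the bias-corrected persistence value plus the residual predicted by the model.</p>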
      <p id="d1e1185">For the lightning prediction, the model has an error rate of 8.1 % for LIGHTNING-OCC in the 0–30 min time period and 14.3 % for the 30–60 min period. We can compare these numbers to the climatological occurrence, which would be the error rate of a prediction that lightning never occurs – or conversely, the climatological nonoccurrence would be the error rate of a prediction that it always occurs. The climatological occurrence of lightning in the data is 40.7 % for 0–30 min and 29.2 % for 30–60 min in the test dataset (the difference between these is likely due to the same bias that we discussed in the context of MAXZ above). These results suggest that the nowcasting framework developed here could potentially be adapted to operational lightning nowcasting. Full confusion matrices are shown in Fig. <xref ref-type="fig" rid="Ch1.F5"/>a and b.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F5"><?xmltex \currentcnt{5}?><?xmltex \def\figurename{Figure}?><label>Figure 5</label><caption><p id="d1e1192">Confusion matrices for <bold>(a, b)</bold> the LIGHTNING-OCC prediction task and <bold>(c, d)</bold> the ECHO45-OCC task.</p></caption>
          <?xmltex \igopts{width=170.716535pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f05.png"/>

        </fig>

      <p id="d1e1207">For the presence of the 45 dBZ echo (ECHO45-OCC), we find error rates of 12.7 % for the 0–30 min range and 16.5 % for the 30–60 min range. The corresponding climatological occurrences in the test set are 38.3 % for 0–30 min and 20.0 % for the 30–60 min range. Thus, we achieve a considerable improvement with the ML approach for the near-term prediction but a far more marginal one for the longer term, which implies a more limited ability to predict hail, and other features associated with the 45 dBZ echo top, using the approach applied in this study at lead times over 30 min. The confusion matrices for ECHO45-OCC can be found in Fig. <xref ref-type="fig" rid="Ch1.F5"/>c and d. In the subset of the test dataset where the 45 dBZ echo is present, the height of the 45 dBZ echo is predicted with a MAE of 693 m for 0–30 min and 841 m for 30–60 min. According to the formula in <xref ref-type="bibr" rid="bib1.bibx11" id="text.42"/> for the probability of hail (POH), these correspond to roughly 16 and 19 percentage-point errors in POH, respectively. Meanwhile, the standard deviations of ECHO45-HT in the test set are 1365 m for 0–30 min and 1404 m for 30–60 min.</p>
</sec>
<sec id="Ch1.S4.SS2">
  <label>4.2</label><title>Feature importance</title>
      <p id="d1e1223">The importances of features to the GB model can be extracted from the LightGBM library after training. In this section, we show the “gain” of various features, i.e., the total reduction in the training loss function attributable to that feature. For clarity and brevity of presentation, we sum the contributions from different feature types (e.g., mean, standard deviation) and different time steps. We also separately consider the total contribution from all features of a given data source, which can give a clearer impression of the total importance of a source that includes a large number of correlated features.</p>
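      <p>The aggregation of the gain over feature types and time steps can be sketched as follows. The feature naming scheme <monospace>source/variable/statistic/step</monospace> and the function name are hypothetical; the per-feature gains are assumed to have been obtained beforehand from the trained model. For simplicity, the sketch normalizes the totals at both levels such that the largest equals 1.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def summed_importance(gains, level):
    """Sum per-feature gains by data source or by source variable.

    gains maps hypothetical feature names of the form
    "source/variable/statistic/step" to the gain of that feature.
    level=1 groups by data source; level=2 groups by source variable.
    The totals are normalized so that the largest one equals 1.
    """
    totals = defaultdict(float)
    for name, gain in gains.items():
        key = "/".join(name.split("/")[:level])
        totals[key] += gain
    largest = max(totals.values())
    return {key: total / largest for key, total in totals.items()}
```
]]></preformat>
      <p>Summing in this way shows how a source whose gain is spread over many correlated features can have a large total importance even if none of its individual features ranks highly.</p>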
      <?pagebreak page585?><p id="d1e1226">Figures <xref ref-type="fig" rid="Ch1.F6"/> and <xref ref-type="fig" rid="Ch1.F7"/> show the importances of the various predictors and data sources (e.g., ABI, NWP model or radar) for each of the ML objectives defined in Sect. <xref ref-type="sec" rid="Ch1.S3.SS2"/>. In Fig. <xref ref-type="fig" rid="Ch1.F6"/>, we show the feature and source importances of the MAXZ and LIGHTNING-OCC objectives, while Fig. <xref ref-type="fig" rid="Ch1.F7"/> displays the same for ECHO45-OCC and ECHO45-HT.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F6" specific-use="star"><?xmltex \currentcnt{6}?><?xmltex \def\figurename{Figure}?><label>Figure 6</label><caption><p id="d1e1241">The importances of the various features and data sources for the MAXZ <bold>(a, b)</bold> and LIGHTNING-OCC <bold>(c, d)</bold> predictands according to LightGBM. The top panels show the 20 most important source variables for each predictand. The importances have been summed from all features and time steps and normalized such that the most important variable is scaled to 1. The bottom panels show the total importances of the various data sources as a function of lead time (in panel <bold>b</bold>) or the prediction time range (in panel <bold>d</bold>).</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f06.png"/>

        </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F7" specific-use="star"><?xmltex \currentcnt{7}?><?xmltex \def\figurename{Figure}?><label>Figure 7</label><caption><p id="d1e1265">As Fig. <xref ref-type="fig" rid="Ch1.F6"/>, but for the ECHO45-OCC <bold>(a, b)</bold> and ECHO45-HT <bold>(c, d)</bold> predictands.</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f07.png"/>

        </fig>

      <p id="d1e1282">The statistics of feature importance for nowcasting MAXZ, as shown in Fig. <xref ref-type="fig" rid="Ch1.F6"/>a, demonstrate that the most important features for predicting this target variable come from the NEXRAD radar data. The most significant feature is the column maximum reflectivity – the same variable that is being predicted – but the other radar variables also seem to be utilized. The importances grouped by data source, as displayed in Fig. <xref ref-type="fig" rid="Ch1.F6"/>b, show a slightly different view, as in this case the NWP data are of similar importance to the radar. The reason for the apparent discrepancy between the importances of individual features and the total source importance is that the contribution of the NWP data is divided over a large number of variables that are correlated to varying degrees. Because of the correlations, the contributions from individual variables appear small, as the GB model splits the gain between the correlated variables, but the contributions combine to give an importance comparable to the radar variables when summed together. Figure <xref ref-type="fig" rid="Ch1.F6"/>b shows some variation in the relative importances of the radar and NWP data between time steps, which we consider to be most likely mere random noise; as the different time steps are predicted by different, independently trained models, they may end up with slightly different values for the feature importance. In general, the importance of the NWP data tends to increase slightly with longer lead times, as was also found in earlier nowcasting studies <xref ref-type="bibr" rid="bib1.bibx33" id="paren.43"><named-content content-type="pre">e.g.,</named-content></xref>. The GOES-16 ABI data are also utilized to a significant degree, while the GLM and ASTER data contribute to a lesser extent.</p>
      <p id="d1e1296">The feature and source importances for LIGHTNING-OCC, as shown in Fig. <xref ref-type="fig" rid="Ch1.F6"/>c and d, are dominated by contributions of the GLM lightning data. This is largely because a region that is already producing lightning is likely to continue to do so in the future, thus providing a reliable predictor, but past occurrence can also indicate temporal tendencies in lightning activity. The NEXRAD, ABI and ECMWF data are considered to be approximately equally important, with their total importance (Fig. <xref ref-type="fig" rid="Ch1.F6"/>d) relative to GLM increasing with longer lead times.</p>
      <p id="d1e1303">Figure <xref ref-type="fig" rid="Ch1.F7"/> shows the feature importances for ECHO45-OCC and ECHO45-HT. Similar to the nowcasting of MAXZ, the most important features are from the NEXRAD radar data. Again, this is not unexpected given that the target variables are defined using the radar. The ECMWF and ABI data contribute less to the prediction of the echo top than to the prediction of MAXZ. According to this analysis, the GLM data are hardly used, while the ASTER DEM data seem to provide a small contribution to the prediction of echo top height. However, as we shall discuss in Sect. <xref ref-type="sec" rid="Ch1.S4.SS3"/>, GLM actually provides useful data in the absence of other predictors, while the importance of the DEM may be due to overfitting.</p>
</sec>
<sec id="Ch1.S4.SS3">
  <label>4.3</label><title>Exclusion studies</title>
      <p id="d1e1318">An alternative way to assess the importance of various data sources is to remove one or more data sources from the training data, retrain the model, and evaluate the change in prediction performance. This approach may give later studies a clearer picture of the value of various data sources in thunderstorm nowcasting applications. Unlike the feature importance, such an exclusion study also allows the use of the testing set for evaluation, showing which variables are important in practice and allowing us to better distinguish generalizing learning ability from overfitting. The results of the exclusion experiments are shown in Fig. <xref ref-type="fig" rid="Ch1.F8"/> (MAXZ and LIGHTNING-OCC) and Fig. <xref ref-type="fig" rid="Ch1.F9"/> (ECHO45-OCC and ECHO45-HT). We also show the equivalent results for the training and validation datasets in Figs. <xref ref-type="fig" rid="App1.Ch1.S1.F10"/>–<xref ref-type="fig" rid="App1.Ch1.S1.F13"/>.</p>
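      <p>The structure of such an exclusion study can be sketched as a loop over all non-empty combinations of the five data sources. The callback <monospace>train_and_score</monospace> is a hypothetical placeholder for retraining the model on the given sources and returning an error metric on the test set; the case with no sources corresponds to the baseline and is not trained.</p>
<preformat preformat-type="code"><![CDATA[
```python
from itertools import combinations

SOURCES = ("NEXRAD", "ABI", "ECMWF", "GLM", "ASTER")


def exclusion_study(train_and_score, sources=SOURCES):
    """Evaluate a model for every non-empty combination of data sources.

    train_and_score(subset) is a hypothetical callback that retrains the
    model using only the sources in subset and returns an error metric.
    Returns a dict mapping each subset (a tuple) to its metric.
    """
    results = {}
    # Start from all sources and progressively remove more of them.
    for size in range(len(sources), 0, -1):
        for subset in combinations(sources, size):
            results[subset] = train_and_score(subset)
    return results
```
]]></preformat>
      <p>With five sources, this gives 31 trained models per predictand, which together with the baseline fill the 32 squares of each panel.</p>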

      <?xmltex \floatpos{t}?><fig id="Ch1.F8" specific-use="star"><?xmltex \currentcnt{8}?><?xmltex \def\figurename{Figure}?><label>Figure 8</label><caption><p id="d1e1331">The results of exclusion experiments on the MAXZ and LIGHTNING-OCC predictands. Each square in panels <bold>(a)</bold>–<bold>(d)</bold> corresponds to a combination of data sources, which can be found by combining the sources listed for the row and the column. For example, the top left square of each panel shows the error metric obtained using all five data sources, while the second column of the second row shows the metric for ECMWF, GLM and NEXRAD data. The predictand and the error are shown on top of each panel; RMSE indicates the root-mean-square error and MAE the mean absolute error. In panels <bold>(a)</bold> and <bold>(b)</bold>, the bottom right corner shows the result obtained with the bias-corrected persistence assumption (see Sect. <xref ref-type="sec" rid="Ch1.S4.SS1"/>), while in panel <bold>(d)</bold>, the bottom right corner shows the baseline climatological occurrence. The results for LIGHTNING-OCC are shown for the 30–60 min time interval.</p></caption>
          <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f08.png"/>

        </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F9" specific-use="star"><?xmltex \currentcnt{9}?><?xmltex \def\figurename{Figure}?><label>Figure 9</label><caption><p id="d1e1360">As Fig. <xref ref-type="fig" rid="Ch1.F8"/>, but for the ECHO45-OCC and ECHO45-HT predictands. In panel <bold>(b)</bold>, the bottom right corner shows the climatological occurrence (unlike with MAXZ, we do not use the persistence assumption as a baseline for ECHO45-HT, and therefore nothing is shown in the bottom right corner of panels <bold>(c)</bold> and <bold>(d)</bold>, in contrast to Fig. <xref ref-type="fig" rid="Ch1.F8"/>a and b). The results are shown for the 0–30 min time interval.</p></caption>
          <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f09.png"/>

        </fig>

      <p id="d1e1383">The results for the MAXZ predictand at 60 min lead time can be found in Fig. <xref ref-type="fig" rid="Ch1.F8"/>a and b. There is some noise in the results, so small differences in the metrics should not be overinterpreted, but certain general patterns are apparent. Most noticeably, the two leftmost columns, which correspond to models that have the NEXRAD radar data available, show consistently lower errors than the two columns on the right. The ABI data also have a positive effect, as can be seen by comparing the first column to the second, or the third to the fourth. Comparing the rows shows that the GLM data have a slight positive effect, especially when few other data sources are available. It is difficult to discern any consistent effect from including the ECMWF data, and including the ASTER data sometimes even appears to make the metrics slightly worse. The latter result may be caused by the GB training process overfitting to the DEM features, degrading the results obtained during testing. Among the predictions obtained using only one data source, the one with NEXRAD data yields the best results and is almost as good as using all data sources together; the ABI and GLM data provide slight improvements over the baseline case (shown in the bottom right corner), while the model using only the NWP data from ECMWF yields results approximately equal to the persistence baseline. The ECMWF-only result is quite surprising considering the large weight assigned to the ECMWF features in the feature importance analysis (Fig. <xref ref-type="fig" rid="Ch1.F6"/>a and b). For unclear reasons, the single best combination seems to be that which uses all data sources except GLM; we suspect that this result is merely coincidental and due to random variation because the results in Fig. <xref ref-type="fig" rid="Ch1.F8"/>a and b do not suggest that the GLM data are detrimental to prediction performance. 
The results for the training and validation sets (Appendix Figs. <xref ref-type="fig" rid="App1.Ch1.S1.F10"/>a, b and <xref ref-type="fig" rid="App1.Ch1.S1.F12"/>a, b) support this interpretation, since the best combination in the validation dataset is that of NEXRAD and GLM. More generally, the results for the training and validation datasets exhibit patterns similar to those in the test set, which suggests that while individual differences may be attributable to noise, the broader conclusions of the analysis are robust.</p>
      <p id="d1e1396"><?xmltex \hack{\newpage}?>The metrics for LIGHTNING-OCC, shown in Fig. <xref ref-type="fig" rid="Ch1.F8"/>c and d for the 30–60 min time interval, also show a clear pattern. Here, the first, second, fourth and sixth rows, which correspond to the GLM data being available, show better metrics than the other rows. Indeed, prediction using <italic>only</italic> the GLM data performs very well, achieving an error rate of 15.5 %. However, it is interesting to note that good results can be obtained without the direct lightning data as well; for example, the error rate obtained using only the ABI and NEXRAD (15.3 %) data is better than the GLM-only error rate and only 1.2 percentage points worse than the best result (14.1 %). This shows that the feature importance analysis shown in Sect. <xref ref-type="sec" rid="Ch1.S4.SS2"/> is only valid for a specific combination of predictors. When one data source is removed, the missing information can often be substituted by other data sources that were much less used in the case in which everything was available. In general, both the ABI and NEXRAD data improve the prediction results for LIGHTNING-OCC: the first column (with both ABI and NEXRAD available) has the best results overall, the comparison between the second (NEXRAD only) and the third (ABI only) is mixed, and the fourth (neither ABI nor NEXRAD available) has the worst results. The results obtained with the ECMWF data are rather odd. First, in contrast to the MAXZ prediction, the ECMWF data have some skill at predicting lightning on their own, as the ECMWF-only prediction has an error rate that is 2.7 percentage points better than the climatological average of 29.2 %. Second, prediction with both ECMWF and ABI performs 4.6 percentage points better than ABI-only prediction, but adding the ECMWF data to NEXRAD does not offer an improvement over the NEXRAD-only scores. This result may be simply noise, as in the validation dataset (Fig. 
<xref ref-type="fig" rid="App1.Ch1.S1.F12"/>d) the addition of ECMWF data also improves the results obtained with NEXRAD. The ASTER data do not add much information, and the ASTER-only prediction, with its 38.7 % error rate, actually performs worse than the climatology. The reason is that the climatological occurrence in the training dataset, at 42.5 %, happens to be considerably higher than in the test dataset. The ASTER-only model is unable to generalize with the scarce data available to it and only learns to roughly<?pagebreak page587?> reproduce the climatological occurrence of the training dataset, which leads to the overestimation of occurrence in the test set. Indeed, the degradation of the metrics with the addition of ASTER does not occur in the training and validation datasets.</p>
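      <p>The comparisons above mix differences in percentage points with relative changes, and the distinction can be made explicit with a short calculation. The snippet below is a sketch that uses only the LIGHTNING-OCC error rates quoted in the text; the variable names and the helper <monospace>relative_reduction</monospace> are illustrative and not part of the study.</p>
      <preformat>
```python
# Error rates (in percent) quoted in the text for LIGHTNING-OCC, test set.
climatology = 29.2
glm_only = 15.5
abi_nexrad = 15.3
best = 14.1

# Difference in percentage points between ABI+NEXRAD and the best result.
gap_to_best = round(abi_nexrad - best, 1)

# ECMWF-only error rate implied by "2.7 percentage points better than
# the climatological average".
ecmwf_only = round(climatology - 2.7, 1)


def relative_reduction(before, after):
    """Reduction of the error rate as a percentage of the starting value."""
    return 100.0 * (before - after) / before


glm_to_best = round(relative_reduction(glm_only, best), 1)
clim_to_best = round(relative_reduction(climatology, best), 1)

print(gap_to_best, ecmwf_only, glm_to_best, clim_to_best)
```
      </preformat>
      <p>With these numbers, the ABI and NEXRAD combination trails the best result by 1.2 percentage points, the implied ECMWF-only error rate is 26.5 %, and moving from the GLM-only model to the best combination reduces the error rate by about 9 % in relative terms.</p>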
      <p id="d1e1409">For both ECHO45-OCC and ECHO45-HT (Fig. <xref ref-type="fig" rid="Ch1.F9"/>, shown for the 0–30 min interval), the clearest pattern is the importance of the radar data for prediction, which is consistent with the feature importance analysis. Indeed, as long as the NEXRAD data are available, the benefit of adding further data sources is negligible compared to the NEXRAD-only case (error rate of 12.5 %). However, without the NEXRAD data (e.g., in oceanic regions without radar coverage), the other data sources still provide meaningful improvements in ECHO45-OCC over the climatological occurrence of 38.3 %. For example, ABI alone achieves an error rate of 25.3 %, ECMWF alone yields 27.8 %, and GLM alone achieves 21.4 %, while these three sources together reach 20.0 %. Similar patterns are found in ECHO45-HT. These results further demonstrate that the benefits of features and data sources cannot be evaluated in isolation and depend on the other data sources used.</p>
</sec>
</sec>
<sec id="Ch1.S5" sec-type="conclusions">
  <label>5</label><title>Conclusions</title>
      <p id="d1e1424">For machine-learning methods to be utilized effectively for thunderstorm nowcasting, it is necessary for the benefits of the various available data sources to be well understood and quantified. Large amounts of data that are potentially related to convective processes can be obtained from numerous sources, yet it is not always obvious how much benefit one should expect from adding a data source, and therefore additional complexity, to an ML model. This study provides guidance for future work to better select data sources for nowcasting particular thunderstorm hazards along predicted storm tracks. We obtained data from ground-based radar (NEXRAD), satellite imagery (GOES-16 ABI), satellite-based lightning detection (GOES-16 GLM), a numerical weather prediction model (ECMWF IFS) and a digital elevation model (ASTER), for a total of over 100 input variables. We applied these data to nowcast variables related to precipitation, lightning and hail formation.</p>
      <p id="d1e1427">We have based our evaluation of the importance of various features on two complementary approaches: first, using the feature importance provided by the gradient-boosted tree<?pagebreak page588?> algorithm, and second, retraining the GB algorithm repeatedly using different subsets of the input variables. Testing all possible combinations of input features would have quickly become infeasible as the number of features increased, but grouping the features by data source allowed us to cover the most realistic situations of missing data, where an entire data source is unavailable due to either geographical limitations (for example, operational weather radar networks do not cover the oceans) or irregular data outages.</p>
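      <p>The scale of the problem can be illustrated with a short count of the possible subsets. This is a sketch only: the value of 100 features is a lower bound taken from the "over 100 input variables" mentioned above, and the variable names are illustrative.</p>
      <preformat>
```python
N_SOURCES = 5     # NEXRAD, ABI, GLM, ECMWF and ASTER
N_FEATURES = 100  # lower bound: the study uses "over 100" input variables

# Every source (or feature) is either included or excluded, so the number
# of subsets, including the empty one, is 2 to the power of the count.
n_source_subsets = 2 ** N_SOURCES
n_feature_subsets = 2 ** N_FEATURES

print(n_source_subsets)
print(f"{n_feature_subsets:.2e}")
```
      </preformat>
      <p>Grouping by data source therefore requires at most 32 retraining runs, whereas the number of feature-level subsets is of the order of 10<sup>30</sup>.</p>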
      <?pagebreak page589?><p id="d1e1430">Each of the investigated data sources proved to be useful for predicting at least some of the target variables, except for the DEM, which provides no detectable benefit for any of the predictands. The radar variables are strong predictors for all predictands and are particularly dominant for the targets defined using the radar data. The satellite imagery from ABI provides moderate performance improvements to all predictands, though it is generally less significant than the radar data in this application. The GLM lightning data are highly useful for lightning prediction; for other targets, they provide more modest benefits, although they can still provide improvements to nowcasting performance, particularly when radar data are not available. More generally, the results confirm that satellite data can be used to provide ML-based nowcasts in areas without radar coverage, such as over the oceans and in less-developed regions lacking ground-based radar networks. Meanwhile, the ECMWF forecast data, despite being assigned considerable importance by the ML algorithm, provide only limited benefit to the nowcast according to the data exclusion analysis, as, for the lead times investigated here, much of the necessary information content is already contained in the observations.</p>
      <p id="d1e1433">The results show that the feature importance from the GB algorithm may seemingly contradict the more comprehensive analysis achieved by testing different combinations of features and evaluating the results. Although the two evaluation methods largely agreed on which data sources are the most important, some notable differences emerged on closer inspection. This highlights the pitfalls of analyzing the importance of features and data sources in an ML setting when the data sources are partially redundant. A given feature may be beneficial when used alone, but virtually useless when used in conjunction with another, more powerful predictor. For instance, when trained for lightning prediction, the ML algorithm only gains a modest improvement (approximately 10 %) in the error rate from auxiliary data sources when direct lightning data are available, but when trained without lightning data, good prediction performance can still be achieved utilizing the other data sources. This redundancy has important implications for real-time nowcasting in time-critical applications such as aviation, as it indicates that ML-based nowcasting can be performed robustly when some input data are missing or delayed.</p>
      <p id="d1e1437"><?xmltex \hack{\newpage}?>Based on the results, we conclude that investigators should be cautious about applying the brute-force strategy of providing ML algorithms with all the available data and letting the training process decide which data are useful. While this may sometimes reveal unexpectedly useful input variables, using data sources that contain little or no generalizable information may also expose the training process to the problem of overfitting, thus actually degrading the results. This can be mitigated by early stopping and by using hyperparameters designed to prevent overfitting, but it is better for both accuracy and training time to simply drop the counterproductive data sources.</p>
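      <p>The early stopping mentioned above can be sketched generically as follows. This is a minimal, library-independent illustration and not the implementation used in this study: the function name <monospace>best_round_with_early_stopping</monospace>, its parameters <monospace>patience</monospace> and <monospace>min_delta</monospace>, and the validation losses are all hypothetical.</p>
      <preformat>
```python
def best_round_with_early_stopping(val_losses, patience=3, min_delta=0.0):
    """Return the 0-based boosting round with the best validation loss.

    Scanning stops once 'patience' consecutive rounds fail to improve the
    best loss by more than 'min_delta'. Returns -1 for an empty sequence.
    """
    best_loss = float("inf")
    best_round = -1
    rounds_without_gain = 0
    for i, loss in enumerate(val_losses):
        if best_loss - loss > min_delta:
            # Validation loss improved: remember this round.
            best_loss = loss
            best_round = i
            rounds_without_gain = 0
        else:
            rounds_without_gain += 1
            if rounds_without_gain >= patience:
                break
    return best_round


# Hypothetical validation losses: improvement stalls after round 3.
losses = [0.60, 0.48, 0.42, 0.40, 0.41, 0.43, 0.44, 0.39]
print(best_round_with_early_stopping(losses, patience=3))
```
      </preformat>
      <p>In this example the scan stops after three rounds without improvement and never reaches the final, lower value, which illustrates that the choice of patience trades training time against the risk of stopping too early.</p>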
      <p id="d1e1441">Future work can take advantage of the results achieved in this study to build more accurate and efficient ML models for the nowcasting of the thunderstorm hazards of heavy precipitation, lightning and hail. The exclusion analysis also allows the degradation of the results to be estimated when one observation system is missing. Nevertheless, this work is constrained to a particular set of data sources, a single study area and a specific ML method: the gradient-boosted tree. Although we have selected five commonly utilized data sources, all qualitatively different from each other, later work should extend the analysis to other data sources such as polar-orbiting satellites and ground-based lightning networks. Given suitable data sources, the methodology could also be extended to other hazards such as wind damage and tornado events. It may furthermore be interesting to investigate additional regions; for instance, the DEM data may be more significant in regions with higher mountains. With regard to alternative ML methods, the performance of neural networks should be evaluated in a future study, preferably using the same dataset to facilitate comparisons, as convolutional neural networks are expected to be able to better utilize spatial features such as the high-resolution imagery from the ABI instrument. Neural networks may also be able to utilize large numbers of samples and input variables better.</p>
</sec>

      
      </body>
    <back><app-group>

<app id="App1.Ch1.S1">
  <?xmltex \currentcnt{A}?><label>Appendix A</label><title>Exclusion studies on training and validation datasets</title>
      <p id="d1e1455">The exclusion analyses shown in Sect. <xref ref-type="sec" rid="Ch1.S4.SS3"/> for the test dataset were also performed with the training and validation datasets. The results are shown in Figs. <xref ref-type="fig" rid="App1.Ch1.S1.F10"/> and <xref ref-type="fig" rid="App1.Ch1.S1.F11"/> for the training set and in Figs. <xref ref-type="fig" rid="App1.Ch1.S1.F12"/> and <xref ref-type="fig" rid="App1.Ch1.S1.F13"/> for the validation set.</p><?xmltex \hack{\clearpage}?><?xmltex \floatpos{t}?><fig id="App1.Ch1.S1.F10" specific-use="star"><?xmltex \currentcnt{A1}?><?xmltex \def\figurename{Figure}?><label>Figure A1</label><caption><p id="d1e1470">As Fig. <xref ref-type="fig" rid="Ch1.F8"/>, but for the training dataset.</p></caption>
        <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f10.png"/>

      </fig>

      <?xmltex \floatpos{t}?><fig id="App1.Ch1.S1.F11" specific-use="star"><?xmltex \currentcnt{A2}?><?xmltex \def\figurename{Figure}?><label>Figure A2</label><caption><p id="d1e1483">As Fig. <xref ref-type="fig" rid="Ch1.F9"/>, but for the training dataset.</p></caption>
        <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f11.png"/>

      </fig>

<?xmltex \hack{\clearpage}?><?xmltex \floatpos{t}?><fig id="App1.Ch1.S1.F12" specific-use="star"><?xmltex \currentcnt{A3}?><?xmltex \def\figurename{Figure}?><label>Figure A3</label><caption><p id="d1e1498">As Fig. <xref ref-type="fig" rid="Ch1.F8"/>, but for the validation dataset.</p></caption>
        <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f12.png"/>

      </fig>

      <?xmltex \floatpos{t}?><fig id="App1.Ch1.S1.F13" specific-use="star"><?xmltex \currentcnt{A4}?><?xmltex \def\figurename{Figure}?><label>Figure A4</label><caption><p id="d1e1511">As Fig. <xref ref-type="fig" rid="Ch1.F9"/>, but for the validation dataset.</p></caption>
        <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://nhess.copernicus.org/articles/22/577/2022/nhess-22-577-2022-f13.png"/>

      </fig>

<?xmltex \hack{\clearpage}?>
</app>

<?pagebreak page592?><app id="App1.Ch1.S2">
  <?xmltex \currentcnt{B}?><label>Appendix B</label><title>Further information on the datasets</title>
      <p id="d1e1532">Table <xref ref-type="table" rid="App1.Ch1.S2.T1"/> lists the radars used to compile the NEXRAD dataset. Tables <xref ref-type="table" rid="App1.Ch1.S2.T2"/>–<xref ref-type="table" rid="App1.Ch1.S2.T6"/> list the predictors from the various data sources that were used in this study.</p>

<?xmltex \floatpos{h!}?><table-wrap id="App1.Ch1.S2.T1"><?xmltex \hack{\hsize\textwidth}?><?xmltex \currentcnt{B1}?><label>Table B1</label><caption><p id="d1e1545">NEXRAD radars used to produce the dataset used in this study.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="2">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Location</oasis:entry>
         <oasis:entry colname="col2">Code</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Albany, New York</oasis:entry>
         <oasis:entry colname="col2">KENX</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Binghamton, New York</oasis:entry>
         <oasis:entry colname="col2">KBGM</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Buffalo, New York</oasis:entry>
         <oasis:entry colname="col2">KBUF</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Burlington, Vermont</oasis:entry>
         <oasis:entry colname="col2">KCXX</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Boston, Massachusetts</oasis:entry>
         <oasis:entry colname="col2">KBOX</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Fort Drum, New York</oasis:entry>
         <oasis:entry colname="col2">KTYX</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">New York City, New York</oasis:entry>
         <oasis:entry colname="col2">KOKX</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Philadelphia, Pennsylvania</oasis:entry>
         <oasis:entry colname="col2">KDIX</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Portland, Maine</oasis:entry>
         <oasis:entry colname="col2">KGYX</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Pittsburgh, Pennsylvania</oasis:entry>
         <oasis:entry colname="col2">KPBZ</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">State College, Pennsylvania</oasis:entry>
         <oasis:entry colname="col2">KCCX</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<?xmltex \floatpos{h!}?><table-wrap id="App1.Ch1.S2.T2"><?xmltex \hack{\hsize\textwidth}?><?xmltex \currentcnt{B2}?><label>Table B2</label><caption><p id="d1e1676">Variables from the NEXRAD radar adopted in this study.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="1">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">25 dBZ echo top height</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">35 dBZ echo top height</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">45 dBZ echo top height</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Maximum reflectivity</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Vertically integrated liquid</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M47" display="inline"><mml:mrow><mml:mi>U</mml:mi><mml:mo>/</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:math></inline-formula> motion components from optical flow</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<?xmltex \floatpos{h!}?><table-wrap id="App1.Ch1.S2.T3"><?xmltex \hack{\hsize\textwidth}?><?xmltex \currentcnt{B3}?><label>Table B3</label><caption><p id="d1e1742">Variables from the GOES-16 ABI instrument adopted in this study.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="left" colsep="1"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry namest="col1" nameend="col3" align="center">Level 1 </oasis:entry>
         <oasis:entry colname="col4">Level 2</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">ABI band 01 (0.47 <inline-formula><mml:math id="M48" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col2">ABI band 09 (6.9 <inline-formula><mml:math id="M49" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col3">Difference 07–08</oasis:entry>
         <oasis:entry colname="col4">Cloud top height</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ABI band 02 (0.64 <inline-formula><mml:math id="M50" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col2">ABI band 10 (7.3 <inline-formula><mml:math id="M51" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col3">Difference 07–09</oasis:entry>
         <oasis:entry colname="col4">Cloud top pressure</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ABI band 03 (0.86 <inline-formula><mml:math id="M52" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col2">ABI band 11 (8.4 <inline-formula><mml:math id="M53" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col3">Difference 07–10</oasis:entry>
         <oasis:entry colname="col4">Cloud optical depth</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ABI band 04 (1.37 <inline-formula><mml:math id="M54" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col2">ABI band 12 (9.6 <inline-formula><mml:math id="M55" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col3">Difference 08–09</oasis:entry>
         <oasis:entry colname="col4">CAPE</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ABI band 05 (1.6 <inline-formula><mml:math id="M56" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col2">ABI band 13 (10.3 <inline-formula><mml:math id="M57" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col3">Difference 08–10</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M58" display="inline"><mml:mi>K</mml:mi></mml:math></inline-formula>-index</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ABI band 06 (2.2 <inline-formula><mml:math id="M59" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col2">ABI band 14 (11.2 <inline-formula><mml:math id="M60" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col3">Difference 11–13</oasis:entry>
         <oasis:entry colname="col4">Lifted index</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ABI band 07 (3.9 <inline-formula><mml:math id="M61" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col2">ABI band 15 (12.3 <inline-formula><mml:math id="M62" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col3">Difference 12–13</oasis:entry>
         <oasis:entry colname="col4">Showalter index</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">ABI band 08 (6.2 <inline-formula><mml:math id="M63" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col2">ABI band 16 (13.3 <inline-formula><mml:math id="M64" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">µ</mml:mi></mml:mrow></mml:math></inline-formula>m)</oasis:entry>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">Total totals index</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<?xmltex \floatpos{h!}?><table-wrap id="App1.Ch1.S2.T4"><?xmltex \hack{\hsize\textwidth}?><?xmltex \currentcnt{B4}?><label>Table B4</label><caption><p id="d1e2032">Variables from the GOES-16 GLM instrument adopted in this study.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="1">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Flash density</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Flash energy density</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Event density</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Event energy density</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<?xmltex \hack{\newpage}?><?xmltex \floatpos{h!}?><table-wrap id="App1.Ch1.S2.T5" specific-use="star"><?xmltex \hack{\hsize\textwidth}?><?xmltex \currentcnt{B5}?><label>Table B5</label><caption><p id="d1e2074">Variables from the ECMWF model output adopted in this study.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="2">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">0 <inline-formula><mml:math id="M65" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C isothermal level</oasis:entry>
         <oasis:entry colname="col2">Mean sea level pressure</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">10 m <inline-formula><mml:math id="M66" display="inline"><mml:mrow><mml:mi>U</mml:mi><mml:mo>/</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:math></inline-formula> wind components</oasis:entry>
         <oasis:entry colname="col2">Medium cloud cover</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">100 m <inline-formula><mml:math id="M67" display="inline"><mml:mrow><mml:mi>U</mml:mi><mml:mo>/</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:math></inline-formula> wind components</oasis:entry>
         <oasis:entry colname="col2">Potential evaporation</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">200 m <inline-formula><mml:math id="M68" display="inline"><mml:mrow><mml:mi>U</mml:mi><mml:mo>/</mml:mo><mml:mi>V</mml:mi></mml:mrow></mml:math></inline-formula> wind components</oasis:entry>
         <oasis:entry colname="col2">Precipitation type</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">2 m dewpoint temperature</oasis:entry>
         <oasis:entry colname="col2">Skin reservoir content</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">2 m temperature</oasis:entry>
         <oasis:entry colname="col2">Skin temperature</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Boundary layer dissipation</oasis:entry>
         <oasis:entry colname="col2">Snowfall</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Boundary layer height</oasis:entry>
         <oasis:entry colname="col2">Surface latent heat flux</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Cloud base height</oasis:entry>
         <oasis:entry colname="col2">Surface pressure</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Convective available potential energy</oasis:entry>
         <oasis:entry colname="col2">Surface net solar radiation</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Convective available potential energy shear</oasis:entry>
         <oasis:entry colname="col2">Surface net solar radiation, clear sky</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Convective inhibition</oasis:entry>
         <oasis:entry colname="col2">Surface net thermal radiation</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Convective precipitation</oasis:entry>
         <oasis:entry colname="col2">Surface net thermal radiation, clear sky</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Convective rain rate</oasis:entry>
         <oasis:entry colname="col2">Surface sensible heat flux</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Convective snowfall rate water equivalent</oasis:entry>
         <oasis:entry colname="col2">Total cloud cover</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Evaporation</oasis:entry>
         <oasis:entry colname="col2">Total column cloud ice water</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Friction velocity</oasis:entry>
         <oasis:entry colname="col2">Total column cloud liquid water</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Geopotential</oasis:entry>
         <oasis:entry colname="col2">Total column rain water</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Height of convective cloud top</oasis:entry>
         <oasis:entry colname="col2">Total column snow water</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Height of 1 <inline-formula><mml:math id="M69" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C wet-bulb temperature</oasis:entry>
         <oasis:entry colname="col2">Total column supercooled liquid water</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Height of 0 <inline-formula><mml:math id="M70" display="inline"><mml:msup><mml:mi/><mml:mo>∘</mml:mo></mml:msup></mml:math></inline-formula>C wet-bulb temperature</oasis:entry>
         <oasis:entry colname="col2">Total column water</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">High cloud cover</oasis:entry>
         <oasis:entry colname="col2">Total column water vapor</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M71" display="inline"><mml:mi>K</mml:mi></mml:math></inline-formula>-index</oasis:entry>
         <oasis:entry colname="col2">Total precipitation</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Large-scale precipitation</oasis:entry>
         <oasis:entry colname="col2">Total precipitation rate</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Large-scale precipitation fraction</oasis:entry>
         <oasis:entry colname="col2">Total totals index</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Large-scale rain rate</oasis:entry>
         <oasis:entry colname="col2">Vertically integrated moisture divergence</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Large-scale snowfall rate water equivalent</oasis:entry>
         <oasis:entry colname="col2">Vertical integral of eastward water vapor flux</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Low cloud cover</oasis:entry>
         <oasis:entry colname="col2">Vertical integral of northward water vapor flux</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<?xmltex \floatpos{h!}?><table-wrap id="App1.Ch1.S2.T6"><?xmltex \currentcnt{B6}?><label>Table B6</label><caption><p id="d1e2417">Variables from the ASTER DEM adopted in this study.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="1">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Mean elevation</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Roughness</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Surface gradient</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Upslope flow</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<?xmltex \hack{\vspace*{220mm}}?>
</app>
  </app-group><notes notes-type="codedataavailability"><title>Code and data availability</title>

      <p id="d1e2460">The ML code used to produce the results and the feature dataset used to train the ML models can be found at <ext-link xlink:href="https://doi.org/10.5281/zenodo.6206919" ext-link-type="DOI">10.5281/zenodo.6206919</ext-link> <xref ref-type="bibr" rid="bib1.bibx37" id="paren.44"/>.</p>

      <p id="d1e2469">The original datasets are described, with instructions for downloading and reading, in the following sources in the References:
<list list-type="bullet"><list-item>
      <p id="d1e2474">NEXRAD radar data: <ext-link xlink:href="https://doi.org/10.7289/V5W9574V" ext-link-type="DOI">10.7289/V5W9574V</ext-link> <xref ref-type="bibr" rid="bib1.bibx49" id="paren.45"/></p></list-item><list-item>
      <p id="d1e2483">GOES ABI L1b data: <ext-link xlink:href="https://doi.org/10.7289/V5BV7DSR" ext-link-type="DOI">10.7289/V5BV7DSR</ext-link> <xref ref-type="bibr" rid="bib1.bibx18" id="paren.46"/></p></list-item><list-item>
      <p id="d1e2492">GOES ABI L2 products:
<list list-type="bullet"><list-item>
      <p id="d1e2497">Cloud top height: <ext-link xlink:href="https://doi.org/10.7289/V5HX19ZQ" ext-link-type="DOI">10.7289/V5HX19ZQ</ext-link> <xref ref-type="bibr" rid="bib1.bibx14" id="paren.47"/></p></list-item><list-item>
      <p id="d1e2506">Cloud optical depth: <ext-link xlink:href="https://doi.org/10.7289/V58G8J02" ext-link-type="DOI">10.7289/V58G8J02</ext-link> <xref ref-type="bibr" rid="bib1.bibx15" id="paren.48"/></p></list-item><list-item>
      <p id="d1e2515">Cloud top pressure: <ext-link xlink:href="https://doi.org/10.7289/V5D50K85" ext-link-type="DOI">10.7289/V5D50K85</ext-link> <xref ref-type="bibr" rid="bib1.bibx16" id="paren.49"/></p></list-item><list-item>
      <p id="d1e2524">Derived stability indices: <ext-link xlink:href="https://doi.org/10.7289/V50Z71KF" ext-link-type="DOI">10.7289/V50Z71KF</ext-link> <xref ref-type="bibr" rid="bib1.bibx17" id="paren.50"/></p></list-item></list></p></list-item><list-item>
      <p id="d1e2533">ASTER GDEM Version 3: <ext-link xlink:href="https://doi.org/10.5067/ASTER/ASTGTM.003" ext-link-type="DOI">10.5067/ASTER/ASTGTM.003</ext-link> <xref ref-type="bibr" rid="bib1.bibx47" id="paren.51"/>.</p></list-item></list>
The ECMWF forecast archive is available only to licensed users and participating national meteorological services; these can obtain the data through the ECMWF Meteorological Archival and Retrieval System (MARS).</p>
  </notes><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d1e2546">UH, UG and JRM conceived the initial concept of the study, which was refined by JL and UH. With the support of UH, JL obtained the data, processed them, trained the ML models and analyzed the results. JL led the writing of the paper, with contributions from all co-authors.</p>
  </notes><?xmltex \hack{\newpage}?><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d1e2553">The contact author has declared that neither they nor their co-authors have any competing interests.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d1e2559">Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.</p>
  </notes><ack><title>Acknowledgements</title><p id="d1e2565">This study builds on work carried out in the COALITION-3 project of MeteoSwiss, to which Joel Zeder and Shruti Nath contributed significantly. We thank David Haliczer and Christopher Tracy from the University of Alabama in Huntsville for their support in defining the study area and working with the US datasets, and Lorenzo Clementi of MeteoSwiss for comments on the initial draft of the article. We also thank Tomeu Rigo, Anna del Moral Méndez and one anonymous reviewer for their constructive feedback on this article.</p></ack><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d1e2570">The work of Jussi Leinonen was supported by the fellowship “Seamless Artificially Intelligent Thunderstorm Nowcasts” from the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT). The hosting institution of this fellowship is MeteoSwiss in Switzerland.</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d1e2576">This paper was edited by Maria-Carmen Llasat and reviewed by Tomeu Rigo, Anna del Moral Méndez and one anonymous referee.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><?xmltex \def\ref@label{{Abrams et~al.(2020)Abrams, Crippen, and Fujisada}}?><label>Abrams et al.(2020)Abrams, Crippen, and Fujisada</label><?label Abrams2020ASTERV3?><mixed-citation>Abrams, M., Crippen, R., and Fujisada, H.: ASTER Global Digital Elevation
Model (GDEM) and ASTER Global Water Body Dataset (ASTWBD), Remote Sens., 12,
1156, <ext-link xlink:href="https://doi.org/10.3390/rs12071156" ext-link-type="DOI">10.3390/rs12071156</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx2"><?xmltex \def\ref@label{{Auton\`{e}s and Claudon(2012)}}?><label>Autonès and Claudon(2012)</label><?label Autones2012RDT?><mixed-citation>Autonès, F. and Claudon, M.: Algorithm Theoretical Basis Document for the
Convection Product Processors of the NWC/GEO, Tech. Rep.
SAF/NWC/CDOP/MFT/SCI/ATBD/11, Meteo-France, Toulouse,
<uri>https://www.nwcsaf.org/Downloads/GEO/2018.1/Documents/Scientific_Docs/NWC-CDOP2-GEO-MFT-SCI-ATBD-Convection_v2.2.pdf</uri>
(last access: 21 February 2022), 2012.</mixed-citation></ref>
      <ref id="bib1.bibx3"><?xmltex \def\ref@label{{Ayzel et~al.(2020)Ayzel, Scheffer, and
Heistermann}}?><label>Ayzel et al.(2020)Ayzel, Scheffer, and
Heistermann</label><?label Ayzel2020Precipitation?><mixed-citation>Ayzel, G., Scheffer, T., and Heistermann, M.: RainNet v1.0: a convolutional
neural network for radar-based precipitation nowcasting, Geosci. Model Dev.,
13, 2631–2644, <ext-link xlink:href="https://doi.org/10.5194/gmd-13-2631-2020" ext-link-type="DOI">10.5194/gmd-13-2631-2020</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx4"><?xmltex \def\ref@label{{Barras et~al.(2019)Barras, Hering, Martynov, Noti, Germann, and
Martius}}?><label>Barras et al.(2019)Barras, Hering, Martynov, Noti, Germann, and
Martius</label><?label Barras2019Hail?><mixed-citation>Barras, H., Hering, A., Martynov, A., Noti, P.-A., Germann, U., and Martius,
O.: Experiences with <inline-formula><mml:math id="M72" display="inline"><mml:mrow><mml:mo>&gt;</mml:mo><mml:mn mathvariant="normal">50</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">000</mml:mn></mml:mrow></mml:math></inline-formula> Crowdsourced Hail Reports in Switzerland, B.
Am. Meteorol. Soc., 100, 1429–1440, <ext-link xlink:href="https://doi.org/10.1175/BAMS-D-18-0090.1" ext-link-type="DOI">10.1175/BAMS-D-18-0090.1</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx5"><?xmltex \def\ref@label{{Bedka et~al.(2018)Bedka, Murillo, Homeyer, Scarino, and
Mersiovsky}}?><label>Bedka et al.(2018)Bedka, Murillo, Homeyer, Scarino, and
Mersiovsky</label><?label Bedka2018AboveAnvil?><mixed-citation>Bedka, K., Murillo, E. M., Homeyer, C. R., Scarino, B., and Mersiovsky, H.: The Above-Anvil Cirrus Plume: An Important Severe Weather Indicator in Visible and Infrared Satellite Imagery, Weather Forecast., 33, 1159–1181,
<ext-link xlink:href="https://doi.org/10.1175/WAF-D-18-0040.1" ext-link-type="DOI">10.1175/WAF-D-18-0040.1</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx6"><?xmltex \def\ref@label{{Bedka and Khlopenkov(2016)}}?><label>Bedka and Khlopenkov(2016)</label><?label Bedka2016Overshooting?><mixed-citation>Bedka, K. M. and Khlopenkov, K.: A Probabilistic Multispectral Pattern
Recognition Method for Detection of Overshooting Cloud Tops Using Passive
Satellite Imager Observations, J. Appl. Meteorol. Clim., 55, 1983–2005,
<ext-link xlink:href="https://doi.org/10.1175/JAMC-D-15-0249.1" ext-link-type="DOI">10.1175/JAMC-D-15-0249.1</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx7"><?xmltex \def\ref@label{{Chai and Draxler(2014)}}?><label>Chai and Draxler(2014)</label><?label Chai2005MAE?><mixed-citation>Chai, T. and Draxler, R. R.: Root mean square error (RMSE) or mean absolute
error (MAE)? – Arguments against avoiding RMSE in the literature, Geosci. Model Dev., 7, 1247–1250, <ext-link xlink:href="https://doi.org/10.5194/gmd-7-1247-2014" ext-link-type="DOI">10.5194/gmd-7-1247-2014</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx8"><?xmltex \def\ref@label{{Changnon(1993)}}?><label>Changnon(1993)</label><?label Changnon1993Thunderstorms?><mixed-citation>Changnon, S. A.: Relationships between Thunderstorms and Cloud-to-Ground
Lightning in the United States, J. Appl. Meteorol., 32, 88–105,
<ext-link xlink:href="https://doi.org/10.1175/1520-0450(1993)032&lt;0088:RBTACT&gt;2.0.CO;2" ext-link-type="DOI">10.1175/1520-0450(1993)032&lt;0088:RBTACT&gt;2.0.CO;2</ext-link>, 1993.</mixed-citation></ref>
      <ref id="bib1.bibx9"><?xmltex \def\ref@label{{Czernecki et~al.(2019)Czernecki, Taszarek, Marosz, Półrolniczak,
Kolendowicz, Wyszogrodzki, and Szturc}}?><label>Czernecki et al.(2019)Czernecki, Taszarek, Marosz, Półrolniczak,
Kolendowicz, Wyszogrodzki, and Szturc</label><?label Czernecki2019Hail?><mixed-citation>Czernecki, B., Taszarek, M., Marosz, M., Półrolniczak, M., Kolendowicz, L.,
Wyszogrodzki, A., and Szturc, J.: Application of machine learning to large
hail prediction – The importance of radar reflectivity, lightning occurrence
and convective parameters derived from ERA5, Atmos. Res., 227, 249–262,
<ext-link xlink:href="https://doi.org/10.1016/j.atmosres.2019.05.010" ext-link-type="DOI">10.1016/j.atmosres.2019.05.010</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx10"><?xmltex \def\ref@label{{Dixon and Wiener(1993)}}?><label>Dixon and Wiener(1993)</label><?label Dixon1993TITAN?><mixed-citation>Dixon, M. and Wiener, G.: TITAN: Thunderstorm Identification, Tracking,
Analysis, and Nowcasting – A Radar-based Methodology, J. Atmos. Ocean. Tech., 10, 785–797, <ext-link xlink:href="https://doi.org/10.1175/1520-0426(1993)010&lt;0785:TTITAA&gt;2.0.CO;2" ext-link-type="DOI">10.1175/1520-0426(1993)010&lt;0785:TTITAA&gt;2.0.CO;2</ext-link>, 1993.</mixed-citation></ref>
      <ref id="bib1.bibx11"><?xmltex \def\ref@label{{Foote et~al.(2005)Foote, Krauss, and Makitov}}?><label>Foote et al.(2005)Foote, Krauss, and Makitov</label><?label Foote2005?><mixed-citation>Foote, G. B., Krauss, T. W., and Makitov, V.: Hail metrics using conventional
radar, in: Proc. 16th Conference on Planned and Inadvertent Weather
Modification, <uri>https://ams.confex.com/ams/pdfpapers/86773.pdf</uri> (last
access: 21 February 2022), 2005.</mixed-citation></ref>
      <ref id="bib1.bibx12"><?xmltex \def\ref@label{{Foresti et~al.(2019)Foresti, Sideris, Nerini, Beusch, and
Germann}}?><label>Foresti et al.(2019)Foresti, Sideris, Nerini, Beusch, and
Germann</label><?label Foresti2019PrecipitationNowcasting?><mixed-citation>Foresti, L., Sideris, I. V., Nerini, D., Beusch, L., and Germann, U.: Using a
10-Year Radar Archive for Nowcasting Precipitation Growth and Decay: A
Probabilistic Machine Learning Approach, Weather Forecast., 34, 1547–1569, <ext-link xlink:href="https://doi.org/10.1175/WAF-D-18-0206.1" ext-link-type="DOI">10.1175/WAF-D-18-0206.1</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx13"><?xmltex \def\ref@label{{Franch et~al.(2020)Franch, Nerini, Pendesini, Coviello, Jurman, and
Furlanello}}?><label>Franch et al.(2020)Franch, Nerini, Pendesini, Coviello, Jurman, and
Furlanello</label><?label Franch2020Nowcasting?><mixed-citation>Franch, G., Nerini, D., Pendesini, M., Coviello, L., Jurman, G., and
Furlanello, C.: Precipitation Nowcasting with Orographic Enhanced Stacked
Generalization: Improving Deep Learning Predictions on Extreme Events,
Atmosphere, 11, 267, <ext-link xlink:href="https://doi.org/10.3390/atmos11030267" ext-link-type="DOI">10.3390/atmos11030267</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx14"><?xmltex \def\ref@label{{GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018a)}}?><label>GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018a)</label><?label GOESABIL2ACHAData?><mixed-citation>GOES-R Algorithm Working Group and GOES-R Series Program Office: NOAA GOES-R
Series Advanced Baseline Imager (ABI) Level 2 Cloud Top Height (ACHA) [data set], <ext-link xlink:href="https://doi.org/10.7289/V5HX19ZQ" ext-link-type="DOI">10.7289/V5HX19ZQ</ext-link>, 2018a.</mixed-citation></ref>
      <ref id="bib1.bibx15"><?xmltex \def\ref@label{{GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018b)}}?><label>GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018b)</label><?label GOESABIL2CODData?><mixed-citation>GOES-R Algorithm Working Group and GOES-R Series Program Office: NOAA GOES-R
Series Advanced Baseline Imager (ABI) Level 2 Cloud Optical Depth (COD) [data set], <ext-link xlink:href="https://doi.org/10.7289/V58G8J02" ext-link-type="DOI">10.7289/V58G8J02</ext-link>, 2018b.</mixed-citation></ref>
      <ref id="bib1.bibx16"><?xmltex \def\ref@label{{GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018c)}}?><label>GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018c)</label><?label GOESABIL2CTPData?><mixed-citation>GOES-R Algorithm Working Group and GOES-R Series Program Office: NOAA GOES-R
Series Advanced Baseline Imager (ABI) Level 2 Cloud Top Pressure (CTP) [data set], <ext-link xlink:href="https://doi.org/10.7289/V5D50K85" ext-link-type="DOI">10.7289/V5D50K85</ext-link>, 2018c.</mixed-citation></ref>
      <ref id="bib1.bibx17"><?xmltex \def\ref@label{{GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018d)}}?><label>GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018d)</label><?label GOESABIL2DSIData?><mixed-citation>GOES-R Algorithm Working Group and GOES-R Series Program Office: NOAA GOES-R
Series Advanced Baseline Imager (ABI) Level 2 Derived Stability Indices [data set], <ext-link xlink:href="https://doi.org/10.7289/V50Z71KF" ext-link-type="DOI">10.7289/V50Z71KF</ext-link>, 2018d.</mixed-citation></ref>
      <ref id="bib1.bibx18"><?xmltex \def\ref@label{{{GOES-R Calibration Working Group and GOES-R Series
Program}(2017)}}?><label>GOES-R Calibration Working Group and GOES-R Series
Program(2017)</label><?label GOESABIL1bData?><mixed-citation>GOES-R Calibration Working Group and GOES-R Series Program: NOAA GOES-R
Series Advanced Baseline Imager (ABI) Level 1b Radiances [data set], <ext-link xlink:href="https://doi.org/10.7289/V5BV7DSR" ext-link-type="DOI">10.7289/V5BV7DSR</ext-link>, 2017.</mixed-citation></ref>
      <?pagebreak page595?><ref id="bib1.bibx19"><?xmltex \def\ref@label{{Greene and Clark(1972)}}?><label>Greene and Clark(1972)</label><?label Greene1972VIL?><mixed-citation>Greene, D. R. and Clark, R. A.: Vertically Integrated Liquid Water – A New
Analysis Tool, Mon. Weather Rev., 100, 548–552,
<ext-link xlink:href="https://doi.org/10.1175/1520-0493(1972)100&lt;0548:VILWNA&gt;2.3.CO;2" ext-link-type="DOI">10.1175/1520-0493(1972)100&lt;0548:VILWNA&gt;2.3.CO;2</ext-link>, 1972.</mixed-citation></ref>
      <ref id="bib1.bibx20"><?xmltex \def\ref@label{{Handwerker(2002)}}?><label>Handwerker(2002)</label><?label Handwerker2002TRACE3D?><mixed-citation>Handwerker, J.: Cell tracking with TRACE3D – a new algorithm, Atmos. Res.,
61, 15–34, <ext-link xlink:href="https://doi.org/10.1016/S0169-8095(01)00100-4" ext-link-type="DOI">10.1016/S0169-8095(01)00100-4</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bibx21"><?xmltex \def\ref@label{{Heidinger et~al.(2020)Heidinger, Pavolonis, Calvert, Hoffman, Nebuda, Straka, Walther, and Wanzong}}?><label>Heidinger et al.(2020)Heidinger, Pavolonis, Calvert, Hoffman, Nebuda, Straka, Walther, and Wanzong</label><?label Heidinger2020GOESRCloud?><mixed-citation>Heidinger, A. K., Pavolonis, M. J., Calvert, C., Hoffman, J., Nebuda, S.,
Straka, W., Walther, A., and Wanzong, S.: ABI Cloud Products from the GOES-R Series, in: The GOES-R Series: A New Generation of Geostationary Environmental Satellites, chap. 6, edited by: Goodman, S. J., Schmit, T. J., Daniels, J., and Redmon, R. J., Elsevier, 43–62,
<ext-link xlink:href="https://doi.org/10.1016/B978-0-12-814327-8.00006-8" ext-link-type="DOI">10.1016/B978-0-12-814327-8.00006-8</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx22"><?xmltex \def\ref@label{{Heiss et~al.(1990)Heiss, McGrew, and Sirmans}}?><label>Heiss et al.(1990)Heiss, McGrew, and Sirmans</label><?label Heiss1990NEXRAD?><mixed-citation>
Heiss, W. H., McGrew, D. L., and Sirmans, D.: Nexrad: next generation weather
radar (WSR-88D), Microwave J., 33, 79+, 1990.</mixed-citation></ref>
      <ref id="bib1.bibx23"><?xmltex \def\ref@label{{Helmus and Collis(2016)}}?><label>Helmus and Collis(2016)</label><?label Helmus2016PyART?><mixed-citation>Helmus, J. J. and Collis, S. M.: The Python ARM Radar Toolkit (Py-ART), a
library for working with weather radar data in the Python programming language, J. Open Res. Software, 4, e25, <ext-link xlink:href="https://doi.org/10.5334/jors.119" ext-link-type="DOI">10.5334/jors.119</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx24"><?xmltex \def\ref@label{{Hering et~al.(2004)Hering, Morel, Galli, Sénési, Ambrosetti, and
Boscacci}}?><label>Hering et al.(2004)Hering, Morel, Galli, Sénési, Ambrosetti, and
Boscacci</label><?label Hering2004TRT?><mixed-citation>Hering, A., Morel, C., Galli, G., Sénési, S., Ambrosetti, P., and Boscacci, M.: Nowcasting thunderstorms in the Alpine region using a radar based adaptive thresholding scheme, in: Proceedings of ERAD 2004, <uri>https://www.copernicus.org/erad/2004/online/ERAD04_P_206.pdf</uri> (last access: 21 February 2022), 2004.</mixed-citation></ref>
      <ref id="bib1.bibx25"><?xmltex \def\ref@label{{Hering et~al.(2005)Hering, Sénési, Ambrosetti, and
Bernard-Bouissières}}?><label>Hering et al.(2005)Hering, Sénési, Ambrosetti, and
Bernard-Bouissières</label><?label Hering2005TRT?><mixed-citation>Hering, A., Sénési, S., Ambrosetti, P., and Bernard-Bouissières, I.: Nowcasting thunderstorms in complex cases using radar data, in: WMO Symposium on Nowcasting and Very Short Range Forecasting, <uri>https://www.researchgate.net/publication/228609271_Nowcasting_thunderstorms_in_complex_cases_using_radar_data</uri>
(last access: 21 February 2022), 2005.</mixed-citation></ref>
      <ref id="bib1.bibx26"><?xmltex \def\ref@label{{Hering et~al.(2006)Hering, Germann, Boscacci, and
Sénési}}?><label>Hering et al.(2006)Hering, Germann, Boscacci, and
Sénési</label><?label Hering2006TRT?><mixed-citation>Hering, A., Germann, U., Boscacci, M., and Sénési, S.: Operational
thunderstorm nowcasting in the Alpine region using 3D-radar severe weather
parameters and lightning data, in: Proceedings of ERAD 2006, <uri>http://www.crahi.upc.edu/ERAD2006/proceedingsMask/00122.pdf</uri> (last
access: 21 February 2022), 2006.</mixed-citation></ref>
      <ref id="bib1.bibx27"><?xmltex \def\ref@label{{Hoffmann(2008)}}?><label>Hoffmann(2008)</label><?label Hoffman2008CellMOS?><mixed-citation>Hoffmann, J.: Entwicklung und Anwendung von statistischen Vorhersage – Interpretationsverfahren für Gewitternowcasting und Unwetterwarnungen unter Einbeziehung von Fernerkundungsdaten, PhD thesis, Freie Universität Berlin, Berlin, <ext-link xlink:href="https://doi.org/10.17169/refubium-15903" ext-link-type="DOI">10.17169/refubium-15903</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx28"><?xmltex \def\ref@label{{Huang et~al.(2019)Huang, Jiang, Liu, Pan, Li, Guo, Huang, and
Duan}}?><label>Huang et al.(2019)Huang, Jiang, Liu, Pan, Li, Guo, Huang, and
Duan</label><?label Huang2019Hail?><mixed-citation>Huang, W., Jiang, Y., Liu, X., Pan, Y., Li, X., Guo, R., Huang, Y., and Duan,
B.: Classified Early-warning and Nowcasting of Hail Weather Based on Radar
Products and Random Forest Algorithm, in: 2019 International Conference on
Meteorology Observations (ICMO), <ext-link xlink:href="https://doi.org/10.1109/ICMO49322.2019.9026039" ext-link-type="DOI">10.1109/ICMO49322.2019.9026039</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx29"><?xmltex \def\ref@label{{James et~al.(2018)James, Reichert, and
Heizenreder}}?><label>James et al.(2018)James, Reichert, and
Heizenreder</label><?label James2018NowCastMIX?><mixed-citation>James, P. M., Reichert, B. K., and Heizenreder, D.: NowCastMIX: Automatic
Integrated Warnings for Severe Convection on Nowcasting Time Scales at the
German Weather Service, Weather Forecast., 33, 1413–1433,
<ext-link xlink:href="https://doi.org/10.1175/WAF-D-18-0038.1" ext-link-type="DOI">10.1175/WAF-D-18-0038.1</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx30"><?xmltex \def\ref@label{{Ke et~al.(2017)Ke, Meng, Finley, Wang, Chen, Ma, Ye, and
Liu}}?><label>Ke et al.(2017)Ke, Meng, Finley, Wang, Chen, Ma, Ye, and
Liu</label><?label Ke2017LightGBM?><mixed-citation>Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., and Liu,
T.-Y.: LightGBM: a highly efficient gradient boosting decision tree, in:
Proceedings of the 31st International Conference on Neural Information
Processing Systems, 3149–3157, <uri>https://dl.acm.org/doi/abs/10.5555/3294996.3295074</uri> (last access: 21 February 2022), 2017.</mixed-citation></ref>
      <ref id="bib1.bibx31"><?xmltex \def\ref@label{{Kelly et~al.(1985)Kelly, Schaefer, and
Doswell}}?><label>Kelly et al.(1985)Kelly, Schaefer, and
Doswell</label><?label Kelly1985Thunderstorms?><mixed-citation>Kelly, D. L., Schaefer, J. T., and Doswell, C. A.: Climatology of Nontornadic
Severe Thunderstorm Events in the United States, Mon. Weather Rev., 113,
1997–2014, <ext-link xlink:href="https://doi.org/10.1175/1520-0493(1985)113&lt;1997:CONSTE&gt;2.0.CO;2" ext-link-type="DOI">10.1175/1520-0493(1985)113&lt;1997:CONSTE&gt;2.0.CO;2</ext-link>, 1985.</mixed-citation></ref>
      <ref id="bib1.bibx32"><?xmltex \def\ref@label{{Kober and Tafferner(2009)}}?><label>Kober and Tafferner(2009)</label><?label Kober2009CbTRAM?><mixed-citation>Kober, K. and Tafferner, A.: Tracking and Nowcasting of Convective Cells Using Remote Sensing Data from Radar and Satellite, Meteorol. Z., 18, 75–84, <ext-link xlink:href="https://doi.org/10.1127/0941-2948/2009/359" ext-link-type="DOI">10.1127/0941-2948/2009/359</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx33"><?xmltex \def\ref@label{{Kober et~al.(2012)Kober, Craig, Keil, and
Dörnbrack}}?><label>Kober et al.(2012)Kober, Craig, Keil, and
Dörnbrack</label><?label Kober2012Blending?><mixed-citation>Kober, K., Craig, G. C., Keil, C., and Dörnbrack, A.: Blending a probabilistic nowcasting method with a high-resolution numerical weather prediction ensemble for convective precipitation forecasts, Q. J. Roy. Meteorol. Soc., 138, 755–768, <ext-link xlink:href="https://doi.org/10.1002/qj.939" ext-link-type="DOI">10.1002/qj.939</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx34"><?xmltex \def\ref@label{{Kumar et~al.(2020)Kumar, Islam, Sekimoto, Mattmann, and
Wilson}}?><label>Kumar et al.(2020)Kumar, Islam, Sekimoto, Mattmann, and
Wilson</label><?label Kumar2020Convcast?><mixed-citation>Kumar, A., Islam, T., Sekimoto, Y., Mattmann, C., and Wilson, B.: Convcast: An embedded convolutional LSTM based architecture for precipitation nowcasting using satellite data, PLOS One, 15, 1–18,
<ext-link xlink:href="https://doi.org/10.1371/journal.pone.0230114" ext-link-type="DOI">10.1371/journal.pone.0230114</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx35"><?xmltex \def\ref@label{{Lagerquist et~al.(2017)Lagerquist, McGovern, and
Smith}}?><label>Lagerquist et al.(2017)Lagerquist, McGovern, and
Smith</label><?label Lagerquist2017Wind?><mixed-citation>Lagerquist, R., McGovern, A., and Smith, T.: Machine Learning for Real-Time
Prediction of Damaging Straight-Line Convective Wind, Weather Forecast., 32, 2175–2193, <ext-link xlink:href="https://doi.org/10.1175/WAF-D-17-0038.1" ext-link-type="DOI">10.1175/WAF-D-17-0038.1</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx36"><?xmltex \def\ref@label{{Lagerquist et~al.(2020)Lagerquist, McGovern, Homeyer, Gagne, and
Smith}}?><label>Lagerquist et al.(2020)Lagerquist, McGovern, Homeyer, Gagne, and
Smith</label><?label Lagerquist2020Tornado?><mixed-citation>Lagerquist, R., McGovern, A., Homeyer, C. R., Gagne II, D. J., and Smith, T.:
Deep Learning on Three-Dimensional Multiscale Data for Next-Hour Tornado
Prediction, Mon. Weather Rev., 148, 2837–2861, <ext-link xlink:href="https://doi.org/10.1175/MWR-D-19-0372.1" ext-link-type="DOI">10.1175/MWR-D-19-0372.1</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx37"><?xmltex \def\ref@label{{Leinonen et~al.(2021)Leinonen, Hamann, Germann, and
Mecikalski}}?><label>Leinonen et al.(2021)Leinonen, Hamann, Germann, and
Mecikalski</label><?label Leinonen2021TSNowcastData?><mixed-citation>Leinonen, J., Hamann, U., Germann, U., and Mecikalski, J. R.: Machine learning code and dataset for “Nowcasting thunderstorm hazards using machine learning: the impact of data sources on performance”, Zenodo [code and data set], <ext-link xlink:href="https://doi.org/10.5281/zenodo.6206919" ext-link-type="DOI">10.5281/zenodo.6206919</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx38"><?xmltex \def\ref@label{{Li et~al.(2020)Li, Li, and Schmit}}?><label>Li et al.(2020)Li, Li, and Schmit</label><?label Li2020GOESRLegacy?><mixed-citation>Li, J., Li, Z., and Schmit, T. J.: ABI Legacy Atmospheric Profiles and
Derived Products from the GOES-R Series, in: The GOES-R Series: A New
Generation of Geostationary Environmental Satellites, chap. 7, edited by: Goodman, S. J., Schmit, T. J., Daniels, J., and Redmon, R. J., Elsevier, 63–77, <ext-link xlink:href="https://doi.org/10.1016/B978-0-12-814327-8.00007-X" ext-link-type="DOI">10.1016/B978-0-12-814327-8.00007-X</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx39"><?xmltex \def\ref@label{{Marshall and Radhakant(1978)}}?><label>Marshall and Radhakant(1978)</label><?label Marshall1978LightningIndicators?><mixed-citation>Marshall, J. S. and Radhakant, S.: Radar Precipitation Maps as Lightning
Indicators, J. Appl. Meteorol. Clim., 17, 206–212,
<ext-link xlink:href="https://doi.org/10.1175/1520-0450(1978)017&lt;0206:RPMALI&gt;2.0.CO;2" ext-link-type="DOI">10.1175/1520-0450(1978)017&lt;0206:RPMALI&gt;2.0.CO;2</ext-link>, 1978.</mixed-citation></ref>
      <ref id="bib1.bibx40"><?xmltex \def\ref@label{{Martner et~al.(2008)Martner, Yuter, White, Matrosov, Kingsmill, and
Ralph}}?><label>Martner et al.(2008)Martner, Yuter, White, Matrosov, Kingsmill, and
Ralph</label><?label Matrner2008DSD?><mixed-citation>Martner, B. E., Yuter, S. E., White, A. B., Matrosov, S. Y., Kingsmill, D. E., and Ralph, F. M.: Raindrop Size Distributions and Rain Characteristics in
California Coastal Rainfall for Periods with and without a Radar Bright Band,
J. Hydrometeorol., 9, 408–425, <ext-link xlink:href="https://doi.org/10.1175/2007JHM924.1" ext-link-type="DOI">10.1175/2007JHM924.1</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx41"><?xmltex \def\ref@label{{Mecikalski and Bedka(2006)}}?><label>Mecikalski and Bedka(2006)</label><?label Mecikalski2006SATCAST?><mixed-citation>Mecikalski, J. R. and Bedka, K. M.: Forecasting Convective Initiation by
Monitoring the Evolution of Moving Cumulus in Daytime GOES Imager, Mon. Weather Rev., 134, 49–78, <ext-link xlink:href="https://doi.org/10.1175/MWR3062.1" ext-link-type="DOI">10.1175/MWR3062.1</ext-link>, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx42"><?xmltex \def\ref@label{{Mecikalski et~al.(2010)Mecikalski, MacKenzie, Koenig, and
Muller}}?><label>Mecikalski et al.(2010)Mecikalski, MacKenzie, Koenig, and
Muller</label><?label Mecikalski2010CloudTop?><mixed-citation>Mecikalski, J. R., MacKenzie, W. M., Koenig, M., and Muller, S.: Cloud-Top
Properties of Growing Cumulus prior to Convective Initiation as Measured by
Meteosat Second Generation. Part I: Infrared Fields, J. Appl. Meteorol. Clim., 49, 521–534, <ext-link xlink:href="https://doi.org/10.1175/2009JAMC2344.1" ext-link-type="DOI">10.1175/2009JAMC2344.1</ext-link>, 2010.</mixed-citation></ref>
      <?pagebreak page596?><ref id="bib1.bibx43"><?xmltex \def\ref@label{{Mecikalski et~al.(2015)Mecikalski, Williams, Jewett, Ahijevych,
LeRoy, and Walker}}?><label>Mecikalski et al.(2015)Mecikalski, Williams, Jewett, Ahijevych,
LeRoy, and Walker</label><?label MecikalskiGOESRCI2015?><mixed-citation>Mecikalski, J. R., Williams, J. K., Jewett, C. P., Ahijevych, D., LeRoy, A.,
and Walker, J. R.: Probabilistic 0–1-h Convective Initiation Nowcasts that
Combine Geostationary Satellite Observations and Numerical Weather Prediction
Model Data, J. Appl. Meteorol. Clim., 54, 1039–1059,
<ext-link xlink:href="https://doi.org/10.1175/JAMC-D-14-0129.1" ext-link-type="DOI">10.1175/JAMC-D-14-0129.1</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx44"><?xmltex \def\ref@label{{Mecikalski et~al.(2021)Mecikalski, Sandmæl, Murillo, Homeyer, Bedka, Apke, and Jewett}}?><label>Mecikalski et al.(2021)Mecikalski, Sandmæl, Murillo, Homeyer, Bedka, Apke, and Jewett</label><?label Mecikalski2021Importance?><mixed-citation>Mecikalski, J. R., Sandmæl, T. N., Murillo, E. M., Homeyer, C. R., Bedka, K. M., Apke, J. M., and Jewett, C. P.: Random Forest Model to Assess Predictor Importance and Nowcast Severe Storms using High-Resolution Radar–GOES Satellite–Lightning Observations, Mon. Weather Rev., 149, 1725–1746,
<ext-link xlink:href="https://doi.org/10.1175/MWR-D-19-0274.1" ext-link-type="DOI">10.1175/MWR-D-19-0274.1</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx45"><?xmltex \def\ref@label{{Mostajabi et~al.(2019)Mostajabi, Finney, Rubinstein, and
Rachidi}}?><label>Mostajabi et al.(2019)Mostajabi, Finney, Rubinstein, and
Rachidi</label><?label Mostajabi2020Lightning?><mixed-citation>Mostajabi, A., Finney, D. L., Rubinstein, M., and Rachidi, F.: Nowcasting
lightning occurrence from commonly available meteorological parameters using
machine learning techniques, npj Clim. Atmos. Sci., 2, 41,
<ext-link xlink:href="https://doi.org/10.1038/s41612-019-0098-0" ext-link-type="DOI">10.1038/s41612-019-0098-0</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx46"><?xmltex \def\ref@label{{Mueller et~al.(2003)Mueller, Saxen, Roberts, Wilson, Betancourt,
Dettling, Oien, and Yee}}?><label>Mueller et al.(2003)Mueller, Saxen, Roberts, Wilson, Betancourt,
Dettling, Oien, and Yee</label><?label Mueller2003ANC?><mixed-citation>Mueller, C., Saxen, T., Roberts, R., Wilson, J., Betancourt, T., Dettling, S., Oien, N., and Yee, J.: NCAR Auto-Nowcast System, Weather Forecast., 18, 545–561, <ext-link xlink:href="https://doi.org/10.1175/1520-0434(2003)018&lt;0545:NAS&gt;2.0.CO;2" ext-link-type="DOI">10.1175/1520-0434(2003)018&lt;0545:NAS&gt;2.0.CO;2</ext-link>, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx47"><?xmltex \def\ref@label{{NASA/METI/AIST/Japan Spacesystems and US/Japan ASTER Science
Team(2019)}}?><label>NASA/METI/AIST/Japan Spacesystems and US/Japan ASTER Science
Team(2019)</label><?label ASTERGDEMV3Data?><mixed-citation>NASA/METI/AIST/Japan Spacesystems and US/Japan ASTER Science Team: ASTER
Global Digital Elevation Model V003 [data set], <ext-link xlink:href="https://doi.org/10.5067/ASTER/ASTGTM.003" ext-link-type="DOI">10.5067/ASTER/ASTGTM.003</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx48"><?xmltex \def\ref@label{{Natekin and Knoll(2013)}}?><label>Natekin and Knoll(2013)</label><?label Natekin2013GBM?><mixed-citation>Natekin, A. and Knoll, A.: Gradient boosting machines, a tutorial, Front.
Neurorobot., 7, 21, <ext-link xlink:href="https://doi.org/10.3389/fnbot.2013.00021" ext-link-type="DOI">10.3389/fnbot.2013.00021</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx49"><?xmltex \def\ref@label{{NOAA National Weather Service~(NWS) Radar Operations
Center(1991)}}?><label>NOAA National Weather Service (NWS) Radar Operations
Center(1991)</label><?label NEXRADData?><mixed-citation>NOAA National Weather Service (NWS) Radar Operations Center: Next Generation Radar (NEXRAD) Level 2 Base Data [data set], <ext-link xlink:href="https://doi.org/10.7289/V5W9574V" ext-link-type="DOI">10.7289/V5W9574V</ext-link>, 1991.</mixed-citation></ref>
      <ref id="bib1.bibx50"><?xmltex \def\ref@label{{Pulkkinen et~al.(2019)Pulkkinen, Nerini, Pérez~Hortal,
Velasco-Forero, Seed, Germann, and Foresti}}?><label>Pulkkinen et al.(2019)Pulkkinen, Nerini, Pérez Hortal,
Velasco-Forero, Seed, Germann, and Foresti</label><?label Pulkkinen2019Pysteps?><mixed-citation>Pulkkinen, S., Nerini, D., Pérez Hortal, A. A., Velasco-Forero, C., Seed, A.,
Germann, U., and Foresti, L.: Pysteps: an open-source Python library for
probabilistic precipitation nowcasting (v1.0), Geosci. Model Dev., 12,
4185–4219, <ext-link xlink:href="https://doi.org/10.5194/gmd-12-4185-2019" ext-link-type="DOI">10.5194/gmd-12-4185-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx51"><?xmltex \def\ref@label{{Raspaud et~al.(2018)Raspaud, Hoese, Dybbroe, Lahtinen, Devasthale,
Itkin, Hamann, Rasmussen, Nielsen, Leppelt, Maul, Kliche, and
Thorsteinsson}}?><label>Raspaud et al.(2018)Raspaud, Hoese, Dybbroe, Lahtinen, Devasthale,
Itkin, Hamann, Rasmussen, Nielsen, Leppelt, Maul, Kliche, and
Thorsteinsson</label><?label Raspaud2018PyTroll?><mixed-citation>Raspaud, M., Hoese, D., Dybbroe, A., Lahtinen, P., Devasthale, A., Itkin, M.,
Hamann, U., Rasmussen, L. O., Nielsen, E. S., Leppelt, T., Maul, A., Kliche,
C., and Thorsteinsson, H.: PyTroll: An Open-Source, Community-Driven Python Framework to Process Earth Observation Satellite Data, B. Am. Meteorol. Soc., 99, 1329–1336, <ext-link xlink:href="https://doi.org/10.1175/BAMS-D-17-0277.1" ext-link-type="DOI">10.1175/BAMS-D-17-0277.1</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx52"><?xmltex \def\ref@label{{Roberts and Rutledge(2003)}}?><label>Roberts and Rutledge(2003)</label><?label Roberts2003StormInitiation?><mixed-citation>Roberts, R. D. and Rutledge, S.: Nowcasting Storm Initiation and Growth Using
GOES-8 and WSR-88D Data, Weather Forecast., 18, 562–584,
<ext-link xlink:href="https://doi.org/10.1175/1520-0434(2003)018&lt;0562:NSIAGU&gt;2.0.CO;2" ext-link-type="DOI">10.1175/1520-0434(2003)018&lt;0562:NSIAGU&gt;2.0.CO;2</ext-link>, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx53"><?xmltex \def\ref@label{{Rudlosky et~al.(2020)Rudlosky, Goodman, and
Virts}}?><label>Rudlosky et al.(2020)Rudlosky, Goodman, and
Virts</label><?label Rudlosky2020GOESRGLM?><mixed-citation>Rudlosky, S. D., Goodman, S. J., and Virts, K. S.: Lightning Detection: GOES-R Series Geostationary Lightning Mapper, in: The GOES-R Series: A New
Generation of Geostationary Environmental Satellites, chap. 16, edited by: Goodman, S. J., Schmit, T. J., Daniels, J., and Redmon, R. J., Elsevier, 193–202, <ext-link xlink:href="https://doi.org/10.1016/B978-0-12-814327-8.00016-0" ext-link-type="DOI">10.1016/B978-0-12-814327-8.00016-0</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx54"><?xmltex \def\ref@label{{Schmit and Gunshor(2020)}}?><label>Schmit and Gunshor(2020)</label><?label Schmit2020GOESRABI?><mixed-citation>Schmit, T. J. and Gunshor, M. M.: ABI Imagery from the GOES-R Series, in:
The GOES-R Series: A New Generation of Geostationary Environmental Satellites, chap. 4, edited by: Goodman, S. J., Schmit, T. J., Daniels, J., and Redmon, R. J., Elsevier, 23–34, <ext-link xlink:href="https://doi.org/10.1016/B978-0-12-814327-8.00004-4" ext-link-type="DOI">10.1016/B978-0-12-814327-8.00004-4</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx55"><?xmltex \def\ref@label{{Shi et~al.(2015)Shi, Chen, Wang, Yeung, Wong, and
Woo}}?><label>Shi et al.(2015)Shi, Chen, Wang, Yeung, Wong, and
Woo</label><?label Shi2015ConvLSTMPrecip?><mixed-citation>Shi, X., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W.-K., and Woo, W.-C.:
Convolutional LSTM Network: A Machine Learning Approach for Precipitation
Nowcasting, in: Advances in Neural Information Processing Systems 28, edited
by: Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., and Garnett, R.,
Curran Associates, Inc., 802–810,
<ext-link xlink:href="http://papers.nips.cc/paper/5955-convolutional-lstm-network-a-machine-learning-approach-for-precipitation-nowcasting.pdf">http://papers.nips.cc/paper/5955-convolutional-lstm-network-a-machine-learning-approach-for-precipitation-nowcasting.pdf</ext-link> (last access: 21 February 2022), 2015.</mixed-citation></ref>
      <ref id="bib1.bibx56"><?xmltex \def\ref@label{{Shi et~al.(2017)Shi, Gao, Lausen, Wang, Yeung, Wong, and
Woo}}?><label>Shi et al.(2017)Shi, Gao, Lausen, Wang, Yeung, Wong, and
Woo</label><?label Shi2017ConvLSTMPrecip?><mixed-citation>Shi, X., Gao, Z., Lausen, L., Wang, H., Yeung, D.-Y., Wong, W.-K., and Woo,
W.-C.: Deep learning for precipitation nowcasting: a benchmark and a new
model, in: Proceedings of the 31st International Conference on Neural Information Processing Systems, 5622–5632, <uri>https://dl.acm.org/doi/abs/10.5555/3295222.3295313</uri> (last access: 21 February 2022), 2017.</mixed-citation></ref>
      <ref id="bib1.bibx57"><?xmltex \def\ref@label{{Smith et~al.(2016)Smith, Lakshmanan, Stumpf, Ortega, Hondl, Cooper,
Calhoun, Kingfield, Manross, Toomey, and Brogden}}?><label>Smith et al.(2016)Smith, Lakshmanan, Stumpf, Ortega, Hondl, Cooper,
Calhoun, Kingfield, Manross, Toomey, and Brogden</label><?label Smith2016MRMSSevere?><mixed-citation>Smith, T. M., Lakshmanan, V., Stumpf, G. J., Ortega, K. L., Hondl, K., Cooper, K., Calhoun, K. M., Kingfield, D. M., Manross, K. L., Toomey, R., and
Brogden, J.: Multi-Radar Multi-Sensor (MRMS) Severe Weather and Aviation
Products: Initial Operating Capabilities, B. Am. Meteorol. Soc., 97, 1617–1630, <ext-link xlink:href="https://doi.org/10.1175/BAMS-D-14-00173.1" ext-link-type="DOI">10.1175/BAMS-D-14-00173.1</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx58"><?xmltex \def\ref@label{{Snyder(1987)}}?><label>Snyder(1987)</label><?label Snyder1987Projections?><mixed-citation>Snyder, J. P.: Map Projections – A Working Manual, United States Government
Printing Office, Washington, DC, USA, <ext-link xlink:href="https://doi.org/10.3133/pp1395" ext-link-type="DOI">10.3133/pp1395</ext-link>, 1987.</mixed-citation></ref>
      <ref id="bib1.bibx59"><?xmltex \def\ref@label{{Sprenger et~al.(2017)Sprenger, Schemm, Oechslin, and
Jenkner}}?><label>Sprenger et al.(2017)Sprenger, Schemm, Oechslin, and
Jenkner</label><?label Sprenger2017Foehn?><mixed-citation>Sprenger, M., Schemm, S., Oechslin, R., and Jenkner, J.: Nowcasting Foehn Wind Events Using the AdaBoost Machine Learning Algorithm, Weather Forecast., 32, 1079–1099, <ext-link xlink:href="https://doi.org/10.1175/WAF-D-16-0208.1" ext-link-type="DOI">10.1175/WAF-D-16-0208.1</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx60"><?xmltex \def\ref@label{{Steinacker et~al.(2000)Steinacker, Dorninger, W\"{o}lfelmaier, and
Krennert}}?><label>Steinacker et al.(2000)Steinacker, Dorninger, Wölfelmaier, and
Krennert</label><?label Steinacker2000Tracking?><mixed-citation>Steinacker, R., Dorninger, M., Wölfelmaier, F., and Krennert, T.: Automatic Tracking of Convective Cells and Cell Complexes from Lightning and Radar Data, Meteorol. Atmos. Phys., 72, 101–110, <ext-link xlink:href="https://doi.org/10.1007/s007030050009" ext-link-type="DOI">10.1007/s007030050009</ext-link>, 2000.</mixed-citation></ref>
      <ref id="bib1.bibx61"><?xmltex \def\ref@label{{S\"{u}li and Mayers(2003)}}?><label>Süli and Mayers(2003)</label><?label Suli2003NumericalAnalysis?><mixed-citation>Süli, E. and Mayers, D. F.: An Introduction to Numerical Analysis, Cambridge University Press, Cambridge, UK, <ext-link xlink:href="https://doi.org/10.1017/CBO9780511801181" ext-link-type="DOI">10.1017/CBO9780511801181</ext-link>, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx62"><?xmltex \def\ref@label{{Sullivan(2020)}}?><label>Sullivan(2020)</label><?label Sullivan2020GOESR?><mixed-citation>Sullivan, P. C.: GOES-R Series Spacecraft and Instruments, in: The GOES-R
Series: A New Generation of Geostationary Environmental Satellites, chap. 3, edited by: Goodman, S. J., Schmit, T. J., Daniels, J., and Redmon, R. J.,
Elsevier, 13–21, <ext-link xlink:href="https://doi.org/10.1016/B978-0-12-814327-8.00003-2" ext-link-type="DOI">10.1016/B978-0-12-814327-8.00003-2</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx63"><?xmltex \def\ref@label{{Waldvogel et~al.(1979)Waldvogel, Federer, and
Grimm}}?><label>Waldvogel et al.(1979)Waldvogel, Federer, and
Grimm</label><?label Waldvogel1979POH?><mixed-citation>Waldvogel, A., Federer, B., and Grimm, P.: Criteria for the Detection of Hail
Cells, J. Appl. Meteorol., 18, 1521–1525,
<ext-link xlink:href="https://doi.org/10.1175/1520-0450(1979)018&lt;1521:CFTDOH&gt;2.0.CO;2" ext-link-type="DOI">10.1175/1520-0450(1979)018&lt;1521:CFTDOH&gt;2.0.CO;2</ext-link>, 1979.</mixed-citation></ref>
      <ref id="bib1.bibx64"><?xmltex \def\ref@label{{Willmott and Matsuura(2005)}}?><label>Willmott and Matsuura(2005)</label><?label Willmott2005MAE?><mixed-citation>Willmott, C. J. and Matsuura, K.: Advantages of the mean absolute error (MAE)
over the root mean square error (RMSE) in assessing average model performance, Clim. Res., 30, 79–82, <ext-link xlink:href="https://doi.org/10.3354/cr030079" ext-link-type="DOI">10.3354/cr030079</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx65"><?xmltex \def\ref@label{{Wilson and Mueller(1993)}}?><label>Wilson and Mueller(1993)</label><?label Wilson1993ThunderstormInitiation?><mixed-citation>Wilson, J. W. and Mueller, C. K.: Nowcasts of Thunderstorm Initiation and
Evolution, Weather Forecast., 8, 113–131, <ext-link xlink:href="https://doi.org/10.1175/1520-0434(1993)008&lt;0113:NOTIAE&gt;2.0.CO;2" ext-link-type="DOI">10.1175/1520-0434(1993)008&lt;0113:NOTIAE&gt;2.0.CO;2</ext-link>, 1993.</mixed-citation></ref>
      <ref id="bib1.bibx66"><?xmltex \def\ref@label{{Wilson et~al.(1998)Wilson, Crook, Mueller, Sun, and
Dixon}}?><label>Wilson et al.(1998)Wilson, Crook, Mueller, Sun, and
Dixon</label><?label Wilson1998NowcastingReview?><mixed-citation>Wilson, J. W., Crook, N. A., Mueller, C. K., Sun, J., and Dixon, M.: Nowcasting Thunderstorms: A Status Report, B. Am. Meteorol. Soc., 79, 2079–2100, <ext-link xlink:href="https://doi.org/10.1175/1520-0477(1998)079&lt;2079:NTASR&gt;2.0.CO;2" ext-link-type="DOI">10.1175/1520-0477(1998)079&lt;2079:NTASR&gt;2.0.CO;2</ext-link>, 1998.</mixed-citation></ref>
      <ref id="bib1.bibx67"><?xmltex \def\ref@label{{Yeung et~al.(2015)Yeung, Smith, Baeck, and
Villarini}}?><label>Yeung et al.(2015)Yeung, Smith, Baeck, and
Villarini</label><?label Yeung2015Thunderstorms?><mixed-citation>Yeung, J. K., Smith, J. A., Baeck, M. L., and Villarini, G.: Lagrangian
Analyses of Rainfall Structure and Evolution for Organized Thunderstorm
Systems in the Urban Corridor of the Northeastern United States, J. Hydrometeorol., 16, 1575–1595, <ext-link xlink:href="https://doi.org/10.1175/JHM-D-14-0095.1" ext-link-type="DOI">10.1175/JHM-D-14-0095.1</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx68"><?xmltex \def\ref@label{{Zhang et~al.(2016)Zhang, Howard, Langston, Kaney, Qi, Tang, Grams,
Wang, Cocks, Martinaitis, Arthur, Cooper, Brogden, and
Kitzmiller}}?><label>Zhang et al.(2016)Zhang, Howard, Langston, Kaney, Qi, Tang, Grams,
Wang, Cocks, Martinaitis, Arthur, Cooper, Brogden, and
Kitzmiller</label><?label Zhang2016MRMSQPE?><mixed-citation>Zhang, J., Howard, K., Langston, C., Kaney, B., Qi, Y., Tang, L., Grams, H.,
Wang, Y<?pagebreak page597?>., Cocks, S., Martinaitis, S., Arthur, A., Cooper, K., Brogden, J.,
and Kitzmiller, D.: Multi-Radar Multi-Sensor (MRMS) Quantitative
Precipitation Estimation: Initial Operating Capabilities, B. Am. Meteorol. Soc., 97, 621–638, <ext-link xlink:href="https://doi.org/10.1175/BAMS-D-14-00174.1" ext-link-type="DOI">10.1175/BAMS-D-14-00174.1</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx69"><?xmltex \def\ref@label{{Zhou et~al.(2020)Zhou, Zheng, Dong, and Wang}}?><label>Zhou et al.(2020)Zhou, Zheng, Dong, and Wang</label><?label Zhou2020Lightning?><mixed-citation>Zhou, K., Zheng, Y., Dong, W., and Wang, T.: A Deep Learning Network for
Cloud-to-Ground Lightning Nowcasting with Multisource Data, J. Atmos. Ocean.
Tech., 37, 927–942, <ext-link xlink:href="https://doi.org/10.1175/JTECH-D-19-0146.1" ext-link-type="DOI">10.1175/JTECH-D-19-0146.1</ext-link>, 2020.
</mixed-citation></ref><?xmltex \hack{\newpage}?>
      <ref id="bib1.bibx70"><?xmltex \def\ref@label{{Zinner et~al.(2008)Zinner, Mannstein, and
Tafferner}}?><label>Zinner et al.(2008)Zinner, Mannstein, and
Tafferner</label><?label Zinner2008CbTRAM?><mixed-citation>Zinner, T., Mannstein, H., and Tafferner, A.: Cb-TRAM: Tracking and monitoring severe convection from onset over rapid development to mature phase using multi-channel Meteosat-8 SEVIRI data, Meteorol. Atmos. Phys., 101, 191–210, <ext-link xlink:href="https://doi.org/10.1007/s00703-008-0290-y" ext-link-type="DOI">10.1007/s00703-008-0290-y</ext-link>, 2008.</mixed-citation></ref>

  </ref-list></back>
    <!--<article-title-html>Nowcasting thunderstorm hazards using machine learning:  the impact of data sources on performance</article-title-html>
<abstract-html/>
<ref-html id="bib1.bib1"><label>Abrams et al.(2020)Abrams, Crippen, and Fujisada</label><mixed-citation>
Abrams, M., Crippen, R., and Fujisada, H.: ASTER Global Digital Elevation
Model (GDEM) and ASTER Global Water Body Dataset (ASTWBD), Remote Sens., 12,
1156, <a href="https://doi.org/10.3390/rs12071156" target="_blank">https://doi.org/10.3390/rs12071156</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib2"><label>Autonès and Claudon(2012)</label><mixed-citation>
Autonès, F. and Claudon, M.: Algorithm Theoretical Basis Document for the
Convection Product Processors of the NWC/GEO, Tech. Rep.
SAF/NWC/CDOP/MFT/SCI/ATBD/11, Meteo-France, Toulouse,
<a href="https://www.nwcsaf.org/Downloads/GEO/2018.1/Documents/Scientific_Docs/NWC-CDOP2-GEO-MFT-SCI-ATBD-Convection_v2.2.pdf" target="_blank"/>
(last access: 21 February 2022), 2012.
</mixed-citation></ref-html>
<ref-html id="bib1.bib3"><label>Ayzel et al.(2020)Ayzel, Scheffer, and
Heistermann</label><mixed-citation>
Ayzel, G., Scheffer, T., and Heistermann, M.: RainNet v1.0: a convolutional
neural network for radar-based precipitation nowcasting, Geosci. Model Dev.,
13, 2631–2644, <a href="https://doi.org/10.5194/gmd-13-2631-2020" target="_blank">https://doi.org/10.5194/gmd-13-2631-2020</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib4"><label>Barras et al.(2019)Barras, Hering, Martynov, Noti, Germann, and
Martius</label><mixed-citation>
Barras, H., Hering, A., Martynov, A., Noti, P.-A., Germann, U., and Martius,
O.: Experiences with  &gt; 50,000 Crowdsourced Hail Reports in Switzerland, B.
Am. Meteorol. Soc., 100, 1429–1440, <a href="https://doi.org/10.1175/BAMS-D-18-0090.1" target="_blank">https://doi.org/10.1175/BAMS-D-18-0090.1</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib5"><label>Bedka et al.(2018)Bedka, Murillo, Homeyer, Scarino, and
Mersiovsky</label><mixed-citation>
Bedka, K., Murillo, E. M., Homeyer, C. R., Scarino, B., and Mersiovsky, H.: The Above-Anvil Cirrus Plume: An Important Severe Weather Indicator in Visible and Infrared Satellite Imagery, Weather Forecast., 33, 1159–1181,
<a href="https://doi.org/10.1175/WAF-D-18-0040.1" target="_blank">https://doi.org/10.1175/WAF-D-18-0040.1</a>, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib6"><label>Bedka and Khlopenkov(2016)</label><mixed-citation>
Bedka, K. M. and Khlopenkov, K.: A Probabilistic Multispectral Pattern
Recognition Method for Detection of Overshooting Cloud Tops Using Passive
Satellite Imager Observations, J. Appl. Meteorol. Clim., 55, 1983–2005,
<a href="https://doi.org/10.1175/JAMC-D-15-0249.1" target="_blank">https://doi.org/10.1175/JAMC-D-15-0249.1</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib7"><label>Chai and Draxler(2014)</label><mixed-citation>
Chai, T. and Draxler, R. R.: Root mean square error (RMSE) or mean absolute
error (MAE)? – Arguments against avoiding RMSE in the literature, Geosci. Model Dev., 7, 1247–1250, <a href="https://doi.org/10.5194/gmd-7-1247-2014" target="_blank">https://doi.org/10.5194/gmd-7-1247-2014</a>, 2014.
</mixed-citation></ref-html>
<ref-html id="bib1.bib8"><label>Changnon(1993)</label><mixed-citation>
Changnon, S. A.: Relationships between Thunderstorms and Cloud-to-Ground
Lightning in the United States, J. Appl. Meteorol., 32, 88–105,
<a href="https://doi.org/10.1175/1520-0450(1993)032&lt;0088:RBTACT&gt;2.0.CO;2" target="_blank">https://doi.org/10.1175/1520-0450(1993)032&lt;0088:RBTACT&gt;2.0.CO;2</a>, 1993.
</mixed-citation></ref-html>
<ref-html id="bib1.bib9"><label>Czernecki et al.(2019)Czernecki, Taszarek, Marosz, Półrolniczak,
Kolendowicz, Wyszogrodzki, and Szturc</label><mixed-citation>
Czernecki, B., Taszarek, M., Marosz, M., Półrolniczak, M., Kolendowicz, L.,
Wyszogrodzki, A., and Szturc, J.: Application of machine learning to large
hail prediction – The importance of radar reflectivity, lightning occurrence
and convective parameters derived from ERA5, Atmos. Res., 227, 249–262,
<a href="https://doi.org/10.1016/j.atmosres.2019.05.010" target="_blank">https://doi.org/10.1016/j.atmosres.2019.05.010</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib10"><label>Dixon and Wiener(1993)</label><mixed-citation>
Dixon, M. and Wiener, G.: TITAN: Thunderstorm Identification, Tracking,
Analysis, and Nowcasting – A Radar-based Methodology, J. Atmos. Ocean. Tech., 10, 785–797, <a href="https://doi.org/10.1175/1520-0426(1993)010&lt;0785:TTITAA&gt;2.0.CO;2" target="_blank">https://doi.org/10.1175/1520-0426(1993)010&lt;0785:TTITAA&gt;2.0.CO;2</a>, 1993.
</mixed-citation></ref-html>
<ref-html id="bib1.bib11"><label>Foote et al.(2005)Foote, Krauss, and Makitov</label><mixed-citation>
Foote, G. B., Krauss, T. W., and Makitov, V.: Hail metrics using conventional
radar, in: Proc. 16th Conference on Planned and Inadvertent Weather
Modification, <a href="https://ams.confex.com/ams/pdfpapers/86773.pdf" target="_blank"/> (last
access: 21 February 2021), 2005.
</mixed-citation></ref-html>
<ref-html id="bib1.bib12"><label>Foresti et al.(2019)Foresti, Sideris, Nerini, Beusch, and
Germann</label><mixed-citation>
Foresti, L., Sideris, I. V., Nerini, D., Beusch, L., and Germann, U.: Using a
10-Year Radar Archive for Nowcasting Precipitation Growth and Decay: A
Probabilistic Machine Learning Approach, Weather Forecast., 34, 1547–1569, <a href="https://doi.org/10.1175/WAF-D-18-0206.1" target="_blank">https://doi.org/10.1175/WAF-D-18-0206.1</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib13"><label>Franch et al.(2020)Franch, Nerini, Pendesini, Coviello, Jurman, and
Furlanello</label><mixed-citation>
Franch, G., Nerini, D., Pendesini, M., Coviello, L., Jurman, G., and
Furlanello, C.: Precipitation Nowcasting with Orographic Enhanced Stacked
Generalization: Improving Deep Learning Predictions on Extreme Events,
Atmosphere, 11, 267, <a href="https://doi.org/10.3390/atmos11030267" target="_blank">https://doi.org/10.3390/atmos11030267</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib14"><label>GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018a)</label><mixed-citation>
GOES-R Algorithm Working Group and GOES-R Series Program Office: NOAA GOES-R
Series Advanced Baseline Imager (ABI) Level 2 Cloud Top Height (ACHA) [data set], <a href="https://doi.org/10.7289/V5HX19ZQ" target="_blank">https://doi.org/10.7289/V5HX19ZQ</a>, 2018a.
</mixed-citation></ref-html>
<ref-html id="bib1.bib15"><label>GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018b)</label><mixed-citation>
GOES-R Algorithm Working Group and GOES-R Series Program Office: NOAA GOES-R
Series Advanced Baseline Imager (ABI) Level 2 Cloud Optical Depth (COD) [data set], <a href="https://doi.org/10.7289/V58G8J02" target="_blank">https://doi.org/10.7289/V58G8J02</a>, 2018b.
</mixed-citation></ref-html>
<ref-html id="bib1.bib16"><label>GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018c)</label><mixed-citation>
GOES-R Algorithm Working Group and GOES-R Series Program Office: NOAA GOES-R
Series Advanced Baseline Imager (ABI) Level 2 Cloud Top Pressure (CTP) [data set], <a href="https://doi.org/10.7289/V5D50K85" target="_blank">https://doi.org/10.7289/V5D50K85</a>, 2018c.
</mixed-citation></ref-html>
<ref-html id="bib1.bib17"><label>GOES-R Algorithm Working Group and GOES-R Series Program
Office(2018d)</label><mixed-citation>
GOES-R Algorithm Working Group and GOES-R Series Program Office: NOAA GOES-R
Series Advanced Baseline Imager (ABI) Level 2 Derived Stability Indices [data set], <a href="https://doi.org/10.7289/V50Z71KF" target="_blank">https://doi.org/10.7289/V50Z71KF</a>, 2018d.
</mixed-citation></ref-html>
<ref-html id="bib1.bib18"><label>GOES-R Calibration Working Group and GOES-R Series
Program(2017)</label><mixed-citation>
GOES-R Calibration Working Group and GOES-R Series Program: NOAA GOES-R
Series Advanced Baseline Imager (ABI) Level 1b Radiances [data set], <a href="https://doi.org/10.7289/V5BV7DSR" target="_blank">https://doi.org/10.7289/V5BV7DSR</a>, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib19"><label>Greene and Clark(1972)</label><mixed-citation>
Greene, D. R. and Clark, R. A.: Vertically Integrated Liquid Water – A New
Analysis Tool, Mon. Weather Rev., 100, 548–552,
<a href="https://doi.org/10.1175/1520-0493(1972)100&lt;0548:VILWNA&gt;2.3.CO;2" target="_blank">https://doi.org/10.1175/1520-0493(1972)100&lt;0548:VILWNA&gt;2.3.CO;2</a>, 1972.
</mixed-citation></ref-html>
<ref-html id="bib1.bib20"><label>Handwerker(2002)</label><mixed-citation>
Handwerker, J.: Cell tracking with TRACE3D – a new algorithm, Atmos. Res.,
61, 15–34, <a href="https://doi.org/10.1016/S0169-8095(01)00100-4" target="_blank">https://doi.org/10.1016/S0169-8095(01)00100-4</a>, 2002.
</mixed-citation></ref-html>
<ref-html id="bib1.bib21"><label>Heidinger et al.(2020)Heidinger, Pavolonis, Calvert, Hoffman, Nebuda, Straka, Walther, and Wanzong</label><mixed-citation>
Heidinger, A. K., Pavolonis, M. J., Calvert, C., Hoffman, J., Nebuda, S.,
Straka, W., Walther, A., and Wanzong, S.: ABI Cloud Products from the GOES-R Series, in: The GOES-R Series: A New Generation of Geostationary Environmental Satellites, chap. 6, edited by: Goodman, S. J., Schmit, T. J., Daniels, J., and Redmon, R. J., Elsevier, 43–62,
<a href="https://doi.org/10.1016/B978-0-12-814327-8.00006-8" target="_blank">https://doi.org/10.1016/B978-0-12-814327-8.00006-8</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib22"><label>Heiss et al.(1990)Heiss, McGrew, and Sirmans</label><mixed-citation>
Heiss, W. H., McGrew, D. L., and Sirmans, D.: Nexrad: next generation weather
radar (WSR-88D), Microwave J., 33, 79+, 1990.
</mixed-citation></ref-html>
<ref-html id="bib1.bib23"><label>Helmus and Collis(2016)</label><mixed-citation>
Helmus, J. J. and Collis, S. M.: The Python ARM Radar Toolkit (Py-ART), a
library for working with weather radar data in the Python programming language, J. Open Res. Software, 4, e25, <a href="https://doi.org/10.5334/jors.119" target="_blank">https://doi.org/10.5334/jors.119</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib24"><label>Hering et al.(2004)Hering, Morel, Galli, Sénési, Ambrosetti, and
Boscacci</label><mixed-citation>
Hering, A., Morel, C., Galli, G., Sénési, S., Ambrosetti, P., and Boscacci, M.: Nowcasting thunderstorms in the Alpine region using a radar based adaptive thresholding scheme, in: Proceedings of ERAD 2004, <a href="https://www.copernicus.org/erad/2004/online/ERAD04_P_206.pdf" target="_blank"/> (last access: 21 February 2022), 2004.
</mixed-citation></ref-html>
<ref-html id="bib1.bib25"><label>Hering et al.(2005)Hering, Sénési, Ambrosetti, and
Bernard-Bouissières</label><mixed-citation>
Hering, A., Sénési, S., Ambrosetti, P., and Bernard-Bouissières, I.: Nowcasting thunderstorms in complex cases using radar data, in: WMO Symposium on Nowcasting and Very Short Range Forecasting, <a href="https://www.researchgate.net/publication/228609271_Nowcasting_thunderstorms_in_complex_cases_using_radar_data" target="_blank"/>
(last access: 21 February 2022), 2005.
</mixed-citation></ref-html>
<ref-html id="bib1.bib26"><label>Hering et al.(2006)Hering, Germann, Boscacci, and
Sénési</label><mixed-citation>
Hering, A., Germann, U., Boscacci, M., and Sénési, S.: Operational
thunderstorm nowcasting in the Alpine region using 3D-radar severe weather
parameters and lightning data, in: Proceedings of ERAD 2006, <a href="http://www.crahi.upc.edu/ERAD2006/proceedingsMask/00122.pdf" target="_blank"/> (last
access: 21 February 2022), 2006.
</mixed-citation></ref-html>
<ref-html id="bib1.bib27"><label>Hoffmann(2008)</label><mixed-citation>
Hoffmann, J.: Entwicklung und Anwendung von statistischen Vorhersage – Interpretationsverfahren für Gewitternowcasting und Unwetterwarnungen unter Einbeziehung von Fernerkundungsdaten, PhD thesis, Freie Universität Berlin, Berlin, <a href="https://doi.org/10.17169/refubium-15903" target="_blank">https://doi.org/10.17169/refubium-15903</a>, 2008.
</mixed-citation></ref-html>
<ref-html id="bib1.bib28"><label>Huang et al.(2019)Huang, Jiang, Liu, Pan, Li, Guo, Huang, and
Duan</label><mixed-citation>
Huang, W., Jiang, Y., Liu, X., Pan, Y., Li, X., Guo, R., Huang, Y., and Duan,
B.: Classified Early-warning and Nowcasting of Hail Weather Based on Radar
Products and Random Forest Algorithm, in: 2019 International Conference on
Meteorology Observations (ICMO), <a href="https://doi.org/10.1109/ICMO49322.2019.9026039" target="_blank">https://doi.org/10.1109/ICMO49322.2019.9026039</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib29"><label>James et al.(2018)James, Reichert, and
Heizenreder</label><mixed-citation>
James, P. M., Reichert, B. K., and Heizenreder, D.: NowCastMIX: Automatic
Integrated Warnings for Severe Convection on Nowcasting Time Scales at the
German Weather Service, Weather Forecast., 33, 1413–1433,
<a href="https://doi.org/10.1175/WAF-D-18-0038.1" target="_blank">https://doi.org/10.1175/WAF-D-18-0038.1</a>, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib30"><label>Ke et al.(2017)Ke, Meng, Finley, Wang, Chen, Ma, Ye, and
Liu</label><mixed-citation>
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., and Liu,
T.-Y.: LightGBM: a highly efficient gradient boosting decision tree, in:
Proceedings of the 31st International Conference on Neural Information
Processing Systems, 3149–3157, <a href="https://dl.acm.org/doi/abs/10.5555/3294996.3295074" target="_blank"/> (last access: 21 February 2022), 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib31"><label>Kelly et al.(1985)Kelly, Schaefer, and
Doswell</label><mixed-citation>
Kelly, D. L., Schaefer, J. T., and Doswell, C. A.: Climatology of Nontornadic
Severe Thunderstorm Events in the United States, Mon. Weather Rev., 113,
1997–2014, <a href="https://doi.org/10.1175/1520-0493(1985)113&lt;1997:CONSTE&gt;2.0.CO;2" target="_blank">https://doi.org/10.1175/1520-0493(1985)113&lt;1997:CONSTE&gt;2.0.CO;2</a>, 1985.
</mixed-citation></ref-html>
<ref-html id="bib1.bib32"><label>Kober and Tafferner(2009)</label><mixed-citation>
Kober, K. and Tafferner, A.: Tracking and Nowcasting of Convective Cells Using Remote Sensing Data from Radar and Satellite, Meteorol. Z., 18, 75–84, <a href="https://doi.org/10.1127/0941-2948/2009/359" target="_blank">https://doi.org/10.1127/0941-2948/2009/359</a>, 2009.
</mixed-citation></ref-html>
<ref-html id="bib1.bib33"><label>Kober et al.(2012)Kober, Craig, Keil, and
Dörnbrack</label><mixed-citation>
Kober, K., Craig, G. C., Keil, C., and Dörnbrack, A.: Blending a probabilistic nowcasting method with a high-resolution numerical weather prediction ensemble for convective precipitation forecasts, Q. J. Roy. Meteorol. Soc., 138, 755–768, <a href="https://doi.org/10.1002/qj.939" target="_blank">https://doi.org/10.1002/qj.939</a>, 2012.
</mixed-citation></ref-html>
<ref-html id="bib1.bib34"><label>Kumar et al.(2020)Kumar, Islam, Sekimoto, Mattmann, and
Wilson</label><mixed-citation>
Kumar, A., Islam, T., Sekimoto, Y., Mattmann, C., and Wilson, B.: Convcast: An embedded convolutional LSTM based architecture for precipitation nowcasting using satellite data, PLOS One, 15, 1–18,
<a href="https://doi.org/10.1371/journal.pone.0230114" target="_blank">https://doi.org/10.1371/journal.pone.0230114</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib35"><label>Lagerquist et al.(2017)Lagerquist, McGovern, and
Smith</label><mixed-citation>
Lagerquist, R., McGovern, A., and Smith, T.: Machine Learning for Real-Time
Prediction of Damaging Straight-Line Convective Wind, Weather Forecast., 32, 2175–2193, <a href="https://doi.org/10.1175/WAF-D-17-0038.1" target="_blank">https://doi.org/10.1175/WAF-D-17-0038.1</a>, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib36"><label>Lagerquist et al.(2020)Lagerquist, McGovern, Homeyer, Gagne, and
Smith</label><mixed-citation>
Lagerquist, R., McGovern, A., Homeyer, C. R., Gagne II, D. J., and Smith, T.:
Deep Learning on Three-Dimensional Multiscale Data for Next-Hour Tornado
Prediction, Mon. Weather Rev., 148, 2837–2861, <a href="https://doi.org/10.1175/MWR-D-19-0372.1" target="_blank">https://doi.org/10.1175/MWR-D-19-0372.1</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib37"><label>Leinonen et al.(2021)Leinonen, Hamann, Germann, and
Mecikalski</label><mixed-citation>
Leinonen, J., Hamann, U., Germann, U., and Mecikalski, J. R.: Machine learning code and dataset for “Nowcasting thunderstorm hazards using machine learning: the impact of data sources on performance”, Zenodo [code and data set], <a href="https://doi.org/10.5281/zenodo.6206919" target="_blank">https://doi.org/10.5281/zenodo.6206919</a>, 2021.
</mixed-citation></ref-html>
<ref-html id="bib1.bib38"><label>Li et al.(2020)Li, Li, and Schmit</label><mixed-citation>
Li, J., Li, Z., and Schmit, T. J.: ABI Legacy Atmospheric Profiles and
Derived Products from the GOES-R Series, in: The GOES-R Series: A New
Generation of Geostationary Environmental Satellites, chap. 7, edited by: Goodman, S. J., Schmit, T. J., Daniels, J., and Redmon, R. J., Elsevier, 63–77, <a href="https://doi.org/10.1016/B978-0-12-814327-8.00007-X" target="_blank">https://doi.org/10.1016/B978-0-12-814327-8.00007-X</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib39"><label>Marshall and Radhakant(1978)</label><mixed-citation>
Marshall, J. S. and Radhakant, S.: Radar Precipitation Maps as Lightning
Indicators, J. Appl. Meteorol., 17, 206–212,
<a href="https://doi.org/10.1175/1520-0450(1978)017&lt;0206:RPMALI&gt;2.0.CO;2" target="_blank">https://doi.org/10.1175/1520-0450(1978)017&lt;0206:RPMALI&gt;2.0.CO;2</a>, 1978.
</mixed-citation></ref-html>
<ref-html id="bib1.bib40"><label>Martner et al.(2008)Martner, Yuter, White, Matrosov, Kingsmill, and
Ralph</label><mixed-citation>
Martner, B. E., Yuter, S. E., White, A. B., Matrosov, S. Y., Kingsmill, D. E., and Ralph, F. M.: Raindrop Size Distributions and Rain Characteristics in
California Coastal Rainfall for Periods with and without a Radar Bright Band,
J. Hydrometeorol., 9, 408–425, <a href="https://doi.org/10.1175/2007JHM924.1" target="_blank">https://doi.org/10.1175/2007JHM924.1</a>, 2008.
</mixed-citation></ref-html>
<ref-html id="bib1.bib41"><label>Mecikalski and Bedka(2006)</label><mixed-citation>
Mecikalski, J. R. and Bedka, K. M.: Forecasting Convective Initiation by
Monitoring the Evolution of Moving Cumulus in Daytime GOES Imagery, Mon. Weather Rev., 134, 49–78, <a href="https://doi.org/10.1175/MWR3062.1" target="_blank">https://doi.org/10.1175/MWR3062.1</a>, 2006.
</mixed-citation></ref-html>
<ref-html id="bib1.bib42"><label>Mecikalski et al.(2010)Mecikalski, MacKenzie, Koenig, and
Muller</label><mixed-citation>
Mecikalski, J. R., MacKenzie, W. M., Koenig, M., and Muller, S.: Cloud-Top
Properties of Growing Cumulus prior to Convective Initiation as Measured by
Meteosat Second Generation. Part I: Infrared Fields, J. Appl. Meteorol. Clim., 49, 521–534, <a href="https://doi.org/10.1175/2009JAMC2344.1" target="_blank">https://doi.org/10.1175/2009JAMC2344.1</a>, 2010.
</mixed-citation></ref-html>
<ref-html id="bib1.bib43"><label>Mecikalski et al.(2015)Mecikalski, Williams, Jewett, Ahijevych,
LeRoy, and Walker</label><mixed-citation>
Mecikalski, J. R., Williams, J. K., Jewett, C. P., Ahijevych, D., LeRoy, A.,
and Walker, J. R.: Probabilistic 0–1-h Convective Initiation Nowcasts that
Combine Geostationary Satellite Observations and Numerical Weather Prediction
Model Data, J. Appl. Meteorol. Clim., 54, 1039–1059,
<a href="https://doi.org/10.1175/JAMC-D-14-0129.1" target="_blank">https://doi.org/10.1175/JAMC-D-14-0129.1</a>, 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib44"><label>Mecikalski et al.(2021)Mecikalski, Sandmæl, Murillo, Homeyer, Bedka, Apke, and Jewett</label><mixed-citation>
Mecikalski, J. R., Sandmæl, T. N., Murillo, E. M., Homeyer, C. R., Bedka, K. M., Apke, J. M., and Jewett, C. P.: Random Forest Model to Assess Predictor Importance and Nowcast Severe Storms using High-Resolution Radar–GOES Satellite–Lightning Observations, Mon. Weather Rev., 149, 1725–1746,
<a href="https://doi.org/10.1175/MWR-D-19-0274.1" target="_blank">https://doi.org/10.1175/MWR-D-19-0274.1</a>, 2021.
</mixed-citation></ref-html>
<ref-html id="bib1.bib45"><label>Mostajabi et al.(2019)Mostajabi, Finney, Rubinstein, and
Rachidi</label><mixed-citation>
Mostajabi, A., Finney, D. L., Rubinstein, M., and Rachidi, F.: Nowcasting
lightning occurrence from commonly available meteorological parameters using
machine learning techniques, npj Clim. Atmos. Sci., 2, 41,
<a href="https://doi.org/10.1038/s41612-019-0098-0" target="_blank">https://doi.org/10.1038/s41612-019-0098-0</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib46"><label>Mueller et al.(2003)Mueller, Saxen, Roberts, Wilson, Betancourt,
Dettling, Oien, and Yee</label><mixed-citation>
Mueller, C., Saxen, T., Roberts, R., Wilson, J., Betancourt, T., Dettling, S., Oien, N., and Yee, J.: NCAR Auto-Nowcast System, Weather Forecast., 18, 545–561, <a href="https://doi.org/10.1175/1520-0434(2003)018&lt;0545:NAS&gt;2.0.CO;2" target="_blank">https://doi.org/10.1175/1520-0434(2003)018&lt;0545:NAS&gt;2.0.CO;2</a>, 2003.
</mixed-citation></ref-html>
<ref-html id="bib1.bib47"><label>NASA/METI/AIST/Japan Spacesystems and US/Japan ASTER Science
Team(2019)</label><mixed-citation>
NASA/METI/AIST/Japan Spacesystems and US/Japan ASTER Science Team: ASTER
Global Digital Elevation Model V003 [data set], <a href="https://doi.org/10.5067/ASTER/ASTGTM.003" target="_blank">https://doi.org/10.5067/ASTER/ASTGTM.003</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib48"><label>Natekin and Knoll(2013)</label><mixed-citation>
Natekin, A. and Knoll, A.: Gradient boosting machines, a tutorial, Front.
Neurorobot., 7, 21, <a href="https://doi.org/10.3389/fnbot.2013.00021" target="_blank">https://doi.org/10.3389/fnbot.2013.00021</a>, 2013.
</mixed-citation></ref-html>
<ref-html id="bib1.bib49"><label>NOAA National Weather Service (NWS) Radar Operations
Center(1991)</label><mixed-citation>
NOAA National Weather Service (NWS) Radar Operations Center: Next Generation Radar (NEXRAD) Level 2 Base Data [data set], <a href="https://doi.org/10.7289/V5W9574V" target="_blank">https://doi.org/10.7289/V5W9574V</a>, 1991.
</mixed-citation></ref-html>
<ref-html id="bib1.bib50"><label>Pulkkinen et al.(2019)Pulkkinen, Nerini, Pérez Hortal,
Velasco-Forero, Seed, Germann, and Foresti</label><mixed-citation>
Pulkkinen, S., Nerini, D., Pérez Hortal, A. A., Velasco-Forero, C., Seed, A.,
Germann, U., and Foresti, L.: Pysteps: an open-source Python library for
probabilistic precipitation nowcasting (v1.0), Geosci. Model Dev., 12,
4185–4219, <a href="https://doi.org/10.5194/gmd-12-4185-2019" target="_blank">https://doi.org/10.5194/gmd-12-4185-2019</a>, 2019.
</mixed-citation></ref-html>
<ref-html id="bib1.bib51"><label>Raspaud et al.(2018)Raspaud, Hoese, Dybbroe, Lahtinen, Devasthale,
Itkin, Hamann, Rasmussen, Nielsen, Leppelt, Maul, Kliche, and
Thorsteinsson</label><mixed-citation>
Raspaud, M., Hoese, D., Dybbroe, A., Lahtinen, P., Devasthale, A., Itkin, M.,
Hamann, U., Rasmussen, L. O., Nielsen, E. S., Leppelt, T., Maul, A., Kliche,
C., and Thorsteinsson, H.: PyTroll: An Open-Source, Community-Driven Python Framework to Process Earth Observation Satellite Data, B. Am. Meteorol. Soc., 99, 1329–1336, <a href="https://doi.org/10.1175/BAMS-D-17-0277.1" target="_blank">https://doi.org/10.1175/BAMS-D-17-0277.1</a>, 2018.
</mixed-citation></ref-html>
<ref-html id="bib1.bib52"><label>Roberts and Rutledge(2003)</label><mixed-citation>
Roberts, R. D. and Rutledge, S.: Nowcasting Storm Initiation and Growth Using
GOES-8 and WSR-88D Data, Weather Forecast., 18, 562–584,
<a href="https://doi.org/10.1175/1520-0434(2003)018&lt;0562:NSIAGU&gt;2.0.CO;2" target="_blank">https://doi.org/10.1175/1520-0434(2003)018&lt;0562:NSIAGU&gt;2.0.CO;2</a>, 2003.
</mixed-citation></ref-html>
<ref-html id="bib1.bib53"><label>Rudlosky et al.(2020)Rudlosky, Goodman, and
Virts</label><mixed-citation>
Rudlosky, S. D., Goodman, S. J., and Virts, K. S.: Lightning Detection: GOES-R Series Geostationary Lightning Mapper, in: The GOES-R Series: A New
Generation of Geostationary Environmental Satellites, chap. 16, edited by: Goodman, S. J., Schmit, T. J., Daniels, J., and Redmon, R. J., Elsevier, 193–202, <a href="https://doi.org/10.1016/B978-0-12-814327-8.00016-0" target="_blank">https://doi.org/10.1016/B978-0-12-814327-8.00016-0</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib54"><label>Schmit and Gunshor(2020)</label><mixed-citation>
Schmit, T. J. and Gunshor, M. M.: ABI Imagery from the GOES-R Series, in:
The GOES-R Series: A New Generation of Geostationary Environmental Satellites, chap. 4, edited by: Goodman, S. J., Schmit, T. J., Daniels, J., and Redmon, R. J., Elsevier, 23–34, <a href="https://doi.org/10.1016/B978-0-12-814327-8.00004-4" target="_blank">https://doi.org/10.1016/B978-0-12-814327-8.00004-4</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib55"><label>Shi et al.(2015)Shi, Chen, Wang, Yeung, Wong, and
Woo</label><mixed-citation>
Shi, X., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W.-K., and Woo, W.-C.:
Convolutional LSTM Network: A Machine Learning Approach for Precipitation
Nowcasting, in: Advances in Neural Information Processing Systems 28, edited
by: Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., and Garnett, R.,
Curran Associates, Inc., 802–810,
<a href="http://papers.nips.cc/paper/5955-convolutional-lstm-network-a-machine-learning-approach-for-precipitation-nowcasting.pdf" target="_blank">http://papers.nips.cc/paper/5955-convolutional-lstm-network-a-machine-learning-approach-for-precipitation-nowcasting.pdf</a> (last access: 21 February 2022), 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib56"><label>Shi et al.(2017)Shi, Gao, Lausen, Wang, Yeung, Wong, and
Woo</label><mixed-citation>
Shi, X., Gao, Z., Lausen, L., Wang, H., Yeung, D.-Y., Wong, W.-K., and Woo,
W.-C.: Deep learning for precipitation nowcasting: a benchmark and a new
model, in: Proceedings of the 31st International Conference on Neural Information Processing Systems, 5622–5632, <a href="https://dl.acm.org/doi/abs/10.5555/3295222.3295313" target="_blank">https://dl.acm.org/doi/abs/10.5555/3295222.3295313</a> (last access: 21 February 2022), 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib57"><label>Smith et al.(2016)Smith, Lakshmanan, Stumpf, Ortega, Hondl, Cooper,
Calhoun, Kingfield, Manross, Toomey, and Brogden</label><mixed-citation>
Smith, T. M., Lakshmanan, V., Stumpf, G. J., Ortega, K. L., Hondl, K., Cooper, K., Calhoun, K. M., Kingfield, D. M., Manross, K. L., Toomey, R., and
Brogden, J.: Multi-Radar Multi-Sensor (MRMS) Severe Weather and Aviation
Products: Initial Operating Capabilities, B. Am. Meteorol. Soc., 97, 1617–1630, <a href="https://doi.org/10.1175/BAMS-D-14-00173.1" target="_blank">https://doi.org/10.1175/BAMS-D-14-00173.1</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib58"><label>Snyder(1987)</label><mixed-citation>
Snyder, J. P.: Map Projections – A Working Manual, United States Government
Printing Office, Washington, DC, USA, <a href="https://doi.org/10.3133/pp1395" target="_blank">https://doi.org/10.3133/pp1395</a>, 1987.
</mixed-citation></ref-html>
<ref-html id="bib1.bib59"><label>Sprenger et al.(2017)Sprenger, Schemm, Oechslin, and
Jenkner</label><mixed-citation>
Sprenger, M., Schemm, S., Oechslin, R., and Jenkner, J.: Nowcasting Foehn Wind Events Using the AdaBoost Machine Learning Algorithm, Weather Forecast., 32, 1079–1099, <a href="https://doi.org/10.1175/WAF-D-16-0208.1" target="_blank">https://doi.org/10.1175/WAF-D-16-0208.1</a>, 2017.
</mixed-citation></ref-html>
<ref-html id="bib1.bib60"><label>Steinacker et al.(2000)Steinacker, Dorninger, Wölfelmaier, and
Krennert</label><mixed-citation>
Steinacker, R., Dorninger, M., Wölfelmaier, F., and Krennert, T.: Automatic Tracking of Convective Cells and Cell Complexes from Lightning and Radar Data, Meteorol. Atmos. Phys., 72, 101–110, <a href="https://doi.org/10.1007/s007030050009" target="_blank">https://doi.org/10.1007/s007030050009</a>, 2000.
</mixed-citation></ref-html>
<ref-html id="bib1.bib61"><label>Süli and Mayers(2003)</label><mixed-citation>
Süli, E. and Mayers, D. F.: An Introduction to Numerical Analysis, Cambridge University Press, Cambridge, UK, <a href="https://doi.org/10.1017/CBO9780511801181" target="_blank">https://doi.org/10.1017/CBO9780511801181</a>, 2003.
</mixed-citation></ref-html>
<ref-html id="bib1.bib62"><label>Sullivan(2020)</label><mixed-citation>
Sullivan, P. C.: GOES-R Series Spacecraft and Instruments, in: The GOES-R
Series: A New Generation of Geostationary Environmental Satellites, chap. 3, edited by: Goodman, S. J., Schmit, T. J., Daniels, J., and Redmon, R. J.,
Elsevier, 13–21, <a href="https://doi.org/10.1016/B978-0-12-814327-8.00003-2" target="_blank">https://doi.org/10.1016/B978-0-12-814327-8.00003-2</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib63"><label>Waldvogel et al.(1979)Waldvogel, Federer, and
Grimm</label><mixed-citation>
Waldvogel, A., Federer, B., and Grimm, P.: Criteria for the Detection of Hail
Cells, J. Appl. Meteorol., 18, 1521–1525,
<a href="https://doi.org/10.1175/1520-0450(1979)018&lt;1521:CFTDOH&gt;2.0.CO;2" target="_blank">https://doi.org/10.1175/1520-0450(1979)018&lt;1521:CFTDOH&gt;2.0.CO;2</a>, 1979.
</mixed-citation></ref-html>
<ref-html id="bib1.bib64"><label>Willmott and Matsuura(2005)</label><mixed-citation>
Willmott, C. J. and Matsuura, K.: Advantages of the mean absolute error (MAE)
over the root mean square error (RMSE) in assessing average model performance, Clim. Res., 30, 79–82, <a href="https://doi.org/10.3354/cr030079" target="_blank">https://doi.org/10.3354/cr030079</a>, 2005.
</mixed-citation></ref-html>
<ref-html id="bib1.bib65"><label>Wilson and Mueller(1993)</label><mixed-citation>
Wilson, J. W. and Mueller, C. K.: Nowcasts of Thunderstorm Initiation and
Evolution, Weather Forecast., 8, 113–131, <a href="https://doi.org/10.1175/1520-0434(1993)008&lt;0113:NOTIAE&gt;2.0.CO;2" target="_blank">https://doi.org/10.1175/1520-0434(1993)008&lt;0113:NOTIAE&gt;2.0.CO;2</a>, 1993.
</mixed-citation></ref-html>
<ref-html id="bib1.bib66"><label>Wilson et al.(1998)Wilson, Crook, Mueller, Sun, and
Dixon</label><mixed-citation>
Wilson, J. W., Crook, N. A., Mueller, C. K., Sun, J., and Dixon, M.: Nowcasting Thunderstorms: A Status Report, B. Am. Meteorol. Soc., 79, 2079–2100, <a href="https://doi.org/10.1175/1520-0477(1998)079&lt;2079:NTASR&gt;2.0.CO;2" target="_blank">https://doi.org/10.1175/1520-0477(1998)079&lt;2079:NTASR&gt;2.0.CO;2</a>, 1998.
</mixed-citation></ref-html>
<ref-html id="bib1.bib67"><label>Yeung et al.(2015)Yeung, Smith, Baeck, and
Villarini</label><mixed-citation>
Yeung, J. K., Smith, J. A., Baeck, M. L., and Villarini, G.: Lagrangian
Analyses of Rainfall Structure and Evolution for Organized Thunderstorm
Systems in the Urban Corridor of the Northeastern United States, J. Hydrometeorol., 16, 1575–1595, <a href="https://doi.org/10.1175/JHM-D-14-0095.1" target="_blank">https://doi.org/10.1175/JHM-D-14-0095.1</a>, 2015.
</mixed-citation></ref-html>
<ref-html id="bib1.bib68"><label>Zhang et al.(2016)Zhang, Howard, Langston, Kaney, Qi, Tang, Grams,
Wang, Cocks, Martinaitis, Arthur, Cooper, Brogden, and
Kitzmiller</label><mixed-citation>
Zhang, J., Howard, K., Langston, C., Kaney, B., Qi, Y., Tang, L., Grams, H.,
Wang, Y., Cocks, S., Martinaitis, S., Arthur, A., Cooper, K., Brogden, J.,
and Kitzmiller, D.: Multi-Radar Multi-Sensor (MRMS) Quantitative
Precipitation Estimation: Initial Operating Capabilities, B. Am. Meteorol. Soc., 97, 621–638, <a href="https://doi.org/10.1175/BAMS-D-14-00174.1" target="_blank">https://doi.org/10.1175/BAMS-D-14-00174.1</a>, 2016.
</mixed-citation></ref-html>
<ref-html id="bib1.bib69"><label>Zhou et al.(2020)Zhou, Zheng, Dong, and Wang</label><mixed-citation>
Zhou, K., Zheng, Y., Dong, W., and Wang, T.: A Deep Learning Network for
Cloud-to-Ground Lightning Nowcasting with Multisource Data, J. Atmos. Ocean.
Tech., 37, 927–942, <a href="https://doi.org/10.1175/JTECH-D-19-0146.1" target="_blank">https://doi.org/10.1175/JTECH-D-19-0146.1</a>, 2020.
</mixed-citation></ref-html>
<ref-html id="bib1.bib70"><label>Zinner et al.(2008)Zinner, Mannstein, and
Tafferner</label><mixed-citation>
Zinner, T., Mannstein, H., and Tafferner, A.: Cb-TRAM: Tracking and monitoring severe convection from onset over rapid development to mature phase using multi-channel Meteosat-8 SEVIRI data, Meteorol. Atmos. Phys., 101, 191–210, <a href="https://doi.org/10.1007/s00703-008-0290-y" target="_blank">https://doi.org/10.1007/s00703-008-0290-y</a>, 2008.
</mixed-citation></ref-html>--></article>
