Windstorms in the Northeastern United States

Windstorms are a major natural hazard in many countries. Windstorms during the last four decades in the U.S. Northeast are identified and characterized using the spatial extent of locally extreme wind speeds at 100 m height from the ERA5 reanalysis database. During all of the top 10 windstorms, wind speeds in excess of their local 99.9th percentile extend over at least one-third of land-based ERA5 grid cells in this high population density region of the U.S. Maximum sustained wind speeds during these windstorms range from 26 to over 43 m s⁻¹, with wind speed return periods exceeding 6.5 to 106 years (considering the top 5% of grid cells during each storm). The property damage associated with these storms (inflation adjusted to January 2020) ranges from $24 million to over $29 billion. Two of these windstorms are linked to decaying tropical cyclones, three are Alberta Clippers and the remaining storms are Colorado Lows. Two of the ten re-intensified off the east coast, leading to development of Nor’easters. These windstorms followed frequently observed cyclone tracks, but exhibit maximum intensities, as measured using 700 hPa relative vorticity and mean sea level pressure, that are five to ten times mean values for cyclones that followed similar tracks over this 40-year period. The time evolution of wind speeds and concurrent precipitation for those windstorms that occurred after the year 2000 exhibits good agreement with in situ ground-based and remote sensing observations, plus storm damage reports, indicating that the ERA5 reanalysis data have a high degree of fidelity for large, damaging windstorms such as these. A larger pool of the top 50 largest windstorms exhibits evidence of serial clustering, but to a degree that is lower than comparable statistics from Europe.


Hazardous wind phenomena
Hazardous wind phenomena span a range of scales from extratropical cyclones down to downbursts and gust fronts associated with deep convection (Golden and Snow, 1991). Herein we focus on large-scale, long-duration 'windstorms' associated with extratropical cyclones since they are likely to have the most profound societal impacts. These large-scale windstorms are a feature of the climate of North America and Europe and a major contributor to weather-related social vulnerability and insurance losses (Della-Marta et al., 2009; Feser et al., 2015; Hirsch et al., 2001; Changnon, 2009; Ulbrich et al., 2001; Haylock, 2011; Lukens et al., 2018; Marchigiani et al., 2013).
This analysis focusses on windstorms in the Northeastern region of the United States (U.S.) as defined in the National Climate Assessment (USGCRP, 2018) (Table 1). This region exhibits a very high prevalence of mid-latitude cyclone passages (Hodges et al., 2011; Ulbrich et al., 2009) and the associated extreme weather events (Bentley et al., 2019). It lies under a convergence zone of two prominent Northern Hemisphere cyclone tracks associated with cyclones that form or redevelop as a result of lee-cyclogenesis east of the Rocky Mountains (Lareau and Horel, 2012). The first is associated with extratropical cyclones that have their genesis in the lee of the Rocky Mountains within or close to the U.S. state of Colorado and typically track towards the northeast (Colorado Lows, CL) (Bierly and Harrington, 1995; Hobbs et al., 1996). The second is characterized by cyclones that have their genesis in the lee of the Rocky Mountains in or close to the Canadian province of Alberta and track eastwards across the Great Lakes (Alberta Clippers, AC). Previous research has found that these cyclones generally move southeastward from the lee of the Canadian Rockies toward or just north of Lake Superior (Fig. 1a) before progressing eastward into southeastern Canada or the northeastern United States, with less than 10% of the cases in the climatology tracking south of the Great Lakes (Thomas and Martin, 2007). The northeastern states are also impacted by decaying tropical cyclones (TC) that track north from the Gulf of Mexico or along the Atlantic coastline (Baldini et al., 2016; Varlas et al., 2019; Halverson and Rabenhorst, 2013). Consistent with recent research on windstorm risk in Europe, which found that although less than 1% of cyclones that impact Northern Europe are post-tropical cyclones, these systems tend to be associated with higher 10-m wind speeds (Sainsbury et al., 2020), tropical cyclones such as Hurricane Sandy have been associated with large geophysical hazards in the Northeast (Halverson and Rabenhorst, 2013; Lackmann, 2015). This region also experiences episodic Nor'easters, extratropical cyclones that form or intensify off or along the U.S. east coast and exhibit either a retrograde or northerly track, resulting in a strong northeasterly flow over the Northeastern states (Hirsch et al., 2001; Zielinski, 2002).
https://doi.org/10.5194/nhess-2020-345 Preprint. Discussion started: 11 December 2020 © Author(s) 2020. CC BY 4.0 License.

[Table 1. Northeastern states. Recoverable column headings: Name; Abbreviation; a column beginning "2010". Table body not recovered.]

There is evidence that intense winter wind speeds in the mid-latitudes have increased since 1950, due in part to increased frequency of intense extratropical cyclones (Ma and Chang, 2017; Vose et al., 2014). While long-term trends such as this from reanalysis products are subject to the effects of changing data assimilation (Bloomfield et al., 2018; Befort et al., 2016; Bengtsson et al., 2004), the 56-member twentieth century reanalysis exhibits a positive trend in the 98th percentile wind speed over parts of the U.S. including the Northeastern states that are the focus of the current research (Brönnimann et al., 2012) (Fig. 1, Table 1).

Socioeconomic consequences of windstorms
Economic losses associated with atmospheric hazards are substantial. Data from Munich Re indicate that annual 'weather related' losses at the global scale in 1997-2006 were US$45.1 billion (inflation adjusted to 2006 $) (Bouwer et al., 2007). In 2013, globally aggregated losses due to natural hazards were estimated at US$125 billion (Kreibich et al., 2014). Data from the contiguous U.S. indicate 168 "billion-dollar disaster events" linked to atmospheric phenomena during a period beginning in 1980 (Smith and Matthews, 2015). In the U.S., three-quarters of total damages from natural hazards derive from hurricanes, flooding, and severe winter storms (including windstorms) (Gall et al., 2011). There is also evidence of a trend towards increasing economic impact from natural hazards within the U.S. even after adjusting for inflation. According to one report: 'Nationwide, annual losses rose from $4.7 billion in the 1960s to $6.7 billion in the 1970s, $7.6 billion in the 1980s, $14.8 billion in the 1990s, and $23.6 billion in the 2000s' due to a combination of more frequent disasters, disasters of larger scale and changes in societal resilience (Gall et al., 2011).
Here we focus on the Northeastern U.S. states (Fig. 1a, Table 1) because this region experiences a relatively high frequency of damaging storms, in particular during the cold season (Hirsch et al., 2001), and exhibits relatively high exposure due to the large number of both (i) highly populated, high-density urban areas (Fig. 1d; SEDAC, 2020; Census, 2019) and (ii) high-value (insured) assets. For example, New York state ranks tenth of fifty U.S. states in total direct economic losses related to natural hazards, with estimated losses of $12.54 billion (in 2009 $) between 1960 and 2009 (Gall et al., 2011).
Windstorms present a hazard to the built environment, to transportation, especially aviation (Young and Kristensen, 1992), and to multi-energy systems including the electric grid (Bao et al., 2020; Wanik et al., 2015). In 2016 the annual cost of grid disruptions within the U.S. was estimated to range from approximately $28 billion to $209 billion (Mills and Jones, 2016). Composite events characterized by the co-occurrence of ice accumulation and wind are particularly hazardous to the built environment, aviation and energy infrastructure (Sinh et al., 2016; Jeong et al., 2019). For example, in the 1998 Northeastern ice storm, ice deposition combined with high winds led to the toppling of 1,000 transmission towers, loss of power to 5 million people, and 840,000 insurance claims valued at $1.2 billion (Mills and Jones, 2016).

Objectives of this research
This research is inspired by and is conceptually analogous to development of the XWS (eXtreme WindStorms) catalogue of storm tracks and wind-gust footprints for 50 of the most extreme European winter windstorms (Roberts et al., 2014). Specific goals of the research reported herein are to:
1) Present a new method for identifying and physically characterizing severe windstorms. This method is applied to forty years of hourly output from the ERA5 reanalysis to extract the 10 most intense windstorms over the U.S. Northeastern states and describe them in terms of their location, spatial extent, duration, and severity. We further evaluate the degree to which these windstorms are composite extreme events, wherein high wind speeds co-occur with extreme or hazardous precipitation.
2) Verify aspects of the windstorms as characterized based on ERA5 reanalysis output using wind speed observations from sonic anemometers and precipitation characteristics from RADAR and in situ rain gauges, plus storm damage reports.
3) Contextualize these windstorms in the long-term cyclone climatology. Specifically, we track each windstorm over time and space using two indices of intensity derived from mean sea level pressure and relative vorticity and contextualize these events in the cyclone climatology for 1979-2018.
4) Evaluate these windstorms in terms of the return periods of extreme wind speeds derived using the Gumbel distribution applied to annual maximum wind speeds for 1979-2018.
This research is a part of the HyperFACETS project, which uses a storyline-based analysis framework. Storylines are "physically self-consistent unfolding of past events, or of plausible future events or pathways" (Shepherd et al., 2018) and provide a method of framing a research inquiry in terms of three elements: a geographic region, an event, and a set of process drivers for that event.

2 Data and Methods

ERA5 reanalysis
Attempts to identify and characterize windstorms from a geophysical perspective have historically been hampered by limited data availability and/or quality from geospatially inhomogeneous observing networks. Further, time series from in situ wind measurement networks exhibit substantial inhomogeneities due to factors such as station relocations, instrumentation changes, changes in conditions around individual measurement stations, and changes in measurement frequencies and/or integration periods (Pryor et al., 2009; Wan et al., 2010). Thus, herein we employ once-hourly wind speeds from the ERA5 reanalysis. The wind speeds are for a height of 100 m a.g.l. at the model time step of 20 minutes and a spatial resolution of 0.25° × 0.25°. This study focuses on windstorms within a study domain that extends from 35 to 50°N and 65 to 90°W (Fig. 1a). The events are defined using data from the 924 ERA5 land-dominated grid cells over the twelve Northeastern states (two-letter abbreviations given in Table 1).
The ERA5 reanalysis is derived using an unprecedented suite of assimilated in situ and remote sensing observations and exhibits relatively high fidelity for wind speeds (Kalverla et al., 2020; Olauson, 2018; Kalverla et al., 2019; Pryor et al., 2020; Jourdier, 2020; Ramon et al., 2019). However, it is important to acknowledge that wind parameters from any model do not fully reflect all scales of flow variability (Skamarock, 2004) and underestimate extreme wind speeds (Larsén et al., 2012), particularly in areas with high orographic complexity and/or varying surface roughness length. Here we use wind speeds at 100 m because flow at this height is less likely to be impacted by sub-grid scale heterogeneity in surface roughness length and uncertainties induced by unresolved sub-grid scale variability. Near-surface wind speeds are strongly coupled to wind speeds at 100 m (i.e. within the PBL), but wind speeds at 100 m are less strongly impacted by inaccuracies and/or uncertainty in surface roughness length (z0) (Minola et al., 2020; Nelli et al., 2020). Applying an uncertainty of a factor of two to z0 can lead to mean differences of up to 0.75 m s⁻¹ for near-surface (40 to 150 m a.g.l.) wind speeds (Dörenkämper et al., 2020). Further, the scale of the events we seek to characterize is regional rather than local, and such events are necessarily driven by winds aloft.
Cyclone tracking and intensity estimates presented herein employ three-hourly mean sea level pressure (MSLP) and relative vorticity at 700 hPa (RV) fields from ERA5. Previous research has indicated relatively good consistency between cyclone climatologies derived using ERA5 and other recent reanalyses (Gramcianinov et al., 2020; Sainsbury et al., 2020). RV values at 700 hPa are used rather than 850 hPa as in the XWS European analysis due to the presence of high elevation areas in U.S. cyclone source regions. Further, the three-hourly fields from ERA5 used herein are direct products of the reanalysis, whereas the 3-hourly values used in XWS were based on 6-hourly ERA Interim reanalyses combined with ERA Interim forecast values for the intervening time steps (Roberts et al., 2014).
Compound events, windstorms which exhibit a co-occurrence of extreme precipitation and/or freezing rain with high winds, are associated with amplified risk (Zscheischler et al., 2018; Sadegh et al., 2018). Precipitation intensity and hydrometeor class from ERA5 are used to identify to what degree each of the ten storms is a compound event. The hydrometeor classes reported by ERA5 are: rain, mixed rain and snow, wet snow, dry snow, freezing rain, and ice pellets. They are differentiated based largely on the temperature structure in the reanalysis model (https://confluence.ecmwf.int/display/FUG/9.7+Precipitation+Types). Prior analyses of ERA5 precipitation values have indicated skill relative to in situ observations and gridded data sets over the U.S. (Tarek et al., 2020; Sun and Liang, 2020).

Observational data
Wind speeds and precipitation characteristics during the windstorms are identified using ERA5 and are validated using in situ measurements from 24 National Weather Service (NWS) Automated Surface Observation System (ASOS) stations and seven NWS RADARs (Fig. 1c). Since major upgrades to the NWS systems were conducted in 2000, this evaluation is focused on windstorms that occurred after that year. Five-minute measurements of in situ wind speeds at 10 m a.g.l. used in this evaluation derive from ice-free two-dimensional sonic anemometers (Schmitt IV, 2009), while the in situ observations of precipitation intensity reported from the ASOS network derive from heated tipping-bucket rain gauges (Tokay et al., 2010). In the absence of widespread in situ wind speed observations from tall towers (which would be more comparable to the 100-m wind speeds from ERA5), these 10-m wind speed observations represent the best available validation data set for the occurrence of high winds throughout the Northeast states. NWS protocols document accumulated precipitation since the last hour, sampled every minute and reported every five minutes (Nadolski, 1998). For the current comparison to ERA5, these are averaged to generate hourly rainfall rates.
Precipitation rates from seven NWS dual polarization RADAR (Kitzmiller et al., 2013) are used to provide an areally-averaged comparison with ERA5 (Fig. 1c). NWS RADAR precipitation products are the product of extensive development efforts (Cunha et al., 2015; Villarini and Krajewski, 2010; Straka et al., 2000) and have been employed in a wide array of applications (Letson et al., 2020; Seo et al., 2015; Krajewski and Smith, 2002).
Precipitation intensity rates derived from RADAR reflectivity are reported in 41,400 cells using 1° azimuth angle and a range resolution of 2 km. In the current work, precipitation rates within 200 km of each RADAR are averaged in time to match the hourly resolution of ERA5 precipitation and interpolated in space to the 0.25° × 0.25° ERA5 grid. For comparison with ERA5, mean precipitation rates in each hour of the windstorm are computed from ERA5, ASOS and RADAR over the land areas of Northeastern states that are within 200 km of the seven RADAR stations used herein (Fig. 1c).

NOAA Storm Events Database
The U.S. National Oceanic and Atmospheric Administration (NOAA) provides detailed information on "the occurrence of storms and other significant weather phenomena having sufficient intensity to cause loss of life, injuries, significant property damage, and/or disruption to commerce" at the county level in the NOAA Storm Events Database (https://www.ncdc.noaa.gov/stormevents/). These records are subject to some inhomogeneities associated with digitization of transcripts prior to 1993 and standardization into 48 event types in 2013 (https://www.ncdc.noaa.gov/stormevents/details.jsp?type=collection), but are compiled from a range of county, state and federal agencies in addition to the NWS. Like all hazard loss datasets they are subject to reporting inaccuracies and inconsistencies (Gall et al., 2009), but they represent a long and relatively consistent record, and are widely used (Young et al., 2017; Konisky et al., 2016). Damage and mortality estimates from this dataset are used to provide an estimate of the impact of each windstorm, with the caveat that population density, and hence the potential for loss of life and damage, varies markedly between U.S. counties (Fig. 1d).

Method used to characterize windstorms
A range of different techniques have been developed and applied to identify and characterize atmospheric hazards including extreme windstorms. Some rely on an assessment of the severity of the events such as insured losses or human mortality/morbidity, others prescribe a level of rarity (i.e. are probabilistic), while others prescribe a level of intensity (i.e. the occurrence of extreme values of some physical phenomena) (Stephenson, 2008). Here we employ a methodology based on the intensity and spatial extent of extreme wind speeds. This approach is conceptually similar to storm severity indices derived from European work based on the maximum 925 hPa wind speed within a 3° radius of the vorticity maximum and the area over which wind speeds at that height exceed 25 m s⁻¹ (Roberts et al., 2014; Della-Marta et al., 2009). It also draws from earlier work that used an index defined as the product of the cube of the maximum observed wind speed over land, the area impacted by damaging winds (> 25.7 m s⁻¹) and the duration of damaging winds (Lamb, 1991).
This analysis employs hourly wind speeds at 100 m a.g.l. for 1979-2018 in all 924 land-dominated grid cells over the Northeastern states. The methodology applied to identify and characterize the ten largest windstorms does not employ an absolute threshold of wind speed, but rather exceedance of locally determined thresholds defined by the 99.9th percentile wind speed value (U999). A local U999 threshold is used, rather than an absolute wind speed threshold in m s⁻¹, in part because storms affecting urban areas, which may not be prone to high wind speeds, are especially damaging to infrastructure. While lower percentile thresholds have been used in previous work (Walz et al., 2017; Klawa and Ulbrich, 2003), use of the 99.9th percentile wind speed value is appropriate for identifying the truly extraordinary conditions we seek to characterize and is robust when applied to very long datasets with very large sample sizes. Use of locally determined thresholds also enables direct comparison of the spatial scale and intensity of windstorms derived using the ERA5 data at 100 m a.g.l. and near-surface wind speed observations from 10 m a.g.l. Exceedance of the local 99.9th percentile wind speed value (U999) is considered in both cases, based on the ~20-year record from each ASOS station and the 40 years of ERA5 data, and comparisons are made at an hourly resolution by averaging all ASOS wind speeds within a given hour.
As shown in Fig. 1a, there is marked spatial variability in the 99.9th percentile wind speed (i.e. the wind speed exceeded in 0.1% of hours, or roughly 350 of the approximately 350,000 hours in the forty-year period). U999 ranges from over 28 m s⁻¹ over the Atlantic Ocean down to 12 m s⁻¹ over some land grid cells, where surface roughness and topographic drag are higher.
Windstorms are identified as periods when the largest number of ERA5 grid cells exceed their local (ERA5 grid cell specific) 99.9th percentile wind speed value (U>U999). A further restriction is applied in that no event may be within 14 days of any other, to avoid double-counting of any individual storm (Fig. 1b, Table 2).
The peak hour of U>U999 coverage within the Northeast states for each of the ten most intense storms is referred to herein as the peak windstorm time (tp), and the 97 hours (±48 hours) surrounding that time are referred to as the storm period. For each hour of each storm period a high-wind centroid is identified using the mean latitude and longitude of all grid cells where U>U999.
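The identification steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the nearest-rank percentile estimate, the greedy ranking of peak hours and the small synthetic data set (4 grid cells, 2,500 hours) are all hypothetical. A real application would use the 924 ERA5 land-dominated grid cells and 40 years of hourly data.

```python
import math
import random


def local_threshold(series, q=0.999):
    """Nearest-rank empirical q-quantile of one grid cell's hourly wind speeds."""
    ordered = sorted(series)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def exceedance_counts(wind, thresholds):
    """For each hour, count the grid cells whose speed exceeds the local threshold.

    wind[c][t] is the wind speed in cell c at hour t.
    """
    n_hours = len(wind[0])
    return [
        sum(1 for c in range(len(wind)) if wind[c][t] > thresholds[c])
        for t in range(n_hours)
    ]


def select_events(counts, n_events, min_sep_hours):
    """Rank hours by coverage; keep peaks more than min_sep_hours apart (greedy)."""
    order = sorted(range(len(counts)), key=lambda t: counts[t], reverse=True)
    peaks = []
    for t in order:
        if all(abs(t - p) > min_sep_hours for p in peaks):
            peaks.append(t)
        if len(peaks) == n_events:
            break
    return peaks


def high_wind_centroid(wind, thresholds, lats, lons, t):
    """Mean latitude and longitude of all cells with U > U999 at hour t."""
    hit = [c for c in range(len(wind)) if wind[c][t] > thresholds[c]]
    if not hit:
        return None
    return (sum(lats[c] for c in hit) / len(hit),
            sum(lons[c] for c in hit) / len(hit))


# Synthetic demonstration data (hypothetical): background winds below 10 m/s.
rng = random.Random(0)
n_cells, n_hours = 4, 2500
wind = [[rng.uniform(0.0, 10.0) for _ in range(n_hours)] for _ in range(n_cells)]
lats = [40.0, 41.0, 42.0, 43.0]
lons = [-75.0, -74.0, -73.0, -72.0]
for c in range(4):
    wind[c][500] = 30.0   # widespread storm affecting all four cells
for c in range(3):
    wind[c][1500] = 25.0  # second storm affecting three cells
for c in range(2):
    wind[c][505] = 20.0   # weaker winds five hours after the first peak

thresholds = [local_threshold(cell) for cell in wind]
counts = exceedance_counts(wind, thresholds)
peaks = select_events(counts, n_events=2, min_sep_hours=14 * 24)
centroids = [high_wind_centroid(wind, thresholds, lats, lons, t) for t in peaks]
print(peaks, centroids)
```

Ranking hours by coverage and discarding any hour within the separation window of an already accepted peak is one simple way to enforce the 14-day rule; other declustering choices are possible and could yield slightly different event lists.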
Precipitation associated with each of the ten most intense windstorms is also evaluated using ERA5 precipitation totals and types. The analysis of precipitation centers on a 24-hour period centered on the peak windstorm time (tp). Precipitation statistics including 24-hour total precipitation, hourly precipitation rates, and the frequency of each precipitation type are characterized for all land grid cells in Northeastern states that exceed their local U999 value at any point in this 24-hour period.
Research from Europe indicates evidence of serial clustering of windstorms (Walz et al., 2018). Although our focus is primarily on the ten most intense and extensive windstorms, a larger sample of 50 events is extracted using the methodology described above but relaxing the temporal separation from 14 to 2 days, to examine the degree to which spatially extensive windstorms as manifest in ERA5 are serially clustered (Fig. 1b). This analysis employs a Poisson distribution fit to these 50 events and the dispersion index (D) of Mailier et al. (2006):
$$D = \frac{\sigma^2}{\mu} - 1 \qquad (1)$$
where σ² and μ are the variance and mean of the distribution, which for a Poisson distributed random variable are equal (Wilks, 2011). D > 0 indicates the presence of temporal clustering. The significance of D is evaluated using a bootstrapping analysis in which 10,000 samples are drawn with replacement and the dispersion index is calculated for each, similar to the method used by Pinto et al. (2016).
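As a minimal sketch of the clustering test, the Python below computes a dispersion index defined as the variance of the counts divided by their mean, minus one, and bootstraps it. Several details are assumptions made for illustration only: the counts are taken per calendar year, the sample (n − 1) variance is used, the bootstrap resamples the annual counts themselves, and `annual_counts` is an invented series of 50 events in 40 years rather than the data analysed here.

```python
import random


def dispersion_index(counts):
    """D = variance / mean - 1 for a sequence of event counts (sample variance)."""
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("dispersion index is undefined when no events occur")
    var = sum((x - mean) ** 2 for x in counts) / (n - 1)
    return var / mean - 1.0


def bootstrap_fraction_positive(counts, n_boot=10_000, seed=0):
    """Fraction of bootstrap resamples (drawn with replacement) that have D > 0."""
    rng = random.Random(seed)
    n = len(counts)
    positive = 0
    used = 0
    for _ in range(n_boot):
        sample = [counts[rng.randrange(n)] for _ in range(n)]
        if sum(sample) == 0:
            continue  # D is undefined for an all-zero resample, so it is skipped
        used += 1
        if dispersion_index(sample) > 0:
            positive += 1
    return positive / used


# Hypothetical annual counts: 50 events spread over 40 years (illustrative only).
annual_counts = [0, 3, 1, 0, 2, 1, 0, 4, 1, 1, 0, 2, 3, 0, 1, 1, 0, 2, 1, 0,
                 3, 1, 0, 1, 2, 0, 1, 4, 0, 1, 2, 1, 0, 1, 3, 0, 1, 2, 1, 3]

d_value = dispersion_index(annual_counts)
fraction = bootstrap_fraction_positive(annual_counts, n_boot=2000)
print(f"D = {d_value:.2f}; fraction of resamples with D > 0 = {fraction:.3f}")
```

A fraction close to one would indicate that positive dispersion is robust to resampling; a value near 0.5 would indicate counts that are indistinguishable from a Poisson process.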

Development of a cyclone climatology
A cyclone detection and tracking algorithm (Hodges et al., 2011) is applied to 3-hourly ERA5 MSLP and 700 hPa RV global fields that have been subjected to T42 spectral filtering for RV (corresponding to a 310-km resolution at the equator) and T63 filtering for MSLP (210 km at the equator), with the large-scale background removed for total wavenumbers ≤ 5. These spectral filters are designed to restrict detection to tropical and mid-latitude cyclones (Hoskins and Hodges, 2002). The location and intensity of the cyclones are identified using the local maxima in [...]
Consistent with past research (Hirsch et al., 2001), all of the top-10 windstorms identified using the largest spatial extent of locally extreme wind speeds in the ERA5 data occur during cold season months (October to April). Thus, the cyclone track density used to contextualize the windstorms is restricted to only those months. This analysis further focusses solely on cyclones that track into the Northeastern domain. These restrictions allow direct evaluation of the degree to which the windstorms are typical of the prevailing cyclone climatology.

Calculation of wind speed return periods
Peak wind speeds (Upeak) during each of the windstorms are expressed in terms of their return period (RP in years) to provide a metric of the degree to which these events are exceptional. These statistics are computed for each ERA5 grid cell by fitting a double exponential (Gumbel) distribution to annual maximum wind speeds (Umax) (Mann et al., 1998):
$$P(U_{max}) = \exp\left[-\exp\left(-\frac{U_{max} - \beta}{\alpha}\right)\right]$$
where the distribution parameters α (scale) and β (location) are derived using maximum likelihood estimation. The Upeak estimates for each ERA5 grid cell are then evaluated in terms of their return period (RP in years) using (Wilks, 2011; Pryor et al., 2012):
$$RP = \frac{1}{1 - P(U_{peak})}$$
This method is similar to that used for grid-point-based wind speed return period calculations in previous work (Della-Marta et al., 2009), which resulted in return periods of 0.1 to 500 years when considering 200 prominent windstorms in Europe.
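The return period calculation can be illustrated with a short Python sketch. It assumes the standard Gumbel form P(U) = exp{−exp[−(U − β)/α]} with scale α and location β, and a return period of 1/(1 − P). The bisection solver, the function names and the synthetic sample are hypothetical choices made for illustration; the sample is deliberately much larger than the 40 annual maxima available per grid cell so that parameter recovery can be checked.

```python
import math
import random


def fit_gumbel_mle(x):
    """Maximum likelihood estimates (alpha, beta) of Gumbel scale and location.

    Solves g(a) = a - mean(x) + sum(x * w) / sum(w) = 0, with w = exp(-x / a),
    by bisection, then obtains beta in closed form. Requires non-constant data.
    """
    n = len(x)
    mean = sum(x) / n
    x0 = min(x)  # shifting by the minimum keeps every exponent <= 0
    spread = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))

    def g(a):
        w = [math.exp(-(v - x0) / a) for v in x]
        return a - mean + sum(v * wi for v, wi in zip(x, w)) / sum(w)

    lo, hi = 1e-3 * spread, 10.0 * spread
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    alpha = 0.5 * (lo + hi)
    beta = x0 - alpha * math.log(sum(math.exp(-(v - x0) / alpha) for v in x) / n)
    return alpha, beta


def return_period(u, alpha, beta):
    """Return period (years) of wind speed u under a Gumbel fit to annual maxima."""
    exceed_prob = -math.expm1(-math.exp(-(u - beta) / alpha))  # 1 - P(u)
    return 1.0 / exceed_prob


# Synthetic annual maxima drawn from a Gumbel distribution with known parameters.
rng = random.Random(1)
true_alpha, true_beta = 2.0, 20.0
sample = []
for _ in range(5000):
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    sample.append(true_beta - true_alpha * math.log(-math.log(u)))

alpha_hat, beta_hat = fit_gumbel_mle(sample)
u_100 = true_beta - true_alpha * math.log(-math.log(0.99))  # 100-year wind speed
print(alpha_hat, beta_hat, return_period(u_100, true_alpha, true_beta))
```

With only 40 annual maxima per grid cell the fitted parameters, and hence long return periods, carry considerable sampling uncertainty, so values far beyond the record length are best read as indicative.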

Windstorm identification and characterization
The top-10 windstorms during 1979-2018 over the Northeastern states identified using the method described above are summarized in Table 2. During the peak hour (tp) of each of these windstorms, 309 to 524 (33 to 56%) of the 924 ERA5 land-dominated grid cells exhibit U>U999 (Table 2). For context, 10% of ERA5 grid cells co-exhibit U>U999 in <1% of hours. The windstorms are not concentrated in any sub-period of the 40 years under consideration and no individual year contained two of the top ten windstorms (Fig. 1b). Hence, in the following the windstorms are referred to by their (unique) year of occurrence, and in all figures and tables results are displayed in decreasing order of windstorm magnitude as defined using spatial extent (Table 2).
The maximum wind speed at 100 m a.g.l. in any ERA5 grid cell at the peak hour ranges from 25 to 41 m s⁻¹, while the maximum during the storm period ranges from 26 to 44 m s⁻¹ (Table 2). These maximum wind speeds do not scale with the storm intensity as measured by the number of grid cells that exceed their local 99.9th percentile wind speeds (Table 2). For example, the windstorm during March 1993 was associated with the highest absolute wind speeds but was manifest in a relatively small number of ERA5 grid cells (Table 2).
All ten windstorms are associated with substantial damage reports within the Northeast states (Table 2, Fig. 2) and nine of the ten storms were responsible for deaths in the Northeast states (Fig. 2). There is no direct correspondence between the ranking of the windstorms in terms of the number of ERA5 grid cells with U>U999 and the amount of damage and human mortality as reported in the NOAA Storm Data, but the four highest-magnitude windstorms (2012, 2003, 1979, and 1996) all have property damage totals above any of the other six windstorms (Table 2). Further, although NOAA Storm Data indicate only modest total economic costs associated with property damage during the 1992 windstorm, there are reports of widespread damage in counties across much of the Northeast (Fig. 2). The lack of complete correspondence between the centroid of windstorms, as identified using the methodology presented here, and property damage in the NOAA dataset is likely due to:
(i) Occurrence of localized extreme (damaging) winds that are manifest at scales below those represented in the ERA5 reanalysis (e.g. downbursts from embedded thunderstorms, sting jets and other mechanisms; Clark and Gray, 2018). Hewson and Neu (2015) suggest a grid resolution of 20 km or higher is required to fully capture damaging winds.
(ii) Spatial variability in insured assets (e.g. Nyce et al., 2015; Brown et al., 2015).
(iii) Possible inconsistencies in storm-reporting practices across counties (see NOAA storm data publications for details: https://www.ncdc.noaa.gov/IPS/sd/sd.html).
Nevertheless, although many factors dictate economic losses from windstorms, the Pearson correlation coefficient (r) between the number of grid cells with U>U999 at tp and inflation adjusted property damage exceeds 0.66, and r between the maximum wind speed and inflation adjusted property damage is 0.56. For a sample size of 10, using a t-test to evaluate significance (Wilks, 2011), these correlation coefficients differ from 0 at confidence levels of 95% and 90%, respectively. Thus, this geophysical intensity metric captures aspects of relevance to storm damage.
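The significance levels quoted for the two correlation coefficients can be checked with a short calculation. The sketch below assumes the usual test statistic t = r√(n − 2)/√(1 − r²) with n − 2 degrees of freedom and a two-sided test; the critical values are the standard tabulated values of Student's t for 8 degrees of freedom, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Two-sided critical values of Student's t for 8 degrees of freedom (n = 10).
T_CRIT_DF8 = {0.90: 1.860, 0.95: 2.306}


def confidence_levels_met(r):
    """Confidence levels at which r differs from zero for a sample of 10."""
    t = abs(t_statistic(r, 10))
    return [level for level, crit in sorted(T_CRIT_DF8.items()) if t > crit]


for r in (0.66, 0.56):
    print(r, round(t_statistic(r, 10), 3), confidence_levels_met(r))
```

For r = 0.66 the statistic is about 2.48, above the 95% critical value, while for r = 0.56 it is about 1.91, which passes the 90% but not the 95% threshold, consistent with the confidence levels stated above.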

Maximum precipitation accumulated in any Northeastern state land grid cell is given for the 24 hours surrounding the storm peak. Property damage for the Northeastern states is the accumulation of information from the NOAA Storm Data over the duration of the period for which the associated cyclone (defined using RV) is evident. Inflation adjusted property damage values are derived using inflation estimates from the U.S. Bureau of Statistics.
Several of the windstorms have been previously identified in independent analyses, further confirming the reliability of the detection method. For example, Hurricane Sandy, the most intense windstorm in this analysis (Table 2), is a historic storm that moved parallel to the coast before making landfall in southern New Jersey on 29 October and caused $50 billion of damage (Lackmann, 2015). According to the ERA5 output at its peak, over 300,000 km² of the Northeastern states exhibited wind speeds at 100 m a.g.l. that exceeded the locally determined U999 (Fig. 3). The 8th most intense windstorm (Table 2) is the "Storm of the Century" of 12-14 March 1993 that formed in the Gulf of Mexico and caused widespread damage in Florida and along the Atlantic coast before entering the Northeast (Huo et al., 1995).

Figure 2 (cont).
Previous research has reported that reinsurance contracts commonly employ a 72-hour window to describe a 'single event' (Haylock, 2011). All of the windstorms identified in this work transited the Northeastern study domain in < 72 hours. Intense wind coverage (U>U999) is generally concentrated in the ±10 hours around the storm peak time, tp (Fig. 3), although some windstorms had longer duration and a slower decay in widespread intense wind speeds, with significant coverage remaining >10 hours after tp (Fig. 3).

Twenty-four-hour precipitation totals, used as an indicator of flooding potential, and maximum precipitation rates, used as an indicator of transportation hazards, vary substantially among the ten windstorms, but virtually all of the windstorms were associated with some form of extreme or hazardous precipitation (Fig. 4). Consistent with observational evidence (Munsell and Zhang, 2014), Hurricane Sandy (the windstorm during 2012) is associated with total 24-hour precipitation accumulation in several ERA5 grid cells of up to 100 mm, and accumulations exceeded 20 mm in multiple ERA5 grid cells. Very heavy precipitation, both in terms of maximum precipitation intensity and total accumulated precipitation, is also associated with the 1993 windstorm, which resulted from a decaying TC that formed a Nor'easter (NE) (Fig. 4). The windstorms with the lowest precipitation totals occurred in 2003, 1979 and 1995 and are associated with Alberta Clippers (AC). Freezing rain, which in conjunction with high winds is a particular hazard to electrical infrastructure and transportation, is present during the windstorms in 1992, 1981 and 1993 (Fig. 4). Snow is also indicated in at least one location in the domain in every storm except 2012 (Hurricane Sandy).
Four of the top-10 windstorms (2003, 2007, 2012 and 2018) occurred after 2000, and thus high quality ASOS and RADAR data are available for comparison with estimates from ERA5 for these events. For the 2012, 2003 and 2018 windstorms there is good agreement between ERA5 and ASOS in both the spatial extent of locally extreme wind speeds and the duration of intense wind speeds (Fig. 5). The agreement is less good for the 2007 windstorm, possibly due to the low density of ASOS stations in the U.S. state of Maine, where the ERA5 output indicates the wind maximum was manifest for a substantial fraction of the storm period (Fig. 2). For the other three windstorms the fraction of ERA5 grid cells in the Northeastern states with U>U999 closely matches the fraction of ASOS stations in the same area that exceed their local U999 threshold during each hour of the storm period (Fig. 5). The timing of storm precipitation in the ERA5 data is also in good agreement with observational estimates from RADAR and ASOS stations, consistent with assimilation of RADAR precipitation and weather station data (Lopez, 2011; Hersbach et al., 2019). The period with most intense precipitation occurred concurrently with the high wind speeds during Hurricane Sandy, but largely well before tp in the 2007 and 2018 windstorms (Fig. 5), consistent with previous work characterizing extra-tropical cyclones (Bengtsson et al., 2009). Mean ERA5 precipitation rates in Northeast states during these ten storms are consistently somewhat higher than estimates from RADAR, but below ASOS point measurements, reflecting spatial variability in rainfall intensity at scales below those manifest in a network of point measurements (Villarini et al., 2008).

A larger sample of 50 windstorms was also drawn from the 40-year time series to examine the serial dependence.
These top-50 windstorms are relatively well described by a Poisson distribution in terms of counts per calendar year. The resulting dispersion value (D) is 0.18, indicating evidence for serial dependence or, alternatively stated, that these windstorms are clustered in fewer years than would be expected for independent events. Of 10,000 bootstrapped samples, 99.97% had dispersion indices above zero. While this D value is symptomatic of serial clustering for windstorms that impact the Northeastern USA, it is lower than those computed for regions of Europe in earlier research using the 20th century ERA reanalysis and a 98th percentile wind speed threshold (Walz et al., 2018). While the top ten windstorms considered in detail herein all have a spatial extent of between 309 and 524 grid cells, the 11th- through 50th-ranked storms in the set used to characterize seriality have a mean extent of 216 grid cells and range in extent from 176 to 309 cells, further indicating that the top ten storms are distinct in the 40-year time series (Fig. 1).
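The dispersion statistic is not restated in this section. The sketch below assumes the commonly used definition D = variance/mean - 1 applied to annual event counts (D near 0 for a Poisson process, D > 0 for clustering) and a bootstrap that resamples calendar years with replacement. Both choices are assumptions rather than a description of the procedure used here, and the annual counts are invented for illustration.

```python
import random


def dispersion_index(counts):
    """D = variance / mean - 1 for a list of annual event counts.

    Uses the sample variance (n - 1 denominator); this is an assumed convention.
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean - 1.0


def bootstrap_fraction_positive(counts, n_boot=10_000, seed=0):
    """Fraction of year-resampled bootstrap replicates with D > 0."""
    rng = random.Random(seed)
    n = len(counts)
    positive = 0
    used = 0
    for _ in range(n_boot):
        sample = rng.choices(counts, k=n)
        if sum(sample) == 0:  # D is undefined for an all-zero resample
            continue
        used += 1
        positive += dispersion_index(sample) > 0
    return positive / used


# Hypothetical record: 50 events spread over 40 years (not the study data).
annual_counts = [0] * 20 + [1] * 5 + [2] * 5 + [3] * 5 + [4] * 5
d_value = dispersion_index(annual_counts)
share_positive = bootstrap_fraction_positive(annual_counts, n_boot=2_000)
```

Resampling whole years keeps each year's count intact, so the bootstrap probes how sensitive D is to which years happen to be in the record.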

Cyclone detection and tracking
Consistent with past research employing other reanalysis data sets (Ulbrich et al., 2009), results from application of the cyclone detection and tracking algorithm to ERA5 output also indicate the U.S. Northeast exhibits a high frequency of transitory cyclones (Fig. 6). Also in accord with expectations, the tracks followed by the windstorms are generally characteristic of those dominant cyclone tracks, and derive from a mixture of intense nor'easters (NE), Alberta Clippers (AC), deep Colorado lows (CL), and decaying tropical cyclones (TC) (Table 3, Fig. 6).
Cyclone intensities for the 10 windstorms are an order of magnitude above the mean intensities for cold-weather cyclones at the same locations over the U.S. for both RV and MSLP (Fig. 7, Table 3). Windstorms with the highest intensities tend to pass over the ocean (the 2012, 1993 and 2018 storms). Both the 2012 and the 1993 windstorms are the result of decaying tropical cyclones, with the 1993 system transitioning to become a NE (Fig. 2 and 6, Table 3). The 2012 windstorm (Hurricane Sandy) exhibited extremely high intensity and is also associated with the largest area (number of grid cells) with U>U999. It was also associated with by far the largest amount of property damage and deaths (Fig. 2, Table 2). The 2018 windstorm is associated with a CL that stalled over the Atlantic coast and re-intensified to form a NE. Although this event was not the most geographically expansive, its track over very high-density population areas and high value assets led to high associated storm damage (Fig. 2). Five of the 10 storms are associated with Colorado Lows, consistent with the high prevalence of such cyclones (Booth et al., 2015) (Fig. 6). These storms generally impacted the smallest areas and tend to be associated with substantial but lower amounts of property damage than TC or AC (Table 2).
The Great Lakes are known to have a profound effect on passing cyclones during ice-free and generally unstable conditions that prevail during September to November (Angel and Isard, 1997). Particularly during the early part of the cold season, cyclones that cross the Great Lakes are frequently subject to acceleration and intensification via enhanced vertical heat flux and low-level moisture convergence due to the lake-land roughness contrast (Xiao et al., 2018). Cyclones that transit the Great Lakes during periods with substantial ice cover are subject to less alteration (Angel and Isard, 1997). The 2003, 1979 and 1995 storms are associated with Alberta Clippers (Table 3) that exhibit initially low intensities, but rapidly intensify as they pass across the Great Lakes region (~80°W and 45°N). Cyclone intensities for these three storms increased by an average of 16% for RV and 33% for MSLP during their crossing of the Great Lakes longitudes (92°W to 76°W). These windstorms occurred when Great Lakes ice cover was minimal (https://www.glerl.noaa.gov/data/ice/atlas/ice_duration/duration.html). Both the 2003 and 1979 storm events exhibit large spatial scales (Fig. 3) and resulted in substantial property damage (Table 2).

Tracking of windstorms is a key determinant of societal impacts. The 2012 and 2017 windstorms had high wind speed centroids that are closely aligned with the cyclone centers. They passed over highly populated areas including New York, and are associated with recorded damage in the hundreds of millions of dollars (Fig. 2, Table 2). The 1993 windstorm high wind speed centroid is out over the Atlantic Ocean, which may partly explain the lower loss of life and property damage associated with this event (Fig. 2).
The AC associated windstorms (2003, 1979, 1995) tracked west to east with maximum intensity centers across the north of the region and thus were also associated with lower damages over the US than the other windstorms. Cyclones associated with the windstorms in 1992, 1996 and 1981 tracked from the southeast to the northwest, but their centers diagnosed from MSLP remain east of the region, as do those from RV in 1992 and 1996. The geographic centroids of high wind speeds track through Virginia, Pennsylvania, and New York in all three years. Inflation-adjusted damage amounts are very different for these storms and range from $24 million for the 1981 windstorm to $2181 million for the 1996 windstorm (Fig. 2 and 7).

... legends represent the 10th, 50th and 90th percentile cyclone intensities from among the top 10 windstorms, and color coding of the cyclone tracks associated with each windstorm is as in Fig. 3 and Fig. 6.
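The percentage intensity changes quoted above for the Alberta Clippers crossing 92°W to 76°W can be pictured with a short Python sketch. It assumes the change is measured between the first and last track points that fall inside the longitude band, which may differ from the calculation actually used; the function name and the track values are hypothetical.

```python
def percent_change_across(track, lon_west=-92.0, lon_east=-76.0):
    """Percent change in intensity between entering and leaving a longitude band.

    track: list of (longitude in degrees east, intensity) tuples in time order.
    Intensity must be a positive quantity (e.g. RV, or MSLP scaled by -1).
    """
    inside = [inten for lon, inten in track if lon_west <= lon <= lon_east]
    if len(inside) < 2:
        raise ValueError("track has fewer than two points inside the band")
    return 100.0 * (inside[-1] - inside[0]) / inside[0]


# Hypothetical west-to-east track (illustrative values only).
toy_track = [(-100.0, 3.5), (-90.0, 4.0), (-84.0, 4.6), (-77.0, 5.0), (-70.0, 5.2)]
change = percent_change_across(toy_track)  # (5.0 - 4.0) / 4.0 -> 25.0 percent
```

Averaging this quantity over several storms, separately for each intensity metric, would give a summary of the same form as the one reported in the text.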

Windstorm Return Periods
All ten windstorms are associated with long return-period (RP > 50 years) wind speeds in at least some ERA5 grid cells, with return periods exceeding 100 years for the 2012 windstorm. Defining a single return period for each windstorm is difficult due to the multiple degrees of freedom, but the median (50th percentile) and highest 5 percent (95th percentile) of ERA5 grid cell estimates provide some qualitative assessment of probability. The median RP computed for all 924 grid cells ranges from 1 to 5 years across the ten windstorms (Table 3), while at least 5% of grid cells are characterized by wind speeds during each of the ten windstorms with RP of 6.5 to 106 years (Table 3, Fig. 8). The number of ERA5 grid cells that exhibit the annual maximum during the storm period is positively correlated with the three metrics of return periods: (i) median RP, (ii) 95th percentile RP and (iii) median RP for grid cells that exhibited U>U999 (r: 0.45 to 0.64), consistent with the longest-RP wind speeds being associated with the largest windstorms (Fig. 8, Table 3). For the two windstorms that entered the Northeastern [...] (Table 2), is the event with the largest number of ERA5 grid cells in excess of 50-year RP wind speeds in the Northeast domain. The Colorado Low associated windstorms (1996, 2007 and 1981) have their highest-RP winds in the mountainous regions of West Virginia, New York, Vermont, and Maine (WV, NY, VT, and ME).

Table 3. Windstorm details (windstorms are ordered as in Table 2). Cyclone type is based on subjective evaluation of results from the cyclone detection and tracking algorithm: AC = Alberta Clipper. TC = Tropical Cyclone. CL = Colorado Low. NE = Nor'easter. Max intensity is the maximum cyclone intensity along the storm-associated cyclone tracks for RV (×10⁻⁵ s⁻¹) and MSLP (scaled by -1, hPa). # cells with Umax indicates the number of grid cells for which the maximum wind speed for the storm year occurred within the storm period. Median RP is the 50th percentile return period for maximum wind speed in each Northeastern grid cell during each storm period, while p95 is the 95th percentile RP. Also shown is the median RP for grid cells that exhibited U>U999 at the storm peak.
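To make the return-period summaries above concrete, the following Python sketch estimates a return period from annual maximum wind speeds and then reduces per-cell return periods to a median and a 95th percentile. The extreme-value model is not restated in this section, so a Gumbel distribution fitted by the method of moments is assumed here purely for illustration; the function names and all values are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_return_period(annual_maxima, u):
    """Return period (years) of wind speed u from an assumed Gumbel fit.

    Parameters are estimated by the method of moments from annual maxima.
    """
    mean = statistics.fmean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    beta = std * math.sqrt(6.0) / math.pi  # scale parameter
    mu = mean - EULER_GAMMA * beta         # location parameter
    cdf = math.exp(-math.exp(-(u - mu) / beta))
    return 1.0 / (1.0 - cdf)


def summarize_return_periods(rp_by_cell):
    """Median and 95th percentile of per-cell return periods."""
    median = statistics.median(rp_by_cell)
    p95 = statistics.quantiles(rp_by_cell, n=20, method="inclusive")[-1]
    return median, p95


# Hypothetical annual maximum wind speeds (m/s) for one grid cell.
toy_maxima = [24.0, 26.5, 25.1, 27.8, 23.9, 26.0, 28.4, 25.5, 24.7, 27.0]
rp_25 = gumbel_return_period(toy_maxima, 25.0)
rp_30 = gumbel_return_period(toy_maxima, 30.0)

# Hypothetical per-cell return periods for one storm.
median_rp, p95_rp = summarize_return_periods([float(i) for i in range(1, 101)])
```

Reporting both the median and the 95th percentile, as in Table 3, separates the typical grid cell from the small set of cells where the storm produced its rarest wind speeds.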

Concluding Remarks
The U.S. Northeast exhibits high socio-economic exposure to atmospheric hazards due to the presence of major urban centers with high population density and high density of insured, high-value assets (Table 1, Fig. 1), and windstorms represent a substantial fraction of historically important climate hazards in this region. The Northeastern states are also experiencing population increases that are projected to continue into the future (Zoraghein and O'Neill, 2020). This increase in population may result in increased exposure to this hazard even in the absence of any change in windstorm frequency or intensity. Thus, there is great value in improved characterization of these events.
The ten largest windstorms in the Northeast U.S. during 1979-2018 covered 33 to 57% of ERA5 land cells in the Northeastern states with wind speeds exceeding the locally determined 99.9th percentile threshold (Table 2).
Although all ten events occurred during the cool season months of October through April, they are distributed throughout the forty years, and no individual year exhibits more than one of these events (Fig. 1b). However, when a larger pool of the 50 largest windstorms is considered, clear evidence of serial clustering emerges.
Return periods for wind speeds in the upper 5% of ERA5 grid cells during these 10 windstorms range from 6.5 to 106 years (Table 3, Fig. 8). Many of these windstorms exhibit co-occurrence of extreme and/or hazardous precipitation and thus may be considered composite events.

Any windstorm catalogue is, to some degree, a product of the dataset on which it is predicated, and the windstorms identified herein are derived using a methodology that preferences intense but large-scale events. Their characteristics will naturally differ from severe local storms. The windstorms identified independently and objectively in this work are consistent with historically notable events. Further, precipitation and wind speeds from ERA5 for windstorms that occurred after 2000 exhibit good agreement with in situ observations from the NWS ASOS network and NWS dual-polarization RADAR, consistent with assimilation of RADAR precipitation and weather station data streams by the ECMWF data assimilation protocols and past evaluations of the ERA5 reanalysis (Fig. 5). The accord between the geophysical data streams, the ERA5 windstorm intensity estimates and independent damage estimates provides further confidence in the fidelity of the windstorm catalogue presented herein.

The cyclone tracks associated with the ten windstorms are consistent with the climatology of cold-season cyclones, and thus the associated extra-tropical cyclones are a mixture of Alberta Clippers, Colorado Lows, decaying Tropical Cyclones and Nor'easters (Fig. 6). These cyclones, however, exhibit intensities (from both RV and MSLP perturbations) that are an order of magnitude higher than mean values sampled on those same tracks (Fig. 7). With the possible exception of Hurricane Sandy, these windstorms are largely differentiable from the cyclone climatology in terms of their intensification rather than the associated cyclone storm track.
It is also notable that the most intense AC events occurred during periods of low ice cover in the Great Lakes, which may imply that windstorms associated with AC events are likely to intensify under climate change as a result of reduced icing of these water bodies (Smith, 1991).
Inflation-adjusted (to January 2020) property damage totals for each of the windstorms range from $24 million to $29 billion (Table 2). While there is not perfect agreement in the ranking of these storms between high wind coverage and property damage, the top four storms in terms of extent do all have higher damage totals than the next six.
This windstorm catalogue is intended to characterize extreme windstorms in the Northeastern U.S. and may have value in efforts to evaluate and validate climate and natural hazard catastrophe models. Planned extension of the ERA5 reanalysis to 1950 may provide an opportunity to further extend this analysis to include elements related to non-stationarity in windstorm probability, with the caveat that such detection will be challenging due to changes in the assimilated data. Research is underway to dynamically downscale these windstorms using the Weather Research and Forecasting model to examine sub-grid scale variability in extreme wind speeds and the sensitivity of these events to global climate non-stationarity.

Data Availability
ERA5 reanalysis output is available from https://climate.copernicus.eu/climate-reanalysis. NWS RADAR data are available from the National Climatic Data Center: https://www.ncdc.noaa.gov/data-access/radar-data. NWS ASOS data are available from ftp://ftp.ncdc.noaa.gov/pub/data/asos-fivemin/. The NOAA Storms database is available at https://www.ncdc.noaa.gov/stormevents/. Historical estimates of Great Lakes ice cover are available from https://www.glerl.noaa.gov/data/ice/atlas/ice_duration/duration.html.

Author Contribution
All four authors participated in discussions about the goals and methods for this paper. SCP devised the analysis framework. FL had primary responsibility for performing the analyses. FL, SCP and RJB wrote the majority of the manuscript text. KH provided analysis tools, expertise, advice and context for cyclone tracking. RJB and SCP performed analyses on the societal impact of these windstorms. RJB and SCP acquired the funding and computing resources to make this research possible.