A hazard model of sub-freezing temperatures in the United Kingdom using vine copulas

Extreme cold weather events, such as the winter of 1962/63, the third coldest winter ever recorded in the Central England Temperature record, or more recently the winter of 2010/11, have significant consequences for the society and economy. This paper assesses the probability of such extreme cold weather across the United Kingdom (UK), as part of a probabilistic catastrophe model for insured losses caused by the bursting of pipes. A statistical model is developed in order to model the extremes of the Air Freezing Index (AFI), which is a common measure of the magnitude and duration of freezing temperatures. A novel approach in the modelling of the spatial dependence of the hazard has been followed which takes advantage of the vine copula methodology. The method allows complex dependencies to be modelled, especially between the tails of the AFI distributions, which is important to assess the extreme behaviour of such events. The influence of the North Atlantic Oscillation and of anthropogenic climate change on the frequency of UK cold winters has also been taken into account. According to the model, the occurrence of extreme cold events, such as the 1962/63 winter, has decreased approximately 2 times during the course of the 20th century as a result of anthropogenic climate change. Furthermore, the model predicts that such an event is expected to become more uncommon, about 2 times less frequent, by the year 2030. Extreme cold spells in the UK have been found to be heavily modulated by the North Atlantic Oscillation (NAO) as well. A cold event is estimated to be ≈ 3–4 times more likely to occur during its negative phase than its positive phase. However, considerable uncertainty exists in these results, owing mainly to the short record length and the large interannual variability of the AFI.


Introduction
Extended periods of extreme cold weather can cause severe disruptions in human societies; on human health, by exacerbating previous medical conditions or due to reduction of food supply which can lead to famine and disease; agriculture, by devastating crops particularly if the freeze occurs early or late in the growing season; on infrastructure, e.g. severe disruptions in the transport system, burst of residential or system water pipes (Bowman et al., 2012). All these consequences lead to important 20 economic losses.
Of particular interest for the insurance industry are the economical losses that originate as a result of bursting of pipes due to freeze events. Water pipes burst because the water inside them expands as it gets close to freezing which causes an increase in pressure inside the pipe. Whether a pipe will break or not, depends on the water temperature (and consequently on the air temperature), the freezing duration, the pipe diameter and composition, the wind chill effect (due to wind and air leakage on water pipes), and the presence of insulation (Gordon, 1996;McDonald et al., 2014).
Insurance losses from burst pipes have a significant impact on the UK insurance industry. They amount to more than £900 million in the last 10 years, representing around 10% of the total insured losses, mainly due to flood and windstorm, in the United Kingdom (UK) during the same period (ABI, 2017). Particular years can be very damaging, such as, for example, the 5 winter of 2010/2011 where losses from burst pipes have exceeded £300 million in UK making it the peril with the largest losses that year (ABI, 2017). Moreover, much more extreme cold winters have actually occurred in the UK in the last 100 years, such as the winters of 1946/47 and 1962/63. It is crucial for the insurance business to be able to anticipate the likelihood of occurrence of similar and even more extreme events so that they can adequately prepare for their financial impact (AIR, 2012). In fact, the capital requirements in (re)insurance is estimated in a 1 in 200 year return period (RP) loss basis, which is 10 usually much larger than the available historical records.
Probabilistic catastrophe modelling is generally agreed to be the most appropriate method to analyze such problems. The main goal of catastrophe models is to estimate the full spectrum of probability of loss for a specific insurance portfolio (i.e. comprised by several residential, auto, commercial or industrial risks). This requires the ability to extrapolate the possible losses at each risk to high return periods (RP) which is usually achieved by simulating synthetic events that are likely to happen 15 in the near future (typically a year). More importantly, it requires to consider also how all risks relate to each other and their potential synergy to create catastrophic losses. Such spatial dependence between risks can result from various sources, for example due to the spatial structure of the hazard (e.g. the footprint in a windstorm or the catchment area in a flood event) or due to similar building vulnerabilities between risks in the same geographical area (e.g. due to common building practices) (Bonazzi et al., 2012). 20 Modelling the spatial dependence of the hazard is usually achieved by taking advantage of certain characteristic properties of the hazard footprint, like for example the track path and the radius of maximum wind for windstorms or the elevation in the case of floods. In the case of temperature, however, such a property cannot be easily defined; an alternative solution is to use multivariate copula models. Based on Sklar's theorem (Sklar, 1959), the joint distribution of all risk sources can be fully specified by the separate marginal distributions of the variables and by their copula, which defines the dependence structure 25 between the variables. However, one important difficulty is the limited choice of adequate copulas for more than two dimensions. For example, standard multivariate copula models such as the elliptical and Archimedean copulas do not allow for different dependency models between pairs of variables. Vine copulas provide a flexible solution to this problem based on a pairwise decomposition of a multivariate model into bivariate copulas. This approach is very flexible, as the bivariate copulas can be selected 30 independently for each pair, from a wide range of parametric families, which enables modelling of a wide range of complex dependencies (Czado, 2010;Dißmann et al., 2013).
In this paper, the vine copula methodology is used in a novel application to develop a catastrophe model on insurance losses due to pipe bursts resulting from freeze events in the United Kingdom. The focus here is on the hazard component (Sect. 2) which is modeled using the Air Freezing Index (AFI), an index which takes account both the magnitude and duration 35 of air temperature below freezing, calculated from reanalysis data from the last 110 years. The statistical models utilized to extrapolate to longer return periods are described in Sect. 3. The model also accounts for two major drivers of climate variability in UK that are incorporated as predictors: the North Atlantic Oscillation (NAO), a leading pattern of weather and climate variability over the entire Northern Hemisphere, which accounts for more than half of the year-to-year variability in winter surface temperature over UK.

5
-Anthropogenic climate change and its direct effects in the temperature profile in the UK.
Stochastic winter-seasons are simulated taking into account the correlation of the hazard between all pair-cells with the help of regular vine copulas (Sect. 3.3). The resulting return periods of extreme cold winters in UK, including the underlying uncertainties, are discussed in Sect. 4. Concluding remarks are found in Sect. 5. The hazard component of the catastrophe model is based on the European Centre for Medium-Range Weather Forecasts (ECMWF) twentieth century reanalysis (ERA-20C) covering the entire twentieth century from 1900 to 2010 (Poli et al., 2016).
Reanalyses are data-assimilating weather models which are widely used as proxies for the true state of the atmosphere in the recent past. Even though centennial reanalyses, such as ERA-20C, represent the most convenient data sets for assessing the 15 long-term historical climate, biases and uncertainties inherent in both raw observations and models mean that they should be used with caution.
For consistency throughout the period, the observational input in ERA-20C is limited to surface pressure and surface marine winds only, which may however lead to some reduction in accuracy (Dell'Aquila et al., 2016). For example, important differences in the 2-meter temperature have been found between ERA-20C and other centennial reanalysis data sets, especially 20 during the first half of the twentieth century as a result of the spare observational network in those early years (Poli et al., 2016;Donat et al., 2016). Furthermore, studies have suggested that long-term changes in the Arctic Oscillation, mean sea level pressure, and wintertime storminess seen in ERA-20C, may be spurious as a result of the assimilation of increasing numbers of observations (Dell'Aquila et al., 2016;Poli et al., 2016;Bloomfield et al., 2018).
ERA-20C product provides daily 3-hour forecast (i.e. eight forecast steps starting at 06:00UTC each day) of minimum 25 and maximum temperature at 2 meters. These are used to compute daily minimum and maximum values at every grid cell for the entire period. The daily average temperatures are then computed as 0.5(T max -T min ) and the data re-gridded to a 1 • x1 • resolution, which corresponds to a total of 67 cells over land.
The coarse horizontal resolution is expected to have relatively small influence in most cases given that winter climate anomalies are often coherent across large parts of the UK as they are primarily associated with large-scale atmospheric circulation 30 patterns (Scaife and Knight, 2008). Nevertheless, local temperature may be subtly different in certain micro-climates, such as upland and urban regions. In particular over urban regions, which are most important from an insurance perspective, lower resolution may lead to temperatures that are biased towards lower values, leading though to a conservative view on the severity of extreme freeze events. In upland regions, on the other hand, extreme cold temperatures are most probably underestimated, although it is reasonable to expect that their damaging effects are somewhat mitigated from increased protection levels. For example, water pipes in properties located in mountainous regions are usually better protected against cold spells.

5
For comparison purposes, the observed daily average temperature gridded data set developed from the UK Met Office is also used (Perry et al., 2009). This data set is based on temperature data retrieved from 540 stations across UK with an average station density of 21 x 21 km 2 (Perry and Hollis, 2005;Perry et al., 2009). It covers the entire UK, but for a much shorter period of 51 years . The original 5km x 5km resolution is re-gridded using bi-linear interpolation to 1 • x1 • in order to match the ERA-20C grid.

North Atlantic Oscillation Index
The NAO refers to a redistribution of atmospheric mass between the Arctic and the subtropical Atlantic, and swings from one phase to another producing large changes in weather, and in particular in surface air temperature, over the Atlantic and the adjacent continents (Hurrell et al., 2013). It is described by the NAO index (NAOI), a measure of the mean atmospheric pressure gradient between the Azores High and the Iceland Low. A positive NAOI is associated with depression systems taking 15 a more northerly route across the Atlantic, which causes UK weather to be milder, while a negative NAOI is associated with depression systems taking a more southerly route, as a result of which UK weather tends to be colder and drier (Osborn, 2000).
In this study, the winter (December thru March) station-based index of the NAO from Hurrell (2003) is used, which is based on the difference of normalized sea level pressure between Lisbon, Portugal and Stykkisholmur/Reykjavik, Iceland ( Figure 1b). 20 Increases in concentration of greenhouse gases, such as carbon dioxide (CO 2 ), are accompanied by increased radiative forcing, i.e. the difference between the incoming radiation from the sun and the outgoing radiation emitted from the Earth. This forcing arises from the ability of the gases to absorb long wave radiation, thus reducing the amount of heat energy being lost to space, and increasing the warming of the earth's surface. Here we use the change in radiative forcing from CO 2 as a predictor for climate change. It is calculated using the simplified expression (Myhre et al., 1998):

Anthropogenic forcing
where ∆F CO2 is the radiative forcing change (in W m −2 ), C i is the concentration of atmospheric CO 2 at year i, and C 1900 is the reference 'pre-industrial' CO 2 concentration at year 1900. Consequently, a doubling of CO 2 corresponds to a change in the radiative forcing of 3.7 W m −2 . Historical observations of global mean CO 2 concentrations (in parts per million or ppm) are based on Hansen et al. (2007). The temporal increase in the CO 2 radiative forcing during the 20th century is shown in Figure   30 1c.  1916 1924 1932 1940 1948 1956 1964 1972 1980 1988 1996 2004 Figure 1. Interannual variation of (a) average AFI over UK (mAFI), (b) the North Atlantic Oscillation Index (NAOI), and (c) CO2 forcing during the study period.

Air-Freezing Index and historical events
The daily temperature data are used to compute the AFI at each grid cell, as the sum of the absolute average daily temperatures of all days with below 0 • C temperatures during the freezing period (Eq. (3)). The freezing period in this study is defined from first of June of year y to end of May of the following year y + 1, in order to include the entire winter season. Because AFI 5 accounts both for the magnitude and duration of the freezing period, it is commonly used for determining the freezing severity of the winter season (Frauenfeld et al., 2007;Bilotta et al., 2015).  (Booth, 2007). The mean AFI value (mAFI) in the entire UK (i.e. average of AFI values across all gridcells) mounted up to 75.6 • C, the second largest value during the period. lower than the long-term average (13 • C) and over ten times lower than mAFI value according to the UKMO dataset (59.1 • C). No clear reason is known for this bias, but it might be related to possible spurious long-term trends in the atmospheric circulation (Bloomfield et al., 2018).
As shown in Figure 1a, the two most severe winters in the century (1946/47 and 1962/63) were associated with a negative 20 NAO phase (Murray, 1966;Osborn, 2011). As mentioned earlier, the NAO has a profound effect on winter climate variability around the Atlantic basin, accounting more than half of the year-to-year variability in winter surface temperature over UK (Scaife et al., 2005;Scaife and Knight, 2008). Not surprising, the ERA-20C mAFI over the entire UK is found to be signif-icantly anti-correlated (ρ = -0.49, pval=6.510 −8 ) with NAOI. A negative correlation is found between mAFI and ∆F CO2 forcing, but it is much less significant (ρ = -0.17, pval= 0.08). Both NAO and climate change effects are included in the statistical model as predictors in order to account for their relation to cold winter spells in UK as discussed in the following section.
3.2 Extreme value analysis 5

Stationary model
Since the historical data only extends for 110 years and our interest lies in very rare events (such as 1 in 200 years), it is necessary to extrapolate by fitting an extreme value distribution. The Generalized Extreme Value (GEV) family of distributions has been chosen, which includes the Gumbel, the Fréchet, and Weibull distributions. An additional term was included, the probability of no hazard (P0), in order to account for the cells, mainly on the south England coast, that have years with no 10 negative temperatures at all. The probability therefore that the AFI value (X) inside a cell j is lower or equal than a certain value (x) takes the form: where µ, σ, and ξ represent the location, scale, and shape parameters of the distribution, respectively. F (x) is defined when There are various methods of parameter estimation for fitting the GEV distribution, such as least squares estimation, maximum likelihood estimation (MLE), probability weighted moments, and others. Traditional parameter estimation techniques give equal weight to every observation in the data set. However, the focus in catastrophe modeling is mainly on the extreme outcomes and, thus, it is preferable to give more weight to the long return periods. The Tail-Weighted Maximum Likelihood 20 Estimation (TWMLE) method developed by Kemp (2016) is employed here in order to estimate the GEV parameters. This method introduces ranking depended weights (w (r) ) in the maximum likelihood. The weights are defined for each cell based on the historical winter-season AFI values, i.e. the lowest historical AFI value in the cell (rank r=1 out of n observations) has the lowest weight, while the largest historical AFI value (rank r=n) has the largest weight, as follows:

25
Along with the TWMLE method described above, a second modification has been implemented in order to geographically smooth the GEV parameters. The smoothing is incorporated into the fitting process by minimizing the local (ranked) loglikelihood. More precisely, the log-likelihood at each grid cell i is calculated using all grid points but weighted by their distance: 2L 2 , d ij is the distance between cell i and j, L is the smoothing parameter, LogL j is the ranked loglikelihood for cell j.
The smoothing increases the sample size at each grid point, which thus leads to a more precise estimation of the parameters, especially for the shape parameter which is highly influential in estimating the hazard levels and high return periods. Because the data grid resolution is already coarse, a small length scale parameter L of 20 km has been used (in comparison to the grid 10 size).
Finally, in order to avoid an over-estimate of the positive value of the shape parameter due to the small sample size (Lee et al., 2017), a modification of the maximum likelihood estimator using a penalty function is also applied for fitting the GEV. The penalty function penalizes estimates of ξ that are close to, or greater than 1, following Coles and Dixon (1999).
Estimates of P0 for each grid cell are obtained by fitting a logistic regression model with intercept only (Eq. (7)). As before, 15 the fitting is performed against all grid cells, weighted by their distance d ij , and a length scale of 20 Km has been used. The model is extended in the non-stationary model to include covariates as described in Sect. 3.2.2.
As an example, the GEV fit for a single cell over London is shown in Fig. 3. The curve fitted as described above (black line) is closer to the empirical estimates (black circles, computed as described in Sect. 4.1) in comparison with the GEV fit with no 20 weighting applied (grey line). As shown in table 1, for both fits the shape parameter is positive (i.e. both fits correspond to the Fréchet distribution), but for the approach followed here (TWMLE + geographical smoothing), the shape parameter is smaller leading to a shorter tail and a curve that is nearer to the empirical estimate.
Maps of the fitted parameters are shown in Fig. 4. The probability of non-negative temperatures during a season (P0) is, as expected, larger around the coast which has milder and less variable climate due to the water influence. This also explains the 25 lower mean (location parameter) and larger spread (scale parameter) in the AFI distributions around the coast in comparison to inland. The shape parameter, which affects the skew of the distribution, shows larger values in the southern part of the UK in comparison to the north, suggesting a less rapid increase in the maximum AFI estimates.

Non Stationary model
In stationary models, the distribution parameter space is assumed to be constant for the period under consideration. However, such assumption is not valid in the presence of atmospheric circulation patterns or anthropogenic changes. In this study, a ln where (b 0 , µ 0 ) are the stationary model parameter estimates and (b 1 , µ 1 ), (b 2 , µ 2 ) are linear transformations of the covariates NAOI and ∆F CO2 with respect to time, respectively.
In this study, only non-stationarity with respect to P0 and the location parameter, µ, is discussed, since modeling temporal 5 changes in σ and ξ reliably requires long-term observations in order to be estimated accurately (Katz et al., 2002;Cheng et al., 2014). A simple linear model is selected as this is usually preferred when searching for trends in the occurrence of extreme events (Beguería et al., 2011). Finally, even though some climate modeling studies predict changes in the nature of NAO variability in an increasing CO 2 climate (Rind et al., 2005;Woollings et al., 2010), the model does not include any interactionterms, as they have been found to be non-significant.
As before, the parameters of each cell are estimated taking also into account its neighboring cells weighted by their distance.
The most pertinent model is selected, for each cell, using the χ 2 test, based on the change in deviance, between the null, one or two predictor model. If the significance value is less than 0.01, the model is estimated to have a significant improvement over 5 the reduced model. A separate test is performed for the P0 and the GEV model. As an example, in the case of the London cell, the model with two predictors for both P0 and the location parameter has been chosen (table 1).
The spatial distribution of the parameters of the final model is shown in Fig. 5. Increasing NAOI or ∆F CO2 are consistent with a warming trend, leading to positive values of the P0 parameters (indicating increases in the number of years with no negative temperatures) and to negative values in the location parameters (indicating lower means in the AFI distributions). The

10
NAO is found to affect more cells in total (90%) in comparison to anthropogenic climate change (51%). Notice however that due to the internal variability of the NAO, any signal from a climate change trend can be hidden in the limited observational period.

Copulas and vine copulas
The stochastic behaviour of the hazard (i.e. AFI) at each cell is fully described by its corresponding GEV probability distribu-15 tion, as described in Sect. 3.2. However, insurance portfolio loss analysis requires the calculation of the combined stochastic behaviour of the hazard across all the model domain (i.e. all cells). This is described by the joint distribution of the hazard which, according to Sklar's theorem, can be fully specified by the separate marginal GEV distributions and by their (d-dimensional) copula, which models the hazard dependence between the cells.
The probability density function (pdf) of X, f (x 1 , ..., x d ), can be found by taking the partial derivatives with respect to X: where c(u 1 , ..., u d ) is the copula density, given by Expression 9 is important in terms of modelling because it permits to define a multivariate density as the product of marginal pdfs and a copula density function that captures the dependence between the random variables (Abbara and Zevallos, 2014). To quantify the dependence between variables, different measures have been defined, addressing different aspects of dependence. A common measure of overall dependence is the Kendall rank correlation coefficient, commonly referred to as Kendall's 5 τ coefficient (Genest and Favre, 2007). However, dependence of rare events cannot be measured by overall correlations: even if two variables are completely uncorrelated, there can be a significant probability of a concurrent extreme event in the two, i.e., they can still be tail dependent. Tail dependence describes the amount of dependence in the lower tail or upper tail of a bivariate distribution. For its mathematical definition see Haff et al. (2015). One important complication is that identifying the appropriate d-dimensional copula is not an easy task. In high dimensions, the choice of adequate families is rather limited (Brechmann and Schepsmeier, 2013). Standard multivariate copulas, either do not allow for tail dependence (i.e. multivariate Gaussian) or have only a single parameter to control tail dependence of all pairs of variables (Student-t and Archimedean multivariate copulas). This is particularly problematic for catastrophe modeling applications, where a flexible modeling of tails is vital to assess reliably the extreme behaviour of natural events.

5
Vine copulas provide a flexible solution to this problem based on a pairwise decomposition of a multivariate model into bivariate (conditional and unconditional) copulas, where each pair-copula can be chosen independently from the others. In particular, asymmetries and tail dependence can be taken into account as well as (conditional) independence to build more parsimonious models. Vines thus combine the advantages of multivariate copula modeling, that is separation of marginal and dependence modeling, and the flexibility of bivariate copulas (Brechmann and Schepsmeier, 2013).

10
As an example, in a 4-dimensional case, the joint pdf can be decomposed as a product of 6 pair-copulas (3 uncoditional and 3 conditional) and 4 marginal pdfs as shown in Eq. (11): The above decomposition is not unique and Bedford and Cooke (2002) introduced a graphical structure called regular vine (R-Vine) structure to represent this decomposition with a set of nested trees. The dependence structure with three trees for 15 the 4-dimensional example above is shown in Figure 6. More details on vine copulas can be found in Aas et al. (2006); D. and Schirmacher (2008); Czado (2010); Schepsmeier (2013).

In this study, the joint multivariate hazard distribution of AFI across all the model domain (67 cells) is decomposed as a product of marginal and pair-copula pdfs (in a similar way as shown for the 4-d case above). F(x) and f(x) represent the marginal
GEV distributions here as defined by Eq. (3) and (4). The pair-copulas are fitted using the R (https://www.r-project.org/) package VineCopula (Schepsmeier et al., 2017;Brechmann and Schepsmeier, 2013). The method follows an automatic strategy of 5 jointly searching for an appropriate R-Vine tree structure, its pair-copula families, and estimating their parameters developed by Dißmann et al. (2013). This algorithm selects the tree structure by maximizing the empirical Kendall's τ values, based on the premise that variable pairs with high dependence should contribute significantly to the model fit and should be included in the first trees.
The copula family types for each selected pair in the first tree are determined by using the Akaike information criterion 10 (Brechmann and Schepsmeier, 2013). For computational reasons, the two-parameter Archimedean copulas are excluded from this analysis, which however has only a negligible impact in the results (not shown). The copula parameters are estimated sequentially (using maximum likelihood estimation) starting from the top tree until the last tree, as described in Czado et al. (2013). This approach only involves estimation of bivariate copulas and has been chosen since it is computationally much less demanding than joint maximum likelihood estimation of all parameters at once.

15
The percentage of family types used for the first few trees of the selected RVM is shown in Table 2. The large majority of the pairs in all trees are estimated to be independent (59%), but these pairs occur mainly at the higher trees, since the most important dependencies are captured in the first trees (Brechmann and Schepsmeier, 2013;Dißmann et al., 2013). Large dependencies, with Kendall's tau coefficients greater than 0.90, are found as expected between neighboring cells, but remain important across the whole model domain due to the nature of the hazard: AFI assess the freezing temperatures during the 20 entire winter and, thus, is less associated with small scale local phenomena that can cause important spatial variation. At the first tree, 52% of the selected bivariate copulas are found to belong to the t-Student Copula and 35% to the Gumbel family, which exhibit positive dependence in the tails. Gumbel in particular has a greater dependence in the positive tail than in the negative and thus implies greater dependence at larger AFI values than at lower ones. From the third tree and onwards, the percentage of independent families is always larger than 40%. 25 The small sample size used (110 years of data) in conjunction with the high dimensions of the modelled pdf (67) is of concern in this study since this can lead to large uncertainties in the resulting pdf, which can also propagate in the estimated return periods. The impact of the short sample size on the uncertainties in the results is quantified using a bootstrap technique, as described in Sect. 3.3.2.
Goodness-of-fit (GOF) is calculated using the Cramer von Mises test, which compares the final selected RVM with the 30 empirical copula. The RVineGofTest algorithm of the same R package implements different methods to compute the test, which however perform usually poorly in cases of small sample sizes and at higher dimensions as is the case for this work (Schepsmeier, 2013). Nevertheless, table 3 shows the GOF results for two of these methods. The p.value is found to be larger than 0.05, which is an indication that the fitted RVM cannot be rejected at a 5% significance level. However, given also the quite

Stochastic simulation and uncertainty estimation via parametric bootstrap
The RVM is used to simulate 100K years of winter-seasons in the UK. For each year, the simulated AFI values at each grid cell depend on the other cells based on the fitted RVM. Performing long enough simulations is necessary in order to obtain 5 converged numerically results, i.e. to convergence to the "true" return period. Our focus here is the 200 year RP, which is commonly associated with capital and regulatory requirements. By repeating the simulation several times, it has been assessed that 100K years of winter seasons is long enough and the Monte Carlo simulation error is negligible.
The stationary model is used to generate a stochastic set which corresponds to the current hazard experience. The nonstationary model permits us to create additional stochastic sets that represent different climate conditions. In the case of NAO 10 and following the Shapiro-Wilk test for normality, a 100-year long NAOI has been simulated assuming a Gaussian distribution (see Figure 7). In order to assess the influence of climate change in UK cold spells, three separate stochastic sets, of 100K years each, have been created as follows: -Pre-industrial climate (∆F CO2 =0 W m −2 ), corresponding to pre-industrial (1900) concentration of CO 2 (296 ppm).

15
-Future double-CO 2 climate (∆F CO2 = 3.7 W m −2 ), corresponding to 2 x CO 2 concentration since pre-industrical times (592 ppm). The small sample size used in this study (110 years of data) together with the high dimensions of the modelled pdf (67) can lead to large uncertainties in the estimated return periods. Following Bevacqua et al. (2017a), the model uncertainty is assessed using a parametric bootstrap approach, where a large number of models are created using as basis, instead of observations, randomly simulated data from the selected RVM. In particular, confidence intervals are constructed as follows:

NAOI
-A simulation with the same length as the observed data (i.e. 110 years) is repeated for B = 500 times.

5
-For each of these B = 500 samples, a new full model is fitted (including new GEV and logistic regression model parameters at each cell and new RVM structure, pair-copula families and parameters) following the methodology described in Sect. 3.2.1 and 3.3.1.
-For each of the resulting B = 500 RVMs, a simulation of 10K years of winter-seasons is performed. The uniform variables are then transformed using the (new) inverse marginal pdfs and the corresponding return period levels are estimated.

10
-The uncertainty in the return levels is estimated by identifying the 95% confidence interval (i.e. the range 2. 5-97.5 %) from these 500 return level curves.
Due to computational constraints, confidence intervals are computed only for the stationary model and the simulation length has been reduced to 10K years (instead of 100K). In order to separate the uncertainty associated with the RVM only from the uncertainty of the full model, i.e. of the joint pdf, confidence intervals have been also calculated with the same approach as described above, but using the same marginal pdfs in each bootstrap repetition.

Return period maps
The obtained stochastic sets (see Sect. 3.2) are used to create return period maps for the different climatic conditions. The top 5 panels of Fig. 8 represent the AFI values that occur once every 10, 25, and 50 years based on the stationary model. The empirical return periods are also plotted for comparison (bottom panels). These are calculated for each cell as 1/(1-P), where P represents the cumulative probabilities of the ranked values and is calculated based on the Weibull formula P=i/(n+1) (Makkonen, 2006).
The spatial pattern is consistent between the empirical and stochastic sets, showing largest AFI values in high elevation areas, as expected. However, the empirical values are in general somewhat larger than the stochastic set. This difference is driven by 10 the exceptional 1962/63 event which is estimated empirically at 1 in 110 years but is predicted to be less frequent according to the GEV fits. The probability of such an event happening today is discussed in detail in Sect. 4.2.3.
Return period maps at higher return periods (100, 200, and 500 years) for the pre-industrial, current, and 2xCO 2 climate stochastic sets are shown in Fig. 5. UK in the beginning of the 20th century has been experiencing much colder winters than today. In a 2xCO 2 climate change scenario, negative temperatures become very rare (larger than 1 in 500 years) in large part of 15 the UK except at mountainous regions. It is also interesting to note that at high return periods and across all scenarios, larger AFI values are predicted for the southern part of UK in comparison to the north. The extreme AFI values in the south are driven by the exceptional 1962/63 winter which has been more severe in the South than the North (see Fig. 2b). Excluding this winter from the analysis results in almost much lower AFI values in most of the region (not shown).
4.2 Regional return period AFI curves 20 The vine copula methodology permits the estimation of the hazard return periods over aggregated regions in the UK. Since our focus is mainly on inhabited areas, for each simulation year (y) and for each region, the "weighted AFI" (wAFI) is computed, where the AFI value at each cell j is weighted by the corresponding number of residential properties (n j ), as shown in Eq. (12). The weighted AFI thus places more weight on the hazard over large populated urban areas than agricultural or mountainous areas. The number of residential properties in the UK is taken from the PERILS Industry Exposure Database 25 (https://www.perils.org/), which contains up-to-date high quality insurance market data at Cresta level ("Catastrophe Risk Evaluation and Standardizing Target Accumulations", https://www.cresta.org/) based on data directly collected from insurance companies writing property business in the UK. Return period wAFI curves for both the empirical and the stochastic data are shown in Fig. 10. Analogous return period plot based on mAFI, i.e. without weighting, can be found in the Appendix (Fig.   A1).
wAF I y = ∑ AF I j,y · n j ∑ n j (12)

Model uncertainty
The stationary model is utilized to analyze the uncertainty in the model results and investigate its sources. Fig. 10 shows the 5 empirical and the stochastic return period curves of wAFI for the entire UK, together with their associated uncertainties. The empirical return periods calculation is described in Sect. 4.1, while their uncertainty intervals are computed from the 2.5 th and 97.5 th quantile of the beta probability distribution function (Folland and Anderson, 2002). The stochastic curve and confidence intervals are computed as described in Sect. 3.3.2. The uncertainty in the model is found to be large, only marginally lower, then the empirical estimates and is associated to the short historical record length. Most of the uncertainty (around 90% for RPs 10 greater than 50 years) originates from the uncertainty in the GEV distribution parameters, with the remaining 10% to be due to the RVM model (dark shaded area in Fig. 10). Extreme-value theory is considered as a state-of-the-art procedure to find values for return periods that amply exceed the record length and has been utilized in this study. However, a common difficulty with extremes is that, by definition, data is rare and as a result, the shorter the record length, the more inaccurate is the estimation of the GEV parameters. The results presented in the following sections should therefore be interpreted being aware of the existing 15 uncertainties.

The 1962/63 winter return period and climate change influence
The 1962/63 winter has been the coldest in the reanalysis data in the UK and, thus, it is estimated empirically as a 1 in 110 years event (i.e. the length of the data set). This corresponds well with the Central England Temperature (CET) record, the oldest continuously running temperature data set in the world (Manley, 1974 in the UK are now half as likely as they were in the 1960s. Christidis and Stott (2012) also indicate that human influence has reduced the probability of such a severe winter in UK by at least 20% and possibly by as much as 4 times, with a best estimate that the probability has been halved. On the other hand, some recent studies have argued that warming in the Arctic could favor the occurrence of cold winter extremes, and might have been also responsible for the unusually cold winters in the UK 30 of 2009/10 and 2010/11 (Francis and Vavrus, 2012;Tang et al., 2013). This hypothesis though is still largely under debate, see for example Barnes and Screen (2015) and Wallace et al. (2014).

°C)
Stochastic CI (RVM only) Figure 10. Return period curves of wAFI (in • C) based on the historical data (grey) and the stochastic model (black). The 95% confidence intervals are shown as dashed grey lines for the historical data and as a shaded grey area for the stochastic model. The dark shaded area represents the stochastic uncertainty due to the RVM model alone.
Under a 2xCO 2 climate, a 1962/63 is predicted to become almost 10 times more infrequent, having a return period of around 1 in 4000 years. Fig. 11 shows an important reduction in the probability of occurrence of cold extreme events across the whole distribution as a result of the increase of anthropogenic CO 2 concentrations. Larger reductions are found for the most extreme events as well, which is probably related to the large increase of the probability of no negative temperatures (P0) for several cells especially around the coast (see Fig. 9). Similar results are found in both the northern and the southern part of UK, as well 5 (Fig. 11b).

NAO influence
The profound effect of NAO on the winter surface temperature over UK has been reported by several studies (Scaife et al., 2005;Scaife and Knight, 2008;Osborn, 2011). In conjunction with those, the model predicts a negative (positive) NAO phase increases (decreases) substantially the probability of a cold event in the UK (Fig. 12). In fact, on average, extreme cold winters 10 are estimated to occur approximately 3 to 4 times more likely during the negative than the positive phase. As an example, an event with wAFI of 100 • C has a return period of 1 in 39 years, assuming a negative phase, and 1 in 133 years, assuming a  positive phase. Because of its intrinsic chaotic behaviour, NAO is difficult (if even possible) to be predicted (Kushnir et al., 2006). Nevertheless, numerical seasonal forecast systems are currently rapidly improving and have even shown some success

°C)
Current Climate NAOI > 1 NAOI < -1 Figure 12. Return period curves of wAFI (in • C) based on the current climate stochastic model and assuming a variable NAOI as described in the text (black line). Return period curves based on negative (lower than -1) and positive (larger than 1) values of NAOI are shown with green and orange lines, respectively.
in the past (Graham et al., 2006;Folland et al., 2006). Incorporating such information in models could be very useful from the catastrophe risk management perspective.
It is important to note that the effect of NAO in the hazard dependency structure has not been taken into account here.
Recently, a methodology that offers the possibility to include such meteorological predictors in a vine copula model has been developed by Bevacqua et al. (2017a); Bevacqua (2017b) and is something to be addressed in a future study. Finally, another 5 point that requires further consideration is the mechanisms that control and affect the NAO and its temporal evolution and in particular how the NAO responds to external CO 2 forcing (Hurrell, 2015, e.g.).

Conclusions
This paper presents a probabilistic model of extreme cold winters in the United Kingdom. The hazard is modeled using the Air Freezing Index, an index which takes account both the magnitude and the duration of air temperature below freezing and been applied in order to estimate the probability of extreme cold winters spatially across the UK. More importantly, the spatial dependence between regions in the UK has been assessed through a novel approach which takes advantage of the vine copula methodology. This approach allows the modeling of concurrent high AFI values across the country which is necessary in order to assess reliably the extreme behaviour of such events.
Recognizing the non-stationary nature of climate extremes, the model also incorporates NAO and climate change effects 5 as predictors. Stochastic sets of 100K years representing different climate conditions (i.e. pre-industrial, current, or future climate and positive or negative NAO) have been generated and the return periods of extreme cold winters in UK, such as the "Big Freeze of 1962/63" have been estimated. According to the model, such a cold winter is estimated to occur once every approximately 400 years under current climate conditions in the UK. The occurrence of such an event is calculated to have increased approximately two times during the course of the 20th century as a result of anthropogenic climate change. Moreover, 10 the model predicts that such an event will become quite uncommon and occur even more rarely, about 10 times less frequently, under 2xCO 2 climate conditions. The frequency of extreme cold spells in UK has been found to be heavily modulated by NAO, as well. A cold event is estimated to occur ≈3-4 times more likely during the negative than the positive phase.
However, considerable uncertainty exists in these estimates which should be interpreted with caution. The 110-year reanalysis record used in this study has been estimated to be short in order to estimate with enough confidence the frequencies 15 of such extreme events. Additional uncertainty may also be introduced by possible spurious trends in the reanalysis data set. A longer record of temperature data would be necessary in order to reduce the uncertainty and high quality long-term reanalysis products including ensemble approaches could help towards this direction.