Spatial variability in the relation between fire weather and burned area: patterns and drivers in Portugal

Fire weather indices are used to assess the effect of weather conditions on wildfire behaviour, and the high Daily Severity Rating percentile (DSRp) is strongly related to the total burned area (BA) in Portugal. The aims of this study were to: 1) assess whether the 90th DSRp (DSR90p) threshold is adequate for Portugal; 2) identify and characterize regional variations of the DSRp threshold that accounts for the bulk of BA; and 3) analyse whether vegetation cover can explain the DSRp spatial variability. We used wildfire data, weather reanalysis data from ERA5 for the 2001-2019 period, and the land use map for Portugal. DSRp were computed for an extended summer period and combined with individual large wildfires. Cluster analysis was performed using the relationship between DSRp and BA in each municipality. Results revealed that the DSR90p is an adequate threshold for Portugal and is well related to large BA. However, at the municipality scale, differences appear in the DSRp linked to the majority of accumulated BA. Cluster analysis revealed that municipalities where large wildfires occur at high DSRp present higher BA in forests and are located in coastal areas. In contrast, clusters with lower DSRp present greater BA in shrublands and are situated in eastern regions. These findings can support better prevention and fire suppression planning.

the correspondent 80% and 90% of FTBA as sufficient to classify DSRp as the extreme threshold, justified by the results of Pereira et al. (2005), which showed that 80% of BA occurs in 10% of summer days. We selected 175 municipalities (out of 278) affected by more than three individual wildfires and a total BA>500 ha in the study period (2001-2019). Restricting the analysis to the administrative units with sufficient data aims to increase the robustness of the results and to prevent possible interpretation errors. Figures assessing the relation between DSRp and FTBA were produced for all the selected municipalities, in order to answer the second research question. In each municipality, the selection of the maximum spatial value of DSR to associate with fires is justified by the low spatial variability of the DSR, the small size of the administrative units and the native resolution of the reanalysis data (Copernicus Climate Change Service (C3S), 2017). The division of burnt area between municipalities can produce noise in the data: this procedure artificially generates wildfires, some of them relatively small but with high or very high DSRp. To circumvent this difficulty, we decided to analyse BA percentages, which reduces the influence of small wildfires on the final results.

Cluster Analysis
Potential clustering was assessed using the curves of FTBA vs. DSRp for all the selected municipalities. Clusters were computed using the "complete" (longest distance) linkage method, with (1-r²) as the metric, where r² is the coefficient of determination between FTBA and DSRp. The method and metric were chosen to ensure robustness and ease of visualization, respectively. The selected (1-r²) threshold was 0.35, meaning that the coefficient of determination in the municipalities within a cluster is higher than 0.65. Algorithms were processed with MATLAB software.
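The clustering step described above can be illustrated with a minimal sketch in Python (the study itself used MATLAB). It assumes that the distance between two municipalities is 1-r² of their FTBA curves sampled at the same DSRp values, and that merging stops once the complete-linkage distance exceeds 0.35. The function names, the municipality labels and the curve values are hypothetical and serve only as illustration.

```python
from itertools import combinations

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length curves.
    Undefined (division by zero) if either curve is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def complete_linkage(curves, cutoff=0.35):
    """Agglomerative clustering with complete linkage on the 1 - r^2 metric.
    Two clusters are merged only while their farthest pair is within `cutoff`."""
    names = list(curves)
    dist = {frozenset((a, b)): 1.0 - r_squared(curves[a], curves[b])
            for a, b in combinations(names, 2)}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Complete linkage: the longest distance between members of both clusters.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        (i, j), d = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1])
        if d > cutoff:
            break  # the closest remaining pair is already too dissimilar
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical FTBA curves (%) sampled at five increasing DSRp values.
curves = {
    "A": [100, 100, 99, 95, 50],  # BA concentrated at very high DSRp
    "B": [100, 99, 98, 92, 45],
    "C": [100, 60, 30, 20, 15],   # BA spread over lower DSRp
    "D": [100, 65, 35, 22, 12],
}
clusters = complete_linkage(curves)
print(clusters)
```

With these illustrative curves, municipalities A and B end up in one cluster and C and D in another, because every cross-group pair has 1-r² above the cutoff.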

The influence of the type of vegetation
The burnable area (BNA) was computed as the total burnable area (sum of the land cover types that are susceptible to burn, based on the land cover map) divided by the total area of the municipality, and is presented as a percentage. The ratio between TBA in the 2001-2019 period and the total burnable area in the municipality (TBA/BNA) was also computed and is presented as a percentage. LULC was related to TBA by computing the TBA in five classes of vegetation, namely: forests, shrublands, agriculture, agroforestry and others. Computations were made for each analysed municipality and cluster, to answer the third research question. Two additional ratios were computed for each municipality, the first between forest and shrublands BNA and the second between forest and shrublands TBA. Moreover, the spatial distribution of the prevailing land use types most affected by wildfires was investigated, to identify which municipalities have BA in forests larger than 50% or BA in shrublands larger than 40% of TBA. A contingency table, accuracy metrics and statistical measures of association were used to analyse the influence of the type of vegetation cover on the relationship between DSRp and TBA. The contingency table contains the number of municipalities that are characterized by diverse DSRp thresholds at 90% of TBA (DSRp90TBA) and, therefore, belong to different groups of clusters. The objective is to relate the municipalities (within the groups of clusters) with TBA in diverse vegetation cover types, taking into consideration that a pre-conceived relation must be defined. For example, we can propose that municipalities with high DSRp90TBA will have the largest TBA in forested areas, compared with other land use types, and accuracy metrics will be computed according to this initial classification. A contingency table needs at least two rows and two columns and, therefore, two relations.
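As an illustration of how such a 2×2 contingency table can be evaluated, the following Python sketch computes the overall, user's and producer's accuracies together with the χ² statistic (without continuity correction), its p-value for one degree of freedom, the phi coefficient, the contingency coefficient and Cohen's kappa. It assumes that rows hold the groups of clusters and columns hold the prevailing burned land cover, so that user's accuracy is computed per row and producer's accuracy per column. The function name and the counts are hypothetical and are not the values of the study.

```python
import math

def two_by_two_metrics(table):
    """Accuracy and association measures for a 2x2 contingency table.
    Assumed layout: rows = group of clusters, columns = prevailing burned cover."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)

    overall = (a + d) / n                   # share of diagonal (agreeing) cases
    users = (a / rows[0], d / rows[1])      # diagonal over row totals
    producers = (a / cols[0], d / cols[1])  # diagonal over column totals

    # Pearson chi-squared for a 2x2 table, no continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    phi = math.sqrt(chi2 / n)               # magnitude of the phi coefficient
    contingency_c = math.sqrt(chi2 / (chi2 + n))

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = (rows[0] * cols[0] + rows[1] * cols[1]) / n ** 2
    kappa = (overall - expected) / (1.0 - expected)

    return {
        "overall_accuracy": overall,
        "users_accuracy": users,
        "producers_accuracy": producers,
        "chi2": chi2,
        "p_value": p_value,
        "phi": phi,
        "contingency_c": contingency_c,
        "kappa": kappa,
    }

# Hypothetical counts of municipalities, for illustration only.
metrics = two_by_two_metrics([[40, 10], [20, 30]])
for name, value in metrics.items():
    print(name, value)
```

For these illustrative counts the overall accuracy is 0.70 and kappa is 0.40. Swapping the roles of rows and columns exchanges the user's and producer's accuracies but leaves the other measures unchanged.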
The list of accuracy metrics includes: (i) the overall accuracy, which represents the samples that were correctly classified, i.e., the diagonal elements of the contingency table, from top-left to bottom-right (Alberg et al., 2004); (ii) the user's accuracy, or reliability, which indicates the probability that a sample classified in one category actually belongs to that category; and (iii) the producer's accuracy, which represents the probability of a sample being correctly classified (Congalton, 2001). The statistical measures are: the Chi-squared (χ²) test (Greenwood and Nikulin, 1996), which tests the independence of two categorical variables; and the Phi-test (Φ) or phi coefficient (David and Cramer, 1947), which is related to the chi-squared statistic for a 2×2 contingency table, the two variables being associated if Φ>0. Lastly, we computed Cohen's Kappa coefficient, first presented by Cohen (1960) and recently analysed by McHugh (2012), which measures the interrater agreement of the two nominal variables. This coefficient ranges from -1 to 1, with values below 0 indicating no agreement and values approaching 1 indicating almost perfect agreement.

DSR>DSR50p the FTBA is almost 100%, meaning that fires in days with lower DSR have a negligible impact on TBA. Fires in days with DSRp between 85 and 95 were responsible for more than 80% of TBA in the 2001-2019 period, making this a good DSRp threshold for extreme days. This justifies using the DSR90p at the national scale, which is widely used as a threshold for extreme values (Carvalho et al., 2008; Bedia, Herrera and Guti, 2012; Fernandes, 2019; Silva et al., 2019). However, if the analysis is performed at higher spatial resolution, namely at the municipality level, some differences become apparent (Figure 4).

Viseu Dão-Lafões, Região de Coimbra, Beira Baixa and Região de Leiria), reaching values similar to the mean country level value (85-95).
In some NUTSIII provinces of the northern and central hinterland, DSRp90TBA is between 60 and 70 in most of the municipalities, particularly in Douro and Terras de Trás-os-Montes. It is important to underline that DSRp80TBA > DSRp90TBA, which is a consequence of the methodology adopted for this analysis (see Section 2.1). This also helps to understand why DSRp=50 is associated with FTBA=100% (Figure 3). The spatial distribution of DSRp80TBA and DSRp90TBA suggests the existence of municipality clustering.
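The ordering DSRp80TBA > DSRp90TBA can be made concrete with a short Python sketch. It assumes that FTBA at a given DSRp is the share of TBA burned by fires with DSRp at or above that value, so that the DSRp at X% of TBA is found by accumulating BA from the most extreme fire downwards. The function name and the list of fires are hypothetical.

```python
def dsrp_at_tba_fraction(fires, fraction):
    """Highest DSRp such that fires at or above it account for at least
    `fraction` of the total burned area (TBA).
    `fires` is a list of (DSRp, burned area in ha) pairs."""
    total = sum(ba for _, ba in fires)
    cumulative = 0.0
    # Accumulate burned area starting from the most extreme fire weather.
    for dsrp, ba in sorted(fires, key=lambda fire: fire[0], reverse=True):
        cumulative += ba
        if cumulative >= fraction * total:
            return dsrp
    raise ValueError("fraction must be between 0 and 1")

# Hypothetical large wildfires in one municipality: (DSRp, BA in ha).
fires = [(99, 5000), (95, 2500), (88, 1000), (72, 900), (60, 400), (45, 200)]

dsrp80tba = dsrp_at_tba_fraction(fires, 0.80)
dsrp90tba = dsrp_at_tba_fraction(fires, 0.90)
print(dsrp80tba, dsrp90tba)
```

In this example 80% of TBA is reached once fires down to DSRp=88 are included, whereas reaching 90% requires going down to DSRp=72. Under this assumed definition the value for 80% can never be lower than the value for 90%.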

Patterns at the municipality level
We explored other features of wildfires in mainland Portugal, with the objective of explaining the differences observed in DSRp at the municipality level. Burnable area (BNA), the ratio of Forest/Shrublands BNA, and the ratio of Forest/Shrublands TBA in each municipality were assessed and analysed (Figure 5). Additionally, the number of wildfires and the TBA/BNA ratio in each Região de Leiria, Médio-Tejo and one municipality in Algarve. In some of these municipalities, this value is >100%, meaning that in the 19-year period TBA is larger than BNA and, consequently, there was a large number of recurrent wildfires in those areas.

Cluster analysis pattern
Based on the relationship between TBA and DSRp, the municipalities were grouped in ten clusters. However, the dendrogram (Figure 6) discloses that cluster 10 is isolated, with only one municipality, and can therefore be eliminated from further analysis. Cluster numbers are in descending order of the DSRp90TBA, i.e., 90% of TBA was registered with DSRp larger than this value. Cluster 2 includes the largest number of municipalities (23% of total) and the highest TBA, almost 500,000 ha (26% of total). Generally, clusters group 13 or more municipalities, with the exception of clusters 3 and 8, with only 5 and 6 municipalities, respectively. Each cluster represents between 8% and 16% of the total TBA for the study period, except for the two smaller clusters, where TBA is only 1% of total. The spatial pattern of Figure 7 Lafões, Beiras e Serra da Estrela, Médio-Tejo and Alto Alentejo) and in the south coast (almost all of Algarve). Clusters 4, 5 and 6 are prone to burn under less extreme conditions, where the median of DSR90p corresponds to 85-90% of TBA. The slope of the FTBA vs DSRp curves is less steep than in the previous clusters, and dispersion is higher in these clusters, with more municipalities where fire can occur with lower values of DSRp. Both suggest that in these clusters fires tend to occur in a wider range of meteorological conditions. These clusters are spread throughout the country and can be viewed as a transition between the groups of clusters with extreme (1, 2 and 3) and less extreme (7, 8 and 9) DSRp80TBA or DSRp90TBA. Clusters 7, 8 and 9 can be considered as the group of lower DSRp clusters, due to the relatively lower values of the DSR90p and of the DSRp80TBA or DSRp90TBA, which range from 70 to 80%. Additionally, higher curve dispersion is also apparent, especially in cluster 9, which integrates municipalities where large wildfires can occur with lower values of DSRp (in some cases, below DSR50p).
In this group of clusters, the slope of the FTBA vs DSRp curves at higher values of DSRp is the lowest, especially in clusters 8 and 9. Nevertheless, the median curve of cluster 8 behaves differently from those of the other clusters: the steepest interval is between the 70th and 80th percentiles, meaning that it has a larger amount of BA in less extreme conditions. The municipalities within these clusters are mostly located in the northern and central hinterland, particularly in Alto-Tâmega, Terras de Trás-os-Montes, Douro, Beiras e Serra da Estrela and Beira Baixa. Additionally, a few municipalities within these clusters belong to Alentejo Central and Baixo Alentejo, two provinces with few fires and little burnt area. Boxplots of the DSRp80TBA and DSRp90TBA for the municipalities of each cluster (Figure 9) are consistent with the previous results.
Dispersion is considerably higher in the latter than in the former case, especially in clusters 3, 7 and 8. In some municipalities of clusters 7 and 8, large wildfires, with the ability to exceed FTBA=10% (Figure 8), start to occur with relatively low values of DSRp. Another notable difference is in the boxplot medians: for DSRp90TBA they decrease with increasing cluster number, as expected, but not for DSRp80TBA, where they increase between clusters 4 and 5, between 6 and 7, and between 8 and 9.

Major drivers
The spatial distribution of the clusters resembles the general pattern of LULC in Portugal (Figure 11). In general, municipalities with high DSRp90TBA are located in regions of forests, while municipalities with lower DSRp90TBA are located in regions where shrublands tend to be predominant. LULC type analysis, made for each cluster, indicates that BA in forests (BAF) is notably higher, relative to BA in shrublands (BAS), in the first five clusters than in the last four clusters (Figure 11, top panel). This means that BAF is higher for clusters with higher DSRp90TBA, while BAS is higher for clusters with lower DSRp90TBA. In addition, the fraction of BA in agricultural land increases as DSRp90TBA decreases: it is larger than or very close to 10% in clusters 6-9 and lower in clusters 1-5. Results show a marked contrast between most coastal municipalities and those of the northern/north-eastern hinterland, each group presenting similar DSRp90TBA and, therefore, similar cluster distribution.

The highest BAF characterizes the majority of the municipalities with the highest observed DSRp at 90% of TBA (generally above 85), while the territory with higher BAS is also characterized by lower DSRp90TBA (below 85). These clusters (7-9) also present relatively high percentages of BA in agriculture (mostly between 10 and 20%). It is also worth mentioning that some municipalities present similar BAF and BAS despite being located in the coastal regions, which are usually characterized by higher forest cover. Land cover also helps to understand the DSRp80TBA and DSRp90TBA boxplots for each cluster, especially the higher dispersion in the latter in comparison with the former (Figure 9). These dissimilarities are especially evident in cluster 8, which is the cluster with the highest BA in shrublands and agriculture (twice the value of clusters 1-5) and the lowest in forest (half the value of clusters 1-5). Additionally, cluster 8 is the one with the least burnable area (not shown). The combination of these factors could explain the high dispersion: high BA in shrublands can occur with low DSRp; high BA in agricultural lands is much more likely to occur with high DSRp; and, finally, low burnable area prevents very large wildfires from occurring, even with extreme DSRp. A contingency table made it possible to evaluate the influence of vegetation cover on the spatial distribution of the clusters and, therefore, also on DSRp90TBA. Table 1 is based on the results illustrated in Figure 11 and aims to assess whether the differences in groups of clusters or in DSRp90TBA can be explained by the BA prevailing in forested areas or in shrubland+agricultural zones.
Specifically, it aims to assess whether municipalities of clusters 1-5, with DSRp90TBA>90, have higher BAF (BAF>50%) and, on the contrary, whether clusters 7-9, with DSRp90TBA<90, present higher BA in shrublands plus agriculture (BAS+BAA>50%). Results reveal that the number of municipalities of clusters 1-5 with BAF>50% is 4.6 times higher than the number of municipalities of clusters 7-9 with BAF>50%. However, the number of municipalities of clusters 7-9 with BAS+BAA>50% is only 1.3 times higher than the number of municipalities of clusters 1-5 with BAS+BAA>50%. Consequently, the overall accuracy (OA, 71%), user's accuracy (UA, 71%-70%) and producer's accuracy (PA, 82%-55%) reveal moderate to high accuracy. The BAS+BAA>50% threshold is probably too demanding a criterion for the DSRp90TBA=90 limit, as shrublands and agricultural land will also burn with higher DSRp in a large number of municipalities. For forests (BAF>50%), the accuracy is better, i.e., the number of correctly classified municipalities is more than four times the number of those incorrectly classified. The χ² test results indicate that we can claim that the samples are independent, with an error risk

Discussion
It is important to discuss some methodological options. Only wildfires that occurred in the extended summer period, from 15th May to 31st October, were studied, for two main reasons: (i) BA within this period accounts for 97.5% of TBA, considering only large fires; and (ii) the secondary peak of fire incidence in Portugal occurs in late winter and early spring, with low DSR values, and depends more on drought than on temperature (Amraoui et al., 2015; Calheiros et al., 2020). Only large wildfires (BA>100 ha), as similarly defined by the Portuguese forest authorities (ICNF), were included, also for two reasons. First, wildfires in Portugal are mainly (99.4%) caused by humans, by negligence (about one quarter of the total number of wildfires with known cause) and intentionally (about three quarters), associated with the use of fire, accident and structural/land use (Parente et al., 2018), i.e., small wildfires can occur with relatively low DSR. Second, mainland Portugal registers a very large number of small wildfires, but they account for only a small amount of TBA. For example, wildfires with BA>100 ha are just about 1% of all wildfires, but account for 75% of total burnt area (Pereira et al., 2011). LULC data can limit the analysis and affect the use LULC data for one year/inventory to assess wildfire selectivity. Understory vegetation is also a very important factor in fire vulnerability, spread and intensity (Fonseca and Duarte, 2017; Espinosa et al., 2019). Consequently, wildfires tend to occur and spread in managed forests only with very high DSR, higher than in unmanaged forests (Fernandes, Guiomar and Rossa, 2019). However, land use data does not include forest management information. Despite the small fraction of managed forested areas, roughly 20%, as estimated by Beighley and Hyde (2018), this lack of information can influence our results,

Table 1. Contingency tables and accuracy metrics to assess the role of vegetation BA assessed with DSRp90BA thresholds, for the municipalities used in cluster analysis. The contingency tables report the number of municipalities (NM) for the following criteria: CLUST 1-5 (CLUST 7-9) and BAF>50% (BAS+BAA>50%). Overall Accuracy (OA), User's Accuracy (UA) and Producer's Accuracy (PA) were the calculated accuracy metrics, together with the statistical tests Chi-squared (χ²) test (with p-value), Phi coefficient (Φ), Contingency coefficient (C) and Cohen's Kappa coefficient (κ).