the Creative Commons Attribution 4.0 License.
A data-driven model for Fennoscandian wildfire danger
Sigrid Jørgensen Bakke
Niko Wanders
Karin van der Wiel
Lena Merete Tallaksen
Download
- Final revised paper (published on 12 Jan 2023)
- Supplement to the final revised paper
- Preprint (discussion started on 23 Dec 2021)
- Supplement to the preprint
Interactive discussion
Status: closed
RC1: 'Comment on nhess-2021-384', Anonymous Referee #1, 21 Feb 2022
This paper uses Random Forests to estimate wildfire probability in the mostly boreal Fennoscandia region. Comparable studies using similar data and Random Forest models have been performed over various spatial domains but this study is the first one focusing on Fennoscandia in particular. The analyses are thorough and very well documented. There are a few issues I would like to see addressed before publication:
- What was the motivation to perform the analysis at 0.25° and not at the native MODIS resolution, or at least at the finest meteorological resolution? You lose a lot of spatial detail in this way. Pixel product data are available at a 250 m resolution.
- Not including dynamic vegetation predictors or specific land cover is a weakness. Recent work (e.g. Kuhn-Regnier et al., 2021) has shown that adding vegetation dynamics has a considerable impact on model skill. NDVI not being modelled by DGVMs is not a valid justification, as several productivity-related indicators estimated by DGVMs are available from Earth observation. The same applies to (more static) land cover information, such as crop fraction or tree type (e.g. Forkel et al., 2019).
- The same applies to socio-economic drivers such as population density. Fig. 3 suggests that there is a clear link between wildfire occurrence and population centres. Including crop fraction as a variable would probably already be a good proxy for this.
Detailed comments
The title is a bit misleading: The model identifies the main hydrometeorological drivers of past wildfire occurrences. It estimates the probability of wildfire occurrence but it does not predict (i.e. forecast) wildfire occurrence itself; this should be made clear in the title.
l68-69: It is misleading to state that fire-weather indices based on climate model and reanalysis data can be used for monitoring and forecasting.
l91: mention some of these limited studies using data-driven methods to predict intra- and inter-annual dynamics, e.g. Forkel et al. (2017, 2019) and Kuhn-Regnier et al. (2021), who predict monthly global patterns.
l92: How do you define a data-rich region? With recent satellite availability, practically all regions have become data rich and several studies ha
l97: "In addition, a bottom-up approach is typically less straightforward in its data requirements and methodology as compared to the process-based approaches" -> explain
l123: unclear whether this dataset is used for training or as independent validation reference. If used as target in model development, this doesn't come out clearly in Fig. 1 (as it should also be split up into training and testing)
l127: How can the machine learning algorithm both be simpler and more sophisticated?
l181: Why are dynamic vegetation predictors not included? Recent work (e.g. Kuhn-Regnier et al., 2021) has shown that adding vegetation dynamics has a considerable impact on model skill.
l220: Why is wind speed included as a predictor? It is more a predictor of fire spread than of occurrence.
l314: Which threshold was used beyond which no more predictors were removed?
l394: Why did you not assess the impact of a predictor that is
Section 3.1/l415: The final set of predictors, which mostly excludes anomaly-based indicators, seems to suggest that the model is tuned to predict fire occurrence climatology rather than typical fire weather situations. Is this correct?
L419-421: is the minor difference between the RF model and the FWI predictors really significant?
l442-447: To me it's not very surprising that simply including NDVI does not improve model skill, as its climatology closely follows that of soil moisture and meteorological variables. Did you also test the inclusion of NDVI anomalies?
l445: High fire danger (luckily) most of the time does not lead to actual wildfire activity, as an ignition source is required.
Fig. 9: it seems that the correlation patterns closely follow the border between Finland and Russia (and to a lesser degree Sweden). How can this be explained?
Can it be that the superior skill of FWI over the RF model is because FWI describes anomalous conditions whereas your model more relates to describing fire weather climatology and spatial patterns?
l496: In this context, reference should be made to Forkel et al., 2012, who showed that antecedent moisture conditions are better predictors of fire occurrence in a boreal environment than FWI and precipitation anomalies.
l498: to what extent is soil moisture an indicator of litter fuel conditions? This is usually where fires start, not in the tree crowns.
l531: This statement underestimates the role observations play in reanalysis.
l533-534: Could it be that wind is not directly but indirectly related, i.e. by the dominant weather patterns? High-pressure conditions, which are favourable to fire weather, are typically associated with low wind speeds. Vice-versa, westerlies bring high wind speeds and precipitation.
l539: are latitude and months of the year not already implicitly included in the other predictors?
l545: vegetation variables like fAPAR and LAI would be more obvious candidates than NDVI as these are simulated by DGVMs (which is an argument you brought up earlier).
l546: Vegetation Optical Depth from microwave satellites has been proposed as a fuel moisture indicator (e.g. Forkel et al., 2017, 2019).
l582: Several studies have done this before as proved by the references below. Please rephrase.
l611: I'd be careful with the word easily here, as in other regions other drivers can be dominant, some of which may not even have been originally tested here. Besides, high-quality datasets such as the EOBS and observation-heavy reanalysis data may be unavailable or have reduced skill, respectively, in other regions and hence lead to a different model. Also fire management is different in many parts of the globe (e.g. rangeland burning management in Africa or deforestation).
References
Kuhn-Regnier et al., 2021: https://bg.copernicus.org/articles/18/3861/2021/
Forkel et al., 2012: https://iopscience.iop.org/article/10.1088/1748-9326/7/4/044021/
Forkel et al., 2017: https://doi.org/10.5194/gmd-10-4443-2017
Forkel et al., 2019: https://bg.copernicus.org/articles/16/57/2019/
Citation: https://doi.org/10.5194/nhess-2021-384-RC1
AC1: 'Reply on RC1', Sigrid Joergensen Bakke, 16 May 2022
AC: Many thanks for your time and efforts in evaluating our manuscript. We highly appreciate your positive and constructive feedback. In the following, we would like to respond to the comments.
RC1: This paper uses Random Forests to estimate wildfire probability in the mostly boreal Fennoscandia region. Comparable studies using similar data and Random Forest models have been performed over various spatial domains but this study is the first one focusing on Fennoscandia in particular. The analyses are thorough and very well documented. There are a few issues I would like to see addressed before publication:
RC1: What was the motivation to perform the analysis at 0.25° and not at the native MODIS resolution, or at least at the finest meteorological resolution? You lose a lot of spatial detail in this way. Pixel product data are available at a 250 m resolution.
AC: We chose the 0.25 degree resolution to investigate whether a data-driven model is applicable at the resolution of state-of-the-art global climate models, rather than aiming for the highest spatial detail possible. Further, spatial dependency of fires (e.g. the same fire occurring in two or more cells) is reduced when using a coarser scale. We see that the reasoning behind the chosen spatial scale is not stated clearly in our manuscript, and we will state it in the revised version.
RC1: Not including dynamic vegetation predictors or specific land cover is a weakness. Recent work (e.g. Kuhn-Regnier et al., 2021) has shown that adding vegetation dynamics has a considerable impact on model skill. NDVI not being modelled by DGVMs is not a valid justification, as several productivity-related indicators estimated by DGVMs are available from Earth observation. The same applies to (more static) land cover information, such as crop fraction or tree type (e.g. Forkel et al., 2019).
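The point about spatial dependency can be illustrated with a small sketch. It is not taken from the manuscript: the function `cells_touched`, the footprint coordinates, and the 0.0025° fine grid are assumptions chosen only to show how one fire can fall into many fine cells but a single coarse cell.

```python
import math


def cells_touched(lon_min, lon_max, lat_min, lat_max, resolution_deg):
    """Count regular lon/lat grid cells intersected by a bounding box."""
    n_lon = math.floor(lon_max / resolution_deg) - math.floor(lon_min / resolution_deg) + 1
    n_lat = math.floor(lat_max / resolution_deg) - math.floor(lat_min / resolution_deg) + 1
    return n_lon * n_lat


# Hypothetical fire footprint in degrees; the values are illustrative only.
footprint = dict(lon_min=25.10, lon_max=25.13, lat_min=62.05, lat_max=62.07)

coarse = cells_touched(**footprint, resolution_deg=0.25)   # grid used in the study
fine = cells_touched(**footprint, resolution_deg=0.0025)   # assumed much finer grid

print(f"cells at 0.25 deg: {coarse}, cells at 0.0025 deg: {fine}")
```

On the coarse grid this footprint stays inside one cell, so it contributes a single fire/no-fire sample, whereas on the fine grid the same fire would appear as many neighbouring, strongly dependent samples.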
AC: Thank you for the references. As you state, several productivity-related indicators are estimated by DGVMs. Still, most climate model outputs are not based on runs for which the climate model has been coupled with a DGVM. For this reason, we wanted to limit the choice of predictors to those available from climate models without the need of DGVMs. We will clarify our reasoning, and acknowledge the possibility of productivity-based indicators estimated by DGVMs in the revised manuscript (e.g. in Sect. 4.2).
RC1: The same applies to socio-economic drivers such as population density. Fig. 3 suggests that there is a clear link between wildfire occurrence and population centres. Including crop fraction as a variable would probably already be a good proxy for this.
AC: We agree that socio-economic predictors would likely improve the model prediction. The main reason that we hypothesise this is that humans and human infrastructure are fire starters (line 547-551). Figure 3b of Norway suggests a link between wildfire occurrence and population centres. We suggest this is partly due to humans being a major ignition source as already mentioned, as well as the overlap between human settlement in Norway and burnable areas (a potential predictor included). However, as seen in Figure 2, which shows number of fires over Fennoscandia, the link between human settlement and fires is not clear. We chose to constrain our study to predictors available in global climate models. In a setting where no data restrictions are imposed, we agree that one should test the inclusion of socio-economic and vegetation based predictors.
Detailed comments
RC1: The title is a bit misleading: The model identifies the main hydrometeorological drivers of past wildfire occurrences. It estimates the probability of wildfire occurrence but it does not predict (i.e. forecast) wildfire occurrence itself; this should be made clear in the title.
AC: We agree, and suggest revising the title to “A data-driven model for Fennoscandian wildfire danger”.
RC1: l68-69: It is misleading to state that fire-weather indices based on climate model and reanalysis data can be used for monitoring and forecasting.
AC: We agree and will rephrase these sentences.
RC1: l91: mention some of these limited studies using data-driven methods to predict intra- and inter-annual dynamics, e.g. Forkel et al. (2017, 2019) and Kuhn-Regnier et al. (2021), who predict monthly global patterns.
AC: Thank you for providing these references. We will include these (and potentially other relevant references) in the revised manuscript.
RC1: l92: How do you define a data-rich region? With recent satellite availability, practically all regions have become data rich and several studies ha
AC: We agree that this is an unclear statement and will clarify it in the revised manuscript. The last part of your comment is unfortunately missing; however, we trust that your point has been made in the part we received (please let us know otherwise).
RC1: l97: "In addition, a bottom-up approach is typically less straightforward in its data requirements and methodology as compared to the process-based approaches" -> explain
AC: We will clarify this in the revised manuscript.
RC1: l123: unclear whether this dataset is used for training or as independent validation reference. If used as target in model development, this doesn't come out clearly in Fig.1 (as it should also be split up into training and testing)
AC: The local (Norwegian) fire occurrence dataset is used as an independent validation reference for research question 2, and used for training (a target in model development) for research question 3. For research question 3, we split up the Norwegian fire occurrence dataset into training and test datasets. This is described in line 375-380. Figure 1 shows the data-driven approach for the Fennoscandian domain (i.e. the one developed using satellite-based fire data as target), and not that of Norway alone. It is stated in the figure caption, but we will clarify it in line 132 as well. Further, we will make a line break after the punctuation in line 375, to separate the two applications of the Norwegian dataset. We considered including it in the figure, but concluded that it would make the figure messier rather than clearer.
RC1: l127: How can the machine learning algorithm both be simpler and more sophisticated?
AC: The simpler and the more sophisticated machine learning algorithms are two separate algorithms (Decision tree is the simpler and AdaBoost is the more sophisticated one). We will clarify by rephrasing the sentence.
RC1: l181: Why are dynamic vegetation predictors not included? Recent work (e.g. Kuhn-Regnier et al., 2021) has shown that adding vegetation dynamics has a considerable impact on model skill.
AC: With reference to our earlier comment on the subject (based on your question regarding DGVM), we will clarify our reasoning in this section in the revised manuscript.
RC1: l220: Why is wind speed included as a predictor? It is more a predictor of fire spread than of occurrence.
AC: For a fire to occur in the burned area dataset, it must have been of a size detectable by the satellite. Thus, the fire must have spread to some degree (due to wind or not). Another effect of the wind is drying of the ground and vegetation by increasing evapotranspiration prior to the fire. Regardless of the reason, wind was among the selected predictors, indicating its importance in predicting the fire occurrence dataset.
RC1: l314: Which threshold was used beyond which no more predictors were removed?
AC: We used no threshold; a predictor subset was made for all (each) number of predictors (Np), as stated in line 314. This can also be seen in Fig. S1 that shows the average cross-validation score for each combination of max depth and number of predictors from one to all (30) predictors. The Np selected for the final model was selected as described in Sect. 2.6.1.
RC1: l394: Why did you not assess the impact of a predictor that is
AC: Unfortunately, the last part of your comment is missing.
RC1: Section 3.1/l415: The final set of predictors, which mostly excludes anomaly-based indicators, seems to suggest that the model is tuned to predict fire occurrence climatology rather than typical fire weather situations. Is this correct?
AC: We do not fully agree that the model predicts fire occurrence climatology rather than typical fire weather situations. First, even though most of the anomaly-based potential predictors are not included in the final set of predictors, the shallow soil water anomaly stands out as a clearly dominant predictor compared to the other selected predictors. Second, the predictors have a high annual variability in monthly values. Notable differences from year to year for the same month can be seen in the fire danger probability maps produced by the model (in Fig. 8 and S6-S9), for example July 2017 (Fig. S9d) versus July 2018 (Fig. 8d).
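To make the distinction between climatology and anomaly concrete, the following minimal sketch computes monthly anomalies. It assumes one common definition (monthly value minus the multi-year mean of the same calendar month), which is not necessarily the definition used in the manuscript; the function name `monthly_anomalies` and the soil moisture values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Tuple

YearMonth = Tuple[int, int]


def monthly_anomalies(series: Dict[YearMonth, float]) -> Dict[YearMonth, float]:
    """Subtract each calendar month's multi-year mean from the monthly values."""
    by_month = defaultdict(list)
    for (_, month), value in series.items():
        by_month[month].append(value)
    # Climatology: one mean value per calendar month, shared by all years.
    climatology = {month: mean(values) for month, values in by_month.items()}
    return {(y, m): v - climatology[m] for (y, m), v in series.items()}


# Made-up soil moisture values (m^3 m^-3) for two Julys, for illustration only.
soil_moisture = {(2017, 7): 0.30, (2018, 7): 0.20}
anomalies = monthly_anomalies(soil_moisture)

for key in sorted(anomalies):
    print(key, round(anomalies[key], 3))
```

Under this definition the climatology is identical for every July, so any year-to-year difference in a July prediction has to come from the anomaly or from other time-varying predictors.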
RC1: L419-421: is the minor difference between the RF model and the FWI predictors really significant?
AC: We did not test for significance, but we agree that this difference is likely not significant. We will change to a more precise language (e.g. at which digit they differ) in the revised manuscript.
RC1: l442-447: To me it's not very surprising that simply including NDVI does not improve model skill, as its climatology closely follows that of soil moisture and meteorological variables. Did you also test the inclusion of NDVI anomalies?
AC: We did not include NDVI anomaly. NDVI can be viewed as a potential estimate of burnable biomass (in particular in the Nordic landscape, which has a high variability in burnable biomass) and it is therefore preferred to include the absolute NDVI value instead of the NDVI anomaly. The close relationship between NDVI and hydrometeorological variables, such as temperature and snow cover, further supports the potential of developing models without NDVI. As acknowledged earlier, many more variables could have been included in our study (NDVI anomaly being one of them); however, some constraints on the predictors included had to be made at the start of our study.
RC1: l445: High fire danger (luckily) most of the time does not lead to actual wildfire activity, as an ignition source is required.
AC: We agree. The relation to line 445 is unclear to us, and we suspect the reviewer intended to refer to line 455.
RC1: Fig. 9: it seems that the correlation patterns closely follow the border between Finland and Russia (and to a lesser degree Sweden). How can this be explained?
AC: This is an interesting observation, and we are left with speculations when trying to explain the pattern. Figure 2b also shows a Finland-Russia divide in the number of fires, where more fires are found in Russia. As a consequence, the data-driven model may have been better tuned to Russian conditions as compared to Finnish conditions, whereas the FWI performance is independent of the fire occurrence density. This may be one reason for the higher correlations between the two approaches in Russia compared to eastern Finland. We will briefly comment on this in the revised manuscript.
RC1: Can it be that the superior skill of FWI over the RF model is because FWI describes anomalous conditions whereas your model more relates to describing fire weather climatology and spatial patterns?
AC: In our understanding, FWI does not describe anomalous conditions, but rather estimates moisture (in surface, intermediate and deep organic layers) and spread conditions regardless of what is “normal”. Since soil moisture anomaly is a dominant predictor in the data-driven model, the emphasis on anomalous conditions is a notable feature of the data-driven model rather than FWI.
RC1: l496: In this context, reference should be made to Forkel et al., 2012, who showed that antecedent moisture conditions are better predictors of fire occurrence in a boreal environment than FWI and precipitation anomalies.
AC: Thank you for this good suggestion, we will include it in the revised manuscript.
RC1: l498: to what extent is soil moisture an indicator of litter fuel conditions? This is usually where fires start, not in the tree crowns.
AC: We expect a strong relation between shallow soil moisture and litter fuel conditions (favourable fuel conditions for low soil moisture). We agree with your statement and will consider adding a remark to the manuscript.
RC1: l531: This statement underestimates the role observations play in reanalysis.
AC: We agree and will make this clear in the revised manuscript.
RC1: l533-534: Could it be that wind is not directly but indirectly related, i.e. by the dominant weather patterns? High-pressure conditions, which are favourable to fire weather, are typically associated with low wind speeds. Vice-versa, westerlies bring high wind speeds and precipitation.
AC: Yes. We will consider mentioning this in the revised manuscript.
RC1: l539: are latitude and months of the year not already implicitly included in the other predictors?
AC: In some ways, yes, but they could have guided the model in cases such as the example presented commenting on the different effect of SPEI3 during the growing season as compared to the snow accumulation period (line 540-541).
RC1: l545: vegetation variables like fAPAR and LAI would be more obvious candidates than NDVI as these are simulated by DGVMs (which is an argument you brought up earlier).
AC: As commented on earlier, we excluded vegetation variables represented by DGVMs and limited the choice of variables to what is available from more common climate models (not including dynamic vegetation).
RC1: l546: Vegetation Optical Depth from microwave satellites has been proposed as a fuel moisture indicator (e.g. Forkel et al., 2017, 2019).
AC: We will check the suggested references (thanks for providing these) and adapt the text accordingly.
RC1: l582: Several studies have done this before as proved by the references below. Please rephrase.
AC: We will rephrase this sentence in the revised manuscript according to the analyses done in the references provided.
RC1: l611: I'd be careful with the word easily here, as in other regions other drivers can be dominant, some of which may not even have been originally tested here. Besides, high-quality datasets such as the EOBS and observation-heavy reanalysis data may be unavailable or have reduced skill, respectively, in other regions and hence lead to a different model. Also fire management is different in many parts of the globe (e.g. rangeland burning management in Africa or deforestation).
AC: Yes, we agree and will remove the word easily from this sentence.
Citation: https://doi.org/10.5194/nhess-2021-384-AC1
RC2: 'Comment on nhess-2021-384', Anonymous Referee #2, 17 Mar 2022
This manuscript uses machine learning methods to predict fire danger in Fennoscandia at approximately 0.25 degree spatial scale for 2001-2019. Here, the authors are using official statistics compared to MODIS burned area, with predicted fire danger probability models compared to the results from the Canadian Fire Weather Index. The method is novel and the comparison is rigorous, but the data and approach need to be explained more – and at times even cited better – to assess the efficacy of the model. In general, this manuscript needs to be revised in order to understand why this method may be useful for predicting fire danger probabilities. First, the authors should explain what fire danger is as opposed to fire occurrence. Second, why are burned area data used as ‘fire occurrence’ when satellite-based active fire detections are available? Finally, the manuscript does not fully describe many of the datasets used, including where to obtain them and what their uncertainties are. Finally, the results seem to indicate that a single shallow soil moisture variable is driving the predictions (which is not usually considered in fire danger modeling like FWI). A major revision and resubmission is recommended.
Specific comments:
- The title is “A data-driven prediction model for Fennoscandian wildfires”, but the thesis of the paper is to produce spatiotemporally resolved fire danger probability maps – which is not quite the same as predicting wildfires. Consider revising the title to be more specific.
- Line 19: “which stores approx. 30% of the world’s soil carbon pool” needs a citation
- Lines 26-27: “However, to the best of our knowledge, fire studies of the European boreal zone are limited.” needs a citation.
- Line 144: What is the spatial resolution of the European Space Agency Climate Change Initiative (ESA CCI) product version 5.1.1cds? Please include that.
- Line 146-147: “and is based on Terra Moderate Resolution Imaging Spectroradiometer (MODIS) Reflection information” is not the correct way to write this. It should be “the reflectance product of the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor on the Terra satellite”. Can the authors please specify which reflectance information is used? Daily surface reflectance?
- Section 2.2 Norwegian fire occurrence dataset – the authors have not provided a citation to the dataset, where it can be accessed, and how it is collected. Are these truly wildfires or are these fires from all ignition sources (lightning plus human-caused)? Is there a burned area minimum that fires must meet to be included in this wildfire dataset? Please describe this dataset more.
- Line 164: Why were the months April – September selected?
- Figure 3: The authors are using burned area from the European Space Agency Climate Change Initiative (ESA CCI) product version 5.1.1cds but noting it as fire occurrence and number of fires. Can the authors describe how this was done with the burned area product? Is this the most appropriate comparison of burned area to number of fires in the official statistics? What is the original spatial resolution and what is lost when aggregated to 0.25 degrees?
- Line 230: Can the authors explain how snow cover was used? Especially since the model was limited to monthly values from April to September over the period 2001–2019.
- Line 235: The land cover data and fraction of burnable area are not well described. Which land covers? Why were those chosen? Are all vegetation types included?
- Line 241-242: Can the authors provide citations for this statement (and for Norway and Sweden, specifically): “We chose FWI because it is developed for boreal forests and because it is used for fire danger forecasts in large parts of Fennoscandia (Norway and Sweden).”
- Figure 6: Should readers interpret Figure 6 as the only important variable to be soil moisture anomalies in the layer 7-28 cm? It would be helpful for the authors to spend more time explaining why this figure is important for creating a data-driven model, i.e., variable selection.
- Table 1: Should NDVI be included in this as a potential predictor?
- Figure 8: The red-blue scheme is not colorblind safe. Can the authors change these figures to make them colorblind safe? Tools like colorbrewer can help.
- Figure 8: At first look, a reader may think that the fire danger probability maps did not perform well, especially compared to the satellite-based fire occurrence (which is really burned area dataset). Using the active fire products from MODIS or VIIRS may provide a better match than the burned area. Further, consider changing the title and better explaining fire danger in the Introduction so that interpretation of the Results is more straightforward.
- Figure 9: Same comment as for Figure 8. Is this colorblind safe? The colors chosen are hard to interpret, particularly in Figure 9c.
- Line 500: Most of the figures and results in the manuscript highlight the importance of swvl2_anomaly only. The manuscript needs to better describe the input and importance of other variables.
- Line 535: The authors need better evidence to say that reanalysis products are helpful when what was used in this study is mainly reanalysis.
- Conclusions: Since the subsurface soil layers are the best predictors, can the authors provide some description of this dataset and the uncertainties / validation of the product? This is not described in section 2.3.3.
- The authors have not shared the data or code and these should be provided. How was this study conducted? In R? In MATLAB? Please provide these details.
Citation: https://doi.org/10.5194/nhess-2021-384-RC2
AC2: 'Reply on RC2', Sigrid Joergensen Bakke, 16 May 2022
AC: Many thanks for your time and efforts in evaluating our manuscript. We highly appreciate your positive and constructive feedback. In the following, we would like to respond to the comments.
RC2: This manuscript uses machine learning methods to predict fire danger in Fennoscandia at approximately 0.25 degree spatial scale for 2001-2019. Here, the authors are using official statistics compared to MODIS burned area, with predicted fire danger probability models compared to the results from the Canadian Fire Weather Index. The method is novel and the comparison is rigorous, but the data and approach need to be explained more – and at times even cited better – to assess the efficacy of the model.
RC2: In general, this manuscript needs to be revised in order to understand why this method may be useful for predicting fire danger probabilities.
AC: We will state our motivation more clearly in the abstract. In the manuscript body, we believe the usefulness of the method is sufficiently justified. The reasoning is introduced, discussed and concluded; it links the background and the objectives (line 99-111) in the introduction, it is discussed in line 589-592 and Sect. 4.5, and it is emphasised in the conclusion (line 637-640 and line 660-664).
RC2: First, the authors should explain what fire danger is as opposed to fire occurrence.
AC: We agree that we should clarify the difference between fire danger and fire occurrence. We will find a suitable place in the introduction to make this clarification.
RC2: Second, why are burned area data used as ‘fire occurrence’ when satellite-based active fire detections are available?
AC: The burned area data are used to get a binary fire/no-fire dataset based on the same resolution as found for many global climate models, to see if a data-driven model is able to make predictions of a dataset existing at this spatial scale. The active fire products detect burning at the time of overpass given relatively cloud-free conditions, which can be a problem for parts of Fennoscandia that are seldom cloud-free. The burned area product is considered less sensitive to cloud cover and is also more reliable as it has a longer temporal influence. Further, by detecting the structural consequences of fires, the burned area product has a more direct relevance to climate-relevant consequences, such as albedo and ecosystem functioning. We will make a comment on this in the revised manuscript. We acknowledge that more analyses comparing different target datasets would be an interesting continuation of our study. We made one such comparison of the target dataset by including a fire record of Norway. We chose this over a satellite-based active fire detection dataset because it clearly separates each fire occurrence from others and all known occurrences are registered regardless of the (e.g. heat) signal captured by the satellite.
RC2: Finally, the manuscript does not fully describe many of the datasets used, including where to obtain them and what their uncertainties are.
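The step from burned area to a binary fire/no-fire target can be sketched as below. This is a minimal illustration and not the authors' code: the function name `to_fire_occurrence`, the dictionary keyed by (year, month, cell), the zero threshold, and the example values are all assumptions.

```python
from typing import Dict, Tuple

# Key: (year, month, cell_id); value: burned area in that grid cell and month (m^2).
# The data layout and the threshold are illustrative assumptions.
CellMonth = Tuple[int, int, str]


def to_fire_occurrence(
    burned_area: Dict[CellMonth, float], threshold: float = 0.0
) -> Dict[CellMonth, int]:
    """Label a cell-month 1 (fire) if its burned area exceeds the threshold, else 0."""
    return {key: int(area > threshold) for key, area in burned_area.items()}


# Made-up burned area values, for illustration only.
example = {
    (2018, 7, "cell_a"): 125000.0,  # some burned area detected -> fire
    (2018, 7, "cell_b"): 0.0,       # nothing burned -> no fire
    (2017, 7, "cell_a"): 0.0,
}
occurrence = to_fire_occurrence(example)
print(occurrence)
```

A binary label of this kind records only whether something burned in a cell during a month, not how many separate fires occurred or how large they were, which is relevant to the comparison with the Norwegian fire record.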
AC: We will carefully check the data section and add details, explanations and citations when lacking. We will include information of their uncertainties (as available), in particular for the Norwegian fire occurrence dataset. Most datasets are established datasets with known validation studies. We will add well-known uncertainties in the revised manuscript.
RC2: Finally, the results seem to indicate that a single shallow soil moisture variable is driving the predictions (which is not usually considered in fire danger modeling like FWI). A major revision and resubmission is recommended.
AC: The results indicate that a shallow soil moisture variable is the dominant predictor, but that it is not sufficient alone to make a good prediction (emphasised e.g. in line 500-502). As you state, soil moisture is usually not considered in fire weather indices such as the FWI (this is commented on in general terms in line 637-638).
Specific comments:
- RC2: The title is “A data-driven prediction model for Fennoscandian wildfires“ but the thesis of the paper is to produce spatiotemporally resolved fire danger probability maps – which is not quite the same as predicting wildfires. Consider revising the title to be more specific.
AC: The authors agree, and suggest the title: “A data-driven model for Fennoscandian wildfire danger” - RC2: Line 19: “which stores approx. 30% of the world’s soil carbon pool” needs a citation
AC: This is stated in the paper cited in the end of the sentence, i.e. Flannigan et al. (2009): “Boreal regions store about 30% of the world’s soil carbon pool…” - RC2: Lines 26-27: “However, to the best of our knowledge, fire studies of the European boreal zone are limited.” needs a citation.
AC: We have not found a paper stating this specifically, and the statement here is therefore based on our literature search. This is why we emphasise that it is “to the best of our knowledge” in the beginning of the sentence. - RC2: Line 144: What is the spatial resolution of a European Space Agency Climate Change Initiative (ESA145 CCI) product version 5.1.1cds? Please include that.
AC: The spatial resolution is 0.25 deg longitude/latitude. We agree that we should state this earlier in the paragraph (it is mentioned in lines 151-152), and will move the statement to the beginning of the paragraph.
- RC2: Lines 146-147: “and is based on Terra Moderate Resolution Imaging Spectroradiometer (MODIS) Reflection information” is not the correct way to write this. It should be “the reflectance product of the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor on the Terra satellite”. Can the authors please specify which reflectance information is used? Daily surface reflectance?
AC: Thank you for pointing out the correct wording; this will be corrected in the revised manuscript. The main data source is daily surface reflectance information in the red and near-infrared bands. The algorithm theoretical basis is found under documentation at the reference given (specifically http://datastore.copernicus-climate.eu/documents/satellite-fire-burned-area/D1.6.2-v1.0_ATBD_CDR_BA-FireCCI_MODIS_v5.1cds_PRODUCTS_v1.0.1.pdf, which is based on https://climate.esa.int/media/documents/Fire_cci_D2.1.3_ATBD-MODIS_v2.0.pdf). We will include this reference in the revised manuscript.
- RC2: Section 2.2 Norwegian fire occurrence dataset – the authors have not provided a citation to the dataset, where it can be accessed, and how it is collected. Are these truly wildfires or are these fires from all ignition sources (lightning plus human-caused)? Is there a burned area minimum that fires must meet to be included in this wildfire dataset? Please describe this dataset more.
AC: We assume the reviewer is referring to Sect. 2.1.2 and not 2.2 here. We will provide more details on the dataset and a citation (http://www.brannstatistikk.no) in the revised manuscript. We are unsure what you mean by “truly wildfires” (do you mean only the wildfires ignited by lightning?) as opposed to “fires from all ignition sources”. The dataset comprises all fires in grass, cultivated land, forests and uncultivated land, regardless of ignition source. We do not define wildfires by type of ignition source in our study. The data are based on the fire and rescue service reporting system in Norway (brann- og redningstjenestens rapporteringssystem; BRIS). There is no lower limit of burned area in this dataset, as it is based on fire responses of the fire department.
- RC2: Line 164: Why were the months April – September selected?
AC: The Norwegian fire occurrence dataset must cover the same months as the satellite-based fire occurrence dataset, and the reason for omitting October to March in the satellite-based fire occurrence dataset is given in lines 152-153 (few fire occurrences). We will clarify this in the revised manuscript.
- RC2: Figure 3: The authors are using burned area from the European Space Agency Climate Change Initiative (ESA CCI) product version 5.1.1cds but noting it as fire occurrence and number of fires. Can the authors describe how this was done with the burned area product?
AC: The transition from burned area to fire occurrence is explained in Sect. 2.1.1 (lines 147-154), and the transition from the national record to the Norwegian fire occurrence dataset is described in Sect. 2.1.2 (lines 161-163).
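For readers following the exchange, the general idea of turning a burned area product into gridded monthly fire occurrence can be sketched as below. This is only an illustrative sketch, not the authors' procedure (which is given in Sect. 2.1.1 of the manuscript): the rule that a cell-month counts as an occurrence when its summed burned area exceeds a threshold is an assumption, and the names `cell_index`, `monthly_occurrence`, `min_area_m2` and the sample pixels are hypothetical. Only the 0.25 deg grid, the monthly time step and the 250 m original resolution are taken from the discussion.

```python
import math
from collections import defaultdict

CELL_DEG = 0.25  # grid spacing stated in the discussion


def cell_index(lat, lon, cell_deg=CELL_DEG):
    """Return the integer (row, col) of the regular grid cell containing a point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def monthly_occurrence(pixels, min_area_m2=0.0):
    """Aggregate burned pixels to binary fire occurrence per cell and month.

    pixels: iterable of (year, month, lat, lon, burned_area_m2).
    Assumed rule (hypothetical): a cell-month is an occurrence when its
    summed burned area exceeds min_area_m2.
    """
    totals = defaultdict(float)
    for year, month, lat, lon, area in pixels:
        totals[(year, month) + cell_index(lat, lon)] += area
    return {key: 1 for key, total in totals.items() if total > min_area_m2}


# Hypothetical sample: each burned pixel is 250 m x 250 m = 62 500 m2.
pixels = [
    (2018, 7, 60.10, 10.05, 62500.0),
    (2018, 7, 60.20, 10.20, 62500.0),  # falls in the same 0.25 deg cell as the first
    (2018, 7, 61.30, 12.40, 62500.0),
    (2018, 8, 60.10, 10.05, 0.0),      # no burned area, so no occurrence
]
occurrence = monthly_occurrence(pixels)
```

In this sketch the two July pixels that share a cell collapse into a single occurrence, which is one concrete form of the information lost when moving from 250 m pixels to a 0.25 deg monthly grid.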
RC2: Is this the most appropriate comparison of burned area to number of fires in the official statistics? What is the original spatial resolution and what is lost when aggregated to 0.25 degrees?
AC: Neither dataset is directly comparable to the number of fires in official statistics because both are aggregated in space and time. The aim of the study is not to make the datasets directly comparable to official statistics, but rather to see whether a data-driven model is able to predict fire occurrences at the spatiotemporal resolution used in the study (0.25 deg regular grid and monthly time step). The original spatial resolution of the burned area product is 250 m. We have not evaluated what is lost when aggregating to 0.25 degrees, as the aggregated version is an established, verified and publicly available dataset. However, known uncertainties in the different datasets applied will be commented on when they are introduced.
- RC2: Line 230: Can the authors explain how snow cover was used? Especially since the model was limited to monthly values from April to September over the period 2001–2019.
AC: The (fractional) snow cover is a continuous variable describing the fraction of a given grid cell covered by snow at a given time step, and was used as a potential predictor. Our study region covers high latitudes and altitudes, and snow cover is present in some grid cells and months also in the period analysed.
- RC2: Line 235: The land cover data and fraction of burnable area are not well described. Which land covers? Why were those chosen? Are all vegetation types included?
AC: Because the dataset is publicly available, we do not elaborate in detail on the choices made in its creation. However, we will consider commenting in general on its uncertainties along with key references. We have added a reference to the dataset for interested readers to look up (line 237).
- RC2: Lines 241-242: Can the authors provide citations for this statement (and for Norway and Sweden, specifically): “We chose FWI because it is developed for boreal forests and because it is used for fire danger forecasts in large parts of Fennoscandia (Norway and Sweden).”
AC: We will provide citations for this statement in the revised manuscript. Norway: https://skogbrannfare.met.no/, Sweden: https://www.smhi.se/brandrisk. We will add “Canadian”, i.e. “…is developed for (Canadian) boreal forests…”, to clarify that it was not originally developed for Fennoscandia.
- RC2: Figure 6: Should readers interpret Figure 6 as showing that the only important variable is the soil moisture anomaly in the 7-28 cm layer? It would be helpful for the authors to spend more time explaining why this figure is important for creating a data-driven model, i.e., variable selection.
AC: No, Figure 6 should not be interpreted this way. The figure shows the importances of the subset of predictors used in the final data-driven model, and therefore shows rather the opposite: multiple predictors are important, and a model based on the soil moisture anomaly alone would not perform well. This is further emphasised by Figure S1, which shows that model performance decreases as the number of predictors is reduced, and by Figure 7, which illustrates that swvl2_anomaly alone is not a sufficient predictor. See e.g. lines 500-503. This figure is not an input to creating the data-driven model; rather, it is a result of the final data-driven model.
- RC2: Table 1: Should NDVI be included in this as a potential predictor?
AC: We considered including NDVI in this table, but decided not to because the NDVI experiments were performed separately from the main analysis.
- RC2: Figure 8: The red-blue scheme is not colorblind safe. Can the authors change these figures to make them colorblind safe? Tools like colorbrewer can help.
AC: We tested the figures for colour blindness using https://www.color-blindness.com/coblis-color-blindness-simulator/ and the app “Color Blind Pal”. We did not find any difficulty with this figure for the different colour blind views. Given your comment, we wonder if we have overlooked a colour blind view. If so, please let us know for which colour blind view this figure is a problem, so that we can correct it. It is of high priority to us to make the figures interpretable for all colour views.
- RC2: Figure 8: At first look, a reader may think that the fire danger probability maps did not perform well, especially compared to the satellite-based fire occurrence (which is really a burned area dataset). Using the active fire products from MODIS or VIIRS may provide a better match than the burned area. Further, consider changing the title and better explaining fire danger in the Introduction so that interpretation of the Results is more straightforward.
AC: The satellite-based fire occurrence dataset is used to construct the model, which is why we use this dataset in Figure 8. We are unsure whether the active fire products would provide a better match, as the main aim is to map regions of fire danger probability and not to predict fire occurrences as such. Rather, the absence of fire occurrences is likely related to the lack of an ignition source. However, regions with fire occurrences are often mapped with high probability, indicating a good model prediction. We agree that the title is unclear, and suggest revising it as stated in our answer to your comment 1. We will carefully go through the description of fire danger and clarify the text to ensure it is well understood.
- RC2: Figure 9: Same comment as for Figure 8. Is this colorblind safe? The colors chosen are hard to interpret, particularly in Figure 9c.
AC: We tested the figure for colour blindness using https://www.color-blindness.com/coblis-color-blindness-simulator/ and could not find an issue with the colours for the different colour visions. The same colour scale is used for all three maps to ease the comparison. Figure 9c shows high correlations (above 0.8 for the whole study domain) and is thus represented by only two of the colours from the scale.
- RC2: Line 500: Most of the figures and results in the manuscript highlight the importance of swvl2_anomaly only. The manuscript needs to better describe the input and importance of other variables.
AC: We disagree that most figures and results highlight the importance of swvl2_anomaly only. There is only one figure (Fig. 7) in which swvl2_anomaly is the only predictor shown. This is justified by the relatively high importance of this predictor compared to the other predictors, as shown in Fig. 6. All other figures relating to the predictors show either all potential predictors (Fig. 4) or all selected predictors of a given model (Fig. 6, S3b, S4b, S5b and S13b). All input variables are described in Sect. 2.3, and the selected variables other than swvl2_anomaly are discussed in lines 503-535, in line with your comment. We will carefully go through the text and add information on predictors as needed for the interpretation of the overall results. However, we believe that commenting in more detail on the role of all predictors, including those of less importance, would lengthen an already long text and divert attention from the key findings.
- RC2: Lines 535: The authors need better evidence to say that reanalysis products are helpful when what was used in this study is mainly reanalysis.
AC: Stating that “the use of reanalysis products is useful” does not imply that it is useful compared to another alternative, but simply that reanalysis products can be used to construct a well-performing model. We will consider making it clear that other types of data can be useful as alternatives and that this was not tested in our study.
- RC2: Conclusions: Since the subsurface soil layers are the best predictors, can the authors provide some description of this dataset and the uncertainties / validation of the product? This is not described in section 2.3.3.
AC: We agree that this can be valuable information, and will search the literature for descriptions, uncertainties and validation of the dataset and include the relevant information we find.
- RC2: The authors have not shared the data or code and these should be provided. How was this study conducted? In R? In MATLAB? Please provide these details.
AC: The data are openly available online, except for the details concerning the Norwegian fire dataset, for which we have provided the source. We will add a ‘code and data availability’ section at the end, following the Copernicus template, where we will repeat the information on data availability given in the data section and acknowledgements. We support the general efforts to make code used in publications available to make analyses reproducible. Unfortunately, the code is not in a state appropriate for sharing. However, we will add “Code is available upon reasonable request from the corresponding author” under the ‘code and data availability’ section. Here, we will also state that we used R for the SPI and SPEI calculations, and Python otherwise.
Citation: https://doi.org/10.5194/nhess-2021-384-AC2
- RC2: The title is “A data-driven prediction model for Fennoscandian wildfires” but the thesis of the paper is to produce spatiotemporally resolved fire danger probability maps – which is not quite the same as predicting wildfires. Consider revising the title to be more specific.