the Creative Commons Attribution 4.0 License.
An ensemble of state-of-the-art ash dispersion models: towards probabilistic forecasts to increase the resilience of air traffic against volcanic eruptions
Barbara Scherllin-Pirscher
Delia Arnold Arias
Rocio Baro
Guillaume Bigeard
Luca Bugliaro
Ana Carvalho
Laaziz El Amraoui
Kurt Eschbacher
Marcus Hirtl
Christian Maurer
Marie D. Mulder
Dennis Piontek
Lennart Robertson
Carl-Herbert Rokitansky
Fritz Zobl
Raimund Zopp
Download
- Final revised paper (published on 05 Oct 2021)
- Supplement to the final revised paper
- Preprint (discussion started on 09 Apr 2021)
- Supplement to the preprint
Interactive discussion
Status: closed
-
RC1: 'Comment on nhess-2021-96', Tatjana Bolic, 04 May 2021
The authors could consider re-writing the Introductory section as it is jumping from ATM to ash forecast to costs in a circular manner.
The authors cite the ICAO - Volcanic Ash Contingency Plan - European and North Atlantic Regions. The plan spells out that in these regions, the air traffic management performed by air navigation service providers should not impose flight restrictions, unless in the very close proximity of the erupting volcano. The latest version specifies that the airlines are those that choose how to address this hazard. If they have their Safety Risk Assessment (SRA) for operations in the presence of volcanic ash accepted by the appropriate authority, then air traffic control cannot restrict their flight plans, if submitted in accordance with the SRA. Based on this, I would suggest that the authors change the references from the "conservative ATM" (i.e. line 410) or ATM to flight planning or airline choices, as this is currently the case.
In the attached pdf, a few small corrections and some typos are indicated.
-
AC1: 'Reply on RC1', Matthieu Plu, 05 Aug 2021
The authors thank RC1 for her positive evaluation of the manuscript and for her comments about the text.
“The authors could consider re-writing the Introductory section as it is jumping from ATM to ash forecast to costs in a circular manner.”
The introduction has been re-written and is now organized following the plan: Impacts of volcanic eruption on air traffic, warnings and general aspects of decision-making, ash forecasts state-of-the-art, ensemble approach, probabilistic approach, and cost/loss rationale for ATM. We expect this introduction plan to be clearer and more straightforward for the reader. A new introduction is proposed as an enclosed pdf file.
“[…] Based on this, I would suggest the authors to change the references from the
"conservative ATM" (i.e. line 410) or ATM to the flight planning or airline choices, as this is currently the case.”
A general comment on the fact that this is airline choice has been added: “The ICAO (2016) plan and latest versions spell out that the airlines are those that choose how to address the volcanic ash hazard, provided they have their safety risk assessment for operations in the presence of volcanic ash accepted by appropriate authority.”
Besides, “ATM” has been replaced by “flight planning” in this section and in many parts of the manuscript.
RC1 added some specific comments in a pdf copy of the manuscript:
- line 28: “at three vertical levels” replaced by “in the Flight Level (FL) bands FL000-200, FL200-350, FL350-550,”
- lines 37-39: the cost/loss argument has been reformulated following the new introduction plan, and the climate-related argument has been removed,
- lines 414-415: replaced by “However, the models provide useful guidance in the sense that flying above the predicted clouds and also around highly contaminated regions may be possible.”
- lines 467-468: reformulated as “so they can introduce in their risk management plan some acceptance to fly at least through regions which are below the safety-critical pollutant concentration threshold.”
Besides, the typos have been corrected.
We hope that we have addressed RC1’s comments satisfactorily and that, after implementation of these changes in the manuscript, it can be accepted for publication.
-
RC2: 'Comment on nhess-2021-96', Dr. Andreas Becker, 16 May 2021
The May 2010 Eyjafjallajökull eruption, in combination with the then-mandatory zero-ash policy applied by the ICAO, led to tremendous losses for the flight carriers on the one hand, but potentially saved thousands of passengers' lives, as it was indeed successful in preventing any flight crash related to the eruption. In the following years, ICAO revised the zero-tolerance policy and replaced it with several threshold concentrations that are tied to corresponding flight safety measures for a particular flight and to aircraft maintenance, also with regard to dose. Obviously this is an optimization problem for air traffic management, constrained by the accuracy of ash plume forecasts and re-analyses, both of which also depend on the high availability of measurements.
Understanding that this optimization problem is a moving target, this paper does not suffer from a lack of originality, which was my first reaction when reading about a multi-model ensemble study related to an eruption that happened 11 years ago. Maybe this is a hint that the paper title is somewhat misleading and could better read: “On the aid of state-of-the-art probabilistic ash plume forecast modelling to increase the resilience of air traffic against volcanic eruptions”, but I leave this with the authors. In any case, the word "tailored" should be moved forward so that it reads “A multi-model ensemble tailored for air traffic management”.
Besides these nitty-gritty comments, I have read the paper with joy and gained a lot of insight into the present-day capabilities of state-of-the-art dispersion models for the purpose of air traffic management. The multi-model ensemble is well chosen, covering the range of methodologies with regard to Eulerian and Lagrangian approaches and taking advantage of inversion modelling techniques to improve the source term along the different lead times of the forecast and re-analysis.
In general the paper is well and clearly written, and there is little ambiguity in the way the methodologies are described. However, there is some redundancy in Section 2 that could be avoided by adding one or two tables listing all model realisations along with their critical features, stating in particular where they are similar and where not. For example, FLEXPART and MATCH are pretty similar in the meteorological forcing (wind fields), at least this is my understanding. The authors could make this clear with such a comprehensive table on the model configurations.
I liked very much the comprehensive recognition of the VACOS reference observations, with the additional ±1 h pictures, which compare very well to the different realisations in Fig. 5. So why not arrange the table requested above with the same 16 cells (model configurations)?
I am fully happy with the discussion provided, in particular with regard to the compilation of the flight-based measurements of the ash plume in Table 1, in connection with the discussion of the apparently too high vertical dilution that at least the ensemble results reveal. In this context I would be curious to learn whether the Lagrangian models performed slightly better in reproducing the filaments, but that is a minor point in the overall discussion.
So I strongly recommend publication of this paper and hope that my recommendations made above, corresponding to a “minor revision” of the manuscript are swiftly addressed by the authors.
Citation: https://doi.org/10.5194/nhess-2021-96-RC2
-
AC2: 'Reply on RC2', Matthieu Plu, 05 Aug 2021
The authors thank RC2 for his positive evaluation of the manuscript and for his insightful remarks.
- RC2 suggests to modify the title of the manuscript: Maybe this is a hint, that the paper title is sort of misleading, and could better read: “On the aid of state-of-the art probabilistic ash plume forecast modelling to increase the resilience of air traffic against volcanic eruptions” but I leave this with the authors. Anyway the word tailored should be moved forward so it reads “A multi-model ensemble tailored for air traffic management”
We agree with RC2 that the title can be improved so as to: tone down the tailored/optimization aspects, better reflect the ash dispersion aspect of the article and remove the “air traffic management” terminology (as also suggested by RC1). However, the article does not consider true “forecasts”, since the meteorological input data are not forecasts (they are analyses instead). Our best suggestion for a new title based on RC2’s remark is: “An ensemble of state-of-the-art ash dispersion models: towards probabilistic forecasts to increase the resilience of air traffic against volcanic eruptions”
- However there’s some redundancy in Section 2 that could be avoided by adding one or two tables listing all model realisations along their critical features stating in particular where they are similar and where not. For example FLEXPART and MATCH are pretty similar in the meteorological forcing (wind fields) at least this is my understanding. The authors could make this clear by such a comprehensive table on the model configurations.
Following this suggestion, we propose to add in the manuscript a table that summarizes the main characteristics of the model configurations:

| Model | FLEXPART | MATCH | MOCAGE | WRF-Chem |
| --- | --- | --- | --- | --- |
| Version | 9 | 6.0 | 2018 | 4.2 |
| Horizontal resolution | n/a (Lagrangian model) | 0.1° | 0.2° | 0.1° |
| Vertical resolution | n/a (Lagrangian model) | 45 vertical levels | 47 vertical hybrid sigma-pressure levels from the surface up to 5 hPa | 47 vertical levels |
| Simulated time period | 10 to 20 May | 10 to 20 May | 10 to 20 May | 4 to 20 May |
| Meteorological input | ECMWF analyses and forecasts at 3-hourly steps | ECMWF analyses and forecasts at 3-hourly steps | ARPEGE 6-hourly analyses, interspersed with 3-hour forecasts | WRF meteorology using ECMWF analysis as initial condition, and ECMWF 6-hourly analyses, interspersed with 3-hour forecasts, as boundary conditions |
| Fine ash size bins | Centred at 4 (bin 1), 6 (bin 2), 8 (bin 3), 10 (bin 4), 12 (bin 5), 14 (bin 6), 16 (bin 7), 18 (bin 8), 25 (bin 9) µm | Bulk description physically regarded as coarse fraction (2.5 to 10 µm) | 0.98 to 1.95 µm (bin 1), 1.95 to 3.91 µm (bin 2), 3.91 to 7.81 µm (bin 3), 7.81 to 15.63 µm (bin 4), 15.63 to 31.25 µm (bin 5), and 31.25 to 62.5 µm (bin 6) | <3.91 µm (3.5 %), 3.91 to 7.81 µm (5.0 %), 7.81 to 15.62 µm (8.0 %), 15.62 to 31.25 µm (11.0 %) |
| Fine ash distribution (a priori source terms) | 7.6 % (bin 1), 13.3 % (bin 2), 11.8 % (bin 3), 11.1 % (bin 4), 10.8 % (bin 5), 10.5 % (bin 6), 10.8 % (bin 7), 11.1 % (bin 8), 12.7 % (bin 9) | n/a (bulk description) | Non-constant: physically resolved by the FPLUME plume-rise model | <3.91 µm (3.5 %), 3.91 to 7.81 µm (5.0 %), 7.81 to 15.62 µm (8.0 %), 15.62 to 31.25 µm (11.0 %); the percentage refers to the amount of total ash rather than that of fine ash |
| Fine ash distribution (a posteriori source terms) | Same as a priori source term | n/a (bulk description) | 0.05 % (bin 1), 0.25 % (bin 2), 3.2 % (bin 3), 25 % (bin 4), 71.5 % (bin 5), 0.0 % (bin 6) | <3.91 µm (3.5 %), 3.91 to 7.81 µm (5.0 %), 7.81 to 15.62 µm (8.0 %), 15.62 to 31.25 µm (11.0 %); the percentage refers to the amount of total ash rather than that of fine ash |
Some text describing the models remains, particularly regarding the representation of transport and of ash processes, but without redundancy with the information that is provided in the table.
Besides, a single statement has been added about the post-processing of model outputs: “All model outputs are post-processed every hour, on the same 0.1° latitude-longitude grid, by interpolating the values horizontally, and on 13 vertical layers, by calculating the mean concentration of fine volcanic ash between the corresponding FLs.”
We hope that we have addressed RC2’s comments satisfactorily and that, after implementation of these changes in the manuscript, it can be accepted for publication.
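As an illustration of the vertical part of this post-processing, the following minimal sketch averages a concentration profile between two flight levels. It is not the authors' code: the function name, the trapezoidal weighting and the sample profile are assumptions made for the example, and flight levels are converted to geometric height with the nominal 1 FL = 100 ft = 30.48 m, whereas real flight levels are pressure altitudes.

```python
import numpy as np

FT_PER_FL = 100.0   # one flight level is nominally 100 ft
M_PER_FT = 0.3048   # metres per foot

def layer_mean(z_m, conc, fl_bottom, fl_top):
    """Mean concentration between two flight levels (hypothetical helper).

    z_m  : increasing model level heights in metres
    conc : concentration at those heights (any unit)
    """
    z_lo = fl_bottom * FT_PER_FL * M_PER_FT
    z_hi = fl_top * FT_PER_FL * M_PER_FT
    # Use the layer bounds plus every model level that lies inside the layer
    inside = z_m[(z_m > z_lo) & (z_m < z_hi)]
    z = np.concatenate(([z_lo], inside, [z_hi]))
    c = np.interp(z, z_m, conc)
    # Trapezoidal integral over the layer divided by the layer thickness
    integral = np.sum(0.5 * (c[1:] + c[:-1]) * np.diff(z))
    return integral / (z_hi - z_lo)

# Made-up profile: concentration increasing linearly with height
z_levels = np.arange(0.0, 20001.0, 2000.0)
linear_profile = z_levels.copy()
mean_fl200_350 = layer_mean(z_levels, linear_profile, 200, 350)
```

For a linear profile the layer mean equals the value at the layer midpoint, which gives a quick sanity check of the averaging.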
Citation: https://doi.org/10.5194/nhess-2021-96-AC2
-
RC3: 'Comment on nhess-2021-96', Claire Witham, 17 May 2021
General Comments
The paper presents a multi-model study of a short period of the Eyjafjallajokull eruption in May 2010. As such it is a very limited study, but it highlights some useful challenges with using and comparing a multi-model ensemble for probabilistic volcanic ash forecasting. Some further details on the source terms used would be helpful for the reader to understand the differences between the model outputs. The authors apply the Fractions Skill Score as their measure of model skill, but I have some serious concerns about the way it has been applied here. The WRF-CHEM a priori simulation is clearly much worse than the other models, but this is really not reflected in the FSS. I believe the application of the pixel filtering is biasing the FSS results to suggest the models are performing better than they are. The use of the 99th percentile is also not mathematically justified. With some revisions to the statistics and data displayed, this would be a more balanced and hence interesting paper for the community.
Specific comments
Line 23: The statement “predefined procedures during such events were missing” is factually incorrect. There were very well established procedures in place laid down by the International Civil Aviation Organisation which can be found in the Handbook on International Airways Volcano Watch (IAVW). This statement needs to be deleted or amended. It is true that these procedures did not cover hazardous concentration levels.
Line 116: Please could the authors provide further details on the size distribution, in particular the size range. This is important in order to understand differences between the models. From the description of it being aerosol the implication is that the sizes are small (<2.5 um) compared to FLEXPART, but this may not be the case.
Line 131-133: It would be much more helpful to the reader if the size bins could be given in their metric equivalent rather than the phi scale. This would allow direct comparison with the other models. I believe this is effectively saying that the size range included is 0.1 – 62.5 um diameter, implying the largest particles are 3 times larger than those in FLEXPART.
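For readers unfamiliar with the phi scale, the conversion can be sketched as follows, assuming the standard Krumbein definition d = 2^(-phi) mm is the one meant in the manuscript. The function name is hypothetical. Under that assumption, integer phi values from 4 to 10 give diameters from 62.5 µm down to about 0.98 µm.

```python
def phi_to_um(phi):
    """Convert a Krumbein phi value to a particle diameter in micrometres.

    Assumes the standard definition d = 2**(-phi) millimetres.
    """
    return 1000.0 * 2.0 ** (-phi)

# Diameters for integer phi values 4..10 (largest to smallest particles)
diameters_um = {phi: phi_to_um(phi) for phi in range(4, 11)}
for phi, d in diameters_um.items():
    print(f"phi = {phi:2d}  ->  d = {d:.2f} um")
```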
Line 168 and Line 171: As there was no umbrella cloud present during this phase of the Eyjafjallajokull eruption, please could the authors provide more details on how the models work. Perhaps this is just a poor use of terminology and the use of “umbrella” needs to be changed to e.g. “maximum plume height”.
Throughout this whole section there is no information about the mass erupted. This seems a strange omission. As the mass eruption rate (or total mass) is one of the fundamental source parameters, it needs to be mentioned. For example, is the same mass applied in each case even though different particle size ranges (and hence different fractions of the total ash mass) are being represented? If not, then the implications of using different masses (and particle sizes) need to be discussed.
It seems unfortunate to have used different a priori source terms in the four models. Presumably this is because these are the data that were available. It would still be helpful if the authors could justify why they did not rerun the models with the same a priori. It is hardly surprising that the a posteriori ensemble outperforms the a priori ensemble as it contains less uncertainty.
The WRF-CHEM output is a clear outlier in Fig. 5. I think the authors need to provide some explanation as to why this is. From Figure 1, it doesn’t appear that the WRF-CHEM simulation is emitting more mass than the other simulations, but it clearly contains more mass in the other figures. The Fig. 1 caption says “the source terms of fine ash”… Please explain in the text what this means. I also wonder if this is the issue, i.e. is additional non-fine ash being emitted in the WRF-CHEM run? Also, why does this simulation start 6 days before the others? Does this mean it contains extra sources and hence extra ash compared to the others? This is why it's important to provide full details of the mass and particle sizes as mentioned above.
Line 267: Can I check that the value of 0.2 g/m2 is correct here? Based on the plots in Figure 3, this appears to be a large area, rather than the highest contamination area. Perhaps the colour scale in Figure 3 is what is not helpful for the interpretation here, as 0.2 is white, which also appears to be the colour of no ash. A different plotting scale is needed if that is the case.
Following on from this, a more detailed description of what is meant by “For each model output, the G grid points with the highest ash concentration in the domain are kept for further analysis and used to calculate the FSS.” is needed. Firstly, I assume that total column load, not "concentration" was used? Second, it implies that a different set of G grid points is derived compared to those determined from the satellite data. This is the approach used by Harvey and Dacre, but this is not at all obvious in this paragraph for those unfamiliar with the FSS.
I have some serious reservations about the current application of the FSS approach to compare the different models. Particularly when the paper is aimed at quantitative model output. As noted above, the WRF-CHEM a priori data is a clear outlier, but using this approach the FSS is only slightly worse. I understand the aim of applying the normalisation, but this does not sit well with me and I think gives a widely spread model a better score than it should. For this work, I would suggest that the application of the FSS would be more scientifically rigorous if a threshold approach was used. The obvious choice would be to apply the same 0.2 mg/m2 threshold to each model. This would be a true measure of the quantitative spatial skill of the model and allow a better comparison both between models and between the a priori and a posteriori results. Currently the statistics are very misleading.
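The difference between the two masking choices discussed here can be illustrated with a minimal sketch. It uses the usual Fractions Skill Score definition, FSS = 1 - sum((O - M)^2) / (sum(O^2) + sum(M^2)), where O and M are neighbourhood fractions of the binary observed and modelled fields. Everything else is an assumption made for the example and not the authors' implementation: the helper names, the square neighbourhood with zero padding, taking G as the number of observed pixels above the threshold, and the synthetic fields (units are arbitrary).

```python
import numpy as np

def fractions(binary, n):
    """Fraction of flagged pixels in the n x n neighbourhood (n odd, zero padding)."""
    pad = n // 2
    padded = np.pad(binary.astype(float), pad, mode="constant")
    # Summed-area table with a leading row and column of zeros
    sat = np.pad(padded, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    h, w = binary.shape
    total = (sat[n:n + h, n:n + w] - sat[:h, n:n + w]
             - sat[n:n + h, :w] + sat[:h, :w])
    return total / (n * n)

def fss(obs_mask, mod_mask, n):
    """Fractions Skill Score of two binary masks for neighbourhood size n."""
    o, m = fractions(obs_mask, n), fractions(mod_mask, n)
    denom = np.sum(o ** 2) + np.sum(m ** 2)
    return np.nan if denom == 0 else 1.0 - np.sum((o - m) ** 2) / denom

def mask_top_g(field, g):
    """Pixel filtering: flag the g grid points with the highest values."""
    flat = field.ravel()
    mask = np.zeros(flat.size, dtype=bool)
    mask[np.argpartition(flat, -g)[-g:]] = True
    return mask.reshape(field.shape)

# Synthetic case: compact observed cloud, model with the same core
# but far too much ash spread over the whole domain
obs = np.zeros((20, 20))
obs[8:12, 8:12] = 1.0
model = 3.0 * obs + 0.5

threshold = 0.2
obs_mask = obs >= threshold
g = int(obs_mask.sum())

fss_top_g = fss(obs_mask, mask_top_g(model, g), n=5)     # normalised masks
fss_threshold = fss(obs_mask, model >= threshold, n=5)   # same threshold for both
print(fss_top_g, fss_threshold)
```

In this constructed case the top-G mask of the model coincides with the observed mask, so the overestimated and widely spread model field is not penalised, whereas the fixed threshold flags the whole domain and lowers the score.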
Line 289: FLEXPART appears to perform worse with the a posteriori according to these FSS. Are the numbers correct? If they are, then this aspect should be discussed.
Line 298: It would be helpful to discuss that the reason for these differences in ash load for the a priori runs is that different masses are used in the sources. That they look more similar in the a posteriori runs is partly due to using the same mass.
Line 307: The Marenco et al measurements were taken along a NW-SE trending line over the UK on the 16 May, so it’s unfortunate that the authors have chosen to use a SW-NE trending line for their cross-sections. There is a suggestion in Fig 6 that the FLEXPART and MATCH modelled ash over the UK is indeed at altitude and so the authors may be being overly critical of their results. It is hard to see clearly in Fig 6. Ideally the authors would produce new plots with a NW-SE cross-section. If this is not possible, then it would be helpful if the longitudes of the UK coastline could be marked on the x-axes, as this would allow a better comparison to the Marenco results. Because the y axes are in pressure coordinates it would also be helpful to provide the corresponding pressure values in the text alongside the “4 and 6 km” reference.
Line 335: The 99% percentile has no meaning for a 4 member ensemble (valid values are 0, 25, 50 and 100) and strictly does not apply for a 12 member ensemble either. I recommend the authors use 100% for a meaningful statistic rather than 99%. This is unlikely to affect their results.
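The granularity issue can be made concrete with a small sketch. The four member values are made up, and NumPy's default linear interpolation is only an assumption about how a percentile might be computed, not the authors' procedure. With four members, the fraction of members exceeding a threshold can only be a multiple of 1/4, and the 99th percentile is an interpolated value between the two largest members, while the 100th percentile is simply the ensemble maximum.

```python
import numpy as np

def exceedance_fraction(members, threshold):
    """Fraction of ensemble members at or above a threshold (hypothetical helper)."""
    return float(np.mean(np.asarray(members) >= threshold))

# Made-up concentrations from a 4-member ensemble at one grid point
members = np.array([0.1, 0.5, 2.5, 5.0])

# Only multiples of 1/4 can occur, whatever the threshold
possible = sorted({exceedance_fraction(members, t) for t in (0.0, 0.2, 1.0, 3.0, 6.0)})

p99 = float(np.percentile(members, 99))    # interpolated between 2.5 and 5.0
p100 = float(np.percentile(members, 100))  # ensemble maximum
print(possible, p99, p100)
```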
Line 428: “To estimate the long-term damage due to high ash dose, we recommend using the median of the ensemble as this gives the best estimate of ash distribution.” No scientific justification or proof of this statement is provided, therefore it is just conjecture. I recommend this sentence is deleted.
Technical Corrections
Line 22: Use of English: “forced to cancel” replace by “forced the cancellation of”
Line 25: “type” should be “types”
Line 28-29: These thresholds are incorrect. The contamination levels for which information is provided by the two VAACs are 0.2-2mg/m3, 2-4mg/m3 and >4mg/m3. Please correct.
Line 34: please clarify if “medium ash concentration” referred to here is the same as the medium contamination level specified earlier in the paragraph
Line 35: the use of “shorter intervals” is ambiguous here, it could refer to shorter exposure intervals, it would be helpful to be specific “shorter maintenance intervals”
Line 45: Recommend changing “can” to “could” as this has not yet been demonstrated in practice in a real event. There are other factors that would also need to be considered, such as traffic volume.
Line 88: change “regardless the” to “regardless of the”
Figure 3 and Figure 5: The colour bar captions say “Total ash concentration”. This needs to be changed to “Ash column load”.
Line 318: change “30-years” to “30-year”
Line 334: change “in in” to “in”
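The contamination bands quoted in the correction for Line 28-29 (0.2-2 mg/m3, 2-4 mg/m3 and >4 mg/m3) can be applied to a concentration field as in the following minimal sketch. The function name, the band labels and the handling of values lying exactly on a boundary (assigned to the higher band) are assumptions made for the example.

```python
import numpy as np

# Band edges in mg/m3, as quoted above
BOUNDS_MG_M3 = [0.2, 2.0, 4.0]
LABELS = ["below 0.2", "0.2-2", "2-4", ">4"]

def contamination_band(conc_mg_m3):
    """Return the band label for each concentration value (hypothetical helper)."""
    # np.digitize gives the index i with BOUNDS[i-1] <= x < BOUNDS[i]
    idx = np.atleast_1d(np.digitize(conc_mg_m3, BOUNDS_MG_M3))
    return [LABELS[i] for i in idx]

bands = contamination_band([0.1, 1.0, 3.0, 5.0])
print(bands)
```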
Citation: https://doi.org/10.5194/nhess-2021-96-RC3
-
AC4: 'Reply on RC3', Matthieu Plu, 05 Aug 2021
The authors thank RC3 for her insightful comments and constructive recommendations to improve the manuscript.
Since our answers include some new plots to support explanations, they are all included in the enclosed pdf file.
We hope that we have addressed RC3’s comments satisfactorily and that, after implementation of these changes in the manuscript, it can be accepted for publication.
-
RC4: 'Comment on nhess-2021-96', Ole Ross, 18 May 2021
General remarks
Improvement of volcanic ash dispersion forecasts is still highly relevant for civil aviation and the potential benefit of ensemble use is worth being investigated. Although the authors deal with a rather small and diverse ensemble the study is generally well designed. The manuscript is easily readable and understandable. The presented figures show the results nicely. I recommend publication after consideration of the referee comments.
Specific comments
Could you please comment on the following aspects for my personal understanding (not necessarily in the article if you think it is obvious for the community):
- The general purpose of spin-up phases is clear, although I thought about differences regarding their importance between meteorology models and externally driven LPDMs. Could you please comment how you chose the spin-up phase of 3 days in the specific situation? Have you looked internally for any features/differences of the models during spin up evolvement?
- The a posteriori source term was generated by inversion of satellite observations using flexpart, at least partially driven also by ECMWF analyses. The real plume somehow connected the satellite observations and the other measurements used in the present study. The ATM is connected by at least one similar setup, whereas the a posteriori source term seems to me rather free and independent. Could you briefly give some explanation why to assume “validity” of the a posteriori source term is justified in light of this potential model self-consistency issue?
Minor review points
L38
although it is of course true that longer routes have enhanced environmental and climate impact I cannot imagine that this consideration plays any significant role in the airline’s decision on ad hoc rerouting. This is probably purely about safety / economic rationale (including maintenance costs and passenger rights compensations).
L52
reconsider the expression “perfect models” – that they cannot be reached in “near future” is not only highly probable, it is systematically certain. Perhaps “with sufficient accuracy” or similar expression for reliable correctness / precise plume representation.
L103
ECMWF’s vertical resolution of lower sigma-hybrid levels is surface pressure dependent, insert e.g. approximately/about/roughly
Citation: https://doi.org/10.5194/nhess-2021-96-RC4
-
AC3: 'Reply on RC4', Matthieu Plu, 05 Aug 2021
The authors thank RC4 for his positive evaluation of the manuscript and for his insightful remarks.
The general purpose of spin-up phases is clear, although I thought about differences
regarding their importance between meteorology models and externally driven LPDMs.
Could you please comment how you chose the spin-up phase of 3 days in the specific
situation? Have you looked internally for any features/differences of the models during
spin up evolvement?
The main objective of the spin-up is to ensure that ash from the most recent phase of the eruption is present in the domain. Some first simulations starting from 13th May showed that starting from zero concentrations degraded the scores. From 9th to 13th May, the emission was rather low and constant. Considering the size of the domain and the intensity of ash emission, 3 days is a reasonable time frame to allow realistic background ash concentrations in the domain for the model. WRF-Chem used a longer spin-up (4th to 13th May, i.e., a 9-day spin-up). Differences between WRF-Chem and the other models were not attributed to different spin-up lengths. An argument for justification of the spin-up length has been added to the manuscript.
The a posteriori source term was generated by inversion of satellite observations using
flexpart, at least partially driven also by ECMWF analyses. The real plume somehow
connected the satellite observations and the other measurements used in the present
study. The ATM is connected by at least one similar setup, whereas the a posteriori
source term seems to me rather free and independent. Could you briefly give some
explanation why to assume “validity” of the a posteriori source term is justified in light
of this potential model self-consistency issue?
This comment raises many points. We presume that the two main aspects that deserve to be considered behind this issue are: 1/ potential self-consistency between FLEXPART used for source-term inversion and as a model in the study, 2/ potential self-consistency between ash load used for source-term inversion and for verification.
1/ We do not see a systematically better performance of the FLEXPART a posteriori run compared to the other a posteriori runs. So there is no clear advantage of having used FLEXPART for the inversion. As noted in the manuscript, sharing the source term computed from one model with the others provides good results.
2/ The ash load measurements used for the inversion are based on IASI and SEVIRI, and the retrievals (Stohl et al., 2011) use a look-up table approach in combination with a correction for atmospheric water vapor (based on Yu et al., 2011) and a prior detection scheme using threshold tests for the brightness temperature difference between 10.9 μm and 12 μm and an opacity test. The reference ash load data used in the manuscript come from VACOS, the only available algorithm based on neural networks combined with simulated ash observations. VACOS uses all TIR channels (6.2 μm, 7.3 μm, 8.7 μm, 9.7 μm, 10.8 μm, 12 μm, 13.4 μm), compared to using only 10.9 μm and 12 μm in Stohl et al. (2011). On top of these, there are many other scientific differences between the retrieval algorithms (training data, assumptions about ash properties, etc.) that make the VACOS data retrievals independent from the data used
Minor review points
L38 although it is of course true that longer routes have enhanced environmental and climate
impact I cannot imagine that this consideration plays any significant role in the airline’s
decision on ad hoc rerouting. This is probably purely about safety / economic rationale
(including maintenance costs and passenger rights compensations).
The climate-related argument has been removed.
L52 reconsider the expression “perfect models” – that they cannot be reached in “near future”
is not only highly probable, it is systematically certain. Perhaps “with sufficient accuracy”
or similar expression for reliable correctness / precise plume representation.
The part of the sentence now states: “[…], it is highly probable that predictions with sufficient accuracy cannot be reached in a near future.”
L103 ECMWF’s vertical resolution of lower sigma-hybrid levels is surface pressure dependent,
insert e.g. approximately/about/roughly
Due to the simplification of the description of models (see answer to RC2), this has been removed from the manuscript.
We hope that we have addressed RC4’s comments satisfactorily and that, after implementation of these changes in the manuscript, it can be accepted for publication.
Citation: https://doi.org/10.5194/nhess-2021-96-AC3