Articles | Volume 25, issue 11
https://doi.org/10.5194/nhess-25-4655-2025
© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.
Ensemble random forest for tropical cyclone tracking
Download
- Final revised paper (published on 24 Nov 2025)
- Supplement to the final revised paper
- Preprint (discussion started on 18 Mar 2025)
- Supplement to the preprint
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-252', Anonymous Referee #1, 05 May 2025
- AC1: 'Reply on RC1', Pradeebane Vaittinada Ayar, 21 Jul 2025
- RC2: 'Comment on egusphere-2025-252', Anonymous Referee #2, 20 May 2025
- AC2: 'Reply on RC2', Pradeebane Vaittinada Ayar, 21 Jul 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
ED: Reconsider after major revisions (further review by editor and referees) (05 Aug 2025) by Piero Lionello
AR by Pradeebane Vaittinada Ayar on behalf of the Authors (05 Aug 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (26 Aug 2025) by Piero Lionello
RR by Anonymous Referee #1 (11 Sep 2025)
RR by Anonymous Referee #2 (29 Sep 2025)
ED: Publish subject to minor revisions (review by editor) (05 Oct 2025) by Piero Lionello
AR by Pradeebane Vaittinada Ayar on behalf of the Authors (08 Oct 2025)
Author's response
Author's tracked changes
Manuscript
ED: Publish as is (02 Nov 2025) by Piero Lionello
AR by Pradeebane Vaittinada Ayar on behalf of the Authors (03 Nov 2025)
Manuscript
Review of “Ensemble Random Forest for Tropical Cyclone Tracking”
Overview
This work applies Random Forest (RF) models to track tropical cyclones using environmental variables from a global reanalysis (ERA5), with the eventual goal of using the RF tracker in long-running climate simulations. The Eastern Pacific and North Atlantic TC basins were chosen for investigation. Random Forests were trained by categorizing localized boxed regions in each basin as either containing a TC or TC-free, and associating statistics of ERA5 environmental variables in each box with the binary events. Mean sea level pressure, relative vorticity, column water vapor, and thickness were used because they represent different facets of the physical mechanisms underlying TCs. Statistics computed from these variables are included as inputs during RF training.
Training is conducted with 6-fold cross-validation to generate a range of RF solutions, which are then used to compute MCC, POD, and FAR over a series of subsampling experiments; the authors note that TC-free samples vastly outnumber TC samples. Generally, a ratio of 25:1 is found to be reasonable, with POD and FAR tradeoffs as the ratio is increased or decreased. Detection skill is notably better than the baseline UZ method in both basins. Further investigation of skill suggests the model primarily misses TCs of low intensity and short duration. The authors also devise analyses to interpret physical meaning, although I have some comments on this aspect of the analysis below.
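For concreteness, the three verification scores discussed here follow directly from the 2x2 contingency table of TC vs. TC-free boxes; a minimal sketch with illustrative counts (not the authors' actual numbers), reflecting the heavy class imbalance:

```python
import math

def scores(tp, fn, fp, tn):
    """POD, FAR, and MCC from a 2x2 contingency table of TC / TC-free boxes."""
    pod = tp / (tp + fn)                   # probability of detection (hit rate)
    far = fp / (tp + fp)                   # false-alarm ratio
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom      # Matthews correlation coefficient
    return pod, far, mcc

# Illustrative counts: TC boxes are rare relative to TC-free boxes.
pod, far, mcc = scores(tp=80, fn=20, fp=40, tn=1000)
print(f"POD={pod:.3f} FAR={far:.3f} MCC={mcc:.3f}")  # POD=0.800 FAR=0.333 MCC=0.702
```

Unlike POD and FAR, the MCC uses all four cells of the table, which is why it is a sensible headline score for this imbalanced problem.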
Overall, the authors have employed RFs in a novel and potentially innovative application area: tracking TCs in global reanalyses. The manuscript could benefit from improved grammar and clarity in places, along with consideration of additional analyses or methods to strengthen the scientific presentation. I look forward to seeing a carefully revised manuscript.
Comments
McGovern, A., R. Lagerquist, D. John Gagne, G. E. Jergensen, K. L. Elmore, C. R. Homeyer, and T. Smith, 2019: Making the Black Box More Transparent: Understanding the Physical Implications of Machine Learning. Bull. Amer. Meteor. Soc., 100, 2175–2199, https://doi.org/10.1175/BAMS-D-18-0195.1.
Technical Edits and Questions
Generally: the authors should spend a substantial amount of time proofreading the document for lingering grammar issues.
Line 48: Change to “this study focuses on data-driven algorithms using machine learning”. Sometimes “so-called” can have a negative/inappropriate connotation, which I don’t believe was your intent.
Lines 93-96: While I understand it is a long-held tradition to include a “table of contents paragraph” in this manner, you can remove this paragraph – it has no particular value for readers. The scientific structure of manuscripts has remained unchanged for decades and every reader knows that methods will come next, results afterward, and so on. If a reader is interested in a particular section, they can seek out the section header to know what is contained within.
Line 99: Remove this single line
Line 103: Remove “cyclonic” – seasons are not “cyclonic”. Alternatively, can adjust to “cyclone seasons”
Line 106: “Track records that do not provide”
Lines 106-107: If a TC undergoes extratropical transition, how is the transition from TC to extratropical TC handled? Also, how is TC demise to depression stage handled? Only TC achievement is mentioned here (i.e., genesis).
Line 131: Moisture is misspelled
Lines 136-138: The description here appears to contain two conflicting statements. First, the text says a vector of ones and zeros is constructed for every box: is this for every grid point in the box? The next sentence says each box is encoded as a single 1 or 0. Some additional clarity, and perhaps a rewording of these sentences, is needed. I suspect it is the latter, but the wording is confusing.
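To make the ambiguity concrete, the per-box reading would reduce each box at each time step to one binary label, something like the sketch below (the box convention and function name are my assumptions, not the authors'):

```python
def label_box(box, tc_centers):
    """One reading of the method: a box is labeled 1 if any best-track TC
    center (lat, lon) falls inside it at this time step, else 0.
    `box` is (lat_min, lat_max, lon_min, lon_max); names are hypothetical."""
    lat_min, lat_max, lon_min, lon_max = box
    return int(any(lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
                   for lat, lon in tc_centers))

centers = [(15.0, -45.0)]                       # one storm center, N. Atlantic
print(label_box((10, 20, -50, -40), centers))   # box contains the center -> 1
print(label_box((25, 35, -50, -40), centers))   # box misses the center   -> 0
```

The per-grid-point reading would instead produce a whole vector of labels for each box; the revised text should state unambiguously which of the two is meant.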
Line 134: Why are the boxes not immediately adjacent to one another? Could a TC be missed if it lies outside of the boxes in the white areas of Figure 1?
Lines 139-140: What is the motivation for synthesizing the ERA5 data in the boxes into single-statistic values? Other works have used spatial regions to encode relevant spatial relationships into RFs (see Hill et al. 2020, 2021, 2023, 2024; Schumacher et al. 2021) and have had tremendous success, including deducing how those spatially oriented data contribute to forecast skill (Mazurek et al. 2025). Others tackling severe weather hazards have taken a synthesizing approach too (see Clark and Loken 2022; Loken et al. 2022). Were there any tests that also included the full box of ERA5 data, to demonstrate that the single-value statistics were the better methodological choice?
Loken, E. D., A. J. Clark, and A. McGovern, 2022: Comparing and Interpreting Differently Designed Random Forests for Next-Day Severe Weather Hazard Prediction. Wea. Forecasting, 37, 871–899, https://doi.org/10.1175/WAF-D-21-0138.1.
Clark, A. J., and E. D. Loken, 2022: Machine Learning–Derived Severe Weather Probabilities from a Warn-on-Forecast System. Wea. Forecasting, 37, 1721–1740, https://doi.org/10.1175/WAF-D-22-0056.1.
Mazurek, A. C., A. J. Hill, R. S. Schumacher, and H. J. McDaniel, 2025: Can Ingredients-Based Forecasting Be Learned? Disentangling a Random Forest’s Severe Weather Predictions. Wea. Forecasting, 40, 237–258, https://doi.org/10.1175/WAF-D-23-0193.1.
Hill, A. J., R. S. Schumacher, and M. R. Green, 2024: Observation Definitions and their Implications in Machine Learning-based Predictions of Excessive Rainfall. Wea. Forecasting, https://doi.org/10.1175/WAF-D-24-0033.1.
Hill, A. J., R. S. Schumacher, and I. L. Jirak, 2023: A new paradigm for medium-range severe weather forecasts: probabilistic random forest-based predictions. Wea. Forecasting, https://doi.org/10.1175/WAF-D-22-0143.1.
Hill, A. J., and R. S. Schumacher, 2021: Forecasting excessive rainfall with random forests and a deterministic convection-allowing model. Wea. Forecasting, https://doi.org/10.1175/WAF-D-21-0026.1.
Schumacher, R. S., A. J. Hill, M. Klein, J. Nelson, M. Erickson, S. M. Trojniak, and G. R. Herman, 2021: From random forests to flood forecasts: A research to operations success story. Bull. Amer. Meteor. Soc., https://doi.org/10.1175/BAMS-D-20-0186.1.
Hill, A. J., G. R. Herman, and R. S. Schumacher, 2020: Forecasting severe weather with random forests. Mon. Wea. Rev., https://doi.org/10.1175/MWR-D-19-0344.1.
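To illustrate the tradeoff being asked about: collapsing each box to summary statistics versus retaining the full gridded field changes the predictor vector substantially. A toy sketch, assuming an 8x8 box per variable (the function names and statistics chosen are mine, for illustration only):

```python
import statistics

def stat_features(field):
    """Collapse one variable's box of grid points to a few summary
    statistics, as the manuscript (I believe) does."""
    vals = [v for row in field for v in row]
    return [min(vals), max(vals), statistics.mean(vals), statistics.pstdev(vals)]

def full_features(field):
    """Alternative: keep every grid point, preserving spatial structure
    (the Hill et al. style of predictor construction)."""
    return [v for row in field for v in row]

# A toy 8x8 box for one variable (e.g., relative vorticity); values arbitrary.
box = [[(i * 8 + j) * 0.1 for j in range(8)] for i in range(8)]
print(len(stat_features(box)))   # 4 predictors per variable per box
print(len(full_features(box)))   # 64 predictors per variable per box
```

A side-by-side test of the two feature sets would directly answer whether the single-value statistics discard spatial information the RF could otherwise exploit.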
Lines 147-148: This sentence is not needed – can be removed. All of this information is contained in the section headers.
Lines 174-175: To be consistent with both the machine learning and atmospheric science literature, the "calibration" phase should be referred to as the "training" phase of the ERF. You then use cross-validation to validate the trained model on withheld periods; you don't use those withheld periods to "calibrate" the models.
Line 188: Should RF actually be ERF?
Line 188: Did you consider alternative probability thresholds (beyond just 50%) to assign detected tracks (D)?
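A simple threshold sweep would expose the POD/FAR tradeoff directly. A sketch with made-up ensemble probabilities and labels (`pod_far` is my illustration, not the authors' implementation):

```python
def pod_far(probs, labels, threshold):
    """POD and FAR when tracks are assigned wherever the ensemble
    probability meets `threshold` (illustrative only)."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    pod = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (tp + fp) if (tp + fp) else 0.0
    return pod, far

probs = [0.1, 0.2, 0.4, 0.55, 0.6, 0.7, 0.9, 0.95]   # made-up probabilities
labels = [0, 0, 1, 0, 1, 1, 1, 1]                     # made-up truth
for thr in (0.3, 0.5, 0.7):
    pod, far = pod_far(probs, labels, thr)
    print(f"thr={thr}: POD={pod:.2f} FAR={far:.2f}")
```

Raising the threshold trades POD for FAR; reporting the curve (or choosing the threshold that maximizes MCC) would be more informative than fixing 50% a priori.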
Lines 251-253: This text is best reserved for the figure caption – please move there if not already. This text is just describing the figure, not the science.
Figure 3: It would be good to see the full distribution of MCC scores for the 100 RFs plotted as error bars, akin to a 95% confidence interval. Are the MCC values truly statistically indistinguishable? (It is hard to tell, but perhaps this detail is plotted as the light blue lines? If so, please make these lines clearer so they can be discerned, and describe them in the figure caption.)
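A percentile-based interval across the 100 RFs is cheap to compute and would settle whether the MCC differences fall within sampling noise. A sketch with synthetic scores (the real values would come from the authors' Figure 3):

```python
import random

# Stand-in MCC scores for the 100 cross-validation RFs (synthetic).
rng = random.Random(0)
mcc_scores = sorted(rng.gauss(0.70, 0.02) for _ in range(100))

# Percentile-based 95% interval: drop the lowest and highest ~2.5% of scores.
lower = mcc_scores[2]     # ~2.5th percentile of 100 sorted values
upper = mcc_scores[97]    # ~97.5th percentile
median = (mcc_scores[49] + mcc_scores[50]) / 2
print(f"median={median:.3f}, 95% interval=[{lower:.3f}, {upper:.3f}]")
```

If the intervals for the different subsampling ratios overlap substantially, the text should soften any claims that one ratio outperforms another.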
Lines 273-274: What is meant by “calibration experiments”? Are you just evaluating the model’s ability to detect storms over the testing period for which it was trained? It is to be expected that POD will be high and FAR low.
Lines 283-284: Isn't a missed track, by definition, lower probability? Aren't hits/misses defined by probabilities greater than or less than 50%? The box plots in Figure 5b are largely constrained by the methods used and don't necessarily provide much scientific support for "FA are less likely to happen than hits". The authors should reconsider the usefulness of this analysis in light of their methodological choices.
Lines 320-322: As mentioned earlier, they are also prescribed by the authors, so these results are not extremely surprising. See major comment above.
Lines 348-349: This information is once again best reserved for the figure caption.
Figure 10: This is an excellent figure that clearly demonstrates how the RFs are learning the relevance of each predictor to drive the yes/no predictions.