21 Mar 2023
Status: this preprint is currently under review for the journal NHESS.

Short-term prediction of extreme sea-level at the Baltic Sea coast by Random Forests

Kai Bellinghausen, Birgit Hünicke, and Eduardo Zorita

Abstract. We have designed a machine-learning method to predict the occurrence of extreme sea levels at the Baltic Sea coast with lead times of a few days. The method is based on a Random Forest classifier and uses sea-level pressure, surface wind, precipitation, and the prefilling state of the Baltic Sea as predictors of whether the daily sea level exceeds the 95 % quantile at seven tide-gauge stations representative of the Baltic coast.
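As an illustration of the classification target described above, the following minimal sketch labels each day as extreme or not by comparing daily sea level with its 95 % quantile, and pairs predictors on day t with the label on day t + lead. The function names, the linear-interpolation quantile, and the use of the full series to compute the threshold are assumptions made for this example; the abstract does not specify these details.

```python
import math


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list (0 <= q <= 1)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def label_extremes(sea_level, q=0.95):
    """Return (labels, threshold); a label is 1 where sea level exceeds the quantile."""
    threshold = quantile(sea_level, q)
    labels = [1 if h > threshold else 0 for h in sea_level]
    return labels, threshold


def pair_with_lead(predictors, labels, lead_days):
    """Pair the predictor on day t with the label on day t + lead_days."""
    if lead_days < 0:
        raise ValueError("lead_days must be non-negative")
    if lead_days == 0:
        return list(zip(predictors, labels))
    return list(zip(predictors[:-lead_days], labels[lead_days:]))


# Hypothetical daily sea-level series (arbitrary units), for illustration only.
levels = [float(i) for i in range(100)]
labels, threshold = label_extremes(levels)
pairs = pair_with_lead(levels, labels, lead_days=3)
```

The resulting (predictor, label) pairs are the kind of input a binary classifier such as a Random Forest would be trained on; in practice each predictor would be a vector of gridded fields rather than a single number.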

The method is purely data-driven and is trained with sea-level data from the Global Extreme Sea Level Analysis (GESLA) data set and meteorological data from the ERA5 reanalysis of the European Centre for Medium-Range Weather Forecasts. These records cover the period from 1960 to 2020; one part is used to train the classifier and another part to estimate its out-of-sample prediction skill.
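The separation into a training part and an evaluation part can be sketched as follows. The abstract does not say how the records are divided, so both the chronological ordering of the split and the 75 % training fraction are assumptions chosen for illustration, and `chronological_split` is a hypothetical helper.

```python
def chronological_split(samples, train_fraction=0.75):
    """Split time-ordered samples into an earlier training part and a later test part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Years covered by the records, 1960 to 2020 inclusive.
years = list(range(1960, 2021))
train_years, test_years = chronological_split(years, train_fraction=0.75)
```

Keeping the test period strictly later than the training period is one common way to avoid leaking information between neighbouring days when estimating out-of-sample skill.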

The method satisfactorily predicts the occurrence of sea-level extremes at lead times of up to 3 days and identifies the relevant predictor regions. The sensitivity, measured as the proportion of correctly predicted extremes, is of the order of 70 %, depending on the station. The proportion of false warnings, related to the specificity of the predictions, is typically as low as 10 to 20 %. For lead times longer than 3 days, the predictive skill degrades; at 7 days, it is comparable to that of a random prediction.
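The skill measures named above can be computed from a confusion matrix, as in the minimal sketch below. Sensitivity is taken as TP / (TP + FN) and specificity as TN / (TN + FP). Reading the "proportion of false warnings" as the false-positive rate, 1 minus specificity, is an assumption of this example; it could also refer to the share of issued warnings that were false. The toy observation and prediction series are invented for illustration.

```python
def confusion_counts(observed, predicted):
    """Count true/false positives and negatives for binary (0/1) series."""
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 0 and pred == 1:
            fp += 1
        elif obs == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def skill_scores(observed, predicted):
    """Return sensitivity, specificity and false-positive rate."""
    tp, fp, tn, fn = confusion_counts(observed, predicted)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1.0 - specificity,
    }


# Toy example: 3 observed extremes in 10 days, 2 of them predicted, 1 false warning.
observed = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = skill_scores(observed, predicted)
```

Because extremes above the 95 % quantile are rare by construction, sensitivity and specificity are more informative here than plain accuracy, which a classifier could inflate by never issuing a warning.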

The importance of each predictor depends on the location of the tide gauge. Usually, the most relevant predictors are sea-level pressure, surface wind, and prefilling. Extreme sea levels in the northern Baltic are better predicted by surface pressure and the meridional surface wind component. By contrast, for stations located in the south, the most relevant predictors are surface pressure and the zonal wind component. Precipitation was not a relevant predictor for any of the stations analysed.

The Random Forest classifier does not need to be very complex, and the computing time to issue predictions is typically a few minutes. The method can therefore be used as a pre-warning system that triggers the application of more sophisticated algorithms to estimate the height of the ensuing extreme sea level, or as a warning to run larger ensembles with physically based numerical models.

Status: open (extended)

Short summary
The prediction of extreme coastal sea levels, e.g. those caused by storm surges, is operationally carried out with dynamical computer models. These models are expensive to run and still display some limitations in predicting the height of extremes. We present a successful, purely data-driven machine-learning model to predict extreme sea levels along the Baltic Sea coast a few days in advance. The method is also able to identify the critical predictors for the different Baltic Sea regions.