Articles | Volume 23, issue 7
Research article
05 Jul 2023
Research article |  | 05 Jul 2023

Bayesian hierarchical modelling of sea-level extremes in the Finnish coastal region

Olle Räty, Marko Laine, Ulpu Leijala, Jani Särkkä, and Milla M. Johansson

Occurrence probabilities of extreme sea levels required in coastal planning, e.g. for calculating design floods, have been traditionally estimated individually at each tide-gauge location. However, these estimates include uncertainties, as sea-level observations typically have only a small number of extreme cases such as annual maxima. Moreover, exact information on sea-level extremes between the tide-gauge locations and incorporation of dependencies between the adjacent stations is often lacking in the analysis.

In this study, we use Bayesian hierarchical modelling to estimate return levels of annual maxima of short-term sea-level variations related to storm surges in the Finnish coastal region. We use the generalised extreme value (GEV) distribution as the basis and compare three hierarchical model structures of different complexity against tide-gauge-specific fits. The hierarchical model structures allow us to share information on annual maximum sea levels between the neighbouring stations and also provide a natural way to estimate uncertainties in the theoretical estimates.

The results show that compared to the tide-gauge-specific fits, the hierarchical models, which pool information across the tide gauges, provide narrower uncertainty ranges for both the posterior parameter estimates and the corresponding return levels in most locations. The estimated shape parameter of the GEV model is systematically negative for the hierarchical models, which indicates a Weibull type of behaviour for the extremes along the Finnish coast. The negative shape parameter also allows us to calculate the theoretical upper limit for the annual maximum sea levels on the Finnish coast. Depending on the tide gauge and hierarchical model considered, the median value of the theoretical upper limit was 47–73 cm higher than the highest observed sea level.

1 Introduction

Extreme sea-level phenomena (waves, storm surges, tides, etc.) together with the rising mean sea level introduce hazards both to people and to coastal infrastructure by causing migration, loss of functionality and biodiversity and decreasing our living habitat. In the Baltic Sea region, sea-level extremes are directly associated with such hazards as coastal erosion and flooding (e.g. Rutgersson et al.2022; Weisse et al.2021). Recent studies have shown that the increase in the mean sea level has generally exceeded the global average during the past 50 years in the Baltic Sea (Weisse et al.2021), with some local exceptions from this trend (e.g. Männikus et al.2020). It is foreseen that the main drivers for long-term changes in the extreme sea levels are changes in the relative mean sea level and atmospheric conditions. For these reasons, reliable projections of extreme sea levels are key tools for supporting coastal planning and safety in regional scales. For example, estimation of low-probability–high-consequence events such as extremely high sea levels is important for nuclear power plant safety in Finland (Jylhä et al.2018).

The Finnish coast is surrounded by the Baltic Sea – an intra-continental small sea that is connected to the Atlantic Ocean via the narrow and shallow Danish straits (Leppäranta and Myrberg2009). The unique geography of the Baltic Sea as well as various local and global processes govern the sea-level variations on the Finnish coast. On short temporal scales, wind and air pressure are the main factors inducing local sea-level fluctuations along the Finnish coast. In addition, wind-generated waves, seiche oscillations (standing waves inside the Baltic Sea basin) and meteotsunamis (e.g. Pellikka et al.2020) (meteorologically induced long, shallow-water waves) alter the sea level locally on a short timescale. Tides (e.g. Medvedev et al.2013; Särkkä et al.2017) and ice have a marginal influence on the sea-level variations on the Finnish coast compared to other forcing factors. In a longer-term perspective, the in- and outflow of the water from the Danish straits controls the total water balance in the Baltic Sea. Other important long-term elements are the mean sea-level change reflecting the behaviour of global sea-level rise, as well as post-glacial land uplift, which originates from the pressure release of the crust after the last glacial era. The largest sea-level variations on the Finnish coastline take place in the ends of the Bay of Bothnia and the Gulf of Finland due to the piling up effect and standing wave oscillations within the bay (Jönsson et al.2008).

Previous studies have addressed coastal flooding risks in the present-day and future climatic conditions at the Finnish tide-gauge locations. Recently, Pellikka et al. (2018) estimated future coastal flooding risks by combining mean sea-level projections with the short-term sea-level variability. Leijala et al. (2018) examined the total water level on the shore due to simultaneous effect of sea level and wind waves. The physical limits of simulated extreme sea levels were investigated in hydrodynamic modelling exercises by Särkkä et al. (2017) on the Helsinki coastal area, using winds from regional climate model simulations, and by Averkiev and Klevannyy (2010) for the whole Gulf of Finland, with idealised wind fields as forcing. Wolski et al. (2014) and Wolski and Wiśniewski (2020) showed that there are large geographical variations in the observed sea-level extremes in the Baltic Sea and that they tend to increase towards the ends of the bays in the northern parts of the Baltic Sea domain (see Fig. 1). Johansson et al. (2001) inspected temporal variations in the extreme sea levels on the Finnish coast and concluded that a general increase has occurred in the annual maxima during the 20th century. Rapid sea-level oscillations due to air-pressure disturbance, i.e. meteorological tsunami waves, have also been studied on the Finnish coast by concentrating on their occurrence in the Gulf of Finland (Pellikka et al.2020).

Figure 1Detrended time series of the annual maximum sea level at each tide-gauge location selected for this study. The bathymetric data used in the background map are taken from the ETOPO1 database (Amante and Eakins2009) using the marmap package in R (Pante and Simon-Bouhet2013; Pante et al.2021).

The theory on extreme value analysis is well documented in the literature (e.g. Coles2001). If block maxima, such as the annual maximum sea levels, are considered, extreme value theory tells us that the generalised extreme value distribution (GEV) is the only possible limiting distribution. GEV has been used to model return levels of annual maximum sea levels in the Baltic Sea region in many previous studies (e.g. Ribeiro et al.2014; Wolski et al.2014; Marcos and Woodworth2017; Soomere et al.2018; Kudryavtseva et al.2021). Most of these studies, however, have concentrated on individual tide-gauge locations. As the time series of sea-level extremes are relatively short, they generally do not allow reliable estimation of the occurrence probabilities of very rare events (e.g. return period >1000 years). Furthermore, to be able to construct predictions for sea-level extremes outside the tide-gauge locations, information on their spatial dependencies and variations is needed. For example, Soomere et al. (2018) inspected spatial variations of GEV parameters along the Estonian coast using ocean model data as the basis for extreme value analysis.

Issues with the limited sample size can be partly alleviated by pooling information on sea-level extremes across the neighbouring tide gauges, which reduces uncertainty on the parameter estimates. One approach to pool information across multiple tide gauges is based on regional frequency analysis (RFA) (Dalrymple1960; Hosking and Wallis1997). In this approach, a reasonably homogeneous region is searched using some similarity criterion, tide gauges within the region are normalised locally using flood-index and then pooled together before fitting a single extreme value distribution to the pooled data. This approach has been successfully applied to storm surges (e.g. Bardet et al.2011; Bernardara et al.2011). However, the results obtained with RFA might contain inconsistencies between different domains. Also, the method does not directly account for spatial dependencies between the tide gauges within the whole target region.

The Bayesian hierarchical modelling approach allows more flexibly for incorporating spatial and other dependencies in statistical models (e.g. Gelman et al.2013). In our case, this means that tide-gauge-specific GEV parameters are described jointly at the population level, for example, assuming that they come from the same joint hyperdistribution. This hyperdistribution has its own hyperparameters which need to be either specified from data or modelled by adding an additional layer to the model. Bayesian hierarchical models have been used to estimate extremes in other geoscientific fields (e.g. Cooley et al.2007), but there are relatively few examples of their use when modelling sea-level extremes. Coles and Tawn (2005) made one of the first attempts to model storm surges within the Bayesian framework. Calafat and Marcos (2020) developed a novel hierarchical model to account for spatial dependencies on storm surge extremes over the North Sea and British Isles. They showed that the Bayesian hierarchical modelling approach allows one to estimate occurrence probabilities of sea-level extremes outside gauge locations and to quantify uncertainty in the predicted sea levels.

This paper applies previous efforts on statistical modelling of sea-level extremes on the Finnish coastal region. Our aim is to assess how Bayesian hierarchical modelling – implemented in a simpler manner compared to Calafat and Marcos (2020) – performs in comparison to tide-gauge-specific models when estimating theoretical return levels for extremes related to short-term sea-level variations. Three hierarchical models are included, two of which take the spatial distance between the tide gauges into account in their model formulations. We perform a series of tests with the hierarchical models and evaluate their performance against the observed sea-level extremes. Furthermore, we illustrate the theoretical return level estimates obtained with the hierarchical models and their differences with respect to tide-gauge-specific estimates. We exclude long-term changes from the analysis and focus on short-term sea-level variations controlled by meteorological factors. As we concentrate solely on stationary conditions, the potential time dependence of GEV parameters is not assessed.

The paper is structured in the following manner. In Sect. 2, we describe the tide-gauge observations from the Finnish coast. In Sect. 3, we introduce the methods utilised in this paper, which is then followed by a summary of the main outcomes of the study in Sect. 4. In Sect. 5, we discuss the shortcomings of this study and potential avenues for future research. The paper is closed with conclusions in Sect. 6.

2 Tide-gauge records on the Finnish coast

The Finnish Meteorological Institute (FMI) maintains and collects sea-level observations from 14 tide gauges along the Finnish coast. Most of the Finnish tide gauges were constructed in the 1920s, meaning that they have more than 90 years of sea-level records currently available. The available time series are shorter only for the Rauma tide gauge (built in 1933) and for the Porvoo tide gauge (built in 2014). The longest time series providing over 100 years of data are available from Hanko (the first tide gauge in Finland, built in 1887) and Helsinki (tide gauge built in 1904). Two tide gauges were excluded from further analysis either due to the too-short length of time series for extreme value analysis (Porvoo) or the differing behaviour of the annual maximum sea levels in the tide-gauge location in the Finnish archipelago (Föglö) compared to the coastal tide gauges, which was not captured by our simple hierarchical modelling approach.

The data set used here consists of annual maximum sea levels. Annual maxima were calculated from monthly maximum sea levels which were extracted either from the original continuous paper recordings or later from the digital data at 1 min resolution. Before the year 1939, the values have been calculated from 4-hourly observations. Although there have been changes in the temporal resolution of the observations, we decided to include the earliest years in our study in order to use the longest possible time series in our analysis. The measured sea levels have been converted to the height system N2000 from fixed tide-gauge-specific reference systems. N2000 is the Finnish realisation of the European Vertical Reference System, and the datum has been derived from Normaal Amsterdams Peil (NAP) (Saaranen et al.2009). All time series extend to the end of the year 2020, with some gaps in them due to missing data. To reduce the impact of long-term changes caused by climate change and post-glacial land uplift to the extreme value analysis, a linear trend was calculated separately for each tide gauge based on 4-hourly observations and then subtracted from the annual maximum values.

Studies in the Baltic Sea region have suggested that the annual maximum sea levels should be calculated for a year-long block that covers all winter months, instead of using the calendar year due to seasonality in storminess (e.g. Johansson et al.2001; Suursaar et al.2002; Soomere et al.2018; Kudryavtseva et al.2021). We took this approach and calculated the annual maximum values between 1 April and 31 March. This way the maximum value from each winter period was selected, which removes the correlation between the annual maximum values. Furthermore, those years which had more than one monthly maximum value missing from the winter half of the year were filtered out. The length of the pre-processed time series varied between 87 and 124 years. These are illustrated in Fig. 1.

The maximum sea level varies between 36 and 198 cm in the pre-processed time series. From Fig. 1, it is observed that the highest sea levels are typically observed in Kemi and Hamina, which are the tide gauges located closest to the ends of the Gulf of Bothnia and the Gulf of Finland. Time series for these tide gauges also exhibit the highest year-to-year variability. These features have some implications for the statistical model design, as discussed in the next section. We note that the highest annual maximum values used here differ to a certain extent from the reported record heights in Finland (e.g. Wolski and Wiśniewski2020) due to detrending and, as they have been previously defined against a different reference, the so-called theoretical mean sea level (Johansson et al.2003).

Some individual values stand out from the time series panels. For example, markedly high sea levels were observed in January 1984 on some tide gauges located on the western coast of Finland. Such exceptional cases are of interest to us because they are likely to some extent to affect our statistical modelling results. We will briefly discuss the sensitivity of our GEV models for anomalously high observed sea levels in Sect. 4, using the aforementioned case as an example.

3 Methods

3.1 Extreme value analysis for annual maximum sea levels

We briefly summarise the main properties of the generalised extreme value distribution (GEV) applied to the extreme sea levels before discussing the chosen modelling approaches. We refer to Coles (2001) for more details on GEV distribution and its statistical properties. Let Yi with i=1,,12 be a random variable describing the annual maximum sea level (i.e. block maxima) at the ith tide gauge. The extreme value theorem states that for the normalised maxima of a sequence of independent and identically distributed random variables the GEV distribution is the only possible limiting distribution. GEV is a suitable model in our case, as the detrended annual maxima can be considered to be independent of each other. The cumulative distributions function of GEV can be written as

(1) G ( y ; μ , σ , ξ ) = exp - 1 + ξ y - μ σ + - 1 / ξ .

In Eq. (1), μ∈ℝ is the location parameter, σ>0 is the scale parameter and ξ∈ℝ denotes the shape parameter. The tail behaviour of GEV distribution is strongly controlled by the shape parameter. If ξ<0, y has an upper limit at μ-ξ/σ, while in other cases the upper tail is unbounded and has either exponential (ξ=0) or polynomial (ξ>0) decay.

The quantiles of GEV distribution are obtained by inverting Eq. (1):

(2) y p = μ - σ ξ 1 - - log ( 1 - p ) - ξ , for ξ 0 , μ - σ log 1 - log ( 1 - p ) , for ξ = 0 .

This equation is used to calculate the return level yp associated with a certain exceedance probability p or, equivalently, the return period T=1/p. While other approaches such as peaks over threshold could be considered (e.g. Coles2001), we used GEV, as it allows the direct modelling of annual maxima.

In the simplest case where the GEV parameters are estimated separately for each tide gauge (denoted as Separate hereafter), the model for the observations at the ith tide gauge is expressed as yiGEV(μi,σi,ξi). We use Bayesian methods to estimate the distribution parameters. As we are particularly interested in robustly quantifying uncertainty in the GEV parameters and in the predicted return levels, the Bayesian approach provides a natural way to do this. Bayes' theorem states that the posterior distribution of model parameters given observations y is

(3) p ( θ | y ) = p ( θ ) p ( y | θ ) p ( θ ) p ( y | θ ) d θ .

Here, p(θ) is the prior distribution for the parameters and p(y|θ) the likelihood function. The symbol θ denotes a vector that contains all the unknown parameters of the model. As the integral in the denominator in Eq. (3) does not generally have a closed-form solution for our hierarchical models, we use Markov chain Monte Carlo (MCMC) methods to sample from the posterior parameter distribution p(θ|y). The implementation details such as the selected prior distributions for the parameters p(θ) (and for the hyperparameters in hierarchical modelling cases) are provided in the Supplement.

3.2 Hierarchical models for GEV parameters

Tide-gauge-specific parameter estimates tend to be uncertain, as we usually have relatively few observations at our disposal. One way to include more information is to assume that the model parameters at different tide-gauge locations are similar. A hierarchical approach is to assume that their values are bound together with a joint prior distribution, whose parameters are not assumed to be known but estimated along the individual gauge specific parameters. For pooling information across the tide gauges, we tested three hierarchical formulations for the GEV distribution parameters. The first model (denoted as Common hereafter) is a simple extension to the separate fitting of GEV distribution at the tide-gauge locations. The model is written as

(4) y i GEV μ i , σ i , ξ i , i = 1 , , 12 μ i N μ μ , σ μ 2 σ i N + μ σ , σ σ 2 ξ i N μ ξ , σ ξ 2 ,

where 𝒩+ denotes the half-normal distribution. This model applies partial pooling to the data with the assumption that the tide-gauge-specific distribution parameters come from the same joint Gaussian distribution, but it simultaneously allows different parameter values to be estimated for individual tide gauges. In addition to the tide-gauge-specific GEV parameters, we have six additional hyperparameters (μμ, σμ, μσ, σσ, μξ, σξ) that tie the individual parameters together. As the hyperparameters are unknown and estimated from the data, we need to assign further prior distributions to them (see the Supplement for more details).

3.3 GEV parameter hierarchy using splines

Common does not account for possible spatial dependencies in the sea-level extremes between the tide gauges. As was mentioned previously, the northern part of the Baltic Sea consists of sub-basins like the Gulf of Bothnia and the Gulf of Finland. This geometry strongly regulates the sea-level variations in these regions (the so-called bay effect), and the magnitude of sea-level extremes systematically increases towards the end of the two bays (Fig. 1). We used the distance d calculated roughly along the coast between the tide-gauge locations such that dKemi=0 as covariate information when modelling the GEV parameters. In these models, the vector for the GEV parameters is expressed as θ=μ(di),σ(di),ξi. Thus, we assume the shape parameter ξ does not depend on d but still gets its own value at each tide gauge, similarly as in Sect. 3.2. One benefit of this approach is that the model provides estimates of the GEV parameters and consequently return level estimates between the tide-gauge locations.

We tested two approaches to incorporate the distance dependence in our hierarchical models. In the first one, the location and scale parameter are modelled using B-splines (de Boor1978) (denoted as Spline hereafter). B-splines are defined by the degree p of the basis function polynomials and a non-decreasing, here equally spaced, set of knots t=t1,,tr. We have set r=10. Spline functions are then constructed as a linear combination of B-spline basis functions. The model formulation for Spline is

(5) y i GEV j = 1 m α j B j ( d i ) , j = 1 m β j B j ( d i ) , ξ i ,

where m=r+2 is the number of cubic (p=3) B-splines, Bj are the B-spline basis functions and αj and βj are the spline coefficients to be estimated from the data. To avoid overfitting, we used cubic B-splines with first order random walk priors for the spline coefficients (e.g. Eilers and Marx1996; Lang and Brezger2004).

3.4 GEV parameter hierarchy using Gaussian processes

In the third hierarchical model, spatial dependence for the GEV parameters is accounted for using a Gaussian process (GP) prior. The model is written as

(6) y i GEV f μ ( d i ) , f σ ( d i ) , ξ i f μ G P m μ , K μ f σ G P m σ , K σ .

In Eq. (6), mμ and mσ are the mean, and Kμ and Kσ are the covariance functions for μ and σ, respectively. As we have a finite number of tide-gauge locations, this amounts to modelling both priors as multidimensional Gaussian distributions. To obtain smoothness in the neighbouring tide-gauge estimates for the GEV parameters, we used squared exponential covariance function in our model:

(7) K d i , d j | α , ρ = α 2 exp - 1 2 d i - d j 2 ρ 2 ,

where di and dj are tide-gauge locations defined by their distances to the reference station at Kemi. Furthermore, α is the spatial standard deviation and ρ is the characteristic length scale, both of which are estimated from the data.

All models were fitted using MCMC simulations by R version 4.1.2 (R Core Team2021) and Stan probabilistic programming language (Gabry and Češnovar2021; Stan Development Team2020, 2023). Stan implements a variant of the Hamiltonian Monte Carlo algorithm, the so-called No-U-Turn Sampler (NUTS), which has been shown to perform well in fitting complex hierarchical models (e.g. Calafat and Marcos2020). For each model, MCMC simulations were done with four parallel chains over 3000 iterations. The first 1000 iterations were removed as the burn-in period. Thus, the total number of draws obtained from the posterior distribution was 8000.

Figure 2 shows an example of how the Spline and GP fits for the location and scale parameter vary along the tide-gauge locations from Kemi to Hamina. Overall, both methods provide smooth, yet flexible, fits between the stations. Also the uncertainty estimates given by these methods are very similar. The similarity of the obtained fits is not surprising, as the formulation of the distance dependence is similar for both the Spline and GP model. Note that we do not attempt to extrapolate beyond Kemi or Hamina, as the resulting parameter estimates would be highly uncertain. The bottom row in Fig. 2 illustrates what the spatial 50-year return level estimates look like for both models. The shape parameter ξ has been sampled from the joint posterior distribution of the tide-gauge-specific parameter values when drawing these plots. It is seen that the spatial return level estimates vary smoothly along the coast and match relatively closely with the tide-gauge-specific estimates of these models. In the following, we concentrate on the tide-gauge-specific estimates, as it allows us to compare the results of all four models.

Figure 2Illustration of the (a, c, e) spline and (b, d, f) Gaussian process fits for the (a, b) location and (c, d) scale parameter with respect to the distance from the Kemi tide gauge. Darker (light) shading denotes the interquartile (95 %) range of the parameter estimates. The figure also shows box plots of the corresponding parameters at the tide-gauge locations. The bottom row shows similar plots for the 50-year return level of annual maximum sea levels for both models. The box covers the interquartile range (IQR), and the median value is highlighted by the horizontal line. The length of the whiskers is 1.5 times the IQR.


4 Results

We first evaluate the goodness of fit of the hierarchical models against the observations and compare their performance against the tide-gauge-specific GEV fits. We then have a more in-depth look at the simulated posterior parameter distributions for the hierarchical models and illustrate how the return level estimates differ between Separate and the hierarchical models. We also briefly discuss whether some theoretical limits for the annual maximum sea level could be inferred from the estimated parameters.

4.1 Evaluation of model fit against observations

To obtain some insights on model adequacy and the goodness of the separate and hierarchical model fits, quantile–quantile (QQ) plots calculated against the observed return levels are shown in Fig. 3. As the QQ plots are very similar for all hierarchical models, the results are shown only for the Spline model. Visual inspection of the tide-gauge-specific panels shows that both GEV models fit reasonably well with the data in most tide gauges, although the observational estimates tend to be outside the 95 % uncertainty range in some cases. Figure 3 also shows that although the median results are close to each other for these models the uncertainty range is indeed larger for Separate in many locations. Figure 4 shows the probability–probability (PP) plots for the same models. While the panels do not reveal any major discrepancies from the observed probabilities in most tide gauges, the match is less good for those tide gauges that are located close to the ends of the Bay of Bothnia and Gulf of Finland. In particular, the fit seems to be less good for Spline compared to Separate in Oulu. Nevertheless, both QQ and PP plots indicate that, with some tide-gauge-specific discrepancies with the observations, the models provide reasonable fits to the data.

Figure 3Quantile–quantile plots showing the modelled quantiles for tide-gauge-specific fits (Separate) and for Spline when plotted against the observed quantiles. Median values are given as points, and the shading shows the 95 % uncertainty range for the modelled quantiles. The diagonal line is also shown in each panel.


Figure 4Probability–probability plots calculated using median values of the GEV parameters for tide-gauge-specific fits (Separate) and for Spline when plotted against the observed probabilities. The diagonal line is also shown in each panel.


4.2 Leave-one-out cross-validation

To compare the relative performance of the GEV models, we performed leave-one-out cross-validation with Pareto smoothed importance sampling (PSIS-LOO) implemented in the loo package in R (Vehtari et al.2017, 2020). loo provides the expected logarithmic point-wise predictive density elpdloo for the out-of-sample predictions and also estimates the effective number of parameters ploo. While Bayesian LOO cross-validation has some limitations for model selection purposes (e.g. see discussion in Gronau and Wagenmakers2019a, b), it is nevertheless useful in our case for highlighting possible differences in the model performance.

The LOO cross-validation statistics are listed in Table 1. The first column shows the difference in elpdloo with respect to the best model, for which this value is zero, together with its standard error estimate. Spline has the highest elpdloo value out of all models, with GP having only slightly worse results. Separate has the smallest elpdloo, which indicates that the hierarchical models might provide a better fit to the observed annual extremes. The second column shows the effective number of parameters (ploo) together with the estimated standard error, whose value gives a rough estimate of model complexity. In line with Δelpdloo Spline has the smallest ploo, which is roughly half of that of Separate.

Table 1Leave-one-out cross-validation statistics for the implemented GEV models. The first column shows the difference in the expected logarithmic point-wise predictive density (elpdloo), with respect to the best performing model, and the second column the effective number of parameters. Values within the brackets are the standard errors for the estimated statistics.

Download Print Version | Download XLSX

To assess the performance of the spatial models in ungauged locations, we performed an additional experiment in which the tide gauges were left out one at time before fitting Spline and GP to the observations, and the obtained fit was used to estimate the 50-year return level in the omitted tide-gauge location. This procedure was repeated over all tide gauges, apart from Kemi and Hamina, as our models are not suitable for extrapolating the results spatially. We then calculated absolute and relative bias and mean absolute error (MAE) for the posterior median and conditional rank probability score (CRPS; Hersbach2000) for the full posterior distribution against the observed 50-year return level. The observed 50-year return level has been estimated by first calculating the exceedance probability for the observed annual maxima based on Weibull plotting positions and then interpolating the estimated probabilities between the observed values. The resulting statistics are shown in Table 2, when averaged over the 10 tide gauges. The spatial models fitted without the target tide gauge (Splineloo and GPloo) have worse statistics than the “full” models except for the model bias for Splineloo. In particular, GPloo has the largest errors apart from MAE. However, absolute differences in the error statistics to the model estimates based on the full data set are not large, which suggests that both Spline and GP are able to provide useful posterior predictions in ungauged locations.

Table 2Absolute and relative bias, mean absolute error (MAE) and conditional rank probability score (CRPS) calculated with respect to the observed 50-year return level when averaged over the tide gauges apart from Kemi and Hamina. The statistics are shown for the four models fitted using all tide-gauge records and (last two columns) for Spline and GP, when the target tide gauge has been left out of the data before fitting the models.

Download Print Version | Download XLSX

4.3 Posterior parameter distributions

We next take a look at the posterior parameter estimates obtained with MCMC. To first illustrate some of the properties of the posterior parameter distributions, bivariate parameter distributions are shown in Fig. 5 for Separate and Spline in the Kemi tide gauge. The panels in this plot show that μ and σ are negatively correlated with ξ to a certain degree for Separate, as the location and scale parameters tend to increase for more negative shape parameter values. The parameters are less strongly correlated for the Spline model, and the distribution shape is more Gaussian due to the effect of pooling. The bivariate distributions look well identified for both models, but they cover for Spline visibly narrower parameter ranges compared to Separate. Very similar results are obtained for the other hierarchical models and, therefore, are not shown here.

Figure 5Example of posterior distributions for the GEV parameters in the Kemi tide gauge obtained with (orange) Separate and (blue) Spline, respectively. Histograms of the three GEV parameters are shown on the diagonal and the corresponding bivariate scatter plots on the lower triangle. The location and scale parameter values are given in centimetres.


Figure 6 shows the posterior parameter distributions for all models. Both the location μ and scale σ parameter generally increase towards the ends of the two bays. The location parameter μ obtains its largest values in the Kemi, Oulu and Hamina tide gauges, where the median value for μ approaches or even exceeds 100 cm. The posterior median for σ is close to, or exceeds, 25 cm in the same tide gauges. Furthermore, the spread of the posterior parameter distribution tends to be largest in these tide gauges for both μ and σ. Note that the posterior parameter uncertainty in σ tends to be somewhat larger for Separate compared to the hierarchical models. Variations in μ across the tide gauges are similar for all models, but for the scale parameter σ, Separate has slightly larger overall variations compared to the hierarchical models. This could be due to the shrinkage effect, which tends to bring the station-specific parameter values of the hierarchical models towards the overall mean.

Figure 6Posterior parameter distributions for all the tested models at each tide gauge. The location and scale parameter values are given in centimetres.


In contrast with μ and σ, the shape parameter ξ does not have as clear a connection with the distance from the two bays. The overall posterior median for ξ is around −0.16 for all models. However, for Separate the posterior median tends to be more negative around the coast of the Bothnian Bay and less negative to the south from Vaasa on the coast of the Bothnian Sea. For the hierarchical models, the shape parameter values vary only weakly with the tide-gauge location, as this parameter is, in our hierarchical model formulation, less sensitive to location-specific aspects of data. Even when the uncertainty in the posterior parameter estimates is taken into account, ξ is consistently negative for all hierarchical models, although for Separate the posterior distribution of ξ has a long tail towards positive values in some locations. The typically negative shape parameter value for all models suggests that annual maximum sea levels follow a 3-parameter Weibull distribution on the Finnish coast. This result is in line with the previous study by Marcos and Woodworth (2017), which also suggested a negative shape parameter for the Finnish and the neighbouring tide gauges. A slightly contrasting result was obtained by Soomere et al. (2018), who estimated shape parameter values close to zero from an ocean model output along the Estonian coast. Furthermore, in their additional analysis based on tide-gauge observations, the shape parameter varied strongly depending on the location and method used to estimate the GEV distribution parameters. We note that as the behaviour of annual sea-level extremes is likely different on the Estonian side of Gulf of Finland, their results are not directly comparable to ours.

When ξ<0, the return level yp has an upper limit at y0=μ-σ/ξ, which could be used, at least in theory, to estimate the highest possible value that the annual maximum sea level can reach along the Finnish coast. We illustrate the upper-limit estimates in the next section.

To conclude, compared to the separate fits, the hierarchical models have a narrower uncertainty range for the scale and shape parameter. Overall, posterior parameter distributions are very similar for the hierarchical models regardless of the tide gauge. This shows that pooling information across the tide gauges narrows the uncertainty range in the posterior parameter estimates.

4.4 Posterior predictive simulations

The estimated GEV parameters shown in the previous section can be used to calculate theoretical estimates for the N-year return levels on tide-gauge locations. The estimated return levels are illustrated for all models and for two return periods in Fig. 7. The upper panel shows box plots of 50-year return levels together with the return level estimated from the observations. In most cases, the observed return level matches relatively well with the model estimates. One must keep in mind that due to the relatively short length of the available time series, the observed return level estimates are uncertain and likely affected by sampling variability.

Figure 7Example plots showing the posterior predictions for (a) 50-year and (b) 1000-year return levels for all four models. For comparison, panel (a) contains (circles) the observed 50-year return levels calculated from the observations. The boxes cover the inter-quartile range and are shown together with the median value, while the whiskers extend over the 95 % uncertainty range.


The lower panel in Fig. 7 shows estimates of much rarer 1000-year return levels. In this case, differences between Separate and the hierarchical models have become noticeably more visible compared to the 50-year return level. The median values are slightly higher for the hierarchical models on tide gauges from Oulu to Pietarsaari, while Separate has markedly higher median values in many tide gauges from Vaasa to Hamina. For Separate, the 95 % uncertainty range in the return level estimates tends to be smaller (larger) in the former (latter) locations, although the uncertainty range is in absolute terms largest for this model in all tide gauges. The hierarchical models predict very similar 1000-year return level values and also have similar uncertainty ranges in all locations. Most of the features in the return level estimates obtained with Separate are directly related to the larger uncertainty in the estimated shape parameter values, which also exhibit larger spatial variations for this model than for the hierarchical models.

To further illustrate how much the hierarchical modelling approach reduces the prediction uncertainty, Table 3 shows the standard deviation of the predicted 50-year return level in the tide-gauge locations and its ratio (in percentage) with respect to Separate for the hierarchical models. The spread is reduced in all cases and in some locations for Spline, and GP is less than 50 % of that of Separate. There are no major differences between the hierarchical models, although the reduction in the predictive uncertainty tends to be slightly smaller for Common than for the two spatial models. This supports our conclusion that the hierarchical models are able to reduce uncertainty in the posterior predictions.

Table 3Standard deviation of the predictive distribution for the 50-year return level (in mm), shown separately for each tide gauge and model. The percentage fraction of standard deviation with respect to Separate is given for the three hierarchical models in the brackets.

Download Print Version | Download XLSX

As was mentioned in Sect. 2, individual extremely high sea-level observations might markedly influence model fitting. To check this, we performed a sensitivity test in which the observation from January 1984 was left out of the time series before fitting the models to the data. Removal of this observation lead to a slightly more negative shape parameter value for Separate especially in Kaskinen and Pori, which reduced its median estimate of 1000-year return level closer to the hierarchical models in these locations (not shown). Return level estimates for the hierarchical models were only slightly changed in this test, which highlights higher sensitivity of separate model fits to anomalous observations. We note that the 1000-year return level estimates require extrapolation to such cases that are not present in the observational records. Therefore, the aforementioned results should be interpreted cautiously.

Figure 8Theoretical upper limit (μ-σ/ξ) for the annual maximum sea level in the Kemi tide gauge, inferred from the posterior parameter distributions. The results are shown for the hierarchical models only.


As discussed in the previous section, the mostly negative shape parameter suggests that, in theory, we could infer an upper limit that the sea level can reach given the assumptions on data and models are met. As a sanity check for our models, we briefly illustrate how high the values that the hierarchical models give are at the theoretical upper limit. We stress that the shown values should not be interpreted as actual limits for the sea level but, rather, as a hypothetical result provided by the hierarchical models.

The distribution of μ-σ/ξ is illustrated for the hierarchical models in the Kemi tide gauge in Fig. 8. All distributions look very similar and are positively skewed towards larger values. The median value is almost identical for all models and exceeds 260 cm. These values are 63–65 cm higher than the largest value in the observed time series, which sounds reasonable, although the very long right tail highlights the substantial uncertainty associated with the upper-level estimates.

The theoretical upper-limit estimates are summarised for the rest of the tide gauges in Table 4. Unsurprisingly, the median estimates are highest at those tide gauges, which are located closest to the end of the Bothnian Bay and Gulf of Finland, where they exceed 250 cm. As in the Kemi example, all hierarchical models give very similar results. Overall, differences between the highest observed sea levels and the theoretical upper limit vary between 47 and 73 cm, depending on the location. Interestingly, there is also a local maxima in the upper-limit estimates in Kaskinen, which is similar to the feature that was seen in the 1000-year return level estimates for Separate in Fig. 7.

Table 4Median value of the theoretical upper limit (μ-σ/ξ) of the annual maximum sea level in units of centimetres, calculated separately for each tide gauge. The 95 % credible interval is shown within the brackets. Differences to the highest observed annual sea levels are also provided for each model.

Download Print Version | Download XLSX

5 Discussion

To put the obtained results into a broader perspective, we briefly compare our results to those available in the literature from both statistical and dynamical model studies. We stress that this comparison is qualitative at best, as it is difficult to make direct comparisons due to differences in the used methods and observations.

Wolski et al. (2014) provided 50-year return level estimates based on the stationary GEV fit to observations from 1960–2010 for the Helsinki (163.8 cm) and Kemi (209.8 cm) tide gauges. These results are 25.1–31.1 cm higher than the median values for our models and closer to the estimated 1000-year return levels. This is likely due to differences in the study period and methods used. Using idealised cyclone winds as forcing in a hydrodynamic ocean model, Averkiev and Klevannyy (2010) estimated how high sea levels could be associated with strong low pressure systems in the Gulf of Finland. Their results (184 cm in Hanko and 186 cm in Helsinki) are higher than our 1000-year return level estimates for the hierarchical models and closer to their median upper-limit estimates. Särkkä et al. (2017) performed an 850-year-long simulation with a hydrodynamic ocean model, using regional climate model simulations as input. In their results, sea level could reach 225 cm height in Helsinki, which is somewhat higher but still similar to our median upper-limit estimate in this location.

As a deliberate choice in our study, long-term changes in mean sea level were excluded from the analysis by detrending the time series. Inclusion of mean sea level would be required if we were to assess the overall flooding risk related to sea-level extremes both in the present and future climate, as mean sea level has been a major driver of sea-level extremes also in the Finnish coastal region (e.g. Marcos and Woodworth2017). Furthermore, studies have suggested that future changes in the sea-level extremes are strongly associated with changes in mean sea level in the Baltic Sea region (e.g. Meier et al.2004; Gräwe and Burchard2012; Vousdoukas et al.2016, 2017). In Finland, changing mean sea level is expected to increase flooding risk along the coast of the Gulf of Finland and the Gulf of Bothnia, whereas post-glacial land uplift is likely to counter most of changes in the mean sea level in the Bothnian Bay (Pellikka et al.2018).

We have used GEV distribution to model the overall annual sea-level maxima (after removing long-term trends). However, different processes contribute to variations in sea level in the Baltic Sea and cause it to fluctuate at different temporal scales. Therefore, alternative modelling approaches could be considered to model the different sea-level fluctuations and their contributions separately. For example, Soomere et al. (2015) have used an approach in which separate statistical models were fitted to weekly scale and local storm-surge-driven sea-level fluctuations.

We also recognise some limitations in our study. Firstly, it was assumed that the time series of sea-level extremes are stationary after they have been detrended. This assumption is unlikely to be completely true. While not giving a rigorous conclusion, visual inspection of the time series in Fig. 1 suggests that changes might have occurred both in the level and variability of annual maxima. Furthermore, studies have indicated that sea-level extremes have exhibited variations in the Baltic Sea region over different timescales (Johansson et al.2001; Ribeiro et al.2014; Marcos and Woodworth2017; Kudryavtseva et al.2021) and that these variations were associated with variations in large-scale atmospheric conditions (Samuelsson and Stigebrandt1996; Johansson et al.2001; Ribeiro et al.2014; Marcos and Woodworth2017; Kudryavtseva et al.2021; Passaro et al.2021). As with changing mean sea level, future climate is also expected to bring changes in storm surges in the Baltic Sea region (e.g. Vousdoukas et al.2016, 2017, 2018). Our models do not account for non-stationarity in extremes related to short-term sea-level variations and cannot be used to assess such changes.

One remedy would be to model temporal dependence directly by allowing (for example) linear trends in the GEV parameters (e.g. Ribeiro et al.2014; Marcos and Woodworth2017; Kudryavtseva et al.2021). Physical covariates, which describe large-scale atmospheric circulation conditions around the Baltic Sea region, could also be used to account for interannual-to-decadal scale variations in the annual maximum sea levels. For example, Marcos and Woodworth (2017) assessed connections between sea-level extremes and the North Atlantic Oscillation (NAO) index in the North Atlantic region. Their results showed that even after taking the effect of mean sea level into account, NAO explained part of temporal variations in the sea-level extremes in the Baltic Sea. We aim to incorporate such physical covariates in our models in the following studies.

Another limitation for our hierarchical models is that they only account for dependence in the marginal GEV parameters and do not take additional residual dependence (dependence in annual maxima between different tide gauges) into account. Exclusion of residual dependence implies that our uncertainty estimates are likely slightly too narrow. One way to address this shortcoming is provided by Calafat and Marcos (2020), who use a max-stable process to capture the residual dependence. Their approach is, however, outside the scope of this paper.

6 Conclusions

In this study, Bayesian hierarchical modelling was applied to estimate theoretical return levels from annual maximum sea level on tide gauges located along the Finnish coast. Three hierarchical descriptions of the parameters of the generalised extreme value (GEV) distribution were compared against individual fits at 12 tide gauges. The main motivation was to test how the hierarchical description of the distribution parameters affects the estimation of, and uncertainty in, them and the corresponding estimates of return levels. The simplest hierarchical model assumes that station-specific distribution parameters come from the same joint Gaussian hyperdistributions. In addition, two hierarchical model structures based on B-splines and Gaussian processes were implemented, in which the distance with respect to the Kemi tide gauge along the Finnish coast was used as a covariate in the model formulations. These two models also allow, in principle, the estimation of return levels between the tide gauges.

The main results obtained from subsequent analysis are the following.

  • All hierarchical models provide similar results for the GEV parameters and reduce the posterior parameter uncertainty in comparison to the individual model fits. In addition, the shape parameter is consistently negative on the Finnish coast, with median values around −0.16 for all models.

  • Examples for the return level estimates show that the 50-year return level is well captured by the hierarchical models. For rarer (1000-year) events, individual fits tend to have a larger uncertainty range and they also give higher return level estimates compared to the hierarchical models in most locations.

  • Median values of the theoretical upper limits computed with the hierarchical models indicate differences of 47–73 cm in comparison to the observed maximum sea levels in the observational time series. However, uncertainty in these estimates is large, and, therefore, they should be interpreted with extreme caution.

While the results suggest that our hierarchical models provide an improvement over separate fits to tide-gauge time series, the study is region specific and exploits regional geographical features. For regions with shorter tide-gauge records available for analysis, it is expected that the hierarchical modelling approach has larger benefits in comparison to tide-gauge-specific models. Furthermore, other modelling approaches should be included along our approach and compared with the local sea-level observations to find the best solutions. One must also keep in mind that the results still contain uncertainties due to possibly missing components (long-term temporal evolution, physical information) in the model formulation and the limited sample size for the observed extremes. To conclude, care needs to be taken when interpreting the above-mentioned results. Improvements for the tested models, e.g. by including missing non-stationary components, will be addressed in future studies.

Code and data availability

Code for reproducing this study has been made publicly available in Zenodo (Räty and Laine2023). The package contains the implemented Stan models and the R scripts for running the models and for reproducing most of the analysis made in this study. The detrended tide-gauge time series of annual maximum sea levels required by the scripts are also available in Zenodo (Räty and Johansson2021).


The supplement related to this article is available online at:

Author contributions

All authors contributed to the design of the study. MMJ and JS provided the tide-gauge data, and OR pre-processed the data with the help of ML, JS and MMJ. OR and ML implemented the statistical models and conducted the analysis presented in the paper. OR, UL and ML wrote the paper with critical comments from the other authors.

Competing interests

The contact author has declared that none of the authors has any competing interests.


Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Special issue statement

This article is part of the special issue “Coastal hazards and hydro-meteorological extremes”. It is not associated with a conference.


The authors thank the two anonymous reviewers for their constructive comments, which helped to improve the paper. The authors acknowledge the financial support of the Finnish State Nuclear Waste Management Fund (VYR) and the Academy of Finland.

Financial support

This research has been supported by the Finnish State Nuclear Waste Management Fund (VYR) through SAFIR2022 (grant nos. Dnro SAFIR 6/2020, 6/2021 and 6/2022) and the ADAFUME project funded by the Academy of Finland (decision no. 321890).

Review statement

This paper was edited by Francisco Campuzano and reviewed by two anonymous referees.


Amante, C. and Eakins, B. W.: ETOPO1 1 Arc-Minute Global Relief Model: Procedures, Data Sources and Analysis, NOAA Technical Memorandum NESDIS NGDC-24, NOAA National Geophysical Data Center [data set],, 2009. a

Averkiev, A. S. and Klevannyy, K. A.: A case study of the impact of cyclonic trajectories on sea-level extremes in the Gulf of Finland, Cont. Shelf Res., 30, 707–714, 2010. a, b

Bardet, L., Duluc, C.-M., Rebour, V., and L'Her, J.: Regional frequency analysis of extreme storm surges along the French coast, Nat. Hazards Earth Syst. Sci., 11, 1627–1639,, 2011. a

Bernardara, P., Andreewsky, M., and Benoit, M.: Application of regional frequency analysis to the estimation of extreme storm surges, J. Geophys. Res., 116, C02008,, 2011. a

Calafat, F. M. and Marcos, M.: Probabilistic reanalysis of storm surge extremes in Europe, P. Natl. Acad. Sci. USA, 117, 1877–1883,, 2020. a, b, c, d

Coles, S.: An Introduction to Statistical Modeling of Extreme Values, Springer London,, 2001. a, b, c

Coles, S. and Tawn, J.: Bayesian modelling of extreme surges on the UK east coast, Philos. T. Roy. Soc. A, 363, 1387–1406, 2005. a

Cooley, D., Nychka, D., and Naveau, P.: Bayesian spatial modeling of extreme precipitation return levels, J. Am. Stat. Assoc., 102, 824–840, 2007. a

Dalrymple, T.: Flood-frequency analyses, Water Supply Paper 1543-A, US Government Printing Office,, 1960. a

de Boor, C.: A practical guide to splines, in: Applied mathematics sciences, Vol. 27, Springer-Verlag, ISBN 3-540-90356-9, 1978. a

Eilers, P. H. C. and Marx, B. D.: Flexible smoothing with B-splines and penalties, Stat. Sci., 11, 89–121,, 1996. a

Gabry, J. and Češnovar, R.: cmdstanr: R Interface to 'CmdStan', Stan [code], (last access: 2 November 2021), 2021. a

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B.: Bayesian data analysis, Chapman and Hall/CRC,, 2013. a

Gräwe, U. and Burchard, H.: Storm surges in the Western Baltic Sea: the present and a possible future, Clim. Dynam., 39, 165–183, 2012. a

Gronau, Q. F. and Wagenmakers, E.-J.: Limitations of Bayesian leave-one-out cross-validation for model selection, Computational Brain & Behavior, 2, 1–11, 2019a. a

Gronau, Q. F. and Wagenmakers, E.-J.: Rejoinder: More limitations of Bayesian leave-one-out cross-validation, Computational Brain & Behavior, 2, 35–47, 2019b. a

Hersbach, H.: Decomposition of the Continuous Ranked Probability Score for Ensemble Prediction Systems, Weather Forecast., 15, 559–570,<0559:DOTCRP>2.0.CO;2, 2000. a

Hosking, J. R. M. and Wallis, J. R.: Regional Frequency Analysis: An Approach Based on L-Moments, Cambridge University Press,, 1997. a

Johansson, M. M., Boman, H., Kahma, K. K., and Launiainen, J.: Trends in sea level variability in the Baltic Sea, Boreal Environ. Res., 6, 159–179, 2001. a, b, c, d

Johansson, M. M., Kahma, K. K., and Boman, H.: An improved estimate for the long-term mean sea level on the Finnish coast, Geophysica, 39, 51–73, 2003. a

Jylhä, K., Kämäräinen, M., Fortelius, C., Gregow, H., Helander, J., Hyvärinen, O., Johansson, M., Karppinen, A., Korpinen, A., Kouznetsov, R., Kurzeneva, E., Leijala, U., Mäkelä, A., Pellikka, H., Saku, S., Sandberg, J., Sofiev, M., Vajda, A., Venäläinen, A., and Vira, J.: Recent meteorological and marine studies to support nuclear power plant safety in Finland, Energy, 165, 1102–1118,, 2018. a

Jönsson, B., Döös, K., Nycander, J., and Lundberg, P.: Standing waves in the Gulf of Finland and their relationship to the basin-wide Baltic seiches, J. Geophys. Res.-Oceans, 113, C03004,, 2008. a

Kudryavtseva, N., Soomere, T., and Männikus, R.: Non-stationary analysis of water level extremes in Latvian waters, Baltic Sea, during 1961–2018, Nat. Hazards Earth Syst. Sci., 21, 1279–1296,, 2021. a, b, c, d, e

Lang, S. and Brezger, A.: Bayesian P-splines, J. Comput. Graph. Stat., 13, 183–212, 2004. a

Leijala, U., Björkqvist, J.-V., Johansson, M. M., Pellikka, H., Laakso, L., and Kahma, K. K.: Combining probability distributions of sea level variations and wave run-up to evaluate coastal flooding risks, Nat. Hazards Earth Syst. Sci., 18, 2785–2799,, 2018. a

Leppäranta, M. and Myrberg, K.: Physical Oceanography of the Baltic Sea, Springer,, 2009. a

Männikus, R., Soomere, T., and Viška, M.: Variations in the mean, seasonal and extreme water level on the Latvian coast, the eastern Baltic Sea, during 1961–2018, Estuar. Coast. Shelf S., 245, 106827,, 2020. a

Marcos, M. and Woodworth, P. L.: Spatiotemporal changes in extreme sea levels along the coasts of the North Atlantic and the Gulf of Mexico, J. Geophys. Res.-Oceans, 122, 7031–7048,, 2017. a, b, c, d, e, f, g

Medvedev, I. P., Rabinovich, A. B., and Kulikov, E. A.: Tidal oscillations in the Baltic Sea, Oceanology, 53, 526–538,, 2013. a

Meier, H. M., Broman, B., and Kjellström, E.: Simulated sea level in past and future climates of the Baltic Sea, Clim. Res., 27, 59–75, 2004. a

Pante, E. and Simon-Bouhet, B.: marmap: a package for importing, plotting and analyzing bathymetric and topographic data in R, PLoS One, 8, e73051,, 2013. a

Pante, E., Simon-Bouhet B., and Irisson, J.-O.: marmap: Import, Plot and Analyze Bathymetric and Topographic Data, R package version 1.0.6, CRAN [code], (last access: 3 December 2021), 2021. a

Passaro, M., Müller, F. L., Oelsmann, J., Rautiainen, L., Dettmering, D., Hart-Davis, M. G., Abulaitijiang, A., Andersen, O. B., Høyer, J. L., Madsen, K. S., Ringgaard, I. M., Särkkä, J., Scarrott, R., Schwatke, C., Seitz, F., Tuomi, L., Restano, M., and Benveniste, J.: Absolute Baltic Sea Level Trends in the Satellite Altimetry Era: A Revisit, Frontiers in Marine Science, 8, 546,, 2021. a

Pellikka, H., Leijala, U., Johansson, M. M., Leinonen, K., and Kahma, K. K.: Future probabilities of coastal floods in Finland, Cont. Shelf Res., 157, 32–42,, 2018. a, b

Pellikka, H., Laurila, T. K., Boman, H., Karjalainen, A., Björkqvist, J.-V., and Kahma, K. K.: Meteotsunami occurrence in the Gulf of Finland over the past century, Nat. Hazards Earth Syst. Sci., 20, 2535–2546,, 2020. a, b

Räty, O. and Johansson, M. M.: Data files for the article “Bayesian hierarchical modeling of sea level extremes in the Finnish coastal region”, Version 1, Zenodo [data set],, 2021. a

Räty, O. and Laine, M.: Supplementary Stan codes and R scripts, Version 3.0, Zenodo [code],, 2023. a

R Core Team R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, (last access: 1 December 2021), 2021. a

Ribeiro, A., Barbosa, S. M., Scotto, M. G., and Donner, R. V.: Changes in extreme sea-levels in the Baltic Sea, Tellus A, 66, 20921,, 2014. a, b, c, d

Rutgersson, A., Kjellström, E., Haapala, J., Stendel, M., Danilovich, I., Drews, M., Jylhä, K., Kujala, P., Larsén, X. G., Halsnæs, K., Lehtonen, I., Luomaranta, A., Nilsson, E., Olsson, T., Särkkä, J., Tuomi, L., and Wasmund, N.: Natural hazards and extreme events in the Baltic Sea region, Earth Syst. Dynam., 13, 251–301,, 2022. a

Saaranen, V., Lehmuskoski, P., Rouhiainen, P., Takalo, M., Mäkinen, J., and Poutanen, M.: The new Finnish height reference N2000, in: Geodetic Reference Frames: IAG Symposium, Munich, Germany, 9–14 October 2006, edited by: Drewes, H., Springer Berlin Heidelberg, 297–302,, 2009. a

Samuelsson, M. and Stigebrandt, A.: Main characteristics of the long-term sea level variability in the Baltic sea, Tellus A, 48, 672–683,, 1996. a

Särkkä, J., Kahma, K. K., Kämäräinen, M., Johansson, M. M., and Saku, S.: Simulated extreme sea levels at Helsinki, Boreal Environ. Res., 22, 299–315, 2017. a, b, c

Soomere, T., Eelsalu, M., Kurkin, A., and Rybin, A.: Separation of the Baltic Sea water level into daily and multi-weekly components, Cont. Shelf Res., 103, 23–32,, 2015. a

Soomere, T., Eelsalu, M., and Pindsoo, K.: Variations in parameters of extreme value distributions of water level along the eastern Baltic Sea coast, Estuar. Coast Shelf S., 215, 59–68, 2018. a, b, c, d

Stan Development Team: RStan: the R interface to Stan, R package version 2.21.2, CRAN [code], (last access: 27 October 2021), 2020. a

Stan Development Team: Stan Modeling Language Users Guide and Reference Manual, Version 2.32, (last access: 21 June 2023), 2023. a

Suursaar, Ü., Kullas, T., and Otsmann, M.: A model study of the sea level variations in the Gulf of Riga and the Väinameri Sea, Cont. Shelf Res., 22, 2001–2019,, 2002. a

Vehtari, A., Gelman, A., and Gabry, J.: Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC, Stat. Comput., 27, 1413–1432, 2017.  a

Vehtari, A., Gabry, J., Magnusson, M., Yao, Y., Bürkner, P.-C., Paananen, T., and Gelman, A.: loo: Efficient leave-one-out cross-validation and WAIC for Bayesian models, R package version 2.4.1, CRAN [code], (last access: 27 October 2021), 2020. a

Vousdoukas, M. I., Voukouvalas, E., Annunziato, A., Giardino, A., and Feyen, L.: Projections of extreme storm surge levels along Europe, Clim. Dynam., 47, 3171–3190,, 2016. a, b

Vousdoukas, M. I., Mentaschi, L., Voukouvalas, E., Verlaan, M., and Feyen, L.: Extreme sea levels on the rise along Europe's coasts, Earth's Future, 5, 304–323,, 2017. a, b

Vousdoukas, M. I., Mentaschi, L., Voukouvalas, E., Verlaan, M., Jevrejeva, S., Jackson, L. P., and Feyen, L.: Global probabilistic projections of extreme sea levels show intensification of coastal flood hazard, Nat. Commun., 9, 1–12, 2018. a

Weisse, R., Dailidienė, I., Hünicke, B., Kahma, K., Madsen, K., Omstedt, A., Parnell, K., Schöne, T., Soomere, T., Zhang, W., and Zorita, E.: Sea level dynamics and coastal erosion in the Baltic Sea region, Earth Syst. Dynam., 12, 871–898,, 2021. a, b

Wolski, T. and Wiśniewski, B.: Geographical diversity in the occurrence of extreme sea levels on the coasts of the Baltic Sea, J. Sea Res., 159, 101890,, 2020. a, b

Wolski, T., Wiśniewski, B., Giza, A., Kowalewska-Kalkowska, H., Boman, H., Grabbi-Kaiv, S., Hammarklint, T., Holfort, J., and Lydeikaitė, Ž.: Extreme sea levels at selected stations on the Baltic Sea coast, Oceanologia, 56, 259–290, 2014. a, b, c

Short summary
We studied annual maximum sea levels in the Finnish coastal region. Our aim was to better quantify the uncertainty in them compared to previous studies. Using four statistical models, we found out that hierarchical models, which shared information on sea-level extremes across Finnish tide gauges, had lower uncertainty in their results in comparison with tide-gauge-specific fits. These models also suggested that the shape of the distribution for extreme sea levels is similar on the Finnish coast.
Final-revised paper