Non-stationary analysis of water level extremes in Latvian waters, Baltic Sea, during 1961–2018

Analysis and prediction of water level extremes in the eastern Baltic Sea is a difficult task because of the contribution of various drivers to the water level, the presence of outliers in time series and possibly non-stationarity of the extremes owing to the changes in the atmospheric forcing. Non-stationary modelling of extremes was performed to the block 10 maxima of water level derived from the time series at six locations in the Gulf of Riga and one location in the Baltic proper, Baltic Sea, during 1961–2018. Several parameters of the Generalised Extreme Value distribution of the measured water maxima both in the Baltic proper and in the interior of the Gulf of Riga exhibit statistically significant changes over these years. The most considerable changes occur to the shape parameter ξ . All stations in the interior of the Gulf of Riga experienced a regime shift: a drastic abrupt drop of the shape parameter from ξ ≈ 0.03 ± 0.02 to ξ ≈ −0.36 ± 0.04 around 15 1986 followed by an increase of a similar magnitude around 1990. This means a sudden switch from a Fréchet distribution to a three-parameter Weibull distribution and back. The water level extremes at Liepaja in the Baltic proper and Kolka at the entrance to the Gulf of Riga reveal significant linear trends in the location and scale parameters. This pattern indicates a different course of the water level extremes in the Baltic proper and the interior of the Gulf of Riga. The described changes may lead to greatly different projections for long-term behaviour of water level extremes and their return periods based on 20 data from different intervals.


Introduction
Extreme values are the most common input for coastal design and management (Coles, 2004). Observed or measured timeseries of water level usually serve as the most reliable source of information. However, a sophisticated approach to a problem (extent of a flood, height of a structure etc.) requires not only the values of extremes but also their frequency (e.g., 35 return periods of different heights) and the duration of extreme events. As time-series of observed water level are commonly not longer than 100 years, there have been attempts to find suitable theoretical statistical distributions of extreme values which could be used to find reliable values for return periods. This is a complicated issue since the data may be too short, inaccurate or non-stationary (Mudersbach Jensen, 2010;Galiatsatou et al., 2019). Moreover, there could be different populations of storms which result in extreme values which do not follow the chosen distribution (Suursaar and Sooäär, 40 2007).
The situation is even more complicated in estuarine-type environments where a multitude of drivers may contribute to the formation of high water levels (Del-Rosal-Salido et al., 2019). In the Baltic Sea, the frequent presence of long-term aperiodic high water levels in the entire sea (Lehmann and Post, 2015;Lehmann et al., 2017) may contribute to storm surges depending on the location, openness and orientation of single coastal sections. For example, in the eastern regions of the 45 Baltic Sea, the largest storm surges are caused by strong westerly winds that often also push large volumes of water into this sea (Lehmann et al., 2017).
Depending on the method in use, set of data and regional differences in the storm surge heights, the estimates of water level extremes commonly reveal the disparity between different models and observations (Bardet et al., 2011). This feature was thoroughly analysed in (Meier et al., 2004) using two different circulation models and two sea-level scenarios. Dieterich et 50 al. (2019) demonstrated that the estimates of water level extremes for several areas of the Baltic Sea such as the Skagerrak, Gulf of Finland and Gulf of Riga are sensitive to the choice of the particular regional climate (circulation) model even if forced by the same external drivers. The uncertainties of projections of extreme water levels can be made smaller by an increase in the model resolution (Kowalewski and Kowalewska-Kalkowska, 2017). This approach inter alia makes it possible to resolve the nonlinear response of the water level extremes to the increase in the mean water level in shallow 55 regions (Gräwe and Burchard, 2012). Different drivers of extreme water levels interact in a nonlinear manner so that their joint impact may be more significant than the sum of effects of single drivers (Hieronymus et al., 2017). The most complicated dynamics seem to occur in the easternmost Baltic Sea, where conventional methods for extreme value estimation are not able to accommodate all observed and hindcast extremes (Suursaar and Sooäär, 2007;Eelsalu et al., 2014). Moreover, the spread among different methods can be substantial in areas that may have extensive wave set-up. 60 The analysis of Weisse et al. (2014) signals an increase in the Baltic Sea water level extremes in the past 100 years. The main contributor to this process is the increase in the mean sea level. An increase in wind speed will lead to a stronger reaction of the water level in areas such as the Gulf of Finland, where extremes are already high (Hieronymus et al., 2018). A https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License. specific feature of the Baltic Sea is that extreme water levels may increase faster than the mean water level even without an increase in the wind speed (Meier, 2006;Soomere and Pindsoo, 2016). This property seems to be distinctive to the eastern 65 sub-basins such as the Gulf of Finland, Western Estonian archipelago and the Gulf of Riga (Fig. 1), and to the south-eastern segments of the sea (Pindsoo and Soomere, 2020).
A natural reflection of this difference in the increase in water extremes is the extensive spatial variation in the parameters of the General Extreme Value (GEV) distribution for water level extremes along the north-eastern Baltic Sea shore . This variation includes alongshore changes in the sign of the GEV shape parameter. It signals the necessity of 70 using different particular cases (three-parameter Weibull, Gumbel or Fréchet) distribution for adequate projections of extreme water levels and their return periods in different coastal segments. The situation is even more complicated in changing climates where the background process of formation of high water levels is not necessarily statistically stationary, and the parameters of the GEV distribution of extreme water levels may change in time (Kudryavtseva et al., 2018). Such variations of these parameters may lead to great variations in projections of the resulting water level for longer return 75 periods. Spatio-temporal variations in the parameters of the GEV distribution in the Baltic Sea basin have been so far analysed based on modelled water levels (e.g., Kudryavtseva et al., 2018;Soomere et al., 2018). The availability of high-quality long-term 80 https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License. observed or measured water level data sets from the shore of the central Baltic Sea region makes it possible to extend this analysis to in situ data.
In this paper, we focus on the temporal behaviour of the parameters of distributions of extreme water levels in the Gulf of Riga (Fig. 1). Extreme water levels in this sub-basin are, historically, the third-highest in the entire Baltic Sea after the eastern Gulf of Finland and south-western Baltic Sea (Dailidiené et al., 2006;Averkiev and Klevanny, 2010). This feature 85 reflects a specific pattern of the formation of water levels in this semi-enclosed water body (Astok et al., 1999;Männikus et al., 2019).
The main objective of this study is to characterise the temporal course and quantify the magnitude of temporal changes to the parameters of the GEV at selected observation sites. We start from a description of the study area, data and methods for the analysis of extreme water levels in Section 2. The analysis is based on block maxima of water levels over the windy autumn 90 and winter season. Differently from annual maxima, such maxima are not serially correlated. The nature of changes to various parameters of the GEV is analysed in Section 3. We also test different approximations which could describe the patterns of change of in GEV parameters and roughly estimate the role of non-stationarity of the data in the formation of the values of parameters of GEV.

Observed data in the study area
The study areathe shores of Latvia with a total length of about 500 kmconsists of two major segments. About half of the study area on the western coast of Latvia is open to the Baltic proper (Fig. 1). The water level regime and the behaviour of water level maxima in this segment, represented by the Liepāja and Kolka tide gauges, are largely similar to the relevant features in Lithuania (Dailidiené et al., 2006) and the western shores of the Western Estonian archipelago (Suursaar and 100 Sooäär, 2007;Eelsalu et al., 2014;Soomere et al., 2018). Another half of the study area is located on the western, southern and eastern shores of the Gulf of Riga (Fig. 1). This gulf has a generally regular size with dimensions about 130 × 140 km (Suursaar et al., 2002) and is connected with the Baltic proper via relatively narrow and shallow (systems of) straits. The main connection, Irbe Strait, is 27 km wide but with water depth in the sill area mostly <10 m (Maritime Administration of Latvia, 2014). The connections of another outlet via the Väinameri (Moonsund) sub-basin in the Western Estonian 105 archipelago are much narrower (the width of, e.g., Suur Strait is 4-5 km) and shallower, with the sill depth of about 5 m.
Irbe Strait is open towards the Baltic proper to the south-west, that is, to one of the predominant wind directions (Soomere, 2003). This configuration of the Gulf of Riga supports a two-step mechanism of formation of extreme water levels (Astok et al., 1999). As mentioned above, specific patterns of atmospheric forcing may drive a large volume of water into the Baltic Sea on weekly scales (Post and Kõuts, 2014;Lehmann et al., 2017). Such large volume changes may increase the average 110 sea level in the entire Baltic Sea by almost 1 m for several months (Soomere and Pindsoo, 2016). A similar process may https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License. additionally increase the average sea level in the entire Gulf of Riga by another 1 m (Männikus et al., 2019). Even though such highly elevated water levels usually persist only a few days in this gulf, they are usually driven by strong westerly winds over Irbe Strait. Therefore, they are often associated with high local storm surges in the western parts of the gulf.
Extremely high water levels in the Gulf of Riga are thus developed under the joint impact of three major drivers: water 115 volume of the entire Baltic Sea with a characteristic timescale of weeks, water occasionally pushed by a sequence of cyclones into the Gulf of Riga for 1-2 days (Suursaar et al., 2002), and local storm surges with a typical duration of a few hours. Each of these drivers may add about 1 m to the resulting water level (Männikus et al., 2019).
Water level observations have been carried out at various locations on the Latvian coast over almost two centuries.
Currently, Latvian Environment, Geology and Meteorology Centre (LEGMC) operates tide gauges at ten sites. The readings 120 of observations and measurements are available on their website (http://www.meteo.lv) from 1961. The sampling frequency, coverage and completeness of recordings vary greatly between the locations. Hourly records started mostly in the middle of 2000s when automatic devices were installed. The properties and quality of these time-series are presented in (Männikus et al., 2020). In this study, time-series from seven stations (Liepāja, Kolka, Roja, Daugavgrīva, Skulte and Salacgrīva in Latvian waters and Pärnu in Estonian waters, Figure 1; Table 1) are used as these are most reliable in terms of monthly 125 completeness. The Estonian Weather Service provided the data set for Pärnu. Before that, the official height system BK77 with reference level associated with the Kronstadt zero was used. This zero was defined as the average water level at this location in 1825-1840 (Lazarenko, 1986). As the information about water levels in Latvia and neighbouring Baltic countries published in the international literature until 2019 is given in the BK77 system, we shall use water level data in this system as well. Moreover, other authors (Averkiev and Klevanny, 2010) have also given 135 results in the BK77.
The quality of data in these stations is analysed in detail in Männikus et al. (2019Männikus et al. ( , 2020. The interplay of the water masses in the lake of Liepāja and the canal connecting the lake with the sea may in some occasions greatly affect the readings of extreme water levels. Seasonal course of water level at Daugavgrīva is influenced by the hydroelectric plant about 20 km upstream of the river; however, this impact does not affect annual maxima of water levels at this site. 140 The shapes of empirical probability distributions of the occurrence of different water levels at Liepāja, Daugavgrīva and Pärnu ( Fig. 2) were analysed in Männikus et al. (2019). The completeness of the dataset of hourly observations is 99.54%.
The recordings at Liepāja represent water levels at the eastern shore of the Baltic proper. The site at Kolka is also strongly influenced by the water level in the Baltic proper. However, the Daugavgrīva and Pärnu data characterise water level in the southern and north-eastern bayhead of the Gulf of Riga, respectively. 145 Both empirical probability distributions have a quasi-Gaussian appearance which is characteristic in the north-eastern Baltic Sea (Johansson et al., 2001). This shape of the probabilities reflects the joint impact of storm surges (that follow a Poisson 150 distribution on open ocean shores, Schmitt et al., 2018) and frequently existing large volumes of excess water. The excess https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License.
water is pumped into the Baltic Sea by certain sequences of atmospheric processes (Leppäranta and Myrberg, 2009) and exhibit a Gaussian distribution (Soomere et al., 2015a). The skewed shapes of the distributions for Liepāja and Daugavgrīva (skewness 1.431 and 1.674, respectively) indicate a well-known property of eastern Baltic Sea that elevated water levels are more likely than negative surges (Johansson et al., 2001;Suursaar and Sooäär, 2007;Männikus et al., 2019). 155

Projections based on block maxima
Long-term water level time-series can be analysed and interpreted using results of the extreme-value theory if certain fundamental conditions are fulfilled. The essential requirements are that the selected maximum or minimum values in the time series to be used, e.g., for the block maximum method (Coles, 2004) must be (i) statistically independent and (ii) identically distributed. In other words, each value should be an independent, random sample from the same population. 160 To achieve statistical independence and remove possible serial correlation, one should consider only values that are sufficiently separated in time. Monthly water level maxima are often correlated because there is a significant time lag between the impact of large-scale atmospheric patterns and the reaction of water level (Johansson et al., 2014). As the events of elevated water level of the entire Baltic Sea (Post and Kõuts, 2014;Samuelsson and Stigebrandt, 1996) may last a few months (Pindsoo and Soomere, 2016), often annual maximum or minimum values at every observation station are 165 considered in the analysis of extremes. However, sometimes annual extremes may be correlated as well. The same cluster of storms may be reflected by two subsequent annual maxima (one in December, another in January of the next year). Männikus et al. (2020) highlighted the well-known seasonal variability in water level course in the eastern Baltic Sea. It was shown that the most reliable way to select uncorrelated water level extremes is to use the maxima of the entire relatively windy season starting from late spring or early summer (e.g., in July) and ending in June of next year. This extended windy 170 season is called the stormy season for brevity. The set of water level maxima for stormy seasons is suitable for the analysis of extreme water levels and their return periods in the eastern Baltic Sea (Soomere and Pindsoo, 2016).
The difference between the two sets of block maxima (annual and stormy season) is generally insignificant. However, substantial differences are seen in the projections of extreme water levels and their return periods. Eelsalu et al. (2014) reported about 10 cm differences for the observed data in Estonian coastal waters whereas high water levels projected using 175 the maxima of stormy seasons were usually higher than those based on the annual maxima. For above-listed reasons, we employed block maxima for stormy seasons.

Extreme value distributions
Projections of extreme water levels and their return periods were constructed based on the theoretical extreme value distributions (Coles, 2004). These are Generalized Extreme Value (GEV) distribution and its particular cases Gumbel, 180 Fréchet and Weibull distributions. The use of these distributions is justified if samples are independent and identically distributed random variables. The Gumbel and Weibull distributions are, in fact, particular cases of a GEV distribution: https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License.
The GEV distribution is characterized by a location parameter ∈ ℝ, scale parameter ∈ ℝ, and shape parameter ∈ ℝ (Coles, 2004). Here has the meaning of block maxima (e.g., maximum water level for each stormy season). The return 185 period (̂) for a particular value , is given by the 1 [1 − (̂)] ⁄ -th percentile of ( ) . (2) For the shape parameter → 0, the GEV distribution reduces to a Gumbel distribution. If < 0, the GEV distribution is equivalent to a three-parameter Weibull distribution, and for > 0 to a Fréchet distribution (Fig. 4f). A Gumbel distribution is best suitable for the description of extremes of populations described by distributions with an exponential tail such as the 190 Gaussian distribution. A Weibull distribution matches extremes of populations with so-called light-tailed (very rapidly decaying) distributions. The Fréchet distributions have a strong positive tail. The shapes of empirical water level distributions at Liepāja and Daugavgrīva resemble a Gaussian distribution but are skewed towards high water levels (the skewness is 1.431 and 1.674, respectively). Their kurtosis (3.3 and 4.2 for Liepāja and Daugavgrīva, respectively) is slightly different from the value of kurtosis of a Gaussian distribution (Eq. 3). Therefore, the 195 probability of very high water level values differs from their expectation for a Gaussian-distributed data set.
A typical feature of the north-eastern Baltic Sea coastal waters is the presence of outliers in the water level recordings that do not follow the classic extreme value distributions (Fig. 3). The threshold for defining the outliers could be evaluated from the difference between the first and third quartile of the sample (Suursaar and Sooäär, 2007). The difference is multiplied by 1.5 and added to the third quartile to reach the threshold. These outliers could be created, for example, by a sequence of storms 200 that first raised water level considerably in the entire Baltic Sea and then created a "usual" surge in single locations on the background of strongly elevated offshore water level. In this sense, these outliers may represent a different population of water level extremes. In some occasions, the outliers could be explained by local effects, for example, by substantial river discharge or even by reading or measurement error (cf. Männikus et al., 2019). For example, the highest water level at Liepāja (174 cm) was recorded on 18 October 1967 at 14:00 during an event when the water level rose within 2 hrs from 205 60 cm to 174 cm, remained constant for 5 hrs and then dropped to 100 cm in one hour. The water level was likely very high this day at Liepāja; however, it is unlikely that water level was constant for five hours.
The presence of a few outliers typically insignificantly impacts the integral parameters of the entire distribution but may have an impact on the parameters of the associated extreme value distributions and projections of return periods of very high water levels (Suursaar and Sooäär, 2007). For example, using a Gumbel distribution would eventually underestimate the 210 importance of positive outliers and lead to underestimation of values of higher return periods. The opposite bias can be expected from the use of a Weibull distribution. Hence, it would be reasonable to consider various distributions for longterm projections.
https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License. To evaluate the parameters of the GEV (including Weibull and Fréchet) and Gumbel distributions, built-in procedures of 220 maximum likelihood estimation in Matlab were used. As it is not possible to decide beforehand which theoretical distribution at best describes water levels and their extremes on the Latvian coast, we also employed other methods for calculating the parameters of these extreme value distributions. We used a method of moments which resulted in unbiased and biased estimates of parameters. Finally, we used free software Hydrognomon (http://hydrognomon.org/) for the processing and analysis of water level data. This application runs on standard Microsoft Windows. Its statistical module was 225 employed to evaluate the parameters of the GEV and Gumbel distributions.

Non-stationary extreme value analysis
To get an estimate of the level of non-stationarity of the extreme value distribution, a sliding window approach was used.
The time series was separated into 30-year long consecutive windows. In each window, a stationary GEV distribution (Eq. 1) was fitted to the block maxima with a fixed location parameter , scale parameter , and shape parameter . Before the fit, 230 the background non-stationarity of the time series caused by the joint impact of global sea-level rise and local postglacial uplift is removed subtracting the annual mean from the block maxima.
If the results of the sliding GEV test revealed that the parameters of the distribution are time-variable, a non-stationary GEV analysis described below was performed. In such a case, it is assumed that the parameters are functions of time = ( ), https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License. = ∑( ), and = ( ). With this hypothesis, a non-stationary GEV distribution is fitted with one parameter changing with 235 time (either location, scale or shape parameter). An illustration of the method is shown in Fig. 4.  For example, to test if the shape parameter is changing with time (Fig. 4f), the following distribution is fitted: https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License.
For the non-stationary GEV fit, we used the ismev package (version 1.42) in R programming language (version 3.6.1).
An increase in the location parameter with time represents the case when the whole GEV distribution is shifted to the right, towards higher values. This means that all extremes, from the most severe ones down to "medium-range" and "low-range" ones are getting higher (e.g., Kudryavtseva et al., 2018). A decrease in this parameter would lead to changes in the opposite 255 direction. The scale parameter is responsible for the width of the distribution. An increase in the scale parameter indicates that "medium-range" extremes are getting more frequent, whereas a decrease corresponds to less frequent "medium-range" extremes.
Changes to the shape parameter may have more complicated consequences. This parameter of a GEV distribution defines the overall type (shape) of the extreme value distribution. In particular, a change in the sign of the shape parameter 260 corresponds to a switch between the radically different types of extreme value distribution. For example, if the negative values of the shape parameter shift towards values close to zero, the extremes that are initially characterised by a Weibull distribution with an upper limit start to follow a Gumbel distribution gradually. Further increase in this parameter means a switch to a Fréchet distribution that has a lower limit (Fig. 4).

Temporal changes in the parameters of the GEV distribution
To estimate the magnitude of changes to the parameters of the GEV at selected observation sites (equivalently, to reveal whether extreme water levels may follow a non-stationary process), a sliding GEV fit was performed. The time series of stormy season maxima of water levels (Fig. 4b) for all seven measurement sites were divided into consecutive intervals https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License.
(windows) with a constant width. A stationary GEV (Eq. 1) fit was performed for each time interval. The window size was 270 thoroughly tested, and the optimal value of 30 years was selected as it provided the lower noise level.  From these measurement locations, Kolka is primarily affected by the water level in the Baltic proper, Roja is located at the western shore of the Gulf of Riga and Pärnu in the north-eastern bayhead of the Gulf of Riga. As the 95% confidence 290 intervals of estimates of this parameter for single years mostly overlap, it is safe to say that the location parameter of the GEV distribution of water level extremes has not experienced any substantial changes in Latvian waters since the 1960s.
Contrarily, the analysis of the time series using the sliding GEV fit revealed large changes in the scale parameter σ at some locations and spatially variable pattern of its variations. Its values experienced a slight decrease of ~20% during 1981-1990 at Pärnu (Fig. 6, bottom), Salacgrīva, and Skulte (Fig. 6, middle). Even more substantial increase of 25% was observed at 295 Liepāja (Fig. 6, top). However, other tide gauges did not show significant variability in this parameter.

Regime shifts in terms of the shape parameter
The most dramatic changes of >50% were observed in the shape parameter (Fig. 7). This feature indicates significant nonstationarity of stormy season maxima in terms of the shape of the GEV distribution. The temporal course of the shape 300 parameter obtained with the sliding GEV fit is different at different stations. At Liepāja, the shape parameter was between 0.1 and 0.2 until the mid-1980s and dropped almost zero from 1985. In other words, a Fréchet distribution was replaced by a Gumbel one. This parameter was negative (between -0.4 and -0.2) at Kolka until the end of the 1980s and then rapidly increased to about 0.1. Therefore, a Weibull distribution of water level extremes was replaced by a Fréchet distribution at this location. 305 The values of the shape parameter at all other stations experience a regime shift: a drastic abrupt drop around 1985 followed by an increase of a similar magnitude around 1990. Before and after this drop the shape parameter was close to zero in most stations. At Skulte and Salacgrīva it stabilised on the level of -0.1 whereas it increased to about 0.1 at Daugavgrīva. During the years 1985-1990, it was from about -0.2 (at Salacgrīva and Daugavgrīva) down to about -0.4 at Roja, Pärnu, and Skulte ( Fig. 7, Table 2). 310 Importantly, Fig. 7 demonstrates that the features of temporal changes in the shape parameter of the relevant GEV are different at the locations reflecting (or strongly affected by) water levels in the Baltic proper (see above) and in the interior of the Gulf of Riga. The properties of water level at all measurement sites in the interior of the Gulf of Riga show consistent behaviour. Before 1985, the shape parameter is consistently close to zero in the sense that its difference from zero is less than the level of uncertainty (95% confidence intervals) of its evaluation. Thus, differently from the Baltic proper, the GEV 315 distribution follows a Gumbel one in the Gulf of Riga.
During the regime shift in 1985-1989, all stations (except for Daugavgrīva) show consistently negative values of the shape parameter with the average value over all stations ≈ −0.36 ± 0.04. Therefore, the extreme values of water level followed a three-parameter Weibull distribution. Only the data from Daugavgrīva reveal a considerable uncertainty in the sense that 95% confidence limits involve the zero value for most of the years in 1985-1989. Therefore, its switch to negative values is 320 not as clear as for other sites.  1985, 1985-1989, and after 1989 (Table 2). https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License.
After 1989, the recordings show a higher discrepancy in . At Daugavgrīva, Pärnu, and Roja ≈ −0.0 ± 0.02 within the uncertainties and thus a Gumbel distribution is acceptable again. The estimates for the shape parameter of the GEV distribution for Salacgrīva and Skulte return to a negative (but smaller) value ≈ −0.12 ± 0.04. 330 The test for non-stationarity employs a sliding window of 30 years. For example, the GEV parameters listed for the year 1985 are evaluated using the extreme water level data from 1970 to 2000. Therefore, the abrupt changes in the shape parameter do not necessarily occur specifically in 1985 or 1989. As the reported values of the parameters characterise the GEV fit of extreme water levels over 30 years, using 15 years before and after the listed "central" year of the fit, the regime shift may have occurred at a somewhat different time instant. 335 To more consistently estimate the relative magnitude of the identified step-like changes with respect to the typical level of variations in the shape parameter, we make use of the overall appearance of the course of the shape parameter of the GEV distribution that resembles a step-like behaviour with two instants of a regime shift. Using the estimated uncertainties , we can assess the statistical significance of these variations. The abrupt changes range in significance from 2 (Daugavgrīva) to 7.8 (Pärnu). The magnitude of the change at other stations is in the range of 4-5 . This shows a high significance of 340 the shift at all stations except for Daugavgrīva. The described significant change in the shape parameter indicates a dramatic shift in the overall appearance of the extreme value distribution from an approximately Gumbel one to a Weibulllike shape in 1985 and the back to a Gumbel distribution in 1989. Table 2. The average value of shape parameter as a result of sliding GEV fit with a window of 30 years before 1985, 345 1985-1989, and after 1989

Non-stationary extreme value analysis
To model the non-stationary extreme values, we tested the following hypotheses: (a) The GEV location, scale, and shape parameters follow a linear trend (Eq. 4) 350 (b) The GEV shape parameter follows a quadratic trend (Eq. 5) (c) The GEV shape parameter follows a step-function (Eq. 6).
These model functions Eqs. (4-6) were used to describe, to a first approximation, the possible course of time variability of parameters in non-stationary GEV fit (Eq. 3). The linear trends (case a) are commonly used to describe the impact of climate change on the ocean or atmosphere conditions. The hypotheses (b) and (c) were chosen to reach a better description of the 355 detected features of the variability of the shape parameter (Fig. 7).
The use of the assumption of the presence of a linear trend in all GEV parameters (case a) showed that only tide gauges located in the Baltic proper (Liepāja) or locations in the Gulf of Riga that are largely affected by the water level in the Baltic proper (Kolka) follow a significant trend in at least one of the parameters (Fig. 8). The data from Kolka revealed a linear trend in the location parameter at a significance level of 89% and in the shape parameter at a significance level of 94%. The 360 parameters of the GEV distribution for water level maxima at Liepāja tide gauge revealed weaker linear trends in the location parameter (81% significance level) and scale parameter (84% significance level) were observed. However, no significant linear trends (at least an 80% significance level) were detected in any of the GEV parameters at all sites in the interior of the Gulf of Riga. Even though the above-mentioned trends at Liepāja and Kolka are not statistically significant at a commonly accepted 95% level, the presented features indicate an intrinsic difference in the behaviour of water level 365 extremes in the inner area of the Gulf of Riga compared to the stations that reflect water level in the Baltic proper.
Interestingly, results presented in Section 3.1 demonstrate that the sliding GEV fit (with the assumption that the parameters of the GEV are constant within each single time window) showed almost no changes in the parameters of GEV at Liepāja. This confirms that sliding GEV can be used in assessing whether the extremes behave in a non-stationary manner. However, to accurately model the time variability of GEV, a more elaborate approach such as non-stationary GEV fit, should be 370 performed. A likely reason for this difference is that the sliding GEV fit involves a smaller number of years within each window, whereas the non-stationary GEV fit considers all available data.
The shape of temporal changes in various parameters  suggests that even if a clear linear trend is missing, fittings using the more complicated functional shape of approximations may reveal further information about changes in these parameters. A quadratic function is a natural choice to highlight acceleration of changes as well as to identify the presence of 375 cycles with periods longer than the entire time series if some of the parameters have a clearly defined minimum or maximum within the observation time interval. Fitting a quadratic trend to the shape parameter at all locations (case b) showed a significant fit only for the Skulte water level extremes with a minimum in 1988 (at a significance level of 93%). However, the fit did not follow well the observed change. Therefore, it is unlikely that the observed changes had a cyclic manner.
Finally, a step-like function was used to model the changes in the shape parameter following the sliding GEV analysis that 380 showed that the fitted shape parameters exhibit an abrupt shift (Fig. 7). To find a suitable approximation of such changes (case c, Eq. 6) which best describes the change and to specify when did it happen and how long did it last, we tested various settings of a step function (Eq. 6). The parameter ∆ was varied from 1 to 15 years. The corresponding total duration of the event (2∆ ) was in the range of 2 to 30 yrs, correspondingly. The parameter 0 corresponds to the time in the middle of the https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License. event. It was modified in the range of 1981-1991 with a step of one year. The amplitude of both regime shifts was assumed 385 to be the same. It was changed from -0.2 to -1 incrementally with a step of 0.2. The constructed step functions were used to model the time variability of the GEV shape parameter at all locations. These functions were fitted to the set of block maxima of water levels using a non-stationary GEV fit (Eq. 3).
The properties ∆ and 0 of the best approximation using step-function for the course of the shape parameter, vary largely for individual stations. This not necessarily means a large discrepancy of the properties of the relevant extreme value 390 distributions because recordings at different sites may be affected by several local features. To obtain the best fit for all stations, we used a collective location approach, where a sum of the goodness of fit for all stations was used as a measure of the quality of the fitted non-stationary distribution (e.g., Votier et al., 2008). This approach provided more consistent results.
In some cases, however, it showed lower formal statistical significance compared to the tests performed for individual sites.
The negative log-likelihood value of the non-stationary GEV fit was used as a measure of the goodness of fit. 395 A sensitivity test was performed to check what parameters affected the fit the most. The goodness of fit was significantly 400 affected by the change in ∆ and 0 . However, it was less sensitive to the amplitude of regime shift of the fitted step https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License. function in Eq. (6). Therefore, it was fixed to the value = 0.36 that was averaged over all seven stations in Table 2. This feature indicates that the existing data are consistent in terms of the magnitude of the regime shift in the middle and end of the 1980s.
For the stations located in the interior of the Gulf of Riga (that is, excluding Kolka), the best fit for the shape parameter was 405 ∆ = 2, 0 = 1988. The corresponding abrupt shift therefore formally started in 1986 and lasted until 1990. The likelihood ratio test showed that the non-stationary GEV fit with the shape parameter following a step-function with ∆ = 2, 0 = 1988 described extreme values better than the stationary fit. This claim is valid with statistical significance of 99.9% at Roja and Salacgrīva, 98% at Daugavgrīva, 85% at Skulte, and 80% at Pärnu. The identified step function is consistent with the results obtained with the sliding GEV fit (Fig. 7, Table 2). However, the formal statistical significance of the presence of an abrupt 410 change depends on the method in use. In case of the sliding GEV, the most significant change was obtained at Pärnu, and the least significant one occurred at Daugavgrīva.

Conclusions and discussion
The core conclusion from the performed analysis is that the parameters of theoretical extreme value distributions of the observed and measured water maxima in Latvian waters, both on the shore of the Baltic Proper and in the interior of the Gulf 415 of Riga exhibit statistically significant changes over the last 60 years. The greatest changes occur to the shape parameter of the GEV distribution. These changes may cause fundamentally different projections for long-term behaviour of water level extremes and their return periods as the underlying extreme value distribution changes from a three-parameter (reversed) Weibull distribution to a Gumbel one. While the reversed Weibull distribution has a finite upper limit, any water level height may occur according to Gumbel distributions. 420 Surprisingly, the nature of changes in the shape parameter of the GEV distribution for water level maxima is substantially different on the shores of the Baltic proper and in the interior of the Gulf of Riga. This parameter changes more gradually and mostly in one direction at sites directly affected by the water level regime of the Baltic proper. The direction of changes is different at different locations: the shape parameter decreases on the open shore of the Baltic proper at Liepāja but increases at Kolka. 425 Interestingly, the temporal course of the shape parameter has a pair of regime shifts at all measurement sites in the interior of the Gulf of Riga. During most of the time, this parameter is close to zero and, therefore, the GEV distribution can be approximated well by a Gumbel distribution. The shape parameter becomes negative (and thus a reversed three-parameter Weibull distribution governs the water level maxima) around the year 1986. In contrast, its value jumps back to a level close to zero in around the year 1990. This regime shift may reflect abrupt changes in geostrophic winds in 1987/1988 (Soomere et 430 al., 2015b) and surface-level winds in the 1980s (Keevallik and Soomere, 2014). The described process may also mirror massive evidence of regime shift in various abiotic variables in Estonia in [1989][1990] in biotic time series of bogs and marine ecosystems in 1990 (Kotta et al., 2018). https://doi.org/10.5194/nhess-2020-100 Preprint. Discussion started: 24 April 2020 c Author(s) 2020. CC BY 4.0 License.