Calibration of a real-time tsunami detection algorithm for sites with no instrumental tsunami records: application to coastal tide-gauge stations in eastern Sicily, Italy

Coastal tide gauges play a very important role in a tsunami warning system, since sea-level data are needed for a correct evaluation of the tsunami threat, and the tsunami arrival has to be recognized as early as possible. Real-time tsunami detection algorithms serve this purpose. For an efficient detection, they have to be calibrated and adapted to the specific local characteristics of the site where they are installed, which is easily done when the station has recorded a sufficiently large number of tsunamis. In this case the recorded database can be used to select the best set of parameters enhancing the discrimination power of the algorithm and minimizing the detection time. This chance is however rare, since most of the coastal tide-gauge stations, either historical or of new installation, have recorded only a few tsunamis in their lifetimes, if any. In this case calibration must be carried out by using synthetic tsunami signals, which poses the problem of how to generate them and how to use them. This paper investigates this issue and proposes a calibration approach by using as an example a specific case, which is the calibration of a real-time detection algorithm called TEDA (Tsunami Early Detection Algorithm) for two stations (namely Tremestieri and Catania) in eastern Sicily, Italy, which were recently installed in the frame of the Italian project TSUNET, aiming at improving the tsunami monitoring capacity in a region that is one of the most hazardous tsunami areas of Italy and of the Mediterranean.


Introduction
Coastal tide gauges are the oldest, most used, easiest to maintain and cheapest instruments to record tsunami signals, and, in the frame of a centralized tsunami warning system (TWS) with real-time data processing, they provide information about the propagation and the magnitude of the occurring event. Although tsunami wave properties like height and period at a given coastal station have a strong local character, the local information about tsunami arrival and magnitude can be used to warn nearby areas. For all the above reasons, coastal tide gauges are considered indispensable elements of any TWS, and there is no TWS in operation today without a suitable tide-gauge network (see Nagai et al., 2007; Allen et al., 2008; Schindelé et al., 2008).
Within a TWS, what is of paramount relevance is the capability of providing ready and adequate response as the tsunami progresses and reaches the coast. The need to identify tsunamis as soon as possible stimulated the development of tsunami detection algorithms, which has evolved in parallel with the introduction of new tsunami measurement technologies. One of the first algorithms was designed by Mofjeld (1997) and installed in the DART bottom pressure recorder (BPR) systems for the Pacific TWS. Others were later devised for GPS buoys and acoustic/pressure wave gauges in Japan by Shimizu et al. (2006), and for coastal tide gauges in Canada (Rabinovich and Stephenson, 2004). Among the most recent efforts, one can mention the studies of tsunami detection algorithms for high-frequency (HF) radar installations (Gurgel et al., 2011; Lipa et al., 2012), for BPR sensors [...]

Published by Copernicus Publications on behalf of the European Geosciences Union.

Calibration procedure of a tsunami detection algorithm
Any tsunami detection algorithm aims to discriminate tsunami signals from the background, which is contributed by tides, infra-gravity waves, seiches, storm and wind waves, surges, ship waves, random noise, etc. Since both tsunamis and background oscillations are known to be dependent on local coastal dynamics and the geometry of local basins (e.g. harbours, inlets, bays), detection algorithms have to be calibrated to site conditions. Calibration is, however, sometimes overlooked, and this constitutes an operational risk: if the selected algorithm parameters happen to be inadequate for the site, the algorithm performance might be compromised, with an increased risk of failure in the case of a tsunami. Bressan and Tinti (2011) gave an example of a rigorous calibration procedure when introducing TEDA (Tsunami Early Detection Algorithm) and tuning it to the coastal tide-gauge station of Adak Island in Alaska.

Calibration in the case of availability of recorded tsunami signals
If we assume that an algorithm is not a rigid procedure and that instead it can be adapted to site-specific conditions by selecting a proper set of parameters, it follows that calibrating an algorithm means finding the parameter configuration ensuring the best performance of the algorithm for a given site.
Bearing this in mind, when the station where the detection algorithm has to be installed has a long record of instrumental tsunami time histories, the calibration procedure can be schematized in the following sequence of steps:
1. setting up a database of the background records: this can be built by selecting homogeneous records of sufficient length to include seasonal variations;
2. defining quantitative performance indicators and selecting the parameters of the algorithm to be tested;
3. applying the selected configurations to the background database records, to characterize the background signal and to select appropriate detection thresholds that avoid false detections when the algorithm is applied to the background database;
4. setting up a database with local tsunami records;
5. applying the selected configurations to the event records and computing the corresponding performance indicators;
6. selecting the configuration with the best indicators as the operational setting.
The above steps will be further commented on in the next sections of the paper, and full details can be found in the paper by Bressan and Tinti (2011), where the algorithm TEDA was introduced for the first time and calibrated. It is noted that in this sequence of actions, the background characterization (step 3) follows the selection of the parameter configurations (step 2) because the background is studied also, though not only, through the way it is defined in the algorithm, which may depend on parameters and vary from one configuration to the other. The soundness of the procedure rests on two requirements: (1) the background database is long enough to allow a stable characterization, and (2) the tsunami signal database contains enough cases, including strong as well as weak tsunamis, to allow for significant statistics of the performance indicators. Both requirements were met for the Adak Island station. As to the former, on analysing several years of records (though with relevant gaps), it was found that there is a substantial annual stability of the background statistics, though exhibiting seasonal variations, and hence records longer than one year are long enough for calibration purposes. As to the latter, as many as 17 tsunami records that occurred from 1997 up to 2010 could be collected, including tsunamis from very distant sources like South Chile and Peru to nearer sources like the Kuril Islands, Russia, or the Andreanof Islands, Alaska, and tsunamis with the first wave height ranging from a few centimetres to some tens of centimetres. It is further noted that the sampling interval for the recorded data was 1 min (i.e. fine enough to allow computations in the tsunami frequency band).
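Steps 3 and 6 of the sequence above can be illustrated with a minimal sketch. It is not the procedure of Bressan and Tinti (2011): the function names, the safety margin, and the ranking rule (most detections first, then shortest mean delay) are assumptions introduced here only for illustration.

```python
def threshold_without_false_alarms(background_values, margin=0.05):
    """Step 3 (sketch): pick a threshold that the background never reaches.

    `margin` is a hypothetical safety factor, not a value from the paper.
    """
    peak = max(abs(v) for v in background_values)
    return peak * (1.0 + margin)


def select_configuration(indicators):
    """Step 6 (sketch): choose the configuration with the best indicators.

    `indicators` maps a configuration name to (number_of_detections,
    mean_delay). The ranking rule used here is an assumption: more
    detections win, and ties are broken by the shorter mean delay.
    """
    return min(indicators, key=lambda name: (-indicators[name][0],
                                             indicators[name][1]))


if __name__ == "__main__":
    print(threshold_without_false_alarms([0.1, -0.4, 0.2], margin=0.5))
    print(select_configuration({"cfg_a": (5, 3.0),
                                "cfg_b": (6, 4.0),
                                "cfg_c": (6, 2.5)}))
```

In practice the threshold would be derived from the algorithm's own detection functions evaluated over the whole background database, rather than from a short list of values.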

Calibration in the case of a poor tsunami record
The calibration procedure described in the above section includes as a fourth step the building of a database of real tsunami signals recorded at the same station, which is a requirement that can be met only by a limited number of stations worldwide that are old enough and are located in sites with a relatively short tsunami return time. In practice, only some of the stations in the Pacific region can satisfy these conditions, while this is not the case for stations in all other oceans. As for the Mediterranean Sea, only in recent years have digital tide-gauge stations been installed with (or upgraded to) a sampling rate adequate to cover tsunami recording needs, and digital records of real tsunamis are very few. The best recorded case is the 23 May 2003 tsunami induced by the Boumerdés-Zemmouri M_w = 6.9 earthquake (in Algeria) that was recorded by more than 20 tide-gauge stations in the western and central Mediterranean (see Alasset et al., 2006; Heidarzadeh and Satake, 2013).
The main purpose of this paper is to propose a method to calibrate a detection algorithm for those tide-gauge stations for which the instrumental record of tsunamis is quite poor or does not exist at all. In this case, the role played by the records of real tsunamis in the calibration procedure must be played by synthetic records (i.e. by records computed by means of numerical models). Step 4 of the procedure given in Sect. 2.1 (i.e. setting up a database of tsunami records) has to be modified and can be described as follows: [...]
We will show an exemplary application of this method by calibrating TEDA for two coastal stations recently installed in Sicily, southern Italy, namely in the harbour of Tremestieri, close to Messina, and in the harbour of Catania, with a sampling rate adequate to record tsunami waves. Together with the station of Siracusa, these are installations made in the frame of the Italian project TSUNET with the aim of improving the tsunami monitoring capability of eastern Sicily, which is located in one of the most active tsunamigenic zones in Italy and in the Mediterranean (Tinti et al., 2004, 2010; Tonini et al., 2011) and was hit by some of the largest Italian tsunami events in the past. According to the Italian tsunami catalogue (Tinti et al., 2004), most of these tsunamis are associated with large local earthquakes (e.g. 1693, 1783, 1908), but recently their purely tectonic origin was questioned and a debate was opened on the possible parallel or prevalent contribution provided by submarine landslides triggered by seismic shaking (see tsunami studies and tsunami source analyses in Piatanesi and Tinti, 1998; Graziani et al., 2006; Gutscher et al., 2006; Tinti et al., 2007; Papadopoulos et al., 2007; Billi et al., 2008; Gerardi et al., 2008; Favalli et al., 2009; Argnani et al., 2009, 2012). Indeed, because of the morphology and steep bathymetry of the Messina Strait and of the Hyblean-Malta escarpment bordering eastern Sicily offshore, landslide excitation by earthquakes and even by simple gravitational loading cannot be ruled out, which means that landslides can be considered a further source of tsunamis. Taking this into account, and given that landslide-generated tsunamis may be very dangerous, supplying eastern Sicily tsunami stations with an automatic and site-oriented tsunami detection algorithm seems to be a very important step to ensure an efficient and timely warning.

TEDA (Tsunami Early Detection Algorithm)
TEDA is a real-time automatic procedure to detect hazardous tsunamis and hazardous long-period waves in sea-level records (Bressan and Tinti, 2011) that was developed by the Tsunami Research Team of the University of Bologna.
Here only the main characteristics of TEDA are briefly summarized, since a detailed description can be found in Bressan and Tinti (2011, 2012). TEDA is structured to work at the station level (i.e. to be installed at the station data logger, with the purpose of triggering alarm procedures, such as automatically starting a real-time data transmission). It is composed of two parallel algorithms, denoted the tsunami-detection and the secure-detection methods, that check detection conditions at every new data acquisition. The former method is a slope-based algorithm particularly efficient in detecting "anomalous" wave trains starting with an impulsive front, while the latter is based on the magnitude of the sea-level displacement from mean values. Once a detection is made by the tsunami-detection or the secure-detection method, an "alarm state" starts that is named respectively "tsunami state" or "alert state". Alarm procedures are started whenever TEDA triggers an "alarm state", i.e. either a "tsunami state" or an "alert state". The tsunami-detection method is based on the comparison of the most recent signal with the background signal. The most recent signal is represented by the instantaneous slope (IS(t)), which is the slope of the last sea-level data corresponding to a time window of length t_IS that includes the last sample, corrected for the tidal slope (for the implementation of tidal correction, see Bressan and Tinti, 2011). The background signal is represented by the background slope (BS(t)), which is a statistic of previous values of IS(t) over a time window of length t_BS, longer than t_IS, and preceding the time window used for the computation of IS(t) by an amount t_G. Three different options are proposed in TEDA to compute the function BS(t), the best of which can be determined after calibration: [...] and t being the acquisition time of the last sample.
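To make the definition of IS(t) concrete, the following is a minimal sketch that estimates the slope of the most recent sea-level samples over a window of length t_IS. It assumes a least-squares straight-line fit and omits the tidal correction; both choices are assumptions made here, and the actual implementation is described in Bressan and Tinti (2011). The function name and argument names are hypothetical.

```python
def instantaneous_slope(levels, dt, t_is):
    """Least-squares slope of the last `t_is` seconds of sea-level samples.

    levels : sequence of sea-level values, oldest first
    dt     : sampling interval in seconds
    t_is   : window length in seconds
    Returns the slope in level units per second. Tidal correction, which
    TEDA applies, is intentionally left out of this sketch.
    """
    n = int(round(t_is / dt)) + 1          # samples spanning the window
    if n < 2 or len(levels) < n:
        raise ValueError("not enough samples for the requested window")
    window = levels[-n:]
    times = [i * dt for i in range(n)]
    t_mean = sum(times) / n
    y_mean = sum(window) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, window))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


if __name__ == "__main__":
    ramp = [0.5 * i for i in range(100)]   # rises 0.5 units every 5 s
    print(instantaneous_slope(ramp, dt=5.0, t_is=60.0))
```

A longer window smooths short-period noise but reacts more slowly, which is consistent with t_IS being treated later in the paper as the most sensitive parameter.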
When an impulsive anomalous wave, such as a tsunami or a sudden seiche, is recorded at the tide-gauge station, the function IS(t) soon increases because the tsunami wave is included in the computation of IS(t), while the function BS(t) increases with a delay of t_G. [...] when CF might get very high values for very small waves. The tsunami detection starts the tsunami state, which lasts until the background slope BS decreases to "normal values", assumed to be equal to the value it takes at the detection time.
For the sake of clarity, it is stressed that in this paper the term background is used to denote two different sets of data. The background signal BS(t) introduced here is computed according to the definitions given above on a window of IS values of duration t_BS, typically 1 h long. On the other hand, the background records mentioned in Sect. 2, which are used to characterize the tide-gauge station site and to select appropriate time windows corresponding to different sea conditions, are tide-gauge records in the absence of tsunamis and can be several months or years long.
The secure-detection method computes a "filtered" marigram M(t) through partial integration of the function IS(t) over a time window of length t_SD, which includes the time of the last acquired sample: [...] where dt is the sampling interval. This procedure acts as a band-pass filter, since it amplifies only a limited range of frequencies depending on the values of t_IS and t_SD while damping the outer frequencies. In the case of anomalous waves appearing at the station as a series of waves of slowly increasing amplitude, such as a seiche event or a far-field tsunami, the tsunami-detection method would fail to detect them. However, the function |M(t)| would oscillate with increasing amplitudes, and whenever the function |M(t)| passes a threshold amplitude (i.e. if |M(t)| ≥ λ_SD), a secure detection is triggered and an alert state starts. The alert state lasts for a predefined time interval t_A, and if several consecutive detections are triggered, it lasts for a time t_A after the last trigger.
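The secure-detection logic can be sketched as follows. Since the exact expression of M(t) is not reproduced here, the sketch assumes a plain rectangle-rule sum of IS over the last t_SD seconds; this, the function names, and the choice to treat the end of the alert interval as inclusive are assumptions for illustration only.

```python
def filtered_marigram(is_values, dt, t_sd):
    """Assumed form of M(t): rectangle-rule integral of IS over the last t_sd."""
    n = max(1, int(round(t_sd / dt)))      # number of samples in the window
    return sum(is_values[-n:]) * dt


def alert_states(m_values, dt, lambda_sd, t_a):
    """Return, for each sample, whether the alert state is active.

    A secure detection is triggered when |M| >= lambda_sd. The alert state
    lasts t_a after the most recent trigger, so consecutive triggers extend it.
    """
    active = []
    alert_until = None                     # expiry time of the current alert
    for i, m in enumerate(m_values):
        t = i * dt
        if abs(m) >= lambda_sd:
            alert_until = t + t_a          # (re)start the countdown
        active.append(alert_until is not None and t <= alert_until)
    return active


if __name__ == "__main__":
    print(filtered_marigram([1, 1, 1, 1], dt=2, t_sd=4))
    print(alert_states([0, 0.2, 0, 0, 0, 0], dt=1, lambda_sd=0.1, t_a=2))
```

Re-arming the expiry time at every trigger is what makes the alert persist for t_A after the last of a run of consecutive detections.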
An example of TEDA functions and detection can be seen in Fig. 1, and a scheme of TEDA can be found in Table 1.
One additional feature of TEDA worth mentioning is that it also addresses the problem of missing data, if the no-data period is not too long: indeed, in the case of short data holes (less than 15 min in this work), TEDA temporarily suspends the calculations until data acquisition restarts. Then it assumes a linear interpolation between the last value before the gap and the first value after the gap, and computes all the functions from the suspension time till the actual time, so filling the gap. However, if the lack of data persists beyond a predefined time parameter, all TEDA functions are reset to zero, with computation restarting when data flow resumes. This reset, including the calculation of the background function, implies that some time passes before TEDA becomes fully operational again.
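The gap-handling rule just described can be sketched as below. The 15 min limit comes from the text; the function name, the return convention, and the treatment of a gap exactly equal to the limit as a reset are assumptions of this sketch.

```python
def fill_or_reset(last_time, last_value, new_time, new_value, dt,
                  max_gap=15 * 60):
    """Decide how to handle a data hole between two valid samples.

    Times are in seconds. Returns ("fill", samples) with linearly
    interpolated (time, value) pairs for the missing samples, or
    ("reset", []) when the hole is too long and all functions must restart.
    """
    gap = new_time - last_time
    if gap >= max_gap:
        return "reset", []
    n_missing = int(round(gap / dt)) - 1   # samples lost inside the hole
    filled = []
    for k in range(1, n_missing + 1):
        frac = k * dt / gap                # position along the straight line
        filled.append((last_time + k * dt,
                       last_value + frac * (new_value - last_value)))
    return "fill", filled


if __name__ == "__main__":
    print(fill_or_reset(0, 1.0, 20, 2.0, dt=5))
    print(fill_or_reset(0, 1.0, 1000, 2.0, dt=5))
```

After a "fill" result, the detection functions would be recomputed over the interpolated samples so that the time windows stay contiguous.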

The TSUNET stations of Tremestieri and Catania (step 1)
Within the project TSUNET three marine and meteorological stations have been installed on the eastern coast of Sicily in Tremestieri (Messina), in Catania and in Targia (Siracusa). In this paper only the first two stations will be taken into account (see Fig. 2). Each station is equipped with a tide gauge, measuring sea level every second and recording 5 s averages in a local data logger. Presently, in the test phase of the network, data are transmitted via Global System for Mobile Communications (GSM) every 4 h to a data acquisition centre located at the University of Bologna, where they are analysed and stored. The acquisition centre can query the station and force data downloads remotely. If a detection software (namely TEDA, to be installed in the data logger) triggers an alert, the station activates a real-time data transfer of 1 s measurements to the acquisition centre until the alert is cancelled.
The first installation was made in the harbour of Tremestieri, a few kilometres south of Messina. The harbour, which is almost exclusively commercial, guarantees the connection of Sicily with the mainland, mostly for trucks. It is approximately a rectangular basin oriented SW-NE and open at the north-east end. The tide-gauge station has been functioning since 22 January 2008. For this station the database of the background signal was taken to consist of data acquired from 5 September 2008 until 27 January 2010, since they form a homogeneous set as regards the sampling interval (5 s), the sea-level sensor (pressure gauge) and the harbour geometry. After January 2010 some repair work was done on the external harbour pier that was severely damaged by an exceptional storm, and this operation slightly modified the harbour geometry and also slightly changed the resonance spectral peaks (see Fig. 3). Finally, the pressure sensor was replaced by a radar sensor that started to provide data regularly on 28 December 2010.
The station of Catania was installed in November 2009 close to the entrance of the Catania harbour, on the inner side of the external pier. Here storms have caused no damage to the station so far. The background signal database consists of data recorded from 21 December 2009 until 31 December 2011 that form a homogeneous time series with constant sampling rate.

The parameter configurations of TEDA (step 2)
Each tsunami detection algorithm includes parameters that can be changed for adaptation to local site conditions. As for the specific case of TEDA, the parameters to set are the length t_IS of the time interval used to compute the function IS(t), the lengths of the time interval and time shift, respectively t_BS and t_G, used to calculate the function BS(t), and the duration t_SD of the time interval for the computation of the function M(t). In addition, one has to set the alarm thresholds λ_IS and λ_CF for the tsunami detection and λ_SD for the secure detection. Notice that the minimum possible values for the time intervals (and especially for t_IS) are related to the sampling interval dt, since all functions, including the instantaneous slope IS(t), should be calculated with a sufficient number of samples. In previous applications of TEDA (Bressan and Tinti, 2011, 2012), it was found that the most sensitive parameter is the time length t_IS, and therefore for the sake of simplicity in this paper we will keep fixed the values of all other parameters (namely t_BS = 60 min, t_G = 15 min and t_SD = 6 min) and change only t_IS, which can assume one of the five following values: t_IS = 4, 6, 8, 10, or 12 min. The corresponding configurations will be hereafter called C1-C5 in the respective order. Bearing further in mind that in Sect. 3 three options (A1, A2 and A3) have been given to compute the background function BS(t), in total there are as many as 15 configurations to explore for each station. These will be denoted hereafter as AxCy, where x and y are integers in the respective range from 1 to 3 and from 1 to 5.

Figure caption (fragment): The tsunami detections (red), secure detections (blue) and alert states are also shown, with respectively vertical and horizontal lines. In addition, for the functions IS, CF3 and M, the respective thresholds are indicated (green). In the central left panel, the value of BS(t_D), which determines the end of the tsunami state, is also indicated (grey).
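The 15 configurations named AxCy can be enumerated mechanically. The sketch below uses the parameter values stated in the text (all in minutes); the dictionary layout and the key names are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Fixed parameters from the text, in minutes.
FIXED_MINUTES = {"t_BS": 60, "t_G": 15, "t_SD": 6}
# The five tested values of t_IS, in minutes, giving C1..C5 in this order.
T_IS_MINUTES = [4, 6, 8, 10, 12]


def teda_configurations():
    """Build a name -> parameters mapping for A1C1 ... A3C5."""
    configs = {}
    for x, (y, t_is) in product(range(1, 4), enumerate(T_IS_MINUTES, start=1)):
        # `background_option` is a hypothetical key for the A1/A2/A3 choice.
        configs[f"A{x}C{y}"] = dict(FIXED_MINUTES, background_option=x, t_IS=t_is)
    return configs


if __name__ == "__main__":
    configs = teda_configurations()
    print(len(configs))
    print(configs["A3C2"])
```

Keeping the configuration names as keys makes it straightforward to tabulate the performance indicators per configuration later on.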
In order to evaluate and compare the different algorithm configurations, we have introduced two performance indicators, which are the number of event detections (ND) and the delay time (DT). Let us suppose that for each event one can define a tsunami arrival time (TAT) and a corresponding tsunami detection interval (TDI). How TAT and TDI have been defined will be explained later on. If TEDA recognizes a tsunami either by means of the tsunami-detection method or by means of the secure-detection method within the TDI, then a detection is counted and ND is incremented by one. If neither method recognizes the tsunami within the TDI, then the event is considered missed. Further, for every detection, the DT is defined as the time elapsed from the estimated TAT to the time of the detection. If the tsunami is seen by both TEDA methods, the shortest DT is taken.
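The two indicators can be computed with a few lines of code. In this sketch the TDI is assumed to be an interval of given length starting at the TAT; that assumption, the function name, and the event dictionary layout are introduced here for illustration, since the paper defines TAT and TDI later.

```python
def performance_indicators(events):
    """Compute ND and the list of delay times DT for one configuration.

    Each event is a dict with:
      "tat"        : tsunami arrival time (s)
      "tdi"        : length of the detection interval after TAT (s), assumed
      "detections" : detection times (s) from either TEDA method
    """
    nd = 0
    delays = []
    for event in events:
        start, end = event["tat"], event["tat"] + event["tdi"]
        in_interval = [t - start for t in event["detections"]
                       if start <= t <= end]
        if in_interval:                    # detected by at least one method
            nd += 1
            delays.append(min(in_interval))  # shortest DT is retained
        # otherwise the event counts as missed
    return nd, delays


if __name__ == "__main__":
    events = [
        {"tat": 100, "tdi": 600, "detections": [250, 160]},
        {"tat": 0, "tdi": 300, "detections": [400]},   # too late: missed
        {"tat": 50, "tdi": 300, "detections": []},     # never seen: missed
    ]
    print(performance_indicators(events))
```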
The selection of the parameter configurations and of the performance indicators completes step 2 of the calibration procedure.

Characterization of the background record (step 3)
By background, one means here a tide-gauge record that is not due to a tsunami. Since in the example treated in this paper there are no tsunami records, all the data recorded by the stations of Tremestieri and of Catania can be considered as background, though only part of those have been selected for the calibration analysis, as already explained in Sect. 4.
The analysis of the background signal is performed in step 3 of the calibration procedure and can be split into two parts: one, the spectral analysis, is independent of the specific features of the detection algorithm, while the other is instead based on variables defined in the algorithm. The former can be conducted at any time, while the latter has to follow step 2.

Spectral analysis
For each tide gauge, the sea-level time series selected in step 1 (see Sect. 4) have been analysed by computing the power spectral density (PSD) over consecutive, non-overlapping, 12 h long time windows. For the whole data set, the average PSD has been calculated, as well as the PSD corresponding respectively to the 10th, 50th and 90th percentiles. In addition, data have been organized in calendar months and for each month the average PSD has been computed. All these spectral curves are graphed for Tremestieri and Catania in Fig. 4. The spectral curves are characterized by several peaks of different amplitude and by noise. In general, if a site background exhibits strong spectral peaks, one might expect that an incoming tsunami wave could excite typical resonances that could even predominate and mask the tsunami itself (Bressan and Tinti, 2011). Conversely, the presence of low spectral peaks might suggest that the incoming tsunami would keep its spectral signature on the sea-level record. The most striking feature of the calculated spectra is the stability of the curves as regards the main spectral peaks and the general trend. The variability from one month to the other regards only the level (intensity), but not the shape of the curves on the graph, and spans about two orders of magnitude in spectral power (corresponding to a factor of 10 in wave amplitude): as expected, there is a seasonal influence, and winter month spectra are more energetic than those of summer months.
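The windowed spectral analysis can be sketched with the standard library alone. The code below uses a plain periodogram computed with a direct discrete Fourier transform, which is an assumption of this sketch (the paper does not state the estimator or its normalisation), and a nearest-rank percentile per frequency bin. All function names are hypothetical. A direct transform is far too slow for real 12 h windows of 5 s data, where an FFT-based routine would be used instead.

```python
import cmath
import math


def periodogram(window, dt):
    """Periodogram of one window for harmonics k = 1 .. n/2 (mean removed)."""
    n = len(window)
    mean = sum(window) / n
    x = [v - mean for v in window]
    psd = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x[j] * cmath.exp(-2j * math.pi * k * j / n)
                    for j in range(n))
        psd.append(abs(coeff) ** 2 * dt / n)   # normalisation is a convention
    return psd


def windowed_psd(series, dt, window_len):
    """Periodograms over consecutive, non-overlapping windows."""
    starts = range(0, len(series) - window_len + 1, window_len)
    return [periodogram(series[s:s + window_len], dt) for s in starts]


def percentile_curve(spectra, q):
    """Nearest-rank q-th percentile across windows, frequency by frequency."""
    curve = []
    for values in zip(*spectra):
        ordered = sorted(values)
        rank = max(1, math.ceil(q / 100 * len(ordered)))
        curve.append(ordered[rank - 1])
    return curve


if __name__ == "__main__":
    # Toy record: a sine with a period of 8 samples, split into 16-sample windows.
    series = [math.sin(2 * math.pi * i / 8) for i in range(32)]
    spectra = windowed_psd(series, dt=5.0, window_len=16)
    median = percentile_curve(spectra, 50)
    peak_harmonic = max(range(len(median)), key=median.__getitem__) + 1
    print(len(spectra), peak_harmonic)
```

Because spectral power scales with the square of wave amplitude, a spread of two orders of magnitude in these curves corresponds to a factor of 10 in amplitude, as noted above.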
The background record of Tremestieri is characterized by a tidal range of about 20 cm, with a peculiar tidal waveform that is due to the tidal current regime in the Messina Strait, which causes the tide to rise very steeply at the beginning of the tidal change. The spectral analysis in Fig. 4 shows that the PSD curves are dominated by a very strong main peak with a period of about 2 min (which can possibly be further split into peaks at 118, 124 and 142 s). Other much weaker peaks can be identified at about 24, 30 and 50 min, while noise prevails in the range from 2 to 20 min. It is worth noting that the Tremestieri sea level is perturbed by ferries passing frequently near the station and leaving a typical signature in the tide-gauge record, which also affects the spectral content of the background. This is evident when comparing spectra in ordinary conditions with spectra taken on those rare days when ferries do not travel. From Fig. 5 it is clear that the passage of ferries is the reason for the strong noise level up to about 10 min, which is always present in the background signal and masks a secondary peak at about 5 min. The strong short-period resonances can be explained by the simple geometry of the small Tremestieri harbour and of the nearby coast, which is almost straight. The background signal of Catania is characterized by the same tidal range of about 20 cm as Tremestieri. The passage of ferries close to the station leaves a characteristic mark also for this station. But the Catania spectra differ very much from the ones of Tremestieri. Peaks are broader and the most energetic ones correspond to long periods, namely about 15-16, 18.5-19.5, and 25 min, which presumably have to be attributed to the resonance of the continental platform offshore Catania. Also worth noticing is the sequence of peaks between 4 and 20 min that are probably due to the complex geometry of the Catania harbour, which is larger than the Tremestieri port basin and is structured, also for historical reasons, into a number of sub-basins by internal jetties.

TEDA-dependent background analysis
In order to evaluate the efficiency of a specific tsunami detection algorithm, the background record has to be analysed from the same perspective from which the algorithm sees the background data, which means by using the same functions and parameters characterizing the algorithm. This has been done for TEDA. More specifically, we have calculated all the TEDA functions, namely IS, M, BS1, BS2 and BS3, that were defined in Sect. 3 and, in addition, the respective control functions CF1, CF2 and CF3 (here by CFx we denote the ratio |IS|/BSx) for all 15 configurations of TEDA parameters that were selected for calibration (see Sect. 5). In order to test the stability of the background, we have studied the records over different base intervals: (1) the whole data interval, (2) intervals about 1 yr long, and (3) intervals corresponding to calendar months.
One way to explore the stability from a statistical point of view is to consider the empirical frequency distributions (EFDs) of the TEDA functions and to see how the distributions change from one base interval to the other. Notice that in this paper we designate by EFD the normalized frequency distributions, which in the case of random variables can be seen as an approximation of probability density functions. To illustrate the process, let us consider the function |IS| computed for the configuration C2 (corresponding to t_IS = 6 min) and the functions BS3 and CF3, implying that the considered TEDA configuration is A3C2. What is found, and is remarkable, is that each function has a characteristic form of EFD. Figure 6 shows exemplary EFD curves calculated for the station of Catania. The curves corresponding to the whole data interval, and to the months of December and August, are plotted for the function |IS| (a), the function BS3 (b) and the function CF3 (c). It is found that the EFDs are strongly asymmetric and unimodal. The |IS| and CF curves are generally decreasing, with the mode being found on the left end of the definition interval. The |IS| curves are strongly peaked, while the CF curves present a long tail. The BS3 curves are right-skewed with a long right tail.
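A normalized frequency distribution of this kind can be obtained from a simple histogram. The sketch below assumes equal-width bins starting at zero (suitable for non-negative functions such as |IS|) and normalizes so that the distribution integrates to one over the binned range; the function name, the binning choices and this normalization are assumptions, not details from the paper.

```python
def empirical_frequency_distribution(values, bin_width, n_bins):
    """Normalized histogram approximating a probability density.

    Values outside [0, n_bins * bin_width) are ignored. The result
    integrates to 1 over the binned range (sum of density * bin_width).
    """
    counts = [0] * n_bins
    for v in values:
        idx = int(v // bin_width)          # index of the bin containing v
        if 0 <= idx < n_bins:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the binned range")
    return [c / (total * bin_width) for c in counts]


if __name__ == "__main__":
    efd = empirical_frequency_distribution([0.1, 0.2, 0.6, 1.4],
                                           bin_width=0.5, n_bins=3)
    print(efd)
    print(sum(d * 0.5 for d in efd))
```

Computing one such distribution per base interval (whole record, year, calendar month) gives the curves whose shapes are compared in the stability analysis.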
The fact that all curves of the same function are found to be of the same type allows us to make some useful assumptions and analyse stability only in terms of a few parameters. If we assume that the EFDs of the function |IS| are one-parameter curves, we can use only the standard deviation as the characteristic parameter. Accordingly, in Fig. 7a we plot the standard deviations σ calculated for all EFDs corresponding to the base intervals mentioned above. It is seen that the standard deviations of the two-year EFD and of the yearly (2010 and 2011) EFDs are very close to one another, suggesting annual stability. On the other hand, it is seen that σ changes substantially from one month to the other, with peaks in spring and low values in summer months. If we use the EFD mode instead of the standard deviation (see Fig. 7a), we find the same picture, which is stability over year-long bases and seasonal variability, with a strong inverse correlation between standard deviation and mode, which confirms the correctness of the assumption that EFDs of |IS| are one-parameter distributions. From a physical point of view, a peaked distribution of |IS| means a large predominance of values of the sea-level slope close to zero. This means that sea level tends to be flat and calm in the analysed period range, which is a condition more easily met in summer, and this is consistent with our finding (summer months have a higher EFD mode).
The considerations for the function BS3 are quite similar. The EFD for August (shown in Fig. 6b) is more peaked and narrower than the one for December. Furthermore, when the mode is higher (summer months), the corresponding median value tends to be smaller (see Fig. 7b), and in addition mode and median do not change significantly from one year to the other. This confirms the one-parameter character of the EFDs, and that the parameters have seasonal variability but annual stability.
Importantly, the EFD for the function CF3 seems to have a different behaviour: it proves to be in general almost constant in time and does not show strong seasonal differences. This property is shown in Fig. 6c, where the three EFDs (whole data, August and December) are almost superimposed. Indeed, the circumstance that CF3 is stable and almost independent of the season (i.e. rather insensitive to meteorological and climatological conditions) makes it an appropriate tool to search for tsunami anomalies.
To highlight the seasonal variations of the TEDA functions, it is useful to consider the cumulative EFDs and the maximum and minimum envelopes of the monthly cumulative EFDs for all months with at least 75 % of data. This is shown in Fig. 8a. While for |IS| and BS3 the envelopes differ significantly from the cumulative EFDs, this does not happen for the EFD of CF3. The distance between the two envelopes is plotted as a function of the cumulative EFD in Fig. 8b. This distance is quite large for |IS| and even larger for BS3, but it is quite limited (less than 0.05) for CF3. It is therefore clear that the function CF3 is the least sensitive to the sea-state conditions, even though the function BS3 is the one most affected. This result can be better appreciated when extending the analysis to all the TEDA options to compute the background signal and the control functions. Figure 9 shows the cumulative EFDs of the functions BS1, BS2 and BS3 computed for the harbours of Tremestieri and Catania together with the maximum and minimum envelopes (upper panels). It further shows the distance between the two envelopes for the cumulative EFDs of the functions CF1, CF2 and CF3 (lower panels). We recall that all these functions are computed here for the parameter configuration C2, and therefore for the TEDA configurations that are called A1C2, A2C2 and A3C2. It emerges that all TEDA control functions possess the right characteristics to be used for tsunami detection since their behaviour is only weakly influenced by the seasonal conditions. It can be further concluded that for Tremestieri all control functions are almost equivalent since for all of them the distance between the upper and lower envelopes is very small (less than 0.02). For Catania, instead, this distance is much smaller for the option CF2 (less than 0.03) than for the options CF1 and CF3 (less than 0.05), which suggests that the method A2C2 (see Sect. 3) is the least sensitive to sea-level conditions and the most suitable for the tsunami analysis of the Catania records.
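The envelope analysis described above can be illustrated with a minimal sketch. It assumes that the samples of a TEDA function (for example |IS| or CF3) are already grouped by month; the names `cumulative_efd`, `envelope_distance`, `samples_by_month` and `expected_per_month` are hypothetical and are not part of TEDA.

```python
from bisect import bisect_right

def cumulative_efd(values, grid):
    """Empirical cumulative distribution: fraction of samples <= each grid value."""
    ordered = sorted(values)
    return [bisect_right(ordered, g) / len(ordered) for g in grid]

def envelope_distance(samples_by_month, expected_per_month, grid, min_coverage=0.75):
    """Lower and upper envelopes of the monthly cumulative EFDs and their distance.

    samples_by_month: dict mapping a month label to the list of function values
    available in that month (hypothetical input layout).
    """
    curves = [
        cumulative_efd(vals, grid)
        for vals in samples_by_month.values()
        # Months with less than 75 % of the expected data are skipped.
        if len(vals) >= min_coverage * expected_per_month
    ]
    if not curves:
        raise ValueError("no month reaches the required data coverage")
    lower = [min(column) for column in zip(*curves)]
    upper = [max(column) for column in zip(*curves)]
    distance = [u - l for u, l in zip(upper, lower)]
    return lower, upper, distance
```

A small distance over the whole grid corresponds to a function whose distribution changes little from month to month, which is the property attributed to the control functions in the text.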
The above analysis of the background signals completes step 3 of the calibration procedure.

Synthetic tsunami signals (step 4)
The calibration of any tsunami detection algorithm implies carrying out tests on records containing tsunami signals. In this paper we focus on the calibration for stations that do not have enough tsunami records to build a satisfactory database of experimental data. The tide-gauge stations of Tremestieri and Catania selected for our study are perfect examples because they have recorded no tsunami since the time of their installation. We provide here an example of step 4 of the calibration procedure of a tsunami detection algorithm, which consists of the construction of synthetic records including tsunami signals. To compute tsunami signals we rely upon a specific study undertaken by Tonini et al. (2011) to assess tsunami hazard for eastern Sicily, which is based on the worst-case credible scenario approach. In the following, only the main characteristics of these scenarios are briefly outlined.

Tsunami scenarios (step 4.1)
The tsunami signals for Catania and Tremestieri have been computed by considering scenarios inspired by three historical events: the 21 July 365 AD, the 11 February 1693 and the 28 December 1908 tsunamis. The 365 AD scenario tsunami originated from a remote source, a strong Mw = 8.3 earthquake occurring in the western Hellenic arc. The historical event is very well documented and produced inundation also very far from the source, such as in southern Italy and on the eastern coast of Sicily (Guidoboni et al., 1994; Stiros, 2010). Three source hypotheses have been considered for this study (Tonini et al., 2011): the first fault, named hereafter 365F1, is based on the literature and involves a fault along the western Hellenic arc touching western Crete; the second fault, designated as 365F2, is a hypothetical fault that involves the part of the Hellenic arc between Peloponnese and Crete; while the third source, named 365F3, is modelled as the joint rupture of the previous faults, namely 365F1 plus 365F2.
The other two events, the 1693 and 1908 tsunamis, are due to local sources placed off the eastern Sicily coasts and in the Messina Strait. Though these two big earthquakes, with respective estimated magnitudes of Mw = 7.4 and Mw = 7.1, were followed by tsunamis, there are still doubts on the location of the faults, and there is an open debate on whether the tsunamis were due only to tectonic sources or also to additional submarine landslides (Tinti et al., 1999; Billi et al., 2008, 2010; Argnani et al., 2009). In view of the uncertainties, two different sources have been considered for the 1693 event, a seismic fault and a pure landslide source, named respectively 1693E and 1693L, both described in the paper by Argnani and Bonazzi (2005). The latter, the 1693L tsunami, was not used for the calibration of the Tremestieri station since its effects are too weak and therefore negligible there. As for the 1908 tsunami scenarios, two possible sources are proposed following Tonini et al. (2011): the first is a heterogeneous-slip fault in the Messina Strait, named 1908E, while the second, named 1908EL, is composed by adding the contribution of a landslide to the tsunami generation process.
The synthetic tide-gauge signals have been computed by means of the numerical code UBO-TSUFD, which is a tsunami simulation code based on a finite-difference technique that solves the Navier-Stokes equations in the shallow water approximation and allows the utilization of nested grid domains with different resolutions (see Tinti and Tonini, 2013). The model neglects dispersive wave behaviour and non-hydrostatic effects and does not include the tide in the run-up computation. In this specific case the grid covering the Tremestieri harbour area is formed by 200 m × 200 m cells, while the cells covering the harbour of Catania, which is characterized by a more complex geometry, have sides 40 m long. This higher resolution allows taking into account the harbour geometry and reproducing the harbour response to waves, while the lower resolution of 200 m cannot reproduce the harbour and coastal behaviour. However, the Tremestieri sea-level background is characterized mainly by short-period resonance oscillations (of about 2 min) due to the harbour geometry and lacks other specific spectral frequencies due to the local coastal morphology.
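For reference, a standard textbook form of the inviscid, non-dispersive, non-linear shallow-water equations is sketched below. It is given only as an illustration of the class of equations involved; the exact formulation implemented in UBO-TSUFD is described in Tinti and Tonini (2013) and may contain additional terms. The symbols are assumptions introduced here: η is the free-surface elevation, h the still-water depth, (u, v) the depth-averaged horizontal velocity and g the gravity acceleration.

```latex
\begin{align}
\frac{\partial \eta}{\partial t}
  + \frac{\partial}{\partial x}\bigl[(h+\eta)\,u\bigr]
  + \frac{\partial}{\partial y}\bigl[(h+\eta)\,v\bigr] &= 0, \\
\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y}
  + g\frac{\partial \eta}{\partial x} &= 0, \\
\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y}
  + g\frac{\partial \eta}{\partial y} &= 0.
\end{align}
```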
In total, for calibration purposes, we have computed 7 tsunami signals for the harbour of Catania and 7 for the harbour of Tremestieri, though for the latter we have used only 6. The length of the computed signals is only a few hours, though it is known that tsunamis can persist in harbours for a much longer time. The purpose of this study, however, is to measure the performance of the detection algorithm, which means that only the first tsunami oscillations are important rather than the full signal including a long oscillation tail. An important observation is that the station may be affected by the co-seismic displacement, which means that the sea levels before and after the earthquake may be different. This is the case for the station of Tremestieri for the scenarios denoted by 1908EL and 1908E, where the mean sea level after the quake is about 80 cm higher, which is the effect of a co-seismic subsidence of the harbour area. This effect is included in the tsunami simulation, since the tsunami model UBO-TSUFD uses bathymetry and coastlines as they are after the earthquake occurrence, that is, after the correction due to the co-seismic displacements (see Tinti and Tonini, 2013).

Selecting the background windows with different sea conditions (step 4.2)
Building synthetic tsunami signals implies the computation of scenario tsunami signals as described in the previous section and the superposition of these signals on the background, considering different sea-state conditions as emerging from the background statistics. In the present exemplary study, we have considered four typical background signal situations for the harbours of Tremestieri and Catania as summarized in Table 2: a condition of calm sea (window 1) and a condition of rough sea (window 2) characterized respectively by low-energy and high-energy power spectra (see Figs. 4 and 5); after the passage of a ferry (window 3), since ferries produce oscillations in the tsunami frequency range (see Sect. 6.1); and in conditions of initial rising tide (as regards Tremestieri) or after a relevant seiche-excitation event with about 20 min period oscillations (as regards Catania, window 4). The four mentioned windows can be characterized by means of the TEDA functions. The values of the instantaneous slope are quite low for the calm sea conditions (around 0.13 cm min⁻¹) in both harbours and rather high, but not exceptionally high, for rough sea conditions. The average is around the 95th percentile (e.g. see the cumulative EFD for |IS| displayed in Fig. 8a for the harbour of Catania), which means that similar or worse conditions are expected to occur more than 15-20 days every year (see also Fig. 11). The presence of boat disturbances (window 3) and of seiches and tidal rise (window 4) has the effect of increasing the instantaneous slope values and making tsunami detection more problematic. Considering Table 2, one can see that boat signals do influence the function |IS| (window 3), but less than the tidal rise and, for the Catania station, less than the seiche event (window 4). In particular, it is worth noting that seiches affect |IS| more than a storm event does, since wind waves are characterized by high-amplitude as well as high-frequency oscillations, which are filtered and damped down by TEDA.

Building the tsunami records (step 4.3)
In view of the above choice, the tsunami record database for this study consists of 28 records for the harbour of Catania (7 scenarios × 4 windows) and of 24 records for the harbour of Tremestieri (6 scenarios × 4 windows). The difference is due to the fact that the tsunami signal for the scenario 1693L is too small to be used for tsunami detection at Tremestieri. Building a tsunami record is usually a trivial operation, since the tsunami signal computed by means of the tsunami simulation code is superimposed on the tide-gauge record in the selected time window by merely adding it to the background. This operation only requires an adjustment: to avoid discontinuities in the tsunami record, the computed tsunami is tapered to zero at the end of the computation interval. However, if the station experiences some vertical movement (uplift or subsidence), then the computed signal does not revert to zero at the end of the tsunami but remains shifted by the amount of the uplift (or subsidence). In this case, the experimental record is adapted to the new level of the synthetic tsunami signal.
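The superposition just described can be sketched as follows. This is only an illustration under stated assumptions: both series share the same sampling, the taper is a half-cosine ramp (the actual taper shape is not specified in the text), and the names `cosine_taper`, `build_record` and `residual_offset` are hypothetical.

```python
import math

def cosine_taper(n_total, n_taper):
    """Weights equal to 1 that decay smoothly to 0 over the last n_taper samples."""
    if not 0 <= n_taper <= n_total:
        raise ValueError("n_taper must lie between 0 and n_total")
    weights = [1.0] * n_total
    for k in range(n_taper):
        # Half-cosine ramp; the last sample gets weight 0.
        weights[n_total - n_taper + k] = 0.5 * (1.0 + math.cos(math.pi * (k + 1) / n_taper))
    return weights

def build_record(background, tsunami, start, n_taper, residual_offset=0.0):
    """Add a synthetic tsunami to a background window, starting at index `start`.

    residual_offset is the permanent level change (for instance the effect of a
    co-seismic subsidence) that the simulated signal keeps at its end.
    """
    if start < 0 or start + len(tsunami) > len(background):
        raise ValueError("tsunami signal does not fit inside the background window")
    weights = cosine_taper(len(tsunami), n_taper)
    record = list(background)
    for i, (w, s) in enumerate(zip(weights, tsunami)):
        # Only the oscillating part is tapered, so the signal ends at residual_offset.
        record[start + i] += residual_offset + w * (s - residual_offset)
    for j in range(start + len(tsunami), len(record)):
        # After the event the background sits on the new mean level.
        record[j] += residual_offset
    return record
```

With `residual_offset=0.0` the function reduces to the plain tapered sum; a non-zero value mimics cases such as the 1908E and 1908EL scenarios at Tremestieri, where the level after the event stays shifted.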

Results of the calibration (steps 5 and 6)
After the tsunami record database is built (and hence step 4 is finished), the calibration procedure foresees the application of the algorithm configurations to the database records (step 5) and the selection of the best configuration (final step 6) determined on the basis of the best performance evaluated by means of the tsunami indicators (ND and DT) introduced in Sect. 5. In order to apply the indicators, one has to specify the tsunami detection interval (TDI) and the TAT. For this purpose, the TDI is taken as the interval which corresponds to the first 30 min of the window of the sea-level record on which the tsunami signal is superimposed. On the other hand, the theoretical TAT is determined by visual inspection of the tsunami signals and corresponds to the instant when the signal at the station exceeds the threshold of about 2 cm in absolute value. By virtue of this definition, notice that the TAT can be delayed with respect to the beginning of the TDI (see Fig. 10). In Table 2 we show the mean and maximum value taken by the function |IS| in the TDI of the original background record, without tsunami signal superimposed.
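The quantities introduced above can be made concrete with a short sketch. Note that in this study the TAT is picked by visual inspection; the automatic threshold crossing below is a stand-in for that step. It is also assumed here that DT is the time elapsed between the TAT and the detection, and that the record is sampled at a constant interval `dt_seconds`. All function names are hypothetical.

```python
def tsunami_arrival_index(signal_cm, threshold_cm=2.0):
    """Index of the first sample whose absolute value exceeds the threshold, or None."""
    for i, value in enumerate(signal_cm):
        if abs(value) > threshold_cm:
            return i
    return None

def delay_time(detection_index, arrival_index, dt_seconds):
    """Delay (in seconds) between the arrival sample and the detection sample."""
    return (detection_index - arrival_index) * dt_seconds

def in_detection_interval(detection_index, window_start, dt_seconds, tdi_minutes=30.0):
    """True if the detection falls within the first tdi_minutes of the window."""
    elapsed = (detection_index - window_start) * dt_seconds
    return 0.0 <= elapsed <= tdi_minutes * 60.0
```

A detection would count towards ND only if `in_detection_interval` is true, and its DT would then be obtained from `delay_time`.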
All the 15 configurations of TEDA (Sect. 5) have been applied to all the tsunami records of the database, and the configuration with the highest ND and with the lowest average value of DT has been selected as the optimal one. The result is that the best configuration is able to detect all the events (ND = 28 for Catania and ND = 24 for Tremestieri), with an average DT shorter than 10 min. The best configurations are A3C2 for Tremestieri and A2C2 for Catania. The results of the different configurations can be seen in Table 3. The results illustrated in Figs. 10 and 11 refer to the identified best configuration of TEDA. In Fig. 10 all detections made by means of the tsunami-detection method (red) and through the secure-detection method (blue) are shown. It is evident that the tsunami-detection method is usually faster. This is always true for Catania and almost always for Tremestieri. Indeed, in Tremestieri in the case of rough sea conditions (window 2), the tsunami-detection method either fails or is slower than the secure-detection method. In general, almost all detections occur within the first peak of the first tsunami wave.
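The selection rule (highest ND first, then lowest average DT) can be sketched as a simple ranking. The input layout, the function name `rank_configurations` and the numbers used in the usage note are invented for illustration and are not the values of Table 3.

```python
def rank_configurations(results):
    """Rank configurations by number of detections (ND), then by mean delay time.

    results: dict mapping a configuration name to a list with one entry per
    tsunami record, holding the delay time in seconds or None for a missed event.
    Returns a list of (name, ND, mean DT) tuples, best first.
    """
    summary = []
    for name, delays in results.items():
        detected = [d for d in delays if d is not None]
        nd = len(detected)
        mean_dt = sum(detected) / nd if nd else float("inf")
        summary.append((name, nd, mean_dt))
    # Higher ND is better, then lower mean DT; the name only makes the order stable.
    return sorted(summary, key=lambda item: (-item[1], item[2], item[0]))
```

For example, `rank_configurations({"A2C2": [60.0, 90.0, 150.0], "A3C2": [60.0, 90.0, 150.0]})` returns two entries with identical ND and mean DT. Such a tie cannot be resolved by the indicators alone and needs an additional criterion.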
The results of the TEDA detections in the various sea conditions are shown in Fig. 11. The four sea conditions given and characterized in Table 2 (windows 1-4) are expressed here in terms of probability of exceedance, that is, in terms of the complement of the cumulative EFD: for each value of the function |IS|, the cumulative EFD represents by definition the probability of occurrence of |IS| with lower or equal values, and hence the complementary cumulative EFD represents the occurrence probability of larger values of |IS|. In the bottom panels of Fig. 11, the average values of |IS|, BS2 and BS3 within the TDI of the original background records are plotted against the exceedance probability of |IS| for the configurations A2C2 and A3C2. It is seen that the calm sea conditions (window 1 of Table 2) are exceeded from more than 45 to more than 55 % of the time, which means that in about 150 days every year, the sea is even flatter than we have considered. On the other hand, the most infrequent conditions turn out to be the rough sea conditions (window 2) for the harbour of Tremestieri (which are met only a few days every year) and the concomitant "seiche + boat signal" conditions (window 4) for the harbour of Catania, which are expected to occur less than 10 days every year.
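The relation between the cumulative EFD, the exceedance probability and the expected number of days per year can be illustrated as follows. The function names are hypothetical, and the conversion to days simply multiplies the probability by 365, which treats the exceedance as a fraction of the year.

```python
from bisect import bisect_right

def exceedance_probability(is_abs_samples, level):
    """Fraction of |IS| samples strictly larger than `level` (1 - cumulative EFD)."""
    ordered = sorted(is_abs_samples)
    return 1.0 - bisect_right(ordered, level) / len(ordered)

def expected_days_per_year(probability, days_in_year=365.0):
    """Expected number of days per year with conditions exceeding the given level."""
    return probability * days_in_year
```

A level at the 95th percentile has an exceedance probability of 0.05, i.e. about 18 days per year.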
The DTs of the detections are displayed in the two upper panels of Fig. 11. The first fact to notice is that for both Tremestieri and Catania the detections are quite fast, all within 10 min of the estimated TAT. For the Tremestieri station, however, under rough sea conditions, the detection of a tsunami might be 1-2 min slower than in calm sea conditions because the detection is triggered by the secure-detection method rather than by the tsunami-detection method. The presence of a boat signal (window 3) does not seem to affect the efficiency of TEDA very much, since the DTs are only slightly larger. Conversely, the fast tidal rise (window 4) might well have more influence: the 1908E and 1908EL tsunamis, arriving with a positive leading wave, are detected earlier since the tidal slope adds to the initial tsunami slope, while detection would most probably have been slower for a tsunami with a negative leading wave. The results of the detections for the Catania station are also quite satisfactory: the DTs of all events are largely independent of the sea-state conditions, and where some dependence exists, it is small (less than 15 s). In Tremestieri the events recognized most promptly are the local ones (1908E and 1908EL), while tsunamis of remote origin (365F1, 365F2 and 365F3) are detected later. For Catania, the slowest detection concerns the scenario tsunami 1693L, which is the weakest one, with a first oscillation amplitude of less than 25 cm (see Fig. 10).
As anticipated, the best TEDA configurations are able to detect all tsunamis with an average DT of less than 10 min. Out of the 15 TEDA configurations tested, the configuration A3C2 proves to be the best for the harbour of Tremestieri, whereas for the harbour of Catania two configurations (namely A3C2 and A2C2) turn out to be equivalent, since they give exactly the same results. This does not mean that a decision is impossible. One possible way to come to a decision could be to enrich the tsunami record database (consisting of 28 records) and to test the two methods over additional records. In our case, however, we consider it unnecessary since we can exploit an additional piece of information. During the background analysis (see Sect. 6.2), it was found that the configuration A2C2 provides a control function CF2 that is far less sensitive to seasonal variations (see Fig. 9) than CF3 (resulting from A3C2), which means that the method A2C2 is expected to perform well in a larger range of sea conditions than A3C2. Therefore, by virtue of this consideration, we conclude that the method A2C2 is preferred and can be selected as the most adequate for TEDA computations for Catania.

Conclusions
In this paper we have presented a procedure to calibrate a real-time tsunami detection code for tide-gauge stations where experimental tsunami records are scarce or entirely absent. The procedure is based on two main elements: the analysis of the background records and the creation of a suitable database of synthetic tsunami records.
It is stressed that the analysis of the background signal must also be performed by means of the functions that are defined by the detection code under study. Therefore, in addition to the traditional tools to study time series, such as Fourier analysis and power spectral densities, we have considered specific frequency distributions of functions like the instantaneous slope (|IS|), the background signal (BS) and the control functions (CFs), by means of which we have identified typical sea-state conditions with different occurrence probabilities, and have recognized that sea-level records are stable over time bases of one year but vary remarkably from one season to another.
We have further stressed that the creation of a database of tsunami records requires the identification of suitable tsunami scenarios, which should be built by exploiting the knowledge of the tsunami history of the area under study and of its seismo-tectonic and geomorphological setting. In the case examined in the paper as an illustrative example, we have selected two harbours on the eastern coast of Sicily and considered seven tsunami scenarios, including remote earthquakes (western Hellenic arc) and local seismic fault and landslide sources (in the Messina Strait and in the Hyblean-Malta escarpment).
To compute the tsunami signals, we have used an inviscid non-dispersive non-linear shallow-water approximation model solved through a finite-difference technique (see Tinti and Tonini, 2013) over grids with a space resolution of 200 m for the harbour of Tremestieri and of 40 m for the harbour of Catania. More sophisticated numerical models accounting for dispersion or 3-D effects, or using finer resolution to better capture the small-scale geometrical effects within the considered harbours, could provide more reliable tsunami signals. However, we believe that the model we use is able to capture the most relevant features of the examined tsunami scenarios and serves the purpose of illustrating the application of the procedure.
After applying as many as 15 parameter configurations of TEDA to all records of the database, we have been able to select the configuration A3C2 as the best for Tremestieri and the configuration A2C2 as the best for the harbour of Catania. Our analysis does not guarantee that TEDA will detect all possible tsunamis (from weak to strong) in all possible circumstances (from very calm to very rough sea state) and within a short period of time (the first 10 min); that is, it does not guarantee a 100 % detection performance, and does not even guarantee that no false tsunamis are detected. It simply ensures that the configurations found are likely to produce the best possible performance for TEDA. A perfect performance with no missed tsunamis and no false detections is perhaps an unreachable goal for a single-station detection algorithm like TEDA and can only be approached in a multiple-station environment (that is, in a monitoring network).

Fig. 1. TEDA functions computed in Tremestieri for the configuration C2 (see Sect. 5) during an event built by adding the 1693E tsunami signal to a calm sea record (see Sect. 7). The marigram and the tsunami and the secure-detection functions IS, BS3, CF3 and M are shown. The tsunami detections (red), secure detections (blue) and alert states are also shown, with vertical and horizontal lines, respectively. In addition, for the functions IS, CF3 and M, the respective thresholds are indicated (green). In the central left panel, the value of BS(t_D), which determines the end of the tsunami state, is also indicated (grey).

Fig. 2. Location and map of the Tremestieri and Catania harbours (map data: Google, TerraMetrics, DigitalGlobe). The tide-gauge station is indicated by a red circle.
Fig. 3. Average power spectral density (PSD) for the three different homogeneous periods for background sea level identified for the station of Tremestieri.

Fig. 4. Power spectral density (PSD) of Tremestieri and Catania sea-level series. The average PSD, in black, is shown together with the 10th, 50th and 90th percentiles, in grey. Monthly PSDs are shown with different colours (see legend) to highlight the seasonal variability. Summer months have lower spectral intensity than winter months. The Tremestieri and Catania PSD curves are quite different: a few strong short-period peaks dominate in Tremestieri, whereas Catania curves are rather wavy, with very many peaks distributed over the whole analysed range and dominating peaks between 15 and 23 min. The spectra are smoothed for better comparison.

Fig. 5. Power spectral density (PSD) of Tremestieri in the case of rough sea conditions (R), calm sea (C) and in absence of ferry signals (25 December 2009).

Fig. 6. Empirical frequency distributions computed for the functions (a) |IS|, (b) BS3 and (c) CF3 of configuration A3C2 for the Catania station. Here the EFD for the whole interval is shown together with the EFDs for the months of December and August.

Fig. 7. The seasonal variations in the background sea level can be seen by plotting the standard deviation (red) and the mode (green) of the monthly EFDs of |IS| (a), and the mode (green) and median (violet) of the monthly EFDs of BS3 (b). The standard deviation, mode and median of the EFD for the whole database and of the yearly EFD are indicated with horizontal lines (in the same colours as the monthly parameters). The percentage of available data of each monthly EFD is plotted in black. Only months with at least 75 % of data are used to plot the variations over time (line and circles).

Fig. 8. Cumulative EFDs of the functions |IS| (black), BS3 (red) and CF3 (blue) for the whole base interval together with the maximum and minimum envelope of monthly EFDs for the configuration A3C2 (a). Distance of the envelopes plotted against the cumulative EFD (b).

Fig. 10. Synthetic tsunami signals for Catania and Tremestieri computed through numerical modelling. The estimated tsunami arrival (TAT) is indicated with a vertical line (violet). TEDA detections for all the four windows are indicated (tsunami-detection method, red, and secure-detection method, blue) for the best configuration and for the different sea-state conditions. In general, the TEDA secure method provides later detections.

Fig. 11.Delay times (DTs) of TEDA detections (upper panel) given separately for each scenario and mean |IS| and BS values (lower panel) vs. probability of exceedance of |IS| for the four sea-condition windows.Detections made by the TEDA secure-detection method are given with solid yellow symbols.

Table 1. Scheme of TEDA functions and detections.
TEDA function | Description | Computational time interval | Depends on
IS(t) | Instantaneous slope of sea level | [t − t_IS, t] | sea level
BS(t) | Background slope of sea level | [t − t_BS − t_G, t − t_G] |

Table 2. Characteristic values of the maximum and the average of |IS| for the configuration C2 in the TDI of the four windows of the sea-level records where the synthetic tsunami signals have been added. The TDI corresponds to the first 30 min of the window on the sea-level record on which the tsunami signal is superimposed and defines the interval where detection is valid. Date and initial time (hh:mm:ss) of the windows are given.

Table 3. Results of the event detections for the different configurations.