Natural Hazards and Earth System Sciences

Comparable analysis of the distribution functions of runup heights of the 1896, 1933 and 2011 Japanese Tsunamis in the Sanriku area

Data from a field survey of the 2011 Tohoku-oki tsunami in the Sanriku area of Japan is used to plot the distribution function of runup heights along the coast. It is shown that the distribution function can be approximated by a theoretical log-normal curve. The characteristics of the distribution functions of the 2011 event are compared with data from two previous catastrophic tsunamis (1896 and 1933) that occurred in almost the same region. The number of observations during the last tsunami is very large, which provides an opportunity to revise the conception of the distribution of tsunami wave heights and the relationship between statistical characteristics and the number of observed runup heights suggested by Kajiura (1983) based on a small amount of data on previous tsunamis. The distribution function of the 2011 event demonstrates the sensitivity to the number of measurements (many of them cannot be considered independent measurements) and can be used to determine the characteristic scale of the coast, which corresponds to the statistical independence of observed wave heights.


Introduction
A very strong earthquake (M = 9.0) occurred at 14:46 (JST) on 11 March 2011 in the Pacific Ocean. The event occurred east of the Tohoku district, off the northeast part of Honshu Island, Japan. A huge tsunami accompanied the earthquake; as of 14 July 2011, an estimated 16 011 people were killed and 5242 people are missing and presumed lost. The total of those numbers is 26 347, which is greater than the number of victims of the 1896 Meiji Sanriku earthquake tsunami (about 21 915). Many groups have conducted field surveys on the Sanriku coast, searched for traces of inundation limits, and measured the runup heights. The first results of studies of the 2011 Tohoku-oki earthquake and tsunami have now been published (Koketsu et al., 2011; Fujii et al., 2011; Hayashi et al., 2011; Ozaki, 2011; Tsuji et al., 2011).
Many cities in this area have been affected by historic tsunamis. For instance, the town of Taro was hit by the Meiji Sanriku tsunami of 1896, in which 2859 people (about 95 % of the total population) were killed. In 1933, this town was hit again by the Showa Sanriku tsunami, and 911 people were killed. The most recent tsunami entirely destroyed the newly constructed sea wall, and all of the wooden houses in the new town area were swept away; approximately 200 people were killed there. In the residential area of the old town, seawater flooded over the old sea wall and destroyed almost all wooden houses; fortunately, few people were killed there.
Member groups of the Japan Society of Civil Engineers (JSCE) performed post-tsunami runup surveys. The purpose of these surveys was to observe and document the effects of the tsunami, to collect readily available but perishable data as soon as possible in order to learn about the nature and impact of the phenomenon, and to enable recommendations on the need for further research, planning and preparedness. Data from these surveys have been compiled to provide geographical locations and runup heights online (http://www.coastal.jp/tsunami2011). This nationwide tsunami survey was conducted by joint research groups of 299 researchers from 64 universities and institutes. Inundation heights and run-up heights were measured at 5247 points in total (used data released on …). The historical tsunami records of the Showa Sanriku (1933) and Meiji Sanriku (1896) tsunamis were collected from published sources, and the analyses of these two events were compared with that of the 2011 Tohoku-oki tsunami. Locations of the post-tsunami runup surveys in Japan and the spatial distribution of the maximum heights are shown in Fig. 1. The characteristics of the observations are summarized in Table 1. At nine points on the 25 km of coast between the town of Yamada and the northern limit of Miyako City, seawater rose to heights exceeding 30 m.
However, measured maximum tsunami heights should be interpreted carefully because they are sometimes the heights of splash-up and are not representative of runup heights in a specific region. Even in a limited coastal region, the runup heights vary widely due to interactions between the wave and the coastal topography. As we demonstrate below, the concept of the distribution function can be useful for checking observed data.

Distribution functions of the observed runup heights
Numerical and statistical methods can be applied to analyze data and recover possible missing data. It is important to know the contribution of the highest wave height values, including missing data, to the total distribution of wave heights. Van Dorn (1965) was the first to apply a statistical approach to analyze observed tsunami runup heights. He found that a log-normal distribution was the best fit for tsunamis on the coast of the Hawaiian Islands. This distribution (mathematically, a probability density function, pdf) has the following shape:

$$ f(H) = \frac{1}{\sqrt{2\pi}\,\sigma H \ln 10} \exp\left[ -\frac{(\log H - a)^2}{2\sigma^2} \right] \tag{1} $$

where H is the maximum runup height for each point along the coast, measured in meters, and log denotes the decimal logarithm. This distribution has two parameters with evident physical characteristics: a = <log H> is the average value of the wave height logarithm, and σ is the standard deviation of the height logarithm.

This analysis was continued by Kajiura (1983) with a special focus on the Japanese coast. He analyzed six events: 15 June 1896 Meiji Sanriku, 3 March 1933 Showa Sanriku, 21 December 1946 Nankaido, 23 May 1960 Chile, 16 June 1964 Niigata, and 16 May 1968 Tokachi-oki. Two of these events (1896 and 1933) occurred in the Sanriku area, in the same place as the 2011 event. Histograms of the observed tsunami runups over a total coastal stretch of about 300 km for each event were approximated by the log-normal curve Eq. (1), and the parameters of the distribution were determined and are given in Table 1. The distribution functions for Meiji Sanriku (1896) and Showa Sanriku (1933) are reproduced in Fig. 2. We plotted them for different runup-height intervals, starting from 0.5 m; the agreement with the log-normal curve becomes evident only when the interval is increased to 4 m. A similar histogram analysis was performed for the 2011 Tohoku-oki event based on field survey data; the histogram and log-normal approximation from this analysis are presented in Fig. 2.
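The two parameters of the log-normal curve can be estimated directly from a list of runup heights. The following is a minimal Python sketch, assuming decimal logarithms and the population form of the standard deviation; the function names and the sample heights are hypothetical and are not taken from the survey data.

```python
import math
import statistics


def fit_lognormal(heights_m):
    """Return (a, sigma): mean and standard deviation of log10(H)."""
    logs = [math.log10(h) for h in heights_m if h > 0]
    a = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)  # population form; an assumption
    return a, sigma


def lognormal_pdf(h, a, sigma):
    """Density of H when log10(H) is normal with mean a and std sigma."""
    z = (math.log10(h) - a) / sigma
    norm = math.sqrt(2.0 * math.pi) * sigma * h * math.log(10.0)
    return math.exp(-0.5 * z * z) / norm


# Hypothetical runup heights in meters (illustrative only).
sample_heights = [3.2, 5.5, 8.1, 12.4, 19.0, 27.6]
a, sigma = fit_lognormal(sample_heights)
median_height = 10.0 ** a  # characteristic height, 10 to the power a
density_at_10m = lognormal_pdf(10.0, a, sigma)
```

The factor `math.log(10.0)` appears because the density is expressed per unit of H while the Gaussian variable is the decimal logarithm of H.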
Here we present data from three tsunamis, each documented by a different number of measurements. For the 1896 tsunami, the oldest event, 132 points of measurement were collected. For the 1933 event, data on runup heights were obtained at 205 coastal locations. For the most recent event, in 2011, the number of coastal locations exceeds 5000. In his analysis of the distribution functions of runup heights along the Japanese coast, Kajiura (1983) mentioned that the number of samples can influence the observed characteristics of the distribution function. Regarding the applicability of statistical approaches to the historic tsunamis, he concluded that "because of the strong dependence of tsunami behaviour along the coast on both the characteristics of the incoming tsunami wave train and the topographic conditions of local and regional scales, it is very difficult to draw general conclusions concerning the spatial statistics of coastal wave run-up heights applicable to all regions and tsunamis." In our opinion, the main problem with the use of the log-normal distribution Eq. (1) is that the histograms of runup heights obtained with small intervals (0.5-2 m) do not coincide well with log-normal approximations. A clear correlation is achieved only for large intervals (4 m), when the number of groups is relatively small. Additionally, adding or eliminating measured runup values can influence the statistical characteristics of runup distributions. This problem can be resolved with the use of integral forms of runup distributions (probability function or cumulative distribution), which can be obtained by the integration of the probability density function Eq. (1).
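The integral form can be compared with observations without choosing a histogram interval at all. The sketch below is a hedged Python illustration, assuming that the probability function is the fraction of heights exceeding a given value and that logarithms are decimal; the function names and sample values are hypothetical.

```python
import math


def exceedance_lognormal(h, a, sigma):
    """P(H' > h) when log10(H') is normal with mean a and std sigma."""
    z = (math.log10(h) - a) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def empirical_exceedance(heights_m):
    """Pair each sorted height with its rank-based exceedance fraction.

    The i-th smallest value (i from 0) gets the fraction (n - i) / n,
    so no bin width has to be chosen.
    """
    ordered = sorted(heights_m)
    n = len(ordered)
    return [(h, (n - i) / n) for i, h in enumerate(ordered)]


# Hypothetical runup heights in meters (illustrative only).
sample_heights = [3.2, 5.5, 8.1, 12.4, 19.0, 27.6]
observed = empirical_exceedance(sample_heights)
# Theoretical curve for assumed parameters a = 1.0, sigma = 0.3.
theory = [(h, exceedance_lognormal(h, 1.0, 0.3)) for h, _ in observed]
```

Adding or removing a single observation shifts each empirical point by at most 1/n, whereas it can move a value across a histogram bin boundary.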
The theoretical interpretation of the log-normal distribution of tsunami runup heights is associated with the random topography of the sea bottom (Go, 1997; Choi et al., 2002; Burroughs and Tebbens, 2005; Kaistrenko, 2011). In the simplified ray theory based on the linear shallow-water theory, the runup height, H, is taken as proportional to the wave height in the tsunami source (open ocean), H_0:

$$ H = K H_0 \tag{2} $$

where K is the transformation factor, which depends on the change in sea depth along a propagation path and, within these approximations, is determined by random seafloor topography only. If the propagation path is divided into a series of more or less statistically independent segments, the total transformation coefficient is the product of the local coefficients of the tsunami transformation on each segment. In this case, Eq. (2) can be rewritten in logarithmic form:

$$ \log H = \log H_0 + \sum_i \log K_i \tag{3} $$

where i indexes the random, statistically independent segments along the propagation path and the log K_i can be considered independent random variables. The central limit theorem states that the sum of many independent random variables tends to a Gaussian distribution, and therefore the wave height is described by the log-normal distribution Eq. (1). It is important to note here that the measured data at all points should be statistically independent, but this is difficult to check experimentally. The characteristic correlation scale between different measured values cannot be determined theoretically; it has to be found from an analysis of the obtained data. As we show below, the distribution function can help determine this correlation scale.

Another important consideration is how best to compare distribution functions for different events. The correct scaling should be identified. Choi et al. (2002) suggested the use of a complementary cumulative distribution (probability) instead of the probability density function Eq. (1):

$$ F(H) = \int_H^{\infty} f(H')\, dH' \tag{4} $$

Eq. (4) can be written in universal non-parametric form:

$$ F(\zeta) = \frac{1}{\sqrt{2\pi}} \int_{\log \zeta}^{\infty} \exp\left( -\frac{x^2}{2} \right) dx \tag{5} $$

where $\zeta = (H/\bar{H})^{1/\sigma}$ and $\bar{H} = 10^{a}$. The cumulative distribution, or probability function Eq. (5), was used to analyze the distribution of runup heights in different countries during the 2004 tsunami in the Indian Ocean (Choi et al., 2006). The characteristics of this tsunami in Indonesia, Thailand, Malaysia, India, and Sri Lanka were very different, but the cumulative distributions Eq. (5) written in dimensionless variables were very similar to one another (Choi et al., 2006).
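The central limit argument can be illustrated numerically. The Python sketch below is a toy simulation under stated assumptions, not the authors' procedure: each log10 K_i is drawn uniformly from a hypothetical interval [-0.1, 0.1], a path has 20 segments, and H_0 = 1 m. For a Gaussian variable, about 68 % of the values lie within one standard deviation of the mean, which the simulated log H can be checked against.

```python
import math
import random
import statistics


def simulate_log_runup(n_segments, n_paths, h0=1.0, half_width=0.1, seed=0):
    """Simulate log10(H) = log10(H0) + sum of independent log10(K_i)."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n_paths):
        total = math.log10(h0)
        for _ in range(n_segments):
            # Hypothetical choice: log10(K_i) uniform on [-half_width, half_width].
            total += rng.uniform(-half_width, half_width)
        logs.append(total)
    return logs


logs = simulate_log_runup(n_segments=20, n_paths=5000)
a = statistics.fmean(logs)       # estimate of <log H>
sigma = statistics.pstdev(logs)  # estimate of the std of log H
# Fraction of simulated values within one sigma of the mean.
within_one_sigma = sum(abs(x - a) <= sigma for x in logs) / len(logs)
# Dimensionless heights zeta = (H / 10**a) ** (1 / sigma), via log10(zeta).
log_zeta = [(x - a) / sigma for x in logs]
```

Although each individual factor is far from Gaussian, the sum over 20 segments already has a standard deviation close to the analytical value sqrt(20 × 0.2² / 12) ≈ 0.26, and the rescaled variable log ζ has zero mean and unit variance by construction.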
We applied the same approach to analyze data from the 2011 Tohoku-oki event in comparison with the similarly huge historic events of 1896 and 1933. Figure 3 presents the cumulative distribution (probability) of these tsunami wave heights for the eastern coast of Japan as a whole and the theoretical curve Eq. (4). The distribution of the observed runup heights in 1896 and 1933 was described quite well by the theoretical function Eq. (4), but no such agreement was observed for the 2011 Tohoku-oki event. In our opinion, this can be explained by the large number of "dependent" measurement values (more than 5000), because the interval between some points was 100 m or less.
The influence of the spatial correlation scale of observation data is demonstrated in Fig. 4, where the probability function is plotted for different scales of averaging of the observed runup heights. The spatial scale of 7.5 km gives the best agreement between the theoretical curve and the observational data, and therefore this scale can be taken as the correlation length characterizing the independence of different locations of runup measurements.
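The averaging step can be sketched as follows. This is a minimal Python illustration that assumes each observation carries a distance along the coast and that heights are combined by an arithmetic mean within consecutive segments of equal length; the text does not specify the averaging procedure, so both choices, the function name and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def average_by_segment(observations, scale_km):
    """Average runup heights within consecutive coastal segments.

    observations: iterable of (distance_along_coast_km, height_m).
    Returns one mean height per non-empty segment, ordered along the coast.
    """
    segments = defaultdict(list)
    for distance_km, height_m in observations:
        segments[math.floor(distance_km / scale_km)].append(height_m)
    return [sum(v) / len(v) for _, v in sorted(segments.items())]


# Hypothetical observations: (distance along coast in km, runup height in m).
observations = [(0.0, 2.0), (1.0, 4.0), (7.4, 6.0), (7.5, 10.0), (20.0, 8.0)]
averaged = average_by_segment(observations, scale_km=7.5)
```

The averaged heights can then be treated as the sample from which the cumulative distribution is built, and the procedure repeated for several values of `scale_km` to see which one brings the observations closest to the log-normal curve.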

Conclusions
We demonstrate that distribution functions of runup heights of three catastrophic tsunamis (1896, 1933, and 2011) are very well fitted by log-normal curves based on the assumption of statistical independence of measured wave heights. In the case of the 2011 event, the number of measurements is too high (more than 5000), and some of the runup values are not independent. If runup heights are averaged with a spatial scale of 7.5 km, the best correlation between theory and observations is obtained and this scale can therefore be considered the characteristic correlation scale.

Fig. 1. Observed runup height distributions from the 1896, 1933 and 2011 tsunamis.

Fig. 3. Cumulative distribution (probability) of tsunami runup heights. The symbols denote the individual tsunami events. The solid lines are log-normal curves.

Fig. 4. Cumulative distribution (probability) of the 2011 Tohoku-oki tsunami in dimensionless form. The dashed line corresponds to the log-normal curve. The circles and triangles represent spatial intervals of 4.5 km and 7.5 km, respectively. The dots indicate the individual observed values (over 2000).