Thermal anomalies detection before strong earthquakes ( M > 6 . 0 ) using interquartile , wavelet and Kalman filter methods

Abstract. Thermal anomaly is known as a significant precursor of strong earthquakes, therefore Land Surface Temperature (LST) time series have been analyzed in this study to locate relevant anomalous variations prior to the Bam (26 December 2003), Zarand (22 February 2005) and Borujerd (31 March 2006) earthquakes. The duration of the three datasets which are comprised of MODIS LST images is 44, 28 and 46 days for the Bam, Zarand and Borujerd earthquakes, respectively. In order to exclude variations of LST from temperature seasonal effects, Air Temperature (AT) data derived from the meteorological stations close to the earthquakes epicenters have been taken into account. The detection of thermal anomalies has been assessed using interquartile, wavelet transform and Kalman filter methods, each presenting its own independent property in anomaly detection. The interquartile method has been used to construct the higher and lower bounds in LST data to detect disturbed states outside the bounds which might be associated with impending earthquakes. The wavelet transform method has been used to locate local maxima within each time series of LST data for identifying earthquake anomalies by a predefined threshold. Also, the prediction property of the Kalman filter has been used in the detection process of prominent LST anomalies. The results concerning the methodology indicate that the interquartile method is capable of detecting the highest intensity anomaly values, the wavelet transform is sensitive to sudden changes, and the Kalman filter method significantly detects the highest unpredictable variations of LST. The three methods detected anomalous occurrences during 1 to 20 days prior to the earthquakes showing close agreement in results found between the different applied methods on LST data in the detection of pre-seismic anomalies. The proposed method for anomaly detection was also applied on regions irrelevant to earthquakes for which no anomaly was detected, indicating that the anomalous behaviors can be related to impending earthquakes. The proposed method receives its credibility from the overall capabilities of the three integrated methods.


Introduction
Although most of the precursors have an important role in the earthquake prediction process, the thermal anomaly precursor is one of the precursors which has gained more attention and support from the scientific community across the world (Panda et al., 2007).Thermal anomaly is an unusual increase in Land Surface Temperature (LST) that occurs around 1-24 days prior to an earthquake with increases in temperature of the order of 3-12 • C or more and disappears a few days after the event.
The idea that thermal anomalies may be connected with seismic activity was put into application in Russia, China and Japan.In 1980, Russian researchers detected thermal anomalies prior to an earthquake in Central Asia using satellite images (Tronin, 1996).Then, other researchers reported more observations on thermal anomalies before strong earthquakes.In several studies, evident correlations of thermal anomalies in LST that are apparently related to pre-seismic activities have been identified (Qiang et al., 1991 and1999;Tronin et al., 2000Tronin et al., , 2002Tronin et al., , 2004;;Saraf and Choudhury 2005a, b;Ouzounov and Freund 2004;Ouzounov et al., 2006 andChoudhury et al., 2006;Pulinets et al., 2006).
It should be noted that thermal anomalies may have origins other than earthquakes.In the case of an earthquake, in fact, it can be due to the stress in the underground layers and change of soil characteristics.Qiang (1991) has shown that prior to an earthquake, gases such as methan, carbon dioxide and hydrogen are emitted through rock slots, intensifying green-house effects and affecting the electromagnetic field of the earth.There have been some other theories to explain this phenomenon, such as piezoelectric and elastic strain dilatation forces causing the temperature to increase (Ouzounov and Freund 2004;Freund 2009).To date, none of them has been widely accepted and the mechanism of thermal anomalies is still not clear.Some remote sensing satellites can measure the radiation coming from the earth in thermal bands and provide useful information prior to earthquakes.Due to their suitable temporal and spatial resolutions, the thermal infrared bands of NOAA-AVHRR, Terra-MODIS, Aqua-MODIS and Meteosat-5 data have been used in most recent major earthquake prediction studies based on thermal anomaly.
The focus of this study includes the analysis of the time series of LST data concerning the Bam, Zarand, and Borujerd earthquakes to locate unusual variations related to seismic activities using Interquartile, Wavelet and Kalman filter methods.

LST data
The Moderate Resolution Imaging Spectroradiometer (MODIS) onboard the Terra satellite, was launched on 18 December 1999 for global monitoring of the atmosphere, terrestrial ecosystems, and oceans.On 4 May 2002, a similar instrument was launched on the EOS-Aqua satellite.MODIS, flying on these two satellites with its 2330 km swath width provides almost complete dual global daily coverage in 36 spectral bands between 0.415 and 14.235 µm with spatial resolutions of 250 m (bands 1 and 2), 500 m (bands 3, 4, 5, 6 and 7) and 1000 m (bands 8-36) (Ouzounov et al., 2006).In this study, LST variations close to the studied earthquake epicenters have been analyzed using the daytime LST images provided by NASA (http://modis.gsfc.nasa.gov/data).These data are generated on a daily basis at a temperature resolution of 0.02 • C. Each pixel of a LST image covers an area of 1 × 1 km 2 on the ground.For each image the mean of LST values of a 5 × 5 pixel area centered on the earthquake epicenter has been considered.

Air temperature data
In this study the air temperature (AT) data downloaded from the web site: (http://www.wunderground.com/) have been used to exclude LST data from atmospheric effects.These data have been collected by the meteorological stations close to the studied earthquakes epicenters.

Methodology
Daily variations of the land surface temperature depend on season, geographic location, climatological conditions and other unknown parameters.The unknown variations preclude the possibility of using methods based on normal distribution of data.Since linear solutions and normal distribution are not suitable for time series of LST data due to their unknown variations.In this section, the interquartile, wavelet and Kalman filter methods which will be used in detection of preseismic anomalies are explained.

Anomaly detection using interquartile
As the fluctuation of the temperature very often does not follow a Gaussian probability function, some researchers (Liu et al., 2004;Pulinets and Boyarchuk, 2004;Akhoondzadeh, 2011) use the median value and the interquartile range.The median and the interquartile range of data are used to specify higher and lower bounds in order to distinguish seismic anomalies from the background of natural variations.The higher and lower bounds of the mentioned range can be calculated using the following equations: (1) where x, x high , x low , M, IQR and Dx are parameter, higher bound, lower bound, median value, interquartile range and differential of x, respectively.According to this, if the absolute value of Dx is greater than k, (|Dx| > k), the behavior of the relevant parameter (x) is regarded as anomalous.According to Eq. (3), p = ±100 × (|Dx| − k) /k indicates the percentage of parameter change from the undisturbed state.
If an observed LST falls out of either the associated lower or higher bound, we conclude with a confidence level of about 80-85% that a lower or higher abnormal signal has been detected (Liu et al., 2004).

Anomaly detection using wavelet transformation
In this study, to obtain the time variability of the main periodicities, the wavelet transformation (Eq.4) has been applied on the LST time series of earthquakes.
where, s is the scaling factor, b is the location parameter, * is the complex conjugate of continuous wavelet function and f (x)is the time series under analysis.Due to the variability pattern of our data, the Daubechies 1D wavelet has been applied to identify anomalies in the data.The low frequency seasonal components and high frequency noise have been eliminated using the components of the wavelet transform.The high perturbations of LST are then detectable by wavelet coefficients greater than a pre-defined threshold value.In this study, M + 1.6 × I QR, M + IQR and M + 1.1 × IQR have been selected as optimum threshold values to detect unusual values of the wavelet coefficients for the Bam, Zarand and Borujerd earthquakes, respectively.Optimal threshold values (k) were defined based on an iterative process.Different k values were evaluated to detect high unusual variations during the studied time period.These threshold values are different from case to case.M and I QR are the median and the interquartile range parameters, respectively.The wavelet coefficients of LST values greater than the defined thresholds are regarded as anomaly values (Akhoondzadeh, 2011).

Anomaly detection using Kalman filter
The Kalman filter is a recursive solution to optimize the described systems in the state space.It is a collection of mathematics equations to optimize prediction equations using estimation of state variables and minimization of error covariance.The Kalman filter is suitable for stationary as well as dynamic and linear processes and can be applied for nonlinear systems using Taylor expansion equations.This filter has high capabilities in determination of inner variables and it simultaneously solves state and measurement equations in order to reach optimized unobservable states.In other words, this method uses observed variables (y 1 , y 2 ,...,y t ) to estimate state (x i ) with minimum error.Depending on (i), filtering, prediction and interpolation states are the cases as following definitions (Haykin, 2001): If i = t,i>t or i<t, this method is known as filtering, prediction or interpolation respectively.Equations ( 5) and ( 6) are state and measurement equations: x − t ) as Eq. ( 7).
k t is Kalman coefficient and must be defined based on the minimum of post-error covariance (Eq.8).
Regarding the mentioned equations, measurements would be reliable when covariance of measurement error is close to zero.Kalman filter equations are classified into two categories: (1) time update; time retrieval equations update state and covariance matrixes based on the pre-measurements (Eqs.9 and 10), (2) measurement update; measurement retrieval equations are for feedback of time update effects in the system and reach to an optimum state based on the measurements (Eqs.11, 12 and 13).
Therefore at the beginning, the prediction process is done; then it is corrected based on the observations and again, the prediction process is repeated.If however, the state and measurement equations are nonlinear, (as the time series of earthquake precursors), they could be changed into linear equations using Taylor expansion which is called an extended Kalman filter.This is one of the striking characteristics of the Kalman filter (Haykin, 2001).If the difference between the observed LST value (y t ) and the predicted LST value ( ∧ x − t ) is greater than a pre-defined threshold value, the observed LST value is regarded as an anomaly (Akhoondzadeh, 2011).

Implementation
The implementation has been performed on three registered earthquakes as case studies.The first case study is a strong earthquake of a magnitude of M w = 6.6 that occurred in Bam on 26 December 2003 at 01:56:52 UTC.The second case is an earthquake that happened in Zarand with a magnitude of M w = 6.4 on 22 February 2005 at 02:25:23 UTC.The third one is an earthquake of a magnitude of M w = 6.1 occurring in Borujerd on 31 March 2006 at 01:17:01 UTC (Table 1).Three sets of data about the Bam, Zarand and Borujerd earthquakes have been selected by visual inspection of seismic databases available in http://earthquake.usgs.gov, http://www.emsc-csem.org,http://iiees.ac.ir and http: //geophysics.ut.ac.ir to show the relevance of LST parameter perturbation with earthquakes.In order to exclude variations of LST from atmospheric effects, AT data derived from the meteorological stations close to the earthquakes epicenters  The earthquake day is represented as vertical dotted line.have been taken into account.In each case study, the mean of air temperature data around earthquake day during several years before the earthquake year was shifted to zero and then the residuals were subtracted from the LST variations.Interquartile, wavelet transform, and Kalman filter methods have been used to locate the anomalies.The interquartile has been used with regard to finding signal fluctuations beyond the lower and higher bounds.In order to obtain the time variability of the main periodicities by which the intense anomalies may be highlighted, the wavelet transformation has been applied on the three time series.Since one of the Kalman filtering applications is to study the normal current signal and predict the incoming signal, it has been found suitable for predicting the normal signal.The difference controlled by a predefined threshold value between the predicted and observed signals then would be regarded as an anomaly (Akhoondzadeh, 2011).

The Bam earthquake
In order to analyze pure temperature values, the atmospheric effects have been removed from LST. Red and blue curves in Figure 1 show the time series of LST and AT data, respectively.The data is on the Bam earthquake epicenter during   When implementing the interquartile method, Dx which will be called DLST here is calculated using Eq. ( 3).The corrected LST value exceeds the higher bound (M + 1.3 × IQR) on 6, 7 and 8 December 2003 and reaches to its maximum value 20 and 18 days prior to the earthquake with the value of 23.54% of the higher bound (Fig. 4a).December 2003, 20 days before the earthquake (Fig. 4c, d). Figure 4d illustrates the differences between the observed and predicted LST values estimated using the Kalman filter method.To perform the null-hypothesis test, the same period of LST data for other regions have been considered.The regions have been marked by black asterisks in Fig. 7 to 3 January 2004, the interquartile as an example method has detected anomalies in region E for 20, 19 and 18 days before earthquake (on 6, 7 and 8 December 2003), but not in regions 1, 2 and 3 (marked in Fig. 7). Figure 10b, c and  d do not show any meaningful anomalies for the same dates implying that the detected unusual variations of LST close to the earthquake epicenter can be regarded as an indication for an impending earthquake.

The Zarand earthquake
Figure 2 represents the variations of the LST (red curve) and AT (blue curve) data close to the Zarand earthquake epicenter during the period of 1 to 28 February 2005.The average of AT data during the period of 1 to 28 February from 1999 to 2004 years is shown as a green curve (Fig. 2).In order to exclude the time series of LST data from atmospheric effects, meteorological data obtained from the station close to the Zarand earthquake epicenter has been taken into account.Figure 5a illustrates the corrected LST values from 1 to 28 February 2005 around the Zarand earthquake epicenter.
After applying the interquartile method, an unusual increase in LST values is clearly observed 1 day before the earthquake.Variations of LST clearly exceed the higher bound (M + IQR) of the order of 10.26%, 1 day before earthquake.
Figure 5b represents the wavelet transformation of the LST variations.It shows an anomaly of the order of 10.45%, 1 day before the event.
Using the Kalman filter, unusual behaviors are seen in LST variations (Fig. 5c and d), when the difference between the observed LST value and the predicted LST value reaches its maximum value (i.e.1.46 • C) on 18 February 2005, 4 days before the earthquake.The LST values gradually increase from 17 February 2005 and reach their maximum value on 21 February 2005.The sudden deviation of LST variations on 18 February 2005 can not be predicted by the Kalman filter method and is accordingly regarded as an anomaly.
Figure 11 shows the variations of LST data and detected anomalies using interquartile methods for the Zarand earthquake epicenter and other regions which have been marked in Fig. 8, from 1 to 28 February 2005.Figure 8 illustrates the daytime LST map around Zarand city on 21 February 2005, 1 day before earthquake.The detected anomaly close to the Zarand earthquake epicenter on 21 February 2005 (Fig. 11a) is not seen in other regions which have been shown in Fig. 8 (Fig. 11b, c and d).Therefore the exceeding of LST variations from the higher bound close to the Zarand earthquake epicenter on 21 February 2005 could have been induced by the forthcoming earthquake.

The Borujerd earthquake
The variations of LST, AT and mean of AT (during the period of 1 March to 15 April from 2004 to 2005) data close to the Borujerd earthquake epicenter are shown as red, blue and green curves, respectively, in Fig. 3.
It can be seen that the LST value exceeds the higher bound (M + 1.4 × IQR) defined using the interquartile method and reaches its maximum value (i.e.23.68 %) on 25 March 2006, 6 days before the earthquake (Fig. 6a).
When implementing the wavelet transformation, the LST value exceeds the higher bound (M +1.1×I QR) of the order of 35.29%, on 26 March 2006 (Fig. 6b).
After applying the Kalman filter, the difference between the observed LST value and the predicted LST value reaches its maximum value (i.e.2.7 • C) on 25 March 2006, 6 days before the event (Fig. 6c and d).This means that an anomalous and unpredictable variation in LST data has been occurred on this date.
Figure 9 illustrates the daytime LST map around Borujerd city on 25 March 2006, 6 days before the earthquake.The earthquake epicenter and other studied regions have been marked with black asterisks.Figure 12 represents the results of LST data analysis close to the Borujerd earthquake epicenter and other regions which have been shown in Fig. 9, from 1 March to 15 April 2006.The observed anomaly close to the Borujerd earthquake epicenter on 25 March 2006 (Fig. 12a) is not seen in other studied regions (Fig. 12b, c and d).It indicates that this anomaly can be considered as an earthquake anomaly.

Discussion
It should be pointed out that one of the aims of this study is to integrate the capabilities of the three methods in appropriately detecting actual thermal anomalies.As it can be seen, each method has slightly different lead times and the local time of anomalies, despite using the same data source of LST values.The obtained differences, which are still close, might be related to the differences in the nature of each method.For instance, the interquartile method detects any unusual variations falling outside the predefined bounds.This method excludes some non seismic signals accordingly.The wavelet method significantly detects local maxima of the studied time series.Then, only those local maxima would be valid that are in agreement with outstanding results obtained from other methods.The capability of the Kalman filter method in anomaly detection depends on the determination of the related parameters, such as error covariance matrix in the training process.Then the training is done on the time series created in normal conditions (and not on the anomalous signals).Therefore, the output from the Kalman filter method would not necessarily coincide exactly in details with the results obtained from the other methods.When implementing the Kalman filter, the discrepancies between the observed values and predicted values exceeded the pre-defined threshold regarded as anomaly.In the case of the Zarand earthquake, interquartile and wavelet detected thermal anomaly 1 day before earthquake.The LST value reached its maximum on this date.But the Kalman filter detected thermal anomaly 4 days before the event.In other words, the Kalman filter detected the unusual deviation of LST variations on this date.It can be concluded that the proposed method gets its credibility from the overall capabilities of the three integrated methods.This can be done by accepting the major anomalies presented in all methods while neglecting the minor ones presented only by some methods.
It should be noted that depending on the earthquake magnitude, a specific set of bounds or threshold values should be used when executing the methods.This implies that for anomaly detection in weaker earthquakes (M w <6.0), due to the lower magnitude and anomaly strength, the same bounds or threshold values can not be applied in stronger earthquake data analysis.Therefore, an empirical evaluation should be done to select suitable bounds or threshold values in each case study.
Table 2 represents the detected anomalies in LST parameters variations for the studied earthquakes using the interquartile, wavelet and Kalman filter methods.Using a simple voting, it can be concluded that the 6,  However, it is necessary to take into the account that the meteorological parameters have complicated behavior and sometimes display anomalies in quiet seismic condition that can be associated with other unknown factors.

Conclusions
The efficiency of integrating the interquartile, wavelet and Kalman filter methods to detect anomalies in LST variations for the Bam, Zarand and Borujerd earthquakes has been shown in this study.In each case study, the detected anomalies in LST variations derived from MODIS satellite data were acknowledged using observed anomalies in AT variations obtained from the meteorological stations close to the earthquake epicenter.
Different wavelet functions and levels of wavelet transformation over LST data have been evaluated.Among them, the level 1 approximation information clearly showed the disturbed states that might be associated with an impending earthquake.Since LST variations do not follow the normal distribution and their behavior is nonlinear, the extended Kalman filter has been used to detect anomaly around the earthquake data.
Our results indicate that the highest deviations from a normal state that were regarded as anomaly appeared within the time interval 1-20 days before the earthquakes.It should be point out that the detected thermal anomalies can be related to non-seismic events.
However, when using optical satellite data in systematic and real-time monitoring of anomalies, cloud cover may act as a problem.This is due to the relatively lower temporal resolution of the MODIS satellite images when compared to data provided by geostationary meteorological satellites.Also, the thermal anomaly may be a result of degassing in an area.It is not clear whether topography, vegetation cover, focal depth of the earthquake and the thickness of the Earth's crust play roles in permitting the escape, and the thicker the crust, the lower the capacity for gases to reach the lower atmosphere (Saraf and Choudhury, 2005a).This problem may result in very little or even lack of thermal anomaly observation in some places with different topography and thickness of the crust, making some uncertainties relevant to how sensitive and linear the response of the anomaly to the magnitude and timing of an actual earthquake is.As reported in several studies, a thermal increase of about 3-12 C is considered a thermal anomaly.Therefore, its detection is a delicate task and may be missed in visual and qualitative interpretation.It is required then to establish such a system which detects anomalies accurately and automatically.

18Figure 1 .
Figure 1.Results of LST and AT data analysis for the Bam earthquake (26 December 2003) from 20 November 2003 to 03 January 2004.The red and blue curves represent the LST and AT variations.The mean of AT during the period of 20 November to 03 January from 1998 to 2002 years is represented as green curve.The x-axis represents the day relative to the earthquake day.

Fig. 1 .
Fig. 1. Results of LST and AT data analysis for the Bam earthquake (26 December 2003) from 20 November 2003 to 3 January 2004.The red and blue curves represent the LST and AT variations.The mean of AT during the period of 20 November to 3 January from 1998 to 2002 is represented as a green curve.The x-axis represents the day relative to the earthquake day.The earthquake day is represented as a vertical dotted line.

Figure 2 .Fig. 2 .
Figure 2. Results of LST and AT data analysis for the Zarand earthquake (22 February 2005) 3 from 01 to 28 February 2005.The mean of AT during the period of 01 to 28 February from 1999 4 to 2004 is represented as green curve.5 6 7 8 9 10 11 12 13 14 15

Figure 3 .Fig. 3 .Fig. 4 .
Figure 3. Results of LST and AT data analysis for the Borujerd earthquake (31 March 2006) from 3 01 March to 15 April 2006.The mean of AT during the period of 01 March to 15 April from 4 2004 to 2005 is represented as green curve.5 6 7 8 9

Figure 2 Figure 6 .Fig. 6 .
Figure 4b clearly shows the pre-earthquake LST anomalies detected using wavelet transformation.The peak of anomaly reaches to 19.61% above the threshold value on 7 and 8 December 2003, 19 and 18 days prior to the event.By applying the Kalman filter, it can be seen that the difference between the observed LST value and the predicted LST value reaches its maximum value (i.e.3.1627 • C) on 6

Figure 7 .
Figure 7. Daytime LST map around Bam city in Iran on 07 December 2003.A black asterisk with 'E' letter represents the earthquake epicenter.Other black asterisks represent the selected regions to study LST variations during the studied dates.

Fig. 7 .
Fig. 7. Daytime LST map around Bam city in Iran on 7 December 2003.A black asterisk with "E" letter represents the earthquake epicenter.Other black asterisks represent the selected regions for studying LST variations during the studied dates.

26Figure 9 .
Figure 9. Same as Figure 7 but for the Borujerd city on 25 March 2006.

Table 1 .
List of the earthquakes selected in this study (reported by http://earthquake.usgs.gov/).

Table 2 .
Detected anomalies for studied earthquakes in LST variations using the interquartile, wavelet and Kalman filter methods.