Fault distance-based approach in thermal anomaly detection before strong Earthquakes

Recent studies of earthquake precursors have revealed several processes connected to seismic activity, including thermal anomalies before earthquakes, which can support better decisions about this disastrous phenomenon and help reduce casualties. This paper presents a method for grouping the input data for different thermal anomaly detection methods using the mean land surface temperature (LST) at multiple distances from the corresponding fault during a 40-day investigation period (30 days before and 10 days after the earthquake). Six strong earthquakes with Ms > 6 that occurred in Iran were investigated. Two approaches were used for detecting thermal anomalies: the mean-standard deviation method, also known as the standard method, and the interquartile method, which is similar to the first but uses different parameters as input. Most previous studies have considered thermal anomalies around known epicentre locations, so the investigation can only be performed after the earthquake. This study instead uses a fault distance-based approach, treating the location of the faults as the potential area of the earthquake. This could be considered an important step towards actual prediction of an earthquake's time and intensity. Results show that the proposed input data produce fewer false alarms in each of the thermal anomaly detection methods than the ordinary input data, making the approach more accurate and stable, while thermal data remain easily accessible and their processing algorithms relatively simple. In the final step, the detected anomalies are used to estimate earthquake intensity with an Artificial Neural Network (ANN). The results show that the estimated intensities of most earthquakes are very close to the actual intensities.
Since the locations of active faults are known a priori, the fault distance-based approach may be regarded as a superior method for predicting impending earthquakes on vulnerable faults. Unlike previous investigations, which were only possible after the event, the fault distance-based approach can be used as a tool for predicting future, as yet unknown, earthquakes. However, it is recommended to use thermal anomaly detection as an initial process, applied jointly with other precursors, to reduce the number of investigations that require more complicated algorithms and data processing.

earthquake, as it uses the known location of the fault, which is related to the impending earthquake. Conventional data selection uses only LST and time, while the distance-based grouping of data proposed in this study also takes into account one of the most influential factors in the earthquake's effect: the relevant fault and the regions around it. This study also intends to show the accuracy of combining this assembled data with the thermal anomaly detection results to estimate each earthquake's intensity using an ANN.

Datasets
In this study, the MODIS daily land surface temperature product (MOD11A1) was used for a period of forty days (thirty days before and ten days after the event) for each earthquake. The MODIS daily LST and emissivity data are retrieved at 1 km pixel size by the generalized split-window algorithm, which uses bands 31 and 32. In addition, the relevant active fault was identified according to its proximity to each earthquake's epicentre, and its shapefile was extracted.

METHODOLOGY
This paper presents a method of grouping input data for different thermal anomaly detection methods. It uses the mean land surface temperature at multiple distances of 1 to 20 km from the corresponding fault during a forty-day period running from 30 days before to 10 days after a given earthquake event. To generate the input data and use it in the anomaly detection algorithms, the following steps were performed: pre-processing, fault distant map, and land surface temperature diagram. The data are then used in the anomaly detection methods and the Artificial Neural Network (ANN).

Pre-processing
The first step is to remove from the TIR data the natural and observational noise signals caused by changes in season, view angle and air density. The remaining data should then consist mainly of the unmixed TIR anomaly associated with increased seismic activity. To achieve this, a linear function was fitted to the LST of the previous year, which had no strong seismic activity, and was then subtracted from the LST of the year in which the earthquake occurred.
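The detrending step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `detrend_with_reference_year` is hypothetical, both series are assumed to cover the same calendar days, and the fit is assumed to be a single straight line over the whole window.

```python
import numpy as np

def detrend_with_reference_year(lst_event, lst_reference):
    """Subtract a linear trend fitted on a quiet reference year from the event-year LST.

    Both inputs are 1-D daily LST series for the same calendar window (assumption).
    NaN values in the reference series (e.g. cloudy days) are ignored in the fit.
    """
    lst_event = np.asarray(lst_event, dtype=float)
    lst_reference = np.asarray(lst_reference, dtype=float)
    ref_days = np.arange(lst_reference.size)
    valid = ~np.isnan(lst_reference)
    # Least-squares straight line through the reference-year LST
    slope, intercept = np.polyfit(ref_days[valid], lst_reference[valid], deg=1)
    trend = slope * np.arange(lst_event.size) + intercept
    return lst_event - trend

# Synthetic example: a quiet year with a pure linear seasonal drift,
# and an event year with the same drift plus a 5 K excursion on day 29.
days = np.arange(40)
reference = 290.0 + 0.1 * days
event = reference.copy()
event[29] += 5.0
residual = detrend_with_reference_year(event, reference)
```

In this synthetic case the residual is close to zero everywhere except on the day of the injected excursion, which is the behaviour the pre-processing step aims for.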

Fault distant map
To use the fault in our process, it is necessary to understand the corresponding fault and its surrounding areas at different distances. A fault distant map is a map whose pixel values depend on how far each pixel is from the fault: the closer a pixel is to the fault, the lower its value, and the value increases with distance from it. Figure 1 shows the fault distant map for the Azgalah case as an example.
Figure 1 Fault distant map for Azgalah study case
https://doi.org/10.5194/nhess-2020-391 Preprint. Discussion started: 17 December 2020 © Author(s) 2020. CC BY 4.0 License.
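A fault distant map of the kind described in this section could be computed along the following lines. This is a minimal sketch that assumes the fault has already been rasterised to a boolean mask on the LST grid; the function name `fault_distance_map` and the 1 km pixel size default are illustrative assumptions, and the brute-force distance calculation is chosen for clarity rather than speed.

```python
import numpy as np

def fault_distance_map(fault_mask, pixel_size_km=1.0):
    """Distance in km from every pixel centre to the nearest fault pixel.

    fault_mask: 2-D boolean array, True where the rasterised fault trace lies.
    Brute force: compares every pixel with every fault pixel.
    """
    fault_mask = np.asarray(fault_mask, dtype=bool)
    rows, cols = np.indices(fault_mask.shape)
    fault_rc = np.argwhere(fault_mask)            # (n_fault, 2) row/col pairs
    d_row = rows[..., None] - fault_rc[:, 0]      # (rows, cols, n_fault)
    d_col = cols[..., None] - fault_rc[:, 1]
    dist_pixels = np.sqrt(d_row ** 2 + d_col ** 2).min(axis=-1)
    return dist_pixels * pixel_size_km

# Hypothetical straight fault trace along the middle row of a small grid
mask = np.zeros((5, 7), dtype=bool)
mask[2, 1:6] = True
dmap = fault_distance_map(mask)
```

Pixels on the trace get the value 0 and values grow with distance, as in Figure 1. For full-size rasters a Euclidean distance transform (for example `scipy.ndimage.distance_transform_edt` applied to the inverted mask) would be a much faster alternative.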

Land surface temperature diagram
This paper presents a method that uses the mean temperature in buffers of various radii (i.e. 1-20 km) around the related active fault during the period of investigation. As can be seen in Figure 2, the data are shown as a 3D diagram built from the mean LST at different radii around the related active fault on each day. Each pixel in this data set, which can be represented as an image or a 3D diagram, therefore shows the mean LST in a buffer zone of a certain radius on a specific day. It should be noted that the width of each buffer is only 1 km and that R is the buffer radius (distance) from the related fault. The mean temperature of each buffer is later used as input data to test various anomaly detection methods, such as the interquartile method and the standard deviation method.
Compared with conventional 2D data, this 3D data, which comprises the mean LST values in the different buffers, the time lapse (in days) to the earthquake event and the distance from the fault, allows the anomaly detection methods to perform more appropriately.
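The construction of the day-by-distance diagram can be sketched as below. The sketch assumes a stack of daily LST rasters and a distance map on the same grid; the name `buffer_mean_lst` is hypothetical, and the choice that ring r covers distances from r to r + 1 km (so pixels on the fault fall in the first ring) is an assumption, since the text only states that each buffer is 1 km wide.

```python
import numpy as np

def buffer_mean_lst(lst_stack, distance_km, max_radius_km=20, width_km=1.0):
    """Mean LST per 1 km ring around the fault for every day.

    lst_stack:   array of shape (days, rows, cols) with daily LST (NaN = no data)
    distance_km: array of shape (rows, cols), distance of each pixel to the fault
    Returns an array of shape (days, rings); ring r covers [r, r + 1) * width_km.
    """
    lst_stack = np.asarray(lst_stack, dtype=float)
    distance_km = np.asarray(distance_km, dtype=float)
    n_rings = int(round(max_radius_km / width_km))
    out = np.full((lst_stack.shape[0], n_rings), np.nan)
    for r in range(n_rings):
        ring = (distance_km >= r * width_km) & (distance_km < (r + 1) * width_km)
        if ring.any():
            # Boolean mask selects the ring's pixels for all days at once
            out[:, r] = np.nanmean(lst_stack[:, ring], axis=1)
    return out

# Synthetic check: distance equals the column index, LST = 300 + day + distance
dist = np.tile(np.arange(5.0), (4, 1))
stack = 300.0 + np.arange(3)[:, None, None] + dist[None, :, :]
diagram = buffer_mean_lst(stack, dist, max_radius_km=5)
```

Each row of `diagram` is one day and each column one buffer, which corresponds to one "pixel" of the diagram described above.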

Anomaly detection methods
Two anomaly detection methods have been used in this study. The first one simply uses the mean and standard deviation (Equation 1) of the LST values (Akhoondzadeh, 2011) in each buffer zone:
$$x > \mu + k \times \sigma \qquad (1)$$
where µ is the mean value and σ is the standard deviation of the LSTs, and k is a coefficient of around 1.6 that may change slightly for each study site. For each x (i.e. LST), if Equation 1 is true, x is regarded as an anomaly.
The second anomaly detection method uses a similar approach, but instead of the mean and standard deviation it uses the median and interquartile range (Equation 2) (Saradjian and Akhoondzadeh, 2011). It is known as the interquartile method:
$$x > M + k \times IQR \qquad (2)$$
where M is the median value, IQR is the interquartile range, and k is a coefficient of around 1.3 that may change slightly for each study site. As in the first method, if x (i.e. LST) is greater than M + k × IQR, the behaviour of the LST is regarded as anomalous.
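The two thresholds can be illustrated with a short sketch. It assumes the one-sided form "x greater than centre plus k times spread" stated in the text, applies it to a single buffer's daily series, and uses hypothetical function names; the default k values are the approximate ones quoted above.

```python
import numpy as np

def standard_anomalies(x, k=1.6):
    """Flag values above mean + k * standard deviation (standard method)."""
    x = np.asarray(x, dtype=float)
    return x > np.nanmean(x) + k * np.nanstd(x)

def interquartile_anomalies(x, k=1.3):
    """Flag values above median + k * interquartile range (interquartile method)."""
    x = np.asarray(x, dtype=float)
    q1, median, q3 = np.nanpercentile(x, [25, 50, 75])
    return x > median + k * (q3 - q1)

# Synthetic 40-day series for one buffer: small alternating background
# with a single strong excursion on day 30.
series = np.tile([0.0, 1.0], 20)
series[30] = 10.0
std_days = np.flatnonzero(standard_anomalies(series))
iqr_days = np.flatnonzero(interquartile_anomalies(series))
```

On this synthetic series both methods flag only day 30. To process a full day-by-buffer diagram, the same functions would be applied to each buffer column in turn.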
Figure 4 and Figure 5 show the output of each anomaly detection method for each earthquake. For each earthquake, the thermal anomaly is detectable with both anomaly detection methods, mostly on the day closest to the earthquake and in the buffer zone closest to the fault. These anomaly detection methods were used in other studies with conventional data as input. Although they detected some anomalies, their accuracy was always in question because many false alarm anomalies were detected along with the actual anomaly.
Results show that in the Azgalah, Goharan, Saravan, Brujerd and Sari case studies, the anomalies detected by both methods fall on the day of the earthquake, the day before, the day after, or on all of these days. This difference is due to the temporal proximity between the imaging time and the earthquake, and to the earthquake's intensity. In the Shonbeh case study, although a thermal anomaly was detected on the day of the earthquake, another slightly stronger anomaly was detected 8 days later.
It should be noted that the anomalies detected in buffers far from the related fault differ for each earthquake (mostly in the Saravan and Sari case studies) and do not follow a similar pattern. Moreover, since these pixels are far from the related fault and the epicentre, it cannot be said for certain that they are related to the earthquake. Therefore, these pixels were not considered earthquake-related anomalies, and only anomalies in buffers close to the fault were used as earthquake-related anomalies in the ANN algorithms.
The difficulty of this method lies at large distances. For example, in buffers as far as 20 km from the fault, two pixels inside the same buffer can be up to 80 km apart, depending on the length of the fault itself. As a result, buffers with large radii may contain pixels with various land covers and different temperatures. While limiting the maximum buffer radius would reduce this effect, it would also make the area and diagram under investigation too small, making the method less effective.
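As a rough worked example of this geometry, assume an approximately straight fault of length L (an idealisation introduced here, not a value given in the paper). A buffer of radius R extends a distance R beyond each end of the fault, so the largest separation between two pixels of the same buffer is approximately:

```latex
d_{\max} \approx L + 2R, \qquad
R = 20\ \text{km},\; L = 40\ \text{km}
\;\Rightarrow\; d_{\max} \approx 40 + 2 \times 20 = 80\ \text{km}
```

Under this idealisation, the quoted 80 km separation corresponds to a fault about 40 km long; longer or curved faults would give different values.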
Changing the coefficient value (k) in each anomaly detection process affects the result. The higher the value of k, the higher the threshold for anomaly detection. This reduces the number of anomalies that can be detected and lowers the number of false alarms. On the other hand, increasing the coefficient too far could omit even the main anomaly related to the earthquake. Therefore, an optimal value must be found to increase the efficiency of each method. In this study, the coefficient value is around 1.6 for the standard method and around 1.3 for the interquartile method.
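The trade-off in choosing k can be illustrated with a small sketch using the standard method's threshold (mean plus k times standard deviation). The series is synthetic, the helper name `count_anomalies` is hypothetical, and the extreme k values of 0.1 and 6.0 are chosen only to make the effect visible; they are not values used in the study.

```python
import numpy as np

def count_anomalies(x, k):
    """Number of samples above mean + k * standard deviation."""
    x = np.asarray(x, dtype=float)
    return int(np.sum(x > x.mean() + k * x.std()))

# Synthetic 40-day series: alternating background with one strong excursion
series = np.tile([0.0, 1.0], 20)
series[30] = 10.0

# Very low k flags ordinary background days as false alarms,
# a moderate k isolates the excursion, a very high k misses it entirely.
counts = {k: count_anomalies(series, k) for k in (0.1, 1.6, 6.0)}
```

For this series the count drops from 21 flagged days at k = 0.1 to a single day at k = 1.6 and to none at k = 6.0, mirroring the behaviour described above.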

The impact of difference in Anomaly detection methods
The results show that both anomaly detection methods find the thermal anomaly caused by seismic activity in each investigated earthquake. However, the interquartile anomaly detection method gives a slightly more specific outcome and fewer false alarm anomalies.
Figure 4 shows the results for the standard anomaly detection method. It indicates the anomalies detected around the time of the earthquake in the buffer nearest to the related fault. In the Azgalah, Saravan and Brujerd cases, a few anomalies are detected before the earthquake, while in the Goharan and Shonbeh cases a few anomalies are detected after the earthquake. In the Sari case, the earthquake-related anomalies are detected on the day of the earthquake; however, the anomaly is detected not in the nearest buffer but in the 2-8 km buffer zones. In the Azgalah, Saravan and Brujerd cases, some of the anomalies were detected around 6 days before the earthquake. Although these anomalies are not as strong as those detected near the time of the earthquake, they seem to be related to seismic activity rather than being false alarms.
Results for the interquartile anomaly detection method can be seen in Figure 5. Many anomalies detected by this method are related to the earthquake and are found near the time of the earthquake in the buffer closest to the related fault, with the exception of the Shonbeh earthquake. As mentioned before, in the Shonbeh case study another thermal anomaly was detected besides the main anomaly, almost 8 days after the earthquake, and it was even stronger than the anomaly related to the earthquake. Nevertheless, these results show that the interquartile anomaly detection method gives more specific results and a better outcome for training the ANN than the standard anomaly detection method.
Since the interquartile method created more precise inputs for training the ANN, the anomalies detected by this method were used. Table 2 shows each earthquake's estimated intensity and its accuracy compared with the actual intensity. The results indicate that the best accuracy belongs to the Azgalah case study and the lowest to the Sari case study. The ANN results also show high correlations between the thermal anomaly data and earthquake intensity. Considering the limited number of investigated earthquakes, the ANN maintained good accuracy across the varied thermal data of each earthquake.

CONCLUSION
Thermal anomaly is indeed a significant precursor for strong earthquakes. The proposed method, which analyses the anomalies with respect to buffer zones at different distances from the relevant faults, improves detection accuracy by reducing the number of false alarms. Two thermal anomaly detection methods were used to investigate each earthquake in this study. Although the outcomes of the two methods differ slightly for each earthquake, the interquartile method gives better results than the standard method. Nevertheless, both are more accurate when the anomaly detection algorithms use the proposed grouped input data instead of the ordinary data. The ANN results show that the thermal anomaly data correspond closely with earthquake intensity, and the estimated results are close to the actual intensities. It is recommended to train the ANN with data from more earthquakes and different locations to improve the network's accuracy.
However, it should be pointed out that thermal anomaly on its own is not sufficient for estimating earthquake parameters and activity. It is highly recommended to use it as an initial, primary precursor for limiting the search area, and then to use other precursors that require more complicated data and methods.
Thermal anomaly precursors can also be used in combination with other simple precursors to get efficient and comprehensive results.
Many previous studies of thermal anomalies explored only the areas around the epicentre. The methods used in such studies require the exact location of the epicentre and are therefore only possible after the earthquake has occurred. Since the locations of active faults are known a priori, or can be identified by further investigation, the fault distance-based approach can be a superior method for predicting impending earthquakes on vulnerable faults. Unlike previous investigations, which were only possible after the event, the fault distance-based approach can be used as a tool for predicting future, as yet unknown, earthquakes.