Comparison of machine learning classification algorithms for land cover change in a coastal area affected by the 2010 Earthquake and Tsunami in Chile

Abstract. Earthquakes and tsunamis are natural events that generate geomorphological and land cover changes. The damage is usually so severe and so diverse in nature that tools allowing quick and precise monitoring are imperative. Knowing which classification methods offer the greatest accuracy will therefore improve natural disaster management. We analyze the locality of Tubul (37.21° S; 73.43° W) in the Biobío region, Chile, where the greatest geomorphological changes were documented after the earthquake and tsunami of 27 February 2010. These changes can be analyzed using different machine learning methods. We compare Support Vector Machine (SVM) and Random Forest (RF) with the Maximum Likelihood (ML) method for classifying Landsat TM and ASTER satellite images. The comparison of classifier performance and the accuracy improvement obtained show that machine learning algorithms are superior to traditional classification methods in terms of overall accuracy and robustness. The overall classification accuracy was approximately 97 %. We also visualize the land cover transformations, showing that 26 % of the region was altered. The performance of the machine learning methods was consistent with other studies, and the approach is a valid application for visualizing land cover changes in areas affected by natural disasters.


and Moran 2015, present an overview of recent methodologies and applications of classification models in remote sensing.
We propose implementing machine learning algorithms, namely SVM and RF, with the main motivation of improving performance, accuracy, and reliability over the classification results achieved by traditional methods such as ML. A further motivation is that, in remote sensing of tsunami-related coastal changes, these classifiers (SVM and RF) are less familiar than others such as decision trees (DT) and neural network (NN) variants. We classified satellite images from the periods before and after February 27, 2010, concentrating on the area of the town of Tubul, in central Chile. On this date, an earthquake of magnitude Mw = 8.8 (according to the Chilean seismological service) struck the coast of central Chile, with its epicenter 50 km northeast of the city of Concepción at a depth of 47.7 km (see Figure 1). It caused a magnitude 4 tsunami that affected small bays along an 800 km coastal stretch of the Maule and Biobío regions (Quezada et al., 2012). The earthquake and tsunami destroyed infrastructure in cities and coastal towns (Cienfuegos et al., 2014; Soto et al., 2015) and profoundly modified the territory's geography, most notably through coseismic uplift and subsidence, seawater entry into coastal areas, sand and debris transport, and erosion of dunes and coastal bars (Araya and Carvajal, 2016; Martínez et al., 2011).

-Multispectral Images of Moderate Resolution.
We chose pre- and post-earthquake georeferenced Landsat and ASTER satellite images that were comparable, that is, acquired on dates with similar meteorological conditions. For this reason we did not select images from the days immediately before the event, but from the same time of year in the preceding year. Table 1 presents the characteristics of these images. In addition, the free-access SRTM (Shuttle Radar Topography Mission) digital elevation model (DEM) of the study area was used, with a spatial resolution of 30 meters.

-Data preparation
Prior to the analysis, the images must be processed to give physical meaning to their values, following the methodology shown in Figure 2.
Several corrections are necessary. First, we apply a radiometric correction to obtain reflectance values at the top of the atmosphere (TOA), following the procedure described in Chander et al. (2009). We also perform a topographic correction, because the sun-ground-sensor geometry varies with the topography of the area. This geometry imposes an additional variation on the radiometric data in pixels with very similar ground cover and structural biophysical characteristics but different terrain slope or orientation (Moreira and Valeriano, 2014). The "C correction" detailed in Soenen et al. (2005) was applied. A composite was then built from the 6 bands extracted from Landsat and the 3 bands extracted from ASTER. The training and validation samples, which contain representative pixels for the different land covers, were extracted by selecting regions of interest (ROIs). Within the ROIs, a simple sampling protocol was followed to define the training samples. Finally, the pixel values for each band were compiled into a general list.
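The two corrections can be illustrated for a single pixel with the minimal sketch below. It assumes the commonly used expressions for TOA reflectance, rho = pi * L * d^2 / (ESUN * cos(theta_s)), and for the classic C correction, rho_corr = rho * (cos(theta_s) + c) / (cos(i) + c), where c is the ratio of intercept to slope of a linear regression of reflectance on cos(i). The exact variant and coefficients used in the study are not given here, so the function names, arguments and example values are illustrative only.

```python
import math

def toa_reflectance(radiance, esun, sun_zenith_deg, earth_sun_dist_au=1.0):
    """TOA reflectance from at-sensor spectral radiance (assumed standard form)."""
    cos_sz = math.cos(math.radians(sun_zenith_deg))
    return math.pi * radiance * earth_sun_dist_au ** 2 / (esun * cos_sz)

def cos_incidence(sun_zenith_deg, sun_azimuth_deg, slope_deg, aspect_deg):
    """Cosine of the local solar incidence angle on a tilted surface."""
    sz = math.radians(sun_zenith_deg)
    sl = math.radians(slope_deg)
    d_az = math.radians(sun_azimuth_deg - aspect_deg)
    return math.cos(sz) * math.cos(sl) + math.sin(sz) * math.sin(sl) * math.cos(d_az)

def c_correction(value, cos_i, sun_zenith_deg, c):
    """Classic C correction; c = intercept / slope of the value-vs-cos(i) regression."""
    cos_sz = math.cos(math.radians(sun_zenith_deg))
    return value * (cos_sz + c) / (cos_i + c)

# Illustrative values only: a slope facing away from the sun is brightened,
# while a flat pixel is left unchanged.
shaded_cos_i = cos_incidence(40.0, 60.0, 20.0, 240.0)
flat_cos_i = cos_incidence(40.0, 60.0, 0.0, 0.0)
print(c_correction(0.2, shaded_cos_i, 40.0, c=0.5) > 0.2)
print(math.isclose(c_correction(0.2, flat_cos_i, 40.0, c=0.5), 0.2))
```

In practice c would be estimated separately for each band from the DEM-derived cos(i) values, which is why the SRTM elevation model is needed at this step.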
Figure 2. Methodological scheme illustrating the steps necessary for classification using the SVM and RF algorithms.

Table 2. Definition of the thematic classes selected to represent the land cover.

City
Urban soil and exposed rocks that, like cement, have a high albedo and reflect a large amount of incident energy.

Dry Sediment
Exposed soil with low moisture content and organic matter, such as sand, exposed hillsides, uncultivated areas, cleared areas, burned areas, erosion areas and areas with no vegetation.

Wet Sediment
Soils with high moisture content, such as wetland soils. Also, farmland and coastal deposits.

Low Vegetation
Plant formation where the herbaceous cover is over 40%. This includes land with crop rotation and stands of trees and shrubs covering less than 25% of the area, as well as areas used for agriculture, including cereal crops, vegetables, and fruits.

High Vegetation
Vegetation cover in which the tree stratum is established by natural species such as Coihue, Olivillo, Patagua, and Boldo. It also represents forests where the arboreal stratum is formed by exotic species such as eucalyptus and radiata pine.

Water
Surfaces covered by water, both fresh and salty.

-Classification of images
We executed the classifications of the images using ML, RF, and SVM. In the following subsections, we present a brief description of the 3 algorithms considered.

-Maximum Likelihood
The Maximum Likelihood (ML) method is one of the most widely used classification methods in remote sensing (Yonezawa, 2007). It assumes that the image DN (digital number) values in each user-defined class follow a multivariate normal probability distribution. Although this assumption does not always hold, the method is robust (Ahmad & Quegan, 2012).
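Under the multivariate normal assumption, the decision rule can be written in its usual textbook form. The notation below is introduced here for illustration and does not come from the study: x is the vector of band values of a pixel, mu_k and Sigma_k are the mean vector and covariance matrix estimated from the training pixels of class k, and p(omega_k) is the prior probability of that class.

```latex
g_k(\mathbf{x}) = \ln p(\omega_k)
  - \tfrac{1}{2}\ln\lvert\Sigma_k\rvert
  - \tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_k)^{\top}\Sigma_k^{-1}(\mathbf{x}-\boldsymbol{\mu}_k),
\qquad
\hat{k} = \arg\max_{k}\, g_k(\mathbf{x})
```

Each pixel is assigned to the class with the largest discriminant value. Because mu_k and Sigma_k must be estimated for every class, the rule depends on having enough training samples per class.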

-Random Forest
RF is a learning algorithm that combines information from an ensemble of decision trees, each built using random subsets of variables, to train on and classify data (Breiman, 2001; He et al., 2017). The trees vote to determine the label assigned to unknown samples. This overcomes the problem that any single tree is non-optimal, since combining many trees approaches a global optimum (Rodriguez-Galiano et al., 2012). The set of decision trees, or "forest", is built from the training data selected by the user by means of "bootstrap" sampling. In this sampling, only about 2/3 of the original training data are randomly used for each tree; in addition, a random selection of predictor variables is used to split the nodes during tree construction (Naidoo et al., 2012).
As detailed in Dye et al. (2011) and He et al. (2017), two main adjustment parameters are required to configure an RF algorithm: the number of trees built in the forest and the number of variables considered for splitting each node. RF uses an out-of-bag (OOB) procedure in which the remaining 1/3 of the training samples (randomly left out of the sample used to build each decision tree) are reserved as an internal test set, which yields an unbiased and reliable error rate (Maxwell et al., 2018b).
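The "2/3 in the bag, 1/3 out of the bag" proportions follow from sampling with replacement: the expected share of samples never drawn is (1 - 1/n)^n, which tends to 1/e, or about 0.368, for large n. The sketch below checks this by simulation and shows the voting step. The function names and the example labels are illustrative and are not taken from the study or from the randomForest package.

```python
import random

def oob_fraction(n_samples, seed=0):
    """Draw one bootstrap sample (with replacement) and return the share of
    training samples that were never drawn, i.e. the out-of-bag share."""
    rng = random.Random(seed)
    in_bag = {rng.randrange(n_samples) for _ in range(n_samples)}
    return 1.0 - len(in_bag) / n_samples

def majority_vote(tree_labels):
    """Return the label predicted by the largest number of trees."""
    counts = {}
    for label in tree_labels:
        counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get)

# Roughly a third of the samples are left out of each tree's bootstrap sample.
print(f"out-of-bag share: {oob_fraction(10000):.3f}")
print(majority_vote(["Water", "Wet Sediment", "Water"]))
```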

-Support Vector Machine.
SVM focuses only on the training samples that lie closest, in feature space, to the optimal separation boundary between the classes (Vapnik, 1995). These samples are called support vectors and are used to define the hyperplane with the maximum margin (i.e., separation) between classes (Mountrakis et al., 2011a).
Basic linear decision boundaries are often not enough to classify the categories with high accuracy. Alternative techniques such as kernel functions are used to address the problem of inseparability: they introduce additional variables into the SVM optimization and map the non-linear relationships (through an adequate mathematical function) into a higher-dimensional space (Euclidean or Hilbert). The choice of kernel function often influences the results of the analysis, so, as with the adjustment parameters, it must be chosen carefully (Kavzoglu and Colkesen, 2009). In addition, SVM requires the definition of the "cost" parameter (C parameter), which controls the cost paid by the SVM for erroneous classifications of a training point. Adjusting this parameter balances margin maximization against classification violations (Melgani and Bruzzone, 2004). Notably, SVM does not assume a known statistical distribution of the data to be classified. This is useful because data obtained from remote sensing images have an unknown distribution, and normality is not always a correct assumption for the actual dispersion of the pixels in each class.
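To make the role of the kernel concrete, the sketch below evaluates a Gaussian radial basis function, K(x, y) = exp(-gamma * ||x - y||^2), and the kernel form of an already trained two-class decision function. The support vectors, coefficients and intercept are made-up values for illustration. Training, which is where the cost parameter C enters, is not shown.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian radial basis function kernel between two band-value vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def svm_decision(x, support_vectors, dual_coefs, intercept, gamma):
    """Kernel form of a trained two-class SVM decision function.
    dual_coefs holds alpha_i * y_i for each support vector (hypothetical values)."""
    total = sum(coef * rbf_kernel(sv, x, gamma)
                for sv, coef in zip(support_vectors, dual_coefs))
    return total + intercept

# Hypothetical model with one support vector per class.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coefs = [1.0, -1.0]
print(svm_decision([0.1, 0.0], support_vectors, dual_coefs, 0.0, gamma=1.0) > 0)
print(svm_decision([0.9, 1.0], support_vectors, dual_coefs, 0.0, gamma=1.0) < 0)
```

A larger gamma makes the kernel decay faster with distance, so each support vector influences a smaller neighbourhood and the decision boundary can become more irregular.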

We created a confusion matrix to obtain the overall accuracy and analyze the reliability of the implemented models (Tralli et al., 2005). To evaluate the algorithms we also calculated the kappa statistic, which measures the agreement between pairs of data across a set of categories while correcting for the agreement expected by chance. Its values range from -1, which indicates complete disagreement between the categories, to +1, which indicates perfect agreement (Comber et al., 2012).
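Both measures can be computed directly from the confusion matrix, as in the minimal sketch below. It uses the standard definition of Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the row and column totals. The 2 x 2 matrix is a made-up example, not data from Table 3.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def cohen_kappa(matrix):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Made-up example: rows are reference classes, columns are predicted classes.
example = [[45, 5],
           [10, 40]]
print(f"overall accuracy: {overall_accuracy(example):.2f}")  # 0.85
print(f"kappa: {cohen_kappa(example):.2f}")                  # 0.70
```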

-Classification scheme.
We used the R software to perform the ML, SVM, and RF classification process. R and its packages are free and open-source (R: The R Project for Statistical Computing) and offer a wide variety of functions for implementing algorithms and statistical analysis. The packages used were "e1071" (SVM implementation), "randomForest" (RF implementation), "rasclass" (ML implementation), and "sp", "raster", and "rgdal" (these 3 packages allow raster data to be read in R). To train and validate the algorithms, the ROIs were subset to provide 25% of the data for training and the remaining 75% for validation.
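The 25 % / 75 % partition can be sketched as a simple random split. This is only an assumption about how the subsetting might be done: the text does not say whether the split was stratified by class, and the function name and seed are illustrative.

```python
import random

def split_train_validation(samples, train_fraction=0.25, seed=42):
    """Randomly split labelled samples into training and validation subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Illustrative use with 100 dummy sample identifiers.
train, validation = split_train_validation(range(100))
print(len(train), len(validation))  # 25 75
```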
As mentioned in section 2.4.3., SVM and RF require some parameters defined by the user. In SVM classification, the first parameter is the kernel function type; here we selected the Gaussian radial basis function (RBF), as it is the most commonly used in remote sensing [49]. Two additional parameters need to be defined: "cost" (C parameter, which controls the cost paid by the SVM for erroneous classifications of a training point) and "gamma" (associated with the radial basis kernel function). In a first stage, their values are those predetermined by the software. The RF model requires two user-defined parameters: the number of variables available for each node split (mtry) and the number of decision trees (ntree) produced.

To obtain the highest classification accuracy, we performed a parameter optimization process to define the optimal learning algorithms. This step is necessary because machine learning algorithms are sensitive to the parameters defined by the user (Mountrakis et al., 2011a). The optimization was carried out by testing parameter combinations through 10-fold cross-validation (Huang et al., 2002). The optimized parameters obtained were "gamma" (equal to 10) and "C" (equal to 1000) for SVM, and "ntree" (equal to 200) and "mtry" (equal to 3) for RF. Using the "epicalc" package, available in R, we created error matrices, from which the overall reliability (total agreement) of the models and the kappa index can be obtained.
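The search over parameter combinations with 10-fold cross-validation can be outlined as follows. This is a generic sketch, not the procedure as run in R: the classifier is passed in as a callable, and a trivial threshold rule stands in for it so that the example is self-contained. In the study the callable would wrap an SVM (with cost and gamma) or an RF (with ntree and mtry), and the grid values below are hypothetical.

```python
import itertools
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_val_accuracy(fit_predict, X, y, params, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predictions = fit_predict([X[i] for i in train], [y[i] for i in train],
                                  [X[i] for i in held_out], **params)
        hits = sum(p == y[i] for p, i in zip(predictions, held_out))
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)

def grid_search(fit_predict, X, y, grid, k=10):
    """Try every parameter combination and keep the best cross-validated one."""
    best_params, best_score = None, float("-inf")
    names = sorted(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cross_val_accuracy(fit_predict, X, y, params, k)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def threshold_classifier(X_train, y_train, X_test, threshold):
    """Stand-in classifier for the demo: it ignores the training data."""
    return [1 if x > threshold else 0 for x in X_test]

X = list(range(100))
y = [1 if x >= 50 else 0 for x in X]
best_params, best_score = grid_search(threshold_classifier, X, y,
                                      {"threshold": [10, 49.5, 90]})
print(best_params, best_score)
```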

-Classification
After the SVM and RF algorithms were trained, we validated the models by performing an accuracy assessment using the ROI data set. The SVM and RF classification methods were applied to all corrected images. We applied the ML method only to the ASTER images, since it requires a longer processing time. Consequently, SVM and RF are compared on Landsat, while all three methods (SVM, RF, and ML) are compared on ASTER.
The results are raster maps, or thematic maps, that can be viewed using any geospatial software (e.g., QGIS). These thematic maps contain the six classes into which the land cover was categorized (specified in Table 2).
The colors of the thematic maps have the same meaning in all images: black represents city or exposed rock, yellow dry soil or soil with little vegetation cover, tan wet soil, light green low vegetation, dark green high vegetation, and blue water.
Table 3 shows the error matrices together with the overall agreement and kappa index for each of the models generated from the different satellite images and acquisition dates. The land cover categories are represented by the numbers 1 to 6. Table 3 also shows that the overall accuracies of the SVM and RF models vary between 96% and 97%, while the kappa index varies between 0.95 and 0.97. In contrast, for larger ensembles, overall accuracy is close to 98%. The average kappa index is 0.96 for both the SVM and RF algorithms, much higher than the 0.86 obtained for the ML algorithm. The ML model has a lower accuracy (close to 90%) because it depends to a greater degree on the number of samples available for each category. Accuracy levels are high in most cases (Table 3). This is related to the thematic classes used, which separate land covers that are clearly distinguishable from each other and therefore produce less confusion between classes. The number of training samples is also abundant in all cases, which helps to optimize the algorithms.
Thematic maps are presented in Figures 3.I and 3.II for the SVM method only, since it showed the highest accuracy; the changes in land cover are described from these maps. Table 4 presents the land cover description for Tubul, comparing the situation before and after the earthquake and tsunami. For each land cover category, the number of classified pixels is displayed, together with the percentage that this number represents of the total number of pixels in the image (%class). The last column (Class increase) gives the difference, in percentage terms, between the two study periods for each category, which indicates whether the category increased or decreased (due to the earthquake and tsunami effects).

In Tubul, a retreat of the sea of several tens to a hundred meters was reported, because the low slope of the area was strongly affected by the co-seismic uplift of 1.4 ± 0.1 meters (Quezada et al., 2012), forming a sandy beach (see Figure 3). According to the changes in the thematic map, the coastline receded approximately 200 meters. Another clear effect is the drying of the Tubul and Raqui rivers. Wet sediment replaced the areas from which water cover retreated, and this class increased by over 20%, to the detriment of the water and vegetation classes. Exposed rock cover also appeared, a product of landslides in steep areas and rock removal in coastal areas (see Figure 3).
The results presented in Table 4 describe the change in land cover as a percentage of the total study area. Urban land increased from 5.9% to 7.6% and dry land from 13% to 16.9%, while wet soil showed a large increase from 30% to 50.3%. The two vegetation categories decreased in coverage: low vegetation from 16.7% to 10.6% and high vegetation from 12.1% to 6.8%. Likewise, water cover decreased from 22.3% to 7.8%. Together, these variations give a total change rate of 26.1%.
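The per-class differences can be recomputed from the percentages quoted above. The sketch below also sums them into a net change, taken here as half the sum of the absolute differences, since every gain in one class is a loss in another. This is an assumption about how a total might be derived, not the authors' stated method. With the rounded figures it gives about 25.9%, close to the reported 26.1%. The difference may come from rounding or from a per-pixel comparison, which the text does not specify. The class labels follow Table 2.

```python
# Percentages of the study area before and after the event, as quoted in the text.
before = {"City": 5.9, "Dry Sediment": 13.0, "Wet Sediment": 30.0,
          "Low Vegetation": 16.7, "High Vegetation": 12.1, "Water": 22.3}
after = {"City": 7.6, "Dry Sediment": 16.9, "Wet Sediment": 50.3,
         "Low Vegetation": 10.6, "High Vegetation": 6.8, "Water": 7.8}

def class_changes(before, after):
    """Change of each class in percentage points of the study area."""
    return {name: after[name] - before[name] for name in before}

def net_change(before, after):
    """Half the sum of absolute class changes (assumed definition of net change)."""
    return sum(abs(delta) for delta in class_changes(before, after).values()) / 2

for name, delta in class_changes(before, after).items():
    print(f"{name}: {delta:+.1f}")
print(f"net change: {net_change(before, after):.1f}")
```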
Among the land cover changes mentioned above, the increase in wet soil stands out. It corresponds to the appearance of wet soil in areas previously covered by water, due to the desiccation of the river basin and the retreat of the coastline, which exposed the seabed. Vegetation cover also varied considerably, especially low vegetation; this corresponds to landslides of rock and sediment and to the entry of seawater into inland areas, processes that caused vegetation death and subsequent soil exposure (without vegetation).

For the image classification, the three algorithms show good results, although the SVM and RF learning algorithms have superior performance.
As far as the SVM classifier is concerned, there is controversy, from an algorithmic perspective, about which kernel to use and about parameter selection and optimization. Some authors postulate that optimization processes do not increase classification accuracy (Zhang and Xie, 2013), while others show that parameter fluctuations have a great impact on accuracy. In this analysis, the parameter optimization process yielded a high C parameter and a low gamma value, which agrees with what has been postulated elsewhere, particularly in Maxwell et al. (2014). In addition, the accuracy of the algorithm depends on the choice of the kernel function (Huang et al., 2002); in this case, the RBF kernel brought good results. All of the above adds to multiple works that provide empirical evidence supporting the theoretical formulation and motivation behind SVM (Maxwell and Warner, 2015; Maxwell et al., 2014; Zhang and Xie, 2013).
For SVM, one of its most salient features is its ability to generalize well from limited and/or poor-quality training data (Mountrakis et al., 2011a). This is reflected in the high levels of overall reliability delivered by the models (see Table 3), even though the representative pixels of the training samples for each class did not include only pure pixels of that land cover category. Such mixing causes a deviation from the reflectance that would correspond to a single ground cover, which implies a decrease in the quality of the training data. The ability of SVM to achieve high accuracy despite these limitations is in line with the concept of the "support vector", which relies on only a few data points to define the hyperplane of the classifier, making the process computationally lighter than other methods (Pal and Mather, 2005).
https://doi.org/10.5194/nhess-2020-41 Preprint. Discussion started: 13 March 2020. © Author(s) 2020. CC BY 4.0 License.

RF, on the other hand, delivered very good classification accuracy, comparable to that of SVM. In addition, it is very easy to use, since it has only two parameters (the number of variables in the random subset at each node and the number of trees in the ensemble) and is not very sensitive to their values (Rodriguez-Galiano et al., 2012). Regarding the number of decision trees, the optimization reached an optimal value of 200, which agrees with other studies showing that accuracy rises as the number of trees increases, but only up to a point beyond which it stabilizes (Ghimire et al., 2012; Shi and Yang, 2019). For the number of random variables available at each node, the value obtained was 3. Although this value is low, it limits the correlation between trees (Breiman, 2001).
The RF algorithm has the advantage of generating an unbiased internal estimate of the generalization error (OOB error) (Cánovas-García et al., 2017), so an independent evaluation subset is not strictly necessary. However, to assess classification accuracy in the same manner as for the other algorithms, separate training and validation data were used. Randomness is desirable in this evaluation set to avoid the bias generated in the measurement of the OOB error. RF also provides an evaluation of the importance of the variables (bands), both for the overall classification of the land cover categories and for each category, using the Gini index and the OOB estimate (Breiman, 2001). In this study this estimate was not considered because the number of bands in the analyzed images was quite small (6 for Landsat and 3 for ASTER), so reducing them would not significantly reduce the computational cost.
SVM and RF obtained similar overall accuracy, classifying the land cover categories equally well, which is consistent with research indicating a similar level of accuracy for both types of classifiers (Adam et al., 2014; Pal and Mather, 2005). The high accuracy obtained by the learning algorithms is due in part to the large number of training samples used and the low number of bands in each image, as postulated in Pal and Foody (2010).

-Conclusion.
The results presented here show that machine learning algorithms performed excellently in classifying land cover changes caused by a catastrophic process such as an earthquake with many aftershocks accompanied by a tsunami. These results reflect the good classification accuracy and the choice of parameters obtained through parameter optimization. In addition, the large number of training samples makes the algorithms more robust. The results provide new perspectives on the performance of the SVM and RF algorithms for mapping land cover in large and heterogeneous areas. They also add to other research showing the superiority of learning algorithms over more traditional methods, making them the best option for classifying land cover in heterogeneous areas (Maxwell et al., 2018b; Melgani & Bruzzone, 2004).

The changes in land cover produced by the earthquake and tsunami, obtained here through satellite image classification algorithms, are comparable to those reported in other studies with similar characteristics (Ishihara et al., 2017; Pandey et al., 2019).
This study takes a more local-scale approach to land cover changes before and after the 2010 event in Chile, capturing changes in more limited areas. This distinguishes it from other work that analyzed this event by generating thematic maps at a more regional scale (Rojas et al., 2013).
This work helps to show that earthquakes and tsunamis, which are very rare, powerful and destructive natural events with deep consequences for the landscape, can be analyzed quickly using passive satellite imagery and new machine learning methodologies, which help to measure quickly and precisely not only the extent of a catastrophe but also its real effects on the territory. In addition, the approach can be established as a tool for generating risk maps for catastrophic events, serving as a risk management instrument to improve territorial planning in coastal areas, optimize evacuation routes, and guide the creation of artificial barriers to protect urban areas.