An attempt to model the relationship between MMI attenuation and engineering ground-motion parameters using artificial neural networks and genetic algorithms

Complex application domains involve difficult pattern-classification problems. This paper introduces a model of MMI attenuation and its dependence on engineering ground-motion parameters based on artificial neural networks (ANNs) and genetic algorithms (GAs). The ultimate goal of this investigation is to evaluate the target-region applicability of ground-motion attenuation relations developed for a host region by training an ANN on the seismic patterns of the host region. The ANN is trained by supervised learning using existing data from past earthquakes. The combination of these two learning procedures (GA and ANN) allows us to introduce a new method for pattern recognition in the context of seismological applications. The performance of this new GA-ANN regression method has been evaluated on a Greek seismological database with satisfactory results.


Introduction
A common problem encountered in engineering seismology is the difficulty of assessing the damage potential of an earthquake from the distribution of macroseismic intensity. This parameter is subject to interpretation due to the wide variation in geological conditions, the response of structures, uncertainty about construction conditions before the earthquake, the type of construction, and population density. Obviously, a physically-based, ground-motion-related MMI is required for engineering purposes (Tselentis and Danciu, 2008; Danciu and Tselentis, 2007). With the advent of instrumental seismology, the relationship between intensity and ground-motion parameters has become a topic of increasing interest.
Correspondence to: G.-A. Tselentis (tselenti@upatras.gr)
Instrumental seismology offers the possibility of transforming readily observed data (in this case, intensity) into widely-used parameters useful for engineering purposes (in this case, engineering ground-motion measures). It also allows seismologists to evaluate historical earthquakes for which no instrumental data are available in order to assess seismic hazard and damage, correlate different intensity scales, and rapidly assess the severity of ground shaking.
Until recently, macroseismic intensity was related most frequently to peak ground acceleration (PGA) because of that parameter's importance for seismic-resistant design. This is due to the fact that the product of PGA and mass represents the inertial force loading structures (Krinitzsky and Chang, 1988). In recent years, research on earthquake damage prediction has concluded that other ground-motion characteristics, such as duration, frequency, and energy content, all contribute to structural damage.
Recently, Tselentis and Danciu (2008) derived empirical regression equations relating modified Mercalli intensity (MMI) to various ground-motion parameters such as duration, CAV, I_a, characteristic intensity, Housner's spectrum intensity, and the total elastic input energy index.
These relationships have been used to generate maps of estimated shaking intensities within a few minutes of an event based on recorded peak motions. These maps provide a rapid visualisation of the extent of expected damage following an earthquake and can be used for emergency response, loss estimation, and the release of public information through the media.
Another application of engineering ground-motion parameters involves the development of early-warning systems. These systems are low-cost solutions aimed at reducing the seismic risk to vital facilities such as nuclear power plants, pipelines, and high-speed trains. The ground-motion parameters and the damage-potential threshold are essential for these systems.
The objective of the present investigation is to uncover the hidden, complex and often fuzzy relations between the engineering ground-motion parameters and macroseismic intensity and to express these relations in the form of input/output dependencies.
The emergence of neural-network technology (Haykin, 1999; Bishop, 1996) provides valuable insights for addressing the complicated problem of expressing these relations. In this context, neural networks can be viewed as advanced mathematical models used to discover complex correlations between physical process variables from a set of perturbed observations.

Engineering seismological parameters
Because structural and equipment damage is measured according to inelastic deformation, the potential for earthquake damage depends on the time duration of motion, the energy-absorption capacity of the structure or equipment, the number of strain cycles, and the energy content of the earthquake. Therefore, for engineering purposes, parameters that incorporate these characteristics in their definitions are more reliable predictors of an earthquake's damage potential than parameters related solely to the amplitude of ground motion (such as peak ground acceleration, PGA), which are often poor indicators of structural damage. The most commonly-used engineering ground-motion parameters, in addition to PGA and PGV, are Arias intensity (I_a), acceleration response spectrum (S_a), and cumulative absolute velocity (CAV).
I_a, as defined by Arias (1970), is the total energy per unit weight stored by a set of undamped simple oscillators at the end of ground motion. The Arias intensity for ground motion in the x-direction (I_aX) is calculated as follows:

I_aX = (π / 2g) ∫_0^{t_d} [a_X(t)]^2 dt,    (1)

where a_X(t) is the acceleration time history in the x-direction, g is the acceleration of gravity, and t_d is the total duration of ground motion. Spectral acceleration (S_a) is the most common response-spectral parameter and is related to spectral velocity (S_v) and spectral displacement (S_d) according to the following expression:

S_a = (2π/T) S_v = (2π/T)^2 S_d,    (2)

where T is the undamped natural period of a single-degree-of-freedom (SDOF) oscillator.
Cumulative absolute velocity (CAV) is defined as the integral of the absolute value of ground acceleration over the seismic time-history record:

CAV = ∫_0^{t_d} |a(t)| dt,    (3)

where |a(t)| is the absolute value of acceleration and t_d is the total duration of ground motion.
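For concreteness, the two integral measures above can be computed numerically from a discretised acceleration time history. The following Python snippet is an illustrative sketch (not the authors' code): it uses a simple rectangle-rule integration and a synthetic decaying sinusoidal record, and the function names are our own.

```python
import numpy as np

G = 9.81  # acceleration of gravity, m/s^2

def arias_intensity(a, dt):
    """Arias intensity I_a = (pi / 2g) * integral of a(t)^2 dt (Arias, 1970),
    approximated by a rectangle rule on a uniformly sampled record."""
    return (np.pi / (2.0 * G)) * dt * np.sum(a ** 2)

def cav(a, dt):
    """Cumulative absolute velocity: integral of |a(t)| dt."""
    return dt * np.sum(np.abs(a))

# Synthetic 10 s record sampled at 100 Hz (illustrative only).
dt = 0.01
t = np.arange(0.0, 10.0, dt)
a = 0.5 * np.sin(2 * np.pi * 2 * t) * np.exp(-0.3 * t)  # decaying 2 Hz pulse, m/s^2

print("I_a =", arias_intensity(a, dt), "m/s;", "CAV =", cav(a, dt), "m/s")
```

Both measures accumulate over the whole record, which is why they capture duration and energy content in a way that a single peak amplitude cannot.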

Data set
The strong-motion records used for the present investigation were provided by the European Strong Motion Database (Ambraseys et al., 2004). More details on these data can be found in Danciu and Tselentis (2007). The macroseismic information was available in part from the digital database of the European strong-motion data; it was also separately estimated in part from the macroseismic data provided by the Geodynamic Institute of the National Observatory of Athens (Kalogeras et al., 2004). The general criterion for selecting the appropriate MMI value was to allocate to each station the nearest MMI value, within an uncertainty of one unit. If more than one MMI value was observed near the station location, at an equal distance from the station, the average of the values was used (Tselentis and Danciu, 2008).
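The allocation rule just described (take the nearest observed MMI value; average equally near observations) can be sketched as follows. The function name and the flat Euclidean distance in coordinate space are illustrative assumptions, not the authors' implementation.

```python
import math

def allocate_mmi(station, observations):
    """Assign to a station the nearest observed MMI value; if several
    observations are equally near, average them (Tselentis and Danciu, 2008).

    station: (lat, lon); observations: list of ((lat, lon), mmi).
    Distances here are plain Euclidean in degrees -- a sketch, not geodesy.
    """
    dists = [(math.dist(station, loc), mmi) for loc, mmi in observations]
    d_min = min(d for d, _ in dists)
    # Collect every observation tied for the minimum distance and average.
    nearest = [mmi for d, mmi in dists if abs(d - d_min) < 1e-9]
    return sum(nearest) / len(nearest)
```

For example, a station exactly midway between an MMI 6 and an MMI 8 observation would be assigned MMI 7.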
Maps of the reported MMI values, together with the strong-motion instrument locations, were plotted to show reasonable confidence that the allocated MMI values are within one unit of the assigned value. This approach provides rapid visualisation of the macroseismic distribution in the area surrounding the recording stations and might be an efficient approach to assigning MMI values while minimising errors (Atkinson and Kaka, 2007).
At this preliminary stage of the intensity/ground-motion investigation, we decided not to disaggregate the soil effect because the soft-soil subset comprises less than 30% of the total data set. The final data set consists of 310 records from 151 earthquakes and is depicted in Fig. 1 and Table 1. Using the recorded strong-motion data for these earthquakes, we computed the previously mentioned engineering seismological parameters for each horizontal component.
The arithmetic average of the two horizontal components of each independent variable was used for the present investigation.

Artificial neural networks
An artificial neural network (ANN) is an information-processing paradigm inspired by the way biological nervous systems, such as the brain, process information. The most basic element of the human brain is a specific type of cell that provides us with the ability to remember, think, and apply previous experiences to our every action. These cells are known as neurons. The power of the brain comes from the number of neurons and the multiple connections (or synapses) between them.
Figure 2 shows a simplified view of an ANN. It consists of a network of simple processing elements (artificial neurons) organised in several layers: an input layer whose number of neurons matches the dimensionality of the input, one or several hidden layers, and an output layer. The hidden layers transform the inputs before they reach the output layer. When the network is presented with a form to be learned, the neurons simultaneously enter a state of activity that causes a small modification of the synaptic forces between them. This is followed by a quantitative reconfiguration of all of the synapses: some of them become very strong, while others become weak. The learned form is not directly memorised at a precise place; it corresponds to a particular energy state of the network, that is, a particular configuration of the activity of each neuron across a very large set of possible configurations. This configuration is supported by the values of the synaptic forces. Let Y^s_j represent the output of the j-th neuron in layer s, and let W^s_{ij} be the weight connecting the i-th neuron in layer s−1 to the j-th neuron in layer s. Each neuron's activation is characterised by a nonlinear function, such as the sigmoid function in Fig. 3. This function maps the neuron's input to its output and can be expressed by the following equation:

Y^s_j = 1 / (1 + exp(−(Σ_i W^s_{ij} Y^{s−1}_i + b))),    (4)

where b is the bias.
The direction in which the weights are updated is given by the negative of the gradient of the error e with respect to each weight; that is, e is minimised by gradient descent. We therefore modify the synaptic weights to reduce e using the following relation:

W^s_{ij} ← W^s_{ij} + µ e^s_j Y^{s−1}_i,    (5)

where µ is the learning-rate parameter, which usually takes values between 0 and 0.5, and e^s_j is the local error of the j-th neuron in layer s. Weights and bias terms are first initialised at random values. In general, there are no strict rules for determining the network configuration for optimum training and prediction.
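The sigmoid activation and gradient-descent weight update described above can be illustrated on a single neuron. The following is a minimal sketch under our own simplifications (one neuron, one training example, squared-error loss), not the network used in the paper; repeated updates drive the output toward the target.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(w, b, x, target, mu=0.3):
    """One gradient-descent step for a single sigmoid neuron on the
    squared error e = 0.5 * (y - target)^2."""
    y = sigmoid(np.dot(w, x) + b)
    # Local error: the loss gradient backpropagated to the net input.
    local_err = (y - target) * y * (1.0 - y)
    # Move each weight against the gradient, scaled by the learning rate mu.
    return w - mu * local_err * x, b - mu * local_err, y

rng = np.random.default_rng(0)
w, b = rng.normal(size=3), 0.0
x, target = np.array([0.5, -0.2, 0.1]), 1.0
for _ in range(200):
    w, b, y = train_step(w, b, x, target)
```

The learning rate mu = 0.3 sits in the 0-0.5 range mentioned in the text; larger values speed convergence but risk oscillation.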
One representation of the ANN used in the present investigation is shown in Fig. 4, which includes the most commonly-used engineering seismological ground-motion parameters as inputs.

Data processing
We consider automatic MMI assessment based on ground-motion parameters to be part of a larger category of problems encountered in pattern recognition (Poggio and Girosi, 1990). In the present investigation, we consider the following four phases: (1) feature extraction, (2) classification, (3) pre-processing and optimisation, and (4) regression.

Feature extraction
During this phase, we combined the enhanced selection capabilities offered by a GA with the performance of an ANN as a classifier. At first, we used all nine commonly-used input parameters that characterise earthquake ground motion at a site corresponding to an MMI value. These measures include M, log(R), PGV, log(PGV), S_a, PGA, log(PGA), I_a, and CAV (Fig. 4). During this procedure, we use a 9-bit string (S) quantification with binary field values, as follows.

S = [M, logR, S_a, PGV, logPGV, PGA, logPGA, I_a, CAV]    (7)

For example, the three-parameter string [logR, I_a, CAV] is represented as the string 010000011. A genetic algorithm was used to generate populations of strings out of the 512 possible combinations, from 000000000 to 111111111. The total set of available strings (the equivalent of chromosomes) at a given time (i.e., after Max-No-Gener generations), known as the genome, was evaluated by an ANN implemented as a k-nearest-neighbour (k-NN) classifier. This was achieved by comparing the corresponding MMI values (the outputs) against the inputs selected, from the nine possible inputs, by the strings generated by the genetic algorithm. In our implementation, we allowed a population size of 20, where the population size indicates the maximum number of chromosomes (or strings) allowed in a generation.
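The 9-bit chromosome encoding can be sketched as follows. The helper names are our own, but the bit order follows the string S defined above, so the example combination [logR, I_a, CAV] maps to 010000011 as in the text.

```python
# Feature order fixed by the string S of Eq. (7).
FEATURES = ["M", "logR", "Sa", "PGV", "logPGV", "PGA", "logPGA", "Ia", "CAV"]

def encode(selected):
    """9-bit chromosome: bit i is 1 iff FEATURES[i] is in the subset."""
    return "".join("1" if f in selected else "0" for f in FEATURES)

def decode(bits):
    """Recover the feature subset from a 9-bit chromosome string."""
    return [f for f, b in zip(FEATURES, bits) if b == "1"]

print(encode(["logR", "Ia", "CAV"]))  # the example chromosome from the text
```

Encoding subsets as fixed-length bit-strings is what lets standard GA operators (crossover, mutation) act on feature selections directly.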

Classification
The second phase, which deals with classification, is implemented using a k-NN-type neural network. The k-nearest-neighbours (k-NN) algorithm is a method for classifying objects based on the closest training examples in the feature space. The k-NN algorithm is a type of instance-based learning, or lazy learning, in which the function is only approximated locally and all computation is deferred until classification.
In the machine-learning community, instance-based learning (IBL) (Aha et al., 1991), also known as memory-based learning, is a family of learning algorithms that, instead of performing explicit generalisations, compare new instances with instances observed during training. It is called instance-based because it constructs hypotheses directly from the training instances themselves. A direct consequence of this approach is that the complexity of the problem grows with the amount of data available for training and testing.
Of the data set of 310 values, some data points were set aside for testing, while most were used for training. We used a GA-ANN with an IBL approach in order to avoid the data changes produced by normalisation techniques. The Euclidean metric is used to assess distances during the training and testing epochs.
In our approach, we used one of the simplest examples of IBL, namely the k-NN classifier and its Java implementation in the Weka toolbox (Witten and Frank, 2005).
The k-NN classifier is amongst the simplest machine-learning algorithms. In our case, it selects the optimal combination of the nine inputs (see Eq. 7). An object is an instance from the 310 data values given in the feature space; here, it is a five-parameter string consisting of [M, log(R), PGA, I_a, CAV]. It is classified by a majority vote of its neighbours, with the object being assigned to the class most common among its k nearest neighbours.
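A minimal sketch of the k-NN majority vote with a Euclidean metric follows. This is an illustration in plain Python, not the Weka IBk implementation used in the study, and the function name is our own.

```python
import math
from collections import Counter

def knn_classify(query, train, k=2):
    """Classify `query` by majority vote of its k nearest training
    instances under the Euclidean metric.

    train: list of (feature_vector, label) pairs.
    """
    # Sort training instances by distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda inst: math.dist(query, inst[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

With the paper's choice of k = 2, ties between the two neighbours are possible; odd k avoids this, which is one reason k is usually tuned (here by trial and error, as described below).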

Pre-processing and optimisation
In the previous two phases, we considered all inputs as they were, but for processing purposes, we converted all inputs into integers by multiplying them by powers of 10 (Härdle et al., 1995; Mierswa et al., 2006).
Genetic algorithms are a particular class of evolutionary algorithms (EAs) (Fonseca and Fleming, 1995) that use techniques inspired by evolutionary biology, such as inheritance, mutation, selection, and crossover (or recombination). Candidate solutions to the optimisation problem act like individuals in a population, and a fitness function (e.g., a cost function) determines the environment within which these solutions "live".
In other words, a GA quantifies information (namely, the parameters of the k-NN classifier) in the form of strings (the chromosomes), and through the EA, only the fittest chromosomes survive over the generations of the evolution. An important parameter of the proposed GA-ANN method is therefore Max-No-Gener, which allows the GA to evolve towards the optimal solution. In our case, several parameters have to be modified (or fine-tuned) to achieve optimal behaviour of the ANN, including the number of hidden layers and the number of neurons in each hidden layer.
After finding the optimal Max-No-Gener parameter, we determined k and the selection scheme (i.e., the size of the tournament). By a trial-and-error procedure, k was found to have a value of 2. One advantage of the selection mechanism of a GA (Tobias and Lothar, 1995) is its independence from the representation of the individuals: only the fitness values of the individuals are taken into account.
A fitness function is a particular type of objective function that prescribes the optimality of a solution (or chromosome) in a genetic algorithm, so that a particular chromosome can be ranked against all other chromosomes in the genome. Chromosomes found to be "more optimal" are allowed to breed; that is, further binary combinations are created by the GA on the "skeleton" of these "more optimal" chromosomes. Data sets can then be mixed using any of several available techniques, producing a new generation of chromosomes. In our case, this mixing can be represented by taking the first four digits from one "optimal" string and the last five digits from another "more optimal" one, thereby creating a new chromosome (a 9-digit string) that may perform better under k-NN classification. This simplifies the analysis of the selection methods and allows a comparison that can be applied to all kinds of genetic algorithms.
One of the most frequently used selection schemes is tournament selection. In this scheme, we run a tournament among a few individuals chosen at random from the population (i.e., from the genome) and select the one with the best fitness as the winner for crossover; the selection pressure is adjusted by varying the tournament size. In genetic-algorithm theory, crossover is the genetic operator used to modify the programming of a chromosome or a group of chromosomes from one generation to the next. It is analogous to natural reproduction and biological crossover, upon which the simplified GA theory for computational intelligence was built.
First, a single crossover point on the strings of both parent organisms is selected while avoiding the extreme points. All data beyond that point in either organism's string are swapped between the two organism strings. The resulting organisms are the children (or offspring), as shown in Fig. 5.
If the tournament size is larger, weak individuals (chromosomes for which the objective function, i.e., the error, has a higher value) have a smaller chance of being selected for breeding, crossover, and perpetuation into the next generations. Performance was quantified by the RMS and squared-error criteria. A flow chart describing all of the above operations is presented in Fig. 6.
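The tournament-selection and single-point-crossover operators described above can be sketched as follows. The function names and the default tournament size are illustrative assumptions; fitness is treated as an error to be minimised, matching the squared-error objective used in this study.

```python
import random

def tournament_select(population, fitness, size=3, rng=random):
    """Run a tournament among `size` randomly chosen chromosomes and
    return the fittest; lower fitness (error) is better here."""
    return min(rng.sample(population, size), key=fitness)

def single_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two bit-strings at a random crossover point,
    avoiding the extreme positions (cf. Fig. 5)."""
    point = rng.randrange(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b
```

A larger `size` raises the selection pressure: weak chromosomes are less likely to survive a bigger tournament, exactly as described in the text.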
To validate the performance of the above-mentioned GA-ANN selection schemes, we adopted a validation scheme based on regression performance. The results for the selection of the first optimal parameter described above (Max-No-Gener) are presented in Table 2. From this table, we note the following cases.
Suppose that Max-No-Gener = 75 and that we obtain S_1 as the fittest string, where S_1 is given by the GA + k-NN algorithm and corresponds to the combination [M, log(R), log(PGV), I_a, CAV], represented as 110010011. The fittest chromosome is the one for which the objective function has an extreme value; in our case, the objective function is the squared error of the k-NN type of ANN, so it must attain a minimum. If we consider the case with Max-No-Gener = 80 and S_2 as the fittest string, this corresponds to the combination [M, log(R), PGA, I_a, CAV] and is represented as 110001011. In this situation, S_2 is retained as having the best fitness because the k-NN classifier yields a squared error of 0.329 ± 0.170 for S_1 and 0.311 ± 0.154 for S_2. In other words, both the error and its standard deviation are lower than in the case of S_1.
Setting Max-No-Gener higher than optimal, at 90, S_3 corresponds to the combination [logR, I_a, CAV], represented as 010000011; this solution is rejected due to its worse regression performance (i.e., a higher RMS error). The solution with Max-No-Gener = 94 and string S_4, corresponding to the combination [M, logR, S_a, PGV, logPGV, PGA, logPGA, I_a, CAV] and represented by 111111111, is also rejected because it accepts almost all of the input parameters and is therefore a trivial solution.
Finally, an underdetermined solution, which results if values of Max-No-Gener are in the range of 25 to 70, is rejected because GAs require a minimum number of generations to obtain optimal fitness among all available individuals.
Judging from the above, the optimal value Max-No-Gener = 80, with the minimum RMS, was selected. In this case, the optimally selected input parameters are M, logR, PGA, I_a, and CAV. These are the parameters used to express MMI throughout the regression process.

Regression
When the existing data are not sufficient for analysis because part of the data must be used for validation and test sets, it is common to use a cross-validation or rotation-estimation method (Kohavi, 1995). This is a technique for assessing how the results of a statistical analysis will generalise to an independent data set. It is mainly used in applications in which the goal is prediction, particularly when one wants to estimate how accurately a predictive model will perform in practice. Cross-validation involves partitioning the data into complementary subsets, performing the analysis on one subset (the training set) and validating the analysis on the other subset (the validation or testing set). For relatively large datasets, a higher number of folds is usually used (25 in our case).
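The rotation-estimation idea can be sketched as a k-fold split: each fold serves once as the test set while the remaining folds train the model. The interleaved fold assignment below is an illustrative choice of ours, not necessarily the exact scheme used in the study.

```python
def k_fold_indices(n, k):
    """Partition indices 0..n-1 into k folds; yield (train, test) index
    lists, with each fold serving once as the test set."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        # Training set: every index not in the current test fold.
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# e.g. the 310-record data set with 25 folds, as in the text.
splits = list(k_fold_indices(310, 25))
```

Averaging the regression error over all 25 splits gives a generalisation estimate that does not depend on one arbitrary train/test partition.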
Next, the selected optimal input parameters [M, logR, PGA, I_a, CAV] are taken as the multivariate input X, and the response variable Y is understood to indicate MMI. To calculate the coefficients that link Y to the five-dimensional variable X, we used the linear regression approach described by the following equation (Härdle et al., 1995):

MMI = b_0 + b_1 M + b_2 logR + b_3 PGA + b_4 I_a + b_5 CAV,    (8)

where b_0 is the intercept and b_1, ..., b_5 are the coefficients for the ground-motion parameters. The regression yields R^2 = 0.52632, adjusted R^2 = 0.51853, and a standard error of 0.67936.
The obtained b values are presented in Table 3, and the results of the ANOVA statistical test are shown in Table 4. Accordingly, the relation that describes MMI as a function of [M, logR, PGA, I_a, CAV] is as follows:

MMI = 8.824 + 0.417 M − 7.960 logR + 0.380 PGA + 1.105 I_a − 0.551 CAV    (9)

Figure 7 shows the time series corresponding to the original data and the results of the regression analysis. For the GA-ANN selection scheme, we used a Java-based implementation built around the Weka data-mining system (the IBk lazy learner; Witten and Frank, 2005).
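Eq. (9) can be packaged as a small function for reuse. The coefficients are those reported in the text; note our assumption that logR denotes the base-10 logarithm of distance, and that the inputs use the same units as the regression data set.

```python
import math

def mmi_estimate(M, R, PGA, Ia, CAV):
    """Eq. (9): MMI predicted from the five GA-selected parameters.

    Assumes logR = log10(R); coefficients as reported in the paper (Table 3).
    """
    return (8.824 + 0.417 * M - 7.960 * math.log10(R)
            + 0.380 * PGA + 1.105 * Ia - 0.551 * CAV)
```

As a sanity check on the signs: MMI grows with magnitude, PGA, and I_a, and decays with distance, as physically expected.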

Conclusions
In this research, we presented a new approach based on ANNs and GAs to model the relationship between MMI attenuation and engineering ground-motion parameters. The performance of this new regression approach has been tested on a Greek strong-motion database with satisfactory results.
We note that not all of the features selected by the GA-ANN approach have the same influence on MMI attenuation. An approach based on evolutionary algorithms can be useful for weighting the importance of those features. Additionally, a new type of neural network (namely, an evolutionary neural network) could be used to replace the classical k-NN used in the current paper. If we implement an expert system, we can derive a real-time signal-processing system, which can be used for near-real-time damage assessment and shake-map construction.

Fig. 1. Epicentral distribution of earthquakes used in the present analysis.

Fig. 4. Topology of the feed-forward k-NN-type ANN used in the present study.

Fig. 6. Flow chart showing the sequence of feature selection and classification.

Fig. 7. Output of the regression algorithm (red) and the original MMI data (blue) for the 310 considered data points.

Table 1. Database of strong-motion records used.

Table 3. Parameters of the linear regression obtained by the XploRe program for the input data selected by the proposed GA-ANN method.

Table 4. Statistical parameters obtained for the linear regression using the ANOVA test.