UAV-based urban structural damage assessment using object-based image analysis and semantic reasoning



Introduction
The challenges and importance of structural damage assessment, in particular its critical role in efficient post-disaster response, have placed this discipline in the spotlight of the remote sensing community (Rastiveis et al., 2013). The information generated is primarily used by search and rescue (SAR) teams, but is also valuable for many other stakeholders engaged in post-disaster activities, such as those dealing with the estimation of economic losses, recovery, or reconstruction (Barrington et al., 2011).
For rapid damage assessment, remote sensing has been found to be very useful, as it can cover large areas, and image-based assessments are realized more rapidly than through ground deployment of appropriately skilled surveyors. However, so far it has not reached the level of detail and accuracy of ground-based surveys, a target our research aims at helping to reach. The limitations of image-based damage assessment are only partly related to the spatial resolution of the sensors. The primary problem is the vertical perspective of most operational sensors, which largely limits the building information to the roofs. This roof information is well suited for the identification of extreme damage states, i.e. completely destroyed structures or, to a lesser extent, undamaged buildings. However, damage is a complex 3-dimensional phenomenon, and important damage indicators expressed on building façades, such as cracks or inclined walls, are largely missed, preventing an effective assessment of intermediate damage states.
Oblique color imagery, which shows both roof and façades, was already identified as a potential solution by Mitomi et al. (2001), who attempted to use oblique TV footage to map structural damage. Commercial oblique color data acquired by Pictometry© of post-earthquake Port-au-Prince (Haiti) were tested by Gerke and Kerle (2011a) and Cambridge Architectural Research Ltd. (CAR), among others, and were found to be more useful than conventional vertical images. However, such data also lead to challenges resulting from the multi-perspective nature of the data, such as how to create single damage scores when multiple façades are imaged. Part of the solution to these challenges lies in modern oblique data that are acquired as multi-perspective stereo pairs, which allow the generation of 3-D point clouds. These exceed standard LiDAR point clouds in terms of detail, especially at façades, and provide a rich geometric environment that favors the identification of more subtle damage features, such as inclined walls, that otherwise would not be visible, and that in combination with detailed façade and roof imagery have not been studied yet.
Nevertheless, commercial oblique imagery is typically difficult to obtain in disaster situations, and control over data acquisition with piloted aircraft (e.g. Pictometry©) tends to be limited for researchers or disaster responders. Unmanned aerial vehicles (UAVs) appear to be an alternative, especially because of their ability to obtain data at higher spatial resolution, but also because they afford more flexible data acquisition that improves the quality of the point clouds that can be derived.
The image interpretation process still typically relies on expert-based visual assessment because of the complexity of the task. Most operational post-disaster damage mapping, such as the processing of satellite data acquired through the International Charter "Space and Major Disasters", remains based on visual interpretation (e.g. Kerle, 2010; Voigt et al., 2011). While oblique airborne data should in principle allow an easier and more accurate damage assessment, owing to their comparatively high spatial resolution and more complete representation of a building, the data richness itself actually hinders more automated analysis procedures. However, there seems to be an inherent limitation of remote sensing imagery for damage assessment, regardless of type and quality: visual analysis of the Pictometry© data of Port-au-Prince by CAR also only achieved accuracy rates of 63 % when compared with ground assessment (Corbane et al., 2011; K. Saito, personal communication, July 2010). Nevertheless, expert-based visual assessment of complex data also relies only on directly visible spectral indicators and relatively coarse geometric information. Combining those indicators that form the basis for visual assessment with more subtle geometric features from 3-D point clouds may lead to better performance.
Automatic image analysis techniques for building damage assessment (BDA) can be broadly grouped into pixel- and object-based methods. In a variety of domains, object-based techniques have shown advantages over pixel-based approaches (Yamazaki and Matsuoka, 2007). This tendency has to do with the spatial resolution of modern remote sensing images, where target elements are clusters of pixels that are better captured by objects rather than pixels (Johnson and Xie, 2011). Additionally, object-based image analysis (OBIA, in the literature also referred to as object-oriented image analysis, OOA) adds a cognitive dimension that is expected to help in a detailed object classification.
In this study we thus aimed at maximizing the potential of modern multi-perspective oblique imagery captured from UAVs, using both the high-resolution image data and derived 3-D point clouds, resulting in a detailed representation of all parts of a building. This comprehensive appraisal, which approaches ground-based damage assessment in terms of complexity and completeness, was coupled with a semi-automatic extraction of a range of damage indicators using OBIA. This allowed a complete characterization of the images, especially by using OBIA's cognitive dimension for the damage-feature extraction. In this study we did not yet aim at an automatic classification into per-building damage scores. Instead, our assumption was that severe damage could be determined directly from the 3-D point cloud data, while for distinguishing lower damage levels structural engineering expertise remains necessary. Therefore, in an earlier study, a Master of Science thesis that triggered this more detailed research (Fernandez Galarreta, 2014), we created a set of experiments to enhance the UAV images by annotating them with the OBIA-extracted damage features. The annotated images were given to experts in ground-based damage assessment to assess the added value of the OBIA information, but also to study scoring variability and uncertainty among the experts.
Therefore, in the final part of this study we addressed the multi-perspective dimension of the data set, taking into account all information collected from the façades and roofs, and aggregating it at a building level by mimicking the cognitive assessment process of ground surveyors.

State of the art in image-based damage assessment
Remote sensing for BDA has undergone tremendous changes over time. Its roots go back to George Lawrence and his 49-pound camera attached to a set of kites over earthquake-ravaged San Francisco in 1906, and today companies such as Skybox (2013) can deliver HD videos from satellites. However, the challenges of BDA are only partly rooted in image type and spatial image resolution; viewing angle, understanding of the damage features, and subjectivity, amongst others, are factors that also add to the complexity of this kind of study.
The utility of almost every platform and sensor, in their multiple combinations, has been assessed for BDA. There are many examples of successful studies where the results obtained have been satisfactory and useful, e.g. Li et al. (2010) using VHR satellite imagery, Ehrlich et al. (2009) processing VHR radar imagery, or Khoshelham et al. (2013) employing aerial LiDAR data sets. For a deeper review of platforms and data types used for damage mapping, see reviews by Kerle et al. (2008), Zhang and Kerle (2008), and Dell'Acqua and Gamba (2012).
For the above-mentioned studies, regardless of their different sensor/platform combinations, the perspective constraint applies: the typically near-vertical perspective of sensors effectively limits the damage signature to the roofs (Gerke and Kerle, 2011a), resulting in a high dependence on proxies, e.g. changes in shadows, or evidence of blow-out debris (Kerle and Hoffman, 2013). In reality, structural damage is a phenomenon expressed in all parts of a building and, in particular, the intermediate damage levels tend to display damage evidence on their façades, the absence of which in vertical data constitutes a severe limitation for complete BDA.
To overcome this constraint, color images have been acquired from an oblique perspective to allow the evaluation of building façades. Mitomi et al. (2001) and Rasika et al. (2006) were examples of early use of this type of non-conventional imagery. However, despite studies such as that by Weindorf et al. (1999), which tried to overcome low image quality issues, challenges persisted. Recent, more sophisticated and controlled image acquisition systems, such as Pictometry or the multi-head mid-format camera systems offered by Microsoft or Hexagon, have allowed data processing based on advanced photogrammetry and machine learning principles (Gerke and Kerle, 2011b). However, besides the improvements offered by oblique imagery acquired from piloted platforms, UAVs provide additional advantages (Nonami et al., 2010): fully controlled flight, VHR imagery of up to 2 cm resolution that allows detection of fine cracks, and a large degree of image overlap that supports the generation of very detailed point clouds. However, UAVs are still in development and have to overcome a variety of issues, such as short battery life and thus limited area coverage, unforeseen behavior in variable atmospheric conditions, typically limited pilot training of the user, and legislation that severely limits the use of UAVs in most countries.
As stated before, image interpretation for BDA is not trivial, especially in complex urban areas. Manual approaches constitute an easy and direct method, though with a number of constraints compared with automatic approaches. At the same time, they are capable of addressing damage holistically, i.e. in its entirety, as expert knowledge can be well matched to a given level of ambiguity and uncertainty (Rastiveis et al., 2013). Automatic approaches developed to date have struggled to deal with the uncertainty inherent in damage assessment, although approaches such as that by Rastiveis et al. (2013), who explored fuzzy decision making, or by Li et al. (2010), who studied urban damage detection incorporating support vector machines and spatial relations, have been trying to overcome this limitation.
Within the class of automatic approaches, OBIA techniques frequently outperform pixel-based methods for the reasons given above. In particular, recent work aiming at automatic identification of optimal segmentation settings, e.g. the ESP 2.0 tool of Drăguţ et al. (2014) and the plateau objective function of Martha et al. (2011), as well as research on the use of machine learning for better identification of suitable image features and for threshold parameterization (e.g. Stumpf and Kerle, 2011), has increased the utility of OBIA for more complex automated procedures.
Besides the object-based approach itself, one of the most interesting advantages of OBIA is its cognitive dimension. This has already been exploited in other fields, such as landslide mapping (Lu et al., 2011), but to date it has not been used for detailed BDA. This cognitive dimension aims at supporting a damage-feature extraction that is frequently more conceptual than physical (Kerle and Hoffman, 2013). Damage features, due to their complexity and variability, are frequently hard to reduce to a set of parameters that describe them as image features.
BDA conventionally makes use of a damage scale. The European Macroseismic Scale of 1998 (EMS-98; Grünthal, 1998) is a damage scale that classifies buildings from D1 (negligible damage) to D5 (total collapse). Even though it is the most commonly used damage scale for image-based BDA, the EMS-98 was originally created for ground surveys, leading to several drawbacks, such as vague descriptions of damage features. Besides, it is based on damage features that do not add up linearly to a per-building damage score. These drawbacks illustrate the challenges of using such a scale and suggest that a new approach might be required in the near future.

Methods and data used
This study aimed at generating per-building damage scores based on oblique, multi-perspective, highly overlapping and very high resolution imagery. Those images were primarily acquired with a UAV, and partly with a camera attached to a pole (details on image acquisition are given in Sect. 3.1). From the multi-view imagery, 3-D point clouds were generated to allow experts to visually identify the most affected buildings: D4-D5 (Sect. 3.2). Subsequently, the façade and roof images of the buildings that were still standing were analyzed with OBIA, where damage features were extracted (Sect. 3.3). In a separate experiment by Fernandez Galarreta (2014), the image data of buildings for which the 3-D point clouds did not reveal extensive damage, or for which the damage features were not visually identified, were subjected to expert assessment. Each image with overlaid information from the OBIA damage-feature extraction was assigned an EMS-98 score and a certainty measurement. The process of aggregating the individual scores at the building level, and thereby simulating the understanding of the expert surveyors on the ground, is described in detail in Sect. 3.4. Figure 1 provides an overview of the methodology.

Data used
The data for this study were collected with an Aibot X6 V.1 UAV (Fig. 2a), and with a camera attached to a 7 m pole (Fig. 2b). Several acquisition campaigns were carried out: in Gronau (Germany), Enschede (the Netherlands) and several locations near Bologna (Italy), where an earthquake in 2012 caused extensive structural damage. The images used in this study show different buildings, although we sometimes treat them as parts of one conceptual building that does not exist in the real world.
The UAV flights were planned beforehand using the waypoint capability of modern UAV systems, and included both vertical and oblique image acquisition. The former was defined in a stripwise manner to achieve 80 % end lap and 30 % side lap, using a Canon 600D with a 40 mm fixed focal length Voigtländer lens. Flying at 70 m altitude resulted in an image footprint of approx. 40 × 25 m and a nominal pixel resolution of 7 mm. The oblique flight was realized using a circular setup, i.e. flying a circle with a radius of 70 m and a camera nick angle of 45°.
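The nominal pixel resolution follows directly from the camera geometry. As a quick sanity check, the ground sample distance (GSD) can be computed from pixel pitch, flying height and focal length; the pixel pitch of roughly 4.3 µm assumed below for the Canon 600D sensor is our assumption, not a value stated in the text:

```python
# Ground sample distance (GSD) for near-nadir imagery: GSD = pixel_pitch * H / f.
# The ~4.3 micrometre pixel pitch for the Canon 600D is an assumed value.
def gsd(pixel_pitch_m: float, flying_height_m: float, focal_length_m: float) -> float:
    """Ground footprint of one pixel, in metres."""
    return pixel_pitch_m * flying_height_m / focal_length_m

resolution = gsd(4.3e-6, 70.0, 0.040)
print(f"GSD at 70 m: {resolution * 1000:.1f} mm")  # roughly 7-8 mm per pixel
```

This is consistent with the nominal 7 mm pixel resolution reported above for the 70 m flying height.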
For the camera attached to the pole, a simple Canon PowerShot S100 was used to simulate a UAV flight. The camera was moved around the building at an approximate distance of 15 m from the façade, using three different camera heights (3, 5 and 7 m). This resulted in images with pixel resolutions of better than 1 cm.

3-D point cloud assessment
The aim of this step was to let experts visually identify in the 3-D point cloud a number of damage features related to D4 and D5: total collapse, collapsed roof, rubble piles and inclined façades. This step was deliberately left to expert judgment, because defining fixed thresholds for the identification of the mentioned damage features would be error-prone, given their complexity and variety of representations. It was also meant to limit the more detailed assessment to those buildings without clear D4-D5 damage features expressed in their 3-D point clouds.
The test data set used to identify the damage features comprised four 3-D point clouds (Fig. 3) generated from the oblique overlapping images, as explained below.
Image processing started with the computation of camera parameters, such as intrinsic and orientation information, using a structure-from-motion approach. Musialski et al. (2013) give a comprehensive overview of state-of-the-art algorithms, such as those implemented in the software Autodesk 123D Catch (Autodesk-123D, 2014). The scale of the sparsely reconstructed scene and the placement of the local coordinate system are generally arbitrary; hence, a local coordinate system was subsequently defined, with the z axis chosen to point upwards. In those cases where GPS was available (for the UAV, not for the pole images), the scale and coordinate layout were defined through GPS information. Through the subsequent dense image matching (Furukawa and Ponce, 2010), the initial point cloud was substantially densified. In the case of well-textured areas, one 3-D point for each image pixel is achievable. The accuracy of the points depends mainly on the image configuration. In our case the standard deviation was estimated to be in the range of the pixel resolution.
Following the construction of the 3-D point cloud, a local tangent plane was computed for each point from adjacent points. Of particular interest was the z component of the normal of this plane. The normal is the eigenvector corresponding to the smallest eigenvalue of the covariance matrix of the neighborhood points. The z component takes values from 0 (vertical) to 1 (horizontal); it was converted to degrees by calculating its arcsine, which scaled the parameter from 0° (vertical) to 90° (horizontal) for better user understanding. The expected outcome was the possibility to visually identify D4 and D5 damage elements from the z component of the 3-D point cloud.
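The per-point normal computation can be illustrated with a minimal PCA sketch in plain NumPy; this is a conceptual stand-in, not the photogrammetric software actually used in the study:

```python
import numpy as np

# Sketch of the per-point normal estimation: PCA over a local neighborhood.
# The normal is the eigenvector with the smallest eigenvalue of the
# covariance matrix; its z component is converted to degrees via arcsine.
def normal_z_angle(neighborhood: np.ndarray) -> float:
    """Surface angle, 0 deg (vertical wall) to 90 deg (horizontal roof).

    neighborhood: (N, 3) array of 3-D points around the point of interest.
    """
    centered = neighborhood - neighborhood.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))  # ascending eigenvalues
    normal = eigvecs[:, 0]               # eigenvector of the smallest eigenvalue
    nz = np.clip(abs(normal[2]), 0.0, 1.0)
    return float(np.degrees(np.arcsin(nz)))

rng = np.random.default_rng(0)
# Points on a horizontal plane (z = 0) should give ~90 deg ...
flat = np.column_stack([rng.uniform(0, 1, 50), rng.uniform(0, 1, 50), np.zeros(50)])
print(round(normal_z_angle(flat)))  # 90
# ... and points on a vertical wall (y = 0) ~0 deg.
wall = np.column_stack([rng.uniform(0, 1, 50), np.zeros(50), rng.uniform(0, 1, 50)])
print(round(normal_z_angle(wall)))  # 0
```

Rendering this angle per point is what lets inclined façades stand out as intermediate values between 0° and 90°.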
More automatic approaches for BDA with LiDAR point clouds have previously been attempted (Khoshelham et al., 2013; Oude Elberink et al., 2011). However, damage-feature extraction approaches for the denser point clouds used here are still being developed (Weinmann et al., 2013).

OBIA-based damage-feature extraction
The goal of this step was to proceed with a more detailed façade and roof analysis of the buildings that did not show any D4-D5 damage feature in the previous step. Several algorithms were created in eCognition™ (Trimble, 2013) to extract from the images several damage features that can be expected in those façades and roofs. The importance of this step rested on three aspects: the level of detail of the damage assessment, similar to that of ground-based surveys; the focus on façade damage features, which tend to be excluded in conventional remote-sensing-based BDA; and the use of OBIA to bring the cognitive dimension into the BDA framework, which helped to simulate expert-based assessment.
The data set used in this section comprised 11 VHR images that represented roofs and two different types of façades: concrete and brick (Fig. 4). The selection of the images was done manually due to the complexity of the scenarios. For each of the three types, a rule set was created to extract the damage features.
The damage-feature extraction can be subdivided into two steps, image segmentation and object classification, followed by results export and accuracy assessment. First, however, useful damage features to be extracted from the images had to be identified. In this case the same damage features as in Fernandez Galarreta (2014) were selected: cracks, holes, intersections of cracks with load-carrying elements, and dislocated tiles. These damage features are characteristic of intermediate damage in façades and roofs. In addition, non-damage-related features also had to be classified as part of the process: façade, window, column and intact roof.
1. Image segmentation: the aim of image segmentation was to generate meaningful damage-related objects that could be easily characterized. To achieve that, a two-step segmentation approach was implemented. This was chosen over automatic parameter selection approaches, such as the Estimation of Scale Parameter tool (ESP 2.0; Drăguţ et al., 2014), although such a process could objectively serve as an extra tool to capture the target objects. The two-step segmentation started with a multiresolution segmentation algorithm, using a small scale factor (Table 1). This resulted in a desired over-segmentation, meant to capture every small detail in the image, such as individual bricks, tiles and sections of cracks. The secondary parameters (shape and compactness) were adjusted to fit the requirements of the image features that had to be captured: individual and contrasted image features (Table 1). Subsequently, a spectral difference segmentation was applied to the previously generated objects. The goal was to merge the more homogeneous objects (façade and intact roof objects) into larger ones, whilst retaining the heterogeneous, damage-related objects (cracks and dislocated tiles) as smaller, contrasted objects for easy characterization in the next step. To that end, different maximum spectral difference (MSD) thresholds, with values from 1 to 80, were tested to observe the effects on the end result. In general, low parameter sensitivity was observed, with MSD values from 10 to 20 giving similar end results in all cases. This low sensitivity was even more prominent for the concrete façades, where similar results were obtained with MSD values from 10 to 40. This result suggests that transferring this two-step approach to similar building images could be straightforward. The final selected parameters for the different scenarios are summarized in Table 1.
2. Object classification: the overall strategy to classify both façades and roofs attempted to emulate the approach of surveyors in the field. It started by classifying the largest objects first: intact roof and façade objects. Once those were classified, the remaining classes (windows, columns, cracks, holes and dislocated tiles), which are more geometrically distinct, were identified based on a number of object image features (Table 1).
With the basic damage features classified, their topological relationships were subsequently used to define their semantic dimension and, hence, identify crossing cracks (cracks crossing columns) and connecting cracks (cracks touching windows or holes).
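As a rough illustration of this topological step, a crack object can be flagged as a crossing crack when its pixel mask overlaps or touches a classified column object. The masks and the one-pixel dilation below are illustrative assumptions, not the eCognition implementation:

```python
import numpy as np

# Hypothetical sketch: a crack object "crosses" a column object if its
# (one-pixel-dilated) mask intersects the column mask. Class masks here
# are toy data, not actual eCognition output.
def is_crossing_crack(crack_mask: np.ndarray, column_mask: np.ndarray) -> bool:
    """True if the crack object overlaps or borders (8-neighbour) the column."""
    padded = np.pad(crack_mask, 1)
    dilated = np.zeros_like(padded)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            dilated |= np.roll(np.roll(padded, dy, axis=0), dx, axis=1)
    return bool(np.any(dilated[1:-1, 1:-1] & column_mask))

crack = np.zeros((5, 5), dtype=bool)
crack[2, 0:3] = True          # horizontal crack in row 2
column = np.zeros((5, 5), dtype=bool)
column[:, 3] = True           # vertical column in column 3
print(is_crossing_crack(crack, column))  # True: the crack reaches the column
```

The same overlap test, applied between cracks and window or hole objects, yields the connecting cracks mentioned above.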
For a more detailed explanation of the segmentation approach followed in this study and for a deeper description of the created rule sets, see Sect. 4.2 in Fernandez Galarreta (2014).
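The spectral difference merge of the two-step segmentation (step 1 above) can be sketched as follows. eCognition's implementation is proprietary, so this union-find merge over an over-segmented label image, with region means held fixed rather than recomputed after each merge, only illustrates the principle:

```python
import numpy as np

# Illustrative second segmentation step: merge adjacent over-segmented
# regions whose mean intensities differ by less than a maximum spectral
# difference (MSD) threshold. Region means are not recomputed after each
# merge, a simplification of the real algorithm.
def spectral_difference_merge(labels: np.ndarray, image: np.ndarray, msd: float):
    """labels: 2-D int array of over-segmented regions; image: 2-D floats."""
    ids = np.unique(labels)
    means = {i: image[labels == i].mean() for i in ids}
    parent = {i: i for i in ids}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Collect horizontally and vertically adjacent label pairs.
    pairs = set()
    pairs.update(zip(labels[:, :-1].ravel(), labels[:, 1:].ravel()))
    pairs.update(zip(labels[:-1, :].ravel(), labels[1:, :].ravel()))
    for a, b in pairs:
        if a != b and abs(means[a] - means[b]) < msd:
            parent[find(a)] = find(b)

    return np.vectorize(lambda i: find(i))(labels)

# Toy façade strip: two bright homogeneous regions separated by a dark "crack".
img = np.array([[0.9, 0.9, 0.1, 0.8, 0.8]] * 2)
lab = np.array([[0, 1, 2, 3, 4]] * 2)
merged = spectral_difference_merge(lab, img, msd=0.2)
print(len(np.unique(merged)))  # 3: bright regions merge, the crack stays separate
```

The low-contrast façade regions collapse into large objects, while the spectrally contrasting crack region survives as a small object, which is exactly the behavior exploited in the classification step.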
3. Export: the classified objects were exported to ArcGIS 10.1 (ArcGIS, 2013). The objects were stored as vectors with two of their properties attached: area in m² and length in m. These stored vectors formed a damage inventory with a very detailed geometric description of the extracted damage features.
4. Accuracy assessment: the accuracy assessment was based on a set of statistical measurements that compared the areas of the extracted damage features with the areas of reference damage features digitized in ArcGIS, by creating individual polygons for each damage feature found. By comparing these data sets, two accuracy measurements, correctness and completeness, were derived (Fig. 5). To calculate them, three indicators were needed (Fig. 5): false positives (FP), false negatives (FN), and true positives (TP).
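Given the TP/FP/FN counts, the two measurements follow directly. The pixel-mask formulation below is a sketch; the study itself compared polygon areas in ArcGIS:

```python
import numpy as np

# Correctness and completeness from extracted vs. reference damage masks,
# following the TP/FP/FN definitions above (pixel masks stand in for the
# polygon areas used in the study).
def correctness_completeness(extracted: np.ndarray, reference: np.ndarray):
    tp = np.sum(extracted & reference)    # extracted and present in reference
    fp = np.sum(extracted & ~reference)   # extracted but not in reference
    fn = np.sum(~extracted & reference)   # in reference but missed
    correctness = tp / (tp + fp)          # equivalent to precision
    completeness = tp / (tp + fn)         # equivalent to recall
    return float(correctness), float(completeness)

extracted = np.array([1, 1, 0, 0], dtype=bool)
reference = np.array([1, 0, 1, 0], dtype=bool)
print(correctness_completeness(extracted, reference))  # (0.5, 0.5)
```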
The overall workflow resulted in one set of extracted damage features for each of the images, which, together with their associated information, were meant to facilitate image-based visual damage assessment. Besides the extracted objects themselves, this step also produced a number of statistical indicators used within this paper to assess the quality of the extraction. As a way of providing the damage information to the damage evaluator in our tests, we experimented with a 3-D wire-mesh construct on which different damage types can be interactively switched on when needed.

Aggregation of multi-perspective damage information
In the final step, an approach was developed to aggregate the multi-perspective damage information resulting from the expert-based damage classification of roof and façade images carried out in Fernandez Galarreta (2014). Six experts in BDA analyzed different façade and roof images enhanced with the OBIA-extracted damage features, and assigned EMS-98 scores to each image. In addition they were asked to rate their classification confidence, from uncertain (1) to very certain (3). For more information about how the experiment was set up, see the results section "Interface design and testing" in Fernandez Galarreta (2014).
This section can be subdivided into a number of steps:
1. Per-façade / roof expert-based damage classification: the experts' EMS-98 scores and certainty ratings were compiled into a damage classification table.
2. Aggregation algorithm development: based on information obtained from several field guides (Baggio et al., 2007; ATC, 2005) and interviews carried out with experts in the field, two aggregation algorithms were created to generate per-building damage scores.
3. Aggregated outcome assessment: the two algorithms were applied to the expert-based damage classification table to generate aggregated damage scores and certainty measurements.
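The exact rules of the aggregation algorithms (Table 4) and the certainty scaling (Eq. 1) are given later; the sketch below is a hypothetical stand-in for illustration, not the published algorithm. It takes the worst (maximum) EMS-98 score over all façade and roof views, and scales the mean certainty rating (1-3) to a percentage:

```python
# Hypothetical per-building aggregation sketch (NOT the algorithm of
# Table 4): the worst view governs the building score, and the certainty
# ratings (1-3) are averaged and scaled to a percentage of the maximum (3).
def aggregate_building(scores, certainties):
    """scores: per-view EMS-98 damage grades (1-5); certainties: ratings 1-3."""
    building_score = max(scores)                         # worst view governs
    certainty_pct = 100.0 * sum(certainties) / (3 * len(certainties))
    return building_score, certainty_pct

score, cert = aggregate_building([2, 3, 2], [3, 2, 3])
print(score, round(cert, 1))  # 3 88.9
```

Taking the maximum is a deliberately conservative choice for SAR purposes; the algorithms actually developed also weigh which structural elements the damage affects.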

3-D point cloud assessment
The visual assessment of the 3-D point cloud's z component allowed the identification of the previously listed damage features to classify a building as D4-D5 (total collapse, collapsed roof, rubble pile and inclined façade; Fig. 6).
Total collapse (Fig. 6a) can be easily identified by the absence of planar building sections. For the partially collapsed roof (Fig. 6b), the indicator was the shift of the collapsed roof section towards a more vertical value. Rubble piles (Fig. 6c) are recognizable from the z component variation, as well as from the positive elevation anomaly. Finally, the z component's deviation from the vertical readily signals inclined façades (Fig. 6d).

OBIA-based damage-feature extraction
The images shown in Fig. 8b and c were edited to remove visible damage features. The rule sets were re-run on those images to test their performance in a damage-free environment. The results are shown in Fig. 8b' and c'.
After the classification, the results were exported to ArcGIS 10.1 as vectors with the associated attributes: area and length. These vectors represented the damage inventory environment, where the experts could gain more insight into the damage features that were extracted.
The results of the accuracy assessment described in Sect. 3.3 are shown in Table 2.

Aggregation of multi-perspective damage information
The outcome of the experiment carried out in Fernandez Galarreta (2014), where six experts assessed five images representing a real case scenario, is summarized in Table 3.
Together with this table, the experts also provided feedback on the usability of the information provided. For more detailed information on this feedback, see Sect. 4.3.5, "Summary of the received feedback", in Fernandez Galarreta (2014). An aggregation algorithm was created for the damage scores (Table 4), and the certainty measurements were simply scaled to a percentage following Eq. (1). The result of applying the previously presented algorithm (Table 4 and Eq. 1) to the individual per-façade / roof classifications (Table 3) is presented in Table 5. A total of six final per-building damage scores and certainty measurements were generated.

Discussion
Structural damage assessment is a priority after a disaster event, and the potential of remote sensing has already been demonstrated in many studies.However, the lack of methods to achieve a comprehensive damage evaluation based on all external components of a building motivated our work.

3-D point cloud assessment
We aimed at assessing whether a 3-D point cloud allows the identification of damage features indicative of D4 or D5 damage. Except for the identification of rubble piles (Fig. 6c) in situations where the grass around a building partially masked the rubble, the results demonstrated that the visual assessment of the point cloud's z component was very useful for identifying those damage features. In addition, it allowed experts to identify subtle damage signatures, such as inclined walls (Fig. 6d), that are difficult to recognize in traditional BDA approaches. Nevertheless, the 3-D point cloud assessment did not focus on the actual data processing and automatic analysis of the data sets, because of the expected complexity and novelty of such an approach. Instead, it focused on generating the 3-D point clouds using existing approaches, before proceeding with a visual assessment of the data sets to identify the cited damage features.

OBIA-based damage-feature extraction
Previous research on BDA suggests that, regardless of the data type and quality used, the detection of intermediate damage scales remains ambiguous, being strongly influenced by the expertise and experience of the assessor. Besides, given the absence of proven automated damage assessment methods, we accept that a certain subjectivity is inherent to this problem. The underlying idea of this work was thus to have experts, who typically assess structural damage based on holistic evaluation, either in the field or using image data, assess whether the information from OBIA aids their assessment. Consequently, we opted for an OBIA approach to identify damage features and to assess whether those can meaningfully support visual damage mapping by experts. The damage detection was largely successful, achieving reasonable correctness and completeness rates (Table 2). However, several problems were found during the segmentation and the classification, in addition to problems related to the accuracy assessment, all of which are addressed in more detail below.

Segmentation
The two-step segmentation aimed at generating large, homogeneous non-damage objects whilst highlighting smaller damage features, and was largely successful throughout the different scenarios (e.g. Fig. 10). This set the theoretical approach for the OBIA-based damage-feature extraction. The majority of published OBIA studies have suffered from limited transferability, due to the need for trial-and-error segmentation parameter adjustment. In our study, the two-step approach effectively reduced the parameter sensitivity, especially for concrete façades, where a relatively large MSD threshold range led to comparable results, although more research in this direction would be needed to confirm this. Despite the overall very satisfactory performance of the two-step segmentation, problems remained where damage features bordered non-damage background with similar spectral characteristics or patterns, such as cracks in brick walls (Fig. 11). The resulting segmentation errors propagated into the analysis stage, leading to misclassifications.

Classification
The classification part of the rule sets had to deal with complex scenarios, many target damage features and a range of different images, leading to limited, but unavoidable, errors. These included both false negatives (Fig. 12a) and false positives (Fig. 12b and c). The concentration of errors in the brick façades has to do with the pattern that this type of façade presents. The complex pattern of the brick façades challenges the segmentation algorithm, which creates small non-damage-related objects that during the classification process are mistakenly classified as cracks. Concrete façades, having a smoother texture, facilitate the segmentation algorithm's task of creating large objects for the intact parts of the façades and small objects for the damage-related ones. Nevertheless, in general, the damage-feature classification was found to be very satisfactory, especially for the concrete façades. We aimed at a compromise between reaching acceptable accuracy values and maximizing rule set transferability. Three rule sets were applied to 11 images to test that flexibility.

Accuracy assessment
This research focused on finding and extracting damage features from façade and roof images to support subsequent expert-based damage classification. The aim was not an automatic delineation and extraction of those damage features. For this reason, an accuracy assessment of the detected damage features based on digitized reference objects is only partially appropriate. This is because, to our knowledge, the significance of different types of misclassification has not yet been addressed in the literature. Clearly, errors in terms of absolute length of a feature, falsely identified connectivity to specific structural building elements, or number of identified dislocated tiles still need to be assessed from a structural engineering perspective. However, it is important to note that although the images seem to be well classified (e.g. Fig. 8a-d), the completeness values are still rather low. This was due to the actual extraction process, which consistently missed parts of the crack borders, leading to false negatives around those cracks. It is also important to note that the extraction in Fig. 8b'-c' reached 100 % correctness and completeness, which is explained by the absence of damage features, which the rule set processed correctly. Further, traditional accuracy assessment approaches do not address the semantic dimension of the extracted features. According to the feedback obtained after the expert-based per-façade / roof classification (Sect. 4.3.5 in Fernandez Galarreta, 2014), errors such as the one in Fig. 12b are flagged as FP, yet to an expert analyzing damage based on the OBIA damage features this type of misclassification posed no problem. For the specific case of Fig. 12b, expert knowledge allows easy recognition of such an object as an artefact of the façade, a letter. It is directly understood as a non-damage-related object despite its erroneous classification.

Aggregation of multi-perspective damage information
In the final step of the methodology the damage information generated at the façade / roof level had to be aggregated at the building level. This step was successful; however, many challenges and constraints were encountered that raise questions concerning essential parts of this research.

Per-façade / roof expert-based individual damage classification
Referring to Table 3, the most important conclusion was the obvious presence of subjectivity in the classification. The six experts assessed the exact same simulated scenario, yet their individual damage scores never fully agreed. Most agreement was found for the image of the intact façade. On the other hand, the experts tended to provide more variable damage scores for the roof image, which may indicate that surveyors typically do not have access to roofs and hence have limited experience in roof damage assessment.
For the certainty measurements more homogeneous tendencies were found within each expert. In general, experts tended to be more certain for images that contained some form of damage feature, and more uncertain when no damage features were present. This was an interesting point, because the image showing an intact façade was the one where the experts agreed the most, yet they felt least certain about it. This could be related to their experience, which would tell them that even in the absence of visible damage features the façade might still be somehow compromised.
In addition to the table (Table 3) obtained from Fernandez Galarreta (2014), feedback was sought from the experts about the usability of the overlaid OBIA-derived damage information, using a wire-mesh construct (Fig. 13). Our assumption had been that such information would aid the experts' classification, reducing the ambiguity of the intermediate damage levels. However, the feedback showed that such information, because it was mainly based on spectral information, was not considered to be useful, since the same damage features can be readily identified by an experienced analyst in the raw images. This conclusion, summarized from the experts' feedback, directly affected the scope of the study. However, it must be recognized that this study does not only show the experts damage features based on spectral information; it is also capable of providing information that otherwise would be invisible, such as inclined façades in the 3-D point cloud. In addition, we also experimented with the possibility of identifying those damage features that affect adjacent façades, by first classifying cracks in separate façade views and then identifying those that connect. This process mimics the holistic analysis of ground-based damage assessment, whilst eliminating the risk associated with such ground work.

Aggregation algorithms
Two functions were used to aggregate the individual damage and certainty scores at the building level. This requires several assumptions to be made and semantic rules to be defined. For the aggregation of damage scores (Table 4), significant damage, even if only affecting parts of a structure, generally has a disproportionate significance for the performance of the entire building, also because it suggests further invisible damage. Therefore, in our study we gave priority to D4 damage elements, meaning that the presence of this score in any façade or roof determined the score for the entire building, in an attempt not to underestimate the overall damage.
For the remaining damage levels a more symmetric approach was followed. Field-based BDA relies on a holistic, expert-based information integration. However, other studies (e.g. Kerle and Hoffman, 2013) have also emphasized that damage evidence does not add up linearly, and hence mathematical integration rules are ultimately poorly suited. To our knowledge there has been no research yet on the significance of damage indicators on adjacent or opposite façades for the overall structural integrity of the building, or on the extent to which observed damage patterns can be extrapolated to occluded façades. Such studies, based on structural engineering principles, are needed for a better semantic integration of image-derived damage features to be possible.
The aggregation of the certainty measurements (Eq. 1) had to represent the expert's certainty that led to the final per-building damage score; hence, all certainty measurements were averaged to represent the reality of the expert assessment for that building.
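The two aggregation functions can be sketched as follows. The D4-priority rule is taken directly from the text; the "symmetric" rule for the remaining levels is a plausible placeholder (the most frequent score, with ties resolved toward higher damage), not necessarily the exact rule of Table 4, and the certainty aggregation is the plain average of Eq. 1:

```python
from collections import Counter

def aggregate_damage(scores):
    """Aggregate per-façade/roof EMS-98 scores (e.g. ['D1', 'D4']) to a
    single building score. Any D4 dominates, to avoid underestimating
    overall damage. Otherwise a placeholder symmetric rule is applied:
    the most frequent score, ties broken toward the higher damage level."""
    if "D4" in scores:
        return "D4"
    counts = Counter(scores)
    # 'D1'..'D3' sort lexically in damage order, so the tuple key breaks
    # frequency ties toward the higher damage level.
    best = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
    return best[0]

def aggregate_certainty(certainties):
    """Average the per-façade/roof certainty measurements (in %), as in Eq. 1."""
    return sum(certainties) / len(certainties)

print(aggregate_damage(["D1", "D4", "D2"]))  # D4
print(aggregate_damage(["D1", "D2", "D2"]))  # D2
print(aggregate_certainty([40, 80, 60]))     # 60.0
```

Note that the averaging step deliberately smooths the experts' per-image certainty, which is consistent with the 40-80 % range reported in Table 5.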

Aggregated outcome assessment
Table 5 shows the results of the damage and certainty measurement aggregation of the expert analysis results presented in Table 3. The principal conclusion was that the algorithms were not able to reduce the subjectivity effect associated with the per-façade / roof scores as expected. The final aggregated damage scores ranged from D1 to D3, and a similar effect can be seen for the certainty measurements, which ranged from 40 to 80 %. Nevertheless, the certainty measurements are an excellent indicator of the source of this subjectivity effect: Table 5 shows how different experts exhibited different self-confidence when tagging an image with an EMS-98 score.
Nevertheless, the goal of this study of generating more comprehensive per-building damage scores was reached. The produced scores not only take into account the overall structure of the building; they also aggregate the information collected from each of the façades and roofs to provide an individual per-building damage score.

Conclusions and further work
In this paper we addressed a number of problems, starting with the identification of several principal gaps in the existing literature: (i) remote sensing-based BDA does not reach the level of detail of ground-based BDA, (ii) façade assessments tend to be missed, (iii) the multi-perspective dimension of BDA has so far remained relatively unexplored, (iv) UAVs, as a very detailed source of information, have not been used in this field, and (v) OBIA's cognitive dimension has not previously been exploited for BDA at such a level of detail.
We successfully used 3-D point clouds to identify D4-D5 building damage, and exploited the cognitive dimension of OBIA to assess damage on both façades and roofs at a level of detail that is largely lacking in traditional BDA. However, in our understanding, the main constraint of this study is the actual aggregation of the damage information collected from the different parts of the building. The approach of dealing with individual façades and roofs not only failed to reduce the subjectivity of the classification, it actually increased complexity by adding the topological relationships of the damage features in the buildings. In addition, it requires the creation of aggregation algorithms to bring the information to the building level, mimicking the cognitive process followed by ground surveyors.
A solution may be a building damage classification performed directly in a 3-D environment, where experts can analyze the entire building using the geometric information from the 3-D point cloud and the OBIA-based damage features simultaneously. Nevertheless, this would still suffer from the subjectivity that characterizes expert-based image analysis. In summary, more research is needed to automatically extract damage features from point clouds, to combine them with spectral and pattern indicators of damage, and to couple this with an engineering understanding of the significance of connected or occluded damage indicators for the overall structural integrity of a building.
Edited by: R. Lasaponara Reviewed by: three anonymous referees

Figure 2. The two platforms used to collect data. (a) Aibot X6 V1 UAV, (b) camera attached to a pole.

Figure 4. Examples of the different images used in this section. (a) Roof with dislocated tiles, (b) cracks in concrete façade, and (c) cracks and hole in brick façade. Scale approx.

Figure 5. Equations for the correctness and completeness accuracy measurements based on the accuracy indicators: false positive (FP, red), false negative (FN, blue), and true positive (TP, green) (Joshi, 2010).

Figure 7. Result of applying the roof rule set on two roof images. Scale approx.

Figure 8. Results of applying the concrete rule set on six concrete façade images. (b') and (c') were edited in order to remove the damage features. (a) and (d) highlight cases of connecting and crossing cracks. (a) highlights with a circle the location of a connection; the small size of the crack connecting with a window makes it hard to visualize in the illustration. (d) illustrates a crack crossing a column as a green object (highlighted by a circle). Scale approx.

Figure 9. Results of applying the brick rule set on three brick façade images. Scale approx.

Figure 10. Result of the two-step segmentation of a concrete façade. Scale approx.

Figure 11. Example of a misclassified crack due to segmentation problems. Scale approx.

Figure 12. (a) Example of misclassified roof tiles (false negative), (b) example of a letter classified as a crack in a concrete façade (false positive), and (c) example of non-related objects classified as cracks in a brick façade (false positive). Scale approx.

Figure 13. Example of the wire mesh with OBIA-extracted information overlaid. Yellow indicates holes, and red indicates cracks in the façade (for better contrast, the color scheme was changed from the previous examples).

Table 1. Segmentation parameters and classification features for the three developed rule sets.

Table 2. Results of the accuracy assessment.

Table 3. Damage scores and certainty measurements from the six experts that carried out the interface test in Fernandez Galarreta (2014).

Table 4. Description of the algorithm created to aggregate the per-façade / roof damage scores at the building level.

Table 5. Results of the aggregation of the individual expert-based per-façade / roof classifications.