Article

Automatic Building Detection from High-Resolution Remote Sensing Images Based on Joint Optimization and Decision Fusion of Morphological Attribute Profiles

1
Key Laboratory of Meteorological Disaster, Ministry of Education (KLME), Nanjing University of Information Science and Technology, Nanjing 210044, China
2
Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang 443002, China
3
Yichang Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang 443002, China
4
Jiangsu Key Laboratory of Meteorological Observation and Information Processing, Nanjing University of Information Science and Technology, Nanjing 210044, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2021, 13(3), 357; https://doi.org/10.3390/rs13030357
Submission received: 16 December 2020 / Revised: 3 January 2021 / Accepted: 18 January 2021 / Published: 21 January 2021

Abstract

High-resolution remote sensing (HRRS) images, when used for building detection, play a key role in urban planning and other fields. Compared with deep learning methods, methods based on morphological attribute profiles (MAPs) perform well in the absence of massive annotated samples. MAPs have been proven to have a strong ability to extract detailed characterizations of buildings with multiple attributes and scales, and they have accordingly attracted a great deal of attention. Nevertheless, to establish reliable unsupervised detection models, two constraints must be overcome: the rational selection of attribute scales and the evidential conflicts between attributes. To this end, this research proposes a joint optimization and decision fusion building detection method for MAPs. First, in the pre-processing step, the set of candidate building objects is extracted by image segmentation and a set of discriminant rules. Second, the differential profiles of MAPs are screened by using a genetic algorithm, for which a cross-probability adaptive selection strategy is proposed; on this basis, an unsupervised decision fusion framework is established by constructing a novel statistics-space building index (SSBI). Finally, the automated detection of buildings is realized. We show that the proposed method significantly outperforms state-of-the-art methods on groups of HRRS images from different regions and different sensors, with an overall accuracy (OA) above 91.9%.


1. Introduction

With the rapid development of earth observation technology, building detection based on high-resolution remote sensing (HRRS) images has been one of the research hotspots in the field of remote sensing [1]. Remote sensing images have the characteristics of wide coverage, strong timeliness, and a large amount of obtainable information, which are helpful for cognition and interpretation of geographical targets. Buildings occupy an important position in the area of human activities. The spatial characteristics and distribution of urban buildings represent important basic data for urban construction management, such as national survey monitoring, urban and rural planning management, real estate management [2], etc. The study of automatic high-precision detection of buildings on remote sensing images is significant for further developing remote sensing image information mining technology, and promoting its applications in digital cities and other related fields [3].
Compared with traditional medium- and low-resolution remote sensing images, HRRS images contain a wealth of spatial structure information, which is conducive to the fine description of buildings in complex urban scenes. On the other hand, the low signal-to-noise ratio (SNR) of HRRS images limits detection accuracy. In addition, buildings, with their complex structures, are often surrounded by other artificial or natural geographic objects, and there may be significant differences even between buildings in the same area. All of these negative factors make high-precision, reliable building detection with HRRS images difficult [4].
In recent years, morphological attribute profiles (MAPs) have been proven to have a strong ability to detect buildings against complex urban backgrounds, and they have become one of the most effective spatial structure modelling methods for HRRS images. The local morphological feature set constructed by MAPs can realize the multi-attribute, multi-scale expression of different ground objects, thus significantly improving the separability of buildings from other ground objects [5,6,7]. However, the following limitations must be overcome to realize high-precision, unsupervised building detection based on MAPs: (1) Potential building pixels are determined directly by the differential attribute profiles (DAPs), which are extracted as the differences of neighboring attribute profiles (APs). Morphological attribute profile (MAP) theory gives no clear rules for setting the scale parameters, so a reasonable parameter set must be constructed adaptively according to the scale of the original image. If the interval between scales is too large, it is difficult to describe different types of buildings with different attributes; if it is too small, it is easy to retain too many pixels of other ground objects whose attributes resemble those of buildings. (2) As the basis for deciding whether a pixel belongs to a building, the DAPs extracted for different attributes may give opposite conclusions. The experimental results in this article verify that the common practice of taking the union of DAPs over all attributes and scales rarely achieves reliable detection, so effective decision rules are needed to handle this evidential conflict. (3) Buildings are a type of geographical object with closed contours, and automatically converting the potential building pixels extracted from MAPs into object-level detection results is another challenge to be tackled.
In response to these challenges, we propose an automatic building detection method from HRRS images based on the joint optimization and decision fusion of MAPs. The contributions of this study can be summarized as follows:
(1) A new adaptive cross-probability genetic algorithm based on DAPs (ACGA-DAPs) is proposed to detect the pixels of potential buildings by transforming the scale parameter selection of MAPs into the joint optimization of multi-attribute DAPs. To meet the application requirements of building detection, a wide range of scale parameter values and tight sampling intervals are set and traversed to ensure that the initial DAPs can extract the property details of the building. Based on this, the genetic algorithm (GA) is introduced to optimize the DAPs with different attributes, and a cross-probability adaptive selection strategy is proposed. The constructed ACGA-DAPs are helpful in significantly improving the detection accuracy of buildings.
(2) Based on ACGA-DAPs and the image segmentation results, we propose an unsupervised decision fusion framework, which bridges the gap between potential building pixels and object-level building detection results. This framework combines statistical and spatial information to construct a novel statistics-space building index (SSBI), finally realizing the automatic detection of building sets.
The rest of the paper is organized as follows: Section 2 reviews the relevant literature on building detection, and introduces MAP theory; Section 3 presents the implementation steps of the proposed method in detail; in Section 4, the experimental results are evaluated; Section 5 discusses the setting of proportion parameters; and in Section 6 conclusions and future lines of research are summarized.

2. Related Work

2.1. Building Detection from Remote Sensing Images

Building detection from remote sensing images can be implemented by combining artificial interpretation and field investigation. However, such methods require a great deal of manpower and material resources and have very low detection efficiency. In recent years, extensive building detection research, covering both theory and methods, has been undertaken, such as demolished building detection from aerial imagery using deep learning [8] and automatic building extraction with rooftop detectors [9]. Considering the particularity of deep neural architectures, we divide the existing methods into deep learning methods and non-deep learning methods.

2.1.1. Deep Learning Methods

Deep learning technology, inspired by biological neural systems, has had a strong impact in the field of remote sensing image processing. Deep learning has been proven to have a strong ability to learn the essential building characteristics of a dataset directly from the imagery [10].
Many scholars have designed various deep network structures for building detection applications. Hamed et al. [11] proposed a building detection approach based on deep learning using the fusion of light detection and ranging (LiDAR) data and orthophotos; a convolutional neural network (CNN) was adopted to transform compressed features into high-level features, which were used to distinguish buildings from backgrounds. Wang et al. [12] proposed a fully convolutional dense connection network to better learn rich architectural features; its innovative top-down short connections promote the fusion of high- and low-level feature information. Since the first version of the DeepLab model was released in 2015, Google has evolved and expanded it to DeepLab V3+. This model applies depthwise separable convolutions to the atrous spatial pyramid pooling (ASPP) and decoder modules, resulting in a faster and more powerful encoder-decoder network for semantic segmentation [13].
Despite this, deep learning requires an abundance of annotated samples to train the model; otherwise, overfitting occurs, which restricts the feasibility and effectiveness of such methods in practical applications [14].

2.1.2. Non-Deep Learning Methods

Since the number of annotated samples is often limited, which negatively affects the building detection performance in deep learning, a variety of non-deep learning building detection methods have been proposed.
Building indexes can effectively describe the characteristics of buildings from different aspects and have been widely used in building detection applications. You et al. [15] proposed a scale-invariant feature point detection method considering the multi-scale and multi-directional texture characteristics of built-up areas; the traditional morphological building index (MBI) was applied to the extracted built-up area, and threshold segmentation of the MBI feature images then yielded the results. Bi et al. [16] proposed a multi-channel multi-scale filtering building index (MMFBI) to overcome the drawbacks of the MBI; this index helps fully utilize the relatively scarce spectral information in HRRS images. However, these methods require appropriate thresholds to obtain the final results and thus remain limited by the thresholding step.
In addition, many scholars have conducted in-depth research on the application of MAPs in building detection. Hu et al. [17] proposed a method combining a new alternating sequential filters (NASFs) strategy with MAPs for building detection from high-resolution synthetic aperture radar (SAR) images. Wang et al. [18] proposed a novel adaptive morphological attribute profile under the object boundary constraint (AMAP-OBC) method; by investigating the associated attributes in MAPs, this method established corresponding relationships between AMAP-OBC and building characteristics in HRRS images. Compared with building indexes, MAPs adopt multi-category, multi-scale attributes as evidence for building detection and can obtain more reliable results.
Most existing MAP research directly optimizes the APs and ignores the information redundancy and evidential conflict between DAPs. As described in Section 1, these processing strategies cause specific problems in building detection. To this end, we propose a building detection method based on the joint optimization and decision fusion of MAPs.

2.2. MAP Theory and Constitution of Attribute Set

Developed from traditional morphological filtering, MAP theory has a powerful ability to portray geographical objects in fine detail across different scales and attributes, from different angles. At present, MAP theory is widely used in the classification and change detection of HRRS images. MAP uses a Max-Tree structure to represent the image and performs attribute thickening and thinning operations based on a given set of scale parameters N, to evaluate the attribute values of the connected components in the image. The basic processing flow is as follows:
For a given grey-scale image $M$, let $j$ be any pixel and $B_n(M)$ be the binary image determined by the scale parameter $n \in N$. The thickening profile $\varphi^j(M)$ and the thinning profile $\theta^j(M)$ can be obtained by Equations (1) and (2), respectively:

$\varphi^j(M) = \max \{\, n : j \in \varphi^j [ B_n(M) ] \,\}$   (1)

$\theta^j(M) = \min \{\, n : j \in \theta^j [ B_n(M) ] \,\}$   (2)
By traversing all the scale parameters, the set of thickening and thinning operations can be extracted. On this basis, the difference operation is carried out for the adjacent scale sections, and the DAPs are represented as follows:
$\Delta \phi(M) = \left\{ \Delta_l \phi(M) = \begin{cases} \varphi_n^j(M) - \varphi_{n-1}^j(M), & n = l + 1,\ l \in [1, \dots, N] \\ \theta_{n+1}^j(M) - \theta_n^j(M), & n = l - N,\ l \in [N+1, \dots, 2N] \end{cases} \right\}$   (3)
Therefore, by treating $M$ as a superposition of the binary images $B_n(M)$, the specific attribute characteristics in different scale profiles can be enhanced, and the corresponding geographic objects can then be extracted through the DAPs.
Here, four attributes (area, diagonal, standard deviation, and normalized moment of inertia (NMI)) are adopted for the fine description of buildings, for the following reasons: the area attribute describes the size of a building; the diagonal describes the length of the building's shape; the standard deviation describes the complexity of the building texture; and the NMI reflects the mass distribution of the building and is invariant to translation, rotation, and zoom. Studies have shown that combining these four attributes endows buildings and other ground objects with strong interclass separability [19].
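To make the four attributes concrete, the following minimal Python sketch (not the authors' code) computes them for a single connected component; the NMI here is taken as the moment of inertia normalized by the squared area, which is one common definition in the attribute-profile literature and should be treated as an assumption.

```python
import numpy as np

def component_attributes(mask: np.ndarray, gray: np.ndarray) -> dict:
    """mask: boolean array marking one connected component; gray: grey-scale image."""
    ys, xs = np.nonzero(mask)
    area = xs.size                                  # area: number of pixels
    h = ys.max() - ys.min() + 1                     # bounding-box height
    w = xs.max() - xs.min() + 1                     # bounding-box width
    diagonal = float(np.hypot(h, w))                # diagonal of the bounding box
    std = float(gray[mask].std())                   # texture complexity
    cy, cx = ys.mean(), xs.mean()                   # center of mass
    inertia = np.sum((ys - cy) ** 2 + (xs - cx) ** 2)
    nmi = float(inertia / area ** 2)                # normalized moment of inertia (assumed definition)
    return {"area": area, "diagonal": diagonal, "std": std, "nmi": nmi}
```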

3. Method

The implementation of the proposed method mainly included three steps: data pre-processing, ACGA-DAPs extraction based on multi-attribute joint optimization, and the construction of an unsupervised decision fusion framework. The implementation process is shown in Figure 1.

3.1. Data Pre-Processing

3.1.1. Image Segmentation by WJSEG

In the data pre-processing step, the original image is first segmented to divide the discrete pixels into connected sets of pixels with semantic information, thus providing the basic analysis unit for subsequent building detection. At the same time, during the construction of MAPs, object boundaries are used to determine the connectivity domain for thickening and thinning operations so that the calculated results reflect the properties of the actual geographical objects.
For this reason, we adopted wavelet J-Segmentation (WJSEG), an HRRS image segmentation method designed for urban scenes. This method maintains the integrity of object contours and avoids the false "narrow, long" units that can arise in the segmentation results of the mainstream commercial software eCognition [20,21]. WJSEG mainly includes steps such as multi-band fusion, seed region initialization and secondary extraction, and region merging; the specific implementation process is detailed elsewhere [20]. It should be pointed out that the segmentation method adopted in this study is not limited to WJSEG: using other methods does not affect the implementation of the subsequent building detection phase.

3.1.2. Non-Building Pre-Screening

The image segmentation results inevitably contain objects whose features differ significantly from those of buildings, such as vegetation, vehicles, and other non-building objects. Eliminating such objects in the pre-processing stage not only helps reduce the computational burden but also avoids subsequent false detections.
At present, scholars have proposed many preliminary screening strategies for non-building objects. This article adopts the four discriminant rules proposed in [18]: shadow index, normalized difference vegetation index (NDVI), area index, and rectangularity, sketched below. The objects rejected in this initial screening are not considered in subsequent building detection, while the remaining objects constitute the candidate building set $R_{cdi}$.
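As an illustration only, the sketch below applies the four rules to one segmented object; the threshold values are hypothetical placeholders (the actual thresholds come from [18]), and the per-object NDVI and brightness are assumed to be computed beforehand.

```python
import numpy as np

def is_candidate_building(coords: np.ndarray, ndvi: float, brightness: float) -> bool:
    """coords: (n, 2) array of an object's pixel coordinates;
    ndvi, brightness: per-object mean values (assumed precomputed)."""
    area = coords.shape[0]
    ys, xs = coords[:, 0], coords[:, 1]
    bbox_area = (ys.max() - ys.min() + 1) * (xs.max() - xs.min() + 1)
    rectangularity = area / bbox_area     # 1.0 for a filled axis-aligned rectangle
    if brightness < 0.15:                 # shadow index: reject very dark objects (placeholder)
        return False
    if ndvi > 0.3:                        # NDVI: reject vegetation (placeholder)
        return False
    if area < 500:                        # area index: reject small objects, e.g. vehicles
        return False
    if rectangularity < 0.3:              # rectangularity: reject highly irregular shapes
        return False
    return True
```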

3.2. ACGA-DAPs Extraction Based on Multi-Attribute Joint Optimization

The premise of constructing MAPs is to determine the set of scale parameters corresponding to the different attributes, and this setting is one of the key factors affecting building detection accuracy. However, MAPs only realize the quantitative expression of morphological attributes; the DAPs obtained by subtracting APs of adjacent scales are the actual basis for identifying potential building pixels. It is therefore difficult to objectively evaluate the rationality of a scale parameter selection by directly applying traditional measures, such as the mutual information between scales of the MAPs. For this reason, we propose transforming the scale parameter selection problem of MAPs into a joint optimization problem over multi-attribute DAPs: a fixed adjacent-scale spacing is used to fully extract the morphological attribute features contained in the original image, and an improved genetic algorithm then performs joint multi-attribute optimization screening of the differential features.

3.2.1. Candidate Object Set of DAPs

In the MAPs extraction process, a wide range of values and a tight sampling interval are set for each attribute, and a complete set of MAPs is generated by traversing all the scale parameters within the interval. The purpose is to expand the MAPs with a small sampling interval and thereby obtain a complete representation of the scene's spatial structure, at the cost of increased computation.
To this end, following suggestions elsewhere [22,23], the value ranges of the area, diagonal, standard deviation, and NMI attributes were set to [500, 28000], [10, 100], [10, 70], and [0.2, 0.5], respectively; 50 scale parameters were extracted at equal intervals for each attribute, resulting in a total of 200 scales of MAPs for the four attributes [18]. On this basis, the initial DAPs set, denoted $DAPs_{cdi}$, was obtained by taking the differences of all adjacent scales. A minimal sketch of the grid construction follows.
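The sketch below shows one consistent reading of how these grids can be built: 50 equally spaced values per attribute starting at the lower bound, which matches the step sizes visible in Tables 1, 2 and 3 (e.g., 550 for area and 1.8 for diagonal). Treat it as an illustration rather than the authors' code.

```python
import numpy as np

ranges = {
    "area": (500, 28000),
    "diagonal": (10, 100),
    "std": (10, 70),
    "nmi": (0.2, 0.5),
}
grids = {}
for name, (lo, hi) in ranges.items():
    step = (hi - lo) / 50               # e.g. 1.8 for the diagonal attribute
    grids[name] = lo + step * np.arange(50)

total_scales = sum(len(g) for g in grids.values())  # 200 scales in total
```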

3.2.2. ACGA-DAPs

Based on $DAPs_{cdi}$, a GA is used to screen out representative DAPs sequences for the different attributes, and a novel ACGA-DAPs is proposed. The specific steps are as follows:
Step 1: for the DAPs belonging to the same attribute in $DAPs_{cdi}$, random sampling with replacement is first performed to obtain $Q$ sets of DAPs corresponding to that attribute.
Step 2: calculate the fitness $f(D)$ by Equation (4):

$f(D) = \dfrac{(Q-1)\, d}{Q\, d'}$   (4)
where $d$ represents the difference index between two DAPs within a set $D$, and $d'$ is the difference index between two DAPs from different sets. $d$ can be calculated by Equation (5):
$d = \left( 1 - \dfrac{2\,\mathrm{MI}}{H + H'} \right)^2$   (5)
where $H$ and $H'$ represent the information entropies of the two DAPs, and MI is their mutual information.
Step 3: keep the set of DAPs with the minimum fitness, defined as $D_{\min}$. According to the roulette wheel selection (RWS) method [24], reselect $Q-1$ DAPs sets. On this basis, the one-point crossover method is used to perform pair-wise crossover operations between $D_{\min}$ and the $Q-1$ reselected sets, with crossover probability $P_c$ [25]. The RWS method is then applied to re-select $Q$ DAPs sets, and $D_{\min}$ is updated according to Step 2. Whether $P_c$ is set reasonably significantly affects the genetic performance: if $P_c$ is too large, the model may become completely ineffective; if it is too small, the search may fall into a local optimum. To this end (see the sketch after these steps), the distance distribution matrix $S$ of all DAPs sets is calculated:
$S = (s)_{Q \times Q}$   (6)
where $s$ represents the distance between two sets, from which the minimum distance $s_{\min}$ of each row can be obtained. On this basis, let $f_m$ be the fitness corresponding to the maximum of these minimum distances; the crossover probability $P_c$ can then be adaptively determined as:
$P_c = \begin{cases} \dfrac{2}{\pi} \arctan \left( \pi \dfrac{f_m - f_{\min}}{f_{avg} - f_{\min}} \right), & f_m < f_{avg} \\ 1 - \dfrac{2}{\pi} \arctan \left( \pi \dfrac{f_{\max} - f_m}{f_{\max} - f_{avg}} \right), & f_m \ge f_{avg} \end{cases}$   (7)
where $f_{\max}$, $f_{\min}$, and $f_{avg}$ are the maximum, minimum, and average fitness of the $Q$ DAPs sets, respectively.
Step 4: Steps 2 and 3 are repeated to obtain the representative DAPs corresponding to the current attribute. Four attributes are traversed, and all DAPs screened jointly constitute ACGA-DAPs.
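The sketch below illustrates two pieces of this procedure as reconstructed in Equations (5) and (7): the entropy/mutual-information difference index $d$ between two DAPs, and the adaptive cross-over probability $P_c$. Estimating the entropies from a joint histogram is an assumption made for the example.

```python
import numpy as np

def entropy(p: np.ndarray) -> float:
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def difference_index(dap1: np.ndarray, dap2: np.ndarray, bins: int = 64) -> float:
    """d = (1 - 2*MI / (H1 + H2))**2, estimated from a joint histogram (Equation (5))."""
    joint, _, _ = np.histogram2d(dap1.ravel(), dap2.ravel(), bins=bins)
    joint /= joint.sum()
    p1, p2 = joint.sum(axis=1), joint.sum(axis=0)
    h1, h2, h12 = entropy(p1), entropy(p2), entropy(joint.ravel())
    mi = h1 + h2 - h12                   # mutual information of the two DAPs
    return (1.0 - 2.0 * mi / (h1 + h2)) ** 2

def crossover_probability(f_m: float, f_min: float, f_avg: float, f_max: float) -> float:
    """Adaptive cross-over probability of Equation (7)."""
    if f_m < f_avg:
        return 2 / np.pi * np.arctan(np.pi * (f_m - f_min) / (f_avg - f_min))
    return 1 - 2 / np.pi * np.arctan(np.pi * (f_max - f_m) / (f_max - f_avg))
```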

3.3. Construct an Unsupervised Decision Fusion Framework

In practical applications, buildings are a type of geographical object with complete contours, whereas ACGA-DAPs only provide pixel-level detection results for potential buildings. Moreover, the traditional decision-making practice of directly taking the union of the DAPs corresponding to different attributes ignores evidential conflict and redundant information. Therefore, based on Dempster–Shafer (D–S) evidence theory, this paper proposes an unsupervised decision fusion framework combining ACGA-DAPs and image segmentation [5].

3.3.1. Identification Framework Based on D–S Theory

As an uncertain reasoning method, D–S evidence theory not only requires weaker conditions than Bayesian probability theory but also has a powerful ability to deal with uncertain information directly. Denote by $I$ the total number of objects in $R_{cdi}$, and define the recognition framework $U: \{B, NB\}$ for any object $R_i$ ($i = 1, 2, 3, \dots, I$), where $B$ and $NB$ represent building and non-building, respectively. Thus, the non-empty subsets $A$ of $U$ are $\{B\}$, $\{NB\}$, and $\{B, NB\}$. We define the basic probability assignment formula (BPAF) as $m: 2^U \to [0, 1]$, satisfying the following constraints:
$\begin{cases} m(\varnothing) = 0 \\ \sum_{A \subseteq U} m(A) = 1 \end{cases}$   (8)
Let the total number of DAPs in ACGA-DAPs be $K$; the synthesis rule for the $K$ mass functions $m_1, m_2, \dots, m_K$ is then as follows:
$m(A) = (m_1 \oplus m_2 \oplus \cdots \oplus m_K)(A) = \dfrac{\sum_{\cap A_k = A} \prod_{1 \le k \le K} m_k(A_k)}{\sum_{\cap A_k \ne \varnothing} \prod_{1 \le k \le K} m_k(A_k)}$   (9)
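For the two-hypothesis frame $U = \{B, NB\}$, the synthesis rule reduces to a small closed form. The following sketch is a direct restatement of Dempster's rule for this frame (not the authors' implementation), combining mass functions keyed by 'B', 'NB', and 'BNB' (the compound hypothesis $\{B, NB\}$):

```python
from functools import reduce

def combine_two(m1: dict, m2: dict) -> dict:
    """Dempster's rule for masses over {B}, {NB}, and {B, NB} (key 'BNB')."""
    k = m1["B"] * m2["NB"] + m1["NB"] * m2["B"]      # total conflicting mass
    norm = 1.0 - k                                   # normalization factor
    return {
        "B":   (m1["B"] * m2["B"] + m1["B"] * m2["BNB"] + m1["BNB"] * m2["B"]) / norm,
        "NB":  (m1["NB"] * m2["NB"] + m1["NB"] * m2["BNB"] + m1["BNB"] * m2["NB"]) / norm,
        "BNB": (m1["BNB"] * m2["BNB"]) / norm,
    }

def combine_all(masses: list) -> dict:
    """Fuse the K mass functions of Equation (9) pairwise."""
    return reduce(combine_two, masses)
```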

3.3.2. Calculation of SSBI

In the decision fusion framework, the degree of uncertainty of an object belonging to buildings (or non-buildings) must be quantified when constructing the BPAF. For this reason, we combine statistical and spatial information to construct a novel SSBI, which is calculated as follows:
Step 1: calculation of the statistical indicators $D_{pro}$ and $D'_{pro}$. According to the proportion of building pixels in all objects, the fuzzy C-means (FCM) method is first used to determine the two proportion parameters $\nu_B$ and $\nu_{NB}$, which correspond to the clustering centers of building and non-building objects, respectively. On this basis, $D_{pro}$ and $D'_{pro}$ are defined as:

$D_{pro} = | \nu_i - \nu_B |$   (10)

$D'_{pro} = | \nu_i - \nu_{NB} |$   (11)

where $\nu_i$ denotes the building pixel ratio of $R_i$ in a DAP. $D_{pro}$ and $D'_{pro}$ measure how likely $R_i$ is to be a building or a non-building object, respectively (the smaller the distance, the higher the likelihood).
Step 2: calculation of the spatial information indices $D_{spa}$ and $D'_{spa}$. Since the center of mass reflects the spatial distribution of an object's mass, this paper holds that the closer a pixel is to the center of mass, the more reliable it is as evidence for building detection. Based on this assumption, let $W_B$ and $W_{NB}$ be the numbers of building and non-building pixels in $R_i$, respectively. $D_{spa}$ and $D'_{spa}$ of $R_i$ can be calculated by:

$D_{spa} = \dfrac{\sum_{w=1}^{W_B} s_w}{W_B}$   (12)

$D'_{spa} = \dfrac{\sum_{w=1}^{W_{NB}} s'_w}{W_{NB}}$   (13)

where $s_w$ and $s'_w$ represent the distances from the center of mass of pixels belonging to a building and a non-building, respectively.
Step 3: determination of the SSBI. $SSBI = \{ SSBI_B, SSBI_{NB} \}$ is defined by combining the statistical and spatial information indicators:

$SSBI_B = e^{-(D_{pro} \times D_{spa})}$   (14)

$SSBI_{NB} = e^{-(D'_{pro} \times D'_{spa})}$   (15)
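A compact sketch of the SSBI computation for one object $R_i$ in one DAP, following Equations (10) to (15) as reconstructed above; guards for degenerate cases (e.g., an object with no building pixels) are omitted for brevity, and the inputs are assumed to be precomputed.

```python
import numpy as np

def ssbi(coords: np.ndarray, building_flags: np.ndarray, nu_b: float, nu_nb: float):
    """coords: (n, 2) pixel coordinates of R_i; building_flags: boolean per pixel;
    nu_b, nu_nb: FCM cluster centers. Returns (SSBI_B, SSBI_NB)."""
    nu_i = building_flags.mean()                     # building pixel ratio of R_i
    d_pro = abs(nu_i - nu_b)                         # statistical indicator (building)
    d_pro2 = abs(nu_i - nu_nb)                       # statistical indicator (non-building)
    centroid = coords.mean(axis=0)                   # center of mass of R_i
    dist = np.linalg.norm(coords - centroid, axis=1)
    d_spa = dist[building_flags].mean()              # mean distance of building pixels
    d_spa2 = dist[~building_flags].mean()            # mean distance of non-building pixels
    return np.exp(-(d_pro * d_spa)), np.exp(-(d_pro2 * d_spa2))
```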

3.3.3. BPAF and Discrimination Rules

For each object $R_i$, the BPAF is constructed according to Equations (8) to (15) as follows:
$m_k(\{B\}) = SSBI_B \times \gamma_{ate}$
$m_k(\{NB\}) = SSBI_{NB} \times \gamma_{ate}$
$m_k(\{B, NB\}) = 1 - ( SSBI_B \times \gamma_{ate} + SSBI_{NB} \times \gamma_{ate} )$   (16)
where $\gamma_{ate}$ is a confidence factor designed to cope with a possible imbalance in the number of DAPs belonging to the four attributes. Let $g_t$ ($t = 1, 2, 3, 4$) be the number of DAPs for each of the four attributes; $\gamma_{ate}$ can then be calculated from Equation (17):
$\gamma_{ate} = \dfrac{1}{1 + \sum_{t' \ne t} ( g_{t'} / g_t )}$   (17)
At this point, based on the BPAF, if $m(\{B\}) > m(\{NB\})$ and $m(\{B\}) > m(\{B, NB\})$ are both satisfied, $R_i$ is a building; otherwise, $R_i$ is a non-building. All objects are traversed to obtain the final building detection results; a sketch of this decision step follows.
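The decision step then reduces to a few lines. The sketch below builds the per-DAP masses of Equation (16), the confidence factor of Equation (17), and the final rule; it is illustrative and assumes the SSBI values and per-attribute DAP counts are available.

```python
def gamma_ate(g: list, t: int) -> float:
    """Confidence factor of Equation (17) for attribute t, given the
    per-attribute DAP counts g; algebraically equal to g[t] / sum(g)."""
    return 1.0 / (1.0 + sum(g[s] / g[t] for s in range(len(g)) if s != t))

def bpaf(ssbi_b: float, ssbi_nb: float, gamma: float) -> dict:
    """Basic probability assignment of Equation (16) for one DAP."""
    m_b, m_nb = ssbi_b * gamma, ssbi_nb * gamma
    return {"B": m_b, "NB": m_nb, "BNB": 1.0 - (m_b + m_nb)}

def is_building(m: dict) -> bool:
    """Discrimination rule applied to the fused masses."""
    return m["B"] > m["NB"] and m["B"] > m["BNB"]
```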

4. Experiments and Evaluation

Three datasets of HRRS images of urban scenes, from different regions and with different spatial resolutions, are used in the experiments. Combining quantitative and visual accuracy evaluation, the proposed method is found to outperform various advanced building detection methods.

4.1. Dataset and Experimental Strategy

4.1.1. Dataset Description

Dataset 1 was a pan-sharpened WorldView image with red, green, and blue bands of Chongqing, China; the acquisition date was August 2011, the spatial resolution was 0.5 m, and the size was 1370 pixels × 1370 pixels, as shown in Figure 2a. Dataset 2 was an aerial remote sensing image with red, green, and blue bands of Nanjing, China; the acquisition date was October 2011, the spatial resolution was 2 m, and the image size was 300 pixels × 500 pixels, as shown in Figure 2b. Dataset 3 was a WorldView pan-sharpened image with red, green, and blue bands of Nanjing, China; the acquisition date was December 2012, the spatial resolution was 0.5 m, and the image size was 1400 pixels × 1400 pixels, as shown in Figure 2c. In addition, as the basis for accuracy evaluation, the ground truth map was manually created by field survey and visual interpretation, where white objects represent buildings and black objects represent non-buildings. Some representative areas marked in red boxes (patches I1, I3, and I5) and blue boxes (patches I2, I4, and I6) in Figure 2 were chosen for more detailed comparison and analysis.
As shown in Figure 2, the three datasets are all typical urban scenes composed of buildings, roads, vegetation, shadows, and other features, but at the same time, there are significant differences in the image lighting conditions, acquisition time, and imaging side view. In addition, the buildings in Dataset 1 are mainly low-rise residential buildings and regularly-shaped factory buildings; in Dataset 2, there are many high-rise buildings and in Dataset 3, the geometric shapes of old commercial buildings to be demolished are irregular. Therefore, experiments on these datasets are helpful to reflect the detection performance of the algorithm in real urban scenes from different angles.

4.1.2. Experimental Set-Up

To evaluate the performance of the proposed method comprehensively and objectively, we used six advanced building detection methods for comparative experiments: the adaptive MAPs based method (Method 1) [18]; the grey-level co-occurrence matrix (GLCM) and support vector machine (SVM) based method (Method 2) [26]; the top-hat filter and K-means classification based method (Method 3) [27]; and two methods based on a DeepLab V3+ network (Methods 4 and 5) [13], which are combined with the Otsu method and with the evidence fusion strategy proposed in this paper, respectively, to obtain object-level building detection results. Comparison with Method 1 helps analyze the different DAPs optimization strategies. Method 2 is a common machine learning method, Method 3 is an automatic detection method based on building descriptors, and Methods 4 and 5 are deep learning methods; comparing against them helps assess the general performance of the proposed method. In addition, to investigate the influence of ACGA-DAPs on detection performance separately, Method 6 replaces ACGA-DAPs with $DAPs_{cdi}$, with all other steps identical to the proposed method. Meanwhile, to ensure a consistent set of geographic objects, all comparison methods are based on the WJSEG segmentation results when producing object-level building detection results. Finally, for the initialization parameters of the improved GA model, we adopt the recommendation made elsewhere [28], taking Q = 20 and 500 iterations.
According to Section 3.2.1, $DAPs_{cdi}$ can be obtained. After screening $DAPs_{cdi}$ with the improved GA model, the adaptively extracted ACGA-DAPs contain 85, 76, and 84 DAPs in the three datasets, respectively. Each DAP in ACGA-DAPs is calculated as the difference of two adjacent APs; the smaller scale parameter of these two APs is taken as the initial parameter. The obtained scale parameter sets are listed in Table 1, Table 2 and Table 3.

4.2. Experimental Results and Accuracy Evaluation

4.2.1. General Results and Analysis

The building detection results of three datasets are shown in Figure 3, Figure 4 and Figure 5: true positive (TP), false positive (FP), false negative (FN), and other non-buildings are represented by four colors.
The visual analysis shows that the results of the proposed method are significantly better overall than those of the six comparison methods. In addition, four accuracy evaluation indices, namely overall accuracy (OA), FP, FN, and Kappa, are adopted for quantitative accuracy evaluation in this work. The results are reported in Table 4, Table 5 and Table 6. In the three groups of experiments, the OA of the proposed method exceeds 91.9%, the best performance among all experimental methods, in line with the conclusions of the visual analysis. Despite the differences between datasets, the OA fluctuation of the proposed method is less than 2%, showing its stability.
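For reference, a short sketch of how OA and Kappa follow from a binary confusion matrix (TP, FP, FN, TN as pixel or object counts); these are the standard definitions, not code from the paper.

```python
def oa_kappa(tp: int, fp: int, fn: int, tn: int):
    n = tp + fp + fn + tn
    oa = (tp + tn) / n                   # overall accuracy
    # chance agreement from the marginal totals (Cohen's kappa)
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (oa - pe) / (1 - pe)
    return oa, kappa
```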
As automated building detection methods based on MAPs, both Method 1 and the proposed method achieve an OA higher than 90% in all three sets of experiments, and the FNs of the proposed method are lower than 3.1%, which confirms the powerful ability of MAPs to portray buildings in complex urban environments. Moreover, the proposed method outperforms Method 1 on all other accuracy indices except FP in Dataset 3. Therefore, compared with the strategy of Method 1, adaptively selecting the DAPs set based on the statistics and spatial information of potential building pixels is conducive to more accurate building characterization.
Method 2 is a classification method based on SVM, which not only requires manual intervention but whose detection accuracy is also susceptible to the quality and number of training samples. For example, the number of samples in Dataset 1 is 833, higher than the 462 in Dataset 2 and the 212 in Dataset 3, and the OA is correspondingly higher by 3.7% and 2.9%, respectively. Method 3 uses structural elements of fixed shape despite its automated building descriptors, ignoring the complexity and diversity of buildings, so its OA is below 80% for both Datasets 2 and 3.
As deep learning methods, Methods 4 and 5 show low accuracy and poor stability across the three datasets; for example, the fluctuation range of the OA reaches 16.9%, and the lowest OA is only 66.6%. Compared with the proposed method, deep learning methods are not suitable for building detection applications where prior knowledge is sparse, for the following reasons: (1) Owing to their multi-layer neural architectures, deep learning models require large, diverse training datasets to avoid overfitting. When implementing building detection in a specific area, the effort needed to curate such datasets is the main barrier to adopting deep learning, as is the case for the three sets of experiments in this paper. (2) The proposed method can automatically extract appropriate morphological attributes according to the characteristics of the remote sensing images and is not limited by the amount of training samples. In addition, compared with the traditional treatment of taking the union of all DAPs adopted in Method 4, the improvement in OA proves that the proposed fusion strategy is both feasible and effective.
After replacing ACGA-DAPs with $DAPs_{cdi}$ in Method 6, the OA is reduced in all three sets of experiments, and the FP in particular increases significantly. This indicates that more DAPs are not necessarily better: information redundancy and evidential conflicts among DAPs with different attributes and scale parameters can adversely affect detection performance. Therefore, it is necessary to optimize the selection from $DAPs_{cdi}$ from the perspectives of accuracy and automation, which is precisely the goal of the proposed cross-probability adaptive genetic algorithm.

4.2.2. Visual Comparison of Representative Patches

The results of the representative patches in each dataset are reported in Figure 6 (patches I1 and I2), Figure 7 (patches I3 and I4), and Figure 8 (patches I5 and I6). The results for each representative patch are discussed as follows.
Residential and industrial buildings are two types of buildings that are common and distributed widely across urban HRRS images. On the other hand, they are always regions of interest (ROIs) in building detection applications based on HRRS images. Therefore, we focused on both residential and industrial buildings to ascertain the detection performance of the proposed method.
As shown in the figures, WJSEG extracts the complete outlines of the buildings without obvious over-segmentation or under-segmentation, which lays a good foundation for the subsequent object-level building detection. The detection performance of the proposed method is significantly better than that of the other methods, whether for the low-rise residential buildings in the green rectangle of I1 and the green and purple rectangles of I5, or the high-rise residential buildings in the green rectangles of I3 and I4; only a few regularly shaped objects adjacent to buildings produce FPs, and no FNs occur. For irregularly shaped buildings, such as villas (e.g., the purple rectangle in I1) and industrial buildings (e.g., the green rectangle in I6), only the proposed method and Methods 1, 4, and 5 avoid FNs; for industrial buildings awaiting demolition (e.g., the yellow and green rectangles in I6), Methods 2, 3, and 4 have severe FPs. For industrial buildings of large size and regular shape (e.g., the yellow rectangle in I2 and the purple rectangle in I6), only the proposed method and Methods 1, 4, and 5 avoid FPs. For geographic objects located between buildings with morphological features similar to buildings (e.g., the green rectangle in I4 and the green and purple rectangles in I5), the detection results of the proposed method and Method 6 are better than those of the other comparators. In addition, the screening strategy employed in the present research reduces the influence of non-building objects such as vegetation and shadows (e.g., the green and yellow rectangles in I1 and the purple rectangles in I2 and I4). For high-rise buildings imaged from a side view, as in Dataset 2, the proposed method obtains correct detection results both when the building roof and side elevations are partitioned into the same object (e.g., the green rectangles in I3 and I4) and when the side elevations are partitioned into separate objects (e.g., the yellow and purple rectangles in I4).
The visual analysis of representative patches shows that the proposed method can detect different types of buildings against complex urban backgrounds and is insensitive to interference from factors such as false targets and imaging side-view confusion. It is significantly better than the other comparison methods, in agreement with the conclusions of the quantitative analysis.

5. Discussion

In the decision fusion process of ACGA-DAPs, we employed the idea of fuzzy clustering to adaptively determine the proportion parameters $\nu_B$ and $\nu_{NB}$; the resulting values are listed in Table 7, and a sketch of this clustering step follows.
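A compact sketch of this step, assuming standard two-cluster fuzzy C-means (fuzzifier $m = 2$) on the per-object building-pixel ratios; the update rules below are the textbook FCM equations, not the authors' code.

```python
import numpy as np

def fcm_two_centers(ratios: np.ndarray, iters: int = 100, m: float = 2.0):
    """ratios: (n,) building-pixel ratios of all objects. Returns (nu_B, nu_NB)."""
    c = np.array([ratios.min(), ratios.max()], dtype=float)   # initial centers
    for _ in range(iters):
        d = np.abs(ratios[:, None] - c[None, :]) + 1e-12      # (n, 2) distances to centers
        u = 1.0 / d ** (2 / (m - 1))                          # unnormalized memberships
        u /= u.sum(axis=1, keepdims=True)
        c = (u ** m * ratios[:, None]).sum(axis=0) / (u ** m).sum(axis=0)
    nu_nb, nu_b = np.sort(c)              # the smaller center is the non-building one
    return nu_b, nu_nb
```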
On this basis, to further discuss the influence of these parameter settings on the OA, we constructed three-dimensional $\nu_{NB}$-$\nu_B$-OA curves over the intervals [0.05, 0.45] and [0.5, 0.95], respectively, with a sampling interval of 0.05, as shown in Figure 9.
As shown in the figure, when the value of $\nu_B$ is fixed, the OA over $\nu_{NB} \in [0, 0.5]$ shows an increasing tendency before falling; when the value of $\nu_{NB}$ is fixed, the OA over $\nu_B \in [0.5, 1]$ exhibits a similar trend. For the three datasets, the OA exceeds 88% when $(\nu_{NB}, \nu_B)$ lies in {[0.1, 0.2], [0.85, 0.95]}, {[0.15, 0.2], [0.8, 0.95]}, and {[0.15, 0.2], [0.85, 0.95]}, respectively; the adaptively determined $\nu_{NB}$ and $\nu_B$ also fall within these intervals, according to Table 7. Therefore, we adopted 0.02 as the sampling interval over the above intervals to describe the relationship between $\nu_{NB}$, $\nu_B$, and the OA. The results are shown in Figure 10.
Within the above intervals, the mean OA values for the three datasets are 91.2%, 90.8%, and 90.5%, and the peak OA values are 94%, 93.5%, and 93.3%, respectively, while the OA of the proposed method is 93.2%, 92.2%, and 91.9%, respectively. Thus, the OA of the proposed method is only slightly lower (by 0.8%, 1.3%, and 1.4%) than the corresponding highest OA in each dataset, while remaining well above the mean OA over the interval; at the same time, the automation of the parameter setting is achieved.

6. Conclusion and Future Lines of Research

For HRRS images of buildings against complex urban backgrounds, an automatic detection method based on the joint optimization and decision fusion of MAPs is proposed. This method preserves detailed information about the morphological attributes of buildings by transforming the scale parameter setting of MAPs into an optimal selection problem over DAPs, for which a cross-probability adaptive selection method is developed. On this basis, a building index (SSBI) combining statistical and spatial information is designed, and an unsupervised decision fusion framework based on D–S evidence theory is established to achieve automated building detection. In experiments on groups of HRRS images from different regions and different sensors, the proposed method outperforms the six advanced comparison methods in both visual and quantitative analysis, with the OA exceeding 91.9% and FPs and FNs below 6.13% and 3.03%, respectively. The setting of the value intervals of the different attributes in this paper was limited by prior knowledge; in the future, we will study an automatic method to determine the appropriate value interval for each attribute.

Author Contributions

Conceptualization, C.W.; methodology, C.W. and Y.Z.; software, Y.Z.; validation, X.C., Y.Z., and H.J.; formal analysis, Y.Z. and S.W.; investigation, M.M. and H.J.; resources, C.W.; writing—original draft preparation, Y.Z.; writing—review and editing, C.W.; visualization, C.W. and Y.Z.; supervision, C.W., X.C., and H.J.; project administration, C.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the 2020 Opening fund for Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering under Grant 2020SDSJ05, the Construction fund for Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering under Grant 2019ZYYD007 and the Six Talent-peak Project in Jiangsu Province under Grant 2019XYDXX135.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to other ongoing research.

Conflicts of Interest

All authors have reviewed the manuscript and approved submission to this journal. The authors declare that there is no conflict of interest regarding the publication of this article.

References

  1. Lin, W.; Li, Y. Parallel regional segmentation method of high-resolution remote sensing image based on minimum spanning tree. Remote Sens. 2020, 12, 783–813. [Google Scholar] [CrossRef] [Green Version]
  2. Li, J.; Cao, J.; Feyissa, M.E. Automatic building detection from very high-resolution images using multiscale morphological attribute profiles. Remote Sens. Lett. 2020, 11, 640–649. [Google Scholar] [CrossRef]
  3. Pham, M.T.; Lefèvre, S.; Aptoula, E. Local feature-based attribute profiles for optical remote sensing image classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 1199–1212. [Google Scholar] [CrossRef] [Green Version]
  4. Zhou, K.; Lindenbergh, R.; Gorte, B. Automatic shadow detection in urban very-high-resolution images using existing 3D models for free training. Remote Sens. 2019, 11, 72–96. [Google Scholar] [CrossRef] [Green Version]
  5. Wang, C.; Liu, H.; Shen, Y.; Zhao, K.; Xing, H.; Wu, H. High-Resolution remote-sensing image-change detection based on morphological attribute profiles and decision fusion. Complexity 2020, 171, 1–17. [Google Scholar] [CrossRef]
  6. Liu, B.; Guo, W.; Chen, X. Morphological Attribute Profile Cube and Deep Random Forest for Small Sample Classification of Hyperspectral Image. IEEE Access. 2020, 8, 117096–117108. [Google Scholar] [CrossRef]
  7. Ma, W.; Wan, Y.; Li, J.; Zhu, S.; Wang, M. An automatic morphological attribute building extraction approach for satellite high spatial resolution imagery. Remote Sens. 2019, 11, 337–362. [Google Scholar] [CrossRef] [Green Version]
  8. Su, S.; Nawata, T. Demolished building detection from aerial imagery using deep learning. In Proceedings of the 29th International Cartographic Conference (ICC 2019), Tokyo, Japan, 15–20 July 2019; Volume 2, pp. 1–8. [Google Scholar]
  9. Li, Z.; Xu, D.; Zhang, Y. Real walking on a virtual campus: A VR–based multimedia visualization and interaction system. In Proceedings of the 3rd International Conference on Cryptography, Security and Privacy, Kuala Lumpur, Malaysia, 19–21 January 2019; Volume 19, pp. 261–266. [Google Scholar]
  10. Liu, Y.; Gross, L.; Li, Z.; Li, X.; Fan, X.; Qi, W. Automatic building extraction on high-resolution remote sensing imagery using deep convolutional encoder-decoder with spatial pyramid pooling. IEEE Access 2019, 7, 128774–128786. [Google Scholar] [CrossRef]
  11. Hamed, N.F.; Shafri, H.Z.M.; Ibrahim, S.M. Deep learning approach for building detection using LiDAR-orthophoto fusion. J. Sens. 2018, 7, 1–12. [Google Scholar]
  12. Wang, S.; Zhou, L.; He, P.; Quan, D.; Zhao, Q.; Liang, X.; Hou, B. An improved fully convolutional network for learning rich building features. In Proceedings of the 2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 6444–6447. [Google Scholar]
  13. Yuan, L.; Yuan, J.; Zhang, D. Remote sensing image classification based on deeplab-v3+. Laser Optoelectron. Prog. 2019, 56, 152801–152809. [Google Scholar] [CrossRef]
  14. Qiao, R.; Ghodsi, A.; Wu, H.; Chang, Y.; Wang, C. Simple weakly supervised deep learning pipeline for detecting individual red-attacked trees in VHR remote sensing images. Remote Sens. Lett. 2020, 11, 650–658. [Google Scholar] [CrossRef]
  15. You, Y.; Wang, S.; Wang, B.; Ma, Y.; Shen, M.; Liu, W.; Xiao, L. Study on hierarchical building extraction from high resolution remote sensing imager. J. Remote Sens. 2019, 23, 125–136. [Google Scholar]
  16. Bi, Q.; Qin, K.; Zhang, H.; Zhang, Y.; Li, Z.; Xu, K. A multi-scale filtering building index for building extraction in very high-resolution satellite imager. Remote Sens. 2019, 11, 482–509. [Google Scholar] [CrossRef] [Green Version]
  17. Hu, S.M.; Yu, J.; Xie, D. Combination of NASFs filter strategy with morphological attribute profiles for building detection from sar imagery. Geogr. Geoinf. Sci. 2018, 34, 27–33. [Google Scholar]
  18. Wang, C.; Shen, Y.; Liu, H.; Zhao, K.; Xing, H.; Qiu, X. Building extraction from high–resolution remote sensing images by adaptive morphological attribute profile under object boundary constraint. Sensors 2019, 19, 3737. [Google Scholar] [CrossRef] [Green Version]
  19. Mura, M.D.; Benediktsson, J.A.; Waske, B.; Bruzzone, L. Morphological attribute profiles for the analysis of very high resolution images. IEEE Trans. Geosci. Remote Sens. 2010, 48, 3747–3762. [Google Scholar] [CrossRef]
  20. Wang, C.; Shi, A.; Wang, X.; Wu, F.; Huang, F.; Xu, L. A novel multi–scale segmentation algorithm for high resolution remote sensing images based on wavelet transform and improved JSEG algorithm. Light Electron. Opt. 2014, 125, 5588–5595. [Google Scholar] [CrossRef]
  21. Chakraborty, D.; Singh, S.; Dutta, D. Segmentation and classification of high spatial resolution images based on Hölder exponents and variance. Geo Spat. Inf. Sci. 2017, 20, 39–45. [Google Scholar] [CrossRef] [Green Version]
  22. Aptoula, E.; Mura, M.D.; Lefèvre, S. Vector attribute profiles for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2016, 54, 3208–3220. [Google Scholar] [CrossRef] [Green Version]
  23. Cavallaro, G.; Dalla, M.M.; Benediktsson, J.A.; Bruzzone, L. Extended self–dual attribute profiles for the classification of hyperspectral images. IEEE Geosci. Remote Sens. Lett. 2015, 12, 1690–1694. [Google Scholar] [CrossRef] [Green Version]
  24. Guo, X.; Su, J.; Zhou, H.; Liu, C.; Li, L. Community detection based on genetic algorithm using local structural similarity. IEEE Access 2019, 7, 134583–134600. [Google Scholar] [CrossRef]
  25. Reynolds, J.; Rezgui, Y.; Kwan, A.; Piriou, S. A zone-level, building energy optimisation combining an artificial neural network, a genetic algorithm, and model predictive control. Energy 2018, 151, 729–739. [Google Scholar] [CrossRef]
  26. Gavankar, N.L.; Ghosh, S.K. Automatic building footprint extraction from high–resolution satellite image using mathematical morphology. Remote Sens. 2018, 51, 182–193. [Google Scholar] [CrossRef] [Green Version]
  27. Kumar, M.; Garg, P.K.; Srivastav, S.K. A spectral structural approach for building extraction from satellite imageries. Remote Sens. 2018, 7, 2471–2477. [Google Scholar] [CrossRef]
  28. Liu, F.; Yu, L. Research on remote sensing image segmentation algorithms based on improved thresholds of genetic operators. Henan Sci. Technol. 2019, 14, 37–38. [Google Scholar]
Figure 1. Flowchart of the proposed method. NMI: normalized moment of inertia; WJSEG: wavelet J-Segmentation; ACGA-DAPs: adaptive cross-probability genetic algorithm based on DAPs; DAPs: differential attribute profiles; SSBI: statistics-space building index.
Figure 2. Three datasets and corresponding ground truth maps: (a) Dataset 1 and patches I1 (red box) and I2 (blue box); (b) Dataset 2 and patches I3 (red box) and I4 (blue box) and (c) Dataset 3 and patches I5 (red box) and I6 (blue box).
Figure 3. Building detection results of Dataset 1: (a) Original image; (b) proposed method; (c) Method 1; (d) Method 2; (e) Method 3; (f) Method 4; (g) Method 5 and (h) Method 6.
Figure 4. Building detection results of Dataset 2: (a) Original image; (b) proposed method; (c) Method 1; (d) Method 2; (e) Method 3; (f) Method 4; (g) Method 5 and (h) Method 6.
Figure 5. Building detection results of Dataset 3: (a) Original image; (b) proposed method; (c) Method 1; (d) Method 2; (e) Method 3; (f) Method 4; (g) Method 5 and (h) Method 6.
Figure 6. Building detection results of patches I1 and I2: (a) Patch I1; (bh) results obtained in patch I1 using the proposed method and Methods 1 to 6, respectively; (i) patch I2; (jp) results obtained in patch I2 using the proposed method and Methods 1 to 6, respectively.
Figure 7. Building detection results of patches I3 and I4: (a) Patch I3; (bh) results obtained in patch I3 using the proposed method and Methods 1 to 6, respectively; (i) patch I4; (jp) results obtained in patch I4 using the proposed method and Methods 1 to 6, respectively.
Figure 8. Building detection results of patches I5 and I6: (a) Patch I5; (bh) results obtained in patch I5 using the proposed method and Methods 1 to 6, respectively; (i) patch I6; (jp) results obtained in patch I6 using the proposed method and Methods 1 to 6, respectively.
Figure 9. Three-dimensional ν_NB-ν_B-OA curves at a sampling interval of 0.05: (a) Dataset 1; (b) Dataset 2; (c) Dataset 3.
Figure 10. Three-dimensional ν_NB-ν_B-OA curves at a sampling interval of 0.02: (a) Dataset 1; (b) Dataset 2; (c) Dataset 3.
Table 1. ACGA-DAPs extracted scale parameter set of Dataset 1.

Attribute | Initial Parameter Set of Dataset 1 | Number
Area | (500, 1050, 1600, 2700, 3250, 3800, 4900, 5450, 7100, 7650, 8200, 9850, 12050, 14800, 15900, 17000, 19200, 21400, 21950, 23050, 23600, 26350, 27450) | 23
Diagonal | (10, 13.6, 17.2, 19, 24.4, 26.2, 21.6, 38.8, 40.6, 44.6, 47.8, 56.8, 58.6, 62.2, 64, 65.8, 69.4, 71.2, 78.4, 91, 92.8, 94.6, 98.2) | 23
Standard deviation | (11.2, 13.6, 14.8, 17.2, 240.8, 25.6, 26.8, 29.2, 32.8, 34, 44.8, 54.4, 56.8, 58, 60.4, 62.8, 64, 65.2, 67.6) | 19
NMI | (0.206, 0.242, 0.248, 0.254, 0.29, 0.296, 0.314, 0.326, 0.332, 0.338, 0.344, 0.374, 0.392, 0.398, 0.41, 0.422, 0.446, 0.458, 0.47, 0.476) | 20
Table 2. ACGA-DAPs extracted scale parameter set of Dataset 2.

Attribute | Initial Parameter Set of Dataset 2 | Number
Area | (500, 2150, 3800, 4900, 5450, 7650, 8200, 9850, 12050, 14800, 17000, 17550, 22500, 23050, 23600, 24700, 27450) | 17
Diagonal | (13.6, 15.4, 17.2, 19, 22.6, 24.4, 28, 29.8, 31.6, 35.2, 37, 38.8, 49.6, 51.4, 56.8, 60.4, 62.6, 74.8, 80.2, 82, 85.6, 96.4) | 22
Standard deviation | (10, 13.6, 14.8, 16, 26.8, 29.2, 32.8, 34, 36.4, 40, 43.6, 47.2, 49.6, 52, 56.8, 60.4, 67.6) | 17
NMI | (0.2, 0.206, 0.254, 0.26, 0.272, 0.284, 0.29, 0.296, 0.30, 0.356, 0.362, 0.392, 0.404, 0.41, 0.422, 0.44, 0.47) | 20
Table 3. ACGA-DAPs extracted scale parameter set of Dataset 3.

Attribute | Initial Parameter Set of Dataset 3 | Number
Area | (2150, 2700, 4350, 4900, 6550, 7100, 7650, 8750, 9850, 11500, 12050, 12600, 14250, 15350, 16450, 17000, 17550, 21400, 23050, 24700, 25250) | 21
Diagonal | (15.4, 19, 22.6, 24.4, 28, 29.8, 31.6, 35.2, 38.8, 44.2, 46, 47.8, 51.4, 56.8, 62.2, 64, 69.4, 76.6, 80.2, 82, 83.8, 89.2, 94.6, 96.4) | 24
Standard deviation | (2.4, 13.6, 16, 18.4, 19.6, 22, 24.4, 26.8, 29.2, 32.8, 34, 35.2, 37.6, 41.2, 44.8, 46, 49.6, 54.4, 56.8, 58, 59.2, 62.8) | 22
NMI | (0.224, 0.236, 0.254, 0.26, 0.272, 0.29, 0.296, 0.302, 0.344, 0.35, 0.362, 0.41, 0.43, 0.452, 0.458, 0.476, 0.488) | 17
Table 4. Evaluation of building detection accuracy in Dataset 1.

Method/Indicator | OA (%) | FP (%) | FN (%) | Kappa
Evaluation criteria | the higher the better | the lower the better | the lower the better | the higher the better
Proposed method | 93.2 | 3.71 | 2.99 | 0.809
Adaptive MAPS | 92.1 | 4.71 | 3.12 | 0.782
GLCM-SVM | 83.8 | 10.7 | 5.99 | 0.663
Top-hat | 83.1 | 6.83 | 9.82 | 0.644
DeepLab-Otsu | 66.6 | 27.3 | 6.11 | 0.282
DeepLab-fusion | 69.8 | 20.9 | 9.22 | 0.270
DAPs-fusion | 85.8 | 10.16 | 4.07 | 0.687
Table 5. Evaluation of building detection accuracy in Dataset 2.

Method/Indicator | OA (%) | FP (%) | FN (%) | Kappa
Evaluation criteria | the higher the better | the lower the better | the lower the better | the higher the better
Proposed method | 92.2 | 4.76 | 3.03 | 0.841
Adaptive MAPS | 90.2 | 6.95 | 3.25 | 0.780
GLCM-SVM | 80.1 | 5.64 | 14.3 | 0.594
Top-hat | 78.7 | 8.89 | 12.6 | 0.568
DeepLab-Otsu | 82.0 | 3.90 | 14.7 | 0.622
DeepLab-fusion | 83.1 | 5.41 | 11.4 | 0.649
DAPs-fusion | 83.7 | 9.81 | 5.38 | 0.674
Table 6. Evaluation of building detection accuracy in Dataset 3.

Method/Indicator | OA (%) | FP (%) | FN (%) | Kappa
Evaluation criteria | the higher the better | the lower the better | the lower the better | the higher the better
Proposed method | 91.9 | 6.13 | 1.89 | 0.811
Adaptive MAPS | 90.5 | 4.65 | 5.12 | 0.766
GLCM-SVM | 80.9 | 9.30 | 9.77 | 0.563
Top-hat | 72.6 | 12.6 | 14.9 | 0.456
DeepLab-Otsu | 81.1 | 2.36 | 16.5 | 0.614
DeepLab-fusion | 83.5 | 3.77 | 12.74 | 0.649
DAPs-fusion | 84.9 | 10.92 | 4.25 | 0.624
Table 7. Proportion parameters ν_B and ν_NB in the three datasets.

Parameter | Dataset 1 | Dataset 2 | Dataset 3
ν_B | 0.93 | 0.87 | 0.91
ν_NB | 0.18 | 0.16 | 0.21
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

Wang, C.; Zhang, Y.; Chen, X.; Jiang, H.; Mukherjee, M.; Wang, S. Automatic Building Detection from High-Resolution Remote Sensing Images Based on Joint Optimization and Decision Fusion of Morphological Attribute Profiles. Remote Sens. 2021, 13, 357. https://doi.org/10.3390/rs13030357

