1.1. The Necessity of Vegetation Restoration on Gangue Hills
A large number of coal mines in China are concentrated in the provinces of Shanxi, Inner Mongolia, Shaanxi, and Ningxia. There are 11 large coal mining bases in the arid and semi-arid regions of north and northwest China. These areas are already ecologically fragile and severely desertified, and intensive coal mining has made the ecological environment of the grassland areas even more unstable, which, in turn, has affected the development of local livestock farming. Drainage sites and gangue hills are typical forms of damage created by coal mining in the arid desert areas of northwest China; they consist mainly of coal gangue, rock, and soil. The soil of a drainage field is loose, low in nutrients, and has well-developed macroporosity, so its physical and chemical properties and its ecological environment differ significantly from those of the natural landscape. At the same time, crushing by vehicles during soil piling severely compacts the ground surface, which hinders the rooting and growth of vegetation. Under constant high-temperature exposure, the dark-colored land absorbs a large amount of solar energy, and because the surface is rich in coal debris, this poses a serious fire hazard. Under rainy and windy conditions, the exposed soil discharge site is prone to environmental problems such as soil erosion, surface water pollution, and dust, which severely affect the surrounding ecological environment and people’s lives. Vegetation restoration helps to improve the ecological environment near the mine, reduce soil erosion and desertification, improve the slope stability of gangue hills, and increase sunlight reflectivity. The study area selected for this paper is located in the Luotua Mountain and Mengxi mining areas in Wuhai, Inner Mongolia. It is a typical open-pit mine of northwest China, so results obtained there are of considerable value for monitoring vegetation restoration in northwest coal mines.
1.2. Difficulties of Vegetation Greening
Although revegetation of the area around the open pit is a pressing issue, precipitation in the semi-arid region is low, so natural revegetation achieves low coverage and takes a long time, and manual intervention is required. However, the lack of water makes revegetation costly. Effective revegetation monitoring can reduce maintenance costs by replacing periodic full sprinkler irrigation with timely manual intervention in poorly restored areas. At the same time, more accurate ground interpretation allows restored areas to be compared with bare areas more reliably and the dynamics of water storage lakes and roads to be monitored in a timely manner.
Remote sensing identifies vegetation mainly by extracting and interpreting information from remote-sensing images. Traditional remote-sensing monitoring relies mostly on satellites, using images of different spatial, spectral, and temporal resolutions for regional monitoring and analysis. However, for the specific application of monitoring vegetation recovery in mining areas, satellite remote sensing has the following disadvantages:
(1) Because of limited image resolution, thematic information extracted from satellite remote-sensing images deviates noticeably from the actual values. The drainage field, a typical disturbed patch in a mining area, covers a relatively small area, so traditional remote-sensing monitoring cannot adequately identify vegetation species or quantitatively monitor soil erosion there.
(2) High-resolution satellite data is expensive. According to the internationally accepted calculation method, the development, launch, and insurance of an artificial satellite cost nearly 1.4 billion RMB on average, not including the cost of post-processing and transmitting the imagery at ground receiving stations or labor costs. Remote-sensing images are therefore costly, especially for gangue hills, whose landforms change within a short period, so that the images must have high definition and complete feature information.
With the mature development of UAV technology in the past two years, UAVs, as a new means of remote sensing, have unique advantages compared to traditional satellite remote sensing and manned aerial remote sensing:
(1) UAVs are relatively inexpensive, with low operating and maintenance costs;
(2) UAV flight places few demands on the site. UAVs can take off and land in a variety of ways without a dedicated runway, and can operate over mountains, gullies, rivers, and other areas that cannot be reached on foot;
(3) UAVs are relatively simple to operate, easy to maintain, and can respond effectively to unexpected on-site problems.
However, during field operations, flight-safety factors such as undulating terrain and easily blocked signal propagation force the aircraft to fly higher above the ground, which lowers the ground resolution. Image super-resolution techniques therefore need to be introduced into the pre-processing of UAV images.
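The dependence of ground resolution on flight altitude can be illustrated with the standard ground sampling distance (GSD) relation for a nadir-pointing frame camera. The following is a minimal sketch; the function name and the camera parameters in the usage note are illustrative assumptions and do not describe the equipment used in this study.

```python
def ground_sampling_distance_cm(altitude_m, focal_length_mm,
                                sensor_width_mm, image_width_px):
    """Approximate ground sampling distance in cm per pixel.

    Assumes a nadir-pointing frame camera over flat terrain:
    GSD = (sensor width * altitude) / (focal length * image width).
    """
    if min(altitude_m, focal_length_mm, sensor_width_mm, image_width_px) <= 0:
        raise ValueError("all parameters must be positive")
    # Width of the ground footprint in metres (similar triangles).
    footprint_m = sensor_width_mm / focal_length_mm * altitude_m
    # Metres per pixel, converted to centimetres per pixel.
    return footprint_m / image_width_px * 100.0


if __name__ == "__main__":
    # Hypothetical camera: 13.2 mm sensor width, 8.8 mm lens, 5472 px wide.
    for altitude in (100.0, 200.0):
        gsd = ground_sampling_distance_cm(altitude, 8.8, 13.2, 5472)
        print(f"{altitude:.0f} m -> {gsd:.2f} cm/pixel")
```

With these hypothetical parameters the GSD is about 2.7 cm per pixel at 100 m and doubles when the altitude doubles, which is the loss of ground resolution that super-resolution pre-processing is meant to offset.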
1.3. Related Work
H. Shen proposed a super-resolution image reconstruction algorithm for moderate-resolution imaging spectroradiometer (MODIS) remote-sensing images. In the registration step, a truncated quadratic cost function is used to exclude outlier pixels that deviate strongly from the registration model [1].
X. Qifang used an interpolation reconstruction method to reconstruct satellite video images, so that the reconstructed static features have smoother and clearer edges and richer detail, which effectively improves the resolution of satellite video images [2]. For multi-frame remote-sensing image super-resolution (SR) reconstruction based on a back-propagation (BP) neural network, Ding et al. used a three-layer BP neural network to perform super-resolution reconstruction of the input remote-sensing images. The BP neural network is computationally expensive because of the large amount of data it requires to converge. Although genetic algorithms can find globally optimal solutions and are highly robust, they converge slowly [3]. Chen and Wang combined the genetic algorithm with the BP neural network to give the network faster convergence and stronger learning ability [4]. Ma et al. proposed a transferred generative adversarial network (GAN)-based method for SR reconstruction of remote-sensing images, which improved on the previous SR-GAN [5]. Specifically, the traditional GAN is simplified by removing components to reduce memory requirements and improve computational performance. In addition, inspired by transfer learning, their reconstruction method was pre-trained on the DIV2K dataset and then fine-tuned on a remote-sensing image dataset, leading to higher accuracy and better visual performance.
Johanna’s study investigated the physically based characterization of mixed floodplain vegetation by means of terrestrial laser scanning (TLS). The work aimed to develop an approach for deriving the characteristic reference areas of herbaceous and foliated woody vegetation and for estimating the vertical distribution of woody vegetation [6]. ZM’s review offered some outlooks on the fundamentals underlying the application of ML models to HM simulation [7]. Mohammad proposed a three-dimensional hole size (3DHS) analysis for separating extreme and low-intensity events observed during experimental runs [8]. Lama’s study aimed to quantify analytically the uncertainty in average flow velocity estimates associated with the uncertainty of the Leaf Area Index (LAI) of Phragmites australis (Cav.) Trin. ex Steudel covering a vegetated channel [9]. Shen’s review paper was intended to provide water resources scientists, and hydrologists in particular, with a simple technical overview, a transdisciplinary progress update, and a source of inspiration about the relevance of DL to water [10,11].
The Vegetation Index (VI) method is the most common, economical, and effective way to extract and analyze vegetation information over a large area using remote-sensing data. It combines different wavelengths in satellite images (most commonly the visible red, green, and blue wavelengths and the near-infrared wavelengths) in different mathematical combinations for vegetation studies. More than 100 vegetation indices have been proposed in remote sensing. However, most of them are based on the visible and NIR bands, such as the common normalized difference vegetation index (NDVI) [12] and the ratio vegetation index (RVI), both of which require the NIR band.
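For reference, the two NIR-based indices named above are commonly written as follows, where the symbols (an assumption of notation, not taken from this paper) denote the reflectance in the near-infrared band, \rho_{NIR}, and in the red band, \rho_{R}:

```latex
\mathrm{NDVI} = \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{R}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{R}}},
\qquad
\mathrm{RVI} = \frac{\rho_{\mathrm{NIR}}}{\rho_{\mathrm{R}}}
```

Both expressions need a near-infrared measurement, which an ordinary RGB camera carried by a UAV does not provide.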
The leading vegetation indices include EXG (excess green) [13], NGRDI (normalized green-red difference index) [14], and EVI (enhanced vegetation index) [12]. Few vegetation indices are based only on visible wavelengths; these mainly include EXG, NGRDI, NGBDI (normalized green-blue difference index) [15], and RGRI (red-green ratio index), modeled on the NGRDI structure.
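The visible-band indices listed above can be computed directly from an RGB image. The sketch below uses their commonly cited definitions; EXG is computed here on normalized chromatic coordinates, which is one of several variants in the literature. The function names, the small constant eps, and the threshold in the usage note are illustrative assumptions and are not the settings selected in this paper.

```python
import numpy as np


def visible_vegetation_indices(rgb, eps=1e-9):
    """Return EXG, NGRDI, NGBDI and RGRI maps for an (..., 3) RGB array."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Normalized chromatic coordinates reduce sensitivity to brightness.
    total = r + g + b + eps
    rn, gn, bn = r / total, g / total, b / total
    return {
        "EXG": 2.0 * gn - rn - bn,           # excess green
        "NGRDI": (g - r) / (g + r + eps),    # normalized green-red difference
        "NGBDI": (g - b) / (g + b + eps),    # normalized green-blue difference
        "RGRI": r / (g + eps),               # red-green ratio
    }


def vegetation_mask(index_map, threshold):
    """Label pixels whose index value exceeds the threshold as vegetation."""
    return np.asarray(index_map) > threshold


if __name__ == "__main__":
    # One greenish pixel and one brownish pixel (hypothetical values).
    image = np.array([[[50, 200, 50], [120, 100, 80]]], dtype=np.uint8)
    indices = visible_vegetation_indices(image)
    print(vegetation_mask(indices["NGRDI"], threshold=0.0))
```

In practice the threshold would be chosen per index and per scene, for example from the histogram of index values over labelled vegetation and bare-soil samples.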
In recent years, scholars have also increasingly applied deep-learning models to remote-sensing images to obtain better recognition results. Natesan et al. [16] applied the residual network (ResNet) to RGB images containing different tree types and used the trained model for classification, achieving 80% classification accuracy. Yang et al. [17] took high-spatial-resolution WorldView-2 remote-sensing images of Bazhou City, Hebei Province, as the data source and selected SegNet, a deep convolutional neural network, to extract rural buildings from the images. Lobo et al. [18] used five deep, fully convolutional networks (SegNet, U-Net, FCN-DenseNet, and two DeepLabv3+ variants) to semantically classify single tree species in UAV visible images of urban areas. The performance of the five models was evaluated in terms of classification accuracy and computational effort. The experimental analysis showed that the average overall accuracy of the five methods ranged from 88.9% to 96.7%. The experiments also showed that adding a Conditional Random Field (CRF) could improve the performance of the models but required more computation. Ren et al. [19] applied deep learning to the detection of rural buildings in UAV remote-sensing imagery and used the Faster R-CNN network model to identify rural buildings quickly and accurately. The overall accuracy of this method exceeded 90%. Zhou et al. [20] compared traditional segmentation models with the semantic segmentation method DeepLabv3+ for segmenting leaf coverage in UAV remote-sensing images acquired under different environments, and the results showed that the DeepLabv3+ model obtained better segmentation results. Long et al. [21] were the first to use a well-known classification network (VGG16) for end-to-end semantic segmentation by making it fully convolutional and upsampling the output feature maps. However, using such networks directly yields coarse pixel output, because the repeated downsampling used to gather more context in the classification task discards spatial detail. To obtain more precise output, Long proposed skip connections that fuse deep and shallow network features. Morales et al. [22] used three RGB cameras to obtain aerial images of palm trees under different environmental and lighting conditions. They trained the DeepLabv3+ network for areas that are difficult to survey in the field, such as swamps. On the test set, the model achieved an accuracy of 98.14% and was able to identify not only free-standing palm trees but also palm trees partially covered by other types of vegetation, which is highly practical.
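The accuracy figures quoted in this subsection are reported by the cited studies, and each may define accuracy slightly differently. As a point of reference, overall (pixel) accuracy for a segmentation result is usually the fraction of pixels whose predicted class matches the reference label. The following minimal sketch assumes integer class maps of equal shape; the function name is hypothetical.

```python
import numpy as np


def overall_accuracy(predicted, reference):
    """Fraction of pixels whose predicted class equals the reference class."""
    predicted = np.asarray(predicted)
    reference = np.asarray(reference)
    if predicted.shape != reference.shape:
        raise ValueError("label maps must have the same shape")
    if predicted.size == 0:
        raise ValueError("label maps must not be empty")
    # Element-wise comparison gives a boolean map; its mean is the accuracy.
    return float(np.mean(predicted == reference))


if __name__ == "__main__":
    # Toy 2 x 2 label maps: 0 = bare ground, 1 = vegetation.
    pred = [[0, 1], [1, 1]]
    ref = [[0, 1], [0, 1]]
    print(overall_accuracy(pred, ref))
```

Overall accuracy can be dominated by the largest class, so per-class measures are often reported alongside it when vegetation covers only a small share of the scene.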
The main advantages of our method are as follows: (1) Super-resolution algorithms are introduced into the processing of remote-sensing images, which improves the spatial resolution of the images without affecting their interpretation. (2) Visible-band vegetation indices and thresholds suited to the specific topography and geomorphology of the northwest are selected, achieving accurate extraction of vegetation. (3) A dataset for interpreting specific mining areas is produced from UAV aerial images, and a deep-learning network is used for ground interpretation to improve vegetation monitoring in mining areas and reduce the cost of vegetation restoration.