
This article has been accepted for publication in IEEE Access.

This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3321894


Clustering Analysis of Integrated Rural Land for Three Industries Using Deep Learning and Artificial Intelligence
Qian Huang1,2, Haibin Xia3,2, Zhangcheng Zhan1,*
1 Guangdong Vocational and Technical College, Foshan, 528000, China
2 University of Perpetual Help System DALTA, Manila, 1740, Philippines; huangqianner@gmail.com
3 East China Jiao Tong University, Nanchang, China; Bankeyxia@gmail.com

Corresponding author: Zhangcheng Zhan (e-mail: zcz08200@gmail.com).

ABSTRACT This study employs deep learning and artificial intelligence (AI) clustering analysis techniques
to evaluate the suitability of integrated rural land for three industries. Diverse datasets pertaining to rural
development, encompassing land use, agricultural production, and rural tourism, are gathered and
harmoniously amalgamated. An innovative land suitability assessment model, merging ResNet-50 with the
k-means algorithm, is devised. Specifically, ResNet-50 is harnessed for the classification and recognition of
rural land-use images, thus deriving feature vectors for each sample. These feature vectors are subsequently
fed into the k-means algorithm to cluster samples with akin land-use patterns. The ensuing examination of
land use composition within each cluster facilitates the evaluation of rural land’s suitability for three-industry
integration. Experimental scrutiny discloses that this study achieves an accuracy rate of 88.3% in rural land-
use classification and recognition, outperforming alternative algorithms by at least 3.1%. Furthermore, it
yields an average intersection over union (IoU) of 67.29%. Remarkably, the k-means algorithm exhibits
superior clustering outcomes. Consequently, the model introduced herein demonstrates substantial
enhancements in rural land-use classification and recognition accuracy, average IoU, and clustering
performance. It offers an innovative tool for policymakers to advance rural industry integration, fostering
economic diversification. Additionally, this model aids decision-makers in identifying prospective
opportunities and challenges, thus facilitating the formulation of forward-thinking and viable rural
development strategies.

INDEX TERMS Artificial intelligence, deep learning, integrated rural land for three industries, cluster, suitability evaluation

I. INTRODUCTION

A. RESEARCH BACKGROUND AND MOTIVATIONS
As social productivity continues to advance, Chinese agriculture is undergoing a transition towards modernization. The conventional rural economic framework, primarily centered on agriculture, has encountered challenges, including uneven resource utilization and a narrow industrial focus, which fall short of meeting the evolving expectations for a quality living environment and diverse industrial requirements (Jia et al., 2022; Yang et al., 2021). Against the backdrop of population urbanization, mounting resource and environmental pressures, and the imperative of rural economic transformation and upgrading, the concept of rural tri-sector integration has emerged as a prominent and extensively discussed strategy. Its primary objectives encompass stimulating rural revitalization and achieving sustainable development.

The concept of rural tri-sector integration pertains to the organic and coordinated development of agriculture, rural industries, and rural tourism within a shared geographical space. This model harnesses the distinctive strengths of these sectors, facilitating resource sharing and complementarity, thereby augmenting the diversity and comprehensive advantages of rural development (Ge et al., 2022). Nonetheless, during the implementation of rural tri-sector integration, a crucial challenge arises: how to judiciously manage land resources and ascertain the appropriateness of land for integrated use (Leng & Tong, 2022; Wen et al., 2023). This study explores the utilization of deep learning techniques to process extensive land-use data, enabling the extraction of crucial features pivotal for assessing integrated land use (Wang et al., 2022; Liu et al., 2021). Furthermore, it investigates the application of artificial intelligence (AI) clustering analysis methods to classify diverse land-use data types, thereby furnishing essential support for subsequent evaluations (Shirazy et al., 2022).

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/

This study addresses several pivotal concerns. Firstly, it recognizes the imperative for rural areas to achieve economic diversification and sustainable development by integrating primary, secondary, and tertiary industries. As urban-rural disparities diminish, rural regions must amalgamate agriculture, industry, and services to achieve comprehensive industrial development. Consequently, investigating the suitability of rural land becomes paramount to facilitating the integration of these sectors. Secondly, the study acknowledges the remarkable strides made in deep learning and AI technologies, particularly in image processing, pattern recognition, and data analysis. Leveraging these advanced technologies, which excel in handling large-scale, multi-source data, is expected to enhance the precision and efficiency of land suitability assessments. Thirdly, the study recognizes the need for scientific land suitability assessment methods to guide policy formulation and strategies aimed at propelling rural economic growth. This assessment extends beyond agriculture and encompasses areas like rural tourism and the service industry. Thus, the development of a decision-support-oriented land suitability assessment model is imperative for informed decision-making by governments and stakeholders.

B. RESEARCH OBJECTIVES
The objective of this study is to assess the appropriateness of rural land for tri-sector integration employing deep learning and AI cluster analysis. By systematically collecting and processing data pertinent to rural development, coupled with the application of deep learning and cluster analysis techniques, the study endeavors to precisely discern distinct categories of integrated land, including their attributes, and gauge their suitability and developmental prospects. This endeavor serves as a scientific foundation and point of reference for rural development planning and decision-making and contributes to the advancement of rural revitalization and the realization of sustainable development goals. Furthermore, the methodologies and findings presented in this study hold substantial theoretical and practical significance in terms of optimizing land resource allocation and propelling the transformation and elevation of rural economies.

This study introduces a pioneering land suitability assessment model that merges deep learning and clustering analysis methodologies. Employing ResNet-50 for image classification and recognition, in conjunction with the k-means algorithm for clustering analysis, this model adeptly evaluates rural land’s suitability for seamless integration of primary, secondary, and tertiary industries. This approach’s innovation stems from applying advanced AI technologies to the domain of land assessment, significantly enhancing assessment precision and efficiency. This study presents a comprehensive and scientifically rigorous approach to evaluating land’s multifaceted potential, empowering decision-makers to gain a deeper understanding of rural land’s attributes and prospects. This innovative method provides invaluable support for the harmonization of rural industries and the cultivation of sustainable development. Decision-makers can utilize this model to identify latent developmental opportunities and challenges, facilitating the formulation of forward-looking and viable rural development strategies, ultimately advancing the enduring prosperity of rural landscapes.

II. LITERATURE REVIEW

A. REVIEW OF RELATED RESEARCH IN THE FIELD OF RURAL LAND USE EVALUATION
In the realm of rural development and land utilization assessment, a multitude of research findings have emerged, all bearing relevance to the theme of rural tri-sector integration and suitability appraisal. Wei et al. (2021) analyzed the spatiotemporal characteristics and driving forces behind land marketization in Shaanxi Province, uncovering noteworthy temporal and spatial disparities in marketization levels. Li et al. (2021) pioneered the formulation of an index framework for sustainable rural development grounded in the concept of ecological livability. This framework offers a comprehensive evaluation encompassing ecological, societal, and economic factors, providing an effective means to assess the sustainability of rural development. Soleimani et al. (2022) harnessed Monte Carlo simulation and sensitivity analysis to gauge groundwater quality and nitrate risk, revealing nitrate concentrations in groundwater as a hazard influenced by diverse factors. Ghayour et al. (2021) leveraged machine learning algorithms to assess the performance of Sentinel-2 data in land cover/use classification, culminating in commendable accuracy and consistency. Wang et al. (2023) examined the potential contributions of rural revitalization by delineating the structure of rural regional systems, thus supplying crucial reference points and guidance for rural development planning and decision-making.

Upon scrutinizing the aforementioned literature, it becomes evident that they share a common focus on rural development and land use assessment, albeit with variations in their respective emphases, methodologies, and levels of depth. Notably, there exists a dearth of all-encompassing and integrated research efforts, which in turn impedes the efficacious resolution of challenges related to the integration of the rural tri-sector. This context offers a theoretical foundation for the suitability evaluation of integrated rural land for three industries as presented in this study, bearing inherent innovation and practical applicability.

B. REVIEW ON THE APPLICATION OF DEEP LEARNING IN LAND USE EVALUATION
In recent years, there has been a growing interest in the application of deep learning and artificial intelligence (AI) to land use evaluation. Deep learning technology offers robust pattern recognition capabilities, with its capacity to acquire and distill features from vast datasets through the


construction of intricate neural network models. This technological advancement has garnered substantial attention from researchers. For instance, Debella-Gilo & Gjertsen (2021) employed deep learning methods to map seasonal agricultural land use types, yielding results distinguished by their high accuracy and practical applicability. Likewise, Masolele et al. (2021) utilized deep learning techniques to infer land use changes following deforestation, demonstrating the method’s ability to accurately identify post-deforestation land use types and uncover trends in land use alterations. Castelo-Cabay et al. (2022) harnessed deep learning to classify land use and land cover, ultimately showcasing its efficacy and accuracy. Zhu et al. (2022) introduced a land use/land cover change detection approach tailored for high spatial resolution remote sensing images, founded on a twin global learning framework, achieving precise detection of land use/land cover changes. Boonpook et al. (2023) successfully employed deep learning for the semantic segmentation of various land use and land cover types, revealing the method’s capability to accurately delineate diverse land use and land cover categories.

In conclusion, these studies underscore the potential of deep learning in land use assessment, highlighting its capacity to enhance classification accuracy and detect land use changes effectively. Nevertheless, a noticeable research gap exists within the domain of rural land suitability assessment, especially concerning the context of three-industry integration. There is a dearth of methodologies that amalgamate deep learning with suitability assessment. Consequently, this study’s focal point resides in evaluating rural land suitability for integrating the three industries, representing a specific and innovative research avenue. By integrating deep learning and cluster analysis, this study introduces a novel approach that offers substantial support for rural development and land planning.

C. REVIEW ON THE APPLICATION OF THE CLUSTER METHOD IN LAND USE
Cluster analysis algorithms play a pivotal role in categorizing data samples into distinct groups, effectively distinguishing similar samples from dissimilar ones. In the realm of land use evaluation, cluster analysis methods serve to identify various types and characteristics of integrated land use, a topic that has garnered significant scholarly attention. Abera et al. (2021) explored the influence of clustering algorithms on ecosystem services when applied to the dynamics of land use and land cover in forest biosphere reserves. This approach effectively categorized water source protection, soil conservation, and carbon storage within ecosystems. Seaton et al. (2021) leveraged cluster analysis to group soil health based on national soil indicator monitoring data. The outcomes revealed distinct cluster patterns in different regions, reflecting spatial variations in soil health. Giao et al. (2021) employed remote sensing and multivariate statistical techniques to examine the relationship between land use patterns and water quality in Jiangyuan Province, Vietnam. The findings demonstrated a discernible correlation between land use patterns and water quality. Lastly, Kalisz et al. (2023) delved into the utilization of land use indicators in assessing land use efficiency, concluding that judiciously selected land use indicators could effectively gauge land use efficiency.

In summary, these studies underscore the wide-ranging application of cluster analysis in land use evaluation, particularly in its capacity to discern diverse types, characteristics, and relationships within data. However, there exists a noticeable research gap in the assessment of clustering analysis algorithms for the suitability of integrated rural land use for three industries. Consequently, this study’s distinctive contribution lies in the integration of deep learning with cluster analysis to assess the suitability of integrated rural land for three industries, presenting a novel and innovative approach within the realm of rural sustainable development.

D. SUMMARY
In conclusion, while the studies conducted by the aforementioned scholars have yielded valuable insights in the domain of rural development and land use assessment, they still exhibit certain limitations. Consequently, the novelty and distinctiveness of this study lie in the fusion of deep learning and artificial intelligence cluster analysis, as applied to the evaluation of the suitability of integrated rural land for three industries. By surmounting the constraints of conventional methodologies, this approach promises to deliver more precise, scientifically grounded, and objective outcomes in land use evaluation, thereby offering innovative support for rural sustainable development and land planning. The innovative significance of this method in the realm of rural development and land use assessment cannot be overstated, as it introduces a fresh perspective and solution to address practical challenges and future endeavors.

III. RURAL LAND SUITABILITY EVALUATION THROUGH THE INTEGRATION OF DEEP LEARNING AND AI CLUSTERING ALGORITHMS

A. ANALYSIS OF INTEGRATION OF THREE INDUSTRIES
The amalgamation of three industries entails the integration of agriculture, industry, and the service sector, with the optimization of land use serving as a catalyst for the harmonious development of these distinct sectors. Within the context of the modern era, the amalgamation of these rural industries has emerged as the definitive path for China to transition from traditional agriculture to the sustainable development of cutting-edge, technology-driven, and service-oriented industries. This paradigm shift has propelled the realization and implementation of China’s rural revitalization strategy (Zheng et al., 2022). Figure 1


illustrates the developmental framework underpinning the integration of the three rural industries.

[Figure 1. Diagram labels: Primary industry; Secondary industry; Tertiary industry; Integration of three industries; Drive structural optimization; Enhance spatial interaction; Narrowing the income gap; Emphasize environmental protection; Rural sustainable development.]
FIGURE 1. Schematic representation of the developmental framework driving the integration of the three rural industries.

In Figure 1, several noteworthy advantages emerge following the integration of three rural industries. Firstly, land resources can be harnessed more effectively, resulting in enhanced land use efficiency and the maximization of resources. This integration also facilitates the optimization and upgrading of the industrial structure. Secondly, it fosters the diversified development of the rural economy by encouraging synergy among industries, mitigating reliance on any single sector, and bolstering economic stability and resilience. Furthermore, the integration of different industries generates increased employment opportunities, elevates farmers’ income levels, and enhances the living standards of rural residents. Consequently, it contributes to narrowing the income gap between urban and rural areas. Lastly, the amalgamation of the three industries has a positive impact on the rural environment, leading to overall improvements and creating a more appealing living environment for rural residents.

B. DEEP LEARNING AND ITS APPLICATION IN LAND TYPE IDENTIFICATION AND ANALYSIS
Deep learning is a machine learning technique that employs artificial neural network models to simulate and acquire data feature representations, enabling the recognition of intricate patterns and advanced abstractions within the data. Deep learning technology proves valuable in data analysis and prediction, providing decision-making support for assessing the suitability of integrated rural land for three industries. One commonly employed deep learning model, the Convolutional Neural Network (CNN), particularly excels in image classification tasks (Garg et al., 2021). The utilization of CNN for land type identification is depicted in Figure 2.

[Figure 2. Diagram labels: Land image input; Convolution layer; Pooling layer; Fully connected layer; Output layer.]
FIGURE 2. Illustration of CNN’s application in land type identification.

In Figure 2, CNNs demonstrate exceptional suitability for processing data characterized by spatial structures, including images and geographic information. The convolution layer, within this framework, employs convolution operations to extract features from localized areas, while the pooling layer serves to diminish the dimensionality of the feature map while preserving key features. The process of stacking


multiple convolution layers and pooling layers results in the extraction of increasingly abstract, higher-level features. This hierarchical feature extraction enables precise data classification and recognition.

Within the CNN algorithm, ResNet50, a variant of ResNet, holds significance (Cecili et al., 2023). In the domain of land type recognition, ResNet-50 offers distinct advantages, characterized by its depth, residual connections, and training on extensive image datasets. These attributes equip it to effectively capture the intricate characteristics of land, thereby enhancing classification accuracy and stability. The fusion of ResNet-50 with meticulous data preprocessing and feature extraction makes the realization of a more precise and resilient land type recognition model attainable. ResNet50 involves five down-sampling operations: the second employs maximum pooling to halve the feature map size, while the other four employ a convolution stride of 2, effectively extracting land type features while progressively reducing the feature map size, as illustrated in Figure 3.

[Figure 3. Diagram labels: Land image input; BN layer; ResNet50 (1 ... 4); Softmax; Output layer.]
FIGURE 3. Schematic representation of ResNet50’s application in land type identification.

As depicted in Figure 3, when ResNet50 is employed for image feature extraction, it comprises four residual modules, each composed of a convolution layer and a pooling layer. Notably, the last two residual modules do not employ down-sampling. Within this framework, the Batch Normalization layer is commonly utilized to adjust the output data distribution from the convolution layer, thereby expediting convergence (Groth et al., 2021). Suppose that the input of a batch at a specific neural network layer is represented as $X = [x_1, x_2, \dots, x_n]$, where $x_i$ denotes a rural land sample, and $n$ signifies the batch size. To begin, the mean value $\mu_B$ of the elements within the mini-batch is computed, as indicated in Equation (1).

$$\mu_B = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{1}$$

Next, the variance $\sigma_B^2$ of the mini-batch is determined, as illustrated in Equation (2).

$$\sigma_B^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu_B)^2 \tag{2}$$

In this manner, each element can be normalized, as depicted in Equation (3).

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}} \tag{3}$$

In Equation (3), $\varepsilon$ represents a small constant that prevents the denominator from being 0. After performing the aforementioned operations, the data are standardized to a mean of 0 and a variance of 1, which discards the original scale and offset of the data. So that the layer can still recover the data as it was before Batch Normalization (the identity transformation), a scale and shift are applied, as demonstrated in Equation (4).

$$y_i = \gamma \hat{x}_i + \beta \tag{4}$$

In Equation (4), $y_i$ represents the output of the Batch Normalization layer, while $\gamma$ and $\beta$ denote the scale (which restores the variance) and offset of the data distribution, respectively. In networks lacking a Batch Normalization layer, these two quantities are determined by the nonlinear characteristics introduced by the preceding layer. However, following transformation, they become independent of the previous layer and instead serve as learnable parameters of the current layer. This adjustment is advantageous for optimization and does not compromise network capacity. During testing, the BN operation, denoted as $\hat{x}'$, employs unbiased estimators of the mean value, $E(x)$, and variance, $Var(x)$, accumulated over the mini-batches seen during training. The final output is designated as $y'$, as outlined in Equations (5) to (8).

$$E(x) = E_B(\mu_B) \tag{5}$$

$$Var(x) = \frac{n}{n-1} E_B(\sigma_B^2) \tag{6}$$

$$\hat{x}' = \frac{x - E(x)}{\sqrt{Var(x) + \varepsilon}} \tag{7}$$

$$y' = \gamma \hat{x}' + \beta \tag{8}$$

In the context of employing ResNet50 for land type identification, enhancing the model’s fitting capability can be achieved by solely increasing its depth through the use of


convolutional kernel residual structures. It is advisable to omit the final two down-sampling layers to reduce computational demands. Opting to remove the initial two down-sampling layers may result in minor alterations to the feature map recognition network.

When assessing the performance of a learning algorithm, typical metrics such as accuracy, precision, recall, and F1 value are commonly employed to gauge the accuracy of land type recognition. The precise equations for these metrics are presented in Equations (9) through (12).

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{9}$$

$$Precision = \frac{TP}{TP + FP} \tag{10}$$

$$Recall = \frac{TP}{TP + FN} \tag{11}$$

$$F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall} \tag{12}$$

Here, TP represents the count of true positive samples correctly predicted as positive. FP stands for the count of negative samples inaccurately predicted as positive. FN denotes the count of positive samples erroneously predicted as negative. TN signifies the count of negative samples correctly predicted as negative. Here, $d$ signifies the number of distinct land types classified, $p_{ii}$ represents the count of pixels accurately classified for each land type, $p_{ij}$ represents the count of pixels belonging to the i-th land type but classified as the j-th land type by the model (indicating false positives), and $p_{ji}$ refers to false negative pixels.

The mean Intersection over Union (IoU) serves as another commonly employed metric. It quantifies the intersection-to-union ratio between the model’s predicted area and the expected output area for a specific land type, as visually depicted in Figure 4.

[Figure 4. IoU = (A ∩ B) / (A ∪ B)]
FIGURE 4. Visualisation of the intersection over union ratio.

As depicted in Figure 4, it represents the IoU ratio between the model’s output area and the expected output area. Consider a certain land type area as an ellipse A, and the model’s prediction result as a rectangle B. The degree of overlap between ellipse A and rectangle B determines the closeness of the intersection and union areas and, consequently, the proximity of the IoU value to 1. Conversely, if ellipse A and rectangle B have no overlap whatsoever, the IoU value is 0. A higher IoU value signifies a more accurate model classification result. The mean intersection over union (MIoU) can be calculated as the mean IoU value across all land types, as shown in Equation (13).

$$MIoU = \frac{1}{d}\sum_{i=1}^{d} \frac{p_{ii}}{\sum_{j=1}^{d} p_{ij} + \sum_{j=1}^{d} p_{ji} - p_{ii}} \tag{13}$$

In Equation (13), $d$ represents the number of land types being classified. $p_{ii}$ corresponds to the number of pixels correctly classified for each land type, while $p_{ij}$ and $p_{ji}$ represent FP and FN samples, respectively.

C. APPLICATION ANALYSIS OF CLUSTERING ALGORITHM IN LAND FUNCTIONAL ZONING
Typically, rural land use involves a complex and diverse set of indicators. In the suitability evaluation of integrated rural land for three industries, cluster analysis plays a crucial role in classifying and identifying similar objects by establishing data similarity. When applying cluster analysis to the study of land use function zoning, it becomes possible to objectively perform land use function zoning based on the similarities among different rural land uses, thereby revealing the correlations and distinctions between different integrated land types. In this study, the primary cluster methods employed are K-means and the Gaussian mixture model. The optimal clustering scheme is then determined based on the Calinski-Harabasz (CH) coefficient and the silhouette coefficient.

In the k-means algorithm (Jing et al., 2021), the process begins by initializing $k$ clustering centers. Each sample is then assigned to the nearest clustering center. After this initial classification, the clustering centers are updated by calculating the centroid of the samples within each cluster. This iterative process continues until the clustering centers no longer change or the maximum number of iterations is reached. The detailed steps are depicted in Figure 5.
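The assign-and-update loop of k-means described in this section can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `euclidean` and `kmeans`, the toy two-dimensional "feature vectors", and the fixed random seed are all hypothetical, and the initial centers are drawn at random from the samples rather than with the K-means++ seeding the paper also mentions.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def kmeans(samples, k, max_iter=100, seed=0):
    """Plain k-means; returns (labels, centers)."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct samples chosen at random.
    centers = [list(c) for c in rng.sample(samples, k)]
    labels = [0] * len(samples)
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest center.
        labels = [
            min(range(k), key=lambda j: euclidean(x, centers[j]))
            for x in samples
        ]
        # Update step: each center becomes the centroid of its members.
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(samples, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centers.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)]
                )
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:  # centers no longer change: converged
            break
        centers = new_centers
    return labels, centers


# Made-up feature vectors forming two well-separated groups.
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels, centers = kmeans(features, k=2)
```

In the pipeline the paper describes, `features` would instead hold the feature vectors derived from ResNet-50, which are far higher-dimensional; the loop itself is unchanged.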


[Figure 5. Flowchart steps: Start; Determine the number of clusters; Determine the initial clustering center; Calculate Euclidean distance; Classify the sample to the closest class; Is the cluster center no longer changing? (Yes/No); Reaching maximum iteration number? (Yes/No); End training; End.]
FIGURE 5. Flow chart of land use classification under k-means algorithm.

In Figure 5, the initial step involves determining the number of clusters ($k$). Subsequently, $k$ samples are randomly chosen to serve as the initial cluster centers. The Euclidean distance between each sample and each cluster center is then calculated, as represented in Equation (14).

$$d_{ij} = \lVert x_i - \mu_j \rVert_2 \tag{14}$$

In Equation (14), $d_{ij}$ denotes the Euclidean distance between the sample $i$ and the cluster center $j$. $x_i$ represents the coordinates of sample $i$. $\mu_j$ represents the coordinates of cluster center $j$. The sample is assigned to the nearest cluster, $C_j$, as shown in Equation (15).

$$\lambda_i = \arg\min_{j} d_{ij} \tag{15}$$

In Equation (15), $\lambda_i$ represents the cluster to which the sample is assigned, and $d_{ij}$ refers to the Euclidean distance from sample $i$ to cluster center $j$. The cluster center undergoes continuous updates, as demonstrated in Equation (16).

$$\mu_j = \frac{1}{|C_j|}\sum_{i \in C_j} x_i \tag{16}$$

Here, $\mu_j$ refers to the centroid of the updated cluster $j$, and $C_j$ signifies the $j$-th cluster. The Euclidean distance of the cluster center for each sample is calculated iteratively until the cluster center remains unchanged or the maximum number of iterations is reached. The clustering result for each sample point is then returned.

While the K-means algorithm is known for its simplicity and efficiency, it has its limitations. One such limitation is the need to manually specify the number of clusters, which can be inaccurate and may lead to suboptimal results due to local optimization. To address this issue, an optimization method is employed. Initially, the number of clusters is determined using the elbow method (Racolte et al., 2022). The elbow method employs the Sum of Squares due to Error (SSE) to select the optimal number of clusters, as illustrated in Equation (17).

$$SSE = \sum_{i=1}^{k}\sum_{s \in w_i} |s - m_i|^2 \tag{17}$$

For a fixed number of clusters, a smaller SSE value indicates a better clustering outcome; because SSE keeps decreasing as $k$ grows, the elbow method selects the $k$ at which the decrease levels off. In Equation (17), $k$ refers to the number of clusters, $w_i$ refers to the i-th cluster, $s$ denotes a sample in $w_i$, and $m_i$ corresponds to the centroid of the cluster $w_i$. Furthermore, the K-means++ algorithm is employed to initialize the cluster centers, mitigating the issue of random initialization leading to local optima, as demonstrated in Equation (18).

$$P(c_i) = \frac{Q(c_i)^2}{\sum_{j} Q(c_j)^2} \tag{18}$$

$P(c_i)$ represents the probability of selecting sample i as the next initialization centroid, and $Q(c_i)$ denotes the distance between sample i and its nearest already-selected centroid.

In the Gaussian mixture model (GMM), the statistical distribution of the data is quantified by assuming that the samples conform to a linear combination of $K$ Gaussian distributions. The probability of each sample point belonging to each cluster is calculated, and the sample point is assigned to the Gaussian distribution with the highest probability, which completes the clustering (Cao et al., 2021). Let the distribution of the samples be composed of $K$ Gaussian distributions; the mixture model of the $K$ Gaussian distributions is shown in Equation (19).

$$P(x|\theta) = \sum_{k=1}^{K} \lambda_k \phi(x|\theta_k) \tag{19}$$

$$\sum_{k=1}^{K} \lambda_k = 1, \quad \lambda_k \in (0,1) \tag{20}$$

In Equation (20), $\lambda_k$ refers to the probability that the sample belongs to the k-th Gaussian distribution; $x$ refers to sample data; $\phi(x|\theta_k)$ refers to the k-th Gaussian distribution, as shown in Equation (21).

$$\phi(x|\theta_k) = \frac{1}{\sqrt{2\pi}\,\sigma_k}\exp\left(-\frac{(x-\mu_k)^2}{2\sigma_k^2}\right) \tag{21}$$


$\sigma_k^2$ refers to the variance of the $k$-th Gaussian distribution, and $\mu_k$ refers to the expected value of the $k$-th Gaussian distribution. Equation (22) is as follows.

$$n_k = \sum_{j=1}^{N} s_{jk}, \qquad N = \sum_{k=1}^{K} n_k \qquad (22)$$

$s_{jk}$ in Equation (22) is a hidden variable, defined as follows:

$$s_{jk} = \begin{cases} 1, & x_j \in \phi_k \\ 0, & x_j \notin \phi_k \end{cases} \qquad (23)$$

Equation (24) can be derived based on Equation (23).

$$\ln P(x, s \mid \theta) = \sum_{k=1}^{K} \left\{ n_k \ln \lambda_k + \sum_{j=1}^{N} s_{jk} \left[ -\tfrac{1}{2}\ln 2\pi - \tfrac{1}{2}\ln \sigma_k^2 - \frac{(x_j - \mu_k)^2}{2\sigma_k^2} \right] \right\} \qquad (24)$$

Let $s_{jk}$ be denoted as $T(S)$, $\gamma_{jk}$ as its expectation $E(T)$, and $n_k$ as $\sum_{j=1}^{N} \gamma_{jk}$. Then, the maximum likelihood estimation is as shown in Equation (25).

$$L(\theta) = \sum_{k=1}^{K} \left\{ n_k \ln \lambda_k + \sum_{j=1}^{N} \gamma_{jk} \left[ -\tfrac{1}{2}\ln 2\pi - \tfrac{1}{2}\ln \sigma_k^2 - \frac{(x_j - \mu_k)^2}{2\sigma_k^2} \right] \right\}, \qquad \gamma_{jk} = \frac{\lambda_k\, \phi(x_j \mid \theta_k)}{\sum_{l=1}^{K} \lambda_l\, \phi(x_j \mid \theta_l)} \qquad (25)$$

Setting the partial derivatives of $L(\theta)$ with respect to each parameter in $\theta$ to zero yields Equation (26).

$$\begin{cases} \mu_k = \dfrac{1}{n_k} \sum_{j=1}^{N} \gamma_{jk}\, x_j \\ \sigma_k^2 = \dfrac{1}{n_k} \sum_{j=1}^{N} \gamma_{jk}\, (x_j - \mu_k)^2 \\ \lambda_k = \dfrac{n_k}{N} \end{cases} \qquad (26)$$

Equations (25) and (26) represent the steps of the Expectation Maximization (EM) algorithm, which iteratively maximizes the $L(\theta)$ function to complete the training of the Gaussian mixture model (Orellana et al., 2021). The parameter updating steps for the Gaussian mixture model using the EM algorithm are illustrated in Figure 6.

[Figure 6 flow chart: Start; Initialization parameters; K Gaussian mixture models; Calculate probability; Update model parameters; Does the model converge? (Yes/No); End Training; End]

FIGURE 6. Flow chart of updating parameters of Gaussian mixture model by the EM algorithm.

Finally, the optimal number of clusters is determined using the Bayesian Information Criterion (BIC). The penalization equation for this criterion is presented in Equation (27).

$$BIC = K \ln(N) - 2 \ln(L) \qquad (27)$$

In Equation (27), $K$ represents the number of models fitted in the Gaussian mixture model, $N$ denotes the number of samples, and $L$ signifies the likelihood function. The determination of the number of clusters relies on the BIC value, with the number of components corresponding to the minimum BIC selected as the optimal number of clusters in the Gaussian mixture model.

Furthermore, the optimal clustering algorithm is chosen based on internal clustering performance evaluation metrics, including the CH coefficient and contour coefficient.
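As an illustration of the EM updates in Equations (25) and (26) and of the BIC comparison in Equation (27), the following is a minimal sketch for one-dimensional data using only the Python standard library. It is not the authors' implementation. The function names, the evenly spaced initialization of the means, and the choice to count free parameters (3K - 1 in one dimension) in the BIC penalty are assumptions made for this example.

```python
import math


def gaussian_pdf(x, mu, var):
    """Univariate Gaussian density of Equation (21); var plays the role of sigma_k^2."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(xs, weights, means, variances):
    """Log-likelihood of the data under the mixture of Equation (19)."""
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in xs
    )


def fit_gmm_1d(xs, n_components, n_iter=200, tol=1e-9):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances, log-lik)."""
    n = len(xs)
    lo, hi = min(xs), max(xs)
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, 1e-9)
    # Assumed initialization: evenly spaced means, shared variance, equal weights.
    if n_components == 1:
        means = [mean_all]
    else:
        means = [lo + (hi - lo) * k / (n_components - 1) for k in range(n_components)]
    variances = [var_all] * n_components
    weights = [1.0 / n_components] * n_components
    previous = -math.inf
    current = previous
    for _ in range(n_iter):
        # E-step: responsibilities gamma_jk, as in Equation (25).
        gammas = []
        for x in xs:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            gammas.append([d / total for d in dens])
        # M-step: closed-form updates of Equation (26).
        for k in range(n_components):
            n_k = sum(g[k] for g in gammas)
            means[k] = sum(g[k] * x for g, x in zip(gammas, xs)) / n_k
            spread = sum(g[k] * (x - means[k]) ** 2 for g, x in zip(gammas, xs)) / n_k
            variances[k] = max(spread, 1e-9)  # guard against a collapsing component
            weights[k] = n_k / n
        current = log_likelihood(xs, weights, means, variances)
        if abs(current - previous) < tol:
            break
        previous = current
    return weights, means, variances, current


def bic(xs, n_components):
    """BIC in the form of Equation (27), with the penalty counting free parameters."""
    *_, log_lik = fit_gmm_1d(xs, n_components)
    n_params = 3 * n_components - 1  # assumption: K means, K variances, K - 1 weights
    return n_params * math.log(len(xs)) - 2.0 * log_lik
```

Calling `bic(xs, K)` for a range of candidate values of K and keeping the smallest result mirrors the selection rule described above.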
The CH coefficient evaluates the clustering effectiveness
by quantifying the cohesion within clusters and the
separation among clusters. A higher value indicates a more
favorable clustering outcome. Its formulation is presented in
Equation (28).
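$$CH(n_c) = \frac{\sum_{i=1}^{n_c} n_i\, d^2(X_i, X_M) \,/\, (n_c - 1)}{\sum_{i=1}^{n_c} \sum_{x \in X_i} d^2(x, X_i) \,/\, (n - n_c)} \qquad (28)$$

In Equation (28), $X_M$ represents the center of the sample data $X$, $n_c$ denotes the number of clusters, $n_i$ denotes the number of data points contained in category $X_i$, and $n$ is the total number of data points. $d$ stands for distance: $d(X_i, X_M)$ is the distance between the center of category $X_i$ and the center $X_M$, and $d(x, X_i)$ is the distance between data point $x$ and the center of its category $X_i$.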
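To make the ratio in Equation (28) concrete, the sketch below computes the CH coefficient for a small labeled point set with the Python standard library. The function names are illustrative, and the code assumes squared Euclidean distance and at least two non-degenerate clusters.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def ch_coefficient(points, labels):
    """CH coefficient: between-cluster dispersion over within-cluster dispersion."""
    n = len(points)
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    n_c = len(clusters)
    if n_c < 2 or n_c >= n:
        raise ValueError("need 2 <= number of clusters < number of samples")
    overall = centroid(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        center = centroid(members)
        between += len(members) * squared_distance(center, overall)
        within += sum(squared_distance(p, center) for p in members)
    # A zero within-cluster dispersion (all points on their centers) is not handled here.
    return (between / (n_c - 1)) / (within / (n - n_c))
```

For the four one-dimensional points 0, 2, 10, 12 grouped as {0, 2} and {10, 12}, the between-cluster term is 100 and the within-cluster term is 4 / 2 = 2, giving a CH coefficient of 50; grouping them as {0, 10} and {2, 12} gives a far smaller value, which is the behavior the text describes.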
The contour coefficient combines the two elements of
intra-cluster aggregation and inter-cluster separation to
evaluate the clustering effect, as shown in Equation (29).


$$\bar{S} = \frac{1}{n} \sum_{i=1}^{n} \frac{b(i) - a(i)}{\max[a(i), b(i)]} \qquad (29)$$

In Equation (29), $\bar{S}$ refers to the contour (silhouette) coefficient, with values ranging from -1 to 1, and $n$ is the number of samples. A larger value indicates a better clustering result. When the value is negative, it suggests that the sample clustering might be incorrect. $a(i)$ represents intra-cluster dissimilarity, while $b(i)$ signifies inter-cluster dissimilarity.

D. CONSTRUCTION AND ANALYSIS OF SUITABILITY EVALUATION MODEL FOR RURAL LAND INTEGRATION OF TERTIARY INDUSTRY BASED ON DEEP LEARNING FUSION CLUSTER ANALYSIS
This study introduces an advanced method that amalgamates deep learning and clustering analysis techniques. The deep learning model, ResNet-50, is harnessed to extract intricate abstract features. Subsequently, clustering analysis is deployed to categorize analogous land regions into coherent clusters. The selection of the optimal clustering scheme relies on the evaluation of the CH and silhouette coefficients. An improved land suitability assessment methodology is presented, improving the precision and comprehensiveness of the appraisal of rural land's viability for the integration of the three industries. The model for evaluating the suitability of rural land for tertiary industry integration, which is based on the fusion of deep learning and cluster analysis, is illustrated in Figure 7.

In Figure 7, this model begins with a series of data preprocessing and feature learning steps. Initially, diverse data from rural areas, encompassing land use, agricultural production, and rural tourism, are gathered. These data undergo a series of operations including cleaning, denoising, and normalization, to ensure data quality and consistency. Subsequently, a ResNet-50 model is employed for training and feature learning. In the process of transfer learning, a ResNet-50 model pre-trained on a substantial image dataset is loaded, so that it has already acquired advanced features from images. The model is then tailored to align with specific task requirements. To preserve the learned feature representation capability of the pre-trained model, specific lower-level convolution layers are frozen, preventing their weights from being updated during training and retaining the features they initially acquired from the source data. Ultimately, through training of the new layers, the model gradually adapts to the task at hand, acquiring the ability to map the pre-trained feature representation to the specific rural land suitability evaluation task. ResNet-50 excels at learning advanced abstract features within the data, including various land use types, levels of agricultural production, and rural tourism resources. This enables the model to more accurately capture regional characteristics and lays a robust foundation for subsequent evaluation processes.
Secondly, the process involves feature fusion and cluster analysis. Here, the advanced features extracted from the ResNet-50 model are integrated with cluster analysis. Using the features derived from the deep learning model as input, cluster analysis groups similar plots or areas into the same cluster. This integration allows for a comprehensive consideration of various factors and illuminates both the relationships and distinctions among different categories of integrated land. Cluster analysis plays a pivotal role in uncovering potential laws and patterns, thereby enhancing the comprehensiveness and accuracy of land suitability evaluations.
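The clustering stage described above can be sketched with the Python standard library as follows. This is an illustrative sketch, not the authors' code: the input `points` stands in for the feature vectors that would come from ResNet-50, and the function names, the seed handling, and the rule of keeping the old center for an empty cluster are assumptions. The initialization follows the K-means++ rule of Equation (18), the loop follows Equations (14) to (16), and the returned SSE is the quantity of Equation (17) used by the elbow method.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance; its argmin matches that of Equation (14)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_init(points, k, rng):
    """Pick k initial centers with probability proportional to squared distance."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        threshold = rng.random() * sum(d2)
        running = 0.0
        for point, weight in zip(points, d2):
            running += weight
            if running >= threshold:
                centers.append(list(point))
                break
    return centers


def kmeans(points, k, max_iter=100, seed=0):
    """Return (labels, centers, sse) for the given points and cluster count."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step, Equation (15): nearest center for every sample.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step, Equation (16): each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[j])  # assumption: keep an empty cluster's center
        if new_centers == centers:  # centers unchanged, so the loop has converged
            break
        centers = new_centers
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse
```

Running `kmeans` for several values of `k` and comparing the returned SSE values gives the curve on which the elbow is read off.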
Thirdly, the process involves comprehensive evaluation and result interpretation. The clusters generated through cluster analysis serve as the classification of land types for rural three-industry integration, facilitating a comprehensive assessment of land suitability. By comparing and analyzing the characteristics of different clusters in alignment with existing rural development and land use policies, a deeper understanding of the suitability of each type is achieved, thereby furnishing decision-makers with a scientifically grounded basis for their choices. This capacity for comprehensive analysis enhances the credibility and practicality of the evaluation results.

The core algorithm flow of this model is depicted in Figure 8.

[Figure 7 diagram: Land image input; Data preprocessing; BN layer; ResNet50; Clustering algorithm; Suitability classification and evaluation]

FIGURE 7. Schematic diagram of the rural land suitability evaluation model based on the fusion of deep learning and cluster analysis.
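One simple way to carry out the interpretation step described above is to tabulate the land-use composition of every cluster. The sketch below is only an illustration under assumptions: the label names, the function names, and the use of the majority share as a summary are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict


def cluster_composition(cluster_ids, land_use_labels):
    """Share of each land-use label inside every cluster."""
    counts = defaultdict(Counter)
    for cluster_id, label in zip(cluster_ids, land_use_labels):
        counts[cluster_id][label] += 1
    composition = {}
    for cluster_id, counter in counts.items():
        total = sum(counter.values())
        composition[cluster_id] = {label: n / total for label, n in counter.items()}
    return composition


def dominant_land_use(composition):
    """Most frequent land-use label per cluster (hypothetical summary rule)."""
    return {cid: max(shares, key=shares.get) for cid, shares in composition.items()}


# Hypothetical example: five plots, two clusters.
example = cluster_composition(
    [0, 0, 0, 1, 1],
    ["farmland", "farmland", "tourism", "industrial", "industrial"],
)
```

A cluster whose composition mixes several land-use types could then be examined against local policy, while a cluster dominated by one type is easier to label directly.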


    Start
    Input: Land image data
    Output: Comprehensive evaluation results of land suitability
    # Data preparation and preprocessing
    # Combine data
    # Data cleaning, noise reduction, normalization, etc.
    # Split data into training and testing sets
    # Build ResNet-50 model
    def build_resnet50_model():
    # Compile and train ResNet-50 model
    # Use ResNet-50 for feature extraction
    # Perform clustering analysis
    # Conduct suitability assessment
    def suitability_assessment(data, clusters):
        suitability_scores = []
        for cluster_id in range(num_clusters):
    End

FIGURE 8. Model algorithm flow.

IV. EXPERIMENTAL DESIGN AND PERFORMANCE EVALUATION

A. DATASETS COLLECTION
The data sources for this study are diverse and encompass various categories, as outlined in Table 1.

TABLE I
DATA SOURCES OF RURAL TERTIARY INDUSTRY INTEGRATION
Data | Belonging industry | Data type | Source
Land use data | First industry, second industry, and third industry | Land use type, land use area, land use change, etc. | National land use data network (http://www.dsac.cn/)
Agricultural production data | First industry | Crop planting area, agricultural product output, agricultural input and income, etc. | National Bureau of Statistics (http://www.stats.gov.cn/)
Industrial production data | Second industry | The number, scale, output value, land area and pollution discharge of industrial enterprises, etc. | National Bureau of Statistics (http://www.stats.gov.cn/)
Rural tourism data | Third industry | The number of tourist attractions, tourists, tourism income, etc. | National Tourism Administration (http://www.ctaweb.org.cn/)

After acquiring the data, this study preprocesses it and then partitions it into a training set and a test set based on the data type, maintaining a 7:3 ratio between the training and test sets.

B. EXPERIMENTAL ENVIRONMENT
All experiments are conducted on an Ubuntu 18.04 system with an Intel 9900K CPU, featuring 8 cores and 16 threads, with a maximum frequency of 5 GHz. The server is equipped with 32GB of memory. An NVIDIA RTX 2080TI GPU with 11GB of video memory is employed. The deep learning framework utilized for these experiments is TensorFlow, along with essential Python packages and tool libraries such as NumPy.

C. PARAMETERS SETTING
In the training process, the following hyper-parameters are employed: a weight decay of 0.00001, a momentum of 0.9, and an initial learning rate of 0.15. A Cosine Annealing strategy is adopted to modulate the learning rate as the number of consecutive cycles increases. At the end of each cycle, the learning rate is eventually reduced to 0.00001, and this training process is repeated 100 times. These parameter settings allow the training process to commence with a high initial learning rate, facilitating rapid convergence aided by momentum. Subsequently, the learning rate is gradually reduced using the Cosine Annealing strategy, enhancing the model’s stability and generalization capacity in the later stages of training. This adjustment contributes to the model’s improved adaptation to the data, resulting in superior performance. The optimization algorithm used in the model is Adam, the activation function is ReLU, and the batch size is 64. The model consists of a total of 49 convolutional layers and 1 fully connected layer.

D. PERFORMANCE EVALUATION
In the initial stage of the analysis, several algorithms are employed, including the model algorithm proposed in this study, ResNet50, CNN, Zhu et al. (2022), and Boonpook et al. (2023). Rural land use classification and identification accuracy are evaluated using various metrics, including accuracy, precision, recall, and F1 value. Figures 9 through 12 illustrate the outcomes of these assessments.
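The learning-rate schedule described under Parameters Setting can be illustrated with a short sketch. The values 0.15 and 0.00001 come from the text; the function name, the cycle length, and the reading that the rate restarts at the beginning of every cycle are assumptions, since the text does not fix those details.

```python
import math


def cosine_annealing_lr(epoch, epochs_per_cycle, lr_max=0.15, lr_min=1e-5):
    """Cosine-annealed learning rate that restarts at every new cycle (assumed)."""
    if epochs_per_cycle < 1:
        raise ValueError("epochs_per_cycle must be at least 1")
    if epochs_per_cycle == 1:
        return lr_max
    position = epoch % epochs_per_cycle          # epoch index inside the current cycle
    progress = position / (epochs_per_cycle - 1)  # 0.0 at cycle start, 1.0 at cycle end
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Hypothetical usage: an 11-epoch cycle, printed for inspection.
schedule = [cosine_annealing_lr(epoch, 11) for epoch in range(11)]
```

With this form the rate starts each cycle at the initial value, passes through the midpoint of the two bounds halfway through the cycle, and reaches the lower bound on the last epoch of the cycle.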


[Figure 9 plot: Accuracy versus Epochs (0 to 100) for the proposed algorithm, Boonpook et al., Zhu et al., ResNet50, and CNN]

FIGURE 9. Accuracy results of rural land use classification recognition under each algorithm.

[Figure 10 plot: Precision versus Epochs (0 to 100) for the proposed algorithm, Boonpook et al., Zhu et al., ResNet50, and CNN]

FIGURE 10. Precision results of rural land use classification recognition under each algorithm.

[Figure 11 plot: Recall versus Epochs (0 to 100) for the proposed algorithm, Boonpook et al., Zhu et al., ResNet50, and CNN]

FIGURE 11. Recall results of rural land use classification recognition under each algorithm.

[Figure 12 plot: F1 versus Epochs (0 to 100) for the proposed algorithm, Boonpook et al., Zhu et al., ResNet50, and CNN]

FIGURE 12. F1 value results of rural land use classification and identification under each algorithm.

Figures 9 to 12 present a detailed analysis of rural land use classification and identification accuracy using the model algorithm developed in this study, ResNet50, CNN, Zhu et al. (2022), and Boonpook et al. (2023). These analyses consider metrics such as accuracy, precision, recall, and F1 value. The results reveal that the rural land use classification and identification accuracy achieved by the model algorithm in this study is notably high, reaching 88.3%. This performance surpasses that of the other model algorithms by at least 3.1%. The order of accuracy from highest to lowest is as follows: the model algorithm in this study > Boonpook et al. (2023) > Zhu et al. (2022) > ResNet50 > CNN. Furthermore, when examining precision, recall, and F1 values, it becomes evident that the research model algorithm for rural land use classification and identification is superior. This superiority may be attributed to the combination of ResNet-50 and the K-means algorithm, which enhances the model’s ability to accurately capture land characteristics and classify similar plots or areas, ultimately improving accuracy and the overall analytical capability for land suitability evaluation. Consequently, this study’s rural land suitability evaluation model based on deep learning fusion cluster analysis excels in rural land use classification and identification accuracy.

Furthermore, a comparison of the mean intersection over union (mean IoU) for each algorithm is illustrated in Figure 13.
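For reference, the metrics compared above (accuracy, precision, recall, F1 value, and mean IoU) can all be derived from a confusion matrix. The sketch below uses the Python standard library; the function names are illustrative, and macro-averaging over classes is an assumption, since the text does not state how per-class values are combined.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts samples of true class t predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def summarize_metrics(cm):
    """Accuracy plus macro-averaged precision, recall, F1 and IoU (assumed averaging)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    sums = {"precision": 0.0, "recall": 0.0, "f1": 0.0, "iou": 0.0}
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                        # class i samples predicted as another class
        fp = sum(cm[r][i] for r in range(n)) - tp   # other classes predicted as class i
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
        sums["precision"] += precision
        sums["recall"] += recall
        sums["f1"] += f1
        sums["iou"] += iou
    report = {name: value / n for name, value in sums.items()}
    report["accuracy"] = correct / total
    report["mean_iou"] = report.pop("iou")
    return report
```

Per-class IoU here is TP / (TP + FP + FN), so a class can have high precision and still a modest IoU when many of its samples are missed.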


[Figure 13 plot: MIoU versus Epochs (0 to 100) for the proposed algorithm, Boonpook et al., Zhu et al., ResNet50, and CNN]

FIGURE 13. Mean IoU results of each algorithm.

Figure 13 provides insight into the mean IoU of each algorithm. Analysis reveals a common trend of initial increase followed by stabilization in the mean IoU as the training period progresses. Notably, when the training period reaches 100 epochs, the model algorithm introduced in this study attains a mean IoU of 67.29%. In contrast, the highest mean IoU achieved by the other model algorithms is 64.29%, 3 percentage points lower than that of the algorithm reported here. Consequently, a comprehensive analysis of experimental results consistently highlights the superior accuracy of rural land use classification and identification achieved by the model algorithm presented in this study.

Furthermore, to enhance the clustering performance of the model, a comparative analysis of the CH coefficient and contour coefficient between the k-means algorithm and GMM algorithm is presented in Figure 14.

[Figure 14 plot: CH coefficient and contour coefficient for the K-means and GMM clustering algorithms]

FIGURE 14. Comparison of scores of different clustering algorithms.

Upon comparing the clustering scores of each algorithm, depicted in Figure 14, it is evident that the k-means algorithm attains scores of 102.97 for the CH coefficient and 0.29 for the contour coefficient, while the GMM algorithm achieves scores of 100.15 for the CH coefficient and 0.21 for the contour coefficient. Clearly, the k-means algorithm outperforms in terms of both CH coefficient and contour coefficient, signifying its superior clustering efficacy. Hence, this study selects the clustering results generated by the k-means algorithm as the foundation for classifying rural three-industry integration land.

E. DISCUSSION
This study reveals that the algorithm developed here achieves an accuracy of 88.3% in rural land use classification and identification, outperforming the other algorithms in this regard. Additionally, the mean IoU reaches 67.29%, indicating the superior feature identification accuracy of this study’s model algorithm for integrated rural land for three industries when compared to Zhu et al. (2022) and Boonpook et al. (2023). Furthermore, in the analysis of clustering algorithms, the k-means algorithm demonstrates a strong clustering effect, notably surpassing the clustering score of GMM, consistent with Jiang & Beck (2022).

The model reported here offers a versatile array of practical applications. Firstly, it enhances the accuracy of rural land use classification, enabling a more precise understanding of land use conditions. This, in turn, provides essential data support for urban planners and decision-makers, facilitating the formulation of more effective development strategies. Secondly, it allows for more targeted planning of rural industrial land by delineating the types and directions of industrial development in rural areas. This precise planning can identify the specific types of industrial development needed in rural areas, thereby guiding future industrial development in different regions (Qu et al., 2021). Such guidance promotes industrial revitalization in various locales, contributing to the realization of sustainable rural development. In conclusion, the practical application of this research model holds significant potential for rural development planning and land use management, offering strategic support and guidance for the sustainable development of rural areas.

This guide offers a systematic approach for novice researchers, particularly graduate students, aiming to apply cutting-edge methods in unfamiliar fields or scenarios. The proposed steps and recommendations encompass the following key aspects:
1) Acquaintance with Current Methods: Beginners should embark on a comprehensive exploration of the prevailing methodologies and techniques. This entails mastering deep learning models (e.g., ResNet-50), familiarizing themselves with clustering analysis algorithms (e.g., K-means and Gaussian Mixture Models), and grasping the foundational principles of land suitability assessment. This proficiency can be cultivated through an extensive review of pertinent literature, engagement in online courses and tutorials, and hands-on programming and experimentation.
2) Data Gathering and Organization: Graduate students should undertake the collection of data pertinent to their new area of interest or research inquiry. This may encompass geographical information

system (GIS) data, remote sensing images, rural economic statistics, and other relevant sources. Vigilance in preserving data quality and consistency is paramount; hence, data preprocessing and cleansing are indispensable stages.
3) Data Preprocessing and Feature Engineering: Novices must become proficient in data preprocessing and feature engineering. These activities encompass tasks such as data cleansing, noise mitigation, normalization, and the extraction of salient features from raw datasets. The caliber of feature engineering directly impacts the model’s efficacy.
4) Model Selection and Training: Depending on the research question, beginners should select a suitable deep learning model, potentially ResNet-50 or an alternative well-suited model. Thereafter, they should train the model on their own data, which may necessitate access to a computer equipped with GPU acceleration. The refinement of the model’s hyper-parameters to attain peak performance is also a pivotal endeavor.
5) Clustering Analysis and Interpretation: Following the acquisition of predictions from the deep learning model, beginners can combine these outcomes with clustering analysis techniques. This combination aids in revealing latent patterns and groups within the data, further explaining the model’s predictions.
6) Results Comprehension and Documentation: Lastly, newcomers must interpret their research findings and compile an exhaustive report or scholarly paper. This entails relating the model’s outputs to real-world contexts, scrutinizing the significance of the results, and deliberating on the study’s constraints and prospective avenues for further research.

This study outlines key avenues for advancing land suitability assessment methods, focusing on improving model performance, multimodal data fusion, interpretability studies, and cross-domain applications. Future research should prioritize enhancing deep learning models’ performance in new scenarios. This involves optimizing model architectures and leveraging data augmentation techniques to increase their effectiveness. In cases involving diverse data types (e.g., images, geographic, economic), research efforts should explore techniques for seamlessly integrating these multimodal data into unified models. This integration can boost prediction accuracy and offer a more holistic view of complex landscapes. Addressing the interpretability of deep learning models is paramount. Research should delve into methods to elucidate model decisions and establish meaningful links between interpretability findings and real-world contexts. This direction aims to make models more transparent and accountable. Extending these methods to different domains (e.g., urban planning, environmental protection, natural resource management) holds significant promise. Adapting these techniques to new scenarios across diverse fields can lead to innovative solutions for multifaceted challenges.

By following these methods and recommendations, beginners can acquire the skills to adapt existing techniques to novel fields or scenarios and explore promising avenues for future research. Furthermore, this practice encourages fellow scholars to engage actively and contribute to the advancement of pertinent domains.

V. CONCLUSION

A. RESEARCH CONTRIBUTION
This study introduces deep learning and clustering algorithms and presents an innovative rural land suitability evaluation model based on deep learning fusion clustering analysis. The novelty of this study lies in the synergistic integration of deep learning and cluster analysis, which yields more precise and comprehensive support for rural sustainable development. Experimental analysis reveals that the accuracy of rural land classification achieved by the model algorithm presented in this study reaches 88.3%. Additionally, the mean IoU attains 67.29%, demonstrating clear superiority over other algorithms. The utilization of the k-means algorithm for clustering land suitability enhances its utility, providing robust support and guidance for the sustainable development of rural areas.

The ResNet-50 model employed in this analysis exhibits several strengths and limitations. ResNet-50 is a deep CNN model characterized by its substantial depth, which facilitates the extraction of high-level abstract features from data. It excels at recognizing diverse land-use patterns, agricultural production levels, and rural tourism resources, thereby enhancing its feature extraction capabilities for land characteristics. ResNet-50 is used here with a transfer learning approach: it is initially pre-trained on extensive image datasets and subsequently fine-tuned for specific tasks. This strategy expedites model convergence, diminishes the demand for copious labeled data, and enhances the model’s generalization ability. Parameter freezing is applied to the lower convolutional layers of ResNet-50, ensuring that these layers retain their original feature representations and thereby aiding the model’s adaptation to the specific task. In scenarios involving data with spatial structures, such as image-based land suitability assessment and land-use classification, ResNet-50 is renowned for its high performance. However, it also presents several limitations. Notably, deep learning models, including ResNet-50, necessitate substantial volumes of labeled data, which can be a limiting factor when such data is scarce. Furthermore, the computational resources required for training and inference are considerable, often mandating high-performance computing equipment and hardware acceleration. The numerous hyperparameters inherent to deep learning models, including learning rates, weight decay, and iteration counts, demand meticulous tuning and optimization to achieve peak performance. Inappropriate parameter configurations can

lead to reduced model efficacy or training instability. Lastly, deep learning models tend to be regarded as black-box models, hindering their interpretability. In fields like land suitability assessment, where interpretability is critical, additional efforts may be required to elucidate the model’s decision-making processes.

B. FUTURE WORKS AND RESEARCH LIMITATIONS
Nonetheless, this study has certain limitations. For instance, while it utilizes multi-source data as the foundation, it lacks detailed explanations regarding the specific processing methods for distinct types of data, including remote sensing images, land use data, and tourism data. To address this, future research could focus on expanding the scale and diversity of datasets and integrating additional data sources, such as geographic information system data and socio-economic data. This expansion and integration would serve to enhance the accuracy and comprehensiveness of the suitability evaluation for integrated rural land for three industries.

Acknowledgements
This work was supported by Research on the zero-point paradigm of industrial integration agricultural product market led by art design (No.XJKY202267).

REFERENCES
W. Abera et al., “Impacts of land use and land cover dynamics on ecosystem services in the Yayo coffee forest biosphere reserve, southwestern Ethiopia,” Ecosystem Services, vol. 50, pp. 101338, 2021.
W. Boonpook et al., “Deep Learning Semantic Segmentation for Land Use and Land Cover Types Using Landsat 8 Imagery,” Isprs. Int. J. Geo-Inf., vol. 12, no. 1, pp. 14, 2023.
J. Cao et al., “Unsupervised eye blink artifact detection from EEG with Gaussian mixture model,” IEEE J. Biomed. Health, vol. 25, no. 8, pp. 2895-2905, 2021.
M. Castelo-Cabay, J. A. Piedra-Fernandez, and R. Ayala, “Deep learning for land use and land cover classification from the Ecuadorian Paramo,” Int. J. Digit. Earth., vol. 15, no. 1, pp. 1001-1017, 2022.
G. Cecili et al., “Land Cover Mapping with Convolutional Neural Networks Using Sentinel-2 Images: Case Study of Rome,” Land, vol. 12, no. 4, pp. 879, 2023.
M. Debella-Gilo, and A. K. Gjertsen, “Mapping seasonal agricultural land use types using deep learning on Sentinel-2 image time series,” REMOTE SENS-BASEL., vol. 13, no. 2, pp. 289, 2021.
R. Garg et al., “Semantic segmentation of PolSAR image data using advanced deep learning model,” SCI REP-UK, vol. 11, no. 1, pp. 1-18, 2021.
J. Groth et al., “Investigating environment-related migration processes in Ethiopia–A participatory Bayesian network,” Ecosystems and People, vol. 17, no. 1, pp. 128-147, 2021.
Z. Jia, “Rural tourism competitiveness and development mode, a case study from Chinese township scale using integrated multi-source data,” SUSTAINABILITY-BASEL, vol. 14, no. 7, pp. 4147, 2022.
X. Jiang, and M. Beck, “Visual Attention during Scene Viewing–Eye Tracking Discovery with K-Means and Gaussian Mixture Model,” J. Vision, vol. 22, no. 14, pp. 3631-3631, 2022.
J. Jing et al., “Energy method of geophysical logging lithology based on K-means dynamic clustering analysis,” Environ. Technol. Inno., vol. 23, pp. 101534, 2021.
B. Kalisz et al., “Land Use Indicators in the Context of Land Use Efficiency,” SUSTAINABILITY-BASEL, vol. 15, no. 2, pp. 1106, 2023.
X. Leng, and G. Tong, “The Digital Economy Empowers the Sustainable Development of China’s Agriculture-Related Industries,” SUSTAINABILITY-BASEL, vol. 14, no. 1), pp. 10967, 2022.
X. Li et al., “Index system of sustainable rural development based on the concept of ecological livability,” Environ. Impact Asses., vol. 86, pp. 106478, 2021.
M. Liu, “Comparison of multi-source satellite images for classifying marsh vegetation using DeepLabV3 Plus deep learning algorithm,” Ecol. Indic., vol. 125, pp. 107562, 2021.
R. N. Masolele et al., “Spatial and temporal deep learning methods for deriving land-use following deforestation: A pan-tropical case study using Landsat time series,” Remote Sens. Environ., vol. 264, pp. 112600, 2021.
R. Orellana et al., “On the uncertainty identification for linear dynamic systems using stochastic embedding approach with gaussian mixture models,” SENSORS-BASEL, vol. 21, no. 11, pp. 3837, 2021.
G. Racolte et al., “Spherical K-Means and Elbow Method Optimizations With Fisher Statistics for 3D Stochastic DFN From Virtual Outcrop Models,” IEEE Access, vol. 10, pp. 63723-63735, 2022.
F. M. Seaton et al., “Soil health cluster analysis based on national monitoring of soil indicators,” EUR. J. SOIL. SCI, vol. 72, no. 6, pp. 2414-2429, 2021.
A. Shirazy et al., “K-means clustering and general regression neural network methods for copper mineralization probability in Chahar-Farsakh, Iran,” Turk. Jeol. Bult., vol. 65, no. 1, pp. 79-92, 2022.
H. Soleimani et al., “Groundwater quality evaluation and risk assessment of nitrate using monte carlo simulation and sensitivity analysis in rural areas of Divandarreh County, Kurdistan province, Iran,” International Journal of Environmental Analytical Chemistry, vol. 102, no. 10, pp. 2213-2231, 2022.
D. Wang et al., “A review of deep learning in multiscale agricultural sensing,” REMOTE SENS-BASEL, vol. 14, no. 3, pp. 559, 2022.
J. Wang et al., “Identifying the structure of rural regional system and implications for rural revitalization: A case study of Yanchi County in northern China,” Land Use Policy, vol. 124, pp. 106436, 2023.
H. Ge et al., “Research on digital inclusive finance promoting the integration of rural three-industry,” INT J ENV RES
X. Wei et al., “Spatiotemporal assessment of land
PUB HE, vol. 19, no. 6, pp. 3363, 2022. marketization and its driving forces for sustainable
L. Ghayour et al., “Performance evaluation of sentinel-2 and urban–rural development in Shaanxi province in China,”
landsat 8 OLI data for land cover/use classification using SUSTAINABILITY-BASEL, vol. 13, no. 14, pp. 7755,
a comparison between machine learning algorithms,” 2021.
REMOTE SENS-BASEL, vol. 13, no. 7, pp. 1349, 2021. Q. Wen et al., “Evolutionary process and mechanism of
N. T. Giao, N. V. Cong and H. T. H. Nhien, “Using REMOTE population hollowing out in rural villages in the farming-
SENS-BASEL and multivariate statistics in analyzing the pastoral ecotone of Northern China: A case study of
relationship between land use pattern and water quality in Yanchi County,” Ningxia. Land Use Policy, vol. 125, pp.
Tien Giang province,” Vietnam. Water, vol. 13, no. 8, pp. 106506, 2023.
1093, 2021. Y. Qu et al., “How does the rural settlement transition
contribute to shaping sustainable rural development?

14

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3321894

Author Name: Preparation of Papers for IEEE Access (February 2017)

Evidence from Shandong,” China. Journal of Rural


Studies, vol. 82, pp. 279-293, 2021.
J. Yang et al., “Effects of rural revitalization on rural tourism,”
J. HOSP. TOUR. MANAG., vol. 47, pp. 35-45, 2021.
Y. Zheng et al., “The Governance Path of Urban–Rural
Integration in Changing Urban–Rural Relationships in
the Metropolitan Area: A Case Study of Wuhan,” China.
Land, vol. 11. no. 8, pp. 1334, 2022.
Q. Zhu et al., “Land-use/land-cover change detection based on
a Siamese global learning framework for high spatial
resolution remote sensing imagery,” Isprs J.
Photogramm., vol. 184, pp. 63-78.
