1. Introduction
Photosynthesis is a complex and critical biological process in nature performed by green plants [1]. It achieves energy conversion, participates in the carbon cycle, produces oxygen, and provides the basis for the survival of other organisms in the ecosystem. This process is not only a simple chemical reaction but also a key link in the energy conversion and material cycle of living systems [2]. Green plants absorb solar energy and use it to convert carbon dioxide and water into energy-rich organic matter, such as glucose, through photosynthesis. This provides energy and raw materials for plant growth and a food source for other organisms in the biosphere [3]. The net photosynthetic rate (Pn), which measures the speed of carbon dioxide absorption and oxygen release through photosynthesis per unit of time under light conditions, is a core indicator for evaluating the efficiency of plant photosynthesis. It is the total photosynthetic rate minus the respiration rate and is usually used to measure the photosynthetic efficiency and growth conditions of plants. Therefore, real-time monitoring of plant Pn is particularly important for scientifically monitoring plant growth conditions and effectively improving cultivation measures.
The commonly used method for measuring Pn is typically based on ground-contact measurement devices, such as LI-COR photosynthesis meters [4,5]. However, the measurement area of these devices is limited to a single leaf, and the measurement process is time-consuming and labour-intensive, making it impossible to measure an entire crop plant and resulting in low representativeness of the data. Therefore, there is an urgent need for a method that can provide the overall Pn of the crop in a high-throughput and rapid manner. In recent years, unmanned aerial vehicle (UAV) remote sensing technology has made remarkable progress. Its advantages include wide access to information, few operational constraints, highly efficient data acquisition, and the ability to monitor crop growth dynamically. This makes it an important tool for large-area agricultural surveys and monitoring, which is crucial for the precise management of modern agriculture [6]. Considering the high efficiency, flexibility, and real-time capabilities of UAV remote sensing technology, its application to dynamic, rapid, and high-throughput monitoring of soybean Pn shows great potential and is expected to provide strong scientific support for precision management in modern agriculture.
In recent years, UAV remote sensing technology has performed particularly well in the field of crop phenotyping, providing brand-new technical means for crop growth status monitoring, physiological and ecological monitoring [7], and agricultural resource management [8]. In terms of research on crop physiological and ecological indicators, Zhang et al. [9] established an inversion model of the leaf area index (LAI) at four stages of wheat by combining a UAV point-cloud-based canopy height model (CHM) with vegetation indices (VIS). The results showed that the regression model combined with CHM data increased R2 by 0.020–0.268. Li et al. [10] obtained maize canopy structure (including height and density) using UAV point clouds, LAI inversion, and canopy-structure-based multiple linear regression models. Gong et al. [11] successfully estimated the LAI of different rice varieties during the entire growth season based on the product of VIS and canopy height using UAV remote sensing technology, with an error controlled within 24%. This method did not require parameter adjustment due to phenological changes, effectively reducing lag. Combined with machine learning, Han et al. [12] estimated maize aboveground biomass (AGB) using structural and spectral information provided by UAV remote sensing. The results showed that the random forest model had the most balanced performance, with small errors and a high explained variance ratio on both training and test sets. The importance analysis of the predictive factors showed that the three-dimensional volume index had the strongest effect on AGB estimation among the four machine learning models. Zhang et al. [13] analysed structural indices and two chlorophyll vegetation indices using three regression algorithms. Maimaitijiang et al. [14] fused satellite and UAV data for machine-learning-based crop monitoring. Their study provided canopy spectral information and canopy structure characteristics (CSC) in soybean areas using inexpensive UAVs, and four machine learning methods were used to predict soybean LAI, AGB, and leaf nitrogen content from canopy spectral and structural information and their combinations. These studies indicated that CSC, such as plant height and canopy coverage, correlate well with physiological and ecological indicators such as LAI, AGB, and nitrogen content, and that combining CSC and VIS yields good results for the inversion of these physiological indicators. This provides strong support for the application of UAV remote sensing technology in crop physiological and ecological research. Since the organic matter accumulated by photosynthesis directly affects basic growth indicators of soybeans, such as plant height, volume, and canopy coverage, CSC also demonstrate great potential for estimating Pn.
In research on the inversion of crop Pn using remote sensing technology, Zhang et al. [15] established a regression model for the canopy Pn of rapeseed using remote sensing VIS and solar-induced chlorophyll fluorescence. They also obtained a new composite index by multiplying individual indicators, improving the method for extracting the Pn of rapeseed seedlings from UAV remote sensing data. Wu et al. [16] applied inversion modelling to Pn using UAV multispectral images and found that gradient-boosting decision tree and random forest models with fused inputs could be used to estimate rice Pn; this method could also provide a reference for field Pn monitoring and yield prediction. Zhang et al. [17] input multispectral data obtained from UAVs into an LRC model to rapidly predict the diurnal variation of the rice leaf photosynthetic rate. Zhang et al. [18] used six leaf phenotypic traits of aspen leaves (area, length, width, perimeter, ratio, and factor) combined with four machine learning algorithms to invert leaf Pn. The results showed that the extreme gradient-boosting tree had the highest inversion accuracy, with an MAE of 1.12 and an R2 of 0.60. All of the above are classic cases of Pn inversion using remote sensing technology. However, few studies have addressed the Pn of soybean, and little attention has been paid to the effect of CSC on Pn prediction.
Therefore, this study focused on soybeans under different moisture gradients, obtaining visible-light and multispectral images and point cloud data of soybeans using a UAV. The differences in Pn under different conditions and the trends of CSC under different moisture gradients were analysed. The correlations between CSC, VIS, and Pn at different stages were analysed, and VIS were selected as inputs to four machine learning models based on the magnitude of the correlation. The accuracy of the Pn inversion model under different input feature combinations at each stage was compared, and the inversion effect of fusing VIS and CSC was further analysed. The technology roadmap for this study is shown in Figure 1.
2. Materials and Methods
2.1. Study Region and Experimental Design
The study area was located at the experimental base of Batou, Yazhou District, Sanya City, Hainan Province, China (18°22′12″ N, 109°9′11″ E). The experimental base is located in the subtropical region and has a tropical marine monsoon climate. The average annual temperature ranges from 24.9 °C to 26 °C, the average annual sunshine duration is 2572.8 h, and the average annual precipitation is 1100–1300 mm. The area experiences distinct wet and dry seasons and has excellent air quality, making it highly suitable for soybean growth and experimentation.
The soybean sowing for the experiment was conducted on 1 November 2023. As shown in Figure 2a, a total of four ridges were planted in the area, with each ridge containing five different soybean varieties. Each variety was planted in three plots within different ridges, for a total of 60 plots, to increase the sample size per variety. Double rows with a plant spacing of 0.15 m and a ridge spacing of 0.8 m were planted in each plot. Each variety was sown with 16 seedlings; that is, each plot was 1.2 m long and 0.8 m wide. Four moisture gradients were applied to the experimental soil: sufficiently watered (FW, relative moisture content of 80–85%), mild drought (D1, relative moisture content of 65–70%), moderate drought (D2, relative moisture content of 50–55%), and severe drought (D3, relative moisture content of 25–30%). The soil water content of each ridge was controlled at each watering using a flow meter to regulate the amount of water flowing out of each ridge's pipes. Adjacent ridges were separated by a strip of land to prevent watering from interfering with the moisture levels of other ridges. Due to unforeseen circumstances, one plot under the D3 treatment did not emerge successfully.
2.2. Photosynthetic Rate and Unmanned Aerial Vehicle Data Collection
The instrument used to measure Pn was the LI-6800 photosynthesis meter (LI-COR, Lincoln, NE, USA). Because of the opening and closing behaviour of plant stomata, measurements needed to be taken between 8:30 and 11:30 every day, before the flight operations of the multi-sensor UAV. As shown in Figure 2b, the Pn of soybeans was simultaneously measured in the field. Three soybean plants were randomly selected from each plot, and measurements were taken on the third attached leaf from the top. The collected data were averaged to obtain the Pn value for each plot. Five measurements were conducted at the Yazhou experimental base during the soybean growth cycle: at the flowering, podding, beginning seed, seed-filling, and maturity stages. The measurement dates were 6 January, 12 January, 19 January, 26 January, and 3 February 2024, respectively, corresponding to 36, 42, 49, 56, and 63 days after sowing (DAS). Pn samples were collected from 59 plots during each measurement, for a total of 295 samples.
This study utilised a UAV system (Matrice 300 RTK; SZ DJI Technology Co., Ltd., Shenzhen, Guangdong, China) equipped with visible, multispectral, and LiDAR sensors to simultaneously collect three types of remote sensing images (Figure 3). The visible sensor (P1; SZ DJI Technology Co., Ltd.) had a resolution of 8192 × 5460. The multispectral sensor (RedEdge-MX; MicaSense, Seattle, WA, USA) comprised five bands with a wavelength range of 400–900 nm and a resolution of 1280 × 960. The LiDAR sensor (L1; SZ DJI Technology Co., Ltd.) had a ranging accuracy of 3 cm at 100 m. The flight plans for the visible and multispectral sensors were identical: a flight altitude of 30 m, 80% forward overlap, 80% side overlap, a constant-speed capture mode with a capture interval of 1 s, and a flight speed of 2 m s−1. The flight plan for the LiDAR sensor used a flight altitude of 20 m, 70% forward overlap, 20% LiDAR side overlap, 70% visible side overlap, a constant-speed capture mode with a capture interval of 1 s, and a flight speed of 1 m s−1. Two radiometric calibration panels with reflectance values of 5% and 15% were placed in the field before each flight, as the digital numbers (DN) of the multispectral images needed to be converted into reflectance values during post-processing.
2.3. Canopy Structure Characteristics Data Processing
Three CSC of soybean, i.e., canopy coverage, canopy length, and canopy width, were extracted from the visible images; two CSC, i.e., plant height and volume, were extracted from the point cloud; and the VIS were extracted from the multispectral images. The EXGR (excess green minus excess red) vegetation index [19] is used to assess vegetation cover and growth and is calculated as shown in Equation (1). This index is particularly suitable for analysing UAV visible-light imagery, allowing vegetated and non-vegetated areas to be identified more accurately. Therefore, the EXGR vegetation index combined with the OTSU thresholding method was used to binarise the visible and multispectral images and segment the soybean canopy in the field [20]. The segmentation process is shown in Figure 4.
EXGR = 3G − 2.4R − B (1)
where R, G, and B are the red, green, and blue bands, respectively.
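As a minimal sketch of this segmentation step, the EXGR index can be computed per pixel and thresholded with Otsu's method. The function name and the NumPy-only Otsu routine below are illustrative, not the study's actual pipeline, which may rely on image processing libraries:

```python
import numpy as np

def exgr_mask(rgb):
    """Binarise an RGB image (H, W, 3 float array) with EXGR + Otsu.

    EXGR = ExG - ExR = (2G - R - B) - (1.4R - G) = 3G - 2.4R - B.
    Returns a boolean mask where True marks vegetation pixels.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    exgr = 3.0 * g - 2.4 * r - b

    # Otsu's method: choose the threshold that maximises the
    # between-class variance of the EXGR histogram.
    hist, edges = np.histogram(exgr, bins=256)
    centers = (edges[:-1] + edges[1:]) / 2
    total = hist.sum()
    sum_all = (hist * centers).sum()
    best_t, best_var = centers[0], -1.0
    w0, sum0 = 0.0, 0.0
    for i in range(255):
        w0 += hist[i]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += hist[i] * centers[i]
        m0 = sum0 / w0                 # mean of the background class
        m1 = (sum_all - sum0) / w1     # mean of the vegetation class
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, centers[i]
    return exgr > best_t
```

On a synthetic image with a green-dominant region and a red-dominant (soil-like) region, the returned mask separates the two cleanly.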
After obtaining the mask image from the visible image, the canopy coverage (CC) of the image was obtained by traversing each pixel of the image, counting the number of black and white pixels, and then calculating the proportion of white pixels to the total number of pixels. The length (L) and width (W) of the canopy were obtained by calculating the number of rows and columns occupied by white pixels.
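The pixel-counting procedure above can be sketched as follows; the `canopy_metrics` helper and the per-pixel ground sampling distance `gsd` are hypothetical illustrations, not the study's code:

```python
import numpy as np

def canopy_metrics(mask, gsd=1.0):
    """Canopy coverage, length, and width from a binary canopy mask.

    mask: 2-D boolean array, True = canopy (white) pixels.
    gsd:  ground sampling distance (units per pixel); illustrative.
    """
    cc = mask.mean()                              # white / total pixels
    length = int(mask.any(axis=1).sum()) * gsd    # rows containing canopy
    width = int(mask.any(axis=0).sum()) * gsd     # columns containing canopy
    return cc, length, width
```

For a 10 × 10 mask with a 3-row by 4-column canopy block, this yields a coverage of 0.12, a length of 3, and a width of 4 (in pixel units when `gsd=1.0`).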
After obtaining the mask image from the multispectral image, soybean images were extracted in the red, green, blue, near-infrared, and red-edge bands. Each soybean image was segmented into multiple regions of interest (ROIs) based on variety. The average greyscale value of each ROI image was calculated, and then the extracted greyscale values were calibrated using two reflectance calibration panels placed before the experiment to obtain reflectance values for the five bands.
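Assuming the empirical-line approach implied by the two panels, the DN-to-reflectance conversion for each band can be sketched as a linear fit through the two panel readings. The function and variable names here are illustrative:

```python
import numpy as np

def dn_to_reflectance(roi_dn, panel_dn, panel_refl=(0.05, 0.15)):
    """Convert mean DN values to reflectance using two calibration panels.

    A linear model reflectance = a * DN + b is fitted per band through
    the two panel measurements (5% and 15% reflectance, matching the
    panels placed in the field).
    """
    dn1, dn2 = panel_dn
    r1, r2 = panel_refl
    a = (r2 - r1) / (dn2 - dn1)   # gain
    b = r1 - a * dn1              # offset
    return a * np.asarray(roi_dn, dtype=float) + b
```

For example, if the 5% and 15% panels read DN values of 100 and 300 in some band, an ROI with a mean DN of 200 maps to a reflectance of 0.10.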
Soybean plant height (PH) and volume (V) were extracted from the soybean point cloud using CloudCompare v2.13.1. The software's height measurement tool was used to mark the ROIs and calculate plant height, and its volume calculation tool was used to select regions or geometric shapes for volume measurement.
2.4. Calculation of Vegetation Index
Vegetation indices (VIS) are remote sensing indicators used to assess vegetation health and coverage, typically based on multispectral or hyperspectral image data. These indices evaluate vegetation growth status, chlorophyll content, and land cover types by calculating relationships between different bands in the image [21]. Therefore, after obtaining the reflectance values for the five bands of the soybean canopy, 14 VIS were calculated, as shown in Table 1.
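As a brief illustration, two widely used indices of this kind, NDVI and NDRE, can be computed directly from the band reflectances. These two are shown only as examples of the band-ratio form such indices take; the study's full set of 14 VIS is given in Table 1:

```python
import numpy as np

def ndvi(nir, red):
    """Normalised difference vegetation index: (NIR - R) / (NIR + R)."""
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)

def ndre(nir, red_edge):
    """Normalised difference red-edge index: (NIR - RE) / (NIR + RE)."""
    nir, re = np.asarray(nir, dtype=float), np.asarray(red_edge, dtype=float)
    return (nir - re) / (nir + re)
```

Each function accepts scalar reflectances or whole reflectance arrays, so a per-plot index is obtained by passing the ROI-averaged band reflectances.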
2.5. Construction and Evaluation of Regression Models
In this study, four common machine learning regression models, i.e., multiple linear regression (MLR), random forest regression (RF), extreme gradient-boosting tree regression (XGB), and ridge regression (RR), were implemented in Python to estimate Pn and to fully evaluate model performance and generalisation on the dataset.
- (1) Multiple linear regression (MLR): MLR is a basic regression analysis method that establishes a relationship between the independent and dependent variables by fitting a linear relationship. It is simple to understand and implement, fast to compute, and suitable for datasets that exhibit a clear linear relationship.
- (2) Random forest regression (RF): RF is an ensemble learning method that improves accuracy by constructing multiple decision trees and combining their prediction results [34]. It is highly robust, can handle high-dimensional data and large feature sets, is insensitive to outliers, and effectively reduces overfitting. It is widely used in regression and classification problems and is especially effective for complex datasets with many features.
- (3) Extreme gradient-boosting tree regression (XGB): XGB is a gradient-boosting tree algorithm that improves accuracy by iteratively training decision trees and optimising the loss function [35]. It is efficient, flexible, capable of handling large-scale datasets and complex features, and performs well in modelling non-linear relationships.
- (4) Ridge regression (RR): RR is a regularised linear regression method that prevents overfitting by adding a regularisation term to the loss function, thereby improving the generalisation ability of the model [36,37]. It is suitable for handling collinearity among features, effectively reducing the variance of the model and improving its stability.
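A minimal scikit-learn sketch of the four model types is shown below. The hyperparameters are illustrative defaults rather than the study's tuned values, and `GradientBoostingRegressor` is used as a stand-in for the XGBoost implementation when the `xgboost` package is unavailable:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

def build_models(random_state=0):
    """Return the four regression model types compared in the study.

    Hyperparameters are illustrative, not the paper's tuned values.
    """
    return {
        "MLR": LinearRegression(),
        "RF": RandomForestRegressor(n_estimators=200, random_state=random_state),
        "XGB": GradientBoostingRegressor(random_state=random_state),  # stand-in for xgboost
        "RR": Ridge(alpha=1.0),
    }
```

In practice, each model is fitted on the training-set feature matrix (VIS, CSC, or VIS + CSC columns) against the measured Pn values, then evaluated on the held-out test set.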
Three metrics were used in this study to assess the accuracy of the regression models on the test set. The coefficient of determination (R2) indicates the proportion of the variance of the dependent variable that is predictable from the independent variables. In regression analysis, R2 is used to assess the goodness of fit of a model; it typically ranges from 0 to 1, with larger values indicating a better fit. An R2 of 1 indicates that the model predicts the target variable perfectly, while an R2 of 0 indicates that the model does not explain any of the variance in the target variable. The root mean square error (RMSE) is the square root of the mean squared error (MSE) and measures the average magnitude of the prediction errors; a smaller RMSE indicates a more accurate prediction. The residual predictive deviation (RPD) is the ratio of the sample standard deviation (SD) of the observed values to the RMSE of the predictions and is commonly used to compare the consistency between actual and predicted values. When RPD < 1, the model is considered unable to predict the samples; when 1 ≤ RPD < 2, the model's performance is considered fair and can be used for rough predictions; and when RPD ≥ 2, the model is considered to have good predictive ability.
R2 = 1 − Σⁿᵢ₌₁ (yᵢ − ŷᵢ)² / Σⁿᵢ₌₁ (yᵢ − ȳ)² (2)
RMSE = √[(1/n) Σⁿᵢ₌₁ (yᵢ − ŷᵢ)²] (3)
RPD = SD/RMSE (4)
where n is the number of samples, yᵢ is the observed value, ŷᵢ is the predicted value, ȳ is the mean of the observed values, and SD is the sample standard deviation.
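The three metrics can be computed directly from their definitions, for example (an illustrative sketch, not the study's evaluation code):

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Compute R2, RMSE, and RPD for a set of observed/predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    rpd = y_true.std(ddof=1) / rmse                  # sample SD / RMSE
    return r2, rmse, rpd
```

Note that RPD uses the sample (n − 1 denominator) standard deviation of the observed values, consistent with the definition above.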
4. Conclusions
This study focused on field soybeans under four moisture gradients. Various phenotypic traits at five growth stages were acquired using UAV remote sensing, while soybean Pn was measured manually in the field. The relationships among VIS, CSC, and Pn were comprehensively analysed using UAV visible, multispectral, and point cloud imagery. In addition, the Pn inversion performance of the MLR, RF, XGB, and RR models under different input combinations was evaluated and compared at the flowering, podding, beginning seed, seed-filling, and maturity stages. The results indicated that both VIS and CSC reached their maximum correlation with Pn on DAS63, and, accordingly, all four models showed their highest inversion accuracy on DAS63. Compared with single-type canopy trait inputs (VIS or CSC), regression models with CSC + VIS inputs effectively improved accuracy. Among the four models, RF and MLR were the most stable and accurate for estimating soybean Pn throughout the growth stages; RF is suited to complex datasets with many features, while MLR is suited to datasets with clear linear relationships, so the two complement each other, making them well suited to soybean Pn inversion in the field. In this study, UAV remote sensing technology was used to monitor the Pn of soybeans in real time and with high throughput. This method provides precise growth data, facilitating a scientific understanding of soybean growth conditions and physiological characteristics, and offers essential decision support for modern agricultural management.
This study provides valuable insights into the monitoring of crop Pn but has some limitations. First, because the study covered soybean in only one season and at one location, the generalisability of the results across seasons and regions may be limited. Second, because the study focused only on Pn in soybean, the proposed methodology may not be directly applicable to other crops. Therefore, future studies should consider a wider range of seasons, locations, and crop types to enhance the generalisability and applicability of the results.