Article

Inversion of Glycyrrhiza Chlorophyll Content Based on Hyperspectral Imagery

1
College of Information Science and Technology, Shihezi University, Shihezi 832000, China
2
Geospatial Information Engineering Research Center, Xinjiang Production and Construction Corps, Shihezi 832000, China
3
Industrial Technology Research Institute, Xinjiang Production and Construction Corps, Shihezi 832000, China
4
School of Information Network Security, Xinjiang University of Political Science and Law, Tumxuk 843900, China
*
Author to whom correspondence should be addressed.
Agronomy 2024, 14(6), 1163; https://doi.org/10.3390/agronomy14061163
Submission received: 15 April 2024 / Revised: 16 May 2024 / Accepted: 20 May 2024 / Published: 29 May 2024

Abstract

Glycyrrhiza is an important medicinal crop that has been extensively utilized in the food and medical sectors, yet studies on hyperspectral remote sensing monitoring of glycyrrhiza are currently scarce. This study analyzes glycyrrhiza hyperspectral images, extracts characteristic bands and vegetation indices, and constructs inversion models using different input features. The study obtained ground and unmanned aerial vehicle (UAV) hyperspectral images and chlorophyll content (called Soil and Plant Analyzer Development (SPAD) values) from sampling sites at three growth stages of glycyrrhiza (regreening, flowering, and maturity). Hyperspectral data were smoothed using the Savitzky–Golay filter, and the feature vegetation index was selected using the Pearson Correlation Coefficient (PCC) and Recursive Feature Elimination (RFE). Feature extraction was performed using Competitive Adaptive Reweighted Sampling (CARS), Genetic Algorithm (GA), and Successive Projections Algorithm (SPA). The SPAD values were then inverted using Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), and the results were analyzed visually. The results indicate that in the ground glycyrrhiza inversion model, the GA-XGBoost model combination performed best during the regreening period, with R2, RMSE, and MAE values of 0.95, 0.967, and 0.825, respectively, showing improved model accuracy compared to full-spectrum methods. In the UAV glycyrrhiza inversion model, the CARS-PLSR combination algorithm yielded the best results during the maturity stage, with R2, RMSE, and MAE values of 0.83, 1.279, and 1.215, respectively. This study proposes a method combining feature selection techniques and machine learning algorithms that can provide a reference for rapid, nondestructive inversion of glycyrrhiza SPAD at different growth stages using hyperspectral sensors. 
This is significant for monitoring the growth of glycyrrhiza, managing fertilization, and advancing precision agriculture.

1. Introduction

Glycyrrhiza, a member of the genus Glycyrrhiza in the Fabaceae family, is a perennial herb with a long history in China. It was first documented as a traditional Chinese herbal medicine in “Shen Nong’s Herbal Classic” and is revered as the “king of herbs”. Different parts of glycyrrhiza are widely utilized across various sectors. The abundant flavonoid compounds in the stems and leaves of glycyrrhiza make it an excellent livestock feed and feed additive [1]. Studies have demonstrated that the net photosynthetic efficiency and chlorophyll SPAD value of glycyrrhiza are significantly positively correlated with its aboveground biomass accumulation and leaf flavonoid content [2]. Glycyrrhiza plants and their extracts are extensively used in disease prevention and clinical treatment [3], and they are also valued in the chemical and cosmetics industries [4]. Currently, the demand for high-quality glycyrrhiza is rapidly increasing, yet the resources of glycyrrhiza in China are diminishing. The active components of artificially cultivated glycyrrhiza are affected by climatic conditions, geographic location, and soil quality [5], which makes it challenging to ensure the quality of glycyrrhiza [6].
Photosynthesis in vegetation is influenced by chlorophyll content, which not only serves as a vital indicator of plant health but also reflects the developmental stages and nutritional status of vegetation [7,8]. Traditional chlorophyll detection methods include spectrophotometry [9], fluorometry [10], and high-performance liquid chromatography [11], all of which require destructive sampling and incur high economic and labor costs, making them unsuitable for large-scale chlorophyll content acquisition. In contrast, the portable Soil and Plant Analyzer Development (SPAD) meter allows for the nondestructive, rapid assessment of relative chlorophyll content in vegetation. Studies have shown that the SPAD values in glycyrrhiza are highly positively correlated with the accumulation of aboveground dry matter and the flavonoid content in leaves [2], making SPAD quantification crucial for analyzing dry matter accumulation in glycyrrhiza, with significant practical implications.
Although handheld SPAD meters partially compensate for the shortcomings of traditional methods, the collection of chlorophyll content in large-scale vegetation communities still requires significant time and labor costs [12]. Spectral remote sensing technology is essential for establishing links between canopy reflectance and biophysical and biochemical parameters and can estimate chlorophyll content at the canopy scale. Recently, regression models based on reflectance data and existing vegetation indices have enabled nondestructive, rapid, and effective monitoring of vegetation canopy biophysical parameters at the field scale [13,14]. Qi et al. [15] developed a chlorophyll content detection model for peanuts using a BP neural network based on eight vegetation indices, which allows for the rapid in-field acquisition of chlorophyll information in peanuts and the determination of the optimal planting density.
Hyperspectral imaging is characterized by a “spectral image cube” that captures two-dimensional spatial information along with multiple contiguous narrow spectral bands [16]. Existing research has addressed anomalies in hyperspectral data collection through methods such as Savitzky–Golay (SG) smoothing [17,18], and various feature selection algorithms have effectively reduced data redundancy by extracting key spectral information to construct models for estimating chlorophyll content in vegetation [19]. Hyperspectral technology has been widely applied to estimate chlorophyll content in the canopy leaves of field crops. For instance, Yang et al. [20] processed hyperspectral images using first-order derivatives, employed principal component analysis and the Successive Projections Algorithm (SPA) for dimensionality reduction, and constructed chlorophyll estimation models based on four types of regression models. Zhang et al. [21] applied feature selection methods such as Competitive Adaptive Reweighted Sampling (CARS) to second-order derivative-transformed hyperspectral images and constructed a CatBoost-based model for estimating chlorophyll content in apple trees to monitor their growth, with R2, RMSE, and RPD values of 0.923, 2.472, and 3.64, respectively. Chen et al. [22] developed a model combining the Genetic Algorithm (GA) with the Partial Least Squares (PLS) method to select characteristic bands in rapeseed, resulting in an RPD increase of 3.42 compared to the original full-spectrum model, effectively reducing the number of model variables and enhancing model accuracy.
Hyperspectral technology is particularly prominent in estimating chlorophyll content in the canopy leaves of field crops [23]. The proliferation of UAV-based low-altitude remote sensing platforms has normalized the application of remote sensing technology in precision agriculture, and UAVs equipped with corresponding sensors such as visible light, multispectral, hyperspectral, and LIDAR can accurately and rapidly acquire phenotypic information of plants under different environmental conditions, providing theoretical guidance and technical support for the quantitative analysis of physicochemical parameters during the plant breeding process [24]. Sun et al. [25] based their study on the spatial distribution of chlorophyll for UAV monitoring of corn lodging, constructing a vegetation index from preprocessed hyperspectral images to assess chlorophyll distribution under different lodging levels.
In studies of crop chlorophyll content inversion based on hyperspectral data, Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost) have shown good performance. PLSR and RF have performed well in estimating the structure of various types of vegetation, with excellent capabilities in handling high-dimensional data and analyzing highly collinear datasets [26], and the RF method has proven effective in significantly reducing RMSE [27]. Narmilan et al. [28] constructed multiple vegetation indices from multispectral images and estimated sugarcane chlorophyll content using models such as SVR and XGBoost, assessing crop health in sugarcane fields. SVR and XGBoost have been widely applied in using spectral information to evaluate the physicochemical parameters of crops, achieving good results [22,29,30].
Currently, there is extensive research focused on crops and cash crops [7,12,20], but studies applying hyperspectral remote sensing technology to the inversion of chlorophyll content in glycyrrhiza are scarce. This study applies hyperspectral remote sensing technology to the inversion of chlorophyll content in glycyrrhiza, comparing the performance across different growth stages and various models. This study aims to achieve rapid and nondestructive monitoring of glycyrrhiza canopy chlorophyll content based on hyperspectral data. Hyperspectral data are processed using the SG smoothing method, and characteristic spectral bands are optimized using three feature selection techniques: CARS, GA, and SPA. Four machine learning methods—PLSR, RF, SVR, and XGBoost—are employed for modeling. By comparing and analyzing the results of different combinations, the optimal feature selection and inversion model combinations for different growth stages are selected. Based on the Pearson Correlation Coefficient (PCC) and Recursive Feature Elimination (RFE), the vegetation indices most highly correlated with chlorophyll content are selected to construct the best glycyrrhiza canopy chlorophyll content inversion model, enabling the visualization and mapping of UAV imagery. This study inverts glycyrrhiza SPAD values based on the minimal effective information variables at different growth stages, providing an effective strategy and method for rapid and nondestructive monitoring of glycyrrhiza canopy chlorophyll content using UAVs.

2. Materials and Methods

2.1. Study Area Overview

The research area is located at the Glycyrrhiza Plantation Resources Garden of the fifth company at the Shihezi University Experimental Station in Shihezi City, Xinjiang Uygur Autonomous Region, China (85.970768° E, 44.316581° N) (Figure 1). The experimental station is situated in the western part of Shihezi City at an average altitude of 364 m. The area has a typical temperate continental climate with scarce precipitation: it receives about 2721 to 2818 h of sunshine per year, and annual rainfall ranges from 125.0 to 207.7 mm. The highest temperatures occur in July, with average temperatures between 25.1 and 26.1 °C, and the lowest occur in January, at around −15 °C; these conditions meet the normal growth requirements of glycyrrhiza. The experimental fields primarily grow cotton and glycyrrhiza. The soil type of the experimental plots is clay loam, belonging to the category of loamy soil, with a higher proportion of clay particles (particle size < 2 μm). Due to the region’s low precipitation, drip irrigation is used to meet the water requirements for glycyrrhiza cultivation, and nitrogen levels are uniformly maintained across the fields. To fully cover different varieties of glycyrrhiza and meet the needs of a diverse monitoring environment, 158 experimental plots (2 m × 5 m each) were established within the selected research area. These plots host seven varieties of glycyrrhiza: Glycyrrhiza uralensis, G. inflata Batal, G. glabra L., G. var. glandulosa, G. eglandulosa, G. eurycarpa, and G. prostrata, with different varieties planted in different plots for experimental analysis.

2.2. Data Collection and Preprocessing

2.2.1. Chlorophyll Content Data Acquisition

For each experiment, data from 50 samples were collected. Due to inherent equipment limitations and human collection errors, deviations were observed in the measurements. After excluding outliers, a total of 148 chlorophyll content records were obtained. Of these, 70% of the data from each growth stage were used as the training set for the models and 30% as the validation set.
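The per-stage 70/30 split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_by_stage`, the random seed, and the per-stage sample counts (50, 50, 48) are assumptions chosen only so that the total matches the 148 retained records.

```python
import numpy as np

def split_by_stage(stages, train_fraction=0.7, seed=0):
    """Return (train_idx, val_idx) so that about train_fraction of the
    samples from each growth stage land in the training set."""
    rng = np.random.default_rng(seed)
    stages = np.asarray(stages)
    train_idx, val_idx = [], []
    for stage in np.unique(stages):
        idx = np.flatnonzero(stages == stage)
        rng.shuffle(idx)  # random assignment within the stage
        n_train = int(round(train_fraction * len(idx)))
        train_idx.extend(idx[:n_train])
        val_idx.extend(idx[n_train:])
    return np.sort(train_idx), np.sort(val_idx)

# Hypothetical stage labels; the per-stage counts are placeholders, not from the study.
stages = ["regreening"] * 50 + ["flowering"] * 50 + ["maturity"] * 48
train_idx, val_idx = split_by_stage(stages)
print(len(train_idx), len(val_idx))
```

Splitting within each stage, rather than over the pooled records, keeps the training and validation sets balanced across growth stages.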
Prior to in-field data collection (during the glycyrrhiza seedling stage), healthy glycyrrhiza plants with typical traits within each experimental plot were tagged. From each plot, 2–3 tagged glycyrrhiza plants were randomly selected. RTK (Real-Time Kinematic) positioning was used to determine the precise locations of these sampling points. Chlorophyll content was measured using a portable SPAD chlorophyll meter. For each sampling point, 3–5 healthy leaves from the canopy of tagged glycyrrhiza plants were selected for measurement. Measurements were taken 3–6 times on different parts of the leaves, avoiding the veins, and the average value from the area of the sampling point was used to determine the chlorophyll content of the glycyrrhiza canopy. The data collection process and the distribution map of glycyrrhiza SPAD values are shown in Figure 2 and Figure 3.

2.2.2. Hyperspectral Data Acquisition and Preprocessing

Considering the climatic conditions and the impact of solar radiation on hyperspectral imaging in the Xinjiang region, the experiments were conducted under clear, cloudless conditions. Each hyperspectral data collection session took place between 13:00 and 15:00. Ground hyperspectral data during the regreening stage of glycyrrhiza were collected on 3 July 2023, with the solar elevation angle at 68°16′15″ and solar azimuth angle at 167°17′22″. UAV hyperspectral data during the flowering stage were gathered on 25 July 2023, with the solar elevation angle at 64°56′46″ and solar azimuth angle at 167°18′37″. UAV hyperspectral data during the maturity stage were collected on 10 September 2023, with the solar elevation angle at 50°36′43″ and solar azimuth angle at 174°47′21″.
For ground hyperspectral imaging, the experimental team used a tripod-mounted hyperspectral camera connected to a laptop. Real-time ground glycyrrhiza canopy hyperspectral data were acquired using Rikola’s Hyperspectral Imager software in Live Imaging mode. The collected hyperspectral data did not require image stitching. The central wavelengths and full width at half maximum (FWHM) of the 45 bands captured by the camera are specified in Table 1.
UAV hyperspectral data acquisition was conducted using a DJI M600 Pro UAV, equipped with a Ronin-MX gimbal carrying a SENOP Rikola frame-based hyperspectral imager (SENOP, Kangasala, Finland) for spectral data collection. This imager is equipped with a 5.5 μm × 5.5 μm CMOS sensor, covers a wavelength range from 500 nm to 900 nm, and includes a relative irradiance sensor (Figure 4). The spectral resolution was set at 9 nm, capturing hyperspectral images across 45 bands with a resolution of 1010 × 1010 pixels. During the UAV data collection, the hyperspectral camera was oriented vertically downwards. The flight altitude was set at 50 m, with a speed of 3.9 m/s, and both the forward (along-track) and side overlap were maintained at 85%. The exposure time was set to 5 ms based on the intensity of sunlight on the day of the experiment. Additionally, 14 ground control points were arranged in the experimental field to facilitate later image stitching and geographic correction.

2.2.3. Hyperspectral Radiometric Calibration

Prior to conducting flight operations, four standard diffuse reflectance panels with varying reflectivities (3%, 22%, 48%, and 64%) are laid out in a cross pattern in an open area of the experimental field to calibrate using ground diffuse reflectance targets, ensuring no shadows obscure the radiometric targets. The UAV-collected hyperspectral images are processed using the Hyperspectral Imager v2.1.4 software equipped on the Rikola hyperspectral camera to perform dark current correction. This correction reduces errors due to absorbance and obtains the irradiance values of the hyperspectral images. Subsequently, the RegMosaicV3.0 software is used for stitching and band registration of the UAV hyperspectral images, synthesizing a complete hyperspectral image. Radiometric corrections are applied using ground-laid radiometric targets, converting irradiance brightness values into reflectance through empirical methods. Geographic registration of the UAV images is performed using ground control points, yielding UAV hyperspectral reflectance image data [31].
The spacing between glycyrrhiza plants and the bare soil between adjacent plots can affect the selection of the ROI in glycyrrhiza hyperspectral images, leading to a decrease in the accuracy of extracting the spectral reflectance of glycyrrhiza. To further reduce the influence of soil background and vegetation canopy shadows in the hyperspectral images, this study employs the Normalized Difference Canopy Shadow Index (NDCSI) [32] to eliminate the effects of soil background and canopy shadows. The calculation formula for NDCSI is shown in Equation (1).
$$\mathrm{NDCSI} = \frac{R_{864} - R_{664}}{R_{864} + R_{664}} \times \frac{R_{719} - R_{719,\min}}{R_{719,\max} + R_{719,\min}}, \tag{1}$$
In the formula, $R_X$ denotes the reflectance value at wavelength $X$, while $R_{719,\max}$ and $R_{719,\min}$ represent the maximum and minimum reflectance values of that band, respectively.
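As a rough illustration of Equation (1), the index can be computed per pixel from three reflectance bands. This is only a sketch: the function name `ndcsi`, the toy reflectance values, and the masking threshold of 0.2 are assumptions for demonstration and are not taken from the study.

```python
import numpy as np

def ndcsi(r864, r664, r719):
    """NDCSI as written in Equation (1): an NDVI-like term multiplied by a
    red-edge term scaled with the band's minimum and maximum reflectance."""
    r864, r664, r719 = (np.asarray(a, dtype=float) for a in (r864, r664, r719))
    ndvi_term = (r864 - r664) / (r864 + r664)
    r719_min, r719_max = r719.min(), r719.max()
    red_edge_term = (r719 - r719_min) / (r719_max + r719_min)
    return ndvi_term * red_edge_term

# Toy 2 x 2 scene (placeholder values): three vegetation-like pixels and one dark pixel.
r864 = np.array([[0.50, 0.40], [0.10, 0.45]])
r664 = np.array([[0.05, 0.06], [0.08, 0.05]])
r719 = np.array([[0.30, 0.25], [0.09, 0.28]])

index = ndcsi(r864, r664, r719)
vegetation_mask = index > 0.2  # hypothetical threshold, to be tuned per scene
print(index.round(3))
print(vegetation_mask)
```

Pixels below the chosen threshold would be excluded before averaging the reflectance inside each ROI.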
Using the Regions of Interest (ROI) tool in ENVI 5.6, the spectral reflectance of the glycyrrhiza canopy at sampling points is obtained. Combined with geographic location information from RTK, ROIs for ground sampling points in UAV hyperspectral imagery are precisely extracted. The average reflectance of each ROI is calculated as the spectral reflectance for that sampling point. The same processing steps are applied to hyperspectral images collected during three different growth stages to obtain glycyrrhiza canopy spectral reflectance data for each period. Due to factors such as equipment characteristics, human operation, and environmental conditions, the hyperspectral data contain a substantial amount of random noise, complicating the structure of the models and reducing their robustness. Previous research has demonstrated that spectral preprocessing methods can reduce random noise in hyperspectral data [33]. In this study, Savitzky–Golay (SG) smoothing is applied to the raw hyperspectral reflectance data collected from the three periods.
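A minimal sketch of the SG smoothing step, using `scipy.signal.savgol_filter`, is given below. The window length (7) and polynomial order (2) are assumptions, since the text does not report the filter settings, and the spectra are synthetic stand-ins for the 45-band reflectance data.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic 45-band spectra between 500 and 900 nm (placeholder data, not from the study).
wavelengths = np.linspace(500, 900, 45)
rng = np.random.default_rng(0)
clean = 0.05 + 0.45 / (1 + np.exp(-(wavelengths - 710) / 15))  # red-edge-like curve
noisy = clean + rng.normal(0, 0.01, size=(10, 45))             # 10 sampling points

# Savitzky-Golay smoothing along the spectral axis; window and order are assumed values.
smoothed = savgol_filter(noisy, window_length=7, polyorder=2, axis=1)

def roughness(spectra):
    """Mean absolute second difference along the band axis (lower means smoother)."""
    return np.abs(np.diff(spectra, n=2, axis=1)).mean()

print(roughness(noisy), roughness(smoothed))
```

A wider window removes more noise but can flatten narrow features such as the red edge, so the settings are usually checked against the raw spectra.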

2.3. Vegetation Index Calculation and Band Selection

2.3.1. Calculation of Vegetation Index and Correlation Analysis

When the number of model inputs is large, feature transformation and reduction of the independent variables are needed to optimize model performance [34]. Constructing vegetation indices is a common method of feature transformation. Using the Pearson Correlation Coefficient (PCC), the vegetation indices most significantly and highly correlated with chlorophyll content are identified; this selects the best indices for chlorophyll inversion, removes redundant band information, saves computational resources, and makes data collection more efficient. To minimize the effects of solar elevation angle, canopy shadows, and soil background during different growth stages, to enhance the spectral feature variations of glycyrrhiza, and to improve the spectral contrast between different materials, this study selects multiple dual-band and triple-band combination vegetation indices, which are listed in Table 2. The correlation between the vegetation indices and SPAD values is analyzed using the PCC, and the importance of the vegetation indices is assessed using RFE. Based on the correlation coefficients and importance analysis results, vegetation indices with high correlation and importance to SPAD values are selected for model analysis (Table 3, Figure 5). Analysis results indicate that the chlorophyll absorption ratio index (CARI) and its derivative indices show a high correlation with the chlorophyll content of vegetation.
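The two-step screening (PCC followed by RFE) might look roughly like the sketch below. The index names `VI_a` to `VI_d`, the synthetic data, and the choice of `LinearRegression` as the RFE base estimator are all assumptions; the text does not state which estimator was used inside RFE.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic vegetation indices (placeholders): only VI_a and VI_b drive SPAD.
rng = np.random.default_rng(0)
n = 60
names = ["VI_a", "VI_b", "VI_c", "VI_d"]
X = rng.normal(size=(n, 4))
spad = 40 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.5, n)

# Step 1: Pearson correlation of each index with SPAD.
pcc = {name: pearsonr(X[:, j], spad)[0] for j, name in enumerate(names)}

# Step 2: recursive feature elimination down to two indices (assumed base estimator).
rfe = RFE(LinearRegression(), n_features_to_select=2).fit(X, spad)
selected = [name for name, keep in zip(names, rfe.support_) if keep]

print({k: round(v, 2) for k, v in pcc.items()})
print(selected)
```

Indices that rank highly under both criteria would then be carried forward as model inputs.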

2.3.2. Feature Band Selection Algorithms

Due to the numerous bands in hyperspectral data and their substantial redundancy, three methods are employed for feature band selection: CARS, GA, and SPA.
CARS integrates cross-validation (CV) with PLSR model coefficients to filter out significant feature variables. It progressively eliminates less important variables to retain the critical ones, using PLSR and CV to select the best combination of reflectance bands from the full spectrum [7]. GA follows evolutionary mechanisms and serves as a global optimization strategy in high-dimensional spaces; it is well suited to the combinatorial optimization involved in selecting important bands and typically achieves good results on such problems [47]. It has been extensively used for variable selection in spectral multivariate calibration. SPA is a forward variable selection algorithm that minimizes collinearity in the vector space, selecting the least redundant spectral variables from the full-band spectral information [48].
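To make the projection idea behind SPA concrete, the sketch below implements only its core loop: starting from one band, it repeatedly projects all bands onto the subspace orthogonal to the last selected band and picks the band with the largest remaining norm. The function name `spa_select` and the toy data are assumptions; a full SPA would also choose the starting band and the number of bands by validation error, which is omitted here.

```python
import numpy as np

def spa_select(X, n_select, start=0):
    """Core projection loop of the Successive Projections Algorithm.
    X has shape (n_samples, n_bands); returns indices of selected bands."""
    P = np.asarray(X, dtype=float).copy()
    P -= P.mean(axis=0)  # work with mean-centered bands
    selected = [start]
    for _ in range(n_select - 1):
        v = P[:, selected[-1]]
        # Remove from every band the component that lies along the last selected band.
        P = P - np.outer(v, v @ P) / (v @ v)
        norms = np.linalg.norm(P, axis=0)
        norms[selected] = -1.0  # never pick a band twice
        selected.append(int(np.argmax(norms)))
    return selected

# Toy example: band 1 is an exact multiple of band 0, band 2 is independent.
rng = np.random.default_rng(0)
a, b = rng.normal(size=30), rng.normal(size=30)
X = np.column_stack([a, 2 * a, b])
print(spa_select(X, n_select=2, start=0))  # -> [0, 2]
```

The collinear band is skipped because nothing is left of it after projection, which is the sense in which SPA selects the least redundant variables.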

2.3.3. Model Construction and Evaluation Metrics

PLSR combines principal component analysis, canonical correlation analysis, and multiple linear regression for multivariate statistical analysis of linear data. PLSR is particularly advantageous when the predictor variables are highly collinear and is suitable for scenarios where the number of variables exceeds the number of samples [16]. SVR is based on statistical learning theory; it uses kernel functions to map the input space into a higher-dimensional feature space, converting a nonlinear regression problem into a linear one, and performs well with small samples [49]. RF is an ensemble learning method that combines multiple decision trees built on bootstrap samples. Each tree considers a random subset of variables at each node split, and for regression the forest predicts the dependent variable by averaging the predictions of the individual trees [50,51]. XGBoost is a boosting method that generates decision trees iteratively, with each new tree fitted to the residuals of the preceding predictions, thereby reducing the discrepancies between measurements and predictions and improving the fit for difficult samples.
Reflectance data obtained via SG smoothing, along with the feature bands and vegetation indices obtained through the feature selection methods, are used as the independent variables, with SPAD values as the dependent variable, to construct glycyrrhiza SPAD inversion models based on PLSR, SVR, RF, and XGBoost. The coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE) serve as evaluation metrics for the SPAD inversion model. An R2 closer to 1 indicates a better model fit, while lower RMSE and MAE values represent higher model accuracy. The calculation formulas are as follows:
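$$R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},$$
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}},$$
$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|}{n}.$$
In the formula, $y_i$ is the measured value; $\hat{y}_i$ is the estimated value; $\bar{y}$ is the average measured value of the samples; and $n$ is the number of samples.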
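The three metrics above translate directly into a few lines of NumPy. The function names and the three-sample example are illustrative assumptions; the R2 function follows the ratio form given in the text.

```python
import numpy as np

def r2_ratio(y, y_hat):
    """R2 in the ratio form used in the text: explained over total sum of squares."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2))

def rmse(y, y_hat):
    """Root mean square error between measured and estimated values."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

def mae(y, y_hat):
    """Mean absolute error between measured and estimated values."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs(y_hat - y)))

# Hypothetical SPAD values: measured vs. estimated.
y_measured = [40.0, 42.0, 44.0]
y_estimated = [41.0, 42.0, 43.0]
print(r2_ratio(y_measured, y_estimated))  # 0.25
print(rmse(y_measured, y_estimated))      # sqrt(2/3), about 0.816
print(mae(y_measured, y_estimated))       # 2/3, about 0.667
```

Note that scikit-learn's `r2_score` computes one minus the ratio of residual to total sum of squares, which coincides with the ratio form only in special cases such as an ordinary least-squares fit with an intercept evaluated on its own training data.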
The feature selection algorithms and models in this paper are implemented in Python 3.6 within the scikit-learn 0.24.2 environment. The optimal hyperparameters for RF and XGBoost are sought using the GridSearchCV (GS) tool available in the scikit-learn library.
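A grid search of this kind could be set up as in the sketch below. The candidate grid, the 5-fold cross-validation, the RMSE-based scoring, and the synthetic data are assumptions made for illustration; the text does not list the grids that were actually searched.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic features and SPAD-like targets (placeholders, not from the study).
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 6))
y = 40 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0, 0.5, 80)

# Hypothetical candidate values for the two RF parameters tuned in the study.
param_grid = {"n_estimators": [50, 100], "max_features": [2, 4]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,                                   # assumed number of folds
    scoring="neg_root_mean_squared_error",  # assumed selection criterion
)
search.fit(X, y)
best = search.best_params_
print(best, -search.best_score_)
```

The same pattern applies to XGBoost through its scikit-learn-compatible regressor, with a larger grid over its tree and regularization parameters.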

3. Results

3.1. Glycyrrhiza SPAD Feature Band Selection

To achieve optimal inversion results for each model, this study dynamically optimizes the PLS parameter n_components in the CARS algorithm based on the number of input bands, using 10-fold cross-validation and selecting the variable subset with the lowest cross-validated RMSE (RMSECV). SPA divides the original spectral data into a training set and a test set in a 7:3 ratio, uses them for feature selection, and finally outputs the selected bands ranked by importance. The GA algorithm selects feature bands with number_of_generation set to 100, probability_of_crossover at 0.5, probability_of_mutation at 0.01, and threshold_of_variable_selection set at 0.7.
These three feature selection methods effectively reduce the dimensionality of spectral band information. From the SG smoothing preprocessed images, characteristics such as chlorophyll absorption at 575 nm, the red-edge features of crops at 710 nm, and strong reflection traits of crop leaf tissues at 809, 863, 889, and 899 nm are extracted. This extraction of hyperspectral feature bands reduces data redundancy and enhances the robustness and computation speed of the inversion models, and the selected feature bands are shown in Table 4 and Figure 6.

3.2. Construction of Glycyrrhiza SPAD Inversion Models

3.2.1. Establishment and Analysis of Models Based on Feature Bands

Using three feature selection algorithms, feature extraction was performed on the glycyrrhiza canopy reflectance data processed by SG smoothing. This study employed PLSR, RF, SVR, and XGBoost regression models to construct predictive models for SPAD values at different growth stages of glycyrrhiza. When building the inversion model with the PLSR model, multiple regression tests proved that setting n_components to 3 yielded the best regression performance. For the RF model, the main adjustments are n_estimators and max_features, which were optimized using GS and ultimately set at 200 and 2, respectively. In the SVR regression model, the radial basis function kernel (RBF) was used, with penalty coefficient C set to 1 and gamma set to scale for optimal model performance. Additionally, the XGBoost model, having more parameters compared to other models, showed through multiple GS experiments that setting colsample_bytree at 0.7, max_depth at 3, min_child_weight at 5, reg_alpha at 0.9, reg_lambda at 1, and subsample at 0.4 resulted in the best overall performance across different growth stages.
Various combinations of feature selection methods and models were used to estimate the chlorophyll content of glycyrrhiza across the three growth stages. A comprehensive analysis of Table 5 and Table 6 reveals that the GA-XGBoost combination performed best during the regreening stage, with R2, RMSE, and MAE values of 0.95, 0.967, and 0.825, respectively. For the corresponding full-spectrum model, the R2, RMSE, and MAE were 0.87, 1.222, and 1.275, respectively. Band selection therefore improved all metrics relative to the full spectrum: R2 increased by 9.2%, RMSE decreased by 20.87%, and MAE decreased by 35.29%. The SPA-PLSR combination performed best during the flowering stage: the full-spectrum model gave R2, RMSE, and MAE values of 0.71, 1.243, and 0.926, which improved to 0.77, 1.171, and 0.795 after feature band selection, an increase of 8.45% in R2 and decreases of 5.79% in RMSE and 14.15% in MAE. The CARS-PLSR combination yielded the best results during the maturity stage: the full-spectrum R2, RMSE, and MAE were 0.78, 1.350, and 1.308, improving to 0.83, 1.279, and 1.215 after feature band selection, an increase of 6.41% in R2, a decrease of 5.26% in RMSE, and a decrease of 7.11% in MAE. Inversion accuracy improved after band selection in all three growth stages, with model accuracy ranked from highest to lowest as PLSR, XGBoost, RF, and SVR. During the experiments, the SVR model tended to overfit during training, resulting in poor performance, particularly with high-dimensional data, which suggests it may be less suitable for chlorophyll content inversion studies in glycyrrhiza.
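The percentage changes quoted above are relative changes with respect to the full-spectrum model, and they can be reproduced from the reported metric values. The function name `relative_change` and the dictionary layout are illustrative choices.

```python
def relative_change(full, selected):
    """Percentage change of a metric after band selection, relative to the
    full-spectrum value (positive = increase, negative = decrease)."""
    return (selected - full) / full * 100.0

# (full-spectrum value, value after feature band selection), as reported in the text.
results = {
    "regreening (GA-XGBoost)": {"R2": (0.87, 0.95), "RMSE": (1.222, 0.967), "MAE": (1.275, 0.825)},
    "flowering (SPA-PLSR)": {"R2": (0.71, 0.77), "RMSE": (1.243, 1.171), "MAE": (0.926, 0.795)},
    "maturity (CARS-PLSR)": {"R2": (0.78, 0.83), "RMSE": (1.350, 1.279), "MAE": (1.308, 1.215)},
}

changes = {
    stage: {metric: round(relative_change(*pair), 2) for metric, pair in metrics.items()}
    for stage, metrics in results.items()
}
for stage, metrics in changes.items():
    print(stage, metrics)
```

For example, the regreening-stage R2 change is (0.95 − 0.87) / 0.87 × 100 ≈ 9.2%.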
The results indicate that feature band selection plays a significant role in reducing the dimensionality of input variables and enhancing the inversion accuracy of the models. The experiments show that the combination of CARS with various regression models excels in precision. This not only significantly simplifies model complexity but also enhances the model’s robustness against outlier data, improving model accuracy. Compared to other band selection algorithms, CARS performs exceptionally well across all three growth stages, and the four inversion models constructed with its selected feature bands demonstrate higher accuracy. This algorithm effectively compresses model input parameters under various conditions, addressing some deficiencies of existing modeling approaches. Consequently, CARS proves to be an efficient feature selection algorithm for glycyrrhiza chlorophyll inversion. It achieves favorable results in analyzing multicollinearity problems in datasets and offers a viable method for the study of glycyrrhiza chlorophyll content inversion using hyperspectral remote sensing imagery. Additionally, this study also confirms the high-precision potential of tree-based ensemble learning algorithms such as RF and XGBoost in regression tasks for glycyrrhiza chlorophyll content inversion, providing algorithmic support for solving complex regression problems. Scatter plots based on feature bands selected during different growth stages are shown in Figure 7.

3.2.2. Development and Analysis of Models Based on Vegetation Index

Twelve vegetation indices were selected to construct glycyrrhiza canopy chlorophyll inversion models. Vegetation indices collected during different growth stages were used as independent variables in the inversion models, with SPAD values of the sampling points serving as the dependent variable. Univariate regression was employed to explore the correlation between independent and dependent variables, establishing a functional relationship. The regression results are shown in Table 7.
Vegetation indices selected through correlation analysis are used as independent variables to construct four different regression models for SPAD values across the growth stages. Based on hyperspectral imagery from the three periods, the following vegetation indices were chosen: PSRI, CARI, MCARI, TCARI, RDVI, and CIre during the regreening stage, and CARI, MCARI, RDVI, and TCARI during the flowering and maturity stages. Four regression models were employed to build glycyrrhiza chlorophyll inversion models, as shown in Table 8. During the regreening stage, the RF model yielded the best results, with R2, RMSE, and MAE values of 0.77, 1.429, and 1.655, respectively. PLSR performed best during the flowering and maturity stages, with R2 = 0.52, RMSE = 1.411, and MAE = 1.263 at the flowering stage and R2 = 0.70, RMSE = 1.469, and MAE = 1.528 at the maturity stage.
The accuracy of chlorophyll inversion models based on feature vegetation indices during different growth stages surpasses that of models using single-variable vegetation indices, with the highest performance observed in the RF model during the regreening stage, where the R2, RMSE, and MAE are 0.77, 1.429, and 1.655, respectively. Overall, the chlorophyll inversion accuracy is consistent with that of feature band selection inversions, with both achieving the best results during the regreening stage. Scatter plots based on feature vegetation indices for different growth stages are shown in Figure 8.
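The three accuracy measures quoted throughout this section can be computed as sketched below, assuming the standard definitions of R2, RMSE, and MAE. The function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted SPAD values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical measured and estimated SPAD values, for illustration only.
measured = [38.2, 41.5, 35.9, 44.1, 40.3]
estimated = [37.6, 42.3, 36.8, 43.0, 40.9]
r2, rmse, mae = regression_metrics(measured, estimated)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}, MAE = {mae:.3f}")
```

Under these definitions, MAE computed on a given sample set never exceeds RMSE on the same set, which offers a quick consistency check when reading reported metrics.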

3.3. Visualization Mapping of Glycyrrhiza SPAD Values Based on Feature Selection Methods

Based on the inversion results of the different combination methods, the best-performing inversion models for each growth stage were used for mapping. Feature band images were generated from the hyperspectral data of the flowering and maturity stages using the different feature selection methods. Each pixel of the image served as a model input and was fed into the PLSR, SVR, RF, and XGBoost inversion models to estimate the SPAD value of every pixel in the experimental area during the different growth periods. This yields a distribution map of SPAD values across the entire hyperspectral image, providing a visualization of the glycyrrhiza SPAD values. The vegetation indices and the feature bands retained after feature selection were used as the independent variables of the models for visualization mapping. The SPAD values are segmented for display based on the chlorophyll distribution histogram, with the full-spectrum SPAD value inversion visualization and feature band inversion visualization shown in Figure 9 and Figure 10, respectively.
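The pixel-wise mapping procedure can be sketched as follows. This is a minimal illustration, not the study's pipeline: a linear model with hypothetical coefficients stands in for whichever fitted regressor (PLSR, SVR, RF, or XGBoost) is used, the image is a tiny nested list rather than a real hyperspectral cube, and the class breaks are invented rather than derived from a chlorophyll histogram.

```python
def predict_spad_map(cube, coefficients, intercept):
    """Apply a linear model to every pixel of a rows x cols x bands cube."""
    return [
        [intercept + sum(c * r for c, r in zip(coefficients, pixel)) for pixel in row]
        for row in cube
    ]


def segment(spad_map, breaks):
    """Map each SPAD value to a display class; breaks must be ascending."""
    def level(value):
        for i, upper in enumerate(breaks):
            if value < upper:
                return i
        return len(breaks)
    return [[level(v) for v in row] for row in spad_map]


# Hypothetical 2 x 2 image with three selected feature bands per pixel.
cube = [
    [[0.05, 0.30, 0.45], [0.06, 0.28, 0.50]],
    [[0.08, 0.20, 0.35], [0.04, 0.33, 0.55]],
]
coefficients = [-40.0, 20.0, 50.0]  # hypothetical model weights
intercept = 15.0                    # hypothetical model intercept
spad_map = predict_spad_map(cube, coefficients, intercept)
classes = segment(spad_map, breaks=[35.0, 40.0])
print(spad_map)
print(classes)
```

The class grid is what would be color-coded to produce a SPAD distribution map of the field.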
Field surveys and accumulated prior knowledge reveal that in July, with intense sunlight and abundant daylight, most glycyrrhiza varieties are in their peak growth period, and the chlorophyll content in the leaves is high. In September, despite similar sunlight conditions to July, the growth status of glycyrrhiza plants begins to vary; most varieties have dense foliage, but many individuals of Ural glycyrrhiza start to wither, with a reduction in leaf count and plant wilting, accompanied by a decrease in chlorophyll content and weakened photosynthesis. The chlorophyll content in other glycyrrhiza varieties also declines to varying degrees. Observations from the images indicate a noticeable drop in SPAD values for plots with Ural glycyrrhiza, and the overall chlorophyll level across the entire experimental field is low.

4. Discussion

4.1. Chlorophyll Distribution during Different Growth Stages of Glycyrrhiza

Due to significant developmental differences among glycyrrhiza varieties, and variations in growth characteristics and environmental influences, the distribution of SPAD values varies across growth stages. Nitrogen levels and soil moisture have a substantial impact on the growth of glycyrrhiza [52,53]. This study maintained consistent nitrogen application and irrigation levels across the experimental field, so declines in chlorophyll content due to drought stress or insufficient nutrients are not considered. During the regreening stage, plants grow rapidly and their leaves expand, resulting in large fluctuations in chlorophyll content due to varietal differences. In the flowering stage, growth stabilizes, and leaf chlorophyll content reaches higher levels. By the maturity stage, plants of varieties such as Ural glycyrrhiza and prostrate glycyrrhiza have withered, with a decrease in leaf area, and chlorophyll content is lower overall in the other varieties as well. Therefore, chlorophyll content can be used to assess the growth condition of glycyrrhiza, with the ranking from highest to lowest being flowering stage > regreening stage > maturity stage.

4.2. Selection of Vegetation Index

This study constructed 12 vegetation indices. Previous research found that NDVI built from 683 nm and 800 nm correlates weakly across all three growth stages and contributes little during model training [35], with a lower linear correlation than an index such as NDVI [705, 750]. Composite indices like CARI are good estimators of vegetative chlorophyll content [13]; RDVI, a hybrid of DVI and NDVI, is sensitive to both low and high chlorophyll content variations [54]. REP and MTCI, which are closely associated, are sensitive to high chlorophyll content and commonly used for chlorophyll content estimation [43]. In this study, climatic conditions and the physiological state of the vegetation on the day of image capture during the maturity stage might explain the low correlation between REP 1, REP 2, and SPAD values compared to other growth stages. Zolotukhina et al. [55] estimated chlorophyll content by extracting 14 vegetation indices from various soybean cultivars using a linear model. Liu et al. [56] established estimation models for rice canopy chlorophyll content at different growth stages by extracting vegetation indices (such as DVI, OSAVI, and RDVI) and characteristic bands, using band selection methods and machine learning models, and selected core bands for hyperspectral remote sensing monitoring of rice. Multivariate regression generally outperforms univariate regression, which shows the largest deviation between estimated and actual SPAD values and lower inversion accuracy. Due to uniform and nonuniform spectral presence in hyperspectral imagery, the use of spectral information is limited, and models based on single variables fail to fully reflect the differences in canopy SPAD values across various conditions.

4.3. Selection of Feature Bands

Hyperspectral bands are complex and redundant, and not all bands are necessary for modeling crop physiological states [57]. Models based on full-spectrum bands tend to overfit and lack predictive power and explanatory capacity, whereas continuous narrow bands help determine the optimal composition of discontinuous wide bands. Key features affecting the inversion of vegetative physicochemical parameters are explored to improve crop trait estimation [58]. In the hyperspectral imagery from all three stages, bands such as 575 nm and 719 nm were selected under visible wavelengths using CARS, GA, and SPA, consistent with previous studies [16]. In the near-infrared region (>750 nm), characteristic bands such as 881 nm and 899 nm were selected, eliminating bands with low contribution and high autocorrelation. Reflectance features are often chosen between 583 and 722 nm, a range that encompasses the main wavelengths of visible red light absorbed by chlorophyll. Feature selection methods also identify high-reflectance bands (>800 nm) that correlate more strongly with vegetative chlorophyll content. Many spectral reflectance features selected by the different methods fall within the 600 to 800 nm range, highlighting the importance of the red-edge region in estimating glycyrrhiza chlorophyll content from canopy spectral reflectance during various growth stages [59]. Models built with different feature selection algorithms through image data processing are difficult to optimize across different research subjects, times, environments, and models. Exploring the connections between spectra, eliminating redundant bands, and enhancing the efficiency of band selection algorithms are therefore crucial.
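Of the three selection methods named above, SPA is the simplest to illustrate, because its core is a chain of orthogonal projections that favors bands with little collinearity. The sketch below shows only that projection chain under simplifying assumptions: the starting band and the number of bands are fixed by hand, whereas full SPA implementations typically choose both by validation error. The function name `spa_select` and the toy data are hypothetical and do not reproduce the study's implementation.

```python
def spa_select(bands, start, n_select):
    """Return indices of up to n_select bands chosen by successive projections.

    bands: list of band vectors, each holding one reflectance value per sample.
    start: index of the initial band (fixed by hand in this sketch).
    """
    work = [list(b) for b in bands]  # working copies, modified by projection
    selected = [start]
    while len(selected) < n_select:
        ref = work[selected[-1]]
        ref_norm2 = sum(v * v for v in ref)
        if ref_norm2 == 0.0:
            break  # reference band is a zero vector; projection is undefined
        best_j, best_norm2 = None, -1.0
        for j, col in enumerate(work):
            if j in selected:
                continue
            # Remove the component of this band that lies along the reference.
            coef = sum(a * b for a, b in zip(col, ref)) / ref_norm2
            work[j] = [a - coef * b for a, b in zip(col, ref)]
            norm2 = sum(v * v for v in work[j])
            if norm2 > best_norm2:
                best_j, best_norm2 = j, norm2
        if best_j is None:
            break  # no unselected bands remain
        selected.append(best_j)
    return selected


# Toy data: band 1 is an exact multiple of band 0, so it adds no information.
bands = [
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 3.0, 0.0],
    [1.0, 1.0, 0.0],
]
print(spa_select(bands, start=0, n_select=2))
```

With this toy input the procedure picks bands 0 and 2 and skips the collinear band 1, which is the behavior that makes projection-based selection useful for redundant hyperspectral bands.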

4.4. Accuracy of Chlorophyll Content Models

The canopy hyperspectral imagery collected during the regreening stage of glycyrrhiza indicates that the inversion models performed better during this period. This is mainly due to the significant variations in growth and chlorophyll distribution among different types of glycyrrhiza, which provide diverse chlorophyll content that may benefit model fitting. The most significant changes in SPAD occur during the regreening stage, and these notable fluctuations in SPAD values likely aid in achieving good model fits. The results (Table 5, Table 6 and Table 8) demonstrate that the scale of hyperspectral imagery influences the accuracy of the inversion models, with varying accuracies at different scales, in line with the findings of Pan et al. [60] and Zhu et al. [61] (overall inversion accuracy: ground > UAV). This study also finds that some PLSR models perform slightly worse than RF models, consistent with previous research [62,63]. Gao et al. [64] estimated the leaf chlorophyll content of maize by extracting spectral reflectance from the leaf mesophyll region. Feature selection algorithms and correlation analysis were employed to select characteristic bands, and the estimation was carried out using a PLSR model, with an R2 of 0.86 and an RMSE of 1.86. Ta et al. [65] used reflectance data from apple tree leaves at various growth stages, extracting multiple vegetation indices. They employed a machine learning model to estimate chlorophyll content, achieving high accuracy (R2 = 0.96, RMSE = 0.95).
The XGBoost and RF models are notably accurate and perform stably across all three growth stages. PLSR surpasses RF during certain growth stages of glycyrrhiza, which may reflect the considerable variation in PLSR performance across plant species and regions [66,67]. PLSR, RF, and XGBoost all outperform SVR, potentially because the limited number of independent variables and the low representativeness of the input vegetation indices prevent regression curves from being fitted well from the initial model kernel and feature engineering, consistent with the conclusions drawn by Angel and McCabe [68].

5. Conclusions

This study is based on Rikola hyperspectral imagery from three growth stages of glycyrrhiza. It considers different feature selection methods and the construction of vegetation indices, evaluates their performance in different chlorophyll content inversion models, and explores inversion models that use the selected spectral bands and vegetation indices as independent variables. The study proposes a method to address the large-scale quantitative inversion of chlorophyll content in artificially cultivated glycyrrhiza. It uses feature band selection algorithms to extract characteristic bands from preprocessed hyperspectral imagery and constructs glycyrrhiza chlorophyll inversion models using PLSR, SVR, RF, and XGBoost. When feature bands serve as independent variables, the best inversion accuracy during the regreening stage occurs with the GA-XGBoost combination (R2: 0.95, RMSE: 0.967, MAE: 0.825), and across the flowering and maturity stages, the best model accuracy appears in the CARS-PLSR combination during the flowering stage (R2: 0.83, RMSE: 1.279, MAE: 1.215). When vegetation indices serve as independent variables, the best inversion accuracy during the regreening stage is achieved with the RF model (R2: 0.77, RMSE: 1.429, MAE: 1.655), and across the flowering and maturity stages, the best model accuracy is achieved with the PLSR model during the flowering stage (R2: 0.52, RMSE: 1.411, MAE: 1.263). This study combines feature bands and vegetation indices with machine learning methods to perform quantitative analysis of SPAD values in glycyrrhiza at different growth stages.
This research achieves favorable results in estimating the chlorophyll content of glycyrrhiza from hyperspectral imagery. However, the models and corresponding parameters selected in this study may not be suitable for chlorophyll estimation in other crops, and further work is needed to assess the transferability of the band selection methods and models. Additionally, the sample size for each growth stage in this study is relatively small. Increasing the sample size may be an effective way to enhance the adaptability and accuracy of the models.

Author Contributions

M.X.: methodology, software, writing, editing, data analysis, result verification. J.D.: funding acquisition, methodology, supervision. G.Z.: data. W.H.: methodology, data analysis, review. Z.M.: methodology, review. P.C.: methodology, writing. Y.C.: methodology, review. Q.Z.: funding acquisition, review. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the 2023 Self-supported Research Project of Shihezi University (ZZZC2023009).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Guo, T.-J.; Zhu, H.-B.; Zhang, J.-Y.; Zang, C.-J.; Sang, D.-J. Physicochemical Properties of Glycyrrhiza uralensis and its Application in Animal Production. China Anim. Husb. Vet. Med. 2014, 41, 105–109. Available online: https://www.chvm.net/CN/Y2014/V41/I9/105 (accessed on 14 April 2024).
  2. Lv, X.; Li, M.; Li, Y.; Bai, H.; Hui, J.; Ma, H.; Li, S.; Guo, S.; Xu, X. Effects of different water and nitrogen ratios on the growth, water use efficiency and flavonoid content of liquorice. J. Plant Physiol. 2023, 59, 421–431. Available online: https://kns.cnki.net/kcms2/article/abstract?v=8dkf_uZKVx28RvmwQ1zEciwpPR73yCoMPDv10YSJzrAA-bT2XdGa-TSK6jH2VAGVG5TtT9kRTpZ0d8KSpGS3xRT4rdbSSSvXUIoD2LRAOssUYXlPmRz5FBVAWngBLAatI10rz7dWQo23t5cOIyBudA==&uniplatform=NZKPT&language=CHS (accessed on 14 April 2024).
  3. Pastorino, G.; Cornara, L.; Soares, S.; Rodrigues, F.; Oliveira, M.B.P.P. Liquorice (Glycyrrhiza glabra): A phytochemical and pharmacological review. Phytother. Res. 2018, 32, 2323–2339.
  4. Zhang, Y.F. Research progress on pharmacological activities of Glycyrrhiza uralensis Fisch and its active components. Clin. J. Chin. Med. 2019, 11, 141–142.
  5. Cui, X.; Lou, L.; Zhang, Y.; Yan, B. Study of the distribution of Glycyrrhiza uralensis production areas as well as the factors affecting yield and quality. Sci. Rep. 2023, 13, 5160.
  6. Deng, T.-M.; Peng, C.; Peng, D.-Y.; Yu, N.-J.; Chen, W.-D.; Wang, L. Research progress on chemical constituents and pharmacological effects of Glycyrrhizae Radix et Rhizoma and discussion of Q-markers. Zhongguo Zhong Yao Za Zhi 2021, 46, 2660–2676.
  7. Yuan, Z.; Ye, Y.; Wei, L.; Yang, X.; Huang, C. Study on the Optimization of Hyperspectral Characteristic Bands Combined with Monitoring and Visualization of Pepper Leaf SPAD Value. Sensors 2022, 22, 183.
  8. Zhang, D.; Zhang, J.; Peng, B.; Wu, T.; Jiao, Z.; Lu, Y.; Li, G.; Fan, X.; Shen, S.; Gu, A.; et al. Hyperspectral model based on genetic algorithm and SA-1DCNN for predicting Chinese cabbage chlorophyll content. Sci. Hortic. 2023, 321, 112334.
  9. Johan, F.; Jafri, M.Z.; Lim, H.S. Laboratory Measurement: Chlorophyll-a Concentration Measurement with Acetone Method using Spectrophotometer. In Proceedings of the 2014 IEEE International Conference on Industrial Engineering and Engineering Management, Selangor, Malaysia, 9–12 December 2014.
  10. Gibbs, C.F. Chlorophyll b Interference in the Fluorometric Determination of Chlorophyll a and “Phaeo-Pigments”. Mar. Freshw. Res. 1979, 30, 597.
  11. Schwartz, S.J.; Woo, S.L.; Von Elbe, J.H. High-performance liquid chromatography of chlorophylls and their derivatives in fresh and processed spinach. J. Agric. Food Chem. 1981, 29, 533–535.
  12. Zhang, Y.; Xia, C.; Zhang, X.; Cheng, X.; Feng, G.; Wang, Y.; Gao, Q. Estimating the maize biomass by crop height and narrowband vegetation index derived from UAV-based hyperspectral images. Ecol. Indic. 2021, 129, 107985.
  13. Wu, C.; Niu, Z.; Tang, Q.; Huang, W. Estimating chlorophyll content from hyperspectral vegetation index: Modeling and validation. Agric. For. Meteorol. 2008, 148, 1230–1241.
  14. Thenkabail, P.S.; Smith, R.B.; De Pauw, E. Hyperspectral Vegetation Index and Their Relationships with Agricultural Crop Characteristics. Remote Sens. Environ. 2000, 71, 158–182.
  15. Qi, H.; Wu, Z.; Zhang, L.; Li, J.; Zhou, J.; Jun, Z.; Zhu, B. Monitoring of peanut leaves chlorophyll content based on UAV-based multispectral image feature extraction. Comput. Electron. Agric. 2021, 187, 106292.
  16. Sudu, B.; Rong, G.; Guga, S.; Li, K.; Zhi, F.; Guo, Y.; Zhang, J.; Bao, Y. Retrieving SPAD Values of Summer Maize Using UAV Hyperspectral Data Based on Multiple Machine Learning Algorithm. Remote Sens. 2022, 14, 5407.
  17. Wen, S.; Zhang, W.; Sun, Y.; Li, Z.; Huang, B.; Bian, S.; Zhao, L.; Wang, Y. An enhanced principal component analysis method with Savitzky–Golay filter and clustering algorithm for sensor fault detection and diagnosis. Appl. Energy 2023, 337, 120862.
  18. Schafer, R. What Is a Savitzky-Golay Filter? [Lecture Notes]. IEEE Signal Process. Mag. 2011, 28, 111–117.
  19. Zhang, Y.; Chen, J.; Miller, J.; Noland, T.L. Leaf chlorophyll content retrieval from airborne hyperspectral remote sensing imagery. Remote Sens. Environ. 2008, 112, 3234–3247.
  20. Yang, Y.; Nan, R.; Mi, T.; Song, Y.; Shi, F.; Liu, X.; Wang, Y.; Sun, F.; Xi, Y.; Zhang, C. Rapid and Nondestructive Evaluation of Wheat Chlorophyll under Drought Stress Using Hyperspectral Imaging. Int. J. Mol. Sci. 2023, 24, 5825.
  21. Zhang, Y.; Chang, Q.; Chen, Y.; Liu, Y.; Jiang, D.; Zhang, Z. Hyperspectral Estimation of Chlorophyll Content in Apple Tree Leaf Based on Feature Band Selection and the CatBoost Model. Agronomy 2023, 13, 2075.
  22. Zhang, J.; Cheng, T.; Guo, W.; Xu, X.; Qiao, H.; Xie, Y.; Ma, X. Leaf area index estimation model for UAV image hyperspectral data based on wavelength variable selection and machine learning methods. Plant Methods 2021, 17, 49.
  23. Zhang, Y.; Xiao, J.; Yan, K.; Lu, X.; Li, W.; Tian, H.; Wang, L.; Deng, J.; Lan, Y. Advances and Developments in Monitoring and Inversion of the Biochemical Information of Crop Nutrients Based on Hyperspectral Technology. Agronomy 2023, 13, 2163.
  24. Liu, J.; Zhao, C.; Yang, G.; Yu, H.; Zhao, X.; Xu, B.; Niu, Q. Review of field-based phenotyping by unmanned aerial vehicle remote sensing platform. Trans. Chin. Soc. Agric. Eng. 2016, 32, 98–106.
  25. Sun, Q.; Gu, X.; Chen, L.; Xu, X.; Wei, Z.; Pan, Y.; Gao, Y. Monitoring maize canopy chlorophyll density under lodging stress based on UAV hyperspectral imagery. Comput. Electron. Agric. 2022, 193, 106671.
  26. Hoeppner, J.M.; Skidmore, A.K.; Darvishzadeh, R.; Heurich, M.; Chang, H.C.; Gara, T.W. Mapping Canopy Chlorophyll Content in a Temperate Forest Using Airborne Hyperspectral Data. Remote Sens. 2020, 12, 3573.
  27. Shah, S.H.; Angel, Y.; Houborg, R.; Ali, S.; McCabe, M.F. A Random Forest Machine Learning Approach for the Retrieval of Leaf Chlorophyll Content in Wheat. Remote Sens. 2019, 11, 920.
  28. Narmilan, A.; Gonzalez, F.; Salgadoe, A.S.A.; Kumarasiri, U.W.L.M.; Weerasinghe, H.A.S.; Kulasekara, B.R. Predicting Canopy Chlorophyll Content in Sugarcane Crops Using Machine Learning Algorithms and Spectral Vegetation Index Derived from UAV Multispectral Imagery. Remote Sens. 2022, 14, 1140.
  29. Huang, L.; Liu, Y.; Huang, W.; Dong, Y.; Ma, H.; Wu, K.; Guo, A. Combining Random Forest and XGBoost Methods in Detecting Early and Mid-Term Winter Wheat Stripe Rust Using Canopy Level Hyperspectral Measurements. Agriculture 2022, 12, 74.
  30. Yoon, H.I.; Lee, H.; Yang, J.S.; Choi, J.H.; Jung, D.H.; Park, Y.J.; Park, J.E.; Kim, S.M.; Park, S.H. Predicting Models for Plant Metabolites Based on PLSR, AdaBoost, XGBoost, and LightGBM Algorithms Using Hyperspectral Imaging of Brassica juncea. Agriculture 2023, 13, 1477.
  31. Wang, N.; Clevers, J.G.; Wieneke, S.; Bartholomeus, H.; Kooistra, L. Potential of UAV-based sun-induced chlorophyll fluorescence to detect water stress in sugar beet. Agric. For. Meteorol. 2022, 323, 109033.
  32. Xu, N.; Tian, J.; Tian, Q.; Xu, K.; Tang, S. Analysis of Vegetation Red Edge with Different Illuminated/Shaded Canopy Proportions and to Construct Normalized Difference Canopy Shadow Index. Remote Sens. 2019, 11, 1192.
  33. Mishra, P.; Biancolillo, A.; Roger, J.M.; Marini, F.; Rutledge, D.N. New data preprocessing trends based on ensemble of multiple preprocessing techniques. TrAC Trends Anal. Chem. 2020, 132, 116045.
  34. Guyon, I.; Elisseeff, A. An Introduction to Variable and Feature Selection. J. Mach. Learn. Res. 2003, 3, 1157–1182.
  35. Feng, L.; Zhang, Z.; Ma, Y.; Du, Q.; Williams, P.; Drewry, J.; Luck, B. Alfalfa Yield Prediction Using UAV-Based Hyperspectral Imagery and Ensemble Learning. Remote Sens. 2020, 12, 2028.
  36. Shu, M.; Chen, X.; Wang, X.; Ma, Y. Estimation of Maize Leaf Area Index and Aboveground Biomass Based on Hyperspectral Data. Smart Agric. 2021, 3, 29–39.
  37. Zheng, H.; Ma, J.; Zhou, M.; Li, D.; Yao, X.; Cao, W.; Zhu, Y.; Cheng, T. Enhancing the Nitrogen Signals of Rice Canopies across Critical Growth Stages through the Integration of Textural and Spectral Information from Unmanned Aerial Vehicle (UAV) Multispectral Imagery. Remote Sens. 2020, 12, 957.
  38. Guerrero, J.M.; Pajares, G.; Montalvo, M.; Romeo, J.; Guijarro, M. Support Vector Machines for crop/weeds identification in maize fields. Expert Syst. Appl. 2012, 39, 11149–11155.
  39. Han, L.; Yang, G.; Dai, H.; Xu, B.; Yang, H.; Feng, H.; Li, Z.; Yang, X. Modeling maize above-ground biomass based on machine learning approaches using UAV remote-sensing data. Plant Methods 2019, 15, 10.
  40. Rondeaux, G.; Steven, M.; Baret, F. Optimization of soil-adjusted vegetation index. Remote Sens. Environ. 1996, 55, 95–107.
  41. Haboudane, D.; Miller, J.R.; Tremblay, N.; Zarco-Tejada, P.J.; Dextraze, L. Integrated narrow-band vegetation index for prediction of crop chlorophyll content for application to precision agriculture. Remote Sens. Environ. 2002, 81, 416–426.
  42. Daughtry, C.S.T.; Walthall, C.L.; Kim, M.S.; De Colstoun, E.B.; McMurtrey, J.E. Estimating Corn Leaf Chlorophyll Concentration from Leaf and Canopy Reflectance. Remote Sens. Environ. 2000, 74, 229–239.
  43. Dash, J.; Curran, P.J. The MERIS terrestrial chlorophyll index. Int. J. Remote Sens. 2004, 25, 5403–5413.
  44. Van Der Meij, B.; Kooistra, L.; Suomalainen, J.; Barel, J.M.; De Deyn, G.B. Remote sensing of plant trait responses to field-based plant–soil feedback using UAV-based optical sensors. Biogeosciences 2017, 14, 733–749.
  45. Guyot, G.; Baret, F. Utilisation de la haute resolution spectrale pour suivre l’etat des couverts vegetaux. Spectr. Signat. Objects Remote Sens. 1988, 287, 279.
  46. Gitelson, A.A.; Viña, A.; Ciganda, V.; Rundquist, D.C.; Arkebauer, T.J. Remote estimation of canopy chlorophyll content in crops. Geophys. Res. Lett. 2005, 32.
  47. Zheng, Z.; Liu, Y.; He, M.; Chen, D.; Sun, L.; Zhu, F. Effective band selection of hyperspectral image by an attention mechanism-based convolutional network. RSC Adv. 2022, 12, 8750–8759.
  48. Chen, X.; Lv, X.; Ma, L.; Chen, A.; Zhang, Q.; Zhang, Z. Optimization and Validation of Hyperspectral Estimation Capability of Cotton Leaf Nitrogen Based on SPA and RF. Remote Sens. 2022, 14, 5201.
  49. Fang, H.; Man, W.; Liu, M.; Zhang, Y.; Chen, X.; Li, X.; He, J.; Tian, D. Leaf Area Index Inversion of Spartina alterniflora Using UAV Hyperspectral Data Based on Multiple Optimized Machine Learning Algorithms. Remote Sens. 2023, 15, 4465.
  50. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  51. Wang, L.; Zhou, X.; Zhu, X.; Dong, Z.; Guo, W. Estimation of biomass in wheat using random forest regression algorithm and remote sensing data. Crop J. 2016, 4, 212–219.
  52. Goudarzi, T.; Tabrizi, L.; Nazeri, V.; Etemadi, M. Nutrient distribution in various tissues of Glycyrrhiza (Glycyrrhiza glabra L.) and the influence of soil fertility on the levels of its bioactive compounds. Ind. Crops Prod. 2024, 209, 118073.
  53. Behdad, A.; Mohsenzadeh, S.; Azizi, M. Growth, leaf gas exchange and physiological parameters of two Glycyrrhiza glabra L. populations subjected to salt stress condition. Rhizosphere 2021, 17, 100319.
  54. Tong, A. Estimating and mapping chlorophyll content for a heterogeneous grassland: Comparing prediction power of a suite of vegetation index across scales between years. ISPRS J. Photogramm. Remote Sens. 2017, 126, 146–167.
  55. Zolotukhina, A.; Machikhin, A.; Guryleva, A.; Gresis, V.; Tedeeva, V. Extraction of chlorophyll concentration maps from AOTF hyperspectral imagery. Front. Environ. Sci. 2023, 11, 1152450.
  56. Liu, H.; Lei, X.; Liang, H.; Wang, X. Multi-Model Rice Canopy Chlorophyll Content Inversion Based on UAV Hyperspectral Images. Sustainability 2023, 15, 7038.
  57. Rivera-Caicedo, J.P.; Verrelst, J.; Muñoz-Marí, J.; Camps-Valls, G.; Moreno, J. Hyperspectral dimensionality reduction for biophysical variable statistical retrieval. ISPRS J. Photogramm. Remote Sens. 2017, 132, 88–101.
  58. Thorp, K.R.; Wang, G.; Bronson, K.F.; Badaruddin, M.; Mon, J. Hyperspectral data mining to identify relevant canopy spectral features for estimating durum wheat growth, nitrogen status, and grain yield. Comput. Electron. Agric. 2017, 136, 1–12.
  59. Wang, Q.; Chen, X.; Meng, H.; Miao, H.; Jiang, S.; Chang, Q. UAV Hyperspectral Data Combined with Machine Learning for Winter Wheat Canopy SPAD Values Estimation. Remote Sens. 2023, 15, 4658.
  60. Pan, J.; Lin, J.; Xie, T. Exploring the Potential of UAV-Based Hyperspectral Imagery on Pine Wilt Disease Detection: Influence of Spatio-Temporal Scales. Remote Sens. 2023, 15, 2281.
  61. Zhu, W.; Sun, Z.; Yang, T.; Li, J.; Peng, J.; Zhu, K.; Li, S.; Gong, H.; Lyu, Y.; Li, B.; et al. Estimating leaf chlorophyll content of crops via optimal unmanned aerial vehicle hyperspectral data at multi-scales. Comput. Electron. Agric. 2020, 178, 105786.
  62. Wang, L.; Chang, Q.; Li, F.; Yan, L.; Huang, Y.; Wang, Q.; Luo, L. Effects of Growth Stage Development on Paddy Rice Leaf Area Index Prediction Models. Remote Sens. 2019, 11, 361.
  63. Chen, X.; Li, F.; Shi, B.; Fan, K.; Li, Z.; Chang, Q. Estimation of Winter Wheat Canopy Chlorophyll Content Based on Canopy Spectral Transformation and Machine Learning Method. Agronomy 2023, 13, 783.
  64. Gao, D.; Li, M.; Zhang, J.; Song, D.; Sun, H.; Qiao, L.; Zhao, R. Improvement of chlorophyll content estimation on maize leaf by vein removal in hyperspectral image. Comput. Electron. Agric. 2021, 184, 106077.
  65. Ta, N.; Chang, Q.; Zhang, Y. Estimation of Apple Tree Leaf Chlorophyll Content Based on Machine Learning Methods. Remote Sens. 2021, 13, 3902.
  66. Yang, X.; Yang, R.; Ye, Y.; Yuan, Z.; Wang, D.; Hua, K. Winter wheat SPAD estimation from UAV hyperspectral data using cluster-regression methods. Int. J. Appl. Earth Obs. Geoinf. 2021, 105, 102618.
  67. Wu, Q.; Zhang, Y.; Zhao, Z.; Xie, M.; Hou, D. Estimation of Relative Chlorophyll Content in Spring Wheat Based on Multi-Temporal UAV Remote Sensing. Agronomy 2023, 13, 211.
  68. Angel, Y.; McCabe, M.F. Machine Learning Strategies for the Retrieval of Leaf-Chlorophyll Dynamics: Model Choice, Sequential Versus Retraining Learning, and Hyperspectral Predictors. Front. Plant Sci. 2022, 13, 722442.
Figure 1. Location of the study area.
Figure 2. Data collection. (a) Tagged sampling points; (b) hyperspectral data collection; (c) RTK data collection; (d) SPAD measurements.
Figure 3. SPAD value distribution across different growth stages.
Figure 4. Unmanned hyperspectral systems.
Figure 5. Feature importance score charts. (a) Importance scores of vegetation index during the regreening stage; (b) importance scores of vegetation index during the flowering stage; (c) importance scores of vegetation index during the maturity stage.
Figure 6. Distribution of feature bands during different growth stages.
Figure 7. Scatter plots of models during different growth stages. (a) Scatter plot of feature bands during the regreening stage; (b) scatter plot of feature bands during the flowering stage; (c) scatter plot of feature bands during the maturity stage. In the figures, the darkest color (blue) of the data points represents the lowest predicted SPAD, while the lightest color (yellow) represents the highest predicted SPAD. The same applies to the following figure.
Figure 8. Scatter plots of models during different growth stages. (a) Scatter plot of feature vegetation index during the regreening stage; (b) scatter plot of feature vegetation index during the flowering stage; (c) scatter plot of feature vegetation index during the maturity stage.
Figure 9. Inversion map of SPAD values during the flowering stage.
Figure 10. Inversion map of SPAD values during the maturity stage.
Table 1. Wavelengths and full width at half maximum (FWHM) of the 45-band hyperspectral camera.
| Band | Wavelength/nm | FWHM/nm | Band | Wavelength/nm | FWHM/nm |
|---|---|---|---|---|---|
| 1 | 503.49 | 5.32 | 24 | 710.11 | 4.97 |
| 2 | 512.13 | 7.08 | 25 | 719.42 | 5.78 |
| 3 | 521.18 | 7.26 | 26 | 728.2 | 4.96 |
| 4 | 530.48 | 5.22 | 27 | 737.44 | 5.99 |
| 5 | 539.47 | 5.43 | 28 | 746.33 | 4.65 |
| 6 | 548.48 | 4.67 | 29 | 755.18 | 5.71 |
| 7 | 557.32 | 5.35 | 30 | 764.37 | 4.53 |
| 8 | 566.34 | 4.25 | 31 | 773.41 | 7.97 |
| 9 | 575.39 | 5.39 | 32 | 782.25 | 7.08 |
| 10 | 584.39 | 4.27 | 33 | 791.36 | 7.09 |
| 11 | 593.31 | 6.5 | 34 | 800.03 | 7.51 |
| 12 | 602.27 | 5.84 | 35 | 809.4 | 7.25 |
| 13 | 611.44 | 8.48 | 36 | 817.71 | 13.44 |
| 14 | 620.08 | 9.09 | 37 | 827.32 | 12.71 |
| 15 | 628.86 | 8.02 | 38 | 835.7 | 12.66 |
| 16 | 636.75 | 8.19 | 39 | 844.86 | 12.23 |
| 17 | 650.22 | 7.73 | 40 | 854.26 | 12.82 |
| 18 | 656.17 | 7.66 | 41 | 863.55 | 12.47 |
| 19 | 664.93 | 7.47 | 42 | 872.02 | 12.35 |
| 20 | 674.22 | 8.02 | 43 | 881.45 | 12.44 |
| 21 | 683.29 | 7.86 | 44 | 889.85 | 11.56 |
| 22 | 692.2 | 8.09 | 45 | 899.07 | 12.34 |
| 23 | 701.1 | 7.85 | | | |
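The vegetation indices in Table 2 are written with nominal wavelengths such as R800 or R683, whereas the camera records the band centres listed in Table 1. The following is a minimal sketch of how a nominal wavelength could be mapped to the closest camera band. The function name `nearest_band` is hypothetical, and nearest-centre matching is an assumption rather than a procedure described by the authors.

```python
# Band-centre wavelengths (nm) copied from Table 1, bands 1 to 45.
BAND_CENTRES_NM = [
    503.49, 512.13, 521.18, 530.48, 539.47, 548.48, 557.32, 566.34, 575.39,
    584.39, 593.31, 602.27, 611.44, 620.08, 628.86, 636.75, 650.22, 656.17,
    664.93, 674.22, 683.29, 692.2, 701.1, 710.11, 719.42, 728.2, 737.44,
    746.33, 755.18, 764.37, 773.41, 782.25, 791.36, 800.03, 809.4, 817.71,
    827.32, 835.7, 844.86, 854.26, 863.55, 872.02, 881.45, 889.85, 899.07,
]


def nearest_band(wavelength_nm):
    """Return (band_number, centre_nm) of the band closest to wavelength_nm.

    Band numbers are 1-based, as in Table 1. Nearest-centre matching is an
    assumption made for this illustration.
    """
    index = min(
        range(len(BAND_CENTRES_NM)),
        key=lambda i: abs(BAND_CENTRES_NM[i] - wavelength_nm),
    )
    return index + 1, BAND_CENTRES_NM[index]


if __name__ == "__main__":
    for nominal in (503, 683, 800):
        print(nominal, nearest_band(nominal))
```

Under this assumption, R800 corresponds to band 34 (800.03 nm) and R683 to band 21 (683.29 nm).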
Table 2. Vegetation indices and calculation formulas.
| Vegetation Index | Formula |
|---|---|
| Normalized Difference Vegetation Index (NDVI) [35] | (R800 − R683)/(R800 + R683) |
| Chlorophyll Absorption Reflectance Index (CARI) [36] | (R701 − R674) − 0.2(R701 + R674) |
| Renormalized Difference Vegetation Index (RDVI) [37] | (R800 − R683)/√(R800 + R683) |
| Plant Senescence Reflectance Index (PSRI) [38] | (R683 − R503)/R746 |
| Ratio Vegetation Index (RVI) [39] | R800/R683 |
| Optimized Soil-Adjusted Vegetation Index (OSAVI) [40] | 1.6 × (R800 − R674)/(R800 + R674 + 0.16) |
| Transformed Chlorophyll Absorption in Reflectance Index (TCARI) [41] | 3[(R701 − R674) − 0.2(R701 − R548)]R701/R674 |
| Modified Chlorophyll Absorption Ratio Index (MCARI) [42] | [R701 − R674 − 0.2(R701 − R548)]R701/R674 |
| MERIS Terrestrial Chlorophyll Index (MTCI) [43] | (R755 − R710)/(R710 − R683) |
| Red-edge Position Index (REP 1) [44] | 700 + 45[(R674 + R782)/2 − R701]/(R737 − R701) |
| Red-edge Position Index (REP 2) [45] | 700 + 40[(R674 + R782)/2 − R701]/(R737 − R701) |
| Red-edge Chlorophyll Index (CIre) [46] | R863/R719 − 1 |
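As an illustration of how the formulas in Table 2 translate into a computation, the sketch below evaluates a subset of the indices for a single pixel. The function name `vegetation_indices`, the dictionary layout (nominal wavelength in nm mapped to reflectance), and the sample reflectance values are all hypothetical. RDVI is computed in its usual form, with the square root of the sum in the denominator.

```python
import math


def vegetation_indices(r):
    """Compute several Table 2 indices for one pixel.

    r maps a nominal wavelength in nm (e.g. 800) to its reflectance.
    Only a subset of the table is shown; the remaining indices follow
    the same pattern.
    """
    return {
        "NDVI": (r[800] - r[683]) / (r[800] + r[683]),
        "RDVI": (r[800] - r[683]) / math.sqrt(r[800] + r[683]),
        "RVI": r[800] / r[683],
        "PSRI": (r[683] - r[503]) / r[746],
        "MTCI": (r[755] - r[710]) / (r[710] - r[683]),
        "CIre": r[863] / r[719] - 1,
    }


# Hypothetical reflectance values for a green canopy pixel (not measured data).
sample_pixel = {
    503: 0.04, 683: 0.05, 710: 0.15, 719: 0.20,
    746: 0.40, 755: 0.42, 800: 0.45, 863: 0.46,
}

if __name__ == "__main__":
    for name, value in vegetation_indices(sample_pixel).items():
        print(f"{name}: {value:.4f}")
```

The absolute values of indices such as RDVI depend on the reflectance scale (for example 0 to 1 versus percent), so inputs should use the same scale as the data the models were fitted on.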
Table 3. Correlation coefficients between vegetation indices and SPAD at different growth stages.
| Vegetation Index | Returning Green Stage | Flowering Stage | Maturation Stage |
|---|---|---|---|
| PSRI | 0.51 | 0.044 | 0.044 |
| CARI | 0.53 | 0.550 | 0.550 |
| MCARI | 0.75 | 0.460 | 0.460 |
| TCARI | 0.75 | 0.590 | 0.590 |
| RVI | 0.15 | 0.0310 | 0.031 |
| REP1 | −0.49 | −0.360 | −0.360 |
| REP2 | −0.49 | −0.360 | −0.360 |
| RDVI | 0.88 | 0.810 | 0.810 |
| OSAVI | 0.11 | −0.039 | −0.039 |
| NDVI | 0.13 | 0.170 | 0.170 |
| MTCI | −0.49 | −0.410 | −0.410 |
| CIre | −0.52 | −0.280 | −0.280 |
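The coefficients in Table 3 are Pearson correlation coefficients between each vegetation index and the measured SPAD values. A minimal sketch of that calculation follows. The function name `pearson_r` and the sample numbers are hypothetical and are not taken from the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical index values and SPAD readings for five sampling points.
index_values = [4.8, 5.1, 5.4, 5.0, 5.6]
spad_values = [35.9, 38.2, 39.5, 37.0, 41.1]

if __name__ == "__main__":
    print(f"r = {pearson_r(index_values, spad_values):.3f}")
```

For a univariate linear regression the coefficient of determination equals the square of r, which is consistent with the returning green stage RDVI entries: 0.88 squared is about 0.77, the R² reported in Table 7.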
Table 4. Feature bands selected by different feature selection methods.
| Selection Algorithm | Growth Stage | Characteristic Bands/nm | Number of Bands |
|---|---|---|---|
| CARS | Returning green stage | 719.42, 728.2, 737.44, 746.33, 827.32, 835.7, 844.86 | 7 |
| CARS | Flowering stage | 602.27, 692.2, 710.11, 719.42, 728.2, 737.44, 746.33, 755.18, 764.37, 782.25, 791.36, 809.4, 827.32, 835.7, 844.86, 863.55, 872.02, 881.45, 889.85 | 19 |
| CARS | Maturation stage | 719.42, 728.2, 782.25, 791.36 | 4 |
| GA | Returning green stage | 503.49, 539.47, 656.17, 692.2, 755.18, 782.25, 800.03, 827.32, 881.45, 889.85 | 10 |
| GA | Flowering stage | 503.49, 710.11, 737.44, 746.33, 773.41, 782.25, 817.71, 835.7, 844.86, 854.26, 889.85, 899.07 | 12 |
| GA | Maturation stage | 503.49, 521.18, 530.48, 539.47, 557.32, 566.34, 575.39, 620.08, 650.2, 664.93, 701.1, 710.11, 737.44, 746.33, 854.26, 872.02, 881.45, 899.07 | 18 |
| SPA | Returning green stage | 503.49, 548.48, 611.44, 664.93, 683.29, 710.11, 782.25, 899.07 | 8 |
| SPA | Flowering stage | 503.49, 602.27, 636.75, 656.17, 683.29, 692.2, 701.1, 881.45 | 8 |
| SPA | Maturation stage | 503.49, 566.34, 602.27, 636.75, 692.2, 728.2, 773.41, 899.07 | 8 |
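Table 5. Inversion accuracy of the full-spectrum model.
| Growth Stage | Model | R² | RMSE | MAE |
|---|---|---|---|---|
| Returning green stage | PLSR | 0.77 | 1.427 | 1.381 |
| Returning green stage | SVR | 0.51 | 1.463 | 1.600 |
| Returning green stage | RF | 0.88 | 1.222 | 1.275 |
| Returning green stage | XGBoost | 0.86 | 1.259 | 1.255 |
| Flowering stage | PLSR | 0.68 | 1.273 | 1.100 |
| Flowering stage | SVR | 0.52 | 1.413 | 1.416 |
| Flowering stage | RF | 0.71 | 1.242 | 0.926 |
| Flowering stage | XGBoost | 0.65 | 1.301 | 1.207 |
| Maturation stage | PLSR | 0.79 | 1.351 | 1.308 |
| Maturation stage | SVR | 0.56 | 1.619 | 2.026 |
| Maturation stage | RF | 0.66 | 1.516 | 1.637 |
| Maturation stage | XGBoost | 0.61 | 1.573 | 1.780 |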
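The accuracy tables report R², RMSE, and MAE for each model. The sketch below shows how these three metrics are commonly computed from measured and predicted SPAD values. It assumes the usual textbook definitions, since this excerpt does not state the exact formulas the authors used. The function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE and MAE under their standard definitions.

    R2 is computed as 1 - SS_res / SS_tot; other conventions (for example
    the squared correlation coefficient) may give different values.
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Hypothetical measured and predicted SPAD values (not data from the study).
measured = [30.0, 32.0, 34.0, 36.0]
predicted = [31.0, 32.0, 33.0, 37.0]

if __name__ == "__main__":
    for name, value in regression_metrics(measured, predicted).items():
        print(f"{name}: {value:.3f}")
```

With these definitions RMSE can never be smaller than MAE for the same set of residuals, which is worth keeping in mind when comparing the two columns.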
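Table 6. Inversion accuracy of the feature band model.
| Growth Stage | Model | R² | RMSE | MAE |
|---|---|---|---|---|
| Returning green stage | CARS-PLSR | 0.81 | 1.356 | 1.287 |
| Returning green stage | SPA-SVR | 0.83 | 1.315 | 1.189 |
| Returning green stage | CARS-RF | 0.92 | 1.095 | 0.962 |
| Returning green stage | GA-XGBoost | 0.95 | 0.967 | 0.825 |
| Flowering stage | SPA-PLSR | 0.77 | 1.171 | 0.795 |
| Flowering stage | CARS-SVR | 0.63 | 1.324 | 1.616 |
| Flowering stage | GA-RF | 0.73 | 1.221 | 0.842 |
| Flowering stage | CARS-XGBoost | 0.76 | 1.189 | 1.007 |
| Maturation stage | CARS-PLSR | 0.83 | 1.279 | 1.215 |
| Maturation stage | SPA-SVR | 0.56 | 1.622 | 2.028 |
| Maturation stage | CARS-RF | 0.73 | 1.441 | 1.604 |
| Maturation stage | CARS-XGBoost | 0.75 | 1.409 | 1.429 |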
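Table 7. Univariate linear regression.
| Growth Stage | Vegetation Index | Model Equation | R² | RMSE | MAE |
|---|---|---|---|---|---|
| Returning green stage | PSRI | y = 116.8x + 40.42 | 0.26 | 3.616 | 3.008 |
| Returning green stage | CARI | y = 2.9238x + 36.585 | 0.28 | 3.570 | 2.846 |
| Returning green stage | MCARI | y = 1.2777x + 31.042 | 0.56 | 2.790 | 2.286 |
| Returning green stage | TCARI | y = 0.4259x + 31.042 | 0.56 | 2.789 | 2.285 |
| Returning green stage | RDVI | y = 6.1869x + 6.3402 | 0.77 | 1.978 | 1.541 |
| Returning green stage | CIre | y = −6.6095x + 46.097 | 0.26 | 3.602 | 2.882 |
| Flowering stage | CARI | y = 2.3386x + 36.444 | 0.33 | 2.684 | 2.035 |
| Flowering stage | MCARI | y = 0.4664x + 36.3 | 0.20 | 2.929 | 2.110 |
| Flowering stage | TCARI | y = 1.1202x + 31.144 | 0.35 | 2.663 | 1.990 |
| Flowering stage | RDVI | y = 6.4685x + 2.5016 | 0.65 | 1.937 | 1.384 |
| Maturation stage | CARI | y = 3.1485x + 30.551 | 0.14 | 3.397 | 2.777 |
| Maturation stage | MCARI | y = 1.4818x + 24.726 | 0.38 | 2.890 | 2.453 |
| Maturation stage | TCARI | y = 1.1931x + 21.799 | 0.48 | 2.650 | 2.220 |
| Maturation stage | RDVI | y = 4.5565x + 12.08 | 0.48 | 2.654 | 2.031 |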
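To show how the equations in Table 7 would be applied, the sketch below evaluates the RDVI models, where x is the index value and y is the estimated SPAD. The slopes and intercepts are copied from the table. The function name `predict_spad`, the stage keys, and the example input are hypothetical, and the input is assumed to be on the same reflectance scale that the authors used when fitting the models, which the table does not state.

```python
# (slope, intercept) of the RDVI models, copied from Table 7.
RDVI_MODELS = {
    "returning green": (6.1869, 6.3402),
    "flowering": (6.4685, 2.5016),
    "maturation": (4.5565, 12.08),
}


def predict_spad(stage, rdvi):
    """Estimate SPAD from an RDVI value with the stage-specific linear model."""
    slope, intercept = RDVI_MODELS[stage]
    return slope * rdvi + intercept


if __name__ == "__main__":
    example_rdvi = 5.0  # hypothetical index value, not taken from the study
    for stage in RDVI_MODELS:
        print(f"{stage}: SPAD = {predict_spad(stage, example_rdvi):.2f}")
```

Because each growth stage has its own slope and intercept, the same RDVI value yields a different SPAD estimate depending on the stage.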
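Table 8. Inversion accuracy of models using feature vegetation indices during different growth stages.
| Growth Stage | Model | R² | RMSE | MAE |
|---|---|---|---|---|
| Returning green stage | PLSR | 0.73 | 1.482 | 1.731 |
| Returning green stage | SVR | 0.66 | 1.571 | 1.919 |
| Returning green stage | RF | 0.77 | 1.428 | 1.683 |
| Returning green stage | XGBoost | 0.71 | 1.517 | 1.793 |
| Flowering stage | PLSR | 0.52 | 1.411 | 1.263 |
| Flowering stage | SVR | 0.23 | 1.590 | 1.980 |
| Flowering stage | RF | 0.41 | 1.487 | 1.694 |
| Flowering stage | XGBoost | 0.51 | 1.420 | 1.494 |
| Maturation stage | PLSR | 0.70 | 1.469 | 1.528 |
| Maturation stage | SVR | 0.28 | 1.837 | 2.636 |
| Maturation stage | RF | 0.50 | 1.675 | 2.240 |
| Maturation stage | XGBoost | 0.63 | 1.549 | 1.626 |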
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xu, M.; Dai, J.; Zhang, G.; Hou, W.; Mu, Z.; Chen, P.; Cao, Y.; Zhao, Q. Inversion of Glycyrrhiza Chlorophyll Content Based on Hyperspectral Imagery. Agronomy 2024, 14, 1163. https://doi.org/10.3390/agronomy14061163

AMA Style

Xu M, Dai J, Zhang G, Hou W, Mu Z, Chen P, Cao Y, Zhao Q. Inversion of Glycyrrhiza Chlorophyll Content Based on Hyperspectral Imagery. Agronomy. 2024; 14(6):1163. https://doi.org/10.3390/agronomy14061163

Chicago/Turabian Style

Xu, Miaomiao, Jianguo Dai, Guoshun Zhang, Wenqing Hou, Zhengyang Mu, Peipei Chen, Yujuan Cao, and Qingzhan Zhao. 2024. "Inversion of Glycyrrhiza Chlorophyll Content Based on Hyperspectral Imagery" Agronomy 14, no. 6: 1163. https://doi.org/10.3390/agronomy14061163
