Predicting The Elemental Compositions of Solid Waste Using ATR-FTIR and Machine Learning
RESEARCH ARTICLE
Haoyang Xian1, Pinjing He1,2,3, Dongying Lan1, Yaping Qi1, Ruiheng Wang1, Fan Lü1,2,3,
Hua Zhang (✉)1,2,3, Jisheng Long (✉)4
1 Institute of Waste Treatment & Reclamation, College of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
2 Shanghai Institute of Pollution Control and Ecological Security, Shanghai 200092, China
3 Shanghai Engineering Research Center of Multi-source Solid Wastes Co-processing and Energy Utilization, Shanghai 200092, China
4 Shanghai SUS Environment Co., Ltd., Shanghai 201703, China
✉ Corresponding authors
E-mails: zhanghua_tj@tongji.edu.cn (H. Zhang); long@shjec.cn (J. Long)

Special Issue—Artificial Intelligence/Machine Learning on Environmental Science & Engineering (Responsible Editors: Yongsheng Chen, Xiaonan Wang, Joe F. Bozeman III & Shouliang Yi)

Front. Environ. Sci. Eng. 2023, 17(10): 121

1 Introduction

Solid waste, such as agricultural, industrial, and municipal waste, is generated from human activities (Ren et al., 2021). Solid waste management is a global environmental issue, and the classification and separate collection of such waste are effective solutions for its reduction (Zhang et al., 2010; Xiao et al., 2020). Recycling, biological treatment, incineration, and landfilling are commonly used for the collected solid waste (Chin et al., 2022; Li et al., 2022a). Paper, plastic, textiles, wood, and leather are the main components of recyclable waste for resource reuse or of combustible waste for energy recovery (Demetrious et al., 2018). Solid waste is heterogeneous, and its composition varies considerably, significantly impacting its treatment and disposal. For example, if the characteristics of the fed waste are unstable, incomplete combustion may occur during incineration, generating dioxins (Li et al., 2019b). Moisture in waste influences not only the calorific value of waste and the sorting of recyclables, but also leachate production (El-Fadel et al., 2002) during waste treatment and disposal. The elemental composition determines waste transformation and pollution derivation during waste management. In particular, C contained in waste is closely related to C emissions, which are presently of wide concern (Wang et al., 2021), and S and N can generate acid gas and other pollutants when solid waste is incinerated (Zou et al., 2022). Therefore, these elements are key objects of pollution control in plant design and operation. Moreover, elemental composition is an important parameter for selecting solid waste classification, recycling, treatment, and disposal technologies (Garcés et al., 2016).

The traditional method for determining the elemental composition of solid waste uses an elemental analyzer based on the complete combustion of compounds containing C, H, N, and S. However, this method requires complex sample preparation and pretreatment, and it cannot be used to rapidly detect elemental composition in real time. Moreover, the elemental analyzer limits the sample size to several milligrams, which inevitably causes the testing precision during analysis to be affected by sample inhomogeneity and experimental error in the size reduction process (Nzihou, 2020). Therefore, to avoid time-consuming measurements, high labor and material resource requirements, and subjective effects during analysis, this study proposed a method for rapidly predicting the elemental composition of solid waste using Fourier transform infrared spectroscopy (FTIR) and machine learning techniques.

FTIR is a commonly used spectroscopic technique for quickly obtaining a spectral map exhibiting the characteristics of functional groups without destroying the original sample. This method has been applied in remote sensing (Goydaragh et al., 2021), food (Kaur et al., 2022), and medicine (Govindappa et al., 2021) and is often used to identify microplastics in the environment (Zhang et al., 2020; Li et al., 2022b). FTIR is often combined with machine learning technology to classify, sort, and recycle plastics (Michel et al., 2020; Said et al., 2021). Moreover, it is used to evaluate the compost stability of organic waste and to predict indicators such as dissolved organic C and the C:N ratio. However, the coefficient of determination (R2) is only 0.64 in predicting dissolved organic C with this method (Higashikawa et al., 2014). Tao et al. (2020) used FTIR to predict C, H, and O contents and low calorific values of biomass waste, obtaining accuracies ranging from 85.53% (H) to 95.54% (C). However, most of these studies focus on clean samples or similar types of samples, without considering possible spectral noise (Zhang et al., 2018) and moisture interference (Jiang et al., 2016) that may exist in practical applications.

Machine learning algorithms have been widely used in recent studies to predict or classify solid waste components (Kannangara et al., 2018; Adedeji and Wang, 2019). The most commonly used machine learning algorithms include the K-nearest neighbor (KNN) based on distance (Peršak et al., 2020), partial least-squares regression (PLSR) based on linear regression (Kandlbauer et al., 2021), support vector machines (SVMs) for both linear and nonlinear kernel functions (Paul et al., 2019), artificial neural networks (ANNs) that simulate neuron activity (Chen et al., 2021), and convolutional neural networks, which developed from ANNs (Huang et al., 2022). Tree-based models have also been developed recently and include the decision tree (Kardani et al., 2021), random forest (RF) (Karimi et al., 2021), and gradient boost regression tree (Lu et al., 2022), as well as the more recently implemented extreme gradient boosting tree (XGBOOST) (Xu et al., 2021). These algorithms are used to monitor the arsenic content in soil (Chakraborty et al., 2017), predict the landfill area of solid waste (Hoque and Rahman, 2020), and sort plastic bottles (Wang et al., 2019). Although these algorithms have applications in solid waste classification and property analysis, the literature lacks comparative analyses between different models. Moreover, most machine learning algorithms are black-box models and cannot explain the relationship between relevant variables and results. Accordingly, Lundberg and Lee developed the Shapley additive explanations algorithm, which attempts to explain model results on the basis of variable importance (Lundberg and Lee, 2017). Moreover, feature selection methods such as the successive projections algorithm (SPA) (Wang and Wang, 2022) and the competitive adaptive reweighted sampling algorithm (CARS) (Xu et al., 2023) are often used to select variables, explain the relation between variables and models, and improve the prediction performance.

Herein, the main recyclable components, including paper, plastic, fabrics, wood, and leather, are taken as examples for FTIR analysis and elemental composition prediction. Five machine learning regression algorithms, including RF, support vector regression (SVR), PLSR, XGBOOST, and KNN, and two feature selection methods, including SPA and CARS, are compared, and the best-fit machine learning model is selected. Furthermore, the effect of noise and moisture on the model are
sampling procedure to improve the stability of CARS. Referring to this method, the present study used CARS to repeat the selection five times and to identify the common features.

2.3.2 Regression model development

All spectral data after feature selection were randomly divided into training, validation, and test sets at a ratio of 70:15:15. Because the division of the data affects the final result, the random split was repeated three times and the regression results were averaged. Furthermore, 10-fold cross validation was used to test the performance and generalization ability of the models. After the random split, the datasets were normalized by maximum and minimum normalization. Five machine learning algorithms, including RF, SVR, PLSR, XGBOOST, and KNN, were used for comparison and to build a suitable regression model. These five algorithms were taken from the scikit-learn package in Python. The model principles can be found in the literature (Breiman, 2001; Wold et al., 2001; Holmes and Adams, 2003; Smola and Schölkopf, 2004; Chen and Guestrin, 2016).

The important hyperparameters were chosen to build the models, as summarized in Table S2 (SM). The validation set was used to optimize the hyperparameters of each model using a manual tuning method, in which a range of values for each hyperparameter was evaluated and the value affording the best performance was selected. The tuned hyperparameters are shown in Tables S3–S5 (SM). Two indicators were used to evaluate model prediction performance: R2 and root mean square error (RMSE). In statistics, R2 expresses the percentage of variation in the dependent variable explained by the independent variable. In a machine learning model, it indicates the degree of fitness of the model, with values closer to 1 indicating greater model accuracy. Herein, RMSE was calculated to quantify the difference between the predicted and true values; the closer the value is to zero, the better the model. R2 and RMSE were calculated using Eqs. (1) and (2). Because the two indicators are similar in evaluating model performance, only R2 will be discussed hereafter. By continuously adjusting the hyperparameters, the model R2 was brought to its maximum value and the hyperparameters were optimized to yield the best fitting method for each model.

R^2 = 1 - \frac{\sum_i (f_i - x_i)^2}{\sum_i (x_i - \bar{x})^2} = \frac{\sum_i (f_i - \bar{x})^2}{\sum_i (x_i - \bar{x})^2},  (1)

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (f_i - x_i)^2},  (2)

where x_i and f_i are the measured and predicted values, respectively, \bar{x} is the average of the measured values, and n is the number of samples.

2.3.3 Spectral noise simulation

Spectral noise is caused by illumination changes during spectral detection. This variable was simulated by adding noise to the raw spectral data before feature selection to test the anti-noise stability of the various machine learning methods. First, a spectral matrix of 836 (number of spectra) × 1676 (number of bands) was built, and a noise matrix (836 × 1676) was generated with random numbers between −1 and +1. To test the influence of different noise intensities, five noise factors were adopted: 0.1, 0.2, 0.5, 0.8, and 1.0. The noise matrix was multiplied by the respective noise factors and added to the spectral matrix, which caused each light intensity in the spectral matrix to increase or decrease by a certain amount. Subsequently, the new spectra were used for machine learning regression following the same methods described above.

3 Results and discussion

3.1 Elemental compositions

Figure 1 and Table S6 (SM) show the elemental compositions of the samples. Among the five components, plastics had the highest C content in a wide distribution range. The C content of textiles ranked second, with a relatively wide distribution range of 40.3%‒65.5%. The C content of leathers and woods averaged around 41.5%, and that of papers was the lowest, at an average value of 35.5%. Plastics had the highest H content, and similar results averaging 5.3%–5.8% were obtained for leathers, papers, and woods. However, the H content of the textiles was slightly higher, averaging 6.5%. The papers, woods and plastics were almost free of N, whereas the leathers had the highest N content at an average of 8.0%. The N content of textiles ranged from 0.0% to 23.9%. The S content of leathers was the highest. The plastics contained no or small amounts of S, with an average value of 0.2%, and the papers, woods, and textiles had S contents close to 0.0%.

3.2 Infrared spectra

The ATR-FTIR spectra of the original components are shown in Fig. S1 (SM), and those of textiles with varied moisture contents are shown in Fig. 2. For example, six moisture content gradients are represented for cotton. Clear differences were noted in the spectra among the different types of solid waste, which enabled the identification of the components and the prediction of the property indicators by regression. The absorbance of the textiles increases with moisture content, which is consistent with what has been observed in the literature (Mirghani et al., 2011). The absorbance of wet textiles increases significantly in the two ranges of 1500–1750
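The data handling of Section 2.3.2 and the metrics of Eqs. (1) and (2) can be sketched in a few lines of numpy; the array shapes and the measured/predicted values below are invented placeholders, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(42)

def split_70_15_15(n):
    """Random index split into training/validation/test sets at 70:15:15."""
    idx = rng.permutation(n)
    n_train, n_val = int(0.70 * n), int(0.15 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def min_max_normalize(X):
    """Maximum-minimum normalization: scale each band to [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)

def r_squared(x, f):
    """Eq. (1): x are measured values, f are predicted values."""
    x, f = np.asarray(x, float), np.asarray(f, float)
    return 1.0 - np.sum((f - x) ** 2) / np.sum((x - x.mean()) ** 2)

def rmse(x, f):
    """Eq. (2): root mean square error."""
    x, f = np.asarray(x, float), np.asarray(f, float)
    return float(np.sqrt(np.mean((f - x) ** 2)))

# Placeholder spectra: 100 samples x 10 bands.
X = rng.random((100, 10))
train, val, test = split_70_15_15(len(X))
X_norm = min_max_normalize(X)

measured = [40.3, 52.1, 65.5, 41.5, 35.5]    # invented C contents (%)
predicted = [41.0, 50.8, 64.9, 43.0, 36.2]
print(len(train), len(val), len(test))        # 70 15 15
print(round(r_squared(measured, predicted), 3), round(rmse(measured, predicted), 3))
```

scikit-learn's `train_test_split`, `MinMaxScaler`, `r2_score`, and `mean_squared_error` provide the same operations; the explicit forms above simply mirror Eqs. (1) and (2).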
Fig. 1 Box plots of elemental compositions. The box boundaries indicate the 25th and 75th percentiles, and the line inside each box indicates the median (50th percentile). The ends of the whiskers represent the 5th and 95th percentiles. The black points are outliers.
Table 2 Predicted results of the elemental compositions of solid waste based on the full spectra and on spectra with feature selection
Element  Algorithm  Feature selection  Validation (R2, RMSE)  Test (R2, RMSE)  Validation CV (R2, RMSE)  Test CV (R2, RMSE)
C RF None 0.88 4.21 0.90 4.52 0.89 4.22 0.93 3.71
SPA 0.88 4.17 0.88 4.78 0.88 4.83 0.91 4.26
CARS 0.86 4.43 0.88 4.81 0.87 4.67 0.89 4.61
SVR None 0.89 3.97 0.85 5.52 0.86 5.01 0.87 5.13
SPA 0.79 5.60 0.77 6.78 0.78 6.23 0.79 6.61
CARS 0.84 4.81 0.83 6.19 0.82 5.67 0.83 5.92
PLSR None 0.80 5.25 0.72 7.59 0.74 6.61 0.79 6.46
SPA 0.61 5.46 0.63 8.67 0.61 8.08 0.63 8.68
CARS 0.77 5.60 0.78 6.74 0.73 6.66 0.79 6.63
XGBOOST None 0.89 3.91 0.88 4.90 0.88 4.41 0.92 4.01
SPA 0.90 3.80 0.89 4.64 0.89 4.26 0.92 4.04
CARS 0.89 4.01 0.88 4.92 0.87 4.50 0.90 4.43
KNN None 0.93 3.11 0.91 4.27 0.92 3.52 0.91 4.09
SPA 0.96 2.39 0.93 3.79 0.94 2.95 0.94 3.45
CARS 0.95 2.56 0.92 3.82 0.93 3.28 0.94 3.31
H RF None 0.84 0.82 0.77 1.19 0.82 0.93 0.85 0.94
SPA 0.84 0.81 0.82 1.04 0.87 0.90 0.85 0.96
CARS 0.85 0.78 0.84 0.97 0.84 0.89 0.87 0.87
SVR None 0.94 0.49 0.83 1.01 0.84 0.90 0.84 0.99
SPA 0.88 0.71 0.82 1.05 0.81 0.94 0.84 0.98
CARS 0.93 0.53 0.84 0.99 0.86 0.86 0.86 0.93
PLSR None 0.77 0.95 0.59 1.56 0.67 1.27 0.69 1.37
SPA 0.63 1.19 0.60 1.56 0.63 1.36 0.59 1.58
CARS 0.78 0.92 0.73 1.29 0.70 1.22 0.74 1.25
XGBOOST None 0.88 0.70 0.76 1.20 0.85 0.86 0.84 0.98
SPA 0.89 0.67 0.84 0.98 0.87 0.80 0.85 0.95
CARS 0.88 0.71 0.82 1.04 0.87 0.80 0.80 1.10
KNN None 0.91 0.59 0.87 0.85 0.89 0.72 0.85 0.87
SPA 0.96 0.41 0.83 0.98 0.90 0.70 0.83 0.97
CARS 0.96 0.42 0.97 0.70 0.91 0.66 0.89 0.63
N RF None 0.87 1.73 0.90 1.55 0.90 1.63 0.92 1.37
SPA 0.89 1.59 0.92 1.40 0.91 1.56 0.94 1.16
CARS 0.90 1.56 0.92 1.45 0.91 1.56 0.94 1.17
SVR None 0.97 0.91 0.93 1.29 0.93 1.32 0.97 0.92
SPA 0.85 1.88 0.92 1.32 0.91 1.59 0.92 1.30
CARS 0.88 1.63 0.94 1.14 0.93 1.39 0.94 1.13
PLSR None 0.93 1.37 0.81 2.06 0.89 1.63 0.82 2.07
SPA 0.74 2.56 0.80 2.25 0.78 2.40 0.80 2.26
CARS 0.89 1.69 0.87 1.85 0.89 1.64 0.89 1.70
XGBOOST None 0.94 1.19 0.88 1.68 0.92 1.47 0.91 1.43
SPA 0.95 1.02 0.92 1.36 0.94 1.27 0.95 1.09
CARS 0.94 1.24 0.90 1.61 0.93 1.29 0.93 1.34
KNN None 0.98 0.51 0.98 0.61 0.97 0.84 0.98 0.52
SPA 0.99 0.44 0.97 0.78 0.98 0.75 0.98 0.56
CARS 0.98 0.56 0.97 0.70 0.98 0.68 0.98 0.63
(Continued)
Element  Algorithm  Feature selection  Validation (R2, RMSE)  Test (R2, RMSE)  Validation CV (R2, RMSE)  Test CV (R2, RMSE)
S RF None 0.89 0.24 0.79 0.30 0.84 0.28 0.79 0.30
SPA 0.86 0.27 0.76 0.32 0.82 0.29 0.78 0.31
CARS 0.89 0.23 0.73 0.34 0.84 0.28 0.76 0.32
SVR None 0.92 0.20 0.86 0.24 0.89 0.23 0.87 0.24
SPA 0.91 0.21 0.79 0.29 0.87 0.25 0.79 0.30
CARS 0.94 0.18 0.80 0.29 0.88 0.24 0.81 0.28
PLSR None 0.92 0.20 0.54 0.41 0.84 0.28 0.66 0.36
SPA 0.84 0.29 0.72 0.35 0.79 0.32 0.72 0.35
CARS 0.91 0.21 0.76 0.31 0.86 0.26 0.79 0.30
XGBOOST None 0.90 0.23 0.77 0.31 0.85 0.26 0.79 0.31
SPA 0.88 0.24 0.76 0.32 0.85 0.27 0.79 0.30
CARS 0.84 0.26 0.79 0.51 0.86 0.26 0.76 0.31
KNN None 0.95 0.15 0.83 0.26 0.89 0.23 0.84 0.25
SPA 0.92 0.20 0.78 0.30 0.87 0.25 0.81 0.29
CARS 0.93 0.18 0.77 0.31 0.89 0.23 0.78 0.30
with values of R2val = 0.99, R2test = 0.97, R2val_CV = 0.98, and R2test_CV = 0.98.

A gap in prediction performance was noted between the test set and validation set or cross validation when predicting the S content, indicating less-stable prediction ability. Without feature selection, the SVR model was the best at predicting the S contents, with R2val = 0.92, R2test = 0.86, R2val_CV = 0.89, and R2test_CV = 0.87. With the exception of PLSR, models using feature selection afforded lower performance compared with models not using feature selection. For example, applying SPA before SVR decreased R2val, R2test, R2val_CV, and R2test_CV to 0.91, 0.79, 0.87, and 0.79, respectively. However, the PLSR model was improved by feature selection methods, with respective R2test and R2test_CV values increasing from 0.54 to 0.72 and from 0.66 to 0.72 using SPA, and to 0.76 and 0.79 using CARS.

Regarding the overall performance of the validation (cross validation) and test sets, almost all models performed better for predicting the C and N contents than those predicting the H and S contents. This can be attributed to the wider vibration absorption peaks of H-containing groups, which might have affected the training of the models. Moreover, S-containing groups rarely exhibit characteristic peaks in the range of 650–4000 cm−1, making the prediction models difficult to build.

Figure 3(a) shows a comparison of the R2val and R2test values for each model, which illustrates the model generalization ability and prediction performance. A longer line between the two points indicates lower generalization ability of the models. The nonlinear algorithms enabled model building with better prediction performance than the linear algorithms. PLSR, as a representative of linear algorithms, did not show high R2 and low RMSE in the test set, which indicates overfitting and poor generalization of the models built by PLSR algorithms. In comparison, the models based on KNN and SVR showed strong generalization ability and good prediction performance.

Table S7 shows the feature spectra selected by SPA and CARS. Most chemical bonds associated with the C, H, N and S elements were included in the selected feature spectra. Taking N as an example, the absorption peaks of associated chemical bonds were at 1266, 1630 and 3300 cm−1, all of which were included in the selected range. Thus, feature selection appears to address the overfitting problem of PLSR. However, applying SPA weakened the prediction ability of the model built by PLSR, because it selects spectral data with the lowest linear correlation. When feature selection was used for a tree prediction-based algorithm such as RF, the model generalization ability decreased. RF first ranks the importance of each variable in model training and then calculates the weighted tree of different trees according to this ranking (Breiman, 2001). Therefore, when feature selection is used for the RF algorithm, the repeated selection loses some key information and reduces the model generalization ability. Yan et al. (2021) used laser-induced emission spectroscopy to establish a prediction model for the elemental composition of residual waste. After changing the original training, validation, and test sets, the predictability of principal component analysis–RF was greatly reduced.

3.4 Spectral noise effect

Figures S2–S6 (SM) show the influence of different noise
Fig. 3 Prediction results of the models. (a) Comparison of model R2val and R2test values using full spectra with or without feature
selection. (b) Variation of R2 with different noise factors for the validation set. (c) Variation of R2 with different noise factors for the
test set.
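The perturbed spectra behind Figs. 3(b) and 3(c) were generated as described in Section 2.3.3; a minimal numpy sketch of that noise-injection scheme, using random placeholder values in place of the measured 836 × 1676 spectral matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder spectral matrix; the study's matrix was 836 spectra x 1676 bands.
spectra = rng.random((836, 1676))

# One noise matrix of uniform random numbers in [-1, 1), scaled by each
# noise factor and added so every light intensity shifts up or down.
noise = rng.uniform(-1.0, 1.0, size=spectra.shape)
factors = [0.1, 0.2, 0.5, 0.8, 1.0]
noisy = {k: spectra + k * noise for k in factors}

# The mean absolute perturbation grows in proportion to the noise factor.
deviations = [float(np.abs(noisy[k] - spectra).mean()) for k in factors]
print([round(d, 3) for d in deviations])
```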
factors on the spectra of samples, and Figures 3(b) and 3(c) show the anti-noise ability of each model for the validation and test sets. As the noise increased, the R2 of the model trended downward. Noise had the least impact on the KNN model, for which the R2 decrease was the slowest. When using the KNN algorithm for regression, the algorithm selects the nearest points around the predicted point and calculates the average of the points as the predicted value. When the spectral information is disturbed by spectral noise, the distance between the points will not change considerably; therefore, noise has little effect on the KNN model.

Although the robustness shown by the SVR model was not as good as that of the KNN model, the former showed insensitivity to spectral changes. The SVR algorithm defines an area known as the soft-margin loss; data falling in this area will not affect the model. This makes the model somewhat insensitive to interference by spectral noise.

RF and XGBOOST were considerably affected by noise and showed similarities in their change trends. When predicting N contents, the R2 of XGBOOST in the validation set first decreased and then increased with higher noise factors, whereas the R2 curve of the test set showed a V-shape. Such a significant change in R2 can indicate poor stability of the model, which means that spectral interference has a significant impact. RF and XGBOOST are both decision trees in essence, although the iteration methods differ between them. These two iterative methods make RF and XGBOOST better at handling data with missing features but are less suitable for handling spectral noise, which directly changes the spectral data.

The PLSR model is highly sensitive to noise, and its R2 declined the most rapidly among all the models. The sensitivity of PLSR to noise has been described in the literature (Zou et al., 2010). Before PLSR
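The neighbour-averaging behaviour described above for KNN regression can be illustrated with a hand-rolled regressor (the study used scikit-learn's implementation; the four-band "spectra" and element contents below are invented for illustration):

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Minimal KNN regression: average the targets of the k training
    spectra nearest (Euclidean distance) to the query spectrum."""
    d = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(d)[:k]
    return float(y_train[nearest].mean())

# Toy four-band "spectra" with associated element contents (%).
X = np.array([[0.1, 0.2, 0.3, 0.4],
              [0.1, 0.2, 0.3, 0.5],
              [0.9, 0.8, 0.7, 0.6],
              [0.9, 0.8, 0.6, 0.6]])
y = np.array([40.0, 42.0, 60.0, 62.0])

query = np.array([0.1, 0.2, 0.3, 0.45])
print(knn_predict(X, y, query, k=2))          # averages the two nearby spectra
# A small noise-like shift leaves the neighbour set, and thus the prediction, unchanged.
print(knn_predict(X, y, query + 0.01, k=2))
```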
Table 3 Predicted results for elemental compositions of solid waste based on the full spectra and spectra with feature selection under moisture
interference
Element  Algorithm  Feature selection  Validation (R2, RMSE)  Test (R2, RMSE)  Validation CV (R2, RMSE)  Test CV (R2, RMSE)
C RF None 0.90 3.21 0.90 3.37 0.90 3.18 0.90 3.31
SPA 0.90 3.25 0.89 3.46 0.90 3.19 0.89 3.45
CARS 0.90 3.22 0.89 3.39 0.89 3.18 0.90 3.32
SVR None 0.93 2.77 0.92 2.97 0.91 2.87 0.92 2.97
SPA 0.93 2.76 0.91 3.17 0.91 2.95 0.91 3.12
CARS 0.93 2.75 0.92 3.04 0.91 2.90 0.92 3.03
PLSR None 0.93 2.75 0.91 3.17 0.90 3.08 0.90 3.35
SPA 0.91 2.98 0.88 3.57 0.89 3.35 0.89 3.53
CARS 0.91 2.44 0.93 2.78 0.91 3.14 0.87 3.71
XGBOOST None 0.91 3.16 0.87 3.76 0.90 3.18 0.88 3.55
SPA 0.92 2.88 0.88 3.59 0.89 3.22 0.89 3.39
CARS 0.91 3.14 0.87 3.71 0.89 3.26 0.89 3.52
KNN None 0.88 3.53 0.87 3.66 0.88 3.51 0.90 3.35
SPA 0.90 3.35 0.88 3.58 0.88 3.48 0.88 3.52
CARS 0.90 3.26 0.88 3.65 0.87 3.56 0.89 3.45
H RF None 0.88 0.46 0.88 0.48 0.88 0.45 0.88 0.47
SPA 0.89 0.46 0.87 0.50 0.88 0.46 0.87 0.49
CARS 0.87 0.49 0.87 0.50 0.86 0.49 0.86 0.50
SVR None 0.87 0.48 0.87 0.48 0.89 0.44 0.87 0.48
SPA 0.91 0.59 0.87 0.43 0.91 0.39 0.91 0.42
CARS 0.85 0.50 0.86 0.50 0.88 0.45 0.87 0.50
PLSR None 0.92 0.39 0.87 0.49 0.88 0.45 0.89 0.45
SPA 0.91 0.41 0.86 0.51 0.87 0.48 0.86 0.51
CARS 0.94 0.33 0.93 0.36 0.92 0.36 0.93 0.36
XGBOOST None 0.91 0.41 0.86 0.51 0.89 0.45 0.86 0.51
SPA 0.90 0.44 0.86 0.51 0.88 0.46 0.86 0.51
CARS 0.89 0.45 0.83 0.56 0.86 0.49 0.86 0.52
KNN None 0.90 0.44 0.81 0.59 0.88 0.46 0.87 0.48
SPA 0.90 0.43 0.85 0.53 0.88 0.47 0.87 0.51
CARS 0.90 0.43 0.85 0.59 0.88 0.47 0.86 0.52
(Continued)
Element  Algorithm  Feature selection  Validation (R2, RMSE)  Test (R2, RMSE)  Validation CV (R2, RMSE)  Test CV (R2, RMSE)
N RF None 0.95 1.36 0.93 1.56 0.95 1.35 0.94 1.42
SPA 0.97 1.02 0.93 1.62 0.96 1.13 0.92 1.52
CARS 0.96 1.26 0.94 1.49 0.96 1.25 0.94 1.42
SVR None 0.97 0.95 0.97 0.98 0.98 0.90 0.98 0.91
SPA 0.99 0.76 0.99 0.78 0.98 0.77 0.98 0.78
CARS 0.98 0.84 0.98 0.81 0.98 0.84 0.98 0.80
PLSR None 0.95 1.10 0.95 1.40 0.95 1.38 0.95 1.33
SPA 0.91 1.93 0.91 1.85 0.91 1.78 0.91 1.84
CARS 0.96 1.15 0.96 1.25 0.96 1.17 0.96 1.25
XGBOOST None 0.96 1.26 0.92 1.67 0.95 1.28 0.90 1.82
SPA 0.97 1.00 0.94 1.38 0.97 1.04 0.93 1.64
CARS 0.96 1.22 0.95 1.41 0.96 1.23 0.95 1.30
KNN None 0.98 0.77 0.98 0.93 0.98 0.85 0.97 0.99
SPA 0.99 0.71 0.98 0.84 0.98 0.77 0.98 0.85
CARS 0.98 0.85 0.98 0.95 0.98 0.84 0.98 0.82
S RF None 0.64 0.32 0.59 0.29 0.56 0.33 0.69 0.27
SPA 0.48 0.39 0.50 0.33 0.47 0.37 0.60 0.31
CARS 0.63 0.33 0.60 0.30 0.61 0.32 0.70 0.28
SVR None 0.71 0.29 0.62 0.29 0.76 0.24 0.86 0.19
SPA 0.93 0.14 0.91 0.14 0.92 0.13 0.93 0.13
CARS 0.89 0.17 0.87 0.17 0.89 0.16 0.94 0.12
PLSR None 0.87 0.19 0.89 0.16 0.86 0.18 0.92 0.14
SPA 0.69 0.30 0.80 0.22 0.76 0.26 0.69 0.28
CARS 0.89 0.17 0.90 0.16 0.88 0.17 0.90 0.16
XGBOOST None 0.61 0.33 0.56 0.31 0.53 0.35 0.54 0.33
SPA 0.65 0.32 0.64 0.27 0.55 0.35 0.49 0.36
CARS 0.72 0.29 0.72 0.25 0.63 0.32 0.74 0.26
KNN None 0.64 0.26 0.47 0.34 0.55 0.35 0.59 0.27
SPA 0.59 0.33 0.54 0.29 0.64 0.30 0.45 0.34
CARS 0.73 0.23 0.59 0.29 0.70 0.26 0.45 0.34
model, and its R2 values were the same as those of the model without feature selection.

The prediction results for H were slightly worse than those for C. The PLSR model was the best for predicting H, with values of R2val = 0.92, R2test = 0.87, R2val_CV = 0.88, and R2test_CV = 0.89. Feature selection improved the model prediction in most cases. For instance, SPA afforded a better performance of SVR models, with values of R2val = 0.91, R2test = 0.87, R2val_CV = 0.91, and R2test_CV = 0.91, and CARS enhanced the prediction performance of PLSR, yielding R2val, R2test, R2val_CV, and R2test_CV values of 0.94, 0.93, 0.92, and 0.93, respectively.

All the models had good results when predicting N, and the model based on the KNN algorithm was the best without feature selection, with values of R2val = 0.98, R2test = 0.98, R2val_CV = 0.98, and R2test_CV = 0.97. The feature selection methods optimized the models in most cases, except SPA-PLSR. With the help of SPA, the model based on SVR had the best prediction performance in predicting N, with values of R2val = 0.99, R2test = 0.99, R2val_CV = 0.98, and R2test_CV = 0.98.

The prediction performance for S content was generally worse. Without feature selection, only the model built by PLSR showed a good result, with values of R2val = 0.87, R2test = 0.89, R2val_CV = 0.86, and R2test_CV = 0.92. The feature selection methods considerably improved the SVR model. Applying SPA before SVR increased R2val, R2test, R2val_CV, and R2test_CV from 0.71 to 0.93, 0.62 to 0.91, 0.76 to 0.92, and 0.86 to 0.93, respectively. CARS-SVR, with values of R2val = 0.89, R2test = 0.87, R2val_CV = 0.89,
and R2test_CV = 0.94, also had better prediction results compared with the model with the original spectra as input.

Almost all models performed better when predicting C and N contents than when predicting H and S contents, which was the same as the case without moisture interference. The influence of moisture on the infrared spectra was concentrated mainly in the three regions of 1645, 2140, and 3460 cm−1 (Fig. 2), where the functional groups containing H and N elements (such as the absorption peak of the N-H bond at 3300 cm−1) showed peaks. However, the predicted R2 of the models for both H and N elements were > 0.81. The characteristic peaks of the S-containing groups were concentrated in the 1000–1200 cm−1 range, and the spectra in this range experienced changes only in the absorbance intensity, which is not considerably affected by moisture. However, the S element, which should have been the least affected, had the worst predictive model performance. This can be explained by the data distribution, in that uneven data distribution often affects model training.

The model fit and data distribution were mapped on a scatter plot of the predicted and measured values. Based on the results of three parallel tests and five machine learning algorithms, the scatter plots of the validation and test sets with the highest R2 (including no feature selection and feature selection) were selected, as shown in Fig. 4. The data for C, H, and N contents were evenly distributed, but very few points of S data were distributed outside the origin, affording an unbalanced, "top-heavy" distribution. Most machine learning algorithms assume that the data follow a normal distribution for better statistical inference and hypothesis verification. Therefore, the model prediction performance for C, H, and N contents was better, whereas the "top-heavy" distribution of S data caused the prediction results to be fitted to the side in which more data were distributed, which in this case was close to zero (Altun et al., 2007; Vong et al., 2015). According to the final results, the spectral changes caused by moisture interference had little effect on model development.

3.6 Environmental implication

The study confirmed the feasibility of combining infrared spectroscopy and machine learning algorithms to predict the elemental composition of solid waste. Once the elemental composition data are obtained, important parameters related to solid waste treatment and disposal, such as calorific value and C emissions, can be further calculated. When faced with noise and moisture interference in practical applications, the models can be
Fig. 4 Prediction results under moisture interference. (a) Scatter plots of predicted and measured values for the validation set.
(b) Scatter plots of predicted and measured values for the test set.
optimized using feature selection and machine learning algorithms.

4 Conclusions

A method based on ATR-FTIR and machine learning to rapidly detect the elemental composition of solid waste was proposed, and the impacts of noise and moisture interference, which may be encountered in practical applications, were simulated. In total, 291 samples of 97 types were collected, and their elemental compositions and infrared spectra were tested. When using full-band spectra for prediction, the KNN algorithm had the best prediction performance for predicting C, H, and N, and SVR was the most suitable model for predicting S. Feature selection improved the model performance in most cases. Models based on KNN algorithms with feature selection methods were the best models for predicting C, H, and N.

In the case of spectral noise interference, both the KNN and SVR algorithms showed stability, whereas the RF, PLSR, and XGBOOST algorithms were more sensitive to noise, and their performance dropped considerably under noise interference. Under the interference of water, the models still had higher R2 when predicting C, H, and N contents. Owing to the concentrated and uneven distribution of the S content data, however, only the R2 of the PLSR and SVR models showed values >0.8 when predicting S content. With the feature selection method, model performance improved only slightly, and SPA-SVR, SVR, and CARS-PLSR performed well in predicting the elemental compositions of solid waste with moisture interference.

Therefore, using infrared spectroscopy and machine learning algorithms, the elemental compositions of solid waste can be quickly predicted, and noise and moisture disturbances can be effectively addressed.

Abbreviations

ANN  Artificial neural network
CARS  Competitive adaptive reweighted sampling
DT  Decision tree
FTIR  Fourier transform infrared spectroscopy
HDPE  High density polyethylene
KNN  K-nearest neighbor
LDPE  Low density polyethylene
PLSR  Partial least squares regression
PMMA  Polymethyl methacrylate
PP  Polypropylene
PS  Polystyrene
PVC  Polyvinyl chloride
R2  R-square
RF  Random forest
RMSE  Root mean square error
SPA  Successive projections algorithm
SVR  Support vector regression
TPU  Thermoplastic polyurethane
XGBOOST  Extreme gradient boosting tree

Acknowledgements We acknowledge the support from the National Key R&D Program of China (No. 2020YFC1910100).

Data Accessibility Statement The data and code that support the findings of this study are available from the corresponding author, Hua Zhang, upon reasonable request.

Electronic Supplementary Material Supplementary material is available in the online version of this article at https://doi.org/10.1007/s11783-023-1721-1 and is accessible for authorized users.

References

Adedeji O, Wang Z H (2019). Intelligent waste classification system using deep learning convolutional neural network. In: 2nd International Conference on Sustainable Materials Processing and Manufacturing (SMPM). Sun City, South Africa: Procedia Manufacturing, 607–612
Altun H, Bilgil A, Fidan B C (2007). Treatment of multi-dimensional data to enhance neural network estimators in regression problems. Expert Systems with Applications, 32(2): 599–605
Breiman L (2001). Random forests. Machine Learning, 45(1): 5–32
Chakraborty S, Li B, Deb S, Paul S, Weindorf D C, Das B S (2017). Predicting soil arsenic pools by visible near infrared diffuse reflectance spectroscopy. Geoderma, 296: 30–37
Chen K, Peng Y, Lu S, Lin B, Li X (2021). Bagging based ensemble learning approaches for modeling the emission of PCDD/Fs from municipal solid waste incinerators. Chemosphere, 274: 129802
Chen T, Guestrin C (2016). XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, USA: Association for Computing Machinery, 785–794
Chin M Y, Lee C T, Woon K S (2022). Policy-driven municipal solid waste management assessment using relative quadrant eco-efficiency: a case study in Malaysia. Journal of Environmental Management, 323: 116238
Demetrious A, Verghese K, Stasinopoulos P, Crossin E (2018).
PA Polyamide Comparison of alternative methods for managing the residual of
material recovery facilities using life cycle assessment. Resources,
PC Polycarbonate
Conservation and Recycling, 136: 33–45
PET Polyethylene terephthalate
El-Fadel M, Bou-Zeid E, Chahine W, Alayli B (2002). Temporal variation of leachate quality from pre-sorted and baled municipal solid waste with high organic and moisture content. Waste Management (New York, N.Y.), 22(3): 269–282
Feng X P, Chen H M, Chen Y, Zhang C, Liu X D, Weng H Y, Xiao S P, Nie P C, He Y (2019). Rapid detection of cadmium and its distribution in Miscanthus sacchariflorus based on visible and near-infrared hyperspectral imaging. Science of the Total Environment, 659: 1021–1031
Garcés D, Díaz E, Sastre H, Ordóñez S, González-Lafuente J M (2016). Evaluation of the potential of different high calorific waste fractions for the preparation of solid recovered fuels. Waste Management (New York, N.Y.), 47: 164–173
Govindappa M, Tejashree S, Thanuja V, Hemashekhar B, Srinivas C, Nasif O, Pugazhendhi A, Raghavendra V B (2021). Pomegranate fruit fleshy pericarp mediated silver nanoparticles possessing antimicrobial, antibiofilm formation, antioxidant, biocompatibility and anticancer activity. Journal of Drug Delivery Science and Technology, 61: 102289
Goydaragh M G, Taghizadeh-Mehrjardi R, Jafarzadeh A A, Triantafilis J, Lado M (2021). Using environmental variables and Fourier transform infrared spectroscopy to predict soil organic carbon. Catena, 202: 105280
Higashikawa F S, Silva C A, Nunes C A, Sánchez-Monedero M A (2014). Fourier transform infrared spectroscopy and partial least square regression for the prediction of substrate maturity indexes. Science of the Total Environment, 470–471: 536–542
Holmes C C, Adams N M (2003). Likelihood inference in nearest-neighbour classification models. Biometrika, 90(1): 99–112
Hoque M M, Rahman M T U (2020). Landfill area estimation based on solid waste collection prediction using ANN model and final waste disposal options. Journal of Cleaner Production, 256: 120387
Huang Y C, Chen J Y, Duan Q N, Feng Y J, Luo R, Wang W J, Liu F L, Bi S F, Lee J C (2022). A fast antibiotic detection method for simplified pretreatment through spectra-based machine learning. Frontiers of Environmental Science and Engineering, 16(3): 38
Jiang Q H, Chen Y Y, Guo L, Fei T, Qi K (2016). Estimating soil organic carbon of cropland soil at different levels of soil moisture using VIS-NIR spectroscopy. Remote Sensing (Basel), 8(9): 755
Kandlbauer L, Khodier K, Ninevski D, Sarc R (2021). Sensor-based particle size determination of shredded mixed commercial waste based on two-dimensional images. Waste Management (New York, N.Y.), 120: 784–794
Kannangara M, Dua R, Ahmadi L, Bensebaa F (2018). Modeling and prediction of regional municipal solid waste generation and diversion in Canada using machine learning approaches. Waste Management (New York, N.Y.), 74: 3–15
Kardani N, Zhou A N, Nazem M, Lin X S (2021). Modelling of municipal solid waste gasification using an optimised ensemble soft computing model. Fuel, 289: 119903
Karimi N, Ng K T W, Richter A (2021). Prediction of fugitive landfill gas hotspots using a random forest algorithm and Sentinel-2 data. Sustainable Cities and Society, 73: 103097
Kaur G, Kaur D, Kansal S K, Garg M, Krishania M (2022). Potential cocoa butter substitute derived from mango seed kernel. Food Chemistry, 372: 131244
Li H, Liang Y, Xu Q, Cao D (2009). Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration. Analytica Chimica Acta, 648(1): 77–84
Li H Y, Jia S Y, Le Z C (2019a). Quantitative analysis of soil total nitrogen using hyperspectral imaging technology with extreme learning machine. Sensors (Basel), 19(20): 4355
Li R, Gong M, Biney B W, Chen K, Xia W, Liu H, Guo A (2022a). Three-stage pretreatment of food waste to improve fuel characteristics and incineration performance with recovery of process by-products. Fuel, 330: 125655
Li X, Ma Y, Zhang M, Zhan M, Wang P, Lin X, Chen T, Lu S, Yan J (2019b). Study on the relationship between waste classification, combustion condition and dioxin emission from waste incineration. Waste Disposal & Sustainable Energy, 1(2): 91–98
Li Y, Wang Z, Guan B (2022b). Separation and identification of nanoplastics in tap water. Environmental Research, 204: 112134
Lu W J, Huo W Z, Gulina H, Pan C (2022). Development of machine learning multi-city model for municipal solid waste generation prediction. Frontiers of Environmental Science and Engineering, 16(9): 119
Lundberg S M, Lee S I (2017). A unified approach to interpreting model predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 4768–4777
Michel A P M, Morrison A E, Preston V L, Marx C T, Colson B C, White H K (2020). Rapid identification of marine plastic debris via spectroscopic techniques and machine learning classifiers. Environmental Science & Technology, 54(17): 10630–10637
Mirghani M E S, Kabbashi N A, Alam M Z, Qudsieh I Y, Alkatib M F R (2011). Rapid method for the determination of moisture content in biodiesel using FTIR spectroscopy. Journal of the American Oil Chemists’ Society, 88(12): 1897–1904
Nzihou A (2020). Handbook on Characterization of Biomass, Biowaste and Related By-Products. Cham, Switzerland: Springer Cham
Paul A, Wander L, Becker R, Goedecke C, Braun U (2019). High-throughput NIR spectroscopic (NIRS) detection of microplastics in soil. Environmental Science and Pollution Research, 26(8): 7364–7374
Peršak T, Viltuznik B, Hernavs J, Klancnik S (2020). Vision-based sorting systems for transparent plastic granulate. Applied Sciences, 10(12): 4269
Ren M H, Zhang H J, Fan Y, Zhou H Q, Cao R, Gao Y, Chen J P (2021). Suppressing the formation of chlorinated aromatics by inhibitor sodium thiocyanate in solid waste incineration process. Science of the Total Environment, 798: 149154
Said M, Amr M, Sabry Y, Khalil D, Wahba A (2021). Plastic sorting based on MEMS FTIR spectral chemometrics sensing. In: Conference on Optical Sensing and Detection VI, Proceedings of SPIE, 11354: 113540J
Smola A J, Schölkopf B (2004). A tutorial on support vector regression. Statistics and Computing, 14(3): 199–222
Tao J, Liang R, Li J, Yan B, Chen G, Cheng Z, Li W, Lin F, Hou L (2020). Fast characterization of biomass and waste by infrared spectra and machine learning models. Journal of Hazardous Materials, 387: 121723
Vong C M, Ip W F, Chiu C C, Wong P K (2015). Imbalanced learning for air pollution by meta-cognitive online sequential extreme