Detection of Water pH Using Visible Near-Infrared Spectroscopy and One-Dimensional Convolutional Neural Network
Abstract
:1. Introduction
2. Materials and Methods
2.1. Experiment
2.1.1. Spectrophotometer Experiment: Detecting the pH Value of Distilled Water with a Spectrophotometer
2.1.2. Grating Spectrograph Experiment: Detecting the pH Value of Distilled Water with a Grating Spectrograph
2.1.3. Spectral Reference Characteristics
2.2. Traditional Modeling Methods
2.2.1. Spectral Preprocessing Methods
2.2.2. Modeling Algorithms
- (1)
- Partial least squares regression is one of the most commonly used calibration methods, which establishes a linear connection between the spectra data matrix (x) and the target attributes (y). PLS extract uncorrelated principal components (PCs) from the spectra to construct the calibration models. For more details about PLS, please refer to reference [24]. In this study, according to the root mean square error of cross-validation (RMSECV), we chose the optimum number of PCs (nPCs) [18].
- (2)
- Least squares support vector machine is a commonly-used machine learning algorithm which exhibits high prediction accuracy in addressing linear and nonlinear problems [25]. LS-SVM employs a kernel function to transform the original spectra data into a high-dimensional space. Then support vectors are obtained by a set of linear equations. For more details about LS-SVM, please refer to reference [8]. The prediction results of LS-SVM can be expressed as Equation (1):
2.2.3. Successive Projection Algorithm (SPA)
2.3. One-Dimensional Convolutional Neural Network (1D-CNN)
2.3.1. Data Augmentation and Spectral Preprocessing
2.3.2. 1D-CNN Architecture
2.3.3. Training of 1D-CNN
- (1)
- Data augmentation and spectral preprocessing. Before 1D-CNN training, in order to improve the prediction accuracy and prevent overfitting. As previously described, after Z-score preprocess, the calibration set was augmented 10 times using the data augmentation method.
- (2)
- The parameters of 1D-CNN, including all layer weight and biases, were initialized randomly.
- (3)
- Forward propagation. The spectra in the calibration set as the input data of the 1D-CNN finally acquired the predicted pH values from the output layer.
- (4)
- Calculate the MSE value between the predicted and the reference pH values by equation (2).
- (5)
- Backpropagation. Calculate the error gradient of the output layer, and use the backpropagation algorithm to calculate the error gradient of each weight. Then, use the gradient descent algorithm to update the weight value in each layer. The purpose of this step is to optimize the weight of 1D-CNN to minimize the MSE [32].
- (6)
- Go to step (3) until the training epochs reach the maximum number of training epochs or the MSE value is less than the set value.
2.4. Criteria for Model Evaluation
2.5. Outlier Recognition
3. Results and Discussion
3.1. Prediction Results Using Traditional Modeling Methods
3.2. Characteristic Wavelength Selection and Validation
3.3. Prediction Results of 1D-CNN
3.4. Interpreting the Feature Representations of Convolutional Layers
3.5. Calibration Performance Comparisons Discussion of the Multivariate Calibration Models
3.5.1. Discussion of Model Prediction Accuracy
3.5.2. Impacts of Spectra Preprocessing on Calibration Models
3.5.3. Impacts of Feature Selection on Calibration Models
3.5.4. Discussion of Calculation Rapidity
4. Conclusions
- (1)
- The prediction performance of 1D-CNN based on full spectra is better than the traditional linear (PLS) and nonlinear (LS-SVM) approaches using full spectra and characteristic wavelength variables. For the spectrophotometer experiment, the RMSEP is 0.7925 and the is 0.8515. For the grating spectrograph experiment, the RMSEP is 0.5128 and the is 0.9273.
- (2)
- (By visualizing the characteristic map through three convolution layers, we can understand how the convolution network converts one-dimensional spectral data into prediction results. The first convolutional layer acts for spectra pretreatment and learns the shape feature of input spectra. The second convolutional layer extracts the hidden features in the spectra. The third convolutional layer stably enhances the activations of the feature spectra peaks.
- (3)
- 1D-CNN could effectively extract the spectra features. The number of activation variables of 1D-CNN is more than the feature variables selected by SPA, and the prediction accuracy of 1D-CNN is higher than that of SPA-PLS and SPA-LS-SVM for both experiments.
- (4)
- 1D-CNN could improve the convenience of modeling. Compared with the traditional regression methods, 1D-CNN modeling only require preprocessing is normalization. 1D-CNN does not need complex spectra pretreatment and variable selection, which ensures the calculation rapidity of 1D-CNN.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Chen, Z.; Guo, Q.; Shi, Z. Design of WSN node for water pollution remote monitoring. Telecommun. Syst. 2013, 53, 155–162. [Google Scholar] [CrossRef]
- Golan, R.; Gavrieli, I.; Lazar, B.; Ganor, J. The determination of pH in hypersaline lakes with a conventional combination glass electrode. Limnol. Oceanogr. Methods 2014, 12, 810–815. [Google Scholar] [CrossRef]
- Song, J.; Li, G.L.; Yang, X.D.; Liu, X.W.; Xie, L. Rapid analysis of soluble solid content in navel orange based on visible-near infrared spectroscopy combined with a swarm intelligence optimization method. Spectrochim. Acta Part A 2020, 228, 117815. [Google Scholar] [CrossRef] [PubMed]
- Wang, L.S.; Wang, R.J.; Lu, C.P.; Wang, J.; Huang, W. Rapid determination of moisture content in compound fertilizer using visible and near infrared spectroscopy combined with chemometrics. Infrared Phys. Technol. 2019, 102, 103045. [Google Scholar] [CrossRef]
- Wang, K.; Bian, X.; Zheng, M.; Liu, P.; Lin, L.; Tan, X. Rapid determination of hemoglobin concentration by a novel ensemble extreme learning machine method combined with near-infrared spectroscopy. Spectrochim. Acta Part A 2021, 263, 120138. [Google Scholar] [CrossRef]
- Xie, L.; Ying, Y.; Ying, T. Classification of tomatoes with different genotypes by visible and short-wave near-infrared spectroscopy with least-squares support vector machines and other chemometrics. J. Food Eng. 2009, 94, 34–39. [Google Scholar] [CrossRef]
- Li, L.; Li, D. A Hybrid Multivariate Calibration Optimization Method for Visible Near Infrared Spectral Analysis. In Proceedings of the 2021 7th International Conference on Condition Monitoring of Machinery in Non-Stationary Operations (CMMNO), Guangzhou, China, 11–13 June 2021; pp. 76–81. [Google Scholar] [CrossRef]
- Chen, H.; Xu, L.; Ai, W.; Lin, B.; Feng, Q.; Cai, K. Kernel functions embedded in support vector machine learning models for rapid water pollution assessment via near-infrared spectroscopy. Sci. Total Environ. 2020, 714, 136765. [Google Scholar] [CrossRef]
- Xiao, D.; Liu, C.; Le, B.T. Detection method of TFe content of iron ore based on visible-infrared spectroscopy and IPSO-TELM neural network. Infrared Phys. Technol. 2019, 97, 341–348. [Google Scholar] [CrossRef]
- Tian, H.; Zhang, L.; Li, M.; Wang, Y.; Sheng, D.; Liu, J.; Wang, C. WSPXY combined with BP-ANN method for hemoglobin determination based on near-infrared spectroscopy. Infrared Phys. Technol. 2019, 102, 103003. [Google Scholar] [CrossRef]
- Li, Y.; Ma, B.; Li, C.; Yu, G. Accurate prediction of soluble solid content in dried Hami jujube using SWIR hyperspectral imaging with comparative analysis of models. Comput. Electron. Agric. 2022, 193, 106655. [Google Scholar] [CrossRef]
- Yun, Y.H.; Li, H.D.; Deng, B.C.; Cao, D.S. An overview of variable selection methods in multivariate analysis of near-infrared spectra. TrAC Trends Anal. Chem. 2019, 113, 102–115. [Google Scholar] [CrossRef]
- Zhang, X.; Xu, J.; Yang, J.; Chen, L.; Zhou, H.; Liu, X.; Li, H.; Lin, T.; Ying, Y. Understanding the learning mechanism of convolutional neural networks in spectral analysis. Anal. Chim. Acta 2020, 1119, 41–51. [Google Scholar] [CrossRef]
- Cui, C.; Fearn, T. Modern practical convolutional neural networks for multivariate regression: Applications to NIR calibration. Chemom. Intell. Lab. Syst. 2018, 182, 9–20. [Google Scholar] [CrossRef]
- Mishra, P.; Passos, D. Multi-output 1-dimensional convolutional neural networks for simultaneous prediction of different traits of fruit based on near-infrared spectroscopy. Postharvest Biol. Technol. 2022, 183, 111741. [Google Scholar] [CrossRef]
- Fukuhara, M.; Fujiwara, K.; Maruyama, Y.; Itoh, H. Feature visualization of Raman spectrum analysis with deep convolutional neural network. Anal. Chim. Acta 2019, 1087, 11–19. [Google Scholar] [CrossRef]
- Ng, W.; Minasny, B.; McBratney, A. Convolutional neural network for soil microplastic contamination screening using infrared spectroscopy. Sci. Total Environ. 2020, 702, 134723. [Google Scholar] [CrossRef]
- Tian, S.; Wang, S.; Xu, H. Early detection of freezing damage in oranges by online Vis/NIR transmission coupled with diameter correction method and deep 1D-CNN. Comput. Electron. Agric. 2022, 193, 106638. [Google Scholar] [CrossRef]
- Mishra, P.; Passos, D. A synergistic use of chemometrics and deep learning improved the predictive performance of near-infrared spectroscopy models for dry matter prediction in mango fruit. Chemom. Intell. Lab. Syst. 2021, 212, 104287. [Google Scholar] [CrossRef]
- Huang, Y.; Lin, M.; Cavinato, A.G.; Mayes, D.M.; Rasco, B.A. Influence of temperature on the measurement of NaCl content of aqueous solution by short-wavelength near infrared spectroscopy (SW-NIR). Sens. Instrum. Food Qual. 2007, 1, 91–97. [Google Scholar] [CrossRef]
- Mishra, P.; Biancolillo, A.; Roger, J.M.; Marini, F.; Rutledge, D.N. New data preprocessing trends based on ensemble of multiple preprocessing techniques. TrAC Trends Anal. Chem. 2020, 132, 116045. [Google Scholar] [CrossRef]
- Wiedemair, V.; Langore, D.; Garsleitner, R.; Dillinger, K.; Huck, C. Investigations into the Performance of a Novel Pocket-Sized Near-Infrared Spectrometer for Cheese Analysis. Molecules 2019, 24, 428. [Google Scholar] [CrossRef] [Green Version]
- Lu, Z.; Chen, X.; Yao, S.; Qin, H.; Zhang, L.; Yao, X.; Yu, Z.; Lu, J. Feasibility study of gross calorific value, carbon content, volatile matter content and ash content of solid biomass fuel using laser-induced breakdown spectroscopy. Fuel 2019, 258, 116150. [Google Scholar] [CrossRef]
- Geladi, P.; Kowalski, B.R. Partial least square regression: A tutorial. Anal. Chim. Acta 1986, 185, 1–17. [Google Scholar] [CrossRef]
- Bao, Y.; Liu, F.; Kong, W.; Sun, D.W.; He, Y.; Qiu, Z. Measurement of Soluble Solid Contents and pH of White Vinegars Using VIS/NIR Spectroscopy and Least Squares Support Vector Machine. Food Bioprocess Technol. 2014, 7, 54–61. [Google Scholar] [CrossRef]
- Liu, F.; He, Y.; Wang, L. Comparison of calibrations for the determination of soluble solids content and pH of rice vinegars using visible and short-wave near infrared spectroscopy. Anal. Chim. Acta 2008, 610, 196–204. [Google Scholar] [CrossRef]
- Wu, Q.; Song, T.; Liu, H.; Yan, X. Particle swarm optimization algorithm based on parameter improvements. J. Comput. Methods Sci. Eng. 2017, 17, 557–568. [Google Scholar] [CrossRef]
- Liu, K.; Chen, X.; Li, L.; Chen, H.; Ruan, X.; Liu, W. A consensus successive projections algorithm--multiple linear regression method for analyzing near infrared spectra. Anal. Chim Acta 2015, 858, 16–23. [Google Scholar] [CrossRef] [PubMed]
- Araújo, M.R.C.S.U.; Saldanha, T.C.B.; Galvão, R.K.H.; Yoneyama, T.; Chame, H.C.; Visani, V. The successive projections algorithm for variable selection in spectroscopic multicomponent analysis. Chemom. Intell. Lab. Syst. 2001, 57, 65–73. [Google Scholar] [CrossRef]
- Bjerrum, E.J.; Glahder, M.; Skov, T. Data Augmentation of Spectral Data for Convolutional Neural Network (CNN) Based Deep Chemometrics. arXiv 2017. [Google Scholar] [CrossRef]
- Zhu, J.; Sharma, A.S.; Xu, J.; Xu, Y.; Jiao, T.; Ouyang, Q.; Li, H.; Chen, Q. Rapid on-site identification of pesticide residues in tea by one-dimensional convolutional neural network coupled with surface-enhanced Raman scattering. Spectrochim. Acta Part A 2021, 246, 118994. [Google Scholar] [CrossRef] [PubMed]
- Malek, S.; Melgani, F.; Bazi, Y. One-dimensional convolutional neural networks for spectroscopic signal regression. J. Chemom. 2018, 32, e2977. [Google Scholar] [CrossRef]
- Zhu, H.; Yang, L.; Gao, J.; Gao, M.; Han, Z. Quantitative detection of Aflatoxin B1 by subpixel CNN regression. Spectrochim. Acta Part A 2022, 268, 120633. [Google Scholar] [CrossRef]
- Liu, Z.; Cai, W.; Shao, X. Outlier detection in near-infrared spectroscopic analysis by using Monte Carlo cross-validation. Sci. China Ser. B: Chem. 2008, 51, 751–759. [Google Scholar] [CrossRef]
- Sewdien, V.N.; Preece, R.; Torres, J.; Rakhshani, E.; Van, D. Assessment of critical parameters for artificial neural networks based short-term wind generation forecasting. Renew. Energy 2020, 161, 878–892. [Google Scholar] [CrossRef]
- Chen, X.; Yu, R.; Ullah, S.; Wu, D.; Zhang, Y. A novel loss function of deep learning in wind speed forecasting. Energy 2021, 238, 121808. [Google Scholar] [CrossRef]
- Koshoubu, J.; Iwata, T.; Minami, S. Elimination of the Uninformative Calibration Sample Subset in the Modified UVE (Uninformative Variable Elimination)–PLS (Partial Least Squares) Method. Anal. Sci. 2001, 17, 319–322. [Google Scholar] [CrossRef] [Green Version]
- Dixit, Y.; Al-Sarayreh, M.; Craigie, C.R.; Reis, M.M. A global calibration model for prediction of intramuscular fat and pH in red meat using hyperspectral imaging. Meat Sci. 2021, 181, 108405. [Google Scholar] [CrossRef]
- Yun, Y.; Wang, W.; Tan, M.; Liang, Y.; Li, H.; Cao, D.; Lu, H.; Xu, Q. A strategy that iteratively retains informative variables for selecting optimal variable subset in multivariate calibration. Anal. Chim. Acta 2014, 807, 36–43. [Google Scholar] [CrossRef]
Model | Preprocessing | nPCs | γ | δ2 | Calibration Set | Prediction Set | ||
---|---|---|---|---|---|---|---|---|
RMSEC | RMSEP | |||||||
PLS | Raw | 9 | - | - | 0.3986 | 0.9621 | 1.1381 | 0.6942 |
Smoothing | 6 | - | - | 1.0119 | 0.7557 | 1.3038 | 0.5986 | |
SNV | 3 | - | - | 1.6956 | 0.3141 | 1.8039 | 0.2318 | |
Z-score | 5 | - | - | 1.1286 | 0.6961 | 1.1711 | 0.6762 | |
LS-SVM | Raw | - | 77,838.29 | 26,573.41 | 0.8957 | 0.8086 | 1.0295 | 0.7495 |
Smoothing | - | 85,781.59 | 29,349.56 | 0.9332 | 0.7923 | 1.0290 | 0.7498 | |
SNV | - | 22,520.66 | 79,010.55 | 0.7688 | 0.8590 | 1.6613 | 0.3478 | |
Z-score | - | 54,293.14 | 14,798.45 | 0.8296 | 0.8358 | 1.2398 | 0.6368 |
Model | Preprocessing | nPCs | γ | δ2 | Calibration Set | Prediction Set | ||
---|---|---|---|---|---|---|---|---|
RMSEC | RMSEP | |||||||
PLS | Raw | 8 | - | - | 0.0424 | 0.9995 | 1.1496 | 0.6569 |
Smoothing | 8 | - | - | 0.0754 | 0.9985 | 1.1366 | 0.6646 | |
SNV | 6 | - | - | 0.1354 | 0.9954 | 1.2530 | 0.5924 | |
Z-score | 6 | - | - | 0.2187 | 0.9882 | 1.2879 | 0.5694 | |
LS-SVM | Raw | - | 29,195.21 | 3095.09 | 0.0022 | 0.9999 | 1.1991 | 0.6025 |
Smoothing | - | 92,829.47 | 99,301.32 | 0.0293 | 0.9998 | 1.2294 | 0.5821 | |
SNV | - | 89,171.26 | 1500.91 | 0.0001 | 0.9999 | 1.3533 | 0.4936 | |
Z-score | - | 35,097.78 | 3077.16 | 0.0018 | 0.9999 | 1.0228 | 0.7108 |
Model | nPCs | γ | δ2 | Calibration Set | Prediction Set | ||
---|---|---|---|---|---|---|---|
RMSEC | RMSEP | ||||||
SPA-PLS | 12 | - | - | 0.8760 | 0.9169 | 1.0209 | 0.7539 |
SPA-LS-SVM | - | 98,472.83 | 2024.81 | 1.0019 | 0.7605 | 1.1286 | 0.6990 |
Model | nPCs | γ | δ2 | Calibration Set | Prediction Set | ||
---|---|---|---|---|---|---|---|
RMSEC | RMSEP | ||||||
SPA-PLS | 8 | - | - | 0.1549 | 0.9941 | 0.5737 | 0.9145 |
SPA-LS-SVM | - | 73,016.22 | 6037.97 | 0.0782 | 0.9985 | 0.5211 | 0.9249 |
Experiment | Calibration Set | Prediction Set | ||
---|---|---|---|---|
RMSEC | RMSEP | |||
Spectrophotometer | 0.7478 | 0.8715 | 0.7925 | 0.8515 |
Grating spectrograph | 0.1337 | 0.9953 | 0.5128 | 0.9273 |
Experiment | PLS (s) | LS-SVM (s) | SPA-PLS (s) | SPA-LS-SVM (s) | 1D-CNN (s) | |||||
---|---|---|---|---|---|---|---|---|---|---|
Mean | Std. | Mean | Std. | Mean | Std. | Mean | Std. | Mean | Std. | |
Spectrophotometer | 0.0014 | 0.0005 | 0.0065 | 0.0004 | 0.0017 | 0.0002 | 0.0081 | 0.0009 | 0.0024 | 0.0005 |
Grating spectrograph | 0.0068 | 0.0008 | 0.0065 | 0.0012 | 0.0186 | 0.0008 | 0.0232 | 0.0009 | 0.0082 | 0.0011 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, D.; Li, L. Detection of Water pH Using Visible Near-Infrared Spectroscopy and One-Dimensional Convolutional Neural Network. Sensors 2022, 22, 5809. https://doi.org/10.3390/s22155809
Li D, Li L. Detection of Water pH Using Visible Near-Infrared Spectroscopy and One-Dimensional Convolutional Neural Network. Sensors. 2022; 22(15):5809. https://doi.org/10.3390/s22155809
Chicago/Turabian StyleLi, Dengshan, and Lina Li. 2022. "Detection of Water pH Using Visible Near-Infrared Spectroscopy and One-Dimensional Convolutional Neural Network" Sensors 22, no. 15: 5809. https://doi.org/10.3390/s22155809