Linear and Non-Linear Modelling Methods for a Gas Sensor Array Developed for Process Control Applications

Lakhmi, Riadh; Fischer, Marc; Darves-Blanc, Quentin; Alrammouz, Rouba; Rieu, Mathilde; Viricelle, Jean-Paul

doi:10.3390/s24113499


by Riadh Lakhmi *, Marc Fischer, Quentin Darves-Blanc, Rouba Alrammouz, Mathilde Rieu and Jean-Paul Viricelle

Mines Saint-Etienne, Univ Lyon, CNRS, UMR 5307 LGF, Centre SPIN, F-42023 Saint-Etienne, France

* Author to whom correspondence should be addressed.

Sensors 2024, 24(11), 3499; https://doi.org/10.3390/s24113499

Submission received: 18 April 2024 / Revised: 22 May 2024 / Accepted: 24 May 2024 / Published: 29 May 2024

(This article belongs to the Special Issue Low-Cost Chemosenors for Applications in Environment, Health, Food, and Industry Process Control)


Abstract:

New process developments linked to Power to X (energy storage or energy conversion to another form of energy) require tools to perform process monitoring. The main gases involved in these types of processes are H₂, CO, CH₄, and CO₂. Because of the non-selectivity of the sensors, a multi-sensor matrix has been built in this work based on commercial sensors having very different transduction principles and, therefore, providing richer information. To treat the data provided by the sensor array and extract gas mixture composition (nature and concentration), linear (Multi Linear Regression—Ordinary Least Square “MLR-OLS” and Multi Linear Regression—Partial Least Square “MLR-PLS”) and non-linear (Artificial Neural Network “ANN”) models have been built. The MLR-OLS model was discarded because it already performed poorly on the training data and therefore could not be expected to give effective predictions during the validation phase. Then, the performances of MLR-PLS and ANN were evaluated with validation data. Good concentration predictions were obtained in both cases for all the involved analytes. However, in the case of methane, better prediction performances were obtained with ANN, which is consistent with the fact that the MOX sensor’s response to CH₄ is logarithmic, whereas only linear sensor responses were obtained for the other analytes. Finally, prediction tests performed on one-year aged sensor platforms revealed that PLS model predictions on aged platforms mainly suffered from concentration offsets and that ANN predictions mainly suffered from a drop in sensitivity.

Keywords:

sensor array; Power to X; multivariate analysis; PLS; ANN

1. Introduction

The development of green energies brings with it the problem of their intermittency. One solution is to interconnect electricity, gas and heat networks. In this way, surplus electrical energy produced at times of low household consumption could be stored in chemical form by producing H₂ through electrolysis [1]. H₂ thus produced can be used for mobility applications, inserted in limited quantities into the domestic gas network, or used to capture CO₂ (produced during natural gas combustion) that will be transformed into CH₄ via the methanation reaction [2].

Nowadays, there are various processes for producing dihydrogen and methane. These include biogas reforming, pyrolysis or pyro-gasification of biomass, methanation (chemical or biological), and electrolysis. The molecules mainly involved in those processes (if we do not consider the different hydrocarbon molecules contained in the biomass) are CH₄, CO, CO₂, H₂, and H₂O.

As the development in this Power to X area intensifies, issues related to process safety (CO or H₂ leakage, possibly concomitantly) and process control (measurement of CO₂ or CH₄ concentrations, possibly in the presence of H₂) are emerging.

To monitor and control industrial processes, monitor stack emissions or detect leaks, some companies are equipped with analyzers installed as close as possible to the part of the process to be characterized (reactors, pipes, unit operations). The nature of these analyzers depends on the gases to be detected. Among the analyzer technologies used, we can find infrared analyzers, chromatography, Raman spectroscopy-based analyzers, and photoacoustic analyzers [3,4,5]. However, analyzers of those types cost several tens of thousands of euros per unit. As a result, equipping a production unit with several analyzers can be quite costly. An alternative solution would be to measure gaseous composition using a multi-sensor platform, the cost of which would be reduced by a factor of 10 to 50 compared with an analyzer. After automatic sampling (using mass flowmeters), the gaseous solution to be analyzed may have to be diluted to ensure that the concentrations to be analyzed are compatible with the sensors’ detection ranges, and reduced to atmospheric pressure, as most sensors have a restricted operating range in terms of pressure.

The well-known problem of sensors is their lack of selectivity [6,7,8,9]. To overcome this problem, several approaches exist which have given rise to specific research. One of the methods currently used consists of modifying the composition of the sensor. This can be achieved by adding a selective sensitive layer that responds to one target gas [10,11,12,13] or by integrating a filter that will block access to the sensor’s reaction sites to certain gases, similarly to the work reported by Gao et al. [14]. Another method used to achieve good selectivity consists of using sensor arrays. This technique may be used as an alternative to the first one. In this case, no modification of the sensor’s composition is made but signal treatment based on multivariate analysis enables the identification of the analytes’ nature and concentration due to the increased size of data collected by the different sensors of the array [15,16]. These arrays can use different sensors based on one transducing principle, or can group sensors with different transducing principles (arrays of MOX sensors [17,18,19], arrays of electrochemical sensors [20,21], and so on).

To achieve good prediction of a gas mixture composition, a model has to be built based on the results obtained in a first phase (“training” phase) before being validated with an independent dataset. Prediction models can be multilinear. In this case, the model consists of a matrix. Concerning the linear methods classically used for modeling purposes, Principal Component Analysis (PCA) calculates matrices to project variables into a new space, using a new matrix to show the degree of similarity between variables. This method is classically used in the sensor field to classify sensor signals of electronic noses into odor types [22], which can be useful in the food industry or in testing indoor/outdoor air quality, for example. However, this method is not relevant for the identification of both the nature and the concentration of gases, as is required in the case of process control. For this purpose, Multi Linear Regression—Ordinary Least Square (MLR-OLS) presents the advantage of being quite simple to implement [23,24]. Its modeling performances are interesting for calibration and for concentration predictions when the explanatory variables (model inputs including the sensor signals) are not correlated with each other. When they are correlated, the PLS model is more appropriate. Indeed, this method is very effective, especially when the sensor signals are linear in their detecting range [25]. For example, in a study performed by Karami, Rasekh [26], an e-nose with MOS sensor was used to detect oil oxidation. The reliability of the PLS method in detecting this phenomenon was the most interesting among the tested methods, and was assessed at 100%.

On the other hand, it seems interesting to consider non-linear models, which can be more effective in the case of strong non-linearity in the input sensor signals. More or less complex artificial neural network (ANN)-based models can, for example, predict gas mixture compositions [27] and have also given good results when used with e-noses to assess the quality of products. As we will explain in the following sections, preliminary tests consisting of exposing sensors to mono-analyte gas compositions gave, in almost every case, linear responses in the targeted detection range. Non-linear models like neural networks are suitable for linear behaviors but, because the extraction of the model parameters is based on reaching local minima [28], the extracted parameters will not always correspond to the most adequate model.

Predictions can be biased due to numerous problems: different types of transitory phenomena (temperature or pressure evolution), ageing of the electronics for signal treatment, and drift/ageing of the sensors constituting the platform. Different reversible or irreversible processes may cause short- or long-term sensor drift. Reversible damage, which results in short-term drifts, can result from condensation of chemical vapor on the active surface of the sensor, physical adsorption of chemical compounds, or evolution of ambient atmospheric conditions (temperature, humidity, influence) [29]. Irreversible damage can result from an abrupt event, for example, the poisoning of MOX or electrochemical sensors with sulfur compounds, or from a continuous evolution over time. This last case can be due to the evolution of the electronic components dedicated to signal treatment or to an evolution of the sensor’s active materials (electrodes, semiconductor oxides, and heating element) caused by surface chemical reactions or by degradation under mechanical stress [30].

In this paper, two linear models and several neural network-based models with and without a hidden layer will be compared in terms of predictive capabilities. Single analytes and binary mixtures will be considered for developing and testing the models. In addition, the article also includes the study of the pertinence of the concentration predictions after one year of continuous use of the platform under controlled environmental conditions. In this case, predictions will be affected by irreversible damage due to continuous ageing of electronics or sensors. Comparison of the prediction performances of linear (PLS) and non-linear (ANN) methods, initially and after one year of continuous platform use, will be carried out. Indeed, one of the major objectives of this work will be to evaluate the evolution over time of the performance of commonly used linear and non-linear models for predicting gas concentrations in the simple case of binary mixtures.

2. Materials and Methods

2.1. Sensor Choice

The choice of sensors for the multi-sensor platform was made according to well-defined specifications. The first one concerned the targeted gases: CO₂, H₂, CH₄, and CO, which are the most common gases in Power to X-linked processes. Additional temperature and humidity sensors are also needed since those two parameters will vary during the detection phase and since they constitute potential influences. The desirable gas concentration detection ranges are both a function of the gas concentrations that can be measured at specific points of the processes and the possibilities offered by commercial sensors. In order to detect traces of CO in H₂ (methanization) or leakage of CO, the targeted detection range expected for CO runs from a few ppm to some hundreds of ppm. For H₂, requirements relate to its monitoring in a process and the detection of H₂ leaks. For safety reasons and for reasons of sensor range limitations, the targeted detection range for H₂ has been limited from a few hundred ppm to 1%. For CO₂ and CH₄ gases, the requirements concern only the gas concentration monitoring at different steps of the process. Ideally, sensor detection ranges should be from a few hundred ppm up to 100% but, as will be shown, the sensors’ upper limit for those gases does not exceed a few tens of %.

In order to address the specifications, 5 commercial sensors were selected for this project. They were deliberately chosen with very different operating principles in order to maximize the versatility of the associated responses.

Prior to selecting the sensors, a study was carried out to determine the sensor technologies that could be used to detect the gases of interest to the project (CO, CO₂, CH₄, and H₂). The results are summarized in Table 1.

Sensor technologies can be classified into two families: chemical and physical sensors. The first family is based on the change of an electrical output characteristic’s value due to a chemical reaction. CO₂ is a weakly reactive molecule. Therefore, chemical sensors (MOX, catalytic, electrochemical) are not the most effective to detect it. Commercially, two physical sensor technologies exist for the detection of CO₂: Non-Dispersive Infrared (NDIR) sensors and photo-acoustic sensors. The second technology is relatively recent and only a few manufacturers offer it. The photo-acoustic effect is based on the absorption (by target molecules) of a modulated (or pulsed) light beam. As the molecules de-energize through collisions, they generate sound waves that are detected by a condenser microphone. Yet, the technology that is mainly used for the detection of CO₂ remains the classical infrared absorption-based one. As for the larger IR analyzers, those sensors are based on the detection of a change of luminous intensity due to absorption of infrared radiation by the gaseous analyte. In an MOX sensor, the variation in electrical conductivity of an oxide semiconductor layer is measured as a function of the presence of chemisorbed gaseous analytes (redox interface reactions). Concerning electrochemical sensors, redox reactions are also involved but at electrode/electrolyte/gas interfaces. Those reactions affect the interface resistance of the working electrode (mainly) and change the electromotive force measured between the electrodes. For catalytic sensors, a specific combustion (redox) reaction occurs on two alumina beads: a reference bead and another bead (for measurement) covered with a catalyst which decreases the combustion reaction temperature. Those alumina beads are traversed by a platinum wire, which is an RTD material, i.e., a material which changes resistance with temperature. Resistance variation due to analyte combustion is measured thanks to a Wheatstone bridge involving both the reference and the measurement bead resistances. In those three sensor technologies, electrochemical reactions are involved. Due to the variety of semiconductor oxides used in MOX sensors, commercial references were found for CO, H₂ and CH₄ compounds. For the electrochemical sensors, many references were found for CO detection and references were also found for H₂ detection. Concerning the catalytic sensors, the references found mainly concerned the detection of hydrocarbons (including CH₄) and H₂.

From the references identified, the next step was the selection of the sensors to be used in this work. This selection was performed according to several criteria: compliance with the process-linked specifications (presented earlier), the versatility of the expected sensor signals (it was important to avoid collinearity between the sensor responses), and a low number of interferents (especially humidity and temperature).

Table 2 lists the name, model, type, detection range, and target gases for each selected sensor. A digital NDIR sensor was chosen for CO₂ measurement. This one incorporates a temperature sensor as well as a humidity sensor, so that the CO₂ concentration signal delivered by the sensor incorporates compensation for temperature and humidity variations. Hence, our system will eventually have 7 sensors. The temperature and the humidity sensor will also be used in the sensor network as input parameters in the multi-linear models, but not in the ANN as we feared this could lead to overfitting. Unlike other sensors, the NDIR sensor is selective. Determining CO₂ concentration will therefore not require multivariate analysis as will be the case for the other gases. The platform also incorporates two electrochemical sensors (EC-H₂ and EC-CO) which are highly sensitive to H₂ and CO, but not selective. Finally, a catalytic sensor (CATA) and a Metal Oxide sensor (MOX) were chosen for their sensitivity to CO, H₂, and CH₄.

2.2. Experimental Setup

A gas bench equipped with a series of flowmeters was used to generate gas mixtures of specified compositions with a total gas flow of 30 L/h (Figure 1a). In order to expose the sensor platform to single analyte gases or binary mixtures, 2 sensor platforms of 7 sensors (if we include the temperature and humidity sensors) were introduced in sealed cells such as the one shown in Figure 1b and exposed to the gas mixtures. The sensor signals were conditioned using commercial or laboratory-developed analog and digital electronics, as can be seen in Figure 2. EC-CO and EC-H₂ sensors require a special conditioning step performed by analog circuit boards. For the rest of the sensors, however, the conditioning is performed by the central “laboratory-made” circuit board. Finally, signals are digitized and transferred to the computer using an Arduino board for NDIR CO₂ sensors, and a National Instruments (NI) board for the others. Once the sensor data have been collected, behavioral modeling of the platform is carried out using Excel software for the MLR-OLS method (Multi Linear Regression—Ordinary Least Square), a Python script using the “PartialLeastSquares” library for the MLR-PLS method (Multi Linear Regression—Partial Least Square) and Keras/Tensorflow in Python for the neural network method. These models will be used for the final tests to predict a gas composition from the response of the sensor array in the weeks immediately following model building and also after an ageing period of one year.

2.3. Test Procedure

2.3.1. Role of Mono-Analyte Tests

Before exposing the sensor array to complex binary mixtures, it was important to study the sensor responses to each analyte. These “mono-analyte tests” had several goals:

-: to verify the reproducibility of the sensor responses,
-: to check that the sensor drift is limited and close to zero,
-: to analyze the transfer function linking the gas concentration of the analyte and the sensor responses (linear or not),
-: to check that the sensor responses to the introduced analytes are sufficiently uncorrelated to have enough variability in the information collected. If two sensors respond the same way to all the analytes, they finally bring “collinear” information, which would be prejudicial for the models. Indeed, it can lead to overfitting so that the model will almost perfectly learn to match the training data but will be unable to capture the validation data.
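The collinearity check in the last item can be sketched numerically. The snippet below is only an illustration under assumptions: the data are synthetic, and the helper name `max_offdiag_correlation` is hypothetical rather than part of the authors' workflow. It computes the largest absolute pairwise correlation between sensor columns; a value close to 1 would suggest that two sensors carry nearly the same information.

```python
import numpy as np

def max_offdiag_correlation(responses):
    """Largest absolute pairwise correlation between sensor columns."""
    corr = np.corrcoef(responses, rowvar=False)
    off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.max(np.abs(off_diag)))

# Synthetic steady-state responses (rows: exposure steps, columns: sensors).
rng = np.random.default_rng(0)
s1 = rng.normal(size=40)
s2 = rng.normal(size=40)
independent = np.column_stack([s1, s2])            # unrelated sensors
collinear = np.column_stack([s1, 2.0 * s1 + 0.5])  # second sensor mirrors the first

print(max_offdiag_correlation(independent))
print(max_offdiag_correlation(collinear))
```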

2.3.2. Sensor Network Exposure to Both Mono-Analyte and Binary Mixtures

A LabVIEW program is used both to control the flowmeters and therefore the gas mixture composition in the gas line according to time, and also to collect the data from the different sensors and gather them in a specific file. The program can be fed with a file containing sequences of gas compositions at different times that will be applied to the different flowmeters to obtain the expected gas concentration evolution as a function of time (Figure 3). Each sequence lasts between 30 and 60 min and a set of sequences is structured in the following way:

-: first sequence under “base gas”: 12% O₂/1% absolute humidity/N₂,
-: several sequences including introduction of analytes alone or in binary analyte mixtures,
-: last sequence under “base gas”: 12% O₂/1% absolute humidity/N₂ to verify the return of the sensor signals to the base line, i.e., verify that the signal corresponding to the first sequence is the same as the signal at this last sequence (no drift of the sensor signals).

2.3.3. Modelling Step: Behavior Model Construction

During this step, the signals collected during the phase of sensor network exposure to the different single and binary analyte mixtures will be used. They will constitute the input data of our models and will be gathered in a matrix X constituted of the elements X_ij, where “i” corresponds to the sampled point number (which can be linked to time knowing the sampling frequency) and “j” corresponds to the sensor number: 1 to 14 (2 cells with 7 sensors each). The output matrix of the model is a table Y constituted of the elements Y_ik, containing the concentration evolution of the four targeted analytes according to time. Here, “i” also corresponds to the sampled point number and “k” corresponds to the analyte number (from 1 to 4).

Concerning linear models, MLR-OLS [31,32] and MLR-PLS [33,34] methods were chosen. MLR-OLS is a modelling method in which the empirical estimation of a calibration matrix, $C$, allows us to use experimental sensor data (matrix $X$) to get a prediction matrix $\hat{Y}$ (which corresponds, here, to modelled gas concentration values). $C$ is composed of the elements $C_{kj}$ and is determined by least squares minimization (parameter RMSE, Root Mean Square Error) between modelled values $\hat{Y}_{ik}$ and experimental values $Y_{ik}$:

$$\mathrm{RMSE} = \sqrt{\sum_{i=1}^{N} \frac{(Y_{ik} - \hat{Y}_{ik})^{2}}{N}}$$

The RMSE is calculated for each analyte “k”. In the linear model, the relationship existing between the sensors’ matrix signals and the analyte concentrations is the following:

$$X = Y \cdot C + \mathrm{ones} \cdot R_{0} \qquad (1)$$

where $\mathrm{ones}$ is a one-column vector composed of $i_{max}$ elements (number of sampled points) whose components are all equal to 1. $R_{0}$ is a one-line vector composed of $j_{max}$ elements (14 sensors here). It corresponds to the sensors’ response vector when no analyte is introduced. $R_{0}$ and the calibration matrix $C$ determined during the training phase will constitute the parameters of the model ($C'$ being the transpose of $C$).

To predict the value of the concentration matrix in the model validation phase, linear algebra is used to extract the concentration matrix $\hat{Y}$:

$$\hat{Y} = (X - \mathrm{ones} \cdot R_{0}) \cdot C' \cdot (C \cdot C')^{-1} \qquad (2)$$
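As an illustration of Equations (1) and (2), the following NumPy sketch fits $C$ and $R_0$ on synthetic data and then inverts the model. It assumes, for simplicity, that the parameters are estimated by regressing the sensor signals on the known concentrations; the fit used in the paper minimizes the RMSE on the concentrations and may differ. The function names and the random data are hypothetical.

```python
import numpy as np

def fit_calibration(X, Y):
    """Estimate C (analytes x sensors) and R0 (1 x sensors) in X = Y C + ones R0."""
    n_points = X.shape[0]
    design = np.hstack([Y, np.ones((n_points, 1))])    # [Y | ones]
    coef, *_ = np.linalg.lstsq(design, X, rcond=None)  # least squares solution
    return coef[:-1, :], coef[-1:, :]

def predict_concentrations(X, C, R0):
    """Eq. (2): Y_hat = (X - ones R0) C' (C C')^-1."""
    return (X - R0) @ C.T @ np.linalg.inv(C @ C.T)

# Synthetic, noise-free example: 30 sampled points, 4 analytes, 14 sensors.
rng = np.random.default_rng(0)
Y = rng.uniform(0.0, 1.0, size=(30, 4))
C_true = rng.normal(size=(4, 14))
R0_true = rng.normal(size=(1, 14))
X = Y @ C_true + R0_true        # Eq. (1); R0 is broadcast over all rows

C_hat, R0_hat = fit_calibration(X, Y)
Y_hat = predict_concentrations(X, C_hat, R0_hat)
```

With noise-free data the recovered concentrations match the imposed ones to numerical precision; with real sensor noise or collinear sensors, the inversion of $C \cdot C'$ becomes the fragile step.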

In the case where the number of predicted variables (matrix $\hat{Y}$) is rather high and, at the same time, the amount of information from experimental data (matrix $X$) is insufficient, the OLS method becomes unstable because the system is underdetermined. Similarly, when the number of experimental variables (sensors as predictors) is large and the amount of data used in the model-building phase is insufficient, OLS models suffer from multi-collinearity and overfitting problems.

The MLR-PLS model then seeks to define a model that maximizes the covariance between $X$ and $Y$ using latent variables. These variables replace explanatory variables (sensor signals in our case) that are more or less strongly collinear. Indeed, they are linear combinations of the explanatory variables in which the weight assigned to each explanatory variable is chosen so as to maximize the covariance between the newly created latent vector and the concentration matrix. The multilinear regression is then performed not on the explanatory variables but on the latent variables. In this work, a version of the algorithm developed by Abdi et al. [35] in 2010 was used through Python code.
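To make the role of the latent variables concrete, here is a minimal single-response PLS sketch (a NIPALS-style PLS1) written directly in NumPy. It is not the “PartialLeastSquares” library used in this work, and the data are synthetic: the third “sensor” is an exact combination of the first two, a case in which two latent variables are enough to reproduce the response.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model; return (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # latent variable (score vector)
        tt = t @ t
        p = Xk.T @ t / tt             # loading of X on the latent variable
        q_a = yk @ t / tt             # regression of y on the latent variable
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - q_a * t             # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

# Synthetic data with exact collinearity between the three columns.
rng = np.random.default_rng(1)
base = rng.normal(size=(50, 2))
X = np.column_stack([base, base[:, 0] + base[:, 1]])
y = 1.0 * X[:, 0] + 2.0 * X[:, 1] + 3.0

coef, intercept = pls1_fit(X, y, n_components=2)
y_hat = X @ coef + intercept
```

The regression is carried out on the score vectors `t`, so the redundant third column does not make the problem ill-posed as it would for a direct inversion.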

We also decided to model the relationship between the analyte concentrations and the sensor signals through a series of artificial neural networks (ANN) [36], as illustrated by Figure 4 for the case of H₂ concentration prediction.

$f_{act,1} = \tanh$ and $f_{act,2} = \mathrm{linear}$ are the two activation functions we have chosen. When $n_{hidden} = 0$, there is no hidden layer and the concentration of the considered analyte (H₂ in the following equation) can be determined by:

$$Y_{H_{2},i,ANN} = f_{act,1}\left(\sum_{j=1}^{n_{sensors}} \omega_{j} X_{ij} + \omega_{0}\right) \qquad (3)$$

where $X_{ij}$ is the value returned by the j-th sensor of the regressor combination at time i and the $\omega_{j}$ are weights to be optimised based on the training data set. $\omega_{0}$ is the weight of the bias neuron. In the presence of a hidden layer, the analyte concentration (for instance H₂) can be calculated using the following equation:

$$Y_{H_{2},i,ANN} = f_{act,2}\left(\sum_{j=1}^{n_{hidden}} \omega_{2,j} \, f_{act,1}\left(\sum_{k=1}^{n_{sensors}} \omega_{1,k,j} X_{ik} + \omega_{1,0,j}\right) + \omega_{2,0}\right) \qquad (4)$$

where the weights $\omega_{1,k,j}$ and $\omega_{2,j}$ are to be optimised based on the training data set. $\omega_{2,0}$ and $\omega_{1,0,j}$ are the weights of the bias neurons for the output and the hidden layer, respectively.
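Equations (3) and (4) amount to the following forward passes, sketched here in NumPy with hypothetical function names and arbitrary illustrative weights. The models in this work were built with Keras/Tensorflow; this sketch only mirrors the arithmetic.

```python
import numpy as np

def ann_no_hidden(X, w, w0):
    """Eq. (3): tanh of a weighted sum of the sensor signals plus a bias."""
    return np.tanh(X @ w + w0)

def ann_one_hidden(X, W1, b1, w2, b2):
    """Eq. (4): tanh hidden layer followed by a linear output neuron."""
    hidden = np.tanh(X @ W1 + b1)   # shape (n_points, n_hidden)
    return hidden @ w2 + b2         # linear activation on the output

# Arbitrary illustrative inputs: 2 time points, 3 sensors, 2 hidden neurons.
X = np.ones((2, 3))
out_direct = ann_no_hidden(X, w=np.array([0.1, 0.2, 0.3]), w0=-0.6)
out_hidden = ann_one_hidden(X, W1=np.zeros((3, 2)), b1=np.zeros(2),
                            w2=np.array([1.0, -1.0]), b2=0.25)
```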

The weights are optimized through the minimization of the RMSE given by:

$$\mathrm{RMSE} = \sqrt{\sum_{k=1}^{n_{time}} (Y_{species,ANN,k} - Y_{species,k})^{2}} \qquad (5)$$

where $n_{time}$ is the number of time points in the training data set, $Y_{species,k}$ is the considered experimental species concentration at the k-th time point of the training data set, and $Y_{species,ANN,k}$ is the prediction of this value by the ANN. The Gradient Descent Method with momentum [37] was employed for the optimization. After a random initialization, the weights of the neural network are iteratively adapted according to these coupled equations:

$$\omega_{t+1} = \omega_{t} - ε \cdot v_{t+1} \qquad (6)$$

with:

$$v_{t+1} = ρ \cdot v_{t} + (1 - ρ) \nabla_{\omega} \mathrm{RMSE}(\omega) \qquad (7)$$

$ε$ is the learning rate, which was set to 0.2, whereas $ρ$ is the momentum constant, which was set to 0.9, and $v_{0}$ is set to 0 in Keras/Tensorflow.
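The update rule of Equations (6) and (7) can be written in a few lines. The sketch below applies it, with the learning rate and momentum constant quoted above, to a simple quadratic loss chosen purely for illustration (it is not the RMSE of the sensor model). The function name `momentum_step` and the target vector are assumptions.

```python
import numpy as np

def momentum_step(w, v, grad, lr=0.2, rho=0.9):
    """One iteration of Eqs. (6)-(7)."""
    v_new = rho * v + (1.0 - rho) * grad   # Eq. (7): smoothed gradient
    w_new = w - lr * v_new                 # Eq. (6): weight update
    return w_new, v_new

# Illustrative loss f(w) = ||w - target||^2, whose gradient is 2 (w - target).
target = np.array([1.0, -2.0])
w = np.zeros(2)
v = np.zeros(2)                            # v_0 = 0
for _ in range(500):
    grad = 2.0 * (w - target)
    w, v = momentum_step(w, v, grad)
```

On this convex example the iterates spiral in towards the minimum; on the non-convex loss of a neural network the same rule can settle in a local minimum, as discussed in the Introduction.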

If the improvement $\Delta_{t}\mathrm{RMSE} = \mathrm{RMSE}_{t-1} - \mathrm{RMSE}_{t}$ remains below $10^{-4}$ for a duration equal to patience = 500 iterations, the optimisation is stopped as we consider that no further significant improvement can be made. In any case, the optimisation is stopped after a maximum of 10,000 iterations. It is worth emphasizing that this is the RMSE of the training data set, as the validation data set is only used after the end of the optimization to test the ability of the model to predict independent experiments. Both the regressors (sensors) and the target concentrations were normalized before the beginning of the training of the ANN so that they only take on values between 0 and 1. This is preferable if we want to use activation functions such as “tanh”.
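The normalization and the stopping rule can be sketched as follows. This is a simplified reading of the criterion (the RMSE improves by less than $10^{-4}$ per iteration for `patience` consecutive iterations), not a reproduction of the Keras callback actually used, and both function names are hypothetical. The scaling assumes that no column is constant.

```python
import numpy as np

def min_max_scale(a):
    """Rescale each column to the [0, 1] interval (columns must not be constant)."""
    lo, hi = a.min(axis=0), a.max(axis=0)
    return (a - lo) / (hi - lo)

def stopping_iteration(rmse_history, min_delta=1e-4, patience=500, max_iter=10_000):
    """Index of the iteration at which training would stop."""
    stalled = 0
    last = min(len(rmse_history) - 1, max_iter)
    for t in range(1, last + 1):
        improvement = rmse_history[t - 1] - rmse_history[t]
        stalled = stalled + 1 if improvement < min_delta else 0
        if stalled >= patience:
            return t
    return last

scaled = min_max_scale(np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]))
stop_flat = stopping_iteration([1.0] * 10, patience=3)              # no improvement
stop_improving = stopping_iteration([1.0, 0.9, 0.8, 0.7], patience=2)
```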

3. Results & Discussions

3.1. Mono-Analyte Tests

Mono-analyte test results presented in Figure 5, Figure 6, Figure 7 and Figure 8 allowed us to answer all the questions raised in Section 2.3 and to validate the potential of the sensor array for predictions. The first point to validate was reproducibility. All the results presented in Figure 5, Figure 6, Figure 7 and Figure 8 were reproduced twice and, in each case, less than 10% difference was observed between the sensor responses, which is quite reasonable for analogue sensors. Then, it was important to check that no major drifts occurred in the sensor responses, i.e., that the base line before and after the introduction of the analytes is at the same level and that the response curve tends towards a horizontal asymptote over time after the gas mixture composition has been changed. This could also be validated from the results shown in Figure 5, Figure 6, Figure 7 and Figure 8. Another very important point was to check that the difference in the sensor array’s responses from one gas to another was sufficient to discriminate the target gases. It can be seen from Figure 5 and Figure 6 that sensor responses to H₂ and CO are relatively close even though the amplitudes of the responses of the different sensors are not comparable. One element that could make it possible to discriminate H₂ and CO when they are in binary mixtures with another gas is the fact that H₂ induces a significant response in the MOX sensor, whereas CO does not. Concerning CH₄, only the MOX sensor responds to this analyte (Figure 7), in contrast to H₂ or CO. Mixtures of H₂ and CH₄ may be harder to identify for low concentrations of CH₄ since the correlation between H₂ concentration and the MOX sensor response is also very strong. For the three gases mentioned in this paragraph, multivariate analysis is required since the following two conditions are not fulfilled for those gases:

-: induce a response in only one sensor;
-: this last sensor should only respond to this analyte.

However, this is the case for CO₂. Indeed, CO₂ induces a response only in the NDIR sensor (Figure 8) and, at the same time, this last sensor does not respond to the other analytes (Figure 5, Figure 6 and Figure 7—curve f). Thus, CO₂ will not require any multivariate analysis to be discriminated. The information from the NDIR sensor will be sufficient. Therefore, in the multivariate analysis performed in the following sections, CO₂ will not be involved.

3.2. Sensor Transfer Function

The sensor transfer function towards each analyte is a very important piece of information to collect to get indications of the type of models that may be the most suitable to perform concentration predictions. Indeed, if most of the sensor responses to the analytes were not linear, attempts to build a linear model to perform prediction would be unsuitable. From Figure 9, it can be noticed that most of the transfer functions are linear for all the considered sensors and analytes. Two exceptions to this statement exist. The first one concerns the response of the catalytic sensor to H₂. In this case, the response is linear until a concentration of 600 ppm, after which there seems to be an asymptotic behavior (saturation of the sensor). The second exception concerns the MOX sensor’s response to CH₄, for which the sensor’s response is purely logarithmic.
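One simple way to decide whether a transfer function is better described as linear or logarithmic is to compare the coefficients of determination of the two fits. The sketch below does this on a synthetic, purely logarithmic response; the concentration values, the response coefficients and the function names are hypothetical and do not come from the measurements.

```python
import numpy as np

def r_squared(y, y_fit):
    """Coefficient of determination of a fit."""
    ss_res = np.sum((y - y_fit) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def compare_fits(conc, response):
    """Return (R2 of a fit linear in conc, R2 of a fit linear in log(conc))."""
    linear_fit = np.polyval(np.polyfit(conc, response, 1), conc)
    log_conc = np.log(conc)
    log_fit = np.polyval(np.polyfit(log_conc, response, 1), log_conc)
    return r_squared(response, linear_fit), r_squared(response, log_fit)

# Synthetic logarithmic response over hypothetical concentrations (ppm, all > 0).
conc = np.array([500.0, 1000.0, 2000.0, 5000.0, 10000.0, 20000.0])
response = 0.8 * np.log(conc) + 0.1
r2_linear, r2_log = compare_fits(conc, response)
```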

Despite these two non-linear transfer functions, multi-linear models remain good candidates to perform prediction of analyte concentrations because of the measurement uncertainty (mainly composed of reproducibility errors) which could be less amplified by linear models.

3.3. Building up Models from Training Data

To build the different models (linear and non-linear), four specific test sequences were each carried out twice. To avoid overloading the article, these test sequences are presented in graphical form in Appendix A. The sensor data from these tests were collected, formatted and used to build the models. In the raw sensor data, the transient parts reflect both the response time of the multi-sensor array and the air renewal dynamics in the large 0.4 L cell volume (which depend on the total flow rate of 30 L/h through the two cells). These transient parts were removed from the data used to build the models, leaving only the stationary parts. The data comprise 14 columns for the 14 sensors (two cells of seven sensors), which constitute the explanatory variables of the different models, plus four columns containing the gas concentrations, which constitute the response variables. These concentrations are accurately known since the LabVIEW program controls them through the flowmeters. From the model parameters estimated on the training data, “modeled” analyte concentrations could be calculated from the sensor responses and compared to the “real” experimental concentrations imposed during the training tests. The accuracy of these concentration predictions is evaluated based on the RMSE (Table 3). The only linear model enabling good concentration predictions on the training data is MLR-PLS. In Figure 10, the analyte concentrations predicted on the training dataset with this model are represented as a function of time and compared to the experimentally imposed H₂, CO, and CH₄ concentrations. Good prediction results are obtained overall for these three analytes. Yet, at the baseline (analyte concentration of 0 ppm), some false positive or negative values are obtained for CO and CH₄. Such false positive or negative values are therefore expected to occur in the predictions performed on the validation tests as well.
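As an illustration of the two steps described above (discarding the transient parts, then scoring the predictions by RMSE), the following Python sketch may help. It is not the processing code used in this work: the function names, the `settle_samples` parameter, and the numerical values are hypothetical, and the rule used to isolate the stationary parts (a fixed number of samples after each setpoint change) is an assumption.

```python
import math

def steady_state_mask(setpoints, settle_samples):
    """Mark samples recorded at least `settle_samples` samples after the
    last change of the imposed concentration (assumed stationarity rule)."""
    mask = []
    since_change = 0
    for i, sp in enumerate(setpoints):
        if i == 0 or sp != setpoints[i - 1]:
            since_change = 0      # a new plateau starts here
        else:
            since_change += 1
        mask.append(since_change >= settle_samples)
    return mask

def rmse(predicted, reference):
    """Root mean square error between two equally long sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical single-gas example (ppm): imposed setpoints and model output.
setpoints = [0, 0, 0, 0, 500, 500, 500, 500]
predicted = [30, 10, -5, 4, 200, 420, 490, 512]

mask = steady_state_mask(setpoints, settle_samples=2)
kept_pred = [p for p, keep in zip(predicted, mask) if keep]
kept_ref = [s for s, keep in zip(setpoints, mask) if keep]
score = rmse(kept_pred, kept_ref)
print(mask, round(score, 2))
```

In practice the settling time would have to be chosen from the observed response time of the slowest sensor and the cell renewal time.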

A series of artificial neural networks based on different sensor combinations as regressors and different numbers of neurons in the hidden layer were trained and validated, as explained in Appendix B: Results of the Neural Networks. While the quality of the prediction of the validation data set for CO generally increases with the complexity of the ANN structures, this is not the case for H₂, where we can only see a decrease in the training RMSE. When it comes to CH₄, the training RMSE sometimes increases as the complexity of the ANN is increased, which can only stem from the optimization method becoming stuck in local minima, since the most complex structures include the less complex ones as special cases.
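The nesting argument used above can be checked numerically. The sketch below is only an illustration under assumptions: it uses a generic single-hidden-layer network with a tanh activation (the activation actually used is not restated here), and all weights are arbitrary hypothetical values. A network with an extra hidden neuron whose output weight is zero reproduces the smaller network exactly, so the best achievable training error of the larger structure cannot be worse than that of the smaller one.

```python
import math

def ann(x, hidden, bias_out):
    """Single-hidden-layer network with a linear output.
    `hidden` is a list of (input_weights, bias, output_weight) tuples."""
    total = bias_out
    for w_in, b, w_out in hidden:
        z = sum(w * xi for w, xi in zip(w_in, x)) + b
        total += w_out * math.tanh(z)   # tanh activation is an assumption
    return total

# Hypothetical small network: one hidden neuron, two inputs.
small = [([0.8, -0.3], 0.1, 2.0)]
# Larger network: same neuron plus one whose output weight is zero.
large = small + [([1.5, 0.7], -0.4, 0.0)]

x = [0.25, 1.1]
y_small = ann(x, small, bias_out=0.5)
y_large = ann(x, large, bias_out=0.5)
print(y_small, y_large)
```

Because the larger parameter space contains this point, a higher training RMSE for the larger structure points to the optimizer rather than to the model class.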

The results of the different models for the training phase can be seen in Table 3. The obtained RMSE are quite comparable with those obtained with the PLS method even if the prediction of H₂ concentration seems a little less accurate for the best ANN model compared to the PLS model. These results are confirmed in Figure 11 in which we can see that the prediction results are quite comparable to those obtained with the PLS method (Figure 10).

3.4. Validation Tests and Comparison of Models

In this section on validation tests, the best linear model and the best non-linear model are selected and used to perform predictions on data that were not used to construct the models. The sensor data used for prediction were collected during exposure of the platform to the gas sequence shown in Figure 12 for H₂, CH₄, and CO, alone or in binary mixtures. Gas concentration prediction results for the PLS and ANN methods are shown in Figure 13 and Figure 14, respectively. The predictions made for the three gases are relatively satisfactory in terms of determining the nature of the gases present. However, there are still some imperfections: over/underestimation of concentrations, and false positives/negatives. Prediction of H₂ and CO concentrations is quite comparable between the PLS and ANN methods, even if H₂ prediction seems slightly better with the PLS method, as confirmed in Table 4. Concerning CH₄, prediction results are better with the ANN method, as can be seen in Figure 13 and Figure 14 and as confirmed by an RMSE of 497 ppm compared to 755 ppm for the PLS method (Table 4). The less effective prediction of CH₄ with the PLS method is consistent with the fact that CH₄ only induces a significant response on the MOX sensor, and that this response is logarithmic. This illustrates the limits of the linear PLS model compared to the ANN, even though the PLS predictions of CH₄ concentration remain consistent.
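To make the link between a logarithmic sensor response and the limits of a linear model more concrete, the following sketch compares two calibrations on purely synthetic, noise-free data. The response law `signal = 0.2 + 0.35 * ln(c)` and its coefficients are hypothetical and are not the measured MOX transfer function; only the concentration range (800 to 10,000 ppm) follows the CH₄ sequence of Figure 7.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def rmse(predicted, reference):
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Synthetic logarithmic response (hypothetical coefficients, no noise).
conc = [800.0, 2000.0, 4000.0, 6000.0, 8000.0, 10000.0]
signal = [0.2 + 0.35 * math.log(c) for c in conc]

# Linear calibration: concentration modelled as a straight line in the signal.
a_lin, b_lin = fit_line(signal, conc)
pred_lin = [a_lin + b_lin * s for s in signal]

# Log-aware calibration: ln(concentration) modelled as a straight line.
a_log, b_log = fit_line(signal, [math.log(c) for c in conc])
pred_log = [math.exp(a_log + b_log * s) for s in signal]

rmse_lin = rmse(pred_lin, conc)
rmse_log = rmse(pred_log, conc)
print(round(rmse_lin, 1), rmse_log)
```

On such data the straight-line calibration leaves a systematic residual that no choice of coefficients can remove, whereas a model able to represent the non-linearity does not. With real, noisy signals the gap would be smaller, since the exponential inversion also amplifies measurement errors.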

3.5. Data Post-Treatment

For the data predicted by the best linear and non-linear models, some problems remain that seem avoidable, particularly when the predicted concentration values are negative or positive while the actual concentrations seen by the sensor network are zero. To overcome these false negatives and false positives, a post-processing algorithm was developed. It relies on knowledge of the sensor signal values when analyte concentrations are zero and, in order to avoid introducing bias, it only modifies concentration predictions in that case. It consists of setting the analyte concentration to 0 when the signals of the different sensors constituting the platform remain under thresholds specific to each sensor. Post-treatment results are presented in Figure 15 and Figure 16 for the MLR-PLS and ANN models, respectively, and in Table 5. The corrections are very effective for the H₂ analyte: both false negatives and false positives could be removed. In the case of the CH₄ analyte, results are very good too, even if the algorithm seems less effective at removing the few false positives obtained. Finally, post-treatment results obtained for the CO analyte are poorer, especially when CH₄ is in a binary mixture with H₂. In the latter case, in spite of the absence of CO, the similarity of the sensor responses to H₂ and CO makes it very difficult to distinguish H₂ from CO, and the predicted CO concentration is not equal to 0.
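The thresholding rule described above can be sketched as follows. This is a minimal illustration and not the algorithm as implemented in this work: the function name `post_treat`, the sensor keys, the threshold values, and the example signals are all hypothetical, and the sketch assumes that every sensor signal rises above its baseline when gas is present.

```python
def post_treat(predictions, signals, thresholds):
    """Set all predicted concentrations of a sample to 0 when every
    sensor signal stays under its own baseline threshold.

    predictions: list of dicts {gas: ppm}
    signals:     list of dicts {sensor: volts}
    thresholds:  dict {sensor: volts} (hypothetical values)
    """
    treated = []
    for pred, sig in zip(predictions, signals):
        at_baseline = all(sig[name] < thresholds[name] for name in thresholds)
        if at_baseline:
            treated.append({gas: 0.0 for gas in pred})
        else:
            treated.append(dict(pred))   # left untouched to avoid bias
    return treated

thresholds = {"EC-H2": 0.05, "EC-CO": 0.05, "MOX": 0.30}
signals = [
    {"EC-H2": 0.01, "EC-CO": 0.02, "MOX": 0.25},   # no gas present
    {"EC-H2": 0.40, "EC-CO": 0.03, "MOX": 0.28},   # H2 present
]
predictions = [
    {"H2": -12.0, "CO": 4.0, "CH4": 35.0},
    {"H2": 480.0, "CO": 6.0, "CH4": -20.0},
]
treated = post_treat(predictions, signals, thresholds)
print(treated)
```

In the second sample the small spurious CO and CH₄ values are left unchanged because one sensor is above its threshold, which mirrors the observation that false positives are harder to remove when another gas is present.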

3.6. Ageing of Sensors

Next, the multi-sensor platforms were maintained for a year under electrical power, at atmospheric pressure, and in air. At the end of the year, the gas sequence to which the sensors had been exposed for the prediction testing a year earlier was used again, to check the relevance of the prediction models. Figure 17 shows a comparison between sensor responses before ageing and after a one-year ageing period when subjected to the gas sequence shown in Figure 17a. For the EC-H₂ (Figure 17b) and EC-CO (Figure 17c) sensors, an overall decrease in sensitivity is observed. This is more pronounced for the EC-H₂ sensor than for the EC-CO sensor. The same trend is observed for the catalytic sensor (Figure 17d), except at higher H₂ concentrations, where the response of the aged sensor is greater. Finally, for the MOX sensor (Figure 17e), a greater sensitivity is observed for the sensor after one year of ageing as long as H₂ is present in the gas mixture.

Based on the sensor platform’s responses and applying the MLR-PLS and ANN models previously developed (without post-treatment), predictions of the concentrations of H₂, CO, and CH₄ were performed. Concerning the predictions made using the ANN model (Figure 18), different conclusions can be drawn depending on the gas considered. For H₂, a global underestimation of the concentration is observed. When the H₂ concentration seen by the sensors is too low, the predicted H₂ concentration becomes 0. Concerning CH₄, it seems that when this analyte is present alone in the base gas mixture, the predicted CH₄ concentration is close to 0. However, as long as H₂ is present in sufficiently high concentration (even without the presence of CH₄), the predicted CH₄ concentration rises. This indicates that, when trying to predict the CH₄ concentration, the ANN model is actually more representative of the H₂ concentration in the mixture. Finally, the best prediction results of the ANN on aged sensors are obtained for CO. The concentration evolution is globally well reproduced. However, false positives in CO concentration predictions can be observed when H₂ is present in a mixture that does not contain CO.
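One simple way to quantify the kind of systematic deviation described above is to regress the predicted concentrations on the imposed ones: a slope (gain) below 1 indicates a loss of sensitivity, while a non-zero intercept indicates a prediction offset. The sketch below only illustrates this diagnostic; it is not an analysis performed in this work, and the two prediction series are synthetic values chosen to show each behaviour in isolation.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Imposed concentrations (ppm) and two synthetic prediction series.
true_ppm = [0.0, 200.0, 400.0, 600.0, 800.0, 1000.0]
pred_low_gain = [0.6 * c for c in true_ppm]      # pure sensitivity loss
pred_offset = [c + 150.0 for c in true_ppm]      # pure offset

offset_a, gain_a = fit_line(true_ppm, pred_low_gain)
offset_b, gain_b = fit_line(true_ppm, pred_offset)
print(round(gain_a, 3), round(offset_a, 3))
print(round(gain_b, 3), round(offset_b, 3))
```

Real prediction curves would generally combine both effects, together with cross-sensitivity terms that a single-gas regression of this kind does not capture.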

Concerning the predictions made using the MLR-PLS model (Figure 19), a prediction offset is observed for each of the gases. For H₂ and CO, a positive offset is observed, while a negative prediction offset is observed for CH₄. Even if the amplitude of the predicted response seems correlated with the experimental gas concentration, the presence of this offset completely ruins prediction quality for the different gases. Application of the post-treatment algorithm described earlier is not effective in this case because of the large offsets observed. However, these offsets could easily be removed by periodic zero calibration of the platform. In Figure 20, prediction results including offset compensation are shown. Offset-compensated H₂ predictions reveal a global underestimation of the H₂ concentration. In the case of CH₄, the concentration is overestimated, and false positives are also present when H₂ is present without CH₄. For the PLS model, the best (offset-compensated) prediction results are obtained for CO, for which concentrations are quite accurately predicted. A false positive CO concentration prediction is nevertheless obtained when a high concentration of H₂ is present (without CO) in the gas mixture.
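The zero calibration mentioned above can be sketched in a few lines. This is a hedged illustration rather than the exact compensation procedure used for Figure 20: the function names and the numbers are hypothetical, and the offset is assumed to be estimated as the mean prediction over samples known to contain no analyte.

```python
def zero_offset(predictions, is_zero_gas):
    """Mean predicted concentration over samples known to be analyte-free."""
    baseline = [p for p, zero in zip(predictions, is_zero_gas) if zero]
    if not baseline:
        raise ValueError("no zero-gas sample available for calibration")
    return sum(baseline) / len(baseline)

def compensate(predictions, offset):
    """Subtract a constant offset from every prediction."""
    return [p - offset for p in predictions]

# Hypothetical aged-platform predictions for one gas (ppm).
predictions = [150.0, 160.0, 140.0, 650.0, 1150.0]
is_zero_gas = [True, True, True, False, False]

offset = zero_offset(predictions, is_zero_gas)
compensated = compensate(predictions, offset)
print(offset, compensated)
```

A constant subtraction of this kind corrects the baseline only; it leaves any change of sensitivity untouched, which is consistent with the residual under- and overestimations reported after compensation.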

4. Conclusions

In a context of process developments linked to Power to X, the first goal of this work was to select commercial sensors to build a multi-sensor platform able to detect H₂, CO, CH₄, and CO₂ concentrations when gases are alone or in binary mixtures while respecting specifications linked to the applications. With the sensors selected, the processing chain enabling signal treatment and digitalization was developed. Finally, the main task was to build linear (MLR-OLS and MLR-PLS) and non-linear (ANN) models capable of detecting gases in binary mixtures. The first step was training. This allowed us to train the models by estimating their coefficients and then evaluate them by their ability to reproduce the data they were trained with. This step allowed us to disqualify the MLR-OLS model, which was clearly not able to predict analyte concentrations based on the sensor signals. Another set of experimental sensor data was then used to compare the prediction performances of MLR-PLS and ANN in the case of “fresh sensor data”, i.e., data not used to build up the models. The main result was that the gas concentration predictions were quite comparable for H₂ and CO (slightly better for H₂ concentration predictions with the PLS method) for linear and non-linear methods and better for ANN in the case of CH₄ concentration predictions, which is in accordance with the fact that the MOX sensor’s response to CH₄ is logarithmic. To improve the sensor prediction performances, a post-treatment algorithm was developed to correct for the case where predictions should give 0 ppm analyte concentration. This helped to improve the RMSE, especially in the case of CH₄.

Finally, prediction tests were performed on a sensing platform that had been aged for one year. Due to the evolution of the sensors’ responses, the quality of the predictions performed by the ANN and PLS models greatly deteriorated. While ANN predictions suffer from high underestimation of H₂ and CH₄ concentrations (predictions for CO concentrations being correct), the PLS model suffers from big prediction offsets. After compensation of the offset by calibration, the quality of the prediction by PLS becomes much better than ANN, even if global underestimation of H₂ concentrations and the presence of false positives in the prediction curves of CO and CH₄ reduce prediction quality compared to the unaged sensing platform case.

In future work on the subject, our first goal will be to test sensor signals at different stages of ageing and to propose an ageing model, which would act as a second layer added to the already developed linear and non-linear models to compensate for the effects of sensor ageing.

Our second goal will be the determination of more complex gas mixture compositions, such as ternary and quaternary mixture compositions, based on sensor signals. This will require the development of new models and possibly the addition of new sensors to the sensor array used in this work to enrich the information provided by the latter.

Author Contributions

Conceptualization, R.L.; methodology and investigation, R.L., M.F. and Q.D.-B.; validation, R.L., M.F., M.R., R.A. and J.-P.V.; data curation, R.L., M.F. and Q.D.-B.; writing—original draft preparation, R.L.; writing—review and editing, R.L., M.F. and M.R.; supervision, R.L., M.R. and J.-P.V. All authors have read and agreed to the published version of the manuscript.

Funding

Carnot M.I.N.E.S supported the HyTREND research project (Hydrogen for a low-carbon energy transition) with funding from the French Agence Nationale de la Recherche (ANR) for a 3-year period. The total amount received by the SPIN center is 20 k€.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Gas Sequences Used during Training

Figure A1. Gas sequences used during the training tests.

Appendix B. Results of the Neural Networks

The results of the training and validation of the artificial neural networks can be seen in Table A1, Table A2 and Table A3 for H₂, CO, and CH₄, respectively.
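For readers of the tables below, the following sketch shows how an RMSE and a coefficient of determination can be computed for one data set. It assumes that the columns rmset/r2t and rmsev/r2v refer to the training and validation sets, respectively, and that R² follows the usual definition 1 − SS_res/SS_tot; the exact formula used to produce the tables is not stated here, and the numbers in the example are hypothetical. Under this definition R² becomes negative when a model predicts worse than the mean of the reference values, which would be consistent with the few negative entries in Table A3.

```python
import math

def rmse(predicted, reference):
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(predicted, reference):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for p, r in zip(predicted, reference))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

# Hypothetical reference concentrations (ppm) and two prediction series.
reference = [0.0, 1000.0, 2000.0, 3000.0]
pred_good = [100.0, 900.0, 2100.0, 2900.0]
pred_bad = [3000.0, 3000.0, 3000.0, 3000.0]   # constant, far from the mean

r2_good = r_squared(pred_good, reference)
r2_bad = r_squared(pred_bad, reference)
print(rmse(pred_good, reference), r2_good, r2_bad)
```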

Table A1. Model performance (H2, training—one validation file).

Regressors	hnn	rmset	r2t	rmsev	r2v
H2(V)	0	173.50	0.92	246.00	0.77
H2(V)	1	154.21	0.94	237.45	0.78
H2(V)	2	153.31	0.94	238.42	0.78
H2(V)	3	152.48	0.94	239.18	0.78
H2(V)	4	153.24	0.94	237.92	0.78
H2 (V), CATA (V), MOX (V)	0	174.12	0.92	246.97	0.76
H2 (V), CATA (V), MOX (V)	1	149.66	0.94	228.62	0.80
H2 (V), CATA (V), MOX (V)	2	134.30	0.95	218.96	0.81
H2 (V), CATA (V), MOX (V)	3	127.45	0.96	217.53	0.82
H2 (V), CATA (V), MOX (V)	4	115.21	0.97	210.05	0.83
H2 (V), CATA (V), MOX (V), CO (V)	0	120.76	0.96	230.07	0.80
H2 (V), CATA (V), MOX (V), CO (V)	1	105.49	0.97	226.49	0.80
H2 (V), CATA (V), MOX (V), CO (V)	2	102.05	0.97	223.08	0.81
H2 (V), CATA (V), MOX (V), CO (V)	3	102.50	0.97	227.69	0.80
H2 (V), CATA (V), MOX (V), CO (V)	4	102.25	0.97	224.73	0.80
H2 2 (V)	0	420.11	0.54	205.20	0.84
H2 2 (V)	1	400.72	0.58	208.63	0.83
H2 2 (V)	2	395.83	0.59	214.47	0.82
H2 2 (V)	3	396.50	0.59	216.22	0.82
H2 2 (V)	4	394.87	0.60	211.07	0.83
H2 2 (V), CATA 2 (V), MOX 2 (V)	0	283.06	0.79	213.99	0.82
H2 2 (V), CATA 2 (V), MOX 2 (V)	1	281.79	0.79	226.30	0.80
H2 2 (V), CATA 2 (V), MOX 2 (V)	2	247.85	0.84	264.74	0.73
H2 2 (V), CATA 2 (V), MOX 2 (V)	3	176.59	0.92	206.41	0.84
H2 2 (V), CATA 2 (V), MOX 2 (V)	4	162.69	0.93	231.02	0.79
H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	0	248.21	0.84	262.64	0.73
H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	1	244.35	0.85	248.91	0.76
H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	2	239.22	0.85	253.51	0.75
H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	3	158.88	0.93	228.72	0.80
H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	4	155.62	0.94	253.60	0.75
H2 (V), H2 2 (V)	0	172.93	0.92	242.12	0.77
H2 (V), H2 2 (V)	1	153.97	0.94	237.42	0.78
H2 (V), H2 2 (V)	2	148.86	0.94	243.59	0.77
H2 (V), H2 2 (V)	3	148.91	0.94	243.34	0.77
H2 (V), H2 2 (V)	4	149.08	0.94	240.26	0.78
H2 (V), CATA (V), MOX (V), H2 2 (V), CATA 2 (V), MOX 2 (V)	0	152.32	0.94	219.50	0.81
H2 (V), CATA (V), MOX (V), H2 2 (V), CATA 2 (V), MOX 2 (V)	1	134.75	0.95	206.36	0.84
H2 (V), CATA (V), MOX (V), H2 2 (V), CATA 2 (V), MOX 2 (V)	2	126.60	0.96	224.33	0.81
H2 (V), CATA (V), MOX (V), H2 2 (V), CATA 2 (V), MOX 2 (V)	3	125.43	0.96	219.67	0.81
H2 (V), CATA (V), MOX (V), H2 2 (V), CATA 2 (V), MOX 2 (V)	4	108.90	0.97	208.77	0.83
H2 (V), CATA (V), MOX (V), CO (V), H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	0	119.35	0.96	222.67	0.81
H2 (V), CATA (V), MOX (V), CO (V), H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	1	102.79	0.97	207.34	0.83
H2 (V), CATA (V), MOX (V), CO (V), H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	2	100.22	0.97	216.92	0.82
H2 (V), CATA (V), MOX (V), CO (V), H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	3	96.87	0.98	210.69	0.83
H2 (V), CATA (V), MOX (V), CO (V), H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	4	95.22	0.98	227.32	0.80

For H₂ mole fraction prediction, we can see that increasing the complexity of the ANN, in terms of the number of regressors and of neurons in the hidden layer, strongly improves the fit to the training data set. It thus decreases RMSEt, but it generally fails to decrease the RMSE of the validation data set (RMSEv), which indicates that while the ANN becomes increasingly able to reproduce the training data, it is not able to generalize this learning into better predictions of the validation measurements. The best RMSEv is reached for H₂ 2(V) with no hidden neuron (205.20), but the training RMSEt is very large (420.11), which makes it likely that the model has not properly learned the relationship between sensor signal and mole fraction. Consequently, we decided to consider H₂(V), CATA(V), MOX(V), CO(V), H₂ 2(V), CATA 2(V), MOX 2(V), CO 2(V) with 1 neuron in the hidden layer as the best model, as its RMSEv = 207.34 is only slightly larger than RMSEv = 205.20 whereas its training RMSEt = 102.79 is much better.
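The trade-off described above can be expressed as an explicit selection rule. The sketch below is one possible formalization and not the procedure followed in this work: among the models whose validation RMSE lies within a tolerance of the best one, it keeps the model with the lowest training RMSE. The rule itself, the 2% tolerance, and the shortened regressor labels are assumptions; the RMSE values are copied from a subset of rows of Table A1.

```python
def select_model(candidates, tolerance):
    """candidates: list of (label, hidden_neurons, rmse_train, rmse_valid).
    Keep models within `tolerance` (relative) of the best validation RMSE,
    then return the one with the lowest training RMSE."""
    best_valid = min(c[3] for c in candidates)
    shortlist = [c for c in candidates if c[3] <= best_valid * (1 + tolerance)]
    return min(shortlist, key=lambda c: c[2])

# Subset of Table A1 rows (labels abbreviated for readability).
candidates = [
    ("H2 2", 0, 420.11, 205.20),
    ("H2 2, CATA 2, MOX 2", 3, 176.59, 206.41),
    ("H2, CATA, MOX", 4, 115.21, 210.05),
    ("6 regressors (no CO)", 1, 134.75, 206.36),
    ("6 regressors (no CO)", 4, 108.90, 208.77),
    ("all 8 regressors", 1, 102.79, 207.34),
    ("all 8 regressors", 3, 96.87, 210.69),
]

chosen = select_model(candidates, tolerance=0.02)
print(chosen)
```

With this tolerance the rule returns the eight-regressor model with one hidden neuron, in agreement with the choice made above, but the outcome depends on the tolerance: a wider band would admit models with a lower training RMSE and a slightly higher validation RMSE.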

Table A2. Model performance (CO, training—one validation file).

Regressors	hnn	rmset	r2t	rmsev	r2v
CO (V)	0	87.34	0.65	59.34	0.27
CO (V)	1	84.93	0.67	59.05	0.28
CO (V)	2	80.64	0.70	56.61	0.34
CO (V)	3	80.42	0.70	56.62	0.34
CO (V)	4	80.46	0.70	56.68	0.33
CO (V), CATA (V)	0	85.68	0.66	57.90	0.31
CO (V), CATA (V)	1	84.16	0.67	57.50	0.31
CO (V), CATA (V)	2	80.35	0.70	55.01	0.37
CO (V), CATA (V)	3	77.86	0.72	50.10	0.48
CO (V), CATA (V)	4	71.12	0.77	55.98	0.35
CO (V), CATA (V), MOX (V), H2 (V)	0	46.68	0.90	35.41	0.74
CO (V), CATA (V), MOX (V), H2 (V)	1	41.51	0.92	25.36	0.87
CO (V), CATA (V), MOX (V), H2 (V)	2	40.67	0.92	29.61	0.82
CO (V), CATA (V), MOX (V), H2 (V)	3	40.42	0.92	25.85	0.86
CO (V), CATA (V), MOX (V), H2 (V)	4	40.43	0.92	26.54	0.85
CO 2 (V)	0	124.69	0.28	67.96	0.04
CO 2 (V)	1	124.76	0.28	67.75	0.05
CO 2 (V)	2	124.00	0.29	67.53	0.05
CO 2 (V)	3	124.05	0.29	67.54	0.05
CO 2 (V)	4	124.92	0.28	68.17	0.04
CO 2 (V), CATA 2 (V)	0	111.85	0.42	69.25	0.01
CO 2 (V), CATA 2 (V)	1	108.15	0.46	65.93	0.10
CO 2 (V), CATA 2 (V)	2	104.75	0.49	61.24	0.22
CO 2 (V), CATA 2 (V)	3	101.51	0.53	50.95	0.46
CO 2 (V), CATA 2 (V)	4	99.49	0.54	48.70	0.51
CO 2 (V), CATA 2 (V), MOX 2 (V), H2 2 (V)	0	66.60	0.80	49.14	0.50
CO 2 (V), CATA 2 (V), MOX 2 (V), H2 2 (V)	1	62.62	0.82	50.30	0.48
CO 2 (V), CATA 2 (V), MOX 2 (V), H2 2 (V)	2	50.91	0.88	30.69	0.80
CO 2 (V), CATA 2 (V), MOX 2 (V), H2 2 (V)	3	54.04	0.87	28.44	0.83
CO 2 (V), CATA 2 (V), MOX 2 (V), H2 2 (V)	4	64.00	0.81	38.77	0.69
CO (V), CO 2 (V)	0	87.78	0.65	61.68	0.21
CO (V), CO 2 (V)	1	85.00	0.67	59.32	0.27
CO (V), CO 2 (V)	2	80.29	0.70	57.08	0.32
CO (V), CO 2 (V)	3	80.35	0.70	57.63	0.31
CO (V), CO 2 (V)	4	80.27	0.70	56.95	0.33
CO (V), CATA (V), CO 2 (V), CATA 2 (V)	0	79.55	0.71	44.80	0.58
CO (V), CATA (V), CO 2 (V), CATA 2 (V)	1	77.22	0.73	43.11	0.61
CO (V), CATA (V), CO 2 (V), CATA 2 (V)	2	72.58	0.76	45.15	0.58
CO (V), CATA (V), CO 2 (V), CATA 2 (V)	3	70.14	0.77	49.03	0.50
CO (V), CATA (V), CO 2 (V), CATA 2 (V)	4	72.33	0.76	45.31	0.57
CO (V), CATA (V), MOX (V), H2 (V), CO 2 (V), CATA 2 (V), MOX 2 (V), H2 2 (V)	0	38.63	0.93	24.08	0.88
CO (V), CATA (V), MOX (V), H2 (V), CO 2 (V), CATA 2 (V), MOX 2 (V), H2 2 (V)	1	37.44	0.94	21.94	0.90
CO (V), CATA (V), MOX (V), H2 (V), CO 2 (V), CATA 2 (V), MOX 2 (V), H2 2 (V)	2	36.82	0.94	20.89	0.91
CO (V), CATA (V), MOX (V), H2 (V), CO 2 (V), CATA 2 (V), MOX 2 (V), H2 2 (V)	3	34.67	0.94	21.60	0.90
CO (V), CATA (V), MOX (V), H2 (V), CO 2 (V), CATA 2 (V), MOX 2 (V), H2 2 (V)	4	34.38	0.95	19.27	0.92

As for CO mole fraction prediction, we can generally see that increasing the complexity of the neural network architecture lowers both the training and validation RMSE, which indicates that overfitting is not a significant issue.

Table A3. Model performance (CH4, training—one validation file).

Regressors	hnn	rmset	r2t	rmsev	r2v
MOX (V)	0	2017.52	0.03	1929.53	0.01
MOX (V)	1	2012.91	0.04	1907.28	0.03
MOX (V)	2	2013.47	0.04	1908.39	0.03
MOX (V)	3	2012.85	0.04	1907.44	0.03
MOX (V)	4	2012.65	0.04	1906.95	0.03
H2 (V), CATA (V), MOX (V), CO (V)	0	2064.63	−0.01	2034.40	−0.11
H2 (V), CATA (V), MOX (V), CO (V)	1	2013.41	0.04	1904.95	0.03
H2 (V), CATA (V), MOX (V), CO (V)	2	2019.50	0.03	1912.46	0.02
H2 (V), CATA (V), MOX (V), CO (V)	3	2002.79	0.05	1888.72	0.05
H2 (V), CATA (V), MOX (V), CO (V)	4	1993.49	0.05	1885.03	0.05
MOX 2 (V)	0	2109.53	−0.06	2072.43	−0.15
MOX 2 (V)	1	1041.14	0.74	951.45	0.76
MOX 2 (V)	2	1053.46	0.74	946.47	0.76
MOX 2 (V)	3	1309.52	0.59	1000.08	0.73
MOX 2 (V)	4	1343.30	0.57	1027.83	0.72
H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	0	634.05	0.90	504.01	0.93
H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	1	2033.96	0.02	1924.30	0.01
H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	2	775.57	0.86	613.21	0.90
H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	3	956.25	0.78	687.63	0.87
H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	4	814.85	0.84	675.74	0.88
MOX (V), MOX 2 (V)	0	656.50	0.90	536.24	0.92
MOX (V), MOX 2 (V)	1	2003.38	0.05	1892.26	0.04
MOX (V), MOX 2 (V)	2	803.81	0.85	549.04	0.92
MOX (V), MOX 2 (V)	3	1033.47	0.75	718.90	0.86
MOX (V), MOX 2 (V)	4	1045.05	0.74	758.92	0.85
H2 (V), CATA (V), MOX (V), CO (V), H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	0	613.07	0.91	511.26	0.93
H2 (V), CATA (V), MOX (V), CO (V), H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	1	670.82	0.89	497.27	0.93
H2 (V), CATA (V), MOX (V), CO (V), H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	2	751.48	0.87	725.78	0.86
H2 (V), CATA (V), MOX (V), CO (V), H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	3	659.12	0.90	596.29	0.91
H2 (V), CATA (V), MOX (V), CO (V), H2 2 (V), CATA 2 (V), MOX 2 (V), CO 2 (V)	4	781.68	0.85	724.44	0.86

When it comes to the prediction of CH₄ mole fraction, we can see that the training RMSE sometimes increases as the number of hidden neurons or regressors is increased. Since the convergence of RMSEt was visually verified, this can only stem from the fact that the optimization method became stuck in local minima, as the global minima must be either increasingly better or at least as good as those of the simpler models which are specific cases of the more complex models.

References

Shiva Kumar, S.; Lim, H. An overview of water electrolysis technologies for green hydrogen production. Energy Rep. 2022, 8, 13793–13813.
Ashok, J.; Pati, S.; Hongmanorom, P.; Tianxi, Z.; Junmei, C.; Kawi, S. A review of recent catalyst advances in CO2 methanation processes. Catal. Today 2020, 356, 471–489.
Torres, L.F.; Damascena, M.A.; Alves, M.M.; Santos, K.S.; Franceschi, E.; Dariva, C.; Barros, V.A.; Melo, D.C.; Borges, G.R. Use of near-infrared spectroscopy for the online monitoring of natural gas composition (hydrocarbons, water and CO2 content) at high pressure. Vib. Spectrosc. 2024, 131, 103653.
Sherstov, I.; Vasiliev, V. Highly sensitive Laser Photo-Acoustic SF6 Gas Analyzer with 10 decades dynamic range of concentration measurement. Infrared Phys. Technol. 2021, 119, 103922.
Liu, A.; Yi, J.; Ding, X.; Deng, J.; Wu, D.; Huo, Y.; Jiang, J.; Li, Q.; Chen, J. An online technology for effectively monitoring inorganic condensable particulate matter emitted from industrial plants. J. Hazard. Mater. 2022, 428, 128221.
Sinha, M.; Neogi, S.; Ghosh, R. Temperature dependent selectivity switching from methanol to formaldehyde using ZnO nanorod based chemi-resistive sensor. Sens. Actuators A Phys. 2023, 357, 114405.
Wook Noh, H.; Jang, Y.; Dol Park, H.; Kim, D.; Hun Choi, J.; Ahn, C.-G. A selective feature optimized multi-sensor based e-nose system detecting illegal drugs validated in diverse laboratory conditions. Sens. Actuators B Chem. 2023, 390, 133965.
Zhan, K.; Qin, P.; Jiang, Y.; Chen, Y.; Heinke, L. Optical sensor array of metal-organic-framework-based inverse opal films for the detection and identification of various alcohols. Sens. Actuators B Chem. 2023, 393, 134271.
Lu, Y.; Zhang, X.; Huang, Y. Chemiluminescence sensor array for oxidase discrimination based on ternary transition metal sulfide. Sens. Actuators B Chem. 2023, 390, 134003.
Tang, Y.; Xu, X.; Du, H.; Zhu, H.; Li, D.; Ao, D.; Guo, Y.; Fu, Y.; Zu, X. Cellulose nano-crystals as a sensitive and selective layer for high performance surface acoustic wave HCl gas sensors. Sens. Actuators A Phys. 2020, 301, 111792.
Hsiao, S.-H.; Wu, J.-X.; Chen, H.-I. High-selectivity NOx sensors based on an Au/InGaP Schottky diode functionalized with self-assembled monolayer of alkanedithiols. Sens. Actuators B Chem. 2020, 305, 127269.
Pushpanjali, P.; Manjunatha, J.; Srinivas, M. Highly sensitive platform utilizing poly(l-methionine) layered carbon nanotube paste sensor for the determination of voltaren. FlatChem 2020, 24, 100207.
Staszek, K.; Szkudlarek, A.; Kawa, M.; Rydosz, A. Microwave system with sensor utilizing GO-based gas-sensitive layer and its application to acetone detection. Sens. Actuators B Chem. 2019, 297, 126699.
Gao, J.; Viricelle, J.-P.; Pijolat, C.; Breuil, P.; Vernoux, P.; Boreave, A.; Giroir-Fendler, A. Improvement of the NOx selectivity for a planar YSZ sensor. Sens. Actuators B Chem. 2011, 154, 106–110.
Alsaedi, B.S.; McGraw, C.M.; Schaerf, T.M.; Dillingham, P.W. Multivariate limit of detection for non-linear sensor arrays. Chemom. Intell. Lab. Syst. 2020, 201, 104016.
Ferro, L.M.; Lemos, S.G.; Ferreira, M.; Trivinho-Strixino, F. Use of multivariate analysis on Fabry-Pérot interference spectra of nanoporous anodic alumina (NAA) for optical sensors purposes. Sens. Actuators B Chem. 2017, 248, 718–723.
Song, L.; Yang, L.; Wang, Z.; Liu, D.; Luo, L.; Zhu, X.; Xi, Y.; Yang, Z.; Han, N.; Wang, F.; et al. One-step electrospun SnO2/MOx heterostructured nanomaterials for highly selective gas sensor array integration. Sens. Actuators B Chem. 2019, 283, 793–801.
Yousefi-Darani, A.; Babor, M.; Paquet-Durand, O.; Hitzmann, B. Model-based calibration of a gas sensor array for on-line monitoring of ethanol concentration in Saccharomyces cerevisiae batch cultivation. Biosyst. Eng. 2020, 198, 198–209.
Chu, J.; Li, W.; Yang, X.; Wu, Y.; Wang, D.; Yang, A.; Yuan, H.; Wang, X.; Li, Y.; Rong, M. Identification of gas mixtures via sensor array combining with neural networks. Sens. Actuators B Chem. 2021, 329, 129090.
Singh, V. Quantum dot decorated multi-walled carbon nanotube modified electrochemical sensor array for single drop insulin detection. Mater. Lett. 2019, 254, 415–418.
Gornall, D.D.; Collyer, S.D.; Higson, S.P.J. Investigations into the use of screen-printed carbon electrodes as templates for electrochemical sensors and sonochemically fabricated microelectrode arrays. Sens. Actuators B Chem. 2009, 141, 581–591.
Yu, H.; Wang, J.; Xiao, H.; Liu, M. Quality grade identification of green tea using the eigenvalues of PCA based on the E-nose signals. Sens. Actuators B Chem. 2009, 140, 378–382.
Lin, C.; Gillespie, J.; Schuder; Duberstein, W.; Beverland, I.; Heal, M. Evaluation and calibration of Aeroqual series 500 portable gas sensors for accurate measurement of ambient ozone and nitrogen dioxide. Atmosp. Environ. 2015, 100, 111–116.
Alonso, M.J.; Madsen, H.; Liu, P.; Jørgensen, R.B.; Jørgensen, T.B.; Christiansen, E.J.; Myrvang, O.A.; Bastien, D.; Mathisen, H.M. Evaluation of low-cost formaldehyde sensors calibration. J. Affect. Disord. 2022, 222, 109380.
Liang, H.; Liu, G. Research on quantitative analysis method of PLS hydrocarbon gas infrared spectroscopy based on net signal analysis and density peak clustering. Measurement 2022, 188, 110392.
Karami, H.; Rasekh, M.; Mirzaee-Ghaleh, E. Qualitative analysis of edible oil oxidation using an olfactory machine. J. Food Meas. Charact. 2020, 14, 2600–2610.
Karami, H.; Kamruzzaman, M.; Covington, J.A.; Hassouna, M.; Darvishi, Y.; Ueland, M.; Fuentes, S.; Gancarz, M. Advanced evaluation techniques: Gas sensor networks, machine learning, and chemometrics for fraud detection in plant and animal products. Sens. Actuators A Phys. 2024, 370, 115192.
Swirszcz, G.; Czarnecki, W.M.; Pascanu, R. Local minima in training of neural networks. arXiv 2016, arXiv:1611.06310.
Haugen, J.-E.; Tomic, O.; Kvaal, K. A calibration method for handling the temporal drift of solid state gas-sensors. Anal. Chim. Acta 2000, 407, 23–39.
Capone, S.; Epifani, M.; Francioso, L.; Kaciulis, S.; Mezzi, A.; Siciliano, P.; Taurino, A.M. Influence of electrodes ageing on the properties of the gas sensors based on SnO2. Sens. Actuators B Chem. 2006, 115, 396–402.
Carvalho, C.; Nechio, F.; Tristao, T. Taylor rule estimation by OLS. J. Monet. Econ. 2021, 124, 140–154.
Sundberg, R. Small-sample and selection bias effects in multivariate calibration, exemplified for OLS and PLS regressions. Chemom. Intell. Lab. Syst. 2006, 84, 21–25.
Yamashita, G.H.; Anzanello, M.J.; Soares, F.; Rocha, M.K.; Fogliatto, F.S. Selecting relevant wavelength intervals for PLS calibration based on absorbance interquartile ranges. Chemom. Intell. Lab. Syst. 2022, 231, 104689.
Aguilera, A.M.; Escabias, M.; Preda, C.; Saporta, G. Using basis expansions for estimating functional PLS regression: Applications with chemometric data. Chemom. Intell. Lab. Syst. 2010, 104, 289–305.
Abdi, H. Partial least squares regression and projection on latent structure regression (PLS Regression). Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 97–106.
Hashem, S.; Keller, P.E.; Kouzes, R.T.; Kangas, L.J. Neural Network based data analysis for chemical sensor arrays. In Proceedings of the Applications and Science of Artificial Neural Networks of Symposium SPIE 2492, Orlando, FL, USA, 6 April 1995.
Qian, N. On the momentum term in gradient descent learning algorithms. Neural Netw. 1999, 12, 145–151.

Figure 1. (a) Schematic of the gas bench. (b) Cell containing the multi-sensor platform.

Figure 2. Sensors and the signal treatment chain.

Figure 3. Example of gas composition sequence and gas sensor response.

Figure 4. Example architecture of the artificial neural network (ANN) used for H₂ concentration prediction.

Figure 5. (a) Gas sequence with H₂ concentrations between 100 ppm and 1000 ppm. (b) EC-H₂ sensor signal. (c) EC-CO sensor signal. (d) MOX sensor signal. (e) Catalytic sensor signal. (f) CO₂ sensor signal.

Figure 6. (a) Gas sequence with CO concentrations between 15 ppm and 300 ppm. (b) EC-H₂ sensor signal. (c) EC-CO sensor signal. (d) MOX sensor signal. (e) Catalytic sensor signal. (f) CO₂ sensor signal.

Figure 7. (a) Gas sequence with CH₄ concentrations between 800 ppm and 10,000 ppm. (b) EC-H₂ sensor signal. (c) EC-CO sensor signal. (d) MOX sensor signal. (e) Catalytic sensor signal. (f) CO₂ sensor signal.

Figure 8. (a) Gas sequence with CO₂ concentrations between 1000 ppm and 30,000 ppm. (b) EC-H₂ sensor signal. (c) EC-CO sensor signal. (d) MOX sensor signal. (e) Catalytic sensor signal. (f) CO₂ sensor signal.

Figure 9. Transfer function of the sensors submitted to different concentrations of: (a) H₂, (b) CO, (c) CH₄.

Figure 10. (a) H₂, (b) CO, (c) CH₄ concentration predictions based on PLS modelling with training data.

Figure 11. (a) H₂, (b) CO, (c) CH₄ concentration predictions based on ANN modelling with training data.

Figure 12. Sequence used to predict H₂/CH₄/CO analyte concentrations alone or in binary mixtures.

Figure 13. Prediction of concentrations of: (a) H₂, (b) CH₄, (c) CO by the MLR-PLS method.

Figure 14. Prediction of concentrations of: (a) H₂, (b) CH₄, (c) CO by the ANN method.

Figure 15. Prediction results before (_Raw) and after (_PT) application of the post-treatment algorithm for PLS prediction curves: (a) H₂ prediction results, (b) CH₄ prediction results, (c) CO prediction results.

Figure 16. Prediction results before (_Raw) and after (_PT) application of the post-treatment algorithm for ANN prediction curves: (a) H₂ prediction results, (b) CH₄ prediction results, (c) CO prediction results.

Figure 17. Comparison between the sensor responses before ageing and after one year of ageing: (a) Gas sequence used for the test, (b) Electrochemical EC-H₂ sensor, (c) Electrochemical EC-CO sensor, (d) Catalytic CATA sensor, (e) Metal-oxide MOX sensor.

Figure 18. Gas concentration predictions (_Pred in the legend) on aged sensor platform for: (a) H₂, (b) CH₄, (c) CO using the previously developed ANN model; comparison to experimental concentrations (_Exp in the legend).

Figure 19. Gas concentration predictions (_Pred in the legend) on aged sensor platform for: (a) H₂, (b) CH₄, (c) CO using the previously developed MLR-PLS model—comparison to experimental concentrations (_Exp in the legend).

Figure 20. Offset-corrected gas concentration predictions (_offset shift in the legend) on aged sensor platform for: (a) H₂, (b) CH₄, (c) CO using the previously developed MLR-PLS model–comparison to experimental concentrations (_Exp in the legend).

Table 1. Commercially available sensor technologies meeting our specifications, by target gas (X: only a few commercial references found; XX: many commercial references found, but only a few meeting our requirements; XXX: many commercial references meeting our specifications found).

	CO	CO₂	H₂	CH₄
MOX (semiconductors)	X		X	X
Pellistors (Catalytic sensors)			X	XX
NDIR sensors		XXX		X
Photo-acoustic sensors		X
Electrochemical	XXX		X

Table 2. List of selected sensors to be used in the sensor network cell.

	Brand Name	Model	Type	Detection Range	Detected Gas **
EC-CO	Membrapor	CO/MF-1000	Electrochemical	0–1000 ppm	CO
CATA	Figaro	TGS6812-D00	Catalytic	0–100% LEL *	H₂, CH₄, C₃H₈
MOX	Figaro	TGS2612-D00	Semiconductor	1–25% LEL *	H₂, CH₄, C₃H₈
CO₂	Sensirion	SCD30	Infrared	0–40%	CO₂ (+RH and T)
EC-H₂	Membrapor	H2/M-4000	Electrochemical	0–4000 ppm	H₂

* LEL stands for Lower Explosive Limit, the lowest concentration of a gas or vapor that will burn in air (about 4%, 5%, and 12.5% for H₂, CH₄, and CO, respectively).
** According to the sensors’ datasheets.
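Since Table 2 gives some detection ranges in % LEL while the gas sequences are expressed in ppm, a short conversion may help when comparing the two. The sketch below is illustrative only: the function and dictionary names are hypothetical, and the LEL values are the approximate figures quoted in the footnote above.

```python
# Approximate lower explosive limits in air, in % by volume
# (approximate values taken from the Table 2 footnote).
LEL_VOL_PERCENT = {"H2": 4.0, "CH4": 5.0, "CO": 12.5}


def lel_percent_to_ppm(gas: str, percent_lel: float) -> float:
    """Convert a concentration given in % LEL to ppm by volume."""
    # 1 % by volume corresponds to 10,000 ppm.
    lel_ppm = LEL_VOL_PERCENT[gas] * 10_000
    return lel_ppm * percent_lel / 100.0


# Example: the 1-25 % LEL range of the MOX sensor, expressed for methane.
mox_ch4_range_ppm = (lel_percent_to_ppm("CH4", 1), lel_percent_to_ppm("CH4", 25))
print(mox_ch4_range_ppm)
```

With these approximate limits, 1–25% LEL corresponds to roughly 500–12,500 ppm of CH₄, which is of the same order as the CH₄ concentrations used in the mono-analyte sequence of Figure 7.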

Table 3. RMSE obtained with concentrations predicted from training data used to build the models.

Model	Training Prediction RMSE (ppm)
Model	H₂	CO	CH₄
MLR—OLS	1801	895	11,060
MLR—PLS	66	35	656
Best ANN	103	34	671
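For reference, the RMSE reported in these tables is the square root of the mean squared difference between predicted and reference concentrations. The following minimal sketch shows the usual definition; the function name and the sample values are hypothetical and are not taken from the measurements.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equally long sequences (same unit, e.g. ppm)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one sample is required")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical concentrations in ppm, for illustration only.
reference_ppm = [0.0, 1000.0, 2000.0, 4000.0]
predicted_ppm = [50.0, 950.0, 2100.0, 3900.0]
print(f"RMSE = {rmse(predicted_ppm, reference_ppm):.1f} ppm")
```

Because the RMSE is expressed in the same unit as the concentrations, it should be read against the concentration range of each analyte: an error of a few hundred ppm weighs differently for CH₄, tested up to 10,000 ppm, than for CO.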

Table 4. RMSE obtained with concentrations predicted from validation data.

Model	Validation Prediction RMSE (ppm)
Model	H₂	CO	CH₄
MLR—PLS	197	26	755
Best ANN	207	19	497

Table 5. RMSE obtained with concentrations predicted from validation data and post-treated.

Model	Post-Treated Validation Prediction RMSE (ppm)
Model	H₂	CO	CH₄
MLR—PLS	194	22	622
Best ANN	205	19	424
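The effect of the post-treatment can be quantified by comparing Tables 4 and 5 directly. The sketch below computes the relative RMSE reduction for each model and analyte from the tabulated values; the variable and function names are hypothetical.

```python
# Validation RMSE in ppm before post-treatment (Table 4).
validation_rmse = {
    "MLR-PLS": {"H2": 197, "CO": 26, "CH4": 755},
    "Best ANN": {"H2": 207, "CO": 19, "CH4": 497},
}
# Validation RMSE in ppm after post-treatment (Table 5).
post_treated_rmse = {
    "MLR-PLS": {"H2": 194, "CO": 22, "CH4": 622},
    "Best ANN": {"H2": 205, "CO": 19, "CH4": 424},
}


def relative_reduction(before: float, after: float) -> float:
    """Relative decrease of the RMSE, in percent of the value before post-treatment."""
    return 100.0 * (before - after) / before


reductions = {
    model: {
        gas: relative_reduction(validation_rmse[model][gas], post_treated_rmse[model][gas])
        for gas in validation_rmse[model]
    }
    for model in validation_rmse
}

for model, per_gas in reductions.items():
    for gas, value in per_gas.items():
        print(f"{model:8s} {gas:3s} RMSE reduction: {value:5.1f} %")
```

On these figures, the post-treatment mostly benefits the CH₄ predictions (about 18% lower RMSE for MLR-PLS and about 15% for the ANN), while the H₂ predictions change by less than 2%.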

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

