2014 BKissane PH D
Submitted:
November 2014
School of Chemistry
National University of Ireland, Galway
Table of Contents
Declaration
Abstract
1 Introduction
2.1 Chemometrics
2.2 Qualitative and Quantitative analysis
2.3 Calibration Modelling
2.4 Figures of Merit for Modelling
2.4.1 Correlation Coefficient (r²)
2.5.2 Regression
2.15 Sample Holders
2.16 Specific Chemometric Procedures
2.16.1 Baseline Offset Correction
3 Development using Raman Spectroscopy for the Analysis of Cell Culture Media Components
3.8.2 Quantification: Glucose in M5Glu Data
5.4 Variance Analysis
5.4.1 PCA Analysis
8.3.2 Calibration Models For M3Glu Data (Replicate Models)
8.3.4 Calibration Models For M5eRDF Data for the Conventional Raman Data
8.3.5 Calibration Models for M5Ye Data for the Conventional Raman
8.4.2 Calibration Models for M5eRDF for the SERS Data
8.4.3 Calibration Models for M5Ye for the SERS Data
8.5 Supplementary Information for Chapter Five
8.5.1 Spectral Overview
8.5.3 Calibration Models for the Fluorescence Data (Replicate Runs)
Declaration
I declare that the work included in this thesis is my own work and has not been
previously submitted for a degree to this or any other academic institution.
Bridget Kissane.
List of abbreviations
ACO – Ant Colony Optimization
AE-IPAD – Anion-Exchange Chromatography - Integrated Pulsed Amperometric Detection
API – Active Pharmaceutical Ingredient
ATP – Adenosine Triphosphate
AVG – Averaged
BC – Baseline Correction
BSS – Balanced Salt Solutions
CCD – Charge Coupled Device
CD – Chemically Defined
CHO – Chinese Hamster Ovary
CIP – Cleaned In Place
CoAdReS – Competitive Adaptive Reweighted Sampling
DMEM – Dulbecco's Modified Eagle Medium
E. coli – Escherichia coli
ED – Electrochemical Detection
EEM – Excitation Emission Matrix
eRDF – Enhanced RPMI/DMEM/F12
F12/F10 – Ham’s Nutrient Mixture medium
FAD – Flavin Adenine Dinucleotide
FD – First Derivative
FDA – Food and Drug Administration
FDMSC – First Derivative Multiplicative Scatter Correction
FMN – Flavin Mononucleotide
FSH – Follicle-Stimulating Hormone
HEPES – N-2-hydroxyethylpiperazine-N’-2-ethanesulfonic Acid
HPLC – High Performance Liquid Chromatography
IC – Ion Chromatography
IFE – Inner Filter Effect
IR – Infrared
LB – Lysogeny Broth
LC-MS – Liquid Chromatography–Mass Spectrometry
LFH – Laminar Flow Hood
LH – Luteinizing Hormone
LOD – Limit of Detection
LOOCV – Leave One Out Cross Validation
MCCV – Monte Carlo Cross Validation
MCD – Minimum Covariance Determinant
MDCK – Madin Darby Canine Kidney cell lines
MEM – Minimal Essential Medium
MIR – Mid Infrared
MSC – Multiplicative Scatter Correction
MWPLSR – Moving Window Partial Least Squares Regression
NAD – Nicotinamide Adenine Dinucleotide
NADH – Reduced Nicotinamide Adenine Dinucleotide
NADP – Nicotinamide Adenine Dinucleotide Phosphate
NADPH – Reduced Nicotinamide Adenine Dinucleotide Phosphate
NIR – Near Infrared
NMR – Nuclear Magnetic Resonance
Norm – Normalisation
PARAFAC – Parallel Factor Analysis
PAT – Process Analytical Technology
PCA – Principal Component Analysis
Phe – Phenylalanine
PLS – Partial Least Squares
QE – Quantum Efficiency
RDF – RPMI/DMEM/F12 2:1:1 Mixture
REP – Relative Error of Prediction
RET – Radiative Energy Transfer
RMSEC – Root Mean Square Error of Calibration
RMSECV – Root Mean Square Error of Cross Validation
RMSEP – Root Mean Square Error of Prediction
RNA – RiboNucleic Acid
ROBPCA – Robust Principal Component Analysis
RPMI – Roswell Park Memorial Institute medium
SERS – Surface Enhanced Raman Spectroscopy
SFS – Synchronous Fluorescence Scan
SIMCA – Soft Independent Modelling by Class Analogy
SSR – Sum of Squared Residuals
TChr – Thiochrome
Trp – Tryptophan
TSB – Tryptic Soy Broth
TSFS – Total Synchronous Fluorescence Scan
Tyr – Tyrosine
UPLS – Unfolded Partial Least Squares Regression
UV – Ultra Violet
VIP – Variable Importance in Projection
WE – Water Elimination
YPD – Yeast Extract-Peptone-Dextrose broth
Abstract
Industrial scale cell culture is used for the production of many therapeutic agents such
as proteins and vaccines. Cell culture medium is a vital raw material used in these
production processes. Formulation analysis of the medium is thus an essential task of
any bioprocess. The medium is a critical aspect of the process because it has to supply
all of the necessary nutrients and other factors to ensure growth and productivity.
Small variations in medium composition can alter cell metabolism, thereby changing
process efficiency and productivity. There is an ongoing need for analytical methods
to ensure reproducible medium formulations; therefore, real-time qualitative and
quantitative analysis of medium components by spectroscopic methods in
combination with chemometrics has the potential to be adapted as a PAT tool in
bioprocesses.
This thesis investigates the spectroscopic analysis and quantification of three medium
components - D-glucose, eRDF and yeastolate - in model medium formulations by
Raman, Surface Enhanced Raman Scattering (SERS) and two fluorescence
approaches (Excitation Emission Matrix (EEM) and Total Synchronous Fluorescence
Scan (TSFS)). These methods were used in conjunction with chemometrics to provide
a wealth of information about medium composition: qualitative assessment and outlier
detection through principal component analysis and robust principal component
analysis, fluorophore detection and identification using parallel factor analysis and
multivariate curve resolution, and quantitative analysis achieved with partial least
squares. These studies complement previous studies in this laboratory where specific
component quantification [1, 2] and variance analysis were used for characterising,
screening [3-5] and quantifying the performances of cell culture media by
spectroscopic methods [6-8].
The advantages of spectroscopic methods are that they require little to no sample
preparation and they give spectra with rich information content suitable for the
discrimination of subtle chemical and physical effects. The goal of this work was to
see if these spectroscopic methods could be used to accurately quantify medium
components, both simple (glucose) and complex (yeastolate and eRDF). The end-use
application was to develop a quality assurance method for correct medium
preparation/formulation.
Quantitative accuracy varied between methods owing to several experimental factors.
Several pre-processing techniques were used to minimise unwanted spectral
effects such as noise, intensity and baseline differences. With Raman, quantification
of D-glucose, eRDF and yeastolate was achieved with errors of ~5%, ~16% and
~38% respectively. The SERS models gave errors of 16% for eRDF and
12% for yeastolate, while the best fluorescence models gave errors of 5.4%
for yeastolate and 7.2% for eRDF. These models show the potential of these
spectroscopic methods for the measurement/identification of individual medium
components within complex cell culture medium. However, the error level obtained
suggests that improvement could be achieved through modification of the current
experimental setup which would then lead to more accurate prediction of component
concentrations.
1 Introduction
Figure 1 The number of Food and Drug Administration (FDA) approvals for new
biopharmaceutical products by year since the first biopharmaceutical in 1982 [12].
Since Humulin was first approved, the FDA has approved more than 100 new
recombinant protein therapeutics and more than 300 non-recombinant
biopharmaceuticals [13]. In 2012, 18 products received approval from the FDA; the
majority of these products were bio-better, me-too, or follow-on products, nine of
which were considered to be new biopharmaceutical entities [12].
1
Recombinant Human Insulin
45% originates from mammalian cell lines - CHO is dominant at 35%, while
other cell lines produce 10% of products
40% originates from bacterial cell lines - 39% in Escherichia coli and 1% in
other bacteria
The remaining 15% comes from yeast based fermentations.
2
A bioreactor or fermentor is a reaction vessel containing a liquid medium to support cell growth.
Fermentor refers to the vessel in which the fermentation of single-celled organisms occurs while a
bioreactor is the vessel for the culture of animal cells.
Figure 2 Different types of feed strategies for fermentation and cell culturing 3 [19-21].
3
Growth inhibitors are substances that hinder the growth by interfering with metabolism and uptake of
nutrients. In cell culture systems growth inhibition can be caused by a build-up of metabolites such as
lactate, pyruvate, succinate, propionate, isobutyrate, and acetate.
adverse pH changes, minimise toxic by-product formation and maintain homeostasis.
Therefore a medium comprises a basal medium4 and other nutrient supplements
like insulin, cholesterol and lipids. A basal medium contains amino acids, minerals,
sugars, inorganic salts, vitamins, organic acids and buffers. The basic composition of
basal medium allows for a wide variety of supplements to be added to enhance growth
and productivity. The requirements vary among cell lines and these differences have
led to the development of an extensive collection of medium formulations [14, 17].
Formulation analysis is a vital task in cell culture medium analysis and pre-
formulation analysis highlights compositional faults prior to starting the culture.
Various spectroscopic techniques (NIR, MIR and Raman) have been applied for
monitoring nutrients during fermentation to ensure on-going process quality [22-25].
Glucose is metabolized by cells at a faster rate than other carbon sources (galactose
and fructose). Glucose and galactose use the same transporter into the cell but glucose
has a greater affinity and a higher uptake rate than galactose [29]. Fructose is another
carbon source that can be used. Fructose and galactose both result in reduced
formation of lactic acid, but also give a slower cell growth rate [20]. For Vero and
MDCK cell lines, fructose is used as the carbohydrate source as it helps maintain the
lactate/pyruvate ratio and a stable pH in high density cultures [30]. Galactose has been
used as a carbon source for CHO TF 70R cells, where it is a suitable substrate that gives an
acceptable growth rate and minimizes the generation of toxic by-products [31].
4
Minimal Essential Medium (MEM) and other basal media supply the basic needs for cellular
metabolism.
5
A heterotrophic organism utilizes organic compounds to obtain carbon that is essential for growth and
development. Examples of such organisms are animals, which are not capable of manufacturing food
from inorganic sources but must consume organic substrates for nutrition.
Cells require sodium, potassium, calcium, magnesium, chlorides and phosphates for
proliferation. Balanced Salt Solutions (BSS) of bulk ions supply the required
electrolytes needed for physiological roles. Ions are important for the following
reasons: maintenance of osmotic pressure, controlling the membrane potential, and
coordination of the transport channels in and out of the cells. Ions also participate in
oxidation-reduction reactions, and are used in energy production (Krebs cycle) [27, 28].
Phenol red is added to the BSS as a visual indicator of pH. BSS may or may not be
buffered with bicarbonate, depending on the culture setup [34].
6
Glutamine in solution undergoes cyclisation to form a toxin, pyroglutamate (5-oxoproline,
5-pyrrolidone-2-carboxylic acid). This reaction occurs at room temperature and is accelerated by heat.
1.3.2.4 Buffers
Buffers are added to cell culture media to maintain and avoid adverse changes in pH.
The most common buffering system used in mammalian cell cultures is the
bicarbonate/carbon dioxide system as it mimics the buffering system of blood. The
bicarbonate/carbon dioxide buffer has a pKa of 6.3 at 37 °C and requires the use of a
closed culture system as a result of the gaseous nature of carbon dioxide [30, 35].
HEPES (N-2-hydroxyethylpiperazine-N’-2-ethanesulfonic acid) is a zwitterion buffer
with a pKa of 7.3 at 37 °C that is sometimes supplemented into cell culture media for
more effective buffering in the physiological pH range. HEPES is used in conjunction
with sodium bicarbonate, as bicarbonate also provides some nutritional value. HEPES
is added in a concentration range of approximately 10 mM to 25 mM to maintain pH
stability [36-38].
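As a rough illustration of why HEPES buffers more effectively than bicarbonate near physiological pH, the Henderson-Hasselbalch relation, pH = pKa + log10([base]/[acid]), can be evaluated with the two pKa values quoted above. The sketch below is an idealised calculation only: the target pH of 7.4 and the function names are assumptions made for illustration, and activity effects, temperature dependence and the dissolved CO2 equilibrium are ignored.

```python
import math


def base_to_acid_ratio(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch rearranged: [base]/[acid] = 10 ** (pH - pKa)."""
    return 10.0 ** (ph - pka)


def fraction_as_base(ph: float, pka: float) -> float:
    """Fraction of the buffer present in its conjugate-base form."""
    ratio = base_to_acid_ratio(ph, pka)
    return ratio / (1.0 + ratio)


# pKa values at 37 degrees C as quoted in the text; the target pH is an assumption.
TARGET_PH = 7.4
PKA_VALUES = {"bicarbonate/CO2": 6.3, "HEPES": 7.3}

for name, pka in PKA_VALUES.items():
    ratio = base_to_acid_ratio(TARGET_PH, pka)
    # log10(ratio) is the distance between the target pH and the pKa.
    print(f"{name}: [base]/[acid] = {ratio:.2f}, "
          f"pH - pKa = {math.log10(ratio):.2f}, "
          f"fraction as base = {fraction_as_base(TARGET_PH, pka):.2f}")
```

A ratio close to 1 means the buffer is working near its pKa, where buffering capacity is highest. Under these assumptions HEPES sits much closer to that point at pH 7.4 than the bicarbonate/carbon dioxide system does.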
7
Carrier protein; attachment regulators; defence molecules; growth factors; hormones; enzymes and
their regulators.
development of serum free media has advanced existing culture methodologies to
facilitate bioprocesses without serum proteins and endogenous serum substances such
as hormones or natural antibodies [42]. A growing number of alternatives to animal
sera exist for cell lines and primary cultures [21, 33, 40, 42]. Hydrolysates are
enzymatic or acid digests of biological materials such as animal tissues (meat digest),
milk products (casein), microorganisms (yeast) and vegetables (soy, wheat gluten,
rice). Hydrolysates are relatively low-cost medium additives used to provide nutrients
and growth factors to cell cultures in order to partly or fully replace serum. These
hydrolysates are poorly defined complex mixtures of peptides, free amino acids,
lipids, polysaccharides, phenolics, vitamins, nucleic acids, and minerals. These
relatively low cost materials are an ideal addition for large scale production. Some
hydrolysates show an anti-apoptotic activity which can extend the fermentation
lifetime [44-46]. For a better defined production of therapeutic proteins, there is a
move towards animal-free hydrolysates originating from yeast and vegetables (soy,
rice, wheat gluten, rapeseed and chickpeas) [47-49]. Yeast hydrolysate8 or yeastolate
is a medium supplement that is cost-effective, non-animal derived, and has been
shown to have a significant positive effect on cell growth. Yeastolate is a complex
mixture known to contain free amino acids, peptides, vitamins, minerals, and
carbohydrates, but it also contains a significant amount of unknown material [43, 50,
51].
8
Yeastolate is produced by culturing yeast to a certain volume. Once this volume is reached, the
process is stopped with heat shock. The cells are digested to produce unrefined hydrolysates. The
hydrolysate is then filtered, concentrated, ultra-filtered, and spray dried.
Table 42 to Table 44. The development of serum free and chemically defined media
occurred through a series of multiple steps starting with serum based medium.
Eagle’s medium was altered by increasing the amino acid content to form
Eagle’s Minimal Essential Medium (MEM). This modified version still
required serum for cell growth [33].
MEM was further adjusted by Dulbecco to form DMEM, and this contained a
fourfold concentration increase of nutrients [53].
Another media series, Ham’s nutrient mixtures F12 and F10 were shown to
support growth and maintenance of different cell types [54].
The merger of DMEM and F12 by Sato [55] gave a fortified basal medium.
This amalgamated medium still needed to be supplemented with serum to
support bioreactor production.
Testing showed that under serum-free conditions, transferrin, insulin,
ethanolamine, linoleic acid, ascorbic acid, hydrocortisone and certain trace
element compounds stimulated hybridoma growth.
The supplementation of DMEM/F12 with insulin, transferrin, selenium and
ethanolamine provided the additional nutrients required to facilitate serum free
cultivations [33, 56].
The RPMI media series uses an increased level of nutrients while maintaining a
constant salt content. The combination of RPMI with supplemented
DMEM/F12 in a 1:1 ratio produced a formulation (RDF) that
outperformed DMEM/F12 alone [57-59].
Further enhancement of basal RDF medium with a three-fold increase in the
level of amino acids and glucose gave the enriched RDF medium (eRDF) [57-59].
eRDF is a chemically defined basal medium used in cell cultures that produce
therapeutic proteins, and each formulation is proprietary to an individual
manufacturer.
eRDF comprises over 50 compounds including inorganic salts, amino acids,
vitamins, HEPES buffer, glucose, and various others.
1.4 Analysis of Cell Culture Media
In large scale manufacturing, most operational parameters are set. Medium
formulations change with manufacturers, cell line and product type. Extensive use
testing9 is required to select the high performing lots that support cell culture
performance in large scale production [60]. The formulation of medium involves
selecting and blending various components, resulting in a complex medium matrix.
For effective and reproducible culturing, the correct medium formulation and
blending is essential. Changes to the medium can affect growth rate, product yield,
and quality [17]. Therefore, there is an ongoing need for new or improved analytical
methods to ensure reproducible medium formulation. Comprehensive and detailed
analysis of cell culture medium composition and variance can help control and
understand the very complex cell culture based manufacturing process.
In-depth analytical methods for cell culture medium can be time consuming,
multistep, challenging and expensive. Medium samples are centrifuged or filtered to
remove particulates, diluted and derivatized (if necessary) before testing. Detailed
method development is required to address all analytes present, and also to determine
analytes of interest, analyte concentration and overall analysis time. The analysis
method is dependent on the medium and should be determined individually for each
medium to ensure accurate measurement [61, 62]. Exact quantification of nutrients in
cell culture medium is desirable, if not necessary, in order to meet cell line
requirements. It is advantageous to characterise the medium ingredients (including
inorganic ions, carbohydrates, alcohols, and aliphatic carboxylic and amino acids)
because the presence or absence of specific components may impact the yield of the
desired products.
When it comes to characterisation, most of these nutrients and metabolites are ionic or
polar in nature, and do not have the chromophores necessary for analysis by
absorption measurements. Ion chromatography (IC) with electrochemical detection
(ED) is a suitable technique for the determination of these components. Carbohydrate
analysis uses liquid chromatography with refractive index detection [63]. The
common method for amino acid detection involves liquid chromatography (reverse
phase or cation exchange) resulting in the detection of derivatives [62-64]. Pre- or
post-column derivatization based methods are limited to a specific range of amino
acids. High operating costs and the inability to detect multiple components, such as
certain amino acids and carbohydrates, render LC methods unattractive [62, 65]. In
order to advance the chromatographic measurements beyond derivative detection and
toward multiple component analysis, anion exchange chromatography with integrated pulsed
amperometric detection was developed [65]. Hanko et al. [61, 62] developed a
method using anion exchange chromatography – integrated pulsed amperometric
detection (AE–IPAD) technology that allows for simultaneous detection of amino
acids and carbohydrates in four media formulations (YPD broth, LB broth, MEM and
serum-free, protein-free hybridoma medium).
9
When testing a new basal medium, a scaled down production process is used. In the test, the new
material and the reference material are cultured side by side. The results compare cell growth, product
yield, nutrient usage and by-product formation for the reference versus new material.
Identifying lot to lot variability in raw material is another aspect of media analysis.
Variability can arise from several sources: the producer, the raw material used, and
also the aging of the material. Variability in chemical composition and culture
performance may depend on extraction material (i.e. whether extraction is from yeast,
malt or protein digests). Medium supplements like hydrolysates are heterogeneous in
terms of molecular size and chemical diversity. The exact concentration varies per
manufacturer and it is impossible to identify and quantify every individual
component. Yeastolate is inexpensive and lot to lot variation is known to occur. When
six different lots were tested, the free amino acid content varied from 45% to 78%
resulting in different biomass levels and growth rates [66].
Various methods are used to analyse the different components in bioprocess studies.
Commonly glucose, lactate, glutamine, and glutamate are measured using enzyme-
based biosensors. The biosensors are amperometric electrodes that have immobilized
enzymes in their membranes and work by converting the glucose or other substrates
to hydrogen peroxide, which is oxidized to produce an amperometric signal
proportional to substrate concentration [67]. For example, Sun et al. used six different
tests to quantify nutrients, metabolite, product and by-product formation, because a
single method was not able to quantify all of the different components [68]. These
included BioProfile analyser for glucose, lactose and ammonium; two different
enzyme based assays for galactose and ATP; three different variations of HPLC for
quantifying amino acids, vitamins and antibody formation.
Most analytical approaches used for cell culture medium analysis are multistep and
time consuming in nature. As a result, there is a need for a rapid, sensitive and
inexpensive technique that is capable of monitoring multiple components in a single
measurement. Spectroscopic methods seem to be ideal candidates and in this work,
three spectroscopic methods (Raman, SERS and Fluorescence) were used to study cell
culture medium for formulation analysis.
10
Difference between the incident and scattered beam frequencies
When a molecule enters an electric field of strength E, a dipole moment P is induced
in the molecule. The magnitude of the induced dipole moment is P = αE, where α is
the polarizability of the molecule. If the molecule encounters electromagnetic
radiation of frequency ν0, it experiences a varying electric field E. This in turn induces a
varying electric dipole moment, which causes an emission of light identical in
frequency to the incident radiation. This is elastic or Rayleigh scattering. If the
polarizability of a bond changes during a rotation or vibration through
interaction with electromagnetic radiation, then the vibrational mode is Raman active
and the emitted light is shifted in frequency from the incident radiation [87-89].
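The classical argument above can be written out explicitly. What follows is the standard textbook sketch rather than a derivation taken from this thesis; the symbols E_0 (field amplitude), q (vibrational coordinate with amplitude q_0), α_0 (polarizability at the equilibrium geometry) and ν_v (vibrational frequency) are introduced here for illustration only.

```latex
\begin{align}
E &= E_0 \cos(2\pi\nu_0 t), \qquad q = q_0 \cos(2\pi\nu_v t) \\
\alpha &\approx \alpha_0 + \left(\frac{\partial \alpha}{\partial q}\right)_0 q \\
P = \alpha E &= \alpha_0 E_0 \cos(2\pi\nu_0 t)
  + \frac{1}{2}\left(\frac{\partial \alpha}{\partial q}\right)_0 q_0 E_0
  \left[\cos\bigl(2\pi(\nu_0 - \nu_v)t\bigr) + \cos\bigl(2\pi(\nu_0 + \nu_v)t\bigr)\right]
\end{align}
```

The first term oscillates at the incident frequency ν0 and corresponds to Rayleigh scatter. The remaining two terms oscillate at shifted frequencies and vanish unless the polarizability changes along the vibrational coordinate, which is the Raman activity condition stated in the paragraph above.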
Figure 3: (Left) A schematic illustrating the scattering of incident light as it interacts with a
molecule, giving off Rayleigh and Raman scatter and (Right) energy level diagram depicting
scattering processes of Rayleigh and Raman. E0 and E1 are the ground and first excited
electronic energy levels, respectively. Reproduced with permission from [90, 91].
The quantum theory for the scattering process treats monochromatic light of
frequency ν0 as a stream of photons having energy hν0, where h is Planck’s constant.
With Rayleigh scattering, the incident photons interact with a molecule and are
scattered without a change in frequency (elastic scattering). However in the Raman
effect, the photon interacts with the vibrational energy levels of the molecule and the
scattered radiation has a frequency shifted by νv through either loss or gain of energy
from the incident light (inelastic scattering). A molecule undergoing a vibrational
transition from the ground vibrational energy level (v = 0) to the first excited
vibrational energy level (v′ = 1) will have a corresponding frequency of νv and the
scattered photon will be diminished in energy by the amount hνv. The energy of the
scattered photon will be h(ν0 − νv). This is known as Stokes scattering. In contrast,
if the molecule is already in an excited vibrational state when the photon interacts
with it, the transition v′ → v″ may be induced and the photon will be scattered with
an enhanced energy that produces anti-Stokes Raman lines. At room temperature, in
accordance with the Boltzmann distribution, the population of molecules in the ground
vibrational state is much greater than that in the excited vibrational states.
As a result, the intensities of anti-Stokes lines are much weaker than those
of the Stokes lines [91, 92].
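The Boltzmann argument can be made concrete with a short calculation. The sketch below estimates the ratio of molecules in the first excited vibrational level to those in the ground level, exp(−hcν̃/kT), for a given Raman shift ν̃ in wavenumbers. The shifts and the temperature of 298.15 K are illustrative assumptions, not values from this work, and the frequency-dependent scattering factor that also influences measured anti-Stokes/Stokes intensity ratios is deliberately left out.

```python
import math

# Exact SI values of the constants (2019 redefinition).
PLANCK_H = 6.62607015e-34    # J s
LIGHT_C_CM = 2.99792458e10   # cm / s, so that wavenumbers in cm^-1 can be used
BOLTZMANN_K = 1.380649e-23   # J / K


def excited_to_ground_ratio(shift_cm1: float, temperature_k: float) -> float:
    """Boltzmann population ratio N(v=1)/N(v=0) for a non-degenerate mode.

    shift_cm1 is the vibrational wavenumber (Raman shift) in cm^-1.
    """
    energy_gap_j = PLANCK_H * LIGHT_C_CM * shift_cm1
    return math.exp(-energy_gap_j / (BOLTZMANN_K * temperature_k))


if __name__ == "__main__":
    # Illustrative shifts only (assumed): a low-frequency mode, a
    # fingerprint-region mode and a high-frequency stretching mode.
    for shift in (200.0, 1000.0, 3000.0):
        ratio = excited_to_ground_ratio(shift, 298.15)
        print(f"{shift:6.0f} cm^-1: N1/N0 = {ratio:.2e}")
```

Under these assumptions the excited-state population falls off steeply with increasing Raman shift, which is consistent with anti-Stokes lines being much weaker than Stokes lines at room temperature.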
For a molecule which possesses a centre of symmetry, such as CO2, there is a useful
rule: the rule of mutual exclusion. This states that for molecules with a centre of
symmetry, fundamental transitions which are active in infrared (IR) absorption
spectroscopy are forbidden in Raman and vice versa. Together Raman and IR
absorption spectroscopy provide a complete picture of the different vibrational
frequencies in a molecule. Groups which lack strong features in Raman may exhibit
intense bands in IR and vice versa. In molecules with symmetry elements other than a
centre of symmetry, certain vibrations may be Raman active, IR active, both or neither.
All normal modes are allowed in both IR and Raman for complex molecules with no
symmetry. The methods are complementary to one another [87-89].
well within a 780-850 nm excitation range. This wavelength range also reduces
fluorescence to an acceptable level [93].
As water is the principal component of cell culture media, its impact is important.
Comparative studies show that the spectra of solid amino acids differ
significantly from those of their aqueous solutions (Figure 4). The spectra of solid
amino acids are complex and detailed
compared to their aqueous solutions. The aqueous solutions are low concentration
samples, and as the signal intensity is proportional to concentration, this leads to a
weaker signal and loss of spectral detail [95, 96].
Figure 4 Raman spectra of solid and aqueous solutions of Phe (0.3 g/L), Trp (0.11 g/L) and Tyr
(0.004 g/L). Reproduced with permission from [95].
Even though water has a weak Raman signal, it can be a significant issue
when looking at very low concentrations in aqueous solution. The impact of
the water signal on the amino acid signal is significant since the samples prepared as
part of this study are low concentration aqueous solutions. The Raman method may
not be able to detect single analytes within an aqueous solution, but this work
tests its ability to detect the gross signal changes for a whole component (D-glucose,
eRDF and yeastolate).
Figure 5 Average Raman spectra of (a) human, (b) bovine, and (c) porcine insulin on the left and
difference spectra between human and porcine insulin on the right. Reproduced with permission
from [94].
The flexibility of Raman instrumentation increased with the use of a fiber optic
probe as the delivery and collection system. In-situ Raman spectra were measured for
an E. coli fermentation producing phenylalanine, for simultaneous estimation of
glucose, acetate, formate, lactate, and phenylalanine [76]. The analytes were
modelled based on the Raman spectra and the HPLC reference method data. The error
levels for the Raman models for a production run were glucose (4.16%), acetate (4.67%),
formate (5.5%) and lactate (5.39%); phenylalanine was not detected. The Raman
estimates for glucose were consistently lower than the reference values, while the
estimates for acetate, formate and lactate showed qualitative agreement, with some
error. The results showed potential
despite the errors introduced by the physical environment of the bioreactor [76]. The
implementation of Raman analysis to fermentations also has potential for tracking
culture parameters. In-line Raman monitoring of a mammalian cell culture bioreactor
was applied for prediction of various media components (glutamine, glutamate,
glucose, lactate, and ammonium) and compared to the standard reference
measurements using a BioProfile 400 Analyzer [100]. Table 1 and Figure 6 show the
model predictions and accuracy based on the Raman spectra throughout the culture.
The predictions follow the overall expected trend. The Raman models track
the decreases in nutrient levels (glutamine, glutamate, glucose) and the increases in
metabolite levels (lactate and ammonium). The error levels for glutamine are
similar for the two methods, as low accuracy is seen with both the Raman and
reference methods because of
the low concentration of glutamine. The error levels for glutamate and lactate
compare better, as the models accurately predict the behaviour of these analytes. The
models for glucose and ammonium did follow the process; however, their error levels
are poor relative to the reference method, reducing their model accuracy. The
performances demonstrate that the
Raman method is broadly comparable to the reference method and therefore Raman
spectroscopy provides an attractive approach for monitoring mammalian cell culture
processes.
Figure 6 Comparison of measured nutrient and metabolite concentrations (solid diamonds) and
the predictions (solid lines) from the modelling of Raman data for (a) glutamine, (b) glutamate,
(c) glucose, (d) lactate, and (e) ammonium. Dashed lines indicate the standard deviation
measured for the reference method. Reproduced with permission from [100].
Table 1 Results for predictions of nutrient and metabolite concentrations using in-line Raman
measurements and standard reference measurements from the BioProfile 400 Analyzer.
Reproduced with permission from [100].
Media Component    Calibration Range    Raman % error    Reference % error
Glutamine (mM)     0.66–4.26            30.3             22.0
Glutamate (mM)     2.21–5.72            12.0             17.0
Glucose (g/L)      2.07–6.22            15.3             4.0
Lactate (g/L)      0.23–5.21            12.9             10.0
Ammonium (mM)      2.01–8.51            11.4             4.0
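The comparison discussed in the text can be reproduced directly from the values in Table 1. The sketch below ranks the components by the ratio of Raman to reference percentage error; this ratio is an ad hoc illustration, not a figure of merit used in [100], and the variable and function names are hypothetical.

```python
# Percentage errors copied from Table 1: (Raman % error, reference % error).
TABLE_1_ERRORS = {
    "Glutamine": (30.3, 22.0),
    "Glutamate": (12.0, 17.0),
    "Glucose": (15.3, 4.0),
    "Lactate": (12.9, 10.0),
    "Ammonium": (11.4, 4.0),
}


def error_ratios(errors: dict) -> dict:
    """Return Raman/reference error ratio per component.

    A ratio below 1 means the Raman model had the lower percentage error.
    """
    return {name: raman / reference for name, (raman, reference) in errors.items()}


def rank_by_ratio(errors: dict) -> list:
    """Component names ordered from most to least favourable for Raman."""
    ratios = error_ratios(errors)
    return sorted(ratios, key=ratios.get)


if __name__ == "__main__":
    ratios = error_ratios(TABLE_1_ERRORS)
    for name in rank_by_ratio(TABLE_1_ERRORS):
        print(f"{name:10s} Raman/reference error ratio = {ratios[name]:.2f}")
```

On this measure glutamate is the only component for which the Raman error is lower than the reference error, while glucose and ammonium show the largest gap, in line with the discussion above.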
Previous work on the cell culture media analysis by Li et al. showed rapid
identification, characterisation and quality assessment of media components used in
industrial cell culturing [1-8, 101]. Raman was used to identify the different media
types and as a sample quality testing method. Chemometric analysis (PCA and
SIMCA11) were used for sample evaluation. Five different chemically defined (CD)
commercial media components (Figure 7) were investigated. Each of these
components was used in a Chinese Hamster Ovary (CHO) based manufacturing of
recombinant proteins. Raman data provided significant differences within spectra to
identify the different media types, and also outlier analysis allowed for identification
of suspect samples. The “normal” samples were selected for the routine identification
and quality evaluation of the different media components. Five distinct classes were
obtained through SIMCA classification (Figure 7b) where each medium type was
grouped according to its spectral differences [3, 102]. This study clearly showed
that the identification and classification of incoming materials was possible using the
Raman method.
11
SIMCA (Soft independent modelling of class analogy) is a classification method that outperforms
PCA which is based on total variance. In order to build a reliable model, it uses a series of PCA models
where samples are identified as class members by describing relevant spectral variance. The
classification is based on significance tests where the distance from the model center (leverage) and the
distance to the model space (residuals) are examined.
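The idea of one PCA model per class can be illustrated with a simplified sketch. It is an assumption-laden toy, not the procedure used in the cited studies: it uses synthetic two-class "spectra", one principal component per class, and assigns a sample to the class with the smallest residual distance. The leverage term and the significance tests of a full SIMCA implementation are deliberately omitted, and all function names are hypothetical.

```python
import numpy as np


def fit_class_model(X, n_components):
    """Fit a PCA model to the spectra of one class (rows = samples)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal component loadings.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def residual_distance(x, model):
    """Distance from a spectrum to the class model space (residual)."""
    mean, loadings = model
    centred = x - mean
    reconstructed = (centred @ loadings.T) @ loadings
    return float(np.linalg.norm(centred - reconstructed))


def classify(x, models):
    """Assign x to the class whose model leaves the smallest residual."""
    return min(models, key=lambda name: residual_distance(x, models[name]))


# Synthetic two-class "spectra" (illustrative data only).
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 50)
band_a = np.exp(-((axis - 0.3) ** 2) / 0.002)
band_b = np.exp(-((axis - 0.7) ** 2) / 0.002)


def simulate(band, n):
    """Scaled copies of one band plus a small amount of random noise."""
    scale = rng.uniform(0.8, 1.2, size=(n, 1))
    return scale * band + rng.normal(0.0, 0.01, size=(n, axis.size))


models = {
    "A": fit_class_model(simulate(band_a, 20), n_components=1),
    "B": fit_class_model(simulate(band_b, 20), n_components=1),
}
test_a = simulate(band_a, 1)[0]
test_b = simulate(band_b, 1)[0]
print(classify(test_a, models), classify(test_b, models))
```

In practice the number of components per class and the acceptance limits would be chosen by validation rather than fixed as here.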
Figure 7 Spectra of five different chemically defined (CD) commercial media and the SIMCA
classification of the 336 sample measurements using the pre-processed Raman spectra of CD–A1,
CD–A2, CD–S1, CD–S2, and eRDF samples. Reproduced with permission from [3].
When Raman was applied to the analysis of bioprocess samples (sample components
may include cells, fresh media, spent media and product proteins), these samples
resulted in a strong water signal and weak signals for the media components. Li et al.
investigated the correlation between the Raman spectra with the glycoprotein yield
from 9 different time points. The generated models used a full region (400–1053 cm–1),
and two variable selection methods (CoAdReS and ACO). The full range model
gave a poor performance while the variable selection greatly improved the prediction
ability. The CoAdReS and ACO models performed equally well, but the ACO method
had a very long run time. This was not acceptable when the goal
was the development of a rapid analysis method. However, using Raman spectra with
CoAdReS variable selection generated an accurate prediction of the glycoprotein
yield in a more timely manner. This opened up the possibility of developing the Raman
method for bioprocess evaluation of product yield during the process in order to
ensure consistent yields and prevent losses [6].
1.6.2 Surface Enhanced Raman Spectroscopy (SERS)
An enhanced Raman spectrum was observed for pyridine adsorbed on
electrochemically roughened silver like that in Figure 8. The initial conclusion by
Fleischmann et al. was that a roughened electrode surface area caused a local increase
in pyridine concentration leading to a stronger signal [103]. This was disproved by D.
L. Jeanmaire and R. P. Van Duyne who showed that the signal increase was caused by
a dramatic increase (an estimated 10^5-fold enhancement) in the Raman scattering
cross section [92, 104, 105].
Figure 8 (a) SERS of the pyridine at silver films and (b) Raman spectrum of the aqueous solution
of 0.01 M pyridine in 0.1 M KCl. Reproduced with permission from [106].
Figure 9 (A) Raman spectrum of 0.05 g/L Acebutolol solution displaying fluorescence
interference and (B) surface enhanced Raman spectrum of 0.05 g/L Acebutolol showing reduced
background, adapted from and reproduced with permission from [112].
SERS signal strength is also dependent on the orientation of the adsorbed species. In
the case of pefloxacin (Figure 10), at high concentrations a more perpendicular
orientation of the adsorbed species occurs, whereas a flatter alignment is observed
with a lower concentration. At 10–6 M, the intensity of the 229 cm–1 band increases,
corresponding to a carboxylate group interacting with the Ag surface in accordance
with the adsorbed molecules lying flat on the surface. At 10–4 M, the intensity of the
210 cm–1 and 1656 cm–1 bands increases, reflecting a more tilted orientation that
results from local steric hindrance, with increased surface coverage and/or repulsive
electrostatic forces between the adsorbed species [115].
Figure 10 (a) SERS spectra of pefloxacin at three concentrations (10–4, 10–5, and 10–6 M) and a
representation of possible orientations for pefloxacin adsorbed on silver colloid; (b) for
pefloxacin concentration of 10–6 M; (c) 10–5 M; (d) 10–4 M, reproduced with permission from
[115].
12
For scattering there needs to be an oscillation perpendicular to the surface, which requires a
roughened surface.
electromagnetic field of the surface plasmon is stronger than the incident light field
and therefore increases the intensity of the Raman scattered light [92]. The excitation
of the surface plasmon greatly increases the local field experienced by the molecule
(Figure 11). The enhanced field depends on the optical conductivity of the metal,
while optical conductivity depends on the wavelength used and the size and shape of
the particle [87, 92, 103, 105].
Figure 11 Illustration of the excitation of the localized surface plasmon resonance of a spherical
nanoparticle by incident electromagnetic field. Reproduced with permission from [117].
1.6.2.1.2 Chemical Enhancement
Adsorption of the analyte onto the nanostructured metal surface leads to changes in
the molecular orbitals and electron distributions across both the analyte and the metal
surface. This increases the polarizability of the analyte molecule [87, 92, 120]. It is
believed that chemical enhancement is related to the new electron state belonging to
the bond formed between the analyte and the metal surface; the new electron states
are resonant intermediates [92]. When an incident photon excites an electron from the
metal surface into an adsorbed molecule, it creates a negatively charged excited
molecule. The molecular geometry of this excited molecule differs from that of the
neutral species. This allows for charge transfer to occur from the metal to the analyte.
The signal enhancement will take place when the excited electron of the charge
transfer becomes resonant with the incident light [105]. The incident photon is
absorbed by the metal nanoparticles, and the associated charge transfer induces a
nuclear relaxation within the excited molecule. This results in the return of the
electron to the metal surface, the creation of a neutral molecule, and the emission of a
Raman scattered photon [92, 107].
1.6.2.2 Substrates
SERS spectra are obtained after molecules interact or are adsorbed onto certain
nanostructured metal surfaces. Many different types of surfaces can be used for
SERS; examples include aggregated colloid suspension [121-123], roughened
electrodes [121, 124, 125], metal films (such as silver island films) [126-128], and
silver coated beads [129-131]. Silver, gold and copper are the most commonly used
substrate materials, with silver being the most widely used. The choice of SERS
substrates is based on the wavelength of the surface plasmon band (e.g. λmax in the UV
or visible spectrum). This is a function of the material used and its size. Both silver
and gold surface plasmons oscillate at frequencies in the visible region making them
suitable for use with the visible and NIR excitation wavelengths typically used in
Raman spectroscopy. Silver has a broad excitation range from the UV to IR while
gold is limited to the red and IR ranges because of band transitions. Silver is less
reliant on the excitation wavelength compared to other SERS active metals like gold
and copper due to its favourable dielectric function [92, 132]. Silver colloids are used
in applied techniques (i.e. silver island films) because silver is a more efficient optical
material, giving a SERS signal 10–100-fold higher than gold. Gold colloids are,
however, used in studies of living organisms because of their chemical stability, better
control of size and shape and higher biocompatibility [118, 133].
1.6.2.2.1 Colloids
A metal colloid is the suspension of metal nanoparticles in a solvent. Silver or gold
colloidal suspensions may be formed by chemical reduction of metal salts. Silver
colloids can be prepared by the sodium borohydride or citrate reduction methods
[134, 135]. Differences in the size, shape and dielectric constant of the colloidal particles
lead to differences in plasmon resonance and wavelength dependence. The nanoparticle sizes range from 10
to ~200 nm and depend on the method of preparation [136, 137]. UV-visible
absorbance measurements indicate the size of the particles present in the colloid.
Larger particles produce broader peaks at longer wavelengths. For silver
nanoparticles, an absorbance maximum ranging from 395–405 nm indicates a particle
size of 10–14 nm in diameter, absorbance around 420 nm indicates a size ranging
from 35–59 nm, and absorbance around 438 nm points to a size of 60–80 nm [138].
When the particles in a colloid are too small to exhibit large field enhancement, an
aggregating agent can be added to produce clusters of particles. The most commonly
used aggregating agent is sodium chloride (NaCl). Nanoparticles are kept in
suspension by repulsive electrostatic forces between the particles. The addition of
NaCl buffers the charges allowing the particles to clump together and form
aggregates. The resulting silver particle clusters provide a much higher surface-
enhancement signal [87, 138]. Metal colloids may either be added directly to the
analyte solution or immobilised on a mounting substrate before exposure to the
analyte.
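The relationship between the UV-visible absorbance maximum and the silver particle size quoted above can be written as a simple lookup. This is a minimal sketch: the wavelength and size values are those quoted from [138], while the ±5 nm windows placed around 420 nm and 438 nm are assumptions made here, because the text only says "around". The function name is hypothetical.

```python
# Absorbance-maximum bands and silver particle sizes quoted in the text [138].
# Each entry: (lambda_min_nm, lambda_max_nm, (size_min_nm, size_max_nm)).
# The +/- 5 nm windows around 420 and 438 nm are assumptions made here.
SIZE_BANDS = [
    (395.0, 405.0, (10, 14)),
    (415.0, 425.0, (35, 59)),
    (433.0, 443.0, (60, 80)),
]


def estimate_size_range(lambda_max_nm):
    """Return the quoted diameter range (nm), or None if no band matches."""
    for low, high, size_range in SIZE_BANDS:
        if low <= lambda_max_nm <= high:
            return size_range
    return None


for wavelength in (400.0, 420.0, 438.0, 410.0):
    print(wavelength, estimate_size_range(wavelength))
```

Returning None for wavelengths outside the quoted bands avoids interpolating between size ranges that the source does not provide.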
reproducible. Reproducible spectra can be obtained from samples containing 50–100
analyte molecules [139, 140].
Figure 13 Bulk-Raman (black) and SERS (grey) spectra for (left) para-pyridinecarboxylic acid
and (right) meta-pyridinecarboxylic acid. Reproduced with permission from [141].
The profiles of amino acids in solution were studied using SERS spectroscopy [144-
146]. Nineteen different amino acids and their adsorption route onto the metal surface
were examined. Identification of amino acids was possible because of their spectral
differences. The side chain groups influenced the spectra through their interaction
with the metal surface. SERS studies showed that interactions between the silver
surface and the amino acids occur through their deprotonated carboxylate group. The
sulphur containing amino acids also showed a strong interaction between the surface
and the sulphur containing functional groups [146, 147]. The size of the amino acids
can also contribute to the signal strength. In some cases, the signal strength was
weakened when a large molecule size caused limited interaction. The adsorption of
the aromatic and sulphur containing amino acids resulted in stronger band intensity
and was more favourable than the rest [123, 148]. The detection limits of the aromatic
and sulphur containing amino acids were as low as 10–10 M, and the remaining amino
acids were detectable at 10–9 M [144].
HPLC is the standard method for quantification of melamine13. However, it is labour
intensive and time consuming. A faster SERS protocol was developed for detection
and quantification of melamine in foodstuffs. When the SERS method was compared
to HPLC, the qualitative results showed that SERS was capable of detecting trace
amounts of melamine. The limit of detection for melamine with the SERS Klarite14
substrate was estimated at 0.033 µg/L. The HPLC limit of detection for melamine
standard solution was 1.0 µg/L. In a quantitative assessment of the methods,
HPLC outperformed the SERS method, with an r2 value of 0.99 compared to 0.90. The SERS
method showed high sensitivity (L.O.D. ~0.033 µg/L); however, its accuracy and
precision were lower than those of the HPLC method. As a result, SERS could potentially be
used as a preliminary screening method for the large sample sets, as it is faster, less
labour intensive and simpler than HPLC. Verification of suspect samples could then
be carried out using the more precise HPLC method [149-151].
1.6.2.4.1 pH Effects
Under different pH conditions, the surface charges of the nanoparticles change,
affecting the SERS performance [154]. The effect of pH variation on SERS behaviour
can be utilised to study different molecular species. For example, the SERS analysis
of thiamine at different pH discriminated the protonated from un-protonated species
(Figure 14). At low pH, spectra featured the protonated related peaks at 1657 and
1550 cm–1. As the pH increased, the degree of protonation decreased, resulting in the
emergence of un-protonated related peaks at 1590 and 1373 cm–1. At a pH of 9 and
above, the thiamine was destroyed and the SERS spectrum was no longer recorded
[155].
13
Melamine is a nitrogen rich compound that is banned in foodstuffs.
14
The Klarite substrate consists of a silicon surface which has been patterned with a square array of
micrometre-sized square-based pyramidal pits coated with a 300-nm layer of gold.
Figure 14 SERS spectra of thiamine in colloidal gold sol (10–5 mol/L) at different pH values.
Reproduced with permission from [155].
as the concentration of colloid increased eightfold. The more reproducible SERS
spectrum was achieved by increasing the colloid concentration. This was the result of
the high concentration of nanoparticles and aggregates that remain close to the sample
surface for a stable signal [154, 156].
15
The surface charges impact the SERS signal via the strength of interaction and the proximity of the
nanoparticle to the analyte.
Figure 16 Comparison of SERS spectra of E. coli obtained using borohydride reduced and
citrate-reduced silver nanoparticles. [154]
Another limitation in SERS spectroscopy is the batch to batch variation of the SERS
substrate. Therefore all possible sources of variance should be identified and
controlled in the preparation of SERS substrates. There may be changes in the particle
size, shape of roughened surfaces and the distribution of the particles into clusters
after aggregation of a colloid [92, 157].
The Marotta study proposed that the SERS signal seen for bacteria samples was the
residual signal from growth media. The SERS spectra for bacteria cells (EC, BC and
AH) and nutrient broth (NB) showed similarities (Figure 17). The spectra were
collected from dilute bacteria culture samples, rather than the specialised preparation
method of repeated cycles of washing and centrifugation needed for bacteria samples.
Therefore the bacteria samples tested contained both bacteria cells and a diluted
growth medium, which was shown to give a strong SERS response (Figure 17b). The
stock medium spectrum was dominated by a strong fluorescence background signal
while the diluted medium spectrum had well-defined peaks for components in cell
culture medium (Figure 17b) [158]. The Marotta study was followed up by the
Premasiri study [159] in which the SERS signals for bacterial cells and growth media
were shown to be different at a spectroscopic level. Comparison of the SERS spectra
for bacteria and growth media (Figure 18) revealed a common peak observed in
several spectra at 725–730 cm–1. The intensity of the peak varied with each particular
sample. This peak was assigned to the adenine signal from the FAD component found
in both bacterial cells and growth media [159].
Figure 17 (Left) SERS spectra of diluted Nutrient Broth (NB) and the three different bacterial
cells - (EC) E. coli, (BC) B. cereus, and (AH) A. histidinolovorans - prepared in diluted nutrient
broth. (Right) SERS spectrum of Nutrient Broth (a) concentrated stock, and (b) diluted 1:100
(v/v) [158].
Because of the subtle differences observed in the bacteria and media spectra, principal
component analysis (PCA) was needed to confirm the difference between growth
media and bacteria cells. The PCA grouped the data based on their variance and
showed that the SERS signal for growth media was different to properly washed
bacterial cells. In the scores plot (Figure 19a) for the different washes, nutrient broth,
no wash culture and the spun culture, the signals from bacteria and media were
separated. The spun culture and first wash groupings were standalone but the
subsequent three washes overlapped. The nutrient broth was clustered with the no
wash culture as they both contained large amounts of broth. In order to show that
medium had no effect on the bacterial signal, two different growth media (TSB and
LB broth) were used to grow the bacteria E. faecalis (Figure 19b). The PCA results
showed the washed bacteria samples from both fermentations were clustered together
and well separated from either growth medium. Also in Figure 19b, the effect of a
growth medium (TSB) on three different cultures was examined. TSB medium had no
effect as the PCA results showed the TSB medium cluster was well separated from
the bacteria cells clusters [159].
Figure 18 The SERS spectra of the bacterial species with their strain and growth media noted
and on the right, the SERS spectra of the various growth media [159].
Figure 19 PCA analysis results for (A) the sample preparation and (B) the separation seen
amongst the different bacterial cell and growth media [159].
The application of SERS to media analysis has potential considering the SERS
activity of many media components. SERS can then be used as a screening method
for cell culture media and its components. Changes in the sample composition cause
spectral differences [5]. For example, SERS analysis of yeast extracts gave spectra
with deviating band positions and intensities for different batches. This helped to
characterise and discriminate the yeast extracts based on identity, origin and source.
Since SERS observed changes in medium composition, it was also used in a medium
degradation study [101]. SERS was able to detect storage induced changes in
chemically defined cell culture medium. PCA of the SERS data revealed a change
within the media samples during dark storage conditions. The observed change in the
SERS spectra was caused by cysteine oxidation. This was identified as the key event,
resulting in the formation of cystine, which does not promote cell growth, unlike
cysteine. This showed that SERS can be used to detect compositional changes such as
cysteine oxidation that have an impact on sustaining optimal cell growth.
In the ground electronic state, molecules can occupy a variety of vibrational energy
levels. At ambient temperature, most molecules are in the lowest vibrational state of
the ground energy. Occupancy of the upper vibrational states depends on the
temperature and the Boltzmann distribution [84, 162-164]. Absorption of light causes
the elevation of molecules from the ground state into electronically excited states.
The energy of the absorbed light determines which electronic level (S1 or S2)
becomes populated. In the excited state, collisions cause excited molecules to lose
energy until they reach the lowest vibrational level of the excited electronic state (S2
and S1). An excited molecule exists only for a limited time (~10–8 s) as a result of
these energy reducing processes. An excited molecule can return to its ground state
(S0) through different steps. The preferred route is the one that minimises the lifetime
of the excited state [162, 163].
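The statement that most molecules occupy the lowest vibrational level at ambient temperature can be checked with the Boltzmann ratio N1/N0 = exp(−hcν̃/kT). The sketch below assumes non-degenerate levels, an ambient temperature of 298 K and three hypothetical vibrational spacings. None of these values are taken from the thesis, and the function name is illustrative.

```python
import math

# Second radiation constant hc/k in cm*K (CODATA value, rounded).
HC_OVER_K_CM_K = 1.438777


def excited_to_ground_ratio(wavenumber_cm1, temperature_k):
    """Boltzmann ratio N1/N0 for two non-degenerate vibrational levels."""
    return math.exp(-HC_OVER_K_CM_K * wavenumber_cm1 / temperature_k)


# Hypothetical vibrational spacings at an assumed ambient temperature of 298 K.
for spacing in (200.0, 1000.0, 3000.0):
    ratio = excited_to_ground_ratio(spacing, 298.0)
    print(f"{spacing:6.0f} cm-1: N1/N0 = {ratio:.2e}")
```

For a spacing of 1000 cm–1 the ratio is below 1%, so the first excited vibrational level is only sparsely populated. Low-frequency modes retain a noticeably larger excited-state population.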
Figure 20 Jablonski Energy Diagram for photoluminescent systems. The lowest heavy horizontal
line (S0) represents the ground state of the molecule. The upper lines represent excited electronic
states. S1 and S2 are the first and second electronic singlet states. T1 is the first electronic triplet
state. Each electronic state has numerous vibrational energy levels, [165] adapted from [84, 162].
Figure 20 shows the various routes taken back to the ground state. De-excitation
directly back to the ground state (S0) can occur by fluorescence or internal conversion.
Fluorescence returns the excited molecule to the ground state accompanied by the
emission of light at a longer wavelength. Internal conversions are transitions between
electronic states that allow the return of an excited state to S0 without light emission.
Internal conversion is a non-radiative transition where there is no change in spin [84,
162, 163, 166]. Intersystem crossing is a non-radiative transition involving a change
in spin, for example from the singlet excited state (S1) to the triplet excited state (T1).
From the triplet excited state (T1), the de-excitation of the molecule occurs by internal
conversion or phosphorescence. Phosphorescence is the transition of an excited
molecule from a triplet excited state (T1) to the ground state (S0) with the emission of
light. Intersystem crossing is stimulated in molecules containing iodine [167] or bromine
[168], or in the presence of molecular oxygen [84, 162, 163, 169, 170].
1.6.3.2 Stokes Shift and Mirror Image
The energy associated with fluorescence emission is typically less than that of
absorption. The emitted photons have less energy and are shifted to longer
wavelengths. The Stokes shift is the difference between the wavelengths of maximum
absorbance and maximum emission [171]. This Stokes shift arises from the loss
of energy from the excited species through various processes such as excited-state
reactions, energy transfer, solvent effects, and complex formations. The size of the
shift varies with environment, and can range from a few nanometres to several
hundred nanometres.
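The Stokes shift can be expressed either as a wavelength difference or as an energy difference in wavenumbers. The short sketch below computes both for an assumed absorption maximum of 350 nm and emission maximum of 450 nm. These values and the function names are illustrative only.

```python
def stokes_shift_nm(absorption_max_nm, emission_max_nm):
    """Stokes shift expressed as a wavelength difference in nm."""
    return emission_max_nm - absorption_max_nm


def stokes_shift_wavenumber(absorption_max_nm, emission_max_nm):
    """Stokes shift expressed as an energy difference in cm-1."""
    # Wavenumber (cm-1) = 1e7 / wavelength (nm).
    return 1.0e7 / absorption_max_nm - 1.0e7 / emission_max_nm


shift_nm = stokes_shift_nm(350.0, 450.0)
shift_cm1 = stokes_shift_wavenumber(350.0, 450.0)
print(f"Stokes shift: {shift_nm:.0f} nm, {shift_cm1:.0f} cm-1")
```

The wavenumber form is proportional to the energy lost between absorption and emission, so it is the more appropriate quantity when comparing shifts measured in different spectral regions.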
Figure 21 Graphical representation of the absorption and emission transitions and normalized
absorption spectra (in dimethylformamide) and fluorescence spectra of quinine sulfate dication
in (a) cyclohexane, (b) diethylether and (c) dimethylformamide. Reproduced with permission
from [172].
The peaks in the absorption spectrum correspond to transitions from the lowest
ground state energy levels to different vibrational levels of the electronic excited state.
Meanwhile, the peaks observed in the fluorescence spectrum arise from transitions
from the lowest vibrational level of the excited electronic state to the different
vibrational levels of the ground state. Following absorption (see Figure 21), an excited
fluorophore quickly undergoes relaxation (yellow arrows) to the lowest vibrational
energy level of the excited state (S1). All subsequent relaxation processes –
fluorescence, radiationless relaxation, and intersystem crossing – will therefore
originate from the lowest vibrational level of the excited state (S1). Thus, the
excitation wavelength should not influence the emission spectrum. Under ideal
conditions for a single fluorophore, the mirror image effect between the emission and
absorption spectra can be observed. In terms of Figure 21, the resulting emission
spectrum at λem = 450 nm strongly resembles the absorption spectrum at Aλ = 350 nm
from the ground state (S0) to the first excited transition state (S1), but not of the entire
absorption spectrum, which may include transitions to higher energy levels (S2) at Aλ
= 320 nm [162, 163].
In complex cell culture media, there are usually multiple fluorophores and thus the
observed emission is a combination of the emission from multiple fluorophores. This
means that the mirror image rule does not hold and that the emission is very sensitive
to the excitation wavelength. Thus in multi-fluorophoric mixtures like cell culture
media, multi-dimensional methods are commonly used to collect the maximum
information [1, 2, 4, 7, 8, 160].
Multi-component mixtures are better analysed using multi-dimensional data like EEM
[180, 181]. EEM and 2D synchronous (SFS) methods were compared for analysing
beer. The EEM data exhibited three bands: one at λex/λem of 250/350 nm, a second at
350/420 nm, and a third at 450/520 nm. These peaks were assigned to the aromatic
amino acids tryptophan, tyrosine and phenylalanine, as well as the vitamin riboflavin.
The 2D synchronous spectra were collected at ∆λ 30 nm and ∆λ 60 nm. Several bands
were observed in the synchronous fluorescence spectra taken at ∆λ = 30 nm. The
sharp and intense short-wavelength emission was attributed to amino acids, while the
longest-wavelength emission band belonged to riboflavin. The fluorescence was
measured directly from the beer and the data was used to quantify the amino acids and
riboflavin content [178]. Both EEM and SFS gave the same quantitative results for
riboflavin and tryptophan corresponding to RMSECV of 14% and 4% respectively.
EEM outperformed SFS for tyrosine and phenylalanine, with RMSECV of 4% and
16% compared to 6% and 31% respectively. The better performance was the result of
the full sample profile by EEM while the SFS was only a cross section of the data.
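The remark that a synchronous spectrum is only a cross section of the EEM can be made concrete. The sketch below assumes an EEM stored as a NumPy array with excitation along the rows and emission along the columns, on 10 nm grids, and extracts the points where the emission wavelength equals the excitation wavelength plus ∆λ = 30 nm. The synthetic single-band EEM and the function name are hypothetical.

```python
import numpy as np


def synchronous_from_eem(eem, excitation_nm, emission_nm, delta_nm):
    """Extract intensities where emission = excitation + delta_nm.

    eem has shape (len(excitation_nm), len(emission_nm)). Only excitation
    wavelengths whose offset emission wavelength lies on the emission grid
    are returned.
    """
    ex_out, intensities = [], []
    for i, ex in enumerate(excitation_nm):
        matches = np.flatnonzero(np.isclose(emission_nm, ex + delta_nm))
        if matches.size:
            ex_out.append(ex)
            intensities.append(eem[i, matches[0]])
    return np.array(ex_out), np.array(intensities)


# Synthetic EEM with one band centred at ex/em = 350/420 nm (illustrative).
excitation = np.arange(250.0, 460.0, 10.0)
emission = np.arange(300.0, 610.0, 10.0)
ex_grid, em_grid = np.meshgrid(excitation, emission, indexing="ij")
eem = np.exp(-((ex_grid - 350.0) ** 2 + (em_grid - 420.0) ** 2) / 800.0)

ex_sync, sfs = synchronous_from_eem(eem, excitation, emission, delta_nm=30.0)
print(f"{eem.size} EEM points, {sfs.size} points in the synchronous slice")
```

The slice contains far fewer points than the full matrix, and a band whose excitation and emission maxima are not separated by exactly ∆λ is sampled only on its flank.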
Figure 22 Three-dimensional plot of EEM for a beer sample studied along with the synchronous
fluorescence scan of beers 1 and 2 at ∆λ = 30 nm. Reproduced with permission from [178].
Rayleigh - from all molecules. The first order scatter is the prominent
scatterer, while a second order scatter can be seen at a wavelength double that
of the exciting light and is generally weak.
Raman - from all molecules. The scattered light is shifted to a longer
wavelength than the exciting light.
In the EEM data, the scatter appears as a series of sharp peaks along the
diagonal line in the matrix (Figure 23a). Light scatter artefact peaks can cause
problems with chemometric analysis of EEM data by interfering with qualitative and
quantitative analysis and swamping the signal. In order to avoid these problems, the
scatter peaks are eliminated as part of the pre-processing (Figure 23b) [183].
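One common way of removing scatter before chemometric analysis is to mask the affected region of the EEM. The sketch below is an illustration under stated assumptions and is not the pre-processing routine used in this work. It marks the first-order (emission = excitation) and second-order (emission = 2 × excitation) Rayleigh bands as missing values, with an assumed half-width of 10 nm. Raman scatter is not treated, and the function name is hypothetical.

```python
import numpy as np


def mask_scatter(eem, excitation_nm, emission_nm, half_width_nm=10.0):
    """Return a copy of eem with Rayleigh scatter bands set to NaN.

    First-order scatter lies at emission = excitation and second-order
    scatter at emission = 2 * excitation. The half-width is an assumed
    value, not one taken from the thesis.
    """
    ex = np.asarray(excitation_nm, dtype=float)[:, None]
    em = np.asarray(emission_nm, dtype=float)[None, :]
    first_order = np.abs(em - ex) <= half_width_nm
    second_order = np.abs(em - 2.0 * ex) <= half_width_nm
    cleaned = np.array(eem, dtype=float, copy=True)
    cleaned[first_order | second_order] = np.nan
    return cleaned


# Flat synthetic EEM on 10 nm grids (illustrative data only).
excitation = np.arange(250.0, 460.0, 10.0)
emission = np.arange(300.0, 610.0, 10.0)
eem = np.ones((excitation.size, emission.size))
cleaned = mask_scatter(eem, excitation, emission)
print(int(np.isnan(cleaned).sum()), "points masked")
```

The masked points can afterwards be treated as missing data or filled by interpolation, depending on what the subsequent model accepts.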
[Figure 23, panels (a) and (b): intensity plotted against excitation wavelength (nm) and emission wavelength (nm).]
Figure 23 (a) EEM landscapes of M5eRDF and (b) the scatter corrected spectrum for the same
M5eRDF sample from this body of work.
1.6.3.5.1 Quenching
Quenching (collisional or static) is the process by which fluorescence intensity
decreases. Collisional quenching occurs when the excited-state fluorophores are de-
excited by contact with some other molecule. This molecule is called a quencher and
its presence results in the return of the fluorophore to the ground state. The molecules
are not chemically altered in the process. A wide variety of molecules can act as
collisional quenchers. Examples include oxygen, halogens, amines and electron
deficient molecules like acrylamide [162]. Flavins like lumiflavin, riboflavin and
FMN undergo collisional quenching from iodide ions as they exert their quenching
effect by spin orbit perturbation, giving rise to increased intersystem crossing [184].
anionic and neutral, depending on the solution's pH value. The cationic form is
non-fluorescent, the anionic form is weakly fluorescent and the neutral form fluoresces.
The absorption spectrum differs, with each form dominating at varying pH values
[187, 188].
Figure 24 The cationic, neutral, and anionic structures of riboflavin species, R (ribityl side chain)
= -CH2(CHOH)3CH2OH, reproduced with permission from [188].
Solvent molecules can also interact with excited state molecules, thus lowering their
energy. Solvent effects relate to the chemical properties of the fluorophores, solvent
and surrounding molecules. The degree of interaction increases with solvent polarity.
Polar fluorophores are highly sensitive to solvent polarity while non-polar
fluorophores remain unaffected by solvent polarity. A large spectral shift for a small
change in the solvent composition generally indicates specific solvent effects. These
effects include hydrogen bonding, preferential solvation, charge transfer interactions
and acid and base chemical reactions. Solvents containing –Br, –I or –NO2 are
undesirable because they promote fluorescence quenching with increased triplet
formation [162, 187].
1.6.3.6 Fluorophores
In biological and fermentation samples, many intrinsic biological fluorophores exist.
Examples include aromatic amino acids (tryptophan, tyrosine and phenylalanine),
vitamins (riboflavin and pyridoxine) and coenzymes (NADH, NADPH, FMN and
FAD) [80, 191]. In cell culture media, the range of fluorophores is narrower. For example,
in the chemically defined basal medium (eRDF), five significant fluorophores were
identified: tryptophan, tyrosine, pyridoxine, folic acid and riboflavin [1]. The
interactions between all of the various components play a large part in determining
the shape and intensity of obtained spectra. This produces a unique fluorescence 3D
profile which can be used to characterize cell culture media. Small changes in media
composition can cause variances in the spectral profiles observed [1, 2].
Figure 25 Typical Biological fluorophores that can be detected with the use of an excitation-
emission matrix (EEM). Reproduced with permission from [191].
16
The quantum yield or efficiency (QE) for fluorescence is the ratio of the number of photons
emitted to the number of photons absorbed.
1.6.3.6.2 Vitamins and Co-enzymes
Riboflavin (Vitamin B2) is a precursor in the biosynthesis of two co-enzymes flavin
mononucleotide (FMN) and flavin adenine dinucleotide (FAD). Riboflavin is used in
energy metabolism via FMN and FAD [195]. Riboflavin, FMN and FAD absorb light
from the visible region at around 450 nm with a fluorescence emission maximum at
~520 nm. For riboflavin, the quantum yield is 0.26 at pH 7. The fluorescence yield is
the same for the FMN but lower for FAD. FAD fluorescence is weakened by the
presence of adenine [162]. Other fluorescent vitamins include Vitamin B1 (thiamine),
Vitamin B6 (pyridoxine) and Vitamin B9 (folic acid). Thiamine needs to be oxidised
to its fluorescent by-product thiochrome to enable detection. Thiochrome can be
excited at ~360 nm and emits at ~450 nm. Pyridoxine fluoresces at ~395 nm after
excitation at ~320 nm. For folic acid, excitation occurs at ~370 nm and emission is
seen at ~430 nm [179, 196, 197].
The main challenges facing in-line measurements are the physical attributes of the
fermentation process. Stirring rates and aeration procedures generate variations in
signals. The stirring rates can decrease the signal to noise ratio (SNR), while air
bubbles interfere with measurements. In general, signals from in-line analysis exhibit
a lower SNR than off-line measurements. In-line spectral data thus requires
smoothing to reduce the impact of noise [206]. The time requirements for data
analysis may affect the real-time output as the information rich spectra require
chemometrics to interpret and evaluate data. This is part of the model development to
be carried out prior to setting up automated analysis [208].
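The smoothing step mentioned above can be illustrated with a centred moving average, one of the simplest noise-reduction filters. This is a sketch only: the source does not state which smoothing method was applied, and the window size, the synthetic trace and the function name are assumptions made for illustration.

```python
def moving_average(signal, window=5):
    """Smooth a 1-D signal with a centred moving average.

    The window is shortened at the edges so the output has the same length
    as the input. The window size is an assumed, tunable parameter.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        segment = signal[max(0, i - half): i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Illustrative noisy trace: a flat signal with alternating +/-1 noise.
noisy = [10.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(11)]
smooth = moving_average(noisy, window=5)
print(smooth)
```

A wider window suppresses more noise but also broadens sharp spectral features, so the window has to be chosen against the narrowest band of interest.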
Figure 26 (a) Spectrum of a starved Yeast suspension at aerobic conditions; (b) spectrum of a
starved Yeast suspension at anaerobic conditions; (c) difference spectrum of a starved Yeast
suspension (anaerobic–aerobic). Reproduced with permission from [211].
The limitation of single compound monitoring arises when other fluorophores overlap
with the signal of interest making signal interpretation difficult. For example,
excitation at 365 nm generates NADPH and riboflavin emission overlap. This overlap
can be avoided with specific excitation wavelength selection. For NADPH, excitation
at 334 nm eliminates the riboflavin emission; meanwhile, for riboflavin, excitation at
404 nm eliminates the NADPH signal [205, 208]. Alternatively, using EEM or TSFS
allows one to simultaneously monitor several biogenic fluorophores such as vitamins,
coenzymes and amino acids, in order to give a clearer picture of the cellular activities
[206]. Different fluorophores can be more sensitive to different process aspects [199,
205, 209]. The most appropriate fluorophore to use for monitoring product
production, nutrient depletion and by-product build up will depend on the particular
fermentation [205].
Another way of analysing cell culture fluorescence was to examine the culture signal
as a whole. Multicomponent fluorescence analysis was performed on P. pastoris
batch culture, which contained NADPH, tryptophan and riboflavin. Prediction models
for biomass concentration were built using the fluorescence signal of the combined
three fluorophores. The combined fluorescence signal offered a more robust measure
than a single fluorophore. The strength of the model was dependent on the overall
signal and the interplay of fluorophores with the process variable [75, 206, 210, 212].
From previous work on cell culture media using fluorescence [1, 2, 4, 213], EEM data
has proven a suitable method for media screening and evaluation (i.e. rapidly
identifying different types of media and determination of the sample quality). EEM
data for seven different types of media were collected and the different media lots
were easily classified based on the spectroscopic profiles (Figure 27) using an NPLS-DA
scores plot. Scores describe the variation between samples, giving a visual
representation of how the samples relate to one another. Any significant
compositional changes which cause measurable differences in the spectroscopic data
are represented by changes in scores. Scores plots allow one to easily visualize sample
differences. This meant that different media samples were easy to identify despite
appearing visually similar. Even identical media samples that were changed by
different CHO based productions (A and L process) were separated based on their
slightly different in-process media compositions [213].
Figure 27 Scores plot of LV3 versus LV1 showing NPLS-DA discrimination of CD-A1, CD-A2,
CD-S2, insulin, eRDF, yeastolate, and phytone sample solutions. Reproduced with permission [4].
Changes in the composition within batches of media samples can impact the process;
therefore, it is necessary to determine the quality of the media samples. Chemometric
methods like MROBPCA and MANOVA can be used to identify outliers and define
class variance [4, 213]. MROBPCA identifies outlying samples based on
compositional changes. For example 22 yeastolate samples were tested in triplicate by
EEM. The MROBPCA outlier map identified 14 major and 5 minor outliers. The
spectra of the major outliers displayed either a higher or lower than normal
fluorescence compared to the main body of samples. The minor outliers were singular
events due to experimental error [5].
In conjunction with MROBPCA, MANOVA was used to calculate the class variance
of groups identified. The use of MANOVA can be useful for comparing media over
time. Ryan et al. observed changes to the class variance of media samples with time
indicating that minor compositional changes were occurring within the media during
storage. This can be an issue if the changes become significant and impact the
performance of the media [213].
17
Media can be stored in transparent bioreactors, media storage vessels, or single-use disposable
bioreactors.
Figure 28 The PARAFAC scores are shown for the two different storage conditions: (left) RT-L
and (right) RT-D. Components 1 to 4 are represented by blue squares (Trp), green inverted
triangles (Tyr), red circles (Py), and cyan upright triangles (FA/Rf and/or photo-products)
respectively. Reproduced with permission from [2].
EEM profiles are sensitive to compositional changes in the media. Therefore, the
EEM data was also tested for quantitative purposes by modelling the changes incurred
by varying analyte concentrations in order to predict the quantity of that analyte
within test samples. Calvet et al. developed a modified standard addition method for
the determination of tryptophan and tyrosine in eRDF media solutions [1] which was
later expanded to pyridoxine, riboflavin and folic acid [2]. eRDF samples produced
relatively complex spectra with strong fluorescence from tyrosine and tryptophan and
weaker contributions from pyridoxine, riboflavin and folic acid. Apart from
quantifying the media components within prepared medium, the best performing
models (tyrosine, tryptophan and riboflavin models) were applied to quantify these
analytes in stored eRDF media samples as they degraded. The tyrosine model failed
due to the dynamic changes seen in the stored media samples. However, the
tryptophan and riboflavin models compared well with the equivalent HPLC result,
which also confirmed changes in the tryptophan and riboflavin concentrations [1, 2].
These quantitative results coupled with the effective qualitative analysis show that
EEM fluorescence is a potential method for a wide-range of analytical tests for cell
culture media.
2.1 Chemometrics
During the 1970s, the development of chemometrics coincided with the emergence of
the personal computer and the increased use of computers in chemistry. Modern
instrumentation generates vast amounts of numerical data and the examination of such
data was limited until the introduction of computer based analysis [214]. A definition
of chemometrics is “the chemical discipline that uses mathematical and statistical
methods (a) to design or select optimal measurement procedures and experiments and
(b) to provide maximum chemical information by analysing chemical data”. In other
words, chemometrics is the computer based statistical analysis of chemical data [215].
commonly the determination of the amount of an analyte/analytes in a given sample
[217]. In this work, spectral and sample variance were evaluated using Principal
Component Analysis (PCA) for 2D and 3D data, while Robust Principal Component
Analysis (ROBPCA) and Parallel Factor Analysis (PARAFAC) were utilised for the
3D data only. For quantifying components in the data, the Partial Least Squares (PLS)
regression method was used for 2D data and Unfolded Partial Least Squares (UPLS)
regression method was used for the 3D data.
18
A correlation coefficient of 1.00 indicates a positive fit i.e. the relationship between obtained and
predicted variables follow a similar pattern. In the case of the correlation of zero it represents no
relationship between the obtained and the predicted. A correlation coefficient of -1.00 indicates a
negative fit amongst the different variables. It occurs when there is an increase in one variable and a
decrease associated with another variable.
2.4.2 Root Mean Square Error of Calibration (RMSEC)
The RMSEC is a measure of the uncertainty between the estimates obtained
for the calibration samples and the accepted true values of the calibration samples
used to obtain the model parameters in y = xb1 + e according to:
Equation 2-6
\[ RMSEC = \sqrt{\frac{1}{n-(m+1)} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \]
Because estimating the model parameters uses m+1 degrees of freedom, the
remaining n − (m + 1) degrees of freedom are used to estimate RMSEC. Typically
RMSEC generates overly optimistic values. This is a result of the internal error
estimation. The samples themselves are used to calculate the error; therefore
measurement noise is also modelled in the estimated parameters [218, 224].
The leave one out cross validation (LOOCV) and Monte Carlo cross validation
(MCCV) methods are two different strategies used for error estimation. LOOCV is
performed by generating n calibration models, where each of the n samples is left out
one at a time. In each case the omitted sample is analysed by the model. The
prediction values are averaged giving the estimate of the prediction ability. A pitfall
of LOOCV is the internal nature of the prediction, leading to an overly optimistic
outcome. Nevertheless, the LOOCV approach is often necessary when only small numbers
of calibration samples are available. When multiple samples are removed at a time,
the resulting validation is more accurate. With MCCV, the sample set is randomly
split many times into training (calibration) and validation sets. For each split,
validation is performed, ultimately giving an averaged MCCV value from a large
number of random splits [8] [218, 224].
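The two resampling strategies can be illustrated with a minimal sketch. It assumes a simple univariate straight-line calibration in place of the multivariate models used in this work, and the function names, the 30% validation fraction and the synthetic data are illustrative choices only.

```python
import math
import random

def fit_line(x, y):
    """Least-squares slope and intercept for a univariate calibration."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def rmse(residuals):
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def loocv_rmse(x, y):
    """Leave each sample out once, refit, and predict the omitted sample."""
    residuals = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        residuals.append(y[i] - (slope * x[i] + intercept))
    return rmse(residuals)

def mccv_rmse(x, y, n_splits=200, val_fraction=0.3, seed=0):
    """Average the validation RMSE over many random calibration/validation splits."""
    rng = random.Random(seed)
    n = len(x)
    n_val = max(1, int(round(val_fraction * n)))
    split_errors = []
    for _ in range(n_splits):
        val_idx = set(rng.sample(range(n), n_val))
        x_cal = [x[i] for i in range(n) if i not in val_idx]
        y_cal = [y[i] for i in range(n) if i not in val_idx]
        slope, intercept = fit_line(x_cal, y_cal)
        res = [y[i] - (slope * x[i] + intercept) for i in sorted(val_idx)]
        split_errors.append(rmse(res))
    return sum(split_errors) / n_splits

# Synthetic example: noisy signal intensity versus concentration (g/L).
noise = random.Random(1)
conc = [0.32 * k for k in range(32)]
signal = [2.0 * c + 1.0 + noise.gauss(0.0, 0.05) for c in conc]
print(loocv_rmse(signal, conc), mccv_rmse(signal, conc))
```

Fixing the seed makes the random MCCV splits reproducible, which is convenient when comparing models built with different pre-processing.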
The RMSEP is also known as root mean square error of validation (RMSEV). It is a
measure of the uncertainty that can be expected in future predictions. With RMSEP, a
set of validation samples (test set) are prepared and measured independently from the
calibration samples. The number of validation samples p should be sufficiently large
so that the estimated prediction error accurately reflects all sources of variability
within the calibration method [218, 226]. The RMSEP is computed by:
Equation 2-8
\[ RMSEP = \sqrt{\frac{1}{p} \sum_{i=1}^{p} (y_i - \hat{y}_i)^2} \]
The RMSEP judges the prediction ability of the model and indicates if the number of
latent variables used is correct. The RMSEP has the same units as the validation
samples [110].
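As a worked illustration of the two error measures, the sketch below computes RMSEC and RMSEP for a small set of made-up reference and predicted values. It assumes that RMSEC is divided by the n minus (m + 1) residual degrees of freedom left after fitting m latent variables; the function names and numbers are hypothetical.

```python
import math

def rmsec(y_true, y_fit, n_latent):
    """Calibration error using n - (m + 1) degrees of freedom (m = latent variables)."""
    dof = len(y_true) - (n_latent + 1)
    if dof <= 0:
        raise ValueError("need more calibration samples than m + 1")
    ss = sum((yt - yf) ** 2 for yt, yf in zip(y_true, y_fit))
    return math.sqrt(ss / dof)

def rmsep(y_true, y_pred):
    """Prediction error averaged over the p independent validation samples."""
    ss = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    return math.sqrt(ss / len(y_true))

reference = [1.0, 2.0, 3.0, 4.0, 5.0]   # accepted true values (g/L)
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]   # model output for the same samples
print(rmsec(reference, predicted, n_latent=1), rmsep(reference, predicted))
```

Both values carry the units of the reference data, so they can be compared directly with the concentration range of the samples.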
batches and process variables, while PLS can be used to investigate the correlation
between spectral data and quantitative properties such as product yield [228]. Using
multiple analysis methods allows for a more detailed exploration of the process to be
carried out [229].
The PCA algorithm reduces the number of variables and the information is projected
onto a smaller number of significant variables, the so-called principal components
(PCs). The principal components are linear combinations of the original variables and
are selected so that the first principal component covers as much of the variation in
the data as possible. The second principal component is orthogonal to the first and
covers as much of the remaining variation as possible and so forth [206].
The mathematical model for the PCA method is as follows:
Equation 2-9
\[ X = T_A P_A^{t} + \varepsilon \]
Where T is an N by A matrix containing the scores of the PCs and P is an M-by-A
matrix containing the loadings of the PCs and the ɛ matrix contains unexplained
variance. The scores are the intensities of each of the new compressed variables for all
of the samples and contain information on how the samples relate to each other. The
loadings are the distributions of the new variables in terms of the original variables
and include information on how variables relate to one another.
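The decomposition into scores, loadings and residuals can be sketched with a singular value decomposition. This is a minimal illustration on synthetic data, assuming mean-centred X; the function name and array sizes are illustrative and it is not the software used in this work.

```python
import numpy as np

def pca(X, n_components):
    """Return scores T (N x A), loadings P (M x A) and residuals E for mean-centred X."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]   # scores: samples in PC space
    P = Vt[:n_components].T                      # loadings: PCs in variable space
    E = Xc - T @ P.T                             # unexplained variance
    return T, P, E

rng = np.random.default_rng(0)
# 20 synthetic "spectra" with 50 variables, built from two underlying profiles.
profiles = rng.random((2, 50))
amounts = rng.random((20, 2))
X = amounts @ profiles + 0.001 * rng.standard_normal((20, 50))

T, P, E = pca(X, n_components=2)
explained = 1.0 - (E ** 2).sum() / ((X - X.mean(axis=0)) ** 2).sum()
print(T.shape, P.shape, explained)
```

Because the loadings are orthonormal, the scores are simply the projections of the centred data onto them.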
Figure 29 PCA plot is shown where the blue circles represent the scores of the sample after PCA
analysis. The major axis of the ellipse represents the first principal component, PC1, and its
minor axis the second principal component, PC2.
19
Developed by Hubert et al. in 2005
For high dimensional data, where the number of variables is greater than the number
of samples, ROBPCA proceeds as follows: The X data is pre-processed by reducing
its data space to the linear transformed subspace using singular value decomposition.
The dimension of this subspace is at most N-1, where N is the number of samples. A
measure of “outlyingness” is computed for each data point obtained within the new
data space by projecting high dimension data points in many univariate directions. For
every direction a robust centre and scale is computed for the projected data points.
Each direction is scored by its corresponding value of “outlyingness”:
Equation 2-10
\[ outl(X_i) = \max_{v} \frac{|X_i^{t} V - m_{MCD}(X_i^{t} V)|}{s_{MCD}} \]
Where $X_i^{t} V$ is the standardised distance to the centre measured for each data point,
location $m_{MCD}$ and scale $s_{MCD}$ are the univariate minimum covariance determinant
(MCD) estimators and V is the number of univariate directions. Using the data points
with the smallest “outlyingness” to form a covariance matrix, the final number of
principal component K is determined. The data points are projected onto a K
dimensional subspace of which the centre and shape is determined by means of a
reweighted MCD estimate. From this the robust principal components are known and
the robust centre is the MCD location estimate [234-236].
2.5.2 Regression
PLS regression uses exactly the same mathematical model for compression of X and Y
data. The data matrix X is decomposed into a matrix of scores T and loadings P and
the response matrix Y is also split into a matrix of scores U and loadings Q:
Equation 2-11
\[ X = TP^{T} + E \]
Equation 2-12
\[ Y = UQ^{T} + F \]
The goal of PLS regression is to model all the variables within X and Y in order that
the error in X block, E, and the error in Y block, F, is minimised. The least squares
regression is performed between U and T. An internal correlation is built that relates
the scores of the X block to the scores of the Y block in terms of the maximum
covariance between X and Y:
Equation 2-13
\[ U = TW \]
This is followed by the overall regression step where the decomposition of X is used
to predict y.
Equation 2-14
\[ \hat{y} = X\hat{\beta} \]
The regression coefficients are given by
Equation 2-15
\[ \hat{\beta} = P(P^{T}P)^{-1} W Q^{T} \]
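For a single response, the decomposition and regression steps can be sketched with the NIPALS form of PLS1. Note the assumptions: this sketch uses the common weight-based expression b = W(PᵀW)⁻¹q for the regression vector, where W holds the X weights, which is a different notation from Equation 2-15, and the data are synthetic. It is an illustration rather than the implementation used in this work.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 by NIPALS; returns coefficients b and intercept b0 for y ~ X @ b + b0."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # X scores for this latent variable
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        q_a = (yk @ t) / tt           # regression of y on the scores
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - q_a * t             # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)
    return b, y_mean - x_mean @ b

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 6))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0, 3.0]) + 0.05 * rng.standard_normal(30)

b, b0 = pls1_fit(X, y, n_components=2)
y_hat = X @ b + b0
print(np.sqrt(np.mean((y - y_hat) ** 2)))
```

When the number of latent variables equals the rank of X, the PLS solution coincides with ordinary least squares, which provides a convenient check on an implementation.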
Figure 30 The unfolding scheme for multi-dimensional array into KxIJ slices, adapted from [241]
with permission to reproduce.
way data where simpler mathematical models can be applied. During the unfolding
process, one of the directions remains unchanged while the other two are arranged
slice by slice to give a row vector. A cube (KxIxJ) can be unfolded in three different
directions: row wise (KxIJ), column wise (IxKJ), and tube wise (JxKI). After
unfolding an EEM matrix, the 2D matrix will have the following dimensions Kx(I*J)
where K is the number of samples and I is the number of excitation wavelengths and J
the number of emission wavelengths. PLS regression analysis is performed on the
rearranged two-way data [218, 219, 242-244].
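The three unfolding directions can be sketched with array reshaping. The cube dimensions below are arbitrary illustrative values and the variable names are not taken from the software used in this work.

```python
import numpy as np

K, I, J = 4, 3, 5   # samples, excitation wavelengths, emission wavelengths
cube = np.arange(K * I * J, dtype=float).reshape(K, I, J)

row_wise = cube.reshape(K, I * J)                         # K x IJ
column_wise = cube.transpose(1, 0, 2).reshape(I, K * J)   # I x KJ
tube_wise = cube.transpose(2, 0, 1).reshape(J, K * I)     # J x KI

# Each row of the row-wise matrix is one sample's EEM laid out as a row vector,
# so refolding recovers the original landscape.
refolded = row_wise.reshape(K, I, J)
print(row_wise.shape, column_wise.shape, tube_wise.shape)
```

The row-wise form keeps one sample per row, which is the layout a two-way regression method such as PLS expects.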
Figure 31 Scheme of steps for the resolution process in MCR-ALS method, adapted from [245]
with permission to reproduce.
In MCR, to start the iterative ALS process, an initial estimation of the factors present
in the spectral profiles for each sample is performed by methods like PCA. With the
initial estimation, solving for both $C$ and $S^T$ least squares solutions can be
implemented in an alternating cycle, with iterations giving a new $C$ or $S^T$ matrix.
The calculation of $C$ and $S^T$ is repeated until an optimal solution is obtained or
convergence is achieved. Constraints may be imposed on the profiles, for example
non-negativity where the spectra or concentration values cannot be negative. The
MCR-ALS method works with trilinear and non-trilinear data sets. A trilinear
structure can be set as an optional constraint in the MCR-ALS method for the C
matrix [218, 245-247].
2.5.3.2 Parallel Factor Analysis (PARAFAC)
For the analysis of the EEM data, one can also use PARAFAC as a decomposition
method in order to resolve the fluorescence landscape into a number of trilinear
components f, which, in theory, could represent the excitation and emission spectra of
the constituent fluorophores. In PARAFAC, multi-way data are decomposed into sets
of scores and loadings with the same number of factors identified by the model. The
numbers of factors or latent variables are much lower than the number of original
variables making visualisation of the data possible. PARAFAC uses all the original
variables to determine the set of latent variables [200, 248, 249].
The objective for PARAFAC is to build a model that minimises the sum of squares of the
residuals $r_{ijk}$:
Equation 2-17
\[ X_{ijk} = \sum_{f=1}^{F} a_{if} b_{jf} c_{kf} + r_{ijk} \]
Where 𝑋𝑖𝑗𝑘 is an element of the three-way data and i, j, and k are the indices of this
element on the sample, excitation and emission modes. The fluorescence landscape is
decomposed into sample scores, 𝑎𝑖𝑓, excitation loadings, 𝑏𝑗𝑓 , and emission loadings,
𝑐𝑘𝑓 , for each factor f. The residual 𝑟𝑖𝑗𝑘 contains the variation not captured by the
PARAFAC model [248, 250-252].
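The trilinear structure can be illustrated by building a small synthetic landscape from known scores and loadings. This sketch assumes the standard PARAFAC form, in which each element is the sum over factors of the product of a sample score, an excitation loading and an emission loading; the array sizes and random values are illustrative only, and no fitting is performed.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_ex, n_em, n_factors = 5, 8, 12, 2
A = rng.random((n_samples, n_factors))  # sample scores a_if
B = rng.random((n_ex, n_factors))       # excitation loadings b_jf
C = rng.random((n_em, n_factors))       # emission loadings c_kf

# Model part of each element: sum over f of a_if * b_jf * c_kf.
model = np.einsum("if,jf,kf->ijk", A, B, C)

# Adding noise gives "measured" data; the residuals are what the model leaves behind.
X = model + 0.01 * rng.standard_normal(model.shape)
residual = X - model
ssr = float((residual ** 2).sum())
print(model.shape, ssr)
```

A PARAFAC algorithm works in the opposite direction, estimating A, B and C from X so that this sum of squared residuals is as small as possible.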
The core consistency gauge is a method for finding the correct number of components
to use in PARAFAC modelling [253]. In an ideal PARAFAC model, the core array
has ones on the super-diagonal, and zeros elsewhere, implying that no interactions
occur between the components from different modes for the PARAFAC model. A
core array is calculated from the loadings for each component in the model and
compared with the ideal PARAFAC core array of zeros and ones. The optimum
model is the one whose number of components still gives an
acceptable core array. Core consistency can be increased by lowering the number of
components [254]. The core consistency is based on the relative sum of squared differences
between the calculated core array and the ideal super-diagonal array of ones. Core consistency
provides a quantitative measure of how well the loadings represent variation within
the data. It is generally expressed as a percentage and if the percentage is close to 100,
the model gives an appropriate description of the data. In cases of a low core
consistency percentage, the model is not describing the data [255, 256].
2.6.2 Derivatives
Derivatives act as a frequency scaling tool and high pass filter. Derivative pre-
processing minimises lower frequency features such as sloping baseline and retains
the high frequency aspects of the original data like the Raman peaks (Figure 33). A
drawback to derivative pre-processing comes from the frequency response function
used in polynomial smoothing which can introduce distortions into the data. Also the
filtering nature removes substantial amounts of signal, producing a lower signal to
noise ratio in the data [218]. The first order derivative effectively removes the
baseline offset variation in the spectral profiles and is useful where the samples
exhibit a baseline shift. The second derivative removes differences in baseline offset
and baseline slope. In the case of a complex spectrum, the use of a second order
derivative can make spectral interpretation more difficult. However, for a low signal
spectrum, the second order derivative enhances the signal [220, 232, 259, 260]. In this
body of work, the Savitzky-Golay derivative algorithm (footnote 20) is used [261]. The S-G
algorithm fits individual polynomials to filter windows around each point across the
spectrum. This continues until it reaches the end of the spectrum. It requires the
selection of the size of the window, the order of the polynomial and the order of
derivative [220, 232, 259, 260].
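The role of the three settings (window size, polynomial order and derivative order) can be seen in a minimal sketch of the Savitzky-Golay calculation. It assumes evenly spaced data and leaves the edge points undefined rather than extrapolating; the function names are illustrative and this is not the routine used in this work.

```python
import math
import numpy as np

def savgol_coefficients(window, polyorder, deriv=0, delta=1.0):
    """Weights giving the deriv-th derivative of a local polynomial fit at the window centre."""
    if window % 2 == 0 or polyorder >= window or deriv > polyorder:
        raise ValueError("need an odd window, polyorder < window and deriv <= polyorder")
    half = window // 2
    z = np.arange(-half, half + 1, dtype=float)
    V = np.vander(z, polyorder + 1, increasing=True)   # columns 1, z, z^2, ...
    # Row `deriv` of pinv(V) maps the window data to the fitted coefficient of z^deriv.
    return math.factorial(deriv) * np.linalg.pinv(V)[deriv] / delta ** deriv

def savgol_derivative(y, window, polyorder, deriv=1, delta=1.0):
    """Apply the filter point by point; the first and last window//2 points stay NaN."""
    coeffs = savgol_coefficients(window, polyorder, deriv, delta)
    half = window // 2
    out = np.full(len(y), np.nan)
    for i in range(half, len(y) - half):
        out[i] = coeffs @ y[i - half:i + half + 1]
    return out

x = np.linspace(0.0, 1.0, 101)
spectrum = 3.0 * x ** 2 + 2.0 * x + 1.0          # smooth test curve
first = savgol_derivative(spectrum, window=11, polyorder=2, deriv=1, delta=x[1] - x[0])
```

For a curve that is itself a second order polynomial, a second order fit reproduces the analytical derivative exactly in the interior, which makes a simple check of the coefficients.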
Figure 33 (Left) Raman spectra of M1GLU data (this work) and (Right) the first order derivative
following Savitzky-Golay smoothing of M1GLU data (this work).
20
There are alternative derivative methods such as finite differences and Norris-Williams (NW)
derivation. The former is sensitive to high frequency noise while NW is less applicable to
spectroscopic data than Savitzky-Golay.
Figure 34 (A) Un-processed NIR spectra and (B) after multiplicative scatter correction of wheat
samples [264].
2.6.4 Normalisation
Normalising methods attempt to overcome changes in the data caused by fluctuations in
absolute intensity arising from instrument or measurement factors, e.g. light source variation.
These may be overcome by identifying a feature present in the sample/spectra that should be
constant from one sample to the next and correcting the variables to scale based on
this characteristic. A good normalisation method minimises the variance between
spectra and maximises the signal for classification/discrimination purposes. Full
spectrum normalisation, such as scaling to the area under the curve, captures the
general characteristics of the data. On the other hand, local normalisation methods,
such as scaling to a known peak, polarise the spectra, which may be useful with varying noise
levels [265-267].
Below are the equations for the normalisation methods used in this study. [218, 232]
In each case, 𝑋𝑖,𝑛𝑜𝑟𝑚 is the normalized spectrum; 𝑋𝑖 is the spectrum of the ith sample,
𝑋𝑖∗ is the vector of observed values for the given normalisation; j is the variable
number and n is the total number of variables.
In area normalisation (Norm1), each spectrum is scaled to unit area under the curve.
This is achieved by dividing each variable by the sum of the absolute
values of all variables for the given sample.
Equation 2-18
\[ X_{i,norm1} = \frac{X_i}{\sum_{j=1}^{n} |X_i^{*}|} \]
With Norm2 normalisation, each variable is divided by the square root of the sum of
the squared values of all variables for the given sample. Norm2 returns a vector of
unit length (length equal to one). It is a form of weighted normalisation where larger
values are weighted more heavily in the scaling.
Equation 2-19
\[ X_{i,norm2} = \frac{X_i}{\sqrt{\sum_{j=1}^{n} (X_i^{*})^{2}}} \]
For the infinity normalisation mode (maximum norm - NormINF), each variable is
divided by the highest peak observed for all variables of a given sample, giving a
vector scaled to a maximum value equal to one. Therefore all variables are weighed
against the largest value.
Equation 2-20
\[ X_{i,norminf} = \frac{X_i}{\max(X_i^{*})} \]
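The three normalisation modes can be compared on a short made-up vector. This sketch assumes that Norm2 divides by the square root of the sum of squares (the choice that yields a vector of unit length) and that the infinity norm uses the largest absolute value; the function names and numbers are illustrative.

```python
import math

def norm1(spectrum):
    """Scale so that the sum of absolute values (area) equals one."""
    total = sum(abs(v) for v in spectrum)
    return [v / total for v in spectrum]

def norm2(spectrum):
    """Scale so that the Euclidean length of the vector equals one."""
    length = math.sqrt(sum(v * v for v in spectrum))
    return [v / length for v in spectrum]

def norm_inf(spectrum):
    """Scale so that the largest absolute value equals one."""
    peak = max(abs(v) for v in spectrum)
    return [v / peak for v in spectrum]

spectrum = [3.0, 4.0, 0.0, 12.0]
print(norm1(spectrum))
print(norm2(spectrum))
print(norm_inf(spectrum))
```

All three leave the shape of the spectrum unchanged and differ only in the single scaling factor applied to each sample.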
[Figure with four panels (a) to (d): Intensity versus Wavenumber (cm-1)]
Water is the main component of all samples studied in this work. The water bands can
act as an internal standard, as vibrational OH bands are evident above 3000 cm–1 and
at 1640 cm–1, and these non-overlapping bands can be used as internal references. By
normalising to the OH bending vibration we are reducing the impact of the absolute
intensity variation of the water signal. The water signal is so intense that its
observed variance is large due to measurement error, but since water is present in excess we
can assume that in reality it is an invariant signal. In other words, for the changing
media composition, the water signal should remain unchanged and is thus suitable as an
internal standard. Looking to the differences in the analyte-water band ratios, the
intensity relative to the water signal gives a good estimation of the analyte signal.
21
Full spectrum calibration models can be misled by contrived correlations from system drift or co-
variation amongst constituents resulting in over-fitting of the model.
2.7.1 Moving Window Partial Least Squares (MWPLS)
Moving Window Partial Least Squares (MWPLS) is a wrapper variable selection
method and was the primary variable selection method used in this body of work. The
objective of MWPLS is to find the informative spectral regions within complex
spectra. Informative regions hold the most relevant information for the PLS analysis
to yield better performing models. The operation of MWPLS uses a fixed sized
window to build a series of PLS models across the whole spectral region. The
informative regions are assigned by the examination of the complexity and the error
level of the PLS models. For each PLS model built along the spectrum, the sum of
squared residuals (SSR) is calculated. In Figure 36, the SSR versus window position
plot provides a display of the residue lines useful in identifying informative regions.
Residue lines show downward-facing bands, which correspond to a particular
wavenumber range. The informative ranges have a low SSR value compared to the
insignificant ranges [271-273].
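The moving window idea can be sketched on synthetic data. To keep the example short, an ordinary least-squares fit is substituted for the PLS model inside each window, so only a single residue value per window is produced rather than one line per number of latent variables; the band position, window width and function name are illustrative assumptions.

```python
import numpy as np

def moving_window_ssr(X, y, window):
    """SSR of a least-squares fit of y on each window of adjacent variables."""
    n_vars = X.shape[1]
    yc = y - y.mean()
    ssr = np.empty(n_vars - window + 1)
    for start in range(n_vars - window + 1):
        Xw = X[:, start:start + window]
        Xw = Xw - Xw.mean(axis=0)
        coef, *_ = np.linalg.lstsq(Xw, yc, rcond=None)
        ssr[start] = ((yc - Xw @ coef) ** 2).sum()
    return ssr

rng = np.random.default_rng(0)
y = np.linspace(0.0, 9.92, 32)                    # concentrations (g/L)
X = rng.standard_normal((32, 60))                 # noise-only "spectra"
band = np.exp(-0.5 * ((np.arange(10) - 4.5) / 2.0) ** 2)
X[:, 20:30] += np.outer(y, band)                  # informative band at variables 20-29

ssr = moving_window_ssr(X, y, window=10)
best = int(np.argmin(ssr))
print(best, ssr[best], ssr[0])
```

Windows that overlap the informative band give a much lower SSR than windows containing only noise, which is the downward-facing feature looked for in the residue lines.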
Figure 36 MWPLS Residue lines obtained from Raman spectra of M1GLU data collected in this
work for the calibration samples. Each colour represents a different residual line.
2.8 Outliers
An outlier is an observation which does not obey the pattern of the majority of the
data. There are two types of outliers: firstly odd measurements, where one of the
replicate measurements is different, possibly due to measurement error; and secondly
odd samples, where samples may be compositionally different to each other. Outliers
should be flagged and discarded as they will introduce unfavourable repercussions in
further analysis. Outlier analysis techniques have the ability to detect the artefacts
(like spectral distortions and deviation from offset) that cannot be seen in the spectra.
If a model describes the variance expected from the calibration, test and unknown
samples then one should not encounter outliers. When samples do not fit the model it
is an indicator that either,
a) the sample is different, or
b) the model does not adequately describe the variance space.
In addition to computing principal components, both PCA and ROBPCA also flag
outliers. Outliers are identified on a scores plot if they fall outside of the 95%
confidence limit or are identified as outliers in the corresponding Hotelling’s T²/Q
residual model. There are different categories of outliers which are encountered in a
ROBPCA outlier map (Figure 37): (1) good leverage points, (2) orthogonal outliers,
and (3) bad leverage points. Good leverage points lie close to the PCA subspace but
far from the regular observations, whereas orthogonal outliers have a large
orthogonal distance (footnote 22) to the PCA subspace yet their projection lies within
the regular data in the PCA space. Bad leverage points have a large orthogonal distance
and their projection on the PCA space is far from the regular data [234, 235] [274] [275].
Figure 37: An outlier map plots the orthogonal distance versus the score distance, with
orthogonal outliers and good and bad leverage points.
22
The orthogonal distance is the distance between an observation and its projection in the k-
dimensional subspace.
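The two axes of an outlier map can be sketched as follows. For brevity this example computes the distances from a classical (non-robust) PCA subspace rather than from ROBPCA, and the synthetic points, one lifted off the plane and one far along it, are illustrative assumptions.

```python
import numpy as np

def outlier_map_distances(X, n_components):
    """Score distance and orthogonal distance of each row of X to a classical PCA subspace."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T
    T = Xc @ P
    eigenvalues = s[:n_components] ** 2 / (X.shape[0] - 1)
    score_dist = np.sqrt(((T ** 2) / eigenvalues).sum(axis=1))   # distance within the subspace
    orth_dist = np.linalg.norm(Xc - T @ P.T, axis=1)             # distance to the subspace
    return score_dist, orth_dist

rng = np.random.default_rng(0)
regular = np.column_stack([
    rng.standard_normal(50),
    rng.standard_normal(50),
    0.01 * rng.standard_normal(50),      # regular data lie close to a plane
])
orthogonal_outlier = [0.0, 0.0, 3.0]     # off the plane, projection among regular data
good_leverage = [10.0, 0.0, 0.0]         # in the plane, far from regular data
X = np.vstack([regular, orthogonal_outlier, good_leverage])

sd, od = outlier_map_distances(X, n_components=2)
print(int(np.argmax(od)), int(np.argmax(sd)))
```

Plotting the orthogonal distance against the score distance for every sample gives the layout of an outlier map; a robust method differs in that the subspace itself is not pulled towards the outlying points.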
Materials and Methods
2.9 Materials
The conventional Raman study covered the development of the analysis method from
the initial data collection setup to data handling and chemometric analysis. It involved
the changing of the sample composition and the progression from single analyte
component samples to five component model medium samples. The chemometric
methods were applied to both the simple and complex samples and with increasing
complexity the degree of accuracy from the model performance was studied. The
Raman study was undertaken firstly to develop a method for measuring D-glucose
concentration in the M5Glu media. Two more sample sets, M5eRDF and M5Ye (see
below), were model media with very complex compositions (amino acids,
carbohydrates, minerals, etc.). For these media the goal was to quantify the amount of
multi-component ingredients (e.g. eRDF and yeastolate) added to each media blend.
For the complex ingredient quantification, more sensitive methods like SERS and
EEM were investigated. Qualitative and quantitative models were built for the eRDF
and yeastolate samples using Raman, SERS, EEM and TSFS.
2.12 Datasets
2.12.3 M3Glu Media Dataset
A 3-component model system (D-glucose, L-glutamine and D-galactose) was
generated to determine the potential for quantifying closely related analytes in dilute
aqueous solution. D-glucose concentration was varied throughout the sample set and
L-glutamine and D-galactose concentrations were kept at either a high or low level as
a background influence (Table 3). Chemometric modelling was performed on the
Raman data collected to predict the D-glucose concentration within these samples as it
was giving the strongest signal of the media components used.
Table 3 Composition of the M3Glu model media solutions.
# Sample D-glucose (g/L) D-galactose (g/L) L-glutamine (g/L)
M3Glu01 0.00 1.3 0.44
M3Glu02 0.32 3.7 1.16
M3Glu03 0.64 1.3 0.44
M3Glu04 0.96 3.7 1.16
M3Glu05 1.28 1.3 0.44
M3Glu06 1.60 3.7 1.16
M3Glu07 1.92 1.3 0.44
M3Glu08 2.24 3.7 1.16
M3Glu09 2.56 1.3 0.44
M3Glu10 2.88 3.7 1.16
M3Glu11 3.20 1.3 0.44
M3Glu12 3.52 3.7 1.16
M3Glu13 3.84 1.3 0.44
M3Glu14 4.16 3.7 1.16
M3Glu15 4.48 1.3 0.44
M3Glu16 4.80 3.7 1.16
M3Glu17 5.12 1.3 0.44
M3Glu18 5.44 3.7 1.16
M3Glu19 5.76 1.3 0.44
M3Glu20 6.08 3.7 1.16
M3Glu21 6.40 1.3 0.44
M3Glu22 6.72 3.7 1.16
M3Glu23 7.04 1.3 0.44
M3Glu24 7.36 3.7 1.16
M3Glu25 7.68 1.3 0.44
M3Glu26 8.00 3.7 1.16
M3Glu27 8.32 1.3 0.44
M3Glu28 8.64 3.7 1.16
M3Glu29 8.96 1.3 0.44
M3Glu30 9.28 3.7 1.16
M3Glu31 9.60 1.3 0.44
M3Glu32 9.92 3.7 1.16
The thirty one different solutions of D-glucose prepared in the M1Glu dataset were
used as the stock solutions for D-glucose in the M3Glu dataset. Two stock solutions
were prepared for L-glutamine at 2.2 g/L and 5.8 g/L. Two stock solutions were
prepared for D-galactose at 6.5 g/L and 18.5 g/L. All stock solutions were prepared
with Millipore water. The M3Glu dataset consisted of 32 samples. Sample one was
prepared by pipetting 6 mL of Millipore water, 2 mL of L-glutamine and 2 mL of D-
galactose at the low concentration giving a sample volume of 10 mL. For sample two
to sample thirty two, a 2 mL aliquot of the specified D-glucose solution was pipetted
into a sample vial together with 2 mL of L-glutamine and 2 mL of D-galactose (at
high concentration for the even numbered samples and at low concentration for the
odd numbered samples). The sample volume was made up to 10 mL with 4 mL of
Millipore water.
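The background concentrations listed in Table 3 follow from the dilution of each 2 mL stock aliquot to a 10 mL sample volume. The short calculation below reproduces them; the function name is illustrative.

```python
def diluted_concentration(stock_g_per_l, aliquot_ml, final_volume_ml):
    """C_final = C_stock * V_aliquot / V_final."""
    return stock_g_per_l * aliquot_ml / final_volume_ml

ALIQUOT_ML = 2.0
FINAL_ML = 10.0
stocks = {                      # (low, high) stock concentrations in g/L
    "L-glutamine": (2.2, 5.8),
    "D-galactose": (6.5, 18.5),
}

final = {
    name: tuple(round(diluted_concentration(c, ALIQUOT_ML, FINAL_ML), 2) for c in levels)
    for name, levels in stocks.items()
}
print(final)
```

The low and high levels obtained this way are 0.44 and 1.16 g/L for L-glutamine and 1.3 and 3.7 g/L for D-galactose, in agreement with Table 3.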
Media can also contain complex ingredients that themselves contain glucose. Yeastolate and
eRDF are complex mixtures that have D-glucose as a component, for example eRDF
has a glucose concentration of 0.019 mg/g. The same issue arises with yeastolate. For
yeastolate, there is no accurate compositional data available. The amount of glucose
contained in yeastolate and eRDF is so small that it should not have a significant
impact on the model. When quantitative analysis of D-glucose on the M5Glu data was
performed, the concentration of D-glucose in yeastolate and eRDF was not taken into
account.
Table 4 The composition of the M5Glu samples with the additive contribution of D-glucose from
eRDF and yeastolate giving a new range of D-glucose from 0.0 g/L to 9.92 g/L.
Sample No D-glucose (g/L) eRDF (g/L) Yeastolate (g/L) D-galactose (g/L) L-glutamine (g/L)
M5Glu01 0.00 3.4 1 2.5 0.8
M5Glu02 0.32 3.4 1 2.5 0.8
M5Glu03 0.64 3.4 1 2.5 0.8
M5Glu04 0.96 3.4 1 2.5 0.8
M5Glu05 1.28 3.4 1 2.5 0.8
M5Glu06 1.60 3.4 1 2.5 0.8
M5Glu07 1.92 3.4 1 2.5 0.8
M5Glu08 2.24 3.4 1 2.5 0.8
M5Glu09 2.56 3.4 1 2.5 0.8
M5Glu10 2.88 3.4 1 2.5 0.8
M5Glu11 3.20 3.4 1 2.5 0.8
M5Glu12 3.52 3.4 1 2.5 0.8
M5Glu13 3.84 3.4 1 2.5 0.8
M5Glu14 4.16 3.4 1 2.5 0.8
M5Glu15 4.48 3.4 1 2.5 0.8
M5Glu16 4.80 3.4 1 2.5 0.8
M5Glu17 5.12 3.4 1 2.5 0.8
M5Glu18 5.44 3.4 1 2.5 0.8
M5Glu19 5.76 3.4 1 2.5 0.8
M5Glu20 6.08 3.4 1 2.5 0.8
M5Glu21 6.40 3.4 1 2.5 0.8
M5Glu22 6.72 3.4 1 2.5 0.8
M5Glu23 7.04 3.4 1 2.5 0.8
M5Glu24 7.36 3.4 1 2.5 0.8
M5Glu25 7.68 3.4 1 2.5 0.8
M5Glu26 8.00 3.4 1 2.5 0.8
M5Glu27 8.32 3.4 1 2.5 0.8
M5Glu28 8.64 3.4 1 2.5 0.8
M5Glu29 8.96 3.4 1 2.5 0.8
M5Glu30 9.28 3.4 1 2.5 0.8
M5Glu31 9.60 3.4 1 2.5 0.8
M5Glu32 9.92 3.4 1 2.5 0.8
2.12.5 T5 Test Dataset
Ten different stock solutions of D-glucose ranging from 4 g/L to 44.5 g/L were
prepared. Stock solutions with 4 g/L of L-glutamine, 12.5 g/L of D-galactose, 5 g/L of
yeastolate and 17 g/L of eRDF were also prepared. All stock solutions were made
with Millipore water. The T5 samples were assembled by pipetting 2 mL of the
specified D-glucose stock solution, 2 mL of L-glutamine, 2 mL of D-galactose, 2 mL
of yeastolate and 2 mL of eRDF to give a final sample volume of 10 mL.
Table 5 D-glucose sample composition for Raman testing with the amount of D-glucose per
sample changing at a rate of 0.9 g/L while the other components remain at one concentration.
Sample No eRDF (g/L) D-glucose (g/L) L-glutamine (g/L) D-galactose (g/L) Yeastolate (g/L)
T5Glu01 3.4 1.70 0.8 2.5 1
T5Glu02 3.4 2.60 0.8 2.5 1
T5Glu03 3.4 3.50 0.8 2.5 1
T5Glu04 3.4 4.40 0.8 2.5 1
T5Glu05 3.4 5.30 0.8 2.5 1
T5Glu06 3.4 6.20 0.8 2.5 1
T5Glu07 3.4 7.10 0.8 2.5 1
T5Glu08 3.4 8.00 0.8 2.5 1
T5Glu09 3.4 8.90 0.8 2.5 1
T5Glu10 3.4 9.80 0.8 2.5 1
Table 6 M5eRDF sample compositions with the amount of eRDF per sample changing at a rate of
0.6 g/L while the other components have a fixed concentration.
Sample No eRDF (g/L) D-glucose (g/L) L-glutamine (g/L) D-galactose (g/L) Yeastolate (g/L)
M5eRDF01 1.0 6.2 0.8 2.5 1
M5eRDF02 1.6 6.2 0.8 2.5 1
M5eRDF03 2.2 6.2 0.8 2.5 1
M5eRDF04 2.8 6.2 0.8 2.5 1
M5eRDF05 3.4 6.2 0.8 2.5 1
M5eRDF06 4.0 6.2 0.8 2.5 1
M5eRDF07 4.6 6.2 0.8 2.5 1
M5eRDF08 5.2 6.2 0.8 2.5 1
M5eRDF09 5.8 6.2 0.8 2.5 1
M5eRDF10 6.4 6.2 0.8 2.5 1
Table 7 Yeastolate sample composition with the amount of yeastolate per sample changing at a
rate of 0.18 g/L while the other components have a fixed concentration.
Sample No eRDF D-glucose L-glutamine D-galactose Yeastolate
(g/L) (g/L) (g/L) (g/L) (g/L)
M5Ye01 3.4 6.2 0.8 2.5 0.10
M5Ye02 3.4 6.2 0.8 2.5 0.28
M5Ye03 3.4 6.2 0.8 2.5 0.46
M5Ye04 3.4 6.2 0.8 2.5 0.64
M5Ye05 3.4 6.2 0.8 2.5 0.82
M5Ye06 3.4 6.2 0.8 2.5 1.00
M5Ye07 3.4 6.2 0.8 2.5 1.18
M5Ye08 3.4 6.2 0.8 2.5 1.36
M5Ye09 3.4 6.2 0.8 2.5 1.54
M5Ye10 3.4 6.2 0.8 2.5 1.72
2.14 Measurement Techniques
The instrumental setup allowed for two different scanning modes: line scanning and
mapping. The line scanning mode was originally used but later the mapping mode
was preferred as it gave a better sample representation. With line scanning or single
point data collection, possible sample inhomogeneity could have led to the collection
of erroneous data. This change from line scanning to mapping coincided
with the sample holder change from aluminium crucibles to a multi-well plate.
For the M1GluR1²³, M3GluR1 and M5GluR1 datasets, the line scanning mode used a
3×10 s exposure time and multiple spectra were collected using a three point line scan
with 0.05 mm spacing. A total of nine spectra were collected per sample and these
were averaged for data analysis. For second and third data collections of M1Glu,
M3Glu and M5Glu, mapping was used. For the mapping of samples a 2×10 s
exposure time was used with a 3×3 grid with 0.05 mm spacing to give multiple
spectra, which were averaged prior to data analysis. All samples were analysed at
room temperature. For data collection of the SERS spectra, single point data
collection with a 2×10 s exposure time was used to give a single spectrum per sample.
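The averaging step described above (nine spectra per sample reduced to one) can be sketched as a point-by-point mean. This is a minimal illustration with made-up intensities, not the acquisition software's own routine; the function name is hypothetical.

```python
def average_spectra(spectra):
    """Point-by-point mean of equal-length spectra (lists of intensities)."""
    if not spectra:
        raise ValueError("at least one spectrum is required")
    n_points = len(spectra[0])
    if any(len(s) != n_points for s in spectra):
        raise ValueError("all spectra must share the same wavenumber axis")
    return [sum(channel) / len(spectra) for channel in zip(*spectra)]


# Toy example: three hypothetical spectra with four channels each.
replicates = [
    [10.0, 20.0, 30.0, 40.0],
    [12.0, 18.0, 33.0, 38.0],
    [11.0, 22.0, 27.0, 42.0],
]
mean_spectrum = average_spectra(replicates)
```

The same function applies unchanged to the nine spectra of a line scan or of a 3×3 map, since only the number of input spectra differs.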
²³ R1 denotes the first collection of these datasets, i.e. Run 1.
Aqua Regia²⁴ the glassware was thoroughly rinsed with Millipore water to remove all
traces of acid [160, 283]. For the colloid preparation, 250 mL of Millipore water,
0.045 g of silver nitrate and a Teflon coated magnetic stirring bar were put into a
round bottom flask. Stirring was performed for the duration of the reaction. A reflux
reaction was set up to prepare the colloid using an oil bath to maintain a constant
temperature and to protect the solution from light (as silver nitrate is light sensitive).
When the solution started to boil, 5 mL of a 1% sodium citrate solution²⁵ was added
drop-wise. The reaction flask and the oil bath were then wrapped with aluminium foil
to maintain a constant temperature and the reaction was left to reflux for one hour. A
colour change from colourless to yellow to green to olive green was observed 10–15 min
after the addition of sodium citrate. The colour change was indicative of
the colloid quality. After 1 hour of refluxing, the colloid was allowed to cool to room
temperature. The absorption spectrum of the colloid was then recorded on a Shimadzu
UV-1601 UV-Visible spectrophotometer to determine the plasmon band maximum
(λmax).
²⁴ Disposal of the Aqua Regia was performed by first diluting the acid to 10% of its original volume.
The acid was then neutralised by the addition of sodium bicarbonate in small quantities until the
effervescence stopped. The solution was tested with pH paper to check for neutral pH before disposal.
²⁵ A 1% solution of sodium citrate was prepared by the addition of 0.05 g to 5 mL of Millipore water.
Figure 38 Raman spectra of a media sample with sharp artefact peaks (red and turquoise peaks)
due to the presence of cosmic rays.
[Plot: Intensity (×10^4); the cosmic ray signal spikes are annotated.]
²⁶ It allows a maximum of four samples to be analysed sequentially.
2.15 Sample Holders
Aluminium crucibles²⁷, used as the sample holder for Raman testing, had a 50 µl
volume and a 2 mm depth. Prior to use the aluminium crucibles were rinsed with
distilled water, followed by three washes with 70% isopropanol and finally were
thoroughly rinsed with Millipore water. The crucibles were then thoroughly dried
with cotton buds wrapped in lens tissue. Sampling was carried out in the LFH, where
40 µl of sample was pipetted into the crucible. The crucible was then placed on the
Raman sample stage for testing.
The testing procedure was altered during the Raman experiments to improve the
sampling process. The first run for each dataset was carried out using aluminium
crucibles and the second and third data collections were carried out using the stainless
steel 96 well plate sample holder. The changes to the measurement setup were noted
as they affected the spectral data. This change also altered the sample volume and the
sampling speed. The reason for the development of the stainless steel plate was to
improve the sampling method capability for high throughput screening, as it allows
for multiple samples to be tested in quick succession. The 96-well stainless steel plate
increased the number of samples for consecutive analysis, leading to a greater number
of samples being tested per day. The maximum sample volume per well was 200 µl
and for the analysis a sample volume of 100 µl was used. [8]
²⁷ Aluminium crucibles were supplied by Thorn Scientific Services Ltd UK.
²⁸ The aluminium crucibles were designed for single use and were subject to damage during cleaning
due to the light structure. Problems occurred with the sides of the crucibles, which caved in if too much
pressure was applied.
For SERS analysis 50 µl of sample solution and 50 µl of silver colloid were pipetted
into a well. The sample was mixed five times before testing by re-suspending the
sample colloid mixture.
Quartz cuvettes29 were used for the fluorescence analysis. Quartz cuvettes were rinsed
with Millipore water, followed by five washes with 70% isopropanol before
thoroughly rinsing with Millipore water. Cuvettes were dried in an oven set at 60 °C
and allowed to cool in the LFH before sampling. The sample solution was added to
the cuvette. The cuvette was stoppered and parafilm was used to secure the stopper in
place. The outside of the cuvette was wiped down with lens tissue before
measurement. At the end of each week of testing, the cuvettes and stoppers were
rinsed with Millipore water and left to soak in 30% nitric acid over the weekend (~2.5
days). This ensured a thoroughly clean quartz surface.
²⁹ Quartz cuvettes were supplied by Lightpath Optical (UK) Ltd.
2.16.3 Water to Analyte Ratio
The water-to-analyte ratio (WAR) was calculated using an in-house Matlab routine to
reflect the amount of water per sample. This code calculated the WAR per spectrum based on
the residuals generated from the difference between the Savitzky–Golay smoothed
version of a spectrum and the test spectrum. Since this function worked on a
spectrum-by-spectrum basis, the WAR of a full dataset was taken as the mean of the
individual WAR values.
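The in-house routine is not reproduced here, so the following is only a sketch of the two generic steps named above: forming the residuals between a spectrum and its Savitzky–Golay smoothed version, and averaging a per-spectrum value over a dataset. The 5-point quadratic filter, the use of the residual RMS as the per-spectrum statistic and all function names are assumptions; how the original code converts residuals into a WAR value is not stated.

```python
import math

# Standard 5-point quadratic Savitzky-Golay smoothing coefficients.
SG_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG_NORM = 35.0


def savgol5(spectrum):
    """Smooth with a 5-point quadratic Savitzky-Golay filter.

    The two points at each end are left unchanged.
    """
    smoothed = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        smoothed[i] = sum(c * x for c, x in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed


def residual_rms(spectrum):
    """Placeholder per-spectrum statistic: RMS of (raw - smoothed)."""
    smoothed = savgol5(spectrum)
    residuals = [raw - fit for raw, fit in zip(spectrum, smoothed)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def dataset_mean(values):
    """Dataset-level value taken as the mean of the per-spectrum values."""
    return sum(values) / len(values)


# Toy data: a perfectly smooth ramp and a spectrum with one sharp spike.
spectra = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    [0.0, 0.0, 0.0, 7.0, 0.0, 0.0, 0.0],
]
scores = [residual_rms(s) for s in spectra]
dataset_score = dataset_mean(scores)
```

A quadratic filter reproduces a straight line exactly, so the ramp has zero residual, while the spike leaves a non-zero residual; only the structure of the calculation, not the statistic itself, should be read as reflecting the text.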
³⁰ Over-fitting occurred when unnecessary latent variables were used to overly explain variance and
noise amongst the spectra. This resulted in a restricted calibration model.
Figure 39 Plot of Variance captured per latent variable versus RMSEC and RMSECV.
[Plot: RMSECV and RMSEC versus Latent Variable Number.]
The relative error of prediction for the calibration models was calculated from
RMSECV and mean value for the analyte concentration (yconc).
Equation 2-21
\[ \mathrm{REP\%} = \frac{\mathrm{RMSECV}}{\bar{y}_{conc}} \times 100 \]
If the ratio of SECV to SEC was above 3, the data was deviating from the
model and being over-fitted. [6]
It is the right balance of these elements that gives models the potential to be useful in
the prediction of other samples.
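The two checks described in this section, the relative error of prediction (Equation 2-21) and the SECV to SEC ratio, can be sketched as follows. The RMSEC and RMSECV inputs are the values reported for the averaged full-range M1GluR2 model in Table 9; the mean concentration of 25.6 g/L is an assumption that is not stated in the text, chosen because it is consistent with the tabulated REP%. The function names are illustrative, and RMSEC/RMSECV are used here in place of SEC/SECV.

```python
def rep_percent(rmsecv, mean_conc):
    """Relative error of prediction: RMSECV over the mean concentration, as a percentage."""
    return rmsecv / mean_conc * 100.0


def ratio_exceeds_threshold(secv, sec, threshold=3.0):
    """True when the SECV/SEC ratio is above the over-fitting threshold."""
    return secv / sec > threshold


ASSUMED_MEAN_CONC = 25.6  # g/L, assumed mean D-glucose concentration (not from the text)

rep = rep_percent(1.14, ASSUMED_MEAN_CONC)        # RMSECV from Table 9
overfitted = ratio_exceeds_threshold(1.14, 0.95)  # RMSECV and RMSEC from Table 9
```

With these inputs the REP is about 4.45% and the ratio (1.2) is well below 3, so the model would not be flagged under the ratio criterion.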
2.16.5 Chemometric Workflow Overview
Spectroscopic methods provide a large amount of data; in order to interpret this data
the chemometric methods covered in this chapter were utilised. Both qualitative and
quantitative assessment of the data was performed.
For Raman, SERS and fluorescence data, the PCA method helped visualise the data
for reproducibility testing and outlier detection. The fluorescence data was further
examined for outliers using ROBPCA, but this method proved to be too sensitive for
the sample number used here.
The fluorescence emission in the EEM data was generated by multiple, different
fluorophores. MCR and PARAFAC were used to identify the fluorophores in the
samples and assess how their emission varied as composition was changed. The MCR
method was better suited than PARAFAC to the EEM data from these complex media
samples because the IFE introduced non-linearities into the data.
3 Development using Raman Spectroscopy for the Analysis of Cell Culture Media Components
This chapter covers the investigative work carried out to demonstrate the feasibility of
using Raman spectroscopy for the qualitative and quantitative analysis of complex
aqueous cell culture media. Raman spectroscopy was used to quantify the main source
of carbon and largest media component, D-glucose, as well as the more complex
media components (eRDF and yeastolate) in a model media system. This is of interest
in bioprocess monitoring because it is important to track the concentration of media
components, as this directly affects the metabolism of cells and influences production
yield.
Figure 40 An overlay of the Raman spectra of aqueous solutions of eRDF (17 g/L), D-glucose (31
g/L), D-galactose (12.5 g/L), L-glutamine (4 g/L) and Yeastolate (5 g/L), the components used to
formulate the cell culture media. The concentrations are those used in the final media
formulation. The spectrum has been enlarged to highlight the weaker peaks.
[Plot: Intensity (×10^4) versus Wavenumber (cm-1); traces labelled (a) D-Glucose, (b) eRDF, (c) Yeastolate, (d) L-Glutamine and (e) D-Galactose.]
Figure 41 shows the peaks contained in a concentrated M1Glu sample (49.6 g/L)
spectrum. The top of the OH stretching band beyond 3000 cm–1 was omitted. The
other OH band present is the strong OH bending band at 1640 cm–1 which obscures
the carbonyl (C=O) group which would be seen at 1620–1680 cm–1. Several peaks
for D-glucose can be assigned. The low wavenumber peaks 426 cm–1 and 514 cm–1
result from skeletal deformation by exo and endocyclic CCO, CCC, COC and OCO
bending modes. The peaks seen at the high wavenumber 2902 cm–1 and 2950 cm–1 are
the asymmetric and symmetric stretching of CH2 and CH3 groups respectively. In the
fingerprint region the strongest peak at 1123 cm–1 can be assigned to a C–C stretching
vibration, along with the peak at 1067 cm–1, which arises from C–C stretching of either the
ring or the molecular backbone. CH bending in the form of CH3, CH2 and CH
deformation gives rise to the 1460 cm–1 peak. The symmetric CH3 deformation (CH
twisting) results in the peak at 1373 cm–1 which is stronger than the other broad OH
band seen at 1364 cm–1. The peaks at 843 cm–1 and 915 cm–1 can be assigned to the
vibrations of the glycosidic bonds and sugar linkages [287-290]. As the datasets get
more complex (M1Glu to M5Glu), the bands appear weaker due to lower overall
concentration. In all cases multivariate analysis is required to extract information from
the Raman spectra with a strong water signal.
Figure 42 Averaged Raw Raman spectra of the (a) M1GluR1, (b) M1GluR2 and (c) M1GluR3
data (Table 2).
[Plot panels (a) to (c): Intensity versus Wavenumber (cm-1).]
The baseline variation resulted from the different depths and surface finishes of the
sample containers. It affected the performance of the data in qualitative and
quantitative analysis until corrective action was taken. In order to ensure consistent
data collection, the collimation of the radiation from the samples required focusing. If
the radiation was diffusely scattered, the level of collimation decreased and this led to
an increase in the stray light, resulting in more scatter in the spectra. A way to prevent
scatter was to have the focal depth centred over the middle of the sample. This would
help to avoid scatter from the container surface, which if present would contribute to
the baseline offset [291].
Figure 43 Baseline Corrected Raman spectra of the (a) M1GluR1, (b) M1GluR2 and (c)
M1GluR3 data.
[Plot panels (a) to (c): Intensity versus Wavenumber (cm-1).]
Figure 44 (a) Raman spectra of the M1GluR1 (49.6 g/L D-glucose) sample, water and the subtracted
spectrum. (b) M1GluR1, (c) M1GluR2 and (d) M1GluR3 are the water eliminated Raman spectra
for the different datasets.
[Plot panels (a) to (d): Intensity versus Wavenumber (cm-1); the legend in panel (a) identifies the M1R1 sample, water and difference traces.]
The M1Glu and water spectra were both baseline corrected prior to water elimination.
After water elimination (Figure 44), the analyte Raman bands were clearly visible and
easier to identify and interpret. This was particularly important when analysing
samples which displayed subtle changes. A drawback of water elimination was the
introduction of artefacts such as negative peaks, baseline shift and enhanced noise.
This was due to the variance amongst the spectra of the samples and water. It was also
evident that each data collection series (R1, R2 and R3) was affected differently by
water elimination. For example, above 3000 cm–1, the noise varied between replicate
runs. The M1GluR1 and M1GluR3 samples displayed similar artefacts. For instance,
at 600–900 cm–1, they both showed downward facing bands, and in the 1500–2000
cm–1 region, where the OH bending band was, the spectra showed increased levels of
noise and baseline offset. For M1GluR2, water elimination introduced an increased
baseline offset across the data.
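The order of operations described above (baseline correction of both spectra, then subtraction of the water spectrum) can be sketched as below. This is an illustration under stated assumptions rather than the procedure actually used: the baseline offset correction is taken to be subtraction of the minimum intensity, the water spectrum is subtracted with a scale factor of 1, and the intensities and function names are invented for the example.

```python
def offset_correct(spectrum):
    """Subtract the minimum intensity so the lowest point sits at zero (assumed method)."""
    floor = min(spectrum)
    return [x - floor for x in spectrum]


def eliminate_water(sample, water, scale=1.0):
    """Baseline-correct both spectra, then subtract the (scaled) water spectrum."""
    if len(sample) != len(water):
        raise ValueError("sample and water spectra must share the same axis")
    sample_bc = offset_correct(sample)
    water_bc = offset_correct(water)
    return [s - scale * w for s, w in zip(sample_bc, water_bc)]


# Toy intensities on a shared four-point axis.
sample = [105.0, 130.0, 160.0, 110.0]
water = [100.0, 120.0, 150.0, 100.0]
difference = eliminate_water(sample, water)
```

Wherever the water reference is stronger than the sample in a given channel the difference becomes negative, which is one way the negative peaks noted above can arise.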
PCA was performed to provide a simple overview of the variance within the data. The
averaged raw M1Glu spectra from the replicate measurements were amalgamated for
comparison in order to assess data reproducibility. Sample grouping and data
collection reproducibility were evaluated (Figure 45). During this study, changes in
the experimental setup (sample container and scanning mode) led to differences in
data collection. The scores plots (Figure 45) highlighted that the second and third
runs, which were collected under the same conditions, overlapped, indicating
reproducible data collection. The first run proved anomalous, however, due to the
different data collection setup used. Sample variance was an issue with these media
samples. It was clear from the scores plots that changing the setup minimised the
measurement variance, as seen by the tighter grouping of the M1R2 and M1R3
samples [51, 66, 294].
Figure 45 Scores plots and loadings (L1, L2 and L3) of PC1, PC2 and PC3 for amalgamated
averaged raw M1Glu data. The black circles represent run1, red triangles represent run2 and
the green asterisks represent run3. The blue circle represents 95 % confidence level of explained
variance.
[Plot panels: scores on PC 1 (98.21%) versus PC 2 (1.74%); scores on PC 1 (98.21%) versus PC 3 (0.04%); scores on PC 2 (1.74%) versus PC 3 (0.04%); loadings L1 (98.21%), L2 (1.74%) and L3 (0.04%) as Intensity versus Wavenumber (cm-1).]
In Figure 45, the scores along PC3 for all the data collection runs showed the
expected near linear variation as a result of the changing analyte concentration.
Indeed the corresponding third loading contained peaks relating to the D-glucose. The
other two loadings represented the water signal and the baseline offset, respectively.
The large variability of run one measurements along PC2 was caused by the use of
aluminium crucibles which resulted in a strong baseline shift. Moreover, the
instrumental effects associated with line scanning were greater than those of the
mapping mode (this was the reason why the measurement protocol was changed from
line scanning to mapping). When baseline offset correction was performed and PCA
was repeated, the number of components was reduced to two with the first
representing the water signal and the second showing the analyte signal. Even after
pre-processing, line scanning and mapping samples did not overlap. This made them
incompatible when combined for modelling (data not shown but it was evident from
the PCA results).
A full data collection of line scanning samples was carried out on M1Glu, M3Glu and
M5Glu, prior to the switch to the new experimental setup. The advantage of line
scanning was the faster collection time. However, from the PCA results, the
M1GluR1 data should be treated separately as:
a) it did not match the R2/R3 data
b) the variance of M1GluR1 samples was much greater as seen by the scores
pattern in Figure 45.
From the MWPLS results, a residual line plot of downward facing bands was
observed (Figure 46). The bands represented areas rich in signal variation while the
flat sections showed the areas of limited information. The five residual lines for the
five principal components were used to determine the error level. MWPLS was
performed on the M1Glu, M3Glu and M5Glu sample sets. The residual lines were
clearer in the M1Glu dataset due to the stronger signal and the simple sample
composition. In the region below 650 cm–1, there were scatter contributions that
appeared as large sloping baseline variances. This region was compromised by
Rayleigh light leakage from the filters, therefore the 250–600 cm–1 region was
omitted. For the M1Glu data, the first informative band selected by MWPLS was at
818–1676 cm–1. This represented several groups from CH3 and CH2 deformations,
C–O and O–O groups, C–N, C–C, and C=C stretching bands etc. The second informative
band at 2774–3159 cm–1 represented C–H and O–H stretching modes. There were
similar informative areas for M3Glu (802–1612 cm–1, 2798–3151 cm–1) and M5Glu
data (826–1596 cm–1, 2814–3167 cm–1). Since the MWPLS results showed similar
downward bands for all three datasets, the regions 800–1680 cm–1 and 2770–3170 cm–1
were chosen for data analysis. In previous research by the NBL laboratory on cell
culture media by Raman spectroscopy [3, 6, 8], spectral regions were selected for
chemometric analysis and the most significant bands were observed in the 707–1853
cm–1 region (Figure 47). The expected bands associated with the media components
were observed in this region and it was also selected for data analysis.
³¹ Areas that contain interference effects (from the diffuse light scatter and noise from the hardware
used) are removed.
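The text does not state how the two common regions were derived from the per-dataset MWPLS bands. One reading that reproduces the quoted limits is to take the envelope of the three bands and round it outward to the nearest 10 cm–1; the sketch below checks that arithmetic. The band limits are those given above, while the rounding rule and all names are assumptions.

```python
import math

# Informative bands (cm-1) reported by MWPLS for each dataset (from the text).
bands = {
    "fingerprint": {"M1Glu": (818, 1676), "M3Glu": (802, 1612), "M5Glu": (826, 1596)},
    "CH/OH stretch": {"M1Glu": (2774, 3159), "M3Glu": (2798, 3151), "M5Glu": (2814, 3167)},
}


def common_region(per_dataset, step=10):
    """Envelope of all bands, rounded outward to a multiple of `step` (assumed rule)."""
    low = min(start for start, _ in per_dataset.values())
    high = max(end for _, end in per_dataset.values())
    return (math.floor(low / step) * step, math.ceil(high / step) * step)


regions = {name: common_region(per_dataset) for name, per_dataset in bands.items()}
```

Under this assumption the envelopes are 802–1676 cm–1 and 2774–3167 cm–1, which round outward to the 800–1680 cm–1 and 2770–3170 cm–1 regions quoted in the text.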
Figure 46 MWPLS Error Plot of Log (SSR) versus Raman Shift (cm–1) of the (A) M1Glu
samples, (B) M3Glu samples and (C) M5Glu samples.
[Plot panels (A) to (C): axes labelled Log [SRR] versus Wavenumber (cm-1).]
In the preliminary data analysis, the 2774–3174 cm–1 spectral range was modelled for
the M1Glu (Table 47) and M3Glu (Table 48) sample sets. The models for the
M1Glu data showed good correlation to D-glucose concentration and performed at a
similar level to the other two reduced region models. For the M3Glu sample
set, however, the models were weaker: the other reduced regions (800–1680 cm–1 and
707–1853 cm–1) of the M3Glu data outperformed the 2774–3174 cm–1 region. The
poor performance came from the weak analyte signal together with the strong water
band bordering this region, which diminished the correlation to D-glucose seen in the
high concentration M1Glu samples. Another cause of the poor performance of the
2774–3174 cm–1 region was the high level of variance caused by instrumental factors
such as lower detector sensitivity above 1000 nm and higher detector noise. Thus,
after preliminary modelling, the 2774–3174 cm–1 region was omitted from further
analysis.
Figure 47 Raman spectra of the 49.6 g/L D-glucose aqueous solution over the full spectral range
(250–3311 cm–1); inset is the selected region of interest, 707–1853 cm–1.
component analysis. More models were constructed following pre-processing to
further improve the calibration results.
Table 8 Spectral areas selected for model generation using the Raman Data.
Region ID Wavenumber Region (cm–1)
Full 250–3300 cm–1
Reduced Region (ROI) 707–1853 cm–1
MWPLS Region 800–1680 cm–1
All preliminary models built with the averaged raw data were able to correlate the
Raman spectra to the D-glucose concentration (Figure 48). Comparison of M1GluR1
and M1GluR2 revealed how the changes in sampling affected the reproducibility of
the data. The new experimental setup improved the RMSECV for the M1GluR2 data.
There was an improvement of 37% for averaged data, 47% for the baseline corrected
and 27% for water eliminated data over the M1GluR1 data. In the raw data calibration
model of M1GluR3 (Figure 48c), a weaker performance was seen compared to the
M1GluR1 and M1GluR2 models. When the raw data was investigated, the first 18
samples collected were separated by a spectral offset from the remaining 14 samples
that were collected later that day. The samples were tested in a random order and
variation was therefore seen randomly throughout the dataset. This variation was most
likely due to minor changes during sampling, such as the temperature, as these
samples were kept at room temperature for longer before measurement. This type of
change was only observed in this dataset, but it can have a major impact on data
quality and therefore model performance. In this case the spectral offset variance was
removed by pre-processing.
Figure 48 Predicted versus Measured D-glucose concentration plots for the calibration models for
the averaged data using full range for (a) M1GluR1, (b) M1GluR2 and (c) M1GluR3 sample sets.
[Plot panels (a) to (c): Predicted D-Glu Conc [g/L] versus Measured D-Glu Conc [g/L]. Panel annotations: (a) R2 = 0.986, 3 latent variables, RMSEC = 1.7048; (b) R2 = 0.996, 3 latent variables, RMSEC = 0.95139, RMSECV = 1.142; (c) R2 = 0.979, 3 latent variables, RMSEC = 2.0663, RMSECV = 2.3858.]
For the M1GluR2 sample set, the preliminary models are shown in Table 9. For the
averaged data, the performance of all three regions was similar but the reduced range
used a smaller number of variables compared to the full dataset. The calibration
models improved with baseline correction and marginally worsened with water
elimination due to artefacts introduced by the signal elimination process. By using
reduced regions with water eliminated data, some of the artefacts were avoided and
model performance improved. Water eliminated data produced reasonable models in
the preliminary study, but the process also introduced artefacts into the spectra.
Further analysis was required to determine the severity of these artefacts.
The PLS loadings from the calibration model generated using baseline corrected
M1GluR2 data are shown in Figure 49. The first loading resembled the pure water
signal and represented 99.88% of the explained variance in the data. The second and
third loadings showed the baseline offset seen in the data. They both represented the
D-glucose signal. Together they accounted for 0.11% of the explained variance but
differed at 1450 cm–1 and 1367 cm–1; these bands originated from the asymmetric and
symmetric stretching of CH3 respectively. The offset intensity in the second and third
loadings suggested a high water signal and a low water signal as a result of
hydrophobic interactions. The second loading reflected the high concentration
samples while the third loading represented the low concentration samples and
background noise in an aqueous environment [296, 297].
Figure 49 Loadings plot from the calibration model for baseline corrected M1GluR2 data over
the full range.
[Plot: Loadings L1 (99.88%), L2 (0.10%) and L3 (0.01%).]
Table 9 Models generated from M1GluR2 data after preliminary pre-processing.
M1GluR2                       LV   R2      RMSEC (g/L)   RMSECV (g/L)   REP%
Avg Full (250–3311 cm–1)      3    0.996   0.95          1.14           4.45
Avg ROI (707–1853 cm–1)       2    0.994   1.13          1.23           4.80
Avg MWPLS (800–1680 cm–1)     2    0.995   1.02          1.11           4.33
BC Full (250–3311 cm–1)       3    0.996   0.89          1.02           3.98
BC ROI (707–1853 cm–1)        3    0.997   0.75          0.86           3.35
BC MWPLS (800–1680 cm–1)      3    0.998   0.62          0.72           2.81
WE Full (250–3311 cm–1)       2    0.996   0.97          1.09           4.25
WE ROI (707–1853 cm–1)        3    0.998   0.62          0.74           2.89
WE MWPLS (800–1680 cm–1)      3    0.998   0.64          0.76           2.96
Figure 50 Overlay of the first derivative pre-processing of M1Glu samples with end of range
artefacts. The blue traces used post area selection and the black traces used pre-area selection.
[Plot: Intensity versus Wavenumber (cm-1), 800–1800 cm-1; the end of range artefacts are annotated.]
Figure 51 Effects of MSC on spectra from M1GluR2 data in the ROI range (707–1853 cm–1),
(a) Averaged Raw spectra, (b) after MSC.
[Plot panels (a) and (b): Intensity versus Wavenumber (cm-1).]
The Raman spectra of the M1Glu before and after MSC pre-processing are displayed
in Figure 51. The additive/multiplicative effects observed in the raw data were
reduced by MSC and the peaks were clarified. Table 10 outlines the calibration
performance of M1GluR2 data after MSC for the D-glucose modelling. Similar results
were obtained for the models using data before and after baseline correction. The best
MSC model was built on averaged data using the 707–1853 cm–1 region. It used two
LVs and gave an accuracy of roughly 1.53% REP. This model demonstrated an
improved correlation for D-glucose concentration compared to the preliminary pre-
processing models in Table 9.
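The additive and multiplicative effects that MSC removes can be illustrated with a minimal sketch of the usual form of the correction: each spectrum is regressed against a reference spectrum and then corrected with the fitted offset and slope. The use of the mean spectrum as the default reference, the toy data and the function names are assumptions; the settings of the software actually used are not given in the text.

```python
def mean_spectrum(spectra):
    """Point-by-point mean of equal-length spectra."""
    return [sum(channel) / len(spectra) for channel in zip(*spectra)]


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum x is fitted as x ~ offset + slope * reference by least
    squares and then corrected to (x - offset) / slope.
    """
    ref = list(reference) if reference is not None else mean_spectrum(spectra)
    ref_mean = sum(ref) / len(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for x in spectra:
        x_mean = sum(x) / len(x)
        slope = sum((r - ref_mean) * (v - x_mean) for r, v in zip(ref, x)) / ref_ss
        offset = x_mean - slope * ref_mean
        corrected.append([(v - offset) / slope for v in x])
    return corrected


# Toy data: two distorted copies of a known reference profile.
reference = [1.0, 2.0, 4.0, 3.0]
scattered = [
    [7.0, 9.0, 13.0, 11.0],  # 5 + 2.0 * reference
    [-0.5, 0.0, 1.0, 0.5],   # -1 + 0.5 * reference
]
corrected = msc(scattered, reference=reference)
```

In this toy case both distorted spectra collapse back onto the reference profile, which is the behaviour expected when the only differences between spectra are an offset and a scale factor.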
The MSC water eliminated (WE) models were generally poor as seen in Figure 52a.
There were two distinct regions in the predicted versus expected plot for the high and
low concentration samples. This indicated a big difference in how the glucose
interacted with water at high and low concentrations. In the Hotelling’s T2 vs Q
residuals plot (Figure 52c), there were several samples outside the 95% confidence
limit. Low concentration water eliminated samples were excessively modified by
MSC and were cast as outliers. This behaviour suggests the involvement of
hydrophobic interactions due to grouping of the high and low water samples [296,
297]. The PC1 vs PC2 scores plot gave a linear response with decreasing water
content, confirming the different hydrophobic behaviours seen in the samples (Figure
52d). Another explanation was that as glucose concentration increased the density and
the refractive index of the sample changed. This could affect the Raman spectra, as
the low concentration samples behaved differently to the high concentration samples
after water elimination. A conclusion was drawn that accurate linear glucose models
can only be constructed over a small concentration range.
Further analysis of the MSC WE PLS loadings (Figure 52b) showed that the major
signal was from D-glucose, with the first loading containing peaks attributed to D-
glucose and describing 85.5% of the explained variance. With MSC, the signal was
corrected to have a reduced level of scatter. Therefore the MSC WE data contained
the enhanced analyte peaks as well as the water removal artefacts. This inadvertently
introduced more noise into the data, with the second and third loadings accounting for
almost 10% of the signal variance and mainly describing the water artefacts. The
second loading (8.97%) represented the water removal artefacts present in the 1200–
2000 cm–1 region and the region beyond 3000 cm–1. The third loading described
0.95% of variance and its main contribution was the noise present beyond 3000 cm–1.
Figure 52 (a) Relationship between expected and predicted D-glucose content for WE MSC
pre-treated M1Glu sample set, (b) the loadings plots of the three PCs, (c) Hotelling’s T2 vs Q
residuals for water eliminated M1GluR2 data after MSC, and (d) the PC1 vs PC2 scores plot
with the high concentration samples in black and the low concentration samples in red.
[Plot panels: (a) Predicted (g/L) versus Expected (g/L), annotated R^2 = 0.619, 3 latent variables, RMSEC = 8.8302, RMSECV = 11.982; (b) Intensity versus Wavenumber (cm-1) for loadings L1 (85.5%), L2 (8.97%) and L3 (0.95%); (c) Hotelling T^2 (98.54%) versus Q Residuals (1.46%); (d) Scores on PC 2 (8.91%) versus Scores on PC 1 (85.65%).]
Table 10 Calibration Evaluation for Multiplicative Scatter Correction on the M1GluR2 data.³²
M1GluR2          LV   R2      RMSEC (g/L)   RMSECV (g/L)   REP%
Avg Full         2    0.998   0.56          0.60           2.34
Avg ROI          2    0.997   0.71          0.75           2.92
Avg ROI (1)      3    0.999   0.35          0.39           1.52
Avg MWPLS        2    0.998   0.68          0.71           2.77
Avg MWPLS (1)    2    0.997   0.77          0.82           3.20
BC Full          2    0.998   0.55          0.59           2.30
BC ROI           2    0.998   0.55          0.60           2.34
BC ROI (1)       2    0.998   0.55          0.60           2.34
BC MWPLS         2    0.998   0.60          0.65           2.53
BC MWPLS (1)     2    0.998   0.60          0.65           2.53
WE Full          3    0.619   8.83          11.98          46.79
WE ROI           2    0.245   12.56         14.31          55.89
WE ROI (1)       3    0.494   10.17         12.57          49.10
WE MWPLS         2    0.244   12.55         14.26          55.70
WE MWPLS (1)     3    0.454   10.57         12.30          48.04
³² ROI (1) and MWPLS (1) signify that pre-processing was carried out on the reduced area of the
spectra. No bracket represents area selection on data pre-processed on the full range. For ease of
interpretation of the table, the best models will be highlighted in grey.
3.5.3 Normalisation of M1GluR2 Data
Normalisation corrects for variation in signal intensity due to experimental setup.
Four different modes of normalisation were assessed here for enhancement of
calibration performance over the full and reduced ranges. Figure 53 shows the
spectral profiles of the normalised data using the different methods (Norm1, Norm2,
NormINF and NormOH; see section 2.6.4).
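For orientation, three of the four normalisation modes can be sketched as below, assuming that Norm1, Norm2 and NormINF follow the usual vector-norm convention (division by the sum of absolute intensities, by the Euclidean length and by the maximum absolute intensity, respectively). The definitions in section 2.6.4 take precedence over this sketch; NormOH is not implemented because its definition is not given in this section, and the function names are illustrative.

```python
import math


def norm1(spectrum):
    """Divide by the sum of absolute intensities (1-norm, assumed definition)."""
    total = sum(abs(x) for x in spectrum)
    return [x / total for x in spectrum]


def norm2(spectrum):
    """Divide by the Euclidean length of the spectrum (2-norm, assumed definition)."""
    length = math.sqrt(sum(x * x for x in spectrum))
    return [x / length for x in spectrum]


def norm_inf(spectrum):
    """Divide by the maximum absolute intensity (infinity-norm, assumed definition)."""
    peak = max(abs(x) for x in spectrum)
    return [x / peak for x in spectrum]


# Toy two-channel spectrum.
toy = [3.0, 4.0]
n1, n2, ninf = norm1(toy), norm2(toy), norm_inf(toy)
```

Under these definitions a NormINF spectrum has a maximum of exactly 1, and a Norm1 spectrum with many channels takes small values because the intensities sum to 1.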
Figure 53 Effects of normalisation on M1GluR2 data (a) after NORM1 pre-processing, (b) after
NORM2 pre-processing, (c) after NORMINF pre-processing, and (d) after NORMOH
pre-processing.
[Plot panels (a) to (d): Intensity versus Wavenumber (cm-1).]
Normalisation improved the models built using the averaged raw data (Table 9). The
calibration results (Table 11 to Table 14) showed that the best model was obtained using
Norm2, which gave a correlation coefficient of 0.999. Model performance varied only
slightly between the normalisation methods, with Norm1 being the weakest. Figure 54a
shows the PLS loadings for the raw (averaged) M1GluR2 data after Norm2 pre-processing;
similar loadings resulted from the other normalisation methods. The first loading
revealed, as expected, that the water signal was the dominant feature in the data, while
the second loading was the D-glucose signal and described less than 0.1% of the total
variance.
Figure 54 (a) Loadings plots of the two principal components (L1: 99.89%, L2: 0.09%) from raw
(averaged) M1GluR2 with Norm2 pre-processing and (b) the water eliminated M1GluR2 spectra with
Norm2 pre-processing.
Figure 54b showed the water eliminated spectra with Norm2 pre-processing. The
drawback of normalising water eliminated data was the increased noise artefacts and
baseline offset. These had adverse implications for the calibration modelling. The
normalisation methods used were not able to handle the water eliminated data (Figure
55a). After water elimination, three components described the data, see Figure 55b.
The first and second loadings represented the D-glucose signal. The difference in
these loadings showed the different levels of interaction between the water and the D-
glucose at high (12.8–49.6 g/L) and low (1.6–19.2 g/L) concentrations. The
normalised and MSC data behaved in a similar fashion. All the WE spectra suffered
from an offset due to the changing concentrations and differing optical properties of
the sample (Figure 54b). The third loading represented the signal from the low
concentration samples and noise. Because the water signal was so strong in the low
concentration samples, the water elimination had a bigger impact and the resulting
spectra had more noise than detail compared to the high concentration samples. This
was seen in the data where the first four samples were noisy and the D-glucose signal
came through in the fifth sample (Figure 56a). In the Hotelling’s T2 vs Q residuals
plot (Figure 55c), the low concentration samples showed more unexplained variance
than the high concentration samples. The variance amongst the samples was also seen
in the scores plot (Figure 55d), where the high concentration (black) samples were
clustered together while the low concentration (red) samples were scattered due to their
increased variability. It may then be concluded that normalisation was adequate when the
peak positions remained relatively constant across the samples. However, when specific
changes affected the spectra, such as the noise-laden spectra of the low concentration
samples, normalisation failed.
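The Hotelling's T2 and Q residual diagnostics used above can be sketched as follows. This is an illustrative sketch only: it assumes a mean-centred PCA model computed by SVD, whereas the thesis software and its centring options are not stated in this section. The function name `t2_and_q` and the synthetic data are hypothetical.

```python
import numpy as np

def t2_and_q(X, n_components):
    """Hotelling's T^2 and Q residuals of each row of X under a PCA model."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                        # mean-centre (an assumption)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                        # loadings, variables x components
    T = Xc @ P                                     # scores
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)
    t2 = ((T ** 2) / score_var).sum(axis=1)        # distance inside the model, variance-scaled
    residual = Xc - T @ P.T
    q = (residual ** 2).sum(axis=1)                # squared distance from the model
    return t2, q

# Ten synthetic samples following one spectral profile; sample 0 is given an
# extra deviation that a one-component model cannot describe.
profile = np.array([0.0, 1.0, 3.0, 1.0, 0.0])
conc = np.linspace(1.0, 10.0, 10)
X = np.outer(conc, profile)
X[0] += np.array([0.5, -0.5, 0.0, 0.5, -0.5])
t2, q = t2_and_q(X, n_components=1)
worst = int(np.argmax(q))                          # index of the sample with most unexplained variance
```

Samples with a large Q value carry variance the model does not explain, which is how the low concentration samples stand out in Figure 55c.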
Figure 55 (a) Relationship between expected and predicted D-glucose content for WE Norm2 pre-
treated M1Glu sample set (3 latent variables, R2 = 0.702, RMSEC = 7.8161, RMSECV = 9.7818),
(b) the loadings spectrum of the three PCs (L1: 90.99%, L2: 5.82%, L3: 0.98%), (c) Hotelling’s T2
vs Q residuals for H2O Eliminated M1GluR2 data after Norm2, (d) the scores plot for PC1 vs PC2,
with the high concentration samples in black and the low concentration samples in red.
Figure 56 Water Eliminated M1GluR2 spectra with Norm2 pre-processing (a) first five samples
(M1S01 to M1S05) and (b) a selection of samples from M1GluS01 to M1GluS31 (plotted: M1S01,
M1S05, M1S10, M1S15, M1S20 and M1S30).
Table 11 Calibration Evaluation for Normalisation on the M1GluR2 data using Norm1 (see footnote 32).
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
Avg Full Norm1 2 0.995 0.99 1.09 4.25
Avg ROI Norm1 2 0.998 0.60 0.63 2.46
Avg ROI Norm1 (1) 2 0.996 0.85 0.91 3.55
Avg MWPLS Norm1 2 0.997 0.74 0.78 3.04
Avg MWPLS Norm1 (1) 2 0.994 1.12 1.20 4.68
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
BC Full Norm1 3 0.997 0.78 0.90 3.51
BC ROI Norm1 3 0.997 0.84 0.96 3.75
BC ROI Norm1 (1) 3 0.934 3.69 4.15 16.21
BC MWPLS Norm1 3 0.997 0.80 0.91 3.55
BC MWPLS Norm1 (1) 3 0.925 3.93 4.41 17.22
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
WE Full Norm1 3 0.692 7.94 9.87 38.55
WE ROI Norm1 4 0.726 7.49 9.64 37.65
WE ROI Norm1 (1) 4 0.664 8.29 9.51 37.14
WE MWPLS Norm1 5 0.826 5.97 10.26 40.07
WE MWPLS Norm1 (1) 4 0.684 8.04 10.43 40.74
Table 12 Calibration Evaluation for Normalisation on the M1GluR2 data using NormOH (see footnote 32).
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
Avg Full Norm OH 2 0.996 0.84 0.93 3.63
Avg ROI Norm OH 2 0.998 0.62 0.65 2.53
Avg ROI NormOH (1) 2 0.998 0.62 0.65 2.53
Avg MWPLS NormOH 3 0.999 0.44 0.49 1.91
Avg MWPLS Norm OH(1) 3 0.999 0.44 0.49 1.91
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
BC Full Norm OH 3 0.994 1.06 1.22 4.76
BC ROI Norm OH 3 0.995 0.98 1.13 4.41
BC ROI Norm OH (1) 3 0.995 0.98 1.13 4.41
BC MWPLS Norm OH 3 0.995 0.97 1.12 4.37
BC MWPLS Norm OH(1) 3 0.995 0.97 1.12 4.37
Table 13 Calibration Evaluation for Normalisation on the M1GluR2 data using Norm2 (see footnote 32).
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
Avg Full Norm2 2 0.995 1.02 1.12 4.37
Avg ROI Norm2 2 0.999 0.43 0.45 1.75
Avg ROI Norm2 (1) 2 0.996 0.85 0.91 3.55
Avg MWPLS Norm2 2 0.999 0.53 0.56 2.18
Avg MWPLS Norm2 (1) 2 0.994 1.12 1.21 4.72
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
BC Full Norm2 3 0.999 0.42 0.49 1.91
BC ROI Norm2 3 0.999 0.45 0.50 1.95
BC ROI Norm2 (1) 3 0.966 2.65 3.00 11.71
BC MWPLS Norm2 3 0.999 0.37 0.41 1.60
BC MWPLS Norm2 (1) 3 0.965 2.69 3.05 11.91
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
WE Full Norm2 4 0.822 5.45 7.22 28.20
WE ROI Norm2 3 0.664 7.49 9.28 36.25
WE ROI Norm2(1) 5 0.822 5.44 8.57 33.47
WE MWPLS Norm2 5 0.842 5.14 7.41 28.94
WE MWPLS Norm2(1) 5 0.834 5.25 8.02 31.32
Table 14 Calibration Evaluation for Normalisation on the M1GluR2 data using Norm INF (see footnote 32).
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
Avg Full Norm INF 2 0.995 1.03 1.13 4.41
Avg ROI Norm INF 2 0.997 0.71 0.75 2.92
Avg ROI Norm INF (1) 2 0.996 0.89 0.96 3.75
Avg MWPLS Norm INF 2 0.999 0.53 0.56 2.18
Avg MWPLS Norm INF(1) 2 0.995 1.05 1.13 4.41
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
BC Full Norm INF 3 0.999 0.41 0.47 1.83
BC ROI Norm INF 3 0.999 0.47 0.52 2.03
BC ROI Norm INF (1) 3 0.995 0.98 1.12 4.37
BC MWPLS Norm INF 3 0.999 0.36 0.40 1.56
BC MWPLS Norm INF(1) 3 0.995 0.97 1.12 4.37
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
WE Full Norm INF 4 0.781 6.05 9.32 36.40
WE ROI Norm INF 8 0.860 4.83 8.54 33.35
WE ROI Norm INF(1) 6 0.896 4.16 8.47 33.08
WE MWPLS Norm INF 5 0.819 5.50 9.42 36.79
WE MWPLS Norm INF(1) 5 0.797 5.82 9.07 35.42
3.5.4 Derivative Pre-Processing of M1GluR2 Data
The first order derivative smooths and resolves peaks in complex spectral profiles. It
also diminishes the spectral effects of baseline offset and slope. Here, derivative
pre-processing of the data was performed using the Savitzky-Golay algorithm. The settings
chosen were a first order derivative with a filter width of eleven points and a polynomial
order of three. In Figure 57 the averaged Raman spectra of the M1Glu sample and the
spectral profiles after first order derivative pre-processing of the different regions
(full, ROI and MWPLS) are compared. The first order derivative spectra of M1GluR2
contained a large peak above 2900 cm–1, with multiple smaller positive and negative peaks
in the fingerprint region (400–1800 cm–1), indicative of the D-glucose signal.
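To illustrate the derivative settings quoted above (filter width of eleven, polynomial order of three), the following is a minimal NumPy sketch of a Savitzky-Golay first derivative. It is not the thesis implementation: the function name `savgol_first_derivative` is hypothetical, the test signal is synthetic, and the first and last five points are simply left undefined rather than handled with any end-of-range scheme.

```python
import numpy as np

def savgol_first_derivative(y, window=11, polyorder=3, spacing=1.0):
    """First derivative of y from a local least-squares polynomial fit."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Design matrix with columns offsets**0, offsets**1, ..., offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 1 of the pseudo-inverse returns the fitted linear coefficient,
    # i.e. the slope of the local polynomial at the window centre.
    weights = np.linalg.pinv(design)[1]
    y = np.asarray(y, dtype=float)
    out = np.full(y.shape, np.nan)              # edges are left undefined in this sketch
    for i in range(half, len(y) - half):
        out[i] = weights @ y[i - half:i + half + 1] / spacing
    return out

# Synthetic check: a cubic is fitted exactly by a third-order polynomial,
# so its derivative should be recovered away from the edges.
x = np.arange(50, dtype=float)
cubic = 0.01 * x**3 - 0.5 * x**2 + 2.0 * x + 7.0
deriv = savgol_first_derivative(cubic)
expected = 0.03 * x**2 - 1.0 * x + 2.0
```

Because a constant offset contributes nothing to the fitted slope, the same filter removes baseline offset, which is the behaviour relied on in this section.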
Figure 57 Effects of first order derivative pre-processing on M1GluR2 data, (a) raw spectra, (b)
after processing in the full range, (c) in the ROI range and (d) in the MW range.
Thus far the WE data only worked without further pre-processing. MSC and
normalisation increased the artefacts produced by the water elimination, dividing the
dataset into two populations of samples (high and low concentration). However, the
first order derivative handled the water eliminated spectra, as the baseline offset was
corrected and the scatter was reduced. When comparing the first derivative of the
averaged data (Figure 57b) to the water eliminated data (Figure 58), the OH bending
band (1640 cm–1) was clearly removed. In addition, the OH stretch above 3000 cm–1
left some noise, but region selection avoided interference from that spectral region.
The first order derivative data generated reasonable PLS models; however, the models
generated using MSC and Norm2 pre-processing were better. Table 15 shows
consistent values for the PLS models; models for the region selection were virtually
the same and showed a slight improvement on the full range models (see footnote 33).
Table 15 Calibration Evaluation for first order derivative (FD) on the M1GluR2 data (see footnote 32).
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
Avg FD13 Full 2 0.994 1.06 1.17 4.57
Avg FD13 ROI 3 0.997 0.84 0.98 3.82
Avg FD13 ROI (1) 3 0.997 0.84 0.99 3.86
Avg FD13 MWPLS 3 0.996 0.84 0.99 3.86
Avg FD13 MWPLS (1) 3 0.996 0.85 0.99 3.86
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
BC FD13 Full 2 0.994 1.06 1.16 4.53
BC FD13 ROI 3 0.997 0.84 1.00 3.90
BC FD13 ROI (1) 3 0.997 0.84 0.99 3.86
BC FD13 MWPLS 3 0.996 0.85 1.00 3.90
BC FD13 MWPLS(1) 3 0.996 0.85 1.00 3.90
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
WE FD13 Full 5 0.999 0.54 0.92 3.59
WE FD13 ROI 3 0.998 0.62 0.97 3.78
WE FD13 ROI (1) 3 0.998 0.63 0.99 3.86
WE FD13 MWPLS 4 0.999 0.51 0.99 3.86
WE FD13 MWPLS (1) 3 0.998 0.61 0.98 3.82
Figure 58 First order derivative pre-processing on the water eliminated M1GluR2 Raman
spectra.
33 The settings used in Figure 50 are not the same as those for FD modelling, and the end of range
effects are less severe with the FD setting of a filter width of eleven and a polynomial order of three.
3.5.5 MSC-FD and FD-MSC Pre-Processing of M1GluR2 Data
The next logical step was to consider combining multiple pre-processing methods and
how this might improve model quality in terms of RMSEC/RMSECV. The choice of
combination was based on the performance of the individual method models. The
combination of MSC and FD was investigated because their individual models left room
for improvement (see footnote 34) and because the methods complement each other. MSC
is a signal and scatter correction method and FD is a signal correction method. Together
they can remove baseline shift and additive effects. Given the large impact that the
baseline had on the data, first derivative pre-processing alone was enough to eliminate
it. However, a drawback of the first derivative was that the artefacts generated by
smoothing and filtering increased the noise. To counter this, MSC was applied after the
derivative pre-processing to increase the signal to noise ratio.
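The two pre-processing sequences can be sketched as follows. This is a simplified illustration under stated assumptions: MSC is implemented as a least-squares fit of each spectrum against the mean spectrum, and a plain adjacent-point difference stands in for the Savitzky-Golay derivative. The function names `msc`, `first_difference`, `fd_msc` and `msc_fd` and the example data are hypothetical.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, spectrum in enumerate(spectra):
        # Fit spectrum ~ offset + slope * reference, then undo both terms.
        slope, offset = np.polyfit(reference, spectrum, deg=1)
        corrected[i] = (spectrum - offset) / slope
    return corrected

def first_difference(spectra):
    """Adjacent-point difference, a crude stand-in for the derivative step."""
    return np.diff(np.asarray(spectra, dtype=float), axis=1)

def fd_msc(spectra):
    return msc(first_difference(spectra))      # derivative first, then MSC

def msc_fd(spectra):
    return first_difference(msc(spectra))      # MSC first, then derivative

# Three copies of one hypothetical spectrum with different gain and offset.
base = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
spectra = np.vstack([base, 2.0 * base + 10.0, 0.5 * base - 3.0])
after_msc = msc(spectra)
after_fd_msc = fd_msc(spectra)
after_msc_fd = msc_fd(spectra)
```

In this idealised case both sequences collapse the three spectra onto a single curve. With real spectra the order matters, because the reference spectrum and the fitted slopes are computed on different data in each sequence.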
Figure 59 Effects of FD-MSC (left) and MSC-FD (right) pre-processing on the M1GluR2 data in
the 707–1853 cm–1 region, where red is for pre-processing after area selection and blue represents
pre-processing before area selection.
Both sequences (MSC-FD and FD-MSC) of pre-processing were investigated (see
Table 16 and Table 17). FD-MSC outperformed MSC-FD leading to the best model.
Therefore for all remaining data analysis, FD-MSC was used. The different
arrangements generated different results for region selection. The FD-MSC led to a
poor performance when performed after area selection while MSC-FD was less
affected by area selection. In the FD-MSC spectra (Figure 59), the OH bending band
(1640 cm–1) was the main difference between the pre-processing before and after area
selection. With pre-processing after area selection, more variance caused by this band
resulted in the weaker calibration models, due to the strong influence that water had
on the data.
34 The best model thus far was from Norm2 averaged data in the region 707–1853 cm–1.
The WE process proved to be problematic with FD-MSC and MSC-FD (Table 16 and
Table 17). As in the case of the MSC and normalisation methods, after WE, the signal
was inundated with too much noise to be useful, since the low concentration samples
were largely composed of water. The low concentration samples were altered making
them different from high concentration samples (Figure 60 a and c). From the spectra
for M1GluS01 and M1GluS31 and the loadings (Figure 60 b and d), it was clear that
the low concentration sample signal was laden with noise while the higher
concentration sample reflected the D-glucose signal matching the first loadings signal.
The first loading explained 66.95% of the variance; this was much lower
than for good correlation models (e.g. Figure 54a, with 99.89%). Overall the
noise artefacts generated in the low concentration samples prevented a correlation
between the Raman signal and D-glucose concentration.
Figure 60 (a) Hotelling’s T2 vs Q residuals, (b) M1GluS01 and M1GluS31 spectra for WE
M1GluR2 data after FD-MSC, (c) the scores plot for PC1 vs PC3, with the high concentration
samples marked black and the low concentration samples marked red, and (d) the loadings
spectra for the four PCs (L1: 66.95%, L2: 17.15%, L3: 1.71%, L4: 7.84%).
Table 16 Calibration Evaluation for MSC-FD on the M1GluR2 data (see footnote 32).
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
Avg MSCFD Full 2 0.999 0.48 0.53 2.07
Avg MSCFD ROI 2 0.999 0.38 0.40 1.56
Avg MSCFD ROI (1) 2 0.999 0.39 0.42 1.64
Avg MSCFD MWPLS 2 0.999 0.38 0.40 1.56
Avg MSCFD MWPLS (1) 3 0.998 0.59 0.69 2.69
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
BC MSCFD Full 2 0.999 0.44 0.49 1.91
BC MSCFD ROI 2 0.999 0.35 0.38 1.48
BC MSCFD ROI (1) 2 0.998 0.68 0.72 2.81
BC MSCFD MWPLS 2 0.999 0.35 0.38 1.48
BC MSCFD MWPLS(1) 2 0.997 0.76 0.80 3.12
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
WE MSCFD Full 5 0.773 6.81 12.98 50.70
WE MSCFD ROI 3 0.647 8.50 11.28 44.06
WE MSCFD ROI (1) 3 0.518 9.85 11.71 45.74
WE MSCFD MWPLS 2 0.404 11.04 11.74 45.85
WE MSCFD MWPLS (1) 2 0.353 11.53 12.56 49.06
3.6 Outcomes from M1Glu Data Analysis
The glucose in water model was used to evaluate which data collection, data
pre-processing and PLS modelling conditions were most important for accurate
prediction of concentration using Raman spectroscopy. This investigative study
showed that it was possible to precisely quantify D-glucose in water at concentrations
typical of those used in cell culture (REP < 1.5%). The two main issues to overcome in
order to achieve this result were the strong water band and the baseline offset in the
Raman signal.
To deal with the strong water signal, WE was implemented and worked up to a point.
However, the WE method also introduced artefacts which highlighted differences
between high and low concentration samples. These artefacts were eliminated with a
simple first derivative, but not with MSC or normalisation which saw two linear
ranges emerge for high and low concentration samples: the first range of samples (1 to
12) covered 1.6 g/L to 19.0 g/L and the second range started from sample 8, covering
12 g/L to 50 g/L.
The baseline offset had a negative impact on calibration modelling. Through changes
to the experimental setup and pre-processing, a reduction of spectral variance and an
improved D-glucose signal were observed. The best pre-processing was FD-MSC.
The FD pre-processing removed baseline offset and also smoothed and resolved
peaks. MSC was then applied to correct the remaining offset, scatter and baseline
shift, as well as derivative artefacts.
The best performing models are shown in Table 18. It was clear that for each model
the performance was improved by region selection, which removed the influence of
the large OH band. Overall, the best calibration model was built on the averaged data,
with further pre-processing by FD and MSC, before finally being reduced to the best
performing region of 800–1680 cm–1. This model was referred to as M1Glu AVG FD-
MSC MW.
Table 18 The optimal M1GluR2 models generated after the different pre-processing methods.
M1GluR2 LV R2 RMSEC (g/L) RMSECV (g/L) REP%
BC MW (800–1680 cm–1) 3 0.998 0.62 0.72 2.81
AVG ROI (707–1853 cm–1), MSC pre-processing 3 0.999 0.35 0.39 1.52
AVG ROI (707–1853 cm–1), Norm2 pre-processing 2 0.999 0.43 0.45 1.75
AVG ROI (707–1853 cm–1), FD pre-processing 3 0.997 0.84 0.98 3.82
AVG MW (800–1680 cm–1), FD-MSC pre-processing 2 0.999 0.32 0.34 1.32
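The REP% column can be related to RMSECV with a short sketch. It assumes that REP is defined as 100 × RMSECV divided by the mean reference concentration; the formal definition belongs to the figures of merit section and is not reproduced here. The tabulated M1Glu values are consistent with this reading for a mean concentration close to 25.6 g/L. The function names and the three example concentrations are hypothetical.

```python
import math

def rmse(expected, predicted):
    """Root mean square error between reference and predicted concentrations."""
    squared = [(e - p) ** 2 for e, p in zip(expected, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def rep_percent(rmsecv, reference_values):
    """Relative error of prediction (assumed form): 100 * RMSECV / mean reference value."""
    mean_reference = sum(reference_values) / len(reference_values)
    return 100.0 * rmsecv / mean_reference

# Hypothetical reference concentrations (g/L) chosen so that their mean is 25.6 g/L.
reference = [1.6, 25.6, 49.6]
rep = rep_percent(0.39, reference)   # RMSECV of the MSC ROI model in Table 18
```

With these inputs the result is about 1.52%, matching the REP listed for that model.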
35 Table 3 Composition of the M3Glu
Figure 61 Overlay of the Raman spectra of solid D-glucose, L-glutamine and D-galactose (λex 785
nm). These spectra were collected as a single scan (10 second exposure) from 250–3311 cm–1 with 8
cm–1 resolution.
Figure 62 (a) Averaged Raman spectra of the M1GluR2 data, (b) averaged Raman spectra of the
M3GluR2 data, (c) baseline corrected Raman spectra of the M3GluR2 data, and (d) water
eliminated Raman spectra of the M3GluR2 data.
Similar to the M1Glu data, a sloping baseline, baseline offset and large water signal
were characteristic of the M3Glu data. The baseline offset effects were increased by
the low analyte concentrations of the M3Glu samples as seen in Figure 62a and b.
After water elimination, a more detailed spectrum was generated with distinguishable
bands and water elimination artefacts. The artefacts were seen as the increased
baseline offset (1200–2100 cm–1) and the large noise signal seen above 3000 cm–1.
The M1Glu data analysis showed that these artefacts had a negative impact on the
modelling ability and limited useful spectral ranges. Water elimination was then
tested for M3Glu samples to verify if the same outcome occurred with more complex
data.
3.7.2 Reproducibility
When PCA was performed on the individual data collections, it did not
reveal any outliers and only two components were needed to describe the M3Glu data.
The first component represented the water signal and the second component contained
analyte signal. The M3Glu data showed a similar pattern to the M1Glu data for the
amalgamated sample sets (Figure 63). The second and third data collections were
close together while the first was separated due to the sampling setup change.
Figure 63 PCA Scores and Loadings plots for triplicate measurements (a)/(b) for averaged raw
M3Glu samples (L1: 96.7%, L2: 3.28%, L3: 0.01%) and (c)/(d) for the FDMSC M3Glu samples
(L1: 99.42%, L2: 0.30%). Run 1 is black, Run 2 is red, and Run 3 is green.
The impact of water and spectral offset was evident in the loadings (Figure 63b). For
example, in the M3GluL3 data, there was a severe baseline slope, a large downward
peak at the water bending band (1640 cm–1) and small analyte peaks. Their visibility
was hampered by the low analyte concentration, baseline slope and the large water
signal. After FD-MSC pre-processing (Figure 63c and d), the deviations caused by
scatter effects were dealt with and the number of loadings describing the media was
reduced to two, where the second loading described the D-glucose signal. This
showed that the data can be adequately corrected for quantitative analysis by
chemometric pre-processing.
3.7.3 Quantitative Analysis: Calibrating D-glucose in M3Glu Data
To determine which model best estimated the D-glucose content in media samples, a
wide variety of models were assessed (Table 65–Table 69). The best M3Glu models
for the different pre-processing methods are shown in Table 19.
Table 19 The best performing M3Glu models generated after the different pre-processing
methods.
M3Glu Data LV Correlation Coefficient RMSEC (g/L) RMSECV (g/L) REP%
WE Data MW (800–1680 cm–1) 4 0.990 0.28 0.38 7.65
BC MW (800–1680 cm–1), MSC pre-processing 4 0.993 0.25 0.30 6.04
BC MW (800–1680 cm–1), Norm2 pre-processing 4 0.993 0.25 0.32 6.44
BC FD MW (800–1680 cm–1), FD pre-processing 3 0.993 0.25 0.29 5.84
BC FDMSC MW (800–1680 cm–1), FD-MSC pre-processing 3 0.995 0.20 0.23 4.63
When compared with the best M1Glu models (Table 18), there was a decline in the
calibration model quality. This was the result of the lower analyte concentrations in
these samples. Overall, the baseline corrected spectra suited the calibration models
and the best model also used FD-MSC pre-processing. It aided the model by resolving
the peaks and removing baseline offset to improve the analyte signal. As with the
M1Glu data, water elimination on the M3Glu data only worked with first derivative
pre-processing.
For each pre-processing method, the best models used the 800–1680 cm–1 range, as the
water bending band at 1640 cm–1 acted like an internal reference. The 1640 cm–1
water band remained steady, unlike the strong OH band above 3000 cm–1, which was
affected by detector limitations, shot noise and sampling effects, causing greater
variation. In these solutions, which were very dilute compared to the M1Glu data, the
strong water signal was a benefit: it showed little change and thus acted as an
internal standard. This region also contained several peaks (915, 1059, 1123, 1372, and
1460 cm–1) related to the D-glucose signal. The variation in the D-glucose signal was
self-referenced to the stable water signal, allowing for estimation of the D-glucose
concentration [98, 131].
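Region selection of this kind amounts to masking the wavenumber axis. The sketch below is illustrative only: it assumes a hypothetical axis sampled every 1 cm–1 over the 250–3311 cm–1 collection range, the spectra are synthetic placeholders, and the function name `select_region` is not taken from the thesis.

```python
import numpy as np

def select_region(wavenumbers, spectra, low, high):
    """Keep only the spectral channels with low <= wavenumber <= high."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    mask = (wavenumbers >= low) & (wavenumbers <= high)
    return wavenumbers[mask], np.asarray(spectra, dtype=float)[:, mask]

# Hypothetical wavenumber axis and two placeholder spectra.
wn = np.arange(250.0, 3312.0, 1.0)
spectra = np.vstack([np.sin(wn / 100.0), np.cos(wn / 100.0)])

# The reduced region that keeps the 1640 cm-1 water bending band
# but excludes the OH stretch above 3000 cm-1.
wn_mw, spectra_mw = select_region(wn, spectra, 800.0, 1680.0)
```

Whether the mask is applied before or after pre-processing corresponds to the distinction between the bracketed (1) and unbracketed model names in the calibration tables.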
The best overall model was obtained after FD-MSC was applied to baseline corrected
data and when the spectral range was reduced to 800–1680 cm–1. Three latent
variables were necessary to model the data. The first loading (98.53%) represented the
large water signal within M3Glu data (Figure 64). Figure 65 shows a comparison of the
water spectrum before and after FDMSC pre-processing. The second loading
accounted for 1.16% of the explained variance and was the analyte signal. The third
loading (0.10%) was unresolved analyte signal and spectral noise.
Figure 64 The BC M3Glu calibration model built using FDMSC pre-treated data in the
800–1680 cm–1 region: (left) the predicted versus expected plot (3 latent variables, R2 = 0.995,
RMSEC = 0.20568, RMSECV = 0.23945) and (right) the loadings of the three latent variables
used in the calibration model (L1: 98.53%, L2: 1.16%, L3: 0.10%).
This model showed that for the more complex and dilute media such as M3Glu, one
could use Raman together with chemometrics to estimate the D-glucose concentration
with reasonable accuracy (REP of 4.63%). The same pre-processing and region
selection as for the M1Glu samples was used. The next step was to determine if the
same methodology worked on the more complex M5Glu Media which contains five
media components.
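How RMSEC and RMSECV arise from a PLS calibration can be sketched with a small PLS1 (NIPALS) implementation. Everything here is an assumption made for illustration: the thesis software, its cross-validation scheme and its exact error formulas are not given in this section, so leave-one-out cross-validation and a plain root mean square error are used, and the spectra are synthetic (a constant broad band standing in for water plus a concentration-dependent analyte band).

```python
import numpy as np

def pls1_fit(X, y, n_lv):
    """Fit PLS1 by NIPALS; returns the regression vector and the centring terms."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = E.T @ f
        w = w / np.linalg.norm(w)      # weight vector
        t = E @ w                      # scores
        p = E.T @ t / (t @ t)          # X loadings
        c = f @ t / (t @ t)            # y loading
        E = E - np.outer(t, p)         # deflate X
        f = f - c * t                  # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)
    return b, x_mean, y_mean

def pls1_predict(model, X):
    b, x_mean, y_mean = model
    return (np.asarray(X, dtype=float) - x_mean) @ b + y_mean

def rmse(a, b):
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

def rmsec_rmsecv(X, y, n_lv):
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    rmsec = rmse(y, pls1_predict(pls1_fit(X, y, n_lv), X))
    held_out = []
    for i in range(len(y)):            # leave-one-out cross-validation (assumed scheme)
        keep = np.arange(len(y)) != i
        model = pls1_fit(X[keep], y[keep], n_lv)
        held_out.append(pls1_predict(model, X[i:i + 1])[0])
    return rmsec, rmse(y, held_out)

rng = np.random.default_rng(1)
axis = np.linspace(0.0, 1.0, 60)
water = np.exp(-((axis - 0.8) / 0.1) ** 2)       # hypothetical constant solvent band
glucose = np.exp(-((axis - 0.3) / 0.03) ** 2)    # hypothetical analyte band
conc = np.linspace(1.0, 10.0, 20)                # g/L, synthetic
X = 5.0 * water + np.outer(conc, glucose) + rng.normal(0.0, 0.01, (20, 60))
rmsec, rmsecv = rmsec_rmsecv(X, conc, n_lv=1)
```

RMSEC is computed on the samples used to build the model, while RMSECV is computed on samples held out in turn, which is why the tables report RMSECV as the larger of the two.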
Figure 65 Raman spectra of water before (H2O) and after FDMSC pre-processing (FDMSC-H2O);
the inset is the 707–1853 cm–1 region of the FDMSC spectrum.
Figure 66 Averaged raw Raman spectra of the M5GluR1 data (a) and M5GluR2 (b).
For the first data collection, the averaged spectra (Figure 66a) displayed a severe
baseline offset, which was removed by baseline correction. When conducting PCA on the
individual data collections for M5Glu, the first data collection showed no outliers.
However, in the M5GluR2 data (Figure 66b) one sample displayed a lower intensity than
the other samples; this sample was identified as M5GluR2S12. PCA (Figure 67) confirmed
the outlier. When M5GluS12 was measured as part of the third data collection, no
outliers were present. This outlier was therefore attributed to an anomalous
measurement caused by experimental error. After M5GluR2S12 was removed, the repeated
PCA of the raw averaged M5GluR2 data revealed that the remaining samples were within
the 95% confidence limit.
Figure 67 PCA Scores plots for averaged M5GluR2 data before and after the outlier removal
and the loadings of PC1 and PC2 (L1: 99.98%, L2: 0.01%) for Run 2 data after the outlier has
been removed.
The M5Glu PCA loadings (Figure 67) illustrated a trend similar to the one seen in all
the media data analysed so far. The first loading was dominated by the water signal
(99.98% of the explained variance) and the second revealed small peaks (426, 522,
1066, 1123, 1362 and 2898 cm–1) for only ~0.02% explained variance. This was a
result of the low analyte signal intensity compared to the water signal within the
spectra. Pre-processing of the data increased the explained variance for PC2 to 0.33%,
by enhancing the analyte signal.
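The dominance of the water signal in the first loading can be reproduced with a short sketch of the explained variance per principal component. It is illustrative only: the data are synthetic, the function name `explained_variance_percent` is hypothetical, and whether the thesis PCA models were mean-centred is not stated in this section, so both cases are shown.

```python
import numpy as np

def explained_variance_percent(X, centre=True):
    """Percentage of total variance captured by each principal component."""
    X = np.asarray(X, dtype=float)
    if centre:
        X = X - X.mean(axis=0)
    s = np.linalg.svd(X, compute_uv=False)
    return 100.0 * s ** 2 / np.sum(s ** 2)

axis = np.linspace(0.0, 1.0, 60)
water = np.exp(-((axis - 0.8) / 0.1) ** 2)       # hypothetical strong, constant solvent band
glucose = np.exp(-((axis - 0.3) / 0.03) ** 2)    # hypothetical weak analyte band
conc = np.linspace(1.0, 10.0, 20)
X = 50.0 * water + np.outer(conc, glucose)

raw_pct = explained_variance_percent(X, centre=False)     # solvent band dominates PC1
centred_pct = explained_variance_percent(X, centre=True)  # constant band removed by centring
```

Without centring, the constant solvent band takes almost all of the variance in the first component and the analyte is left with a small fraction, mirroring the pattern reported for the raw data. Removing constant or slowly varying contributions raises the share attributed to the analyte.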
Looking at the third data collection for M5Glu spectra (Figure 68), the raw data
comprised two groups of spectra as a result of being collected over two days. This
grouping was caused by instrumental variation of the Raman Station, a single channel
instrument with no internal calibration to compensate for possible power fluctuations,
which could have caused the shift in the M5Glu data. This variation was corrected by
FD-MSC pre-processing, and when the FD-MSC corrected data was overlaid for Day
1 and Day 2 (Figure 68b), the samples were distributed according to their
concentrations.
Figure 68 Averaged Raman spectra of the M5GluR3 data (left) and PCA scores plot for
M5GluR3 data after first derivative and multiplicative scatter correction (right), with the
samples spread from low to high concentration. Red refers to day one and black is day two.
Figure 69 PCA Scores and loadings plots for triplicate measurements of M5Glu sample sets:
(a) M5 scores and (b) M5 loadings (L1: 99.87%, L2: 0.12%, L3: 0.01%); (c) M5_FDMSC scores,
with the corresponding loadings (L1: 98.74%, L2: 0.86%). Run 1 is black, Run 2 is red and
Run 3 is green.
As with the M1Glu and M3Glu data, PCA analysis on the M5Glu data showed that
the samples from the second and third runs were close together while the first data
collection was separated due to the sampling setup issue (Figure 69). Pre-processing
was unable to correct for this variance. The strong water signal was observed in the
first loading, the second loading depicted the analyte signal and the third loading
represented the offset effects caused by water at 1640 cm–1 and above 3000 cm–1.
After FD-MSC pre-processing of the M5GluR3 data, the loadings were reduced to
two: one for the water signal and the other for the analyte, since the interferences seen
in the raw data were removed.
3.8.2 Quantification: Glucose in M5Glu Data
Several models were built for the M5Glu data (Table 70 to Table 74) and the best
models for estimating the D-glucose concentration are listed in Table 20. From the
M1Glu/M3Glu dataset, it was observed that the reduced region of 800–1680 cm–1 and
FD-MSC pre-processing gave the best results. The same held true for the M5Glu data
(Figure 70). The best model used two latent variables: the first represented the water
signal and the second the analyte signal.
Figure 70 Predicted versus expected plot for the calibration of BC FDMSC M5Glu data in the
800–1680 cm–1 region (2 latent variables, R2 = 0.993, RMSEC = 0.25039, RMSECV = 0.27805)
and the loadings showing the components represented in the calibration model (L1: 99.78%,
L2: 0.92%).
Table 20 The best M5Glu models generated after the different pre-processing methods.
M5Glu Data LV Correlation RMSEC RMSECV REP
Coefficient (g/L) (g/L) (%)
BC Data (Entire region) 4 0.983 0.38 0.50 8.63
Preliminary pre-processing
BC MW (800–1680 cm–1) 4 0.990 0.29 0.35 6.04
MSC pre-processing
AVG MW (800–1680 cm–1) 4 0.992 0.26 0.34 5.87
Norm2 pre-processing
BC MW (800–1680 cm–1) 3 0.985 0.36 0.44 7.59
FD pre-processing
BC MW (800–1680 cm–1) 2 0.993 0.25 0.27 4.66
FD-MSC pre-processing
spectral data quality and that glucose quantification should always be possible in
complex media as long as it is present in relatively high concentration. If more
samples and more replicates were used in the calibration set and if a smaller, more
appropriate (to the designed formulation) D-glucose concentration range was
employed, then it should be feasible to get a much lower REP in the 1–2 % range.
[Figure 71 plots: two panels, (a) and (b), of intensity versus wavenumber (cm–1) over 500–3000 cm–1.]
Figure 71 Averaged raw Raman spectra of (a) M5Ye (0.1-1.72 g/L) and (b) M5eRDF (1-6.4g/L)
and, in red, the spectra after multiplicative scatter correction.
The raw Raman spectra for M5eRDF and M5Ye resembled water spectra with strong
baseline offset effects (Figure 71). At the low concentrations used in these samples,
the Raman signal was weak and there was little difference observed in the spectra for
M5Ye, M5eRDF and M5Glu samples (Figure 66). The WARs were similar for
M5eRDF (14.93) and M5Ye (14.64) while the WAR for M5Glu data was 9.71. The
M5Glu samples were a more concentrated sample set compared to the M5eRDF and
M5Ye samples.
As a result of the weak analyte signal, multivariate analysis was required to extract
relevant analyte information. MSC pre-processing was used to remove spectral offset
and noise (Figure 71) and for the quantitative analysis the same spectral regions used
with M5Glu data were used (250–3311 cm–1, 707–1853 cm–1 and 800–1680 cm–1).
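The MSC step mentioned above can be illustrated with a minimal sketch. It assumes the common textbook form of multiplicative scatter correction, in which each spectrum is regressed against the mean spectrum and then corrected by the fitted offset and slope; the toy spectra and the function names are hypothetical and are not taken from the software used in this work.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def msc(spectra):
    """Correct each spectrum against the mean spectrum (assumed MSC form)."""
    n = len(spectra)
    reference = [sum(column) / n for column in zip(*spectra)]
    corrected = []
    for spectrum in spectra:
        slope, intercept = fit_line(reference, spectrum)
        # Remove the additive offset, then the multiplicative gain.
        corrected.append([(v - intercept) / slope for v in spectrum])
    return corrected


# Hypothetical toy data: one band shape recorded with different gain and offset.
base = [0.0, 0.2, 1.0, 0.3, 0.1, 0.6, 0.05]
raw = [[g * v + o for v in base] for g, o in [(1.0, 0.0), (2.0, 0.5), (0.5, -0.2)]]
corrected = msc(raw)
```

In this idealised case the three corrected spectra coincide, because they differ only by gain and offset. Real spectra also differ in analyte signal and noise, which MSC does not remove.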
Table 21 Summary of Calibration models for the M5eRDF samples using averaged Raman data.
M5eRDF                                             LV   Correlation   RMSEC   RMSECV   REP%
                                                        Coefficient   (g/L)   (g/L)
BC Data, MW (800–1680 cm–1)                        5    0.980         0.24    0.70     18.91
Avg ROI (707–1853 cm–1), MSC pre-processing        5    0.995         0.12    0.62     16.94
BC ROI (707–1853 cm–1), NINF pre-processing        5    0.988         0.18    0.72     19.62
BC MW (800–1680 cm–1), Norm2 pre-processing        4    0.974         0.27    0.77     20.81
BC ROI (707–1853 cm–1), FST11 pre-processing       3    0.975         0.27    0.62     16.75
BC ROI (707–1853 cm–1), FST11MSC pre-processing    4    0.993         0.14    0.59     15.94
The pre-processed M5eRDF spectra and the best calibration model are shown in
Figure 72. The best model was built using the reduced range of 707–1853 cm–1; this
eliminated the strong OH band and the sloping baseline seen below 700 cm–1 (visible
in Figure 71). The model used four latent variables; the second loading correlated with
the analyte signal and its scores showed a noisy linear correlation of 0.911 with
increasing concentration (Figure 73). The first loading represented the water signal
with 99% of the explained variance in the spectra. The remaining two loadings
described less than 1% of the explained variance (analyte signal and noise). When
compared to the second loading, these represented ~38% and ~23% of the analyte
signal and noise. The overwhelming water signal reduced their influence on the
model.
[Figure 72 plots. Left: pre-treated spectra, intensity versus wavenumber (cm–1), 800–1800 cm–1. Right: predicted versus expected eRDF concentration (g/L); R2 = 0.993, 4 latent variables, RMSEC = 0.14718, RMSECV = 0.59011.]
Figure 72 The pre-treated spectra of BC FDMSC M5eRDF data and the predicted versus
expected eRDF concentration plot for the calibration model for M5eRDF in the region of 707-
1853 cm–1.
[Figure 73 plot: loadings versus wavenumber (cm–1), 800–1800 cm–1; LPC1 (99.69%), LPC2 (0.13%), LPC3 (0.05%), LPC4 (0.03%).]
Figure 73 M5eRDF loadings (left) and scores (right) of the second component for BC FDMSC
ROI calibration model.
Compared to the M5Glu calibration models (Table 20), where the best REP was ~5%,
the M5eRDF model had significantly lower accuracy, with a REP of ~16%. The
difference in performance was the result of the high WAR of the M5eRDF compared
to the M5Glu sample set and the different numbers of samples in each set.36
M5Glu comprised 32 samples spanning 0.0 g/L to 9.92 g/L, while only 10 samples
were used for M5eRDF, covering the 1.0 g/L to 6.4 g/L range. The performance of the
M5eRDF model could therefore be improved by increasing the number of samples and
increasing the concentration of eRDF within the samples.
36 Measurement precision increases with more samples.
When comparing the performance of the M5eRDF model to the M5Glu one, over-fitting37
was evident within the data, as noted by the low RMSEC and high RMSECV.
The M5eRDF models used more latent variables on average. For the best M5Glu
models the average ratio of SECV to SEC was 1.2, while for the M5eRDF models the
average ratio was 3.5. This was another indicator that the M5eRDF data were over-fitted.
This was likely due to the spectra not changing enough to be correlated with the
increase in eRDF concentration. The concentration change was not seen as a single signal but
as the product of multiple individual components changing, since eRDF is a mixture.
This resulted in a more complex but less intense change in the Raman signal. The
Raman method worked for the M5Glu dataset not only because a single
component signal was changing, but also because the M5Glu dataset had
more samples, allowing for greater precision.
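The over-fitting indicator discussed above can be checked with a short sketch. It assumes that the SECV/SEC ratio corresponds to RMSECV/RMSEC for the models listed in Table 20 and Table 21; because the tabulated values are rounded to two decimals, the averages can differ slightly from the figures quoted in the text.

```python
from statistics import mean

# (RMSEC, RMSECV) pairs in g/L, copied from Table 20 (M5Glu) and Table 21 (M5eRDF).
m5glu = [(0.38, 0.50), (0.29, 0.35), (0.26, 0.34), (0.36, 0.44), (0.25, 0.27)]
m5erdf = [(0.24, 0.70), (0.12, 0.62), (0.18, 0.72),
          (0.27, 0.77), (0.27, 0.62), (0.14, 0.59)]


def mean_cv_ratio(models):
    """Average cross-validation to calibration error ratio over a set of models."""
    return mean(rmsecv / rmsec for rmsec, rmsecv in models)


glu_ratio = mean_cv_ratio(m5glu)
erdf_ratio = mean_cv_ratio(m5erdf)
print(f"M5Glu: {glu_ratio:.2f}  M5eRDF: {erdf_ratio:.2f}")
```

A ratio close to 1 means the model predicts left-out samples about as well as it fits the calibration samples. A ratio several times larger, as for M5eRDF, suggests that the latent variables are partly fitting calibration-specific noise.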
Figure 74 shows a linear correlation. However, there was too much scatter and this
prevented generation of an accurate model. This calibration model used three latent
variables to describe the system. The first loading explained 99.65 % of variance and
matched the water signal after pre-processing shown in Figure 65. The first loading
for the M5Ye matched the first loading for the M5eRDF data, as water was the major
component in these media samples. After the water signal was described, only 0.24 %
of variance was left to be explained by the second and third loadings. Together they
represented the noisy analyte signal buried beneath the water signal, as well as shot
noise. Model accuracy might be improved if a higher concentration range of yeastolate
were studied and if more samples and replicates were used in the calibration set.
37 When too many latent variables are used, the model essentially fits noise which is specific to the
calibration set. Over-fitting is then characterised by the large RMSECV which results from the
prediction of samples with their own noise pattern.
38 A full listing of the calibration models is available in the appendix, see section 8.3.5.
Table 22 Summary of models performance for the M5Ye samples using averaged Raman data
M5GLUYe                                           LV   Correlation   RMSEC   RMSECV   REP%
                                                       Coefficient   (g/L)   (g/L)
BC Data Full (250–3311 cm–1)                      4    0.935         0.13    0.36     39.67
BC Full (250–3311 cm–1), MSC Pre-processing       4    0.944         0.12    0.36     39.78
BC Full (250–3311 cm–1), NINF Pre-processing      4    0.938         0.12    0.36     40.32
BC Full (250–3311 cm–1), Norm2 Pre-processing     4    0.938         0.12    0.36     39.56
Avg ROI (707–1853 cm–1), FD11 Pre-processing      3    0.941         0.12    0.35     38.46
Avg ROI (707–1853 cm–1), FD11MSC Pre-processing   3    0.929         0.13    0.43     47.25
[Figure 74 plots. Left: predicted versus expected concentration (g/L), 0–1.8 g/L; R2 = 0.941, 3 latent variables, RMSEC = 0.12588, RMSECV = 0.35503. Right: loadings versus wavenumber (cm–1), 800–1800 cm–1; LPC1 (99.65%), LPC2 (0.20%), LPC3 (0.04%).]
Figure 74 Predicted versus expected concentration plot and loadings of the calibration model for
FD M5Ye data in the region of 707–1853 cm–1.
In comparison to the M5GluR2 model (Figure 70, REP 4.66%), the error level was
higher for the validation models (avg. REP 7.10%). For these validation tests, the
samples were subject to more day-to-day variation, which did not affect the M5GluR2
model. To overcome this source of error and improve the results, the data should be
normalised prior to validation.
[Figure 75 plot: predicted (g/L) versus measured (g/L), 0–10 g/L; R2 = 0.989, 3 latent variables, RMSEC = 0.30786, RMSECV = 0.37503, RMSEP = 0.30607.]
Figure 75 Predicted versus expected prediction plot with the calibration samples for BC FDMSC
M5Glu data using the 800–1680 cm–1 range.
Table 23 Results for the Internal Validation on 10 different subsets for the BC FDMSC M5Glu
data in the 800–1680 cm–1 region.
Dataset       LV   Correlation   RMSEC   RMSECV   RMSEP   REP%
                                 (g/L)   (g/L)    (g/L)
Subset_A01C   3    0.991         0.28    0.34     -       6.26
Subset_A01P   3    0.984         0.28    0.34     0.38    6.99
Subset_A02C   3    0.987         0.33    0.38     -       6.99
Subset_A02P   3    0.987         0.33    0.38     0.31    5.70
Subset_A03C   3    0.990         0.28    0.33     -       6.21
Subset_A03P   3    0.991         0.28    0.33     0.44    8.28
Subset_A04C   3    0.992         0.26    0.32     -       7.40
Subset_A04P   3    0.986         0.26    0.32     0.39    9.02
Subset_A05C   3    0.986         0.32    0.37     -       7.82
Subset_A05P   3    0.991         0.32    0.37     0.32    6.76
Subset_A06C   3    0.988         0.32    0.38     -       8.29
Subset_A06P   3    0.984         0.32    0.38     0.35    7.64
Subset_A07C   3    0.989         0.30    0.35     -       6.90
Subset_A07P   3    0.984         0.30    0.35     0.43    8.48
Subset_A08C   3    0.991         0.27    0.33     -       6.58
Subset_A08P   3    0.984         0.27    0.33     0.37    7.38
Subset_A09C   3    0.987         0.30    0.37     -       7.50
Subset_A09P   3    0.989         0.30    0.37     0.30    5.89
Subset_A10C   3    0.990         0.30    0.35     -       7.11
Subset_A10P   3    0.985         0.30    0.35     0.37    7.52
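The subset results in Table 23 can be summarised with a few lines of arithmetic. The sketch below simply averages the tabulated values; the split into calibration (C) and prediction (P) rows follows the dataset suffixes, and which of the two averages a quoted mean REP refers to is left as an open assumption.

```python
from statistics import mean

# REP% values copied from Table 23, in subset order A01 to A10.
rep_calibration = [6.26, 6.99, 6.21, 7.40, 7.82, 8.29, 6.90, 6.58, 7.50, 7.11]
rep_prediction = [6.99, 5.70, 8.28, 9.02, 6.76, 7.64, 8.48, 7.38, 5.89, 7.52]
# RMSEP values (g/L) for the prediction rows of Table 23.
rmsep = [0.38, 0.31, 0.44, 0.39, 0.32, 0.35, 0.43, 0.37, 0.30, 0.37]

mean_rep_calibration = mean(rep_calibration)
mean_rep_prediction = mean(rep_prediction)
mean_rmsep = mean(rmsep)
print(f"mean REP (C rows): {mean_rep_calibration:.2f} %")
print(f"mean REP (P rows): {mean_rep_prediction:.2f} %")
print(f"mean RMSEP: {mean_rmsep:.3f} g/L")
```

Reporting the spread across subsets (for example the minimum and maximum REP) alongside the mean gives a fuller picture of how sensitive the model is to the particular split.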
3.10.2 Independent Test Set Prediction
A new sample set (T5) was collected to determine the capability of the larger M5Glu
dataset for the prediction of unknown samples. The samples were prepared in the
same way as the M5Glu samples but at a different time and date. The composition of
the samples can be seen in Materials and Methods (Table 5). The D-glucose
concentration in the T5 samples ranged from 1.7 g/L to 9.8 g/L. Calibration was
performed using the amalgamated data of the second and third M5Glu data collections
(62 samples39). Modelling was also conducted on the averaged replicate data (32
samples). As well as the amalgamated and the averaged M5Glu sample sets,
normalised data for these were also tested in order to minimise the intensity offset
between the different data collections.
PCA was used to evaluate the closeness between the T5/M5Glu data. The closer the
calibration set was to the prediction set, the better it was for the model and
quantitative performance. The scores plot showed that there was an overlap between
M5Glu/T5 datasets and the T5 samples were spread across the M5Glu sample set
(Figure 76).
[Figure 76 plot: scores on PC 2 (1.00%) versus scores on PC 1 (98.50%) for the M5 and T5 samples.]
Figure 76 Scores plot for the PCA comparison of T5 and M5Glu Data after BC FDMSC MW
pre-processing.
39 Samples M5R2S12 and M5R2S23 were removed as outliers.
Table 24 Calibration and prediction models generated by BC FDMSC M5Glu and BC FDMSC
T5 samples using the 800–1680 cm–1 range.
# of Samples /    Correlation   LV   RMSEC   RMSECV   RMSEP   REP%
Pre-processing    Coefficient
62                0.982         3    0.40    0.45     -       9.07
10                0.980         3    0.40    0.45     0.57    9.93
62_Normalised     0.971         2    0.50    0.52     -       10.48
10_Normalised     0.990         2    0.50    0.52     0.52    9.05
The external validation of M5Glu data (Table 24) showed that the models were
capable of performing predictions and produced a prediction performance equivalent
to the validation model generated by sample splitting. Averaging the M5Glu data
improved the model error level, while normalisation saw only a slight improvement in
the averaged and amalgamated data. The averaged sample sets performed better as the
variance within the data was reduced compared to the amalgamated sample sets. The
lower correlation in the validation models was due to the increased variation between
the calibration and prediction sets as a result of the different sample make-up and
concentration ranges.
[Figure 77 plot: predicted (g/L) versus expected (g/L), 0–10 g/L; R2 = 0.990, 2 latent variables, RMSEC = 0.2727, RMSECV = 0.29923, RMSEP = 0.34207.]
Figure 77 Predicted versus expected prediction plot for normalised BC FDMSC T5 with the
M5Glu calibration samples data using the 800–1680 cm–1 range.
Table 25 Prediction results of T5 data based on the normalised BCFDMSC data in the 800–1680
cm–1 region model from the M5Glu data.
Sample No   Expected (g/L)   Predicted (g/L)   Difference (g/L)
T5Glu01     1.7              2.07              +0.37
T5Glu02     2.6              3.02              +0.42
T5Glu03     3.5              3.96              +0.46
T5Glu04     4.4              4.58              +0.18
T5Glu05     5.3              5.55              +0.25
T5Glu06     6.2              6.67              +0.47
T5Glu07     7.1              7.29              +0.19
T5Glu08     8.0              8.24              +0.24
T5Glu09     8.9              8.55              -0.35
T5Glu10     9.8              9.49              -0.31
Table 25 displays the prediction results for the T5 samples. The reasonable
performance from the M5Glu/T5 model indicated the potential of Raman
spectroscopy for modelling high concentration samples. These results showed that
prediction of the low concentration samples was worse than the higher concentration
ones. This was consistent with previously modelled M1Glu data, where better
calibration models were obtained as a result of higher concentration and stronger
signal. Therefore, adapting the Raman procedure to higher concentration ranges
would lead to an analytical tool with better accuracy for quantifying D-glucose
concentration in media.
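As a cross-check on Table 25, the prediction error can be recomputed from the tabulated values. The sketch below assumes the usual definition of RMSEP as the root mean square of the prediction residuals; since the table entries are rounded, the result is expected to be close to, but not identical with, the RMSEP shown in Figure 77.

```python
from math import sqrt

# Expected and predicted D-glucose concentrations (g/L) copied from Table 25.
expected = [1.7, 2.6, 3.5, 4.4, 5.3, 6.2, 7.1, 8.0, 8.9, 9.8]
predicted = [2.07, 3.02, 3.96, 4.58, 5.55, 6.67, 7.29, 8.24, 8.55, 9.49]

residuals = [p - e for p, e in zip(predicted, expected)]
rmsep = sqrt(sum(r * r for r in residuals) / len(residuals))
bias = sum(residuals) / len(residuals)  # mean signed error
print(f"RMSEP = {rmsep:.3f} g/L, bias = {bias:+.3f} g/L")
```

The positive bias reflects the fact that eight of the ten samples were over-predicted, which the signed differences in Table 25 already suggest.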
The major issue with these Raman datasets was the large water signal compared to
weak analyte signal as seen in Figure 78. The water signal dominated the first loading
for the different sample sets. M1Glu was a simple system where the analyte was at
higher concentrations, giving a stronger performing model. For this reason, the first
loading of the model selected for M1Glu had small peaks from the D-glucose beneath
the water signal. However for the M3Glu and M5Glu data, the first loadings were the
same as the water signal with varying baselines. Elimination of the water signal from
the data by simple subtraction was possible though it produced spectra containing
artefacts. These artefacts occurred when the variance amongst the spectra was caused
by more than the weak analyte signal and its varying concentration. Unresolved issues
such as baseline offset, sloping baseline and noise also contributed to the spectral
variance. Further pre-processing prior to water elimination may prevent the
appearance of some artefacts. The efficacy of the water elimination (WE) method as part of
a series of pre-processing steps was shown by Li et al. when performed on relatively
low concentration samples (1–2% dissolved solids) [3]. However, in this study the
samples (M3Glu/M5Glu) had a lower concentration of dissolved solids (~1%) and
WE was not a suitable method.
The second loading for the M1Glu, M3Glu and M5Glu data revealed the impact of
lower D-glucose concentration and increasing sample complexity (Figure 78). For
M1Glu (blue), the analyte signal was clear but became less defined with M3Glu
(green) and M5Glu (red). The noise level was elevated with the M3Glu and M5Glu
data. M3Glu was severely affected by the water bending band at 1640 cm–1 as it had
the lowest level of dissolved solids.
[Figure 78 plots, intensity versus wavenumber (cm–1), 500–3000 cm–1. Left: first loadings L1M1 (99.91%), L1M3 (99.99%), L1M5 (99.98%) and the H2O spectrum. Right: second loadings L2M1 (0.08%), L2M3 (0.03%), L2M5 (0.01%).]
Figure 78 (Left) First loadings and water spectra and (Right) second loading for M1Glu, M3Glu
and M5Glu from equivalent40 PCA models.
40 Same data treatment.
The calibration methods were used on high (M1Glu) and low concentration (M3Glu,
M5Glu, M5eRDF and M5Ye) samples. The performance of the various models
reflected the different concentration magnitudes and dataset complexity. However, the
same pre-processing and region selection were found to be adequate for all the
different D-glucose sample sets. The M5Glu model developed here performed well in
the 0–9.92 g/L range giving a REP of ~ 5%. This was well within the typical
concentration range of glucose in media samples for mammalian cell culture (which is
typically 1–10 g/L). [23, 299, 300] The glucose concentration required for plant, yeast
and bacterium cell lines is higher (20–30 g/L). [24, 25, 301] The Raman method
worked to a limited level of accuracy for the D-glucose samples as a result of the low
concentrations studied. An improvement of the model performance would be
expected with a higher concentration range where the signal would be stronger. For
the quantification of the more complex components, Raman analysis did not perform
well. No acceptable model was found for the weak M5Ye samples, and the REP of the M5eRDF
model (~16%) was roughly three times that of the M5Glu model (~5%). These results could be
improved by a larger number of samples but were intrinsically weakened by the low
concentration ranges.
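The REP values compared above can be related to the RMSECV values with a short sketch. It rests on an assumption inferred from the tables rather than stated in this section: that REP is the RMSECV expressed as a percentage of the mean reference concentration of the sample set. Under that assumption, every row of Table 20 should imply the same mean concentration.

```python
# (RMSECV in g/L, REP in %) pairs copied from Table 20 (M5Glu models).
m5glu = [(0.50, 8.63), (0.35, 6.04), (0.34, 5.87), (0.44, 7.59), (0.27, 4.66)]


def implied_mean_concentration(rmsecv, rep_percent):
    """Mean concentration implied by REP = 100 * RMSECV / mean (assumed definition)."""
    return 100.0 * rmsecv / rep_percent


implied = [implied_mean_concentration(cv, rep) for cv, rep in m5glu]
for value in implied:
    print(f"{value:.2f} g/L")
```

All five rows give approximately the same value, which is consistent with the assumed definition. It also illustrates why a similar absolute error translates into a much larger REP for the dilute M5Ye and M5eRDF sets, whose mean concentrations are lower.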
This study has shown that (1) Raman analysis is not sensitive enough for the dilute
solutions tested here, (2) the analyte signal is obscured by the water signal and the
associated shot noise, and (3) the water elimination correction method added more
interfering information rather than being of any benefit to the analyte signal. For
these reasons the next logical step was to use SERS to enhance the analyte signal
while limiting the impact of the water signal. SERS can however only be applied to
yeastolate and eRDF quantification as D-glucose is not a SERS active molecule.
4 Surface Enhanced Raman Spectroscopy (SERS) Analysis of Complex Media Components
SERS offered greater sensitivity and was applied to the qualitative analysis of
yeastolate and other media components [5, 159]. The use of SERS for monitoring
changes in yeastolate showed that the SERS signal from complex media components
was a useful qualitative tool for detecting batch to batch variations and storage
changes [5]. With this in mind, we propose to use SERS to quantify the eRDF and
yeastolate concentrations in cell culture media. Specifically we want to quantify the
global concentration of eRDF/yeastolate and not individual constituents.
This chapter focuses on two topics:
(a) The investigation of SERS signals from complex media components eRDF
and yeastolate; and
(b) The quantification of eRDF and yeastolate in media by SERS.
SERS measurements were carried out on the same M5eRDF and M5Ye sample sets as
for conventional Raman (Figure 79). The comparison of SERS spectra obtained by
using colloidal silver nanoparticles, with the corresponding Raman spectrum revealed
a significant enhancement of the Raman signal (Figure 80).
[Figure 79 plots: two panels of intensity (×10^4) versus Raman shift (cm–1), 500–3000 cm–1.]
Figure 79 (Left) SERS spectra of M5eRDF samples (1–6.4 g/L), and (Right) SERS spectra of
M5Ye samples (0.1–1.72 g/L).
[Figure 80 plots: two panels of intensity versus wavenumber (cm–1). Left legend: eRDF AvgSERS, eRDF AVGCR, Difference. Right legend: Ye AvgSERS, Ye AVGCR, Difference.]
Figure 80 Conventional Raman spectra, SERS spectra and difference for the M5eRDF samples
(Left) and M5Ye samples (Right).
spectrum gave information on the size (λmax) and the size distribution (fwhm) of
colloid particles. For optimum SERS enhancement when using 785 nm excitation, the
good quality silver colloids showed λmax values close to ~400 nm and a fwhm of <60
nm [304-306]. An increase in the fwhm value indicated an increasing particle size
variation [307]. From our preparation of silver colloids, good batches had an
absorption maximum (λmax) of ~406 nm with a full width half maximum of 80 nm.
Acceptable Raman spectra were also achieved from colloids with a λmax as high as 412
nm; however, colloids with a λmax of ~430 nm generated poor Raman spectra (Figure
81).
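The two colloid quality parameters described above, λmax and fwhm, can be extracted from a UV-Vis absorption spectrum as sketched below. This is a minimal illustration under stated assumptions: the spectrum has a single band, the half maximum is measured from zero absorbance, and the synthetic Gaussian band (centred at 406 nm with a width of 80 nm) is hypothetical test data, not a measured spectrum.

```python
from math import exp


def lambda_max_and_fwhm(wavelengths, absorbance):
    """Return the wavelength of maximum absorbance and the full width at half maximum."""
    peak = max(range(len(absorbance)), key=absorbance.__getitem__)
    half = absorbance[peak] / 2.0

    def half_crossing(step):
        # Walk away from the peak until the next point is at or below half maximum.
        i = peak
        while 0 <= i + step < len(absorbance) and absorbance[i + step] > half:
            i += step
        j = i + step
        if not 0 <= j < len(absorbance):
            raise ValueError("band does not fall to half maximum within the range")
        # Linear interpolation between the points either side of the crossing.
        fraction = (absorbance[i] - half) / (absorbance[i] - absorbance[j])
        return wavelengths[i] + fraction * (wavelengths[j] - wavelengths[i])

    return wavelengths[peak], half_crossing(+1) - half_crossing(-1)


# Hypothetical synthetic band: Gaussian centred at 406 nm with a fwhm of 80 nm.
sigma = 80.0 / 2.354820045
wavelengths = list(range(300, 851))
absorbance = [exp(-0.5 * ((w - 406) / sigma) ** 2) for w in wavelengths]
lam_max, fwhm = lambda_max_and_fwhm(wavelengths, absorbance)
```

For measured spectra, a baseline correction before the half-maximum search would probably be needed, since a broad or asymmetric band may not fall to half maximum on the long-wavelength side.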
[Figure 81 plot: normalised intensity versus wavelength (nm), 300–850 nm. Batch λmax values: B1 408 nm, B2 430 nm, B3 404 nm, B4 406 nm, B5 412 nm, B6 404 nm, B7 404 nm, B8 434 nm, B9 404 nm, B10 404 nm.]
Figure 81 Normalised UV-Vis absorption spectra of ten different batches of silver colloid, where
the optimal colloids are the solid lines while the poor performing colloids are represented by the
dashed lines.
In order to overcome batch variation based on colloid particle size, several batches of
good quality SERS colloids were mixed together to form a single colloid. Mixing
batches minimised batch-to-batch variation, which would otherwise have adversely affected
spectral reproducibility [306].
It was previously seen in an eRDF solution (18 g/L) using a 1:4 sample to colloid
ratio that the SERS signal steadily increased for about 6 minutes before levelling off
[101]. Therefore, data collection was performed within minutes of colloid addition
before levelling off could occur. The colloid was added, the solution was mixed five
times and the spectra were then measured. The incubation times for all samples were
kept close and as short as possible.
[Figure 82 plots: (a) SERS intensity spectra from 0 mins to 60 mins; (b) and (c) as described in the caption.]
Figure 82 (a) Plot of SERS spectra for M5Ye sample (1.54 g/L) versus time (sixteen
measurements taken over an hour), (b) Intensity profiles for selected peaks showing the
increasing intensity and (c) the intensity ratio for the selected peaks against the water peak at
1604 cm–1. The SERS spectra were measured using a single point collection with an exposure
time of 2×10 s. A sample to colloid ratio of 1:1 was used.
In order to investigate incubation time effects on the M5Ye (yeastolate at 1.54 g/L)
and M5eRDF (eRDF at 3.4 g/L) samples, a series of SERS spectra were taken over an
hour using a 1:1 sample to colloid ratio. Sixteen measurements were made with no
re-suspension to show the evolving spectra for M5Ye and M5eRDF (Figure 82a and
Figure 83a). The spectra displayed a steady increase in baseline intensity and
enhancement with time indicating a reasonably stable sample colloid mixture with no
need of re-suspension. These results differed from the SERS testing of eRDF (17.7
g/L) solution with the 1:4 sample to colloid ratio. In that case, aggregation of the
nanoparticles was induced at a higher rate compared to the less concentrated M5Ye
and M5eRDF samples [101]. The use of dilute media samples required a smaller
quantity of colloid which saw less aggregation occurring and provided a steadily
increasing signal for testing.
In the M5Ye sample (Figure 82b), the 730 cm–1 peak exhibited the greatest intensity
increase followed by the 1332 cm–1 peak, while for the M5eRDF sample (Figure 83b),
it was the 650 cm–1 peak along with the 1388 cm–1 peak that displayed the greatest
intensity. We speculated that the 730 cm–1 peak was related to the adenine signal in
yeastolate, while the 650 cm–1 peak was a result of the L-cysteine hydrochloride
monohydrate present in eRDF, and that the 1332 cm–1 and 1388 cm–1 peaks signified the amino
acid portion of the M5Ye and M5eRDF samples, respectively. In terms of variance
compared to the 1604 cm–1 band (OH bending band), there were two trends: the
low-wavenumber bands (650 cm–1 and 730 cm–1) showed a relative increase and the
high-wavenumber bands remained stable (see Figure 82c and Figure 83c). This could be a
result of the high-wavenumber bands relating to stretching vibrations while the
low-wavenumber bands involved bending vibrations (since more energy is required to
stretch a group than to bend a group) [311, 312]. Other factors such as the surface
orientation of individual analytes may also be the source of differences in the band
intensities. As the micro-environment of the sample was continually changing because of
the aggregation of the colloid, more hotspots were forming, resulting in an increasing
signal from molecules41 closer to the hotspots [311].
41 Molecules that are perpendicular to the surface are more significantly enhanced than those parallel to
the surface.
For these samples, using a 1:1 ratio of sample to colloid gave a stable mixture without
re-suspension of the sample as the signal increased or remained constant during the
testing period, indicating that precipitation did not occur and the SERS signal was
steady.
[Figure 83 plots: (a) SERS intensity spectra versus wavenumber (cm–1), 600–1800 cm–1, from 0 mins to 60 mins; (b) and (c) as described in the caption.]
Figure 83 (a) Plot of SERS spectra for M5eRDF sample (3.4 g/L) versus time (sixteen
measurements taken over an hour), (b) Intensity profiles for selected peaks showing the
increasing intensity and (c) the intensity ratio for the selected peaks against the water peak at
1604 cm–1. The SERS spectra were measured using a single point collection with an exposure
time of 2×10 s. A sample to colloid ratio of 1:1 was used.
4.2.3 Reproducibility
Reproducibility was a major issue with SERS measurements. If the sample was left to
stand, the colloid was liable to aggregate and precipitate out of solution before testing;
therefore, immediate testing of the sample was preferable. PCA scores (Figure 84) of
the replicate runs42 for the M5eRDF and M5Ye measurements demonstrated the class
variance for the different data collections. M5Ye samples (Figure 84b) were more
stable with overlapping ellipsoids, while the M5eRDF samples showed greater
variability, especially amongst the low concentration samples. For example,
M5eRDFS01 from data collection one (#1) and three (#21) were close but for data
collection two (#11), it differed with its high PC2 reading. This may be a low
concentration effect where matrix effects were causing more fluctuations to occur,
given that the high concentration samples M5eRDFS10 (#10, #20, #30) were grouped
together in the centre of the scores plot.
42 The PCA scores for the individual runs are shown in section 8.4.1.
[Figure 84 plots: PCA scores on PC 2 versus scores on PC 1 for Run 1, Run 2 and Run 3, with numbered samples, in panels (a) and (b). Axis labels as extracted: PC 1 (99.73%) and PC 1 (99.86%); PC 2 (0.06%) and PC 2 (0.14%).]
Figure 84 Scores plots for the three raw data collections for M5eRDF (a) and M5Ye (b), with the
replicate runs outlined. The ellipsoids represent the PCA subspace generated for each dataset.
[Figure 85 plot: intensity spectra for Yeastolate (Raman) and Ye SERS. Labelled peaks (cm–1): 658, 730, 802, 955, 1027, 1332, 1388, 1644, 2934.]
Figure 85 SERS and Raman spectra for an aqueous solution of yeastolate (5 g/L).
[Figure 86 plot: intensity (×10^4) versus wavenumber (cm–1), 500–3000 cm–1, for eRDF (Raman) and eRDF SERS. Labelled peaks (cm–1): 666, 730, 802, 899, 955, 1035, 1340, 1396, 1612, 2942.]
Figure 86 SERS and Raman spectra for an aqueous solution of eRDF (17 g/L).
models for determining yeastolate concentration using different spectral pre-processing
methods. Using SERS data it was possible to generate a better correlation
between the yeastolate concentration and the M5Ye spectra. The models built on
the conventional Raman data gave an error of 38–47%, compared to 10–18% error
for these SERS models. However, the error levels were still high and accurate
predictions were difficult to obtain reliably.
Compared to the other pre-processing methods (Table 27), FDMSC
offered the best performance; the calibration model had a REP of ~12% with three
latent variables. Table 27 also shows a high correlation of 0.998 with a lower
REP of ~10% for the NormINF model. This model appeared better than the FDMSC
model but it was subject to over-fitting43, as indicated by the higher number of latent
variables used and the large difference between the RMSEC and RMSECV values.
The reduced region (1260–1444 cm–1) of the spectra used for the FDMSC model
(Figure 87) contributed to the correlation observed. From previous studies [95, 145],
the bands from 1300 cm–1 to 1400 cm–1 were attributed to CH2/CH3 bending and
deformation in amino acids. Since there were multiple amino acids within the M5Ye
model media, it was not surprising that the amino acid components influenced the
calibration model.
[Figure 87 plots. Left: pre-processed SERS spectra, intensity versus wavenumber (cm–1), 1260–1440 cm–1. Right: predicted versus expected concentration (g/L), 0–1.8 g/L; R2 = 0.979, 3 latent variables, RMSEC = 0.075789, RMSECV = 0.11168.]
Figure 87 SERS spectra for M5Ye of the 1260–1444 cm–1 range after FD-MSC pre-processing
with the predicted versus expected plot for the resulting calibration model.
43 Over-fitting leads to poor prediction results as the model is too specific to the calibration samples.
Table 27 The best results for calibration modelling of yeastolate in the M5Ye SERS data
M5Ye                               LV   (R2)    RMSEC   RMSECV   REP%
                                                (g/L)   (g/L)
WE Data (250–3311 cm–1)            4    0.966   0.09    0.16     17.58
WE MSC (250–3311 cm–1)             4    0.973   0.08    0.15     16.48
WE NINF A (602–995 cm–1)           5    0.998   0.02    0.10     10.98
WE Norm2A (602–995 cm–1)           4    0.985   0.06    0.14     15.38
BC FST11B (1260–1444 cm–1)         3    0.951   0.11    0.15     16.48
BC FST11MSC B (1260–1444 cm–1)     3    0.979   0.07    0.11     12.08
The loadings and scores (Figure 88) explained more about the behaviour of the model.
The first loading was representative of the average FDMSC signal with peaks at 1310
cm–1 and 1380 cm–1, while the corresponding first score plot confirmed the increasing
signal for an increasing yeastolate concentration.
[Figure 88 plot: loadings versus wavenumber (cm–1), 1260–1440 cm–1; LPC1 (88.90%), LPC2 (10.18%), LPC3 (0.64%).]
Figure 88 The loadings and scores versus samples from the M5Ye SERS calibration model with
three components: the first component in blue, the second in green and the third in red.
The second loading dealt with the 1310 cm–1 and 1420 cm–1 peaks while the third
loading covered the peaks at 1270 cm–1, 1330 cm–1, 1380 cm–1 and 1440 cm–1. From
the loadings, it was clear that the M5YeS06 was different from the neighbouring
samples. However in the calibration M5YeS06 was on the regression line and was not
an outlying sample. Also in the PCA results (Figure 116), M5YeS06 was grouped
with the other samples. It seemed to be an anomaly. The fault in this sample may be
in the sample preparation, given that when it was compared with the other samples, it
displayed a similar spectrum to the M5YeS01 and M5YeS02 samples. In the prediction testing
of eRDF concentration within the M5YeS06 sample as part of the model evaluation
for M5eRDF, the predicted eRDF concentration was considerably higher than expected
(5.60 g/L predicted versus 3.4 g/L expected, Table 29). This gave
merit to the hypothesis of incorrect sample preparation and also gave a reason as to
why the M5YeS06 spectrum matched the M5YeS01 and M5YeS02 samples. They
shared a common trait: their ratio of eRDF to yeastolate was large compared to the
high concentration samples. Another factor linking this M5YeS06 sample to a strong
eRDF signal was the proximity of the 1375 cm–1 peak to the 1388 cm–1 peak which
was a significant peak in the M5eRDF spectra (Figure 83).
Table 28 Comparison between best calibration models after different pre-processing methods
from the M5eRDF SERS data
M5eRDF                               LV   (R2)    RMSEC   RMSECV   REP%
                                                  (g/L)   (g/L)
AvgDataB (1260–1444 cm–1)            3    0.919   0.49    0.63     17.02
WE MSC (250–3311 cm–1)               2    0.908   0.52    0.63     17.02
WE NINF (250–3311 cm–1)              2    0.922   0.48    0.59     15.94
WE Norm2ROI (707–1853 cm–1)          4    0.943   0.41    0.61     16.48
Avg FST11ROI (707–1853 cm–1)         4    0.960   0.34    0.74     20.00
Avg FST11MSCROI (707–1853 cm–1)      4    0.965   0.32    0.74     20.00
[Figure 89 plots. Left: water eliminated SERS spectra, intensity versus wavenumber (cm–1), 500–3000 cm–1. Right: predicted versus expected concentration (g/L), 1–7 g/L; R2 = 0.922, 2 latent variables, RMSEC = 0.4811, RMSECV = 0.5958.]
Figure 89 Water eliminated SERS spectra after normalising for the M5eRDF over the entire
range (250–3311 cm–1) and the predicted versus expected plot for corresponding calibration
model.
The REP for this model was ~16%, similar to that achieved with the conventional
Raman data. In addition, only two latent variables were used and a low SECV/SEC
ratio was noted, unlike the conventional Raman model. Both scores showed increasing
linear trends for increasing eRDF concentration per sample (see Figure 90). When
comparing the scores versus concentration plots, the R2 value for L2 was 0.92
compared to 0.71 for L1. The first loading was representative of the average
spectrum after water elimination. This led to a weaker correlation from the signal for
the changing eRDF concentration as the largest water elimination artefact was also
included in the model. The second variable was better correlated to the increase in
eRDF concentration as its fingerprint region was highly detailed and the level of noise
from the water elimination was significantly less.
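The figures of merit quoted for this model can be reproduced with a few lines of Python. This is a minimal sketch rather than the software used in this work. It assumes that REP is the RMSECV expressed as a percentage of the mean reference concentration, an assumption that matches the Table 28 values to within rounding when the mean eRDF concentration is taken as 3.7 g/L. The function names are hypothetical.

```python
import numpy as np

def rmse(expected, predicted):
    """Root mean square error between reference and predicted concentrations."""
    expected = np.asarray(expected, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - expected) ** 2)))

def rep_percent(rmsecv, mean_concentration):
    """Relative error of prediction (%), assumed here to be 100 * RMSECV / mean."""
    return 100.0 * rmsecv / mean_concentration

def cv_to_cal_ratio(rmsecv, rmsec):
    """RMSECV/RMSEC ratio; values well above 1 suggest over-fitting."""
    return rmsecv / rmsec

# WE NINF row of Table 28; the 3.7 g/L mean concentration is an assumption.
rep = rep_percent(rmsecv=0.59, mean_concentration=3.7)
ratio = cv_to_cal_ratio(rmsecv=0.59, rmsec=0.48)
print(f"REP = {rep:.1f}%, RMSECV/RMSEC = {ratio:.2f}")
```

A ratio close to one, as obtained here, indicates that the cross-validation error is only slightly larger than the calibration error.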
[Figure 90 loadings panel: loadings versus wavenumber (cm–1); legend LPC1 (95.78%) and
LPC2 (2.87%).]
Figure 90 The loadings and scores versus samples for the calibration model of M5eRDF SERS
with the first component in blue and the second in green.
4.7 Model Evaluation
The following procedure was used in order to perform the test set evaluation of the
best M5eRDF and M5Ye models.
For the yeastolate models, the M5eRDF sample set was used to attempt to
predict its yeastolate concentration.
For the eRDF models, the M5Ye sample set was used to attempt to predict its
eRDF concentration.
The advantage of doing this was that test sets with a significant in-built variability
were being used while also removing the need to build a new test set. These samples
were used for the prediction of the stable analyte concentration and also to see how
spectral fluctuations may impact the prediction of analyte concentrations.
Table 29 Prediction results based on the M5Ye prediction of yeastolate concentration in M5eRDF
and M5eRDF prediction of eRDF concentration in M5Ye.
Sample ID    Expected Ye   SERS predicted   Sample ID   Expected eRDF   SERS predicted   Raman predicted
             conc. (g/L)   Ye conc.                     conc. (g/L)     eRDF             eRDF
M5eRDFS01    1             0.98             M5YeS01     3.4             3.98             3.21
M5eRDFS02    1             1.60             M5YeS02     3.4             3.27             3.06
M5eRDFS03    1             1.81             M5YeS03     3.4             3.26             3.03
M5eRDFS04    1             1.22             M5YeS04     3.4             3.40             3.61
M5eRDFS05    1             1.00             M5YeS05     3.4             3.34             3.23
M5eRDFS06    1             0.95             M5YeS06     3.4             5.60             3.10
M5eRDFS07    1             0.78             M5YeS07     3.4             3.28             2.91
M5eRDFS08    1             0.59             M5YeS08     3.4             3.38             3.80
M5eRDFS09    1             0.58             M5YeS09     3.4             3.59             3.24
M5eRDFS10    1             0.52             M5YeS10     3.4             3.51             3.79
As the REP for the SERS and conventional Raman M5eRDF models was the same,
both models were tested for the prediction of eRDF. Yeastolate prediction used the
best M5Ye SERS model and the results showed that prediction was possible (Table
29). The closer the validation samples were to the model samples, the better the
prediction was (as seen with M5YeS04 and M5YeS05 for eRDF and samples
M5eRDFS05 and M5eRDFS06 for yeastolate). In the design of the experiment, the
concentrations of the samples overlapped in the mid-point range of the samples; see
Table 6 and Table 7.
The M5Ye SERS model predicted only three samples accurately. For the
yeastolate prediction, the influence of the eRDF concentration in the M5eRDF
samples was evident: the prediction accuracy decreased with increasing
eRDF concentration. The prediction of the eRDF concentration was not affected by
the varying concentration of yeastolate within the M5Ye samples. Moreover, the
SERS model was more accurate than the Raman model. Seven of the ten validation
samples were within 10% of the target concentration while only four of the Raman
predictions were within this limit.
When comparing the SERS and conventional Raman models for eRDF and yeastolate
the following was observed:
For eRDF, the best models gave equivalent prediction errors of ~16%. The
SERS model was better than the Raman model, however, because it did not
show any over-fitting,44 which can lead to inaccurate predictions.
For yeastolate, the SERS model was much improved with a percentage error
of ~12% versus ~38% for Raman. The low concentration of M5Ye sample set
with a strong water signal hampered the Raman model. The SERS method
gave the signal enhancement needed for M5Ye samples to compensate for the
strong water signal which led to a good r2 correlation of 0.979.
44 The Raman model has a high SECV/SEC ratio of 3.5 compared to 1.22 for the SERS model.
SERS showed some promise for quantifying the complex media components as a
whole, but the error levels were still too high. A point to note is that the
concentration range used here was greater than the ±10% variation typically expected in
industrial use. The predictions showed that the best results were closest to the mid-point
of the concentration range for the media samples; thus a reduction of the
concentration span should improve the quantification. In the prediction of yeastolate,
the method only worked for the low to mid-range M5eRDF samples, as another
SERS-active molecule contained in the eRDF component affected the result. Therefore,
with the current setup it would not be feasible to quantify both eRDF and yeastolate
simultaneously, owing to spectral overlap, too few samples and an experimental
design that was not fully optimised. It may be more feasible once a better
sampling and model setup are implemented. This could include: a calibration sample
set of more than 60 samples, a reduced range of ±20% of the concentration specification
and a greater number (i.e. >3) of replicate measurements per calibration sample.
These steps should result in more reproducible data collection and yield a more
reliable calibration model on which to base accurate predictions.
The overall goal of this work is the development of a robust quantitative method for
complex ingredient analysis, and so another approach was investigated. As both
eRDF and yeastolate contain fluorophores that will produce a distinctive fluorescence
spectrum, fluorescence may be capable of quantifying the complex media components
as a whole.
5 Fluorescence Spectroscopy Analysis of Complex Media Components
45 Formulation compositions provided by the manufacturers, see 8.2.1 and 8.2.2.
5.1 The EEM/TSFS Analytical Procedure
Multi-dimensional fluorescence data provided information about chemical
composition because both the peak intensity and shape of the signal were sensitive to
individual and global concentration changes in the analytes present [191, 206, 211,
321]. In this work, EEM/TSFS measurements were used to see if it was feasible to
quantify the yeastolate and eRDF concentration in the model media. The NBL group
has already demonstrated that EEM can be used for the quantification of individual
components [1], media variance and identification applications [4], and media
degradation [2].
Fluorescence data is information rich and can be analysed in multiple ways to extract
both qualitative and quantitative results. The outline of the fluorescence workflow
was:
Spectral Overview
o Identify Peaks in EEM/TSFS data
PARAFAC/MCR Analysis
o Identification of the fluorophores
o Profile changes for the fluorophores in relation to concentration
Variance analysis and Outlier detection
o Investigate what causes changes in the spectra by PCA
o Identify abnormal samples using ROBPCA
Quantification – UPLS modelling of the media components (eRDF and
yeastolate) in the M5eRDF and M5Ye sample sets.
5.2 Spectral Overview of Media Samples (M5eRDF and M5Ye)
Numerous peaks were seen in the EEM spectra obtained for the M5eRDF and M5Ye
samples (Figure 91 and Figure 92). The samples from both sample sets had similar
peaks, indicating the presence of multiple fluorophores (i.e. the fluorescent amino
acids and vitamins) in eRDF and yeastolate. Previous studies in this lab had identified
the key fluorophores in both eRDF and yeastolate. The peak locations of the
M5eRDF/M5Ye samples in this study were similar to those of the chemically
defined media samples described by Calvet et al. and yeastolate samples described by
Li et al. [1, 2, 4, 7]. In their studies, five peaks were identified at λex/λem = 275/310 nm,
260–285/355 nm, 320/390 nm, 365/520 nm and 355/445 nm. These were due to the
fluorescence of amino acids (tyrosine and tryptophan) and vitamins (pyridoxine,
riboflavin, and folic acid).
Figure 91 EEM landscape plots of (left) an M5eRDF sample (1 g/L eRDF) and (right) an M5Ye
sample (0.1 g/L Ye). The Rayleigh scatter was removed from the spectra.
[Figure 92 panels (a) to (d): contour plots of excitation wavelength (nm) versus emission
wavelength (nm).]
Figure 92 EEM contour profiles46 for (a) M5eRDFS01 (1 g/L eRDF), (b) M5eRDFS10 (6.4 g/L
eRDF), (c) M5YeS01 (0.1 g/L Ye) and (d) M5YeS10 (1.72 g/L Ye).
For the EEM landscape plots (Figure 91), the peak of maximum intensity for the
M5eRDF and M5Ye data was located at excitation/emission wavelengths (λex/λem) of
285/355 nm. The secondary peaks were located at λex/λem of 280/305 nm, 230/305 nm
and 230/360 nm. Second order bands started to appear at λex/λem of 280/595 nm and
230/595 nm. The contour plots (Figure 92) showed the changes in the signals with
increasing concentration. For both the M5eRDF and M5Ye samples, the tryptophan
signal peak at 285/355 nm dominated. In the case of the M5eRDF samples, the tyrosine
peaks at 280/305 nm and 230/305 nm were weak but observable (even after the
increase in concentration from 1 g/L to 6.4 g/L). Similarly, the low concentration
M5Ye sample displayed the same peaks indicative of tyrosine (at 280/305 nm and
230/305 nm). Tyrosine fluorescence can be difficult to observe clearly due to overlap
with the tryptophan emission band and the occurrence of radiative energy transfer
(RET) from tyrosine to tryptophan. With increasing sample concentration, the
tyrosine signal decreased as the tryptophan signal increased. In both sample sets, the
much weaker emission from pyridoxine (325/395 nm), riboflavin (455/520 nm), and
folic acid (350/445 nm) only became visible at high concentrations of eRDF and
yeastolate.
46 300 contour lines were used with 0.83 spacing starting with 0.90.
TSFS spectra provided an alternative way of measuring the total emission of complex
mixtures. When comparing TSFS with the EEM spectra, the output plot and data were
orientated differently. In TSFS, the peaks were viewed by plotting the excitation
wavelength against the delta wavelength offset (Δλ=λem−λex) with the intensity along
the z-axis. The TSFS landscape plots for both M5eRDF and M5Ye (Figure 93)
displayed the tyrosine signal at 230 nm excitation while the tryptophan signal was
visible at 285 nm for the M5eRDF sample and at 290 nm for M5Ye.
Figure 93 TSFS landscape plots for (left) M5eRDF media sample (1 g/L eRDF) and (right)
M5Ye media sample (0.1 g/L Ye).
In order to compare EEM and TSFS data more easily, the TSFS spectra were
re-plotted after being mathematically transformed into EEM spectra. The transformation
involved diagonally stacking the collected data and filling in zero for empty areas.
When comparing the contour plots of the TSFS and EEM data (Figure 94c and d), the
signal intensities for the samples were different but the peak positions were constant.
From the fluorescence profiles of the media samples, it was possible to see that these
samples had the same underlying components. The peaks were visible in both the
EEM and TSFS contour plots, at the following wavelengths: 285/355 nm (tryptophan,
1), 280/305 nm (tyrosine, 2), 325/395 nm (pyridoxine, 3), 455/520 nm (riboflavin, 4)
and weakly at 350/445 nm (folic acid, 5) [1].
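The rearrangement of TSFS data into the EEM layout can be sketched as follows. This is an illustration under stated assumptions and not the routine used in this work: the 5 nm grids, the array names and the helper `tsfs_to_eem` are hypothetical. Each TSFS point measured at excitation λex and offset Δλ is moved to the emission wavelength λem = λex + Δλ, and grid points that were never measured are left at zero.

```python
import numpy as np

def tsfs_to_eem(tsfs, ex_wl, delta_wl, em_wl):
    """Rearrange a TSFS matrix (excitation x delta) into EEM layout (excitation x emission).

    Each point is placed at emission = excitation + delta; unmeasured points stay zero.
    """
    eem = np.zeros((len(ex_wl), len(em_wl)))
    em_index = {int(w): k for k, w in enumerate(em_wl)}
    for i, ex in enumerate(ex_wl):
        for j, delta in enumerate(delta_wl):
            k = em_index.get(int(ex + delta))
            if k is not None:  # skip points that fall outside the emission grid
                eem[i, k] = tsfs[i, j]
    return eem

# Hypothetical 5 nm grids, chosen only for illustration.
ex_wl = np.arange(230, 505, 5)    # excitation wavelength, nm
delta_wl = np.arange(20, 205, 5)  # wavelength offset, nm
em_wl = np.arange(250, 705, 5)    # emission wavelength, nm

# A single synthetic peak at 285 nm excitation with a 70 nm offset.
tsfs = np.zeros((len(ex_wl), len(delta_wl)))
i_ex = int(np.where(ex_wl == 285)[0][0])
j_delta = int(np.where(delta_wl == 70)[0][0])
tsfs[i_ex, j_delta] = 1.0

eem = tsfs_to_eem(tsfs, ex_wl, delta_wl, em_wl)
k_em = int(np.where(em_wl == 355)[0][0])
```

In this synthetic case the peak lands at 285/355 nm in the EEM layout, which is the position reported above for tryptophan.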
[Figure 94 panels: (a) TSFS landscape, excitation wavelength (nm) versus delta wavelength (nm);
(b) TSFS contour, delta wavelength (nm) versus excitation wavelength (nm); (c) and (d)
contours of excitation wavelength (nm) versus emission wavelength (nm), with peaks
labelled (1) to (5).]
Figure 94 (a) TSFS landscape for M5YeS05 media sample (0.82 g/L) and (b) TSFS contour
profile, (c) rearranged TSFS profile into EEM format and (d) EEM contour profile for M5YeS05
media sample.46
[Figure 95 panels: (a) TSFS landscape, intensity versus excitation wavelength (nm) and delta
wavelength (nm); (b) TSFS contour, delta wavelength (nm) versus excitation wavelength (nm);
(c) and (d) contours of excitation wavelength (nm) versus emission wavelength (nm).]
Figure 95 (a) TSFS landscape plot for M5eRDFS05 media sample (3.4 g/L), (b) TSFS contour
profile, (c) rearranged TSFS contour profile into EEM format and (d) EEM contour profile for
M5eRDFS05 media sample.46
5.3 Assessing Fluorophore Contributions from Media Fluorescence
In the model media, the fluorophores which contributed most significantly to the
EEM and TSFS profiles were tentatively identified by visual inspection. These
findings were then corroborated with literature references [1, 2, 4, 7, 8]. While this
was acceptable as an initial inspection, a more precise identification of the
fluorophores was needed. A more rigorous approach was to use a mathematical,
factor-based chemometric method such as PARAFAC or MCR to identify
fluorophores and to analyse emission changes [250, 322-324].
PARAFAC decomposed the fluorescence data into the individual excitation and
emission profiles for the fluorophores in the sample. The PARAFAC model gave
loadings to help determine the fluorophores present and additionally generated a score
associated with each component (fluorophore). Changes in each component's
contribution to the EEM spectrum were thus quantified as sample composition varied.
PARAFAC scores may not correlate with fluorophore concentration because of
non-linearities caused by IFE, RET, etc. TSFS data was not suitable for PARAFAC
analysis as it was not tri-linear [246]. PARAFAC was optimised to work with
tri-linear data, which is characterised by each component displaying the
same pattern (profile) for the different samples in both excitation and emission modes.
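The tri-linearity requirement can be made concrete with a small synthetic example. The sketch below is illustrative only: the Gaussian profiles, wavelength grids and scores are invented and are not taken from the media data. It builds an EEM stack in which every sample is a weighted sum of the same excitation and emission profiles, i.e. x_ijk = Σ_f a_if b_jf c_kf, which is the structure that PARAFAC assumes.

```python
import numpy as np

def gaussian(axis, centre, width):
    """Gaussian band used as a stand-in for an excitation or emission profile."""
    return np.exp(-0.5 * ((axis - centre) / width) ** 2)

ex = np.arange(230, 505, 5.0)  # excitation wavelengths, nm (hypothetical grid)
em = np.arange(250, 605, 5.0)  # emission wavelengths, nm (hypothetical grid)

# Two hypothetical fluorophores with fixed excitation (B) and emission (C) profiles.
B = np.column_stack([gaussian(ex, 285, 15), gaussian(ex, 275, 12)])
C = np.column_stack([gaussian(em, 355, 25), gaussian(em, 305, 15)])
# Scores (A): one row per sample, one column per fluorophore.
A = np.array([[1.0, 0.5], [2.0, 0.4], [3.0, 0.3]])

# Tri-linear model: samples x excitation x emission.
X = np.einsum("if,jf,kf->ijk", A, B, C)

# Each sample is the same pair of outer products, scaled only by its scores.
slab0 = A[0, 0] * np.outer(B[:, 0], C[:, 0]) + A[0, 1] * np.outer(B[:, 1], C[:, 1])
trilinear_ok = np.allclose(X[0], slab0)

# The unfolded stack has rank equal to the number of components.
rank = np.linalg.matrix_rank(X.reshape(len(A), -1))
```

When effects such as IFE or energy transfer change the shape of a profile from sample to sample, the data can no longer be written in this form, and the resolved PARAFAC profiles need not correspond to individual fluorophores.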
Within the M5eRDF/M5Ye samples, the complex sample matrix affected the
fluorescence profile, leading to non-linear data. This was one of the challenges with
these complex media where the emission and excitation spectra of many fluorophores
overlapped. In the PARAFAC results, the unique profiles obtained were not the true
profiles of the components, as the data was not tri-linear47. Since the PARAFAC
results were found to be lacking, another factor analysis method (MCR) was utilised
to help determine and better understand the behaviour of the underlying components.
MCR worked better with non-trilinear data for the evaluation of the fluorescent
components by bilinear decomposition [325]. MCR was applied to both EEM and
TSFS data to solve for the underlying components. For the M5eRDF and M5Ye
samples, the resolution of the components improved with the use of MCR (Table 31).
47 For the data to be tri-linear, the same sample profile must hold for different samples but can be
scaled differently as a result of changing concentration.
Table 31 The number of fluorophores/components determined by PARAFAC and MCR for
M5eRDF and M5Ye
Dataset        Model Type   No. of Components
M5Ye EEM       PARAFAC      3
M5eRDF EEM     PARAFAC      2
M5Ye EEM       MCR          5
M5Ye TSFS      MCR          6
M5eRDF EEM     MCR          5
M5eRDF TSFS    MCR          6
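The bilinear decomposition behind MCR, D ≈ C Sᵀ, can be sketched with alternating non-negative least squares. This is a minimal illustration on synthetic two-component data and not the MCR implementation used in this work. The function names, the Gaussian emission bands and the choice of the first and last sample spectra as initial estimates are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import nnls

def mcr_als(D, S_init, n_iter=50):
    """Alternating non-negative least squares for D ~ C @ S.T.

    D: (n_samples x n_channels) data; S_init: (n_channels x n_components) initial profiles.
    Returns non-negative scores C and profiles S.
    """
    S = np.array(S_init, dtype=float)
    for _ in range(n_iter):
        # Scores with profiles fixed, solved one sample (row of D) at a time.
        C = np.array([nnls(S, d)[0] for d in D])
        # Profiles with scores fixed, solved one channel (column of D) at a time.
        S = np.array([nnls(C, d)[0] for d in D.T])
    return C, S

def lack_of_fit(D, C, S):
    """Percentage lack of fit between the data and the bilinear reconstruction."""
    return 100.0 * np.linalg.norm(D - C @ S.T) / np.linalg.norm(D)

# Synthetic data: two hypothetical emission bands with opposite concentration trends.
em = np.arange(300, 605, 5.0)
S_true = np.column_stack([
    np.exp(-0.5 * ((em - 355) / 25) ** 2),
    np.exp(-0.5 * ((em - 445) / 30) ** 2),
])
C_true = np.column_stack([np.linspace(1.0, 3.0, 5), np.linspace(1.0, 0.2, 5)])
D = C_true @ S_true.T

# Initial estimates: the first and last sample spectra (an assumption for this sketch).
C, S = mcr_als(D, S_init=D[[0, -1]].T)
lof = lack_of_fit(D, C, S)
```

The recovered profiles are non-negative and reproduce the data, but non-negativity alone does not guarantee that they are the pure-component spectra; practical MCR relies on additional constraints and better initial estimates to reduce this rotational ambiguity.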
[Figure 96 panels: PARAFAC loadings versus excitation wavelength (nm) (left) and emission
wavelength (nm) (right); legend L1R1, L2R1, L1R2, L2R2, L1R3, L2R3.]
Figure 96 PARAFAC loadings excitation (left) and emission (right) for M5eRDF for the replicate
EEM data collections.
The M5eRDF EEM data were fitted with a two-component PARAFAC model (Figure 96);
however, the components were not well resolved. In the PARAFAC emission
loadings, the first loading was clearly tryptophan with a contribution from tyrosine
(visible as a shoulder at ~310 nm). The second emission loading was a
composite emission from multiple fluorophores (folic acid, vitamin B6 and its
derivatives, and riboflavin) [326-328]. The concentration change between M5eRDFS01 and
M5eRDFS10 was less visible in the contour plots (Figure 99) as these samples were
more concentrated than the low concentration M5Ye samples. They were
therefore subject to more IFEs. The PARAFAC score for component one
increased with increasing eRDF concentration, while the second score
showed a stable signal with minor changes for each sample.
[Figure 97 panels: PARAFAC scores versus samples (1 to 10). Left legend: Component1 (72.52%),
Component2 (27.47%). Right legend: Component1 (53.00%), Component2 (39.41%),
Component3 (7.58%).]
Figure 97 PARAFAC scores results for M5eRDF (left) and M5Ye (right).
[Figure 98 panels: PARAFAC loadings versus excitation wavelength (nm) (left) and emission
wavelength (nm) (right); legend L1R1, L2R1, L3R1, L1R2, L2R2, L3R2, L1R3, L2R3, L3R3.]
Figure 98 PARAFAC loadings excitation (left) and emission (right) of M5Ye for the replicate
EEM data collections.
PARAFAC of the M5Ye EEM data revealed three components (Figure 98). The first
component was clearly tryptophan. The second component featured two unresolved
bands from tyrosine with a shoulder peak indicative of tryptophan. The third
component represented an amalgamated peak for multiple fluorophores (pyridoxine,
folic acid and a shoulder for riboflavin at 520 nm). From Figure 100, the peak
intensity differed between the M5YeS01 and M5YeS10 samples, as the
fluorophore signals evolved with the changing concentration. This was in agreement with
the PARAFAC scores, where components one and three increased with
increasing yeastolate concentration. In contrast, component two, which related to
tyrosine, decreased as the tryptophan signal became more dominant at the higher
yeastolate concentrations. For M5Ye the third component represented a merged signal
of pyridoxine, folic acid and riboflavin; this signal only became visible at higher
concentration, which is why the corresponding PARAFAC scores increased across
the samples. This PARAFAC result indicated that there was more variance in the
M5Ye samples than in the M5eRDF samples. The changes in concentration for the
M5Ye samples were more significant, as larger changes in the profile and the intensity
were seen. While the M5eRDF sample set had a higher overall concentration, it was
also more susceptible to IFE/ET and quenching, which resulted in less dynamic
changes in this complex medium. The PARAFAC scores, however, showed that there
was some correlation with the change in fluorophores as the eRDF concentration
increased.
[Figure 99 panels: contour plots of excitation wavelength (nm) versus emission wavelength (nm).]
Figure 99 Comparison of EEM contour plots for the low concentration M5eRDFS01 (1 g/L) and
the high concentration M5eRDFS10 (6.4 g/L) samples; the low to high concentration is based on
the added eRDF.46
[Figure 100 panels: contour plots of excitation wavelength (nm) versus emission wavelength (nm).]
Figure 100 Comparison of EEM contour plots for the low concentration M5YeS01 (0.1 g/L) and
the high concentration M5YeS10 (1.72 g/L) samples; the low to high concentration is based on
the added yeastolate.46
The PARAFAC results obtained were different from previous studies on cell culture
media; in the study by Calvet et al., it was feasible to resolve more than two or three
components [1, 2]. The main difference came from the co-linearity in the variation of
several fluorophores increasing together within the yeastolate and eRDF. This
behaviour was not modelled by PARAFAC because it was not able to determine if the
fluorophores were different components, and also because PARAFAC assumes
constant profiles in all dimensions48, which was not the case. The complex M5eRDF
and M5Ye samples gave rise to non-linearity in the EEM data. This non-linearity
could be resolved by significantly diluting the samples so that the interactions (energy
transfer/quenching/IFE) between components were minimised. This was not,
however, an ideal solution for media analysis where minimal sample handling was
desired.
48 For successful PARAFAC, the profile response should be the same.
Since PARAFAC did not effectively resolve the individual fluorophores, another
factor-based analysis method (MCR) was used to better elucidate
the fluorophores in the M5eRDF and M5Ye data.
[Figure 101 panels: scores versus emission wavelength (nm). Left legend: S1 66.6%, S2 16.6%,
S3 1.07%, S4 10.7%, S5 4.9%. Right legend: S1 68.7%, S2 14.4%, S3 0.99%, S4 10.9%,
S5 4.81%.]
Figure 101 Resolved emission profiles of the M5eRDF (left) and M5Ye (right) for EEM MCR
models.
[Figure 102 panels: scores versus delta wavelength (nm).]
Figure 102 Resolved Delta Profiles for M5eRDF (left) and M5Ye (right) from the TSFS data.
The EEM and TSFS spectra were collected for five pure fluorophores (tyrosine,
tryptophan, riboflavin, pyridoxine and folic acid) that were listed in the eRDF
formulation (Table 45). The excitation, emission and delta profiles were then resolved
by MCR for comparison to unknown fluorophore profiles recovered from the
M5eRDF and M5Ye samples. The extracted MCR profiles for the media samples
were a reflection of the relative fluorescence emission between components and their
scores indicated their changes in the media environment.
When the recovered emission profiles for the EEM data were compared with the pure
spectra of tryptophan, tyrosine, pyridoxine and riboflavin, very close agreement was
achieved (Figure 103). There were shifts in the maximum band position between
the pure and recovered profiles, as the extracted profiles were affected by the sample
complexity. The emission profiles recovered for the same fluorophore in the different
media environments (M5eRDF/M5Ye) were in closer agreement with each other than
with the pure fluorophore spectra. These spectral shifts were the result of energy
transfer/quenching/IFE that occurred within the media samples. The tyrosine emission
was red-shifted by ~10 nm, which was attributed to energy transfer because it
overlapped with the absorption of tryptophan.
There was a large difference between the final recovered component and pure folic
acid spectrum; this indicated that component five was not folic acid. It was difficult to
determine the number of fluorophores present above the 375 nm emission region
because of the lower signal intensity. The 450 nm peak could be a secondary
excitation band for riboflavin [325]. In a proposed assignment of the unknown
component five, it was noted that the fluorescence behaviour was similar to the
biogenic fluorophores NADH and NADPH. NADPH fluorescence at 360/460 nm was
seen (Figure 26) in the analysis of yeast samples and, because yeastolate is a digest of
yeast, NADPH may be a component of yeastolate [211]. However without
comprehensive compositional information about the yeastolate, it was unknown if
NADPH was definitely present in yeastolate. The compositional information for
yeastolate (listed in Table 46) was limited to mainly the amino acid and mineral
content.
[Figure 103 plot: intensity (AU) for the resolved emission profiles; legend Trp, Tyr, Py, RB, FA.]
Figure 103 Emission spectra resolved by MCR from the EEM data for the pure components
(solid line), M5Ye (dotted) and M5eRDF (dashed). The spectra were normalised to area equal to
one.
For TSFS data, both the excitation and delta profiles (Figure 104 and Figure 105)
were considered. Six components were recovered for both M5eRDF and M5Ye, and
when compared to the pure component profiles, they did not align well for the
suspected components. From these plots, the TSFS components could not be
assigned unambiguously to the pure component spectra. For example in both
the M5eRDF and M5Ye data, the excitation profiles showed that all bands were
excited in the 280 nm to 305 nm range (Table 32). This covered the tyrosine and
tryptophan excitation range. However, pyridoxine, riboflavin and folic acid were not
clearly indicated by the excitation profiles, showing that they were minor contributors
to the overall sample fluorescence. The dynamic environment allowed molecules to
undergo interactions with other media components resulting in various micro-states.
TSFS proved to be too sensitive to these changes. Therefore the elucidation of
components from TSFS data was not as clear as from the EEM data. It was easier to
interpret the emission profiles for the EEM data compared to the delta profiles for the
TSFS data. This may be caused by the two methods displaying the information in
different ways, given that they analysed the same samples. From
Figure 104 and Figure 105, the excitation profiles for the fluorophores were very
close. One of the reasons for this was that only a 5 nm step was used between each
excitation and delta profile. The offset measurement approach did not allow for clear
resolution of the peaks but the EEM method did. Therefore, for complicated media
samples the EEM MCR approach offered the better method for identifying the
specific components.
[Figure 104 panels: intensity versus delta wavelength (nm) (top) and excitation wavelength (nm)
(bottom); legend C1, C2, C3, C4, C5, FA, Py, Rb, Tyr, Trp.]
Figure 104 TSFS delta and excitation profiles for M5eRDF compared to the pure component
profiles (coloured traces) resolved by MCR. The spectra were normalised.
[Figure 105 panels: intensity versus delta wavelength (nm) (top) and excitation wavelength (nm)
(bottom); legend C1, C2, C3, C4, C5, FA, Py, Rb, Tyr, Trp.]
Figure 105 TSFS delta and excitation profiles for M5Ye compared to the pure component
profiles (coloured traces) resolved by MCR. The spectra were normalised.
Table 32 Excitation, Emission and % Fit values for MCR TSFS models using 1–6 factors. The
emission wavelength (λem) was obtained by adding the delta (Δλ) to the excitation wavelength
(λex).
Pure standard               M5Ye                                M5eRDF
      λex (nm)  λem (nm)         λex (nm)  λem (nm)  % Fit          λex (nm)  λem (nm)  % Fit
Trp   270       360         C1   285       360       34.6      C1   285       360       41.7
Tyr   270       320         C2   290       345       22.4      C2   290       345       18.8
Py    355       400         C3   280       310       2.7       C3   280       310       5.3
Rb    400       530         C4   305       435       24.1      C4   305       430       18.6
FA    440       490         C5   285       385       14.0      C5   285       300       12.7
                            C6   285       300       1.7       C6   285       385       2.4
The EEM scores plot (Figure 106) showed an increase of two components (#2 and #3)
from M5eRDF sample 1 to sample 10 while the other components showed little
change. The TSFS scores plot showed that all components (bar component four)
displayed an increase from M5eRDF sample 1 to sample 10. Component four gave a
stable signal across the scores plot.
[Figure 106 panels: MCR scores versus samples (1 to 10). Left legend: C1 to C5. Right legend:
C1 to C6.]
Figure 106 M5eRDF MCR scores for EEM model (left) and TSFS model (right).
The scores (Figure 107) for both the EEM and TSFS M5Ye data showed a similar
trend. The scores of the components emitting at shorter wavelengths decreased, those
of the intermediate components were stable, and those of the components emitting at
longer wavelengths increased. This pattern was clearer in the TSFS scores. The short wavelength
components were subject to more IFE compared to long wavelength components (as a
result of the higher absorbance that occurred in these regions). The M5Ye scores also
revealed that M5YeS01 deviated from the other score points. This observation was
confirmed by ROBPCA results where the M5YeS01 sample was seen as different
from the other samples. The reason for the altered profile was the low sample
concentration, which gave rise to a more dilute spectral profile of the M5YeS01
sample compared to the other samples.
[Figure 107 panels: MCR scores versus samples (1 to 10). Left legend: C1 to C5. Right legend:
C1 to C6.]
Figure 107 M5Ye MCR scores plots for EEM model (left) and TSFS model (right).
The MCR scores for all components varied as eRDF or yeastolate concentration
increased. It was therefore appropriate that the full or reduced spectral area be used
for the quantitative modelling. If one wanted to quantify individual fluorophores, one
could of course look at a more restricted emission range [1].
[Figure 108 panels: (top left) scores on PC 2 (2.56%) versus scores on PC 1 (97.15%), legend
M5eRDF and M5Yeastolate; (top right) loadings on PC 1 (97.15%), (bottom left) loadings on
PC 2 (2.56%) and (bottom right) loadings on PC 3 (0.17%), each versus emission
wavelength (nm).]
Figure 108 Graphic results for the PCA analysis of the M5eRDF/M5Ye comparison; the arrows
show the samples going from low to high concentration. (Top Left) PC1 vs PC2 Scores plot, (Top
Right) Loadings for PC1 (97.15%), (Bottom Left) Loadings for PC2 (2.56%), (Bottom Right)
Loadings for PC3 (0.17%).
49 The same trends occurred with the TSFS data and these results will not be shown.
It was not possible to fully segregate the M5eRDF and M5Ye samples in the scores
plots as they partially overlapped due to the common fluorophores. From the analysis
of the loadings plots, it was proposed that PC1 was largely tryptophan signal; PC2
was largely tyrosine signal, while PC3 was an unresolved tryptophan and pyridoxine
peak. From Figure 108(Top Left), it was clear that changes in the M5Ye were greater
along both PC1 and PC2 than for M5eRDF. This was attributed to the decrease in the
tyrosine signal and increase in the tryptophan signal with yeastolate concentration
changes (Figure 100 and Figure 97). The M5eRDF PCA result was less susceptible to
variation as a result of the higher concentration of these samples. This led to less
dynamic change being observed in the fluorescence signal with increasing
concentration as seen with the MCR scores (Figure 106).
In the PCA analysis of the individual datasets, only two principal components were
used to describe the variance in each sample set. For the individual M5Ye PCA model
the first component described 95.53% of the explained variance while the second
component described 4.23%. The first component showed a peak with emission at
355 nm i.e. tryptophan, which was the dominant feature in both the M5eRDF and
M5Ye emission. The second component of the individual M5Ye PCA model matched
the second component of the combined model i.e. tyrosine. Besides the peak for
tyrosine in PC2 there was also a negative portion which was representative of the
interaction with the tryptophan as the concentration changed [330]. With the
increasing tryptophan and tyrosine concentration, tryptophan emission became
dominant and radiative energy transfer potentially occurred between tyrosine and
tryptophan fluorophores leading to a reduced signal from the tyrosine peak [331]. For
the individual M5eRDF PCA model, the first component (emission peak at 355 nm)
described 99.68% of the explained variance, while the second component (0.3%)
matched the third component of the combined PCA model. Considering the strength
of the tryptophan signal in the M5eRDF, the observed variation may be attributed to
IFE and energy transfer interactions that occur with tryptophan within M5eRDF
samples.
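The way a PCA model apportions explained variance between components can be illustrated with a short sketch. It is a generic illustration on synthetic data and not the software or settings used in this work: the EEM stack is unfolded so that each sample becomes one row, the data are mean-centred, and the components are obtained by singular value decomposition. The array sizes, concentrations and function names are hypothetical.

```python
import numpy as np

def unfold(eem_stack):
    """Unfold (n_samples, n_ex, n_em) into a (n_samples, n_ex * n_em) matrix."""
    return eem_stack.reshape(eem_stack.shape[0], -1)

def pca_explained_variance(X, n_components=2):
    """Mean-centre X, then return scores, loadings and % explained variance via SVD."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = 100.0 * s**2 / np.sum(s**2)
    scores = U[:, :n_components] * s[:n_components]
    return scores, Vt[:n_components], explained[:n_components]

# Synthetic stack: one landscape whose intensity scales with concentration, plus noise.
rng = np.random.default_rng(0)
conc = np.linspace(1.0, 6.4, 10)   # hypothetical concentrations, g/L
profile = rng.random((12, 15))     # hypothetical landscape shape (excitation x emission)
stack = conc[:, None, None] * profile + 1e-3 * rng.standard_normal((10, 12, 15))

scores, loadings, explained = pca_explained_variance(unfold(stack))
```

In this idealised case almost all of the variance falls on the first component and its scores track concentration. Departures from that pattern in real data point to additional sources of variance such as the IFE and energy transfer effects discussed above.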
This variance - created by the differing fluorophores, IFE, energy transfer
interactions, and quenching occurring within each sample - will be used to correlate
the eRDF and yeastolate concentrations to the gross fluorescence signal.
[Figure 109 plots. Left panel: EEM landscape, Intensity against Excitation Wavelength (nm) and Emission Wavelength (nm). Right panel: unfolded spectra, Intensity against Data Points.]
Figure 109 EEM landscape spectra of a single M5eRDF solution (5.8g/L) and unfolded spectra of
M5eRDF.
[Figure 110 plots. Two ROBPCA scores plots with the M5Ye and M5eRDF sample sets labelled. Horizontal axes: Scores 1 (69.22%) (left) and Scores 1 (73.68%) (right). Vertical axis labels as extracted: Scores 2 (22.10%) and Scores1 (22.60%).]
Figure 110 ROBPCA Scores plot for PC1 versus PC2 of M5eRDF/M5YE comparison, EEM
Scores (left) and TSFS scores (right). The arrows indicate the direction of the changing
concentration from low to high.
The ROBPCA result showed that the sample concentration influenced sample
fluorescence as the sample scores followed the increasing concentration (Figure 110).
As the concentration changed in the media, so did the photo-physical behaviour as
IFE/quenching/energy transfer increased. This caused differences in the spectra so the
samples at either end of the concentration range tended to be very different. The
ROBPCA subspaces for EEM and TSFS were orientated differently due to the
difference in the way the data was presented. When comparing the ROBPCA scores
(Figure 110) to the PCA scores (Figure 108, top left), the same pattern of crossing
over was observed as the samples go from low to high concentration.
Table 33 Outlying observations from the EEM and TSFS datasets for M5eRDF, with the outlying
samples being identified by ROBPCA.
Dataset          PCs  Outliers identified  Outlier type
M5eRDF EEM R1    3    1, 10                Good Leverage (1), Bad Leverage (10)
M5eRDF EEM R2    3    1, 9                 Bad Leverage (1), Good Leverage (9)
M5eRDF EEM R3    2    1, 2, 9              Orthogonal (1), Bad Leverage (2), Good Leverage (9)
M5eRDF EEM 30    3    8, 10, 16, 21        Good Leverage (8), Bad Leverage (10, 21), Orthogonal (16)
M5eRDF TSFS R1   3    1, 8                 Good Leverage (1), Bad Leverage (8)
M5eRDF TSFS R2   3    8, 10                Bad Leverage (10), Good Leverage (8)
M5eRDF TSFS R3   2    1, 9                 Orthogonal (1), Bad Leverage (9)
M5eRDF TSFS 30   4    8, 9, 16, 21         Good Leverage (8, 16), Bad Leverage (9, 21)
Samples with either low or high additive concentration were identified as outliers.
The samples at the edges of the data distribution were still valid samples, but they
were readily flagged as outliers because of the small sample sets. It was, however,
unwise to disregard these since the variance was a result of the variability associated
with the limited linear range. The majority of outliers in Table 33 and Table 34 were
either good or bad leverage points, indicating that they only deviated along a single
aspect of the spectra. In addition, the orthogonal outliers occurred when there was a
significant change from the mean data profile. A tentative assumption was made here
that good or bad leverage points were representative of a spike in fluorophore
intensity within a sample, while the orthogonal outliers were more frequently seen
with the low concentration samples. This was due to a low tryptophan signal within
the low concentration samples which altered them from the mean data. The PCA
results showed that the primary component (mean signal) in both the M5eRDF and
M5Ye was an emission peak at 355 nm which was indicative of tryptophan.
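The outlier types listed in Table 33 follow the usual outlier-map convention for robust PCA, in which each sample is described by a score distance (its distance within the PCA subspace) and an orthogonal distance (its distance to that subspace). The labelling step can be sketched as below. This is only an illustration of the convention: the function name, the distances and the cut-off values are hypothetical and were not taken from the ROBPCA models in this work.

```python
def classify_observation(score_dist, orth_dist, sd_cutoff, od_cutoff):
    """Label one sample from its score distance (SD) and orthogonal distance (OD)."""
    high_sd = score_dist > sd_cutoff   # far from the centre inside the PCA subspace
    high_od = orth_dist > od_cutoff    # far away from the PCA subspace itself
    if high_sd and high_od:
        return "bad leverage"
    if high_sd:
        return "good leverage"
    if high_od:
        return "orthogonal"
    return "regular"


# Hypothetical (SD, OD) pairs for four samples; illustrative values only.
distances = {
    "S01": (4.2, 0.3),
    "S05": (1.1, 0.4),
    "S09": (3.9, 2.8),
    "S10": (0.9, 3.1),
}
labels = {
    name: classify_observation(sd, od, sd_cutoff=3.0, od_cutoff=2.0)
    for name, (sd, od) in distances.items()
}
```

Under this convention a good leverage point lies far from the data centre but is still well described by the PCA subspace, which is one reason why end-range samples need not be discarded automatically.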
Table 34 Outlying observations from the EEM and TSFS datasets for M5Ye, with the outlying
samples being identified by ROBPCA.
Dataset        PCs  Outliers identified     Outlier type
M5YE EEM R1    3    2                       Orthogonal
M5YE EEM R2    3    1                       Orthogonal
M5YE EEM R3    2    1, 2                    Orthogonal
M5YE EEM 30    4    10, 11, 17, 21, 20, 30  Good Leverage (11, 21), Bad Leverage (10, 17, 20, 30)
M5YE TSFS R1   2    1, 2                    Orthogonal
M5YE TSFS R2   3    1, 9                    Bad Leverage (1), Good Leverage (9)
M5YE TSFS R3   2    1, 2                    Orthogonal
M5YE TSFS 30   4    10, 11, 14, 17, 21, 27  Good Leverage (10), Bad Leverage (14, 27), Orthogonal (11, 17, 21)
For the M5eRDF and M5Ye datasets, the outliers identified using ROBPCA were not
seen in the SERS and Raman data because only PCA was carried out for these
methods. The PCA results gave no outliers for the Raman or SERS data. In the SERS
modelling of the M5Ye, the PLS scores revealed that M5YeS06 deviated along the
second and third components of the model. This type of behaviour would have been
picked up by a ROBPCA model. This is because the ROBPCA results for the
fluorescence data highlighted the end-range samples as deviating from the mean,
whereas the PCA results gave no outlier. The concentration range covered by these
sample sets was too large, causing the limits of linearity to be exceeded. As a result,
excluding end-range samples to reduce the concentration range may be necessary in order
to achieve accurate quantification. In essence, this indicated that the fluorescence
method was more sensitive to matrix effects and that smaller concentration ranges should
be used. A similar situation was observed for single component quantification modelling [1].
5.5 Quantitative Analysis of M5eRDF and M5Ye
The PCA and ROBPCA scores plots (Figure 108 and Figure 110) revealed that there
was a clear linear trend observed with increasing concentrations of yeastolate and
eRDF in the M5Ye and M5eRDF data respectively. This trend indicated that the
fluorescence data was suitable for developing linear PLS models for yeastolate and
eRDF.
The data was unfolded prior to quantitative analysis (PLS) of EEM and TSFS (Figure
109). The use of unfolded PLS was chosen because it takes into consideration the
analyte-background interactions when describing the calibration data as part of the
number of latent variables (components) selected [332].
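The unfolding step can be illustrated with a short sketch, assuming each EEM is stored as a list of emission spectra (one per excitation wavelength). The function names and the toy values are hypothetical; the sketch shows only how a set of two-way landscapes becomes a two-way matrix with one row per sample, which is the form required by PLS.

```python
def unfold_eem(eem):
    """Concatenate the emission spectra of one EEM into a single row vector."""
    return [value for emission_spectrum in eem for value in emission_spectrum]


def unfold_dataset(eems):
    """Unfold every EEM so that the result has one row per sample."""
    return [unfold_eem(eem) for eem in eems]


# Two hypothetical samples, each with 2 excitation x 3 emission points.
eems = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
X = unfold_dataset(eems)  # 2 samples x 6 variables
```

Each row of `X` then corresponds to one trace of the kind shown in the unfolded spectra of Figure 109.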
During the development of fluorescence based calibration models, the full region and
two reduced regions were selected to focus on the strong analyte signal. The contour
plots (Figure 111) for the EEM of the full and the two reduced regions showed the
varying levels of information in the different regions.
[Figure 111 plots. Three contour plots labelled (a), (b) and (c), each showing Excitation wavelength (nm) against Emission wavelength (nm).]
Figure 111 Contour plots for the Full Range (λex/λem 230–520/270–600 nm), Reduced Area A
(λex/λem 230–315/270–435 nm) and Reduced Area B (λex/λem 250–360/285–425 nm) of a single
M5eRDF solution.46
For the EEM data the selected regions were:
o Reduced region A - excitation 230–315 nm and emission 270–435 nm
o Reduced region B - excitation 250–360 nm and emission 285–425 nm.
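Selecting a reduced region amounts to keeping only the rows and columns of the EEM whose wavelengths fall inside the stated limits. A minimal sketch is given below; the 5 nm sampling interval on both axes and the function name are assumptions made for illustration, since the acquisition step size is not restated in this section.

```python
def select_region(eem, ex_wavelengths, em_wavelengths, ex_range, em_range):
    """Return the sub-matrix of an EEM inside the given excitation/emission limits."""
    ex_lo, ex_hi = ex_range
    em_lo, em_hi = em_range
    rows = [i for i, ex in enumerate(ex_wavelengths) if ex_lo <= ex <= ex_hi]
    cols = [j for j, em in enumerate(em_wavelengths) if em_lo <= em <= em_hi]
    return [[eem[i][j] for j in cols] for i in rows]


# Hypothetical axes covering the full range on an assumed 5 nm grid.
ex_axis = list(range(230, 521, 5))   # excitation 230-520 nm
em_axis = list(range(270, 601, 5))   # emission 270-600 nm
eem = [[0.0 for _ in em_axis] for _ in ex_axis]  # placeholder intensities

region_a = select_region(eem, ex_axis, em_axis, (230, 315), (270, 435))
region_b = select_region(eem, ex_axis, em_axis, (250, 360), (285, 425))
```

The reduced matrices can then be unfolded in the same way as the full-range data.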
Unfolded PLS was performed on the individual data collections for EEM and TSFS
datasets with the inclusion of all the samples (Table 137 to Table 140). Some
anomalies were noted, however, and so unfolded PLS was repeated on the averaged
data of the M5eRDF and M5Ye sample sets (Table 35 to Table 38). When unfolded
PLS modelling was performed on the individual runs for M5eRDF data, the sample
M5eRDFS09R1 deviated from the expected measurement line. As a result, this
sample was removed when generating the averaged M5eRDF sample set. The
ROBPCA results were used as a guide for erroneous samples, but with so few
samples over a large range, the ROBPCA results were hypersensitive to linear
deviations.
For the M5Ye samples, the unfolded PLS on the individual runs revealed two
anomalies. First, the sample M5YeS01R1 was an odd measurement; its spectrum was
removed from the averaged dataset (Figure 112). The second anomaly was the non-
linear behaviour of the high concentration samples. In the calibration models of the
individual runs, there was a dramatic shift between samples M5YeS07 and
M5YeS08. When the spectra were investigated the biggest change was seen with the
tryptophan peak. The intensity of the tryptophan peak was plotted (Figure 112). It
showed a linear increase in signal up to sample M5YeS06, but beyond this point the data
became non-linear. In order to improve the calibration model for the M5Ye data, the sample
set was reduced to only six samples covering a linear range of 0.1–1.0 g/L yeastolate.
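One simple way to locate the end of a linear range in a peak-intensity series is to extend a straight-line fit one sample at a time and stop when the next point departs from the extrapolated line by more than a chosen tolerance. The sketch below illustrates that idea only; the intensities are hypothetical values (not read from Figure 112), and the tolerance and function names are assumptions rather than the procedure used in this work, where the range was chosen by inspecting the plot.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for paired values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def linear_range_end(x, y, tolerance, min_points=3):
    """Return how many leading points lie on a common straight line."""
    end = min_points
    while end < len(x):
        slope, intercept = fit_line(x[:end], y[:end])
        predicted = slope * x[end] + intercept
        if abs(y[end] - predicted) > tolerance:
            break  # the next sample deviates from the extrapolated line
        end += 1
    return end


# Hypothetical peak intensities: linear for six samples, then levelling off.
sample_index = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
intensity = [230, 240, 250, 260, 270, 280, 283, 285, 286, 286]
n_linear = linear_range_end(sample_index, intensity, tolerance=3.0)
```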
[Figure 112 plots. Left panel: Predicted (g/L) against Expected (g/L); annotation R2 = 0.835, 3 Latent Variables, RMSEC = 0.20977, RMSECV = 0.23678. Right panel: Intensity against Samples (1 to 10) for M5YeR1, M5YeR2 and M5YeR3.]
Figure 112 Predicted versus expected plots of the EEM calibration model for all 30 M5Ye
samples (Left) and the intensity changes of the tryptophan peak (285/355 nm) per sample
(Right).
[Figure 113 plots. Two panels of Predicted (g/L) against Expected (g/L). Panel annotations as extracted: R2 = 0.991, 3 Latent Variables, RMSEC = 0.15924, RMSECV = 0.29901; and R2 = 0.993, 3 Latent Variables, RMSEC = 0.14858, RMSECV = 0.27636.]
Figure 113 Predicted versus expected plots for M5eRDF of the EEM calibration model (Left)
and the TSFS calibration model (Right).
Table 35 Calibration models using unfolded PLS regression for averaged M5eRDF EEM data.
Model LV R2 RMSEC (g/L) RMSECV (g/L) REP%
Full Range (λex 230–520 nm λem 270–600 nm)
M5eRDF EEM Unfolded 3 0.990 0.17 0.34 9.18
M5eRDF EEM Unfolded MSC 3 0.978 0.25 0.48 12.97
M5eRDF EEM Unfolded Norm 3 0.985 0.21 0.42 11.35
Reduced Region A (λex 230–315 nm λem 270–435 nm)
M5eRDF EEM Unfolded 3 0.988 0.19 0.34 9.18
M5eRDF EEM Unfolded MSC 4 0.998 0.08 0.41 11.08
M5eRDF EEM Unfolded Norm 4 0.998 0.08 0.32 8.64
Reduced Region B (λex 250–360 nm λem 285–425 nm)
M5eRDF EEM Unfolded 3 0.993 0.14 0.27 7.29
M5eRDF EEM Unfolded MSC 2 0.962 0.33 0.47 12.70
M5eRDF EEM Unfolded Norm 2 0.974 0.27 0.38 10.27
Table 36 Calibration models using unfolded PLS regression for averaged M5eRDF TSFS data.
Model LV R2 RMSEC (g/L) RMSECV (g/L) REP%
Full Range (λex 230–520 nm ∆𝛌 10–200 nm)
M5eRDF TSFS Unfolded 3 0.991 0.15 0.29 7.83
M5eRDF TSFS Unfolded MSC 3 0.979 0.24 0.46 12.43
M5eRDF TSFS Unfolded Norm 3 0.986 0.20 0.38 10.27
Reduced Region A (λex 230–310 nm ∆𝛌 10–190 nm)
M5eRDF TSFS Unfolded 3 0.988 0.19 0.34 9.18
M5eRDF TSFS Unfolded MSC 4 0.995 0.12 0.47 12.70
M5eRDF TSFS Unfolded Norm 4 0.997 0.08 0.35 9.45
Reduced Region B (λex 250–310 nm ∆𝛌 10–140 nm)
M5eRDF TSFS Unfolded 2 0.978 0.25 0.34 9.18
M5eRDF TSFS Unfolded MSC 2 0.961 0.34 0.47 12.70
M5eRDF TSFS Unfolded Norm 2 0.972 0.28 0.39 10.54
5.5.2 Correlation of Ye Concentration to the M5Ye Fluorescence Data
The PCA showed that the spectral variation caused by changing yeastolate
concentration was much larger than for eRDF (Figure 108). One consequence of this
was that at the higher yeastolate concentrations the spectral changes became non-
linear. The concentration range was reduced by removing the highest concentration
samples to obtain a satisfactory quantitative model. As a result, the M5Ye calibration models
were stronger than M5eRDF models and showed a good linear performance for both
EEM and TSFS data (Figure 114). The best models were built using the reduced
region B with MSC pre-processing for EEM and with no pre-processing for the TSFS.
The model performance improved when the reduced region B was used over the full
or reduced region A. Pre-processing positively adjusted the model results, as it
minimized spectral variations that were not caused by changes in the analyte
concentration. The best TSFS calibration model was reasonable with a REP of 7.27%
but was weaker than the EEM whose model gave REP of 5.45%.
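MSC pre-processing is referred to above without its formulation being restated in this section. A minimal sketch of the conventional form is given here, assuming each unfolded spectrum is regressed against the mean spectrum of the set and then corrected using the fitted offset and slope. The function name and the toy spectra are hypothetical, and this is not necessarily the exact routine used to generate the tabulated models.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_vars = len(spectra[0])
    n_spec = len(spectra)
    # Reference spectrum: the mean over all spectra at each variable.
    ref = [sum(s[j] for s in spectra) / n_spec for j in range(n_vars)]
    ref_mean = sum(ref) / n_vars
    sxx = sum((r - ref_mean) ** 2 for r in ref)

    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_vars
        # Least-squares fit of s = intercept + slope * ref.
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected, ref


# Two toy spectra that differ only by an offset and a scaling factor.
base = [1.0, 4.0, 2.0, 8.0]
spectra = [
    [2.0 * b + 1.0 for b in base],
    [0.5 * b - 0.5 for b in base],
]
corrected, reference = msc(spectra)
```

Because the two toy spectra differ only in offset and scale, both collapse onto the reference spectrum after correction, which is the sense in which MSC removes variation not related to the analyte concentration.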
[Figure 114 plots. Two panels of Predicted (g/L) against Expected (g/L). Left panel annotation: R2 = 0.998, 3 Latent Variables, RMSEC = 0.014723, RMSECV = 0.034587. Right panel annotation: R2 = 0.994, 3 Latent Variables, RMSEC = 0.0246, RMSECV = 0.047416.]
Figure 114 Predicted versus expected for M5Ye with the EEM calibration model (left) and the
TSFS calibration model (right).
Table 37 Unfolded PLS calibration model for the averaged M5Ye EEM data using 6 samples.
Model LV R2 RMSEC (g/L) RMSECV (g/L) REP%
Full Range (λex 230–520 nm λem 270–600 nm)
M5YeEEM Unfolded 2 0.962 0.05 0.10 18.18
M5YeEEM Unfolded MSC 2 0.986 0.03 0.06 10.90
M5YeEEM Unfolded Norm 2 0.990 0.03 0.05 9.09
Reduced Area A (λex 230–315 nm λem 270–435 nm)
M5YeEEM Unfolded 2 0.955 0.06 0.11 20.00
M5YeEEM Unfolded MSC 2 0.980 0.04 0.07 12.72
M5YeEEM Unfolded Norm 2 0.987 0.03 0.06 10.90
Reduced Area B (λex 250–360 nm λem 285–425 nm)
M5YeEEM Unfolded 3 0.997 0.01 0.04 7.27
M5YeEEM Unfolded MSC 3 0.998 0.01 0.03 5.45
M5YeEEM Unfolded Norm 3 0.995 0.02 0.06 10.90
Table 38 Unfolded PLS calibration model for the averaged M5Ye TSFS data using 6 samples.
Model LV R2 RMSEC (g/L) RMSECV (g/L) REP%
Full Range (λex 230–520 nm ∆𝛌 10–200 nm)
M5YeTSFS Unfolded 2 0.962 0.05 0.10 18.18
M5YeTSFS Unfolded MSC 2 0.982 0.04 0.06 10.90
M5YeTSFS Unfolded Norm 2 0.987 0.03 0.05 9.09
Reduced Area A (λex 230–310 nm ∆𝛌 10–190 nm)
M5YeTSFS Unfolded 2 0.954 0.06 0.11 20.00
M5YeTSFS Unfolded MSC 2 0.977 0.04 0.08 14.54
M5YeTSFS Unfolded Norm 2 0.983 0.03 0.06 10.90
Reduced Area B (λex 250–310 nm ∆𝛌 10–140 nm)
M5YeTSFS Unfolded 3 0.994 0.02 0.04 7.27
M5YeTSFS Unfolded MSC 3 0.994 0.02 0.04 7.27
M5YeTSFS Unfolded Norm 3 0.991 0.02 0.06 10.90
When comparing the EEM and TSFS models, the differences in performance were
minor for the M5eRDF samples. However, the EEM model outperformed the TSFS
model for the M5Ye data (Table 37 and Table 38). Overall, the EEM method was
better in the analysis of M5Ye and M5eRDF data. This was shown by the resolution
of the fluorophores in the MCR results as well as by the good calibration models
formed with the M5eRDF/M5Ye data.
5.5.3 Model Evaluation
The prediction ability of the best fluorescence models was evaluated in the same way
as in the SERS study:
o For the M5eRDF calibration model, the M5Ye samples were used to test whether
their eRDF concentrations could be correctly predicted.
o For the M5Ye calibration model, yeastolate concentrations were predicted for
the M5eRDF data.
Table 39 Prediction results of eRDF concentration in M5Ye, from the EEM and TSFS calibration
models.
Sample ID | Expected Concentration (g/L) | Predicted Conc. from EEM Model (g/L) | Predicted Conc. from TSFS Model (g/L)
M5YeS01 3.4 8.83 9.43
M5YeS02 3.4 5.98 6.83
M5YeS03 3.4 4.15 4.97
M5YeS04 3.4 3.46 3.93
M5YeS05 3.4 3.46 3.65
M5YeS06 3.4 3.39 3.54
M5YeS07 3.4 3.29 2.79
M5YeS09 3.4 4.90 4.03
M5YeS10 3.4 4.25 3.69
The predictions of the eRDF concentration from the M5Ye sample set showed
reasonable results for the mid- to high-concentration samples (Table 39). It was seen
in the PCA scores that the overlap between the M5eRDF and M5Ye was small due to
the different media changes within each sample set. The low concentration samples
were very poorly predicted (with the prediction values for S01 and S02 approximately
double the expected 3.4 g/L). The primary reason for the relatively low model
accuracy and overestimation of the low concentration samples was the large spectral
difference in the signal for the test set compared to the M5eRDF calibration sample set.
The variance was due to the different matrix environment given the varying yeastolate
or eRDF concentration. It was obvious from the test set EEM contour plot (Figure
119) that there was a large change between the lowest to the highest prediction sample
(the main fluorescence signal changes from 275/310 nm to 285/355 nm). This
corresponded to the decrease in tyrosine signal as the tryptophan signal increased. In
the calibration EEM contour plots (Figure 120) the main fluorescence signal remained
at 285/355 nm from low to high concentration samples.
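The size of the deviations in Table 39 can be made explicit by expressing each prediction as a percentage error relative to the expected 3.4 g/L. The short sketch below does this for the EEM predictions listed in the table. The 10% threshold used to pick out the well predicted samples is an arbitrary value chosen for illustration only and is not an acceptance criterion from this work.

```python
EXPECTED_ERDF = 3.4  # g/L, the eRDF concentration in every M5Ye sample

# EEM model predictions copied from Table 39 (g/L).
eem_predictions = {
    "M5YeS01": 8.83, "M5YeS02": 5.98, "M5YeS03": 4.15,
    "M5YeS04": 3.46, "M5YeS05": 3.46, "M5YeS06": 3.39,
    "M5YeS07": 3.29, "M5YeS09": 4.90, "M5YeS10": 4.25,
}


def relative_error_percent(predicted, expected):
    """Signed prediction error as a percentage of the expected value."""
    return 100.0 * (predicted - expected) / expected


errors = {
    sample: relative_error_percent(value, EXPECTED_ERDF)
    for sample, value in eem_predictions.items()
}
# Samples predicted to within an illustrative +/-10% of the expected value.
within_10_percent = [s for s, e in errors.items() if abs(e) <= 10.0]
```

With these values only samples S04 to S07 fall inside the illustrative window, while S01 is overestimated by roughly 160%.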
Table 40 Prediction results of M5Ye EEM and TSFS calibration models for yeastolate
concentration in M5eRDF.
Sample ID | Expected Concentration (g/L) | Predicted Conc. from EEM Model (g/L) | Predicted Conc. from TSFS Model (g/L)
M5eRDFS01 1.0 1.00 0.91
M5eRDFS02 1.0 1.01 0.96
M5eRDFS03 1.0 1.04 0.98
M5eRDFS04 1.0 1.03 1.00
M5eRDFS05 1.0 0.98 0.97
M5eRDFS06 1.0 1.02 1.02
M5eRDFS07 1.0 1.06 1.06
M5eRDFS08 1.0 1.09 1.10
M5eRDFS09 1.0 1.05 1.08
M5eRDFS10 1.0 1.04 1.10
The prediction ability of the M5Ye calibration model was better than the M5eRDF
model (Table 40). The better predictability came from the similarity in the M5eRDF
and M5Ye samples which contained the same first principal component (i.e. the
tryptophan emission) that was observed from the PCA results (Figure 108). Therefore,
in predicting the yeastolate concentration of M5eRDF samples, the correlation
between the calibration and prediction was based on the tryptophan emission band.
Also, no dramatic changes were observed in the M5eRDF test set; this can be seen in both
the PARAFAC and MCR scores (Figure 97, Figure 106 and Figure 107).
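The predictions in Table 40 can be condensed into a single root mean square error of prediction (RMSEP) for each model. RMSEP values are not reported for this test set in the text, so the sketch below is only an illustrative calculation on the tabulated values; the function name is hypothetical.

```python
import math

EXPECTED_YE = 1.0  # g/L, the yeastolate concentration in every M5eRDF sample

# Predictions for M5eRDFS01 to M5eRDFS10, copied from Table 40 (g/L).
eem_pred = [1.00, 1.01, 1.04, 1.03, 0.98, 1.02, 1.06, 1.09, 1.05, 1.04]
tsfs_pred = [0.91, 0.96, 0.98, 1.00, 0.97, 1.02, 1.06, 1.10, 1.08, 1.10]


def rmsep(predicted, expected):
    """Root mean square error of prediction against one expected value."""
    squared = [(p - expected) ** 2 for p in predicted]
    return math.sqrt(sum(squared) / len(squared))


rmsep_eem = rmsep(eem_pred, EXPECTED_YE)
rmsep_tsfs = rmsep(tsfs_pred, EXPECTED_YE)
```

On these values the EEM predictions give the smaller error (about 0.04 g/L against about 0.06 g/L for TSFS).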
Quantitative models for the more complex media components, eRDF and yeastolate,
were developed. Both EEM and TSFS calibration and prediction models worked, but
the EEM method outperformed TSFS with its results and ease of interpretation. These
EEM/TSFS methods were part of the holistic approach to cell culture media analysis
and this quantification method complemented previous work developed for the
quantification of specific fluorophores [1, 2]. Calvet et al. showed that specific
analytes could be quantified in chemically defined media. The prediction of
tryptophan, tyrosine, pyridoxine, riboflavin and folic acid in eRDF media solutions
worked using NPLS analysis (samples were prepared using a standard addition
method (SAM)). These models used narrower spectral ranges from EEM data centred
on each analyte emission. The analytes were predicted with the following error levels:
tryptophan (4.5%), tyrosine (5.5%), pyridoxine (4.6%), riboflavin (2.3%), and folic
acid (8.7%) [2].
However, changing the concentration of a complex ingredient like eRDF or yeastolate
resulted in large changes in the concentration of multiple fluorophores. This led to
large matrix changes which reduced the potential linear ranges for quantification.
For the M5eRDF data, there were large fluctuations in the test set that limited the
correlation performance of the M5eRDF calibration model with respect to changes in
eRDF concentration. This could be seen in the poor prediction of the eRDF
concentration in the test set. The most accurate results were close to the mean
concentration which had a stable tryptophan signal. These results indicated that eRDF
could potentially be modelled within a specified range. For the M5Ye, fluctuations
and non-linearity made it difficult to accurately correlate the changes in yeastolate
concentration over the full concentration range, but once the range was reduced to the
linear region the correlation improved and the prediction worked well.
This study showed that it was feasible to quantify complex ingredients of cell culture
media. However, the method does require further refinements, such as:
o Setting the ingredient concentration range to a more realistic range of X g/L ±
25%, so if the M5eRDF sample set were to be redone a more appropriate
concentration range might be 3.4 g/L ± 0.85 g/L.
o Using more samples in the calibration modelling. Li et al. used a ratio of 1 test
sample to 4 or 5 calibration samples [7]. The minimum sample number should
be set to 20, as the use of 10 samples in this model was not enough. The
replicate measurement is usually set at three, but doubling that may better
minimize day-to-day variation.
o More suitable test sets, designed using risk analysis of the media, would be
better for assessing model accuracy in real terms. These test sets would also
inform the design of the calibration sample set so that there was correct
overlap between the PCA subspaces of the samples.
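The first two refinements can be combined into a simple sample-set design. The sketch below assumes 20 evenly spaced concentration levels inside the ± 25% window and holds out every fifth level as a test sample, giving the 1:4 ratio mentioned above. The even spacing, the choice of held-out positions and the function names are hypothetical design choices for illustration, not a validated protocol.

```python
def concentration_window(nominal, fraction=0.25):
    """Lower and upper limits for a nominal concentration +/- a fraction."""
    delta = nominal * fraction
    return nominal - delta, nominal + delta


def evenly_spaced(low, high, n):
    """n concentration levels from low to high inclusive."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


def split_calibration_test(levels, test_every=5):
    """Hold out one level in every block of `test_every` as a test sample."""
    held_out = test_every // 2  # middle position of each block
    test = [c for i, c in enumerate(levels) if i % test_every == held_out]
    calibration = [c for i, c in enumerate(levels) if i % test_every != held_out]
    return calibration, test


low, high = concentration_window(3.4)        # 3.4 g/L +/- 0.85 g/L
levels = evenly_spaced(low, high, 20)        # 20 samples in total
calibration, test = split_calibration_test(levels)
```

Holding out interior levels keeps every test sample inside the calibrated range, so the test set interpolates rather than extrapolates.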
The linearity of the calibration plots indicated good performances in correlating the
eRDF and yeastolate concentrations with the EEM and TSFS models. Using unfolded
PLS it was possible to quantify the eRDF concentration with a 7.2% error level for the
EEM data and a 7.8% error level for the TSFS data. For yeastolate concentration,
calibration model errors were 5.4% and 7.2% using EEM and TSFS respectively.
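The error levels quoted here correspond to the REP% column of Table 35 to Table 38. Those tabulated values are consistent with the relative error of prediction being the RMSECV expressed as a percentage of the mean calibration concentration. The sketch below checks this. The mean concentrations used (3.70 g/L for M5eRDF and 0.55 g/L for the six-sample M5Ye set) are assumptions inferred from the tabulated RMSECV and REP pairs, as they are not stated in this section.

```python
def rep_percent(rmse, mean_concentration):
    """Relative error of prediction: RMSE as a percentage of the mean concentration."""
    return 100.0 * rmse / mean_concentration


# Assumed mean calibration concentrations (inferred, not stated in the text).
MEAN_ERDF = 3.70  # g/L
MEAN_YE = 0.55    # g/L

rep_erdf_eem = rep_percent(0.27, MEAN_ERDF)   # Table 35, Reduced Region B, no pre-processing
rep_erdf_tsfs = rep_percent(0.29, MEAN_ERDF)  # Table 36, Full Range, no pre-processing
rep_ye_eem = rep_percent(0.03, MEAN_YE)       # Table 37, Reduced Area B, MSC
rep_ye_tsfs = rep_percent(0.04, MEAN_YE)      # Table 38, Reduced Area B, no pre-processing
```

Because the RMSECV values are tabulated to two decimal places, the recomputed percentages agree with the tables only to within rounding.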
The M5Ye calibration model worked, and its prediction ability was good. The
prediction results for yeastolate indicated that it was possible to quantify yeastolate
from the gross analyte signal of test media samples. The M5eRDF samples generated
good calibration models but had weaker prediction results. When the M5Ye
prediction results were compared to the M5eRDF, it highlighted the importance of the
test set. The test set for the M5eRDF model deviated too much from the calibration
samples as there was a rapidly varying peak intensity from the IFEs at the edges of the
concentration range of the test set samples. The level of deviation more likely to be
experienced in practice was seen with the test set for the M5Ye calibration, which was
devoid of major fluctuations.
When the results from fluorescence, Raman and SERS were compared, they revealed
how the sample set under investigation changed the performance of the model. The
fluorescence method improved the calibration for the M5eRDF data but the test set
underwent too many matrix fluctuations for effective prediction. For this reason
the prediction results of the Raman and SERS data outperformed the fluorescence data (see
Table 29). In the case of M5Ye samples, the fluorescence method outperformed
Raman and SERS in terms of both the calibration and prediction performance. The
Raman data was not able to measure the weak yeastolate signal. The SERS method
improved the correlation with the enhancement of the signal but its prediction
performance suffered from spectral overlap from other components which led to poor
prediction results. The overall quantification of yeastolate in the M5Ye samples was
best achieved using fluorescence measurements.
Within every fluorescence landscape, part of the acquired signal contained little or no
fluorescence; these areas were eliminated to improve results. In the quantitative
analysis, reduced area selections were favoured with the EEM and TSFS data for
M5Ye. This finding was in agreement with other studies where selecting the variables for
the most prominent excitation and emission combinations led to
improved prediction capability [7, 333, 334]. In previous studies with industrial
cell media, Li et al. correlated the fluorescence signal in EEM data to the glycoprotein
yields [7]. The use of the full spectrum resulted in weak calibration but when variable
selection methods were applied the model performance improved. The R2 value went
from 0.2 to 0.94 and the REP from ~8.94% to ~3.62% depending on the process stage
being tested. This study showed that correlation was only dependent on high intensity
fluorescence bands and emission properties of the analyte of interest. For
multicomponent analytes like eRDF and yeastolate, specific emission ranges span
multiple emission bands. As a result, precise area selection could not be used to
pinpoint the fluorescent analytes for multicomponent mixtures, but mathematical variable
selection methods50, which take into account the full area, could be applied to improve
model performance [7, 335-338].
In industrial settings, large quantities of powders are mixed together to produce the
media; the variance seen would be within specification of industrial limits. Therefore
no large fluctuations would be expected unless an outlier was present. Thus the
method developed in this work could be adapted for industrial use where the variance
in samples and the concentration ranges used are smaller.
50 Methods like competitive adaptive reweighted sampling (CARS) and ant colony optimization
(ACO), based on mathematical evaluation of the importance of each wavelength, would be better
for the multiple component analytes.
6 Conclusions and Future Work
The FDA and the biopharmaceutical industry want to better regulate and understand
bio-processes through the use of quality by design (QbD) and PAT. One area where
QbD and PAT can be applied for better control is media formulation. Prior to use,
media are tested in order to determine whether they are fit for purpose. This can
include small scale performance testing, but this is time consuming and expensive.
Variability in media can have a large impact on product quality and process
performance [77, 339-341]. The objective of this thesis was to develop rapid
spectroscopic methods for quantifying certain components in media which could then
be used for media formulation analysis. Raman, SERS and fluorescence
spectroscopic methods offer the possibility to carry out non-destructive qualitative
and quantitative analysis of cell culture media in near real time.
The complex nature of eRDF and yeastolate meant that they were open to analysis by
all of the spectroscopic methods being investigated (Table 41). Chapters 4 and 5
covered the quantification of eRDF and yeastolate by SERS and fluorescence
respectively. SERS gave the signal enhancement required to compensate for the
strong water signal but the preparation of colloid made it the most labour intensive
method. The results showed some promise in quantifying the complex media
components as a whole but the error levels were too high to be useful. SERS gave an
improved yeastolate model (12% error), while the eRDF model that it produced
(~16% error) matched the Raman model. In order to make the SERS method optimal, further
development is required. Areas to improve upon are: more control over the sample to
colloid ratio; increased sample numbers; a reduced linear range, as the data exceeded
± 25%; incubation time; sample re-suspension; whether to use an aggregating agent; and
improved reproducibility testing, by taking ten replicate measurements.
Table 41 Summary of the best calibration models for the media components generated from the
different methods.
Dataset Method Sample Number Range Pre-processing REP%
M5Glu Raman 32 800–1680 cm–1 BC FD MSC 4.66
Fluorescence data was informative and easy to collect and the measurements were
reproducible. The EEM and TSFS of yeastolate and eRDF shared common
fluorophores due to their biogenic components (amino acids, peptides and vitamins).
Prior to quantitative analysis, an extra assessment of the data was conducted using
ROBPCA for outlier detection. ROBPCA indicated that the samples at the ends of the
concentration range deviated most; in other words, the matrix (in photophysical terms)
had changed very significantly. This indicated that the concentration range was too large
for accurate quantification. For practical operational use, these methods would be better if
they were developed with a more restricted concentration range (i.e. a range that
varied by ± 25% of the set concentration value), like the M5Ye set after the sample
numbers were reduced. This would limit the extent of matrix variations and generate
accurate quantitative methods. Using unfolded PLS, it was possible to quantify
yeastolate concentration with 5.4% error level for the EEM data and 7.2% for the
TSFS data; for eRDF, the error levels were 7.2% and 7.8% using EEM and TSFS
respectively. EEM outperformed TSFS in both the calibration and prediction
performance for both M5eRDF and M5Ye. It also gave better resolved bands for
identification of underlying fluorophores.
7 References
36. Incorporated AB. Product Data Sheet HEPES Buffer Solution (1M) 2010.
Available from: http://www.atlantabio.com/assets/PDS/File/PDS%20-%20HEPES%20Buffer%20Solution.pdf.
37. Luo S, Pal D, Shah SJ, Kwatra D, Paturi KD, Mitra AK. Effect of HEPES
buffer on the uptake and transport of P-glycoprotein substrates and large neutral
amino acids. Molecular pharmaceutics. 2010;7(2):412-20.
38. Delmouly K, Belondrade M, Casanova D, Milhavet O, Lehmann S. HEPES
inhibits the conversion of prion protein in cell culture. Journal of General Virology.
2011;92(5):1244-50.
39. Denry Sato J, Kan, M. Media for culture of mammalian cells. Current
Protocols in Cell Biology. 1998;1:1-.2.
40. Langer E. Trends to Watch in the Biopharmaceutical Industry: The Economy,
Approvals, Contamination, and Going Animal-Free. 2010.
41. Cleland D, Jastrzembski K, Stamenova E, Benson J, Catranis C, Emerson D,
et al. Growth characteristics of microorganisms on commercially available animal-
free alternatives to tryptic soy medium. Journal of microbiological methods.
2007;69(2):345-52.
42. Jayme D, Watanabe T, Shimada T. Basal medium development for serum-free
culture: a historical perspective. Cytotechnology. 1997;23(1):95-101.
43. Kim DY, Lee JC, Chang HN, Oh DJ. Development of serum-free media for a
recombinant CHO cell line producing recombinant antibody. Enzyme and microbial
technology. 2006;39(3):426-33.
44. Sung Y, Lim S, Chung J, Lee G. Yeast hydrolysate as a low-cost additive to
serum-free medium for the production of human thrombopoietin in suspension
cultures of Chinese hamster ovary cells. Applied Microbiology and Biotechnology.
2004;63(5):527-36.
45. Franěk F, Hohenwarter O, Katinger H. Plant protein hydrolysates: preparation
of defined peptide fractions promoting growth and production in animal cells cultures.
Biotechnology progress. 2000;16(5):688-92.
46. Michiels J-F, Barbau J, De Boel S, Dessy S, Agathos S, Schneider Y-J.
Characterisation of beneficial and detrimental effects of a soy peptone, as an additive
for CHO cell cultivation. Process Biochemistry. 2011;46(3):671-81.
47. Schlaeger E-J. The protein hydrolysate, Primatone RL, is a cost-effective
multiple growth promoter of mammalian cell culture in serum-containing and serum-
free media and displays anti-apoptosis properties. Journal of immunological methods.
1996;194(2):191-9.
48. Burteau CC, Verhoeye FR, Molsl JF, Ballez J-S, Agathos SN, Schneider Y-J.
Fortification of a protein-free cell culture medium with plant peptones improves
cultivation and productivity of an interferon-γ-producing CHO cell line. In Vitro
Cellular & Developmental Biology-Animal. 2003;39(7):291-6.
49. Heidemann R, Zhang C, Qi H, Rule JL, Rozales C, Park S, et al. The use of
peptones as medium additives for the production of a recombinant therapeutic protein
in high density perfusion cultures of mammalian cells. Cytotechnology.
2000;32(2):157-67.
50. Dick LW, Kakaley JA, Mahon D, Qiu D, Cheng KC. Investigation of proteins
and peptides from yeastolate and subsequent impurity testing of drug product.
Biotechnology progress. 2009;25(2):570-7.
51. Mosser M, Chevalot I, Olmos E, Blanchard F, Kapel R, Oriol E, et al.
Combination of yeast hydrolysates to improve CHO cell growth and IgG production.
Cytotechnology. 2012:1-13.
52. Even MS, Sandusky CB, Barnard ND. Serum-free hybridoma culture: ethical,
scientific and safety considerations. Trends in Biotechnology. 2006;24(3):105-8.
53. Sarkar A. Stem Cell Culture: Discovery Publishing House.
54. Ham RG. Clonal growth of mammalian cells in a chemically defined,
synthetic medium. Proceedings of the National Academy of Sciences of the United
States of America. 1965;53(2):288.
55. Gstraunthaler G. Alternatives to the use of fetal bovine serum: serum-free cell
culture. Altex. 2003;20(4):275-81.
56. Murakami H, Masui H, Sato GH, Sueoka N, Chow TP, Kano-Sueoka T.
Growth of hybridoma cells in serum-free medium: ethanolamine is an essential
component. Proceedings of the National Academy of Sciences. 1982;79(4):1158.
57. Kong ZLM, M. Murakami, H. Shinohara, K. Establishment of a
macrophagelike cell line derived from U-937, human histiocytic lymphoma, grown
serum-free. In Vitro Cellular & Developmental Biology-Plant. 1990;26(10):949-54.
58. Kawahara MN, A. Terada, S. Kato, K. Tsumoto, K. Kumagai, I. Miki, M.
Mahoney, W. Ueda, H. Nagamune, T. Replacing factor-dependency with that for
lysozyme: affordable culture of IL-6-dependent hybridoma by transfecting artificial
cell surface receptor. Biotechnology and Bioengineering. 2001;74(5):416-23.
59. Chua F, Oh SKW, Yap M, Teo WK. Enhanced IgG production in eRDF media
with and without serum: A comparative study. Journal of immunological methods.
1994;167(1-2):109-19.
60. Garnick R, Solli N, Papa P. The role of quality control in biotechnology: An
analytical perspective. Analytical Chemistry. 1988;60(23):2546-57.
61. Hanko VP, Rohrer JS. Determination of carbohydrates, sugar alcohols, and
glycols in cell cultures and fermentation broths using high-performance anion-
exchange chromatography with pulsed amperometric detection. Analytical
Biochemistry. 2000;283(2):192-9.
62. Hanko VP, Rohrer JS. Determination of amino acids in cell culture and
fermentation broth media using anion-exchange chromatography with integrated
pulsed amperometric detection. Analytical Biochemistry. 2004;324(1):29-38.
63. Fa Y, Yang H, Ji C, Cui H, Zhu X, Du J, et al. Simultaneous determination of
amino acids and carbohydrates in culture media of Clostridium thermocellum
by valve-switching ion chromatography. Analytica Chimica Acta. 2013;798:97-102.
64. Buha SM, Panchal A, Panchal H, Chambhare R, Patel PR, Kumar S, et al.
HPLC-FLD for the Simultaneous Determination of Primary and Secondary Amino
Acids from Complex Biological Sample by Pre-column Derivatization. Journal of
chromatographic science. 2011;49(2):118-23.
65. Genzel Y, König S, Reichl U. Amino acid analysis in mammalian cell culture
media containing serum and high glucose concentrations by anion exchange
chromatography and integrated pulsed amperometric detection. Analytical
Biochemistry. 2004;335(1):119-25.
66. Potvin J, Fonchy E, Conway J, Champagne CP. An automatic turbidimetric
method to screen yeast extracts as fermentation nutrient ingredients. Journal of
microbiological methods. 1997;29(3):153-60.
67. Pohlscheidt M, Charaniya S, Bork C, Jenzsch M, Noetzel TL, Luebbert A.
Bioprocess and fermentation monitoring. Encyclopedia of Industrial Biotechnology.
2012.
68. Sun Y-t, Zhao L, Ye Z, Fan L, Liu X-p, Tan W-S. Development of a fed-batch
cultivation for antibody-producing cells based on combined feeding strategy of
glucose and galactose. Biochemical Engineering Journal. 2013;81:126-35.
69. Food and Drug Administration. Guidance for Industry: PAT—a framework for
innovative pharmaceutical development, manufacturing, and quality assurance.
Rockville, MD. 2004.
70. De Beer T, Allesø M, Goethals F, Coppens A, Vander Heyden Y, De Diego
HL, et al. Implementation of a process analytical technology system in a freeze-drying
process using Raman spectroscopy for in-line process monitoring. Analytical
Chemistry. 2007;79(21):7992-8003.
71. De Beer T, Burggraeve A, Fonteyne M, Saerens L, Remon JP, Vervaet C.
Near infrared and Raman spectroscopy for the in-process monitoring of
pharmaceutical production processes. International Journal of Pharmaceutics.
2011;417(1):32-47.
72. De Beer TRM, Bodson C, Dejaegher B, Walczak B, Vercruysse P, Burggraeve
A, et al. Raman spectroscopy as a process analytical technology (PAT) tool for the in-
line monitoring and understanding of a powder blending process. Journal of
Pharmaceutical and Biomedical Analysis. 2008;48(3):772.
73. Johansson J, Pettersson S, Folestad S. Characterization of different laser
irradiation methods for quantitative Raman tablet assessment. Journal of
Pharmaceutical and Biomedical Analysis. 2005;39(3-4):510.
74. Clarke SJ, Littleford RE, Smith WE, Goodacre R. Rapid monitoring of
antibiotics using Raman and surface enhanced Raman spectroscopy. Analyst.
2005;130(7):1019-26.
75. Jain G, Jayaraman G, Kökpinar Ö, Rinas U, Hitzmann B. On-line monitoring
of recombinant bacterial cultures using multi-wavelength fluorescence spectroscopy.
Biochemical Engineering Journal. 2011;58–59:133-9.
76. Lee HLT, Boccazzi P, Gorret N, Ram RJ, Sinskey AJ. In situ bioprocess
monitoring of Escherichia coli bioreactions using Raman spectroscopy. Vibrational
Spectroscopy. 2004;35(1-2):131.
77. Lee HW, Christie A, Yoon S. Characterization of Raw Material Influence on
Mammalian Cell Culture Performance: Chemometric Based Data Fusion Approach.
78. Lourenço N, Lopes J, Almeida C, Sarraguça M, Pinheiro H. Bioreactor
monitoring with spectroscopy and chemometrics: a review. Analytical and
bioanalytical chemistry. 2012;404(4):1211-37.
79. Macaloney G, Draper I, Preston J, Anderson K, Rollins M, Thompson B, et al.
At-Line Control and Fault Analysis In an Industrial High Cell Density Escherichia
Coli Fermentation, Using NIR Spectroscopy. Food and Bioproducts Processing.
1996;74(4):212-20.
80. Marose S, Lindemann C, Scheper T. Two-Dimensional Fluorescence
Spectroscopy: A New Tool for On-Line Bioprocess Monitoring. Biotechnology
progress. 1998;14(1):63.
81. Triadaphillou S, Martin E, Montague G, Norden A, Jeffkins P, Stimpson S.
Fermentation process tracking through enhanced spectral calibration modeling.
Biotechnology and Bioengineering. 2007;97(3):554-67.
82. Ryder AG, De Vito J, Li B, Ryan PW, Sirimuthu NMS, Leister KJ. A stainless
steel multi-well plate (SS-MWP) for high-throughput
Raman analysis of dilute solutions. Journal of Raman Spectroscopy.
2010;41(10):1266-75.
83. Settle FA. Handbook of Instrumental Techniques for Analytical Chemistry.
Journal of Liquid Chromatography Related Technologies. 1998;21(19):3072-6.
84. Ewing GW. Analytical instrumentation handbook: CRC Press; 1997.
85. Willard HH, Merritt Jr LL, Dean JA. Instrumental methods of analysis. Settle
Jr FA, editor: Wadsworth Pub. Co.; 1988.
86. Turrell G, Corset J. Raman microscopy: developments and applications:
Access Online via Elsevier; 1996.
87. McCreery R. Raman spectroscopy for chemical analysis: Wiley-Interscience;
2000.
88. Hollas M. Modern Spectroscopy. New York: John Wiley & Sons; 1987.
89. Straughan B, Walker S. Spectroscopy: Chapman and Hall London; 1976.
90. Collette TW, Williams TL. The role of Raman spectroscopy in the analytical
chemistry of potable water. Journal of Environmental Monitoring. 2002;4(1):27-34.
91. Egawa T, Yeh S-R. Structural and functional properties of hemoglobins from
unicellular organisms as revealed by resonance Raman spectroscopy. Journal of
Inorganic Biochemistry. 2005;99(1):72-96.
92. Smith E, Dent G. Modern Raman spectroscopy: a practical approach: Wiley;
2005.
93. Sebastian R, Petra R, Marion AS, Dorothea B, Malgorzata B, Hartwig S, et al.
Nondestructive analysis of single rapeseeds by means of Raman spectroscopy. Journal
of Raman Spectroscopy. 2007;38(3):301-8.
94. Ortiz C, Zhang D, Xie Y, Davisson VJ, Ben-Amotz D. Identification of insulin
variants using Raman spectroscopy. Analytical Biochemistry. 2004;332(2):245.
95. Zhu G, Zhu X, Fan Q, Wan X. Raman spectra of amino acids and their
aqueous solutions. Spectrochimica Acta Part A: Molecular and Biomolecular
Spectroscopy. 2011;78(3):1187-95.
96. Kang J, Yuan X, Dong X, Gu H, editors. The effect of aqueous solution in
Raman spectroscopy. Photonics and Optoelectronics Meetings 2009; 2009:
International Society for Optics and Photonics.
97. Cannizzaro C, Rhiel M, Marison I, von Stockar U. On-line monitoring of
Phaffia rhodozyma fed-batch process with in situ dispersive raman spectroscopy.
Biotechnology and Bioengineering. 2003;83(6):668-80.
98. Shaw AD, Kaderbhai N, Jones A, Woodward AM, Goodacre R, Rowland JJ,
et al. Noninvasive, on-line monitoring of the biotransformation by yeast of glucose to
ethanol using dispersive Raman spectroscopy and chemometrics. Applied
spectroscopy. 1999;53(11):1419-28.
99. McGovern AC, Broadhurst D, Taylor J, Kaderbhai N, Winson MK, Small DA,
et al. Monitoring of complex industrial bioprocesses for metabolite concentrations
using modern spectroscopies and machine learning: application to gibberellic acid
production. Biotechnology and Bioengineering. 2002;78(5):527-38.
100. Abu-Absi NR, Kenty BM, Cuellar ME, Borys MC, Sakhamuri S, Strachan DJ,
et al. Real time monitoring of multiple parameters in mammalian cell culture
bioreactors using an in‐line Raman spectroscopy probe. Biotechnology and
Bioengineering. 2011;108(5):1215-21.
101. Calvet A, Ryder AG. Monitoring cell culture media degradation using surface
enhanced Raman scattering (SERS) spectroscopy. Analytica Chimica Acta.
2014;840:58-67.
102. Wold JP, Westad F, Heia K. Detection of parasites in cod fillets by using
SIMCA classification in multispectral images in the visible and NIR region. Applied
spectroscopy. 2001;55(8):1025-34.
103. Campion A, Kambhampati P. Surface-enhanced Raman scattering. Chemical
Society Reviews. 1998;27(4):241-50.
104. Tian ZQ. Surface-enhanced Raman spectroscopy: advancements and
applications. Journal of Raman Spectroscopy. 2005;36(6-7):466-70.
105. Suetaka W, Yates J. Surface infrared and Raman spectroscopy: methods and
applications: Springer; 1995.
106. Jia H, Zeng J, Song W, An J, Zhao B. Preparation of silver nanoparticles by
photo-reduction for surface-enhanced Raman scattering. Thin Solid Films.
2006;496(2):281-7.
107. Dennis A. Surface Enhanced Raman Spectroscopy (SERS). Technical Note
108. Wang Y, Li Y, Zhang Z, An D. Surface-enhanced Raman scattering of some
water insoluble drugs in silver hydrosols. Spectrochimica Acta Part A: Molecular and
Biomolecular Spectroscopy. 2003;59(3):589-94.
109. Jarvis R, Brooker A, Goodacre R. Surface-enhanced Raman spectroscopy for
bacterial discrimination utilizing a scanning electron microscope with a Raman
spectroscopy interface. Anal Chem. 2004;76(17):5198-202.
110. Stosch R, Henrion A, Schiel D, Güttler B. Surface-Enhanced Raman
Scattering Based Approach for Quantitative Determination of Creatinine in Human
Serum. Anal Chem. 2005;77(22):7386-92.
111. Mrozek M, Weaver M. Detection and identification of aqueous saccharides by
using surface-enhanced Raman spectroscopy. Anal Chem. 2002;74(16):4069-75.
112. Rupérez A, Laserna JJ. Surface-enhanced Raman spectrometry of chiral β-
blocker drugs on colloidal silver. Analytica Chimica Acta. 1996;335(1–2):87-94.
113. Kukushkin V, Van'kov A, Kukushkin I. Long-range nature of surface-
enhanced Raman scattering. arXiv preprint arXiv:1212.2782. 2012.
114. Ye Q, Fang J, Sun L. Surface-enhanced Raman scattering from functionalized
self-assembled monolayers. 2. Distance dependence of enhanced Raman scattering
from an azobenzene terminal group. The Journal of Physical Chemistry B.
1997;101(41):8221-4.
115. Lecomte S, Matejka P, Baron M. Correlation between surface enhanced
Raman scattering and absorbance changes in silver colloids. Evidence for the
chemical enhancement mechanism. Langmuir. 1998;14(16):4373-7.
116. Barnes WL, Dereux A, Ebbesen TW. Surface plasmon subwavelength optics.
Nature. 2003;424(6950):824-30.
117. Haynes CL, McFarland AD, Van Duyne RP. Surface-Enhanced Raman
Spectroscopy. Analytical Chemistry. 2005;77(17):338 A-46 A.
118. Abalde-Cela S, Aldeanueva-Potel P, Mateo-Mateo C, Rodríguez-Lorenzo L,
Alvarez-Puebla RA, Liz-Marzán LM. Surface-enhanced Raman scattering biomedical
applications of plasmonic colloidal particles. Journal of The Royal Society Interface.
2010;7(Suppl 4):S435-S50.
119. Tao AR, Yang P. Polarized surface-enhanced Raman spectroscopy on coupled
metallic nanowires. The Journal of Physical Chemistry B. 2005;109(33):15687-90.
120. Aaltonen J, Gordon KC, Strachan CJ, Rades T. Perspectives in the use of
spectroscopy to characterise pharmaceutical solids. International Journal of
Pharmaceutics. 2008;364(2):159.
121. Ahern AM, Garrell RL. Protein-metal interactions in protein-colloid
conjugates probed by surface-enhanced Raman spectroscopy. Langmuir.
1991;7(2):254-61.
122. Nickel U, zu Castell A, Pöppl K, Schneider S. A silver colloid produced by
reduction with hydrazine as support for highly sensitive surface-enhanced Raman
spectroscopy. Langmuir. 2000;16(23):9087-91.
123. Herne TM, Ahern A, Garrell RL. Surface-enhanced Raman spectroscopy of
peptides: preferential N-terminal adsorption on colloidal silver. Journal of the
American Chemical Society. 1991;113(3):846-54.
124. Jurkiewicz-Herbich M, Slojkowska R, Zawada K, Bukowska J.
Electrochemical and surface-enhanced Raman spectroscopy studies of 4-
phenylpyridine adsorption at the gold/solution interface. Electrochimica acta.
2002;47(15):2429-34.
125. Bulovas A, Talaikytė Z, Niaura G, Kažemėkaitė M, Marcinkevičienė L.
Double layered Ag/Au electrode for SERS spectroscopy: preparation and application
for adsorption studies of chromophoric compounds. Chemija. 2007;10(4):9-15.
126. Saito Y, Wang JJ, Smith DA, Batchelder DN. A Simple Chemical Method for
the Preparation of Silver Surfaces for Efficient SERS. Langmuir. 2002;18(8):2959-61.
127. Li P-W, Zhang J, Zhang L, Mo Y-J. Surface-enhanced Raman scattering and
adsorption studies of morphine on silver island film. Vibrational Spectroscopy.
2009;49(1):2-6.
128. Weitz D, Garoff S, Gramila T. Excitation spectra of surface-enhanced Raman
scattering on silver-island films. Optics Letters. 1982;7(4):168-70.
129. Dick LA, McFarland AD, Haynes CL, Van Duyne RP. Metal film over
nanosphere (MFON) electrodes for surface-enhanced Raman spectroscopy (SERS):
Improvements in surface nanostructure stability and suppression of irreversible loss.
The Journal of Physical Chemistry B. 2002;106(4):853-60.
130. Fang Y, Yang H, Jiang P, Dlott DD. The distributions of enhancement factors
in close‐packed and nonclose‐packed surface‐enhanced Raman substrates. Journal of
Raman Spectroscopy. 2011;43(3):389-95.
131. Zhang X, Shah N, Van Duyne R. Sensitive and selective chem/bio sensing
based on surface-enhanced Raman spectroscopy (SERS). Vibrational Spectroscopy.
2006;42(1):2-8.
132. Laserna J, Campiglia A, Winefordner J. Mixture analysis and quantitative
determination of nitrogen-containing organic molecules by surface-enhanced Raman
spectrometry. Analytical Chemistry. 1989;61(15):1697-701.
133. Gupta RK, Srinivasan MP, Dharmarajan R. Synthesis of short chain thiol
capped gold nanoparticles, their stabilization and immobilization on silicon surface.
Colloids and Surfaces A: Physicochemical and Engineering Aspects. 2011;390(1–
3):149-56.
134. Lee P, Meisel D. Adsorption and surface-enhanced Raman of dyes on silver
and gold sols. The Journal of Physical Chemistry. 1982;86(17):3391-5.
135. Creighton JA, Blatchford CG, Albrecht MG. Plasma resonance enhancement
of Raman scattering by pyridine adsorbed on silver or gold sol particles of size
comparable to the excitation wavelength. J Chem Soc, Faraday Trans 2. 1979;75:790-
8.
136. Merza KS, Al-Attabi HD, Abbas ZM, Yusr HA. Comparative Study on
Methods for Preparation of Gold Nanoparticles. Green and Sustainable Chemistry.
2012;2(1):26-8.
137. Aroca R, Alvarez-Puebla R, Pieczonka N, Sanchez-Cortez S, Garcia-Ramos J.
Surface-enhanced Raman scattering on colloidal nanostructures. Advances in colloid
and interface science. 2005;116(1):45-61.
138. Mulfinger L, Solomon SD, Bahadory M, Jeyarajasingam AV, Rutkowsky SA,
Boritz C. Synthesis and study of silver nanoparticles. Journal of chemical education.
2007;84(2):322.
139. Kudelski A. Analytical applications of Raman spectroscopy. Talanta.
2008;76(1):1.
140. Etchegoin P, Maher RC, Cohen LF, Hartigan H, Brown RJC, Milton MJT, et
al. New limits in ultrasensitive trace detection by surface enhanced Raman scattering
(SERS). Chemical Physics Letters. 2003:84–90.
141. Dressler DH, Mastai Y, Rosenbluh M, Fleger Y. Surface-enhanced Raman
spectroscopy as a probe for orientation of pyridine compounds on colloidal surfaces.
Journal of molecular structure. 2009;935(1-3):92-6.
142. Leopold N, Cînta-Pînzaru S, Baia M, Antonescu E, Cozar O, Kiefer W,
et al. Raman and surface-enhanced Raman study of thiamine at different pH values.
Vibrational Spectroscopy. 2005:169–76.
143. Pettinger B, Picardi G, Schuster R, Ertl G. Surface-enhanced and STM tip-
enhanced Raman spectroscopy of CN-ions at gold surfaces. Journal of
Electroanalytical Chemistry. 2003;554:293-9.
144. Stewart S, Fredericks PM. Surface-enhanced Raman spectroscopy of amino
acids adsorbed on an electrochemically prepared silver surface. Spectrochimica Acta
Part A: Molecular and Biomolecular Spectroscopy. 1999;55(7-8):1641.
145. Brambilla A, Philippidis A, Nevin A, Comelli D, Valentini G, Anglos D.
Adapting and testing a portable Raman spectrometer for SERS analysis of amino
acids and small peptides. Journal of molecular structure. 2013.
146. Podstawka E, Ozaki Y, Proniewicz LM. Part I: surface-enhanced Raman
spectroscopy investigation of amino acids and their homodipeptides adsorbed on
colloidal silver. Applied spectroscopy. 2004;58(5):570-80.
147. Podstawka E, Ozaki Y, Proniewicz LM. Adsorption of S–S containing
proteins on a colloidal silver surface studied by surface-enhanced Raman
spectroscopy. Applied spectroscopy. 2004;58(10):1147-56.
148. Stewart S, Fredericks P. Surface-enhanced Raman spectroscopy of peptides
and proteins adsorbed on an electrochemically prepared silver surface.
Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy.
1999;55(7):1615-40.
149. Lin M, He L, Awika J, Yang L, Ledoux DR, Li H, et al. Detection of
Melamine in Gluten, Chicken Feed, and Processed Foods Using Surface Enhanced
Raman Spectroscopy and HPLC. Journal of Food Science. 2008;73(8):T129-T34.
150. Vernon K, Davis T, Scholes F, Gomez D, Lau D. Physical mechanisms behind
the SERS enhancement of pyramidal pit substrates. Journal of Raman Spectroscopy.
2010;41(10):1106-11.
151. Food and Drug Administration. Update: Interim Safety and Risk Assessment of
Melamine and its Analogues in Food for Humans 2008 [updated November 28, 2008].
Available from:
http://www.fda.gov/Food/FoodborneIllnessContaminants/ChemicalContaminants/ucm164520.htm.
152. Eliasson C, Lorén A, Murty KVGK, Josefson M, Käll M, Abrahamsson J, et
al. Multivariate evaluation of doxorubicin surface-enhanced Raman spectra.
Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy.
2001;57(9):1907-15.
153. Stosch R, Henrion A, Schiel D, Güttler B. Surface-Enhanced Raman Scattering Based
Approach for Quantitative Determination of Creatinine in Human Serum. Analytical
Chemistry. 2005:7386-92.
154. Kahraman M, Yazici MM, Şahin F, Bayrak ÖF, Çulha M. Reproducible
surface-enhanced Raman scattering spectra of bacteria on aggregated silver
nanoparticles. Applied spectroscopy. 2007;61(5):479-85.
155. Leopold N, Cînta-Pînzaru S, Baia M, Antonescu E, Cozar O, Kiefer W, et al.
Raman and surface-enhanced Raman study of thiamine at different pH values.
Vibrational Spectroscopy. 2005;39(2):169-76.
156. Culha M, Kahraman M, Çam D, Sayın I, Keseroǧlu K. Rapid identification of
bacteria and yeast using surface-enhanced Raman scattering. Surface and Interface
Analysis. 2010;42(6‐7):462-5.
157. Jarvis RM, Brooker A, Goodacre R. Surface-Enhanced Raman Spectroscopy
for Bacterial Discrimination Utilizing a Scanning Electron Microscope with a Raman
Spectroscopy Interface. Analytical Chemistry. 2004:5198-202.
158. Marotta NE, Bottomley LA. Surface-enhanced Raman scattering of bacterial
cell culture growth media. Applied spectroscopy. 2010;64(6):601-6.
159. Premasiri WR, Gebregziabher Y, Ziegler LD. On the difference between
surface-enhanced Raman scattering (SERS) spectra of cell growth media and whole
bacterial cells. Applied spectroscopy. 2011;65(5):493-9.
160. Aqua Regia. In Laboratory Safety Manual; Princeton University: Princeton,
NJ. Available from:
http://web.princeton.edu/sites/ehs/labsafetymanual/cheminfo/aquaregia.htm.
161. Kamath SD, Kartha VB, Mahato KK. Dynamics of l-tryptophan in aqueous
solution by simultaneous laser induced fluorescence (LIF) and photoacoustic
spectroscopy (PAS). Spectrochimica Acta Part A: Molecular and Biomolecular
Spectroscopy. 2008;70(1):187.
162. Lakowicz JR. Principles of Fluorescence Spectroscopy. 2nd ed: Springer;
2004.
163. Valeur B. Molecular Fluorescence: Principles and Applications: Wiley-VCH;
2002.
164. Dijkstra RJ, Ariese F, Gooijer C, Brinkman UAT. Raman spectroscopy as a
detection method for liquid-separation techniques. TrAC Trends in Analytical
Chemistry. 2005;24(4):304.
165. Skoog DA, Leary JJ. Principles of Instrumental Analysis. Clinical Chemistry-
Reference Edition. 1994;40(8):1612.
166. Forster LS, Dudley D. The Luminescence of Fluorescein Dyes. The Journal of
Physical Chemistry. 1962;66(5):838-40.
167. Gutierrez AR, Whitten DG. Photochemistry of quaternary salts of 1, 2-
bispyridylethylenes. Heavy atom induced singlet-triplet intersystem crossing as a path
for decay in charge-transfer exciplexes. Journal of the American Chemical Society.
1976;98(20):6233-7.
168. Mac M, Danel A, Kizior K, Nowak P, Karocki A, Tokarczyk B. Investigations
of the heavy atom effect occurring in bianthryl and 10, 10′-dibromobianthryl.
Fluorescence, cyclovoltamperometric and actinometric studies. Phys Chem Chem
Phys. 2002;5(6):988-97.
169. Gijzeman O, Kaufman F, Porter G. Oxygen quenching of aromatic triplet
states in solution. Part 1. J Chem Soc, Faraday Trans 2. 1973;69:708-20.
170. English D, Furube A, Barbara P. Single-molecule spectroscopy in oxygen-
depleted polymer films. Chemical Physics Letters. 2000;324(1):15-9.
171. Engelborghs Y. The analysis of time resolved protein fluorescence in multi-
tryptophan proteins. Spectrochimica Acta Part A: Molecular and Biomolecular
Spectroscopy. 2001;57(11):2255-70.
172. Joshi S, Pant DD. Ground and excited state dipole moments of quinine sulfate
dication: Solvatochromic shift of absorption and fluorescence spectra. Journal of
Molecular Liquids. 2012;172:125-9.
173. Scott SM, James D, Ali Z, O'Hare WT, Rowell FJ. Total luminescence
spectroscopy with pattern recognition for classification of edible oils. The Analyst.
2003;128(7):966-73.
174. Patra D, Mishra A. Recent developments in multi-component synchronous
fluorescence scan analysis. TrAC Trends in Analytical Chemistry. 2002;21(12):787-
98.
175. Patra D, Mishra A. Total synchronous fluorescence scan spectra of petroleum
products. Analytical and bioanalytical chemistry. 2002;373(4):304-9.
176. Engelen S, Møller SF, Hubert M. Automatically identifying scatter in
fluorescence data using robust techniques. Chemometrics and Intelligent Laboratory
Systems. 2007;86(1):35-51.
177. Poulli KI, Mousdis GA, Georgiou CA. Synchronous fluorescence spectroscopy
for quantitative determination of virgin olive oil adulteration with sunflower oil.
Analytical and bioanalytical chemistry. 2006;386:1571-5.
178. Sikorska E, Gliszczynska-Swiglo A, Insinska-Rak M, Khmelinskii I, De
Keukeleire D, Sikorski M. Simultaneous analysis of riboflavin and aromatic amino
acids in beer using fluorescence and multivariate calibration methods. Analytica
Chimica Acta. 2008;613(2):207.
179. Wang Y, Zhu P-H, Tian T, Tang J, Wang L, Hu X-Y. Synchronous
Fluorescence as a Rapid Method for the Simultaneous Determination of Folic Acid
and Riboflavin in Nutritional Beverages. Journal of agricultural and food chemistry.
2011;59(23):12629-34.
180. Datta P, Bhattacharyya D. Analysis of fluorescence excitation-emission matrix
of multicomponent drugs: A case study with human placental extract used as wound
healer. Journal of Pharmaceutical and Biomedical Analysis. 2004;36(1):211.
181. Jiménez Girón A, Durán-Merás I, Espinosa-Mansilla A, Muñoz de la Peña A,
Cañada Cañada F, Olivieri AC. On line photochemically induced excitation–
emission-kinetic four-way data: Analytical application for the determination of folic
acid and its two main metabolites in serum by U-PLS and N-PLS/residual
trilinearization (RTL) calibration. Analytica Chimica Acta. 2008;622(1–2):94-103.
182. Zepp RG, Sheldon WM, Moran MA. Dissolved organic fluorophores in
southeastern US coastal waters: correction method for eliminating Rayleigh and
Raman scattering peaks in excitation-emission matrices. Marine Chemistry.
2004;89(1-4):15-36.
183. Andrade-Eiroa Á, de-Armas G, Estela J-M, Cerdà V. Critical approach to
synchronous spectrofluorimetry. I. TrAC Trends in Analytical Chemistry.
2010;29(8):885-901.
184. Heelis P. The photophysical and photochemical properties of flavins
(isoalloxazines). Chem Soc Rev. 1982;11(1):15-39.
185. Drössler P, Holzer W, Penzkofer A, Hegemann P. Fluorescence quenching of
riboflavin in aqueous solution by methionin and cystein. Chemical Physics.
2003;286(2–3):409-20.
186. Gaigalas A, Li L, Henderson O, Vogt R, Barr J, Marti G, et al. The
development of fluorescence intensity standards. Journal of Research - National
Institute of Standards and Technology. 2001;106(2):381-90.
187. Skoog D, Leary J. Principles of Instrumental Analysis, Harcourt Brace
College Publishers. Fort Worth. 1992.
188. Drössler P, Holzer W, Penzkofer A, Hegemann P. pH dependence of the
absorption and emission behaviour of riboflavin in aqueous solution. Chemical
Physics. 2002;282(3):429-39.
189. Evale BG, Hanagodimath S. Effect of temperature and quencher on the
fluorescence of 4-(5-methyl-3-furan-2-yl-benzofuran-2-yl)-7-methyl-chromen-2-one
in different solvents. Spectrochimica Acta Part A: Molecular and Biomolecular
Spectroscopy. 2010;75(5):1592-9.
190. Evale BG, Hanagodimath S, Kulkarni MV. Effect of temperature on the
fluorescence emission of 2H-chromen-2-one derivative in non-polar and polar
solvents. Journal of Luminescence. 2010;130(8):1325-9.
191. Skibsted E, Lindemann C, Roca C, Olsson L. On-line bioprocess monitoring
with a multi-wavelength fluorescence sensor using multivariate calibration. Journal of
Biotechnology. 2001;88(1):47-57.
192. Wendisch VF. Amino acid biosynthesis: pathways, regulation, and metabolic
engineering: Springer Verlag; 2007.
193. Wiberg K, Sterner-Molin A, Jacobsson SP. Simultaneous determination of
albumin and immunoglobulin G with fluorescence spectroscopy and multivariate
calibration. Talanta. 2004;62(3):567.
194. Udenfriend S. Fluorescence assay in biology and medicine. 1962.
195. Beale JM, Block J. Organic medicinal and pharmaceutical chemistry:
Lippincott Williams & Wilkins; 2010.
196. García L, Blázquez S, San Andrés M, Vera S. Determination of thiamine,
riboflavin and pyridoxine in pharmaceuticals by synchronous fluorescence
spectrometry in organized media. Analytica Chimica Acta. 2001;434(2):193-9.
197. Gatti R, Gioia M. Liquid chromatographic determination with fluorescence
detection of B6 vitamers and riboflavin in milk and pharmaceuticals. Analytica
Chimica Acta. 2005;538(1):135-41.
198. Hart H, Hart DJ, Craine LE, Hadad CM. Organic chemistry: a short course.
11th ed: Brooks/Cole Pub Co.
199. Hisiger S, Jolicoeur M. A multiwavelength fluorescence probe: Is one probe
capable for on-line monitoring of recombinant protein production and biomass
activity? Journal of Biotechnology. 2005;117(4):325.
200. Haack MB, Eliasson A, Olsson L. On-line cell mass monitoring of
Saccharomyces cerevisiae cultivations by multi-wavelength fluorescence. Journal of
Biotechnology. 2004;114(1-2):199.
201. Amigo JM, Surribas A, Coello J, Montesinos JL, Maspoch S, Valero F. On-
line parallel factor analysis. A step forward in the monitoring of bioprocesses in real
time. Chemometrics and Intelligent Laboratory Systems. 2008;92(1):44.
202. Boehl D, Solle D, Hitzmann B, Scheper T. Chemometric modelling with two-
dimensional fluorescence data for Claviceps purpurea bioprocess characterization.
Journal of Biotechnology. 2003;105(1-2):179-88.
203. Ganzlin M, Marose S, Lu X, Hitzmann B, Scheper T, Rinas U. In situ multi-
wavelength fluorescence spectroscopy as effective tool to simultaneously monitor
spore germination, metabolic activity and quantitative protein production in
recombinant Aspergillus niger fed-batch cultures. Journal of Biotechnology.
2007;132(4):461.
204. Marose S, Lindemann C, Ulber R, Scheper T. Optical sensor systems for
bioprocess monitoring. Trends in Biotechnology. 1999;17(1):30-4.
205. Li JK, Asali EC, Humphrey AE, Horvath JJ. Monitoring Cell Concentration and
Activity by Multiple Excitation Fluorometry. Biotechnology progress. 1991;7:21-7.
206. Eliasson Lantz A, Jørgensen P, Poulsen E, Lindemann C, Olsson L.
Determination of cell mass and polymyxin using multi-wavelength fluorescence.
Journal of Biotechnology. 2006;121(4):544.
207. Garcia Moreda FJ, Arregui FJ, Achaerandio M, Matias IR. Study of indicators
for the development of fluorescence based optical fiber temperature sensors. Sensors
and Actuators B: Chemical. 2006;118(1-2):425-32.
208. Vojinovic V, Cabral JMS, Fonseca LP. Real-time bioprocess monitoring: Part
I: In situ sensors. Sensors and Actuators B: Chemical. 2006;114(2):1083.
209. Horvath JJ, Glazier SA, Spangler CJ. In situ fluorescence cell mass
measurements of Saccharomyces cerevisiae using cellular tryptophan. Biotechnology
progress. 1993;9(6):666-70.
210. Surribas A, Geissler D, Gierse A, Scheper T, Hitzmann B, Montesinos JL, et
al. State variables monitoring by in situ multi-wavelength fluorescence spectroscopy
in heterologous protein production by Pichia pastoris. Journal of Biotechnology.
2006;124(2):412.
211. Lindemann C, Marose S, Nielsen HO, Scheper T. 2-Dimensional fluorescence
spectroscopy for on-line bioprocess monitoring. Sensors and Actuators B: Chemical.
1998;51(1–3):273-7.
212. Hantelmann K, Kollecker M, Hüll D, Hitzmann B, Scheper T. Two-
dimensional fluorescence spectroscopy: A novel approach for controlling fed-batch
cultivations. Journal of Biotechnology. 2006;121(3):410.
213. Ryan PW, Li B, Shanahan M, Leister KJ, Ryder AG. Prediction of cell culture
media performance using fluorescence spectroscopy. Analytical Chemistry.
2010;82(4):1311-7.
214. Geladi P. Chemometrics in spectroscopy. Part 1. Classical chemometrics.
Spectrochimica Acta Part B: Atomic Spectroscopy. 2003;58(5):767.
215. Otto M. Chemometrics: statistics and computer application in analytical
chemistry: Vch Verlagsgesellschaft Mbh; 2007.
216. Miller JN, Miller JC. Statistics and chemometrics for analytical chemistry:
Prentice Hall Ptr; 2005.
217. Kellner RA, Mermet J-M. Analytical chemistry: a modern approach to
analytical science. Weinheim: WILEY-VCH; 2004.
218. Gemperline P. Practical guide to chemometrics: CRC; 2006.
219. Brereton RG. Introduction to multivariate calibration in analytical chemistry.
The Analyst. 2000;125(11):2125-54.
220. Bakeev KA. Process analytical technology: spectroscopic tools and
implementation strategies for the chemical and pharmaceutical industries: Wiley-
Blackwell; 2005.
221. Foster JJ, Barkus E, Yavorsky C. Understanding and using advanced statistics:
Sage Publications Ltd; 2006.
222. Faber NM, Rajkó R. How to avoid over-fitting in multivariate calibration--The
conventional validation approach and an alternative. Analytica Chimica Acta.
2007;595(1-2):98.
223. Steinberg WJ. Statistics Alive!: Sage Publications, Inc; 2007.
224. Martens HA, Dardenne P. Validation and verification of regression in small
data sets. Chemometrics and Intelligent Laboratory Systems. 1998;44(1-2):99-121.
225. Baumann K. Cross-validation as the objective function for variable-selection
techniques. TrAC Trends in Analytical Chemistry. 2003;22(6):395-406.
226. Estienne F, Massart DL, Zanier-Szydlowski N, Marteau P. Multivariate
calibration with Raman spectroscopic data: a case study. Analytica Chimica Acta.
2000;424(2):185.
227. Mercier SM, Diepenbroek B, Dalm MC, Wijffels RH, Streefland M.
Multivariate data analysis as a PAT tool for early bioprocess development data.
Journal of Biotechnology. 2013;167(3):262-70.
228. Yoo CK, Lee J-M, Vanrolleghem PA, Lee I-B. On-line monitoring of batch
processes using multiway independent component analysis. Chemometrics and
Intelligent Laboratory Systems. 2004;71(2):151-63.
229. Read E, Shah R, Riley B, Park J, Brorson K, Rathore A. Process analytical
technology (PAT) for biopharmaceutical products: part II. Concepts and applications.
Biotechnology and Bioengineering. 2010;105(2):285-95.
230. Dunteman GH. Principal components analysis: Sage Publications, Inc; 1989.
231. Wold S, Esbensen K, Geladi P. Principal component analysis. Chemometrics
and Intelligent Laboratory Systems. 1987;2(1-3):37-52.
232. Eigenvector Research Inc. PLS_Toolbox 4.0 for use with MATLAB™. Wenatchee,
WA; 2006.
233. Yeo B-S, Schmid T, Zhang W, Zenobi R. A strategy to prevent signal losses,
analyte decomposition, and fluctuating carbon contamination bands in surface-
enhanced Raman spectroscopy. Applied spectroscopy. 2008;62(6):708-13.
234. Hubert M, Rousseeuw P, Vanden Branden K. ROBPCA: a new approach to robust
principal component analysis. Technometrics. 2005;47:79.
235. Hubert M, Rousseeuw P, Verdonck T. Robust PCA for skewed data and its
outlier map. Computational Statistics & Data Analysis. 2009;53(6):2264.
236. Daszykowski M, Kaczmarek K, Vander Heyden Y, Walczak B. Robust
statistics in data analysis -- A review: Basic concepts. Chemometrics and Intelligent
Laboratory Systems. 2007;85(2):203.
237. Li B, Morris J, Martin EB. Model selection for partial least squares regression.
Chemometrics and Intelligent Laboratory Systems. 2002;64(1):79.
238. Gómez-Carracedo MP, Andrade JM, Rutledge DN, Faber NM. Selecting the
optimum number of partial least squares components for the calibration of attenuated
total reflectance-mid-infrared spectra of undesigned kerosene samples. Analytica
Chimica Acta. 2007;585(2):253.
239. Lennox B, Montague GA, Hiden HG, Kornfeld G, Goulding PR. Process monitoring
of an industrial fed‐batch fermentation. Biotechnology and Bioengineering.
2001;74(2):125-35.
240. Lennox B, Montague G, Marjanovic O. Detection of faults in Batch Processes:
Application to an industrial fermentation and a steel making process. Water Science
and Technology. 2000.
241. Shinzawa H, Nishida M, Kanematsu W, Tanaka T, Suzuki K, Noda I. Parallel
factor (PARAFAC) kernel analysis of temperature- and composition-dependent NMR
spectra of poly(lactic acid) nanocomposites. Analyst. 2012;137(8):1913-21.
242. Zarzo M, Ferrer A. Batch process diagnosis: PLS with variable selection
versus block-wise PCR. Chemometrics and Intelligent Laboratory Systems.
2004;73(1):15.
243. Durante C, Cocchi M, Grandi M, Marchetti A, Bro R. Application of N-PLS
to gas chromatographic and sensory data of traditional balsamic vinegars of modena.
Chemometrics and Intelligent Laboratory Systems. 2006;83(1):54.
244. Zissis KD, Brereton RG, Dunkerley S, Escott REA. Two-way, unfolded three-
way and three-mode partial least squares calibration of diode array HPLC
chromatograms for the quantitation of low-level pharmaceutical impurities. Analytica
Chimica Acta. 1999;384(1):71-81.
245. Azzouz T, Tauler R. Application of multivariate curve resolution alternating
least squares (MCR-ALS) to the quantitative analysis of pharmaceutical and
agricultural samples. Talanta. 2008;74(5):1201-10.
246. Kumar K, Mishra AK. Application of ‘multivariate curve resolution
alternating least square (MCR–ALS)’ analysis to extract pure component synchronous
fluorescence spectra at various wavelength offsets from total synchronous
fluorescence spectroscopy (TSFS) data set of dilute aqueous solutions of
fluorophores. Chemometrics and Intelligent Laboratory Systems. 2012;116(0):78-86.
247. Tauler R. Multivariate curve resolution applied to second order data.
Chemometrics and Intelligent Laboratory Systems. 1995;30(1):133-46.
248. Yohannan Panicker C, Tresa Varghese H, Philip D. FT-IR, FT-Raman and
SERS spectra of Vitamin C. Spectrochimica Acta Part A: Molecular and
Biomolecular Spectroscopy. 2006;65(3):802-4.
249. Wu W, Guo Q, Massart DL, Boucon C, de Jong S. Structure preserving
feature selection in PARAFAC using a genetic algorithm and Procrustes analysis.
Chemometrics and Intelligent Laboratory Systems. 2003;65(1):83-95.
250. Bro R. PARAFAC. Tutorial and applications. Chemometrics and Intelligent
Laboratory Systems. 1997;38(2):149-71.
251. Christensen J, Becker EM, Frederiksen CS. Fluorescence spectroscopy and
PARAFAC in the analysis of yogurt. Chemometrics and Intelligent Laboratory
Systems. 2005;75(2):201.
252. Bro R. Exploratory study of sugar production using fluorescence spectroscopy
and multi-way analysis. Chemometrics and Intelligent Laboratory Systems.
1999;46(2):133.
253. Bro R. Multi-way analysis in the food industry: models, algorithms, and
applications: Københavns Universitet, Faculty of Life Sciences, Department of Food
Science, Quality & Technology; 1998.
254. Rutledge DN, Jouan-Rimbaud Bouveresse D. Multi-way analysis of outer
product arrays using PARAFAC. Chemometrics and Intelligent Laboratory Systems.
2007;85(2):170-8.
255. Rizkallah J, Morales FJ, Ait-ameur L, Fogliano V, Hervieu A, Courel M, et al.
Front face fluorescence spectroscopy and multiway analysis for process control and
NFC prediction in industrially processed cookies. Chemometrics and Intelligent
Laboratory Systems. 2008;93(2):99-107.
256. Goodarzi M, Freitas MP. Augmented Three-mode MIA-QSAR Modeling for a
Series of Anti-HIV-1 Compounds. QSAR & Combinatorial Science.
2008;27(9):1092-7.
257. Wold S, Antti H, Lindgren F, Öhman J. Orthogonal signal correction of near-
infrared spectra. Chemometrics and Intelligent Laboratory Systems. 1998;44(1-
2):175.
258. Eriksson L, Trygg J, Johansson E, Bro R, Wold S. Orthogonal signal
correction, wavelet analysis, and multivariate calibration of complicated process
fluorescence data. Analytica Chimica Acta. 2000;420(2):181.
259. Næs T, Isaksson T, Fearn T, Davies T. A user-friendly guide to multivariate
calibration and classification: NIR publications Chichester, UK; 2002.
260. Savitzky A, Golay MJE. Smoothing and differentiation of data by simplified
least squares procedures. Analytical Chemistry. 1964;36(8):1627-39.
261. Sun D-W. Infrared spectroscopy for food quality analysis and control:
Academic Press; 2009.
262. Candolfi A, De Maesschalck R, Jouan-Rimbaud D, Hailey PA, Massart DL. The influence
of data pre-processing in the pattern recognition of excipients near-infrared spectra.
Journal of Pharmaceutical and Biomedical Analysis. 1999;21(1):115-32.
263. Helland IS, Næs T, Isaksson T. Related versions of the multiplicative scatter
correction method for preprocessing spectroscopic data. Chemometrics and Intelligent
Laboratory Systems. 1995;29(2):233-41.
264. Davies T. Back to basics: The "final" calibration-Column. Spectroscopy
Europe. 2007;19(6):15-9.
265. Bolstad BM, Irizarry RA, Åstrand M, Speed TP. A comparison of normalization
methods for high density oligonucleotide array data based on variance and bias.
Bioinformatics. 2003;19(2):185-93.
266. Meuleman W, Engwegen JYMN, Gast MCW, Beijnen JH, Reinders MJT,
Wessels LFA. Comparison of normalisation methods for surface-enhanced laser
desorption and ionisation (SELDI) time-of-flight (TOF) mass spectrometry data.
BMC bioinformatics. 2008;9(1):88.
267. Park T, Yi SG, Kang SH, Lee SY, Lee YS, Simon R. Evaluation of
normalization methods for microarray data. BMC bioinformatics. 2003;4(1):33.
268. Chong I-G, Jun C-H. Performance of some variable selection methods when
multicollinearity is present. Chemometrics and Intelligent Laboratory Systems.
2005;78(1–2):103-12.
269. Leardi R, Lupiáñez González A. Genetic algorithms applied to feature
selection in PLS regression: how and when to use them. Chemometrics and Intelligent
Laboratory Systems. 1998;41(2):195-207.
270. Mehmood T, Martens H, Sæbø S, Warringer J, Snipen L. A Partial Least
Squares based algorithm for parsimonious variable selection. Algorithms for
Molecular Biology. 2011;6(1):1-12.
271. Du YP, Liang YZ, Jiang JH, Berry RJ, Ozaki Y. Spectral regions selection to
improve prediction ability of PLS models by changeable size moving window partial
least squares and searching combination moving window partial least squares.
Analytica Chimica Acta. 2004;501(2):183.
272. Jiang J-H, Berry RJ, Siesler HW, Ozaki Y. Wavelength Interval Selection in
Multicomponent Spectral Analysis by Moving Window Partial Least-Squares
Regression with Applications to Mid-Infrared and Near-Infrared Spectroscopic Data.
Analytical Chemistry. 2002;74(14):3555.
273. Hu Y, Li B, Wang H, Osaki K, Ozaki Y. Determination of ethenzamide and
acetoaminophen in quaternary powdered samples by near infrared spectroscopy and
searching combination moving window partial least squares algorithm: effects of
spectral resolution and signal-to-noise ratio. Journal of near infrared spectroscopy.
2006;14(2):103.
274. De Groot P, Postma GJ, Melssen WJ, Buydens LMC, Deckert V, Zenobi R.
Application of principal component analysis to detect outliers and spectral deviations
in near-field surface-enhanced Raman spectra. Analytica Chimica Acta. 2001;446(1-
2):71-83.
275. Engelen S, Hubert M. Detecting outlying samples in a parallel factor analysis
model. Analytica Chimica Acta. 2011;705(1-2):155-65.
276. Mulukutla BC, Khan S, Lange A, Hu W-S. Glucose metabolism in
mammalian cell culture: new insights for tweaking vintage pathways. Trends in
Biotechnology. 2010;28(9):476-84.
277. Lohr D, Venkov P, Zlatanova J. Transcriptional regulation in the yeast GAL gene
family: a complex genetic network. The FASEB Journal. 1995;9(9):777-87.
278. Gorman L, Mercer LP, Hennig B. Growth requirements of endothelial
cells in culture: Variations in serum and amino acid concentrations. Nutrition.
1996;12(4):266-70.
279. Coster J, McCauley R, Hall J. Glutamine: metabolism and application in nutrition
support. Asia Pacific journal of clinical nutrition. 2004;13(1):25-31.
280. Dick LW Jr, Kakaley JA, Mahon D, Qiu D, Cheng KC. Investigation of proteins
and peptides from yeastolate and subsequent impurity testing of drug product.
Biotechnology progress. 2009;25(2):570-7.
281. Shen CF, Kiyota T, Jardin B, Konishi Y, Kamen A. Characterization of yeastolate
fractions that promote insect cell growth and recombinant protein production.
Cytotechnology. 2007;54(1):25-34.
282. Lee PC, Meisel D. Adsorption and surface-enhanced Raman of dyes on silver
and gold sols. The Journal of Physical Chemistry. 1982;86(17):3391-5.
283. Bright RM, Seney CS, Yelverton JC, Eanes S, Patel V, Riggs J,
Wright S. Use of surface-enhanced Raman spectroscopy in inorganic
syntheses for an upper-level exploratory lab. Journal of chemical education.
2007;84(1):132.
284. Hora JL, Patten BM, Fazio GG, Glaccum WJ, editors. The effects of cosmic
rays and solar flares on the IRAC detectors: the first two years of in-flight operation.
Astronomical Telescopes and Instrumentation; 2006: International Society for Optics
and Photonics.
285. Ryder AG, Vincentis JD, Li B, Ryan PW, Sirimuthu NMS, Leister KJ. A
stainless steel multi-well plate (SS-MWP) for high-throughput Raman analysis of
dilute solutions. Journal of Raman Spectroscopy. 2010;41(10):1266-75.
286. Lorber A. Error propagation and figures of merit for quantification by solving
matrix equations. Analytical Chemistry. 1986;58(6):1167-72.
287. Avetisyan A, Jensen JB, Huser T. Monitoring Trehalose Uptake and
Conversion by Single Bacteria using Laser Tweezers Raman Spectroscopy. Analytical
Chemistry. 2013;85(15):7264-70.
288. Avzianova E, Brooks SD. Raman spectroscopy of glyoxal oligomers in
aqueous solutions. Spectrochimica Acta Part A: Molecular and Biomolecular
Spectroscopy. 2012.
289. Bell A, Hecht L, Barron L. Vibrational Raman optical activity of
disaccharides. Journal of molecular structure. 1995;349:401-4.
290. Islam MT, Ciotti S, Ackermann C. The potential of Raman spectroscopy as a
process analytical technique during formulations of topical gels and emulsions.
Pharmaceutical research. 2004;21(10):1844-51.
291. Bonnier F, Mehmood A, Knief P, Meade A, Hornebeck W, Lambkin H, et al.
In vitro analysis of immersed human tissues by Raman microspectroscopy. Journal of
Raman Spectroscopy. 2011;42(5):888-96.
292. Giles JH, Gilmore DA, Denton MB. Quantitative analysis using Raman
spectroscopy without spectral standardization. Journal of Raman Spectroscopy.
1999;30(9):767-71.
293. Berger AJ, Itzkan I, Feld MS. Feasibility of measuring blood glucose
concentration by near-infrared Raman spectroscopy. Spectrochimica Acta Part A:
Molecular and Biomolecular Spectroscopy. 1997;53(2):287.
294. Zheng X, Baker H, Hancock WS, Fawaz F, McCaman M, Pungor E.
Proteomic analysis for the assessment of different lots of fetal bovine serum as a raw
material for cell culture. Part IV. Application of proteomics to the manufacture of
biological drugs. Biotechnology progress. 2006;22(5):1294-300.
295. Øvergaard S, Fernández Pierna J, Baeten V, Dardenne P, Isaksson T.
Prediction error improvements using variable selection on small calibration sets—a
comparison of some recent methods. Journal of Near Infrared Spectroscopy.
2012;20(3):329.
296. MacPhail R, Strauss H, Snyder R, Elliger C. Carbon-hydrogen stretching
modes and the structure of n-alkyl chains. 2. Long, all-trans chains. The Journal of
Physical Chemistry. 1984;88(3):334-41.
297. Notingher I, Verrier S, Haque S, Polak JM, Hench LL. Spectroscopic study of
human lung epithelial cells (A549) in culture: Living cells versus dead cells.
Biopolymers. 2003;72(4):230-40.
298. Crowley J, Arnold SA, Wood N, Harvey LM, McNeil B. Monitoring a high
cell density recombinant Pichia pastoris fed-batch bioprocess using transmission and
reflectance near infrared spectroscopy. Enzyme and microbial technology. 2005;36(5–
6):621-8.
299. Weil BR, Abarbanell AM, Herrmann JL, Wang Y, Meldrum DR. High
glucose concentration in cell culture medium does not acutely affect human
mesenchymal stem cell growth factor production or proliferation. American Journal
of Physiology-Regulatory, Integrative and Comparative Physiology.
2009;296(6):R1735.
300. McShane MJ, Cote GL. Near-infrared spectroscopy for determination of
glucose, lactate, and ammonia in cell culture media. Applied spectroscopy.
1998;52(8):1073-8.
301. Hashimoto A, Kameoka T. Mid-infrared spectroscopic determination of sugar
contents in plant-cell culture media using an ATR method. Applied spectroscopy.
2000;54(7):1005-11.
302. Nogueira HIS. Surface-enhanced Raman scattering (SERS) of 3-
aminosalicylic and 2-mercaptonicotinic acids in silver colloids. Spectrochimica Acta
Part A: Molecular and Biomolecular Spectroscopy. 1998;54(10):1461-70.
303. Bell SE, Sirimuthu NM. Surface-enhanced Raman spectroscopy (SERS) for
sub-micromolar detection of DNA/RNA mononucleotides. Journal of the American
Chemical Society. 2006;128(49):15580-1.
304. Shadi I, Chowdhry B, Snowden M, Withnall R. Semi-quantitative analysis of
indigo by surface enhanced resonance Raman spectroscopy (SERRS) using silver
colloids. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy.
2003;59(10):2213-20.
305. Stamplecoskie KG, Scaiano JC, Tiwari VS, Anis H. Optimal size of silver
nanoparticles for surface-enhanced Raman spectroscopy. The Journal of Physical
Chemistry C. 2011;115(5):1403-9.
306. Mock JJ, Barbic M, Smith DR, Schultz DA, Schultz S. Shape effects in
plasmon resonance of individual colloidal silver nanoparticles. The Journal of
Chemical Physics. 2002;116(15):6755-9.
307. Larmour I, Faulds K, Graham D. SERS activity and stability of the most
frequently used silver colloids. Journal of Raman Spectroscopy. 2012;43(2):202-6.
308. Brehm G, Sauer G, Fritz N, Schneider S, Zaitsev S. Correlation spectroscopy
based on non-linear response of silver colloids (including SEHRS). Journal of
molecular structure. 2005;735:85-102.
309. Doering WE, Nie S. Single-molecule and single-nanoparticle SERS:
examining the roles of surface active sites and chemical enhancement. The Journal of
Physical Chemistry B. 2002;106(2):311-7.
310. Kudelski A. Some aspects of SERS temporal fluctuations: analysis of the most
intense spectra of hydrogenated amorphous carbon deposited on silver. Journal of
Raman Spectroscopy. 2007;38(11):1494-9.
311. Weiss A, Haran G. Time-dependent single-molecule Raman scattering as a
probe of surface dynamics. The Journal of Physical Chemistry B.
2001;105(49):12348-54.
312. Socrates G. Infrared and Raman characteristic group frequencies: tables and
charts: John Wiley & Sons; 2004.
313. Giese B, McNaughton D. Density functional theoretical (DFT) and surface-
enhanced Raman spectroscopic study of guanine and its alkylated derivatives. Part 2:
Surface-enhanced Raman scattering on silver surfaces. Physical Chemistry Chemical
Physics. 2002;4(20):5171-82.
314. Jing C, Fang Y. Experimental (SERS) and theoretical (DFT) studies on the
adsorption behaviors of l-cysteine on gold/silver nanoparticles. Chemical Physics.
2007;332(1):27-32.
315. Saha A, Palmal S, Jana NR. Highly reproducible and sensitive surface-
enhanced Raman scattering from colloidal plasmonic nanoparticle via stabilization of
hot spots in graphene oxide liquid crystal. Nanoscale. 2012;4(20):6649-57.
316. Seballos L, Richards N, Stevens DJ, Patel M, Kapitzky L, Lokey S, et al.
Competitive binding effects on surface-enhanced Raman scattering of peptide
molecules. Chemical Physics Letters. 2007;447(4):335-9.
317. Tripp RA, Dluhy RA, Zhao Y. Novel nanostructures for SERS biosensing.
Nano Today. 2008;3(3–4):31-7.
318. Stewart S, Fredericks P. Surface-enhanced Raman spectroscopy of amino
acids adsorbed on an electrochemically prepared silver surface. Spectrochimica Acta
Part A: Molecular and Biomolecular Spectroscopy. 1999;55(7-8):1641-60.
319. Giese B, McNaughton D. Surface-Enhanced Raman Spectroscopic and
Density Functional Theory Study of Adenine Adsorption to Silver Surfaces. The
Journal of Physical Chemistry B. 2001;106(1):101-12.
320. VanDusen WJ, Fu J, Bailey FJ, Burke CJ, Herber WK, George HA. Adenine
quantitation in yeast extracts and fermentation media and its relationship to protein
expression and cell growth in adenine auxotrophs of Saccharomyces cerevisiae.
Biotechnology progress. 1997;13(1):1-7.
321. Chen W, Westerhoff P, Leenheer JA, Booksh K. Fluorescence excitation-
emission matrix regional integration to quantify spectra for dissolved organic matter.
Environmental science & technology. 2003;37(24):5701-10.
322. Bosco MV, Larrechi MS. PARAFAC and MCR-ALS applied to the
quantitative monitoring of the photodegradation process of polycyclic aromatic
hydrocarbons using three-dimensional excitation emission fluorescent spectra:
Comparative results with HPLC. Talanta. 2007;71(4):1703-9.
323. Jaumot J, Gargallo R, de Juan A, Tauler R. A graphical user-friendly interface
for MCR-ALS: a new tool for multivariate curve resolution in MATLAB.
Chemometrics and Intelligent Laboratory Systems. 2005;76(1):101-10.
324. Marsili NR, Lista A, Fernandez Band BS, Goicoechea HC, Olivieri AC. New
method for the determination of benzoic and sorbic acids in commercial orange juices
based on second-order spectrophotometric data generated by a pH gradient flow
injection technique. Journal of agricultural and food chemistry. 2004;52(9):2479-84.
325. Zachariassen CB, Larsen J, Van den Berg F, Bro R, de Juan A, Tauler R.
Comparison of PARAFAC2 and MCR-ALS for resolution of an analytical liquid
dilution system. Chemometrics and Intelligent Laboratory Systems. 2006;83(1):13-25.
326. Bilski P, Li MY, Ehrenshaft M, Daub ME, Chignell CF. Vitamin B6
(Pyridoxine) and Its Derivatives Are Efficient Singlet Oxygen Quenchers and
Potential Fungal Antioxidants. Photochemistry and Photobiology. 2000;71(2):129-34.
327. Mote US, Patil SR, Bhosale SH, Han SH, Kolekar GB. Fluorescence
resonance energy transfer from tryptophan to folic acid in micellar media and
deionised water. Journal of Photochemistry and Photobiology B: Biology.
2011;103(1):16-21.
328. Niazi A, Yazdanipour A, Ghasemi J, Abbasi A. Determination of riboflavin in
human plasma by excitation-emission matrix fluorescence and multi-way analysis.
Journal - Chinese Chemical Society Taipei. 2006;53(2):503.
329. Wentzell PD, Nair SS, Guy RD. Three-way analysis of fluorescence spectra of
polycyclic aromatic hydrocarbons with quenching by nitromethane. Analytical
Chemistry. 2001;73(7):1408-15.
330. Baker A. Fluorescence properties of some farm wastes: implications for water
quality monitoring. Water Research. 2002;36(1):189-95.
331. Guilbault GG. Practical fluorescence: CRC Press; 1990.
332. Muñoz de la Peña A, Merás ID, Jiménez Girón A, Goicoechea H. Evaluation
of unfolded-partial least-squares coupled to residual trilinearization for four-way
calibration of folic acid and methotrexate in human serum samples. Talanta.
2007;72(4):1261-8.
333. Ranzan C, Trierweiler LF, Hitzmann B, Trierweiler JO, editors. Fluorescence
Spectroscopy As a Tool for Ethanol Fermentation On-Line Monitoring. Advanced
Control of Chemical Processes; 2012.
334. Clementschitsch F, Kern J, Pötschacher F, Bayer K. Sensor combination and
chemometric modelling for improved process monitoring in recombinant E. coli fed-
batch cultivations. Journal of Biotechnology. 2005;120(2):183-96.
335. Li H, Liang Y, Xu Q, Cao D. Key wavelengths screening using competitive
adaptive reweighted sampling method for multivariate calibration. Analytica Chimica
Acta. 2009;648(1):77-84.
336. Li Z, Lu H, Yang J, Zeng X, Zhao L, Li H, et al. Analysis of the raw serum
peptidomic pattern in glioma patients. Clinica Chimica Acta. 2013;425(0):221-6.
337. Dorigo M, Stützle T. The Ant Colony Optimization Metaheuristic:
Algorithms, Applications, and Advances. In: Glover F, Kochenberger G, editors.
Handbook of Metaheuristics. International Series in Operations Research &
Management Science. 57: Springer US; 2003. p. 250-85.
338. Parpinelli RS, Lopes HS, Freitas AA. Data mining with an ant colony
optimization algorithm. Evolutionary Computation, IEEE Transactions on.
2002;6(4):321-32.
339. Chalk S. Raw Material Variability - The biopharmaceutical industry is
developing a new approach to controlling variability in raw materials. BioPharm
International. 2014;27(4):38-9.
340. Food and Drug Administration. Guidance for Industry: Characterization and
Qualification of Cell Substrates and Other Biological Materials Used in the
Production of Viral Vaccines for Infectious Disease Indications. 2010.
341. Sharma C, Drew B, Head K, Pusuluri R, Caple MV, editors. Analytical
techniques for characterization of raw materials in cell culture media. BMC
proceedings; 2011: BioMed Central Ltd.
342. Becton, Dickinson and Company. Manual of Microbiological Culture Media.
Second Edition. 2009.
8 Appendix
Table 43 Typical Compositional information listed by Sigma Aldrich for various media
formulations (Part 2 of 3)
Component    Basal Media Eagle (g/L)    Dulbecco's Modified Eagle's Medium (g/L)    Nutrient Mixture F-12 Ham (g/L)    DMEM/F12 (g/L)    RPMI-1640 Medium (g/L)
Amino Acids
L-Alanine 0.009 0.00445
L-Arginine 0.2
L-Arginine • HCl 0.021 0.084 0.211 0.1475
L-Asparagine (anhydrous) 0.05
L-Asparagine • H2O 0.01501 0.0075
L-Aspartic Acid 0.0133 0.00665 0.02
L-Cystine • 2HCl 0.01565 0.03129 0.0652
L-Cysteine • HCl • H2O 0.035 0.01756
L-Glutamic Acid 0.0147 0.00735 0.02
L-Glutamine 0.365
Glycine 0.03 0.00751 0.01875 0.01
L-Histidine (free base) 0.008 0.015
L-Histidine • HCl • H2O 0.042 0.03148
L-Histidine • 3HCl • H2O 0.02096
Hydroxy-L-Proline 0.02
L-Isoleucine 0.026 0.105 0.00394 0.5447 0.05
L-Leucine 0.026 0.105 0.0131 0.05905 0.05
L-Lysine • HCl 0.03647 1.46 0.0365 0.09125 0.04
L-Methionine 0.0075 0.00448 0.01724 0.015
L-Phenylalanine 0.0165 0.066 0.00496 0.03548 0.015
L-Proline 0.0345 0.01725 0.02
L-Serine 0.042 0.0105 0.02625 0.03
L-Threonine 0.024 0.095 0.0119 0.05345 0.02
L-Tryptophan 0.004 0.016 0.00204 0.00902 0.005
L-Tyrosine • 2Na • 2H2O 0.02595 0.10379 0.00778 0.05579 0.02883
L-Valine 0.0235 0.094 0.0117 0.05579 0.02
Table 44 Typical Compositional information listed by Sigma Aldrich for various media
formulations (Part 3 of 3)
Component    Basal Media Eagle (g/L)    Dulbecco's Modified Eagle's Medium (g/L)    Nutrient Mixture F-12 Ham (g/L)    DMEM/F12 (g/L)    RPMI-1640 Medium (g/L)
Vitamins
D-Biotin 0.001 0.0000073 0.0000035 0.0002
Choline Chloride 0.001 0.004 0.01396 0.00898 0.003
Folic Acid 0.001 0.004 0.00132 0.00265 0.001
myo-Inositol 0.002 0.0072 0.018 0.0126 0.035
Niacinamide 0.001 0.004 0.000037 0.00202 0.001
p-Aminobenzoic Acid 0.001
D-Pantothenic Acid (hemicalcium) 0.001 0.004 0.00048 0.00224 0.00025
Pyridoxal • HCl 0.001 0.002
Pyridoxine • HCl 0.004 0.000062 0.002031 0.001
Riboflavin 0.0001 0.0004 0.000038 0.000219 0.0002
Thiamine • HCl 0.001 0.004 0.00034 0.00217 0.001
Vitamin B12 0.00136 0.00068 0.000005
Other
D-Glucose 1 4.5 1.802 3.15 2
Glutathione (reduced) 0.001
Hypoxanthine 0.00408 0.0021
Linoleic Acid 0.000084 0.000042
Phenol Red • Na 0.011 0.0159 0.0013 0.00863 0.0053
Putrescine • HCl 0.000161
Pyruvic Acid • Na 0.0159 0.11 0.11
DL-Thioctic Acid 0.00021 0.000105
Thymidine 0.00073 0.000365
Add
L-Glutamine 0.292 0.0584 0.146 0.3
Sodium Bicarbonate 2.2 1.2
8.2 Supplementary Information for Chapter Two
8.2.2 Difco TC Yeastolate UF
Table 46 Typical analysis of Difco TC Yeastolate UF, as provided by Becton, Dickinson
and Company [342].
Substance    Value    Substance    Value
Total Nitrogen (%) 10.7    Glutamine (% Free) 0.3
Amino Nitrogen (%) 6.0    Glycine (% Free) 1.3
AN/TN 0.55    Glycine (% Total) 2.9
Total Carbohydrate (mg/g) 108.2    Histidine (% Free) 0.6
Ash (%) 18.2    Histidine (% Total) 1.2
Loss on Drying (%) 0.7    Isoleucine (% Free) 1.8
NaCl (%) 0.0    Isoleucine (% Total) 3.8
pH (1% Solution) 7.0    Leucine (% Free) 2.8
Calcium (µg/g) 191    Leucine (% Total) 4.7
Iron (µg/g) 57.9    Lysine (% Free) 2.2
Magnesium (µg/g) 558    Lysine (% Total) 4.6
Potassium (µg/g) 59240    Methionine (% Free) 0.7
Sodium (µg/g) 1244    Methionine (% Total) 0.8
Chloride (%) 0.13    Phenylalanine (% Free) 2.1
Sulfate (%) 1.02    Phenylalanine (% Total) 3.6
Phosphate (%) 2.70    Proline (% Free) 0.9
Alanine (% Free) 4.8    Proline (% Total) 1.9
Alanine (% Total) 5.4    Serine (% Free) 1.6
Arginine (% Free) 1.5    Serine (% Total) 1.7
Arginine (% Total) 2.6    Threonine (% Free) 1.3
Asparagine (% Free) 1.2    Threonine (% Total) 1.6
Aspartic Acid (% Free) 1.7    Tryptophan (% Free) 0.5
Aspartic Acid (% Total) 5.4    Tyrosine (% Free) 0.5
Cystine (% Free) 0.2    Tyrosine (% Total) 0.8
Glutamic Acid (% Free) 6.8    Valine (% Free) 2.4
Glutamic Acid (% Total) 10.0    Valine (% Total) 4.1
8.3 Supplementary Information for Chapter Three
Table 49 Models generated from M1GLUR1 data after preliminary pre-processing
M1GLUR1 AVG LV Correlation RMSEC (g/L) RMSECV (g/L) REP (%)
R1 AVG Full 3 0.985 1.70 1.86 7.26
R1 AVG ROI 3 0.991 1.34 1.55 6.05
R1 AVG MWPLS 3 0.991 1.38 1.57 6.13
R1 BC Full 2 0.985 1.77 1.94 7.57
R1 BC ROI 2 0.990 1.42 1.56 6.09
R1 BC MWPLS 2 0.989 1.47 1.61 6.28
R1 WE Full 3 0.993 1.22 1.51 5.89
R1 WE ROI 2 0.992 1.37 1.53 5.97
R1 WE MWPLS 1 0.990 1.50 1.60 6.25
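The REP column in Table 49 appears to be consistent with the RMSECV expressed as a percentage of the mean reference glucose concentration. The following is a minimal sketch of that relationship only. The function name and the mean concentration of 25.62 g/L are assumptions: the mean is back-calculated from the tabulated RMSECV and REP values and is not stated in this appendix.

```python
def relative_error_of_prediction(rmsecv, mean_reference):
    """Return REP (%) as RMSECV relative to the mean reference value."""
    if mean_reference <= 0:
        raise ValueError("mean_reference must be positive")
    return 100.0 * rmsecv / mean_reference


# Assumed mean glucose concentration (g/L); back-calculated, not from the source.
ASSUMED_MEAN_GLUCOSE = 25.62

# RMSECV values (g/L) copied from three rows of Table 49.
table_49_rmsecv = {
    "R1 AVG Full": 1.86,
    "R1 AVG ROI": 1.55,
    "R1 WE Full": 1.51,
}

# REP (%) recomputed for each model under the assumed mean.
rep_values = {
    model: round(relative_error_of_prediction(value, ASSUMED_MEAN_GLUCOSE), 2)
    for model, value in table_49_rmsecv.items()
}

for model, rep in rep_values.items():
    print(f"{model}: REP = {rep:.2f} %")
```

Under this assumption the recomputed values can be compared directly with the tabulated REP entries (7.26, 6.05 and 5.89 % for the three rows used here).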
Table 51 Review of the calibration models for the M1GluR1MSC data.
M1GluR1 MSC LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 2 0.994 1.06 1.18
AVG ROI 3 0.996 0.87 0.98
AVG ROI (1) 3 0.993 1.22 1.39
AVG MWPLS 3 0.997 0.76 0.87
AVG MWPLS (1) 2 0.990 1.43 1.56
BC Full 2 0.993 1.16 1.26
BC ROI 2 0.995 0.98 1.05
BC ROI (1) 2 0.996 0.95 1.03
BC MWPLS 2 0.995 0.98 1.05
BC MWPLS(1) 2 0.996 0.90 0.97
WE Full 5 0.805 6.31 21.77
WE ROI 4 0.701 7.82 14.31
WE ROI (1) 4 0.694 7.91 10.86
WE MWPLS 5 0.779 6.72 10.16
WE MWPLS (1) 4 0.680 8.10 12.80
Table 52 Review of the calibration models for the M1GluR3 MSC data.
M1GluR3 MSC LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 3 1.00 0.28 0.31
AVG ROI 3 0.999 0.43 0.48
AVG ROI (1) 3 0.999 0.39 0.43
AVG MWPLS 3 0.999 0.37 0.42
AVG MWPLS (1) 3 0.996 0.94 1.09
BC Full 3 0.999 0.40 0.45
BC ROI 3 0.999 0.39 0.43
BC ROI (1) 3 0.998 0.66 0.77
BC MWPLS 3 0.999 0.39 0.43
BC MWPLS(1) 2 0.997 0.82 0.90
WE Full 3 0.787 6.60 7.46
WE ROI 3 0.807 6.29 6.86
WE ROI (1) 2 0.673 8.18 9.04
WE MWPLS 3 0.794 6.49 7.18
WE MWPLS (1) 2 0.654 0.41 9.19
Table 53 Calibration Evaluation for Norm 1 signal correction on the M1GluR1 data.
M1GluR1 Norm1 LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 3 0.992 1.31 1.48
AVG ROI 3 0.994 1.06 1.24
AVG ROI (1) 3 0.993 1.22 1.43
AVG MWPLS 2 0.990 1.42 1.51
AVG MWPLS (1) 2 0.988 1.55 1.64
BC Full 2 0.995 1.15 1.24
BC ROI 2 0.994 1.10 1.18
BC ROI (1) 2 0.995 1.00 1.08
BC MWPLS 2 0.994 1.13 1.20
BC MWPLS(1) 2 0.996 0.93 0.99
WE Full 5 0.886 4.84 8.79
WE ROI 4 0.772 6.93 8.66
WE ROI (1) 4 0.793 6.63 8.33
WE MWPLS 5 0.848 5.60 8.13
WE MWPLS (1) 5 0.870 5.18 8.11
Table 54 Calibration Evaluation for Norm 1 signal correction on the M1GluR3 data.
M1GluR3 Norm1 LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 2 0.996 0.85 0.91
AVG ROI 3 0.999 0.46 0.52
AVG ROI (1) 3 0.999 0.54 0.61
AVG MWPLS 3 0.999 0.45 0.50
AVG MWPLS (1) 3 0.998 0.59 0.67
BC Full 3 0.997 0.80 0.89
BC ROI 3 0.997 0.76 0.86
BC ROI (1) 3 0.949 3.29 3.70
BC MWPLS 3 0.997 0.72 0.81
BC MWPLS(1) 3 0.954 3.11 3.51
WE Full 4 0.887 4.82 6.84
WE ROI 3 0.825 5.98 6.56
WE ROI (1) 3 0.735 7.65 8.59
WE MWPLS 3 0.823 6.03 6.69
WE MWPLS (1) 3 0.727 7.65 8.42
Table 55 Calibration Evaluation for Norm 2 signal correction on the M1GluR1 data.
M1GluR1 Norm2 LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 3 0.993 1.17 1.28
AVG ROI 3 0.996 0.89 1.02
AVG ROI (1) 3 0.993 1.21 1.42
AVG MWPLS 3 0.996 0.86 0.98
AVG MWPLS (1) 2 0.988 1.55 1.65
BC Full 2 0.994 1.07 1.16
BC ROI 3 0.997 0.71 0.85
BC ROI (1) 2 0.996 0.91 0.98
BC MWPLS 3 0.997 0.74 0.85
BC MWPLS(1) 2 0.997 0.82 0.88
WE Full 5 0.869 5.17 9.25
WE ROI 4 0.763 7.00 8.64
WE ROI (1) 5 0.882 4.93 7.94
WE MWPLS 5 0.844 5.66 8.11
WE MWPLS (1) 5 0.863 5.31 7.91
Table 56 Calibration Evaluation for Norm 2 signal correction on the M1GluR3 data.
M1GluR3 Norm2 LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 2 0.996 0.86 0.92
AVG ROI 3 0.999 0.42 0.47
AVG ROI (1) 3 0.999 0.54 0.61
AVG MWPLS 3 0.999 0.40 0.45
AVG MWPLS (1) 3 0.998 0.60 0.68
BC Full 3 0.998 0.67 0.74
BC ROI 3 0.999 0.49 0.54
BC ROI (1) 3 0.961 2.85 3.23
BC MWPLS 3 0.999 0.42 0.48
BC MWPLS(1) 3 0.960 2.88 3.27
WE Full 3 0.830 5.90 6.50
WE ROI 3 0.841 5.71 6.12
WE ROI (1) 3 0.784 6.71 7.40
WE MWPLS 3 0.832 5.88 6.49
WE MWPLS (1) 3 0.769 6.90 7.58
Table 57 Calibration Evaluation for Norm INF signal correction on the M1GluR1 data.
M1GluR1 Norm INF LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 3 0.993 1.17 1.27
AVG ROI 3 0.996 0.93 1.06
AVG ROI (1) 3 0.993 1.18 1.38
AVG MWPLS 3 0.996 0.92 1.04
AVG MWPLS (1) 2 0.988 1.56 1.66
BC Full 3 0.998 0.66 0.83
BC ROI 3 0.996 0.87 1.02
BC ROI (1) 2 0.994 1.09 1.20
BC MWPLS 3 0.996 0.89 1.02
BC MWPLS(1) 2 0.994 1.08 1.18
WE Full 5 0.841 5.70 10.38
WE ROI 4 0.750 7.46 8.46
WE ROI (1) 4 0.734 7.38 8.66
WE MWPLS 4 0.760 7.01 8.22
WE MWPLS (1) 4 0.742 7.26 8.39
Table 58 Calibration Evaluation for Norm INF signal correction on the M1GluR3 data.
M1GluR3 Norm INF LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 3 0.999 0.42 0.60
AVG ROI 3 0.999 0.49 0.55
AVG ROI (1) 3 0.997 0.78 0.88
AVG MWPLS 3 0.999 0.48 0.54
AVG MWPLS (1) 3 0.997 0.77 0.87
BC Full 3 0.997 0.74 0.81
BC ROI 3 0.999 0.51 0.57
BC ROI (1) 3 0.967 2.59 2.98
BC MWPLS 3 0.999 0.46 0.51
BC MWPLS(1) 3 0.967 2.59 2.98
WE Full 3 0.648 8.66 9.81
WE ROI 3 0.702 7.83 8.53
WE ROI (1) 3 0.672 8.36 9.31
WE MWPLS 3 0.672 8.21 9.21
WE MWPLS (1) 4 0.702 7.93 8.96
Table 59 Calibration Evaluation for First Derivative signal correction on the M1GluR1 data.
M1GluR1 FD LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 2 0.991 1.34 1.45
AVG ROI 2 0.992 1.28 1.39
AVG ROI (1) 2 0.992 1.28 1.39
AVG MWPLS 2 0.992 1.28 1.39
AVG MWPLS (1) 2 0.992 1.28 1.39
BC Full 2 0.991 1.33 1.45
BC ROI 2 0.992 1.28 1.39
BC ROI (1) 2 0.992 1.28 1.38
BC MWPLS 2 0.992 1.28 1.39
BC MWPLS(1) 2 0.992 1.28 1.39
WE Full 2 0.994 1.08 1.39
WE ROI 1 0.992 1.32 1.40
WE ROI (1) 1 0.992 1.32 1.40
WE MWPLS 1 0.992 1.32 1.40
WE MWPLS (1) 1 0.992 1.33 1.40
Table 60 Calibration Evaluation for First Derivative signal correction on the M1GluR3 data.
M1GluR3 FD LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 3 0.977 2.16 3.02
AVG ROI 3 0.984 1.80 2.10
AVG ROI (1) 3 0.984 1.80 2.10
AVG MWPLS 3 0.983 1.84 2.15
AVG MWPLS (1) 3 0.983 1.86 2.16
BC Full 3 0.978 2.14 2.98
BC ROI 3 0.986 1.70 2.00
BC ROI (1) 3 0.986 1.68 1.98
BC MWPLS 3 0.984 1.79 2.09
BC MWPLS(1) 3 0.984 1.80 2.10
WE Full 4 0.988 1.53 2.28
WE ROI 3 0.986 1.70 2.05
WE ROI (1) 3 0.986 1.67 2.03
WE MWPLS 2 0.985 1.74 2.12
WE MWPLS (1) 3 0.985 1.76 2.13
Table 61 Calibration Evaluation for First Derivative plus MSC correction on the M1GluR1 data.
M1GluR1 FD MSC LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 2 0.998 0.69 0.74
AVG ROI 2 0.998 0.70 0.75
AVG ROI (1) 2 0.960 2.84 3.13
AVG MWPLS 2 0.998 0.70 0.75
AVG MWPLS (1) 2 0.938 3.55 3.92
BC Full 2 0.998 0.69 0.74
BC ROI 2 0.998 0.70 0.75
BC ROI (1) 2 0.960 2.84 3.13
BC MWPLS 2 0.998 0.70 0.75
BC MWPLS(1) 2 0.938 3.55 3.92
WE Full 4 0.704 7.78 11.33
WE ROI 4 0.724 7.52 9.92
WE ROI (1) 4 0.785 6.62 9.33
WE MWPLS 4 0.705 7.77 10.99
WE MWPLS (1) 4 0.726 7.48 10.17
Table 62 Calibration Evaluation for First Derivative plus MSC correction on the M1GluR3 data.
M1GluR3 FD MSC LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 2 0.999 0.35 0.38
AVG ROI 2 1.000 0.29 0.31
AVG ROI (1) 2 0.969 2.53 2.78
AVG MWPLS 2 1.000 0.29 0.31
AVG MWPLS (1) 2 0.948 3.27 3.60
BC Full 2 0.999 0.35 0.38
BC ROI 2 1.000 0.29 0.31
BC ROI (1) 2 0.969 2.53 2.78
BC MWPLS 2 1.000 0.29 0.31
BC MWPLS(1) 2 0.948 3.27 3.60
WE Full 2 0.613 8.90 9.94
WE ROI 2 0.643 8.54 9.59
WE ROI (1) 2 0.536 9.74 11.48
WE MWPLS 2 0.642 8.56 9.61
WE MWPLS (1) 2 0.534 9.76 11.46
Table 63 Calibration Evaluation for MSC plus First Derivative on the M1GluR1 data.
M1GluR1 MSC FD LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 2 0.998 0.67 0.73
AVG ROI 2 0.998 0.69 0.73
AVG ROI (1) 3 0.993 1.15 1.40
AVG MWPLS 2 0.998 0.69 0.74
AVG MWPLS (1) 3 0.993 1.20 1.45
BC Full 2 0.998 0.64 0.69
BC ROI 2 0.998 0.67 0.72
BC ROI (1) 2 0.994 1.14 1.21
BC MWPLS 2 0.998 0.68 0.72
BC MWPLS(1) 3 0.997 0.84 1.06
WE Full 4 0.646 8.51 26.69
WE ROI 4 0.706 7.76 13.30
WE ROI (1) 4 0.676 8.14 12.46
WE MWPLS 4 0.701 7.82 22.71
WE MWPLS (1) 4 0.688 7.99 20.31
Table 64 Calibration Evaluation for MSC plus First Derivative on the M1GluR3 data.
M1GluR3 MSC FD LV Correlation RMSEC (g/L) RMSECV (g/L)
AVG Full 2 0.999 0.34 0.37
AVG ROI 2 0.999 0.37 0.40
AVG ROI (1) 2 0.999 0.32 0.35
AVG MWPLS 2 0.999 0.37 0.40
AVG MWPLS (1) 3 0.995 0.97 1.14
BC Full 2 0.999 0.33 0.36
BC ROI 2 0.996 0.85 0.92
BC ROI (1) 2 0.999 0.32 0.34
BC MWPLS 2 0.996 0.94 1.02
BC MWPLS(1) 2 0.999 0.33 0.35
WE Full 2 0.717 7.78 8.63
WE ROI 4 0.851 5.52 7.77
WE ROI (1) 2 0.637 8.62 9.43
WE MWPLS 3 0.817 6.12 7.56
WE MWPLS (1) 2 0.648 8.48 9.34
8.3.2 Calibration Models For M3Glu Data (Replicate Models)
Table 65 Calibration Evaluation for the M3Glu data, with preliminary pre-processing.
M3Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
Avg R1 Data Full 8 0.993 0.25 0.83
Avg R1 Data ROI 6 0.989 0.31 0.50
Avg R1 Data MWPLS 3 0.927 0.79 0.89
Avg R2 Data Full 4 0.938 0.73 1.25
Avg R2 Data ROI 4 0.989 0.30 0.40
Avg R2 Data MWPLS 4 0.989 0.31 0.40
Avg R3 Data Full 5 0.980 0.41 0.92
Avg R3 Data ROI 5 0.981 0.40 0.61
Avg R3 Data MWPLS 5 0.979 0.42 0.57
M3Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
BCR1 Data Full 6 0.986 0.35 0.84
BCR1 Data ROI 4 0.981 0.41 0.65
BCR1 Data MWPLS 5 0.992 0.26 0.40
BCR2 Data Full 4 0.964 0.55 0.96
BCR2 Data ROI 4 0.988 0.32 0.42
BCR2 Data MWPLS 4 0.989 0.30 0.38
BCR3 Data Full 6 0.975 0.46 0.97
BCR3 Data ROI 6 0.981 0.41 0.61
BCR3 Data MWPLS 6 0.982 0.39 0.58
M3Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
WE R1 Data Full 5 0.980 0.42 0.90
WE R1 Data ROI 5 0.993 0.25 0.45
WE R1 Data MWPLS 5 0.994 0.23 0.39
WE R2 Data Full 4 0.966 0.54 0.89
WE R2 Data ROI 4 0.990 0.29 0.40
WE R2 Data MWPLS 4 0.990 0.28 0.38
WE R3 Data Full 5 0.966 0.52 0.97
WE R3 Data ROI 5 0.981 0.39 0.57
WE R3 Data MWPLS 5 0.979 0.43 0.56
Table 66 Calibration Evaluation for the M3Glu data, with MSC pre-processing.
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
Avg R1 MSC Full 6 0.983 0.38 0.84
Avg R1 MSC ROI 5 0.992 0.26 0.42
Avg R1 MSC MWPLS 5 0.997 0.16 0.36
Avg R2 MSC Full 4 0.955 0.62 1.10
Avg R2 MSC ROI 4 0.991 0.27 0.35
Avg R2 MSC MWPLS 4 0.992 0.26 0.32
Avg R3 MSC Full 6 0.988 0.32 0.78
Avg R3 MSC ROI 5 0.997 0.17 0.26
Avg R3 MSC MWPLS 5 0.997 0.16 0.23
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
BCR1 MSC Full 5 0.981 0.40 0.84
BCR1 MSC ROI 4 0.984 0.37 0.56
BCR1 MSC MWPLS 5 0.997 0.15 0.29
BCR2 MSC Full 4 0.968 0.52 0.96
BCR2 MSC ROI 4 0.993 0.25 0.32
BCR2 MSC MWPLS 4 0.993 0.25 0.30
BCR3 MSC Full 5 0.980 0.41 0.85
BCR3 MSC ROI 6 0.997 0.16 0.26
BCR3 MSC MWPLS 6 0.997 0.16 0.24
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
WE R1 MSC Full 6 0.959 0.58 1.48
WE R1 MSC ROI 4 0.910 0.85 1.30
WE R1 MSC MWPLS 4 0.867 .07 1.91
WE R2 MSC Full 5 0.947 0.68 1.35
WE R2 MSC ROI 5 0.957 0.61 1.00
WE R2 MSC MWPLS 4 0.944 0.69 0.90
WE R3 MSC Full 6 0.956 0.62 1.28
WE R3 MSC ROI 5 0.963 0.56 1.00
WE R3 MSC MWPLS 6 0.968 0.52 0.95
Table 67 Calibration Evaluation for the M3Glu data, with Norm2 pre-processing.
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
Avg R1 Norm2 Full 8 0.994 0.23 0.77
Avg R1 Norm2 ROI 6 0.996 0.18 0.37
Avg R1 Norm2 MWPLS 5 0.989 0.30 0.50
Avg R2 Norm2 Full 5 0.971 0.50 0.91
Avg R2 Norm2 ROI 4 0.993 0.25 0.34
Avg R2 Norm2 MWPLS 4 0.992 0.26 0.34
Avg R3 Norm2 Full 5 0.979 0.43 0.92
Avg R3 Norm2 ROI 5 0.996 0.18 0.29
Avg R3 Norm2 MWPLS 5 0.996 0.18 0.26
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
BCR1 Norm2 Full 6 0.989 0.31 0.73
BCR1 Norm2 ROI 5 0.998 0.14 0.29
BCR1 Norm2 MWPLS 5 0.997 0.17 0.29
BCR2 Norm2 Full 5 0.986 0.35 0.74
BCR2 Norm2 ROI 4 0.991 0.27 0.36
BCR2 Norm2 MWPLS 4 0.993 0.25 0.32
BCR3 Norm2 Full 6 0.982 0.39 0.80
BCR3 Norm2 ROI 6 0.997 0.16 0.26
BCR3 Norm2 MWPLS 6 0.997 0.16 0.24
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
WE R1 Norm2 Full 6 0.975 0.46 0.99
WE R1 Norm2 ROI 4 0.953 0.64 0.88
WE R1 Norm2 MWPLS 4 0.951 0.65 0.84
WE R2 Norm2 Full 5 0.945 0.69 1.41
WE R2 Norm2 ROI 4 0.954 0.63 0.80
WE R2 Norm2 MWPLS 4 0.951 0.65 0.80
WE R3 Norm2 Full 4 0.906 0.90 1.23
WE R3 Norm2 ROI 5 0.967 0.53 0.79
WE R3 Norm2 MWPLS 6 0.983 0.38 0.64
Table 68 Calibration Evaluation for the M3Glu data, with First Derivative pre-processing.
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
Avg R1 FST Full 6 0.994 0.22 0.49
Avg R1 FST ROI 4 0.994 0.23 0.32
Avg R1 FST MWPLS 3 0.988 0.31 0.38
Avg R2 FST Full 6 0.997 0.15 0.32
Avg R2 FST ROI 3 0.993 0.25 0.30
Avg R2 FST MWPLS 3 0.993 0.25 0.30
Avg R3 FST Full 5 0.986 0.35 0.68
Avg R3 FST ROI 4 0.983 0.38 0.52
Avg R3 FST MWPLS 4 0.983 0.38 0.51
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
BCR1 FST Full 6 0.994 0.22 0.49
BCR1 FST ROI 3 0.990 0.29 0.35
BCR1 FST MWPLS 3 0.990 0.29 0.35
BCR2 FST Full 6 0.997 0.15 0.32
BCR2 FST ROI 3 0.993 0.25 0.30
BCR2 FST MWPLS 3 0.993 0.25 0.29
BCR3 FST Full 5 0.986 0.35 0.68
BCR3 FST ROI 4 0.982 0.39 0.54
BCR3 FST MWPLS 4 0.983 0.39 0.53
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
WE R1 FST Full 4 0.984 0.37 0.60
WE R1 FST ROI 4 0.995 0.21 0.33
WE R1 FST MWPLS 4 0.994 0.22 0.35
WE R2 FST Full 5 0.994 0.22 0.39
WE R2 FST ROI 2 0.991 0.28 0.31
WE R2 FST MWPLS 2 0.991 0.27 0.30
WE R3 FST Full 5 0.988 0.32 0.60
WE R3 FST ROI 4 0.983 0.38 0.49
WE R3 FST MWPLS 4 0.983 0.39 0.48
Table 69 Calibration Evaluation for the M3Glu data, with First Derivative and MSC pre-
processing
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
Avg R1 FSTMSC Full 6 0.996 0.18 0.45
Avg R1 FSTMSC ROI 4 0.996 0.17 0.26
Avg R1 FSTMSC MWPLS 3 0.993 0.25 0.31
Avg R2 FSTMSC Full 6 0.997 0.16 0.32
Avg R2 FSTMSC ROI 3 0.995 0.20 0.24
Avg R2 FSTMSC MWPLS 3 0.995 0.20 0.23
Avg R3 FSTMSC Full 5 0.992 0.25 0.49
Avg R3 FSTMSC ROI 3 0.995 0.21 0.25
Avg R3 FSTMSC MWPLS 3 0.995 0.21 0.25
M3Glu LV Correlation RMSEC (g/L) RMSECV (g/L)
BCR1 FSTMSC Full 6 0.996 0.18 0.45
BCR1 FSTMSC ROI 4 0.996 0.17 0.26
BCR1 FSTMSC MWPLS 3 0.993 0.24 0.30
BCR2 FSTMSC Full 6 0.997 0.16 0.32
BCR2 FSTMSC ROI 3 0.995 0.20 0.24
BCR2 FSTMSC MWPLS 3 0.995 0.20 0.23
BCR3 FSTMSC Full 5 0.992 0.25 0.49
BCR3 FSTMSC ROI 3 0.995 0.21 0.25
BCR3 FSTMSC MWPLS 3 0.995 0.21 0.25
M3Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
WE R1 FSTMSC Full 7 0.954 0.63 1.90
WE R1 FSTMSC ROI 8 0.960 0.59 2.99
WE R1 FSTMSC MWPLS 5 0.846 1.21 2.95
WE R2 FSTMSC Full 4 0.941 0.71 0.99
WE R2 FSTMSC ROI 2 0.935 0.75 0.84
WE R2 FSTMSC MWPLS 2 0.934 0.76 0.84
WE R3 FSTMSC Full 5 0.949 0.66 1.12
WE R3 FSTMSC ROI 4 0.948 0.67 0.99
WE R3 FSTMSC MWPLS 4 0.942 0.71 1.01
8.3.3 Calibration Models For M5Glu Data (Replicate Models)
Table 70 Calibration Evaluation for the M5Glu data, with preliminary pre-processing.
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
Avg R1 Data Full 4 0.965 0.55 0.66
Avg R1 Data ROI 5 0.971 0.50 0.59
Avg R1 Data MWPLS 4 0.971 0.50 0.59
Avg R2 Data Full 4 0.955 0.63 0.94
Avg R2 Data ROI 4 0.957 0.61 0.77
Avg R2 Data MWPLS 4 0.961 0.59 0.72
Avg R3 Data Full 4 0.972 0.49 0.89
Avg R3 Data ROI 5 0.982 0.39 0.72
Avg R3 Data MWPLS 4 0.971 0.50 0.68
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
BCR1 Data Full 4 0.983 0.38 0.50
BCR1 Data ROI 4 0.975 0.46 0.57
BCR1 Data MWPLS 4 0.977 0.45 0.52
BCR2 Data Full 4 0.969 0.52 0.81
BCR2 Data ROI 3 0.960 0.59 0.69
BCR2 Data MWPLS 3 0.960 0.60 0.69
BCR3 Data Full 4 0.955 0.62 1.08
BCR3 Data ROI 3 0.953 0.64 0.80
BCR3 Data MWPLS 3 0.955 0.62 0.76
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
WE R1 Data Full 6 0.991 0.28 0.68
WE R1 Data ROI 4 0.968 0.52 0.68
WE R1 Data MWPLS 5 0.989 0.31 0.48
WE R2 Data Full 4 0.964 0.56 1.05
WE R2 Data ROI 4 0.963 0.57 0.74
WE R2 Data MWPLS 3 0.961 0.59 0.73
WE R3 Data Full 4 0.966 0.54 1.10
WE R3 Data ROI 4 0.967 0.53 0.78
WE R3 Data MWPLS 4 0.966 0.54 0.80
Table 71 Calibration Evaluation for the M5Glu data, with MSC pre-processing.
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
Avg R1 MSC Full 4 0.980 0.42 0.55
Avg R1 MSC ROI 5 0.989 0.30 0.40
Avg R1 MSC MWPLS 5 0.991 0.27 0.34
Avg R2 MSC Full 4 0.973 0.49 0.82
Avg R2 MSC ROI 3 0.961 0.58 0.69
Avg R2 MSC MWPLS 3 0.966 0.55 0.64
Avg R3 MSC Full 4 0.988 0.32 0.62
Avg R3 MSC ROI 4 0.985 0.36 0.46
Avg R3 MSC MWPLS 4 0.990 0.29 0.41
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
BCR1 MSC Full 4 0.985 0.35 0.47
BCR1 MSC ROI 3 0.981 0.40 0.50
BCR1 MSC MWPLS 4 0.990 0.29 0.35
BCR2 MSC Full 4 0.970 0.51 0.82
BCR2 MSC ROI 3 0.971 0.50 0.59
BCR2 MSC MWPLS 3 0.978 0.44 0.52
BCR3 MSC Full 4 0.972 0.49 0.85
BCR3 MSC ROI 4 0.991 0.27 0.39
BCR3 MSC MWPLS 4 0.991 0.28 0.39
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
WE R1 MSC Full 6 0.955 0.62 1.16
WE R1 MSC ROI 6 0.966 0.54 1.02
WE R1 MSC MWPLS 5 0.932 0.76 1.04
WE R2 MSC Full 6 0.964 0.56 1.41
WE R2 MSC ROI 5 0.967 0.53 0.81
WE R2 MSC MWPLS 3 0.928 0.80 0.93
WE R3 MSC Full 5 0.959 0.59 1.55
WE R3 MSC ROI 5 0.970 0.51 1.00
WE R3 MSC MWPLS 5 0.962 0.57 1.02
Table 72 Calibration Evaluation for the M5Glu data, with Norm2 pre-processing.
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
Avg R1 Norm2 Full 4 0.970 0.50 0.61
Avg R1 Norm2 ROI 5 0.989 0.31 0.37
Avg R1 Norm2 MWPLS 4 0.989 0.30 0.37
Avg R2 Norm2 Full 5 0.974 0.47 0.83
Avg R2 Norm2 ROI 5 0.984 0.38 0.61
Avg R2 Norm2 MWPLS 4 0.975 0.47 0.59
Avg R3 Norm2 Full 4 0.981 0.40 0.83
Avg R3 Norm2 ROI 5 0.995 0.20 0.34
Avg R3 Norm2 MWPLS 4 0.992 0.26 0.34
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
BCR1 Norm2 Full 5 0.995 0.19 0.33
BCR1 Norm2 ROI 4 0.989 0.31 0.40
BCR1 Norm2 MWPLS 4 0.991 0.27 0.34
BCR2 Norm2 Full 4 0.971 0.50 0.80
BCR2 Norm2 ROI 3 0.972 0.49 0.59
BCR2 Norm2 MWPLS 3 0.975 0.47 0.54
BCR3 Norm2 Full 4 0.974 0.47 0.81
BCR3 Norm2 ROI 3 0.984 0.36 0.45
BCR3 Norm2 MWPLS 3 0.987 0.33 0.40
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
WE R1 Norm2 Full 5 0.946 0.68 1.21
WE R1 Norm2 ROI 5 0.953 0.64 0.91
WE R1 Norm2 MWPLS 5 0.944 0.70 0.90
WE R2 Norm2 Full 3 0.906 0.91 1.36
WE R2 Norm2 ROI 4 0.942 0.72 1.01
WE R2 Norm2 MWPLS 3 0.918 0.85 1.00
WE R3 Norm2 Full 5 0.958 0.52 1.29
WE R3 Norm2 ROI 5 0.982 0.39 0.66
WE R3 Norm2 MWPLS 5 0.976 0.46 0.72
Table 73 Calibration Evaluation for the M5Glu data, with First Derivative pre-processing.
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
Avg R1 FST Full 4 0.986 0.34 0.47
Avg R1 FST ROI 2 0.980 0.41 0.45
Avg R1 FST MWPLS 2 0.981 0.41 0.44
Avg R2 FST Full 4 0.966 0.55 0.84
Avg R2 FST ROI 3 0.968 0.53 0.70
Avg R2 FST MWPLS 3 0.968 0.53 0.68
Avg R3 FST Full 4 0.930 0.78 1.27
Avg R3 FST ROI 3 0.975 0.47 0.59
Avg R3 FST MWPLS 3 0.973 0.48 0.59
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
BCR1 FST Full 4 0.986 0.34 0.47
BCR1 FST ROI 2 0.981 0.41 0.45
BCR1 FST MWPLS 2 0.981 0.40 0.44
BCR2 FST Full 4 0.966 0.55 0.85
BCR2 FST ROI 3 0.968 0.53 0.70
BCR2 FST MWPLS 3 0.968 0.53 0.69
BCR3 FST Full 5 0.990 0.28 0.49
BCR3 FST ROI 3 0.984 0.37 0.45
BCR3 FST MWPLS 3 0.985 0.36 0.44
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
WE R1 FST Full 5 0.990 0.28 0.49
WE R1 FST ROI 3 0.984 0.37 0.45
WE R1 FST MWPLS 3 0.985 0.36 0.44
WE R2 FST Full 4 0.964 0.57 0.91
WE R2 FST ROI 3 0.969 0.52 0.73
WE R2 FST MWPLS 3 0.969 0.52 0.71
WE R3 FST Full 6 0.967 0.53 1.07
WE R3 FST ROI 3 0.978 0.43 0.60
WE R3 FST MWPLS 3 0.977 0.45 0.61
Table 74 Calibration Evaluation for the M5Glu data, with First Derivative and MSC pre-
processing
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
Avg R1 FSTMSC Full 4 0.992 0.26 0.36
Avg R1 FSTMSC ROI 2 0.993 0.25 0.28
Avg R1 FSTMSC MWPLS 2 0.993 0.24 0.27
Avg R2 FSTMSC Full 4 0.972 0.49 0.78
Avg R2 FSTMSC ROI 2 0.969 0.52 0.57
Avg R2 FSTMSC MWPLS 2 0.981 0.41 0.54
Avg R3 FSTMSC Full 6 0.970 0.51 0.94
Avg R3 FSTMSC ROI 3 0.991 0.27 0.34
Avg R3 FSTMSC MWPLS 3 0.991 0.28 0.33
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
BCR1 FSTMSC Full 4 0.992 0.26 0.36
BCR1 FSTMSC ROI 2 0.993 0.25 0.29
BCR1 FSTMSC MWPLS 2 0.993 0.25 0.27
BCR2 FSTMSC Full 4 0.972 0.49 0.78
BCR2 FSTMSC ROI 3 0.969 0.53 0.57
BCR2 FSTMSC MWPLS 3 0.981 0.41 0.54
BCR3 FSTMSC Full 5 0.960 0.59 1.03
BCR3 FSTMSC ROI 3 0.991 0.27 0.33
BCR3 FSTMSC MWPLS 3 0.991 0.28 0.33
M5Glu LV Correlation RMSEC RMSECV
(g/L) (g/L)
WE R1 FSTMSC Full 5 0.953 0.64 0.97
WE R1 FSTMSC ROI 4 0.952 0.64 0.82
WE R1 FSTMSC MWPLS 3 0.949 0.66 0.82
WE R2 FSTMSC Full 5 0.948 0.68 1.30
WE R2 FSTMSC ROI 2 0.903 0.95 1.05
WE R2 FSTMSC MWPLS 2 0.900 0.96 1.06
WE R3 FSTMSC Full 6 0.932 0.77 1.53
WE R3 FSTMSC ROI 4 0.958 0.60 1.21
WE R3 FSTMSC MWPLS 4 0.944 0.70 1.25
8.3.4 Calibration Models For M5eRDF Data for the Conventional Raman Data
Table 75 Calibration models obtained for the M5eRDF using Raman data with no pre-processing
M5eRDF LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg Data(250-3311cm-1) 5 0.974 0.27 1.44 39.00
Avg Data_roi(707-1853cm-1) 6 0.990 0.17 0.84 22.70
Avg Data_MW(800-1680cm-1) 4 0.962 0.33 1.03 27.83
BC Data(250-3311cm-1) 5 0.992 0.15 1.18 32.08
BC Data_roi(707-1853cm-1) 5 0.983 0.22 0.80 21.75
BC Data_MW(800-1680cm-1) 5 0.980 0.24 0.70 18.91
WE Data(250-3311cm-1) 4 0.939 0.42 2.06 55.91
WE Data_roi(707-1853cm-1) 5 0.992 0.15 0.95 25.70
WE Data_MW(800-1680cm-1) 4 0.957 0.35 1.10 29.72
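The REP % column is not defined in this appendix. The following is a minimal sketch, assuming REP is the RMSECV expressed as a percentage of the mean reference concentration. The values in Table 75 are consistent with a mean of roughly 3.7 g/L (for example, 0.84 / 3.70 gives 22.70%), but that mean is inferred here from the tabulated numbers and is not stated in the source. The function names `rmse` and `rep_percent` are illustrative only.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between predicted and reference concentrations."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def rep_percent(rmse_value, mean_reference):
    """Relative error of prediction in percent (assumed definition:
    RMSE divided by the mean reference concentration, times 100)."""
    return 100.0 * rmse_value / mean_reference


# Illustrative call: an RMSECV of 0.84 g/L against an assumed mean
# reference concentration of 3.70 g/L.
print(round(rep_percent(0.84, 3.70), 2))
```

Under this assumption, REP % carries the same ranking information as RMSECV within a data set and differs only by a constant factor.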
Table 76 Calibration models obtained for the M5eRDF using Raman data with multiplicative
scatter correction.
M5eRDF LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg MSC(250-3311cm-1) 5 0.978 0.25 1.40 37.89
Avg MSC_roi(707-1853cm-1) 5 0.995 0.12 0.62 16.94
Avg MSC MW(800-1680cm-1) 5 0.985 0.21 0.69 18.64
BC MSC(250-3311cm-1) 5 0.992 0.15 1.19 32.35
BC MSC_roi(707-1853cm-1) 4 0.991 0.16 0.91 24.62
BC MSC _MW(800-1680cm-1) 4 0.982 0.23 0.63 33.51
WE MSC(250-3311cm-1) 4 0.928 0.46 2.60 70.48
WE MSC_roi(707-1853cm-1) 6 0.997 0.09 1.14 30.91
WE MSC _MW(800-1680cm-1) 5 0.966 0.31 1.24 33.51
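For readers who want to reproduce the MSC step named in the table above, here is a minimal sketch of multiplicative scatter correction. It assumes the common formulation in which each spectrum is regressed against the mean spectrum of the set and then corrected with the fitted offset and slope. The thesis software and its exact settings are not given in this appendix, so the function name `msc` and the choice of the mean spectrum as reference are assumptions.

```python
from statistics import mean


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    spectra: list of equal-length lists of intensities.
    Returns a new list of corrected spectra.
    """
    n_points = len(spectra[0])
    # Reference spectrum: point-wise mean over all spectra (assumed choice).
    reference = [mean(s[j] for s in spectra) for j in range(n_points)]
    ref_mean = mean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)

    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit of s = offset + slope * reference.
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(reference, s)) / ref_ss
        offset = s_mean - slope * ref_mean
        # Remove the additive and multiplicative effects.
        corrected.append([(v - offset) / slope for v in s])
    return corrected


base = [1.0, 2.0, 4.0, 3.0]
scaled = [[2.0 * b + 1.0 for b in base], [0.5 * b - 0.2 for b in base]]
result = msc(scaled)
```

In this toy case the two inputs differ only by an offset and a scale factor, so both corrected spectra collapse onto the same curve.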
Table 77 Calibration models obtained for the M5eRDF using Raman data with normalisation
using Norm INF
M5eRDF LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg NINF(250-3311cm-1) 5 0.973 0.28 1.42 38.43
Avg NINF roi(707-1853cm-1) 6 0.991 0.16 0.83 22.48
Avg NINF MW(800-1680cm-1) 4 0.956 0.35 1.09 29.45
BC NINF (250-3311cm-1) 5 0.991 0.16 1.22 33.00
BC NINF roi(707-1853cm-1) 5 0.988 0.18 0.72 19.62
BC NINF MW(800-1680cm-1) 4 0.976 0.26 0.75 20.27
WE NINF (250-3311cm-1) 5 0.988 0.18 2.50 67.72
WE NINF roi(707-1853cm-1) 5 0.998 0.07 1.23 33.35
WE NINF MW(800-1680cm-1) 5 0.966 0.32 1.22 32.97
Table 78 Calibration models obtained for the M5eRDF using Raman data with normalisation
using Norm 2
M5eRDF LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg Norm2 (250-3311cm-1) 5 0.975 0.27 1.42 38.37
Avg Norm2 roi(707-1853cm-1) 4 0.959 0.34 1.09 29.45
Avg Norm2 MW(800-1680cm-1) 4 0.961 0.33 1.06 28.64
BC Norm2(250-3311cm-1) 4 0.967 0.31 1.35 36.48
BC Norm2roi(707-1853cm-1) 4 0.976 0.26 0.81 21.89
BC Norm2 MW(800-1680cm-1) 4 0.974 0.27 0.77 20.81
WE Norm2(250-3311cm-1) 2 0.390 1.34 1.76 47.56
WE Norm2roi(707-1853cm-1) 5 0.986 0.20 1.18 31.89
WE Norm2 MW(800-1680cm-1) 4 0.971 0.29 1.44 38.91
Table 79 Calibration models for the M5eRDF using Raman data with first order derivative using
filter width of 11 and polynomial order of 3.
M5eRDF LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg FST11(250-3311cm-1) 3 0.952 0.37 1.57 42.43
Avg FST11 roi(707-1853cm-1) 3 0.972 0.29 0.64 17.29
Avg FST11 MW(800-1680cm-1) 3 0.968 0.31 0.64 17.29
BCFST11(250-3311cm-1) 3 0.952 0.37 1.55 41.89
BC FST11roi(707-1853cm-1) 3 0.975 0.27 0.62 16.75
BC FST11 MW(800-1680cm-1) 2 0.947 0.39 0.65 17.56
WE FST11(250-3311cm-1) 4 0.965 0.32 1.71 46.21
WE FST11roi(707-1853cm-1) 4 0.992 0.15 0.72 19.45
WE FST11 MW(800-1680cm-1) 3 0.972 0.28 0.74 20.00
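The first-derivative pre-processing in Table 79 uses a filter width of 11 and a polynomial order of 3. A minimal sketch of such a filter follows, assuming it is a Savitzky-Golay derivative on evenly spaced points with unit spacing. The weights are obtained from the least-squares conditions for a cubic fit over the window. Edge points are simply dropped here, which may differ from how the thesis software treats spectrum ends, and the function names are illustrative.

```python
def savgol_first_derivative_weights(width=11):
    """Savitzky-Golay weights for the first derivative at the window centre,
    for a cubic (order 3) polynomial fit on unit-spaced points."""
    half = width // 2
    offsets = range(-half, half + 1)
    s2 = sum(i ** 2 for i in offsets)
    s4 = sum(i ** 4 for i in offsets)
    s6 = sum(i ** 6 for i in offsets)
    # Weights have the form a*i + b*i**3, with
    # sum(w*i) = 1 and sum(w*i**3) = 0.
    det = s2 * s6 - s4 * s4
    a = s6 / det
    b = -s4 / det
    return [a * i + b * i ** 3 for i in offsets]


def first_derivative(spectrum, width=11):
    """Apply the derivative filter; returns only the interior points."""
    weights = savgol_first_derivative_weights(width)
    half = width // 2
    return [
        sum(w * spectrum[k + j] for j, w in zip(range(-half, half + 1), weights))
        for k in range(half, len(spectrum) - half)
    ]


# A cubic test signal: the filter should return 3*x**2 at interior points.
signal = [float(x ** 3) for x in range(21)]
derivative = first_derivative(signal)
```

Because the derivative removes constant baseline offsets, models built on derivative spectra are expected to depend less on the separate baseline correction step.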
Table 80 Calibration models for the M5eRDF using Raman data with first order derivative using
filter width of 11 and polynomial order of 3 and followed by multiplicative signal correction.
M5eRDF LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg FST11MSC(250-3311cm-1) 3 0.921 0.48 1.59 42.97
Avg FST11MSCroi(707-1853cm-1) 3 0.983 0.22 0.63 17.02
AvgFST11MSCMW(800-1680cm-1) 3 0.980 0.24 0.65 17.56
BC FST11MSC(250-3311cm-1) 3 0.921 0.48 1.59 42.97
BC FST11MSC_roi(707-1853cm-1) 4 0.993 0.14 0.59 15.94
BC FST11MSCMW(800-1680cm-1) 3 0.980 0.24 0.64 17.29
WE FST11MSC(250-3311cm-1) 5 0.982 0.22 2.17 58.64
WE FST11MSCroi(707-1853cm-1) 4 0.980 0.24 0.87 23.51
WE FST11MSCMW(800-1680cm-1) 3 0.960 0.34 0.90 24.32
8.3.5 Calibration Models for M5Ye Data for the Conventional Raman
Table 81 Calibration models obtained for the M5Ye using Raman data with no pre-processing
M5Ye LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg Data(250-3311cm-1) 5 0.956 0.10 0.47 52.52
Avg Data_roi(707-1853cm-1) 4 0.886 0.17 0.35 38.68
Avg Data_MW(800-1680cm-1) 4 0.849 0.20 0.40 43.95
BC Data(250-3311cm-1) 4 0.935 0.13 0.36 39.67
BC Data_roi(707-1853cm-1) 4 0.889 0.17 0.53 59.23
BC Data_MW(800-1680cm-1) 2 0.463 0.37 0.46 50.54
WE Data(250-3311cm-1) 4 0.962 0.10 0.40 44.17
WE Data_roi(707-1853cm-1) 5 0.976 0.07 0.54 60.21
WE Data_MW(800-1680cm-1) 5 0.960 0.10 0.56 61.53
Table 82 Calibration models obtained for the M5Ye using Raman data with multiplicative scatter
correction.
M5Ye LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg MSC(250-3311cm-1) 5 0.915 0.15 0.70 77.91
Avg MSC_roi(707-1853cm-1) 3 0.839 0.20 0.32 35.38
Avg MSC MW(800-1680cm-1) 3 0.814 0.22 0.37 40.65
BC MSC(250-3311cm-1) 4 0.944 0.12 0.36 39.78
BC MSC_roi(707-1853cm-1) 4 0.858 0.19 0.42 46.92
BC MSC MW(800-1680cm-1) 4 0.904 0.16 0.46 50.54
WE MSC(250-3311cm-1) 5 0.996 0.03 0.48 53.51
WE MSC_roi(707-1853cm-1) 5 0.960 0.10 0.47 52.08
WE MSC MW(800-1680cm-1) 5 0.945 0.12 0.46 50.54
Table 83 Calibration models obtained for the M5Ye using Raman data with normalisation using
Norm INF
M5Ye LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg NINF(250-3311cm-1) 5 0.961 0.10 0.49 54.83
Avg NINF roi(707-1853cm-1) 4 0.892 0.17 0.36 39.78
Avg NINF MW(800-1680cm-1) 4 0.844 0.20 0.41 45.05
BC NINF (250-3311cm-1) 4 0.938 0.12 0.36 40.32
BC NINF roi(707-1853cm-1) 6 0.992 0.04 0.56 62.52
BC NINF MW(800-1680cm-1) 4 0.848 0.20 0.56 61.53
WE NINF (250-3311cm-1) 4 0.961 0.10 0.42 46.92
WE NINF roi(707-1853cm-1) 5 0.981 0.07 0.56 61.86
WE NINF MW(800-1680cm-1) 5 0.951 0.11 0.58 63.73
Table 84 Calibration models obtained for the M5Ye using Raman data with normalisation using
Norm 2
M5Ye LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg Norm2 (250-3311cm-1) 5 0.954 0.11 0.50 54.94
Avg Norm2 roi(707-1853cm-1) 4 0.881 0.17 0.37 40.65
Avg Norm2 MW(800-1680cm-1) 4 0.852 0.19 0.41 45.05
BC Norm2(250-3311cm-1) 4 0.938 0.12 0.36 39.56
BC Norm2roi(707-1853cm-1) 4 0.874 0.18 0.55 60.43
BC Norm2 MW(800-1680cm-1) 4 0.849 0.20 0.56 61.53
WE Norm2(250-3311cm-1) 4 0.960 0.10 0.46 50.54
WE Norm2roi(707-1853cm-1) 5 0.984 0.06 0.58 63.73
WE Norm2 MW(800-1680cm-1) 5 0.962 0.10 0.56 61.53
Table 85 Calibration models obtained for the M5Ye using Raman data with first order derivative
using filter width of 11 and polynomial order of 3.
M5Ye LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg FST11(250-3311cm-1) 4 0.882 0.17 0.68 74.72
Avg FST11 roi(707-1853cm-1) 3 0.941 0.12 0.35 38.46
Avg FST11 MW(800-1680cm-1) 3 0.924 0.14 0.37 40.65
BCFST11(250-3311cm-1) 4 0.888 0.17 0.67 73.62
BC FST11roi(707-1853cm-1) 3 0.929 0.13 0.41 45.05
BC FST11 MW(800-1680cm-1) 2 0.597 0.32 0.42 46.15
WE FST11(250-3311cm-1) 4 0.980 0.07 0.70 76.92
WE FST11roi(707-1853cm-1) 4 0.993 0.04 0.38 41.75
WE FST11 MW(800-1680cm-1) 4 0.991 0.04 0.42 46.15
Table 86 Calibration models obtained for the M5Ye using Raman data with first order derivative
using filter width of 11 and polynomial order of 3 and multiplicative signal correction.
M5Ye LV Correlation RMSEC RMSECV REP %
Coefficient (g/L) (g/L)
Avg FST11MSC(250-3311cm-1) 3 0.689 0.28 0.72 79.12
Avg FST11MSC_roi(707-1853cm-1) 3 0.929 0.13 0.43 47.25
AvgFST11MSCMW(800-1680cm-1) 2 0.569 0.33 0.45 49.45
BC FST11MSC(250-3311cm-1) 4 0.923 0.14 0.71 78.02
BC FST11MSC_roi(707-1853cm-1) 3 0.928 0.13 0.43 47.25
BC FST11MSCMW(800-1680cm-1) 2 0.573 0.33 0.45 49.45
WE FST11MSC(250-3311cm-1) 3 0.727 0.26 0.71 78.02
WE FST11MSC_roi(707-1853cm-1) 4 0.984 0.06 0.43 47.25
WE FST11MSCMW(800-1680cm-1) 4 0.990 0.05 0.45 49.45
8.4 Supplementary Information for Chapter Four
[Figure 115 image not reproduced: five PCA scores plots of Scores on PC 2 against Scores on PC 1, including the panel "Samples/Scores Plot of M5eRDFSR3". Across the panels, PC 1 accounts for 98.10% to 99.84% of the variance and PC 2 for 0.12% to 1.76%.]
Figure 115 Scores plots for the replicate data collection and the amalgamated samples with a 95% confidence limit for the M5eRDF SERS samples
[Figure 116 image not reproduced: four PCA scores plots of Scores on PC 2 against Scores on PC 1, including the panels "Samples/Scores Plot of M5YeSR1", "Samples/Scores Plot of M5YeSR2" and "Samples/Scores Plot of M5YeSR3". Across the panels, PC 1 accounts for 99.85% to 99.89% of the variance and PC 2 for 0.06% to 0.08%.]
Figure 116 Scores plots for the replicate data collection and the amalgamated samples with a 95% confidence limit for the M5Ye SERS samples
8.4.2 Calibration Models for M5eRDF for the SERS Data
Table 87 Calibration Evaluation for the M5eRDFAVG SERS data, with preliminary pre-
processing
M5eRDF LV (R2) RMSEC RMSECV REP %
(g/L) (g/L)
AvgData (250-3311cm-1) 3 0.965 0.32 0.93 25.13
AvgData_ROI(707-1853cm-1) 4 0.957 0.35 0.75 20.27
AvgDataA (602-995cm-1) 4 0.956 0.36 0.89 24.05
AvgDataB(1260-1444cm-1) 3 0.919 0.49 0.63 17.02
AvgAB(602-995,1260-1444cm-1) 4 0.960 0.34 0.70 18.91
BC Data (250-3311cm-1) 3 0.955 0.36 0.78 21.08
BC Data_ROI(707-1853cm-1) 4 0.958 0.35 0.80 21.62
BC DataA (602-995cm-1) 5 0.969 0.30 0.66 17.83
BC DataB(1260-1444cm-1) 3 0.873 0.61 0.81 21.89
BC AB(602-995,1260-1444cm-1) 4 0.928 0.46 0.84 22.70
WE Data (250-3311cm-1) 2 0.857 0.65 0.82 22.16
WE Data_ROI(707-1853cm-1) 2 0.766 0.84 1.22 32.97
WE DataA (602-995cm-1) 5 0.971 0.29 0.66 17.83
WE DataB(1260-1444cm-1) 3 0.807 0.77 0.96 25.94
WE AB(602-995,1260-1444cm-1) 6 0.986 0.20 0.83 22.43
Table 88 Calibration Evaluation for the M5eRDFAVG SERS data with MSC pre-processing.
M5eRDF LV (R2) RMSEC RMSECV REP %
(g/L) (g/L)
Avg MSC (250-3311cm-1) 3 0.932 0.45 0.86 23.24
Avg MSC_ROI (707-1853cm-1) 2 0.848 0.67 0.97 26.21
Avg MSCA (602-995cm-1) 5 0.961 0.33 0.87 23.51
Avg MSCB (1260-1444cm-1) 3 0.926 0.46 0.72 19.45
Avg AB(602-995,1260-1444cm-1) 5 0.957 0.35 0.90 24.32
BC MSC (250-3311cm-1) 3 0.952 0.37 0.84 22.70
BC MSC_ROI (707-1853cm-1) 3 0.926 0.46 1.00 27.02
BC MSCA (602-995cm-1) 5 0.975 0.27 1.55 41.89
BC MSCB (1260-1444cm-1) 3 0.878 0.60 0.79 21.35
BC AB(602-995,1260-1444cm-1) 5 0.937 0.43 1.28 34.59
WE MSC (250-3311cm-1) 2 0.908 0.52 0.63 17.02
WE MSC_ROI (707-1853cm-1) 5 0.968 0.30 0.89 24.05
WE MSCA (602-995cm-1) 5 0.983 0.22 0.75 20.27
WE MSCB (1260-1444cm-1) 3 0.901 0.54 0.68 18.37
WE AB(602-995,1260-1444cm-1) 5 0.985 0.21 0.89 24.05
Table 89 Calibration Evaluation for the M5eRDFAVG SERS data, with infinity (max)
normalisation pre-processing
M5eRDF LV (R2) RMSEC RMSECV REP %
(g/L) (g/L)
Avg NINF (250-3311cm-1) 4 0.964 0.32 0.92 24.86
Avg NINF ROI (707-1853cm-1) 2 0.870 0.62 0.86 23.24
Avg NINF A (602-995cm-1) 2 0.852 0.66 0.92 24.86
Avg NINF B(1260-1444cm-1) 3 0.895 0.56 0.71 19.18
Avg AB(602-995,1260-1444cm-1) 4 0.965 0.32 0.79 21.35
BC NINF (250-3311cm-1) 3 0.941 0.41 0.86 23.24
BC NINF ROI (707-1853cm-1) 4 0.936 0.43 1.01 27.29
BC NINF A (602-995cm-1) 5 0.964 0.32 0.81 21.89
BC NINF B(1260-1444cm-1) 3 0.834 0.70 0.92 24.86
BC AB(602-995,1260-1444cm-1) 5 0.945 0.40 1.06 28.64
WE NINF (250-3311cm-1) 2 0.922 0.48 0.59 15.94
WE NINF ROI (707-1853cm-1) 3 0.904 0.53 0.92 24.86
WE NINF A (602-995cm-1) 5 0.984 0.21 0.47 12.70
WE NINF B(1260-1444cm-1) 3 0.898 0.55 0.69 18.64
WE AB(602-995,1260-1444cm-1) 6 0.991 0.16 0.59 15.94
Table 90 Calibration models for the M5eRDFAVG SERS data after Norm2 pre-processing
M5eRDF LV (R2) RMSEC RMSECV REP %
(g/L) (g/L)
Avg Norm2 (250-3311cm-1) 3 0.877 0.60 0.92 24.86
Avg Norm2ROI (707-1853cm-1) 2 0.922 0.48 0.65 17.56
Avg Norm2A (602-995cm-1) 2 0.868 0.68 0.86 23.24
Avg Norm2B (1260-1444cm-1) 3 0.922 0.48 0.61 16.48
Avg AB(602-995,1260-1444cm-1) 4 0.957 0.35 0.67 18.10
BC Norm2 (250-3311cm-1) 2 0.823 0.72 0.91 24.59
BC Norm2ROI (707-1853cm-1) 3 0.922 0.48 0.72 19.45
BC Norm2A (602-995cm-1) 5 0.967 0.31 0.77 20.81
BC Norm2B (1260-1444cm-1) 3 0.842 0.68 0.89 24.05
BC AB(602-995,1260-1444cm-1) 5 0.949 0.38 1.02 27.56
WE Norm2Full (250-3311cm-1) 2 0.911 0.51 0.62 16.75
WE Norm2ROI (707-1853cm-1) 4 0.943 0.41 0.61 16.48
WE Norm2A (602-995cm-1) 4 0.944 0.40 0.68 18.37
WE Norm2B (1260-1444cm-1) 3 0.882 0.59 0.72 19.45
WE AB(602-995,1260-1444cm-1) 3 0.876 0.60 0.91 24.59
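Tables 89 and 90 compare infinity (max) normalisation with Norm2 normalisation. A minimal sketch of the two scalings follows, assuming Norm INF divides each spectrum by its largest absolute intensity and Norm2 divides it by its Euclidean length. The function names are illustrative, and the thesis software may apply the scaling over a different spectral window.

```python
import math


def norm_inf(spectrum):
    """Scale a spectrum so its largest absolute intensity equals 1."""
    scale = max(abs(v) for v in spectrum)
    return [v / scale for v in spectrum]


def norm_2(spectrum):
    """Scale a spectrum to unit Euclidean (2-norm) length."""
    scale = math.sqrt(sum(v * v for v in spectrum))
    return [v / scale for v in spectrum]


example = [3.0, 4.0]
print(norm_inf(example))
print(norm_2(example))
```

Norm INF is driven by the single most intense point, so a strong band or spike dominates the scaling, whereas Norm2 spreads the scaling over every point in the spectrum.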
Table 91 Calibration Evaluation for the M5eRDFAVG data, with first derivative (11, 3, 1) pre-
processing.
M5eRDF LV (R2) RMSEC RMSECV Percentage
(g/L) (g/L) Error (%)
Avg FST11(250-3311cm-1) 2 0.862 0.63 0.91 24.59
Avg FST11ROI (707-1853cm-1) 4 0.960 0.34 0.74 20.00
Avg FST11A (602-995cm-1) 5 0.979 0.25 0.78 21.08
Avg FST11B (1260-1444cm-1) 3 0.841 0.69 0.96 25.94
Avg AB(602-995,1260-1444cm-1) 2 0.773 0.83 1.17 31.62
BC FST11 (250-3311cm-1) 2 0.862 0.63 0.91 24.59
BC FST11ROI (707-1853cm-1) 4 0.958 0.35 0.75 20.27
BC FST11A (602-995cm-1) 5 0.983 0.22 0.76 20.54
BC FST11B (1260-1444cm-1) 3 0.858 0.65 0.91 24.59
BC AB(602-995,1260-1444cm-1) 4 0.946 0.40 1.18 31.89
WE FST11 (250-3311cm-1) 4 0.974 0.27 0.98 26.48
WE FST11ROI (707-1853cm-1) 3 0.829 0.72 1.08 29.18
WE FST11A (602-995cm-1) 3 0.953 0.37 1.56 42.16
WE FST11B (1260-1444cm-1) 3 0.780 0.82 1.12 30.27
WE AB(602-995,1260-1444cm-1) 4 0.970 0.30 1.26 34.05
Table 92 Calibration Evaluation for the M5eRDFAVG data, with first derivative (11,3,1) plus
MSC pre-processing.
M5eRDF LV (R2) RMSEC RMSECV REP %
(g/L) (g/L)
Avg FST11MSC(250-3311cm-1) 2 0.875 0.60 0.85 22.97
Avg FST11MSCROI(707-1853cm-1) 4 0.965 0.32 0.74 20.00
Avg FST11MSC A (602-995cm-1) 5 0.981 0.23 0.85 22.97
Avg FST11MSC B(1260-1444cm-1) 3 0.853 0.66 0.90 24.32
Avg AB(602-995,1260-1444cm-1) 5 0.968 0.31 1.03 27.83
BC FST11MSC (250-3311cm-1) 2 0.875 0.60 0.85 22.97
BC FST11MSCROI (707-1853cm-1) 4 0.963 0.33 0.74 20.00
BC FST11MSC A (602-995cm-1) 5 0.988 0.18 0.89 24.05
BC FST11MSC B (1260-1444cm-1) 3 0.861 0.64 0.87 23.51
BC AB(602-995,1260-1444cm-1) 2 0.786 0.80 1.12 30.27
WE FST11MSC (250-3311cm-1) 2 0.916 0.50 0.66 17.83
WE FST11MSCROI(707-1853cm-1) 3 0.897 0.55 0.78 21.08
WE FST11MSC A (602-995cm-1) 2 0.851 0.67 0.90 24.32
WE FST11MSC B (1260-1444cm-1) 3 0.846 0.68 0.90 24.32
WE AB(602-995,1260-1444cm-1) 3 0.861 0.64 0.62 16.75
Table 93 Calibration Evaluation for the M5eRDFR1 SERS data, with preliminary pre-processing
M5eRDF LV (R2) RMSEC RMSECV Percentage
(g/L) (g/L) Error (%)
AvgData (250-3311cm-1) 2 0.724 0.90 1.18 31.89
AvgData_ROI(707-1853cm-1) 3 0.947 0.39 0.54 14.59
AvgDataA (602-995cm-1) 2 0.866 0.63 0.86 23.24
AvgDataB(1260-1444cm-1) 3 0.887 0.58 0.94 25.40
AvgAB(602-995,1260-1444cm-1) 2 0.891 0.57 0.79 21.35
BC Data (250-3311cm-1) 2 0.860 0.64 0.79 21.35
BC Data_ROI(707-1853cm-1) 2 0.888 0.57 0.84 22.70
BC DataA (602-995cm-1) 2 0.856 0.65 0.92 24.86
BC DataB(1260-1444cm-1) 3 0.894 0.55 0.87 23.51
BC AB(602-995,1260-1444cm-1) 2 0.853 0.66 0.92 24.86
WE Data (250-3311cm-1) 2 0.875 0.61 0.86 23.24
WE Data_ROI(707-1853cm-1) 2 0.826 0.72 0.95 25.67
WE DataA (602-995cm-1) 2 0.828 0.71 0.94 25.40
WE DataB(1260-1444cm-1) 3 0.860 0.64 1.02 27.56
WE AB(602-995,1260-1444cm-1) 2 0.823 0.72 0.97 26.21
Table 94 Calibration Evaluation for the M5eRDFR1 SERS data with MSC pre-processing.
M5eRDF LV (R2) RMSEC RMSECV REP %
(g/L) (g/L)
Avg MSC (250-3311cm-1) 3 0.944 0.40 0.72 19.45
Avg MSC_ROI (707-1853cm-1) 2 0.903 0.53 0.67 18.10
Avg MSCA (602-995cm-1) 2 0.835 0.70 0.97 26.21
Avg MSCB (1260-1444cm-1) 2 0.872 0.61 1.03 27.83
Avg AB(602-995,1260-1444cm-1) 4 0.967 0.31 0.70 18.91
BC MSC (250-3311cm-1) 2 0.872 0.61 0.76 20.54
BC MSC_ROI (707-1853cm-1) 2 0.869 0.62 0.90 24.32
BC MSCA (602-995cm-1) 2 0.814 0.74 1.03 27.83
BC MSCB (1260-1444cm-1) 3 0.866 0.63 1.05 28.37
BC AB(602-995,1260-1444cm-1) 2 0.817 0.73 1.02 27.56
WE MSC (250-3311cm-1) 2 0.881 0.59 0.79 21.35
WE MSC_ROI (707-1853cm-1) 2 0.896 0.55 0.76 20.54
WE MSCA (602-995cm-1) 2 0.879 0.59 0.75 20.27
WE MSCB (1260-1444cm-1) 3 0.898 0.55 0.71 19.18
WE AB(602-995,1260-1444cm-1) 2 0.876 0.60 0.76 20.54
Table 95 Calibration Evaluation for the M5eRDFR1 SERS data, with infinity (max)
normalisation pre-processing
M5eRDF LV (R2) RMSEC RMSECV REP %
(g/L) (g/L)
Avg NINF (250-3311cm-1) 3 0.862 0.63 0.96 25.94
Avg NINF ROI (707-1853cm-1) 3 0.983 0.22 0.55 14.86
Avg NINF A (602-995cm-1) 2 0.843 0.68 0.99 26.75
Avg NINF B(1260-1444cm-1) 3 0.876 0.60 0.84 22.70
Avg AB(602-995,1260-1444cm-1) 3 0.923 0.47 0.84 22.70
BC NINF (250-3311cm-1) 2 0.869 0.62 0.76 20.54
BC NINF ROI (707-1853cm-1) 2 0.866 0.63 0.93 25.13
BC NINF A (602-995cm-1) 2 0.827 0.71 1.05 28.37
BC NINF B(1260-1444cm-1) 3 0.877 0.60 0.99 26.75
BC AB(602-995,1260-1444cm-1) 2 0.823 0.72 1.05 28.37
WE NINF (250-3311cm-1) 2 0.878 0.60 0.78 21.08
WE NINF ROI (707-1853cm-1) 2 0.888 0.57 0.79 21.35
WE NINF A (602-995cm-1) 2 0.880 0.59 0.74 20.00
WE NINF B(1260-1444cm-1) 3 0.891 0.57 0.79 21.35
WE AB(602-995,1260-1444cm-1) 2 0.876 0.60 0.76 20.54
Table 96 Calibration models for the M5eRDFR1 SERS data after Norm2 pre-processing
M5eRDF LV (R2) RMSEC RMSECV REP %
(g/L) (g/L)
Avg Norm2 (250-3311cm-1) 3 0.865 0.63 0.97 26.21
Avg Norm2ROI (707-1853cm-1) 3 0.951 0.38 0.57 15.40
Avg Norm2A (602-995cm-1) 2 0.852 0.66 0.87 23.51
Avg Norm2B (1260-1444cm-1) 3 0.883 0.58 0.95 25.67
Avg AB(602-995,1260-1444cm-1) 3 0.926 0.46 0.71 19.18
BC Norm2 (250-3311cm-1) 2 0.869 0.62 0.76 20.54
BC Norm2ROI (707-1853cm-1) 2 0.870 0.62 0.91 24.59
BC Norm2A (602-995cm-1) 2 0.833 0.70 1.02 27.56
BC Norm2B (1260-1444cm-1) 3 0.881 0.59 0.96 25.94
BC AB(602-995,1260-1444cm-1) 2 0.829 0.71 1.02 27.56
WE Norm2Full (250-3311cm-1) 2 0.885 0.58 0.78 21.08
WE Norm2ROI (707-1853cm-1) 2 0.876 0.60 0.83 22.43
WE Norm2A (602-995cm-1) 2 0.872 0.61 0.78 21.08
WE Norm2B (1260-1444cm-1) 3 0.891 0.57 0.79 21.35
WE AB(602-995,1260-1444cm-1) 2 0.868 0.62 0.80 21.62
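For readers comparing the NINF and Norm2 tables, the following minimal sketch contrasts the two scalings on a single spectrum. It assumes that "infinity (max) normalisation" divides each spectrum by its largest absolute intensity and that "Norm2" denotes division by the Euclidean (2-norm) length; the second reading is an assumption, and the function names are hypothetical rather than those of the software used in this work.

```python
import math


def normalise_inf(spectrum):
    """Scale a spectrum by its largest absolute intensity (infinity norm)."""
    scale = max(abs(v) for v in spectrum)
    if scale == 0:
        raise ValueError("cannot normalise an all-zero spectrum")
    return [v / scale for v in spectrum]


def normalise_l2(spectrum):
    """Scale a spectrum to unit Euclidean length (assumed meaning of Norm2)."""
    scale = math.sqrt(sum(v * v for v in spectrum))
    if scale == 0:
        raise ValueError("cannot normalise an all-zero spectrum")
    return [v / scale for v in spectrum]


# Illustrative intensities only; these are not measured data.
example = [1.0, 2.0, 4.0]
print(normalise_inf(example))
print(normalise_l2(example))
```

Both scalings remove overall intensity differences between spectra; they differ in whether the strongest band or the total signal energy sets the scale.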
Table 97 Calibration Evaluation for the M5eRDFR1 data, with first derivative (11, 3, 1) pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11(250-3311cm-1) 2 0.897 0.55 1.15 31.08
Avg FST11ROI (707-1853cm-1) 3 0.976 0.26 0.88 23.78
Avg FST11A (602-995cm-1) 3 0.945 0.40 0.87 23.51
Avg FST11B (1260-1444cm-1) 1 0.777 0.81 0.98 26.48
Avg AB(602-995,1260-1444cm-1) 1 0.802 0.87 1.02 27.56
BC FST11 (250-3311cm-1) 2 0.896 0.55 1.16 31.35
BC FST11ROI (707-1853cm-1) 3 0.976 0.26 0.88 23.78
BC FST11A (602-995cm-1) 3 0.945 0.40 0.87 23.51
BC FST11B (1260-1444cm-1) 1 0.777 0.81 0.98 26.48
BC AB(602-995,1260-1444cm-1) 2 0.820 0.73 0.97 26.21
WE FST11 (250-3311cm-1) 2 0.880 0.57 1.16 31.35
WE FST11ROI (707-1853cm-1) 3 0.981 0.23 0.87 23.51
WE FST11A (602-995cm-1) 3 0.944 0.40 0.86 23.24
WE FST11B (1260-1444cm-1) 1 0.776 0.81 0.99 26.75
WE AB(602-995,1260-1444cm-1) 3 0.954 0.36 0.95 25.67
Table 98 Calibration Evaluation for the M5eRDFR1 data, with first derivative (11, 3, 1) plus MSC pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11MSC(250-3311cm-1) 2 0.880 0.59 1.17 31.62
Avg FST11MSCROI(707-1853cm-1) 3 0.988 0.19 0.93 25.13
Avg FST11MSC A (602-995cm-1) 3 0.942 0.41 0.90 24.32
Avg FST11MSC B(1260-1444cm-1) 2 0.759 0.84 1.08 29.18
Avg AB(602-995,1260-1444cm-1) 2 0.801 0.77 1.03 27.83
BC FST11MSC (250-3311cm-1) 2 0.880 0.59 1.19 32.16
BC FST11MSCROI (707-1853cm-1) 3 0.985 0.21 0.93 25.13
BC FST11MSC A (602-995cm-1) 3 0.943 0.41 0.90 24.32
BC FST11MSC B (1260-1444cm-1) 2 0.760 0.84 1.12 30.27
BC AB(602-995,1260-1444cm-1) 3 0.959 0.34 1.01 27.29
WE FST11MSC (250-3311cm-1) 2 0.893 0.56 1.18 31.89
WE FST11MSCROI(707-1853cm-1) 2 0.899 0.54 0.75 20.27
WE FST11MSC A (602-995cm-1) 2 0.882 0.59 0.75 20.27
WE FST11MSC B (1260-1444cm-1) 2 0.847 0.67 0.91 24.59
WE AB(602-995,1260-1444cm-1) 2 0.878 0.60 0.78 21.08
Table 99 Calibration Evaluation for the M5eRDFR2 SERS data, with preliminary pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
AvgData (250-3311cm-1) 3 0.820 0.73 1.05 28.37
AvgData_ROI(707-1853cm-1) 3 0.924 0.47 0.66 17.83
AvgDataA (602-995cm-1) 2 0.846 0.67 0.89 24.05
AvgDataB(1260-1444cm-1) 3 0.934 0.44 0.65 17.56
AvgAB(602-995,1260-1444cm-1) 3 0.900 0.54 0.85 22.97
BC Data (250-3311cm-1) 4 0.935 0.44 0.84 22.70
BC Data_ROI(707-1853cm-1) 3 0.929 0.45 0.66 17.83
BC DataA (602-995cm-1) 2 0.864 0.63 0.86 23.24
BC DataB(1260-1444cm-1) 3 0.883 0.59 0.80 21.62
BC AB(602-995,1260-1444cm-1) 2 0.836 0.69 0.89 24.05
WE Data (250-3311cm-1) 3 0.907 0.52 0.77 20.81
WE Data_ROI(707-1853cm-1) 3 0.914 0.50 0.75 20.27
WE DataA (602-995cm-1) 1 0.852 0.68 0.89 24.05
WE DataB(1260-1444cm-1) 3 0.909 0.52 0.70 18.91
WE AB(602-995,1260-1444cm-1) 3 0.880 0.59 0.98 26.48
Table 100 Calibration Evaluation for the M5eRDFR2 SERS data, with MSC pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg MSC (250-3311cm-1) 4 0.917 0.49 0.85 22.97
Avg MSC_ROI (707-1853cm-1) 3 0.885 0.58 0.95 25.67
Avg MSCA (602-995cm-1) 3 0.910 0.51 0.90 24.32
Avg MSCB (1260-1444cm-1) 3 0.923 0.47 0.85 22.97
Avg AB(602-995,1260-1444cm-1) 2 0.740 0.87 1.03 27.83
BC MSC (250-3311cm-1) 4 0.924 0.47 0.86 23.24
BC MSC_ROI (707-1853cm-1) 3 0.930 0.45 0.61 16.48
BC MSCA (602-995cm-1) 2 0.845 0.68 0.95 25.67
BC MSCB (1260-1444cm-1) 3 0.914 0.50 0.68 18.37
BC AB(602-995,1260-1444cm-1) 2 0.820 0.73 0.96 25.94
WE MSC (250-3311cm-1) 3 0.927 0.46 0.73 19.72
WE MSC_ROI (707-1853cm-1) 3 0.937 0.43 0.65 17.56
WE MSCA (602-995cm-1) 2 0.894 0.56 0.78 21.08
WE MSCB (1260-1444cm-1) 3 0.931 0.45 0.63 17.02
WE AB(602-995,1260-1444cm-1) 2 0.855 0.65 0.89 24.05
Table 101 Calibration Evaluation for the M5eRDFR2 SERS data, with infinity (max) normalisation pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg NINF (250-3311cm-1) 3 0.813 0.74 1.05 28.37
Avg NINF ROI (707-1853cm-1) 3 0.920 0.48 0.68 18.37
Avg NINF A (602-995cm-1) 2 0.837 0.69 0.92 24.86
Avg NINF B(1260-1444cm-1) 3 0.931 0.45 0.67 18.10
Avg AB(602-995,1260-1444cm-1) 4 0.930 0.45 0.78 21.08
BC NINF (250-3311cm-1) 4 0.928 0.46 0.83 22.43
BC NINF ROI (707-1853cm-1) 3 0.933 0.44 0.63 17.02
BC NINF A (602-995cm-1) 2 0.858 0.65 0.84 22.70
BC NINF B(1260-1444cm-1) 3 0.883 0.59 0.82 22.16
BC AB(602-995,1260-1444cm-1) 2 0.826 0.71 0.91 24.59
WE NINF (250-3311cm-1) 3 0.921 0.48 0.67 18.10
WE NINF ROI (707-1853cm-1) 4 0.930 0.45 0.70 18.91
WE NINF A (602-995cm-1) 2 0.858 0.64 0.93 25.13
WE NINF B(1260-1444cm-1) 3 0.916 0.50 0.77 20.81
WE AB(602-995,1260-1444cm-1) 4 0.910 0.51 0.92 24.86
Table 102 Calibration models for the M5eRDFR2 SERS data after Norm2 pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg Norm2 (250-3311cm-1) 3 0.833 0.70 1.01 27.29
Avg Norm2ROI (707-1853cm-1) 3 0.925 0.47 0.65 17.56
Avg Norm2A (602-995cm-1) 2 0.850 0.66 0.87 23.51
Avg Norm2B (1260-1444cm-1) 3 0.933 0.44 0.60 16.21
Avg AB(602-995,1260-1444cm-1) 4 0.931 0.45 0.77 20.81
BC Norm2 (250-3311cm-1) 4 0.929 0.46 0.83 22.43
BC Norm2ROI (707-1853cm-1) 3 0.933 0.44 0.64 17.29
BC Norm2A (602-995cm-1) 2 0.859 0.64 0.84 22.70
BC Norm2B (1260-1444cm-1) 3 0.882 0.59 0.81 21.89
BC AB(602-995,1260-1444cm-1) 2 0.829 0.71 0.91 24.59
WE Norm2Full (250-3311cm-1) 3 0.933 0.44 0.62 16.75
WE Norm2ROI (707-1853cm-1) 4 0.942 0.41 0.65 17.56
WE Norm2A (602-995cm-1) 2 0.893 0.56 0.80 21.62
WE Norm2B (1260-1444cm-1) 3 0.930 0.45 0.66 17.83
WE AB(602-995,1260-1444cm-1) 2 0.865 0.63 0.88 23.78
Table 103 Calibration Evaluation for the M5eRDFR2 data, with first derivative (11, 3, 1) pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11(250-3311cm-1) 2 0.923 0.47 0.85 22.97
Avg FST11ROI (707-1853cm-1) 2 0.914 0.50 0.77 20.81
Avg FST11A (602-995cm-1) 2 0.884 0.58 0.85 22.97
Avg FST11B (1260-1444cm-1) 3 0.910 0.51 0.74 20.00
Avg AB(602-995,1260-1444cm-1) 2 0.886 0.58 0.85 22.97
BC FST11 (250-3311cm-1) 2 0.922 0.48 0.85 22.97
BC FST11ROI (707-1853cm-1) 2 0.914 0.50 0.77 20.81
BC FST11A (602-995cm-1) 2 0.883 0.58 0.85 22.97
BC FST11B (1260-1444cm-1) 3 0.911 0.51 0.73 19.72
BC AB(602-995,1260-1444cm-1) 2 0.886 0.58 0.85 22.97
WE FST11 (250-3311cm-1) 2 0.924 0.47 0.88 23.78
WE FST11ROI (707-1853cm-1) 2 0.912 0.51 0.79 21.35
WE FST11A (602-995cm-1) 2 0.885 0.58 0.85 22.97
WE FST11B (1260-1444cm-1) 3 0.911 0.51 0.86 23.24
WE AB(602-995,1260-1444cm-1) 2 0.888 0.57 0.85 22.97
Table 104 Calibration Evaluation for the M5eRDFR2 data, with first derivative (11, 3, 1) plus MSC pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11MSC(250-3311cm-1) 2 0.916 0.49 0.86 23.24
Avg FST11MSCROI(707-1853cm-1) 2 0.912 0.51 0.77 20.81
Avg FST11MSC A (602-995cm-1) 2 0.880 0.59 0.87 23.51
Avg FST11MSC B(1260-1444cm-1) 3 0.908 0.52 0.73 19.72
Avg AB(602-995,1260-1444cm-1) 2 0.883 0.59 0.87 23.51
BC FST11MSC (250-3311cm-1) 2 0.917 0.49 0.85 22.97
BC FST11MSCROI (707-1853cm-1) 2 0.913 0.50 0.76 20.54
BC FST11MSC A (602-995cm-1) 2 0.882 0.59 0.85 22.97
BC FST11MSC B (1260-1444cm-1) 3 0.911 0.51 0.72 19.45
BC AB(602-995,1260-1444cm-1) 2 0.885 0.58 0.85 22.97
WE FST11MSC (250-3311cm-1) 3 0.991 0.16 0.82 22.16
WE FST11MSCROI(707-1853cm-1) 3 0.946 0.39 0.65 17.56
WE FST11MSC A (602-995cm-1) 2 0.896 0.55 0.79 21.35
WE FST11MSC B (1260-1444cm-1) 3 0.924 0.47 0.67 18.10
WE AB(602-995,1260-1444cm-1) 3 0.937 0.43 0.73 19.72
Table 105 Calibration Evaluation for the M5eRDFR3 SERS data, with preliminary pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
AvgData (250-3311cm-1) 4 0.898 0.54 0.95 25.67
AvgData_ROI(707-1853cm-1) 3 0.882 0.59 0.81 21.89
AvgDataA (602-995cm-1) 3 0.827 0.71 1.75 47.29
AvgDataB(1260-1444cm-1) 5 0.910 0.51 1.96 52.97
AvgAB(602-995,1260-1444cm-1) 3 0.823 0.72 1.74 47.02
BC Data (250-3311cm-1) 3 0.880 0.59 0.88 23.78
BC Data_ROI(707-1853cm-1) 3 0.895 0.55 1.12 30.27
BC DataA (602-995cm-1) 3 0.839 0.69 1.18 31.89
BC DataB(1260-1444cm-1) 2 0.631 1.05 2.27 61.35
BC AB(602-995,1260-1444cm-1) 3 0.839 0.69 1.39 37.56
WE Data (250-3311cm-1) 3 0.843 0.68 1.38 37.29
WE Data_ROI(707-1853cm-1) 3 0.871 0.61 1.80 48.64
WE DataA (602-995cm-1) 4 0.905 0.53 1.23 33.24
WE DataB(1260-1444cm-1) 2 0.636 1.04 2.54 68.64
WE AB(602-995,1260-1444cm-1) 3 0.868 0.62 1.57 42.43
Table 106 Calibration Evaluation for the M5eRDFR3 SERS data, with MSC pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg MSC (250-3311cm-1) 4 0.926 0.46 0.93 25.13
Avg MSC_ROI (707-1853cm-1) 4 0.895 0.55 1.22 32.97
Avg MSCA (602-995cm-1) 3 0.807 0.75 1.15 31.08
Avg MSCB (1260-1444cm-1) 4 0.836 0.69 1.17 31.62
Avg AB(602-995,1260-1444cm-1) 4 0.839 0.69 1.16 31.35
BC MSC (250-3311cm-1) 5 0.988 0.18 0.97 26.21
BC MSC_ROI (707-1853cm-1) 3 0.865 0.63 0.84 22.70
BC MSCA (602-995cm-1) 3 0.838 0.69 1.60 43.24
BC MSCB (1260-1444cm-1) 2 0.614 1.07 2.29 61.89
BC AB(602-995,1260-1444cm-1) 3 0.796 0.77 1.69 45.67
WE MSC (250-3311cm-1) 3 0.871 0.61 1.10 29.72
WE MSC_ROI (707-1853cm-1) 3 0.779 0.81 1.24 33.51
WE MSCA (602-995cm-1) 3 0.773 0.82 1.18 31.89
WE MSCB (1260-1444cm-1) 2 0.692 0.95 1.33 35.94
WE AB(602-995,1260-1444cm-1) 3 0.773 0.82 1.19 32.16
Table 107 Calibration Evaluation for the M5eRDFR3 SERS data, with infinity (max) normalisation pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg NINF (250-3311cm-1) 4 0.897 0.55 0.85 22.97
Avg NINF ROI (707-1853cm-1) 3 0.875 0.61 0.88 23.78
Avg NINF A (602-995cm-1) 3 0.824 0.72 1.51 40.81
Avg NINF B(1260-1444cm-1) 5 0.893 0.56 1.63 44.05
Avg AB(602-995,1260-1444cm-1) 3 0.811 0.75 1.51 40.81
BC NINF (250-3311cm-1) 3 0.862 0.64 1.01 27.29
BC NINF ROI (707-1853cm-1) 3 0.859 0.64 1.09 29.45
BC NINF A (602-995cm-1) 3 0.809 0.75 1.28 34.59
BC NINF B(1260-1444cm-1) 2 0.571 1.13 1.86 50.27
BC AB(602-995,1260-1444cm-1) 3 0.808 0.75 1.42 38.37
WE NINF (250-3311cm-1) 3 0.852 0.66 1.28 34.59
WE NINF ROI (707-1853cm-1) 3 0.817 0.74 0.96 25.94
WE NINF A (602-995cm-1) 3 0.802 0.77 0.98 26.48
WE NINF B(1260-1444cm-1) 2 0.739 0.90 1.17 31.62
WE AB(602-995,1260-1444cm-1) 3 0.804 0.77 0.95 25.67
Table 108 Calibration models for the M5eRDFR3 SERS data after Norm2 pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg Norm2 (250-3311cm-1) 4 0.895 0.55 0.95 25.67
Avg Norm2ROI (707-1853cm-1) 3 0.888 0.57 0.81 21.89
Avg Norm2A (602-995cm-1) 3 0.834 0.70 1.53 41.35
Avg Norm2B (1260-1444cm-1) 6 0.949 0.38 2.41 65.13
Avg AB(602-995,1260-1444cm-1) 3 0.835 0.70 1.53 41.35
BC Norm2 (250-3311cm-1) 3 0.861 0.64 0.94 25.40
BC Norm2ROI (707-1853cm-1) 3 0.860 0.64 1.12 30.27
BC Norm2A (602-995cm-1) 3 0.814 0.74 1.24 33.51
BC Norm2B (1260-1444cm-1) 2 0.583 1.12 2.14 57.83
BC AB(602-995,1260-1444cm-1) 3 0.809 0.75 1.45 39.18
WE Norm2Full (250-3311cm-1) 3 0.882 0.59 1.07 28.91
WE Norm2ROI (707-1853cm-1) 2 0.792 0.78 1.30 35.13
WE Norm2A (602-995cm-1) 3 0.819 0.73 1.13 30.54
WE Norm2B (1260-1444cm-1) 2 0.667 0.99 1.46 39.45
WE AB(602-995,1260-1444cm-1) 3 0.802 0.76 1.13 30.54
Table 109 Calibration Evaluation for the M5eRDFR3 data, with first derivative (11, 3, 1) pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11(250-3311cm-1) 3 0.964 0.32 0.94 25.40
Avg FST11ROI (707-1853cm-1) 3 0.903 0.53 1.17 31.62
Avg FST11A (602-995cm-1) 3 0.894 0.55 1.25 33.78
Avg FST11B (1260-1444cm-1) 4 0.927 0.46 1.24 33.51
Avg AB(602-995,1260-1444cm-1) 4 0.942 0.41 1.17 31.62
BC FST11 (250-3311cm-1) 3 0.964 0.32 0.95 25.67
BC FST11ROI (707-1853cm-1) 3 0.903 0.53 1.17 31.62
BC FST11A (602-995cm-1) 3 0.894 0.56 1.25 33.78
BC FST11B (1260-1444cm-1) 4 0.926 0.46 1.23 33.24
BC AB(602-995,1260-1444cm-1) 3 0.897 0.55 1.38 37.29
WE FST11 (250-3311cm-1) 3 0.968 0.30 1.10 29.72
WE FST11ROI (707-1853cm-1) 3 0.893 0.56 1.30 35.13
WE FST11A (602-995cm-1) 3 0.890 0.57 1.31 35.40
WE FST11B (1260-1444cm-1) 4 0.930 0.45 1.51 40.81
WE AB(602-995,1260-1444cm-1) 4 0.938 0.43 1.17 31.62
Table 110 Calibration Evaluation for the M5eRDFR3 data, with first derivative (11, 3, 1) plus MSC pre-processing.
M5eRDF LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11MSC(250-3311cm-1) 3 0.949 0.38 1.09 29.45
Avg FST11MSCROI(707-1853cm-1) 3 0.867 0.62 1.23 33.24
Avg FST11MSC A (602-995cm-1) 3 0.860 0.64 1.30 35.13
Avg FST11MSC B(1260-1444cm-1) 4 0.900 0.54 1.31 35.40
Avg AB(602-995,1260-1444cm-1) 3 0.861 0.64 1.41 38.10
BC FST11MSC (250-3311cm-1) 3 0.949 0.39 1.08 29.18
BC FST11MSCROI (707-1853cm-1) 3 0.866 0.63 1.22 32.97
BC FST11MSC A (602-995cm-1) 3 0.859 0.64 1.30 35.13
BC FST11MSC B (1260-1444cm-1) 4 0.898 0.54 1.32 35.67
BC AB(602-995,1260-1444cm-1) 3 0.861 0.64 1.43 38.64
WE FST11MSC (250-3311cm-1) 2 0.888 0.57 1.04 28.10
WE FST11MSCROI(707-1853cm-1) 2 0.778 0.81 1.03 27.83
WE FST11MSC A (602-995cm-1) 2 0.734 0.89 1.09 29.45
WE FST11MSC B (1260-1444cm-1) 2 0.715 0.92 1.14 30.81
WE AB(602-995,1260-1444cm-1) 2 0.739 0.88 1.09 29.45
8.4.3 Calibration Models for M5Ye for the SERS Data
Table 111 Calibration Evaluation for the M5Ye data, with preliminary pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
AvgData (250-3311cm-1) 5 0.977 0.07 0.18 19.78
AvgData_ROI(707-1853cm-1) 4 0.930 0.14 0.27 29.67
AvgDataA (602-995cm-1) 6 0.989 0.05 0.20 21.97
AvgDataB(1260-1444cm-1) 5 0.960 0.10 0.22 24.17
AvgAB(602-995,1260-1444cm-1) 4 0.952 0.11 0.23 25.27
BC Data (250-3311cm-1) 5 0.988 0.05 0.15 16.48
BC Data_ROI(707-1853cm-1) 4 0.931 0.14 0.26 28.57
BC DataA (602-995cm-1) 5 0.977 0.08 0.22 24.17
BC DataB(1260-1444cm-1) 4 0.961 0.10 0.19 20.87
BC AB(602-995,1260-1444cm-1) 4 0.963 0.10 0.22 24.17
WE Data (250-3311cm-1) 4 0.966 0.09 0.16 17.58
WE Data_ROI(707-1853cm-1) 4 0.944 0.12 0.21 23.07
WE DataA (602-995cm-1) 4 0.954 0.11 0.18 19.78
WE DataB(1260-1444cm-1) 4 0.946 0.12 0.21 23.07
WE AB(602-995,1260-1444cm-1) 4 0.952 0.11 0.18 19.78
Table 112 Calibration Evaluation for the M5Ye data, with MSC pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg MSC (250-3311cm-1) 4 0.975 0.08 0.16 17.58
Avg MSC_ROI (707-1853cm-1) 5 0.976 0.07 0.19 20.87
Avg MSCA (602-995cm-1) 4 0.956 0.10 0.17 18.68
Avg MSCB (1260-1444cm-1) 4 0.955 0.11 0.18 19.78
Avg AB(602-995,1260-1444cm-1) 3 0.946 0.12 0.21 23.07
BC MSC (250-3311cm-1) 3 0.983 0.06 0.19 20.87
BC MSC_ROI (707-1853cm-1) 5 0.971 0.08 0.20 21.97
BC MSCA (602-995cm-1) 4 0.969 0.09 0.20 21.97
BC MSCB (1260-1444cm-1) 4 0.968 0.09 0.17 18.68
BC AB(602-995,1260-1444cm-1) 4 0.958 0.10 0.19 20.87
WE MSC (250-3311cm-1) 4 0.973 0.08 0.15 16.48
WE MSC_ROI (707-1853cm-1) 4 0.969 0.09 0.19 20.87
WE MSCA (602-995cm-1) 4 0.961 0.10 0.18 19.78
WE MSCB (1260-1444cm-1) 3 0.827 0.21 0.32 35.16
WE AB(602-995,1260-1444cm-1) 6 0.994 0.04 0.22 24.17
Table 113 Calibration Evaluation for the M5Ye data, with infinity (max) normalisation pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg NINF (250-3311cm-1) 5 0.976 0.07 0.18 19.78
Avg NINF ROI (707-1853cm-1) 4 0.935 0.13 0.28 30.76
Avg NINF A (602-995cm-1) 6 0.986 0.06 0.21 23.07
Avg NINF B(1260-1444cm-1) 5 0.961 0.10 0.22 24.17
Avg AB(602-995,1260-1444cm-1) 4 0.949 0.11 0.23 25.27
BC NINF (250-3311cm-1) 5 0.987 0.05 0.17 18.68
BC NINF ROI (707-1853cm-1) 4 0.931 0.14 0.27 29.67
BC NINF A (602-995cm-1) 4 0.968 0.09 0.22 24.17
BC NINF B(1260-1444cm-1) 4 0.968 0.09 0.18 19.78
BC AB(602-995,1260-1444cm-1) 4 0.960 0.10 0.20 21.97
WE NINF (250-3311cm-1) 3 0.927 0.14 0.21 23.07
WE NINF ROI (707-1853cm-1) 5 0.996 0.03 0.13 14.28
WE NINF A (602-995cm-1) 5 0.998 0.02 0.10 10.98
WE NINF B(1260-1444cm-1) 4 0.970 0.08 0.17 18.68
WE AB(602-995,1260-1444cm-1) 5 0.997 0.02 0.12 13.18
Table 114 Calibration Evaluation for the M5Ye data, with Norm2 normalisation pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg Norm2 (250-3311cm-1) 5 0.975 0.08 0.20 21.97
Avg Norm2ROI (707-1853cm-1) 4 0.929 0.13 0.28 30.76
Avg Norm2A (602-995cm-1) 3 0.852 0.19 0.36 39.56
Avg Norm2B (1260-1444cm-1) 3 0.894 0.16 0.24 26.37
Avg AB(602-995,1260-1444cm-1) 4 0.948 0.11 0.23 25.27
BC Norm2 (250-3311cm-1) 3 0.980 0.07 0.19 20.87
BC Norm2ROI (707-1853cm-1) 4 0.963 0.09 0.22 24.17
BC Norm2A (602-995cm-1) 4 0.972 0.08 0.21 23.07
BC Norm2B (1260-1444cm-1) 3 0.938 0.12 0.18 19.78
BC AB(602-995,1260-1444cm-1) 4 0.965 0.09 0.19 20.87
WE Norm2Full (250-3311cm-1) 4 0.972 0.08 0.19 20.87
WE Norm2ROI (707-1853cm-1) 5 0.996 0.03 0.11 12.08
WE Norm2A (602-995cm-1) 4 0.985 0.06 0.14 15.38
WE Norm2B (1260-1444cm-1) 3 0.869 0.18 0.25 27.47
WE AB(602-995,1260-1444cm-1) 4 0.985 0.06 0.14 15.38
Table 115 Calibration Evaluation for the M5Ye data, with first derivative pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11(250-3311cm-1) 3 0.947 0.11 0.19 20.87
Avg FST11ROI (707-1853cm-1) 4 0.964 0.09 0.16 17.58
Avg FST11A (602-995cm-1) 3 0.943 0.12 0.20 21.97
Avg FST11B (1260-1444cm-1) 3 0.949 0.11 0.16 17.58
Avg AB(602-995,1260-1444cm-1) 3 0.939 0.12 0.18 19.78
BC FST11 (250-3311cm-1) 3 0.947 0.11 0.19 20.87
BC FST11ROI (707-1853cm-1) 4 0.963 0.09 0.16 17.58
BC FST11A (602-995cm-1) 3 0.943 0.12 0.20 21.97
BC FST11B (1260-1444cm-1) 3 0.951 0.11 0.15 16.48
BC AB(602-995,1260-1444cm-1) 3 0.939 0.12 0.18 19.78
WE FST11 (250-3311cm-1) 3 0.941 0.12 0.18 19.78
WE FST11ROI (707-1853cm-1) 4 0.968 0.09 0.17 18.68
WE FST11A (602-995cm-1) 3 0.945 0.12 0.19 20.87
WE FST11B (1260-1444cm-1) 3 0.950 0.11 0.16 17.58
WE AB(602-995,1260-1444cm-1) 3 0.943 0.12 0.17 18.68
Table 116 Calibration Evaluation for the M5Ye data, with first derivative plus MSC pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11MSC(250-3311cm-1) 3 0.979 0.07 0.14 15.38
Avg FST11MSCROI(707-1853cm-1) 4 0.984 0.06 0.13 14.28
Avg FST11MSC A (602-995cm-1) 3 0.966 0.09 0.18 19.78
Avg FST11MSC B(1260-1444cm-1) 3 0.977 0.07 0.11 12.08
Avg AB(602-995,1260-1444cm-1) 3 0.969 0.09 0.14 15.38
BC FST11MSC (250-3311cm-1) 3 0.979 0.07 0.14 15.38
BC FST11MSCROI (707-1853cm-1) 4 0.985 0.06 0.12 13.18
BC FST11MSC A (602-995cm-1) 3 0.970 0.09 0.17 18.68
BC FST11MSC B (1260-1444cm-1) 3 0.979 0.07 0.11 12.08
BC AB(602-995,1260-1444cm-1) 3 0.970 0.08 0.14 15.38
WE FST11MSC (250-3311cm-1) 4 0.987 0.05 0.15 16.48
WE FST11MSCROI(707-1853cm-1) 4 0.998 0.05 0.12 13.18
WE FST11MSC A (602-995cm-1) 3 0.971 0.08 0.18 19.78
WE FST11MSC B (1260-1444cm-1) 4 0.987 0.05 0.11 12.08
WE AB(602-995,1260-1444cm-1) 4 0.989 0.05 0.13 14.28
Table 117 Calibration Evaluation for the M5YER1 data, with preliminary pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
AvgData (250-3311cm-1) 4 0.945 0.12 0.31 34.06
AvgData_ROI(707-1853cm-1) 2 0.635 0.31 0.43 47.25
AvgDataA (602-995cm-1) 3 0.788 0.23 0.39 42.85
AvgDataB(1260-1444cm-1) 2 0.741 0.26 0.34 37.36
AvgAB(602-995,1260-1444cm-1) 3 0.801 0.23 0.39 42.85
BC Data (250-3311cm-1) 3 0.886 0.17 0.39 42.85
BC Data_ROI(707-1853cm-1) 2 0.710 0.27 0.39 42.85
BC DataA (602-995cm-1) 3 0.854 0.19 0.33 36.26
BC DataB(1260-1444cm-1) 2 0.751 0.25 0.33 36.26
BC AB(602-995,1260-1444cm-1) 3 0.857 0.19 0.35 38.46
WE Data (250-3311cm-1) 2 0.870 0.18 0.31 34.06
WE Data_ROI(707-1853cm-1) 4 0.920 0.14 0.37 40.65
WE DataA (602-995cm-1) 3 0.882 0.17 0.34 37.36
WE DataB(1260-1444cm-1) 2 0.721 0.27 0.36 39.56
WE AB(602-995,1260-1444cm-1) 3 0.787 0.23 0.42 46.15
Table 118 Calibration Evaluation for the M5YER1 data, with MSC pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg MSC (250-3311cm-1) 3 0.938 0.12 0.31 34.06
Avg MSC_ROI (707-1853cm-1) 2 0.709 0.27 0.41 45.05
Avg MSCA (602-995cm-1) 3 0.886 0.17 0.30 32.96
Avg MSCB (1260-1444cm-1) 2 0.750 0.25 0.34 37.36
Avg AB(602-995,1260-1444cm-1) 3 0.847 0.20 0.36 39.56
BC MSC (250-3311cm-1) 3 0.894 0.16 0.32 35.16
BC MSC_ROI (707-1853cm-1) 2 0.722 0.27 0.37 40.65
BC MSCA (602-995cm-1) 3 0.863 0.19 0.31 34.06
BC MSCB (1260-1444cm-1) 2 0.762 0.25 0.32 35.16
BC AB(602-995,1260-1444cm-1) 3 0.855 0.19 0.33 36.26
WE MSC (250-3311cm-1) 3 0.936 0.13 0.24 26.37
WE MSC_ROI (707-1853cm-1) 2 0.750 0.25 0.37 40.65
WE MSCA (602-995cm-1) 3 0.906 0.15 0.29 31.86
WE MSCB (1260-1444cm-1) 2 0.793 0.23 0.30 32.96
WE AB(602-995,1260-1444cm-1) 3 0.823 0.21 0.36 39.56
Table 119 Calibration Evaluation for the M5YER1 data, with infinity (max) normalisation pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg NINF (250-3311cm-1) 4 0.945 0.12 0.30 32.96
Avg NINF ROI (707-1853cm-1) 2 0.657 0.30 0.42 46.15
Avg NINF A (602-995cm-1) 3 0.795 0.23 0.38 41.75
Avg NINF B(1260-1444cm-1) 2 0.747 0.26 0.33 36.26
Avg AB(602-995,1260-1444cm-1) 3 0.806 0.22 0.38 41.75
BC NINF (250-3311cm-1) 3 0.895 0.16 0.33 36.26
BC NINF ROI (707-1853cm-1) 2 0.730 0.26 0.36 39.56
BC NINF A (602-995cm-1) 3 0.845 0.20 0.33 36.26
BC NINF B(1260-1444cm-1) 2 0.763 0.25 0.32 35.16
BC AB(602-995,1260-1444cm-1) 3 0.850 0.20 0.34 37.36
WE NINF (250-3311cm-1) 3 0.946 0.12 0.26 28.57
WE NINF ROI (707-1853cm-1) 4 0.945 0.12 0.34 37.36
WE NINF A (602-995cm-1) 3 0.870 0.18 0.36 39.56
WE NINF B(1260-1444cm-1) 2 0.738 0.26 0.36 39.56
WE AB(602-995,1260-1444cm-1) 5 0.991 0.04 0.30 32.96
Table 120 Calibration Evaluation for the M5YER1 data, with Norm2 normalisation pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg Norm2 (250-3311cm-1) 4 0.931 0.13 0.34 37.36
Avg Norm2ROI (707-1853cm-1) 2 0.663 0.30 0.43 47.25
Avg Norm2A (602-995cm-1) 3 0.790 0.23 0.40 43.95
Avg Norm2B (1260-1444cm-1) 3 0.790 0.23 0.40 43.95
Avg AB(602-995,1260-1444cm-1) 3 0.802 0.22 0.41 45.05
BC Norm2 (250-3311cm-1) 3 0.898 0.16 0.32 35.16
BC Norm2ROI (707-1853cm-1) 2 0.731 0.26 0.36 39.56
BC Norm2A (602-995cm-1) 3 0.847 0.20 0.32 35.16
BC Norm2B (1260-1444cm-1) 2 0.764 0.25 0.32 35.16
BC AB(602-995,1260-1444cm-1) 3 0.852 0.19 0.34 37.36
WE Norm2Full (250-3311cm-1) 3 0.938 0.12 0.23 25.27
WE Norm2ROI (707-1853cm-1) 4 0.936 0.13 0.31 34.06
WE Norm2A (602-995cm-1) 3 0.903 0.16 0.28 30.76
WE Norm2B (1260-1444cm-1) 2 0.786 0.23 0.30 32.96
WE AB(602-995,1260-1444cm-1) 3 0.818 0.22 0.36 39.56
Table 121 Calibration Evaluation for the M5YER1 data, with first derivative pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11(250-3311cm-1) 4 0.998 0.02 0.34 37.36
Avg FST11ROI (707-1853cm-1) 3 0.954 0.11 0.34 37.36
Avg FST11A (602-995cm-1) 3 0.981 0.07 0.37 40.65
Avg FST11B (1260-1444cm-1) 3 0.905 0.15 0.37 40.65
Avg AB(602-995,1260-1444cm-1) 3 0.965 0.09 0.34 37.36
BC FST11 (250-3311cm-1) 4 0.998 0.02 0.34 37.36
BC FST11ROI (707-1853cm-1) 3 0.954 0.11 0.34 37.36
BC FST11A (602-995cm-1) 4 0.995 0.03 0.31 34.06
BC FST11B (1260-1444cm-1) 3 0.906 0.15 0.36 39.56
BC AB(602-995,1260-1444cm-1) 3 0.964 0.09 0.34 37.36
WE FST11 (250-3311cm-1) 4 0.998 0.02 0.35 38.46
WE FST11ROI (707-1853cm-1) 3 0.959 0.10 0.37 40.65
WE FST11A (602-995cm-1) 4 0.993 0.04 0.31 34.06
WE FST11B (1260-1444cm-1) 3 0.895 0.16 0.42 46.15
WE AB(602-995,1260-1444cm-1) 3 0.969 0.09 0.36 39.56
Table 122 Calibration Evaluation for the M5YER1 data, with first derivative plus MSC pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11MSC(250-3311cm-1) 4 0.998 0.02 0.32 35.16
Avg FST11MSCROI(707-1853cm-1) 3 0.955 0.11 0.33 36.26
Avg FST11MSC A (602-995cm-1) 4 0.995 0.03 0.27 29.67
Avg FST11MSC B(1260-1444cm-1) 3 0.898 0.16 0.36 39.56
Avg AB(602-995,1260-1444cm-1) 3 0.962 0.10 0.32 35.16
BC FST11MSC (250-3311cm-1) 4 0.998 0.02 0.32 35.16
BC FST11MSCROI (707-1853cm-1) 3 0.955 0.10 0.33 36.26
BC FST11MSC A (602-995cm-1) 4 0.994 0.03 0.28 30.76
BC FST11MSC B (1260-1444cm-1) 3 0.900 0.16 0.36 39.56
BC AB(602-995,1260-1444cm-1) 3 0.961 0.10 0.31 34.06
WE FST11MSC (250-3311cm-1) 2 0.865 0.18 0.33 36.26
WE FST11MSCROI(707-1853cm-1) 3 0.975 0.08 0.32 35.16
WE FST11MSC A (602-995cm-1) 3 0.977 0.07 0.31 34.06
WE FST11MSC B (1260-1444cm-1) 2 0.700 0.28 0.37 40.65
WE AB(602-995,1260-1444cm-1) 3 0.980 0.07 0.30 32.96
Table 123 Calibration Evaluation for the M5YER2 data, with preliminary pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
AvgData (250-3311cm-1) 4 0.956 0.10 0.27 29.67
AvgData_ROI(707-1853cm-1) 5 0.933 0.13 0.27 29.67
AvgDataA (602-995cm-1) 5 0.951 0.11 0.34 37.36
AvgDataB(1260-1444cm-1) 2 0.614 0.32 0.41 45.05
AvgAB(602-995,1260-1444cm-1) 4 0.928 0.13 0.30 32.96
BC Data (250-3311cm-1) 4 0.967 0.09 0.23 25.27
BC Data_ROI(707-1853cm-1) 2 0.724 0.27 0.40 43.95
BC DataA (602-995cm-1) 4 0.924 0.14 0.28 30.76
BC DataB(1260-1444cm-1) 2 0.751 0.25 0.33 36.26
BC AB(602-995,1260-1444cm-1) 4 0.921 0.14 0.27 29.67
WE Data (250-3311cm-1) 3 0.932 0.13 0.28 30.76
WE Data_ROI(707-1853cm-1) 3 0.805 0.23 0.35 38.46
WE DataA (602-995cm-1) 5 0.978 0.07 0.24 26.37
WE DataB(1260-1444cm-1) 2 0.679 0.30 0.36 39.56
WE AB(602-995,1260-1444cm-1) 5 0.978 0.07 0.27 29.67
Table 124 Calibration Evaluation for the M5YER2 data, with MSC pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg MSC (250-3311cm-1) 4 0.965 0.09 0.23 25.27
Avg MSC_ROI (707-1853cm-1) 2 0.638 0.31 0.48 52.74
Avg MSCA (602-995cm-1) 4 0.948 0.11 0.21 23.07
Avg MSCB (1260-1444cm-1) 2 0.735 0.26 0.33 36.26
Avg AB(602-995,1260-1444cm-1) 4 0.937 0.12 0.21 23.07
BC MSC (250-3311cm-1) 2 0.747 0.25 0.40 43.95
BC MSC_ROI (707-1853cm-1) 2 0.757 0.25 0.36 39.56
BC MSCA (602-995cm-1) 4 0.928 0.13 0.27 29.67
BC MSCB (1260-1444cm-1) 2 0.714 0.27 0.36 39.56
BC AB(602-995,1260-1444cm-1) 4 0.939 0.12 0.24 26.37
WE MSC (250-3311cm-1) 3 0.965 0.09 0.23 25.27
WE MSC_ROI (707-1853cm-1) 3 0.857 0.19 0.29 31.86
WE MSCA (602-995cm-1) 3 0.880 0.17 0.33 36.26
WE MSCB (1260-1444cm-1) 2 0.738 0.26 0.32 35.16
WE AB(602-995,1260-1444cm-1) 5 0.980 0.07 0.30 32.96
Table 125 Calibration Evaluation for the M5YER2 data, with infinity (max) normalisation pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg NINF (250-3311cm-1) 4 0.956 0.10 0.27 29.67
Avg NINF ROI (707-1853cm-1) 5 0.937 0.12 0.29 31.86
Avg NINF A (602-995cm-1) 5 0.954 0.11 0.35 38.46
Avg NINF B(1260-1444cm-1) 2 0.612 0.32 0.41 45.05
Avg AB(602-995,1260-1444cm-1) 4 0.931 0.13 0.30 32.96
BC NINF (250-3311cm-1) 3 0.960 0.10 0.26 28.57
BC NINF ROI (707-1853cm-1) 2 0.741 0.26 0.39 42.85
BC NINF A (602-995cm-1) 4 0.933 0.13 0.28 30.76
BC NINF B(1260-1444cm-1) 2 0.746 0.26 0.33 36.26
BC AB(602-995,1260-1444cm-1) 3 0.911 0.15 0.33 36.26
WE NINF (250-3311cm-1) 3 0.970 0.08 0.26 28.57
WE NINF ROI (707-1853cm-1) 4 0.980 0.07 0.28 30.76
WE NINF A (602-995cm-1) 3 0.924 0.14 0.31 34.06
WE NINF B(1260-1444cm-1) 2 0.726 0.27 0.35 38.46
WE AB(602-995,1260-1444cm-1) 5 0.991 0.04 0.28 30.76
Table 126 Calibration Evaluation for the M5YER2 data, with Norm2 normalisation pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg Norm2 (250-3311cm-1) 5 0.968 0.09 0.23 25.27
Avg Norm2ROI (707-1853cm-1) 5 0.933 0.13 0.28 30.76
Avg Norm2A (602-995cm-1) 5 0.951 0.11 0.34 37.36
Avg Norm2B (1260-1444cm-1) 2 0.647 0.30 0.38 41.75
Avg AB(602-995,1260-1444cm-1) 4 0.927 0.13 0.28 30.76
BC Norm2 (250-3311cm-1) 3 0.963 0.10 0.25 27.47
BC Norm2ROI (707-1853cm-1) 2 0.744 0.26 0.38 41.75
BC Norm2A (602-995cm-1) 4 0.937 0.12 0.27 29.67
BC Norm2B (1260-1444cm-1) 2 0.754 0.25 0.33 36.26
BC AB(602-995,1260-1444cm-1) 3 0.917 0.14 0.32 35.16
WE Norm2Full (250-3311cm-1) 3 0.964 0.09 0.24 26.37
WE Norm2ROI (707-1853cm-1) 4 0.971 0.08 0.28 30.76
WE Norm2A (602-995cm-1) 3 0.900 0.16 0.32 35.16
WE Norm2B (1260-1444cm-1) 2 0.733 0.26 0.32 35.16
WE AB(602-995,1260-1444cm-1) 5 0.982 0.06 0.26 28.57
Table 127 Calibration Evaluation for the M5YER2 data, with first derivative pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11(250-3311cm-1) 2 0.920 0.14 0.34 37.36
Avg FST11ROI (707-1853cm-1) 3 0.971 0.08 0.31 34.06
Avg FST11A (602-995cm-1) 3 0.951 0.11 0.31 34.06
Avg FST11B (1260-1444cm-1) 2 0.666 0.29 0.39 42.85
Avg AB(602-995,1260-1444cm-1) 3 0.942 0.12 0.30 32.96
BC FST11 (250-3311cm-1) 2 0.920 0.14 0.34 37.36
BC FST11ROI (707-1853cm-1) 3 0.971 0.08 0.31 34.06
BC FST11A (602-995cm-1) 3 0.945 0.12 0.32 35.16
BC FST11B (1260-1444cm-1) 2 0.668 0.29 0.39 42.85
BC AB(602-995,1260-1444cm-1) 3 0.942 0.12 0.30 32.96
WE FST11 (250-3311cm-1) 4 0.999 0.01 0.31 34.06
WE FST11ROI (707-1853cm-1) 3 0.942 0.12 0.37 40.65
WE FST11A (602-995cm-1) 3 0.947 0.12 0.33 36.26
WE FST11B (1260-1444cm-1) 2 0.651 0.30 0.41 45.05
WE AB(602-995,1260-1444cm-1) 3 0.910 0.15 0.35 38.46
Table 128 Calibration Evaluation for the M5YER2 data, with first derivative plus MSC pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11MSC(250-3311cm-1) 2 0.924 0.14 0.33 36.26
Avg FST11MSCROI(707-1853cm-1) 2 0.775 0.24 0.33 36.26
Avg FST11MSC A (602-995cm-1) 3 0.941 0.12 0.31 34.06
Avg FST11MSC B(1260-1444cm-1) 2 0.682 0.29 0.38 41.75
Avg AB(602-995,1260-1444cm-1) 3 0.940 0.12 0.30 32.96
BC FST11MSC (250-3311cm-1) 2 0.925 0.14 0.33 36.26
BC FST11MSCROI (707-1853cm-1) 3 0.980 0.07 0.30 32.96
BC FST11MSC A (602-995cm-1) 3 0.936 0.13 0.31 34.06
BC FST11MSC B (1260-1444cm-1) 3 0.932 0.13 0.37 40.65
BC AB(602-995,1260-1444cm-1) 3 0.941 0.12 0.30 32.96
WE FST11MSC (250-3311cm-1) 3 0.982 0.07 0.30 32.96
WE FST11MSCROI(707-1853cm-1) 3 0.985 0.06 0.31 34.06
WE FST11MSC A (602-995cm-1) 3 0.975 0.08 0.29 31.86
WE FST11MSC B (1260-1444cm-1) 2 0.705 0.28 0.37 40.65
WE AB(602-995,1260-1444cm-1) 3 0.969 0.09 0.30 32.96
Table 129 Calibration Evaluation for the M5YER3 data, with preliminary pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
AvgData (250-3311cm-1) 5 0.982 0.07 0.20 21.97
AvgData_ROI(707-1853cm-1) 4 0.900 0.16 0.42 46.15
AvgDataA (602-995cm-1) 5 0.994 0.03 0.21 23.07
AvgDataB(1260-1444cm-1) 3 0.908 0.15 0.29 31.86
AvgAB(602-995,1260-1444cm-1) 4 0.928 0.13 0.34 37.36
BC Data (250-3311cm-1) 4 0.976 0.08 0.20 21.97
BC Data_ROI(707-1853cm-1) 2 0.687 0.28 0.39 42.85
BC DataA (602-995cm-1) 3 0.833 0.21 0.32 35.16
BC DataB(1260-1444cm-1) 3 0.906 0.15 0.29 31.86
BC AB(602-995,1260-1444cm-1) 3 0.881 0.17 0.26 28.57
WE Data (250-3311cm-1) 4 0.992 0.04 0.18 19.78
WE Data_ROI(707-1853cm-1) 5 0.986 0.06 0.28 30.76
WE DataA (602-995cm-1) 3 0.965 0.09 0.18 19.78
WE DataB(1260-1444cm-1) 2 0.671 0.29 0.38 41.75
WE AB(602-995,1260-1444cm-1) 4 0.955 0.10 0.23 25.27
Table 130 Calibration Evaluation for the M5YER3 data, with MSC pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg MSC (250-3311cm-1) 4 0.979 0.07 0.23 25.27
Avg MSC_ROI (707-1853cm-1) 3 0.840 0.20 0.39 42.85
Avg MSCA (602-995cm-1) 3 0.886 0.17 0.26 28.57
Avg MSCB (1260-1444cm-1) 3 0.911 0.15 0.28 30.76
Avg AB(602-995,1260-1444cm-1) 4 0.950 0.11 0.28 30.76
BC MSC (250-3311cm-1) 4 0.971 0.08 0.21 23.07
BC MSC_ROI (707-1853cm-1) 3 0.901 0.16 0.30 32.96
BC MSCA (602-995cm-1) 3 0.870 0.18 0.32 35.16
BC MSCB (1260-1444cm-1) 3 0.914 0.15 0.28 30.76
BC AB(602-995,1260-1444cm-1) 3 0.904 0.16 0.27 29.67
WE MSC (250-3311cm-1) 3 0.950 0.11 0.27 29.67
WE MSC_ROI (707-1853cm-1) 4 0.971 0.08 0.24 26.37
WE MSCA (602-995cm-1) 4 0.979 0.07 0.20 21.97
WE MSCB (1260-1444cm-1) 2 0.738 0.26 0.33 36.26
WE AB(602-995,1260-1444cm-1) 4 0.951 0.11 0.26 28.57
Table 131 Calibration Evaluation for the M5YER3 data, with infinity (max) normalisation pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg NINF (250-3311cm-1) 5 0.983 0.06 0.19 20.87
Avg NINF ROI (707-1853cm-1) 4 0.911 0.15 0.40 43.95
Avg NINF A (602-995cm-1) 5 0.994 0.03 0.19 20.87
Avg NINF B(1260-1444cm-1) 3 0.920 0.14 0.26 28.57
Avg AB(602-995,1260-1444cm-1) 4 0.937 0.12 0.31 34.06
BC NINF (250-3311cm-1) 4 0.970 0.08 0.21 23.07
BC NINF ROI (707-1853cm-1) 2 0.681 0.29 0.38 41.75
BC NINF A (602-995cm-1) 3 0.827 0.21 0.34 37.36
BC NINF B(1260-1444cm-1) 3 0.921 0.14 0.26 28.57
BC AB(602-995,1260-1444cm-1) 3 0.886 0.17 0.26 28.57
WE NINF (250-3311cm-1) 3 0.954 0.11 0.30 32.96
WE NINF ROI (707-1853cm-1) 4 0.979 0.07 0.23 25.27
WE NINF A (602-995cm-1) 4 0.981 0.07 0.21 23.07
WE NINF B(1260-1444cm-1) 3 0.824 0.21 0.34 37.36
WE AB(602-995,1260-1444cm-1) 4 0.957 0.10 0.28 30.76
Table 132 Calibration Evaluation for the M5YER3 data, with Norm2 normalisation pre-processing.
M5Ye LV R2 RMSEC (g/L) RMSECV (g/L) REP (%)
Avg Norm2 (250-3311cm-1) 5 0.986 0.06 0.20 21.97
Avg Norm2ROI (707-1853cm-1) 4 0.923 0.14 0.36 39.56
Avg Norm2A (602-995cm-1) 4 0.940 0.12 0.26 28.57
Avg Norm2B (1260-1444cm-1) 3 0.930 0.13 0.24 26.37
Avg AB(602-995,1260-1444cm-1) 4 0.947 0.11 0.27 29.67
BC Norm2 (250-3311cm-1) 4 0.970 0.08 0.21 23.07
BC Norm2ROI (707-1853cm-1) 2 0.679 0.29 0.38 41.75
BC Norm2A (602-995cm-1) 3 0.827 0.21 0.34 37.36
BC Norm2B (1260-1444cm-1) 3 0.924 0.14 0.25 27.47
BC AB(602-995,1260-1444cm-1) 3 0.887 0.17 0.26 28.57
WE Norm2Full (250-3311cm-1) 4 0.987 0.05 0.23 25.27
WE Norm2ROI (707-1853cm-1) 4 0.970 0.09 0.24 26.37
WE Norm2A (602-995cm-1) 4 0.984 0.06 0.16 17.58
WE Norm2B (1260-1444cm-1) 2 0.732 0.26 0.33 36.26
WE AB(602-995,1260-1444cm-1) 4 0.961 0.10 0.22 24.17
Table 133 Calibration Evaluation for the M5YER3 data, with first derivative pre-processing.
M5Ye model LV R² RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11(250-3311cm-1) 2 0.863 0.19 0.32 35.16
Avg FST11ROI (707-1853cm-1) 3 0.987 0.05 0.30 32.96
Avg FST11A (602-995cm-1) 3 0.950 0.11 0.27 29.67
Avg FST11B (1260-1444cm-1) 4 0.972 0.08 0.33 36.26
Avg AB(602-995,1260-1444cm-1) 3 0.952 0.11 0.30 32.96
BC FST11 (250-3311cm-1) 2 0.863 0.19 0.32 35.16
BC FST11ROI (707-1853cm-1) 3 0.987 0.05 0.30 32.96
BC FST11A (602-995cm-1) 3 0.949 0.11 0.27 29.67
BC FST11B (1260-1444cm-1) 3 0.934 0.13 0.39 42.85
BC AB(602-995,1260-1444cm-1) 3 0.951 0.11 0.30 32.96
WE FST11 (250-3311cm-1) 2 0.857 0.19 0.33 36.26
WE FST11ROI (707-1853cm-1) 3 0.975 0.08 0.34 37.36
WE FST11A (602-995cm-1) 3 0.954 0.11 0.26 28.57
WE FST11B (1260-1444cm-1) 1 0.665 0.33 0.41 45.05
WE AB(602-995,1260-1444cm-1) 3 0.955 0.10 0.30 32.96
Table 134 Calibration Evaluation for the M5YER3 data, with first derivative MSC pre-processing.
M5Ye model LV R² RMSEC (g/L) RMSECV (g/L) REP (%)
Avg FST11MSC(250-3311cm-1) 2 0.849 0.20 0.31 34.06
Avg FST11MSCROI(707-1853cm-1) 3 0.977 0.07 0.30 32.96
Avg FST11MSC A (602-995cm-1) 3 0.957 0.10 0.27 29.67
Avg FST11MSC B(1260-1444cm-1) 4 0.973 0.08 0.34 37.36
Avg AB(602-995,1260-1444cm-1) 3 0.956 0.10 0.30 32.96
BC FST11MSC (250-3311cm-1) 3 0.979 0.07 0.33 36.26
BC FST11MSCROI (707-1853cm-1) 3 0.978 0.07 0.30 32.96
BC FST11MSC A (602-995cm-1) 3 0.957 0.10 0.27 29.67
BC FST11MSC B (1260-1444cm-1) 4 0.970 0.07 0.32 35.16
BC AB(602-995,1260-1444cm-1) 3 0.957 0.10 0.29 31.86
WE FST11MSC (250-3311cm-1) 2 0.855 0.19 0.31 34.06
WE FST11MSCROI(707-1853cm-1) 2 0.763 0.25 0.33 36.26
WE FST11MSC A (602-995cm-1) 3 0.981 0.07 0.21 23.07
WE FST11MSC B (1260-1444cm-1) 2 0.718 0.27 0.36 39.56
WE AB(602-995,1260-1444cm-1) 3 0.967 0.09 0.27 29.67
8.5 Supplementary Information for Chapter Five
[Image not reproduced: four contour plots; x-axis Excitation Wavelength (nm), y-axis Delta Wavelength (nm).]
Figure 117 TSFS Contour profile for the high and low concentration samples, (Top) M5eRDF
S01 and M5eRDF S10 and (Bottom) M5Ye S01 and M5Ye S10. 46
8.5.2 PARAFAC
Table 135 shows the explained variance and core consistency for the M5eRDF EEM
data. The number of components was chosen on a case-by-case basis using core
consistency: the appropriate model is the one with the highest number of components
that still has a valid core consistency value. In each case a two-component model
gave the optimal fit. With one component the core consistency was 100, but the
explained variance was lower (about 91 %). With three and four components the core
consistency was low, indicating that these models are over-factored and no longer
describe the data with a valid trilinear structure. Table 136 shows the explained
variance and core consistency for the M5Ye EEM data.
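The selection rule described above can be sketched as follows, using the core consistency values listed in Table 135 for the three M5eRDF replicate runs. The cut-off of 50 for a "valid" core consistency is an assumption made for illustration; the text does not state the threshold that was used, and the function name is hypothetical.

```python
# Core consistency values from Table 135 (M5eRDF EEM data), for 1-4 components.
core_consistency = {
    "R1": [100, 89, 12, 0],
    "R2": [100, 79, 16, 0],
    "R3": [100, 59, 31, 0],
}

# Assumed cut-off for a "valid" core consistency; not a value taken from the text.
VALID_THRESHOLD = 50.0


def select_components(values, threshold=VALID_THRESHOLD):
    """Return the largest component count whose core consistency meets the threshold."""
    valid = [n for n, cc in enumerate(values, start=1) if cc >= threshold]
    if not valid:
        raise ValueError("no model meets the core consistency threshold")
    return max(valid)


selected = {run: select_components(vals) for run, vals in core_consistency.items()}
print(selected)  # {'R1': 2, 'R2': 2, 'R3': 2}
```

With this assumed threshold the rule returns two components for each replicate run, in line with the choice reported for the M5eRDF data. A stricter or looser cut-off could change the result, so the threshold should be treated as a tunable judgment rather than a fixed constant.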
Table 135 Percentage Fit and core consistency for PARAFAC models using 1–4 factors for the
three replicate runs for M5eRDF EEM data collection.
No. of Components 1 2 3 4
Explained Variance (R1) 91.41 96.07 99.06 99.62
Core Consistency (R1) 100 89 12 0
Explained Variance (R2) 91.26 95.89 99.00 99.56
Core Consistency (R2) 100 79 16 0
Explained Variance (R3) 90.72 95.93 99.11 99.62
Core Consistency (R3) 100 59 31 0
Table 136 Percentage Fit and core consistency for PARAFAC models using 1–4 factors for the
three replicate runs for M5Ye EEM data collection.
No. of Components 1 2 3 4
Explained Variance (R1) 88.98 94.25 98.42 98.99
Core Consistency (R1) 100 98 86 32
Explained Variance (R2) 88.05 94.57 98.12 98.54
Core Consistency (R2) 100 99 90 27
Explained Variance (R3) 87.07 94.22 98.17 98.58
Core Consistency (R3) 100 99 87 40
Figure 118 Emission loadings of the PARAFAC model (left) and PARAFAC2 model (right) for
dataset1 (varying eRDF concentration) used in [1]
8.5.3 Calibration Models for the Fluorescence Data (Replicate Runs)
Table 137 Calibration Evaluation for the M5eRDF EEM data
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm em 270-600 nm)
eRDFR1 Unfolded 3 0.905 0.53 0.84 22.70
eRDFR1UnfoldedMSC 4 0.965 0.32 0.78 21.08
eRDFR1UnfoldedNorm 3 0.905 0.53 0.83 22.43
Reduced Area A (ex 230-315 nm em 270-435 nm)
eRDFR1 Unfolded 4 0.955 0.36 0.81 21.89
eRDFR1UnfoldedMSC 4 0.961 0.34 0.74 20
eRDFR1UnfoldedNorm 4 0.960 0.34 0.77 20.81
Reduced Area B (ex 250-360 nm em 285-425 nm)
eRDFR1 Unfolded 3 0.994 0.12 0.33 8.91
eRDFR1UnfoldedMSC 3 0.990 0.16 0.42 11.35
eRDFR1UnfoldedNorm 3 0.991 0.15 0.36 9.72
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm em 270-600 nm)
eRDFR2 Unfolded 3 0.981 0.24 0.38 10.27
eRDFR2UnfoldedMSC 3 0.973 0.28 0.46 12.43
eRDFR2UnfoldedNorm 3 0.976 0.26 0.43 11.62
Reduced Area A (ex 230-315 nm em 270-435 nm)
eRDFR2 Unfolded 3 0.980 0.24 0.37 10
eRDFR2UnfoldedMSC 3 0.970 0.30 0.48 12.97
eRDFR2UnfoldedNorm 3 0.975 0.27 0.43 11.62
Reduced Area B (ex 250-360 nm em 285-425 nm)
eRDFR2 Unfolded 2 0.973 0.28 0.35 9.45
eRDFR2UnfoldedMSC 2 0.967 0.31 0.43 11.62
eRDFR2UnfoldedNorm 2 0.974 0.27 0.37 10
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm em 270-600 nm)
eRDFR3 Unfolded 3 0.993 0.14 0.41 11.08
eRDFR3UnfoldedMSC 4 0.997 0.09 0.34 9.18
eRDFR3UnfoldedNorm 3 0.991 0.16 0.44 11.89
Reduced Area A (ex 230-315 nm em 270-435 nm)
eRDFR3 Unfolded 4 0.998 0.07 0.23 6.21
eRDFR3UnfoldedMSC 4 0.997 0.09 0.31 8.37
eRDFR3UnfoldedNorm 3 0.994 0.13 0.29 7.83
Reduced Area B (ex 250-360 nm em 285-425 nm)
eRDFR3 Unfolded 2 0.975 0.27 0.36 9.72
eRDFR3UnfoldedMSC 2 0.951 0.38 0.52 14.05
eRDFR3UnfoldedNorm 2 0.964 0.32 0.44 11.89
Table 138 Calibration Evaluation for the M5eRDF TSFS data
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm ∆λ 10-200 nm)
eRDFR1 Unfolded 3 0.927 0.46 0.76 20.54
eRDFR1UnfoldedMSC 3 0.938 0.42 0.75 20.27
eRDFR1UnfoldedNorm 3 0.929 0.45 0.76 20.54
Reduced Area (ex 230-310 nm ∆λ 10-190 nm)
eRDFR1 Unfolded 3 0.940 0.42 0.75 20.27
eRDFR1UnfoldedMSC 4 0.956 0.36 0.66 17.83
eRDFR1UnfoldedNorm 3 0.939 0.42 0.75 20.27
Reduced Area B (ex 250-310 nm ∆λ 10-140 nm)
eRDFR1 Unfolded 3 0.990 0.16 0.31 8.37
eRDFR1UnfoldedMSC 3 0.984 0.20 0.40 10.81
eRDFR1UnfoldedNorm 3 0.987 0.18 0.35 9.45
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm ∆λ 10-200 nm)
eRDFR2 Unfolded 3 0.987 0.19 0.31 8.37
eRDFR2UnfoldedMSC 3 0.977 0.25 0.42 11.35
eRDFR2UnfoldedNorm 3 0.983 0.22 0.37 10
Reduced Area (ex 230-310 nm ∆λ 10-190 nm)
eRDFR2 Unfolded 3 0.984 0.21 0.33 8.91
eRDFR2UnfoldedMSC 3 0.974 0.27 0.44 11.89
eRDFR2UnfoldedNorm 3 0.981 0.23 0.37 10
Reduced Area B (ex 250-310 nm ∆λ 10-140 nm)
eRDFR2 Unfolded 2 0.976 0.26 0.32 8.64
eRDFR2UnfoldedMSC 2 0.971 0.29 0.39 10.54
eRDFR2UnfoldedNorm 2 0.978 0.25 0.33 8.91
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm ∆λ 10-200 nm)
eRDFR3 Unfolded 4 0.997 0.09 0.31 8.37
eRDFR3UnfoldedMSC 4 0.996 0.10 0.33 8.91
eRDFR3UnfoldedNorm 4 0.997 0.09 0.30 8.10
Reduced Area (ex 230-310 nm ∆λ 10-190 nm)
eRDFR3 Unfolded 4 0.997 0.09 0.29 7.83
eRDFR3UnfoldedMSC 4 0.994 0.12 0.39 10.54
eRDFR3UnfoldedNorm 4 0.997 0.10 0.27 7.29
Reduced Area B (ex 250-310 nm ∆λ 10-140 nm)
eRDFR3 Unfolded 2 0.975 0.27 0.38 10.27
eRDFR3UnfoldedMSC 2 0.953 0.37 0.54 14.59
eRDFR3UnfoldedNorm 2 0.967 0.31 0.45 12.16
Table 139 Calibration Evaluation for the M5Ye EEM data
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm em 270-600 nm)
YeR1Unfolded 3 0.993 0.12 0.18 19.78
YeR1Unfolded MSC 2 0.933 0.12 0.16 17.58
YeR1Unfolded Norm 2 0.898 0.14 0.20 21.97
Reduced Area A (ex 230-315 nm em 270-435 nm)
YeR1Unfolded 2 0.878 0.16 0.23 25.27
YeR1Unfolded MSC 2 0.924 0.12 0.17 18.68
YeR1Unfolded Norm 2 0.879 0.16 0.22 24.17
Reduced Area B (ex 250-360 nm em 285-425 nm)
YeR1Unfolded 3 0.939 0.11 0.18 19.78
YeR1Unfolded MSC 2 0.930 0.12 0.16 17.58
YeR1Unfolded Norm 3 0.939 0.11 0.18 19.78
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm em 270-600 nm)
YeR2Unfolded 2 0.903 0.16 0.21 23.07
YeR2Unfolded MSC 2 0.925 0.14 0.18 19.78
YeR2Unfolded Norm 2 0.902 0.16 0.20 21.97
Reduced Area A (ex 230-315 nm em 270-435 nm)
YeR2Unfolded 5 0.988 0.05 0.13 14.28
YeR2Unfolded MSC 2 0.915 0.15 0.19 20.87
YeR2Unfolded Norm 5 0.984 0.06 0.14 15.38
Reduced Area B (ex 250-360 nm em 285-425 nm)
YeR2Unfolded 2 0.896 0.15 0.19 20.87
YeR2Unfolded MSC 2 0.908 0.14 0.18 19.78
YeR2Unfolded Norm 2 0.889 0.15 0.20 21.97
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm em 270-600 nm)
YeR3Unfolded 3 0.965 0.09 0.17 18.68
YeR3Unfolded MSC 2 0.943 0.12 0.16 17.58
YeR3Unfolded Norm 3 0.968 0.09 0.15 16.48
Reduced Area A (ex 230-315 nm em 270-435 nm)
YeR3Unfolded 3 0.955 0.10 0.21 23.07
YeR3Unfolded MSC 2 0.930 0.13 0.18 19.78
YeR3Unfolded Norm 3 0.962 0.10 0.17 18.68
Reduced Area B (ex 250-360 nm em 285-425 nm)
YeR3Unfolded 3 0.957 0.10 0.16 17.58
YeR3Unfolded MSC 3 0.951 0.11 0.18 19.78
YeR3Unfolded Norm 3 0.961 0.10 0.16 17.58
Table 140 Calibration Evaluation for the M5Ye TSFS data
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm ∆λ 10-200 nm)
YeR1Unfolded 3 0.963 0.08 0.15 16.48
YeR1Unfolded MSC 2 0.952 0.10 0.14 15.38
YeR1Unfolded Norm 3 0.962 0.09 0.15 16.48
Reduced Area (ex 230-310 nm ∆λ 10-190 nm)
YeR1Unfolded 3 0.963 0.08 0.14 15.38
YeR1Unfolded MSC 2 0.953 0.10 0.13 14.28
YeR1Unfolded Norm 3 0.959 0.09 0.15 16.48
Reduced Area B (ex 250-310 nm ∆λ 10-140 nm)
YeR1Unfolded 3 0.964 0.08 0.14 15.38
YeR1Unfolded MSC 3 0.966 0.08 0.14 15.38
YeR1Unfolded Norm 3 0.966 0.08 0.13 14.28
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm ∆λ 10-200 nm)
YeR2Unfolded 4 0.952 0.11 0.19 20.87
YeR2Unfolded MSC 2 0.927 0.13 0.17 18.68
YeR2Unfolded Norm 4 0.958 0.10 0.17 18.68
Reduced Area (ex 230-310 nm ∆λ 10-190 nm)
YeR2Unfolded 3 0.945 0.12 0.20 21.97
YeR2Unfolded MSC 2 0.928 0.13 0.17 18.68
YeR2Unfolded Norm 4 0.957 0.10 0.18 19.78
Reduced Area B (ex 250-310 nm ∆λ 10-140 nm)
YeR2Unfolded 4 0.978 0.07 0.14 15.38
YeR2Unfolded MSC 4 0.981 0.07 0.14 15.38
YeR2Unfolded Norm 4 0.981 0.07 0.13 14.28
Model LV Correlation RMSEC RMSECV REP %
Full Range (ex 230-520 nm ∆λ 10-200 nm)
YeR3Unfolded 3 0.959 0.10 0.18 19.78
YeR3Unfolded MSC 2 0.944 0.12 0.16 17.58
YeR3Unfolded Norm 3 0.964 0.09 0.16 17.58
Reduced Area (ex 230-310 nm ∆λ 10-190 nm)
YeR3Unfolded 3 0.960 0.10 0.18 19.78
YeR3Unfolded MSC 2 0.945 0.12 0.15 16.48
YeR3Unfolded Norm 3 0.965 0.09 0.16 17.58
Reduced Area B (ex 250-310 nm ∆λ 10-140 nm)
YeR3Unfolded 3 0.947 0.11 0.17 18.68
YeR3Unfolded MSC 3 0.947 0.11 0.18 19.78
YeR3Unfolded Norm 3 0.952 0.11 0.17 18.68
[Image not reproduced: two contour plots; x-axis Emission wavelength (nm), 300-420; y-axis ticks 250-360.]
Figure 119 Contour plots of Reduced Area B (ex 250-360 nm em 285-425 nm) for M5YeS01 (left)
and for M5YeS10 (right). 46
[Image not reproduced: two contour plots; x-axis Emission wavelength (nm), 300-420; y-axis Excitation wavelength (nm), 250-360.]
Figure 120 Contour plots of Reduced Area B (ex 250-360 nm em 285-425 nm) for M5eRDFS01
(left) and for M5eRDFS10 (right). 46