1 Introduction
The context and driving forces for the research done for this thesis are explained in this
chapter.
It outlines the primary goals of the study and emphasizes the significant contributions made by the proposed methodologies.
This chapter also includes the general thesis framework outline and the related
publications.
Several health-related problems can be detected easily when automated diagnostic techniques are integrated into health monitoring equipment. To facilitate effective detection of possible abnormalities, algorithms must be developed for automated analysis and classification. The main goal of this thesis is to provide algorithms for the efficient pre-processing, feature extraction, and machine learning-based analysis of biological signals such as PPG and EEG. The algorithms are specifically designed to work with real-time signals as well as with different standard databases that are available for public use.

Health-related issues are growing at an alarming rate in the recent era, irrespective of the age, sex, geographical location, food habits, lifestyle, emotional complications and job profile of individuals [1.1]. They are among the leading causes of death throughout the world, and the current scenario demands the attention of researchers in this domain. Expert medical advice states that prompt medical care and early detection can significantly lower heart- and brain-related complications, which in turn lowers the risk of death [1.2]. The traditional methods of diagnosis depend on skilled medical professionals visually examining the biological signals and the diagnostic reports. However, the enormous amount of data that needs to be analyzed, the time required, the detection error, and the scarcity of medical experts limit the application of this approach. More significantly, worse survival rates are experienced in underdeveloped nations due to restricted access to clinical knowledge and expensive diagnostics, which frequently cause treatment delays.
In recent times, there has been a notable focus on research for the creation of sophisticated personal
health monitoring devices that incorporate automated diagnostic procedures [1.3 - 1.6]. The devices'
affordability offers easy accessibility for the general public, while their portability allows regular health
monitoring even from the comfort of home without the need for specialized medical intervention [1.5,
1.6]. A sophisticated automated monitoring system like this can lower the risk of death by ensuring early
and reasonably priced diagnosis for a large population. Early detection of various abnormalities is
ensured by the automated signal analysis algorithms, which integrate the diagnostic intelligence required
for such devices. Hence, the major focus of this thesis is the development of automated analysis techniques for the prediction and classification of different abnormalities and symptoms.
Personal healthcare devices must use non-invasive, affordable, and readily acquirable biological signals in place of sophisticated and expensive diagnostic techniques to make inferences about the subject's health. The most favored methods for initial-level diagnosis in such devices continue to be the Photoplethysmogram (PPG), which records the optical changes in blood volume related to the heart's pumping action; the Electroencephalogram (EEG), which records the electrical activity of the brain; and the Electrocardiogram (ECG), which records the electrical activity of the heart [1.6].
The automated techniques developed so far are mainly based on the ECG signal because of the simplicity and consistency of its morphology. However, the signal acquisition set-up is challenging due to the placement of multiple electrodes on the subject. This is often uncomfortable for the subjects and complicates the recording procedure. This problem can be overcome by developing automated diagnostic algorithms based on the PPG signal instead of the ECG signal. The advantage of the PPG signal lies in the operation of the sensor, which can be fitted to the fingertip of a subject and does not create any discomfort [1.7]. Moreover, a single sensor is sufficient to record the signal attributes, which simplifies the overall acquisition procedure.
The respiration signal is another very important biological signal which, to date, has not received the same attention as the ECG and EEG signals. Yet this signal contains many vital signature properties that can have important diagnostic utility for the detection and monitoring of several abnormalities related to the heart, lungs and other physical and mental conditions such as tension, stress and emotions [1.8], [1.9]. The respiration signal is a simple periodic signal which can be acquired using a respiration sensor fitted to the chest of a subject by means of a belt. Tight fitting of the belt is essential for proper contact of the sensor with the chest, which can sometimes cause a little discomfort for the subjects. This problem can be avoided by using a PPG sensor instead of a respiration sensor, since the respiration signal remains embedded in the acquired PPG signal. This has the acquisition advantage mentioned above. The additional advantage lies in the use of a single-sensor approach for the acquisition of both signals, which opens up the possibility of multimodal analysis of a single signal.
The EEG signal has its own significance, which cannot be replaced by any other signal, owing to its origin and its impact on the everyday life of human beings. The brain is the most complex organ of the human body and is responsible for all voluntary and involuntary actions. EEG signals contain numerous signature properties, of which only a minor percentage has been explored to date. The major challenge lies in the non-stationarity of the signal and its combination of several sub-band frequencies, which change almost every instant for any individual. The acquisition of the EEG signal is the most challenging of all the biological signals because several electrodes need to be placed on the scalp following the positions governed by the International 10-20 electrode placement system. Despite the acquisition complexity, researchers are focusing on the analysis of the EEG signal because it has the potential of revealing valuable symptoms of numerous mental and physical problems and associated complications [1.10]. Present-day research is largely directed at emotion recognition, mental stress detection and applications in Brain Computer Interfaces (BCI) [1.11].

1.2 Objective & Contribution


The major objective of this thesis is to develop algorithms for –
 automated detection and classification of human emotions based on PPG signal
 automated estimation of mental stress based on PPG signal
 automated detection and classification of human eye ball movements based on EEG signal
 automated estimation of mental stress based on EEG signal
The algorithms have been specifically developed to satisfy the demanding specifications of automated
healthcare devices. The key contributions of the research work are:
i) The pre-processing techniques utilize filters such as the median filter and the moving average filter for significant noise reduction, and do not introduce much signal distortion (a minimal filtering sketch is given after this list).
ii) The PPG signal, which provides a less obtrusive and more cost-effective alternative to the ECG, has been employed for emotion detection and respiration rate estimation.
iii) The PPG signal, which provides a promising alternative to the EEG, has been employed for the detection of mental stress, in addition to its detection from the EEG signal.
iv) Eye-ball movement detection has been successfully carried out using the EEG signal, which contains vital embedded information related to eye movements, instead of estimating the movements from the EOG signal, whose acquisition is complex and creates considerable patient discomfort.
v) Instead of using advanced signal processing and data mining tools to achieve better performance, we have focused on simple techniques such as the conventional Discrete Wavelet Transform (DWT) and statistical analysis to ensure computational simplicity.
vi) The features used for diagnosis are easy to extract and do not rely solely on the accurate extraction of the different signal components, thus allowing considerable noise robustness.
vii) The analysis algorithms use a largely reduced feature dimension compared to other available techniques. This, along with the use of simple classification techniques, not only reduces the computational burden but also allows faster implementation.
viii) In addition to the standard databases, the algorithms have also been tested on real data acquired from various subjects to validate their clinical robustness.
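As an illustration of contribution (i), the following is a minimal pre-processing sketch in Python. The sampling rate, window lengths and the synthetic trace are assumptions for illustration and are not the parameters used in the thesis.

```python
# Minimal pre-processing sketch: median filtering followed by a moving average
# filter, as mentioned in contribution (i). Window lengths are illustrative
# assumptions, not the values used in the thesis.
import numpy as np
from scipy.signal import medfilt

def preprocess(signal, median_window=5, ma_window=7):
    """Suppress impulsive noise, then smooth residual high-frequency noise."""
    despiked = medfilt(signal, kernel_size=median_window)   # impulse/spike removal
    kernel = np.ones(ma_window) / ma_window                 # moving average kernel
    smoothed = np.convolve(despiked, kernel, mode="same")   # smoothing
    return smoothed

# Example: a noisy synthetic pulse-like trace
fs = 125                                  # assumed sampling rate (Hz)
t = np.arange(0, 10, 1 / fs)
ppg = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)
clean = preprocess(ppg)
```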
In addition to the original works, a detailed and comprehensive review of state-of-the-art techniques has been provided in each field, with valuable insights regarding their application to automated health monitoring platforms.
1.3 Thesis Organization
The structural overview and arrangement of the thesis are displayed in Figure 1.1, which shows an overview of the different chapters along with their content. Each chapter's pertinent publications are also indicated.
1.4 Synopsis
In recent times, rapid changes in human lifestyle, aging issues and the influence of increasing mental stress are imposing detrimental effects on human health. The resulting health complications and irregularities most often instigate the risk factors for different chronic diseases. The trend has already been indicated in the alarming reports of the World Health Organization (WHO), which specify that several such diseases were responsible for almost 31% of global mortality in 2021. The emotional state is considered one of the strongest physiological indicators for a preliminary assessment of the mental and psychological condition of any human being. Changes in the emotional states normally influence a wide range of physiological activities inside the body, and the resulting effects are found to be embedded in multiple physiological signals generated by the human body. Owing to its foundation in mental activity, the domain of emotional state characterization is mostly governed by the analysis of the electroencephalogram (EEG) signal. A majority of the EEG-based emotion detection methods, primarily meant for brain-computer interfacing, use a wide range of automated algorithms in order to minimize manual intervention and to obtain fast, efficient and expert-free detection of the emotional states.
Considering the severity of the situation, over the past two decades extensive scientific attention has been given to the development of compact, low-cost, portable and computer-assisted monitoring devices to combat the fatal consequences and to reduce the mortality rate. Such systems primarily use a wide range of bio-signals originating from the body. Consequently, these systems can be regarded as a lifesaving alternative that helps avoid sudden death threats and could also reduce a huge amount of post-treatment expenditure, especially for underprivileged rural populations around the world. Different aspects of the Photoplethysmogram (PPG) signal are therefore now being extensively investigated as a promising diagnostic alternative because of its effortless, portable, cost-effective acquisition technology and expert-independent operational simplicity compared with other standard bio-signals.

Emotion is a very complex mental state or process of human beings, which reflects human perceptions and attitudes and plays an important role in communication between people. Research on emotion recognition has very important value in human-computer interaction applications: if a human-computer interaction system can quickly and accurately identify human emotions, the interaction process becomes more friendly and natural. In the present-day scenario, the primary challenge is to provide access to basic medical facilities to low-income populations living in remote and rural areas with vastly compromised medical infrastructure. Consequently, the development of less complicated surrogate techniques for accurate detection of health conditions at the preliminary stage and at an affordable cost has become a global research priority.
The objective of the present thesis is to critically investigate the different statistical, time-domain and frequency-domain properties of PPG and EEG signals in order to establish their diagnostic utility for the recognition and classification of different emotions and for the identification and prediction of mental stress conditions in human beings. The analysis of emotions and mental stress is most effective if it is carried out using non-invasive signal acquisition. Hence, the present work is mainly focused on the processing of PPG and EEG signals. Apart from these two signals, the respiration signal is also very informative, since it carries critical information about the breathing rate, which often gets affected by stress or emotional state. Heart rate variability can also be a valuable source of information for predicting stress conditions, and so the ECG can also be a vital signal for the present study. The primary aspect of any biomedical signal analysis is pre-processing of the acquired signal, followed by feature extraction and finally classification based on the extracted features. The resulting features are then utilized via some standard techniques for the assessment of different diseases or the extraction of some vital parameters. Therefore, in the present research, the whole endeavor has been divided into four parts as follows:

1) PPG Signal Analysis for Emotion Recognition & Classification:


To facilitate accurate time-domain analysis of the PPG signal, a robust, automated yet simple algorithm is developed in this study for the classification of five distinct emotions using a threshold-based rule. Only two computationally simple features are used to identify the various emotions from the artifact-free, normalized PPG signal.
2) PPG Signal Analysis for Mental Stress Detection:
A simple yet effective algorithm is proposed in this study for the determination and classification of mental stress conditions from the easy-to-acquire PPG signal and the respiratory signal extracted from it. In this approach, the PPG signal is characterized multi-modally to determine the stressed conditions. Two simple features are used to distinguish the stressed conditions from the relaxed conditions of the subjects using a threshold-based classification technique.
3) EEG Signal Analysis for Eye Ball Movement Detection:
A simple, robust EEG-based algorithm is proposed in this work for the automated identification of six different classes of eye movements with significant accuracy. The algorithm initially applies the discrete wavelet transform (DWT) to the EEG signal obtained from six different leads in order to eliminate a wide range of noise and artifacts. The detection of the eye movements is then carried out using a single feature value. The identification of eye-ball movements is important because it can serve as one of the identifying properties for stress analysis.
4) EEG Signal Analysis for Mental Stress Detection:
An automated method of stress detection based on EEG signal analysis is presented which is computationally simple and yet achieves high detection accuracy. The present study uses two simple features extracted from clean EEG sub-bands to identify mental stress conditions with respect to the relaxed condition. The DWT is used to filter out the unwanted frequency components along with the different artifacts, and standard classifiers are used to classify between stressed and relaxed states (an illustrative DWT decomposition sketch follows this overview).
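As an illustration of the DWT-based sub-band separation used in parts 3) and 4), the following sketch decomposes a single EEG channel into wavelet sub-bands and computes a simple statistical feature per band. The wavelet family, decomposition level, sampling rate and feature are assumptions for illustration and do not reproduce the thesis's exact design.

```python
# Illustrative DWT sub-band decomposition of an EEG channel, in the spirit of
# parts 3) and 4). The wavelet family ('db4'), the decomposition level and the
# 128 Hz sampling rate are assumptions for illustration only.
import numpy as np
import pywt

fs = 128                                   # assumed EEG sampling rate (Hz)
eeg = np.random.randn(10 * fs)             # placeholder for one EEG channel

# A 5-level decomposition of a 0-64 Hz signal yields detail bands of roughly
# 32-64, 16-32, 8-16, 4-8 and 2-4 Hz, plus a 0-2 Hz approximation.
coeffs = pywt.wavedec(eeg, 'db4', level=5)
cA5, cD5, cD4, cD3, cD2, cD1 = coeffs

# Example of a simple statistical feature per sub-band (in the spirit of the
# "simple features" extracted from clean sub-bands in the summaries above).
features = {name: np.std(c) for name, c in
            zip(['cA5', 'cD5', 'cD4', 'cD3', 'cD2', 'cD1'], coeffs)}
```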
This chapter gives a brief description of the different biomedical signals used in the thesis:

 Photoplethysmogram (PPG)
 Electroencephalogram (EEG)

PPG is an optical technique for measuring the blood volume changes in the tissues due to the cardiac cycle.
EEG is the electrical activity related to the synchronous activity of the neurons inside the human brain.

The automated biomedical signal analysis and processing systems utilize the different signals
generated by the human body in order to diagnose the physical and mental state of an
individual. In order to provide patient comfort and mobility at reasonable costs, these systems
need to employ low-cost and non-intrusive signal acquisition methods. Photoplethysmogram,
Electroencephalogram and Electrocardiogram are the most widely used physiological signals.
The purpose of this chapter is to give a quick summary of these signals, their physiological origin and importance, and the conventions used to measure them.

2.1 Photoplethysmogram (PPG)

The Photoplethysmogram (PPG) is a type of plethysmogram that is acquired optically and is useful for monitoring changes in blood volume within the microvascular bed of tissue [2.1], [2.2]. Peripheral tissue is illuminated by optical radiation, which is absorbed and scattered as it passes through various tissue layers and is then either transmitted through or reflected off the tissue surface. An optical sensor detects this reduced light intensity, which is recorded as a voltage signal known as the photoplethysmogram (PPG). As seen in Figure 2.1 [2.3], a raw PPG waveform depicts the changes in the attenuation of the incident optical radiation by the various tissue components throughout the tissue volume.

Because the photoplethysmogram (PPG) signal is non-invasive and inexpensive to acquire, it is frequently utilized in clinical and consumer devices [2.1]. Its main applications in the past have been the measurement of blood oxygen saturation and the monitoring of heart rate in patients at rest. PPG signal processing has been used in therapeutic settings for many years, but it is now a large and expanding field of study, spurred by the growing usage of PPG sensors in consumer wearables. The design of signal processing algorithms faces a number of difficulties in this environment, including the challenge of managing motion artifacts. Furthermore, the PPG signal contains important information about the respiratory, cardiovascular, and autonomic nervous systems that is not yet commonly utilized. All of these elements together make it possible to use the PPG alone to offer comprehensive health information in everyday situations. The creation of reliable PPG signal processing algorithms is a crucial first step towards exploiting this possibility.

2.1.1 PPG sensor:


Modern PPG sensors illuminate the tissue using light-emitting diodes (LEDs) and assess the amount of light that is reflected or transmitted using a photodiode. A matching set of LEDs and photo-detectors operating in the 0.8–1 μm near-infrared wavelength range is utilized. Because blood contains hemoglobin, which absorbs light, the amount of light that is transmitted or backscattered depends on the volume of blood in the arteries, which is measured by the photo-detector. The recording is usually done from the fingertips, the earlobes, or the toes. Important benefits of the optical PPG monitoring set-up include robustness, affordability, and easy-to-wear sensors that promote patient comfort and mobility.

2.1.2 PPG signal:


The PPG is inherently a low-frequency signal composed of a pulsatile AC component that represents the variations in blood volume occurring in tandem with the cardiac cycle. A typical PPG wave and its distinguishing fiducial points are depicted in Figure 2.2. The waveform's rising portion represents systole, the contraction of the ventricles, while its falling portion represents the relaxation of the ventricles as well as wave reflections from the arterial boundary. Every pulse starts with blood being ejected from the heart into the aorta. Aortic valve closure is indicated by the dicrotic notch, which signals that the ejection phase is coming to an end.
Lower frequency fluctuations (the "DC" part) are brought on by changes in other tissue
components like venous and capillary blood, bloodless tissue, etc., while higher frequency
variations (the "AC" part) are brought on by changes in arterial blood volume with each
heartbeat. Based on the modified Beer-Lambert law, the attenuation of light in tissue can be
expressed as a function of the optical path length and the medium's attenuation coefficient [2.4].
The direction of red blood cells, the mechanical movement of cellular components, and a
confluence of elements have also been suggested as the origins of the PPG waveform [2.5]–
[2.7].
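The Beer-Lambert relationship referred to above can be illustrated with a toy calculation. This is a simplified sketch (the scattering corrections of the modified law are omitted) with arbitrary numerical values; it merely shows how a small pulsatile change in optical path length yields the AC component on top of a large DC level.

```python
# Toy illustration of the Beer-Lambert style attenuation underlying the PPG:
# transmitted intensity falls exponentially with optical path length, so a small
# pulsatile change in arterial path length produces the "AC" ripple on top of a
# large "DC" level. The numbers (I0, mu, path lengths) are arbitrary and the
# scattering corrections of the *modified* Beer-Lambert law are omitted.
import numpy as np

I0 = 1.0            # incident intensity (arbitrary units)
mu = 0.8            # effective attenuation coefficient (1/mm), assumed
d_static = 5.0      # static tissue path length (mm), assumed

fs = 125
t = np.arange(0, 5, 1 / fs)
d_pulse = 0.05 * (1 + np.sin(2 * np.pi * 1.2 * t))   # tiny cardiac-synchronous change

I = I0 * np.exp(-mu * (d_static + d_pulse))          # detected intensity
dc_level = I.mean()
ac_ripple = I - dc_level                             # pulsatile (AC) component
```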
2.1.3 Diagnostic utility of PPG:
In the past, the pulse oximeter was the sole device that made use of the PPG signal. The PPG is a valuable source of information on the cardio-pulmonary system because it shows how the heart's pumping action affects the blood volume in the arteries. The signal reflects the blood flow through the arteries, which propagates in a wave-like pattern from the heart to the peripheral measurement site (Figure 2.2). Therefore, modifications to the arterial characteristics that affect the heartbeat, the hemodynamics, and the physiological state also have an impact on the PPG signal's waveform profile. PPG signal analysis has recently been used in many commercially available medical devices for cardiac output, blood pressure, and heart rate measurement and for peripheral vascular disease detection [2.8, 2.9]. A typical PPG wave comprises a pulsatile (AC) component that indicates the blood volume changes occurring between the systolic and diastolic phases of the cardiac cycle, and a DC component that depends on the tissue structure, residual blood, venous blood and arterial blood oxygen saturation (SpO2) [2.10]. The fundamental frequency of the pulsatile component is directly correlated with the heart rate. There is a variation in the onset points of consecutive PPG beats, known as the baseline wandering effect. This is normally caused by the respiration cycle and can therefore offer valuable information on the respiration rate (see Figure 2.3). As can be seen in Figure 2.4, the PPG wave can typically be split into two phases: the systolic phase with the systolic (direct) wave and the diastolic phase with the diastolic (reflected) wave [2.11]. Pressure travels directly from the aortic root to the measuring site, where it appears as the systolic (direct) wave, while the reflection of waves from the periphery produces the diastolic wave.
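The two relationships described in this section, heart rate from the fundamental frequency of the pulsatile component and respiration rate from the low-frequency baseline wander, can be sketched as follows. The frequency bands, sampling rate and synthetic signal are illustrative assumptions, not values prescribed by the thesis.

```python
# Sketch of two quantities described above: heart rate from the fundamental
# frequency of the pulsatile (AC) component, and respiratory rate from the
# low-frequency baseline wander. Bands and sampling rate are illustrative.
import numpy as np
from scipy.signal import welch

def dominant_freq(x, fs, f_lo, f_hi):
    """Return the frequency of the largest spectral peak inside [f_lo, f_hi]."""
    f, pxx = welch(x, fs=fs, nperseg=min(len(x), 30 * fs))
    band = (f >= f_lo) & (f <= f_hi)
    return f[band][np.argmax(pxx[band])]

fs = 125
t = np.arange(0, 60, 1 / fs)
# Synthetic PPG: 1.2 Hz cardiac component riding on 0.25 Hz respiratory wander.
ppg = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 0.25 * t)

heart_rate_bpm = 60 * dominant_freq(ppg, fs, 0.7, 3.0)    # roughly 72 bpm here
resp_rate_bpm = 60 * dominant_freq(ppg, fs, 0.1, 0.5)     # roughly 15 breaths/min
```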

2.1.4 PPG Signal Acquisition


A photo-detector (PD), sometimes known as a photo sensor, and a light source make up a PPG sensor. The skin tissue is illuminated by the light source, which is usually an LED. In addition to the spectrum and intensity of the reflected or transmitted light, the PD also receives stray light that falls on the sensor from the surroundings. A photodiode is usually used to realize the PD, although a smartphone camera can also be used.
A PPG sensor essentially has two operational modes: the transmission mode, where the tissue is positioned between the two components, and the reflectance mode, where the light source and detector are placed side by side. By its nature, the transmission mode can only be used on a restricted number of body parts, such as the earlobes or fingers. Both operating modes are illustrated on a finger in Figure 2.5.
The LED's wavelength selection is crucial for the PPG sensor's light source. Anderson and
Parrish [2.12] investigated the optical properties and light penetration depth of human skin. The
primary radiation absorber in the epidermis is melanin, particularly at shorter wavelengths in the
ultraviolet (UV) region, which is defined as having a wavelength of less than 300 nm. Within
this region, several acids and epidermal thickness are also significant variables. The blue range
(400–500 nm) has the highest absorption.
Since skin and most other soft tissues have an optical "window" in the red and near infrared (IR)
area between 600 and 1300 nm, near IR LEDs have been employed as a light source in PPG
sensors, particularly in medical applications. The sensor can examine larger tissue beds at lower
body levels and obtain more biometric data, such as hemoglobin, muscle saturation, and
hydration, because of the near-infrared light's penetration depth of approximately 1600 nm at a
wavelength of 1000 nm. In comparison to green light, red light is also considerably less affected
by physiological variances, tattoos, freckle patterns, and changes in skin tone. To obtain a good
signal-to-noise ratio (SNR), red light PPG sensors need sophisticated and reliable signal
processing since they are more prone to motion artifacts [2.13].
Green light, which has a wavelength of 560 nm, on the other hand, can only study the
superficial blood vessels and only reach a depth of around 420 nm. This might cause significant
issues in the vicinity of the wrist, where blood circulation is restricted. PPG sensors with green
LEDs have a greater SNR and are more resilient to motion artifacts, despite the fact that they do
not penetrate deeply. Green LEDs are widely used by producers of photoplethysmography-
based fitness devices because they need less signal processing and provide a lot of existing
product expertise for future development. Green LEDs are best for daily heart rate monitoring;
however, deeper penetrating red or infrared LEDs are advised for further biometric information
extraction, such as SpO2.

2.1.5 Areas of Application


Over the years, photoplethysmography has been used in numerous clinical settings. It is
commonly used in clinical physiological monitoring to measure vital signs such as heart rate,
blood pressure, cardiac output, and respiration. It is also used in vascular assessment, such as
venous assessment or arterial disease [2.1]. In particular, the pulse oximeter, which measures heart rate and SpO2 using photoplethysmography, is considered to be among the most significant technological advances in clinical patient monitoring over the past few decades [2.14]. Heart rate estimation in particular is being utilized more and more in non-medical contexts owing to the growing number of wearables, such as fitness trackers and smart watches. The primary benefit of a smart watch or fitness tracker is the ability to permanently store measurements that may yield fresh health-related information. Numerous applications are currently available that can measure heart rate and emotional activity, or even identify problems related to mental stress [2.15–2.19].
Heart rate is determined by the actions of the sympathetic and parasympathetic branches of the autonomic nervous system (ANS), which regulate the firing rate of the heart's sinus node. Pulse rate variability (PRV) has been suggested as a possible substitute for heart rate variability. Nowadays, PRV is frequently utilized in research and has been suggested as a useful tool for a number of applications, including pharmacological research, emotion research, mental health assessment, sleep studies, and the identification, characterization, and monitoring of somatic disorders.
In many clinical contexts, the respiratory rate (RR), or the number of breaths per minute, is utilized for prognosis and diagnosis. In critically ill hospitalized patients, the RR is a marker for clinical deterioration; a higher RR is indicative of unfavorable outcomes such as cardiac arrest and death. Sepsis and pneumonia can also be diagnosed using the RR. Despite its significance, outside of critical care the RR is typically determined by manual breath counting, which is a laborious and imprecise technique. Additionally, many of the current techniques for RR monitoring in wearables call for tools like chest bands. As a result, there is a lot of promise in an unobtrusive way of monitoring the RR using standard sensors such as a PPG sensor.

2.2 Electroencephalogram (EEG)

The human brain begins to fire neurally between the seventeenth and twenty-third week of
pregnancy. It is thought that electrical impulses produced by the brain at this early age and
throughout life reflect not just the health of the brain but also the state of the entire body. This
presumption serves as the driving force for the application of sophisticated digital signal
processing techniques to electroencephalogram (EEG) signals obtained from human subjects'
brains. Although this thesis makes no attempt to discuss the physiological components of brain activity in detail, there are a number of questions that need to be addressed regarding the nature of the original sources, their real patterns, and the properties of the medium. The medium delineates the path from the neurons, which function as signal sources, to the electrodes, which are the sensors that measure combinations of these sources. On the other hand, for those who work with these signals for the detection, diagnosis, and treatment of brain disorders and the related diseases, an understanding of the neuronal functions and neurophysiological properties of the brain, as well as the mechanisms underlying the generation of the signals and their recording, is essential.
2.2.1 EEG Signal
EEG signal represents the currents that flow during synaptic excitations of the dendrites of
many pyramidal neurons in the cerebral cortex. The dendrites of brain cells (neurons) produce
synaptic currents when they are triggered. The summed postsynaptic graded potentials from
pyramidal cells, which form electrical dipoles between the soma (the body of a neuron) and the
apical dendrites, are the cause of differences in electrical potentials (Figure 2.6). The positive ions sodium (Na+), potassium (K+) and calcium (Ca++), and the negative chloride ion (Cl-), are pumped through the neuron membranes in the direction determined by the membrane potential and produce the majority of the current in the brain [2.20].

The human skull, brain, scalp, and other thin layers in between make up the various layers of the human head. Signals are attenuated by the skull about a hundred times more than by the soft tissue. However, the majority of noise is produced either above the scalp (external or system noise) or inside the brain (internal noise). Consequently, the scalp electrodes can only record appreciable potentials from large populations of activated neurons, and the recorded signals are then substantially amplified for display purposes. The central nervous system (CNS) contains about 10^11 neurons at birth, when it is fully developed [2.21]. This corresponds to roughly 10^4 neurons on average per cubic millimeter. Synapses allow neurons to link with one another to form neural networks.

An adult has roughly 5 × 10^14 synapses. While the number of neurons declines with age, the number of synapses per neuron grows. Anatomically, the brain can be categorized into three regions: the brainstem, the cerebellum, and the cerebrum (Figure 2.7). The cerebral cortex, comprising the left and right hemispheres of the brain, is a highly convoluted layer on the surface of the brain. The areas of the brain responsible for initiating movement, conscious awareness of sensation, complex processing, and emotional and behavioral expression are all located in this region. Balance is preserved and voluntary muscle actions are coordinated by the cerebellum. Involuntary processes such as breathing, heart rate regulation, biorhythms, and neuro-hormone and hormone secretion are all managed by the brain stem [2.22].

It is evident from the section above that the study of EEG opens the door to the identification of
a wide range of neurological conditions as well as other anomalies in the human body. The
following clinical issues can be investigated using the obtained EEG signal from humans (and
also from animals) [2.22, 2.23]:

(a) tracking consciousness, unconsciousness, and brain death;


(b) identifying damage sites after stroke, brain tumor, and head injury;
(c) assessing afferent pathways (using evoked potentials);
(d) tracking cognitive activity (alpha rhythm);
(e) creating biofeedback scenarios;
(f) regulating the depth of anesthesia (servo anesthesia);
(g) researching epilepsy and identifying the cause of seizures;
(h) assessing the impact of epilepsy medications;
(i) helping with the experimental cortical excision of the epileptic focus;
(j) observing brain development;
(k) testing medications for convulsive effects;
(l) looking into physiology and sleep disorders;
(m) looking into mental disorders; and
(n) offering a hybrid data recording system in addition to other imaging modalities.

The above compilation demonstrates the great potential of EEG analysis and highlights the necessity for sophisticated signal processing methods to support the physician in interpreting the results. The EEG rhythms, which should be detectable in EEG recordings, are described below.

2.2.2 EEG Rhythms


Visual examination of EEG signals aids in the diagnosis of numerous brain illnesses, and the way that brain rhythms show up in EEG data is well known to clinical experts in the field. The frequencies and amplitudes of these signals vary in healthy adults depending on the person's state, such as wakefulness versus sleep. The five main brain waves are differentiated by their different frequency ranges. These frequency bands are named delta (δ), theta (θ), alpha (α), beta (β), and gamma (γ), ranging from the low to the high frequency range. In 1929, Berger introduced the alpha and beta waves. The term "gamma" was first used by Jasper and Andrews (1938) to describe waves higher than 30 Hz. Walter (1936) established the delta rhythm to denote all frequencies below the alpha range, and he also defined theta waves as those with frequencies between 4 and 8 Hz. The notion of a theta wave was introduced by Walter and Dovey in 1944 [2.24].

The frequency range of delta waves is 0.5–4 Hz. These waves are generally linked to deep sleep, although they may also occur during waking hours. It is very easy to mistake the real delta response for artifact signals coming from the large muscles of the neck and jaw, because these muscles lie close to the skin's surface and generate strong signals, whereas the signal of interest originates deep within the brain and is severely attenuated as it passes through the skull. Nevertheless, it is rather straightforward to determine, by applying basic signal analysis techniques to the EEG, when the response is caused by excessive movement.

Theta waves occur in the range of 4–8 Hz. The word "theta" may have been chosen to suggest an origin in the thalamus. Theta waves appear as slumber begins to fade into wakefulness, and they have been linked to deep concentration, creative inspiration, and access to unconscious material. A theta wave appears to be connected to the level of arousal and is frequently accompanied by other frequencies. It is well known that the alpha wave of skilled meditators eventually decreases in frequency over extended periods of time. Theta waves are significant during the early years of life. Greater theta wave activity in awake adults is pathological and results from a variety of problems. Studies on emotions and maturation examine the variations in the theta wave rhythm [2.25].

Alpha waves typically appear over the occipital region and in the posterior half of the head, and they are present over the entire posterior lobes of the brain. The frequency range of alpha waves is 8–13 Hz, and they often manifest as a rounded or sinusoidal waveform, although they may occasionally appear as sharp waves. It has been suggested that alpha waves signify a state of relaxed awareness devoid of any focus or attention. The alpha wave, which may span a wider range than previously thought, is the most noticeable rhythm in the entire field of brain activity. Even up to 20 Hz, a peak is frequently observed in the beta wave range that resembles an alpha-wave condition rather than a beta wave; once more, an alpha-like response frequently emerges at around 75 Hz. It has been suggested that alpha waves are only a waiting or scanning pattern generated by the visual regions of the brain, because most persons produce some alpha waves while their eyes are closed. The alpha rhythm is lessened or abolished when the eyes are opened, when strange noises are heard, when the subject is anxious, or when mental attention is focused. The amplitude of an alpha wave is typically less than 50 μV and is highest over the occipital areas. More investigation is needed to determine how the alpha wave forms in cortical cells, as the physiological relevance and genesis of this phenomenon remain unclear [2.26].

The electrical activity of the brain in the range of 14 to 26 Hz is known as the beta wave (although some literature does not specify an upper bound). In healthy people, the beta wave is the typical waking rhythm of the brain, linked to active thinking, active attention, external focus, and problem-solving. A person experiencing panic may show high levels of beta activity. Rhythmic beta activity is primarily observed over the frontal and central regions. Significantly, a central beta rhythm, which is associated with the rolandic mu rhythm, can be blocked by a motor action or tactile stimulation. Typically, the beta rhythm amplitude is less than 30 μV. Like the mu rhythm, the beta wave can also be amplified by bone defects and in areas surrounding tumors.

The range of frequencies over 30 Hz, primarily up to 45 Hz, is known as the gamma range, or
rapid beta wave. These rhythms have extremely small amplitudes and are not frequently
observed, but their detection can be used to establish the existence of specific brain illnesses.
The frontal and central area contains the regions with the highest cerebral blood flow, oxygen
and glucose consumption, and high EEG frequencies. The locus for movement of the right and
left index fingers, the right toes, and the large and bilateral area for movement of the tongue can
all be seen using the gamma wave band, which has also been shown to be a useful indicator of
event-related synchronization (ERS) of the brain [2.27].

The normal brain rhythms and their typical amplitude levels are depicted in Figure 2.8. Generally speaking, the leptomeninges, cerebrospinal fluid, dura mater, bone, galea, and scalp attenuate the brain activities that project into the EEG signals. The amplitudes of cortical discharges range from 0.5 to 1.5 mV, with spikes reaching several millivolts; on the scalp, however, the amplitudes usually fall between 10 and 100 μV. These rhythms are only approximately cyclic in nature, and they may persist if the subject's condition does not alter.
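As a hedged illustration of how these rhythms are typically quantified, the sketch below computes relative band powers for the five bands using Welch's method. The band edges follow the ranges quoted in this section; the sampling rate and the synthetic signal are placeholders.

```python
# Sketch of quantifying the five rhythms described above as relative band
# powers from a single EEG channel, using the band edges quoted in the text
# (delta 0.5-4, theta 4-8, alpha 8-13, beta 14-26, gamma 30-45 Hz).
import numpy as np
from scipy.signal import welch

bands = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (14, 26), "gamma": (30, 45)}

fs = 256
t = np.arange(0, 30, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)   # alpha-dominated toy signal

f, pxx = welch(eeg, fs=fs, nperseg=4 * fs)
df = f[1] - f[0]
total = np.sum(pxx[(f >= 0.5) & (f <= 45)]) * df

band_power = {}
for name, (lo, hi) in bands.items():
    mask = (f >= lo) & (f <= hi)
    band_power[name] = np.sum(pxx[mask]) * df / total   # relative power per rhythm

print(band_power)   # alpha should dominate for this toy signal
```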

2.2.3 EEG Signal Acquisition


A series of fine electrodes, a set of differential amplifiers (one for each channel) followed by filters [2.22], and needle (pen)-type registers made up the conventional analogue EEG devices, in which the multichannel EEGs were plotted on plain paper. Shortly after such devices hit the market, researchers began searching for computerized systems that could digitize and store the signals, and it was quickly realized that digital signals were necessary for the analysis of EEG recordings. For this, the signals had to be sampled, quantized, and encoded. The data volume, measured in bits, rises as the number of electrodes increases. Variable settings, stimulations, and sampling frequencies are possible with the computerized systems, and some of them have basic or sophisticated signal processing facilities. Multichannel analogue-to-digital converters (ADCs) are used to convert the analogue EEG data to digital form. Fortunately, EEG signals have an effective bandwidth of only about 100 Hz, and for numerous applications this bandwidth may be regarded as even half of this figure. Thus, in order to meet the Nyquist requirement, a sampling rate of 200 samples/s is frequently sufficient for EEG signals. Sampling frequencies of up to 1000 samples/s may be used in situations where a better resolution is necessary for representing brain activity in the frequency domain.
The quantization of EEG signals is often quite precise in order to preserve the diagnostic information. For EEG recording devices, representation of individual signal samples with up to 16 bits is highly popular. Due to this, a large amount of memory is required for signal archiving, particularly for sleep EEG and epileptic seizure monitoring data; generally speaking, however, much more capacity is needed to archive radiological images than EEG data. Even though the formats used by various EEG devices to record the data may differ, most signal processing software packages, including MATLAB, can easily convert these formats to spreadsheets.
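As a quick, hedged illustration of the figures quoted above (about 100 Hz effective bandwidth, a 200 samples/s sampling rate and 16-bit samples), the following back-of-the-envelope calculation estimates the raw data volume of a multichannel recording; the channel count and duration are arbitrary assumptions.

```python
# Back-of-the-envelope data volume for a multichannel EEG recording, following
# the figures quoted in the text (200 samples/s minimum rate, 16-bit samples).
# The channel count and duration are arbitrary assumptions for illustration.
channels = 21            # standard 10-20 montage (excluding earlobe references)
fs = 200                 # samples per second per channel
bits_per_sample = 16
hours = 8                # e.g. an overnight sleep-EEG session

samples = channels * fs * hours * 3600
megabytes = samples * bits_per_sample / 8 / 1e6
print(f"{megabytes:.0f} MB for {hours} h of {channels}-channel EEG at {fs} Hz")
# roughly 242 MB under these assumptions
```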
To obtain high-quality data, the EEG electrodes and their appropriate operation are essential.
EEG recording systems frequently employ a variety of electrode types, including: disposable
(gel-less and pre-gelled varieties); reusable disc electrodes (gold, silver, stainless steel, or tin);
electrode caps and headbands; saline-based electrodes; and needle electrodes. Electrode caps are
used for multichannel recordings including a lot of electrodes. Ag-AgCl disks with a diameter
of less than 3 mm and long, flexible leads that can be connected to an amplifier make up the
majority of scalp electrodes that are employed. Needle electrodes are those that require minimally invasive procedures to implant beneath the skull. Distortion may result from high impedance between the electrodes and the brain, or from high-impedance electrodes themselves, which may even obscure the real EEG signals. Commercial EEG recording devices are frequently equipped with impedance monitors. The electrode impedances must be kept between 1 kΩ and 5 kΩ to allow a successful recording, and they are examined after every trial for more precise measurement. However, the distribution of potentials over the scalp (or cortex) is not uniform because of the brain's layered and convoluted structure [2.28]. This can affect some of the outcomes of source localization using the EEG signals.

2.2.4 Conventional EEG Electrode Positions


The standard electrode arrangement (also known as the 10–20 system), with 21 electrodes (apart from the earlobe electrodes), has been recommended by the International Federation of Societies for Electroencephalography and Clinical Neurophysiology [2.29]. This configuration is shown in Figure 2.9. The reference electrodes are frequently the earlobe electrodes A1 and A2, which are attached to the left and right earlobes, respectively. The 10–20 approach avoids placement over the eyeballs and uses fixed relative distances, employing certain anatomic landmarks from which the measurements are made and then using 10% or 20% of the stated distance as the electrode interval. The odd-numbered electrodes lie on the left and the even-numbered electrodes on the right. To set a greater number of electrodes using this standard approach, the additional electrodes are positioned equally spaced between the existing ones; for instance, C1 is positioned midway between Cz and C3. A larger configuration of 75 electrodes, including the reference electrodes, based on the recommendations of the American EEG Society, is shown in Figure 2.10. Additional electrodes may be utilized when measuring the EOG, ECG, and EMG of the muscles around the eye and the eyelid. A single channel may be utilized in some applications, such as ERP analysis and brain-computer interfacing; however, in these kinds of applications it is important to precisely define where the corresponding electrode is located. For brain–computer interface (BCI) applications, for instance, C3 and C4 can be utilized to record signals pertaining to the movement of the right and left fingers, respectively. Mental-stress-related signals can also be recorded using the F3, F4, P3, and P4 electrodes.
There are two types of recording: referential and differential. In differential mode, each differential amplifier receives two inputs from two electrodes. In contrast, in referential mode one or two reference electrodes are utilized. The literature contains a variety of reference electrode locations: physical references include the tip of the nose, ipsilateral and contralateral ears, linked ears, linked mastoids, Cz, and C7, as well as bipolar references. Additionally, there are reference-free recording methods that actually make use of a common average reference. If the reference is not relatively neutral, the selection of the reference could result in topographic distortion; with contemporary instrumentation, however, the measurement is not much affected by the reference selection [2.30]. Other references, such as FPz or hand or leg electrodes, may be employed in such systems. The overall configuration consists of the active electrodes and the references.
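The referential and differential modes described above can be illustrated with a small sketch. This is a generic montage manipulation in Python; the channel set, sampling rate and data are placeholders, and the common average reference shown is one of the reference-free options mentioned in the text.

```python
# Sketch of the two re-referencing ideas described above: a differential
# (bipolar) derivation between two electrodes and a common average reference.
# The montage, channel order and data here are placeholders.
import numpy as np

channel_names = ["F3", "F4", "C3", "C4", "P3", "P4"]   # assumed subset of 10-20
fs = 256
eeg = np.random.randn(len(channel_names), 10 * fs)     # channels x samples (placeholder)

# Differential (bipolar) recording: difference between two electrodes.
c3 = eeg[channel_names.index("C3")]
c4 = eeg[channel_names.index("C4")]
bipolar_c3_c4 = c3 - c4

# Common average reference: subtract the instantaneous mean over all channels.
car = eeg - eeg.mean(axis=0, keepdims=True)
```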

2.2.5 Areas of Application


Certain brain abnormalities, such as mental illnesses, aging, and epileptic and non-epileptic attacks, can be identified using standard EEG measuring settings and the brain rhythms found in
normal or abnormal EEGs. There are numerous other brain illnesses and dysfunctions that may
or may not show certain types of abnormalities in the associated EEG signals, in addition to the
known neurological, physiological, pathological, and mental abnormalities of the brain.
Degenerative disorders of the central nervous system (CNS), including chromosomal
aberrations, multiple lysosomal disorders, several peroxisomal disorders, several mitochondrial
disorders, inborn urea cycle disturbances, numerous aminoacidurias, and other metabolic and
degenerative diseases, must be assessed and their symptoms correlated with the changes in EEG
patterns.
It is important to comprehend the similarities and distinctions between the EEGs of these disorders. Emotion recognition is another vital aspect that can be addressed through the analysis of EEG signals, since emotion is a complex state of the human brain. Mental stress can also be estimated from EEG signals owing to the direct impact of stress conditions on them. Another important application of the EEG in recent years has been in the field of Brain Computer Interfaces (BCI), which involves extensive signal processing and analysis of EEG signals for developing and controlling several gadgets or devices. A first step towards a BCI application involves the detection of eyeball movements, which provides a valuable source of information even for mentally challenged persons. Accurate detection of eyeball positions and their movements can lead to effective use in BCI applications. However, in order to further improve the results of such processing, the developed mathematical algorithms must take the clinical observations and discoveries into account. There is still a long way to go and many questions to be answered, even though a number of technological approaches for analyzing EEGs with regard to the above abnormalities have been thoroughly established.

This chapter gives a brief description of the different processing techniques that are normally adopted for the elimination of artifacts from biomedical signals, along with the conventional signal processing methods available in the literature.

Several artifacts are normally picked up by the acquisition system during the recording of biomedical signals.
These artifacts pose a challenge to researchers because they remain embedded in the signals and their amplitude and frequency components overlap with those of the original signal.
This chapter presents the conventional signal processing techniques that are used for eliminating different artifacts from the acquired signals.
Once the artifacts are eliminated, certain features need to be extracted from the clean signals using different signal processing steps.
A detailed review of the signal processing techniques used for feature extraction is provided in this chapter.
3.1 Review of Artifact Removal from PPG Signal
PPG signal processing is currently a broad field of research [2.9], [3.1], [3.2], since PPG sensors are widely used in both clinical and consumer devices. Access to this field is facilitated by the availability of many publicly available datasets comprising PPG signals along with reference parameters. An overview of PPG signal processing techniques is given in this section.

3.1.1 Digital Filtering


The first crucial step in reducing the influence of noise on PPG signal analysis is digital filtering. In order
to extract useful information from PPG signals, it is imperative that the noise should be attenuated. While
noise within particular frequency ranges (such as high and low frequency noise) can be reduced using
digital filtering, noise occurring within the frequency range of interest (such as motion artifact from
walking, where the frequency of steps can be comparable to heart rate) requires additional processing.
To create a filtered signal, digital filtering involves convolving the input signal with filter coefficients. In
the z-domain, a filter's transfer function is expressed as follows where b[m] and a[n] are the coefficients
of the numerator and the denominator, respectively:

H(z) = (b[0] + b[1]z^(-1) + ... + b[M]z^(-M)) / (1 + a[1]z^(-1) + ... + a[N]z^(-N))

The design method determines the coefficients based on the required cut-off frequencies and the kind of
filter [3.3]. In order to produce the filtered signal, y, this transfer function can also be represented as a
difference equation that can be simply applied to the original signal in the time domain. The expression
for the difference equation is:

y[k] = b[0]x[k] + b[1]x[k-1] + ... + b[M]x[k-M] - a[1]y[k-1] - a[2]y[k-2] - ... - a[N]y[k-N]

Finding the b[m] and a[n] coefficients that provide the appropriate filter response is the critical step in the
design process of a digital filter [3.3]. Because digital filters can significantly affect the morphology of
PPG signals, it is crucial to design them with the final use in mind [3.4].
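To make the difference equation concrete, here is a minimal sketch that applies it both explicitly and via scipy.signal.lfilter. The coefficients are arbitrary illustrative values, not a filter recommended in the cited works.

```python
# Minimal sketch of the difference equation above, applied two equivalent ways:
# an explicit loop and scipy.signal.lfilter. The coefficients describe an
# arbitrary first-order low-pass and are not tied to any filter in the thesis.
import numpy as np
from scipy.signal import lfilter

b = np.array([0.2, 0.2])      # numerator coefficients b[m]
a = np.array([1.0, -0.6])     # denominator coefficients a[n], with a[0] = 1

x = np.random.randn(500)      # input signal

# Direct implementation of y[k] = sum(b[m] x[k-m]) - sum(a[n] y[k-n]), n >= 1
y_loop = np.zeros_like(x)
for k in range(len(x)):
    acc = sum(b[m] * x[k - m] for m in range(len(b)) if k - m >= 0)
    acc -= sum(a[n] * y_loop[k - n] for n in range(1, len(a)) if k - n >= 0)
    y_loop[k] = acc

y_scipy = lfilter(b, a, x)    # same result, computed by scipy
assert np.allclose(y_loop, y_scipy)
```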
The two main families of digital filters are Infinite Impulse Response (IIR) and Finite Impulse Response (FIR) filters, which differ according to their transfer functions. A filter can be low-pass (LPF), high-pass (HPF), band-pass (BPF), or band-stop (BSF), with a specific type, order, and cut-off frequency [3.3]. The steepness of the transitions between the pass bands and the stop bands, or vice versa, is determined by the filter order. Higher-order filters have steeper transitions, but they require a longer input signal and produce a bigger delay in the filtered signal. Low-pass and high-pass filters have one transition band and one cut-off frequency, while band-pass and band-stop filters have two transition bands and two cut-off frequencies.
The Moving Average (MA) filter is the most often used FIR filter in the field of biomedical signal processing, including pre-processing of the PPG signal [3.1]. Other widely used filters include the median filter, which computes its output from the median value of the most recent n samples rather than their mean, and FIR filters that employ Hamming windows, which can be designed for any required type of filter. IIR filters were originally designed from analog filter prototypes [3.3]. The Butterworth filter, the type I and II Chebyshev filters, and elliptic filters are the most often used IIR filters [3.1]. The primary distinction between these design techniques lies in the frequency responses of the resulting filters. In contrast to elliptic and Chebyshev filters, Butterworth filters have a gentler roll-off but offer ripple-free pass and stop bands, which is typically desirable in biological signal analysis [3.3]. However, fourth-order Chebyshev type II filters have been proposed by Liang et al. as being more effective in enhancing PPG signal quality, despite the lack of standards for PPG signal filtering [3.3].
Other sophisticated methods, including de-noising with the Wavelet transform, have been suggested as
filtering options [3.3], [3.4].
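A small design sketch comparing the two IIR families mentioned above (a Butterworth and a fourth-order Chebyshev type II band-pass) is given below. The 0.5–15 Hz band, the stop-band attenuation and the sampling rate are assumptions for illustration.

```python
# Design sketch comparing a Butterworth and a Chebyshev type II band-pass
# filter for PPG pre-filtering. The 0.5-15 Hz band, the orders and the 20 dB
# stop-band attenuation are illustrative choices, not recommendations from the
# cited works.
import numpy as np
from scipy.signal import butter, cheby2, freqz

fs = 125                                   # assumed PPG sampling rate (Hz)
band = [0.5, 15.0]                         # pass band (Hz)

b_butt, a_butt = butter(4, band, btype="bandpass", fs=fs)
b_cheb, a_cheb = cheby2(4, 20, band, btype="bandpass", fs=fs)

# Inspect the magnitude responses: Butterworth is ripple-free but rolls off
# more gently; Chebyshev II trades stop-band ripple for a sharper transition.
w, h_butt = freqz(b_butt, a_butt, worN=2048, fs=fs)
_, h_cheb = freqz(b_cheb, a_cheb, worN=2048, fs=fs)
print(np.abs(h_butt[:5]), np.abs(h_cheb[:5]))
```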
A fundamental characteristic of any filtering technique is the delay it introduces into the filtered signal. When filtering the signal offline, this delay can be eliminated by applying the filter in both the forward and the reverse direction, since the whole signal is available. Real-time zero-phase filtering is not achievable, though; real-time filtering always delays the signal, and this delay depends on the order of the filter. FIR filters typically impose longer delays because they require higher orders to achieve results similar to those of IIR filters. However, in some real-time applications that can tolerate greater delays, FIR filters might still be chosen over IIR filters: unlike IIR filters, which typically have a non-linear phase response, FIR filters can have an exactly linear phase, meaning that the delay is constant and of a known value.
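The offline zero-phase option discussed above corresponds to forward-backward filtering, which scipy exposes as filtfilt; the causal, real-time-style application is lfilter. The sketch below contrasts the two using the same illustrative band-pass design.

```python
# Sketch of the offline zero-phase option described above: filtfilt applies the
# filter forward and backward (no net phase delay, offline only), while lfilter
# is the causal, real-time-style application that delays the signal. The filter
# design values remain illustrative assumptions.
import numpy as np
from scipy.signal import butter, filtfilt, lfilter

fs = 125
b, a = butter(4, [0.5, 15.0], btype="bandpass", fs=fs)

t = np.arange(0, 10, 1 / fs)
ppg = np.sin(2 * np.pi * 1.2 * t) + 0.2 * np.random.randn(t.size)

zero_phase = filtfilt(b, a, ppg)   # offline, forward-backward, no phase delay
causal = lfilter(b, a, ppg)        # real-time style, introduces a delay
```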
There are no established standards for selecting the PPG filter cut-off frequencies. The majority of the PPG signal's frequency content lies below 15 Hz, yet the application has a significant influence on the cut-off frequencies chosen for PPG signal analysis. Cut-off frequency selection is frequently the result of a design compromise; for example, a lower low-pass cut-off frequency may enable individual pulse waves to be recognized more easily, but it may also deform their shape.

3.1.2 Removal of Motion Artifact


PPG waveform variations brought about by subject and PPG probe movements may affect amplitude and/or shape measures spanning a number of heartbeats. Moreover, abrupt changes in breathing, coughing, and talking can all affect the PPG signal. The first objective should be to acquire high-quality signals by adhering to a carefully thought-out methodology that minimizes subject movement while the subject is at rest. However, this is not always feasible in the real world, particularly for applications that involve wearable sensors and ambulatory measurements. Robust signal processing methods are therefore necessary in order to mitigate the effects of artifacts on PPG records and support a variety of therapeutic applications. The elimination of movement artifact, however, can be extremely difficult, especially since this kind of noise can initially look very much like normal physiological variation. The planned clinical application determines how much noise reduction is needed; for example, applications demanding morphological characterization of the pulse shape place a premium on signal quality [3.5].
Numerous methods for de-noising have been put forth. These include: manual identification and labeling for the purpose of excluding noisy pulses [3.6]; clustering approaches to extract the most consistent presentations of pulse shapes in a recording [3.7]; independent component analysis, such as that used in SpO2 pulse oximetry, which takes advantage of the quasi-periodicity of the PPG signal and the independence between the PPG and the motion artifact signals; combining independent component analysis and block interleaving with low-pass filtering to reduce motion artifacts under the general dual-wavelength measurement [3.8]; and signal decomposition and reconstruction, employing iterative motion artifact removal based on a singular spectral analysis algorithm to obtain precise heart rate and SpO2 values from a pulse oximeter [3.9]. Other reported approaches include periodic moving average filtering [3.10]; wavelet denoising as a pre-processing step for robust heart rate detection [3.11], [3.12]; and TROIKA for accurate heart rate tracking under motion [3.13]. Moreover, techniques requiring a reference motion signal have been proposed, such as: using gyroscope data to obtain better noise cancellation performance compared to accelerometry measurements; incorporating signals indicative of motion as an algorithm enhancement, such as 3D accelerometry data from the wrist [3.14], [3.15]; and adaptive filtering with least-squares based active noise cancellation [3.16].
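As a rough illustration of the reference-signal idea mentioned above, the sketch below implements a generic least-mean-squares (LMS) adaptive noise canceller in which one accelerometer axis serves as the motion reference. The filter length and step size are arbitrary assumptions, and the scheme is a simplification rather than the specific least-squares cancellation of [3.16].

    # A minimal LMS adaptive noise canceller sketch: a reference motion signal
    # (e.g., one accelerometer axis) is filtered to estimate the motion artifact,
    # which is then subtracted from the corrupted PPG. Filter length and step
    # size are illustrative assumptions, not values from the cited works.
    import numpy as np

    def lms_cancel(corrupted_ppg, motion_ref, n_taps=16, mu=0.01):
        w = np.zeros(n_taps)                     # adaptive filter weights
        cleaned = np.zeros_like(corrupted_ppg)   # first n_taps samples stay zero
        for n in range(n_taps, len(corrupted_ppg)):
            x = motion_ref[n - n_taps:n][::-1]   # most recent reference samples
            artifact_est = np.dot(w, x)          # estimated motion artifact
            e = corrupted_ppg[n] - artifact_est  # error = cleaned PPG sample
            w += 2 * mu * e * x                  # LMS weight update
            cleaned[n] = e
        return cleaned

Too large a step size mu makes the update diverge, so in practice a normalized variant or a validated fixed value would be preferred.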

3.1.3 Assessment of PPG signal quality


Many approaches have been reported to evaluate PPG signal quality. A majority of these methods start with the identification of segments that are obviously of low quality using physiological plausibility checks, and then apply more advanced techniques to identify the remaining, less obvious low-quality segments [3.2]. The majority of techniques extract morphological, spectral, or trend-based features from a single PPG pulse wave, and then use heuristic, empirically calculated, or machine learning-derived thresholds to classify a pulse wave as high or low quality [3.2]. These features, such as timing or amplitude properties [3.17], [3.18], are often extracted in the time domain. Frequency-domain indices have also been used in some research [3.2].
Numerous features of pulse waves have been employed to evaluate the quality of the signal. Pulse wave
amplitude [3.18], [3.19], and the Perfusion Index (PI) [3.20] are examples of amplitude characteristics.
The PI measures the ratio of the signal's "AC" to "DC" component. The systolic phase duration, the ratio
of the systolic to the diastolic phase duration, the average pulse rate, and the inter beat intervals are
examples of timing characteristics. The number of diastolic peaks, the signal-to-noise ratio (SNR), the
number of times the signal shifts from positive to negative, or vice versa—also known as the zero-
crossing rate—and the comparison of the accuracy of various systolic wave detectors for isolating events
(such as beats or noise artifacts) are some examples of shape characteristics [3.20]. The distribution of the
data is indicated by higher order statistics like skewness and kurtosis, which also show how the pulse
wave is spread across time [3.2], [3.20]. These measurements are particularly well-suited for detecting
PPG pulses with noise-induced outliers [3.2]. Since Shannon entropy is a measure of system uncertainty
and rises in noisy PPG signals, it has been suggested as an additional way to quantify the amount of
"disorder" in a PPG pulse wave [3.20], [3.21].
Additionally, some algorithms evaluate the similarities between successive pulse waves, including the
relationship between the diastolic and systolic pulse wave amplitudes, variations in the amplitude,
systolic phase duration, and pulse wave length, as well as the differences in trough depth between
successive troughs [3.19], [3.22]. Furthermore, as stated in [3.23], some morphological characteristics can
be utilized to identify PPG signal segments without any pulses. The autocorrelation function of PPG
segments was recently presented as a technique to identify artifacts, which can lead to various shapes of
this function [3.23]. To characterize the autocorrelation function, features such as the maximum peak, the
lag value of the maximum peak, and the first zero crossing point were retrieved. Since pulse waves in
high-quality signal segments are anticipated to have comparable morphologies, template-matching
approaches are common measures of regularity in PPG segments [3.2]. Reported examples include the comparison of each pulse wave with an average representation of these pulse waves in order to measure the Euclidean distance and the ratio of amplitudes [3.19]; the extraction of the mean value of the correlation coefficients between each cycle and the extracted template [3.24]; and the alteration of each pulse's width in order to compare it to the template using sophisticated methods such as Dynamic Time Warping [3.17], [3.22]. There are various methods available for carrying out template extraction as well.
various methods available for carrying out template extraction as well. Li & Clifford proposed a
template-matching algorithm in which an average template cycle was obtained from all pulses in the PPG
segment, and only the average of the pulses with correlation coefficients higher than 0.8 with respect to
the original template was chosen [3.17]. Karlen et al., for example, compared the PPG pulse against a
previously formed reference pulse set, which included only those pulses that were previously determined
as high quality pulse waves, based on the correlation coefficient [3.25].
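A minimal template-matching sketch in the spirit of the correlation-based approaches described above is given below. The resampling length is an assumption, and the 0.8 correlation threshold simply mirrors the value quoted for Li & Clifford's method; the sketch does not reproduce their full algorithm.

    # Hedged sketch: build an average template from segmented pulses and score
    # each pulse by its correlation with the template.
    import numpy as np

    def template_match_sqi(pulses, target_len=100, corr_thresh=0.8):
        # Resample every detected pulse to a common length so they can be averaged.
        resampled = [np.interp(np.linspace(0, 1, target_len),
                               np.linspace(0, 1, len(p)), p) for p in pulses]
        template = np.mean(resampled, axis=0)
        scores = [np.corrcoef(p, template)[0, 1] for p in resampled]
        accepted = [i for i, r in enumerate(scores) if r >= corr_thresh]
        return np.array(scores), accepted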
The majority of algorithms for assessing signal quality rely on automatic methods for detecting pulse
waves, and the effectiveness of these algorithms may be strongly influenced by both the signal's quality
and the algorithm's design [3.2]. Additional research is required to develop a well-recognized, robust
method for accurately denoising the acquired PPG signal and detecting individual pulse waves in the
presence of noise.

3.2 Review of Artifact Removal from EEG Signal


Neural activity characteristics are found in EEG recordings. Although the signals are typically shown in
the time domain, many modern EEG devices can perform frequency analysis using basic signal
processing techniques like the Fourier transform. Numerous algorithms have been created thus far to
process EEG signals. Time-domain analysis, frequency-domain analysis, spatial-domain analysis, and
multi-way processing are among the procedures that are performed for noise removal. In addition, a number of algorithms have been created to visualize brain activity from images reconstructed solely from EEG data. Another field of study has been separating the desired sources from the multi-sensor
EEGs. This may eventually result in the identification of brain disorders like epilepsy and the origins of
different types of mental and physical activity. Brain–computer interface (BCI) research has recently
concentrated on creating sophisticated signal processing tools and algorithms for this use.
The EEG signals are influenced by electrocardiograms (ECGs), electrooculograms (EOGs), and eye blinks. Adaptive and non-adaptive filtering methods can be used to estimate and reduce the noise in the
EEGs. Neuronal information is included in the EEG signals below 100 Hz and often below 30 Hz in
many applications. Using low pass filters, any frequency component above these frequencies can be
easily eliminated. A notch filter is used to eliminate the 50 Hz line frequency when the EEG data
acquisition system is unable to cancel it out (because of a grounding issue or an imbalance in the inputs of
the differential amplifiers connected to the EEG system). Equalizing filters are used to adjust for any
known nonlinearities in the recording system due to the frequency response of the amplifiers.
Nevertheless, the properties of the external and internal noise sources that impact the EEG recordings are frequently uncertain. If the signal and noise subspaces can be precisely distinguished, then the noise can be characterized.
The multichannel EEG data can be broken down into its individual components, such as the noise and
brain activity, using principal component analysis or independent component analysis. By combining the
two, it is possible to extract, quantify, and isolate the estimated noise components from the real EEGs.
EEG signal artifacts can also be eliminated using adaptive noise cancellation, which is utilized in biological signal analysis, communications, and signal processing. Nonetheless, a reference signal is
necessary for an efficient adaptive noise canceller. Considerable statistical information about the noise or
artifact and its characteristics is carried by the reference signal. For instance, a signature of the eye blink
signal can be obtained from the FP1 and FP2 EEG electrodes in order to eliminate eye blinking artifacts. As
an additional example, the reference signal for ERP (Event Related Potential) signal detection can be
obtained by averaging multiple ERP segments. Many other examples exist, such as the cancellation of the electrocardiogram from electroencephalograms and the elimination of fMRI scanner artifacts from simultaneous EEG–fMRI recordings, in which reference signals are available.
Probably the most basic kind of adaptive filters are adaptive Wiener filters. The ideal weights for this
filter are determined during operation so that, in the mean-squared sense, the filtered signal is the best
approximation of the original signal. The Wiener filter reduces the error's mean-squared value. Every
suboptimal transform, including the DFT and DCT, breaks down the signals into a collection of
coefficients, which may or may not correspond to the signals' individual components. Furthermore, the
transform kernel is inefficient in terms of energy compaction and sample decorrelation because it is
independent of the data. Consequently, it is usually not possible to separate the signal from the noise
component using these inferior transforms. In contrast, the highest decorrelation of the signals is achieved when the data is expanded into a set of orthogonal components, and the data may then be divided into signal and noise subspaces.
PCA is frequently used for filtering, whitening, classifying, and decomposing data. In filtering applications, the signal and noise subspaces are separated and the data are rebuilt using only the signal eigenvalues and eigenvectors. If the original sources of the correlated mixtures can be regarded as statistically uncorrelated, PCA is also utilized for blind source separation. Running an SVD on the covariance matrix is equivalent to running a Principal Component Analysis (PCA). PCA breaks down the data into uncorrelated orthogonal components using the same idea as SVD and orthogonalization, resulting in a diagonalized autocorrelation matrix. Each eigenvector represents a principal component, and the corresponding eigenvalue quantifies the variance captured in the direction of that principal component.
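A minimal sketch of this subspace idea is shown below: the covariance matrix of a multichannel EEG segment is decomposed with an SVD, and the data are reconstructed from an assumed signal subspace spanned by the leading principal components. The number of retained components is an illustrative choice.

    # PCA on multichannel EEG via an SVD of the covariance matrix, followed by
    # reconstruction from the leading (assumed signal) subspace.
    import numpy as np

    def pca_denoise(eeg, n_keep=4):
        # eeg: array of shape (n_channels, n_samples)
        centered = eeg - eeg.mean(axis=1, keepdims=True)
        cov = centered @ centered.T / centered.shape[1]
        # For a symmetric covariance matrix the SVD coincides with the
        # eigen-decomposition: columns of U are the principal directions.
        U, s, _ = np.linalg.svd(cov)
        signal_basis = U[:, :n_keep]            # assumed signal subspace
        scores = signal_basis.T @ centered      # principal component time courses
        return signal_basis @ scores            # reconstruction without the noise subspace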
The ability to break down signals into their individual independent components is the foundation of the
Independent Component Analysis (ICA) concept. This idea is essential for signal separation and
denoising in situations where the combined source signals can be taken to be independent of one another.
It is simple to define a measure of independence to assess the independence of the decomposed components. Blind source separation (BSS) is one of the key uses of ICA. BSS is a method for estimating and recovering independent source signals based solely on the mixtures observed at the recording channels. BSS is gaining a lot of attention these days because of its many uses. To make the problem tractable, BSS algorithms typically make simplifying assumptions about the mixing environment. The instantaneous situation, in which the source signals reach the sensors simultaneously, is the most straightforward and most frequently utilized scenario. This has been adopted for the separation of biological signals, such as the EEG, where the sampling frequency is typically low and the signals have narrow bandwidths.
Several attempts have been made to apply BSS to EEG signals [3.26]– [3.35] in order to distinguish
between sources connected to mental actions or physical movement, event-related signals, and regular
brain rhythms. A criterion must be developed to estimate the number of sources in advance if the number
of sources is unknown. This procedure is challenging, particularly when there is noise involved. The
aforementioned BSS schemes cannot be used in situations where there are more sources than mixtures, since the mixing matrix is not invertible and the original sources therefore cannot, in most cases, be retrieved. However, additional clustering-based techniques may be applied when the number of signals is small. In EEG analysis, even though only a few mixed signals are observed at the electrodes, there may be a huge number of sources, i.e., neurons firing at once. On the other hand, the signal may be transformed to the time–frequency domain, or even the space–time–frequency domain, if the goal is to investigate a specific brain rhythm. The sources in these domains can be regarded as sparse and discontinuous. Furthermore, it is claimed that if a neuron's firing rhythm includes extended periods of inactivity, the neuron encodes information sparsely in the brain [3.36].
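The following sketch illustrates instantaneous BSS on multichannel EEG using the FastICA implementation in scikit-learn. Deciding which estimated components are artifactual (for example by kurtosis or by correlation with frontal channels) is left to the caller, so the indices to reject are simply passed in as an argument.

    # A hedged sketch of ICA-based artifact removal with scikit-learn's FastICA.
    import numpy as np
    from sklearn.decomposition import FastICA

    def ica_clean(eeg, reject_idx):
        # eeg: array of shape (n_channels, n_samples); FastICA expects (samples, features)
        ica = FastICA(n_components=eeg.shape[0], random_state=0)
        sources = ica.fit_transform(eeg.T)          # estimated independent components
        sources[:, reject_idx] = 0.0                # zero out the artifactual components
        return ica.inverse_transform(sources).T     # back-project to channel space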
The spatiotemporal properties of EEG signals [3.37] are important at this point because they can aid in
the selection of an appropriate preprocessing and filtering method [3.38], [3.39], [3.40], [3.41], [3.42], [3.43]. Artifacts originate from external sources, both physiological and non-physiological. Typically, the line noise artifact is detected at 50 Hz or 60 Hz, within the EEG's gamma region.
removed with a notch filter, which cuts off a certain frequency band. Nevertheless, using a notch filter
may cause the signal to be distorted and produce false oscillations with parasitic frequencies [3.44],
[3.45], [3.46]. Although spectral interpolation [3.44] is also a viable alternative, it occasionally introduces
additional frequencies into the signal during the phase interpolation process. One possible option is a
smoothing (low-pass) filter with a cut-off frequency of less than 50 Hz or 60 Hz. Nevertheless, it may result in an improperly denoised signal that misses causal relationships [3.47] and changes the temporal structure of the signal [3.48]. This problem can be solved by estimating the spectral energy using a multi-taper decomposition,
which reduces broad band variations [3.49]. There are three main stages to the entire process: Using
discrete prolate spheroidal sequence (DPSS) tapers, a brief temporal window is first slid across the
data, and then numerous independent projections of the data are retrieved [3.50]. Secondly, the spectral
energy inside each band is represented by the single-taper spectra that are generated for each projection.
Finally, the approximate phase, amplitude, and mean of the component (50 Hz or 60 Hz) are determined
using a regression-based model. Regression analysis, which is based on predicting the artifacts from the
EEG data and subtracting them from the data, can be used in any domain [3.51]. Among the disadvantages of this technique are that it cannot be used for applications involving non-stationary EEG-like signals and that it requires a channel to act as a reference. Moreover, it does not apply to all types of artifacts; rather, it is restricted to specific types.
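For comparison with the filtering options discussed above, the sketch below removes 50 Hz line noise with a zero-phase IIR notch filter from SciPy. The quality factor is an assumed value, and, as noted above, such a notch can distort the signal, which is why multi-taper and regression approaches are sometimes preferred.

    # Minimal sketch of 50 Hz line-noise suppression with an IIR notch filter
    # applied in zero phase. Q controls how narrow the cut is (assumed value).
    import numpy as np
    from scipy.signal import iirnotch, filtfilt

    def remove_line_noise(eeg_channel, fs, line_hz=50.0, q=30.0):
        b, a = iirnotch(w0=line_hz, Q=q, fs=fs)
        return filtfilt(b, a, eeg_channel)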
Time domain filters retain the necessary frequencies while attenuating either very high- or low-frequency regions [3.52]. Temporal filters reduce noise in the EEG and enhance its quality by preventing artifacts from power line interference and incorrect scalp electrode polarization [3.53]. Using the DFT or FFT, long-duration brain signals are filtered by removing any coefficients that do not match the EEG signals' frequency range, and the result is then converted back into a time domain signal using the inverse DFT.
When recognizing patterns in brain activity, Surface Laplacian (SL) filters are utilized to reduce noise, particularly in the case of imagined motor movements [3.54]. For space reduction, signal filtering, and original brain signal recovery, a spatial filter is employed [3.55]. Surface Laplacian filtering distinguishes between signals from the brain and muscles using topographical power spectral distributions with regard to frequency [3.56]. These methods approximate the localized current density that enters the scalp perpendicularly [3.56]. The cortical surface potential and source identification can also be obtained with the surface Laplacian filter, and every Laplacian approach is reference-free [3.57]. Adaptive filtering is a self-adjusting system that compares the reference and output signals and modifies the filter parameters using an optimization method [3.58]; it then uses feedback to estimate the noise and subtract it from the original EEG signals.
Independent component analysis (ICA) is another technique for analyzing the data; it starts with centering and whitening the data, followed by an optimization step that maximizes the non-Gaussianity of the estimated independent sources. One benefit of ICA is that it is not dependent on reference channels [3.59]. Its main disadvantages are that it requires manual artifact selection and is computationally demanding.
This can be resolved by automatically detecting ICs using kurtosis in conjunction with spatial, spectral,
and temporal information [3.60].
Morphological component analysis (MCA) is unsatisfactory as a stand-alone technique since it requires
an artifact database that has been broken down based on its morphological attributes [3.61]. Principal Component Analysis (PCA) converts correlated signals in the time domain into uncorrelated principal components [3.62]. An artifact can be eliminated only if it is uncorrelated with the EEG [3.63].
The wavelet transform is a time-frequency analysis technique that breaks down a signal into a number of wavelets that are localized in both time and frequency. This makes it possible to distinguish artifacts, which are poorly correlated with the chosen mother wavelet, from signal components that are strongly correlated with it [3.64]. There
are various types of wavelet transforms, such as discrete wavelet transforms (DWT), stationary wavelet
transforms (SWT), wavelet packet transforms (WPT), and continuous wavelet transforms (CWT). When
working on single-channel applications, DWT with statistical threshold (ST) functions provides a rapid
execution time solution that is appropriate for removing ocular artifacts [3.65].
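A hedged single-channel sketch of DWT-based artifact suppression is given below: the signal is decomposed, coefficients whose magnitude exceeds a statistics-derived limit are clipped, and the signal is reconstructed. The wavelet, decomposition level, and the 1.5-sigma rule are illustrative assumptions, not the exact statistical threshold function of [3.65].

    # DWT-based suppression of high-amplitude (e.g., ocular) coefficients.
    import numpy as np
    import pywt

    def dwt_artifact_suppress(eeg_channel, wavelet="db4", level=5, k=1.5):
        coeffs = pywt.wavedec(eeg_channel, wavelet, level=level)
        limited = []
        for c in coeffs:
            limit = k * np.std(c)
            # Clip coefficients whose magnitude exceeds the statistical limit;
            # high-amplitude low-frequency coefficients typically carry the blink.
            limited.append(np.clip(c, -limit, limit))
        return pywt.waverec(limited, wavelet)[: len(eeg_channel)]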
Empirical Mode Decomposition (EMD) is a technique that can be adapted to fit various goals and datasets [3.66]. It is applied to the EEG in order to separate it into intrinsic mode functions (IMFs), and intricate computations are required to eliminate the noisy IMFs from the signal. The best use case for this technique is heavily polluted data [3.67]. However, one of its major disadvantages is mode mixing. To address this problem, an ensemble EMD approach (EEMD) is utilized; related to this, a multivariate empirical mode decomposition (MEMD) is predicated on recognizing muscular artifacts over a limited number of channels, with the IMF component derived by averaging the ensembles of trials in one channel [3.68].
It is common practice to combine several techniques to get better results. For example, it has been found that BSS-AF and BSS-WT [3.69], [3.70], [3.71] are more efficient than the BSS approach alone, especially when paired with EMD or SVM [3.72], [3.73], [3.29]. Furthermore, some restrictions can be mitigated and ocular artifacts can be eliminated by combining the use of wavelet and adaptive filters [3.74].

3.3 Review of PPG Signal Analysis


After the proper elimination of the artifacts from the acquired signal, the question of how to calculate the
values of the features associated with high- and low-quality PPG signals emerges. Subsequently, decision
rules are developed to decide how to label the PPG signal. Physiological thresholds, heuristics and data
fusion are the primary methods used to create the decision rules [3.3]. In the first scenario, systems for evaluating signal quality often use combinations of rules based on various features, and thresholds are typically defined using physiological knowledge or empirical evidence [3.3]. When several derived features are combined, data fusion yields more reliable conclusions. PPG signal quality has been
assessed using probabilistic techniques, such as the Kalman filter [3.18], [3.23]. Machine learning
methods have also been suggested for the automatic identification of decision rules based on extracted
indices [3.18], [3.75], and [3.76]. The use of machine learning techniques for PPG signal quality
assessment is anticipated to increase due to advancements in physiological data analysis and the increased
accessibility of databases. These techniques will not only be used to combine various indices and
generate automated decision rules, but will also serve as alternatives for identifying low-quality segments
within the PPG records. Pereira et al. have already looked into this strategy and successfully used deep
learning algorithms to identify low-quality PPG segments automatically without the need to extract signal
quality parameters [3.77].
In pulse wave analysis, calculating the PPG signal's derivatives is frequently an important step. For
example, the second derivative of the pulse wave can be used to obtain many markers of vascular ageing
[3.78]. The derivative of a PPG signal can be calculated using a number of methods [3.79]. The single-sided difference quotient, for example, is the simplest method: it computes the derivative from the difference between two consecutive samples. However, because this method uses only two points of the signal, it is quite vulnerable to high-frequency noise. While the symmetric difference quotient can yield a more precise derivative estimate, noise can still affect it.
In general, there are two ways to lower the signal's sensitivity to noise: either by employing additional
signal points in the computation or by low-pass filtering of the signal (below 7 Hz, for example, [3.80])
before differentiating. Differential quadrature techniques allow for the inclusion of more signal points by
computing the derivative as the weighted sum of several signal points.
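The difference between the naive and the smoothed approaches can be illustrated as follows. The sketch compares a two-point difference quotient with a Savitzky-Golay smoothed derivative, which uses several neighbouring samples and is therefore less sensitive to noise; the window length and polynomial order are assumptions rather than recommended values.

    # Derivative estimation for a PPG pulse: naive difference vs. smoothed derivative.
    import numpy as np
    from scipy.signal import savgol_filter

    def ppg_derivatives(ppg, fs, window=15, polyorder=3):
        dt = 1.0 / fs
        d1_naive = np.diff(ppg) / dt                       # two-point difference quotient
        d1_smooth = savgol_filter(ppg, window, polyorder, deriv=1, delta=dt)
        d2_smooth = savgol_filter(ppg, window, polyorder, deriv=2, delta=dt)
        return d1_naive, d1_smooth, d2_smooth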

3.3.1 Review of Time Domain Analysis of PPG signal


The PPG signal can be extensively analyzed in the temporal domain due to the various features that can be extracted from individual pulse waves. Features can be described in terms of their variability as well as their amplitude, timing, and shape [2.1, 2.9, 3.1, 3.81]. Once the PPG has been pre-processed as described in the previous chapter, the following time-domain analysis can be performed on it:

(a) Determination of each pulse wave separately for examination.

(b) Identification of the fiducial points on each pulse wave and its derivatives so that the
features of the pulse waves can be computed.

(c) Computation of the pulse wave characteristics, such as amplitude, timing, and shape.

These steps are explained below:

(a) Determination of each pulse wave separately for examination.

Identifying individual pulse waves for study is the initial step in the time-domain analysis of the PPG.
This task is difficult due to the following reasons: (i) noise and low frequency physiological fluctuations
can induce perturbations in the signal; (ii) individual pulse waves can exhibit two different peaks,
especially in young healthy participants [3.82], [3.83]. Numerous techniques have been put forth to
distinguish between individual pulse waves. The majority of techniques rely on identifying the systolic peak
because it is typically the most noticeable characteristic [3.84]. Typically, methods involve four steps: (1)
signal filtering to highlight the desired PPG components; (2) pulse extraction; (3) peak or onset
identification; and (4) peak or onset correction [3.85].
Three popular methods for identifying peaks are as follows: (i) using thresholds to detect peaks [3.86],
(ii) identifying peaks as zero-crossing points on the first derivative and combining this with an adaptive
thresholding scheme [3.87], and (iii) identifying peaks using the slope function and combining it with
adaptive thresholding [3.88]. Additional techniques include: (i) locating the PPG signal's maxima and
minima points [3.82]; (ii) locating points on the PPG's first- [3.83], [3.89], and second-derivative [3.89];
(iii) locating the PPG's positive slopes [3.75], [3.89]; and employing more sophisticated methods like the
Wavelet transform [3.90], [3.91] and the local maxima scalogram [3.92].
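A minimal sketch of systolic peak detection using SciPy's find_peaks is shown below. The minimum inter-peak distance and the prominence rule are simple assumptions that stand in for the adaptive thresholding schemes cited above.

    # Systolic peak detection on a filtered PPG with simple, assumed thresholds.
    import numpy as np
    from scipy.signal import find_peaks

    def detect_systolic_peaks(ppg_filtered, fs):
        min_distance = int(0.4 * fs)                       # refractory period between beats
        prominence = 0.5 * np.std(ppg_filtered)            # crude amplitude threshold
        peaks, _ = find_peaks(ppg_filtered, distance=min_distance, prominence=prominence)
        pulse_rate_bpm = 60.0 * fs / np.median(np.diff(peaks)) if len(peaks) > 1 else np.nan
        return peaks, pulse_rate_bpm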

(b) Identification of the fiducial points on each pulse wave and its derivatives so that the
features of the pulse waves can be computed.

Finding points of interest on each pulse wave is the second stage in time-domain analysis. Typically referred to as fiducial points, these are discrete locations that are discernible on the pulse wave or its derivatives. Fiducial points that are frequently used are shown in Figure 3.1. Referring to Figure 3.1(a), the dicrotic notch, the systolic
peaks, diastolic peaks, and the pulse onset are the important fiducial locations on the original PPG pulse
wave. Figure 3.1(b) displays the maximum point on the first derivative, which denotes the original
signal's maximum slope point. Four discrete systole points on the second derivative are recognized as the
a-, b-, c-, and d-waves (Figure 3.1(c)). The dicrotic notch's location can be ascertained using the e-wave
[3.82]. On the third derivative, the points p1 and p2 can be found (see Figure 3.1(d)).

(c) Computation of the pulse wave characteristics

The study of the PPG signal's morphology using features taken from the pulse wave form and its
derivatives is known as pulse wave analysis. Many morphological characteristics that are mostly derived
from the amplitude and width of the PPG wave contour have been studied in the literature. These
features measure the properties of pulse waves to gather data on the state of the cardiovascular system.
The systolic amplitude, i.e., the height of the PPG from the baseline to its peak on the original pulse wave, has been linked to stroke volume [3.81] and has also been proposed as a blood pressure measure [3.93].
The ratio of pulse area [3.94] and the width of the PPG at half its amplitude [3.95] are two possible
markers of total peripheral resistance.
Cardiovascular disorders have been distinguished using the crest time (CT) [3.96]. Blood pressure
indicators have been presented, including diastolic time, CT, pulse width from the systolic and diastolic
sections of the pulse wave, and their ratios [3.97], [3.98]. A correlation between the brachial-ankle pulse
wave velocity and the reflection index has been proposed [3.99]. It has been discovered that age and
mean arterial blood pressure are correlated with the major artery stiffness index [3.100]. Pulse wave
analysis commonly makes use of the amplitude ratios of the b, c, d, and e waves with respect to the a-
wave on the second derivative. It has been discovered that a number of characteristics derived from the
second derivative are helpful markers for evaluating cardiovascular health.
For example, the c/a, d/a, and e/a indices are believed to decrease with age, whereas the b/a index
increases. These indices are supposed to indicate greater arterial stiffness [3.78], [3.101]. Moreover, it
has been discovered that the b/a and c/a ratios are helpful in differentiating between hypertension
patients and healthy controls [3.102]. Furthermore, the b/a ratio has been proposed as a helpful indicator of atherosclerosis and altered arterial distensibility [3.103]. Other aspects, including the aging index given
as (b-c-d-e)/a, have also been proposed as derived from the second derivative [3.78]. A different aging
index, (b-e)/a, was suggested in [3.101] for the cases where the c and d-waves are not evident.
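The sketch below illustrates how such second-derivative (acceleration plethysmogram) indices might be extracted for a single pulse. The crude max/min search used to locate the a-, b-, c-, d-, and e-waves is a simplification, and the window parameters are assumptions.

    # Rough extraction of APG indices (b/a, c/a, d/a, e/a, aging index) for one pulse.
    import numpy as np
    from scipy.signal import savgol_filter, argrelextrema

    def apg_indices(pulse, fs):
        d2 = savgol_filter(pulse, 15, 3, deriv=2, delta=1.0 / fs)
        a_idx = np.argmax(d2[: len(d2) // 3])               # a-wave: early large maximum
        b_idx = a_idx + np.argmin(d2[a_idx: len(d2) // 2])  # b-wave: following minimum
        # Subsequent local maxima/minima after the b-wave approximate c, d and e.
        maxima = argrelextrema(d2[b_idx:], np.greater)[0] + b_idx
        minima = argrelextrema(d2[b_idx:], np.less)[0] + b_idx
        a, b = d2[a_idx], d2[b_idx]
        c = d2[maxima[0]] if len(maxima) > 0 else np.nan
        d = d2[minima[0]] if len(minima) > 0 else np.nan
        e = d2[maxima[1]] if len(maxima) > 1 else np.nan
        aging_index = (b - c - d - e) / a
        return {"b/a": b / a, "c/a": c / a, "d/a": d / a, "e/a": e / a,
                "aging_index": aging_index}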

Through a technique called "pulse decomposition analysis," the PPG pulse wave can be broken down
into incident and reflected waves. This makes it possible to extract features from the separate waves that
comprise the pulse wave as a whole. An incident wave can be extracted by assuming that it is mostly responsible for the systolic upslope. To create a symmetrical wave, the systolic upslope is mirrored (flipped horizontally) about the systolic peak. This symmetrical wave is then either fitted with a Gaussian to mimic the incident wave [3.104]–[3.106], or it is taken directly as the incident wave [3.107]. This incident wave is then subtracted from the original pulse wave, leaving a residual that is primarily composed of one or more reflected waves. The next step is to extract these reflected waves by modeling them as
Gaussian curves or symmetrical waves. From the timings and amplitudes of the individual wave peaks,
the widths of the individual waves, and the time intervals between their peaks, pulse wave features can
then be obtained [3.3].
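One common simplification of pulse decomposition analysis is to fit a small sum of Gaussian waves directly to the pulse rather than mirroring the upslope first. The sketch below follows that variant using non-linear least squares, with crude initial guesses that are purely illustrative.

    # Hedged sketch: decompose one pulse into an incident and a reflected Gaussian.
    import numpy as np
    from scipy.optimize import curve_fit

    def two_gaussians(t, a1, m1, s1, a2, m2, s2):
        g = lambda a, m, s: a * np.exp(-((t - m) ** 2) / (2 * s ** 2))
        return g(a1, m1, s1) + g(a2, m2, s2)

    def decompose_pulse(pulse, fs):
        t = np.arange(len(pulse)) / fs
        T = t[-1]
        p0 = [pulse.max(), 0.25 * T, 0.10 * T,          # incident wave guess
              0.5 * pulse.max(), 0.60 * T, 0.15 * T]     # reflected wave guess
        params, _ = curve_fit(two_gaussians, t, pulse, p0=p0, maxfev=10000)
        incident = two_gaussians(t, *params[:3], 0, 0, 1)   # first Gaussian only
        reflected = pulse - incident                         # residual, mostly reflection
        return params, incident, reflected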

3.3.2 Review of Frequency Domain Analysis of PPG signal


In addition to time-domain analysis, the PPG's frequency-domain characteristics offer insightful data on circulatory dynamics and make it possible to identify and track markers for a variety of pathological
disorders. Indeed, compared to the time-domain analysis, frequency-domain analysis can yield more
useful information [3.108]. A signal's frequency content can be evaluated using a variety of techniques,
which can be broadly categorized into two groups: (i) traditional methods based on the Fourier
transform, and (ii) contemporary methods based on signal source models [3.108]. PPG signals have been analyzed with both classes of methods, with comparable results [3.109].

The Fourier transform is widely used to calculate the frequency spectra of PPG signals. The Fourier transform
maps the signal's information to the frequency domain using sinusoidal waves [3.108]. Three parameters
can be used to describe sine waves: phase, frequency, and amplitude. As such, they are useful for
determining a spectrum's phase and magnitude components. These elements are shown for a clean PPG
signal in Figure 3.2. The Fast Fourier transform (FFT) algorithm was utilized to acquire these results, as
it was designed to apply the Fourier transform discretely and with reduced processing costs [3.108]. The
majority of the information in the PPG is below 15 Hz, and the heart rate appears as a significant peak in the spectrum at around 1 Hz.
Spectral analyses are influenced by the PPG recording's duration and sampling frequency. First, the
maximum frequency accessible in the frequency spectrum (fs/2) is determined by the sampling
frequency (fs). As a result, the sampling frequency ought to be at least twice the highest frequency of interest. Second, the length of the recording affects the resolution of the resulting frequency
spectrum, which is determined by fs and the quantity of data points utilized in the spectrum calculation.
Zero-padding the signal [3.108] can increase the number of points in the computed spectrum in situations where a finer resolution is required, such as when evaluating autonomic nervous system activity, which often occupies very low frequency regions from about 0.04 to 0.4 Hz [3.110].

If only the magnitude component is needed, then other methods can be applied to obtain the frequency
spectrum. Welch's periodogram approach and the power spectral density (PSD), often known as the power spectrum, are frequently applied. The computation of the Fourier transform power is the foundation of
the PSD. The foundation of Welch's periodogram is the segmentation of the PPG signal, the acquisition
of a PSD for each segment, and the ultimate production of a spectrum by averaging the spectra [3.108].
In certain applications where the signal's behavior is well understood, modern methods for obtaining
spectral information can be advantageous since they try to lessen the impact of noise on the resulting
spectrum [3.108]. Certain techniques rely on moving average models (MA), autoregressive models
(AR), or an amalgamation of both (ARMA). The power spectrum can be estimated using AR models in a
variety of ways, including the Yule-Walker, Burg, covariance, and modified covariance methods
[3.108]. Other contemporary non-parametric methods of spectrum analysis are predicated on eigenanalysis-based frequency estimation, which divides the signal using singular value decomposition (SVD) into
correlated and uncorrelated signal components [3.108]. The multiple signal classification (MUSIC)
algorithm is one such example [3.108]. The application, the signals that are available, and the data that
has to be extracted from the spectra should all be taken into consideration when selecting the techniques
and parameters for obtaining a frequency spectrum.
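A minimal sketch combining these elements is given below: an FFT magnitude spectrum with optional zero-padding, a Welch power spectral density estimate, and the dominant spectral peak as a rough pulse-rate estimate. The segment lengths are assumptions.

    # Frequency-domain analysis of a PPG segment: FFT spectrum and Welch PSD.
    import numpy as np
    from scipy.signal import welch

    def ppg_spectra(ppg, fs, n_fft=None):
        ppg = ppg - np.mean(ppg)
        n_fft = n_fft or 4 * len(ppg)                 # zero-padding refines the frequency grid
        spectrum = np.abs(np.fft.rfft(ppg, n=n_fft))
        freqs_fft = np.fft.rfftfreq(n_fft, d=1.0 / fs)
        # Welch: average periodograms of overlapping segments for a smoother PSD.
        freqs_w, psd = welch(ppg, fs=fs, nperseg=min(len(ppg), int(8 * fs)))
        heart_rate_hz = freqs_w[np.argmax(psd)]       # dominant peak, roughly the pulse rate
        return freqs_fft, spectrum, freqs_w, psd, heart_rate_hz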

3.3.3 Review of Time-Frequency Domain Analysis of PPG signal


There are numerous ways to carry out time-frequency analysis, which is the examination of a signal
simultaneously in the frequency and time domains. Even with short-duration signals, time-frequency
analysis makes it possible to examine how spectral properties change over time [3.110]. Sophisticated
methods like wavelet techniques and the short-time Fourier transform (STFT) method are frequently
employed. Wavelet transforms (WT) can identify transients in non-stationary signals and have several advantages over older approaches, such as the ability to trade time resolution against frequency resolution differently at different scales. In addition to providing time-localized filtering,
wavelet analysis can be used to identify discontinuities and other events that are not immediately
apparent in the raw data. In addition to offering a sparse representation of the data, wavelets are helpful
for compressing or denoising data without sacrificing crucial features. Wavelet analysis comes in a
variety of formats. One popular tool for analyzing one-dimensional data, such a single PPG signal, is the
Continuous Wavelet Transform (CWT). The analysis function in the CWT is a wavelet. The CWT
compares the time domain signal to versions of the mother wavelet that have been stretched, compressed, and/or shifted [3.111]. Dilation or scaling refers to the process of stretching or compressing a function to match the concept of physical scale. The signal is compared to the wavelet at different scales and positions to obtain a function of two variables. The CWT is a real-valued function of scale and position if the wavelet is real-valued. In the Discrete Wavelet Transform (DWT), scaling and shifting are accomplished in discrete steps, typically powers of two, which is highly convenient for computer implementation. In addition to being easier to implement on hardware, the DWT can be faster than the CWT [3.111]. Wavelets come in a variety of forms, such as Daubechies, analytic Morlet (Gabor), Haar, Morse, and Bump; the Morlet and Daubechies wavelets are frequently used for wavelet-based analysis of biomedical signals.
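The sketch below computes a CWT scalogram of a PPG segment with the Morlet-type 'morl' wavelet available in PyWavelets. The scale range is an illustrative assumption chosen to cover roughly 0.5-8 Hz.

    # CWT scalogram of a PPG segment over an assumed frequency range.
    import numpy as np
    import pywt

    def ppg_scalogram(ppg, fs, f_min=0.5, f_max=8.0, n_scales=64):
        frequencies = np.linspace(f_min, f_max, n_scales)
        # Convert the target frequencies to scales for the chosen mother wavelet.
        scales = pywt.central_frequency("morl") * fs / frequencies
        coeffs, freqs = pywt.cwt(ppg, scales, "morl", sampling_period=1.0 / fs)
        power = np.abs(coeffs) ** 2                   # scalogram: energy vs scale and time
        return power, freqs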

3.3.5 Review of PPG signal analysis for Emotion Recognition


When it comes to emotion recognition, scientists categorize feelings according to their degree of valence,
arousal, and dominance, either directly or indirectly. While some studies employed correlation and
entropy coefficients, others used the number of intersections between Poincare planes, mean, and
variance. For the objective of extracting emotion-based features, feature sets from activation, fusion, and
connection patterns are also used, along with a sparse adjacency matrix. Emotion classification also made
use of higher order characteristics such as the Lyapunov exponent, approximate entropy, correlation coefficients, and Hjorth parameters. A few scholars combined decomposition and analysis approaches based on Discrete Wavelet Transforms with time domain and frequency domain information. These methods cannot be used in real-time applications since they need intricate computational processes that increase the algorithm's overall execution time. Certain techniques are restricted to classifying emotions into four categories: arousal, non-arousal, valence, and non-valence; however, they are unable to categorize distinct emotions, which is essential for proper assessment and subsequent treatment. Furthermore, the results produced cannot be considered a definitive measure of a subject's mental state due to the low classification accuracy. Because of this lack of precision, any medical treatment based on these results may aggravate the mental condition. Therefore, a precise outcome is needed for the correct and accurate detection of the various emotions.
Based on the Persian emotional database and Berlin database, Bastanfard et al. classified six distinct
emotional groups using a number of global and local variables, their derivatives, and a few statistical
features. When they used feature set one and an auto-encoder neural network-based classifier, they were
able to achieve 90.3% recognition accuracy. The method's use of speech recognition software makes
pronunciation a significant difficulty. Furthermore, a subject may purposefully speak in a different way,
which could stifle the true emotional impact. It is not possible to use the aforementioned strategy on those
who are nonverbal or have speech difficulties. An additional impediment to voice recognition-based
approaches is language barriers. Considering these aspects, the suggested approach is better since it
makes use of the physiological signal, which cannot be deliberately altered or suppressed. Additionally, the PPG signal can be acquired with ease from individuals with hearing or speech impairments and is not dependent on the subject's native tongue. Table 3.1 presents a thorough comparison of a few of the known techniques for
recognizing emotions.

There hasn't been much study done up to this point that makes use of the PPG signal's potential for
accurate emotional state recognition. Most reported studies, however, use the PPG signal in conjunction with other modalities such as galvanic skin response (GSR) [3.123], electromyography (EMG), respiration rate (RSP), and EEG to classify emotions based on time-plane and spectral features, with lower accuracy. The authors of [3.124] use a complex classifier to examine the spectral power feature of four distinct
EEG bands in order to study the valence and arousal types of emotions. Park et al. [3.125] classify
happiness and sadness using time-domain PPG characteristics and skin temperature. The authors of
[3.126] attempted to categorize thirteen emotions using a feature fusion and wavelet-based approach.
According to [3.127], the setup for the identification of emotional states combines a complex collection
of classifiers with statistical features taken from the ECG, GSR, and PPG signals. In [3.128], PPG and
GSR signals are combined and used through two classifiers to identify eight distinct emotional states with
a maximum average accuracy of 92%. Goshvarpour et al. [3.130] used descending accuracy analysis of
the Poincare's indices produced from the PPG signals in an attempt to classify only three unique
emotions. In addition to the multimodal use of the PPG signal [3.123 – 3.128], most of the
aforementioned methods either identify a small number of emotional states [3.123 – 3.125], [3.127],
[3.129], or use sophisticated bio-signal properties [3.123 – 3.124], [3.126], [3.127], [3.129], or
complicated classification algorithms [3.123 – 3.129]. However, precise emotion recognition using PPG waveform features alone is still found to be lacking, despite the fact that these features hold sufficient promise.

3.3.6 Review of PPG signal analysis for Mental Stress Detection


PPG signal characteristics are typically integrated with other widely used bio signals, such as those from
the electrocardiogram (ECG) [3.131], electroencephalogram (EEG), galvanic skin response (GSR),
electromyogram (EMG), blood pressure (BP), electro dermal activity (EDA), etc. [3.132], to form a
multimodal analysis in the majority of mental stress assessment studies conducted to date. Furthermore,
only a limited number of studies [3.133] utilize PPG signal features alone to evaluate mental stress. The requirement for multisensory acquisition in multimodal signal analysis raises the burden of
acquisition and methodological complexity. The suggested PPG-based systems lag when more intricate
feature extraction and bigger feature sizes [3.134] or classification techniques [3.135] are used; as a
result, more research is necessary.
However, subjective self-reporting is the basis of traditional stress identification methods like surveys and
questionnaires, which have practical and accuracy issues due to bias in reporting, differing degrees of
self-awareness, and misremembered information [3.136]. In contrast, physiological measures such as heart rate variability (HRV) analysis and the electroencephalogram (EEG) have been used in stress detection [3.137], [3.138]. Similarly, EEG records brain activity and yields valuable insights into an
individual's emotional and cognitive states, particularly those associated with stress [3.139]. However,
due to the need for sophisticated acquisition equipment, the use of several electrodes, and expert
management, these techniques are only practical for sporadic personal usage, which can be expensive and
time-consuming [3.140]. The need for a reliable, low-cost, non-invasive stress detection technique that
works well in a range of settings is therefore critical. These days, customized stress tracking is feasible
thanks to the advancements in data analytics and wearable technology. PPG, which makes use of tiny
optical sensors found in devices like smart watches and fitness bands, can now continuously record pulse
rate and perform well in applications involving the identification of emotions [3.141]. These wearables represent an advance in non-invasive, user-friendly stress monitoring in a range of contexts, such as professional and sports environments, healthcare, and other areas, and their embedded PPG sensors make such monitoring easy to deploy. The key lies in leveraging computational modeling to transform raw PPG data into stress assessments
that are clinically meaningful. The construction of a stress detection model utilizing physiological signals requires an algorithm that combines computational simplicity with effective feature extraction [3.142]. A useful indicator for stress-related analysis that may also be derived from the PPG signal itself is
the respiration rate [3.143]. This offers the benefit of obtaining a single bio-signal from a subject, as opposed to using a separate respiration sensor whose positioning is not as convenient as that of a PPG sensor. The documented literature has made substantial use of machine learning-based methods to classify stress based on biological signals and identify stress conditions [3.144]. SVM has mostly been used in the reported literature on stress detection techniques [3.145]. Furthermore, additional algorithms for
stress detection have been employed, such as kNN [3.146], [3.147], LR [3.148], and Fuzzy Logic [3.140].
In comparison to conventional machine learning algorithms, deep learning algorithms have also been
used and have produced encouraging results because of their capacity to evaluate complicated data
[3.149]. While the aforementioned techniques have produced possibilities for precise stress detection,
they often require high computing capacity and higher power requirements, making them unfeasible for
low-computational systems. This results in the need for a straightforward feature-based algorithm and a simple classification technique, which would do away with the need for a powerful computing system and require very few resources to run.

3.4 Review of EEG Signal Analysis

3.4.1 Review of Time Domain Analysis


The raw EEG signal comprises frequency components up to 300 Hz and amplitudes of the order of
micro volts. The signals need to be filtered to remove noise and prepare them for processing and display,
in order to preserve the useful information. The filters are designed such that the signals are not altered or
distorted in any manner. High pass filters are used to eliminate very low frequency components, such as
breathing sounds, with a cut-off frequency of typically less than 0.5 Hz. On the other hand, low pass
filters with a cut-off frequency of roughly 50–70 Hz are used to reduce high-frequency artifacts. To
achieve flawless rejection of the powerful 50 Hz power line noise, notch filters with a null frequency of
50 Hz are frequently required. Here, the sampling frequency may be as low as twice the bandwidth that
most EEG devices typically employ. For EEG recordings, the following sampling frequencies are
frequently used: 100, 250, 500, 1000, and 2000 samples/s. System related artifacts and patient-related
(physiological) artifacts are the major noises. Sweating, bodily movement, EMG, ECG (including
pulsation), EOG, and ballistocardiogram are examples of patient-related or internal artifacts. The system
artifacts include imbalanced electrode impedances, cable faults, impedance fluctuation, electrical noise
from the electronic components, and interference from the 50/60 Hz power supply. During the
preprocessing stage, these artifacts are frequently reduced and the useful data is recovered.
Numerous algorithms have been created so far to process EEG signals. Time-domain analysis, frequency-
domain analysis, spatial-domain analysis, and multi-way processing are among the procedures that are performed. In order to visualize the brain activity using images reconstructed from the EEG, several techniques have been created. Another field of study has been the extraction of the desired sources from the multi-sensor EEGs. This may eventually result in the identification of brain disorders like epilepsy and the
origins of different types of mental and physical activity. Processing lengthy EEG recordings in real time
is necessary for patient and sleep monitoring. EEG offers distinctive and significant insights into the
sleeping brain. Algorithms have been developed to capture major brain activity during sleep [3.150]; these include approaches based on matching pursuits [3.151].
By measuring certain statistics of the signals at various time lags, one can quantify the non-stationarity of the signals. If these statistics show no significant variation, the signals can be considered stationary. The mean and covariance characteristics of the multichannel EEG distribution typically vary from segment to segment, despite the fact that it is generally thought of as a multivariate Gaussian distribution. As a result, EEGs are only regarded as quasi-stationary, that is, stationary for brief periods of time. This Gaussian
assumption is true when the brain is functioning normally, but it is false when engaging in mental and
physical tasks. Examples of non-stationarity in EEG signals include shifts in wakefulness and alertness,
blinking of the eyes, changing between different ictal states, and signals from the event-related potential
(ERP) and evoked potential (EP). Both the divergence of the distribution from Gaussian and the
parameters of a Gaussian process can be used to quantify the change in the signal segment distribution.
One way to verify whether the signals are non-Gaussian is to measure or estimate second- and higher-order moments such as variance, skewness, and kurtosis. Variance measures the degree to which a collection of data points deviates from their mean, or average, value.
Skewness is a metric for symmetry, or more accurately, for the distribution's lack of symmetry. If a
distribution or data set has the same appearance to the left and right of the center point, it is said to be
symmetric. If the bulk of the distribution lies to the right of the mean point, the skewness is negative, and vice
versa. The skewness is zero for a symmetric distribution like the Gaussian distribution. Kurtosis is a
metric that indicates whether a set of data is peaked or flat in relation to a normal distribution; that is, data
sets with high kurtosis typically have heavy tails, a noticeable peak close to the mean, and a rapid decline.
Instead of a high peak, data sets with low kurtosis typically have a flat top close to the mean.
EEG signals frequently comprise lengthy recordings across several channels, producing enormous amounts of data. Feature extraction is used to simplify this dataset by identifying informative attributes, which offers the benefit of reducing over-fitting problems and computational burden. EEG recordings are commonly obtained from both healthy and diseased subjects while researching brain activity, producing a vast amount of data that has to be analyzed. EEG signal features are values that capture signal properties at sampling rates typically ranging from 100 to 1000 Hz. A feature vector is created by combining the different features. Several different approaches must be used in order to extract features
from an EEG signal or a set of data. In order to identify synchronization instances, prominent low-
frequency bands during peak periods, and frequencies indicative of particular pathologies such as
epilepsy, tumors, and injuries, feature engineering methods are used to prepare the data for classification
stages. An overview of the features and their extraction methods that are utilized to automatically
diagnose different brain illnesses is provided below.
The mean, median, variance, RMS, peak-to-peak amplitude, standard deviation, auto-correlation, absolute value, and zero-crossing (ZC) rate are among the commonly used time-domain parameters [3.152]. Here are some other time domain features:
1) EEG histogram: Shows the amplitude distribution (dispersion) of an EEG.
2) Kurtosis: Indicates how sharp the peak of a frequency distribution curve is in relation to a Gaussian curve.
3) Skewness: The amount that the distribution curve deviates from the symmetry expected of a Gaussian distribution of the data.
4) Fractal dimension: Closely related to the Hurst exponent, this measure describes the ability of a time series to retain information over an extended period of time.
5) Entropy: Describes the level of randomness of the time series; it indicates both the consistency of the waves and the uncertainty of their changes.
6) Hjorth parameters: The mobility coefficient and the complexity coefficient measure the variability of the EEG and its derivatives [3.153]; a minimal computational sketch is given after this list.
7) K-complexes: Standard waveforms occurring in the non-rapid eye movement sleep phase [3.154].
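A minimal computational sketch of the Hjorth parameters referred to in item 6 is given below; activity, mobility, and complexity are obtained from the variances of the signal and of its first and second differences.

    # Hjorth parameters (activity, mobility, complexity) for one EEG channel.
    import numpy as np

    def hjorth_parameters(x):
        x = np.asarray(x, dtype=float)
        dx = np.diff(x)               # first derivative (discrete difference)
        ddx = np.diff(dx)             # second derivative
        activity = np.var(x)
        mobility = np.sqrt(np.var(dx) / activity)
        complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
        return activity, mobility, complexity
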
3.4.2 Review of Frequency Domain Analysis
Signals are examined in the frequency domain using frequency analysis as opposed to time analysis. The
frequency components of the signal's amplitude and phase are represented in the frequency-domain. In
particular, the phase shift needed to recombine the frequency components and to obtain the original time
signal can be included in this formulation [3.155]. The characteristics of the signal in the frequency
domain can also be obtained using the power spectral density (PSD) [3.156]. Signals are converted into
sinusoidal components by Fourier transforms [3.157], and a mother wavelet function is incorporated into
the decomposition process using wavelet decomposition [3.158]. The Fourier transform entails breaking down the signal into sub-spectral components that span the frequency spectrum; these sub-spectral components appear as peaks along the frequency axis. The FFT algorithm is then used to compute and collect these peaks in the frequency domain.
Given that such signals might display non-stationary properties, it is advantageous to analyze
them in both the time and frequency domains [3.159], [3.153]. A signal's spectral characteristics are
subject to change over time, hence monitoring frequency variations is crucial for improving
comprehension of the signal. Signals in time-frequency domains can be analyzed using time-frequency
analysis. The short-time Fourier transform (STFT), which provides a spectrogram via uniform separation,
is one of the simplest methods for monitoring a signal and determining its frequency components [3.159].
For data with irregular spacing, more advanced techniques have also been created. One such technique is the wavelet transform [3.158], which makes use of least-squares spectral analysis and window sizes that change with spectral frequency. These methods offer insightful information on the spectral properties
of a signal with respect to time.
By describing a signal's frequency information in relation to time, STFT increases the accuracy of
categorization [3.160]. In contrast to the normal FT, which analyzes the signal as a whole, the STFT splits
the data into multiple brief signals and then determines each signal's frequency contents separately using
time-shifting window frames [3.161]. By offering multiscale analysis, the WT is an extension of the
Fourier transform that overcomes the drawbacks of STFT. Wavelet transform divides a signal into a
collection of basis functions that can be used to recreate the original signal. At various scales, the low-
frequency and high-frequency components of the signal can be distinguished due to this decomposition
process. It is feasible to adapt wavelets to match particular signal properties by stretching, compressing, and shifting the mother wavelet to form a new wavelet [3.162]. This adaptability, combined with the capacity to study signals at various scales, makes wavelet analysis an effective tool for signal processing and analysis.
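As a simple illustration of STFT-based time-frequency analysis, the sketch below computes a spectrogram of one EEG channel and extracts the time course of the alpha-band (8-13 Hz) power. The window length, overlap, and band limits are assumptions.

    # STFT spectrogram of an EEG channel and the time course of alpha-band power.
    import numpy as np
    from scipy.signal import stft

    def alpha_band_power(eeg_channel, fs, band=(8.0, 13.0), win_sec=2.0):
        nperseg = int(win_sec * fs)
        f, t, Z = stft(eeg_channel, fs=fs, nperseg=nperseg, noverlap=nperseg // 2)
        power = np.abs(Z) ** 2                              # spectrogram (power)
        mask = (f >= band[0]) & (f <= band[1])
        band_power = power[mask].mean(axis=0)               # mean alpha power per frame
        return t, band_power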

3.4.3 Review of Joint Time-Frequency Analysis


It is simple to characterize the signals in the frequency or time domain if they are statistically stationary.
Using linear transforms with kernels independent of the signal, such as the Discrete Fourier transform
(DFT), Discrete Cosine transform (DCT), or any semi-optimal transform, one can determine the
frequency-domain representation of a finite-length signal. However, because of the fixed transform
kernels and short-term time-domain windowing of the signals, the outcomes of these transforms may be
deteriorated by spectral smearing. A perfect transformation, like the Karhunen-Loeve transform (KLT),
necessitates comprehensive statistical data, which might not be accessible in real-world scenarios.
While parametric spectrum estimation techniques, like AR or ARMA modeling, can more accurately
capture a signal's frequency-domain features than DFT, their ability to estimate model parameters may be
compromised by the short duration of the measured signals. For instance, precise values for the prediction
order and coefficients are required in order to simulate the EEGs using an AR model. True peaks in the
frequency spectrum may split with a high prediction order, while nearby peaks in the frequency domain
may merge with a low prediction order. The driving signal, or error, of an AR model is regarded as zero-mean white noise. In practical AR modeling, the model parameters must be derived from a finite-length measurement, which degrades the estimation of the spectrum.
The DFT result exhibits variance due to the statistical inconsistency of periodogram-like power spectrum estimation techniques. If the model fits the actual data, the AR approach overcomes this issue. EEG signals frequently exhibit statistical non-stationarity, especially when an aberrant event is recorded within the signals. In such instances, the frequency domain components are integrated across the observation interval and hence fail to adequately depict the signal properties. A time-frequency strategy is therefore required to address this issue. A space–time–frequency (STF) analysis using multi-way
processing techniques has also gained popularity in the case of multichannel EEGs, where the
geometrical locations of the electrodes represent the spatial dimension [3.163]. The discrete-time Fourier
transform evaluated over a sliding window is known as the short-time Fourier transform, or STFT.
Generally, windows are selected to preserve positivity in the power spectrum estimate and to remove
discontinuities at block boundaries. The choice of window also affects the resulting technique's spectral resolution,
which is, to put it simply, the minimum frequency separation needed to resolve two equal amplitude
frequency components [3.164].
An additional option for time-frequency analysis is the wavelet transform (WT). The WT has already been thoroughly documented in the literature [3.165]. The time-frequency kernel of the WT-based method can localize signal components more accurately in the time-frequency plane than the STFT can, effectively exploiting the relationship between the frequency and time components. It follows that the primary goal of Morlet's wavelet [3.166] is to have a coherence time proportional to the sampling period.

3.4.4 Review of Multi resolution Analysis


The embedded subsets produced by approximating the signal at various scales—through filtering and down-sampling—are the source of multi-resolution analysis. At each step the signal is projected onto a subset; the projection is determined by the scaling function, translated and dilated as required, and by the scalar product of the function with the mother wavelet. The wavelet coefficients can be computed directly in a discrete frequency space (i.e., using the DFT). At each stage the signal is smoothed and the number of scalar products is halved. The initial portion of a filter bank is constructed using this process. Mallat et al. [3.167] exploited the features of orthogonal wavelets to recover the original data, and by adding two more filters, referred to as conjugate filters, the theory is extended to a broad class of filters. During decomposition, the input is successively convolved with the time-domain forms of the two filters, L (low frequencies) and H (high frequencies). Suppressing one out of every two samples decimates each resulting sequence. The low-frequency signal undergoes further decomposition while the high-frequency signal is left intact. In reconstruction, sampling is restored by inserting a zero between consecutive samples, the conjugate filters are applied, the outputs are summed, and the result is multiplied by two.
The conjugate filters need to meet two requirements in order for the original signal to be regenerated exactly: the anti-aliasing condition and the exact reconstruction condition. A wide class of compact wavelet functions can be used, and numerous sets of filters have been proposed, particularly for coding [3.168]. It has been demonstrated that the regularity of the wavelet and scaling functions must be taken into consideration when selecting these filters. Daubechies wavelets are the only compactly supported wavelets among those considered that meet the above-mentioned requirements. Viewed as a three-dimensional (time–scale–amplitude) representation, the WT provides accurate time information at high frequencies and accurate frequency information at low frequencies. A large
range of WT types are employed in practice, utilizing many mother wavelets. The Continuous Wavelet
Transform (CWT) method looks at signal segments at various scales and locations to analyze non-
stationary signals. The CWT obtains the frequency content of the signal at different resolutions by
convolving it with a scaled and translated version of a mother wavelet function, commonly the "Morlet"
function [3.169], in contrast to the classic Fourier Transform, which concentrates on frequency
components. Because of its versatility, the CWT is a useful tool in signal processing, image analysis,
time-series analysis, and biomedical signal analysis. It can adapt to the dynamic features of non-
stationary data. The resultant scalogram offers a visual depiction of the signal's energy distribution
across various scales and times [3.158], making it easier to identify time-frequency patterns and
localized frequency fluctuations. The CWT enables the selection and examination of distinct signal
components for various scale variations by continually altering the location and scale parameters. All
things considered, the CWT and scalogram provides insightful information on the intricate dynamics of
time-varying signals. For effective signal decomposition, DWT can be utilized with a variety of mother
wavelets, where the wavelet scales and translations are adjusted according to the sampling [3.170]. One of the main distinctions between the DWT and the CWT is that the DWT decomposes the signal onto a set of mutually orthogonal wavelets defined at discrete scales and translations.
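As an illustration, a multi-level DWT decomposition of an EEG segment might be computed as follows. This is a minimal sketch assuming the PyWavelets package is available; the Daubechies-4 mother wavelet and the decomposition depth are chosen purely for illustration.

```python
import numpy as np
import pywt

fs = 256.0
eeg = np.random.randn(int(10 * fs))   # placeholder for a 10-second EEG segment

# Five-level DWT with a Daubechies-4 mother wavelet: coeffs = [cA5, cD5, cD4, cD3, cD2, cD1],
# i.e. one coarse approximation plus detail coefficients from low to high frequency bands.
coeffs = pywt.wavedec(eeg, 'db4', level=5)

# The same orthogonal filter bank reconstructs the signal (up to boundary effects).
reconstructed = pywt.waverec(coeffs, 'db4')
```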
Different EEG patterns arise from the brain's non-linear dynamic properties and complicated electrical activity. Dividing the signal into smaller subsystems may change the signal's erratic patterns and dynamic characteristics. As a result, several non-linear statistical features are extracted from the EEG [3.171]. Because of its scaling invariance, or self-similarity, fractal geometry offers a useful perspective for examining EEG signals. Multifractal time-series analysis can be used to estimate a number of fractal measures in EEG signal analysis, including the Higuchi fractal dimension (HFD) [3.172], the Katz fractal dimension (KFD) [3.173], the Petrosian fractal dimension (PFD) [3.174], and the Hurst exponent [3.175], [3.176]. Similarly, EEG signals can be characterized by Hjorth's parameters according to their complexity, amplitude, and slope [3.177]. These characteristics attain excellent accuracy in disease classification and are useful in EEG analysis for medical diagnosis.
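The Hjorth parameters mentioned above have a particularly compact definition. The following is a minimal numpy sketch (activity as the signal variance, with mobility and complexity derived from first differences); it is an illustrative implementation, not a prescribed one.

```python
import numpy as np

def hjorth_parameters(x):
    """Hjorth activity, mobility, and complexity of a 1-D signal x."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    ddx = np.diff(dx)
    activity = np.var(x)                                        # amplitude / power
    mobility = np.sqrt(np.var(dx) / np.var(x))                  # mean frequency / slope
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility   # change in frequency
    return activity, mobility, complexity
```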
The Lyapunov exponent (LE) [3.178] quantifies the non-linearity, complexity, and stability of a dynamic system and is calculated by analyzing the exponential divergence between two neighbouring trajectories over time [3.73]. The WT [3.179], EMD, and multivariate EMD [3.180] can be used to extract non-linear features related to the Lyapunov exponent, and the Hilbert transform [3.181] can then be used to help classify these features. The divergence measure exhibits a similar pattern [3.182].

3.4.5 Review of EEG Signal Analysis for Mental Stress Detection


From a biological perspective, life is defined by the constant exchange of information and energy with the
surroundings within the constrained parameters of a dynamic equilibrium called "homeostasis" [3.183]. In
biology, an actual or predicted disturbance of homeostasis and, more broadly, a real or anticipated threat
to physiologic integrity are referred to as "stress" [3.184]. Any event triggers a general alarm reaction
because specific brain regions like the hippocampus, amygdala, and prefrontal cortex evaluate its
potential to cause instability. If the event does not match some cognitive representation based on prior
subjective experience, an alarm response is set off [3.183]. The clinical picture of this alarm reaction
includes psychological alterations like worry and panic as well as arousal, vigilance, and alertness, all of
which are more or less represented depending on an individual's susceptibility and resistance. Two major
physiological reactions are also triggered by exposure to stressful circumstances. The parasympathetic
branch of the Autonomic Nervous System (ANS) is temporarily suppressed, causing the sympathetic
branch of the ANS to simultaneously activate, resulting in the first and most immediate response which is
the activation of the Sympatho-Adrenal-Medullary (SAM) axis [3.185]. As a result, the adrenal medulla
secretes more adrenaline, and sympathetic nerve terminations release more noradrenaline. This prepares the body for a given task by causing a sharp rise in heart rate, cardiac inotropism, blood pressure,
vascular tone, bronchial dilatation, and breathing frequency. Due to the reflex parasympathetic branch re-
activation [3.184], [3.185], and the exhaustion of the sympathetic activation effects resulting from the
short plasmatic and intersynaptic half-lives of catecholamines, the SAM axis's response to a single
stressful event lasts for 1-2 minutes. The hypothalamus secretes Corticotropin Releasing Hormone
(CRH), which stimulates the pituitary gland to secrete Adreno Corticotropic Hormone (ACTH), which in
turn causes the adrenal cortex to secrete cortisol. This is the second major response related to the
activation of the Hypothalamus Pituitary Adrenal (HPA) axis [3.183], [3.185]. Blood glucose levels rise
as a result of elevated cortisol levels, giving the body an effective energy substrate for a demanding
endeavour. The HPA axis can respond to a single stressful event for up to 60 minutes following the event,
in contrast to the SAM axis's stress response [3.186]. Because they gave living things an evolutionary
advantage, these phylogenetically ancient reactions to stress—particularly the "fight or flight" response
[3.187], which is logically designed to increase the likelihood of individual survival—evolved under the
impact of environmental selection. However, depending on personal susceptibility and resilience, these
good evolutionary responses may become detrimental health-threatening reactions [3.188, 3.189]. In
actuality, abrupt shifts in catecholamine levels may be the cause of severe cardiac arrhythmias,
hypertension, stroke, and coronary artery disease, while elevated cortisol levels may be the cause of
cancer and gastrointestinal disorders like ulcerative colitis and gastric ulcers [3.190]. In ordinarily
susceptible and resilient people, the physiological reactions to stress also influence the cognitive-
behavioral processes when stressful situations are repeated without sufficient time for recuperation in
between. Repeated exposure to stressful situations frequently results in imperfect perception, insufficient
attention, inadequate or delayed information processing, and mistakes of judgement. These conditions can
have major repercussions, especially in the workplace [3.191]–[3.193]. This study estimated the level of stress in relation to these physiological reactions to stressful events using the electroencephalogram (EEG), Heart Rate Variability (HRV), and Electrodermal Activity (EDA). This choice also aligns with the hierarchy of physiological and physical cues for stress assessment developed in a survey study by Sharma [3.194], in which EEG, HRV, and EDA are ranked first, second, and third, respectively. This ranking justifies the clear reflection of stress in EEG signals, which makes them an almost inevitable choice for analyzing and detecting stress conditions.
In the past, qualitative and subjective instruments, such as psychometric scales and instrument-based human stress assessments, have been used to identify stress and anxiety [3.195], [3.196]. Such questionnaires may be self-completed or administered by an expert [3.197, 3.198]. However, users frequently fail to accurately express their feelings while answering questions, and the psychometric results may be influenced by participants' inaccurate personal stress perception [3.199]. Furthermore, this type of analysis lacks objectivity, which makes continuous monitoring over time impossible.
Physiological signals have recently been utilized to evaluate stress with less intrusiveness and complexity
than biological analysis, and with a higher degree of objectivity than psychometric tools. Actually, the
emergence of wearable sensors and new technologies has made it possible to address three major problems with commonly used diagnostic devices: excessive intrusiveness, poor portability, and encumbrance. The examination of physiological signals appears to offer a good compromise between the earlier methods. In particular, EDA, HRV, and EEG are frequently employed in the literature to investigate stress levels during various tasks [3.194]. Mental stress causes the sweat glands to become more active, which changes the conductance of the skin. In earlier research, EDA has been employed as a measure of
sympathetic activation alone [3.205] or in conjunction with other physiological measures [3.206, 3.207]
to examine the stress levels of computer users [3.208], drivers [3.207], and participants in other routine
tasks [3.208]. Furthermore, a number of studies have demonstrated that stress affects the ANS [3.209]
and, in turn, the cardiac activity. The analysis of the HRV signal in both the temporal and frequency
domains is required in order to examine the impact of sympathetic and parasympathetic activities on the
basis of the electrocardiogram (ECG) signal. Prior research has employed HRV analysis in a variety of
studies to identify stress during mental tasks [3.210], heavy workloads [3.211], and car driving [3.212].
Stress, both physical and mental, can also alter brain activity [3.213]. Different physical and emotional
states are represented by the arrangement of EEG data in multiple frequency bands. EEG analysis has
been used to categorize the performance of various stressful tasks [3.214] and to look at the stress levels
of computer gamers [3.215].
Although the stress levels of people throughout various tasks have been studied using physiological
parameter measurement in previous works, variations in physiological signals can be influenced by a
variety of factors, including body posture, physical activity, and ambient conditions. Thus far, various
studies have emphasized the need to combine physiological and movement sensors to increase the
robustness of stress detection. In fact, such sensor-fusion methodologies allow the most effective sensor and feature combinations to be examined, which in turn identifies the most accurate data processing methods. Generally, supervised learning approaches are employed in conjunction with ECG, EDA, and SpO2 [3.216, 3.217].
Similarly, additional examples of sensors include respiration [3.218], EMG, and EEG [3.219]. It's
interesting to note that inertial, proximity, and microphone sensors are also employed [3.215, 3.217]. To
the best of our knowledge, the overall accuracy of stress detection with various supervised or
unsupervised methods ranges from 0.79 to 0.95. Naturally, it is lower when data processing aims to detect
at least three levels of stress (no stress, moderate stress, and stress), instead of the most common scenario,
which is two levels (no stress, stress). A comparative summary of the stress detection techniques along
with their attributes is illustrated in Table 3.2.
The majority of works [3.215]–[3.218] rely on a psychological/cognitive stress induction, which is
frequently not in line with actual life, where physical stress is usually included as well. Only Xu et al.
[3.219] suggested inducing physical stress through squat-stand exercises and cognitive testing as a means
of stress induction. The requirement for a reference stress value, which is required to verify the stress
detected using sensors, is another crucial issue. Most studies employ self-reporting techniques, including
the STAI questionnaire [3.215]–[3.219]. It is widely acknowledged, therefore, that these methods have
poor objectivity and significant intra- and inter-subject variability. On the other hand, measuring hormone
concentrations is an objective way to identify stress because stress levels alter the production of stress
hormones (e.g., increased release of cortisol or catecholamine) [3.200], [3.201]. This method necessitates
time-consuming, costly laboratory analysis procedures as well as invasive approaches (such as collecting
samples of blood, saliva, or urine) [3.194]. Nevertheless, some studies make use of these monitoring methods to assess stress in military personnel [3.203], in drivers [3.202], and in stress intervention methods [3.204]. In view of the above studies, it becomes essential to develop a method of stress evaluation based on bio-signal analysis that is computationally simple and yet achieves high detection accuracy.
• PPG signal is becoming the most favoured signal among present researchers due to its ease of acquisition and simple morphology.
• The clinical significance of the PPG signal is justified by the wide variety of applications found in the literature.
• This chapter presents an automated method for the detection and classification of primary emotions based on PPG signal analysis.

4.1 Overview
Emotion is a complex mental state that often mirrors attitudes and views in people. Accurate
identification of emotional states and their characteristics is essential for diagnosing serious illnesses and
designing appropriate treatment plans. EEG-based analysis, which is multi-lead and complex, is typically
used to characterize emotion detection. These days, the photoplethysmogram (PPG) signal's wearable
features, comprehensive cardiac information, and ease of usage are being utilized to determine emotional
states. However, most of the documented emotion detection systems use PPG signals only within multimodal approaches. This chapter presents a straightforward methodology that uses PPG signal analysis alone to identify multiple emotional states. The blood ejection rate typically varies in response to emotion-induced heart rate changes, which in turn disturbs the balance between the systolic and diastolic phases. A particular time-domain characteristic is therefore identified to quantify this imbalance, and its variability is employed as a feature in a threshold-based classification method to distinguish between the five most prevalent emotional states. The algorithm performs well when tested on PPG data gathered from the standard DEAP dataset, achieving an average detection accuracy of 97.78%. The higher
accuracy values, when compared to previous literature, demonstrate the usefulness of the suggested
algorithm for the sole PPG signal-based detection of different emotional states. Its potential for usage in
practical healthcare applications is further supported by the utilization of a single PPG characteristic and
the adoption of a straightforward threshold-based categorization method.

4.2 Preliminaries of Emotion


Emotion is a spontaneous reaction to a feeling that reflects the response of a person to a particular instance. In different situations, one may have different feelings such as sadness, happiness, joy, fear, calm, anger, excitement, or tiredness. A person's present mental state, thoughts, and behaviour can be termed emotion [4.1], [4.2]. Human emotions are practically innumerable and difficult to quantify precisely. According to various research articles, common types of human emotional states include amusement, boredom, disgust, excitement, joy, satisfaction, sympathy, romance, horror, confusion, awe, nostalgia, fear, empathy, calmness, anxiety, admiration, awkwardness, triumph, sadness, interest, envy, craving, adoration, etc. [4.3], [4.4]. In order to classify or organize emotions, various models have been proposed by different psychologists, some of which are summarized in Table 4.1. Among the mentioned models, the most popular is Russell's Circumplex 2D model [4.5]. These models classify emotions in either a 2D or a 3D framework. The two-dimensional model classifies emotions using 'valence' and 'arousal' as dimensions, while the three-dimensional model incorporates 'dominance' as the third dimension. One such two-dimensional model of emotion classification is shown in Figure 4.1.

4.3 Background and Contribution


One of the most effective physiological tools for assessing a person's mental and psychological health at
an early stage is emotion [4.11]. Emotional mood fluctuations typically impact a broad spectrum of
physiological processes within the body. It is discovered that the ensuing impacts are entrenched in
several physiological indicators and signals produced by the human body [4.12]. Since emotional state
characterization is based on mental activities, the study of the electroencephalogram (EEG) signal
generally governs this area [4.13], [4.14]. A large number of EEG-based emotion detection techniques,
primarily intended for brain-computer interface (BCI), now use a variety of automated algorithms to
reduce the need for manual intervention and provide quick, accurate, and expert-free emotional state
detection. This is due to the development of intelligent computational techniques and cutting-edge
hardware systems. To identify emotional states, these automated techniques primarily employ a broad
range of statistical, nonlinear, wavelet, time-domain, and frequency-domain information [4.15]. Despite
automation, most EEG-based methods are still behind because: 1) EEG acquisition systems are typically
expensive; 2) EEG acquisition requires the use of multiple leads embedded on a skull cap that is placed
on the subject's head via gel, affecting the subject's comfort and mobility; 3) EEG recording requires an
experienced operator to minimize error; and 4) most importantly, the involved algorithm requires a
number of complex methodologies in order to carry out effective analysis of the non-stationary EEG
signal [4.16].

Researchers nowadays are searching for various substitute methods to attain a less complicated yet
accurate emotional state detection along with implementation promises. Consequently, the PPG signal's
affordable, simple-to-implement, and easy-to-acquire qualities have long piqued researchers' curiosity.
The PPG signal's non-invasive mechanism primarily uses an electro-optical method to estimate the
relative blood volume change for each cardiac cycle. This mechanism can monitor the blood volume
change at various peripheral places on an individual, such as the tip of a finger, earlobe, toe, etc. [4.17].
An infrared source and detector assembly is used in the fundamental operation, where light from the
source travels through tissues and cells before hitting the detector. The amount of blood flowing during
each cardiac cycle determines how much light is partially absorbed. Numerous recent studies have
previously made use of the intrinsic cardiac and blood circulation-related facts encoded in the PPG signal
to extract a variety of cardiac parameters [4.18] and to identify a few crucial cardiac defects [4.19 – 4.21].
In reality, it is discovered that the autonomic nervous system (ANS) correlates human brain activity with
heart functionality. Consequently, pathophysiological fluctuations generated by emotions impact not only
heart rate variability (HRV) but also blood ejection rate [4.22]. Given that every PPG beat denotes the
total blood volume change connected to each cardiac cycle, it is reasonable to assume that any
fluctuations in the blood ejection rate brought on by emotion will also be represented in distinct PPG
signal waveform regions [4.18]. Because PPG signals are easy to acquire and have inherent clinical
significance, they may be a better option when it comes to emotion identification.

In the current chapter, a streamlined method for identifying various emotional states is developed, based
on an examination of the PPG signal properties. The placement of the dicrotic notch reveals a particular
time-domain characteristic, and the suggested algorithm makes use of this characteristic's variation to
reflect the modulation of the physiological changes associated with PPG waveform corresponding to five
distinct emotional states. It is noteworthy to emphasize that, to the best of our knowledge, accurate
detection of different emotional states using a single time-domain component of the PPG signal has not
yet been examined or published before this.

4.4 Methodology
The block schematic of Figure 4.2 presents the general methodological aspects of the present method for
PPG based identification of five distinct emotional states. The algorithm consists of four main steps: 1)
denoising of the PPG records obtained from the DEAP dataset; 2) precise identification of fiducial points
with minimal processing and amplitude normalization; 3) feature extraction; and 4) threshold-based
classification for the accurate identification of five emotional states.

4.4.1 PPG Dataset


The PPG records utilized in this chapter to determine emotional states were gathered from the DEAP
database, which is publicly accessible [4.23]. Eight peripheral bio-signals were gathered from 32 participants, ranging in age from 19 to 36 years (17 male and 15 female).
However, in the current study, only the PPG signal records are selected out of a large number of other
signals. The PPG signals are captured from the thumb at 512 Hz sampling frequency for a minute and
stored in the database associated with every music video. For simplicity of processing, the captured data
is then down sampled to 128 Hz. Subsequently, every participant assessed the degree of several emotions,
including like or dislike, valence, arousal, dominance, and familiarity, using self-assessment manikins
(SAM), which were initially introduced in 1995 [4.24]. They then graded each video in accordance with
their evaluations. It should be noted, nonetheless, that only the PPG signals obtained from the participants
watching calm, happy, sad, loving, and hateful music video clips were selected for additional analysis in
the current investigation, as opposed to mixed emotions.

4.4.2 Pre-processing of PPG signal


One of the main benefits of Photoplethysmography is its easy and low-cost signal acquisition methods.
However, a number of notable conditions are observed to severely degrade the PPG signal quality during
data collection. The most important factors affecting the signal quality are primarily the acquisition
fingertip's location and steadiness on the sensor, the photo detector's ability to capture all of the incident
light, the blood's inadequate perfusion in marginal area tissues, and—most importantly—the existence of
various activity-originated artifacts and external noises [4.21]. Therefore, a sixth-order low-pass Butterworth filter is used in the proposed study to minimize the influence of noise components superimposed on the acquired PPG data. It is important to note that all high-frequency noise, including power-line interference, is eliminated by keeping the low-pass filter cut-off frequency within 15 Hz.
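A minimal sketch of this pre-processing step is given below, assuming SciPy is available and that the DEAP PPG records are down-sampled to 128 Hz as described in Section 4.4.1; the function and variable names are illustrative.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def denoise_ppg(ppg, fs=128.0, cutoff=15.0, order=6):
    """Low-pass the raw PPG record to suppress high-frequency noise.

    fs     : sampling frequency in Hz (128 Hz for the down-sampled DEAP records)
    cutoff : cut-off frequency in Hz (kept within 15 Hz, as described above)
    order  : Butterworth filter order (six, as described above)
    """
    b, a = butter(order, cutoff / (fs / 2.0), btype='low')
    # Zero-phase filtering avoids introducing a phase shift in the fiducial points.
    return filtfilt(b, a, ppg)
```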

4.4.3 Identification of fiducial points of the PPG signal


The precision of the identified fiducial points is a critical factor in determining the validity of any time-
domain analysis of the bio-signals. The features of the PPG signal that indicate the various systolic and
diastolic phases of the heart cycle are known as the fiducial points. The automatic and simple
identification of the pulse onset and dicrotic notch is achieved in the suggested method by means of an
algorithm that is straightforward, effective, reliable, and fast. The dicrotic notch indicates the transition
from the systolic to the diastolic phase, while the pulse starting point indicates the beginning of the
systolic phase. The employed algorithm primarily makes use of easy-to-calculate techniques such as
amplitude thresholding, zero crossing, slope reversal, and the first and second derivatives (FDPPG and
SDPPG) of each individual PPG waveform. Furthermore, the effectiveness of the algorithm is evaluated
using many PPG waveforms that exhibit significant variance in the overall physiology. A detailed
discussion of the fiducial point detection algorithm's methodology is available in [4.25].
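As an illustration only (the actual algorithm is detailed in [4.25]), one simple heuristic in the spirit of the description above locates pulse onsets as prominent minima and uses the maximum of the second derivative (SDPPG) after the systolic peak as a dicrotic-notch surrogate. All window fractions and rate assumptions below are illustrative.

```python
import numpy as np
from scipy.signal import find_peaks

def fiducial_points(ppg, fs=128.0):
    """Heuristic pulse-onset and dicrotic-notch detector for a filtered, single-channel PPG."""
    # Pulse onsets: prominent minima, at most one per ~0.5 s (heart rate below ~120 bpm assumed).
    all_onsets, _ = find_peaks(-ppg, distance=int(0.5 * fs))
    onsets, notches = [], []
    for a, b in zip(all_onsets[:-1], all_onsets[1:]):
        peak = a + int(np.argmax(ppg[a:b]))             # systolic peak of this beat
        lo = peak + int(0.10 * (b - peak))              # search window after the systolic peak
        hi = peak + int(0.80 * (b - peak))
        if hi - lo < 5:                                 # skip malformed or truncated beats
            continue
        sdppg = np.diff(ppg[lo:hi], n=2)                # second derivative (SDPPG)
        onsets.append(a)
        notches.append(lo + 1 + int(np.argmax(sdppg)))  # notch surrogate: SDPPG maximum
    return np.array(onsets), np.array(notches)
```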

4.4.4 Correction of baseline modulation noise and amplitude normalization


The time-domain features extracted from any bio-signal must be estimated with respect to a reference. However, the signal baseline frequently fluctuates, as shown in Figure 4.3(a); breathing irregularities, abrupt movements of any body part, and a lack of fingertip stability on the acquisition sensor can all cause such fluctuations. A very straightforward mathematical technique based on the positions of the pulse onset points is therefore utilized in this chapter to reduce the effect of baseline
modulation effects. The resulting baseline corrected waveform is shown in Figure 4.3(b), and [4.17]
provides a thorough explanation of the baseline correction process. The amplitude range of distinct PPG
signal corresponding to particular emotions is found to differ for each individual after baseline
modulation correction. Therefore, using the procedure of Eq. (4.1), each baseline-corrected PPG signal record is then normalized to the range (0–1). In this case, the baseline-corrected PPG signal is denoted by V_PPG, the resulting normalized PPG signal by V_PPG(norm), and the lowest and highest values of the baseline-corrected PPG signal by V_PPG(min) and V_PPG(max), respectively. Figure 4.3(b) shows a sample PPG signal record after baseline correction and amplitude normalization.
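Eq. (4.1) itself is not reproduced here; based on the variable definitions above, it presumably corresponds to the standard min–max normalization:

\[ V_{PPG(norm)} = \frac{V_{PPG} - V_{PPG(min)}}{V_{PPG(max)} - V_{PPG(min)}} \tag{4.1} \]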

4.4.5 Feature extraction from PPG signal


From a physiological perspective, the medulla oblongata, located at the base of the brain, is primarily responsible for controlling the heart's rhythmic activity. It also serves as the control centre of the autonomic nervous system (ANS). Generally, certain chemicals are released from this centre depending on whether or not there is any stimulation, and this changes the heart rate [4.18]. This implies that changes in mood
must also affect the autonomic nervous system's (ANS) reaction, which in turn affects heart rate (HR).
The amount of blood ejected and the variation in blood volume in the periphery of the human body are
both impacted by this altered heart rhythm. Such variations in blood volume at the body's extremities are
typically detected as PPG signals. Variations in the systole and diastole phases are caused by emotions
that alter the heart rate, blood ejection rate, and the generated forward and reflected wave. A careful
examination of the PPG waveforms reveals that both phases may be easily separated at the conclusion of
each PPG beat's systolic phase, which is indicated by a particular fiducial point called the "dicrotic
notch". One of the most important points in the PPG signal waveform that denotes the aortic valve
closure or the conclusion of blood ejection from the heart is the dicrotic notch. Five PPG signals that
represent five different emotional states are categorized in Figure 4.4 of the current study. The PPG beat
that corresponds to the 'love' condition in Figure 4.4(a) is noteworthy because it shows beat-wise
variation in the systole-diastole phases with different dicrotic notch placements. The systole phase in
Figure 4.4(b) appears to conclude earlier with a prolonged diastolic phase when the emotional state shifts
from "love" to "calm." As the 'happy' state is transitioning to the systole-diastole phase in Figure 4.4(c),
the notch position and morphology are found to be relatively steady and conspicuous. The ensuing PPG
beats of Figure 4.4(d) and Figure 4.4(e) during the "sad" or "hate" states clearly reveal beat-wise
fluctuation in the notch location and morphology.

The PPG beat's most critical part is considered to be the region between the onset and the dicrotic notch,
given the impact of emotion in peripheral blood volume change. Every PPG beat is discovered to induce
changes in the aforementioned area due to variations in heart rate and the systole-diastole portions of the
PPG beats. Also, the position and morphology of the dicrotic notch must withstand a beat-wise, subject-
wise, and emotion-wise variation since the effect of emotion is not static and the degree of emotion
continues to vary depending on the subject's physical or mental conditions as well as the level of external
stimuli. Ten PPG beats are therefore taken into consideration for each emotion and person in order to
depict this emotion-induced fluctuation. Figure 4.5 displays one such sample map of the area of interest
from the PPG beats. Rather than taking the selected onset-to-dicrotic-notch area of a single pulse, its variability over ten consecutive beats is chosen as the single most discriminating feature for categorizing the five distinct emotional states across a group of subjects. The feature used in this study is therefore the variance, computed per record, of the summed samples of the indicated region, as shown in Figure 4.5.
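A minimal sketch of how such a feature could be computed from detected onsets and notches is given below; it assumes the beat-wise 'area' is approximated by the sum of the normalized samples between onset and dicrotic notch, which follows the description above only loosely.

```python
import numpy as np

def emotion_feature(ppg_norm, onsets, notches, n_beats=10):
    """Variance, over n_beats consecutive beats, of the onset-to-notch 'area' of each beat."""
    areas = []
    for onset, notch in zip(onsets[:n_beats], notches[:n_beats]):
        # Approximate the systolic-region area by summing the normalized samples of the beat.
        areas.append(np.sum(ppg_norm[onset:notch]))
    return np.var(areas)
```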

4.4.6 Classification
According to the analysis of the feature values extracted from each subject, each emotional state under consideration has a distinct range of feature values. This means that the five different emotional states can be adequately classified using simplified threshold-based classification logic with a unique set of decision boundary values. Two distinct classification techniques, which are explained below, are used to further validate the general discrimination logic that was established.

4.4.6.1 Multi class classification

The benefit and efficiency of the selected characteristic come from its ability to employ a straightforward
threshold-based categorization strategy in place of any intricate method. The maximum, minimum, standard deviation, and average values of the acquired feature are used to calculate unique threshold values for each emotion. The test dataset is then subjected to these threshold values in order to categorize particular emotions. Because the current classification problem involves five emotional states, only four separate threshold values, T1, T2, T3, and T4, are generated and used for classification. The
threshold rule for classification is created as follows, where x is the feature value for a particular subject's
emotion. The feature set value in question is compared to each of the four boundary values using a
threshold rule-based classification technique, starting with the smallest value and working up to the
highest. The feature value will match any one of the five emotion classes throughout this comparison
process. The class that is chosen as the predicted emotion class is the one for which the current feature
value is inside the boundary values.

if x ≤ T1, then class = A (‘Love’)


else
if T1< x ≤ T2, then class = B (‘Calm’)
else
if T2< x ≤ T3, then class = C (‘Happy’)
else
if T3 < x ≤ T4, then class = D (‘Sad’)
else
if x > T4, then class = E ('Hate’)

It is evident that using a single unique feature value lessens the complexity of classification. Furthermore,
it should be noted that the threshold-based technique, which classifies five emotional states based on a
single feature, is exclusive in nature and has not, as far as we know, been discussed before. Due to the use
of only four distinct threshold values, the overall computational time is negligible even though five
conditions must be met for classification purposes. These threshold values form the basis of the entire
classification; therefore iteration is not necessary to achieve optimal results.
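A runnable Python sketch of the rule stated above is shown below; the threshold values T1–T4 are placeholders to be learned from the training folds, as described in Section 4.5.2.

```python
def classify_emotion(x, t1, t2, t3, t4):
    """Threshold-rule classification of the single PPG feature x into five emotion classes."""
    if x <= t1:
        return 'Love'
    elif x <= t2:
        return 'Calm'
    elif x <= t3:
        return 'Happy'
    elif x <= t4:
        return 'Sad'
    return 'Hate'
```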

4.4.6.2 Binary class classification

Following classification into five distinct emotional states, each emotional state's efficiency in relation to
the other four emotions is evaluated in further detail for each of the computed threshold values. One
emotion is considered to be one class in each scenario, while the remaining feelings are considered to be
the other class. For instance, in the first scenario, the emotion of love is classified as one class, and all
other emotions are classified as a second class. A decision boundary is selected, and the threshold-based
rule is used to classify the data. Here is an illustration of a binary classification issue. An emotion will be
classified by the model as belonging to the Love category or to another (non-Love) category.

Case 1: Love vs. (Calm, Happy, Sad, Hate)

Case 2: Calm vs. (Love, Happy, Sad, Hate)

Case 3: Happy vs. (Love, Calm, Sad, Hate)

Case 4: Sad vs. (Love, Calm, Happy, Hate)

Case 5: Hate vs. (Love, Calm, Happy, Sad)

Step1: if x ≤ B1, then class = ‘Love’


else
class = Rest (‘Non Love’)
Step 2: if B1< x ≤ B2, then class = ‘Calm’
else
class = Rest (‘Non Calm’)
Step 3: if B2< x ≤ B3, then class = ‘Happy’
else
class = Rest (‘Non Happy’)
Step 4: if B3 < x ≤ B4, then class = ‘Sad’
else
class = Rest (‘Non Sad’)
Step 5: if x > B4, then class = 'Hate’

4.5 Results and Validation

4.5.1 Analysis of Feature distribution

4.5.1.1 Box plot based feature analysis


The box plot of the feature values for each emotion and for all the records is shown in Figure 4.6 in order
to illustrate the diversity in the feature values computed for five distinct emotional states. It is clear from
looking at the box plot that the feature values for the various emotions may be simply grouped according
to the feature set's absolute values. Valence, arousal, and dominance are the main categories used to group the available emotions in the current dataset [4.37]. The five emotions being examined fit within these major emotion classes. Out of these five emotions, "hate" is a
dominance emotion, "sad" is a negative valence emotion, "happy" is an arousal emotion, "calm" is a non-
arousal emotion, and "love" is a positive valence emotion [4.26]. Positive valence and non-arousal
emotions have little effect on a person's heart rate, and their physiological signals are essentially normal.
An individual's body becomes energized when they experience an arousing feeling, and this is anticipated
to show up as a shift in blood volume [4.27]. As a result, it is projected that significant changes in the
signal morphology will occur during negative valence emotions, such as "sad," with the dominating kind
of emotion, "hate," expected to see the most alteration.

Thus, it makes sense to find that the variance of the area values between a PPG signal's onset and dicrotic
notch would have distinct groups for various emotional states based on the previously explained
background. Figure 4.6 depicts the box plot showing how the range of values for different emotions
varies noticeably. According to the classification of emotions, the positive valence emotion has the lowest range of variance values and the dominance emotion has the largest. The variance values of the other emotions lie between the ranges of positive valence and dominance.

4.5.1.2 Feature trend analysis

Prior to classifying the emotional states, feature trend analysis verification is done to ascertain the
feature's trend of variation. This is regarded as a crucial component of any technique for recognizing
emotions since the trend of the extracted characteristic must be comparable across subjects in different
databases. While it is true that every emotion has a different effect on various subjects, low and high
arousal emotions as well as positive and negative valence emotions should exhibit a consistent pattern of
change. Figure 4.7 displays a typical plot of the feature values from ten randomly selected subjects,
demonstrating a consistent trend across the subjects. The feature values rise from positive valence towards high emotional arousal.

Generally speaking, the following common statistical parameters, defined below, are used to assess the effectiveness of the previously indicated classification techniques:

Here, true positive, true negative, false positive, and false negative are denoted by the letters TP, TN, FP,
and FN, respectively. The positive class that has been accurately classified as positive is represented by
TP, while the negative class that has been correctly classified as negative is represented by TN. In a
similar way, FN denotes the positive class that is mistakenly categorized as negative, and FP indicates the
negative class that has been mistakenly classified as positive.
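The metric equations themselves (Eqs. 4.2–4.5) are not reproduced here; assuming they follow the usual definitions built from TP, TN, FP, and FN, they would read:

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Sensitivity} = \frac{TP}{TP + FN} \]
\[ \text{Specificity} = \frac{TN}{TN + FP}, \qquad \text{Precision} = \frac{TP}{TP + FP} \]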

4.5.2 Classifier performance evaluation

4.5.2.1 Multi class classification

In the current study on emotion categorization, thirty-two PPG records are taken into consideration, and
ten consecutive beats per record are chosen for feature extraction. A ratio of approximately 70:30 is used to partition the total dataset at the start of the classification procedure; that is, seventy percent of the data is used for training and the remaining thirty percent is set aside for testing. Furthermore, a cross-validation technique, eight-fold in this case, is also used to reduce the risk of overfitting, improve the trained model's accuracy on unseen feature values, and, most importantly, obtain boundary values for the universal classification of every emotion. For the eight-fold cross validation, the full data set is first divided into eight equal sets. Seven of these eight sets are used to create the classifier model and the remaining set is used for testing. To cover all test sets, the entire process is repeated eight times. This indicates that every feature value
acquired is used in the classification process's training and testing phases. Now, unique threshold
boundaries (T1, T2, T3, and T4) are chosen for each test fold, and the related performance is noted.
Following eight-fold testing, the final decision boundary for the categorization of all the emotions is
determined by calculating the mean model boundary values obtained from the complete eight-fold test
data. Figure 4.8 displays the final modeled classifier boundaries using a threshold-based rule for the five
chosen emotions. Additionally, Table 4.2 presents the results of evaluating the constructed classifier's
improved performance for multi-class classification of all the emotions under consideration using the
parameters indicated in Equations (4.2–4.5).
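A hedged sketch of how the fold-averaged decision boundaries could be derived is given below, assuming scikit-learn is available. Placing each boundary at the midpoint between adjacent classes is an illustrative choice; the text states only that the maximum, minimum, standard deviation, and average of the feature are used.

```python
import numpy as np
from sklearn.model_selection import KFold

def learn_boundaries(features, labels,
                     order=('Love', 'Calm', 'Happy', 'Sad', 'Hate'), n_folds=8):
    """Fold-averaged thresholds T1-T4 separating the five emotion classes along the feature axis."""
    features, labels = np.asarray(features), np.asarray(labels)
    fold_thresholds = []
    for train_idx, _ in KFold(n_splits=n_folds, shuffle=True, random_state=0).split(features):
        x, y = features[train_idx], labels[train_idx]
        # Place each boundary midway between the largest value of one class and
        # the smallest value of the next class in the assumed feature ordering.
        t = [(x[y == a].max() + x[y == b].min()) / 2.0
             for a, b in zip(order[:-1], order[1:])]
        fold_thresholds.append(t)
    return np.mean(fold_thresholds, axis=0)   # final T1-T4: mean over the eight folds
```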

4.5.2.2 Binary class classification

The whole dataset is split into two unequal parts before any one emotion is binary classified against all the other emotions. The first part contains the records of a particular emotion (32×1 = 32), and the second part contains the remaining records derived from the four other emotional states (32×4 = 128). These two datasets are further split into four subsets, three of which are used for training while the fourth is used for testing. So that each subset serves once as a test set and the average classification boundary can be taken into account, the classification process is performed four times. This process is repeated for each of the five binary classification models, and the results are displayed in Figure 4.9 for the computation of the four distinct classification boundaries (B1, B2, B3, and B4). In this instance, the binary-class and multi-class classifications have identical classification boundary values under different labels. The classification boundary between "Love" and "the rest of the emotions" is B1, while "Calm" is separated from "the rest of the emotions" by B1 and B2. The boundary between "Happy" and "the rest of the emotions" is formed by B2 and B3, and that between "Sad" and "the rest of the emotions" by B3 and B4. Finally, as seen in Figure 4.9, B4 delineates the boundary between "Hate" and "the rest of the emotions."
Using the parameters from Eq. (4.2–4.5), Table 4.3 displays the improved performance of the binary
classification strategy in this instance as well.

The number of true love emotional states that are accurately classified as "Love" emotion is represented
by TP, for instance, if we regard "Love" to be one class and the other as another. The other emotions that
are accurately classified as other emotional states (not in the "Love" category) are denoted by TN. The
other emotions that are mistakenly classified as "Love" are represented by FP. Likewise, FN stands for
"Love" feelings that are mistakenly classified as other emotions (i.e., not falling under the "Love" group).
Figure 4.10 shows the representative confusion matrix for the threshold-based binary class classification,
which assigns Love to one class and the other emotions to another.

The ROC curves for each of the classification methods provide additional evidence of the efficacy of the
procedure, and the area under the curve (AUC) value is calculated in each case. The results show that the
AUC values for the binary classification technique fall between 0.998 and 1, and for the multiclass
classification technique, they fall between 0.994 and 1. The effectiveness of the straightforward
threshold-based classifier is justified by the high AUC values shown in the ROC plots. The subsequent
Figure 4.11 and Figure 4.12 display the ROC curves and related AUC for binary class classification and
multi class classification, respectively.

4.6 Performance Comparison

It is noteworthy to emphasize that there are still very few studies that use PPG signal analysis alone to
detect numerous emotional states. Conversely, most of the approaches that have been suggested use PPG
signals in multimodality mode [4.28 – 4.33]. The drawback of multimodality mode is that it increases the
computing overhead of the algorithm by requiring distinct denoising and processing approaches and
extracting multiple features from various inputs. Different time-domain, frequency-domain, spectral
features, spectral power asymmetry, statistical, Poincare, or wavelet-based complex features are used by
the multimodality approaches as indicated in [4.28 – 4.34]. Moreover, a number of intricate classification
methods are employed to classify the emotional states, including Bayesian network, ridge regression,
Naïve Bayes method, Support Vector Machine (SVM), Multilayer Perception (MLP), k-nearest
neighbour (kNN), Meta-multiclass (MMC), and Random forest [4.28 – 4.33]. Finally, when evaluating
across the identical dataset as used in this chapter, most methodologies, although using several bio-
signals, identify only a restricted number of emotional states [4.28 - 4.30], [4.32], [4.34] with poor
accuracy. Table 4.4 presents a comparative performance evaluation of all the previously discussed
methods with the suggested work. The evaluation makes it clear that, despite significant methodological differences, the proposed method outperforms the reported literature by classifying five prominent emotional states using a single, easily extracted feature and a straightforward, easy-to-implement threshold-based classification methodology.

4.7 Discussion

Researchers today are interested in the proper recognition of human emotional states by automated
systems, and many different physiological signals have been explored in this regard. A number of
additional signals, including the gold standard EEG signal as well as the ECG, EMG, GSR, and PPG,
have also been investigated for assessing an individual's emotional state. PPG, among other bio-signals,
has drawn particular interest from the scientific community because of its straightforward wave form,
inexpensive equipment, easier capture, integration of trustworthy pathophysiological data, and operator-
free processes. It is still uncommon to find a straightforward and effective method for recognizing
emotions using just the PPG signal. As a result, the current study proposes a straightforward, effective,
and manageable algorithm for the automatic detection of five different emotional states using only the
PPG signal. The five different emotional states are primarily classified using an easy-to-implement
threshold-based classification algorithm and an easy-to-extract feature. Since the emotion effect is
discovered to be at its peak within the first few seconds of the videos, ten consecutive PPG beats are
taken into account when computing the feature values. While a greater number of PPG beats may add to
the computational load, fewer PPG beats may not be able to adequately capture the emotional shifts. It is
noteworthy to observe that the suggested approach recognizes five key emotional states using just a single
feature.

On the other hand, PPG beats with sudden morphological variations frequently pose significant challenges in identifying the required fiducial points. Future research on the PPG signal waveform should focus on finding more pertinent features that might be used in conjunction with the suggested algorithm to
recognize a wider range of emotional states. The PPG signal's inherent physiological properties also compare favourably with other emotion-recognition modalities, such as speech and facial expressions. A subject can control their voice and facial expressions, and can even suppress facial expressions to a certain level, which may lead to faulty conclusions. However, because a subject cannot wilfully alter the PPG signal, it can serve as a key signal in emotion recognition techniques.

Russell's circumplex model is a commonly used emotional model that classifies various emotions based
on their valence, arousal, and dominance scale. The aforementioned feelings are intricate and frequently
overlap. For instance, it is difficult to distinguish between the feelings "relaxed" and "calm," while "sad"
and "depressed" overlap with each other. The overlapping nature of emotions makes the classification
extremely challenging when considering all of the feelings at once. Furthermore, the impact of a certain
emotion varies among individuals. The current chapter is based on five fundamental emotions selected
from the wide range of emotions in order to avoid this complication. The objective is to create an
automated algorithm that can quickly and easily identify emotions for use in real-time applications.
Multiple emotions will add to the computational load and make practical implementations challenging.
The current algorithm's future efforts will focus on identifying various emotional states of a person,
including psychological issues, which are sadly becoming more common in today's society. Furthermore,
the current algorithm makes it simple to provide continuous mental excitation monitoring for individuals
with mental illnesses. This study can be expanded to diagnose mental shock patients, which could assist
medical professionals in taking the appropriate action and lowering the risk of suicide attempts. It is also possible to determine the mental state of a person with hearing and speech impairments by using this algorithm. These days, online gaming and other multimedia platforms are highly popular, and the suggested method may
be useful for understanding the user's mental response. Only five distinct emotions are used in the current
procedure, all of which are very universal and replicable in an experimental paradigm. Future iterations of
the algorithm will further enhance it to the point where a single algorithm can accurately recognize
several emotions, something that the literature currently lacks.

There is adequate potential for the classification of the five emotional states using the extracted feature
with a reasonable level of accuracy utilizing a straightforward, linear threshold-based classification
algorithm. The primary reason for using this categorization technique is to reduce the computational load
and to improve implementation prospects. However, it is noted that in order to enhance the overall
effectiveness of the suggested strategy, the selected threshold-based classifier has to be further optimized
and assessed over a bigger dataset. By putting the established algorithm into software, the full testing
process is carried out and verified. It is also stated that the current study does not use any hardware
implementation of the algorithm. High-tech personal assistance equipment and a wide range of other real-
life applications can benefit from the exceptional performance of the selected algorithm. The following is
a summary of some of the most important details about the suggested algorithm:

• For this study, just five different emotions are taken into account; numerous additional emotions can be included in the algorithm.
• Given that mental stress is on the rise in today's culture, the approach could be extended to assess an individual's mental stress levels.
• All of the analysis in this study is done on an offline software platform. Further development of the same algorithm could enable its use in Internet of Things based systems, allowing the outcomes to be monitored directly even from a remote location.
• The classification process is carried out using a single, simple-to-calculate feature. In the future, a few more features might be developed to make the algorithm stronger.
• Real-time PPG signals for particular emotion stimuli can be captured in the lab, and the algorithm's performance can be examined.

4.8 Conclusion
It is clear that accurately identifying a person's emotional state using a streamlined computerized method
gives conventional human-machine interactions a new dimension and is useful for a wide range of
extremely complex applications. The scientific community is already aware that PPG signals can be a useful substitute for existing bio-signals in situations where patient comfort and ease of use are the main concerns. Although the suggested method performs well, it utilizes only a small portion of the DEAP dataset; therefore its applicability to other applications will need further investigation. In the field of
HMI and other multimedia applications, where analysis is typically dependent on more complex signals
like EEG, ECG, etc., PPG-based emotion detection can be a viable substitute. Most crucially, besides HMI applications, there are specific therapeutic situations involving serious illnesses where treatment planning would be simpler if the doctor were aware of the patient's emotional state beforehand.
When compared to other traditional methods, the suggested PPG-based emotion identification produces
superior results in certain situations. Overall, the usability of the algorithm in real-time applications is
guaranteed by the simplicity of the selected methodology, the algorithm's quick execution, and the high
average accuracy that is obtained. It should be noted, however, that in the future, the proposed
methodology is meant to be improved by adding a number of additional important emotions with more
straightforward methodologies, so that the final algorithm can be used in other cutting-edge, wireless
HMI applications for the detection of some critical disorders, like schizophrenia, autism and others.

• This chapter describes the utility of PPG signal for extraction of respiration signal which remains embedded in it during acquisition.
• Instead of using an additional respiration sensor, a single channel PPG sensor is effective for recording respiration signal along with PPG signal.
• The features obtained from PPG signal and from PPG extracted respiration signal are used to estimate the mental stress conditions with respect to relaxed conditions of the subjects.

5.1 Overview
Because of the increasing complexity of our society, mental stress is a part of life for everyone. Long-
term mental stress situations must be addressed as soon as possible since they have the potential to cause
a variety of chronic diseases. The current techniques for measuring mental stress based on
electroencephalograms (EEGs) are frequently intricate, multi-channel, and expert-dependent. Conversely,
the respiratory signal provides interesting insights into stress, but capturing it is difficult and necessitates
multimodal support. The respiratory signal can be separated from the photoplethysmogram (PPG) signal
in order to get around this problem. The suggested method identifies the stressed state by multimodally characterizing the readily obtained PPG signal. Specifically, the developed algorithm uses a significant PPG feature and, through simplified methods, also extracts the respiratory rate from the same PPG signal.
PPG records obtained from the publicly available DEAP dataset are used to evaluate the approach. The
suggested algorithm's efficacy is evaluated using a straightforward threshold-based approach in
conjunction with conventional classification approaches. Its average accuracy for classifying stressed and
relaxed states is 98.43%. The present method performs better than the available approaches, and its low
acquisition load and straightforward methodology enable its deployment in standalone, real-time
personal healthcare gadgets. The suggested method's usefulness in healthcare applications is ensured by
the usage of basic features and a simple classification strategy.

5.2 Background and Contribution


As our everyday lives become more complex, mental stress has increased in frequency and has now become an inherent aspect of who we are as individuals. "Stress", excitement, or emotional stimulation often refers to a specific psychological state that can affect an individual's performance [5.1]. Mental
stress activates the stress response chronically and can be harmful to one's physical and mental health,
regardless of age [5.2]. Furthermore, stress weakens people's immune systems, which ultimately leads to
serious conditions including diabetes, depression, and cardiovascular illnesses [5.3], [5.4]. Thus, early
mental stress assessment lowers the risk of future physiological diseases while also assisting in the
optimization of current stress-related issues.
Since mental pressure is psychologically linked to stress, the most commonly examined instrument for
determining stress level is the electroencephalogram (EEG) signal [5.5]. However, effective measurement of mental stress with an expensive, multi-channel EEG set-up requires proper positioning of the signal acquisition cap and the use of conductive gel. These procedures typically compromise patient comfort and thus affect performance [5.6]. Furthermore, in order to extract information on mental
stress from the recorded EEG signal, sophisticated signal processing algorithms are required [5.7], which
increases the computational burden. Consequently, scientists are actively searching for alternative, user-
friendly techniques for rapidly determining the conditions surrounding mental stress.
In the modern era, simplified PPG signal analyses are becoming popular as a stand-in tool for a quick evaluation of mental states [5.8]. The easy-to-obtain PPG signal requires only one sensor, and its
operation requires very little prior knowledge. The intrinsic cardiac and blood circulation facts conveyed
in the PPG signal have previously yielded a number of cardiac features [5.9] and a few notable cardiac
abnormalities [5.10]. Since the autonomic nervous system (ANS) connects human brain activity with
cardiac functionality, any pathophysiological changes brought on by stress must often affect both blood
ejection rate and heart rate variability (HRV) [5.11].
Since each PPG beat represents the overall blood volume changes associated with each cardiac cycle, it
can be hypothesized that stress-induced fluctuation in the blood ejection rate will likewise be reflected in
the overall PPG signal waveforms [5.12]. The breathing signal, which offers fascinating information
about stress, requires a worn belt to obtain, which complicates the acquisition process and worsens patient
discomfort [5.13]. However, recent research indicates that the PPG signal by itself can be a very helpful
tool for precise respiration rate evaluation [5.14]. Since breathing generates a noticeable amplitude
variation in the PPG signal, we propose to study the additional information provided by the PPG derived
respiration rate as a multimodal tool for a simple evaluation of the mental stress.
An automated method utilizing multimodal PPG signal characterization is proposed in this chapter. The
goal of the present method is to create an automated algorithm that can take the signal from a single-channel PPG sensor on a subject's fingertip, separate the respiration signal from the PPG signal, and use straightforward standard classifiers to identify the stress condition from the extracted features. The two retrieved features are employed with a typical classification technique to distinguish between relaxed and stressed states with reasonable accuracy. The possibility of a healthcare monitoring
application is increased and the amount of data collected is reduced when a single sensor is used to
acquire the PPG signal and respiration signal. The following points describe the main contributions:
1. A straightforward classification model based on time-plane feature extraction is used in conjunction with a single PPG sensor to provide a non-invasive, affordable, and accurate approach to stress detection. Multiple sensor modalities are not required with this approach.
2. We evaluate the signal processing and feature extraction techniques, such as baseline correction and
normalization, in order to optimize the model size and resource requirements for upcoming deployment.
These techniques achieve significant reductions in latency and model size without sacrificing accuracy.
3. We show that multimodal PPG signal characterization is feasible for stress state detection and
classification.
4. To confirm the effectiveness of our approach for accurate stress detection using just two
computationally straightforward features, we employ the DEAP dataset.
5. We draw attention to the potential for respiration signal extraction from PPG signal for stress
assessment purposes, which would do away with the need for a respiration sensor belt—an acquisition
method that is not generally patient-friendly.

5.3 Methodology
The general block diagram of the entire methodology is displayed in Figure 5.1. The four main parts of
the method are as follows: (1) removing high frequency and power line artifacts from PPG data from the
DEAP dataset; (2) obtaining the respiratory induced amplitude variation signal (RIAV) and calculating
the respiration rate; (3) extracting features from the clean PPG signal; and (4) using a threshold rule-based classification to differentiate between the stressed and the relaxed conditions.

5.3.1 PPG Data Acquisition


PPG signal recordings were gathered for the planned study from the DEAP database, which is available
to the general public [5.15]. A multimodal dataset called DEAP is used to analyze people's emotional
states. Thirty-two participants had their electroencephalogram (EEG) and peripheral physiological data
measured during forty one-minute music video excerpts. The subjects gave each movie a score based on
arousal, valence, like/dislike, dominance, and familiarity. Of the 32 people, 22 also had frontal facial
footage captured. A novel approach to stimulus selection was used, using video highlight recognition,
emotional tag extraction from the last.fm website, and an online assessment tool. The database contains
eight peripheral bio-signals and 32-channel EEGs from 32 subjects (17 males and 15 females). The PPG
signal in the database was recorded for one minute at 512 Hz sampling rate from the subjects' thumbs.
The signals were first down sampled to 128 Hz in order to prepare them for usage later on. It should be
mentioned that PPG recordings under both relaxed and stressful conditions are the only ones used in the
present study. Standard consent procedures were adhered to in the production of the dataset. Prior to the trial, each participant filled out a questionnaire and signed a consent form. Following that, participants received a set of written instructions outlining the experiment's methodology as well as the objective of the several self-assessment tools. The study was approved by the relevant institutional and national research ethics committee, which also attested that it was carried out in compliance with the applicable ethical standards and the 1964 Declaration of Helsinki and its subsequent amendments.

5.3.2 PPG Data Processing


5.3.2.1 Denoising of PPG Signal
One of the high frequency artifacts found in the PPG records extracted from the DEAP dataset is the
power line artifact. These artifacts are eliminated by applying a median filter, which typically functions as
a low pass filter. The median filter operates on the signal sample by sample, replacing each sample with the median of its neighboring samples. The set of neighboring samples, which slides sample by sample along the signal array, is referred to as the "window". Information on the median filter's
operation and related denoising efficacy can be found in [5.16]. Equation (5.1) can be used to represent,
mathematically, the filtering procedure at time n for a filter of size N, and the filtered output Y[n]
corresponding to the input data X[n].
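A plausible form of Eq. (5.1), given the definitions above, is the standard running median over an N-sample window (N assumed odd) centred at sample n:

Y[n] = median{ X[n - (N-1)/2], ..., X[n], ..., X[n + (N-1)/2] }        (5.1)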
The median filter has an advantage over other traditional filtering methods in that it can efficiently
remove specific noise frequency components of the signal without harming other frequency components.
While filtering methods such as DWT and EMD can also be employed to eliminate noise, they will also
eliminate certain unique features of the signal. Clinical features may vanish due to the successive filter
bank effect of the DWT, and the reconstructed signal might not contain all of the important frequency
components of the original signal. When employing EMD, non-linear characteristics are added to the
signal, which is not what is wanted for this study. DWT-based filtering leaves a trailing response behind,
which results in the existence of noise frequency components close to the cut-off frequency values of the
detail and approximation coefficients. The properties of the median filter have no bearing on the clinical
features of the signal that is being studied. It can therefore more effectively eliminate the artifacts that are
usually present in the obtained bio signals.
5.3.2.2 Pulse Onset and Systolic Peak Detection
A crucial factor for the validity of time-domain analysis is the accuracy of the detection of fiducial points
in the clean signal. The fiducial points are the locations in the PPG signal that represent the different
systolic and diastolic stages of the heart cycle. The beginning of the systolic phase is indicated by the
pulse starting point, whereas the dicrotic notch marks the change from the systolic to the diastolic phase.
A method that is simple, efficient, dependable, and quick is used in this chapter to automatically identify
the dicrotic notch and pulse onset. The methodology mainly utilizes techniques that are simple to compute, such as zero crossing, slope reversal, amplitude thresholding, and the first and second derivatives
(FDPPG and SDPPG) of each PPG waveform. Furthermore, numerous PPG waveforms that show
notable diversity in the overall pathophysiology are used to assess the algorithm's efficacy.
The time-domain features extracted from any bio signals must be estimated with respect to a reference.
As demonstrated in Figure 5.2(a), breathing abnormalities, rapid movements of any body part, and a lack
of fingertip stability on the acquisition sensor are all causes of the signal baseline fluctuation that occurs
often. Currently, our approach reduces the influence of baseline modulation fluctuations using a
straightforward mathematical technique based on the placements of pulse start points. Figure 5.2(b)
shows the resulting baseline adjusted waveform. After baseline modulation correction, the amplitude
range of unique PPG signal records corresponding to specific emotions is found to vary for each
individual. Consequently, each baseline adjusted PPG signal record is then normalized in the range of 0
to 1 using the formula given in Eq. (5.3). Following baseline correction and amplitude normalization, a
sample PPG signal record is displayed in Figure 5.2(b).
Here the baseline artifact-corrected PPG signal is denoted by V_PPG, the resulting normalized clean PPG signal by V_PPG(norm), the lowest value of the baseline artifact-corrected PPG signal by V_PPG(min), and the highest value by V_PPG(max).
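Given these definitions, Eq. (5.3) presumably takes the usual min-max form,

V_PPG(norm) = ( V_PPG - V_PPG(min) ) / ( V_PPG(max) - V_PPG(min) )        (5.3)

which maps every baseline-corrected record onto the common range 0 to 1.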

5.3.3 Feature Extraction


5.3.3.1 Calculation of amplitude between PPG onset and systolic peak
The PPG signal is associated with variations in blood volume in the limbs of the body. A PPG signal is a
graphical representation of changes in blood volume throughout the body. Because of the cardiac-brain
link through the autonomic nervous system (ANS), every change in emotional state now affects the heart
rate variability (HRV) and the blood ejection rate [5.17]. Moreover, it has been documented that stress
alters a person's heart rate, and that a person's mental state affects cardiac activity [5.18]. Once the PPG
fiducial points have been determined, a primary, easily computed PPG feature is derived that captures the impact of this variation in cardiac output on mental state. The suggested peak detection methodology is founded on slope reversal, amplitude thresholding, signal derivative, and empirical formula-based techniques, all of which have good average accuracy and are documented in the literature [5.19]. In Figure 5.2, which shows a typical PPG record, the onset points and the systolic
peaks are visually recognizable and marked by round symbols. The amplitudes between the PPG onset
point and the corresponding systolic peak of each of the twenty successive PPG beats are identified and
added together. In the present investigation, the amplitude sum that results is regarded as the first feature
(F1). A representative plot of the amplitudes of twenty successive PPG beats is shown in Figure 5.3.
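For illustration only (the chapter's own implementation is in MATLAB), a minimal Python sketch of the F1 computation is given below; it assumes that the fiducial-point detector has already returned the onset and systolic-peak sample indices, and all variable names are hypothetical.

import numpy as np

def compute_f1(ppg_norm, onset_idx, peak_idx, n_beats=20):
    """F1: sum of onset-to-systolic-peak amplitudes over the first n_beats PPG beats."""
    amplitudes = ppg_norm[peak_idx[:n_beats]] - ppg_norm[onset_idx[:n_beats]]
    return float(np.sum(amplitudes))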

5.3.3.2 Extraction of respiration induced amplitude variation (RIAV) and respiration rate from the PPG signal
The PPG waveform clearly demonstrates that the amplitude of the obtained signal changes with respiration, a phenomenon that has also been reported in the literature [5.20]. Studies have identified three different types of respiratory-induced PPG signal variation (amplitude, frequency, and intensity). Respiratory-induced amplitude variation (RIAV) is the type that is directly related to the respiration cycle; it is caused by a change in cardiac output resulting from decreased ventricular filling. To properly determine RIAV, it is important to accurately identify
the beginning and systolic peak points of the pulse as well as the accompanying events that were
previously discussed. This is accomplished by using the same PPG wave segment and plotting the
systolic peak positions to derive the respiratory induced amplitude variation (RIAV) signal. This
procedure is used to construct the respiratory-induced signal in every recording. A RIAV signal example
is shown in Figure 5.4(a).
The respiration rate is then calculated using the Fast Fourier Transform (FFT) on the RIAV signal, as
illustrated in a typical plot of Figure 5.4 (b). It is now known that in stressful conditions, the breathing
time interval varies significantly more quickly than it does in a normal state [5.21]. As a result, in the
present study, the breathing rate is considered the second feature (F2).
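A minimal Python sketch of this step is shown below, again for illustration only. It assumes the beat-wise RIAV values and the systolic-peak times are already available; the uniform 4 Hz resampling of the beat series and the 0.1-0.5 Hz search band are simplifying assumptions rather than details taken from this chapter.

import numpy as np

def respiration_rate_from_riav(riav_values, peak_times):
    """Estimate the respiration rate (breaths/min) from the RIAV series via FFT."""
    fs_riav = 4.0  # assumed resampling rate for the beat-wise series
    t_uniform = np.arange(peak_times[0], peak_times[-1], 1.0 / fs_riav)
    riav = np.interp(t_uniform, peak_times, riav_values)  # uniform-rate RIAV signal
    riav = riav - np.mean(riav)

    spectrum = np.abs(np.fft.rfft(riav))
    freqs = np.fft.rfftfreq(len(riav), d=1.0 / fs_riav)

    # Pick the dominant spectral peak inside a plausible respiration band.
    band = (freqs >= 0.1) & (freqs <= 0.5)
    f_resp = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * f_resp  # feature F2, in breaths per minute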

5.3.4. Classification
The overall ability of the extracted features to discriminate between mental stress conditions and relaxed
conditions is then assessed using standard classification approaches based on the extracted feature values
and a simple threshold rule-based classification. Because the feature threshold values for the chosen
classifier are deduced from the feature boundary values, the feature space is appropriately partitioned. It
should be mentioned that in order to reduce the impact of classifier biasing, the classification technique
employs a fourfold cross validation technique, in which the entire data sets are divided into four equal
parts. Following that, the classifier is trained on three of these four sets, saving the fourth set for testing.
By repeating this procedure for every possible combination of data sets, each dataset is used in both the
training and testing stages. A fourfold cross validation technique is used to compute the performance, and
other popular classification techniques such as Logistic Regression (LR), Linear Discriminant (LD),
Support Vector Machine (SVM), and k Nearest Neighbour (kNN) approaches are used to evaluate the
efficacy of the selected features. The standard statistical parameters Accuracy, Sensitivity and Specificity are used to evaluate the overall classification performance of all the classifiers listed above, as
indicated in [5.22].
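As an illustrative analogue of this evaluation scheme (the chapter itself uses MATLAB), the scikit-learn sketch below runs fourfold cross validation for the four conventional classifiers; X denotes a hypothetical N x 2 matrix holding features F1 and F2, and y the stressed/relaxed labels.

import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

def evaluate_classifiers(X, y):
    """Mean fourfold cross-validated accuracy for LR, LD, SVM and kNN."""
    cv = KFold(n_splits=4, shuffle=True, random_state=0)
    models = {
        "LR": LogisticRegression(),
        "LD": LinearDiscriminantAnalysis(),
        "SVM": SVC(kernel="linear"),
        "kNN": KNeighborsClassifier(n_neighbors=5),
    }
    return {name: cross_val_score(model, X, y, cv=cv).mean()
            for name, model in models.items()}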
5.4 Results and Validation
A stressed condition elicits substantial arousal, whereas a relaxed state does not. Figure 5.5 displays a box plot that illustrates the overall discriminating effectiveness of the feature values.
The box-plot of the derived feature clusters supports the predictions of high arousal (stressed) and low
arousal (relaxed) emotional states made by various emotional models [5.23]. In addition, the overall
feature distribution and the non-overlapping feature values shown in box plot of Figure 5.5 indicate that,
in addition to other conventional classifiers, a simple threshold rule-based classifier is adequate for
correctly predicting the two classes. Table 5.1 presents the overall efficacy of the chosen classification
method for two distinct classes, exhibiting 98.43% accuracy, 96.96% sensitivity, and 100% specificity on
average. Table 5.2 displays the total performance of the selected features after they are further evaluated
using traditional classifiers. Five standard characteristics are used to assess the performance of the
classifiers: Accuracy, Sensitivity, Specificity, Precision and F1 score. Figure 5.6 displays the various
values of the aforementioned parameters that were discovered throughout the categorization phases.
Table 5.3 also provides the confusion matrix for each classifier's classification stage. The high-value of
classifiers' evaluation parameters validates the recommended method's applicability and usefulness.
Furthermore, ROC curves in Figure 5.7(a) and Figure 5.7(b) demonstrate the superior performance of the
threshold-based classifier in comparison to other conventional classifiers. The usage of the suggested
features is further supported by the high area under the curve (AUC) values that were obtained for each
classifier.

5.5 Performance Comparison


Two basic features—one computed from the PPG signal and the other from the RIAV signal, which is
derived from the PPG signal—are used in the proposed methodology. These easy-to-implement features
enhance the effectiveness of the suggested approach and expand its applicability to practical systems. To
prove its superiority, the suggested strategy's effectiveness is contrasted with other state-of-the-art
methods, as illustrated in Table 5.4. The findings displayed in Table 5.4 demonstrate that the suggested
algorithm genuinely outperforms all of the previously published literature, even with modifications in the
database, sensor kinds, methodological approach, feature dimension, and classification tool. The present
algorithm's potential for implementation in real-world scenarios is further demonstrated by its superior
performance in comparison to existing threshold-based categorization methods.

5.6 Discussion
The PPG signal's multimodal characterization in this chapter makes it easy, automatic, and reliable to
identify mental stress. The key contributions of the proposed technique are enumerated below:
1) Unlike the multi-lead, complex EEG signal, the PPG signal is obtained with a single sensor.
2) Two basic features are constructed using the PPG signal and the respiration signal computed from the
same PPG signal.
3) Does not necessitate complex feature selection or ranking processes.
4) The use of linear classifiers with reasonable accuracy ensures the possibility of implementing AI-based
healthcare monitoring in real-world applications.
The multimodal evaluation of a single channel PPG sensor—which is patient-friendly, simple to acquire,
operator-independent, and contains crucial clinical information—represents the novelty of the chapter. To
the best of our knowledge, the use of two modalities of a single signal for stress detection has not been studied prior to this approach. This technique has the added benefit of making the system
portable and user-friendly while also lowering the overall acquisition cost and complexity. The feature
values are computed in the present chapter using only twenty PPG beats of each record.
The computing cost might go up with more PPG beats, but fewer PPG beats might not be able to
adequately represent the changes brought on by stress. The signal length and the analysis that follows will
need to be thoroughly examined against other databases with different signal durations in the future for
increased classification accuracy.
Among the several complex classifiers, only simple linear classifiers are selected for accurate
classification of the stressed and relaxed states. Nevertheless, additional optimization and assessment of
the chosen classifier across a larger dataset are required to improve overall performance. Furthermore, in
the future, the robustness of the features will be evaluated using a few advanced classifiers. The developed algorithm is implemented and assessed in MATLAB; no hardware platform is utilized in this study. The exceptional performance of the adopted algorithm validates its appropriateness
for a wide range of real-world applications in the healthcare and associated fields, as well as for state-of-
the-art AI-based personal assistive gadgets.

5.7 Conclusion
This study presents a single multimodal characterization of the PPG signal as the basis for a robust,
automated, and easily implementable algorithm to identify the mental stress condition. The classification
result indicates that there is a notable variation in the PPG signal's distinguishing features when mental
stress is present. This suggests that the results not only validate the importance of the PPG signal but also
offer strong proof that the PPG signal can be a useful tool for determining a subject's mental stress level.
Due to its high average classification accuracy, sensitivity, and specificity, the suggested algorithm is also
capable of measuring mental stress in people from different populations. Due to its computational
simplicity, the method may be constructed on a hardware system and utilized as a smart, portable, stand-
alone application based on artificial intelligence for the study and detection of mental stress in remote
rural locations.
• This chapter describes the utility of the EEG signal for detection of eye ball movements in four directions along with eye open and eye closed conditions.
• Instead of using an additional EOG electrode, EEG electrodes are used to record the eye movement signal embedded in the EEG signal.
• The EEG data were recorded in the laboratory from various subjects, and the developed algorithm is tested on these data for classification of eye ball movements.

6.1 Overview
In order to help the elderly and disabled people, modern human-machine interfaces (HMIs) use a variety
of human expressions. Depending on the kind of handicap, eye movements are frequently determined to
be the most effective means of communication for transmitting expressions. These days, eye movement
detection is done using Electroencephalogram (EEG) based setups, which are used to investigate
neurological conditions. However, most state-of-the-art EEG-based studies either use a larger feature dimension with restricted classification accuracy or detect eye movements in fewer directions.
This chapter elaborates a robust, straightforward, and automated technique that classifies six distinct
types of eye movements using the analysis of the EEG signal. To remove a variety of noise and artifacts,
the program applies discrete wavelet transform (DWT) to the EEG obtained from six distinct leads. Next,
a binary feature map is created by extracting two features per lead from the reconstructed wavelet
coefficients and combining them. Finally, six different types of eye movements are classified using a
threshold-based method based on an unique feature derived from the binary map's computed weighted
sum. With only one feature value, the algorithm has high average accuracy (Acc), sensitivity (Se), and
specificity (Sp) of 95.65%, 95.63%, and 95.63%, respectively. The suggested approach has enormous
potential for implementation in personal assistive devices, as demonstrated by the results achieved from
the adoption of simple methodology and comparison to other state-of-the-art methods.

6.2 Background and Contribution


The aging of the population and the devastating effects of many chronic illnesses are currently the main
factors contributing to the rise in disability in our society [6.1]. Technology intervention in the healthcare
sector has now become necessary to battle the results on a daily basis [6.2]. Over two billion individuals
worldwide will require at least one assistive technology product by 2030, according to recent World
Health Organization (WHO) reports [6.3]. As a result, the development of various human-machine
interfaces (HMIs) has received a great deal of scholarly interest in recent years [6.4]. In order to help
patients with neurological or physical disorders, modern HMIs can now process human-defined
instructions that are communicated via standard bio-signals and other human reactions [6.5]. However,
the most practical method of interacting with HMIs for treating illnesses associated with acutely impaired muscle movement is to analyze the patient's eye movements [6.6]. The examination of eye movement has
gathered much interest recently due to its potential to express neurological and psychological intents
while communicating with the outside environment, in addition to its significant implications in the
management of disabilities [6.7]. The gold standard method for locating eye movement is often
Electrooculography (EOG) signal analysis, which is derived from the eye's current corneal–retinal
potential differential [6.8]. Even though EOG is a relatively simple technology to use, there are still
challenging limitations associated with it. These include the placement of the electrodes in close
proximity to the eyes, the artifacts caused by facial expressions, and the overall arrangement that reduces
field of vision during EOG acquisition [6.9].
To overcome the shortcomings related to the EOG, different characteristics of the
Electroencephalography (EEG) signals are now being adopted by the researchers as an alternative for the
detection of eye-movement. EEG is a non-invasive technique of measuring the electrical response of the
brain, obtained by placing electrodes on the exterior of the scalp [6.10]. Compared to the existing EOG
set-ups, recent state-of-the-art EEG devices also offer wireless acquisition of the signal without
compromising the patient comfort [6.11]. Normally, EEG signal is used to explore a wide variety of
neurological and psychological activities of the brain. Moreover, EEG electrodes placed on the frontal
lobe of the scalp usually pick up information related to eye-movements, which appears in the form of
EOG artifacts [6.12]. The presence of EOG artifacts in the EEG signal makes it a potential tool for the
analysis of eye-movement. There are now only few specialized techniques that identify eye movements in
different directions using the embedded information of the EEG signal. The technique described in [6.13]
classifies eye movement in only four cardinal directions by utilizing EOG artifacts in the EEG input.
With a restricted average accuracy of 50–85%, the technique uses the area under the curve of the
denoised EEG data as a feature to categorize the eye movements using a threshold-based hierarchical
classification method. In [6.14], an automated system is suggested to enable the real-time analysis of six
kinds of eye movements using the EEG input. Nevertheless, the technique employs a threshold-based
linear classifier with an average accuracy of 85.2% to detect eye movements using four distinct features
extracted from the processed EEG signal. In [6.15], a reliable and instantaneous method of controlling
eye movements in video games with EEG is suggested. The method follows the feature extraction and
classification techniques mentioned in [6.14] and makes use of the same acquisition tool. Lastly, the
algorithm's evaluation with just five participants yields an average accuracy of 80.2%. With an accuracy
of 88.2%, EEG-based automated identification of random changes in the eye state is studied in [6.16] and
[6.17]. The EEG signal can be used as a possible tool for eye movement analysis because it contains EOG
artifacts [6.18].
It appears that most of the aforementioned techniques either use a larger feature dimension [6.14], [6.15] with restricted classification accuracy or detect eye movements in fewer directions
[6.13], [6.16]. In this chapter, a straightforward, reliable EEG-based algorithm is offered as a potential
substitute for automated, highly accurate identification of six distinct classes of eye movements. In order
to remove a variety of noise and artifacts, the method first applies discrete wavelet transformation (DWT)
of the collected EEG data, which is received from six separate leads. Then, using the particular wavelet
coefficients, two straightforward and reliable features per lead are retrieved in relation to the eye motions.
One binary feature map is created from the acquired features. Ultimately, a straightforward threshold-
based method is employed to identify six distinct types of eye movements using a weighted sum of the
binary feature map, which is generated as a single feature. In essence, decision boundary based
classification techniques are not needed for this strategy. Instead, just a single feature value is needed to
identify eye movements in various directions.

6.3 Methodology
Figure 6.1 provides an overview of the methodology used in the suggested EEG-based eye-movement
analysis method. The algorithm consists of four essential components: 1) pre-processing the recorded
EEG data using wavelet transformation to remove various noise components and artifacts; 2) extracting features linked to eye movements from the wavelet coefficients; 3) binarizing the acquired feature set and producing a binary feature map; and 4) calculating the binary feature map's weighted sum and classifying the eye movements using thresholds.

6.3.1 Acquisition of the EEG signal


In the current study, an EEG cap is applied to each subject's scalp to facilitate non-invasive EEG signal
acquisition while recording eye movements in various directions in the Biomedical Laboratory at the
Department of Applied Physics, University of Calcutta. The worldwide protocol of the 10–20 electrode
system is adhered to in maintaining the locations of the EEG leads on the cap. The BIOPAC MP 150 data
acquisition device, which has sixteen independent programmable leads, is used to acquire EEG signals
throughout the entire process [6.17]. Only six eye-specific leads have been selected to record the EEG
signals at a sampling frequency of 1 kHz because the goal of this investigation is to investigate eye
movement. Leads 1, 2, 3, 4, 5, and 6 are designated for the selected leads, "Fp1-Fp2," "F3-F4," "C3-C4,"
"P3-P4," "F7-F8," and "T3-T4," which are symmetrically positioned on both sides of the line connecting
the "Nasion" and the "Inion" across the scalp. Twenty healthy participants, ages ranging from 22 to 35,
participated in this study (15 men and 5 women). Four of the twenty subjects had corrected-to-normal
vision, whereas the other sixteen had normal vision. Figure 6.2 displays a sample EEG signal segment
that was taken from a subject for six different leads.

Prior to the commencement of the recording, all volunteers received an explanation of the complete
experimental method and were instructed to remain in a relaxed or idle state, with their eyes closed, for
around two minutes. Following the period of rest, the EEG signal is recorded for a further two minutes
while the subject's eyes are closed. Next, without moving their heads, the participants were instructed to open their eyes and focus on a designated spot on the wall at the same horizontal level as their eyes. The EEG signal is recorded for two minutes. After that, the participants were instructed to concentrate on two further locations at the same horizontal level, angled roughly 30 degrees to the left and to the right of the central (first) mark, respectively. A two-minute recording of the EEG signal is made for each of the aforementioned scenarios. Similar recordings were also made for marks approximately thirty degrees above and below the central mark. For every participant, the complete EEG data collection process is repeated with varying eye directions, as shown in Figure 6.3. In accordance with institutional and international practice, each subject signed an informed consent form prior to the experiment.

6.3.2 Pre-processing of the acquired EEG data


Table 6.1 illustrates the five primary sub-bands of distinct frequency ranges that make up the EEG signal:
delta, theta, alpha, beta, and gamma. In addition to the aforementioned five frequency bands, power-line,
baseline, and eye blink artifacts are some noisy frequency components that frequently contaminate the
obtained EEG signal. To improve the quality of the EEG signal analysis and the extraction of the relevant
sub-bands, these extraneous noise components must be removed.

6.3.2.1 Removal of the power-line artifact


One of the main causes of noise in the EEG signal has been power line interferences with a 50 Hz
frequency component. In this chapter, power line interference artifacts are removed from the EEG signal
obtained from all leads using the moving average filtering technique. Since the recorded EEG signals were sampled at fs = 1 kHz and the power line frequency is f = 50 Hz, the required window size follows from Eq. (6.1) as N = fs/f = 1000/50 = 20 samples. Accordingly, a twenty-point moving average is applied, and the result for a sample EEG signal is shown in Figure 6.4.
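A minimal Python sketch of this filtering step is given below; averaging over fs/f samples places spectral nulls at 50 Hz and its harmonics, which is what suppresses the power-line component.

import numpy as np

def remove_powerline(eeg, fs=1000, f_line=50):
    """Suppress power-line interference with an (fs / f_line)-point moving average."""
    window = int(round(fs / f_line))              # 20 samples for 1 kHz and 50 Hz
    kernel = np.ones(window) / window
    return np.convolve(eeg, kernel, mode="same")  # filtered EEG of the same length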

6.3.2.2 Removal of high frequency components


A method based on discrete wavelet transform (DWT) is used to treat the EEG signal further. DWT is
often considered the most effective method for breaking down non-stationary signals into successive
frequency sub-bands, which makes analysis of the signals easier. A single-stage DWT uses a low pass
and a high pass filtering technique to split the whole signal frequency range into two distinct frequency
bands. The "Approximation coefficient" (cA) is the signal component obtained after low-pass filtering,
and the "Detail coefficient" (cD) is the signal component obtained after high-pass filtering. However, the
right number of decomposition stages and the right choice of mother wavelet determine how effective the
chosen DWT approach is. It is found that the EEG signal's inherent variation is comparable to the
Daubechies wavelet family "db4" [6.19]. Therefore, "db4" is selected as the "mother wavelet" in this
chapter in order to perform DWT analysis on the EEG signal. Ten-level decomposition of the EEG signal
is performed in this study to extract the necessary sub-bands from the original signal.
According to the Nyquist requirements, the maximum frequency component in the recorded EEG signal
is 500 Hz, as the signal is sampled at 1 kHz. The detail coefficient (cD1) and approximation coefficient (cA1) that are produced by a single-stage DWT have frequency components ranging from 250 to 500 Hz and 0 to 250 Hz, respectively. In the second step, cA1 is further broken down into cA2 and cD2, which have respective frequency ranges of 0–125 Hz and 125–250 Hz. After additional decomposition, the
coefficient cA2 yields cD3 (62.5–125 Hz) and cA3 (0–62.5 Hz). Following each of these phases, it is
discovered that some detail coefficients have frequency components that fall outside of the EEG sub-band
ranges. In the current study, these components are then removed as high-frequency noise components.
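For illustration, a PyWavelets sketch of this ten-level "db4" decomposition is given below; the band labels follow the coefficient-to-band mapping described in the subsection on coefficient selection below, and the dyadic band edges are only approximations of the nominal EEG sub-band limits.

import pywt

def decompose_eeg(eeg, wavelet="db4", level=10):
    """Ten-level DWT of a 1 kHz EEG record; keeps only the band-related coefficients."""
    coeffs = pywt.wavedec(eeg, wavelet, level=level)   # [cA10, cD10, cD9, ..., cD1]
    cA10, cD10, cD9, cD8, cD7, cD6, cD5, cD4 = coeffs[:8]
    bands = {
        "delta": [cD10, cD9, cD8],  # roughly 0.5-4 Hz
        "theta": cD7,               # roughly 4-8 Hz
        "alpha": cD6,               # roughly 8-16 Hz
        "beta": cD5,                # roughly 16-31 Hz
        "gamma": cD4,               # roughly 31-62 Hz
    }
    return bands  # cA10 (baseline) and cD1-cD3 (high-frequency noise) are discarded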

6.3.2.3 Removal of baseline wandering artifact


The baseline wandering frequency components are typically found to be lower than 0.5 Hz. As a result,
the approximation coefficient found at DWT's tenth level (cA10) is removed and regarded as the baseline
component. Figure 6.5 shows a representative EEG "delta" wave with and without the baseline artifact.

6.3.2.4 Selection of the required DWT coefficients


Only those coefficients that have frequency ranges that match the theoretical values of the EEG frequency
band are selected for additional investigation after artifacts have been removed. More precisely, it is
discovered that the coefficients cD10, cD9, and cD8 correspond to "Delta" waves, cD7 to "Theta" waves,
cD6 to "Alpha" waves, cD5 to "Beta" waves, and cD4 to "Gamma" waves, respectively, following ten-
level decomposition. The remaining coefficients are eliminated as high-frequency and low-frequency
noise components, and these seven coefficients are selected for additional examination in the current
study. The overall process of DWT decomposition, which includes choosing the appropriate wavelet
coefficients that match with the EEG sub-bands and eliminating undesired noise components, is displayed
in the block diagram of Figure 6.6.

6.3.2.5 Removal of Eye Blink Artifact


Eye blink artifacts appear in the EEG signal as low-frequency, high-amplitude deflections. There are many eye blink instants in the signal recorded from each of the EEG leads. All of these eye blink segments are labeled for visual identification after being found using a slope
detection technique. The slope detection technique works by figuring out how quickly the signal
amplitude changes at each instant. The eye blink instants are now defined as the areas when the slope
values abruptly change above a predetermined threshold. Every record under consideration goes through
this procedure of unique identification. An identified eye blink artifact taken from a specific record is
displayed in Figure 6.7.

The minimal duration of an eye blink signal is discovered to be close to one second once the full database
has been examined. After that, a sliding window is selected appropriately to get rid of the eye blink
artifacts. Now, each time the window moves from the eye blink instant's beginning, the average value of
the signal is substituted by computing the signal's value one second before and one second after the
specified sample instant. The data substitution procedure is executed for the duration of the eye blink.
The identical process is then repeated by moving the sliding window from the ending instant of the
current eye blink to the beginning instant of the subsequent eye blink. Figure 6.8 displays different EEG
signal bands before and after the eye blink artifact was eliminated. Feature extraction is then performed
on the obtained clean EEG signal.
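A rough Python sketch of this slope-and-substitution scheme is given below for illustration; the slope threshold, the assumed one-second blink duration and the exact substitution rule are simplifying assumptions rather than the precise implementation.

import numpy as np

def remove_eye_blinks(x, fs=1000, slope_thresh=None, blink_len_s=1.0):
    """Detect blink onsets from abrupt slope changes and patch them by averaging."""
    slope = np.abs(np.diff(x))
    if slope_thresh is None:
        slope_thresh = 5.0 * np.std(slope)   # empirical threshold (assumed)
    onsets = np.where(slope > slope_thresh)[0]
    half = int(blink_len_s * fs)             # samples corresponding to ~1 s
    y = x.copy()
    i = 0
    while i < len(onsets):
        start = onsets[i]
        stop = min(start + half, len(x))
        for n in range(start, stop):
            before = x[max(n - half, 0)]
            after = x[min(n + half, len(x) - 1)]
            y[n] = 0.5 * (before + after)    # average of samples 1 s before and after
        # skip any detections that fall inside the region just patched
        while i < len(onsets) and onsets[i] < stop:
            i += 1
    return y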

6.3.3 Feature Extraction


To extract the eye movement information, denoising is followed by a power analysis of the wavelet
coefficients corresponding to each of the bands. For this purpose, the signal power over the full coefficient array of a given band is calculated using a sliding window of two seconds. Following a sequential shift of the
window from the beginning to the end of the data, a power value is calculated for each window. Lastly,
the total power related to a particular band is represented by the sum of the power values, which are
determined by Eq. 6.2. As shown in Figure 6.9, a comparable power array is produced for every band,
which corresponds to every lead and every location of the eye.
where the window index runs from 1 to n − 2, n is the length of the coefficient array, α denotes the power of the alpha band, and i denotes the lead number.
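As an illustration of the computation described by Eq. (6.2), the sketch below slides a window along the wavelet coefficients of one band, computes the power of each window position, and sums the window powers; the window length in coefficient samples is an assumption, since it depends on the decomposition level of the band.

import numpy as np

def band_power(coeff, window_len, step=1):
    """Total band power: sum of per-window powers over a sliding window."""
    powers = []
    for start in range(0, len(coeff) - window_len + 1, step):
        w = coeff[start:start + window_len]
        powers.append(np.sum(np.square(w)))  # power of this window position
    return float(np.sum(powers))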

These computed power levels are used to extract two features: the weighted average power (WAP) and
the absolute power factor (APF). APF is defined as the power of a certain band divided by the overall
power of all the bands for a given lead. The whole power of all the leads divided by the total number of
leads is known as WAP for a particular EEG band. The following formulas can be used to calculate APF
for a given EEG band (δ) and WAP for a given band alpha (α) of a specific lead.
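From the verbal definitions above, Eqs. (6.3) and (6.4) presumably take the form (with i indexing the lead)

APF_δ(i) = P_δ(i) / ( P_δ(i) + P_θ(i) + P_α(i) + P_β(i) + P_γ(i) )        (6.3)

WAP_α = ( P_α(1) + P_α(2) + ... + P_α(n) ) / n        (6.4)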

where n represents the number of channels (leads) and P_δ, P_θ, P_α, P_β and P_γ denote the power of the delta, theta, alpha, beta and gamma bands, respectively, for a given channel.
For better representation, the computed features as suggested by the aforementioned Eqs. (6.3) and (6.4)
are displayed in Table 6.2 after being rounded to two decimal places.

6.3.3.1 'Binarization' of the parameters and creation of a binary feature map


Table 6.2 makes it clear that the computed features APF and WAP have dimensions of (6×5) and (1×5),
respectively, for each eye condition. Thus, a total of thirty-five feature values (6×5 + 1×5 = 30 + 5 = 35) are generated for each eye condition. A "binarization" technique is then used to build a
distinct binary feature map for the retrieved features rather than working with the decimal values of the
features.
In the instance of APF, an empirical threshold value is determined for every lead following an
examination of the values derived from every lead and every record under consideration. The
"binarization" procedure is then started by applying the lead-specific threshold values to each and every
value of the various APF readings. During the "binarization" process, a value is replaced with a "0" if it is
determined that a given feature value is not greater than the selected threshold, and with a "1" otherwise.
To binarize all of the APF values, the entire process is performed for every lead and every eye condition.
When it comes to WAP, the values that are collected from each band of every EEG record are used to
compute the empirical threshold values. The WAP feature values are then "binarized" using a procedure
similar to that of APF.
Due to the dimensional mismatch between WAP and APF, the binarized WAP row vector is now added
to the binarized APF matrix as a seventh-row vector, resulting in the formation of a single binary feature
map of dimension (7×5), as shown in Figure 6.10. For every record and eye condition, the entire
procedure is repeated. By reducing the operational complexity of managing several features, this process
of creating a single, unique binary feature map also decreases the computing load during the classification
phase. Table 6.3 shows the results of binarization for three example participants for each eye condition.

In Table 6.3, the remaining cells with "0" values are painted gray, while all of the cells with "1" values
are displayed in black. Table 6.3 clearly demonstrates that, regardless of the subjects,
the generated binary map displays a distinct discriminating combination for each eye position. The
outcome shows that, even without the use of classification logic, the produced binary feature map allows
for simple visual inspection-based discrimination of every eye condition. Currently, each element is
given a positional weight value equal to the associated cell number based on its position in the binary
feature map. Table 6.4 displays a representative unified binary feature map with all positional weight
values for a particular eye condition.

The binary feature map's content is then multiplied by the appropriate positional weight for each cell, and
the resulting total of these values yields a unique number. The Binary Weighted Feature Value (BWFV),
which is a distinct integer assigned to every eye position, is shown in Table 6.5. Finally, the distinct
values of BWFV are applied in this chapter to categorize the eye movements in various directions.

6.3.4 Classification of eye ball movements


It is clear that visual evaluation of the binary feature map itself allows for comprehensive identification of
the eye movement even before BWFV values are calculated. Nonetheless, BWFV values are calculated in
the current chapter for automated eye movement classification. As shown in Table 6.5, all six of the eye
conditions are now categorized based on the initial BWFV levels. Next, a set of five distinct threshold
values are derived for each eye condition using the mean border of the BWFV values. Following that, a
straightforward threshold-based classification technique is developed using these threshold values in
order to categorize each and every eye movement condition. It is important to note that no particular
training phase is needed for the categorization technique. Instead, only five threshold values are used for
the total assessment. The use of the single BWFV feature value thus eliminates the need for any complex classification algorithm. To our knowledge, the technique of employing a single feature value has not previously been used for the classification of eye movements.
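To make the procedure concrete, the Python sketch below computes the BWFV of a 7 x 5 binary feature map and assigns one of the six classes using the five thresholds. The positional weights (cell numbers 1 to 35) follow the description above, while the threshold values and class ordering passed in by the caller are hypothetical placeholders.

import numpy as np

def bwfv(binary_map):
    """Binary Weighted Feature Value of a 7x5 binary feature map."""
    weights = np.arange(1, binary_map.size + 1).reshape(binary_map.shape)
    return int(np.sum(binary_map * weights))

def classify_eye_movement(value, thresholds, labels):
    """Threshold-rule classification: 'thresholds' holds the five mean boundaries
    between the six BWFV clusters, 'labels' the six eye conditions ordered by BWFV."""
    return labels[int(np.searchsorted(sorted(thresholds), value))]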

6.3.5 Parameters used for evaluation


Three statistical parameters—specificity (Sp), detection accuracy (Acc), and sensitivity (Se)—are used to
quantify the effectiveness of the suggested approach. The following formula is used to compute each
parameter.

Additionally, Specificity (Sp) measures the percentage of accurately detected true negative cases, while
Sensitivity (Se) indicates a test's capacity to discover positive cases.
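In their standard form, with TP, TN, FP and FN denoting the numbers of true positives, true negatives, false positives and false negatives, these parameters are computed as

Acc = (TP + TN) / (TP + TN + FP + FN)
Se = TP / (TP + FN)
Sp = TN / (TN + FP)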
For every class of eye movements, the suggested BWFV feature values provide discriminating separation.
For the purpose of classifying the eye movements, a basic threshold-based binary classification has thus
been applied. To ensure thorough validation of the classification approach, a ten-fold cross validation
method is utilized taking into account the size of the adopted dataset. Initially, the features of twenty
subjects are separated into ten equal portions, with two subjects in each portion. For each iteration, one portion (two subjects) is held out for testing, and the classifier thresholds are derived from the remaining nine portions (eighteen subjects). After ten iterations of the procedure, the average result for each fold is
displayed in Figure 6.11 and listed in Table 6.6. With an average accuracy (Acc) of 95.65%, sensitivity
(Se) of 95.63% and specificity (Sp) of 95.63%, Table 6.6 presents the total results.

6.4 Performance Comparison


It should be noted that the analysis of the EEG signal has only been used in a small number of research
studies to address the issue of eye movement detection. In order to categorize the eye movements
utilizing a single value, a novel binarization and feature mapping property are employed. Significant
accuracy can be obtained by applying this unique value to a straightforward threshold-based classification
technique. To validate the usefulness of the suggested algorithm concerning its efficiency and
compatibility, the acquired outcomes are also assessed in comparison to other cutting-edge research.
However, there are imbalances that make it difficult to conduct a fair comparison with other studies due
to the use of various EEG acquisition setups, variations in the size of the acquired EEG dataset,
methodological differences, variations in the used feature dimension, and also the adopted classification
strategy. Table 6.7 presents a thorough evaluation of the suggested algorithm's performance in
comparison to other cutting-edge literature. It is evident from the entries included in Table 6.7 that
the suggested algorithm exhibits high efficiency in terms of accuracy and resilience when compared to
other relevant publications.

6.5 Discussion
Recent studies demonstrate that the fundamental characteristics derived from the eye movement sequence
can be employed to control the functionality of numerous assistive technologies. However, the analysis of
EEG signals is currently being investigated as an option to track the movement of the eye in different
directions due to the complexity involved in the classic EOG based approaches [6.13–6.15]. This suggests
that the eye movement can be detected using the same EEG setup that is often used to identify
neurological activity. Since it hasn't been thoroughly investigated yet, the detection of eye movement
using EEG-based methods is still seen as a promising field of study. To the best of our knowledge, there
are currently no online EEG signal databases that are annotated and related to the properties of eye
movements. Thus, in the current chapter, six electrodes are applied to each subject's scalp in order to
collect non-invasively EEG signal for various eye-movement conditions in the institutional laboratory.
Initially, the adopted methodology extracts two signal-power attributes from each lead using a wavelet-based approach. After that, a single binary feature map is created by combining these features. Lastly, a straightforward threshold-based classifier is employed to detect the eye movement based on a single distinct feature obtained from the binary feature map. All in all, a less complex approach is used at every stage of the algorithm, aside from the wavelet transform, to make it suitable
for use in the current assistive technology. The advantages of the suggested algorithm are outlined below:
1) Strict pre-processing of the obtained EEG signal is performed using a strong wavelet-based method.
Prior to further processing, the selected DWT-based method makes it easier to reduce a variety of
artifacts from the signal.
2) A straightforward, distinct slope-change and signal averaging method is employed to remove eye
blink-related artifacts from the obtained EEG signal.
3) A novel binarization and feature mapping technique reduces the entire feature dimension to one,
negating the need for the algorithm's feature selection mechanism.
4) The binary values in the binary feature map itself exhibit unique properties that correspond to each
condition of the eye. On their own, these unique combinations make it possible to distinguish between different
eye conditions through visual inspection. This indicates that the proposed study also provides a visual
categorization of eye conditions at an earlier stage of the process, before the automated classification of
the eye conditions using the distinct BWFV values.
5) The main benefit of the suggested approach is that it doesn't need a conventional, intricate classifier.
There is a significant amount of discrimination between the six different classes of eye motions in the
computed single BWFV feature values. Consequently, it is discovered that merely a basic, linear,
threshold-based classifier is adequate to accurately categorize the eye movement characteristics based on
the retrieved feature values. This eliminates the need for a training phase, significantly reducing
complexity.
In addition to its numerous uses in the fields of neurology and psychology, EEG signal utilization has
certain disadvantages [6.24]. Six separate leads are used in this investigation to collect EEG signal. It is
evident that the requirement for an EEG acquisition cap and the proper positioning of six separate EEG
electrodes around the scalp frequently cause discomfort for the patients. The degree of discomfort
experienced by individuals can represent significant obstacles to the straightforward collection of EEG
data, contingent on their physical and mental health.
In the current study, no real-time hardware platform is used; instead, all algorithm testing is carried out
and validated through software implementation. On a desktop computer, MATLAB is used to prepare the
assessment result. It takes over twenty seconds to categorize six different types of eye movements when
the complete algorithm is executed, beginning with wavelet transformation, denoising, wavelet
coefficient selection, feature extraction, binarization, and ultimately classification. Based on the average
execution time gathered from all of the EEG records, the aforementioned execution time was calculated.
The suggested method's promising performance clearly demonstrates its use in cutting-edge assistive
devices that cater to a wide range of populations with significant physiological barriers.

6.6 Conclusion
Eye movements can be utilized to communicate human expressions that can help with various types of
mental or physical disabilities. The current chapter proposes a reliable and precise algorithm that
distinguishes between the six various types of eye movements by analyzing the EEG signal. The
proposed methodology primarily uses a robust wavelet-based approach for feature extraction and signal
denoising. Using a special binarization method, the features extracted from the selected wavelet
coefficients are then combined to create a single binary feature map. A straightforward threshold-based
classification method is utilized to categorize six distinct eye movements using a discriminating feature
value obtained from the binary feature map. The experimental results acquired in this chapter show good
average detection accuracy and high execution speed when evaluated over several EEG signal records, in comparison to the related research.
The algorithm's promising outcome, high-speed execution, and simple methodology ensure its
compatibility with state-of-the-art assistive HMI devices. Future work will focus on integrating the
algorithm into real-time, multifunctional personal assistive devices that may wirelessly and portably
acquire EEG signals using fewer leads. Additionally, the algorithm will be updated to support additional eye actions, such as eye blinks, to make command entry easier.
• This chapter describes the utility of the EEG signal for detection of mental stress.
• Instead of using all the electrode positions of conventional EEG recordings, only four pairs of electrodes are chosen to reduce the computational burden.
• The statistical features obtained from the EEG signal are used instead of conventional time- or frequency-domain features.

7.1 Overview
Mental stress is a significant emotional state which has an adverse effect on humans due to modern
lifestyle and work pressure. With the development of artificial intelligence, stress recognition has
demonstrated numerous beneficial applications in people's lives. Since human mental stress can be
accurately reflected by Electroencephalogram (EEG) signals, stress recognition approaches based on
EEG signals have emerged as a key area of study in real-world and artificial intelligence applications.
However, the majority of currently used stress detection techniques perform poorly in recognition, which
hinders the advancement of these techniques in real-world settings. Moreover, the computational
complexity of the complex features and signal processing stages is a major challenge for the application
of EEG in this field. To alleviate this problem, a simple automated algorithm is proposed for automated
recognition of human mental stress based on two easy to compute features extracted from the EEG sub
bands generated using Discrete Wavelet Transform (DWT). The features are derived from the signal
energy of the different sub bands. SVM and k-NN classification techniques are adopted to classify
between the stressed and relaxed conditions based on the above features. The average values of the obtained accuracy, sensitivity and specificity are 98.73%, 98.63% and 98.82%, respectively. Results reveal that
the proposed technique can be implemented in monitoring systems for early detection of mental stress
which can be beneficial towards decreasing the number of suicidal attempts or prolonged mental diseases.

7.2 Background and Contribution


According to the contemporary concept of stress, it arises when demands placed on people do not
correspond with their resources or their needs. In the event that the task exceeds one's capacity and
available time, stress will be the outcome. On the other hand, some people will become stressed out by a
tedious or repetitive activity that does not make use of their prospective abilities and expertise [7.1].
Numerous studies show that stress and its aftereffects have become increasingly significant in
contemporary culture [7.2]. The American Psychological Association reports that 75% of people say they have experienced a stress-related symptom within a period of only one month. Additionally, 25% of respondents believe that a high stress level significantly affects their physical health. Comparably, a study carried out by the European Commission [7.3] shows that over 22% of EU workers believe their job stress puts their health or safety at risk. When including elderly individuals, positions with significant
personal and public risk, and unpredictable work schedules, this number rises significantly. Growing
levels of stress and anxiety, together with personal relapses in wellbeing, typically have a big impact on
society and the world economy. For instance, stress at work causes the UK to lose about 13 million
working days annually, costing the country £12 billion [7.4]. This is particularly true for jobs that require
a lot of physical and mental exertion, such as those in the transportation, medical, military, civil protection, and
office sectors [7.5]-[7.9]. According to a different survey, a sizable portion of respondents acknowledge
that they have never used a technique or activity to lessen and control their stress [7.10]. Thus, effective
stress management in the workplace is required, ideally with trustworthy techniques for automatic and
real-time monitoring of an individual's stress levels. Employers must therefore figure out how to ensure
the personal safety of their staff members while still keeping them engaged, healthy, and on the job. This
scenario is especially pertinent in light of the ongoing demographic shifts that have led to an increase in
the number of older people, which has increased the need for policies that support individuals in working
longer hours in a safe manner.
The main goal of this chapter is to use simple features to design, build, test, and assess an integrated, wearable EEG signal based approach for stress monitoring in the workplace and daily life.

7.3 Methodology
Figure 7.1 provides a block-level summary of the entire methodology used in the proposed EEG-based stress
detection algorithm. The algorithm consists of four essential components: 1) selection of EEG data from
the DEAP dataset, 2) pre-processing of the EEG records using a moving average filter and wavelet
transformation to remove various artifacts, 3) feature extraction, where two easy-to-compute features are
extracted from the wavelet coefficients, and 4) classification, where SVM and k-NN based classifiers are
used to distinguish between the stressed and relaxed conditions.

7.3.1 EEG Dataset


The EEG records utilized in this study to determine mental stress conditions were collected from the
DEAP database, which is accessible to the general public [7.11]. In this database, EEG and eight
peripheral bio-signals were gathered from 32 participants (17 male and 15 female) aged between 19 and
36. However, in the current study, only the EEG records are selected for analysis out of this large number
of signals. For each of the thirty-two subjects, 48 channels of data were originally recorded at 512 Hz.
The data were recorded at two different locations: participants 23–32 were recorded in Geneva, and
participants 1–22 were recorded in Twente. There are a few minor variations in the format because of a
different hardware revision; in particular, the two locations have distinct EEG channel orders. The data
were cleaned of EOG artifacts, down-sampled to 128 Hz, and averaged to the common reference. The
EEG channels were rearranged so that they all follow the previously mentioned Geneva order. After the
data were divided into 60-second trials, the pre-trial baseline of three seconds was removed.
7.3.2 EEG signal pre-processing
The selected database contains the EEG records of thirty-two channels for each subject. It is reported in
the literature that the Fp1, Fp2, F3 and F4 channels give the best results for the detection of mental stress
conditions [7.12], while the other channels do not perform satisfactorily. Hence, out of all the channels,
only these four are selected for the present analysis of mental stress. Table 7.1 lists the five primary
sub-bands of distinct frequency ranges that make up the EEG signal: delta, theta, alpha, beta and gamma.
In addition to these five frequency bands, the acquired EEG signal is frequently contaminated by other
noisy frequency components such as power-line, baseline and eye-blink artifacts. To isolate the
appropriate sub-bands and perform a better analysis of the EEG signal, these undesired noisy components
must be removed.

7.3.2.1 Power line artifact removal


Power line interference with a 50 Hz frequency component has been identified as a significant
contributor to EEG signal noise. The moving average filtering approach is used in this study to remove
power line interference from the recorded EEG signal; the detailed operation of the moving average filter
is available in [7.13]. The complete waveform is filtered using the moving average method. A
representative plot of the power-line-corrupted signal, the extracted power line artifact and the filtered
EEG signal is shown in Figures 7.2(a), 7.2(b) and 7.2(c), respectively.
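A minimal Python sketch of this step is given below. The 20 ms window length is an assumed illustrative value, not the setting used in the thesis, and the interference estimate is simply taken as the residual between the raw and smoothed signals.

import numpy as np

def moving_average_filter(x, fs, win_ms=20):
    # Window length in samples; 20 ms is an assumed value for illustration.
    n = max(1, int(round(fs * win_ms / 1000.0)))
    kernel = np.ones(n) / n
    smoothed = np.convolve(x, kernel, mode="same")   # low-pass smoothed EEG
    artifact = x - smoothed                          # residual dominated by the interference
    return smoothed, artifact

# Example usage on one EEG channel sampled at 128 Hz:
# clean, powerline = moving_average_filter(eeg[0, 0, :], fs=128)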

7.3.2.2 High frequency artifact removal


A method based on the discrete wavelet transform (DWT) is used to clean the EEG signal further. Owing to its ability
to break down non-stationary signals into successive frequency sub-bands, the DWT is widely considered one of the
most effective tools for non-stationary signal analysis. A single-stage DWT uses a low-pass and a high-pass filtering
operation to split the signal's frequency range into two distinct frequency bands. The "Approximation coefficient"
(cA) is the signal component obtained after low-pass filtering, and the "Detail coefficient" (cD) is the signal
component obtained after high-pass filtering. However, the effectiveness of the chosen DWT approach depends on the
appropriate number of decomposition levels and a suitable mother wavelet. It is reported in the literature that the EEG
signal's inherent variation is comparable to that of the Daubechies wavelet "db4" [7.12, 7.13]. Therefore, "db4" is
selected as the mother wavelet in this work to perform the DWT analysis of the EEG signal. A seven-level
decomposition of the EEG signal is performed to extract the necessary sub-bands from the original signal, since the
down-sampled frequency of the EEG records is 128 Hz and the lowest EEG sub-band (the Delta wave) spans 0.5–4 Hz.
According to the Nyquist criterion, the maximum frequency component in the recorded EEG signal is 64 Hz, as the
signal is sampled at 128 Hz. The detail coefficient (cD1) and approximation coefficient (cA1) are produced in the first
stage of the DWT, with frequency components ranging from 32–64 Hz and 0–32 Hz, respectively. In the second stage,
cA1 is further decomposed into cA2 and cD2, which cover the frequency ranges of 0–16 Hz and 16–32 Hz,
respectively. After additional decomposition, the coefficient cA2 yields cD3 (8–16 Hz) and cA3 (0–8 Hz), and so on.
Figure 7.3 depicts the decomposition of the EEG records into the subsequent coefficients using the DWT, along with
their frequency ranges. For the current analysis, the aim is to use the Beta, Alpha and Theta waves for feature
extraction. Following this goal, it is found that the detail coefficient of the first stage has frequency components that
lie outside the desired sub-band ranges. The current work therefore removes these frequency components as high-
frequency artifacts.
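The seven-level decomposition described above can be sketched with the PyWavelets library as follows; the variable eeg_channel is a placeholder name for one pre-processed EEG channel.

import numpy as np
import pywt

# Placeholder for one pre-processed 60-second EEG channel sampled at 128 Hz.
eeg_channel = np.random.default_rng(0).normal(size=128 * 60)

# Seven-level DWT with the 'db4' mother wavelet.
coeffs = pywt.wavedec(eeg_channel, wavelet="db4", level=7)
cA7, cD7, cD6, cD5, cD4, cD3, cD2, cD1 = coeffs
# Approximate frequency content at fs = 128 Hz:
#   cD1: 32-64 Hz  (discarded as high-frequency artifact)
#   cD2: 16-32 Hz (Beta), cD3: 8-16 Hz (Alpha), cD4: 4-8 Hz (Theta)
#   cD5, cD6, cD7: 2-4, 1-2, 0.5-1 Hz (Delta)
#   cA7: below 0.5 Hz  (discarded as baseline wander)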

7.3.2.3 Baseline artifact removal


The baseline-wandering frequency components are typically found below 0.5 Hz. As a result, the
approximation coefficient obtained at the seventh level of the DWT decomposition (cA7) is removed and
regarded as the baseline component. Figure 7.4(a) shows an example EEG "Delta" wave with baseline
artifact, while Figures 7.4(b) and 7.4(c) show the extracted baseline artifact and the filtered "Delta" wave,
respectively.

7.3.3 Feature Extraction


7.3.3.1 EEG wave reconstruction
After artifact elimination, only the coefficients whose frequency ranges match the theoretical EEG
frequency bands are retained for further examination. More specifically, following the seven-level
decomposition, the coefficients cD7, cD6 and cD5 correspond to the "Delta" wave, cD4 to the "Theta"
wave, cD3 to the "Alpha" wave and cD2 to the "Beta" wave. The remaining coefficients are discarded as
high-frequency and low-frequency noise components, and these six coefficients are selected for further
examination in the current study. A representative plot of the reconstructed EEG waves is shown in
Figure 7.5, where the individual waves are reconstructed from the selected detail coefficients.
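A sketch of this reconstruction step, continuing from the coefficient list coeffs obtained in the previous sketch, is shown below: zeroing every coefficient except those belonging to a band and inverting the transform yields that band's time-domain wave.

import numpy as np
import pywt

def reconstruct_band(coeffs, keep, wavelet="db4"):
    # coeffs is the list returned by pywt.wavedec: [cA7, cD7, cD6, cD5, cD4, cD3, cD2, cD1].
    # keep holds the list indices of the coefficients to retain for this band.
    masked = [c if i in keep else np.zeros_like(c) for i, c in enumerate(coeffs)]
    return pywt.waverec(masked, wavelet=wavelet)

delta = reconstruct_band(coeffs, keep={1, 2, 3})   # cD7 + cD6 + cD5
theta = reconstruct_band(coeffs, keep={4})         # cD4
alpha = reconstruct_band(coeffs, keep={5})         # cD3
beta  = reconstruct_band(coeffs, keep={6})         # cD2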

7.3.3.2 Feature selection


Two time-domain features are extracted for identifying the stressed conditions. Time-plane features are
chosen to reduce the computational complexity while keeping the system accurate. The features are
computed from the energy of the different EEG sub-bands. For this purpose, the "Delta", "Theta",
"Alpha" and "Beta" waves are reconstructed from the selected detail coefficients as described above.
Feature 1 (F1): The variance of the variance of the signal energy corresponding to the "Theta" wave is
taken as the first feature for the present analysis. A window of one second is considered and the signal
energy of the "Theta" wave is calculated. The window is shifted from the starting instant to the ending
instant of the data strip, which results in an array of signal energy values. The variance of the signal
energy within each window is evaluated and the corresponding values are stored in a separate array. The
overall variance of these variance values is taken as the first feature and termed the variance of the
variance of the signal energy. The mathematical expression of F1 is given in Eq. (7.1), where w is the
width of the selected window and θ denotes the "Theta" wave segment.
Feature 2 (F2): As mentioned earlier, a window of one second is selected for calculating the signal energy
of the "Alpha" and "Beta" waves. The ratio of the signal energy of the "Beta" wave to that of the "Alpha"
wave is determined for each window segment. The variance of this signal energy ratio is taken as the
second feature for the present study. The mathematical expression of F2 is given in Eq. (7.2).
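The following Python sketch illustrates one way to compute F1 and F2, under the assumptions that the per-sample energy is the squared amplitude and that non-overlapping one-second windows are used; it is an interpretation of the description above rather than a reproduction of Eqs. (7.1) and (7.2).

import numpy as np

def windowed(x, fs, win_s=1.0):
    # Split a signal into consecutive non-overlapping windows (assumed one second wide).
    n = int(fs * win_s)
    n_win = len(x) // n
    return x[: n_win * n].reshape(n_win, n)

def feature_f1(theta, fs=128):
    # F1: variance, taken over the windows, of the per-window variance of the
    # "Theta" wave sample energy (squared amplitude).
    energy = windowed(theta, fs) ** 2
    return energy.var(axis=1).var()

def feature_f2(beta, alpha, fs=128):
    # F2: variance, taken over the windows, of the "Beta"-to-"Alpha" energy ratio.
    beta_energy = (windowed(beta, fs) ** 2).sum(axis=1)
    alpha_energy = (windowed(alpha, fs) ** 2).sum(axis=1)
    return (beta_energy / alpha_energy).var()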

7.3.4 Classification
For each subject, there are four electrode positions, and for each electrode position there are two feature
values for the stressed and relaxed conditions. In total, there are sixteen different feature arrays across all
32 subjects, amounting to 512 feature values for the classification stage. The fundamental concept of the
classification stage is to create an optimal decision boundary that can identify and separate the two
classes based on the extracted features. The classification between stressed and relaxed conditions is
carried out using two standard classifiers. The literature shows the applicability and accuracy of SVM
[3.223-3.226, 7.14] and k-NN classifiers [3.225, 3.226, 7.15] for the analysis of biomedical signals. Hence,
for the present study, both SVM and k-NN classifiers are used to identify the stressed and relaxed
conditions. During classification, all the feature arrays are considered together and the entire dataset is
divided into five equal parts to ensure thorough validation of the classification approach. Four parts are
used to train the model while the fifth part is used to test it. This process is repeated five times to avoid
classifier bias and to obtain the best-fit result. These five iterations ensure that each part passes through
both the training and testing phases during classification. The final decision boundary is determined from
the average of the decision boundaries obtained in the five iterations.
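A compact sketch of this five-fold evaluation using scikit-learn is given below. The feature matrix X and label vector y are placeholders standing in for the 512 feature values described above, and mapping the "fine k-NN" setting to a single nearest neighbour is an assumption made for illustration.

import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Placeholder data: replace with the real F1/F2 feature matrix and its labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 2))          # rows: instances, columns: F1 and F2
y = np.repeat([0, 1], 128)             # 0 = relaxed, 1 = stressed

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, clf in [("linear SVM", SVC(kernel="linear")),
                  ("fine k-NN", KNeighborsClassifier(n_neighbors=1))]:
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: mean five-fold accuracy = {scores.mean():.4f}")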

7.4 Results and Validation


7.4.1 Feature analysis using box plot
The obtained values of both features are plotted together to understand the groupings corresponding to
the stressed and relaxed conditions. The box plot of the feature values is shown in Figure 7.6, where the
individual feature values are plotted for the four electrode positions under both the stressed and relaxed
conditions. The values of F1 and F2 show a lower grouping for the relaxed condition and a higher
grouping for the stressed condition. This signifies the change in signal energy in the EEG sub-bands of a
subject when he/she is subjected to mental stress. The signal energy of the "Beta" band increases under
stressed conditions for all subjects compared with its value in the relaxed state. On the other hand, the
signal energy of the "Alpha" band shows a significant decrease when the subject is exposed to stress. The
signal energy distribution of the "Theta" band also shows a marked change when the subjects are going
through stressed conditions. This change in signal energy between the relaxed and stressed conditions is
clearly visible in the box plot of the features. The groupings of the values support the choice of the
features for the identification of stressed conditions. There is a marked difference between the highest
values of F1 and F2 for the relaxed condition and the lowest values of F1 and F2 for the stressed
condition. This enables the use of simple standard classifiers for identifying the stressed conditions
instead of resorting to complex classifiers, which reduces the overall computation time considerably.

7.4.2 Feature based Classification


As mentioned earlier, SVM and k-NN classifiers are used to classify the stressed and relaxed conditions
separately. A linear SVM and a fine k-NN classifier are chosen for the present analysis. The confusion
matrices obtained from the outputs of the classifiers are shown in Figures 7.7(a) and 7.8(a). The receiver
operating characteristic (ROC) curves for both classifiers are computed to understand their performance;
the ROC plots are depicted in Figures 7.7(b) and 7.8(b). The high values of the AUC justify the
classification accuracy. For every class of mental condition, the suggested feature values offer a
discriminating distinction. Therefore, it can be concluded that the linear SVM and fine k-NN classifiers
are effective for the classification of stressed and relaxed conditions.

Moreover, three statistical measures, namely Accuracy (Acc), Sensitivity (Se) and Specificity (Sp), are
used to evaluate the algorithm's efficiency. Each measure is computed from the entries of the confusion
matrix: TP stands for the number of positive cases that are correctly identified as positive, TN for the
number of negative cases that are correctly identified as negative, FP for the number of negative cases
that are mistakenly identified as positive, and FN for the number of positive cases that are wrongly
identified as negative. Sensitivity (Se) indicates a test's capacity to discover positive cases, while
Specificity (Sp) measures the percentage of correctly detected true negative cases.
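For reference, these measures follow the standard confusion-matrix definitions:

Acc = (TP + TN) / (TP + TN + FP + FN)
Se = TP / (TP + FN)
Sp = TN / (TN + FP)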

The accuracy, sensitivity and specificity values obtained for the SVM classifier are 98.63%, 99.21% and
98.04%, respectively, while the k-NN classifier yields 98.82%, 98.04% and 99.60% for the same three
parameters. Averaged over the two classifiers, a high accuracy (Acc) of 98.73%, sensitivity (Se) of
98.63% and specificity (Sp) of 98.82% are obtained after the classification stage. The detailed
performance values for both classifiers are summarized in Table 7.2.

7.5 Performance Comparison


It should be noted that, to date, the issue of mental stress detection has been addressed in only a small
number of research studies based on the analysis of the EEG signal. In the literature, no benchmark is
available for supervised or unsupervised approaches to classifying mental stress conditions using EEG
signals, so the present work classifies the stressed condition with respect to the relaxed condition. The
methodology suggested in this work classifies the stressed conditions with ease, utilizing two easy-to-
compute features and a simple classification technique. To validate the usefulness of the suggested
algorithm in terms of its efficiency and compatibility, the obtained outcomes are also compared against
other state-of-the-art research. However, a fair comparison with other studies is difficult because of the
use of different EEG datasets, variations in the size of the acquired EEG data, methodological differences,
variations in the feature dimensions used, and differences in the adopted classification strategies. Table
7.3 presents a comprehensive evaluation of the suggested algorithm's performance in comparison with
other relevant state-of-the-art literature. It is evident from the entries in Table 7.3 that the suggested
algorithm exhibits high efficiency in terms of accuracy and robustness when compared with other
relevant publications.

7.6 Discussion
Given that the system's ability to detect stress is demonstrated and that there is a satisfactory correlation
with a simple feature extraction model, further work will focus on refining the feature extraction and
classification algorithms and conducting more experiments to test more precise stress models in practical
settings. Future efforts with the current algorithm will focus on identifying a person's emotional states in
addition to stress, including psychological issues, which are sadly becoming more common in today's
society. Furthermore, the current algorithm makes it simple to provide continuous monitoring of mental
excitation for individuals with mental illnesses. This work can be expanded to determine the state of an
individual experiencing mental shock, which could assist the doctor in taking appropriate action and
lowering the likelihood of a suicide attempt. It is also possible to determine the mental condition of a
person with hearing and speech impairments by using this algorithm. The algorithms will additionally be
translated into a suitable language to ensure their use on portable devices, providing a suitable mechanism
for the assessment of stress in real environments and applications. This will lead to a system that can
detect stress levels so that immediate action can be taken to prevent chronic stress and its health
consequences. Some of the major points regarding the proposed algorithm are summarized below:
1) For this study, only mental stress has been taken into account. Many additional emotions can be
included in the algorithm.
2) Given that mental stress is on the rise in today's society, the approach could be expanded to assess an
individual's mental stress levels on a continuous basis.
3) A robust wavelet-based method is used for thorough pre-processing of the obtained EEG data. Prior to
further processing, the selected DWT-based method makes it easy to remove a variety of artifacts from
the signal.
4) Only the EEG data from the DEAP dataset are considered for the present analysis. In future, some
more public datasets might be used to increase the validity and robustness of the algorithm.
5) All of the analysis for this study is done on an offline software platform. Further development of the
same algorithm could enable its use in Internet of Things (IoT) based systems, allowing direct monitoring
of the outcomes even from a remote place. The field of the Internet of Medical Things (IoMT) would
greatly benefit from this.
6) The classification process is carried out using only two simple-to-calculate features. In future, a few
more features might be developed to make the algorithm more robust.
7) Real-time EEG signals for particular mental stress stimuli can be captured in the lab, and the
algorithm's performance can then be examined.

7.7 Conclusion
Detection of stressed conditions can be utilized to understand the mental state of human beings, which
helps in various predictions and possible treatment in order to avoid negative consequences. The current
work proposes a reliable and precise algorithm that distinguishes between the stressed and relaxed
conditions by analyzing the EEG signal. The proposed methodology primarily uses a robust wavelet-based
approach for signal denoising and feature extraction. Subsequently, linear SVM and fine k-NN
classification techniques are employed to categorize the mental stress conditions with respect to the
relaxed condition using the discriminating values obtained from the feature matrix. The algorithm is
compatible with modern assistive HMI devices owing to its strong methodology, rapid execution and
promising results. Future work will focus on integrating the algorithm into multifunctional personal
assistive gadgets that are portable and can wirelessly acquire EEG signals using a reduced set of
electrodes. Additionally, the algorithm will be updated to support additional emotional states beyond
stress detection.

• This chapter concludes the thesis, highlighting the overall achievements and
shortcomings of the individual works performed.
• It also discusses the future perspectives of the work undertaken for the thesis.

The development of algorithms for the automated analysis of two significant bio-signals, PPG and EEG,
has been our focus. The algorithms are especially made to fit the needs of the present-day scenario, which
demands portable and low-cost medical diagnostic equipment. However, the construction of algorithmic
models is the exclusive focus of the current study, which has mostly been evaluated and implemented at
the software level. Our future research efforts will be concentrated on creating a working prototype of an
automated monitoring system that will allow for routine, reasonably priced diagnosis and automated
monitoring.
8.1 Conclusion
The main objective of the present thesis is to develop algorithms for the automated analysis of two of the
most significant bio-signals: the photoplethysmogram (PPG) and the electroencephalogram (EEG). The
analyses of the two signals are mainly focused on emotional state classification and mental stress
detection. To enable the automatic detection of primary emotions and mental stress conditions, the signals
are analyzed and features are extracted that are effective for these applications. The algorithms used for
the present analysis are designed to enable substantial reductions in computational time without
sacrificing signal quality. The algorithms are specifically made to be used with automated monitoring
devices that have limited processing power. By using such intelligent, portable monitoring devices, a
greater number of people can benefit from quick, inexpensive and simple diagnosis. The specific tasks
completed for the thesis are outlined here, emphasizing both their overall successes and shortcomings.

8.1.1 PPG Signal Analysis for Emotion Recognition & Classification


Because of its inexpensive and non-intrusive data collection, PPG is becoming more and more popular as
an alternative to ECG and EEG for a range of automated monitoring applications. The PPG signal carries
a substantial quantity of information on the functioning of the heart. Despite this potential, the PPG
signal's usefulness for emotion detection has not yet been thoroughly investigated. The goal of this part of
the thesis is to create an automated system that can satisfy the needs of personal health monitoring
devices by identifying primary emotions from PPG data. In addition to the benefits of employing the
easy-to-acquire PPG signal, the method merely makes use of a single computationally simple feature and
a straightforward decision-rule-based classification. In contrast to multi-lead EEG-based methods,
emotional state detection can be achieved with just a single-channel PPG. For automated monitoring
platforms, this method thus appears to be a more effective first step in the identification and
categorization of emotional states.
The effectiveness of our technique has been supported by a preliminary comparison with other relevant
reported literature. However, an ideal performance comparison would require testing the algorithms on
more subjects, which has not been possible in the current work. The method is limited to initial-level
diagnosis in monitoring systems where multi-lead EEG recording is not feasible.

8.1.2 PPG Signal Analysis for Mental Stress Detection


An attempt is made to identify mental stress conditions by the use of PPG signals instead of complex
EEG signals, so as to provide an alternative route to addressing stress-related disorders, which are
increasing alarmingly in modern society irrespective of age, work culture, economic background and food
habits. The developed algorithm uses a single-channel PPG signal and extracts the respiration signal from
it. Two computationally simple features are chosen, and a simple threshold-based classification rule is
adopted for classifying mental stress conditions with respect to relaxed conditions. In comparison with the
available methods of mental stress detection from EEG signals, the current approach performs better in
terms of detection accuracy and computational time, and can thus be applied in practical situations for
early detection of mental stress levels.
Although the developed algorithm is tested on PPG signals collected from a standard public database, it
needs to be further validated on signals from subjects of various age groups and different sections of
society. This algorithm gives an initial assessment of stress-related conditions, which can be helpful to the
general public for cost-effective and ready-to-use applications.

8.1.3 EEG Signal Analysis for Eye Ball Movement Detection


Eye ball movement is an important signal for applications such as brain-computer interfacing
and assistance for physically disabled persons. Eye ball movements provide an excellent source
of information reflecting the actions of the brain but, as found in the literature, have not been
exploited to that extent and remain an open research problem. A simple, noise-robust algorithm
is developed using a discrete-wavelet-based technique for artifact removal and EEG sub-band
selection. Two distinct features are computed from the signal power of the individual bands, and
a single binary feature map is created by combining these features. Lastly, a straightforward
threshold-based classifier is employed to detect the eye movement based on a single distinct
value obtained from the binary feature map. With the exception of the wavelet transform, the
entire algorithm is designed to be compatible with modern assistive devices by adhering to a
simpler methodology at every stage.

8.1.4 EEG Signal Analysis for Mental Stress Detection


As found in the literature, the EEG signal is normally considered for evaluating mental stress. The
existing methods typically use complex signal processing techniques along with high-end classifiers for
detecting mental stress, which places a heavy computational burden on the processing stage. Instead of
using all the conventional EEG leads, the present algorithm uses only four channels in order to reduce the
computational complexity. In addition to the benefit of employing only four channels, the method merely
makes use of two computationally straightforward statistical features and a simple classifier. When
statistical parameters are used as features instead of purely time-plane feature-based approaches, the
system is more resilient to noise and has the benefit of being easier to compute. Since a properly
annotated EEG database related to mental stress is not available for public use, the algorithm has been
tested on signals collected from the DEAP database only.
8.2 Future Scope
Researchers will gain important insights from the work done in this thesis as well as from the thorough
literature survey of the approaches that are already available. This will create opportunities for additional
research toward the creation of algorithms especially suited to automated monitoring platforms. As a
result, it will be easier to adapt clinically useful features to common health monitoring applications. This
might revolutionize the way ordinary healthcare tools are used and help them integrate into the general
mechanism as a means of establishing smart healthcare solutions.
The encouraging outcomes of the PPG-based emotion and stress detection methods create new
opportunities for enhancing PPG signal use in routine diagnostics. Their use in diagnosing various other
health problems can be expanded with a deeper understanding of the underlying physiology. However,
more thorough investigation is required to ascertain their impact and clinical implications.
The DWT is successfully used to extract the frequency bands from the EEG signal for the identification
of eye ball movements and mental stress conditions. However, various other aspects of a subject's
medical status can be inferred from the frequency spectra of the PPG and EEG data. To extract features
from the corresponding frequency components, an alternative method to the DWT may be employed.
Such a tool may also be used to diagnose different disorders and could allow for diagnostics with a much
lower memory demand and reduced computational power requirement.
Future efforts will incorporate more research to create a comprehensive, automated yet computationally
simple health monitoring system that can provide regular, reasonably priced, ready-made diagnosis and
monitoring. For the purpose of detecting all forms of health-related irregularities, including physical and
mental illnesses, automated health monitoring systems need to possess diagnostic intelligence. While the
primary emphasis of this thesis has been emotion and stress identification, the development of automated
methods based on the analysis of PPG and EEG data to identify other health-related problems will be an
upcoming area of focus.
The present thesis is confined to the evaluation of the developed algorithms on software platforms alone;
there is no hardware implementation. The future target is to create a comprehensive prototype health
monitoring device by extending the system through hardware interfaces. Tele-monitoring can also be
made possible by incorporating suitable data connection facilities. By enabling in-house patient
monitoring, the usage of such health monitoring devices might significantly boost healthcare services in a
developing country like India, where there is a severe scarcity of medical facilities to accommodate the
country's growing population.
