
To validate signal processing and classification methods for Brain-Computer Interfaces using Electroencephalography (EEG).

Research Report

For the course

“Neural Networks and Fuzzy Logic”

Under the Supervision of

Prof. Christopher Clement J

Professor, Dept of Chemistry


School of Advanced Sciences
Vellore Institute of Technology, Vellore

SUBMITTED BY:

Aditi Aravind (17BMD0050)

CERTIFICATE

This is to certify that the project work entitled “To validate signal processing and
classification methods for Brain Computer Interfaces using Electroencephalography
(EEG).” that is being submitted by the individual for “Neural Networks and Fuzzy Logic” is
a record of bonafide work done under my supervision. The contents of this Project work, in
full or in parts, have neither been taken from any other source nor have been submitted for
any other CAL course.

Place: Vellore
Date: 07 November 2019

Signature of Student:

Aditi Aravind (17BMD0050)

Signature of Faculty:

ACKNOWLEDGEMENTS

We take immense pleasure in thanking Dr. G. Viswanathan, our beloved Chancellor, VIT University, and the respected Dean, School of Electronics Engineering, for having permitted us to carry out the project.

We express gratitude to our guide, Prof. Christopher Clement J, for guidance and
suggestions that helped us to complete the project on time. Words are inadequate
to express our gratitude to the faculty and staff of the labs who encouraged and
supported us during the project. Finally, we would like to thank our ever-loving
parents for their blessings and our friends for their timely help and support.

Signature of Student

ABSTRACT

The human brain is a complex nonlinear system showing complicated emergent properties, including consciousness.

Understanding and interpreting 'brain waves' has been of interest to many since the start of the 21st century, and the concepts of Human-Computer Interfaces (HCI) and Brain-Computer Interfaces (BCI) have developed considerably over the years.

Biosignals are signals or time-series data recorded from the human body, containing information about physiological phenomena that reflect human health and wellbeing. Electroencephalography (EEG) is the recording of 'brain waves' from the scalp; using various transforms, these are sectioned into frequency bands with characteristic waveforms and properties. These properties are used to diagnose and classify disorders, stages in sleep cycles, movements and gestures, etc.

The interpretation of EEG to produce meaningful information is crucial to every Brain-Computer Interface, and many methods and learning algorithms have been used to extract as much information as possible.

In this paper, two methods will be tested and validated. These methods involve processing the acquired signal by filtering out noise and unnecessary sections of data, feature extraction, and the construction of the machine/deep learning pipeline.
The extracted features will be fed into various neural networks to determine the most viable for non-stationary and time-variant signals, like the EEG or the electrocorticogram (ECoG).

INTRODUCTION

2.1 Biosignals

Biosignals are signals or time-series data that can be continually measured from the human body. They contain information about physiological phenomena that reflect human health and well-being, and have become an important source of information for medical diagnosis, subsequent therapy, and passive health monitoring. They can be either electrical, such as the electroencephalogram (EEG), electrocardiogram (ECG), electrooculogram (EOG) and electromyogram (EMG), or non-electrical, such as breathing and movements. In this section, we discuss examples of biosignals that have been widely used in clinical applications.

2.2 Electroencephalography (EEG)

EEG is an electrical biosignal of brain activity recorded from electrodes placed on a subject's scalp. The signal reflects the potential differences resulting from the activity of neurons in the brain and is often contaminated by interference originating outside the brain, such as eye movements, muscular activity and cardiac activity. Hence, before any useful information can be gleaned from the signals, the EEG is first filtered and cleaned. Filtering is generally adaptive, as the noise permeating EEG signals affects almost all frequencies.

Table 1: Frequency Bands of EEG Signal (approximate boundaries; conventions vary slightly across sources)

Band     Frequency (Hz)   Typical association
Delta    0.5 - 4          Deep sleep
Theta    4 - 8            Drowsiness, light sleep
Alpha    8 - 13           Relaxed wakefulness, eyes closed
Beta     13 - 30          Active thinking, motor activity
Gamma    > 30             High-level cognitive processing

Figure 1: EEG Electrode Montage

Traditionally, EEG is decomposed into different frequency bands (e.g., using the short-time Fourier transform (STFT) or the discrete wavelet transform (DWT)) for analysing clinically relevant activity. The most common frequency bands are briefly summarized in Table 1 above. The information from these bands, or other features extracted from EEG, has been used to help diagnose and monitor many conditions affecting the brain, such as the recognition of epileptic seizures or of the different sleep stages.
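
As an illustration of such a band decomposition, the sketch below estimates per-band power with zero-phase band-pass filters in SciPy. The sampling rate, the synthetic signal, and the exact band edges are assumptions for demonstration only, not values taken from this report.

import numpy as np
from scipy.signal import butter, filtfilt

fs = 250.0                               # assumed sampling rate in Hz
eeg = np.random.randn(int(10 * fs))      # placeholder single-channel EEG, 10 s

bands = {'delta': (0.5, 4), 'theta': (4, 8), 'alpha': (8, 13),
         'beta': (13, 30), 'gamma': (30, 45)}

band_power = {}
for name, (lo, hi) in bands.items():
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype='band')
    filtered = filtfilt(b, a, eeg)             # zero-phase band-pass filter
    band_power[name] = np.mean(filtered ** 2)  # average power in the band
print(band_power)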

2.3 Supervised Learning

Suppose there are m training examples {(x(i), y(i))}, i = 1, …, m (i.e., a training set), where x(i) is the input variable, also called the input features, y(i) is the output or target variable that we are trying to predict, (x(i), y(i)) is a training example, and the superscript (i) is an index into the training set. The goal of the supervised learning algorithm is to learn a function f_θ : X → Y, where θ denotes the parameters of the function f, X is the space of input values, and Y is the space of output values. For instance, in sleep analysis, X could be the space of EEG signals and Y could be the set of sleep stages {W, N1, N2, N3, REM}, indicating one of the possible classes of the input EEG signal. The function f is considered a "good" predictor of the corresponding value of y when the disagreement between a predicted value ŷ(i) = f_θ(x(i)) and a target value y(i) is very low. Such disagreement can be quantified using a scalar-valued loss function (or cost function) L(ŷ, y). Thus, the objective in supervised learning is to find the parameters θ that minimize L(ŷ, y).

When the space of Y is a set of discrete values, such as sleep stages, we call such a learning problem a classification problem. When the space of Y is continuous, such as walking speed, we call it a regression problem.
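
As a concrete illustration of this setup, the following sketch trains a scikit-learn classifier on synthetic placeholder features and sleep-stage labels; the data, feature count, and model choice are illustrative assumptions, not taken from the report.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

m = 200                                   # number of training examples
X = np.random.randn(m, 8)                 # x(i): 8 features per EEG epoch
y = np.random.choice(['W', 'N1', 'N2', 'N3', 'REM'], size=m)  # y(i)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
f = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # learns f_theta
print('held-out accuracy:', f.score(X_test, y_test))         # low disagreement = good predictor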

2.4 Deep Learning


Deep learning, also called deep neural networks, is a branch of machine learning that utilizes multiple layers of linear and non-linear functions to transform inputs into representations that are useful for subsequent tasks such as classification and regression. It can be trained to approximate functions in both supervised and unsupervised settings. For example, in a classification problem, a deep learning model can be trained to approximate a mapping y = f_θ(x) by adjusting the parameters θ to minimize a given loss function.

Biological Inspiration:

Figure 2: Drawing of a Neuron

The neural network is originally inspired by the way biological nervous systems process information. Figure 2 illustrates a cartoon drawing of a biological neuron and the mathematical model of a single neuron (or hidden unit) in a neural network. Each neuron receives input signals (the input vector x) through its dendrites and produces output signals along its axon. The dendrites carry the signals to the cell body, where the strengths of the input signals from other neurons are controlled by synapses (the weights w). The cell body accumulates and sums the signals; if the sum is greater than a certain threshold, the neuron fires, sending a spike along its axon. The axon propagates the signal along its branches to the dendrites of other neurons. In the mathematical model, the synaptic strengths (the weights w) are learnable and control the strength of the influence of one neuron on another. If the activation function is a sigmoid nonlinearity f, which squashes its input to the range between 0 and 1, the output signal can be interpreted as the average firing rate of the neuron. This mathematical neuron is the basic unit of a neural network.
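
The following short sketch implements this mathematical neuron, output = sigmoid(w · x + b); the input, weight, and bias values are illustrative placeholders.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # squashes any real input to (0, 1)

x = np.array([0.5, -1.2, 3.0])   # input signals arriving at the dendrites
w = np.array([0.8, 0.1, -0.4])   # learnable synaptic weights
b = 0.2                          # bias (shifts the firing threshold)

firing_rate = sigmoid(np.dot(w, x) + b)  # interpretable as average firing rate
print(firing_rate)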

Concepts of Layers:

The neural network is built by connecting these neurons. The neurons are commonly organized in layers, because this allows the activations of all neurons in a layer to be computed efficiently with a single matrix multiplication. The neural network can then be viewed as a directed acyclic graph describing how a sequence of layers (i.e., functions) are combined to form the network.
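
A minimal sketch of this layer view, with arbitrary placeholder sizes: all the activations of one layer come from a single matrix multiplication, one weight row per neuron.

import numpy as np

n_in, n_out = 4, 3
x = np.random.randn(n_in)          # activations of the previous layer
W = np.random.randn(n_out, n_in)   # one weight row per neuron in this layer
b = np.zeros(n_out)

h = np.maximum(0, W @ x + b)       # all n_out neurons at once (ReLU activation)
print(h.shape)                     # (3,)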

Power of Depth:

Making a neural network model deeper is believed to enable the network to reuse features from the earlier layers (closer to the input) and to learn more abstract features in the later layers (closer to the output) [49]. For instance, in supervised learning, we can think of the last layer of the network as a learned classifier, such as a soft-max classifier, and of the remaining layers before it as providing representations to this classifier. Each layer learns representations that are useful for the task defined by the loss function, which provides gradients via backpropagation.
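
To make this layered picture concrete, here is a small numerical sketch (shapes and random weights are placeholders) in which an earlier layer builds a representation and the last layer acts as a soft-max classifier.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()

x = np.random.randn(8)                         # input features
W1, b1 = np.random.randn(16, 8), np.zeros(16)  # earlier layer: representation
W2, b2 = np.random.randn(4, 16), np.zeros(4)   # last layer: 4-class classifier

h = np.maximum(0, W1 @ x + b1)   # representation provided to the classifier
p = softmax(W2 @ h + b2)         # soft-max class probabilities
print(p, p.sum())                # probabilities summing to 1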

METHODOLOGY

3.1 BCI Competition Dataset

The BCI Competition IV 2a dataset was used to develop and test the BCI decoding algorithm. The dataset contains recordings from 9 healthy subjects performing 4 motor imagery (MI) tasks: left hand, right hand, feet, and tongue. The data of each subject consist of 2 sessions, one intended for training and the other for evaluation. Each session comprises 72 trials per MI task, 288 trials in total, recorded with 22 EEG channels and 3 monopolar electrooculogram (EOG) channels (with the left mastoid serving as reference). In our study, only data from the training session are used.

At the beginning of each trial (t = 0 s), a white cross on a black background appeared; after 2 s, an arrow pointing left, right, up, or down cued the subject to perform the corresponding MI task. The arrow was displayed for 1.25 s, and the subject was asked to keep performing the MI task until the white cross disappeared (t = 6 s).
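
The report does not state how the dataset was loaded. One possibility, sketched below purely as an assumption, is the MOABB package; class and dictionary key names differ across MOABB versions.

from moabb.datasets import BNCI2014001

dataset = BNCI2014001()                 # BCI Competition IV, dataset 2a
data = dataset.get_data(subjects=[1])   # dict: subject -> session -> run
# Session/run key names vary across MOABB versions (e.g. 'session_T'/'run_0');
# each leaf is an mne.io.Raw holding the 22 EEG + 3 EOG channels.
raw = data[1]['session_T']['run_0']
print(raw.info)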

3.2. Signal Pre-processing

Signal analysis was performed solely on the EEG electrodes; the EOG channels were excluded. An average reference was used, and the data were band-pass filtered at 7–15 Hz using a zero-phase FIR filter in order to capture the event-related desynchronization and synchronization (ERD/ERS) activity. Subsequently, the data were down-sampled to 100 Hz and epoched starting 500 ms after the visual cue, with an epoch duration of 3000 ms. The data were visually inspected for bad channels, but none were excluded. All pre-processing was performed using a custom Fieldtrip script.
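
The actual pre-processing was done with a custom Fieldtrip (MATLAB) script that is not reproduced here. Purely as an illustration, a rough MNE-Python equivalent of the described steps might look as follows, assuming raw and events already hold the 22-channel EEG recording and the cue events.

import mne

raw.set_eeg_reference('average', projection=False)      # average reference
raw.filter(7., 15., fir_design='firwin', phase='zero')  # zero-phase FIR, 7-15 Hz
raw.resample(100.)                                      # down-sample to 100 Hz

# epochs start 500 ms after the visual cue and last 3000 ms
epochs = mne.Epochs(raw, events, tmin=0.5, tmax=3.5,
                    baseline=None, preload=True)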

3.3 METHOD 1: Decoding of motor imagery applied to EEG data decomposed using
Common Spatial Patterns.

Common spatial pattern (CSP) is a mathematical procedure used in signal processing for
separating a multivariate signal into additive subcomponents which have maximum
differences in variance between two windows.

Here, the classifier is applied to features extracted from CSP-filtered signals.

Common Spatial Pattern (CSP) filters are one of the most widely used feature extraction methods in the BCI domain. Assuming data of two classes, for example the motor imagery of right and left, the CSP algorithm calculates spatial filters that maximize the ratio of the variances of the data stemming from the two classes. Consequently, the extracted signals optimally discriminate between the two EEG classes while revealing the spatial patterns of the different classes.
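
The variance-ratio idea can be written compactly: for class covariance matrices C1 and C2, the CSP filters are the generalized eigenvectors of the pair (C1, C1 + C2). A minimal NumPy/SciPy sketch, assuming X1 and X2 are (trials, channels, samples) arrays for the two classes:

import numpy as np
from scipy.linalg import eigh

def csp_filters(X1, X2):
    C1 = np.mean([np.cov(t) for t in X1], axis=0)  # class-1 spatial covariance
    C2 = np.mean([np.cov(t) for t in X2], axis=0)  # class-2 spatial covariance
    # Generalized eigenproblem C1 w = lambda (C1 + C2) w; eigenvalues ascend,
    # so the first/last columns maximize variance for class 2 / class 1.
    _, W = eigh(C1, C1 + C2)
    return W  # columns are the spatial filters

# usage sketch: W = csp_filters(X1, X2); features = log-variance of W.T @ trial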

The original CSP algorithm was developed for two-class problems, though multiclass extensions exist. Since the classification problem of this work is multiclass, a multiclass extension of CSP was deployed using the One-vs-Rest scheme; as filters, the first and last eigenvectors of each class were selected [15]. The CSP filters were calculated during the training phase, on the mean covariance matrices of the data conditioned on the four classes.

Additionally, a Linear Discriminant Analysis (LDA) is applied as well, for dimensionality reduction.
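
A hedged sketch of such a One-vs-Rest multiclass CSP + LDA pipeline, using mne.decoding.CSP and scikit-learn; epochs_data and labels are assumed to be the pre-processed four-class trials, and this mirrors, rather than reproduces, the exact implementation used.

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from mne.decoding import CSP

binary_clf = Pipeline([('CSP', CSP(n_components=4, log=True)),
                       ('LDA', LinearDiscriminantAnalysis())])
ovr = OneVsRestClassifier(binary_clf)   # one CSP + LDA model per class
# ovr.fit(epochs_data, labels); predictions = ovr.predict(epochs_data)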

METHOD 2: Application of Multi-Voxel Pattern Analysis to the Sensor Space

Multi-voxel pattern analysis (MVPA) detects differences between conditions with higher
sensitivity than conventional univariate analysis by focusing on the analysis and comparison
of distributed patterns of activity.

In such a multivariate approach, data from individual voxels within a region are jointly
analyzed.

Multi-voxel pattern analysis (MVPA) involves searching for highly reproducible spatial
patterns of activity that differentiate across experimental conditions. MVPA is therefore
considered as a supervised classification problem where a classifier attempts to capture the
relationships between spatial patterns of fMRI activity and experimental conditions.

More generally, classification consists of determining a decision function that takes the values of various "features" in a data "example" and predicts the class of that "example". "Features" is a generic term used in machine learning for the set of variables or attributes describing a certain "example". In the context of fMRI, an "example" may represent a given trial in the experimental run, and the "features" may represent the corresponding fMRI signals in a cluster of voxels. The experimental conditions may represent the different classes.

The patterns describe how the MEG and EEG data were generated from the discriminant neural sources, which are extracted by the filters.
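
This filter-to-pattern relationship can be made explicit (following Haufe et al., 2014): for a data matrix X and a spatial filter w, the corresponding activation pattern is Cov(X) · w scaled by the variance of the extracted source. A minimal sketch, with X assumed to be a (samples, channels) array:

import numpy as np

def filter_to_pattern(X, w):
    s = X @ w                       # source time course extracted by the filter
    Sigma_X = np.cov(X, rowvar=False)
    return Sigma_X @ w / np.var(s)  # forward-model pattern of that source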

CODES
METHOD 1:

import numpy as np
import matplotlib.pyplot as plt

from sklearn.pipeline import Pipeline
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import ShuffleSplit, cross_val_score

from mne import Epochs, pick_types, events_from_annotations
from mne.channels import read_layout
from mne.io import concatenate_raws, read_raw_edf
from mne.datasets import eegbci
from mne.decoding import CSP

tmin, tmax = -1., 4.
event_id = dict(hands=2, feet=3)
subject = 1
runs = [6, 10, 14]  # motor imagery of hands vs feet in the PhysioNet EEGBCI data

raw_fnames = eegbci.load_data(subject, runs)
raw = concatenate_raws([read_raw_edf(f, preload=True) for f in raw_fnames])
raw.rename_channels(lambda x: x.strip('.'))

#FILTERING
raw.filter(7., 30., fir_design='firwin', skip_by_annotation='edge')
events, _ = events_from_annotations(raw, event_id=dict(T1=2, T2=3))
picks = pick_types(raw.info, meg=False, eeg=True, stim=False, eog=False,
exclude='bads')
epochs = Epochs(raw, events, event_id, tmin, tmax, proj=True, picks=picks,
baseline=None, preload=True)

epochs_train = epochs.copy().crop(tmin=1., tmax=2.)
labels = epochs.events[:, -1] - 2  # map event codes {2, 3} to labels {0, 1}

scores = []
epochs_data = epochs.get_data()
epochs_data_train = epochs_train.get_data()
cv = ShuffleSplit(10, test_size=0.2, random_state=42)
cv_split = cv.split(epochs_data_train)

# ASSEMBLE THE CLASSIFIER


lda = LinearDiscriminantAnalysis()
csp = CSP(n_components=4, reg=None, log=True, norm_trace=False)
clf = Pipeline([('CSP', csp), ('LDA', lda)])
scores = cross_val_score(clf, epochs_data_train, labels, cv=cv, n_jobs=1)

#ACCURACY Measurement
class_balance = np.mean(labels == labels[0])
class_balance = max(class_balance, 1. - class_balance)
print("Classification accuracy: %f / Chance level: %f" % (np.mean(scores),
class_balance))
#DATA Visualization
csp.fit_transform(epochs_data, labels)

layout = read_layout('EEG1005')
csp.plot_patterns(epochs.info, layout=layout, ch_type='eeg',
units='Patterns (AU)', size=1.5)

sfreq = raw.info['sfreq']
w_length = int(sfreq * 0.5) # running classifier: window length
w_step = int(sfreq * 0.1) # running classifier: window step size
w_start = np.arange(0, epochs_data.shape[2] - w_length, w_step)

scores_windows = []

for train_idx, test_idx in cv_split:
    y_train, y_test = labels[train_idx], labels[test_idx]

    X_train = csp.fit_transform(epochs_data_train[train_idx], y_train)
    X_test = csp.transform(epochs_data_train[test_idx])

    # fit classifier
    lda.fit(X_train, y_train)

    # running classifier: test classifier on sliding window
    score_this_window = []
    for n in w_start:
        X_test = csp.transform(epochs_data[test_idx][:, :, n:(n + w_length)])
        score_this_window.append(lda.score(X_test, y_test))
    scores_windows.append(score_this_window)

# Plot scores over time


w_times = (w_start + w_length / 2.) / sfreq + epochs.tmin

plt.figure()
plt.plot(w_times, np.mean(scores_windows, 0), label='Score')
plt.axvline(0, linestyle='--', color='k', label='Onset')
plt.axhline(0.5, linestyle='-', color='k', label='Chance')
plt.xlabel('time (s)')
plt.ylabel('classification accuracy')
plt.title('Classification score over time')
plt.legend(loc='lower right')
plt.show()

METHOD 2:

import mne
from mne import io, EvokedArray
from mne.datasets import sample
from mne.decoding import Vectorizer, get_coef, LinearModel

from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

data_path = sample.data_path()
raw_fname = data_path + '/MEG/sample/sample_audvis_filt-0-40_raw.fif'
event_fname = data_path + '/MEG/sample/sample_audvis_filt-0-40_raw-eve.fif'

tmin, tmax = -0.1, 0.4
event_id = dict(aud_l=1, vis_l=3)

# Setup for reading the raw data


raw = io.read_raw_fif(raw_fname, preload=True)
raw.filter(.5, 25, fir_design='firwin')
events = mne.read_events(event_fname)

# Read epochs
epochs = mne.Epochs(raw, events, event_id, tmin, tmax, proj=True,
decim=2, baseline=None, preload=True)

labels = epochs.events[:, -1]

# get MEG and EEG data


meg_epochs = epochs.copy().pick_types(meg=True, eeg=False)
meg_data = meg_epochs.get_data().reshape(len(labels), -1)

clf = LogisticRegression(solver='lbfgs')
scaler = StandardScaler()

# create a linear model with LogisticRegression


model = LinearModel(clf)

# fit the classifier on MEG data


X = scaler.fit_transform(meg_data)
model.fit(X, labels)

# Extract and plot spatial filters and spatial patterns


for name, coef in (('patterns', model.patterns_), ('filters', model.filters_)):
    # We fitted the linear model onto Z-scored data. To make the filters
    # interpretable, we must reverse this normalization step
    coef = scaler.inverse_transform([coef])[0]

    # The data was vectorized to fit a single model across all time points and
    # all channels. We thus reshape it:
    coef = coef.reshape(len(meg_epochs.ch_names), -1)

    # Plot
    evoked = EvokedArray(coef, meg_epochs.info, tmin=epochs.tmin)
    evoked.plot_topomap(title='MEG %s' % name, time_unit='s')

X = epochs.pick_types(meg=False, eeg=True)
y = epochs.events[:, 2]

# Define a unique pipeline to sequentially:


clf = make_pipeline(
    Vectorizer(),        # 1) vectorize across time and channels
    StandardScaler(),    # 2) normalize features across trials
    LinearModel(
        LogisticRegression(solver='lbfgs')))  # 3) fit a logistic regression
clf.fit(X, y)

# Extract and plot patterns and filters


for name in ('patterns_', 'filters_'):
    # The `inverse_transform` parameter will call this method on any estimator
    # contained in the pipeline, in reverse order.
    coef = get_coef(clf, name, inverse_transform=True)
    evoked = EvokedArray(coef, epochs.info, tmin=epochs.tmin)
    evoked.plot_topomap(title='EEG %s' % name[:-1], time_unit='s')

RESULTS

METHOD 1:

METHOD 2:

CONCLUSION

Method 1 is better for producing results and decoding motor imagery (the mental simulation of a particular action). It provides quantitative extraction of features in a more reliable manner.

Method 2 is better for visualization and for producing neurophysiologically interpretable patterns. Essentially, it is better when the spatial orientation and placement of the electrodes are crucial to the experiment.

As most of the proposed methods are based on deep learning, they also offer practical benefits in addition to automating the process of hand-engineering features.

Firstly, such models rely on relatively simple and homogeneous computations organized in layers, which can reduce coding complexity. Secondly, they can exploit the computational power of Graphics Processing Units (GPUs) to run efficiently. The analyses here build on the open-source library MNE, which provides rich bio-signal processing, model training, and visualization modules that help researchers and engineers develop such systems efficiently. We hope that many of the elements featured in the presented methods can be reused to solve problems with similar properties and help inspire future work.

