Biometric Identification through Eye-Movement Patterns
Akram Bayat · Marc Pomplun
Abstract This paper describes how to identify individual readers from their eye-movement patterns. A case study with forty participants was conducted to measure eye movements during reading. The proposed biometric method is built on an informative and stable eye-movement feature set that gives rise to a high-performance multi-class identification model. Multiple
individual classifiers are trained and tested on our novel feature set consisting of
28 features that represent basic eye movement, scan path and pupillary character-
istics. We combine three high-accuracy classifiers, namely Multilayer Perceptron,
Logistic, and Logistic Model Tree using the average of probabilities as the combina-
tion rule. We reach an overall accuracy of 95.31% and an average Equal Error Rate
(EER) of 2.03% and propose a strategy for adjusting decision thresholds that de-
creases the false acceptance rate to 0.1%. Our approach dramatically outperforms
previous methods, making it possible for the first time to build eye-movement
biometric systems for user identification and personalized interfaces.
Keywords Biometric identification · Equal error rate · Eye movement · Pattern
recognition
A. Bayat
Department of Computer Science, University of Massachusetts Boston, Morrissey Boulevard,
Boston, MA, USA.
E-mail: akram@cs.umb.edu

M. Pomplun
E-mail: marc@cs.umb.edu

1 Introduction
2 Experimental Design
All screens were presented on a 22-inch ViewSonic LCD monitor with a refresh rate of 75 Hz and a resolution of 1024 × 768 pixels. Eye movements were monitored
with an SR Research EyeLink-2k system with a sampling frequency of 1000 Hz.
The passages used for data collection cover general topics (food, science, health, history) and were taken from Washington Post news articles. Each passage has between 230 and 240 words. The passages are easily readable texts, chosen to reduce the influence of subjects' prior knowledge on the experiments.
Forty native English speakers (25 female) with an average age of 20.4 years (SD = 5.35) and normal or corrected-to-normal vision participated in the experiments. In the first experiment, which produced Dataset I, twenty subjects each read six passages that differed between subjects. In the second experiment, which produced Dataset II, the other twenty subjects read six passages that were identical for all subjects. Every passage in both experiments was divided into three screens, for a total of 18 screens across the six passages. The text was displayed in black on a grey background.
3 Feature Extraction
This section describes our feature extraction method, which consists of scan path
and pupillary response analysis. We aim to discriminate individuals by their visual
behavior during the reading task. The visual behavior is represented via a feature
set reflecting the dynamics of eye movement patterns.
It is important to select candidate features that provide a high level of specificity and noise tolerance. We therefore attempt to choose features that represent the eye-movement patterns observed during reading. Most of the features used in this work are based on global processing; word-by-word processing is not specifically analyzed.
By considering the properties of eye-movement patterns in the reading task, we
extract features that hold promise as physiological and behavioral characteristics.
Moreover, the designed features should be less influenced by the content of the
particular texts used in the experiment. For instance, the frequency of the words
determines the likelihood that they are fixated. However, it is difficult to find
features that are text independent since eye-movement patterns are reactions to
the text stimulus. In addition, we use texts that require little prior knowledge in order to decrease the influence of such knowledge on the features. Furthermore, all subjects are of very similar ages, so eye-movement patterns are not significantly influenced by age differences between readers.
The raw data collected by the eye tracker system for each subject contains
various activities during reading, including fixations, saccades and blinks. Each of
these activities as well as the current gaze coordinates and pupil diameter were
measured at a temporal resolution of one millisecond.
A scan path is defined as the trajectory of eye movements over a screen, con-
taining a number of fixations and saccades with the number, duration, order and
placement of the fixations and saccades varying among individuals (Phillips and
Edelman, 2008). Analyzing the scan path of each subject during reading can lead
to measurable characteristics that are distinct for each subject. By analyzing a
scan path, we compute features that can serve as strong biometric characteristics. The extracted features are categorized into four groups: fixation features,
saccadic features, pupillary response features and spatial reading features.
Fixation Speed is the number of fixations in a scan path over the total time
needed for reading a screen.
Average fixation duration is measured as the sum of fixation durations over
the number of fixations.
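As a concrete illustration, the following sketch computes these two fixation features for one scan path; the data layout (a list of fixation durations plus the total reading time of the screen) and the function name are illustrative assumptions, not the authors' implementation.

# Sketch of the two fixation features, assuming each fixation is given by its
# duration in milliseconds and the total reading time of the screen is known.
def fixation_features(fixation_durations_ms, total_reading_time_ms):
    """Return (fixation_speed, avg_fixation_duration) for one scan path."""
    n_fixations = len(fixation_durations_ms)
    # Fixation speed: number of fixations over the total reading time of the screen.
    fixation_speed = n_fixations / total_reading_time_ms
    # Average fixation duration: total fixation time over the number of fixations.
    avg_fixation_duration = sum(fixation_durations_ms) / n_fixations
    return fixation_speed, avg_fixation_duration

# Example: five fixations on a screen read in 4.2 seconds.
print(fixation_features([210, 185, 240, 198, 305], 4200.0))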
We define basic and complex saccadic features, which are saccade-related metrics. If a blink occurs during a saccade, that saccade is removed from the computation of both basic and complex features.
Average saccade duration is computed as the sum of saccade durations
over the total number of saccades.
Average horizontal (vertical) saccade amplitudes of at least 0.5 de-
grees are measured as the sum of horizontal (vertical) saccade amplitudes greater
than 0.5 degrees over the total number of saccades with horizontal (vertical) ampli-
tudes greater than 0.5 degrees. Horizontal and vertical saccade amplitudes indicate
between-word and between-line saccades, respectively (Holland and Komogortsev,
2011a).
Average saccade horizontal (vertical) Euclidean length is the sum of
horizontal (vertical) Euclidean distances between fixation locations over the total
number of saccades.
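The basic saccadic features above could be computed along the following lines; the saccade record layout (duration plus horizontal and vertical displacement in degrees) is an assumption made for this sketch, and amplitudes and Euclidean lengths are derived from the same displacement here as a simplification.

# Sketch of the basic saccadic features for one scan path. Record layout is an
# illustrative assumption: duration in ms and signed displacement in degrees.
def saccade_features(saccades, amp_threshold_deg=0.5):
    n = len(saccades)
    avg_duration = sum(s["duration_ms"] for s in saccades) / n

    # Horizontal amplitudes above 0.5 deg, averaged over the saccades whose
    # horizontal amplitude exceeds the threshold (between-word saccades).
    h_amps = [abs(s["dx_deg"]) for s in saccades if abs(s["dx_deg"]) > amp_threshold_deg]
    avg_h_amp = sum(h_amps) / len(h_amps) if h_amps else 0.0

    # Vertical amplitudes above 0.5 deg (between-line saccades).
    v_amps = [abs(s["dy_deg"]) for s in saccades if abs(s["dy_deg"]) > amp_threshold_deg]
    avg_v_amp = sum(v_amps) / len(v_amps) if v_amps else 0.0

    # Average horizontal (vertical) Euclidean length between fixation locations,
    # taken over all saccades.
    avg_h_len = sum(abs(s["dx_deg"]) for s in saccades) / n
    avg_v_len = sum(abs(s["dy_deg"]) for s in saccades) / n
    return avg_duration, avg_h_amp, avg_v_amp, avg_h_len, avg_v_len

example = [{"duration_ms": 32, "dx_deg": 2.1, "dy_deg": 0.1},
           {"duration_ms": 45, "dx_deg": -8.4, "dy_deg": 1.2}]
print(saccade_features(example))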
Average saccade velocity is defined as the sum of Euclidean norm of the
horizontal and vertical velocities over the total number of saccades in a scan path.
The horizontal (vertical) saccade velocity is defined as the velocity with which the eyes move horizontally (vertically) from one fixation point to another. A very simple, fast and accurate way to compute saccade velocity is to use a two-point central difference. However, if these two points are taken to be two adjacent fixation points (discarding all data points between them), the signal-to-noise ratio decreases significantly. A more robust way to compute the horizontal saccade velocity (V_{HS}) and the vertical saccade velocity (V_{VS}) is to use the following formulas, which are designed for a 1000 Hz sampling rate:
V_{HS} = \frac{1}{n} \sum_{k=0}^{n} V_H(k)    (1)

V_{VS} = \frac{1}{n} \sum_{k=0}^{n} V_V(k)    (2)

where

V_H(k) = \frac{x([k+3]T) - x([k-3]T)}{6T}    (3)

V_V(k) = \frac{y([k+3]T) - y([k-3]T)}{6T}    (4)
where x and y are the coordinates of a sample point within a saccade, T is the
sampling interval (1ms), and k is the index for discretized time, i.e. k = 0, 1, 2, ..., n.
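A minimal sketch of Equations (1)–(4) follows, assuming the raw gaze samples of a single saccade are available as coordinate arrays recorded at 1000 Hz; the handling of the first and last three samples, where the six-sample difference is undefined, is our own assumption.

import numpy as np

# Sketch of Eqs. (1)-(4): average horizontal and vertical saccade velocity from
# the raw gaze samples of one saccade recorded at 1000 Hz (T = 1 ms).
def saccade_velocities(x, y, T=0.001):
    """x, y: gaze coordinates of the samples within one saccade."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    # Six-sample central difference; evaluated only where k-3 and k+3 exist.
    k = np.arange(3, n - 3)
    v_h = (x[k + 3] - x[k - 3]) / (6 * T)
    v_v = (y[k + 3] - y[k - 3]) / (6 * T)
    # Equations (1) and (2): average the per-sample velocities over the saccade.
    return v_h.mean(), v_v.mean()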
Average peak velocity is the sum of peak velocities over the total number of
saccades in a scan path, where the peak velocity is defined as the highest velocity
reached between any two consecutive samples during the saccade.
Pupillary response features consist of features that reflect changes in pupil size
during reading activity. The fact that the magnitude of pupil dilation is a function
of processing load or mental effort has long been known in neurophysiology. The
various changes in pupil diameters in different participants that result from their
reactions to the same reading task can be considered as dynamic features. In order
to model these changes, the standard deviation, average rate of pupil size change,
and difference between the minimum and maximum pupil size in a scan path are
computed.
The standard deviation of pupil diameter is computed as follows:
\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \frac{(M_{Fixation}(i) - M_{Scanpath})^2}{(M_{Scanpath})^2}}    (5)
where M_{Scanpath} is the mean pupil diameter observed during one scan path, M_{Fixation}(i) is the mean pupil diameter during the i-th fixation in that scan path, and N is the number of fixations within the scan path.
The average rate of pupil size change, VP , is measured as follows:
V_P = \frac{1}{n-3} \sum_{k=2}^{n-2} \frac{P([k+2]T) - P([k-2]T)}{4T}    (6)
where n is the number of fixations within a scan path, P is the pupil size, and T
is the sampling interval (1ms) of data collection.
The difference between minimal and maximal pupil diameter in each
scan path is measured as another pupil feature.
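The three pupillary response features could be computed as in the sketch below, assuming the mean pupil diameter of each fixation and the raw pupil-size samples of the scan path are available; the boundary handling in Equation (6) is our interpretation.

import numpy as np

# Sketch of the pupillary response features for one scan path.
def pupil_features(fixation_mean_pupil, pupil_samples, T=0.001):
    m_fix = np.asarray(fixation_mean_pupil, dtype=float)
    p = np.asarray(pupil_samples, dtype=float)

    # Equation (5): normalized standard deviation of the per-fixation means
    # around the scan-path mean pupil diameter.
    m_scanpath = m_fix.mean()
    sigma = np.sqrt(np.mean((m_fix - m_scanpath) ** 2 / m_scanpath ** 2))

    # Equation (6): average rate of pupil-size change via a central difference,
    # evaluated over the interior samples where k-2 and k+2 exist (assumption).
    k = np.arange(2, len(p) - 2)
    v_p = np.mean((p[k + 2] - p[k - 2]) / (4 * T))

    # Difference between maximal and minimal pupil diameter in the scan path.
    pupil_range = p.max() - p.min()
    return sigma, v_p, pupil_range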
Spatial reading features reveal eye movement behavior in terms of efficiency. For
comprehending a sentence or a passage, readers must establish word order. This
means that their gaze moves to upcoming words in the text when they become
relevant for sentence comprehension. We define a progressive reading procedure as
moving forward toward the next words on a line of text. We define a saccade as a
progressive saccade if the saccade angle deviates from this direction by less than
20 degrees. Saccades that move the eyes in other directions are not considered to
belong to a progressive, efficient reading procedure. However, a saccade in the opposite
direction, landing on the next line of the text, is also counted as a progressive
saccade. Then we define the following features in this reading space:
Average horizontal (vertical) forward saccade velocities are measured
as the sum of progressive horizontal (vertical) saccade velocities over the total
number of progressive saccades.
Average absolute forward saccade velocities are the sum of absolute val-
ues of progressive saccade velocities over the total number of progressive saccades.
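A sketch of the progressive-saccade rule and the forward velocity features follows; the test for "landing on the next line" (a downward shift of roughly one line height) and the parameter line_height_px are hypothetical choices made for illustration only.

import math

# Sketch: a saccade is progressive if its direction deviates from the forward
# reading direction (left-to-right) by less than 20 degrees, or if it sweeps
# back and lands on the next line of text.
def is_progressive(dx, dy, line_height_px=30.0, max_dev_deg=20.0):
    angle = math.degrees(math.atan2(dy, dx))        # 0 deg = rightward
    if abs(angle) < max_dev_deg:
        return True                                 # forward along the line
    # Return sweep: leftward saccade that moves the gaze down by about one line.
    if dx < 0 and 0.5 * line_height_px < dy < 1.5 * line_height_px:
        return True
    return False

def forward_velocity_features(saccades):
    """saccades: list of (dx, dy, vx, vy) per saccade; velocities in deg/s."""
    prog = [s for s in saccades if is_progressive(s[0], s[1])]
    n = len(prog)
    if n == 0:
        return 0.0, 0.0, 0.0
    avg_vx = sum(s[2] for s in prog) / n            # horizontal forward velocity
    avg_vy = sum(s[3] for s in prog) / n            # vertical forward velocity
    avg_abs = sum(math.hypot(s[2], s[3]) for s in prog) / n
    return avg_vx, avg_vy, avg_abs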
4 Classification
In this section, we describe the classification algorithm for our biometric user identification. Dataset I and Dataset II contain eye-movement data from forty subjects. Each subject read six passages, with each passage presented across three successive screens. A feature vector consisting of the 28 scan-path features of that screen is extracted for each screen, so each subject has 18 feature vectors.
Feature extraction and classification are performed on Dataset I, Dataset II and
a combination of these datasets.
The highest classification accuracy is obtained with Dataset I, as expected. Obtaining the highest accuracy with Dataset I reflects additional recognition power due to the differences between the texts used in that experiment.
Combining multiple good classifiers can improve accuracy, efficiency and robust-
ness over single classifiers. The idea is that different classifiers may offer com-
plementary and diverse information about patterns to be classified, allowing for
potentially higher classification accuracy (e.g., Bayat, Pomplun, and Tran, 2014).
We use the vote classifier to combine the optimal set of classifiers from the previous
section by selecting a combination rule. We use the average of probabilities as our combination rule, which returns the mean of the probability distributions of the single classifiers. For our biometric identification model, the average of probabilities yields a better result than other rules such as majority voting (94.40%). The combination of the Logistic, Multilayer Perceptron and Logistic Model Tree classifiers yields the highest accuracy on all three datasets. The results of our combined classifiers are listed in Table ??.
The accuracy decreases from 97.92% to 96.52% when considering data collected
with the same set of passages for all subjects (Dataset II). The accuracy drops further to 95.31% when Dataset I and Dataset II are combined. The reason for combining the two datasets is to reduce the error introduced by the content of the text used for identification. The small changes in accuracy across datasets and the behavioral nature of our feature set suggest that our feature set and classifiers capture unique characteristics of individuals from their eye-movement patterns. For the remainder of this paper, the model
constructed by using the combination of Datasets I and II will be used as our
classifier.
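The "average of probabilities" combination can be reproduced with a soft-voting ensemble, sketched below with scikit-learn; note that the original model was built from WEKA classifiers, and the decision tree here is only a stand-in for the Logistic Model Tree, which has no direct scikit-learn equivalent. X would be the matrix of 28-feature scan-path vectors and y the subject labels.

# Sketch of the combined classifier with "average of probabilities" as the
# combination rule, recreated as a scikit-learn soft-voting ensemble.
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

ensemble = VotingClassifier(
    estimators=[
        ("mlp", make_pipeline(StandardScaler(), MLPClassifier(max_iter=2000))),
        ("logistic", make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))),
        ("tree", DecisionTreeClassifier()),   # stand-in for the Logistic Model Tree
    ],
    voting="soft",   # average the class-probability distributions
)
# ensemble.fit(X_train, y_train); ensemble.predict_proba(X_test)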
5 Identification Performance
Fig. 1 FAR and FRR Errors for the multiclass classifier with 95.31% accuracy and decision
threshold with probability of 0.5. The maximum value of FAR and FRR over all classes is
0.62% and 33.33%, respectively.
In this section, we evaluate the identification performance of our classifier using standard metrics such as the receiver operating characteristic (ROC) and the equal error rate (EER). An ROC curve shows the proportions of correctly and incorrectly classified predictions over a wide, continuous range of decision thresholds.
In biometric systems, from the user's point of view, an identification error occurs when the system accepts an invalid user or fails to identify a
valid user. The associated error rates are called False Acceptance Rate (FAR)
and False Rejection Rate (FRR), respectively, which are the most commonly used
metrics for identification problems. EER is the rate at which both FAR and FRR
are equal. The value of the EER can be easily obtained from the ROC curve. In
general, lower EER indicates higher accuracy.
The general strategy in evaluating the FAR and FRR errors for multi-class
classifiers is to reduce the problem of multiclass classification to multiple binary
classifications in which each class has its own FAR and FRR values. Figure 1 shows the FAR and FRR errors for all forty classes. The combined multi-class classifier achieves a 95.31% probability of identification at an average of 0.134% FAR and 6.051% FRR.
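The one-vs-rest reduction used to obtain the per-class FAR and FRR values can be sketched as follows, assuming arrays of true and predicted subject labels from the multi-class classifier are available.

import numpy as np

# Sketch of the one-vs-rest reduction: for each class (user), FAR is the
# fraction of other users' samples accepted as that user, and FRR is the
# fraction of that user's samples rejected.
def per_class_far_frr(y_true, y_pred, classes):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    rates = {}
    for c in classes:
        genuine = y_true == c          # samples that truly belong to user c
        impostor = ~genuine            # samples from all other users
        far = np.mean(y_pred[impostor] == c)   # impostors accepted as c
        frr = np.mean(y_pred[genuine] != c)    # genuine samples rejected
        rates[c] = (far, frr)
    return rates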
Fig. 2 FAR and FRR versus decision threshold for two sample binary classes; the EER value serves as the decision threshold. In panel a), FAR = FRR at threshold ≈ 0.5. In panel b), FAR ≠ FRR at threshold ≈ 0.5.
The combined classifier that achieves 95.31% accuracy uses a decision threshold with a probability of 0.5 to map instances to predicted classes. However, this threshold does not guarantee optimal values of the FAR and FRR errors. Figure 2 shows the ROC graphs for two sample binary classes (two users), in which FAR and FRR are plotted on the vertical axis and the decision threshold on the horizontal axis. Figure 2a shows an EER value of 0.0015 for both FAR and FRR at a decision threshold of 0.49 for one sample binary class, which guarantees an EER of 0.15% or less for this class.
Selecting the EER value as the decision threshold is often a good choice for
identification applications. In Figure 2b, the FAR and FRR errors are not optimal at a threshold of 0.5, because the FAR (0.277) at that threshold is fairly high for an identification problem; it means that the classifier would accept an invalid user in 27.7 percent of impostor attempts.
ROC analysis and EER based decision making are commonly employed in two-
class classification problems because they are easy to analyze and visualize. For
our multi-class classifier, we need to deal with more than one decision threshold and make multiple decisions for our multi-class predictive model.
It is important to note that with more than two classes, ROC analysis becomes considerably more complex to manage. In our multi-class model with 40 classes, the confusion matrix becomes a 40×40 matrix containing the 40 correct classifications (the main diagonal entries) and 1560 (40² − 40) possible errors (the off-diagonal entries). Instead of managing a single trade-off between FAR and FRR, we must handle 1560 possible errors (Fawcett, 2006).
Fig. 3 EER vs. decision threshold for our proposed multiclass model.
Algorithm 1 The ODER algorithm gives the optimal values of FAR and FRR and their corresponding threshold value for our identification problem

    min_FAR ← 1
    loop                              ▷ for all threshold values in each class
        if |FAR − FRR| ≤ 10^{-1} then
            if FAR ≤ min_FAR then
                min_FAR ← FAR
            end if
        end if
    end loop
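A possible reading of Algorithm 1 in code form is sketched below; far_of and frr_of are hypothetical helpers that return a class's FAR and FRR at a given threshold, and the 0.1 tolerance mirrors the |FAR − FRR| ≤ 10^{-1} condition above.

# Sketch of the ODER threshold search as we read Algorithm 1: among the
# candidate thresholds at which FAR and FRR are close (|FAR - FRR| <= 0.1),
# keep the one with the smallest FAR.
def oder_threshold(far_of, frr_of, thresholds):
    best_t, best_far, best_frr = None, 1.0, 1.0
    for t in thresholds:
        far, frr = far_of(t), frr_of(t)
        if abs(far - frr) <= 1e-1 and far <= best_far:
            best_t, best_far, best_frr = t, far, frr
    return best_t, best_far, best_frr

# Example sweep over 101 candidate thresholds between 0 and 1:
# t, far, frr = oder_threshold(far_of, frr_of, [i / 100 for i in range(101)])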
ODER outperforms the EER criterion by decreasing FAR from 2.03% to 0.1%, while FRR increases from 2.03% to 3.9%. To compare the EER and ODER criteria for setting the decision threshold more precisely, we compute the relative changes in FAR and FRR for each criterion. Under ODER, FAR decreases by 95.07% while FRR increases by only 47.9%. This result suggests
that the ODER algorithm outperforms the EER-based threshold decision making.
6 Related Work
The idea of using biometric identification, either by itself or fused with other standard identification methods, has been studied for many years. Fingerprints, face, DNA, voice and gait are some of the physiological and behavioral characteristics that have been used as biometric identifiers. In recent years, the potential of eye-movement tracking has been investigated as an additional biometric that can be integrated with other biometrics. Involuntary eye movements may reflect underlying anatomical organization that is unique to each individual.
Bednarik et al. (2005) present a first step towards using eye movements for biometric identification. An identification rate of 60% by using dynamics of
Fig. 4 Comparing average FAR and FRR for each of three different criteria (Decision thresh-
old = 0.5, EER, and ODER).
7 Conclusions
Acknowledgements The authors thank Ms. Nada Attar in the Visual Attention Laboratory
at the University of Massachusetts Boston for providing them with data for evaluating the
eye-movement classifiers presented in this study.
References