2011 11th IEEE International Conference on Data Mining Workshops

What’s your current stress level?

Detection of stress patterns from GSR sensor data
Jorn Bakker, Mykola Pechenizkiy, Natalia Sidorova
Department of Computer Science
Eindhoven University of Technology
Eindhoven, P.O. Box 513, 5600 MB, The Netherlands
Email: {j.bakker,m.pechenizkiy,n.sidorova}@tue.nl

Abstract—The problem of job stress is generally recognized

as one of the major factors leading to a spectrum of health problems. People with certain professions, like intensive care specialists or call-center operators, and people in certain phases of their lives, like working parents with young children, are at increased risk of getting overstressed. Stress management should start long before the stress starts causing illnesses. The current state of sensor technology allows the development of systems that measure physical symptoms reflecting the stress level. In this paper we (1) formulate the problem of stress identification and categorization from the sensor data stream mining perspective, (2) consider a reductionist approach for arousal identification as a drift detection task, (3) highlight the major problems of dealing with GSR data, collected from a watch-style stress measurement device in normal (i.e. non-lab) settings, and propose simple approaches for dealing with them, and (4) discuss the lessons learnt from the experimental study conducted on real GSR data collected during a recent field study.

Stress at work has become a serious problem affecting many people of different professions, life situations, and age groups. The workplace has changed dramatically due to globalization of the economy, use of new information and communications technologies, growing diversity in the workplace, and increased mental workload. In the 2000 European Working Conditions Survey (EWCS) [12], work-related stress was found to be the second most common work-related health problem across the EU. 62% of Americans say work has a significant impact on stress levels. 54% of employees are concerned about health problems caused by stress. One in four employees has taken a mental health day off from work to cope with stress (APA Survey 2004).

Stress can contribute to illness directly, through its physiological effects, or indirectly, through maladaptive health behaviors (for example, smoking, poor eating habits or lack of sleep) [4]. It is important to motivate people to adjust their behavior and lifestyle and to start using appropriate stress coping strategies, so that they achieve a better stress balance long before an increased level of stress results in serious health problems.

The avoidance of stress in the everyday working environment is impossible. Still, if people are informed of their stress levels, they become empowered to take preemptive actions in order to alleviate stress [16].

[Figure 1 labels: What, When, Where, with Whom; Physiological signs; Stress detection and prediction; Stress detection & Coaching models; “Reschedule”, “Prepare”, “Take a break”]
Fig. 1. Stress@work in a nutshell: stress detection, prediction and coaching

There are a number of factors that are likely to cause stress at work, including but not limited to long work hours, work overload, time pressure, difficult, demanding or complex tasks, high responsibility, lack of breaks, conflicts, underpromotion, lack of training, job insecurity, lack of variety, and poor physical work conditions (limited space, inconvenient temperature, limited or inappropriate lighting conditions) [10].

In [1] we proposed a conceptual framework for managing stress at work. One very important step in the process of stress management is making the worker aware of past, current or expected stress. We aim to automate the identification of the causes of stress for an individual employee, as well as the identification of the common causes of stress for employees within an organisation. Figure 1 shows the main ideas of our approach: we aim at making stress and stressors visible by (1) keeping track of the calendar events and daily routine of the worker, (2) measuring stress-related physiological signs from the sensor data, (3) annotating these events with the sensor data and the results of automated analysis of additional information sources, such as sentiment classification of the incoming and outgoing e-mails or social media messages [18] and explicit

978-0-7695-4409-0/11 $26.00 © 2011 IEEE
DOI 10.1109/ICDMW.2011.178
user feedback, (4) extracting the relationship between event data and sensor data, i.e. relating the increases and decreases in the stress level to the characteristics of the events of daily life (what, where, when, with whom, etc.), and (5) using the extracted knowledge about this relationship for personalized coaching.

In order to find this relationship, a number of subtasks need to be carried out. One of the main subtasks is detecting stress from the sensor data. Thanks to modern ICT and sensor technologies, objective measurement of the stress level in non-lab settings becomes possible. Such symptoms as voice, heart rate, galvanic skin response (GSR) and facial expressions are known to be highly correlated with the level of stress a person experiences [3], [5], [7]. In this paper we focus on the use of the GSR data (reflecting sweating) measured by a prototype device worn on the wrist.

The direct use of the obtained GSR measurements is not straightforward. Partly this is caused by noise and inaccuracies in the collected sensor data. More crucially, the reaction to various stress factors is governed by the autonomic nervous system, and this “path” to the sympathetic system is shared with a lot of other mechanisms, such as the mechanism of adaptation to the outside temperature and humidity (Figure 2).

[Figure 2 labels: Stress factor; Other factors; Sympathetic system; Sweat production; Other]
Fig. 2. The reaction to stress factors is governed by the autonomic nervous system. This path is shared with a lot of other mechanisms.

We have conducted a pilot case study aimed at identifying the likely challenges we need to address to make our approach work in practice. In this paper, we focus only on the problem of detecting changes in the stress level from the GSR sensor data alone. We study the peculiarities of noise and disturbances in the signal and argue that related contextual data is needed to improve the quality of stress detection.

The rest of this paper is organized as follows. In Section II, we formulate the problem of stress identification and categorization from the sensor data stream mining perspective. We focus on a subproblem of arousal identification in online settings, which we formulate as a drift detection task. We highlight the major problems of dealing with GSR data, collected from a watch-style stress measurement device in normal (i.e. non-lab) settings, and propose simple approaches to deal with them. In Section III we present the results and lessons learnt from the experimental study conducted on real GSR data collected during the recent pilot field study. Finally, in Section IV we give conclusions and discuss directions for further work.

Stress comes in three flavors:
1) Acute: stress caused by an acute short-term stress factor.
2) Episodic acute: acute stress that occurs more frequently and/or periodically.
3) Chronic: stress that is caused by long-term stress factors and can be very harmful in the long run.

Most people experience acute stress during their everyday life. It is a primal fight-or-flight response to immediate stress factors and is not considered harmful. When the frequency of these occurrences increases, physiological symptoms might occur. This episodic acute stress is associated with a very busy and chaotic life and can be considered harmful when it occurs over prolonged periods of time. The last type of stress, chronic, is considered to be the most harmful. Prolonged periods of stress could be caused by personal circumstances or other long-term factors.

In our work, we want to prevent people from moving into the chronic category and therefore we target acute and episodic acute stress. In particular, in this paper we focus on the identification of acute stress in order to facilitate coaching for episodic acute stress.

Acute stress is a mechanism that brings the body into a state of alertness. As shown in Figure 2, it is controlled by the autonomic nervous system. This system maintains a constant equilibrium (also known as homeostasis). A change in this equilibrium results in changes in various bodily functions (e.g. the activity of the digestive system).

Stress can be seen as a state of emergency that is preceded by arousal due to an external stimulus, see Figure 3. After the factor causing stress (the stressor) disappears, the body relaxes and returns to a normal state.

Figure 4 shows the general case with more relationships between the four states depicting the inner process of stress.

The problem of stress identification can be formulated in different ways, e.g. as a traditional classification task, as one-class classification, as event identification, or as time series subsequence classification, to name a few main options.

It should also be noted that acute stress can be positive (e.g. caused by excitement, intrinsic motivation or engagement in the working process), and, consequently, staying in a normal state for too long a period without any acute stress can be a sign of monotonous, uninteresting work or poor motivation of the employee. Therefore, we would like to perform a more detailed classification of the states in the future.

In this paper we consider a simplified setting, assuming that a person is either in the normal state or in a stressed state. The change between the two states can be sudden or incremental; typically, arousal is more rapid and relaxation


takes considerably longer. As we will show, different change patterns can be observed.

[Figure 3 labels: Normal, Aroused, Stressed, Relaxing]
Fig. 3. An example of acute stress pattern observed from GSR data and how it can be mapped to the symbolic (time-stamped) representation of person’s

[Figure 4 labels: Normal, Arousing, Stressed, Relaxing]
Fig. 4. Four states depicting the inner process of stress.

A. Arousal as change detection
The principal task is to detect whether or not a person is stressed at a particular moment in time. In other words, the detector assigns a label “stressed” or “not stressed” based on the observed historic data.

Detecting changes in GSR data is not as straightforward as one might think when looking at the example in Figure 3. Different types of noise in the data, and changes in GSR data due to factors other than stressors, make it a non-trivial task. In this section, we give illustrative examples of noise and other factors affecting the GSR signal.

Types of noise. The quality of the GSR signal depends primarily on the continuity of the contact between the device and the skin of the test person. The skin conductance is measured by two electrodes that require skin contact in order to produce a reliable signal. However, this contact is not the same for every person. For some people, the device fits less well (e.g. because they dislike wearing it tight enough to guarantee good contact, or because they have very dry skin); due to a poor fit, we get noise in the signal (see Figure 6). A person might also accidentally touch the device (or do so periodically out of habit), thus increasing the pressure and influencing the GSR measurement; this also creates noise in the signal, in the form of local peaks (see Figure 5).

Note that the skin in contact with the device contains slightly more sweat than the skin next to the device. When the device is shifted on the skin, there is a resettling period of about 15 minutes, during which the skin that came into contact with the device reaches about the same level of sweat as the skin that was in contact with the device before the shift, thus resulting in about the same GSR (under the assumption that no change in the stress level happens in this period).

[Figure 5: GSR over time (hours); annotation: Noise]
Fig. 5. The GSR signal contains two-sided local noise peaks that are probably caused by a physical disturbance of the contact between the skin and the sensors, e.g. if someone has a habit of touching the watch (the stress meter in this case) from time to time.

[Figure 6: GSR over time (hours); annotation: Gaps]
Fig. 6. When the fit between the skin and the sensors is not tight enough, the contact is continuously broken. A characteristic of this behavior is the high number of gaps (ground value of the sensor) in the signal.

Importance of context. There are a lot of different factors that influence the internal state of a person. Rising GSR levels might be related to a rise in temperature or to heavy physical work or exercise. In other words, the GSR change patterns can be related to contexts that are mostly hidden.

[Figure 7: GSR over time (hours)]
Fig. 7. Prior to a stressful event (red-lined peak), the GSR level is gradually rising. Is this rise caused by an external factor or is it due to anticipation of the event?

[Figure 8: GSR over time (hours)]
Fig. 8. After a stressful event (red-lined peak) the GSR level does not return to the level it had prior to the event. This might indicate that there is no relaxation process.

[Figure 9: GSR over time (hours); annotation: Different GSR levels]
Fig. 9. After a suspected stressful event the GSR level does not return to the level it had prior to the event. This might indicate that there is no relaxation process or, more likely in this case, that the baseline level of GSR corresponding to the normal unstressed state has changed.

[Figure 10: GSR over time (hours); annotations: Exercise start, Exercise end]
Fig. 10. Doing physical exercises results in a high GSR level, yet is not related to emotional stress.

[Figure 11 pipeline: Raw sensor data, Noise filter (median filter), Aggregation (sec to min), Discretization (SAX), Change detection; notation: $\bar{y} = (y_{t_c}, \ldots, y_{t_{curr}})$, $\bar{y}' = f(\bar{y})$ with window size $n$, $\bar{y}'' = g(\bar{y}')$ with block size $m$, $\mathrm{SAX}(\bar{y}'')$]
Fig. 11. Arousal detection approach: the GSR data is first (1) filtered, (2) aggregated, and (3) discretized in the preprocessing phase and then passed to a change detection technique. Each step is applied to a window of data that is kept until a change has been detected.
One of these patterns is a steady increase of the GSR level (see Figure 7). This might be an indication of changing environmental factors (e.g. temperature), but it might also be a genuine stress response. For instance, once a certain event has been scheduled, the person might get stressed in anticipation of the event. This is an interesting pattern for the stress detection task.

The same holds for the patterns in Figures 8 and 9. In these time series there is a suspected stress peak: in Figure 8 the red part corresponds to an event tagged by the user as being stressful, and in Figure 9 there is an untagged short-term increase in the GSR level. In both cases, the GSR level does not return to the original baseline after the peak has passed. The question is whether this is due to continued stress (because the user is still preoccupied with what has happened) or to some other factors.

For some series we learnt from the users’ feedback that certain patterns were caused by environmental factors or user activity context. In Figure 10 the person is exercising between 12:00 and 13:00. The effect of the exercise is clearly visible in the GSR time series. Moreover, due to the form and the intensity of the peaks, we can discriminate them from genuine stress.

These context-dependent patterns will be important in the overall stress detection task. Knowing whether a person relaxes after a stressful event or whether he or she experiences anticipatory stress is very important. Here we do not handle these contexts explicitly.

B. Approach
The main task is to determine whether the observed portion of the signal contains a change that corresponds to an arousal. Formulating this problem as a change detection task on univariate time series, we consider a four-step approach for arousal detection, as shown in Figure 11. In the preprocessing phase we take the raw GSR sensor data and, according to the operational settings (i.e. offline vs. online), perform filtering, aggregation and discretisation. The processed data is then passed to a change detection technique.

The purpose of arousal detection can be twofold. The first purpose is to obtain labels for a supervised learning process aimed at finding relationships between stress occurrences and external events or factors causing stress. In this case we can perform change detection in offline settings, i.e. the complete data series can be used in the preprocessing and detection steps. The second purpose is to use an online detection mechanism in online or semi-online settings as an alarm that makes the user aware of stress (and possibly asks for feedback that can be related back to the subjective labeling process, i.e. the user can confirm or reject the alert). Although we do not fix the purpose of the task in this paper, we describe only an online method, which detects arousal for a point in time that may lie as much as a minute in the past.

Preprocessing. The three preprocessing steps that we use are shown in Figure 11 and exemplified for illustrative purposes in Figure 12. The main objective of the preprocessing phase is to remove noise from the GSR time series. The first type of noise is due to poor contact between the sensors and the skin (see Figure 6). If the contact is not sufficient, the sensor will not measure anything. The second type of noise is a local disturbance of the signal (see Figure 5). These local disturbances are caused by mechanical movements (e.g. the user bumps the device against something) and should not be considered to be actual measurements.

Noise caused by contact loss is problematic, since we cannot be sure whether the signal can be trusted in these areas. In these cases the frequency of the ground value (i.e. the value recorded when the sensors are not measuring anything) is a lot higher than in a normal time series. When such periods occur in the GSR signal, we flag the problem and do not consider the series further in the arousal detection task. More specifically, we count the number of occurrences of these faulty measurements and exclude the time series if this number exceeds the number of other points.

Noise caused by local disturbances must be filtered out because it might be mistaken for genuine peaks. As shown in Figure 3, one of the important parts of the arousal detection task is to catch the transition from normal GSR levels to aroused levels. This transition is characterized (for a typical stress pattern) by a sudden peak in the GSR level. The filter should remove local disturbances while maintaining the typical peaks. Therefore, the noise is filtered out by using a median filter [14]. This filter is used in image processing, and it preserves edges (as opposed to e.g. a moving average) while filtering out noise.

The preprocessing step is applied to windows within the window of kept data. Let $\bar{y} = (y_{t_c}, \ldots, y_{t_{curr}})$ be the portion of kept data from either the start or the last change point ($y_{t_c}$) until the most recent sample ($y_{t_{curr}}$). The filter computes the filtered values $\bar{y}' = f(\bar{y})$ over a moving window of size $n$ ($n = 100$ in the experiments) from $\bar{y}_1$ until $\bar{y}_k$, where $f$ is the filter function and $k = t_{curr} - t_c$. Each consecutive block of $m$ samples in $\bar{y}'$ is aggregated to one value, $\bar{y}'' = g(\bar{y}')$. In the last preprocessing step, this data is discretised using SAX [8] into a discrete time series with values from 1 to 5, $\mathrm{SAX}(\bar{y}'')$. The levels can be interpreted as levels of stress (1: completely relaxed and 5: maximum arousal). However, they should not be interpreted as absolute levels of arousal, but rather as a local relative measure of arousal. Note that discretisation of the time series does not lead to an easy identification of the change points (see Figure 13 for an illustrative example). However, the discretisation can help the change detector to be more accurate.

The signals are measured with a sampling frequency of 4 Hz, yet it does not make sense to expect the stress detection to have timing requirements in the order of tenths of seconds. For this reason, we aggregate the data to the order of minutes. We use $m = 240$ in the experiments, so that after the aggregation step one sample point $\bar{y}''_i$ corresponds to 1 minute. In the experiments, we took $\bar{y}''_i = \max(\bar{y}'_{block})$. As said, in this setup it is important that the aggregation step is applied after the filter in order to avoid the influence of local noise.

The discretisation using SAX is done online in a progressive way. That is, the SAX representation is recomputed over the historic data as new instances arrive within the training window.

[Figure 12 panels: (a) Raw GSR signal, time in hours; (b) Filtered GSR signal, time in hours; (c) Aggregated GSR signal, time in minutes; (d) Discrete GSR data, time in minutes]
Fig. 12. An example of a GSR signal in its original form and after each of the three individual steps in the data preprocessing: the raw GSR signal shown in (a) is filtered using a median filter (b), then the values are aggregated to the minute level (c), and finally they are discretised using SAX encoding (d) to be used as an input for a change detection technique.

[Figure 13 legend: GSR, SAX + MAX, ADWIN; axis: Time, minutes]
Fig. 13. An illustration that discretising the data with SAX does not immediately give us information about a change in arousal, e.g. by taking the maximal value of the current window. The blue circles indicate the changes alerted by such an approach. The red triangles indicate the change points alerted by the ADWIN change detection method taking the SAXified time series as an input. ADWIN is considered below.

Change detection. Change detection in time series has been a topic of interest in different domains. Existing approaches can be divided into two broad groups of techniques. Techniques from the first group are based on monitoring the evolution of performance indicators like classification model accuracy or some property of the data. Cumulative Sum (CUSUM), introduced in [11] and recently used in [17], is one of the statistical process monitoring mechanisms. This method monitors the mean of the input data (which can also be any filter residual) and gives an alarm when it is significantly different from zero, i.e. when it deviates from the normal process behaviour. Other methods rely on time series forecasting techniques such as Neural Networks and Auto Regression functions [15] that estimate parameter changes online based on an offline mapping.

Techniques from the second group are based on monitoring distributions on two different time-windows: a reference window summarizing past information and a window over

the most recent examples. Statistical tests based on the Chernoff bound, which decide whether samples drawn from two probability distributions are different, were studied in [6]. ADaptive WINdowing (ADWIN) [2], which we use in our experimental study, keeps a variable-length window of recently seen data points. It tries to keep the window of the maximal length that is still statistically consistent with the hypothesis that there has been no change in the mean signal value inside the window.

Thus, we consider two different approaches for change detection. Both approaches are aimed at finding statistically significant changes in data. The first approach, which we call Fit here, is based on monitoring the model error, and the second approach, ADWIN, is based on monitoring the data signal itself. Both approaches were recently used for change detection in the task of online prediction of the fuel mass flow in a boiler [13].

Fit: Performance monitoring-based change detection with the non-parametric test. In this study we assume that the general pattern of arousal resembles the curve shown in Figure 3. We also assume that there is no global model that predicts the general GSR signal for a person. Instead of using a global model in combination with statistical change detection methods, we opt for a method that computes local models.

If we assume that the stress level of a person is stable in between changes, the changes can be detected by monitoring the error of a locally fitted model. Given historic (preprocessed) data, the objective is to fit a simple regression model. Based on the observed Mean Squared Error for the incoming points, we can apply a statistical test (e.g. the Mann-Whitney U test [9]) to determine whether a significant change in the prediction error has occurred.

Every time a new point arrives, the data is split into two sets. The first set is a reference set that excludes the new point. The second set is a test set that includes the new point. For each of the two sets a model is trained while iteratively leaving out one of the points. When there is an overall significant difference between the two sets, the new point is considered to be a change point and a cut is made.

ADWIN: Change detection based on raw data using adaptive windowing. The ADWIN method works as follows: given a sequence of signals, it checks whether there are statistically significant differences between the means of the two parts of each possible split of the sequence. If a statistically significant difference is found, the oldest portion of the data backwards from the detected point is dropped and the splitting procedure is repeated until there are no significant differences in any possible split of the sequence. More formally, given the GSR data stream, suppose $a_1$ and $a_2$ are the means of the two subsequences resulting from a split. Then the criterion for a change detection is $|a_1 - a_2| > \epsilon_{cut}$, with

$$\epsilon_{cut} = \sqrt{\frac{1}{2a}\,\log\frac{4k}{\delta}}, \qquad a = \frac{1}{\frac{1}{k_1} + \frac{1}{k_2}}, \tag{1}$$

where $k$ is the total size of the sequence, $k_1$ and $k_2$ are the sizes of the two subsequences, and $\delta$ is the confidence parameter of the test.

III. EXPERIMENTAL STUDY
In this section we present the results of the experimental study conducted on real GSR data collected during the recent pilot field study. First, we give a concise description of the constructed dataset and experiment setup, and then provide a summary of the quantitative evaluation and some highlights of the qualitative analysis of interesting cases.

TABLE I
Number of users                    5
Number of time series              72
Time series per user (mean)        14
Mean length (samples)              98721
Number of change points overall    368
Mean change points per series      6.5

Dataset description. Table I summarizes the main characteristics of the data set. The data consists of GSR measurements of five persons taken over the course of four weeks. The data was collected from a watch-like device worn by the persons during working hours. Since the sampling rate is 4 Hz and the typical working day is roughly 8 hours, the average length of the raw time series is 98721. Altogether the data set contained 72 time series. 16 time series were excluded from the experiments for one of two reasons: the GSR level showed very low variation, or the contact of the sensors was not sufficient to yield a usable signal (these cases were detected automatically by a filter and then verified by visual inspection).

For each of the remaining 56 time series we annotated the change points based on visual inspection. Overall the set of time series contains 368 change points, with an average of around 6.5 change points per time series.

The users who participated in the study were instructed to annotate any meeting in their agenda (MS Outlook Calendar) with information about their feeling towards the meeting (“nice”, “exciting”, “neutral”, “annoying”, or “tense”). Although this information was available, it was not used in this investigation. The reason for this is that the primary objective in this work is to detect GSR peaks; however, a lot of the peaks do not correspond to any meeting recorded in the agenda. Moreover, the actual stress related to a meeting does not necessarily show up at the time of the meeting. It might precede the event (see Figure 7) or continue to influence the person afterwards (see Figure 8). In the ideal case, these labels reflect the state transitions shown in Figure 4, but in reality it is hard to discern the separate state changes.

Therefore, instead of using the working agenda annotations provided by the users, we used manually added labels based on visual inspection of the GSR time series. In the experimental study presented in this paper, we labeled only the change points, i.e. from the problem formulation perspective, each point is labeled as either a change point or not, and our arousal detection approach tries to detect the change points based on the already observed GSR values.
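The ADWIN cut criterion of Eq. (1) can be made concrete with a short sketch. This is an illustration under stated assumptions rather than the reference ADWIN implementation [2]: it tests every split of a plain Python list, and on a significant split it drops the whole portion before the split, which is one reading of "the oldest portion of the data backwards from the detected point is dropped". The function names and the default value delta = 0.05 are hypothetical choices that do not come from the paper.

```python
from math import log, sqrt


def eps_cut(k1, k2, delta):
    """Threshold of Eq. (1) for a split into sub-windows of sizes k1 and k2."""
    k = k1 + k2
    a = 1.0 / (1.0 / k1 + 1.0 / k2)
    return sqrt(log(4.0 * k / delta) / (2.0 * a))


def find_cut(window, delta=0.05):
    """Return the first split index i with |a1 - a2| > eps_cut, or None.

    a1 is the mean of window[:i] and a2 is the mean of window[i:].
    """
    k = len(window)
    total = sum(window)
    head = 0.0
    for i in range(1, k):
        head += window[i - 1]
        a1 = head / i
        a2 = (total - head) / (k - i)
        if abs(a1 - a2) > eps_cut(i, k - i, delta):
            return i
    return None


def adwin_shrink(window, delta=0.05):
    """Drop the oldest data until no split shows a significant difference.

    Assumption: everything before the detected split is dropped at once.
    """
    window = list(window)
    cut = find_cut(window, delta)
    while cut is not None:
        window = window[cut:]
        cut = find_cut(window, delta)
    return window
```

For two equally sized halves of 50 points each and delta = 0.05, `eps_cut(50, 50, 0.05)` is roughly 0.42, so a window whose halves have means 0 and 1 triggers a cut, while a constant window does not. The sketch checks the first significant split from the oldest side, so the reported index generally lies before the true change point and the window is trimmed in several rounds.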

TABLE II
True Positive rate (TP/P) and False Discovery rate (FDR), mean μ and standard deviation σ
Method    μ(TP/P)    σ(TP/P)    μ(FDR)    σ(FDR)
Fit       0.66       0.16       1.66      0.16
ADWIN     0.08       0.01       1.01      0.1

TABLE III
Method    μ(|t_a − t_d|)    σ(|t_a − t_d|)
Fit       2.8               0.54
ADWIN     2.5               1.2

Fig. 14. A flat signal followed by a high peak. On the down-curve of the high peak there are many smaller peaks that are more difficult to detect.

[Figure 15 legend: label; axis: Time, minutes]
Fig. 15. One of the stress time series and the change points. Green triangles depict the ground truth, red diamonds depict the detection of the fit method, and the blue circles depict the detection of ADWIN.

Experiment setup and evaluation. On each of the 56 time series we perform three steps: preprocess the data as discussed in the previous section (see Figures 11 and 12), apply each of the change detection methods, and compare the labels to the changes signalled by the method.

The techniques are applied to each time series in a progressive way. That means that we assume that the data arrives as a stream (one point at a time). Historic data is kept until a change point is suspected. After that a new window is created from the change point onwards.
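The comparison of labels with signalled changes can be sketched as follows. The matching rule is an assumption, since the paper does not spell it out: a detection counts as correct if its closest labelled change point lies within `tolerance` minutes, a labelled change point counts as found if at least one detection was matched to it, and all other detections are false positives. The function name `evaluate` and the reading of "a window of 5 minutes" as plus or minus 5 minutes are likewise hypothetical.

```python
from statistics import mean


def evaluate(detected, actual, tolerance=5):
    """Compare detected change points with labelled ones (times in minutes).

    Returns (true positive rate, false discovery rate, mean |t_a - t_d|),
    where the last value is None if no detection could be matched.
    """
    distances = []      # |t_a - t_d| for every matched detection
    found = set()       # labelled change points hit by some detection
    false_pos = 0
    for td in detected:
        nearest = min(((abs(ta - td), ta) for ta in actual), default=None)
        if nearest is not None and nearest[0] <= tolerance:
            distances.append(nearest[0])
            found.add(nearest[1])
        else:
            false_pos += 1
    tp_rate = len(found) / len(actual) if actual else 0.0
    fdr = false_pos / len(detected) if detected else 0.0
    mean_dist = mean(distances) if distances else None
    return tp_rate, fdr, mean_dist
```

With detections at minutes 10, 52 and 90 and labels at minutes 12, 50 and 70, two of the three labels are found, one of the three detections is a false discovery, and the mean distance of the matched detections is 2 minutes.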
The change points are evaluated by measuring the distance between the point identified by a detection algorithm as a change point and the closest actual change point within a preset boundary threshold. The reason for doing this is that there is no strict requirement that a change point should be detected at exactly the point where it occurs. We should allow for some leniency with respect to the actual time at which it is detected. Therefore, we measure the True Positive rate within a window of 5 minutes around the actual change point. Instead of the False Positive rate, the False Discovery rate is reported, since the number of True Negatives is very large with respect to the True Positives.

Results. The results of the experiments are shown in Tables II and III. As can be seen from Table II, none of the methods was able to catch all of the change points. The fit method detected more change points than ADWIN, but at the cost of more False Positives. The positioning of the change points is better handled by ADWIN.

In Figures 14 and 15 there are a lot of False Positives at the beginning of the time series. This is probably due to the online encoding. At the beginning, if the signal is flat, small fluctuations are blown up by the discretization step. This might lead to more False Positives. Yet the fit method shows this behavior along the whole length of the time series.

There are two reasons why the True Positive rate is low for ADWIN. The first is that it does not detect small peaks. The second is that it also does not detect the change in cases where the signal is slowly rising or falling (as in Figure 17).

[Figure 16 legend: label; axis: Time, minutes]
Fig. 16. Closeup of the time series in Figure 15. ADWIN clearly detects the high peaks, whereas the fit method is more sensitive to small local changes.

[Figure 17 legend: fit; axis: Time, minutes]
Fig. 17. A steadily increasing signal is not detected by ADWIN, yet there are a lot of False Positives from the fit method.

Although we did not thoroughly study the effect of the preprocessing techniques on the performance of the change detection methods, some examples indicate that when the time series is filtered and then aggregated, but not discretised with SAX, change detection may become less accurate (e.g. in Figure 18 ADWIN missed two change points; cf. ADWIN in Figure 16).

Discussion. The main difficulty of the stress detection task is that arousal comes in many different forms. Since the experiments were done in uncontrolled settings, it is difficult

GSR statistics from the calendar, e-mail correspondence and social
ADWIN fit media [18].
ADWIN An additional source of information is the similarities or
differences between persons. Each person will handle stress

4 in a different way, but some might share characteristics when

2 it comes to anticipation, relaxing, or the general impact of
stress on observable variables. Using these sources of data
160 170 180 190 200
Time, minutes
210 220 230 collected under more controlled settings we hope to be able
to get more reliable and more fine-grained categorization of
Fig. 18. Detection results of ADWIN and fit on the same time series as in stress patterns.
Figure 16, but without SAX discretisation in preprocessing.
This research has been partly supported by EIT ICT Labs,
to interpret the patterns in the data. The manual labels are Health & Wellbeing thematic line (http://eit.ictlabs.eu) and
not arbitrary, but their interpretation in terms of real arousal NWO HaCDAIS Project.
is difficult. R EFERENCES
Many examples suggest us that interpretation of the GSR [1] J. Bakker, L. Holenderski, R. Kocielnik, M. Pechenizkiy, and
data can be rather ambiguous and deciding whether a particular N. Sidorova. Stress@work: From measuring stress to its understanding,
observed pattern corresponds to stress or something else (like a prediction and handling with personalized coaching. In Proc. of the
2nd ACM SIGHIT International Health Informatics Symposium, IHI’12.
physical exercise) is a non-trivial task even for a human expert. ACM Press, 2012.
(We asked the domain expert to analyze GSR curves like the [2] A. Bifet and R. Gavaldà. Learning from time-changing data with
ones presented in the paper and he had confirmed that they adaptive windowing. In Proc. of the 7th SIAM Int. Conference on Data
Mining, SDM’07, 2007.
were ambiguous and additional information was required to [3] W. Boucsein. Electrodermal activity. New York and London: Plenum
make a confident judgement whether the peaks correspond to Press, 1992.
genuine stress or they are results of other factors). Therefore, [4] K. Glanz and M. Schwartz. Stress, coping, and health behavior. Health
behavior and health education: Theory, research, and practice, pages
even “ideal” noise-free GSR data may be insufficient for 211–236, 2008.
accurate determining the level of stress. This suggests that [5] H. S. Hayre and J. C. Holland. Cross-correlation of voice and heart rate
the reliable translation of physiological data gathered by using as stress measures. Applied Acoustics, 13(1):57 – 62, 1980.
[6] D. Kifer, S. Ben-David, and J. Gehrke. Detecting change in data streams.
sensor technology into the “stress level rates” is only possible In Proceedings of the International Conference on Very Large Data
when additional sources of information are available. For Bases, pages 180–191, Toronto, Canada, 2004. Morgan Kaufmann.
example, apart from the GSR measurements, we can also use [7] P. J. Lang, M. K. Greenwald, M. M. Bradley, and A. O. Hamm.
Looking at pictures: Affective, facial, visceral, and behavioral reactions.
measurements of acceleration in three dimensions. Exploring Psychophysiology, 30(3):261–273, 1993.
the potential of accelerometer data for detecting the activity [8] J. Lin, E. J. Keogh, L. Wei, and S. Lonardi. Experiencing SAX: a
context (e.g. physical exercises, walking, active discussion etc) novel symbolic representation of time series. Data Min. Knowl. Discov.,
15(2):107–144, 2007.
is an interesting direction for further research. [9] H. B. Mann and D. R. Whitney. On a test of whether one of two
Other sources of additional data may include subjective random variables is stochastically larger than the other. Annals of Math.
user feedback collected via questionnaires, annotation of the Statistics, 18:50–60, 1947.
[10] S. Michie. Causes and management of stress at work. Occupational
events/signal, etc., as well as various external data extracted and Environmental Medicine, 59(1):67, 2002.
e.g. from the social media, e-mail correspondence or electronic [11] E. S. Page. Continuous inspection schemes. Biometrika, 41(1/2):100–
agendas. Having access to such additional data facilitates the 115, 1954.
[12] P. Paoli, D. Merllié, and F. per a la Millora. Third European survey on
use of pattern mining for finding relations between the in- working conditions 2000. European Foundation for the Improvement of
creases and decreases in the stress level with the characteristics Living and Working Conditions, 2001.
of the events of daily lives (what, where, when, with whom, [13] M. Pechenizkiy, J. Bakker, I. Žliobaitė, A. Ivannikov, and T. Kärkkäinen.
Online mass flow prediction in cfb boilers with explicit detection of
etc.). sudden concept drift. SIGKDD Explor. Newsl., 11:109–116, May 2010.
[14] W. K. Pratt. Digital Image Processing. John Wiley & Sons, 1978.
IV. C ONCLUSIONS AND F UTURE W ORK [15] N. D. Ramirez-Beltran and J. A. Montes. Neural networks for on-
line parameter change detections in time series models. Computers &
The detection of stressful events is a challenging task. Industrial Engineering, 33(1-2):337 – 340, 1997. Proc. of the 21st Int.
The information coming from sensor measurements is highly Conference on Computers and Industrial Engineering.
ambiguous and dependent on hidden contexts. The detection [16] P. Sanches, K. Höök, E. K. Vaara, C. Weymann, M. Bylund, P. Ferreira,
N. Peira, and M. Sjölinder. Mind the body!: Designing a mobile stress
of separate stress peaks in the GSR data is also challenging management application encouraging personal reflection. In Conference
due to the varieties of patterns in the data. Moreover, it is on Designing Interactive Systems, pages 47–56, 2010.
not clear without additional information whether certain peaks [17] M. Severo and J. Gama. Change detection with Kalman Filter and
CUSUM. In Ubiquitous Knowledge Discovery, LNCS 6202, pages 148–
correspond to a significant physiological process and how to 162. Springer Berlin / Heidelberg, 2010.
categorize them if they do. [18] E. Tromp and M. Pechenizkiy. Senticorr: Multilingual sentiment analysis
In the further work, we plan to mine different sources of of personal correspondence. In Proc. of IEEE ICDM 2011 Workshops.
IEEE Press, 2011.
data for stress detection and categorization. This includes the


