Towards Affect-sensitive Assistive Intervention
Technologies for Children with Autism
Karla Conn, Changchun Liu, Nilanjan Sarkar,
Wendy Stone and Zachary Warren
Vanderbilt University
USA
1. Introduction
Investigation into technology-assisted intervention for children with autism spectrum
disorder (ASD) has gained momentum in recent years. Therapists involved in interventions
must overcome the communication impairments generally exhibited by children with ASD
by adeptly inferring the affective cues of the children to adjust the intervention accordingly.
Similarly, an intelligent system, such as a computer or robot, must also be able to
understand the affective needs of these children - an ability that the current technology-assisted ASD intervention systems lack - to achieve effective interaction that addresses the
role of affective states in human-computer interaction (HCI), human-robot interaction (HRI),
and intervention practice. In this chapter we present a physiology-based affect-inference
mechanism for emotion modeling, emotion recognition, and emotion-sensitive adaptive
response in technology-assisted intervention. This work is the first step towards developing
“understanding” interactive technologies for use in future ASD intervention. We address
the problem of how to make computer-based ASD intervention tools affect-sensitive by
designing therapist-like affective models of the children with ASD based on their
physiological responses. By employing these models, we explain how a robot can detect the
affective states of a child with ASD and adapt its behaviors accordingly. Experimental
results with 6 children with ASD from computer-based cognitive tasks and a proof-of-concept experiment (i.e., a robot-based basketball game) are presented. A Support Vector
Machines (SVM) based affective model yielded approximately 82.9% success for predicting
affect inferred from a therapist. The robot learned the individual liking level of each child
with regard to the game configuration and selected appropriate behaviors to present the
task at his/her preferred liking level. Results show the robot automatically predicted
individual liking level in real time with 81.1% accuracy. This is the first time, to our
knowledge, that the affective states of children with ASD have been detected via a
physiology-based affect recognition technique in real time. This is also the first time that the
impact of affect-sensitive closed-loop interaction between a robot and a child with ASD has
been demonstrated experimentally.
While there is at present no single accepted intervention, treatment, or known cure for ASD,
there is growing consensus that intensive behavioral and educational intervention programs
can significantly improve long term outcomes for individuals and their families (Rogers,
Affective Computing, Focus on Emotion Expression, Synthesis and Recognition
1998). Despite the urgent need and societal import of intensive treatment (Rutter, 2006),
appropriate intervention resources for children with ASD and their families are often
extremely costly when accessible (Jacobson et al., 1998; Tarkan, 2002). Therefore, an
important new direction for research on ASD is the identification and development of
assistive intervention tools that can make application of intensive treatment more readily
accessible.
In response to this need, a growing number of studies have been investigating the
application of advanced interactive technologies to address core deficits related to autism,
namely computer technology (Bernard-Opitz et al., 2001), virtual reality environments
(Pares et al., 2005; Parsons & Mitchell, 2002), and robotic systems (Dautenhahn & Werry,
2004; Kozima et al., 2005; Michaud & Theberge-Turmel, 2002; Pioggia et al., 2005; Scassellati,
2005). Initial results indicate that such technologies may hold promise for rehabilitation of
children with ASD. Computer and virtual reality (VR) based intervention may provide a
simplified but exploratory interaction environment for children with ASD (Moore et al.,
2000; Parsons & Mitchell, 2002). Various software packages and VR environments have been
developed and applied to address specific deficits associated with autism, e.g.,
understanding of false belief (Swettenham, 1996), attention (Trepagnier et al., 2006),
expression recognition (Silver & Oakes, 2001), and social communication (Bernard-Opitz et
al., 2001; Parsons et al., 2005). Research suggested that robots can allow simplified but
embodied social interaction that is less intimidating or confusing for children with ASD
(Dautenhahn & Werry, 2004). Michaud & Theberge-Turmel (2002) investigated the impact of
robot design on the interactions with children and emphasized that systems need to be
versatile enough to adapt to the varying needs of different children. Pioggia et al. (2005)
developed an interactive life-like facial display system for enhancing emotion recognition in
individuals with ASD. Robots have also been used to teach basic social interaction skills
using turn-taking and imitation games, and the use of robots as social mediators and as
objects of shared attention can encourage interaction with peers and adults (Dautenhahn &
Werry, 2004; Kozima et al., 2005). Interactive technologies offer the advantage of furnishing
robust systems that can support multimodal interaction and provide a repeatable,
standardized stimulus while quantitatively recording and monitoring the performance
progress of the children with ASD to assess the intervention approaches (Scassellati, 2005).
By employing human-computer interaction (HCI) and human-robot interaction (HRI)
technologies, interactive therapeutic tools can partially automate the time-consuming,
routine behavioral therapy sessions and may allow intensive intervention to be conducted at
home (Dautenhahn & Werry, 2004). For the purpose of using our affective computing tools,
computers or robots could be the mode of technology for assisted ASD interventions. We
will use the term intelligent system primarily in this text to imply both computer and robot
interactive technologies.
Even though there is increasing research in assistive technologies for autism intervention,
the authors found no published studies that specifically addressed how to automatically
detect and respond to affective cues of children with ASD. This could be important since
research suggests that people tend to interact with computers as they might relate to other
people, provided that the technology behaves in a socially competent manner (Reeves &
Nass, 1996). We believe that such ability could be critical given the importance of human
affective information in HRI (Fong et al., 2003; Picard, 1997) and the significant impacts of
the affective factors of children with ASD on the intervention practice (Seip, 1996). As is
common in autism intervention, therapists who work with children with ASD continuously monitor
affective cues of the children in order to make appropriate decisions about adaptations to
their intervention strategies. For example, a ‘likes and dislikes chart’ is recommended to
record the children’s preferred activities and/or sensory stimuli during interventions that
could be used as reinforcers and/or ‘alternative behaviors’ (Seip, 1996). Children with
autism are particularly vulnerable to anxiety and intolerant of feelings of frustration, which
requires a therapist to plan tasks at an appropriate level of difficulty (Ernsperger, 2003). The
engagement of children with ASD is the basis for ‘floor-time therapy’, which helps
them develop relationships and improve their social skills (Wieder & Greenspan, 2005).
The potential impacts brought by an intelligent system that can detect the affective states of
a child with ASD and interact with him/her based on such perception could be varied.
Complex social stimuli, sophisticated interactions, and unpredictable situations could be
gradually but automatically introduced when the robot recognizes that the child is
comfortable or not anxious at a certain level of interaction dynamics for a reasonably long
period of time. A therapist could use the child’s affective records to analyze the therapeutic
approach. With the record of the activities and the consequent emotional changes in a child,
an intelligent system could learn individual affective characteristics over time and thus
could adapt the ways it responds to the needs of different children.
The primary objective of the current research is to investigate how to augment HCI and HRI
to be used in autism intervention by endowing the intelligent system with the ability to
recognize and respond to the affective states of a child with ASD. To achieve this objective,
the research is divided into two phases. Phase I represents the development of affective
models through psychophysiological analysis, which includes designing cognitive tasks for
affect-elicitation, deriving physiological features via signal processing, and developing
affective models using machine learning techniques. Phase II is characterized by the
investigation of affect sensitivity during closed-loop interaction between a child with ASD
and the intelligent system (i.e., computer, VR environment, or robot). A proof-of-concept
experiment was designed wherein a robot learns individual preferences based on the
predicted liking level of the children with ASD and in real time selects an appropriate
behavior accordingly.
The chapter is organized as follows: The scope and rationale of this work are presented in
Section 2. Section 3 describes our use of physiological indices for affect recognition and our
proposed framework for automatically detecting and responding to affective cues of
children with ASD in closed-loop interaction, as well as the experimental design. This
description is followed by a detailed results and discussion section (Section 4). Finally,
Section 5 summarizes the contributions of the chapter and outlines possible future directions
of this research. In addition, the machine learning algorithms employed in this study are
presented in the Appendix.
2. Scope and rationale
The overview of the affect-sensitive closed-loop interaction between a child with ASD and
an intelligent system is presented in Fig. 1. The physiological signals from the children with
ASD are recorded when they are interacting with the system. These signals are processed in
real time to extract features, which are fed as input into the models developed in Phase I.
The models determine the perceived affective cues and return this information as an output.
The affective information, along with other environmental inputs, is used by a controller to
decide the next course of action for the intelligent system. The child who engages with the
system is then influenced by the system’s behavior, and the closed-loop interaction cycle
begins anew. The impact of such an interaction with a robot as the intelligent system is
evaluated in Phase II.
Fig. 1. Framework overview
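The cycle summarized in Fig. 1 can be sketched in a few lines of Python. This is only an illustrative outline under stated assumptions: the function names (`closed_loop_step`, `toy_controller`), the callable interfaces, and the toy threshold are hypothetical stand-ins and do not describe the actual system used in the experiments.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Features = List[float]


@dataclass
class AffectEstimate:
    """Predicted level ("high" or "low") for each target affective state."""
    levels: Dict[str, str]


def closed_loop_step(
    read_signals: Callable[[], Sequence[float]],
    extract_features: Callable[[Sequence[float]], Features],
    recognizers: Dict[str, Callable[[Features], str]],
    controller: Callable[[AffectEstimate, dict], str],
    environment: dict,
) -> str:
    """Run one pass of the loop: signals -> features -> affect -> next action."""
    raw = read_signals()                      # physiological signals from the child
    features = extract_features(raw)          # real-time feature extraction
    estimate = AffectEstimate(                # one recognizer per affective state
        {state: model(features) for state, model in recognizers.items()}
    )
    return controller(estimate, environment)  # controller picks the next behavior


# Toy stand-ins so the sketch runs; none of them reflect the real system.
def toy_controller(estimate: AffectEstimate, environment: dict) -> str:
    return "keep_behavior" if estimate.levels["liking"] == "high" else "switch_behavior"


action = closed_loop_step(
    read_signals=lambda: [0.2, 0.4, 0.9],
    extract_features=lambda raw: [sum(raw) / len(raw)],
    recognizers={"liking": lambda f: "high" if f[0] > 0.4 else "low"},
    controller=toy_controller,
    environment={},
)
print(action)
```

Passing the recognizers as a dictionary keyed by affective state mirrors the idea of one recognizer per target state, so a state can be added or removed without touching the loop itself.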
Human interactions with technology are characterized by explicit as well as implicit
channels of communication with presumed underlying affective states (Picard, 1997). While
the explicit channel transmits overt messages, the implicit one transmits hidden messages
about the communicator (e.g., his/her intention and attitude). There is a growing consensus
that endowing an intelligent system with an ability to understand implicit affective cues
should permit more meaningful and natural HCI and HRI (Picard, 1997). There are several
modalities such as facial expression (Bartlett et al., 2003), vocal intonation (Lee &
Narayanan, 2005), gestures and postures (Asha et al., 2005; Kleinsmith et al., 2005), and
physiology (Kulic & Croft, 2007; Liu et al., 2006; Mandryk & Atkins, 2007; Picard et al., 2001;
Rani et al., 2004) that can be utilized to evaluate the affective states. In this work we chose to
create affective models based on physiological data for several reasons. Children with ASD
often have communicative impairments (both nonverbal and verbal), particularly regarding
expression of affective states (DSM-IV-TR, American Psychiatric Association, 2000; Green et
al., 2002; Schultz, 2005). These vulnerabilities place limits on traditional conversational and
observational methodologies; however, physiological signals are continuously available and
are not necessarily directly impacted by these difficulties (Ben Shalom et al., 2006; Groden et
al., 2005; Toichi & Kamio, 2003). As such, physiological modeling may represent a
methodology for gathering rich data despite the potential communicative impairments of
children with ASD. In addition, physiological data may offer an avenue for recognizing
aspects of affect that may be less obvious for humans but more suitable for computers by
using signal processing and pattern recognition tools. Furthermore, there is evidence that
the transition from one affective state to another state is accompanied by dynamic shifts in
indicators of Autonomic Nervous System (ANS) activity (Bradley, 2000). The physiological
signals that have been used in this research consist of various cardiovascular, electrodermal,
electromyographic, and body temperature signals, all of which have been extensively
investigated in psychophysiology literature (Bradley, 2000).
An important question when estimating human affective response is how to operationalize
the affective states. Although much existing research on affective modeling categorizes
affective states into “basic emotions,” there is no consensus on a set of basic emotions
among the researchers (Cowie et al., 2001). This fact implies that pragmatic choices are
required to select target affective states for a given application (Cowie et al., 2001). In this
research we chose anxiety, engagement, and liking to be the target affective states. Anxiety
was chosen for two primary reasons. First, anxiety plays an important role in various
human-machine interaction tasks that can be related to task performance (Brown et al.,
1997). Second, anxiety frequently co-occurs with ASD and plays an important role in the
behavior difficulties of children with autism (Gillott et al., 2001). Engagement, defined as
“sustained attention to an activity or person,” has been regarded as one of the key factors for
children with ASD to make substantial gains in academic, communication, and social
domains (Ruble & Robson, 2006). With ‘playful’ activities during the intervention, the liking
of the children (i.e., the enjoyment they experience when interacting with an intelligent
system) may create the urge to explore and allow prolonged interaction for the children
with ASD, who are susceptible to being withdrawn (Dautenhahn & Werry, 2004).
Notably, there is evidence that several affective states could co-occur at different arousal
levels (Vansteelandt et al., 2005), and different individuals could express the same emotion
with different characteristic response patterns under the same contexts (i.e., phenomenon of
person stereotypy) (Lacey & Lacey, 1958). The novelty of the presented affective modeling is
that it is individual-specific to accommodate the differences encountered in emotional
expression, and it consists of an array of recognizers – each of which determines the
intensity of one target affective state for each individual. In this work, a therapist observed
the experiments (described in Section 3.2.2) and provided subjective reports based on
expertise in inferring presumable underlying affective states from the observable behaviors
of children with ASD. The therapist’s reports on perceived intensity of the affective states of
a child and the extracted physiological indices (described in Section 3.2.4) were employed to
develop therapist-like affect recognizers that predict high/low levels of anxiety, engagement,
and liking for each child with ASD.
Once affective modeling was completed in Phase I, the recognizers equipped the intelligent
system with the capability to detect the affective states of the children with ASD in real time
from on-line extracted physiological features, which could be utilized in future interventions
even when a therapist is not available. As stated in (Dautenhahn et al., 2003), it is important
to have robots maintain characteristics of adaptability when applied to autism intervention.
In Phase II, we designed and implemented a proof-of-concept experiment (robot-based
basketball) wherein a robot adapts its behaviors in real time according to the preference of a
child with ASD, inferred from the interaction experience and the predicted consequent
liking level. This work is the first time, to our knowledge, that the feasibility and the impact
of affect-sensitive closed-loop interaction between a robot and a child with ASD have been
demonstrated experimentally. While the results are achieved in a non-social interaction task,
it is expected that the real-time affect recognition and response system described in this
work will provide a basis for future research into developing technology-assisted
intervention tools to help children with ASD explore social interaction dynamics in an
affect-sensitive and adaptive manner.
3. Experimental investigation
3.1 Participants
Given the nature of autism (a spectrum disorder), which implies vast individual differences,
work on assistive tools for autism intervention is generally guided by the individual
characteristics, needs, and preferences of the children (i.e., an individual-specific approach) and
focuses on one segment of the population to develop a method with the flexibility to make future
modifications for a wider part of the population (Pioggia et al., 2005; Robins et al., 2005;
Robins et al., 2004; Werry et al., 2001). The spectrum nature of autism and the phenomenon
of person stereotypy (Lacey & Lacey, 1958) led us to choose an individual-specific approach
to work on a long-term basis with a small group of children with ASD in order to evaluate
our affect-sensitive intelligent system.
Six participants within the age range of 13 to 16 years old volunteered to partake in the
experiments with the consent of their parents. Each of the participants had a diagnosis on
the autism spectrum, either autistic disorder, Asperger's Syndrome, or pervasive
developmental disorder not otherwise specified (PDD-NOS), according to their medical
records. Due to the nature of the designed cognitive tasks (as described in Section 3.2.1), the
following were considered when choosing the participants: (i) having a minimum
competency level of age-appropriate language and cognitive skills and (ii) not having any
history of mental retardation. Each child with ASD underwent the Peabody Picture
Vocabulary Test III (PPVT-III) (Dunn & Dunn, 1997) to assess cognitive function. The PPVT-III is a measure of single-word receptive vocabulary that is often used as a proxy for
intelligence quotient (IQ) testing (Dunn & Dunn, 1997). It provides standard scores with a
mean of 100 and a standard deviation of 15. The PPVT-III measure has high correlations
with standardized tests such as the Stanford-Binet Intelligence Scale and the Wechsler
Intelligence Scale for Children (Bee & Boyd, 2004), and DSM-IV-TR (2000) classifies full scale
IQs above 70 as nonretarded. Inclusion in our study required obtaining a
standard score of 80 or above on the PPVT-III measure. Table 1 shows the characteristics of
the participants in the experiments. The group sizes and the cardinality of participant age
range of many studies on technology-assisted autism intervention are commensurate with
our work when an individual-specific approach was used (Pioggia et al., 2005; Robins et al.,
2005; Robins et al., 2004; Werry et al., 2001). The affective modeling was performed based on
a large sample size of observations (approximately 85 epochs over 6 hours) for each child
with ASD, which is comparatively more extensive than many other works (Groden et al.,
2005; Pioggia et al., 2005; Robins et al., 2004).
Child ID   Gender   Age   Diagnosis             PPVT-III Score
A          Male     15    Autistic Disorder     99
B          Male     15    Asperger's Syndrome   80
C          Male     13    Autistic Disorder     81
D          Male     14    PDD-NOS               92
E          Male     16    PDD-NOS               93
F          Female   14    PDD-NOS               83
Table 1. Characteristics of Participants.
3.2 Phase I – affective modeling
While the impact of Phase II is evaluated on affect-sensitive human-robot interaction, we
built the affective models using physiological data gathered from two human-computer
interaction tasks. Our previous work (Rani et al., 2006a) showed that affective models built
through human-computer interaction tasks could be successfully employed to achieve affect
recognition in human-robot interaction for typical individuals. This observation suggests
that it is possible to broaden the domain of tasks for affective modeling, thus reducing the
habituation effect of continuous exposure to the same robotic system.
3.2.1 Task design for affect elicitation during cognitive tasks
Two computer-based cognitive tasks – an anagram-solving task and a Pong-playing task –
were designed to evoke varying intensities of the following three affective states: anxiety,
engagement, and liking, from the participants. In the anagram task, affective responses were manipulated by
presenting the participant with anagrams of varying difficulty levels. For example, a long
series of trivially easy anagrams caused less engagement. The Pong task involved the
participant playing a variant of the classic video game “Pong.” Various parameters of the
game were manipulated to elicit the required affective responses: ball speed and size,
paddle speed and size, a sluggish or over-responsive keyboard, and the level of the computer
opponent player. For example, very high speeds and a sluggish or over-responsive keyboard
caused anxiety at times, and playing against a moderate-level computer player usually
generated liking. The task configurations were established through pilot work.
Each task sequence was subdivided into a series of discrete trials/epochs that were bounded
by the subjective affective state assessments. These assessments were collected using a
battery of five questions regarding the three target affective states and the perceived
difficulty and performance rated on an eight-point Likert scale where 1 indicated the lowest
level and 8 indicated the maximum level. Each participant took part in six sessions – three
one-hour sessions of anagrams and three one-hour sessions of Pong – on six different days.
3.2.2 Experimental setup
Fig. 2 shows the setup for the experiment. A child with ASD was involved in the cognitive
tasks on computer C1 while his/her physiological data were acquired via wearable
biofeedback sensors and the Biopac system (www.biopac.com). After being amplified and
digitized, physiological signals were transferred from the Biopac transducers to C2 through
an Ethernet link and stored. C1 was also connected to the Biopac system via a parallel port,
through which the physiological data were recorded in a time-synchronized manner. To
gain perspective from different sources and enhance the reliability of the subjective report, a
therapist with experience in working with children with ASD and a parent of the participant,
who may know the participant best, were also involved in the study. We video recorded the
sessions to cross-reference observations made during the experiment. The signal from the
video camera was routed to a television, and the signal from the participant's computer
screen where the task was presented was routed to a separate computer monitor M2. The
therapist and a parent were seated at the back of the experiment room, watching the
experiment from the view of the video camera and observing how the task progressed on
the separate monitor.
Fig. 2. Experimental setup for affective modeling tasks
3.2.3 Experimental procedure
On the first visit, participants completed the PPVT-III measurement to determine eligibility
for the experiments. After initial briefing regarding the tasks, physiological sensors from a
Biopac system were attached to the participant's body. Participants were asked to relax in a
seated position and read age-appropriate leisure material while a three-minute baseline
recording was performed, which was later used to offset day-variability. Each session lasted
about an hour and consisted of a set (13-15) of either 3-minute epochs for anagram tasks or
up to 4-minute epochs for Pong tasks. Each epoch was followed by subjective report
questions rated on an eight-point Likert scale. The three sets of reports (from the
participant, the parent, and the therapist) were used as the
possible reference points to link the objective physiological measures to the participant's
affective state.
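The text does not specify how the baseline recording was used to offset day-to-day variability. One simple possibility, shown here purely as an assumed scheme, is to subtract the same-day baseline mean from each epoch feature. The function name `baseline_correct`, the feature names, and the numbers are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def baseline_correct(
    epoch_features: Dict[str, float],
    baseline_samples: Dict[str, List[float]],
) -> Dict[str, float]:
    """Subtract the same-day baseline mean from each epoch feature.

    Assumed scheme for illustration; the chapter does not state the exact method.
    """
    return {
        name: value - mean(baseline_samples[name])
        for name, value in epoch_features.items()
    }


# Hypothetical values for one session: baseline samples and one task epoch.
baseline = {"mean_ibi_s": [0.80, 0.82, 0.78], "tonic_sc_uS": [2.0, 2.2, 2.1]}
epoch = {"mean_ibi_s": 0.70, "tonic_sc_uS": 3.1}

corrected = baseline_correct(epoch, baseline)
print(corrected)
```

Expressing features relative to the session's own baseline keeps a day with an unusually high resting level from being mistaken for a task-induced change.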
3.2.4 Physiological indices for affective modeling
There is good evidence that the physiological activity associated with affective states can be
differentiated and systematically organized (Bradley, 2000). Cardiovascular and
electromyogram activities have been used to examine positive and negative affective states
of people (Cacioppo et al., 2000; Papillo & Shapiro, 1990). Electrodermal activities have been
shown to be associated with task engagement (Pecchinenda & Smith, 1996). The variation of
peripheral temperature due to emotional stimuli was studied by Kataoka et al. (1998). In this
work, we exploited the dependence of physiological responses on underlying affective
states to develop affective models for children with ASD by using the machine learning
method as described in Section 3.2.5 and Appendix 1. The physiological signals we
examined were: various cardiovascular activities including electrocardiogram (ECG),
impedance cardiogram (ICG), photoplethysmogram (PPG), and phonocardiogram
(PCG)/heart sound; electrodermal activities (EDA) including tonic and phasic responses
from skin conductance; electromyogram (EMG) activities from corrugator supercilii,
zygomaticus major, and upper trapezius muscles; and peripheral temperature. These signals
were selected because they are likely to demonstrate variability as a function of the targeted
affective states, can be measured non-invasively, and are relatively resistant
to movement artifacts (Lacey & Lacey, 1958; Dawson et al., 1990). Further details of the
physiological signals examined in this work along with the features derived from each
signal can be found in our supplementary publication Rani et al. (2006b).
The physiological signals were acquired using the Biopac MP150 data acquisition system
(www.biopac.com). ECG was measured from the chest using the standard two-electrode
configuration. ICG describes the changes of thorax impedance due to cardiac contractility
and was measured by four pairs of surface electrodes that were longitudinally configured
on both sides of the body. A microphone specially designed to detect heart sound waves
was placed on the chest to measure PCG. PPG, peripheral temperature, and EDA were
measured from the middle finger, the thumb, and the index and ring fingers of the non-dominant hand, respectively. EMG was measured by placing surface electrodes on two
facial muscles (corrugator supercilii and zygomaticus major) and an upper back muscle
(upper trapezius). Fig. 3 shows the sensor setup. The sampling rate was fixed at 1000 Hz for
all the channels. Appropriate amplification and band-pass filtering were performed.
Subsequently, emotional stimulus induced by cognitive tasks was applied in epochs of up to
four minutes in length (as described in Section 3.2.1).
Fig. 3. Sensor Setup. (a) shows the position of facial EMG sensors and (b) shows the
placement of sensors on non-dominant hand.
Signal processing techniques such as Fourier transform, wavelet transform, thresholding,
and peak detection were used to derive the relevant features from the physiological signals.
For example, inter-beat interval (IBI) is the time interval between two successive “R” waves in the
electrocardiogram (ECG) waveform. Power spectral analysis is performed on the IBI data to
localize the sympathetic and parasympathetic nervous system activities associated with two
frequency bands, i.e., the high (0.15-0.4 Hz) and low (0.04-0.15 Hz) frequency components.
The photoplethysmogram (PPG) signal measures changes in the volume of blood in the finger
tip associated with the pulse cycle and provides an index of the relative constriction versus
dilation of the blood vessels in the periphery. Pulse transit time (PTT) is estimated by
computing the time between systole at the heart (as indicated by the R-wave of the ECG)
and the peak of the pulse wave reaching the peripheral site where PPG is being measured.
The features extracted from the heart sound signal consist of the mean and standard
deviation of the 3rd, 4th, and 5th level coefficients of the Daubechies wavelet transform.
Bioelectrical impedance analysis (BIA) measures the impedance or opposition to the flow of
an electric current through the body fluids contained mainly in the lean and fat tissue. A
common variable in recent psychophysiology research, pre-ejection period (PEP) is derived
from ICG and ECG and is most heavily influenced by sympathetic innervation of the heart.
EDA consists of two main components: the tonic response and the phasic response. Tonic skin
conductance refers to the ongoing or baseline level of skin conductance in the absence of
any particular discrete environmental events. Phasic skin conductance refers to the
event-related changes, which appear as momentary increases in skin conductance
(resembling a peak). The EMG signal from the corrugator supercilii muscle (eyebrow) captures
a person's frown and detects the tension in that region. This EMG signal is also a valuable
source of blink information and helps determine the blink rate. The EMG signal from the
zygomaticus major muscle captures the muscle movements while smiling. Upper trapezius
muscle activity measures the tension in the shoulders, one of the most common sites in the
body for developing stress. Variations in the peripheral temperature mainly come from
localized changes in blood flow caused by vascular resistance or arterial blood pressure and
reflect the autonomic nervous system activity.
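Three of the cardiovascular features described above (IBI, PTT, and band-limited spectral power of the IBI series) can be illustrated with a short, dependency-free sketch. It assumes that R-wave and PPG pulse-peak times have already been detected and that the IBI series has been resampled onto an even time grid before spectral analysis. The function names, the 4 Hz resampling rate, and the synthetic data are assumptions made for illustration, not the authors' processing pipeline.

```python
import math
from typing import List, Sequence


def inter_beat_intervals(r_peak_times: Sequence[float]) -> List[float]:
    """Time differences (s) between successive R-wave peaks."""
    return [t1 - t0 for t0, t1 in zip(r_peak_times, r_peak_times[1:])]


def pulse_transit_times(
    r_peak_times: Sequence[float], ppg_peak_times: Sequence[float]
) -> List[float]:
    """Delay from each R-wave to the first PPG peak before the next R-wave."""
    ptt: List[float] = []
    j = 0
    for i, r in enumerate(r_peak_times):
        next_r = r_peak_times[i + 1] if i + 1 < len(r_peak_times) else math.inf
        while j < len(ppg_peak_times) and ppg_peak_times[j] <= r:
            j += 1  # skip pulse peaks that belong to earlier beats
        if j < len(ppg_peak_times) and ppg_peak_times[j] < next_r:
            ptt.append(ppg_peak_times[j] - r)
    return ptt


def band_power(samples: Sequence[float], fs: float, f_lo: float, f_hi: float) -> float:
    """Periodogram power of an evenly sampled series in [f_lo, f_hi), naive DFT."""
    n = len(samples)
    avg = sum(samples) / n
    x = [s - avg for s in samples]  # remove the mean before the transform
    power = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq < f_hi:
            re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            im = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
            power += (re * re + im * im) / n
    return power


# Synthetic example: hypothetical peak times (s) and an IBI series sampled at 4 Hz
# that oscillates at 0.1 Hz, i.e. inside the low-frequency band.
ibi = inter_beat_intervals([0.0, 0.8, 1.6, 2.5])
ptt = pulse_transit_times([0.0, 0.8, 1.6], [0.25, 1.05, 1.86])
fs = 4.0
series = [0.8 + 0.05 * math.sin(2 * math.pi * 0.1 * t / fs) for t in range(400)]
lf = band_power(series, fs, 0.04, 0.15)
hf = band_power(series, fs, 0.15, 0.40)
print(ibi, ptt, lf, hf)
```

The naive DFT is used only to keep the sketch self-contained; an FFT-based or Welch-style estimate would normally be preferred for signals of realistic length.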
3.2.5 SVM-based affective modeling
Determining the intensity (e.g., high/low) of a particular affective state from the
physiological response resembles a classification problem where the attributes are the
physiological features and the target function is the degree of arousal. Our earlier work
(Rani et al., 2006b) compared the efficacy of several machine learning algorithms (KNN,
Bayesian Network Technique, Regression Tree, and SVM) to recognize the affective states
from the physiological signals of typical individuals and found that SVM gave the highest
classification accuracy. In this work, SVM was employed to determine the underlying
affective state of a child with ASD given a set of physiological features. Details of the theory
and learning methods of SVM can be found in (Vapnik, 1998) and are briefly described in
Appendix 1.
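For orientation, the standard soft-margin SVM with an RBF kernel can be written as follows. The notation here is generic and may differ from that of Appendix 1: $\mathbf{x}_i$ is taken to be the physiological feature vector of epoch $i$, $y_i \in \{-1, +1\}$ its low/high label, $\xi_i$ the slack variables, $C$ the regularization parameter, and $\gamma$ the kernel parameter.

```latex
\begin{aligned}
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}}\quad
  & \tfrac{1}{2}\lVert \mathbf{w}\rVert^{2} + C\sum_{i=1}^{N}\xi_i \\
\text{subject to}\quad
  & y_i\bigl(\mathbf{w}^{\top}\phi(\mathbf{x}_i) + b\bigr) \ge 1 - \xi_i,
    \qquad \xi_i \ge 0, \quad i = 1,\dots,N, \\[4pt]
K(\mathbf{x}_i,\mathbf{x}_j)
  &= \phi(\mathbf{x}_i)^{\top}\phi(\mathbf{x}_j)
   = \exp\!\bigl(-\gamma\,\lVert \mathbf{x}_i-\mathbf{x}_j\rVert^{2}\bigr), \\
f(\mathbf{x})
  &= \operatorname{sign}\!\Bigl(\sum_{i=1}^{N}\alpha_i\,y_i\,K(\mathbf{x}_i,\mathbf{x}) + b\Bigr).
\end{aligned}
```

A larger $C$ penalizes margin violations more heavily, while $\gamma$ controls how quickly the influence of a training epoch decays with distance in feature space.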
As illustrated in Fig. 4, each participant had a data set comprised of both the objective
physiological features and corresponding subjective reports on arousal level of target
affective states from the therapist, the parent, and the participant. The physiological features
were extracted by using the approaches described in Section 3.2.4. The individual range per
affective state from each reporter on the subjective reports was normalized to [0, 1] and then
discretized such that 0–0.50 was labeled as low level and 0.51–1 was labeled as high level.
All three affective states were partitioned separately so that there were two levels for each
affective state. Each data set contained approximately 85 epochs. The multiple subjective
reports were analyzed, and one was chosen as the reference point linking the
physiological measures to the participant's affective state. For example, a therapist-like affect
recognizer can be developed when the therapist’s reports are used. An SVM-based recognizer
was trained on each individual’s data set for each target affective state. In this work, in order
to deal with the nonlinearly separable data, soft margin classifiers with slack variables were
used to find a hyperplane with less restriction (Eqn. 1, Appendix 1) (Burges, 1998). RBF
(Radial Basis Function) was selected as the kernel function because it often delivers better
performance (Burges, 1998). A ten-fold cross-validation was used to determine the kernel
parameter and regularization parameter (Eqn. 2, Appendix 1) of the recognizer.
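The labeling step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the min-max form of the normalization, and the sample ratings are assumptions made for the example. Only the [0, 1] range and the low/high split at 0.50 come from the text.

```python
from typing import List


def normalize(ratings: List[float]) -> List[float]:
    """Min-max scale one reporter's ratings for one affective state to [0, 1]."""
    lo, hi = min(ratings), max(ratings)
    if hi == lo:
        # A constant rating series has no range; map it to 0.0 (an assumption).
        return [0.0 for _ in ratings]
    return [(r - lo) / (hi - lo) for r in ratings]


def discretize(normalized: List[float], cutoff: float = 0.5) -> List[str]:
    """Label values at or below the cutoff 'low' and values above it 'high'."""
    return ["low" if v <= cutoff else "high" for v in normalized]


# Hypothetical therapist ratings of liking over eight epochs (not study data).
therapist_liking = [2, 7, 5, 9, 1, 4, 8, 6]
labels = discretize(normalize(therapist_liking))
print(labels)
```

Each reporter and each affective state would be processed separately, so a reporter who uses only part of the rating scale still produces both labels.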
Fig. 4. Overview of affective modeling
Once affective modeling is accomplished, the affect recognizers can accept as input the
physiological features extracted on-line and produce as output the probable level of the
target affective state of a child with ASD while interacting with an intelligent system. In the
design for the human-robot interaction task in Phase II, adequate measures were taken to
prevent physical effort from overwhelming the physiological response.
3.3 Phase II - closed-loop human robot interaction
3.3.1 Task design for affect-sensitive behavior adaptation task
A closed-loop human robot interaction task, “robot-based basketball (RBB),” was designed.
The main objective was two-fold: (i) to enable the robot to learn the preference of the
children with ASD implicitly using physiology-based affective models as well as select
appropriate behaviors accordingly; and (ii) to observe the effects of such affective-sensitivity
in the closed-loop interaction between the children with ASD and the robot.
The affective model developed in Phase I is capable of predicting the intensity of liking,
anxiety, and engagement simultaneously. However, to designate a specific objective for the
experiment in Phase II without compromising its proof-of-concept purpose, one of the three
target affective states was chosen to be detected and responded to by the robot in real time.
As has been emphasized in (Dautenhahn and Werry, 2004), the liking of the children (i.e.,
the enjoyment they experience when interacting with the robot) is a goal as desirable as skill
learning for autism intervention. Therefore, liking was chosen as the affective state around
which to modify the robot’s behaviors in Phase II.
In the RBB task, an undersized basketball hoop was attached to the end-effector of a robotic
manipulator, which could move the hoop in different directions (as shown in Fig. 5) with
different speeds. The children were instructed to shoot a required number of baskets into
the moving hoop within a given time. Three robot behaviors were designed as shown in
Table 2. For example, in behavior 1 the robot moves towards and away from the participant
(i.e., in the X direction) at a slow speed with soft background music, and the shooting
requirement for successful baskets is relatively low. The parameter configurations were
determined based on a pilot study to attain varied impacts on affective experience for
different behaviors. From this pilot study, the averaged performance of participants for a
given behavior was compiled and analyzed. Behavior transitions occurred between but not
within epochs. As such, each robot behavior extended for the length of an epoch (1.5
minutes in duration) to have the participant fully exposed to the impact of that behavior.
Fig. 5. X, Y, and Z directions for behaviors used in RBB
Behavior ID   Motion Direction   Speed (sec/period)   Threshold (shots/epoch)   Background Music
1             X                  2                    12                        Serene
2             Y                  4                    20                        Lively
3             Z                  8                    30                        Irregular
Table 2. Robot behaviors
Each of the six participants took part in two robot basketball sessions (RBB1 and RBB2). In
RBB1 (non-affect based) the robot selected its behavior randomly (i.e., without any regard to
the liking information of the participant), and the presentation of each type of behavior was
evenly distributed. This session was designed for two purposes: (i) to explore the state space
and action space of the QV-learning algorithm used in RBB2 for behavior adaptation
(described in Section 3.3.4); and (ii) to validate that the different robot behaviors have
distinguishable impact on the child’s level of liking. In RBB2 (liking-based), the robot
continued to learn the child’s individual preference and selected the desirable behavior based
on interaction experiences (i.e., records of robot behavior and the consequent liking level of
a participant predicted by the affective model). The idea was to investigate whether the robot
could automatically choose the most-liked behavior of each participant, as observed from RBB1,
by means of the physiology-based affective model and QV-learning.
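The random but evenly distributed behavior presentation used in RBB1 can be illustrated as follows. This is only a sketch: the function name and the seed are hypothetical, and the chapter does not say how the authors generated their sequence. The 18 epochs per session and the three behaviors are taken from the text.

```python
import random
from collections import Counter


def balanced_random_schedule(behaviors, epochs, seed=None):
    """Return a shuffled sequence in which every behavior appears equally often."""
    if epochs % len(behaviors) != 0:
        raise ValueError("epochs must be a multiple of the number of behaviors")
    schedule = list(behaviors) * (epochs // len(behaviors))
    random.Random(seed).shuffle(schedule)  # seeded only to make the sketch repeatable
    return schedule


# 18 epochs per session and 3 behaviors (Table 2) give 6 presentations of each.
schedule = balanced_random_schedule([1, 2, 3], epochs=18, seed=0)
counts = Counter(schedule)
print(schedule)
print(counts)
```

A plain shuffle balances how often each behavior appears but does not guarantee that every behavior-to-behavior transition is visited equally often.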
3.3.2 Experimental setup
The real-time implementation of the RBB system is shown in Fig. 6. The set-up included a 5
degrees-of-freedom robot manipulator (CRS Catalyst-5 System). Two infrared (IR)
transmitter and receiver pairs were attached to the basketball hoop to detect small, soft foam
balls going through the hoop. Biological feedback equipment (Biopac system) was
connected to a computer (C1) that: (i) acquired physiological signals from the Biopac system and
extracted physiological features on-line, (ii) predicted the probable liking level by using the
affective model developed in Phase I, (iii) acquired IR data through the analog input
channels of the Biopac system, and (iv) ran a QV-learning algorithm that learned the participant’s
preference and chose the robot’s next behavior accordingly. Computer C1 was connected
serially to the CRS computer (C2), which ran Simulink software. The behavior switch
triggers were transmitted from C1 to C2 via a RS232 link. The commands to control the
robot’s various joints were transmitted from C2 to the robot. As in Phase I tasks, the
therapist and a parent were also involved, watching the experiment from the TV that was
connected to a video camera.
Fig. 6. Experimental set-up for robot basketball
3.3.3 Experimental procedure
Each basketball session (RBB1 or RBB2) was approximately 1 hour long and included 27
minutes of active human-robot interaction (i.e., 18 epochs of 1.5 minutes each). The
remaining time was spent attaching sensors, guiding a short practice, taking a baseline
recording, collecting subjective reports, and pausing for scheduled breaks. During the
experiment, the participant was asked to take a break after every four epochs and the
participant could request a break whenever he/she desired one. During each basketball
epoch, the participant received commands and performance assessments from pre-recorded
dialogue via a speech program running on C1 and the interaction proceeded as follows:
1. The participant was notified of the shooting requirement threshold.
2. A start command instructed the participant to start shooting baskets.
3. Once the epoch started, the participant was given voice feedback every 30 seconds
regarding the number of baskets remaining and the time available.
4. A stop command instructed the participant to stop shooting baskets, which ended the
epoch.
5. At the end of each epoch, the participant's performance was rated and relayed to
him/her as excellent, above average, or below average.
Each epoch was followed by a subjective reporting procedure using the same protocol as
Phase I that took 30-60 seconds to collect. After the subjective reports were complete, the
next epoch would begin. To prevent habituation, a time interval of 7 days or more between
RBB sessions was enforced.
3.3.4 Affect-sensitive behavior adaptation in closed-loop human robot interaction
We defined the state, action, state transition, and reward functions so that the affect-sensitive
robot behavior adaptation problem could be solved using the QV-learning
algorithm as described in (Wiering, 2005) and Appendix 2.
The set of states consisted of three robot behaviors as described in Table 2. In every state, the
robot has three possible actions (1/2/3) that correspond to choosing behavior 1, 2, or 3,
respectively, for the next time step (i.e., next epoch). Each robot behavior persists for one full
epoch and the state/behavior transition occurs only at the end of an epoch. The detection of
consequent affective cues (i.e., the real-time prediction of the liking level for the next epoch)
was used to evaluate the desirability of a certain action. A reward function was defined
based on the predicted liking level. If the consequent liking level was recognized as high,
the contributing action was interpreted as positive and a reward was granted (r = 1);
otherwise the robot received a punishment (r = -1). QV-learning uses this reward function to
have the robot learn how to select the behavior that is expected to result in a high liking
level and therefore positively influence the actual affective (e.g., liking) experience of the
child.
RBB1 enables state and action exploration through random, evenly distributed behavior-switching
actions. The V-function and Q-function are updated using Eqn. (3) and Eqn. (4)
from Appendix 2. After RBB1, the subjective reports are analyzed to examine the impacts of
different behaviors on each participant’s preference. In RBB2 the robot starts from a non-preferred
behavior/state and continues the learning process by using Eqn. (3) and Eqn. (4).
A greedy action selection mechanism is used to choose the behavior-switching action with
the highest Q-value.
Because of the limited number of states and actions in this proof-of-concept experiment,
tabular representation is used for the V-function and the Q-function. To prevent a certain
action and/or state from being overly dominant and to counteract the habituation effect, the
values of Q(s, a) and V(s) are bounded by using the reward or punishment encountered in
the interaction. The parameters in Eqn. (3) and Eqn. (4) are chosen as α = 0.8 and γ = 0.9.
Before RBB1 begins, the initial values in the V-table and the Q-table are set to 0.
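The tabular formulation described in this section can be sketched as below. It is an illustration under stated assumptions rather than the authors' implementation. The class and function names are hypothetical. The bounding of the values is modeled as clipping to [-1, 1] because the text does not give the exact rule. Both updates use V(s_next) as it stood before the current step. The values of α and γ, the zero initialization, the ±1 reward, and the greedy selection are taken from the text.

```python
ALPHA, GAMMA = 0.8, 0.9   # learning rate and discount factor given in the text
BEHAVIORS = (1, 2, 3)     # states and actions both index the robot behaviors


def clip(x, lo=-1.0, hi=1.0):
    """Bound a value to [lo, hi]; the exact bounding rule is an assumption."""
    return max(lo, min(hi, x))


class QVLearner:
    """Tabular QV-learning over three behaviors (illustrative sketch)."""

    def __init__(self):
        self.V = {s: 0.0 for s in BEHAVIORS}
        self.Q = {(s, a): 0.0 for s in BEHAVIORS for a in BEHAVIORS}

    def update(self, state, action, liking_high):
        """Apply Eqn. (3) and Eqn. (4) after one epoch; return the reward."""
        s_next = action  # choosing behavior a makes a the next state
        r = 1.0 if liking_high else -1.0
        target = r + GAMMA * self.V[s_next]  # computed before V is modified
        self.Q[(state, action)] = clip(
            self.Q[(state, action)] + ALPHA * (target - self.Q[(state, action)]))
        self.V[state] = clip(self.V[state] + ALPHA * (target - self.V[state]))
        return r

    def greedy_action(self, state):
        """Pick the behavior-switching action with the highest Q-value."""
        return max(BEHAVIORS, key=lambda a: self.Q[(state, a)])


learner = QVLearner()
learner.update(state=1, action=2, liking_high=True)    # high liking predicted
learner.update(state=2, action=3, liking_high=False)   # low liking predicted
print(learner.greedy_action(1))
```

With only nine state-action pairs the tables fill in quickly, which fits the text's use of RBB1 for exploration and a purely greedy policy in RBB2.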
4. Results and discussion
In this section we present both the Phase I results of physiology-based affective modeling
for children with ASD and Phase II results of the affect-sensitive closed-loop interaction
between children with ASD and the robot.
4.1 Phase I – affect detection
One of the prime challenges of this work is attaining reliable subjective reports. Moreover,
researchers are reluctant to trust the responses of adolescents on self-reports (Barkley, 1998).
In order to overcome this difficulty, a therapist and a parent were involved by using the
approaches described in the experimental setup. They observed the experiments and
provided subjective reports based on their expertise/experience in inferring presumable
underlying affective states from the observable behaviors of children with ASD.
To measure the amount of agreement among the different reporters, the kappa statistic was
used (Siegel and Castellan, 1988). The kappa coefficient (K) measures pair-wise agreement
among a set of reporters making category judgments, correcting for expected chance
agreement. When agreement is complete, K=1; whereas, when there is only agreement as
would be expected by chance, K = 0. Fig. 7 shows results for K averaged across the target
affective states.
Fig. 7. Average Kappa Statistics between Reporters for Affective States
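A chance-corrected agreement score for one pair of reporters can be computed as sketched here. The sketch assumes the two-rater Cohen form of kappa, in which chance agreement comes from each rater's own label frequencies. The formulation in Siegel and Castellan (1988) may estimate the chance term differently, so this is not necessarily the exact statistic behind Fig. 7. The function name and the labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of labels")
    n = len(rater_a)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return float("nan")  # both raters used one identical category: undefined
    return (observed - expected) / (1.0 - expected)


# Hypothetical low/high labels for eight epochs (not study data).
therapist = ["high", "high", "low", "low", "high", "low", "high", "low"]
parent = ["high", "high", "low", "high", "high", "low", "low", "low"]
kappa = cohen_kappa(therapist, parent)
print(round(kappa, 3))
```

In this example the raters agree on 6 of 8 epochs and chance agreement is 0.5, so K = (0.75 - 0.5) / (1 - 0.5) = 0.5.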
It was observed that the agreement between the therapist and parent (T/P) showed the
largest K values (mean = 0.62) among the three possible pairs for each child (p < 0.05, paired
t-test). When each child is examined individually, different trends arise, which reveals the
diverse affective characteristics of the children with ASD who took part in this study. The
Kappa agreement between therapist and parent is substantial for Child A, Child B, Child D,
and Child F and moderate for Child C and Child E. Such results might arise because
it can be difficult for the therapist or parent to distinguish certain emotions for a
particular child with ASD. For example, the agreement between therapist and parent for the
anxiety level of Child C and Child E (K = 0.352 and 0.372, respectively) is
considerably lower than the average level. In the experiment, Child A and Child F’s ratings for
liking, anxiety, and engagement were almost constant which resulted in lower K values for
the therapist and child pair (T/C) and the parent and child pair (P/C) than those of the
other participants. This may be because autism is a spectrum disorder, so children with ASD
differ in their ability to recognize and report their emotions. The
mean kappa values between the children and the therapist and between the children and the
parent were relatively small (0.37 and 0.40, respectively). Lack of agreement with
adults does not necessarily mean that the self-reports of children with ASD are not
dependable. However, given that therapists’ judgment based on their expertise is the
state of the art in most autism intervention approaches and that there is
reasonably high agreement between the therapist and the parents for all six children,
the subjective reports of the therapist were used as the reference points linking the objective
physiological data to the children’s affective state. To make the subjective reports more
consistent, the same therapist was involved in all of the experiments. This choice allowed for
building a therapist-like affective model. Once the affect modeling is completed, the
recognizers will be capable of inferring the affective states of the child with ASD from the
physiological signals in real-time even when the therapist is not available.
The performance of the developed affective model for each child (i.e., individual-specific
approach) is shown in Fig. 8. The cross-validation method, ‘leave-one-out’, was used. The
affective model produced high recognition accuracies for each target affective state of each
participant. The average correct prediction accuracies across all participants were: 85.0% for
liking, 79.5% for anxiety, and 84.3% for engagement, which are comparable to the best
results achieved for typical individuals (Picard et al., 2001; Rani et al., 2006b). We also
compared the performance of affective modeling to a control method that represents
random chance. For example, in 48 out of 86 epochs the engagement of Child E was rated as
low level, where a random classification could assign all test epochs to this category and
make accurate classifications (48/86)×100 = 55.8% of the time. We thus considered the level
with a majority of epochs to represent the chance condition, which is denoted by dark grey
bars in Fig. 8. While the physiology-based affective modeling alone did not provide perfect
classification (i.e., 100%) of affective states of children with ASD, it did yield reliable
matches with the subjective rating and significantly outperformed a random classifier
(averaging 82.9% vs. 59.2%). This was promising considering that this task was challenging
in two respects: (i) the reports were collected from the therapist who was observing the
children as opposed to having typical adults capable of differentiating and reporting their
own affective states and (ii) varying levels of arousal of any given affective state (e.g.,
low/high anxiety) were identified instead of determining discrete emotions (e.g., anger, joy,
sadness, etc.).
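The chance condition used for comparison above amounts to always predicting the more frequent level. The following minimal sketch reproduces the Child E engagement figure quoted in the text. The function name is hypothetical, and the 38 high-level epochs are simply the remainder of the 86 epochs.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy (%) obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return 100.0 * max(counts.values()) / len(labels)


# Child E, engagement: 48 of 86 epochs were rated low (from the text).
child_e_engagement = ["low"] * 48 + ["high"] * 38
baseline = majority_baseline_accuracy(child_e_engagement)
print(f"{baseline:.1f}%")  # 55.8%
```

This baseline never falls below 50% for a two-level label, so it is a stricter reference than a coin flip.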
To explore the effects of reducing the number of physiological signals and the possibility of
achieving more economical modeling, we examined the performance of the affect
recognizers when cardiovascular, electrodermal, and electromyographic activities and their
combinations were used. As shown in Table 3, all the recognizers delivered better
prediction than random guess (mean prediction rate equals 52.9%), and with more
information from physiological activities the performance of the affective models tends to
improve (except the combination of electrodermal and electromyographic activities). While
no combination of physiological activity surpassed the percent accuracy achieved when all
signals were used, the results suggested it may be possible to selectively reduce the set of
signals and obtain nearly-as-good performance (e.g., using a combination of cardiovascular
and electrodermal signals).
Fig. 8. Prediction Accuracy of the Affective Model
Physiological Signals                  Liking   Anxiety   Engage   Mean
Cardiovascular                         75.7     68.5      76.2     73.5
Electrodermal                          73.4     72.3      73.3     73.0
Electromyographic                      73.1     65.8      70.1     69.7
Electrodermal + Electromyographic      75.0     69.4      71.4     71.9
Cardiovascular + Electromyographic     79.6     70.2      79.9     76.6
Cardiovascular + Electrodermal         79.9     74.3      81.9     78.7
All                                    85.0     79.5      84.3     82.9
Table 3. Prediction Accuracy of the Affective Modeling based on Different Physiological
Signals (%)*
* Peripheral temperature has relatively few features derived and was not examined
independently. Instead, it was studied conjunctively with the electrodermal activity, both of
which were acquired from the non-dominant hand of a participant.
4.2 Phase II – affect adaptation in robot-based basketball task
Six children with ASD who completed the Phase I experiments also took part in the robot
basketball task. The results described here are based on the RBB1 (non-affect based) and
RBB2 (liking-based) tasks.
First, we present results to validate that different behaviors of the robot had distinguishable
impacts on the liking level of the children with ASD. To reduce the bias of validation, in
RBB1 the robot selects behaviors randomly and the occurrence of each behavior is evenly
distributed. Fig. 9 shows the average labeled liking level for each behavior as reported by
the therapist in RBB1. The difference of the impact is significant for five children
(participants A, B, D, E, and F) and moderate for participant C. Across all participants, the
differences of reported liking for the most-preferred, moderately-preferred, and least-preferred
behavior are statistically significant (p < 0.05, ANOVA test). Furthermore, it was
observed that different children with ASD may have different preferences for the robot’s
behaviors. These results demonstrated that it is important to have a robot learn the
individual’s preference and adapt to it automatically, which may allow a more tailored and
affect-sensitive interaction between children with ASD and the robot.
Fig. 9. Mean liking level for different behaviors in RBB1
Second, we examine how closely the real-time, physiology-based predictions of liking,
obtained from the affective models developed in Phase I, matched the subjective ratings
of liking made by the therapist during Phase II. The
average predictive accuracy across all the participants was approximately 81.1%. The
highest was 86.1% for Child D, and the lowest was 77.8% for Child B and Child E. Note that
the affective model was evaluated on physiological data obtained on-line from a real-time
application for children with ASD. Nevertheless, this prediction accuracy is comparable to
the results achieved through off-line analysis for typical individuals (Rani et al., 2006b).
Third, we present results about robot behavior adaptation and investigate its impact on the
interaction between the children and the robot. Table 4 shows the percentages of different
behaviors that were chosen in RBB2 for each participant. The robot learned the individual’s
preference and selected the most-preferred behavior with high probability for all the
participants. Averaged across participants, the most-preferred, moderately-preferred, and
least-preferred behaviors were chosen 72.5%, 16.7%, and 10.8% of the time, respectively. The
preference of a behavior was defined by the reported liking level in RBB1 as shown in Fig. 9.
There could be several reasons why less-preferred behaviors were chosen in RBB2. The
learned behavior selection policy might not have been optimal after the exploration in RBB1,
and the QV-learning algorithm continued the learning process in RBB2. Another reason
could be that the affective model is not 100% accurate and may return false
reward/punishment, which may have given the robot imperfect instruction for behavior
switches. Habituation to the most-preferred behavior during RBB2 could also be a factor
that might have contributed to temporary changes in preference which led the robot to
choose other behaviors.
Child ID   Most-Liked Behavior   Moderate-Liked Behavior   Least-Liked Behavior
           ID    Proportion      ID    Proportion          ID    Proportion
A          2     82.4%           3     11.8%               1     5.8%
B          1     70.6%           2     17.7%               3     11.7%
C          2     58.8%           3     23.5%               1     17.7%
D          2     76.5%           3     11.8%               1     11.7%
E          2     76.5%           3     17.6%               1     5.9%
F          2     70.6%           1     17.7%               3     11.7%
Table 4. Proportion of Different Behaviors Performed in RBB2
In Fig. 10 we present results to demonstrate that active monitoring of participants’ liking
and automatically selecting the preferred behavior allowed children with ASD to maintain
high liking levels. The average labeled liking levels of the participants as reported by the
therapist during the two sessions were compared. The lighter bars indicate the liking level
during the RBB1 session (i.e., when the robot selected behaviors randomly), and the darker
bars show the liking level during the RBB2 session (i.e., when the robot learned the individual
preference and chose the appropriate behavior accordingly). For all participants, liking level
was maintained, and for five of the six children liking level increased.
Fig. 10. Subjective liking as reported by therapist
There was no significant increase for Child C during the liking-based session as compared to
the non-affect based session. The impact of the different robot behaviors on the liking level
of Child C is not as significant as that of the others (refer to Fig. 9), which may impede the
robot in finding the preferred behavior and hence impede the robot in effectively
influencing the subjective liking level positively. Note that RBB1 presents a typically
balanced interaction with equal numbers of most-preferred, moderately-preferred, and
least-preferred epochs and the comparisons in Fig. 10 are not between liking-based sessions
and sessions of least-preferred epochs. To determine whether this change in liking level was
statistically significant across all the participants, a one-way ANOVA test was performed on
the null hypothesis of no change in liking level between liking-based sessions and non-affect
based sessions. The null hypothesis could be rejected at the 99.5% confidence level. This was
a significant result as the robot continued learning and utilizing the information regarding
the probable liking level of children with ASD to adjust its behaviors. This ability enables
the robot to adapt its behavior selection in real time and hence keep the participant in a
higher liking level.
5. Conclusions and future work
There is increasing consensus in the autism community that development of assistive tools
that exploit advanced technology will likely make application of intensive intervention for
children with ASD more readily accessible. In recent years, various applications of advanced
interactive technologies have been investigated in order to facilitate and/or partially
automate the existing behavioral intervention that addresses specific deficits associated with
autism. However, the current technology-assisted intervention tools for children with ASD
do not possess the ability of deciphering affective cues from the children, which could be
critical given that the affective factors of children with ASD have significant impacts on the
intervention practice. In this work, we have proposed a novel framework for affect-sensitive
human-machine interaction where the intelligent system can detect the affective states of the
children with ASD implicitly and respond to them accordingly.
The presented affective modeling methodology could allow the recognition of affective
states of children with ASD from physiological signals in real time and provide the basis for
future technology-assisted affect-sensitive interactive autism intervention. In Phase I, two
cognitive tasks – solving anagrams and playing Pong – have been designed to elicit the
affective states of liking, anxiety, and engagement for children with ASD that are considered
important in autism intervention. To have reliable reference points to link the physiological
data to the affective states, the reports from the child, the therapist, and the parent were
collected and analyzed. A large set of physiological indices has been investigated to
determine their correlation with the affective states of the children with ASD. We have
experimentally demonstrated that it is viable to detect the affective states of children with
ASD via a physiology-based affect recognition mechanism. An SVM-based affective model
yielded reliable prediction with a success rate of 82.9% when using the therapist’s reports.
In order to investigate the affect-sensitive closed-loop interaction between the children with
ASD and an intelligent system, we designed a proof-of-concept task, robot-based basketball,
and developed an experimental system for its real-time implementation and verification.
The real-time prediction of liking level of the children with ASD was accomplished with an
average accuracy of 81.1%. The robot learned individual preferences of the children with
ASD over time based on the interaction experience and the predicted liking level and hence
automatically selected the most-preferred behavior, on average, 72.5% of the time. We have
observed that such affect-sensitive robot behavior adaptation has led to an increase in
reported liking level of the children with ASD. This is the first time, to our knowledge, that
the affective states of children with ASD have been detected via a physiology-based affect
recognition technique in real time. This is also the first time that the impact of
affect-sensitive closed-loop interaction between a robot and children with ASD has been
demonstrated experimentally.
The presented work requires physiological sensing that has its own limitations. For
example, one needs to wear physiological sensors, and use of such sensors could be
restrictive under certain circumstances. Given the rapid progress in physiological sensing
clothing and accessories (Picard, 1997), we believe that physiology-based affect recognition
can be appropriate and useful for the application of interactive autism intervention and
could be used conjunctively with other modalities (e.g., visual and audio) to allow flexible
and robust affective modeling for children with ASD. Moreover, none of the participants in
this study had any objection to wearing the physiological sensors.
Future work will involve designing socially-directed interaction experiments that address
the social communication deficits of children with ASD. We will investigate how to augment
the interactive autism intervention by having an intelligent system (e.g., computer, VR
environment, or robot) respond appropriately to the inferred affects based on the affective
model described here. Specifically, we plan to integrate the real-time affect recognition and
response system described in this research with a life-like android face developed by
Hanson Robotics (www.hansonrobotics.com) and separately with an interactive virtual
reality environment developed with Vizard software (www.worldviz.com). These
intelligent systems can produce accurate examples of common facial expressions that
convey affective states. This affective information could be used as feedback for empathy
exercises to help children recognize their own emotions. Enhancements on the intervention
process could also be envisioned. For instance, the intelligent system could exhibit
interesting behaviors to retain the child's attention when it detects his/her liking level is
low. Additionally, we will investigate fast and robust learning mechanisms that would
permit an intelligent system to respond adaptively in more complex interaction tasks.
6. Appendix
6.1 Pattern recognition using support vector machines
SVM, pioneered by Vapnik (1998), is an excellent tool for classification (Burges, 1998). Its
appeal lies in its strong association with statistical learning theory as it approximates the
structural risk minimization principle. Good generalization performance can be achieved by
maximizing the margin, where margin is defined as the sum of the distances of the
hyperplane from the nearest data points of each of the two classes. SVM is a linear machine
working in a high k-dimensional feature space formed by an implicit embedding of n-dimensional
input data X (e.g., a vector of derived physiology features as described in
Section 3.2.4) into a k-dimensional feature space (k > n) through the use of a nonlinear
mapping φ(X). This allows for the use of linear algebra and geometry to separate the data,
which is normally only separable with nonlinear rules in the input space. The problem of
finding a linear classifier for given data points with known class labels can be described as
finding a separating hyperplane $W^T \phi(X)$ that satisfies:
\[
y_i \left( W^T \phi(X_i) \right) = y_i \left( \sum_{j=1}^{k} w_j \varphi_j(X_i) + w_0 \right) \geq 1 - \xi_i \tag{1}
\]
where N represents the number of training data pairs $(X_i, y_i)$ indexed by $i = 1, 2, \ldots, N$;
$y_i \in \{+1, -1\}$ represents the class label (e.g., high/low intensity of a target affective state);
$\phi(X) = [\varphi_0(X), \varphi_1(X), \ldots, \varphi_k(X)]^T$ is the mapped feature vector
($\varphi_0(X) = 1$); and $W = [w_0, w_1, \ldots, w_k]$ is the
weight vector of the network. The nonnegative slack variable $\xi_i$ generalizes the linear
classifier with soft margin to deal with nonlinearly separable problems.
All operations in learning and testing modes are done in SVM using a so-called kernel
function defined as $K(X_i, X) = \phi^T(X_i)\,\phi(X)$ (Vapnik, 1998). The kernel function allows for
efficient computation of inner products directly in the feature space and circumvents the
difficulty of specifying the non-linear mapping explicitly. The most distinctive fact about
SVM is that the learning task is reduced to a dual quadratic programming problem by
introducing the Lagrange multipliers $\alpha_i$ (Vapnik, 1998; Burges, 1998):
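\[
\text{Maximize} \quad Q(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(X_i, X_j)
\]
\[
\text{Subject to} \quad \sum_{i=1}^{N} \alpha_i y_i = 0 \quad \text{and} \quad 0 \leq \alpha_i \leq C \tag{2}
\]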
where C is a user-defined regularization parameter that determines the balance between the
complexity of the network characterized by the weight vector W and the error of
classification of data. The multipliers $\alpha_i$ are non-zero only for the support
vectors (i.e., the training points nearest to the hyperplane), which induces solution
sparseness. The SVM approach is able to deal with noisy data and over-fitting by allowing
for some misclassifications on the training set (Burges, 1998). This characteristic makes it
particularly suitable for affect recognition because the physiology data is noisy and the
training set size is often small. Another important feature of SVM is that the quadratic
programming leads in all cases to the global minimum of the cost function. With the kernel
representation and soft margin mechanism, SVM provides an efficient technique that can
tackle the difficult, high dimensional affect recognition problem.
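The dual problem in (2) can be illustrated with a short sketch that evaluates the objective, checks the two constraints, and scores a new point from the multipliers. Several things here are assumptions made only for illustration: the Gaussian (RBF) kernel and its width `gamma`, the toy data, the hand-picked multipliers (which are merely feasible, not the optimum), and the helper names. The decision value uses the standard kernel expansion over the training points with an optional bias term.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel; an assumed choice of K with an assumed width gamma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def dual_objective(alpha, X, y, kernel):
    """Q(alpha) = sum_i alpha_i - 1/2 sum_i sum_j alpha_i alpha_j y_i y_j K(X_i, X_j)."""
    n = len(alpha)
    quad = sum(
        alpha[i] * alpha[j] * y[i] * y[j] * kernel(X[i], X[j])
        for i in range(n)
        for j in range(n)
    )
    return sum(alpha) - 0.5 * quad

def is_feasible(alpha, y, C, tol=1e-9):
    """Check sum_i alpha_i y_i = 0 and 0 <= alpha_i <= C, up to a tolerance."""
    balanced = abs(sum(a * yi for a, yi in zip(alpha, y))) <= tol
    boxed = all(-tol <= a <= C + tol for a in alpha)
    return balanced and boxed

def decision_value(x, alpha, X, y, kernel, bias=0.0):
    """Kernel expansion sum_i alpha_i y_i K(X_i, x) + bias; its sign gives the class."""
    return sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, X)) + bias

# Hypothetical one-feature toy data with one point per class.
X = [[0.0], [1.0]]
y = [+1, -1]
alpha = [1.0, 1.0]  # feasible multipliers chosen by hand, not solved for
C = 10.0

q_value = dual_objective(alpha, X, y, rbf_kernel)
feasible = is_feasible(alpha, y, C)
score_near_first = decision_value([0.0], alpha, X, y, rbf_kernel)
score_near_second = decision_value([1.0], alpha, X, y, rbf_kernel)
print(q_value, feasible, score_near_first, score_near_second)
```

In practice a quadratic programming solver would search for the multipliers that maximize `dual_objective` subject to `is_feasible`. Only the points with non-zero multipliers (the support vectors) contribute to `decision_value`.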
6.2 Behavior adaptation using QV-learning
QV-learning (Wiering, 2005), a variant of the standard reinforcement learning algorithm Q-learning (Watkins and Dayan, 1992), was applied to achieve the affect-sensitive behavior
adaptation. QV-learning keeps track of both a Q-function and a V-function. The Q-function
represents the utility value $Q(s, a)$ for every possible pair of state $s$ and action $a$. The V-function indicates the utility value $V(s)$ for each state $s$. The state value $V(s_t)$ and Q-value
$Q(s_t, a_t)$ at step $t$ are updated after each experience $(s_t, a_t, r_t, s_{t+1})$ by:
$$ V(s_t) := V(s_t) + \alpha \left( r_t + \gamma V(s_{t+1}) - V(s_t) \right) \qquad (3) $$
$$ Q(s_t, a_t) := Q(s_t, a_t) + \alpha \left( r_t + \gamma V(s_{t+1}) - Q(s_t, a_t) \right) \qquad (4) $$
where $\alpha$ is the learning rate, $\gamma$ is the discount factor, and $r_t$ is the received reward
that measures the desirability of the action $a_t$ when it is
applied in state $s_t$ and causes the system to evolve to state $s_{t+1}$. The difference between (4)
and the conventional Q-learning rule is that QV-learning uses V-values learned in (3) and is
not defined solely in terms of Q-values. Since $V(s)$ is updated more often than $Q(s, a)$, QV-learning may permit a fast learning process (Wiering, 2005) and enable the intelligent
system to efficiently find a behavior selection policy during interaction.
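The update rules (3) and (4) can be sketched in a few lines. This is an illustrative tabular version under stated assumptions, not the implementation used in the experiments: the state and action labels, the reward, and the values of the learning rate `alpha` and discount factor `gamma` are hypothetical, and the shared target is computed once before either table is modified.

```python
from collections import defaultdict

def qv_update(V, Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one QV-learning step for the experience (s, a, r, s_next)."""
    # Both rules share the same target, built from the V-value of the next state.
    target = r + gamma * V[s_next]
    V[s] += alpha * (target - V[s])            # rule (3)
    Q[(s, a)] += alpha * (target - Q[(s, a)])  # rule (4)

# Value tables default to 0.0 for unseen states and state-action pairs.
V = defaultdict(float)
Q = defaultdict(float)

# Two hypothetical experiences with the same state, action, and reward.
qv_update(V, Q, "state_A", "behavior_1", 1.0, "state_B")
qv_update(V, Q, "state_A", "behavior_1", 1.0, "state_B")
print(V["state_A"], Q[("state_A", "behavior_1")])
```

Because every experience in a state updates `V[s]` regardless of which action was taken, while `Q[(s, a)]` changes only for the action actually tried, the V-table accumulates information faster than any single Q-entry.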
7. Acknowledgements
The authors gratefully acknowledge the MARI (Marino Autism Research Institute) grant,
the staff support from the Vanderbilt Treatment and Research Institute for Autism Spectrum
Disorders for guidance during the development of experiments involving children with
ASD, and the parents and children who participated in the presented research.
8. References
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders:
DSM-IV-TR (4th ed.). American Psychiatric Association. Washington, DC
Asha, K., Ajay, K., Naznin, V., George, T., & Peter, F. D. (2005). Gesture-based affective
computing on motion capture data, Proceedings of the Int. Conf. on Affective
Computing and Intelligent Interaction, Beijing, China
Barkley, R. A. (1998). Attention deficit hyperactivity disorder: A handbook for diagnosis and
treatment (2nd ed.). Guilford Press. New York, NY
Bartlett, M. S., Littlewort, G., Fasel, I., & Movellan, J. R. (2003). Real time face detection and
facial expression recognition: development and applications to human computer
interaction, Proceedings of the Computer Vision and Pattern Recognition Workshop,
Madison, Wisconsin
Bee, H. & Boyd, D. (2004). The Developing Child. (10th ed.). Pearson. Boston
Ben Shalom, D., Mostofsky, S. H., Hazlett, R. L., Goldberg, M. C., Landa, R. J., Faran, Y.,
McLeod, D. R., & Hoehn-Saric, R. (2006). Normal physiological emotions but
differences in expression of conscious feelings in children with high-functioning
autism. J Autism Dev Disord, 36(3):395-400
Bernard-Opitz, V., Sriram, N., & Nakhoda-Sapuan, S. (2001). Enhancing social problem
solving in children with autism and normal children through computer-assisted
instruction. J Autism Dev Disord, 31(4):377-384
Bradley, M. M. (2000). Emotion and motivation, In: Handbook of Psychophysiology, J. T.
Cacioppo, L. G. Tassinary & G. Berntson, (Eds.), 602-642, Cambridge University
Press. New York, NY
Brown, R. M., Hall, L. R., Holtzer, R., Brown, S. L., & Brown, N. L. (1997). Gender and video
game performance. Sex Roles, 36(11-12):793-812
Burges, C. J. C. (1998). A tutorial on Support Vector Machines for pattern recognition. Data
Mining and Knowledge Discovery, 2(2):121-167
Cacioppo, J.T., Berntson, G.G., Larsen, J.T., Poehlmann, K.M., & Ito, T.A. (2000). The
psychophysiology of emotion, In: Handbook of Emotions, Lewis, M., & Haviland-Jones, J.M.,
(Eds.), The Guilford Press. New York, NY
Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., & Taylor,
J. (2001). Emotion recognition in human-computer interaction. IEEE Signal
Processing Magazine, 18(1):32-80
Dautenhahn, K., & Werry, I. (2004). Towards interactive robots in autism therapy:
background, motivation and challenges. Pragmatics & Cognition, 12(1):1-35
Dautenhahn, K., Werry, I., Salter, T., & Boekhorst, R. T. (2003). Towards Adaptive
Autonomous Robots in Autism Therapy: Varieties of Interactions, IEEE International
Symposium on Computational Intelligence in Robotics and Automation, Kobe
Dawson, M. E., Schell, A. M., & Filion, D. L. (1990). The Electrodermal System. In: Principles
of Psychophysiology: Physical, Social, and Inferential Elements, Cacioppo, J.T., &
Tassinary, L.G., (Eds.), Cambridge University Press. Cambridge, MA
Dunn, L. M., & Dunn, L. M. (1997). PPVT-III: Peabody Picture Vocabulary Test-Third
Edition. American Guidance Service. Circle Pines, Minnesota
Ernsperger, L. (2003). Keys to Success for Teaching Students with Autism. Future Horizons
Fong, T., Nourbakhsh, I., & Dautenhahn, K. (2003). A survey of socially interactive robots.
Robotics and Autonomous Systems, 42(3-4):143-166
Gillott, A., Furniss, F., & Walter, A. (2001). Anxiety in high-functioning children with
autism. Autism, 5(3):277-286
Green, D., Baird, G., Barnett, A. L., Henderson, L., Huber, J., & Henderson, S. E. (2002). The
severity and nature of motor impairment in Asperger's syndrome: a comparison
with Specific Developmental Disorder of Motor Function. Journal of Child Psychology
and Psychiatry and Allied Disciplines, 43(5):655-668
Groden, J., Goodwin, M. S., Baron, M. G., Groden, G., Velicer, W. F., Lipsitt, L. P., Hofmann,
S. G., & Plummer, B. (2005). Assessing Cardiovascular Responses to Stressors in
Individuals with Autism Spectrum Disorders. Focus on Autism and Other
Developmental Disabilities, 20(4):244-252
Jacobson, J.W., Mulick, J. A., & Green, G. (1998). Cost-benefit estimates for early intensive
behavioral intervention for young children with autism – General model and single
state case. Behavioral Interventions, 13:201-206
Kataoka, H., Kano, H., Yoshida, H., Saijo, A., Yasuda, M., & Osumi, M. (1998). Development
of a skin temperature measuring system for non-contact stress evaluation. IEEE
Ann. Conf. Engineering Medicine Biology Society
Kleinsmith, A., Fushimi, T., & Bianchi-Berthouze, N. (2005). An incremental and interactive
affective posture recognition system, Proceedings of the UM 2005 Workshop: Adapting
the Interaction Style to Affective Factors, Edinburgh, United Kingdom
Kozima, H., Nakagawa, C., & Yasuda, Y. (2005). Interactive robots for communication-care:
A case-study in autism therapy, Proceedings of the IEEE International Workshop on
Robot and Human Interactive Communication, Nashville, Tennessee, August
Kulic, D., & Croft, E. (2007). Physiological and subjective responses to articulated robot
motion. Robotica, 25:13-27
Lacey, J. I., & Lacey, B. C. (1958). Verification and extension of the principle of autonomic
response-stereotypy. Am J Psychol, 71(1):50-73
Lee, C. M., & Narayanan, S. S. (2005). Toward detecting emotions in spoken dialogs. IEEE
Transactions on Speech and Audio Processing, 13(2):293-303
Liu, C., Rani, P., & Sarkar, N. (2006). Human-Robot interaction using affective cues.
Proceedings of the International Symposium on Robot and Human Interactive
Communication, Hatfield, United Kingdom
Mandryk, R. L., & Atkins, M. S. (2007). A fuzzy physiological approach for continuously
modeling emotion during interaction with play technologies. International Journal of
Human-Computer Studies, 65(4):329-347
Michaud, F. & Theberge-Turmel, C. (2002). Mobile robotic toys and autism. In: Socially
Intelligent Agents: Creating Relationships with Computers and Robots, K. Dautenhahn,
A. H. Bond, L. Canamero, and B. Edmonds, (Eds.), 125-132, Kluwer Academic
Publishers
Moore, D. J., McGrath, P., & Thorpe, J. (2000). Computer aided learning for people with
autism - A framework for research and development. Innovations in Education and
Training International, 37(3):218-228
Papillo, J.F., & Shapiro, D. (1990). The cardiovascular system. In: Principles of
Psychophysiology: Physical, Social, and Inferential Elements, Cacioppo, J.T., &
Tassinary, L.G., (Eds.), Cambridge University Press. Cambridge, MA
Pares, N., Masri, P., van Wolferen, G., & Creed, C. (2005). Achieving dialogue with children
with severe autism in an adaptive multisensory interaction: the "MEDIAte" project.
IEEE Trans Vis Comput Graph, 11:734-743
Parsons, S., & Mitchell, P. (2002). The potential of virtual reality in social skills training for
people with autistic spectrum disorders. J Intellect Disabil Res, 46(Pt 5):430-443
Parsons, S., Mitchell, P., & Leonard, A. (2005). Do adolescents with autistic spectrum
disorders adhere to social conventions in virtual environments? Autism, 9(1):95-117
Pecchinenda, A., & Smith, C. A. (1996). The affective significance of skin conductance
activity during a difficult problem-solving task. Cogn. and Emotion, 10(5):481-504
Picard, R. W. (1997). Affective Computing. The MIT Press. Cambridge, MA
Picard, R. W., Vyzas, E., & Healey, J. (2001). Toward machine emotional intelligence:
Analysis of affective physiological state. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 23(10):1175-1191
Pioggia, G., Igliozzi, R., Ferro, M., Ahluwalia, A., Muratori, F., & De Rossi, D. (2005). An
android for enhancing social skills and emotion recognition in people with autism.
IEEE Transactions on Neural Systems and Rehabilitation Engineering, 13(4):507-515
Rani, P., Liu, C., & Sarkar, N. (2006a). Affective feedback in closed loop human-robot
interaction. Proceeding of the 1st ACM SIGCHI/SIGART Conference on Human-Robot
interaction, Salt Lake City, Utah, USA, March
Rani, P., Liu, C. C., Sarkar, N., & Vanman, E. (2006b). An empirical study of machine
learning techniques for affect recognition in human-robot interaction. Pattern
Analysis and Applications, 9(1):58-69
Rani, P., Sarkar, N., Smith, C. A., & Kirby, L. D. (2004). Anxiety detecting robotic system –
towards implicit human-robot collaboration. Robotica, 22:85-95
Reeves, B., & Nass, C. I. (1996). The media equation: how people treat computers, televisions, and
new media as real people and places. Cambridge University Press. New York, NY
Robins, B., Dickerson, P., & Dautenhahn, K. (2005). Robots as embodied beings –
Interactionally sensitive body movements in interactions among autistic children
and a robot, Proceedings of the 14th IEEE International Workshop on Robot and Human
Interactive Communication, Nashville, Tennessee, August
Robins, B., Dickerson, P., Stribling, P., & Dautenhahn, K. (2004). Robot-mediated joint
attention in children with autism: A case study in robot-human interaction.
Interaction Studies, 5(2):161–198
Rogers, S. J. (1998). Empirically supported comprehensive treatments for young children
with autism. J Clin Child Psychol, 27(2):168-179
Ruble, L. A., & Robson, D. M. (2006). Individual and Environmental Determinants of
Engagement in Autism. J Autism Dev Disord.
Rutter, M. (2006). Autism: its recognition, early diagnosis, and service implications. J Dev
Behav Pediatr, 27(2 Suppl):S54-58
Scassellati, B. (2005). Quantitative metrics of social response for autism diagnosis, Proc. IEEE
International Workshop on Robot and Human Interactive Communication, Nashville,
Tennessee, August
Schultz, R.T. (2005). Developmental deficits in social perception in autism: the role of the
amygdala and fusiform face area. Int J Dev Neurosci, 23:125-41
Seip, J. A. (1996). Teaching the autistic and developmentally delayed: A guide for staff training and
development. Delta. British Columbia
Siegel, S., & Castellan, J. N. J. (1988). Nonparametric Statistics for the Behavioral Sciences.
McGraw-Hill. New York, NY
Silver, M., & Oakes, P. (2001). Evaluation of a new computer intervention to teach people
with autism or Asperger syndrome to recognize and predict emotions in others.
Autism, 5(3):299-316
Swettenham, J. (1996). Can children with autism be taught to understand false belief using
computers? J Child Psychol Psychiatry, 37(2):157-165
Tarkan, L. (October 21, 2002). Autism therapy is called effective, but rare. New York Times
Toichi, M., & Kamio, Y. (2003). Paradoxical autonomic response to mental tasks in autism. J
Autism Dev Disord, 33(4):417-426
Trepagnier, C. Y., Sebrechts, M. M., Finkelmeyer, A., Stewart, W., Woodford, J., & Coleman,
M. (2006). Simulating social interaction to address deficits of autistic spectrum
disorder in children. Cyberpsychol Behav, 9(2):213-217
Vansteelandt, K., Van Mechelen, I., & Nezlek, J. B. (2005). The co-occurrence of emotions in
daily life: A multilevel approach. Journal of Research in Personality, 39(3):325-335
Vapnik, V. N. (1998). Statistical Learning Theory. Wiley-Interscience. New York, NY
Watkins, C. J. C. H. & Dayan, P. (1992). Q-Learning. Machine Learning, 8, May
Werry, I., Dautenhahn, K., & Harwin, W. (2001). Investigating a robot as a therapy partner for
children with autism, Proceedings of the 6th European conference for the advancement of
assistive technology, Ljubljana, Slovenia
Wieder, S., & Greenspan, S. (2005). Can Children with Autism Master the Core Deficits and
Become Empathetic, Creative, and Reflective? The Journal of Developmental and
Learning Disorders, 9
Wiering, M. A. (2005). QV (λ)-learning: A New On-policy Reinforcement Learning
Algorithm, European Workshop on Reinforcement Learning, Napoli, Italy, October