Papers by Behnaz Nojavanasghari
A person's face discloses important information about their affective state. Although there has been extensive research on the recognition of facial expressions, the performance of existing approaches is challenged by facial occlusions. Facial occlusions are often treated as noise and discarded in the recognition of affective states. However, hand-over-face occlusions can provide additional information for recognizing some affective states, such as curiosity, frustration, and boredom. One reason this problem has not gained attention is the lack of naturalistic occluded faces that contain hand-over-face occlusions as well as other types of occlusions. Traditional approaches to obtaining affective data are time-demanding and expensive, which limits researchers in affective computing to small datasets. This limitation affects the generalizability of models and prevents researchers from taking advantage of recent advances in deep learning, which have shown great success in many fields but require large volumes of data. In this paper, we first introduce a novel framework for synthesizing naturalistic facial occlusions from an initial dataset of non-occluded faces and separate images of hands, reducing the costly process of data collection and annotation. We then propose a model for facial occlusion type recognition that differentiates between hand-over-face occlusions and other types of occlusions such as scarves, hair, glasses, and objects. Finally, we present a model to localize hand-over-face occlusions and identify the occluded regions of the face.
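One way to picture the synthesis step described in this abstract is simple alpha compositing of a segmented hand crop onto a non-occluded face. The sketch below is purely illustrative and assumes a precomputed hand mask and placement; the function name and toy images are hypothetical, not the paper's actual pipeline:

```python
import numpy as np

def synthesize_occlusion(face, hand, mask, top, left):
    """Paste a segmented hand crop onto a face image via alpha blending.

    face: (H, W, 3) float array, the non-occluded face.
    hand: (h, w, 3) float array, the hand crop.
    mask: (h, w) float array in [0, 1], 1 where the hand is opaque.
    top, left: placement of the crop's upper-left corner on the face.
    """
    out = face.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]
    alpha = mask[..., None]  # broadcast the mask over the 3 color channels
    region[:] = alpha * hand + (1.0 - alpha) * region
    return out

# Toy example: a gray "face" partially covered by a white "hand".
face = np.full((8, 8, 3), 0.5)
hand = np.ones((4, 4, 3))
mask = np.ones((4, 4))
occluded = synthesize_occlusion(face, hand, mask, 2, 2)
```

In practice the placement, scale, and rotation of the hand would be sampled to produce varied, naturalistic occlusions, and blending at the mask boundary would be softened.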
Avatar-mediated and virtual environments hold a unique potential for promoting social skills in children with autism. This paper describes the design of "Exceptionally Social," an interactive system that uses avatars to mediate human-to-human interactions for social skills training of children with autism. The system aims to offer the following functionalities: (1) it gives children the opportunity to practice social skills in a safe environment, under various contexts; (2) it changes the dynamics of the interactions based on the child's affective states; (3) it provides visual support to teach children different social skills and facilitate their learning; and (4) it reduces the cognitive load on the interactor (a trained human orchestrating the avatars' behaviors) by providing real-time feedback about a child's affective states and suggesting appropriate visual supports using a recommendation system.
Curiosity plays a crucial role in children's learning and education. Given its complex nature, it is extremely challenging to automatically understand and recognize it. In this paper, we discuss the contexts under which curiosity can be elicited and provide an associated taxonomy. We present an initial empirical study of curiosity that includes an analysis of co-occurring emotions and the valence associated with curiosity, together with a gender-specific analysis. We also discuss the visual, acoustic, and verbal behavioral indicators of curiosity. Our discussion and analysis uncover some of the underlying complexities of curiosity and its temporal evolution, which is a step toward its automatic understanding and recognition. Finally, considering the central role of curiosity in education, we present two education-centered application areas that could greatly benefit from its automatic recognition.
Persuasiveness is a high-level personality trait that quantifies the influence a speaker has on the beliefs, attitudes, intentions, motivations, and behavior of the audience. With social multimedia becoming an important channel for propagating ideas and opinions, analyzing persuasiveness is very important. In this work, we use the publicly available Persuasive Opinion Multimedia (POM) dataset to study persuasion. One of the challenges associated with this problem is the limited amount of annotated data. To tackle this challenge, we present a deep multimodal fusion architecture that is able to leverage complementary information from individual modalities to predict persuasiveness. Our methods show significant improvement in performance over previous approaches.
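A minimal sketch of the kind of multimodal fusion this abstract describes: each modality is embedded by its own subnetwork, the embeddings are concatenated, and a final layer produces a persuasiveness score. All function names, parameter shapes, and the single-dense-layer "subnetworks" below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def modality_embed(x, W, b):
    # One dense layer with ReLU, standing in for a per-modality subnetwork.
    return np.maximum(0.0, x @ W + b)

def fused_persuasiveness_score(acoustic, visual, verbal, params):
    # Embed each modality separately, concatenate the embeddings, and
    # score with a final dense layer followed by a sigmoid.
    h = np.concatenate([
        modality_embed(acoustic, *params["a"]),
        modality_embed(visual, *params["v"]),
        modality_embed(verbal, *params["t"]),
    ])
    logit = h @ params["out"][0] + params["out"][1]
    return 1.0 / (1.0 + np.exp(-logit))  # score in [0, 1]

d_in, d_h = 16, 8  # toy feature and embedding sizes (illustrative)
params = {
    "a": (rng.normal(size=(d_in, d_h)), np.zeros(d_h)),
    "v": (rng.normal(size=(d_in, d_h)), np.zeros(d_h)),
    "t": (rng.normal(size=(d_in, d_h)), np.zeros(d_h)),
    "out": (rng.normal(size=3 * d_h), 0.0),
}
score = fused_persuasiveness_score(
    rng.normal(size=d_in), rng.normal(size=d_in), rng.normal(size=d_in), params
)
```

Learning the fusion jointly, rather than averaging per-modality predictions, is what lets the model exploit complementary cues when annotated data is scarce.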
Automatic emotion recognition plays a central role in the technologies underlying social robots, affect-sensitive human-computer interaction design, and affect-aware tutors. Although there has been a considerable amount of research on automatic emotion recognition in adults, emotion recognition in children has been understudied. The problem is more challenging because children tend to fidget and move around more than adults, leading to more self-occlusions and non-frontal head poses. Also, the lack of publicly available datasets of children with annotated emotion labels leads most researchers to focus on adults. In this paper, we introduce a newly collected multimodal emotion dataset of children between four and fourteen years old. The dataset contains 1,102 audiovisual clips annotated for 17 different emotional states: six basic emotions, neutral, valence, and nine complex emotions including curiosity, uncertainty, and frustration. Our experiments compare unimodal and multimodal emotion recognition baseline models to enable future research on this topic. Finally, we present a detailed analysis of the behavioral cues most indicative of emotion in children.
In this paper we propose a novel aesthetic model emphasizing psycho-visual statistics extracted at multiple levels, in contrast to earlier approaches that rely only on descriptors suited for image recognition or based on photographic principles. At the lowest level, we compute dark-channel, sharpness, and eye-sensitivity statistics over rectangular cells within a frame. At the next level, we extract Sentibank features (1,200 pre-trained visual classifiers) on a given frame, which detect concepts that invoke specific sentiments such as "colorful clouds" or "smiling face," and collect the classifier responses as frame-level statistics. At the topmost level, we extract trajectories from video shots. Using viewers' fixation priors, the trajectories are labeled as foreground or background/camera motion, and statistics are computed on each group. Additionally, spatio-temporal local binary patterns are computed to capture texture variations in a given shot. Classifiers are trained on the individual feature representations independently. After a thorough evaluation of 9 different types of features, we select the best features from each level: dark channel, affect, and camera motion statistics. The corresponding classifier scores are then integrated in a low-rank fusion framework to improve the final prediction scores. Our approach demonstrates strong correlation with human judgments on 1,000 broadcast-quality videos released by NHK as an aesthetic evaluation dataset.
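Of the descriptors mentioned above, local binary patterns are the easiest to illustrate. The sketch below is the generic 8-neighbour LBP on a single grayscale frame (one spatial slice of the spatio-temporal variant), not the paper's exact descriptor; the function name and toy frame are assumptions:

```python
import numpy as np

def lbp_codes(gray):
    """Minimal 8-neighbour local binary pattern for one grayscale frame.

    Each interior pixel is compared with its 8 neighbours; every neighbour
    that is >= the centre contributes one bit, giving a code in [0, 255].
    Histograms of these codes summarize local texture.
    """
    c = gray[1:-1, 1:-1]  # interior pixels (codes undefined on the border)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = gray[1 + dy:gray.shape[0] - 1 + dy,
                     1 + dx:gray.shape[1] - 1 + dx]
        codes |= (neigh >= c).astype(np.uint8) << bit
    return codes

# A flat frame: every neighbour equals the centre, so every bit is set.
frame = np.full((5, 5), 3.0)
codes = lbp_codes(frame)
```

The spatio-temporal extension applies the same comparison in neighbouring frames as well, capturing how texture varies over time within a shot.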
Drafts by Behnaz Nojavanasghari
A social interaction is a social exchange between two or more individuals in which the individuals modify and adjust their behaviors in response to their interaction partners. Social interactions are one of the most fundamental aspects of our lives and can profoundly affect our mood, both positively and negatively. With growing interest in virtual reality and avatar-mediated interactions, it is desirable to make these interactions natural and human-like, to promote a positive effect in interactions and in applications such as intelligent tutoring systems, automated interview systems, and e-learning. In this paper, we propose a method to generate facial behaviors for an agent. These behaviors include facial expressions and head pose, and they are generated with the user's affective state taken into account. Our models learn semantically meaningful representations of the face and generate appropriate and temporally smooth facial behaviors in dyadic interactions.
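Temporal smoothness of the generated behaviors is one requirement this abstract states explicitly. A common, minimal way to enforce it is exponential smoothing over the per-frame behavior vectors (expression coefficients plus head-pose angles); this is a generic illustration under that assumption, not the paper's learned model:

```python
import numpy as np

def smooth_behaviors(raw_frames, alpha=0.3):
    """Exponentially smooth a (T, D) sequence of per-frame behavior
    vectors so the generated facial motion does not jitter frame to
    frame. Smaller alpha gives smoother, more sluggish motion."""
    smoothed = np.empty_like(raw_frames)
    smoothed[0] = raw_frames[0]
    for t in range(1, len(raw_frames)):
        smoothed[t] = alpha * raw_frames[t] + (1 - alpha) * smoothed[t - 1]
    return smoothed

# A constant behavior sequence passes through unchanged.
raw = np.tile(np.array([0.0, 1.0]), (10, 1))
smoothed = smooth_behaviors(raw)
```

In a learned generator, smoothness is usually obtained instead from recurrent structure or a temporal loss term, but post-hoc filtering like this is a common baseline.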