Authors:
Juan Pablo A. Heredia 1; Yudith Cardinale 1,2; Irvin Dongo 1,3 and Jose Díaz-Amado 1,4
Affiliations:
1 Computer Science Department, Universidad Católica San Pablo, 04001 Arequipa, Peru
2 Computer Science Department, Universidad Simón Bolívar, 1080 Caracas, Venezuela
3 Univ. Bordeaux, ESTIA Institute of Technology, 64210 Bidart, France
4 Electrical Engineering, Instituto Federal da Bahia, 45078-300 Vitoria da Conquista, Brazil
Keyword(s):
Emotion Recognition, Multi-modal Method, Emotion Ontology, Visual Expressions.
Abstract:
Human emotion recognition from visual expressions is an important research area in computer vision and machine learning owing to its significant scientific and commercial potential. Since visual expressions can be captured from different modalities (e.g., facial expressions, body posture, hand pose), multi-modal methods are becoming popular for analyzing human reactions. In contexts in which human emotion detection is performed to associate emotions with certain events or objects, either to support decision making or for further analysis, it is useful to keep this information in semantic repositories, which offer a wide range of possibilities for implementing smart applications. We propose a multi-modal method for human emotion recognition and an ontology-based approach to store the classification results in EMONTO, an extensible ontology to model emotions. The multi-modal method analyzes facial expressions, body gestures, and features from the body and the environment to determine an emotional state; it processes each modality with a specialized deep learning model and then applies a fusion method. Our fusion method, called EmbraceNet+, consists of a branched architecture that integrates the EmbraceNet fusion method with other fusion methods. We experimentally evaluate our multi-modal method on an adaptation of the EMOTIC dataset. Results show that our method outperforms the single-modal methods.