research-article

Emotion Recognition with Conversational Generation Transfer

Authors:

Zhongqing Wang,

Qinglei Zhou

Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 4

Article No.: 70, Pages 1 - 17

https://doi.org/10.1145/3494532

Published: 19 January 2022

Abstract

Emotion recognition in conversation is an essential task in natural language processing. However, annotated data for this task is scarce, because such data is hard to collect and annotate. Meanwhile, large-scale data for conversational generation is available and needs no manual annotation. The difficulty is that the vector spaces learned from different datasets may not be similar. Therefore, we use the same dataset to train both the conversational generator and the classifier, and transfer knowledge between them. In particular, we propose an Emotion Recognition with Conversational Generation Transfer (ERCGT) framework that models the interaction among utterances through transfer learning. First, we train a conversational generator. Second, a transfer learning model transfers the generator’s knowledge to the emotion recognition model. Empirical studies on three benchmark emotion classification datasets demonstrate the effectiveness of the proposed framework over several strong baselines.
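The abstract's central design choice, training the generator and the classifier on the same conversations, can be illustrated with a small data-preparation sketch. This is not the authors' implementation: the `Utterance` record, both function names, and the toy dialogue are hypothetical, and the paper's actual preprocessing is not described here. The sketch only shows how one emotion-labelled dialogue could yield label-free (context, response) pairs for a generator and labelled (context, utterance, emotion) examples for a classifier.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Utterance:
    """One turn of a conversation (hypothetical record layout)."""
    speaker: str
    text: str
    emotion: str  # gold label; only the classifier examples use it


def generation_pairs(dialogue: List[Utterance]) -> List[Tuple[List[str], str]]:
    """Build (context, response) pairs for a generator; no labels needed."""
    return [
        ([u.text for u in dialogue[:i]], dialogue[i].text)
        for i in range(1, len(dialogue))
    ]


def classification_examples(
    dialogue: List[Utterance],
) -> List[Tuple[List[str], str, str]]:
    """Build (context, utterance, emotion) examples for a classifier."""
    return [
        ([u.text for u in dialogue[:i]], u.text, u.emotion)
        for i, u in enumerate(dialogue)
    ]


# Toy dialogue, invented for illustration only.
dialogue = [
    Utterance("A", "I lost my keys again.", "sadness"),
    Utterance("B", "Oh no, let me help you look.", "neutral"),
    Utterance("A", "Found them, thank you!", "joy"),
]

gen_data = generation_pairs(dialogue)
clf_data = classification_examples(dialogue)
```

Because both sets are derived from the same conversations, a model trained on `gen_data` would see the same vocabulary and dialogue style as one trained on `clf_data`, which is the motivation the abstract gives for avoiding a mismatch between vector spaces.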



Index Terms

Emotion Recognition with Conversational Generation Transfer
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Discourse, dialogue and pragmatics



Published In

ACM Transactions on Asian and Low-Resource Language Information Processing Volume 21, Issue 4

July 2022

464 pages

ISSN:2375-4699

EISSN:2375-4702

DOI:10.1145/3511099

Editor:
Imed Zitouni
Google, USA


Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 January 2022

Accepted: 01 October 2021

Revised: 01 August 2021

Received: 01 December 2020

Published in TALLIP Volume 21, Issue 4


Qualifiers

Research-article
Refereed

Funding Sources

National Key Research and Development Program of China
National Natural Science Foundation of China
Jiangsu High School Research Grant
