Neural Conversation Generation with Auxiliary Emotional Supervised Models

Authors:

Jiaheng Lu

ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Volume 19, Issue 2

Article No.: 19, Pages 1 - 17

https://doi.org/10.1145/3344788

Published: 17 September 2019

Abstract

An important aspect of developing dialogue agents is endowing a conversation system with the ability to perceive and express emotion. Most existing emotional dialogue models adapt and extend poorly to new scenarios because they either require a specified emotion category or rely on a fixed emotional dictionary. To overcome these limitations, we propose neural conversation generation with an auxiliary emotional supervised model (nCG-ESM), which comprises a sequence-to-sequence (Seq2Seq) generation model and an emotional classifier used as an auxiliary model. The emotional classifier is trained to predict the emotion distributions of the dialogues, and these distributions then serve as supervision signals that guide the generation model toward diverse emotional responses. Because nCG-ESM can generate responses with either specified or unspecified emotions, it can be adapted and extended to different scenarios. We conducted extensive experiments on a popular dataset of Weibo post-response pairs. The results show that the proposed model produces more diverse, appropriate, and emotionally rich responses, with substantial gains in diversity scores and human evaluations.
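The abstract does not say how the emotion supervision signal enters the training objective. As a minimal sketch of one common arrangement, and not the authors' formulation, the snippet below adds a weighted KL-divergence term between a target emotion distribution and the classifier's predicted distribution to the usual token-level cross-entropy of the generator. All names (`combined_loss`, `weight`), the four-category emotion space, and the choice of KL divergence are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(token_probs):
    """Mean negative log-likelihood of the reference tokens.

    token_probs holds the probability the generator assigned to each
    reference token of the response.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) over emotion categories."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


def combined_loss(token_probs, target_emotion, classifier_logits, weight=0.5):
    """Generation loss plus a weighted emotion-supervision term (assumed form)."""
    generation = cross_entropy(token_probs)
    emotion = kl_divergence(target_emotion, softmax(classifier_logits))
    return generation + weight * emotion


if __name__ == "__main__":
    # Hypothetical values: two reference tokens, four emotion categories.
    probs = [0.5, 0.25]
    print(combined_loss(probs, [0.25] * 4, [0.0] * 4))
    print(combined_loss(probs, [1.0, 0.0, 0.0, 0.0], [0.0] * 4))
```

In this sketch, a response whose predicted emotion distribution matches the target pays only the generation loss, while a mismatch adds a penalty scaled by `weight`. A target that is a one-hot vector corresponds to a specified emotion; a soft distribution corresponds to the unspecified case.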

Index Terms

Neural Conversation Generation with Auxiliary Emotional Supervised Models
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Index terms have been assigned to the content through auto-classification.

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing Volume 19, Issue 2

March 2020

301 pages

ISSN:2375-4699

EISSN:2375-4702

DOI:10.1145/3358605

Editor:
Imed Zitouni
Microsoft, USA

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 September 2019

Accepted: 01 July 2019

Revised: 01 February 2019

Received: 01 April 2018

Published in TALLIP Volume 19, Issue 2

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Natural Science Foundation of China
Fundamental Research Funds for the Central Universities
