Neural Conversation Generation with Auxiliary Emotional Supervised Models

Published: 17 September 2019

Abstract

An important aspect of developing dialogue agents is endowing a conversation system with emotion perception and interaction. Most existing emotional dialogue models adapt and extend poorly to different scenarios because they either require a specified emotion category or rely on a fixed emotional dictionary. To overcome these limitations, we propose neural conversation generation with an auxiliary emotional supervised model (nCG-ESM), which comprises a sequence-to-sequence (Seq2Seq) generation model and an emotional classifier used as an auxiliary model. The emotional classifier is trained to predict the emotion distributions of the dialogues; these distributions then serve as emotion supervision signals that guide the generation model toward diverse emotional responses. The proposed nCG-ESM is flexible enough to generate emotionally diverse responses with either specified or unspecified emotions, and it can be adapted and extended to different scenarios. We conducted extensive experiments on a popular dataset of Weibo post-response pairs. Experimental results showed that the proposed model produced more diverse, appropriate, and emotionally rich responses, yielding substantial gains in diversity scores and human evaluations.
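
The abstract does not state the training objective, so the following is only an illustrative sketch of how an emotion distribution from an auxiliary classifier could act as a supervision signal next to the usual Seq2Seq likelihood. The KL-divergence term, the `weight` parameter, and every function name here are assumptions made for illustration, not the authors' formulation.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(token_probs, target_ids):
    """Mean negative log-likelihood of the reference response tokens."""
    nll = -sum(math.log(p[t]) for p, t in zip(token_probs, target_ids))
    return nll / len(target_ids)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two emotion distributions over the same categories."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def combined_loss(token_probs, target_ids, target_emotion, predicted_emotion, weight=0.5):
    """Generation loss plus a weighted emotion term (hypothetical combination).

    target_emotion: distribution supplied by the auxiliary classifier (assumed).
    predicted_emotion: distribution attributed to the generated response (assumed).
    """
    generation = cross_entropy(token_probs, target_ids)
    emotion = kl_divergence(target_emotion, predicted_emotion)
    return generation + weight * emotion

# Toy example: two decoding steps over a three-word vocabulary,
# and three made-up emotion categories.
token_probs = [softmax([2.0, 0.5, 0.1]), softmax([0.2, 1.5, 0.3])]
target_ids = [0, 1]
target_emotion = [0.7, 0.2, 0.1]
predicted_emotion = softmax([1.0, 0.2, -0.5])
print(combined_loss(token_probs, target_ids, target_emotion, predicted_emotion))
```

Because the target here is a full distribution rather than a single label, the same term would apply whether the desired emotion is specified (a peaked distribution) or left unspecified (whatever the classifier predicts), which is one plausible reading of the flexibility the abstract describes.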


    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 2
    March 2020
    301 pages
    ISSN: 2375-4699
    EISSN: 2375-4702
    DOI: 10.1145/3358605
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from the publisher.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 September 2019
    Accepted: 01 July 2019
    Revised: 01 February 2019
    Received: 01 April 2018
    Published in TALLIP Volume 19, Issue 2

    Author Tags

    1. Neural conversation
    2. natural language processing
    3. sequence-to-sequence model

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National Natural Science Foundation of China
    • Fundamental Research Funds for the Central Universities

    Cited By

    • (2024) RLCA: Reinforcement Learning Model Integrating Cognition and Affection for Empathetic Response Generation. IEEE Transactions on Computational Social Systems 11:1 (1158-1168). DOI: 10.1109/TCSS.2023.3258741. Online publication date: Feb-2024
    • (2023) Analysis of Emotions in Speech Acts for Chatbots: An Overview and a Model Proposal. 2023 IEEE Symposium Series on Computational Intelligence (SSCI) (1467-1471). DOI: 10.1109/SSCI52147.2023.10371792. Online publication date: 5-Dec-2023
    • (2022) Emotionally Intelligent Chatbots: A Systematic Literature Review. Human Behavior and Emerging Technologies 2022 (1-23). DOI: 10.1155/2022/9601630. Online publication date: 26-Sep-2022
    • (2022) Emotion Recognition with Conversational Generation Transfer. ACM Transactions on Asian and Low-Resource Language Information Processing 21:4 (1-17). DOI: 10.1145/3494532. Online publication date: 19-Jan-2022
    • (2022) Emotional Conversation Generation With Bilingual Interactive Decoding. IEEE Transactions on Computational Social Systems 9:3 (818-829). DOI: 10.1109/TCSS.2021.3095479. Online publication date: Jun-2022
    • (2022) Enhancing Integrity Modeling for Emotional Conversation Generation. IEEE Transactions on Cognitive and Developmental Systems 14:3 (1170-1178). DOI: 10.1109/TCDS.2021.3098444. Online publication date: Sep-2022
    • (2022) Exemplars-Guided Empathetic Response Generation Controlled by the Elements of Human Communication. IEEE Access 10 (77176-77190). DOI: 10.1109/ACCESS.2022.3193159. Online publication date: 2022
    • (2022) A multi-label emoji classification method using balanced pointwise mutual information-based feature selection. Computer Speech and Language 73:C. DOI: 10.1016/j.csl.2021.101330. Online publication date: 1-May-2022
    • (2021) Predicting Emotion Reactions for Human–Computer Conversation: A Variational Approach. IEEE Transactions on Human-Machine Systems 51:4 (279-287). DOI: 10.1109/THMS.2020.3044975. Online publication date: Aug-2021
    • (2021) Topic-extended Emotional Conversation Generation Model Based on Joint Decoding. IEEE Access 9 (89934-89940). DOI: 10.1109/ACCESS.2021.3090435. Online publication date: 2021
