
Neural Conversation Generation with Auxiliary Emotional Supervised Models

Published: 17 September 2019
    Abstract

    An important aspect of developing dialogue agents is endowing a conversation system with emotion perception and emotional interaction. Most existing emotional dialogue models adapt and extend poorly to different scenarios because they either require a specified emotion category or rely on a fixed emotional dictionary. To overcome these limitations, we propose neural conversation generation with an auxiliary emotional supervised model (nCG-ESM), comprising a sequence-to-sequence (Seq2Seq) generation model and an emotional classifier used as an auxiliary model. The emotional classifier was trained to predict the emotion distributions of the dialogues, which were then used as emotion supervision signals to guide the generation model toward diverse emotional responses. The proposed nCG-ESM can generate emotionally diverse responses with either specified or unspecified emotions, so it can be adapted and extended to different scenarios. We conducted extensive experiments on a popular dataset of Weibo post-response pairs. Experimental results showed that the proposed model was capable of producing more diverse, appropriate, and emotionally rich responses, yielding substantial gains in diversity scores and human evaluations.
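    The abstract does not spell out the training objective, so the following is only a minimal sketch of one way an emotion distribution could act as a supervision signal alongside a Seq2Seq loss. Everything in it is an assumption made for illustration: the function names (`softmax`, `cross_entropy`, `combined_loss`), the three-class emotion layout, the weighting term `weight`, and the choice of cross-entropy between a target emotion distribution and the distribution predicted for the generated response. It is not the authors' formulation.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def combined_loss(token_nll, target_emotion, response_emotion_logits, weight=0.5):
    """Hypothetical objective: generation loss plus a weighted emotion term.

    token_nll               -- negative log-likelihood from the Seq2Seq decoder
    target_emotion          -- distribution used as the supervision signal
    response_emotion_logits -- classifier scores for the generated response
    weight                  -- assumed trade-off hyperparameter
    """
    predicted = softmax(response_emotion_logits)
    return token_nll + weight * cross_entropy(target_emotion, predicted)


# Specified emotion: a one-hot target over three hypothetical classes.
target_specified = [0.0, 1.0, 0.0]
# Unspecified emotion: a soft target, e.g. taken from a classifier's output.
target_soft = softmax([0.2, 1.5, -0.3])

matching = combined_loss(2.0, target_specified, [0.0, 4.0, 0.0])
mismatched = combined_loss(2.0, target_specified, [4.0, 0.0, 0.0])
soft_case = combined_loss(2.0, target_soft, [0.2, 1.5, -0.3])
```

    In this sketch a response whose predicted emotion matches the target incurs a smaller penalty than a mismatched one, and the same function accepts either a one-hot target (specified emotion) or a soft distribution (unspecified emotion), which mirrors the flexibility the abstract describes.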

      Published In

      ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 19, Issue 2
      March 2020
      301 pages
      ISSN:2375-4699
      EISSN:2375-4702
      DOI:10.1145/3358605
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 17 September 2019
      Accepted: 01 July 2019
      Revised: 01 February 2019
      Received: 01 April 2018
      Published in TALLIP Volume 19, Issue 2

      Author Tags

      1. Neural conversation
      2. natural language processing
      3. sequence-to-sequence model

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • National Natural Science Foundation of China
      • Fundamental Research Funds for the Central Universities
