
μ-Forcing: Training Variational Recurrent Autoencoders for Text Generation

Published: 13 July 2019

Abstract

It has been previously observed that training Variational Recurrent Autoencoders (VRAEs) for text generation suffers from a serious uninformative-latent-variable problem: the model collapses into a plain language model that entirely ignores the latent variables and can only generate repetitive, dull samples. In this article, we explore the reason behind this issue and propose an effective regularizer-based approach to address it. The proposed method directly injects extra constraints on the posteriors of the latent variables into the learning process of the VRAE, which flexibly and stably controls the tradeoff between the Kullback-Leibler (KL) term and the reconstruction term, making the model learn dense and meaningful latent representations. Experimental results show that the proposed method outperforms several strong baselines, learns interpretable latent variables, and generates diverse, meaningful sentences. Furthermore, it performs well without relying on additional strategies such as KL annealing.
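To make the idea concrete, the sketch below shows one way such a constraint could be attached to a standard VRAE training objective. This is a minimal PyTorch illustration under stated assumptions, not the authors' exact formulation: the function name vrae_loss, the hyperparameters reg_weight and mu_target, and the hinge-style penalty on the posterior mean are introduced here only for clarity, since this section does not give the precise form of the regularizer.

import torch
import torch.nn.functional as F

def vrae_loss(logits, targets, mu, logvar, reg_weight=1.0, mu_target=1.0):
    # logits: decoder outputs, shape [batch, seq_len, vocab]
    # targets: gold token ids, shape [batch, seq_len]
    # mu, logvar: parameters of the diagonal Gaussian posterior q(z|x), shape [batch, latent_dim]

    # Reconstruction term: token-level cross-entropy, summed over tokens and averaged over the batch.
    recon = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        reduction="sum",
    ) / logits.size(0)

    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian posterior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()

    # Assumed extra constraint on the posterior: keep the squared norm of mu above a floor
    # (mu_target) so that q(z|x) cannot simply match the prior and be ignored by the decoder.
    mu_reg = F.relu(mu_target - mu.pow(2).sum(dim=1)).mean()

    return recon + kl + reg_weight * mu_reg

In this sketch the extra term only activates when the posterior mean collapses toward the prior, which is one way to keep the KL term from vanishing while the reconstruction term is optimized; the weight reg_weight then controls the tradeoff between the two.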




    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 1
    January 2020, 345 pages
    ISSN: 2375-4699
    EISSN: 2375-4702
    DOI: 10.1145/3338846

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 July 2019
    Accepted: 01 May 2019
    Revised: 01 March 2019
    Received: 01 September 2018
    Published in TALLIP Volume 19, Issue 1


    Author Tags

    1. Variational autoencoders
    2. uninformative latent variables issue
    3. variational recurrent autoencoders

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National Key R&D Program of China
    • National Natural Science Fund for Distinguished Young Scholar
    • State Key Program of National Science Foundation of China

