
μ-Forcing: Training Variational Recurrent Autoencoders for Text Generation

Published: 13 July 2019

Abstract

It has been previously observed that training Variational Recurrent Autoencoders (VRAEs) for text generation suffers from a serious uninformative-latent-variable problem: the model collapses into a plain language model that entirely ignores the latent variables and can only generate repetitive, dull samples. In this article, we explore the reason behind this issue and propose an effective regularizer-based approach to address it. The proposed method directly injects extra constraints on the posteriors of the latent variables into the learning process of the VRAE, which flexibly and stably controls the tradeoff between the Kullback-Leibler (KL) term and the reconstruction term, making the model learn dense and meaningful latent representations. Experimental results show that the proposed method outperforms several strong baselines, learns interpretable latent variables, and generates diverse, meaningful sentences. Furthermore, it performs well without relying on additional strategies such as KL annealing.
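To make the idea concrete, the sketch below shows one way such a constraint could be attached to a standard VRAE training objective. This is a minimal PyTorch illustration under stated assumptions, not the authors' exact formulation: the function name vrae_loss, the hyperparameters reg_weight and mu_target, and the hinge-style penalty on the posterior mean are introduced here only for clarity, since this section does not give the precise form of the regularizer.

import torch
import torch.nn.functional as F

def vrae_loss(logits, targets, mu, logvar, reg_weight=1.0, mu_target=1.0):
    # logits: decoder outputs, shape [batch, seq_len, vocab]
    # targets: gold token ids, shape [batch, seq_len]
    # mu, logvar: parameters of the diagonal Gaussian posterior q(z|x), shape [batch, latent_dim]

    # Reconstruction term: token-level cross-entropy, summed over tokens and averaged over the batch.
    recon = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        reduction="sum",
    ) / logits.size(0)

    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian posterior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()

    # Assumed extra constraint on the posterior: keep the squared norm of mu above a floor
    # (mu_target) so that q(z|x) cannot simply match the prior and be ignored by the decoder.
    mu_reg = F.relu(mu_target - mu.pow(2).sum(dim=1)).mean()

    return recon + kl + reg_weight * mu_reg

In this sketch the extra term only activates when the posterior mean collapses toward the prior, which is one way to keep the KL term from vanishing while the reconstruction term is optimized; the weight reg_weight then controls the tradeoff between the two.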




    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 1
    January 2020, 345 pages
    ISSN: 2375-4699
    EISSN: 2375-4702
    DOI: 10.1145/3338846

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 July 2019
    Accepted: 01 May 2019
    Revised: 01 March 2019
    Received: 01 September 2018
    Published in TALLIP Volume 19, Issue 1


    Author Tags

    1. Variational autoencoders
    2. uninformative latent variables issue
    3. variational recurrent autoencoders

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National Key R&D Program of China
    • National Natural Science Fund for Distinguished Young Scholar
    • State Key Program of National Science Foundation of China

