DOI: 10.1145/3638584.3638617

LDSeq: Latent Diffusion Models for Sequence to Sequence Text Generation

Published: 14 March 2024

Abstract

Diffusion models have demonstrated remarkable success in generating continuous data such as images and audio. Previous studies on text generation with continuous diffusion models have revealed the potential of the diffusion framework, but challenges such as embedding collapse persist and limit overall generation performance. In this paper, we introduce LDSeq, a latent diffusion framework with a two-stage training procedure for sequence-to-sequence text generation. In the proposed framework, we first train a Variational Auto-Encoder (VAE) on downstream datasets to compress the target text of each sample into a continuous latent space, and then train a conditional latent diffusion model in that fixed latent space, where latent vectors are iteratively sampled conditioned on the input source text. The disjoint training stages prevent the collapse of the diffusion space. Experimental results on paraphrase generation and text summarization datasets show that LDSeq achieves comparable or superior performance to autoregressive (AR) and non-autoregressive (NAR) baselines while requiring a lower training cost. Furthermore, we discuss potential future directions for enhancing diffusion models in the text generation domain.
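The abstract only sketches the pipeline at a high level. As a rough, hypothetical illustration of the two disjoint training stages it describes (a VAE over the target text, then a source-conditioned diffusion model trained in the frozen latent space), the following PyTorch sketch shows the idea. All module names, dimensions, and the simple GRU/MLP components are assumptions made for illustration; they are not the authors' implementation, which would use Transformer encoders/decoders and a tuned noise schedule.

```python
# Minimal sketch (not the authors' code) of a two-stage latent-diffusion
# text generation pipeline: stage 1 trains a VAE on target text, stage 2
# trains a diffusion model in the frozen latent space, conditioned on source text.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, HIDDEN, VOCAB, T_STEPS = 64, 128, 1000, 100

class TargetVAE(nn.Module):
    """Stage 1: compress target token sequences into a continuous latent vector."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.enc = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.to_mu = nn.Linear(HIDDEN, LATENT_DIM)
        self.to_logvar = nn.Linear(HIDDEN, LATENT_DIM)
        self.dec = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.from_z = nn.Linear(LATENT_DIM, HIDDEN)
        self.out = nn.Linear(HIDDEN, VOCAB)

    def encode(self, y):
        _, h = self.enc(self.embed(y))           # h: (1, B, HIDDEN)
        h = h.squeeze(0)
        return self.to_mu(h), self.to_logvar(h)

    def decode(self, z, y):
        h0 = self.from_z(z).unsqueeze(0)         # condition the decoder on latent z
        out, _ = self.dec(self.embed(y), h0)
        return self.out(out)                     # (B, L, VOCAB) logits

    def loss(self, y):
        mu, logvar = self.encode(y)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        rec = F.cross_entropy(self.decode(z, y).transpose(1, 2), y)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        return rec + 1e-3 * kl                   # small KL weight keeps latents informative

class LatentDiffusion(nn.Module):
    """Stage 2: denoise target latents conditioned on the source text."""
    def __init__(self):
        super().__init__()
        self.src_embed = nn.Embedding(VOCAB, HIDDEN)
        self.src_enc = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + HIDDEN + 1, HIDDEN), nn.SiLU(),
            nn.Linear(HIDDEN, LATENT_DIM))
        betas = torch.linspace(1e-4, 0.02, T_STEPS)
        self.register_buffer("abar", torch.cumprod(1.0 - betas, dim=0))

    def cond(self, x):
        _, h = self.src_enc(self.src_embed(x))   # summary of the source text
        return h.squeeze(0)

    def loss(self, z0, x):
        t = torch.randint(0, T_STEPS, (z0.size(0),), device=z0.device)
        a = self.abar[t].unsqueeze(-1)
        eps = torch.randn_like(z0)
        zt = a.sqrt() * z0 + (1 - a).sqrt() * eps            # forward noising
        inp = torch.cat([zt, self.cond(x), t.float().unsqueeze(-1) / T_STEPS], dim=-1)
        return F.mse_loss(self.net(inp), eps)                # predict the added noise

# --- disjoint training stages on toy data ---
src = torch.randint(0, VOCAB, (8, 12))
tgt = torch.randint(0, VOCAB, (8, 12))

vae = TargetVAE()
vae_opt = torch.optim.Adam(vae.parameters(), lr=1e-3)
vae_opt.zero_grad(); vae.loss(tgt).backward(); vae_opt.step()   # stage 1 step

vae.eval()
with torch.no_grad():                                           # freeze VAE: fixed latent space
    z0, _ = vae.encode(tgt)

diff = LatentDiffusion()
diff_opt = torch.optim.Adam(diff.parameters(), lr=1e-3)
diff_opt.zero_grad(); diff.loss(z0, src).backward(); diff_opt.step()  # stage 2 step
```

At inference time, one would sample a latent by iteratively denoising Gaussian noise with the diffusion model (conditioned on the source text) and then decode it with the frozen VAE decoder; keeping the two stages disjoint is what the abstract credits with preventing collapse of the diffusion space.
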


    Published In

    CSAI '23: Proceedings of the 2023 7th International Conference on Computer Science and Artificial Intelligence
    December 2023
    563 pages
    ISBN:9798400708688
    DOI:10.1145/3638584

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Diffusion probabilistic models
    2. Latent variable models
    3. Sequence-to-sequence text generation

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    CSAI 2023
