DOI: 10.1145/3447548.3467410

TopNet: Learning from Neural Topic Model to Generate Long Stories

Published: 14 August 2021

Abstract

Long story generation (LSG) is one of the coveted goals in natural language processing. Unlike most text generation tasks, LSG requires outputting a long story of rich content based on a much shorter text input, and it often suffers from information sparsity. In this paper, we propose TopNet to alleviate this problem by leveraging recent advances in neural topic modeling to obtain high-quality skeleton words that complement the short input. In particular, instead of directly generating a story, we first learn to map the short text input to a low-dimensional topic distribution (which is pre-assigned by a topic model). Based on this latent topic distribution, we can use the reconstruction decoder of the topic model to sample a sequence of inter-related words as a skeleton for the story. Experiments on two benchmark datasets show that our proposed framework is highly effective in skeleton word selection and significantly outperforms the state-of-the-art models in both automatic and human evaluation.

Supplementary Material

MP4 File (topnet_learning_from_neural_topic-yazheng_yang-boyuan_pan-38957984-CKYm.mp4)
Presentation video - TopNet: Learning from Neural Topic Model to Generate Long Stories. Long story generation requires outputting a long story of rich content based on a much shorter text input, and often suffers from information sparsity. In this work, we propose TopNet to alleviate this problem by leveraging recent advances in neural topic modeling to obtain high-quality skeleton words that complement the short input.



Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
August 2021
4259 pages
ISBN: 978-1-4503-8332-5
DOI: 10.1145/3447548
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. deep learning
  2. long story generation
  3. natural language processing
  4. story telling
  5. topic model

Qualifiers

  • Research-article

Conference

KDD '21

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)


Article Metrics

  • Downloads (last 12 months): 43
  • Downloads (last 6 weeks): 1

Reflects downloads up to 30 Aug 2024

Cited By

  • (2024) A survey on neural topic models: methods, applications, and challenges. Artificial Intelligence Review 57(2). https://doi.org/10.1007/s10462-023-10661-7. Online publication date: 25-Jan-2024.
  • (2023) Topic-aware Intention Network for Explainable Recommendation with Knowledge Enhancement. ACM Transactions on Information Systems 41(4), 1-23. https://doi.org/10.1145/3579993. Online publication date: 8-Apr-2023.
  • (2023) MtiRec: A Medical Test Recommender System based on the Analysis of Treatment Programs. 2023 IEEE International Conference on Data Mining (ICDM), 898-907. https://doi.org/10.1109/ICDM58522.2023.00099. Online publication date: 1-Dec-2023.
  • (2023) Using neural topic model and CamemBERT to extract topics from Moroccan news in the French language. International Conference on Advances in Communication Technology and Computer Engineering, 020004. https://doi.org/10.1063/5.0148733. Online publication date: 2023.
