DOI: 10.1145/3447548.3467410

TopNet: Learning from Neural Topic Model to Generate Long Stories

Published: 14 August 2021

Abstract

Long story generation (LSG) is one of the coveted goals in natural language processing. Unlike most text generation tasks, LSG requires outputting a long story of rich content based on a much shorter text input, and it often suffers from information sparsity. In this paper, we propose TopNet to alleviate this problem by leveraging recent advances in neural topic modeling to obtain high-quality skeleton words that complement the short input. In particular, instead of directly generating a story, we first learn to map the short text input to a low-dimensional topic distribution (which is pre-assigned by a topic model). Based on this latent topic distribution, we can use the reconstruction decoder of the topic model to sample a sequence of inter-related words as a skeleton for the story. Experiments on two benchmark datasets show that our proposed framework is highly effective in skeleton word selection and significantly outperforms the state-of-the-art models in both automatic and human evaluation.

Supplementary Material

MP4 File (topnet_learning_from_neural_topic-yazheng_yang-boyuan_pan-38957984-CKYm.mp4)
Presentation video - TopNet: Learning from Neural Topic Model to Generate Long Stories. Long story generation requires outputting a long story of rich content based on a much shorter text input, and often suffers from information sparsity. In this work, we propose TopNet to alleviate this problem by leveraging recent advances in neural topic modeling to obtain high-quality skeleton words that complement the short input.



Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
August 2021
4259 pages
ISBN: 978-1-4503-8332-5
DOI: 10.1145/3447548
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. deep learning
  2. long story generation
  3. natural language processing
  4. story telling
  5. topic model

Qualifiers

  • Research-article

Conference

KDD '21

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)


Article Metrics

  • Downloads (last 12 months): 43
  • Downloads (last 6 weeks): 1

Reflects downloads up to 30 Aug 2024

Cited By

  • (2024) A survey on neural topic models: methods, applications, and challenges. Artificial Intelligence Review 57(2). https://doi.org/10.1007/s10462-023-10661-7. Online publication date: 25-Jan-2024.
  • (2023) Topic-aware Intention Network for Explainable Recommendation with Knowledge Enhancement. ACM Transactions on Information Systems 41(4), 1-23. https://doi.org/10.1145/3579993. Online publication date: 8-Apr-2023.
  • (2023) MtiRec: A Medical Test Recommender System based on the Analysis of Treatment Programs. 2023 IEEE International Conference on Data Mining (ICDM), 898-907. https://doi.org/10.1109/ICDM58522.2023.00099. Online publication date: 1-Dec-2023.
  • (2023) Using neural topic model and CamemBERT to extract topics from Moroccan news in the French language. International Conference on Advances in Communication Technology and Computer Engineering, 020004. https://doi.org/10.1063/5.0148733. Online publication date: 2023.
