DOI: 10.1145/3366423.3380009

Unsupervised Dual-Cascade Learning with Pseudo-Feedback Distillation for Query-Focused Extractive Summarization

Published: 20 April 2020

Abstract

We propose Dual-CES, a novel unsupervised, query-focused, multi-document extractive summarizer. Dual-CES builds on top of the Cross Entropy Summarizer (CES) and is designed to better handle the tradeoff between saliency and focus in summarization. To this end, Dual-CES employs a two-step dual-cascade optimization approach with saliency-based pseudo-feedback distillation. Overall, Dual-CES significantly outperforms all other state-of-the-art unsupervised alternatives. It is even shown to outperform strong supervised summarizers.
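
The abstract describes a two-step, cross-entropy-style selection procedure in which a first saliency-oriented pass distills pseudo-feedback that steers a second focus-oriented pass. The Python sketch below is only a rough illustration of that control flow under simplifying assumptions: the ce_select routine, the term-overlap quality functions, and all hyper-parameter values are hypothetical placeholders, not the objectives or settings used in the paper.

# Minimal sketch of a dual-cascade cross-entropy selection loop.
# Quality functions and hyper-parameters are illustrative placeholders.
import numpy as np

def tokenize(text):
    return set(text.lower().split())

def ce_select(sentences, quality_fn, budget=250, n_samples=500,
              elite_frac=0.1, alpha=0.7, n_iters=20, seed=0):
    """Cross-entropy method: learn per-sentence inclusion probabilities."""
    rng = np.random.default_rng(seed)
    p = np.full(len(sentences), 0.5)                 # inclusion probabilities
    lengths = np.array([len(s.split()) for s in sentences])
    best_mask, best_score = np.zeros(len(sentences), dtype=bool), -np.inf
    for _ in range(n_iters):
        masks, scores = [], []
        for _ in range(n_samples):
            mask = rng.random(len(sentences)) < p
            if lengths[mask].sum() > budget:         # respect the length budget
                continue
            masks.append(mask)
            scores.append(quality_fn([s for s, m in zip(sentences, mask) if m]))
        if not masks:
            continue
        order = np.argsort(scores)[::-1]
        elite = np.array(masks)[order[:max(1, int(elite_frac * len(masks)))]]
        p = alpha * elite.mean(axis=0) + (1 - alpha) * p   # smoothed CE update
        if scores[order[0]] > best_score:
            best_score, best_mask = scores[order[0]], masks[order[0]]
    return [s for s, m in zip(sentences, best_mask) if m]

def dual_cascade_summarize(sentences, query, budget=250):
    # Step 1: saliency-oriented pass (placeholder objective: coverage of
    # the document term vocabulary).
    doc_terms = tokenize(" ".join(sentences))
    saliency_summary = ce_select(
        sentences, lambda summ: len(tokenize(" ".join(summ)) & doc_terms), budget)

    # Distill pseudo-feedback: dominant terms of the saliency summary.
    pseudo_terms = tokenize(" ".join(saliency_summary))

    # Step 2: focus-oriented pass, biased toward the query plus distilled terms.
    query_terms = tokenize(query)
    focus_fn = lambda summ: (2 * len(tokenize(" ".join(summ)) & query_terms)
                             + len(tokenize(" ".join(summ)) & pseudo_terms))
    return ce_select(sentences, focus_fn, budget)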




        Published In

        WWW '20: Proceedings of The Web Conference 2020
        April 2020
        3143 pages
        ISBN:9781450370233
        DOI:10.1145/3366423

        Publisher

        Association for Computing Machinery, New York, NY, United States

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        WWW '20: The Web Conference 2020
        April 20 - 24, 2020
        Taipei, Taiwan

        Acceptance Rates

        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%


