DOI: 10.1145/3366423.3380009

Unsupervised Dual-Cascade Learning with Pseudo-Feedback Distillation for Query-Focused Extractive Summarization

Published: 20 April 2020

Abstract

We propose Dual-CES, a novel unsupervised, query-focused, multi-document extractive summarizer. Dual-CES builds on top of the Cross Entropy Summarizer (CES) and is designed to better handle the tradeoff between saliency and focus in summarization. To this end, Dual-CES employs a two-step dual-cascade optimization approach with saliency-based pseudo-feedback distillation. Overall, Dual-CES significantly outperforms all other state-of-the-art unsupervised alternatives. It is even shown to outperform strong supervised summarizers.
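
The abstract describes a two-step, cross-entropy-style selection procedure in which a first saliency-oriented pass distills pseudo-feedback that steers a second focus-oriented pass. The Python sketch below is only a rough illustration of that control flow under simplifying assumptions: the ce_select routine, the term-overlap quality functions, and all hyper-parameter values are hypothetical placeholders, not the objectives or settings used in the paper.

# Minimal sketch of a dual-cascade cross-entropy selection loop.
# Quality functions and hyper-parameters are illustrative placeholders.
import numpy as np

def tokenize(text):
    return set(text.lower().split())

def ce_select(sentences, quality_fn, budget=250, n_samples=500,
              elite_frac=0.1, alpha=0.7, n_iters=20, seed=0):
    """Cross-entropy method: learn per-sentence inclusion probabilities."""
    rng = np.random.default_rng(seed)
    p = np.full(len(sentences), 0.5)                 # inclusion probabilities
    lengths = np.array([len(s.split()) for s in sentences])
    best_mask, best_score = np.zeros(len(sentences), dtype=bool), -np.inf
    for _ in range(n_iters):
        masks, scores = [], []
        for _ in range(n_samples):
            mask = rng.random(len(sentences)) < p
            if lengths[mask].sum() > budget:         # respect the length budget
                continue
            masks.append(mask)
            scores.append(quality_fn([s for s, m in zip(sentences, mask) if m]))
        if not masks:
            continue
        order = np.argsort(scores)[::-1]
        elite = np.array(masks)[order[:max(1, int(elite_frac * len(masks)))]]
        p = alpha * elite.mean(axis=0) + (1 - alpha) * p   # smoothed CE update
        if scores[order[0]] > best_score:
            best_score, best_mask = scores[order[0]], masks[order[0]]
    return [s for s, m in zip(sentences, best_mask) if m]

def dual_cascade_summarize(sentences, query, budget=250):
    # Step 1: saliency-oriented pass (placeholder objective: coverage of
    # the document term vocabulary).
    doc_terms = tokenize(" ".join(sentences))
    saliency_summary = ce_select(
        sentences, lambda summ: len(tokenize(" ".join(summ)) & doc_terms), budget)

    # Distill pseudo-feedback: dominant terms of the saliency summary.
    pseudo_terms = tokenize(" ".join(saliency_summary))

    # Step 2: focus-oriented pass, biased toward the query plus distilled terms.
    query_terms = tokenize(query)
    focus_fn = lambda summ: (2 * len(tokenize(" ".join(summ)) & query_terms)
                             + len(tokenize(" ".join(summ)) & pseudo_terms))
    return ce_select(sentences, focus_fn, budget)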




        Published In

        WWW '20: Proceedings of The Web Conference 2020
        April 2020
        3143 pages
        ISBN:9781450370233
        DOI:10.1145/3366423

        Publisher

        Association for Computing Machinery, New York, NY, United States

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        WWW '20: The Web Conference 2020
        April 20 - 24, 2020
        Taipei, Taiwan

        Acceptance Rates

        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%


