Explicit Diversification of Event Aspects for Temporal Summarization

Published: 02 February 2018

Abstract

During major events, such as emergencies and disasters, a large volume of information is reported on newswire and social media platforms. Temporal summarization (TS) approaches are used to automatically produce concise overviews of such events by extracting text snippets from related articles over time. Current TS approaches rely on a combination of event relevance and textual novelty for snippet selection. However, for events that span multiple days, textual novelty is often a poor criterion for selecting snippets, since many snippets are textually unique but semantically redundant or non-informative. In this article, we propose a framework for the diversification of snippets using explicit event aspects, building on recent work in search result diversification. In particular, we first propose two techniques to identify explicit aspects that a user might want to see covered in a summary for different types of events. We then extend a state-of-the-art explicit diversification framework to maximize the coverage of these aspects when selecting summary snippets for unseen events. Through experimentation over the TREC TS 2013, 2014, and 2015 datasets, we show that explicit diversification for temporal summarization significantly outperforms classical novelty-based diversification, as the use of explicit event aspects reduces the number of redundant and off-topic snippets returned, while also increasing summary timeliness.
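
To make the idea of aspect-based snippet selection concrete, the following is a minimal sketch of a generic greedy xQuAD-style selector (xQuAD is listed among the article's author tags). It is not the authors' extended framework: the function name `xquad_select`, its parameters, and the toy scores are all hypothetical, and the sketch assumes relevance, aspect importance, and per-aspect coverage estimates are already available as probabilities in [0, 1].

```python
from typing import Dict, List


def xquad_select(relevance: Dict[str, float],
                 aspect_importance: Dict[str, float],
                 coverage: Dict[str, Dict[str, float]],
                 k: int,
                 lam: float = 0.5) -> List[str]:
    """Greedily pick up to k snippets, trading relevance against aspect coverage.

    relevance[s]         -- assumed estimate of how relevant snippet s is to the event
    aspect_importance[a] -- assumed estimate of how much a user wants aspect a covered
    coverage[s][a]       -- assumed estimate of how well snippet s covers aspect a
    lam                  -- 0.0 ranks by relevance only, 1.0 by aspect coverage only
    """
    selected: List[str] = []
    # For each aspect, the probability that no selected snippet has covered it yet.
    not_covered = {a: 1.0 for a in aspect_importance}
    candidates = set(relevance)

    def score(s: str) -> float:
        # An aspect contributes only to the extent it is still uncovered.
        diversity = sum(
            aspect_importance[a] * coverage.get(s, {}).get(a, 0.0) * not_covered[a]
            for a in aspect_importance
        )
        return (1.0 - lam) * relevance[s] + lam * diversity

    while candidates and len(selected) < k:
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
        # Discount every aspect the chosen snippet covers.
        for a in aspect_importance:
            not_covered[a] *= 1.0 - coverage.get(best, {}).get(a, 0.0)
    return selected
```

With two equally important hypothetical aspects, "casualties" and "location", and two highly relevant snippets that both cover only "casualties", this selector picks one of them and then prefers a less relevant snippet that covers "location". Setting `lam` to 0.0 falls back to ranking by relevance alone.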

Supplementary Material

JPG File (a25-mccreadie.jpg)
MP4 File (a25-mccreadie.mp4)

    Published In

    ACM Transactions on Information Systems  Volume 36, Issue 3
    July 2018
    402 pages
    ISSN:1046-8188
    EISSN:1558-2868
    DOI:10.1145/3146384

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 February 2018
    Accepted: 01 October 2017
    Revised: 01 August 2017
    Received: 01 November 2016
    Published in TOIS Volume 36, Issue 3

    Author Tags

    1. Temporal summarization
    2. explicit diversification
    3. xQuAD

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • EC co-funded SUPER

    Cited By

    • (2024) Token-Event-Role Structure-Based Multi-Channel Document-Level Event Extraction. ACM Transactions on Information Systems 42:4 (1-27). DOI: 10.1145/3643885. Online publication date: 7-Feb-2024
    • (2024) TASP: Topic-based abstractive summarization of Facebook text posts. Expert Systems with Applications 255 (124567). DOI: 10.1016/j.eswa.2024.124567. Online publication date: Dec-2024
    • (2024) DQNC2S: DQN-Based Cross-Stream Crisis Event Summarizer. Advances in Information Retrieval (422-430). DOI: 10.1007/978-3-031-56063-7_34. Online publication date: 24-Mar-2024
    • (2023) TSSuBERT: How to Sum Up Multiple Years of Reading in a Few Tweets. ACM Transactions on Information Systems 41:4 (1-33). DOI: 10.1145/3581786. Online publication date: 10-Apr-2023
    • (2023) Exploring unsupervised textual representations generated by neural language models in the context of automatic tweet stream summarization. Online Social Networks and Media 37-38 (100272). DOI: 10.1016/j.osnem.2023.100272. Online publication date: Sep-2023
    • (2022) A Multi-channel Hierarchical Graph Attention Network for Open Event Extraction. ACM Transactions on Information Systems 41:1 (1-27). DOI: 10.1145/3528668. Online publication date: 22-Apr-2022
    • (2022) Multi-interest Diversification for End-to-end Sequential Recommendation. ACM Transactions on Information Systems 40:1 (1-30). DOI: 10.1145/3475768. Online publication date: 31-Jan-2022
    • (2022) Computational Understanding of Narratives: A Survey. IEEE Access 10 (101575-101594). DOI: 10.1109/ACCESS.2022.3205314. Online publication date: 2022
    • (2021) Automated Machine Learning Approaches for Emergency Response and Coordination via Social Media in the Aftermath of a Disaster: A Review. IEEE Access 9 (68917-68931). DOI: 10.1109/ACCESS.2021.3074819. Online publication date: 2021
    • (2021) D-AdFeed: A diversity-aware utility-maximizing advertising framework for mobile users. Computer Networks 190 (107954). DOI: 10.1016/j.comnet.2021.107954. Online publication date: May-2021
