DOI: 10.1145/2911451.2911526

Event Digest: A Holistic View on Past Events

Published: 07 July 2016

Abstract

For a general user, easy access to vast amounts of online information available on past events has made retrospection much harder. We propose a problem of automatic event digest generation to aid effective and efficient retrospection. For this, in addition to text, a digest should maximize the reportage of time, geolocations, and entities to present a holistic view on the past event of interest.
We propose a novel divergence-based framework that selects excerpts from an initial set of pseudo-relevant documents, such that the overall relevance is maximized, while avoiding redundancy in text, time, geolocations, and named entities, by treating them as independent dimensions of an event. Our method formulates the problem as an Integer Linear Program (ILP) for global inference to diversify across the event dimensions. Relevance and redundancy measures are defined based on JS-divergence between independent query and excerpt models estimated for each event dimension. Elaborate experiments on three real-world datasets are conducted to compare our methods against the state-of-the-art from the literature. Using Wikipedia articles as gold standard summaries in our evaluation, we find that the most holistic digest of an event is generated with our method that integrates all event dimensions. We compare all methods using standard Rouge-1, -2, and -SU4 along with Rouge-NP, and a novel weighted variant of Rouge.
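The divergence measure named in the abstract can be illustrated with a minimal sketch. It assumes maximum-likelihood unigram models built from plain token lists, one model per event dimension (text, time, geolocations, entities). The function names, the toy tokens, and the choice of `1 - JSD` as a relevance score are illustrative assumptions, not the authors' formulation, and the ILP selection step is not shown.

```python
import math
from collections import Counter


def unigram_model(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def kl_divergence(p, q):
    """KL(p || q) in bits; assumes q[t] > 0 wherever p[t] > 0."""
    return sum(pt * math.log2(pt / q[t]) for t, pt in p.items() if pt > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded in [0, 1]."""
    vocab = set(p) | set(q)
    # Mixture model: the average of the two distributions.
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in vocab}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical per-dimension tokens for a query and one candidate excerpt.
query = {
    "text": ["earthquake", "damage", "rescue"],
    "time": ["2011-03"],
    "geo": ["Japan"],
    "entity": ["Red_Cross"],
}
excerpt = {
    "text": ["earthquake", "rescue", "teams"],
    "time": ["2011-03"],
    "geo": ["Japan", "Sendai"],
    "entity": ["Red_Cross"],
}

# Illustrative relevance per dimension: low divergence means high relevance.
relevance = {
    dim: 1.0 - js_divergence(unigram_model(query[dim]), unigram_model(excerpt[dim]))
    for dim in query
}
```

With base-2 logarithms the divergence is 0 for identical models and 1 for models with no shared terms, so each per-dimension score in `relevance` lies in [0, 1]. The same function could also compare two excerpts to measure redundancy.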

Published In

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
July 2016
1296 pages
ISBN: 9781450340694
DOI: 10.1145/2911451
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. diversification
  2. event digest
  3. linking
  4. semantic annotations

Qualifiers

  • Research-article

Conference

SIGIR '16

Acceptance Rates

SIGIR '16 paper acceptance rate: 62 of 341 submissions (18%)
Overall acceptance rate: 792 of 3,983 submissions (20%)

Article Metrics

  • Downloads (last 12 months): 3
  • Downloads (last 6 weeks): 0

Reflects downloads up to 24 Jan 2025.

Cited By

  • (2021) Event Occurrence Date Estimation based on Multivariate Time Series Analysis over Temporal Document Collections. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 398-407. DOI: 10.1145/3404835.3462885. Online publication date: 11-Jul-2021
  • (2020) Toward comprehensive event collections. International Journal on Digital Libraries 21:2, pp. 215-229. DOI: 10.1007/s00799-018-0246-x. Online publication date: 1-Jun-2020
  • (2019) Constructing a Comprehensive Events Database from the Web. Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pp. 229-238. DOI: 10.1145/3357384.3357986. Online publication date: 3-Nov-2019
  • (2018) How it Happened. Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries, pp. 193-202. DOI: 10.1145/3197026.3197034. Online publication date: 23-May-2018
  • (2018) Exploring Entity-centric Networks in Entangled News Streams. Companion Proceedings of the The Web Conference 2018, pp. 555-563. DOI: 10.1145/3184558.3188726. Online publication date: 23-Apr-2018
  • (2018) Long-Span Language Models for Query-Focused Unsupervised Extractive Text Summarization. Advances in Information Retrieval, pp. 657-664. DOI: 10.1007/978-3-319-76941-7_59. Online publication date: 1-Mar-2018
  • (2017) Building entity-centric event collections. Proceedings of the 17th ACM/IEEE Joint Conference on Digital Libraries, pp. 199-208. DOI: 10.5555/3200334.3200356. Online publication date: 19-Jun-2017
  • (2017) Estimating Event Focus Time Using Neural Word Embeddings. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 2039-2042. DOI: 10.1145/3132847.3133131. Online publication date: 6-Nov-2017
  • (2017) Building Entity-Centric Event Collections. 2017 ACM/IEEE Joint Conference on Digital Libraries (JCDL), pp. 1-10. DOI: 10.1109/JCDL.2017.7991574. Online publication date: Jun-2017
  • (2016) Refining imprecise spatio-temporal events. Proceedings of the 10th Workshop on Geographic Information Retrieval, pp. 1-10. DOI: 10.1145/3003464.3003469. Online publication date: 31-Oct-2016
