Revisions that improve cohesion in multi-document summaries: a preliminary study

Authors: Jahna C. Otterbacher, Dragomir R. Radev, Airong Luo

AS '02: Proceedings of the ACL-02 Workshop on Automatic Summarization - Volume 4

Pages 27 - 36

https://doi.org/10.3115/1118162.1118166

Published: 11 July 2002

Abstract

Extractive summaries produced from multiple source documents suffer from an array of problems with respect to text cohesion. In this preliminary study, we seek to understand what problems occur in such summaries and how often. We present an analysis of a small corpus of manually revised summaries and discuss the feasibility of making such repairs automatically. Additionally, we present a taxonomy of the problems that occur in the corpus, as well as the operators which, when applied to the summaries, can address these concerns. This study represents a first step toward identifying and automating revision operators that could work with current summarization systems in order to repair cohesion problems in multi-document summaries.

Proceedings volume: July 2002, 54 pages

Conference Chairs: Udo Hahn, Donna Harman

Publisher: Association for Computational Linguistics, United States
