SUMMAC: a text summarization evaluation

Published: 01 March 2002

Abstract

The TIPSTER Text Summarization Evaluation (SUMMAC) has developed several new extrinsic and intrinsic methods for evaluating summaries. It has established definitively that automatic text summarization is very effective in relevance assessment tasks on news articles. Summaries as short as 17% of full text length sped up decision-making by almost a factor of 2 with no statistically significant degradation in accuracy. Analysis of feedback forms filled in after each decision indicated that the intelligibility of present-day machine-generated summaries is high. Systems that performed most accurately in the production of indicative and informative topic-related summaries used term frequency and co-occurrence statistics, and vocabulary overlap comparisons between text passages. However, in the absence of a topic, these statistical methods do not appear to provide any additional leverage: in the case of generic summaries, the systems were indistinguishable in accuracy. The paper discusses some of the tradeoffs and challenges faced by the evaluation, and also lists some of the lessons learned, impacts, and possible future directions. The evaluation methods used in SUMMAC are of interest both to summarization evaluation and to the evaluation of other ‘output-related’ NLP technologies, where there may be many potentially acceptable outputs, with no automatic way to compare them.
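
As a rough illustration of the kind of statistics the abstract mentions (term frequency and vocabulary overlap between passages), the following minimal sketch selects sentences for a topic-related or a generic extract at a target compression rate. It is not the method of any SUMMAC participant: the function names, the small stopword list, the Jaccard overlap measure, the character-based length budget, and the sample text are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = frozenset(
    "a an and the of in on to is are was were for by with that this it as at".split()
)


def tokenize(text):
    """Lowercase the text and return its non-stopword alphanumeric tokens."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def vocabulary_overlap(tokens_a, tokens_b):
    """Jaccard overlap between the vocabularies of two passages."""
    a, b = set(tokens_a), set(tokens_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def summarize(text, topic=None, compression=0.17):
    """Pick sentences up to roughly `compression` * len(text) characters.

    Sentences are ranked first by overlap with the topic (always 0.0 when no
    topic is given, which yields a generic summary) and then by the average
    document-level frequency of their terms. At least one sentence is kept.
    """
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    term_freq = Counter(t for toks in tokenized for t in toks)
    topic_tokens = tokenize(topic) if topic else []

    ranked = []
    for i, toks in enumerate(tokenized):
        tf_score = sum(term_freq[t] for t in toks) / len(toks) if toks else 0.0
        topic_score = vocabulary_overlap(toks, topic_tokens)
        ranked.append((-topic_score, -tf_score, i))
    ranked.sort()

    budget = int(compression * len(text))
    chosen, used = [], 0
    for _, _, i in ranked:
        if not chosen or used + len(sentences[i]) <= budget:
            chosen.append(i)
            used += len(sentences[i])

    # Present the selected sentences in their original document order.
    return " ".join(sentences[i] for i in sorted(chosen))


SAMPLE_TEXT = (
    "The city council met on Monday. "
    "Officials discussed the new budget for road repairs. "
    "A local bakery won a regional award. "
    "The weather was mild all week. "
    "Road repairs will begin in the spring, officials said. "
    "Residents asked about parking rules."
)

print(summarize(SAMPLE_TEXT, topic="road repairs budget"))
```

With a topic, the ranking is driven by overlap with the topic vocabulary; without one, only the term-frequency score remains, which mirrors the generic-summary setting the abstract describes.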

Published In

Natural Language Engineering, Volume 8, Issue 1
March 2002
90 pages

Publisher

Cambridge University Press

United States
