Abstract
Automatic text summarization, the computer-based production of condensed versions of documents, is an important technology for the information society. Without summaries, it would be practically impossible for people to access the ever-growing mass of information available online. Although research in text summarization is over 50 years old, further work is still needed, given the insufficient quality of automatic summaries and the number of interesting summarization topics that end users propose in different contexts (“domain-specific summaries”, “opinion-oriented summaries”, “update summaries”, etc.). This paper gives a short overview of summarization methods and evaluation.
Acknowledgements
Horacio Saggion is grateful for a fellowship from Programa Ramón y Cajal, Ministerio de Ciencia e Innovación, Spain. Thierry Poibeau is supported by the “Empirical Foundations of Linguistics” labex, Sorbonne-Paris-Cité. We acknowledge the support from the editors of this volume.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Saggion, H., Poibeau, T. (2013). Automatic Text Summarization: Past, Present and Future. In: Poibeau, T., Saggion, H., Piskorski, J., Yangarber, R. (eds) Multi-source, Multilingual Information Extraction and Summarization. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28569-1_1
DOI: https://doi.org/10.1007/978-3-642-28569-1_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28568-4
Online ISBN: 978-3-642-28569-1
eBook Packages: Computer Science (R0)