Abstract
We propose a text summarization system, MySum, that estimates the significance of sentences from asymmetric word similarity and topic similarity in order to produce a summary. We use mass assignment theory to compute the similarity between words on the basis of their contexts. The algorithm is incremental, so words or documents can be added or removed without massive re-computation. Words are considered similar if they appear in similar contexts; they do not have to be synonyms. We also compute the similarity of a sentence to the topic using the frequency of overlapping words. We compare the generated summaries with human-written summaries and with those of another system, TF.ISF (term frequency-inverse sentence frequency). Our method generates summaries that are up to 60% similar to the manually created summaries taken from the DUC 2002 test collection.
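As a rough illustration of two of the quantities named above, the following Python sketch scores sentences with an averaged TF.ISF weight and counts the words a sentence shares with a topic string. It is a minimal sketch under stated assumptions, not the MySum implementation: the function names (`tokenize`, `tf_isf_scores`, `topic_overlap`), the regex tokenizer, the natural logarithm, and the averaging over distinct words are all illustrative choices that the abstract does not specify. The asymmetric word similarity based on mass assignment theory is not covered here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic runs (assumed stand-in for real preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def tf_isf_scores(sentences):
    """Return one averaged TF.ISF score per sentence.

    For a word w in sentence s: tf(w, s) * log(N / sf(w)), where N is the
    number of sentences and sf(w) is the number of sentences containing w.
    The sentence score is the mean over the distinct words of s (an assumption).
    """
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Sentence frequency: count each word once per sentence.
    sf = Counter()
    for toks in tokens:
        sf.update(set(toks))

    scores = []
    for toks in tokens:
        tf = Counter(toks)
        if not tf:
            scores.append(0.0)  # empty sentence carries no weight
            continue
        total = sum(count * math.log(n / sf[w]) for w, count in tf.items())
        scores.append(total / len(tf))
    return scores


def topic_overlap(sentence, topic):
    """Count the sentence tokens that also occur in the topic text."""
    topic_words = set(tokenize(topic))
    return sum(1 for w in tokenize(sentence) if w in topic_words)
```

In this sketch a word that occurs in every sentence gets weight zero, so sentences made of rarer words rank higher; how MySum combines the topic score with word similarity is not stated in the abstract and is left out.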
Copyright information
© 2006 Springer
About this paper
Cite this paper
Azmi-Murad, M., Martin, T. (2006). Sentence Extraction Using Asymmetric Word Similarity and Topic Similarity. In: Abraham, A., de Baets, B., Köppen, M., Nickolay, B. (eds) Applied Soft Computing Technologies: The Challenge of Complexity. Advances in Soft Computing, vol 34. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31662-0_39
DOI: https://doi.org/10.1007/3-540-31662-0_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31649-7
Online ISBN: 978-3-540-31662-6
eBook Packages: Engineering, Engineering (R0)