Sentence Extraction Using Asymmetric Word Similarity and Topic Similarity

Azmi-Murad, M.; Martin, T.P.

doi:10.1007/3-540-31662-0_39

Part of the book series: Advances in Soft Computing (AINSC, volume 34)

Abstract

We propose a text summarization system, MySum, that determines the significance of sentences using asymmetric word similarity and topic similarity in order to produce a summary. We use mass assignment theory to compute the similarity between words on the basis of their contexts. The algorithm is incremental, so words or documents can be added or removed without massive re-computation. Words are considered similar if they appear in similar contexts; however, these words do not have to be synonyms. We also compute the similarity of a sentence to the topic using the frequency of overlapping words. We compare the summaries produced with those written by humans and with those of another system known as TF.ISF (term frequency-inverse sentence frequency). Our method generates summaries that are up to 60% similar to the manually created summaries taken from the DUC 2002 test collection.
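
As a rough illustration of two ideas named in the abstract, the sketch below scores sentences by the frequency of words they share with a topic string and by a TF.ISF weight. It is not the MySum implementation and does not cover the mass-assignment word similarity. The tokenizer, the function names, the use of a raw overlap count, the averaging of TF.ISF over distinct words, and the sample sentences are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic runs only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def topic_overlap(sentence, topic):
    """Count the sentence tokens that also occur in the topic (hypothetical scoring)."""
    topic_words = set(tokenize(topic))
    return sum(1 for w in tokenize(sentence) if w in topic_words)


def tf_isf_scores(sentences):
    """Mean TF.ISF weight per sentence.

    ISF(w) = log(N / number of sentences containing w), with N sentences in total.
    Averaging over the distinct words of a sentence is an assumption of this sketch.
    """
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    # Sentence frequency: in how many sentences each word appears at least once.
    sentence_freq = Counter(w for toks in tokenized for w in set(toks))
    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        total = sum(tf[w] * math.log(n / sentence_freq[w]) for w in tf)
        scores.append(total / len(tf))
    return scores


# Made-up sample data, not taken from DUC 2002.
sentences = [
    "The storm hit the coast on Monday.",
    "Rescue teams reached the coast after the storm.",
    "Officials praised the rescue teams.",
]
topic = "Storm rescue on the coast"

overlap = [topic_overlap(s, topic) for s in sentences]
tfisf = tf_isf_scores(sentences)
```

A word that occurs in every sentence receives an ISF of zero, so it contributes nothing to the TF.ISF score, whereas the overlap count still rewards it if it appears in the topic. An extractive summary could then be formed by ranking sentences on either score, or on a combination of the two, and keeping the top few.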

Author information

Authors and Affiliations

Department of Engineering Mathematics, University of Bristol, Bristol, BS8 1TR, UK
M. Azmi-Murad & T.P. Martin

Editor information

Editors and Affiliations

School of Computer Science and Engineering, Chung-Ang University, Heukseok-dong 221, 156-756, Seoul, Korea
Ajith Abraham
Department of Applied Mathematics Biometrics and Process Control, University Gent, Coupure Links 653, 9000 Gent, Belgium
Bernard de Baets
Dept. Automation Technologies, Fraunhofer IPK Berlin, Pascalstr. 8-9, 10587, Berlin, Germany
Mario Köppen
Dept. Automation Technologies, Fraunhofer IPK Berlin, Pascalstr. 8-9, 10587, Berlin, Germany
Bertram Nickolay

Cite this paper

Azmi-Murad, M., Martin, T. (2006). Sentence Extraction Using Asymmetric Word Similarity and Topic Similarity. In: Abraham, A., de Baets, B., Köppen, M., Nickolay, B. (eds) Applied Soft Computing Technologies: The Challenge of Complexity. Advances in Soft Computing, vol 34. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31662-0_39

DOI: https://doi.org/10.1007/3-540-31662-0_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31649-7
Online ISBN: 978-3-540-31662-6
eBook Packages: Engineering, Engineering (R0)
