DOI: 10.1145/3011077.3011139
research-article

The combination of similarity measures for extractive summarization

Published: 08 December 2016

Abstract

The key task in extractive summarization is to determine the importance of each sentence in the input. Several recent studies have compared the similarity between sentences to assess their significance efficiently. Each comparison method has its own strengths and weaknesses. In this paper, we propose a combination of similarity measures for sentence comparison. Experiments conducted on both English and Vietnamese datasets demonstrate the effectiveness of the proposed approach. Our model outperforms recent work in English with a significant improvement (9.4 ROUGE-2 F1-score) and achieves competitive results in Vietnamese.
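
The abstract does not say which similarity measures are combined or how they are aggregated. As an illustration only, the following minimal sketch assumes two simple lexical measures (bag-of-words cosine and Jaccard overlap), a weighted average as the combination rule, and a ranking by mean similarity to the other sentences. The function names, the weights, and the ranking rule are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return sentence.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between bag-of-words count vectors."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(a, b):
    """Size of the shared vocabulary divided by the size of the joint vocabulary."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def combined_similarity(a, b, weights=(0.5, 0.5)):
    """Weighted average of the two measures (assumed combination rule)."""
    scores = (cosine_similarity(a, b), jaccard_similarity(a, b))
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def rank_sentences(sentences):
    """Return sentence indices ordered by mean combined similarity to the others."""
    scores = []
    for i, s in enumerate(sentences):
        others = [combined_similarity(s, t)
                  for j, t in enumerate(sentences) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat sat on the rug",
        "stock prices fell sharply",
    ]
    print(rank_sentences(docs))
```

In this sketch, a sentence that shares no words with the others receives a score of zero and is ranked last. An extractive summarizer would then select the top-ranked sentences up to a length budget.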

    Published In

    SoICT '16: Proceedings of the 7th Symposium on Information and Communication Technology
    December 2016
    442 pages
    ISBN:9781450348157
    DOI:10.1145/3011077

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep learning
    2. extractive summarization
    3. multi-document summarization
    4. similarity measures

    Qualifiers

    • Research-article

    Funding Sources

    • Honors Program, University of Science, Vietnam National University - Ho Chi Minh City

    Conference

    SoICT '16

    Acceptance Rates

    SoICT '16 Paper Acceptance Rate 58 of 132 submissions, 44%;
    Overall Acceptance Rate 147 of 318 submissions, 46%
