Abstract
Associating information across languages remains a challenging problem in multi-language summarization. To address it, we propose an LSTM framework with a feature-related attention mechanism for extractive summarization of Chinese-Vietnamese bilingual news. First, word embeddings augmented with multiple features, such as word frequency, sentence position, and relevance, are used as input to the model. Then, the degree of element co-occurrence across the bilingual documents is analyzed, and an attention mechanism based on bilingual features is proposed to calculate sentence importance scores. Finally, high-scoring sentences are selected, and redundant information is removed by similarity analysis to generate the summary. Comparison experiments show that the method achieves good results.
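The final step described above (pick high-scoring sentences, then drop redundant ones by similarity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the bag-of-words cosine similarity, the greedy selection order, and the `max_sentences` and `sim_threshold` parameters are all assumptions made for the example. The importance scores are taken as given, standing in for the output of the attention-based model, and sentences are assumed to be already tokenized.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_summary(tokenized_sentences, scores, max_sentences=3, sim_threshold=0.7):
    """Greedily pick high-scoring sentences, skipping near-duplicates.

    tokenized_sentences: list of token lists (tokenization is assumed done upstream).
    scores: one importance score per sentence (hypothetically from the model).
    Returns the indices of the chosen sentences in document order.
    """
    vectors = [Counter(tokens) for tokens in tokenized_sentences]
    # Visit sentences from highest to lowest score.
    ranked = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in ranked:
        if len(chosen) >= max_sentences:
            break
        # Keep a sentence only if it is not too similar to any already chosen.
        if all(cosine(vectors[i], vectors[j]) < sim_threshold for j in chosen):
            chosen.append(i)
    return sorted(chosen)


if __name__ == "__main__":
    sentences = [
        ["the", "flood", "hit", "the", "city"],
        ["the", "flood", "hit", "the", "city", "today"],
        ["rescue", "teams", "arrived", "quickly"],
        ["weather", "was", "mild"],
    ]
    scores = [0.9, 0.8, 0.7, 0.1]
    print(select_summary(sentences, scores, max_sentences=2))
```

In this toy input the second sentence nearly duplicates the first, so it is skipped in favor of the third even though its score is higher. Other similarity measures or thresholds could be substituted without changing the overall structure.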
Acknowledgments
This work was supported by the National Key Research and Development Plan (Grant Nos. 2018YFC0830105, 2018YFC0830101, 2018YFC0830100); the National Natural Science Foundation of China (Grant Nos. 61732005, 61761026, 61866019, 61672271, 61762056, 61866020); Science and Technology Leading Talents in Yunnan, and Yunnan High and New Technology Industry Project (Grant No. 201606); and the Talent Fund for Kunming University of Science and Technology (Grant Nos. KKSY201703005, KKSY201703015).
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wu, J., Yu, Z., Gao, S., Guo, J., Song, R. (2019). Chinese-Vietnamese News Documents Summarization Based on Feature-related Attention Mechanism. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2019. Communications in Computer and Information Science, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-15-1377-0_41
DOI: https://doi.org/10.1007/978-981-15-1377-0_41
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1376-3
Online ISBN: 978-981-15-1377-0
eBook Packages: Computer Science, Computer Science (R0)