DOI: 10.1145/3289600.3290614
research-article

clstk: The Cross-Lingual Summarization Tool-Kit

Published: 30 January 2019

Abstract

Cross-lingual summarization (CLS) aims to create summaries in a target language from a document or document set given in a different source language. Cross-lingual summarization can play a critical role in enabling cross-lingual information access for millions of people across the globe who do not speak or understand the languages that are widely represented on the web. It can also make documents originally published in local languages quickly accessible to a large audience that does not understand those local languages. Though cross-lingual summarization has gathered some attention in the last decade, there has been no serious effort to publish rigorous software for this task. In this paper, we provide a design for an end-to-end CLS software package called clstk. Besides implementing a number of methods proposed by different CLS researchers over the years, the software integrates multiple components critical for CLS. We hope that this extremely modular tool-kit will help CLS researchers contribute more effectively to the area.

Cited By

  • (2020) Combining Machine Learning and Natural Language Processing for Language-Specific, Multi-Lingual, and Cross-Lingual Text Summarization. In Trends and Applications of Text Summarization Techniques, pp. 1-31. DOI: 10.4018/978-1-5225-9373-7.ch001. Online publication date: 2020.

Published In

WSDM '19: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining
January 2019
874 pages
ISBN: 9781450359405
DOI: 10.1145/3289600
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cross-lingual summarization
  2. document summarization tool-kit

Qualifiers

  • Research-article

Conference

WSDM '19

Acceptance Rates

WSDM '19 Paper Acceptance Rate: 84 of 511 submissions, 16%
Overall Acceptance Rate: 498 of 2,863 submissions, 17%
