A CDT-Styled End-to-End Chinese Discourse Parser

Published: 13 July 2017

Abstract

Discourse parsing is a challenging task that plays a critical role in discourse analysis. Since the release of the Rhetorical Structure Theory Discourse Treebank and the Penn Discourse Treebank, research on English discourse parsing has attracted increasing attention and achieved considerable success. Meanwhile, some preliminary research has been conducted on certain subtasks of discourse parsing for other languages, such as Chinese. In this article, we present an end-to-end Chinese discourse parser based on the Connective-Driven Dependency Tree (CDT) scheme. The parser consists of multiple components in a pipeline architecture, including an elementary discourse unit (EDU) detector, a discourse relation recognizer, a discourse parse tree generator, and an attribution labeler. In particular, the attribution labeler determines two attributions (i.e., sense and centering) for every nonterminal node (i.e., discourse relation) in the discourse parse tree. Given a free text, our parser detects all EDUs, generates the discourse parse tree bottom-up, and determines the sense and centering attributions of all nonterminal nodes by traversing the tree. We evaluate the parser comprehensively on the Connective-Driven Dependency Treebank corpus from both component-wise and error-cascading perspectives, to show how each component performs in isolation and how the pipeline performs under error propagation. The evaluation shows that our end-to-end Chinese discourse parser achieves an overall F1 score of 20% with full automation.
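
The pipeline described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `detect_edus`, `relation_score`, `build_tree`, `label_attributions`, and `parse` are hypothetical, the punctuation-based EDU splitter and the length-based scorer stand in for the trained components, the tree is restricted to binary merges, and the sense and centering values are placeholders.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class EDU:
    """Leaf node: one elementary discourse unit."""
    text: str


@dataclass
class Relation:
    """Nonterminal node: a discourse relation over adjacent spans."""
    children: List["Node"]
    sense: Optional[str] = None
    centering: Optional[str] = None


Node = Union[EDU, Relation]


def detect_edus(text: str) -> List[EDU]:
    # ASSUMPTION: split at clause-level punctuation; a real EDU detector
    # would be a trained classifier.
    parts = re.split(r"[，。；！？,.;!?]", text)
    return [EDU(p.strip()) for p in parts if p.strip()]


def span_length(node: Node) -> int:
    """Number of characters covered by a node."""
    if isinstance(node, EDU):
        return len(node.text)
    return sum(span_length(child) for child in node.children)


def relation_score(left: Node, right: Node) -> float:
    # ASSUMPTION: placeholder scorer that prefers merging short spans;
    # it stands in for a trained discourse relation recognizer.
    return -float(span_length(left) + span_length(right))


def build_tree(edus: List[EDU]) -> Node:
    """Bottom-up construction: repeatedly merge the best adjacent pair."""
    if not edus:
        raise ValueError("no EDUs detected")
    nodes: List[Node] = list(edus)
    while len(nodes) > 1:
        best = max(range(len(nodes) - 1),
                   key=lambda i: relation_score(nodes[i], nodes[i + 1]))
        nodes[best:best + 2] = [Relation([nodes[best], nodes[best + 1]])]
    return nodes[0]


def label_attributions(node: Node) -> None:
    """Traverse the tree and label every nonterminal node."""
    if isinstance(node, EDU):
        return
    for child in node.children:
        label_attributions(child)
    # ASSUMPTION: placeholder labels; real labels would come from classifiers.
    node.sense = "PLACEHOLDER_SENSE"
    node.centering = "PLACEHOLDER_CENTERING"


def parse(text: str) -> Node:
    """Run the pipeline: EDU detection, tree building, attribution labeling."""
    tree = build_tree(detect_edus(text))
    label_attributions(tree)
    return tree


def count_leaves(node: Node) -> int:
    """Count the EDUs under a node."""
    if isinstance(node, EDU):
        return 1
    return sum(count_leaves(child) for child in node.children)
```

Calling `parse("他来了，我走了。她留下。")` on this sketch yields a tree with three EDU leaves whose nonterminal nodes all carry the placeholder labels. Because each stage consumes the output of the previous one, a mistake in EDU detection changes every later decision, which is the error propagation that the error-cascading evaluation measures.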

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 16, Issue 4
    December 2017
    146 pages
    ISSN:2375-4699
    EISSN:2375-4702
    DOI:10.1145/3097269
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 July 2017
    Accepted: 01 May 2017
    Received: 01 January 2017
    Published in TALLIP Volume 16, Issue 4

    Author Tags

    1. Connective-Driven Dependency Tree
    2. Discourse parsing
    3. discourse parse tree
    4. elementary discourse unit
    5. end-to-end Chinese discourse parser

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Project 61472264 under the National Natural Science Foundation of China

    Article Metrics

    • Downloads (last 12 months): 11
    • Downloads (last 6 weeks): 0
    Reflects downloads up to 09 Nov 2024

    Cited By

    • (2024) Incorporating contextual evidence to improve implicit discourse relation recognition in Chinese. Frontiers of Computer Science: Selected Publications from Chinese Universities 18:3. DOI: 10.1007/s11704-023-2503-4. Online publication date: 1-Jun-2024
    • (2023) Discourse Parsing on Multi-Granularity Interaction. 2023 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN54540.2023.10191346. Online publication date: 18-Jun-2023
    • (2023) Topic-Aware Two-Layer Context-Enhanced Model for Chinese Discourse Parsing. Neural Information Processing, 137-148. DOI: 10.1007/978-981-99-8181-6_11. Online publication date: 27-Nov-2023
    • (2023) A Unified Document-Level Chinese Discourse Parser on Different Granularity Levels. Document Analysis and Recognition - ICDAR 2023, 286-303. DOI: 10.1007/978-3-031-41676-7_17. Online publication date: 19-Aug-2023
    • (2022) CNA: A Dataset for Parsing Discourse Structure on Chinese News Articles. 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI), 990-995. DOI: 10.1109/ICTAI56018.2022.00151. Online publication date: Oct-2022
    • (2022) Two-Layer Context-Enhanced Representation for Better Chinese Discourse Parsing. Natural Language Processing and Chinese Computing, 43-54. DOI: 10.1007/978-3-031-17120-8_4. Online publication date: 24-Sep-2022
    • (2021) Chinese Comma Disambiguation in Math Word Problems Using SMOTE and Random Forests. AI 2:4, 738-755. DOI: 10.3390/ai2040044. Online publication date: 20-Dec-2021
    • (2020) Tree Framework With BERT Word Embedding for the Recognition of Chinese Implicit Discourse Relations. IEEE Access 8, 162004-162011. DOI: 10.1109/ACCESS.2020.3019500. Online publication date: 2020
    • (2019) A Survey of Discourse Representations for Chinese Discourse Annotation. ACM Transactions on Asian and Low-Resource Language Information Processing 18:3, 1-25. DOI: 10.1145/3293442. Online publication date: 25-Jan-2019
    • (2019) A Multi-stage Strategy for Chinese Discourse Tree Construction. 2019 International Conference on Asian Language Processing (IALP), 302-307. DOI: 10.1109/IALP48816.2019.9037684. Online publication date: Nov-2019
