
Tri-training for Dependency Parsing Domain Adaptation

Published: 13 December 2021

Abstract

In recent years, research on dependency parsing has focused on improving accuracy on domain-specific (in-domain) test datasets and has made remarkable progress. However, the real world contains innumerable scenarios that these datasets do not cover, namely, out-of-domain data. As a result, parsers that perform well on in-domain data usually suffer significant performance degradation on out-of-domain data. Cross-domain transfer learning methods are therefore essential for adapting existing high-performance in-domain parsers to new domain scenarios. This paper examines two scenarios for cross-domain transfer learning: semi-supervised and unsupervised. Specifically, we adopt the pre-trained language model BERT for training on the source-domain (in-domain) data at the subword level and introduce self-training methods derived from tri-training for these two scenarios. Evaluation results on the NLPCC-2019 shared task and the universal dependency parsing task indicate the effectiveness of the adopted approaches for cross-domain transfer learning and show the potential of self-training for cross-lingual transfer learning.
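The mechanism at the heart of both scenarios is the tri-training loop of Zhou and Li (2005): three parsers are trained on the labeled source-domain data, and an unlabeled target-domain sentence is added as pseudo-labeled training data for one parser whenever the other two agree on its parse. Below is a minimal Python sketch of that loop; the Parser protocol, the make_parser factory, and the whole-tree agreement criterion are illustrative assumptions rather than the paper's actual implementation, which builds on BERT-based parsers at the subword level.

```python
# Sketch of the tri-training loop (Zhou & Li, 2005) for dependency parsing
# domain adaptation. Parser, make_parser, and the whole-tree agreement test
# are illustrative placeholders, not the paper's actual components.
import random
from typing import Callable, Hashable, List, Protocol, Tuple

Sentence = str
Tree = Hashable  # any comparable tree representation

class Parser(Protocol):
    def train(self, treebank: List[Tuple[Sentence, Tree]]) -> None: ...
    def parse(self, sentence: Sentence) -> Tree: ...

def tri_train(
    make_parser: Callable[[], Parser],
    source_treebank: List[Tuple[Sentence, Tree]],
    target_sentences: List[Sentence],
    rounds: int = 3,
) -> List[Parser]:
    # Train three parsers on bootstrap samples of the labeled
    # source-domain treebank so that they start out diverse.
    parsers = []
    for _ in range(3):
        sample = random.choices(source_treebank, k=len(source_treebank))
        parser = make_parser()
        parser.train(sample)
        parsers.append(parser)

    for _ in range(rounds):
        for i in range(3):                          # the parser being taught
            j, k = (x for x in range(3) if x != i)  # its two "teachers"
            pseudo = []
            for sent in target_sentences:
                tree_j = parsers[j].parse(sent)
                # Only when both teachers produce the same tree is their
                # shared prediction trusted as a pseudo-labeled example.
                if tree_j == parsers[k].parse(sent):
                    pseudo.append((sent, tree_j))
            # Retrain parser i on source data plus the agreed pseudo-labels.
            parsers[i].train(source_treebank + pseudo)
    return parsers
```

In practice, the agreement test can be relaxed from exact tree identity to an arc-level agreement threshold so that longer sentences still contribute pseudo-labeled data.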


Cited By

  • (2024) De-Noising Tail Entity Selection in Automatic Question Generation with Fine-Tuned T5 Model. Data Science and Applications, 431–443. DOI: 10.1007/978-981-99-7817-5_32. Online publication date: 18-Jan-2024.


Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 3
May 2022
413 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3505182

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 December 2021
Accepted: 01 August 2021
Revised: 01 May 2021
Received: 01 December 2020
Published in TALLIP Volume 21, Issue 3


Author Tags

  1. Tri-training
  2. dependency parsing
  3. domain adaptation
  4. transfer learning

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • Key Projects of National Natural Science Foundation of China
  • SJTU Trans-med Awards Research
  • Fundamental Research Funds for the Central Universities
  • 111 Project
  • China Southern Power Grid
  • CCF-Tencent Open Fund
  • Shanghai Municipal Science and Technology Major Project

