DOI: 10.1145/3485447.3512012
Research article · Open access

CycleNER: An Unsupervised Training Approach for Named Entity Recognition

Published: 25 April 2022

Abstract

Named Entity Recognition (NER) is a crucial natural language understanding task for many downstream applications such as question answering and retrieval. Despite significant progress in developing NER models for multiple languages and domains, scaling to emerging and/or low-resource domains remains challenging due to the costly nature of acquiring training data. We propose CycleNER, an unsupervised approach based on cycle-consistency training that uses two functions, (i) sentence-to-entity (S2E) and (ii) entity-to-sentence (E2S), to carry out the NER task. CycleNER requires no annotations, only a set of sentences with no entity labels and another independent set of entity examples. Through cycle-consistency training, the output of one function is used as input to the other (e.g. S2E → E2S) to align the representation spaces of the two functions, thereby enabling unsupervised training. Evaluation on several domains against supervised and unsupervised competitors shows that CycleNER achieves highly competitive performance with only a few thousand input sentences: on CoNLL03 it reaches 73% of supervised performance without any annotations, while significantly outperforming unsupervised approaches.
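The two-function cycle described in the abstract can be sketched as follows. This is a minimal, hedged illustration of the data flow only: in the paper S2E and E2S are jointly trained seq2seq models, whereas here they are stubbed with trivial rule-based placeholders, and the function names, toy heuristics, and loss stand-in are assumptions, not the authors' implementation.

```python
# Sketch of CycleNER's two cycle-consistency steps. S2E and E2S are stubbed
# with rule-based placeholders purely to show how each function's output
# feeds the other; a real system would use trained seq2seq models.

def s2e(sentence):
    """Stub sentence-to-entity function: treat capitalized tokens as entities."""
    return [tok for tok in sentence.split() if tok[:1].isupper()]

def e2s(entities):
    """Stub entity-to-sentence function: emit a sentence mentioning the entities."""
    return " ".join(entities)

def reconstruction_loss(original, reconstructed):
    """Stand-in for a token-level loss: fraction of original tokens not recovered."""
    orig = original.split() if isinstance(original, str) else list(original)
    rec = set(reconstructed.split() if isinstance(reconstructed, str) else reconstructed)
    missed = sum(1 for tok in orig if tok not in rec)
    return missed / max(len(orig), 1)

def cycle_losses(sentences, entity_seqs):
    """One unsupervised pass: the reconstruction mismatch after a round trip
    through both functions is the only training signal (no NER labels)."""
    losses = []
    for s in sentences:                      # S-cycle: s -> S2E -> E2S -> s'
        losses.append(("S-cycle", reconstruction_loss(s, e2s(s2e(s)))))
    for e in entity_seqs:                    # E-cycle: e -> E2S -> S2E -> e'
        losses.append(("E-cycle", reconstruction_loss(e, s2e(e2s(e)))))
    return losses
```

Note that the two input collections are independent: the sentences carry no entity labels, and the entity sequences are not aligned to any sentence, which is what makes the setup unsupervised.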




        Published In

        WWW '22: Proceedings of the ACM Web Conference 2022
        April 2022
        3764 pages
        ISBN:9781450390965
        DOI:10.1145/3485447
        This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

        1. cycle-consistency training
        2. named entity recognition
        3. natural language processing
        4. unsupervised training

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        WWW '22: The ACM Web Conference 2022
        April 25 - 29, 2022
        Virtual Event, Lyon, France

        Acceptance Rates

        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

        Article Metrics

        • Downloads (last 12 months): 1,571
        • Downloads (last 6 weeks): 145
        Reflects downloads up to 30 Aug 2024

        Cited By
        • (2024) Entity Recognition on Border Security. Proceedings of the 19th International Conference on Availability, Reliability and Security, 10.1145/3664476.3669922, 1–6. Online publication date: 30-Jul-2024.
        • (2024) Harnessing the Power of Prompt Experts: Efficient Knowledge Distillation for Enhanced Language Understanding. Machine Learning and Knowledge Discovery in Databases. Research Track and Demo Track, 10.1007/978-3-031-70371-3_13, 218–234. Online publication date: 22-Aug-2024.
        • (2024) Intelligent Code Comments Morphing and Generation. Advances in Information and Communication, 10.1007/978-3-031-53963-3_46, 654–660. Online publication date: 17-Mar-2024.
        • (2023) Systematic Literature Review of Information Extraction From Textual Data: Recent Methods, Applications, Trends, and Challenges. IEEE Access, 10.1109/ACCESS.2023.3240898, 11, 10535–10562. Online publication date: 2023.
        • (2023) AGRONER. Expert Systems with Applications: An International Journal, 10.1016/j.eswa.2023.120440, 229:PA. Online publication date: 1-Nov-2023.
        • (2023) Embedding models for supervised automatic extraction and classification of named entities in scientific acknowledgements. Scientometrics, 10.1007/s11192-023-04806-2. Online publication date: 23-Aug-2023.
        • (2023) Named Entity Recognition over Dialog Dataset Using Pre-trained Transformers. Data Management, Analytics and Innovation, 10.1007/978-981-99-1414-2_43, 583–591. Online publication date: 29-May-2023.
        • (2023) Exploring the Challenges and Limitations of Unsupervised Machine Learning Approaches in Legal Concepts Discovery. Advances in Soft Computing, 10.1007/978-3-031-47640-2_5, 52–67. Online publication date: 13-Nov-2023.
        • (2023) Could KeyWord Masking Strategy Improve Language Model? Natural Language Processing and Information Systems, 10.1007/978-3-031-35320-8_19, 271–284. Online publication date: 21-Jun-2023.
        • (2022) Exploiting Named Entity Recognition for Information Extraction from Italian Procurement Documents: A Case Study. Information Integration and Web Intelligence, 10.1007/978-3-031-21047-1_5, 60–74. Online publication date: 28-Nov-2022.
