DOI: 10.1145/3485447.3512012
Research article · Open access

CycleNER: An Unsupervised Training Approach for Named Entity Recognition

Published: 25 April 2022

Abstract

Named Entity Recognition (NER) is a crucial natural language understanding task for many downstream applications such as question answering and retrieval. Despite significant progress in developing NER models for multiple languages and domains, scaling to emerging and/or low-resource domains remains challenging due to the costly nature of acquiring training data. We propose CycleNER, an unsupervised approach based on cycle-consistency training that uses two functions, (i) sentence-to-entity (S2E) and (ii) entity-to-sentence (E2S), to carry out the NER task. CycleNER requires no annotations, only a set of sentences with no entity labels and another independent set of entity examples. Through cycle-consistency training, the output of one function is used as input to the other (e.g. S2E → E2S) to align the representation spaces of the two functions, thereby enabling unsupervised training. Evaluation on several domains against supervised and unsupervised competitors shows that CycleNER achieves highly competitive performance with only a few thousand input sentences: on CoNLL03 it reaches 73% of supervised performance without any annotations, while significantly outperforming unsupervised approaches.
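The two-function cycle described in the abstract can be sketched as follows. This is a minimal, hedged illustration of the data flow only: in the paper S2E and E2S are jointly trained seq2seq models, whereas here they are stubbed with trivial rule-based placeholders, and the function names, toy heuristics, and loss stand-in are assumptions, not the authors' implementation.

```python
# Sketch of CycleNER's two cycle-consistency steps. S2E and E2S are stubbed
# with rule-based placeholders purely to show how each function's output
# feeds the other; a real system would use trained seq2seq models.

def s2e(sentence):
    """Stub sentence-to-entity function: treat capitalized tokens as entities."""
    return [tok for tok in sentence.split() if tok[:1].isupper()]

def e2s(entities):
    """Stub entity-to-sentence function: emit a sentence mentioning the entities."""
    return " ".join(entities)

def reconstruction_loss(original, reconstructed):
    """Stand-in for a token-level loss: fraction of original tokens not recovered."""
    orig = original.split() if isinstance(original, str) else list(original)
    rec = set(reconstructed.split() if isinstance(reconstructed, str) else reconstructed)
    missed = sum(1 for tok in orig if tok not in rec)
    return missed / max(len(orig), 1)

def cycle_losses(sentences, entity_seqs):
    """One unsupervised pass: the reconstruction mismatch after a round trip
    through both functions is the only training signal (no NER labels)."""
    losses = []
    for s in sentences:                      # S-cycle: s -> S2E -> E2S -> s'
        losses.append(("S-cycle", reconstruction_loss(s, e2s(s2e(s)))))
    for e in entity_seqs:                    # E-cycle: e -> E2S -> S2E -> e'
        losses.append(("E-cycle", reconstruction_loss(e, s2e(e2s(e)))))
    return losses
```

Note that the two input collections are independent: the sentences carry no entity labels, and the entity sequences are not aligned to any sentence, which is what makes the setup unsupervised.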




        Published In

        WWW '22: Proceedings of the ACM Web Conference 2022
        April 2022
        3764 pages
        ISBN:9781450390965
        DOI:10.1145/3485447
        This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

        1. cycle-consistency training
        2. named entity recognition
        3. natural language processing
        4. unsupervised training

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        WWW '22: The ACM Web Conference 2022
        April 25 - 29, 2022
        Virtual Event, Lyon, France

        Acceptance Rates

        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

        Article Metrics

        • Downloads (last 12 months): 1,571
        • Downloads (last 6 weeks): 145
        Reflects downloads up to 30 Aug 2024

        Cited By
        • (2024) Entity Recognition on Border Security. Proceedings of the 19th International Conference on Availability, Reliability and Security, 10.1145/3664476.3669922, 1–6. Online publication date: 30-Jul-2024.
        • (2024) Harnessing the Power of Prompt Experts: Efficient Knowledge Distillation for Enhanced Language Understanding. Machine Learning and Knowledge Discovery in Databases. Research Track and Demo Track, 10.1007/978-3-031-70371-3_13, 218–234. Online publication date: 22-Aug-2024.
        • (2024) Intelligent Code Comments Morphing and Generation. Advances in Information and Communication, 10.1007/978-3-031-53963-3_46, 654–660. Online publication date: 17-Mar-2024.
        • (2023) Systematic Literature Review of Information Extraction From Textual Data: Recent Methods, Applications, Trends, and Challenges. IEEE Access, 10.1109/ACCESS.2023.3240898, 11, 10535–10562. Online publication date: 2023.
        • (2023) AGRONER. Expert Systems with Applications: An International Journal, 10.1016/j.eswa.2023.120440, 229:PA. Online publication date: 1-Nov-2023.
        • (2023) Embedding models for supervised automatic extraction and classification of named entities in scientific acknowledgements. Scientometrics, 10.1007/s11192-023-04806-2. Online publication date: 23-Aug-2023.
        • (2023) Named Entity Recognition over Dialog Dataset Using Pre-trained Transformers. Data Management, Analytics and Innovation, 10.1007/978-981-99-1414-2_43, 583–591. Online publication date: 29-May-2023.
        • (2023) Exploring the Challenges and Limitations of Unsupervised Machine Learning Approaches in Legal Concepts Discovery. Advances in Soft Computing, 10.1007/978-3-031-47640-2_5, 52–67. Online publication date: 13-Nov-2023.
        • (2023) Could KeyWord Masking Strategy Improve Language Model? Natural Language Processing and Information Systems, 10.1007/978-3-031-35320-8_19, 271–284. Online publication date: 21-Jun-2023.
        • (2022) Exploiting Named Entity Recognition for Information Extraction from Italian Procurement Documents: A Case Study. Information Integration and Web Intelligence, 10.1007/978-3-031-21047-1_5, 60–74. Online publication date: 28-Nov-2022.
