
Transfer joint embedding for cross-domain named entity recognition

Authors:

Sinno Jialin Pan,

Zhiqiang Toh,

Jian Su

ACM Transactions on Information Systems (TOIS), Volume 31, Issue 2

Article No.: 7, Pages 1 - 27

https://doi.org/10.1145/2457465.2457467

Published: 17 May 2013


Abstract

Named Entity Recognition (NER) is a fundamental task in information extraction from unstructured text. Most previous machine-learning-based NER systems are domain-specific: they may perform well on certain domains (e.g., Newswire) but tend to adapt poorly to related but different domains (e.g., Weblog). Recently, transfer learning techniques have been applied to NER. However, most transfer learning approaches to NER are developed for binary classification, whereas NER is inherently a multiclass classification problem. One therefore has to first reduce the NER task to multiple binary classification tasks and solve them independently. In this article, we propose a new transfer learning method, named Transfer Joint Embedding (TJE), for cross-domain multiclass classification, which can fully exploit the relationships between classes (labels) and reduce the domain difference in data distributions for transfer learning. More specifically, we aim to embed both labels (outputs) and high-dimensional features (inputs) from different domains (e.g., a source domain and a target domain) into a unified low-dimensional latent space, where (1) each label is represented by a prototype and the intrinsic relationships between labels can be measured by Euclidean distance; (2) the distance between the data distributions of the source and target domains is reduced; and (3) the labeled source-domain data lie closer to their corresponding label prototypes than to the other prototypes. After the latent space is learned, classification of the target-domain data can be done with the simple nearest-neighbor rule in the latent space. Furthermore, in order to scale up TJE, we propose an efficient algorithm based on stochastic gradient descent (SGD). Finally, we apply the proposed TJE method to NER across different domains on the ACE 2005 dataset, which is a benchmark in Natural Language Processing (NLP). Experimental results demonstrate the effectiveness of TJE and show that TJE can outperform state-of-the-art transfer learning approaches to NER.
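
The three properties of the latent space described in the abstract can be illustrated with a small sketch. This is not the article's formulation: the abstract does not give the objective function, so the linear projection `W`, the prototype matrix `P`, the margin loss, the use of the difference of domain means as a stand-in for the distance between distributions, and the parameters `lr`, `margin`, and `lam` are all assumptions made here for illustration only.

```python
import numpy as np

def predict(W, P, X):
    """Nearest-prototype rule: embed X with W, then pick the closest label prototype."""
    Z = X @ W                                                # (n, k) latent embeddings
    d2 = ((Z[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)  # (n, c) squared distances
    return d2.argmin(axis=1)

def domain_gap(W, Xs, Xt):
    """Assumed domain distance: squared gap between embedded source and target means."""
    diff = (Xs.mean(axis=0) - Xt.mean(axis=0)) @ W
    return float(diff @ diff)

def margin_loss(W, P, x, y, margin=1.0):
    """Hinge loss: the true prototype should beat the nearest wrong one by `margin`."""
    d2 = ((x @ W - P) ** 2).sum(axis=1)
    wrong = np.delete(d2, y)
    return max(0.0, margin + d2[y] - wrong.min())

def sgd_step(W, P, x, y, mean_diff, lr=0.1, margin=1.0, lam=0.1):
    """One SGD step on (margin loss for one labeled source example) + lam * domain gap.

    mean_diff is the source mean minus the target mean in the input space.
    """
    z = x @ W
    d2 = ((z - P) ** 2).sum(axis=1)
    d2_wrong = d2.copy()
    d2_wrong[y] = np.inf
    r = int(d2_wrong.argmin())                               # nearest wrong prototype
    # Gradient of lam * ||mean_diff @ W||^2 with respect to W.
    gW = 2.0 * lam * np.outer(mean_diff, mean_diff @ W)
    gP = np.zeros_like(P)
    if margin + d2[y] - d2[r] > 0:                           # margin is violated
        # d/dW of ||xW - P[y]||^2 - ||xW - P[r]||^2 is 2 * outer(x, P[r] - P[y]).
        gW += 2.0 * np.outer(x, P[r] - P[y])
        gP[y] = -2.0 * (z - P[y])                            # pull true prototype toward z
        gP[r] = 2.0 * (z - P[r])                             # push wrong prototype away
    return W - lr * gW, P - lr * gP
```

In this sketch, `sgd_step` would be called repeatedly on labeled source-domain examples, and `predict` would then label target-domain examples by their nearest prototype. A single mean difference is a much cruder notion of distribution distance than a full method would likely use; it is chosen here only to keep the example short.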

