Abstract
Named Entity Recognition (NER) is a vital task for many Natural Language Processing (NLP) applications. In recent years, transformer-based models have become very popular for NLP tasks, including NER, achieving state-of-the-art results. The Bidirectional Encoder Representations from Transformers (BERT) model in particular has been found to perform very well on NER. However, limited work has been done in Nepali using these models; existing work mostly relies on more traditional techniques. In this work, we show that a combination of preprocessing techniques and better-initialized BERT models improves the performance of NER systems for Nepali. We show a significant improvement in results using a multilingual RoBERTa model, achieving a 6% overall improvement in F1 score on the EverestNER dataset. Across entity types, we achieved an increase of up to 22% in F1 score for the Event entity, which has the lowest support.
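The abstract describes the approach only at a high level. As a rough illustration (not the authors' exact pipeline), the sketch below shows how a multilingual RoBERTa checkpoint can be fine-tuned for Nepali NER as a token-classification task with the Hugging Face Transformers library. The checkpoint name (xlm-roberta-base), the BIO label set, and the subword-alignment scheme are assumptions made for illustration; the EverestNER tag set and the paper's preprocessing steps may differ.

```python
# Minimal sketch, assuming a Hugging Face Transformers setup; not the authors'
# exact pipeline. Fine-tunes a multilingual RoBERTa encoder for Nepali NER.
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical BIO label set for illustration; the EverestNER tags may differ.
labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC",
          "B-ORG", "I-ORG", "B-EVENT", "I-EVENT"]
label2id = {l: i for i, l in enumerate(labels)}
id2label = {i: l for l, i in label2id.items()}

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base",
    num_labels=len(labels),
    id2label=id2label,
    label2id=label2id,
)

def encode(words, word_labels):
    """Tokenize pre-split Nepali words and align BIO labels to subword pieces."""
    enc = tokenizer(words, is_split_into_words=True, truncation=True)
    aligned, prev = [], None
    for wid in enc.word_ids():
        if wid is None:          # special tokens (<s>, </s>)
            aligned.append(-100) # ignored by the cross-entropy loss
        elif wid != prev:        # first subword piece keeps the word's label
            aligned.append(label2id[word_labels[wid]])
        else:                    # later pieces of the same word are masked out
            aligned.append(-100)
        prev = wid
    enc["labels"] = aligned
    return enc
```

Examples encoded this way can be fine-tuned with the standard Trainer API and scored with an entity-level F1 metric such as seqeval, which is the usual evaluation setup for BIO-tagged NER.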
Acknowledgment
We would like to thank Dr. Nobal Niraula for his time in helping us understand the EverestNER dataset and the various approaches his team explored during their experimentation.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Pande, B.D., Shakya, A., Panday, S.P., Joshi, B. (2023). Named Entity Recognition for Nepali Using BERT Based Models. In: Fujita, H., Wang, Y., Xiao, Y., Moonis, A. (eds) Advances and Trends in Artificial Intelligence. Theory and Applications. IEA/AIE 2023. Lecture Notes in Computer Science, vol. 13926. Springer, Cham. https://doi.org/10.1007/978-3-031-36822-6_8
DOI: https://doi.org/10.1007/978-3-031-36822-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36821-9
Online ISBN: 978-3-031-36822-6