Abstract
The electrical power field encompasses diverse information modalities, among which textual data is a pivotal constituent. In this study, we collect an extensive body of text from the electrical power systems domain, comprising regulations, reports, and other pertinent materials, and use it to construct an Electrical Power Systems Corpus. We then annotate entities within this corpus, thereby introducing a novel Named Entity Recognition (NER) dataset tailored to the electrical power domain. For recognition, we employ an end-to-end deep learning model, BERT-BiLSTM-CRF, which integrates the BERT pre-trained model into the traditional BiLSTM-CRF architecture, enhancing its ability to capture contextual and semantic information in the text. Results show that the proposed model outperforms both the BiLSTM-CRF model and the BERT-softmax model on NER tasks in the electrical power domain and in several other domains. This work advances NER applications in the electrical power domain and supports the construction of knowledge graphs and databases for electrical power systems.
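The architecture named in the abstract (an encoder producing contextual token features, a BiLSTM over those features, and a CRF layer for tag decoding) can be sketched in minimal form. This is a structural illustration only, not the paper's implementation: a frozen embedding table stands in for the BERT encoder (a hypothetical simplification, since loading a real pre-trained BERT is outside the scope of a sketch), and the CRF is reduced to its Viterbi decoding step with a learnable transition matrix.

```python
import torch
import torch.nn as nn

class BiLSTMCRFTagger(nn.Module):
    """Structural sketch of a BERT-BiLSTM-CRF style tagger.

    The embedding table below is a stand-in for BERT; in the model the
    abstract describes, a pre-trained transformer would supply contextual
    token representations instead.
    """

    def __init__(self, vocab_size: int, num_tags: int,
                 emb_dim: int = 32, hidden_dim: int = 32):
        super().__init__()
        self.encoder = nn.Embedding(vocab_size, emb_dim)  # BERT stand-in
        self.bilstm = nn.LSTM(emb_dim, hidden_dim // 2,
                              bidirectional=True, batch_first=True)
        self.emit = nn.Linear(hidden_dim, num_tags)       # emission scores
        # CRF transition scores: trans[i, j] = score of tag i -> tag j
        self.trans = nn.Parameter(torch.zeros(num_tags, num_tags))

    def forward(self, token_ids: torch.Tensor) -> list:
        """Viterbi-decode the best tag sequence for one sentence (1 x T)."""
        feats = self.emit(self.bilstm(self.encoder(token_ids))[0])[0]  # T x K
        score = feats[0]          # best path score ending in each tag
        back = []                 # back-pointers per step
        for t in range(1, feats.shape[0]):
            # cand[i, j] = score of best path ending in i, then moving to j
            cand = score.unsqueeze(1) + self.trans + feats[t]
            score, idx = cand.max(dim=0)
            back.append(idx)
        best_tag = int(score.argmax())
        path = [best_tag]
        for idx in reversed(back):  # trace back-pointers to recover the path
            best_tag = int(idx[best_tag])
            path.append(best_tag)
        return path[::-1]

torch.manual_seed(0)
model = BiLSTMCRFTagger(vocab_size=100, num_tags=5)
tags = model(torch.tensor([[3, 17, 42, 8]]))  # one 4-token sentence
```

Training such a model would additionally require the CRF's negative log-likelihood loss (forward algorithm), which the sketch omits for brevity.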
Acknowledgments
This work was supported by the Science and Technology Project of State Grid Zhejiang Electric Power Co., Ltd. (Project number: B311XT220007).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Feng, J., Wang, H., Peng, L., Wang, Y., Song, H., Guo, H. (2024). Chinese Named Entity Recognition Within the Electric Power Domain. In: Shao, J., Katsikas, S.K., Meng, W. (eds) Emerging Information Security and Applications. EISA 2023. Communications in Computer and Information Science, vol 2004 . Springer, Singapore. https://doi.org/10.1007/978-981-99-9614-8_9
Print ISBN: 978-981-99-9613-1
Online ISBN: 978-981-99-9614-8