Abstract
Named entity recognition is a fundamental and widely studied task in natural language processing. Building a recognizer requires training data annotated with named entities, but such data is expensive to construct for low-resource domains. In this paper, we propose a recognizer that uses not only training data but also a domain-specific dictionary that is available and easy to use. Our recognizer first uses character-based distributed representations to classify words into the categories defined in the dictionary. It then uses the output of this classification as an additional feature for recognition. We conducted experiments on recognizing named entities in recipe text and report the results to demonstrate the performance of our method.
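The two-step idea in the abstract (classify each word into a dictionary category from its characters, then pass that category to the recognizer as an extra feature) can be illustrated with a small sketch. This is not the authors' implementation: as a stand-in for learned character-based distributed representations, it uses bags of character trigrams compared by cosine similarity against per-category centroids. The toy dictionary, the category names, and every function name below are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math

# Hypothetical toy dictionary (word -> category); not taken from the paper.
DICTIONARY = {
    "tomato": "ingredient",
    "potato": "ingredient",
    "tomatoes": "ingredient",
    "skillet": "tool",
    "saucepan": "tool",
    "frying pan": "tool",
}


def char_ngrams(word, n=3):
    """Bag of character n-grams with boundary markers.

    Assumption: a simple substitute for a learned character-based
    representation of the word.
    """
    padded = f"<{word}>"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_centroids(dictionary):
    """Sum the n-gram bags of all dictionary words in each category."""
    centroids = {}
    for word, category in dictionary.items():
        centroids.setdefault(category, Counter()).update(char_ngrams(word))
    return centroids


def classify_word(word, centroids):
    """Step 1: assign a word to the most similar dictionary category."""
    grams = char_ngrams(word)
    scores = {cat: cosine(grams, vec) for cat, vec in centroids.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def token_features(tokens, centroids):
    """Step 2: attach the predicted category as an additional feature."""
    return [
        {"word": tok, "dict_category": classify_word(tok, centroids)}
        for tok in tokens
    ]
```

Because the classifier works on characters rather than whole words, it can assign a category to a word that does not appear in the dictionary, for example `classify_word("potatoes", build_centroids(DICTIONARY))`. The feature dictionaries returned by `token_features` could then be handed to any sequence labeler that accepts per-token features.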
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Hiramatsu, M., Wakabayashi, K., Harashima, J. (2023). Named Entity Recognition by Character-Based Word Classification Using a Domain Specific Dictionary. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_4
DOI: https://doi.org/10.1007/978-3-031-24340-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24339-4
Online ISBN: 978-3-031-24340-0
eBook Packages: Computer Science, Computer Science (R0)