Multimodal named entity recognition with image attributes and image knowledge

D Chen, Z Li, B Gu, Z Chen - … , DASFAA 2021, Taipei, Taiwan, April 11–14 …, 2021 - Springer
Database Systems for Advanced Applications: 26th International Conference …, 2021 - Springer
Abstract
Multimodal named entity extraction is an emerging task that uses both textual and visual information to detect named entities and identify their entity types. Existing efforts are often flawed in two respects. First, they tend to ignore the inherent bias of the visual guidance brought by the image. Second, they do not further explore the knowledge contained in the image. In this paper, we propose a novel neural network model that introduces both image attributes and image knowledge to help improve named entity extraction. Image attributes are high-level abstract information about an image that can be labelled by a model pre-trained on ImageNet, while image knowledge can be obtained from a general encyclopedic knowledge graph with multimodal information, such as DBpedia or YAGO. Our empirical study on a real-world data collection demonstrates the effectiveness of our approach compared with several state-of-the-art approaches.
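
As a rough illustration of the two auxiliary signals described in the abstract, the following minimal sketch picks the top-scoring image attribute labels and looks up related facts for them. It is not the authors' model: the neural fusion with the text is omitted, the classifier scores stand in for the output of a model pre-trained on ImageNet, and the dictionary `toy_kg` stands in for a knowledge graph such as DBpedia or YAGO. All names, scores, and facts here are hypothetical.

```python
from typing import Dict, List, Tuple

Fact = Tuple[str, str, str]  # (subject, relation, object)


def top_k_attributes(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k highest-scoring labels, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [label for label, _ in ranked[:k]]


def lookup_knowledge(
    attributes: List[str], kg: Dict[str, List[Tuple[str, str]]]
) -> List[Fact]:
    """Collect (attribute, relation, object) facts for each attribute found in kg."""
    facts: List[Fact] = []
    for attribute in attributes:
        for relation, obj in kg.get(attribute, []):
            facts.append((attribute, relation, obj))
    return facts


# Hypothetical classifier output for one image (assumed, not from the paper).
scores = {"stadium": 0.61, "soccer ball": 0.27, "jersey": 0.08, "umbrella": 0.01}

# Hypothetical stand-in for a knowledge graph lookup (assumed, not real KG data).
toy_kg = {
    "stadium": [("type", "SportsVenue")],
    "soccer ball": [("usedIn", "Football")],
}

attributes = top_k_attributes(scores, k=2)
facts = lookup_knowledge(attributes, toy_kg)
print(attributes)
print(facts)
```

In a full system, the selected attributes and retrieved facts would be embedded and combined with the token representations of the sentence; how that combination is done is specific to the paper and is not reproduced here.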