James Mayfield

We present a way to generate gazetteers from the Wikidata knowledge graph and use the lists to improve a neural NER system by adding an input feature indicating that a word is part of a name in the gazetteer. We empirically show that the approach yields performance gains in two distinct languages: English, a high-resource, word-based language, and Chinese, a high-resource, character-based language. We also apply the approach to a low-resource language, Russian, using a new annotated Russian NER corpus from Reddit tagged with four core and eleven extended types, and report a baseline score.

Publication Date: 2020

Publication Name: Proceedings of the 33rd International FLAIRS Conference

Research Interests:
NLP, Named Entity Recognition, Deep Learning, and Gazetteer

The goal of this work is to improve the performance of a neural named entity recognition system by adding input features that indicate a word is part of a name included in a gazetteer. This article describes how to generate gazetteers from the Wikidata knowledge graph as well as how to integrate the information into a neural NER system. Experiments reveal that the approach yields performance gains in two distinct languages: English, a high-resource, word-based language, and Chinese, a high-resource, character-based language. Experiments were also performed in a low-resource language, Russian, on a newly annotated Russian NER corpus from Reddit tagged with four core types and twelve extended types. This article reports a baseline score. It is a longer version of a paper in the 33rd FLAIRS conference (Song et al. 2020).

Publication Date: 2020

Publication Name: arXiv

Research Interests:
NLP, Named Entity Recognition, Deep Learning, and Gazetteer

The HLTCOE participated in the entity linking and slot filling tasks at TAC 2009. A machine learning-based approach to entity linking, operating over a wide range of feature types, yielded good performance on the entity linking task. Slot filling based on sentence selection, the application of weak patterns, and the exploitation of redundancy was ineffective on the slot filling task.

Publisher: National Institute of Standards and Technology

Publication Date: Nov 1, 2009

Publication Name: Text Analysis Conference (TAC)
