DOI: 10.1145/3132218.3132237

Good Applications for Crummy Entity Linkers?: The Case of Corpus Selection in Digital Humanities

Published: 11 September 2017

Abstract

Over the last decade we have made great progress in entity linking (EL) systems, but performance may vary depending on the context and, arguably, there are even principled limitations preventing a "perfect" EL system. This also suggests that there may be applications for which current "imperfect" EL is already very useful, and makes finding the "right" application as important as building the "right" EL system. We investigate the Digital Humanities use case, where scholars spend a considerable amount of time selecting relevant source texts. We developed WideNet, a semantically-enhanced search tool that leverages the strengths of (imperfect) EL without getting in the way of its expert users. We evaluate this tool in two historical case studies, each aiming to collect a set of references to a historical period in parliamentary debates from the last two decades: the first targeted the Dutch Golden Age, and the second World War II. The case studies conclude with a critical reflection on the utility of WideNet for this kind of research, after which we outline how such a real-world application can help to improve EL technology in general.

Cited By

  • (2022) Evaluating Automated and Hybrid Neural Disambiguation for African Historical Named Entities. Artificial Intelligence Research. 10.1007/978-3-031-22321-1_18 (260-275). Online publication date: 28-Nov-2022
  • (2019) Index-Driven Digitization and Indexation of Historical Archives. Frontiers in Digital Humanities. 10.3389/fdigh.2019.000046. Online publication date: 11-Mar-2019
  • (2018) Entity-Aspect Linking. Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries. 10.1145/3197026.3197047 (49-58). Online publication date: 23-May-2018

Published In

Semantics2017: Proceedings of the 13th International Conference on Semantic Systems
September 2017
202 pages
ISBN:9781450352963
DOI:10.1145/3132218
In-Cooperation

  • St. Pölten University: St. Pölten University of Applied Sciences, Austria
  • Wolters Kluwer: Wolters Kluwer, Germany
  • Vrije Universiteit Amsterdam: Vrije Universiteit Amsterdam
  • Semantic Web Company: Semantic Web Company
  • Univ. Leipzig: Universität Leipzig

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Corpus Selection
  2. Digital Humanities
  3. Entity Linking Evaluation
  4. Interactive Information Retrieval
  5. Real-World Applications
  6. Semantically-Enhanced Search

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

Semantics2017

Acceptance Rates

Overall Acceptance Rate 40 of 182 submissions, 22%

