DOI: 10.1145/3306446.3340826
research-article
Open access

When humans and machines collaborate: cross-lingual label editing in wikidata

Published: 20 August 2019

Abstract

The quality and maintainability of a knowledge graph are determined by the process by which it is created. There are different approaches to such processes: extraction or conversion of data available on the web (for example, the automated extraction of DBpedia from Wikipedia), community-created knowledge graphs, often built by a group of experts, and hybrid approaches in which humans maintain the knowledge graph alongside bots. In this work we focus on the hybrid approach of human-edited knowledge graphs supported by automated tools. In particular, we analyse the editing of natural language data, i.e. labels. Labels are the entry point for humans to understand the information and therefore need to be carefully maintained. We take a step toward understanding the collaborative editing of humans and automated tools across languages in a knowledge graph. We use Wikidata, as it has a large and active community of humans and bots working together, covering over 300 languages. We analyse the different editor groups and how they interact with the different language data to understand the provenance of the current label data.


Published In

OpenSym '19: Proceedings of the 15th International Symposium on Open Collaboration
August 2019
197 pages
ISBN:9781450363198
DOI:10.1145/3306446
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. collaborative knowledge graph
  2. multilingual data
  3. wikidata


Acceptance Rates

OpenSym '19 paper acceptance rate: 17 of 23 submissions (74%)
Overall acceptance rate: 108 of 195 submissions (55%)
