research-article

A linked open data framework to enhance the discoverability and impact of culture heritage

Published: 01 December 2019

Abstract

Cultural heritage institutions have recently begun to consider the benefits of sharing their collections as linked open data in order to disseminate and enrich their metadata. As datasets grow very large, ingestion, management, querying and enrichment all become challenging. Furthermore, each institution has its own particularities in areas such as vocabularies and interoperability, which makes it difficult to generalise the process and provide one-size-fits-all solutions. To improve the user experience of information retrieval systems, researchers have identified a need for further refinement in the recognition and extraction of implicit relationships expressed in natural language. We introduce a framework for the enrichment and disambiguation of locations in text using open knowledge bases such as Wikidata and GeoNames. The framework has been used to publish a dataset based on information from the Biblioteca Virtual Miguel de Cervantes, illustrating how semantic enrichment can help information retrieval. We also describe the methods used to automate the enrichment process, which build upon open source software components.
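
As a rough illustration of the kind of lookup the abstract describes, the sketch below builds a SPARQL query for Wikidata items that carry a given label and a GeoNames ID, then picks one candidate with a simple country-based rule. It is not the framework's actual implementation, which the abstract does not detail. The function names, the `Candidate` type and the country-hint heuristic are assumptions made for this example; the Wikidata properties used (P1566 for GeoNames ID, P17 for country) and the public endpoint URL are real.

```python
from dataclasses import dataclass
from typing import List, Optional

# Public Wikidata SPARQL endpoint (the query below would be sent here).
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_place_query(label: str, lang: str = "es", limit: int = 10) -> str:
    """Build a SPARQL query for items with this exact label and a GeoNames ID.

    Hypothetical helper: it only builds the query text and does no network I/O.
    """
    # Escape backslashes and double quotes so the label is a valid SPARQL literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?item ?geonamesId ?country WHERE {\n"
        f'  ?item rdfs:label "{escaped}"@{lang} ;\n'
        "        wdt:P1566 ?geonamesId .\n"          # P1566 = GeoNames ID
        "  OPTIONAL { ?item wdt:P17 ?country . }\n"  # P17 = country
        "}\n"
        f"LIMIT {limit}"
    )


@dataclass(frozen=True)
class Candidate:
    """One query result row (illustrative structure, not from the paper)."""
    item: str
    geonames_id: str
    country: Optional[str] = None


def disambiguate(candidates: List[Candidate],
                 country_hint: Optional[str] = None) -> Optional[Candidate]:
    """Pick a single candidate, or return None when the choice stays ambiguous.

    Assumed heuristic: if exactly one candidate lies in the hinted country
    (for example the record's country of publication), take it; otherwise
    accept the result only when there is a single candidate overall.
    """
    if country_hint is not None:
        matching = [c for c in candidates if c.country == country_hint]
        if len(matching) == 1:
            return matching[0]
    if len(candidates) == 1:
        return candidates[0]
    return None
```

Returning `None` for unresolved cases keeps ambiguous place names out of the published dataset rather than linking them to a guessed entity; a fuller pipeline would likely add further evidence, such as coordinates or surrounding text, before giving up.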

Published In

Journal of Information Science, Volume 45, Issue 6
Dec 2019
150 pages

Publisher

Sage Publications, Inc.
United States

Author Tags

1. Bibliographic data
2. cultural heritage
3. interoperability
4. linked open data
5. metadata enrichment
6. ontology
7. semantic web

Qualifiers

• Research-article

Cited By

• (2024) A Systematic Review of Wikidata in GLAM Institutions: a Labs Approach. Linking Theory and Practice of Digital Libraries, pp. 34-50. DOI: 10.1007/978-3-031-72440-4_4. Online publication date: 24-Sep-2024
• (2023) An Ontological Approach for Unlocking the Colonial Archive. Journal on Computing and Cultural Heritage 16:4, pp. 1-18. DOI: 10.1145/3594727. Online publication date: 16-Nov-2023
• (2022) Evaluating the quality of linked open data in digital libraries. Journal of Information Science 48:1, pp. 21-43. DOI: 10.1177/0165551520930951. Online publication date: 1-Feb-2022
• (2021) Multiple-source Data Collection and Processing into a Graph Database Supporting Cultural Heritage Applications. Journal on Computing and Cultural Heritage 14:4, pp. 1-27. DOI: 10.1145/3465741. Online publication date: 16-Jul-2021