Building Semantic Metadata for Historical Archives through an Ontology-driven User Interface

Published: 16 August 2020

Abstract

    Historical archives hold an immense wealth of material, but their potential is endangered by the lack of effective management and access tools. We believe this issue can be addressed by providing archive catalogs with a semantic layer containing rich semantic metadata that represents the content of documents in a full-fledged, formal, machine-readable format. In this article, we present the contribution made in this direction by the PRiSMHA project, in which the conceptual vocabulary of the semantic layer is provided by computational ontologies. Acquiring semantic knowledge, however, is a well-known bottleneck for knowledge-based systems; to address it, PRiSMHA relies on a collaborative crowdsourcing model, i.e., an online community of users who collaborate in building semantic representations of the content of archival documents. From this perspective, the article aims to answer the following research question: Starting from the axioms that characterize concepts in the computational ontology underlying the system, how can we derive a user interface that lets users formally represent the content of archival documents by exploiting the conceptual vocabulary provided by the ontology?
    Our solution includes the following steps: (a) a manually defined configuration, acting as a pre-filter, that hides “unsuited” classes, properties, and relations; (b) an algorithm, combining heuristics and reasoning, that extracts from the ontology all and only the “compatible” properties and relations for a given entity (event) type; and (c) a set of strategies to rank, group, and present the entity (event) properties and relations, based on the results of a user study. This integrated solution allowed us to design an ontology-driven user interface with which users can characterize entities, and in particular (historical) events, on the basis of the vocabulary provided by the ontology.
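
    The three steps can be illustrated with a minimal sketch. It is not the PRiSMHA algorithm: the abstract does not specify the heuristics or the reasoner calls, so the sketch replaces ontology reasoning with a plain walk up a toy subclass hierarchy. Every class name, property name, group label, and rank value below is a hypothetical placeholder chosen for illustration.

```python
from dataclasses import dataclass

# Toy class hierarchy (child -> parent). All names are hypothetical.
SUPERCLASS = {
    "Strike": "CollectiveAction",
    "CollectiveAction": "Event",
    "Event": None,
    "Person": None,
}


@dataclass(frozen=True)
class Prop:
    name: str    # property or relation name
    domain: str  # class the property is declared on
    group: str   # presentation group in the generated form (assumed)
    rank: int    # position inside the group, lower comes first (assumed)


PROPERTIES = [
    Prop("hasTime", "Event", "when/where", 1),
    Prop("hasPlace", "Event", "when/where", 2),
    Prop("hasParticipant", "CollectiveAction", "who", 1),
    Prop("hasDemand", "Strike", "what", 1),
    Prop("hasInternalId", "Event", "technical", 9),
    Prop("hasBirthDate", "Person", "when/where", 1),
]

# Step (a): manually defined pre-filter listing what the UI must hide.
HIDDEN = {"hasInternalId"}


def ancestors(cls):
    """Return cls followed by all of its superclasses."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = SUPERCLASS[cls]
    return chain


def compatible(entity_type):
    """Step (b), simplified: keep the properties whose domain is the
    entity type or one of its superclasses, and that are not hidden."""
    lineage = set(ancestors(entity_type))
    return [p for p in PROPERTIES
            if p.domain in lineage and p.name not in HIDDEN]


def form_layout(entity_type):
    """Step (c), simplified: group the compatible properties and order
    them by rank inside each group."""
    groups = {}
    for p in sorted(compatible(entity_type), key=lambda p: (p.group, p.rank)):
        groups.setdefault(p.group, []).append(p.name)
    return groups


if __name__ == "__main__":
    for group, names in form_layout("Strike").items():
        print(group, names)
```

    In this sketch a "Strike" form inherits the fields declared on "Event" and "CollectiveAction", omits the hidden technical identifier, and never shows properties that belong to unrelated classes such as "Person". A real implementation would presumably query a description logic reasoner rather than a hand-written dictionary.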

    Published In

    Journal on Computing and Cultural Heritage, Volume 13, Issue 3
    October 2020
    211 pages
    ISSN: 1556-4673
    EISSN: 1556-4711
    DOI: 10.1145/3411173
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 16 August 2020
    Accepted: 01 May 2020
    Revised: 01 February 2020
    Received: 01 December 2019
    Published in JOCCH Volume 13, Issue 3

    Author Tags

    1. Ontology-driven user interfaces
    2. computational ontologies
    3. crowdsourcing platform
    4. historical archives

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Compagnia di San Paolo Foundation and Università di Torino within the PRiSMHA project

    Cited By

    • (2023) Semantic technology for cultural heritage: a bibliometric-based review. Global Knowledge, Memory and Communication. DOI: 10.1108/GKMC-04-2023-0125. Online publication date: 22-Aug-2023
    • (2022) Computational Culture: Transforming Archives Practice and Education for a Post-Covid World. Journal on Computing and Cultural Heritage 15:3 (1-18). DOI: 10.1145/3493342. Online publication date: 16-Sep-2022
    • (2022) Bringing Semantics into Historical Archives with Computer-aided Rich Metadata Generation. Journal on Computing and Cultural Heritage 15:3 (1-24). DOI: 10.1145/3484398. Online publication date: 16-Sep-2022
    • (2021) Wikidata Support in the Creation of Rich Semantic Metadata for Historical Archives. Applied Sciences 11:10 (4378). DOI: 10.3390/app11104378. Online publication date: 12-May-2021
