Getty Vocabularies OpenRefine Reconciliation
Data managers, developers, and others who wish to reconcile data sets to the Getty Vocabularies may now do so using OpenRefine, an open-source tool for data cleanup and transformation.
Data reconciliation is a verification process that compares target data against the original source data. Traditional use cases include data migration and repurposing. Another important goal of reconciliation is standardization, for example connecting a local vocabulary to a standard controlled vocabulary, which makes individual data fields or columns more consistent.
Users of the Getty Vocabularies, as well as contributors to them, may use data reconciliation to compare their local data to values in the Getty Vocabularies and map to them. For example, an internal artist list from a repository of art can be mapped to a corresponding set of Union List of Artist Names (ULAN) record identifiers. Similarly, a contributor of artist records to ULAN can reconcile their terms for artist role, e.g., watercolorist, to the ULAN role list or to the Art & Architecture Thesaurus (AAT) in order to link to the vocabularies' established term watercolorist, or AAT_300025157.
Reconciliation is a semiautomated process; matches are suggested by the service, but given the complexity of data and homographs, a fully automated process would not ensure accurate matches. Human oversight and judgment are essential. Through the Getty Vocabularies' OpenRefine reconciliation service, the user has the option to decide which data are modified by selecting from a list of results.
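The semiautomated workflow described above can be illustrated with a minimal Python sketch. It is not the Getty service or the OpenRefine API: the vocabulary dictionary, the function names suggest_matches and reconcile, the similarity threshold, and the placeholder identifiers are all assumptions made for illustration. Only the watercolorist identifier is taken from the text above. The point is the shape of the process: the code proposes ranked candidates, and a separate decision step, standing in for the human reviewer, accepts or rejects them.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a controlled vocabulary. Only the
# "watercolorist" identifier comes from the text above; the other
# identifiers are placeholders, not real Getty record IDs.
VOCABULARY = {
    "watercolorist": "AAT_300025157",
    "painter": "PLACEHOLDER_1",
    "colorist": "PLACEHOLDER_2",
}


def suggest_matches(local_term, vocabulary, limit=3, min_score=0.5):
    """Return up to `limit` (term, identifier, score) candidates, best first."""
    needle = local_term.strip().lower()
    scored = []
    for term, identifier in vocabulary.items():
        # Simple string similarity; a real service would use richer matching.
        score = SequenceMatcher(None, needle, term.lower()).ratio()
        if score >= min_score:
            scored.append((term, identifier, round(score, 2)))
    return sorted(scored, key=lambda c: c[2], reverse=True)[:limit]


def reconcile(local_terms, vocabulary, choose):
    """Map each local term to the identifier returned by `choose`, or None.

    `choose(term, candidates)` is the human decision point: it receives the
    ranked suggestions and returns one identifier or None to leave the term
    unmatched.
    """
    mapping = {}
    for term in local_terms:
        candidates = suggest_matches(term, vocabulary)
        mapping[term] = choose(term, candidates)
    return mapping


def accept_top_or_skip(term, candidates):
    """Demo-only reviewer that takes the top suggestion when one exists."""
    return candidates[0][1] if candidates else None


if __name__ == "__main__":
    result = reconcile(["Watercolourist", "xyz"], VOCABULARY, accept_top_or_skip)
    for local, identifier in result.items():
        print(local, "->", identifier)
```

In practice the choose callback would be replaced by an interactive review step, since homographs and near-duplicates mean the top-ranked suggestion is not always the correct one.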