
LexOWL: A Bridge from LexGrid to OWL

2009, Nature Precedings

Cui Tao PhD, Jyotishman Pathak PhD, Harold R. Solbrig, Christopher G. Chute MD DrPH
Mayo Clinic College of Medicine, Rochester, MN

Abstract

The Lexical Grid project is an ongoing, community-driven initiative that provides a common terminology model for representing multiple vocabulary and ontology sources, as well as a scalable and robust API for accessing such information. To add more powerful functionality to the existing infrastructure and to align LexGrid more closely with Semantic Web technologies, we introduce the LexOWL project, which represents the ontologies modeled within the LexGrid environment in OWL (Web Ontology Language). The crux of this effort is a “bridge” that functionally and seamlessly connects LexBIG (a LexGrid API) and the OWL API (an interface that implements OWL). In this paper, we discuss the key aspects of designing and implementing the LexOWL bridge. We compare LexOWL with other OWL conversion tools and conclude that LexOWL provides an OWL mapping and conversion tool with well-defined interoperability for information in the biomedical domain.

Introduction

The Lexical Grid project (LexGrid) [2, 12], coordinated by the Mayo Clinic Division of Biomedical Statistics and Informatics, provides support for a distributed network of lexical resources such as terminologies and ontologies via standards-based tools, storage formats, and access mechanisms. The LexGrid system supports formats such as HL7 RIM, OBO, OWL/Protégé frame, UMLS RRF, and LexGrid XML. It models ontology information including versioning, provenance, entities, associations, and instances. LexGrid loads ontologies and terminologies from different sources, maps the information into the LexGrid model, and stores it in a backend database.
Information modeled by LexGrid can be accessed through LexBIG, an interface that implements the LexGrid model, on top of which standard tools and services can be built.

A valuable augmentation to LexGrid is the adoption of Semantic Web technologies. The recent emergence of the Semantic Web and the Web Ontology Language (OWL) [4] is fostering a new level of interoperability. The biomedical informatics community can benefit greatly from applying OWL’s combination of formal semantics, rich expressiveness, and shared software base to biomedical and clinical terminologies. The LexOWL project provides a round trip between LexGrid and OWL. In this paper, we focus on the direction from LexBIG to OWL. Through LexOWL, information modeled in LexGrid can be represented in OWL. Hence, tools and services that have been developed by the Semantic Web community can be applied directly to the biomedical and clinical domain. For example, we can use Protégé, a widely used OWL ontology authoring tool, to browse and edit the information modeled in LexGrid. We can apply different reasoning tools to medical and clinical terminologies to check consistency or to infer new knowledge. We can use OWL ontology modularity tools to integrate or extract ontology modules, and OWL ontology mapping tools to map ontologies.

The biomedical terminology community has been actively seeking connections to OWL. OBO2OWL [1], OBOInOWL [9], the Protégé OBO to OWL Tab [10], and the Protégé 4 OBO loader provide mappings and conversions from OBO to OWL. The conversion from the UMLS Semantic Network to OWL has been studied in [6, 8]. The NCI Thesaurus to OWL DL conversion is discussed in [11]. The International Healthcare Terminology Standards Development Organization has also released a Perl converter from SNOMED CT to OWL in recent SNOMED CT releases. LexOWL augments all these efforts by providing LexGrid with a converter to OWL.
Compared with the other tools, LexOWL has an inherent advantage: it can convert all the ontologies and terminologies from different sources modeled by LexGrid without individual mappers and converters. As an immediate benefit, LexOWL provides well-defined interoperability across these sources, since all the different resources are modeled by LexGrid.

We make the following contributions in this paper.
• LexOWL functionally converts LexGrid to OWL through an API bridge and represents the information modeled in LexGrid in the OWL API representation. By doing so, we can directly leverage the services and tools developed for OWL and the Semantic Web.
• LexOWL provides an OWL converter with relatively well-defined interoperability for different biomedical terminologies and ontologies.
• LexOWL provides a dynamic interface between LexGrid and Protégé so that Protégé can use LexGrid as its backend database, which could be a valuable addition to Protégé 4.

The rest of the paper is structured as follows. We begin with an overview of the LexOWL system in Section 2. In Section 3, we discuss how LexOWL maps LexGrid components to OWL. In Section 4, we compare the OWL ontologies exported by LexOWL with those converted by existing tools. Finally, in Section 5, we summarize and consider future work.

LexOWL System Overview

Figure 1 shows the LexOWL system overview. The core component of LexOWL is the LexOWLManager. It manages both the LexBIG service, through which we can access the LexBIG API, and the OWL ontology manager, through which we can access the OWL API. On the left-hand side of the system overview, the LexGrid system loads ontologies in different formats from different sources, translates them into the LexGrid representation, and saves the knowledge to a relational database. Through the LexBIG API, LexOWL can access the ontologies loaded in the database.
On the right-hand side of the system overview, through the OWL API, LexOWL virtually re-represents the information in the LexGrid database in the OWL API ontology representation, which can then be used directly by Protégé 4 and other Semantic Web tools.

Thus, in essence, LexOWL maps LexGrid to OWL at the API level. It is not just a tool that maps and converts from one format to another; it also generates a “bridge” between the two APIs. The “bridge” accesses information through the LexBIG API and translates it into the OWL API’s representations. The benefit of an API “bridge” is that even if the backend representations for ontologies change, the “bridge” still performs the same way and no update is necessary. We also defined the LexGrid-to-OWL mapping and a lexgrid2owl meta-ontology [3], based on which LexOWL can re-represent a selected LexGrid ontology in the OWL API representation. In the next section, we discuss how LexOWL maps LexGrid to OWL.

Figure 1: LexOWL System Overview.

LexGrid to OWL Mapping

LexOWL first maps the general ontology information. This includes information about the ontology itself, such as name, version, and copyright. For some information, we can find equivalent representations in OWL (e.g., codingScheme to owl:Ontology, localName to rdfs:label, and representsVersion to owl:versionInfo). For some information, we can find equivalent representations in standard namespaces such as Dublin Core (e.g., formalName to dc:title and copyright to dc:rights). We use the lexgrid2owl meta-ontology to represent the remaining information (e.g., we define ApproxNumConcepts and isNative as two annotation properties in the meta-ontology).

LexOWL maps each LexGrid concept¹ to an OWL class. A concept in the LexGrid model can have properties such as a concept code, descriptions, presentations, definitions, and sources. LexOWL uses the concept code as the OWL class name and assigns the concept descriptions as a set of rdfs:label values. In the lexgrid2owl meta-ontology, we define three OWL classes, Presentation, Definition, and Source, to represent the presentations, definitions, and sources in the LexGrid concept properties. We also define the annotation properties hasPresentation, hasDefinition, and hasSource in the meta-ontology to represent the relationships between concepts and such properties.

¹ A “concept” represents a “kind” or “universal” entity in the LexGrid 2008 model. Here we still use “concept” to be compatible with LexGrid 2008. We are upgrading both LexGrid and LexOWL to avoid using this confusing label.

Figure 2(a) shows a sample OBO term and Figure 2(b) shows its LexGrid representation. Figure 2(c) shows how LexOWL represents this concept and its properties in OWL. LexOWL creates an OWL class for the concept code “TAIR:0000055” and assigns the entity description “pollen development” as an rdfs:label. The class has three annotation properties, one hasDefinition and two hasPresentation, which link to “definition21” (an instance of the Definition class), “presentation37”, and “presentation38” (two instances of the Presentation class), respectively. In addition, “definition21” has an annotation property hasSource, which links to “source21”. Each of these instances also has annotation properties that represent contents such as synonyms and definitions from the source document.

Figure 2(a): A Sample OBO Term.

LexGrid also has a special kind of concept, the anonymous concept, which it uses to represent anonymous classes in OWL. LexOWL parses each anonymous concept and translates it back to OWL based on its concept properties. Figure 3 shows an example. The upper part shows the LexGrid representation. The concept “A38” is the anonymous concept that is equivalent to the concept “Father”. LexOWL can translate it back to OWL, as the lower part of Figure 3 shows, which is identical to the original OWL representation.
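The concept mapping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not LexOWL's implementation (LexOWL works through the LexBIG and OWL APIs). The `Concept` and `Definition` dataclasses, the `lg:` prefix for the lexgrid2owl meta-ontology, the `lg:text` content property, and the instance-naming scheme are all hypothetical, and the presentation, definition, and source strings in the usage example are placeholders rather than the contents of Figure 2.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Definition:
    text: str
    source: Optional[str] = None  # where the definition came from, if known


@dataclass
class Concept:
    """Hypothetical stand-in for a LexGrid concept."""
    code: str
    description: str
    presentations: List[str] = field(default_factory=list)
    definitions: List[Definition] = field(default_factory=list)


def concept_to_triples(concept: Concept) -> List[Triple]:
    """Map one concept to OWL-style triples following the mapping in the text."""
    cls = concept.code  # the concept code becomes the OWL class name
    triples: List[Triple] = [
        (cls, "rdf:type", "owl:Class"),
        (cls, "rdfs:label", concept.description),
    ]
    for i, text in enumerate(concept.presentations, start=1):
        inst = f"{cls}/presentation{i}"  # hypothetical naming scheme
        triples += [
            (inst, "rdf:type", "lg:Presentation"),
            (inst, "lg:text", text),  # "lg:text" is an assumed property name
            (cls, "lg:hasPresentation", inst),
        ]
    for i, definition in enumerate(concept.definitions, start=1):
        inst = f"{cls}/definition{i}"
        triples += [
            (inst, "rdf:type", "lg:Definition"),
            (inst, "lg:text", definition.text),
            (cls, "lg:hasDefinition", inst),
        ]
        if definition.source is not None:
            src = f"{cls}/source{i}"
            triples += [
                (src, "rdf:type", "lg:Source"),
                (src, "lg:text", definition.source),
                (inst, "lg:hasSource", src),  # the definition links to its source
            ]
    return triples


# Usage: the code and description come from the text; the rest are placeholders.
sample = Concept(
    code="TAIR:0000055",
    description="pollen development",
    presentations=["placeholder presentation 1", "placeholder presentation 2"],
    definitions=[Definition("placeholder definition", "placeholder source")],
)
triples = concept_to_triples(sample)
for triple in triples:
    print(triple)
```

For the sample concept, the sketch yields one class with an rdfs:label, two hasPresentation links, and one hasDefinition link whose target carries a hasSource link, which mirrors the shape described for Figure 2(c).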
Figure 3: An Example of Anonymous Concept.

Figure 2: An Example for Entity Mapping. (b) LexGrid Representation for the Sample Term. (c) LexOWL Representation for the Sample Term.

An association in the LexGrid model establishes a relation between two LexGrid entities. LexOWL classifies the LexGrid associations into two types: predefined associations and other associations. A predefined association can be mapped directly to an OWL element. For example, the associations “subClassOf” (OWL), “CHD” (ICD 10), and “is_a” (OBO) are all mapped to rdfs:subClassOf. The association “hasSubtype” (UMLS) is mapped as the inverse of subClassOf. The associations “equivalentClass” (OWL) and “same as” (UMLS) are mapped to owl:equivalentClass. For detailed information about the predefined-association mapping, please see [3].

Evaluation and Discussion

We tested LexOWL using ontologies from various sources: OWL, OBO, UMLS Semantic Network, and WHO ICD10. We used Protégé Prompt [5] to compare the OWL ontologies generated by LexOWL with those generated by other tools. We also sampled concepts and associations in each test ontology, compared them with the original source, and checked whether all the related information is represented properly. The details of the results are listed below.

We tested on 5 OWL files. We chose these 5 ontologies carefully so that they cover most of the OWL Lite syntax introduced in [4]. We compared the OWL ontologies generated by LexOWL with the original ontologies. Each pair of ontologies is semantically equivalent.

We also tested on 10 OBO files. For each OBO file, we compared the OWL ontology translated by LexOWL with those converted by OBO2OWL [1], the Protégé 3.3.1 OBO to OWL Tab [10], and the Protégé 4.0 OBO loader. All four tools mapped OBO terms to OWL classes and OBO “is_a” to OWL subClassOf, and used OWL someValuesFrom to represent relationships between two classes. Semantically, the corresponding ontologies from all four converters are identical.
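The association handling described earlier (predefined associations mapped directly to OWL elements, all other associations expressed through someValuesFrom) can be sketched as a lookup table with a fallback. This is an illustrative sketch only: the table contains just the examples named in the text, keying it by source format is an assumption, and the output strings merely imitate OWL functional-style syntax rather than reproduce what LexOWL emits through the OWL API. The function name and the `part_of` association in the usage lines are hypothetical.

```python
from typing import Dict, Tuple

# (source format, association name) -> (OWL axiom kind, swap subject and object?)
# Only the examples named in the text are listed; the full mapping is in [3].
PREDEFINED: Dict[Tuple[str, str], Tuple[str, bool]] = {
    ("OWL", "subClassOf"): ("SubClassOf", False),
    ("ICD 10", "CHD"): ("SubClassOf", False),
    ("OBO", "is_a"): ("SubClassOf", False),
    ("UMLS", "hasSubtype"): ("SubClassOf", True),  # inverse of subClassOf
    ("OWL", "equivalentClass"): ("EquivalentClasses", False),
    ("UMLS", "same as"): ("EquivalentClasses", False),
}


def association_to_axiom(fmt: str, name: str, source: str, target: str) -> str:
    """Render one LexGrid-style association as an OWL-style axiom string."""
    rule = PREDEFINED.get((fmt, name))
    if rule is None:
        # Any other association: the source class is restricted with
        # someValuesFrom over the association, the default described in the text.
        return f"SubClassOf({source} ObjectSomeValuesFrom({name} {target}))"
    kind, swap = rule
    if swap:
        source, target = target, source
    return f"{kind}({source} {target})"


# Usage with made-up class names A and B.
print(association_to_axiom("OBO", "is_a", "A", "B"))
print(association_to_axiom("UMLS", "hasSubtype", "A", "B"))
print(association_to_axiom("OBO", "part_of", "A", "B"))
```

Keeping the predefined cases in a table makes the two-way classification explicit: anything absent from the table falls through to the existential restriction.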
However, each converter defined its own annotation properties and used different annotation properties to represent the same OBO information. OBO2OWL and the Protégé 4.0 OBO loader perform relatively simple and straightforward conversions, using the OBO labels directly as the OWL annotation property names. The Protégé OBO to OWL Tab and LexOWL process information at a finer granularity (e.g., the “def” in Figure 2(a) is parsed and the source information is annotated separately).

We used LexOWL to export the UMLS Semantic Network loaded in LexGrid to an OWL file and compared it with the one converted by Jimenez-Ruiz [6]. LexOWL uses the unique identifiers (UIs) as the OWL class names, whereas Jimenez-Ruiz uses the actual names. Hierarchically, the two ontologies are identical. Jimenez-Ruiz introduced some annotation properties that are specific to the UMLS Semantic Network, whereas LexOWL used the lexgrid2owl meta-ontology to represent all the information. For example, Jimenez-Ruiz mapped SRDEF to rdfs:comment, whereas LexOWL mapped it to lexgrid2owl:Definition, which can improve interoperability because definitions of terms from other sources are also mapped to lexgrid2owl:Definition. Jimenez-Ruiz used owl:allValuesFrom to represent relationships between two classes, whereas LexOWL used owl:someValuesFrom, the default restriction LexOWL uses for representing relationships between classes².

We also used LexOWL to export the ICD10 WHO second edition loaded in LexGrid to an OWL file and compared it with the OWL file converted by Cardillo et al. [7]. Hierarchically, the two ontologies are identical. The ontology converted by [7] covers only hierarchical information, however. Information such as exclusions and inclusions is ignored, whereas LexOWL represents them as OWL ObjectProperties, thereby preserving the semantics.

In summary, the test results show that LexOWL can successfully convert information modeled in LexGrid to OWL.
LexOWL uses a single meta-ontology for all the different sources, whereas other tools use different meta-ontologies even for the same format. Hence, the ontologies converted by LexOWL have better interoperability, which will benefit ontology mapping, integration, and reasoning in the future.

Concluding Remarks and Future Work

We introduced LexOWL, a system that functionally connects LexGrid to OWL through a bridge over the LexBIG and OWL APIs. LexOWL can represent information modeled in LexGrid in the OWL API representation, so that tools and services developed for OWL can be applied to biomedical terminologies and ontologies. LexOWL also provides a LexGrid-to-OWL converter with well-defined interoperability for information from different sources and in different formats.

As for future work, several directions remain to be pursued. First, we would like to investigate the performance of LexOWL with large ontologies such as SNOMED CT, the Gene Ontology, and ICD10. Second, we would like to add the editing and saving functions shown in Figure 1, so that we can not only browse but also edit information represented in LexGrid using Protégé. Finally, LexOWL serves as a foundation for ontology reasoning and inference; our next step is to explore that direction for biomedical and clinical information.

² How to represent the semantic relationships between classes more precisely is a problem we are investigating when mapping information to LexGrid, and it is out of the scope of this paper.

References

1. OBO2OWL: Lossless transformation between OBO and OWL. http://www.cs.utexas.edu/~hamid/research/obo2owl.cgi, 2008.
2. LexGrid: The Lexical Grid. https://cabig-kc.nci.nih.gov/Vocab/KC/index.php/LexGrid, 2009.
3. LexGrid to OWL Mapping documentation. https://cabig-kc.nci.nih.gov/Vocab/KC/index.php/LexGrid_Documentation, 2009.
4. OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref/, 2009.
5. The Protégé Prompt Tab. http://protege.stanford.edu/plugins/prompt/prompt.html, 2009.
6. The UMLS Semantic Network in OWL. http://krono.act.uji.es/people/Ernesto/UMLS_SN_OWL, 2009.
7. E. Cardillo, C. Eccher, L. Serafini, and A. Tamilin. Logical analysis of mappings between medical classification systems. In Proceedings of the 13th international conference on Artificial Intelligence, pp 311–321, Sep 2008.
8. D. Fensel, K. Sycara, and J. Mylopoulos. Representing the UMLS semantic network using OWL. In Proceedings of the Second International Semantic Web Conference, pp 1–16, Sanibel Island, Florida, Oct 2003.
9. C. Golbreich, M. Horridge, I. Horrocks, B. Motik, and R. Shearer. OBO and OWL: Leveraging semantic web technologies for the life sciences. In Proceedings of the 6th International Semantic Web Conference, pp 169–182, Busan, Korea, Nov 2007.
10. D. Moreira and M.A. Musen. OBO to OWL: a Protégé OWL tab to read/save OBO ontologies. Bioinformatics, 23(14):1868–1870, 2007.
11. N. Noy, S. de Coronado, H. Solbrig, G. Fragoso, F. Hartel, and M. Musen. Representing the NCI Thesaurus in OWL DL: Modeling tools help modeling languages. Journal of Applied Ontology, 3(3):173–190, 2008.
12. J. Pathak, H. Solbrig, J. Buntrock, T. Johnson, and C. Chute. LexGrid: A framework for representing, storing, and querying biomedical terminologies from simple to sublime. Journal of the American Medical Informatics Association, 16(9), 2009.