Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2013
Tool development for and empirical experimentation in OWL ontology engineering require a wide variety of suitable ontologies as input for testing and evaluation purposes, as well as detailed characterisations of real ontologies. Empirical activities often resort to (somewhat arbitrarily) hand-curated corpora available on the web, such as the NCBO BioPortal and the TONES Repository, or to manually selected sets of well-known ontologies. Findings of surveys and results of benchmarking activities may be biased, even heavily, towards these datasets. Sampling from a large corpus of ontologies, on the other hand, may lead to more representative results. Current large-scale repositories and web crawls are mostly uncurated, suffer from duplication and from small and (for many purposes) uninteresting ontology files, and contain large numbers of ontology versions, variants, and facets; they therefore do not lend themselves to random sampling. In this paper, we survey ontologies as they exist on the web and describe the creation of a corpus of OWL DL ontologies using strategies such as web crawling, various forms of de-duplication, and manual cleaning, which allows random sampling of ontologies for a variety of empirical applications.
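A minimal sketch of one plausible de-duplication step of the kind mentioned above, assuming the OWL API is on the classpath: two ontology files are treated as logical duplicates if their sets of logical axioms hash to the same value, which abstracts away serialisation format, axiom order and annotations. The directory and class names are hypothetical.

```java
// Hypothetical de-duplication sketch: files whose logical axiom sets hash
// identically are treated as duplicates (assumes the OWL API).
import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.*;

import java.io.File;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.*;

public class OntologyDeduplicator {

    /** Hash of the sorted logical axioms, used as a duplicate key. */
    static String logicalHash(File ontologyFile) throws Exception {
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
        OWLOntology ontology = manager.loadOntologyFromOntologyDocument(ontologyFile);
        List<String> axioms = new ArrayList<>();
        for (OWLLogicalAxiom ax : ontology.getLogicalAxioms()) {
            axioms.add(ax.toString());
        }
        Collections.sort(axioms); // make the hash independent of axiom order
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        for (String ax : axioms) {
            digest.update(ax.getBytes(StandardCharsets.UTF_8));
        }
        manager.removeOntology(ontology);
        return Base64.getEncoder().encodeToString(digest.digest());
    }

    public static void main(String[] args) throws Exception {
        // Keep only the first file seen for each logical hash.
        Map<String, File> unique = new HashMap<>();
        for (File f : new File("crawl-snapshot/").listFiles()) { // hypothetical directory
            unique.putIfAbsent(logicalHash(f), f);
        }
        System.out.println(unique.size() + " logically distinct ontologies");
    }
}
```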
OWL 2 DL is a complex logic with reasoning problems that have a high worst-case complexity. Modern reasoners perform mostly very well on naturally occurring ontologies of varying sizes and complexity. This performance is achieved through a suite of complex optimisations (with complex interactions) and elaborate engineering. While the formal basis of the core reasoner procedures is well understood, many optimisations are less so, and most of the engineering details (and their possible effect on reasoner correctness) are unreviewed by anyone but the reasoner developer. Thus, it is unclear how much confidence should be placed in the correctness of implemented reasoners. To date, there is no principled, unit-test-like correctness suite for simple language features and, even if there were, it is unclear that passing such a suite would say much about correctness on naturally occurring ontologies. This problem is not merely theoretical: divergence in behaviour (and thus known bugginess of implementations) has been observed in the OWL Reasoner Evaluation (ORE) contests, to the point where a simple majority-voting procedure has been put in place to resolve disagreements.
In this paper, we present a new technique for finding and resolving reasoner disagreements. We use justifications to cross-check disagreements. Some cases are resolved automatically; others need to be verified manually. We evaluate the technique on a corpus of naturally occurring ontologies and a set of popular reasoners. We successfully identify several correctness bugs across different reasoners, identify causes for most of these, and generate appropriate bug reports as well as patches to ontologies to work around the bugs.
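The disagreement-detection step lends itself to a short illustration. The sketch below, assuming the OWL API (reasoner factories such as HermiT's org.semanticweb.HermiT.ReasonerFactory would be supplied by the caller), classifies the same ontology with two reasoners and collects the atomic subsumptions on which they differ; the paper's subsequent step of extracting justifications for each disagreed entailment and checking them against the other reasoner is not shown.

```java
// Sketch of the disagreement-detection step only: classify with two reasoners
// and collect the atomic subsumptions on which they differ. Justification
// extraction for the disagreed entailments (the paper's next step) is omitted.
import org.semanticweb.owlapi.model.*;
import org.semanticweb.owlapi.reasoner.*;

import java.util.HashSet;
import java.util.Set;

public class DisagreementFinder {

    public static Set<OWLSubClassOfAxiom> disagreements(OWLOntology ontology,
                                                        OWLReasonerFactory factoryA,
                                                        OWLReasonerFactory factoryB) {
        OWLDataFactory df = ontology.getOWLOntologyManager().getOWLDataFactory();
        OWLReasoner a = factoryA.createReasoner(ontology);
        OWLReasoner b = factoryB.createReasoner(ontology);
        Set<OWLSubClassOfAxiom> disagreed = new HashSet<>();
        for (OWLClass cls : ontology.getClassesInSignature()) {
            // All inferred (not just direct) named superclasses under each reasoner.
            Set<OWLClass> supersA = a.getSuperClasses(cls, false).getFlattened();
            Set<OWLClass> supersB = b.getSuperClasses(cls, false).getFlattened();
            Set<OWLClass> symDiff = new HashSet<>(supersA);
            symDiff.addAll(supersB);
            Set<OWLClass> agreed = new HashSet<>(supersA);
            agreed.retainAll(supersB);
            symDiff.removeAll(agreed); // symmetric difference = disputed superclasses
            for (OWLClass sup : symDiff) {
                disagreed.add(df.getOWLSubClassOfAxiom(cls, sup));
            }
        }
        a.dispose();
        b.dispose();
        return disagreed;
    }
}
```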
Ontologies published on the web can be quite large, from a couple of megabytes to more than a gigabyte. Deploying, importing and using such ontologies can be a problem, both in terms of bandwidth and load time over the web, and in terms of physically storing them. Some ontologies in BioPortal, for example, are shipped in compressed form (via its web services), which can reduce file sizes to as little as 2% of the original. Moreover, many ontologies are published in modular form through the use of owl:imports. Some of the imports are dereferenceable on the web, but in many cases the imports closure is shipped with the actual (root) ontology, which can lead to confusion about how to interpret the directory structure. In this paper, we propose a set of simple conventions for distributing ontologies (monolithic or modular) in a canonical fashion, enabling tools to make use of pre-compiled modular structures and reducing I/O to a minimum through compression.
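As a hedged illustration of the compression aspect only (the distribution conventions themselves are described in the paper), the following sketch round-trips an ontology through gzip with the OWL API; the file names are hypothetical.

```java
// Minimal sketch, assuming the OWL API: save an ontology in gzipped form and
// load it back directly from the compressed stream, without unpacking to disk.
import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.*;

import java.io.*;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedOntologyIO {

    public static void main(String[] args) throws Exception {
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();

        // Save an ontology in gzipped form (file names are hypothetical).
        OWLOntology ontology =
            manager.loadOntologyFromOntologyDocument(new File("snomed.owl"));
        try (OutputStream out = new GZIPOutputStream(new FileOutputStream("snomed.owl.gz"))) {
            manager.saveOntology(ontology, out); // keeps the original document format
        }

        // Load it back straight from the compressed file.
        OWLOntologyManager manager2 = OWLManager.createOWLOntologyManager();
        try (InputStream in = new GZIPInputStream(new FileInputStream("snomed.owl.gz"))) {
            OWLOntology reloaded = manager2.loadOntologyFromOntologyDocument(in);
            System.out.println(reloaded.getLogicalAxiomCount() + " logical axioms reloaded");
        }
    }
}
```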
Very expressive Description Logics in the SH family have worst-case complexity ranging from EXPTIME to 2NEXPTIME. In spite of this, they are very popular with modellers and serve as the foundation of the Web Ontology Language (OWL), a W3C standard. Highly optimised reasoners handle a wide range of naturally occurring ontologies with relative ease, albeit with some pathological cases. A recent optimisation trend has been modular reasoning, that is, breaking the ontology into hopefully easier subsets with a hopefully smaller overall reasoning time (see MORe and Chainsaw for prominent examples). However, it has been demonstrated that subsets of an OWL ontology may be harder, even much harder, than the whole ontology. This introduces the risk that modular approaches might have even more severe pathological cases than the usual monolithic ones. In this paper, we analyse a number of ontologies from the BioPortal repository in order to isolate cases where random subsets are harder...
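A minimal sketch of the kind of probe such an analysis relies on, assuming the OWL API: draw a random subset of an ontology's logical axioms and compare the wall-clock classification time of the subset against that of the whole ontology. The reasoner factory is left to the caller and the method names are hypothetical.

```java
// Hypothetical subset-hardness probe: classify a random axiom subset and the
// whole ontology with the same reasoner and compare wall-clock times.
import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.*;
import org.semanticweb.owlapi.reasoner.*;

import java.util.*;

public class SubsetHardnessProbe {

    static long classificationTimeMillis(OWLOntology ontology, OWLReasonerFactory factory) {
        OWLReasoner reasoner = factory.createReasoner(ontology);
        long start = System.nanoTime();
        reasoner.precomputeInferences(InferenceType.CLASS_HIERARCHY); // classification
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        reasoner.dispose();
        return elapsed;
    }

    public static void compare(OWLOntology whole, OWLReasonerFactory factory, double fraction)
            throws OWLOntologyCreationException {
        // Draw a random subset of the logical axioms.
        List<OWLLogicalAxiom> axioms = new ArrayList<>(whole.getLogicalAxioms());
        Collections.shuffle(axioms);
        Set<OWLAxiom> subset =
            new HashSet<>(axioms.subList(0, (int) (axioms.size() * fraction)));
        OWLOntology subOntology = OWLManager.createOWLOntologyManager().createOntology(subset);

        System.out.println("whole:  " + classificationTimeMillis(whole, factory) + " ms");
        System.out.println("subset: " + classificationTimeMillis(subOntology, factory) + " ms");
    }
}
```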
Tool development for and empirical experimentation in OWL ontology research require a wide variety of suitable ontologies as input for testing and evaluation purposes, as well as detailed characterisations of real ontologies. Findings of surveys and results of benchmarking activities may be biased, even heavily, towards manually assembled sets of "somehow suitable" ontologies. We are building the Manchester OWL Repository, a resource for creating and sharing ontology datasets, to push the quality frontier of empirical ontology research and to provide access to a great variety of well-curated ontologies.
In spite of the recent renaissance in lightweight description logics (DLs), many prominent DLs, such as those underlying the Web Ontology Language (OWL), have high worst-case complexity for their key inference services. Modern reasoners have a large array of optimizations, tuned calculi, and implementation tricks that allow them to perform very well in a variety of application scenarios, even though the complexity results ensure that they will perform poorly for some inputs. For users, the key question is how often they will encounter those pathological inputs in practice, that is, how robust reasoners are. We attempt to answer this question for the classification of existing ontologies as they are found on the Web. It is a fairly common user task to examine ontologies published on the Web as part of their development process, so the robustness of reasoners in this scenario is both directly interesting and provides some hints toward answering the broader question. From our experiments, we show that the current crop of OWL reasoners, in collaboration, is very robust against the Web. 1 Motivation: A serious concern about both versions 1 [12] and 2 [5] of the Web Ontology Language (OWL) is that the underlying description logics (SHOIQ and SROIQ) exhibit extremely bad worst-case complexity (NEXPTIME and 2NEXPTIME) for their key inference services. While highly optimized description logic reasoners have been exhibiting rather good performance on real cases since the mid-1990s, even in those more constrained cases there are ontologies (such as Galen) which have proved impossible to process for over a decade. Indeed, concern with such pathology stimulated a renaissance of research into tractable description logics, with the EL family [1] and the DL-Lite family [4] being incorporated as special "profiles" of OWL 2. However, even though the number of ontologies available on the Web has grown enormously since the standardization of OWL, it is still unclear how robust modern, highly optimized reasoners are to such input. Anecdotal evidence suggests that pathological cases are common enough to cause problems; systematic evidence, however, has been scarce. In this paper we investigate the question of whether modern, highly optimized description logic reasoners are robust over Web input. The general intuition of a robust system is that it is resistant to failure in the face of a range of input. For any particular robustness determination, one must decide: 1) the range of input, 2) the functional or non-functional properties of interest, and 3) what counts as failure. The instantiation of these parameters strongly influences robustness judgements, with the very same reasoner being highly robust under one scenario and very non-robust under another. For our current purposes, the key scenario is that of an ontology engineer, using a tool like Protégé...
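Robustness in this sense can be operationalised quite directly. The sketch below, assuming the OWL API and a hypothetical corpus directory, tries to classify each ontology under a fixed time limit and counts failures (timeouts or errors). Whether a reasoner honours the OWL API timeout varies by implementation, so evaluations typically also run an external watchdog.

```java
// Hypothetical robustness probe: classify every ontology in a corpus under a
// time limit and count failures (timeouts, parse errors, reasoner errors).
import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.OWLOntology;
import org.semanticweb.owlapi.reasoner.*;

import java.io.File;

public class RobustnessProbe {

    public static void run(OWLReasonerFactory factory, long timeoutMillis) {
        int succeeded = 0, failed = 0;
        for (File file : new File("web-corpus/").listFiles()) { // hypothetical directory
            try {
                OWLOntology ontology = OWLManager.createOWLOntologyManager()
                        .loadOntologyFromOntologyDocument(file);
                OWLReasoner reasoner =
                        factory.createReasoner(ontology, new SimpleConfiguration(timeoutMillis));
                reasoner.precomputeInferences(InferenceType.CLASS_HIERARCHY);
                reasoner.dispose();
                succeeded++;
            } catch (TimeOutException e) {
                failed++; // classification did not finish within the limit
            } catch (Exception e) {
                failed++; // parsing error, unsupported construct, reasoner bug, ...
            }
        }
        System.out.println(succeeded + " classified, " + failed + " failures");
    }
}
```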
Tool development for and empirical experimentation in OWL ontology engineering require a wide variety of suitable ontologies as input for testing and evaluation purposes. Empirical activities often resort to (somewhat arbitrarily) hand-curated corpora available on the web, such as the NCBO BioPortal and the TONES Repository, or manually select a set of well-known ontologies. Results may be biased, even heavily, towards these datasets. Sampling from a large corpus of ontologies, on the other hand, may lead to more representative results. Current large-scale repositories and web crawls are mostly uncurated, suffer from duplication, and contain large numbers of ontology versions, variants, and facets, and therefore do not lend themselves to random sampling. In this paper, we describe the creation of a corpus of OWL DL ontologies using strategies such as web crawling, various forms of de-duplication, and manual cleaning, which allows random sampling of ontologies for a variety of empirical applications.
The OWL Reasoner Evaluation (ORE) workshop brings together reasoner developers and ontology engineers in order to discuss and evaluate the performance and robustness of modern reasoners on OWL ontologies. In addition to paper submissions, the workshop featured a live and an offline reasoner competition in which standard reasoning tasks were tested: classification, consistency, and concept satisfiability. The competition was performed on several large corpora of real-life OWL ontologies obtained from the web, as well as on user-submitted ontologies which were found to be challenging for reasoners. Overall there were 14 reasoner submissions for the competition, some of them dedicated to certain subsets or profiles of OWL 2, and implementing different algorithms and optimisations. In this report, we give an overview of the competition methodology and present a summary of its results, divided into the respective categories based on OWL 2 profiles and test corpora.
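The three competition tasks map directly onto the standard OWL API reasoner interface; a minimal, hedged sketch is shown below (timeouts and cross-reasoner result comparison omitted, reasoner factory supplied by the caller).

```java
// Sketch of the three standard tasks on a single ontology: consistency,
// classification, and satisfiability of every named class.
import org.semanticweb.owlapi.model.*;
import org.semanticweb.owlapi.reasoner.*;

public class StandardReasoningTasks {

    public static void runAll(OWLOntology ontology, OWLReasonerFactory factory) {
        OWLReasoner reasoner = factory.createReasoner(ontology);

        // Task 1: consistency.
        boolean consistent = reasoner.isConsistent();
        System.out.println("consistent: " + consistent);

        if (consistent) {
            // Task 2: classification (compute the full class hierarchy).
            reasoner.precomputeInferences(InferenceType.CLASS_HIERARCHY);

            // Task 3: concept satisfiability for every named class.
            long unsatisfiable = 0;
            for (OWLClass cls : ontology.getClassesInSignature()) {
                if (!reasoner.isSatisfiable(cls)) {
                    unsatisfiable++;
                }
            }
            System.out.println(unsatisfiable + " unsatisfiable classes");
        }
        reasoner.dispose();
    }
}
```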
2014 IEEE 27th International Symposium on Computer-Based Medical Systems, 2014
Clinical assessment scales, such as the Glasgow Coma Scale, are a core part of Electronic Health Records (EHRs). However, fully representing them in an OWL ontology is challenging: in particular, the determination of a score from a patient's observations and clinical findings requires forms of aggregation and addition which are either tedious in OWL 2 or simply impractical due to combinatorial explosion. To solve this problem, we propose to separate the representation of the structure and content of an assessment scale from its enactment, with the former captured in OWL 2 and the latter determined by a SPARQL query. The paper reports the results of a systematic review of 104 well-established clinical assessment scales, along with the performance of the proposed SPARQL queries when executed with the ARQ query engine for Jena over HL7 CDA Level 3 documents.
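To give a flavour of the SPARQL side of this split, here is a small, purely illustrative sketch using Jena ARQ; the ex: vocabulary and graph layout are hypothetical stand-ins for an RDF rendering of a CDA document, and the queries used in the paper will differ.

```java
// Illustrative only: compute an assessment-scale score by SPARQL 1.1
// aggregation (SUM/GROUP BY) with Jena ARQ, rather than encoding it in OWL.
// The ex: vocabulary and the input file are hypothetical.
import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class AssessmentScaleScorer {

    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        model.read("patient-record.ttl"); // hypothetical RDF rendering of a CDA document

        String query =
            "PREFIX ex: <http://example.org/cda#>                \n" +
            "SELECT ?patient (SUM(?value) AS ?glasgowComaScore)  \n" +
            "WHERE {                                             \n" +
            "  ?obs ex:partOfScale   ex:GlasgowComaScale ;       \n" +
            "       ex:subject       ?patient ;                  \n" +
            "       ex:numericValue  ?value .                    \n" +
            "}                                                   \n" +
            "GROUP BY ?patient";

        try (QueryExecution qexec =
                 QueryExecutionFactory.create(QueryFactory.create(query), model)) {
            ResultSet results = qexec.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.next();
                System.out.println(row.getResource("patient") + " scored "
                        + row.getLiteral("glasgowComaScore").getInt());
            }
        }
    }
}
```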
DL reasoners are complex pieces of software that work on even more complex input, which makes manual verification difficult. A single ontology can have hundreds or thousands of classes, and thus its classification involves an unsurveyable number of subsumption tests. We propose a new method for debugging classification across multiple reasoners which employs justifications, generated from the set of entailments that the reasoners disagree upon, to determine the cause of the disagreement.