I'm a junior research fellow at Hertford College, University of Oxford, working on machine learning programs from data. Specifically, I work on inductive logic programming, a form of machine learning which learns logic programs from data.
Meta-interpretive learning (MIL) is a form of inductive logic programming that learns logic programs from background knowledge and examples. We claim that adding types to MIL can improve learning performance. We show that type checking can reduce the MIL hypothesis space by a cubic factor. We introduce two typed MIL systems: Metagol_T and HEXMIL_T, implemented in Prolog and Answer Set Programming (ASP), respectively. Both systems support polymorphic types and can infer the types of invented predicates. Our experimental results show that types can substantially reduce learning times.
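The pruning idea behind typed MIL can be illustrated with a toy sketch. This is not the Metagol_T or HEXMIL_T implementation (those are Prolog/ASP systems); it is a hypothetical Python illustration, with made-up predicates and types, of how type checking discards ill-typed predicate compositions in a chain-style rule P(A,B) :- Q(A,C), R(C,B): Q's output type must match R's input type.

```python
# Hypothetical illustration, not the paper's systems: each background
# predicate is given an (input type, output type) signature.
types = {
    "double": ("int", "int"),
    "length": ("list", "int"),
    "head":   ("list", "int"),
    "tail":   ("list", "list"),
}

def chain_candidates(preds):
    """All (Q, R) pairs usable in a chain rule, with no type checking."""
    return [(q, r) for q in preds for r in preds]

def typed_chain_candidates(preds):
    """Keep only pairs where Q's output type matches R's input type."""
    return [(q, r) for q in preds for r in preds
            if types[q][1] == types[r][0]]

all_pairs = chain_candidates(types)      # 16 untyped combinations
typed_pairs = typed_chain_candidates(types)  # 6 well-typed combinations
```

Even on four predicates the ill-typed compositions, such as (double, head), are pruned before search; the cubic-factor reduction in the paper generalises this effect over all argument positions.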
Meta-interpretive learning (MIL) is a form of inductive logic programming. MIL uses second-order Horn clauses, called metarules, as a form of declarative bias. Metarules define the structures of learnable programs and thus the hypothesis space. Deciding which metarules to use is a trade-off between efficiency and expressivity. The hypothesis space increases given more metarules, so we wish to use fewer metarules, but if we use too few metarules then we lose expressivity. A recent paper used Progol's entailment reduction algorithm to identify irreducible, or minimal, sets of metarules. In some cases, as few as two metarules were shown to be sufficient to entail all hypotheses in an infinite language. Moreover, it was shown that compared to non-minimal sets, learning with minimal sets of metarules improves predictive accuracies and lowers learning times. In this paper, we show that entailment reduction can be too strong and can remove metarules necessary to make a hypothesis more specific. We describe a new reduction technique based on derivations. Specifically, we introduce the derivation reduction problem, the problem of finding a finite subset of a Horn theory from which the whole theory can be derived using SLD-resolution. We describe a derivation reduction algorithm which we use to reduce sets of metarules. We also theoretically study whether certain sets of metarules can be derivationally reduced to minimal finite subsets. Our experiments compare learning with entailment and derivation reduced sets of metarules. In general, using derivation reduced sets of metarules outperforms using entailment reduced sets of metarules, both in terms of predictive accuracies and learning times.
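The generic shape of a reduction loop of this kind can be sketched as follows. This is not the paper's derivation reduction algorithm: it is an illustrative Python toy in which "clauses" are edges (a, b) and an edge counts as derivable when a path from a to b exists through the other edges, a simple stand-in for SLD-derivability among chain-style rules.

```python
# Illustrative sketch only: a greedy reduction loop that discards any
# clause derivable from the remaining clauses. Derivability here is a
# toy path check, not SLD-resolution.

def derivable(clause, theory):
    """Toy derivability: is there a path clause[0] -> clause[1] in theory?"""
    start, goal = clause
    frontier, seen = [start], set()
    while frontier:
        node = frontier.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        frontier += [b for (a, b) in theory if a == node]
    return False

def reduce_theory(theory):
    """Remove each clause that the remaining clauses can derive."""
    kept = list(theory)
    for clause in list(theory):
        rest = [c for c in kept if c != clause]
        if clause in kept and derivable(clause, rest):
            kept = rest
    return kept

theory = [("a", "b"), ("b", "c"), ("a", "c")]
reduced = reduce_theory(theory)  # ("a", "c") is derivable via a -> b -> c
```

The design point the paper makes is which derivability test sits in the inner loop: entailment reduction uses logical entailment, which can be too strong, while derivation reduction restricts the test to SLD-derivations.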
When machine learning programs from data, we ideally want to learn efficient, rather than inefficient, programs. However, existing inductive logic programming (ILP) techniques cannot distinguish between the efficiencies of programs, such as permutation sort (O(n!)) and merge sort (O(n log n)). To address this limitation, we introduce Metaopt, an ILP system that learns minimal cost logic programs. Metaopt iteratively learns lower cost programs, each time further restricting the hypothesis space. We prove that given sufficient examples, Metaopt converges on minimal cost programs, and our experiments show that in practice only small numbers of examples are needed. We also introduce a cost function called tree cost which measures the size of the SLD-tree searched when a program is given a goal. This cost function allows Metaopt to learn minimal time complexity programs, including non-deterministic programs. Our experiments on programming puzzles, robot strategies, and real-world string transformation problems show that Metaopt learns minimal cost programs.
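The iterative restriction scheme can be sketched abstractly. This is a hypothetical Python outline, not the Metaopt implementation: each time a consistent program is found, the search is re-run with the hypothesis space restricted to strictly lower cost, so the last program found is minimal.

```python
# Hypothetical sketch of iterative cost refinement (not the actual
# Metaopt system): tighten the cost bound after every solution.

def learn_minimal_cost(candidates, cost, consistent):
    """Return the lowest-cost candidate consistent with the examples."""
    best, bound = None, float("inf")
    while True:
        found = next((p for p in candidates
                      if consistent(p) and cost(p) < bound), None)
        if found is None:
            return best        # no cheaper consistent program exists
        best, bound = found, cost(found)

# Toy usage: "programs" are just names with made-up costs.
costs = {"permutation_sort": 5040, "insertion_sort": 49, "merge_sort": 20}
result = learn_minimal_cost(costs, costs.get, lambda p: True)  # merge_sort
```

In the real system the cost of a program is its tree cost, the size of the SLD-tree searched for a goal, rather than a lookup table.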
Many tasks in AI require the design of complex programs and representations, whether for programming robots, designing game-playing programs, or conducting textual or visual transformations. This paper explores a novel inductive logic programming approach to learn such programs from examples. To reduce the complexity of the learned programs, and thus the search for such a program, we introduce higher-order operations involving an alternation of Abstraction and Invention. Abstractions are described using logic program definitions containing higher-order predicate variables. Inventions involve the construction of definitions for the predicate variables used in the Abstractions. The use of Abstractions extends the Meta-Interpretive Learning framework and is supported by the use of a user-extendable set of higher-order operators, such as map, until, and ifthenelse. Using these operators reduces the textual complexity required to express target classes of programs. We provide sample complexity results which indicate that the approach leads to reductions in the numbers of examples required to reach high predictive accuracy, as well as significant reductions in overall learning time. Our experiments demonstrate increased accuracy and reduced learning times in all cases. We believe that this paper is the first in the literature to demonstrate the efficiency and accuracy advantages involved in the use of higher-order abstractions.
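The role of operators such as map and until is easy to see in a functional setting. The following is an illustrative Python analogue, not the paper's logic-program formulation: each abstraction is defined once, and an "invention" only has to supply the small function that completes it, shrinking the text of each learned program.

```python
# Illustrative analogue of higher-order abstractions (Python, not the
# paper's logic programs): map and until are defined once and reused.

def map_op(f, xs):
    """The map abstraction: apply f to every element of a list."""
    return [f(x) for x in xs]

def until(cond, step, x):
    """The until abstraction: apply step repeatedly until cond holds."""
    while not cond(x):
        x = step(x)
    return x

# Two tasks reuse the same abstractions with different small inventions.
double_all = map_op(lambda x: 2 * x, [1, 2, 3])               # [2, 4, 6]
halve_to_one = until(lambda x: x <= 1, lambda x: x // 2, 20)  # 1
```

Without the abstractions, each task would need its own explicit recursive definition; with them, only the invented step functions vary between tasks.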
Inductive programming approaches typically rely on an Occamist bias to select hypotheses with minimal textual complexity. This approach, however, fails to distinguish between the efficiencies of hypothesised programs, such as merge sort (O(n log n)) and bubble sort (O(n²)). We address this issue by introducing techniques to learn logic programs with minimal resource complexity. We describe an algorithm proven to learn minimal resource complexity robot strategies, and we propose future work to generalise the approach to a broader class of programs.
Data transformation involves the manual construction of large numbers of special-purpose programs. Although typically small, such programs can be complex, involving problem decomposition, recursion, and recognition of context. Building such programs is common in commercial and academic data analytic projects and can be labour intensive and expensive, making it a suitable candidate for machine learning. In this paper, we use the meta-interpretive learning framework (MIL) to learn recursive data transformation programs from small numbers of examples. MIL is well suited to this task because it supports problem decomposition through predicate invention, learning recursive programs, learning from few examples, and learning from only positive examples. We apply Metagol, a MIL implementation, to both semi-structured and unstructured data. We conduct experiments on three real-world datasets: medical patient records, XML mondial records, and natural language taken from ecological papers. The experimental results suggest that high levels of predictive accuracy can be achieved in these tasks from small numbers of training examples, especially when learning with recursion.
Stock market prediction is nothing new. For years researchers from various disciplines have been seduced by the appeal of financial gain. Some believe that stock price movements are governed by the random walk hypothesis, which states that stock prices evolve according to a random walk and thus cannot be predicted any more accurately than predicting whether a coin will land heads or tails; others disagree. This thesis investigates whether Twitter can be used to predict stock volume. A Twitter dataset of 40 million tweets is constructed for the period 01 Jan - 31 May 2011. Support vector regression is then used to model and predict next day stock volume for four stocks. The results are disappointing. Tests using features extracted from Twitter fail to beat a two day moving average baseline. This, however, does not mean that prediction using Twitter is impossible, and there is strong potential for future work.
Most logic-based machine learning algorithms rely on an Occamist bias where textual complexity of hypotheses is minimised. Within Inductive Logic Programming (ILP), this approach fails to distinguish between the efficiencies of hypothesised programs, such as quick sort (O(n log n)) and bubble sort (O(n²)). This paper addresses this issue by considering techniques to minimise both the textual complexity and resource complexity of hypothesised robot strategies. We develop a general framework for the problem of minimising resource complexity and show that on two robot strategy problems, 1) Postman and 2) Sorter (recursively sort letters for delivery), the theoretical resource complexities of optimal strategies vary depending on whether objects can be composed within a strategy. The approach considered is an extension of Meta-Interpretive Learning (MIL), a recently developed paradigm in ILP which supports predicate invention and the learning of recursive logic programs. We introduce a new MIL implementation, MetagolO, and prove its convergence, with increasing numbers of randomly chosen examples, to optimal strategies of this kind. Our experiments show that MetagolO learns theoretically optimal robot sorting strategies, which is in agreement with the theoretical predictions showing a clear divergence in resource requirements as the number of objects grows. To the authors' knowledge this paper is the first demonstration of a learning algorithm able to learn optimal resource complexity robot strategies and algorithms for sorting lists.
Meta-Interpretive Learning (MIL) is an ILP technique which uses higher-order meta-rules to support predicate invention and learning of recursive definitions. In MIL the selection of meta-rules is analogous to the choice of refinement operators in a refinement graph search. The meta-rules determine the structure of permissible rules which in turn defines the hypothesis space. On the other hand, the hypothesis space can be shown to increase rapidly in the number of meta-rules. However, methods for reducing the set of meta-rules have so far not been explored within MIL. In this paper we demonstrate that irreducible, or minimal, sets of meta-rules can be found automatically by applying Plotkin's clausal theory reduction algorithm. When this approach is applied to a set of meta-rules consisting of an enumeration of all meta-rules in a given finite hypothesis language we show that in some cases as few as two meta-rules are complete and sufficient for generating all hypotheses. In our experiments we compare the effect of using a minimal set of meta-rules to randomly chosen subsets of the maximal set of meta-rules. In general the minimal set of meta-rules leads to lower runtimes and higher predictive accuracies than larger randomly selected sets of meta-rules.
In machine learning we are often faced with the problem of incomplete data, which can lead to lower predictive accuracies in both feature-based and relational machine learning. It is therefore important to develop techniques to compensate for incomplete data. In inductive logic programming (ILP) incomplete data can be in the form of missing values or missing predicates. In this paper, we investigate whether an ILP learner can compensate for missing background predicates through predicate invention. We conduct experiments on two datasets in which we progressively remove predicates from the background knowledge whilst measuring the predictive accuracy of three ILP learners with differing levels of predicate invention. The experimental results show that as the number of background predicates decreases, an ILP learner which performs predicate invention has higher predictive accuracies than the learners which do not perform predicate invention, suggesting that predicate invention can compensate for incomplete background knowledge.
Most logic-based machine learning algorithms rely on an Occamist bias where textual simplicity of hypotheses is optimised. This approach, however, fails to distinguish between the efficiencies of hypothesised programs, such as quick sort (O(n log n)) and bubble sort (O(n²)). We address this issue by considering techniques to minimise both the resource complexity and textual complexity of hypothesised programs. We describe an algorithm proven to learn optimal resource complexity robot strategies, and we propose future work to generalise this approach to a broader class of logic programs.
Fiction authors rarely provide detailed descriptions of scenes, preferring the reader to fill in the details using their imagination. Therefore, to perform detailed text-to-scene conversion from books, we need to not only identify explicit objects but also infer implicit objects. In this paper, we describe an approach to inferring objects using Wikipedia and WordNet. In our experiments, we are able to infer implicit objects such as monitor and computer by identifying explicit objects such as keyboard.