Abstract
In this paper, we study the possibility of Occam’s razors for a widely studied class of Boolean formulae: Disjunctive Normal Forms (DNF). An Occam’s razor is an algorithm which compresses the knowledge contained in a set of observations (examples) into a small formula. We prove that approximating the minimum-size DNF formula consistent with a set of examples, as well as a generalization of graph colorability, is very hard. Our proof technique is such that the stronger the complexity hypothesis used, the larger the inapproximability ratio obtained. Our ratio is among the first to integrate the three parameters of Occam’s razors: the number of examples, the number of description attributes, and the size of the target formula labelling the examples. Theoretically speaking, our result rules out the existence of efficient deterministic Occam’s razor algorithms for DNF. Practically speaking, it puts a large worst-case lower bound on the size of the formulae found by learning systems that proceed by rule searching.
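To make the compression problem concrete, the sketch below (an illustration, not taken from the paper) spells out the minimum consistent DNF problem that the hardness result concerns: given labelled examples over n Boolean attributes, find a DNF formula with as few terms as possible that agrees with every label. The helper names (`min_consistent_dnf`, `term_satisfied`, and so on) are hypothetical, and the exhaustive search is exponential in n; the paper’s result indicates that even approximating the optimal number of terms efficiently is out of reach.

```python
from itertools import combinations, product

# Illustrative brute-force search for a minimum consistent DNF
# (not the paper's algorithm). A term is a tuple of literals;
# the literal (i, True) means x_i, and (i, False) means not x_i.

def term_satisfied(term, example):
    """True iff every literal of the term agrees with the example (a tuple of 0/1 values)."""
    return all(bool(example[i]) == positive for (i, positive) in term)

def dnf_satisfied(terms, example):
    """A DNF is satisfied when at least one of its terms is satisfied."""
    return any(term_satisfied(t, example) for t in terms)

def consistent(terms, examples):
    """Check that the DNF reproduces every label in the sample."""
    return all(dnf_satisfied(terms, x) == y for (x, y) in examples)

def min_consistent_dnf(examples, n):
    """Exhaustively search for a DNF with the fewest terms that is consistent
    with the labelled examples over n Boolean attributes (exponential time)."""
    # Enumerate every non-empty term: each attribute occurs positively,
    # negatively, or not at all.
    all_terms = []
    for mask in product((None, True, False), repeat=n):
        term = tuple((i, pos) for i, pos in enumerate(mask) if pos is not None)
        if term:
            all_terms.append(term)
    # Try formulae by increasing number of terms; the first hit is minimum-size.
    for k in range(len(all_terms) + 1):
        for terms in combinations(all_terms, k):
            if consistent(terms, examples):
                return terms
    return None  # contradictory labels: no DNF is consistent

# Example: the labels follow the single term x_1, so a one-term DNF suffices.
examples = [((0, 1), True), ((1, 1), True), ((0, 0), False), ((1, 0), False)]
print(min_consistent_dnf(examples, n=2))  # -> (((1, True),),)
```

Even on this toy instance the search considers every subset of the 3^n − 1 possible terms in the worst case, which is why practical rule-learning systems settle for heuristics rather than the true minimum.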
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Nock, R., Jappy, P., Sallantin, J. (1998). Generalized Graph Colorability and Compressibility of Boolean Formulae. In: Chwa, KY., Ibarra, O.H. (eds) Algorithms and Computation. ISAAC 1998. Lecture Notes in Computer Science, vol 1533. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49381-6_26