Maximum Margin Algorithms with Boolean Kernels

Khardon, Roni; Servedio, Rocco A.

doi:10.1007/978-3-540-45167-9_8

Roni Khardon &
Rocco A. Servedio

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2777)

Abstract

Recent work has introduced Boolean kernels with which one can learn over a feature space containing all conjunctions of length up to k (for any 1 ≤ k ≤ n) over the original n Boolean features in the input space. This motivates the question of whether maximum margin algorithms such as support vector machines can learn Disjunctive Normal Form expressions in the PAC learning model using this kernel. We study this question, as well as a variant in which structural risk minimization (SRM) is performed where the class hierarchy is taken over the length of conjunctions.

We show that such maximum margin algorithms do not PAC learn t(n)-term DNF for any t(n) = ω(1), even when used with such an SRM scheme. We also consider PAC learning under the uniform distribution and show that if the kernel uses conjunctions of length \(\tilde{\omega}(\sqrt{n})\) then the maximum margin hypothesis will fail on the uniform distribution as well. Our results concretely illustrate that margin-based algorithms may overfit when learning simple target functions with natural kernels.
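To make the kernel in the abstract concrete, here is a minimal Python sketch of one common way such a kernel can be evaluated. It is an illustration under stated assumptions, not the authors' code: it assumes the feature space holds every conjunction of 1 to k literals (negated literals allowed, empty conjunction excluded), and the function names `boolean_kernel` and `explicit_feature_map` are hypothetical. Under those assumptions a conjunction is true on both inputs exactly when all of its literals sit on coordinates where the inputs agree, so the inner product reduces to a sum of binomial coefficients.

```python
from itertools import combinations, product
from math import comb


def boolean_kernel(x, y, k):
    """Count the conjunctions of 1..k literals that are true on both x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # A conjunction holds on both inputs only if each of its literals is
    # placed on a coordinate where x and y carry the same bit.
    same = sum(1 for a, b in zip(x, y) if a == b)
    return sum(comb(same, size) for size in range(1, k + 1))


def explicit_feature_map(x, k):
    """Brute-force 0/1 feature vector, one entry per conjunction of size 1..k.

    Exponentially large; included only to check the closed form on tiny inputs.
    """
    n = len(x)
    features = []
    for size in range(1, k + 1):
        for idxs in combinations(range(n), size):
            # sign 1 is the positive literal x_i, sign 0 is the negated literal
            for signs in product((0, 1), repeat=size):
                features.append(int(all(x[i] == s for i, s in zip(idxs, signs))))
    return features


if __name__ == "__main__":
    x, y, k = (1, 0, 1, 1), (1, 1, 1, 0), 2
    dot = sum(a * b for a, b in zip(explicit_feature_map(x, k),
                                    explicit_feature_map(y, k)))
    print(boolean_kernel(x, y, k), dot)
```

The closed form costs O(n + k) arithmetic operations per pair of examples, whereas the explicit feature vector has one coordinate per conjunction; this gap is what makes learning over the expanded feature space computationally feasible in the first place.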

Author information

Authors and Affiliations

Department of Computer Science, Tufts University, Medford, MA, 02155, USA
Roni Khardon
Department of Computer Science, Columbia University, New York, NY, 10027, USA
Rocco A. Servedio

Editor information

Editors and Affiliations

MPI for Biological Cybernetics, Spemannstr. 38, 72076, Tübingen, Germany
Bernhard Schölkopf
University of California, Santa Cruz
Manfred K. Warmuth

About this paper

Cite this paper

Khardon, R., Servedio, R.A. (2003). Maximum Margin Algorithms with Boolean Kernels. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science, vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_8

DOI: https://doi.org/10.1007/978-3-540-45167-9_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40720-1
Online ISBN: 978-3-540-45167-9
eBook Packages: Springer Book Archive
