Abstract
Practical experience has shown that in order to obtain the best possible performance, prior knowledge about invariances of a classification problem at hand ought to be incorporated into the training procedure. We describe and review all known methods for doing so in support vector machines, provide experimental results, and discuss their respective merits. One of the significant new results reported in this work is our recent achievement of the lowest reported test error on the well-known MNIST digit recognition benchmark task, with SVM training times that are also significantly faster than previous SVM methods.
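To make the central idea concrete, the following is a minimal sketch of the virtual support vector strategy for incorporating invariances: train an SVM, apply known invariance transformations (here, one-pixel translations) to the support vectors only, and retrain on the enlarged set. The dataset, kernel parameters, and the shift_image helper are illustrative assumptions for this sketch, not the exact setup used in the paper.

```python
# Hedged sketch of the virtual support vector idea: train an SVM, generate
# transformed copies of the support vectors (a known invariance of the task),
# and retrain on support vectors plus these "virtual" examples.
# Data, kernel settings, and shift_image are illustrative assumptions only.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC

def shift_image(img_flat, dx, dy, shape=(8, 8)):
    """Translate a flattened image by (dx, dy) pixels (toy invariance)."""
    img = img_flat.reshape(shape)
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1).ravel()

X, y = load_digits(return_X_y=True)      # 8x8 digit images, a small stand-in for MNIST
X_train, y_train = X[:1000], y[:1000]

# Step 1: ordinary SVM training.
svm = SVC(kernel="rbf", gamma=0.001, C=10.0).fit(X_train, y_train)

# Step 2: generate virtual examples from the support vectors only.
sv_X, sv_y = X_train[svm.support_], y_train[svm.support_]
shifts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
virtual_X = np.vstack([np.array([shift_image(x, dx, dy) for x in sv_X])
                       for dx, dy in shifts])
virtual_y = np.tile(sv_y, len(shifts))

# Step 3: retrain on the support vectors plus their virtual counterparts.
svm_vsv = SVC(kernel="rbf", gamma=0.001, C=10.0).fit(
    np.vstack([sv_X, virtual_X]), np.concatenate([sv_y, virtual_y]))
```

Restricting the virtual examples to support vectors keeps the second training set small, since the support vectors are the only training points that determine the decision boundary.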