Random Projections for Linear Support Vector Machines

Published: 29 August 2014

Abstract

Let X be a data matrix of rank ρ, whose rows represent n points in d-dimensional space. The linear support vector machine constructs a hyperplane separator that maximizes the 1-norm soft margin. We develop a new oblivious dimension-reduction technique that is precomputed and can be applied to any input matrix X. We prove that, with high probability, the margin and the minimum enclosing ball in the feature space are preserved to within ϵ-relative error, ensuring generalization comparable to that in the original space in the case of classification. For regression, we show that the margin is preserved to within ϵ-relative error, also with high probability. We present extensive experiments with real and synthetic data to support our theory.
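
To make the setup concrete, the following is a minimal sketch, not the authors' implementation: it draws a data-independent (oblivious) Rademacher projection matrix once, applies it to the data matrix X, and trains a linear SVM in the reduced space. The use of scikit-learn's LinearSVC, the target dimension r, and all variable names are illustrative assumptions.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n, d, r = 1000, 500, 50  # n points in d dimensions, projected down to r

# Synthetic classification data standing in for the data matrix X.
X, y = make_classification(n_samples=n, n_features=d, n_informative=20,
                           random_state=0)

# Oblivious projection: R is drawn once, independently of X, and can be
# reused for any input matrix with d columns.
R = rng.choice([-1.0, 1.0], size=(d, r)) / np.sqrt(r)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train one linear SVM in the original space and one in the projected space.
full = LinearSVC(max_iter=20000).fit(X_train, y_train)
proj = LinearSVC(max_iter=20000).fit(X_train @ R, y_train)

print("test accuracy, original space: ", full.score(X_test, y_test))
print("test accuracy, projected space:", proj.score(X_test @ R, y_test))

If the margin and minimum enclosing ball are preserved to within ϵ-relative error, the two accuracies should be comparable; in practice r trades the size of the projected problem against that error.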

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 8, Issue 4 (October 2014), 219 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/2663597

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 29 August 2014
Accepted: 01 December 2013
Revised: 01 October 2013
Received: 01 April 2013
Published in TKDD Volume 8, Issue 4

Author Tags

  1. Classification
  2. Dimensionality reduction
  3. Support vector machines

