DOI: 10.1145/3447548.3467440
research-article

Fast One-class Classification using Class Boundary-preserving Random Projections

Published: 14 August 2021
    Abstract

    Several applications, such as malicious URL detection and web spam detection, require classification on very high-dimensional data. In such cases, anomalous data is hard to find but normal data is easily available, so it is increasingly common to use a one-class classifier (OCC). Unfortunately, most OCC algorithms cannot scale to datasets with extremely high dimensions. In this paper, we present Fast Random projection-based One-Class Classification (FROCC), an extremely efficient, scalable, and easily parallelizable method for one-class classification with provable theoretical guarantees. Our method is based on the simple idea of transforming the training data by projecting it onto a set of random unit vectors chosen uniformly and independently from the unit sphere, and bounding the regions based on the separation of the data. FROCC can be naturally extended with kernels. We provide a new theoretical framework to prove that FROCC generalizes well, in the sense that it is stable and has low bias for some parameter settings. We then develop a fast, scalable approximation of FROCC using vectorization, exploiting data sparsity and parallelism, in a new implementation called ParDFROCC. ParDFROCC achieves up to 2 percentage points better ROC than the next best baseline, with up to 12× speedup in training and test times over a range of state-of-the-art benchmarks for the OCC task.
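
The projection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not spell out how regions are bounded, so the gap threshold `eps`, the interval-merging rule, and every function name below are assumptions made for illustration. The sketch draws directions uniformly from the unit sphere by normalizing Gaussian samples, projects the training points onto each direction, groups the projections into intervals wherever consecutive values are at most `eps` apart, and accepts a test point only if its projection lands inside a training interval along every direction.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussian sample)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project(point, direction):
    """Scalar projection of a point onto a unit direction."""
    return sum(p * d for p, d in zip(point, direction))


def build_intervals(values, eps):
    """Group sorted projections into intervals, splitting where a gap exceeds eps.

    `eps` is a hypothetical separation threshold chosen for this sketch.
    """
    values = sorted(values)
    intervals = []
    start = prev = values[0]
    for v in values[1:]:
        if v - prev > eps:
            intervals.append((start, prev))
            start = v
        prev = v
    intervals.append((start, prev))
    return intervals


def fit(train, n_directions=50, eps=0.5, seed=0):
    """Return a list of (direction, intervals) pairs learned from normal data only."""
    rng = random.Random(seed)
    dim = len(train[0])
    model = []
    for _ in range(n_directions):
        w = random_unit_vector(dim, rng)
        model.append((w, build_intervals([project(x, w) for x in train], eps)))
    return model


def is_normal(model, point):
    """Accept a point only if it falls in a training interval on every direction."""
    for w, intervals in model:
        z = project(point, w)
        if not any(lo <= z <= hi for lo, hi in intervals):
            return False
    return True


# Usage: train on a small 2-D grid of "normal" points, then query two points.
train = [[0.1 * i, 0.1 * j] for i in range(10) for j in range(10)]
model = fit(train, n_directions=20, eps=0.5, seed=1)
print(is_normal(model, train[0]), is_normal(model, [100.0, 100.0]))
```

Each direction is handled independently of the others, which is consistent with the abstract's remark that the method is easily parallelizable; a vectorized or sparse-aware variant would replace the inner Python loops with matrix products.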

    Supplementary Material

    MP4 File (fast_oneclass_classification_using_class-arindam_bhattacharya-sumanth_varambally-38958014-rtqQ.mp4)
    Presentation video

    Published In

    KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
    August 2021
    4259 pages
    ISBN:9781450383325
    DOI:10.1145/3447548
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. ensemble classifier
    2. kernel based method
    3. one class classification
    4. random projection

    Qualifiers

    • Research-article

    Conference

    KDD '21
    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Cited By

    • (2023) Beyond Hard Negatives in Product Search: Semantic Matching Using One-Class Classification (SMOCC). Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 1012-1020. DOI: 10.1145/3539597.3570488. Online publication date: 27-Feb-2023
    • (2022) Perturbation learning based anomaly detection. Proceedings of the 36th International Conference on Neural Information Processing Systems, 14317-14330. DOI: 10.5555/3600270.3601311. Online publication date: 28-Nov-2022
    • (2022) New wine in an old bottle. Proceedings of the VLDB Endowment 15:9, 1924-1936. DOI: 10.14778/3538598.3538613. Online publication date: 1-May-2022
    • (2022) One-class Anomaly Detection with Redundancy Reduction and Momentum Mechanism. 2022 4th International Conference on Data-driven Optimization of Complex Systems (DOCS), 1-6. DOI: 10.1109/DOCS55193.2022.9967719. Online publication date: 28-Oct-2022
