
One-Class Classification Framework Based on Shrinkage Methods

Published in the Journal of Signal Processing Systems.

Abstract

Statistical machine learning methods, such as kernel methods, have been widely used to discover hidden regularities and patterns in data. In particular, one-class classification algorithms have gained considerable interest in applications where the available data belong to a single class, as in industrial processes. In this paper, we propose a sparse framework for one-class classification problems by investigating the hypersphere enclosing the samples in a Reproducing Kernel Hilbert Space (RKHS). The center of this hypersphere is approximated with a sparse solution, obtained by selecting an appropriate set of relevant samples. For this purpose, we investigate well-known shrinkage methods, namely Least Angle Regression, the Least Absolute Shrinkage and Selection Operator, and the Elastic Net, revisiting these methods and adapting their algorithms to estimate the sparse center in the RKHS. The proposed framework is extended to include the truncated Mahalanobis distance, which is needed when dealing with heterogeneous input variables. We also provide theoretical results on the projection error and on the error of the first kind. The proposed algorithms are compared with well-known one-class classification approaches, with experiments conducted on simulated and real datasets.
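
To make the approach concrete, here is a minimal sketch of the core idea, not the authors' implementation: the empirical center of the mapped samples in the RKHS is approximated by a sparse kernel expansion, by reducing the center-fitting problem to an ordinary Lasso through a Cholesky factorization of the kernel matrix. The Gaussian kernel, the use of scikit-learn's Lasso as a stand-in for the paper's adapted LARS/LASSO/Elastic Net solvers, the regularization value, and the quantile-based radius are all illustrative assumptions.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular
from sklearn.linear_model import Lasso

def gaussian_kernel(A, B, sigma=1.0):
    """Pairwise Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_sparse_center(X, sigma=1.0, lam=1e-3, quantile=0.95):
    """Sparse approximation of the RKHS mean; Lasso stand-in for the paper's solvers."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    kappa = K.mean(axis=1)  # kappa_i = <phi(x_i), empirical RKHS mean>
    # ||mean_phi - sum_i alpha_i phi(x_i)||^2 = alpha'K alpha - 2 alpha'kappa + const.
    # With K = L L' this is the least-squares term ||y - L' alpha||^2 where L y = kappa,
    # so an off-the-shelf L1-penalized solver applies directly.
    L = cholesky(K + 1e-10 * np.eye(n), lower=True)  # jitter for numerical safety
    y = solve_triangular(L, kappa, lower=True)
    # Note: sklearn scales the data-fit term by 1/(2n), which only rescales lam.
    alpha = Lasso(alpha=lam, fit_intercept=False, max_iter=50000).fit(L.T, y).coef_
    # Squared RKHS distances of training samples to the sparse center;
    # a quantile of these serves as a crude squared radius (illustrative choice).
    d2 = np.diag(K) - 2.0 * K @ alpha + alpha @ K @ alpha
    return alpha, np.quantile(d2, quantile)

def is_normal(X_train, X_new, alpha, radius2, sigma=1.0):
    """Classify new samples: inside the hypersphere -> normal (True)."""
    K_nt = gaussian_kernel(X_new, X_train, sigma)
    K_tt = gaussian_kernel(X_train, X_train, sigma)
    # k(x, x) = 1 for the Gaussian kernel, hence the leading 1.0.
    d2 = 1.0 - 2.0 * K_nt @ alpha + alpha @ K_tt @ alpha
    return d2 <= radius2

# Toy usage: one class of training data, far-away points as anomalies.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
alpha, r2 = fit_sparse_center(X, sigma=1.5, lam=1e-3)
print("support size:", int((np.abs(alpha) > 1e-8).sum()), "of", len(X))
print("anomalies flagged:", (~is_normal(X, X + 10.0, alpha, r2, sigma=1.5)).mean())
```

Swapping Lasso for scikit-learn's ElasticNet or LassoLars would give stand-ins for the other two shrinkage variants named in the abstract; the truncated Mahalanobis extension for heterogeneous variables is not sketched here.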





Acknowledgments

The authors would like to thank Thomas Morris and the Mississippi SCADA Laboratory for providing the real SCADA datasets, and the French “Agence Nationale de la Recherche” (ANR) grant SCALA for supporting this work.

Author information

Corresponding author

Correspondence to Patric Nader.


About this article


Cite this article

Nader, P., Honeine, P. & Beauseroy, P. One-Class Classification Framework Based on Shrinkage Methods. J Sign Process Syst 90, 341–356 (2018). https://doi.org/10.1007/s11265-017-1240-z
