Class-specific image representation for image classification using multiple scale-invariant region detectors

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

We propose a new class-specific image representation for image classification using multiple region detectors. The new representation is designed to solve the problem of increasing variation in object location and size within images of a class, for which traditional spatial pyramid matching shows limited classification accuracy. We propose a new region-division method that divides the image region into two class-specific regions, called class-specific region-of-interest (C-ROI) and focal region (FR). Using multiple region detectors and appropriate mixing of their responses avoids the problem of selecting a region detector that gives the best classification accuracy for a given image class, and thereby yields better results than using only one region detector. Several scale-invariant region detectors are used to obtain C-ROI and FR by considering their importance over a given image class. In experiments using several well-known datasets, the proposed method improved the accuracy and achieved results that were better than or comparable to those achieved by the related methods.

Notes

  1. We used only one basis because in experiments we observed that one basis is sufficient to represent the particular information of the given data.

  2. http://www.robots.ox.ac.uk/vgg/data3.html

  3. http://www.vision.caltech.edu/Image_Datasets/Caltech101

  4. http://archive.ics.uci.edu/ml/datasets/CMU+Face+Images

  5. http://www-cvr.ai.uiuc.edu/ponce_grp/data

  6. In our experiments, the number of sub-regions has little effect on the classification performance.

  7. To compare classification results for the 36 selected classes, we used the code provided by the authors of [34, 36].

References

  1. Chai Y, Lempitsky V, Zisserman A (2011) Bicos: a bi-level co-segmentation method for image classification. In: Proceedings of the international conference on computer vision, pp 2579–2586

  2. Cheng H, Wang R (2010) Semantic modeling of natural scenes based on contextual bayesian networks. Pattern Recognit 43(12):4042–4054

  3. Cheng Y (1995) Mean shift, mode seeking, and clustering. IEEE Trans Pattern Anal Mach Intell 17(8):790–799

  4. Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ (2008) Liblinear: a library for large linear classification. J Mach Learn Res 9(4):1871–1874

  5. Felzenszwalb PF, Huttenlocher DP (2005) Pictorial structures for object recognition. Int J Comput Vis 61(1):55–79

  6. Fergus R, Perona P, Zisserman A (2003) Object class recognition by unsupervised scale-invariant learning. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 264–271

  7. Fergus R, Perona P, Zisserman A (2005) A sparse object category model for efficient learning and exhaustive recognition. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 380–387

  8. Fischler MA, Elschlager RA (1973) The representation and matching of pictorial structures. IEEE Trans Comput 22(1):67–92

  9. Galleguillos C, Babenko B, Rabinovich A, Belongie SJ (2008) Weakly supervised object localization with stable segmentations. In: Proceedings of the European conference on computer vision, pp 193–207

  10. Gao S, Cheng X, Chia LT (2010) Discovering class-specific informative patches and its application in landmark characterization. Proc Int Multimed Model Conf 5916:218–228

  11. Jiang H, Wang J, Yuan Z, Wu Y, Zheng N, Li S (2013) Salient object detection: a discriminative regional feature integration approach. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), pp 2083–2090

  12. Kadir T, Brady M (2001) Scale, saliency and image description. Int J Comput Vis 45(2):83–105

  13. Lampert CH, Blaschko MB, Hofmann T (2008) Beyond sliding windows: object localization by efficient subwindow search. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 1–8

  14. Lazebnik S, Schmid C, Ponce J (2006) Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 2169–2178

  15. Lee DD, Seung HS (1999) Learning the parts of objects using non-negative matrix factorization. Nature 401(6755):788–791

  16. Li FF, Perona P (2005) A bayesian hierarchical model for learning natural scene categories. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 524–531

  17. Li Z, Liu J, Lu H (2010) Sparse constraint nearest neighbour selection in cross-media retrieval. In: Proceedings of the IEEE international conference on image processing (ICIP), pp 1465–1468

  18. Li Z, Liu J, Lu H (2013) Structure preserving non-negative matrix factorization for dimensionality reduction. Comput Vis Image Underst 117(9):1175–1189

  19. Li Z, Liu J, Tang J, Lu H (2014) Projective matrix factorization with unified embedding for social image tagging. Comput Vis Image Underst 124:71–78

  20. Li Z, Liu J, Yang Y, Zhou X, Lu H (2014) Clustering-guided sparse structural learning for unsupervised feature selection. IEEE Trans Knowl Data Eng 26(9):2138–2150

  21. Li Z, Yang Y, Liu J, Zhou X, Lu H (2012) Unsupervised feature selection using nonnegative spectral analysis. In: Proceedings of the twenty-sixth AAAI conference on artificial intelligence, July 22–26, 2012, Toronto, Ontario, Canada

  22. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110

  23. Mikolajczyk K, Leibe B, Schiele B (2005) Local features for object class recognition. In: Proceedings of the international conference on computer vision, pp 1792–1799

  24. Mikolajczyk K, Schmid C (2004) Scale and affine invariant interest point detectors. Int J Comput Vis 60(1):63–86

  25. Mikolajczyk K, Tuytelaars T, Schmid C, Zisserman A, Matas J, Schaffalitzky F, Kadir T, Gool LV (2005) A comparison of affine region detectors. Int J Comput Vis 65(1–2):43–72

  26. Nguyen M (2012) Segment-based svms for time series analysis. Ph.D. thesis, The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA

  27. Nister D, Stewenius H (2006) Scalable recognition with a vocabulary tree. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 2161–2168

  28. Perronnin F, Sánchez J, Mensink T (2010) Improving the fisher kernel for large-scale image classification. Proc Eur Conf Comput Vis 6314:143–156

  29. Serrano N, Savakis A, Luo J (2004) Improved scene classification using efficient low-level features and semantic cues. Pattern Recognit 37(9):1773–1784

  30. Shalev-Shwartz S, Singer Y, Srebro N (2007) Pegasos: primal estimated sub-gradient solver for svm. In: Proceedings of the international conference on machine learning, pp 807–814

  31. Sharma G (2012) Discriminative spatial saliency for image classification. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), pp 3506–3513

  32. Vedaldi A, Fulkerson B (2010) Vlfeat: an open and portable library of computer vision algorithms. In: Proceedings of the international conference on Multimedia, pp 1469–1472

  33. Vedaldi A, Zisserman A (2012) Efficient additive kernels via explicit feature maps. IEEE Trans Pattern Anal Mach Intell 34(3):480–492

  34. Wang J, Yang J, Yu K, Lv F, Huang T, Gong Y (2010) Locality-constrained linear coding for image classification. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 3360–3367

  35. Yakhnenko O, Verbeek J, Schmid C (2011) Region-based image classification with a latent svm model. Tech Rep RR-7665, INRIA. http://hal.inria.fr/inria-00605344

  36. Yang J, Yu K, Gong Y, Huang T (2009) Linear spatial pyramid matching using sparse coding for image classification. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 1794–1801

Author information

Corresponding author

Correspondence to Hui-Jin Lee.

About this article

Cite this article

Lee, HJ., Hong, KS. Class-specific image representation for image classification using multiple scale-invariant region detectors. Pattern Anal Applic 20, 717–732 (2017). https://doi.org/10.1007/s10044-016-0529-z

  • DOI: https://doi.org/10.1007/s10044-016-0529-z
