Abstract
As “big data” transforms the way we solve computer vision problems, the question of how to efficiently leverage large labelled databases becomes increasingly important. High-dimensional features, such as the convolutional neural network activations that drive many leading recognition frameworks, pose particular challenges for efficient retrieval. We present a novel method for learning compact binary codes in which the conventional dense projection matrix is replaced with a discriminatively trained sparse projection matrix. On the SUN RGB-D, AwA, and ImageNet datasets, the proposed method encodes two to three times faster than modern dense binary encoding methods while obtaining comparable retrieval accuracy. The method is also more accurate than unsupervised high-dimensional binary encoding methods at similar encoding speeds.
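To make the encoding step concrete, the following is a minimal sketch of binary encoding with a sparse projection matrix followed by Hamming-distance ranking. It is an illustration under stated assumptions, not the method of the paper: the projection here is random rather than discriminatively trained, and the function names, the fixed number of nonzeros per row, and the sign-thresholding rule are all hypothetical choices made for the example.

```python
import numpy as np


def make_sparse_projection(n_bits, dim, nnz_per_row, rng):
    """Random stand-in for a learned sparse projection (assumption: the
    paper learns its projection; here it is only sampled).
    Row b is stored as nnz_per_row column indices plus their weights."""
    idx = np.stack(
        [rng.choice(dim, size=nnz_per_row, replace=False) for _ in range(n_bits)]
    )
    vals = rng.standard_normal((n_bits, nnz_per_row))
    return idx, vals


def encode_sparse(X, idx, vals):
    """Compute binary codes by thresholding W x at zero, touching only
    the nonzero entries of W. X has shape (n, dim)."""
    # X[:, idx] gathers the needed features: shape (n, n_bits, nnz_per_row).
    proj = np.einsum("nbk,bk->nb", X[:, idx], vals)
    return proj >= 0


def to_dense(idx, vals, dim):
    """Expand the sparse rows into an ordinary dense (n_bits, dim) matrix."""
    W = np.zeros((idx.shape[0], dim))
    np.put_along_axis(W, idx, vals, axis=1)
    return W


def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")


rng = np.random.default_rng(0)
X = rng.standard_normal((20, 64))            # 20 toy feature vectors
idx, vals = make_sparse_projection(16, 64, 4, rng)
codes = encode_sparse(X, idx, vals)          # boolean array, shape (20, 16)
ranking = hamming_rank(codes[0], codes)      # item 0 ranks itself first
```

Each code bit in this sketch costs `nnz_per_row` multiply-adds instead of `dim`, which is the general reason a sparse projection can encode faster than a dense one; the actual speed-up depends on the sparsity level and the implementation.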
Acknowledgements
We thank Yan Xia for helpful discussion. This work was funded in part by the Natural Sciences and Engineering Research Council of Canada.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Tung, F., Little, J.J. (2017). SSP: Supervised Sparse Projections for Large-Scale Retrieval in High Dimensions. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10111. Springer, Cham. https://doi.org/10.1007/978-3-319-54181-5_22
DOI: https://doi.org/10.1007/978-3-319-54181-5_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54180-8
Online ISBN: 978-3-319-54181-5
eBook Packages: Computer Science, Computer Science (R0)