Abstract
Deep convolutional activation features (DeCAF) are widely used to extract effective representations of data with deep learning models. However, because the deep models that produce DeCAF are generally pre-trained, the dimensionality of DeCAF is fixed to a constant (e.g., 4096). This raises two questions: is DeCAF good enough for image classification, and can its performance be improved further? To answer them, we propose a new model called RS-DeCAF, based on “reducing” and “stretching” the dimensionality of DeCAF. In RS-DeCAF, we reduce the dimensionality of DeCAF using dimensionality reduction methods and increase it by stretching the weight matrix between successive layers. To further improve performance, we also present a modified version of RS-DeCAF that applies fine-tuning. Extensive experiments on several image classification tasks show that RS-DeCAF not only improves on DeCAF but also outperforms previous “stretching” approaches. More importantly, the results show that RS-DeCAF generally achieves its highest classification accuracy when its dimensionality is two to four times that of DeCAF.
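The two operations named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the reduction method or the stretching rule, so PCA stands in for "dimensionality reduction methods", and `stretch_weights` is a hypothetical scheme that widens a layer by appending random columns to its weight matrix. The toy sizes (64-D input, 16-D feature) are placeholders for the real 4096-D DeCAF layer.

```python
import numpy as np


def reduce_pca(features, k):
    """Project (n_samples, d) features onto their top-k principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


def stretch_weights(weights, factor, rng):
    """Widen a (d_in, d_out) weight matrix to (d_in, factor * d_out).

    Hypothetical scheme (an assumption, not taken from the paper): keep the
    original columns and append random columns drawn at the same scale.
    """
    d_in, d_out = weights.shape
    extra = rng.normal(0.0, weights.std(), size=(d_in, (factor - 1) * d_out))
    return np.hstack([weights, extra])


def relu(x):
    return np.maximum(x, 0.0)


rng = np.random.default_rng(0)

# Toy stand-ins for a pre-trained network: 32 samples of a 64-D activation
# feeding a 64 -> 16 fully connected layer whose output plays the role of DeCAF.
hidden = rng.normal(size=(32, 64))
w = rng.normal(scale=0.1, size=(64, 16))

decaf = relu(hidden @ w)                                # fixed-size feature
reduced = reduce_pca(decaf, 8)                          # "reducing": 16 -> 8
stretched = relu(hidden @ stretch_weights(w, 2, rng))   # "stretching": 16 -> 32

print(decaf.shape, reduced.shape, stretched.shape)
```

In this sketch the first 16 columns of `stretched` reproduce `decaf` exactly, so the stretched feature extends the original one rather than replacing it. A fine-tuning step, as mentioned in the abstract, would then train the widened layer on the target task.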
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Nos. 61271405 and 61403353), the Ph.D. Program Foundation of the Ministry of Education of China (No. 20120132110018), and the Fundamental Research Funds for the Central Universities of China.
Funding
This study was funded by the National Natural Science Foundation of China (Nos. 61271405 and 61403353), the Ph.D. Program Foundation of the Ministry of Education of China (No. 20120132110018), and the Fundamental Research Funds for the Central Universities of China.
Ethics declarations
Conflict of interests
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Zhong, G., Yan, S., Huang, K. et al. Reducing and Stretching Deep Convolutional Activation Features for Accurate Image Classification. Cogn Comput 10, 179–186 (2018). https://doi.org/10.1007/s12559-017-9515-z
DOI: https://doi.org/10.1007/s12559-017-9515-z