Abstract
2D views of objects play an important role in 3D object recognition. In this paper, we focus on 3D object recognition using the 2D projective views. The discriminativeness of each view of an object is first investigated with view saliency using 2D Zernike Moments. The proposed view saliency is then used to boost a multi-view convolutional neural network for 3D object recognition. The proposed method is compared with several state-of-the-art methods on the ModelNet dataset. Experimental results have shown that the performance of our method has been significantly improved over the existing multi-view based 3D object recognition methods.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Brock, A., Lim, T., Ritchie, J.M., Weston, N.: Generative and discriminative voxel modeling with convolutional neural networks. arXiv preprint arXiv:1608.04236 (2016)
Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H.: Shapenet: an information-rich 3D model repository. arXiv preprint arXiv:1512.03012 (2015)
Chatfield, K., Simonyan, K., Vedaldi, A., Zisserman, A.: Return of the devil in the details: delving deep into convolutional nets. In: British Machine Vision Conference (2014)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: Liblinear: a library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008)
Garcia-Garcia, A., Gomez-Donoso, F., Garcia-Rodriguez, J., Orts-Escolano, S., Cazorla, M., Azorin-Lopez, J.: Pointnet: a 3D convolutional neural network for real-time object class recognition. In: International Joint Conference on Neural Networks, pp. 1578–1584. IEEE (2016)
Guo, Y., Bennamoun, M., Sohel, F., Lu, M., Wan, J.: 3D object recognition in cluttered scenes with local surface features: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 36(11), 2270–2287 (2014)
Guo, Y., Bennamoun, M., Sohel, F., Lu, M., Wan, J.: An integrated framework for 3-D modeling, object detection, and pose estimation from point-clouds. IEEE Trans. Instrum. Meas. 64(3), 683–693 (2015)
Hegde, V., Zadeh, R.: Fusionnet: 3D object classification using multiple data representations. arXiv preprint arXiv:1607.05695 (2016)
Johns, E., Leutenegger, S., Davison, A.J.: Pairwise decomposition of image sequences for active multi-view recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3813–3822 (2016)
Klokov, R., Lempitsky, V.: Escape from cells: deep kd-networks for the recognition of 3D point cloud models. arXiv preprint arXiv:1704.01222 (2017)
Maturana, D., Scherer, S.: Voxnet: a 3D convolutional neural network for real-time object recognition. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 922–928. IEEE (2015)
Phong, B.T.: Illumination for computer generated pictures. Commun. ACM 18(6), 311–317 (1975)
Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: deep learning on point sets for 3D classification and segmentation. arXiv preprint arXiv:1612.00593 (2016)
Qi, C.R., Su, H., Nießner, M., Dai, A., Yan, M., Guibas, L.J.: Volumetric and multi-view CNNs for object classification on 3D data. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 5648–5656 (2016)
Ravanbakhsh, S., Schneider, J., Poczos, B.: Deep learning with sets and point clouds. arXiv preprint arXiv:1611.04500 (2016)
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: Imagenet large scale visual recognition challenge. Int. J. Comput. Vision 115(3), 211–252 (2015)
Sedaghat, N., Zolfaghari, M., Brox, T.: Orientation-boosted voxel nets for 3D object recognition. arXiv preprint arXiv:1604.03351 (2016)
Sharma, A., Grau, O., Fritz, M.: VConv-DAE: deep volumetric shape learning without object labels. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 236–250. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_20
Shi, B., Bai, S., Zhou, Z., Bai, X.: Deeppano: deep panoramic representation for 3D shape recognition. IEEE Signal Process. Lett. 22(12), 2339–2343 (2015)
Sinha, A., Bai, J., Ramani, K.: Deep learning 3D shape surfaces using geometry images. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 223–240. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46466-4_14
Song, S., Lichtenberg, S.P., Xiao, J.: Sun RGB-D: a RGB-D scene understanding benchmark suite. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 567–576 (2015)
Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3D shape recognition. In: IEEE International Conference on Computer Vision, pp. 945–953 (2015)
Vedaldi, A., Lenc, K.: MatConvNet - convolutional neural networks for MATLAB. In: Proceeding of the ACM International Conference on Multimedia (2015)
Wang, D., Wang, B., Zhao, S., Yao, H.: View-based 3D object retrieval with discriminative views. Neurocomputing (2017)
Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., Xiao, J.: 3D shapenets: a deep representation for volumetric shapes. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1912–1920 (2015)
Xie, Z., Xu, K., Shan, W., Liu, L., Xiong, Y., Huang, H.: Projective feature learning for 3D shapes with multi-view depth images. Comput. Graph. Forum 34(7), 1–11 (2015)
Zhi, S., Liu, Y., Li, X., Guo, Y.: Lightnet: a lightweight 3D convolutional neural network for real-time 3D object recognition. In: Eurographics Workshop on 3D Object Retrieval (2017)
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Nos. 61602499, 61601488, 61471371, 61403265), the National Postdoctoral Program for Innovative Talents (No. BX201600172), the Science and Technology Plan of Sichuan Province (No. 2015SZ0226), and China Postdoctoral Science Foundation.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ma, Y., Zheng, B., Guo, Y., Lei, Y., Zhang, J. (2018). Boosting Multi-view Convolutional Neural Networks for 3D Object Recognition via View Saliency. In: Wang, Y., et al. Advances in Image and Graphics Technologies. IGTA 2017. Communications in Computer and Information Science, vol 757. Springer, Singapore. https://doi.org/10.1007/978-981-10-7389-2_20
Download citation
DOI: https://doi.org/10.1007/978-981-10-7389-2_20
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7388-5
Online ISBN: 978-981-10-7389-2
eBook Packages: Computer ScienceComputer Science (R0)