Deep Learning Advances in Computer Vision with 3D Data: A Survey

Published: 06 April 2017

    Abstract

    Deep learning has recently gained popularity, achieving state-of-the-art performance in tasks involving text, sound, or image processing. Because of this outstanding performance, there have been efforts to apply it to more challenging scenarios, such as 3D data processing. This article surveys methods that apply deep learning to 3D data and classifies them according to how they exploit the data. From the results of the examined works, we conclude that systems employing 2D views of 3D data typically surpass voxel-based (3D) deep models, which, however, can perform better with more layers and severe data augmentation. Therefore, larger-scale datasets and increased resolutions are required.

    Supplementary Material

    a20-ioannidou-apndx.pdf (ioannidou.zip)
    Supplemental movie, appendix, image, and software files for Deep Learning Advances in Computer Vision with 3D Data: A Survey

    References

    [1]
    M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, R. Jozefowicz, Y. Jia, L. Kaiser, M. Kudlur, J. Levenberg, D. Man, M. Schuster, R. Monga, S. Moore, D. Murray, C. Olah, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Vigas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. (2015). http://tensorflow.org/ Software available from tensorflow.org.
    [2]
    R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Susstrunk. 2012. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence 34, 11 (2012), 2274--2282.
    [3]
    A. Agarwal, E. Akchurin, C. Basoglu, G. Chen, S. Cyphers, J. Droppo, A. Eversole, B. Guenter, M. Hillebrand, R. Hoens, X. Huang, Z. Huang, V. Ivanov, A. Kamenev, P. Kranen, O. Kuchaiev, W. Manousek, A. May, B. Mitra, O. Nano, G. Navarro, A. Orlov, M. Padmilac, H. Parthasarathi, B. Peng, A. Reznichenko, F. Seide, M. L. Seltzer, M. Slaney, A. Stolcke, Y. Wang, H. Wang, K. Yao, D. Yu, Y. Zhang, and G. Zweig. 2014. An Introduction to Computational Networks and the Computational Network Toolkit. Technical Report MSR-TR-2014-112. Microsoft Research.
    [4]
    A. K. Aijazi, P. Checchin, and L. Trassoudaine. 2013. Segmentation based classification of 3D urban point clouds: A super-voxel based approach with evaluation. Remote Sensing 5, 4 (2013), 1624--1650.
    [5]
    A. Aldoma, F. Tombari, L. Di Stefano, and M. Vincze. 2012a. A global hypotheses verification method for 3D object recognition. In Proceedings of the 12th European Conference on Computer Vision. 511--524.
    [6]
    A. Aldoma, F. Tombari, R. B. Rusu, and M. Vincze. 2012b. Pattern Recognition: Joint 34th DAGM and 36th OAGM Symposium. Chapter OUR-CVFH -- Oriented, Unique and Repeatable Clustered Viewpoint Feature Histogram for Object Recognition and 6DOF Pose Estimation, 113--122.
    [7]
    A. Aldoma, M. Vincze, N. Blodow, D. Gossow, S. Gedikli, R. B. Rusu, and G. Bradski. 2011. CAD-model recognition and 6DOF pose estimation using 3D cues. In IEEE ICCV Workshops. 585--592.
    [8]
    L. A. Alexandre. 2012. 3D descriptors for object and category recognition: A comparative evaluation. In Workshop on Color-Depth Camera Fusion in Robotics at the IEEE/RSJ IROS.
    [9]
    L. A. Alexandre. 2014. 3D Object recognition using convolutional neural networks with transfer learning between input channels. In 13th International Conference on Intelligent Autonomous Systems, Vol. 301.
    [10]
    S. Bahrampour, N. Ramakrishnan, L. Schott, and M. Shah. 2015. Comparative study of Caffe, Neon, Theano, and Torch for deep learning. CoRR abs/1511.06435 (2015).
    [11]
    S. Bai, X. Bai, Z. Zhou, Z. Zhang, and L. Jan Latecki. 2016. GIFT: A real-time and scalable 3D shape search engine. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
    [12]
    P. Baldi and P. J. Sadowski. 2013. Understanding dropout. In Advances in Neural Information Processing Systems 26. 2814--2822.
    [13]
    F. Bastien, P. Lamblin, R. Pascanu, J. Bergstra, I. J. Goodfellow, A. Bergeron, N. Bouchard, D. Warde-Farley, and Y. Bengio. 2012. Theano: New features and speed improvements. Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop.
    [14]
    S. Bell, C. L. Zitnick, K. Bala, and R. B. Girshick. 2015. Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. CoRR abs/1512.04143 (2015).
    [15]
    J. A. Benediktsson, J. A. Palmason, and J. Sveinsson. 2005. Classification of hyperspectral data from urban areas based on extended morphological profiles. IEEE TGRS 43, 3 (2005), 480--491.
    [16]
    Y. Bengio. 2012. Neural Networks: Tricks of the Trade: Second Edition. Chapter: Practical Recommendations for Gradient-Based Training of Deep Architectures, 437--478.
    [17]
    Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. 2007. Greedy layer-wise training of deep networks. In Advances in Neural Information Processing Systems 19. 153--160.
    [18]
    J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. 2011. Algorithms for hyper-parameter optimization. In 25th Annual Conference on Neural Information Processing Systems (NIPS 2011), Vol. 24.
    [19]
    J. Bergstra and Y. Bengio. 2012. Random search for hyper-parameter optimization. The Journal of Machine Learning Research 13 (2012), 281--305.
    [20]
    J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio. 2010. Theano: A CPU and GPU math expression compiler. In Proceedings of the Python for Scientific Computing Conference (SciPy). Oral Presentation.
    [21]
    P. J. Besl and N. D. McKay. 1992. A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence 14, 2 (1992), 239--256.
    [22]
    J. M. Bioucas-Dias, A. Plaza, G. Camps-Valls, P. Scheunders, N. Nasrabadi, and J. Chanussot. 2013. Hyperspectral remote sensing data analysis and future challenges. IEEE Geoscience and Remote Sensing Magazine 1, 2 (2013), 6--36.
    [23]
    L. Bo, X. Ren, and D. Fox. 2013. Unsupervised feature learning for RGB-D based object recognition. In Experimental Robotics: The 13th International Symposium on Experimental Robotics. 387--402.
    [24]
    D. Borrmann, J. Elseberg, K. Lingemann, and A. Nüchter. 2011. The 3D Hough transform for plane detection in point clouds: A review and a new accumulator design. 3D Research 2, 2 (2011), 1--13.
    [25]
    F. Bosche, Y. Turkan, C. Haas, and R. Haas. 2010. Fusing 4D modeling and laser scanning for automated construction progress control. 26th ARCOM Annual Conference and Annual General Meeting (2010).
    [26]
    Y.-L. Boureau, J. Ponce, and Y. LeCun. 2010. A theoretical analysis of feature pooling in vision algorithms. In Proceedings of the International Conference on Machine learning (ICML’10).
    [27]
    A. Brock, Th. Lim, J. M. Ritchie, and N. Weston. 2016. Generative and discriminative voxel modeling with convolutional neural networks. CoRR abs/1608.04236 (2016).
    [28]
    M. M. Bronstein and I. Kokkinos. 2010. Scale-invariant heat kernel signatures for non-rigid shape recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR’10). 1704--1711.
    [29]
    S. Bu, P. Han, Z. Liu, J. Han, and H. Lin. 2015. Local deep feature learning framework for 3D shape. Computers 8 Graphics 46 (2015), 117--129. Shape Modeling International 2014.
    [30]
    S. Bu, Z. Liu, J. Han, J. Wu, and R. Ji. 2014. Learning high-level feature by deep belief networks for 3-D model retrieval and recognition. IEEE Transactions on Multimedia 16, 8 (2014), 2154--2167.
    [31]
    B. Bustos, D. Keim, D. Saupe, and T. Schreck. 2007. Content-based 3D object retrieval. IEEE Computer Graphics and Applications 27, 4 (2007), 22--27.
    [32]
    W. Byeon, T. M. Breuel, F. Raue, and M. Liwicki. 2015. Scene labeling with LSTM recurrent neural networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR’15). 3547--3555.
    [33]
    Z. Cai, J. Han, L. Liu, and L. Shao. 2016. RGB-D datasets using microsoft kinect or similar sensors: A survey. Multimedia Tools and Applications (2016), 1--43.
    [34]
    N. Charbonneau, J. Burgess, and L. Robichaud. 2015. Using 4D modelling in a university-museum research partnership. In 2015 Digital Heritage, Vol. 2. 603--610.
    [35]
    K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman. 2014. Return of the devil in the details: Delving deep into convolutional nets. In British Machine Vision Conference.
    [36]
    D.-Y. Chen, X. P. Tian, Y.-T. Shen, and M. Ouhyoung. 2003. On visual similarity based 3D model retrieval. Computer Graphics Forum (EUROGRAPHICS’03) 22, 3 (2003), 223--232.
    [37]
    H. Chen and B. Bhanu. 2007. 3D free-form object recognition in range images using local surface patches. Pattern Recognition Letters 28, 10 (2007), 1252--1262.
    [38]
    W. Chen, J. T. Wilson, S. Tyree, K. Q. Weinberger, and Y. Chen. 2015a. Compressing neural networks with the hashing trick. CoRR abs/1504.04788 (2015).
    [39]
    Y. Chen, H. Jiang, C. Li, X. Jia, and P. Ghamisi. 2016. Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE TGRS 54, 10 (2016), 6232--6251.
    [40]
    Y. Chen, Z. Lin, X. Zhao, G. Wang, and Y. Gu. 2014. Deep learning-based classification of hyperspectral data. IEEE J-STARS 7, 6 (2014), 2094--2107.
    [41]
    Y. Chen, X. Zhao, and X. Jia. 2015b. Spectral-spatial classification of hyperspectral data based on deep belief network. IEEE J-STARS 8, 6 (2015), 2381--2392.
    [42]
    R. Collobert, K. Kavukcuoglu, and C. Farabet. 2011. Torch7: A matlab-like environment for machine learning. In BigLearn, NIPS Workshop.
    [43]
    R. Collobert, K. Kavukcuoglu, and C. Farabet. 2012. Neural Networks: Tricks of the Trade: Second Edition. Chapter: Implementing Neural Networks Efficiently, 537--557.
    [44]
    C. Cortes and V. Vapnik. 1995. Support-vector networks. Machine Learning 20, 3 (1995), 273--297.
    [45]
    C. Couprie, C. Farabet, L. Najman, and Y. Lecun. 2013. Indoor semantic segmentation using depth information. CoRR abs/1301.3572 (2013).
    [46]
    P. Daras and A. Axenopoulos. 2010. A 3D shape retrieval framework supporting multimodal queries. International Journal of Computer Vision 89, 2 (2010), 229--247.
    [47]
    J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. 2009. ImageNet: A large-scale hierarchical image database. In IEEE Computer Vision and Pattern Recognition (CVPR’09).
    [48]
    L. Deng. 2014. A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Transactions on Signal and Information Processing 3 (2014), e5.
    [49]
    M. Denil, B. Shakibi, L. Dinh, M. A. Ranzato, and N. de Freitas. 2013. Predicting parameters in deep learning. CoRR abs/1306.0543 (2013).
    [50]
    E. Denton, E. Zaremba, J. Bruna, Y. LeCun, and R. Fergus. 2014. Exploiting linear structure within convolutional networks for efficient evaluation. CoRR abs/1404.0736 (2014).
    [51]
    B. Douillard, J. Underwood, N. Kuntz, V. Vlaskine, A. Quadros, P. Morton, and A. Frenkel. 2011. On the segmentation of 3D LIDAR point clouds. In IEEE ICRA. 2798--2805.
    [52]
    A. Doulamis, M. Ioannides, N. Doulamis, A. Hadjiprocopis, D. Fritsch, O. Balet, M. Julien, E. Protopapadakis, and others. 2013. 4D reconstruction of the past. Proceedings of SPIE 8795 (2013), 87950J-1--87950J-11.
    [53]
    A. Doulamis, S. Soile, N. Doulamis, C. Chrisouli, N. Grammalidis, K. Dimitropoulos, C. Manesis, C. Potsiou, and C. Ioannidis. 2015. Selective 4D modelling framework for spatial-temporal land information management system. Proceedings of SPIE 9535, 3rd RSCy (2015).
    [54]
    N. Doulamis and A. Doulamis. 2012. Fast and adaptive deep fusion learning for detecting visual objects. In Proceedings of ECCV 2012. Workshops and Demonstrations. 345--354.
    [55]
    A. Eitel, J. T. Springenberg, L. Spinello, M. Riedmiller, and W. Burgard. 2015. Multimodal deep learning for robust RGB-D object recognition. In IEEE/RSJ International Conference on IROS.
    [56]
    Y. Fang, J. Xie, G. Dai, M. Wang, F. Zhu, T. Xu, and E. Wong. 2015. 3D deep shape descriptor. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR’15). 2319--2328.
    [57]
    M. Fauvel, J. Chanussot, and J. A. Benediktsson. 2012. A spatial--spectral kernel-based approach for the classification of remote-sensing images. Pattern Recognition 45, 1 (2012), 381--392.
    [58]
    J. Feng, Y. Wang, and S.-F. Chang. 2016. 3D shape retrieval using single depth image from low-cost sensors. In IEEE Winter Conference on Applications of Computer Vision (WACV’16).
    [59]
    S. Filipe and L. A. Alexandre. 2014. A comparative evaluation of 3D keypoint detectors in a RGB-D object dataset. In 9th International Conference on Computer Vision Theory and Applications. 476--483.
    [60]
    S. Filipe, L. Itti, and L. A. Alexandre. 2015. BIK-BUS: Biologically motivated 3D keypoint based on bottom-up saliency. IEEE Transactions on Image Processing 24, 1 (2015), 163--175.
    [61]
    A. Frome, D. Huber, R. Kolluri, T. Bulow, and J. Malik. 2004. Recognizing objects in range data using regional point descriptors. In ECCV 2004. Lecture Notes in Computer Science, Vol. 3023. 224--237.
    [62]
    Y. Gao and Q. Dai. 2014. View-based 3D object retrieval: Challenges and approaches. IEEE MultiMedia 21, 3 (2014), 52--57.
    [63]
    Y. Gao, M. Wang, D. Tao, R. Ji, and Q. Dai. 2012. 3-D object retrieval and recognition with hypergraph analysis. IEEE Transactions on Image Processing 21, 9 (2012), 4290--4303.
    [64]
    Y. Gao, M. Wang, Z. J. Zha, Q. Tian, Q. Dai, and N. Zhang. 2011. Less is more: Efficient 3-D object retrieval with query view selection. IEEE Transactions on Multimedia 13, 5 (2011), 1007--1018.
    [65]
    D. Giorgi, S. Biasotti, and L. Paraboschi. 2007. Shape retrieval contest 2007: Watertight models track. SHREC Competition 8 (2007).
    [66]
    A. Godil, H. Dutagaci, C. Akgul, A. Axenopoulos, B. Bustos, M. Chaouch, P. Daras, and others. 2009. SHREC’09 track: Generic shape retrieval. In Proceedings of the Eurographics Workshop on 3D Object Retrieval. 61--68.
    [67]
    Y. Gong, L. Liu, M. Yang, and L. D. Bourdev. 2014. Compressing deep convolutional networks using vector quantization. CoRR abs/1412.6115 (2014).
    [68]
    I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio. 2013. Maxout networks. In Proceedings of the 30th International Conference on Machine Learning (ICML’13). 1319--1327.
    [69]
    A. Graves, A. Mohamed, and G. E. Hinton. 2013. Speech recognition with deep recurrent neural networks. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’13). 6645--6649.
    [70]
    K. Gregor, I. Danihelka, A. Graves, and D. Wierstra. 2015. DRAW: A recurrent neural network for image generation. CoRR abs/1502.04623 (2015).
    [71]
    Y. Guo, M. Bennamoun, F. Sohel, M. Lu, and J. Wan. 2014. 3D object recognition in cluttered scenes with local surface features: A survey. IEEE TPAMI 36, 11 (2014), 2270--2287.
    [72]
    Y. Guo, M. Bennamoun, F. Sohel, M. Lu, J. Wan, and N. Kwok. 2016a. A comprehensive performance evaluation of 3D local feature descriptors. IJCV 116, 1 (2016), 66--89.
    [73]
    Y. Guo, Y. Liu, A. Oerlemans, S. Lao, S. Wu, and M. S. Lew. 2016b. Deep learning for visual understanding: A review. Neurocomputing 187 (2016), 27--48. Recent Developments on Deep Big Vision.
    [74]
    Y. Guo, F. A. Sohel, M. Bennamoun, M. Lu, and J. Wan. 2013. Rotational projection statistics for 3D local surface description and object recognition. CoRR abs/1304.3192 (2013).
    [75]
    Y. Guo, F. A. Sohel, M. Bennamoun, J. Wan, and M. Lu. 2015. A novel local surface feature for 3D object recognition under clutter and occlusion. Information Sciences 293 (2015), 196--213.
    [76]
    Y. Guo, J. Zhang, M. Lu, J. Wan, and Y. Ma. 2014. Benchmark datasets for 3D computer vision. In 9th IEEE Conference on Industrial Electronics and Applications (ICIEA’14). 1846--1851.
    [77]
    S. Gupta, R. Girshick, P. Arbelaez, and J. Malik. 2014. Learning rich features from RGB-D images for object detection and segmentation. In Proceedings of the 13th European Conference on Computer Vision.
    [78]
    Z. Han, Z. Liu, J. Han, C. M. Vong, S. Bu, and C. L. P. Chen. 2016. Mesh convolutional restricted Boltzmann machines for unsupervised learning of features with structure preservation on 3-D meshes. IEEE Transactions on Neural Networks and Learning Systems PP, 99 (2016), 1--14.
    [79]
    K. He, X. Zhang, S. Ren, and J. Sun. 2014. Spatial pyramid pooling in deep convolutional networks for visual recognition. CoRR abs/1406.4729 (2014).
    [80]
    K. He, X. Zhang, S. Ren, and J. Sun. 2015a. Deep residual learning for image recognition. CoRR abs/1512.03385 (2015).
    [81]
    K. He, X. Zhang, S. Ren, and J. Sun. 2015b. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. CoRR abs/1502.01852 (2015).
    [82]
    K. He, X. Zhang, S. Ren, and J. Sun. 2016. Identity mappings in deep residual networks. CoRR abs/1603.05027 (2016).
    [83]
    V. Hegde and R. Zadeh. 2016. FusionNet: 3D object classification using multiple data representations. CoRR abs/1607.05695 (2016).
    [84]
    M. Hilaga, Y. Shinagawa, T. Kohmura, and T. L. Kunii. 2001. Topology matching for fully automatic similarity estimation of 3D shapes. In Proceedings of the 28th SIGGRAPH. 203--212.
    [85]
    G. E. Hinton. 2002. Training products of experts by minimizing contrastive divergence. Neural Computation 14, 8 (2002), 1771--1800.
    [86]
    G. E. Hinton, P. Dayan, B. Frey, and R. M. Neal. 1995. The wake-sleep algorithm for self-organizing neural networks. Science 268, 5124 (1995), 1158--1161.
    [87]
    G. E. Hinton, S. Osindero, and Y.-W. Teh. 2006. A fast learning algorithm for deep belief nets. Neural Computation 18, 7 (2006), 1527--1554.
    [88]
    G. E. Hinton and R. R. Salakhutdinov. 2006. Reducing the dimensionality of data with neural networks. Science 313, 5786 (2006), 504--507.
    [89]
    G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. 2012. Improving neural networks by preventing co-adaptation of feature detectors. CoRR abs/1207.0580 (2012).
    [90]
    G. E. Hinton, O. Vinyals, and J. Dean. 2015. Distilling the knowledge in a neural network. CoRR abs/1503.02531 (2015).
    [91]
    S. Hochreiter. 1991. Untersuchungen Zu Dynamischen Neuronalen Netzen. Diploma thesis. Technical University Munich, Institute of Computer Science.
    [92]
    S. Hochreiter. 1998. The vanishing gradient problem during learning recurrent neural nets and problem solutions. IJUFKS 6, 2 (1998), 107--116.
    [93]
    S. Hochreiter and J. Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735--1780.
    [94]
    W. Hu, Y. Huang, L. Wei, F. Zhang, and H. Li. 2015. Deep convolutional neural networks for hyperspectral image classification. Journal of Sensors 2015, Article 258619 (2015).
    [95]
    G. Huang, Y. Sun, Z. Liu, D. Sedra, and K. Q. Weinberger. 2016. Deep networks with stochastic depth. CoRR abs/1603.09382 (2016).
    [96]
    G. B. Huang, H. Zhou, X. Ding, and R. Zhang. 2012. Extreme learning machine for regression and multiclass classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 42, 2 (2012), 513--529.
    [97]
    G. B. Huang, Q.-Y. Zhu, and C.-K. Siew. 2006. Extreme learning machine: Theory and applications. Neurocomputing 70, 13 (2006), 489--501.
    [98]
    M. Ioannides, A. Hadjiprocopis, N. Doulamis, A. Doulamis, E. Protopapadakis, K. Makantasis, and others. 2013. Online 4D reconstruction using multi-images available under open access. ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences 1 (2013), 169--174.
    [99]
    S. Ioffe and C. Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. CoRR abs/1502.03167 (2015).
    [100]
    M. Jaderberg, A. Vedaldi, and A. Zisserman. 2014. Speeding up convolutional neural networks with low rank expansions. CoRR abs/1405.3866 (2014).
    [101]
    K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun. 2009. What is the best multi-stage architecture for object recognition? In 12th IEEE International Conference on Computer Vision. 2146--2153.
    [102]
    S. Jayanti, Y. Kalyanaraman, N. Iyer, and K. Ramani. 2006. Developing an engineering shape benchmark for CAD models. Computer-Aided Design 38, 9 (2006), 939--953.
    [103]
    H. Jegou, M. Douze, and C. Schmid. 2011. Product quantization for nearest neighbor search. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 1 (2011), 117--128.
    [104]
    Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. 2014. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093 (2014).
    [105]
    E. Johns, S. Leutenegger, and A. J. Davison. 2016. Pairwise decomposition of image sequences for active multi-view recognition. In Proceedings of the IEEE Conference on CVPR. 3183--3822.
    [106]
    A. E. Johnson and M. Hebert. 1999. Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence 21, 5 (1999), 433--449.
    [107]
    N. Kalchbrenner, E. Grefenstette, and P. Blunsom. 2014. A convolutional neural network for modelling sentences. CoRR abs/1404.2188 (2014).
    [108]
    L. L. C. Kasun, H. Zhou, G.-B. Huang, and C. M. Vong. 2013. Representational learning with extreme learning machine for big data. IEEE Intelligent Systems 28, 6 (2013), 31--34.
    [109]
    M. Kazhdan, Th. Funkhouser, and S. Rusinkiewicz. 2003. Rotation invariant spherical harmonic representation of 3D shape descriptors. In Symposium on Geometry Processing.
    [110]
    J. M. Khatib, N. Chileshe, and S. Sloan. 2007. Antecedents and benefits of 3D and 4D modelling for construction planners. Journal of Engineering, Design and Technology 5, 2 (2007), 159--172.
    [111]
    A. Krizhevsky, I. Sutskever, and G. E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25. 1097--1105.
    [112]
    A. Krogh and J. A. Hertz. 1992. A simple weight decay can improve generalization. In Advances in Neural Information Processing Systems, Vol. 4. 950--957.
    [113]
    G. Kyriakaki, A. Doulamis, N. Doulamis, M. Ioannides, K. Makantasis, E. Protopapadakis, A. Hadjiprocopis, K. Wenzel, and others. 2014. 4D reconstruction of tangible cultural heritage objects from web-retrieved images. International Journal of Heritage in the Digital Era 3, 2 (2014), 431--451.
    [114]
    L. Ladicky, C. Russell, P. Kohli, and P. H. S. Torr. 2009. Associative hierarchical CRFs for object class image segmentation. Proceedings of the IEEE 12th International Conference on Computer Vision (2009).
    [115]
    K. Lai, L. Bo, X. Ren, and D. Fox. 2011. A large-scale hierarchical multi-view RGB-D object dataset. In IEEE International Conference on on Robotics and Automation.
    [116]
    G. Lavoué. 2012. Combination of bag-of-words descriptors for robust partial shape retrieval. The Visual Computer 28, 9 (2012), 931--942.
    [117]
    V. Lebedev, Y. Ganin, M. Rakhuba, I. V. Oseledets, and V. S. Lempitsky. 2014. Speeding-up convolutional neural networks using fine-tuned CP-decomposition. CoRR abs/1412.6553 (2014).
    [118]
    Y. LeCun, Y. Bengio, and G. E. Hinton. 2015. Deep learning. Nature 521 (2015), 436--444.
    [119]
    Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. 1998. Gradient-based learning applied to document recognition. Proceedings of IEEE 86, 11 (1998), 2278--2324.
    [120]
    Y. LeCun, K. Kavukcuoglu, and C. Farabet. 2010. Convolutional networks and applications in vision. In Proceedings of the 2010 IEEE International Symposium on Circuits and Systems (ISCAS’10). 253--256.
    [121]
    H. Lee, E. Chaitanya, and A. Y. Ng. 2008. Sparse deep belief net model for visual area V2. In Advances in Neural Information Processing Systems 20. 873--880.
    [122]
    B. Leng, S. Guo, X. Zhang, and Z. Xiong. 2015. 3D object retrieval with stacked local convolutional autoencoder. Signal Processing 112, C (2015), 119--128.
    [123]
    B. Leng, Y. Liu, K. Yu, X. Zhang, and Z. Xiong. 2016. 3D object understanding with 3D convolutional neural networks. Information Sciences 336, C (Oct. 2016), 188--201.
    [124]
    B. Leng, X. Zhang, M. Yao, and Z. Xiong. 2014. MultiMedia Modeling: 20th Anniversary International Conference, Part II. Chapter: 3D Object Classification Using Deep Belief Networks, 128--139.
    [125]
    B. Li, Y. Lu, A. Godil, T. Schreck, B. Bustos, A. Ferreira, and others. 2014a. A comparison of methods for sketch-based 3D shape retrieval. Computer Vision and Image Understanding 119 (2014), 57--80.
    [126]
    B. Li, Y. Lu, C. Li, and others. 2015. A comparison of 3D shape retrieval methods based on a large-scale benchmark supporting multimodal queries. Computer Vision and Image Understanding 131 (2015).
    [127]
    B. Li, E. Zhou, B. Huang, J. Duan, Y. Wang, N. Xu, J. Zhang, and H. Yang. 2014b. Large scale recurrent neural network on GPU. In International Joint Conference on Neural Networks (IJCNN’14). 4062--4069.
    [128]
    Z. Lian, A. Godil, B. Bustos, M. Daoudi, and others. 2011. SHREC’11 track: Shape retrieval on non-rigid 3D watertight meshes. In Proceedings of the 4th Eurographics Conference on 3D Object Retrieval. 79--88.
    [129]
    M. Lin, Q. Chen, and S. Yan. 2013. Network in network. CoRR abs/1312.4400 (2013).
    [130]
    Q. Liu. 2012. A survey of recent view-based 3D model retrieval methods. CoRR abs/1208.3670 (2012).
    [131]
    Z. Liu, S. Chen, S. Bu, and K. Li. 2014. High-level semantic feature for 3D shape based on deep belief networks. In IEEE International Conference on Multimedia and Expo (ICME’14). 1--6.
    [132]
    D. G. Lowe. 1999. Object recognition from local scale-invariant features. In Proceedings of the International Conference on Computer Vision (ICCV’99), Vol. 2. 1150--1157.
    [133]
    A. Maas, A. Hannun, and A. Ng. 2013. Rectifier nonlinearities improve neural network acoustic models. In ICML Workshop on Deep Learning for Audio, Speech, and Language Processing.
    [134]
    A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts. 2011. Learning word vectors for sentiment analysis. In Proceedings of the 49th Annual Meeting of the ACL. 142--150.
    [135]
    A. Mademlis, P. Daras, D. Tzovaras, and M. G. Strintzis. 2009. 3D object retrieval using the 3D shape impact descriptor. Pattern Recognition 42, 11 (2009), 2447--2459.
    [136]
    K. Makantasis, A. Doulamis, N. Doulamis, and M. Ioannides. 2016. In the wild image retrieval and clustering for 3D cultural heritage landmarks reconstruction. MTAP 75, 7 (2016), 3593--3629.
    [137]
    K. Makantasis, K. Karantzalos, A. Doulamis, and N. Doulamis. 2015. Deep supervised learning for hyperspectral data classification through convolutional neural networks. In IEEE IGARSS. 4959--4962.
    [138]
    J. Martens and I. Sutskever. 2011. Learning recurrent neural networks with Hessian-free optimization. In Proceedings of the 28th International Conference on Machine Learning (ICML’11). 1033--1040.
    [139]
    H. P. Martínez and G. N. Yannakakis. 2014. Deep multimodal fusion: Combining discrete events and continuous signals. In Proceedings of the 16th International Conference on Multimodal Interaction. 34--41.
    [140]
    M. Mathieu, M. Henaff, and Y. LeCun. 2013. Fast training of convolutional networks through FFTs. CoRR abs/1312.5851 (2013).
    [141]
    D. Maturana and S. Scherer. 2015. VoxNet: A 3D convolutional neural network for real-time object recognition. In IEEE/RSJ International Conference on Intelligent Robots and Systems. 922--928.
    [142]
    W. McCulloch and W. Pitts. 1943. A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics 5, 4 (1943), 115--133.
    [143]
    A. Merentitis and C. Debes. 2015. Automatic fusion and classification using random forests and features extracted with deep learning. In International Geoscience and Remote Sensing Symposium. 2943--2946.
    [144]
    A. Mian, M. Bennamoun, and R. Owens. 2010. On the repeatability and quality of keypoints for local feature-based 3D object retrieval from cluttered scenes. IJCV 89, 2--3 (2010), 348--361.
    [145]
    K. Mikolajczyk and C. Schmid. 2005. A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 10 (2005), 1615--1630.
    [146]
    K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool. 2005. A comparison of affine region detectors. IJCV 65, 1 (2005), 43--72.
    [147]
    M. Muja and D. G. Lowe. 2009. Fast approximate nearest neighbors with automatic algorithm configuration. In International Conference on Computer Vision Theory and Application (VISSAPP’09). 331--340.
    [148]
    V. Nair and G. E. Hinton. 2010. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML’10). 807--814.
    [149]
    A. Nguyen and B. Le. 2013. 3D point cloud segmentation: A survey. In Proceedings of the 6th IEEE Conference on Robotics, Automation and Mechatronics (RAM). 225--230.
    [150]
    M. Niepert, M. Ahmed, and K. Kutzkov. 2016. Learning convolutional neural networks for graphs. CoRR abs/1605.05273 (2016).
    [151]
    W. Ouyang, P. Luo, X. Zeng, S. Qiu, Y. Tian, H. Li, S. Yang, and others. 2014. DeepID-Net: Multi-stage and deformable deep convolutional neural networks for object detection. CoRR abs/1409.3505 (2014).
    [152]
    J. Papon, A. Abramov, M. Schoeler, and F. Worgotter. 2013. Voxel cloud connectivity segmentation—Supervoxels for point clouds. In IEEE Conference on Computer Vision and Pattern Recognition. 2027--2034.
    [153]
    R. Pascanu, C. Gülçehre, K. Cho, and Y. Bengio. 2013a. How to construct deep recurrent neural networks. CoRR abs/1312.6026 (2013).
    [154]
    R. Pascanu, T. Mikolov, and Y. Bengio. 2013b. On the difficulty of training recurrent neural networks. In Proceedings of the 30th International Conference on Machine Learning (ICML’13). 1310--1318.
    [155]
    C. R. Qi, H. Su, M. Niessner, A. Dai, M. Yan, and L. J. Guibas. 2016. Volumetric and multi-view CNNs for object classification on 3D data. arXiv preprint arXiv:1604.03265v2 (2016).
    [156]
    T. Rabbani, F. Van Den Heuvel, and G. Vosselmann. 2006. Segmentation of point clouds using smoothness constraint. ISPRS Archives 36, 5 (2006), 248--253.
    [157]
    M. Ranzato, Y. Boureau, and Y. LeCun. 2008. Sparse feature learning for deep belief networks. In Advances in Neural Information Processing Systems 20. 1185--1192.
    [158]
    S. Ren, K. He, R. Girshick, and J. Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems. 91--99.
    [159]
    S. Rifai, P. Vincent, X. Muller, X. Glorot, and Y. Bengio. 2011. Contracting auto-encoders: Explicit invariance during feature extraction. In Proceedings of the 28th ICML. 833--840.
    [160]
    A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio. 2014. FitNets: Hints for thin deep nets. CoRR abs/1412.6550 (2014).
    [161]
    F. Rosenblatt. 1958. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review 65, 6 (1958), 386--408.
    [162]
    D. E. Rumelhart, G. E. Hinton, and R. J. Williams. 1986. Learning representations by back-propagating errors. Nature 323 (1986), 533--536.
    [163]
    R. B. Rusu, N. Blodow, Z. C. Marton, and M. Beetz. 2008. Aligning point cloud views using persistent feature histograms. In IEEE/RSJ International Conference on Intelligent Robots and Systems. 3384--3391.
    [164]
    R. B. Rusu, G. Bradski, R. Thibaux, and J. Hsu. 2010. Fast 3D recognition and pose using the viewpoint feature histogram. In IEEE/RSJ International Conference on IROS. 2155--2162.
    [165]
    R. B. Rusu and S. Cousins. 2011. 3D is here: Point Cloud Library (PCL). In IEEE International Conference on Robotics and Automation (ICRA’11). 1--4.
    [166]
    S. Salti, A. Petrelli, F. Tombari, and L. Di Stefano. 2012. On the affinity between 3D detectors and descriptors. In Proceedings of the 2nd International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission. 424--431.
    [167]
    J. Sanchez-Riera, K.-L. Hua, Y.-S. Hsiao, T. Lim, S. C. Hidayati, and W.-H. Cheng. 2016. A comparative study of data fusion for RGB-D based visual recognition. Pattern Recognition Letters 73 (2016), 1--6.
    [168]
    D. Scherer, A. Muller, and S. Behnke. 2010. Evaluation of pooling operations in convolutional architectures for object recognition. In 20th ICANN. Vol. 6354. 92--101.
    [169]
    G. Schindler and F. Dellaert. 2012. 4D cities: Analyzing, visualizing, and interacting with historical urban photo collections. Journal of Multimedia (2012).
    [170]
    C. Schmid, R. Mohr, and C. Bauckhage. 2000. Evaluation of interest point detectors. International Journal of Computer Vision 37, 2 (2000), 151--172.
    [171]
    J. Schmidhuber. 1992. Learning complex, extended sequences using the principle of history compression. Neural Computation 4, 2 (1992), 234--242.
    [172]
    J. Schmidhuber. 2015. Deep learning in neural networks: An overview. Neural Networks 61 (2015), 85--117.
    [173]
    R. Schnabel, R. Wahl, and R. Klein. 2007. Efficient RANSAC for point-cloud shape detection. Computer Graphics Forum 26, 2 (2007), 214--226.
    [174]
    M. Schwarz, H. Schulz, and S. Behnke. 2015. RGB-D object recognition and pose estimation based on pre-trained convolutional neural network features. In IEEE ICRA. 1329--1335.
    [175]
    N. Sedaghat, M. Zolfaghari, and Th. Brox. 2016. Orientation-boosted voxel nets for 3D object recognition. CoRR abs/1604.03351 (2016).
    [176]
    P. Shilane, P. Min, M. Kazhdan, and T. Funkhouser. 2004. The Princeton shape benchmark. In Shape Modeling International.
    [177]
    K. Siddiqi, J. Zhang, D. Macrini, A. Shokoufandeh, S. Bouix, and S. Dickinson. 2008. Retrieving articulated 3-D models using medial surfaces. Machine Vision and Applications 19, 4 (2008), 261--275.
    [178]
    N. Silberman, D. Hoiem, P. Kohli, and R. Fergus. 2012. Indoor segmentation and support inference from RGBD images. In ECCV.
    [179]
    M.-C. Sima and A. Nuchter. 2013. An extension of the Felzenszwalb-Huttenlocher segmentation to 3D point clouds. 5th ICMV: Computer Vision, Image Analysis and Processing 8783 (2013).
    [180]
    D. Smeets, Th. Fabry, J. Hermans, D. Vandermeulen, and P. Suetens. 2009. Isometric deformation modelling for object recognition. In Proceedings of the 13th International Conference on Computer Analysis of Images and Patterns. 757--765.
    [181]
    R. Socher, B. Huval, B. Bhat, C. D. Manning, and A. Y. Ng. 2012. Convolutional-recursive deep learning for 3D object classification. In Advances in Neural Information Processing Systems 25. 656--664.
    [182]
    S. Song and J. Xiao. 2014. Sliding shapes for 3D object detection in depth images. In Proceedings of the 13th European Conference on Computer Vision (ECCV’14). 634--651.
    [183]
    S. Song and J. Xiao. 2016. Deep sliding shapes for amodal 3D object detection in RGB-D images. In Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition.
    [184]
    H. Su, S. Maji, E. Kalogerakis, and E. Learned-Miller. 2015. Multi-view convolutional neural networks for 3D shape recognition. In Proceedings of the International Conference on Computer Vision (ICCV’15).
    [185]
    I. Sutskever. 2012. Training Recurrent Neural Networks. Ph.D. dissertation. University of Toronto.
    [186]
    I. Sutskever, J. Martens, and G. E. Hinton. 2011. Generating text with recurrent neural networks. In Proceedings of the 28th International Conference on Machine Learning (ICML’11). 1017--1024.
    [187]
    C. Szegedy, S. Ioffe, and V. Vanhoucke. 2016. Inception-v4, inception-ResNet and the impact of residual connections on learning. CoRR abs/1602.07261 (2016).
    [188]
    C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. 2014. Going deeper with convolutions. CoRR abs/1409.4842 (2014).
    [189]
    H. Tabia, H. Laga, D. Picard, and P.-H. Gosselin. 2014. Covariance descriptors for 3D shape matching and retrieval. In IEEE Conference on Computer Vision and Pattern Recognition. 4185--4192.
    [190]
    J. Tang, S. Miller, A. Singh, and P. Abbeel. 2012. A textured object recognition pipeline for color and depth image data. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA’12).
    [191]
    J. W. H. Tangelder and R. C. Veltkamp. 2007. A survey of content based 3D shape retrieval methods. Multimedia Tools and Applications 39, 3 (2007), 441.
    [192]
    L. Theis and M. Bethge. 2015. Generative image modeling using spatial LSTMs. In Advances in Neural Information Processing Systems 28.
    [193]
    F. Tombari and L. Di Stefano. 2012. Hough voting for 3D object recognition under occlusion and clutter. IPSJ Transactions on Computer Vision and Applications 4 (2012), 20--29.
    [194]
    F. Tombari, S. Salti, and L. Di Stefano. 2010. Unique signatures of histograms for local surface description. In Proceedings of the 11th European Conference on Computer Vision: Part III (ECCV’10). 356--369.
    [195]
    F. Tombari, S. Salti, and L. Di Stefano. 2013. Performance evaluation of 3D keypoint detectors. International Journal of Computer Vision 102, 1--3 (2013), 198--220.
    [196]
    J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, and A. W. M. Smeulders. 2013. Selective search for object recognition. International Journal of Computer Vision (2013).
    [197]
    J. P. C. Valentin, S. Sengupta, J. Warrell, A. Shahrokni, and P. H. S. Torr. 2013. Mesh based semantic modelling for indoor and outdoor scenes. In IEEE CVPR. 2067--2074.
    [198]
    V. Vanhoucke, A. Senior, and M. Z. Mao. 2011. Improving the speed of neural networks on CPUs. In Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011.
    [199]
    P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol. 2010. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research 11 (2010), 3371--3408.
    [200]
    F. Visin, K. Kastner, K. Cho, M. Matteucci, A. C. Courville, and Y. Bengio. 2015. ReNet: A recurrent neural network based alternative to convolutional networks. CoRR abs/1505.00393 (2015).
    [201]
    A.-V. Vo, L. Truong-Hong, D. F. Laefer, and M. Bertolotto. 2015. Octree-based region growing for point cloud segmentation. ISPRS Journal of Photogrammetry and Remote Sensing 104 (2015), 88--100.
    [202]
    L. Wan, M. Zeiler, S. Zhang, Y. LeCun, and R. Fergus. 2013. Regularization of neural networks using dropconnect. In Proceedings of the 30th ICML, Vol. 28. 1058--1066.
    [203]
    F. Wang, L. Kang, and Y. Li. 2015. Sketch-based 3D shape retrieval using convolutional neural networks. In IEEE Conference on Computer Vision and Pattern Recognition.
    [204]
    W. Wang, L. Chen, Z. Liu, K. Kühnlenz, and D. Burschka. 2013. Textured/textureless object recognition and pose estimation using RGB-D image. Journal of Real-Time Image Processing 10, 4 (2013), 667--682.
    [205]
    Y. Wang, Z. Xie, K. Xu, Y. Dou, and Y. Lei. 2016. An efficient and effective convolutional auto-encoder extreme learning machine network for 3D feature learning. Neurocomputing 174 (2016), 988--998.
    [206]
    D. Weikersdorfer, D. Gossow, and M. Beetz. 2012. Depth-adaptive superpixels. In 21st International Conference on Pattern Recognition (ICPR’12). 2087--2090.
    [207]
    P. J. Werbos. 1990. Backpropagation through time: What it does and how to do it. Proceedings of the IEEE 78, 10 (1990), 1550--1560.
    [208]
    W. Wohlkinger and M. Vincze. 2011. Ensemble of shape functions for 3D object classification. In IEEE International Conference on Robotics and Biomimetics (ROBIO’11). 2987--2992.
    [209]
    H. Wu and X. Gu. 2015. Towards dropout training for convolutional neural networks. Neural Networks 71 (2015), 1--10.
    [210]
    Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao. 2015. 3D shapenets: A deep representation for volumetric shapes. In IEEE Conference on Computer Vision and Pattern Recognition. 1912--1920.
    [211]
    J. Xie, Y. Fang, F. Zhu, and E. Wong. 2015a. DeepShape: Deep learned shape descriptor for 3D shape matching and retrieval. In Proceedings of the IEEE Conference on CVPR. 1275--1283.
    [212]
    Z. Xie, K. Xu, W. Shan, L. Liu, Y. Xiong, and H. Huang. 2015b. Projective feature learning for 3D shapes with multi-view depth images. Computer Graphics Forum (Proceedings of Pacific Graphics 2015) 34, 6 (2015).
    [213]
    B. Xu, N. Wang, T. Chen, and M. Li. 2015b. Empirical evaluation of rectified activations in convolutional network. CoRR abs/1505.00853 (2015).
    [214]
    Q. Xu, S. Jiang, W. Huang, F. Ye, and S. Xu. 2015a. Feature fusion based image retrieval using deep learning. Journal of Information and Computational Science 12, 6 (2015), 2361--2373.
    [215]
    Z. Yan, H. Zhang, Y. Jia, Th. Breuel, and Y. Yu. 2016. Combining the best of convolutional layers and recurrent layers: A hybrid network for semantic segmentation. CoRR abs/1603.04871 (2016).
    [216]
    J. Yue, S. Mao, and M. Li. 2016. A deep learning framework for hyperspectral image classification using spatial pyramid pooling. Remote Sensing Letters 7, 9 (2016), 875--884.
    [217]
    A. Zaharescu, E. Boyer, K. Varanasi, and R. Horaud. 2009. Surface feature detection and description with applications to mesh matching. In IEEE Conference on CVPR. 373--380.
    [218]
    H. F. M. Zaki, F. Shafait, and A. Mian. 2016. Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition. In IEEE ICRA. 1685--1692.
    [219]
    D. Zarpalas, P. Daras, A. Axenopoulos, D. Tzovaras, and M. G. Strintzis. 2006. 3D model search and retrieval using the spherical trace transform. EURASIP Journal on Advances in Signal Processing 2007 (2006).
    [220]
    M. D. Zeiler and R. Fergus. 2013. Stochastic pooling for regularization of deep convolutional neural networks. CoRR abs/1301.3557 (2013).
    [221]
    A. Zelener. 2015. Survey of object classification in 3D range scans. (2015).
    [222]
    L. Zhang, L. Zhang, and B. Du. 2016a. Deep learning for remote sensing data: A technical tutorial on the state of the art. IEEE Geoscience and Remote Sensing Magazine 4, 2 (2016), 22--40.
    [223]
    X. Zhang, H. Zhang, Y. Zhang, Y. Yang, M. Wang, H. Luan, J. Li, and T. S. Chua. 2016b. Deep fusion of multiple semantic cues for complex event recognition. IEEE TIP 25, 3 (2016), 1033--1046.
    [224]
    X. Zhang, J. Zou, X. Ming, K. He, and J. Sun. 2014. Efficient and accurate approximations of nonlinear convolutional networks. CoRR abs/1411.4229 (2014).
    [225]
    W. Zhao and S. Du. 2016. Learning multiscale and deep representations for classifying remotely sensed imagery. ISPRS Journal of Photogrammetry and Remote Sensing 113 (2016), 155--165.
    [226]
    Y. Zhong. 2009. Intrinsic shape signatures: A shape descriptor for 3D object recognition. In 12th IEEE International Conference on Computer Vision Workshops (ICCV Workshops). 689--696.
    [227]
    Y. Zhou and Y. Wei. 2016. Learning hierarchical spectral-spatial features for hyperspectral image classification. IEEE Transactions on Cybernetics 46, 7 (2016), 1667--1678.
    [228]
    Z. Zhu, X. Wang, S. Bai, C. Yao, and X. Bai. 2014. Deep learning representation using autoencoder for 3D shape retrieval. CoRR abs/1409.7164 (2014).

    Information & Contributors

    Information

    Published In

    ACM Computing Surveys  Volume 50, Issue 2
    March 2018
    567 pages
    ISSN:0360-0300
    EISSN:1557-7341
    DOI:10.1145/3071073
    • Editor:
    • Sartaj Sahni
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 06 April 2017
    Accepted: 01 January 2017
    Revised: 01 December 2016
    Received: 01 June 2016
    Published in CSUR Volume 50, Issue 2

    Author Tags

    1. 3D data
    2. 3D object recognition
    3. 3D object retrieval
    4. 3D segmentation
    5. convolutional neural networks
    6. deep learning

    Qualifiers

    • Survey
    • Research
    • Refereed

    Funding Sources

    • EU Horizon 2020 Programme: DigiArt

    Cited By

    • (2024) BAMFORESTS: Bamberg Benchmark Forest Dataset of Individual Tree Crowns in Very-High-Resolution UAV Images. Remote Sensing 16:11 (1935). DOI: 10.3390/rs16111935. Online publication date: 28-May-2024
    • (2024) Scaling Deep Learning for Material Imaging: A Pseudo-3d Model for Tera-Scale 3d Domain Transfer. SSRN Electronic Journal. DOI: 10.2139/ssrn.4808378. Online publication date: 2024
    • (2024) Using Run-Time Information to Enhance Static Analysis of Machine Learning Code in Notebooks. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering (497-501). DOI: 10.1145/3663529.3663785. Online publication date: 10-Jul-2024
    • (2024) Test Input Prioritization for 3D Point Clouds. ACM Transactions on Software Engineering and Methodology 33:5 (1-44). DOI: 10.1145/3643676. Online publication date: 4-Jun-2024
    • (2024) Generating Point Cloud Augmentations via Class-Conditioned Diffusion Model. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW) (480-488). DOI: 10.1109/WACVW60836.2024.00057. Online publication date: 1-Jan-2024
    • (2024) Sequential Point Clouds: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:8 (5504-5523). DOI: 10.1109/TPAMI.2024.3365970. Online publication date: Aug-2024
    • (2024) Dense Dual-Branch Cross Attention Network for Semantic Segmentation of Large-Scale Point Clouds. IEEE Transactions on Geoscience and Remote Sensing 62 (1-16). DOI: 10.1109/TGRS.2023.3341894. Online publication date: 2024
    • (2024) Improving the classification of a nanocomposite using nanoparticles based on a meta-analysis study, recurrent neural network and recurrent neural network Monte-Carlo algorithms. Nanocomposites 10:1 (322-350). DOI: 10.1080/20550324.2024.2367181. Online publication date: 8-Jul-2024
