Abstract
3D reconstruction with neural networks has made striking progress in recent years. However, existing works rely on information-rich environments to perform reconstruction and have not yet addressed the Low-Light-Level (LLL) environment, where information is extremely scarce. 3D reconstruction in this environment is an urgent requirement in military, aerospace, and other fields. In this paper, we therefore introduce an Encapsulated Attention Encoder-Decoder Network (EA-EDNet). It incorporates multiple levels of semantics to adequately extract the limited information from images taken in the LLL environment, infers the defective morphological data, and intensifies attention on the focused parts. EA-EDNet is trained in a two-stage, coarse-to-fine fashion. We additionally create a realistic LLL environment dataset, 3LNet-12, and propose an accompanying analysis method for filtering this dataset. In experiments, the proposed method not only achieves results superior to state-of-the-art methods but also produces more finely detailed reconstructed models.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (NSFC) (62101310) and the Natural Science Foundation of Shandong Province, China (ZR2020MF127).
Author information
Contributions
YD: conceptualization, methodology, software, writing (review and editing). LY: visualization, investigation, supervision. XG: data curation, software, validation. HZ: writing (original draft preparation). ZW, GZ: software.
Ethics declarations
Conflict of interest
The authors declare no conflicts of interest.
About this article
Cite this article
Deng, Y., Yin, L., Gao, X. et al. EA-EDNet: encapsulated attention encoder-decoder network for 3D reconstruction in low-light-level environment. Multimedia Systems 29, 2263–2279 (2023). https://doi.org/10.1007/s00530-023-01100-2
DOI: https://doi.org/10.1007/s00530-023-01100-2