A robust framework for multi-view stereopsis

Mao, Wendong; Wang, Mingjie; Huang, Hui; Gong, Minglun

doi:10.1007/s00371-021-02087-5


Original article
Published: 18 March 2021

Volume 38, pages 1539–1551, (2022)

The Visual Computer

Wendong Mao ORCID: orcid.org/0000-0002-1061-7165¹,
Mingjie Wang¹,
Hui Huang² &
…
Minglun Gong³

464 Accesses
1 Citation

Abstract

Various neural-network approaches have been proposed for multi-view stereopsis, but most of them cannot handle large textureless regions. We therefore construct a matching network that learns comprehensive information from stereo images and enforces smoothness constraints globally. Although the network is trained on binocular stereo datasets only, we show that it can directly handle the DTU multi-view stereo dataset. When multiple depth maps obtained by stereo matching are merged, an additional point consolidation procedure is often needed to remove outliers and better align individual patches. We propose a second network that consolidates 3D point clouds by directly projecting individual 3D points based on the point distributions in their neighborhoods. Unlike the matching network, this network is trained on local information; it scales to point clouds of any size and can also process selected areas of interest. Quantitative evaluation on the DTU dataset demonstrates that the two networks together generate point clouds comparable to those of existing state-of-the-art approaches.
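The merging step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the learned consolidation network is replaced here by a classical radius-based outlier filter, and the function names (`backproject`, `to_world`, `remove_outliers`), the pinhole intrinsics `fx, fy, cx, cy`, and the camera-to-world pose `R, t` are all assumptions introduced for illustration only.

```python
import math

def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (list of rows; 0 means no estimate) to camera-frame 3D points.

    Assumes an ideal pinhole camera: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points

def to_world(points, R, t):
    """Apply an assumed camera-to-world rigid transform: X_w = R * X_c + t."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def remove_outliers(points, radius, min_neighbors):
    """Classical stand-in for consolidation: keep points with enough close neighbors.

    Brute force, O(n^2); a spatial index would be needed for real point clouds.
    """
    kept = []
    for i, p in enumerate(points):
        count = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        if count >= min_neighbors:
            kept.append(p)
    return kept

if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    # One tiny hypothetical depth map plus a stray point far from the surface.
    view = to_world(backproject([[1.0, 1.0], [1.0, 0.0]], 1.0, 1.0, 0.0, 0.0),
                    identity, (0.0, 0.0, 0.0))
    merged = view + [(10.0, 10.0, 10.0)]
    print(remove_outliers(merged, radius=1.5, min_neighbors=1))
```

In a multi-view setting, the points from every view would be transformed with that view's own pose and concatenated before filtering. A fixed-radius neighbor count only discards isolated points; it does not move points onto the surface, which is the part the paper assigns to its second network.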




Author information

Authors and Affiliations

Memorial University of Newfoundland, St. John’s, Canada
Wendong Mao & Mingjie Wang
Shenzhen University, Shenzhen, China
Hui Huang
University of Guelph, Guelph, Canada
Minglun Gong


Corresponding author

Correspondence to Wendong Mao.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Mao, W., Wang, M., Huang, H. et al. A robust framework for multi-view stereopsis. Vis Comput 38, 1539–1551 (2022). https://doi.org/10.1007/s00371-021-02087-5


Accepted: 08 February 2021
Published: 18 March 2021
Issue Date: May 2022
DOI: https://doi.org/10.1007/s00371-021-02087-5
