An improved binocular stereo matching algorithm based on AANet

Yang, Ge; Liao, Yuting

doi:10.1007/s11042-023-15183-6

Published: 06 April 2023

Volume 82, pages 40987–41003, (2023)

Multimedia Tools and Applications

Abstract

Stereo matching is a key step in building a stereo vision system. The disparity (parallax) information it produces directly determines the three-dimensional information recovered for an object. End-to-end stereo matching algorithms derive disparity maps directly from the designed network. However, such networks are structurally complex, and their large number of parameters consumes a great deal of memory. This increases the burden on the device and the time needed to obtain the disparity map, lowering the running efficiency of the network. This paper therefore proposes an improved stereo matching algorithm based on AANet (adaptive aggregation network for efficient stereo matching): AEDNet (adaptive end-to-end depth network for stereo matching). In the feature extraction module, the network structure is simplified by limiting the convolution kernel size, which yields features with a low level of abstraction. In cost aggregation, the intra-scale aggregation module achieves adaptive cost aggregation through deformable convolution, and the inter-scale aggregation module uses the traditional cross-scale aggregation method to compensate, to a certain extent, for the missing global information. The performance of the network is verified on the KITTI dataset. The results show that, even with the simplified network, the algorithm still completes stereo matching efficiently and accurately and obtains a better disparity map. These results provide the preconditions for accurate three-dimensional reconstruction.
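To make the link between matching, disparity (parallax), and three-dimensional information concrete, the following is a minimal NumPy sketch of the generic pipeline that end-to-end networks build on: a correlation cost volume, a winner-take-all disparity choice, and conversion of disparity to depth with Z = f·B/d. It is not the AEDNet or AANet implementation. The function names, the use of plain correlation without any learned aggregation, and the focal length and baseline in the usage note are all illustrative assumptions.

```python
import numpy as np


def correlation_cost_volume(left, right, max_disp):
    """Build a (max_disp, H, W) matching-score volume from (C, H, W) features.

    Hypothetical helper: the score at [d, y, x] is the channel-wise mean of
    left[:, y, x] * right[:, y, x - d]. Higher means a better match.
    """
    _, height, width = left.shape
    volume = np.zeros((max_disp, height, width), dtype=np.float32)
    for d in range(max_disp):
        if d == 0:
            volume[d] = (left * right).mean(axis=0)
        else:
            # A left pixel at column x is compared with the right pixel at x - d.
            # Columns x < d have no counterpart and keep a score of zero.
            volume[d, :, d:] = (left[:, :, d:] * right[:, :, :-d]).mean(axis=0)
    return volume


def winner_take_all(volume):
    """Pick, per pixel, the disparity with the highest matching score."""
    return volume.argmax(axis=0)


def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert disparity (pixels) to depth (metres) via Z = f * B / d.

    Pixels with non-positive disparity are mapped to infinity.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth
```

As a usage illustration with made-up calibration values, `disparity_to_depth(disp, focal_px=100.0, baseline_m=0.5)` maps a disparity of 2 pixels to a depth of 25 m. Because depth is inversely proportional to disparity, small disparity errors on distant objects turn into large depth errors, which is why the quality of the disparity map matters for reconstruction. In a learned network, the raw volume would be passed through cost aggregation (for example the intra-scale and inter-scale modules described above) before the disparity is regressed.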

Data availability

The data that support the findings of this study are available from the corresponding author, upon reasonable request.

Acknowledgements

This research was financially supported by the Major Scientific Research Project for Universities of Guangdong Province (2020ZDZX3058); the Science and Technology Projects of Zhuhai in the field of social development (2220004000066); and the Key Laboratory of Intelligent Multimedia Technology (201762005).

Author information

Authors and Affiliations

Advanced Institute of Natural Sciences, Key Laboratory of Intelligent Multimedia Technology, Beijing Normal University, Zhuhai, 519087, China
Ge Yang
Engineering Lab On Intelligent Perception for Internet of Things (ELIP), Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Yuting Liao

Corresponding author

Correspondence to Ge Yang.

Ethics declarations

Conflicts of interest

We declare that we have no financial or personal relationships with other people or organizations that may have inappropriately influenced our work. There is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, G., Liao, Y. An improved binocular stereo matching algorithm based on AANet. Multimed Tools Appl 82, 40987–41003 (2023). https://doi.org/10.1007/s11042-023-15183-6

Received: 22 January 2022
Revised: 26 February 2023
Accepted: 30 March 2023
Published: 06 April 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s11042-023-15183-6
