Disparity-based Stereo Image Compression with Aligned Cross-View Priors

Authors:

Yongqi Zhai,

Luyang Tang,

Yi Ma,

Rui Peng,

Ronggang Wang

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

Pages 2351 - 2360

https://doi.org/10.1145/3503161.3548136

Published: 10 October 2022

Abstract

With the wide application of stereo images in various fields, research on stereo image compression (SIC) has attracted extensive attention from academia and industry. The core of SIC is to fully exploit the mutual information between the left and right images and to reduce redundancy between the views as much as possible. In this paper, we propose DispSIC, an end-to-end trainable deep neural network, in which we jointly train a stereo matching model to assist in the image compression task. Based on the stereo matching results (i.e., disparity), the right image can be easily warped to the left view, and only the residuals between the left and right views are encoded for the left image. DispSIC adopts a three-branch auto-encoder architecture, which encodes the right image, the disparity map, and the residuals, respectively. During training, the whole network can learn how to adaptively allocate bitrate among these three parts, achieving better rate-distortion performance at the cost of a lower disparity map bitrate. Moreover, we propose a conditional entropy model with aligned cross-view priors for SIC, which takes the warped latents of the right image as priors to improve the accuracy of the probability estimation for the left image. Experimental results demonstrate that our proposed method achieves superior performance compared to other existing SIC methods on the KITTI and InStereo2K datasets, both quantitatively and qualitatively.
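The warping-and-residual idea described in the abstract can be illustrated with a minimal sketch. This is not the DispSIC network: it assumes rectified views, integer per-pixel disparities, and nearest-pixel sampling with border clamping, whereas a learned codec would normally use a differentiable warp on images or latents. The function names `warp_right_to_left`, `residual`, and `reconstruct` are hypothetical and chosen only for illustration.

```python
def warp_right_to_left(right, disparity):
    """Predict the left view by sampling the right image at x - d per pixel.

    Assumption: rectified stereo with integer disparities, so the left pixel
    (x, y) corresponds to the right pixel (x - d, y). Out-of-range source
    columns are clamped to the image border.
    """
    height, width = len(right), len(right[0])
    warped = []
    for y in range(height):
        row = []
        for x in range(width):
            src_x = min(max(x - disparity[y][x], 0), width - 1)
            row.append(right[y][src_x])
        warped.append(row)
    return warped


def residual(left, warped):
    """Per-pixel difference that would remain to be coded for the left view."""
    return [[l - w for l, w in zip(l_row, w_row)]
            for l_row, w_row in zip(left, warped)]


def reconstruct(warped, res):
    """Decoder side: add the residual back onto the warped prediction."""
    return [[w + r for w, r in zip(w_row, r_row)]
            for w_row, r_row in zip(warped, res)]


# Toy one-row example: the left view is the right view shifted by 2 pixels.
right = [[10, 20, 30, 40, 50, 60]]
left = [[10, 10, 10, 20, 30, 40]]
disparity = [[2, 2, 2, 2, 2, 2]]

warped = warp_right_to_left(right, disparity)
res = residual(left, warped)
restored = reconstruct(warped, res)
```

In this toy case the disparity explains the left view completely, so the residual is all zeros and only the right image and the disparity map carry information. With imperfect disparity or occlusions the residual becomes non-zero, which is the part a residual branch would have to encode.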

Supplementary Material

MP4 File (MM-fp1638.mp4)

Recently, learning-based stereo image compression methods have achieved promising results. However, they still suffer from high computational complexity and insufficient compression performance. In this paper, we propose a disparity-based stereo image compression framework, namely DispSIC, which significantly outperforms state-of-the-art deep stereo image compression methods.
