DOI: 10.1145/3474085.3475696

Salient Error Detection based Refinement for Wide-baseline Image Interpolation

Published: 17 October 2021

Abstract

Wide-baseline image interpolation is useful in many multimedia applications such as virtual street roaming and 3D TV. It is also a challenging problem, because the large translations and rotations of image patches make it hard to estimate the motion fields between wide-baseline image pairs. We propose a refinement strategy based on salient error detection to improve the results of existing wide-baseline image interpolation approaches, combining the advantages of methods based on piecewise-linear transformations with those of methods based on variational models. We first use a lightweight interpolation method to estimate the initial motion field between the input image pair and synthesize the intermediate image as the initial result. We then detect regions with noticeable artifacts in the initial image to find areas whose motion vectors should be refined. Finally, we refine the motion field of the detected regions using a variational-model-based method and obtain the refined intermediate image. Our refinement strategy can also serve as a post-refinement step for many other image interpolation algorithms. We demonstrate the effectiveness and efficiency of our method through experiments on different datasets.
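The three-stage pipeline described in the abstract (initial motion estimation, salient-error detection, localized refinement) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the thresholded warping-error detector stands in for the paper's saliency-based detection, and the diffusion loop is a crude stand-in for the variational refinement.

```python
import numpy as np

def synthesize_midpoint(img0, flow):
    """Forward-warp img0 halfway along the flow (nearest-neighbor splat)."""
    h, w = img0.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xm = np.clip(np.round(xs + 0.5 * flow[..., 0]).astype(int), 0, w - 1)
    ym = np.clip(np.round(ys + 0.5 * flow[..., 1]).astype(int), 0, h - 1)
    mid = np.zeros_like(img0)
    mid[ym, xm] = img0[ys, xs]  # holes stay zero and later show up as errors
    return mid

def detect_salient_errors(mid, reference, thresh=0.2):
    """Flag pixels whose synthesis error against a reference is noticeable."""
    return np.abs(mid - reference) > thresh

def refine_flow(flow, mask, iters=20):
    """Stand-in for variational refinement: iteratively diffuse reliable
    flow vectors into the flagged regions (a crude smoothness prior)."""
    refined = flow.copy()
    for _ in range(iters):
        smoothed = 0.25 * (np.roll(refined, 1, axis=0) + np.roll(refined, -1, axis=0)
                           + np.roll(refined, 1, axis=1) + np.roll(refined, -1, axis=1))
        refined[mask] = smoothed[mask]  # only flagged pixels are updated
    return refined
```

The key property the abstract claims is preserved in this sketch: only the detected regions are re-optimized, which is what makes the strategy cheap enough to use as a post-refinement step for other interpolation algorithms.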



    Published In

    MM '21: Proceedings of the 29th ACM International Conference on Multimedia
    October 2021
    5796 pages
    ISBN:9781450386517
    DOI:10.1145/3474085
    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. dense correspondence
    2. image interpolation
    3. saliency map
    4. view synthesis

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China (NSFC)
    • National Key Technology Research and Development Program of China
    • PKU-Baidu Fund

    Conference

    MM '21
    Sponsor:
    MM '21: ACM Multimedia Conference
    October 20 - 24, 2021
    Virtual Event, China

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
