Blind 3D Video Stabilization with Spatio-Temporally Varying Motion Blur

Published: 13 November 2024

Abstract

Video stabilization is a challenging task that aims to compensate for the overall frame shake introduced during video acquisition. Existing three-dimensional (3D) video stabilization methods model camera perspective projection through either data-driven training or explicit motion estimation. However, these methods struggle with shaky videos that contain abrupt object movements, which produce local motion blur along the direction of movement. This phenomenon is prevalent in real-world scenarios featuring blind motion in the foreground. Unfortunately, directly combining stabilization and deblurring methods handles this situation poorly: the intensity of motion blur changes continuously throughout the video, and a direct combination underuses spatiotemporal information, providing insufficient cues for cross-frame compensation. To alleviate this problem, we propose the Cross-frame-temporal Module framework, which addresses blind motion blur induced by various conditions and uses cross-frame temporal features to estimate depth maps and camera motion. Within this framework, a Blur Transform Network (BTNet) adapts to spatially varying, non-uniform motion blur by transforming local regions according to their blur intensities, and a Temporal-Aware Network (TANet) further suppresses motion blur by leveraging cross-frame temporal features. In addition, paired training video data containing motion blur is scarce, which limits the practical application of this approach; the Cross-frame-temporal Module framework therefore adopts an un-pretrained, in-test training strategy. Extensive experiments demonstrate that our method outperforms state-of-the-art methods.
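To make the notion of spatially varying motion blur concrete, the following is a minimal illustrative sketch, not the BTNet or any model from the article. It assumes a deliberately simplified forward model: each pixel is averaged over a horizontal box kernel whose length is read from a per-pixel map. The function name `spatially_varying_blur` and the `blur_len` map are hypothetical names introduced only for this illustration.

```python
from typing import List

Image = List[List[float]]


def spatially_varying_blur(image: Image, blur_len: List[List[int]]) -> Image:
    """Blur each pixel horizontally with a box kernel of its own length.

    blur_len[y][x] is the kernel length at pixel (x, y). A length of 1 leaves
    the pixel unchanged; even lengths are rounded up to the next odd size so
    the kernel stays centred. Samples outside the image are clamped to the
    nearest edge pixel. This is an assumed toy model, not the article's.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            half = blur_len[y][x] // 2
            total = 0.0
            for dx in range(-half, half + 1):
                xs = min(max(x + dx, 0), width - 1)  # clamp to image border
                total += image[y][xs]
            out[y][x] = total / (2 * half + 1)
    return out


# A sharp vertical edge: left half dark (0.0), right half bright (1.0).
sharp = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(2)]
# Row 0 is static (length 1); row 1 stands in for a fast-moving object (length 3).
lengths = [[1] * 6, [3] * 6]
blurred = spatially_varying_blur(sharp, lengths)
```

In this toy example the static row keeps its sharp edge while the moving row has the edge smeared across neighbouring pixels, which is the kind of region-dependent degradation that a single global deblurring kernel cannot undo.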

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 11
    November 2024
    702 pages
    EISSN: 1551-6865
    DOI: 10.1145/3613730

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 November 2024
    Online AM: 08 August 2024
    Accepted: 25 July 2024
    Revised: 24 July 2024
    Received: 24 January 2024
    Published in TOMM Volume 20, Issue 11

    Author Tags

    1. 3D video stabilization
    2. Blind motion blur
    3. Camera motion estimation
    4. Cross-frame temporal features

    Qualifiers

    • Research-article

    Funding Sources

    • Natural Science Foundation of China

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 255
    • Downloads (Last 6 weeks): 51
    Reflects downloads up to 01 Jan 2025

    Citations

    Cited By

    • (2024) Noise-Resistance Learning via Multi-Granularity Consistency for Unsupervised Domain Adaptive Person Re-Identification. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3702328. Online publication date: 2-Nov-2024
    • (2024) Correlation-aware Cross-modal Attention Network for Fashion Compatibility Modeling in UGC Systems. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3698772. Online publication date: 5-Oct-2024
    • (2024) Efficiently Gluing Pre-trained Language and Vision Models for Image Captioning. ACM Transactions on Intelligent Systems and Technology. DOI: 10.1145/3682067. Online publication date: 29-Jul-2024
    • (2024) Dual-path Collaborative Generation Network for Emotional Video Captioning. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 496-505. DOI: 10.1145/3664647.3681603. Online publication date: 28-Oct-2024
    • (2024) Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 229-239. DOI: 10.1145/3626772.3657727. Online publication date: 10-Jul-2024
