
Video super-resolution via pre-frame constrained and deep-feature enhanced sparse reconstruction

Published: 01 April 2020

Highlights

A novel sparse reconstruction formulation is introduced for video super-resolution, which uses the previously estimated high-resolution frame as a regularization term to guarantee temporal coherence (a sketch of this formulation follows the list).
Deep features are incorporated into the sparse reconstruction framework to enhance the dictionary, which benefits the whole super-resolution process.
An effective dictionary updating strategy is proposed that regularly updates the dictionaries using the newly reconstructed frames.
A joint bilateral filter is used to remove reconstruction noise and transfer details.
Experiments demonstrate the effectiveness of the proposed method compared with previous approaches.
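
The pre-frame constraint named in the first highlight can be read, loosely, as one extra data-fidelity term in a coupled-dictionary sparse-coding objective. The formulation below is only a sketch inferred from the highlights and abstract, not the paper's exact objective; the symbols D_l, D_h, y, x_prev, lambda, and beta are introduced here purely for illustration:

\min_{\alpha}\;\; \|D_l\,\alpha - y\|_2^2 \;+\; \lambda\,\|\alpha\|_1 \;+\; \beta\,\|D_h\,\alpha - \hat{x}_{\mathrm{prev}}\|_2^2

where y is a low-resolution patch, D_l and D_h are the coupled low- and high-resolution dictionaries sharing the sparse code \alpha, \hat{x}_{\mathrm{prev}} is the co-located (motion-compensated) patch from the previously estimated high-resolution frame, and \lambda, \beta trade off sparsity against temporal coherence.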

Abstract

This paper presents a new video super-resolution (SR) method that generates high-quality and temporally coherent high-resolution (HR) videos. Starting from the traditional sparse reconstruction framework, which works well for image SR, we improve it significantly in the following aspects to obtain an effective video SR method. Firstly, to enhance the temporal coherence between adjacent HR frames, once an HR frame is estimated, we use it to guide the sparse reconstruction of the next low-resolution frame. Secondly, instead of using just hand-crafted features, we further incorporate deep features generated by VGG16 into our sparse reconstruction based video SR method. Thirdly, we constantly update the dictionary, which is the core of the sparse reconstruction, by making use of the previously estimated HR frame. Finally, after the HR video is reconstructed, we use a joint bilateral filter to post-process it to remove artifacts and transfer image details. Experiments demonstrate that the proposed four strategies effectively improve our final results. In most of the experiments in this paper, our results are better than those produced by the latest deep learning based approaches.
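
The abstract lists the ingredients of the method but not its exact equations, so the following is only a minimal NumPy sketch of how the pre-frame constraint could enter a patch-wise sparse-coding step: the previous-frame term is folded into the least-squares part and the resulting lasso is solved with ISTA. The function names, the weights lam and beta, and the patch sizes are illustrative assumptions, not the paper's implementation; the deep-feature enhancement, dictionary updating, and joint bilateral filtering steps are omitted.

import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding: the proximal operator of the l1 penalty.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def preframe_constrained_code(y, x_prev, D_l, D_h, lam=0.1, beta=0.5, n_iter=200):
    # Sparse-code a low-resolution patch y while pulling the high-resolution
    # reconstruction D_h @ alpha toward x_prev, the co-located patch from the
    # previously estimated HR frame.  Stacking the two quadratic terms turns
    # this into an ordinary lasso problem, solved here with plain ISTA.
    A = np.vstack([D_l, np.sqrt(beta) * D_h])         # stacked design matrix
    b = np.concatenate([y, np.sqrt(beta) * x_prev])   # stacked target vector
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)          # 1 / Lipschitz constant of the gradient
    alpha = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ alpha - b)
        alpha = soft_threshold(alpha - step * grad, step * lam)
    return alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_atoms, lr_dim, hr_dim = 64, 25, 100              # e.g. 5x5 LR / 10x10 HR patches (illustrative)
    D_l = rng.standard_normal((lr_dim, n_atoms))       # low-resolution dictionary
    D_h = rng.standard_normal((hr_dim, n_atoms))       # coupled high-resolution dictionary
    y = rng.standard_normal(lr_dim)                    # input low-resolution patch
    x_prev = rng.standard_normal(hr_dim)               # patch from the previous HR estimate
    alpha = preframe_constrained_code(y, x_prev, D_l, D_h)
    hr_patch = D_h @ alpha                             # reconstructed high-resolution patch
    print("non-zero coefficients:", np.count_nonzero(alpha))

The point of the sketch is the stacking trick: adding the temporal term does not change the solver, it only enlarges the design matrix fed to the standard l1-regularized least-squares problem.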

Published In

Pattern Recognition, Volume 100, Issue C, April 2020, 778 pages

Publisher

Elsevier Science Inc., United States

Author Tags

1. Video super resolution
2. Sparse representation
3. Deep features
4. Temporal coherence
