
Video super-resolution via pre-frame constrained and deep-feature enhanced sparse reconstruction

Published: 01 April 2020

Highlights

A novel sparse reconstruction formulation is introduced for video super-resolution, which uses the previously estimated high-resolution frame as a regularization term to guarantee temporal coherence (a sketch of this formulation follows the list).
Deep features are incorporated into the sparse reconstruction framework to enhance the dictionary, which benefits the whole super-resolution process.
An effective dictionary updating strategy is proposed that regularly updates the dictionaries using the newly reconstructed frames.
A joint bilateral filter is used to remove reconstruction noise and transfer details.
Experiments demonstrate the effectiveness of the proposed method compared with previous approaches.
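
The pre-frame constraint named in the first highlight can be read, loosely, as one extra data-fidelity term in a coupled-dictionary sparse-coding objective. The formulation below is only a sketch inferred from the highlights and abstract, not the paper's exact objective; the symbols D_l, D_h, y, x_prev, lambda, and beta are introduced here purely for illustration:

\min_{\alpha}\;\; \|D_l\,\alpha - y\|_2^2 \;+\; \lambda\,\|\alpha\|_1 \;+\; \beta\,\|D_h\,\alpha - \hat{x}_{\mathrm{prev}}\|_2^2

where y is a low-resolution patch, D_l and D_h are the coupled low- and high-resolution dictionaries sharing the sparse code \alpha, \hat{x}_{\mathrm{prev}} is the co-located (motion-compensated) patch from the previously estimated high-resolution frame, and \lambda, \beta trade off sparsity against temporal coherence.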

Abstract

This paper presents a new video super-resolution (SR) method that generates high-quality and temporally coherent high-resolution (HR) videos. Starting from the traditional sparse reconstruction framework, which works well for image SR, we improve it significantly in the following aspects to obtain an effective video SR method. Firstly, to enhance the temporal coherence between adjacent HR frames, once an HR frame is estimated, we use it to guide the sparse reconstruction of the next low-resolution frame. Secondly, instead of using just hand-crafted features, we further incorporate deep features generated by VGG16 into our sparse reconstruction based video SR method. Thirdly, we constantly update the dictionary, which is the core of the sparse reconstruction, by making use of the previously estimated HR frame. Finally, after the HR video is reconstructed, we use a joint bilateral filter to post-process it to remove artifacts and transfer image details. Experiments demonstrate that the proposed four strategies effectively improve our final results. In most of the experiments in this paper, our results are better than those produced by the latest deep learning based approaches.
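
The abstract lists the ingredients of the method but not its exact equations, so the following is only a minimal NumPy sketch of how the pre-frame constraint could enter a patch-wise sparse-coding step: the previous-frame term is folded into the least-squares part and the resulting lasso is solved with ISTA. The function names, the weights lam and beta, and the patch sizes are illustrative assumptions, not the paper's implementation; the deep-feature enhancement, dictionary updating, and joint bilateral filtering steps are omitted.

import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding: the proximal operator of the l1 penalty.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def preframe_constrained_code(y, x_prev, D_l, D_h, lam=0.1, beta=0.5, n_iter=200):
    # Sparse-code a low-resolution patch y while pulling the high-resolution
    # reconstruction D_h @ alpha toward x_prev, the co-located patch from the
    # previously estimated HR frame.  Stacking the two quadratic terms turns
    # this into an ordinary lasso problem, solved here with plain ISTA.
    A = np.vstack([D_l, np.sqrt(beta) * D_h])         # stacked design matrix
    b = np.concatenate([y, np.sqrt(beta) * x_prev])   # stacked target vector
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)          # 1 / Lipschitz constant of the gradient
    alpha = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ alpha - b)
        alpha = soft_threshold(alpha - step * grad, step * lam)
    return alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_atoms, lr_dim, hr_dim = 64, 25, 100              # e.g. 5x5 LR / 10x10 HR patches (illustrative)
    D_l = rng.standard_normal((lr_dim, n_atoms))       # low-resolution dictionary
    D_h = rng.standard_normal((hr_dim, n_atoms))       # coupled high-resolution dictionary
    y = rng.standard_normal(lr_dim)                    # input low-resolution patch
    x_prev = rng.standard_normal(hr_dim)               # patch from the previous HR estimate
    alpha = preframe_constrained_code(y, x_prev, D_l, D_h)
    hr_patch = D_h @ alpha                             # reconstructed high-resolution patch
    print("non-zero coefficients:", np.count_nonzero(alpha))

The point of the sketch is the stacking trick: adding the temporal term does not change the solver, it only enlarges the design matrix fed to the standard l1-regularized least-squares problem.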

Published In

Pattern Recognition, Volume 100, Issue C, April 2020, 778 pages

Publisher

Elsevier Science Inc., United States

Author Tags

1. Video super resolution
2. Sparse representation
3. Deep features
4. Temporal coherence
