Video Frame Interpolation Based on Lightweight Convolutional Unit and Three-scale Encoder

Authors:

Qingbei Guo

ICDIP '23: Proceedings of the 15th International Conference on Digital Image Processing

Article No.: 90, Pages 1 - 7

https://doi.org/10.1145/3604078.3604169

Published: 26 October 2023

Abstract

Video frame interpolation (VFI) achieves temporal super-resolution by synthesizing an intermediate frame between two adjacent original frames. This paper proposes a lightweight VFI network built on a lightweight convolutional unit and a three-scale encoder. We first introduce a three-scale encoder-decoder structure with two-level attention cascades to represent multi-scale motion information. Then, a recurrent convolutional layer (RCL) and a residual operation are combined into a recurrent residual convolutional unit (RRCU) that refines the three-scale structure. Finally, we propose a lightweight unit that combines depthwise separable convolution with the RRCU, and we apply the local lightweight idea to reduce the number of model parameters. Experimental results show that the proposed method outperforms state-of-the-art methods while using fewer parameters.
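To make the parameter-saving claim concrete, the following sketch compares the weight count of a standard convolution with that of a depthwise separable convolution (a depthwise k x k filter per input channel followed by a 1 x 1 pointwise convolution). It is a back-of-the-envelope illustration, not the paper's network: the function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are assumptions chosen only for the example, and biases are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
standard = conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
ratio = separable / standard  # equals 1/c_out + 1/k**2

print(standard, separable, round(ratio, 3))  # 73728 8768 0.119
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer, which is the kind of saving a lightweight unit built on depthwise separable convolution can exploit.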

Index Terms

1. Computing methodologies
  1. Computer graphics
    1. Image manipulation
      1. Image processing

Published In

ICDIP '23: Proceedings of the 15th International Conference on Digital Image Processing

May 2023

711 pages

ISBN: 9798400708237

DOI: 10.1145/3604078

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Shandong Provincial Natural Science Foundation
Independent Innovation Team Project of Jinan City

Conference

ICDIP 2023

ICDIP 2023: The 15th International Conference on Digital Image Processing

May 19 - 22, 2023

Nanjing, China
