Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Huang, Zhewei; Zhang, Tianyuan; Heng, Wen; Shi, Boxin; Zhou, Shuchang

doi:10.1007/978-3-031-19781-9_36

Zhewei Huang¹²,
Tianyuan Zhang¹²,
Wen Heng¹²,
Boxin Shi^13,14,15 &
…
Shuchang Zhou¹²

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13674))

Included in the following conference series:

European Conference on Computer Vision

4321 Accesses

Abstract

Real-time video frame interpolation (VFI) is very useful in video processing, media players, and display devices. We propose RIFE, a Real-time Intermediate Flow Estimation algorithm for VFI. To realize a high-quality flow-based VFI method, RIFE uses a neural network named IFNet that can estimate the intermediate flows end-to-end with much faster speed. A privileged distillation scheme is designed for stable IFNet training and improve the overall performance. RIFE does not rely on pre-trained optical flow models and can support arbitrary-timestep frame interpolation with the temporal encoding input. Experiments demonstrate that RIFE achieves state-of-the-art performance on several public benchmarks. Compared with the popular SuperSlomo and DAIN methods, RIFE is 4–27 times faster and produces better results. Furthermore, RIFE can be extended to wider applications thanks to temporal encoding. https://github.com/megvii-research/ECCV2022-RIFE

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 89.00; Price excludes VAT (USA)

Softcover Book: USD 119.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

A Multi-frame Video Interpolation Neural Network for Large Motion

Video Frame Interpolation via Cyclic Fine-Tuning and Asymmetric Reverse Flow

A comprehensive survey on video frame interpolation techniques

Article 04 January 2021

References

Anil, R., Pereyra, G., Passos, A., Ormandi, R., Dahl, G.E., Hinton, G.E.: Large scale distributed neural network training through online distillation. In: Proceedings of the International Conference on Learning Representations (ICLR) (2018)
Google Scholar
Baker, S., Scharstein, D., Lewis, J., Roth, S., Black, M.J., Szeliski, R.: A database and evaluation methodology for optical flow. In: International Journal of Computer Vision (IJCV) (2011)
Google Scholar
Bao, W., Lai, W.S., Ma, C., Zhang, X., Gao, Z., Yang, M.H.: Depth-aware video frame interpolation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Google Scholar
Bao, W., Lai, W.S., Zhang, X., Gao, Z., Yang, M.H.: MEMC-Net: motion estimation and motion compensation driven neural network for video interpolation and enhancement. In: IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) (2018). https://doi.org/10.1109/TPAMI.2019.2941941
Blau, Y., Michaeli, T.: The perception-distortion tradeoff. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Briedis, K.M., Djelouah, A., Meyer, M., McGonigal, I., Gross, M., Schroers, C.: Neural frame interpolation for rendered content. ACM Trans. Graph. 40(6), 1–13 (2021)
Article Google Scholar
Chen, X., Zhang, Y., Wang, Y., Shu, H., Xu, C., Xu, C.: Optical flow distillation: Towards efficient and stable video style transfer. In: Proceedings of the European Conference on Computer Vision (ECCV) (2020)
Google Scholar
Cheng, X., Chen, Z.: Video frame interpolation via deformable separable convolution. In: AAAI Conference on Artificial Intelligence (2020)
Google Scholar
Cheng, X., Chen, Z.: Multiple video frame interpolation via enhanced deformable separable convolution. In: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (2021). https://doi.org/10.1109/TPAMI.2021.3100714
Choi, M., Kim, H., Han, B., Xu, N., Lee, K.M.: Channel attention is all you need for video frame interpolation. In: AAAI Conference on Artificial Intelligence (2020)
Google Scholar
Danier, D., Zhang, F., Bull, D.: Spatio-temporal multi-flow network for video frame interpolation. arXiv preprint arXiv:2111.15483 (2021)
Ding, L., Goshtasby, A.: On the canny edge detector. Pattern Recogn. 34(3), 721–725 (2001)
Article Google Scholar
Ding, T., Liang, L., Zhu, Z., Zharkov, I.: CDFI: compression-driven network design for frame interpolation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Google Scholar
Ding, X., Zhang, X., Ma, N., Han, J., Ding, G., Sun, J.: RepVGG: making VGG-style convnets great again. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Google Scholar
Dosovitskiy, A., et al.: Learning optical flow with convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
Google Scholar
Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)
Huang, Z., Heng, W., Zhou, S.: Learning to paint with model-based deep reinforcement learning. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2019)
Google Scholar
Hui, T.W., Tang, X., Change Loy, C.: LiteFlowNet: a lightweight convolutional neural network for optical flow estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Ilg, E., et al.: Evolution of optical flow estimation with deep networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Google Scholar
Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)
Jiang, H., Sun, D., Jampani, V., Yang, M.H., Learned-Miller, E., Kautz, J.: Super SloMo: high quality estimation of multiple intermediate frames for video interpolation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Jonschkowski, R., Stone, A., Barron, J.T., Gordon, A., Konolige, K., Angelova, A.: What matters in unsupervised optical flow. In: Proceedings of the European Conference on Computer Vision (ECCV) (2020)
Google Scholar
Kalluri, T., Pathak, D., Chandraker, M., Tran, D.: FLAVR: Flow-agnostic video representations for fast frame interpolation. arXiv preprint arXiv:2012.08512 (2020)
Kong, L., et al.: IfrNet: intermediate feature refine network for efficient frame interpolation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
Google Scholar
Lee, D.H., et al.: Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In: Proceedings of the IEEE International Conference on Machine Learning Workshops (ICMLW) (2013)
Google Scholar
Lee, H., Kim, T., Chung, T.y., Pak, D., Ban, Y., Lee, S.: AdaCOF: adaptive collaboration of flows for video frame interpolation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
Google Scholar
Liu, Y., Xie, L., Siyao, L., Sun, W., Qiao, Y., Dong, C.: Enhanced quadratic video interpolation. In: Proceedings of the European Conference on Computer Vision (ECCV) (2020)
Google Scholar
Liu, Y.L., Liao, Y.T., Lin, Y.Y., Chuang, Y.Y.: Deep video frame interpolation using cyclic frame generation. In: Proceedings of the 33rd Conference on Artificial Intelligence (AAAI) (2019)
Google Scholar
Liu, Z., Yeh, R.A., Tang, X., Liu, Y., Agarwala, A.: Video frame synthesis using deep voxel flow. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017)
Google Scholar
Lopez-Paz, D., Bottou, L., Schölkopf, B., Vapnik, V.: Unifying distillation and privileged information. In: Proceedings of the International Conference on Learning Representations (ICLR) (2016)
Google Scholar
Loshchilov, I., Hutter, F.: Fixing weight decay regularization in Adam. arXiv preprint arXiv:1711.05101 (2017)
Lu, G., Ouyang, W., Xu, D., Zhang, X., Cai, C., Gao, Z.: DVC: an end-to-end deep video compression framework. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Google Scholar
Lu, L., Wu, R., Lin, H., Lu, J., Jia, J.: Video frame interpolation with transformer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
Google Scholar
Luo, K., Wang, C., Liu, S., Fan, H., Wang, J., Sun, J.: UPFlow: upsampling pyramid for unsupervised optical flow learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Google Scholar
Ma, N., Zhang, X., Zheng, H.T., Sun, J.: ShuffleNet v2: practical guidelines for efficient CNN architecture design. In: Proceedings of the European conference on computer vision (ECCV) (2018)
Google Scholar
Meister, S., Hur, J., Roth, S.: UnFlow: unsupervised learning of optical flow with a bidirectional census loss. In: AAAI Conference on Artificial Intelligence (2018)
Google Scholar
Meyer, S., Wang, O., Zimmer, H., Grosse, M., Sorkine-Hornung, a.: Phase-based frame interpolation for video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Google Scholar
Mnih, V., et al.: Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013)
Niklaus, S., Liu, F.: Context-aware synthesis for video frame interpolation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Niklaus, S., Liu, F.: SoftMax splatting for video frame interpolation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
Google Scholar
Niklaus, S., Mai, L., Liu, F.: Video frame interpolation via adaptive convolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Google Scholar
Niklaus, S., Mai, L., Liu, F.: Video frame interpolation via adaptive separable convolution. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017)
Google Scholar
Park, J., Lee, C., Kim, C.S.: Asymmetric bilateral motion estimation for video frame interpolation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2021)
Google Scholar
Porrello, A., Bergamini, L., Calderara, S.: Robust re-identification by multiple views knowledge distillation. In: Proceedings of the European Conference on Computer Vision (ECCV) (2020)
Google Scholar
Ranftl, R., Lasinger, K., Hafner, D., Schindler, K., Koltun, V.: Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. In: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (2020)
Google Scholar
Ranjan, A., Black, M.J.: Optical flow estimation using a spatial pyramid network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Google Scholar
Reda, F., Kontkanen, J., Tabellion, E., Sun, D., Pantofaru, C., Curless, B.: Frame interpolation for large motion. arXiv (2022)
Google Scholar
Reda, F.A., et al.: Unsupervised video interpolation using cycle consistency. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2019)
Google Scholar
Sim, H., Oh, J., Kim, M.: XVFI: extreme video frame interpolation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2021)
Google Scholar
Siyao, L., et al.: Deep animation video interpolation in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Google Scholar
Soomro, K., Zamir, A.R., Shah, M.: Ucf101: a dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402 (2012)
Sun, D., Yang, X., Liu, M.Y., Kautz, J.: PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Sun, S., Kuang, Z., Sheng, L., Ouyang, W., Zhang, W.: Optical flow guided feature: a fast and robust motion representation for video action recognition. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Teed, Z., Deng, J.: RAFT: recurrent all-pairs field transforms for optical flow. In: Proceedings of the European Conference on Computer Vision (ECCV) (2020)
Google Scholar
Wu, C.Y., Singhal, N., Krahenbuhl, P.: Video compression through image interpolation. In: Proceedings of the European Conference on Computer Vision (ECCV) (2018)
Google Scholar
Wu, Y., Wen, Q., Chen, Q.: Optimizing video prediction via video frame interpolation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
Google Scholar
Xiang, X., Tian, Y., Zhang, Y., Fu, Y., Allebach, J.P., Xu, C.: Zooming slow-MO: fast and accurate one-stage space-time video super-resolution. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
Google Scholar
Xu, G., Xu, J., Li, Z., Wang, L., Sun, X., Cheng, M.: Temporal modulation network for controllable space-time video super-resolution. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Google Scholar
Xu, X., Siyao, L., Sun, W., Yin, Q., Yang, M.H.: Quadratic video interpolation. In: Advances in Neural Information Processing Systems (NIPS) (2019)
Google Scholar
Xue, T., Chen, B., Wu, J., Wei, D., Freeman, W.T.: Video enhancement with task-oriented flow. In: International Journal of Computer Vision (IJCV) (2019)
Google Scholar
Yuan, S., Stenger, B., Kim, T.K.: RGB-based 3d hand pose estimation via privileged learning with depth images. In: Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCVW) (2019)
Google Scholar
Zhao, Z., Wu, Z., Zhuang, Y., Li, B., Jia, J.: Tracking objects as pixel-wise distributions. In: Proceedings of the European conference on computer vision (ECCV) (2022)
Google Scholar
Zhou, M., Bai, Y., Zhang, W., Zhao, T., Mei, T.: Responsive listening head generation: a benchmark dataset and baseline. In: Proceedings of the European Conference on Computer Vision (ECCV) (2022)
Google Scholar
Zhou, T., Tulsiani, S., Sun, W., Malik, J., Efros, A.A.: View synthesis by appearance flow. In: Proceedings of the European Conference on Computer Vision (ECCV) (2016)
Google Scholar

Download references

Acknowledgement

This work is supported by National Key R &D Program of China (2021ZD0109803) and National Natural Science Foundation of China under Grant No. 62136001, 62088102.

Author information

Authors and Affiliations

Megvii Research, Beijing, China
Zhewei Huang, Tianyuan Zhang, Wen Heng & Shuchang Zhou
NERCVT, School of Computer Science, Peking University, Beijing, China
Boxin Shi
Institute for Artificial Intelligence, Peking University, Beijing, China
Boxin Shi
Beijing Academy of Artificial Intelligence, Beijing, China
Boxin Shi

Authors

Zhewei Huang
View author publications
You can also search for this author in PubMed Google Scholar
Tianyuan Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Wen Heng
View author publications
You can also search for this author in PubMed Google Scholar
Boxin Shi
View author publications
You can also search for this author in PubMed Google Scholar
Shuchang Zhou
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Boxin Shi or Shuchang Zhou .

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 5519 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Huang, Z., Zhang, T., Heng, W., Shi, B., Zhou, S. (2022). Real-Time Intermediate Flow Estimation for Video Frame Interpolation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13674. Springer, Cham. https://doi.org/10.1007/978-3-031-19781-9_36

Download citation

DOI: https://doi.org/10.1007/978-3-031-19781-9_36
Published: 23 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19780-2
Online ISBN: 978-3-031-19781-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

A Multi-frame Video Interpolation Neural Network for Large Motion

Video Frame Interpolation via Cyclic Fine-Tuning and Asymmetric Reverse Flow

A comprehensive survey on video frame interpolation techniques

References

Acknowledgement

Author information

Authors and Affiliations

Corresponding authors

Editor information

Editors and Affiliations

1 Electronic supplementary material

Supplementary material 1 (pdf 5519 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

A Multi-frame Video Interpolation Neural Network for Large Motion

Video Frame Interpolation via Cyclic Fine-Tuning and Asymmetric Reverse Flow

A comprehensive survey on video frame interpolation techniques

References

Acknowledgement

Author information

Authors and Affiliations

Corresponding authors

Editor information

Editors and Affiliations

1 Electronic supplementary material

Supplementary material 1 (pdf 5519 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation