Dynamically Expanded CNN Array for Video Coding

Authors:

Liang-Gee Chen

ICIGP '20: Proceedings of the 2020 3rd International Conference on Image and Graphics Processing

Pages 85 - 90

https://doi.org/10.1145/3383812.3383825

Published: 25 March 2020

Abstract

Video coding is a critical step in all popular methods of streaming video. Marked progress has been made in video quality, compression, and computational efficiency. Recently, there has been interest in applying techniques from the fast-progressing field of machine learning to further improve video coding.

We present a method that uses convolutional neural networks to help refine the output of various standard coding methods. The novelty of our approach is to train multiple groups of network parameters, each corresponding to a specific, short segment of video, and to arrange the groups in a hierarchy that reflects their locality within the video. Low-level groups are updated often and specialize in local features, while high-level groups find non-local features that can be used for longer segments of video. The parameter groups expand dynamically to match a video of any length. We show that our method can improve the quality of standard video codecs without increasing the compressed video size.
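The hierarchy of parameter groups described above can be pictured with a small bookkeeping sketch. This is not the authors' implementation: the fan-out of 2, the class name `ParameterHierarchy`, the method names, and the placeholder dictionary standing in for CNN weights are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows how each short segment could map to one group per level, and how levels could be added as the video grows.

```python
from dataclasses import dataclass, field


@dataclass
class ParameterHierarchy:
    """Illustrative, lazily grown hierarchy of parameter groups."""

    branching: int = 2  # assumed fan-out; not stated in the abstract
    num_levels: int = 1
    # (level, index) -> placeholder for a group of CNN parameters
    groups: dict = field(default_factory=dict)

    def levels_needed(self, num_segments: int) -> int:
        """Smallest number of levels whose top group spans all segments."""
        levels, span = 1, 1
        while span < num_segments:
            span *= self.branching
            levels += 1
        return levels

    def active_groups(self, segment: int) -> list:
        """Return the (level, index) keys that cover a segment.

        The hierarchy expands on demand, so a video of any length
        can be handled without fixing the number of groups up front.
        """
        self.num_levels = max(self.num_levels, self.levels_needed(segment + 1))
        keys = []
        for level in range(self.num_levels):
            # A level-k group spans branching**k consecutive segments.
            key = (level, segment // self.branching ** level)
            self.groups.setdefault(key, {"weights": None})  # hypothetical store
            keys.append(key)
        return keys


hierarchy = ParameterHierarchy()
for segment in (0, 1, 5):
    print(segment, hierarchy.active_groups(segment))
```

In this sketch the level-0 key changes with every segment, while higher-level keys stay the same across longer runs of segments, which mirrors the local versus non-local split in the abstract.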

Cited By

Fall E, Chang K, Chen L (2024). Tree-managed network ensembles for video prediction. Machine Vision and Applications 35:4. 10.1007/s00138-024-01575-7. Online publication date: 4-Jul-2024
https://dl.acm.org/doi/10.1007/s00138-024-01575-7

Index Terms

Dynamically Expanded CNN Array for Video Coding
1. Computing methodologies
  1. Computer graphics
    1. Image compression
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia streaming

Published In

ICIGP '20: Proceedings of the 2020 3rd International Conference on Image and Graphics Processing

February 2020

172 pages

ISBN:9781450377201

DOI:10.1145/3383812

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

In-Cooperation

Nanyang Technological University
UNIBO: University of Bologna

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICIGP 2020

ICIGP 2020: 2020 3rd International Conference on Image and Graphics Processing

February 8 - 10, 2020

Singapore, Singapore
