research-article

Multi-Prior Driven Resolution Rescaling Blocks for Intra Frame Coding

Authors:

Xia Hua

IEEE Transactions on Multimedia, Volume 26

Pages 11274 - 11289

https://doi.org/10.1109/TMM.2024.3453033

Published: 01 January 2024

Abstract

Deep learning techniques are increasingly integrated into rescaling-based video compression frameworks and have shown great potential for improving compression efficiency. However, existing methods achieve limited performance because 1) they treat the context priors generated by the codec as independent sources of information, ignoring potential interactions between multiple priors in rescaling, and thus may not effectively facilitate compression; and 2) they often employ a uniform sampling ratio across regions of varying content complexity, resulting in the loss of important information. To address these two issues, this paper proposes a spatial multi-prior driven resolution rescaling framework for intra-frame coding, called MP-RRF, consisting of three sub-networks: a multi-prior driven network, a downscaling network, and an upscaling network. First, the multi-prior driven network employs complexity and similarity priors to smooth unnecessarily complicated information, while leveraging similarity and quality priors to produce high-fidelity complementary information. This interaction of complexity, similarity, and quality priors ensures redundancy reduction and texture enhancement. Second, the downscaling network discriminatively processes components of different granularities to generate a compact, low-resolution image for encoding. Third, the upscaling network aggregates a complementary set of contextual multi-scale features to reconstruct realistic details, while combining variable receptive fields to suppress multi-scale compression artifacts and resampling noise. Extensive experiments show that our network achieves a significant 23.84% Bjøntegaard Delta Rate (BD-Rate) reduction under the all-intra configuration compared to the codec anchor, offering state-of-the-art coding performance.
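The abstract reports its headline result as a BD-Rate reduction but does not say how the figure was computed. The sketch below illustrates the commonly used Bjøntegaard procedure: fit a cubic polynomial to log10(bitrate) as a function of PSNR for the anchor and for the tested codec, integrate both fits over the overlapping PSNR range, and convert the average difference into a percentage. It is a minimal illustration under assumptions, not the authors' evaluation code. The function names (`fit_cubic`, `integrate_poly`, `bd_rate`) and the rate/PSNR numbers are hypothetical, and exactly four rate points per curve are assumed.

```python
import math


def fit_cubic(xs, ys):
    """Coefficients [c0, c1, c2, c3] of the cubic through four (x, y) points."""
    n = 4
    # Augmented Vandermonde system: sum_k c_k * x**k = y for each point.
    rows = [[x ** k for k in range(n)] + [y] for x, y in zip(xs, ys)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    coeffs = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        acc = rows[r][n] - sum(rows[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = acc / rows[r][r]
    return coeffs


def integrate_poly(coeffs, lo, hi):
    """Definite integral of sum_k coeffs[k] * x**k over [lo, hi]."""
    def antiderivative(x):
        return sum(c * x ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))
    return antiderivative(hi) - antiderivative(lo)


def bd_rate(anchor_rates, anchor_psnr, test_rates, test_psnr):
    """BD-Rate in percent; negative values mean the test codec saves bitrate."""
    lo = max(min(anchor_psnr), min(test_psnr))
    hi = min(max(anchor_psnr), max(test_psnr))
    if hi <= lo:
        raise ValueError("the two curves have no overlapping PSNR range")
    # Shift the PSNR axis by `lo` so the cubic fit is well conditioned.
    fit_anchor = fit_cubic([p - lo for p in anchor_psnr],
                           [math.log10(r) for r in anchor_rates])
    fit_test = fit_cubic([p - lo for p in test_psnr],
                         [math.log10(r) for r in test_rates])
    width = hi - lo
    avg_log_diff = (integrate_poly(fit_test, 0.0, width)
                    - integrate_poly(fit_anchor, 0.0, width)) / width
    return (10 ** avg_log_diff - 1) * 100


# Hypothetical data: the test codec needs 20% fewer bits at every PSNR level.
demo_psnr = [32.0, 35.0, 38.0, 41.0]             # dB
demo_anchor = [1000.0, 2000.0, 4000.0, 8000.0]   # kbps
demo_test = [0.8 * r for r in demo_anchor]
print(bd_rate(demo_anchor, demo_psnr, demo_test, demo_psnr))  # close to -20
```

Under this sign convention, the 23.84% reduction quoted in the abstract would correspond to a return value of about -23.84. Evaluation scripts used in practice may differ in detail, for example by using piecewise cubic interpolation rather than a single polynomial fit.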


ISSN: 1520-9210

1520-9210 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Publisher: IEEE Press
