research-article

Multi-Prior Driven Resolution Rescaling Blocks for Intra Frame Coding

Published: 01 January 2024

Abstract

Deep learning techniques are increasingly integrated into rescaling-based video compression frameworks and have shown great potential for improving compression efficiency. However, existing methods achieve limited performance for two reasons: 1) they treat the context priors generated by the codec as independent sources of information and ignore potential interactions among multiple priors during rescaling, so the priors may not effectively facilitate compression; 2) they often apply a uniform sampling ratio across regions of varying content complexity, which loses important information. To address these two issues, this paper proposes a spatial multi-prior driven resolution rescaling framework for intra-frame coding, called MP-RRF, which consists of three sub-networks: a multi-prior driven network, a downscaling network, and an upscaling network. First, the multi-prior driven network employs complexity and similarity priors to smooth unnecessarily complicated information, while leveraging similarity and quality priors to produce high-fidelity complementary information. This interaction of complexity, similarity, and quality priors ensures redundancy reduction and texture enhancement. Second, the downscaling network discriminatively processes components of different granularities to generate a compact, low-resolution image for encoding. Third, the upscaling network aggregates a complementary set of contextual multi-scale features to reconstruct realistic details, and combines variable receptive fields to suppress multi-scale compression artifacts and resampling noise. Extensive experiments show that our network achieves a significant 23.84% Bjøntegaard Delta Rate (BD-Rate) reduction under the all-intra configuration compared with the codec anchor, offering state-of-the-art coding performance.
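
For readers unfamiliar with the BD-Rate figure quoted above, the following is a minimal sketch of the classic cubic-fit variant of the metric, assuming four rate-distortion points per codec given as (bitrate, PSNR) pairs. The abstract does not state which BD-Rate tool or interpolation the authors used, so the function names (`lagrange_eval`, `mean_log_rate`, `bd_rate`) and the sample points are illustrative assumptions, not the paper's evaluation code. A negative result means the tested codec needs less bitrate than the anchor at equal quality.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial that interpolates the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def mean_log_rate(points, lo, hi):
    """Mean of the fitted log-bitrate curve over the PSNR interval [lo, hi].

    With four points the fitted curve is a cubic, and Simpson's rule
    integrates cubics exactly, so three evaluations are enough.
    """
    rates, psnrs = zip(*points)
    log_rates = [math.log(r) for r in rates]

    def curve(q):
        return lagrange_eval(psnrs, log_rates, q)

    return (curve(lo) + 4.0 * curve((lo + hi) / 2.0) + curve(hi)) / 6.0


def bd_rate(anchor, test):
    """Average bitrate change (percent) of `test` relative to `anchor`.

    Both arguments are lists of four (bitrate, psnr) pairs with distinct
    PSNR values (an assumption of this sketch).
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch expects exactly four points per curve")
    # Compare the curves only where their PSNR ranges overlap.
    lo = max(min(p for _, p in anchor), min(p for _, p in test))
    hi = min(max(p for _, p in anchor), max(p for _, p in test))
    if hi <= lo:
        raise ValueError("PSNR ranges do not overlap")
    diff = mean_log_rate(test, lo, hi) - mean_log_rate(anchor, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0


# Synthetic example (not data from the paper): the test codec reaches the
# same PSNR as the anchor with 20% less bitrate at every point.
anchor_pts = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 38.0), (8000.0, 41.0)]
test_pts = [(0.8 * r, q) for r, q in anchor_pts]
print(round(bd_rate(anchor_pts, test_pts), 2))  # -20.0
```

Working in the log-bitrate domain is what makes the result a relative (percentage) saving, so a uniform 20% bitrate reduction yields exactly -20% regardless of the absolute bitrates involved.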

Published In

IEEE Transactions on Multimedia, Volume 26, 2024, 11427 pages

Publisher

IEEE Press
