Neural network-based cross-channel chroma prediction for versatile video coding

The Journal of Supercomputing

Abstract

Although the latest versatile video coding (VVC) standard introduces linear models that exploit the correlation between the luma and chroma channels to remove redundancy, these models cannot capture the nonlinear relationships between components, which degrades intra-prediction precision. This paper proposes a neural network-based method for cross-channel chroma intra prediction to improve coding efficiency. Specifically, the neighboring reference samples and the co-located samples are fed into the proposed network separately, so that both spatial and cross-channel correlations are fully exploited. Furthermore, to obtain a more compact representation of the residual signals, a transform-based loss is employed, which makes the compression more effective. The proposed method is integrated into VVC, where it competes with the existing chroma prediction modes under rate-distortion optimization to further improve coding performance. Extensive experimental results demonstrate that the proposed method outperforms the VVC test model (VTM) 18.0, achieving average bitrate savings of 0.28%, 2.44%, and 1.89% for the Y, U, and V components, respectively.
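The two ideas named in the abstract, a linear luma-to-chroma model and a transform-based loss, can be illustrated with a minimal Python sketch. It is not the authors' network or loss, whose details are not available in this preview. It assumes a simplified two-point (min/max) parameter derivation in the spirit of cross-component linear models, and it assumes the transform-based loss is the mean absolute value of the orthonormal 2-D DCT-II coefficients of the prediction residual. All function names (fit_linear_model, predict_chroma, dct2, transform_l1_loss) are hypothetical.

```python
import math


def fit_linear_model(neigh_luma, neigh_chroma):
    """Derive (alpha, beta) from the neighbours with the smallest and largest
    luma value. This two-point rule is an assumed simplification."""
    i_min = min(range(len(neigh_luma)), key=neigh_luma.__getitem__)
    i_max = max(range(len(neigh_luma)), key=neigh_luma.__getitem__)
    luma_range = neigh_luma[i_max] - neigh_luma[i_min]
    if luma_range == 0:
        # Flat luma neighbourhood: fall back to the mean chroma value.
        return 0.0, sum(neigh_chroma) / len(neigh_chroma)
    alpha = (neigh_chroma[i_max] - neigh_chroma[i_min]) / luma_range
    beta = neigh_chroma[i_min] - alpha * neigh_luma[i_min]
    return alpha, beta


def predict_chroma(luma_block, alpha, beta):
    """Linear prediction: chroma = alpha * co-located luma + beta."""
    return [[alpha * value + beta for value in row] for row in luma_block]


def dct_matrix(n):
    """Orthonormal DCT-II basis; row k holds the k-th basis vector."""
    return [
        [
            math.sqrt((1 if k == 0 else 2) / n)
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]


def dct2(block):
    """Separable 2-D DCT-II of a square block: D * block * D^T."""
    n = len(block)
    d = dct_matrix(n)
    # Transform along the vertical direction first.
    tmp = [
        [sum(d[k][i] * block[i][j] for i in range(n)) for j in range(n)]
        for k in range(n)
    ]
    # Then transform along the horizontal direction.
    return [
        [sum(tmp[k][j] * d[l][j] for j in range(n)) for l in range(n)]
        for k in range(n)
    ]


def transform_l1_loss(pred, target):
    """Mean absolute DCT coefficient of the residual (assumed loss form).
    Fewer and smaller coefficients mean a more compact residual."""
    n = len(pred)
    residual = [[target[i][j] - pred[i][j] for j in range(n)] for i in range(n)]
    coeffs = dct2(residual)
    return sum(abs(c) for row in coeffs for c in row) / (n * n)


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    alpha, beta = fit_linear_model([10, 20, 30, 40], [15, 20, 25, 30])
    luma = [[10, 20, 30, 40]] * 4
    pred = predict_chroma(luma, alpha, beta)
    # A nonlinear target that the linear model cannot match exactly.
    target = [[0.01 * v * v + 12 for v in row] for row in luma]
    print(transform_l1_loss(pred, target))
```

In this sketch a residual that is constant across the block concentrates all of its energy in the DC coefficient, so it is penalized far less than a residual of equal magnitude with a noisy texture. That is the sense in which a transform-domain loss favors residuals that are cheap to code after the transform.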

Availability of data

Not applicable.

Acknowledgements

Not applicable.

Funding

Not applicable.

Author information

Contributions

FL contributed to conceptualization, methodology, software, formal analysis, investigation, and original draft preparation; FL and JZ validated and supervised the study and administered the project.

Corresponding author

Correspondence to Jingde Zhang.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liang, F., Zhang, J. Neural network-based cross-channel chroma prediction for versatile video coding. J Supercomput 80, 12166–12185 (2024). https://doi.org/10.1007/s11227-023-05868-y

  • DOI: https://doi.org/10.1007/s11227-023-05868-y
