
2BiVQA: Double Bi-LSTM-based Video Quality Assessment of UGC Videos

Published: 11 December 2023
Abstract

    Recently, with the growing popularity of mobile devices and video sharing platforms (e.g., YouTube, Facebook, TikTok, and Twitch), User-Generated Content (UGC) videos have become increasingly common and now account for a large portion of multimedia traffic on the internet. Unlike professionally generated videos produced by filmmakers and videographers, UGC videos typically contain multiple authentic distortions, generally introduced during capture and processing by naive users. Predicting the quality of UGC videos is of paramount importance for optimizing and monitoring their processing on hosting platforms, such as their coding, transcoding, and streaming. However, blind quality prediction of UGC is quite challenging, because the degradations of UGC videos are unknown and highly diverse, and no pristine reference is available. Therefore, in this article, we propose an accurate and efficient Blind Video Quality Assessment (BVQA) model for UGC videos, which we name 2BiVQA, for double Bi-LSTM Video Quality Assessment. The 2BiVQA metric consists of three main blocks: a pre-trained Convolutional Neural Network that extracts discriminative features from image patches, which are then fed into two Recurrent Neural Networks for spatial and temporal pooling. Specifically, we use two Bi-directional Long Short-Term Memory networks: the first captures short-range dependencies between image patches, while the second captures long-range dependencies between frames to account for the temporal memory effect. Experimental results on recent large-scale UGC VQA datasets show that 2BiVQA achieves high performance at lower computational cost than most state-of-the-art VQA models. The source code of our 2BiVQA metric is publicly available at https://github.com/atelili/2BiVQA.
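
    The abstract fully specifies the data flow of the model: per-patch features from a pre-trained CNN, a first Bi-LSTM that pools patches into a frame-level representation, and a second Bi-LSTM that pools frames into a single video quality score. Below is a minimal PyTorch sketch of that pipeline. It is not the authors' implementation (see the linked repository for that); the ResNet-50 backbone, hidden size, and patch/frame counts are illustrative assumptions, and it requires torchvision >= 0.13 for the weights API.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet50, ResNet50_Weights

    class TwoBiVQA(nn.Module):
        def __init__(self, feat_dim=2048, hidden=128):
            super().__init__()
            # Pre-trained CNN feature extractor (ResNet-50 is an assumption here).
            backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
            self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # drop FC head
            for p in self.cnn.parameters():
                p.requires_grad = False  # keep the extractor frozen in this sketch
            # Spatial pooling: Bi-LSTM over the patches of each frame (short-range).
            self.spatial = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
            # Temporal pooling: Bi-LSTM over frames (long-range, temporal memory effect).
            self.temporal = nn.LSTM(2 * hidden, hidden, batch_first=True, bidirectional=True)
            self.head = nn.Linear(2 * hidden, 1)  # regress a scalar quality score

        def forward(self, x):
            # x: (batch, frames, patches, 3, H, W)
            b, t, p, c, h, w = x.shape
            feats = self.cnn(x.view(b * t * p, c, h, w)).flatten(1)    # (b*t*p, feat_dim)
            feats = feats.view(b * t, p, -1)                           # patch sequences
            _, (hs, _) = self.spatial(feats)                           # pool patches -> frame
            frame = torch.cat([hs[-2], hs[-1]], dim=1).view(b, t, -1)  # frame features
            _, (ht, _) = self.temporal(frame)                          # pool frames -> video
            video = torch.cat([ht[-2], ht[-1]], dim=1)
            return self.head(video).squeeze(-1)                        # one score per video

    # Example: 2 videos, 8 sampled frames each, 9 patches of 224x224 per frame.
    model = TwoBiVQA()
    with torch.no_grad():
        scores = model(torch.randn(2, 8, 9, 3, 224, 224))
    print(scores.shape)  # torch.Size([2])

    Concatenating the last forward and backward hidden states is one standard way to summarize a Bi-LSTM's input sequence into a fixed-size vector; the paper's exact pooling and training details may differ.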


    Cited By

    • (2024) Predict Future Transient Fire Heat Release Rates Based on Fire Imagery and Deep Learning. Fire 7, 6 (2024), 200. https://doi.org/10.3390/fire7060200. Online publication date: 14-Jun-2024.
    • (2024) Integrates Spatiotemporal Visual Stimuli for Video Quality Assessment. IEEE Transactions on Broadcasting 70, 1 (2024), 223–237. https://doi.org/10.1109/TBC.2023.3312932. Online publication date: Mar-2024.
    • (2024) ADS-VQA: Adaptive sampling model for video quality assessment. Displays 84 (2024), 102792. https://doi.org/10.1016/j.displa.2024.102792. Online publication date: Sep-2024.

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 4
    April 2024
    676 pages
    ISSN: 1551-6857
    EISSN: 1551-6865
    DOI: 10.1145/3613617
    Editor: Abdulmotaleb El Saddik

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 December 2023
    Online AM: 08 November 2023
    Accepted: 22 October 2023
    Revised: 18 September 2023
    Received: 13 April 2023
    Published in TOMM Volume 20, Issue 4


    Author Tags

    1. Blind video quality assessment
    2. user-generated content
    3. deep learning
    4. Bi-LSTM
    5. spatial pooling
    6. temporal pooling

    Qualifiers

    • Research-article

