DOI: 10.1145/3524273.3533931
research-article

Perception of video quality at a local spatio-temporal horizon: research proposal

Published: 05 August 2022

Abstract

    This paper contains the research proposal of Andréas Pastor, presented at the MMSys 2022 doctoral symposium. Encoding video for streaming over the Internet has become a major research topic, driven by the need to reduce bandwidth consumption and latency. At the same time, human perception of distortions has been explored in multiple research projects, especially for distortions generated by coder-decoder (CODEC) algorithms. These algorithms operate in a rate-distortion optimization paradigm to compress video content efficiently. This optimization is driven by metrics that, most of the time, are not based on human perception and, more importantly, are not tuned to reflect how the human eye perceives distortions locally.
    In this doctoral study, we propose to work on the perception of localized distortions over a small temporal and spatial horizon. We present here the fundamental research questions and challenges in the domain, with a focus on methods for collecting perceptual judgments in subjective studies and on metrics that can help estimate how humans perceive distortions.
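
    As a rough illustration of the rate-distortion optimization paradigm mentioned above, the sketch below picks the encoding option that minimizes the usual Lagrangian cost J = D + lambda * R, and contrasts a frame-level distortion score with a per-block distortion map. It is not the author's method: the names `Candidate`, `rd_cost`, `choose_candidate`, `block_mse` and `local_mse_map` are hypothetical, and mean squared error is used only as a simple stand-in for whatever perceptual metric would drive the optimization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical encoding option for a block (illustrative only)."""
    name: str
    rate_bits: float   # bits spent if this option is chosen
    distortion: float  # distortion under the chosen metric (lower is better)


def rd_cost(candidate, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate_bits


def choose_candidate(candidates, lam):
    """Return the candidate with the smallest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


def block_mse(reference, distorted):
    """Mean squared error between two equally sized flat pixel lists."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("blocks must be non-empty and of equal size")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)


def local_mse_map(reference, distorted, block_size):
    """Per-block MSE for two frames given as lists of pixel rows."""
    height, width = len(reference), len(reference[0])
    mse_map = []
    for top in range(0, height, block_size):
        row = []
        for left in range(0, width, block_size):
            ys = range(top, min(top + block_size, height))
            xs = range(left, min(left + block_size, width))
            ref_block = [reference[y][x] for y in ys for x in xs]
            dis_block = [distorted[y][x] for y in ys for x in xs]
            row.append(block_mse(ref_block, dis_block))
        mse_map.append(row)
    return mse_map


if __name__ == "__main__":
    # A 4x4 frame where only the top-left 2x2 block is distorted.
    ref = [[100] * 4 for _ in range(4)]
    dis = [row[:] for row in ref]
    for y in range(2):
        for x in range(2):
            dis[y][x] = 120

    flat_ref = [p for row in ref for p in row]
    flat_dis = [p for row in dis for p in row]
    print("frame-level MSE:", block_mse(flat_ref, flat_dis))
    print("per-block MSE map:", local_mse_map(ref, dis, 2))

    options = [Candidate("high_quality", 100, 50), Candidate("low_rate", 40, 90)]
    for lam in (0.5, 1.0):
        print("lambda", lam, "->", choose_candidate(options, lam).name)
```

    In this toy frame the frame-level score averages the error over all pixels, while the per-block map shows that it is concentrated in a single region. This is the kind of local information that a frame-level metric does not expose.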

    Cited By

    • (2023) Predicting local distortions introduced by AV1 using Deep Features. 2023 IEEE International Conference on Visual Communications and Image Processing (VCIP), 1-5. DOI: 10.1109/VCIP59821.2023.10402725. Online publication date: 4-Dec-2023

        Published In

        MMSys '22: Proceedings of the 13th ACM Multimedia Systems Conference
        June 2022
        432 pages
        ISBN: 9781450392839
        DOI: 10.1145/3524273
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. CODEC
        2. distortion
        3. perception
        4. video quality

        Qualifiers

        • Research-article

        Conference

        MMSys '22
        MMSys '22: 13th ACM Multimedia Systems Conference
        June 14 - 17, 2022
        Athlone, Ireland

        Acceptance Rates

        Overall Acceptance Rate 176 of 530 submissions, 33%
