
Tracking the gaze on objects in 3D: how do people really look at the bunny?

Published: 04 December 2018

Abstract

We provide the first large dataset of human fixations on physical 3D objects presented under varying viewing conditions and made of different materials. Our experimental setup is carefully designed to allow for accurate calibration and measurement. We estimate a mapping from the pair of pupil positions to 3D coordinates in space and register the presented shape with the eye-tracking setup. By modeling the fixated positions on 3D shapes as a probability distribution, we analyze the similarities among different conditions. The resulting data indicate that salient features depend on the viewing direction. Features that are stable across different viewing directions appear to be connected to semantically meaningful parts. We also show that it is possible to estimate gaze density maps from view-dependent data. The dataset provides the necessary ground-truth data for computational models of human perception in 3D.
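The abstract's core analysis step, comparing fixated positions modeled as probability distributions across viewing conditions, can be illustrated with a minimal sketch. This is not the authors' implementation: binning fixations by mesh-face index, the random example data, and the function names are illustrative assumptions; the Bhattacharyya coefficient is one standard affinity measure for comparing such discrete distributions.

```python
import numpy as np

def fixation_density(face_hits, num_faces):
    """Bin fixation samples (given as mesh-face indices) into a
    discrete probability distribution over the faces of a mesh."""
    counts = np.bincount(face_hits, minlength=num_faces).astype(float)
    return counts / counts.sum()

def bhattacharyya_coefficient(p, q):
    """Affinity between two discrete distributions on the same mesh:
    1.0 for identical distributions, 0.0 for disjoint support."""
    return float(np.sum(np.sqrt(p * q)))

# Hypothetical example: fixations from two viewing directions on a
# 1000-face mesh; a high coefficient would indicate stable features.
rng = np.random.default_rng(0)
view_a = fixation_density(rng.integers(0, 1000, size=500), 1000)
view_b = fixation_density(rng.integers(0, 1000, size=500), 1000)
print(bhattacharyya_coefficient(view_a, view_b))
```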

Supplementary Material

ZIP File (a188-wang.zip)
Supplemental files.




    Published In

ACM Transactions on Graphics, Volume 37, Issue 6 (December 2018), 1401 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3272127

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 04 December 2018
    Published in TOG Volume 37, Issue 6


    Author Tags

    1. 3D object viewing
    2. eye tracking
    3. mesh saliency

    Qualifiers

    • Research-article


Cited By

    • (2024) Saliency3D: A 3D Saliency Dataset Collected on Screen. Proceedings of the 2024 Symposium on Eye Tracking Research and Applications, 1-6. DOI: 10.1145/3649902.3653350
    • (2024) Towards 3D Colored Mesh Saliency: Database and Benchmarks. IEEE Transactions on Multimedia 26, 3580-3591. DOI: 10.1109/TMM.2023.3312924
    • (2024) CMIGNet: Cross-Modal Inverse Guidance Network for RGB-Depth salient object detection. Pattern Recognition 155, 110693. DOI: 10.1016/j.patcog.2024.110693
    • (2023) Saliency detection of textured 3D models based on multi-view information and texel descriptor. PeerJ Computer Science 9, e1584. DOI: 10.7717/peerj-cs.1584
    • (2023) Improved Water Sound Synthesis using Coupled Bubbles. ACM Transactions on Graphics 42, 4, 1-13. DOI: 10.1145/3592424
    • (2023) Dense, Interlocking-Free and Scalable Spectral Packing of Generic 3D Objects. ACM Transactions on Graphics 42, 4, 1-14. DOI: 10.1145/3592126
    • (2023) Second-order Stencil Descent for Interior-point Hyperelasticity. ACM Transactions on Graphics 42, 4, 1-16. DOI: 10.1145/3592104
    • (2023) Generating Activity Snippets by Learning Human-Scene Interactions. ACM Transactions on Graphics 42, 4, 1-15. DOI: 10.1145/3592096
    • (2023) Data-driven Digital Lighting Design for Residential Indoor Spaces. ACM Transactions on Graphics 42, 3, 1-18. DOI: 10.1145/3582001
    • (2023) Automatic Schelling Point Detection From Meshes. IEEE Transactions on Visualization and Computer Graphics 29, 6, 2926-2939. DOI: 10.1109/TVCG.2022.3144143
