Abstract
Eye tracking technology has been adopted in numerous human–computer interaction (HCI) studies to understand visual and display-based information processing, as well as the underlying cognitive processes users employ when navigating a computer interface. Analyzing eye tracking data can also help identify interaction patterns associated with salient regions of an information display. Deep learning is increasingly applied to eye tracking data, enabling the classification of large volumes of eye tracking results. In this paper, eye tracking data and convolutional neural networks (CNNs) were used to classify three types of information presentation methods. As a first step, several data preprocessing and feature engineering approaches were applied to eye tracking data collected in a controlled visual information processing experiment. The resulting data served as input for a comparison of four CNN models with different architectures. Two of the models classified the information presentations effectively, with overall accuracy above 80%.
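To make the pipeline concrete, the sketch below shows a small Keras CNN classifier of the general kind the paper compares. This is a minimal, illustrative example, not the authors' actual architectures: the input shape, layer sizes, and hyperparameters are assumptions for demonstration; only the three-class output follows from the abstract. Keras/TensorFlow is assumed as the framework based on the tools cited in the paper.

```python
# Minimal sketch of a CNN classifier for preprocessed eye tracking data.
# Assumptions: INPUT_SHAPE and all layer sizes are hypothetical; the paper's
# four architectures are not reproduced here.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 3            # three information presentation methods (per the abstract)
INPUT_SHAPE = (64, 64, 1)  # hypothetical size of a feature-engineered gaze "image"

def build_cnn(input_shape=INPUT_SHAPE, num_classes=NUM_CLASSES):
    """Small convolutional classifier over feature-engineered eye tracking input."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),                             # regularize the dense head
        layers.Dense(num_classes, activation="softmax"), # one probability per class
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_cnn()
model.summary()
# Example usage with hypothetical preprocessed arrays:
# model.fit(x_train, y_train, epochs=20, validation_split=0.2)
```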
Acknowledgements
This research was partially supported by the Towson University School of Emerging Technologies. We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used for this research.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Research involving human participants and/or animals
The study protocol was approved by the Towson University IRB (Exemption number: 14-X145).
Informed consent
Informed consent was obtained from all individual participants included in the study.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Yin, Y., Alqahtani, Y., Feng, J.H. et al. Classification of Eye Tracking Data in Visual Information Processing Tasks Using Convolutional Neural Networks and Feature Engineering. SN COMPUT. SCI. 2, 59 (2021). https://doi.org/10.1007/s42979-020-00444-0