Abstract
Gaze estimation aims to predict accurate gaze directions from natural eye images, an extremely challenging task due to both random variations in head pose and person-specific biases. Existing works often learn features from binocular images independently and simply concatenate them for gaze estimation. In this paper, we propose a simple yet effective two-stage framework in which a residual feature learning (RFL) network and a hierarchical gaze calibration (HGC) network jointly improve gaze estimation performance. Specifically, the RFL network extracts informative features by jointly exploring the symmetric and asymmetric factors between the left and right eyes, producing initial predictions that are as accurate as possible. The HGC network then cascades person-specific transform modules to refine the distribution of gaze points from coarse to fine, effectively compensating for subject-specific bias in the initial predictions. Extensive experiments on the EVE and MPIIGaze datasets show that our method outperforms state-of-the-art approaches.
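The abstract only outlines the two-stage design, so the following minimal PyTorch-style sketch illustrates one plausible data flow rather than the authors' published architecture: a shared backbone encodes both eyes, symmetric and asymmetric components are formed from the feature sum and difference, and cascaded person-specific corrections refine the initial estimate from coarse to fine. All module names (`RFLSketch`, `HGCSketch`), layer shapes, and the sum/difference decomposition are illustrative assumptions.

```python
import torch
import torch.nn as nn


class RFLSketch(nn.Module):
    """Stage 1 (assumed layout): residual feature learning over binocular inputs."""

    def __init__(self, feat_dim=128):
        super().__init__()
        # Shared CNN backbone applied to each eye image.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Head modeling the asymmetric residual between the two eyes.
        self.residual_head = nn.Linear(feat_dim, feat_dim)
        # Initial 2-D gaze prediction from the combined features.
        self.gaze_head = nn.Linear(2 * feat_dim, 2)

    def forward(self, left_eye, right_eye):
        f_l, f_r = self.backbone(left_eye), self.backbone(right_eye)
        shared = 0.5 * (f_l + f_r)                # symmetric factor (feature mean)
        residual = self.residual_head(f_l - f_r)  # asymmetric factor (feature difference)
        fused = torch.cat([shared + residual, shared - residual], dim=1)
        return self.gaze_head(fused)


class HGCSketch(nn.Module):
    """Stage 2 (assumed layout): hierarchical, person-specific calibration."""

    def __init__(self, num_levels=2):
        super().__init__()
        # Each level applies a small learned correction, coarse to fine.
        self.levels = nn.ModuleList(nn.Linear(2, 2) for _ in range(num_levels))

    def forward(self, gaze_init):
        g = gaze_init
        for level in self.levels:
            g = g + level(g)  # residual refinement at each calibration level
        return g


# Usage with dummy 36x60 eye crops (size chosen arbitrarily for the sketch).
rfl, hgc = RFLSketch(), HGCSketch()
left = torch.randn(4, 3, 36, 60)
right = torch.randn(4, 3, 36, 60)
calibrated_gaze = hgc(rfl(left, right))  # shape: (4, 2)
```

Under these assumptions, the sum/difference split is just one simple way to realize "symmetric and asymmetric factors"; the paper's actual RFL and HGC modules may differ substantially in depth and fusion strategy.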
Acknowledgements
This work was supported in part by the National Key R&D Program of China under Grant 2021YFB1714700, NSFC under Grants 62106192 and 12326608, the Natural Science Foundation of Shaanxi Province under Grants 2022JC-41 and 2021JQ-054, the China Postdoctoral Science Foundation under Grants 2020M683490 and 2022T150518, and the Fundamental Research Funds for the Central Universities under Grants XTR042021005 and XTR072022001.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yin, Z., Zhou, S., Wang, L. et al. Residual feature learning with hierarchical calibration for gaze estimation. Machine Vision and Applications 35, 61 (2024). https://doi.org/10.1007/s00138-024-01545-z