DOI: 10.1145/1180995.1181050
Article

Tracking head pose and focus of attention with multiple far-field cameras

Published: 02 November 2006

Abstract

In this work we present our approach to estimating the head orientations and foci of attention of multiple people in a smart room equipped with several cameras. We estimate each person's head orientation with respect to the room coordinate system by using all camera views. A neural network estimates head pose in each individual camera view; a Bayes filter then integrates the per-view estimates into one final, joint hypothesis. Using this scheme, we can track people's horizontal head orientations over a full 360° range at almost all positions within the room. The tracked head orientations are then used to determine who is looking at whom, i.e. people's focus of attention. We report experimental results on one meeting video that was recorded in the smart room.
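
The abstract gives no implementation details, so the following is only a minimal sketch of how per-camera estimates might be fused with a discrete Bayes filter and then mapped to a focus-of-attention target. Everything in it is an assumption made for illustration: the 10° binning, the `stay` transition probability, the idea that each network outputs a likelihood over camera-relative angle bins, the camera offsets, and the 30° tolerance in `focus_target`. None of these names or values come from the paper.

```python
import math

N_BINS = 36                # assumed: 10-degree bins covering the full 360-degree range
BIN_DEG = 360 / N_BINS


def normalize(p):
    """Scale a list of non-negative weights so that it sums to 1."""
    total = sum(p)
    return [x / total for x in p]


def predict(belief, stay=0.8):
    """Motion step: keep most mass in place, leak the rest to both neighbouring bins."""
    leak = (1.0 - stay) / 2.0
    n = len(belief)
    return [stay * belief[i] + leak * belief[(i - 1) % n] + leak * belief[(i + 1) % n]
            for i in range(n)]


def to_room_frame(camera_likelihood, camera_offset_bins):
    """Rotate a likelihood over camera-relative bins into room-coordinate bins."""
    n = len(camera_likelihood)
    return [camera_likelihood[(i - camera_offset_bins) % n] for i in range(n)]


def update(belief, room_likelihoods, floor=1e-6):
    """Measurement step: multiply the belief by every camera's likelihood, then normalize."""
    posterior = list(belief)
    for lik in room_likelihoods:
        posterior = [p * (l + floor) for p, l in zip(posterior, lik)]
    return normalize(posterior)


def map_angle(belief):
    """Return the room-frame angle (degrees) of the most probable bin."""
    best = max(range(len(belief)), key=belief.__getitem__)
    return best * BIN_DEG


def angular_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def focus_target(viewer, yaw_deg, positions, max_diff=30.0):
    """Pick the person whose bearing from the viewer is closest to the viewer's yaw.

    Returns None when nobody lies within max_diff degrees (assumed tolerance).
    """
    vx, vy = positions[viewer]
    best, best_diff = None, max_diff
    for name, (x, y) in positions.items():
        if name == viewer:
            continue
        bearing = math.degrees(math.atan2(y - vy, x - vx)) % 360.0
        d = angular_diff(yaw_deg, bearing)
        if d <= best_diff:
            best, best_diff = name, d
    return best
```

In this sketch a frame would be processed by calling `predict` on the previous belief, rotating each camera's output with `to_room_frame`, and passing the results to `update`; `map_angle` of the new belief would then feed `focus_target` together with the tracked room positions of the participants.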

      Published In

      ICMI '06: Proceedings of the 8th international conference on Multimodal interfaces
      November 2006
      404 pages
      ISBN:159593541X
      DOI:10.1145/1180995

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Bayesian filter
      2. focus of attention
      3. gaze
      4. head orientation
      5. head pose
      6. neural networks

      Qualifiers

      • Article

      Conference

      ICMI06

      Acceptance Rates

      Overall Acceptance Rate 453 of 1,080 submissions, 42%
