DOI: 10.1145/1891903.1891924

Toward natural interaction in the real world: real-time gesture recognition

Published: 08 November 2010


Using a new hand-tracking technology capable of tracking 3D hand postures in real time, we developed a recognition system for continuous natural gestures. By natural gestures, we mean those encountered in spontaneous interaction, rather than a set of artificial gestures chosen to simplify recognition. To date we have achieved 95.6% accuracy on isolated gesture recognition and a 73% recognition rate on continuous gesture recognition, with data from three users and twelve gesture classes. We connected our gesture recognition system to Google Earth, enabling real-time gestural control of a 3D map. We describe the challenges of signal accuracy and signal interpretation presented by working in a real-world environment, and detail how we overcame them.



      Published In

      ICMI-MLMI '10: International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction
      November 2010
      311 pages
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]



      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. digital table interaction
      2. gesture recognition
      3. interactive maps
      4. multimodal interaction
      5. natural human computer interaction


      • Research-article



      ICMI-MLMI '10

      Acceptance Rates

      ICMI-MLMI '10 Paper Acceptance Rate 41 of 100 submissions, 41%;
      Overall Acceptance Rate 453 of 1,080 submissions, 42%

