research-article

Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review

Published: 01 July 1997

Abstract

The use of hand gestures provides an attractive alternative to cumbersome interface devices for human-computer interaction (HCI). In particular, visual interpretation of hand gestures can help in achieving the ease and naturalness desired for HCI. This has motivated a very active research area concerned with computer vision-based analysis and interpretation of hand gestures. We survey the literature on visual interpretation of hand gestures in the context of its role in HCI. This discussion is organized on the basis of the method used for modeling, analyzing, and recognizing gestures. Important differences in the gesture interpretation approaches arise depending on whether a 3D model of the human hand or an image appearance model of the human hand is used. 3D hand models offer a way of more elaborate modeling of hand gestures but lead to computational hurdles that have not been overcome given the real-time requirements of HCI. Appearance-based models lead to computationally efficient "purposive" approaches that work well under constrained situations but seem to lack the generality desirable for HCI. We also discuss implemented gestural systems as well as other potential applications of vision-based gesture recognition. Although the current progress is encouraging, further theoretical as well as computational advances are needed before gestures can be widely used for HCI. We discuss directions of future research in gesture recognition, including its integration with other natural modes of human-computer interaction.

Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence  Volume 19, Issue 7
July 1997
123 pages
ISSN:0162-8828

Publisher

IEEE Computer Society

United States

Author Tags

  1. Vision-based gesture recognition
  2. gesture analysis
  3. hand tracking
  4. human-computer interaction
  5. nonrigid motion analysis
