Designing with Gaze: Tama -- a Gaze Activated Smart-Speaker

Published: 07 November 2019

Abstract

    Recent developments in gaze tracking present new opportunities for social computing. This paper presents a study of Tama, a gaze-actuated smart speaker. Tama was designed to take advantage of research on gaze in conversation. Rather than being activated with a wake word (such as "Ok Google"), Tama detects the gaze of a user, moving an articulated 'head' to achieve mutual gaze. We tested Tama's use in a multi-party conversation task, with users successfully activating and receiving a response to over 371 queries across 10 trials. When Tama worked well, there was no significant difference in the length of interaction. However, interactions with Tama had a higher rate of repeated queries, making interactions longer overall. Video analysis lets us explain the problems users had when interacting with gaze. In the discussion, we describe implications for designing new gaze systems that use gaze both as input and output. We also discuss the relationship to anthropomorphic design and the value of taking advantage of learned skills of interaction. Finally, we propose two paths for future work: one in the field of speech agents, and the other in using human gaze as an interaction modality more widely.
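
    The abstract describes gaze, rather than a wake word, as the activation mechanism, with the device returning the user's gaze before listening. The paper itself contains no code; the following is a minimal illustrative sketch, in Python, of how such a gaze-dwell activation loop might be structured. All names here (GazeSample, StubGazeTracker, turn_head_toward, listen_and_answer, DWELL_SECONDS) are hypothetical stand-ins, not the authors' implementation.

    ```python
    import time
    from dataclasses import dataclass

    # Hypothetical stand-ins for the system's components: a gaze tracker that
    # reports whether each user is looking at the device, an articulated "head"
    # that can turn toward a user, and a speech pipeline. None of these names
    # come from the paper.

    @dataclass
    class GazeSample:
        user_id: int
        looking_at_device: bool

    class StubGazeTracker:
        """Placeholder gaze source; a real system would wrap a camera-based
        gaze estimator. This stub always reports user 0 gazing at the device."""
        def poll(self) -> list[GazeSample]:
            return [GazeSample(user_id=0, looking_at_device=True)]

    def turn_head_toward(user_id: int) -> None:
        # Placeholder: drive the articulated head to meet the user's gaze.
        print(f"[head] turning toward user {user_id} (mutual gaze)")

    def listen_and_answer() -> None:
        # Placeholder: open the microphone, run speech recognition, answer.
        print("[speech] listening for a query...")

    DWELL_SECONDS = 0.5  # assumed threshold separating a look from a glance

    def activation_loop(tracker: StubGazeTracker) -> None:
        """Activate on sustained gaze instead of a wake word."""
        gaze_started: dict[int, float] = {}
        while True:
            now = time.monotonic()
            for sample in tracker.poll():
                if not sample.looking_at_device:
                    gaze_started.pop(sample.user_id, None)  # glance ended
                    continue
                start = gaze_started.setdefault(sample.user_id, now)
                if now - start >= DWELL_SECONDS:
                    # Gaze held long enough: return the gaze, then listen.
                    turn_head_toward(sample.user_id)
                    listen_and_answer()
                    return  # one activation is enough for this sketch
            time.sleep(0.05)  # poll at ~20 Hz

    if __name__ == "__main__":
        activation_loop(StubGazeTracker())
    ```

    The dwell threshold is one plausible way to separate a deliberate look from a passing glance; the activation problems the abstract reports (a higher rate of repeated queries) suggest that making exactly this distinction reliably is a central difficulty for this kind of design.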



    Published In

    Proceedings of the ACM on Human-Computer Interaction, Volume 3, Issue CSCW
    November 2019
    5026 pages
    EISSN: 2573-0142
    DOI: 10.1145/3371885
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 07 November 2019
    Published in PACMHCI Volume 3, Issue CSCW

    Permissions

    Request permissions for this article.


    Author Tags

    1. gaze detection
    2. gaze interaction
    3. smart speaker
    4. voice assistant

    Qualifiers

    • Research-article

    Funding Sources

    • Vetenskapsrådet
    • Swedish Foundation for Strategic Research
    • JSPS KAKENHI


    Cited By

    • (2024) Body Language for VUIs: Exploring Gestures to Enhance Interactions with Voice User Interfaces. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 133-150. https://doi.org/10.1145/3643834.3660691. Online publication date: 1-Jul-2024.
    • (2024) Healthcare Voice AI Assistants: Factors Influencing Trust and Intention to Use. Proceedings of the ACM on Human-Computer Interaction 8, CSCW1, 1-37. https://doi.org/10.1145/3637339. Online publication date: 26-Apr-2024.
    • (2024) Talking to Multi-Party Conversational Agents in Advisory Services: Command-based vs. Conversational Interactions. Proceedings of the ACM on Human-Computer Interaction 8, GROUP, 1-25. https://doi.org/10.1145/3633072. Online publication date: 16-Feb-2024.
    • (2024) Multi-Modal Gaze Following in Conversational Scenarios. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 1175-1184. https://doi.org/10.1109/WACV57701.2024.00122. Online publication date: 3-Jan-2024.
    • (2024) Gaze analysis. Image and Vision Computing 144, C. https://doi.org/10.1016/j.imavis.2024.104961. Online publication date: 1-Apr-2024.
    • (2023) Nonverbal Human Signals Can Help Autonomous Agents Infer Human Preferences for Their Behavior. Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 307-316. https://doi.org/10.5555/3545946.3598652. Online publication date: 30-May-2023.
    • (2023) Managing Delays in Human-Robot Interaction. ACM Transactions on Computer-Human Interaction 30, 4, 1-42. https://doi.org/10.1145/3569890. Online publication date: 12-Sep-2023.
    • (2023) Human-Robot Conversational Interaction (HRCI). Companion of the 2023 ACM/IEEE International Conference on Human-Robot Interaction, 923-925. https://doi.org/10.1145/3568294.3579954. Online publication date: 13-Mar-2023.
    • (2023) Designing Robot Sound-In-Interaction. Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction, 172-182. https://doi.org/10.1145/3568162.3576979. Online publication date: 13-Mar-2023.
    • (2023) Speaker Orientation-Aware Privacy Control to Thwart Misactivation of Voice Assistants. 2023 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 597-610. https://doi.org/10.1109/DSN58367.2023.00061. Online publication date: Jun-2023.
