DOI: 10.1145/3626705.3631787

Poster

Investigating Opportunities for Active Smart Assistants to Initiate Interactions With Users

Published: 03 December 2023

Abstract

    Passive voice assistants such as Alexa are widespread, responding to user requests. However, with the rise of domestic robots, we envision active smart assistants that initiate interactions seamlessly, weave themselves into the user’s context, and enable more suitable interaction. While robots already deliver the hardware, only recently have advances in artificial intelligence enabled assistants to understand the human and the environment well enough to support this vision. We combined hardware with artificial intelligence to build an attentive robot. Here, we present a robotic head prototype that discovers and follows users in a room using video and sound. We contribute (1) the design and implementation of a prototype system for an active smart assistant and (2) a discussion of design principles for systems engaging in human conversations. This work aims to provide foundations for future research on active smart assistants.
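    The abstract describes a robotic head that locates and attends to users by combining video and sound. As an illustrative sketch only (not the authors' implementation), one common pattern is to fuse a visual bearing from face detection with an acoustic bearing from sound-source localization via a confidence-weighted circular mean, and to gate when the assistant may initiate contact. All function names, parameters, and the toy initiation policy below are hypothetical:

    ```python
    import math

    def fuse_bearings(visual_deg, visual_conf, audio_deg, audio_conf):
        """Fuse visual and acoustic bearing estimates (degrees) using a
        confidence-weighted circular mean, so the head turns toward the
        most likely user position. Returns None if neither cue is available."""
        if visual_conf + audio_conf == 0:
            return None
        x = (visual_conf * math.cos(math.radians(visual_deg))
             + audio_conf * math.cos(math.radians(audio_deg)))
        y = (visual_conf * math.sin(math.radians(visual_deg))
             + audio_conf * math.sin(math.radians(audio_deg)))
        return math.degrees(math.atan2(y, x)) % 360

    def should_initiate(face_visible, speech_detected, user_idle_s,
                        idle_threshold_s=30):
        """Toy policy: initiate an interaction only when a user is visible,
        no one is currently speaking, and the user has been idle long enough
        that an interruption seems acceptable."""
        return face_visible and not speech_detected and user_idle_s >= idle_threshold_s
    ```

    With equal confidence in both cues, a face at 0° and a voice at 90° yield a fused bearing of 45°; as one cue's confidence drops, the fused bearing shifts toward the other. A circular mean is used rather than a plain average so that bearings near the 0°/360° wrap-around fuse correctly.
    
    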



      Information

      Published In

      MUM '23: Proceedings of the 22nd International Conference on Mobile and Ubiquitous Multimedia
      December 2023
      607 pages
      ISBN:9798400709210
      DOI:10.1145/3626705
      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. conversational agents
      2. human computer interaction
      3. human robot interaction
      4. voice assistants

      Qualifiers

      • Poster
      • Research
      • Refereed limited

      Conference

      MUM '23

      Acceptance Rates

      Overall acceptance rate: 190 of 465 submissions (41%)
