DOI: 10.1145/3379503.3403560
research-article

VoiceMessage++: Augmented Voice Recordings for Mobile Instant Messaging

Published: 05 October 2020

    Abstract

    Media (e.g., videos, images, and text) shared on social platforms such as Facebook and WeChat are often visually enriched with digital content (e.g., emojis, stickers, animal faces), which increases joy, personalization, and expressiveness. Although voice messages (VMs) are used very frequently, they currently lack any form of digital augmentation. This work is the first to present and explore the concept of augmented VMs. Inspired by visual augmentations, we designed and implemented an editor that allows users to enhance VMs with background sounds, voice changers, and sound stickers. In a first evaluation (N = 15), we found that participants used augmentations frequently (2.73 per message on average) and rated augmented VMs as expressive, personal, and more fun than ordinary VMs. In a subsequent step, we analyzed the 45 augmented VMs recorded during the study and identified three distinct message types (decoration, composition, and integrated) that indicate how augmented VMs might be used.

    Supplementary Material

    MP4 File (a30-haas-supplement.mp4)

    Cited By

    • (2024) EmoWear: Exploring Emotional Teasers for Voice Message Interaction on Smartwatches. Proceedings of the CHI Conference on Human Factors in Computing Systems. 10.1145/3613904.3642101 (1-16). Online publication date: 11-May-2024
    • (2023) Voice Messages Reimagined: Exploring the Design Space of Current Voice Messaging Interfaces. Proceedings of Mensch und Computer 2023. 10.1145/3603555.3608562 (336-340). Online publication date: 3-Sep-2023
    • (2023) CrossChat: Instant Messaging across Different Apps on Mobile Devices. Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. 10.1145/3555776.3577599 (1068-1077). Online publication date: 27-Mar-2023

    Published In

    MobileHCI '20: 22nd International Conference on Human-Computer Interaction with Mobile Devices and Services
    October 2020
    418 pages
    ISBN: 9781450375160
    DOI: 10.1145/3379503
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 05 October 2020

    Author Tags

    1. media sharing
    2. messenger apps
    3. voice messaging

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    MobileHCI '20

    Acceptance Rates

    Overall Acceptance Rate 202 of 906 submissions, 22%

    Upcoming Conference

    MOBILEHCI '24
    26th International Conference on Mobile Human-Computer Interaction
    September 30 - October 3, 2024
    Melbourne, VIC, Australia

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 27
    • Downloads (Last 6 weeks): 4
    Reflects downloads up to 10 Aug 2024
