NoteWordy: Investigating Touch and Speech Input on Smartphones for Personal Data Capture

Published: 14 November 2022

    Abstract

    Speech as a natural and low-burden input modality has great potential to support personal data capture. However, little is known about how people use speech input, together with traditional touch input, to capture different types of data in self-tracking contexts. In this work, we designed and developed NoteWordy, a multimodal self-tracking application integrating touch and speech input, and deployed it in the context of productivity tracking for two weeks (N = 17). Our participants used the two input modalities differently, depending on the data type as well as personal preferences, error tolerance for speech recognition issues, and social surroundings. Additionally, we found speech input reduced participants' diary entry time and enhanced the data richness of the free-form text. Drawing from the findings, we discuss opportunities for supporting efficient personal data capture with multimodal input and implications for improving the user experience with natural language input to capture various self-tracking data.

    Cited By

    • (2024) StayFocused: Examining the Effects of Reflective Prompts and Chatbot Support on Compulsive Smartphone Use. Proceedings of the CHI Conference on Human Factors in Computing Systems, 1-19. https://doi.org/10.1145/3613904.3642479. Online publication date: 11-May-2024.
    • (2024) Emotion Embodied: Unveiling the Expressive Potential of Single-Hand Gestures. Proceedings of the CHI Conference on Human Factors in Computing Systems, 1-17. https://doi.org/10.1145/3613904.3642255. Online publication date: 11-May-2024.
    • (2024) Improving Error Correction and Text Editing Using Voice and Mouse Multimodal Interface. International Journal of Human–Computer Interaction, 1-24. https://doi.org/10.1080/10447318.2024.2352932. Online publication date: 22-May-2024.

    Published In

    Proceedings of the ACM on Human-Computer Interaction  Volume 6, Issue ISS
    December 2022
    746 pages
    EISSN:2573-0142
    DOI:10.1145/3554337
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 November 2022
    Published in PACMHCI Volume 6, Issue ISS

    Author Tags

    1. Self-tracking
    2. personal informatics
    3. productivity
    4. speech input
    5. speech interface design

    Qualifiers

    • Research-article
