research-article

NoteWordy: Investigating Touch and Speech Input on Smartphones for Personal Data Capture

Authors:

Eun Kyoung Choe

Proceedings of the ACM on Human-Computer Interaction, Volume 6, Issue ISS

Article No.: 581, Pages 568-591

https://doi.org/10.1145/3567734

Published: 14 November 2022

Abstract

Speech as a natural and low-burden input modality has great potential to support personal data capture. However, little is known about how people use speech input, together with traditional touch input, to capture different types of data in self-tracking contexts. In this work, we designed and developed NoteWordy, a multimodal self-tracking application integrating touch and speech input, and deployed it in the context of productivity tracking for two weeks (N = 17). Our participants used the two input modalities differently, depending on the data type as well as personal preferences, error tolerance for speech recognition issues, and social surroundings. Additionally, we found speech input reduced participants' diary entry time and enhanced the data richness of the free-form text. Drawing from the findings, we discuss opportunities for supporting efficient personal data capture with multimodal input and implications for improving the user experience with natural language input to capture various self-tracking data.

Index Terms

NoteWordy: Investigating Touch and Speech Input on Smartphones for Personal Data Capture
1. Human-centered computing
  1. Human computer interaction (HCI)
    1. HCI design and evaluation methods
      1. Field studies
    2. Interaction devices
      1. Sound-based input / output
  2. Ubiquitous and mobile computing
    1. Ubiquitous and mobile computing design and evaluation methods

Published In

Proceedings of the ACM on Human-Computer Interaction Volume 6, Issue ISS

December 2022

746 pages

EISSN:2573-0142

DOI:10.1145/3554337

Editor:
Jeffrey Nichols
Apple, USA

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 November 2022

Published in PACMHCI Volume 6, Issue ISS

Qualifiers

Research-article

Funding Sources

National Science Foundation
