Vocal Resonance: Using Internal Body Voice for Wearable Authentication

Authors: Cory Cornelius, Reza Rawassizadeh, Ronald Peterson, David Kotz

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 2, Issue 1

Article No.: 19, Pages 1 - 23

https://doi.org/10.1145/3191751

Published: 26 March 2018

Abstract

We observe the advent of body-area networks of pervasive wearable devices, whether for health monitoring, personal assistance, entertainment, or home automation. For many devices, it is critical to identify the wearer, so that sensor data can be properly labeled or behavior properly personalized. In this paper we propose the use of vocal resonance, that is, the sound of the person's voice as it travels through the person's body -- a method we anticipate would be suitable for devices worn on the head, neck, or chest. In this regard, we go well beyond the simple challenge of speaker recognition: we want to know who is wearing the device. We explore two machine-learning approaches that analyze voice samples from a small throat-mounted microphone and allow the device to determine whether (a) the speaker is indeed the expected person, and (b) the microphone-enabled device is physically on the speaker's body. We collected data from 29 subjects, demonstrate the feasibility of a prototype, and show that our DNN method, an LSTM-based deep-learning model, achieved a balanced accuracy of 0.914 for identification and 0.961 for verification, while our efficient GMM method achieved a balanced accuracy of 0.875 for identification and 0.942 for verification.
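
The abstract reports results as balanced accuracy. As a minimal sketch, assuming the common definition (the mean of per-class recall; the paper's exact formulation is not given in this excerpt), the metric can be computed as follows. The function name and the label arrays are hypothetical and serve only as illustration; they are not data from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally regardless of size."""
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # all samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical verification labels: 1 = expected wearer, 0 = anyone else or off-body.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy = {score:.4f}, plain accuracy = {plain:.4f}")
```

With these made-up labels the per-class recalls are 0.75 and 0.875, giving a balanced accuracy of 0.8125 against a plain accuracy of about 0.8333. The gap shows why a class-balanced metric is a reasonable choice when samples from other speakers outnumber samples from the expected wearer.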

Index Terms

Vocal Resonance: Using Internal Body Voice for Wearable Authentication
1. Human-centered computing
  1. Ubiquitous and mobile computing
    1. Ubiquitous and mobile devices
2. Security and privacy
  1. Security services
    1. Authentication
      1. Biometrics

Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 2, Issue 1

March 2018

1370 pages

EISSN: 2474-9567

DOI: 10.1145/3200905

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 March 2018

Accepted: 01 January 2018

Revised: 01 November 2017

Received: 01 August 2017

Published in IMWUT Volume 2, Issue 1

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Science Foundation

Article Metrics

Total Citations: 39
Total Downloads: 1,098
Downloads (Last 12 months): 178
Downloads (Last 6 weeks): 36

Reflects downloads up to 01 Nov 2024

Cited By

Inoue M, Murao K, Kostakos V, Kay J, Hoang T. (2024). User Authentication Method for Smart Glasses using Gaze Information of Registered Known Images and AI-generated Unknown Images. Companion of the 2024 on ACM International Joint Conference on Pervasive and Ubiquitous Computing, 525-530. Online publication date: 5-Oct-2024. https://dl.acm.org/doi/10.1145/3675094.3678450

Chen T, Yang Y, Qiu C, Fan X, Guo X, Shangguan L, Okoshi T, Ko J, LiKamWa R. (2024). Enabling Hands-Free Voice Assistant Activation on Earphones. Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services, 155-168. Online publication date: 3-Jun-2024. https://dl.acm.org/doi/10.1145/3643832.3661890

Shin H, Huh J, Kwon B, Kim I, Cheon E, Kim H, Lee C, Oakley I. (2024). SkullID: Through-Skull Sound Conduction based Authentication for Smartglasses. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-19. Online publication date: 11-May-2024. https://dl.acm.org/doi/10.1145/3613904.3642506

Hwang S, Lee H, Park C, Jung J, Lee J. (2024). Voice reduction in cardiac auscultation sounds with reference signals measured from vocal resonators. The Journal of the Acoustical Society of America 155:6, 3822-3832. Online publication date: 14-Jun-2024. https://doi.org/10.1121/10.0026237

Yu Z, Zhao L, Cui H, Song Y, Liu Y, Luo Y, Guo B. (2024). CrowdKit: A Generic Programming Framework for Mobile Crowdsensing Applications. IEEE Transactions on Mobile Computing 23:11, 10584-10597. Online publication date: 1-Nov-2024. https://dl.acm.org/doi/10.1109/TMC.2024.3381578

Li X, Zheng Z, Yan C, Li C, Ji X, Xu W. (2024). Toward Pitch-Insensitive Speaker Verification via Soundfield. IEEE Internet of Things Journal 11:1, 1175-1189. Online publication date: 1-Jan-2024. https://doi.org/10.1109/JIOT.2023.3290001

Shende S, Tembhurne J, Ansari N. (2024). Deep learning based authentication schemes for smart devices in different modalities: progress, challenges, performance, datasets and future directions. Multimedia Tools and Applications 83:28, 71451-71493. Online publication date: 8-Feb-2024. https://doi.org/10.1007/s11042-024-18350-5

Renz A, Neff T, Baldauf M, Maier E. (2023). Authentication methods for voice services on smart speakers – a multi-method study on perceived security and ease of use. i-com 22:1, 67-81. Online publication date: 29-Mar-2023. https://doi.org/10.1515/icom-2022-0039

Srivastava T, Pan S, Nguyen P, Jain S, Eskicioglu R, Huang P, Patwari N. (2023). Jawthenticate: Microphone-free Speech-based Authentication using Jaw Motion and Facial Vibrations. Proceedings of the 21st ACM Conference on Embedded Networked Sensor Systems, 209-222. Online publication date: 12-Nov-2023. https://dl.acm.org/doi/10.1145/3625687.3625813

Amesaka T, Watanabe H, Sugimoto M, Sugiura Y, Shizuki B. (2023). User Authentication Method for Hearables Using Sound Leakage Signals. Proceedings of the 2023 ACM International Symposium on Wearable Computers, 119-123. Online publication date: 8-Oct-2023. https://dl.acm.org/doi/10.1145/3594738.3611376
