
SIDA: Self-Supervised Imbalanced Domain Adaptation for Sound Enhancement and Cross-Domain WiFi Sensing

Published: 27 September 2023

Abstract

Coronavirus disease 2019 (COVID-19) pneumonia persists, and its chief complaint is dry cough. Physicians have designed wireless stethoscopes to facilitate diagnosis; however, lung sounds are easily corrupted by external noise. To enhance lung sounds, prior research mostly assumes that the amounts of clean and noisy data are the same. This assumption is rarely met because data collection and annotation require extensive labor. Data imbalance across domains is common in real-world IoT systems, e.g., sound enhancement and WiFi-based human sensing. In this paper, we propose SIDA, a self-supervised imbalanced domain adaptation framework for sound enhancement and WiFi sensing, which makes it a generic time series domain adaptation solution for IoT systems. SIDA uses a self-supervised imbalanced domain adaptation model that separately learns the representation of time series signals in a minority domain with limited samples, the representation in a majority domain with rich samples, and the mapping relations between the two. For lung sound enhancement, we further propose a phase correction model to sanitize the phase and an SNR prediction algorithm to recursively perform domain adaptation on an imbalanced noisy and clean lung sound dataset. Extensive experiments demonstrate that SIDA increases noisy samples' SNR by 16.49 dB and 4.06 dB on a synthetic and a realistic imbalanced lung sound dataset, respectively. For WiFi-based human sensing, SIDA designs a cross-domain WiFi-based human identification model that works irrespective of walking trajectory. A specific trajectory along which a group of people walks in a realistic testing environment is treated as the minority domain, and several other trajectories stored at a server are treated as the majority domain. Extensive experiments show that SIDA recognizes individuals with an average accuracy of 94.72% and significantly outperforms baselines on a highly imbalanced WiFi dataset in cross-domain human identification tasks.
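
The SNR gains quoted in the abstract are reported in decibels. As a rough illustration of what such a figure measures, the sketch below computes a reference-based SNR and the improvement between a noisy and an enhanced signal. It assumes a clean reference is available, which typically holds only for synthetic mixtures. It uses the textbook definition of SNR, not the paper's SNR prediction algorithm, and the function names and toy signals are hypothetical.

```python
import math


def mean_power(samples):
    """Average power of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def snr_db(reference, observed):
    """SNR in dB of `observed`, treating (observed - reference) as noise.

    Assumes `observed` differs from `reference`; identical signals would
    give zero noise power and a division by zero.
    """
    noise = [o - r for o, r in zip(observed, reference)]
    return 10.0 * math.log10(mean_power(reference) / mean_power(noise))


def snr_improvement_db(reference, noisy, enhanced):
    """Gain in SNR (dB) obtained by replacing `noisy` with `enhanced`."""
    return snr_db(reference, enhanced) - snr_db(reference, noisy)


# Toy signals (illustrative only): a sinusoid plus an interfering tone.
n = 200
clean = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
interference = [0.5 * math.cos(2 * math.pi * 17 * k / n) for k in range(n)]
noisy = [c + i for c, i in zip(clean, interference)]
# A hypothetical enhancer that shrinks the interference amplitude tenfold.
enhanced = [c + 0.1 * i for c, i in zip(clean, interference)]

gain = snr_improvement_db(clean, noisy, enhanced)
print(f"SNR improvement: {gain:.2f} dB")
```

Reducing the noise amplitude by a factor of ten lowers noise power by a factor of one hundred, which corresponds to a 20 dB gain. On realistic recordings without a clean reference this definition cannot be applied directly, so SNR has to be estimated instead.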

    Published In

    Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies  Volume 7, Issue 3
    September 2023
    1734 pages
    EISSN:2474-9567
    DOI:10.1145/3626192

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Domain adaptation
    2. Domain imbalance
    3. ICBHI dataset
    4. IoT
    5. Sound sensing
    6. WiFi sensing

    Qualifiers

    • Research-article
    • Research
    • Refereed
