1 Introduction
The respiration state during sleep, especially whether sleep disorders occur, is an important indicator for monitoring physical and mental health [
7]. Sleep apnea is a typical sleep disorder in which breathing is briefly and repeatedly interrupted for over 10 seconds during sleep. When sleep apnea events occur above a certain frequency, they can develop into
sleep apnea-hypopnea syndrome (SAHS) [
29] and threaten human health. Previous research has shown that SAHS is related to many other diseases, including diabetes, hypertension, heart disease, depression, and obesity [
32,
40,
43]. It is estimated that nearly 1 billion people worldwide have sleep apnea, which is 10 times larger than previous estimates [
41]. Therefore, there is great demand for in-home sleep respiration monitoring, which can not only offer the support of preliminary diagnosis and early warning of sleep apnea but also be used for follow-up of patients with SAHS so as to track the disease progress since their last visits to hospitals.
Traditionally,
polysomnography (PSG) is clinically used to diagnose SAHS. Based on the overnight PSG data of a subject, medical technicians can identify different respiration events, including central apnea, obstructive apnea, mixed apnea, and hypopnea events. Then the apnea-hypopnea index (AHI) is calculated to assess the severity of SAHS. However, such a PSG test is cumbersome, expensive, and unsuitable for in-home use. Fortunately, the past few years have witnessed a surge of development in contactless sensing. Because respiration is closely related to chest displacements, and wireless signals reflected by the human body can capture subtle body movements, researchers have been exploiting various wireless signals (including Wi-Fi [
58,
62], RFID [
51,
63], and acoustic signals [
9,
13,
16]) to monitor respiration rates or sleep disorders. Compared to the sensing by wearable devices (e.g., smart watch [
11,
19,
42] and chest belt [
20,
33,
37]), contactless sensing is non-intrusive and easy to deploy at home. It reduces the discomfort of long-term device contact and lowers the chance that data will become unavailable due to incorrect wear.
However, existing solutions to contactless sleep respiration monitoring suffer from several limitations. First, most studies are performed in controlled scenarios and are sensitive to environmental changes. Second, some research is conducted either on healthy subjects or on simulated data (e.g., holding one’s breath consciously to mimic sleep apnea). There is a substantial discrepancy between these data and data from real patients. For example, respiration events occurring during the sleep of patients have different types and durations, which leads to different manifestations in the respiration signals and greatly increases the complexity of detecting respiration events. Recent work on sleep respiration monitoring adopts data from patients. Unfortunately, the detection accuracy is far from satisfactory. Therefore, accurate sleep respiration monitoring in non-controlled environments is still challenging.
In this article, we propose Respnea, a non-intrusive sleep respiration monitoring system using an
ultra-wideband (UWB) device. Respnea enables in-home use to monitor respiration rates and respiration events during the overnight sleep of subjects in a fine-grained way. The main contributions of this article are as follows.
—
We propose an overnight respiration profiling algorithm. By leveraging both amplitude and phase information from UWB signals, we first locate the position of the bed, identify the subject states and sleep duration, and then estimate the respiration rates, thus enabling reliable sleep respiration monitoring in unknown practical scenarios.
—
We design a deep learning model with the multi-head self-attention mechanism and a contrastive learning module to differentiate sleep apnea and hypopnea states from normal respiration states. We further incorporate a multi-window voting mechanism to rectify the incorrect respiration states, and aggregate these states into sleep apnea or hypopnea events.
—
We conduct extensive experiments in both in-hospital and in-home scenarios to evaluate the performance of Respnea. The experimental results show that Respnea achieves a median error of 0.27 bpm in respiration rate estimation and an accuracy of 94.44% on diagnosing the SAHS severity, outperforming the three baselines.
The rest of the article is organized as follows. Section
2 introduces the related work. Section
3 describes the design of Respnea in detail. Section
4 gives the experimental evaluation. Last, the article is concluded in Section
5.
2 Related Work
PSG is clinically used to diagnose SAHS. A PSG system is composed of multiple sensors, including a nasal pressure transducer to measure airflow, chest and abdomen belts to measure thoracic and abdominal respiration motions, a pulse oximeter to measure oxygen saturation, and several electroencephalogram (EEG) sensors to measure brain activity. Based on the overnight PSG data of a subject, medical technicians can diagnose respiration events. Here, EEG data are used to determine whether the subject is asleep or not. Data of airflow, oxygen saturation, and thoracic and abdominal respiration motions are used to detect different respiration events, including central apnea, obstructive apnea, mixed apnea, and hypopnea events. Then AHI can be calculated from these detected respiration events and the total sleep time. All of these can help doctors diagnose the severity of SAHS of the subject. However, a PSG test must be operated by medical professionals. Therefore, PSG is only used in hospitals and cannot be applied to in-home scenarios. In addition, an overnight PSG test is usually expensive.
As an alternative solution to the PSG test, wearable sensors like
photoplethysmography (PPG) based devices have been applied in respiration and sleep monitoring. For example, OptiBreathe [
39] is an on-device earable system for respiration monitoring, which employs multiple signal processing algorithms to measure respiration rates and breathing phases from PPG signals. Ravichandran et al. [
36] propose an encoder-decoder architecture utilizing residual blocks to perform the task of extracting the respiration signal from a given PPG input. Nabavi et al. [
30] propose an intraoral PPG-based sleep monitoring system to measure PPG signals from the oral cavity and estimate AHI. Hahm et al. [
18] propose a sleep apnea identification system via PPG signals, in which they identify the obstructed sleep apnea in the frequency domain analysis that compares the existence of frequency components between normal and abnormal respiration in real time. Massie et al. [
28] extract the respiratory information embedded in the finger PPG data, and train an ensemble of tree classifiers that predicts the central or obstructive nature of each respiratory event. Wei et al. [
49] propose a sleep apnea detection method combining a multi-scale one-dimensional
convolutional neural network (CNN) and a shadow one-dimensional CNN based on dual-channel input from PPG signals. However, these devices still cause discomfort to users because of long-term contact with the skin, which further affects their sleep quality. Meanwhile, the data may become unavailable due to incorrect wear. In addition, since some devices such as PPG sensors rely on measuring light absorbance through the skin, they are known to be affected by skin color and show unstable monitoring performance in subjects with dark skin [
14].
In recent years, contactless sensing, as a newly-emerging technology, has provided a number of solutions for vital sign monitoring [
12,
17,
46,
60,
64]. For example, Raheel et al. [
34] exploit the UWB signals to detect the respiration rates and heart rates of subjects when they are still. Yang et al. [
52] use
millimeter wave (mmWave) signals for vital sign monitoring, in which the mmWave signals can be directed towards the human’s body and the
received signal strength (RSS) of the reflections can be analyzed for accurate estimation of breathing and heart rates. Adib et al. [
5] use fast Fourier transform based methods to capture vital signs from frequency-modulated continuous wave (FMCW) signals in home scenarios. Pi-ViMo [
59] is a physiology-inspired vital sign monitoring system using mmWave signals that employs a template matching method to extract human vital signs by adopting physical models of respiration and cardiac activities. Xu et al. [
50] leverage audio devices on smartphones to collect acoustic signals and propose a generative adversarial network to generate fine-grained breathing waveforms from the extracted breathing patterns in respiration signals. Wang et al. [
45] reconstruct the respiration and heartbeat signals by jointly optimizing the decomposition of all the extracted superposed vital signals over different range-azimuth bins, therefore obtaining fine-grained vital signs from the millimeter wave signals. Further, we note that some work aims to monitor overnight vital signs and detect respiration events, where the monitored subjects are asleep.
For overnight vital sign monitoring, WiFi-Sleep [
55] adopts the fine-grained channel state information of Wi-Fi signals for monitoring respiration rates. Liu et al. [
26] exploit Wi-Fi signals to estimate the overnight respiration rates for one-person and two-person in-bed cases. Hussain et al. [
21] attach RFID tags to the shirt of the subject at the abdominal position, so as to enable respiration monitoring throughout the night. DoppleSleep [
35] employs a commercial 24-GHz short-range Doppler continuous wave radar to continuously track human vital signs, enabling real-time and efficient sleep monitoring. Li et al. [
25] make use of a UWB device to monitor not only the respiration rates during sleep but also the respiration depths and some respiration patterns. Yue et al. [
56] propose DeepBreath, an RF-based respiration monitoring system that can recover the breathing signals of multiple individuals, in which they model interference due to multiple reflected RF signals and demonstrate that the original breathing can be recovered via independent component analysis. Zhang et al. [
61] exploit the time-domain autocorrelation function to estimate breathing rates, in which they apply maximal ratio combining across multiple subcarriers of Wi-Fi signals to maximize the breathing signal and achieve respiratory rate estimation at home. WiResP [
47] is a WiFi-based respiration monitoring system that combines both instantaneous and time-domain information to improve the detection of respiration during sleep.
For respiration event detection, existing solutions can be divided into two categories: rule-based solutions and model-based solutions. Rule-based solutions detect sleep apnea events via hand-crafted rules based on the waveform morphology of the events. For example, TagBreathe [
48] captures apnea events by leveraging the changes in RFID signals. UbiBreathe [
4] applies the discrete wavelet transform to extract the hidden breathing signals from the noisy Wi-Fi signals, and detects apnea events by monitoring the changes in RSS. However, these works on apnea detection rely on simulated datasets, where only central apnea events can be mimicked by holding the breath. Some later work adopts data from real overnight scenarios. For example, Li et al. [
25] leverage the amplitude changes of UWB signals to capture central apnea events during sleep. mmVital [
53] measures the distance between peaks and the amplitude of the reflected RSS of mmWave signals to detect central apnea and hypopnea events. ApneaApp [
31] transforms a phone into an active sonar system emitting sound signals and identifies central apnea, obstructive apnea, and hypopnea events by detecting peaks of reflected signals. However, simple hand-crafted rules cannot deal with complex respiration events from different patients. In contrast, model-based solutions develop different machine learning models to improve the performance of sleep apnea detection. For example, Koda et al. [
24] apply a support vector machine algorithm to the 24-GHz mmWave signals for detecting central and mixed sleep apnea events. Chen et al. [
10] calculate features including physical, respiration, heartbeat, and movement features from mmWave signals, and apply them for sleep apnea classification via the ensemble subspace k-nearest neighbors. Kang et al. [
22] design a hybrid neural network consisting of CNNs and
long short-term memory (LSTM) networks to detect snoring and obstructive sleep apnea. Romero et al. [
38] analyze sleep breathing sounds recorded with a smartphone at home, and apply a CNN model for sleep apnea screening. However, these solutions adopt simple feature extraction, which cannot deal with various complicated sleep apnea events in real patients. What is worse, they adopt coarse-grained labels that will lead to underestimation of the number of sleep apnea events. Besides, most studies on sleep monitoring are performed in controlled environments, which makes them impractical for in-home scenarios.
In comparison to existing solutions, Respnea combines advanced signal processing techniques and a deep learning model, and can accurately detect multiple respiration events in non-controlled environments.
4 Evaluation
4.1 Experimental Setup
We adopt a commercial UWB module, the XETHRU X4M05, as the front-end of Respnea. The module has a center frequency of 7.3 GHz and a bandwidth of 1.4 GHz. We set the frame rate to 17 frames per second in consideration of the range of respiration rates and the signal-to-noise ratio. The UWB module is connected to a Raspberry Pi 4B, and both of them are packaged as a compact device, as shown in Figure
7.
We implement the overnight respiration profiling component in Matlab. \(\alpha _1\), \(\alpha _2\), and \(\alpha _3\) are set to 1, 0.15, and 3, respectively. \(\beta\) is set to 0.4. Also, we implement the respiration event detection component in Python. The layer number of CNN-based encoder \(L_c\) is set to 7, and the layer number of self-attention \(L_s\) is set to 2. The dimension of sequence embeddings \(d_{model}\) is set to 64. The number of heads hd is set to 4. The number of blocks b is set to 2. The temperature parameter \(\tau\) is set to 1. \(\lambda _c\) in the loss function is set to 0.1. We use Adam as the optimizer with a learning rate of 0.001. The batch size N is set to 128.
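For reference, the settings listed above can be collected into a small configuration sketch. The class and field names below are hypothetical and are not taken from the Respnea implementation; only the numeric values and the ordering constraint \(\alpha _2 \lt \alpha _1 \lt \alpha _3\) used in the sensitivity analysis come from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfilingConfig:
    """Thresholds of the overnight respiration profiling component (names are illustrative)."""
    alpha_1: float = 1.0   # threshold for obvious body motions
    alpha_2: float = 0.15  # threshold for moderate body motions
    alpha_3: float = 3.0   # threshold for the wake state
    beta: float = 0.4      # threshold for the respiration-detectable state

    def __post_init__(self):
        # Ordering constraint stated in the hyper-parameter analysis.
        if not (self.alpha_2 < self.alpha_1 < self.alpha_3):
            raise ValueError("expected alpha_2 < alpha_1 < alpha_3")


@dataclass(frozen=True)
class DetectionConfig:
    """Settings of the respiration event detection component (names are illustrative)."""
    cnn_layers: int = 7          # L_c
    attention_layers: int = 2    # L_s
    d_model: int = 64            # dimension of sequence embeddings
    num_heads: int = 4           # hd
    num_blocks: int = 2          # b
    temperature: float = 1.0     # tau
    lambda_c: float = 0.1        # weight of the contrastive term in the loss
    learning_rate: float = 0.001 # Adam learning rate
    batch_size: int = 128        # N


profiling = ProfilingConfig()
detection = DetectionConfig()
```

Validating the ordering constraint at construction time means an invalid threshold combination fails early instead of silently producing inconsistent subject states.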
We deploy Respnea in hospital and home scenarios to produce the two datasets described below. The data collection procedures have been approved by the institutional review board of the Institute of Software, Chinese Academy of Sciences and the medical ethics committee of the Second Affiliated Hospital of Xi’an Jiaotong University, respectively.
In-hospital Dataset. Our in-hospital experiments are conducted in seven hospital wards, and the experimental scenario in the hospital ward is shown in Figure
10. Our device is placed on a nightstand and orientated towards the subject. We collect the data of 18 nights (151 h in total) from 18 subjects aged between 5 and 58, including 5 females and 13 males. PSG sensors are used to monitor the overnight sleep of the subjects. By analyzing the PSG data, the medical technicians can diagnose respiration events, providing the ground truth, including the total number of respiration events, the start time and end time of each event, the total sleep time, and the severity of SAHS. Our in-hospital dataset includes 9 healthy subjects, 1 mild-SAHS subject, 3 moderate-SAHS subjects, and 5 severe-SAHS subjects.
In-home Dataset. Our in-home experiments are conducted in 7 home rooms, and the experimental scenario in the home room is shown in Figure
9. Our device is placed beside the bed and orientated towards the sleeper. We recruit 7 volunteers (1 female and 6 males) aged between 22 and 34, and collect the data of 17 nights (108 h in total). As for ground truth, we adopt a three-lead sleep monitor, i.e., Heal Force PC-3000 as shown in Figure
8, to record respiration rates. Meanwhile, we use an infrared camera (i.e., EZVIZ C6CN camera) to record the overnight body motions of the sleeper.
4.2 Performance on Overnight Respiration Profiling
In this subsection, we conduct experiments on the in-hospital dataset and in-home dataset to evaluate the performance of Respnea on overnight respiration profiling.
4.2.1 Respiration Rate Estimation Evaluation.
The estimation error of respiration rate is defined as the absolute value of the difference between the estimated respiration rate
\(R_E\) and the ground truth
\(R_G\), i.e.,
\(|R_E-R_G|\). We select two non-intrusive methods on respiration monitoring (Raheel et al. [
34] and Liu et al. [
26]) as the baselines. Raheel et al. adopt a UWB device to monitor the respiration rates of stationary subjects; Liu et al. exploit Wi-Fi signals to monitor the respiration rates of the subject during sleep. We implement these two methods and apply them to our datasets. The
cumulative distribution functions (CDFs) of the respiration rate estimation errors of three methods are shown in Figure
11 (in-hospital dataset) and Figure
12 (in-home dataset). Meanwhile, Table
2 shows the median errors and 90-quantile errors of three methods on both datasets. The experimental results show that Respnea achieves the lowest median error and 90-quantile error on both datasets. In particular, we find that some respiration rate errors of the baselines are large. Presumably, this is caused by two factors. (i) The baselines use either amplitude or phase sequences to estimate the respiration rates. However, if merely using amplitude or phase information, then human respiration cannot be sensed effectively in blind-spot locations [
57]. Respnea solves this problem by combining both amplitude and phase sequences to estimate respiration rates. (ii) The baselines fail to capture some body motions and perform respiration rate estimation on these waveforms, causing large estimation errors. Figure
13 shows the respiration rates of a patient within a certain interval of a night using three methods. It can be seen that Respnea fits the ground truth the best. A similar result is obtained on the in-home dataset, as shown in Figure
14.
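The error metrics used in this evaluation can be sketched as follows. This is a minimal illustration that assumes the estimated and ground-truth respiration rates are already aligned sample by sample; the function names and the linear-interpolation quantile are choices made for the example, not details taken from the article.

```python
from statistics import median


def absolute_errors(estimated_bpm, ground_truth_bpm):
    """Per-sample estimation error |R_E - R_G| in breaths per minute."""
    if len(estimated_bpm) != len(ground_truth_bpm):
        raise ValueError("sequences must be aligned and of equal length")
    return [abs(e - g) for e, g in zip(estimated_bpm, ground_truth_bpm)]


def quantile(values, q):
    """Quantile with linear interpolation between order statistics (0 <= q <= 1)."""
    if not values:
        raise ValueError("no values given")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def empirical_cdf(values):
    """Return (sorted errors, cumulative probabilities) for plotting a CDF."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered, [(i + 1) / n for i in range(n)]


errors = absolute_errors([15, 16, 14, 18, 20], [15, 15, 15, 15, 15])
median_error = median(errors)   # 1
p90_error = quantile(errors, 0.9)
```

Reporting both the median and the 90-quantile error separates typical accuracy from tail behavior, which is where the large baseline errors discussed above would appear.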
4.2.2 Respiration Rate Coverage Evaluation.
The temporal coverage of respiration rate is defined as the ratio of the duration where the subject is in State IV over the sleep duration. Compared to methods that only use amplitude signals or phase signals, our approach improves the temporal coverage of the respiration rate by combining amplitude and phase signals. Table
2 shows the average temporal coverage of three methods on both datasets. It can be seen that Respnea outperforms the other two methods in temporal coverage of respiration rate while maintaining low errors in respiration rate estimation. Meanwhile, we also find that the temporal coverage on the in-hospital dataset is noticeably lower than that on the in-home dataset. This is because subjects of the in-home dataset are healthy and spend more time in State IV than subjects of the in-hospital dataset. In addition, the estimation error of our approach on the in-hospital dataset is smaller than the estimation error on the in-home dataset. This fact needs to be analyzed together with the results on temporal coverage. Specifically, on the in-hospital dataset, the estimation error is lower, but the duration for which the respiration rate estimation can be performed is shorter. On the in-home dataset, the opposite is true. This is in line with our expectations.
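The temporal coverage defined above reduces to a simple ratio. The sketch below assumes that the subject state is available as one label per second, restricted to the sleep duration, and that State IV is encoded as the string "IV"; both the encoding and the function name are assumptions made for illustration.

```python
def temporal_coverage(states_during_sleep, detectable_state="IV"):
    """Fraction of the sleep duration spent in the respiration-detectable state."""
    if not states_during_sleep:
        raise ValueError("sleep duration is empty")
    detectable = sum(1 for s in states_during_sleep if s == detectable_state)
    return detectable / len(states_during_sleep)


# Ten seconds of sleep: six in State IV, two in State III, two in another state.
coverage = temporal_coverage(["IV"] * 6 + ["III"] * 2 + ["II"] * 2)  # 0.6
```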
4.2.3 Body Motion Detection Evaluation.
We use the motion sensor data from the PSG devices as the ground truth of body motions (State III) for the in-hospital dataset, and the videos recorded by the infrared camera as the ground truth of body motions for the in-home dataset. Table
3 summarizes the body motion detection results on both datasets by Respnea. The experimental results show that Respnea attains high F1-scores on detecting body motions in both the in-hospital scenario (0.9175) and the in-home scenario (0.9315), indicating that Respnea can accurately detect body motions of different subjects in different scenarios.
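For readers less familiar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below computes it from binary motion labels; whether the article scores motions per second or per motion episode is not stated here, so the per-sample comparison is an assumption.

```python
def f1_score(predicted, actual):
    """F1-score for binary labels, where truthy values mark body motion."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Two correct detections, one false alarm, one miss.
score = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```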
In short, Respnea attains low errors and high temporal coverage in respiration rate estimation, and achieves excellent performance on body motion detection in both in-hospital and in-home datasets. The experimental results illustrate the effectiveness and generalization of Respnea on overnight respiration profiling.
4.3 Performance on Respiration Event Detection
In this subsection, we use in-hospital data to evaluate the performance of Respnea on respiration event detection.
We divide the in-hospital dataset into 18 subsets, each containing one subject’s data. Each time, one subset is chosen for testing, another subset is chosen from the remaining 17 subsets for validation, and the other 16 subsets are used for training. This procedure is repeated 18 times, yielding the SAHS severity for every subject, and the final performance of the model is the average of the 18 results.
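The leave-one-subject-out protocol can be sketched as a split generator. The article does not say how the validation subject is picked from the remaining 17, so the cyclic choice below (the next subject in the list) is purely an assumption for illustration, as is the function name.

```python
def leave_one_subject_out(subject_ids):
    """Yield (train, validation, test) splits, one per test subject."""
    n = len(subject_ids)
    if n < 3:
        raise ValueError("need at least three subjects")
    for i, test_subject in enumerate(subject_ids):
        # Assumption: use the next subject (cyclically) for validation.
        val_subject = subject_ids[(i + 1) % n]
        train = [s for s in subject_ids if s not in (test_subject, val_subject)]
        yield train, val_subject, test_subject


splits = list(leave_one_subject_out(list(range(1, 19))))  # 18 subjects
```

Splitting by subject rather than by sample keeps all windows of one person in a single partition, so the test result reflects performance on an unseen subject.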
We label the signal data per second and obtain 49,724 labels of sleep apnea events, 18,923 labels of hypopnea events, and 480,173 labels of normal respiration. To alleviate the class imbalance problem, we adopt two different strategies: (i) for the training set and validation set, we first locate all respiration events based on ground truth. For each event, we obtain a sequence from the UWB signals, whose duration is from 10 seconds before the start of the event to 10 seconds after the end of the event. This sequence contains not only a complete respiration event, but also the transition from normal respiration to a respiration event and the reversed transition. We adopt a 30-second sliding window with a 1-second sliding step to segment the sequence into multiple respiration event samples. Meanwhile, we use a 30-second sliding window with a 10-second sliding step to segment the sequence without any respiration events, obtaining the normal respiration samples. In this way, the training set contains 820,775 labels of sleep apnea events, 292,743 labels of hypopnea events, and 1,532,482 labels of normal respiration in the end; (ii) for the testing set, we produce samples by sliding a 30-second window over the sleep signals with a step of 10 seconds.
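The segmentation described above can be illustrated with window start times in seconds. The helper names are hypothetical, and the sketch assumes that only windows lying fully inside a segment are kept, which the article does not state explicitly.

```python
def window_starts(segment_start, segment_end, window=30, step=1):
    """Start times of all windows that fit inside [segment_start, segment_end)."""
    last_start = segment_end - window
    if last_start < segment_start:
        return []
    return list(range(segment_start, last_start + 1, step))


def event_window_starts(event_start, event_end, margin=10, window=30, step=1):
    """Windows over an event padded by `margin` seconds on both sides."""
    segment_start = max(0, event_start - margin)
    return window_starts(segment_start, event_end + margin, window, step)


# A 20-second event from t = 100 s to t = 120 s: padded segment is 90-130 s.
event_samples = event_window_starts(100, 120)           # starts 90, 91, ..., 100
# A 120-second stretch of normal respiration with the coarser 10-second step.
normal_samples = window_starts(0, 120, window=30, step=10)  # starts 0, 10, ..., 90
```

With these settings the 20-second event yields 11 overlapping samples, while 120 seconds of normal respiration yields 10, which shows how the finer step oversamples the minority classes.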
Subsequently, we evaluate the performance of Respnea by estimating the total number of respiration events, identifying the total sleep time, and diagnosing the SAHS severity of each subject.
4.3.1 Estimating the Number of Respiration Events.
Figure
15 shows the scatter plot of 18 subjects comparing the total number of respiration events estimated by Respnea to that of ground truth. The intraclass correlation coefficient [
6] between Respnea and the ground truth is 0.9620, indicating that the total number of respiration events estimated by Respnea and the ground truth are highly correlated.
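Counting events requires turning per-second respiration states into discrete events. The sketch below groups consecutive identical labels into runs and keeps abnormal runs of at least 10 seconds. It is only an illustration: the label strings, the minimum-duration rule, and the function name are assumptions, and the multi-window voting step of Respnea is not reproduced here.

```python
from itertools import groupby


def states_to_events(states, min_duration=10, normal="normal"):
    """Group per-second labels into (label, start_s, end_s) events."""
    events = []
    t = 0
    for label, run in groupby(states):
        length = sum(1 for _ in run)
        # Keep only abnormal runs that last long enough (assumed rule).
        if label != normal and length >= min_duration:
            events.append((label, t, t + length))
        t += length
    return events


states = (["normal"] * 20 + ["apnea"] * 15 + ["normal"] * 5
          + ["hypopnea"] * 4 + ["normal"] * 10 + ["hypopnea"] * 12)
events = states_to_events(states)
```

In this example the 4-second hypopnea run is discarded, leaving one apnea event and one hypopnea event.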
4.3.2 Estimating the Total Sleep Time.
Figure
16 shows the total sleep time estimated by Respnea and the ground truth (estimated by EEG sensors). We can see that the median error of Respnea is 21 minutes, and the mean error is 49 minutes. Some subjects show relatively large errors for two reasons: (i) some subjects wake up several times throughout the night. During each waking period, they remain stationary and try to fall asleep, resulting in overestimation of the total sleep time by Respnea; (ii) some severe SAHS subjects move frequently while respiration events occur continuously, even though they remain asleep, resulting in underestimation of the total sleep time by Respnea. However, these errors are acceptable from the viewpoint of the classification accuracy of SAHS severity.
4.3.3 Identifying SAHS Severity.
We estimate AHI using Equation (
15), by which the subject can be diagnosed as normal (AHI
\(\lt\) 5), mild-SAHS (5
\(\le\) AHI
\(\lt\) 15), moderate-SAHS (15
\(\le\) AHI
\(\lt\) 30), and severe-SAHS (AHI
\(\ge\) 30). Table
4 shows the confusion matrix of the SAHS severity classification by Respnea. Among the 18 subjects, we correctly predict the SAHS severity of 17 subjects, reaching an accuracy of
\(94.44\%\). Meanwhile, we select three non-intrusive methods proposed by Nandakumar et al. [
31], Romero et al. [
38], and Kang et al. [
22] as the baselines. Nandakumar et al. use hand-crafted rules for detecting respiration events, Romero et al. adopt a multi-layer CNN model and Kang et al. adopt a CNN-LSTM model for apnea detection. Experimental results of the three methods on the SAHS severity classification are shown in Table
5. It can be seen that Respnea outperforms the other three methods in all metrics.
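The severity thresholds above translate directly into code. The sketch assumes that Equation (15) follows the standard definition of AHI as the number of apnea and hypopnea events per hour of sleep; the function names are illustrative.

```python
def apnea_hypopnea_index(num_events, total_sleep_seconds):
    """AHI as events per hour of sleep (standard definition, assumed here)."""
    if total_sleep_seconds <= 0:
        raise ValueError("total sleep time must be positive")
    return num_events / (total_sleep_seconds / 3600)


def sahs_severity(ahi):
    """Map an AHI value to the four severity classes used in the article."""
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


ahi = apnea_hypopnea_index(num_events=120, total_sleep_seconds=6 * 3600)  # 20.0
severity = sahs_severity(ahi)                                             # "moderate"
accuracy_percent = round(17 / 18 * 100, 2)                                # 94.44
```

Because AHI divides by the total sleep time, an error in the estimated sleep time changes the severity class only when it moves the AHI across one of the thresholds 5, 15, or 30.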
4.4 Ablation Study
In this subsection, we evaluate the effectiveness of our CNN-based encoder module, multi-head self-attention module, contrastive learning module, and the amplitude-phase combining technique with the ablation study.
We derive five variants from Respnea as follows:
—
Respnea-CNN: Respnea without the CNN-based encoder.
—
Respnea-SelfAtt: Respnea without the multi-head self-attention module.
—
Respnea-Contra: Respnea without the contrastive learning module.
—
Respnea-Amp: Respnea without the amplitude signals; only the phase signals are sent to the model as input.
—
Respnea-Pha: Respnea without the phase signals; only the amplitude signals are sent to the model as input.
Table
6 summarizes the comparison results of Respnea and its five variants on diagnosing the SAHS severity. We can see that both Respnea-CNN and Respnea-SelfAtt suffer severe declines in performance, demonstrating that both the CNN-based encoder and multi-head self-attention module are effective in improving the performance. The performance of Respnea-Contra is lower than that of Respnea, indicating that the contrastive learning module can help extract a robust representation. Further, the results of Respnea-Amp and Respnea-Pha show that both amplitude and phase information of the UWB signals provide distinct information, and combining them can help improve the model performance. Meanwhile, we find that both Respnea-Amp and Respnea-Pha have better performance than Respnea-CNN and Respnea-SelfAtt, indicating the effectiveness of the model architecture in Respnea.
4.5 Impact of Different Hyper-parameters
4.5.1 Overnight Respiration Profiling.
We first analyze the sensitivity of the hyper-parameters in the overnight respiration profiling component, including the threshold
\(\alpha _1\) to determine whether the subject performs obvious body motions, the threshold
\(\alpha _2\) to determine whether the subject performs moderate body motions, the threshold
\(\alpha _3\) to determine whether the subject is awake, and the threshold
\(\beta\) to determine whether the subject is stationary and breathing regularly. These parameters will impact the performance of respiration rate estimation by identifying the subject states, and the SAHS classification by affecting the total sleep time. The default values of
\(\alpha _1\),
\(\alpha _2\),
\(\alpha _3\), and
\(\beta\) are 1, 0.15, 3, and 0.4. We vary the value of one parameter while keeping the other parameters fixed, with the restriction of
\(\alpha _2 \lt \alpha _1 \lt \alpha _3\). The experimental results are shown in Table
7, from which we have the following findings:
—
Threshold of obvious body motions \(\alpha _1\). As \(\alpha _1\) increases, the subject is less likely to be identified as in the obvious body motion state. The experimental results show that our system achieves the optimal performance on body motion detection and SAHS classification with \(\alpha _1\) = 1. Also, the compromise between estimation error and temporal coverage of respiration rate is reached by setting \(\alpha _1\) to 1. We can observe that the performance of body motion detection and SAHS classification tends to be stable when \(\alpha _1\) is relatively large, and decreases sharply when \(\alpha _1\) is relatively small. This is because the large \(\alpha _1\) tends to recognize a subject as being in a state of moderate motion rather than obvious motion, and the small \(\alpha _1\) leads to identifying a subject as being in a motion state rather than a stationary state. The former does not affect the performance of the motion detection, but the latter does.
—
Threshold of moderate body motions \(\alpha _2\). The results show that the performance of body motion detection and SAHS classification gradually increases as \(\alpha _2\) varies from 0.05 to 0.15, and decreases as \(\alpha _2\) continues to increase. The reason is that the higher threshold of \(\alpha _2\) leads to more missed detections of moderate body motions. Also, the value of \(\alpha _2\) does not impact the respiration evaluation part as this parameter is only used in detecting moderate motions and is irrelevant to the estimation of respiration rate.
—
Threshold of wake state \(\alpha _3\). This parameter is designed to decide whether the body motions are intense enough to indicate the subject is awake, and as \(\alpha _3\) increases, the subject is less likely to be identified as in the wake state. We vary the value of \(\alpha _3\) and the SAHS classification achieves the best performance when \(\alpha _3\) is set to 3 or 4. This parameter is irrelevant to the estimation of respiration rate and body motion, because we start to identify the wake state after finishing the respiration rate estimation and body motion detection in our algorithm.
—
Threshold of respiration detectable state \(\beta\). As can be seen from Table
7, the smaller
\(\beta\) increases the temporal coverage of respiration rates but results in larger estimation errors. And the situation is completely the opposite for the larger
\(\beta\). Therefore, we make a tradeoff and set
\(\beta\) to 0.4, at which the body motion detection and SAHS classification also achieve the best performance.
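The one-parameter-at-a-time protocol used in this sensitivity analysis can be sketched as follows. The default values and the ordering constraint come from the text; the candidate grid and all names are hypothetical and do not reproduce the values evaluated in the article.

```python
DEFAULTS = {"alpha_1": 1.0, "alpha_2": 0.15, "alpha_3": 3.0, "beta": 0.4}


def satisfies_constraint(cfg):
    """Ordering restriction alpha_2 < alpha_1 < alpha_3 from the text."""
    return cfg["alpha_2"] < cfg["alpha_1"] < cfg["alpha_3"]


def one_at_a_time(defaults, grid):
    """Yield (parameter name, configuration) with a single parameter changed."""
    for name, values in grid.items():
        for value in values:
            cfg = dict(defaults)
            cfg[name] = value
            if satisfies_constraint(cfg):
                yield name, cfg


# Hypothetical candidate values for alpha_1; 0.1 and 4.0 violate the ordering.
grid = {"alpha_1": [0.1, 0.5, 1.0, 2.0, 4.0]}
valid_alpha_1 = [cfg["alpha_1"] for _, cfg in one_at_a_time(DEFAULTS, grid)]
```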
4.5.2 Respiration Event Detection.
We now observe how the hyper-parameters in the respiration event detection component affect the diagnosing performance, including the layer number of CNN-based encoder
\(L_c\), the layer number of self-attention
\(L_s\), the dimension of sequence embeddings
\(d_{model}\), the temperature parameter
\(\tau\) of the contrastive learning module, and the batch size
N. Table
8 shows the performance of Respnea with one of the hyper-parameters varying while keeping other hyper-parameters at their optimal settings. From Table
8, we have the following observations:
—
Layer number of CNN \(L_c\). The results show that stacking CNN layers helps extract more complex features from the respiration signals, and can boost diagnosing performance of Respnea. The optimal setting for \(L_c\) is seven layers, and the performance declines when \(L_c\) is greater than 7, largely due to overfitting.
—
Layer number of self-attention \(L_s\). The results show that Respnea benefits from a smaller \(L_s\), and reaches the optimal performance when \(L_s\) is set to two layers. This phenomenon indicates that a shallow self-attention structure is enough for Respnea to learn the transition patterns of the respiration events, after a deep CNN-based encoder.
—
Dimension of sequence embeddings \(d_{model}\). As can be seen in Table
8, the performance of the model gradually increases as
\(d_{model}\) varies from 8 to 64, and starts to decrease when
\(d_{model}\) continues to increase. The reason for this phenomenon may be that the large dimension of embeddings leads to overfitting. Therefore, we set
\(d_{model}\) to 64.
—
Temperature parameter \(\tau\). In the contrastive learning module, \(\tau\) is introduced to control the strength of penalties on hard negative samples. The result shows that the optimal setting for \(\tau\) is 1, which indicates the model does not need too many penalties on hard negative samples.
—
Batch size N. The results show that Respnea performs best when N is set to 128, and that a smaller or larger N degrades performance.
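The one-at-a-time protocol behind Table 8 can be sketched as follows. This is a minimal illustration rather than the authors' tuning script: only the optimal values (7, 2, 64, 1, and 128) come from the text, while the candidate grids and the helper name `one_at_a_time` are hypothetical.

```python
from typing import Dict, Iterator, List

# Optimal settings reported in the observations above.
OPTIMAL: Dict[str, float] = {"L_c": 7, "L_s": 2, "d_model": 64, "tau": 1.0, "N": 128}

# Hypothetical candidate grids (assumption): the text does not list the tested values.
CANDIDATES: Dict[str, List[float]] = {
    "L_c": [3, 5, 7, 9],
    "L_s": [1, 2, 4, 6],
    "d_model": [8, 16, 32, 64, 128],
    "tau": [0.1, 0.5, 1.0, 2.0],
    "N": [32, 64, 128, 256],
}


def one_at_a_time(optimal: Dict[str, float],
                  candidates: Dict[str, List[float]]) -> Iterator[Dict[str, float]]:
    """Yield configurations that differ from `optimal` in exactly one hyper-parameter."""
    for name, values in candidates.items():
        for value in values:
            if value == optimal[name]:
                continue  # the all-optimal configuration is the shared baseline
            config = dict(optimal)
            config[name] = value
            yield config


configs = list(one_at_a_time(OPTIMAL, CANDIDATES))
for config in configs[:3]:
    print(config)
```

Each generated configuration would be trained and evaluated separately. Because only one hyper-parameter changes at a time, the number of runs grows with the sum of the grid sizes rather than their product, at the cost of not capturing interactions between hyper-parameters.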
4.6 Impact of Different Factors
In this section, we evaluate the impact of different factors on the performance of respiration rate estimation, including sleeping posture and the distance and direction between the subject and the device. Note that we cannot evaluate the impact of these factors on SAHS classification, because SAHS severity is a metric that assesses the natural sleep of a subject. Each subject is diagnosed with one of the four SAHS severities based on his/her whole-night sleep, during which the subject may move around in the bed, consciously or unconsciously, and adopt different sleeping postures, which leads to different distances and directions between the subject and the device. Since forcing a subject to stay still or to keep a certain sleeping posture during sleep is impractical and violates ethical guidelines, SAHS classification under a fixed sleeping posture or a fixed distance/direction to the device is not evaluated.
4.6.1 Impact of Sleeping Posture.
To observe the impact of different sleeping postures on performance, we evaluate Respnea under four typical sleeping postures on the in-hospital dataset, including supine, left lateral, right lateral, and prone positions, as shown in Figure
17. The posture information during sleep is recorded by the PSG sensors. The results in Table
9 indicate that the errors under left and right lateral postures are smaller than those under supine and prone postures. This is because our device is placed on the nightstand beside the bed: when the subjects change from lateral positions to supine or prone positions, the body’s reflection surface changes from the chest or the back to the side of the body. Both the effective signal reflection surface and the motion displacement are reduced, which slightly increases the errors.
4.6.2 Impact of Subject-device Distance.
In this part, we investigate the impact of different subject-device distances on the performance of Respnea. During sleep, the subject unconsciously moves his/her body, which results in varying distances from the UWB device. From the experimental results on the in-hospital dataset, we select subject-device distances during sleep from 40 to 100 cm in steps of 15 cm, as shown in Figure
18(a). The experimental results in Figure
18(b) show that Respnea achieves median errors of 0.20, 0.23, 0.25, 0.28, and 0.31 bpm at these subject-device distances, respectively. The estimation error increases slightly with distance, most likely because the reflected signals weaken as the distance grows.
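The per-distance analysis can be sketched as below, assuming each respiration rate estimate is assigned to the nearest of the five distance bins and the median absolute error is taken per bin. The binning rule, the helper names, and the sample values are illustrative assumptions; only the 40 to 100 cm range and the 15 cm step come from the text.

```python
import statistics
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Distance bins from the text: 40 to 100 cm in steps of 15 cm.
DISTANCE_BINS_CM: List[int] = list(range(40, 101, 15))  # [40, 55, 70, 85, 100]


def nearest_bin(distance_cm: float, bins: List[int] = DISTANCE_BINS_CM) -> int:
    """Return the bin centre closest to the measured distance (assumed rule)."""
    return min(bins, key=lambda b: abs(b - distance_cm))


def median_error_per_bin(
    samples: Iterable[Tuple[float, float, float]]
) -> Dict[int, float]:
    """samples holds (distance_cm, estimated_bpm, reference_bpm) triples."""
    errors = defaultdict(list)
    for distance_cm, estimated_bpm, reference_bpm in samples:
        errors[nearest_bin(distance_cm)].append(abs(estimated_bpm - reference_bpm))
    return {b: statistics.median(e) for b, e in sorted(errors.items())}


# Made-up samples for illustration only; these are not measurements from the paper.
demo = [(41.0, 15.2, 15.0), (43.0, 15.0, 15.4), (99.0, 14.0, 14.3)]
print(median_error_per_bin(demo))
```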
4.6.3 Impact of Subject-device Direction.
Our UWB device has a typical opening angle of 65
\(^{\circ }\) azimuth and elevation. To evaluate the impact of different subject-device directions on Respnea, we first orient the device towards the subject at horizontal angles of 0
\(^{\circ }\), 20
\(^{\circ }\), 40
\(^{\circ }\), and 60
\(^{\circ }\) as shown in Figure
19(a). It can be seen from Figure
19(b) that Respnea achieves median errors of 0.17, 0.21, 0.25, and 0.30 bpm on respiration rate estimation, respectively. Subsequently, we adjust the height of the device so that the vertical angle between the device and the subject varies from 0
\(^{\circ }\) to 60
\(^{\circ }\), as shown in Figure
20(a). Figure
20(b) shows that Respnea achieves median errors of 0.17, 0.19, 0.20, and 0.28 bpm on respiration rate estimation, respectively. The results for both horizontal and vertical angles indicate that the error increases slightly with the angle, which is due to the gradually weakening signal reflection.
4.7 Generalization Capability Test
To evaluate the performance of Respnea on unseen data from non-hospital scenarios, we conduct experiments in home scenarios, including dormitories and bedrooms, as shown in Figure
21.
We collect 15 nights of data from 6 subjects in bedrooms and 15 nights of data from 6 subjects in dormitories. To obtain the ground-truth SAHS severity in these non-hospital scenarios, we use a sleeping pad (i.e., Withings Sleep [
3]) for obtaining the SAHS severity. Respnea trained on the in-hospital dataset is employed to diagnose the SAHS severities of the subjects in the new datasets. The experimental results are shown in Table
10. Respnea also performs well in diagnosing SAHS severity in home scenarios, including bedrooms and dormitories, which supports the generalizability of our model.
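For reference, the mapping from AHI to the four SAHS severities can be sketched as follows. The cut-offs of 5, 15, and 30 events per hour are the widely used clinical thresholds for adults and are an assumption here, since this section does not restate the thresholds used by Respnea; the function names are hypothetical.

```python
def ahi(num_events: int, sleep_hours: float) -> float:
    """Apnea-hypopnea index: apnea and hypopnea events per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return num_events / sleep_hours


def sahs_severity(ahi_value: float) -> str:
    """Map an AHI value to a severity label (assumed cut-offs: 5, 15, 30)."""
    if ahi_value < 5:
        return "normal"
    if ahi_value < 15:
        return "mild"
    if ahi_value < 30:
        return "moderate"
    return "severe"


# Illustrative night: 56 detected events over 7 hours of sleep gives an AHI of 8.
print(sahs_severity(ahi(56, 7.0)))
```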
4.8 Case Study
In this subsection, we further evaluate the performance of Respnea on overnight respiration profiling and respiration event detection using two specific cases.
4.8.1 Profiling Overnight Respiration.
We take the data of a certain subject from the in-hospital dataset. Figure
22 shows the subject state distribution during the monitoring period (from 19:00 on the first day to 11:00 on the second day). As can be seen in Figure
22, the actions of the subject and the doctor, and the subject states are recorded: the subject is in the bed and stays awake between 19:00 and 19:45 (Figure
22 shows that the subject is in State III or State IV during this period). The subject leaves the bed at 19:45 and returns at 20:07 (State I during this period). Then the doctor helps the subject wear PSG sensors for nearly 20 minutes (State II), after which the subject falls asleep (Respnea detects the sleep onset and exit as referred to above). The subject wakes up at about 5:30 on the second day (transition from State IV to State III) and leaves the bed once (State I). After interacting with others in the same room at about 7:00 several times (State II), the subject lies in the bed until 9:33. Then the subject leaves the room (State I). The doctor turns off our device at around 11:00 and ends the monitoring.
This case illustrates that Respnea can track the distribution of subject states over a long monitoring period, which makes it a feasible solution for practical scenarios that are not known in advance.
4.8.2 Detecting Respiration Events.
Figure
23 shows an example of detecting respiration events. The first subfigure is the ground-truth waveform (i.e., thoracic motions in PSG) for a certain period of time, in which an obstructive sleep apnea event occurs. The second subfigure shows the amplitude and phase sequences of the UWB signals during the same period, and the last subfigure shows the per-second predictions of our model. It can be seen that (i) the UWB waveforms are highly similar to the ground-truth waveform and (ii) our model accurately predicts the sleep state at each second and rectifies unreasonable predictions using the voting-based method (see Section
3.3.2). Eventually, a sleep apnea event is detected after the model aggregates the continuous predictions of sleep states.
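The two post-processing steps in this case, rectifying per-second predictions by voting and aggregating consecutive predictions into events, can be sketched as follows. This is one plausible form under stated assumptions, not the method of Section 3.3.2: the sliding-window majority vote, the window size, the label names, and the 10-second minimum duration (taken from the clinical definition of apnea in the Introduction) are all illustrative choices.

```python
from collections import Counter
from typing import List, Optional, Tuple


def majority_smooth(labels: List[str], window: int = 5) -> List[str]:
    """Replace each per-second label by the majority label in a centred window."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighbourhood = labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighbourhood).most_common(1)[0][0])
    return smoothed


def extract_events(labels: List[str], target: str = "apnea",
                   min_duration_s: int = 10) -> List[Tuple[int, int]]:
    """Return (start, end) second indices, end exclusive, of sufficiently long runs."""
    events: List[Tuple[int, int]] = []
    start: Optional[int] = None
    sentinel: List[Optional[str]] = [None]  # closes a run that reaches the end
    for i, label in enumerate(labels + sentinel):
        if label == target and start is None:
            start = i
        elif label != target and start is not None:
            if i - start >= min_duration_s:
                events.append((start, i))
            start = None
    return events


# Illustrative sequence: one isolated "normal" second splits a 13-second apnea.
per_second = ["normal"] * 5 + ["apnea"] * 6 + ["normal"] + ["apnea"] * 6 + ["normal"] * 5
print(extract_events(per_second))                   # raw predictions
print(extract_events(majority_smooth(per_second)))  # after voting
```

In this example the raw predictions contain two 6-second runs, neither of which reaches the minimum duration, whereas the smoothed sequence yields a single event. This shows why rectifying isolated mispredictions before aggregation matters.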
5 Discussion
In this section, we discuss the possible cost and overhead of deploying our system in real-life settings (e.g., in-home scenarios) for daily use.
Hardware. All the components of the device used in our system are commercial-off-the-shelf (COTS), including a consumer-level UWB chip [
2] and a Raspberry Pi [
1].
Network. The current version of our system is a component-based system, not yet an end-to-end system, with data being collected first and then exported from the device for further analysis. However, for the envisioned use of our system as a commercial product in daily life, continuous data collection, transmission, and analysis will be necessary. Therefore, IoT SIM cards and cloud servers are needed for the actual system deployment.
User Interface. A mobile app would need to be developed to present the sleep reports (including overnight respiration rates, motions, and SAHS severity) to users.