MoRe-Fi: Motion-robust and Fine-grained Respiration Monitoring via Deep-Learning UWB Radar


Tianyue Zheng1,2 , Zhe Chen1,3 , Shujie Zhang1 , Chao Cai1 , Jun Luo1
1 School of Computer Science and Engineering, Nanyang Technological University, Singapore
2 Energy Research Institute, Interdisciplinary Graduate Programme, Nanyang Technological University, Singapore
3 China-Singapore International Joint Research Institute, Guangzhou, China

Email: {tianyue002, shujie002, junluo}@ntu.edu.sg, chenz@ssijri.com, chriscai@hust.edu.cn

ABSTRACT

Crucial for healthcare and biomedical applications, respiration monitoring often employs wearable sensors in practice, causing inconvenience due to their direct contact with human bodies. Therefore, researchers have been constantly searching for contact-free alternatives. Nonetheless, existing contact-free designs mostly require human subjects to remain static, largely confining their adoption in everyday environments where body movements are inevitable. Fortunately, radio-frequency (RF) enabled contact-free sensing, though suffering motion interference inseparable by conventional filtering, may offer a potential to distill respiratory waveform with the help of deep learning. To realize this potential, we introduce MoRe-Fi to conduct fine-grained respiration monitoring under body movements. MoRe-Fi leverages an IR-UWB radar to achieve contact-free sensing, and it fully exploits the complex radar signal for data augmentation. The core of MoRe-Fi is a novel variational encoder-decoder network; it aims to single out the respiratory waveforms that are modulated by body movements in a non-linear manner. Our experiments with 12 subjects and 66-hour data demonstrate that MoRe-Fi accurately recovers respiratory waveform despite the interference caused by body movements. We also discuss potential applications of MoRe-Fi for pulmonary disease diagnoses.

CCS CONCEPTS

• Human-centered computing → Ubiquitous and mobile computing design and evaluation methods.

KEYWORDS

Respiratory waveform recovery, contact-free RF-sensing, commercial-grade radars, deep learning, variational encoder-decoder.

ACM Reference Format:
T. Zheng, Z. Chen, S. Zhang, C. Cai, and J. Luo. 2021. MoRe-Fi: Motion-robust and Fine-grained Respiration Monitoring via Deep-Learning UWB Radar. In The 19th ACM Conference on Embedded Networked Sensor Systems (SenSys’21), November 15–17, 2021, Coimbra, Portugal. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3485730.3485932

1 INTRODUCTION

Respiratory diseases [12, 70, 80] are so common that the deadly health conditions caused by them influence people worldwide: they affect 2.4% of the global population [76] and cause 7.6 million deaths per year globally [60]. Fortunately, most of them can be detected in their early stages with proper monitoring, as symptoms of these diseases, such as airflow obstruction [17] and shortness of breath [29], are usually reflected on different vital signs including respiratory rate [19, 62] and fine-grained patterns in the respiratory waveform [5, 73]. Traditionally, to obtain such vital signs for disease diagnosis, wearable devices ranging from smartwatches to medical sensors have been used [14, 24, 25, 34, 53, 56]. Unfortunately, the inconvenience caused by the contact (even intrusive) nature of these sensors has prevented them from being widely adopted under daily environments. To overcome the drawbacks of contact sensing for achieving ubiquitous respiration monitoring, contact-free sensing has attracted increasing attention from both academia and industry [3, 36, 57, 69, 77, 78, 83, 84, 90]. Among these developments, radio-frequency (RF) sensing leveraging various commercial-grade radars has demonstrated a promising future [3, 9, 84, 90], thanks to its noise resistance at a reasonable cost.

Ideally, monitoring should be performed continuously so that respiration patterns can be used as markers for intervention or evidence for diagnoses [8, 18]. However, existing RF-sensing systems fail to deliver continuous monitoring as they often assume static human subjects. Facing strong body movements, these systems have to suspend respiration monitoring [3, 85]; otherwise, they would obtain noise-like readings as shown in Figure 1. In reality, we cannot expect subjects to remain static during monitoring for two major reasons. On one hand, human subjects may undergo a constant motion, e.g., typewriting or exercising. On the other hand, unintentional posture drifts [64] and unconscious movements (e.g., turning-over during sleep) may always occur.
Figure 1: If the human subject is in motion, previous RF-sensing solutions for respiration monitoring fail.
Therefore, the static assumption contradicts the intention of continuous monitoring, rendering existing systems less applicable to real-life scenarios. Now the key question becomes: can we design a fine-grained respiration monitoring system working on subjects with body movements?

Answering this question faces four major challenges. First of all, recovering respiratory waveform with RF-sensing is far from trivial even for static subjects, as existing systems often rely on linear filtering (potentially susceptible to nonlinear interference) to obtain coarse-grained waveform or only respiratory rate [3, 85, 87]. Secondly, the effect of body movements on the complex RF signal has never been put under scrutiny; prior art resorts to either phase or amplitude of RF signals [9, 85, 90]. Thirdly, the reflected signals caused by body movements and respiration are composed in a non-linear manner due to varying body positions, making it extremely hard to separate respiration from motion interference. Last but not least, reflected signals by body movements exhibit various statistical properties that cannot be readily processed by a single model-based method. So far, very few proposals have touched motion-robust respiration monitoring: the RF-sensing methods either incur a high complexity [9] or handle very small-scale movements [90], while the acoustic sensing methods can be susceptible to real-life acoustic interference [69, 83].

To tackle these challenges, we propose MoRe-Fi for motion-robust and fine-grained respiration monitoring. We construct MoRe-Fi based on a commercial-grade IR-UWB radar platform [10], leveraging its large bandwidth to achieve high-resolution motion sensing. Given the raw motion-induced IR-UWB signal embedded with fine-grained spatial information, we first analyze how respiration is modulated in the complex in-phase and quadrature (I/Q) components, as well as the limitations of model-based methods in extracting respiratory waveform. Based on the characteristics of I/Q-domain signal representation, we design a corresponding data augmentation process for enriching our dataset, which further drives an IQ Variational Encoder-Decoder (IQ-VED) for robust recovery of respiratory waveform. As the core of MoRe-Fi, IQ-VED captures the complementary I/Q information from the radar signal and encodes it to an interpretable latent representation for facilitating fine-grained respiratory waveform extraction. Our major contributions in designing and implementing MoRe-Fi are as follows:

• To the best of our knowledge, MoRe-Fi is the first fine-grained respiration monitoring system operating in a low-complexity and full-scale motion-robust manner.
• We analyze the necessity to process radar signal in its complex I/Q domain, instead of leveraging incomplete information, such as only phase or amplitude of the signal.
• We propose IQ-VED for recovering and refining respiratory waveform. This novel encoder-decoder architecture fully utilizes the I/Q components together and achieves fine-grained respiratory waveform recovery leveraging the generalizability brought by the variational inference.
• We conduct extensive evaluations on MoRe-Fi with a 66-hour dataset; the results strongly confirm its excellent waveform recovery ability under body movements.

The rest of the paper is organized as follows. Section 2 introduces the background for respiration monitoring with IR-UWB radar. Section 3 presents the system design of MoRe-Fi, with detailed implementation discussed in Section 4. Section 5 reports the evaluation results. Potential applications of MoRe-Fi are discussed in Section 6. Related works are presented in Section 7. Finally, Section 8 concludes this paper and points out future directions.

2 BACKGROUND AND PRELIMINARIES

In this section, we carefully study the principles of respiration monitoring with an IR-UWB radar. We focus on modeling the radar signal to reflect how it represents respiration in the I/Q domain, hence laying a theoretical foundation for the robust respiration monitoring discussed in Section 3.

2.1 Capturing Breath with IR-UWB Radar

We first explain the working principles of IR-UWB radar on how it captures human respiration. As shown by the system diagram of IR-UWB radar in Figure 2, each frame 𝑥(𝑡) is formed by a baseband Gaussian pulse 𝑠(𝑡) modulated by a cosine carrier at frequency 𝑓c. This frame is then transmitted and reflected by a moving human chest and other irrelevant objects, so as to produce the received signal 𝑥(𝑡) ∗ ℎ(𝑡), where ℎ(𝑡) denotes the channel impulse response. After I/Q downconversion, the demodulated complex signal becomes 𝑟(𝑡) = 𝑟I(𝑡) + 𝑗𝑟Q(𝑡). In Figure 3a, we plot the amplitude of the complex signal; objects at different distances can be clearly differentiated thanks to the large bandwidth.

Figure 2: System diagram of the IR-UWB radar.

However, a single frame of the received signal is not enough for respiration monitoring. To observe periodic movements, the radar transmits frames at a regular interval, and stacks the received frames to form a signal matrix $\boldsymbol{r}(t) = [r_1(t), \cdots, r_n(t), \cdots, r_N(t)]^T$, where 𝑡 and 𝑛 are respectively the fast-time and slow-time indices, and 𝑁 is the number of slow-time frames [13, 91]. We hereafter slightly abuse the terminology by writing 𝒓(𝑡) as 𝒓(𝑡, 𝑛) to clearly indicate its matrix nature, as illustrated in Figure 3b. One may easily recognize a breathing person in the matrix, as circled in the red box.

Figure 3: Amplitude of single-frame radar signal 𝑟(𝑡) and signal matrix 𝒓(𝑡, 𝑛) composed of multiple frames. (a) Received signal |𝑟(𝑡)|. (b) Signal matrix 𝒓(𝑡, 𝑛).

Considering a particular 𝑡 corresponding to the respiration,
conventional methods individually adopt either amplitude or phase of the slow-time signal 𝑟𝑡(𝑛) to characterize respiration [3, 85, 87, 90]; we plot two such results in Figure 4.

In fact, neither of these two real sequences can fully depict respiratory waveform accurately, despite their periodic structures resembling the “baseband” of respiration. As illustrated in Figure 4, both amplitude and phase waveforms exhibit distortions to various extents, including strength variations (amplitude in Figure 4a) and missing cycles (phase in Figure 4b). To understand why such distortions take place, we analyze 𝑟(𝑛) (subscript neglected for brevity) on both I/Q components with Equations (1) and (2):

$$r_{\mathrm{I}}(n) = \alpha(n)\cos\!\left(\frac{4\pi d_0}{\lambda} + \frac{4\pi z(n)}{\lambda}\right) + o_{\mathrm{I}}^{\mathrm{BBR}}, \qquad (1)$$

$$r_{\mathrm{Q}}(n) = \alpha(n)\sin\!\left(\frac{4\pi d_0}{\lambda} + \frac{4\pi z(n)}{\lambda}\right) + o_{\mathrm{Q}}^{\mathrm{BBR}}, \qquad (2)$$

where 𝛼(𝑛) is the strength of the reflected signal from the human chest, 𝑑0 is the distance from the radar to the chest, 𝜆 is the carrier wavelength, and 𝑧(𝑛) denotes human chest movement. In both equations, the first terms are caused by respiration, and the second terms $o_{\mathrm{I}}^{\mathrm{BBR}}$ and $o_{\mathrm{Q}}^{\mathrm{BBR}}$ are the offsets caused by body background reflection (BBR). To visualize 𝑟I(𝑛) and 𝑟Q(𝑛), we take the signal of the breathing person (as bounded by the red box in Figure 3b) as an example, and display it as a constellation diagram in Figure 5a. The blue respiration vector in the graph corresponds to the complex signal 𝑟I(𝑛) + 𝑗𝑟Q(𝑛); it is the sum of the BBR offset (green vector) and the respiration-induced variation (red vector). One may observe that, as the human subject breathes, the red vector rotates and the trace of the blue vector forms an elliptic arc. The arc may not be circular because 𝛼(𝑛), the radius, is time-varying due to a varying radar cross-section [44]. Now Figure 5a clearly explains why neither amplitude nor phase waveforms can correctly characterize respiratory waveforms in Figure 4: although they oscillate with a similar frequency to a human breath, they are only projections of the respiration vector trace onto a lower dimension.

Figure 4: Neither amplitude nor phase of 𝑟𝑡(𝑛) alone is sufficient to correctly recover respiratory waveforms. (a) Signal segment 1. (b) Signal segment 2.

Figure 5: Constellation diagrams of 𝑟(𝑛). (a) Human subject is static. (b) Human subject is moving.
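The following small NumPy sketch is not part of the original paper; it simply simulates Equations (1) and (2) with toy constants to illustrate why the complex trace forms an elliptic arc around the BBR offset while amplitude and phase alone are only lossy 1-D projections. The chest-motion amplitude, BBR offset, and reflection-strength variation below are illustrative assumptions, not measured radar parameters.

```python
# Toy simulation of the I/Q respiration model in Equations (1)-(2).
import numpy as np

fc = 7.29e9                      # carrier frequency (Hz), cf. Section 4
lam = 3e8 / fc                   # carrier wavelength
fs, T = 50, 30                   # slow-time frame rate (fps) and duration (s)
n = np.arange(fs * T) / fs
z = 2.5e-3 * np.sin(2 * np.pi * 0.25 * n)          # ~5 mm chest motion at 0.25 Hz (assumed)
alpha = 1.0 + 0.05 * np.sin(2 * np.pi * 0.05 * n)  # slowly varying reflection strength (assumed)
d0, o_bbr = 1.0, 0.8 + 0.6j                        # subject distance and BBR offset (assumed)

phase = 4 * np.pi * d0 / lam + 4 * np.pi * z / lam
r = alpha * np.exp(1j * phase) + o_bbr             # r_I(n) + j*r_Q(n), Equations (1)-(2)

# The two lossy 1-D projections discussed around Figure 4:
amplitude, phi = np.abs(r), np.unwrap(np.angle(r))
```

Plotting `r.real` against `r.imag` reproduces the elliptic arc of Figure 5a, whereas `amplitude` and `phi` correspond to the distorted waveforms in Figure 4.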
2.2 Interference Caused by Body Movements

To better understand the effects of body movements, we begin by defining their scope, as shown in Figure 6. Basically, our concerned movements should maintain the position of the human subject i) to stay within the Field of View (FoV) of the radar and ii) to have the variable distance 𝑑 between the radar and the subject lying within a reasonable range (e.g., 50 cm) around its mean 𝑑̄. Such a scope encompasses body movements not affecting the chest (e.g., typewriting or limb position drift), as well as even larger-scale ones involving chest motion (e.g., exercising on a treadmill or at a fixed spot). It, therefore, forbids the subject to i) drastically change its posture (e.g., from standing to lying) and ii) significantly alter its position (so as to avoid the need for tracking). This movement scope has barely been studied in the literature, as previous “motion-robust” systems mostly tackle small-scale movements, e.g., hand drifts when holding a device [69] and turning steering wheels during driving [83, 90]. It should be noted that acoustic sensing [69, 83] is not applicable to our problem scope, because it is easily interfered with by acoustic noise accompanying large-scale movements (e.g., typewriting or exercising on a treadmill) and real-life environments (e.g., music in vehicles).

Figure 6: A subject of MoRe-Fi should stay within the radar FoV, with its distance from the radar remaining within a reasonable range around a constant mean value 𝑑̄.

With the scope defined, we hereby analyze the impacts of body movements on the I/Q signal with a simple experiment. We let a human subject move his body (lean forward and backward, sway left and right) while sitting in a chair. The corresponding 𝑟(𝑛) shown in Figure 5b demonstrates that body movements prevent the trace of 𝑟(𝑛) from falling onto a single elliptic arc; the trace is scattered across the I/Q plane in a rather arbitrary manner. The reason is that the BBR offset (caused by reflections from limbs and torso other than the chest) previously presumed to be static is no longer so, resulting in a random shift of the elliptic center $\left(o_{\mathrm{I}}^{\mathrm{BBR}}, o_{\mathrm{Q}}^{\mathrm{BBR}}\right)$. In addition, the distance from the radar to the human subject 𝑑 also varies and further changes the signal phase.¹ Furthermore, the radar cross-section changes with body movements as well, causing a varying reflected signal strength and hence unpredictable changes in the long/short axes of the ellipse. Last but not least, Figure 5b only shows 𝑟(𝑛) for a single fast-time index, yet large-scale movements can potentially affect multiple fast-time indices.

¹ Being a constant 𝑑0 in (1) and (2), the distance becomes a variable now.

2.3 Conventional Respiration Recovery

In order to recover respiratory waveform, conventional approaches measure the human chest displacement Δ𝑑; it is related to the phase
change Δ𝜙 of the respiration vector in Figure 5a by Δ𝑑 = 𝜆Δ𝜙/(2𝜋). Because the center of the ellipse may not coincide with the origin due to the BBR offset, Δ𝜙 is not always obtainable from the raw I/Q signal. To recover the respiratory waveform, a recent proposal [69] suggests fitting elliptic arcs to (acoustic) I/Q signal and unifies their centers to the origin. Although this method theoretically allows the phase of the respiration vector to be calculated by taking the inverse tangent of the shifted I/Q signal, it is intrinsically designed for quasi-static human subjects, although it can tolerate very small limb position drifts (e.g., hand motions when holding a phone [69]) by selectively fitting to relatively “clean” data segments free of strong motion interference.

Under stronger movements, the traces of respiration vector in the form of elliptic arcs are no longer identifiable and analyzable, because they are severely deformed and blended, as explained in Section 2.2. As a result, one has to fall back to the 1-D signal projection shown by Figure 4, and hence apply linear processing methods taken by previous works to extract respiration, such as bandpass filter [66], Ensemble Empirical Mode Decomposition (EEMD) [83], and Variational Mode Decomposition (VMD) [90]. We briefly illustrate the recovery results achieved by these 1-D methods in Figure 7, when the subject was exercising on the spot. One may observe that the resulting waveforms are very coarse-grained and lack correct event details (e.g., rate, duration of inhalation and exhalation, as well as tidal volume). Essentially, as these methods fail to properly treat the I/Q scrambling and fast-time crossing issues of large-scale movements, they definitely cannot handle the body movement scope defined in Section 2.2.

Figure 7: Respiratory waveforms obtained by 1-D signal processing methods under strong body movements; the event details on the waveforms are all lost (compared with the ground truth), and the rates can be wrongly represented.

Remark. To summarize, in order to recover fine-grained respiratory waveform under strong body movements, we have to fully utilize the I/Q signal and trace its variations across multiple fast-time indices. Nonetheless, since conventional model-based signal processing algorithms have been demonstrated as unable to cope with this situation, we resort to a data-driven approach.

3 SYSTEM DESIGN

This section introduces the design of MoRe-Fi, whose rough diagram is shown in Figure 8. Upon obtaining the I/Q signal matrix 𝒓(𝑡, 𝑛) from the IR-UWB radar, MoRe-Fi locates respiration by extracting a sub-matrix corresponding to the concerned human subject. It then leverages the rotation invariance in the I/Q domain to augment the sub-matrix. The core of MoRe-Fi is a novel IQ Variational Encoder-Decoder (IQ-VED) neural network to distill respiratory waveform from the sub-matrix. Trained by the ground truth waveform obtained from a wearable sensor [56], our IQ-VED is capable of recovering fine-grained respiratory waveform under severe interference produced by body movements.

Figure 8: System diagram of MoRe-Fi.

3.1 Locating Respiration

To locate respiration in the signal matrix 𝒓(𝑡, 𝑛), MoRe-Fi first uses a loopback filter [2] to remove the influence of static background. The static clutter of the system can be described as 𝑐𝑛(𝑡) = 𝛽𝑐𝑛−1(𝑡) + (1 − 𝛽)𝑟𝑛(𝑡), and the background-subtracted signal can be represented as 𝑟𝑛⁻(𝑡) = 𝑟𝑛(𝑡) − 𝑐𝑛(𝑡), where 𝑟𝑛(𝑡) denotes the 𝑛-th frame and the weight 𝛽 is empirically set to 0.9. Then the Constant False Alarm Rate (CFAR) algorithm [45] kicks in to detect the peaks in $\boldsymbol{r}^{-}(t, n) = [r_{1}^{-}(t), \cdots, r_{n}^{-}(t), \cdots, r_{N}^{-}(t)]^{T}$ using an adaptive threshold; the threshold 𝜏noise(𝑡) is estimated by averaging values at neighboring fast-time indices. MoRe-Fi selects multiple fast-time indices adjacent to the detected peaks to form a sub-matrix 𝒓̂(𝑡), and it finally transposes 𝒓̂(𝑡) to obtain a new matrix 𝒓̂(𝑛) with slow-time index 𝑛 as the argument, as shown in Figure 9.

Figure 9: After locating respiration in a sub-matrix 𝒓̂(𝑡), MoRe-Fi transposes it to obtain 𝒓̂(𝑛), which contains multiple fast-time indices around the detected respiration.
3.2 Data Augmentation

Before putting the I/Q signal contained in 𝒓̂(𝑛) into deep analytics, it is necessary to perform data augmentation for the following reasons. On one hand, data collection is highly non-trivial because one has to coordinate among human subjects, data recording of the IR-UWB radar, and the wearable ground truth sensor. Therefore, it is desirable to increase the diversity of a dataset by applying certain transformations. On the other hand, data augmentation often helps a deep neural network comprehend intrinsic structures of the raw data. For MoRe-Fi, this I/Q-induced intrinsic structure is non-trivially preserved only by rotation but not other transforms
such as translation. Therefore, we propose to augment 𝒓̂(𝑛) by rotating its every complex element in I/Q domain:

$$\begin{bmatrix} r_{\mathrm{I}}^{\mathrm{aug}} \\ r_{\mathrm{Q}}^{\mathrm{aug}} \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} r_{\mathrm{I}} \\ r_{\mathrm{Q}} \end{bmatrix}, \qquad (3)$$

where 𝜃 specifies a rotation angle and it is varied to achieve data augmentation. Figure 10 illustrates five versions of augmented 𝒓̂(𝑛): the rotation preserves the respiration traces, because it affects only the distance 𝑑 (which is anyway varying drastically under body movements, according to Section 2.2) but not the respiration-induced periodic motions Δ𝑑 of the chest. As shown in Figure 10, the overlapped respiration ellipses maintain the overall distribution despite varying rotations. In practice, MoRe-Fi may choose to employ more rotation angles for better enriching a dataset.

Figure 10: Augmenting data by rotating a slow-time row of 𝒓̂(𝑛) in I/Q domain. (a) Original signal. (b) 𝜃 = 𝜋/3. (c) 𝜃 = 2𝜋/3. (d) 𝜃 = 𝜋. (e) 𝜃 = 4𝜋/3. (f) 𝜃 = 5𝜋/3.
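As a minimal sketch of Equation (3) (not part of the original paper), note that rotating an I/Q pair by 𝜃 is equivalent to multiplying the complex sample by e^{j𝜃}; Section 4 states that MoRe-Fi sweeps 𝜃 from 0 to 2𝜋 in steps of 𝜋/30, which the helper below mirrors. Function names are illustrative.

```python
# I/Q rotation augmentation of Equation (3).
import numpy as np

def rotate_iq(r_hat: np.ndarray, theta: float) -> np.ndarray:
    """Rotate a complex-valued signal matrix in the I/Q plane by angle theta."""
    return r_hat * np.exp(1j * theta)    # complex multiplication applies the rotation matrix of Eq. (3)

def augment(r_hat: np.ndarray, num_rotations: int = 60):
    """Yield rotated copies of r_hat, one per rotation angle (0 to 2*pi in steps of pi/30)."""
    for k in range(num_rotations):
        yield rotate_iq(r_hat, theta=2 * np.pi * k / num_rotations)
```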
Consequently, our IQ-VED performs a bivariate analysis of the I/Q
3.3 Fine-Grained Waveform Recovery signal, as explained in Section 3.3.2.
In this section, we first study the background of Variational Encoder-
Decoder (VED), then discuss how to adapt VED architecture for I/Q 3.3.2 IQ-VED Encoder. The encoder of IQ-VED takes in the I/Q
complex signals, and finally provide details on respiratory wave- signal matrix 𝒓 (𝑛) (ˆ𝒓 (𝑛) in Section 3.1 for brevity) and encodes
form recovery using IQ-VED. it to a latent representation 𝒛. Specifically, IQ-VED adopts a two-
stream design, where the I/Q components 𝒓 I (𝑛) and 𝒓 Q (𝑛) are fed
3.3.1 Design Rationale. Extracting certain signals from a nonlin- into two separate encoders, as shown in Figure 11. Each encoder
ear signal mixture is highly non-trivial [31, 35]; the deep learning consists of i) multiple layers of One-Dimensional Convolutional
community has been employing an Encoder-Decoder (ED) network Neural Network (1D-CNN) [43] for feature extraction, ii) batch
for this task [47, 51]. Unfortunately, the latent space of a regular norm layers [33] for normalization, and iii) leaky ReLU [21] layers
ED network is not continuous given limited training data, so it for adding non-linearity, as illustrated in Figure 12a. Both 𝒓 I (𝑛)
lacks sufficient generalization ability when dealing with unseen and 𝒓 Q (𝑛) are treated as multi-channel 1-D sequences, with each
data. Inspired by the idea of variational inference [6, 28], we tackle channel corresponding to one fast-time index in 𝒓 (𝑛).
the problem of latent space irregularity by forcing the encoder to Essentially, the IQ-VED encoder decomposes the input I/Q sig-
return probability distributions rather than discrete vectors, and we nals and filters out motion interference. The resulting respiration-
name the modified network Variational Encoder-Decoder (VED). It induced signal is compressed and mapped to the latent distribution,
is worth noting that our VED is fundamentally different from Vari- which will then be sampled to drive the decoder so as to recover the
ational AutoEncoder (VAE) [38, 67]: whereas VED aims to extract desired respiratory waveform. The overall convolutional filter aims
signal from a nonlinear mixture, VAE intends to learn an efficient to extract useful features, so it can be deemed as a demixing func-
representation of the input. tion [37] to reverse the entanglement between respiration signal
To achieve the objective of MoRe-Fi in respiratory waveform and non-linear motion interference. It is well known that processing
recovery, a regular ED learns an encoder 𝑞𝜙 (𝒛|𝒓) mapping input the I/Q components of complex signals separately, though substan-
data 𝒓 to a latent representation 𝒛, and generates output 𝒓 ′ (i.e., tially lowering the training complexity, may cause misalignment.
To overcome this problem, we specifically align their respective latent spaces in Section 3.3.3.

Figure 11: IQ-VED architecture: it takes in I/Q data as two streams and encodes them separately. The resulting latent representations are aligned and fed to the decoder to recover respiratory waveform by minimizing the reconstruction error.

Figure 12: Encoder and decoder configurations. Two values in parentheses indicate the amounts of input and output channels, respectively.
(a) Encoder (both I and Q): Conv1d (5, 32) → Batch Norm → Leaky ReLU → Conv1d (32, 64) → Batch Norm → Leaky ReLU → Conv1d (64, 128) → Batch Norm → Leaky ReLU → Conv1d (128, 256) → Batch Norm → Leaky ReLU → Conv1d (256, 512) → Batch Norm → Leaky ReLU.
(b) Decoder: ConvTranspose1d (512, 256) → Batch Norm → Leaky ReLU → ConvTranspose1d (256, 128) → Batch Norm → Leaky ReLU → ConvTranspose1d (128, 64) → Batch Norm → Leaky ReLU → ConvTranspose1d (64, 32) → Batch Norm → Leaky ReLU → ConvTranspose1d (32, 1) → Batch Norm → Leaky ReLU.
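For concreteness, the PyTorch sketch below (not from the paper) reproduces one encoder stream following the layer stack of Figure 12a and the kernel/stride/padding settings of Section 4. How the 512-channel feature map is reduced to the latent Gaussian is not specified in the paper, so the global-average pooling and the two linear heads for the mean and log-variance, as well as the latent dimension of 64 (Section 5.4.3), are assumptions; the decoder would mirror this structure with ConvTranspose1d layers per Figure 12b.

```python
# A minimal sketch of one IQ-VED encoder stream (Figure 12a), under the stated assumptions.
import torch
import torch.nn as nn

class IQEncoderStream(nn.Module):
    def __init__(self, in_channels=5, latent_dim=64):
        super().__init__()
        chans = [in_channels, 32, 64, 128, 256, 512]
        layers = []
        for c_in, c_out in zip(chans[:-1], chans[1:]):
            layers += [
                nn.Conv1d(c_in, c_out, kernel_size=3, stride=1, padding=1),
                nn.BatchNorm1d(c_out),
                nn.LeakyReLU(),
            ]
        self.backbone = nn.Sequential(*layers)
        # Hypothetical Gaussian heads: pool features over slow time, then map to mean / log-variance.
        self.to_mu = nn.Linear(512, latent_dim)
        self.to_logvar = nn.Linear(512, latent_dim)

    def forward(self, x):                      # x: (batch, in_channels, slow_time)
        h = self.backbone(x).mean(dim=-1)      # global average pooling over slow time -> (batch, 512)
        return self.to_mu(h), self.to_logvar(h)
```

One such stream is instantiated for 𝒓I(𝑛) and another for 𝒓Q(𝑛), yielding the two latent Gaussians that are aligned next.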
3.3.3 Latent Space Alignment. The outputs of the encoder are two Gaussian distributions $\boldsymbol{z}_{\mathrm{I}} \sim \mathcal{N}(\boldsymbol{\mu}_{\mathrm{I}}, \boldsymbol{\Sigma}_{\mathrm{I}}^{2})$ and $\boldsymbol{z}_{\mathrm{Q}} \sim \mathcal{N}(\boldsymbol{\mu}_{\mathrm{Q}}, \boldsymbol{\Sigma}_{\mathrm{Q}}^{2})$ parameterized by respective means and variances, according to Equation (4). Since both latent distributions are integral parts of the complex signal representation, IQ-VED should guarantee that their processing (via individual encoders) has been conducted in a coordinated manner. Fortunately, since both in-phase and quadrature signals are 1-D perspectives of the same complex radar signal, they share common structures sufficient to align their corresponding latent representations. To this end, we choose to minimize the 2-Wasserstein distance [59] between them:

$$W_{\mathrm{IQ}} = \inf_{J \in \mathcal{J}\left(\mathcal{N}(\boldsymbol{\mu}_{\mathrm{I}}, \boldsymbol{\Sigma}_{\mathrm{I}}^{2}),\, \mathcal{N}(\boldsymbol{\mu}_{\mathrm{Q}}, \boldsymbol{\Sigma}_{\mathrm{Q}}^{2})\right)} \left( \int \lVert \boldsymbol{z}_{\mathrm{I}} - \boldsymbol{z}_{\mathrm{Q}} \rVert^{2}\, \mathrm{d}J(\boldsymbol{z}_{\mathrm{I}}, \boldsymbol{z}_{\mathrm{Q}}) \right)^{\!\frac{1}{2}}, \qquad (5)$$

where $\mathcal{J}$ denotes the set of all joint distributions 𝐽 that have $\mathcal{N}(\boldsymbol{\mu}_{\mathrm{I}}, \boldsymbol{\Sigma}_{\mathrm{I}}^{2})$ and $\mathcal{N}(\boldsymbol{\mu}_{\mathrm{Q}}, \boldsymbol{\Sigma}_{\mathrm{Q}}^{2})$ as respective marginals. The reason for employing the Wasserstein distance is twofold. On one hand, minimizing the distance shifts the two distributions “close” to each other, enforcing them to encode the same respiration features. On the other hand, unlike KL divergence, Wasserstein distance is able to provide a useful gradient when the distributions are not overlapping [39]. As a result, while most parts of the two distributions are meant to be aligned, some discrepancies inherent to the I/Q components, e.g., amplitude and phase of the BBR offset, are allowed to be maintained. In the case of multivariate Gaussian distributions, a closed-form solution of Equation (5) can be obtained according to [20]:

$$W_{\mathrm{IQ}}^{2} = \lVert \boldsymbol{\mu}_{\mathrm{I}} - \boldsymbol{\mu}_{\mathrm{Q}} \rVert_{2}^{2} + \mathrm{Tr}(\boldsymbol{\Sigma}_{\mathrm{I}}) + \mathrm{Tr}(\boldsymbol{\Sigma}_{\mathrm{Q}}) - 2\,\mathrm{Tr}\!\left(\!\left(\boldsymbol{\Sigma}_{\mathrm{I}}^{\frac{1}{2}} \boldsymbol{\Sigma}_{\mathrm{Q}} \boldsymbol{\Sigma}_{\mathrm{I}}^{\frac{1}{2}}\right)^{\!\frac{1}{2}}\right). \qquad (6)$$

Since the covariance matrices obtained by IQ-VED are of diagonal form, Equation (6) can be simplified as follows:

$$W_{\mathrm{IQ}}^{2} = \lVert \boldsymbol{\mu}_{\mathrm{I}} - \boldsymbol{\mu}_{\mathrm{Q}} \rVert_{2}^{2} + \left\lVert \boldsymbol{\Sigma}_{\mathrm{I}}^{\frac{1}{2}} - \boldsymbol{\Sigma}_{\mathrm{Q}}^{\frac{1}{2}} \right\rVert_{\mathrm{Frob}}^{2}, \qquad (7)$$

where ∥·∥Frob is the Frobenius norm, defined as the square root of the sum of the absolute squares of the matrix elements.

3.3.4 IQ-VED Decoder and Loss Function. As shown in Figure 12b, the decoder can be deemed as the reverse of the encoder. To this end, we replace the 1D-CNN in the decoder with 1-D transposed convolutional layers [86] to upsample the latent representation and map it to a longer sequence, so as to finally derive respiratory waveform 𝒓′(𝑛). Note that the encoder and decoder are not exactly symmetric: at the last stage of the decoder, a single-channel signal is recovered, instead of a multi-channel one as the input to the encoder. To train IQ-VED, we employ three loss functions, namely the reconstruction loss, the I/Q regularizing loss, and the distribution alignment loss.

Reconstruction Loss. To correctly recover respiratory waveform, this loss function compares the output of the IQ-VED decoder with the ground truth obtained by a wearable sensor, and tries to make them similar. The 𝐿2 loss is used to define the reconstruction loss $\mathcal{L}_{\mathrm{RC}}$, which measures the sum of all the squared differences between the two waveforms. This loss practically implements the term $\mathbb{E}_{q_{\phi}(\boldsymbol{z}|\boldsymbol{r})}\!\left[\log p_{\psi}(\boldsymbol{r}'|\boldsymbol{z})\right]$ in Equation (4):

$$\mathcal{L}_{\mathrm{RC}} = \left\lVert \boldsymbol{r}_{\mathrm{gt}} - \boldsymbol{r}' \right\rVert_{2}^{2}. \qquad (8)$$
I/Q Regularizing Loss. In Section 3.3.1, it is pointed out that VED regularizes the latent distribution according to a standard Gaussian prior. For IQ-VED, two distributions from the I/Q encoders should be regularized. The I/Q regularizing losses are defined as:

$$\mathcal{L}_{\mathrm{IR}} = D_{\mathrm{KL}}\!\left(\mathcal{N}\!\left(\boldsymbol{\mu}_{\mathrm{I}}, \boldsymbol{\Sigma}_{\mathrm{I}}^{2}\right) \,\|\, \mathcal{N}(0, \mathbf{I})\right), \qquad (9)$$

$$\mathcal{L}_{\mathrm{QR}} = D_{\mathrm{KL}}\!\left(\mathcal{N}\!\left(\boldsymbol{\mu}_{\mathrm{Q}}, \boldsymbol{\Sigma}_{\mathrm{Q}}^{2}\right) \,\|\, \mathcal{N}(0, \mathbf{I})\right). \qquad (10)$$

Distribution Alignment Loss. As described in Section 3.3.3, the misalignment between the two distributions from the I/Q encoders can be measured by a Wasserstein distance. Therefore, the distribution alignment loss is defined according to Equation (7):

$$\mathcal{L}_{\mathrm{DA}} = W_{\mathrm{IQ}}^{2} = \lVert \boldsymbol{\mu}_{\mathrm{I}} - \boldsymbol{\mu}_{\mathrm{Q}} \rVert_{2}^{2} + \left\lVert \boldsymbol{\Sigma}_{\mathrm{I}}^{\frac{1}{2}} - \boldsymbol{\Sigma}_{\mathrm{Q}}^{\frac{1}{2}} \right\rVert_{\mathrm{Frob}}^{2}. \qquad (11)$$

Combining these loss functions, the overall loss function for training IQ-VED can be obtained as follows:

$$\mathcal{L}_{\mathrm{IQ\text{-}VED}} = \mathcal{L}_{\mathrm{RC}} + \gamma\left(\mathcal{L}_{\mathrm{IR}} + \mathcal{L}_{\mathrm{QR}}\right) + \eta\,\mathcal{L}_{\mathrm{DA}}, \qquad (12)$$

where 𝛾 and 𝜂 are the respective weights for the regularizing losses and the distribution alignment loss. $\mathcal{L}_{\mathrm{IR}}$ and $\mathcal{L}_{\mathrm{QR}}$ share the same weight 𝛾 because the I/Q data and encoders are symmetric and of equal importance. In addition, 𝛾 is supposed to be greater than 1, so as to emphasize the representation capacity of the latent variables 𝒛 and encourage disentanglement of the representations [27], as will be confirmed in Section 5.4.4.
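The sketch below (not part of the paper) assembles Equations (8)-(12) into a single training loss, assuming the encoders output diagonal Gaussians parameterized by means and log-variances; the alignment term uses the simplified closed form of Equation (7). The default weights 𝛾 = 3 and 𝜂 = 2e-4 follow Section 4; variable names are illustrative.

```python
# A sketch of the overall IQ-VED training loss in Equation (12).
import torch

def kl_to_std_normal(mu, logvar):
    # D_KL( N(mu, diag(var)) || N(0, I) ), summed over latent dims, averaged over the batch
    return 0.5 * torch.sum(mu**2 + logvar.exp() - 1.0 - logvar, dim=-1).mean()

def wasserstein2_diag(mu_i, logvar_i, mu_q, logvar_q):
    # Squared 2-Wasserstein distance between diagonal Gaussians, Equation (7):
    # ||mu_I - mu_Q||^2 + ||Sigma_I^(1/2) - Sigma_Q^(1/2)||_Frob^2
    std_i, std_q = (0.5 * logvar_i).exp(), (0.5 * logvar_q).exp()
    return (torch.sum((mu_i - mu_q)**2, dim=-1)
            + torch.sum((std_i - std_q)**2, dim=-1)).mean()

def iq_ved_loss(r_rec, r_gt, mu_i, logvar_i, mu_q, logvar_q, gamma=3.0, eta=2e-4):
    l_rc = torch.sum((r_gt - r_rec)**2, dim=-1).mean()        # Equation (8)
    l_ir = kl_to_std_normal(mu_i, logvar_i)                    # Equation (9)
    l_qr = kl_to_std_normal(mu_q, logvar_q)                    # Equation (10)
    l_da = wasserstein2_diag(mu_i, logvar_i, mu_q, logvar_q)   # Equation (11)
    return l_rc + gamma * (l_ir + l_qr) + eta * l_da           # Equation (12)
```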
3.3.5 Waveform Recovery and Biomarker Recognition. With a well-trained IQ-VED, respiratory waveform can be recovered from radar signal even under motion interference. One immediate way of applying IQ-VED to waveform recovery is to re-sample the latent vectors 𝒛I and 𝒛Q from $\mathcal{N}\!\left(\boldsymbol{\mu}_{\mathrm{I}}, \boldsymbol{\Sigma}_{\mathrm{I}}^{2}\right)$ and $\mathcal{N}\!\left(\boldsymbol{\mu}_{\mathrm{Q}}, \boldsymbol{\Sigma}_{\mathrm{Q}}^{2}\right)$. However, this leads to non-deterministic outputs that may cause problems in practice. To tackle this problem, we perform a deterministic inference without sampling 𝒛I and 𝒛Q as follows:

$$\boldsymbol{r}'^{*} = \arg\max_{\boldsymbol{r}'}\, p_{\psi}\!\left(\boldsymbol{r}' \,\middle|\, \boldsymbol{r}_{\mathrm{I}}, \boldsymbol{r}_{\mathrm{Q}}, \boldsymbol{z}_{\mathrm{I}}^{*} + \boldsymbol{z}_{\mathrm{Q}}^{*}\right), \qquad (13)$$

where the deterministic latent vectors $\boldsymbol{z}_{\mathrm{I}}^{*}$ and $\boldsymbol{z}_{\mathrm{Q}}^{*}$ are obtained as $\boldsymbol{z}_{\mathrm{I}}^{*} = \mathbb{E}[\boldsymbol{z}_{\mathrm{I}}|\boldsymbol{r}_{\mathrm{I}}]$ and $\boldsymbol{z}_{\mathrm{Q}}^{*} = \mathbb{E}[\boldsymbol{z}_{\mathrm{Q}}|\boldsymbol{r}_{\mathrm{Q}}]$.
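In code, the deterministic inference of Equation (13) amounts to feeding the decoder the sum of the two posterior means instead of sampled latent vectors. The sketch below is an illustration only; the encoder/decoder objects mirror the hypothetical modules sketched earlier, and how the decoder consumes the combined latent vector is an assumption.

```python
# Deterministic inference of Equation (13): use z* = E[z|r] (the posterior means), no sampling.
import torch

@torch.no_grad()
def infer_waveform(encoder_i, encoder_q, decoder, r_i, r_q):
    mu_i, _ = encoder_i(r_i)          # z_I* = E[z_I | r_I]
    mu_q, _ = encoder_q(r_q)          # z_Q* = E[z_Q | r_Q]
    return decoder(mu_i + mu_q)       # recovered waveform r'*, cf. Equation (13)
```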
To determine the respiratory rate based on the recovered waveform, the Fast Fourier Transform (FFT) can be applied. Since the frequency of respiration ranges from 0.16 Hz to 0.6 Hz, the search space can be narrowed down, and the peak frequency in the range can be identified as the respiratory rate. As an example, a sample respiration of a subject exercising on the spot is shown in Figure 13, with both time and frequency representations. In particular, Figure 13b shows that the respiratory rate can be estimated to be 0.24 Hz, or approximately 14.4 breaths per minute (bpm). In addition, the instantaneous frequency can also be obtained by taking the reciprocal of the total cycle time.

Apart from respiratory rate, other biomarkers can also be inferred, such as tidal volume (𝑣T) denoting the amount of air that moves in or out of the lungs with each respiratory cycle, total cycle time 𝑡tc measuring the total time for a respiration cycle (peak to peak), inspiratory time 𝑡i indicating the time of inhaling in a cycle (valley to peak), expiratory time 𝑡e indicating the time of exhaling in a cycle (peak to valley), and inhalation/exhalation ratio (I/E ratio) representing a compromise between ventilation and oxygenation. By finding the peaks and valleys, and their corresponding timestamps in the recovered waveform, these biomarkers can be calculated. We illustrate all the aforementioned time-related biomarkers in Figure 13a.

Figure 13: The recovered respiration signal output by IQ-VED while the subject exercises on the spot, in different domains. (a) Time domain. (b) Frequency domain.
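The following sketch (not from the paper) illustrates how the rate and the time-related biomarkers could be derived from a recovered waveform: the rate is the FFT peak restricted to the 0.16-0.6 Hz band, while 𝑡tc, 𝑡i, and 𝑡e come from detected peaks and valleys. The use of scipy.signal.find_peaks and its minimum-distance tuning are assumptions made for illustration.

```python
# Respiratory rate and time-biomarker extraction from a recovered waveform (Section 3.3.5).
import numpy as np
from scipy.signal import find_peaks

def respiratory_rate(r_prime: np.ndarray, fs: float = 50.0) -> float:
    """Return the respiratory rate in Hz as the spectral peak within 0.16-0.6 Hz."""
    spectrum = np.abs(np.fft.rfft(r_prime - r_prime.mean()))
    freqs = np.fft.rfftfreq(len(r_prime), d=1.0 / fs)
    band = (freqs >= 0.16) & (freqs <= 0.6)
    return freqs[band][np.argmax(spectrum[band])]

def time_biomarkers(r_prime: np.ndarray, fs: float = 50.0):
    """Estimate total cycle time t_tc, inspiratory time t_i, and expiratory time t_e (seconds)."""
    min_gap = int(fs / 0.6)                              # breathing is at most 0.6 Hz
    peaks, _ = find_peaks(r_prime, distance=min_gap)
    valleys, _ = find_peaks(-r_prime, distance=min_gap)
    t_tc = np.mean(np.diff(peaks)) / fs if len(peaks) > 1 else np.nan
    # Pair each peak with the closest preceding valley to get the inspiratory time (valley to peak).
    t_i = np.mean([(p - valleys[valleys < p][-1]) / fs
                   for p in peaks if np.any(valleys < p)]) if len(valleys) else np.nan
    t_e = t_tc - t_i if not np.isnan(t_tc) else np.nan   # expiratory time (peak to valley)
    return t_tc, t_i, t_e
```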
4 IMPLEMENTATION

Hardware Implementations. Our MoRe-Fi prototype leverages IR-UWB signals for monitoring human respiration. The core component of MoRe-Fi is a compact and low-cost Novelda X4M05 [58] IR-UWB radar transceiver. The radar operates at a center frequency of 7.29 or 8.7 GHz with a bandwidth of 1.5 GHz. The sampling rate of the radar is 23.328 GHz, and the frame rate is set to 50 fps. The radar has a pair of tx-rx antennas with an FoV of 65° in both azimuth and elevation angles. A Raspberry Pi single-board computer [65] is used to control the transceiver and to interface with a desktop computer; this computer has an Intel Xeon W-2133 CPU, 16 GB RAM, and a GeForce RTX 2080 Ti graphics card. A NeuLog respiration monitor belt logger sensor NUL-236 [56] is used to collect ground truth respiratory waveform; the sampling rate of the NeuLog sensor is also set to 50 fps, the same as the radar.

Software Implementations. We implement MoRe-Fi based on Python 3.7 and C/C++, with the neural network components built upon PyTorch 1.7.1 [61]. To align the ground truth respiration signal from the respiration monitor belt logger and the radar signals, the Precision Time Protocol [32], relying on message exchanges over Ethernet, is used to synchronize the clocks between hardware components. In the data augmentation process, each signal 𝒓(𝑛) is rotated from 0 to 2𝜋 with an interval of 𝜋/30, i.e., 60 times. The parameters of IQ-VED are set as follows: 𝛾 and 𝜂 in Equation (12) are set to 3 and 2e-4, respectively. For the encoder, 5 consecutive 1-D convolutional layers are used, whose kernel size is set to 3, stride to 1, and padding size to 1; the number of output channels of these convolutional layers is set to 32, 64, 128, 256, and 512. As for the decoder, 5 consecutive transposed convolutional layers are used; their kernel size is set to 3, stride to 1, scale factor to 2, and the number of output channels of these layers is set to 512, 256, 128, 64, and 32. All weights are initialized by the Xavier uniform initializer [41]. Consequently, IQ-VED involves 2.36 × 10⁷ parameters; it incurs 1.21 × 10⁸ multiply-accumulate operations for each signal instance during inference. The collected dataset is divided into training and test sets. The training set contains 8,000 pairs of radar signal matrices and respiration ground truths obtained from the NeuLog sensor, and the test set contains 4,000 pairs. The size of the raw training radar signal matrix is 1000 × 138. For the training process, the batch size is set to 64, the IQ-VED loss in Equation (12) is adopted, and the learning rate and momentum of the Stochastic Gradient Descent optimizer [7] are respectively set to 0.01 and 0.9.
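As a rough illustration of the stated training configuration (batch size 64, the loss of Equation (12), SGD with learning rate 0.01 and momentum 0.9), the sketch below shows a plain PyTorch training loop. The model interface (returning the reconstruction plus both latent Gaussians), the dataset format, and the epoch count are assumptions standing in for the actual IQ-VED implementation.

```python
# Hypothetical training loop matching the configuration reported in Section 4.
import torch
from torch.utils.data import DataLoader

def train(model, train_set, epochs=50, device="cuda"):
    loader = DataLoader(train_set, batch_size=64, shuffle=True)
    optim = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    model.to(device).train()
    for _ in range(epochs):
        for r_i, r_q, r_gt in loader:                      # I stream, Q stream, ground truth
            r_i, r_q, r_gt = r_i.to(device), r_q.to(device), r_gt.to(device)
            r_rec, mu_i, logvar_i, mu_q, logvar_q = model(r_i, r_q)
            loss = iq_ved_loss(r_rec, r_gt, mu_i, logvar_i, mu_q, logvar_q)  # Equation (12)
            optim.zero_grad()
            loss.backward()
            optim.step()
```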
5 EVALUATION

In this section, we perform intensive evaluations on the performance of MoRe-Fi given several real-life scenarios and under various parameter settings.

5.1 Experiment Setup

In order to conduct the evaluations, we recruit 12 volunteers (6 females and 6 males), aged from 15 to 64, and weighing from 50 to 80 kg. All volunteers are healthy, and we measure their respiration in the natural state without the volunteers consciously controlling breathing or undergoing external forceful intervention. The volunteers are asked to carry out 7 common activities with different degrees of body movements: playing on phone, typewriting, exercising on the spot, shaking legs, walking on a treadmill, standing up/sitting down, and turning over in bed, all in real-life environments such as office, gym, and bedroom. For brevity, the names of these movements are abbreviated as PP, TW, ES, SL, WT, SS, and TO, respectively. All experiments have strictly followed the standard procedures of the IRB of our institute.

The IR-UWB radar is placed to face a human subject, within a range of 0.5 to 2 m, and at the same height as the chest of the subject. We collect a larger number of data entries for shorter activities (e.g., walking on a treadmill, which lasts several minutes) and a smaller number of data entries for longer activities (overnight sleeping with turning over in bed), so as to guarantee a balanced dataset. Our data collection leads to a 66-hour dataset of RF and ground truth recordings, including approximately 72,000 respiration cycles and roughly the same amount of data from every subject. After collection, both the raw and ground truth data are sliced into 20 s samples. Two-thirds of the collected data are used for training IQ-VED, and the remaining one-third is used for testing the performance of MoRe-Fi in recovering respiratory waveform.

5.2 Metric and Baseline Selection

5.2.1 Cosine Similarity. The cosine similarity S(𝒓′, 𝒓gt) between the IQ-VED recovered waveform 𝒓′(𝑛) and the ground truth 𝒓gt(𝑛) is used to measure the recovering performance of IQ-VED. Specifically, the cosine similarity is measured by the cosine of the angle between the two vectors 𝒓′(𝑛) and 𝒓gt(𝑛), and then determines to what extent the two vectors point to the same “direction” in a high dimensional space. It is defined as follows:

$$S(\boldsymbol{r}', \boldsymbol{r}_{\mathrm{gt}}) = \frac{\boldsymbol{r}' \cdot \boldsymbol{r}_{\mathrm{gt}}}{\lVert \boldsymbol{r}' \rVert\, \lVert \boldsymbol{r}_{\mathrm{gt}} \rVert} = \frac{\sum_{i=1}^{N} \boldsymbol{r}'(i)\, \boldsymbol{r}_{\mathrm{gt}}(i)}{\sqrt{\sum_{i=1}^{N} \boldsymbol{r}'(i)^{2}}\, \sqrt{\sum_{i=1}^{N} \boldsymbol{r}_{\mathrm{gt}}(i)^{2}}}, \qquad (14)$$

whose value lies in the range of [0, 1].
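A direct NumPy implementation of Equation (14) is shown below for reference; it is a straightforward transcription of the formula, with a guard against zero-length vectors added as a practical assumption.

```python
# Cosine similarity between recovered and ground-truth waveforms, Equation (14).
import numpy as np

def cosine_similarity(r_prime: np.ndarray, r_gt: np.ndarray) -> float:
    num = float(np.dot(r_prime, r_gt))
    den = float(np.linalg.norm(r_prime) * np.linalg.norm(r_gt))
    return num / den if den > 0 else 0.0
```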
5.2.2 Time Estimation Error. In Section 3.3.5, we have discussed several time-related biomarkers that can be deduced from the waveform, including 𝑡tc, 𝑡i, and 𝑡e. All of them can be calculated from the timestamps of the peaks and valleys in respiratory waveforms, so we study estimation errors in terms of the peak and valley times. Suppose the timestamps of the peak and valley are 𝑡p and 𝑡v, then the errors are defined as absolute differences between the estimated and actual values, i.e., $|t_{\mathrm{p}}^{\mathrm{e}} - t_{\mathrm{p}}^{\mathrm{a}}|$ and $|t_{\mathrm{v}}^{\mathrm{e}} - t_{\mathrm{v}}^{\mathrm{a}}|$.

5.2.3 Respiratory Rate Estimation Error. After recovering peaks and valleys in respiratory waveforms, the instantaneous respiratory rate 𝜌 can be estimated by taking the reciprocal of 𝑡tc. The error of respiratory rate is defined as the absolute difference between the estimated respiratory rate $\rho^{\mathrm{e}}$ and the actual respiratory rate $\rho^{\mathrm{a}}$, namely $|\rho^{\mathrm{e}} - \rho^{\mathrm{a}}|$.

5.2.4 Volume Estimation Error. Besides the event times in the waveform, the amplitude of the waveform is also worthy of exploring, as it can be used to represent the tidal volume 𝑣T of respiration given their proportional relation. Since we are not interested in the absolute value of the waveform amplitude, we define the relative error of volume as the absolute difference between the estimated value and actual value divided by the actual value, $|v_{\mathrm{T}}^{\mathrm{e}} - v_{\mathrm{T}}^{\mathrm{a}}|/v_{\mathrm{T}}^{\mathrm{a}}$.

5.2.5 Baseline Selection. BreathListener [83], a respiration monitoring system for driving environments, is picked as a comparison baseline, because BreathListener also claims to take into account motion interference (albeit only small-scale ones) caused by running vehicles. However, we have to port BreathListener to radar as it was designed for acoustic sensing. Essentially, BreathListener adopts a two-stage processing pipeline. It first employs EEMD [82] to separate respiration from interference. As discussed in Section 2.3, EEMD can only recover coarse-grained waveform (if not an incorrect one). Therefore, it further applies a Generative Adversarial Network (GAN) [22] for adding details to the recovered waveform.

5.3 Performance Results

We start by performing an overall evaluation, showing intuitive waveform results and cosine similarities under different body movements. We then compare MoRe-Fi with a baseline method. Finally, we measure various estimation errors quantitatively.

5.3.1 Overall Performance. Figure 14 shows the respiratory waveform generated by IQ-VED compared to its corresponding ground truth versions during several 50 s activities.² In Figure 14a, it can be observed that the motion interference caused by typewriting is sporadic, but the intensive interference period can affect the non-intensive one by shifting the phase of respiration significantly (the respiratory waveform can be roughly observed during the latter period). Fortunately, MoRe-Fi not only recovers respiratory waveform during the intensive period, but also corrects the phase throughout the whole period, at a minor cost of waveform deforming during the transitional phase. Compared with Figure 14a, both ES (Figure 14b) and WT (Figure 14c) incur more stationary motion interference with WT being much more intensive, clearly affecting respiration to a greater extent and causing deterioration in both respiratory waveforms 𝒓′(𝑛) and 𝒓gt(𝑛). As the last example, Figure 14d shows that, though TO (during the time spans of [5, 10] s and [34, 43] s) affects the signal phase significantly, IQ-VED successfully recovers respiration both when the human subject is lying quasi-statically and turning over.

² Although our IQ-VED is trained with 20 s samples, its CNN-based encoder is flexible enough to accommodate an arbitrary sample length in practice.
Figure 14: Qualitative result showing that MoRe-Fi recovers the respiratory waveform from the noisy radar signal under different body movements. (a) Typewriting (TW). (b) Exercising on the spot (ES). (c) Walking on a treadmill (WT). (d) Turning over in bed (TO).

To further explore how individual body movement types affect the performance of MoRe-Fi, the cosine similarity under each body movement is studied in Figure 15. PP and SL can be observed as having the least impact on the MoRe-Fi performance, as the body parts involved in the movements are far from the subject's chest. As expected, WT and SS cause the worst performance of MoRe-Fi since they both induce large body movements that severely interfere with the respiration signal. Overall, the average cosine similarity between the recovered and ground truth respiratory waveform is 0.9162, indicating a very successful recovery, as a similarity greater than 0.8 suggests a strong positive correlation.

Figure 15: Cosine similarity between recovered waveform and ground truth under different body movements.

5.3.2 Comparison with Baseline Method. We compare MoRe-Fi with BreathListener [83] in terms of recovered waveform quality in Figure 16. Two examples are shown in Figures 16a and 16b to directly contrast the waveforms, then comparisons in terms of three metrics are provided in the remaining subfigures. In general, MoRe-Fi recovers respiratory waveform accurately, whereas BreathListener tends to generate distorted (sometimes even erroneous) waveform when the body movements become more intensive. In Figures 16c, 16d, and 16e, MoRe-Fi exhibits a much better performance in cosine similarity, respiratory rate estimation, and peak/valley time estimation than the baseline, all thanks to its motion-robust design.

The inferior performance of BreathListener can be attributed to the mismatch between EEMD and the GAN adopted by it. As discussed in Section 2.3, the EEMD algorithm is incapable of handling complex I/Q signals, so one has to first project the I/Q signals to a 1-D sequence in an information-lossy manner. A consequence of this drawback is that motion interference cannot be correctly separated, as illustrated in Figure 7. Given the potentially erroneous decomposition of EEMD, the GAN that already suffers from instability during training [68] becomes even harder to converge. For those converged cases, the EEMD decomposed waveform is already close to ground truth, though possibly with wrong features (e.g., phase) that the GAN barely helps to correct. Consequently, the biomarkers inferred from the BreathListener recovered waveform can have very large errors, as shown in Figures 16d and 16e. On the contrary, our IQ-VED is trained and operates in an integrated manner: it uses the encoder to decompose the signal and the decoder to reconstruct respiratory waveform. As a result, MoRe-Fi is far more effective than the baseline, as demonstrated by these comparisons.

Figure 16: Comparison with baseline method. (a) Turning over in bed (TO). (b) Walking on a treadmill (WT). (c) Cosine similarity. (d) Respiratory rate. (e) Peak and valley time.
5.3.3 Estimation Errors of Indicators. Given the overall performance of MoRe-Fi reported in the previous sub-sections, we hereby pay special attention to the estimation errors of several indicators. Naturally, the performance of MoRe-Fi in estimating instantaneous respiratory rate is first evaluated in Figure 17, showing a very consistent accuracy with the majority of errors being under 0.1 bpm. Similar to the cosine similarity, both PP and SL have the least impact on rate estimation, while WT and SS cause the worst performance because they entail large body movements.

Figure 17: Estimation error of respiratory rate.

To evaluate the accuracy of time-related biomarkers such as 𝑡tc, 𝑡i, and 𝑡e, we inspect the estimation errors of the peak and valley times on respiratory waveform, and the results are shown in Figure 18. It can be observed that most of the mean errors are below 0.1 s, indicating high accuracy of MoRe-Fi's event time estimation. An interesting phenomenon is that the errors of valley time are noticeably larger than those of the peak time; this can be attributed to the fact that the valleys in the waveform, as shown in Figure 14, are relatively "flatter" than the peaks, thus making it harder for IQ-VED to capture and recover the exact times of the valleys.

Figure 18: Estimation errors of peak and valley times on respiratory waveform.

Finally, the performance of tidal volume estimation is reported in Figure 19, which shows that the mean errors for all seven body movements are below 3%, and more than 75% of the errors are below 5%. Similar to previous sections, we find that PP and SL have the least impact on rate estimation, while WT and SS cause the worst performance. Particularly, it appears that SS has the most adverse effects on the result, which can be explained by the fact that the chest displacement caused by a human subject standing up or sitting down varies the most (compared with other body movements) along the propagation direction of radar signals. Overall, these promising results suggest that MoRe-Fi has the potential to further enable lung volume monitoring, as will be discussed in Section 6.

Figure 19: Estimation error of tidal volume.

5.4 Impact of Practical Factors

Because biomarkers can all be inferred from respiratory waveform, we focus on evaluating the cosine similarity of the waveforms in this section.

5.4.1 Human Subjects. We show the cosine similarities of MoRe-Fi recovered respiratory waveform for all 12 subjects in Figure 20. Based on the figure, one may readily conclude that the mean cosine similarities are always greater than 0.95, and more than 75% of all similarities are above 0.85. These results show that the respiratory waveform recovery of MoRe-Fi remains accurate across all involved subjects, largely insensitive to physical discrepancies among them.

Figure 20: Impact of different human subjects on the cosine similarity.

5.4.2 Training Set Size. As stated in Section 4, we collect 8,000 data samples of subjects performing different activities for training the IQ-VED network of MoRe-Fi. Figure 21 shows the impact of training set size on the cosine similarity between the recovered and ground truth waveforms. One may observe that, as the training set size increases, the cosine similarity first increases and then comes to saturation. Specifically, MoRe-Fi achieves a cosine similarity greater than 0.9 with 6,000 training samples, which corresponds to 33 hours of activity data. Because more training data improve the waveform recovery performance only by a negligible margin, our selection of 8,000 training samples is sufficient.

Figure 21: Impact of training set size. Figure 22: Impact of latent space dimension.

5.4.3 Latent Space Dimension. Another property of IQ-VED that affects the recovery performance is the number of latent space dimensions. On one hand, a small latent space dimension may limit the capacity of the latent representation and potentially prevent the loss function from converging to a sufficiently small value. On the other hand, as most practical signals are sparse, overly increasing the dimension of the latent space can be unnecessary while causing slow convergence in training. Consequently, a competent system
MoRe-Fi: Motion-robust and Fine-grained Respiration Monitoring via Deep-Learning UWB Radar SenSys’21, November 15–17, 2021, Coimbra, Portugal

should strike a balance between expressiveness and compactness sensing distance on performance. Nonetheless, the average cosine
of the latent space. According to Figure 22 that shows the impact of similarity remains above 0.8 even under the most intensive body
latent space dimension on the cosine similarity between the recov- movements, firmly proving the effectiveness of MoRe-Fi.
ered and ground truth waveforms, the performance first improves
with the dimension thanks to a better expressiveness but degrades 6 POTENTIAL MEDICAL ADOPTIONS
after the dimension reaching 64 due to the increased hardness in MoRe-Fi is expected to not only continuously extract respiratory
training. Therefore, 64 is chosen as the dimension of the latent waveform, but also enable earlier intervention for people with po-
space for IQ-VED. tential pulmonary disease. Unfortunately, we cannot fully evaluate
5.4.4 Weights of the Loss Function. The weights in Equation (12) are crucial parameters to be tuned for IQ-VED. In theory, a larger γ encourages continuity and disentanglement of the latent space, potentially improving the generalization capability of IQ-VED. A larger η improves the alignment of the I/Q representations but may restrict their ability to express the underlying I/Q signal. To determine the optimal weights, Figure 23 shows the cosine similarity between the recovered and ground-truth waveforms as functions of the individual weights; one can clearly observe that γ = 3 and η = 2e−4 allow IQ-VED to achieve the best performance in waveform recovery.
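To make the tuning procedure concrete, here is a minimal sketch of the one-at-a-time sweep suggested by Figure 23; train_iq_ved and validation_cosine_similarity are hypothetical helpers standing in for the actual training and evaluation routines, and the candidate values simply mirror the axes of the figure.

```python
# Hypothetical sweep over the two loss weights (cf. Figure 23); train_iq_ved() and
# validation_cosine_similarity() are assumed stand-ins, not MoRe-Fi's actual API.
gamma_grid = [0, 0.5, 1, 1.5, 2, 3, 4, 5]
eta_grid = [0, 4e-5, 1e-4, 2e-4, 3e-4, 4e-4, 5e-4]

# Sweep gamma with eta held fixed, then eta with the best gamma, keeping the top score.
best_gamma = max(gamma_grid,
                 key=lambda g: validation_cosine_similarity(train_iq_ved(gamma=g, eta=2e-4)))
best_eta = max(eta_grid,
               key=lambda e: validation_cosine_similarity(train_iq_ved(gamma=best_gamma, eta=e)))
print(f"selected weights: gamma={best_gamma}, eta={best_eta}")
```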
Figure 23: Impact of different weights of the loss function. (a) γ (when η = 2e−4). (b) η (when γ = 3).
5.4.5 Subject Clothing. In the experiments, we ask the subjects to wear different clothes and record the respective measurements accordingly. The results shown in Figure 24 involve four types of clothes: i) lightweight T-shirt, ii) heavyweight T-shirt, iii) coat + lightweight T-shirt, and iv) coat + heavyweight T-shirt. According to Figure 24, MoRe-Fi reaches an overall average cosine similarity above 0.9 across all types of clothes. Intuitively, MoRe-Fi does perform slightly better when a subject wears less, because heavier clothes attenuate more severely the signals reflected from the subject's chest. Fortunately, even in the worst case (with coat and heavy T-shirt), the average cosine similarity remains well above 0.8.
5.4.6 Sensing Distance. Sensing distance is a major limiting factor of RF respiration monitoring systems. We ask the human subjects to stay away from the radar at 0.5 m, 1 m, 1.5 m, and 2 m to study the impact of sensing distance. Not surprisingly, the cosine similarities reported in Figure 25 clearly demonstrate a negative effect of sensing distance on performance. Nonetheless, the average cosine similarity remains above 0.8 even under the most intensive body movements, firmly proving the effectiveness of MoRe-Fi.
Figure 24: Impact of subject clothing.
Figure 25: Impact of sensing distance.

6 POTENTIAL MEDICAL ADOPTIONS
MoRe-Fi is expected to not only continuously extract the respiratory waveform, but also enable earlier intervention for people with potential pulmonary diseases. Unfortunately, we cannot fully evaluate the second capability due to the lack of subjects with related diseases and of support from medical professionals. Therefore, instead of evaluating the performance in disease diagnosis, we hereby give a brief discussion on potential medical adoptions of MoRe-Fi.
The biomarkers evaluated in Sections 3.3.5 and 5.3.3 are directly obtainable from the timestamps and amplitude of the respiratory waveform; they certainly reflect changes in respiratory patterns and hence can serve as indicators of a series of health conditions, including apnea (cessation of breathing) [79], tachypnea (abnormally rapid breathing) [4], hyperpnea (abnormally deep breathing) [1], dyspnea (shortness of breath) [49], Cheyne-Stokes respiration (progressively deeper breathing followed by a gradual decrease that results in an apnea) [42], and Biot respiration (regular deep inhalations followed by periods of apnea) [15].
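As a rough illustration of how such timestamp-and-amplitude biomarkers could be pulled out of a recovered waveform, the sketch below uses simple peak detection; the 1.5-second minimum breath spacing and the 10-second apnea threshold are illustrative assumptions, not values taken from MoRe-Fi.

```python
import numpy as np
from scipy.signal import find_peaks

def breathing_biomarkers(waveform: np.ndarray, fs: float, apnea_gap_s: float = 10.0):
    """Derive coarse biomarkers from a recovered respiratory waveform sampled at fs Hz."""
    # end-inspiration peaks, forced to be at least 1.5 s apart to suppress within-breath jitter
    peaks, _ = find_peaks(waveform, distance=int(1.5 * fs))
    intervals = np.diff(peaks) / fs                        # breath-to-breath intervals (seconds)
    rate_bpm = 60.0 / intervals.mean() if intervals.size else float("nan")
    apnea_events = int(np.sum(intervals > apnea_gap_s))    # suspiciously long gaps hint at apnea
    depth = waveform[peaks].mean() if peaks.size else float("nan")  # proxy for breathing depth
    return rate_bpm, apnea_events, depth
```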
In addition to the directly observable biomarkers, respiration flow can also be derived from the fine-grained waveform. Because respiration flow q(n) can be seen as the derivative of respiration volume, and respiration volume is proportional to the amplitude of the recovered waveform r′(n), we have q(n) = c · dr′(n)/dn, where n denotes time and c is a scaling factor. We further apply the noise-robust differentiator [30] to obtain the respiration flow q(n):

$q(n) \approx \frac{1}{h} \sum_{k=1}^{M} b_k \big( r(n+k) - r(n-k) \big)$,   (15)

where $b_k = \frac{1}{2^{2m+1}} \left[ \binom{2m}{m-k+1} - \binom{2m}{m-k-1} \right]$, h is the time interval between two consecutive points, m = (N−3)/2, M = (N−1)/2, and N is the number of points used to estimate the derivative. An example is provided in Figure 26a. By tracing the change of flow and volume together, the flow-volume loop graph is obtained in Figure 26b, which clearly visualizes both the inhalation and exhalation processes.

Figure 26: Visualization of respiration volume and flow. (a) Respiration volume and flow. (b) Flow-volume loop.
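For concreteness, a minimal sketch of Equation (15) is given below; it assumes a uniformly sampled volume waveform, uses N = 7 as an arbitrary filter length, and leaves the scaling factor c to the caller. Plotting the returned q(n) against r(n) then yields the flow-volume loop of Figure 26b.

```python
import numpy as np
from math import comb

def respiration_flow(r: np.ndarray, h: float, N: int = 7) -> np.ndarray:
    """Smooth noise-robust differentiator of Equation (15): flow from a volume waveform
    r sampled every h seconds; N is the (odd) number of points used per estimate."""
    m, M = (N - 3) // 2, (N - 1) // 2
    def c2(n, k):                      # binomial coefficient that is zero outside 0 <= k <= n
        return comb(n, k) if 0 <= k <= n else 0
    b = [(c2(2 * m, m - k + 1) - c2(2 * m, m - k - 1)) / 2 ** (2 * m + 1)
         for k in range(1, M + 1)]
    q = np.zeros(len(r), dtype=float)
    for n in range(M, len(r) - M):     # borders are left at zero where the stencil does not fit
        q[n] = sum(bk * (r[n + k] - r[n - k]) for k, bk in enumerate(b, start=1)) / h
    return q                           # multiply by the scaling factor c for physical units
```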
The shape of the flow-volume loop can provide diagnostic information for many chronic pulmonary diseases related to abnormal airflow, as illustrated in Figure 27. Figures 27a to 27d respectively demonstrate i) restrictive lung disease as a result of a decreased lung volume [55], ii) obstructive lung disease caused by obstruction to airflow when exhaling [16], iii) fixed upper airway obstruction occurring when infections spread along the planes formed by the deep cervical fascia [50], and iv) variable extrathoracic obstruction resulting from tumors of the lower trachea or main bronchus [26].

Figure 27: Flow-volume loop patterns of different pulmonary diseases related to abnormal airflow. (a) Restrictive lung disease. (b) Obstructive lung disease. (c) Fixed upper airway obstruction. (d) Variable extrathoracic obstruction.

The capability of continuously monitoring potential lung diseases in a motion-robust manner makes MoRe-Fi useful in many scenarios. To take a few examples, MoRe-Fi can be deployed in hospitals, in nursing facilities, as well as at home. In hospitals, the long-term respiration data provided by MoRe-Fi allow effective triage and ongoing care management. In nursing facilities and at home, health professionals and ordinary users will become more aware of the respiratory status of the care recipients, and any sudden changes in the recovered respiratory waveform can be addressed immediately to prevent further exacerbation. Moreover, in the recent COVID-19 pandemic, MoRe-Fi can be deployed to enable earlier intervention and prevent the spread of this deadly infectious disease.

7 RELATED WORK
Existing works on respiration monitoring can be roughly categorized into wearable sensor-based methods [11, 14, 24, 53, 56] and contact-free methods [3, 36, 52, 57, 66, 69, 77, 83, 84]. The contact nature of wearable sensors causes time-consuming adoption, degrades user experience, and may even change users' respiration habits [23], so our paper focuses on innovating contact-free respiration monitoring. Contact-free respiration monitoring exploits either RF sensing [3, 36, 57, 66, 84, 87, 90] or acoustic sensing [52, 69, 77, 78, 83].³ Being a typical RF-sensing system, MoRe-Fi aims to achieve both motion-robust and fine-grained respiratory waveform recovery, hence our brief literature survey emphasizes these two aspects when discussing both sensing media.
³We deliberately omit light-sensing methods (e.g., [75]), as they rely on a camera to perform motion tracking, which is both complicated and ineffective: it certainly cannot handle the large-scale body movements faced by MoRe-Fi.

RF-based respiration monitoring started with estimating respiratory rate and recovering coarse-grained waveforms of human subjects in static conditions [3, 36, 66, 84, 85, 87, 89], which is readily achievable by basic spectrum analysis and filtering. Because these methods do not take motion-robustness into account, they simply suspend respiration monitoring upon encountering sudden motion interference. To achieve motion robustness, early proposals [46, 54] rely on tricky placements of multiple radars so as to cancel out motion interference, at the cost of cumbersome synchronization. Later researchers have also tried to apply linear filtering to mitigate the effects of motion interference [48, 74], but their assumptions of unrealistic 1-D body movements [74] or the existence of quasi-static periods during movements [48] are too strong to be realistic. The latest proposal V2iFi [90] employs an adapted VMD algorithm to remove motion interference from turning the steering wheel and from the running vehicle, but it achieves only coarse-grained respiration monitoring by leveraging the amplitude of the RF signal alone.

Besides RF sensing, acoustic sensing is an alternative candidate for respiration monitoring due to its readily deployable consumer-grade hardware. C-FMCW [78] and BreathJunior [77] adopt either continuous-wave or white-noise signals for respiration monitoring; they both target respiratory rate by assuming a subject to remain static. BreathListener [83] targets respiratory waveform recovery in driving environments by tackling only small-scale body movements; as explained in Section 5.3.2, the two-stage algorithm adopted there is complex, and its two stages are mutually incompatible. SpiroSonic [69] monitors human lung function by turning the speaker and microphone of a smartphone into a spirometer; it leverages deep learning to extract several vital indicators such as peak expiratory flow, forced expiratory volume, and forced vital capacity. Though SpiroSonic is not a continuous respiration monitoring system like MoRe-Fi, it does tolerate very small-scale hand drifts while holding a phone, by selectively fitting to relatively "clean" data segments free of strong motion interference. In general, acoustic sensing has limited applicability because it is prone to ambient and motion-induced acoustic interference.

8 CONCLUSION
Taking an important step toward continuous and ubiquitous health care, we have proposed MoRe-Fi in this paper for motion-robust and contact-free respiration monitoring, aiming to recover the fine-grained waveform rather than respiratory rate only. Built upon a radar platform [10] and up-to-date deep learning technologies, MoRe-Fi expands the scope of contact-free vital sign monitoring by gracefully tackling full-scale body movements. Essentially, though existing signal processing methods unanimously fail to handle the composition between body movements and respiration-induced chest motion, our IQ-VED (the core of MoRe-Fi) succeeds by exerting its non-linear decomposition ability. Via extensive experiments on healthy subjects, we have demonstrated the promising performance of MoRe-Fi in fine-grained waveform recovery and long-term respiration monitoring. As this line of work progresses, we plan to evaluate MoRe-Fi's performance in real-life clinical scenarios, as we believe this work has significant implications for various medical applications including pulmonary disease diagnosis. Moreover, we plan to exploit the spatial diversity offered by large-scale antenna arrays [88, 92] to approach the issue of monitoring the vital signs of a walking subject.

ACKNOWLEDGMENTS
We are grateful to the anonymous reviewers for their valuable and constructive comments. We would also like to thank WiRUSH [81] for providing funding to develop MoRe-Fi.
REFERENCES
[1] E.A. Aaron, K.C. Seow, B.D. Johnson, and J.A. Dempsey. 1992. Oxygen Cost of Exercise Hyperpnea: Implications for Performance. Journal of Applied Physiology 72, 5 (1992), 1818–1825.
[2] Fadel Adib, Zach Kabelac, Dina Katabi, and Robert C. Miller. 2014. 3D Tracking via Body Radio Reflections. In Proc. of the 10th USENIX NSDI. 317–329.
[3] Fadel Adib, Hongzi Mao, Zachary Kabelac, Dina Katabi, and Robert C. Miller. 2015. Smart Homes that Monitor Breathing and Heart Rate. In Proc. of the 33rd ACM CHI. 837–846.
[4] Mary Ellen Avery, Olga Baghdassarian Gatewood, and George Brumley. 1966. Transient Tachypnea of Newborn: Possible Delayed Resorption of Fluid at Birth. American Journal of Diseases of Children 111, 4 (1966), 380–385.
[5] Surya P. Bhatt, Young-il Kim, James M. Wells, William C. Bailey, Joe W. Ramsdell, Marilyn G. Foreman, Robert L. Jensen, Douglas S. Stinson, Carla G. Wilson, David A. Lynch, et al. 2014. FEV1/FEV6 to Diagnose Airflow Obstruction. Comparisons with Computed Tomography and Morbidity Indices. Annals of the American Thoracic Society 11, 3 (2014), 335–341.
[6] David M. Blei, Alp Kucukelbir, and Jon D. McAuliffe. 2017. Variational Inference: A Review for Statisticians. Journal of the American Statistical Association 112, 518 (2017), 859–877.
[7] Léon Bottou. 2012. Stochastic Gradient Descent Tricks. In Neural Networks: Tricks of the Trade. Springer, 421–436.
[8] Laurent Brochard, Greg S. Martin, Lluis Blanch, Paolo Pelosi, F. Javier Belda, Amal Jubran, Luciano Gattinoni, Jordi Mancebo, V. Marco Ranieri, Jean-Christophe M. Richard, et al. 2012. Clinical Review: Respiratory Monitoring in the ICU - A Consensus of 16. Critical Care 16, 2 (2012), 1–14.
[9] Zhe Chen, Tianyue Zheng, Chao Cai, and Jun Luo. 2021. MoVi-Fi: Motion-robust Vital Signs Waveform Recovery via Deep Interpreted RF Sensing. In Proc. of the 27th ACM MobiCom. 1–14.
[10] Zhe Chen, Tianyue Zheng, and Jun Luo. 2021. Octopus: A Practical and Versatile Wideband MIMO Sensing Platform. In Proc. of the 27th ACM MobiCom. 1–14.
[11] Michael Chu, Thao Nguyen, Vaibhav Pandey, Yongxiao Zhou, Hoang N. Pham, Ronen Bar-Yoseph, Shlomit Radom-Aizik, Ramesh Jain, Dan M. Cooper, and Michelle Khine. 2019. Respiration Rate and Volume Measurements Using Wearable Strain Sensors. NPJ Digital Medicine 2, 1 (2019), 1–9.
[12] Jonathan Corren, Alkis Togias, and Jean Bousquet. 2003. Upper and Lower Respiratory Disease. CRC Press.
[13] Shuya Ding, Zhe Chen, Tianyue Zheng, and Jun Luo. 2020. RF-Net: A Unified Meta-Learning Framework for RF-Enabled One-Shot Human Activity Recognition. In Proc. of the 18th ACM SenSys. 517–530.
[14] Biyi Fang, Nicholas D. Lane, Mi Zhang, Aidan Boran, and Fahim Kawsar. 2016. BodyScan: Enabling Radio-based Sensing on Wearable Devices for Contactless Activity and Vital Sign Monitoring. In Proc. of the 14th ACM MobiSys. 97–110.
[15] Robert J. Farney, James M. Walker, Kathleen M. Boyle, Tom V. Cloward, and Kevin C. Shilling. 2008. Adaptive Servoventilation (ASV) in Patients with Sleep Disordered Breathing Associated with Chronic Opioid Medications for Non-Malignant Pain. Journal of Clinical Sleep Medicine 4, 4 (2008), 311–319.
[16] D.C. Flenley. 1985. Sleep in Chronic Obstructive Lung Disease. Clinics in Chest Medicine 6, 4 (1985), 651–661.
[17] Charles Fletcher and Richard Peto. 1977. The Natural History of Chronic Airflow Obstruction. Br Med J 1, 6077 (1977), 1645–1648.
[18] Mia Folke, Lars Cernerud, Martin Ekström, and Bertil Hök. 2003. Critical Review of Non-Invasive Respiratory Monitoring in Medical Care. Medical and Biological Engineering and Computing 41, 4 (2003), 377–383.
[19] Audrey G. Gift, Trellis Moore, and Karen Soeken. 1992. Relaxation to Reduce Dyspnea and Anxiety in COPD Patients. Nursing Research (1992).
[20] Clark R. Givens, Rae Michael Shortt, et al. 1984. A Class of Wasserstein Metrics for Probability Distributions. The Michigan Mathematical Journal 31, 2 (1984), 231–240.
[21] Xavier Glorot, Antoine Bordes, and Yoshua Bengio. 2011. Deep Sparse Rectifier Neural Networks. In Proc. of the 14th AISTATS. 315–323.
[22] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Networks. In Proc. of the 27th NIPS. 1–9.
[23] Changzhan Gu and Changzhi Li. 2015. Assessment of Human Respiration Patterns via Noncontact Sensing using Doppler Multi-Radar System. Sensors 15, 3 (2015), 6383–6398.
[24] Firat Güder, Alar Ainla, Julia Redston, Bobak Mosadegh, Ana Glavan, T.J. Martin, and George M. Whitesides. 2016. Paper-based Electrical Respiration Sensor. Angewandte Chemie International Edition 55, 19 (2016), 5727–5732.
[25] Tian Hao, Chongguang Bi, Guoliang Xing, Roxane Chan, and Linlin Tu. 2017. MindfulWatch: A Smartwatch-based System for Real-Time Respiration Monitoring during Meditation. In Proc. of the 19th ACM UbiComp. 1–19.
[26] Edward F. Haponik, Eugene R. Bleecker, Richard P. Allen, Philip L. Smith, and Joseph Kaplan. 1981. Abnormal Inspiratory Flow-Volume Curves in Patients with Sleep-Disordered Breathing. American Review of Respiratory Disease 124, 5 (1981), 571–574.
[27] Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. 2016. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. (2016).
[28] Matthew D. Hoffman, David M. Blei, Chong Wang, and John Paisley. 2013. Stochastic Variational Inference. Journal of Machine Learning Research 14, 5 (2013).
[29] Anne E. Holland, Catherine J. Hill, Alice Y. Jones, and Christine F. McDonald. 2012. Breathing Exercises for Chronic Obstructive Pulmonary Disease. Cochrane Database of Systematic Reviews 10 (2012).
[30] Pavel Holoborodko. 2008. Smooth Noise Robust Differentiators. http://www.holoborodko.com/pavel/numerical-methods/numerical-derivative/smooth-low-noise-differentiators/.
[31] Aapo Hyvärinen and Petteri Pajunen. 1999. Nonlinear Independent Component Analysis: Existence and Uniqueness Results. Neural Netw. 12, 3 (1999), 429–439.
[32] IETF. 2017. Precision Time Protocol Version 2 (PTPv2). Accessed: 2021-04-30.
[33] Sergey Ioffe and Christian Szegedy. 2015. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proc. of ICML. PMLR, 448–456.
[34] Zhenhua Jia, Amelie Bonde, Sugang Li, Chenren Xu, Jingxian Wang, Yanyong Zhang, Richard E. Howard, and Pei Zhang. 2017. Monitoring a Person's Heart Rate and Respiratory Rate on a Shared Bed using Geophones. In Proc. of the 15th ACM SenSys. 1–14.
[35] Christian Jutten and Juha Karhunen. 2003. Advances in Nonlinear Blind Source Separation. In Proc. of the 4th Int. Symp. on Independent Component Analysis and Blind Signal Separation (ICA2003). 245–256.
[36] Ossi Kaltiokallio, Hüseyin Yiğitler, Riku Jäntti, and Neal Patwari. 2014. Non-Invasive Respiration Rate Monitoring using a Single COTS TX-RX Pair. In Proc. of the 13th ACM IPSN. IEEE, 59–69.
[37] Ilyes Khemakhem, Diederik Kingma, Ricardo Monti, and Aapo Hyvarinen. 2020. Variational Autoencoders and Nonlinear ICA: A Unifying Framework. In Proc. of the 20th AISTATS. PMLR, 2207–2217.
[38] Diederik P. Kingma and Max Welling. 2013. Auto-Encoding Variational Bayes. In Proc. of ICLR. 1–14.
[39] Soheil Kolouri, Phillip E. Pope, Charles E. Martin, and Gustavo K. Rohde. 2018. Sliced Wasserstein Auto-Encoders. In Proc. of ICLR. 1–19.
[40] Solomon Kullback and Richard A. Leibler. 1951. On Information and Sufficiency. The Annals of Mathematical Statistics 22, 1 (1951), 79–86.
[41] Siddharth Krishna Kumar. 2017. On Weight Initialization in Deep Neural Networks. arXiv preprint arXiv:1704.08863 (2017).
[42] Paola A. Lanfranchi, Alberto Braghiroli, Enzo Bosimini, Giorgio Mazzuero, Roberto Colombo, Claudio F. Donner, and Pantaleo Giannuzzi. 1999. Prognostic Value of Nocturnal Cheyne-Stokes Respiration in Chronic Heart Failure. Circulation 99, 11 (1999), 1435–1440.
[43] Yann LeCun, Patrick Haffner, Léon Bottou, and Yoshua Bengio. 1999. Object Recognition with Gradient-Based Learning. In Shape, Contour and Grouping in Computer Vision. Springer, 319–345.
[44] Alexander Lee, Xiaomeng Gao, Jia Xu, and Olga Boric-Lubecke. 2017. Effects of Respiration Depth on Human Body Radar Cross Section Using 2.4 GHz Continuous Wave Radar. In 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, 4070–4073.
[45] Nadav Levanon. 1988. Radar Principles. Wiley.
[46] Changzhi Li and Jenshan Lin. 2008. Random Body Movement Cancellation in Doppler Radar Vital Sign Detection. IEEE Transactions on Microwave Theory and Techniques 56, 12 (2008), 3143–3152.
[47] Yi Luo and Nima Mesgarani. 2018. TasNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 696–700.
[48] Qinyi Lv, Lei Chen, Kang An, Jun Wang, Huan Li, Dexin Ye, Jiangtao Huangfu, Changzhi Li, and Lixin Ran. 2018. Doppler Vital Signs Detection in the Presence of Large-Scale Random Body Movements. IEEE Transactions on Microwave Theory and Techniques 66, 9 (2018), 4261–4270.
[49] Harold L. Manning and Richard M. Schwartzstein. 1995. Pathophysiology of Dyspnea. New England Journal of Medicine 333, 23 (1995), 1547–1553.
[50] Albert Miller, Lee K. Brown, and Alvin S. Teirstein. 1985. Stenosis of Main Bronchi Mimicking Fixed Upper Airway Obstruction in Sarcoidosis. Chest 88, 2 (1985), 244–248.
[51] Stylianos Ioannis Mimilakis, Konstantinos Drossos, Tuomas Virtanen, and Gerald Schuller. 2017. A Recurrent Encoder-Decoder Approach with Skip-Filtering Connections for Monaural Singing Voice Separation. In 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 1–6.
[52] Se Dong Min, Jin Kwon Kim, Hang Sik Shin, Yong Hyeon Yun, Chung Keun Lee, and Myoungho Lee. 2010. Noncontact Respiration Rate Measurement System Using an Ultrasonic Proximity Sensor. IEEE Sensors Journal 10, 11 (2010), 1732–1739.
[53] Se Dong Min, Yonghyeon Yun, and Hangsik Shin. 2014. Simplified Structural Textile Respiration Sensor based on Capacitive Pressure Sensing Method. IEEE Sensors Journal 14, 9 (2014), 3245–3251.
[54] José-María Muñoz-Ferreras, Zhengyu Peng, Roberto Gómez-García, and Changzhi Li. 2016. Random Body Movement Mitigation for FMCW-radar-based Vital-Sign Monitoring. In IEEE Topical Conference on Biomedical Wireless Technologies, Networks, and Sensing Systems. 22–24.
[55] Nizar A. Naji, Marian C. Connor, Seamas C. Donnelly, and Timothy J. McDonnell. 2006. Effectiveness of Pulmonary Rehabilitation in Restrictive Lung Disease. Journal of Cardiopulmonary Rehabilitation and Prevention 26, 4 (2006), 237–243.
[56] NeuLog. 2017. Respiration Monitor Belt Logger Sensor NUL-236. https://neulog.com/respiration-monitor-belt/. Accessed: 2021-04-28.
[57] Phuc Nguyen, Xinyu Zhang, Ann Halbower, and Tam Vu. 2016. Continuous and Fine-Grained Breathing Volume Monitoring from Afar Using Wireless Signals. In Proc. of the 35th IEEE INFOCOM. 1–9.
[58] Novelda AS. 2017. The World Leader in Ultra Wideband (UWB) Sensing. https://novelda.com/technology/. Accessed: 2021-04-22.
[59] Ingram Olkin and Friedrich Pukelsheim. 1982. The Distance between Two Random Vectors with Given Dispersion Matrices. Linear Algebra Appl. 48 (1982), 257–263.
[60] World Health Organization. 2020. The Top 10 Causes of Death. https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death. Accessed: 2021-04-14.
[61] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. arXiv preprint arXiv:1912.01703 (2019).
[62] Mark Pollock, Jairo Roa, Joshua Benditt, and Bartolome Celli. 1993. Estimation of Ventilatory Reserve by Stair Climbing: A Study in Patients with Chronic Airflow Obstruction. Chest 104, 5 (1993), 1378–1383.
[63] Thomas E. Potok, Catherine Schuman, Steven Young, Robert Patton, Federico Spedalieri, Jeremy Liu, Ke-Thia Yao, Garrett Rose, and Gangotree Chakma. 2018. A Study of Complex Deep Learning Networks on High-performance, Neuromorphic, and Quantum Computers. ACM Journal on Emerging Technologies in Computing Systems (JETC) 14, 2 (2018), 1–21.
[64] Omid Rasouli, Stanisław Solnik, Mariusz P. Furmanek, Daniele Piscitelli, Ali Falaki, and Mark L. Latash. 2017. Unintentional Drifts During Quiet Stance and Voluntary Body Sway. Experimental Brain Research 235, 7 (2017), 2301–2316.
[65] Raspberry Pi Foundation. 2021. Teach, Learn and Make with Raspberry Pi. https://www.raspberrypi.org/. Accessed: 2021-04-28.
[66] Ruth Ravichandran, Elliot Saba, Ke-Yu Chen, Mayank Goel, Sidhant Gupta, and Shwetak N. Patel. 2015. WiBreathe: Estimating Respiration Rate Using Wireless Signals in Natural Settings in the Home. In Proc. of the 13th IEEE PerCom. 131–139.
[67] Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. 2014. Stochastic Backpropagation and Approximate Inference in Deep Generative Models. In Proc. of ICML. 1278–1286.
[68] Konstantin Shmelkov, Cordelia Schmid, and Karteek Alahari. 2018. How Good is My GAN?. In Proc. of the 15th ECCV. 213–229.
[69] Xingzhe Song, Boyuan Yang, Ge Yang, Ruirong Chen, Erick Forno, Wei Chen, and Wei Gao. 2020. SpiroSonic: Monitoring Human Lung Function via Acoustic Sensing on Commodity Smartphones. In Proc. of the 26th ACM MobiCom. 1–14.
[70] Joan B. Soriano, Parkes J. Kendrick, Katherine R. Paulson, Vinay Gupta, Elissa M. Abrams, Rufus Adesoji Adedoyin, Tara Ballav Adhikari, Shailesh M. Advani, Anurag Agrawal, Elham Ahmadian, et al. 2020. Prevalence and Attributable Health Burden of Chronic Respiratory Diseases, 1990–2017: A Systematic Analysis for the Global Burden of Disease Study 2017. The Lancet Respiratory Medicine 8, 6 (2020), 585–596.
[71] Chiheb Trabelsi, Olexa Bilaniuk, Ying Zhang, Dmitriy Serdyuk, Sandeep Subramanian, Joao Felipe Santos, Soroush Mehri, Negar Rostamzadeh, Yoshua Bengio, and Christopher J. Pal. 2018. Deep Complex Networks. In Proc. of ICLR. 1–19.
[72] Trieu Trinh, Andrew Dai, Thang Luong, and Quoc Le. 2018. Learning Longer-Term Dependencies in RNNs with Auxiliary Losses. In Proc. of ICML. PMLR, 4965–4974.
[73] Ross T. Tsuyuki, William Midodzi, Cristina Villa-Roel, Darcy Marciniuk, Irvin Mayers, Dilini Vethanayagam, Michael Chan, and Brian H. Rowe. 2020. Diagnostic Practices for Patients with Shortness of Breath and Presumed Obstructive Airway Disorders: A Cross-Sectional Analysis. CMAJ Open 8, 3 (2020), E605.
[74] Jianxuan Tu, Taesong Hwang, and Jenshan Lin. 2016. Respiration Rate Measurement Under 1-D Body Motion Using Single Continuous-Wave Doppler Radar Vital Sign Detection System. IEEE Transactions on Microwave Theory and Techniques 64, 6 (2016), 1937–1946.
[75] Mark van Gastel, Sander Stuijk, and Gerard de Haan. 2016. Robust Respiration Detection from Remote Photoplethysmography. Biomedical Optics Express 7, 12 (2016), 4941–4957.
[76] Theo Vos, Christine Allen, Megha Arora, Ryan M. Barber, Zulfiqar A. Bhutta, Alexandria Brown, Austin Carter, Daniel C. Casey, Fiona J. Charlson, Alan Z. Chen, et al. 2016. Global, Regional, and National Incidence, Prevalence, and Years Lived with Disability for 310 Diseases and Injuries, 1990–2015: A Systematic Analysis for the Global Burden of Disease Study 2015. The Lancet 388, 10053 (2016), 1545–1602.
[77] Anran Wang, Jacob E. Sunshine, and Shyamnath Gollakota. 2019. Contactless Infant Monitoring Using White Noise. In Proc. of the 25th ACM MobiCom. 52:1–16.
[78] Tianben Wang, Daqing Zhang, Yuanqing Zheng, Tao Gu, Xingshe Zhou, and Bernadette Dorizzi. 2018. C-FMCW based Contactless Respiration Detection using Acoustic Signal. In Proc. of the 20th ACM UbiComp. 170:1–20.
[79] David P. White. 2006. Sleep Apnea. Proceedings of the American Thoracic Society 3, 1 (2006), 124–128.
[80] Simon Johnson Williams. 1993. Chronic Respiratory Illness. Psychology Press.
[81] WiRUSH/AIWiSe. 2019. Guangxi Wanyun and Guangzhou AIWiSe Technology Co., Ltd. https://www.wirush.ai and https://aiwise.wirush.ai.
[82] Zhaohua Wu and Norden E. Huang. 2009. Ensemble Empirical Mode Decomposition: A Noise-Assisted Data Analysis Method. Advances in Adaptive Data Analysis 1, 01 (2009), 1–41.
[83] Xiangyu Xu, Jiadi Yu, Yingying Chen, Yanmin Zhu, Linghe Kong, and Minglu Li. 2019. BreathListener: Fine-Grained Breathing Monitoring in Driving Environments Utilizing Acoustic Signals. In Proc. of the 17th ACM MobiSys. 54–66.
[84] Zhicheng Yang, Parth H. Pathak, Yunze Zeng, Xixi Liran, and Prasant Mohapatra. 2017. Vital Sign and Sleep Monitoring using Millimeter Wave. ACM Transactions on Sensor Networks 13, 2 (2017), 1–32.
[85] Shichao Yue, Hao He, Hao Wang, Hariharan Rahul, and Dina Katabi. 2018. Extracting Multi-Person Respiration from Entangled RF Signals. In Proc. of the 20th ACM UbiComp. 86:1–22.
[86] Matthew D. Zeiler, Dilip Krishnan, Graham W. Taylor, and Rob Fergus. 2010. Deconvolutional Networks. In Proc. of the 23rd IEEE CVPR. IEEE, 2528–2535.
[87] Youwei Zeng, Dan Wu, Jie Xiong, Enze Yi, Ruiyang Gao, and Daqing Zhang. 2019. FarSense: Pushing the Range Limit of WiFi-based Respiration Sensing with CSI Ratio of Two Antennas. In Proc. of the 21st ACM UbiComp. 1–26.
[88] Chi Zhang, Feng Li, Jun Luo, and Ying He. 2014. iLocScan: Harnessing Multipath for Simultaneous Indoor Source Localization and Space Scanning. In Proc. of the 12th ACM SenSys. 91–104.
[89] Jin Zhang, Weitao Xu, Wen Hu, and Salil S. Kanhere. 2017. WiCare: Towards in-Situ Breath Monitoring. In Proc. of the 14th EAI MobiQuitous. 126–135.
[90] Tianyue Zheng, Zhe Chen, Chao Cai, Jun Luo, and Xu Zhang. 2020. V2iFi: in-Vehicle Vital Sign Monitoring via Compact RF Sensing. In Proc. of the 22nd ACM UbiComp. 70:1–27.
[91] Tianyue Zheng, Zhe Chen, Shuya Ding, and Jun Luo. 2021. Enhancing RF Sensing with Deep Learning: A Layered Approach. IEEE Communications Magazine 59, 2 (2021), 70–76.
[92] Tianyue Zheng, Zhe Chen, Jun Luo, Lin Ke, Chaoyang Zhao, and Yaowen Yang. 2021. SiWa: See into Walls via Deep UWB Radar. In Proc. of the 27th ACM MobiCom. 1–14.
