1 Introduction
Underwater acoustic communication (UWAC) utilizes acoustic signals for information transmission. It plays a pivotal role in scientific exploration, marine resource discovery, environmental protection, disaster warning, emergency rescue, and education. As technology advances and application fields expand, the importance of UWAC has become increasingly recognized, and substantial effort has been invested in designing efficient UWAC systems.
Distinct from terrestrial wireless communications, UWAC systems present unique challenges [1]. The acoustic signal exhibits prolonged and fluctuating propagation delays, diminishing the effectiveness of carrier sensing in comparison to land-based RF networks. Additionally, the underwater acoustic channel displays temporal and spatial dynamics, substantial multipath spread, and notable Doppler effects, resulting in a heightened bit error rate (BER) and inter-symbol interference (ISI) compared to terrestrial counterparts. Furthermore, the transmitting and receiving units are considerably larger and more power-intensive than those used in RF networks due to the nature of the signal. This necessitates the design of energy-efficient communication systems, particularly because most underwater network nodes rely on batteries with limited capacity. Clipping is often employed to truncate signal peaks and avoid overwhelming arbitrary receivers in a distributed network, at the cost of introducing signal distortion. Moreover, the underwater environment is typically noisy, and the received acoustic signal can be very weak due to attenuation and path loss. Good performance at low signal-to-noise ratio (SNR) is thus a critical requirement for a high-performing UWAC system.
In recent years, chirp spread spectrum (CSS) based methods have attracted much attention due to their long range and low power consumption. A notable example is LoRa [2], which has achieved great success in terrestrial wireless sensor networks. In UWAC, [3, 4, 5] explored the adaptation of LoRa for underwater communications. Steinmetz et al. presented a lightweight CSS [6] and designed preamble codes to reduce BER [7].
Despite exhibiting considerable potential for underwater communications, the efficacy of these methods remains to be assessed in real ocean environments. Therefore, we conducted a sea trial in April 2024, transmitting a series of preconfigured signals in the South China Sea, approximately 100 miles offshore. Among the gathered data, a few specific received signals, one of which is depicted in Fig. 1, attracted our attention due to their apparent interference. As explained in Sec. 2.2, our analysis revealed that the interference stemmed from the harmonic series, a consequence of substantial signal clipping at the receiver. This interference raises the BER and poses a challenge to UWAC.
While a better-designed automatic gain control (AGC) or other sophisticated hardware could alleviate the problem to a certain extent, we try to answer two questions in this paper: 1) Is it possible to address this issue by developing a modulation approach that is insensitive to signal clipping at the receiver? 2) If a signal is clipped, is there any way to recover from the distortion? Fortunately, our results answer yes to both questions. Better yet, we show that by using a specific method, we can harvest the energy in the clipped signal and improve the SNR at the receiver.
The rest of the paper is organized as follows: Section 2 presents the system models; Section 3 describes the spread spectrum method and defines key notations; Section 4 models the acoustic harmonic series; Section 5 elaborates how we leverage the interference to improve system performance; Section 6 discusses multi-user support; Sections 7 and 8 present our simulation and experiment results; finally, Section 9 concludes the paper and discusses potential future research directions.
2 System Model
This section primarily presents the communication system model and gives the relevant definitions of clipping signals.
2.1 Time-varying Rayleigh Channel
The actual underwater channel can be modeled as a time-varying Rayleigh channel. Channel conditions in underwater propagation are mainly characterized by path loss, multipath effects due to signal reflection around the water surface, and Doppler effects caused by relative motion between the transmitter and receiver. In the time domain, the channel impulse response can be given as follows [4]:
where \(a_p(t)\) and \(\tau_p\) represent the path loss and time delay of the \(p\)-th path, respectively, and \(D(f)\) in the phase term represents the Doppler frequency shift.
The signal is transmitted through the channel after several signal-processing operations, including modulation and coding. In the presence of channel fading and noise, the resulting received signal can be calculated as:
where \(x(t)\) is the acoustic signal, \(h(t)\) represents the channel given by Eq. 1, and \(w(t)\) is a collection of real underwater noise.
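Under the usual assumption of a linear channel with additive noise, the relation described above can be sketched as a convolution of the transmitted signal with the channel impulse response plus the noise term. The form below is an assumption consistent with the definitions of \(x(t)\), \(h(t)\), and \(w(t)\), not a quotation of the original display:

\[
y(t) = x(t) \ast h(t) + w(t) = \int h(\tau)\, x(t-\tau)\, \mathrm{d}\tau + w(t)
\]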
2.2 Clipped Signals on Real Underwater Acoustic Communications
In a general frequency modulation (FM) system, an amplitude limiter must precede the demodulator in the FM receiver for signal rectification, which mitigates undesired amplitude variations induced by noise. In other words, an amplitude limiter is a system that restricts the received signal strength. Below, we model the limited signal. An aperiodic signal subject to clipping can be represented as [8]:
where \(a\) denotes the maximal signal amplitude after clipping. The time-domain amplitude graph of an aperiodic signal after amplitude limiting is shown on the right of Fig. 1.
In practical transmission, the received signal is biased due to undesirable channel characteristics, which are often ignored but will affect the harmonic generation, so we define the limiting ratio in a more precise way:
where \(A_{max}\) and \(A_{min}\) denote the maximum and minimum values of the signal without limiting, respectively.
It is worth noting that clipping alone produces only odd harmonics, whereas a received signal that is biased and then limited produces both odd and even harmonics. One may wonder whether the harmonic series of CSS carries useful information, and what we can do with it. We answer these questions in the following sections.
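The odd/even harmonic behavior can be checked numerically. The following is a minimal sketch, assuming a pure cosine as the test signal, a symmetric hard limiter at an illustrative threshold of 0.5, and an illustrative bias of 0.3; the helper names `clip` and `harmonic_levels` are hypothetical and not part of the original work.

```python
import numpy as np

def clip(x, a):
    """Hard-limit x to the range [-a, a]."""
    return np.clip(x, -a, a)

def harmonic_levels(x, cycles, n_harmonics=5):
    """Spectrum magnitude at the first n_harmonics multiples of the fundamental bin."""
    spectrum = np.abs(np.fft.rfft(x)) / len(x)
    return [spectrum[k * cycles] for k in range(1, n_harmonics + 1)]

n, cycles = 4096, 32                    # integer number of cycles, so no FFT leakage
t = np.arange(n) / n
tone = np.cos(2 * np.pi * cycles * t)

# Index 0 is the fundamental, index 1 the 2nd harmonic, index 2 the 3rd, ...
symmetric = harmonic_levels(clip(tone, 0.5), cycles)     # clipping only
biased = harmonic_levels(clip(tone + 0.3, 0.5), cycles)  # bias, then clipping

print("2nd harmonic, unbiased:", symmetric[1])
print("3rd harmonic, unbiased:", symmetric[2])
print("2nd harmonic, biased:  ", biased[1])
```

With the unbiased input the even-harmonic bins are numerically zero because the clipped waveform keeps its half-wave symmetry; the bias breaks that symmetry and the 2nd harmonic appears.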
3 Chirp Spread Spectrum
This section is divided into two parts: chirp spread spectrum modulation and chirp spread spectrum demodulation. Rather than simply introducing spread spectrum modulation, we emphasize the important parameters of the modulation process, as they are fundamental to the ideas developed later.
3.1 Chirp Spread Spectrum Symbols
Chirp spread spectrum is a linear frequency modulation (LFM) technique. At the heart of LFM is a chirp signal whose frequency changes linearly within the bandwidth. A prominent technology that relies on chirp spread spectrum is the Long Range (LoRa) modulation [2]. LoRa broadens raw signal waveforms with linear frequency modulation chirps. The frequency can be linearly increasing or decreasing, which are called up-chirp and down-chirp, respectively.
The key factors that determine a chirp are the spreading factor, minimum frequency, maximum frequency, and starting frequency, denoted as \(SF\), \(f_{min}\), \(f_{max}\), and \(f_s\), respectively. The bandwidth is \(B = f_{max} - f_{min}\).
Spreading Factor (SF) is an important parameter for measuring transmission rate and code element spacing. Another significance of the spreading factor is that SF binary digits are used to represent a symbol.
In other words, with an arbitrary \(SF\), \(2^{SF}\) different symbols can be represented. For modulation, the bandwidth \(B\) is uniformly divided into \(2^{SF}\) subbands of width \(\Delta f=\frac{B}{2^{SF}}\), and different starting frequencies represent different code symbols. The starting frequency can be explicitly expressed as:
where \(\alpha_n \in \{0, 1, 2, \cdots, 2^{SF}-1\}\) denotes the symbol value. The spreading factor and bandwidth jointly determine the duration of one symbol, \(T_s = 2^{SF}/B\).
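As a worked example, assume the starting frequency takes the form \(f_s = f_{min} + \alpha_n \Delta f\) implied by the subband division above. The values \(SF = 7\) and \(\alpha_n = 32\) are illustrative assumptions; \(B = 6\) kHz and \(f_{min} = 9\) kHz correspond to the 12 kHz center frequency and 6 kHz bandwidth used in the simulations of Section 7.2:

\[
\Delta f = \frac{6000}{2^{7}} = 46.875\ \text{Hz}, \qquad
T_s = \frac{2^{7}}{6000} \approx 21.33\ \text{ms}, \qquad
f_s = 9000 + 32 \times 46.875 = 10500\ \text{Hz}
\]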
Once a symbol is modulated into a chirp, the chirp slope \(\beta\) is fixed as a function of the bandwidth and symbol duration: \(\beta = B/T_s = B^2/2^{SF}\).
The valid information of a symbol is modulated onto the starting frequency of the chirp, i.e., \(f_s\). Consider an up-chirp: the modulated chirp starts from \(f_s\), and the carrier frequency increases linearly to \(f_{max}\), marking the time breakpoint \(t_f\); from \(t_f\) onwards, the carrier frequency increases from \(f_{min}\) to \(f_s\). The carrier frequency versus time is shown in Fig. 2.
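The up-chirp trajectory described above can be sketched in a few lines. This is a minimal illustration, assuming the starting frequency is \(f_{min} + \alpha_n \Delta f\) and the slope is \(B/T_s\); the function names and the numeric values (SF = 7, 9 to 15 kHz band, symbol value 32) are hypothetical choices made for the example.

```python
import numpy as np

def chirp_parameters(sf, f_min, f_max):
    """Return (delta_f, t_s, beta) for a CSS symbol set."""
    bandwidth = f_max - f_min
    n_symbols = 2 ** sf
    delta_f = bandwidth / n_symbols      # subband width, B / 2^SF
    t_s = n_symbols / bandwidth          # symbol duration, 2^SF / B
    beta = bandwidth / t_s               # chirp slope, B / T_s
    return delta_f, t_s, beta

def chirp_frequency(alpha, sf, f_min, f_max, t):
    """Instantaneous frequency of the up-chirp for symbol alpha at times t.

    The frequency rises from f_s, wraps at f_max back to f_min, and
    continues rising towards f_s.
    """
    delta_f, _, beta = chirp_parameters(sf, f_min, f_max)
    bandwidth = f_max - f_min
    offset = (alpha * delta_f + beta * np.asarray(t, dtype=float)) % bandwidth
    return f_min + offset

def wrap_time(alpha, sf, f_min, f_max):
    """Time breakpoint t_f at which the chirp reaches f_max."""
    delta_f, _, beta = chirp_parameters(sf, f_min, f_max)
    f_s = f_min + alpha * delta_f
    return (f_max - f_s) / beta

# Illustrative values only.
t_f = wrap_time(32, 7, 9000.0, 15000.0)
freqs = chirp_frequency(32, 7, 9000.0, 15000.0, [0.0, 0.015, 0.017])
print(t_f, freqs)
```

The three sample times are chosen to fall at the symbol start, shortly before the breakpoint, and shortly after it, so the wrap from the top of the band to the bottom is visible in the printed frequencies.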
We can also describe the LoRa modulation in the time domain. Over one symbol duration, the carrier frequency can be expressed as a piecewise linear function:
The frequency is the derivative of the phase, so the modulated signal is written as a piecewise exponential function as follows:
The above equation is the time-domain expression of a data symbol modulated by LoRa. The transmitted signal is this signal adjusted for the starting frequency \(f_s\) and extended in the time domain,
which is the acoustic signal in Eq. 2, and \(N_\alpha\) is the total number of symbols per frame.
3.2 Decoding CSS Symbols
LoRa demodulation is characterized by its simplicity. After merely a dechirping operation, the demodulated signal turns into an MFSK (Multiple Frequency Shift Keying) signal, and the subsequent demodulation and decoding processes are straightforward.
Specifically, a baseline down-chirp needs to be prepared, which we denote as the reference signal. Such a down-chirp is generally chosen to be a continuous chirp that varies linearly from \(f_{max}\), as the starting frequency, to \(f_{min}\) in the baseband.
Similarly to [9], we use a superscript asterisk to indicate the in-phase inversion operation. The mathematical expression for the down-chirp can be written as:
LoRa demodulation is accomplished by multiplying this reference signal with the received signal and resampling the result. In the time-frequency domain, the signal after LoRa demodulation is an FM signal with a carrier frequency of \(f_s\). As long as the carrier frequency can be extracted, the data symbol value can be decoded.
Here, \(w^*(t) = w(t) \cdot x^*(t)\), indicating that the noise is multiplied by the down-chirp signal and still maintains its distribution.
There are several methods for decoding MFSK; we use non-coherent demodulation here. Non-coherent demodulation is performed by energy detection on the MFSK signal using a fast Fourier transform (FFT), where the dominant component of the power spectral density (PSD), i.e., its peak, is extracted. This dominant frequency component represents \(f_s\), which encodes the value of the symbol. Accordingly, \(y^*(t)\) is resampled and recorded as \(y(t)\).
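The dechirp-then-FFT procedure can be sketched as follows. This is a simplified complex-baseband model, assuming the symbol is sampled at a rate equal to the bandwidth \(B\) (so there are \(2^{SF}\) samples per symbol and the frequency wrap at \(f_{max}\) happens through aliasing); the function names `css_symbol` and `css_demodulate` are hypothetical, and the noise level is an arbitrary illustration.

```python
import numpy as np

def css_symbol(alpha, sf):
    """Baseband up-chirp for symbol value alpha, sampled at rate B (2**sf samples)."""
    n = 2 ** sf
    k = np.arange(n)
    # Instantaneous frequency in cycles/sample is (alpha + k) / n, wrapping via aliasing.
    return np.exp(2j * np.pi * (k * k / (2 * n) + alpha * k / n))

def css_demodulate(rx, sf):
    """Dechirp with the reference down-chirp, then pick the strongest FFT bin."""
    down_chirp = np.conj(css_symbol(0, sf))
    spectrum = np.abs(np.fft.fft(rx * down_chirp))   # non-coherent energy detection
    return int(np.argmax(spectrum))

sf = 7
decoded_clean = [css_demodulate(css_symbol(a, sf), sf) for a in range(2 ** sf)]

# Noisy example (illustrative SNR of -5 dB, fixed seed).
rng = np.random.default_rng(0)
n = 2 ** sf
noise_power = 10 ** (5 / 10)
noise = np.sqrt(noise_power / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
decoded_noisy = css_demodulate(css_symbol(45, sf) + noise, sf)
print(decoded_noisy)
```

After dechirping, each symbol collapses into a single tone whose FFT bin index equals the symbol value, which is why the decoder reduces to an `argmax` over the spectrum.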
4 Modeling of Harmonic Signals
To facilitate further discussion on harmonic signals, this section focuses on modeling harmonic signals based on their characteristics.
A periodic signal can be harmonically decomposed into a Fourier series. We write it in the following generic form [10]:
where \(a_{k}= \frac{1}{T}\int _{T}x(t)\mathrm{e}^{-\mathrm{j}k\omega _{0}t} \mathrm{d}t\) is the Fourier coefficient. When \(a_k\) is not zero, \(x(t)\) can be viewed as the sum of a series of harmonic signals.
Clipping does not change the periodicity of the signal, so clipping in the digital domain causes a large amount of harmonic distortion [11]. In addition, a clipped signal creates spectral leakage in other frequency bands [12]. This is a serious challenge for communication signal detection.
Assuming that the original signal is modulated at the center frequency \(f_c\) and the channel bandwidth is \(B\), the clipped signal will not only be received in \(B\) but will also generate spectral spreading at integer multiples of the original center frequency \(f_c\). As a result, within the same data symbol duration, the bandwidth at \(f_c\) is \(B\), while the bandwidth at \(2f_c\) is \(2B\).
Compared to general modulation, the spectral multiplication of LoRa modulation is distinguishable on the receiver side.
Without loss of generality, considering a single data symbol duration \(T_s\), the \(n\)th harmonic signal exhibits spectral leakage at a center frequency of \(nf_c\) with a bandwidth of \(nB\). With Eq. 8, the time-domain expression for the \(n\)th harmonic signal can be written correspondingly.
where the superscript \(n\) indicates the index of the \(n\)th harmonic signal, and \(n = 1, 2, 3, \ldots, N\). Based on the relationship between the \(n\)th harmonic signal and the 1st harmonic signal, it can be derived:
When a received signal containing a series of harmonic signals \(x_{limiting}(t)=\sum _{i=1}^{N_{\alpha }}\sum _{n=1}^{N}x_{s}^{(n)}(t-iT_{s})\) is multiplied by a down-chirp, the results of dechirping are as follows: only the 1st harmonic signal can be demodulated to MFSK with a constant starting frequency and thus successfully decoded.
But what effect does the dechirping operation have on the harmonic components? Let us take the linear frequency components in Eq. 16.
Focusing on two specific points, the frequency at \(t = 0\) is \((f_s-\frac{(n-1)B}{2})\), and the frequency at \(t = T_s\) is \((n-1)B+\left(f_s-\frac{(3n-1)B}{2}\right)\), where \(\beta t = B\). The linear frequency difference between the two points is \(B\). This means that the linear frequency of the \(n\)th harmonic after a CSS demodulation no longer varies linearly within a bandwidth of \(nB\) but within \((n-1)B\). Since the symbol duration is constant, according to Eq. 6, each demodulation changes the slope of the chirp to 1/2 of the original.
In summary, multiplying the harmonic signals with the down-chirp signal merely reduces the actual spectrum occupancy. A careful analysis of the spectrum reveals that the harmonic signal after one dechirping remains a chirp with an actual spectrum occupancy of \((n-1)B\). Thus the presence of the harmonic signal will not affect the subsequent decoding. Simulation results are given in Section 7.1.
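The claim that a harmonic stays spread after one dechirping, and therefore does not disturb peak detection, can be illustrated numerically. The sketch below is a simplified baseband model and not the authors' simulation: the 2nd harmonic is assumed to be the square of the fundamental (doubled phase, hence doubled slope), its amplitude of 0.5 is arbitrary, and everything is sampled at rate \(B\). The name `css_symbol` is hypothetical.

```python
import numpy as np

def css_symbol(alpha, sf):
    """Baseband up-chirp for symbol value alpha, sampled at rate B (2**sf samples)."""
    n = 2 ** sf
    k = np.arange(n)
    return np.exp(2j * np.pi * (k * k / (2 * n) + alpha * k / n))

sf, alpha = 7, 45
n = 2 ** sf

fundamental = css_symbol(alpha, sf)
second_harmonic = 0.5 * fundamental ** 2        # assumed harmonic model: doubled phase
down_chirp = np.conj(css_symbol(0, sf))

# One dechirping of the composite signal, followed by FFT energy detection.
spectrum_total = np.abs(np.fft.fft((fundamental + second_harmonic) * down_chirp))
spectrum_harmonic = np.abs(np.fft.fft(second_harmonic * down_chirp))

detected = int(np.argmax(spectrum_total))
print("detected symbol:", detected)
print("largest harmonic bin relative to N:", spectrum_harmonic.max() / n)
```

In this model the dechirped harmonic is still a chirp, so its energy is distributed across all FFT bins instead of concentrating in one, and the peak of the fundamental remains at the transmitted symbol value.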
5 Energy Harvesting in the Harmonic Series
In the previous section, we proposed applying LoRa modulation together with an amplitude limiter to effectively mitigate the influence of harmonic signals. Although the effect of harmonic signals on decoding has been reduced to a very low level, the generation of harmonic signals also occupies a portion of the transmit power; this energy can be harvested and utilized. Moreover, the harmonic signal is the outcome of spectral leakage resulting from the clipping, and the information carried by the harmonic signal is the valid information transmitted.
As shown in the previous section, the harmonic signal only affects the actual bandwidth occupancy after one dechirping. This motivates us to use the harmonic signal to reconstruct the original signal by altering the demodulation approach at the receiver.
5.1 Recursive Demodulation
The time-domain model of the clipped signal is represented by the following equation.
Applying the dechirping operation to a single data symbol:
The outcome of the initial dechirping undergoes MFSK demodulation, and the decoded information is denoted as \(\hat{\gamma }_1\). It is highly intriguing to observe that the exponential component of the 2nd harmonic signal after one dechirping transforms into the form of the original signal before its demodulation.
The difference from the original signal is that the bandwidth of the remaining signal is \(2B\), while the slope of the chirp is \(\beta\), and the actual bandwidth of the remaining signal is \(B\) according to Eq. 16.
Eq. 20 expresses the result of the 2nd harmonic signal after two demodulation iterations, and, as expected, the starting frequency can be extracted.
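The two-step behavior of the 2nd harmonic can be sketched under the same simplified baseband assumptions used for illustration here: the harmonic is modeled as the square of the fundamental and sampled at rate \(B\), so the doubled tone wraps modulo \(2^{SF}\) (a practical receiver would presumably process the harmonic band at a proportionally higher rate). The name `css_symbol` is hypothetical.

```python
import numpy as np

def css_symbol(alpha, sf):
    """Baseband up-chirp for symbol value alpha, sampled at rate B (2**sf samples)."""
    n = 2 ** sf
    k = np.arange(n)
    return np.exp(2j * np.pi * (k * k / (2 * n) + alpha * k / n))

sf, alpha = 7, 100
n = 2 ** sf
down_chirp = np.conj(css_symbol(0, sf))

harmonic = css_symbol(alpha, sf) ** 2      # assumed 2nd-harmonic model (slope 2*beta)

once = harmonic * down_chirp               # first dechirping: slope drops to beta
twice = once * down_chirp                  # second dechirping: slope drops to zero

# After one pass the harmonic has the same form as an ordinary modulated symbol.
same_form = np.allclose(once, css_symbol((2 * alpha) % n, sf))

peak_once = np.abs(np.fft.fft(once)).max()            # still spread over all bins
bin_twice = int(np.argmax(np.abs(np.fft.fft(twice)))) # concentrated in one bin
print(same_form, peak_once / n, bin_twice)
```

In this model the tone recovered from the second pass sits at twice the symbol index (modulo the number of bins), which carries the same symbol information as the fundamental.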
5.2 Improved LoRa Demodulation Structure
Based on the derivation of the harmonic signal energy harvesting generated by the clipped signal in the previous subsection, we can further optimize the demodulation structure of the system.
Fig. 3 compares the block diagram of the traditional receiver with that of the receiver designed to further exploit the amplitude-limited signal.
In particular, we make full use of the harmonic signals by recursively calling the demodulation. Following the completion of each dechirping operation, the resulting signal is then subjected to decoding. Subsequently, the signal within the specified frequency range is removed from the received signal, thereby initiating a recursive call to demodulation. Once all the decoding processes have been completed, the decoded signals within each frequency range are combined using specific weights.
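The recursive structure can be sketched as two dechirping passes whose spectra are combined per candidate symbol. This is only an illustration under stated assumptions: the harmonic is modeled as the square of the fundamental at baseband, the removal of the already-decoded band between passes is omitted, and the weights `(1.0, 0.5)` are arbitrary placeholders because the source does not specify them. The names `css_symbol` and `recursive_metric` are hypothetical.

```python
import numpy as np

def css_symbol(alpha, sf):
    """Baseband up-chirp for symbol value alpha, sampled at rate B (2**sf samples)."""
    n = 2 ** sf
    k = np.arange(n)
    return np.exp(2j * np.pi * (k * k / (2 * n) + alpha * k / n))

def recursive_metric(rx, sf, weights=(1.0, 0.5)):
    """Weighted per-symbol decision metric from two dechirping passes.

    Pass 1 targets the fundamental; pass 2 dechirps again to collapse the
    assumed 2nd harmonic, whose tone appears at bin (2 * symbol) mod N.
    """
    n = 2 ** sf
    down_chirp = np.conj(css_symbol(0, sf))
    pass1 = np.abs(np.fft.fft(rx * down_chirp))
    pass2 = np.abs(np.fft.fft(rx * down_chirp * down_chirp))
    candidates = np.arange(n)
    return weights[0] * pass1 + weights[1] * pass2[(2 * candidates) % n]

sf, alpha = 7, 100
fundamental = css_symbol(alpha, sf)
rx = fundamental + 0.5 * fundamental ** 2     # fundamental plus assumed 2nd harmonic

metric = recursive_metric(rx, sf)
decoded = int(np.argmax(metric))
print(decoded)
```

The harmonic pass alone cannot separate a symbol from the one half a band away in this sampled model, so the fundamental pass is kept as the dominant term and the harmonic pass only adds energy to the decision.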
In the simulation section, we demonstrate the process of reconstructing the amplitude-limited signal to verify the feasibility of the algorithm.
6 Multi-user Communication
As emphasized in our introduction to CSS modulation, LoRa modulation is determined by only two parameters, the spreading factor and the bandwidth, which jointly determine the slope. Considering the narrow bandwidth of the underwater channel, we have to consider bandwidth sharing among multiple users, and the slope of the LoRa modulation is exactly the indispensable condition for achieving simultaneous communication of multiple users in the same channel.
In this section, we will discuss how to achieve multi-user parallel transmission in the underwater narrow channel by using LoRa modulation. First, we will provide the theoretical proof that multi-user parallel transmission can be realized. Second, we will present the restrictions when using LoRa for parallel transmission. Finally, the corresponding conclusion will be given.
6.1 Multiple Access
The most notable issue in achieving multi-user parallel communication over narrow-band channels is how to distinguish users.
The natural advantage of LoRa modulation is that the spreading factor and bandwidth determine whether or not the receiver can demodulate correctly. The spreading factor and the bandwidth together determine the slope of the linear frequency, i.e. the variation of the linear frequency. As long as the transmitter and receiver have the same spreading factor and bandwidth, successful demodulation and decoding can be achieved. This is why we can use the slope to distinguish between different users.
We assume that the transmission bandwidth and spreading factor of User1 and User2 are \(B_1\), \(SF_1\) and \(B_2\), \(SF_2\), respectively. We can obtain a “password” unique to each user based on Eq. 6 and the slope formula:
Without loss of generality, we assume \(\beta_1 \neq \beta_2\) and, to simplify the expressions, analyze only the first segment of the piecewise function. We attempt to use the “password” of User2 to decrypt the information sent by User1 in an “encrypted” manner. The corresponding time-domain expression is as follows:
Eq. 22 denotes the multiplication of a single symbol \(f_{s1}\) transmitted by User1 with the dechirp signal of User2. In the resulting expression, the frequency component still takes the form of a linear function of time, i.e., when demodulated across different users, the signal still behaves as a chirp.
Since User1 does not have the parameters (SF, B) of User2, communication cannot succeed no matter how many demodulation operations are performed.
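A small numerical sketch of slope-based user separation is given below. It is a simplified baseband illustration, not the authors' simulation: both users are sampled at the same rate, the slope ratio of 1.75 and the symbol values 20 and 77 are arbitrary, and the helper `chirp` is hypothetical.

```python
import numpy as np

def chirp(start_bin, slope_ratio, n):
    """Sampled chirp whose slope is slope_ratio times a reference slope."""
    k = np.arange(n)
    return np.exp(2j * np.pi * (slope_ratio * k * k / (2 * n) + start_bin * k / n))

n = 128
user1 = chirp(20, 1.0, n)       # User1: reference slope, symbol value 20
user2 = chirp(77, 1.75, n)      # User2: assumed non-integer slope ratio, value 77
mixed = user1 + user2           # both users overlap in time and band

down1 = np.conj(chirp(0, 1.0, n))     # User1's "password"
down2 = np.conj(chirp(0, 1.75, n))    # User2's "password"

spectrum_as_user1 = np.abs(np.fft.fft(mixed * down1))
spectrum_as_user2 = np.abs(np.fft.fft(mixed * down2))
leak_user2 = np.abs(np.fft.fft(user2 * down1))   # User2 seen through User1's slope

print("decoded with User1 slope:", int(np.argmax(spectrum_as_user1)))
print("decoded with User2 slope:", int(np.argmax(spectrum_as_user2)))
print("User2 leakage relative to N:", leak_user2.max() / n)
```

Dechirping with the wrong slope leaves a residual chirp whose energy is smeared across many bins, so each receiver sees a clear peak only for the user whose slope it shares.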
6.2 Restrictions on Multi-user Communication
Reconstruction of the harmonic signal depends on the property that its bandwidth is an integer multiple of the bandwidth of the original signal. In other words, the slope of the harmonic signal is an integer multiple of that of the original signal. Therefore, when differentiating users by slope, it is crucial to prevent the slope of User2 from being an integer multiple of that of User1.
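This restriction can be turned into a simple configuration check. The sketch below assumes the slope is \(B^2/2^{SF}\) (bandwidth divided by symbol duration) and, as an additional assumption, treats the restriction symmetrically so that neither user's slope may be an integer multiple of the other's. The function names are hypothetical.

```python
from fractions import Fraction

def chirp_slope(bandwidth_hz, sf):
    """Chirp slope B / T_s = B^2 / 2^SF, kept exact as a fraction."""
    return Fraction(bandwidth_hz) ** 2 / 2 ** sf

def slopes_conflict(b1, sf1, b2, sf2):
    """True if one user's slope is an integer multiple of the other's."""
    ratio = chirp_slope(b2, sf2) / chirp_slope(b1, sf1)
    return ratio.denominator == 1 or (1 / ratio).denominator == 1

# Same bandwidth, SF differing by one: slope ratio is exactly 2, so the
# second user would coincide with the first user's 2nd harmonic.
print(slopes_conflict(6000, 7, 6000, 6))

# Different bandwidths, same SF: slope ratio 4/9, no integer relation.
print(slopes_conflict(6000, 7, 4000, 7))
```

Note that with a shared bandwidth, changing only the spreading factor always yields power-of-two slope ratios, so under this check such configurations would need a different bandwidth or symbol duration to coexist with harmonic reconstruction.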
7 Simulation Results
In this section, we validate the proposed model in Matlab simulations.
7.1 Simulation on Countering Clipping
In this subsection, we show that limiting has little effect on the reception and demodulation of LoRa modulation. To simulate a real sea trial scenario, we implemented a channel simulator using the Matlab comm tool. We mainly consider the multipath effect, the Doppler effect, and the signal-to-noise ratio of the channel; the specific parameter design is shown in Table I. The Doppler shift factor is set to \(10^{-4}\), which we estimated from sea trial data. However, since this data has already undergone Doppler equalization at the receiver, it does not accurately reflect the true conditions of the underwater channel.
The modulated signals are fed into the simulated channel, the received signals are decoded, and the BER of each received signal is counted for different SNRs and limiting ratios. Note that we define the limiting ratio of the unrestricted signal as \(Th = \infty\) in our simulation. As shown in Fig. 4, with a fixed limiting ratio of 0.7, we performed 200 valid signal reception simulations under each SNR condition. At SNRs greater than −20 dB, the BER difference between the unrestricted and restricted signals is not significant. Fig. 5 shows the results of 100 valid signal reception simulations under each SNR condition for different limiting ratios. When the SNR is greater than −20 dB, the BER difference between the unlimited-amplitude signal and the restricted-amplitude signal is not significant, and even when the limiting ratio reaches 0.1, the LoRa modulation can still effectively mitigate the impact of amplitude restriction on reception.
It is worth noting that at very low SNR, the BER of a clipped signal is better than that of an unclipped signal. This is because, at very low SNR, the energy of the noise is much higher than the energy of the signal itself, and limiting the amplitude suppresses more noise than signal, thus improving the BER.
7.2 Simulation on Harmonic Signals
In this subsection, we demonstrate the process of reconstructing the amplitude-limited signal to verify the feasibility of the improved receiver. Further, we measure the gain due to the harmonic signals in terms of the SNR at the receiver.
Initially, we generate a series of up-chirp signals with a central frequency of 12 kHz and a bandwidth of 6 kHz. Simultaneously, we modulate the same symbol information onto a signal with a central frequency of 24 kHz and a bandwidth twice that of the fundamental harmonic signal, i.e., 12 kHz. We regard this as the simulated 2nd harmonic signal. It is important to note that while we want the 2nd harmonic signal's bandwidth to be twice that of the fundamental harmonic signal, we do not want the duration of each symbol to change. This necessitates an increase of the SF.
Following this, we superimpose the two signals in the time domain. The time-frequency-power distribution of the superimposed signals is shown in Fig. 6a, which is exactly as we expected: with the symbol duration unchanged, the bandwidth of the 2nd harmonic is twice that of the fundamental harmonic signal.
As shown in Fig. 6b, the slope of the down-chirp of the baseband reference is precisely the opposite of that of the fundamental harmonic signal. After multiplying these two sets of signals and completing the demodulation operation, the time-frequency diagram is shown in Fig. 6c. The chirp in the fundamental harmonic signal band has been successfully demodulated into a conventional frequency-modulated signal, that is, the frequency is no longer a function of time but a constant value (the symbol information \(f_s\)).
For the fundamental harmonic signal, we simply follow the defined steps of resampling, FSK demodulation, and decoding to complete the entire communication process. Once this signal has been decoded, we demodulate again using the same down-chirp as before, and the result is shown in Fig. 6d. The modulation frequency has also become constant, which corresponds exactly to our conclusions in Section 5.1. The same demodulation and decoding steps then yield the corresponding symbol information.
The addition of harmonic signal energy inevitably results in gain, which we attempt to measure in terms of SNR. The figure shows the average power ratio versus the number of demodulation iterations for different limiter ratios. As we mentioned when we introduced limiting, the limiter is designed to mitigate noise-induced amplitude variations. So the lower the transmitter SNR and the lower the limiter ratios, the better the performance of our algorithms.
We fixed the transmit signal-to-noise ratio at −10 dB, a condition under which the algorithm is expected to perform better. As can be seen from Fig. 7, regardless of the limiting ratio, there are varying degrees of SNR improvement. The SNR gain reaches 6 dB at \(Th = 0.1\), and if the limiting is not too severe, there is still a gain of about 3 dB.
Concurrently, the limiting ratio \(Th\) links Fig. 5 and Fig. 7. As illustrated in Fig. 5, the unrestricted signal exhibits the lowest BER at a transmit SNR of −10 dB. When the clipping is the most severe (\(Th = 0.1\)), there is still scope for improvement, although the BER difference is minimal. In the event of clipping, our algorithm exhibits the capacity for recovery.
7.3 Simulation on Multi-user Communication
In this subsection, we represent different users by varying the linear frequencies. Despite the overlapping time-domain signals on the same bandwidth, we can still distinguish different users using the “slope password”.
Fig. 8a and Fig. 8b illustrate the chirps sent by two users after LoRa modulation. The two users share the same communication bandwidth, but the time required to transmit each symbol differs (the first user takes approximately 700 milliseconds to send the sequence, while the second takes only about 400 milliseconds). This difference is reflected in the chirps: the carrier slopes of the two signals are different.
Fig. 8c shows how the two signals overlap in the time domain; some differences can be observed at about 220 ms. Multiplying the overlapped signal with the down-chirp of User1 gives the demodulation result shown in Fig. 8d: the carrier frequency of User1 has been successfully "downscaled" and is not interfered with by User2 at all, even though the two signals overlap in such a messy way.
8 Experiment
In this section, we validate the harmonic signal reconstruction using real ocean test data from experiments in the South China Sea. The experiments show that, for harmonics with spectral leakage, multiple demodulation passes can sequentially utilize signals in different frequency bands without wasting too much transmitter power.
Fig. 9a shows the received signal pre-processed by the acoustic modem. The signal is visible in the frequency range centered at 12 kHz, and chirps with similar shapes appear in bands at integer multiples of this center frequency.
The preamble chirps are lead and synchronization signals that carry no information and are used only for synchronization. The synchronization between the transmitter and receiver is necessary to dechirp and demodulate the signal accurately.
The seemingly irregular chirp that follows is the payload signal. The data after one demodulation is shown in Fig. 9b, where useful information and synchronization signals can be identified (two consecutive down-chirps mark the boundary). At the same time, the harmonic signal has changed as we predicted.
We assume that the signal with a center frequency of 12 kHz has been successfully decoded and stored; the data after the 2nd harmonic demodulation is shown in Fig. 9c. Although the signal seems weak, it can still be seen that it is consistent with the information in the original band.
In summary, signals in multiple frequency bands are demodulated, decoded, and saved step by step as described above, maximizing the use of the transmit power.
9 Conclusion
In this paper, we investigate a practical problem encountered during sea trials: signal clipping results in spectral leakage and wasted transmit power. Our research shows that CSS modulation can effectively counteract the signal distortion caused by clipping. For clipped signals, we put forward an efficient reception mechanism for reconstructing the signal, intended to fully utilize the transmitted power. Finally, we propose that different users can be represented by varying the slope of the chirp to achieve parallel communications in underwater narrowband channels.
In addition to validating the above ideas through simulations, we also qualitatively verify our proposed receiving mechanism using actual sea trial data. It is worth noting that the quantitative simulation results show that, at very low transmit signal-to-noise ratios, different clipping ratios yield different degrees of power gain. At ultra-low limiting ratios, the power gain reaches 6 dB.
Acknowledgments
This work is supported in part by the Key Research and Development Program (Grant No. 2022YFC2803802) from the Ministry of Science and Technology of China.