Greedy receiver for photon-efficient optical communication

Karol Łukanowski Centre for Quantum Optical Technologies, Centre of New Technologies, University of Warsaw, Stefana Banacha 2c, 02-097 Warszawa, Poland k.lukanowski@cent.uw.edu.pl

Abstract

In optical communication the transmitter encodes information into a set of light states defined by the modulation format, selected to accommodate specific channel conditions and to remain sufficiently distinguishable at the output. Various receiver architectures have been designed to improve the demodulation performance, ultimately limited by quantum theory. In this work I introduce a new receiver based on a locally optimal greedy algorithm and apply it to pulse position modulation. The receiver reduces the error probabilities of previously proposed strategies in all signal strength regimes and achieves results comparable with those obtained by numerical optimization of the detection process. In contrast, however, it is conceptually simple and therefore can be scaled to arbitrarily high modulation orders for which numerical methods become intractable. In the photon-starved regime characteristic of deep space optical communication, the greedy receiver approaches the quantum-optimal Helstrom bound on state discrimination error probability. In the regime of few-photon pulses, the error reduction offered over the other methods grows up to an order of magnitude.

I Introduction

Long distance communication between devices of modern technology is for the most part optical. Messages are transmitted by modulating an electromagnetic wave and propagating it through an optical channel, such as fiber or free space, at the end of which the receiver strives to demodulate the incoming signal to read out the original message [1, 2, 3]. Although research into the fundamental quantum description of light paves the way for novel non-classical light-state encodings of information that may prove advantageous in specialized applications (e.g., photon-number states in purely lossy conditions [4] or squeezed light in strongly dephasing channels [5]), the modulation techniques currently in use still rely on essentially classical laser pulses [3]. These are represented in quantum theory by coherent states of light denoted $|\hskip 0.7pt\alpha\rangle$ . The modulus and argument of the complex number $\alpha$ are then, respectively, the amplitude and the phase of the classical lightwave [6, 7].

Surprisingly, although coherent states are never mutually orthogonal and thus cannot be distinguished with zero probability of error by any physical measurement, they can still saturate the ultimate Holevo limit on achievable bitrate in optical information transmission [8, 9, 10, 11]. Such a communication protocol would, however, require a receiver able to perform collective measurements on multiple incoming states, the optical implementation of which remains challenging [12, 13]. In practice, several modulation formats based on coherent states have been established and the research into practical receiver architectures that improve demodulation performance continues [14].

In this work I introduce a new coherent state receiver that adapts its behaviour during signal reception. Crucially, it does so according to a so-called greedy algorithm, belonging to a class of methods in computer science that make locally optimal choices in the hope that they will lead to a globally optimal solution [15]. One classic illustration of such algorithms involves the knapsack problem in which, given a set of objects with weights and values, one (perhaps a thief in a jewelry store) is tasked with filling the knapsack up to some weight limit while maximizing the total value of the items taken. A greedy approach would be to sort the items by value-to-weight ratio and pack them one by one until the weight limit is reached. If the choice is always to either take an item or leave it, this usually produces suboptimal solutions: packing certain items early on limits the ability to later pack other ones which, although they possess a lower ratio, could ultimately yield a higher total value and still fit under the weight limit. However, the greedy solution is easily computable and oftentimes good enough, in contrast to globally optimal methods (which do not do what is best at the moment, but rather consider far-reaching consequences of their choices) that inevitably consume more resources and time. Furthermore, in some cases the greedy approach turns out to be provably optimal, such as in the fractional knapsack problem, in which one is allowed to take fractions of items (for instance, if the thief has the option to cut the precious jewels). This work studies the greedy approach to optical demodulation on the example of the pulse position modulation format and shows it to be very effective.
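The knapsack illustration can be made concrete with a minimal sketch. The item list and the function names (`greedy_01`, `optimal_01`, `greedy_fractional`) are hypothetical and chosen only for illustration; the brute-force search stands in for any globally optimal method and is feasible only for a handful of items.

```python
from itertools import combinations

def _by_ratio(items):
    """Sort (value, weight) pairs by decreasing value-to-weight ratio."""
    return sorted(items, key=lambda it: it[0] / it[1], reverse=True)

def greedy_01(items, capacity):
    """Greedy take-or-leave packing: take whole items in ratio order while they fit."""
    total, room = 0.0, capacity
    for value, weight in _by_ratio(items):
        if weight <= room:
            total += value
            room -= weight
    return total

def optimal_01(items, capacity):
    """Globally optimal take-or-leave packing by brute force over all subsets."""
    best = 0.0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for _, w in subset) <= capacity:
                best = max(best, float(sum(v for v, _ in subset)))
    return best

def greedy_fractional(items, capacity):
    """Greedy packing when fractions of items may be taken (optimal in this variant)."""
    total, room = 0.0, capacity
    for value, weight in _by_ratio(items):
        take = min(weight, room)
        total += value * take / weight
        room -= take
    return total

# Illustrative (value, weight) pairs and weight limit; not taken from the text.
ITEMS = [(60, 10), (100, 20), (120, 30)]
CAPACITY = 50
```

For these illustrative numbers the greedy take-or-leave packing collects a value of 160, whereas the best subset is worth 220; once fractions are allowed, the same greedy rule reaches 240, which no packing can exceed.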

II Pulse position demodulation

In severely power-limited conditions, such as deep-space optical communication, pulse position modulation (PPM) is the common modulation choice [16, 17, 18]. This is in part due to straightforward signal preparation and high photon information efficiency approaching the quantum-optimal Holevo limit in the photon-starved regime [19, 20]. In $M$ -ary PPM or $M$ -PPM, where we refer to $M$ as the PPM order, sketched in Figure 1a), the frame of chosen duration $T$ is divided into $M$ slots. A pulse of light is prepared in one of them, denoted by a symbol $x\in\{1,\ldots,M\}$ . The position of the pulse within the frame carries then an information content of up to $\log_{2}M$ bits [2]. To extract this information, the aim of the receiver is to identify the pulse position within the received window, or in other words, measure the arrival time of the pulse within a single frame and output an estimate $y\in\{1,\ldots,M\}$ that agrees with $x$ .

Figure 1: a) Pulse position modulation. In one channel use of $M$ -PPM, the transmitter produces a symbol $x\in\{1,\ldots,M\}$ by dividing a time frame into $M$ slots and preparing a pulse in one of them, leaving the other slots empty. The symbol is then transmitted through an optical communication channel to the receiver, whose shutter is synchronized with the transmitter so that the detection begins with the arrival of the frame. The aim of the receiver is to output an estimate $y$ of the pulse position in the received frame that agrees with the actual symbol $x$ being transmitted. An exemplary transmission of $x=1$ in 3-PPM is depicted. b)-d) PPM demodulation algorithms with displacements. Standard PPM demodulation strategies measure the incoming signal in each slot one by one with “on-off” direct detection resulting in a “click” if photons are detected or a “no-click” otherwise. The receiver outputs some final pulse position estimate $y$ by following a decision tree with branches corresponding to the binary measurement outcomes. Displacement receivers additionally shift the slot amplitude before direct detection by some amount $\beta$ prescribed in the tree nodes. The trees corresponding to direct detection and conditional pulse nulling are depicted for 3-PPM and highlighted are some possible paths through the tree leading to the correct estimate $y=1$ .

A simple way to assess receiver performance is to calculate the average probability $P_{e}$ of an incorrect identification of the input symbol, or, equivalently, the average probability of correct decision $P_{c}$ . We have

P_{c}=\sum_{x,y}p(x)p(y=x|x),\qquad P_{e}=1-P_{c},

(1)

where $p(y=x|x)$ is the probability of correctly identifying the symbol $x$ by the output estimate $y$ , conditioned on $x$ being transmitted, which happens with probability $p(x)$ . Typically one sets all $p(x)$ equal, so that $P_{c}=\sum_{x,y}p(y=x|x)/M$ . Thanks to the celebrated Helstrom theory [21], it is possible to tightly lower-bound $P_{e}$ , although only for highly symmetric state constellations [22]. In PPM, for instance, if the received pulse slot indeed holds a pure coherent state $|\hskip 0.7pt\alpha\rangle$ and the other slots contain a perfect quantum vacuum $|\hskip 0.7pt0\rangle$ , which corresponds to ideal noiseless transmission, the minimal error probability reads [23]

P_{e,M}^{\mathrm{H}}=\frac{M-1}{M^{2}}\left(\sqrt{1+(M-1)e^{-|\alpha|^{2}}}-\sqrt{1-e^{-|\alpha|^{2}}}\right)^{2}.

This ultimate limit provides a benchmark to which the performance of any receiver can be compared, although for this comparison to be fair, similar noiseless or near-noiseless conditions should be assumed.
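A minimal sketch of the noiseless Helstrom benchmark, assuming the standard closed form for $M$ -PPM with mean photon number $n=|\alpha|^{2}$ in the pulse slot, namely $\frac{M-1}{M^{2}}\big(\sqrt{1+(M-1)e^{-n}}-\sqrt{1-e^{-n}}\big)^{2}$ . The function name `helstrom_ppm` is hypothetical.

```python
import math

def helstrom_ppm(M, n):
    """Noiseless Helstrom error bound for M-PPM with n = |alpha|^2 photons per pulse."""
    x = math.exp(-n)  # overlap-related factor e^{-|alpha|^2}
    root_diff = math.sqrt(1.0 + (M - 1) * x) - math.sqrt(1.0 - x)
    return (M - 1) / M**2 * root_diff**2
```

Two quick consistency checks: for $n=0$ the expression reduces to the random-guess error $(M-1)/M$ , and for $M=2$ it coincides with the binary Helstrom bound $\frac{1}{2}\big(1-\sqrt{1-e^{-2n}}\big)$ for two states with squared overlap $e^{-2n}$ .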

For more realistic scenarios in which the transmitted states are disturbed by loss and noise, no general quantum limit is known [22], but it is still customary and convenient to compare receivers based on their error probabilities [23]. Note, however, that for the sake of simplicity, some nuance is lost. First of all, besides errors, receivers could also announce erasures (the “I have no idea” output) which can be beneficial for the bitrate [2]—while to minimize the error probability it is always better to guess the input symbol randomly in the event of an erasure, so that at least sometimes the output estimate will be correct. Still, the addition of erasures is straightforward for all the receivers that will be studied here. Second of all, the intuition “the smaller the error probability, the higher the protocol bitrate” comes with some caveats. From some point onward, decreasing the error probability does not increase the bitrate much if it is already close to the Shannon limit [2]. Nonetheless, smaller error probabilities always lead to improvements in performance, if not in the bitrate itself, then in error-correction coding complexity [24, 25] and finite-blocklength communication [26], such as that between Earth and a satellite that passes only briefly over the ground station. I shall therefore keep the error probability as a figure of merit.

The basic approach to PPM demodulation is the direct detection (DD) receiver that relies on photon counting in each subsequent slot of the frame to estimate the pulse position. Typically the detection is performed by a single photon detector (SPD) operating in “on-off” mode, so that either no detections are observed in a slot (a “no-click”) or at least one is (a “click”) [27, 28]. Most of the time one can expect that the pulse slot $x$ will produce a click and the other empty slots will result in no-clicks, leading to the correct estimate $y=x$ . If a click is recorded in more than one slot due to stray light, background noise, or dark counts in the detector, the receiver outputs an estimate chosen randomly out of these slots. On the other hand, the stochastic nature of photodetection makes it possible to detect no photons in the frame at all, even in the pulse slot due to the nonzero vacuum contribution to the state. Then the receiver could announce an erasure, but under the error-only assumption it simply guesses randomly from all the $M$ slots. The DD error probability is given by (see Methods for the derivation)

P_{e,M}^{\mathrm{DD}}=\frac{(M-p_{0}q_{0}^{M-1})\bar{q}_{0}-\bar{p}_{0}(1-q_{0}^{M})}{M\bar{q}_{0}},

(2)

where $p_{0}$ is the no-click probability in the pulse slot ( ${\bar{p}_{0}=1-p_{0}}$ ) and $q_{0}$ the no-click probability in an empty slot ( $\bar{q}_{0}=1-q_{0}$ ). Some authors refer to (2) as the standard quantum limit [25]. Figure 1c) depicts an exemplary decision tree traversed by the DD receiver in the detection process.
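The direct detection rule described above can be checked against (2) with a short sketch. Two assumptions are made here: the closed form is rewritten using $(1-q_{0}^{M})/\bar{q}_{0}=\sum_{j=0}^{M-1}q_{0}^{j}$ so that it stays finite when $q_{0}=1$ , and the Monte Carlo routine (with hypothetical names and a fixed seed) simply replays the guessing rule slot by slot.

```python
import random

def dd_error(M, p0, q0):
    """Eq. (2) with the factor 1/(1 - q0) expanded as a geometric sum (finite at q0 = 1)."""
    geom = sum(q0**j for j in range(M))  # equals (1 - q0**M) / (1 - q0) for q0 < 1
    return (M - p0 * q0**(M - 1) - (1.0 - p0) * geom) / M

def dd_error_monte_carlo(M, p0, q0, trials=100_000, seed=1):
    """Replay the DD rule: guess among clicked slots, or among all slots if none clicked."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        x = rng.randrange(M)  # transmitted pulse position, uniform prior
        # A slot clicks with probability 1 - (its no-click probability).
        clicks = [s for s in range(M) if rng.random() >= (p0 if s == x else q0)]
        y = rng.choice(clicks) if clicks else rng.randrange(M)
        errors += (y != x)
    return errors / trials
```

In the noiseless limit $q_{0}=1$ the closed form reduces to $(M-1)p_{0}/M$ : the only way to err is to miss the pulse and then guess wrongly.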

A broader class of receivers introduced by Kennedy and Dolinar for a binary coherent state modulation [29, 30] and later adapted by Dolinar to PPM [31] are displacement receivers that displace in phase-space the complex amplitudes of incoming signals before performing direct detection on them. If the incoming state is a pure coherent state $|\hskip 0.7pt\alpha\rangle$ , displacing by $\beta$ yields $|\hskip 0.7pt\alpha-\beta\rangle$ . In practical models with different kinds of noise the mathematical description becomes more involved, but the intuition of displacing the amplitude in the slot holds. The goal of displacement is to modify the photodetection statistics in a way that can be exploited to reduce the receiver error probability. Such a strategy has proved useful in receivers designed for other modulation formats as well [32].

Without loss of generality, let us denote by $q_{\beta}$ the probability of no clicks being recorded in an empty slot that was displaced by $\beta$ and the corresponding click probability by $\bar{q}_{\beta}:=1-q_{\beta}$ . Similarly, let $p_{\beta}$ be the no-click probability in the slot containing the pulse and displaced by $\beta$ , and $\bar{p}_{\beta}:=1-p_{\beta}$ the corresponding click probability. Poissonian photon statistics [33] assumed in both the signal and displacement modes, as well as for the additive noise, result in

q_{\beta}=e^{-\beta^{2}-N_{b}},\quad p_{\beta}=e^{-\left(\alpha-\sqrt{1-\Delta}\beta\right)^{2}-\Delta\beta^{2}-N_{b}}.

(3)

Here $\alpha$ is the pulse amplitude, $\beta$ is the displacement applied in the round, and $N_{b}$ is the average number of noise photons impinging on the detector in one slot. The parameter $\Delta$ represents mode mismatch between $\alpha$ and $\beta$ —if positive, their interference is not perfect and a fraction of the displacement pulse leaks out to the detector and may contribute to photocounts. Additionally, I have already assumed $\alpha,\beta\in\mathbb{R}$ . This is fine for PPM as the phase is not used for information encoding. Physically, however, it requires phase synchronization between the signal and displacement modes which constitutes a technical challenge—nevertheless, the degree of phase mismatch can be modeled with the $\Delta$ parameter as well [25].
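As a small sketch of the statistics in (3), the function below (hypothetical name `no_click_probs`, real amplitudes assumed as in the text) returns the pair $(q_{\beta},p_{\beta})$ . It makes two properties easy to inspect: $q_{\alpha}=p_{0}$ holds for any $\Delta$ , whereas exact nulling $p_{\alpha}=q_{0}$ requires $\Delta=0$ .

```python
import math

def no_click_probs(alpha, beta, n_b=0.0, delta=0.0):
    """Return (q_beta, p_beta) of Eq. (3): no-click probabilities of an empty and a pulse slot."""
    q_beta = math.exp(-beta**2 - n_b)
    interfered = (alpha - math.sqrt(1.0 - delta) * beta) ** 2  # mode-matched part
    leaked = delta * beta**2                                   # mismatched part of the displacement
    p_beta = math.exp(-interfered - leaked - n_b)
    return q_beta, p_beta
```

Setting `beta=0.0` recovers the direct detection quantities $q_{0}=e^{-N_{b}}$ and $p_{0}=e^{-\alpha^{2}-N_{b}}$ used in (2).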

The original displacement algorithm devised by Dolinar has since been referred to as conditional pulse nulling (CPN) [31], where the word nulling entails displacements by $\beta=\alpha$ . The idea is to null every slot and measure it with an SPD until a no-click is observed at some slot $k$ , signifying $x=k$ with high probability. Then, a switchover occurs and the rest of the slots are detected directly. If no clicks are observed after the switchover, $k$ is given as the final estimate; if there are clicks detected, the output of the CPN is the output of DD in those last $M-k$ slots; if no switchover occurs, the output is random (or an erasure is announced). Proposed modifications to the CPN that further decrease its error probabilities, albeit slightly, include inexact nulling $\beta\neq\alpha$ or additional squeezing of the incoming light before the measurement [34, 12]. The error probability (allowing for inexact nulling) reads

P_{e,M}^{\mathrm{CPN}}=\frac{1}{M\bar{q}_{0}(\bar{q}_{\beta}-q_{0})}\left[(\bar{p}_{0}-M\bar{q}_{0})(q_{0}-\bar{q}_{\beta})+\bar{q}_{0}\bar{q}_{\beta}^{M-1}(\bar{p}_{\beta}q_{0}-p_{0}\bar{q}_{\beta})+q_{0}^{M}(p_{\beta}\bar{q}_{0}-\bar{p}_{0}q_{\beta})\right],

where $\beta$ is the constant displacement applied in the nulling rounds, which can be optimized to obtain the minimal possible error probability. The slightly more involved derivation is again delegated to the Methods section and an exemplary decision tree is depicted in Figure 1d).
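The CPN rule can be cross-checked with a short sketch. It assumes the closed form $P_{e,M}^{\mathrm{CPN}}=\frac{1}{M\bar{q}_{0}(\bar{q}_{\beta}-q_{0})}\big[(\bar{p}_{0}-M\bar{q}_{0})(q_{0}-\bar{q}_{\beta})+\bar{q}_{0}\bar{q}_{\beta}^{M-1}(\bar{p}_{\beta}q_{0}-p_{0}\bar{q}_{\beta})+q_{0}^{M}(p_{\beta}\bar{q}_{0}-\bar{p}_{0}q_{\beta})\big]$ and compares it with a direct sum over the pulse position $x$ and the switchover slot $k$ , following the verbal description above. The function names are hypothetical, and the enumeration is one reading of the rule rather than the derivation from the Methods section.

```python
import math

def cpn_error_closed_form(M, p0, q0, pb, qb):
    """Closed-form CPN error; singular when q0 == 1 or when 1 - qb == q0."""
    p0c, q0c, pbc, qbc = 1 - p0, 1 - q0, 1 - pb, 1 - qb  # click probabilities
    bracket = ((p0c - M * q0c) * (q0 - qbc)
               + q0c * qbc**(M - 1) * (pbc * q0 - p0 * qbc)
               + q0**M * (pb * q0c - p0c * qb))
    return bracket / (M * q0c * (qbc - q0))

def cpn_error_enumerated(M, p0, q0, pb, qb):
    """CPN error from summing success probabilities over pulse slot x and switchover slot k."""
    success = 0.0
    for x in range(1, M + 1):
        # Switchover at an empty slot k < x: the pulse must then be found by DD.
        for k in range(1, x):
            others = M - k - 1  # empty slots among those detected directly
            dd_hit = sum(math.comb(others, j) * (1 - q0)**j * q0**(others - j) / (j + 1)
                         for j in range(others + 1))
            success += (1 - qb)**(k - 1) * qb * (1 - p0) * dd_hit
        # Switchover exactly at the pulse slot, followed by no clicks at all.
        success += (1 - qb)**(x - 1) * pb * q0**(M - x)
        # No switchover in the whole frame: uniformly random guess.
        success += (1 - qb)**(M - 1) * (1 - pb) / M
    return 1.0 - success / M
```

The enumerated version stays finite in the noiseless case $q_{0}=1$ , where the closed form has a removable $0/0$ singularity, so it may be the more convenient of the two for a grid search over $\beta$ .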

Last but not least, novel adaptive architectures have more recently been put forward that optimize the detection strategy numerically by efficient means rather than brute force, either with dynamic programming [23] or reinforcement learning [35, 14]. It is worthwhile to consider them separately from CPN and its modifications, as their displacement algorithm is not prescribed, but the sequence of optimal displacement amplitudes can be tailored to the observed channel behaviour. Specifically, those algorithms construct decision trees like the one in Figure 1b) which are then traversed by the receiver according to the observed outcomes. Each node of the tree is labeled by a vector $\widearrow{k}$ of the so-far observed outcomes and specifies a displacement $\beta_{\widearrow{k}}$ to be applied in the next slot. After the measurement, the receiver proceeds along the branch corresponding to the observed measurement outcome, arrives at the next node and performs the displacement prescribed therein, and so on, until it reaches the end of the tree and outputs some estimate $y_{\widearrow{M}}$ . In principle, every displacement receiver can be described by such a decision tree and additional measurement parameters such as squeezing power can be appended to the $\beta_{\widearrow{k}}$ parameters to also be optimized over. The upshot of the method lies in the ability to adapt the displacement amplitude in each round to the previously observed outcomes, instead of keeping it constant for the whole duration of the measurement process, like in traditional CPN.

Unfortunately, these numerical techniques suffer from the curse of dimensionality. Already for a 32-PPM receiver with binary measurement outcomes that keeps in each node only the one-bit information whether to displace by some constant $\beta$ or not, the whole tree weighs more than a gigabyte. For 256-PPM the number of nodes comes close to the number of atoms in the observable universe. Similarly, storing the final estimates for each possible outcome sequence becomes unfeasible due to the sheer number of possibilities, and if one allows the displacements to be real numbers, the memory required to store the tree grows bigger still. On the other hand, even though adaptive methods are more efficient than brute force optimization of the displacement decision tree, the runtime of the algorithms and their sensitivity to numerical precision also increase quickly with tree depth, i.e., the modulation order. The dynamic algorithm of [23] is shown to work for 8-PPM at the highest, whereas the largest tree constructed in [35] for quadrature amplitude modulation has depth 6 with 3 branches extending from each node. Those can surely be pushed further by upgrading the hardware with more memory and faster components, but never to $M$ ’s of the order of hundreds or more. Practical communication protocols are typically envisioned for $M$ ’s on the order of 128 or higher [36] which is necessary for attaining high photon information efficiency [37] and bridging the gap to the Holevo limit. Therefore, optimal high-order PPM receiver architectures must necessarily operate under a modest number of rules, rather than follow a preconstructed decision tree.

III Results

The outcome of this work is a new displacement receiver scheme. When applied to PPM demodulation, it offers a substantial improvement in symbol error probability over the current methods. The underlying greedy algorithm chooses the next displacement slot-by-slot to be locally optimal, that is, it maximizes the probability of a correct estimate only after the next slot is measured, based on an efficient compression of information provided by the preceding slot measurement outcomes. This contrasts with adaptive receivers that rely on decision trees preconstructed for the whole frame duration.

III.1 The greedy receiver idea

A precise formulation of the greedy displacement algorithm is given in the Methods section, whereas here I outline the basic idea depicted in Figure 2. The initial displacement $\beta_{\mathrm{in}}$ is given as an input to the greedy receiver, after which it assumes the initial hypothesis $y=1$ and then proceeds algorithmically. The receiver requires a small memory in which it stores the current slot hypothesis $y_{\widearrow{k}}$ and the so-called revision ratio $r_{\widearrow{k}}\in\mathbb{R}$ , where $\widearrow{k}$ denotes the vector of the so-far observed outcomes. The revision ratio changes if after a slot measurement the receiver chooses to update its hypothesis—in this parameter it encodes the information about the future possible changes in the probability of a correct estimate. It can be shown that at each slot, knowing the revision ratio, the receiver can act in a way that maximizes the correct estimate probability after the next slot measurement. First of all, only two options can be considered: A, in which the receiver changes the estimate after recording a click and does not if no clicks are observed, and B, in which the estimate is changed after a no-click and remains unchanged if the detector clicks. For each of the options, an optimal displacement leading to the highest correct estimate probability can be easily found numerically since it constitutes a one-parameter optimization problem. The “greedy choice” is then to choose option A or B and apply the corresponding optimal displacement. The receiver continues doing so until the last slot, after which the current estimate is given as the final one.

Because the greedy choice relies only on the value of the revision ratio and, importantly, in the same manner in each slot, the optimal A/B option and the corresponding optimal displacement can be found ahead of time for a range of $r_{\widearrow{k}}$ values and stored in a lookup table, in a way similar to the reinforcement-learning-based experimental setup of Ref. [35]. The optimal initial displacement $\beta_{\mathrm{in}}$ leading to the lowest output error probability can be found ahead of time by simulating the detection for a range of $\beta_{\mathrm{in}}$ values. Such a simulation is not resource-intensive even for large $M$ (for instance, I have performed it for 1024-PPM, cf. Sec. III.3) because its runtime grows only linearly with modulation order.
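The one-slot-ahead rule can be sketched as follows. This is an illustrative reading, not the formulation from the Methods section: instead of the revision ratio it tracks two joint weights, `w_hyp` (prior times likelihood of the observed outcomes under the current hypothesis) and `w_new` (the same under "the pulse is in the next slot"), whose ratio plays a comparable role. The displacement is found by a plain grid search on $[0,2\alpha]$ , the photodetection statistics are those of (3), and the error probability is obtained by enumerating all $2^{M}$ outcome paths, so the sketch is only meant for small $M$ . All function names are hypothetical.

```python
import math

def no_click(alpha, beta, n_b, delta, pulse):
    """No-click probability of Eq. (3) for a pulse slot (pulse=True) or an empty slot."""
    if pulse:
        mean = (alpha - math.sqrt(1.0 - delta) * beta) ** 2 + delta * beta**2 + n_b
    else:
        mean = beta**2 + n_b
    return math.exp(-mean)

def greedy_beta(w_hyp, w_new, alpha, n_b, delta, grid):
    """Grid search for the displacement maximizing the success probability after one slot."""
    best_gain, best_beta = -1.0, grid[0]
    for beta in grid:
        q = no_click(alpha, beta, n_b, delta, pulse=False)
        p = no_click(alpha, beta, n_b, delta, pulse=True)
        # Per outcome, keep the hypothesis or switch to the new slot, whichever is likelier.
        gain = max(w_hyp * q, w_new * p) + max(w_hyp * (1 - q), w_new * (1 - p))
        if gain > best_gain:
            best_gain, best_beta = gain, beta
    return best_beta

def greedy_error(M, alpha, beta_in, n_b=0.0, delta=0.0, grid_points=101):
    """Error probability of the sketched greedy rule, by exact enumeration of outcome paths."""
    grid = [2.0 * alpha * i / (grid_points - 1) for i in range(grid_points)]

    def branch(k, w_hyp, w_new, beta):
        """Measure slot k+1 with displacement beta and sum over both outcomes."""
        q = no_click(alpha, beta, n_b, delta, pulse=False)
        p = no_click(alpha, beta, n_b, delta, pulse=True)
        total = 0.0
        for lq, lp in ((q, p), (1 - q, 1 - p)):  # (no-click, click) likelihoods
            total += success(k + 1, max(w_hyp * lq, w_new * lp), w_new * lq)
        return total

    def success(k, w_hyp, w_new):
        """Probability of a correct final estimate given the weights after k slots."""
        if k == M:
            return w_hyp
        beta = greedy_beta(w_hyp, w_new, alpha, n_b, delta, grid)
        return branch(k, w_hyp, w_new, beta)

    # No hypothesis exists before the first slot; beta_in is therefore a free input.
    return 1.0 - branch(0, 0.0, 1.0 / M, beta_in)
```

Under these assumptions the initial displacement can be tuned by evaluating `greedy_error` over a range of `beta_in` values and keeping the smallest result. Because the per-slot decision depends on the two weights only through their ratio, the inner grid search could also be tabulated ahead of time, in the spirit of the lookup table mentioned above.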

In what follows, I show how at low PPM orders for which the numerical optimization of the whole decision tree can be carried out, the greedy receiver achieves comparable results. Next, in the photon-starved regime and low noise conditions it is shown to approach the Helstrom bound with an exponential improvement in scaling compared to CPN. Because of its simplicity, the greedy receiver can be applied to arbitrarily high modulation orders as well, at which it exhibits performance analogous to low orders. Furthermore, it is shown to outperform both DD and CPN in the whole signal power spectrum, with a surprising order-of-magnitude improvement in error probability in the limit of strong pulses under noisy conditions. An exemplary application to real PPM communication scenarios concludes the Results.

III.2 Comparison of the greedy receiver with numerical optimization

For low PPM orders it is possible to numerically optimize the entire displacement decision tree, i.e., find parameters $\beta_{\widearrow{k}}$ and $y_{\widearrow{M}}$ , like the ones in Figure 1b), that yield the minimal possible average error probability. Figure 3 compares the optimal error probabilities calculated by the adaptive algorithm of Ref. [23] with CPN (with the nulling displacement $\beta$ already optimized) and the greedy receiver, assuming 4-PPM. The boundaries of the grey regions correspond to the standard quantum limit (SQL), i.e., direct detection error probabilities given by (2) and independent of mode mismatch $\Delta$ ; anything below those regions signifies an advantage over conventional detection.

Figure 3a) depicts a low-noise scenario with $N_{b}=0.002$ , whereas Figure 3b) assumes noisy conditions with $N_{b}=0.2$ . In both figures, the pink curves correspond to zero $\Delta$ and the blue curves to $\Delta=0.1$ , modeling imperfect interference between the signal and displacement modes. In all cases, the greedy receiver closely tracks the numerically optimal performance of the adaptive algorithm. It universally outperforms DD and CPN, with the advantage pronounced especially strongly with increased $N_{b}$ and $\Delta$ . In fact, both additive noise $N_{b}$ and mode mismatch $\Delta$ deteriorate the performance of CPN more quickly—even pushing it above the SQL—than that of the greedy receiver, whose error probability continues to follow the optimal one and thus can be seen to adapt well to worsening conditions. The seemingly constant proximity of the greedy and optimal results suggests the two may be somehow linked, perhaps by a constant factor—a feature encountered in some greedy algorithms [15].

III.3 Asymptotic behaviour of the greedy receiver

III.3.1 Noiseless conditions

Figure 4a) compares the error probabilities of different PPM receivers assuming 32-PPM and ideal transmission, i.e., no additive noise $N_{b}=0$ and zero mode mismatch $\Delta=0$ in (3). These assumptions now allow for a fair comparison with the quantum-optimal, noiseless Helstrom bound of Sec. II, indicated in the figure by a dark gray region of error probabilities prohibited by quantum theory. The standard quantum limit is again depicted as a boundary of a light gray region.

As noted in the original work on CPN [31], under noiseless conditions its error probability is near-optimal, i.e., achieves the same large-energy scaling as the Helstrom bound. Specifically, with the average number of photons detected per frame $n\gg 1$ , both the Helstrom bound and the CPN error probability of Sec. II scale like $e^{-2n}$ . The greedy receiver can be seen in Figure 4a) to approach the CPN curve with growing $n$ . Indeed, in this limit the CPN error probability is an upper bound to the greedy one. This can be seen by first noting that the greedy receiver has to be initialized with some displacement $\beta_{\mathrm{in}}$ , with respect to which the error probability can be minimized. Therefore, for any initial displacement the resulting greedy error probability is an upper bound to the minimal one, and if one chooses $\beta_{\mathrm{in}}=\alpha$ , it can be shown (cf. Supp. Mat.) that in this limit the greedy receiver results in exactly the CPN error formula, making it an upper bound.

On the other end of the energy spectrum, the photon-starved regime $n\ll 1$ characteristic of deep-space optical communication and depicted in the inset of Figure 4a), the Helstrom bound scales like $\left(\frac{M-1}{M}-\gamma\sqrt{n}\right)$ , where $\gamma$ is some positive proportionality factor. CPN is then no longer near-optimal, as its error probability scales linearly with $n$ like $\left(\frac{M-1}{M}-\kappa n\right)$ for some $\kappa>0$ . The greedy receiver appears to track the quantum-optimal Helstrom scaling. In this limit its analytic behaviour is not easily inferred. Therefore, I resorted to calculating the error probability formulas obtained from decision trees traversed by the greedy receiver for up to $M=12$ . Indeed, in each case a $\sim\!\sqrt{n}$ scaling was observed, mimicking the quantum-optimal Helstrom bound.
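Both limiting behaviours of the Helstrom bound can be read off from a short expansion. The sketch below assumes $n=|\alpha|^{2}$ and the noiseless closed form of Sec. II; the explicit prefactors are the leading-order values that follow from this expansion and are not quoted from the references.

```latex
\begin{align}
P_{e,M}^{\mathrm{H}} &= \frac{M-1}{M^{2}}
  \left(\sqrt{1+(M-1)e^{-n}}-\sqrt{1-e^{-n}}\right)^{2}, \\
% photon-starved regime: e^{-n} \approx 1 - n
n\ll 1:\quad
\sqrt{1+(M-1)e^{-n}} &\approx \sqrt{M}, \qquad
\sqrt{1-e^{-n}} \approx \sqrt{n}, \\
P_{e,M}^{\mathrm{H}} &\approx \frac{M-1}{M^{2}}\left(M-2\sqrt{Mn}\right)
  = \frac{M-1}{M}-\frac{2(M-1)}{M^{3/2}}\sqrt{n}, \\
% strong pulses: expand both roots to first order in e^{-n}
n\gg 1:\quad
\sqrt{1+(M-1)e^{-n}}-\sqrt{1-e^{-n}} &\approx \frac{M}{2}\,e^{-n}, \\
P_{e,M}^{\mathrm{H}} &\approx \frac{M-1}{4}\,e^{-2n}.
\end{align}
```

Within this expansion the factor $\gamma$ of the photon-starved regime evaluates to $2(M-1)/M^{3/2}$ , and the strong-pulse limit reproduces the $e^{-2n}$ scaling.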

III.3.2 Strong, noisy pulses

A striking feature of the greedy receiver is visible in Figure 4b) in the limit of few-photon or higher pulse energies with non-zero additive noise $N_{b}$ , where a large gap develops between DD&CPN and the greedy receiver error probabilities, spanning orders of magnitude. The gap diminishes with rising mode mismatch $\Delta$ . Its appearance can be explained by the lack of photon number resolution, which results in DD&CPN reaching a “dark-count floor”, a characteristic flattening of the error probability. This arises because at this point a further increase in photon number does not influence much the pulse slot click probability as it is already practically equal to 1. The error probability is therefore limited by the amount of noise leading to clicks in empty slots. The greedy receiver has the ability to circumvent this limitation and reaches its much lower dark-count floor at higher pulse energies.

In this regime it is also possible to analytically infer the behaviour of the greedy receiver by plugging in a chosen value of the initial displacement $\beta_{\mathrm{in}}$ and observing that the resulting error probability must then be an upper bound to the minimal possible one. From greedy receiver simulations, it appears that in the strong pulse regime, the optimal initial displacement $\beta_{\mathrm{in}}$ coincides with the pulse amplitude $\alpha$ . Plugging this in, it can be shown that the greedy receiver follows a specific repetitive decision tree discussed in the Supp. Mat. that contains nodes with either no displacement or displacement by $\alpha$ . The resulting error probability formula is then a function of $q_{0}$ , $p_{0}$ , $q_{\alpha}$ and $p_{\alpha}$ . Because in the strong pulse regime a click is virtually certain if a large amplitude $\alpha$ impinges on the SPD, one can take the limits $p_{0}\to 0$ and $q_{\alpha}\to 0$ , after which the resulting expression for the greedy error probability reads

P_{e,M}^{\mathrm{Gr.\,limit}}=\frac{\bar{p}_{\alpha}\left(-1+M-Mq_{0}+q_{0}^{M}\right)}{M\bar{q}_{0}}=\bar{p}_{\alpha}\lim_{p_{0}\to 0}P_{e,M}^{\mathrm{DD}}=\bar{q}_{0}\lim_{p_{0}\to 0}P_{e,M}^{\mathrm{DD}}.

The last equality follows if we additionally assume that a pulse slot displaced by $\alpha$ gives the same photodetection statistics as a non-displaced empty slot, $p_{\alpha}=q_{0}$ (this corresponds to setting $\Delta=0$ in the photodetection statistics (3)). One obtains then that the greedy receiver achieves the SQL, but additionally multiplied by the probability that an empty slot clicks. The resulting formulas are plotted in Figure 4 and found to match those obtained with greedy receiver simulation (interestingly, for $\Delta\neq 0$ as well). As an aside it can also be shown that in this regime and for $M\gg 1$ the error probabilities of DD&CPN converge, and so the latter is in fact limited by modulation order. The gap between them and the greedy receiver is, however, independent of the modulation order.
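The size of the strong-pulse gap can be evaluated with a short sketch. It assumes the limiting expression $\bar{p}_{\alpha}\lim_{p_{0}\to 0}P_{e,M}^{\mathrm{DD}}$ with the statistics of (3), and writes the DD limit as $1-\frac{1}{M}\sum_{j=0}^{M-1}q_{0}^{j}$ , which is algebraically the same as $(-1+M-Mq_{0}+q_{0}^{M})/(M\bar{q}_{0})$ but stays finite for $N_{b}=0$ . The function names are hypothetical.

```python
import math

def dd_strong_pulse_limit(M, n_b):
    """Direct detection error of Eq. (2) in the limit p0 -> 0 (the dark-count floor)."""
    q0 = math.exp(-n_b)
    return 1.0 - sum(q0**j for j in range(M)) / M

def greedy_strong_pulse_limit(M, alpha, n_b, delta=0.0):
    """Limiting greedy error: the DD floor multiplied by the click probability of a nulled pulse."""
    nulled_mean = (alpha - math.sqrt(1.0 - delta) * alpha) ** 2 + delta * alpha**2 + n_b
    p_alpha = math.exp(-nulled_mean)  # no-click probability of a pulse displaced by alpha
    return (1.0 - p_alpha) * dd_strong_pulse_limit(M, n_b)
```

For $\Delta=0$ the ratio of the two floors equals $1-e^{-N_{b}}$ for every $M$ , which is the modulation-order-independent gap mentioned above; with mode mismatch the factor $\bar{p}_{\alpha}$ grows with $\alpha$ and the gap shrinks.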

A natural way to alleviate the dark-count floor issue is to increase the detection threshold of the DD&CPN receivers, so that, for instance, up to two or even three photodetections in a slot are not yet counted as a click. This would effectively eliminate the erroneous clicks arising from noise but still allow for more energetic signal pulses to be detected, reducing the overall error probability. Note, however, that the error reduction in the greedy receiver is independent from the actual photodetection statistics (3)—in the derivation one stays at the level of $q-$ and $p-$ probabilities without the need to specify them exactly. Thus, the strong pulse advantage of the greedy method over DD&CPN persists even if the detection threshold is modified (or, in fact, for any photodetection statistics put in place of (3)).

III.4 Greedy reception for real communication systems

Finally, to estimate the theoretical gain of greedy reception in real-life communication scenarios, I apply it to two practical use-cases of high-order PPM: that of the recently launched PSYCHE mission [36], and that of the Deep Space Optical Transceiver (DSOT) concept coupled with the Large Binocular Telescope (LBT), theorized in [38]. As depicted in Figure 5, for the case of PSYCHE, operating in the photon-starved regime, the use of CPN or DD leaves a gap of approx. $0.30$ dB with respect to the Helstrom bound. The greedy receiver would cut this gap in half, improving the error probability by approx. $0.15$ dB over DD&CPN and approaching the bound within $0.15$ dB as well. For DSOT with LBT, due to stronger signals, greedy reception reduces the error level by a factor of three when compared with DD and CPN, which additionally suffer more from the appropriately increased background noise levels.

IV Discussion

I have demonstrated the advantage of using a displacement receiver operating according to a greedy decision algorithm for pulse position demodulation. The greedy receiver exhibits improved resistance to noise and mode mismatch compared with the state-of-the-art CPN method and the conventional direct detection. Moreover, it performs comparably to the optimal displacement strategies, in the few scenarios for which the latter can be established numerically. The results raise many interesting questions. What is it about the problem studied that allows such a simple strategy to perform so well, not only in noisy conditions (for which one could expect the level of disturbance to negate the advantage of “forward-thinking” methods), but ideal ones as well? There exist problem structures in computer science for which greedy methods can be guaranteed to be optimal or near-optimal [15]—perhaps such a structure could be identified in coherent state demodulation as well? The interesting next step would be to adapt the idea to other modulation formats or information-theoretic tasks based on coherent state discrimination.

The greedy choice idea lends itself to easy modification with regard to the number of measurement parameters and outcomes, as one still only needs to analyze a single measurement. It would therefore be interesting to incorporate, for instance, additional squeezing of the light before the measurement, as in one of the CPN modifications [34], or to allow for photon-number resolution, i.e., decision trees with more than two measurement outcomes corresponding to different numbers of photons detected in each slot. One could also relax the greediness and, instead of maximizing the reward after the next slot, do it for some low number of slots ahead. Any of these ideas could reduce the error probability further, possibly even bridging the gap to the Helstrom bound in the ideal scenario of Figure 4a). It would also be interesting to check how the greedy receiver performs in channels with time-varying characteristics as suggested in [35]. Last but not least, besides error probability, the other noteworthy receiver benchmark is the achievable channel capacity [1]. A comprehensive comparison between different receivers should then account for the possibility of erasures. These are usually announced in DD and CPN in the event of no detections at all. A simple way to incorporate erasures into the greedy receiver would be to have it additionally store the probability of the current estimate being correct and, in the end, to output an estimate only if this probability exceeds some threshold, and otherwise declare an erasure.

V Methods

V.1 The operation of the greedy receiver

The greedy receiver algorithm is motivated by three observations. First of all, note that any receiver is free to hold a temporary estimate $y_{\widearrow{k}}\leq k$ after measuring $k\leq M$ slots of a frame, where by the vector $\widearrow{k}$ I denote the history of observed outcomes. After measurement in slot $k+1$ the estimate can be kept unchanged ( $y_{\widearrow{k+1}}=y_{\widearrow{k}}$ ) or it can be updated ( $y_{\widearrow{k+1}}=k+1$ ). For $k=M$ the estimate $y_{\widearrow{M}}$ is the final output and the goal of a receiver is to maximize, on average, the probability of $y_{\widearrow{M}}=x$ .

Second of all, given that the receiver arrives at an estimate $y_{\widearrow{k}}$ by recording outcomes $\widearrow{k}$ , the probability $P_{c,k}(y_{\widearrow{k}}=x|\widearrow{k})$ of the estimate being correct is a product of $p(x)=1/M$ and a sequence of $q_{\beta}$ and $p_{\beta}$ probabilities or their barred counterparts $\bar{q}_{\beta}=1-q_{\beta}$ , $\bar{p}_{\beta}=1-p_{\beta}$ . Specifically, the product contains $k$ such factors, all of them $q$ -like except for the one at position $y_{\widearrow{k}}$ which is $p$ -like. The $\beta$ subscripts encode the sequence of displacements applied so far and the presence of a bar indicates that a click was observed in the corresponding slot. For example, assume that $y_{010}=1$ in the general receiver tree pictured in Figure 1b). Then the probability that the receiver holds this estimate and it is correct reads $P_{c,M}(y_{010}=1|010)=\frac{1}{3}p_{\beta_{\mathrm{in}}}\bar{q}_{\beta_{0}}q_{\beta_{01}}$ .
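
As a minimal illustration of this product structure, the Python sketch below evaluates the expression for a given outcome history. The function name and its arguments are hypothetical and not taken from the author's code; the no-click probabilities are passed in as plain numbers, since their actual form depends on the detection model.

```python
from math import prod


def correct_estimate_probability(M, estimate, clicks, p, q):
    """Evaluate the product (1/M) * (one p-like factor) * (q-like factors).

    clicks[i] is True if slot i+1 clicked. p[i] and q[i] are the no-click
    probabilities of slot i+1 with and without a pulse, for the displacement
    applied in that slot (illustrative inputs, not values from the paper).
    """
    factors = []
    for slot, clicked in enumerate(clicks, start=1):
        # The slot holding the estimate contributes a p-like factor,
        # every other measured slot a q-like factor.
        no_click = p[slot - 1] if slot == estimate else q[slot - 1]
        # A click replaces the factor by its barred counterpart.
        factors.append(1.0 - no_click if clicked else no_click)
    return prod(factors) / M
```

For the history 010 with the estimate held at slot 1, the call `correct_estimate_probability(3, 1, [False, True, False], p, q)` returns `p[0] * (1 - q[1]) * q[2] / 3`, matching the example expression above.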

Third of all, because the expression for $P_{c,k}$ tracks the history of measurements and outcomes applied, updating it is simple. If after the measurement with displacement $\beta_{?}$ the receiver does not change its estimate, $P_{c,k}$ is simply multiplied by $q_{\beta_{?}}$ in case of a no-click or $\bar{q}_{\beta_{?}}$ in case of a click. On the other hand, if the receiver does update the estimate, the measurement and outcome history remain unchanged—the only required correction to $P_{c,k}$ is to change the current single $p$ -like probability to a $q$ -like and to multiply by $p_{\beta_{?}}$ or $\bar{p}_{\beta_{?}}$ to reflect the estimate update.

Different decisions result in different final probabilities of the estimate being correct. DD and CPN receivers make the choice according to their guiding algorithms while adaptive schemes make use of a preconstructed tree in which the choices have already been made. The idea behind the greedy receiver, depicted in Figure 2, is to hold at all times not only the temporary estimate $y_{\widearrow{k}}$ but also the revision ratio $r_{\widearrow{k}}$ by which one has to multiply the correct estimate probability expression to change a $p$ -like probability into a $q$ -like. If the temporary estimate was acquired after a no-click, the ratio is given by $r_{\widearrow{k}}=q_{\beta_{\widearrow{k}}}/p_{\beta_{\widearrow{k}}}$ where $\beta_{\widearrow{k}}$ is the displacement applied in the slot $y_{\widearrow{k}}$ . Conversely, if the temporary estimate was acquired after a click, the ratio reads $r_{\widearrow{k}}=\bar{q}_{\beta_{\widearrow{k}}}/\bar{p}_{\beta_{\widearrow{k}}}$ .
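
The bookkeeping described so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state is a hypothetical tuple `(estimate, P_c, r)`, the function name is invented, and the no-click probabilities of the measured slot are supplied by the caller rather than derived from a detector model.

```python
import math


def update_state(state, slot, clicked, p_no_click, q_no_click, change_estimate):
    """Advance (estimate, P_c, r) by one measured slot.

    p_no_click / q_no_click are the no-click probabilities of the measured
    slot with and without a pulse, for the displacement that was applied
    (assumed inputs).
    """
    estimate, P_c, r = state
    # Use the barred (click) or unbarred (no-click) version of each probability.
    p_like = 1.0 - p_no_click if clicked else p_no_click
    q_like = 1.0 - q_no_click if clicked else q_no_click
    if change_estimate:
        # The old p-like factor becomes q-like (multiply by r) and the new
        # slot contributes a p-like factor; the ratio is reset for the new slot.
        new_r = q_like / p_like if p_like > 0 else math.inf
        return slot, P_c * r * p_like, new_r
    # Estimate kept: the new slot contributes a q-like factor, r is unchanged.
    return estimate, P_c * q_like, r
```

Starting from the state after the first slot and applying the function slot by slot reproduces the product expressions of the previous paragraphs without ever storing the full outcome history.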

With the revision ratio $r_{\widearrow{k}}$ it is possible to choose $\beta_{?}$ such that $P_{c,k+1}$ is maximal, irrespective of the current value of $P_{c,k}$ . One needs only to compare the two options for the estimate update depicted in Figure 2b). In option A the receiver keeps the estimate unchanged in case of a no-click and updates it in case of a click. Summed over both outcomes, this has the effect of multiplying $P_{c,k}$ by $p_{\mathrm{A}}:=q_{\beta_{?}}+r_{\widearrow{k}}\bar{p}_{\beta_{?}}$ . In option B, the estimate is updated in case of a no-click and kept unchanged in case of a click. This corresponds to multiplying $P_{c,k}$ by $p_{\mathrm{B}}:=r_{\widearrow{k}}p_{\beta_{?}}+\bar{q}_{\beta_{?}}$ . The revision ratio can thus be interpreted as a compression of information about the recorded outcomes which quantifies how much can be gained by an estimate update. The greedy choice is simply to displace by the $\beta_{?}$ that maximizes $\max\{p_{\mathrm{A}},p_{\mathrm{B}}\}$ and to adopt, correspondingly, option A or B in the measurement.
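
A rough Python sketch of this one-parameter optimization is given below. It rests on assumptions that are not part of the text above: an ideal on-off detector without dark counts or mode mismatch, real amplitudes, and no-click probabilities $p_{\beta}=e^{-(\alpha-\beta)^{2}}$ and $q_{\beta}=e^{-\beta^{2}}$. A brute-force grid over $[-\alpha,2\alpha]$ stands in for a proper optimizer, and the two update rules are named by what they do rather than by letter. All function names are hypothetical.

```python
import math


def no_click_probs(alpha, beta):
    """Assumed ideal model: no-click probabilities (pulse present, empty slot)."""
    return math.exp(-(alpha - beta) ** 2), math.exp(-beta ** 2)


def greedy_choice(r, alpha, n_grid=3001):
    """Grid-search the displacement maximizing the larger of the two multipliers.

    Returns (multiplier, beta, option) for a given revision ratio r.
    """
    best = (-math.inf, None, None)
    for i in range(n_grid):
        beta = alpha * (-1.0 + 3.0 * i / (n_grid - 1))  # scan [-alpha, 2*alpha]
        p, q = no_click_probs(alpha, beta)
        candidates = (
            (q + r * (1.0 - p), "update_on_click"),     # keep on no-click
            (r * p + (1.0 - q), "update_on_no_click"),  # keep on click
        )
        for value, option in candidates:
            if value > best[0]:
                best = (value, beta, option)
    return best
```

In this toy model a vanishing ratio leads to zero displacement with the estimate kept unless a click occurs, which resembles direct detection, while a very large ratio favours nulling the next slot and moving the estimate there on a no-click.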

This decision process can be repeated for every slot irrespective of the modulation order $M$ . The one-parameter optimization over $\beta_{?}$ should be performed beforehand for a sample of ratios $r_{\widearrow{k}}$ and the optimal $\beta_{?}$ and A/B options can be kept in a lookup table. This way, in a practical implementation, the next displacement to apply can be found in time for the arrival of the next slot, similarly to what was demonstrated experimentally in [35] for adaptively learned decision trees. Note, however, that the speed of the electronics governing the displacement changes limits the minimal possible slot width, as is the case for any displacement receiver. Finally, the initial displacement $\beta_{\mathrm{in}}$ is not determined by the algorithm and should be optimized for given channel conditions by testing a range of values.
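
One possible shape of such a lookup table is sketched below in Python. The detector model is again an assumed ideal one ( $p_{\beta}=e^{-(\alpha-\beta)^{2}}$ , $q_{\beta}=e^{-\beta^{2}}$ ), the grid sizes are arbitrary, and the nearest-neighbour lookup in $\log_{10}r$ is a simple choice made for illustration; none of the names come from the author's implementation.

```python
import math


def greedy_choice(r, alpha, n_grid=301):
    """Return (beta, option) maximizing the multiplier under the assumed model."""
    best_value, best = -math.inf, None
    for i in range(n_grid):
        beta = alpha * (-1.0 + 3.0 * i / (n_grid - 1))  # scan [-alpha, 2*alpha]
        p = math.exp(-(alpha - beta) ** 2)  # no-click probability, pulse present
        q = math.exp(-beta ** 2)            # no-click probability, empty slot
        for value, option in ((q + r * (1.0 - p), "update_on_click"),
                              (r * p + (1.0 - q), "update_on_no_click")):
            if value > best_value:
                best_value, best = value, (beta, option)
    return best


def build_table(alpha, n_ratios=200, log_min=-16.0, log_max=16.0):
    """Precompute greedy choices for logarithmically spaced revision ratios."""
    table = []
    for i in range(n_ratios):
        log_r = log_min + (log_max - log_min) * i / (n_ratios - 1)
        table.append(greedy_choice(10.0 ** log_r, alpha))
    return table


def lookup(table, r, log_min=-16.0, log_max=16.0):
    """Nearest-neighbour lookup in log10(r), clamped to the table range."""
    if r <= 0.0:
        return table[0]
    if math.isinf(r):
        return table[-1]
    frac = (math.log10(r) - log_min) / (log_max - log_min)
    index = round(frac * (len(table) - 1))
    return table[min(max(index, 0), len(table) - 1)]
```

At run time only `lookup` is called, so the per-slot cost is a logarithm and an index operation regardless of how expensive the offline optimization was.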

V.2 Simulation data

For low PPM orders the greedy receiver error probability can be found exactly (within numerical precision) by simply generating the whole decision tree that the greedy receiver follows. This has been done in Figure 3 and the greedy error probabilities are plotted there as points. For larger modulation orders storing the trees becomes infeasible and, because of that, in the other figures the greedy receiver is simulated. For a given amplitude $\alpha$ and photodetection statistics, a lookup table is generated for a range of revision ratio values (typically 1000 to 10000 logarithmically spaced values of $r_{\widearrow{k}}\in[10^{-16},10^{16}]$ ), containing the optimal displacement to apply and the greedy choice to make. Then, in one simulation round, an input symbol $x$ is drawn randomly and the slot-by-slot simulation is explicitly performed via the greedy receiver algorithm, leading to either a successful identification of the pulse position or an error. For the figures in the article, each point was simulated for between 10000 and 1000000 rounds. The error bars are set to $\pm 3\sigma$ , with $\sigma$ being the standard deviation of the sample mean of the obtained simulation results. The data obtained and simulation code can be shared by the author upon request.
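
To make the simulation loop concrete, the following self-contained Python sketch runs a small Monte Carlo experiment. It is not the author's code and makes several simplifying assumptions: an ideal on-off detector with $p_{\beta}=e^{-(\alpha-\beta)^{2}}$ and $q_{\beta}=e^{-\beta^{2}}$ , a cached grid search in place of a precomputed lookup table, a fixed initial displacement `beta_in=0.0` that is not optimized, ratios clamped to $[10^{-16},10^{16}]$ , and the binomial estimate $\sigma=\sqrt{\hat{p}(1-\hat{p})/N}$ for the standard deviation of the sample mean.

```python
import math
import random
from functools import lru_cache

R_MIN, R_MAX = 1e-16, 1e16  # ratio range quoted in the text


def no_click_probs(alpha, beta):
    """Assumed ideal model: no-click probabilities (pulse present, empty slot)."""
    return math.exp(-(alpha - beta) ** 2), math.exp(-beta ** 2)


@lru_cache(maxsize=None)
def greedy_choice(r, alpha, n_grid=301):
    """Return (beta, update_on_click) maximizing the greedy multiplier."""
    best_value, best = -math.inf, (0.0, True)
    for i in range(n_grid):
        beta = alpha * (-1.0 + 3.0 * i / (n_grid - 1))  # scan [-alpha, 2*alpha]
        p, q = no_click_probs(alpha, beta)
        for update_on_click, value in ((True, q + r * (1.0 - p)),
                                       (False, r * p + (1.0 - q))):
            if value > best_value:
                best_value, best = value, (beta, update_on_click)
    return best


def revision_ratio(p, q, clicked):
    """Ratio of q-like to p-like factor for the slot holding the estimate."""
    p_like, q_like = (1.0 - p, 1.0 - q) if clicked else (p, q)
    r = q_like / p_like if p_like > 0 else R_MAX
    return min(max(r, R_MIN), R_MAX)


def simulate(M, alpha, rounds, beta_in=0.0, seed=0):
    """Return (estimated error probability, 3-sigma error bar)."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(rounds):
        x = rng.randint(1, M)  # position of the pulse
        estimate, r = None, None
        for slot in range(1, M + 1):
            if slot == 1:
                beta, update_on_click = beta_in, None
            else:
                beta, update_on_click = greedy_choice(r, alpha)
            p, q = no_click_probs(alpha, beta)
            clicked = rng.random() >= (p if slot == x else q)
            # The first slot always sets the estimate; later slots follow
            # the greedy update rule chosen for the current ratio.
            if slot == 1 or clicked == update_on_click:
                estimate, r = slot, revision_ratio(p, q, clicked)
        errors += estimate != x
    p_err = errors / rounds
    sigma = math.sqrt(p_err * (1.0 - p_err) / rounds)
    return p_err, 3.0 * sigma
```

A call such as `simulate(4, 1.0, 2000)` returns the estimated error probability together with its $3\sigma$ half-width. Replacing the cached grid search by a table lookup would bring the sketch closer to the procedure described above.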

VI Acknowledgements

The author is immensely grateful to Konrad Banaszek, René-Jean Essiambre, Saikat Guha, Michał Jachura, Marcin Jarzyna, and Matteo Rosati for fruitful discussions. This work was supported by the Polish Ministry of Education and Science under the “Quantum strategies in communication through noisy optical channels” project no. PN/01/0204/2022 carried out within the “Pearls of Science” program.

References

[1] C. E. Shannon. Communication in the presence of noise. Proc. IRE, 37(1):10–21, 1949.
[2] David J.C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, Cambridge, UK, 1st edition, 2003.
[3] K. Banaszek, L. Kunz, M. Jachura, and M. Jarzyna. Quantum limits in optical communications. Journal of Lightwave Technology, 38(10):2741–2754, 2020.
[4] Karol Łukanowski and Marcin Jarzyna. Capacity of a lossy photon channel with direct detection. IEEE Transactions on Communications, 69(8):5059–5068, 2021.
[5] Marco Fanizza, Matteo Rosati, Michalis Skotiniotis, John Calsamiglia, and Vittorio Giovannetti. Squeezing-enhanced communication without a phase reference. Quantum, 5:608, December 2021.
[6] Roy J. Glauber. Coherent and incoherent states of the radiation field. Physical Review, 131(6):2766–2788, Sep. 1963.
[7] E. C. G. Sudarshan. Equivalence of semiclassical and quantum mechanical descriptions of statistical light beams. Phys. Rev. Lett., 10:277–279, Apr. 1963.
[8] V. Giovannetti, S. Guha, S. Lloyd, L. Maccone, J. H. Shapiro, and H. P. Yuen. Classical capacity of the lossy bosonic channel: The exact solution. Phys. Rev. Lett., 92:027902, Jan. 2004.
[9] V. Giovannetti, R. García-Patrón, N. J. Cerf, and A. S. Holevo. Ultimate classical communication rates of quantum optical channels. Nature Photon., 8:796–800, 2014.
[10] A. S. Holevo. Bounds for the quantity of information transmitted by a quantum communication channel. Problems of Information Transmission, 9(3):177–183, 1973.
[11] B. Schumacher and M. D. Westmoreland. Sending classical information via noisy quantum channels. Phys. Rev. A, 56:131–138, Jul. 1997.
[12] S. Guha. Structured optical receivers to attain superadditive capacity and the Holevo limit. Phys. Rev. Lett., 106(24):240502, 2011.
[13] Masahide Sasaki, Kentaro Kato, Masayuki Izutsu, and Osamu Hirota. Quantum channels showing superadditivity in classical capacity. Phys. Rev. A, 58:146–158, Jul 1998.
[14] M. Bilkis, M. Rosati, R. Morral Yepes, and J. Calsamiglia. Real-time calibration of coherent-state receivers: Learning by trial and error. Phys. Rev. Res., 2:033295, Aug 2020.
[15] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. The MIT Press, 4th edition, 2022.
[16] S. J. Dolinar, J. Hamkins, B. E. Moision, and V. A. Vilnrotter. Deep-space optical communications. In H. Hemmati, editor, Deep-Space Communications and Navigation Series, pages 215–289. Wiley, New York, 2006.
[17] Hamid Hemmati, Abhijit Biswas, and Ivan B. Djordjevic. Deep-space optical communications: Future perspectives and applications. Proceedings of the IEEE, 99(11):2020–2039, 2011.
[18] Don M. Boroson, Abhijit Biswas, and Bernard L. Edwards. MLCD: overview of NASA’s Mars laser communications demonstration system. In Steve Mecherle, Cynthia Y. Young, and John S. Stryjewski, editors, Free-Space Laser Communication Technologies XVI, volume 5338, pages 16–28. International Society for Optics and Photonics, SPIE, 2004.
[19] Yuval Kochman, Ligong Wang, and Gregory W. Wornell. Toward photon-efficient key distribution over optical channels. IEEE Transactions on Information Theory, 60(8):4958–4972, 2014.
[20] M. Jarzyna, P. Kuszaj, and K. Banaszek. Incoherent on-off keying with classical and non-classical light. Optics Express, 23(3):3170–3175, 2015.
[21] Carl W. Helstrom. Quantum detection and estimation theory. Journal of Statistical Physics, 1(2):231–252, 1969.
[22] G. Cariolaro and G. Pierobon. Theory of quantum pulse position modulation and related numerical problems. IEEE Transactions on Communications, 58(4):1213–1222, 2010.
[23] Nicola Dalla Pozza and Nicola Laurenti. Adaptive discrimination scheme for quantum pulse-position-modulation signals. Phys. Rev. A, 89:012339, Jan 2014.
[24] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, 2006.
[25] Jian Chen, Jonathan L. Habif, Zachary Dutton, Richard Lazarus, and Saikat Guha. Optical codeword demodulation with error rates below the standard quantum limit using a conditional nulling receiver. Nature Photonics, 6(6):374–379, May 2012.
[26] Yury Polyanskiy, H. Vincent Poor, and Sergio Verdu. Channel coding rate in the finite blocklength regime. IEEE Transactions on Information Theory, 56(5):2307–2359, 2010.
[27] Marcin Jarzyna and Konrad Banaszek. Efficiency of optimized pulse position modulation with noisy direct detection. In 2017 IEEE International Conference on Space Optical Systems and Applications (ICSOS), pages 166–171, 2017.
[28] Wojciech Zwoliński, Marcin Jarzyna, and Konrad Banaszek. Range dependence of an optical pulse position modulation link in the presence of background noise. Opt. Express, 26(20):25827–25838, Oct 2018.
[29] R. S. Kennedy. A near-optimum receiver for the binary coherent state quantum channel. Quarterly Progress Report 108, Research Laboratory of Electronics, M.I.T., January 15 1973.
[30] S. J. Dolinar. An optimum receiver for the binary coherent state quantum channel. Quarterly Progress Report, 111, 1973.
[31] S. J. Dolinar, Jr. The telecommunications and data acquisition progress report 42-72: October–December 1982. Technical report, NASA, Pasadena, CA, 1983.
[32] Roy S. Bondurant. Near-quantum optimum receivers for the phase-quadrature coherent-state channel. Opt. Lett., 18(22):1896–1898, Nov 1993.
[33] Leonard Mandel and Emil Wolf. Optical Coherence and Quantum Optics. Cambridge University Press, 1995.
[34] Saikat Guha, Jonathan L. Habif, and M. Takeoka. Approaching Helstrom limits to optical pulse-position demodulation using single photon detection and optical feedback. Journal of Modern Optics, 58(3-4):257–265, 2011.
[35] Chaohan Cui, William Horrocks, Shuhong Hao, Saikat Guha, Nasser Peyghambarian, Quntao Zhuang, and Zheshen Zhang. Quantum receiver enhanced by adaptive learning. Light: Science & Applications, 11(1), December 2022.
[36] Daniel Rieländer, Andrea Di Mira, David Alaluf, Robert Daddato, Sinda Mejri, Jorge Piris, Jorge Alves, Dimitrios Antsos, Abhijit Biswas, Nikos Karafolas, Klaus-Jürgen Schulz, and Clemens Heese. ESA ground infrastructure for the NASA/JPL PSYCHE Deep-Space Optical Communication demonstration. In Kyriaki Minoglou, Nikos Karafolas, and Bruno Cugny, editors, International Conference on Space Optics — ICSO 2022, volume 12777, page 127770E. International Society for Optics and Photonics, SPIE, 2023.
[37] Konrad Banaszek, Ludwig Kunz, Marcin Jarzyna, and Michal Jachura. Approaching the ultimate capacity limit in deep-space optical communication. In Hamid Hemmati and Don M. Boroson, editors, Free-Space Laser Communications XXXI, volume 10910, page 109100A. International Society for Optics and Photonics, SPIE, 2019.
[38] Bruce Moision and William Farr. Range dependence of the optical communications channel. The Interplanetary Network Progress Report, 42-199, Nov 2014.
[39] Marcin Jarzyna, Wojciech Zwoliński, Michał Jachura, and Konrad Banaszek. Optimizing deep-space optical communication under power constraints. In Hamid Hemmati and Don M. Boroson, editors, Free-Space Laser Communication and Atmospheric Propagation XXX, volume 10524, page 105240A. International Society for Optics and Photonics, SPIE, 2018.