Abstract
The Boolean multireference alignment problem consists in recovering a Boolean signal from multiple shifted and noisy observations. In this paper we obtain an expression for the error exponent of the maximum a posteriori (MAP) decoder. This expression is used to characterize the number of measurements needed for signal recovery in the low SNR regime, in terms of higher-order autocorrelations of the signal. The characterization is explicit for various signal dimensions, such as prime and even dimensions.
I. Introduction
The Boolean multireference alignment (BMA) problem consists of estimating an unknown signal x ∈ ℤ_2^L from noisy, cyclically shifted copies Y_1, …, Y_N ∈ ℤ_2^L, i.e.,
(1)  Y_i = R^{S_i} x ⊕ Z_i,   i ∈ [N],
where the errors Z_i ~ Ber(p)^{⊗L}, the product measure of L Bernoulli variables with parameter p; ⊕ denotes entrywise addition mod 2; R is the cyclic shift operator that shifts a vector one element to the right, (x_1, …, x_L) ↦ (x_L, x_1, …, x_{L−1}); R^{S_i} corresponds to applying the operator R S_i times; and the shifts S_i ~ 𝒰(ℤ_L), the uniform distribution on ℤ_L.
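As a concrete illustration, the following Python sketch simulates the observation model (1); the signal, the parameters L, N, p, and the helper name observe are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N, p = 8, 5, 0.3                  # illustrative signal length, sample size, flip probability
x = rng.integers(0, 2, size=L)       # unknown Boolean signal in Z_2^L

def observe(x, N, p, rng):
    """Generate N observations Y_i = R^{S_i} x XOR Z_i, as in (1)."""
    L = len(x)
    S = rng.integers(0, L, size=N)            # shifts S_i ~ Uniform(Z_L)
    Z = (rng.random((N, L)) < p).astype(int)  # errors Z_i ~ Ber(p)^L
    Y = np.array([np.roll(x, int(s)) for s in S]) ^ Z
    return Y

print(observe(x, N, p, rng))
```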
The motivation to study this problem comes from the classical multireference alignment problem, where the signal and observations are real valued vectors, and the error is Gaussian white noise. Several algorithms were recently proposed to solve the problem, including angular synchronization [1], semidefinite program relaxations of the maximum likelihood decoder [2] and reconstruction using the bispectrum [3]. This problem is also an instance of a larger class of problems, called Non-Unique Games, which also includes the orientation estimation problem in cryo-electron microscopy [4].
Despite these advancements in algorithmic development, not much progress has been made in understanding the fundamental limits of signal recovery. The recent paper [5] investigated fundamental limits of shift recovery in multireference alignment, but not those of signal recovery. We note that estimating the shifts is impossible at low signal-to-noise ratio (SNR) even if an oracle presents us with the true signal. Also, the goal of many applications is signal recovery rather than shift estimation. Our paper aims to fill the gap on signal recovery, by studying the Boolean case.
We focus on the low SNR regime, since pairwise alignment performs well in the high SNR regime, while in applications, such as cryo-electron microscopy, the low regime is predominant. We show here that signal recovery is possible at arbitrarily low SNR, if sufficiently many measurements are available, and quantify this tradeoff. We do not consider here the problem of determining the sample complexity of multireference alignment in the real-valued Gaussian noise case, which is a topic of ongoing research [6], [7].
In BMA the search space is finite, and the maximum a posteriori (MAP) decoder minimizes the probability of error. Our main contribution is an expression for the error exponent of MAP in the low SNR regime, given in Theorem III.2 and Corollary III.3. Our results characterize how many measurements are needed, as a function of the SNR, in order to accurately estimate the signal.
The expression depends on the autocorrelations of the signal, defined in (6). Our results connect the order of autocorrelations needed to reconstruct the signal to the number of measurements needed to estimate the signal. This has some connections with previous theoretical work on uniqueness of the bispectrum [8].
We also consider some generalizations of the original problem in order to model some aspects of multireference alignment that arise in applications, such as the introduction of deletions.
II. BMA Problem
In the BMA problem, the errors are i.i.d. Bernoulli of parameter p. If p = 1/2, then the observations are uniformly distributed on ℤ_2^L, regardless of the original signal, and signal recovery is impossible. This corresponds to the case SNR = 0. On the other hand, p = 0 or p = 1 corresponds to the noiseless case. Thus we define
(2)  SNR := (1 − 2p)^2.
In contrast to proposing an algorithm to solve the BMA problem, our paper focuses on its sample complexity in the low SNR regime, when p → 1/2 and SNR → 0.
Note that the observations Yi, i ∈ [N], given the signal x, are i.i.d., since both the shifts Si and the errors Zi are i.i.d. For that reason we will drop the index i when it is more convenient. We rewrite (1), denoting by x(j) the j-th entry of x.
(3)  Y(j) = x(S + j) ⊕ Z(j),   j ∈ ℤ_L,
where ‘+’ is addition mod L.
Our paper also considers the sample complexity of the following variations of the basic BMA problem:
- BMA Problem with consecutive deletions: In this case the measurements Y_1, …, Y_N are in ℤ_2^K, with K ≤ L, and
(4)  Y_i(j) = x(S_i + j) ⊕ Z_i(j),   j ∈ [K].
When K = L we obtain the original BMA problem.
- BMA Problem with known deletions: Let V ⊂ ℤ_L be an ordered set of non-deletions, i.e., the set of deletions is ℤ_L \ V. Now the measurements Y_1, …, Y_N are in ℤ_2^K, with K = |V|, and
(5)  Y_i(j) = x(S_i + V_j) ⊕ Z_i(j),   j ∈ [K],
where V_j denotes the j-th element of V. When V = [K] we recover the BMA problem with consecutive deletions.
- BMA Problem (and variations) with non-uniform rotations: Similar to the previous problems, but now the shifts follow some distribution ξ on ℤ_L.
These variations are motivated by problems similar to multireference alignment. The case of possible deletions is intended to model instances where the observations are only partial, whereas the extension to non-uniform shifts attempts to represent a non-symmetric version of the problem.
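A minimal sketch of the generalized observation model (5), with known deletions and a non-uniform shift distribution ξ; the particular V, ξ, and parameters below are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(1)
L, p = 8, 0.25
x = rng.integers(0, 2, size=L)
V = [0, 1, 3, 6]                                              # ordered set of non-deleted indices, K = |V|
xi = np.array([0.4, 0.2, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05])   # shift distribution on Z_L

def observe_general(x, V, xi, N, p, rng):
    """Y_i(j) = x(S_i + V_j) XOR Z_i(j), with S_i ~ xi and i.i.d. Ber(p) errors, as in (5)."""
    L, K = len(x), len(V)
    S = rng.choice(L, size=N, p=xi)
    Z = (rng.random((N, K)) < p).astype(int)
    Y = np.array([[x[(int(s) + v) % L] for v in V] for s in S]) ^ Z
    return Y

print(observe_general(x, V, xi, N=4, p=p, rng=rng))
```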
III. Results
We start by introducing the following notion of autocorrelation of a signal that is central to our main results.
Definition III.1
The (ξ, k)-autocorrelation of x, with respect to a distribution ξ on ℤ_L and k ∈ ℤ_L^d, is defined as
(6)  A_k^ξ(x) := E_{S∼ξ} [ ∏_{j=1}^{d} x(S + k_j) ] = Σ_{s ∈ ℤ_L} ξ(s) ∏_{j=1}^{d} x(s + k_j).
We refer to d = |k| as the order of the autocorrelation. When ξ ~ 𝒰(ℤ_L), we simply write k-autocorrelation and A_k. Notice that A_k is shift invariant, that is, A_k(x) = A_k(R^s x), and in this case we may assume k_1 = 0.
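The autocorrelation (6) can be computed by direct summation. A small sketch (with illustrative values) that also checks the shift invariance A_k(x) = A_k(R^s x) under uniform ξ:

```python
import numpy as np

def autocorrelation(x, k, xi=None):
    """(xi, k)-autocorrelation of x, cf. (6): sum_s xi(s) prod_j x(s + k_j)."""
    L = len(x)
    if xi is None:                                # uniform shift distribution
        xi = np.full(L, 1.0 / L)
    return sum(xi[s] * np.prod([x[(s + kj) % L] for kj in k]) for s in range(L))

x = np.array([1, 0, 1, 1, 0, 0, 1, 0])
L = len(x)
k = (0, 2, 5)                                     # an order-3 index vector

# shift invariance under the uniform distribution
for s in range(L):
    assert np.isclose(autocorrelation(x, k), autocorrelation(np.roll(x, s), k))

# all order-2 autocorrelations A_(0, m)
print([round(float(autocorrelation(x, (0, m))), 3) for m in range(L)])
```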
We define the minimum autocorrelation order necessary to distinguish x1 and x2 under ξ and V as
(7)  t_{ξ,V}(x_1, x_2) := min{ d ≥ 1 : A_k^ξ(x_1) ≠ A_k^ξ(x_2) for some k ∈ V^d },
where V^d denotes the vectors in ℤ_L^d with entries in V. The minimum autocorrelation order necessary to describe all signals in 𝒳 is defined as
(8)  t_{ξ,V}(𝒳) := max_{x_1, x_2 ∈ 𝒳, x_1 ≠ x_2} t_{ξ,V}(x_1, x_2).
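Definition (7) can be evaluated by brute force for toy sizes: search, for increasing order d, over all k ∈ V^d for an autocorrelation that separates x_1 from x_2. A sketch (exponential in d, so only for small instances; all names are ours):

```python
import numpy as np
from itertools import product

def autocorrelation(x, k, xi):
    L = len(x)
    return sum(xi[s] * np.prod([x[(s + kj) % L] for kj in k]) for s in range(L))

def min_separating_order(x1, x2, V, xi, d_max=4):
    """Smallest d with A_k(x1) != A_k(x2) for some k in V^d, cf. (7)."""
    for d in range(1, d_max + 1):
        for k in product(V, repeat=d):
            if not np.isclose(autocorrelation(x1, k, xi), autocorrelation(x2, k, xi)):
                return d
    return None       # not separated by any autocorrelation of order <= d_max

L = 6
xi = np.full(L, 1.0 / L)                  # uniform shifts
V = range(L)                              # no deletions
x1 = np.array([1, 1, 0, 1, 0, 0])
x2 = np.array([1, 1, 0, 0, 1, 0])         # not a cyclic shift of x1, same order-1,2 autocorrelations
print(min_separating_order(x1, x2, V, xi))   # prints 3 for this pair
```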
Given a prior distribution on the signals PX, with support 𝒳, denote by X the random variable with distribution PX. Given an algorithm for BMA the probability of error is defined as
(9)  P_e := ℙ(X̂ ≠ X),
where X̂ is the answer given by the algorithm. In the BMA problem the search space is finite, thus the MAP estimator minimizes the probability of error (9). We obtain results that do not depend on the prior distribution; they depend only on its support.
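Since the search space is finite, the MAP rule can be evaluated exactly on toy instances: compute the likelihood μ_x(y; p) by summing over the shift, and maximize the posterior over a candidate set. The sketch below is an illustration only; the sizes, noise level, and helper names are our own choices.

```python
import numpy as np
from itertools import product

def likelihood(y, x, p, xi):
    """mu_x(y; p) = sum_s xi(s) p^{w(y XOR R^s x)} (1 - p)^{L - w(...)}."""
    L = len(x)
    total = 0.0
    for s in range(L):
        w = int(np.sum(y ^ np.roll(x, s)))        # Hamming weight of the mismatch
        total += xi[s] * p**w * (1 - p)**(L - w)
    return total

def map_decode(Y, candidates, prior, p, xi):
    """argmax over x of P_X(x) * prod_i mu_x(Y_i; p), computed with log-likelihoods."""
    scores = [np.log(pr) + sum(np.log(likelihood(y, x, p, xi)) for y in Y)
              for x, pr in zip(candidates, prior)]
    return candidates[int(np.argmax(scores))]

rng = np.random.default_rng(2)
L, N, p = 6, 300, 0.1
xi = np.full(L, 1.0 / L)
candidates = [np.array(c) for c in product((0, 1), repeat=L)]     # all of Z_2^L
prior = np.full(len(candidates), 1.0 / len(candidates))           # uniform prior

x_true = np.array([1, 1, 0, 1, 0, 0])
Y = [np.roll(x_true, int(s)) ^ (rng.random(L) < p).astype(int)
     for s in rng.integers(0, L, size=N)]
x_hat = map_decode(Y, candidates, prior, p, xi)
print(x_true, x_hat)    # with uniform shifts, the signal is recovered only up to a cyclic shift
```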
Theorem III.2
Consider the BMA problem with known deletions ℤ_L \ V and shift distribution ξ. Let 𝒳 ⊂ ℤ_2^L be the support of the prior distribution of the signals and μ_x the conditional distribution on ℤ_2^K of the observations Y given the signal x, where K = |V|. The probability of error of the MAP estimator, denoted by P_e, has the following asymptotic behavior
(10) |
with
(11) |
and t = tξ,V (x1, x2).
The theorem implies that the exponent on SNR is t_{ξ,V}(𝒳). In the original problem, with uniform shifts and no deletions, the recovery of the original signal is possible only up to a shift, i.e., we can only recover R^k x, where x is the original signal and k is some shift in ℤ_L. For that reason, we consider 𝒳 to have exactly one element of each class of all the shifts of a signal, i.e., there are no two elements in 𝒳 where one is a shift of the other (for example, if L is prime, then there are (2^L − 2)/L + 2 such elements).
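One concrete way to build such a set 𝒳 is to keep, for every signal, the lexicographically smallest of its cyclic rotations; for prime L the resulting number of classes is (2^L − 2)/L + 2. A small sketch (the choice of representative is arbitrary):

```python
from itertools import product

def shift_class_representative(x):
    """Lexicographically smallest cyclic rotation of the tuple x."""
    L = len(x)
    return min(tuple(x[(j + s) % L] for j in range(L)) for s in range(L))

def representatives(L):
    return sorted({shift_class_representative(x) for x in product((0, 1), repeat=L)})

for L in (5, 7, 8):
    print(L, len(representatives(L)))
# for prime L the count is (2**L - 2) // L + 2: 8 classes for L = 5, 20 classes for L = 7
```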
Corollary III.3
Consider the original problem, with V = [L], ξ ~ 𝒰(ℤL) and 𝒳 as defined above. By inspection one can obtain the error exponent for L ≤ 5. For L ≥ 6, we either have
(12) |
Also, the first case occurs when L is prime, and the second when L ≥ 12 is even. The other values of L remain open.
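For small L, the relevant order can be probed numerically by computing, for every pair of shift-class representatives, the smallest autocorrelation order that separates them, and taking the maximum over pairs. A brute-force sketch (feasible only for very small L; all helper names are ours):

```python
from math import prod
from itertools import product, combinations

def count_autocorr(x, k):
    """L times the k-autocorrelation of a 0/1 tuple (an integer, so comparisons are exact)."""
    L = len(x)
    return sum(prod(x[(s + kj) % L] for kj in k) for s in range(L))

def representative(x):
    L = len(x)
    return min(tuple(x[(j + s) % L] for j in range(L)) for s in range(L))

def max_separating_order(L, d_max=5):
    """Max over pairs of shift classes of the minimal autocorrelation order separating them."""
    reps = sorted({representative(x) for x in product((0, 1), repeat=L)})
    worst = 1
    for x1, x2 in combinations(reps, 2):
        for d in range(1, d_max + 1):
            if any(count_autocorr(x1, k) != count_autocorr(x2, k)
                   for k in product(range(L), repeat=d)):
                worst = max(worst, d)
                break
        else:
            return None   # some pair is not separated up to order d_max
    return worst

for L in (5, 6, 7):
    print(L, max_separating_order(L))
```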
IV. Proof Techniques
Proof of Theorem III.2
The proof consists of two main parts: the next theorem gives a formula for the error exponent, and Claim IV.2 makes the connection with autocorrelations.
Theorem IV.1
Consider the BMA problem with known deletions ℤ_L \ V and shift distribution ξ. Let 𝒳 ⊂ ℤ_2^L be the space of possible signals and μ_x := P_{Y|X}(·|x) the conditional distribution on ℤ_2^K of the observations given the signal x. The probability of error of the MAP estimator (P_e) has the following asymptotic behavior
(13) |
with
(14) |
where μ_x^{(m)}(y; p) denotes the m-th derivative of μ_x(y; p) in p, i.e., the derivative of the conditional distribution in y given x with respect to the Bernoulli parameter p, and s(x_1, x_2) is the smallest order m for which these derivatives, evaluated at p = 1/2, differ for x_1 and x_2.
This theorem follows from Theorems 1 and 2 in [9]. Theorem 1 is a corollary of Sanov's theorem, and expression (13) is in fact the Chernoff information between the distributions μ_{x_1} and μ_{x_2} [10]. In Theorem 2 of [9] we Taylor expand the Chernoff information (13) and obtain (14).
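For reference, the Chernoff information between the two conditional distributions is, in our notation, the standard quantity [10]

C(μ_{x_1}, μ_{x_2}) = − min_{0 ≤ λ ≤ 1} log Σ_y μ_{x_1}(y)^λ μ_{x_2}(y)^{1−λ},

so the exponent is governed by the pair of signals in 𝒳 that is hardest to distinguish, and Taylor expanding this quantity around p = 1/2 yields (14).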
Claim IV.2
If
(15) |
then the following expressions are equal:
(16) |
and
(17) |
In fact, since the expressions (16) and (17) are both sums of squares, the claim implies that t_{ξ,V}(x_1, x_2) = s(x_1, x_2), which concludes the proof of Theorem III.2.
Proof of Claim IV.2
The claim is proved by induction on n. Note that condition (15) is equivalent to (16) vanishing for m < n, which implies (17) also vanishes by applying the claim with m. In general (17) is a function of (16) and lower order terms, which vanish when we enforce condition (15). The rigorous proof follows.
Denote by x(V) the vector in ℤ_2^K that consists of the values of x with indices in V, i.e., the j-th element of x(V) is x(V_j). Also, given s ∈ ℤ_L, denote by s + V the ordered set obtained by adding s to each element of V mod L. Equation (5) can then be rewritten as
(18)  Y = x(S + V) ⊕ Z.
Then, since the entries of Z are i.i.d. Ber(p), we have
P(Y = y | S = s, X = x) = p^{w(y ⊕ x(s+V))} (1 − p)^{K − w(y ⊕ x(s+V))},
where w denotes the Hamming weight, and since S ~ ξ,
(19)  μ_x(y; p) = Σ_{s ∈ ℤ_L} ξ(s) p^{w(y ⊕ x(s+V))} (1 − p)^{K − w(y ⊕ x(s+V))}.
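As a quick sanity check (an illustration only, with arbitrary parameters), one can verify numerically that (19) defines a probability distribution on ℤ_2^K:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(3)
L, p = 7, 0.2
x = rng.integers(0, 2, size=L)
V = [0, 2, 3, 5]                              # non-deleted indices, K = |V|
K = len(V)
xi = rng.dirichlet(np.ones(L))                # an arbitrary shift distribution on Z_L

def mu(y, x, V, xi, p):
    """The mixture likelihood (19)."""
    total = 0.0
    for s in range(len(x)):
        w = sum(int(y[j]) ^ int(x[(s + V[j]) % len(x)]) for j in range(len(V)))
        total += xi[s] * p**w * (1 - p)**(len(V) - w)
    return total

total_mass = sum(mu(y, x, V, xi, p) for y in product((0, 1), repeat=K))
print(total_mass)     # equals 1 up to floating point error
```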
In the statement of the theorem we have x ∈ ℤ_2^L; however, it is convenient for the proof to consider the entries of x to be in {−1, 1}, changed by the rule a ↦ 1 − 2a. We will call
(20)  u := 1 − 2x
the corresponding element of x with ±1 values, where Σ2 := {−1, 1}, and v := 1 − 2y. In analogy to the Hamming weight, we define
(21) |
With this we rewrite (19)
(22) |
where μu(v; p) := μx(y; p). For simplicity of notation denote
By properties of Jacobi polynomials [11] we have
where Pm is a polynomial with the following property
(23) |
where Qm has degree at most m − 1, and Q0 ≡ Q1 ≡ 0. Thus
(24) |
Then when m = 1
Now, by the induction hypothesis if for all k ≤ n − 1,
for all since Qn has degree at most n − 1. Thus by (23) and (24)
(25) |
Now splitting the square of the sum on the RHS into a product of two sums and expanding, we obtain terms of the form
(26) |
where α and β are 1 or 2. By Lemma IV.3 we get
(27) |
where u_{a_{ij}} is u_α(s_1 + V) if a_{ij} ≤ n, and is u_β(s_2 + V) otherwise. So, since |a_i| is even, as A is an even partition, and the entries of u_{a_{ij}} are ±1,
if |ai ∩ [n]| is odd, and it is K otherwise. Then
where R_n is a polynomial of degree n (with coefficients possibly depending on K and n), and R_1(b) = 2^K b. It cannot have degree n + 1, since |A| ≤ n, because A is an even partition of [2n]. For a term of order n, we need |A| = n, so |a_i| = 2 for i = 1, …, n, and thus C_A = 1 by the Lemma. Also, |a_i ∩ [n]| must be odd for all i, thus |a_i ∩ [n]| = 1. There are exactly n! partitions with this property, so the leading coefficient of R_n is 2^K n!. We also have
(28) |
The equation holds for n = 1, since R_1(b) = 2^K b. As in (25), the induction hypothesis implies that the lower-order terms in R_n cancel and only the leading coefficient is of interest. We get
(29) |
Now through some algebraic manipulation, and using again the argument of the leading coefficient, if |k| = n, then
(30) |
Lemma IV.3
For any partition A = {a1, …, a|A|} of the set {1, 2, …, m}, denote by aij the j-th entry of ai and M[m] the set of all such partitions. If
(31) |
where A is even if all |ai| are even for i ∈ {1, …, |A|}. Moreover, CA is a constant that depends only on the partition A and is always 1 if |ai| = 2 for all i ∈ {1, …, |A|}.
Proof
Recall (21). We have
(32) |
The last sum is 2^K when A is even, and 0 otherwise. Using a combinatorial argument we can rewrite (32) without the 'all-distinct' condition, at the cost of a constant C_A, which is 1 when |a_i| = 2 for i ∈ {1, …, |A|}. We get
Proof of Corollary III.3
We first prove equation (12). Recall (6), and denote by
B_m(x_1, x_2) := Σ_{k ∈ ℤ_L^m} (A_k(x_1) − A_k(x_2))^2
and
B_m(L) := min_{x_1 ≠ x_2 ∈ 𝒳} B_m(x_1, x_2).
Note that B_m(x_1, x_2) = 0 if m < t_{ξ,V}(x_1, x_2), by (7). For convenience let B(x_1, x_2) := B_{t_{ξ,V}(x_1,x_2)}(x_1, x_2) and B(L) := B_{t_{ξ,V}(𝒳)}(L). Using this notation we rewrite (10) and (11)
Now equation (12) is equivalent to having tξ,V(𝒳) ≥ 3 and B3(L) either or 0. Turns out, for L ≥ 6, if we take
then and . Also we cannot have . This implies there exist x_1 and x_2 in 𝒳 such that . Since it is positive, there is k* such that A_{k*}(x_1) ≠ A_{k*}(x_2). But by definition (6), since ξ ~ 𝒰(ℤ_L), L·A_{k*}(x) is an integer for x ∈ ℤ_2^L, and L^2 (A_{k*}(x_1) − A_{k*}(x_2))^2 ∈ ℤ.
Now by the definition we also have A_{σ(k*)}(x) = A_{k*}(x), where σ permutes the entries of k*. Also, for s ∈ ℤ_L, let s + k* denote the entrywise sum mod L; then A_{s+k*}(x) = A_{k*}(x). There are 6 permutations and L possible values for s ∈ ℤ_L, so B(x_1, x_2) is an integer multiple of . (We can also have non-trivial s and σ such that s + k* = σ(k*), but that case also has the property mentioned.) However we cannot have . That means there exists only one k* (up to permutations and shifts) such that A_{k*}(x_1) ≠ A_{k*}(x_2). Then
(33) |
On the other hand
where A_0 denotes the k-autocorrelation with k = 0. Since t_L > 1, A_0(x_1) = A_0(x_2), so equation (33) must be 0, and equation (12) follows by contradiction. Now, if L ≥ 12 is even, choose
and x_2 the vector obtained by reversing the entries of x_1. Since one is the reverse of the other, they have the same autocorrelations of orders 1 and 2. Recall (20) and (6) and notice that in this case both A_k(u_1) and A_k(u_2) are 0 when |k| is odd, since half of the signal is the symmetric of the other half, i.e., u(j + L/2) = −u(j). Now, because of (30), we have A_k(x_1) = A_k(x_2) when |k| = 3, so t_L ≥ 4, and B_3(L) = 0.
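This construction can be checked numerically: take any x_1 ∈ ℤ_2^L with x_1(j + L/2) = 1 − x_1(j) whose reversal is not one of its cyclic shifts, and verify that no autocorrelation of order up to 3 separates the two. The sketch below uses L = 12 and an x_1 of our own choosing with this property (not necessarily the vector used in the paper):

```python
from math import prod
from itertools import product

L = 12
h = (1, 1, 0, 1, 1, 0)                        # an arbitrary half with the required property
x1 = h + tuple(1 - a for a in h)              # x1(j + L/2) = 1 - x1(j)
x2 = tuple(reversed(x1))                      # the reversed signal

def count_autocorr(x, k):
    return sum(prod(x[(s + kj) % L] for kj in k) for s in range(L))

# x1 and x2 lie in different shift classes ...
assert all(tuple(x1[(j + s) % L] for j in range(L)) != x2 for s in range(L))
# ... yet no autocorrelation of order at most 3 distinguishes them
for d in (1, 2, 3):
    assert all(count_autocorr(x1, k) == count_autocorr(x2, k)
               for k in product(range(L), repeat=d))
print("orders 1-3 coincide for", x1, "and its reversal")
```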
Finally, let L ≥ 6 be prime. We prove by contradiction that t_L = 3 and . If this is not true, then there exist x_1 and x_2 in 𝒳 such that A_k(x_1) = A_k(x_2) for all |k| ≤ 3, so
(34) |
By Theorem 2 of [8], if the Fourier coefficients of x_1 and x_2 are non-zero, then equation (34) implies one is a shift of the other. Denote by x̂_1 and x̂_2 the Fourier coefficients of x_1 and x_2, respectively, which are given by
(35)  x̂_1(j) = Σ_{n ∈ ℤ_L} x_1(n) ω_L^{jn},
(36)  x̂_2(j) = Σ_{n ∈ ℤ_L} x_2(n) ω_L^{jn},
where ω_L is the L-th root of unity. Since x_1 is not a shift of x_2, equation (34) then implies that x̂_1 or x̂_2 has zeros, and x̂_i(j) is 0 only if ω_L^j is a root of the polynomial
(37)  x_i(0) + x_i(1) z + ⋯ + x_i(L−1) z^{L−1}.
However, since L is prime, the minimal polynomial of ω_L^j in ℚ[z], for L > j > 0, is 1 + z + ⋯ + z^{L−1} [12], so this polynomial must divide (37). Thus x_1 and x_2 must be the all-zeros and all-ones signals, but these signals also do not satisfy (34).
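The cyclotomic step can be spot-checked numerically for small primes: apart from the all-zeros and all-ones signals, no Boolean signal of prime length has a vanishing Fourier coefficient. A small sketch (an illustration, not part of the proof):

```python
import numpy as np
from itertools import product

# For prime L, a Boolean signal with a vanishing Fourier coefficient
# must be the all-zeros or the all-ones signal.
for L in (5, 7):
    for x in product((0, 1), repeat=L):
        xhat = np.fft.fft(np.array(x, dtype=float))
        has_zero = bool(np.any(np.isclose(np.abs(xhat), 0.0, atol=1e-9)))
        is_constant = sum(x) in (0, L)
        assert has_zero == is_constant
    print(f"L = {L}: only the constant signals have a vanishing Fourier coefficient")
```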
Acknowledgments
A. S. and J. P. were partially supported by Award Number R01GM090200 from the NIGMS, FA9550-12-1-0317 from AFOSR, Simons Foundation Investigator Award and Simons Collaborations on Algorithms and Geometry, and the Moore Foundation Data-Driven Discovery Investigator Award.
E. A. is partially supported by NSF CAREER Award CCF-1552131 and ARO grant W911NF-16-1-0051.
Contributor Information
Emmanuel Abbe, Princeton University.
João M. Pereira, Princeton University
Amit Singer, Princeton University.
References
1. Singer A. Angular synchronization by eigenvectors and semidefinite programming. Applied and Computational Harmonic Analysis. 2011;30(1):20–36. doi: 10.1016/j.acha.2010.02.001.
2. Bandeira AS, Charikar M, Singer A, Zhu A. Multireference alignment using semidefinite programming. Proceedings of the 5th Conference on Innovations in Theoretical Computer Science; ACM; 2014. pp. 459–470.
3. Sadler BM, Giannakis GB. Shift- and rotation-invariant object reconstruction using the bispectrum. JOSA A. 1992;9(1):57–69.
4. Bandeira AS, Chen Y, Singer A. Non-unique games over compact groups and orientation estimation in cryo-EM. 2015. arXiv preprint arXiv:1505.03840. doi: 10.1088/1361-6420/ab7d2c.
5. Aguerrebere C, Delbracio M, Bartesaghi A, Sapiro G. Fundamental limits in multi-image alignment. IEEE Transactions on Signal Processing. 2016;64(21):5707–5722.
6. Bandeira AS, Rigollet P, Weed J. Optimal rates of estimation for multi-reference alignment. In preparation.
7. Perry A, Weed J, Bandeira AS, Rigollet P, Singer A. The sample complexity of multi-reference alignment. In preparation.
8. Kakarala R. The bispectrum as a source of phase-sensitive invariants for Fourier descriptors: a group-theoretic approach. Journal of Mathematical Imaging and Vision. 2012;44(3):341–353.
9. Abbe E, Pereira JM, Singer A. Very noisy Sanov's theorem and applications to alignment problems. 2017. Preprint.
10. Cover TM, Thomas JA. Elements of Information Theory. John Wiley & Sons; 2006.
11. Szegő G. Orthogonal Polynomials. Vol. 23. American Mathematical Society; 1975.
12. Jacobson N. Lectures in Abstract Algebra III: Theory of Fields and Galois Theory. Vol. 32. Springer-Verlag; 1964.