Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content
NIHPA Author Manuscripts logoLink to NIHPA Author Manuscripts
. Author manuscript; available in PMC: 2018 May 11.
Published in final edited form as: Proc IEEE Int Symp Info Theory. 2017 Aug 15;2017:1316–1320. doi: 10.1109/ISIT.2017.8006742

Sample Complexity of the Boolean Multireference Alignment Problem

Emmanuel Abbe 1, João M Pereira 2, Amit Singer 3
PMCID: PMC5947970  NIHMSID: NIHMS912110  PMID: 29755834

Abstract

The Boolean multireference alignment problem consists in recovering a Boolean signal from multiple shifted and noisy observations. In this paper we obtain an expression for the error exponent of the maximum A posteriori decoder. This expression is used to characterize the number of measurements needed for signal recovery in the low SNR regime, in terms of higher order autocorrelations of the signal. The characterization is explicit for various signal dimensions, such as prime and even dimensions.

I. Introduction

The Boolean multireference alignment (BMA) problem consists of estimating an unknown signal x2L, from noisy cyclically shifted copies Y1,,YN2L, i.e.,

Yi=RSixZi,i{1,,N}, (1)

where the error Zi ~ Ber(p)L, the product measure of L Bernoulli variables with parameter p, ⊕ denotes addition mod 2, R is the index cyclic shift operator that shifts a vector one element to the right (x1, …, xN) ↦ (xN, x1, …, xN−1), RSi corresponds to applying Si times the operator R and the shifts Si ~ 𝒰(ℤL), the uniform distribution in ℤL.

The motivation to study this problem comes from the classical multireference alignment problem, where the signal and observations are real valued vectors, and the error is Gaussian white noise. Several algorithms were recently proposed to solve the problem, including angular synchronization [1], semidefinite program relaxations of the maximum likelihood decoder [2] and reconstruction using the bispectrum [3]. This problem is also an instance of a larger class of problems, called Non-Unique Games, which also includes the orientation estimation problem in cryo-electron microscopy [4].

Despite these advancements in algorithmic development, not much progress has been made in understanding the fundamental limits of signal recovery. The recent paper [5] investigated fundamental limits of shift recovery in multireference alignment, but not those of signal recovery. We note that estimating the shifts is impossible at low signal-to-noise ratio (SNR) even if an oracle presents us with the true signal. Also, the goal of many applications is signal recovery rather than shift estimation. Our paper aims to fill the gap on signal recovery, by studying the Boolean case.

We focus on the low SNR regime, since pairwise alignment performs well in the high SNR regime, while in applications, such as cryo-electron microscopy, the low regime is predominant. We show here that signal recovery is possible at arbitrarily low SNR, if sufficiently many measurements are available, and quantify this tradeoff. We do not consider here the problem of determining the sample complexity of multireference alignment in the real-valued Gaussian noise case, which is a topic of ongoing research [6], [7].

In BMA the search space is finite, and the maximum A posteriori decoder (MAP) minimizes the probability of error. Our main contribution is an expression for the error exponent of MAP, in the low SNR regime, given in Theorems III.2 and III.3. Our results imply how many measurements are needed, as a function of the SNR, in order to accurately estimate the signal.

The expression depends on the autocorrelations of the signal, defined in (6). Our results connect the order of autocorrelations needed to reconstruct the signal to the number of measurements needed to estimate the signal. This has some connections with previous theoretical work on uniqueness of the bispectrum [8].

We also consider some generalizations of the original problem in order to model some aspects of multireference alignment that arise in applications, such as the introduction of deletions.

II. BMA Problem

In the BMA problem, the errors are i.i.d. Bernoulli of parameter p. If p=12, then the observations Yi~Ber(12)L, regardless of the original signal, and signal recovery is impossible. This corresponds to the case when SNR = 0. On the other hand, p = 0 or 1 corresponds to the noiseless case. Thus we define

SNR:=(p-12)2. (2)

In contrast to proposing an algorithm to solve the BMA problem, our paper focuses on its sample complexity, in the low SNR regime, when p12 and SNR → 0.

Note that the observations Yi, i ∈ [N], given the signal x, are i.i.d., since both the shifts Si and the errors Zi are i.i.d. For that reason we will drop the index i when it is more convenient. We rewrite (1), denoting by x(j) the j-th entry of x.

Y(j)=x(S+j)Z(j),jL, (3)

where ‘+’ is addition mod L.

Our paper also considers the sample complexity of the following variations of the basic BMA problem:

  • BMA Problem with consecutive deletions: In this case the measurements Y1, …, YN are in 2K, with KL, and
    Y(j)=x(S+j)Z(j),jK. (4)

    When K = L we obtain the original BMA problem.

  • BMA Problem with known deletions: Let V ⊂ ℤL be an ordered set of non-deletions, i.e. the set of deletions is ℤL\V. Now the measurements Y1, …, YN are in 2K, with K = |V|, and:
    Y(j)=x(S+Vj)Z(j),jK, (5)

    where Vj denotes the j-th element of V. When V = [K] we recover the BMA problem with consecutive deletions.

  • BMA Problem (and variations) with non uniform rotations: Similar to the previous problems, but now the shifts follow some distribution ξ in ℤL.

These variations are motivated by problems similar to multireference alignment. The case of possible deletions is intended to model instances where the observations are only partial, whereas the extension to non-uniform shifts attempts to represent a non-symmetric version of the problem.

III. Results

We start by introducing the following notion of autocorrelation of a signal that is central to our main results.

Definition III.1

The (ξ, k)-autocorrelation of x, with respect to a distribution ξ inL and k=(k1,k2,,kd)Ld is defined as

Aξ,k(x):=s=1Lξ(s)x(k1+s)x(kd+s). (6)

We refer to d = |k| as the order of the auto-correlation. When ξ ~ 𝒰(ℤL), we simply write k-autocorrelation and Ak. Notice Ak is shift invariant, that is Ak(x) = Ak(Rsx), and in this case we may assume k1 = 0.

We define the minimum autocorrelation order necessary to distinguish x1 and x2 under ξ and V as

tξ,V(x1,x2):=inf{x:Aξ,k(x1)Aξ,k(x2),kVd}, (7)

where Vd denotes the vectors in Z2d with entries in V. The minimum autocorrelation order necessary to describe all signals in 𝒳 is defined as

tξ,V(X):=maxx1,x2Xx1x2tξ,V(x1,x2). (8)

Given a prior distribution on the signals PX, with support 𝒳, denote by X the random variable with distribution PX. Given an algorithm for BMA the probability of error is defined as

P(X^X)=xiXP(X^xi)PX(xi), (9)

where is the answer given by the algorithm. In the BMA problem the search space is finite, thus MAP minimizes the probability of error (9). We obtain results that do not depend on the prior distribution, they depend only on its support.

Theorem III.2

Consider the BMA problem with known deletions ZL\V and shift distribution ξ. Let XZ2L be the support of the prior distribution of the signals and μx the conditional distribution in 2K of the observations Y given the signal x, where K = |V|. The probability of error of the MAP estimator, denoted by Pe, has the following asymptotic behavior

limN1NlogPe=minx1,x2Xx1x2C(μx1,μx2), (10)

with

C(μx1,μx2)=24t-3t!SNRtkVt(Aξ,k(x1)-Aξ,k(x2))2+O(SNRt+1), (11)

and t = tξ,V (x1, x2).

The theorem implies that the exponent on SNR is tξ,V(𝒳). In the original problem, with uniform shifts and no deletions, the recovery of the original signal is possible only up to a shift, i.e. we can only recover Rkx, where x is the original signal, and k is some shift in ℤL. For that reason, we consider 𝒳 to have exactly one element of each class of all the shifts of a signal, i.e., there are no two elements in 𝒳 where one is a shift of the other (for example, if L is prime, then there are 2L − 2 such elements).

Corollary III.3

Consider the original problem, with V = [L], ξ ~ 𝒰(ℤL) and 𝒳 as defined above. By inspection one can obtain the error exponent for L ≤ 5. For L ≥ 6, we either have

limN1NlogPe={210LSNR3+O(SNR4)O(SNR4) (12)

Also, the first case occurs when L is prime, and the second when L ≥ 12 and is even. The other values of L remain open.

IV. Proof Techniques

Proof of Theorem III.2

The proof consists of two main parts. The next theorem gives a formula to the error exponent and claim IV.2 makes the connection with autocorrelations.

Theorem IV.1

Consider the BMA problem with known deletions ZL\V and shift distribution ξ. Let XZ2L be the space of possible signals and μx := PY|X|x) the conditional distribution in 2K of the observations given the signal x. The probability of error of the MAP estimator (Pe) has the following asymptotic behavior

limN1NlogPe=minx1x2XC(μx1,μx2), (13)

with

C(μx1,μx2)=(12-p)2s8(s!)2y2K(μx1(s)(y;12)-μx2(s)(y;12))2μx1(y;12)+O(12-p)2s+2, (14)

where μx(m)(y;p) denotes the m-th derivative of μx(y; p) in p, i.e. the derivative of the conditional distribution in y given x in order of the Bernoulli parameter p, and

s(x1,x2):=inf{m:μx1(m)(y;12)μx2(m)(y;12),y2K}.

This theorem follows from Theorems 1 and 2 in [9]. Theorem 1 is a corollary of Sanov Theorem, and expression (13) is in fact the Chernoff Information between distributions μx1 and μx2 [10]. In Theorem 2 [9] we Taylor expand the Chernoff Information (13) and obtain (14).

Claim IV.2

If

μx1(m)(y;12)=μx2(m)(y;12)m<n,y2K, (15)

then the following expressions are equal:

y2K(μx1(n)(y;12)-μx2(n)(y;12))2μx1(y;12) (16)

and

24nn!kVL(Aξ,k(x1)-Aξ,k(x2))2. (17)

In fact, since the expressions (16) and (17) are both sum of squares, the claim implies that tξ,V (x1, x2) = s(x1, x2), what concludes the proof of theorem III.2.

Proof of Claim IV.2

The claim is proved by induction on n. Note that condition (15) is equivalent to (16) vanishing for m < n, which implies (17) also vanishes by applying the claim with m. In general (17) is a function of (16) and lower order terms, which vanish when we enforce condition (15). The rigorous proof follows.

Denote by x(V) the vector in 2K(K=V) that consists of the values of x with indices in V, i.e. the j-th element of x(V) is x(Vj). Also, given s ∈ ℤL denote by s + V the ordered set corresponding to the sum of each element in V with s mod L. Equation (5) can then be rewritten, as

Y=x(S+V)Z (18)

Then since Z ~ Ber(p)L, we have

μx(y;pS=s)=(1-p)K-w(yx(s+V))pw(yx(s+V)),

where w denotes the Hamming weight, and since S ~ ξ

μx(y;p)=s=1Lξ(s)(1-p)K-w(yx(s+V))pw(yx(s+V)). (19)

In the statement of the theorem we have x2L, however it is convenient for the proof to consider the entries of x to be −1, 1, changed by the rule: a ↦ 1 − 2a. We will call

u:=1-2x2L (20)

the corresponding element of x with ±1 values, where Σ2 := {−1, 1}, and v := 1 − 2y. In analogy to the Hamming weight, we define

W(u):=s=1Lu(s)=L-2w(x). (21)

With this we rewrite (19)

μu(v;p)=s=1Lξ(s)(1-p)K2+W(vu(s+V))2pK2-W(vu(s+V))2, (22)

where μu(v; p) := μx(y; p). For simplicity of notation denote

Wv,u,s:=W(vu(s+V)).

By properties of Jacobi polynomials [11] we have

(pK2-b2(1-p)K2+b2)p=12(m)=(-2)m-KPm(b),

where Pm is a polynomial with the following property

Pm(b)=bm+Qm(b), (23)

where Qm has degree at most m − 1, and Q0Q1 ≡ 0. Thus

μu(m)(v;12)=(-2)m-Ks=1Lξ(s)Pm(Wv,u,s). (24)

Then when m = 1

v2K(μu1(1)(v;12)-μu2(1)(v;12))2μu1(v;12)=22-Kv2K[s=1Lξ(s)(Wv,u1,s-Wv,u2,s)]2.

Now, by the induction hypothesis if μu1(k)(v;12)=μu2(k)(v;12) for all kn − 1, v2K

s=1Lξ(s)Qn(Wv,u1,s)=s=1Lξ(s)Qn(Wv,u2,s),

for all v2K since Qn has degree at most n − 1. Thus by (23) and (24)

v2K(μu1(n)(v;12)-μu2(n)(v;12))2μu1(v;12)=22n-Kv2K[s=1Lξ(s)(Wv,u1,sn-Wv,u2,sn)]2 (25)

Now splitting the square of the sum on the RHS into a product of two sums and expanding, we obtain terms of the form

s1=1Ls2=1Lξ(s1)ξ(s2)(-1)α+βv2KWv,uα,s1nWv,uβ,s2n, (26)

where α and β are 1 or 2. By Lemma IV.3 we get

v2KWv,uα,s1nWv,uβ,s2n=2KAM[2n]AisevenCAi=1A(k=1Kj=1aiuaij(k)), (27)

Where uaij is uα(s1 + V) if aijn, and is uβ(s2 + V) otherwise. So, since |ai| is even, as A is an even partition, and the entries of uaij are ±1,

k=1Kj=1aiuaij(k)=kVuα(s1+k)uβ(s2+k)

if |ai ∩ [n]| is odd, and it is K otherwise. Then

v2KWv,uα,s1nWv,uβ,s2n=Rn(kVuα(s1+k)uβ(s2+k)),

where Rn is a polynomial with degree n (with coefficients possibly depending on K and n), and R1(b) = 2kb. It cannot have degree n + 1 since |A| ≤ n, since it is an even partition of [2n]. For it to be a power of order n, we need |A| = n, so |ai| = 2 for i = 1, …, n, thus CA = 1, by the Lemma. Also |ai ∩ [n]| must be odd for all i, thus |ai ∩ [n]| = 1. There are exactly n! partitions with this property, so the leading coefficient of Rn is 2K n!. We also have

s1=1Ls2=1Lξ(s1)ξ(s2)(kVuα(s1+k)uβ(s2+k))n=s1=1Ls2=1Lξ(s1)ξ(s2)kVni=1nuα(s1+ki)uβ(s2+ki)=kVnAξ,k(uα)Aξ,k(uβ), (28)

The equation will be true for n = 1, since R1(b) = 2kb. As in (25), the induction hypothesis implies the lower order terms in Rn cancel and only the leading coefficient is of interest. We get

v2K[s=1Lξ(s)(Wv,u1,sn-Wv,u2,sn)]2=2kn!kVn(Aξ,k(u1)-Aξ,k(u2))2 (29)

Now through some algebraic manipulation, and using again the argument of the leading coefficient, if |k| = n, then

kVn(Aξ,k(u1)-Aξ,k(u2))2=22nkVn(Aξ,k(x1)-Aξ,k(x2))2 (30)

This together with (25) and (29) concludes the proof.

Lemma IV.3

For any partition A = {a1, …, a|A|} of the set {1, 2, …, m}, denote by aij the j-th entry of ai and M[m] the set of all such partitions. If u1,,um2K

v2KW(u1v)W(umv)=2KAM[m]AisevenCAi=1A(k=1Kj=1aiuaij(k)), (31)

where A is even if all |ai| are even for i ∈ {1, …, |A|}. Moreover, CA is a constant that depends only on the partition A and is always 1 if |ai| = 2 for all i ∈ {1, …, |A|}.

Proof

Recall (21). We have W(uv)=k=1Ku(k)v(k)

v2KW(u1v)W(umv)=k1=1Kkm=1Ku1(k1)um(km)v2Kv(k1)v(km)=AM[m]k1,,kA=1alldistinctKi=1Aj=1aiuaij(ki)v2Ki=1Av(ki)ai, (32)

The last sum is 2K when A is even, and 0 otherwise. Using a combinatorial argument we can rewrite (32) without the ‘all-distinct’ condition, at the cost of a constant CA, which is 1 when |ai| = 2 for i ∈ {1, …, |A|}. We get

2KAM[m]AisevenCAk1,,kA=1Ki=1Aj=1aiuaij(ki)=2KAM[m]AisevenCAi=1A(k=1Kj=1aiuaij(k))

Proof of Corollary III.3

We first prove equation (12). Recall (6), and denote by

Bm(x1,x2):=kLm(Ak(x1)-Ak(x2))2

and

Bm(L):=minx1x2XBm(x1,x2)

Note that Bm(x1, x2) = 0 if m < tξ,V (x1, x2) by (7). For convenience let B(x1, x2) := Btξ,V(x1,x2)(x1, x2) and B(L) := Btξ,V(𝒳)(L). Using this notation we rewrite (10) and (11)

limN1NlogPe=B(L)24tL-3tL!SNRtL+O(SNRtL+1)

Now equation (12) is equivalent to having tξ,V(𝒳) ≥ 3 and B3(L) either 12L or 0. Turns out, for L ≥ 6, if we take

x1=(1,1,0,1,0,,0L-4zeros)andx2=(1,0,1,1,0,,0L-4zeros),

then tξ,V(X)tξ,V(x1,x2)=3 and B3(L)B(x1,x2)=12L. Also we cannot have 12L>B3(L)>0. This implies there exists x1 and x2 in 𝒳 such that 12L>B(x1,x2)>0. Since it is positive, there is kL3 such that Ak*(x1) ≠ Ak*(x2). But by definition (6), since ξ(s)=1L, LAk*(x) is an integer for x2L, and L2(Ak*(x1) − Ak*(x2))2 ∈ ℤ.

Now by the definition we also have Aσ(k*)(x) = Ak*(x), where σ permutes the entries of k*. Also, for s ∈ ℤL, let s+k:=(s+k1,s+k2,s+k3), then As+k*(x) = Ak*(x). There is 6 permutations and L possible values for s ∈ ℤL, so B(x1, x2) is an integer multiple of 6L. (we can also have not trivial s and σ such that s + k* = σ(k*) but that case also has the property mentioned). However we cannot have B(x1,x2)=6L. That means there exists only one kL3 (with permutations and shifts) such that Ak*(x1) ≠ Ak* (x2). Then

kL3Ak(x1)-Ak(x2)=6L(Ak(x1)-Ak(x2))0 (33)

On the other hand

kL3Ak(x1)=1Ls=1LkL2x(k1+s)x(k2+s)x(k3+s)=L3A0(x1)3,

where A0 denotes k-autocorrelation with k = 0. Since tL > 1, A0(x1) = A0(x2), so equation (33) must be 0, and equation (12) follows by contradiction. Now if L ≥ 12 is even, choose

x1=(1,1,0,1,,1L2-3ones,0,0,1,0,,0L2-3zeros)

and x2 the vector obtained by reversing the entries of x1. Since one is the reverse of the other, they have same 1 and 2 order autocorrelations. Recall (20) and (6) and notice that in this case both Ak(u1) and Ak(u2) are 0 when |k| is odd, since half of the signal is the symmetric of the other half, i.e. u1({1,,L2})=-u1({L2+1,,L}). Now because of (30) we have Ak(x1) = Ak(x2) when |k| = 3, so tL ≥ 4, and B3(L) = 0.

Finally, let L ≥ 6 be prime. We prove by contradiction that tL = 3 and B3(L)=12L. If this is not true, then it exists x1 and x2 such that tx1,x2>3, so

Ak(x1)=Ak(x2),kLn,n3 (34)

By Theorem 2 of paper [8], if the Fourier coefficients of x1 and x2 are non-zero, then equation (34) implies one is a shift of the other. Denote by {rj1}jL and {rj2}jL the Fourier coefficients of x1 and x2, respectively, which are given by

rjα=1Ls=1Lxα(s)ωL-js,α{1,2},jL, (35)
=1Ls:xα(s)=1ωL-js, (36)

where ωL is the L’th root of unity. r0α=0 implies xα only has zeros, and rjα is 0 only if wL-j is a root of the polynomial

s:xα(s)=1bs (37)

However, since L is prime, the minimal polynomial of wL-j in ℚ[x], for L > j > 0, is 1 + x + ··· + xL−1 [12], so this polynomial must divide (37). Thus x1 and x2 must be the all zeros and all ones signals, but these signals also do not satisfy (34).

Acknowledgments

A. S. and J. P. were partially supported by Award Number R01GM090200 from the NIGMS, FA9550-12-1-0317 from AFOSR, Simons Foundation Investigator Award and Simons Collaborations on Algorithms and Geometry, and the Moore Foundation Data-Driven Discovery Investigator Award.

E. A. is partially supported by NSF CAREER Award CCF-1552131 and ARO grant W911NF-16-1-0051.

Contributor Information

Emmanuel Abbe, Princeton University.

João M. Pereira, Princeton University

Amit Singer, Princeton University.

References

  • 1.Singer A. Angular synchronization by eigenvectors and semidefinite programming. Applied and computational harmonic analysis. 2011;30(1):20–36. doi: 10.1016/j.acha.2010.02.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Bandeira AS, Charikar M, Singer A, Zhu A. Multireference alignment using semidefinite programming. Proceedings of the 5th conference on Innovations in theoretical computer science; ACM; 2014. pp. 459–470. [Google Scholar]
  • 3.Sadler BM, Giannakis GB. Shift- and rotation-invariant object reconstruction using the bispectrum. JOSA A. 1992;9(1):57–69. [Google Scholar]
  • 4.Bandeira AS, Chen Y, Singer A. Non-unique games over compact groups and orientation estimation in cryo-EM. 2015 doi: 10.1088/1361-6420/ab7d2c. arXiv preprint arXiv:1505.03840. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Aguerrebere C, Delbracio M, Bartesaghi A, Sapiro G. Fundamental limits in multi-image alignment. IEEE Transactions on Signal Processing. 2016;64(21):5707–5722. [Google Scholar]
  • 6.Bandeira AS, Rigollet P, Weed J. Optimal rates of estimation for multi-reference alignment. In preparation. [Google Scholar]
  • 7.Perry A, Weed J, Bandeira AS, Rigollet P, Singer A. The sample complexity of multi-reference alignment. In preparation. [Google Scholar]
  • 8.Kakarala R. The bispectrum as a source of phase-sensitive invariants for Fourier descriptors: a group-theoretic approach. Journal of Mathematical Imaging and Vision. 2012;44(3):341–353. [Google Scholar]
  • 9.Abbe E, Pereira JM, Singer A. Very noisy Sanov’s theorem and applications to alignment problems. 2017 Preprint. [Google Scholar]
  • 10.Cover TM, Thomas JA. Elements of information theory. John Wiley & Sons; 2006. [Google Scholar]
  • 11.Szego G. Orthogonal polynomials. Vol. 23 American Mathematical Society; 1975. [Google Scholar]
  • 12.Jacobson N. Lectures in Abstract Algebra: III. Theory of Fields and Galois Theory. Vol. 32 Springer-Verlag; 1964. [Google Scholar]

RESOURCES