Abstract
Maximum likelihood estimation problems are, in general, intractable optimization problems. As a result, it is common to approximate the maximum likelihood estimator (MLE) using convex relaxations. In some cases, the relaxation is tight: it recovers the true MLE. Most tightness proofs only apply to situations where the MLE exactly recovers a planted solution (known to the analyst). It is then sufficient to establish that the optimality conditions hold at the planted signal. In this paper, we study an estimation problem (angular synchronization) for which the MLE is not a simple function of the planted solution, yet for which the convex relaxation is tight. To establish tightness in this context, the proof is less direct because the point at which to verify optimality conditions is not known explicitly. Angular synchronization consists in estimating a collection of n phases, given noisy measurements of the pairwise relative phases. The MLE for angular synchronization is the solution of a (hard) non-bipartite Grothendieck problem over the complex numbers. We consider a stochastic model for the data: a planted signal (that is, a ground truth set of phases) is corrupted with non-adversarial random noise. Even though the MLE does not coincide with the planted signal, we show that the classical semidefinite relaxation for it is tight, with high probability. This holds even for high levels of noise.
Notes
Indeed, \(S+C\) is diagonal and \(X_{ii}=1\), hence \({\text {Tr}}\left( S+C\right) = {\text {Tr}}\left( (S+C)X\right) = {\text {Tr}}\left( SX\right) + {\text {Tr}}\left( CX\right) \).
Using \(C = \mathbbm {1}\mathbbm {1}^{\top } + \sigma W\), we get \(S\mathbbm {1} = n\mathbbm {1} + \sigma W\mathbbm {1} - n\mathbbm {1} - \sigma W\mathbbm {1} = 0\).
A similar but different definition appeared in a previous version of this paper.
The inequality is independent of \(C_{ii}\). Assuming nonnegativity merely eases the exposition.
A second-order critical point satisfies first- and second-order necessary optimality conditions, namely, the gradient is zero and the Hessian is positive semidefinite (if minimizing) [26, Sect. 3.2.1].
As before, S is independent of \({\text {diag}}(C)\). Assuming \(C_{ii} = 1\) involves no loss of generality.
References
Abbe, E., Bandeira, A.S., Bracher, A., Singer, A.: Decoding binary node labels from censored edge measurements: phase transition and efficient recovery. IEEE Trans. Network Sci. Eng. 1(1), 10–22 (2014)
Abbe, E., Bandeira, A.S., Hall, G.: Exact recovery in the stochastic block model. IEEE Trans. Inf. Theory 62(1), 471–487 (2015)
Absil, P.-A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2008)
Agrawal, A., Raskar, R., Chellappa, R.: What is the range of surface reconstructions from a gradient field? In: Leonardis, A., Bischof, H., Pinz, A. (eds.) Computer Vision—ECCV 2006. Lecture Notes in Computer Science, vol. 3951, pp. 578–591. Springer, Berlin Heidelberg (2006)
Alexeev, B., Bandeira, A.S., Fickus, M., Mixon, D.G.: Phase retrieval with polarization. SIAM J. Imaging Sci. 7(1), 35–66 (2013)
Alizadeh, F., Haeberly, J.-P., Overton, M.L.: Complementarity and nondegeneracy in semidefinite programming. Math. Program. 77(1), 111–128 (1997)
Amelunxen, D., Lotz, M., McCoy, M.B., Tropp, J.A.: Living on the edge: phase transitions in convex programs with random data. Inf. Inference 3(3), 224–294 (2014)
Ames, B.P.W.: Guaranteed clustering and biclustering via semidefinite programming. Math. Program. 147, 429–465 (2014)
Bandeira, A.S., Chen, Y., Mixon, D.G.: Phase retrieval from power spectra of masked signals. Inf. Inference J. IMA 3, 83–102 (2014)
Bandeira, A.S., Kennedy, C., Singer, A.: Approximating the little Grothendieck problem over the orthogonal group. Math. Program. Ser. A, 1–43 (2016). doi:10.1007/s10107-016-0993-7
Bandeira, A.S., Singer, A., Spielman, D.A.: A Cheeger inequality for the graph connection Laplacian. SIAM J. Matrix Anal. Appl. 34(4), 1611–1630 (2013)
Bandeira, A.S.: Random Laplacian matrices and convex relaxations. (2015). arXiv:1504.03987
Bandeira, A.S., Charikar, M., Singer, A., Zhu, A.: Multireference alignment using semidefinite programming. In: Proceedings of the 5th Conference on Innovations in Theoretical Computer Science, pp. 459–470. ACM, (2014)
Bandeira, A.S., Khoo, Y., Singer, A.: Open problem: Tightness of maximum likelihood semidefinite relaxations. In: Balcan, M.F., Feldman, V., Szepesvári, C. (eds.) Proceedings of the 27th Conference on Learning Theory, vol. 35 of JMLR W&CP, pp. 1265–1267 (2014)
Bandeira, A.S., van Handel, R.: Sharp nonasymptotic bounds on the norm of random matrices with independent entries. Annal. Probab., (to appear)
Barvinok, A.I.: Problems of distance geometry and convex properties of quadratic maps. Discrete Comput. Geom. 13(1), 189–202 (1995)
Boumal, N.: A Riemannian low-rank method for optimization over semidefinite matrices with block-diagonal constraints. (2015). arXiv:1506.00575
Boumal, N., Singer, A., Absil, P.-A., Blondel, V.D.: Cramér-Rao bounds for synchronization of rotations. Inf. Inference 3, 1–39 (2014)
Briët, J., de Oliveira Filho, F.M., Vallentin, F.: Grothendieck inequalities for semidefinite programs with rank constraint. Theory Comput. 10(4), 77–105 (2014)
Candès, E.J., Recht, B.: Exact matrix completion via convex optimization. Found. Comput. Math. 9(6), 717–772 (2009)
Candès, E.J., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theory 52, 489–509 (2006)
Candès, E.J., Romberg, J., Tao, T.: Stable signal recovery from incomplete and inaccurate measurements. Comm. Pure Appl. Math. 59, 1207–1223 (2006)
Candès, E.J., Strohmer, T., Voroninski, V.: Phaselift: exact and stable signal recovery from magnitude measurements via convex programming. Commun. Pure Appl. Math. 66, 1241–1274 (2011)
Chandrasekaran, V., Parrilo, P.A., Willsky, A.S.: Latent variable graphical model selection via convex optimization. Annal. Stat. 40, 1935–1967 (2012)
Chandrasekaran, V., Recht, B., Parrilo, P.A., Willsky, A.S.: The convex geometry of linear inverse problems. Found. Comput. Math. 12(6), 805–849 (2012)
Conn, A.R., Gould, N.I.M., Toint, P.L.: Trust-region methods. MOS-SIAM Series on Optimization, Philadelphia. doi:10.1137/1.9780898719857 (2000)
Cucuringu, M.: Sync-Rank: Robust ranking, constrained ranking and rank aggregation via eigenvector and semidefinite programming synchronization. IEEE Trans. Netw. Sci. 3(1), 58–79 (2015)
Demanet, L., Hand, P.: Stable optimizationless recovery from phaseless linear measurements. J. Fourier Anal. Appl. 20(1), 199–221. doi:10.1007/s00041-013-9305-2
Demanet, L., Jugnon, V.: Convex recovery from interferometric measurements. (2013). arXiv:1307.6864
Ding, X., Jiang, T.: Spectral distributions of adjacency and Laplacian matrices of random graphs. Annal. Appl. Probab. 20(6), 2086–2117 (2010)
Donoho, D.L.: Compressed sensing. IEEE Trans. Inf. Theory 52, 1289–1306 (2006)
Giridhar, A., Kumar, P.R.: Distributed clock synchronization over wireless networks: Algorithms and analysis. In: IEEE Conf. Decis. Control, 2006 45th, pp. 4915–4920. IEEE, (2006)
Goemans, M.X., Williamson, D.P.: Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM (JACM) 42(6), 1115–1145 (1995)
Goemans, M.X., Williamson, D.P.: Approximation algorithms for Max-3-Cut and other problems via complex semidefinite programming. J. Comput. Syst. Sci. 68(2), 442–470 (2004)
Grothendieck, A.: Resume de la theorie metrique des produits tensoriels topologiques (french). Reprint of Bol. Soc. Mat. Sao Paulo, p. 179, (1996)
Hartley, R., Trumpf, J., Dai, Y., Li, H.: Rotation averaging. Int. J. Comput. Vis. 103(3), 267–305 (2013)
Huang, Q.X., Guibas, L.: Consistent shape maps via semidefinite programming. In: Computer Graphics Forum. vol. 32, pp. 177–186. Wiley Online Library, (2013)
Journée, M., Bach, F., Absil, P.-A., Sepulchre, R.: Low-rank optimization on the cone of positive semidefinite matrices. SIAM J. Optim. 20(5), 2327–2351 (2010)
Ledoux, M., Talagrand, M.: Probability in Banach Spaces: isoperimetry and processes, vol. 23, Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge / A Series of Modern Surveys in Mathematics. (1991)
Low, S.H.: Convex relaxation of optimal power flow: a tutorial. In: Bulk Power System Dynamics and Control-IX Optimization, Security and Control of the Emerging Power Grid (IREP), 2013 IREP Symposium, pp. 1–15. IEEE, (2013)
Luo, Z., Ma, W., So, A.M.C., Ye, Y., Zhang, S.: Semidefinite relaxation of quadratic optimization problems. Sign. Process. Mag. IEEE 27(3), 20–34 (2010)
Martinec, D., Pajdla, T.: Robust rotation and translation estimation in multiview reconstruction. In: IEEE Conference on Computer Vision and Pattern Recognition, 2007. CVPR ’07. pp. 1–8, June (2007)
Pataki, G.: On the rank of extreme matrices in semidefinite programs and the multiplicity of optimal eigenvalues. Math. Operations Res. 23(2), 339–358 (1998)
Pisier, G.: Grothendieck’s theorem, past and present. Bull. Amer. Math. Soc. 49, 237–323 (2011)
Rubinstein, J., Wolansky, G.: Reconstruction of optical surfaces from ray data. Opt. Rev. 8(4), 281–283 (2001)
Ruszczyński, A.P.: Nonlinear Optimization. Princeton University Press, Princeton (2006)
Sagnol, G.: A class of semidefinite programs with rank-one solutions. Linear Algebra Appl. 435(6), 1446–1463 (2011)
Shapiro, A.: Rank-reducibility of a symmetric matrix and sampling theory of minimum trace factor analysis. Psychometrika 47(2), 187–199 (1982)
Singer, A.: Angular synchronization by eigenvectors and semidefinite programming. Appl. Comput. Harmonic Anal. 30(1), 20–36 (2011)
Singer, A., Shkolnisky, Y.: Three-dimensional structure determination from common lines in Cryo-EM by eigenvectors and semidefinite programming. SIAM J. Imaging Sci. 4(2), 543–572 (2011)
So, A.M.-C.: Probabilistic analysis of the semidefinite relaxation detector in digital communications. In: Proc. SODA (2010)
So, A.M.-C., Zhang, J., Ye, Y.: On approximating complex quadratic optimization problems via semidefinite programming relaxations. Math. Program. 110(1), 93–110 (2007)
Sojoudi, S., Lavaei, J.: Exactness of semidefinite relaxations for nonlinear optimization problems with underlying graph structure. SIAM J. Optim. 24(4), 1746–1778 (2014)
Tropp, J.A.: Just relax: convex programming methods for identifying sparse signals in noise. IEEE Trans Inf. Theory 52(3), 1030–1051 (2006)
Vandenberghe, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996)
Vershynin, R.: Introduction to the non-asymptotic analysis of random matrices. In: Eldar, Y., Kutyniok, G. (eds.) Chapter 5 of: Compressed Sensing, Theory and Applications. Cambridge University Press, Cambridge (2012)
Wang, L., Singer, A.: Exact and stable recovery of rotations for robust synchronization. Inf. Inference 2(2), 145–193 (2013)
Williamson, D.P., Shmoys, D.B.: The Design of Approximation Algorithms. Cambridge University Press, Cambridge (2011)
Zhang, S., Huang, Y.: Complex quadratic optimization and semidefinite programming. SIAM J. Optim. 16(3), 871–890 (2006)
Zhang, T., Singer, A.: Disentangling two orthogonal matrices. (2015). arXiv:1506.02217
Acknowledgments
A. S. Bandeira was supported by AFOSR Grant No. FA9550-12-1-0317. Most of this work was done while he was with the Program for Applied and Computational Mathematics at Princeton University, and some while he was with the Department of Mathematics at the Massachusetts Institute of Technology. N. Boumal was supported by a Belgian F.R.S.-FNRS fellowship while working at the Université catholique de Louvain (Belgium), by a Research in Paris fellowship at Inria and ENS, the “Fonds Spéciaux de Recherche” (FSR UCLouvain), the Chaire Havas “Chaire Economie et gestion des nouvelles données” and the ERC Starting Grant SIPA. A. Singer was partially supported by Award Number R01GM090200 from the NIGMS, by Award Numbers FA9550-12-1-0317 and FA9550-13-1-0076 from AFOSR, by Award Number LTR DTD 06-05-2012 from the Simons Foundation, and by the Moore Foundation.
Appendix: Wigner matrices are discordant
This appendix is a proof for Proposition 3.3, namely, that for arbitrary \(z\in \mathbb {C}^n\) such that \(|z_1| = \cdots = |z_n| = 1\), complex Wigner matrices are z-discordant (Definition 3.1) with high probability.
A matrix W is z-discordant if and only if \({\text {diag}}(z)^* W {\text {diag}}(z)\) is \(\mathbbm {1}\)-discordant. Since \({\text {diag}}(z)^* W {\text {diag}}(z)\) has the same distribution as W (owing to complex normal random variables having uniformly random phase), we may without loss of generality assume \(z = \mathbbm {1}\) in the remainder of the proof.
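The deterministic half of this reduction can be checked numerically. The sketch below is an illustration only, and it rests on two assumptions not stated in this appendix: that Definition 3.1 is expressed through \(\Vert W\Vert _{\mathrm {op}}\) and \(\Vert Wz\Vert _\infty \) (as the two bounds proved below suggest), and that a complex Wigner matrix is Hermitian with zero diagonal and off-diagonal entries of unit variance. The function name `conjugation_gaps` is hypothetical. It does not test the distributional claim, only that conjugating by \({\text {diag}}(z)\) leaves both quantities unchanged.

```python
import numpy as np


def conjugation_gaps(n, seed=0):
    """Compare norms of W and diag(z)^* W diag(z) for random unit-modulus z.

    Assumed model (inferred from the proof, not quoted from the paper):
    W is Hermitian, has zero diagonal, and E|W_ij|^2 = 1 off the diagonal.
    """
    rng = np.random.default_rng(seed)

    # Standard complex Gaussians: real and imaginary parts of variance 1/2.
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    upper = np.triu(g, k=1)
    W = upper + upper.conj().T

    # Arbitrary phases z_i with |z_i| = 1.
    z = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=n))
    D = np.diag(z)
    W_rot = D.conj().T @ W @ D

    # The operator norm is invariant under unitary conjugation.
    op_gap = abs(np.linalg.norm(W_rot, 2) - np.linalg.norm(W, 2))

    # Entry i of W_rot @ 1 equals conj(z_i) * (W z)_i, so the moduli agree.
    inf_gap = abs(np.max(np.abs(W_rot @ np.ones(n))) - np.max(np.abs(W @ z)))
    return op_gap, inf_gap


if __name__ == "__main__":
    print(conjugation_gaps(50))
```

Both gaps should be at the level of floating-point round-off, which is consistent with reducing the proof to the case \(z = \mathbbm {1}\).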
1. \(\Pr \left\{ \left\| W\right\| _{\mathrm {op}} > 3 n^{1/2} \right\} \le e^{-n/2} \).
Although tail bounds for the real version of this are well-known (see for example [15, 56]) and they mostly hold verbatim in the complex case, for the sake of completeness we include a classical argument, based on Slepian’s comparison theorem and Gaussian concentration, for a tail bound in the complex valued case.
We will bound the largest eigenvalue of W. It is clear that a simple union bound argument will allow us to bound also the smallest, and thus bound the largest in magnitude. Let \(\lambda _+ = \max _{v\in \mathbb {C}^n: \Vert v\Vert =1}v^*W v\) denote the largest eigenvalue of W. For any unit-norm \(u,v\in \mathbb {C}^n\), the real valued Gaussian process \(X_v = v^*Wv\) satisfies:
$$\begin{aligned} \mathbb {E}\left( X_v - X_u\right) ^2&= \mathbb {E}\left( \sum _{i<j}W_{ij}\left( \overline{v_i}v_j -\overline{u_i}u_j\right) + W_{ji}\left( \overline{v_j}v_i -\overline{u_j}u_i\right) \right) ^2\\&= \sum _{i<j}\mathbb {E}\left[ W_{ij}\left( \overline{v_i}v_j -\overline{u_i}u_j\right) + W_{ji}\left( \overline{v_j}v_i -\overline{u_j}u_i\right) \right] ^2. \end{aligned}$$
The variable \(W_{ij}\) has uniformly random phase, hence so does \(W_{ij}^2\), so that \(\mathbb {E}W_{ij}^2 = 0\). As a result,
$$\begin{aligned} \mathbb {E}\left( X_v - X_u\right) ^2&= \sum _{i<j}2\mathbb {E}\left| W_{ij}\right| ^2\left| \overline{v_i}v_j -\overline{u_i}u_j\right| ^2 \\&= 2\sum _{i<j}\left| \overline{v_i}v_j -\overline{u_i}u_j\right| ^2 \\&\le \sum _{i,j}\left| \overline{v_i}v_j -\overline{u_i}u_j\right| ^2. \end{aligned}$$
Note that, since \(\Vert u\Vert =\Vert v\Vert =1\),
$$\begin{aligned} \sum _{i,j}\left| \overline{v_i}v_j -\overline{u_i}u_j\right| ^2&= \sum _{i,j}\left[ |v_i|^2|v_j|^2 + |u_i|^2|u_j|^2 - \overline{v_i}v_ju_i\overline{u_j} - v_i\overline{v_j}\overline{u_i}u_j\right] \\&= 2- 2\left| v^*u\right| ^2 \\&\le 2\left( 2 - 2\left| v^*u\right| \right) \\&\le 4\left( 1 - \mathfrak {R}\left[ v^*u \right] \right) \\&= 2\Vert v-u \Vert ^2. \end{aligned}$$
This means that we can use Slepian’s comparison theorem (see for example [39, Cor. 3.12]) to get
$$\begin{aligned} \mathbb {E}\lambda _+ \le \sqrt{2}\mathbb {E}\max _{\tilde{v}\in \mathbb {R}^{2n}:\Vert \tilde{v}\Vert =1} \tilde{v}^T g \le 2\sqrt{n}, \end{aligned}$$ (5.1)
where g is a standard Gaussian vector in \(\mathbb {R}^{2n}\) and \(\tilde{v}\) is a vector in \(\mathbb {R}^{2n}\) obtained from \(v\in \mathbb {C}^{n}\) by stacking its real and imaginary parts. Since \(\left| \Vert W_1\Vert - \Vert W_2\Vert \right| \le \Vert W_1-W_2\Vert \le \Vert W_1-W_2\Vert _{\mathrm {F}}\), Gaussian concentration [39] gives
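$$\begin{aligned} \Pr \left\{ \lambda _+ - \mathbb {E}\lambda _+ \ge t \right\} \le e^{-t^2/2}. \end{aligned}$$ (5.2)
Taking \(t = n^{1/2}\) in (5.2) and using \(\mathbb {E}\lambda _+ \le 2 n^{1/2}\) from (5.1) bounds \(\lambda _+\) by \(3 n^{1/2}\) outside an event of probability at most \(e^{-n/2}\), which is the source of the claimed bound
$$\begin{aligned} \Pr \left\{ \left\| W\right\| _{\mathrm {op}} > 3 n^{1/2} \right\} \le e^{-n/2}. \end{aligned}$$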
2. \(\Pr \left\{ \Vert W\mathbbm {1}\Vert _\infty > 3\sqrt{n \log n} \right\} \le 2n^{-5/4}\).
The random vector given by \(\frac{1}{(n-1)^{1/2}}W\mathbbm {1}\) is jointly Gaussian where the marginal of each entry is a standard complex Gaussian. By a suboptimal union bound argument, the maximum absolute value among k standard complex Gaussian random variables (not necessarily independent) is larger than t with probability at most \(2ke^{-t^2/4}\). Hence,
$$\begin{aligned} \Pr \left\{ \Vert W\mathbbm {1}\Vert _\infty> 3\sqrt{n\log n} \right\}&\le \Pr \left\{ \left\| \frac{1}{(n-1)^{1/2}}W\mathbbm {1}\right\| _\infty > 3 \sqrt{\log n} \right\} \\&\le 2ne^{-\frac{9}{4} \log n} = 2n^{-5/4}. \end{aligned}$$
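The two tail bounds above can be sanity-checked with a small Monte Carlo experiment. The following is a sketch, not part of the proof: the helper names `complex_wigner` and `discordance_ratios` are hypothetical, and the sampling model (Hermitian, zero diagonal, off-diagonal entries standard complex Gaussian with \(\mathbb {E}|W_{ij}|^2 = 1\)) is an assumption read off from the variance computations in this appendix. Each ratio divides the observed quantity by its threshold, so a value below 1 means the corresponding bound holds for that sample.

```python
import numpy as np


def complex_wigner(n, rng):
    """Sample a Hermitian matrix with zero diagonal and E|W_ij|^2 = 1.

    This model is an assumption inferred from the proof above.
    """
    # Real and imaginary parts each have variance 1/2.
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    upper = np.triu(g, k=1)        # strictly upper triangle, diagonal left at 0
    return upper + upper.conj().T  # mirror to obtain a Hermitian matrix


def discordance_ratios(n, trials, seed=0):
    """Return ||W||_op / (3 sqrt(n)) and ||W 1||_inf / (3 sqrt(n log n)) per trial."""
    rng = np.random.default_rng(seed)
    ones = np.ones(n)
    op_ratios = np.empty(trials)
    inf_ratios = np.empty(trials)
    for k in range(trials):
        W = complex_wigner(n, rng)
        # Spectral norm via the largest singular value.
        op_ratios[k] = np.linalg.norm(W, 2) / (3.0 * np.sqrt(n))
        # Largest modulus among the row sums of W.
        inf_ratios[k] = np.max(np.abs(W @ ones)) / (3.0 * np.sqrt(n * np.log(n)))
    return op_ratios, inf_ratios


if __name__ == "__main__":
    op_ratios, inf_ratios = discordance_ratios(n=200, trials=20)
    print("largest operator-norm ratio:", op_ratios.max())
    print("largest infinity-norm ratio:", inf_ratios.max())
```

Since the thresholds in items 1 and 2 are not claimed to be tight, ratios comfortably below 1 are to be expected; the experiment only illustrates that violations are rare, it does not estimate the constants.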
To support the discussion following Proposition 3.3, we further argue that
It is easy to see that \(\mathbbm {1}^* W \mathbbm {1}\) is a real Gaussian random variable with zero mean and variance \(2\frac{n(n-1)}{2} = n(n-1)\). This implies that:
Cite this article
Bandeira, A.S., Boumal, N. & Singer, A. Tightness of the maximum likelihood semidefinite relaxation for angular synchronization. Math. Program. 163, 145–167 (2017). https://doi.org/10.1007/s10107-016-1059-6
DOI: https://doi.org/10.1007/s10107-016-1059-6
Keywords
- Angular synchronization
- Semidefinite programming
- Tightness of convex relaxation
- Maximum likelihood estimation