-
Efficient Certificates of Anti-Concentration Beyond Gaussians
Authors:
Ainesh Bakshi,
Pravesh Kothari,
Goutham Rajendran,
Madhur Tulsiani,
Aravindan Vijayaraghavan
Abstract:
A set of high dimensional points $X=\{x_1, x_2,\ldots, x_n\} \subset R^d$ in isotropic position is said to be $δ$-anti concentrated if for every direction $v$, the fraction of points in $X$ satisfying $|\langle x_i,v \rangle |\leq δ$ is at most $O(δ)$. Motivated by applications to list-decodable learning and clustering, recent works have considered the problem of constructing efficient certificates of anti-concentration in the average case, when the set of points $X$ corresponds to samples from a Gaussian distribution. Their certificates played a crucial role in several subsequent works in algorithmic robust statistics on list-decodable learning and settling the robust learnability of arbitrary Gaussian mixtures, yet remain limited to rotationally invariant distributions.
This work presents a new (and arguably the most natural) formulation for anti-concentration. Using this formulation, we give quasi-polynomial time verifiable sum-of-squares certificates of anti-concentration that hold for a wide class of non-Gaussian distributions including anti-concentrated bounded product distributions and uniform distributions over $L_p$ balls (and their affine transformations). Consequently, our method upgrades and extends results in algorithmic robust statistics, e.g., list-decodable learning and clustering, to such distributions. Our approach constructs a canonical integer program for anti-concentration and analyzes a sum-of-squares relaxation of it, independent of the intended application. We rely on duality and analyze a pseudo-expectation on large subsets of the input points that take a small value in some direction. Our analysis uses the method of polynomial reweightings to reduce the problem to analyzing only analytically dense or sparse directions.
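To make the definition of $δ$-anti-concentration concrete, here is a minimal sketch (not the paper's certificate) that measures, for one given direction $v$, the fraction of points with $|\langle x_i, v \rangle| \leq δ$. The name `slab_fraction` and the toy data are assumptions made for illustration. Certifying the bound for every direction at once is the hard problem the abstract addresses, and this check does not do that.

```python
import math

def slab_fraction(points, v, delta):
    """Fraction of points x with |<x, v/||v||>| <= delta (single direction only)."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("direction v must be nonzero")
    unit = [c / norm for c in v]
    hits = 0
    for x in points:
        inner = sum(a * b for a, b in zip(x, unit))
        if abs(inner) <= delta:
            hits += 1
    return hits / len(points)

# Hypothetical toy data: four points on the coordinate axes in R^2.
toy_points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
toy_fraction = slab_fraction(toy_points, (1.0, 0.0), 0.5)
```

On the toy data, the two points on the second axis have inner product zero with $v$, so half of the set falls in the slab for this direction.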
Submitted 23 May, 2024;
originally announced May 2024.
-
List Decoding of Tanner and Expander Amplified Codes from Distance Certificates
Authors:
Fernando Granha Jeronimo,
Shashank Srivastava,
Madhur Tulsiani
Abstract:
We develop new list decoding algorithms for Tanner codes and distance-amplified codes based on bipartite spectral expanders. We show that proofs exhibiting lower bounds on the minimum distance of these codes can be used as certificates discoverable by relaxations in the Sum-of-Squares (SoS) semidefinite programming hierarchy. Combining these certificates with certain entropic proxies to ensure that the solutions to the relaxations cover the entire list, then leads to algorithms for list decoding several families of codes up to the Johnson bound.
We prove the following:
- We show that the LDPC Tanner codes of Sipser-Spielman [IEEE Trans. Inf. Theory 1996] and Zémor [IEEE Trans. Inf. Theory 2001] with alphabet size $q$, block-length $n$ and distance $δ$, based on an expander graph with degree $d$, can be list-decoded up to distance $\mathcal{J}_q(δ) - ε$ in time $n^{O_{d,q}(1/ε^4)}$, where $\mathcal{J}_q(δ)$ denotes the Johnson bound.
- We show that the codes obtained via the expander-based distance amplification procedure of Alon, Edmonds and Luby [FOCS 1995] can be list-decoded close to the Johnson bound using the SoS hierarchy, by reducing the list decoding problem to unique decoding of the base code. In particular, starting from \emph{any} base code unique-decodable up to distance $δ$, one can obtain near-MDS codes with rate $R$ and distance $1-R - ε$, list-decodable up to the Johnson bound in time $n^{O_{ε, δ}(1)}$.
- We show that the locally testable codes of Dinur et al. [STOC 2022] with alphabet size $q$, block-length $n$ and distance $δ$ based on a square Cayley complex with generator sets of size $d$, can be list-decoded up to distance $\mathcal{J}_q(δ) - ε$ in time $n^{O_{d,q}(1/ε^{4})}$, where $\mathcal{J}_q(δ)$ denotes the Johnson bound.
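The abstract does not spell out $\mathcal{J}_q(δ)$. Assuming the standard $q$-ary Johnson radius $\mathcal{J}_q(δ) = (1 - 1/q)\left(1 - \sqrt{1 - qδ/(q-1)}\right)$, a small sketch for evaluating it follows. The function name `johnson_radius` is hypothetical.

```python
import math

def johnson_radius(q, delta):
    """Standard q-ary Johnson radius for relative distance delta (assumed formula)."""
    if q < 2:
        raise ValueError("alphabet size q must be at least 2")
    max_delta = 1.0 - 1.0 / q
    if not 0.0 <= delta <= max_delta:
        raise ValueError("delta must lie in [0, 1 - 1/q]")
    inside = 1.0 - q * delta / (q - 1)
    # Clamp tiny negative values caused by floating-point rounding.
    return max_delta * (1.0 - math.sqrt(max(inside, 0.0)))

binary_radius = johnson_radius(2, 0.32)
```

For binary codes this reduces to $(1 - \sqrt{1 - 2δ})/2$, which always lies between the unique-decoding radius $δ/2$ and $δ$.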
Submitted 3 November, 2023;
originally announced November 2023.
-
Ellipsoid fitting up to constant via empirical covariance estimation
Authors:
Madhur Tulsiani,
June Wu
Abstract:
The ellipsoid fitting conjecture of Saunderson, Chandrasekaran, Parrilo and Willsky considers the maximum number $n$ of random Gaussian points in $\mathbb{R}^d$, such that with high probability, there exists an origin-symmetric ellipsoid passing through all the points. They conjectured a threshold of $n = (1-o_d(1)) \cdot d^2/4$, while until recently, known lower bounds on the maximum possible $n$ were of the form $d^2/(\log d)^{O(1)}$. We give a simple proof based on concentration of sample covariance matrices, that with probability $1 - o_d(1)$, it is possible to fit an ellipsoid through $d^2/C$ random Gaussian points. Similar results were also obtained in two recent independent works by Hsieh, Kothari, Potechin and Xu [arXiv, July 2023] and by Bandeira, Maillard, Mendelson, and Paquette [arXiv, July 2023].
Submitted 23 July, 2023; v1 submitted 20 July, 2023;
originally announced July 2023.
-
Concentration of polynomial random matrices via Efron-Stein inequalities
Authors:
Goutham Rajendran,
Madhur Tulsiani
Abstract:
Analyzing concentration of large random matrices is a common task in a wide variety of fields. Given independent random variables, many tools are available to analyze random matrices whose entries are linear in the variables, e.g. the matrix-Bernstein inequality. However, in many applications, we need to analyze random matrices whose entries are polynomials in the variables. These arise naturally in the analysis of spectral algorithms, e.g., Hopkins et al. [STOC 2016], Moitra-Wein [STOC 2019]; and in lower bounds for semidefinite programs based on the Sum of Squares hierarchy, e.g. Barak et al. [FOCS 2016], Jones et al. [FOCS 2021]. In this work, we present a general framework to obtain such bounds, based on the matrix Efron-Stein inequalities developed by Paulin-Mackey-Tropp [Annals of Probability 2016]. The Efron-Stein inequality bounds the norm of a random matrix by the norm of another simpler (but still random) matrix, which we view as arising by "differentiating" the starting matrix. By recursively differentiating, our framework reduces the main task to analyzing far simpler matrices. For Rademacher variables, these simpler matrices are in fact deterministic and hence, analyzing them is far easier. For general non-Rademacher variables, the task reduces to scalar concentration, which is much easier. Moreover, in the setting of polynomial matrices, our results generalize the work of Paulin-Mackey-Tropp. Using our basic framework, we recover known bounds in the literature for simple "tensor networks" and "dense graph matrices". Using our general framework, we derive bounds for "sparse graph matrices", which were obtained only recently by Jones et al. [FOCS 2021] using a nontrivial application of the trace power method, and were a core component in their work. We expect our framework to be helpful for other applications involving concentration phenomena for nonlinear random matrices.
Submitted 17 January, 2023; v1 submitted 6 September, 2022;
originally announced September 2022.
-
Explicit Abelian Lifts and Quantum LDPC Codes
Authors:
Fernando Granha Jeronimo,
Tushant Mittal,
Ryan O'Donnell,
Pedro Paredes,
Madhur Tulsiani
Abstract:
For an abelian group $H$ acting on the set $[\ell]$, an $(H,\ell)$-lift of a graph $G_0$ is a graph obtained by replacing each vertex by $\ell$ copies, and each edge by a matching corresponding to the action of an element of $H$.
In this work, we show the following explicit constructions of expanders obtained via abelian lifts. For every (transitive) abelian group $H \leqslant \text{Sym}(\ell)$, constant degree $d \ge 3$ and $ε> 0$, we construct explicit $d$-regular expander graphs $G$ obtained from an $(H,\ell)$-lift of a (suitable) base $n$-vertex expander $G_0$ with the following parameters:
(i) $λ(G) \le 2\sqrt{d-1} + ε$, for any lift size $\ell \le 2^{n^δ}$ where $δ=δ(d,ε)$,
(ii) $λ(G) \le ε\cdot d$, for any lift size $\ell \le 2^{n^{δ_0}}$ for a fixed $δ_0 > 0$, when $d \ge d_0(ε)$, or
(iii) $λ(G) \le \widetilde{O}(\sqrt{d})$, for lift size "exactly" $\ell = 2^{Θ(n)}$.
As corollaries, we obtain explicit quantum lifted product codes of Panteleev and Kalachev of almost linear distance (and also in a wide range of parameters) and explicit classical quasi-cyclic LDPC codes with wide range of circulant sizes.
Items $(i)$ and $(ii)$ above are obtained by extending the techniques of Mohanty, O'Donnell and Paredes [STOC 2020] for $2$-lifts to much larger abelian lift sizes (as a byproduct simplifying their construction). This is done by providing a new encoding of special walks arising in the trace power method, carefully "compressing" depth-first search traversals. Result $(iii)$ is via a simpler proof of Agarwal et al. [SIAM J. Discrete Math 2019] at the expense of polylog factors in the expansion.
Submitted 2 December, 2021;
originally announced December 2021.
-
Sum-of-Squares Lower Bounds for Sparse Independent Set
Authors:
Chris Jones,
Aaron Potechin,
Goutham Rajendran,
Madhur Tulsiani,
Jeff Xu
Abstract:
The Sum-of-Squares (SoS) hierarchy of semidefinite programs is a powerful algorithmic paradigm which captures state-of-the-art algorithmic guarantees for a wide array of problems. In the average case setting, SoS lower bounds provide strong evidence of algorithmic hardness or information-computation gaps. Prior to this work, SoS lower bounds have been obtained for problems in the "dense" input regime, where the input is a collection of independent Rademacher or Gaussian random variables, while the sparse regime has remained out of reach. We make the first progress in this direction by obtaining strong SoS lower bounds for the problem of Independent Set on sparse random graphs. We prove that with high probability over an Erdos-Renyi random graph $G\sim G_{n,\frac{d}{n}}$ with average degree $d>\log^2 n$, degree-$D_{SoS}$ SoS fails to refute the existence of an independent set of size $k = Ω\left(\frac{n}{\sqrt{d}(\log n)(D_{SoS})^{c_0}} \right)$ in $G$ (where $c_0$ is an absolute constant), whereas the true size of the largest independent set in $G$ is $O\left(\frac{n\log d}{d}\right)$.
Our proof involves several significant extensions of the techniques used for proving SoS lower bounds in the dense setting. Previous lower bounds are based on the pseudo-calibration heuristic of Barak et al [FOCS 2016] which produces a candidate SoS solution using a planted distribution indistinguishable from the input distribution via low-degree tests. In the sparse case the natural planted distribution does admit low-degree distinguishers, and we show how to adapt the pseudo-calibration heuristic to overcome this.
Another notorious technical challenge for the sparse regime is the quest for matrix norm bounds. In this paper, we obtain new norm bounds for graph matrices in the sparse setting.
Submitted 17 November, 2021;
originally announced November 2021.
-
Unique Decoding of Explicit $ε$-balanced Codes Near the Gilbert-Varshamov Bound
Authors:
Fernando Granha Jeronimo,
Dylan Quintana,
Shashank Srivastava,
Madhur Tulsiani
Abstract:
The Gilbert-Varshamov bound (non-constructively) establishes the existence of binary codes of distance $1/2 -ε$ and rate $Ω(ε^2)$ (where an upper bound of $O(ε^2\log(1/ε))$ is known). Ta-Shma [STOC 2017] gave an explicit construction of $ε$-balanced binary codes, where any two distinct codewords are at a distance between $1/2 -ε/2$ and $1/2+ε/2$, achieving a near optimal rate of $Ω(ε^{2+β})$, where $β\to 0$ as $ε\to 0$.
We develop unique and list decoding algorithms for (essentially) the family of codes constructed by Ta-Shma. We prove the following results for $ε$-balanced codes with block length $N$ and rate $Ω(ε^{2+β})$ in this family:
- For all $ε, β> 0$ there are explicit codes which can be uniquely decoded up to an error of half the minimum distance in time $N^{O_{ε, β}(1)}$.
- For any fixed constant $β$ independent of $ε$, there is an explicit construction of codes which can be uniquely decoded up to an error of half the minimum distance in time $(\log(1/ε))^{O(1)} \cdot N^{O_β(1)}$.
- For any $ε> 0$, there are explicit $ε$-balanced codes with rate $Ω(ε^{2+β})$ which can be list decoded up to error $1/2 - ε'$ in time $N^{O_{ε,ε',β}(1)}$, where $ε', β\to 0$ as $ε\to 0$.
The starting point of our algorithms is the list decoding framework from Alev et al. [SODA 2020], which uses the Sum-of-Squares SDP hierarchy. The rates obtained there were quasipolynomial in $ε$. Here, we show how to overcome the far from optimal rates of this framework obtaining unique decoding algorithms for explicit binary codes of near optimal rate. These codes are based on simple modifications of Ta-Shma's construction.
Submitted 10 November, 2020;
originally announced November 2020.
-
List Decoding of Direct Sum Codes
Authors:
Vedat Levi Alev,
Fernando Granha Jeronimo,
Dylan Quintana,
Shashank Srivastava,
Madhur Tulsiani
Abstract:
We consider families of codes obtained by "lifting" a base code $\mathcal{C}$ through operations such as $k$-XOR applied to "local views" of codewords of $\mathcal{C}$, according to a suitable $k$-uniform hypergraph. The $k$-XOR operation yields the direct sum encoding used in works of [Ta-Shma, STOC 2017] and [Dinur and Kaufman, FOCS 2017].
We give a general framework for list decoding such lifted codes, as long as the base code admits a unique decoding algorithm, and the hypergraph used for lifting satisfies certain expansion properties. We show that these properties are satisfied by the collection of length $k$ walks on an expander graph, and by hypergraphs corresponding to high-dimensional expanders. Instantiating our framework, we obtain list decoding algorithms for direct sum liftings on the above hypergraph families. Using known connections between direct sum and direct product, we also recover the recent results of Dinur et al. [SODA 2019] on list decoding for direct product liftings.
Our framework relies on relaxations given by the Sum-of-Squares (SOS) SDP hierarchy for solving various constraint satisfaction problems (CSPs). We view the problem of recovering the closest codeword to a given word, as finding the optimal solution of a CSP. Constraints in the instance correspond to edges of the lifting hypergraph, and the solutions are restricted to lie in the base code $\mathcal{C}$. We show that recent algorithms for (approximately) solving CSPs on certain expanding hypergraphs also yield a decoding algorithm for such lifted codes.
We extend the framework to list decoding, by requiring the SOS solution to minimize a convex proxy for negative entropy. We show that this ensures a covering property for the SOS solution, and the "condition and round" approach used in several SOS algorithms can then be used to recover the required list of codewords.
Submitted 10 November, 2020;
originally announced November 2020.
-
Explicit SoS lower bounds from high-dimensional expanders
Authors:
Irit Dinur,
Yuval Filmus,
Prahladh Harsha,
Madhur Tulsiani
Abstract:
We construct an explicit family of 3XOR instances which is hard for $O(\sqrt{\log n})$ levels of the Sum-of-Squares hierarchy. In contrast to earlier constructions, which involve a random component, our systems can be constructed explicitly in deterministic polynomial time.
Our construction is based on the high-dimensional expanders devised by Lubotzky, Samuels and Vishne, known as LSV complexes or Ramanujan complexes, and our analysis is based on two notions of expansion for these complexes: cosystolic expansion, and a local isoperimetric inequality due to Gromov.
Our construction offers an interesting contrast to the recent work of Alev, Jeronimo and the last author (FOCS 2019). They showed that 3XOR instances in which the variables correspond to vertices in a high-dimensional expander are easy to solve. In contrast, in our instances the variables correspond to the edges of the complex.
Submitted 10 September, 2020;
originally announced September 2020.
-
Approximating Constraint Satisfaction Problems on High-Dimensional Expanders
Authors:
Vedat Levi Alev,
Fernando Granha Jeronimo,
Madhur Tulsiani
Abstract:
We consider the problem of approximately solving constraint satisfaction problems with arity $k > 2$ ($k$-CSPs) on instances satisfying certain expansion properties, when viewed as hypergraphs. Random instances of $k$-CSPs, which are also highly expanding, are well-known to be hard to approximate using known algorithmic techniques (and are widely believed to be hard to approximate in polynomial time). However, we show that this is not necessarily the case for instances where the hypergraph is a high-dimensional expander.
We consider the spectral definition of high-dimensional expansion used by Dinur and Kaufman [FOCS 2017] to construct certain primitives related to PCPs. They measure the expansion in terms of a parameter $γ$ which is the analogue of the second singular value for expanding graphs. Extending the results by Barak, Raghavendra and Steurer [FOCS 2011] for 2-CSPs, we show that if an instance of MAX k-CSP over alphabet $[q]$ is a high-dimensional expander with parameter $γ$, then it is possible to approximate the maximum fraction of satisfiable constraints up to an additive error $ε$ using $q^{O(k)} \cdot (k/ε)^{O(1)}$ levels of the sum-of-squares SDP hierarchy, provided $γ\leq ε^{O(1)} \cdot (1/(kq))^{O(k)}$.
Based on our analysis, we also suggest a notion of threshold-rank for hypergraphs, which can be used to extend the results for approximating 2-CSPs on low threshold-rank graphs. We show that if an instance of MAX k-CSP has threshold rank $r$ for a threshold $τ= (ε/k)^{O(1)} \cdot (1/q)^{O(k)}$, then it is possible to approximately solve the instance up to additive error $ε$, using $r \cdot q^{O(k)} \cdot (k/ε)^{O(1)}$ levels of the sum-of-squares hierarchy. As in the case of graphs, high-dimensional expanders (with sufficiently small $γ$) have threshold rank 1 according to our definition.
Submitted 17 July, 2019;
originally announced July 2019.
-
Approximating Operator Norms via Generalized Krivine Rounding
Authors:
Vijay Bhattiprolu,
Mrinalkanti Ghosh,
Venkatesan Guruswami,
Euiwoong Lee,
Madhur Tulsiani
Abstract:
We consider the $(\ell_p,\ell_r)$-Grothendieck problem, which seeks to maximize the bilinear form $y^T A x$ for an input matrix $A$ over vectors $x,y$ with $\|x\|_p=\|y\|_r=1$. The problem is equivalent to computing the $p \to r^*$ operator norm of $A$. The case $p=r=\infty$ corresponds to the classical Grothendieck problem. Our main result is an algorithm for arbitrary $p,r \ge 2$ with approximation ratio $(1+ε_0)/(\sinh^{-1}(1)\cdot γ_{p^*} \,γ_{r^*})$ for some fixed $ε_0 \le 0.00863$. Comparing this with Krivine's approximation ratio of $(π/2)/\sinh^{-1}(1)$ for the original Grothendieck problem, our guarantee is off from the best known hardness factor of $(γ_{p^*} γ_{r^*})^{-1}$ for the problem by a factor similar to Krivine's defect.
Our approximation follows by bounding the value of the natural vector relaxation for the problem which is convex when $p,r \ge 2$. We give a generalization of random hyperplane rounding and relate the performance of this rounding to certain hypergeometric functions, which prescribe necessary transformations to the vector solution before the rounding is applied. Unlike Krivine's Rounding where the relevant hypergeometric function was $\arcsin$, we have to study a family of hypergeometric functions. The bulk of our technical work then involves methods from complex analysis to gain detailed information about the Taylor series coefficients of the inverses of these hypergeometric functions, which then dictate our approximation factor.
Our result also implies improved bounds for "factorization through $\ell_{2}^{\,n}$" of operators from $\ell_{p}^{\,n}$ to $\ell_{q}^{\,m}$ (when $p\geq 2 \geq q$). Such bounds are of significant interest in functional analysis, and our work provides modest supplementary evidence for an intriguing parallel between factorizability and constant-factor approximability.
Submitted 5 November, 2019; v1 submitted 10 April, 2018;
originally announced April 2018.
-
Inapproximability of Matrix $p\rightarrow q$ Norms
Authors:
Vijay Bhattiprolu,
Mrinalkanti Ghosh,
Venkatesan Guruswami,
Euiwoong Lee,
Madhur Tulsiani
Abstract:
We study the problem of computing the $p\rightarrow q$ norm of a matrix $A \in R^{m \times n}$, defined as \[ \|A\|_{p\rightarrow q} ~:=~ \max_{x \,\in\, R^n \setminus \{0\}} \frac{\|Ax\|_q}{\|x\|_p} \] This problem generalizes the spectral norm of a matrix ($p=q=2$) and the Grothendieck problem ($p=\infty$, $q=1$), and has been widely studied in various regimes. When $p \geq q$, the problem exhibits a dichotomy: constant factor approximation algorithms are known if $2 \in [q,p]$, and the problem is hard to approximate within almost polynomial factors when $2 \notin [q,p]$.
The regime when $p < q$, known as \emph{hypercontractive norms}, is particularly significant for various applications but much less well understood. The case with $p = 2$ and $q > 2$ was studied by [Barak et al, STOC'12] who gave sub-exponential algorithms for a promise version of the problem (which captures small-set expansion) and also proved hardness of approximation results based on the Exponential Time Hypothesis. However, no NP-hardness of approximation is known for these problems for any $p < q$.
We study the hardness of approximating matrix norms in both the above cases and prove the following results:
- We show that for any $1< p < q < \infty$ with $2 \notin [p,q]$, $\|A\|_{p\rightarrow q}$ is hard to approximate within $2^{O(\log^{1-ε}\!n)}$ assuming $NP \not\subseteq BPTIME(2^{\log^{O(1)}\!n})$. This suggests that, similar to the case of $p \geq q$, the hypercontractive setting may be qualitatively different when $2$ does not lie between $p$ and $q$.
- For all $p \geq q$ with $2 \in [q,p]$, we show $\|A\|_{p\rightarrow q}$ is hard to approximate within any factor smaller than $1/(γ_{p^*} \cdot γ_q)$, where for any $r$, $γ_r$ denotes the $r^{th}$ norm of a Gaussian, and $p^*$ is the dual norm of $p$.
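As an illustration of the definition of $\|A\|_{p\rightarrow q}$ above, the following sketch evaluates the ratio $\|Ax\|_q / \|x\|_p$ at a single candidate vector $x$. Any such value is only a lower bound on the norm. The helper names `lp_norm` and `norm_ratio` are assumptions, and this is not one of the algorithms or reductions from the paper.

```python
import math

def lp_norm(v, p):
    """l_p norm of a vector for p >= 1, with p = math.inf giving the max norm."""
    if math.isinf(p):
        return float(max(abs(c) for c in v))
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def norm_ratio(A, x, p, q):
    """||Ax||_q / ||x||_p for a nonzero x; a lower bound on the p->q norm of A."""
    denom = lp_norm(x, p)
    if denom == 0.0:
        raise ValueError("x must be nonzero")
    Ax = [sum(a * c for a, c in zip(row, x)) for row in A]
    return lp_norm(Ax, q) / denom

# Hypothetical example: the identity matrix with p = q = 2 (spectral norm case).
identity_ratio = norm_ratio([[1.0, 0.0], [0.0, 1.0]], [3.0, 4.0], 2, 2)
```

Maximizing this ratio over all nonzero $x$ gives the norm itself, and the hardness results listed above concern how well that maximum can be approximated.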
Submitted 8 August, 2018; v1 submitted 21 February, 2018;
originally announced February 2018.
-
Weak Decoupling, Polynomial Folds, and Approximate Optimization over the Sphere
Authors:
Vijay Bhattiprolu,
Mrinalkanti Ghosh,
Venkatesan Guruswami,
Euiwoong Lee,
Madhur Tulsiani
Abstract:
We consider the following basic problem: given an $n$-variate degree-$d$ homogeneous polynomial $f$ with real coefficients, compute a unit vector $x \in \mathbb{R}^n$ that maximizes $|f(x)|$. Besides its fundamental nature, this problem arises in diverse contexts ranging from tensor and operator norms to graph expansion to quantum information theory. The homogeneous degree $2$ case is efficiently solvable as it corresponds to computing the spectral norm of an associated matrix, but the higher degree case is NP-hard.
We give approximation algorithms for this problem that offer a trade-off between the approximation ratio and running time: in $n^{O(q)}$ time, we get an approximation within factor $O_d((n/q)^{d/2-1})$ for arbitrary polynomials, $O_d((n/q)^{d/4-1/2})$ for polynomials with non-negative coefficients, and $O_d(\sqrt{m/q})$ for sparse polynomials with $m$ monomials. The approximation guarantees are with respect to the optimum of the level-$q$ sum-of-squares (SoS) SDP relaxation of the problem. Known polynomial time algorithms for this problem rely on "decoupling lemmas." Such tools are not capable of offering a trade-off like our results as they blow up the number of variables by a factor equal to the degree. We develop new decoupling tools that are more efficient in the number of variables at the expense of less structure in the output polynomials. This enables us to harness the benefits of higher level SoS relaxations.
We complement our algorithmic results with some polynomially large integrality gaps, albeit for a slightly weaker (but still very natural) relaxation. Toward this, we give a method to lift a level-$4$ solution matrix $M$ to a higher level solution, under a mild technical condition on $M$.
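The remark that the degree-$2$ case is efficiently solvable can be sketched as follows: for $f(x) = x^T M x$ with $M$ symmetric, the maximum of $|f(x)|$ over unit vectors is the largest eigenvalue of $M$ in absolute value, which power iteration estimates. This is a minimal illustration under the assumptions that $M$ has a unique eigenvalue of largest magnitude and that the fixed starting vector is not orthogonal to its eigenvector. The function name is hypothetical, and the code is unrelated to the paper's SoS-based algorithms for higher degree.

```python
import math

def max_abs_quadratic_form(M, iters=500):
    """Estimate max over unit x of |x^T M x| for symmetric M via power iteration."""
    n = len(M)
    x = [1.0 / math.sqrt(n)] * n  # fixed start; assumed not orthogonal to the top eigenvector
    for _ in range(iters):
        y = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in y))
        if norm == 0.0:
            return 0.0  # start vector lies in the kernel of M
        x = [c / norm for c in y]
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Rayleigh quotient of the (unit) iterate, in absolute value.
    return abs(sum(x[i] * Mx[i] for i in range(n)))

estimate = max_abs_quadratic_form([[2.0, 0.0], [0.0, -3.0]])
```

For the diagonal example, the eigenvalues are $2$ and $-3$, so the estimate converges to $3$.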
Submitted 22 April, 2017; v1 submitted 18 November, 2016;
originally announced November 2016.
-
From Weak to Strong LP Gaps for all CSPs
Authors:
Mrinalkanti Ghosh,
Madhur Tulsiani
Abstract:
We study the approximability of constraint satisfaction problems (CSPs) by linear programming (LP) relaxations. We show that for every CSP, the approximation obtained by a basic LP relaxation, is no weaker than the approximation obtained using relaxations given by $Ω\left(\frac{\log n}{\log \log n}\right)$ levels of the Sherali-Adams hierarchy on instances of size $n$.
It was proved by Chan et al. [FOCS 2013] that any polynomial size LP extended formulation is no stronger than relaxations obtained by a super-constant number of levels of the Sherali-Adams hierarchy. Combining this with our result also implies that any polynomial size LP extended formulation is no stronger than the basic LP.
Using our techniques, we also simplify and strengthen the result by Khot et al. [STOC 2014] on (strong) approximation resistance for LPs. They provided a necessary and sufficient condition under which $Ω(\log \log n)$ levels of the Sherali-Adams hierarchy cannot achieve an approximation better than a random assignment. We simplify their proof and strengthen the bound to $Ω\left(\frac{\log n}{\log \log n}\right)$ levels.
Submitted 1 August, 2016;
originally announced August 2016.
-
Algorithmic regularity for polynomials and applications
Authors:
Arnab Bhattacharyya,
Pooya Hatami,
Madhur Tulsiani
Abstract:
In analogy with the regularity lemma of Szemerédi, regularity lemmas for polynomials shown by Green and Tao (Contrib. Discrete Math. 2009) and by Kaufman and Lovett (FOCS 2008) modify a given collection of polynomials $\mathcal{F} = \{P_1,\ldots,P_m\}$ to a new collection $\mathcal{F}'$ so that the polynomials in $\mathcal{F}'$ are "pseudorandom". These lemmas have various applications, such as (special cases of) Reed-Muller testing and worst-case to average-case reductions for polynomials. However, the transformation from $\mathcal{F}$ to $\mathcal{F}'$ is not algorithmic for either regularity lemma. We define new notions of regularity for polynomials, which are analogous to the above, but which allow for an efficient algorithm to compute the pseudorandom collection $\mathcal{F}'$. In particular, when the field is of high characteristic, in polynomial time, we can refine $\mathcal{F}$ into $\mathcal{F}'$ where every nonzero linear combination of polynomials in $\mathcal{F}'$ has desirably small Gowers norm.
Using the algorithmic regularity lemmas, we show that if a polynomial P of degree d is within (normalized) Hamming distance 1 - 1/|F| - ε of some unknown polynomial of degree k over a prime field F (for k < d < |F|), then there is an efficient algorithm for finding a degree-k polynomial Q, which is within distance 1 - 1/|F| - η of P, for some η depending on ε. This can be thought of as decoding the Reed-Muller code of order k beyond the list decoding radius (finding one close codeword), when the received word P itself is a polynomial of degree d (with k < d < |F|).
We also obtain an algorithmic version of the worst-case to average-case reductions by Kaufman and Lovett. They show that if a polynomial of degree d can be weakly approximated by a polynomial of lower degree, then it can be computed exactly using a collection of polynomials of degree at most d-1. We give an efficient (randomized) algorithm to find this collection.
Submitted 13 November, 2013;
originally announced November 2013.
-
A Characterization of Approximation Resistance
Authors:
Subhash Khot,
Madhur Tulsiani,
Pratik Worah
Abstract:
A predicate f:{-1,1}^k -> {0,1} with ρ(f) = \frac{|f^{-1}(1)|}{2^k} is called {\it approximation resistant} if given a near-satisfiable instance of CSP(f), it is computationally hard to find an assignment that satisfies at least ρ(f)+Ω(1) fraction of the constraints.
We present a complete characterization of approximation resistant predicates under the Unique Games Conjecture. We also present characterizations in the {\it mixed} linear and semi-definite programming hierarchy and the Sherali-Adams linear programming hierarchy. In the former case, the characterization coincides with the one based on UGC. Each of the two characterizations is in terms of the existence of a probability measure with certain symmetry properties on a natural convex polytope associated with the predicate.
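As a small illustration of the density ρ(f) defined above, the following Python sketch enumerates {-1,1}^k by brute force. The function names and the two example predicates (3-XOR and 3-OR, with -1 encoding "true") are choices made here for illustration and are not taken from the paper.

```python
from fractions import Fraction
from itertools import product


def density(f, k):
    """Return rho(f) = |f^{-1}(1)| / 2^k for a predicate f: {-1,1}^k -> {0,1}."""
    accepted = sum(f(x) for x in product((-1, 1), repeat=k))
    return Fraction(accepted, 2 ** k)


def xor3(x):
    # Illustrative predicate: accepts when the product of the inputs is -1.
    return int(x[0] * x[1] * x[2] == -1)


def or3(x):
    # Illustrative predicate: with -1 encoding "true", accepts unless all inputs are +1.
    return int(any(v == -1 for v in x))


print(density(xor3, 3))  # fraction of assignments accepted by 3-XOR
print(density(or3, 3))   # fraction of assignments accepted by 3-OR
```

A uniformly random assignment satisfies a ρ(f) fraction of the constraints in expectation, which is the baseline that approximation resistance refers to.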
Submitted 23 October, 2013; v1 submitted 23 May, 2013;
originally announced May 2013.
-
An Arithmetic Analogue of Fox's Triangle Removal Argument
Authors:
Pooya Hatami,
Sushant Sachdeva,
Madhur Tulsiani
Abstract:
We give an arithmetic version of the recent proof of the triangle removal lemma by Fox [Fox11], for the group $\mathbb{F}_2^n$.
A triangle in $\mathbb{F}_2^n$ is a triple $(x,y,z)$ such that $x+y+z = 0$. The triangle removal lemma for $\mathbb{F}_2^n$ states that for every $ε> 0$ there is a $δ> 0$, such that if a subset $A$ of $\mathbb{F}_2^n$ requires the removal of at least $ε\cdot 2^n$ elements to make it triangle-free, then it must contain at least $δ\cdot 2^{2n}$ triangles. This problem was first studied by Green [Gre05] who proved a lower bound on $δ$ using an arithmetic regularity lemma. Regularity based lower bounds for triangle removal in graphs were recently improved by Fox and we give a direct proof of an analogous improvement for triangle removal in $\mathbb{F}_2^n$.
The improved lower bound was already known to follow (for triangle-removal in all groups), using Fox's removal lemma for directed cycles and a reduction by Král, Serra and Vena [KSV09] (see [Fox11,CF13]). The purpose of this note is to provide a direct Fourier-analytic proof for the group $\mathbb{F}_2^n.$
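The definition of a triangle in $\mathbb{F}_2^n$ can be made concrete with a brute-force count. This is only a sketch of the definition, not of the proof: it assumes elements of $\mathbb{F}_2^n$ are encoded as integers so that addition is bitwise XOR, and it counts ordered triples, including degenerate ones that contain 0.

```python
from itertools import product


def count_triangles(A):
    """Count ordered triples (x, y, z) in A^3 with x + y + z = 0 in F_2^n.

    Elements are encoded as integers, so addition in F_2^n is bitwise XOR
    and the condition x + y + z = 0 is equivalent to z = x ^ y.
    """
    members = set(A)
    return sum(1 for x, y in product(members, repeat=2) if (x ^ y) in members)


print(count_triangles(range(4)))   # A = all of F_2^2
print(count_triangles([1, 2, 4]))  # the three unit vectors of F_2^3
```

When A is all of $\mathbb{F}_2^n$ the count is $2^{2n}$, which is the scale against which the $δ\cdot 2^{2n}$ bound in the lemma is measured.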
Submitted 1 February, 2016; v1 submitted 17 April, 2013;
originally announced April 2013.
-
Sampling-based proofs of almost-periodicity results and algorithmic applications
Authors:
Eli Ben-Sasson,
Noga Ron-Zewi,
Madhur Tulsiani,
Julia Wolf
Abstract:
We give new combinatorial proofs of known almost-periodicity results for sumsets of sets with small doubling in the spirit of Croot and Sisask, whose almost-periodicity lemma has had far-reaching implications in additive combinatorics. We provide an alternative (and L^p-norm free) point of view, which allows for proofs to easily be converted to probabilistic algorithms that decide membership in almost-periodic sumsets of dense subsets of F_2^n.
As an application, we give a new algorithmic version of the quasipolynomial Bogolyubov-Ruzsa lemma recently proved by Sanders. Together with the results by the last two authors, this implies an algorithmic version of the quadratic Goldreich-Levin theorem in which the number of terms in the quadratic Fourier decomposition of a given function is quasipolynomial in the error parameter, compared with an exponential dependence previously proved by the authors. It also improves the running time of the algorithm to have quasipolynomial dependence instead of an exponential one.
We also give an application to the problem of finding large subspaces in sumsets of dense sets. Green showed that the sumset of a dense subset of F_2^n contains a large subspace. Using Fourier analytic methods, Sanders proved that such a subspace must have dimension bounded below by a constant times the density times n. We provide an alternative (and L^p norm-free) proof of a comparable bound, which is analogous to a recent result of Croot, Laba and Sisask in the integers.
Submitted 25 October, 2012;
originally announced October 2012.
-
Quadratic Goldreich-Levin Theorems
Authors:
Madhur Tulsiani,
Julia Wolf
Abstract:
Decomposition theorems in classical Fourier analysis enable us to express a bounded function in terms of few linear phases with large Fourier coefficients plus a part that is pseudorandom with respect to linear phases. The Goldreich-Levin algorithm can be viewed as an algorithmic analogue of such a decomposition as it gives a way to efficiently find the linear phases associated with large Fourier coefficients.
In the study of "quadratic Fourier analysis", higher-degree analogues of such decompositions have been developed in which the pseudorandomness property is stronger but the structured part correspondingly weaker. For example, it has previously been shown that it is possible to express a bounded function as a sum of a few quadratic phases plus a part that is small in the $U^3$ norm, defined by Gowers for the purpose of counting arithmetic progressions of length 4. We give a polynomial time algorithm for computing such a decomposition.
A key part of the algorithm is a local self-correction procedure for Reed-Muller codes of order 2 (over $\mathbb{F}_2^n$) for a function at distance $1/2-ε$ from a codeword. Given a function $f:\mathbb{F}_2^n \to \{-1,1\}$ at fractional Hamming distance $1/2-ε$ from a quadratic phase (which is a codeword of the Reed-Muller code of order 2), we give an algorithm that runs in time polynomial in $n$ and finds a codeword at distance at most $1/2-η$ for $η= η(ε)$. This is an algorithmic analogue of Samorodnitsky's result, which gave a tester for the above problem. To our knowledge, it represents the first instance of a correction procedure beyond the list-decoding radius for any class of codes.
In the process, we give algorithmic versions of results from additive combinatorics used in Samorodnitsky's proof and a refined version of the inverse theorem for the Gowers $U^3$ norm over $\mathbb{F}_2^n$.
Submitted 22 May, 2011;
originally announced May 2011.
-
Cuts in Cartesian Products of Graphs
Authors:
Sushant Sachdeva,
Madhur Tulsiani
Abstract:
The k-fold Cartesian product of a graph G is defined as a graph on k-tuples of vertices, where two tuples are connected if they form an edge in one of the positions and are equal in the rest. Starting with G as a single edge gives G^k as a k-dimensional hypercube. We study the distributions of edges crossed by a cut in G^k across the copies of G in different positions. This is a generalization of the notion of influences for cuts on the hypercube.
We show the analogues of results of Kahn, Kalai, and Linial (KKL Theorem [KahnKL88]) and that of Friedgut (Friedgut's Junta theorem [Friedgut98]), for the setting of Cartesian products of arbitrary graphs. Our proofs extend the arguments of Rossignol [Rossignol06] and of Falik and Samorodnitsky [FalikS07], to the case of arbitrary Cartesian products. We also extend the work on studying isoperimetric constants for these graphs [HoudreT96, ChungT98] to the value of semidefinite relaxations for edge-expansion. We connect the optimal values of the relaxations for computing expansion, given by various semidefinite hierarchies, for G and G^k.
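The definition of the k-fold Cartesian product, and the per-position count of edges crossed by a cut, can be sketched in Python as follows. The function names and the representation (a simple undirected graph given as vertex and edge lists, a cut given as a membership function) are assumptions made here for illustration.

```python
from itertools import product


def cartesian_power(vertices, edges, k):
    """Build the k-fold Cartesian product of a simple undirected graph G.

    Returns the list of k-tuples and a dict mapping each product edge
    (a frozenset of two tuples) to the position in which its endpoints differ.
    """
    adj = {frozenset(e) for e in edges}
    verts = list(product(vertices, repeat=k))
    position_of = {}
    for u in verts:
        for i in range(k):
            for w in vertices:
                # u and v are adjacent if they form an edge of G in
                # position i and agree in every other position.
                if frozenset((u[i], w)) in adj:
                    v = u[:i] + (w,) + u[i + 1:]
                    position_of[frozenset((u, v))] = i
    return verts, position_of


def crossed_by_position(position_of, k, in_cut):
    """Count, for each position, the product edges crossed by the cut."""
    counts = [0] * k
    for edge, i in position_of.items():
        u, v = tuple(edge)
        if in_cut(u) != in_cut(v):
            counts[i] += 1
    return counts


# Starting from a single edge gives the k-dimensional hypercube.
verts, position_of = cartesian_power([0, 1], [(0, 1)], 3)
print(len(verts), len(position_of))
print(crossed_by_position(position_of, 3, lambda u: u[0] == 1))
```

For the hypercube, a cut that depends only on the first coordinate crosses edges in the first position only, matching the usual notion of influence that the abstract generalizes.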
Submitted 24 September, 2013; v1 submitted 17 May, 2011;
originally announced May 2011.
-
Reductions Between Expansion Problems
Authors:
Prasad Raghavendra,
David Steurer,
Madhur Tulsiani
Abstract:
The Small-Set Expansion Hypothesis (Raghavendra, Steurer, STOC 2010) is a natural hardness assumption concerning the problem of approximating the edge expansion of small sets in graphs. This hardness assumption is closely connected to the Unique Games Conjecture (Khot, STOC 2002). In particular, the Small-Set Expansion Hypothesis implies the Unique Games Conjecture (Raghavendra, Steurer, STOC 2010).
Our main result is that the Small-Set Expansion Hypothesis is in fact equivalent to a variant of the Unique Games Conjecture. More precisely, the hypothesis is equivalent to the Unique Games Conjecture restricted to instances with a fairly mild condition on the expansion of small sets. Alongside, we obtain the first strong hardness of approximation results for the Balanced Separator and Minimum Linear Arrangement problems. Before, no such hardness was known for these problems even assuming the Unique Games Conjecture.
These results not only establish the Small-Set Expansion Hypothesis as a natural unifying hypothesis that implies the Unique Games Conjecture, all its consequences and, in addition, hardness results for other problems like Balanced Separator and Minimum Linear Arrangement, but also show that the Small-Set Expansion problem lies at the combinatorial heart of the Unique Games Conjecture.
The key technical ingredient is a new way of exploiting the structure of the Unique Games instances obtained from the Small-Set Expansion Hypothesis via (Raghavendra, Steurer, 2010). This additional structure allows us to modify standard reductions in a way that essentially destroys their local-gadget nature. Using this modification, we can argue about the expansion in the graphs produced by the reduction without relying on expansion properties of the underlying Unique Games instance (which would be impossible for a local-gadget reduction).
Submitted 11 November, 2010;
originally announced November 2010.
-
On the Optimality of a Class of LP-based Algorithms
Authors:
Amit Kumar,
Rajsekar Manokaran,
Madhur Tulsiani,
Nisheeth K. Vishnoi
Abstract:
In this paper we will be concerned with a class of packing and covering problems which includes Vertex Cover and Independent Set. Typically, one can write an LP relaxation and then round the solution. In this paper, we explain why the simple LP-based rounding algorithm for the Vertex Cover problem is optimal assuming the UGC. Complementing Raghavendra's result, our result generalizes to a class of strict, covering/packing type CSPs.
Submitted 9 December, 2009;
originally announced December 2009.
-
Algorithms and Hardness for Subspace Approximation
Authors:
Amit Deshpande,
Kasturi Varadarajan,
Madhur Tulsiani,
Nisheeth K. Vishnoi
Abstract:
The subspace approximation problem Subspace($k$,$p$) asks for a $k$-dimensional linear subspace that fits a given set of points optimally, where the error for fitting is a generalization of the least squares fit and uses the $\ell_{p}$ norm instead. Most of the previous work on subspace approximation has focused on small or constant $k$ and $p$, using coresets and sampling techniques from computational geometry.
In this paper, extending another line of work based on convex relaxation and rounding, we give a polynomial time algorithm, \emph{for any $k$ and any $p \geq 2$}, with the approximation guarantee roughly $γ_{p} \sqrt{2 - \frac{1}{n-k}}$, where $γ_{p}$ is the $p$-th moment of a standard normal random variable N(0,1). We show that the convex relaxation we use has an integrality gap (or "rank gap") of $γ_{p} (1 - ε)$, for any constant $ε> 0$. Finally, we show that assuming the Unique Games Conjecture, the subspace approximation problem is hard to approximate within a factor better than $γ_{p} (1 - ε)$, for any constant $ε> 0$.
Submitted 30 December, 2010; v1 submitted 7 December, 2009;
originally announced December 2009.