We show that adaptively sampled O(k) centers give a constant-factor bi-criteria approximation for the k-means problem, with constant probability. Moreover, these O(k) centers contain a subset of k centers which gives a constant-factor approximation and can be found using the LP-based techniques of Jain and Vazirani [JV01] and Charikar et al. [CGTS02]. Both algorithms run in effectively O(nkd) time and extend the O(log k)-approximation achieved by the k-means++ algorithm of Arthur and Vassilvitskii [AV07].
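The adaptive sampling at the heart of this result is the D² (squared-distance) seeding used by k-means++: the first center is uniform, and each subsequent center is drawn with probability proportional to its squared distance from the nearest center chosen so far. A minimal sketch in Python (function names are illustrative, not from the paper):

```python
import random

def squared_dist(x, c):
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

def d2_sample(points, num_centers, rng=random):
    """Adaptive (D^2) sampling as in k-means++ seeding: the first center is
    uniform; each later center is drawn with probability proportional to the
    squared distance to the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < num_centers:
        weights = [min(squared_dist(x, c) for c in centers) for x in points]
        total = sum(weights)
        r = rng.uniform(0, total)
        acc = 0.0
        for x, w in zip(points, weights):
            acc += w
            if w > 0 and acc >= r:
                centers.append(x)
                break
        else:
            break  # every point coincides with an existing center
    return centers
```

Because already-chosen centers have zero weight, each new draw lands on a distinct point, biasing the sample toward regions far from current centers.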
Efficient sampling, integration and optimization algorithms for logconcave functions [BV04, KV06, LV06a] rely on the good isoperimetry of these functions. We extend this to show that −1/(n−1)-concave functions have good isoperimetry, and moreover, using a characterization of functions based on their values along every line, we prove that this is the largest class of functions with good isoperimetry in the spectrum from concave to quasi-concave. We give an efficient sampling algorithm based on a random walk for −1/(n−1)-concave probability densities satisfying a smoothness criterion, which includes heavy-tailed densities such as the Cauchy density. In addition, the mixing time of this random walk for the Cauchy density matches the corresponding best known bounds for logconcave densities.
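As a toy illustration of random-walk sampling from a heavy-tailed target (this is a one-dimensional Metropolis chain, not the paper's walk or its smoothness criterion), consider the standard Cauchy density:

```python
import math
import random

def cauchy(x):
    # Standard Cauchy density 1/(pi(1 + x^2)): heavy-tailed, not logconcave.
    return 1.0 / (math.pi * (1.0 + x * x))

def metropolis_walk(steps, step_size=1.0, x0=0.0, rng=random):
    """Ball-walk-style Metropolis chain: propose a uniform local step and
    accept with probability min(1, f(y)/f(x)), so that the Cauchy density
    is the stationary distribution."""
    x = x0
    chain = []
    for _ in range(steps):
        y = x + rng.uniform(-step_size, step_size)
        if rng.random() < cauchy(y) / cauchy(x):
            x = y
        chain.append(x)
    return chain
```

The interesting question, which the abstract addresses, is how fast such a walk mixes when the target is only −1/(n−1)-concave rather than logconcave.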
Electronic Colloquium on Computational Complexity, 2006
We prove that any real matrix A contains a subset of at most 4k/ε + 2k log(k+1) rows whose span “contains” a matrix of rank at most k with error only (1+ε) times the error of the best rank-k approximation of A. We complement this with an almost matching lower bound by constructing matrices in which the span of any k/2ε rows does not “contain” a relative (1+ε)-approximation of rank k. Our existence result leads to an algorithm that finds such a rank-k approximation in time $O \left( M \left( \frac{k}{\epsilon} + k^{2} \log k \right) + (m+n) \left( \frac{k^{2}}{\epsilon^{2}} + \frac{k^{3} \log k}{\epsilon} + k^{4} \log^{2} k \right) \right)$, i.e., essentially O(Mk/ε), where M is the number of nonzero entries of A. The algorithm maintains sparsity, and in the streaming model [12,14,15] it can be implemented using only 2(k+1)(log(k+1)+1) passes over the input matrix and $O \left( \min \{ m, n \} \left( \frac{k}{\epsilon} + k^{2} \log k \right) \right)$ additional space. Previous algorithms for low-rank approximation use only one or two passes but obtain only an additive approximation.
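A simplified illustration of the underlying idea: select rows adaptively, each with probability proportional to its squared residual against the span of the rows already chosen (illustrative code only; the paper's algorithm additionally controls the number of passes and preserves sparsity):

```python
import random

def residual_row_norms(A, basis):
    """Squared norm of each row of A after projecting out an orthonormal basis."""
    norms = []
    for row in A:
        r = list(row)
        for b in basis:
            coef = sum(x * y for x, y in zip(r, b))
            r = [x - coef * y for x, y in zip(r, b)]
        norms.append(sum(x * x for x in r))
    return norms

def adaptive_row_sample(A, k, rng=random):
    """Pick up to k row indices, each with probability proportional to its
    squared residual against the span of the rows chosen so far; also return
    an orthonormal basis of the chosen rows (Gram-Schmidt)."""
    basis, chosen = [], []
    for _ in range(k):
        norms = residual_row_norms(A, basis)
        total = sum(norms)
        if total == 0:  # all rows already spanned: nothing left to gain
            break
        r, acc = rng.uniform(0, total), 0.0
        for i, w in enumerate(norms):
            acc += w
            if w > 0 and acc >= r:
                chosen.append(i)
                row = list(A[i])
                for b in basis:  # orthogonalize the new row against the basis
                    coef = sum(x * y for x, y in zip(row, b))
                    row = [x - coef * y for x, y in zip(row, b)]
                n = sum(x * x for x in row) ** 0.5
                basis.append([x / n for x in row])
                break
    return chosen, basis
```

Projecting A onto the returned basis gives a low-rank approximation whose error the paper bounds relative to the best rank-k approximation.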