research-article

Geometric median in nearly linear time

Authors:

Michael B. Cohen,

Jakub Pachocki,

Aaron Sidford

STOC '16: Proceedings of the forty-eighth annual ACM symposium on Theory of Computing

Pages 9 - 21

https://doi.org/10.1145/2897518.2897647

Published: 19 June 2016

Abstract

In this paper we provide faster algorithms for solving the geometric median problem: given $n$ points in $\mathbb{R}^d$, compute a point that minimizes the sum of Euclidean distances to the points. This is one of the oldest non-trivial problems in computational geometry, yet despite a long history of research the previous fastest running times for computing a $(1+\epsilon)$-approximate geometric median were $O(d \cdot n^{4/3} \epsilon^{-8/3})$ by Chin et al., $\tilde{O}(d \exp(\epsilon^{-4} \log \epsilon^{-1}))$ by Badoiu et al., $O(nd + \mathrm{poly}(d, \epsilon^{-1}))$ by Feldman and Langberg, and the polynomial running time of $O((nd)^{O(1)} \log \frac{1}{\epsilon})$ by Parrilo and Sturmfels and by Xue and Ye.
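The problem statement above can be written out as follows. This is only a restatement in symbols: the names $a_1, \dots, a_n$ for the input points, $f$ for the objective, $x_*$ for an exact minimizer, and $\hat{x}$ for the returned point are notation assumed here, not taken from the abstract.

```latex
\[
  f(x) = \sum_{i=1}^{n} \lVert x - a_i \rVert_2,
  \qquad
  x_* \in \operatorname*{arg\,min}_{x \in \mathbb{R}^d} f(x),
  \qquad
  f(\hat{x}) \le (1+\epsilon)\, f(x_*).
\]
```

A point $\hat{x}$ satisfying the last inequality is what the abstract calls a $(1+\epsilon)$-approximate geometric median.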

In this paper we show how to compute such an approximate geometric median in time $O(nd \log^3 \frac{n}{\epsilon})$ and $O(d\epsilon^{-2})$. While our $O(d\epsilon^{-2})$ time algorithm is a fairly straightforward application of stochastic subgradient descent, our $O(nd \log^3 \frac{n}{\epsilon})$ time algorithm is a novel long step interior point method. We start with a simple $O((nd)^{O(1)} \log \frac{1}{\epsilon})$ time interior point method and show how to improve it, ultimately building an algorithm that is quite non-standard from the perspective of the interior point literature. Our result is one of the few cases of outperforming standard interior point theory. Furthermore, it is the only case we know of where interior point methods yield a nearly linear time algorithm for a canonical optimization problem that traditionally requires superlinear time.
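To make the stochastic subgradient descent idea mentioned above concrete, here is a minimal sketch in Python. It is a generic textbook variant, not the authors' procedure: the function names, the starting point, the step size `radius / sqrt(t)`, and the use of iterate averaging are all assumptions made for illustration, and the sketch makes no claim to achieve the stated running time.

```python
import numpy as np


def geometric_median_objective(x, points):
    """Sum of Euclidean distances from x to every row of points."""
    return float(np.linalg.norm(points - x, axis=1).sum())


def sgd_geometric_median(points, num_steps=20000, seed=0):
    """Stochastic subgradient descent sketch (hypothetical helper).

    Each step samples one input point and moves against a subgradient
    of the distance to that point. Step size and averaging are assumed
    choices, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = points.shape
    x = points[rng.integers(n)].astype(float)  # start at a random input point
    # Assumed length scale for the step size: farthest point from the start.
    radius = float(np.linalg.norm(points - x, axis=1).max())
    avg = np.zeros(d)
    for t in range(1, num_steps + 1):
        a = points[rng.integers(n)]
        diff = x - a
        dist = np.linalg.norm(diff)
        # Subgradient of ||x - a||: unit vector pointing away from a;
        # the zero vector is a valid subgradient when x coincides with a.
        g = diff / dist if dist > 0 else np.zeros(d)
        x = x - (radius / np.sqrt(t)) * g
        avg += (x - avg) / t  # running average of the iterates
    return avg


if __name__ == "__main__":
    pts = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    x_hat = sgd_geometric_median(pts)
    print(x_hat, geometric_median_objective(x_hat, pts))
```

Each iteration touches a single point and costs time proportional to $d$, which is why the total work of such a method can be independent of $n$; how many iterations are needed for a $(1+\epsilon)$ guarantee depends on the initialization and step sizes, which this sketch does not analyze.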

References

[1]

M. Badoiu, S. Har-Peled, and P. Indyk. Approximate clustering via core-sets. In Proceedings on 34th Annual ACM Symposium on Theory of Computing, May 19-21, 2002, Montréal, Québec, Canada, pages 250–257, 2002.

[2]

C. Bajaj. The algebraic degree of geometric optimization problems. Discrete & Computational Geometry, 3(2):177–191, 1988.

[3]

E. Balas and C.-S. Yu. A note on the Weiszfeld-Kuhn algorithm for the general Fermat problem. Management Science Research Report, (484):1–6, 1982.

[4]

P. Bose, A. Maheshwari, and P. Morin. Fast approximations for sums of distances, clustering and the Fermat-Weber problem. Computational Geometry, 24(3):135 – 146, 2003.

[5]

S. Bubeck. Theory of convex optimization for machine learning. arXiv preprint arXiv:1405.4980, 2014.

[6]

R. Chandrasekaran and A. Tamir. Open questions concerning Weiszfeld’s algorithm for the Fermat-Weber location problem. Mathematical Programming, 44(1-3):293–295, 1989.

[7]

H. H. Chin, A. Madry, G. L. Miller, and R. Peng. Runtime guarantees for regression problems. In ITCS, pages 269–282, 2013.

[8]

L. Cooper and I. Katz. The Weber problem revisited. Computers and Mathematics with Applications, 7(3):225–234, 1981.

[9]

Z. Drezner, K. Klamroth, A. Schöbel, and G. Wesolowsky. Facility location, chapter The Weber problem, pages 1–36. Springer, 2002.

[10]

D. Feldman and M. Langberg. A unified framework for approximating and clustering data. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 569–578. ACM, 2011.

[11]

C. C. Gonzaga. Path-following methods for linear programming. SIAM review, 34(2):167–224, 1992.

[12]

S. Har-Peled and A. Kushal. Smaller coresets for k-median and k-means clustering. In Proceedings of the twenty-first annual symposium on Computational geometry, pages 126–134. ACM, 2005.

[13]

P. Indyk. High-dimensional computational geometry. Computer Science Department, Stanford University, 2000.

[14]

J. Krarup and S. Vajda. On Torricelli’s geometrical solution to a problem of Fermat. IMA Journal of Management Mathematics, 8(3):215–224, 1997.

[15]

R. A. Kronmal and A. V. Peterson. The alias and alias-rejection-mixture methods for generating random variables from probability distributions. In Proceedings of the 11th Conference on Winter Simulation - Volume 1, WSC ’79, pages 269–280, Piscataway, NJ, USA, 1979. IEEE Press.

[16]

H. Kuhn. A note on Fermat’s problem. Mathematical Programming, 4(1):98–107, 1973.

[17]

Y. T. Lee and A. Sidford. Path-finding methods for linear programming: Solving linear programs in Õ(√rank) iterations and faster algorithms for maximum flow. In 55th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2014, 18-21 October, 2014, Philadelphia, PA, USA, pages 424–433, 2014.

[18]

H. P. Lopuhaa and P. J. Rousseeuw. Breakdown points of affine equivariant estimators of multivariate location and covariance matrices. Ann. Statist., 19(1):229–248, 03 1991.

[19]

H. P. Lopuhaa and P. J. Rousseeuw. Breakdown points of affine equivariant estimators of multivariate location and covariance matrices. The Annals of Statistics, pages 229–248, 1991.

[20]

A. Madry. Navigating central path with electrical flows: from flows to matchings, and back. In Proceedings of the 54th Annual Symposium on Foundations of Computer Science, 2013.

[21]

Y. Nesterov. Introductory Lectures on Convex Optimization: A Basic Course, volume I. 2003.

[22]

Y. Nesterov and A. S. Nemirovskii. Interior-point polynomial algorithms in convex programming, volume 13. Society for Industrial and Applied Mathematics, 1994.

[23]

L. M. Ostresh. On the convergence of a class of iterative methods for solving the Weber location problem. Operations Research, 26(4):597–609, 1978.

[24]

P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In DIMACS Workshop on Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, March 12-16, 2001, DIMACS Center, Rutgers University, Piscataway, NJ, USA, pages 83–100, 2001.

[25]

F. Plastria and M. Elosmani. On the convergence of the Weiszfeld algorithm for continuous single facility location allocation problems. TOP, 16(2):388–406, 2008.

[26]

J. Renegar. A polynomial-time algorithm, based on Newton’s method, for linear programming. Mathematical Programming, 40(1-3):59–93, 1988.

[27]

Y. Vardi and C.-H. Zhang. The multivariate L1-median and associated data depth. Proceedings of the National Academy of Sciences, 97(4):1423–1426, 2000.

[28]

V. Viviani. De maximis et minimis geometrica divinatio liber 2. De Maximis et Minimis Geometrica Divinatio, 1659.

[29]

A. Weber. The Theory of the Location of Industries. Chicago University Press, 1909. Über den Standort der Industrien.

[30]

E. Weiszfeld. Sur le point pour lequel la somme des distances de n points donnés est minimum. Tohoku Mathematical Journal, pages 355–386, 1937.

[31]

G. Xue and Y. Ye. An efficient algorithm for minimizing a sum of Euclidean norms with applications. SIAM Journal on Optimization, 7:1017–1036, 1997.

[32]

Y. Ye. Interior point algorithms: theory and analysis, volume 44. John Wiley & Sons, 2011.

Index Terms

Geometric median in nearly linear time
1. Mathematics of computing
  1. Mathematical analysis
    1. Mathematical optimization
2. Theory of computation
  1. Design and analysis of algorithms
    1. Mathematical optimization

Published In

STOC '16: Proceedings of the forty-eighth annual ACM symposium on Theory of Computing

June 2016

1141 pages

ISBN:9781450341325

DOI:10.1145/2897518

General Chair: Daniel Wichs, Northeastern, USA

Program Chair: Yishay Mansour, Tel Aviv

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGACT: ACM Special Interest Group on Algorithms and Computation Theory

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Science Foundation

Conference

STOC '16

Sponsor:

SIGACT

STOC '16: Symposium on Theory of Computing

June 19 - 21, 2016

Cambridge, MA, USA

Acceptance Rates

Overall Acceptance Rate 1,469 of 4,586 submissions, 32%
