research-article

Streaming Algorithms for Geometric Steiner Forest

Published: 05 August 2024

Abstract

We consider a generalization of the Steiner tree problem, the Steiner forest problem, in the Euclidean plane: the input is a multiset \(X\subseteq\mathbb{R}^{2}\), partitioned into \(k\) color classes \(C_{1},\ldots,C_{k}\subseteq X\). The goal is to find a minimum-cost Euclidean graph \(G\) such that every color class \(C_{i}\) is connected in \(G\). We study this Steiner forest problem in the streaming setting, where the stream consists of insertions and deletions of points to \(X\). Each input point \(x\in X\) arrives with its color \(\mathsf{color}(x)\in[k]\), and as usual for dynamic geometric streams, the input is restricted to the discrete grid \(\{1,\ldots,\Delta\}^{2}\).
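To make the input model concrete, the following is a minimal offline sketch, not the streaming algorithm of the article. It replays a stream of insertions and deletions of colored grid points and returns the sum of the minimum spanning tree costs of the surviving color classes. Connecting each class by its own spanning tree is a feasible solution, so this value is an upper bound on the optimal Steiner forest cost; it can exceed the optimum, because an optimal forest may share edges between classes and may use Steiner points. The stream encoding `(op, point, color)` and the names `mst_cost` and `per_color_mst_upper_bound` are assumptions made for this illustration, and the sketch stores every point, so it uses linear rather than \(\operatorname{poly}(k\cdot\log\Delta)\) space.

```python
import math
from collections import Counter


def mst_cost(points):
    """Cost of a Euclidean minimum spanning tree (Prim's algorithm, O(n^2) time)."""
    pts = list(points)
    if len(pts) <= 1:
        return 0.0
    # best[i] = distance from point i to the tree built so far
    best = {i: math.dist(pts[0], pts[i]) for i in range(1, len(pts))}
    total = 0.0
    while best:
        j = min(best, key=best.get)   # closest point not yet in the tree
        total += best.pop(j)
        for i in best:                # relax distances through the new point
            best[i] = min(best[i], math.dist(pts[j], pts[i]))
    return total


def per_color_mst_upper_bound(stream, k, delta):
    """Replay a dynamic stream and sum the MST costs of the k color classes.

    Hypothetical encoding: each item is (op, (x, y), color) with op in {"+", "-"}.
    """
    alive = Counter()  # multiplicity of each (point, color) pair
    for op, point, color in stream:
        assert 1 <= color <= k and all(1 <= c <= delta for c in point)
        alive[(point, color)] += 1 if op == "+" else -1
    classes = {}
    for (point, color), mult in alive.items():
        if mult > 0:
            # duplicate copies are at distance 0, so distinct points suffice
            classes.setdefault(color, set()).add(point)
    return sum(mst_cost(pts) for pts in classes.values())


if __name__ == "__main__":
    stream = [
        ("+", (1, 1), 1), ("+", (1, 4), 1), ("+", (5, 1), 1),
        ("+", (2, 2), 2), ("+", (2, 5), 2),
        ("+", (9, 9), 2), ("-", (9, 9), 2),  # inserted, then deleted
    ]
    print(per_color_mst_upper_bound(stream, k=2, delta=10))  # 10.0
```

In the usage example, color 1 forms a 3-4-5 right triangle (spanning tree cost 7) and color 2 is a single segment of length 3 once the point (9, 9) is deleted, which gives 10.0.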
We design a single-pass streaming algorithm that uses \(\operatorname{poly}(k\cdot\log\Delta)\) space and time, and estimates the cost of an optimal Steiner forest solution within ratio arbitrarily close to the famous Euclidean Steiner ratio \(\alpha_{2}\) (currently \(1.1547\leq\alpha_{2}\leq 1.214\)). This approximation guarantee matches the state-of-the-art bound for streaming Steiner tree, i.e., when \(k=1\), and it is a major open question to improve the ratio to \(1+\varepsilon\) even for this special case. Our approach relies on a novel combination of streaming techniques, like sampling and linear sketching, with the classical Arora-style dynamic-programming framework for geometric optimization problems, which usually requires large memory and so far has not been applied in the streaming setting.
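As a worked illustration of where the lower bound \(1.1547\) on the Steiner ratio comes from (the abstract does not spell this out; what follows is the standard textbook example), take the three corners of an equilateral triangle with unit side length. A minimum spanning tree uses two sides, whereas the optimal Steiner tree joins the corners to the center of the triangle by three segments of length \(1/\sqrt{3}\) each:

\[
\mathrm{MST} = 2,\qquad
\mathrm{SMT} = 3\cdot\frac{1}{\sqrt{3}} = \sqrt{3},\qquad
\frac{\mathrm{MST}}{\mathrm{SMT}} = \frac{2}{\sqrt{3}} \approx 1.1547 .
\]

Here \(\mathrm{MST}\) and \(\mathrm{SMT}\) are shorthand introduced for this example for the costs of a minimum spanning tree and a Steiner minimal tree. Since \(\alpha_{2}\) is the largest possible value of this ratio over planar point sets, this single instance already shows \(\alpha_{2}\geq 2/\sqrt{3}\).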
We complement our streaming algorithm for the Steiner forest problem with simple arguments showing that any finite multiplicative approximation requires \(\Omega(k)\) bits of space.

Published In

ACM Transactions on Algorithms, Volume 20, Issue 4
October 2024
272 pages
EISSN: 1549-6333
DOI: 10.1145/3613685
Editor: Edith Cohen

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 August 2024
Online AM: 07 May 2024
Accepted: 20 April 2024
Revised: 17 August 2023
Received: 03 October 2022
Published in TALG Volume 20, Issue 4

Author Tags

  1. Steiner forest
  2. Steiner tree
  3. streaming algorithms

Qualifiers

  • Research-article
