
Interactive Coding with Small Memory and Improved Rate

Authors: Dorsa Fathollahi, Bernhard Haeupler, Nicolas Resch, Mary Wootters

Abstract: In this work, we study two-party interactive coding for adversarial noise, when both parties have limited memory. We show how to convert any adaptive protocol $Π$ into a protocol $Π'$ that is robust to an $ε$-fraction of adversarial corruptions, not too much longer than $Π$, and which uses small space. More precisely, if $Π$ requires space $\log(s)$ and has $|Π|$ rounds of communication, then $Π'$ requires $O_ε(\log s \log |Π|)$ memory, and has $$|Π'| = |Π|\cdot\left( 1 + O\left( \sqrt{ ε\log \log 1/ε} \right)\right)$$ rounds of communication. The above matches the best known communication rate, even for protocols with no space restrictions.

Submitted 12 August, 2024; originally announced August 2024.

arXiv:2406.14384 [pdf, ps, other]

doi 10.1145/3618260.3649689

Low-Step Multi-Commodity Flow Emulators

Authors: Bernhard Haeupler, D Ellis Hershkowitz, Jason Li, Antti Roeyskoe, Thatchaphol Saranurak

Abstract: We introduce the concept of low-step multi-commodity flow emulators for any undirected, capacitated graph. At a high level, these emulators contain approximate multi-commodity flows whose paths contain a small number of edges, shattering the infamous flow decomposition barrier for multi-commodity flow. We prove the existence of low-step multi-commodity flow emulators and develop efficient algorithms to compute them. We then apply them to solve constant-approximate $k$-commodity flow in $O((m+k)^{1+ε})$ time. To bypass the $O(mk)$ flow decomposition barrier, we represent our output multi-commodity flow implicitly; prior to our work, even the existence of implicit constant-approximate multi-commodity flows of size $o(mk)$ was unknown. Our results generalize to the minimum cost setting, where each edge has an associated cost and the multi-commodity flow must satisfy a cost budget. Our algorithms are also parallel.

Submitted 20 June, 2024; originally announced June 2024.

Comments: Appears at STOC 2024

arXiv:2404.13446 [pdf, other]

New Structures and Algorithms for Length-Constrained Expander Decompositions

Authors: Bernhard Haeupler, D Ellis Hershkowitz, Zihan Tan

Abstract: Expander decompositions form the basis of one of the most flexible paradigms for close-to-linear-time graph algorithms. Length-constrained expander decompositions generalize this paradigm to better work for problems with lengths, distances and costs. Roughly, an $(h,s)$-length $φ$-expander decomposition is a small collection of length increases to a graph so that nodes within distance $h$ can route flow over paths of length $hs$ with congestion at most $1/φ$. In this work, we give a close-to-linear time algorithm for computing length-constrained expander decompositions in graphs with general lengths and capacities. Notably, and unlike previous works, our algorithm allows one to trade off between the size of the decomposition and the length of routing paths: for any $ε> 0$ not too small, our algorithm computes in close-to-linear time an $(h,s)$-length $φ$-expander decomposition of size $m \cdot φ\cdot n^ε$ where $s = \exp(\text{poly}(1/ε))$. The key foundations of our algorithm are: (1) a simple yet powerful structural theorem which states that the union of a sequence of sparse length-constrained cuts is itself sparse and (2) new algorithms for efficiently computing sparse length-constrained flows.

Submitted 15 May, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

Comments: Added funding info

arXiv:2404.04552 [pdf, ps, other]

Fast and Simple Sorting Using Partial Information

Authors: Bernhard Haeupler, Richard Hladík, John Iacono, Vaclav Rozhon, Robert Tarjan, Jakub Tětek

Abstract: We consider the problem of sorting $n$ items, given the outcomes of $m$ pre-existing comparisons. We present a simple, natural deterministic algorithm that runs in $O(m+\log T)$ time and does $O(\log T)$ comparisons, where $T$ is the number of total orders consistent with the pre-existing comparisons. Our running time and comparison bounds are best possible up to constant factors, thus resolving a problem that has been studied intensely since 1976 (Fredman, Theoretical Computer Science). The best previous algorithm with a bound of $O(\lg T)$ on the number of comparisons has a time bound of $O(n^{2.5})$ and is more complicated. (A recent independent and concurrent work by Van der Hoog and Rutschmann implies an $O(n^{2.371552})$-time algorithm with $O(\log T)$ comparisons.) Our algorithm combines three classic algorithms: topological sort, heapsort with the right kind of heap, and efficient insertion into a sorted list. It outputs the items in sorted order one by one. As a result, it can be modified to solve the important and more general top-$k$ sorting problem: Given $k$ and the outcomes of some pre-existing comparisons, output the smallest $k$ items in sorted order. The modified algorithm solves the top-$k$ sorting problem in minimum time and comparisons, to within constant factors.

Submitted 22 July, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

ACM Class: F.2.2; G.2.2

arXiv:2403.07410 [pdf, ps, other]

Polylog-Competitive Deterministic Local Routing and Scheduling

Authors: Bernhard Haeupler, Shyamal Patel, Antti Roeyskoe, Cliff Stein, Goran Zuzic

Abstract: This paper addresses point-to-point packet routing in undirected networks, which is the most important communication primitive in most networks. The main result proves the existence of routing tables that guarantee a polylog-competitive completion-time $\textbf{deterministically}$: in any undirected network, it is possible to give each node simple stateless deterministic local forwarding rules, such that any adversarially chosen set of packets is delivered as fast as possible, up to polylog factors. All previous routing strategies crucially required randomization for both route selection and packet scheduling. The core technical contribution of this paper is a new local packet scheduling result of independent interest. This scheduling strategy integrates well with recent sparse semi-oblivious path selection strategies. Such strategies deterministically select not one but several candidate paths for each packet and require a global coordinator to select a single good path from those candidates for each packet. Another challenge is that, even if a single path is selected for each packet, no strategy for scheduling packets along low-congestion paths that is both local and deterministic is known. Our novel scheduling strategy utilizes the fact that every semi-oblivious routing strategy uses only a small (polynomial) subset of candidate routes. It overcomes the issue of global coordination by furthermore being provably robust to adversarial noise. This avoids the issue of having to choose a single path per packet because congestion caused by ineffective candidate paths can be treated as noise. Our results imply the first deterministic universally-optimal algorithms in the distributed supported-CONGEST model for many important global distributed tasks, including computing minimum spanning trees, approximate shortest paths, and part-wise aggregates.

Submitted 13 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

Comments: To appear at STOC 2024

arXiv:2402.18541 [pdf, ps, other]

Dynamic Deterministic Constant-Approximate Distance Oracles with $n^ε$ Worst-Case Update Time

Authors: Bernhard Haeupler, Yaowei Long, Thatchaphol Saranurak

Abstract: We present a new distance oracle in the fully dynamic setting: given a weighted undirected graph $G=(V,E)$ with $n$ vertices undergoing both edge insertions and deletions, and an arbitrary parameter $ε$ where $ε\in[1/\log^{c} n,1]$ and $c>0$ is a small constant, we can deterministically maintain a data structure with $n^ε$ worst-case update time that, given any pair of vertices $(u,v)$, returns a $2^{{\rm poly}(1/ε)}$-approximate distance between $u$ and $v$ in ${\rm poly}(1/ε)\log\log n$ query time. Our algorithm significantly advances the state-of-the-art in two aspects, both for fully dynamic algorithms and even decremental algorithms. First, no existing algorithm with worst-case update time guarantees a $o(n)$-approximation while also achieving an $n^{2-Ω(1)}$ update and $n^{o(1)}$ query time, while our algorithm offers a constant $O_ε(1)$-approximation with $n^ε$ update time and $O_ε(\log \log n)$ query time. Second, even if amortized update time is allowed, it is the first deterministic constant-approximation algorithm with $n^{1-Ω(1)}$ update and query time. The best result in this direction is the recent deterministic distance oracle by Chuzhoy and Zhang [STOC 2023] which achieves an approximation of $(\log\log n)^{2^{O(1/ε^{3})}}$ with amortized update time of $n^ε$ and query time of $2^{{\rm poly}(1/ε)}\log n\log\log n$. We obtain the result by dynamizing tools related to length-constrained expanders [Haeupler-Räcke-Ghaffari, STOC 2022; Haeupler-Hershkowitz-Tan, 2023; Haeupler-Huebotter-Ghaffari, 2022]. Our technique completely bypasses the 40-year-old Even-Shiloach tree, which has remained the most pervasive tool in the area but is inherently amortized.

Submitted 10 April, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: 137 pages

arXiv:2311.11793 [pdf, other]

Universal Optimality of Dijkstra via Beyond-Worst-Case Heaps

Authors: Bernhard Haeupler, Richard Hladík, Václav Rozhoň, Robert Tarjan, Jakub Tětek

Abstract: This paper proves that Dijkstra's shortest-path algorithm is universally optimal in both its running time and number of comparisons when combined with a sufficiently efficient heap data structure. Universal optimality is a powerful beyond-worst-case performance guarantee for graph algorithms that informally states that a single algorithm performs as well as possible for every single graph topology. We give the first application of this notion to any sequential algorithm. We design a new heap data structure with a working-set property guaranteeing that the heap takes advantage of locality in heap operations. Our heap matches the optimal (worst-case) bounds of Fibonacci heaps but also provides the beyond-worst-case guarantee that the cost of extracting the minimum element is merely logarithmic in the number of elements inserted after it instead of logarithmic in the number of all elements in the heap. This makes the extraction of recently added elements cheaper. We prove that our working-set property is sufficient to guarantee universal optimality, specifically, for the problem of ordering vertices by their distance from the source vertex: The locality in the sequence of heap operations generated by any run of Dijkstra's algorithm on a fixed topology is strong enough that one can couple the number of comparisons performed by any heap with our working-set property to the minimum number of comparisons required to solve the distance ordering problem on this topology.

Submitted 9 April, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

ACM Class: F.2.2; G.2.2

arXiv:2309.06696 [pdf, ps, other]

Fault-Tolerant Spanners against Bounded-Degree Edge Failures: Linearly More Faults, Almost For Free

Authors: Greg Bodwin, Bernhard Haeupler, Merav Parter

Abstract: We study a new and stronger notion of fault-tolerant graph structures whose size bounds depend on the degree of the failing edge set, rather than the total number of faults. For a subset of faulty edges $F \subseteq G$, the faulty-degree $\deg(F)$ is the largest number of faults in $F$ incident to any given vertex. We design new fault-tolerant structures with size comparable to previous constructions, but which tolerate every fault set of small faulty-degree $\deg(F)$, rather than only fault sets of small size $|F|$. Our main results are:
- New FT-Certificates: For every $n$-vertex graph $G$ and degree threshold $f$, one can compute a connectivity certificate $H \subseteq G$ with $|E(H)| = \widetilde{O}(fn)$ edges that has the following guarantee: for any edge set $F$ with faulty-degree $\deg(F)\leq f$ and every vertex pair $u,v$, it holds that $u$ and $v$ are connected in $H \setminus F$ iff they are connected in $G \setminus F$. This bound on $|E(H)|$ is nearly tight. Since our certificates handle some fault sets of size up to $|F|=O(fn)$, prior work did not imply any nontrivial upper bound for this problem, even when $f=1$.
- New FT-Spanners: We show that every $n$-vertex graph $G$ admits a $(2k-1)$-spanner $H$ with $|E(H)| = O_k(f^{1-1/k} n^{1+1/k})$ edges, which tolerates any fault set $F$ of faulty-degree at most $f$. This bound on $|E(H)|$ is optimal up to its hidden dependence on $k$, and it is close to the bound of $O_k(|F|^{1/2} n^{1+1/k} + |F|n)$ that is known for the case where the total number of faults is $|F|$ [Bodwin, Dinitz, Robelle SODA '22]. Our proof of this theorem is non-constructive, but by following a proof strategy of Dinitz and Robelle [PODC '20], we show that the runtime can be made polynomial by paying an additional $\text{polylog } n$ factor in spanner size.

Submitted 12 September, 2023; originally announced September 2023.

arXiv:2307.13747 [pdf, other]

Fully Dynamic Consistent $k$-Center Clustering

Authors: Jakub Łącki, Bernhard Haeupler, Christoph Grunau, Václav Rozhoň, Rajesh Jayaram

Abstract: We study the consistent $k$-center clustering problem. In this problem, the goal is to maintain a constant factor approximate $k$-center solution during a sequence of $n$ point insertions and deletions while minimizing the recourse, i.e., the number of changes made to the set of centers after each point insertion or deletion. Previous works by Lattanzi and Vassilvitskii [ICML '12] and Fichtenberger, Lattanzi, Norouzi-Fard, and Svensson [SODA '21] showed that in the incremental setting, where deletions are not allowed, one can obtain $k \cdot \textrm{polylog}(n) / n$ amortized recourse for both $k$-center and $k$-median, and demonstrated a matching lower bound. However, no algorithm for the fully dynamic setting achieves less than the trivial $O(k)$ changes per update, which can be obtained by simply reclustering the full dataset after every update. In this work, we give the first algorithm for consistent $k$-center clustering in the fully dynamic setting, i.e., when both point insertions and deletions are allowed, that improves upon the trivial $O(k)$ recourse bound. Specifically, our algorithm maintains a constant factor approximate solution while ensuring worst-case constant recourse per update, which is optimal in the fully dynamic setting. Moreover, our algorithm is deterministic and is therefore correct even if an adaptive adversary chooses the insertions and deletions.

Submitted 25 July, 2023; originally announced July 2023.

arXiv:2304.08892 [pdf, other]

Parallel Greedy Spanners

Authors: Bernhard Haeupler, D Ellis Hershkowitz, Zihan Tan

Abstract: A $t$-spanner of a graph is a subgraph that $t$-approximates pairwise distances. The greedy algorithm is one of the simplest and most well-studied algorithms for constructing a sparse spanner: it computes a $t$-spanner with $n^{1+O(1/t)}$ edges by repeatedly choosing any edge which does not close a cycle of chosen edges with $t+1$ or fewer edges. We demonstrate that the greedy algorithm computes a $t$-spanner with $t^3\cdot \log^3 n \cdot n^{1 + O(1/t)}$ edges even when a matching of such edges is added in parallel. In particular, it suffices to repeatedly add any matching where each individual edge does not close a cycle with $t +1$ or fewer edges but where adding the entire matching might. Our analysis makes use of and illustrates the power of new advances in length-constrained expander decompositions.

Submitted 2 August, 2023; v1 submitted 18 April, 2023; originally announced April 2023.
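
The sequential greedy rule described in the abstract above can be sketched in a few lines. This is only an illustration for unweighted graphs, not the paper's parallel algorithm: the names `bounded_distance` and `greedy_spanner` are hypothetical, and the edge order is simply the order of the input list. It relies on the observation that an edge $(u,v)$ closes a cycle with $t+1$ or fewer edges exactly when $u$ and $v$ are already within $t$ hops of each other in the chosen subgraph.

```python
from collections import deque


def bounded_distance(adj, source, target, limit):
    """Hop distance from source to target if it is at most `limit`, else None."""
    if source == target:
        return 0
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == limit:
            continue  # do not explore beyond the hop limit
        for nxt in adj.get(node, ()):
            if nxt in seen:
                continue
            if nxt == target:
                return dist + 1
            seen.add(nxt)
            frontier.append((nxt, dist + 1))
    return None


def greedy_spanner(edges, t):
    """Sequential greedy t-spanner of an unweighted graph (illustrative sketch).

    An edge is kept only if its endpoints are more than t hops apart in the
    subgraph chosen so far, i.e. if it closes no cycle with t+1 or fewer edges.
    """
    adj = {}
    chosen = []
    for u, v in edges:
        if bounded_distance(adj, u, v, t) is None:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
            chosen.append((u, v))
    return chosen
```

For example, on a 5-cycle the call `greedy_spanner(cycle_edges, 3)` keeps every edge, because the only cycle has 5 edges, while with `t = 4` the last edge is dropped.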

arXiv:2303.00811 [pdf, other]

Parallel and Distributed Exact Single-Source Shortest Paths with Negative Edge Weights

Authors: Vikrant Ashvinkumar, Aaron Bernstein, Nairen Cao, Christoph Grunau, Bernhard Haeupler, Yonggang Jiang, Danupon Nanongkai, Hsin Hao Su

Abstract: This paper presents parallel and distributed algorithms for single-source shortest paths when edges can have negative weights (negative-weight SSSP). We show a framework that reduces negative-weight SSSP in either setting to $n^{o(1)}$ calls to any SSSP algorithm that works with a virtual source. More specifically, for a graph with $m$ edges, $n$ vertices, undirected hop-diameter $D$, and polynomially bounded integer edge weights, we show randomized algorithms for negative-weight SSSP with (i) $W_{SSSP}(m,n)n^{o(1)}$ work and $S_{SSSP}(m,n)n^{o(1)}$ span, given access to an SSSP algorithm with $W_{SSSP}(m,n)$ work and $S_{SSSP}(m,n)$ span in the parallel model, (ii) $T_{SSSP}(n,D)n^{o(1)}$ rounds, given access to an SSSP algorithm that takes $T_{SSSP}(n,D)$ rounds in $\mathsf{CONGEST}$. This work builds off the recent result of [Bernstein, Nanongkai, Wulff-Nilsen, FOCS'22], which gives a near-linear time algorithm for negative-weight SSSP in the sequential setting. Using current state-of-the-art SSSP algorithms yields randomized algorithms for negative-weight SSSP with (i) $m^{1+o(1)}$ work and $n^{1/2+o(1)}$ span in the parallel model, (ii) $(n^{2/5}D^{2/5} + \sqrt{n} + D)n^{o(1)}$ rounds in $\mathsf{CONGEST}$. Our main technical contribution is an efficient reduction for computing a low-diameter decomposition (LDD) of directed graphs to computations of SSSP with a virtual source. Efficiently computing an LDD has heretofore only been known for undirected graphs in both the parallel and distributed models. The LDD is a crucial step of the algorithm in [Bernstein, Nanongkai, Wulff-Nilsen, FOCS'22], and we think that its applications to other problems in parallel and distributed models are far from being exhausted.

Submitted 1 March, 2023; originally announced March 2023.

arXiv:2301.06647 [pdf, other]

Sparse Semi-Oblivious Routing: Few Random Paths Suffice

Authors: Goran Zuzic, Bernhard Haeupler, Antti Roeyskoe

Abstract: The packet routing problem asks to select routing paths that minimize the maximum edge congestion for a set of packets specified by source-destination vertex pairs. We revisit a semi-oblivious approach to this problem: each source-destination pair is assigned a small set of predefined paths before the demand is revealed, while the sending rates along the paths can be optimally adapted to the demand. This approach has been considered in practice in network traffic engineering due to its superior robustness and performance as compared to both oblivious routing and traditional traffic engineering approaches. We show the existence of sparse semi-oblivious routings: only $O(\log n)$ paths are selected between each pair of vertices. The routing is $(poly \log n)$-competitive for all demands against the offline-optimal congestion objective. Even for the well-studied case of hypercubes, no such result was known: our deterministic and oblivious selection of $O(\log n)$ paths is the first simple construction of a deterministic oblivious structure that near-optimally assigns source-destination pairs to few routes. Our results contrast the current solely-negative landscape of results for semi-oblivious routing. We give the sparsity-competitiveness trade-off for lower sparsities and nearly match it with a lower bound. Our construction is extremely simple: Sample the few paths from any competitive oblivious routing. Indeed, this natural construction was used in traffic engineering as an unproven heuristic. We give a satisfactory theoretical justification for their empirical effectiveness: the competitiveness of the construction improves exponentially with the number of paths. Finally, when combined with the recent hop-constrained oblivious routing, we also obtain sparse and competitive structures for the completion-time objective.

Submitted 12 May, 2023; v1 submitted 16 January, 2023; originally announced January 2023.

Comments: Appears at PODC 2023

arXiv:2211.11726 [pdf, ps, other]

A Cut-Matching Game for Constant-Hop Expanders

Authors: Bernhard Haeupler, Jonas Huebotter, Mohsen Ghaffari

Abstract: This paper provides a cut-strategy that produces constant-hop expanders in the well-known cut-matching game framework. Constant-hop expanders strengthen expanders with constant conductance by guaranteeing that any demand can be (obliviously) routed along constant-hop paths - in contrast to the $Ω(\log n)$-hop routes in expanders. Cut-matching games for expanders are key tools for obtaining close-to-linear-time approximation algorithms for many hard problems, including finding (balanced or approximately-largest) sparse cuts, certifying the expansion of a graph by embedding an (explicit) expander, as well as computing expander decompositions, hierarchical cut decompositions, oblivious routings, multi-cuts, and multicommodity flows. The cut-matching game provided in this paper is crucial in extending this versatile and powerful machinery to constant-hop expanders. It is also a key ingredient towards close-to-linear time algorithms for computing a constant approximation of multicommodity-flows and multi-cuts - the approximation factor being a constant relies on the expanders being constant-hop.

Submitted 21 November, 2022; originally announced November 2022.

arXiv:2210.16351 [pdf, other]

Parallel Breadth-First Search and Exact Shortest Paths and Stronger Notions for Approximate Distances

Authors: Václav Rozhoň, Bernhard Haeupler, Anders Martinsson, Christoph Grunau, Goran Zuzic

Abstract: We introduce stronger notions for approximate single-source shortest-path distances, show how to efficiently compute them from weaker standard notions, and demonstrate the algorithmic power of these new notions and transformations. One application is the first work-efficient parallel algorithm for computing exact single-source shortest paths graphs -- resolving a major open problem in parallel computing. Given a source vertex in a directed graph with polynomially-bounded nonnegative integer lengths, the algorithm computes an exact shortest path tree in $m \log^{O(1)} n$ work and $n^{1/2+o(1)}$ depth. Previously, no parallel algorithm improving the trivial linear depths of Dijkstra's algorithm without significantly increasing the work was known, even for the case of undirected and unweighted graphs (i.e., for computing a BFS-tree). Our main result is a black-box transformation that uses $\log^{O(1)} n$ standard approximate distance computations to produce approximate distances which also satisfy the subtractive triangle inequality (up to a $(1+\varepsilon)$ factor) and even induce an exact shortest path tree in a graph with only slightly perturbed edge lengths. These strengthened approximations are algorithmically significantly more powerful and overcome well-known and often encountered barriers for using approximate distances. In directed graphs they can even be boosted to exact distances. This results in a black-box transformation of any (parallel or distributed) algorithm for approximate shortest paths in directed graphs into an algorithm computing exact distances at essentially no cost. Applying this to the recent breakthroughs of Fineman et al. for computing approximate SSSP-distances via approximate hopsets gives new parallel and distributed algorithms for exact shortest paths.

Submitted 28 October, 2022; originally announced October 2022.

arXiv:2210.11784 [pdf, ps, other]

A Simple Deterministic Distributed Low-Diameter Clustering

Authors: Václav Rozhoň, Bernhard Haeupler, Christoph Grunau

Abstract: We give a simple, local process for nodes in an undirected graph to form non-adjacent clusters that (1) have at most a polylogarithmic diameter and (2) contain at least half of all vertices. Efficient deterministic distributed clustering algorithms for computing strong-diameter network decompositions and other key tools follow immediately. Overall, our process is a direct and drastically simplified way for computing these fundamental objects.

Submitted 21 October, 2022; originally announced October 2022.

arXiv:2209.11669 [pdf, ps, other]

Improved Distributed Network Decomposition, Hitting Sets, and Spanners, via Derandomization

Authors: Mohsen Ghaffari, Christoph Grunau, Bernhard Haeupler, Saeed Ilchi, Václav Rozhoň

Abstract: This paper presents significantly improved deterministic algorithms for some of the key problems in the area of distributed graph algorithms, including network decomposition, hitting sets, and spanners. As the main ingredient in these results, we develop novel randomized distributed algorithms that we can analyze using only pairwise independence, and we can thus derandomize efficiently. As our most prominent end-result, we obtain a deterministic construction for $O(\log n)$-color $O(\log n \cdot \log\log\log n)$-strong diameter network decomposition in $\tilde{O}(\log^3 n)$ rounds. This is the first construction that achieves almost $\log n$ in both parameters, and it improves on a recent line of exciting progress on deterministic distributed network decompositions [Rozhoň, Ghaffari STOC'20; Ghaffari, Grunau, Rozhoň SODA'21; Chang, Ghaffari PODC'21; Elkin, Haeupler, Rozhoň, Grunau FOCS'22].

Submitted 23 September, 2022; originally announced September 2022.

arXiv:2204.14086 [pdf, other]

Deterministic Distributed Sparse and Ultra-Sparse Spanners and Connectivity Certificates

Authors: Marcel Bezdrighin, Michael Elkin, Mohsen Ghaffari, Christoph Grunau, Bernhard Haeupler, Saeed Ilchi, Václav Rozhoň

Abstract: This paper presents efficient distributed algorithms for a number of fundamental problems in the area of graph sparsification: We provide the first deterministic distributed algorithm that computes an ultra-sparse spanner in $\textrm{polylog}(n)$ rounds in weighted graphs. Concretely, our algorithm outputs a spanning subgraph with only $n+o(n)$ edges in which the pairwise distances are stretched by a factor of at most $O(\log n \;\cdot\; 2^{O(\log^* n)})$. We provide a $\textrm{polylog}(n)$-round deterministic distributed algorithm that computes a spanner with stretch $(2k-1)$ and $O(nk + n^{1 + 1/k} \log k)$ edges in unweighted graphs and with $O(n^{1 + 1/k} k)$ edges in weighted graphs. We present the first $\textrm{polylog}(n)$-round randomized distributed algorithm that computes a sparse connectivity certificate. For an $n$-node graph $G$, a certificate for connectivity $k$ is a spanning subgraph $H$ that is $k$-edge-connected if and only if $G$ is $k$-edge-connected, and this subgraph $H$ is called sparse if it has $O(nk)$ edges. Our algorithm achieves a sparsity of $(1 + o(1))nk$ edges, which is within a $2(1 + o(1))$ factor of the best possible.

Submitted 23 September, 2022; v1 submitted 29 April, 2022; originally announced April 2022.

arXiv:2204.08254 [pdf, other]

Deterministic Low-Diameter Decompositions for Weighted Graphs and Distributed and Parallel Applications

Authors: Václav Rozhoň, Michael Elkin, Christoph Grunau, Bernhard Haeupler

Abstract: This paper presents new deterministic and distributed low-diameter decomposition algorithms for weighted graphs. In particular, we show that if one can efficiently compute approximate distances in a parallel or a distributed setting, one can also efficiently compute low-diameter decompositions. This consequently implies solutions to many fundamental distance based problems using a polylogarithmic number of approximate distance computations. Our low-diameter decomposition generalizes and extends the line of work starting from [Rozhoň, Ghaffari STOC 2020] to weighted graphs in a very model-independent manner. Moreover, our clustering results have additional useful properties, including strong-diameter guarantees, separation properties, restricting cluster centers to specified terminals, and more. Applications include: -- The first near-linear work and polylogarithmic depth randomized and deterministic parallel algorithm for low-stretch spanning trees (LSST) with polylogarithmic stretch. Previously, the best parallel LSST algorithm required $m \cdot n^{o(1)}$ work and $n^{o(1)}$ depth and was inherently randomized. No deterministic LSST algorithm with truly sub-quadratic work and sub-linear depth was known. -- The first near-linear work and polylogarithmic depth deterministic algorithm for computing an $\ell_1$-embedding into polylogarithmic dimensional space with polylogarithmic distortion. The best prior deterministic algorithms for $\ell_1$-embeddings either require large polynomial work or are inherently sequential.
Even when we apply our techniques to the classical problem of computing a ball-carving with strong-diameter $O(\log^2 n)$ in an unweighted graph, our new clustering algorithm still leads to an improvement in round complexity from $O(\log^{10} n)$ rounds [Chang, Ghaffari PODC 21] to $O(\log^{4} n)$.

Submitted 3 September, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

arXiv:2204.05874 [pdf, ps, other]

Undirected $(1+\varepsilon)$-Shortest Paths via Minor-Aggregates: Near-Optimal Deterministic Parallel & Distributed Algorithms

Authors: Václav Rozhoň, Christoph Grunau, Bernhard Haeupler, Goran Zuzic, Jason Li

Abstract: This paper presents near-optimal deterministic parallel and distributed algorithms for computing $(1+\varepsilon)$-approximate single-source shortest paths in any undirected weighted graph. On a high level, we deterministically reduce this and other shortest-path problems to $\tilde{O}(1)$ Minor-Aggregations. A Minor-Aggregation computes an aggregate (e.g., max or sum) of node-values for every connected component of some subgraph. Our reduction immediately implies: Optimal deterministic parallel (PRAM) algorithms with $\tilde{O}(1)$ depth and near-linear work. Universally-optimal deterministic distributed (CONGEST) algorithms, whenever deterministic Minor-Aggregate algorithms exist. For example, an optimal $\tilde{O}(HopDiameter(G))$-round deterministic CONGEST algorithm for excluded-minor networks. Several novel tools developed for the above results are interesting in their own right: A local iterative approach for reducing shortest path computations "up to distance $D$" to computing low-diameter decompositions "up to distance $\frac{D}{2}$". Compared to the recursive vertex-reduction approach of [Li20], our approach is simpler, suitable for distributed algorithms, and eliminates many derandomization barriers. A simple graph-based $\tilde{O}(1)$-competitive $\ell_1$-oblivious routing based on low-diameter decompositions that can be evaluated in near-linear work. The previous such routing [ZGY+20] was $n^{o(1)}$-competitive and required $n^{o(1)}$ more work.
A deterministic algorithm to round any fractional single-source transshipment flow into an integral tree solution. The first distributed algorithms for computing Eulerian orientations.

Submitted 23 September, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

arXiv:2111.01422 [pdf, other]

Maximum Length-Constrained Flows and Disjoint Paths: Distributed, Deterministic and Fast

Authors: Bernhard Haeupler, D Ellis Hershkowitz, Thatchaphol Saranurak

Abstract: Computing routing schemes that support both high throughput and low latency is one of the core challenges of network optimization. Such routes can be formalized as $h$-length flows which are defined as flows whose flow paths are restricted to have length at most $h$. Many well-studied algorithmic primitives -- such as maximal and maximum length-constrained disjoint paths -- are special cases of $h$-length flows. Likewise the optimal $h$-length flow is a fundamental quantity in network optimization, characterizing, up to poly-log factors, how quickly a network can accomplish numerous distributed primitives. In this work, we give the first efficient algorithms for computing $(1 - ε)$-approximate $h$-length flows. We give deterministic algorithms that take $\tilde{O}(\text{poly}(h, \frac{1}{ε}))$ parallel time and $\tilde{O}(\text{poly}(h, \frac{1}{ε}) \cdot 2^{O(\sqrt{\log n})})$ distributed CONGEST time. We also give a CONGEST algorithm that succeeds with high probability and only takes $\tilde{O}(\text{poly}(h, \frac{1}{ε}))$ time. Using our $h$-length flow algorithms, we give the first efficient deterministic CONGEST algorithms for the maximal length-constrained disjoint paths problem -- settling an open question of Chang and Saranurak (FOCS 2020) -- as well as essentially-optimal parallel and distributed approximation algorithms for maximum length-constrained disjoint paths. The former greatly simplifies deterministic CONGEST algorithms for computing expander decompositions.
We also use our techniques to give the first efficient $(1-ε)$-approximation algorithms for bipartite $b$-matching in CONGEST. Lastly, using our flow algorithms, we give the first algorithms to efficiently compute $h$-length cutmatches, an object at the heart of recent advances in length-constrained expander decompositions.

Submitted 16 August, 2023; v1 submitted 2 November, 2021; originally announced November 2021.

arXiv:2110.15944 [pdf, ps, other]

Universally-Optimal Distributed Shortest Paths and Transshipment via Graph-Based L1-Oblivious Routing

Authors: Goran Zuzic, Gramoz Goranci, Mingquan Ye, Bernhard Haeupler, Xiaorui Sun

Abstract: We provide universally-optimal distributed graph algorithms for $(1+\varepsilon)$-approximate shortest path problems including shortest-path-tree and transshipment. The universal optimality of our algorithms guarantees that, on any $n$-node network $G$, our algorithm completes in $T \cdot n^{o(1)}$ rounds whenever a $T$-round algorithm exists for $G$. This includes $D \cdot n^{o(1)}$-round algorithms for any planar or excluded-minor network. Our algorithms never require more than $(\sqrt{n} + D) \cdot n^{o(1)}$ rounds, resulting in the first sub-linear-round distributed algorithm for transshipment. The key technical contribution leading to these results is the first efficient $n^{o(1)}$-competitive linear $\ell_1$-oblivious routing operator that does not require the use of $\ell_1$-embeddings. Our construction is simple, solely based on low-diameter decompositions, and -- in contrast to all known constructions -- directly produces an oblivious flow instead of just an approximation of the optimal flow cost. This also has the benefit of simplifying the interaction with Sherman's multiplicative weight framework [SODA'17] in the distributed setting and its subsequent rounding procedures.

Submitted 29 October, 2021; originally announced October 2021.

Comments: Accepted to SODA 2022. Author ordering was randomized using https://www.aeaweb.org/journals/policies/random-author-order/generator

arXiv:2109.05151 [pdf, other]

Almost Universally Optimal Distributed Laplacian Solvers via Low-Congestion Shortcuts

Authors: Ioannis Anagnostides, Christoph Lenzen, Bernhard Haeupler, Goran Zuzic, Themis Gouleakis

Abstract: In this paper, we refine the (almost) \emph{existentially optimal} distributed Laplacian solver recently developed by Forster, Goranci, Liu, Peng, Sun, and Ye (FOCS '21) into an (almost) \emph{universally optimal} distributed Laplacian solver. Specifically, when the topology is known, we show that any Laplacian system on an $n$-node graph with \emph{shortcut quality} $\text{SQ}(G)$ can be solved within $n^{o(1)} \text{SQ}(G) \log(1/\varepsilon)$ rounds, where $\varepsilon$ is the required accuracy. This almost matches our lower bound which guarantees that any correct algorithm on $G$ requires $\widetilde{Ω}(\text{SQ}(G))$ rounds, even for a crude solution with $\varepsilon \le 1/2$. Even in the unknown-topology case (i.e., standard CONGEST), the same bounds also hold in most networks of interest. Furthermore, conditional on conjectured improvements in state-of-the-art constructions of low-congestion shortcuts, the CONGEST results will match the known-topology ones. Moreover, following a recent line of work in distributed algorithms, we consider a hybrid communication model which enhances CONGEST with limited global power in the form of the node-capacitated clique (NCC) model. In this model, we show the existence of a Laplacian solver with round complexity $n^{o(1)} \log(1/\varepsilon)$. The unifying thread of these results, and our main technical contribution, is the study of a novel \emph{congested} generalization of the standard \emph{part-wise aggregation} problem.
We develop near-optimal algorithms for this primitive in the Supported-CONGEST model, almost-optimal algorithms in (standard) CONGEST, as well as a very simple algorithm for bounded-treewidth graphs with slightly worse bounds. This primitive can be readily used to accelerate the FOCS '21 Laplacian solver. We believe this primitive will find further independent applications.

Submitted 14 May, 2022; v1 submitted 10 September, 2021; originally announced September 2021.

arXiv:2104.03932 [pdf, ps, other]

Universally-Optimal Distributed Algorithms for Known Topologies

Authors: Bernhard Haeupler, David Wajc, Goran Zuzic

Abstract: Many distributed optimization algorithms achieve existentially-optimal running times, meaning that there exists some pathological worst-case topology on which no algorithm can do better. Still, most networks of interest allow for exponentially faster algorithms. This motivates two questions: (1) What network topology parameters determine the complexity of distributed optimization? (2) Are there universally-optimal algorithms that are as fast as possible on every topology? We resolve these 25-year-old open problems in the known-topology setting (i.e., supported CONGEST) for a wide class of global network optimization problems including MST, $(1+\varepsilon)$-min cut, various approximate shortest paths problems, sub-graph connectivity, etc. In particular, we provide several (equivalent) graph parameters and show they are tight universal lower bounds for the above problems, fully characterizing their inherent complexity. Our results also imply that algorithms based on the low-congestion shortcut framework match the above lower bound, making them universally optimal if shortcuts are efficiently approximable. We leverage a recent result in hop-constrained oblivious routing to show this is the case if the topology is known -- giving universally-optimal algorithms for all above problems.

Submitted 26 December, 2023; v1 submitted 8 April, 2021; originally announced April 2021.

Comments: Full version of extended abstract in STOC 2021

ACM Class: G.2.2; G.2.1; C.2.4; F.1.1; F.2.0

arXiv:2102.05168 [pdf, other]

Deterministic Tree Embeddings with Copies for Algorithms Against Adaptive Adversaries

Authors: Bernhard Haeupler, D Ellis Hershkowitz, Goran Zuzic

Abstract: Embeddings of graphs into distributions of trees that preserve distances in expectation are a cornerstone of many optimization algorithms. Unfortunately, online or dynamic algorithms which use these embeddings seem inherently randomized and ill-suited against adaptive adversaries. In this paper we provide a new tree embedding which addresses these issues by deterministically embedding a graph into a single tree containing $O(\log n)$ copies of each vertex while preserving the connectivity structure of every subgraph and $O(\log^2 n)$-approximating the cost of every subgraph. Using this embedding we obtain several new algorithmic results: We reduce an open question of Alon et al. [SODA 2004] -- the existence of a deterministic poly-log-competitive algorithm for online group Steiner tree on a general graph -- to its tree case. We give a poly-log-competitive deterministic algorithm for a closely related problem -- online partial group Steiner tree -- which, roughly, is a bicriteria version of online group Steiner tree. Lastly, we give the first poly-log approximations for demand-robust Steiner forest, group Steiner tree and group Steiner forest.

Submitted 9 February, 2021; originally announced February 2021.

arXiv:2101.00711 [pdf, other]

Synchronization Strings and Codes for Insertions and Deletions -- a Survey

Authors: Bernhard Haeupler, Amirbehshad Shahrasbi

Abstract: Already in the 1960s, Levenshtein and others studied error-correcting codes that protect against synchronization errors, such as symbol insertions and deletions. However, despite significant efforts, progress on designing such codes has been lagging until recently, particularly compared to the detailed understanding of error-correcting codes for symbol substitution or erasure errors. This paper surveys the recent progress in designing efficient error-correcting codes over finite alphabets that can correct a constant fraction of worst-case insertions and deletions. Most state-of-the-art results for such codes rely on synchronization strings, simple yet powerful pseudo-random objects that have proven to be very effective solutions for coping with synchronization errors in various settings. This survey also includes an overview of what is known about synchronization strings and discusses communication settings related to error-correcting codes in which synchronization strings have been applied.

Submitted 3 January, 2021; originally announced January 2021.

arXiv:2011.10446 [pdf, ps, other]

Hop-Constrained Oblivious Routing

Authors: Mohsen Ghaffari, Bernhard Haeupler, Goran Zuzic

Abstract: We prove the existence of an oblivious routing scheme that is $\mathrm{poly}(\log n)$-competitive in terms of $(congestion + dilation)$, thus resolving a well-known question in oblivious routing. Concretely, consider an undirected network and a set of packets each with its own source and destination. The objective is to choose a path for each packet, from its source to its destination, so as to minimize $(congestion + dilation)$, defined as follows: The dilation is the maximum path hop-length, and the congestion is the maximum number of paths that include any single edge. The routing scheme obliviously and randomly selects a path for each packet independent of (the existence of) the other packets. Despite this obliviousness, the selected paths have $(congestion + dilation)$ within a $\mathrm{poly}(\log n)$ factor of the best possible value. More precisely, for any integer hop-bound $h$, this oblivious routing scheme selects paths of length at most $h \cdot \mathrm{poly}(\log n)$ and is $\mathrm{poly}(\log n)$-competitive in terms of $congestion$ in comparison to the best possible $congestion$ achievable via paths of length at most $h$ hops. These paths can be sampled in polynomial time. This result can be viewed as an analogue of the celebrated oblivious routing results of Räcke [FOCS 2002, STOC 2008], which are $O(\log n)$-competitive in terms of $congestion$, but are not competitive in terms of $dilation$.

Submitted 21 October, 2022; v1 submitted 20 November, 2020; originally announced November 2020.

Comments: STOC 2021, invited to the corresponding special issue of SICOMP journal
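
Illustration for the entry above (Hop-Constrained Oblivious Routing): the abstract defines dilation as the maximum path hop-length and congestion as the maximum number of paths that include any single edge. The following is only a minimal sketch of how that objective could be evaluated for a given set of paths; the function name `congestion_plus_dilation` and the node-list path representation are assumptions made for illustration, and the snippet is not the routing scheme from the paper.

```python
from collections import Counter


def congestion_plus_dilation(paths):
    """Evaluate (congestion, dilation, congestion + dilation) for a path set.

    `paths` is a list of node sequences in an undirected graph
    (a hypothetical representation chosen for this sketch).
    """
    edge_load = Counter()
    dilation = 0
    for path in paths:
        # Hop-length of a path is its number of edges.
        dilation = max(dilation, len(path) - 1)
        # Undirected edges as frozensets; a path counts once per edge.
        edges = {frozenset(edge) for edge in zip(path, path[1:])}
        edge_load.update(edges)
    # Congestion: the largest number of paths sharing a single edge.
    congestion = max(edge_load.values(), default=0)
    return congestion, dilation, congestion + dilation


example_paths = [["a", "b", "c"], ["d", "b", "c"], ["a", "d"]]
result = congestion_plus_dilation(example_paths)
```

In this example the edge between `b` and `c` is shared by two paths and the longest path has two hops, so the objective value is 4.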

arXiv:2011.06112 [pdf, other]

Tree Embeddings for Hop-Constrained Network Design

Authors: Bernhard Haeupler, D Ellis Hershkowitz, Goran Zuzic

Abstract: Network design problems aim to compute low-cost structures such as routes, trees and subgraphs. Often, it is natural and desirable to require that these structures have small hop length or hop diameter. Unfortunately, optimization problems with hop constraints are much harder and less well understood than their hop-unconstrained counterparts. A significant algorithmic barrier in this setting is the fact that hop-constrained distances in graphs are very far from being a metric. We show that, nonetheless, hop-constrained distances can be approximated by distributions over "partial tree metrics." We build this result into a powerful and versatile algorithmic tool which, similarly to classic probabilistic tree embeddings, reduces hop-constrained problems in general graphs to hop-unconstrained problems on trees. We then use this tool to give the first poly-logarithmic bicriteria approximations for the hop-constrained variants of many classic network design problems. These include Steiner forest, group Steiner tree, group Steiner forest, buy-at-bulk network design as well as online and oblivious versions of many of these problems.

Submitted 11 November, 2020; originally announced November 2020.

arXiv:2009.13307 [pdf, other]

Rate-Distance Trade-offs for List-Decodable Insertion-Deletion Codes

Authors: Bernhard Haeupler, Amirbehshad Shahrasbi

Abstract: This paper presents general bounds on the highest achievable rate for list-decodable insertion-deletion codes. In particular, we give novel outer and inner bounds for the highest achievable communication rate of any insertion-deletion code that can be list-decoded from any $γ$ fraction of insertions and any $δ$ fraction of deletions. Our bounds simultaneously generalize the known bounds for the previously studied special cases of insertion-only, deletion-only, and zero-rate and correct other bounds that had been reported for the general case.

Submitted 9 August, 2022; v1 submitted 28 September, 2020; originally announced September 2020.

arXiv:2008.03091 [pdf, ps, other]

Low-Congestion Shortcuts for Graphs Excluding Dense Minors

Authors: Mohsen Ghaffari, Bernhard Haeupler

Abstract: We prove that any $n$-node graph $G$ with diameter $D$ admits shortcuts with congestion $O(δD \log n)$ and dilation $O(δD)$, where $δ$ is the maximum edge-density of any minor of $G$. Our proof is simple, elementary, and constructive - featuring a $\tilde{Θ}(δD)$-round distributed construction algorithm. Our results are tight up to $\tilde{O}(1)$ factors and generalize, simplify, unify, and strengthen several prior results. For example, for graphs excluding a fixed minor, i.e., graphs with constant $δ$, only a $\tilde{O}(D^2)$ bound was known based on a very technical proof that relies on the Robertson-Seymour Graph Structure Theorem. A direct consequence of our result is that many graph families, including any minor-excluded ones, have near-optimal $\tilde{Θ}(D)$-round distributed algorithms for many fundamental communication primitives and optimization problems including minimum spanning tree, minimum cut, and shortest-path approximations.

Submitted 7 August, 2020; originally announced August 2020.

arXiv:2007.09075 [pdf, ps, other]

Efficient Linear and Affine Codes for Correcting Insertions/Deletions

Authors: Kuan Cheng, Venkatesan Guruswami, Bernhard Haeupler, Xin Li

Abstract: This paper studies \emph{linear} and \emph{affine} error-correcting codes for correcting synchronization errors such as insertions and deletions. We call such codes linear/affine insdel codes. Linear codes that can correct even a single deletion are limited to have information rate at most $1/2$ (achieved by the trivial 2-fold repetition code). Previously, it was (erroneously) reported that more generally no non-trivial linear codes correcting $k$ deletions exist, i.e., that the $(k+1)$-fold repetition code and its rate of $1/(k+1)$ are basically optimal for any $k$. We disprove this and show the existence of binary linear codes of length $n$ and rate just below $1/2$ capable of correcting $Ω(n)$ insertions and deletions. This identifies rate $1/2$ as a sharp threshold for recovery from deletions for linear codes, and reopens the quest for a better understanding of the capabilities of linear codes for correcting insertions/deletions. We prove novel outer bounds and existential inner bounds for the rate vs. (edit) distance trade-off of linear insdel codes. We complement our existential results with an efficient synchronization-string-based transformation that converts any asymptotically-good linear code for Hamming errors into an asymptotically-good linear code for insdel errors. Lastly, we show that the $\frac{1}{2}$-rate limitation does not hold for affine codes by giving an explicit affine code of rate $1-ε$ which can efficiently correct a constant fraction of insdel errors.

Submitted 20 July, 2022; v1 submitted 17 July, 2020; originally announced July 2020.

arXiv:2001.00072 [pdf, ps, other]

Near-Optimal Schedules for Simultaneous Multicasts

Authors: Bernhard Haeupler, D Ellis Hershkowitz, David Wajc

Abstract: We study the store-and-forward packet routing problem for simultaneous multicasts, in which multiple packets have to be forwarded along given trees as fast as possible. This is a natural generalization of the seminal work of Leighton, Maggs and Rao, which solved this problem for unicasts, i.e. the case where all trees are paths. They showed the existence of asymptotically optimal $O(C + D)$-length schedules, where the congestion $C$ is the maximum number of packets sent over an edge and the dilation $D$ is the maximum depth of a tree. This improves over the trivial $O(CD)$ length schedules. We prove a lower bound for multicasts, which shows that there do not always exist schedules of non-trivial length, $o(CD)$. On the positive side, we construct $O(C+D+\log^2 n)$-length schedules in any $n$-node network. These schedules are near-optimal, since our lower bound shows that this length cannot be improved to $O(C+D) + o(\log n)$.

Submitted 2 May, 2021; v1 submitted 31 December, 2019; originally announced January 2020.

Comments: In ICALP 2021

arXiv:1909.10683 [pdf, other]

Optimally Resilient Codes for List-Decoding from Insertions and Deletions

Authors: Venkatesan Guruswami, Bernhard Haeupler, Amirbehshad Shahrasbi

Abstract: We give a complete answer to the following basic question: "What is the maximal fraction of deletions or insertions tolerable by $q$-ary list-decodable codes with non-vanishing information rate?" This question has been open even for binary codes, including the restriction to the binary insertion-only setting, where the best-known result was that a $γ\leq 0.707$ fraction of insertions is tolerable by some binary code family. For any desired $ε> 0$, we construct a family of binary codes of positive rate which can be efficiently list-decoded from any combination of $γ$ fraction of insertions and $δ$ fraction of deletions as long as $γ+2δ\leq 1-ε$. On the other hand, for any $γ,δ$ with $γ+2δ=1$ list-decoding is impossible. Our result thus precisely characterizes the feasibility region of binary list-decodable codes for insertions and deletions. We further generalize our result to codes over any finite alphabet of size $q$. Surprisingly, our work reveals that the feasibility region for $q>2$ is not the natural generalization of the binary bound above. We provide tight upper and lower bounds that precisely pin down the feasibility region, which turns out to have a $(q-1)$-piece-wise linear boundary whose $q$ corner-points lie on a quadratic curve. The main technical work in our results is proving the existence of code families of sufficiently large size with good list-decoding properties for any combination of $δ,γ$ within the claimed feasibility region. We achieve this via an intricate analysis of codes introduced by [Bukh, Ma; SIAM J. Discrete Math; 2014]. Finally, we give a simple yet powerful concatenation scheme for list-decodable insertion-deletion codes which transforms any such (non-efficient) code family (with vanishing information rate) into an efficiently decodable code family with constant rate.
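
The binary feasibility region described in this abstract can be written as a one-line predicate. The following is a minimal sketch, not code from the paper: the function name `binary_list_decodable` is hypothetical, and it assumes the two statements in the abstract combine into the open region $γ+2δ<1$ (for any point strictly inside, take $ε = 1-(γ+2δ)$; points beyond the boundary are treated as infeasible on the assumption that more errors are never easier to handle).

```python
from fractions import Fraction


def binary_list_decodable(gamma: Fraction, delta: Fraction) -> bool:
    """Return True if (gamma, delta) lies in the binary feasibility region.

    gamma is the fraction of insertions, delta the fraction of deletions.
    Assumption (read off the abstract, not taken from the paper's code):
    the region is gamma + 2*delta < 1.
    """
    if gamma < 0 or delta < 0:
        raise ValueError("error fractions must be non-negative")
    # Exact rational arithmetic avoids floating-point trouble on the boundary.
    return gamma + 2 * delta < 1
```

Under this reading, an insertion-only fraction such as `Fraction(9, 10)` is inside the region (beyond the earlier $0.707$ bound), while a deletion-only fraction of `Fraction(1, 2)` sits exactly on the boundary and is rejected.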

Submitted 4 May, 2020; v1 submitted 23 September, 2019; originally announced September 2019.

arXiv:1905.02805 [pdf, other]

Network Coding Gaps for Completion Times of Multiple Unicasts

Authors: Bernhard Haeupler, David Wajc, Goran Zuzic

Abstract: We study network coding gaps for the problem of makespan minimization of multiple unicasts. In this problem distinct packets at different nodes in a network need to be delivered to a destination specific to each packet, as fast as possible. The network coding gap specifies how much coding packets together in a network can help compared to the more natural approach of routing. While makespan minimization using routing has been intensely studied for the multiple unicasts problem, no bounds on network coding gaps for this problem are known. We develop new techniques which allow us to upper bound the network coding gap for the makespan of $k$ unicasts, proving this gap is at most polylogarithmic in $k$. Complementing this result, we show there exist instances of $k$ unicasts for which this coding gap is polylogarithmic in $k$. Our results also hold for average completion time, and more generally any $\ell_p$ norm of completion times.

Submitted 28 April, 2020; v1 submitted 7 May, 2019; originally announced May 2019.

arXiv:1810.11863 [pdf, other]

doi 10.1145/3313276.3316371

Near-Linear Time Insertion-Deletion Codes and (1+$\varepsilon$)-Approximating Edit Distance via Indexing

Authors: Bernhard Haeupler, Aviad Rubinstein, Amirbehshad Shahrasbi

Abstract: We introduce fast-decodable indexing schemes for edit distance which can be used to speed up edit distance computations to near-linear time if one of the strings is indexed by an indexing string $I$. In particular, for every length $n$ and every $\varepsilon >0$, one can in near-linear time construct a string $I \in Σ'^n$ with $|Σ'| = O_{\varepsilon}(1)$, such that indexing any string $S \in Σ^n$, symbol-by-symbol, with $I$ results in a string $S' \in Σ''^n$, where $Σ'' = Σ\times Σ'$, for which edit distance computations are easy, i.e., one can compute a $(1+\varepsilon)$-approximation of the edit distance between $S'$ and any other string in $O(n \text{poly}(\log n))$ time. Our indexing schemes can be used to improve the decoding complexity of state-of-the-art error correcting codes for insertions and deletions. In particular, they lead to near-linear time decoding algorithms for the insertion-deletion codes of [Haeupler, Shahrasbi; STOC '17] and faster decoding algorithms for list-decodable insertion-deletion codes of [Haeupler, Shahrasbi, Sudan; ICALP '18]. Interestingly, the latter codes are a crucial ingredient in the construction of fast-decodable indexing schemes.
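
To make the objects in this abstract concrete, here is a minimal sketch of symbol-by-symbol indexing together with the textbook quadratic-time edit distance that the indexing scheme is meant to beat. Everything here is an illustration under stated assumptions: `index_string` and `edit_distance` are hypothetical names, the index passed in is an arbitrary placeholder rather than the paper's construction of $I$, and the distance shown is the standard Levenshtein distance (insertions, deletions, substitutions), which may differ from the exact distance notion used in the paper.

```python
from typing import Hashable, List, Sequence, Tuple


def index_string(s: Sequence[Hashable], index: Sequence[Hashable]) -> List[Tuple]:
    """Pair each symbol of s with the symbol of the index at the same position.

    The result is a string over the product alphabet (Sigma x Sigma').
    The index is a placeholder; the paper's construction is not shown here.
    """
    if len(s) != len(index):
        raise ValueError("string and index must have the same length")
    return list(zip(s, index))


def edit_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Textbook O(len(a) * len(b)) dynamic program (Levenshtein distance).

    This is the quadratic baseline, not the near-linear algorithm
    described in the abstract.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        cur = [i]  # deleting the first i symbols of a
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # match or substitute
        prev = cur
    return prev[-1]
```

Because `edit_distance` only compares symbols for equality, it works unchanged on the tuples produced by `index_string`.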

Submitted 9 April, 2019; v1 submitted 28 October, 2018; originally announced October 2018.

arXiv:1809.06727 [pdf, ps, other]

Optimal strategies for patrolling fences

Authors: Bernhard Haeupler, Fabian Kuhn, Anders Martinsson, Kalina Petrova, Pascal Pfister

Abstract: A classical multi-agent fence patrolling problem asks: What is the maximum length $L$ of a line that $k$ agents with maximum speeds $v_1,\ldots,v_k$ can patrol if each point on the line needs to be visited at least once every unit of time? It is easy to see that $L = α\sum_{i=1}^k v_i$ for some efficiency $α\in [\frac{1}{2},1)$. After a series of works giving better and better efficiencies, it was conjectured that the best possible efficiency approaches $\frac{2}{3}$. No upper bounds on the efficiency below $1$ were known. We prove the first such upper bounds and tightly bound the optimal efficiency in terms of the minimum ratio of speeds $s = {v_{\max}}/{v_{\min}}$ and the number of agents $k$. Guided by our upper bounds, we construct a scheme whose efficiency approaches $1$, disproving the conjecture of Kawamura and Soejima. Our scheme asymptotically matches our upper bounds in terms of the maximal speed difference and the number of agents used, proving them to be asymptotically tight. A variation of the fence patrolling problem considers a circular fence instead and asks for its circumference to be maximized. We consider the unidirectional case of this variation, where all agents are only allowed to move in one direction, say clockwise. At first, a strategy yielding $L = \max_{r \in [k]} r \cdot v_r$ where $v_1 \geq v_2 \geq \dots \geq v_k$ was conjectured to be optimal by Czyzowicz et al. This was proven not to be the case by giving constructions for only specific numbers of agents with marginal improvements of $L$. We give a general construction that yields $L = \frac{1}{33 \log_e\log_2(k)} \sum_{i=1}^k v_i$ for any set of agents, which in particular for the case $1, 1/2, \dots, 1/k$ diverges as $k \rightarrow \infty$, thus resolving a conjecture by Kawamura and Soejima affirmatively.
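
The two circumference formulas for the unidirectional circular fence can be compared numerically. The sketch below only evaluates the formulas quoted in this abstract; it does not implement any patrolling strategy, and the function names `baseline_circumference` and `general_circumference` are hypothetical. The second function assumes $k \geq 3$ so that $\log_e \log_2 k$ is positive.

```python
import math
from fractions import Fraction
from typing import Sequence


def baseline_circumference(speeds: Sequence[Fraction]) -> Fraction:
    """Evaluate L = max_r r * v_r with speeds sorted in non-increasing order."""
    ordered = sorted(speeds, reverse=True)
    return max(r * v for r, v in enumerate(ordered, start=1))


def general_circumference(speeds: Sequence[Fraction]) -> float:
    """Evaluate L = (sum of speeds) / (33 * ln(log2(k))) from the abstract.

    Assumption: k >= 3, so that ln(log2(k)) is strictly positive.
    """
    k = len(speeds)
    if k < 3:
        raise ValueError("formula is only evaluated here for k >= 3")
    return float(sum(speeds)) / (33 * math.log(math.log2(k)))
```

For the harmonic speeds $1, 1/2, \dots, 1/k$, every term $r \cdot v_r$ equals $1$, so the baseline stays at $1$ for all $k$, whereas the numerator of the general formula is the harmonic sum, which grows like $\ln k$ and outpaces the $\log_e \log_2 k$ denominator. For small $k$ the constant $33$ dominates and the general formula gives a smaller value than the baseline; the advantage is asymptotic.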

Submitted 12 June, 2019; v1 submitted 18 September, 2018; originally announced September 2018.

Comments: 19 pages, 3 figures. Part of our main result (circle strategy) is new to this version of the paper. A shorter version of this is to appear in the proceedings of ICALP 2019

arXiv:1808.00838 [pdf, ps, other]

Algorithms for Noisy Broadcast under Erasures

Authors: Ofer Grossman, Bernhard Haeupler, Sidhanth Mohanty

Abstract: The noisy broadcast model was first studied in [Gallager, TranInf'88] where an $n$-character input is distributed among $n$ processors, so that each processor receives one input bit. Computation proceeds in rounds, where in each round each processor broadcasts a single character, and each reception is corrupted independently at random with some probability $p$. [Gallager, TranInf'88] gave an algorithm for all processors to learn the input in $O(\log\log n)$ rounds with high probability. Later, a matching lower bound of $Ω(\log\log n)$ was given in [Goyal, Kindler, Saks; SICOMP'08]. We study a relaxed version of this model where each reception is erased and replaced with a `?' independently with probability $p$. In this relaxed model, we break past the lower bound of [Goyal, Kindler, Saks; SICOMP'08] and obtain an $O(\log^* n)$-round algorithm for all processors to learn the input with high probability. We also show an $O(1)$-round algorithm for the same problem when the alphabet size is $Ω(\mathrm{poly}(n))$.

Submitted 2 August, 2018; originally announced August 2018.

Comments: Appeared in ICALP 2018

arXiv:1806.05701 [pdf, other]

Computation-Aware Data Aggregation

Authors: Bernhard Haeupler, D Ellis Hershkowitz, Anson Kahng, Ariel D. Procaccia

Abstract: Data aggregation is a fundamental primitive in distributed computing wherein a network computes a function of every node's input. However, while compute time is non-negligible in modern systems, standard models of distributed computing do not take compute time into account. Rather, most distributed models of computation only explicitly consider communication time. In this paper, we introduce a model of distributed computation that considers \emph{both} computation and communication so as to give a theoretical treatment of data aggregation. We study both the structure of and how to compute the fastest data aggregation schedule in this model. As our first result, we give a polynomial-time algorithm that computes the optimal schedule when the input network is a complete graph. Moreover, since one may want to aggregate data over a pre-existing network, we also study data aggregation scheduling on arbitrary graphs. We demonstrate that this problem on arbitrary graphs is hard to approximate within a multiplicative $1.5$ factor. Finally, we give an $O(\log n \cdot \log \frac{\mathrm{OPT}}{t_m})$-approximation algorithm for this problem on arbitrary graphs, where $n$ is the number of nodes and $\mathrm{OPT}$ is the length of the optimal schedule.

Submitted 12 November, 2019; v1 submitted 14 June, 2018; originally announced June 2018.

Comments: Changed the introduction and title; this is the ITCS camera-ready version

arXiv:1805.06872

Coding for Interactive Communication with Small Memory and Applications to Robust Circuits

Authors: Bernhard Haeupler, Nicolas Resch

Abstract: Classically, coding theory has been concerned with the problem of transmitting a single message in a format which is robust to noise. Recently, researchers have turned their attention to designing coding schemes to make two-way conversations robust to noise. That is, given an interactive communication protocol $Π$, an \emph{interactive coding scheme} converts $Π$ into another communication protocol $Π'$ such that, even if errors are introduced during the execution of $Π'$, the parties are able to determine what the outcome of running $Π$ would be in a noise-free setting. We consider the problem of designing interactive coding schemes which allow the parties to simulate the original protocol using little memory. Specifically, given any communication protocol $Π$ we construct robust simulating protocols which tolerate a constant noise rate and require the parties to use only $O(\log d \log s)$ memory, where $d$ is the depth of $Π$ and $s$ is a measure of the size of $Π$. Prior to this work, all known coding schemes required the parties to use at least $Ω(d)$ memory, as the parties were required to remember the transcript of the conversation thus far. Moreover, our coding scheme achieves a communication rate of $1-O(\sqrt{\varepsilon})$ over oblivious channels and $1-O(\sqrt{\varepsilon\log\log\tfrac{1}{\varepsilon}})$ over adaptive adversarial channels, matching the conjecturally optimal rates. Lastly, we point to connections between fault-tolerant circuits and coding for interactive communication with small memory.

Submitted 24 July, 2019; v1 submitted 17 May, 2018; originally announced May 2018.

Comments: There is a problem with the main results, i.e., Theorems 6.4 and 6.5. Specifically, an entire iteration can be corrupted in such a way that both parties arrive at the same incorrect node. Our algorithm cannot detect this error. Moreover, the robust KW-transform (Lemma 7.7) for circuits needs to include the assumption that all the internal nodes correspond to combinatorial rectangles

arXiv:1805.04165 [pdf, other]

Erasure Correction for Noisy Radio Networks

Authors: Keren Censor-Hillel, Bernhard Haeupler, D Ellis Hershkowitz, Goran Zuzic

Abstract: The radio network model is a well-studied model of wireless, multi-hop networks. However, radio networks make the strong assumption that messages are delivered deterministically. The recently introduced noisy radio network model relaxes this assumption by dropping messages independently at random. In this work we quantify the relative computational power of noisy radio networks and classic radio networks. In particular, given a non-adaptive protocol for a fixed radio network we show how to reliably simulate this protocol if noise is introduced with a multiplicative cost of $\mathrm{poly}(\log Δ, \log \log n)$ rounds where $n$ is the number of nodes in the network and $Δ$ is the max degree. Moreover, we demonstrate that, even if the simulated protocol is not non-adaptive, it can be simulated with a multiplicative $O(Δ\log ^2 Δ)$ cost in the number of rounds. Lastly, we argue that simulations with a multiplicative overhead of $o(\log Δ)$ are unlikely to exist by proving that an $Ω(\log Δ)$ multiplicative round overhead is necessary under certain natural assumptions.

Submitted 16 May, 2019; v1 submitted 10 May, 2018; originally announced May 2018.

Comments: We gave significantly more high level intuition of our results in a new section

ACM Class: G.2.2; C.2.4; F.1.1; F.2.0

arXiv:1804.03604 [pdf, ps, other]

Optimal Document Exchange and New Codes for Insertions and Deletions

Authors: Bernhard Haeupler

Abstract: We give the first communication-optimal document exchange protocol. For any $n$ and $k < n$ our randomized scheme takes any $n$-bit file $F$ and computes a $Θ(k \log \frac{n}{k})$-bit summary from which one can reconstruct $F$, with high probability, given a related file $F'$ with edit distance $ED(F,F') \leq k$. The size of our summary is information-theoretically order optimal for all values of $k$, giving a randomized solution to a longstanding open question of [Orlitsky; FOCS'91]. It also is the first non-trivial solution for the interesting setting where a small constant fraction of symbols have been edited, producing an optimal summary of size $O(H(δ)n)$ for $k=δn$. This concludes a long series of better-and-better protocols which produce larger summaries for sub-linear values of $k$ and sub-polynomial failure probabilities. In particular, the recent breakthrough of [Belazzougui, Zhang; FOCS'16] assumes that $k < n^ε$, produces a summary of size $O(k\log^2 k + k\log n)$, and succeeds with probability $1-(k \log n)^{-O(1)}$. We also give an efficient derandomized document exchange protocol with summary size $O(k \log^2 \frac{n}{k})$. This improves, for any $k$, over a deterministic document exchange protocol by Belazzougui with summary size $O(k^2 + k \log^2 n)$. Our deterministic document exchange directly provides new efficient systematic error correcting codes for insertions and deletions. These (binary) codes correct any $δ$ fraction of adversarial insertions/deletions while having a rate of $1 - O(δ\log^2 \frac{1}{δ})$ and improve over the codes of Guruswami and Li and Haeupler, Shahrasbi and Vitercik which have rate $1 - Θ\left(\sqrt{δ} \log^{O(1)} \frac{1}{ε}\right)$.

Submitted 25 September, 2019; v1 submitted 10 April, 2018; originally announced April 2018.

arXiv:1803.03530 [pdf, other]

Synchronization Strings: Efficient and Fast Deterministic Constructions over Small Alphabets

Authors: Kuan Cheng, Bernhard Haeupler, Xin Li, Amirbehshad Shahrasbi, Ke Wu

Abstract: Synchronization strings were recently introduced by Haeupler and Shahrasbi (STOC 2017) in the study of codes for correcting insertion and deletion errors (insdel codes). They showed that for any parameter $\varepsilon>0$, synchronization strings of arbitrary length exist over an alphabet whose size depends only on $\varepsilon$. Specifically, they obtained an alphabet size of $O(\varepsilon^{-4})$, which left an open question on where the minimal size of such alphabets lies between $Ω(\varepsilon^{-1})$ and $O(\varepsilon^{-4})$. In this work, we partially bridge this gap by providing an improved lower bound of $Ω(\varepsilon^{-3/2})$, and an improved upper bound of $O(\varepsilon^{-2})$. We also provide fast explicit constructions of synchronization strings over small alphabets. Further, along the lines of previous work on similar combinatorial objects, we study the extremal question of the smallest possible alphabet size over which synchronization strings can exist for some constant $\varepsilon < 1$. We show that one can construct $\varepsilon$-synchronization strings over alphabets of size four while no such string exists over binary alphabets. This reduces the extremal question to whether synchronization strings exist over ternary alphabets.

Submitted 7 March, 2018; originally announced March 2018.

Comments: 29 pages. arXiv admin note: substantial text overlap with arXiv:1710.07356

arXiv:1802.08663 [pdf, ps, other]

Synchronization Strings: List Decoding for Insertions and Deletions

Authors: Bernhard Haeupler, Amirbehshad Shahrasbi, Madhu Sudan

Abstract: We study codes that are list-decodable under insertions and deletions. Specifically, we consider the setting where a codeword over some finite alphabet of size $q$ may suffer from $δ$ fraction of adversarial deletions and $γ$ fraction of adversarial insertions. A code is said to be $L$-list-decodable if there is an (efficient) algorithm that, given a received word, reports a list of $L$ codewords that includes the original codeword. Using the concept of synchronization strings, introduced by the first two authors [STOC 2017], we show some surprising results. We show that for every $0\leq δ<1$, every $0\leq γ<\infty$ and every $ε>0$ there exist efficient codes of rate $1-δ-ε$ and constant alphabet (so $q=O_{δ,γ,ε}(1)$) and sub-logarithmic list sizes. We stress that the fraction of insertions can be arbitrarily large and the rate is independent of this parameter. Our result sheds light on the remarkable asymmetry between the impact of insertions and deletions from the point of view of error-correction: Whereas deletions cost in the rate of the code, insertion costs are borne by the adversary and not the code! We also prove several tight bounds on the parameters of list-decodable insdel codes. In particular, we show that the alphabet size of insdel codes needs to be exponentially large in $ε^{-1}$, where $ε$ is the gap to capacity above. Our result even applies to settings where the unique-decoding capacity equals the list-decoding capacity and when it does so, it shows that the alphabet size needs to be exponentially large in the gap to capacity. This is in sharp contrast to the Hamming error model where alphabet size polynomial in $ε^{-1}$ suffices for unique decoding and also shows that the exponential dependence on the alphabet size in previous works that constructed insdel codes is actually necessary!

Submitted 23 February, 2018; originally announced February 2018.

arXiv:1802.03671 [pdf, ps, other]

Faster Distributed Shortest Path Approximations via Shortcuts

Authors: Bernhard Haeupler, Jason Li

Abstract: A long series of recent results and breakthroughs have led to faster and better distributed approximation algorithms for single source shortest paths (SSSP) and related problems in the CONGEST model. The runtime of all these algorithms, however, is $\tilde{Ω}(\sqrt{n})$, regardless of the network topology, even on nice networks with a (poly)logarithmic network diameter $D$. While this is known to be necessary for some pathological networks, most topologies of interest are arguably not of this type. We give the first distributed approximation algorithms for shortest paths problems that adjust to the topology they are run on, thus achieving significantly faster running times on many topologies of interest. The running time of our algorithms depends on and is close to $Q$, where $Q$ is the quality of the best shortcut that exists for the given topology. While $Q = \tilde{Θ}(\sqrt{n} + D)$ for pathological worst-case topologies, many topologies of interest have $Q = \tilde{Θ}(D)$, which results in near instance optimal running times for our algorithm, given the trivial $Ω(D)$ lower bound. The problems we consider are as follows: (1) an approximate shortest path tree and SSSP distances, (2) a polylogarithmic size distance label for every node such that from the labels of any two nodes alone one can determine their distance (approximately), and (3) an (approximately) optimal flow for the transshipment problem. Our algorithms have a tunable tradeoff between running time and approximation ratio. Our fastest algorithms have an arbitrarily good polynomial approximation guarantee and an essentially optimal $\tilde{O}(Q)$ running time. On the other end of the spectrum, we achieve polylogarithmic approximations in $\tilde{O}(Q \cdot n^ε)$ rounds for any $ε> 0$.

Submitted 7 August, 2018; v1 submitted 10 February, 2018; originally announced February 2018.

Comments: To appear in DISC 2018; 24 pages

arXiv:1801.06237 [pdf, other]

Minor Excluded Network Families Admit Fast Distributed Algorithms

Authors: Bernhard Haeupler, Jason Li, Goran Zuzic

Abstract: Distributed network optimization algorithms, such as minimum spanning tree, minimum cut, and shortest path, are an active research area in distributed computing. This paper presents a fast distributed algorithm for such problems in the CONGEST model, on networks that exclude a fixed minor. On general graphs, many optimization problems, including the ones mentioned above, require $\tilde{Ω}(\sqrt{n})$ rounds of communication in the CONGEST model, even if the network graph has a much smaller diameter. Naturally, the next step in algorithm design is to design efficient algorithms which bypass this lower bound on a restricted class of graphs. Currently, the only known method of doing so uses the low-congestion shortcut framework of Ghaffari and Haeupler [SODA'16]. Building off of their work, this paper proves that excluded minor graphs admit high-quality shortcuts, leading to an $\tilde{O}(D^2)$ round algorithm for the aforementioned problems, where $D$ is the diameter of the network graph. To work with excluded minor graph families, we utilize the Graph Structure Theorem of Robertson and Seymour. To the best of our knowledge, this is the first time the Graph Structure Theorem has been used for an algorithmic result in the distributed setting. Even though the proof is involved, merely showing the existence of good shortcuts is sufficient to obtain simple, efficient distributed algorithms. In particular, the shortcut framework can efficiently construct near-optimal shortcuts and then use them to solve the optimization problems. This, combined with the very general family of excluded minor graphs, which includes most other important graph classes, makes this result of significant interest.

Submitted 18 January, 2018; originally announced January 2018.

MSC Class: 05C83; 68Q85

arXiv:1801.05127 [pdf, other]

Round- and Message-Optimal Distributed Graph Algorithms

Authors: Bernhard Haeupler, D. Ellis Hershkowitz, David Wajc

Abstract: Distributed graph algorithms that separately optimize for either the number of rounds used or the total number of messages sent have been studied extensively. However, algorithms simultaneously efficient with respect to both measures have been elusive. For example, only very recently was it shown that for Minimum Spanning Tree (MST), an optimal message and round complexity is achievable (up to polylog terms) by a single algorithm in the CONGEST model of communication. In this paper we provide algorithms that are simultaneously round- and message-optimal for a number of well-studied distributed optimization problems. Our main result is such a distributed algorithm for the fundamental primitive of computing simple functions over each part of a graph partition. From this algorithm we derive round- and message-optimal algorithms for multiple problems, including MST, Approximate Min-Cut and Approximate Single Source Shortest Paths, among others. On general graphs all of our algorithms achieve worst-case optimal $\tilde{O}(D+\sqrt n)$ round complexity and $\tilde{O}(m)$ message complexity. Furthermore, our algorithms require an optimal $\tilde{O}(D)$ rounds and $\tilde{O}(n)$ messages on planar, genus-bounded, treewidth-bounded and pathwidth-bounded graphs.

Submitted 16 May, 2018; v1 submitted 16 January, 2018; originally announced January 2018.

Comments: To appear in PODC 2018

ACM Class: F.2.2

arXiv:1711.09258 [pdf, other]

Optimal Gossip Algorithms for Exact and Approximate Quantile Computations

Authors: Bernhard Haeupler, Jeet Mohapatra, Hsin-Hao Su

Abstract: This paper gives drastically faster gossip algorithms to compute exact and approximate quantiles. Gossip algorithms, which allow each node to contact a uniformly random other node in each round, have been intensely studied and been adopted in many applications due to their fast convergence and their robustness to failures. Kempe et al. [FOCS'03] gave gossip algorithms to compute important aggregate statistics if every node is given a value. In particular, they gave a beautiful $O(\log n + \log \frac{1}{ε})$ round algorithm to $ε$-approximate the sum of all values and an $O(\log^2 n)$ round algorithm to compute the exact $φ$-quantile, i.e., the $\lceil φn \rceil$-th smallest value. We give a quadratically faster and in fact optimal gossip algorithm for the exact $φ$-quantile problem which runs in $O(\log n)$ rounds. We furthermore show that one can achieve an exponential speedup if one allows for an $ε$-approximation. We give an $O(\log \log n + \log \frac{1}{ε})$ round gossip algorithm which computes a value of rank between $φn$ and $(φ+ε)n$ at every node, for any $0 \leq φ\leq 1$ and $0 < ε< 1$. Our algorithms are extremely simple and very robust: they can be operated with the same running times even if every transmission fails with a, potentially different, constant probability. We also give a matching $Ω(\log \log n + \log \frac{1}{ε})$ lower bound which shows that our algorithm is optimal for all values of $ε$.

Submitted 25 November, 2017; originally announced November 2017.

ACM Class: F.2.2

arXiv:1710.09795 [pdf, other]

Synchronization Strings: Explicit Constructions, Local Decoding, and Applications

Authors: Bernhard Haeupler, Amirbehshad Shahrasbi

Abstract: This paper gives new results for synchronization strings, a powerful combinatorial object that allows one to deal efficiently with insertions and deletions in various communication settings: $\bullet$ We give a deterministic, linear time synchronization string construction, improving over an $O(n^5)$ time randomized construction. Independently of this work, a deterministic $O(n\log^2\log n)$ time construction was just put on arXiv by Cheng, Li, and Wu. We also give a deterministic linear time construction of an infinite synchronization string, which was not known to be computable before. Both constructions are highly explicit, i.e., the $i^{th}$ symbol can be computed in $O(\log i)$ time. $\bullet$ This paper also introduces a generalized notion we call long-distance synchronization strings that allow for local and very fast decoding. In particular, only $O(\log^3 n)$ time and access to logarithmically many symbols is required to decode any index. We give several applications for these results: $\bullet$ For any $δ<1$ and $ε>0$ we provide an insdel correcting code with rate $1-δ-ε$ which can correct any $O(δ)$ fraction of insdel errors in $O(n\log^3n)$ time. This near linear computational efficiency is surprising given that we do not even know how to compute the (edit) distance between the decoding input and output in sub-quadratic time. We show that such codes can not only efficiently recover from $δ$ fraction of insdel errors but, similar to [Schulman, Zuckerman; TransInf'99], also from any $O(δ/\log n)$ fraction of block transpositions and replications. $\bullet$ We show that high explicitness and local decoding allow for infinite channel simulations with exponentially smaller memory and decoding time requirements. These simulations can be used to give the first near linear time interactive coding scheme for insdel errors.

Submitted 9 November, 2017; v1 submitted 26 October, 2017; originally announced October 2017.

arXiv:1707.04233 [pdf, other]

Synchronization Strings: Channel Simulations and Interactive Coding for Insertions and Deletions

Authors: Bernhard Haeupler, Amirbehshad Shahrasbi, Ellen Vitercik

Abstract: We present many new results related to reliable (interactive) communication over insertion-deletion channels. Synchronization errors, such as insertions and deletions, strictly generalize the usual symbol corruption errors and are much harder to protect against. We show how to hide the complications of synchronization errors in many applications by introducing very general channel simulations which efficiently transform an insertion-deletion channel into a regular symbol corruption channel with an error rate larger by a constant factor and a slightly smaller alphabet. We generalize synchronization string based methods which were recently introduced as a tool to design essentially optimal error correcting codes for insertion-deletion channels. Our channel simulations depend on the fact that, at the cost of increasing the error rate by a constant factor, synchronization strings can be decoded in a streaming manner that preserves linearity of time. We also provide a lower bound showing that this constant factor cannot be improved to $1+ε$, in contrast to what is achievable for error correcting codes. Our channel simulations drastically generalize the applicability of synchronization strings. We provide new interactive coding schemes which simulate any interactive two-party protocol over an insertion-deletion channel. Our results improve over the interactive coding schemes of Braverman et al. [TransInf 2017] and Sherstov and Wu [FOCS 2017], which achieve a small constant rate and require exponential time computations, with respect to computational and communication complexities. We provide the first computationally efficient interactive coding schemes for synchronization errors, the first coding scheme with a rate approaching one for small noise rates, and also the first coding scheme that works over arbitrarily small alphabet sizes.

Submitted 20 March, 2018; v1 submitted 13 July, 2017; originally announced July 2017.

arXiv:1705.07369 [pdf, other]

doi 10.1145/3087801.3087808

Broadcasting in Noisy Radio Networks

Authors: Keren Censor-Hillel, Bernhard Haeupler, D. Ellis Hershkowitz, Goran Zuzic

Abstract: The widely-studied radio network model [Chlamtac and Kutten, 1985] is a graph-based description that captures the inherent impact of collisions in wireless communication. In this model, the strong assumption is made that node $v$ receives a message from a neighbor if and only if exactly one of its neighbors broadcasts. We relax this assumption by introducing a new noisy radio network model in which random faults occur at senders or receivers. Specifically, for a constant noise parameter $p \in [0,1)$, either every sender has probability $p$ of transmitting noise or every receiver of a single transmission in its neighborhood has probability $p$ of receiving noise. We first study single-message broadcast algorithms in noisy radio networks and show that the Decay algorithm [Bar-Yehuda et al., 1992] remains robust in the noisy model while the diameter-linear algorithm of Gasieniec et al., 2007 does not. We give a modified version of the algorithm of Gasieniec et al., 2007 that is robust to sender and receiver faults, and extend both this modified algorithm and the Decay algorithm to robust multi-message broadcast algorithms. We next investigate the extent to which (network) coding improves throughput in noisy radio networks. We address the previously perplexing result of Alon et al. 2014 that worst case coding throughput is no better than worst case routing throughput up to constants: we show that the worst case throughput performance of coding is, in fact, superior to that of routing -- by a $Θ(\log(n))$ gap -- provided receiver faults are introduced. However, we show that any coding or routing scheme for the noiseless setting can be transformed to be robust to sender faults with only a constant throughput overhead. These transformations imply that the results of Alon et al., 2014 carry over to noisy radio networks with sender faults.

Submitted 20 May, 2017; originally announced May 2017.

Comments: Principles of Distributed Computing 2017

arXiv:1704.00807 [pdf, other]

doi 10.1145/3055399.3055498

Synchronization Strings: Codes for Insertions and Deletions Approaching the Singleton Bound

Authors: Bernhard Haeupler, Amirbehshad Shahrasbi

Abstract: We introduce synchronization strings as a novel way of efficiently dealing with synchronization errors, i.e., insertions and deletions. Synchronization errors are strictly more general and much harder to deal with than commonly considered half-errors, i.e., symbol corruptions and erasures. For every $ε>0$, synchronization strings allow to index a sequence with an $ε^{-O(1)}$ size alphabet such that one can efficiently transform $k$ synchronization errors into $(1+ε)k$ half-errors. This powerful new technique has many applications. In this paper, we focus on designing insdel codes, i.e., error correcting block codes (ECCs) for insertion deletion channels. While ECCs for both half-errors and synchronization errors have been intensely studied, the latter has largely resisted progress. Indeed, it took until 1999 for the first insdel codes with constant rate, constant distance, and constant alphabet size to be constructed by Schulman and Zuckerman. Insdel codes for asymptotically large or small noise rates were given in 2016 by Guruswami et al. but these codes are still polynomially far from the optimal rate-distance tradeoff. This makes the understanding of insdel codes up to this work equivalent to what was known for regular ECCs after Forney introduced concatenated codes in his doctoral thesis 50 years ago. A direct application of our synchronization strings based indexing method gives a simple black-box construction which transforms any ECC into an equally efficient insdel code with a slightly larger alphabet size. This instantly transfers much of the highly developed understanding for regular ECCs over large constant alphabets into the realm of insdel codes. Most notably, we obtain efficient insdel codes which get arbitrarily close to the optimal rate-distance tradeoff given by the Singleton bound for the complete noise spectrum.

Submitted 3 April, 2017; originally announced April 2017.

Showing 1–50 of 90 results for author: Haeupler, B