Abstract
We study the classic load balancing problem on dynamic general graphs, where the graph changes arbitrarily between the computational rounds, remaining connected with no permanent cut. A lower bound of \({\Omega }(n^{2})\) on the running time in the dynamic setting, where n is the number of nodes in the graph, is known even for randomized algorithms. We solve the problem by deterministic distributed algorithms based on a short local deal-agreement communication of proposal/deal in the neighborhood of each node. Our synchronous load balancing algorithms achieve a discrepancy of 𝜖 within time \(O(nD \log (nK/\epsilon ))\) for the continuous setting and, for the discrete setting, a discrepancy of at most 2D within time \(O(n D \log (n K/D))\) and a 1-balanced state within additional time \(O(nD^{2})\), where K is the initial discrepancy and D is a bound on the graph diameter. The stability of the achieved 1-balanced state is also studied. The above results are extended to the case of unbounded diameter, essentially keeping the time bounds, via a special averaging of the graph diameter over time. Our algorithms can be considered anytime algorithms, in the sense that they can be stopped at any point during the execution, since they never make loads negative and never worsen the state as the execution progresses. In addition, we describe a version of our algorithms in which each node may transfer load to and from several neighbors at each round, as a heuristic for better performance. The algorithms are generalized to the asynchronous distributed model. We also introduce a self-stabilizing version of our asynchronous algorithms.
References
Khan, S., Nazir, B., Khan, I.A., Shamshirband, S., Chronopoulos, A.T.: Load balancing in grid computing: Taxonomy, trends and opportunities. J. Netw. Comput. Appl. 88, 99–111 (2017). https://doi.org/10.1016/j.jnca.2017.02.013
Randhawa, S., Jain, S.: MLBC: Multi-objective load balancing clustering technique in wireless sensor networks. Appl. Soft Comput. 74, 66–89 (2019). https://doi.org/10.1016/j.asoc.2018.10.002
Mishra, S.K., Sahoo, B., Parida, P.P.: Load balancing in cloud computing: a big picture. J. King Saud Univ. Comput. Inf. Sci. 32(2), 149–158 (2020). https://doi.org/10.1016/j.jksuci.2018.01.003
Aghdashi, A., Mirtaheri, S.L.: Novel dynamic load balancing algorithm for cloud-based big data analytics. J. Supercomput. 78(3), 4131–4156 (2022). https://doi.org/10.1007/s11227-021-04024-8
Dinitz, M., Fineman, J.T., Gilbert, S., Newport, C.: Load balancing with bounded convergence in dynamic networks. In: 2017 IEEE Conference on computer communications, INFOCOM 2017, Atlanta, GA, USA, May 1-4, 2017, pp. 1–9 (2017)
Linial, N.: Locality in distributed graph algorithms. SIAM J. Comput. 21(1), 193–201 (1992)
Peleg, D.: Distributed Computing: A Locality-Sensitive Approach. Society for Industrial and Applied Mathematics, USA (2000)
Feuilloley, L., Hirvonen, J., Suomela, J.: Locally optimal load balancing. In: Distributed computing - 29th international symposium, DISC 2015, Tokyo, Japan, October 7-9, 2015, Proceedings, pp. 544–558 (2015)
Gilbert, S., Meir, U., Paz, A., Schwartzman, G.: On the complexity of load balancing in dynamic networks. In: SPAA ’21: 33rd ACM symposium on parallelism in algorithms and architectures, virtual event, USA, 6-8 July, 2021, pp. 254–264 (2021)
Dinitz, Y., Dolev, S., Kumar, M.: Brief announcement: Local deal-agreement based monotonic distributed algorithms for load balancing in general graphs. In: Stabilization, safety, and security of distributed systems - 22nd international symposium, SSS 2020, Austin, TX, USA, November 18-21, 2020, Proceedings, pp. 113–117 (2020)
Dolev, S., Kumar, M.: Self-stabilizing local load balancing (phd track technical report). In: International symposium on Cyber security cryptology and machine learning. BGU CS technical report #19-01 (2019)
Dinitz, Y., Dolev, S., Kumar, M.: Local deal-agreement based monotonic distributed algorithms for load balancing in general graphs. CoRR arXiv:2010.02486 (2020)
Rabani, Y., Sinclair, A., Wanka, R.: Local divergence of Markov chains and the analysis of iterative load balancing schemes. In: 39th Annual symposium on foundations of computer science, FOCS ’98, November 8-11, 1998, Palo Alto, California, USA, pp. 694–705. https://doi.org/10.1109/SFCS.1998.743520 (1998)
Berenbrink, P., Klasing, R., Kosowski, A., Mallmann-Trenn, F., Uznanski, P.: Improved analysis of deterministic load-balancing schemes. ACM Trans. Algorithms 15(1), 10:1–10:22 (2019). https://doi.org/10.1145/3282435
Sauerwald, T., Sun, H.: Tight bounds for randomized load balancing on arbitrary network topologies. In: 53rd Annual IEEE symposium on foundations of computer science, FOCS 2012, New Brunswick, NJ, USA, October 20-23, 2012, pp. 341–350. https://doi.org/10.1109/FOCS.2012.86 (2012)
Aiello, W., Awerbuch, B., Maggs, B.M., Rao, S.: Approximate load balancing on dynamic and asynchronous networks. In: Proceedings of the twenty-fifth annual ACM symposium on theory of computing, May 16-18, 1993, San Diego, CA, USA, pp. 632–641. https://doi.org/10.1145/167088.167250 (1993)
Dean, T.L., Boddy, M.S.: An analysis of time-dependent planning. In: Proceedings of the 7th national conference on artificial intelligence, St. Paul, MN, USA, August 21-26, 1988, pp. 49–54. http://www.aaai.org/Library/AAAI/1988/aaai88-009.php (1988)
Horvitz, E.: Reasoning about beliefs and actions under computational resource constraints. Int. J. Approx. Reason. 2(3), 337–338 (1988)
Elsässer, R., Sauerwald, T.: Discrete load balancing is (almost) as easy as continuous load balancing. In: Proceedings of the 29th annual ACM symposium on principles of distributed computing, PODC 2010, Zurich, Switzerland, July 25-28, 2010, pp. 346–354. https://doi.org/10.1145/1835698.1835780 (2010)
Friedrich, T., Gairing, M., Sauerwald, T.: Quasirandom load balancing. In: Proceedings of the Twenty-first annual ACM-SIAM SODA 2010, Austin, Texas, USA, January 17-19, 2010, pp. 1620–1629 (2010)
Akbari, H., Berenbrink, P., Sauerwald, T.: A simple approach for adapting continuous load balancing processes to discrete settings. In: ACM Symposium on principles of distributed computing, PODC ’12, Funchal, Madeira, Portugal, July 16-18, 2012, pp. 271–280. https://doi.org/10.1145/2332432.2332486 (2012)
Dinitz, M., Fineman, J.T., Gilbert, S., Newport, C.: Smoothed analysis of dynamic networks. Distributed Comput. 31(4), 273–287 (2018). https://doi.org/10.1007/s00446-017-0300-8
Kuhn, F., Oshman, R.: Dynamic networks: Models and algorithms. SIGACT News. 42(1), 82–96 (2011)
Dolev, S., Segala, R., Shvartsman, A.A.: Dynamic load balancing with group communication. Theor. Comput. Sci 369(1-3), 348–360 (2006). https://doi.org/10.1016/j.tcs.2006.09.020
Dolev, S.: Self-Stabilization. MIT Press, Cambridge (2000)
Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann Publishers Inc, San Francisco (1996)
Dolev, S., Hanemann, A., Schiller, E.M., Sharma, S.: Self-stabilizing end-to-end communication in (bounded capacity, omitting, duplicating and non-fifo) dynamic networks - (extended abstract). In: Stabilization, safety, and security of distributed systems - 14th international symposium, SSS 2012, Toronto, Canada, October 1-4, 2012. Proceedings, pp. 133–147. https://doi.org/10.1007/978-3-642-33536-5_14 (2012)
Dolev, S., Herman, T.: Superstabilizing protocols for dynamic distributed systems. Chic. J. Theor. Comput. Sci. vol. 1997 (1997)
Cybenko, G.: Dynamic load balancing for distributed memory multiprocessors. J. Parallel Distributed Comput. 7(2), 279–301 (1989). https://doi.org/10.1016/0743-7315(89)90021-X
Boillat, J.E.: Load balancing and Poisson equation in a graph. Concurr. Pract. Exp. 2(4), 289–314 (1990). https://doi.org/10.1002/cpe.4330020403
Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation: Numerical Methods. Prentice-Hall, Inc, USA (1989)
Berenbrink, P., Cooper, C., Friedetzky, T., Friedrich, T., Sauerwald, T.: Randomized diffusion for indivisible loads. In: Proceedings of the Twenty-second annual ACM-SIAM symposium on discrete algorithms, SODA 2011, San Francisco, California, USA, January 23-25, 2011, pp. 429–439. https://doi.org/10.1137/1.9781611973082.34 (2011)
Ghosh, B., Leighton, F.T., Maggs, B.M., Muthukrishnan, S., Plaxton, C.G., Rajaraman, R., Richa, A.W., Tarjan, R.E., Zuckerman, D.: Tight analyses of two local load balancing algorithms. In: Proceedings of the twenty-seventh annual ACM symposium on theory of computing, 29 May-1 June 1995, Las Vegas, Nevada, USA, pp. 548–558. https://doi.org/10.1145/225058.225272 (1995)
Muthukrishnan, S., Ghosh, B., Schultz, M.H.: First- and second-order diffusive methods for rapid, coarse, distributed load balancing. Theory Comput. Syst 31(4), 331–354 (1998). https://doi.org/10.1007/s002240000092
Elsässer, R., Monien, B., Schamberger, S.: Distributing unit size workload packages in heterogeneous networks. J. Graph Algorithms Appl 10(1), 51–68 (2006). https://doi.org/10.7155/jgaa.00118
Friedrich, T., Sauerwald, T.: Near-perfect load balancing by randomized rounding. In: Proceedings of the 41st annual ACM symposium on theory of computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pp. 121–130. https://doi.org/10.1145/1536414.1536433 (2009)
Aspnes, J., Herlihy, M., Shavit, N.: Counting networks and multi-processor coordination. In: Proceedings of the 23rd annual ACM symposium on theory of computing, May 5-8, 1991, New Orleans, Louisiana, USA, pp. 348–358. https://doi.org/10.1145/103418.103421 (1991)
Aspnes, J., Herlihy, M., Shavit, N.: Counting networks and multi-processor coordination. In: Proceedings of the Twenty-Third Annual ACM STOC. STOC ’91, pp. 348–358. Association for Computing Machinery, New York, NY, USA (1991)
Sudo, Y., Datta, A.K., Larmore, L.L., Masuzawa, T.: Self-stabilizing token distribution with constant-space for trees. In: 22nd International conference on principles of distributed systems, OPODIS 2018, December 17-19, 2018, Hong Kong, China, pp. 31:1–31:16. https://doi.org/10.4230/LIPIcs.OPODIS.2018.31 (2018)
Flatebo, M., Datta, A.K., Bourgon, B.: Self-stabilizing load balancing algorithms. In: Proceeding of 13th IEEE annual international phoenix conference on computers and communications, pp. 303 (1994)
Song, J.: A partially asynchronous and iterative algorithm for distributed load balancing. In: The seventh international parallel processing symposium, Proceedings, Newport Beach, California, USA, April 13-16, 1993, pp. 358–362 (1993)
Acknowledgements
This research was (partially) funded by the Office of the Israel Innovation Authority of the Israel Ministry of Economy under Genesis generic research project, the Rita Altura trust chair in computer science, and by the Lynne and William Frankel Center for Computer Science.
Appendix A: Related Work for Load Balancing in Static Graphs
Most of the research on the load balancing problem concerns static graphs. Although our contribution is primarily for dynamic networks, for the sake of completeness we also review load balancing research in the static case. For a summary of the results for both static and dynamic settings, see Table 1. The two pioneering papers solving the problem are those of Cybenko [29] and Boillat [30]. Both are based on the concept of diffusion: at every synchronized round, every node divides a certain part of its load equally among its neighbors, keeping the rest of its load for itself; for regular graphs, the load fraction kept for itself is usually set equal to that sent to each neighbor. Many further solutions are based on diffusion; see, e.g., [13, 16, 21, 31, 32, 33]. These papers use Markov chains and ergodic theory to derive the rate of convergence; most of them consider d-regular graphs. This approach works smoothly for the continuous setting. In the paper of Rabani et al. [13], the time bound \(O\left (\frac {\log (Kn/\epsilon )}{1-\lambda }\right )\) was established for reaching a discrepancy of 𝜖, where λ is the second largest eigenvalue of the diffusion matrix. This bound has the advantage of depending on the graph-specific parameter λ. Note that the factor \(\frac {1}{1-\lambda }\) is \({\Theta }(n^{2})\) in the worst case, e.g., for a graph that is a cycle.
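To make the diffusion step concrete, the following is a minimal Python sketch of one continuous diffusion round, under the assumption that every node keeps one share and sends one share of size load/(deg + 1) to each neighbor (the regular-graph convention described above). The helper names `cycle`, `diffusion_round`, `discrepancy`, and `cycle_inverse_gap` are hypothetical and do not come from the cited papers. The last helper evaluates \(1/(1-\lambda )\) for the cycle using the eigenvalues of the circulant matrix with entries 1/3, which illustrates the quadratic growth in n.

```python
import math

def cycle(n):
    """Adjacency lists of an n-node cycle (a 2-regular graph)."""
    return [[(v - 1) % n, (v + 1) % n] for v in range(n)]

def diffusion_round(loads, neighbors):
    """One synchronous round: every node splits its load into deg+1 equal
    shares, keeps one share, and sends one share to each neighbor."""
    share = [loads[v] / (len(neighbors[v]) + 1) for v in range(len(loads))]
    return [share[v] + sum(share[u] for u in neighbors[v])
            for v in range(len(loads))]

def discrepancy(loads):
    """Difference between the maximum and the minimum load."""
    return max(loads) - min(loads)

def cycle_inverse_gap(n):
    """1/(1 - lambda) for the n-cycle, where lambda = (1 + 2*cos(2*pi/n))/3
    is the second largest eigenvalue of the diffusion matrix assumed here
    (each node keeps 1/3 and sends 1/3 to each of its two neighbors)."""
    lam = (1 + 2 * math.cos(2 * math.pi / n)) / 3
    return 1 / (1 - lam)

# Illustration: all load starts on one node of an 8-cycle.
loads = [80.0] + [0.0] * 7
graph = cycle(8)
for _ in range(200):
    loads = diffusion_round(loads, graph)
print(discrepancy(loads), sum(loads))

# Doubling n roughly quadruples 1/(1 - lambda), consistent with Theta(n^2).
print(cycle_inverse_gap(200) / cycle_inverse_gap(100))
```

The total load is conserved by construction, since each node hands out exactly deg + 1 shares of its own load; on a regular graph every new load is an average of old loads, so the discrepancy cannot grow.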
In the discrete setting, diffusion methods require rounding of every transferred amount, which makes the analysis harder. Let us begin with the deterministic algorithms. Muthukrishnan et al. [34] refer to the usual continuous diffusion model as the first-order scheme and extend it to the second-order scheme, which takes \(O(\log (Kn)/(1-\lambda ))\) time to achieve a discrepancy of \(O(dn/(1-\lambda ))\). Further progress was made by Rabani et al. [13], who introduced the so-called local divergence, which is an effective way to compute the deviation between the actual load and the load generated by a Markov chain. They proved that the local divergence yields a bound on the maximum deviation between the continuous and discrete cases for both the diffusion and matching models. They also proved that the discrepancy can be reduced to \(O (d \log (Kn)/(1 - \lambda ))\) in \(O(\log (Kn)/(1-\lambda ))\) rounds for a d-regular graph. Note that the final discrepancy achieved by their method is not a constant. As mentioned in [19], the discrepancy cannot be reduced below \({\Omega }(d_{\min } D)\) by a deterministic diffusion-based algorithm, where D is the diameter and \(d_{\min }\) is the minimum node degree of the graph.
Randomized diffusion-based algorithms for the discrete setting [20, 21, 32] are algorithms in which every node distributes its load as evenly as possible among its neighbors and itself; if the remaining load cannot be distributed without dividing some load unit, the node sends the remaining units to its neighbors at random. The bounds for the discrepancy are achieved w.h.p. (with high probability). Berenbrink et al. [32] present a randomized diffusion-based discrete load balancing algorithm whose discrepancy depends on the expansion properties of the graph (depending, e.g., on λ and the maximum degree d). Another, quasirandom, diffusion-based algorithm for general graphs [20] relies on the bounded-error property, where the sum of rounding errors on each edge is bounded by some constant at all times. The discrepancy results of the randomized algorithms in [20, 32] are further improved to \(O(d^{2} \sqrt {\log n})\) and \(O(d \sqrt {\log n})\), respectively, by applying the results from [15], where tighter bounds are obtained for certain graph parameters used in the discrepancy bounds of [20, 32]. Akbari et al. [21] discussed a randomized discrete load balancing algorithm for general graphs that balances the load up to a discrepancy of \(O(\sqrt {d \log n})\) in \(O (\log (Kn)/(1 - \lambda ))\) time, where d is the maximum degree. With the algorithm of Akbari et al. [21], the discrepancy is also reduced for specific graph topologies, e.g., to \(O(\log n)\) for hypercubes and to \(O(\sqrt {\log n})\) for expanders and tori.
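The randomized rounding step described above can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of [20, 21, 32]: each node gives the same integer base amount to itself and to every neighbor, and the leftover units (fewer than deg + 1) go to distinct neighbors chosen uniformly at random. The names `cycle` and `randomized_diffusion_round` are hypothetical.

```python
import random

def cycle(n):
    """Adjacency lists of an n-node cycle (a 2-regular graph)."""
    return [[(v - 1) % n, (v + 1) % n] for v in range(n)]

def randomized_diffusion_round(loads, neighbors, rng):
    """One synchronous round of discrete diffusion with random rounding.

    Assumption of this sketch: leftover units are sent to distinct
    neighbors drawn uniformly at random, one unit per chosen neighbor.
    """
    new_loads = [0] * len(loads)
    for v, nbrs in enumerate(neighbors):
        targets = [v] + list(nbrs)
        # Even part: every target (the node itself included) gets `base`.
        base, extra = divmod(loads[v], len(targets))
        for u in targets:
            new_loads[u] += base
        # Indivisible remainder: extra <= deg(v), so sampling is valid.
        for u in rng.sample(list(nbrs), extra):
            new_loads[u] += 1
    return new_loads

rng = random.Random(0)  # fixed seed only to make the run reproducible
loads = [80] + [0] * 7
graph = cycle(8)
for _ in range(50):
    loads = randomized_diffusion_round(loads, graph, rng)
print(loads, sum(loads))
```

Loads stay non-negative integers and the total is preserved, because every unit held by a node is assigned to exactly one target.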
Elsässer et al. [19, 35] propose load balancing algorithms based on randomized post-processing, using random walks, for decreasing the discrepancy to a constant w.h.p. Elsässer et al. [35] present an approach achieving a constant discrepancy after \(O(\log K + (\log n)^{2}/(1- \lambda ))\) steps. Elsässer et al. [19] improved that result by reducing the time to \(O \left (\frac {\log (Kn)}{1 - \lambda }\right )\) w.h.p. Berenbrink et al. [14] introduced the cumulatively fair balancer algorithms, which include the rotor-router model, and gave an upper bound of \(O(d \cdot \min (\sqrt {\log n/(1- \lambda )}, \sqrt {n}))\) on the discrepancy for d-regular graphs.
One of the alternatives to diffusion is the matching approach, also known as the dimension exchange model. There, a node matching is chosen at the beginning of each synchronized round, and after that, every two matched nodes balance their loads. In known deterministic algorithms, those matchings are usually chosen in advance according to the graph structure; in randomized algorithms, they are chosen randomly. The randomized algorithm of Friedrich et al. [36] reaches a discrepancy of \(O \left (\sqrt { \log ^{3} n / (1- \lambda )} \right )\) in \(O(\log (Kn)/ (1-\lambda ))\) steps w.h.p. for general graphs. This result is improved by Sauerwald et al. [15], who achieve a constant discrepancy in \(O(\log (Kn)/(1-\lambda ))\) steps w.h.p. for regular graphs. The above constant is independent of the graph and of the initial discrepancy K. The deterministic matching algorithms of Feuilloley et al. [8] make extensive use of communication between nodes; they also use graph preprocessing. They achieve a 1-balanced final state for general graphs, in time that does not depend on the graph size n but depends cubically, and for the discrete setting also exponentially, on the initial discrepancy K.
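A round of the matching model can be sketched in Python as below. The sketch makes two assumptions that are not taken from the cited papers: the random matching is built greedily from a shuffled edge list, and in the discrete case the node that was more loaded keeps the odd unit. The names `random_matching` and `matching_round` are hypothetical.

```python
import random

def random_matching(edges, rng):
    """Greedy random matching: scan the edges in random order and keep an
    edge whenever both of its endpoints are still unmatched."""
    shuffled = list(edges)
    rng.shuffle(shuffled)
    matched, matching = set(), []
    for u, v in shuffled:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

def matching_round(loads, matching):
    """Every matched pair balances its integer loads; if the pair total is
    odd, the previously more loaded node keeps the extra unit (assumption)."""
    new_loads = list(loads)
    for u, v in matching:
        total = loads[u] + loads[v]
        low, high = total // 2, total - total // 2
        if loads[u] >= loads[v]:
            new_loads[u], new_loads[v] = high, low
        else:
            new_loads[u], new_loads[v] = low, high
    return new_loads

rng = random.Random(1)  # fixed seed only to make the run reproducible
n = 8
edges = [(v, (v + 1) % n) for v in range(n)]  # an 8-cycle
loads = [81] + [0] * (n - 1)
for _ in range(100):
    loads = matching_round(loads, random_matching(edges, rng))
print(loads, max(loads) - min(loads))
```

Since each balanced pair ends with loads between the pair's previous minimum and maximum, a round never increases the discrepancy and never changes the total load.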
Yet another approach, balancing circuits [13, 37], is based on sorting circuits [13], where each 2-input comparator balances the loads of its input nodes (instead of comparing them); see, e.g., [38]. The randomized balancing circuits algorithm of [15] achieves a constant final discrepancy w.h.p.
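As a small illustration of replacing comparators by balancers, the sketch below applies a fixed sequence of balancer layers to continuous loads. The layout is an assumption chosen for simplicity and is not taken from [13, 37, 38]: with n = 2^k wires, layer i pairs the wires whose indices differ in bit i (the hypercube dimension exchange pattern). The name `dimension_exchange` is hypothetical.

```python
def dimension_exchange(loads):
    """Apply log2(n) layers of 2-input balancers to continuous loads.

    Layer i pairs wires u and u XOR 2^i and replaces both loads by their
    average. The number of wires must be a power of two.
    """
    n = len(loads)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("number of wires must be a power of two")
    loads = list(loads)
    bit = 1
    while bit < n:
        for u in range(n):
            v = u ^ bit
            if u < v:  # handle each pair once
                avg = (loads[u] + loads[v]) / 2
                loads[u] = loads[v] = avg
        bit <<= 1
    return loads

print(dimension_exchange([8, 0, 0, 0, 0, 0, 0, 0]))
```

With this layout and divisible load, every wire holds the global average after the last layer, because layer i equalizes the two halves that differ in bit i; in the discrete setting each balancer would have to round, which is where a nonzero final discrepancy comes from.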
In the asynchronous load balancing model, both local computations and messages may be delayed, but each message is eventually delivered and each computation is eventually performed. Aiello et al. [16] and Ghosh et al. [33] introduced matching-based asynchronous load balancing algorithms with the restriction that in each round, only one unit of load is transferred between two nodes. Because of this restriction, these asynchronous load balancing algorithms require more time to converge. The algorithms of [16, 33] turn the asynchronous setting into a synchronous one by appropriately enlarging the time unit.
There is literature on the self-stabilizing token distribution problem [39], where any number of tokens are arbitrarily distributed and each node has an arbitrary state. The goal is that each node has exactly k tokens after the execution of the algorithm. Sudo et al. [39] introduced an algorithm for the asynchronous model and rooted tree networks, where the root can push/pull tokens to/from an external store and each node knows the value of k. Two self-stabilizing algorithms for transferring load (tasks) around the network are presented by Flatebo et al. [40]. These algorithms are driven by a new task received from the environment, which triggers send task or start task, rather than being activated when no task is received, which is our scope here.
About this article
Cite this article
Dinitz, Y., Dolev, S. & Kumar, M. Local Deal-Agreement Algorithms for Load Balancing in Dynamic General Graphs. Theory Comput Syst 67, 348–382 (2023). https://doi.org/10.1007/s00224-022-10097-6
DOI: https://doi.org/10.1007/s00224-022-10097-6