
ABRA: Approximating Betweenness Centrality in Static and Dynamic Graphs with Rademacher Averages

Published: 20 July 2018

Abstract

ΑΒΡΑΞΑΣ (ABRAXAS): Gnostic word of mystic meaning.
We present ABRA, a suite of algorithms to compute and maintain probabilistically guaranteed high-quality approximations of the betweenness centrality of all nodes (or edges) on both static and fully dynamic graphs. Our algorithms use progressive random sampling, and their analysis relies on Rademacher averages and pseudodimension, fundamental concepts from statistical learning theory. To our knowledge, ABRA is the first application of these concepts to the field of graph analysis. Our experimental results show that ABRA is much faster than exact methods and vastly outperforms state-of-the-art algorithms with the same quality guarantees in runtime, number of samples, and accuracy.
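
The sampling idea behind such approximations can be illustrated with a minimal Python sketch. It is not the authors' implementation: it draws a fixed number of node pairs and deliberately omits the Rademacher-average stopping rule that drives ABRA's progressive sampling. The function names (`bfs_counts`, `pair_fractions`, `estimate_betweenness`), the adjacency-dict representation, and the restriction to unweighted, undirected graphs are all assumptions made for illustration. Betweenness is normalized here by the number of ordered node pairs, n(n-1).

```python
import random
from collections import deque


def bfs_counts(adj, source):
    """Return (dist, sigma): hop distances from `source` and the number of
    shortest paths from `source` to every reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:            # first time y is reached
                dist[y] = dist[x] + 1
                sigma[y] = 0
                queue.append(y)
            if dist[y] == dist[x] + 1:   # x precedes y on a shortest path
                sigma[y] += sigma[x]
    return dist, sigma


def pair_fractions(adj, u, v):
    """For the pair (u, v), map each inner node w to the fraction of
    shortest u-v paths passing through w (undirected graphs only)."""
    dist_u, sigma_u = bfs_counts(adj, u)
    if v not in dist_u:                  # v unreachable: no contribution
        return {}
    dist_v, sigma_v = bfs_counts(adj, v)
    total = sigma_u[v]
    return {
        w: sigma_u[w] * sigma_v[w] / total
        for w in dist_u
        if w != u and w != v and dist_u[w] + dist_v[w] == dist_u[v]
    }


def estimate_betweenness(adj, n_samples, rng):
    """Estimate normalized betweenness by averaging path fractions over
    `n_samples` ordered node pairs drawn uniformly at random.
    Assumption: a fixed sample size stands in for a principled stopping rule."""
    nodes = sorted(adj)
    estimate = dict.fromkeys(nodes, 0.0)
    for _ in range(n_samples):
        u, v = rng.sample(nodes, 2)      # uniform ordered pair, u != v
        for w, fraction in pair_fractions(adj, u, v).items():
            estimate[w] += fraction / n_samples
    return estimate


if __name__ == "__main__":
    # Hypothetical toy input: a star with center 0 and leaves 1..4.
    star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
    print(estimate_betweenness(star, 2000, random.Random(0)))
```

On the star graph, 12 of the 20 ordered pairs have both endpoints among the leaves, so the exact normalized betweenness of the center is 0.6 and that of every leaf is 0. The sketch's estimate for the center should approach 0.6 as the number of samples grows. How many samples are enough for a given accuracy and confidence is the question that the Rademacher-average analysis addresses, and this sketch leaves it open.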


Published In

ACM Transactions on Knowledge Discovery from Data, Volume 12, Issue 5
October 2018
354 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3234931
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 July 2018
Accepted: 01 April 2018
Revised: 01 April 2018
Received: 01 July 2017
Published in TKDD Volume 12, Issue 5

Author Tags

  1. Centrality measures
  2. pseudodimension
  3. statistical learning theory
  4. uniform bounds

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Two Sigma Investments, LP
  • NIH
  • NSF
