research-article

Shortest Paths Discovery in Uncertain Networks via Transfer Learning

Authors:

Zhifeng Bao

Proceedings of the ACM on Management of Data, Volume 1, Issue 2

Article No.: 141, Pages 1 - 25

https://doi.org/10.1145/3589286

Published: 20 June 2023

Abstract

For reasons such as noisy measurement and privacy preservation, a network (graph) is often uncertain: each edge exists only with some probability. In this paper, we study the problem of finding the most probable shortest path, i.e., the path with the highest probability of being the shortest path between a given pair of nodes in an uncertain network. Despite significant progress, existing solutions to this problem still suffer from efficiency and scalability issues. The state-of-the-art adopts a two-phase approach: Phase 1 generates candidate paths, and Phase 2 estimates each candidate's probability of being the shortest path and returns the candidate with the highest probability as the solution. Notably, Phase 2 requires a large number of simulations over all edges in the network and can easily dominate the cost of the whole process. We aim to resolve the efficiency and scalability issues by optimizing Phase 2. Specifically, we first propose a non-learning-based fast approximation technique that significantly reduces the number of samples needed for the probability estimation in each simulation. We then propose a learning-based method that directly estimates the probability of each candidate path without costly simulations. Extensive experiments show that (1) compared to the state-of-the-art, our fast approximation technique and learning-based method achieve up to 5x and 210x speedups in Phase 2, respectively, while maintaining highly competitive or even equivalent results; (2) the training process is highly scalable; and (3) the prediction function works effectively under problem settings different from the one on which it was trained.
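To make the cost of Phase 2 concrete, the following is a minimal Python sketch of a plain Monte Carlo estimator for the probability that each candidate path is the shortest path. It is an illustrative baseline only, not the fast approximation technique or the learning-based method proposed in the paper. The function names (`sample_world`, `shortest_distance`, `estimate_probabilities`), the undirected-edge representation, and the assumption that edges exist independently are all choices made for this example; the candidate paths from Phase 1 are assumed to be given.

```python
import heapq
import random


def sample_world(edges, rng):
    """Sample one possible world: keep each edge independently with its probability."""
    return {e: w for e, (w, p) in edges.items() if rng.random() < p}


def shortest_distance(world, source, target):
    """Dijkstra on one sampled, deterministic, undirected world."""
    adj = {}
    for (u, v), w in world.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")  # target unreachable in this world


def path_edges(path):
    """Convert a node sequence into normalized undirected edge keys."""
    return [tuple(sorted(pair)) for pair in zip(path, path[1:])]


def estimate_probabilities(edges, candidates, source, target,
                           num_samples=2000, seed=0):
    """Estimate, for each candidate, the fraction of sampled worlds in which
    the path exists and its length equals the shortest source-target distance.
    Paths that tie for shortest in a world are all counted for that world."""
    # Normalize edge keys so (u, v) and (v, u) refer to the same edge.
    edges = {tuple(sorted(e)): wp for e, wp in edges.items()}
    rng = random.Random(seed)
    hits = [0] * len(candidates)
    for _ in range(num_samples):
        world = sample_world(edges, rng)
        best = shortest_distance(world, source, target)
        for i, path in enumerate(candidates):
            pe = path_edges(path)
            if all(e in world for e in pe):
                length = sum(world[e] for e in pe)
                if length <= best + 1e-9:
                    hits[i] += 1
    return [h / num_samples for h in hits]


if __name__ == "__main__":
    # Hypothetical toy network: edge -> (length, existence probability).
    toy_edges = {("s", "a"): (1, 0.5), ("a", "t"): (1, 0.5), ("s", "t"): (3, 0.9)}
    toy_candidates = [["s", "a", "t"], ["s", "t"]]
    print(estimate_probabilities(toy_edges, toy_candidates, "s", "t"))
```

In this toy network the exact probabilities are 0.5 x 0.5 = 0.25 for s-a-t and 0.9 x (1 - 0.25) = 0.675 for s-t, so the estimates should approach these values as the number of samples grows. Every sample touches every edge and runs a full shortest-path computation, which illustrates why this phase can dominate the overall cost on large networks.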

Supplemental Material

MP4 File

Presentation video for SIGMOD 2023


Cited By

Bača R. (2024). Window Function Expression: Let the Self-Join Enter. Proceedings of the VLDB Endowment 17:9 (2162-2174). DOI: 10.14778/3665844.3665848. Online publication date: 1-May-2024.
https://dl.acm.org/doi/10.14778/3665844.3665848
Yan Y., Wong R. (2024). Proximity Queries on Point Clouds using Rapid Construction Path Oracle. Proceedings of the ACM on Management of Data 2:1 (1-26). DOI: 10.1145/3639261. Online publication date: 26-Mar-2024.
https://dl.acm.org/doi/10.1145/3639261

Index Terms

Shortest Paths Discovery in Uncertain Networks via Transfer Learning
1. Computing methodologies
  1. Machine learning
2. Theory of computation
  1. Design and analysis of algorithms

Published In

Proceedings of the ACM on Management of Data Volume 1, Issue 2

PACMMOD

June 2023

2310 pages

EISSN:2836-6573

DOI:10.1145/3605748

Editor:
Divyakant Agrawal
UC Santa Barbara, United States

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Australian Research Council
