Seven-Layer Model in Complex Networks Link Prediction: A Survey
Abstract
1. Introduction
2. Problem Formulation
3. Link Prediction Seven-Layer Model
3.1. Network Layer
3.2. Metadata Layer
3.2.1. Time Features
3.2.2. Topological Features
3.2.3. Weight Features
3.2.4. Attributive Features
3.2.5. Label Features
3.2.6. Directional Features
3.2.7. Symbolic Features
3.2.8. Auxiliary Information
- Role. As auxiliary information in link prediction, role mainly concerns strong and weak ties. Strong ties are stable, deep social relations, whereas weak ties are more flexible and extensive. In social networks, approximately 20% of relational connections are strong and 80% are weak. Weak ties are more crucial than strong ties: Liu et al. [21] claimed that weak ties significantly influence link prediction and that using them can improve prediction performance.
- Centrality. In many real networks, node centrality also significantly influences link prediction performance, because nodes tend to link to nodes that are both similar and central. Li et al. [22] used a maximal entropy random walk that incorporates node centrality and reported better performance than link prediction methods without centrality. Ma et al. [23] proposed a centrality information model that improves link prediction using node importance; it covers several centrality measures, such as degree, closeness, betweenness, and eigenvector centrality.
- Homophily. Wang et al. [24] used homophily theory to uncover the internal relationship between links and attributes in a network. Combining link prediction and attribute inference with community structure, they proposed a community-based link and attribute inference approach that predicts both attributes and links and iteratively improves the accuracy of both tasks. For social networks, Yang et al. [25] proposed a model that uses homophily theory to connect users with the services in which they are interested and to connect users with common interests, so that friendship and interests propagate effectively. Weng et al. [26] describe each user with a set of parameters associated with different link creation strategies and use maximum likelihood estimation to confirm that triadic closure has a strong effect on link formation.
- Community. Using the community structure of the network as input metadata makes it easier to discover hidden regularities in the network, predict its behavior, analyze its topology, and understand and explain its functions. Weng et al. [27] propose a practical method that converts data about community structure into predictive knowledge about which information will be widely disseminated. Valverde-Rebaza and Lopes [28] combine topological structure with community information; their approach is efficient and improves link prediction performance on large-scale directed and asymmetric social networks.
3.3. Feature Classification Layer
- Graph structure features. Graph structure features are derived from the observed node and edge structure of the network and can be directly observed and calculated. Link prediction heuristics such as Common Neighbors, Jaccard, preferential attachment, Adamic-Adar, resource allocation, Katz, PageRank, and SimRank are graph structure features. So are degree, closeness, betweenness, and eigenvector centrality. These features are inductive, meaning that they are not tied to a specific node or network. Cukierski et al. [29] used 94 distinct graph features as input metadata for classification with a random forest (RF) and, at the same time, proposed several variants of similarity methods for link prediction. Their results show that combining features can achieve better prediction.
- Latent features. A latent feature is a hidden attribute or representation of a node, usually obtained by factorizing a specific matrix derived from the network. Latent features are powerful in link prediction. Each entity is assumed to be associated with an unobserved latent vector, and the probability of a link is calculated from the interaction between these latent features. They reveal structural relationships between entities, can be learned automatically, and can yield accurate predictions. However, latent features focus more on global properties and long-range effects, fail to capture structural similarities between nodes, and are less interpretable than graph structure features.
- Explicit features. Explicit features are usually given as continuous or discrete node attribute vectors. In principle, any side information about the network other than its structure can be regarded as an explicit feature. For instance, in a social network, a user’s profile information is an explicit feature, whereas the user’s friendship information is a graph structure feature.
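As a minimal sketch of how the graph structure heuristics named above can be computed directly from observed edges, the following Python code evaluates five of them for one node pair. The helper names `build_adjacency` and `heuristic_scores` and the toy edge list are illustrative assumptions, not taken from any cited implementation; the formulas follow the commonly used definitions of each index.

```python
from math import log


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def heuristic_scores(adj, x, y):
    """Return five graph structure heuristics for the distinct node pair (x, y)."""
    nx, ny = adj.get(x, set()), adj.get(y, set())
    common = nx & ny
    union = nx | ny
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "preferential_attachment": len(nx) * len(ny),
        # A common neighbor of two distinct nodes has degree >= 2, so log(degree) > 0.
        "adamic_adar": sum(1.0 / log(len(adj[z])) for z in common),
        "resource_allocation": sum(1.0 / len(adj[z]) for z in common),
    }


# Hypothetical toy network used only for illustration.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
scores = heuristic_scores(adj, "a", "d")
```

Because these scores depend only on neighborhoods, the same function can be applied unchanged to any other node pair or network, which is what makes such features inductive.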
3.4. Selection Input Layer
- Single feature. Early link prediction approaches used a single feature type from the feature classification layer, that is, only one of the graph structure, latent, or explicit features served as the input.
- Multiple features. The graph structure, latent, and explicit features are largely orthogonal to each other, so they can be used together to improve on single-feature methods, for example by combining graph structure and latent features or latent and explicit features. Koren [30] built a more accurate composite model in which the user’s explicit and implicit features are combined to further improve precision.
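One simple way to use the three feature types together is to concatenate them into a single input vector per node pair. The sketch below assumes that latent vectors have already been learned elsewhere and that every node carries the same attribute keys; the function name `pair_features`, the Hadamard product for the latent part, and the toy data are illustrative assumptions, not the method of any cited paper.

```python
def pair_features(adj, latent, attrs, attr_keys, x, y):
    """Concatenate graph structure, latent, and explicit features for the pair (x, y)."""
    common = adj[x] & adj[y]
    union = adj[x] | adj[y]
    # Graph structure part: common neighbors and Jaccard index.
    structure = [float(len(common)), len(common) / len(union) if union else 0.0]
    # Latent part: element-wise (Hadamard) product of the two latent vectors.
    latent_part = [a * b for a, b in zip(latent[x], latent[y])]
    # Explicit part: 1.0 if the two nodes agree on an attribute, else 0.0.
    explicit = [1.0 if attrs[x][k] == attrs[y][k] else 0.0 for k in attr_keys]
    return structure + latent_part + explicit


# Hypothetical toy inputs used only for illustration.
adj = {"u": {"w"}, "v": {"w"}, "w": {"u", "v"}}
latent = {"u": [0.5, -1.0], "v": [2.0, 0.5], "w": [1.0, 1.0]}
attrs = {
    "u": {"city": "Oslo", "lang": "en"},
    "v": {"city": "Oslo", "lang": "de"},
    "w": {"city": "Rome", "lang": "en"},
}
features = pair_features(adj, latent, attrs, ["city", "lang"], "u", "v")
```

The resulting vector can then be passed to any of the supervised classifiers listed in Section 3.5.2.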
3.5. Processing Layer
3.5.1. Feature Extraction Methods
- 1. Similarity-based methods
- (1) Global Similarity
- Katz Index (KI)
- SimRank (SR)
- Random Walk (RW)
- Random Walk with Restart (RWR)
- (2) Local Similarity
- Common Neighbors (CN)
- Jaccard Index (JC)
- Salton Index (SI)
- Preferential Attachment Index (PA)
- Adamic-Adar Index (AA)
- Resource Allocation Index (RA)
- Hub Depressed Index (HDI)
- Hub Promoted Index (HPI)
- (3) Quasi-local Similarity
- Local Path Index (LP)
- Local Random Walk (LRW)
- Superposed Random Walk (SRW)
- FriendLink (FL)
- PropFlow Predictor (PFP) Index
- 2. Likelihood methods
- Hierarchical Structure Models (HSM)
- Stochastic Block Models (SBM)
- 3. Probabilistic methods
- Probabilistic Relational Models (PRM)
- Entity-Relational Models (ERM)
- Stochastic Relational Models (SRM)
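For reference, the local similarity indices listed above are commonly defined as follows. The notation is an assumption introduced here, since this section only names the indices: $\Gamma(x)$ denotes the set of neighbors of node $x$, and $k_x = |\Gamma(x)|$ is its degree.

```latex
\begin{align*}
s^{\mathrm{CN}}_{xy}  &= |\Gamma(x) \cap \Gamma(y)| \\
s^{\mathrm{JC}}_{xy}  &= \frac{|\Gamma(x) \cap \Gamma(y)|}{|\Gamma(x) \cup \Gamma(y)|} \\
s^{\mathrm{SI}}_{xy}  &= \frac{|\Gamma(x) \cap \Gamma(y)|}{\sqrt{k_x k_y}} \\
s^{\mathrm{PA}}_{xy}  &= k_x k_y \\
s^{\mathrm{AA}}_{xy}  &= \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log k_z} \\
s^{\mathrm{RA}}_{xy}  &= \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{k_z} \\
s^{\mathrm{HDI}}_{xy} &= \frac{|\Gamma(x) \cap \Gamma(y)|}{\max(k_x, k_y)} \\
s^{\mathrm{HPI}}_{xy} &= \frac{|\Gamma(x) \cap \Gamma(y)|}{\min(k_x, k_y)}
\end{align*}
```

All eight indices use only the immediate neighborhoods of $x$ and $y$, which is why they are grouped as local; JC, SI, HDI, and HPI differ only in how they normalize the common-neighbor count.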
3.5.2. Feature Learning Methods
- 1. Unsupervised learning
- DeepWalk
- LINE
- GraRep
- DNGR
- SDNE
- Node2Vec
- HOPE
- Graph Representation Learning with Generative Adversarial Nets (GraphGAN)
- Struc2Vec
- 2. Semi-supervised learning
- GCN
- GNN
- GAN
- LSTM
- GAE
- GAT
- 3. Supervised learning
- SVM
- KNN
- Logistic Regression (LR)
- Ensemble Learning (EL)
- RF
- Multilayer Perceptron (MLP)
- Naïve Bayes (NB)
- Matrix Factorization (MF)
- 4. Reinforcement learning methods
- GCPN
- GTPN
3.6. Selection Layer
- (1) Single method
- (2) Combination methods
3.7. Output Layer
4. Comparison of Input Features of Link Prediction Methods and Complexity
5. Evaluating Metrics
- 1. AUC
- 2. Precision
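As a hedged illustration of the two metrics, the sketch below uses the definitions that are common in the link prediction literature: AUC is the probability that a held-out (missing) link receives a higher score than a nonexistent link, with ties counted as one half, and Precision is the fraction of the top-L ranked candidate pairs that are true missing links. The function names, the exhaustive pairwise comparison (instead of random sampling), and the toy scores are assumptions made for this example.

```python
def auc_score(missing_scores, nonexistent_scores):
    """AUC: chance that a missing link outscores a nonexistent link (ties count 0.5)."""
    higher = 0
    ties = 0
    for m in missing_scores:
        for n in nonexistent_scores:
            if m > n:
                higher += 1
            elif m == n:
                ties += 1
    total = len(missing_scores) * len(nonexistent_scores)
    return (higher + 0.5 * ties) / total


def precision_at_l(scored_pairs, missing_links, top_l):
    """Precision: share of the top_l highest-scored pairs that are true missing links."""
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    hits = sum(1 for pair, _ in ranked[:top_l] if pair in missing_links)
    return hits / top_l


# Hypothetical scores used only for illustration.
auc = auc_score([0.9, 0.4], [0.4, 0.1, 0.7])

scored_pairs = [
    (("a", "b"), 0.9),
    (("a", "c"), 0.7),
    (("b", "d"), 0.4),
    (("c", "d"), 0.4),
    (("a", "e"), 0.1),
]
precision = precision_at_l(scored_pairs, {("a", "b"), ("b", "d")}, top_l=2)
```

AUC evaluates the whole ranking, whereas Precision looks only at the head of the list, so a method can rank well on one metric and poorly on the other.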
6. A Summary of Open-Source Implementations
7. Experimental Comparison and Relative Merits for Each Link Prediction
8. Future Directions
- 1. Link prediction for complex-type networks. Existing research is incomplete, which opens the opportunity to explore link prediction in complex network structures, such as multilayer networks, interdependent networks, and hypernetworks.
- 2. Personal privacy protection. User privacy protection is an unavoidable problem in practical applications. How to obtain accurate predictions without compromising user privacy is a problem worthy of study.
- 3. Interpretability. Link prediction has many practical applications, which makes explaining the prediction results critical. In medicine, such interpretability is essential for translating computational experiments into clinical applications.
- 4. Combination. As mentioned above, many existing methods can work together. How to fully exploit the advantages of each method and combine them remains to be solved.
- 5. Scalability and parallelization. In the era of big data, large social networks typically have millions of nodes and edges, so designing a scalable model with linear time complexity becomes critical. Furthermore, because the nodes and edges of a graph are interconnected, the graph often has to be modeled in its entirety, highlighting the need for parallel computation.
- 6. Interdisciplinary. Link prediction has attracted the attention of experts in various fields. Interdisciplinary work brings both opportunities and challenges: domain knowledge is used to solve specific problems, but integrating knowledge across domains can make model design more difficult.
- 7. New evaluation methods
9. Summary
Author Contributions
Funding
Conflicts of Interest
References
- 1. Liben-Nowell, D.; Kleinberg, J.M. The link-prediction problem for social networks. J. Am. Soc. Inf. Sci. Technol. 2007, 58, 1019–1031.
- 2. Lü, L.; Zhou, T. Link prediction in complex networks: A survey. Phys. A Stat. Mech. Its Appl. 2011, 390, 1150–1170.
- 3. Shang, K.K.; Li, T.C.; Small, M.; Burton, D. Link prediction for tree-like networks. Chaos Interdiscip. J. Nonlinear Sci. 2019, 29, 1–9.
- 4. Hasan, M.A.; Zaki, M. A survey of link prediction in social networks. Soc. Netw. Data Anal. 2011, 8, 243–275.
- 5. Wang, P.; Xu, B.W.; Wu, Y.R. Link prediction in social networks: The state-of-the-art. Sci. China Inf. Sci. 2015, 58, 1–38.
- 6. Daud, N.N.; Hamid, S.H.A.; Saadoon, M. Applications of link prediction in social network: A Review. J. Netw. Comput. Appl. 2020, 166, 1–38.
- 7. Munasinghe, L.; Ichise, R. Time Score: A New feature for link prediction in social network. IEICE Trans. Inf. Syst. 2012, 95, 821–828.
- 8. Huang, Z.; Lin, D.K.J. The time-series link prediction problem with applications in communication surveillance. Inf. J. Comput. 2009, 21, 286–303.
- 9. Ricardo, P.; Soares, S.; Prudencio, R.B.C. Time Series Based Link Prediction. In Proceedings of the 2012 Joint Conference on IEEE, Brisbane, QLD, Australia, 10–15 June 2012; pp. 10–15.
- 10. Xu, H.H.; Zhang, L.J. Application of link prediction in temporal network. Adv. Mater. Res. 2012, 59, 756–759.
- 11. Fire, M.; Tenenboim, L.; Lesser, O.; Puzis, R. Link prediction in social networks using computationally efficient topological features. In Proceedings of the 2011 IEEE Third International Conference, Boston, MA, USA, 9–11 October 2011; pp. 9–11.
- 12. Wang, J.; Rong, L.L. Similarity index based on the information of neighbor nodes for link prediction of complex network. Mod. Phys. Lett. B 2013, 27, 87–89.
- 13. Shang, K.K.; Small, M.; Xu, X.; Yan, W.S. The role of direct links for link prediction in evolving networks. EPL Europhys. Lett. 2017, 117, 1–7.
- 14. Danh, B.T.; Ryntaro, I.; Bac, L. Link prediction in social networks based on local weighted paths. Future Data Secur. Eng. 2014, 21, 151–163.
- 15. Shi, S.L.; Wen, Y.M.; Li, Y.P.; Xie, W. Adding the sentiment attribute of nodes to improve link prediction in social network. In Proceedings of the 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD, Zhangjiajie, China, 15–17 August 2015; pp. 15–17.
- 16. Zhang, Y.; Gao, K.; Li, F. A New method for link prediction using various features in social network. Web Inf. Syst. Appl. 2014, 34, 12–14.
- 17. Gong, N.Z.Q.; Talwalkar, A.; Mackey, L.; Huang, L. Joint link prediction and attribute inference using a social attribute network. ACM Trans. Intell. Syst. Technol. 2013, 5, 84–104.
- 18. Zhao, Y.F.; Li, L.; Wu, X.D. Link prediction-based multi-label classification on networked data. In Proceedings of the 2016 IEEE First International Conference on Data Science in Cyberspace (DSC), Changsha, China, 13–16 June 2016; pp. 13–16.
- 19. Shang, K.K.; Small, M.; Yan, W.S. Link direction for link prediction. Phys. A Stat. Mech. Its Appl. 2017, 469, 767–776.
- 20. Thi, A.T.N.; Nguyen, P.Q.; Duc, T.N.; Hoang, T.A.H. Transfer adaboost svm for link prediction in newly signed social networks using explicit and PNR features. Procedia Comput. Sci. 2017, 60, 332–341.
- 21. Liu, H.F.; Hu, Z.; Haddadi, H.; Tian, H. Hidden link prediction based on node centrality and weak ties. EPL Europhys. Lett. 2013, 101, 1–6.
- 22. Li, R.H.; Yu, J.X.; Liu, J.Q. Link prediction: The power of maximal entropy random walk. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, CIKM, Glasgow, UK, 24–28 October 2011; pp. 1147–1156.
- 23. Ma, Y.; Wang, S.H.; Tang, J.L. Network embedding with centrality information. In Proceedings of the 2017 IEEE International Conference on Data Mining Workshops, New Orleans, LA, USA, 18–21 November 2017; Volume 162, pp. 1144–1145.
- 24. Wang, R.; Wu, L.L.; Shi, C.; Wu, B. Integrating link prediction and attribute inference based on community structure. Acta Electron. Sin. 2016, 44, 2062–2067.
- 25. Yang, S.H.; Long, B.; Smola, A.J.; Sadagopan, N.Y. Like like alike: Joint friendship and interest propagation in social networks. In Proceedings of the 20th International Conference on World Wide Web (WWW’11), Hyderabad, India, 28–30 April 2011; pp. 537–546.
- 26. Weng, L.; Ratkiewicz, J.; Perra, N.; Flammini, A. The role of information diffusion in the evolution of social networks. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, IL, USA, 11–14 August 2013; pp. 356–364.
- 27. Weng, L.; Menczer, F.; Ahn, Y. Virality prediction and community structure in social networks. Sci. Rep. 2013, 3, 2522–2527.
- 28. Valverde-Rebaza, J.A.; Lopes, A.D. Exploiting behaviors of communities of twitter users for link prediction. Soc. Netw. Anal. Min. 2013, 8, 1063–1074.
- 29. Cukierski, W.; Hamner, B.; Yang, B. Graph-based features for supervised link prediction. In Proceedings of the International Joint Conference on Neural Networks, San Jose, CA, USA, 31 August–5 September 2011; pp. 1237–1244.
- 30. Koren, Y. Factorization meets the neighborhood: A multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA, 24–27 August 2008; pp. 426–434.
- 31. Katz, L. A new status index derived from sociometric analysis. Psychometrika 1953, 18, 39–43.
- 32. Jeh, G.; Widom, J. SimRank: A measure of structural-context similarity. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, AB, Canada, 23–26 July 2002; pp. 538–543.
- 33. Tong, H.H.; Faloutsos, C. Fast Random walk with restart and its applications. Mach. Learn. Dep. 2006, 109, 1–21.
- 34. Newman, M.E.J. Clustering and preferential attachment in growing networks. Phys. Rev. 2001, 64, 1–13.
- 35. Kossinets, G. Effects of missing data in social networks. Soc. Netw. 2005, 28, 247–268.
- 36. Jaccard, P. Etude comparative de la distribution florale dans une portion des Alpes et des Jura. Bull. Soc. Vaud. Sci. Nat. 1901, 37, 547–579.
- 37. Bhawsar, Y.C.; Thakur, G.S.; Thakur, R.S. Combining Jaccard coefficient with fuzzy soft set for predicting links in social Media. Res. J. Comput. Inf. Technol. Sci. 2015, 3, 6–10.
- 38. Sørensen, T.A. A Method of Establishing Groups of Equal Amplitude in Plant Sociology Based on Similarity of Species Content and Its Application to Analyses of the Vegetation on Danish Commons; Biologiske Skrifter/Kongelige Danske Videnskabernes Selskab: Copenhagen, Denmark, 1948; pp. 1–34. ISSN 0366-3612.
- 39. Barabási, A.L.; Jeong, H.; Néda, Z.; Ravasz, E.; Schubert, A.; Vicsek, T. Evolution of the social network of scientific collaborations. Phys. A 2001, 311, 590–614.
- 40. Adamic, L.A.; Adar, E. Friends and neighbors on the Web. Soc. Netw. 2003, 25, 211–230.
- 41. Richardson, M.; Domingos, P. Mining knowledge-sharing sites for viral marketing. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’02), Edmonton, AB, Canada, 13–16 August 2002; pp. 61–70.
- 42. Zhou, T.; Lü, L. Predicting missing links via local information. Eur. Phys. J. B 2009, 71, 623–630.
- 43. Ravasz, E.; Somera, A.L.; Mongru, D.A. Hierarchical organization of modularity in metabolic networks. Science 2002, 279, 1551–1555.
- 44. Lü, L.; Jin, C.H.; Zhou, T. Similarity index based on local paths for link prediction of complex networks. Phys. Rev. E 2009, 80, 1–9.
- 45. Liu, W.; Lü, L. Link prediction based on local random walk. EPL Europhys. Lett. 2010, 89, 1–7.
- 46. Papadimitriou, A.; Symeonidis, P.; Manolopoulos, Y. Fast and accurate link prediction in social networking systems. J. Syst. Softw. 2012, 85, 2119–2132.
- 47. Lichtenwalter, R.N.; Lussier, J.T.; Chawla, N.V. New perspectives and methods in link prediction. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 25–28 July 2010; pp. 243–252.
- 48. Newman, M.E.J.; Moore, C.; Clauset, A. Hierarchical structure and the prediction of missing links in networks. Nature 2008, 453, 98–101.
- 49. Krebs, V. Mapping networks of terrorist cells. Connections 2002, 24, 43–52.
- 50. Girvan, M.; Newman, M.E.J. Community structure in social and biological networks. Lit. Rev. 2002, 99, 7821–7826.
- 51. White, H.C.; Boorman, S.A.; Breiger, R.L. Social structure from multiple networks: Blockmodels of roles and positions. Am. J. Sociol. 1976, 81, 730–780.
- 52. Holland, P.W.; Laskey, K.B.; Leinhardt, S. Stochastic block models: First steps. Soc. Netw. 1983, 5, 109–137.
- 53. Doreian, P.; Batagelj, V.; Ferligoj, A. Generalized Block Modeling; Cambridge University Press: Cambridge, UK, 2006; pp. 275–282.
- 54. Taskar, B.; Abbeel, P.; Koller, D. Discriminative probabilistic models in relational data. In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence (UAI02), Limerick, Ireland, 17–21 January 2002; pp. 485–492.
- 55. Heckerman, D.; Chickering, D.M.; Meek, C.; Rounthwaite, R. Dependency networks for inference. Mach. Learn. Res. 2000, 378, 90118–90121.
- 56. Heckerman, D.; Meek, C.; Koller, D. Probabilistic Entity-Relationship Models, PRMs, and Plate Models. In Proceedings of the 21st International Conference on Machine Learning, Biopolis, Singapore, 7–9 June 2011; pp. 55–94.
- 57. Yu, K.; Chu, W.; Yu, S.P.; Tresp, V. Stochastic Relational Models for Discriminative Link Prediction. In Proceedings of the 19th International Conference on Neural Information Processing Systems, Cambridge, MA, USA, 4–7 December 2006; pp. 1553–1560.
- 58. Bryan, P.; Rami, A.R.; Steven, S. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710.
- 59. Tang, J.; Qu, M.; Wang, M.Z.; Zhang, M. Line: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 1067–1077.
- 60. Xu, Y.W. An empirical study of locally updated large-scale information network embedding. UCLA Electron. Theses Diss. 2017, 10, 915–994.
- 61. Cao, S.S.; Lu, W.; Xu, Q.K. Grarep: Learning graph representations with global structural information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Australia, 19–23 October 2015; pp. 891–900.
- 62. Cao, S.; Lu, W.; Xu, Q. Deep neural networks for learning graph representations. In Proceedings of the AAAI Conference on Artificial Intelligence, Biopolis, Singapore, 7–9 June 2016; pp. 1145–1152.
- 63. Wang, D.; Cui, P.; Zhu, W. Structural deep network embedding. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1225–1234.
- 64. Grover, A.; Leskovec, J. Node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference, Florence, Italy, 11–13 June 2016; pp. 855–864.
- 65. Ou, M.D.; Cui, P.; Pei, J.; Zhang, Z.W.; Zhu, W.W. Asymmetric transitivity preserving graph embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1105–1114.
- 66. Wang, H.W.; Wang, J.; Wang, J.L. Graphgan: Graph representation learning with generative adversarial nets. IEEE Trans. Knowl. Data Eng. 2017, 64, 1–8.
- 67. Ribeiro, L.; Saverese, P.H.P.; Figueiredo, D.R. struc2vec: Learning node representations from structural identity. ACM 2017, 17, 13–17.
- 68. Gu, S.; Milenkovic, T.J. Graphlets versus node2vec and struc2vec in the task of network alignment. Soc. Inf. Netw. 2018, 12, 1–11.
- 69. Thoms, N.; Max, W. Semi-supervised classification with graph convolutional networks. In Proceedings of the International Conference on Learning Representations, Dallas, TX, USA, 23–25 September 2017; pp. 1–14.
- 70. Chen, J.; Ma, T.F.; Xiao, C. FastGCN: Fast learning with graph convolutional networks via importance sampling. In Proceedings of the International Conference on Learning Represent, Biopolis, Singapore, 7–9 January 2018; pp. 1–15.
- 71. Michael, S.; Thoms, N.K.; Peter, B. Modeling Relational Data with Graph Convolutional Network. In Proceedings of the European Semantic Web Conference, Biopolis, Singapore, 7–9 March 2018; pp. 593–607.
- 72. Scarselli, F.; Gori, M.; Tsoi, A.C.; Monfardini, G. The Graph Neural Network Model. IEEE Trans. Neural Netw. 2009, 20, 61–80.
- 73. Zhang, X.Y.; Chen, L.H. Capsule Graph Neural Network. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019; Volume 11, pp. 1544–1554.
- 74. Singh, M.K.; Banerjee, S.; Chaudhuri, S. NENET: An edge learnable network for link prediction in scene text. arXiv 2020, arXiv:2005.12147.
- 75. Lei, K.; Qin, M.; Bai, B.; Zhang, G.; Yang, M. GCN-GAN: A Non-linear Temporal Link Prediction Model for Weighted Dynamic Networks. In Proceedings of the IEEE INFOCOM 2019-IEEE Conference on Computer Communications, Paris, France, 29 April–2 May 2019; pp. 21–30.
- 76. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
- 77. Chen, J.Y.; Xu, X.H.; Wu, Y.Y.; Zheng, H.B. GC-LSTM: Graph convolution embedded LSTM for dynamic link prediction. arXiv 2018, arXiv:1812.04206v1.
- 78. Thomas, N.K.; Max, W. Variational Graph Auto-Encoders. Comput. Sci. 2016, 7, 1–3.
- 79. Gu, W.W.; Gao, F.; Lou, X.D. Link prediction via Graph attention network. Soc. Inf. Netw. 2019, 4, 7–12.
- 80. Hasan, M.A.; Chaoji, V.; Salem, S. Link prediction using supervised learning. In Proceedings of the SDM’06: Workshop on Link Analysis, Counter-Terrorism and Security, New York, NY, USA, 15–18 January 2006; pp. 22–26.
- 81. Jalili, M.; Orouskhani, Y.; Asgari, M. Link prediction in multiplex online social networks. R. Soc. Open Sci. 2017, 4, 1–11.
- 82. Laishram, R. Link prediction in dynamic weighted and directed social network using supervised learning. Diss. Theses Gradworks 2015, 12, 355–433.
- 83. Aouay, S.; Jamoussi, S.; Gargouri, F. Feature based link prediction. In Proceedings of the 2014 IEEE/ACS 11th International Conference on Computer Systems and Applications, Doha, Qatar, 10–13 November 2014; Volume 10, pp. 10–13.
- 84. Zhou, J.; Huang, D.Y.; Wang, H.S. A dynamic logistic regression for network link prediction. Sci. China Math. 2017, 60, 165–176.
- 85. Chiang, K.Y.; Natarajan, N.; Tewari, A. Exploiting longer cycles for link prediction in signed networks. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, Glasgow, Scotland, UK, 24–28 October 2011; pp. 1157–1162.
- 86. Pachaury, S.; Kumar, N.; Khanduri, A.; Mittal, H. Link Prediction Method using Topological Features and Ensemble Model. In Proceedings of the 2018 Eleventh International Conference on Contemporary Computing, Noida, India, 2–4 August 2018; pp. 1021–1026.
- 87. Guns, R.; Rousseau, R. Recommending Research collaboration Using Link Prediction and Random Forest Classifiers. Scientometrics 2014, 101, 1461–1473.
- 88. Scellato, S.; Noulas, A.; Mascolo, M. Exploiting place features in link prediction on location-based social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 21–24 August 2011; pp. 1046–1054.
- 89. Kastrin, A.; Rindflesch, T.; Hristovski, D. Link prediction on a network of co-occurring mesh terms: Towards literature-based discovery. Methods Inf. Med. 2016, 55, 579–583.
- 90. Jorge, V.R.; Valejo, A. A naïve Bayes model based on overlapping groups for link prediction in online social networks. In Proceedings of the 30th Annual ACM Symposium on Applied Computing, Cambridge, MA, USA, 3–5 April 2015; pp. 1136–1141.
- 91. Kunegis, J.; Lommatzsch, A. Learning spectral graph transformations for link prediction. In Proceedings of the 26th Annual International Conference on Machine Learning, Berlin, Germany, 13–15 January 2009; pp. 561–568.
- 92. Aditya, K.M.; Charles, E. Link prediction via matrix factorization. Lecture Notes in Computer Science. In Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases, Athens, Greece, 5–9 September 2011; pp. 437–452.
- 93. You, J.X. Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation. 2018. Available online: https://arxiv.org/abs/1806.02473 (accessed on 29 September 2020).
- 94. Do, K.; Tran, T.; Venkatesh, S. Graph transformation policy network for chemical reaction prediction. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Berlin, Germany, 21–23 July 2019; pp. 750–756.
- 95. Zhang, M.H.; Chen, Y.X. Link Prediction Based on Graph Neural Networks. 2018. Available online: https://arxiv.org/abs/1802.09691 (accessed on 29 September 2020).
- 96. Yang, Y.; Lichtenwalter, R.N.; Chawla, N.V. Evaluating link prediction methods. Knowl. Inf. Syst. 2014, 45, 751–782.
- 97. Didier, V.O.; Zhao, A.L.; Bertonm, L. Evaluating link prediction by diffusion processes in dynamic networks. Sci. Rep. 2019, 9, 10833–10846.
Algorithms | Time | Topology | Weight | Attributive | Label | Directional | Symbolic | Auxiliary Information |
---|---|---|---|---|---|---|---|---|
Katz [31] | √ | |||||||
SR [32] | √ | √ | ||||||
RW [22] | √ | √ | √ | |||||
RWR [33] | √ | √ | √ | |||||
CN [34] | √ | √ | √ | |||||
JC [36] | √ | √ | ||||||
SI [38] | √ | |||||||
PA [39] | √ | √ | ||||||
AA [40] | √ | √ | ||||||
RA [41] | √ | |||||||
HPI [42] | √ | |||||||
HDI [43] | √ | |||||||
LP [44] | √ | |||||||
LRW [45] | √ | |||||||
SRW [45] | √ | |||||||
FL [46] | √ | |||||||
PFP [47] | √ | |||||||
HSM [48] | √ | √ | ||||||
SBM [51] | √ | √ | ||||||
PRM [55] | √ | √ | ||||||
ERM [56] | √ | |||||||
SRM [57] | √ | √ | √ | |||||
DeepWalk [58] | √ | √ | √ | |||||
LINE [59] | √ | √ | √ | |||||
GraRep [61] | √ | √ | √ | |||||
DNGR [62] | √ | √ | ||||||
SDNE [63] | √ | |||||||
Node2Vec [64] | √ | √ | √ | |||||
HOPE [65] | √ | √ | ||||||
GraphGAN [66] | √ | |||||||
Struc2vec [67] | √ | √ | √ | √ | √ | √ | ||
GCN [69] | √ | √ | √ | √ | ||||
GNN [72] | √ | √ | √ | √ | ||||
GAN [75] | √ | √ | √ | |||||
LSTM [76] | √ | √ | √ | √ | ||||
GAT [79] | √ | √ | ||||||
GAE [78] | √ | √ | √ | |||||
SVM [81] | √ | √ | √ | √ | ||||
KNN [83] | √ | √ | √ | |||||
LR [84] | √ | √ | √ | √ | √ | √ | ||
EL [86] | √ | √ | √ | √ | ||||
RF [88] | √ | √ | √ | |||||
MLP [89] | √ | √ | √ | √ | ||||
NB [90] | √ | √ | √ | |||||
MF [91] | √ | √ | √ | |||||
GCPN [93] | √ | √ | √ | |||||
GTPN [94] | √ | √ | √ |
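Category | Algorithm | Complexity | Remarks |
---|---|---|---|
Similarity-based methods | Katz [31] | O(N^3) | N represents the number of nodes in the network. |
Similarity-based methods | SimRank [32] | O(N^4) | |
Similarity-based methods | Random Walk [22] | O(cN^2k) | c is the network aggregation coefficient. k stands for average degree. |
Similarity-based methods | Random Walk with Restart [33] | O(N^3) | |
Similarity-based methods | Common Neighbors [34] | O(N^2) | |
Similarity-based methods | Jaccard Index [36] | O(2N^2) | |
Similarity-based methods | Salton Index [38] | O(N^2) | |
Similarity-based methods | PA [39] | O(2N) | |
Similarity-based methods | AA [40] | O(2N^2) | |
Similarity-based methods | RA [41] | O(2N^2) | |
Similarity-based methods | Hub Depressed Index [42] | O(N^2) | |
Similarity-based methods | Hub Promoted Index [43] | O(N^2) | |
Similarity-based methods | Local Path Index [44] | O(N<K>^3) | <K> stands for average degree. |
Similarity-based methods | LRW [45] | O(N<K>^n) | n represents the number of random walk steps. |
Similarity-based methods | SRW [45] | O(N<K>^n) | |
Unsupervised learning methods | DeepWalk [58] | O() | represents the number of nodes in the graph. d represents the average shortest distance. |
Unsupervised learning methods | LINE [59] | O() | represents the number of edges in the graph. |
Unsupervised learning methods | GraRep [61] | O() | |
Unsupervised learning methods | DNGR [62] | O() | |
Unsupervised learning methods | SDNE [63] | O() | |
Unsupervised learning methods | Node2Vec [64] | O() | |
Unsupervised learning methods | HOPE [65] | O() | |
Unsupervised learning methods | GraphGAN [66] | O() | |
Unsupervised learning methods | Struc2vec [67] | O() | |
Semi-supervised learning methods | GCN [69] | O() | |
Semi-supervised learning methods | GNN [72] | O() | |
Semi-supervised learning methods | GAN [75] | O() | |
Semi-supervised learning methods | LSTM [76] | O(nm + n^2 + n) | n is hidden_size, m is input_size. |
Supervised learning methods | SVM [81] | O(n^2) | n is the number of samples. |
Supervised learning methods | KNN [83] | O(n*k*d) | d is the data dimension, k is the number of neighbors. |
Supervised learning methods | Logistic Regression [84] | O(n*d) | |
Supervised learning methods | Ensemble Learning [86] | O(n) | |
Supervised learning methods | Random Forest [88] | O(n*log(n)*d*k) | |
Supervised learning methods | Naïve Bayes [90] | O(n*d) | |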

| Category | Method | AUC | Precision | Dataset | Relative Merits |
|---|---|---|---|---|---|
| Similarity-based methods | Katz | 0.956 | 0.719 | USAir | Katz sums over the set of all paths between two nodes. |
| | RWR | 0.978 | 0.650 | PPI | RWR provides a good relevance score between two nodes in a weighted graph. |
| | CN | 0.937 | 0.814 | USAir | CN is simple and intuitive. |
| | JC | 0.933 | | NS | JC normalizes the size of CN. |
| | SI | 0.911 | | NS | SI is known as cosine similarity in the literature. |
| | PA | 0.886 | 0.764 | USAir | PA has the lowest complexity of the listed algorithms and requires the least information. |
| | RA | 0.955 | 0.828 | USAir | RA performs better when the average degree is high. |
| | AA | 0.932 | 0.699 | NS | AA refines the simple counting of CN. |
| | HPI | 0.911 | | NS | The HPI value is determined by the lower of the two node degrees. |
| | HDI | 0.933 | | NS | The HDI value is determined by the higher of the two node degrees. |
| | LP | 0.939 | 0.734 | PB | LP has a clear advantage in computing speed on large, sparse networks. |
| | LRW | 0.989 | 0.863 | NS | LRW is suitable for large, sparse networks. |
| | SRW | 0.992 | 0.739 | NS | SRW optimizes prediction accuracy at an earlier time step and prevents the sensitive dependence of LRW on distant nodes. |
| | FL | 0.875 | | Epinions | FL can provide more accurate and faster link prediction. |
| | PFP | 0.917 | | Ca-condmat | PFP is highly scalable and achieves major improvements. |
| Likelihood methods | HSM | 0.856 | | Food | HSM is suitable for networks with an obvious hierarchical structure. |
| | SBM | 0.902 | | Ca-condmat | SBM is suitable for predicting erroneous edges. |
| Probabilistic methods | PRM | 0.874 | | WebKB | PRM can be significantly improved by modeling relational dependencies. |
| | ERM | 0.889 | | DBLP | ERM can perform better than the other models when the relational structure is uncertain. |
| | SRM | 0.942 | | Movie | SRM can reduce the overall computational complexity. |
| Unsupervised learning methods | DeepWalk | 0.809 | 0.839 | BlogCatalog | DeepWalk can generate random walks on demand; it is efficient and parallelizable. |
| | LINE | 0.837 | 0.814 | DBLP | LINE is suitable for arbitrary types of information networks and improves both the effectiveness and the efficiency of inference. |
| | GraRep | 0.814 | | BlogCatalog | GraRep can capture global structural information associated with the graph and extends to weighted graphs. |
| | DNGR | 0.804 | | Wikipedia | DNGR can capture nonlinear information conveyed by the graph and learn better low-dimensional vertex representations. |
| | SDNE | 0.836 | | Arxiv | SDNE can capture highly nonlinear network structure and is robust to sparse networks. |
| | Node2Vec | 0.968 | 0.854 | | Node2Vec is flexible, controllable, scalable, and robust. |
| | HOPE | 0.881 | 0.812 | | HOPE scales to large graphs while preserving high-order proximities and can capture asymmetric transitivity. |
| | GraphGAN | 0.859 | 0.853 | Arxiv | GraphGAN achieves substantial gains in link prediction and satisfies desirable normalization properties. |
| | Struct2Vec | 0.853 | 0.810 | Air-traffic network | Struct2Vec can capture stronger notions of structural identity. |
| Semi-supervised learning methods | GCN | 0.941 | | Citeseer | GCN can effectively encode graph structure and features and achieves high prediction speed and performance. |
| | GNN | 0.890 | 0.891 | Neural network | GNN can capture more local structure information, provides a much richer representation, and computes faster. |
| | GAN | 0.932 | 0.920 | UCSB | GAN uses only backpropagation, without the need for a complicated Markov chain, and can generate samples that are clearer and more realistic than those of other models. |
| | LSTM | 0.982 | 0.810 | Hypertext | LSTM can fit sequence data and addresses the vanishing-gradient problem. |
| | GAE | 0.925 | 0.902 | Core | GAE can address link prediction in directed graphs. |
| | GAT | 0.880 | 0.790 | Core | GAT can both predict links and learn meaningful node representations. |
| Supervised learning methods | SVM | 0.982 | 0.991 | | SVM is extremely robust, especially in high-dimensional spaces. |
| | KNN | 0.803 | 0.920 | Flickr | KNN is theoretically simple and easy to implement; new data can be added directly without retraining. |
| | LR | 0.901 | | Epinions | LR is computationally inexpensive and easy to understand and implement. |
| | EL | 0.994 | 0.989 | Flickr | EL combines multiple classifiers so that they complement each other, giving better prediction performance. |
| | RF | 0.987 | 0.989 | | RF can achieve high accuracy with little concern about overfitting, because each tree is trained on only a few randomly selected features. |
| | MLP | 0.862 | | Mesh | MLP can learn nonlinear models and can perform real-time learning. |
| | NB | 0.808 | 0.04 | Flickr | NB is easy to implement and very useful for large data sets. |
| | MF | 0.793 | | PowerGrid | MF has far fewer parameters to learn and captures global structure. |
| Reinforcement learning methods | GCPN | 0.855 | 0.741 | ZINC250K molecule dataset | GCPN is effective in a variety of graph generation problems, especially link prediction, where it performs better. |
| | GTPN | 0.906 | 0.832 | USPTO | GTPN improves the top-1 accuracy over the current state-of-the-art method by about 3% on the large USPTO dataset. |

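
The AUC and Precision columns above can be read with the conventional link prediction definitions in mind. Under that assumption, AUC is the probability that a held-out (missing) link receives a higher score than a nonexistent link, with ties counted as one half, and Precision is the fraction of the top-L ranked candidate pairs that are held-out links. The sketch below illustrates both; the evaluation protocols of the individual surveyed papers may differ, and the function names and toy scores are hypothetical.

```python
def auc(missing_scores, nonexistent_scores):
    """Exact AUC over all (missing, nonexistent) score pairs; ties count 0.5."""
    wins = 0.0
    for m in missing_scores:
        for s in nonexistent_scores:
            if m > s:
                wins += 1.0
            elif m == s:
                wins += 0.5
    return wins / (len(missing_scores) * len(nonexistent_scores))


def precision_at_l(scores, missing_links, top_l):
    """Fraction of the top_l highest-scored candidate pairs that are missing links."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_l]
    hits = sum(1 for pair in ranked if pair in missing_links)
    return hits / top_l


# Hypothetical scores assigned by some predictor to candidate node pairs.
candidate_scores = {
    ("a", "d"): 0.9,
    ("a", "e"): 0.5,
    ("b", "e"): 0.4,
    ("c", "e"): 0.1,
}
# Hypothetical held-out links that the predictor should recover.
held_out = {("a", "d"), ("b", "e")}

auc_value = auc(
    missing_scores=[0.9, 0.4],            # scores of held-out links
    nonexistent_scores=[0.5, 0.4, 0.1],   # scores of nonexistent links
)
precision_value = precision_at_l(candidate_scores, held_out, top_l=2)
```

On large networks the exhaustive pairwise comparison in `auc` is usually replaced by sampling random (missing, nonexistent) pairs, which approximates the same quantity.
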
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Wang, H.; Le, Z. Seven-Layer Model in Complex Networks Link Prediction: A Survey. Sensors 2020, 20, 6560. https://doi.org/10.3390/s20226560