Abstract
We introduce and study the spectral evolution model, which characterizes the growth of large networks in terms of the eigenvalue decomposition of their adjacency matrices: In large networks, changes over time result in a change of a graph’s spectrum, leaving the eigenvectors unchanged. We validate this hypothesis for several large social, collaboration, rating, citation, and communication networks. Following these observations, we introduce two link prediction algorithms based on the learning of the changes to a network’s spectrum. These new link prediction methods generalize several common graph kernels that can be expressed as spectral transformations. The first method is based on reducing the link prediction problem to a one-dimensional curve-fitting problem which can be solved efficiently. The second algorithm extrapolates a network’s spectrum to predict links. Both algorithms are evaluated on fifteen network datasets for which edge creation times are known.
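The curve-fitting idea can be illustrated with a minimal sketch. It assumes a dense symmetric adjacency matrix `A` for the known edges, a matrix `B` of later edges used as the fitting target, and a polynomial family for the spectral transformation. The function names, the split into `A` and `B`, and the choice of polynomial are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def fit_spectral_transformation(A, B, degree=3):
    """Fit a polynomial f so that f(A) approximates B in the eigenbasis of A.

    With A = U diag(lam) U^T, minimising ||U f(diag(lam)) U^T - B||_F over f
    reduces to fitting the one-dimensional points (lam_i, u_i^T B u_i).
    """
    lam, U = np.linalg.eigh(A)                 # eigenvalues ascending, U orthonormal
    target = np.einsum("ij,ij->j", U, B @ U)   # diagonal of U^T B U
    coeffs = np.polyfit(lam, target, degree)   # least-squares curve fit
    return lam, U, coeffs


def predict_scores(lam, U, coeffs):
    """Apply the fitted transformation to the spectrum; eigenvectors stay fixed."""
    return (U * np.polyval(coeffs, lam)) @ U.T


if __name__ == "__main__":
    # Hypothetical toy data: a path graph on five nodes as the known network,
    # and its two-step paths standing in for the later edges.
    A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
    B = A @ A
    lam, U, coeffs = fit_spectral_transformation(A, B)
    print(np.round(predict_scores(lam, U, coeffs), 3))
```

Higher entries of the returned score matrix for currently unconnected node pairs would then be read as more likely future links. For large sparse networks one would presumably work with a truncated decomposition rather than the full `eigh` used here.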
Notes
This can be seen by noting that a single edge has the adjacency matrix \([0, 1; 1, 0]\), which is indefinite. Only adding a positive-semidefinite component to a matrix guarantees that its eigenvalues do not shrink; this follows from the interlacing theorem given in [62, p. 97].
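A small numerical check makes the note concrete. The sketch below reads "do not shrink" as "no ordered eigenvalue decreases", which is an assumption about the intended meaning. The three-node path graph and the rank-one matrix `psd` are hypothetical examples chosen for illustration.

```python
import numpy as np


def sorted_eigenvalues(M):
    """Eigenvalues of a symmetric matrix in ascending order."""
    return np.linalg.eigvalsh(M)


# Path graph 0 - 1 - 2.
path = np.array([[0.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0]])

# Adjacency matrix of the single new edge {0, 2}; it is indefinite.
edge = np.zeros((3, 3))
edge[0, 2] = edge[2, 0] = 1.0

# A positive-semidefinite perturbation on the same two nodes: v v^T.
v = np.array([1.0, 0.0, 1.0])
psd = np.outer(v, v)

eig_before = sorted_eigenvalues(path)
eig_edge = sorted_eigenvalues(path + edge)   # path becomes a triangle
eig_psd = sorted_eigenvalues(path + psd)

print("before:      ", np.round(eig_before, 3))
print("edge added:  ", np.round(eig_edge, 3))
print("PSD added:   ", np.round(eig_psd, 3))
```

In this example the middle eigenvalue decreases when the edge is added, while none of the ordered eigenvalues decreases under the positive-semidefinite perturbation.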
References
Adamic L, Adar E (2001) Friends and neighbors on the web. Soc Netw 25:211–230
Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Blei D, Ng A, Jordan M, Lafferty J (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
Bradley AP (1997) The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recogn 30:1145–1159
Breese JS, Heckerman D, Kadie C (1998) Empirical analysis of predictive algorithms for collaborative filtering. In: Proceedings of conference on uncertainty in artificial intelligence, pp 43–52
Brin S, Page L (1998) The anatomy of a large-scale hypertextual Web search engine. Comput Netw ISDN Syst 30(1–7):107–117
Chaintreau A, Hui P, Crowcroft J, Diot C, Gass R, Scott J (2007) Impact of human mobility on opportunistic forwarding algorithms. IEEE Trans Mobile Comput 6(6):606–620
Chapelle O, Weston J, Schölkopf B (2003) Cluster kernels for semi-supervised learning. Adv Neural Inform Process Syst 15:585–592
Chebotarev P, Shamis E (1997) The matrix-forest theorem and measuring relations in small social groups. Autom Remote Control 58(9):1505–1514
Choudhury MD, Lin Y-R, Sundaram H, Candan KS, Xie L, Kelliher A (2010) How does the data sampling strategy impact the discovery of information diffusion in social media? In: Proceedings of conference on weblogs and social media, pp 34–41
Choudhury MD, Sundaram H, John A, Seligmann DD (2009) Social synchrony: predicting mimicry of user actions in online social media. In: Proceedings of the international conference on computational science and engineering, pp 151–158
Chung F (1997) Spectral graph theory. American Mathematical Society, Providence
Doyle PG, Snell JL (1984) Random walks and electric networks. Mathematical Association of America, Washington, DC
Fouss F, Pirotte A, Renders J-M, Saerens M (2004) A novel way of computing dissimilarities between nodes of a graph, with application to collaborative filtering and subspace projection of the graph nodes. In: Proceedings of European conference on machine learning, pp 26–37
Fouss F, Yen L, Pirotte A, Saerens M (2006) An experimental investigation of graph kernels on a collaborative recommendation task. In: Proceedings of the international conference on data mining, pp 863–868
Goldenberg A, Kubica J, Komarek P (2003) A comparison of statistical and machine learning algorithms on the task of link completion. In: Proceedings of the workshop on link analysis for detecting complex behavior
Guha R, Kumar R, Raghavan P, Tomkins A (2004) Propagation of trust and distrust. In: Proceedings of the international world wide web conference, pp 403–412
Guimerà R, Sales-Pardo M, Amaral LAN (2004) Modularity from fluctuations in random graphs and complex networks. Phys Rev E 70(2):025101
Huang Z, Chung W, Chen H (2004) A graph model for e-commerce recommender systems. Am Soc Inf Sci Technol 55(3):259–274
Huang Z, Zeng D, Chen H (2007) A comparison of collaborative-filtering recommendation algorithms for e-commerce. IEEE Intell Syst 22(5):68–78
Ito T, Shimbo M, Kudo T, Matsumoto Y (2005) Application of kernels to link analysis. In: Proceedings of the international conference on knowledge discovery in data mining, pp 586–592
Järvelin K, Kekäläinen J (2002) Cumulated gain-based evaluation of IR techniques. ACM Trans Inf Syst 20(4):422–446
Kandola J, Shawe-Taylor J, Cristianini N (2002) Learning semantic similarity. In: Advances in neural information processing systems, pp 657–664
Katz L (1953) A new status index derived from sociometric analysis. Psychometrika 18(1):39–43
Klein DJ, Randić M (1993) Resistance distance. J Math Chem 12(1):81–95
Klimt B, Yang Y (2004) The Enron corpus: A new dataset for email classification research. In: Proceedings of European conference on machine learning, pp 217–226
Kondor R, Lafferty J (2002) Diffusion kernels on graphs and other discrete structures. In: Proceedings of the international conference on machine learning, pp 315–322
Kunegis J, Fay D, Bauckhage C (2010) Network growth and the spectral evolution model. In: Proceedings of the international conference on information and knowledge management, pp 739–748
Kunegis J, Lommatzsch A (2009) Learning spectral graph transformations for link prediction. In: Proceedings of the international conference on machine learning, pp 561–568
Kunegis J, Lommatzsch A, Bauckhage C (2009) The slashdot zoo: mining a social network with negative edges. In: Proceedings of the international world wide web conference, pp 741–750
Kunegis J, Schmidt S, Lommatzsch A, Lerner J (2010) Spectral analysis of signed graphs for clustering, prediction and visualization. In: Proceedings of SIAM international conference on data mining, pp 559–570
Lü L, Jin C-H, Zhou T (2009) Similarity index based on local paths for link prediction of complex networks. Phys Rev E 80(4):046122
Lax PD (1984) Linear algebra and its applications. Wiley, London
Lee DD, Seung SH (2000) Algorithms for non-negative matrix factorization. In: Advances in neural information processing systems, pp 556–562
Leskovec J, Backstrom L, Kumar R, Tomkins A (2008) Microscopic evolution of social networks. In: Proceedings of the international conference on knowledge discovery and data mining, pp 462–470
Leskovec J, Huttenlocher D, Kleinberg J (2010) Governance in social media: a case study of the Wikipedia promotion process. In: Proceedings of the international conference on weblogs and social media
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):1–40
Ley M (2002) The DBLP computer science bibliography: evolution, research issues, perspectives. In: Proceedings of the international symposium on string processing and information retrieval, pp 1–10
Liben-Nowell D, Kleinberg J (2003) The link prediction problem for social networks. In: Proceedings of the international conference on information and knowledge management, pp 556–559
Liu W, Qian B, Cui J, Liu J (2009) Spectral kernel learning for semi-supervised classification. In: Proceedings of the international joint conference on artificial intelligence, pp 1150–1155
Long B, Zhang Z, Yu PS (2010) A general framework for relation graph clustering. J Knowl Inf Syst 24(3):393–413
Lu Z, Jain P, Dhillon IS (2009) Geometry-aware metric learning. In: Proceedings of the international conference on machine learning, pp 673–680
von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval. Cambridge University Press, Cambridge
Massa P, Avesani P (2005) Controversial users demand local trust metrics: an experimental study on epinions.com community. In: Proceedings of American association for artificial intelligence conference, pp 121–126
Mislove A (2009) Online social networks: measurement, analysis, and applications to distributed information systems. PhD thesis, Rice University
Mislove A, Koppula HS, Gummadi KP, Druschel P, Bhattacharjee B (2008) Growth of the Flickr social network. In: Proceedings of the workshop on online social networks, pp 25–30
Mohar B (1991) The Laplacian spectrum of graphs. Graph Theory Comb Appl 2:871–898
Najork MA, Zaragoza H, Taylor MJ (2007) HITS on the Web: how does it compare? In: Proceedings of the international conference on research and development in information retrieval, pp 471–478
Newman MEJ (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3):036104
Peng W, Li T (2011) Temporal relation co-clustering on directional social network and author-topic evolution. J Knowl Inf Syst 26(3):467–486
Radl A, von Luxburg U, Hein M (2009) The resistance distance is meaningless for large random geometric graphs. In: Proceedings of the workshop on analyzing networks and learning with graphs
Sarwar B, Karypis G, Konstan J, Riedl J (2000) Application of dimensionality reduction in recommender systems: a case study. In: Proceedings of ACM WebKDD Workshop
Smola A, Kondor R (2003) Kernels and regularization on graphs. In: Proceedings of the conference on learning theory and kernel machines, pp 144–158
Stewart GW (1990) Perturbation theory for the singular value decomposition. Technical report, University of Maryland, College Park
Sun J, Tao D, Faloutsos C (2006) Beyond streams and graphs: dynamic tensor analysis. In: Proceedings of the international conference on knowledge discovery and data mining, pp 374–383
Thurau C, Kersting K, Wahabzada M, Bauckhage C (2011) Convex non-negative matrix factorization for massive datasets. J Knowl Inf Syst 29(2):457–478
Viswanath B, Mislove A, Cha M, Gummadi KP (2009) On the evolution of user interaction in Facebook. In: Proceedings of the workshop on online social networks, pp 37–42
Wax M, Sheinvald J (1997) A least-squares approach to joint diagonalization. IEEE Signal Process Lett 4(2):52–53
Wedin P-Å (1972) Perturbation bounds in connection with singular value decomposition. BIT Numer Math 12(1):99–111
Weyl H (1912) Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differenzialgleichungen (mit einer Anwendung auf die Theorie der Hohlraumstrahlung). Math Ann 71(4):441–479
Wilkinson JH (1965) The algebraic eigenvalue problem. Oxford University Press, Oxford
Zhang B, Liu R, Massey D, Zhang L (2005) Collecting the internet AS-level topology. SIGCOMM Comput Commun Rev 35(1):53–61
Zhang D, Mao R (2008) Classifying networked entities with modularity kernels. In: Proceedings of the conference on information and knowledge management, pp 113–122
Zhu X, Kandola J, Lafferty J, Ghahramani Z (2006) Graph kernels by spectral transforms. In: Semi-supervised learning. MIT Press
Acknowledgments
The research leading to these results has received funding from the European Community’s Seventh Framework Programme under grant agreement no. 257859, ROBUST, and from the German Research Foundation (DFG) under the Multipla project (grant 38457858).
Appendix: list of datasets
This is the complete list of network datasets; all are from the Koblenz Network Collection (KONECT, konect.uni-koblenz.de). The Haggle dataset (CO) represents contacts between persons [7]. The Digg dataset (DG) contains message replies between users of the Web site Digg [11]. The English Wikipedia vote dataset (EL) represents administrator votes between users of the English Wikipedia [36]. The Enron dataset (EN) is a network of e-mail messages between employees of Enron [26]. Epinions (EP) is a trust/distrust network from the Web site of the same name [45]. The Flickr network (FL) contains user–user friendship links from the Flickr image-sharing Web site [47]. The two Facebook datasets (Ol, Ow) contain the friendship links and wall messages, respectively, between users of the social network Facebook in New Orleans [58]. The DBLP network (Pc) contains the coauthorship links between authors in the DBLP bibliography [38]. The arXiv hep-ph and hep-th networks (PH, TH) also contain coauthorship links, between authors in the corresponding sections of arXiv [37]. The Internet topology dataset (TO) contains the structural network of the Internet [63]. The Twitter network (Wa) contains the user–user mentions using the “@” syntax in Twitter [10]. The English Wikipedia hyperlink network (WP) contains all links between articles of the English Wikipedia [46]. The YouTube dataset (YT) contains the user–user friendships from the video-sharing site YouTube [46] (Table 5).
Cite this article
Kunegis, J., Fay, D. & Bauckhage, C. Spectral evolution in dynamic networks. Knowl Inf Syst 37, 1–36 (2013). https://doi.org/10.1007/s10115-012-0575-9
DOI: https://doi.org/10.1007/s10115-012-0575-9