Abstract
This paper presents a method for visualizing large graphs, such as a collection of Web pages, in a two-dimensional space. The main contribution is a change of representation that makes the data easier to handle. The method consists of three major steps: (1) First, we transform the graph into a sparse matrix in which each vertex of the graph is represented by one sparse vector. Each sparse vector has non-zero components for the vertices that are close to the vertex it represents. (2) Next, we perform hierarchical clustering (e.g., hierarchical K-Means) on the set of sparse vectors, which yields a hierarchy of clusters. (3) In the last step, we map the hierarchy of clusters into a two-dimensional space in such a way that more similar clusters appear close together in the picture. The effect of the whole procedure is that each vertex is assigned unique X and Y coordinates, such that vertices, or groups of vertices on several levels of the hierarchy, that are more strongly connected in the graph are placed closer together in the picture. The method is particularly useful for power-law distributed graphs. We show applications of the method to real-world examples: visualization of an institution collaboration graph and of a cross-sell recommendation graph.
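The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything concrete in it is an assumption made for illustration: the function names, the `radius` and `leaf_size` parameters, the 1/(1+distance) weighting, the cosine 2-means splitter, and the rectangle-subdivision layout, which is a simple stand-in for the paper's similarity-preserving mapping.

```python
import math
from collections import deque

def neighborhood_vectors(adj, radius=1):
    """Step 1 (assumed weighting): one sparse vector (dict) per vertex, with
    weight 1/(1+d) for every vertex at BFS distance d <= radius."""
    vectors = {}
    for v in adj:
        dist = {v: 0}
        queue = deque([v])
        while queue:
            u = queue.popleft()
            if dist[u] == radius:
                continue  # do not expand beyond the neighborhood radius
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        vectors[v] = {u: 1.0 / (1 + d) for u, d in dist.items()}
    return vectors

def cosine(a, b):
    dot = sum(x * b.get(k, 0.0) for k, x in a.items())
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors, members):
    c = {}
    for m in members:
        for k, x in vectors[m].items():
            c[k] = c.get(k, 0.0) + x / len(members)
    return c

def two_means(vectors, members, iters=10):
    """Split members into two groups by cosine 2-means with deterministic seeds."""
    members = sorted(members)
    seed_a = members[0]
    seed_b = min(members, key=lambda m: cosine(vectors[m], vectors[seed_a]))
    cents = [vectors[seed_a], vectors[seed_b]]
    groups = [members, []]
    for _ in range(iters):
        new = [[], []]
        for m in members:
            sims = [cosine(vectors[m], c) for c in cents]
            new[0 if sims[0] >= sims[1] else 1].append(m)
        if new == groups:
            break  # assignments are stable
        groups = new
        if not all(groups):
            break  # degenerate split; the caller treats the node as a leaf
        cents = [centroid(vectors, g) for g in groups]
    return groups

def build_hierarchy(vectors, members, leaf_size=3):
    """Step 2: recursive 2-means, a simple form of hierarchical K-Means."""
    node = {"members": sorted(members), "children": []}
    if len(members) > leaf_size:
        left, right = two_means(vectors, members)
        if left and right:
            node["children"] = [build_hierarchy(vectors, g, leaf_size)
                                for g in (left, right)]
    return node

def layout(node, box=(0.0, 0.0, 1.0, 1.0), depth=0, coords=None):
    """Step 3 (stand-in): give each cluster a sub-rectangle sized by its vertex
    count, alternating the split axis, and place leaf vertices inside it."""
    coords = {} if coords is None else coords
    x0, y0, x1, y1 = box
    kids = node["children"]
    if not kids:
        n = len(node["members"])
        for i, v in enumerate(node["members"]):
            t = (i + 1) / (n + 1)  # spread along the diagonal of the leaf box
            coords[v] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        return coords
    frac = len(kids[0]["members"]) / len(node["members"])
    if depth % 2 == 0:
        xm = x0 + frac * (x1 - x0)
        boxes = [(x0, y0, xm, y1), (xm, y0, x1, y1)]
    else:
        ym = y0 + frac * (y1 - y0)
        boxes = [(x0, y0, x1, ym), (x0, ym, x1, y1)]
    for kid, b in zip(kids, boxes):
        layout(kid, b, depth + 1, coords)
    return coords

# Toy graph (hypothetical data): two triangles joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
vectors = neighborhood_vectors(adj, radius=1)
tree = build_hierarchy(vectors, list(adj), leaf_size=3)
coords = layout(tree)
```

On this toy graph the two triangles end up in separate clusters and therefore in separate halves of the unit square, which is the qualitative behaviour the abstract describes. The sketch keeps sibling clusters adjacent but does not order them by similarity, and it has not been tuned for very large graphs.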
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mladenic, D., Grobelnik, M. (2005). Visualizing Very Large Graphs Using Clustering Neighborhoods. In: Morik, K., Boulicaut, JF., Siebes, A. (eds) Local Pattern Detection. Lecture Notes in Computer Science, vol 3539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11504245_6
DOI: https://doi.org/10.1007/11504245_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26543-6
Online ISBN: 978-3-540-31894-1
eBook Packages: Computer Science, Computer Science (R0)