Abstract
We introduce a Bayesian Dirichlet-Multinomial model of the edge weights of the Cumulative ADJacency (\(CADJ\)) graph [1] with the goal of intelligent graph pruning. As a topology-representing graph, \(CADJ\) is an effective tool for extracting clusters from the learned prototypes of SOMs, but for complex data the graph must typically be pruned to elicit meaningful cluster structure. This work is a first attempt to guide this pruning within a formal modeling framework. Our model, dubbed DM-Prune, earmarks edges for removal via comparisons to a novel null model and provides an internal assessment of the information loss resulting from the iterative removal of edges. We show that DM-Pruned \(CADJ\) graphs lead to clusterings comparable to the best previously achieved on highly structured real data.
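To make the objects named in the abstract concrete, the following is a minimal sketch and not the DM-Prune method itself. It assumes the usual definition of \(CADJ\) from [1], in which the weight of edge \((i, j)\) counts the data vectors whose best-matching prototype is \(i\) and second-best-matching prototype is \(j\), and it uses the standard conjugate update in which a Dirichlet prior combined with multinomial counts gives a Dirichlet posterior. The function names and the symmetric prior `alpha` are illustrative assumptions; the null model used in the paper is not reproduced here.

```python
import numpy as np

def cadj_matrix(data, prototypes):
    """CADJ[i, j] = number of data vectors whose best-matching prototype
    is i and whose second-best-matching prototype is j."""
    # Squared Euclidean distances, shape (n_data, n_prototypes).
    d2 = ((data[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(d2, axis=1)
    bmu1, bmu2 = order[:, 0], order[:, 1]
    n = len(prototypes)
    cadj = np.zeros((n, n), dtype=int)
    # Accumulate one count per data vector on its (BMU, second BMU) edge.
    np.add.at(cadj, (bmu1, bmu2), 1)
    return cadj

def dirichlet_posterior_mean(counts, alpha):
    """Posterior mean of multinomial probabilities under a Dirichlet(alpha)
    prior: (counts + alpha) / sum(counts + alpha)."""
    post = np.asarray(counts, dtype=float) + np.asarray(alpha, dtype=float)
    return post / post.sum()

# Toy example: three 1-D prototypes and four data vectors.
prototypes = np.array([[0.0], [1.0], [3.0]])
data = np.array([[0.1], [0.2], [0.9], [2.9]])
cadj = cadj_matrix(data, prototypes)

# Outgoing edge counts of prototype 0 (to prototypes 1 and 2), smoothed
# with a placeholder symmetric prior (an assumption, not the paper's prior).
edge_probs = dirichlet_posterior_mean(cadj[0, 1:], alpha=[1.0, 1.0])
```

In a pruning setting, edges whose posterior weight is small relative to some reference would be candidates for removal; how that reference is constructed is the contribution of the paper and is not sketched here.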
References
Taşdemir K, Merényi E (2009) Exploiting data topology in visualization and clustering of Self-Organizing Maps. IEEE Trans. Neur. Netw. 20(4):549–562. ISSN 1045-9227
Martinetz T, Schulten K (1994) Topology representing networks. Neural Netw. 7(3):507–522. ISSN 0893-6080
Kohonen T (1997) Self-Organizing Maps, 2nd edn. Springer, Heidelberg
Merényi E, Taşdemir K, Zhang L (2009) Learning highly structured manifolds: harnessing the power of SOMs. In: Similarity-based clustering. Springer, Heidelberg, pp 138–168. ISBN 978-3-642-01804-6
Okabe A, Boots B, Sugihara K (1992) Spatial tessellations: concepts and applications of voronoi diagrams. John Wiley & Sons Inc., New York
Mosimann J (1962) On the compound multinomial distribution, the multivariate \(\beta \)-distribution, and correlations among proportions. Biometrika 49(1/2):65–82. ISSN 00063444
DeSieno D (March 1988) Adding a conscience to competitive learning. In: Proceedings of the international conference neural network (ICNN), New York, vol. I, pp I–117–124
Agrell E (January 1993) A method for examining vector quantizer structures. In: Proceedings of the IEEE international symposium information theory. IEEE, pp 394–394
Dyer M, Frieze A (1991) Computing the volume of convex bodies: a case where randomness provably helps. Probab. Comb. Appl. 44:123–170
Lovász L, Vempala S (2006) Simulated annealing in convex bodies and an \(O^*(n^4)\) volume algorithm. J. Comput. Syst. Sci. 72(2):392–417
Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: Proceedings of the 20th international conference on computer information science, ISCIS 2005, Springer, Heidelberg, pp 284–293
Nepusz T, Csardi G (2006) The igraph software package for complex network research. Int J Complex Syst 1695(5):1–9
Merényi E, Taylor J (April 2018) Empowering graph segmentation methods with SOMs and CONN similarity for clustering large and complex data. Neural Comput. Appl. (forthcoming)
Jain A (2004) Issues related to data mining with Self-Organizing Maps. M.Sc. thesis, Rice University
Merényi E, Csató B, Taşdemir K (2007) Knowledge discovery in urban environments from fused multi-dimensional imagery. In: Gamba P, Crawford M (eds) Proceedings of the IEEE GRSS/ISPRS joint workshop on remote sensing and data fusion over urban areas (URBAN 2007), Paris, France, 11–13 April 2007, pp 1–13
Merényi E, Taylor J (June 2017) SOM-empowered graph segmentation for fast automatic clustering of large and complex data. In: 12th international workshop on self-organizing maps and learning vector quantization, clustering and data visualization (WSOM+2017), pp 1–9
Acknowledgment
We thank Dr. Beáta Csathó, University of Buffalo, for the Ocean City spectral image and accompanying truth. This project was partially supported by a North American ALMA Development Cycle 5 Study Program, administered by the National Radio Astronomy Observatory, with the consent of the U.S. National Science Foundation.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Taylor, J., Merényi, E. (2020). A Probabilistic Method for Pruning CADJ Graphs with Applications to SOM Clustering. In: Vellido, A., Gibert, K., Angulo, C., Martín Guerrero, J. (eds) Advances in Self-Organizing Maps, Learning Vector Quantization, Clustering and Data Visualization. WSOM 2019. Advances in Intelligent Systems and Computing, vol 976. Springer, Cham. https://doi.org/10.1007/978-3-030-19642-4_5
DOI: https://doi.org/10.1007/978-3-030-19642-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-19641-7
Online ISBN: 978-3-030-19642-4
eBook Packages: Intelligent Technologies and Robotics (R0)