Abstract
In this paper we present and evaluate a fast, parallel method for assessing the similarity between node-labeled, edge-weighted graphs that represent protein binding pockets. To predict the functional family of a protein, its binding pockets can be modeled as graphs that capture their geometry and physicochemical composition without information loss. Community detection can then be used to facilitate measuring similarity between such graphs. Our approach is based on a parallel implementation of a community detection algorithm that adapts and extends the Louvain method. Compared to existing solutions, our method can achieve a nearly balanced workload among processors and higher graph clustering accuracy on large real-world graphs.
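The Louvain method referred to above greedily optimizes modularity, a score that compares the edge weight inside each community with the weight expected from the node degrees alone. The abstract does not specify the authors' parallel variant, so the following is only a minimal sequential sketch of the weighted modularity score for a given partition. The function name `modularity`, the `(u, v, weight)` edge format, and the `community` mapping are illustrative assumptions, not the paper's implementation.

```python
from collections import defaultdict


def modularity(edges, community):
    """Weighted modularity of a partition of an undirected graph.

    edges: iterable of (u, v, weight) tuples, each undirected edge listed once
           (hypothetical input format chosen for this sketch).
    community: dict mapping every node to a community label.
    """
    total_weight = 0.0               # m: sum of all edge weights
    internal = defaultdict(float)    # edge weight inside each community
    degree_sum = defaultdict(float)  # sum of weighted degrees per community

    for u, v, w in edges:
        total_weight += w
        degree_sum[community[u]] += w
        degree_sum[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w

    if total_weight == 0:
        return 0.0

    two_m = 2.0 * total_weight
    # Q = sum over communities of (2 * internal / 2m) - (degree_sum / 2m)^2
    return sum(
        2.0 * internal[c] / two_m - (degree_sum[c] / two_m) ** 2
        for c in degree_sum
    )


if __name__ == "__main__":
    # Toy graph: two triangles joined by a single bridge edge, unit weights.
    toy_edges = [
        (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
        (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
        (2, 3, 1.0),
    ]
    toy_partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(toy_edges, toy_partition))
```

In a Louvain-style pass, each node would be moved to the neighboring community that yields the largest increase in this score, and the resulting communities would then be collapsed into super-nodes before repeating. How that work is distributed across processors is the subject of the paper and is not reproduced here.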
Notes
1. Ångström is a unit of length equal to \(10^{-10}\) m.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ben Rejab, A., Boukhris, I. (2020). FAST Community Detection for Proteins Graph-Based Functional Classification. In: Abraham, A., Cherukuri, A., Melin, P., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2018. Advances in Intelligent Systems and Computing, vol 941. Springer, Cham. https://doi.org/10.1007/978-3-030-16660-1_57
DOI: https://doi.org/10.1007/978-3-030-16660-1_57
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16659-5
Online ISBN: 978-3-030-16660-1
eBook Packages: Intelligent Technologies and Robotics (R0)