
Boosting Similar Compounds Searches via Correlated Subgraph Analysis

  • Conference paper
Information Integration and Web Intelligence (iiWAS 2023)

Abstract

Graph similarity search (GSS) models chemical compounds as a graph database. GSS is an essential tool for drug discovery because it can find graphs (compounds) similar to a given query. Existing GSS methods have two critical limitations. First, handling large databases is time consuming. Second, finding compounds that exhibit a structure-activity relationship (SAR), which is vital in drug discovery, remains difficult. To overcome these limitations, a novel graph-based method for chemical compound searches is proposed. Since compounds with SAR share similar substructures, the proposed method extracts correlated subgraphs included in a query and uses them to explore similar compounds. In a practical drug discovery task, the method achieves faster searches and improved accuracy compared to existing methods.
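As a rough illustration of the idea of expanding a query with correlated substructures, the following minimal sketch may help. It is not the authors' algorithm: the abstract does not state which correlation measure or index is used, so the phi (Pearson) coefficient over substructure occurrences, the toy database, the substructure labels, and the function names `phi_correlation` and `correlated_candidates` are all assumptions made for illustration only.

```python
from math import sqrt

def phi_correlation(occ_a, occ_b, n):
    """Phi (Pearson) correlation between the occurrences of two substructures.

    occ_a, occ_b: sets of compound ids that contain each substructure.
    n: total number of compounds in the database.
    The choice of this measure is an assumption, not taken from the paper.
    """
    sa, sb = len(occ_a), len(occ_b)
    sab = len(occ_a & occ_b)
    denom = sqrt(sa * sb * (n - sa) * (n - sb))
    if denom == 0:
        return 0.0  # undefined when a substructure occurs in none or all compounds
    return (n * sab - sa * sb) / denom

def correlated_candidates(query, index, n, threshold=0.5):
    """Score compounds by how many query-correlated substructures they contain."""
    expanded = set()
    for q in query:
        for feature, occ in index.items():
            if phi_correlation(index.get(q, set()), occ, n) >= threshold:
                expanded.add(feature)
    scores = {}
    for feature in expanded:
        for compound in index[feature]:
            scores[compound] = scores.get(compound, 0) + 1
    return expanded, scores

# Hypothetical toy database: compound id -> substructure labels.
# A real system would enumerate subgraphs of molecular graphs instead.
db = {
    "c1": {"benzene", "carboxyl", "hydroxyl"},
    "c2": {"benzene", "carboxyl"},
    "c3": {"benzene", "amine"},
    "c4": {"amine", "hydroxyl"},
    "c5": {"benzene", "carboxyl"},
    "c6": {"amine"},
}

# Inverted index: substructure label -> set of compound ids containing it.
index = {}
for compound, features in db.items():
    for feature in features:
        index.setdefault(feature, set()).add(compound)

expanded, scores = correlated_candidates({"carboxyl"}, index, len(db))
```

In this toy setting, a query containing only `carboxyl` is expanded with `benzene`, because the two labels co-occur often, so a compound such as `c3` that lacks `carboxyl` still becomes a candidate. Compounds containing only negatively correlated labels are never scored.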



Acknowledgements

This work was partly supported by JST PRESTO (JPMJPR2033) and JSPS KAKENHI (JP22K17894).

Author information

Corresponding author

Correspondence to Yuma Naoi.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Naoi, Y., Shiokawa, H. (2023). Boosting Similar Compounds Searches via Correlated Subgraph Analysis. In: Delir Haghighi, P., et al. Information Integration and Web Intelligence. iiWAS 2023. Lecture Notes in Computer Science, vol 14416. Springer, Cham. https://doi.org/10.1007/978-3-031-48316-5_42

  • DOI: https://doi.org/10.1007/978-3-031-48316-5_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-48315-8

  • Online ISBN: 978-3-031-48316-5

  • eBook Packages: Computer Science, Computer Science (R0)
