Abstract
Illegal gambling and pornography have become increasingly rampant in cyberspace over the last ten years. Existing research on cybercrime governance mainly relies on detecting illegal websites and using simple rules to determine organizational affiliation. These methods are not only insufficiently accurate but also ignore other cybercrime assets and their complex relationships. To address these problems, we propose a novel cybercrime assets knowledge graph clustering method (CAKGC) based on feature fusion, which combines heterogeneous node attributes with the graph structure features of the knowledge graph. We carefully analyze the different kinds of multi-source heterogeneous cybercrime assets exposed on the Internet and their intricate relationships, which lays the groundwork for designing an ontology and constructing a comprehensive cybercrime asset knowledge graph. Moreover, two feature extraction strategies are adopted to learn heterogeneous node attributes and graph structure features automatically. Finally, we fuse the two levels of features through dimensionality reduction and apply clustering algorithms to discover highly dense groups of cybercrime assets. Experimental results on real-world cybercrime datasets demonstrate the superiority of CAKGC in terms of clustering accuracy (ACC), normalized mutual information (NMI), and purity, outperforming advanced baseline methods.
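The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy labels, and the choice of geometric-mean normalization for NMI are assumptions made here for illustration, and the brute-force label matching used for ACC is only practical for a handful of clusters.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt


def purity(true_labels, pred_labels):
    """Share of items that belong to the majority true group of their cluster."""
    per_cluster = {}
    for t, p in zip(true_labels, pred_labels):
        per_cluster.setdefault(p, Counter())[t] += 1
    return sum(max(c.values()) for c in per_cluster.values()) / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(true_labels, pred_labels):
    """Mutual information normalized by sqrt(H(true) * H(pred)) (assumed variant)."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    ct, cp = Counter(true_labels), Counter(pred_labels)
    mi = sum((c / n) * log(c * n / (ct[t] * cp[p])) for (t, p), c in joint.items())
    h_true, h_pred = entropy(true_labels), entropy(pred_labels)
    if h_true == 0 or h_pred == 0:
        # Convention chosen here: two trivial partitions agree perfectly.
        return 1.0 if h_true == h_pred else 0.0
    return mi / sqrt(h_true * h_pred)


def clustering_accuracy(true_labels, pred_labels):
    """ACC: accuracy under the best one-to-one mapping of clusters to groups."""
    classes = list(set(true_labels))
    clusters = list(set(pred_labels))
    joint = Counter(zip(true_labels, pred_labels))
    # Pad with None so every cluster can receive a distinct (possibly empty) group.
    padded = classes + [None] * max(0, len(clusters) - len(classes))
    best = 0
    for perm in permutations(padded, len(clusters)):
        hits = sum(joint[(t, p)] for t, p in zip(perm, clusters))
        best = max(best, hits)
    return best / len(true_labels)


# Hypothetical toy data: six assets, three ground-truth groups, three clusters.
true_groups = ["g1", "g1", "g1", "g2", "g2", "g3"]
predicted = [0, 0, 1, 1, 1, 2]

acc_value = clustering_accuracy(true_groups, predicted)
nmi_value = nmi(true_groups, predicted)
purity_value = purity(true_groups, predicted)
print(acc_value, nmi_value, purity_value)
```

On larger problems the permutation search would normally be replaced by an assignment solver such as the Hungarian algorithm; the brute-force loop is kept here only so the sketch depends on nothing beyond the standard library.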
Acknowledgments
This study was funded by the National Key Research and Development Program of China (2021YFB3100500).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, B., Shi, F., Xu, C., Xue, P., Sun, J. (2024). CAKGC: A Clustering Method of Cybercrime Assets Knowledge Graph Based on Feature Fusion. In: Huang, DS., Chen, W., Guo, J. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science, vol 14870. Springer, Singapore. https://doi.org/10.1007/978-981-97-5606-3_15
DOI: https://doi.org/10.1007/978-981-97-5606-3_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5605-6
Online ISBN: 978-981-97-5606-3
eBook Packages: Computer Science, Computer Science (R0)