Abstract
In the current era of information explosion, many complex systems can be modeled as networks (graphs). Advances in artificial intelligence and machine learning have also provided more tools for graph analysis tasks. However, high-dimensional, large-scale graphs cannot be fed directly into machine learning algorithms. One typically needs to apply representation learning to transform them into low-dimensional vector representations. In network embedding (representation learning), homogeneous graphs have already been studied extensively. However, heterogeneous information networks are more common in real-world applications, and applying homogeneous-graph embedding methods to heterogeneous graphs incurs significant information loss. In this paper, we propose a numerical-signature-based method that is highly pluggable: given a target heterogeneous graph G, our method can complement any existing network embedding method, designed for either homogeneous or heterogeneous graphs, and universally improve the embedding quality of G while introducing only minimal overhead. We use real datasets from four different domains and compare with a representative homogeneous network embedding method, a representative heterogeneous network embedding method, and a state-of-the-art heterogeneous network embedding method to demonstrate how the proposed framework improves the quality of network embedding on node classification, node clustering, and edge classification tasks.
Notes
1. All datasets and code are publicly available at https://github.com/guaw/sig_py.
Acknowledgements
Chunyao Song is supported in part by the NSFC under the grants 61702285, 61772289, U1836109, U1936206, and U1936105, the NSF of Tianjin under the grant 17JCQNJC00200, and Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, NJUPT under the grant BDSIP1902. Tingjian Ge is supported in part by NSF grant IIS-1633271.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Song, C., Guo, J., Ge, T., Yuan, X. (2020). Type Preserving Representation of Heterogeneous Information Networks. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12113. Springer, Cham. https://doi.org/10.1007/978-3-030-59416-9_36
DOI: https://doi.org/10.1007/978-3-030-59416-9_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59415-2
Online ISBN: 978-3-030-59416-9
eBook Packages: Computer Science, Computer Science (R0)