Clustering Heterogeneous Information Network by Joint Graph Embedding and Nonnegative Matrix Factorization

Published: 10 June 2021


Many complex systems in nature and society consist of multiple types of entities and heterogeneous interactions, which can be effectively modeled as heterogeneous information networks (HINs). Structural analysis of heterogeneous networks is of great significance because it leverages the rich semantic information carried by their objects and links. Clustering a heterogeneous network groups its vertices into classes, which sheds light on the structure–function relations of the underlying system. Current algorithms perform feature extraction and clustering independently, and have been criticized for not fully characterizing the structure of clusters. In this study, we propose a learning model based on joint Graph Embedding and Nonnegative Matrix Factorization (GEjNMF), in which feature extraction and clustering are learned simultaneously by exploiting the graph embedding and the latent structure of networks. We formulate the objective function of GEjNMF and transform the heterogeneous network clustering problem into a constrained optimization problem, which is effectively solved by l0-norm optimization. The advantage of GEjNMF is that features are selected under the guidance of clustering, which both improves performance and reduces running time. Experimental results on three benchmark heterogeneous networks demonstrate that GEjNMF achieves the best performance with the least running time among the state-of-the-art methods compared. Furthermore, the proposed algorithm is robust across heterogeneous networks from various fields. The proposed model and method provide an effective alternative for heterogeneous network clustering.


ACM Transactions on Knowledge Discovery from Data  Volume 15, Issue 4
August 2021
486 pages
Association for Computing Machinery

New York, NY, United States

Publication History

Published: 10 June 2021
Accepted: 01 December 2020
Revised: 01 September 2020
Received: 01 January 2020
Published in TKDD Volume 15, Issue 4


Author Tags

  1. Heterogeneous information network
  2. Non-negative matrix factorization
  3. Clustering


NFSC



