research-article

Mining heterogeneous information networks: a structural analysis approach

Published: 30 April 2013 Publication History

Abstract

Most objects and data in the real world are of multiple types and interconnected, forming complex, heterogeneous, but often semi-structured information networks. However, most network science researchers focus on homogeneous networks, without distinguishing the different types of objects and links in the networks. We view interconnected, multi-typed data, including typical relational database data, as heterogeneous information networks, study how to leverage the rich semantic meaning of the structural types of objects and links in the networks, and develop a structural analysis approach to mining semi-structured, multi-typed heterogeneous information networks. In this article, we summarize a set of methodologies that can effectively and efficiently mine useful knowledge from such information networks, and point out some promising research directions.

Information

Published In

ACM SIGKDD Explorations Newsletter  Volume 14, Issue 2
December 2012
81 pages
ISSN:1931-0145
EISSN:1931-0153
DOI:10.1145/2481244

Publisher

Association for Computing Machinery

New York, NY, United States
