Abstract
To make efficient use of the associated relationships in expert pages, we propose a Chinese expert disambiguation method based on semi-supervised graph clustering that integrates several kinds of associated relationships. First, correlation features of the expert attributes are extracted through correlation analysis of the expert pages. Second, a similarity matrix between the documents on different expert pages is constructed from the attribute features and the associated relationships among the pages. Finally, with the attribute correlations serving as semi-supervised constraints, an expert disambiguation model is built with a graph-based clustering approach and solved with a kernel-based method to achieve expert name disambiguation. Comparative experiments on Chinese expert disambiguation show that the semi-supervised graph clustering method integrating expert-associated relationships achieves a markedly better disambiguation effect.
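The pipeline summarized in the abstract (a page similarity matrix, pairwise constraints derived from attributes, a kernel-based solution) can be illustrated with a small sketch. It is not the paper's model: the feature vectors, the weight `w`, the function names `add_constraints` and `kernel_kmeans`, and the rank-one form of the constraint update are all assumptions chosen to keep the example short and the kernel positive semidefinite.

```python
import numpy as np

def add_constraints(S, must_link=(), cannot_link=(), w=1.0):
    """Fold pairwise constraints into a similarity matrix (assumed form).

    Each must-link pair (i, j) adds w * (e_i + e_j)(e_i + e_j)^T, i.e. an
    extra feature shared by both pages. Each cannot-link pair adds
    w * (e_i - e_j)(e_i - e_j)^T, an extra feature with opposite signs.
    Both updates are positive semidefinite, so a PSD input stays PSD.
    """
    K = np.array(S, dtype=float)
    for pairs, sign in ((must_link, 1.0), (cannot_link, -1.0)):
        for i, j in pairs:
            v = np.zeros(len(K))
            v[i], v[j] = 1.0, sign
            K += w * np.outer(v, v)
    return K

def kernel_kmeans(K, init_labels, n_clusters, max_iter=100):
    """Batch kernel k-means on a precomputed kernel matrix K."""
    labels = np.array(init_labels)
    n = len(K)
    for _ in range(max_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # an empty cluster attracts no points
            # ||phi(x_i) - m_c||^2 = K_ii - 2 mean_j K_ij + mean_{j,l} K_jl
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, members].mean(axis=1)
                          + K[np.ix_(members, members)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Hypothetical data: six pages sharing one name. Pages 0-1 describe expert A,
# pages 3-5 expert B, and page 2 reads slightly more like B than like A.
X = np.array([[1.0, 0.0], [1.0, 0.0], [0.6, 0.8],
              [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]])
S = X @ X.T                      # cosine similarity of unit feature vectors
init = [0, 0, 1, 1, 1, 1]        # initial guess taken from text similarity

free = kernel_kmeans(S, init, n_clusters=2)

# Assume an attribute match (for example a shared affiliation) says that
# pages 1 and 2 refer to the same person.
K = add_constraints(S, must_link=[(1, 2)], w=3.0)
guided = kernel_kmeans(K, init, n_clusters=2)

print(free.tolist())    # [0, 0, 1, 1, 1, 1]
print(guided.tolist())  # [0, 0, 0, 1, 1, 1]
```

In this toy setting the unconstrained run leaves page 2 with expert B, while the single must-link constraint moves it to expert A. The weight `w` controls how strongly a constraint can override text similarity; here it was picked by hand.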
Acknowledgments
This work is supported by the National Natural Science Foundation (No. 61175068), the National Innovation Fund for Technology-based Firms (No. 11C26215305905), the Open Fund of the Software Engineering Key Laboratory of Yunnan Province (No. 2011SE14), and the Ministry of Education fund for returned overseas students to start research projects.
About this article
Cite this article
Jiang, J., Yan, X., Yu, Z. et al. A Chinese expert disambiguation method based on semi-supervised graph clustering. Int. J. Mach. Learn. & Cyber. 6, 197–204 (2015). https://doi.org/10.1007/s13042-014-0255-z
DOI: https://doi.org/10.1007/s13042-014-0255-z