Abstract
As academic research deepens and its branches diversify, finding scholars who work in the same field and consulting their results has become an important task. It is therefore valuable to obtain scholar information by extracting and predicting it from search engine results. Using XGBoost, KNN, information extraction and other methods, we predict a scholar's home page, email address, language, gender, title and other information from the search engine results for the scholar's name and institution, and achieve high accuracy in some aspects.
Appendix
The positive keywords include ‘edu’, ‘faculty’, ‘id’, ‘staff’, ‘detail’, ‘person’, ‘about’, ‘academic’, ‘teacher’, ‘list’, ‘people’, ‘lish’, ‘homepages’, ‘researcher’, ‘team’, ‘teachers’, ‘member’, ‘profile’.
The negative keywords include ‘books’, ‘google’, ‘pdf’, ‘esc’, ‘scholar’, ‘netprofile’, ‘linkedin’, ‘researchgate’, ‘news’, ‘article’, ‘wikipedia’, ‘gov’, ‘showrating’, ‘youtube’, ‘blots’, ‘citation’, ‘expert’, ‘dblp’, ‘baidu’, ‘aminer’, ‘irps’, ‘taobao’.
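The two lists above could be turned into simple numeric features for a candidate home page URL. The following is a minimal sketch, not the authors' implementation: it assumes the keywords are matched as case-insensitive substrings of each URL returned by the search engine, which this section does not state. The function name `keyword_features` and the feature names are hypothetical.

```python
# Keyword lists copied from the appendix (duplicate entry removed).
POSITIVE_KEYWORDS = [
    "edu", "faculty", "id", "staff", "detail", "person", "about",
    "academic", "teacher", "list", "people", "lish", "homepages",
    "researcher", "team", "teachers", "member", "profile",
]
NEGATIVE_KEYWORDS = [
    "books", "google", "pdf", "esc", "scholar", "netprofile", "linkedin",
    "researchgate", "news", "article", "wikipedia", "gov", "showrating",
    "youtube", "blots", "citation", "expert", "dblp", "baidu", "aminer",
    "irps", "taobao",
]


def keyword_features(url: str) -> dict:
    """Count how many positive and negative keywords occur in a URL.

    Assumption: plain substring matching on the lowercased URL. Each
    keyword counts at most once, and overlapping keywords such as
    'teacher' and 'teachers' are counted separately.
    """
    text = url.lower()
    positive_hits = sum(1 for kw in POSITIVE_KEYWORDS if kw in text)
    negative_hits = sum(1 for kw in NEGATIVE_KEYWORDS if kw in text)
    return {
        "positive_hits": positive_hits,
        "negative_hits": negative_hits,
        "score": positive_hits - negative_hits,
    }


if __name__ == "__main__":
    # Hypothetical candidate URLs for illustration only.
    for candidate in (
        "https://www.example.edu/faculty/profile/jane-doe",
        "https://www.linkedin.com/in/jane-doe",
    ):
        print(candidate, keyword_features(candidate))
```

Counts like these could serve as input columns for a classifier such as XGBoost, alongside whatever other features are available. Substring matching is deliberately crude: short keywords such as ‘id’ will match inside unrelated words, so token-level matching may be preferable in practice.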
Copyright information
© 2022 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yang, W., Sun, B., Liu, B. (2022). Basic Profiling Extraction Based on XGBoost. In: Qin, B., Wang, H., Liu, M., Zhang, J. (eds) CCKS 2021 - Evaluation Track. CCKS 2021. Communications in Computer and Information Science, vol 1553. Springer, Singapore. https://doi.org/10.1007/978-981-19-0713-5_7
DOI: https://doi.org/10.1007/978-981-19-0713-5_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-0712-8
Online ISBN: 978-981-19-0713-5
eBook Packages: Computer Science, Computer Science (R0)