Abstract
With the rapid development of bioinformatics and pharmaceutical engineering, pharmaceutical companies and institutes increasingly pay attention to intellectual property protection via medical patents. As a result, classifying massive numbers of medical patents accurately without manual intervention is an important challenge for academia and industry. To address it, we propose a deep learning based classification framework for medical patents, which consists of three components: text processing, feature extraction, and prototype clustering. Unlike existing classification methods based on machine learning, the proposed framework is robust to external samples while maintaining high precision. In detail, for text processing, a professional medical text thesaurus is built via the GloVe method, which can learn more specialized vocabulary in the medical field. For feature extraction, a hybrid deep learning model is proposed to extract the features of patent texts, which integrates a one-dimensional convolutional neural network (CNN) and two bidirectional long short-term memory (Bi-LSTM) networks. For prototype clustering, we propose an improved distance-based center loss function (DCL). Finally, extensive experiments are conducted on a Chinese medical patent dataset provided by a company. The results demonstrate that the proposed method is significantly superior in classification precision and robustness compared with other existing multi-class classification methods.
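To make the prototype clustering idea concrete, the following is a minimal sketch in plain Python of nearest-prototype classification with a rejection radius, together with a standard center loss. It is an illustration under stated assumptions, not the chapter's method: the abstract does not define the improved DCL, so the loss here is the ordinary center loss, and the function names, the toy 2-D features, and the `reject_radius` parameter are all hypothetical. In the framework described above, the feature vectors would come from the CNN and Bi-LSTM extractor rather than being written by hand.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def class_prototypes(features, labels):
    """Return one prototype per class: the mean feature vector of that class."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def center_loss(features, labels, prototypes):
    """Ordinary center loss (assumption: NOT the chapter's improved DCL).

    Half the mean squared distance from each sample to its class prototype.
    """
    total = sum(
        squared_distance(vec, prototypes[label])
        for vec, label in zip(features, labels)
    )
    return 0.5 * total / len(features)


def classify(vec, prototypes, reject_radius):
    """Return the nearest prototype's label, or None for an external sample.

    A sample farther than reject_radius (hypothetical parameter) from every
    prototype is treated as not belonging to any known class.
    """
    label, proto = min(
        prototypes.items(),
        key=lambda item: squared_distance(vec, item[1]),
    )
    if math.sqrt(squared_distance(vec, proto)) > reject_radius:
        return None
    return label


# Toy 2-D features standing in for learned patent-text embeddings.
features = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
labels = ["A", "A", "B", "B"]
prototypes = class_prototypes(features, labels)

print(prototypes)
print(center_loss(features, labels, prototypes))
print(classify([1.0, 1.0], prototypes, reject_radius=3.0))
print(classify([5.0, 6.0], prototypes, reject_radius=3.0))
```

The rejection step is what gives a prototype-based classifier a way to handle external samples: a softmax layer always assigns one of the known classes, whereas a distance threshold lets the model abstain when a sample lies far from every class center.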
Acknowledgment
This work was supported in part by the National Natural Science Foundation of China under Grant 61602434, in part by the Chongqing research program of technology innovation and application under Grant cstc2019jscx-zdztzxX0019, in part by the Chongqing research program of key standard technologies innovation of key industries under Grant cstc2017zdcy-zdyfX0076, in part by the Youth Innovation Promotion Association CAS, No. 2017393, in part by the Nanchong major scientific and technological achievements conversion project 18SXHZ0386, in part by the Scientific Research Fund for Talents of the China West Normal University, No. 17YC149, and in part by the Education program of the Ministry of Education project 201702049002.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, S., Long, M., Shi, X., He, X., Shang, M. (2021). A Robust Classification Framework for Medical Patents Based on Deep Learning. In: Meng, H., Lei, T., Li, M., Li, K., Xiong, N., Wang, L. (eds) Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery. ICNC-FSKD 2020. Lecture Notes on Data Engineering and Communications Technologies, vol 88. Springer, Cham. https://doi.org/10.1007/978-3-030-70665-4_25
DOI: https://doi.org/10.1007/978-3-030-70665-4_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-70664-7
Online ISBN: 978-3-030-70665-4
eBook Packages: Intelligent Technologies and Robotics (R0)