DOI: 10.1145/3529836.3529849
research-article
Open access

Automated Single-Label Patent Classification using Ensemble Classifiers

Published: 21 June 2022

Abstract

Many thousands of patent applications arrive at patent offices around the world every day. An important task when an application is submitted is to assign one or more classification codes from complex, hierarchical patent classification schemes, so that the application can be routed to a patent examiner who is knowledgeable about the specific technical field. This task is typically undertaken by patent professionals; however, due to the large number of applications and the potential complexity of an invention, they are usually overwhelmed. There is therefore a need for this manual code assignment task to be supported, or even fully automated, by classification systems that classify patent applications, ideally with an accuracy close to that of patent professionals. As in many other text analysis problems, in recent years this intellectually demanding task has been studied using word embeddings and deep learning techniques. In this paper, these research efforts are briefly reviewed and reproduced with similar deep learning techniques, using different feature representations, for automatic patent classification at the sub-class level. In addition, an innovative ensemble method is proposed in which classifiers are trained on different parts of the patent document. To the best of our knowledge, this is the first time that an ensemble method has been proposed for the patent classification problem. Our first results are quite promising, showing that an ensemble architecture of classifiers significantly outperforms current state-of-the-art techniques that use the same classifiers as standalone solutions.
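The abstract does not say how the outputs of the per-section classifiers are combined, so the following is only a minimal sketch of one common choice, weighted soft voting, and not the authors' method. The section names (`title`, `abstract`, `claims`), the function names `soft_vote` and `predict`, and all probability values are hypothetical and chosen purely for illustration; the labels are ordinary IPC sub-class codes.

```python
from collections import defaultdict
from typing import Dict, Optional

# Maps a sub-class code (e.g. "G06F") to a predicted probability.
Distribution = Dict[str, float]


def soft_vote(section_probs: Dict[str, Distribution],
              weights: Optional[Dict[str, float]] = None) -> Distribution:
    """Combine per-section distributions by a weighted average (soft voting)."""
    if not section_probs:
        raise ValueError("need at least one section-level prediction")
    # Assumption: equal weights unless the caller supplies others.
    weights = weights or {name: 1.0 for name in section_probs}
    total = sum(weights[name] for name in section_probs)
    combined: Dict[str, float] = defaultdict(float)
    for name, probs in section_probs.items():
        for label, p in probs.items():
            combined[label] += weights[name] * p / total
    return dict(combined)


def predict(section_probs: Dict[str, Distribution],
            weights: Optional[Dict[str, float]] = None) -> str:
    """Single-label decision: the sub-class with the highest combined score."""
    combined = soft_vote(section_probs, weights)
    return max(combined, key=combined.get)


# Toy, made-up outputs of three hypothetical classifiers, each trained on a
# different part of the same patent document.
toy = {
    "title":    {"G06F": 0.50, "H04L": 0.40, "G06N": 0.10},
    "abstract": {"G06F": 0.30, "H04L": 0.60, "G06N": 0.10},
    "claims":   {"G06F": 0.20, "H04L": 0.70, "G06N": 0.10},
}

if __name__ == "__main__":
    print(predict(toy))
```

In this toy input the title-only classifier prefers a different sub-class than the other two, and the averaged distribution follows the majority of the probability mass. Passing a `weights` dictionary lets one section count for more than the others, which is one way such an ensemble could be tuned.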


Cited By

  • (2024) A New Entity Relationship Extraction Method for Semi-Structured Patent Documents. Electronics 13:16 (3144). DOI: 10.3390/electronics13163144. Online publication date: 8-Aug-2024
  • (2024) Innovating Patent Retrieval: A Comprehensive Review of Techniques, Trends, and Challenges in Prior Art Searches. Applied System Innovation 7:5 (91). DOI: 10.3390/asi7050091. Online publication date: 26-Sep-2024
  • (2023) Unveiling Black-Boxes: Explainable Deep Learning Models for Patent Classification. Explainable Artificial Intelligence (457-474). DOI: 10.1007/978-3-031-44067-0_24. Online publication date: 21-Oct-2023

Published In

ACM Other conferences
ICMLC '22: Proceedings of the 2022 14th International Conference on Machine Learning and Computing
February 2022
570 pages
ISBN: 9781450395700
DOI: 10.1145/3529836
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Classification
  2. Deep learning
  3. Ensemble method
  4. Patent
  5. Single-label
  6. Sub-classes
  7. Word embeddings

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • EU HORIZON 2020 / MARIE SKLODOWSKA-CURIE

Conference

ICMLC 2022

Article Metrics

  • Downloads (Last 12 months): 120
  • Downloads (Last 6 weeks): 11
Reflects downloads up to 17 Oct 2024
