DOI: 10.1145/3625007.3629127
short-paper
Open access

KoExPubMed: A Tool for Effective and Customized Knowledge Extraction from PubMed

Published: 15 March 2024

Abstract

The exponential growth of the literature in general, and of the medical literature in particular, creates a need for effective, intelligent analysis strategies and tools that give researchers valuable insights into the evolving literature. Existing applications take narrower approaches to the problem, such as focusing on particular genome or protein information. In this paper, we propose an application that provides effective and detailed analysis of PubMed. The developed tool, named KoExPubMed, takes a more generalized and holistic approach by considering different types of information, such as authors, countries, and genes, together with the interactions between them. The application consists of four main components: (1) keyword search and ID extraction, (2) PubMed article information and abstract retrieval, (3) country and address extraction, and (4) gene information extraction. In addition to these fundamental components, the tool provides a variety of visualization options for the extracted information and the related associations, including line charts for densities and countries, chord charts for author collaborations, network graphs for genes mentioned together, and bubble charts for gene frequencies. By addressing the need for a generalized data mining tool, we propose a comprehensive application that is capable of employing data mining and machine learning techniques to extract from PubMed knowledge that is valuable to researchers and practitioners interested in closely investigating the achievements of others.
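The abstract mentions bubble charts for gene frequencies and network graphs for genes mentioned together. The abstract does not say how those counts are computed, so the following is only a minimal sketch of one plausible approach, assuming the gene symbols have already been extracted for each abstract. The function name `gene_stats` and the sample input are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def gene_stats(genes_per_article):
    """Count gene frequencies and gene co-mentions across articles.

    genes_per_article: iterable of iterables of gene symbols,
    one inner iterable per abstract (assumed already extracted).
    """
    frequency = Counter()    # gene -> number of articles mentioning it
    co_mentions = Counter()  # (gene_a, gene_b) -> number of shared articles
    for genes in genes_per_article:
        # Deduplicate and sort so each gene counts once per article
        # and each pair has a single canonical ordering.
        unique = sorted(set(genes))
        frequency.update(unique)
        co_mentions.update(combinations(unique, 2))
    return frequency, co_mentions


# Hypothetical example input, not data from the paper.
articles = [
    ["TP53", "BRCA1"],
    ["BRCA1", "TP53", "EGFR"],
    ["EGFR"],
]
frequency, co_mentions = gene_stats(articles)
print(frequency.most_common())
print(co_mentions.most_common())
```

Under these assumptions, `frequency` could feed a bubble chart (one bubble per gene, sized by article count), and `co_mentions` could serve as the weighted edge list of a gene network graph.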



Published In

ASONAM '23: Proceedings of the 2023 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
November 2023
835 pages
ISBN: 9798400704093
DOI: 10.1145/3625007
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States

Author Tags

1. literature analysis
2. knowledge extraction
3. gene interaction
4. data mining
5. data visualization

Qualifiers

• Short-paper

Conference

ASONAM '23

Acceptance Rates

ASONAM '23 Paper Acceptance Rate: 53 of 145 submissions, 37%
Overall Acceptance Rate: 116 of 549 submissions, 21%
