DOI: 10.1145/2390068.2390074
research-article

Inferring appropriate eligibility criteria in clinical trial protocols without labeled data

Published: 29 October 2012

    Abstract

    We consider the user task of designing clinical trial protocols and propose a method that outputs the most appropriate eligibility criteria from a potentially huge set of candidates. Each document d in our collection D is a clinical trial protocol, which itself contains a set of eligibility criteria. Given a small set of sample documents D', |D'| ≪ |D|, that a user has initially identified as relevant (e.g., via a query interface), our scoring method automatically suggests eligibility criteria from D by ranking them according to how appropriate they are to the clinical trial protocol currently being designed. We view a document as a mixture of latent topics, and our method exploits this view in a three-step procedure. First, we infer the latent topics in the sample documents using Latent Dirichlet Allocation (LDA) [3]. Next, we use logistic regression models to compute the probability that a given candidate criterion belongs to a particular topic. Lastly, we score each criterion by computing its expected value, i.e., the probability-weighted sum of the topic proportions inferred from the set of sample documents. Intuitively, the greater the probability that a candidate criterion belongs to the topics that are dominant in the samples, the higher its expected value, or score. Results from our experiments indicate that our proposed method performs 8 and 9 times better (for inclusion and exclusion criteria, respectively) than randomly choosing from a set of candidates obtained from relevant documents. In user simulation experiments, we were able to automatically construct eligibility criteria that are on average 75% and 70% similar (for inclusion and exclusion criteria, respectively) to the correct eligibility criteria.
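
The scoring step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the first two steps have already produced, for each candidate criterion, one probability per topic (from the logistic regression models) and, for the sample documents, one proportion per topic (from LDA). The function names `score_criterion` and `rank_criteria`, the example criteria, and all numeric values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def score_criterion(topic_probs: Sequence[float],
                    topic_proportions: Sequence[float]) -> float:
    """Expected value of a criterion: for each topic, multiply the
    probability that the criterion belongs to the topic by the topic's
    proportion in the sample documents, then sum over topics."""
    if len(topic_probs) != len(topic_proportions):
        raise ValueError("one probability per topic is required")
    return sum(p * theta for p, theta in zip(topic_probs, topic_proportions))


def rank_criteria(candidates: Dict[str, Sequence[float]],
                  topic_proportions: Sequence[float]) -> List[Tuple[str, float]]:
    """Return (criterion, score) pairs sorted from highest to lowest score."""
    scored = [(text, score_criterion(probs, topic_proportions))
              for text, probs in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical topic proportions inferred from the sample documents D'.
theta = [0.7, 0.2, 0.1]

# Hypothetical per-topic probabilities for three candidate criteria.
candidates = {
    "HbA1c between 7% and 10%": [0.90, 0.05, 0.05],
    "No prior chemotherapy": [0.10, 0.80, 0.10],
    "Age 18 years or older": [0.40, 0.30, 0.30],
}

ranking = rank_criteria(candidates, theta)
for text, score in ranking:
    print(f"{score:.3f}  {text}")
```

In this toy input the first topic dominates the samples, so the criterion most strongly associated with that topic ranks first. The sketch does not require the per-topic probabilities to sum to one, since the abstract does not say whether the logistic regression models are trained jointly or one per topic.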

    References

    [1]
    Berry de Bruijn, Simona Carini, Svetlana Kiritchenko, Joel Martin and Ida Sim. Automated Information Extraction of Key Trial Design Elements from Clinical Trial Publications. In AMIA 2008 Symposium Proceedings, pages 141--145, 2008.
    [2]
    C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2007.
    [3]
    D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research (JMLR), 3:993--1022, Jan 2003.
    [4]
    F. Boudin, J.-Y. Nie, and M. Dawes. Clinical information retrieval using document and PICO structure. In Proceedings of NAACL 2010, pages 822--830, Los Angeles, California, 2010. Association for Computational Linguistics.
    [5]
    Chintan O. Patel and James J. Cimino. Semantic Query Generation from Eligibility Criteria in Clinical Trials. In AMIA 2007 Symposium Proceedings, page 1070, 2007.
    [6]
    D. Demner-Fushman and J. Lin. Answering Clinical Questions with Knowledge-Based and Statistical Techniques. Computational Linguistics, 33:63--103, March 2007.
    [7]
    Dina Demner-Fushman and Jimmy Lin. Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering. In ACL 2006 Proceedings, 2006.
    [8]
    R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley, 2001.
    [9]
    R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research (JMLR), 9:1871--1874, Aug 2008.
    [10]
    Hong Yu and Yong-gang Cao. Automatically Extracting Information Needs from Ad Hoc Clinical Questions. In AMIA 2008 Symposium Proceedings, pages 96--100, 2008.
    [11]
    Kenjiro Taura. GXP: An Interactive Shell for the Grid Environment. In International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems, 2004.
    [12]
    S. N. Kim, D. Martinez, L. Cavedon, and L. Yencken. Automatic classification of sentences to support evidence based medicine. BMC Bioinformatics, 12 (Suppl2), March 2011.
    [13]
    S. Kiritchenko, B. de Bruijn, S. Carini, J. Martin, and I. Sim. ExaCT: automatic extraction of clinical trial characteristics from journal publications. BMC Medical Informatics and Decision Making, 10:56, 2010.
    [14]
    I. Korkontzelos, T. Mu, and S. Ananiadou. ASCOT: a text mining-based web-service for efficient search and assisted creation of clinical trials. BMC Medical Informatics and Decision Making, 12(Suppl 1):S3, 2012.
    [15]
    I. Korkontzelos, T. Mu, A. Restificar, and S. Ananiadou. Text mining for efficient search and assisted creation of clinical trials. In Proceedings of the ACM Fifth International Workshop on Data and Text Mining in Biomedical Informatics, pages 43--50, 2011.
    [16]
    C.-J. Lin, R. C. Weng, and S. S. Keerthi. Trust region Newton method for large-scale logistic regression. Journal of Machine Learning Research (JMLR), 9:627--650, Apr 2008.
    [17]
    M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann and I. H. Witten. The WEKA Data Mining Software: An Update. SIGKDD Explorations, 11, 2009.
    [18]
    Mark Steyvers and Tom Griffiths. Probabilistic Topic Models. In T. Landauer, D. McNamara, S. Dennis, and W. Kintsch, editors, Latent Semantic Analysis: A Road to Meaning. Lawrence Erlbaum, 2006.
    [19]
    A. K. McCallum. MALLET: A Machine Learning for Language Toolkit. http://mallet.cs.umass.edu, 2002.
    [20]
    C. Schardt, M. B. Adams, T. Owens, S. Keitz, and P. Fontelo. Utilization of the PICO framework to improve searching for clinical questions. BMC Medical Informatics and Decision Making, 7, 2007.
    [21]
    T. G. Dietterich. Machine Learning. In Nature Encyclopedia of Cognitive Science. Macmillan, 2003.
    [22]
    S. Tu, M. Peleg, S. Carini, M. Bobak, J. Ross, D. Rubin, and I. Sim. A Practical Method for Transforming Free-Text Eligibility Criteria into Computable Criteria. Journal of Biomedical Informatics, 44:239--250, 2011.
    [23]
    W. S. Richardson, et al. The well-built clinical question: a key to evidence-based decisions. ACP Journal Club, 123, Nov-Dec 1995.
    [24]
    X. Huang, J. Lin, and D. Demner-Fushman. PICO as a Knowledge Representation for Clinical Questions. In AMIA 2006 Symposium Proceedings, pages 359--363, 2006.

    Published In

    DTMBIO '12: Proceedings of the ACM sixth international workshop on Data and text mining in biomedical informatics
    October 2012
    92 pages
    ISBN:9781450317160
    DOI:10.1145/2390068

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. clinical trials
    2. eligibility criteria
    3. unsupervised machine learning

    Qualifiers

    • Research-article

    Conference

    CIKM'12

    Acceptance Rates

    Overall Acceptance Rate 41 of 247 submissions, 17%

    Cited By

    • (2024) Data extraction for evidence synthesis using a large language model: A proof‐of‐concept study. Research Synthesis Methods, 15:4 (576-589). DOI: 10.1002/jrsm.1710. Online publication date: 3-Mar-2024
    • (2023) Finding Trends in Software Research. IEEE Transactions on Software Engineering, 49:4 (1397-1410). DOI: 10.1109/TSE.2018.2870388. Online publication date: 1-Apr-2023
    • (2023) Prediction of complications in diabetes mellitus using machine learning models with transplanted topic model features. Biomedical Engineering Letters, 14:1 (163-171). DOI: 10.1007/s13534-023-00322-7. Online publication date: 6-Oct-2023
    • (2019) Improving reference prioritisation with PICO recognition. BMC Medical Informatics and Decision Making, 19:1. DOI: 10.1186/s12911-019-0992-8. Online publication date: 5-Dec-2019
    • (2018) Feasibility of Feature-based Indexing, Clustering, and Search of Clinical Trials. Methods of Information in Medicine, 52:05 (382-394). DOI: 10.3414/ME12-01-0092. Online publication date: 20-Jan-2018
    • (2015) Automating data extraction in systematic reviews: a systematic review. Systematic Reviews, 4:1. DOI: 10.1186/s13643-015-0066-7. Online publication date: 15-Jun-2015
    • (2013) A method for discovering and inferring appropriate eligibility criteria in clinical trial protocols without labeled data. BMC Medical Informatics and Decision Making, 13:Suppl 1 (S6). DOI: 10.1186/1472-6947-13-S1-S6. Online publication date: 2013
    • (2012) DTMBIO 2012. Proceedings of the 21st ACM international conference on Information and knowledge management (2766-2767). DOI: 10.1145/2396761.2398758. Online publication date: 29-Oct-2012
