
    E. Bernstam

    Ambiguous gene names in the biomedical literature are a barrier to accurate information extraction. To overcome this hurdle, we generated Ontology Fingerprints for selected genes that are relevant for personalized cancer therapy. These Ontology Fingerprints were used to evaluate the association between genes and biomedical literature to disambiguate gene names. We obtained 93.6% precision for the test gene set and 80.4% for the area under a receiver-operating characteristics curve for gene and article association. The core algorithm was implemented using a graphics processing unit-based MapReduce framework to handle big data and to improve performance. We conclude that Ontology Fingerprints can help disambiguate gene names mentioned in text and analyse the association between genes and articles. Database URL: http://www.ontologyfingerprint.org.
    The need to maintain accessibility of the biomedical literature has led to development of methods to assist human indexers by recommending index terms for newly encountered articles. Given the rapid expansion of this literature, it is essential that these methods be scalable. Document vector representations are commonly used for automated indexing, and Random Indexing (RI) provides the means to generate them efficiently. However, RI is difficult to implement in real-world indexing systems, as (1) efficient nearest-neighbor search requires retaining all document vectors in RAM, and (2) it is necessary to maintain a store of randomly generated term vectors to index future documents. Motivated by these concerns, this paper documents the development and evaluation of a deterministic binary variant of RI. The increased capacity demonstrated by binary vectors has implications for information retrieval, and the elimination of the need to retain term vectors facilitates distributed implemen...
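The idea of a deterministic binary variant of RI can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper; it assumes that each term's binary vector is derived from a hash of the term itself, so no store of term vectors has to be kept, and that document vectors are formed by a per-bit majority vote. The names `DIM`, `term_vector`, `document_vector`, and `hamming` are hypothetical, as is the choice of 256 dimensions.

```python
import hashlib
import random

DIM = 256  # hypothetical dimensionality, chosen only for illustration


def term_vector(term, dim=DIM):
    """Derive a reproducible binary vector from the term itself.

    Seeding the generator with a hash of the term means the same vector
    can be regenerated on demand, so nothing has to be stored.
    """
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = random.Random(seed)
    return [rng.getrandbits(1) for _ in range(dim)]


def document_vector(terms, dim=DIM):
    """Combine term vectors by per-bit majority vote (ties resolve to 0)."""
    counts = [0] * dim
    for term in terms:
        for i, bit in enumerate(term_vector(term, dim)):
            counts[i] += 1 if bit else -1
    return [1 if c > 0 else 0 for c in counts]


def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))
```

Because the vector for a term depends only on the term, two machines indexing different documents produce compatible document vectors without sharing any state, which is the property that makes a distributed implementation straightforward. Nearest-neighbor search then reduces to comparing Hamming distances between document vectors.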
    The National Guideline Clearinghouse (NGC) and its guideline classification system are significant contributions to the study of clinical practice guidelines (CPGs) and their incorporation into routine clinical care. The NGC classification system is primarily designed to support guideline retrieval. We believe that a guideline classification system should also support identification of features that relate to incorporation of executable CPGs into computer-based applications for sharing and delivering guideline-based advice. We have developed a proposed expansion of the NGC guideline classification for this purpose. The axes of the proposed scheme have implications for designing formal models and structures for representing and authoring CPGs. This scheme also has implications for future research.
    GLIF3: The Evolution of a Guideline Representation Format. Mor Peleg, Ph.D., Aziz A. Boxwala, MBBS, Ph.D., Omolola Ogunyemi, Ph.D., Qin ... Zeng, Ph.D., Samson Tu, MS, Ronilda Lacson, MD, Elmer Bernstam, MD, MSE,
    The Society of University Surgeons (SUS) has an ongoing competitive funding program to support research training for residents. We sought to determine the career track of award recipients. We included in the study SUS resident awardees who completed awards from 1989-2007. Characteristics of awardees and their academic productivity were extracted from curricula vitae provided by awardees (n = 24), or from online sources (n = 7). Awardees spent an average of 2.7 y (range, 1-4 y) of dedicated research time during residency. Within 3 y of award completion, awardees averaged 9.8 publications with their mentor (range, 1-32), including 5.4 as first author (range, 1-17), with an average maximum impact factor of 5.7. A total of 25 residents (81%) pursued fellowships. At an average follow-up of 11.4 y (range, 4-22 y) from the end of the award and 7.2 y (range, 0-18 y) from the end of clinical training, awardees had an average Hirsch index of 14.5 (range, 2-48). At the time of the study, 26 awardees (84%) were in academic surgery. Of the 23 awardees who had completed surgical training ≥ 3 y earlier, 11 (48%) had received independent research funding, including seven (30%) who had received R01 or equivalent funding. The SUS resident research awardees had a productive research experience. Although our retrospective study cannot determine causation, the SUS award mechanism delivers on its promise of supporting junior surgeon-scientists who pursue academic careers and establish independent research programs. Further studies are needed to determine how rates of subsequent independent research funding can be improved.
    As the volume of biomedical text increases exponentially, automatic indexing becomes increasingly important. However, existing approaches do not distinguish central (or core) concepts from concepts that were mentioned in passing. We focus on the problem of indexing MEDLINE records, a process that is currently performed by highly trained humans at the National Library of Medicine (NLM). NLM indexers are assisted by a system called the Medical Text Indexer (MTI) that suggests candidate indexing terms. Our objective was to improve the ability of MTI to select the core terms in MEDLINE abstracts. These core concepts are deemed to be most important and are designated as "major headings" by MEDLINE indexers. We introduce and evaluate a graph-based indexing methodology called MEDRank that generates concept graphs from biomedical text and then ranks the concepts within these graphs to identify the most important ones. We insert a MEDRank step into the MTI and compare MTI's output with and without MEDRank to the MEDLINE indexers' selected terms for a sample of 11,803 PubMed Central articles. We also tested whether human raters prefer terms generated by the MEDLINE indexers, MTI without MEDRank, or MTI with MEDRank for a sample of 36 PubMed Central articles. MEDRank improved recall of the major headings designated by MEDLINE indexers by 30% over MTI without MEDRank (0.489 vs. 0.376). Overall recall was only slightly (6.5%) higher (0.490 vs. 0.460), as was F(2) (3%, 0.408 vs. 0.396). However, overall precision was 3.9% lower (0.268 vs. 0.279). Human raters preferred terms generated by MTI with MEDRank over terms generated by MTI without MEDRank (by an average of 1.00 more term per article), and preferred terms generated by MTI with MEDRank and the MEDLINE indexers at the same rate. The addition of MEDRank to MTI significantly improved the retrieval of core concepts in MEDLINE abstracts and more closely matched human expectations compared to MTI without MEDRank. In addition, MEDRank slightly improved overall recall and F(2).
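The percentages quoted in this abstract are relative changes between the two systems, not differences in percentage points. A short sketch makes the arithmetic explicit; the function name `relative_change` and the `results` dictionary are illustrative, while the metric values are those reported above (MTI with MEDRank first, MTI without MEDRank second).

```python
def relative_change(new, old):
    """Relative change of `new` with respect to `old`, as a fraction."""
    return (new - old) / old


# (MTI with MEDRank, MTI without MEDRank), as reported in the abstract
results = {
    "recall of major headings": (0.489, 0.376),
    "overall recall": (0.490, 0.460),
    "F(2)": (0.408, 0.396),
    "overall precision": (0.268, 0.279),
}

for metric, (with_medrank, without_medrank) in results.items():
    change = relative_change(with_medrank, without_medrank)
    print(f"{metric}: {change * 100:+.1f}%")
```

For example, the 30% figure for major headings corresponds to (0.489 - 0.376) / 0.376, whereas the absolute gain in recall is about 11 percentage points.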
    Technological and cultural factors influence access to health information on the web in multifarious ways. We evaluated structural differences and availability of communication services on the web in three diverse language and cultural groups: Chinese, English, and Spanish. A total of 382 web sites were analyzed: 144 were English language sites (38%), 129 were Chinese language sites (34%), and 108 were Spanish language sites (28%). We did not find technical differences in the number of outgoing links per domain or the total availability of communication services between the three groups. There were differences in the distribution of available services between Chinese and English sites. In the Chinese sites, there were more communication services between consumers and health experts. Our results suggest that the health-related web presence of these three cultural groups is technologically comparable, but reflects differences that may be attributable to cultural factors.
    Consumers are increasingly turning to the Web, expecting to find the latest health information. The purpose of this study was to assess the currency of online breast cancer information. We determined whether nine recent advances in breast cancer management were incorporated into 337 unique breast cancer Web pages. Two reviewers independently assessed content; if a Web page covered appropriate advances it was deemed to be "current." Of the 337 Web pages, 89 contained one or more advances. Of the 122 Web pages that had dates of update available, 49% had been updated within 6 months. Only 11%-37% of Web pages covered clinically accepted advances, even among Web pages that were updated after acceptance of the advance into clinical practice. We conclude that online health information is often not sufficiently current. Consumers searching for health information online should always consult an expert clinician before taking action.
    Guidelines are modeled in GLIF at three levels of abstraction: a conceptual flowchart that is easy to author and comprehend, a computable specification that can be verified for logical consistency and completeness, and an implementable specification that can be ...