
Peter Rogan

Purpose: Combinations of expressed genes can discriminate radiation-exposed from normal control blood samples by machine learning (ML) based signatures (with 8-20% misclassification rates). These signatures can quantify therapeutically relevant as well as accidental radiation exposures. The prodromal symptoms of acute radiation syndrome (ARS) overlap those present in influenza and dengue fever infections. Surprisingly, these human radiation signatures misclassified gene expression profiles of virally infected samples as false positive exposures. The present study investigates these and other confounders, and then mitigates their impact on signature accuracy.
Methods: This study investigated recall by previous and novel radiation signatures independently derived from multiple Gene Expression Omnibus datasets on common and rare non-neoplastic blood disorders and blood-borne infections (thromboembolism, S. aureus bacteremia, malaria, sickle cell disease, polycythemia vera, and aplastic anemia). Normalized expression levels of signature genes are used as input to ML-based classifiers to predict radiation exposure in other hematological conditions.
Results: Except for aplastic anemia, these blood-borne disorders modify the normal baseline expression values of genes present in radiation signatures, leading to false-positive misclassification of radiation exposures in 8-54% of individuals. Shared changes, predominantly in DNA damage response and apoptosis-related gene transcripts in radiation and confounding hematological conditions, compromise the utility of these signatures for radiation assessment. These confounding conditions (sickle cell disease, thrombosis, S. aureus bacteremia, malaria) induce neutrophil extracellular traps, initiated by chromatin decondensation, DNA damage response and fragmentation followed by programmed cell death or extrusion of DNA fragments. Riboviral infections (e.g., influenza or dengue fever) have been proposed to bind and deplete host RNA binding proteins, inducing R-loops in chromatin. R-loops that collide with incoming replication forks can result in incompletely repaired DNA damage, inducing apoptosis and releasing mature virus. To mitigate the effects of confounders, we evaluated predicted radiation-positive samples with novel gene expression signatures derived from radiation-responsive transcripts encoding secreted blood plasma proteins whose expression levels are unperturbed by these conditions.
Conclusions: This approach identifies and eliminates misclassified samples with underlying hematological or infectious conditions, leaving only samples with true radiation exposures. Diagnostic accuracy is significantly improved by selecting genes that maximize both sensitivity and specificity in the appropriate tissue using combinations of the best signatures for each of these classes of signatures.
Purpose: Inhomogeneous exposures to ionizing radiation can be detected and quantified with the dicentric chromosome assay (DCA) of metaphase cells. Complete automation of interpretation of the DCA for whole-body irradiation has significantly improved throughput without compromising accuracy; however, low levels of residual false positive dicentric chromosomes (DCs) have confounded its application for partial-body exposure determination.
Materials and methods: We describe a method of estimating and correcting for false positive DCs in digitally processed images of metaphase cells. Nearly all DCs detected in unirradiated calibration samples are introduced by digital image processing. DC frequencies of irradiated calibration samples and those exposed to unknown radiation levels are corrected by subtracting this false positive fraction from each. In partial-body exposures, the fraction of cells exposed and the radiation dose can be quantified after applying this modification of the contaminated Poisson method.
Results: Dose estimates of three partially irradiated samples diverged 0.2–2.5 Gy from physical doses and irradiated cell fractions deviated by 2.3%–15.8% from the known levels. Synthetic partial-body samples comprised of unirradiated and 3 Gy samples from 4 laboratories were correctly discriminated as inhomogeneous by multiple criteria. Root mean squared errors of these dose estimates ranged from 0.52 to 1.14 Gy² and from 8.1 to 33.3%² for the fraction of cells irradiated.
Conclusions: Automated DCA can differentiate whole- from partial-body radiation exposures and provides timely quantification of estimated whole-body equivalent dose.
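For orientation, the standard contaminated Poisson (Dolphin) calculation that the abstract refers to can be sketched as follows. This is a minimal sketch of the textbook method only, not the modified procedure from the study: the false positive correction is not modeled, and the function and parameter names are hypothetical. It solves Y / (1 - exp(-Y)) = X / (N - n0) for the mean DC yield Y of the irradiated cells, where N is the number of cells scored, n0 the number of cells without DCs, and X the total number of DCs, and then takes X / (N * Y) as the fraction of scored cells that were irradiated.

```python
import math


def contaminated_poisson(total_cells, cells_without_dc, total_dcs, iterations=200):
    """Textbook Dolphin estimate; returns (mean_yield, irradiated_fraction).

    Hypothetical helper for illustration. No false positive correction and
    no correction for cell death or mitotic delay is applied here.
    """
    cells_with_dc = total_cells - cells_without_dc
    if cells_with_dc <= 0:
        raise ValueError("no cells with dicentrics; nothing to estimate")
    target = total_dcs / cells_with_dc  # mean DCs per damaged cell
    if target <= 1.0:
        raise ValueError("need more than one DC per damaged cell on average")

    # g(Y) = Y / (1 - exp(-Y)) is increasing, tends to 1 as Y -> 0, and
    # exceeds Y, so the root lies in (0, target). Solve by bisection.
    lo, hi = 1e-12, target
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid / -math.expm1(-mid) < target:
            lo = mid
        else:
            hi = mid
    mean_yield = 0.5 * (lo + hi)
    irradiated_fraction = total_dcs / (total_cells * mean_yield)
    return mean_yield, irradiated_fraction


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    y, f = contaminated_poisson(total_cells=1000, cells_without_dc=700, total_dcs=600)
    print(f"mean yield in irradiated cells: {y:.3f} DCs/cell")
    print(f"fraction of scored cells irradiated: {f:.3f}")
```

The mean yield Y would then be converted to a dose through a calibration curve; that step is outside this sketch.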
Cancer chemotherapy responses have been related to multiple pharmacogenetic biomarkers, often for the same drug. This study utilizes machine learning to derive multi-gene expression signatures that predict individual patient responses to specific tyrosine kinase inhibitors, including erlotinib, gefitinib, sorafenib, sunitinib, lapatinib and imatinib. Support vector machine (SVM) learning was used to train mathematical models that distinguished sensitivity from resistance to these drugs using a novel systems biology-based approach. This began with expression of genes previously implicated in specific drug responses, then expanded to evaluate genes whose products were related through biochemical pathways and interactions. Optimal pathway-extended SVMs predicted responses in patients at accuracies of 70% (imatinib), 71% (lapatinib), 83% (sunitinib), 83% (erlotinib), 88% (sorafenib) and 91% (gefitinib). These best performing pathway-extended models demonstrated improved balance predicting both sensitive and resistant patient categories, with many of these genes having a known role in cancer aetiology. Ensemble machine learning-based averaging of multiple pathway-extended models derived for an individual drug increased accuracy to >70% for erlotinib, gefitinib, lapatinib and sorafenib. Through incorporation of novel cancer biomarkers, machine learning-based pathway-extended signatures display strong efficacy predicting both sensitive and resistant patient responses to chemotherapy.
Keywords: biochemical pathways, gene signatures, machine learning, systems biology, tyrosine kinase inhibitors.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
The gemcitabine SVM exhibited 62% prediction accuracy for the tumor blocks due to the presence of samples with poor nucleic acid integrity. Nevertheless, the paclitaxel SVM predicted sensitivity in 84% of patients with no or minimal residual disease.
Rapid sample processing and interpretation of estimated exposures will be critical for triaging exposed individuals after a major radiation incident. The dicentric chromosome (DC) assay assesses absorbed radiation using metaphase cells from blood. The Automated Dicentric Chromosome Identifier and Dose Estimator System (ADCI) identifies DCs and determines radiation doses. This study aimed to broaden accessibility and speed of this system, while protecting data and software integrity. ADCI Online is a secure web-streaming platform accessible worldwide from local servers. Cloud-based systems containing data and software are separated until they are linked for radiation exposure estimation. Dose estimates are identical to ADCI on dedicated computer hardware. Image processing and selection, calibration curve generation, and dose estimation of 9 test samples were completed in <2 days. ADCI Online has the capacity to alleviate analytic bottlenecks in intermediate-to-large radiation incidents. Multiple cloned software instances configured on different cloud environments accelerated dose estimation to within clinically relevant time frames.
Expedited Radiation Biodosimetry by Automated Dicentric Chromosome Identification (ADCI) and Dose Estimation.
Abstract: Biological radiation dose can be estimated from dicentric chromosome frequencies in metaphase cells. Performing these cytogenetic dicentric chromosome assays is traditionally a manual, labor-intensive process not well suited to handle the volume of samples which may require examination in the wake of a mass casualty event. Automated Dicentric Chromosome Identifier and Dose Estimator (ADCI) software automates this process by examining sets of metaphase images using machine learning-based image processing techniques. The software selects appropriate images for analysis by removing unsuitable images, classifies each object as either a centromere-containing chromosome or non-chromosome, further distinguishes chromosomes as monocentric chromosomes (MCs) or dicentric chromosomes (DCs), determines DC frequency within a sample, and estimates biological radiation dose by comparing sample DC frequency with calibration curves computed using calibration samples. This protocol describes the usage of ADCI software. Typically, both calibration (known dose) and test (unknown dose) sets of metaphase images are imported to perform accurate dose estimation. Optimal images for analysis can be found automatically using preset image filters or can also be filtered through manual inspection. The software processes images within each sample and DC frequencies are computed at different levels of stringency for calling DCs, using a machine learning approach. Linear-quadratic calibration curves are generated based on DC frequencies in calibration samples exposed to known physical doses. Doses of test samples exposed to uncertain radiation levels are estimated from their DC frequencies using these calibration curves. Reports can be generated upon request and provide a summary of results of one or more samples, of one or more calibration curves, or of dose estimation.
Video Link: The video component of this article can be found at https://www.jove.com/video/56245/
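The dose estimation step described above can be illustrated with the usual linear-quadratic form, Y = c + alpha*D + beta*D^2, where Y is the DC frequency per cell and D the dose in Gy. The sketch below inverts that curve for a test sample. It is an illustration under stated assumptions, not the ADCI implementation: the function names are hypothetical and the coefficients in the example are made-up placeholders rather than fitted values from any calibration set.

```python
import math


def dc_frequency(dose_gy, c, alpha, beta):
    """Linear-quadratic calibration curve: expected DCs per cell at a dose."""
    return c + alpha * dose_gy + beta * dose_gy ** 2


def estimate_dose(dc_freq, c, alpha, beta):
    """Invert the calibration curve; returns the non-negative dose in Gy.

    Frequencies at or below the background term c map to a dose of 0.
    """
    excess = dc_freq - c
    if excess <= 0:
        return 0.0
    if beta == 0:
        return excess / alpha  # purely linear curve
    discriminant = alpha ** 2 + 4.0 * beta * excess
    return (-alpha + math.sqrt(discriminant)) / (2.0 * beta)


if __name__ == "__main__":
    # Placeholder coefficients (assumed, not taken from the protocol).
    c, alpha, beta = 0.001, 0.03, 0.06
    observed = 0.25  # DCs per cell in a hypothetical test sample
    print(f"estimated dose: {estimate_dose(observed, c, alpha, beta):.2f} Gy")
```

Only the positive root of the quadratic is physically meaningful, which is why the sketch discards the negative one.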
The cause of variability of CYP2D6 activity within subjects with identical genotypes remains an enigma. To date only one SNP (-1584C>G) modulating transcription levels has been described. Alternatively, splice events may contribute to the observed variability by giving rise to variable ratios of functional to aberrantly spliced mRNA. Recently, we described information theory as a quantitative measure of the relative strength of CYP2D6 splice acceptor and donor sites (Rogan et al, PG 13, 207, 2003). Relatively low information for exons 6 and 7 (Ri values <1.6 bits) and exons 2 and 3 (Ri = 2 to 3.6 bits) suggested that these regions may be susceptible to alternative splicing. To test this hypothesis, liver RNA (n=5) was reverse transcribed and exons 1-9 and 3-9 amplified. Multiple products were observed after PCR in each sample. Cloning and sequencing revealed splice variants with a loss of exon 3 or 6, partial intron 1 retention and intron 5 or 6 retention; 51 amino acids are lost with exon 3, while all other variants contain a premature stop-codon. Other clones that may contain additional variants are being characterized. In conclusion, CYP2D6 mRNA undergoes extensive alternative splicing, and the relative abundance of properly to aberrantly spliced mRNA appears to be an additional source of interindividual variability in CYP2D6 activity in vivo and in vitro. Furthermore, it may also explain the relatively poor correlation between in vitro activity and "mRNA" levels.
Nucleotide variants in genes of lipid metabolism influence the risk of premature atherosclerosis. Ten percent of all single nucleotide substitutions in these genes involve splice sites. The effects of these changes on mRNA splicing and phenotypic severity, however, are not inherently obvious from the nucleotide sequence. This review presents various genes of lipid metabolism with splicing mutations known to influence the risk of premature atherosclerosis. Mechanisms of pre-mRNA splicing are illustrated and different models for prediction of the effect of nucleotide substitutions on splice-site function are presented. Information theory-based models are emphasized, along with their role in predicting splice-site function and the phenotypic severity of atherosclerosis.
Maternal uniparental disomy for the complete long arm of chromosome 14 has been reported in 14 patients to date and is associated with a specific pattern of malformation. We report a child with clinical features of this syndrome who exhibits maternal uniparental disomy confined to a specific interstitial segment of chromosome 14.
The aim of this study was to clarify the nature of the sleep abnormalities (excessive daytime sleepiness [EDS] and rapid eye movement [REM] sleep alterations) in Prader-Willi syndrome (PWS). Eight PWS patients, 15 normal, 16 narcoleptic, and 16 obese subjects were recorded in the sleep laboratory, both during daytime and nighttime. A principal finding was that EDS in PWS was associated with an increased amount and depth of sleep. In PWS patients with EDS, compared to those PWS patients without EDS or the narcoleptic, obese, and normal groups, there were significant decreases in wakefulness and increases in percentage of sleep time (ST) and slow-wave sleep (SWS) both during daytime and nighttime testing. Also, in the adult PWS subjects (n = 6), in contrast to normal narcoleptic subjects, intensity of EDS was correlated with increased nocturnal percentage of ST and SWS, and % SWS was positively correlated with % ST (both during daytime and nighttime testing). Another principal finding...
We have previously characterized two surfactant protein A (SP-A) cDNAs termed 1A and 6A, as well as a 6A allelic variant termed 6A1. These sequences are quite heterogeneous at the 3' untranslated region (3'UT). Differences between 6A and 6A1 alleles include an 11-bp insertion/deletion 407 bases downstream from the start of the translation termination codon and a base pair polymorphism (C or G) in exon 1 (position 1,193; White, Damm, Miller, Spratt, Schilling, Hawgood, Benson, and Cordell. Nature Lond. 317: 361-363, 1985). The 11-bp (GCCCACTGCCT) segment is present in 6A1 and absent in 6A. The 6A/6A genotype, in a small number of specimens, showed a trend toward a higher frequency in the black Nigerian population compared with Caucasians. In this report, we examine the frequency of the 6A genotype in a larger number of samples from Caucasians and black Nigerians as well as the meiotic stability of the 3'UT heterogeneity. Slot-blot analysis and allele-specific oligonucleot...
Congenital cataracts constitute a morphologically and genetically heterogeneous group of diseases that are a major cause of childhood blindness. Different loci for hereditary congenital cataracts have been mapped to chromosomes 1, 2, 16, and 17q24. We report linkage of a gene causing a unique form of autosomal dominant zonular cataracts with Y-sutural opacities to chromosome 17q11-12 in a three-generation family exhibiting a maximum lod score of 3.9 at D17S805. Multipoint analysis gave a 1-lod confidence interval of 17 cM. This interval is bounded by the markers D17S799 and D17S798, a region that would encompass a number of candidate genes including that coding for beta A3/A1-crystallin.
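As background for the lod scores quoted above, a two-point lod score compares the likelihood of the observed meioses at a recombination fraction theta against free recombination (theta = 0.5). The sketch below covers only the simplest textbook case of fully informative, phase-known meioses; it is an assumption-laden illustration with a hypothetical function name, not the pedigree likelihood computation a linkage program would perform on the family in this report.

```python
import math


def two_point_lod(recombinants, meioses, theta):
    """lod(theta) = log10[ theta^r * (1 - theta)^(n - r) / 0.5^n ].

    Assumes fully informative, phase-known meioses (a simplification).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    likelihood = (theta ** recombinants) * ((1.0 - theta) ** non_recombinants)
    if likelihood == 0.0:
        return float("-inf")  # e.g. a recombinant observed at theta = 0
    return math.log10(likelihood) - meioses * math.log10(0.5)


if __name__ == "__main__":
    # Hypothetical counts: scan theta for 1 recombinant in 15 meioses.
    for theta in (0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
        print(f"theta={theta:.2f}  lod={two_point_lod(1, 15, theta):.2f}")
```

Under these simplifying assumptions each non-recombinant meiosis contributes log10(2), about 0.3, at theta = 0, so a lod near 3.9 corresponds to roughly thirteen such meioses. That is arithmetic on the formula, not a statement about the actual family data.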
We present Delila-Genome, a software system for identification, visualization and analysis of protein binding sites in complete genome sequences. Binding sites are predicted by scanning genomic sequences with information theory-based (or user-defined) weight matrices. Matrices are refined by adding experimentally-defined binding sites to published binding sites. Delila-Genome was used to examine the accuracy of individual information contents of binding sites detected with refined matrices as a measure of the strengths of the corresponding protein-nucleic acid interactions. The software can then be used to predict novel sites by rescanning the genome with the refined matrices. Parameters for genome scans are entered using a Java-based GUI and backend scripts in Perl. Multi-processor CPU load-sharing minimized the average response time for scans of different chromosomes. Scans of human genome assemblies required 4-6 hours for transcription factor binding sites and 10-19 hou...
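The scanning step described above can be sketched with a simplified information weight matrix, in which each base at each position is weighted by 2 + log2 of its frequency among the aligned binding sites, and the individual information (Ri, in bits) of a candidate site is the sum of the weights of its bases. This is a minimal sketch, not the Delila-Genome code: the function names are hypothetical, the small-sample correction is omitted, and the pseudocount used to avoid log2(0) is an assumption introduced here.

```python
import math

BASES = "ACGT"


def information_weight_matrix(sites, pseudocount=0.5):
    """Build per-position weights 2 + log2 f(b, l) from aligned sites.

    The pseudocount is an assumption of this sketch; it keeps unseen
    bases from producing log2(0).
    """
    width = len(sites[0])
    matrix = []
    for position in range(width):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[position]] += 1
        total = sum(counts.values())
        matrix.append({b: 2.0 + math.log2(counts[b] / total) for b in BASES})
    return matrix


def individual_information(matrix, site):
    """Ri of one candidate site: sum of the weights of its bases (bits)."""
    return sum(column[base] for column, base in zip(matrix, site))


def scan(matrix, sequence, threshold=0.0):
    """Slide the matrix along a sequence; report windows with Ri > threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        ri = individual_information(matrix, window)
        if ri > threshold:
            hits.append((start, window, ri))
    return hits


if __name__ == "__main__":
    # Toy aligned sites, invented for illustration.
    known_sites = ["TGACCT", "TGACCC", "TGAACT", "AGACCT"]
    matrix = information_weight_matrix(known_sites)
    for start, window, ri in scan(matrix, "GGCTGACCTTTAGACCTA"):
        print(start, window, f"{ri:.2f} bits")
```

The threshold of 0 bits follows the convention that sites with positive information are candidate binding sites; a production scan would also handle both strands and ambiguous bases.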
Extract: Numerical and structural chromosome abnormalities are a common cause of inherited and acquired diseases in humans. Cytogenetic detection of genomic imbalances and rearrangements is standard diagnostic practice, and is used both prognostically and for treatment stratification, especially for neoplastic disorders. Abnormalities are recognized by chromosome banding and by fluorescence in situ hybridization (FISH). FISH permits examination of specific DNA sequences within single or multiple chromosome bands on a metaphase cell or within an interphase cell. Locus-specific FISH probes have traditionally been composed of recombinant DNA segments that span large chromosomal targets of hundreds of thousands of base pairs, about an order of magnitude smaller than the length of a typical chromosomal band. These probes, which contain either repetitive sequences, single copy sequences or combinations of both, have been developed to hybridize to a wide spectrum of chromosomal targets -- ...
The identification of promoter regions that are regulated by a given transcription factor has traditionally relied upon the identification and distributions of binding sites recognized by the factor. In this study, we have developed a tandem machine learning approach for the identification of regulatory target genes based on these parameters and on the corresponding binding site information contents that measure the affinities of the factor for these cognate elements. This method has been validated using models of DNA binding sites recognized by the xenobiotic-sensitive nuclear receptor, PXR/RXRalpha, for target genes within the human genome. An information theory-based weight matrix was first derived and refined from known PXR/RXRalpha binding sites. The promoter region of candidate genes was scanned with the weight matrix. A novel information density-based clustering algorithm was then used to identify clusters of information rich sites. Finally, transformed data representing metr...
The purpose of this paper is to map the locus for a variant form of Oguchi disease in a Pakistani family and to identify the causative mutation. Family 61029 was ascertained in the Punjab province of Pakistan. It includes three 13- to 19-year-old patients with night blindness and 12 unaffected family members. A complete ophthalmological examination including fundus photography and electroretinography (ERG) was performed on each family member. A genome-wide scan was performed using microsatellite markers at about 10 cM intervals, and two-point lod scores were calculated. Polymerase chain reaction (PCR) cycle dideoxynucleotide sequencing was used to screen candidate genes inside the linked region for mutations and to delineate the deletion. Multiplex PCR and long template PCR were used to detect deletions and to define the size of deletions. Evaluation of fundus changes and ERG, lod score estimation, and identification of a mutation in the GRK1 gene were carried out. All patients had ...
Many dimeric protein complexes bind cooperatively to families of bipartite nucleic acid sequence elements, which consist of pairs of conserved half-site sequences separated by intervening distances that vary among individual sites. We introduce the Bipad Server, a web interface to predict sequence elements embedded within unaligned sequences. Either a bipartite model, consisting of a pair of one-block position weight matrices (PWMs) with a gap distribution, or a single PWM for contiguous single block motifs may be produced. The Bipad program performs multiple local alignment by entropy minimization and cyclic refinement using a stochastic greedy search strategy. The best models are refined by maximizing incremental information contents among a set of potential models with varying half site and gap lengths. The web service generates information positional weight matrices, identifies binding site motifs, graphically represents the set of discovered elements as a sequence l...
Visual Display of Sequence Conservation as an Aid to Taxonomic Classification Using PCR Amplification. Peter K. Rogan, Joseph J. Salvo, R. Michael Stephens and Thomas D. Schneider. By comparing corresponding ...
With the increasing use of Fluorescence In Situ Hybridization (FISH) probes as markers for certain genetic sequences, a proper image processing framework is becoming a necessity to accurately detect these probe signal locations in relation to the centerline of the chromosome. Although many image processing techniques have been developed for chromosomal analysis, they fail to provide reliable results in segmenting and extracting the centerline of chromosomes due to the high variability in shape of chromosomes on microscope slides. In this paper we propose a hybrid algorithm that utilizes Gradient Vector Flow active contours, Discrete Curve Evolution based skeleton pruning and morphological thinning to provide a robust and accurate centerline of the chromosome, which is then used for the measurement of the FISH probe signals. The ability to accurately detect FISH probe locations with respect to the centerline and other landmarks can provide cytogeneticists with detailed information that could lead to a faster diagnosis.
The interpretation of genomic variants has become one of the paramount challenges in the post-genome sequencing era. In this review we summarize nearly 20 years of research on the applications of information theory (IT) to interpret coding and non-coding mutations that alter mRNA splicing in rare and common diseases. We compile and summarize the spectrum of published variants analyzed by IT, to provide a broad perspective of the distribution of deleterious natural and cryptic splice site variants detected, as well as those affecting splicing regulatory sequences. Results for natural splice site mutations can be interrogated dynamically with Splicing Mutation Calculator, a companion software program that computes changes in information content for any splice site substitution, linked to corresponding publications containing these mutations. The accuracy of IT-based analysis was assessed in the context of experimentally validated mutations. Because splice site information quantifies b...
Bloom syndrome (BS) is an autosomal recessive disorder characterized by increases in the frequency of sister-chromatid exchange and in the incidence of malignancy. Chromosome-transfer studies have shown the BS locus to map to chromosome 15q. This report describes a subject with features of both BS and Prader-Willi syndrome (PWS). Molecular analysis showed maternal uniparental disomy for chromosome 15. Meiotic recombination between the two disomic chromosomes 15 has resulted in heterodisomy for proximal 15q and isodisomy for distal 15q. In this individual BS is probably due to homozygosity for a gene that is telomeric to D15S95 (15q25), rather than to genetic imprinting, the mechanism responsible for the development of PWS. This report represents the first application of disomy analysis to the regional localization of a disease gene. This strategy promises to be useful in the genetic mapping of other uncommon autosomal recessive conditions.
