Iman Rezaeian


    Prostate cancer is a leading cause of death worldwide and the third leading cause of cancer death in North American men. In prostate cancer, prostate cells lose normal control of growth and division. The Gleason classification system is one of the established systems used to grade the aggressiveness of prostate cancer progression.
    Various factors drive mutual funds' flow. Among them, factors related to the investment patterns and behaviors of investors are highly informative. It is very beneficial for a fund manager to know which investors are most likely to subscribe to or redeem from a particular fund in the near future. In addition, extracting the important underlying factors involved in this process helps fund managers plan how to optimize their fund's performance. Our experiments on historical transaction data of about 400 mutual funds show that we can extract the most informative patterns and use them to predict mutual funds' flow with relatively high accuracy. In addition, the proposed investor ranking method gives a curated list for running more effective targeted campaigns.
    Breast cancer is a complex disease that has been characterized into ten different molecular subtypes. Current computational methods for determining the subtypes are based on identifying gene biomarkers; i.e., differentially expressed genes that best separate the subtypes. Such methods do not take into account the functional relationships between genes, and hence may not yield informative biomarkers. We propose a machine learning framework for identifying network biomarkers of breast cancer subtypes; i.e., subnetworks of functionally related gene biomarkers that best distinguish the subtypes. Our framework incorporates genomics, transcriptomics and interactomics information in identifying discriminative network biomarkers corresponding to each subtype. We applied our method to the METABRIC data and obtained a collection of highly predictive network biomarkers with AUC performances ranging from 89.6% to 99.1%.
    'De novo' drug discovery is costly, slow, and high-risk. Repurposing known drugs for treatment of other diseases offers a fast, low-cost, low-risk and highly efficient route toward the development of efficacious treatments. The emergence of large-scale heterogeneous biomolecular networks, molecular, chemical and bioactivity data, and genomic and phenotypic data of pharmacological compounds is enabling the development of a new area of drug repurposing called 'in silico' drug repurposing, i.e., computational drug repurposing (CDR). The aim of CDR is to discover new indications for an existing drug (drug-centric) or to identify effective drugs for a disease (disease-centric). Both drug-centric and disease-centric approaches share the challenge of assessing either the similarity or the connections between drugs and diseases. However, traditional CDR is fraught with many challenges due to the underlying complex pharmacology and biology of diseases, genes, and drugs, as well as the complexity of their associations. As such, capturing highly non-linear associations among drugs, genes, and diseases has been challenging for most existing CDR methods. We propose a network-based integration approach that can best capture the knowledge (and complex relationships) contained within and between drug, gene and disease data. A network-based machine learning approach is then applied to the extracted knowledge and relationships in order to identify single approved or experimental drugs, and pairs of such drugs, with potential therapeutic effects on different breast cancer subtypes.
    Prostate cancer is one of the most common types of cancer among Canadian men. Next-generation sequencing using RNA-Seq provides large amounts of data that may reveal novel and informative biomarkers. We introduce a method that uses machine learning techniques to identify transcripts that correlate with prostate cancer development and progression. We have isolated transcripts that have the potential to serve as prognostic indicators and may have tremendous value in guiding treatment decisions. Analysis of normal versus malignant prostate cancer data sets indicates differential expression of the genes HEATR5B, DDC, and GABPB1-AS1 as potential prostate cancer biomarkers. Our study also supports PTGFR, NREP, SCARNA22, DOCK9, FLVCR2, IK2F3, USP13, and CLASP1 as potential biomarkers to predict prostate cancer progression, especially between stage II and subsequent stages of the disease.
    Breast cancer is a complex disease that can be classified into at least 10 different molecular subtypes. Appropriate diagnosis of specific subtypes is critical for ensuring the best possible patient treatment and response to therapy. Current computational methods for determining the subtypes are based on identifying differentially expressed genes (i.e., biomarkers) that can best discriminate the subtypes. Such approaches, however, are known to be unreliable since they yield different biomarker sets when applied to data sets from different studies. Gathering knowledge about the functional relationship among genes will identify "network biomarkers" that will enrich the criteria for biomarker selection. Cancer network biomarkers are subnetworks of functionally related genes that "work in concert" to perform functions associated with a tumorigenic. We propose a machine learning framework that can be used to identify network biomarkers and driver genes for each specif...
    Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone therapy (HT) and, in some cases, chemotherapy (CT) agents. This machine learning method, which distinguishes sensitivity vs. resistance in breast cancer cell lines and validates predictions in patients, was also used to derive gene signatures of other HT (tamoxifen) and CT agents (methotrexate, epirubicin, doxorubicin, and 5-fluorouracil) used in METABRIC. Paclitaxel gene signatures exhibited the best performance; however, the other agents also predicted survival with acceptable accuracies. A support vector machine (SVM) model of paclitaxel response containing the ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1...
    In cancer, alternative RNA splicing represents one mechanism for flexible gene regulation, whereby protein isoforms can be created to promote cell growth, division and survival. Detecting novel splice junctions in the cancer transcriptome may reveal pathways driving tumorigenic events. In this regard, RNA-Seq, a high-throughput sequencing technology, has expanded the study of cancer transcriptomics in the areas of gene expression, chimeric events and alternative splicing in search of novel biomarkers for the disease. In this study, we propose a new two-dimensional peak finding method for detecting differential splice junctions in prostate cancer using RNA-Seq data. We have designed an integrative process that uses this peak finding algorithm to combine junctions and then remove irrelevant introns across different samples within a population. We have also designed a scoring mechanism to select the most common junctions. Our computational analysis on three independ...
    Genome-wide profiling of DNA-binding proteins using ChIP-Seq has emerged as an alternative to ChIP-chip methods. Due to the large amounts of data produced by next-generation sequencing, ChIP-Seq offers many advantages, such as much higher resolution, less noise and greater coverage than its predecessor, the ChIP-chip array. Multi-level thresholding algorithms have been applied to many problems in image and signal processing. These algorithms have been used for transcriptomics and genomics data analysis, such as sub-grid and spot detection in DNA microarrays, and also for detecting significant regions based on next-generation sequencing data. We show that our Optimal Multilevel Thresholding algorithm (OMT) has higher accuracy in detecting enriched regions (peaks) in comparison with previously proposed peak finders by testing three algorithms on the well-known FoxA1 data set and also for four transcription factors (with a total of six antibodies) for Drosophila melanogaster. Using a small number of parameters is another advantage of the proposed method.
    One of the main issues in the analysis of microarray data is the quantification of gene expression. The quantified signal intensities should be linearly related to the expression levels of the corresponding genes. In this paper, we present a biological assessment for detection and segmentation of grids and spots, and quantification of gene expression in cDNA microarray images. The results on several dilution steps on cDNA microarray images show that the proposed method, which is based on a parameterless multilevel thresholding algorithm, can detect the location of the spots very effectively even under noisy conditions. The proposed method can also segment and quantify the intensity of each probe with a nearly perfect degree of accuracy. This guarantees that the proposed method estimates the correct intensity of each spot with a high degree of accuracy and relates it to the expression levels of the corresponding genes very well.
    Finding genomic features in ChIP-Seq data has become an attractive research topic lately, because the power, resolution and low noise of next-generation sequencing make it a much better alternative to traditional microarrays such as ChIP-chip and other related methods. However, handling ChIP-Seq data is not straightforward, mainly because of the large amounts of data produced by next-generation sequencing. ChIP-Seq has been widely used across a range of applications in finding biomarkers, especially those associated with important genomic features in epigenomics and transcriptomics, including binding sites, promoters, exons/introns, and transcription sites, among others. Efficient algorithms for finding relevant regions in ChIP-Seq data have been proposed, which capture the most significant peaks from the sequence reads. Among these, multilevel thresholding algorithms have been applied successfully for transcriptomics and genomics data analysis, in particular for detecting significant regions based on next-generation sequencing data. We show that the Optimal Multilevel Thresholding algorithm (OMT) achieves higher accuracy in detecting enriched regions and genomic features of detected regions on FoxA1 data. OMT finds more gene-related regions (gene, exon, promoter) in comparison with other methods. Using a small number of parameters is another advantage of the proposed method.
    Gridding cDNA microarray images is a critical step in gene expression analysis, since any errors in this stage are propagated in future steps in the analysis. We propose a fully automatic approach to detect the locations of the spots. The approach first detects and corrects rotations in the sub-grids by an affine transformation, followed by a polynomial-time optimal multi-level thresholding algorithm that finds the positions of the spots. Additionally, a new validity index is proposed in order to find the correct number of spots in each sub-grid, followed by a refinement procedure used to improve the performance of the method. Extensive experiments on real-life microarray images show that the proposed method performs these tasks automatically and with very high accuracy.
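The polynomial-time multi-level thresholding step mentioned in this abstract can be illustrated with a small dynamic-programming sketch. This is not the authors' implementation: the function name `optimal_multilevel_thresholds`, the choice of a within-class variance criterion, and the toy intensity profile are all assumptions made for illustration. Given a 1D profile (for example, pixel intensities summed along one image axis), the sketch finds the split into `k` contiguous classes that minimises the total within-class weighted squared deviation, in O(k·n²) time.

```python
from typing import List, Sequence


def optimal_multilevel_thresholds(hist: Sequence[float], k: int) -> List[int]:
    """Split bins 0..n-1 into k contiguous, non-empty classes so that the
    total within-class weighted squared deviation is minimal.

    Returns the k-1 start indices of classes 2..k (hypothetical helper,
    assuming a within-class variance criterion).
    """
    n = len(hist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(hist)")

    # Prefix sums of w, w*x and w*x^2 give each class cost in O(1).
    p0 = [0.0] * (n + 1)
    p1 = [0.0] * (n + 1)
    p2 = [0.0] * (n + 1)
    for i, w in enumerate(hist):
        p0[i + 1] = p0[i] + w
        p1[i + 1] = p1[i] + w * i
        p2[i + 1] = p2[i] + w * i * i

    def cost(a: int, b: int) -> float:
        """Weighted squared deviation of bins a..b-1 from their weighted mean."""
        w = p0[b] - p0[a]
        if w == 0:
            return 0.0
        s = p1[b] - p1[a]
        return (p2[b] - p2[a]) - s * s / w

    inf = float("inf")
    # best[c][j]: minimal cost of splitting bins 0..j-1 into c classes.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = best[c - 1][i] + cost(i, j)
                if cand < best[c][j]:
                    best[c][j] = cand
                    back[c][j] = i

    # Follow the back-pointers from the last class to the second one.
    thresholds: List[int] = []
    j = n
    for c in range(k, 1, -1):
        j = back[c][j]
        thresholds.append(j)
    return thresholds[::-1]


# Toy profile with three bright "spots" separated by empty background.
profile = [5, 5, 0, 0, 0, 7, 7, 0, 0, 0, 3, 3]
print(optimal_multilevel_thresholds(profile, 3))
```

On the toy profile, the two returned boundaries fall inside the empty gaps between the three bright runs. Choosing the number of classes `k` is a separate problem, which the abstract addresses with its validity index; that index is not sketched here.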
    The analysis of DNA microarray images is a crucial step in gene expression analysis, since any errors in early stages are propagated in future steps in the analysis. When processing the underlying images, accurately separating the sub-grids and spots is of extreme importance for subsequent steps that include segmentation, quantification, normalization and clustering. We propose a fully automatic approach that first detects the sub-grids given the entire microarray image, and then detects the locations of the spots in each sub-grid. The approach first detects and corrects rotations in the images by an affine transformation, followed by a polynomial-time optimal multi-level thresholding algorithm to find the positions of the sub-grids and spots. Additionally, a new validity index is proposed in order to find the correct number of sub-grids in the microarray image, and the correct number of spots in each sub-grid. Extensive experiments on real-life microarray images show that the metho...