Background: Identification of novel drug–target interactions (DTIs) is important for drug discovery. Experimental determination of such DTIs is costly and time consuming, hence it necessitates the development of efficient computational methods for the accurate prediction of potential DTIs. To date, many computational methods have been proposed for this purpose, but they suffer the drawback of a high rate of false positive predictions. Results: Here, we developed a novel computational DTI prediction method, DASPfind. DASPfind uses simple paths of particular lengths inferred from a graph that describes DTIs, similarities between drugs, and similarities between the protein targets of drugs. We show that on average, over the four gold standard DTI datasets, DASPfind significantly outperforms other existing methods when the single top-ranked predictions are considered, resulting in 46.17 % of these predictions being correct, and it achieves 49.22 % correct single top-ranked predictions when the set of all DTIs for a single drug is tested. Furthermore, we demonstrate that our method is best suited for predicting DTIs in cases of drugs with no known targets or with few known targets. We also show the practical use of DASPfind by generating novel predictions for the Ion Channel dataset and validating them manually. Conclusions: DASPfind is a computational method for finding reliable new interactions between drugs and proteins. We show over six different DTI datasets that DASPfind outperforms other state-of-the-art methods when the single top-ranked predictions are considered, or when a drug with no known targets or with few known targets is considered. We illustrate the usefulness and practicality of DASPfind by predicting novel DTIs for the Ion Channel dataset. The validated predictions suggest that DASPfind can be used as an efficient method to identify correct DTIs, thus reducing the cost of necessary experimental verifications in the process of drug discovery. DASPfind can be accessed online at: http://www.cbrc.kaust.edu.sa/daspfind.
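The idea of scoring a drug–target pair through simple paths in a graph of DTIs and similarities can be illustrated with a minimal sketch. This is not the DASPfind implementation: the scoring rule (summing the product of edge weights over all simple paths up to a maximum length), the function names `build_graph` and `path_score`, and the toy similarity values are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(drug_sim, target_sim, interactions):
    """Build an undirected weighted graph from similarities and known DTIs.

    Hypothetical input layout: similarity dicts map (node_a, node_b) to a
    weight in [0, 1]; known interactions get weight 1.0.
    """
    graph = defaultdict(dict)
    for (a, b), w in list(drug_sim.items()) + list(target_sim.items()):
        graph[a][b] = w
        graph[b][a] = w
    for drug, target in interactions:
        graph[drug][target] = 1.0
        graph[target][drug] = 1.0
    return graph


def path_score(graph, source, goal, max_len=3):
    """Sum the edge-weight products of all simple paths source -> goal
    that use at most max_len edges (an assumed, simplified scoring rule)."""
    total = 0.0
    # Each stack entry: (current node, visited nodes, weight product, edges used)
    stack = [(source, {source}, 1.0, 0)]
    while stack:
        node, visited, weight, length = stack.pop()
        if node == goal:
            if length > 0:
                total += weight
            continue
        if length == max_len:
            continue
        for nxt, w in graph.get(node, {}).items():
            if nxt not in visited:  # simple path: never revisit a node
                stack.append((nxt, visited | {nxt}, weight * w, length + 1))
    return total


# Toy data (made up): drug d1 has no known target, d2 is known to bind t1.
graph = build_graph(
    drug_sim={("d1", "d2"): 0.8},
    target_sim={("t1", "t2"): 0.5},
    interactions=[("d2", "t1")],
)
ranking = sorted(["t1", "t2"], key=lambda t: path_score(graph, "d1", t), reverse=True)
print(ranking)
```

In this toy graph, d1 reaches t1 only through its similar drug d2, which mirrors the abstract's setting of a drug with no known targets; candidate targets are then ranked by score.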
High-throughput screening (HTS) experiments provide a valuable resource that reports biological activity of numerous chemical compounds relative to their molecular targets. Building computational models that accurately predict such activity status (active vs. inactive) in specific assays is a challenging task given the large volume of data and frequently small proportion of active compounds relative to the inactive ones. We developed a method, DRAMOTE, to predict activity status of chemical compounds in HTP activity assays. For a class of HTP assays, our method achieves considerably better results than the current state-of-the-art solutions. We achieved this by modification of a minority oversampling technique. To demonstrate that DRAMOTE performs better than the other methods, we performed a comprehensive comparison analysis with several other methods and evaluated them on data from 11 PubChem assays through 1,350 experiments that involved approximately 500,000 interactions between chemicals and their target proteins. As an example of potential use, we applied DRAMOTE to develop robust models for predicting FDA-approved drugs that have a high probability of interacting with the thyroid stimulating hormone receptor (TSHR) in humans. Our findings are further partially and indirectly supported by 3D docking results and literature information. The results based on approximately 500,000 interactions suggest that DRAMOTE has performed the best and that it can be used for developing robust virtual screening models. The datasets and implementation of all solutions are available as a MATLAB toolbox online at www.cbrc.kaust.edu.sa/dramote and can be found on Figshare.
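For readers unfamiliar with minority oversampling, the following sketch shows the generic interpolation idea (in the style of SMOTE) that such techniques build on. It does not reproduce DRAMOTE's modification, which the abstract does not describe; the function name `oversample_minority`, its parameters, and the toy points are assumptions for illustration only.

```python
import math
import random


def oversample_minority(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority point and one of its k nearest minority
    neighbours (generic SMOTE-style step, not DRAMOTE itself)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Nearest minority neighbours of the chosen point, excluding itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Toy feature vectors for the rare "active" class (made-up values).
actives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = oversample_minority(actives, n_new=5)
print(len(new_points))
```

Each synthetic point lies on a segment between two existing minority points, so the active class grows without exact duplication of the original compounds.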
The aim of this paper is to propose an application of mutual information-based ensemble methods to the analysis and classification of heart beats associated with different types of arrhythmia. Models of multilayer perceptrons, support vector machines, and radial basis function neural networks were trained and tested using the MIT-BIH arrhythmia database. This research brings a focus to an ensemble method that, to our knowledge, is a novel application in the area of ECG arrhythmia detection. The proposed classifier ensemble method showed improved performance, relative to either majority voting classifier integration or to individual classifier performance. The overall ensemble accuracy was 98.25%.
Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. A smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation simultaneously examines and evaluates a large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of the GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
Papers by Othman Soufan