Abstract
Identifying the mechanism of action (MoA) of an unknown, possibly novel, substance (chemical, protein, or pathogen) is a significant challenge. Biologists typically spend years working out the MoA for known compounds. MoA determination is especially challenging if there is no prior knowledge and if there is an urgent need to understand the mechanism for rapid treatment and/or prevention of global health emergencies. In this paper, we describe a data analysis approach using Gaussian processes and machine learning techniques to infer components of the MoA of an unknown agent from time series transcriptomics, proteomics, and metabolomics data.
The work was performed as part of the DARPA Rapid Threat Assessment program, where the challenge was to identify the MoA of a potential threat agent in 30 days or less, using only project-generated data, with no recourse to pre-existing databases or published literature.
Sponsored by the US Army Research Office and the Defense Advanced Research Projects Agency; accomplished under Cooperative Agreement W911NF-14-2-0020.
Notes
1. We use the machine learning framework [19] for PCA and basic clustering.
2. Alternatively, PCA can be applied along the gene/compound dimension, which we have done in another part of our RTA workflow.
3. Another affinity measure we have explored is based on correlated changes (using time series derivatives), but it is beyond the scope of this paper.
4. We explored some other metrics based on the original (non-normalized) time series data, which we omit for brevity.
5. Currently, we restrict the analysis of transcriptomics data to protein-coding genes.
6. Disclaimer. Research was sponsored by the U.S. Army Research Office and the Defense Advanced Research Projects Agency and was accomplished under Cooperative Agreement Number W911NF-14-2-0020. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office, DARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.
References
Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, pp. 265–283. USENIX Association (2016)
Barabasi, A.L., Gulbahce, N., Loscalzo, J.: Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 12(1), 56–68 (2011)
Barabasi, A.L., Oltvai, Z.N.: Network biology: understanding the cell’s functional organization. Nat. Rev. Genet. 5(2), 101–113 (2004)
Cajka, T., Fiehn, O.: Toward merging untargeted and targeted methods in mass spectrometry-based metabolomics and lipidomics. Anal. Chem. 88(1), 524–545 (2016)
Dettmer, K., Aronov, P.A., Hammock, B.D.: Mass spectrometry-based metabolomics. Mass Spectrom. Rev. 26(1), 51–78 (2007)
de Matthews, G., et al.: GPflow: a Gaussian process library using TensorFlow. J. Mach. Learn. Res. 18, 40:1–40:6 (2017)
Girault, C., Valk, R.: Petri Nets for Systems Engineering: A Guide to Modeling, Verification, and Applications. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-662-05324-9
Goodwin, S., McPherson, J.D., McCombie, W.R.: Coming of age: ten years of next-generation sequencing technologies. Nat. Rev. Genet. 17(6), 333–351 (2016)
Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Jeong, H., Tombor, B., Albert, R., Oltvai, Z.N., Barabasi, A.L.: The large-scale organization of metabolic networks. Nature 407(6804), 651–654 (2000)
Kim, M.S., et al.: A draft map of the human proteome. Nature 509(7502), 575–581 (2014)
Kluyver, T., et al.: Jupyter notebooks - a publishing format for reproducible computational workflows. In: Loizides, F., Schmidt, B. (eds.) Positioning and Power in Academic Publishing: Players, Agents and Agendas, pp. 87–90. IOS Press (2016)
Kramer, A., Green, J., Pollard, J., Tugendreich, S.: Causal analysis approaches in ingenuity pathway analysis. Bioinformatics 30(4), 523–530 (2014)
Mann, M., Kulak, N.A., Nagaraj, N., Cox, J.: The coming age of complete, accurate, and ubiquitous proteomes. Mol. Cell 49(4), 583–590 (2013)
Masci, J., Meier, U., Cireşan, D., Schmidhuber, J.: Stacked convolutional auto-encoders for hierarchical feature extraction. In: Honkela, T., Duch, W., Girolami, M., Kaski, S. (eds.) ICANN 2011 Part I. LNCS, vol. 6791, pp. 52–59. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21735-7_7
McInnes, L., Healy, J., Astels, S.: HDBSCAN: hierarchical density based clustering. J. Open Sour. Softw. 2(11) (2017)
Mi, H., et al.: PANTHER version 11: expanded annotation data from gene ontology and reactome pathways, and data analysis tool enhancements. Nucleic Acids Res. 45(D1), D183–D189 (2017)
Noh, H., Shoemaker, J.E., Gunawan, R.: Network perturbation analysis of gene transcriptional profiles reveals protein targets and mechanism of action of drugs and influenza A viral infection. Nucleic Acids Res. 46(6), e34 (2018)
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Pujol, A., Mosca, R., Farres, J., Aloy, P.: Unveiling the role of network and systems biology in drug discovery. Trends Pharmacol. Sci. 31(3), 115–123 (2010)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press, Cambridge (2005)
Tautenhahn, R., et al.: An accelerated workflow for untargeted metabolomics using the METLIN database. Nat. Biotechnol. 30(9), 826–828 (2012)
Uhlen, M., et al.: Tissue-based map of the human proteome. Science 347(6220), 4 (2015)
Vertes, A., et al.: Time-dependent metabolomics in systems biology context for mechanism of action studies. In: US HUPO Conference - Proteomics: From Genes to Function, San Diego, CA (2017)
Vertes, A., et al.: Mechanism of action identification of threat agents within 30 days. In: Society of Toxicology 57th Annual Meeting, San Antonio, TX (2018)
Vertes, A., et al.: Novel high-throughput metabolomic techniques and mainstream tools for the discovery of drug mechanism of action. In: US HUPO 14th Annual Conference - Technology Accelerating Discovery, Minneapolis, MN (2018)
Vertes, A., et al.: Systems biology approach for mechanism of action identification in 30 days. In: ASMS Conference, San Diego, CA (2018)
Wang, Z., Gerstein, M., Snyder, M.: RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10(1), 57–63 (2009)
Woo, J.H., et al.: Elucidating compound mechanism of action by network perturbation analysis. Cell 162(2), 441–451 (2015)
Xu, W.H., et al.: Human transcriptome array for high-throughput clinical studies. Proc. Natl. Acad. Sci. U.S.A. 108(9), 3707–3712 (2011)
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Vertes, A. et al. (2018). Inferring Mechanism of Action of an Unknown Compound from Time Series Omics Data. In: Češka, M., Šafránek, D. (eds) Computational Methods in Systems Biology. CMSB 2018. Lecture Notes in Computer Science, vol 11095. Springer, Cham. https://doi.org/10.1007/978-3-319-99429-1_14
DOI: https://doi.org/10.1007/978-3-319-99429-1_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99428-4
Online ISBN: 978-3-319-99429-1
eBook Packages: Computer Science, Computer Science (R0)