DOI: 10.1145/3468267.3470578

Stream-AI-MD: streaming AI-driven adaptive molecular simulations for heterogeneous computing platforms

Published: 26 August 2021

Abstract

Emerging hardware tailored for artificial intelligence (AI) and machine learning (ML) methods provides novel means to couple these methods with traditional high performance computing (HPC) workflows involving molecular dynamics (MD) simulations. We propose Stream-AI-MD, a novel instance of applying deep learning methods to drive adaptive MD simulation campaigns in a streaming manner. We leverage the ability to run ensemble MD simulations on GPU clusters, while the data from the atomistic MD simulations are streamed continuously to AI/ML approaches running on a wafer-scale AI accelerator, which guide the conformational search in a biophysically meaningful manner. We demonstrate the efficacy of Stream-AI-MD simulations for two scientific use-cases: (1) folding a small prototypical protein, namely the ββα-fold (BBA) protein FSD-EY, and (2) understanding the protein-protein interaction (PPI) between two proteins of the SARS-CoV-2 proteome, nsp16 and nsp10. We show that Stream-AI-MD simulations can improve time-to-solution by ~50x for BBA protein folding. We also discuss the performance trade-offs involved in implementing AI-coupled HPC workflows on heterogeneous computing architectures.
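
The streaming coupling described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `md_segment` is a 2-D random walk standing in for an MD engine, `StreamingNovelty` is a running-mean distance score standing in for the deep learning model, and `run_campaign` interleaves the two in one process, whereas the paper runs simulation and learning on separate hardware.

```python
import random


def md_segment(start, n_steps, rng, step=0.1):
    """Hypothetical stand-in for one MD segment: a 2-D random walk.

    Yields frames one at a time so the consumer can process them as a stream.
    """
    x, y = start
    for _ in range(n_steps):
        x += rng.gauss(0.0, step)
        y += rng.gauss(0.0, step)
        yield (x, y)


class StreamingNovelty:
    """Hypothetical stand-in for the AI/ML component.

    Keeps a running mean of all frames seen so far and scores a frame by its
    Euclidean distance to that mean (larger means more novel).
    """

    def __init__(self):
        self.n = 0
        self.mean = [0.0, 0.0]

    def update(self, frame):
        # Incremental mean update: no frame history needs to be stored.
        self.n += 1
        for i in range(2):
            self.mean[i] += (frame[i] - self.mean[i]) / self.n

    def score(self, frame):
        dx = frame[0] - self.mean[0]
        dy = frame[1] - self.mean[1]
        return (dx * dx + dy * dy) ** 0.5


def run_campaign(n_walkers=4, n_rounds=5, n_steps=50, seed=0):
    """Run a toy adaptive campaign and return the model and restart points."""
    rng = random.Random(seed)
    model = StreamingNovelty()
    starts = [(0.0, 0.0)] * n_walkers
    history = []
    for _ in range(n_rounds):
        frames = []
        for start in starts:
            for frame in md_segment(start, n_steps, rng):
                model.update(frame)  # each frame is consumed as it is produced
                frames.append(frame)
        # Restart the next round from the frames the model finds most novel.
        frames.sort(key=model.score, reverse=True)
        starts = frames[:n_walkers]
        history.append(starts)
    return model, history


if __name__ == "__main__":
    model, history = run_campaign()
    print("frames streamed:", model.n)
    print("final restart points:", history[-1])
```

The design choice worth noting is that the model is updated per frame rather than per completed trajectory, which is what lets the selection step act on partial simulation data.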



Published In

PASC '21: Proceedings of the Platform for Advanced Scientific Computing Conference
July 2021
186 pages
ISBN:9781450385633
DOI:10.1145/3468267

Sponsors

In-Cooperation

  • CSCS: Swiss National Supercomputing Centre

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. accelerators
  2. adaptive simulations
  3. deep learning
  4. molecular biophysics
  5. protein-protein interactions
  6. streaming data analytics

Qualifiers

  • Research-article

Conference

PASC '21

Acceptance Rates

PASC '21 Paper Acceptance Rate 17 of 33 submissions, 52%;
Overall Acceptance Rate 109 of 221 submissions, 49%
