In Situ Data Analytics for Next Generation Molecular Dynamics Workflows


About Analytics4MD

This project tackles the challenge of analyzing molecular dynamics simulation data on next-generation supercomputers by:
Creating new in situ methods to trace molecular events such as conformational changes, phase transitions, or binding events in molecular dynamics simulations at runtime by locally reducing knowledge on high-dimensional molecular organization into a set of relevant structural molecular properties
Integrating simulation and analytics into complex workflows for runtime detection of changes in structural and temporal molecular properties
Designing new data representations and extending unsupervised machine learning techniques to accurately and efficiently build an explicit global organization of structural and temporal molecular properties
Developing new curriculum material, online courses, and online training material targeting data analytics
The knowledge of molecular structure transformations captured at runtime can be used to steer simulations toward more promising regions of the simulation space, identify the data that should be written to congested parallel file systems, and index generated data for retrieval and post-simulation analysis. Supported by this knowledge, molecular dynamics workflows such as replica exchange simulations, Markov state models, and the string method with swarms of trajectories can be executed from the outside (i.e., without reengineering the molecular dynamics code).
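As a rough illustration of the idea of reducing each frame to a structural property and detecting changes at runtime, the following minimal Python sketch may help. It is not the project's method: the radius of gyration stands in for whatever structural properties the project actually computes, and the names `radius_of_gyration`, `detect_changes`, and the `threshold` parameter are hypothetical.

```python
import math
from typing import Iterable, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def radius_of_gyration(frame: Sequence[Coord]) -> float:
    """Reduce one frame (a list of atom coordinates) to a single scalar.

    Assumption: equal atom masses, so the centroid is the plain average.
    """
    n = len(frame)
    cx = sum(p[0] for p in frame) / n
    cy = sum(p[1] for p in frame) / n
    cz = sum(p[2] for p in frame) / n
    sq = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in frame)
    return math.sqrt(sq / n)


def detect_changes(frames: Iterable[Sequence[Coord]], threshold: float) -> List[int]:
    """Return indices of frames whose property moved away from the reference.

    The first frame sets the reference value. A frame is flagged when its
    property differs from the reference by more than `threshold`; the
    reference is then reset to the flagged frame's value.
    """
    flagged: List[int] = []
    reference = None
    for i, frame in enumerate(frames):
        value = radius_of_gyration(frame)
        if reference is None:
            reference = value
            continue
        if abs(value - reference) > threshold:
            flagged.append(i)
            reference = value
    return flagged


# Toy trajectory: a compact two-atom frame and an extended one.
compact = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
extended = [(3.0, 0.0, 0.0), (-3.0, 0.0, 0.0)]
trajectory = [compact, compact, extended, extended, compact]
changed_frames = detect_changes(trajectory, threshold=0.5)
```

Because `detect_changes` consumes frames one at a time, it could in principle run alongside a simulation that yields frames as they are produced, and only the flagged frames would need to be written to the file system or indexed.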

Selected Publications

Hector Carrillo-Cabada, Jeremy Benson, Asghar Razavi, Brianna Mulligan, Michel A. Cuendet, Harel Weinstein, Michela Taufer, and Trilce Estrada. A Graphic Encoding Method for Quantitative Classification of Protein Structure and Representation of Conformational Changes. IEEE/ACM Transactions on Computational Biology and Bioinformatics (IEEE/ACM TCBB) (2020). [link]
Tu Mai Anh Do, Loïc Pottier, Stephen Thomas, Rafael Ferreira da Silva, Michel A. Cuendet, Harel Weinstein, Trilce Estrada, Michela Taufer, and Ewa Deelman. A Novel Metric to Evaluate In Situ Workflows. In Proceedings of the International Conference on Computational Science (ICCS), pp. 1-14 (2020). [link]
Michela Taufer, Trilce Estrada, and Travis Johnston. A Survey of Algorithms for Transforming Molecular Dynamics Data into Metadata for In Situ Analytics based on Machine Learning Methods. Philosophical Transactions of the Royal Society A, 378(2166):1-11 (2020). [link]
Michela Taufer, Stephen Thomas, Michael Wyatt, Tu Mai Anh Do, Loïc Pottier, Rafael Ferreira da Silva, Harel Weinstein, Michel A. Cuendet, Trilce Estrada, and Ewa Deelman. Characterization of In Situ and In Transit Analytics of Molecular Dynamics Simulations for Next-generation Supercomputers. In Proceedings of the IEEE eScience Conference, pp. 1-12 (2019). [link]
Trilce Estrada, Jeremy Benson, Hector Carrillo-Cabada, Asghar M. Razavi, Michel A. Cuendet, Harel Weinstein, Ewa Deelman, and Michela Taufer. Graphic Encoding of Proteins for Efficient High-Throughput Analysis. In Proceedings of the 9th ACM Conference on Bioinformatics, BCB, pp. 315-324. Washington, DC, USA. August 29 - September 1 (2018). [link]
Rafael Ferreira da Silva, Scott Callaghan, Tu Mai Anh Do, George Papadimitriou, and Ewa Deelman. Measuring the Impact of Burst Buffers on Data-Intensive Scientific Workflows. Future Generation Computer Systems, 101, 208-220 (2019). [link]
Asghar M. Razavi, George Khelashvili, and Harel Weinstein. A Markov State-based Quantitative Kinetic Model of Sodium Release from the Dopamine Transporter. Scientific Reports, 7 (2017). [link]
Rafael Ferreira da Silva, Rosa Filgueira, Lia Pietri, Ming Jiang, Rizos Sakellariou, and Ewa Deelman. A Characterization of Workflow Management Systems for Extreme-scale Applications. Future Generation Computer Systems, 75, 228-238 (2017). [link]
Travis Johnston, Boyu Zhang, Adam Liwo, Silvia Crivelli, and Michela Taufer. In Situ Data Analytics and Indexing of Protein Trajectories. Journal of Computational Chemistry, 38 (16), 1419-1430 (2017). [link]
Michel A. Cuendet, Harel Weinstein, and Michael V. LeVine. The Allostery Landscape: Quantifying Thermodynamic Couplings in Biomolecular Systems. Journal of Chemical Theory and Computation, 12 (12), 5758-5767 (2016). [link]
Travis Johnston, Boyu Zhang, Adam Liwo, Silvia Crivelli, and Michela Taufer. In-Situ Data Analysis of Protein Folding Trajectories. arXiv:1510.08789 (2015). [link]

Travis Johnston

Research Scientist for AI in HPC

Asghar Razavi

Postdoctoral Associate in Physiology and Biophysics at Weill Cornell Medical College of Cornell University

Loïc Pottier

Computer Scientist at the University of Southern California

Silvina Caino-Lores

Postdoctoral Research Associate, University of Tennessee, Knoxville

Ekaterina Kots

Postdoctoral Associate, Weill Cornell Medical School

Tu Mai Anh Do

Graduate Research Assistant at the University of Southern California

Michael Wyatt

Research Assistant, University of Tennessee, Knoxville

Hector Alexis Carrillo Cabada

Graduate Research Assistant at the University of New Mexico

Ian Lumsden

Graduate Research Assistant, University of Tennessee, Knoxville

Stephen Thomas

Postdoctoral Research Associate, University of Tennessee, Knoxville