In Situ Data Analytics for Next Generation Molecular Dynamics Workflows


About Analytics4MD

This project tackles the challenge of analyzing molecular dynamics simulation data on next-generation supercomputers by:
Creating new in situ methods to trace molecular events such as conformational changes, phase transitions, or binding events in molecular dynamics simulations at runtime by locally reducing knowledge on high-dimensional molecular organization into a set of relevant structural molecular properties
Integrating simulation and analytics into complex workflows for runtime detection of changes in structural and temporal molecular properties
Designing new data representations and extending unsupervised machine learning techniques to accurately and efficiently build an explicit global organization of structural and temporal molecular properties
Developing new curriculum material, online courses, and online training material targeting data analytics
The knowledge of molecular structure transformations that the project captures at runtime can be used to steer simulations toward more promising regions of the simulation space, identify the data that should be written to congested parallel file systems, and index generated data for retrieval and post-simulation analysis. Supported by this knowledge, molecular dynamics workflows such as replica exchange simulations, Markov state models, and the string method with swarms of trajectories can be executed from the outside (i.e., without reengineering the molecular dynamics code).
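As a rough illustration of the idea described above (reducing each frame to a structural property at runtime and using changes in that property to decide which data to write), here is a minimal Python sketch. It is not the project's method: the radius of gyration stands in for whatever structural properties the project actually uses, and the names radius_of_gyration, select_frames, and threshold, as well as the toy coordinates, are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Assumption: a frame is a list of (x, y, z) atom coordinates.
Frame = Sequence[Tuple[float, float, float]]


def radius_of_gyration(frame: Frame) -> float:
    """Reduce one frame to a single scalar (unweighted radius of gyration)."""
    n = len(frame)
    cx = sum(p[0] for p in frame) / n
    cy = sum(p[1] for p in frame) / n
    cz = sum(p[2] for p in frame) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in frame
    ) / n
    return math.sqrt(mean_sq)


def select_frames(frames: Sequence[Frame], threshold: float) -> List[int]:
    """Return indices of frames whose property differs from the last kept
    frame by more than `threshold` (hypothetical selection rule)."""
    kept: List[int] = []
    last_value = None
    for index, frame in enumerate(frames):
        value = radius_of_gyration(frame)
        if last_value is None or abs(value - last_value) > threshold:
            kept.append(index)
            last_value = value
    return kept


# Toy trajectory: a compact and an extended arrangement of four atoms.
compact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
extended = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (3.0, 3.0, 0.0)]
trajectory = [compact, compact, extended, extended, compact]

frames_to_write = select_frames(trajectory, threshold=0.5)
print(frames_to_write)  # [0, 2, 4]
```

In this sketch only the frames where the property shifts are kept, so the repeated frames never reach the file system; the list of kept indices could also serve as a simple index for later retrieval.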

Selected Publications

Rafael Ferreira da Silva, Scott Callaghan, Tu Mai Anh Do, George Papadimitriou, and Ewa Deelman. Measuring the Impact of Burst Buffers on Data-Intensive Scientific Workflows. Future Generation Computer Systems, 101, 208-220 (2019). [link]
Asghar M. Razavi, George Khelashvili, Harel Weinstein. A Markov State-based Quantitative Kinetic Model of Sodium Release from the Dopamine Transporter. Scientific Reports, 7 (2017). [link]
Rafael Ferreira da Silva, Rosa Filgueira, Lia Pietri, Ming Jiang, Rizos Sakellariou, Ewa Deelman. A Characterization of Workflow Management Systems for Extreme-scale Applications. Future Generation Computer Systems, 75, 228-238 (2017). [link]
Michel A. Cuendet, Harel Weinstein, Michael V. LeVine. The Allostery Landscape: Quantifying Thermodynamic Couplings in Biomolecular Systems. Journal of Chemical Theory and Computation, 12 (12), 5758-5767 (2016). [link]
Travis Johnston, Boyu Zhang, Adam Liwo, Silvia Crivelli, Michela Taufer. In situ data analytics and indexing of protein trajectories. Journal of Computational Chemistry, 38 (16), 1419-1430 (2017). [link]
Travis Johnston, Boyu Zhang, Adam Liwo, Silvia Crivelli, Michela Taufer. In-Situ Data Analysis of Protein Folding Trajectories. arXiv:1510.08789 (2015). [link]

Travis Johnston

Postdoctoral researcher at Oak Ridge National Laboratory.

Asghar Razavi

Postdoctoral Associate in Physiology and Biophysics at Weill Cornell Medical College of Cornell University.

Stephen Thomas

Postdoctoral Research Associate at the University of Tennessee, Knoxville.

Tu Mai Anh Do

Graduate research assistant at the University of Southern California.

Hector Alexis Carrillo Cabada

Graduate research assistant at the University of New Mexico.