In Situ Data Analytics for Next Generation Molecular Dynamics Workflows

About Analytics4MD

This project tackles the challenge of analyzing molecular dynamics simulation data on next-generation supercomputers by:
Creating new in situ methods to trace molecular events such as conformational changes, phase transitions, or binding events in molecular dynamics simulations at runtime by locally reducing knowledge of high-dimensional molecular organization to a set of relevant structural molecular properties
Integrating simulation and analytics into complex workflows for runtime detection of changes in structural and temporal molecular properties
Designing new data representations and extending unsupervised machine learning techniques to accurately and efficiently build an explicit global organization of structural and temporal molecular properties
Developing new curriculum material, online courses, and online training material targeting data analytics
The knowledge of molecular structure transformations that the project captures at runtime can be used to steer simulations toward more promising regions of the simulation space, identify which data should be written to congested parallel file systems, and index generated data for retrieval and post-simulation analysis. Supported by this knowledge, molecular dynamics workflows such as replica exchange simulations, Markov state models, and the string method with swarms of trajectories can be executed from the outside (i.e., without reengineering the molecular dynamics code).
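
As a rough illustration of the idea described above (not the A4MD API), the sketch below reduces each simulation frame to one structural property and flags frames where that property departs from its recent running mean. The function names, the choice of radius of gyration as the property, the window size, the threshold, and the synthetic trajectory are all assumptions made for this example; a real workflow would receive frames from a running molecular dynamics code.

```python
import math
from collections import deque


def radius_of_gyration(frame):
    """Reduce one frame (a list of (x, y, z) positions) to a single scalar."""
    n = len(frame)
    cx = sum(p[0] for p in frame) / n
    cy = sum(p[1] for p in frame) / n
    cz = sum(p[2] for p in frame) / n
    sq = sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2 for p in frame)
    return math.sqrt(sq / n)


def detect_events(frames, window=3, threshold=0.5):
    """Return indices of frames whose property deviates from the running mean.

    Frames are consumed one at a time, as they would be at runtime, so only
    the last `window` property values are kept in memory.
    """
    history = deque(maxlen=window)
    flagged = []
    for i, frame in enumerate(frames):
        rg = radius_of_gyration(frame)
        if len(history) == window and abs(rg - sum(history) / window) > threshold:
            flagged.append(i)   # candidate frame to write out or steer from
            history.clear()     # restart the baseline after an event
        history.append(rg)
    return flagged


def synthetic_trajectory(n_frames=12, switch_at=6):
    """Hypothetical stand-in for a simulation: a cube of 8 atoms that doubles
    in size at frame `switch_at`, mimicking a conformational change."""
    corners = [(x, y, z) for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)]
    frames = []
    for t in range(n_frames):
        scale = 1.0 if t < switch_at else 2.0
        frames.append([(scale * x, scale * y, scale * z) for x, y, z in corners])
    return frames


if __name__ == "__main__":
    print(detect_events(synthetic_trajectory()))
```

In this toy setting only the flagged frames would need to be written to the file system or used as a steering decision point. The simulation code itself is untouched, which mirrors the "from the outside" approach described above.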

Get the A4MD Software

Download the A4MD software here: GitHub Repository

Selected Publications

Harshita Sahni, Hector Carrillo-Cabada, Ekaterina Kots, Silvina Caino-Lores, Jack Marquez, Ewa Deelman, Michel Cuendet, Harel Weinstein, Michela Taufer, and Trilce Estrada. Online Boosted Gaussian Learners for in-situ Detection and Characterization of Protein Folding States in Molecular Dynamics Simulations. In Proceedings of the 19th IEEE International Conference on e-Science (eScience) (2023).
Silvina Caino-Lores, Michel A. Cuendet, Jack Marquez, Trilce Estrada, Ewa Deelman, Harel Weinstein, and Michela Taufer. Runtime Steering of Molecular Dynamics Simulations Through In Situ Analysis and Annotation of Collective Variables. In Proceedings of the Platform for Advanced Scientific Computing (PASC) Conference (2023).
Tu Mai Anh Do, Loïc Pottier, Rafael Ferreira da Silva, Silvina Caíno-Lores, Michela Taufer, and Ewa Deelman. Performance Assessment of Ensembles of In Situ Workflows under Resource Constraints. Journal of Concurrency and Computation: Practice and Experience (CCPE) (2023).
Silvina Caino-Lores, Michel Cuendet, Trilce Estrada, Ewa Deelman, Harel Weinstein, and Michela Taufer. High-Throughput In-Situ Workflows for Ensemble Molecular Dynamics. In Proceedings of the 18th IEEE International Conference on e-Science (eScience) (2022).
Tu Mai Anh Do, Loïc Pottier, Rafael Ferreira da Silva, Frederic Suter, Silvina Caíno-Lores, Michela Taufer, and Ewa Deelman. Co-Scheduling Ensembles of In Situ Workflows. In Proceedings of the 17th Workshop on Workflows in Support of Large-Scale Science (WORKS) (2022).
Michela Taufer, Ewa Deelman, Rafael Ferreira da Silva, Trilce Estrada, Mary Hall, and Miron Livny. A Roadmap to Robust Science for High-throughput Applications: The Developers’ Perspective. In Proceedings of the IEEE Cluster Conference (CLUSTER) (2021).
Michela Taufer, Ewa Deelman, Rafael Ferreira da Silva, Trilce Estrada, and Mary Hall. A Roadmap to Robust Science for High-throughput Applications: The Scientists’ Perspective. In Proceedings of the 20th IEEE International Conference on eScience (2021).
Tu Mai Anh Do, Loïc Pottier, Silvina Caíno-Lores, Rafael Ferreira da Silva, Michel A. Cuendet, Harel Weinstein, Trilce Estrada, Michela Taufer, and Ewa Deelman. A Lightweight Method for Evaluating in situ Workflow Efficiency. J. Comput. Sci., Elsevier (2021).
Hector Carrillo-Cabada, Jeremy Benson, Asghar Razavi, Brianna Mulligan, Michel A. Cuendet, Harel Weinstein, Michela Taufer, and Trilce Estrada. A Graphic Encoding Method for Quantitative Classification of Protein Structure and Representation of Conformational Changes. IEEE/ACM Transactions on Computational Biology and Bioinformatics (IEEE/ACM TCBB) (2020).
Tu Mai Anh Do, Loïc Pottier, Stephen Thomas, Rafael Ferreira da Silva, Michel A. Cuendet, Harel Weinstein, Trilce Estrada, Michela Taufer, and Ewa Deelman. A Novel Metric to Evaluate In Situ Workflows. In Proceedings of the International Conference on Computational Science (ICCS), pp. 1-14 (2020).
Michela Taufer, Trilce Estrada, and Travis Johnston. A Survey of Algorithms for Transforming Molecular Dynamics Data into Metadata for In Situ Analytics based on Machine Learning Methods. Issue of Philosophical Transactions A, 378(2166):1-11 (2020).
Michela Taufer, Stephen Thomas, Michael Wyatt, Tu Mai Anh Do, Loïc Pottier, Rafael Ferreira da Silva, Harel Weinstein, Michel A. Cuendet, Trilce Estrada, and Ewa Deelman. Characterization of In Situ and In Transit Analytics of Molecular Dynamics Simulations for Next-generation Supercomputers. In Proceedings of the IEEE eScience Conference, pp. 1-12 (2019).
Trilce Estrada, Jeremy Benson, Hector Carrillo-Cabada, Asghar M. Razavi, Michel A. Cuendet, Harel Weinstein, Ewa Deelman, and Michela Taufer. Graphic Encoding of Proteins for Efficient High-Throughput Analysis. In Proceedings of the 9th ACM Conference on Bioinformatics, BCB, pp. 315-324. Washington, DC, USA. August 29 - September 1 (2018).
Rafael Ferreira da Silva, Scott Callaghan, Tu Mai Anh Do, George Papadimitriou, and Ewa Deelman. Measuring the Impact of Burst Buffers on Data-Intensive Scientific Workflows. Future Generation Computer Systems, 101, 208-220 (2019).
Asghar M. Razavi, George Khelashvili, and Harel Weinstein. A Markov State-based Quantitative Kinetic Model of Sodium Release from the Dopamine Transporter. Scientific Reports, 7 (2017).
Rafael Ferreira da Silva, Rosa Filgueira, Lia Pietri, Ming Jiang, Rizos Sakellariou, and Ewa Deelman. A Characterization of Workflow Management Systems for Extreme-scale Applications. Future Generation Computer Systems, 75, 228-238 (2017).
Travis Johnston, Boyu Zhang, Adam Liwo, Silvia Crivelli, and Michela Taufer. In situ data analytics and indexing of protein trajectories. Journal of Computational Chemistry, 38 (16), 1419-1430 (2017).
Michel A. Cuendet, Harel Weinstein, and Michael V. LeVine. The Allostery Landscape: Quantifying Thermodynamic Couplings in Biomolecular Systems. Journal of Chemical Theory and Computation, 12 (12), 5758-5767 (2016).
Travis Johnston, Boyu Zhang, Adam Liwo, Silvia Crivelli, and Michela Taufer. In-Situ Data Analysis of Protein Folding Trajectories. arXiv:1510.08789 (2015).

Travis Johnston

Research Scientist for AI in HPC

Asghar Razavi

Postdoctoral Associate in Physiology and Biophysics at Weill Cornell Medical College of Cornell University

Loïc Pottier

Computer Scientist at the University of Southern California

Silvina Caino-Lores

Postdoctoral Research Associate, University of Tennessee, Knoxville

Ekaterina Kots

Postdoctoral Associate, Weill Cornell Medical School

Tu Mai Anh Do

Graduate Research Assistant at the University of Southern California

Michael Wyatt

Research Assistant, University of Tennessee, Knoxville

Hector Alexis Carrillo Cabada

Graduate Research Assistant at the University of New Mexico

Ian Lumsden

Graduate Research Assistant, University of Tennessee, Knoxville

Jack Marquez

Research Assistant Professor at the University of Tennessee, Knoxville

Stephen Thomas

Postdoctoral Research Associate at the University of Tennessee, Knoxville

Harshita Sahni

Graduate Student at the University of New Mexico