Molecules in Action, LLC

Molecules in Action, LLC provides consulting services and custom software development for biomedical research. The company uses and custom-tailors multi-scale approaches to molecular modeling, simulations, and molecular engineering.

Medusa

Medusa is a computational platform for molecular modeling. It includes a physics-based force field that evaluates the energetics of given conformations of molecules and molecular complexes, as well as rapid Monte Carlo-based algorithms for searching conformational space. Benchmark studies of native sequence recapitulation from protein backbones and of the prediction of protein stability changes (Eris) highlight the accuracy of the Medusa force field. We have extended the Medusa force field to model small-molecule ligands (MedusaScore), which allows us to predict the binding energy of ligand-receptor complexes. In terms of sampling, Medusa allows for the rapid search of protein sequence space and side-chain conformational space, which enables us to study protein evolution and perform protein design. Medusa can also rapidly sample ligand conformational space, which made it possible for us to develop a flexible ligand-receptor docking algorithm, MedusaDock, that simultaneously samples the conformational flexibility of ligand and receptor.
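The Monte Carlo search described above can be illustrated with a generic Metropolis sketch. This is not Medusa's implementation: the toy energy function, the single-variable move set, and every name used here (`toy_energy`, `metropolis_search`, `kT`) are assumptions chosen only to show the accept/reject logic.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical stand-in for a force field: minimum when every angle is 60."""
    return sum((a - 60.0) ** 2 for a in angles) / 100.0


def metropolis_search(energy, start, n_steps=5000, step=10.0, kT=0.5, seed=0):
    """Generic Metropolis Monte Carlo over a list of torsion-like variables."""
    rng = random.Random(seed)
    current = list(start)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for _ in range(n_steps):
        # Propose a move: perturb one randomly chosen variable.
        trial = list(current)
        i = rng.randrange(len(trial))
        trial[i] += rng.uniform(-step, step)
        e_trial = energy(trial)
        delta = e_trial - e_current
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


start = [180.0, -60.0, 0.0]
best, e_best = metropolis_search(toy_energy, start)
print(f"start energy {toy_energy(start):.1f}, best energy found {e_best:.3f}")
```

Occasionally accepting uphill moves is what lets such a search escape local minima; in a real application the energy function would be the force field and the moves would be rotamer or sequence changes.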

Eris

Estimating protein thermostability and structural changes upon mutation is of great importance to molecular biologists. We have therefore developed a computational tool, Eris, for accurately predicting mutation-induced protein stability changes [1]. Because of the complex nature of the interactions involved in protein folding, existing stability prediction methods often use empirical parameters trained on experimental protein stability data. Moreover, because these methods have limited capability to model the structural changes induced by mutations, their application is often restricted to mutations from large residues to small ones. We address these deficiencies with a unique approach that combines a physical force field with a fast conformation-sampling algorithm in an atomic framework of proteins. We show that Eris can effectively detect and resolve the atomic clashes and structural strain introduced by mutation and yield reliable predictions of the stability change for these mutants. We tested Eris on 595 mutants and found a significant correlation between the predicted and experimental stability changes. Eris is accessible through the Dokholyan laboratory server (Eris) and as a standalone software package.

πDMD

Discrete molecular dynamics (DMD) is a special type of molecular dynamics (MD) algorithm that uses stepwise potential functions to approximate the continuous interaction potentials of traditional MD. This simplification reduces time-driven dynamics to event-driven motion, which has been highly optimized to increase computational efficiency. As a result, DMD features increased sampling efficiency over traditional MD. In combination with simplified protein models, DMD simulation is orders of magnitude faster than traditional MD simulation. Over the years, the Dokholyan laboratory has developed a series of models of proteins, nucleotides, and lipids for DMD simulations. The models include various levels of coarse-graining as well as atomic resolution. The Dokholyan laboratory has successfully applied DMD to the study of various biological problems, including protein folding dynamics, protein misfolding and aggregation, ensemble reconstruction using experimental constraints, protein design, self-assembly of lipids, and RNA folding. Molecules in Action has developed a parallel, highly efficient version of the DMD software, πDMD.
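To make the idea of event-driven motion concrete, the sketch below computes when two particles moving at constant velocity reach a step in the potential (here a hard-core distance `d`) and then picks the earliest such event. It is a textbook illustration, not πDMD code; the function names and the all-pairs scan are assumptions, and a production engine would use cell lists and an event queue instead.

```python
import math


def collision_time(r1, v1, r2, v2, d):
    """Time until two particles reach separation d, or None if they never do.

    Between events particles move ballistically, so the separation obeys
    |r + v*t| = d with r = r2 - r1 and v = v2 - v1 (a quadratic in t).
    """
    r = [b - a for a, b in zip(r1, r2)]
    v = [b - a for a, b in zip(v1, v2)]
    b = sum(x * y for x, y in zip(r, v))   # r . v
    vv = sum(x * x for x in v)             # |v|^2
    rr = sum(x * x for x in r)             # |r|^2
    if vv == 0 or b >= 0:
        return None                        # not approaching each other
    disc = b * b - vv * (rr - d * d)
    if disc < 0:
        return None                        # closest approach stays beyond d
    return (-b - math.sqrt(disc)) / vv


def next_event(particles, d):
    """Return (time, i, j) for the earliest pairwise event, or None."""
    best = None
    for i in range(len(particles)):
        for j in range(i + 1, len(particles)):
            t = collision_time(*particles[i], *particles[j], d)
            if t is not None and (best is None or t < best[0]):
                best = (t, i, j)
    return best


# Each particle is a (position, velocity) pair in arbitrary reduced units.
particles = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((5.0, 0.0, 0.0), (-1.0, 0.0, 0.0)),
    ((0.0, 10.0, 0.0), (0.0, 0.0, 0.0)),
]
event = next_event(particles, d=1.0)
print(event)
```

Because nothing changes between events, the simulation can jump straight from one event time to the next instead of integrating forces over many small time steps.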

MedusaScore

The application of virtual screening is still limited by the lack of accurate scoring functions, as recent benchmark studies have shown. To address this problem, the Dokholyan laboratory has developed a novel scoring function, MedusaScore, for the fast and accurate evaluation of protein-ligand binding. MedusaScore is a direct extension of the Medusa force field, which has demonstrated superior performance in modeling protein stability. Using publicly available benchmark datasets, we find that MedusaScore can recognize native-like docking poses and predict binding affinity with high fidelity. Its overall accuracy is superior to that of other widely used scoring functions that have been tested on the same dataset, including AutoDock, ChemScore, DrugScore, D-Score (DOCK), F-Score (FlexX), G-Score (GOLD), HINT, LigScore, LUDI, PLP, PMF, and X-Score. In contrast to most other scoring functions, MedusaScore was developed without the use of any protein-ligand complex structures for parameter training, which preserves the transferability of the scoring function to a wide range of targets and ligands. Both the Dokholyan laboratory and Molecules in Action have been using MedusaScore for virtual screening in structure-based drug design.

MedusaDock

A principal source of inaccuracies in molecular docking is the inability of current approaches to capture both protein and ligand dynamics during docking. A number of attempts have been made to circumvent this weakness by performing "ensemble" docking of multiple (sampled) protein conformations and multiple ligand conformations, but such approaches still do not address the synergism of protein-ligand reconfigurations upon docking. The Dokholyan laboratory developed a methodology, MedusaDock, that performs fully flexible docking of both the ligand and the protein target, and validated this approach on a number of targets. MedusaDock has also been used by Molecules in Action for virtual drug screening for clients.

Surface matching through fingerprints

Matching of protein surfaces is important for protein function annotation, protein-protein interaction prediction, and protein-protein interface design. However, traditional surface comparison methods are too computationally expensive to be applied to a dataset containing a large number of protein structures, such as the Protein Data Bank (PDB). The Dokholyan laboratory has developed a novel approach to matching protein surfaces that uses geometric invariant fingerprints. Borrowed from the computer vision field, these fingerprints accurately describe the 3D features of an object, so that similarity between fingerprints reflects the similarity between the corresponding objects. Using fingerprints, the comparison of 3D objects can be achieved at high speed. We introduce a novel neighbor-averaging protocol, which not only significantly improves the accuracy of the fingerprint-based method, but also suggests a tentative 3D alignment to allow further explicit alignment of the objects without undue computational cost. Using our approach, we successfully screened the entire PDB for local surface similarities between proteins and protein inhibitors that are identified as binders to the pocket of a common enzyme. The identified inhibitors belong to unrelated fold families and could not be detected using traditional sequence or fold comparison methods.
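The reason fingerprint comparison is fast can be shown with a deliberately simple invariant descriptor. The sketch below is not the laboratory's fingerprint: a histogram of distances from the centroid is swapped in as an easy-to-read example of a rotation- and translation-invariant vector, and all function names and parameters (`radial_fingerprint`, `n_bins`, `r_max`) are assumptions.

```python
import math


def radial_fingerprint(points, n_bins=8, r_max=10.0):
    """Normalized histogram of distances from the centroid.

    The histogram does not change when the object is rotated or translated,
    so two objects can be compared without first aligning them.
    """
    n = len(points)
    centroid = tuple(sum(p[k] for p in points) / n for k in range(3))
    hist = [0] * n_bins
    for p in points:
        r = math.dist(p, centroid)
        hist[min(int(r / r_max * n_bins), n_bins - 1)] += 1
    return [h / n for h in hist]


def fingerprint_distance(f, g):
    """Euclidean distance between two fingerprints (smaller means more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))


def transform(points, angle, shift):
    """Rotate about the z axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in points]


query = [(1, 0, 0), (-1, 0, 0), (0, 3, 0), (0, -3, 0), (0, 0, 5.5), (0, 0, -5.5)]
library = {
    "rotated_copy": transform(query, 0.7, (3.0, -2.0, 1.0)),
    "compact_shape": [(2, 0, 0), (-2, 0, 0), (0, 2, 0), (0, -2, 0), (0, 0, 2), (0, 0, -2)],
}

fq = radial_fingerprint(query)
scores = {name: fingerprint_distance(fq, radial_fingerprint(pts))
          for name, pts in library.items()}
ranked = sorted(scores, key=scores.get)
print(ranked)
```

Each library entry is reduced to a short vector once, so screening a large database costs one vector comparison per entry rather than one 3D superposition per entry.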

CryoEM fitting

The Dokholyan laboratory has developed an algorithm that can rapidly screen hundreds of thousands of structures and identify those that best fit a given cryoEM density. The method is based on geometric invariant fingerprints constructed using 3D Zernike functions. With those fingerprints, the comparison of electron densities to structures is reduced to a comparison of fingerprints, which is extremely fast. To demonstrate the feasibility of this method, the Dokholyan laboratory used experimental cryoEM densities of GroEL and rhodopsin to screen the entire Protein Data Bank, and successfully identified other GroEL and rhodopsin structures as the top hits.

Structural filters and high-resolution structure refinement

The Dokholyan laboratory has developed a set of filters to assess the quality of a structural model or a low-resolution structure. Three important qualities of a protein structure are assessed: (i) extent of steric clashes, (ii) extent of buried voids, and (iii) percentage of hydrogen bond donors/acceptors that are buried but do not form hydrogen bonds. Distributions of each of these measures in high-resolution crystal structures (0-2.5 Å) allow comparison of a given model to structures of natural proteins. The measure of a given structure with respect to each filter is compared to the high-resolution distribution to obtain a P-value, which reflects the quality of the structure. The Dokholyan laboratory offers online access to these filters via Gaia. Furthermore, the Dokholyan laboratory has developed a method, Chiron, that allows high-resolution structural refinement.
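One common way to turn a comparison against a reference distribution into a P-value is an empirical tail fraction, sketched below. How Gaia actually computes its P-values is not stated here, so the one-sided test, the add-one correction, and the reference numbers are all assumptions made for illustration.

```python
def empirical_p_value(value, reference):
    """One-sided empirical P-value.

    Fraction of reference structures whose measure is at least as large
    (that is, at least as bad) as `value`. The +1 terms are an assumed
    convention that keeps the estimate from ever being exactly zero.
    """
    n_at_least = sum(1 for x in reference if x >= value)
    return (n_at_least + 1) / (len(reference) + 1)


# Hypothetical clash measures for a handful of high-resolution structures.
reference_clash_scores = [0.010, 0.020, 0.020, 0.030, 0.050]

p_typical = empirical_p_value(0.020, reference_clash_scores)
p_outlier = empirical_p_value(0.200, reference_clash_scores)
print(p_typical, p_outlier)
```

A small P-value means that few high-resolution structures look as bad as the model on that measure, which flags the model for refinement.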

Loop modeling and grafting

Loop modeling is a crucial step in building structural models of proteins, and loop grafting is important in protein design. Molecules in Action has developed a software suite, MiA Suite, that builds loop structures using DMD simulations and grafts them into a host protein.

Homology modeling

If a protein sequence of unknown structure is at least 30% similar to the sequence of any experimentally determined structure, one can use homology modeling to predict its structure. The main steps in homology modeling are: (i) obtaining a sequence alignment between query and template, (ii) changing the sequence of the template to reflect that of the query, and (iii) processing the insertions and deletions in the template to obtain a final structure. Once an alignment of query and template has been obtained using other tools (for example, ClustalW or PSI-BLAST), one may perform the remaining steps with high efficiency using Medusa and DMD. Step (ii) can be performed using the Medusa suite. Step (iii), which involves breaking peptide bonds, annealing new ones, and folding the insertions in conjunction with the rest of the protein, can be performed by DMD. DMD is highly efficient at loop modeling and at satisfying peptide-bond constraints in order to bring distant pieces of protein structure together after a deletion.
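The bookkeeping behind steps (i) to (iii) can be sketched from a gapped alignment alone. The code below is an illustration, not part of Medusa or DMD: it uses percent identity over aligned columns as a simple proxy for the 30% threshold, and the function names and the toy alignment are assumptions.

```python
def percent_identity(aligned_query, aligned_template):
    """Percent of gap-free alignment columns with identical residues."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(q, t) for q, t in zip(aligned_query, aligned_template)
             if q != "-" and t != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(q == t for q, t in pairs) / len(pairs)


def classify_columns(aligned_query, aligned_template):
    """Split alignment columns into the edits needed on the template.

    substitutions: (template_pos, old, new)  -> step (ii), residue replacement
    insertions:    (after_template_pos, res) -> step (iii), loop to be built
    deletions:     (template_pos, res)       -> step (iii), gap to be closed
    Template positions are 1-based.
    """
    substitutions, insertions, deletions = [], [], []
    t_pos = 0
    for q, t in zip(aligned_query, aligned_template):
        if t != "-":
            t_pos += 1
        if q == "-" and t == "-":
            continue
        if q == "-":
            deletions.append((t_pos, t))
        elif t == "-":
            insertions.append((t_pos, q))
        elif q != t:
            substitutions.append((t_pos, t, q))
    return substitutions, insertions, deletions


# Toy alignment; '-' marks a gap.
query_aln = "MKT-AYLG"
template_aln = "MRTQA-LG"

identity = percent_identity(query_aln, template_aln)
subs, ins, dels = classify_columns(query_aln, template_aln)
print(f"{identity:.1f}% identity", subs, ins, dels)
```

In this toy case the template would need one side-chain replacement, one residue removed with the chain re-closed, and one residue inserted and folded into place.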

Protein structure refinement

Homology modeling, a viable alternative for protein structure prediction, introduces artifacts into the final structural model, either because of inaccuracies in the force field used for model building or because of the model-building protocol. Steric clashing is one such common structural artifact, characterized by the unphysical overlap of any two atoms in a protein structure. Refinement of structural models to resolve steric clashes is critical for making further predictions using the generated model. Although programs exist for identifying clashes in protein structures based on a predetermined distance cutoff, tools for the efficient resolution of such artifacts are sparse. The Dokholyan laboratory has developed a DMD-based protocol, Chiron, that efficiently relaxes the protein backbone in order to remove steric clashes from protein structures. Based on statistics obtained from a large dataset of high-resolution crystal structures, the Dokholyan laboratory derived a metric that determines the quality of a model in terms of steric clashes, so that clashes can be resolved if required. Using this protocol, one can minimize steric clashes in protein structures and homology models with an overall backbone deviation of less than 1 Å from the initial structure.
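The distance-cutoff style of clash identification mentioned above can be sketched in a few lines. This is not Chiron's clash metric: the van der Waals radii are commonly quoted values used only for illustration, and the `tolerance` value, the `excluded` set for bonded pairs, and the function name are assumptions.

```python
import math
from itertools import combinations

# Approximate van der Waals radii in angstroms (illustrative values only).
VDW_RADIUS = {"C": 1.70, "N": 1.55, "O": 1.52}


def find_clashes(atoms, tolerance=0.4, excluded=frozenset()):
    """Return (i, j, overlap) for atom pairs that overlap by more than `tolerance`.

    atoms:    list of (element, (x, y, z)) tuples
    excluded: set of (i, j) index pairs with i < j to skip, e.g. bonded atoms
    """
    clashes = []
    for (i, (el_i, pos_i)), (j, (el_j, pos_j)) in combinations(enumerate(atoms), 2):
        if (i, j) in excluded:
            continue
        overlap = VDW_RADIUS[el_i] + VDW_RADIUS[el_j] - math.dist(pos_i, pos_j)
        if overlap > tolerance:
            clashes.append((i, j, overlap))
    return clashes


atoms = [
    ("C", (0.0, 0.0, 0.0)),
    ("O", (2.0, 0.0, 0.0)),   # too close to atom 0 for a non-bonded pair
    ("N", (6.0, 0.0, 0.0)),
]
clashes = find_clashes(atoms)
print(clashes)
```

Detection of this kind only locates the problem; resolving it still requires moving atoms, which is the part a relaxation protocol addresses.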

Peptide-protein binding motif prediction

Protein-peptide interactions form the basis of many signaling pathways in a cell, mediating an array of functions, from basic cellular processes like phosphorylation to specialized processes like epitope recognition. Knowledge of specific protein-peptide interactions helps in the identification of natural binding partners and specific members of cellular signaling pathways. The Dokholyan laboratory has developed a semi-automated protocol to rapidly screen all possible peptide sequences that fit a protein-peptide complex and select those that energetically favor the complex scaffold. Using this protocol, the Dokholyan laboratory screened the peptide combinations for a chaperone-peptide complex and identified a consensus motif that, when present in a protein, can make it a substrate for the chaperone. The binding motif was then validated experimentally and shown to be a sufficient condition for substrate recognition by the chaperone. This protocol can be applied to any protein-peptide complex in order to identify the peptide combinations best suited for binding to the target protein.

Protein design

Accurate prediction of stability change upon mutation (Eris/Medusa) can be used in the rational design of mutations for different purposes. For a protein of known structure, one can screen for mutations to either stabilize or destabilize the protein as required. Similarly, one may attempt to rescue mutants that are experimentally characterized as unstable by rationally selecting further mutations that restore stability. Furthermore, a discrepancy between the thermodynamic and evolutionary preferences of various amino acids at different positions in a protein (evaluated using Eris) points to functional sites in the protein structure. The fast repacking and selection of optimal residue side chains afforded by Eris can be used as an intermediate step in protein design. Cycles of backbone conformational sampling by πDMD and sequence selection by Eris can be used as a viable protocol for the design of proteins with a specific structure, stability, and/or function.

Protein folding: from structure prediction to dynamic characterization

πDMD uses the Medusa force field in all-atom protein simulations. Using replica exchange DMD simulations, the Dokholyan laboratory demonstrated ab initio folding of six small proteins, which highlights the accuracy of the force field as well as the sampling efficiency. Although ab initio folding of large proteins is too time-consuming, we can incorporate secondary structure constraints into the simulations to model the rearrangement and packing and thereby predict the fully folded structure.
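Replica exchange runs several copies of the system at different temperatures and periodically attempts to swap neighboring replicas. The sketch below shows the standard Metropolis swap criterion in reduced units; it is a generic illustration rather than πDMD code, and the function name and the choice `k_b = 1` are assumptions.

```python
import math


def swap_probability(e_i, t_i, e_j, t_j, k_b=1.0):
    """Standard replica-exchange acceptance probability.

    p = min(1, exp((beta_i - beta_j) * (E_i - E_j))), with beta = 1 / (k_b * T).
    Energies and temperatures are in reduced units (k_b = 1 is assumed).
    """
    beta_i = 1.0 / (k_b * t_i)
    beta_j = 1.0 / (k_b * t_j)
    exponent = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


# Cold replica (T = 1) already holds the lower energy: swap is unlikely.
p_unfavorable = swap_probability(e_i=-10.0, t_i=1.0, e_j=-5.0, t_j=2.0)

# Cold replica holds the higher energy: swap is always accepted.
p_favorable = swap_probability(e_i=-5.0, t_i=1.0, e_j=-10.0, t_j=2.0)

print(p_unfavorable, p_favorable)
```

High-temperature replicas cross energy barriers easily, and accepted swaps pass the low-energy conformations they find down to the cold replicas, which is the source of the improved sampling.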

Another important application of all-atom DMD simulation is the characterization of protein dynamics in the folded state. Simulated conformational dynamics can be used to identify functionally important motions. For example, the near-native dynamics of the SOD1 monomer are associated with a propensity for misfolding and aggregation. Dynamic coupling analysis of simulations can help to identify remotely coupled regions, which can then be used to understand allostery-related functions and to engineer novel allosteric proteins.

RNA folding and structure prediction

The Dokholyan laboratory has developed a coarse-grained RNA model for DMD simulations, which can successfully fold small RNAs (<50 nt) ab initio. To fold large RNA molecules with complex tertiary structures, we incorporate experimentally derived structural information into RNA modeling. The Dokholyan laboratory has developed an automated RNA structure refinement method that uses base-pair and distance constraints. RNA base pairs can often be accurately derived from RNA secondary structure predictions that use biochemical probing data, such as SHAPE chemistry. Distance constraints can be inferred from a variety of biochemical and bioinformatic techniques, including site-directed hydroxyl radical probing, fluorescence resonance energy transfer, cross-linking, and sequence covariation. This methodology has been applied to tRNA^Asp and benchmarked on four different RNAs ranging from 49 nt to 158 nt. In all cases, the RNA model structure can be refined to a statistically significant native-like structure.

Virtual drug screening

If the structure of a protein target is known, one can computationally screen millions of compounds to identify molecules that bind to the target. The Molecules in Action approach combines chemoinformatics-based methods with MedusaDock, an innovative fully flexible docking algorithm, to predict the correct binding poses of the molecules, from which the binding affinities are estimated using MedusaScore. Only the top-ranked molecules are tested with an experimental assay. Our method yields chemically diverse molecules among the top hits, which is useful for the identification of novel drug molecules. In addition, the false positive rate is low among the top hits, which allows the discovery of true binders with fewer experimental assays. This virtual screening scheme has been validated in several benchmark studies and experimental tests.

Dynamics in drug screening

For some difficult drug screening targets, all available virtual screening scoring functions fail to identify the native binding pose of a drug in the pocket. This failure of traditional dock-and-score virtual screening methods is due to dynamic interactions between the target and ligand that are not captured in a single static structure. Therefore, the Dokholyan laboratory has developed a methodology combining MedusaDock and MedusaScore with πDMD simulations. By using traditional methods as a filter and performing DMD simulations of the most promising ligand poses, it is possible to identify the native binding pose by its dynamic behavior in the pocket.
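One simple way to compare candidate poses by their dynamic behavior is to ask how far the ligand drifts from its starting pose during a short simulation. The sketch below assumes that lower drift indicates a more native-like pose; the text does not specify which dynamic property is actually used, so this criterion, the function names, and the toy trajectories are all assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate sets.

    No superposition is applied: the ligand is measured in the receptor frame.
    """
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def pose_drift(trajectory):
    """Mean RMSD of every later frame to the first frame (the docked pose)."""
    reference = trajectory[0]
    later = trajectory[1:]
    return sum(rmsd(reference, frame) for frame in later) / len(later)


def most_stable_pose(trajectories):
    """Name of the pose whose ligand drifts least (assumed selection rule)."""
    return min(trajectories, key=lambda name: pose_drift(trajectories[name]))


# Toy two-atom ligand, three frames per pose.
trajectories = {
    "pose_A": [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    ],
    "pose_B": [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
        [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    ],
}

drifts = {name: pose_drift(traj) for name, traj in trajectories.items()}
selected = most_stable_pose(trajectories)
print(drifts, selected)
```

In practice the frames would come from simulations of the top-scoring docked poses, so the expensive dynamics step is applied only to candidates that pass the docking and scoring filter.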