Document Type

Technical Report

Publication Date


Technical Report Number



The long development process of novel pharmaceutical compounds begins with the identification of a lead inhibitor compound. Computational screening to identify the ligands, or small molecules, most likely to inhibit a target protein may benefit pharmaceutical development by reducing the time required to identify a lead compound. Typically, computational ligand screening uses high-resolution structural models of both the protein and the ligand to fit, or 'dock', each member of a ligand database into the binding site of the protein. Ligands are then ranked by the number and quality of interactions formed in the predicted protein-ligand complex. It is currently believed that proteins in solution do not assume a single rigid conformation but instead move through a small region of conformation space. Docking ligands against a static snapshot of the protein structure therefore has limited predictive power, because it ignores the inherent flexibility of the protein. The challenge has been to develop docking algorithms that model protein flexibility while remaining computationally feasible. In this paper, we present our initial development of a molecular ensemble-based algorithm that models protein flexibility for protein-ligand binding prediction. First, a molecular ensemble is generated from molecular structures that satisfy experimentally measured NMR constraints. Second, traditional protein-ligand docking is performed on each member of the protein's molecular ensemble. This step generates a list of ligands predicted to bind to each individual member of the ensemble. Finally, the lists of top predicted binders are consolidated to identify the ligands predicted to bind multiple members of the protein's molecular ensemble. We applied our algorithm to identify inhibitors of Core Binding Factor (CBF) among a subset of approximately 70,000 ligands from the Available Chemicals Directory.
Our 26 top-predicted binding ligands are currently being tested experimentally in the wet lab by both NMR binding experiments (15N-edited Heteronuclear Single-Quantum Coherence (HSQC)) and Electrophoretic Gel Mobility Shift Assays (EMSA). Preliminary results indicate that, of the approximately 26 ligands tested, three induce perturbations in the protein's NMR chemical shifts indicative of ligand binding, and one ligand (2-amino-5-cyano-4-tert-butyl thiazole) produces a band pattern in the EMSA that indicates disruption of CBF dimerization.
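The final consolidation step described above can be illustrated with a minimal sketch. It is not the implementation used in this report: the function name `consolidate_top_binders`, the parameters `top_k` and `min_members`, the toy scores, and the convention that a lower docking score means a better predicted binder are all assumptions made for illustration. The sketch takes one table of docking scores per ensemble member, keeps the `top_k` best-scoring ligands for each member, and reports the ligands that appear in the top list of at least `min_members` members.

```python
from collections import Counter


def consolidate_top_binders(scores_by_member, top_k, min_members):
    """Return (ligand, member_count) pairs for ligands that rank in the
    top_k of at least min_members ensemble members.

    scores_by_member maps an ensemble-member id to a dict of
    ligand id -> docking score. Assumption: lower score = better binder.
    """
    counts = Counter()
    for ligand_scores in scores_by_member.values():
        # Rank this member's ligands from best (lowest) to worst score.
        ranked = sorted(ligand_scores, key=ligand_scores.get)
        # Each member contributes one vote per ligand in its top list.
        counts.update(ranked[:top_k])

    consensus = [(lig, n) for lig, n in counts.items() if n >= min_members]
    # Most widely shared ligands first; ties broken by ligand id.
    consensus.sort(key=lambda pair: (-pair[1], pair[0]))
    return consensus


# Hypothetical toy data: three ensemble members, three ligands.
example_scores = {
    "model_1": {"ligA": -9.1, "ligB": -7.4, "ligC": -5.0},
    "model_2": {"ligA": -8.7, "ligB": -4.2, "ligC": -7.9},
    "model_3": {"ligA": -8.5, "ligB": -8.8, "ligC": -6.1},
}

print(consolidate_top_binders(example_scores, top_k=2, min_members=2))
```

In this toy case `ligA` is in the top two for all three members and `ligB` for two of them, so both are retained, while `ligC` appears in only one top list and is dropped. Counting votes across members is only one possible consolidation rule; averaging scores or ranks across the ensemble would be another.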