Synergistic Use of Compound Properties and Docking Scores in Neural Network Modeling of CYP2D6 Binding: Predicting Affinity and Conformational Sampling

Document Type




Format of Original

11 p.

Publication Date



Publisher

American Chemical Society

Source Publication

Journal of Chemical Information and Modeling

Source ISSN


Original Item ID

doi: 10.1021/ci600267k


Cytochrome P450 2D6 (CYP2D6) is used to develop an approach for predicting affinity and relevant binding conformation(s) for highly flexible binding sites. The approach combines docking scores and compound properties as attributes in building a neural network (NN) model. It begins by identifying segments of CYP2D6 that are important for binding specificity, based on structural variability among diverse CYP enzymes. A family of distinct, low-energy conformations of CYP2D6 is generated using simulated annealing (SA), and a collection of 82 compounds with known CYP2D6 affinities is docked. Interestingly, docking poses are observed on the backside of the heme as well as in the known active site. Docking scores for the active site binders, along with compound-specific attributes, are used to train a neural network model to bin compounds as strong binders, moderate binders, or nonbinders. Attribute selection is used to preselect the most important scores and compound-specific attributes for the model. A prediction accuracy of 85 ± 6% is achieved. Dominant attributes include docking scores for three of the 20 conformations in the ensemble as well as the compound's formal charge, number of aromatic rings, and AlogP. Although compound properties alone are highly predictive attributes (12% improvement over baseline) in the NN-based prediction of CYP2D6 binders, their combined use with docking score attributes is synergistic (net increase of 23% above baseline). Beyond prediction of affinity, attribute selection provides a way to identify the most relevant protein conformation(s) in terms of binding competence. In the case of CYP2D6, three of the 20 SA-generated structures in the ensemble are found to be the most predictive for binding.
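
The attribute layout described in the abstract (one docking score per SA-generated conformation, plus compound properties, followed by attribute selection) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the data are synthetic, the attribute names and the helper functions are hypothetical, the choice of which conformations carry signal is arbitrary, and a simple absolute-correlation filter stands in for whatever attribute selection method the authors actually used. The neural network itself is not reproduced here.

```python
import random
import statistics

N_CONFORMATIONS = 20  # ensemble size stated in the abstract
PROPERTY_NAMES = ["formal_charge", "n_aromatic_rings", "alogp"]  # properties named in the abstract
CLASSES = {"nonbinder": 0, "moderate": 1, "strong": 2}  # the three bins, coded as ordinals


def attribute_names():
    """One docking-score attribute per conformation, then the compound properties."""
    docking = [f"dock_conf_{i:02d}" for i in range(1, N_CONFORMATIONS + 1)]
    return docking + PROPERTY_NAMES


def make_synthetic_compound(rng, label):
    """Hypothetical data generator; these are NOT the paper's data."""
    y = CLASSES[label]
    scores = [rng.gauss(0.0, 1.0) for _ in range(N_CONFORMATIONS)]
    # Assumption for the demo only: conformations 3, 7, and 15 are informative,
    # with more negative scores for stronger binders.
    for idx in (2, 6, 14):
        scores[idx] = -2.0 * y + rng.gauss(0.0, 0.5)
    # Assumption for the demo only: each property tracks the class with noise.
    props = [y + rng.gauss(0.0, 0.7) for _ in PROPERTY_NAMES]
    return scores + props, y


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def rank_attributes(rows, labels):
    """Rank attributes by |correlation| with the ordinal class label (a stand-in filter)."""
    columns = list(zip(*rows))
    scored = [(abs(pearson(col, labels)), name)
              for name, col in zip(attribute_names(), columns)]
    return sorted(scored, reverse=True)


rng = random.Random(0)  # fixed seed so the demo is repeatable
label_cycle = ["nonbinder", "moderate", "strong"]
data = [make_synthetic_compound(rng, label_cycle[i % 3]) for i in range(82)]
rows = [features for features, _ in data]
labels = [y for _, y in data]

ranking = rank_attributes(rows, labels)
selected = [name for _, name in ranking[:6]]  # keep the six top-ranked attributes
print(selected)
```

In a sketch like this, the docking-score attributes that survive selection point back to specific members of the conformational ensemble, which mirrors how the abstract uses attribute selection to identify the most binding-competent conformations. The selected columns would then be the inputs to a classifier.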


Journal of Chemical Information and Modeling, Vol. 46, No. 6 (October 16, 2006): 2698-2708. DOI: 10.1021/ci600267k.