Accelerated bottom-up drug design platform enables the discovery of novel stearoyl-CoA desaturase 1 inhibitors for cancer therapy

Here we present an innovative computational-based drug discovery strategy, coupled with machine-based learning and functional assessment, for the rational design of novel small molecule inhibitors of the lipogenic enzyme stearoyl-CoA desaturase 1 (SCD1). Our methods resulted in the discovery of several unique molecules, of which our lead compound SSI-4 demonstrates potent anti-tumor activity, with an excellent pharmacokinetic and toxicology profile. We improve upon key characteristics, including chemoinformatics and absorption/distribution/metabolism/excretion (ADME) toxicity, while driving the IC50 to 0.6 nM in some instances. This approach to drug design can be executed in smaller research settings, applied to a wealth of other targets, and paves a path forward for bringing small-batch based drug programs into the Clinic.

Several models were generated and compared for hybrid modeling, which resulted in best scoring portions from each program and validated for dihedrals, packing, and other metrics like Phi-Psi space.
From these final models for SCD1 were mapped using the SiteMap module to score highest confidence binding sites on SCD1, which resulted in a deep internal pocket that would correspond with lipophilic substrate binding, required for stearoyl-coA binding ( Prior to mapping out potential grid surfaces for an active site region surrounding residues, the ProteinPreparationWizard contained in Schrödinger (Protein Preparation Wizard; Epik version 2.8; Impact version 6.3; Prime version 3.5 [computer program]: Schrödinger, LLC, New York, NY, USA; 2014) was used. The top regions identified using SiteMap region were then mapped for grid generation and decomposition of the protein's three-dimensional space for docking experiments. Using this grid, initial placement for A939572, SAR707, and other known inhibitors were docked using the Glide algorithm within the Schrodinger suite as a virtual screening workflow (VSW). The docking was accomplished using a scheme that proceeds from single-precision (SP) through extra-precision (XP) with the Glide algorithm (Glide, v. 5.7, Schrödinger, LLC). The top seeded poses were ranked for best scoring pose and unfavorable scoring poses were discarded. Each conformer was allowed multiple orientations in the site. Site hydroxyls, such as in serines and threonines, were allowed to move with rotational freedom. Docking scores were not retained as useful, since covalent bonding is the outcome. Hydrophobic patches were utilized within the virtual screening workflow (VSW) as an enhancement. Top favorable scores from initial dockings of yielded thousands of poses with the top five poses retained. XP descriptors were used to obtain atomic energy terms like hydrogen bond interaction, electrostatic interaction, hydrophobic enclosure and π-π stacking interaction that result during the docking run. VSW docking was completed for all novel generated compounds from the de novo design process described in the QSAR section and the docking utilized both hydrophobic constraints and the seeding generated from known inhibitor docking poses. Molecular modeling for importing and refining the known inhibitor compounds and generation of the small molecule compounds were prepared with LigPrep module (

Active, Inactive, Unknown Ligand Preparation
Twenty active compounds that ranged from 3 nM to 400 nM and eight inactives ranging from 2.68 micromolar to >10 micromolar were built into our pharmacophore modeling system. Conformers were generated for all actives, inactives, and test set compounds using ConfGen and Mixed MCMM/LCMOD within Schrodinger (Watts KS, Dalal P, Murphy RB, Sherman W, Friesner RA, Shelley JC. ConfGen: A Conformational Search Method for Efficient Generation of Bioactive Conformers. J.Chem. Inf. Model. 2010;50:534-546). For ConfGen the number of conformers per rotatable bond was set at 100, maximum number of conformers per structure was set at 1000, sampling was set on "Thorough" mode and included preprocess www.impactjournals.com/oncotarget minimization of 100 steps and postprocess minimization of 50 steps and eliminate high-energy/redundant conformers. The MacroModel options for conformer generation used the OPLS2005 force field, GB/SA water solvation treatment and default setting for the maximum relative energy difference and maximum allowed atom deviation.

Pharmacophore Hypothesis Generation
A common pharmacophore was determined over a variant list that ranged from 5-6 sites and required a match from at least 6 of the 17 actives built into the model. The variant list included 187 selections based on sites created for all the actives chosen. The following letter code was used for the pharmacophore sites/features: hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobic group (H), negatively charged group (N), positively charged group [P], aromatic ring [R], and no custom features for X, Y, or Z designation were assigned. For images with pharmacophore models, the following appearance is given: (A) is light red sphere located at atom with lone pair and arrow point toward the lone pair, (D) is light blue sphere centered on hydrogen atom with arrow in direction of the potential H-bond, (H) is green sphere, (N) is red sphere, [P] is blue sphere, and [R] is orange torus in plane of aromatic ring. Our search method employed for finding common pharmacophores is identified using a tree-based partitioning technique that groups according to inter-site distances (k-points). Using a binary decision tree, a tree depth of five was allowed and partition into bins based on a 2.0 Å width, while partition continues to either eliminate or survive the procedure.
The scoring hypotheses were bases on the identified pharmacophores from each surviving n-dimensional box for the chosen actives and additional information from partial matching of ligand alignments. The quality alignments were determined using three metrics, namely, the alignment score (via root-mean-squared-deviation (RMSD)), vector score (average cosine of the angles formed by corresponding pairs of vector features (A, D, R), and volume score (overlap of van der Waals models of non-hydrogen atoms in each pair of structures). As well, site scores for each alignment were computed to augment the alignment score with a cutoff Calign, which combined the site score, vector score, and volume score with separate weights to yield a combined alignment score for each non-reference pharmacophore that was aligned with reference. All pharmacophores within a box were treated as a reference and the highest one selected as a hypothesis during multi-ligand alignment optimization. The final scoring function (survival score) was: , where W's represented the weights and S's represented the scores. Inactives were penalized by adjusting their alignment score, such that, when an inactive matches only k out of n sites, an effective n-point alignment score was computed as follows: , where W k =k/n. The final adjusted score mentioned above became S adjusted = S actives − W inactives S inactives . Default weights were used with all equations, as shown above. Hypotheses were clustered for to tease out pharmacophore model variants with similar scores.

Generation of 3D QSAR Models
The 3D QSAR models were built by mapping the chemical features of ligand structures onto a cubic three-dimensional grid space with the smallest grid spacing of 1 Å per side. As above, the ligands are first aligned to the set of pharmacophore features for the selected hypothesis using a standard least-squares approach, which utilizes regression of the independent variables with binary-valued bits in the cubes by structural components and the dependent variables are the activities. The regression was performed via a partial least squares (PLS) method, where a series of models generated with increasing number Oncotarget 4 www.impactjournals.com/oncotarget of PLS factors. T-value filter (t-value < 2.0) was used to eliminate independent variables overly sensitive to incremental changes from the training set. For structural components, both atom-based and pharmacophore features based were examined. The regression with m PLS factors, fitted to activities is given as: , where m = number of PLS factors, b is regression coefficient, vector y represents activity values in the training set. For the prediction of activities for the new ligands, the following was used: . Models with high stability were preferred. Z-score matrix A final Z-filtering mechanism was applied to reduce the dataset to a few select compounds for synthesis and experimental screening by using combined normalized scoring. The final Z-score for each compound was determined as: Z scr = (Shape norm + Dock norm + QSAR norm ) 3 , which averages average of the sums the normalization of each individual Shape, the Dock, and 3D-QSAR score