Systematic approach identifies RHOA as a potential biomarker therapeutic target for Asian gastric cancer

Gastric cancer (GC) is a highly heterogeneous disease, in dire need of specific, biomarker-driven cancer therapies. While the accumulation of cancer “Big Data” has propelled the search for novel molecular targets for GC, its specific subpathway and cellular functions vary from patient to patient. In particular, mutations in the small GTPase gene RHOA have been identified in recent genome-wide sequencing of GC tumors. Moreover, protein overexpression of RHOA was reported in Chinese populations, while RHOA mutations were found in Caucasian GC tumors. To develop evidence-based precision medicine for heterogeneous cancers, we established a systematic approach to integrate transcriptomic and genomic data. Predicted signaling subpathways were then laboratory-validated both in vitro and in vivo, resulting in the identification of new candidate therapeutic targets. Here, we show: i) differences in RHOA expression patterns, and its pathway activity, between Asian and Caucasian GC tumors; ii) in vitro and in vivo perturbed RHOA expression inhibits GC cell growth in high RHOA-expressing cell lines; iii) inverse correlation between RHOA and RHOB expression; and iv) an innovative small molecule design strategy for RHOA inhibitors. In summary, RHOA, and its oncogenic signaling pathway, represent a strong biomarker-driven therapeutic target for Asian GC. This comprehensive strategy represents a promising approach for the development of “hit” compounds.


In silico approach for identifying 730 RHOA small molecule compounds
Starting from a "seed" ligand (the known RHOA inhibitor Rhosin), which molds the entire pocket of RHOA [18], we obtained similar candidate compounds for further docking and statistical analyses. We searched about 45 million compounds in PubChem and found 730 compounds with similar backbones and high Tanimoto scores of 0.6, as measured by the Open Babel program [41].
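The similarity screen can be illustrated with a minimal sketch of the Tanimoto coefficient on binary fingerprints. This is not the Open Babel call used in the study: the fingerprints below are toy sets of "on" bit indices, the compound names are hypothetical, and treating 0.6 as a lower cutoff (score >= 0.6) is an assumption about how the threshold was applied.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def similar_to_seed(seed: set, library: dict, threshold: float = 0.6) -> list:
    """Names of library entries whose similarity to the seed is >= threshold (assumed cutoff)."""
    return [name for name, bits in library.items() if tanimoto(seed, bits) >= threshold]


# Toy fingerprints (hypothetical bit positions, not real Rhosin data)
seed_bits = {1, 4, 7, 9, 12}
library = {
    "cmpd_A": {1, 4, 7, 9, 13},  # 4 shared bits, 6 bits in the union
    "cmpd_B": {1, 2, 3},         # 1 shared bit, 7 bits in the union
}
hits = similar_to_seed(seed_bits, library)
print(hits)
```

In this toy library only cmpd_A passes, since 4/6 exceeds the assumed cutoff while 1/7 does not.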
Using these 730 chemical compounds, we performed docking simulations based on the RHOA protein structure (Protein Data Bank entry 1X86, chain B) [20]. The available docking pocket of RHOA consisted of 75 residues. Docking simulation of each compound was performed as follows: i) move the center of mass (COM) of the compound to an out-of-pocket residue; ii) given that specific residue, perform an AutoDock Vina [42] calculation three times, each with a different random seed; iii) repeat steps i and ii for all 75 pocket residues (i.e., 3 × 75 = 225 docking executions per compound). A 3D imaginary box with equal dimensions of 15 Å, centered on the Cα carbon of the pocket residue, was used to retain each ligand at its assigned pocket residue; iv) after the docking simulations, cluster the COMs of the resulting conformations into multiple groups (i.e., clusters) using a radius of 4 Å: when two ligand conformations lie within 4 Å of each other, they are assigned to the same group, as determined by CHARMM [43]. For each group, we assigned the lowest-energy conformer as the representative conformer. The presence of multiple groups for the same ligand suggests that the ligand can bind to different localities within the pocket, i.e., that binding is nonspecific. To address this, we took the two groups (clusters) with the lowest (E_1) and second-lowest (E_2) representative energies, and used the two representative conformers for further analyses; and v) apply steps i-iv to the remaining compounds (total number of docking executions: 730 × 225 = 164,250). After the whole docking simulation of all 730 compounds, we identified the group with the lowest representative energy conformer for each compound, resulting in 730 ligand conformers and their corresponding energies.
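The clustering and bookkeeping in the protocol above can be sketched as follows. This is only an illustration, not the CHARMM procedure used in the study: it assumes single-linkage grouping (two poses share a group whenever their COMs are within the cutoff, directly or through a chain of neighbors), treats "within 4 Å" as "<= 4 Å", and uses made-up pose coordinates and energies. All function and variable names are hypothetical.

```python
import math

N_POCKET_RESIDUES = 75   # from the text
N_SEEDS = 3              # Vina repeats per residue, from the text
CUTOFF = 4.0             # clustering radius in Å, from the text


def cluster_poses(poses, cutoff=CUTOFF):
    """Single-linkage grouping (an assumption) of poses by COM distance.

    poses: list of (com_xyz, energy) tuples. Returns lists of pose indices.
    """
    parent = list(range(len(poses)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(poses)):
        for j in range(i + 1, len(poses)):
            if math.dist(poses[i][0], poses[j][0]) <= cutoff:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(len(poses)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def two_lowest_representatives(poses, cutoff=CUTOFF):
    """Take the lowest energy of each group, then return the two lowest (E1, E2)."""
    reps = sorted(min(poses[i][1] for i in g) for g in cluster_poses(poses, cutoff))
    e1 = reps[0]
    e2 = reps[1] if len(reps) > 1 else None  # None: only one binding locality found
    return e1, e2


runs_per_compound = N_SEEDS * N_POCKET_RESIDUES  # 3 x 75
total_runs = 730 * runs_per_compound             # all 730 compounds

# Hypothetical poses: (COM coordinates in Å, docking energy)
poses = [
    ((0.0, 0.0, 0.0), -7.9),
    ((1.5, 0.0, 0.0), -8.4),   # within 4 Å of the first pose, same group
    ((20.0, 0.0, 0.0), -6.1),  # far away, forms a second group
]
e1, e2 = two_lowest_representatives(poses)
print(runs_per_compound, total_runs, e1, e2)
```

With a different linkage rule or a strict "<" cutoff, group membership could change for borderline poses, so the choice made here should be read as one plausible interpretation of the text.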

Lipinski's rule application to the 730 compounds
The physicochemical criteria relating to Lipinski's rule of five (RO5) [19], i.e., number of hydrogen bond donors < 5, number of hydrogen bond acceptors < 10, molecular mass < 500 Da, octanol-water partition coefficient (logP) < 5, and number of rotatable bonds < 10, were then applied to the 730 compounds as the second screening filter. Two additional rules were applied: i) the number of heavy atoms ≤ 100; and ii) the number of compounds = 1. These two criteria reflect general properties of most currently used drugs and facilitate the selection of relatively small and moderately lipophilic molecules.
Since the candidate ligands could also interact with different local binding regions within the pocket of the RHOA protein, we regarded nonspecific RHOA binding as less favorable for druggability and therefore aimed to identify pocket-specific, locally binding ligands. For this purpose, we constructed a statistical test for evaluating druggability in terms of specific binding, which is described in the next section.
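A minimal sketch of such a property filter is shown below, using the thresholds exactly as listed in the text (strict "<" where stated). The descriptor values are hypothetical and would in practice come from a cheminformatics toolkit; the class and function names are illustrative. The "number of compounds = 1" criterion is left out because the text does not define how it is counted.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors
    mol_mass: float     # molecular mass in Da
    logp: float         # octanol-water partition coefficient
    rot_bonds: int      # rotatable bonds
    heavy_atoms: int    # non-hydrogen atoms


def passes_filter(d: Descriptors) -> bool:
    """Apply the thresholds as listed in the text."""
    return (
        d.h_donors < 5
        and d.h_acceptors < 10
        and d.mol_mass < 500
        and d.logp < 5
        and d.rot_bonds < 10
        and d.heavy_atoms <= 100
    )


# Hypothetical descriptor values, for illustration only
candidates = {
    "cmpd_A": Descriptors(2, 6, 358.4, 3.1, 5, 27),
    "cmpd_B": Descriptors(5, 6, 358.4, 3.1, 5, 27),  # fails: donors not < 5
    "cmpd_C": Descriptors(1, 4, 512.6, 4.2, 7, 36),  # fails: mass not < 500
}
kept = [name for name, d in candidates.items() if passes_filter(d)]
print(kept)
```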

Specific binding test
Candidate ligand interactions with various local binding regions within the RHOA pocket can be shown by ligand-pocket clustering. We hypothesized that higher ΔE values (the difference between the lowest and second-lowest energies of the two binding regions) indicate that the compound is likely to bind specifically to the region with the lowest binding energy (Figure 8B). The compound in the right panel of Figure 8B binds the pocket more specifically than the compound in the left panel. Because nonspecific protein-binding ligands were less preferable as druggable candidates, we developed a statistical test to identify ligands with statistically significant, specific local binding.
For a given ligand candidate h (h = 1, ..., m), we obtained the most stable ligand-pocket conformational state (state 1) and the second-most stable ligand-pocket conformational state (state 2) out of all the ligand-pocket pairs in the protein. We then defined the energy corresponding to each conformational state i (i = 1, 2, ...) as E_hi for each ligand h, postulating that the more ligand-specific the RHOA binding, the higher the energy difference E_h1/E_h2. Using these parameters, we could establish a statistical measure for the significance of the energy differences.
We transformed E_hi (i = 1, 2) into C_hi = exp(-E_hi/kT), defined as the occurrences of the ligand-pocket conformational state corresponding to E_hi (k: Boltzmann constant; T: temperature, with kT arbitrarily set to one). We then set C_hi ~ binomial(n_i, p_hi) (i = 1, 2), where n_i is the total number of occurrences for all E_hi, and p_hi is the probability of the energy conformational state i for the ligand h. Using the relationships X = log2 C_h1, Y = log2 C_h2, M = X - Y = log2(C_h1/C_h2), and A = (X + Y)/2, and assuming that the C_hi are independent and the n_i are large enough, the conditional distribution M|A follows a normal distribution, which leads to the same statistical method used in DEGseq. Combining these considerations, we obtained the significance of the energy differences for all the candidate compounds. The 41 most significant compounds identified by the test are reported in the Results section.
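The transformation from energies to M and A values can be sketched as below. The sketch covers only the Boltzmann weighting and the M/A definitions given in the text; the DEGseq significance calculation itself is not reproduced here. The energies are hypothetical, and the function names are illustrative. Note that, from the definitions, M reduces to (E_h2 - E_h1)/(kT ln 2), so a larger gap between the two lowest energies gives a larger M.

```python
import math


def boltzmann_weight(energy: float, kT: float = 1.0) -> float:
    """C = exp(-E / kT), with kT set to one as in the text."""
    return math.exp(-energy / kT)


def ma_values(e1: float, e2: float, kT: float = 1.0):
    """Return (M, A) for the lowest (e1) and second-lowest (e2) energies."""
    c1 = boltzmann_weight(e1, kT)
    c2 = boltzmann_weight(e2, kT)
    x, y = math.log2(c1), math.log2(c2)
    m = x - y          # log2(C1 / C2)
    a = (x + y) / 2    # mean log2 weight
    return m, a


# Hypothetical energies for one ligand (arbitrary units, since kT = 1)
e_h1, e_h2 = -8.4, -6.1
m, a = ma_values(e_h1, e_h2)

# Closed-form check: M = (E2 - E1) / (kT * ln 2)
m_closed_form = (e_h2 - e_h1) / math.log(2)
print(m, a, m_closed_form)
```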

Small molecule synthesis
All reactions sensitive to air or moisture were conducted under a nitrogen atmosphere. Reagents were purchased from Sigma-Aldrich and Tokyo Chemical Industry. All anhydrous solvents were distilled over CaH₂, P₂O₅, or Na/benzophenone prior to reaction, unless otherwise stated. Analytical thin-layer chromatography (TLC) was performed using commercial precoated TLC plates (Silica gel 60, F-254, Merck). Spots were detected by viewing under UV light (excitation, 254 nm), or visualized by charring after dipping in either of the following solutions: phosphomolybdic acid (PMA) in ethanol or aqueous potassium permanganate. Flash column chromatography was performed on silica gel 60 (0.040-0.063 mm, 230-400 mesh, Merck). Infrared spectra were recorded using an Agilent Cary 670. ¹H NMR spectra (CDCl₃, CD₃OD, D₂O, or DMSO-d₆) were recorded on an Agilent 400-MR (400 MHz). Chemical shifts were reported in parts per million (δ), relative to the solvent peak. ¹H NMR peak multiplicities were reported as follows: s for singlet; d for doublet; dd for doublet of doublets; ddd for doublet of doublet of doublets; t for triplet; pseudo t for pseudo triplet; brs for broad singlet; and m for multiplet. Coupling constants were reported in Hertz. ¹³C NMR spectra (CDCl₃, CD₃OD, D₂O, or DMSO-d₆) were similarly recorded using an Agilent 400-MR (100 MHz). Chemical shifts were reported in ppm (δ), relative to the solvent peak. Mass spectra were recorded with an ESI+ source in methylene chloride or methanol.
The general procedure for the synthesis of hydrazides was as follows: a mixture of piperonal (7 mmol) and the appropriate hydrazide compound (7 mmol) in MeOH or EtOH was stirred at room temperature, or heated under reflux, for 2-40 h. The progress of the reaction was monitored by TLC. After completion of the reaction, the contents were cooled to room temperature and poured into ice-cold water (35 mL) with stirring. The solid was filtered, dried, and purified by recrystallization from MeOH or EtOH to give hydrazide products JK-121 to JK-125 in 64-95% yields.
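The reagent amounts in the general procedure can be checked with a small calculation. The sketch below converts the 7 mmol of piperonal (C8H6O3) to grams using standard atomic masses and shows how a percent yield is computed; the isolated amount used in the yield example is hypothetical, and the helper names are illustrative.

```python
# Standard atomic masses in g/mol (rounded)
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(formula: dict) -> float:
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def grams_for_mmol(mmol: float, mm: float) -> float:
    """Mass in grams for a given amount in mmol."""
    return mmol * mm / 1000.0


def percent_yield(isolated_mmol: float, theoretical_mmol: float) -> float:
    return 100.0 * isolated_mmol / theoretical_mmol


piperonal = {"C": 8, "H": 6, "O": 3}
piperonal_mm = molar_mass(piperonal)              # about 150.13 g/mol
piperonal_g = grams_for_mmol(7.0, piperonal_mm)   # about 1.05 g for 7 mmol

# Hypothetical isolated amount; the 1:1 stoichiometry follows from the
# equimolar amounts (7 mmol each) stated in the procedure.
example_yield = percent_yield(5.6, 7.0)
print(round(piperonal_mm, 2), round(piperonal_g, 3), round(example_yield, 1))
```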