Human oncoprotein Musashi-2 N-terminal RNA recognition motif backbone assignment and identification of RNA-binding pocket

The RNA-binding protein Musashi-2 (MSI2) is a key regulator in stem cells. It is over-expressed in a variety of cancers, and its higher expression is associated with poor prognosis. Like Musashi-1, it contains two N-terminal RRMs (RNA-recognition Motifs, also called RBDs (RNA-binding Domains)), RRM1 and RRM2, which mediate binding to their target mRNAs. Previous studies have determined the three-dimensional structures of the RBDs of Musashi-1 and of the RBD1:RNA complex. Here we show the binding of MSI2-RRM1 to a 15-nt Numb RNA in a fluorescence polarization assay and a time-resolved fluorescence resonance energy transfer assay. Using nuclear magnetic resonance (NMR) spectroscopy, we assigned the backbone resonances of MSI2-RRM1 and characterized the direct interaction of RRM1 with Numb RNA r(GUAGU). Our NMR titration and structure modeling studies showed that MSI2-RRM1 and MSI1-RBD1 have similar RNA-binding events and binding pockets. This work adds significant information on the MSI2-RRM1 structure and RNA-binding pocket, and contributes to the development of MSI2-specific and MSI1/MSI2 dual inhibitors.

Both Musashi proteins belong to the class A/B heterogeneous nuclear ribonucleoproteins (hnRNPs). They each have two N-terminal RRMs (RNA-recognition Motifs, also called RBDs (RNA-binding Domains)), RRM1 and RRM2, which mediate the binding to their target mRNAs [3]. Like MSI1, MSI2 post-transcriptionally regulates mRNAs by binding to recognition motifs located in the 3ʹ-UTR of target mRNAs. One of the motifs, r(UAG), is shared between MSI1 and MSI2 [25,26]. The residues that recognize r(UAG) in MSI1 are highly conserved between MSI1 and MSI2 [26].
MSI1 and MSI2 share several overlapping targets that are involved in oncogenic pathways (summarized in [20,22]). One of the targets is Numb [6,7,27,28], a negative regulator of Notch signaling. MSI proteins bind to Numb mRNA and inhibit its translation, leading to elevated Notch signaling, increased proliferation and survival, and decreased apoptosis of cancer cells. Specifically, Ito et al. reported in Nature that the Musashi-Numb pathway can control the differentiation of chronic myelogenous leukemia (CML); expression of Numb as a result of MSI2 loss impairs the development and propagation of blast crisis CML in vitro and in vivo [6].
The solution structures of the two N-terminal RBDs of mouse Msi1 and their interactions with RNA have been studied extensively [25,26,29,30]. The two RBDs of mouse Msi1 share the same ribonucleoprotein (RNP)-type fold, and MSI1-RBD1 can specifically bind to RNA through stacking interactions between aromatic residues and RNA bases. However, no study to date has examined the Msi2/MSI2 residues that directly interact with RNA, and no high-resolution structures of the Msi2/MSI2 RRMs are available. Thus, investigating how the RRMs of MSI2 interact with RNA can contribute to identifying novel compounds that disrupt these unique MSI2-RNA interactions. Currently, three studies, including ours, have pursued small-molecule inhibitors of MSI. The inhibitors identified in these studies have Ki values ranging from ~0.5 µM to 5 µM [31-33], yet the most potent one, (-)-gossypol (Ki ~0.5 µM) from our study, is not MSI specific [33]. With the help of structure-based rational design guided by the NMR structure, we aim to develop more potent and specific MSI inhibitors.
Here we characterize the interactions between MSI2-RRM1 and RNA by FP (fluorescence polarization) and TR-FRET (time-resolved fluorescence resonance energy transfer). We also describe an NMR (nuclear magnetic resonance) spectroscopy investigation of the backbone assignment of MSI2-RRM1 and its intermolecular interactions with RNA. Based on these studies, we identified the RNA-binding pocket of MSI2-RRM1 and revealed the similarities between the binding pockets of MSI2-RRM1 and MSI1-RBD1.

Musashi-2-Numb RNA binding
Here we show that Numb FITC, but not a control RNA with a scrambled sequence (Control FITC), binds to MSI2-RRM1, as indicated by the increased FP value (Figure 1A). Such binding is also evident in the TR-FRET assay (Figure 1B). Using biotinylated Numb RNA (Numb Biotin: 5ʹUAGGUAGUAGUUUUA-Biotin), the TR-FRET assay detects a Kd of 2.07 nM, whereas a control RNA (Control Biotin), a Numb mutant (Numbmut Biotin: 5ʹUAGCAUCAUCAUUUA-Biotin) or a non-labelled Numb (Figure 1B) shows no detectable binding in the nM range. A competition assay on the preformed MSI2-RRM1-Numb Biotin complex indicates that only Numb, but not the control RNA, can displace Numb Biotin (Figure 1C).
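The text reports a Kd but does not state which binding model was used to obtain it. As a minimal sketch, assuming simple 1:1 binding, the relationship between Kd and the fraction of RNA bound can be written in two common forms: a hyperbolic form that assumes protein is in large excess over RNA, and a quadratic form that accounts for depletion of free protein. The function names and the example concentrations are hypothetical; only the 2.07 nM value comes from the text.

```python
import math

KD_NM = 2.07  # Kd reported in the text for Numb Biotin (nM)


def fraction_bound_simple(protein_nM, kd_nM):
    """Fraction of RNA bound, assuming protein is in large excess over RNA."""
    return protein_nM / (kd_nM + protein_nM)


def fraction_bound_quadratic(protein_nM, rna_nM, kd_nM):
    """Fraction of RNA bound for 1:1 binding, allowing for protein depletion."""
    total = protein_nM + rna_nM + kd_nM
    complex_nM = (total - math.sqrt(total * total - 4.0 * protein_nM * rna_nM)) / 2.0
    return complex_nM / rna_nM


# Hypothetical protein concentrations (nM), chosen only for illustration.
for protein in (0.5, KD_NM, 20.0):
    simple = fraction_bound_simple(protein, KD_NM)
    exact = fraction_bound_quadratic(protein, 1.0, KD_NM)
    print(f"[protein] = {protein:5.2f} nM  simple = {simple:.3f}  quadratic = {exact:.3f}")
```

When the protein concentration equals Kd, the simple form gives half-maximal binding by construction. The two forms diverge when the RNA concentration is comparable to Kd, which is worth keeping in mind for a low-nanomolar interaction.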

Backbone assignment
To keep the amino acid numbers of recombinant MSI2-RRM1 consistent with those of MSI2, residues of the recombinant MSI2-RRM1 are numbered starting from negative 3. In other words, the first residue of the N-terminal hexahistidine tag is M-3, and K111 is the C-terminal end. Residues M-3 to A20 comprise the N-terminal hexahistidine tag and TEV protease recognition site. G21 is the first residue found in MSI2-RRM1, and it corresponds to G21 in the MSI2 sequence (Figure 5A). Figure 2 shows the 2D ¹H-¹⁵N HSQC spectrum of MSI2-RRM1 annotated with the ¹H, ¹⁵NH assignments of residues G21-K111 and the W30 side chain NHε. Overall, 92.6% of the non-proline backbone resonances were assigned. Residues not assigned at this time are M-3 through S4 in the hexahistidine tag, and S18 in the TEV protease recognition site. Backbone ¹H, ¹⁵N, and ¹³C assignments have been deposited in the BMRB data bank under accession number 27111.

RNA titration
To map the RNA-binding interface of MSI2-RRM1, we titrated 5-nt Numb (Numb5: GUAGU) stepwise into ¹⁵N-labeled protein and recorded a series of 2D ¹H-¹⁵N HSQC spectra. Significant perturbations of the chemical shifts of many backbone resonances of MSI2-RRM1 are evident upon binding (Figure 3A). The chemical shift changes reached a plateau at a 1:1 molar ratio of MSI2-RRM1:Numb5, indicating a 1:1 stoichiometry of the MSI2-RRM1:Numb5 complex. The peaks of many residues broadened or disappeared at substoichiometric ratios of MSI2-RRM1:Numb5, and reappeared at new positions at the 1:1 stoichiometric ratio. For some residues, the resonances shifted to nearby positions, while for others, the chemical shifts were perturbed so significantly that the new locations could not be determined. This behavior may result from direct binding of ligand to protein in which the free and bound states undergo slow exchange on the NMR chemical shift time scale, or from ligand-induced conformational changes. Chemical shift perturbations (CSPs) were calculated to identify residues that are involved in interacting with Numb5. Since the chemical shifts of seven residues (I25, G27, W30, S61, R62, G65 and F97) were perturbed so significantly that the new locations could not be determined from the titration alone, we acquired the ¹⁵N-edited NOESY-HSQC spectrum of the MSI2-RRM1:Numb5 complex and assigned the backbone ¹H and ¹⁵NH resonances of the seven residues in the bound state. CSPs were plotted against residue number (G21-K111) in Figure 3B. Residues exhibiting CSPs higher than one standard deviation above the average are G21, K22, F24, I25, G26, G27, S29, W30, Q31, T32, C50, S61, R62, G63, G65, F66, K94, V95 and F97. These nineteen residues interact directly or indirectly with the Numb5 RNA oligomer.
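The CSP formula used is not given in the text. The sketch below assumes a commonly used combined ¹H/¹⁵N form, CSP = sqrt(ΔδH² + (w·ΔδN)²), with an assumed weighting factor w = 0.14, and then applies the cutoff described above (mean plus one standard deviation). The shift differences are hypothetical placeholder values, not data from this study, and the choice of the population standard deviation is also an assumption.

```python
import math
import statistics

N_WEIGHT = 0.14  # assumed scaling of 15N shifts; the text does not state the value used


def csp(delta_h, delta_n, n_weight=N_WEIGHT):
    """Combined 1H/15N chemical shift perturbation in ppm (assumed formula)."""
    return math.sqrt(delta_h ** 2 + (n_weight * delta_n) ** 2)


def significant_residues(csps):
    """Return the cutoff (mean + 1 SD) and the residues whose CSP exceeds it."""
    values = list(csps.values())
    cutoff = statistics.mean(values) + statistics.pstdev(values)
    return cutoff, sorted(name for name, value in csps.items() if value > cutoff)


# Hypothetical (delta 1H, delta 15N) pairs in ppm, for illustration only.
toy_shifts = {
    "res1": (0.01, 0.05),
    "res2": (0.02, 0.10),
    "res3": (0.35, 1.50),
    "res4": (0.03, 0.20),
    "res5": (0.01, 0.00),
    "res6": (0.02, 0.10),
}

toy_csps = {name: csp(dh, dn) for name, (dh, dn) in toy_shifts.items()}
cutoff, significant = significant_residues(toy_csps)
print(f"cutoff = {cutoff:.3f} ppm, significant residues: {significant}")
```

Because the cutoff depends on the distribution of all CSPs, a few very large perturbations raise the threshold for every other residue.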

MSI2-RRM1 structure prediction
A model of MSI2-RRM1 was generated using CS-ROSETTA. This program utilizes protein backbone chemical shifts to select protein fragments from the Protein Data Bank (PDB), followed by Rosetta Monte Carlo assembly and relaxation methods. The CS-ROSETTA program has been shown to be effective in de novo structure prediction for small proteins (≤16 kDa) [34,35]. Residues in the N-terminus (M-3-K22) and C-terminus (A101-K111) are highly flexible and thus were excluded from the CS-ROSETTA calculation. A total of 25,000 structures were generated. The 10 lowest-energy structures were selected, and their average Cα root-mean-square deviation (RMSD) against the lowest-energy structure is 0.933 ± 0.284 Å. The superposition of the backbone atoms of the 10 lowest-energy CS-ROSETTA structures and the ribbon diagram representation of the lowest-energy structure for MSI2-RRM1 are shown in Figures 4A and 4B. The lowest-energy structure is composed of five β-strands (β1-β5) at residues 24-26, 47-52, 64-69, 84-86 and 89-96, and two α-helices (α1 and α2) at residues 34-43 and 73-81.
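The ensemble statistic quoted above (mean ± SD of the Cα RMSD against the lowest-energy model) can be sketched as follows. This is an illustration under stated assumptions: the coordinates are taken to be already superimposed on the reference (the superposition step itself is not shown), the reference model is excluded from the average, and the sample standard deviation is used. None of these details are specified in the text, and the coordinates are hypothetical.

```python
import math
import statistics


def ca_rmsd(ref, model):
    """RMSD (in the input units) between two pre-superimposed Calpha coordinate lists."""
    if len(ref) != len(model) or not ref:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for p, q in zip(ref, model) for a, b in zip(p, q))
    return math.sqrt(squared / len(ref))


def ensemble_rmsd(ref, models):
    """Mean and sample SD of the RMSD of each model against the reference."""
    values = [ca_rmsd(ref, m) for m in models]
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical three-atom "structures" (x, y, z in angstroms), for illustration only.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model_a = [(x, y, z + 1.0) for x, y, z in reference]
model_b = [(x, y, z + 2.0) for x, y, z in reference]

mean_rmsd, sd_rmsd = ensemble_rmsd(reference, [model_a, model_b])
print(f"Calpha RMSD vs reference: {mean_rmsd:.3f} +/- {sd_rmsd:.3f} A")
```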

RNA binding pocket
The agreement between the overall structures of the CS-ROSETTA model and the homology model of MSI2-RRM1 encouraged us to use the CS-ROSETTA model to identify the RNA-binding pocket and to compare it with that of MSI1-RBD1. Aligning the amino acid sequences of MSI2-RRM1 and MSI1-RBD1 and comparing their secondary structure elements revealed a high degree of secondary structure conservation between the two domains (Figure 5A). Ohyama et al. performed an NMR titration of MSI1-RBD1 with Numb5 [26]. Based on their results, we highlighted (in blue) the seventeen residues in MSI1-RBD1 with the largest CSPs (F23, I24, G25, G26, S28, W29, Q30, T31, M52, S60, R61, G62, G64, F65, K93, V94 and F96) on a ribbon diagram of the MSI1-RBD1 structure (Figure 5B, right panel). Likewise, Figure 5B (left panel) shows in red the seventeen residues (F24, I25, G26, G27, S29, W30, Q31, T32, C50, S61, R62, G63, G65, F66, K94, V95 and F97) that were significantly affected upon titration of Numb5 into MSI2-RRM1. In the CS-ROSETTA model of MSI2-RRM1, G21 and K22 are highly flexible; thus, residues G21 and K22 in MSI2-RRM1 and their equivalent residues C20 and K21 in MSI1-RBD1 were not mapped onto the ribbon diagram representations. In both proteins, the seventeen residues experiencing significant CSPs cluster around the central anti-parallel β-sheets and the surrounding loops. In MSI1-RBD1, this region is involved directly in binding with Numb5 [26]. Therefore, we hypothesize that, analogous to MSI1-RBD1, the RNA-binding pocket of MSI2-RRM1 is also composed of the central anti-parallel β-sheets and the surrounding loops [...] the RNA-binding surface where Numb5 binds (Figure 5C, right panel). Moreover, the CS-ROSETTA model of MSI2-RRM1 and the NMR structural model of MSI1-RBD1 are easily superimposed, with an RMSD of 2.29 Å for 75 Cα atoms (Figure 6).
Taken together, our CSP mapping results suggest, first, that specific interactions exist between MSI2-RRM1 and Numb5; second, that the perturbed residues in MSI2-RRM1 are generally consistent with those in MSI1-RBD1; and finally, that MSI2-RRM1 and MSI1-RBD1 have similar RNA-binding pockets.
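The correspondence between the two residue lists above can be checked with a short script. This sketch assumes a constant numbering offset of -1 from MSI2-RRM1 to MSI1-RBD1, inferred from the equivalent pairs given in the text (for example G21/K22 and C20/K21); the alignment in Figure 5A is the authoritative mapping, and the variable and function names are hypothetical.

```python
# Residues with significant CSPs, copied from the lists in the text.
MSI2_RRM1 = ["F24", "I25", "G26", "G27", "S29", "W30", "Q31", "T32", "C50",
             "S61", "R62", "G63", "G65", "F66", "K94", "V95", "F97"]
MSI1_RBD1 = ["F23", "I24", "G25", "G26", "S28", "W29", "Q30", "T31", "M52",
             "S60", "R61", "G62", "G64", "F65", "K93", "V94", "F96"]

OFFSET = -1  # assumed constant shift from MSI2-RRM1 to MSI1-RBD1 numbering


def shift(label, offset):
    """Shift the residue number of a label such as 'F24', keeping the amino acid letter."""
    return f"{label[0]}{int(label[1:]) + offset}"


# Map each MSI2-RRM1 residue to its expected MSI1-RBD1 label.
mapped = {shift(res, OFFSET): res for res in MSI2_RRM1}

shared = [(mapped[res], res) for res in MSI1_RBD1 if res in mapped]
msi2_only = [res for res in MSI2_RRM1 if shift(res, OFFSET) not in MSI1_RBD1]
msi1_only = [res for res in MSI1_RBD1 if res not in mapped]

print(f"{len(shared)} equivalent pairs (MSI2-RRM1, MSI1-RBD1): {shared}")
print(f"perturbed only in the MSI2-RRM1 list: {msi2_only}")
print(f"perturbed only in the MSI1-RBD1 list: {msi1_only}")
```

Under this assumption, sixteen of the seventeen residues pair up with identical amino acid types, and the two lists differ only in C50 (MSI2-RRM1) and M52 (MSI1-RBD1).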

DISCUSSION
In this study, we characterized the interactions between the MSI2-RRM1 protein and Numb RNA using FP, TR-FRET and NMR, and revealed the key features of protein-RNA binding and of the binding pocket. MSI2-RRM1 can specifically bind to Numb5 to form a 1:1 complex, and the binding process exhibits slow-exchange behavior indicative of a high binding affinity. We propose that the putative binding pocket of MSI2-RRM1 is composed of the central anti-parallel β-sheets and the surrounding loops in the MSI2-RRM1 CS-ROSETTA model. This is supported by the fact that the seventeen residues with significant CSPs (F24, I25, G26, G27, S29, W30, Q31, T32, C50, S61, R62, G63, G65, F66, K94, V95 and F97) cluster together and thus probably comprise the binding pocket. The putative RNA-binding pocket of MSI2-RRM1 identified in this study contains the canonical RNP motifs, RNP1 in β3 and RNP2 in β1 [36]. F64 and F66 in RNP1 and F24 in RNP2 are likely to stack canonically with RNA. The putative RNA-binding pocket of MSI2-RRM1 also has unique features: W30 in the loop between β1 and α1 and F97 at the C-terminal end showed significant CSPs and may be involved in non-canonical base-stacking interactions with RNA.
MSI2-RRM1 and MSI1-RBD1 are highly similar in their RNA-binding characteristics and RNA-binding pockets. Both proteins can specifically bind to Numb5 with 1:1 stoichiometry, and their bound and unbound states undergo slow exchange on the NMR chemical shift time scale. In both proteins, the residues involved in direct interaction with Numb5 are generally consistent. Moreover, the seven residues in MSI2-RRM1 whose chemical shifts were perturbed most strongly (ΔδH > 0.3 ppm), I25, G27, W30, S61, R62, G65 and F97, are all conserved in MSI1-RBD1 (I24, G26, W29, S60, R61, G64 and F96), where they also showed pronounced CSPs (ΔδH > 0.3 ppm). These findings indicate that both proteins use a similar set of residues to bind to Numb5. Their RNA-binding pockets are both formed by the central anti-parallel β-sheets and the surrounding loops.
Taken together, the similarities in the CSPs of MSI2-RRM1 and MSI1-RBD1 upon RNA binding suggest the possibility of identifying novel small-molecule inhibitors that are MSI2-specific as well as others that can function as MSI1/MSI2 dual inhibitors. Such inhibitors could potentially disrupt the unique base-stacking interactions between MSI2 aromatic amino acids and RNA, or MSI-RNA interactions in general, thus inhibiting MSI1/MSI2-mediated biological functions and compromising, among others, the viability of cancer cells that depend on MSI1/MSI2. Combined with our other efforts in discovering chemical probes, e.g. fragment-based drug screening, obtaining the solution structure of MSI2-RRM1 will help the development of MSI1/2 inhibitors through structure-based rational design (docking with NMR data) to discover compounds that fit the long, narrow RNA-binding pocket of MSI2.

Binding assays
Numb mRNA contains the Musashi recognition motif r(UAG) and is a binding target shared by MSI1 and MSI2. In order to dissect the similarities and differences between the RNA-binding pockets of MSI1 and MSI2, we used the same 15-nt Numb RNA (5ʹUAGGUAGUAGUUUUA) for our binding assays and the same 5-nt Numb RNA (GUAGU) for the NMR studies as in our previous study [33]. The FP assay was carried out using a FITC (fluorescein isothiocyanate)-labeled Numb RNA (Numb FITC: 5ʹUAGGUAGUAGUUUUA-FITC) according to our previous publication [33]. The TR-FRET assay was carried out using Streptavidin-d2 beads (610SADLA, Cisbio, Bedford, MA) and MAb Anti-6HIS Tb cryptate Gold beads (61HI2TLA, Cisbio) in an HTRF 96-well low-volume plate (66PL96005, Cisbio) following the protocol recommended by the manufacturer. Fluorescence measurements were taken at room temperature using a BioTek Synergy H4 plate reader (BioTek, Winooski, VT).

Protein expression and purification
The MSI2-RRM1 domain was sub-cloned into the pTGSG vector using the Ligation Independent Cloning method as described [37]. The non-labelled MSI2-RRM1 protein was expressed in E. coli BL21(DE3) pRARE and purified as previously described [33], and was kept in a buffer of 50 mM Tris-HCl, pH 8.0, 500 mM NaCl until use. The isotopically labelled protein was expressed in M9 medium using ¹³C glucose and ¹⁵N ammonium chloride as the sole carbon and nitrogen sources, respectively. The uniformly isotope-labelled protein was isolated and purified using the same method as the unlabelled protein and then further purified on a Superdex™ 75 10/300 GL column equilibrated with 20 mM MES, pH 6.0, 150 mM NaCl. The protein was then concentrated, and 5% D₂O was added for NMR spectroscopy. Protein purity was assessed on a Coomassie-stained acrylamide gel, and protein concentrations were determined using the Bradford assay (Bio-Rad, Hercules, CA).

NMR spectroscopy
All NMR experiments were performed at 298 K on a Bruker AVANCE 800 MHz spectrometer equipped with a triple resonance cryoprobe and a Bruker AVANCE 600 MHz spectrometer equipped with a triple resonance room temperature probe. The NMR buffer was 20 mM MES (pH 6.0), 150 mM NaCl and 5% D₂O for spectrometer lock. NMR data were processed using the NMRPipe program [38] and visualized and analyzed using CCPN Analysis [39] and NMRViewJ [40].