MicroRNA-200, associated with metastatic breast cancer, promotes traits of mammary luminal progenitor cells

MicroRNAs are critical regulators of gene networks in normal and abnormal biological processes. Focusing on invasive ductal breast cancer (IDC), we have found dysregulated expression in tumor samples of several microRNAs, including the miR-200 family, along progression from primary tumors to distant metastases, further reflected in higher blood levels of miR-200b and miR-7 in IDC patients with regional or distant metastases relative to patients with primary node-negative tumors. Forced expression of miR-200s in MCF10CA1h mammary cells induced an enhanced epithelial program, aldehyde dehydrogenase (ALDH) activity, mammosphere growth and ability to form branched tubuloalveolar structures while promoting orthotopic tumor growth and lung colonization in vivo. MiR-200s also induced the constitutive activation of the PI3K-Akt signaling through downregulation of PTEN, and the enhanced mammosphere growth and ALDH activity induced in MCF10CA1h cells by miR-200s required the activation of this signaling pathway. Interestingly, the morphology of tumors formed in vivo by cells expressing miR-200s was reminiscent of metaplastic breast cancer (MBC). Indeed, the epithelial components of MBC samples expressed significantly higher levels of miR-200s than their mesenchymal components and displayed a marker profile compatible with luminal progenitor cells. We propose that microRNAs of the miR-200 family promote traits of highly proliferative breast luminal progenitor cells, thereby exacerbating the growth and metastatic properties of transformed mammary epithelial cells.


INTRODUCTION
The acquisition of metastatic properties by cancer cells requires both the occurrence of genetic events that confer on a minority of neoplastic cells inherent self-renewal properties that no longer rely on environmental cues specific to the tissue of origin, and the epigenetic remodeling of selected properties that allow the cells to overcome environmental hurdles to metastasis [1]. As opposed to mutations, epigenetic processes are reversible and are used by cancer cells as a survival strategy that allows them to switch phenotypes in order to optimize adaptation to potentially hostile environmental conditions [1].
MicroRNAs are epigenetic regulators that modulate gene networks that determine the balance between cell invasion vs. cohesion, proliferation vs. quiescence, and senescence vs. cell death [2,3]. Many microRNAs display dysregulated expression in neoplastic states and either confer metastatic properties on tumor cells or repress them in experimental models [4]. Herein, we have assessed the role of microRNAs in regulating the ability of invasive breast ductal carcinoma (IDC) cells to metastasize to local lymph nodes as a surrogate marker of metastatic potential. We report that the miR-200 family members are significantly upregulated in lymph node (LN) metastases and, by using the MCF10CA1 mammary cell model, that miR-200s induce features of mammary luminal progenitor cells, which may explain the association of enhanced self-renewal, tumorigenesis and metastatic potential with high levels of expression of miR-200s.

Identification of sets of microRNAs dysregulated along IDC metastatic progression
Focusing on lymph node spread as a readout for the metastatic potential of cancer cells, we sought to identify microRNAs (miRNAs) with dysregulated expression in the course of IDC metastatic progression. For this, a miRNA microarray expression screening was performed using tissue samples from 8 ductal carcinomas in situ (DCIS), 20 primary IDC without LN involvement (PNM), 20 primary IDC with regional LN involvement (PM), 20 LN metastases matched to the node-positive primary tumors (LNM) and 20 distant metastases (DM) (Supplementary Table 1) and a pool of 10 normal breast epithelial tissues (N). This analysis yielded miRNAs that were significantly upregulated in the transition from normal breast tissues to PM, including miR-200a, miR-200b and miR-429 (Figure 1A). Another set of miRNAs, including miR-181a, miR-181b, miR-210 and miR-7, was upregulated in DM relative to primary tumors (Figure 1A). qPCR confirmed that miR-200b, miR-7 and miR-210 were significantly upregulated from PNM to PM, while miR-7, miR-210, miR-181a and miR-181b were upregulated in DM relative to node-negative tumors (Figure 1B). Furthermore, higher levels of miR-200a, miR-200b and miR-429 were expressed in a majority of LNM samples compared to their matched primary tumors (Figure 2A). Consistently, staining of these samples for E-cadherin, a hallmark of the epithelial reprogramming driven by miR-200s, revealed a generally stronger staining intensity in LNM cancer cells when compared to matched primary tumors (Figure 2B).
Although the primary mode of breast cancer dissemination is lymphogenous [5], if LN metastasis reflects a general propensity of breast cancer cells for metastatic spread, similar changes might be detected in the blood as circulating miRNAs. Because the tissue levels of miR-200b, miR-200c, miR-7, miR-10b, miR-148, miR-101, miR-30a*, miR-181a, miR-181b and miR-210 were significantly dysregulated during metastatic progression, we determined their levels in blood samples prospectively collected from 78 patients with a diagnosis of IDC (Supplementary Table 2). Patients with node-positive tumors at the time of diagnosis displayed higher miR-200b and miR-7 blood levels than patients with node-negative tumors (Figure 2C). Differences were also significant when comparing patients with distant metastases to those bearing only primary tumors (Figure 2C). The blood levels of miR-210 were significantly lower in patients with distant metastasis relative to those with node-negative primary tumors (Figure 2C). The remainder of the microRNAs analyzed did not show significant differences in their blood levels between any of the patient groups studied (not shown). The observed concordance between increased levels of miR-200b and miR-7 during metastatic progression in both tumor tissues and blood argues in favor of a carcinomatous origin of the circulating miRNAs.

miR-200s promote luminal progenitor properties in tandem with tumorigenic and metastatic growth of MCF10CA1h cells
The above results suggest that metastatic breast cancer cells express increased miR-200 levels as compared to primary tumor cells and thus are expected to display an enhanced epithelial phenotype. Although several studies have found an association of low miR-200 levels with invasive cancers and poor prognosis, concluding that constitutive expression of miR-200s suppresses invasiveness and metastatic potential in xenotransplantation models [6,7], more recent studies have reported that high-burden metastatic breast cancer cells express an epithelial gene program [8] and that miR-200s can play pro-metastatic roles [9][10][11][12]. To address this issue, we quantified the expression levels of miR-200b in cancer cell models displaying differential metastatic potentials. Remarkably, two clonal lines derived from MDA-MB-231 basal-like cancer cells [13] presented a striking differential expression of miR-200b, with the highly metastatic SCP2 cells expressing 60-fold higher levels of miR-200b relative to poorly metastatic SCP6 cells (Figure 3A).
Additionally, two subpopulations derived from the non-cancerous MCF10 cell line [14,15], the moderately metastatic MCF10CA1a and the non-metastatic MCF10CA1h cells, also presented an unambiguous differential expression profile of miR-200s, with MCF10CA1a cells expressing miR-200b levels up to 3 orders of magnitude higher than those of MCF10CA1h cells (Figure 3A).
Figure 1. A. Relative expression levels of miRNAs differentially expressed between samples of normal breast epithelium (N), ductal carcinoma in situ (DCIS), primary IDC with no regional lymph node involvement or distant metastasis (PNM), primary IDC with lymph node involvement (PM), matched lymph node metastases (LNM) or distant metastases (DM). The heatmap was generated after probe normalization and selection of differentially expressed miRNAs. B. Quantification by qPCR of levels of selected miRNAs along metastatic progression of IDC. Reference probe-normalized values (2^-ΔCt) are shown relative to the median of values for normal breast epithelial tissues (N).
These MCF10 subpopulations are transformed variants of cells derived from non-tumoral mammary epithelial cells, with the MCF10CA1h subpopulation displaying a mesenchymal-like morphology and low anchorage-independent mammosphere growth potential relative to the more epithelial MCF10CA1a subpopulation [15]. We exploited these distinctive growth features to test the phenotypic consequences of the ectopic expression of miR-200s. Transduction of MCF10CA1h cells with miR-200 cluster 1 (C1; miR-200a, miR-200b, miR-429) or cluster 2 (C2; miR-200c, miR-141) induced an epithelial gene program with a strong upregulation of the epithelial genes E-cadherin, EpCAM and desmoplakin, together with a downregulation of the mesenchymal markers fibronectin, smooth muscle cell actin, TWIST1/2 or ZEB1/2 (Figure 3B-3C), accompanied by characteristic epithelial morphologies whether grown on plastic or embedded in 3-D collagen I gels (Figure 3D). This gain in epithelial gene program and phenotype was accompanied by an inhibition of the capacity of miR-200-expressing MCF10CA1h cells to migrate in wound healing and invasiveness assays (Supplementary Figures S1 and S2).
Importantly, transduction of miR-200s strongly enhanced the capacity of MCF10CA1h cells to form primary and secondary mammospheres (Figure 4A). This was accompanied by a striking 10-fold induction of aldehyde dehydrogenase (ALDH) activity (Figure 4B and Supplementary Figure 3). Further, miR-200s elicited a shift of MCF10CA1h cells from a CD44 high CD24 low to a CD44 high CD24 high cell surface haplotype (Figure 4C). While mammary epithelial stem cells are preferentially enriched in the CD44 high CD24 low population [16], the CD44 high CD24 high population encompasses more committed cell populations, including highly proliferating luminal progenitor cells and differentiated luminal cells [17], that likewise express high levels of ALDH activity [18,19]. The upregulation of ALDH activity in miR-200-expressing MCF10CA1h cells was not accompanied by unambiguously enhanced resistance to drugs such as cis-Pt, etoposide, olaparib or docetaxel (Supplementary Figure 4). Taken together, these features are consistent with a miR-200-mediated induction of a progenitor luminal state in mammary cells [17,18,20,21].
To [...] Figure 5B). Furthermore, the gain in mammosphere formation conferred by miR-200s was completely abolished by the PI3K inhibitor, LY294002, or the mTOR inhibitor, rapamycin (Figure 5C), supporting a model wherein the miR-200-dependent activation of these signaling pathways promotes mammosphere formation. While novel miR-200a and miR-200b target status was demonstrated for the mTOR negative regulator, TSC1, PTEN was not directly targeted (Figure 5D), suggesting an indirect mechanism of downregulation by miR-200s [22]. In this regard, overexpression of miR-200 cluster 2 in MCF10CA1h cells caused an upregulation of miR-205 (Figure 5E), known to directly target PTEN, suggesting a mechanism whereby miR-200s induce the expression of miR-205, which in turn targets and downregulates PTEN. In support of a role for PTEN downregulation in the phenotype induced by miR-200, direct knockdown of PTEN in MCF10CA1h cells significantly enhanced mammosphere growth (Figure 5F). However, it failed to upregulate ALDH activity (Figure 5G) or the number of CD24 pos cells (Figure 5H). On the other hand, knockdown of ZEB2 in MCF10CA1h cells (Supplementary Figure S5) caused a strong increase in both mammosphere growth (Figure 6A) and ALDH activity (Figure 6B). Surprisingly, ZEB2 knockdown, despite prompting an upregulation of miR-200s (Figure 6C), did not completely phenocopy miR-200 expression, since it failed to engage a significant epithelial gene program (Figure 6D-6E) or upregulate CD24 in MCF10CA1h cells (Figure 6F). These observations suggest that the acquisition of strong mammosphere growth induced by miR-200s, in part driven by downregulation of ZEB2, is reinforced by the activation of the PI3K-Akt signaling pathway through indirect downregulation of PTEN. The latter represents a novel mechanism that may explain the significant growth and survival advantages conferred by the expression of miR-200 in MCF10CA1h and, possibly, breast cancer cells.
Figure 5. A. Downregulation of known and predicted miR-200 target RNAs, determined by qPCR. As controls of miR-200 activity, the epithelial markers CDH1 and EPCAM are strongly upregulated, while the mesenchymal marker FN1 is downregulated. B. Induction by miR-200s of steady-state phosphorylation of Akt. Western blotting of total extracts of cells serum-starved for 24 h, probed with anti-pAkt (Ser473), total Akt or β-tubulin. C. Abrogation by LY294002 and rapamycin of miR-200(C1)-induced mammosphere growth of MCF10CA1h cells. D. The mTOR negative regulator TSC1 is a target of miR-200a and miR-200b, and the inositol-1,4,5-trisphosphate 5-phosphatase SYNJ1 is a target of miR-200b. Luciferase assays were run with constructs bearing the indicated 3'-UTR fragments. E. Lentivirally-mediated expression of miR-200-C2 in MCF10CA1h cells induces the expression of endogenous miR-205, as determined by microarray analysis. F. Knockdown of PTEN induces mammosphere growth in MCF10CA1h cells. G. Knockdown of PTEN fails to enhance ALDH activity in MCF10CA1h cells, as assessed by the Aldefluor assay. H. Knockdown of PTEN fails to induce cell surface expression of CD24, determined by flow cytometry.
After seven days of implantation in the cleared mammary fat pads of NOD-SCID mice, the growth of control cells stagnated while that of MCF10CA1h.C1 cells continued at an exponential rate (Figure 7A). Neither control nor miR-200-expressing MCF10CA1h cells produced detectable metastatic growth outside of their sites of implantation for the duration of local growth monitoring and up to two additional months of follow-up after removal of the implanted tumors. However, intravenous inoculation of MCF10CA1h.C1 cells resulted in tumor colonization of the lungs at significantly higher rates than with control cells, and with a significantly higher number of metastases (mean: 7.6 vs. 2.9 per lung) (Figure 7B-7C), illustrating that enhanced metastatic colonization and growth can be uncoupled from local escape functions of tumor cells [23].

miR-200s provide cues for the morphogenetic differentiation of MCF10CA1h cells
The induction by miR-200s of breast luminal progenitor properties in MCF10CA1h cells prompted us to interrogate whether these cells could acquire further differentiated features under appropriate stimuli. When MCF10CA1h.C1 cells were grown in 5% Matrigel 3-D lattices, they formed highly organized branched tubular structures with multiple terminal hollow alveolar-like structures (Figure 8A), reminiscent of the complex structures formed in vitro from explanted normal mammary epithelial progenitor cells [24]. In contrast, control MCF10CA1h cells only formed amorphous or solid spherical structures (Figure 8A). Further, by coexpressing miR-200b and green fluorescent protein, we demonstrate that these cells contributed equally to both ductal- and alveolar-like structures (Figure 8A).
In addition to the induction of an epithelial gene program, expression of miR-200s in MCF10CA1h cells elicited a marked upregulation of the basal keratin, KRT5 (Figure 8B-8C), and a more modest, but significant, induction of the luminal keratins, KRT8 and KRT18, as well as the basal marker, p63 (Figure 8B-8C). In concert with these changes, miR-200-expressing cells were enriched in the active transcription mark, histone H3K4me3, on the KRT18 promoter and depleted in the repressive mark, H3K27me3, on the KRT5 promoter (Figure 9), suggesting that the observed transcript modulation is at least partially attributable to a shift in transcriptional programs induced by miR-200s. miR-200-expressing MCF10CA1h cells also showed a significantly higher proliferation index than control cells as determined by Ki67 staining (Supplementary Figure 6). These cells failed to express the luminal differentiation marker, estrogen receptor alpha, or the myoepithelial marker, myosin heavy chain. Remarkably, 3-D Matrigel cultures of MCF10CA1h.C2 cells, and more weakly MCF10CA1h.C1 cells, expressed the luminal differentiation master regulator, GATA3, which was not expressed in the absence of Matrigel (Supplementary Figure 6).
Tumors grown in mice upon orthotopic implantation of control MCF10CA1h cells displayed morphologically heterogeneous areas that included predominantly spindle-shaped and mesenchymal-like elements along with areas of a more epithelioid appearance that failed to generate glandular structures (Figure 10). In contrast, tumors formed by MCF10CA1h.C1 and MCF10CA1h.C2 cells contained fields dominated by a more epithelioid appearance, including areas that had undergone morphological differentiation into gland-like structures (Figure 10). Similar to the phenotypes observed in vitro, the epithelioid components of these tumors showed intense staining for E-cadherin as well as luminal (KRT8) and basal (KRT5) cytokeratins (Figure 10). Further, whereas the mesenchymal marker, vimentin, was strongly and diffusely expressed in control tumors, it was expressed only in spindle-cell areas of MCF10CA1h-miR200 tumors (Supplementary Figure 7). Although GATA3 was focally expressed in epithelioid areas of MCF10CA1h.C2 tumors (Supplementary Figure 7), all tumors failed to express estrogen receptor-alpha or smooth muscle myosin heavy chain (not shown). Finally, p63 was diffusely expressed in spindle-cell areas of tumors from control cells and was highly expressed in periglandular cells in tumors from MCF10CA1h-miR200 cells (Figure 10).
To summarize, the upregulation of basal (KRT5, p63) and luminal (KRT8/18, GATA3, CD24) markers by miR-200 in these cells, together with the robust induction of ALDH activity and the epithelial markers, EpCAM and E-cadherin, as well as their capacity to undergo glandular differentiation upon orthotopic implantation in mice, reinforces the hypothesis that miR-200s drive the acquisition of luminal progenitor characteristics [17].

The epithelial components of metaplastic breast cancer express high levels of miR-200 and luminal progenitor cell marker profiles
The morphological features of tumors formed in mice by MCF10CA1h-miR200 cells are reminiscent of metaplastic breast cancer (MBC) with spindle cell component [25]. As such, we hypothesized that the epithelial/glandular components of human MBC would express high levels of miR-200 and co-express luminal and basal markers. Given the histological and likely underlying genetic and epigenetic heterogeneity of these rare tumors [26], we restricted this analysis to five cases of the carcinosarcoma subtype, with well-delimited dual epithelial and mesenchymal components as determined by mutually exclusive expression of cytokeratins and vimentin. In 4 cases, we macrodissected the epithelial and mesenchymal components of the tumors, finding that, in 3 of the 4 tumors analyzed, the epithelial components expressed significantly higher levels than the mesenchymal components of at least two of the five miR-200s quantified (Figure 11A).
Immunohistochemical analysis showed that KRT8 was expressed in the epithelial component of all five carcinosarcoma MBC cases (Supplementary Table 3). In all but one case, the same cellular areas co-expressed KRT5 (Figure 11B). In addition, ALDH was strongly expressed in the epithelial components of 3 of the 5 cases studied, corresponding to those with a strong expression of KRT8 and weaker or focal expression of KRT5 (Figure 11B). As expected, E-cadherin was consistently and diffusely expressed in the epithelial components of all cases (Figure 11B) while estrogen receptor-alpha and HER2 were undetectable (not shown). Interestingly, one case with strong ALDH and KRT8 staining in the epithelial component also showed positive nuclear staining for GATA3 (Figure 11C). In sum, the epithelial components of MBC with dual mesenchymal and epithelioid components (carcinosarcomas) express markers of breast luminal progenitor cells, including ALDH and dual luminal-basal cytokeratins, coincident with high levels of miR-200s.

DISCUSSION
In this study, we have identified sets of miRNAs that are dysregulated along the metastatic progression of IDC. Of these, only miR-200 and miR-7 showed the same trend in tissue samples and in blood, namely a statistically significant tendency for higher levels in distant metastatic cases than in node-positive primary cases and tumors (PM), and for higher levels in PM than in node-negative primary cases and tumors. These concordances suggest that circulating miR-200 and miR-7 levels reflect the expression levels of these miRNAs in the tumors borne by the patients.
Previous reports have shown that miR-200s circulate at higher levels in association with metastasis in epithelial tumors [11,27,28], in consonance with reports demonstrating that circulating breast cancer cells with metastatic potential are predominantly epithelial [29] and that expression of miR-200s promotes the metastatic behavior of breast cancer cells [10][11][12]. However, a more widely held view is that miR-200s prevent metastasis through induction of an epithelial gene program to the detriment of EMT and invasive properties of cells [6,7]. Moreover, in addition to shifting the balance in favor of epithelial gene programs, miR-200s have been found to negatively regulate self-renewal gene networks by targeting BMI1 or SOX2, thereby limiting the growth of normal and tumor cells [30]. Indeed, we have observed a downregulation of SOX2 upon expression of miR-200s. However, this was accompanied by enhanced mammosphere growth, suggesting that expression of SOX2 does not play a significant role in the self-renewal properties of these cells. Likewise, our observations do not support a role for BMI1 in the maintenance of the growth characteristics of these cells, and thus alternative mechanisms must be invoked to explain the observed phenotypes. In this regard, the key roles found by others for miR-200s and an epithelial gene program in the generation of induced pluripotent stem cells with high self-renewal characteristics [31,32] are in agreement with our observations that miR-200s promote, rather than counter, self-renewal (mammosphere growth) and metastatic properties of MCF10CA1h cells. Recent reports also support a metastasis-promoting function of miR-200 [10][11][12], in one case linked to its targeting of specific secretory pathways [10].
Our observations suggest a novel mechanism by which miR-200s promote self-renewal and the acquisition of progenitor cell characteristics, through activation of the PI3K-Akt-mTOR signaling pathway as a result of indirect downregulation of PTEN and direct targeting of TSC1. Although it has been reported by others that miR-200 family members bind to the 3'UTR of PTEN, resulting in its downregulation [33], we have not found a direct interaction, which leads us to suggest an indirect mechanism. Relevantly, we have found that miR-205, a validated interactor and modulator of PTEN mRNA, is strongly upregulated in MCF10CA1h cells upon expression of miR-200s. We thus propose that induction of miR-205 by miR-200s, through mechanisms yet to be determined, leads to a downregulation of PTEN and consequent constitutive activation of the PI3K-Akt-mTOR signaling axis. Downregulation of PTEN and activation of Akt and mTOR have been shown to be critical for proliferation and survival of cancer stem cells (CSCs) and, specifically, breast cancer stem cells [34], and an active PI3K/Akt/mTOR signaling axis is required for the maintenance of the undifferentiated properties of embryonic stem cells (ESCs) [35] and iPS cells [36]. Interestingly, the acquisition of a CSC-like phenotype by reprogramming differentiated luminal-like cells is associated with the activation of mTOR [37].
A further relevant and novel finding of our study is the induction by miR-200s of luminal progenitor-like features in MCF10CA1h cells, endowing them with a capacity for morphological differentiation, accompanied by the expression of markers associated with that stage in the differentiation of mammary epithelial cells. Thus, under the influence of miR-200s, MCF10CA1h cells develop well-defined albeit imperfect (monolayered rather than bilayered) tubuloalveolar structures and show a limited proneness to luminal differentiation, with weak expression of GATA3 when grown in Matrigel or in vivo but failure to express estrogen receptor. These observations suggest that complex breast morphogenetic events can take place in the absence of terminal differentiation of the two major breast epithelial lineages, that miR-200s promote this process and that environmental factors, yet to be identified, provide additional cues that drive the differentiation of miR-200-expressing breast epithelial cells along the luminal lineage.
The induction of high levels of ALDH activity, a CD44 high CD24 high haplotype, enhanced mammosphere growth, alveolotubular morphogenesis in vitro and in vivo, and a proneness to luminal differentiation support our proposal that miR-200s induce a luminal progenitor phenotype in MCF10CA1h cells. In the context of our observations, prior evidence that constitutive expression of miR-200 in breast stem cells causes attrition of luminal progenitor cells, thus preventing luminal differentiation and correct breast morphogenesis [30], may indicate that the promotion by miR-200s of luminal progenitor phenotypes occurs within defined developmental windows and that prolonged expression cripples normal progenitor cells in physiological contexts. Our observation of high levels of expression of miR-200s in breast cancer cells with aggressive phenotypes or in MCF10CA1a cells indicates that the compromise in viability imposed by constitutive expression of miR-200s in normal breast stem cells can be overcome by factors that drive spontaneous (neoplastic) or experimentally induced immortalization. The precise correspondence of the MCF10CA1h cell line to a physiological stage along the mammary epithelial lineage has not been established, and therefore it is difficult to propose whether miR-200s drive these cells from a less differentiated to a more differentiated stage or in the converse direction. The failure of these cells to undergo full terminal differentiation after miR-200 expression is more likely related to specific constraints pertaining to MCF10 cells, either inherent in the original cells or acquired through the generation of different immortalized variants and subpopulations [14,15], than to a differentiation block imposed by miR-200s [30].
Our analysis of metaplastic breast cancer samples indicates that tumor cells expressing epithelial programs and miR-200s can co-express luminal and basal keratins and ALDH. Co-expression of luminal and basal keratins in MBC has been reported previously and interpreted as a reflection of the cells of origin of these tumor cells being bipotent stem cells or myoepithelial cells [38][39][40]. Although such interpretations may apply to some subtypes of MBC, they may not suit the carcinosarcoma-type cases, since they fail to consider that normal breast epithelial stem cells express low levels of ALDH [18] and that myoepithelial cells do not express luminal keratins [17]. In light of the evidence presented herein for the MCF10CA1h cell model, our findings of co-expression of luminal and basal keratins and ALDH suggest that the epithelial neoplastic components of a subset of MBC may have originated in cells expressing an early progenitor luminal phenotype but retaining bipotent features. Evidence suggests that most other types of breast cancer originate from luminal progenitor cells [17,41], with a minority of tumor types, including claudin-low and some metaplastic tumors [26], possibly originating from less differentiated cells that might correspond to mammary stem cells [17]. The epithelial components of MBC may harbor sufficient phenotypic plasticity to give rise to mesenchymal (and other metaplastic) neoplastic components through the engagement of EMT in response to epigenetic cues or genetic mutations [42,43]. In addition, based on our observations with MCF10CA1h cells, we propose that the more intrinsically aggressive components of MBC are the epithelial components, rather than the mesenchymal components.
Indeed, metastatic samples from MBC tend to display hallmarks of epithelial differentiation [44] and subtypes with a higher representation of epithelial components have been reported as associated with higher rates of distant metastasis and worse outcomes than those with a greater representation of mesenchymal components [45].
In conclusion, we propose that microRNAs of the miR-200 family promote traits of highly proliferative breast luminal progenitor cells and may contribute to exacerbate the growth and metastatic properties of transformed breast epithelial cells, including those present in histological variants such as metaplastic carcinomas.

Sample procurement and processing
All patient samples were procured through the Hospital Clínic-IDIBAPS Biobank. Formalin-fixed and paraffin-embedded (FFPE) samples from invasive ductal carcinomas were selected on the basis of estrogen receptor (ER), progesterone receptor (PR) and HER2 expression. Also, normal breast epithelial tissues from patients undergoing reductive mammoplasty were collected (Supplementary Table 1). Samples with > 70% tumor epithelial enrichment were macrodissected to minimize stromal and lymphocytic components. Blood samples were collected from untreated breast cancer patients at diagnosis and 6 healthy women with negative mammographies (Supplementary Table 2).

RNA isolation, reverse transcription and real-time RT-PCR of cell culture samples
For RNA extraction of adherent cells, cells were grown to 70-80% confluence and lysed directly on the plate with QIAzol lysis reagent. Mammospheres were collected by gentle centrifugation and resuspended in QIAzol. 3D structures were recovered from Matrigel using the non-enzymatic solution Matrisperse (Cultek), following the manufacturer's instructions, and resuspended in QIAzol after gentle centrifugation. RNA was isolated with the miRNeasy Mini kit (Qiagen). cDNA was synthesized with the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems). Real-time quantitative PCR assays were performed on a LightCycler 480 instrument (Roche) and analyzed with the LightCycler 480 Software release 1.5.0. The Universal Probe Library system (UPL) (Roche) was used to quantify transcripts. Probes and sequences are shown in Supplementary Table S5. RN18S5 amplification levels were used as an internal reference, and relative transcript quantification was determined by the ∆∆Cp method.
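The ∆∆Cp relative quantification mentioned above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name, the example Cp values and the assumption of perfect amplification efficiency (a factor of 2 per cycle) are all hypothetical.

```python
def relative_expression(cp_target_sample, cp_ref_sample,
                        cp_target_calibrator, cp_ref_calibrator,
                        efficiency=2.0):
    """Relative transcript level by the delta-delta-Cp method.

    efficiency=2.0 assumes perfect doubling per cycle (an assumption,
    not a value reported in the text).
    """
    # Normalize the target to the internal reference (e.g. RN18S5) in each condition.
    delta_sample = cp_target_sample - cp_ref_sample
    delta_calibrator = cp_target_calibrator - cp_ref_calibrator
    # Difference between the sample and the calibrator (e.g. control cells).
    delta_delta = delta_sample - delta_calibrator
    # A lower Cp means more template, hence the negative exponent.
    return efficiency ** (-delta_delta)


# Hypothetical Cp values: the target crosses threshold 2 cycles earlier
# in the sample than in the calibrator, at equal reference levels.
fold = relative_expression(24.0, 10.0, 26.0, 10.0)
print(fold)  # 4.0
```

With equal reference Cp values, a 2-cycle advantage corresponds to a 4-fold higher relative level under the doubling assumption.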

MicroRNA quantitative PCR
Total RNA was reverse-transcribed with the Universal cDNA Synthesis kit (Exiqon). Mature microRNAs were detected using the ExiLENT SYBR Green Master Mix (Exiqon) and specific LNA™ primers. miR-16 and let-7a were used as reference microRNAs. Real-time quantitative PCR assays were performed on a StepOnePlus instrument (Life Technologies). Relative quantifications were assessed by the ∆∆Cp method. Normalized values were compared between categories of samples using either parametric (t-test) or non-parametric (Mann-Whitney) tests. For quantification of microRNAs in blood, total RNA was isolated from 2.5 mL of blood collected in PAXgene Blood RNA tubes (Qiagen) and reverse-transcribed, and microRNA levels were determined by real-time PCR as above, using miR-16 and miR-103a-3p as references.

mRNA microarray analysis
Total mRNA was isolated and processed for hybridization on Human Gene ST 2.1 strips (Affymetrix). Signals were fitted with a probe-level model, and expression values were calculated and log2-transformed using a robust multi-array average (RMA) [49]. Probes with ≥ 2-fold change in intergroup comparisons were selected for hierarchical clustering analysis and heatmap plotting [50].
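On log2-transformed values, a ≥ 2-fold change corresponds to an absolute difference of at least 1 between group means. The following is a minimal sketch of such a filter, not the pipeline actually used; the probe IDs, sample names, expression values and the choice of comparing group means are all illustrative assumptions.

```python
import math
from statistics import mean


def select_probes(log2_expr, group_a, group_b, min_fold=2.0):
    """Return probe IDs whose group means differ by at least min_fold.

    log2_expr maps probe ID -> {sample name: log2 expression value}.
    Comparing group means is an assumption made for this sketch.
    """
    threshold = math.log2(min_fold)  # 2-fold -> 1 log2 unit
    selected = []
    for probe, values in log2_expr.items():
        diff = (mean(values[s] for s in group_a)
                - mean(values[s] for s in group_b))
        if abs(diff) >= threshold:  # up- or downregulated
            selected.append(probe)
    return selected


# Hypothetical RMA-style log2 expression values.
log2_expr = {
    "probe_1": {"ctrl_1": 6.0, "ctrl_2": 6.2, "mir_1": 8.0, "mir_2": 8.2},
    "probe_2": {"ctrl_1": 7.0, "ctrl_2": 7.1, "mir_1": 7.3, "mir_2": 7.2},
}
hits = select_probes(log2_expr, ["mir_1", "mir_2"], ["ctrl_1", "ctrl_2"])
print(hits)  # ['probe_1']
```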

Collagen I 3-D culture
Type I collagen was isolated from mouse tail tendons as described previously [51] and dissolved in 0.2% acetic acid at a final concentration of 2.7 mg/mL. Before gelation, the collagen solution was mixed with 10× minimum essential medium (MEM) and 0.34 N NaOH at a ratio of 8:1:1 at 4 °C, and MCF10CA1h control and MCF10CA1h.C2 cells (1-5 × 10⁶) were suspended in 1 mL of this mixture. The carcinoma cell-collagen mixtures were incubated for 1 h at 37 °C to allow for gelation, and culture medium (MEM supplemented with 10% FCS) was added atop the gel.
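As a worked example of the 8:1:1 mixing step, the sketch below computes component volumes and the resulting collagen concentration. It assumes that 2.7 mg/mL refers to the collagen stock before mixing, which is one reading of the text; the function name is hypothetical.

```python
def collagen_mix(total_ml, stock_mg_per_ml=2.7, parts=(8, 1, 1)):
    """Volumes for a collagen : 10x MEM : 0.34 N NaOH mixture.

    Assumes stock_mg_per_ml is the collagen concentration before
    mixing (an interpretation of the text, not a stated fact).
    """
    total_parts = sum(parts)
    collagen_ml, mem_ml, naoh_ml = (total_ml * p / total_parts for p in parts)
    # Dilution of the collagen stock by the other two components.
    final_mg_per_ml = stock_mg_per_ml * parts[0] / total_parts
    return {
        "collagen_ml": collagen_ml,
        "mem_10x_ml": mem_ml,
        "naoh_ml": naoh_ml,
        "collagen_mg_per_ml": final_mg_per_ml,
    }


mix = collagen_mix(1.0)
for name, value in mix.items():
    print(f"{name}: {value:.2f}")
```

Under this assumption, 1 mL of mixture contains 0.8 mL of collagen solution, 0.1 mL of 10× MEM and 0.1 mL of NaOH, giving a working collagen concentration of about 2.16 mg/mL.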

Wound-healing assay
Cells were seeded in 24-well plates (Corning) at 2 × 10⁴/well in DMEM:HAM medium supplemented with 5% horse serum (HS). After reaching 80% confluence, the medium was replaced with fresh medium devoid of serum containing 0.5% (w/v) mitomycin C (Sigma). After 1 h of treatment, the medium was replaced with fresh medium supplemented with 0.5% HS, and approximately 1-mm wide wounds were produced in the confluent monolayer. Wounds were imaged at 0 h, 12 h, 24 h and 48 h and analyzed with the aid of ImageJ. At least three wounds per condition were scored.

In vitro invasiveness assay
Transwell chambers (Costar) with 8-µm diameter pore membranes were coated with growth factor-reduced Matrigel (BD Biosciences) at 410 µg/mL. Cells were serum-deprived for 24 h, detached, resuspended in medium supplemented with 1% BSA/0.5% FBS and then seeded (1.5 × 10⁵/well in 24-well plates) onto the pre-coated Transwell inserts, with the lower chamber containing medium supplemented with 0.5% FBS. After 24 h, cells migrating to the lower chamber were collected by detachment with trypsin-EDTA and washed with PBS, and fluorescent cells were scored in a Coulter Epics XL instrument (Coulter Electronics, Luton, UK). Each condition was done in quadruplicate.

Cytotoxicity assay
Cells were seeded in 96-well plates (Corning) at 1.5 × 10³ cells/well, allowed to attach overnight and exposed to varying concentrations of drugs for 96 h. After treatment, 10 µL of MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide) solution (Sigma) was added and incubated at 37 °C for 3 h. Crystals were solubilized with 100 µL of 0.08 M HCl/isopropanol by shaking for 30 min in the dark, and absorbance was recorded at 570 nm. Each condition was done in quintuplicate.

Flow cytometry
Aldehyde dehydrogenase activity was detected with the Aldefluor kit (Stem Cell Technologies), used according to the manufacturer's protocol, and analyzed by flow cytometry on a Gallios Flow Cytometer instrument (Beckman Coulter). For cell surface immunophenotyping, cells were detached with 0.25% trypsin/0.1% EDTA, washed and incubated with primary antibodies against CD44 (Alexa Fluor 647, anti-mouse/human, 1:4000 dilution, BioLegend) and CD24 (Alexa Fluor 488, anti-human, 1:20 dilution, BioLegend) in PBS/3% normal goat serum for 30 min in a shaker at 4 °C, then washed and analyzed by flow cytometry.

3'UTR luciferase reporter assays
The psiCHECK2-PTEN 3'UTR construct was obtained from Addgene (Plasmid 50936). A plasmid containing the 3'UTR of SYNJ1 was generated by cloning a 3'UTR fragment of 628 bp using the Zero Blunt PCR cloning kit (Invitrogen). Due to size constraints, the TSC1 3'UTR was cloned as three different fragments (fragment #1: 483 bp; fragment #2: 941 bp; fragment #3: 368 bp) into the pCR Blunt vector. SYNJ1 and TSC1 3'UTR fragments were subcloned into psiCHECK-2 (Promega) using XhoI and NotI restriction sites. The primers used are listed in Supplementary Table S6. Reporter assays were performed as follows: HEK293T cells were transduced with lentiviral particles carrying either pmiR-empty or pmiR-200b. Seventy-two hours post-infection, cells were seeded into 96-well plates and transfected with 50 ng of the indicated 3'UTR reporter vectors for an additional 24 h (n = 4 per condition). Luciferase activity was measured using the Dual-Glo Luciferase Assay System (Promega). Renilla luciferase activity was normalized to the corresponding firefly luciferase activity and plotted as a percentage of the control.
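The normalization of reporter readings can be sketched as follows. This is a minimal illustration with hypothetical luminescence values; taking the mean of the control ratios as the 100% reference is an assumption, since the text does not specify how the control value was summarized.

```python
from statistics import mean


def percent_of_control(renilla, firefly, control_renilla, control_firefly):
    """Renilla/firefly ratios expressed as a percentage of the control.

    The control reference is the mean Renilla/firefly ratio of the
    control wells (an assumption made for this sketch).
    """
    ratios = [r / f for r, f in zip(renilla, firefly)]
    control_ratios = [r / f for r, f in zip(control_renilla, control_firefly)]
    control_mean = mean(control_ratios)
    return [100.0 * ratio / control_mean for ratio in ratios]


# Hypothetical readings, n = 4 wells per condition.
empty_renilla = [1000.0, 1100.0, 900.0, 1000.0]   # pmiR-empty
empty_firefly = [500.0, 550.0, 450.0, 500.0]
mir_renilla = [500.0, 600.0, 550.0, 450.0]        # pmiR-200b
mir_firefly = [500.0, 500.0, 500.0, 500.0]

values = percent_of_control(mir_renilla, mir_firefly,
                            empty_renilla, empty_firefly)
print([round(v, 1) for v in values])
```

Because the 3'UTR is fused to the Renilla gene in psiCHECK-2, a value below 100% indicates repression of the reporter relative to the control condition.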

Chromatin immunoprecipitation
Adherent cells were fixed in 1% formaldehyde for 10 min, quenched with 0.125 M fresh glycine for 5 min, washed twice with PBS and lysed (1% SDS; 10 mM EDTA pH 8.0; 50 mM Tris-HCl pH 8.1, with protease inhibitors), and samples were kept on ice for 20 min. Cell lysates were sonicated in a Branson 450 sonicator (5 cycles of 20 seconds at 30% amplitude) to yield 200-500 bp chromatin fragments. Chromatin was purified by centrifugation at 13,200 rpm at 4 °C for 30 min and precleared with protein A agarose for 30 min, and 25 µg of chromatin was immunoprecipitated with 5 µg of one of the following antibodies: H3K27me3 (Millipore 07-449), H3K4me3 (Abcam ab8580) or nonspecific rabbit IgGs (Diagenode C15410206). Antibody-chromatin complexes were recovered with magnetic beads (Magna ChIP™ Protein A Magnetic Beads, Millipore 16-661), and immunocomplexes were washed once each with TSE I (0.1% SDS; 1% Triton X-100; 2 mM EDTA pH 8.0; 20 mM Tris-HCl pH 8.1; 150 mM NaCl), TSE II (0.1% SDS; 1% Triton X-100; 2 mM EDTA pH 8.0; 20 mM Tris-HCl pH 8.1; 500 mM NaCl) and TSE III (0.25 M LiCl; 1% NP-40; 1% sodium deoxycholate; 1 mM EDTA pH 8.0; 10 mM Tris-HCl pH 8.1), and twice with TE (10 mM Tris-HCl, 1 mM EDTA). Crosslinking was reversed by overnight incubation at 65 °C in elution buffer (1% SDS, 0.1 M NaHCO₃). DNA was purified by phenol-chloroform extraction followed by ethanol precipitation. Enrichment of target regions was determined by qPCR in a LightCycler 480 instrument (Roche) using the primers listed in Supplementary Table S7. A fraction of input was used to quantify the immunoprecipitated material relative to the total starting chromatin, and the values obtained for nonspecific IgG were subtracted from those for specific antibodies. A region in the ACTB promoter was used as a control for the enrichment of H3K4me3 marks, and a region in the NEUROD1 promoter as a control for the enrichment of H3K27me3 marks. Percent input values for test promoter regions were normalized against these two controls to calculate relative enrichments in these two histone marks.
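The percent-input quantification described above can be sketched as follows. The input fraction (1%), the Cp values and the assumption of a doubling per cycle are hypothetical and are not taken from the text; function names are illustrative.

```python
def percent_input(cp_ip, cp_input, input_fraction):
    """Percent of starting chromatin recovered in an immunoprecipitate.

    input_fraction is the share of total chromatin saved as input
    (e.g. 0.01 for 1%). Assumes an amplification efficiency of 2.
    """
    return 100.0 * input_fraction * 2.0 ** (cp_input - cp_ip)


def specific_signal(cp_antibody, cp_igg, cp_input, input_fraction):
    """Percent input for the specific antibody minus the IgG background."""
    return (percent_input(cp_antibody, cp_input, input_fraction)
            - percent_input(cp_igg, cp_input, input_fraction))


def relative_enrichment(test_region, control_region, input_fraction):
    """Background-subtracted signal of a test region relative to a control
    region (e.g. ACTB for H3K4me3 or NEUROD1 for H3K27me3)."""
    test = specific_signal(input_fraction=input_fraction, **test_region)
    control = specific_signal(input_fraction=input_fraction, **control_region)
    return test / control


# Hypothetical Cp values for one histone mark.
test_region = {"cp_antibody": 27.0, "cp_igg": 32.0, "cp_input": 25.0}
control_region = {"cp_antibody": 26.0, "cp_igg": 32.0, "cp_input": 25.0}

enrichment = relative_enrichment(test_region, control_region, 0.01)
print(round(enrichment, 3))
```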

In vivo tumorigenic and lung colonization assays
Cells were transduced with pCMV-GFP/luc for the constitutive co-expression of the firefly luciferase gene and GFP and selected by FACS. Female SCID-NOD mice (8-10 weeks old) were injected with 50 µL of 2 × 10⁶ cells suspended in 50% Matrigel/PBS into cleared abdominal mammary fat pads. Tumor growth was monitored after intraperitoneal injection of 150 mg/kg D-luciferin (Caliper Life Science) and imaging in an ORCA-2BT instrument (Hamamatsu Photonics). To assess lung colonization, 5 × 10⁵ cells in 150 µL were injected into the tail vein of mice.

Study approval
All protocols involving patient selection and sample procurement complied with Spanish laws regarding data protection and written informed consent and were approved by the Hospital Clínic-IDIBAPS Ethics Committee and Review Board. All animal procedures were reviewed and approved by the Institutional Animal Experimentation Ethics Committee (CSIC).