Expression analysis of microRNA as prognostic biomarkers in colorectal cancer

microRNA (miRNA) based biomarkers have unique advantages due to their critical regulatory function, superior stability, and relatively small number compared to mRNAs. A number of miRNAs play key roles in colon cancer stem cell chemoresistance and have clinical potential as prognostic biomarkers. The purpose of this study is to systematically validate the prognostic potential of miRNAs in colorectal cancer. In this study, we validated the prognostic potential of a panel of miRNAs using 205 stage II, III, and IV colorectal cancer specimens by qRT-PCR analysis. We cross validated our results using The Cancer Genome Atlas (TCGA) database. Many of the miRNAs we investigated have been functionally validated to be important in contributing to chemoresistance to 5-fluorouracil (5-FU) based chemotherapy. We determined that miR-16 is the most consistent miRNA for expression normalization in colorectal cancer. We have validated several miRNAs (miR-15b, miR-215, miR-145, miR-192, let-7g) that are significantly associated with progression free survival (PFS) and/or overall survival (OS) of colorectal cancer patients independent of tumor stage and age at diagnosis. These 5 miRNAs are significantly associated with OS of colorectal cancer even after tumor location (left side vs. right side) is adjusted for. Furthermore, the prognostic value of let-7g for overall survival was independently validated using the RNA-Seq results from TCGA colorectal cancer database. These results, taken together, establish a solid foundation towards miRNA based precision management of colorectal cancer.


INTRODUCTION
Colorectal cancer is one of the leading causes of cancer related death in the United States with more than 50,000 deaths every year [1]. The current 5-year survival rate for stage II colorectal cancer patients is between 70-80%. Surgery is the standard treatment option for stage II colorectal cancer. However, about 30% of stage II patients will have relapse and there is no reliable biomarker to determine which patients are at high risk and should be managed with adjuvant chemotherapy. As for advanced stage III and IV colorectal cancer patients, despite years of effort, there is still a lack of highly reliable prognostic biomarkers to determine which patients will benefit from chemotherapy. In both early and advanced stages of colorectal cancers, there is clearly an unmet need for biomarkers for better clinical management.
Recently, it has been recognized that epigenetic changes play a key role in tumorigenesis and resistance to chemotherapy [2,3]. Resistance to 5-fluorouracil (5-FU) based chemotherapy is the major reason for failures in treating advanced colorectal cancer. Colorectal cancer cells are highly heterogeneous, and chemotherapy can be quite effective in eliminating most of the rapidly proliferating cancer cells. However, a small population of slowly proliferating cancer stem cells is highly resistant and leads to recurrence [4]. Although the mechanism of chemoresistance to 5-FU is complex and is often associated with elevated levels of the 5-FU target enzyme thymidylate synthase (TS, TYMS) [5,6], recent studies have shown that epigenetic alterations such as non-coding miRNAs are major contributors to resistance mechanisms to 5-FU. miRNAs regulate acute changes in protein synthesis at the post-transcriptional and translational levels [3,7-12].
miRNAs are a class of small non-coding RNAs with crucial regulatory function [13,14]. miRNAs modulate protein expression by promoting RNA degradation, inhibiting mRNA translation, and in some cases, affecting transcription. miRNA regulation of gene expression provides cancer cells with an advantage in response to genotoxic stress and changes in growth conditions. We have determined through systematic evaluation that miRNAs are highly stable in archival formalin fixed paraffin embedded (FFPE) colorectal tumor specimens [15]. This result provides the foundation for the investigation of miRNA based biomarkers using large deposits of archival FFPE specimens with long term clinical follow up information. We subsequently demonstrated the clinical significance of miRNAs (e.g. let-7g, miR-15a, miR-215, miR-129, miR-181b, miR-140, miR-200c) in colorectal cancer, especially for long term survival of patients treated with fluoropyrimidine based chemotherapy [9,16-19]. Our subsequent studies demonstrated that these miRNAs have direct functional significance in colorectal cancer by regulating key targets such as thymidylate synthase (TYMS, TS) [20], dihydrofolate reductase (DHFR) [7], histone deacetylase [3], E2F3, and Bcl-2 [18]. Some of these are directly linked to chemoresistance in highly resistant colon cancer stem cells [12,19].
In this study, we systematically validated the prognostic potential of candidate miRNAs in stage II, III and IV colorectal cancer. We quantified expression of a panel of 11 miRNAs (let-7g, miR-15a, miR-15b, miR-21, miR-140, miR-143, miR-145, miR-181b, miR-192, miR-200, miR-215) selected based on their critical functions in chemoresistance and cell death in colorectal cancer as well as several profiling studies [21,22]. We also quantified four housekeeping genes (RNU44, 5S, β-actin, miR-16) to determine the best housekeeping gene for normalization of miRNA expression. Our results show that the best housekeeping miRNA for normalization is miR-16. We identified several miRNAs (miR-15b, miR-215, miR-145, miR-192, let-7g) as significant prognostic biomarkers of OS that are independent of tumor stage and age at diagnosis. It has been demonstrated that the primary tumor location is an important prognostic factor in metastatic colorectal cancer [23-26]. We therefore also analyzed the prognostic potential of these miRNAs while taking into account whether the primary tumor was located on the left side or the right side. Our results show that let-7g is significantly associated with OS, with independent validation using TCGA colorectal cancer datasets.

Determining the best housekeeping gene for normalization of miRNA in colorectal cancer
One of the most important considerations for expression based biomarker analysis is to find and validate a true housekeeping gene for normalization. We systematically searched the literature and selected several candidates for miRNA expression normalization in colorectal cancer. The candidates that we selected are ribosomal RNA 5S, RNU6b, β-actin, RNU44 and miR-16. Based on the expression profiles of these genes from 200 colorectal cancer samples, we show that miR-16 is the best housekeeping gene for miRNA expression analysis (Figure 1). The relative CT values of gene expression are listed in Figure 1A. Our results show that miR-16 is relatively more abundant than the other housekeeping gene candidates. The standard deviations of the expression of each potential housekeeping gene are listed in Figure 1B. We have previously used RNU44 as a normalization control, and RNU44 is also a good housekeeping gene with a lower standard deviation than miR-16. However, due to its relatively low expression level compared to miR-16, we chose miR-16 as the best housekeeping gene for this study.
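The comparison described above, in which each candidate reference gene is judged by its abundance (mean CT) and its variability (standard deviation of CT), can be sketched in a few lines of Python. This is only an illustration: the CT values and the helper name `summarize` are hypothetical and are not data or code from this study.

```python
from statistics import mean, stdev

# Hypothetical raw CT values per candidate reference gene (illustrative only,
# not measurements from this study). A lower CT means higher abundance.
ct_values = {
    "miR-16": [24.1, 24.8, 23.9, 25.0, 24.4],
    "RNU44": [29.0, 29.3, 28.9, 29.4, 29.1],
    "5S": [18.0, 21.5, 19.2, 22.8, 17.6],
}


def summarize(ct_by_gene):
    """Return (gene, mean CT, sample SD) tuples sorted by increasing SD."""
    rows = [(gene, mean(cts), stdev(cts)) for gene, cts in ct_by_gene.items()]
    return sorted(rows, key=lambda row: row[2])


summary = summarize(ct_values)
for gene, mean_ct, sd_ct in summary:
    print(f"{gene}: mean CT = {mean_ct:.2f}, SD = {sd_ct:.2f}")
```

In this made-up example the candidate with the smallest SD also has a higher mean CT (lower abundance) than miR-16, which mirrors the trade-off between stability and expression level discussed in the text.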

Survival analysis
We analyzed the association of miRNA expression with progression free survival (PFS) and overall survival (OS) in all patients after adjusting for stage and age at diagnosis. The association results are summarized in Table 1. Our results show that low expression of let-7g, miR-15b or miR-192 was significantly associated with better OS (estimated HRs of low expression versus high expression were 0.62, 0.26 and 0.64 with p-values of 0.0238, 0.0024 and 0.0406, respectively), while high expression of miR-145 was significantly associated with better OS (estimated HR=2.71 with p-value=0.0077). Low expression of miR-15b and miR-215 was also significantly associated with better PFS (estimated HRs=0.33 and 0.48 with p-values=0.0132 and 0.0235, respectively), while high expression of miR-145 was significantly associated with better PFS (estimated HR=2.50 with p-value=0.0138). In stratified analysis of stage III/IV patients, low expression of miR-21, miR-200 and miR-215 was also associated with better OS (estimated HRs=0.22, 0.44 and 0.47 with p-values=0.0362, 0.0137 and 0.0269, respectively), while miR-145 was not significantly associated with OS. Among advanced stage III/IV colorectal cancer patients, low expression of miR-15a was associated with worse PFS (estimated HR=2.36 with p-value=0.0081) and low expression of miR-200 and miR-215 was associated with better PFS (estimated HRs=0.49 and 0.30 with p-values=0.0194 and 0.0064, respectively). Different association results were found in stage II colorectal cancer patients: low expression of miR-143 was significantly associated with better OS and PFS (estimated HRs=0.38 and 0.48 with p-values=0.008 and 0.0391, respectively); high expression of miR-145 and miR-181b was significantly associated with better OS and PFS (all estimated HRs>1 with p-values<0.05). In addition, low expression of miR-15b was associated with better OS in stage II patients (estimated HR=0.19 with p-value=0.0122).

Survival analysis considering primary tumor location
In addition to the survival analysis without considering the primary tumor location, we recognized that tumor location (left vs. right) is a significant prognostic factor to be considered when stratifying patient survival in colorectal cancer [23,24]. We further analyzed the association of miRNA expression with PFS and OS by including the primary tumor location. Our results show that tumor location does indeed influence the association between miRNAs and patient survival (Table 2). In terms of OS, the expression levels of let-7g, miR-15b, miR-145, and miR-192 were significant prognostic biomarkers for stage II/III/IV colorectal cancer patient survival, and miR-215 became associated with OS after further controlling for tumor location (estimated HR=0.49), although this was borderline (p-value=0.0499). The expression levels of let-7g, miR-15b, miR-21, miR-192, miR-200, and miR-215 were still significantly associated with OS among stage III/IV colorectal cancer patients. The expression levels of miR-143, miR-145, and miR-181b were still significantly associated with OS among stage II colorectal cancer patients. For PFS, the expression of miR-215 was still significantly associated with outcome in stage II/III/IV patients, but the expression of miR-15b and miR-145 was not after adjusting for tumor location. miR-15a, miR-215 and miR-200 expression levels were still significantly associated with stage III/IV PFS. miR-143, miR-145 and miR-181b expression levels were still significantly associated with stage II PFS, but the miR-15b expression level was not significantly associated with stage II PFS after considering tumor location.

Survival analysis using TCGA data
To further validate the results, we used our own data as the experimental set and analyzed the association of miRNA expression with patient survival using TCGA colorectal cancer data as a validation set. Our results show that TCGA data are highly consistent with our results generated from patient samples at Stony Brook University Medical Center (Table 3). The expression levels of let-7g, miR-15a, miR-200 and miR-181 were significantly associated with OS of stage II/III/IV patients before and after considering tumor location. Representative Kaplan-Meier OS curves of let-7g based on our data and TCGA data for different patient groups are shown in Figure 2. The expression levels of let-7g, miR-21, miR-140, miR-143, miR-181, miR-192, and miR-215 are significantly associated with OS of stage III/IV patients. Low expression of miR-15b was associated with better OS in stage III/IV patients, but this did not achieve statistical significance after considering tumor location. Among stage II patients, findings from our data were also confirmed by TCGA data, while additional miRNAs, such as let-7g, miR-15a, miR-140, miR-192 and miR-200, were shown to be significantly associated with OS of stage II patients.
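For readers less familiar with how Kaplan-Meier curves such as those in Figure 2 are constructed, the following is a minimal Python sketch of the product-limit estimator. It is an illustration under stated assumptions only: the function name `kaplan_meier` and the follow-up times are hypothetical, and the curves in this study were estimated in R, not with this code.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up times (months) and event indicators.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t = {t}: S(t) = {s:.2f}")
```

Applying the function separately to the low and high expression groups gives the two step curves that are compared in a Kaplan-Meier plot.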

DISCUSSION
In this study, we evaluated the prognostic potential of miRNAs in colorectal cancer based on 200 patient samples with clinical outcome follow up information. We have also systematically validated the housekeeping controls that can be used for miRNA expression normalization in colorectal cancer. Our findings are consistent with previous reports that RNU44 and miR-16 are two genes with the least expression variation in colorectal cancer [9,27]. We chose miR-16 as the housekeeping miRNA for this study as it has a relatively higher level of expression than RNU44 with the best standard deviation (Figure 1). miR-16 has been shown to be the ideal housekeeping candidate in breast cancer and pancreatic cancer [28,29]. Our results are also consistent with previous studies showing that miR-16 is a suitable housekeeping miRNA in colon cancer [27]. We show that miR-15a and miR-15b are significantly associated with colorectal cancer patient survival (Tables 1 and 2). This is in contrast to CLL, in which a cluster containing miR-15 and miR-16 is deleted from 13q14 [2]. There might be unique differential processing of miR-15 and miR-16 in solid tumors such as breast cancer, pancreatic cancer and colorectal cancer, as the miR-15a/16-1 cluster is located on chromosome 13 while the miR-15b/miR-16-2 cluster is located on chromosome 3. Identification and utilization of the proper housekeeping gene is one of the most important factors for any biomarker study, as the housekeeping gene used for miRNA expression normalization often differs between tumor types.
There are a number of advantages to using miRNA as biomarkers. miRNAs are relatively stable in archival FFPE samples which makes them superior to degraded mRNAs [15]. This allows for large scale retrospective studies using archival FFPE specimens. In addition, their aberrant expression may be indicative of the disruption and dysregulation of multiple cellular networks.
Our studies have identified miRNAs that are significantly associated with survival of stage II, III, and IV colorectal cancer patients. This study further validated several miRNAs (e.g. miR-21, miR-200, miR-215) with prognostic potential in colorectal cancer from our own group as well as [7,20]. miR-192 and miR-215 are directly regulated by the tumor suppressor gene p53 in colon cancer [7,20]. We have also demonstrated previously that miR-181b is significantly associated with chemotherapeutic response in colorectal cancer [9,16]. One of the novel and significant aspects of this study is that it takes colorectal tumor location into consideration. This has never been investigated in conjunction with the expression of miRNAs in colorectal cancer. Previous studies have shown that primary tumor location (left vs. right) is a significant prognostic factor in metastatic colorectal cancer [23]. When we include primary tumor location as a factor with miRNA expression, we show that let-7g, miR-15b, miR-145, miR-200, and miR-215 are still significantly associated with patient OS. This is highly consistent with our previous studies showing that let-7g is closely associated with colon cancer chemoresponse [16]. We have also shown that miR-200 has prognostic potential in colorectal cancer, which is consistent with previous studies [9]. It is well established that miR-200 plays key roles in EMT by regulating Zeb1 expression [31].
We further cross validated the findings based on colorectal cancer patients from Stony Brook University Medical Center with the colorectal cancer TCGA database containing RNA-Seq expression data for all miRNAs [32]. Based on the TCGA miRNA dataset with tumor location information, we were able to show that let-7g, miR-140, miR-200, miR-192, and miR-181 remained highly significant prognostic factors for metastatic colorectal cancer independent of tumor stage. Our results are also consistent with some of the miRNA based colorectal cancer biomarker studies [21,33]. Let-7g is the most consistent prognostic biomarker between the Stony Brook University Medical Center cohort and TCGA (Figure 2). Patients with low expression of let-7g have better survival compared to the group of patients with high expression. This is in contrast to the general notion that let-7g has a tumor suppressor role. We reason that this may be because tumor cells in patients with lower let-7g levels may have a more rapid proliferation rate, making them more sensitive to the 5-FU and oxaliplatin based DNA damaging agents used in colorectal cancer chemotherapy. This is in fact consistent with another tumor suppressive miRNA, miR-215, as patients with low expression have improved survival compared to patients with higher expression [17].
In conclusion, we systematically validated several miRNAs with clinical prognostic potential for metastatic colorectal cancer using patient cohorts from Stony Brook University Medical Center and TCGA. With cancer clinical management moving to more personalized approaches, this study provides a foundation to better prepare us to leverage these potential biomarkers to assist future clinical management of colorectal cancer.

Clinical Samples
We selected 205 colorectal cancer specimens from patients who underwent surgical resection of primary tumors at the Stony Brook University Medical Center, Stony Brook, NY, USA. Patient consent forms were obtained from each patient according to institutional policies. Patient clinical information was provided by the Cancer Registry of Stony Brook University Medical Center, and the characteristics of these patients are shown in Table 4. Among these, we have 89 cases of stage II, 86 cases of stage III and 30 stage IV colorectal cancer patient archival formalin fixed paraffin embedded (FFPE) tissue specimens. Representative tissue blocks from each case were assembled from the archival collections of the Department of Pathology, and used for subsequent analysis. 200 patients had both follow-up information and miRNA expression information and 187 patients had tumor location information available.
Clinical and expression data used for validation in this study were downloaded from the UCSC cancer genome browser (http://xena.ucsc.edu/), which is a set of web-based tools to display and investigate cancer genomics data and its associated clinical information [34]. Specifically, we extracted the clinical and miRNA expression data for TCGA colon adenocarcinoma. Genome-wide characterizations of the expression patterns of mRNA and miRNA of these samples have been reported previously [32]. For the clinical data, the survival information for 431 subjects is available. The miRNA and mRNA expression was measured using HiSeq platform. There are 331 subjects that have both survival and miRNA expression data and 327 subjects that have tumor location, survival and miRNA expression data.

RNA isolation and qRT-PCR analysis
FFPE specimens were deparaffinized using xylene and ethanol washes as previously described. Samples were digested with protease to recover total RNA. RNA was purified using a rapid glass-fiber filter methodology from Thermo Fisher Scientific (Ambion RecoverAll™ Total Nucleic Acid Isolation Kit for FFPE) that includes an on-filter DNase treatment to remove contaminating genomic DNA. Purified RNA samples were eluted with nuclease free water for cDNA synthesis and quantitative RT-PCR analysis.
The reverse transcription of miRNAs to cDNAs was conducted using the TaqMan miRNA RT kit from Thermo Fisher Scientific (Life Technologies) by combining primers for different miRNAs using 40 ng of purified total RNA. Multiplex qRT-PCR reactions were performed using the Thermo Fisher Scientific (Applied Biosystems Inc.) 7900HT Fast Real Time PCR Detection System with 95 °C for 10 min, then 40 cycles of 95 °C for 15 seconds and 60 °C for 60 seconds. The level of each miRNA was analyzed with its specific primers and the internal housekeeping control miR-16. Fluorescent signals from each sample were collected at the endpoint of every cycle, and the expression level of each unique miRNA was calculated from ΔΔCT values based on the internal controls, normalized to the control group and plotted as relative quantity (RQ).
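The ΔΔCT calculation mentioned above can be made concrete with a short Python sketch. It assumes the usual premise of the method, namely that the amount of product approximately doubles in each PCR cycle. The function name `relative_quantity` and the CT values are hypothetical and chosen only to illustrate the arithmetic.

```python
def relative_quantity(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Relative quantity (RQ) of a target miRNA by the 2^-ΔΔCT method.

    ct_target, ct_ref           : CT of the target miRNA and the reference
                                  (e.g. miR-16) in the sample of interest
    ct_target_ctrl, ct_ref_ctrl : the same two CT values in the control group
    Assumes roughly 100% amplification efficiency (doubling per cycle).
    """
    delta_ct_sample = ct_target - ct_ref
    delta_ct_control = ct_target_ctrl - ct_ref_ctrl
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: the target is one cycle "earlier" relative to the
# reference in the sample than in the control, i.e. twice as abundant.
rq = relative_quantity(ct_target=27.0, ct_ref=24.0,
                       ct_target_ctrl=28.0, ct_ref_ctrl=24.0)
print(rq)
```

Here ΔCT is 3 in the sample and 4 in the control, so ΔΔCT is -1 and the relative quantity is 2.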

Data analysis
Overall survival (OS) was defined as the time from the date of diagnosis to the last follow-up date or the date of death, whichever occurred first. Progression free survival (PFS) was defined as the time from the date of diagnosis to the last follow-up date, the date of death or the date of recurrence, whichever occurred first; only patients who were alive at the last follow-up date were considered censored. Normalized expression levels were defined using the 2^-ΔΔCT method with miR-16 used as the housekeeping control gene. Each miRNA's expression was dichotomized into low and high expression using the cutoff value in the expression level that gave the smallest p-value when testing whether the specific miRNA expression was associated with survival outcomes, based on a multivariable Cox proportional hazards model after adjusting for age at diagnosis and cancer stage. Estimated hazard ratios and their 95% confidence intervals for each miRNA between low expression and high expression were reported for both OS and PFS. Stratified analyses for early stage (stage II) and advanced stage (stage III, IV) were performed. TCGA data were analyzed similarly. Statistical analysis was performed using SAS 9.3 (SAS Institute, Inc., Cary, NC) and Kaplan-Meier curves were estimated using R i386 3.3.0. Statistical significance was set at 0.05.
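The cutoff search described above can be illustrated with a simplified Python sketch. Note the substitution: the study used a multivariable Cox proportional hazards model adjusted for age and stage (fitted in SAS), whereas this sketch uses an unadjusted two-group log-rank test so that it stays self-contained. All names (`logrank_p`, `best_cutoff`, `min_group`) and all data values are hypothetical.

```python
import math


def logrank_p(times, events, in_low):
    """Two-sided log-rank p-value comparing low vs. high expression groups."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for x in times if x >= t)  # at risk overall
        n1 = sum(1 for x, g in zip(times, in_low) if x >= t and g)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, in_low)
                 if x == t and e and g)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def best_cutoff(expr, times, events, min_group=2):
    """Scan observed expression values; return (cutoff, p) with smallest p."""
    best = None
    for c in sorted(set(expr)):
        in_low = [x <= c for x in expr]
        n_low = sum(in_low)
        if n_low < min_group or len(expr) - n_low < min_group:
            continue  # skip splits that leave a group too small
        p = logrank_p(times, events, in_low)
        if best is None or p < best[1]:
            best = (c, p)
    return best


# Hypothetical normalized expression, follow-up (months) and death indicators.
expr = [0.2, 0.4, 0.5, 0.7, 1.5, 1.8, 2.2, 2.6]
times = [30, 34, 28, 40, 6, 9, 12, 8]
events = [0, 0, 1, 0, 1, 1, 1, 1]
cutoff, p_value = best_cutoff(expr, times, events)
print(f"cutoff = {cutoff}, p = {p_value:.4f}")
```

Patients with expression at or below the returned cutoff form the low expression group and the remainder form the high expression group. In the adjusted setting of this study, the log-rank test would be replaced by the Cox model test for the dichotomized miRNA term.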