Site-specific associations between miRNA expression and survival in colorectal cancer cases

Background: MicroRNAs (miRNA) are small non-coding RNAs involved in cellular processes, including cell proliferation and angiogenesis. Thus, miRNA expression may alter survival after diagnosis with colorectal cancer (CRC).
Results: Individuals diagnosed with stage 1 or stage 2 rectal cancer had worse survival than colon cancer cases diagnosed at stage 1 or stage 2. After adjustment for multiple comparisons, no miRNAs were significantly associated with disease stage. Two miRNAs infrequently expressed in the population and not previously reported were associated with survival after diagnosis with colon cancer (miR-1 HR 2.17 95% CI 1.41, 3.36; and miR-101-3p HR 3.51 95% CI 1.72, 7.15). Among those diagnosed with rectal cancer, 201 miRNAs were associated with survival when the FDR q value was < 0.05. Assessment of 105 previously reported miRNAs associated with prognosis showed that four miRNAs influenced colon cancer survival and 17 influenced survival after a diagnosis with rectal cancer when raw p values were considered.
Patients and Methods: This study includes data from population-based studies of CRC conducted in Utah and the Kaiser Permanente Medical Care Program. A total of 1893 carcinoma and normal paired colorectal mucosa tissue samples were run using the Agilent Human miRNA Microarray V19.0. We assessed miRNA differential expression between paired carcinoma and normal colonic mucosa tissue with CRC-specific survival, evaluating stage and site-specific associations after adjusting for age, sex, microsatellite instability tumor status, and AJCC stage.
Conclusions: MiRNAs dysregulated for both colon and rectal cancer had a greater impact on survival after a diagnosis with rectal cancer.


INTRODUCTION
MicroRNAs (miRNA) are small non-coding RNAs that regulate gene expression and are thus involved in numerous physiological and cellular processes, including tumor initiation and growth, cell proliferation, apoptosis, and angiogenesis [1][2][3]. We have shown that miRNAs are extensively dysregulated in colorectal carcinoma (CRC) [4]. Given the extensive role of miRNAs in gene regulation and cellular processes, the evaluation of miRNAs as regulators of tumor aggressiveness and prognosis is of interest [5,6]. To this end, several miRNAs have been shown to be associated with either disease stage or survival after diagnosis with CRC [7]. Most studies have focused on select miRNAs, such as miR-21, in sample sets that are relatively small [8][9][10]. Our earlier attempt to replicate these findings in a sample of 1141 cases of CRC looking at miRNA expression in tumors showed that few of these candidate miRNAs replicated in terms of survival [11].
Evaluating prognostic effects separately for colon and rectal cancer may provide clues to miRNAs that are important for survival at a specific site. We have previously shown that tumors with microsatellite instability (MSI) behave differently in terms of survival for colon and rectal cancers, with unstable colon tumors having better survival and unstable rectal tumors having worse survival [12,13]. Additionally, we have shown that unique dysregulation of miRNAs for MSI vs microsatellite stable (MSS) cancers is more pronounced than for any other tumor phenotype (tumor phenotype paper in press) [14].
In this study, we examine site-specific associations of miRNA expression with survival after diagnosis with colon or rectal cancer. We demonstrate that low stage rectal carcinomas have significantly poorer survival than low stage colon carcinomas. We evaluate associations between miRNAs and survival by disease stage within colon and rectal carcinomas separately. We utilize a large set of 1134 colon carcinomas and 721 rectal carcinomas to determine if differential expression between carcinomas and normal colorectal mucosa influences survival. We utilize these data to implement both a discovery stage, examining miRNAs from an Agilent platform that have not been previously associated with survival, and a replication stage, evaluating 105 miRNAs that have been associated in the literature with either stage or survival.

RESULTS
Slightly more than half of the study subjects were male (Table 1). Approximately half of the colon carcinomas were located in the proximal colon and half were located in the distal colon. Slightly less than half of the study participants had died at the time of last follow-up; the average follow-up time was 60.4 months. Figure 1 shows five-year survival after diagnosis with colon or rectal cancer by stage at time of diagnosis. The p-values in this figure are those associated with the Mantel-Haenszel/log-rank test for equal survival functions. Rectal cancers diagnosed at either stage 1 or 2 had significantly poorer survival than colon cancers diagnosed at stage 1 or 2. However, for both colon and rectal cancer, over 90% of those diagnosed at stage 1 survived and over 80% of those diagnosed at stage 2 survived. In contrast, stage 4 rectal cancers had better survival than stage 4 colon cancers.

Stage-specific analysis
We observed no stage-specific associations of miRNAs with survival after adjustment for multiple comparisons for either colon or rectal carcinomas (Supplementary Table S1 for colon and Supplementary Table S2 for rectal show those with p values of < 0.05 prior to adjustment for multiple comparisons). Prior to adjustment for multiple comparisons, seven miRNAs were associated with survival for stages 1 and 2 for colon cancer and 17 miRNAs were associated with survival for more advanced carcinomas. For rectal cancer, 44 differentially expressed miRNAs were associated with survival for stage 1 and 2 carcinomas and 150 were associated with survival for carcinomas at more advanced stages. Of the miRNAs associated with survival by stage for rectal cancer, 13 were associated similarly for stages 1 and 2 and for stages 3 and 4. Two of the miRNAs associated with survival for stages 1 and 2, miR-429 and miR-4461, were associated similarly with survival for both colon and rectal carcinoma.
In an attempt to identify miRNAs that could uniquely contribute to survival after diagnosis with rectal cancer in stages 1 or 2, we compared the miRNAs associated with survival for stage 1 and 2 rectal cancer with those associated with survival for colon stage 1 and 2, rectal stage 3 and 4, and stage-adjusted rectal carcinoma. In this exploratory analysis, we observed that 20 miRNAs were uniquely associated with survival for stages 1 and 2 rectal carcinomas (Table 2).

miRNAs in discovery component
In the discovery component of the study, examination of miRNAs commonly expressed (i.e. those expressed in at least 50% of the population) showed no significant findings for survival and colon cancer after adjustment for multiple comparisons for either site overall or any specific site within the colon; we did not see a significant interaction between miRNAs and mortality for the progression of colon tumor site from cecum to sigmoid colon. However, 19 miRNAs were associated with survival prior to adjustment for multiple comparisons (Supplementary Table S3). On the other hand, for rectal cancer we observed that differential expression of 201 miRNAs was associated with survival when the FDR q value was < 0.05 (Table 3 for those with q value of < 0.031 and Supplementary Table S4 for those with q value between 0.031 and < 0.05); 228 miRNAs were associated with survival after diagnosis with rectal cancer prior to adjustment for multiple comparisons. The majority of HRs were modest. For the interquartile range of differential expression, the strongest associations were for miR-30e-5p, where up-regulated expression in tumors was associated with reduced risk (HR 0.69 95% CI 0.56, 0.84), and for miR-6131, where risk was increased when this miRNA was not down-regulated in tumors (HR 1.31 95% CI 1.12, 1.54). Five canonical pathways were significantly associated with those miRNAs that reduced risk of dying after adjustment for multiple comparisons. Those pathways were integrin signaling, actin cytoskeleton signaling, epithelial adherens junction signaling, ILK signaling, and ERK/MAPK signaling.
Evaluation of the 646 miRNAs that were expressed in less than 50% of the population showed that any expression of two miRNAs, miR-1 and miR-101-3p, was associated with poorer survival after a diagnosis of colon cancer (Table 4). Having any expression of these miRNAs was associated with increased hazard of dying (HR 2.17 95% CI 1.41, 3.36 for miR-1; and HR 3.51 95% CI 1.72, 7.15 for miR-101-3p).

miRNAs in replication component
Assessment of replication of 52 previously reported, commonly expressed miRNAs associated with prognosis showed that no miRNAs influenced colon cancer survival and 17 influenced survival after a diagnosis with rectal cancer (Table 5). These 17 miRNAs had a raw p value of < 0.05 and the miRNAs associated with rectal cancer had q values of < 0.1. Assessment of CRC-specific survival with less commonly expressed miRNAs showed two significant associations (q value < 0.05) with colon cancer survival and three associations, miR-224-5p, miR-335-5p and miR-374a-5p (q value < 0.1), with survival after being diagnosed with rectal carcinoma (Table 6).

DISCUSSION
Previously we reported that dysregulated miRNAs for colon and rectal carcinoma were almost identical [15]. Despite this, we observed that these dysregulated miRNAs had different effects on survival for colon and rectal carcinoma. While miRNAs had minimal impact on survival when dysregulated for colon cancer, the same miRNAs had considerable impact on survival after being diagnosed with rectal cancer. Some of the dysregulated miRNAs for rectal cancer appear to be uniquely associated with survival for stages 1 and 2 rectal cancer prior to adjustment for multiple comparisons. It is possible that these miRNAs may contribute to the decreased survival observed for stages 1 and 2 rectal cancer compared to stages 1 and 2 colon cancer. However, it should also be acknowledged that the FDR q value associated with these miRNAs was 0.29, thus, one would expect about 29% of associations to be null findings.
We have previously noted differences in survival between people diagnosed with colon or rectal cancer based on MSI status of tumors [12,13]. MSI tumors were associated with worse survival for rectal tumors and better survival for colon tumors. Differences in survival after a diagnosis of colon or rectal cancer are also seen by AJCC disease stage in this study and support previous reports [16]. We observed that people diagnosed with stage 1 and 2 rectal carcinomas had worse survival than those with stage 1 and 2 colon carcinomas. However, people diagnosed with a stage 4 rectal tumor had slightly better survival than those with colon cancer diagnosed at stage 4; this is similar to observations from broader SEER data [16]. Although the differences in survival were statistically significant, there were few deaths for either colon or rectal cancer at stages 1 and 2. Over 90% of individuals diagnosed at AJCC Stage 1 survived 5 years for both colon and rectal cancer, while for AJCC Stage 2 the five year survival was over 80% for both tumor sites. However, as we observed, the differences between colon and rectal cancer were statistically significant and the reasons for these differences are not clear. Perhaps one of the most important questions raised by this study is why the differences in survival associated with differential miRNA expression are seen for rectal cancer but not for colon cancer. This difference is even more puzzling when considering that the actual differential expression between carcinoma and normal colorectal mucosa was almost identical [4,17]. These differences in survival patterns cannot be explained by age differences, with rectal cancer cases being slightly younger than colon cancer cases. This would suggest that factors such as tumor microenvironment may play a major role in the ultimate prognosis associated with differentially expressed miRNAs.
For instance, it has been suggested that miRNAs act as modulators of angiogenesis [18] and that tumor microenvironment may further regulate angiogenesis [19]. Several of the miRNAs previously associated with angiogenesis, such as miR-17-5p, miR-20a, miR-221, miR-20, and miR-92 [18], were associated with survival after diagnosis with rectal cancer in our study. It is thus possible that the microenvironment of the rectum is sufficiently different from that of the colon that, once miRNAs are dysregulated, they have a greater impact on prognosis for rectal cancer than for colon cancer. Another possibility is that preoperative chemo-radiation for rectal cancer may interact with or alter the microenvironment so that miRNA expression has a different impact on prognosis for rectal than for colon carcinoma. Unfortunately, we are not able to test this hypothesis directly since treatment data at the time of diagnosis were not uniformly available.
In this study we incorporated both a discovery and a replication component. Because of our ability to evaluate 970 miRNAs expressed in colorectal tissue, we were able to assess site-specific associations with survival and disease stage for many miRNAs not previously assessed with prognosis. In this process we identified over a hundred miRNAs that could influence survival after diagnosis with rectal cancer. However, for the most part the impact on survival was not large, with most HRs for the interquartile range of differential expression being less than 1.3 after adjusting for age, AJCC disease stage, and MSI status.
In our previous study we reported replication of 121 miRNAs using 1141 of the colorectal samples that we included in this work [11]. In that replication we identified five miRNAs whose carcinoma expression levels were associated with advanced disease stage, 12 associated with colorectal cancer mortality among individuals diagnosed with colon cancer, and 14 among individuals diagnosed with rectal cancer. In that work, we did not adjust for multiple comparisons since we were testing previously identified miRNAs and were therefore testing specific hypotheses. Additionally, we examined the level of miRNA expression in tumors in our previous analysis, while here we examined differential expression and report both p values unadjusted for multiple comparisons and FDR q values. For those miRNAs previously identified with survival, we replicated results for four miRNAs for colon cancer and nine for rectal cancer when looking at differential expression with survival using raw p values. However, with the larger sample and examination of differential expression in this current study, eight miRNAs that did not previously replicate with carcinoma miRNA expression were significantly associated with survival after diagnosis with rectal cancer. It is interesting to note that two of the miRNAs associated with survival after diagnosis with colon cancer and 12 of the miRNAs associated with survival after diagnosis with rectal cancer were still significant after adjustment for multiple comparisons with a q value of < 0.05 in our current study. Others who have evaluated miRNAs with survival and disease stage have not adjusted for multiple comparisons, which could account for some differences in significant associations.
The study has several strengths including the large sample size, the site-specific colorectal cancer data, data on MSI status, and AJCC stage data for all study participants. When assessing stage specific survival differences (Figure 1), we included data from all individuals diagnosed in the time period from the target geographic areas to have a population-based assessment of survival differences by stage and tumor site. The platform used enabled us to undertake both a discovery and replication study component. The Agilent platform has been shown to have excellent repeatability (r = 0.98) and relatively good agreement with Nanostring [4]. Comparison of Agilent results to qPCR showed 100% agreement in terms of directionality of dysregulation and almost 100% agreement in terms of the fold change associated with that dysregulation [17]. There are additionally several weaknesses we encountered that are applicable to any study of miRNAs. Bioinformatics tools to determine functionality of miRNAs associated with survival are very non-specific and incomplete. For instance, assessment of the 24 miRNAs associated uniquely with stages 1 and 2 rectal cancer and survival showed that they regulated thousands of validated genes. When focusing on the miRNAs associated with rectal cancer that were previously not reported as associated with survival, we identified 2942 target genes for those miRNAs associated with improved survival and 4224 target genes for those miRNAs associated with worse survival. Of those targeted genes, 2681 were targeted by more than one miRNA. Assessing those targeted genes with mRNA expression in 48 rectal samples, yielded 588 significant associations. Of those, 376 also had significant differential expression between carcinoma and normal rectal mucosa. Pathway analysis yielded five significant pathways associated with miRNAs that improved survival. 
Because of the number of miRNAs being examined, we adjusted for multiple comparisons; while this is a strength of the study, it is also a limitation of being able to relate findings to previously reported results. Most published studies have focused on a few miRNAs in small populations, making lack of confirmation of many previous findings not unexpected.
In conclusion, stage-specific prognosis differs for colon and rectal cancer; the extent to which miRNAs contribute to this difference is unclear. We observed that miRNA differential expression in rectal carcinomas had a more pronounced impact on survival than it did in colon carcinomas. The majority of miRNAs identified as being associated with survival after diagnosis with rectal cancer were newly identified miRNAs not previously reported as being associated with survival. Replication of previous associations showed few remained significant in this large study, although differences in methodologies in both assessment of miRNAs and statistical methods used could contribute to these findings.

MATERIALS AND METHODS
Study participants come from two population-based case-control studies that included all incident cases of colon and rectal cancer among individuals between 30 and 79 years of age who resided along the Wasatch Front in Utah or were members of the Kaiser Permanente Medical Care Program (KPMCP) in Northern California. Participants were white, Hispanic, or black for the colon cancer study; the rectal cancer study included Asians and American Indians not living on reservations [20,21]. Cases had to have tumor registry verification of a first primary adenocarcinoma of the colon or rectum and had to be diagnosed between October 1991 and September 1994 for the colon cancer group and between June 1997 and May 2001 for the rectal cancer group. Tumor tissue was obtained for 97% of all Utah cases diagnosed and for 85% of all KPMCP study participants [22] and included those who signed informed consent and those retrieved by local tumor registries and sent to study investigators without personal identifiers. Local tumor registry data were used to obtain date of birth, date of diagnosis, date of death or date of last contact, and tumor information for those individuals who were not interviewed. The study was approved by the Institutional Review Board of the University of Utah.

miRNA processing
RNA (miRNA) was extracted from formalin-fixed paraffin embedded tissues. We assessed slides and tumor blocks that were prepared over the duration of the study prior to the time of miRNA isolation to determine their suitability. Older slides produced RNA of quality comparable to that of more recent slides. The study pathologist reviewed slides to delineate tumor, normal, and polyp tissue. Cells were dissected from 1-6 sequential sections on aniline blue stained slides using an H&E slide for reference. Total RNA containing miRNA was extracted, isolated, and purified using the RecoverAll Total Nucleic Acid isolation kit (Ambion). RNA yields were determined using a NanoDrop spectrophotometer.
100 ng total RNA was labeled with Cy3 and hybridized to the Agilent Human miRNA Microarray V19.0, and arrays were scanned on an Agilent SureScan microarray scanner model G2600D. The Agilent Human microarray was generated using known miRNA sequence information compiled in the Sanger miRBASE database v19.0. The microarray contains probes for 2006 unique human miRNAs, with one to four unique probes for each of the known miRNAs. The miRNA array contains 60,000 unique human sequences and averages 30 replicates per probe sequence. Data were extracted from the scanned image using Agilent Feature Extract software v.11.5.1.1. Data were required to pass stringent QC parameters established by Agilent that included tests for excessive background fluorescence, excessive variation among probe sequence replicates on the array, and measures of the total gene signal on the array to assess low signal. If samples failed to meet quality standards for any of these parameters, the sample was re-labeled, hybridized to arrays, and scanned. If a sample failed QC assessment a second time, the sample was deemed to be of poor quality and the individual was excluded from downstream analysis.
The Agilent platform was found to be highly reliable (r = 0.98) and to have reasonable agreement with NanoString as well as excellent agreement with qRT-PCR [4,17]. For unpaired samples due to missing normal scans, we imputed values for normal mucosa as previously described [23]. To minimize differences that could be attributed to the array, amount of RNA, location on array, or other factors that could erroneously influence expression, total gene signal was normalized by multiplying each sample by a scaling factor, defined as the median of the 75th percentiles of all the samples divided by the 75th percentile of that individual sample [24]. This scaling factor was implemented using SAS 9.4.
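As an illustration of the scaling factor described above, the following is a minimal Python sketch, not the SAS 9.4 implementation used in the study. The function names, the input layout (a dictionary of signal values per sample), and the choice of percentile interpolation are assumptions made for the example.

```python
from statistics import median, quantiles


def percentile_75(values):
    """75th percentile with linear interpolation (assumed method)."""
    return quantiles(values, n=4, method="inclusive")[2]


def upper_quartile_scale(samples):
    """Scale each sample by median(all 75th percentiles) / its own 75th percentile.

    samples: dict mapping a (hypothetical) sample id to a list of signal values.
    Returns (scaled_samples, scaling_factors).
    """
    q75 = {sid: percentile_75(vals) for sid, vals in samples.items()}
    target = median(q75.values())  # median of the per-sample 75th percentiles
    factors = {sid: target / q for sid, q in q75.items()}
    scaled = {
        sid: [v * factors[sid] for v in vals] for sid, vals in samples.items()
    }
    return scaled, factors
```

After scaling, every sample has the same 75th percentile, which is the sense in which array-level differences in overall signal are removed.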

Data sharing and availability
Data will be made available based on limitations of signed informed consent. Because of restrictions of consent forms, data are not incorporated at this time into public data resources. Individuals interested in having access to data can work with study investigators and establish a formal data transfer agreement.

Statistical methods
Analyses were conducted using the log base 2 transformed data. Data were available for 1855 subjects. Our analysis included both a discovery component of miRNAs not previously reported with survival as well as a replication component, examining those miRNAs where previous associations were suggested for disease stage or prognosis. In the discovery component of the study we examined over 970 miRNAs and in the replication component of the study we evaluated an additional 105 miRNAs where the miRNA was expressed in at least 5 individuals who had both survival and stage information. In both the discovery and replication components of the study, we analyzed the miRNA data in two groups, one group of more commonly expressed miRNAs (defined as expressed in at least 50% of the population) and a second group less frequently expressed (treated as either expressed or not expressed in the analysis). Since associations from five randomly selected samples of 80% of the population showed that associations with survival for differential expression between carcinoma and normal colorectal mucosa were much more consistent across sample subsets than absolute miRNA carcinoma expression, we used differential expression data when assessing survival.
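The grouping of miRNAs by expression frequency can be sketched as follows. This is a hedged illustration in Python rather than the study's code: the function names and miRNA labels are hypothetical, "expressed" is represented here as a per-subject boolean flag, and differential expression is assumed to be the difference of log base 2 carcinoma and normal values.

```python
import math


def differential_expression(carcinoma, normal):
    """Assumed definition: log2(carcinoma signal) - log2(normal signal)."""
    return math.log2(carcinoma) - math.log2(normal)


def split_by_expression_frequency(expressed, threshold=0.5):
    """Split miRNAs into commonly and infrequently expressed groups.

    expressed: dict mapping miRNA name to a list of booleans, one per subject.
    A miRNA is 'common' if expressed in at least `threshold` of subjects.
    """
    common, infrequent = [], []
    for mirna, flags in expressed.items():
        fraction = sum(flags) / len(flags)
        if fraction >= threshold:
            common.append(mirna)      # analyzed as a continuous variable
        else:
            infrequent.append(mirna)  # analyzed as any vs. no expression
    return common, infrequent
```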
Survival months were calculated based on the difference between the diagnosis date and date of death or date of last follow-up. CRC-specific follow-up included deaths where the primary or secondary cause of death was listed as CRC. Individuals dying of other causes or who were lost to follow-up were censored at their time of death or date of last contact. The R package "survival" was used to calculate p-values based upon 10,000 permutations of the likelihood ratio test from the Cox proportional hazards model adjusted for age at diagnosis, gender, AJCC tumor stage, and MSI tumor molecular phenotype. The study population was over 90% non-Hispanic white. Because a number of miRNAs were infrequently expressed, we calculated the HR for these miRNAs based upon any vs. no expression using non-permuted p-values in SAS 9.4 (SAS Institute, Cary, NC) from the Cox model using the same adjustment variables. We combined the p-values of miRNAs in the discovery and replication study components within the two expression level groups to adjust for multiple comparisons using a false discovery rate (FDR) q-value threshold of 0.05 [25].
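To make the multiple-comparison adjustment concrete, the sketch below computes FDR q-values in Python. It assumes the Benjamini-Hochberg step-up procedure; the text above cites [25] without naming the procedure, so this choice and the function name are assumptions for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as p_values."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

A miRNA would then be declared significant when its q-value falls below the chosen threshold, for example 0.05.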
We have previously assessed these tumors for microsatellite instability (MSI) based on the mononucleotides BAT26 and TGFβRII and a panel of 10 tetranucleotide repeats that were correlated highly with the Bethesda Panel [26]; our study was done prior to the Bethesda Panel development. We report hazard ratios (HR) and 95% Confidence Intervals (CI) from the Cox proportional hazards model after adjusting for age at diagnosis, gender, AJCC tumor stage, and MSI tumor molecular phenotype to assess CRC-specific mortality based on the interquartile range for those more commonly expressed (50% or more of the population) and for any expression vs no expression for less commonly expressed miRNAs. To examine site-specific effects within the colon we created an ordinal site variable and evaluated if there was a significant interaction between miRNAs and site, using a continuous model. Additionally we evaluated if miRNAs were associated significantly with any specific site within the colon.
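One common way to express a hazard ratio "based on the interquartile range" is to rescale the Cox coefficient of a continuous covariate by the width of its interquartile range, so that HR = exp(beta × IQR). The Python sketch below illustrates that calculation under this assumption; whether the study used exactly this scaling is not stated here, and the function names and the normal-approximation confidence interval are illustrative choices.

```python
import math
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) with linear interpolation (assumed method)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def hazard_ratio_per_iqr(beta, se, iqr_width, z=1.96):
    """HR and 95% CI for an increase of one IQR in a continuous covariate.

    beta: Cox model coefficient per unit of the covariate.
    se:   standard error of beta.
    """
    hr = math.exp(beta * iqr_width)
    lower = math.exp((beta - z * se) * iqr_width)
    upper = math.exp((beta + z * se) * iqr_width)
    return hr, lower, upper
```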

Bioinformatics analysis
The top miRNAs that were associated with survival were split into two groups, those that improved survival and those that worsened survival. The target genes for each group were then identified using miRTarBase v6.0 [27]. To better identify which of these target genes might be most relevant for colorectal cancer, we analyzed differential miRNA and mRNA expression between carcinoma and paired normal colorectal mucosa to determine if any miRNAs were affecting mRNA expression in our dataset. For those mRNAs that were significantly associated with miRNA expression, we determined if the mRNA was significantly differentially expressed in colorectal tissue. We then used the significantly differentially expressed mRNAs, with their respective fold changes, as input to QIAGEN's Ingenuity Pathway Analysis (IPA, QIAGEN Redwood City, www.qiagen.com/ingenuity). We used the 'core analysis' tool, and included direct and indirect relationships, experimentally verified interactions, mammalian species, and all mutations as our analysis settings.