Clinical value of integrated-signature miRNAs in colorectal cancer: miRNA expression profiling analysis and experimental validation

MicroRNA (miRNA) expression profiling of colorectal cancer (CRC) are often inconsistent among different studies. To determine candidate miRNA biomarkers for CRC, we performed an integrative analysis of miRNA expression profiling compared CRC tissues and paired neighboring noncancerous colorectal tissues. Using robust rank aggregation method, we identified a miRNA set of 10 integrated-signature miRNAs. In addition, the qRT-PCR validation demonstrated that 9 miRNAs were consistent dysregulated with the integrative analysis in CRC tissues, 4 miRNAs (miR-21-5p, miR-183-5p, miR-17-5p and miR-20a-5p) were up-regulated expression, and 5 miRNAs (miR-145-5p, miR-195-5p, miR-139-5p, miR-378a-5p and miR-143-3p) were down-regulated expression (all p < 0.05). Consistent with the initial analysis, 7 miRNAs were found to be significantly dysregulated in CRC tissues in TCGA data base, 4 miRNAs (miR-21-5p, miR-183-5p, miR-17-5p and miR-20a-5p) were significantly up-regulated expression, and 3 miRNAs (miR-145-5p, miR-139-5p and miR-378a-5p) were significantly down-regulated expression in CRC tissues (all p < 0.001). Furthermore, miR-17-5p (p = 0.011) and miR-20a-5p (p = 0.003) were up-regulated expression in the III/IV tumor stage, miR-145-5p (p = 0.028) and miR-195-5p (p = 0.001) were significantly increased expression with microscopic vascular invasion in CRC tissues, miR-17-5p (p = 0.037) and miR-145-5p (p = 0.023) were significantly increased expression with lymphovascular invasion. Moreover, Cox regression analysis of CRC patients in TCGA data base showed miR-20a-5p was correlated with survival (hazard ratio: 1.875, 95%CI: 1.088–3.232, p = 0.024). Hence, the finding of current study provides a basic implication of these miRNAs for further clinical application in CRC.

MicroRNAs (miRNAs) belong to a class of small noncoding RNAs that can regulate the expression level of target genes at the post-transcription phase. Growing evidences indicate that miRNAs can serve as ideal biomarkers for cancer detection and accurate predictions of prognosis, as well as targets for treatment [8,9]. The number of miRNAs has increased largely from 2002 to 2014, and the latest miRBase version (Release 21, 2014) contains 2588 mature human miRNA entries. There are some studies to search biomarkers by analyzing miRNA expression profiling between CRC tissues and paired neighboring noncancerous colorectal tissues, but the results of identification significantly expressed miRNAs are inconsistent or discrepant among each study. There are several reasons to be responsible for the inconsistent results, such as different technological detection platforms, various methods for data processing, constant discovery of new miRNAs and samples from different backgrounds.
To overcome these limitations, we need to integrate their results in an unbiased manner. Robust rank aggregation (RRA) approach has been specifically designed for comparison of several ranked gene lists [10]. The tool looks at how each item is positioned in the ranked lists and compares this to the baseline case where all the preference lists are randomly shuffled. As a result, a p-value would be assigned for all items, showing how much better it was positioned in the ranked lists than expected by chance. This p-value is used both for reranking the items and deciding their significance. RRA is a suitable and effective solution for identification of statistically significant miRNA integrated-signature and is particularly useful when input experiments are performed by different technological platforms cover different sets of genes and full rankings of miRNAs are not available.
Hence, we executed integrated analysis approach by using RRA method to identify the consistently significant miRNAs in different expression profiling datasets. Moreover, we further validated the consistent up-regulated and down-regulated miRNAs in fresh CRC tissues and TCGA datasets in a clinical setting.
The number of research samples ranged from 3 to 84 (median 27) across the studies. The average number of miRNAs assayed was 772 (ranging from 156 to 2090) and most of the studies applied various microarray platforms (either commercial or custom). Five studies were focused only on colon cancer and two studies included only In total, there are 228 significantly up-regulated miRNAs and 230 significantly down-regulated miRNAs reported at least in one study. Additionally, 77 miRNAs were reported as inconsistent alteration. Each study reported with up-regulated miRNAs and down-regulated miRNAs except one study (MY), which reported with only up-regulated miRNAs and no down-regulated ones. The number of significantly dysregulated miRNAs varies across the studies and the top list of miRNAs differs greatly from study to study ( Figure 1). Additionally, none of the miRNAs altered consistently across all of the studies.

Identification integrated-signature miRNAs
A list of statistically significant integrated-signature miRNAs included 5 up-regulated and 5 down-regulated miRNAs were identified in CRC samples compared to noncancerous colorectal tissues using RRA method ( Table 1). All of the 10 integrated-signature miRNAs that reached statistical significance after Bonferroni correction were reported by at least 1/2 datasets. The full list of statistically significant miRNAs is provided in Supplementary Table 2. The most significantly upregulated (miR-21-5p) and down-regulated (miR-145-5p) miRNAs were reported by 19 and 21 datasets, respectively. The direction of expression alteration of integrated-signature miRNAs is consistent across all studies. The majority of the integrated-signature miRNAs belong to the broadly conserved seed families (conserved across most vertebrates).
The 10 integrated-signature miRNA genes are scattered on different chromosome regions and 6 integrated-signature miRNAs belong to the cluster of two or more miRNAs. A cluster is defined as follows: miRNA genes locate at a distance of less than 10 kb and not separated by a transcription unit. MiR-17-5p and miR-20a-5p are all in the chromosomal location of 13q31.3 and belong to the same cluster. MiR-145-5p, miR-378a-5p and miR-143-3p are all in the chromosomal location of 5q32, but miR-378a-5p doesn't belong to the cluster of miR-143/145.

Validation the expression of integrated-signature mirnAs in patients with crc and clinical significance
To validate the expression of the 10 integratedsignature miRNAs that may be the candidate biomarkers for CRC, the expression levels of these miRNAs were compared using qRT-PCR analysis between 11 CRC tissues and the neighboring noncancerous colorectal tissues. The results showed that expression alteration of the miRNAs is consistent with the integrated analysis except miR-31-5p. The expression level of miR-21-5p, miR-183-5p, miR-17-5p and miR-20a-5p were increased more than 2 folds (all p < 0.05), whereas miR-145-5p, miR-195-5p, miR-139-5p, miR-378a-5p and miR-143-3p were decreased more than 2 folds in CRC tissues (all p < 0.05) ( Figure 2). The most up-regulated expressed miRNA was miR-17-5p with a 4.95 fold change (p = 0.003) and the most down-regulated expressed miRNA was miR-378a-5p with a 0.09 fold change in CRC tissues compared to noncancerous tissues (p < 0.001).

target prediction and functional analysis
Target genes were obtained from both prediction algorithms and experimentally supported databases. The counts of predicted targets, experimentally validated targets and consensus targets were summarized in Supplementary Figure 5. MiR-17-5p had highest number of consensus targets, whereas miR-378a-5p was the miRNA with smallest number of targets. In addtion, Enriched KEGG pathways and Panther pathways were most frequently associated with cell signaling (MAPK signaling pathway, Wnt signaling pathway, EGF receptor signaling pathway, Apoptosis signaling pathway) and tumorigenesis (Pathways in cancer, Angiogenesis, Ras Pathway, p53 pathway), but also with cell mobility and community (regulation of actin cytoskeleton, Focal adhesion) ( Table 2). The most enriched GO processes regulated by the miRNAs include regulation of transcription, cell cycle, signal transduction, and apoptotic process. The heatmaps of enriched KEGG pathways and Panther pathways are shown in Figure 5 and Supplementary Figure 6, respectively. Furthermore, to evaluate association between these pathways and CRC, the publications which described CRC related constituent objects in the pathways were searched in PubMed. The pathways in which the constituent objects supported were considered to be CRC-related. After text mining, 42 KEGG pathways and 28 Panther pathways were found to be CRC-related. To visualize the most significantly enriched pathways, volcano plots were constructed by plotting the -log10 of p-value versus gene enrichment ratio. Finally, 26 of the 42 KEGG pathways and 21 of the 28 Panther pathways were highly related with CRC (enrichment ratio > 0.3, p-value < 0.00001) (Supplementary Table 3, Figure 6).

dIscussIon
The most common drawback of miRNA expression profiling studies is inconsistency among different studies. A reasonable solution to the problem is to determine the miRNAs which were consistently reported among the different miRNA expression profiling studies and expressed in a consistent orientation. Systematic review or meta-analysis has been done previously to determine differentially expressed genes in cancer at the gene level [35,36]. However, such rigorous approach is often not possible due to the lack of cross-platform standardization of miRNA profiling technologies or the unavailability of raw data. In this study, we analyzed CRC-specific miRNAs derived from independent expression profiling experiments using RRA method. The miRNAs would be   re-ranked and their significance would be re-decided. We identified an integrated-signature of 5 up-regulated and 5 down-regulated miRNAs from 27 expression profiling datasets included more than 1000 CRC tissue and noncancerous tissue samples.
To determine whether the 10 integrated-signature miRNAs have clinical values as biomarkers in CRC, we also performed a validation experiment. Our data confirmed the expression changes of 9 miRNAs were consistent with our integrated analysis. These results of our validation experiment have further supported the findings obtained in the present integrated analysis. Consistent with our initial analysis, 7 miRNAs (miR-21-5p, miR-183-5p, miR-17-5p, miR-20a-5p, miR-145-5p, miR-139-5p and miR-378a-5p) were found to be significantly dysregulated in TCGA data base. In this study, as was validated by qRT-PCR and TCGA datasets, the expression changes of the 7 miRNAs were all consistent with integrated analysis. Therefore, this miRNA panel might be novel potential biomarkers for the diagnosis of CRC.
The integrated-signature miRNAs are key regulatory factors of the oncogenic process. This is supported by KEGG and Panther pathways enrichment analyses of targets of miRNAs, which indicate very strong impact on pathways such as: Pathways in cancer, MAPK signaling pathway, Wnt signaling pathway, EGF receptor signaling pathway, Apoptosis signaling pathway ( Table 2). In addition, the association between these pathways and CRC are highly related by publication analysis. Therefore, these miRNAs may be good candidate therapeutic targets in CRC. Further studies could be performed to evaluate the clinical values of the miRNA expression signature in CRC.
The most consistently reported miRNA of upregulation in our integrated analysis is miR-21-5p, which is an oncogenic factor and often alters expression in cancers. It is shown that miR-21-5p directly target the tumor-suppressor PTEN, chemokine gene CCL20 and PDCD4 [37][38][39]. High level of miR-21-5p is associated with worse survival and poor response to chemotherapy in CRC patients [40]. Furthermore, miRNA-21-5p can induce resistance to 5-fluorouracil by down-regulating human DNA MutS homolog 2 (hMSH2), which was a core mismatch repair recognition protein complex [41]. Additionally, miR-21-5p expression level showed significant association with depth of invasion, lymphatic and venous invasion, liver metastasis and Dukes' stage [42]. The most consistently reported down-regulated miRNA is miR-145-5p which was shown to possess antitumorigenic activity, including involvement in various cancer-related processes such as proliferation, invasion and migration. It is showed that miR-145-5p directly targeted catenin δ-1, contributing to the aberrant translocation of β-catenin in the Wnt/β-catenin signaling pathway to suppress human colon cancer cells [43]. Moreover, several genes have been shown to be direct target by miR-145-5p in CRC [44][45][46][47][48], and majority of these genes are oncosuppression. Additionally, patients with a low miR-145-5p expression level had significantly more often a worse response to neoadjuvant chemoradiotherapy compared to patients with a high expression level of miR-145-5p [49].
Although our analysis was limited to comparison and validation between tumor and noncancerous tissue only, the significantly and consistently reported miRNAs could be used as potential diagnostic biomarkers. In a clinical setting, sufficient sensitivity and specificity of the panel of miRNAs should be determined in the further well designed clinical studies. Furthermore, targets prediction and functional enrichment analysis may provide a clue for elucidating the role of miRNAs in tumorigenesis of CRC and the precise underlying mechanisms. Taken together, the findings of the current study may have substantial clinical significance or implications.
In conclusion, a CRC associated miRNA expression signature, consisting of 7 highly significant and consistently dysregulated miRNAs, were identified in our integrated analysis and validation study, which may be potential candidate biomarkers for CRC. The rigorous evaluation of integrated-miRNA signature and functional enrichment analysis of their targets were promising them as candidates for diagnostic biomarkers of CRC. Further clinical and mechanistic studies focusing on these miRNAs are required for their clinical significance and the underlying mechanisms in tumorigenesis of CRC.

MAterIAls And MetHods studies selection and analysis
CRC miRNA expression profiling studies were identified through Web of Science (www.isiknowledge. com) and PubMed (www.ncbi.nlm.nih.gov/pubmed) using search term: ((mirna OR microrna or mirnas or micrornas) AND (profile or profiles or profiling or pattern) AND (colorectal OR colon OR rectal) AND (cancer OR tumor OR adenocarcinoma OR neoplasia)). The last search was performed in December 15 2014. To ensure that relevant studies were not missed, search was also performed in Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov/ gds) and ArrayExpress (www.ebi.ac.uk/arrayexpress).
Full texts of studies published in English language were involved in our analysis, and only original experimental studies that screened different miRNAs between CRC and paired adjacent noncancerous tissue in human were included for further evaluation. Studies analyzing the individual preselected candidate miRNAs or using only cell lines as the research samples were excluded. Studies that profiled different expression miRNAs only in histological subtypes but did not include noncancerous tissue were also excluded.
The lists of miRNAs that had statistically significant changes in expression level were accurate extracted from each eligible study. In case the miRNA lists was not available from the publications, the authors were contacted directly. All miRNA names were standardized according to miRBase version 21, and viral miRNAs and non-miRNA probes were excluded from integrated analysis.
To identify miRNAs that are ranked consistently better than expected by chance, we used RRA method implemented as a RobustRankAggreg package in R software (version 3.1.3). Analysis was repeated 10,000 times to assess the stability of acquired p-values by one random miRNA list removed from the analysis each time. Acquired p-values from each round were finally averaged for each miRNA.

Validation of the integrated-signature mirnAs
To validate the expression of integrated-signature miRNAs, fresh CRC tissues and paired neighboring noncancerous colorectal tissues were obtained from 11 patients by experienced surgeons and examined by experienced pathologists at the First Affiliated Hospital of Wenzhou Medical University (the detailed information of patients is provided in Table 3). Informed consents were obtained from all the patients and the study was approved by the Institutional Review Board. The samples were immediately frozen in liquid nitrogen in 10 minutes after surgical resection and were then stored at −80°C temperature until RNA extraction.
Total RNA was extracted and isolated using miRNeasy Micro Kit (QIAGEN) following the manufacturer's instructions. SYBR-Green qRT-PCR assay was used for miRNA quantification. In brief, 40 ng of total RNA containing miRNA was polyadenylated by poly (A) polymerase and then reverse transcribed to cDNA using miScript Reverse Transcription kit (QIAGEN) according to the manufacturer's instructions. Real-time qPCR was performed using miScript SYBR-Green PCR kit (QIAGEN) containing universal primer in ABI 7500 PCR system (Applied Biosystems). The primers of each dysregulated miRNAs were listed in Supplementary Table  4. Each reaction volume contains 4 μl of cDNA, 0.5 mM of each primer and SYBR-Green PCR Master mix in a final 20 μl volume. The amplification program was: 95˚C for 2 min for one cycle, then 94˚C for 20 sec with 60˚C for 35 sec for 45 cycles, and melting curve analyses were performed at the end of the cycles. The expression levels of miRNAs were normalized to RNU6B. The relative expression of each miRNA was calculated using the 2 -ΔΔCt method. Student's t-test was used to compare miRNA expression values of CRC and noncancerous colorectal tissues, and p < 0.05 indicated significant difference.
The results of qRT-PCR were validated in the TCGA datasets. The miRNA expression data and corresponding clinical information for CRC datasets were downloaded from TCGA data portal in June 2015 (Supplementary  Table 5). TCGA data are classified by data type (clinical, mutations, gene expression) and data level, to allow structured access to this resource with appropriate patient privacy protection. This study meets the publication guidelines provided by TCGA. The miRNA expression profiling was performed using the Illumina HiSeq 2000 miRNA sequencing platforms (Illumina Inc, San Diego, CA). The miRNA expression level was demonstrated as reads per million miRNA mapped data. The miRNA expression analyses were performed using BRB-ArrayTools (version 4.4). In brief, the miRNAs with missing data exceeded 10% of all subjects were excluded from the datasets and the expression level of each individual miRNA was log2-transformed for further analysis.
The prognosis of CRC strongly depends upon tumor stage, lymphovascular invasion and microscopic vascular invasion (MVI). Therefore, the differences of miRNAs expression level were also tested with or without lymphovascular invasion, presence of MVI or not, and different tumor stages in the TCGA datasets. The statistical analyses were performed using the SPSS 18.0 (SPSS Inc.). Statistical significance was defined as p < 0.05. For survival analysis, we used the Kaplan-Meier method to analysis the correlation between overall survival and the miRNAs, and the log-rank test was used to compare survival curves. The optimum cut-off value for the miRNAs using X-tile plots based on the association with mortality of the patients. X-tile plots provide a single and intuitive method to assess the association between variables and survival. The X-tile program can automatically select the optimum data cut point according to the highest χ2 value (minimum p-value) defined by Kaplan-Meier survival analysis and log-rank test [56]. We did the X-tile plots using the X-tile software version 3.6.1 (Yale University School of Medicine, New Haven, CT, USA).

enrichment analysis
Enrichment analyses for KEGG, Panther pathways and GO processes were performed with GeneCodis web tool (http://genecodis.cnb.csic.es/). Consensus target genes for each integrated-signature miRNA were used as input data and then acquired false discovery rate (FDR)corrected p-values of each pathway. Clustering of the heatmap was acquired by applying a pheatmap package in R software (version 3.1.3) based on Pearson correlation and average linkage. Furthermore, the association between the pathways affected by altered expression of miRNAs and CRC was evaluated.

conFlIcts oF Interest
No conflicts of interest exist for all authors.