Circulating microRNA-422a is associated with lymphatic metastasis in lung cancer

To identify circulating microRNAs (miRNAs) associated with lymphatic metastasis in lung cancer, we performed miRNA microarray analysis of lymph nodes with and without metastasis from five lung cancer patients. The top six differentially expressed miRNAs were selected for further validation. A training cohort of 26 patients with lung cancer was first recruited, and the selected miRNAs were measured in plasma samples. miR-422a showed the highest diagnostic accuracy for lymphatic metastasis (area under the receiver operating characteristic curve [AUC], 0.744; 95% CI, 0.570-0.918). The diagnostic value of miR-422a was confirmed in a validation cohort of 51 lung cancer patients (AUC, 0.880; 95% CI, 0.787-0.972). A high diagnostic value was also observed in the integrated analysis of the training and validation cohorts (AUC, 0.792; 95% CI, 0.688-0.896). The odds ratio of high miR-422a expression for lymphatic metastasis in lung cancer was 13.645 (95% CI, 2.677-69.553) after adjustment for potential confounding factors. Furthermore, we predicted the target genes of miR-422a by combining the online database miRecords with data from GEO and TCGA. Sixty-one target genes of miR-422a that might be involved in lymphatic metastasis in lung cancer were identified. GO analysis suggested that multiple target genes were concentrated in the biological processes of apoptosis, transport, and protein phosphorylation.


INTRODUCTION
Lung cancer is one of the leading causes of cancer-related death worldwide, and its 5-year survival rate is low, about 15%-20% [1]. Surgical resection is an effective approach to achieving long-term survival and is widely used in lung cancer. However, postoperative relapse and subsequent lymphatic and hematogenous metastasis still account for 90% of deaths in lung cancer [2]. Lymphatic metastasis is directly associated with distant recurrence and overall survival (OS) in resected non-small cell lung cancer (NSCLC) [3]. Imaging approaches such as PET/CT are currently the most widely used for lymph node staging; however, their sensitivity for detecting metastatic tumor in lymph nodes smaller than 1 cm is low [4,5]. Therefore, identifying biomarkers that can predict early metastasis would benefit patient survival. Molecular markers, including microRNAs, could be of great use for a more accurate risk assessment of nodal metastasis.
MicroRNAs (miRNAs or miRs) are a family of small endogenous RNA gene products (19-25 nucleotides in length) [6]. Over 50% of annotated human miRNA genes are located within cancer-associated genomic regions or fragile sites and exert important functions similar to those of oncogenes or tumor suppressors in malignancy [7]. These small RNA molecules can be efficiently retrieved from (fixed or frozen) tumor samples or biological fluids and display high stability and tissue specificity [8]. Previous studies demonstrated that a number of miRNAs have potential roles in diagnosis, staging, and prediction of outcomes in various types of cancer, including lung cancer [9,10]. To date, many miRNAs have also been reported to correlate with lymph node metastasis of lung cancer [11].
In the present study, we performed miRNA microarray analysis of lymph node tissues from patients with lung cancer and screened for differentially expressed miRNAs related to lymph node metastasis. We then enrolled a training cohort of lung cancer patients and measured the expression of the top candidate miRNAs in plasma samples. The potential diagnostic value of the selected miRNAs in distinguishing lymph node metastasis was analyzed. Furthermore, miR-422a, the miRNA with the highest accuracy, was validated in another cohort. Finally, the predicted target genes of miR-422a were analyzed by bioinformatic methods.

Literature review
To summarize the miRNAs previously reported to correlate with lymphatic metastasis in lung cancer, we performed a systematic literature review in two English-language databases (PubMed and Embase) up to June 30, 2016. After initial screening of article titles and abstracts, 187 articles remained for selection. Fifty-four papers were excluded according to the inclusion criteria, of which 28 were irrelevant studies, 12 were review articles, 8 were experimental studies only, and 8 reported only a panel or signature of multiple miRNAs. Finally, 133 papers reporting 115 miRNAs were identified. Of the 115 miRNAs, 95 were found to be associated with lymphatic metastasis in lung cancer in at least one publication, while 20 were not correlated with metastasis (Supplementary Table 1). miR-21, miR-10b, miR-148b, miR-155, and miR-200c were the most frequently studied.

miRNA microarray analysis in NSCLC with or without lymphatic metastasis
In the present study, we aimed to identify novel miRNAs correlated with lymphatic metastasis in NSCLC. First, we performed miRNA microarray analysis of metastatic lymph node tissues from five lung cancer patients and compared the results with those from the corresponding lymph nodes without metastasis. After data processing and analysis, a set of 50 miRNAs was found to be differentially expressed between metastatic and non-metastatic lymph nodes, of which 39 were upregulated and 11 were downregulated (Figure 1 and Table 1). Interestingly, 18 of the 50 miRNAs also appeared in the literature review.

Value of the top differentially expressed miRNAs in diagnosing lymph node metastasis in lung cancer
To determine the diagnostic value of the miRNAs identified by microarray analysis, we chose six miRNAs (hsa-miR-375, hsa-miR-205, hsa-miR-183, hsa-miR-200b, hsa-miR-378, and hsa-miR-422a) and evaluated their expression levels in plasma samples from the training cohort of lung cancer patients (Table 2) and five patients with benign lung diseases by quantitative real-time polymerase chain reaction (qRT-PCR). The results suggested that all six miRNAs were highly expressed in lung cancer with lymph node metastasis compared with cancers without lymph node metastasis and benign diseases; of note, miR-422a exhibited a significant difference (Figure 2). Receiver operating characteristic (ROC) curve analysis was then performed to compare the diagnostic accuracy of the miRNAs and of traditional tumor markers such as carcinoembryonic antigen (CEA), cancer antigen 199 (CA199), cytokeratin 19 fragments (CYFRA21-1), and squamous cell carcinoma antigen (SCC). The results indicated that the candidate miRNA, miR-422a, showed the highest accuracy in predicting lymphatic metastasis, with an AUC value of 0.744 (Figures 3-4 and Table 3). In addition, correlations among the six miRNAs were analyzed. Correlations were identified between miR-200b and miR-183; between miR-375 and both miR-183 and miR-200b; and between miR-422a and miR-183, miR-200b, miR-375, and miR-378 (Table 4). At the same time, to clarify whether the aberrant expression of miR-422a in plasma originated from the cancer tissue, we downloaded a publicly available dataset (GSE16025) from the NCBI GEO database, which contained miR-422a expression data for 20 lung cancers with lymphatic metastasis, 41 without lymphatic metastasis, and 10 normal lung tissues [12]. Consistent with our results in plasma samples, miR-422a expression increased significantly from normal lung tissue to cancer tissue with lymphatic metastasis (Supplementary Figure 1).

Validation of miR-422a in predicting lymphatic metastasis of lung cancer
Another cohort consisting of 40 lung cancer cases with lymphatic metastasis and 11 cases without was used to further verify the diagnostic value of miR-422a (Table 2). Plasma miR-422a levels were determined by qRT-PCR, and ROC curve analysis was performed. The expression of miR-422a in samples with lymphatic metastasis was significantly higher than in those without, consistent with the results of the training cohort. ROC analysis suggested that miR-422a exhibited high diagnostic accuracy, with an AUC value of 0.880 (Table 3 and Figure 5A). Furthermore, the training and validation cohorts were integrated by normalizing the miR-422a expression data, and a high diagnostic accuracy of miR-422a was still observed (AUC = 0.792, Table 3 and Figure 5B). Meanwhile, the correlations between miR-422a and clinical features, including age, gender, histological subtype, tumor diameter, T stage, M stage, and lymph node metastasis, were investigated in all patients. A miR-422a expression level below the cut-off value from the ROC analysis was defined as "Low expression" and a level above the cut-off value was defined as "High expression". The results suggested that miR-422a was correlated with histological subtype, T stage, and tumor diameter stratified at 2.55 cm, a cut-off value from ROC analysis with lymph node metastasis as the state variable in all patients with lung cancer (Table 5).
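The dichotomization into "Low expression" and "High expression" described above can be sketched as follows. This is a minimal illustration, assuming the cut-off is the point that maximizes Youden's J (sensitivity + specificity - 1); the text does not state which criterion was used. The function name, the handling of values exactly equal to the cut-off, and the expression values are all hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (cut-off, J) maximizing Youden's J = sensitivity + specificity - 1.

    scores: expression values; labels: 1 = lymphatic metastasis, 0 = none.
    Assumption: a sample is called positive when score >= cut-off.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical normalized miR-422a levels and lymph node status.
expression = [0.2, 0.4, 0.6, 0.8, 1.0]
ln_status = [0, 0, 1, 1, 1]

cutoff, j_stat = youden_cutoff(expression, ln_status)
groups = ["High" if value >= cutoff else "Low" for value in expression]
```

The resulting "High"/"Low" labels are the kind of binary variable that can then be cross-tabulated against clinical features.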
In addition, the correlations of lymph node metastasis with the clinical features were also explored. The results suggested that significant correlation was observed between lymphatic metastasis and the clinical   association of miR-422a with lymphatic metastasis in lung cancer (Table 7). On the other hand, analyses of the data from GSE16025 also revealed that miR-422a levels in cancer tissue were associated with lymphatic metastasis in lung cancer in univariate analysis (OR, 6

Prediction of miR-422a target genes that might be involved in lymphatic metastasis
Given that the biological significance of miRNA deregulation relies on the actions of their targets, we aimed to identify the potential targets of miR-422a involved in lymphatic metastasis in lung cancer by prediction and data mining. First, the predicted targets of miR-422a were analyzed with the online database miRecords, and a total of 3309 predicted target genes were identified. Second, GSE51852 and GSE51853, which contain both the mRNA and the miRNA expression profiles of 126 lung cancer patients, were downloaded from NCBI; 2052 genes co-expressed with miR-422a were obtained after analyzing these datasets. Third, data for 983 lung cancer patients at N0-3 stage were downloaded from The Cancer Genome Atlas (TCGA), and 4842 genes differentially expressed between cancers with and without lymph node metastasis were identified. Finally, to find potential novel target genes of miR-422a involved in lymph node metastasis of lung cancer, the results of the above analyses were integrated, leaving a total of 61 genes (Table 8). These genes were then subjected to gene ontology (GO) analysis in the GeneCoDis3 online database (http://genecodis.cnb.csic.es). The biological processes of apoptosis, transport, and protein phosphorylation contained multiple target genes. Many of the genes were related to DNA and protein binding, and most of the genes served as nucleus and membrane components (Table 9).

DISCUSSION
In the present study, we screened for novel circulating miRNAs that predict lymph node metastasis in lung cancer. One novel miRNA, miR-422a, showed the highest diagnostic value compared with other miRNAs and traditional tumor markers such as CEA, CA199, and CYFRA21-1. Furthermore, a validation cohort consisting of 51 lung cancer patients was recruited to confirm the diagnostic value of miR-422a. ROC curve analyses revealed that miR-422a had a high AUC value of 0.880, with a sensitivity of 86.22% and a specificity of 96.55%. Some miRNAs have been reported to be correlated with lymphatic metastasis and are shown in our literature review. For example, miR-196a [13] and miR-130a [14] were positively associated with lymph node metastasis, while miR-451 [15] was negatively associated with lymph node metastasis. Recent studies have focused on the sources of circulatory molecular biomarkers [16].
miR-422a is a less studied miRNA. A few studies have revealed the involvement of miR-422a in human cancers. It is dysregulated in osteosarcoma [19], colorectal cancer [20][21][22], hepatocarcinoma [23], and gastric cancer [24]. miR-422a is an indicative marker for poor prognosis in osteosarcoma, gastric cancer, and hepatocarcinoma [23]. It is also associated with chemo-resistance in osteosarcoma [25]. In lung cancer, one study suggested that miR-422a can be used to distinguish lung adenocarcinoma from lung squamous cell carcinoma [26]. In the present study, we found that plasma miR-422a levels were higher in lung cancer patients with lymphatic metastasis than in patients without lymphatic metastasis. Here, the direction of the change in miR-422a expression was opposite to that in the microarray analysis. We speculate that, because the sample size in the microarray analysis was very small, individual differences introduced a certain degree of bias when exploring the miRNAs differentially expressed between patients with and without lymphatic metastasis. The upregulation of miR-422a in patients with lymphatic metastasis was also supported by an independent dataset, GSE16025, which profiled miRNA expression in lung cancer tissues (Supplementary Figure 1). Our study found, for the first time, that miR-422a has potential diagnostic value in discriminating lymph node metastasis of lung cancer. To further clarify the association between miR-422a and lymphatic metastasis in lung cancer, we analyzed the associations of miR-422a with the clinical parameters and the associations of lymphatic metastasis with the clinical parameters. The results suggested that miR-422a was associated with histology and T stage/tumor diameter, which might be confounding factors for the association of miR-422a with lymphatic metastasis. Notably, a strong and significant association was still observed between miR-422a and lymphatic metastasis after adjustment for the confounding factors (Table 7; the adjusted ORs were obtained by controlling for histology and tumor size, or for age and T stage). Furthermore, analysis of the expression profile data from cancer tissues in GSE16025 showed a similar association between cancer tissue miR-422a level and lymphatic metastasis in lung cancer.
In summary, our findings highlight that a novel circulating miRNA, miR-422a, may serve as a noninvasive biomarker with sufficient accuracy for predicting lymph node metastasis in lung cancer patients. The application of miR-422a in clinical practice may support prophylactic intervention to mitigate morbidity and mortality. miR-422a may also provide a new target for therapeutic approaches in the management of lung cancer.

Literature review
To summarize the associations of miRNAs with lymph node metastasis in lung cancer, a systematic literature search was performed independently by two authors in two databases (PubMed and EMBASE) up to June 30, 2016. The following terms were used: "lung cancer", "lymph node OR lymph node metastasis OR lymphatic metastasis", and "Microrna OR miRNA". Inclusion criteria: (1) English language papers; (2) the associations of miRNA with lymph node metastasis were determined in clinical samples; (3) a single miRNA, rather than a panel or signature of multiple miRNAs, was investigated.

Patients and samples
Anonymized fresh metastatic lymph node samples and matched noncancerous lymph nodes from five lung cancer patients were collected after surgical resection for miRNA profile analysis. Then, 26 lung cancer patients (14 cases with lymphatic metastasis and 12 cases without lymphatic metastasis) recruited from August 2013 to May 2014 were included as a training cohort to determine the diagnostic value of the top differentially expressed miRNAs from the miRNA profile analysis (Table 2). In addition, five patients with benign lung diseases without cancer were recruited as controls. Furthermore, another cohort of 51 lung cancer patients (LN+, 40; LN-, 11) was recruited from June 2014 to October 2015 to validate the miRNA with the highest diagnostic accuracy in the training cohort (Table 2). Unlike the microarray analysis, which used lymph node tissue, the cohort analyses used fresh peripheral blood samples. Clinicopathological characteristics of patients with lung cancer were defined according to the TNM staging system criteria of the Union for International Cancer Control. Histological diagnosis was based on the medical records. Formalin-fixed paraffin-embedded (FFPE) sections were carefully reviewed for the diagnosis of metastatic lymph nodes and noncancerous lymph nodes. All procedures were approved by the Ethics Committee of Peking University. Written informed consent was obtained from all patients or their families.

miRNA microarray analysis
The matched frozen tissue samples of metastatic and noncancerous lymph nodes from the same patients

Quantitative real-time polymerase chain reaction (qRT-PCR)
miRNAs were extracted from the plasma samples of lung cancer patients using the miRNeasy Mini Kit (Qiagen, Valencia, CA, USA) according to the manufacturer's instructions. The concentration and purity of isolated RNA were estimated using the ND-1000 microspectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). RNA integrity was assessed on a BioAnalyzer 2100 using the BioAnalyzer RNA 6000 Nano LabChip Kit (Agilent Technologies, Palo Alto, CA, USA). TaqMan miRNA assays (Applied Biosystems, Thermo Fisher Scientific, Waltham, MA, USA) were then used to detect the expression levels of mature miRNAs. For reverse transcription reactions, 10 ng of total RNA was mixed with the reverse transcription primers, reacted at 16°C for 30 min, 42°C for 30 min, and 85°C for 5 min, and then maintained at 4°C. Following the reverse transcription reactions, 1 μL of cDNA was used for polymerase chain reaction (PCR) with 2 μL of the TaqMan primers. PCR reactions were conducted at 95°C for 10 min followed by 40 cycles of 95°C for 15 s and 60°C for 60 s on an ABI 7500 Fast Real-time PCR system (Thermo Fisher Scientific, Waltham, MA, USA). Real-time PCR results were analyzed, and miRNA expression levels were calculated using the 2^-ΔΔCt method with normalization to miR-16 as an internal reference control. The probes for the candidate miRNAs are shown in Supplementary Table 2.
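As a worked illustration of the 2^-ΔΔCt calculation, the sketch below computes a relative expression value from threshold cycle (Ct) values. It assumes that ΔCt is the target Ct minus the miR-16 reference Ct and that ΔΔCt is taken against a calibrator sample; the choice of calibrator is not stated in the text, and the function name and all Ct values here are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^-ddCt method.

    ct_target / ct_reference: Ct of the miRNA of interest and of the
    reference (miR-16 in the text) in the sample being measured.
    ct_target_cal / ct_reference_cal: the same two Ct values in the
    calibrator sample (an assumption; the calibrator is not specified).
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# (relative to the reference) in the sample than in the calibrator.
fold_change = relative_expression(
    ct_target=26.0, ct_reference=20.0,
    ct_target_cal=28.0, ct_reference_cal=20.0,
)
```

In this example ΔCt is 6 for the sample and 8 for the calibrator, so ΔΔCt is -2 and the relative expression is 2^2 = 4.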

Predicted targets analysis of miR-422a
The predicted targets of miR-422a were analyzed with an online database, miRecords (http://c1.accurascience.com/miRecords/), which is a resource for animal miRNA-target interactions. The Predicted Targets component of miRecords integrates the predicted targets of the following miRNA target prediction tools: DIANA-microT, MicroInspector, miRanda, MirTarget2, miTarget, NBmiRTar, PicTar, PITA, RNA22, RNAhybrid, and TargetScan/TargetScanS [27]. A gene was retained as a target of miR-422a when it was predicted by at least three algorithms in miRecords. In addition, the GEO datasets GSE51852 and GSE51853, which contain both the mRNA and the miRNA expression profiles of 126 lung cancer patients, were downloaded from NCBI and used to find genes co-expressed with the selected miRNA, miR-422a [28]. Furthermore, mRNA profile data for 983 lung cancer patients were downloaded from the TCGA database, subgrouped by the presence or absence of lymphatic metastasis, and used to find genes related to lymphatic metastasis. The potential target genes of miR-422a involved in lymphatic metastasis were then identified by integrated analysis of the predicted targets from miRecords and the data mining of GEO and TCGA. Finally, the potential targets were subjected to GO analysis in the GeneCoDis3 online database (http://genecodis.cnb.csic.es).
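The filtering and integration steps above can be sketched as set operations. This is an illustration only: it assumes that "integrated analysis" means taking the genes common to all three lists, which the text does not state explicitly, and the function names, tool selection, and gene identifiers below are hypothetical placeholders rather than study data.

```python
from collections import Counter


def predicted_by_min_tools(predictions, min_tools=3):
    """Keep genes predicted by at least `min_tools` prediction tools.

    predictions: dict mapping tool name -> iterable of gene symbols.
    """
    counts = Counter(
        gene for genes in predictions.values() for gene in set(genes)
    )
    return {gene for gene, n in counts.items() if n >= min_tools}


def integrate(predicted, coexpressed, differential):
    """Assumed integration rule: intersection of the three gene sets."""
    return set(predicted) & set(coexpressed) & set(differential)


# Hypothetical toy inputs.
tool_predictions = {
    "TargetScan": ["GENE_A", "GENE_B"],
    "miRanda": ["GENE_A", "GENE_B", "GENE_C"],
    "PITA": ["GENE_A", "GENE_C"],
    "PicTar": ["GENE_B"],
}
coexpressed_genes = {"GENE_A", "GENE_B", "GENE_D"}   # stands in for the GEO step
differential_genes = {"GENE_A", "GENE_E"}            # stands in for the TCGA step

predicted_genes = predicted_by_min_tools(tool_predictions, min_tools=3)
candidate_targets = integrate(predicted_genes, coexpressed_genes, differential_genes)
```

Here GENE_A and GENE_B pass the three-tool filter, and only GENE_A survives the intersection with the other two lists.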

Statistical analysis
Statistical analyses were performed with SPSS 16.0 (IBM, Armonk, NY, USA). The Kruskal-Wallis H test was used to compare miRNA expression among multiple groups. Receiver operating characteristic (ROC) curves were constructed to discriminate patients with lymphatic metastasis from those without. The area under the curve (AUC) and its 95% confidence interval (CI) were calculated, along with sensitivity and specificity. The Z-score method was used to normalize the miRNA expression data in the training and validation cohorts. The chi-square test was used to explore the associations of miR-422a with clinical features and the associations of lymphatic metastasis with clinical features in all recruited lung cancer patients and in dataset GSE16025. The correlations among the miRNAs in the training cohort and the correlations between the genes and miR-422a in the GEO data were analyzed by Pearson's correlation coefficient. mRNA expression in lung cancer patients with lymphatic metastasis was compared with that in patients without lymphatic metastasis in the TCGA data by Student's t test. All tests were two-sided. P < 0.05 was considered statistically significant.
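The Z-score normalization and AUC calculation can be illustrated with a short sketch. The analyses in this study were run in SPSS; the Python below is only a conceptual re-expression. It assumes that each cohort is standardized separately before pooling, uses the population standard deviation (the text does not say which was used), and computes the AUC as the proportion of positive/negative pairs ranked correctly, which is equivalent to the area under the empirical ROC curve. All names and values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. labels: 1 = lymphatic metastasis, 0 = none.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical expression values measured on different scales in two cohorts.
train_expr, train_ln = [1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1]
valid_expr, valid_ln = [10.0, 20.0, 30.0, 50.0], [0, 1, 1, 1]

# Normalize within each cohort, then pool for the integrated analysis.
pooled_scores = z_scores(train_expr) + z_scores(valid_expr)
pooled_labels = train_ln + valid_ln
pooled_auc = auc(pooled_scores, pooled_labels)
```

Standardizing within each cohort places the two sets of measurements on a common scale, so that a between-cohort offset does not dominate the pooled ranking.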

Author contributions
LNW, BH and BTZ prepared the manuscript. YNL, YY and LJZ designed the study and performed the statistical analysis. JFC conceived the study, participated in its design and coordination, and helped to draft the manuscript. All authors read and approved the final manuscript.

CONFLICTS OF INTEREST
The authors declare no conflicts of interest.