Polymorphisms of pri-miR-219-1 are associated with the susceptibility and prognosis of non-small cell lung cancer in a Northeast Chinese population

The occurrence and development of non-small cell lung cancer (NSCLC) is a complex process affected by both genetic and environmental factors. Single nucleotide polymorphisms (SNPs) that affect microRNA (miRNA) biogenesis can influence the expression of mature miRNAs and thereby the risk of NSCLC. Our study focused on the correlation between the rs213210, rs421446 and rs107822 polymorphisms in pri-miR-219-1 and the susceptibility to or prognosis of NSCLC in a Chinese population. A case-control study of 405 newly diagnosed patients and 405 controls was performed. Ten ml of venous blood was collected from each subject for genotyping with the TaqMan allelic discrimination method, and SPSS was used for statistical analyses. We found that the CC genotype of rs213210 (OR=3.462, 95%CI=2.222-5.394, P<0.001), compared with the TT genotype, and the GG genotype of rs107822 (OR=3.553, 95%CI=2.329-5.419, P<0.001), compared with the AA genotype, were associated with a significantly increased risk of NSCLC. Haplotype analysis showed that the pri-miR-219-1 haplotype C(rs213210)-C(rs421446)-G(rs107822) was a risk haplotype for lung cancer. Polymorphisms in pri-miR-219-1 showed no relationship with overall survival in NSCLC. Overall, these findings show for the first time that rs213210 and rs107822 could be meaningful genetic markers of lung cancer risk.


INTRODUCTION
Non-small cell lung cancer (NSCLC) accounts for approximately 80% of lung cancer cases [1]. Lung cancer ranks first in mortality among malignant neoplasms [2], even with advanced chemotherapy and precise molecular-targeted treatment. According to histological type, NSCLC is divided into three main categories: squamous-cell carcinoma (SCC), adenocarcinoma (ADC) and large-cell carcinoma (LCC) [3]. Although smoking is one of the established risk factors for NSCLC, particular genetic alterations or genotype combinations, and especially the interaction of environmental factors and genes, are also involved in the occurrence and development of this malignant neoplasm [4][5][6].
MicroRNAs (miRNAs) are a large family of small endogenous non-coding RNAs, 21-25 nucleotides in length, which can modulate the expression of target messenger RNA (mRNA) by mRNA cleavage or post-transcriptional inhibition [7,8]. Mature miRNAs are produced from the stem-loop structure of precursor miRNAs (pre-miRNAs) derived from a long primary miRNA (pri-miRNA) [9]. In terms of the gene sequence as a whole, miRNAs are more likely to harbor single-nucleotide mutations [10]. Single nucleotide polymorphisms (SNPs) in pri-miRNAs influence the maturation of the respective miRNAs, which in turn affects mature miRNA expression and interactions [11][12][13].
Oncotarget, 2017, Vol. 8, (No. 34), pp: 56533-56541, Research Paper (www.impactjournals.com/oncotarget/)
In NSCLC, SNPs in pri-miRNAs may serve as genetic markers of cancer susceptibility and prognosis [14].
Hsa-miR-219a-1 (miR-219-1), located on chromosome 6 (6p21), may exert an influence on lung cancer [15,16]. Several SNPs in the pri-miRNA sequence of miR-219-1 have been recognized and studied: rs213210 was associated with poor prognosis in colorectal cancer [17], rs107822 reduced the risk of esophageal cancer [18], and rs421446 was correlated with the risk of hepatocellular carcinoma progression [19]. Given that linkage disequilibrium among these three loci has been reported and verified, we hypothesized in the present study that rs213210, rs421446 and rs107822 in miR-219-1, individually or as a haplotype, may be related to NSCLC susceptibility and prognosis.

Study characteristics
The demographic characteristics of the 405 cases and 405 controls in the present study are summarized in Table 1. There was no significant difference in age distribution (P=0.49) between the cases (51.27±22.08 years) and controls (50.23±21.02 years). The distributions of gender and smoking status, however, differed significantly (both P<0.001), so subsequent analyses were adjusted for smoking status, age and gender. Of the 405 patients, 324 had follow-up information, with a median follow-up time of 24.35 months.
pri-miR-219-1 polymorphisms are associated with the risk for NSCLC

Interactions between polymorphisms of pri-miR-219-1 and tobacco exposure
This study further investigated the interaction between tobacco exposure and the SNPs by cross-over analysis (Table 3). Relative to non-smoking carriers of the rs213210 TT genotype, the OR (4.33) for CC genotype carriers with tobacco exposure was higher than the OR (2.45) for TT carriers with tobacco exposure and the OR (3.83) for CC carriers without tobacco exposure. Similar results were obtained when the rs421446- and rs107822-tobacco exposure combinations were examined. These cross-over results indicated that an SNP-tobacco exposure interaction may exist; statistical tests were therefore used to evaluate the significance of the interaction on both the additive and the multiplicative scale (data not shown). The results suggested that the interactions between the SNPs and tobacco exposure were not significant on either scale.
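The two interaction scales can be illustrated with the three rs213210 odds ratios quoted above. The following is a minimal sketch, not the authors' analysis: it computes only point estimates (the significance tests are reported as data not shown), it treats odds ratios as approximate relative risks, and the function names are illustrative assumptions. RERI, AP and S are the standard additive-scale indices; the multiplicative index is the joint OR divided by the product of the single-exposure ORs.

```python
# Point estimates of interaction from the three cross-over odds ratios quoted
# in the text for rs213210 (reference group: non-smokers with the TT genotype).
or_gene_only = 3.83    # CC genotype, no tobacco exposure
or_smoke_only = 2.45   # TT genotype, tobacco exposure
or_both = 4.33         # CC genotype, tobacco exposure


def additive_interaction(or_g, or_e, or_ge):
    """Return (RERI, AP, S), treating odds ratios as approximate relative risks."""
    reri = or_ge - or_g - or_e + 1.0            # relative excess risk due to interaction
    ap = reri / or_ge                           # attributable proportion
    s = (or_ge - 1.0) / ((or_g - 1.0) + (or_e - 1.0))  # synergy index
    return reri, ap, s


def multiplicative_interaction(or_g, or_e, or_ge):
    """Ratio of the joint OR to the product of the single-exposure ORs."""
    return or_ge / (or_g * or_e)


reri, ap, s = additive_interaction(or_gene_only, or_smoke_only, or_both)
ratio = multiplicative_interaction(or_gene_only, or_smoke_only, or_both)
print(f"RERI={reri:.2f}, AP={ap:.2f}, S={s:.2f}, multiplicative ratio={ratio:.2f}")
```

Absence of interaction corresponds to RERI=0, AP=0, S=1 and a multiplicative ratio of 1. With the quoted values the point estimates are about RERI=-0.95 and ratio=0.46; whether they differ from the null values requires confidence intervals, which cannot be derived from the odds ratios alone.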

Haplotype association analysis of SNPs on pri-miR-219-1
LD among rs213210, rs421446 and rs107822 was examined in the Ensembl variation resources [20]; each pair showed strong LD in pri-miR-219-1 (all D'=1 and r²>0.5). Haplotype analysis of the miR-219 SNPs was performed using the online software SHEsis (http://analysis.bio-x.cn). Figure 1 represents the odds ratios of the risk and protective haplotypes, symbolized by pie area. Each pie represents a haplotype, plotted by cases on the horizontal axis and controls on the vertical axis. Red pies denote risk haplotypes, blue pies denote protective haplotypes, and grey pies denote haplotypes whose distribution did not differ significantly between cases and controls. The biggest red pie shows that the risk for individuals carrying the C(rs213210)-C(rs421446)-G(rs107822) haplotype was significantly higher than for all other haplotypes (OR=4.997, 95%CI=3.524-7.086, P<0.001) (Table 4).
Table 5 presents the results of the multivariate Cox model investigating the association between genotype polymorphisms in pri-miR-219 and NSCLC prognosis, adjusted for smoking status, age and gender. No relationship was found between the rs213210, rs421446 or rs107822 polymorphisms and overall survival (OS) in this study.
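As a plausibility check on the reported odds ratios (the risk haplotype above and the two genotype comparisons in the abstract), a Wald-type 95% confidence interval is symmetric on the log scale, so the point estimate should equal the geometric mean of its bounds. The sketch below assumes the intervals are Wald intervals from logistic regression, which the text does not state explicitly; `wald_summary` is an illustrative helper.

```python
import math

# (label, OR, lower 95% bound, upper 95% bound) as reported in the text.
reported = [
    ("rs213210 CC vs TT", 3.462, 2.222, 5.394),
    ("rs107822 GG vs AA", 3.553, 2.329, 5.419),
    ("haplotype C-C-G", 4.997, 3.524, 7.086),
]

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def wald_summary(odds_ratio, lower, upper):
    """Recover SE(log OR), the OR implied by the CI, and the Wald z statistic."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    implied_or = math.sqrt(lower * upper)  # midpoint of the CI on the log scale
    z = math.log(odds_ratio) / se
    return se, implied_or, z


for label, or_, lo, hi in reported:
    se, implied, z = wald_summary(or_, lo, hi)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    print(f"{label}: SE={se:.3f}, implied OR={implied:.3f}, z={z:.2f}, p={p:.1e}")
```

For all three estimates the implied OR agrees with the reported OR to within rounding, and the implied p-values fall below 0.001, in line with the reported P<0.001.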

DISCUSSION
The occurrence and development of NSCLC is a complicated process that is affected by multiple established and uncertain factors. In recent years, researchers around the world have made great efforts to elucidate the functions of microRNAs in the initiation and progression of lung cancer. Recent studies showed that miRNA expression levels in cancer tissue are often deregulated compared with adjacent normal tissue [13,21]. An earlier study suggested that miR-219 was significantly downregulated in hepatocellular carcinoma and exerted tumor-suppressive effects in hepatic carcinogenesis by inhibiting the transcription and translation of glypican-3 [22].
SNPs in the non-coding regions of microRNA genes may affect trimming by Drosha or Dicer or reshape the set of miRNA target genes. A previous study on esophageal squamous cell carcinoma in Chinese Kazakhs demonstrated that the rs107822 A allele in the primary form of miR-219-1 may reduce the efficiency of maturation from pre-mir-219-1 to miR-219-1 compared with the G allele [18].
On the whole, the present research is the first to report on the association between polymorphisms in miR-219-1 (rs213210, rs421446 and rs107822) and the susceptibility to or prognosis of NSCLC. In our study, after Bonferroni correction, carriers of the CC genotype of rs213210, as well as carriers of the GG and GA genotypes of rs107822, still had a significantly higher risk of lung cancer than carriers of other genotypes. A previous study reported that rs107822 was associated with the risk of esophageal squamous cell carcinoma in Chinese Kazakhs [18]; moreover, this locus also influenced susceptibility to schizophrenia via the N-methyl-D-aspartate-type glutamate receptor signaling pathway [23,24]. Taken together, these results suggest that the rs107822 polymorphism may play a considerable role in cancer development.
Given that smoking is an established major environmental risk factor for lung cancer, gene-environment interactions may have an effect. An earlier study of the lungs of mice exposed to environmental cigarette smoke showed that the expression level of miR-219-1-5p was significantly lower than in normal mice [25]. The effects of the interaction between polymorphisms in miR-219-1 and tobacco exposure on lung cancer have seldom been investigated. Hence, in our study, we evaluated the interaction between smoking status and the three miRNA SNPs by cross-over analysis. We found that an interaction between tobacco exposure and the SNPs may exist, but it was not significant on either an additive or a multiplicative scale.
Haplotypes are more informative than single SNPs about changes in gene function [26,27]. Among the 5 common haplotypes in our study, the C(rs213210)-C(rs421446)-G(rs107822) haplotype carried a higher risk than any other haplotype.
However, the associations of miR-219-1 SNPs and haplotypes with susceptibility to lung cancer should be investigated in further studies with larger sample sizes.
In the Cox model, there were no significant associations between the polymorphisms in miR-219-1 (rs213210, rs421446 and rs107822) and the survival of NSCLC patients. The results for rs213210 were consistent with those of a previous study [14]. In stage III colorectal cancer, however, carriers of the rs213210 C allele had a significantly better outcome [28].
Although the mechanism of NSCLC is still not fully understood, research into how SNPs affect the susceptibility to and prognosis of NSCLC is clearly important. Considering the temporality of epigenetic change and the diverse distribution of miR-219-1 in the human body, the distribution of SNP loci could differ between tissues and blood. Although the statistical power of our study is 87%, a larger sample size is required in further studies to verify the relation of the rs213210, rs421446 and rs107822 polymorphisms to the mechanism of lung cancer.

Subject data collection
Our hospital-based case-control study population consisted of 405 NSCLC patients and 405 cancer-free controls. Cases were newly diagnosed with NSCLC by professional pathologists, without restriction on gender or histology. Control subjects, frequency-matched to cases by age (±5 years), were individuals with other diseases, such as gastritis, coronary disease and diabetes mellitus. All subjects were unrelated ethnic Han Chinese recruited at the Fourth Affiliated Hospital of China Medical University, with at least two years of follow-up. Our study was performed in full compliance with the requirements of the Institutional Review Board of China Medical University. Each subject signed an informed consent form. Consent from the subject's representative was accepted if the subject's own consent could not be obtained.
The data were collected under the following strict criteria: (i) 10 ml of peripheral blood was donated by each participant during the first hospitalization; (ii) no participant had a history of radiotherapy or chemotherapy; (iii) tobacco exposure (a non-smoker was defined as a person who had consumed fewer than 100 cigarettes in his or her lifetime; all others were classified as smokers) and demographic characteristics were recorded by well-trained interviewers using a pretested questionnaire immediately after blood collection; and (iv) each patient's pathology and medical records were reviewed twice a year for clinical stage, pathologic type, date of diagnosis and performance status.

DNA genotyping
Genomic DNA samples were extracted from proteinase K-digested leukocyte pellets by the conventional phenol-chloroform extraction and ethanol precipitation method [29]. SNP genotyping was done using the TaqMan allelic discrimination assay from Applied Biosystems (ABI, Foster City, CA). The SNP Assays, including primers and FAM/VIC probes, were also purchased from ABI (assay ID C_2215074_10 for rs213210, C_27015692_10 for rs421446 and C_2215075_20 for rs107822). Each reaction was run in a Fast 96-well plate in a total volume of 10 μL, comprising 2 μL purified genomic DNA (15-25 ng/μL), 5 μL TaqMan Genotyping master mix, 2.5 μL RNase-free water and 0.5 μL SNP Assay. The quantitative real-time PCR (qPCR) conditions were as follows: 95°C for 10 min, followed by 47 cycles of 30 s at 92°C and 1 min at 62°C. The results were read on an ABI 7500 FAST Real-Time PCR System with the Sequence Detection Software. A selected 5% of samples was genotyped in duplicate to validate the results, which were 100% consistent.
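The per-reaction volumes above can be scaled to a shared premix for a full plate. The sketch below is a bench-arithmetic illustration only: the 10% pipetting overage and the helper name `premix_volumes` are assumptions not taken from the protocol, and template DNA is assumed to be added to each well separately.

```python
import math

# Per-reaction volumes in microlitres, taken from the protocol text.
PER_REACTION_UL = {
    "TaqMan Genotyping master mix": 5.0,
    "RNase-free water": 2.5,
    "SNP Assay": 0.5,
}
DNA_UL = 2.0                        # template volume per well
DNA_CONC_NG_PER_UL = (15.0, 25.0)   # stated concentration range


def premix_volumes(n_reactions, overage=0.10):
    """Volumes for a shared premix; `overage` is an assumed pipetting reserve."""
    n_prepared = math.ceil(n_reactions * (1.0 + overage))
    volumes = {name: vol * n_prepared for name, vol in PER_REACTION_UL.items()}
    return n_prepared, volumes


total_per_well = DNA_UL + sum(PER_REACTION_UL.values())
dna_mass_ng = tuple(DNA_UL * conc for conc in DNA_CONC_NG_PER_UL)
n_prepared, volumes = premix_volumes(96)

print(f"Total volume per well: {total_per_well} uL")
print(f"DNA input per well: {dna_mass_ng[0]:.0f}-{dna_mass_ng[1]:.0f} ng")
print(f"Premix prepared for {n_prepared} reactions:")
for name, vol in volumes.items():
    print(f"  {name}: {vol:.1f} uL")
```

The components sum to the stated 10 μL per well, and 2 μL of template at 15-25 ng/μL corresponds to 30-50 ng of genomic DNA per reaction.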

Statistical analysis
The χ2 test and Student's t-test were used to examine differences between cases and controls in demographic variables, tobacco exposure and SNP genotypes. Hardy-Weinberg equilibrium (HWE) [30] of each SNP was tested by Pearson's goodness-of-fit test among controls. An unconditional logistic regression model and Cox's proportional hazards model were used to estimate the odds ratios (ORs) and hazard ratios (HRs), respectively, with their 95% confidence intervals (CIs). Cross-over analysis was applied to examine gene-environment interactions. All analyses were adjusted for age and gender. Linkage disequilibrium (LD) was measured with the Haploview algorithm [31], and haplotype analysis was performed using SHEsis software [32]. All tests were two-sided and were performed in Statistical Product and Service Solutions software (v. 16.0, SPSS Inc., Chicago, IL, USA) unless otherwise specified. A value of P<0.05 was considered statistically significant.
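The HWE check described above can be sketched as a Pearson goodness-of-fit test with one degree of freedom. This is a minimal illustration and not the SPSS procedure used in the study: the genotype counts are hypothetical, and `hwe_chi_square` is an illustrative name. For one degree of freedom the chi-square tail probability can be written with the complementary error function, so only the standard library is needed.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson goodness-of-fit test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic locus: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts for 405 controls (not data from the study).
chi2, p_value = hwe_chi_square(200, 165, 40)
print(f"chi2={chi2:.3f}, P={p_value:.3f}")
```

A P value above 0.05 in controls is conventionally read as no evidence of departure from HWE, which supports the quality of genotyping and the representativeness of the control group.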

CONCLUSION
A significant association may exist between polymorphisms in miR-219-1 (rs213210 and rs107822) and lung cancer risk in the Chinese population. No relationship was found between polymorphisms in miR-219-1 (rs213210, rs421446 and rs107822) and the outcome of lung cancer.

CONFLICTS OF INTEREST
We declare that we have no conflicts of interest.

GRANT SUPPORT
This study was supported by grants no. 81272293 and no. 81502878 from the National Natural Science Foundation of China.