Long non-coding RNA polymorphisms in 6p21.1 are associated with atrophic gastritis risk and gastric cancer prognosis

It has been suggested that genetic variation in human chromosome 6p21.1 has potential importance for the susceptibility to gastric cancer (GC). This study aimed to explore the relationship between long non-coding RNA (lncRNA) polymorphisms in 6p21.1 and the risk of GC as well as atrophic gastritis (AG). Genotyping for eight single nucleotide polymorphisms (SNPs) was conducted using the Sequenom MassARRAY platform in a total of 2507 northern Chinese subjects, including 749 GC cases, 878 AG cases and 880 controls. The results showed that rs61516247 was associated with an increased AG risk in the overall population (AA vs. GG: P = 0.046, OR = 1.46; A vs. G: P = 0.037, OR = 1.18). Four SNPs, rs61516247, rs1886753, rs7747696 and rs7749023, were associated with AG risk in specific subgroups. Among them, rs1886753 had an interaction effect with H.pylori infection on AG risk (P interaction = 0.038, OR = 1.62). In the prognosis analysis, two SNPs, rs80112640 (AG+GG vs. AA: P = 0.047, HR = 0.56; G vs. A: P = 0.039, HR = 0.57) and rs72855279 (P = 0.043, HR = 0.57), were associated with improved overall survival of GC patients. In conclusion, lncRNA SNPs in 6p21.1 are associated with AG risk and GC prognosis. Our study provides new research clues for screening lncRNA-based biomarkers in the cancer-related hotspot region 6p21.1 with the potential to predict the risk and prognosis of GC along with its precursor.


INTRODUCTION
Genetic variation is a common phenomenon in species evolution. As the most common form of genetic variation, the single nucleotide polymorphism (SNP) has been extensively investigated for its relationship with various diseases. SNPs can occur in different regions of chromosomes and may change the structure and function of the genes involved.
Human chromosome 6 has more than 166 million base pairs. In 2003, researchers at the Wellcome Trust Sanger Institute first reported, via sequencing analysis, that chromosome 6 contained 2190 genes, of which 1557 were functional genes and about 130 were related to human diseases, including hereditary hemochromatosis, Parkinson's disease, epilepsy, schizophrenia and heart disease [1]. In 2010, the Genetic Epidemiology of Lung Cancer Consortium (GELCC) found that a familial lung cancer susceptibility gene was located on chromosome 6 by comparing the alleles of all 392 known genetic variants as genetic markers in both cancer patients and their healthy family members [2]. In 2012, Guangfu Jin et al. conducted a large-scale case-control study using combined samples from genome-wide association study (GWAS) and replication stages, suggesting the potential importance of variants at 6p21.1 in the susceptibility to gastric cancer (GC) [3], which was the fourth most common cancer worldwide and the second leading cause of cancer-related death [4].

Research Paper
Oncotarget, www.impactjournals.com/oncotarget

Meanwhile, the association with GC of a polymorphism in the LRFN2 gene in that region was also revealed [3]. Subsequently, in 2014, our research group found that SNPs in pepsinogen C (PGC), which is also located in 6p21.1, play an important role in altering susceptibility to atrophic gastritis (AG) and GC [5]. However, the existing studies of this hotspot region have focused on protein-coding genes, and few have addressed non-coding RNAs (ncRNAs), which have well-known gene regulatory functions. Long non-coding RNAs (lncRNAs) are 200 nt to 100 kb long and constitute the largest proportion of ncRNAs [6]. Accumulating studies have suggested that lncRNAs are involved in the regulation of cell proliferation, invasion, metastasis and apoptosis in GC [7][8][9]. Currently, SNPs in six lncRNA genes have been reported to be associated with GC risk and prognosis, including H19, HOTAIR, TINCR, PRNCR1, NR_024015 and CASC8 [10][11][12][13][14]. However, it remains unclear whether the lncRNA SNPs located in 6p21.1, the cancer-related hotspot region, are related to GC as well as its precancerous diseases.
In the present study, we conducted an analysis for the lncRNA SNPs at 6p21.1 in a northern Chinese population, aiming to explore their relationship with GC and AG. Our study might provide clues for screening novel biomarkers with the potential to predict risk and prognosis of GC along with its precursor.

RESULTS
Baseline characteristics of the subjects
The study subjects consisted of 878 AG cases, 749 GC cases, and two groups of gender- and age-matched controls (878 for the AG cases and 744 for the GC cases). The H.pylori infection rate was significantly higher in both the AG and GC groups than in the corresponding control groups (P < 0.001). The proportion of individuals with a drinking history was significantly larger in the GC group than in its control group (P = 0.040). No significant difference in the distribution of gender, age or smoking history was observed between any case group and its control group (P > 0.05, Supplementary Table 1).

Association of the studied SNPs with AG and GC risk
A total of eight SNPs were included in the study based on our selection criteria. However, one of them, rs72854760, deviated from Hardy-Weinberg equilibrium (HWE) (P < 0.05) and was therefore excluded from subsequent calculation. Reference frequencies of these SNPs in healthy controls (Beijing Han, China, NCBI database) are shown in Table 1.
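The HWE check described above is commonly done with a one-degree-of-freedom chi-square test on genotype counts. The following is a minimal sketch under that assumption; the text does not state which HWE test was actually used, and the function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes of a biallelic SNP
    and returns (chi2, p_value). Assumes both alleles are present, so
    that no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, not taken from the study
chi2, p_value = hwe_chi_square(100, 200, 100)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A small P value (conventionally P < 0.05) indicates a departure from HWE, which is the usual reason for dropping a SNP from further analysis.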
First, the association between each SNP and gastric disease risk in the overall population was evaluated. Only rs61516247 showed a statistically significant association: both the homozygous variant genotype AA and the allelic model were associated with an increased AG risk compared with the wild-type homozygote (AA vs. GG: P = 0.046, OR = 1.46, 95% CI = 1.01-2.12; A vs. G: P = 0.037, OR = 1.18, 95% CI = 1.01-1.37, Table 1).
We next divided GC into intestinal-type and diffuse-type according to the Lauren classification and estimated the association of the SNPs with each type of GC. However, no SNP showed a significant association in any genetic model (P > 0.05, Supplementary Table 2).
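For readers less familiar with how an odds ratio and its 95% CI arise from genotype or allele counts, the sketch below computes an unadjusted OR with a Wald confidence interval from a 2x2 table. It is an illustration only: the estimates reported above may come from an adjusted regression model, which this excerpt does not specify, and the function name and counts are hypothetical.

```python
import math


def odds_ratio_wald(case_exposed, case_unexposed,
                    ctrl_exposed, ctrl_unexposed, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    "Exposed" means carrying the genotype or allele of interest
    (for example, allele A versus allele G in an allelic model).
    All four counts must be non-zero.
    """
    odds_ratio = (case_exposed * ctrl_unexposed) / (case_unexposed * ctrl_exposed)
    # Standard error of log(OR)
    se = math.sqrt(1 / case_exposed + 1 / case_unexposed
                   + 1 / ctrl_exposed + 1 / ctrl_unexposed)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical allele counts, not taken from the study
or_, lower, upper = odds_ratio_wald(20, 10, 10, 20)
print(f"OR = {or_:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

An interval whose lower bound exceeds 1 corresponds to a significantly increased risk at the 5% level, which is how the intervals reported for rs61516247 can be read.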

Stratified analysis for the studied SNPs
To evaluate the association between the selected SNPs and gastric disease risk in specific subgroups, we further performed stratified analyses based on host characteristics. Four SNPs were associated with AG risk: rs61516247, rs1886753, rs7747696 and rs7749023. For rs61516247, the homozygous variant genotype AA, the recessive model and the allelic model were each associated with a significantly elevated AG risk both in subjects aged ≤ 60 years (P = 0.027, P = 0.049, P = 0.028, respectively) and in non-smokers (P = 0.019, P = 0.028, P = 0.027, respectively). For rs1886753, all the genetic models other than the recessive model were associated with a decreased AG risk in the H.pylori-positive subjects (AG vs. AA: P = 0.029; GG vs. AA: P = 0.030; dominant model: P = 0.016; G vs. A: P = 0.027); in the drinker group, its dominant model was also associated with a reduced AG risk (P = 0.048). For rs7747696, both the heterozygote AG and the dominant model conferred an increased AG risk in the H.pylori-negative subjects (P = 0.043, P = 0.041, respectively); its G allele was associated with an elevated AG risk in drinkers (P = 0.031). For rs7749023, individuals carrying the variant C allele had a 1.55-fold increased AG risk compared with those carrying the wild allele in the drinker group (P = 0.029, Supplementary Table 3).

Haplotype analysis
Haplotype analyses were conducted to assess the association between haplotypes of these SNPs and gastric disease risk. First, all the selected SNPs were included and seven haplotypes were identified. One of them was associated with a decreased AG risk (P = 0.017, OR = 0.83, 95% CI = 0.72-0.97). However, among the 7 SNPs, only four demonstrated significant associations with AG risk in the preceding analyses. To investigate whether the significance of the haplotype was driven by these 4 SNPs, we next performed a haplotype analysis restricted to them, and one haplotype was likewise associated with a reduced AG risk (P = 0.016, OR = 0.84, 95% CI = 0.72-0.97, Supplementary

Cumulative and interaction effects
We evaluated the contribution to gastric disease risk when the selected SNPs were combined with each other. Based on the results presented in Supplementary Table 3, we defined four genetic models as risk genotypes that elevate AG risk: AA for rs61516247, AG+GG for rs1886753, AG+GG for rs7747696 and CC for rs7749023. All the subjects were divided into four groups according to the number of risk genotypes they carried, and individuals without any risk genotype were considered the control group (Figure 1). Other than the susceptibility to AG for individuals carried with four risk genotypes was remarkably increased when
The interactions between the SNPs and environmental factors were measured next. The wild genotype of rs1886753 was found to have a positive interaction effect with H.pylori infection on AG risk (P interaction = 0.038, Table 2 and Supplementary Table 5). No three-way interaction on AG risk was observed between the rs1886753 polymorphism and environmental factors (Supplementary Table 6).
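The cumulative analysis rests on counting, for each subject, how many of the four risk genotypes defined above are carried. A minimal sketch of that counting step follows; the risk-genotype definitions are taken from the text, whereas the data layout (a dictionary of genotype calls per subject), the handling of missing calls and the function name are assumptions made for illustration.

```python
# Risk genotypes as defined in the text (AG+GG is a dominant-model carrier)
RISK_GENOTYPES = {
    "rs61516247": {"AA"},
    "rs1886753": {"AG", "GG"},
    "rs7747696": {"AG", "GG"},
    "rs7749023": {"CC"},
}


def count_risk_genotypes(genotypes):
    """Count how many of the four risk genotypes a subject carries.

    `genotypes` maps SNP id -> two-letter genotype call, e.g. "AG".
    Allele order is ignored ("GA" is treated as "AG"). A missing call
    is simply not counted, which is an assumption of this sketch.
    """
    count = 0
    for snp, risk_set in RISK_GENOTYPES.items():
        call = genotypes.get(snp)
        if call is None:
            continue
        if "".join(sorted(call.upper())) in risk_set:
            count += 1
    return count


# Hypothetical subject, not taken from the study
subject = {"rs61516247": "AA", "rs1886753": "GA",
           "rs7747696": "AA", "rs7749023": "CC"}
print(count_risk_genotypes(subject))
```

Subjects can then be grouped by this count, with the zero-count group serving as the reference when odds ratios are estimated for the other groups.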

Association of the studied SNPs with GC prognosis
The association between the SNPs and five clinicopathological parameters was evaluated first. The rs61516247 and rs1886753 polymorphisms were found to be associated with several parameters (P < 0.05, Supplementary Table 7).
We next assessed the effects of host characteristics, including all the epidemiological and clinicopathological parameters, on overall survival (OS) of GC patients. OS was significantly affected by macroscopic type, TNM stage, lymphatic metastasis and depth of invasion (P = 0.043, P < 0.001, P < 0.001, P < 0.001, respectively, Table 3). Therefore, the subsequent multivariate analysis was adjusted for these factors.
Finally, the association between the SNPs and OS of GC patients was estimated in both univariate and multivariate analysis. In multivariate analysis, two SNPs, rs80112640 (AG+GG vs. AA: P = 0.047, HR = 0.56; G vs. A: P = 0.039, HR = 0.57) and rs72855279 (P = 0.043, HR = 0.57), were associated with improved OS, whereas no significant association was observed in univariate analysis (Table 4). The corresponding survival curves are presented in Figure 2.
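Survival curves of this kind are commonly drawn with the Kaplan-Meier product-limit estimator. This excerpt does not name the method behind Figure 2, so the sketch below is offered under that assumption; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate of the survival function.

    `times` are follow-up times; `events` are 1 for death and 0 for
    censoring. Returns a list of (time, survival_probability) pairs,
    one per distinct time at which at least one death occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every subject whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (months), not taken from the study
for t, s in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
    print(t, round(s, 3))
```

Comparing such curves between genotype groups gives the visual counterpart of the hazard ratios, while adjustment for clinicopathological factors requires a regression model rather than this estimator.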

DISCUSSION
This case-control study explored the relationship of seven lncRNA SNPs in 6p21.1 with the risk and prognosis of GC and AG in a total of 2507 subjects. We found that the rs61516247 polymorphism was associated with an increased AG risk in the overall population. In the stratified analyses, the rs61516247, rs1886753, rs7747696 and rs7749023 polymorphisms were associated with the susceptibility to AG. Higher AG risk was observed when all 4 SNPs were combined. Interestingly, the wild genotype of rs1886753 had a positive interaction effect with H.pylori infection, synergistically elevating AG risk. In addition, the rs80112640 and rs72855279 polymorphisms were associated with improved OS of GC patients in multivariate analysis. To our knowledge, this is the first study of the relationship of lncRNA SNPs in the cancer-related hotspot region 6p21.1 with GC risk and prognosis, and also the first report of lncRNA SNPs associated with the susceptibility to AG.
It has been widely accepted that GC can develop through inflammation, atrophy, intestinal metaplasia and dysplasia. AG is considered a precancerous condition of GC. Detecting high-risk AG individuals could benefit the intervention and prevention of GC. In our study, three lncRNA genes at 6p21.1 were suggested to be associated with AG risk, including lnc-LRFN2-1, lnc-LRFN2-2 and lnc-C6orf132-1. As an important class of molecular regulators in the human genome, lncRNAs could contribute to various diseases by silencing or activating specific genes at the epigenetic, transcriptional or posttranscriptional level [15]. Using the Database for Annotation, Visualization and Integrated Discovery (DAVID, https://david-d.ncifcrf.gov), we initially employed Gene Ontology (GO) analysis to obtain functional information on the three lncRNAs and their co-expressing genes from three aspects: cell component (CC), biological process (BP) and molecular function (MF). The results suggested that some of these lncRNAs might contribute to AG initiation. For lnc-LRFN2-1, several co-expressing genes were found through Multi Experiment Matrix (MEM) [16]. GO analysis suggested that they might target the plasma membrane and are concentrated in ion transport, channel activity and detection of external stimulus. Three co-expressing genes for lnc-LRFN2-2 were identified by our lncRNA expression profile, including CYP27B1, CACNA1I and GRIN2A, which were also shown to be associated with calcium ion transport in BP analysis. It has been reported that calcium ions could impair the gastric mucosa through several pathways, leading to AG development [17]. Additionally, functional SNPs in lncRNA genes have been well accepted to exert regulatory roles in cancer [18,19].
Therefore, it is reasonable to infer that dysfunction of lnc-LRFN2-1 and lnc-LRFN2-2 caused by their SNPs might change the ion channel activity in the membranes of gastric mucosal cells, making the epithelium more sensitive and vulnerable to environmental risk factors via the calcium signaling pathway. However, all of these assumptions about the molecular mechanism need to be verified by further investigation.
Among the SNPs associated with AG risk, the rs61516247 polymorphism was statistically significant in both the overall and the stratified analysis. The risk effects of its variant genotypes were more evident in younger subjects (age ≤ 60 years) and non-smokers. Two explanations are plausible. On the one hand, the defense of the gastric mucosa against external hazards weakens as individuals grow old [20]; on the other hand, tobacco intake has been regarded as an independent risk factor for gastric diseases [21]. As a result, the association between rs61516247 and AG risk may be obscured by aging and smoking. The rs1886753, rs7747696 and rs7749023 polymorphisms were related to AG risk only in subgroups defined by H.pylori infection or drinking history, suggesting that the association of these SNPs in the overall subjects might be masked by H.pylori infection and alcohol consumption. This phenomenon is also plausible. Accumulated exposure to alcohol plays a crucial role in the progression of diseases [22]. Besides, H.pylori is one of the best-known environmental pathogenic factors and impairs the gastric mucosa after colonizing the stomach [23]. Interestingly, the variant genotypes of rs1886753 had a protective effect on AG risk, while the wild genotype AA was relatively a risk genotype that could elevate AG risk synergistically with H.pylori infection. Several studies have focused on the interaction between lncRNAs and H.pylori. Differentially expressed lncRNAs may play a partial or key role in the immune response to H.pylori [24], and H.pylori infection might promote GC by deregulating lncRNA expression [25]. However, further investigations are needed to elucidate whether the lncRNAs in 6p21.1 interact with H.pylori and, if so, through which mechanisms.
Because complex factors are involved in the initiation of gastric diseases, the capacity of any single polymorphic locus to identify susceptibility is limited [26,27]. Combining multiple SNPs for detection could be more advantageous. Our results showed that the OR for AG risk in the subjects carrying 4 risk genotypes simultaneously was almost doubled compared with that in individuals carrying fewer risk genotypes, indicating a strong cumulative effect of the SNPs. This suggests that better discrimination of AG risk might be achieved when the rs61516247, rs1886753, rs7747696 and rs7749023 polymorphisms are all combined.
In the prognosis analysis, the rs80112640 and rs72855279 polymorphisms were both associated with improved OS of GC patients after adjustment for several clinicopathological parameters. No significance was observed in the univariate model, which was consistent with the results of the analysis of OS-related factors. Both SNPs are located in the exon of lnc-C6orf132-1, whose structural motifs might be affected, resulting in a protective role in GC. However, the other SNPs in lnc-C6orf132-1, rs7747696 and rs7749023, were both associated with an increased AG risk, which seems conflicting for polymorphisms in the same lncRNA gene. Considering the results of the functional analysis of lnc-C6orf132-1, we believe this phenomenon could be explained to some extent. A number of co-expressing genes for lnc-C6orf132-1 were revealed in our lncRNA expression profile and were shown to have bidirectional regulatory effects on DNA transcription. This indicates that lnc-C6orf132-1 may be able to both upregulate and downregulate the expression of relevant oncogenes or tumor suppressor genes when affected by different SNPs. As a result, the expression level of the same gene may vary across different stages during the progression of gastric diseases. Besides, the components associated with cancer outcome are quite complex, and diverse factors might interact with each other. Therefore, it is understandable that the SNPs in lnc-C6orf132-1 have contrary effects on AG risk and GC prognosis, although the specific mechanism still needs to be further investigated.
Several limitations of our study should be acknowledged. Firstly, missing data, including SNP genotypes and epidemiological data, might reduce the power of the statistical analysis to some extent. Secondly, the lncRNA SNPs in the 6p21.1 region were not completely covered and need to be supplemented in the future. Furthermore, our research focused only on association analysis without in-depth investigation of the mechanisms involved. Functional studies are needed in the future to investigate the specific pathways through which the polymorphisms take effect.
In summary, we performed a case-control study to explore the relationship of the lncRNA SNPs in the cancer-related hotspot region 6p21.1 with the risk and prognosis of AG and GC in a Chinese population. Four SNPs, rs61516247, rs1886753, rs7747696 and rs7749023, were suggested to be associated with the susceptibility to AG in the overall or stratified analysis. Two SNPs, rs80112640 and rs72855279, were found to be associated with OS of GC patients, and the variant genotypes of both indicated a better GC prognosis. These findings suggest that the lncRNA polymorphisms in 6p21.1 might have the potential to become predictive biomarkers for AG risk and GC prognosis. The study may provide clues for further research in this field and guidance for the early diagnosis and individualized therapy of gastric diseases. Interestingly, the lncRNA genes in which our studied SNPs are located are adjacent to PGC, a specific marker closely related to gastric diseases. Therefore, our study might also provide research clues for exploring the interaction effects between genetic variation of PGC and its neighboring lncRNA genes on the susceptibility to GC along with its precursor.

MATERIALS AND METHODS
Study participants
The study was approved by the Ethics Committee of the First Affiliated Hospital of China Medical University. Written informed consent was obtained from all participants. A total of 2507 subjects were included in our study, including 749 GC, 878 AG and 880 controls. All enrolled individuals were recruited from the Zhuanghe Gastric Diseases Screening Program or hospitals in Zhuanghe and Shenyang of Liaoning Province, China between 2002 and 2013, as previously reported [28]. The controls were