Genetic variation in CDH13 gene was associated with non-small cell lung cancer (NSCLC): A population-based case-control study

Cadherin 13 (CDH13, T-cadherin, H-cadherin) has been identified as an anti-oncogene in various cancers. Recent studies have reported that downregulation of H-cadherin in cancers is associated with CDH13 promoter hypermethylation, which could be affected by single nucleotide polymorphisms (SNPs) near CpG sites in the CDH13 promoter. In the current study, we used logistic regression analysis to investigate the association of seven SNPs (rs11646213, rs12596316, rs3865188, rs12444338, rs4783244, rs12051272 and rs7195409) with non-small cell lung cancer (NSCLC). SNPs rs11646213, rs12596316, rs3865188 and rs12444338 are located in the promoter region, rs4783244 and rs12051272 are located in intron 1, and rs7195409 is located in intron 7. A total of 454 patients with NSCLC (NSCLC group) and 444 healthy controls (control group) were recruited, and the SNPs were genotyped in all participants using the TaqMan assay. Our results showed that the allelic frequencies of rs11646213 were significantly different between the NSCLC and control groups (P = 0.006). In addition, when patients were stratified into NSCLC pathologic stages I+II and III+IV, the allelic frequencies of rs7195409 differed significantly between the two strata (P = 0.006). Our results indicate that rs11646213 and rs7195409 in CDH13 could be associated with NSCLC or its pathologic stages in the Chinese Han population.


INTRODUCTION
Lung cancer (LC) is the leading cause of cancer deaths in the world. In 2012, there were an estimated 1.8 million or more new cases (13% of total cancer incidence) and almost 1.6 million deaths (20% of total cancer mortality) [1]. In China, the incidence and mortality of LC were reported to be approximately 0.7 and 0.6 million cases, respectively, in 2013 [2]. Non-small cell lung cancer (NSCLC) accounts for approximately 80% of all LC cases [3]. Recently, several studies have shown that Cadherin 13 (CDH13, T-cadherin, H-cadherin) functions as an anti-oncogene and that its polymorphisms are associated with the development of different cancers [4][5][6][7][8].
Cadherin 13, a new member of the cadherin superfamily, is encoded by the CDH13 gene, which maps to chromosome 16q24.2 [9]. Cadherin proteins often contribute to the formation of intercellular junctions (e.g. N- and E-cadherin). Loss of cadherin expression has been described in many epithelial cancers and may play a role in tumor cell invasion and metastasis [10]. Recent studies have reported that Cadherin 13 functions as an anti-oncogene in lung [4], breast [5], ovarian [6], bladder [11], esophageal [12] and gastric [13] cancers. Because Cadherin 13 acts as an anti-oncogene, downregulation of its expression would promote cancer progression. In 2001, Toyooka et al. reported that Cadherin 13 expression is diminished in LC, and they demonstrated that the downregulation of Cadherin 13 might be due to hypermethylation in the CDH13 promoter [14]. In addition, Putku et al. described that single nucleotide polymorphisms (SNPs) in the CDH13 gene could affect the methylation of CpG sites in the CDH13 gene [15]. Moreover, studies have shown that SNPs in the CDH13 gene could affect disease progression by influencing serum adiponectin levels [7,8], and the serum adiponectin level has been found to be associated with LC [16]. Thus, SNPs in the CDH13 gene might be associated with LC through their correlation with CDH13 gene methylation and serum adiponectin level. Several studies have reported that SNPs in the CDH13 gene are associated with other diseases, such as colorectal cancer [17][18][19]. However, few studies have investigated the association between SNPs in the CDH13 gene and NSCLC.
In the current study, we analyzed the association of seven SNPs (rs11646213, rs12596316, rs3865188, rs12444338, rs4783244, rs12051272 and rs7195409) in the CDH13 gene with NSCLC and its pathologic stages in a Chinese Han population. SNPs rs11646213, rs12596316, rs3865188 and rs12444338 are located in the promoter, rs4783244 and rs12051272 are located in intron 1, and rs7195409 is located in intron 7.

RESULTS
Table 1 lists the clinical characteristics of the subjects in the present study. There were no significant differences in age or gender between the NSCLC and control groups (P > 0.05). In the NSCLC group, there were 283 patients with adenocarcinoma (AC), 163 patients with squamous cell carcinoma (SCC), and 8 patients with both adenocarcinoma and squamous cell carcinoma (AC + SCC). There were 73 patients in pathological stage I, 73 patients in stage II, 163 patients in stage III and 145 patients in stage IV.

Association of the seven SNPs in CDH13 with NSCLC
The allelic and genotypic frequencies for rs11646213, rs12596316, rs3865188, rs12444338, rs4783244, rs12051272 and rs7195409 in the NSCLC and control groups are listed in Table 2. All seven SNPs were in Hardy-Weinberg equilibrium (HWE) in both the NSCLC and control groups (P > 0.05). The logistic regression analysis showed that the allelic frequencies of rs11646213 were significantly different between the NSCLC group and the control group (P = 0.006), which suggested that the T allele of rs11646213 increased NSCLC risk after adjustment for gender and age (OR = 1.409; 95% CI: 1.105-1.798). However, the allelic and genotypic frequencies of the other SNPs were not significantly different between the NSCLC and control groups (P > 0.007).
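The two calculations behind this paragraph, the HWE check and the allelic odds ratio, can be illustrated with a minimal Python sketch. It is not the analysis used in the study: the odds ratio below is unadjusted and taken from a 2x2 allele-count table with a Wald confidence interval, whereas the reported estimate came from logistic regression adjusted for gender and age. The function names and all counts are hypothetical, since Table 2 is not reproduced here.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p                              # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Unadjusted allelic OR with a Wald 95% CI from a 2x2 allele-count table."""
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# HYPOTHETICAL counts, for illustration only (not the values in Table 2)
chi2, p_hwe = hwe_chi_square(n_aa=300, n_ab=130, n_bb=14)
or_t, ci_low, ci_high = allelic_odds_ratio(200, 708, 150, 738)
print(f"HWE: chi2 = {chi2:.3f}, P = {p_hwe:.3f}")
print(f"Allelic OR = {or_t:.3f} (95% CI: {ci_low:.3f}-{ci_high:.3f})")
```

An adjusted estimate, as reported in the text, would instead require fitting a logistic regression with age and gender as covariates.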

Model of inheritance analysis of the seven SNPs in CDH13 gene with NSCLC
Logistic regression analysis was used in the model of inheritance analysis to evaluate the association between the genotypes of the SNPs and NSCLC. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) were calculated to determine the best fit inheritance model, defined as the model with the smallest AIC and BIC values. The best inheritance model for rs11646213, with the lowest AIC and BIC, was the recessive model (P = 0.004, after adjustment for gender and age) (Table 3). In this model, the T/T genotype of rs11646213 conferred a higher risk of NSCLC (OR = 3.26; 95% CI: 1.41-7.56). No significant differences for the other SNPs were found between the NSCLC and control groups in the model of inheritance analysis (P > 0.007) (data not shown).
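The model selection step described above can be sketched as follows, using the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where k is the number of fitted parameters, n the number of subjects and L the maximized likelihood. The log-likelihood values and parameter counts below are hypothetical placeholders, not output from SNPstats; only the total sample size (454 + 444) is taken from the study.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

def rank_models(fits, n_obs):
    """Return (model, AIC, BIC) tuples sorted by AIC, then by BIC."""
    rows = [(name, aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in fits.items()]
    return sorted(rows, key=lambda row: (row[1], row[2]))

# HYPOTHETICAL (log-likelihood, parameter count) per inheritance model.
# Parameter counts assume intercept + age + gender + genotype term(s);
# the codominant model needs two genotype terms, the others one.
hypothetical_fits = {
    "codominant": (-615.2, 5),
    "dominant": (-618.9, 4),
    "recessive": (-615.6, 4),
    "overdominant": (-619.4, 4),
    "log-additive": (-617.1, 4),
}
n_subjects = 454 + 444  # NSCLC patients + controls, as stated in the study

ranking = rank_models(hypothetical_fits, n_subjects)
best_name = ranking[0][0]
for name, a, b in ranking:
    print(f"{name:13s} AIC = {a:8.1f}  BIC = {b:8.1f}")
```

Because BIC penalizes each extra parameter by ln(n) rather than 2, it favors the one-parameter genotype codings over the codominant model more strongly than AIC does at this sample size.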

Linkage disequilibrium (LD) and haplotype analysis of the seven SNPs in CDH13 gene
Strong LD (D' > 0.85 and R² > 0.7) was found among five of the seven SNPs (all except rs7195409 and rs11646213) in all individuals (Table 4). Based on the LD result, we constructed the haplotypes of the five SNPs (rs12596316, rs3865188, rs12444338, rs4783244 and rs12051272) and analyzed the difference in the frequencies of haplotypes (those with a frequency of more than 3%) between the NSCLC and control groups. Two main haplotypes were observed: the frequencies of rs12596316A-rs3865188A-rs12444338G-rs4783244G-rs12051272G and rs12596316G-rs3865188T-rs12444338T-rs4783244T-rs12051272T were 58.0% and 33.7%, respectively, in the NSCLC group and 59.8% and 30.6%, respectively, in the control group. None of the haplotype frequencies differed significantly between the NSCLC and control groups (P > 0.007) (Table 5).
Table 6 lists the comparisons of the genotypic and allelic distributions of the seven SNPs between NSCLC pathologic stages (I+II and III+IV). The allelic frequencies of rs12444338, rs4783244 and rs12051272 and the genotypic frequencies of rs7195409 showed only a trend toward a significant difference between NSCLC I+II and III+IV patients (P = 0.011, 0.018, 0.024 and 0.013, respectively). However, the allelic frequencies of rs7195409 showed a significant difference between NSCLC I+II and III+IV patients after Bonferroni correction (P = 0.006). We also conducted an inheritance model analysis to identify the best fit model for each of these four SNPs when comparing the NSCLC I+II and III+IV pathologic stages. The results showed that the best fit model for rs12444338, rs4783244 and rs12051272 was the log-additive model (P = 0.004, 0.006 and 0.005, respectively), and the best fit model for rs7195409 was the dominant model (P = 0.001) (Table 7).
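The pairwise LD measures used to select SNPs for haplotype construction can be computed as in the sketch below. It assumes that the two-locus haplotype frequency is already known; in the study, haplotype frequencies were estimated from unphased genotypes with the expectation-maximization algorithm in SHEsis, a step that is not reproduced here. The function names and example frequencies are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (|D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of alleles A and B (both loci assumed polymorphic)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2

def in_ld(p_ab, p_a, p_b, d_prime_min=0.85, r2_min=0.70):
    """Apply the thresholds stated in the text (D' > 0.85 and R^2 > 0.70)."""
    d_prime, r2 = ld_measures(p_ab, p_a, p_b)
    return d_prime > d_prime_min and r2 > r2_min

# HYPOTHETICAL frequencies, for illustration only (not taken from Table 4)
d_prime, r2 = ld_measures(p_ab=0.58, p_a=0.60, p_b=0.62)
print(f"D' = {d_prime:.3f}, r2 = {r2:.3f}, in LD: {in_ld(0.58, 0.60, 0.62)}")
```

Requiring both thresholds is stricter than D' alone, because D' can be high for a pair of SNPs with very different allele frequencies while r² stays low.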

DISCUSSION
Although sustained efforts have been made in the prevention and therapy of NSCLC, it remains the most common cancer in the world [20], and its annual incidence and mortality have been trending upward [21]. Recently, studies have shown that polymorphisms in the CDH13 gene could lead to aberrant methylation of the CDH13 promoter and influence plasma adiponectin levels, both of which have been demonstrated to be biomarkers of LC [22][23][24]. However, those studies did not evaluate the association of SNPs in the CDH13 gene with LC [22][23][24]. In the current study, we investigated the relationship between seven SNPs in the CDH13 gene and NSCLC.
Adiponectin is an adipose tissue-secreted protein that acts as an endogenous insulin sensitizer by binding to insulin receptors [25]. Data from recent studies showed that lower adiponectin levels are associated with an increased risk of endometrial cancer [26], renal cancer [27], colon cancer [28] and breast cancer [29]. In 2016, Wei et al. performed a meta-analysis on circulating adiponectin levels in various malignancies and found that decreased adiponectin levels are associated with the risk of various cancers, including LC [30]. Several studies revealed that SNPs (rs12444338, rs3865188, rs4783244, rs12051272, rs12596316, rs11646213, rs7195409) in the CDH13 gene determined plasma adiponectin levels in multi-ethnic populations [31][32][33][34]. In 2017, Nicolas et al. reported that rs11646213 was correlated with plasma adiponectin levels [34]. They observed that the A allele of rs11646213 was significantly associated with lower plasma adiponectin levels in Insulin Resistance Syndrome in French populations. However, this result contradicts the present study, in which the T allele of rs11646213 appeared to be the risk allele for NSCLC and would therefore be expected to be associated with lower plasma adiponectin levels. Moreover, Ling et al. found that rs7195409 was associated with adiponectin levels in Europeans (P = 2.0 × 10^-5) [35], whereas Wu et al. [31] did not find any association between rs7195409 and adiponectin levels in Filipino women, which is consistent with the current study (no association between this SNP and NSCLC). One reason for the discrepancies might be that the two study populations (European and Asian) have different genetic backgrounds.
The A allele of rs11646213 is the minor allele in the European population, with a frequency of approximately 42%, whereas the same allele is the major allele in the East Asian population, with a frequency of approximately 82% (http://asia.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=16:82608546-82609546;v=rs11646213;vdb=variation;vf=6808482). Another reason could be that different diseases have different molecular mechanisms. Complex pathogenic mechanisms could make the same SNP play different roles in different diseases. However, plasma adiponectin levels were not measured in the current study; thus, we could not evaluate the association between genetic data, adiponectin levels and NSCLC risk. Whether rs11646213 is associated with NSCLC through an influence on plasma adiponectin levels in the current study population requires clarification by functional studies.
In 1996, Lee et al. found that the CDH13 transcript was undetectable in all examined breast cancer cell lines and most other cancer cell lines, supporting its role as a tumor suppressor [36]. Subsequently, Zhong et al. reported that the loss of Cadherin 13 expression was associated with tumorigenicity in nude mice transplanted with NSCLC tumors [37]. [38,39]. Thus, Cadherin 13 was considered an important tumor suppressor in colorectal, lung, breast, ovarian and bladder cancers [40][41][42][43]. Aberrant methylation of tumor suppressor genes leads to tumorigenesis by silencing the suppressors [44]. The aberrant methylation of CpG islands in the CDH13 promoter might lead to aberrant gene expression and further promote tumor progression. This association has been reported in many cancers, such as hepatocellular carcinoma, cervical neoplasia and breast cancer [45][46][47]. In 2016, Jin et al. reported that aberrant methylation of the CDH13 promoter is associated with tumor progression in primary NSCLC [48]. Moreover, Shi et al. showed that genetic variability extensively influenced DNA methylation [49], and polymorphisms in the CpG sites of the CDH13 promoter were associated with aberrant methylation of the CDH13 promoter [15,50]. In the current study, our results showed that the frequency of rs11646213 in the CDH13 promoter was significantly different between the NSCLC and control groups at the allelic and genotypic level. Thus, we could deduce that rs11646213 might be associated with NSCLC by affecting the methylation status of the CDH13 promoter. However, the allelic frequencies of the other SNPs in the CDH13 promoter, namely rs12596316, rs3865188 and rs12444338, were not significantly different between the NSCLC and control groups. The discrepancy might be explained by the linkage disequilibrium (LD) structure of these SNPs: rs11646213 was not in LD with the other SNPs in the CDH13 promoter (Table 4).
On the NSCLC pathologic level, rs11646213 showed no association with NSCLC pathologic stages. Interestingly, rs12444338, rs4783244 and rs12051272, which lie in different regions of the CDH13 gene, showed a trend toward association with NSCLC pathologic stages (P = 0.011, 0.018 and 0.024, respectively), which might be because rs12444338 was in LD with rs4783244 and rs12051272 (Table 4). In addition, Morisaki et al. reported that rs4783244 and rs12051272, which are located in CDH13 intron 1, were in LD with rs12444338 located in the promoter because of a 30-kb haplotype block extending from the CDH13 promoter region to the first intron [7]. Thus, rs4783244, rs12051272 and rs12444338 exhibited a similar trend of association with NSCLC pathologic stages, even though these SNPs are located in different regions of the CDH13 gene. In the current study, we found that rs7195409, located in CDH13 intron 7, was associated with NSCLC pathologic stages (P = 0.006). The model of inheritance analysis showed that rs12444338, rs4783244, rs12051272 and rs7195409 were all associated with pathologic stages of NSCLC (P = 0.004, 0.006, 0.005 and 0.001, respectively). The best fit model of rs12444338, rs4783244 and rs12051272 was log-additive, but the best fit model of rs7195409 was dominant (Table 7). This difference could be because rs7195409 was not in LD with rs4783244, rs12051272 and rs12444338. The rs7195409 SNP is located in CDH13 intron 7, and its surrounding nucleotide sequence does not match any known transcription factor binding site or miRNA target sequence. Thus, it is likely that rs7195409 is an independent marker or candidate SNP for NSCLC development. Alternatively, other SNPs surrounding or in LD with rs7195409 might be the real candidate SNPs that influence and play important roles in the development of NSCLC.

MATERIALS AND METHODS
Ethics statement
The current study was conducted in accordance with the guidelines and principles of the Declaration of Helsinki and was approved by the Institutional Review Boards of the No.1 Affiliated Hospital of Kunming Medical University. All participants provided written informed consent. [51]. According to the pathomorphological reports, the NSCLC patients were divided into AC, SCC, and AC+SCC. Subjects with a history of oncotherapy or other cancers were excluded from the current study. In addition, individuals with hypertension, coronary heart disease or diabetes were also excluded. Clinical characteristics, such as gender, age, family history of cancer, and histological type of cancer, were collected. A total of 444 healthy individuals (296 males and 148 females) who had no family history of NSCLC were recruited from a population undergoing routine health checkups at the same hospitals. Because subjects with a family history of cancer were excluded from the control group, individuals with a family history of cancer were also removed from the NSCLC group. All participants self-reported as Han and lived within Yunnan Province, southwest China.

Statistical analysis
Microsoft Excel software and the SPSS 19.0 statistical package (SPSS, Chicago, IL, USA) were used to perform statistical analyses. Hardy-Weinberg equilibrium (HWE) was tested in both the NSCLC and control groups to assess representativeness. SNP pairs with D' greater than 0.85 and R² greater than 0.70 were considered to be in linkage disequilibrium (LD), and haplotypes were constructed from the genotyping results by the expectation-maximization algorithm in SHEsis software [52,53]. The effects of the polymorphisms on the risk of NSCLC were expressed as ORs with 95% CIs, which were calculated using logistic regression analysis with adjustment for age and gender. The association between each genotype and the risk of NSCLC was assessed using the inheritance model analysis of SNPstats software [54]. Five inheritance models were analyzed: codominant, dominant, recessive, overdominant and log-additive (https://www.snpstats.net/snpstats/tutorial.htm?q=snpstats/tutorial.htm). The models are described here for a SNP with major allele T and variant allele C. Codominant model: every genotype is allowed to confer a different, non-additive risk; the heterozygous T/C (He) and variant homozygous C/C (Va) genotypes are each compared with T/T, the homozygous genotype of the most frequent allele. Dominant model: a single copy of C is enough to modify the risk, so heterozygous and variant homozygous genotypes carry the same risk; the combination T/C+C/C (Do) is compared with T/T. Recessive model: two copies of C are necessary to change the risk, so the T/C and T/T genotypes have the same effect; the combination T/T+T/C (Re) is compared with the variant homozygous genotype C/C. Overdominant model: heterozygotes are compared with a pool of both homozygotes, that is, T/C (He) versus T/T+C/C. Log-additive model: each copy of C modifies the risk in an additive form, so homozygous C/C carries double the risk of heterozygous T/C; the weighted combination 2C/C+T/C (Ad), with weights 2 and 1, respectively, is compared with T/T.
The AIC and BIC were calculated to determine the best fit model for each SNP. The statistical power was calculated using PS Software [55]. Bonferroni correction was applied to the P values for multiple comparisons in the current study, and the threshold for statistical significance was set at P < 0.007 (0.05/7).
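The genotype groupings defined by the five inheritance models, and the Bonferroni threshold, can be written out as a short sketch. The coding below is one common way to turn a genotype into the predictor used in a logistic regression under each model; it is an illustration of the definitions above, not the internal implementation of SNPstats. The function names are hypothetical, and T and C are the example alleles used in the text.

```python
def variant_copies(genotype, variant="C"):
    """Count copies of the variant allele in a genotype string such as 'T/C'."""
    return genotype.split("/").count(variant)

def encode(genotype, model, variant="C"):
    """Code a genotype as the predictor(s) implied by an inheritance model."""
    g = variant_copies(genotype, variant)
    if model == "codominant":
        # Two indicators: heterozygous (He) and variant homozygous (Va)
        return (int(g == 1), int(g == 2))
    if model == "dominant":
        return int(g >= 1)   # T/C + C/C versus T/T
    if model == "recessive":
        return int(g == 2)   # C/C versus T/T + T/C
    if model == "overdominant":
        return int(g == 1)   # T/C versus T/T + C/C
    if model == "log-additive":
        return g             # 0, 1 or 2 copies of C
    raise ValueError(f"unknown inheritance model: {model}")

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

for genotype in ("T/T", "T/C", "C/C"):
    codes = {m: encode(genotype, m) for m in
             ("codominant", "dominant", "recessive", "overdominant", "log-additive")}
    print(genotype, codes)

# Seven SNPs were tested, so the threshold is 0.05 / 7
print(f"Bonferroni threshold: {bonferroni_threshold(0.05, 7):.4f}")
```

The exact value of 0.05/7 is about 0.00714, which the text reports truncated to 0.007.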

CONCLUSIONS
Several studies have demonstrated the association of SNPs in the CDH13 gene with the methylation of the CDH13 gene and circulating adiponectin levels, which might be used as diagnostic and prognostic biomarkers for NSCLC [22]. In the current study, we found that the T allele of rs11646213 in CDH13 might be a risk factor for NSCLC, and that the SNP rs7195409 was associated with NSCLC pathologic stages. However, some limitations affected the identification of associations between SNPs and NSCLC in the current study. One limitation was the relatively modest sample size, with a statistical power of only 75.9%. Another limitation was the lack of smoking status data for the control individuals, which made it difficult to analyze such exposure variables further or to perform a gene-smoking interaction analysis. To clarify the association of CDH13 variations with NSCLC susceptibility, larger samples and systematic studies that focus on the association of the SNPs in the CDH13 gene with methylation status, serum adiponectin levels and NSCLC susceptibility are needed in the future.