Common polymorphisms in CD44 gene and susceptibility to cancer: a systematic review and meta-analysis of 45 studies

CD44 is a widely recognized stem cell marker that plays a critical role in many cancer-related cellular processes. Relationships between CD44 polymorphisms and cancer risk have been widely investigated, but the results of these studies were inconclusive and controversial. We conducted the present meta-analysis to explore the association between CD44 polymorphisms and cancer risk. The Embase, Web of Science, PubMed and Cochrane Library databases were searched to identify all eligible publications, and pooled odds ratios (ORs) with the corresponding 95% confidence intervals (CIs) were calculated to evaluate the associations. A total of 12 publications comprising 25,777 cases and 27,485 controls fulfilled the inclusion criteria. The pooled analyses uncovered no significant association between CD44 polymorphisms (rs10836347, rs11821102, rs13347, rs1425802, rs353639, rs713330 and rs187115) and overall cancer risk. In the subgroup analysis of the rs13347 polymorphism by source of control, we identified a significantly increased cancer risk for the population-based (P-B) group under the recessive model (TT vs. TC+CC: OR = 2.030, 95%CI: 1.163-3.545, PAdjust < 0.001). In conclusion, our meta-analysis suggests that CD44 polymorphisms may not represent risk factors for cancer. Future well-designed large-scale case-control studies are warranted to verify our findings.


INTRODUCTION
Malignant tumors pose serious threats to human health and are currently among the top causes of death [1]. In this era of precision medicine, the identification of ideal biomarkers for diagnosis to optimize the prevention and treatment of malignant tumors has become a hotspot in both research and clinical practice.
CD44 was originally identified as a receptor for hyaluronan and as a lymphocyte-homing receptor [2]. Recently, this multi-structural and multi-functional transmembrane glycoprotein has been shown to play a pivotal part in evaluating prognosis for a variety of cancer types, such as bile duct cancer [3], colorectal cancer (CRC) [4] and breast cancer (BC) [5]. CD44 is expressed as different isoforms derived from alternative splicing of variant exons [6]. The common CD44 isoforms that have been related to cancer metastasis are surface adhesion molecules. In the 1990s, CD44v6 was widely accepted to be the major variant isoform in rat carcinoma cells that participated in the regulation of tumor metastasis [7]. In addition, CD44v6 is expressed in both premature and mature lung tissues and is associated with epithelial stem cells [8].
Several recent studies have demonstrated that many polymorphisms in CD44 are correlated with the risk of many cancers, including BC [9], gastric cancer (GC) [10] and CRC [11]. Jiang et al. [9] found that the rs13347 CT + TT genotypes increased individuals' susceptibility to BC relative to the most common CC genotype, particularly for estrogen receptor (ER) negative patients. Consistently, Wu et al. [11] verified these results in CRC. In addition, functional assays demonstrated that the C to T base change of the rs13347 polymorphism disrupted the binding site for mir-509-3p, thereby increasing both the transcriptional activity and the expression level of CD44. Later on, Tulsyan et al. [12] reported that the CD44 rs353639 polymorphism potentially has a significant effect on the prognosis of BC patients. However, in their study neither the rs13347 nor the rs353639 polymorphism influenced BC risk. Noting these controversial and inconclusive results, we conducted the current meta-analysis to determine a more exact relationship between CD44 polymorphisms and the risk of cancer.

Characteristics of the eligible studies
A total of 12 publications that met the inclusion criteria were enrolled in the quantitative synthesis (Figure 1 and Table 1) [9-20]. For the rs10836347 polymorphism, we identified six studies encompassing 4,124 cases and 4,672 controls. All of these studies were conducted in Asian populations. For the rs11821102 polymorphism, we enrolled seven qualified studies consisting of 4,399 cases and 4,947 controls. For the rs13347 polymorphism, ten publications met the inclusion criteria, comprising 6,438 cases and [...] Table 1 summarizes the demographic characteristics of the studies enrolled in the present meta-analysis. As shown in Table 1, the genotyping methods applied in these studies included MassArray, reverse transcription-polymerase chain reaction (RT-PCR), Amplification Refractory Mutation System-Polymerase Chain Reaction (ARMS-PCR) and TaqMan. In addition, there were five case-control studies whose genotype distributions in the control groups did not conform to Hardy-Weinberg equilibrium (HWE) (Table 1) [11, 14-17]. For these studies, subgroup analyses by HWE status and sensitivity analyses were conducted to evaluate their potential effects on the overall pooled results.
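As an illustration of the HWE check mentioned above, the following minimal Python sketch tests control genotype counts against Hardy-Weinberg expectations with a one-degree-of-freedom chi-square goodness-of-fit test. The function name `hwe_chi_square` and the example counts are hypothetical and are not taken from the enrolled studies; the original publications may have used a different procedure (for example, an exact test).

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Return (chi-square statistic, p-value) for a 1-df HWE goodness-of-fit test.

    Assumes both alleles are observed, so that no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control group: 60 AA, 30 AB and 10 BB genotypes
chi2, p_value = hwe_chi_square(60, 30, 10)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A control group would typically be flagged as deviating from HWE when the p-value falls below the chosen threshold (0.05 is a common convention, assumed here rather than stated by the source for this test).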

Pooled analysis
The associations between CD44 genetic polymorphisms and the risk of cancer are shown in Table 2. No statistically significant association was found between CD44 polymorphisms (rs10836347, rs11821102, rs13347, rs1425802, rs353639, rs713330 and rs187115) and overall cancer risk under any of the five genetic models (Table 2).

Subgroup analysis
Results of the subgroup analyses are also shown in Table 2. We performed stratified analyses according to source of control, ethnicity, genotyping method and HWE status. No significant association between the rs13347 polymorphism and cancer risk was identified for the Asian or Caucasian subgroups (Table 2). When the stratified analysis was based on source of control, we found that the population-based (P-B) group, rather than the hospital-based (H-B) group, was the source of heterogeneity under the recessive model (TT vs. TC+CC: OR = 2.397, 95%CI: 0.732-3.317, PAdjust < 0.001). Subsequently, we conducted a subgroup analysis by genotyping method. In the MassArray group, a statistically significant association was observed under all five genetic models (T vs. C: OR = 1.766, 95%CI: 1.454-2.144, PAdjust < 0.001; TC vs. CC: OR = 1.857, 95%CI: 1.528-2.257, PAdjust < 0.001; TC+TT vs. CC: OR = 2.003, 95%CI: 1.603-2.502, PAdjust < 0.001; TT vs. CC: OR = 2.836, 95%CI: 1.981-4.059, PAdjust < 0.001; TT vs. TC+CC: OR = 2.062, 95%CI: 1.719-2.474, PAdjust < 0.001). In contrast, no significant association between the rs13347 polymorphism and cancer risk was identified for either the RT-PCR or the TaqMan group (Table 2). Finally, when stratified by cancer type, we found no association between the rs13347 polymorphism and BC risk (Table 2).
For the rs187115 polymorphism, a significantly increased cancer risk was observed in the RT-PCR group upon stratifying by genotyping method, indicating that the RT-PCR group may account for the heterogeneity (Table 2). However, no association was found in the TaqMan group. Subgroup analysis by ethnicity showed that the rs187115 polymorphism was not related to cancer risk in either Asian or Caucasian populations (Table 2).
For the remaining CD44 polymorphisms, analyses stratified by genotyping method, source of control, ethnicity, cancer type and HWE status identified no significant association in the pooled results (Table 2).

Sensitivity analysis and publication bias
Sensitivity analysis was conducted to evaluate the stability of the pooled ORs: each individual case-control study was removed in turn from the pooled analysis to detect its influence on the pooled ORs. Removal of any single case-control study did not influence the stability of the results. We also generated Begg's funnel plots and conducted Egger's test to assess publication bias. The funnel plots appeared symmetrical, indicating no evidence of publication bias. These findings were further supported by Egger's test for the seven CD44 polymorphisms (rs1425802, rs10836347, rs11821102, rs13347, rs187115, rs353639 and rs713330) (Table S1).
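The leave-one-out procedure described above can be sketched in Python as follows. This is a minimal illustration that assumes fixed-effect inverse-variance pooling of log ORs; the function names and the input values are hypothetical, and the STATA routines used in this meta-analysis may differ in detail (for example, by using a random-effects model when heterogeneity is present).

```python
import math
from typing import List


def pool_fixed(log_ors: List[float], ses: List[float]) -> float:
    """Pooled OR from inverse-variance weighted log ORs (fixed-effect model)."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled_log_or = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    return math.exp(pooled_log_or)


def leave_one_out(log_ors: List[float], ses: List[float]) -> List[float]:
    """Pooled OR after omitting each study in turn (element i omits study i)."""
    results = []
    for i in range(len(log_ors)):
        kept_log_ors = log_ors[:i] + log_ors[i + 1:]
        kept_ses = ses[:i] + ses[i + 1:]
        results.append(pool_fixed(kept_log_ors, kept_ses))
    return results


# Hypothetical study-level ORs and standard errors of the log ORs
study_log_ors = [math.log(1.2), math.log(0.9), math.log(1.5), math.log(1.1)]
study_ses = [0.20, 0.25, 0.30, 0.15]
for index, pooled in enumerate(leave_one_out(study_log_ors, study_ses)):
    print(f"omitting study {index + 1}: pooled OR = {pooled:.3f}")
```

If the pooled OR changes little regardless of which study is omitted, the overall estimate can be regarded as stable with respect to any single study.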
Additionally, the PRISMA 2009 Checklist for this meta-analysis is presented in Supplementary Table 2, and the quality of the enrolled studies, evaluated with the Newcastle-Ottawa Scale (NOS), is shown in Table 3.

Linkage disequilibrium (LD) analysis across populations
To better understand these results, LD analysis was performed to test for the existence of bins. However, only six polymorphisms could be matched in the database: rs10836347, rs11821102, rs13347, rs187115, rs353639 and rs713330. LD plots for the CEU population showed a moderate LD value (r² ≥ 0.5) between the rs187115 and rs353639 polymorphisms. Additionally, LD plots for the YRI population showed a moderate LD value (r² > 0.6) between the rs11821102 and rs13347 polymorphisms (Supplementary Figure S1).

DISCUSSION
Currently, a better understanding of the association between genetic polymorphisms and malignancy risk can support personalized analyses and improved methods for cancer diagnosis. Among the genes whose polymorphisms have been widely investigated as cancer risk factors, CD44 has become a common target.
CD44 is involved in many cellular processes, such as angiogenesis, proliferation, and metastasis [21]. The CD44 gene is composed of 20 exons grouped into two areas [22]. Group 1 comprises the co-expressed exons 1-5 and 16-20, while group 2 comprises exons 6-15. The ten exons in group 2 are alternatively spliced between exons 5 and 16. The multi-functional characteristics of CD44 contribute to the binding of its ligand, hyaluronan [23]. Two binding domains are available for hyaluronan, encoded by exons 2 and 5 [24]. Interaction of hyaluronan with CD44 facilitates the regulation of BC via cell-to-cell adhesion and suppressed invasion [25]. Alterations in the binding of hyaluronan to CD44 lead to the activation of invasion and [...] [26,27], sarcoma and GC [28,29]. Based on these findings, we predicted that CD44 would have a significant impact on the pathogenesis and prognosis of many cancer types. A previous study by Tulsyan et al. [12] examined whether genetic variants (rs13347 and rs353639) of CD44 influence the risk of BC in 258 cases and 131 healthy controls. However, no significant differences were found. Their results were not consistent with the work of Jiang et al. [9], who evaluated the rs13347 polymorphism in a Chinese population consisting of 1,853 BC patients and 1,992 healthy controls and found that the variant genotypes (CT+TT) conferred a 1.72-fold increased risk of BC. They also carried out a reporter assay to verify these findings and showed that CT+TT genotype carriers have higher expression of CD44 than wild-type CC carriers. The differences between these findings may be attributable to differences in ethnicity or to the presence of another linked CD44 polymorphism that confers risk in the Chinese population. Another study, conducted by Xiao et al. [18], reported that the CD44 rs13347 C > T polymorphism is a susceptibility factor for nasopharyngeal carcinoma (NPC). Subsequently, Sharma et al. [16] re-examined the relationship between four CD44 polymorphisms (rs13347, rs353639, rs187116 and rs187115) and gall bladder cancer (GBC) risk and found no significant difference in the frequency distribution of the selected polymorphisms between GBC cases and controls at either the allelic or genotypic level in a North Indian population.
The conclusions of the enrolled studies were controversial, and individual studies may lack sufficient statistical power to precisely identify the effects of CD44 polymorphisms on overall cancer risk. Thus, we performed a quantitative meta-analysis to increase statistical power and provide multiple lines of evidence on the relationship between CD44 polymorphisms and cancer risk. A total of 45 case-control studies were enrolled for the seven polymorphisms (rs10836347, rs11821102, rs13347, rs1425802, rs187115, rs353639 and rs713330). Overall, the variant B allele of the CD44 polymorphisms was not associated with an increased risk of cancer. Nevertheless, it is worth noting that our data were not consistent with previously published studies, including a meta-analysis. Weng et al. [20] found that carriers of at least one G allele of the CD44 rs187115 polymorphism were at an increased risk of developing transitional cell carcinoma (TCC). Around the same time, Chou et al. [14] found that the CD44 rs187115 polymorphism may serve as a biomarker for predicting the prognosis of late-stage hepatocellular carcinoma (HCC). Furthermore, Xiao et al. [18] revealed a positive relationship between the CD44 rs13347 (C > T) polymorphism and NPC development. When our data were stratified by genotyping method, the CD44 rs13347 polymorphism was associated with an increased risk of cancer in the MassArray group under all five genetic models. Additionally, in the RT-PCR subgroup, we observed a significantly increased association between the rs187115 polymorphism and cancer risk under all the genetic models. Moreover, subgroup analysis by source of control suggested a significant association between the rs13347 polymorphism and cancer risk under the recessive model in the P-B group.
This phenomenon may be due to inconsistencies among the control groups. Although most of the controls were chosen from healthy populations, many individuals may have suffered from other non-cancer diseases. These differences in control characteristics could bias our findings. In addition, we observed significant between-study heterogeneity in our analysis. Meta-regression analysis revealed that the genotyping method introduced substantial heterogeneity. Deviations from HWE may reflect methodological problems, such as genotyping errors, population stratification or selection bias. Although we did not exclude the studies that deviated from HWE, we conducted a subgroup analysis by HWE status, which indicated that HWE status did not bias the results. In addition, the stability of these results was further supported by sensitivity analysis.
The current meta-analysis has several advantages. Firstly, we conducted a comprehensive search to identify as many eligible studies as possible, which makes our analysis more persuasive and substantial. Secondly, the quality of all enrolled studies was assessed with the Newcastle-Ottawa Scale (NOS) so that low-quality studies could be excluded to raise the overall quality. Thirdly, subgroup analyses were performed according to cancer type, HWE status and other factors to further explore the sources of heterogeneity. Fourthly, results were adjusted according to a recognized correction formula to improve their accuracy. In addition, the stability of the results was further confirmed by sensitivity analysis, and publication bias was tested by Egger's test and Begg's funnel plot. Finally, we carefully searched the databases and identified one recently published meta-analysis, conducted by Shi and colleagues [30]. They focused on the association between the CD44 rs13347 polymorphism and cancer risk, and their analysis was restricted to Asians [30]. In contrast, we analyzed seven CD44 polymorphisms, and our study populations comprised both Asians and Caucasians. The considerably larger sample size of the current work provides greater power to identify findings that might otherwise remain concealed. Shi et al. [30] suggested that the CD44 rs13347 (C>T) polymorphism was related to an increased risk of human cancer in Asian people, especially in Chinese populations. In contrast to their work, we observed that the variant B allele of the CD44 polymorphisms was not associated with cancer risk after adjustment. However, several drawbacks of our study should also be noted. Firstly, a relatively small number of studies was enrolled for each polymorphism, with a particularly small number analyzing the rs353639 polymorphism (only four case-control studies).
This limitation may have resulted in insufficient power to identify minor associations between CD44 polymorphisms and cancer risk. Secondly, in the ethnicity subgroup analysis, the enrolled studies were restricted to Asian and Caucasian populations, and data for other ethnicities were not analyzed; further studies are warranted to evaluate the effects of CD44 polymorphisms on cancer risk in different ethnicities. Thirdly, the phenotype in our study was a heterogeneous aggregation of a variety of cancer types. A subgroup analysis by cancer type was conducted for only some of the CD44 polymorphisms; for the others, owing to the limited number of studies on specific cancers, such as BC and CC, we were unable to determine whether the effects on these cancers were homogeneous, which should be investigated in the future. Additionally, several potentially confounding factors were not considered in this study, such as age, sex, smoking and drinking status, hepatitis B virus (HBV)/hepatitis C virus (HCV) carrier status, and environmental factors.

CONCLUSIONS
Our meta-analysis suggests that CD44 polymorphisms might not represent risk factors for cancer. However, our findings require further validation in additional well-designed studies with larger sample sizes.

Search Strategy
We carried out a comprehensive literature search of Embase, Cochrane Library and PubMed (up to April 2, 2016) to find all relevant publications exploring the relationship between CD44 polymorphisms and the risk of cancer. The search terms were as follows: "CD44" AND "SNP OR polymorphism OR mutation OR allele OR variation" AND "cancer OR adenocarcinoma OR carcinoma OR tumor OR neoplasm OR Leukemia OR lymphoma." The language was restricted to English. The retrieved publications were reviewed by two reviewers to identify studies specific to various cancers. We then manually searched the reference lists of the enrolled original publications and reviews to identify additional eligible case-control studies.

Inclusion and Exclusion Criteria
Enrolled studies had to meet the following criteria: 1) they assessed the association between CD44 polymorphisms and cancer risk; 2) they were case-control/cohort studies; and 3) they provided sufficient data (allele and genotype frequencies). In addition, studies were excluded when they: 1) were case-only studies, reviews, comments or case reports; or 2) did not contain sufficient data.

Data Extraction
Two reviewers (Meng Zhang and Yangyang Wang) performed the data extraction based on the enrollment criteria described above. All discrepancies were discussed until consensus was reached. The following characteristics were extracted from each publication: name of first author, publication year, ethnicity of the subjects in the case-control study, source of control, genotype frequency, etc.

Statistical analysis
ORs with the corresponding 95% CIs were calculated to evaluate the strength of the relationship between CD44 polymorphisms and cancer risk in five genetic models: allele contrast (B vs. A), dominant (BB + BA vs. AA), recessive (BB vs. BA + AA), homozygous (BB vs. AA), and heterozygous (BA vs. AA) models (A: wild-type allele; B: variant allele). Subsequently, stratified analyses were performed by cancer type, ethnicity, source of control and genotyping method. We assessed statistical heterogeneity with the I² statistic, which quantifies the proportion of between-study variability attributable to heterogeneity rather than random chance. An I² value of more than 50% was regarded as indicating significant heterogeneity among the studies. In that case, ORs were pooled using a random-effects model; otherwise, a fixed-effect model was employed. Moreover, sensitivity analysis was carried out to verify the stability of our results, and Egger's regression test and Begg's funnel plot were used to evaluate publication bias. STATA 12.0 software (STATA Corp, College Station, TX) was used for all statistical analyses. In addition, Bonferroni corrections were performed to adjust the results [31]. P < 0.05 was regarded as statistically significant. Because this study is a systematic review of the literature, ethical approval was not required.
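To make the five genetic models and the heterogeneity criterion described above concrete, the following Python sketch computes the OR and a 95% CI for each model from genotype counts, and the I² statistic from study-level log ORs. It is a minimal illustration only: the function names and all counts are hypothetical, the CI uses the Woolf (logit) method as an assumption, and the STATA routines used in this study may differ in detail.

```python
import math
from typing import Dict, List, Tuple


def odds_ratio(a: int, b: int, c: int, d: int) -> Tuple[float, float, float]:
    """OR with 95% CI (Woolf logit method) for a 2x2 table.

    a, b = cases with / without the exposure; c, d = controls with / without.
    Assumes all four cells are non-zero.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se))


def genetic_model_ors(cases: Tuple[int, int, int],
                      controls: Tuple[int, int, int]) -> Dict[str, Tuple[float, float, float]]:
    """cases and controls are (AA, BA, BB) counts; A = wild type, B = variant."""
    case_aa, case_ba, case_bb = cases
    ctrl_aa, ctrl_ba, ctrl_bb = controls
    tables = {
        "allele (B vs. A)": (2 * case_bb + case_ba, 2 * case_aa + case_ba,
                             2 * ctrl_bb + ctrl_ba, 2 * ctrl_aa + ctrl_ba),
        "dominant (BB+BA vs. AA)": (case_bb + case_ba, case_aa,
                                    ctrl_bb + ctrl_ba, ctrl_aa),
        "recessive (BB vs. BA+AA)": (case_bb, case_ba + case_aa,
                                     ctrl_bb, ctrl_ba + ctrl_aa),
        "homozygous (BB vs. AA)": (case_bb, case_aa, ctrl_bb, ctrl_aa),
        "heterozygous (BA vs. AA)": (case_ba, case_aa, ctrl_ba, ctrl_aa),
    }
    return {name: odds_ratio(*table) for name, table in tables.items()}


def i_squared(log_ors: List[float], ses: List[float]) -> float:
    """I² (in percent) derived from Cochran's Q with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    return max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0


# Hypothetical genotype counts (AA, BA, BB) for one case-control study
for model, (or_value, lower, upper) in genetic_model_ors((50, 30, 20), (60, 30, 10)).items():
    print(f"{model}: OR = {or_value:.3f}, 95% CI: {lower:.3f}-{upper:.3f}")
```

Under the rule stated above, an `i_squared` result above 50 would lead to a random-effects model, and a fixed-effect model would be used otherwise.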

LD analysis across populations
Data were extracted from the International HapMap Project (http://hapmap.ncbi.nlm.nih.gov/cgi-perl/gbrowse/hapmap24_B36/#search), which covers the CD44 polymorphisms evaluated in the current study. Briefly, the populations incorporated in the project included YRI (Yoruba in Ibadan, Nigeria), CHB (Han Chinese in Beijing, China), JPT (Japanese in Tokyo, Japan) and CEU (Utah residents with northern and western European ancestry). Haploview software was then employed to conduct the analysis, and LD was evaluated by the r² statistic in each of the populations [32].
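For reference, the r² statistic used above can be written for two biallelic loci as the squared LD coefficient divided by the product of the four allele frequencies. The Python sketch below is a minimal illustration that assumes the haplotype frequency is already known; Haploview itself works from genotype data, and the function name and frequencies here are hypothetical.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """r² between two biallelic loci.

    p_ab = frequency of the haplotype carrying allele A at locus 1 and B at locus 2,
    p_a  = frequency of allele A at locus 1,
    p_b  = frequency of allele B at locus 2.
    Assumes both loci are polymorphic (0 < p_a < 1 and 0 < p_b < 1).
    """
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies for a pair of polymorphisms
print(f"r2 = {ld_r_squared(0.20, 0.30, 0.40):.3f}")
```

Values near 0 indicate that the two loci are close to independent, whereas a value of 1 means that each allele at one locus perfectly predicts the allele at the other.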