Quantitative assessment of CD44 genetic variants and cancer susceptibility in Asians: a meta-analysis

CD44 is a well-established cancer stem cell marker playing a crucial role in tumor metastasis, recurrence and chemo-resistance. Genetic variants of CD44 have been shown to be associated with susceptibility to various cancers; however, the results are conflicting. Hence, we performed a meta-analysis to clarify these associations more accurately. Overall, rs13347 (T vs. C: OR=1.30, p=<0.004, pcorr=0.032; CT vs. CC: OR=1.29, p=0.015, pcorr=0.047; TT vs. CC: OR=1.77, p=<0.000, pcorr=0.018; CT+TT vs. CC: OR=1.34, p=<0.009, pcorr=0.041) and rs187115 (GG vs. AA: OR=2.34, p=<0.000, pcorr=0.025; AG vs. AA: OR=1.59, p=<0.000, pcorr=0.038; G vs. A: OR=1.56, p=0.000, pcorr=0.05; AG+GG vs. AA: OR=1.63, p=<0.000, pcorr=0.013) polymorphisms were found to significantly increase the cancer risk in Asians. On the other hand, rs11821102 was found to confer a reduced risk (A vs. G: OR=0.87, p=<0.027, pcorr=0.04; AG vs. GG: OR=0.85, p=<0.017, pcorr=0.01; AG+AA vs. GG: OR=0.86, p=<0.020, pcorr=0.02). Based on our analysis, we suggest a significant role of CD44 variants (rs13347, rs187115 and rs11821102) in modulating an individual's cancer susceptibility in Asians. Therefore, these variants may be used as predictive genetic biomarkers for cancer predisposition in Asian populations. However, more comprehensive studies involving other cancers and/or populations, haplotypes, gene-gene and gene-environment interactions are necessary to delineate the role of these variants in conferring cancer risk.


INTRODUCTION
Cancer, an extremely complex and multifaceted disease involving multiple steps, is a leading cause of death worldwide. Over the last decades, considerable advances have been made in the development of better therapeutic interventions for cancer; however, chemo-resistance and disease recurrence still result in poor disease outcomes and poor survival rates [1]. Cancer stem cells (CSCs) are a small population of cells within a tumor that play a crucial role in cancer progression and recurrence. Because of their ability to self-renew, they may initiate tumor growth and promote metastasis, thereby leading to aggressive forms of the disease [2][3][4][5][6]. Hence, CSCs represent the most attractive and promising targets in clinical oncology [7].
Cluster of differentiation (CD) 44, a well-recognized CSC marker [8,9], is a multistructural and multifunctional transmembrane glycoprotein that belongs to a family of cell adhesion receptors and is widely expressed in most mammalian cells [10,11]. The gene for CD44 is complex (approx. 50 kb), located on human chromosome 11p13, and comprises 20 exons, of which 10 (exons 6-15) are involved in alternative splicing of CD44 to generate variant isoforms (CD44v) [12,13]. Although CD44 is the major receptor for hyaluronan (HA), the main component of the extracellular matrix, it can also bind MMPs, collagens and osteopontin. It is involved in the maintenance of cell-cell and cell-extracellular matrix (ECM) interactions, cell adhesion, cell trafficking and migration [14][15][16][17]. In addition, it mediates multiple vital biological processes such as angiogenesis, cell proliferation, cell differentiation, presentation of cytokines, chemokines and growth factors to the corresponding receptors, docking of proteases, and cell survival signaling, all of which are closely associated with neoplastic transformation and tumor progression [18][19][20]. Several lines of evidence have firmly established the role of CD44 in cell differentiation, epithelial-mesenchymal transition (EMT), invasion and metastatic spread in various human cancers [21][22][23][24][25]. In addition, CD44 aberrations have been shown to confer apoptosis resistance [26]. CD44 was found to act as a tumor promoter in some cancers while it functioned as a tumor suppressor in others [27]. Increased or decreased expression of CD44s or CD44v molecules has been reported in various cancers and shown to be associated with increased tumor aggressiveness, metastasis, early tumor recurrence and chemo- or radioresistance, as well as poor prognosis [28][29][30][31].
Further, CD44 targeting by monoclonal antibodies and blocking peptides has been established as a promising therapeutic approach for cancer [32][33][34].
Considering the important role of CD44 in carcinogenesis, several studies have explored the role of genetic variants of CD44 in cancer susceptibility, prognosis and chemotherapeutic response in various human cancers [35][36][37][38]. However, the results are controversial and the power of each study was restricted due to low sample size, necessitating further clarification of its role in cancer predisposition. Hence, we performed a meta-analysis of all eligible case-control studies to better interpret the associations between common SNPs of the CD44 gene (rs13347 C>T, rs10836347 C>T, rs11821102 G>A, rs713330 T>C, rs187115 T>C) and cancer risk.

RESULTS
According to our search strategy, we found a total of 13 case-control studies investigating the association of CD44 polymorphisms (rs13347 C>T, rs10836347 C>T, rs11821102 G>A, rs713330 T>C, rs187115 T>C) with cancer susceptibility [36,38-49]. However, the study by Qiu et al. [49] in Chinese gastric cancer patients lacked genotyping details for each of the studied SNPs and was hence excluded. Therefore, we included only 12 potential case-control studies in the present meta-analysis, and the characteristics of each eligible study are presented in Table 1. Since all studies were performed in Asian populations and are limited for cancer types, we

Quality assessment
According to the Newcastle-Ottawa quality assessment scale (NOS), the quality of all recruited case-control studies and their total quality scores are summarized in Table 2. The quality scores ranged from 6 to 8 and the average score of case-control studies was 7.08. Thus, our NOS results indicated that most of these studies (9) in our meta-analysis were of high quality (NOS score 7 or 8) and only three studies with NOS score of 6 were classified into intermediate quality.

CD44 rs13347
For the CD44 rs13347 meta-analysis, a total of 12 articles [36,38-48] with 6612 multiple cancer cases and 7450 controls were found to be eligible. The minor allele frequency (MAF) for the rs13347 polymorphism varied from 13-29%. Overall, the variant allele and all genotypic models having at least one variant allele of the rs13347 polymorphism were found to significantly increase the overall cancer risk compared with the wild allele/genotype (Figures 2 and 3). For this SNP, we used the random-effect model as the present meta-analysis revealed significant heterogeneity in all genotypic models. The removal of Lou et al. [39], Jiang et al. [42], Wu et al. [41,48], and Xiao et al. [40] was found to remove heterogeneity for the heterozygous as well as variant models (CT vs. CC: p_h = 0.085, I² = 46.057; TT vs. CC: p_h = 0.288, I² = 10.17), while removal of the above studies together with Verma et al. [47] was found to remove heterogeneity at the allele level and in the dominant model (T vs. C: p_h = 0.576, I² = 0.000; CT+TT vs. CC: p_h = 0.386, I² = 4.764). However, it was found to significantly change the pooled results. In our sensitivity analysis, we did not find any obvious change in the corresponding pooled ORs after removing one study at a time for each genetic model, thereby confirming the reliability of our results.
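The random-effect pooling and the leave-one-out sensitivity analysis described above can be sketched as follows. This is a minimal illustration, not the Comprehensive Meta-analysis implementation used in the study: it assumes the DerSimonian-Laird estimator for the between-study variance, and the function names and the example study values are hypothetical.

```python
import math

def pool_random_effects(log_ors, variances):
    """Pool log odds ratios with the DerSimonian-Laird random-effects estimator."""
    k = len(log_ors)
    weights = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = k - 1
    # I^2: percentage of total variation attributable to heterogeneity
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, log_ors)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }

def leave_one_out(labels, log_ors, variances):
    """Re-pool the data after omitting each study in turn."""
    results = {}
    for i, label in enumerate(labels):
        ys = log_ors[:i] + log_ors[i + 1:]
        vs = variances[:i] + variances[i + 1:]
        results[label] = pool_random_effects(ys, vs)
    return results

if __name__ == "__main__":
    # Hypothetical studies, for illustration only (not data from this meta-analysis)
    labels = ["Study A", "Study B", "Study C", "Study D"]
    log_ors = [math.log(x) for x in (1.10, 1.45, 1.30, 1.80)]
    variances = [0.020, 0.030, 0.025, 0.040]
    print(pool_random_effects(log_ors, variances))
    for label, res in leave_one_out(labels, log_ors, variances).items():
        print(label, round(res["or"], 3), round(res["i2"], 1))
```

A pooled OR that stays close to the full-data estimate across all omissions is what is meant by a stable sensitivity analysis.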
In subgroup analysis based on study design and genotyping method (Taqman and/or other), the significant  Table 3).

CD44 rs11821102
Among 11, only eight studies [36, 39-42, 45, 46, 48] investigated the association of the rs11821102 polymorphism with cancer risk; however, the study of Wu et al. [48] on CRC failed to follow the HWE in controls and was hence excluded. Thus, seven studies with 3733 multiple cancer cases and 4454 healthy controls were included in the rs11821102 meta-analysis. The minor allele frequency (MAF) for the rs11821102 SNP varied from 6-10%, and overall it was found to reduce the risk of cancer in most of the genotypic models (Table 3). We did not encounter any significant heterogeneity in the selected studies. In sensitivity analysis, removal of either of two studies, Chou et al. [45] or Xiao et al. [40], was found to alter the corresponding statistical p value of association in the heterozygous and dominant models, while the removal of Weng et al. [36], Chou et al. [45] or Xiao et al. [40] was found to alter the pooled OR at the allele level.
Further, in stratified analysis this association was lost in each subgroup except for HNC at allele level as well as at hetero genotype model (A vs. G: OR = 0.83, 95% CI = 0.69-0.99, p = <0.038; AG vs. GG: OR = 0.79, 95% CI = 0.65-0.97, p = <0.022). This may be because the small sample size of the subgroup did not possess sufficient statistical power to detect a weak effect.
[Figure caption, partially recovered: ... AA+AG vs. GG and overall cancer risk. For each study, the estimates of OR and 95% CI were plotted with square and horizontal lines. The size of the square points is the relative weight of the respective study. Diamonds indicate the pooled OR and its 95% CI.]
www.impactjournals.com/oncotarget

CD44 rs10836347
Among 11 studies, only seven (with a total of 4402 multiple cancer cases and 5167 controls) investigated the association of the rs10836347 polymorphism with various cancers [39-42, 45, 46, 48]. The MAF for the rs10836347 SNP varied from 6-8%. However, none of the genotypic combinations were found to affect the risk of overall cancer compared with the wild genotype (Table 3). Our meta-analysis showed no significant heterogeneity. The sensitivity analysis also confirmed the reliability of our result. Stratified analysis based on study design, cancer types and genotyping method did not modify the pooled result (Table 3).

CD44 rs713330
For rs713330, we found a total of six eligible studies with 3453 cancer cases and 3958 controls [36,41,42,45,46,48]. The MAF for the rs713330 polymorphism ranged from 9-11% in controls. Overall, none of the genotypic combinations were found to affect the risk of overall cancer compared with the wild genotype (Table 3). Our meta-analysis showed no significant heterogeneity. The reliability of these results was further confirmed by sensitivity analysis demonstrating no significant change in the pooled ORs. Stratified analysis based on genotyping method did not modify the pooled result (Table 3).

Publication bias
For rs13347, inspection of the funnel plot showed slight apparent asymmetry. Further, Egger's test as well as the Begg and Mazumdar rank correlation test did not demonstrate significant asymmetry except in the heterogenotype as  (Table 4). However, stratified analysis based on study design and genotyping method did not reveal any significant bias, suggesting these factors as the main source of bias in our meta-analysis (Figure 6). For the other SNPs also, although the funnel plots showed little asymmetry, Egger's test as well as the Begg and Mazumdar rank correlation test demonstrated no apparent asymmetry except for rs11821102 in the variant genotype model (Table 4). This may be because the number of studies (5-7) is too low to draw a more conclusive funnel plot.
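To make the asymmetry test concrete, the sketch below implements Egger's regression in its usual form: the standardized effect (log OR divided by its standard error) is regressed on precision (1 / standard error), and an intercept far from zero suggests funnel plot asymmetry. This is an illustrative assumption about the computation, not output from the software used in the study; the function name is hypothetical and the p value (a t distribution with n - 2 degrees of freedom) is deliberately left to a statistics library.

```python
import math

def egger_regression(log_ors, std_errs):
    """Egger's test: ordinary least squares of (effect / SE) on (1 / SE)."""
    n = len(log_ors)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / se for se in std_errs]                  # precision
    y = [b / se for b, se in zip(log_ors, std_errs)]   # standardized effect
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x                # the asymmetry estimate
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (n - 2)       # residual variance
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "t": t_stat,
        "df": n - 2,
    }
```

With only five to seven studies per SNP, the intercept's standard error is large, which is consistent with the caution expressed above about drawing conclusions from the funnel plots.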

Credibility of meta-analysis results
According to the Venice guidelines, the credibility of the cumulative association of CD44 variants with cancer risk is shown in Table 5. Our results demonstrated moderate evidence of association for the CD44 rs13347 and rs187115 variants, and weak evidence for rs11821102, rs10836347 and rs713330. This may be due to the small number of studies as well as small sample sizes. Additionally, the different genotyping methods and study designs, which contribute to likely bias, are potential reasons for the moderate or weak evidence of association.

DISCUSSION
Single nucleotide polymorphisms are the most common form of genetic variation and can alter the expression level and/or function of a gene, thereby affecting an individual's risk of cancer. In the present meta-analysis, we found that CD44 SNPs significantly modulate the risk of cancer in Asians. Specifically, rs13347 and rs187115 were found to significantly increase the cancer risk, while rs11821102 was associated with cancer protection. On the other hand, rs713330 and rs10836347 did not affect an individual's susceptibility to cancer.
The rs13347C/T SNP is located in the 3'-untranslated region (UTR) of CD44, which is highly conserved and is the main target region for microRNAs. The C to T base change of this SNP was found to disrupt the hsa-mir-509-3p binding site, thereby modifying CD44 mRNA stability and expression. Further, functional studies established the association of the T allele with enhanced transcriptional activity as compared with the C allele [39,48], and individuals carrying the T allele were shown to have higher expression of CD44 [42,48]. In addition, it was reported to affect hematopoietic stem cell mobilization in patients with hematologic malignancies [51]. The rs13347C/T SNP was significantly associated with an increased risk of CRC [48], NPC [39,40], AML [41] and breast cancer [42], and this risk was found to increase as the number of variant alleles (T) increased. The rs13347T variant was also shown to be associated with tumor stage and a lower five-year survival rate in cancer patients [42,48]. Although some studies failed to find an association of rs13347 with various cancers [38,43-46], our meta-analysis established that the CD44 rs13347 polymorphism is significantly associated with an overall increased risk of cancer. These findings suggest that this SNP may be used as a potential biomarker for genetic susceptibility to various cancers in Asians.
The rs187115 SNP is located in the first intron of CD44. Intronic SNPs have been shown to play an important role in gene function by regulating its transcription and splicing [52]. Previously, this SNP was shown to be associated with cellular responses to a large panel of cytotoxic chemotherapeutic agents in a p53-dependent manner. In addition, the variant allele of this SNP was found to confer decreased drug sensitivity, poor overall survival and an earlier age of diagnosis in soft tissue sarcoma patients [35]. Further, several studies reported significant association of this SNP with increased susceptibility, development, invasion, advanced stage and poor prognosis of various cancers [38,45,46,53]. Liu et al. (2014) reported that individuals having at least one copy of CD44 rs187115 variant allele were associated with increased bone metastasis and tumor stage, as well as with decreased survival rate in NSCLC patients. Thus, this variant was suggested as a potential predictive marker of survival in NSCLC patients [38].
Though none of the individual studies demonstrated a significant association of rs11821102 with cancer susceptibility, our results demonstrated a significant role of this SNP in cancer protection. The exact mechanism by which this SNP modulates cancer risk has not yet been elucidated; however, its location in the 3'UTR suggests that it may alter miRNA binding, contributing to CD44 deregulation. Further, our in-silico analysis also revealed a role of CD44 rs11821102 in transcriptional regulation (Table 6).
*The first letter refers to the amount of evidence, assessed by counting the number of minor alleles. Grades A, B and C correspond to nminor >1,000, 100-1,000 and <100, respectively, where nminor is the total number of cases and controls with the least frequent genotype. The second letter refers to the replication assessment and the third letter to protection from bias [50].
To the best of our knowledge, we are the first to perform such a comprehensive meta-analysis of common functional polymorphisms of the CD44 gene comprising all the published and well-defined case-control studies. We followed strict inclusion/exclusion criteria to avoid likely biases, and the NOS system was used to evaluate the quality of each study, demonstrating that all the included studies were of good (moderate to high) methodologic quality. In addition, our study improved the statistical power of the analysis since we pooled a large number of cases and controls from various studies. We also performed sensitivity analysis and multiple-testing correction to guard against false-positive results; the results remained unaffected, thereby adding weight to our findings. Since cancer is a highly fatal disease, our results on the association of functional SNPs in the CD44 gene may have clinical significance in that they can help to identify interindividual differences in tumor susceptibility, recurrence capacity and chemoresistance among patients. However, these results should be interpreted with caution, as overall our study indicates moderate or weak evidence for association, mainly due to the limited number of studies.

Study limitation
Though we have collected all articles published to date, we could not perform a comprehensive subgroup analysis, as the available studies were limited to Asian populations and to a limited number of cancer types. In addition, there was significant heterogeneity in the rs13347 meta-analysis, although in subgroup analysis it was removed or decreased. Further, our results are based on unadjusted or crude estimates, and the roles of haplotypes, gene-gene and gene-environment interactions, as well as linkage disequilibrium, were not considered. Last but not least, we could not exclude the possibility of selection bias, as study selection was limited to published results and to articles in the English language, and the included studies used different genotyping methods and study designs.

Conclusions
We demonstrated a significant association of CD44 SNPs with the modulation of cancer risk. Specifically, rs13347 and rs187115 may be used as potential biomarkers for cancers in Asian populations. However, further analysis considering the aforementioned limitations and the prognostic significance of CD44 is required to better understand the role of these CD44 SNPs in cancer risk.

Literature search and study selection criteria
Following the PRISMA statement [54], we performed a systematic and comprehensive literature search of the "Pubmed", "Medline", "Google Scholar", "EMBASE", and "Scopus" databases by using the following MeSH index keywords: "CD44 gene", "Cluster of differentiation", in combination with "single nucleotide polymorphism (SNP)/variation/genotype", and "cancer/carcinoma" or "tumor". All published case-control studies in the English language investigating the association of CD44 gene polymorphisms with human cancer susceptibility were searched until May 2016. All relevant studies were collected after thorough investigation of the abstracts of potential articles. Further, the reference lists of the selected articles and related reviews on the topic were manually examined to collect additional relevant studies.
The selection criteria for the studies were: an original case-control study examining the association of a CD44 polymorphism with cancer risk, with sufficient information to calculate the relative risk and 95% confidence intervals (CI), histopathologically confirmed cancer cases, and healthy controls (free from any malignancy or other related pre-malignant condition such as benign lesions and hyperplasia). On the other hand, studies unrelated to cancer research, those lacking a control population or sufficient data, and those not in accordance with Hardy-Weinberg equilibrium (HWE) in control groups were excluded from the meta-analysis. Duplicate or ecological studies, case reports, reviews, abstracts, comments and editorials were also excluded from the present meta-analysis.
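The HWE exclusion criterion can be illustrated with the standard one-degree-of-freedom chi-square goodness-of-fit test on control genotype counts. This is a minimal sketch of the textbook test, not a reimplementation of the online calculator used in the study; the function name and the p < 0.05 threshold for declaring a deviation are assumptions.

```python
import math

def hwe_chi_square(n_wild, n_het, n_variant):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_wild + n_het + n_variant
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_wild + n_het) / (2.0 * n)   # frequency of the wild-type allele
    q = 1.0 - p                            # frequency of the variant allele
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_wild, n_het, n_variant]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def deviates_from_hwe(n_wild, n_het, n_variant, alpha=0.05):
    """True if the controls deviate from HWE at the assumed threshold alpha."""
    _, p_value = hwe_chi_square(n_wild, n_het, n_variant)
    return p_value < alpha
```

For example, `deviates_from_hwe(64, 32, 4)` is false because those counts match the expected proportions for an allele frequency of 0.8, whereas a control group with no heterozygotes at all, such as `(50, 0, 50)`, would be flagged.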

Data extraction
Two investigators independently assessed the eligibility of each study according to the inclusion and exclusion criteria listed above, and any disagreements were resolved by discussion and consensus. Data such as first author name, publication year, country of origin, ethnicity, genotyping methods, cancer types, numbers of cases and controls, genotype frequencies, minor allele frequencies, etc., were carefully extracted from all eligible studies.
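As an illustration of how the extracted genotype counts translate into the comparisons reported in this paper (allele contrast, heterozygote, homozygote and dominant models), the sketch below computes a crude OR with a Woolf (log-based) 95% CI for each model. The function names are hypothetical, and the 0.5 continuity correction for zero cells is an assumption added for robustness, not something stated in the study.

```python
import math

def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Crude OR with a Woolf 95% CI from a 2x2 table."""
    cells = [case_exp, case_unexp, ctrl_exp, ctrl_unexp]
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]   # assumed continuity correction
    a, b, c, d = cells
    or_value = (a * d) / (b * c)
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    low = math.exp(math.log(or_value) - 1.96 * se)
    high = math.exp(math.log(or_value) + 1.96 * se)
    return or_value, low, high

def genetic_model_ors(cases, controls):
    """cases/controls: (wild homozygote, heterozygote, variant homozygote) counts."""
    ca_w, ca_h, ca_v = cases
    co_w, co_h, co_v = controls
    return {
        # Variant allele vs. wild allele (each person contributes two alleles)
        "allele": odds_ratio(2 * ca_v + ca_h, 2 * ca_w + ca_h,
                             2 * co_v + co_h, 2 * co_w + co_h),
        # Heterozygote vs. wild homozygote (e.g. CT vs. CC)
        "heterozygote": odds_ratio(ca_h, ca_w, co_h, co_w),
        # Variant homozygote vs. wild homozygote (e.g. TT vs. CC)
        "homozygote": odds_ratio(ca_v, ca_w, co_v, co_w),
        # Dominant model (e.g. CT+TT vs. CC)
        "dominant": odds_ratio(ca_h + ca_v, ca_w, co_h + co_v, co_w),
    }
```

The per-study log ORs and their variances (the squared standard errors) produced this way are the inputs that a pooling step would then combine.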

Quality score assessment
The quality of each study included in this meta-analysis (Ref) was rigorously evaluated independently by two authors (Rai Rajani and Gupta Usha), using the Newcastle-Ottawa quality assessment scale (NOS) [55,56], and all disagreements were resolved by discussion. The NOS is a star rating system in which each study is judged on standard criteria and subsequently categorized based on three factors: selection, comparability and exposure assessment, with scores ranging from zero to nine stars. Studies with NOS scores of 7 to 9, 4 to 6 and 1 to 3 stars are usually considered to be of high, intermediate and low methodological quality, respectively.
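The star thresholds above map directly onto a small classification helper. The sketch below is illustrative only: the function names are hypothetical, a score of zero is treated as low quality by assumption (the text defines the low band as 1 to 3 stars), and the example scores are invented rather than taken from Table 2.

```python
def classify_nos(score):
    """Map a Newcastle-Ottawa score (0-9 stars) to a quality category."""
    if not 0 <= score <= 9:
        raise ValueError("NOS scores range from 0 to 9 stars")
    if score >= 7:
        return "high"
    if score >= 4:
        return "intermediate"
    return "low"  # assumption: a score of 0 is grouped with 1-3 stars

def summarize_nos(scores):
    """Count studies per quality category and report the mean score."""
    counts = {"high": 0, "intermediate": 0, "low": 0}
    for s in scores:
        counts[classify_nos(s)] += 1
    mean = sum(scores) / len(scores) if scores else float("nan")
    return counts, mean

if __name__ == "__main__":
    hypothetical_scores = [8, 7, 6, 7]   # illustration only
    print(summarize_nos(hypothetical_scores))
```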

Statistical analysis
All statistical analyses were conducted using the Comprehensive Meta-analysis software (Version 2.0, BIOSTAT, Englewood, NJ). The pooled ORs were estimated for allele contrast, log-additive and dominant models. An OR greater than 1 was interpreted as increased risk and an OR less than 1 as reduced risk. Heterogeneity was measured using the I² value and the chi-square-based Q statistic (significant at p < 0.05). I² values of 0%, 25%, 50% and 75% were considered as no, low, moderate, and high observed heterogeneity, respectively [57]. In the case of significant heterogeneity, the random-effect model was used to calculate the pooled ORs [58,59]. Funnel plots and Egger's test were used to examine publication bias [60]. Moreover, sensitivity analysis was performed to check whether alteration of the inclusion criteria affected the results of the meta-analysis. To adjust the p values for multiple comparisons in subgroup analyses, we applied the Benjamini-Hochberg (BH) step-up correction method, which controls the false discovery rate (FDR), yielding pcorr. A pcorr value less than 0.05 was considered significant [61]. The Hardy-Weinberg equilibrium (HWE) test for each SNP was performed using Michael H. Court's (2005-2008) online calculator (http://www.tufts.edu/~mcourt01/Documents/Court%20lab%20-%20HW%20calculator.xls). Further, an in-silico study was performed using the online web servers FastSNP (http://fastsnp.ibms.sinica.edu.tw) and F-SNP (http://compbio.cs.queensu.ca/F-snP/) to predict the functional effect of each SNP.
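The BH step-up adjustment mentioned above can be sketched in a few lines. This is a generic implementation of the standard procedure and is not taken from the software used in the study; the function name and the example p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Step up from the largest p value, enforcing monotonicity of adjusted values
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]      # hypothetical p values
    corrected = benjamini_hochberg(raw)
    significant = [p < 0.05 for p in corrected]
    print(corrected, significant)
```

Each raw p value is multiplied by the number of tests divided by its rank, and the running minimum ensures that a smaller raw p value never receives a larger adjusted value than a bigger one.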

Credibility of meta-analysis results
The credibility of the cumulative association between CD44 polymorphisms and cancer risk was assessed using the Venice interim criteria [50], which comprise a set of three scores (the amount of evidence, replication of results, and protection from bias) used to grade the evidence produced by the study. Each of these three scores can attain a maximum grade of 'A', followed by 'B' and 'C'. Finally, the grades are combined as follows: strong evidence (AAA), moderate evidence (AAB, ABA, ABB, BAA, BBA, BBB, BAB) and weak evidence (all remaining scores).
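The grading rule above, together with the nminor thresholds given in the footnote to the credibility table (A for more than 1,000, B for 100-1,000, C for fewer than 100), can be written as a small lookup. The function names are hypothetical; the replication and bias grades are assumed to be assigned by the reviewers and are simply passed in.

```python
MODERATE = {"AAB", "ABA", "ABB", "BAA", "BBA", "BBB", "BAB"}

def amount_grade(n_minor):
    """Grade the amount of evidence from the count of the least frequent genotype."""
    if n_minor > 1000:
        return "A"
    if n_minor >= 100:
        return "B"
    return "C"

def overall_evidence(grades):
    """Combine three letters (amount, replication, bias) into an evidence level."""
    grades = grades.upper()
    if len(grades) != 3 or any(g not in "ABC" for g in grades):
        raise ValueError("expected three letters from A, B, C")
    if grades == "AAA":
        return "strong"
    if grades in MODERATE:
        return "moderate"
    return "weak"   # every combination that contains at least one C

if __name__ == "__main__":
    # Hypothetical example: n_minor = 450, replication graded B, bias graded B
    code = amount_grade(450) + "B" + "B"
    print(code, overall_evidence(code))
```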