PSCA polymorphisms and gastric cancer susceptibility in an eastern Chinese population

The prostate stem cell antigen (PSCA) gene, which encodes a prostate-specific antigen (PSA), was identified as a gene involved in cell adhesion and proliferation. The associations between the PSCA rs2294008 and rs2976392 single nucleotide polymorphisms (SNPs) and gastric cancer (GCa) susceptibility remain controversial. To derive a more precise estimate of these associations, we conducted a case-control study of 1,124 cases and 1,192 controls in an eastern Chinese population. We found that the rs2294008 T variant genotypes were associated with an increased GCa risk in this study population (CT vs CC, OR=1.59, 95% CI=1.33-1.89; CT+TT vs CC, OR=1.38, 95% CI=1.17-1.62). For SNP rs2976392, the variant A genotypes were also associated with an increased GCa risk (AG vs GG, OR=1.61, 95% CI=1.35-1.91; AG+AA vs GG, OR=1.47, 95% CI=1.25-1.74). These results were further supported by a meta-analysis. In conclusion, the PSCA rs2294008 T and rs2976392 A alleles were low-penetrance risk factors for GCa in this study population. However, larger, well-designed studies are warranted to validate our findings.


INTRODUCTION
Gastric cancer (GCa) is one of the most frequently occurring cancers and one of the leading causes of cancer-related deaths. There were 951,600 new GCa cases and 723,100 deaths in 2012, accounting for 8% of cancer cases and 10% of cancer deaths worldwide, respectively [1]. GCa therefore remains a major public health challenge. Although the mechanism of gastric carcinogenesis is still not fully understood, it has been suggested that environmental factors and low-penetrance susceptibility genes may be important in the etiology of GCa. The higher rate of Helicobacter pylori (HP) infection in developing countries (70-90%) than in developed countries (25-50%) might be a potential risk factor for increased GCa risk [2,3]. However, only a few HP carriers develop GCa; therefore, other factors must play a role in GCa risk. Lifestyle factors such as tobacco smoking, alcohol use and dietary habits are also likely to be risk factors for GCa [4]. Although genetic factors for GCa risk are still not fully understood, recent success in identifying significant associations between genetic variants and GCa risk is encouraging [5-9], and it is necessary to confirm the genetic factors that have been reported to play a role in GCa risk.
Oncotarget 9421 www.impactjournals.com/oncotarget
The prostate stem cell antigen (PSCA) gene, which encodes a prostate-specific antigen (PSA), was identified as a gene involved in cell adhesion, proliferation and patient survival [10,11]. In the stomach, PSCA is mainly expressed in the isthmus/neck region, but its expression was undetectable in GCa tumor tissues [12], suggesting loss of a tumor-suppressor effect of PSCA in GCa. In addition, its biological role in cancer progression has been reported in published in vivo functional studies [13,14]. Therefore, it is necessary to investigate the role of PSCA genetic variants in the etiology of GCa. Importantly, several genome-wide association studies (GWASs) have demonstrated an association between PSCA variants and cancer susceptibility [13,15,16]. One GWAS in Korean and Japanese populations reported that two SNPs in the PSCA gene (rs2294008 C>T and rs2976392 G>A) were associated with an increased GCa risk [13]. However, these associations were not confirmed in subsequent replication studies [17,18].
To further confirm the associations between PSCA rs2294008 and rs2976392 SNPs and GCa risk, we conducted a replication study in a large Eastern Chinese population and also performed a meta-analysis with published studies.

RESULTS
Baseline characteristics of the individuals included in this study were consistent with those described in our previous study [19], except that one case sample and four control samples failed genotyping. Thus, the final analysis included 1,124 GCa patients and 1,192 cancer-free controls (Supplemental Table 1). Cases and controls were well matched by age and sex, but there were more smokers and drinkers among the controls; these variables were therefore adjusted for in the subsequent multivariate analysis. SNPs rs2294008 and rs2976392 were in high linkage disequilibrium (r² = 0.969).
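For readers unfamiliar with the r² statistic, the following is a minimal sketch of how pairwise linkage disequilibrium can be computed from haplotype and allele frequencies. The function name and the frequencies are illustrative assumptions, not the data or software of this study; in practice, haplotype frequencies must first be estimated from unphased genotypes.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Return r^2 between two biallelic SNPs.

    p_ab : frequency of the haplotype carrying both variant alleles
    p_a  : variant allele frequency at the first SNP
    p_b  : variant allele frequency at the second SNP
    """
    # D measures how far the haplotype frequency departs from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Hypothetical frequencies (assumptions for illustration only).
r2_complete = ld_r_squared(p_ab=0.25, p_a=0.25, p_b=0.25)    # variants always co-occur
r2_independent = ld_r_squared(p_ab=0.25, p_a=0.5, p_b=0.5)   # no association
print(r2_complete, r2_independent)
```

An r² close to 1, as observed here, means that the two SNPs carry nearly the same information, so their association signals cannot be regarded as independent.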
The allele frequencies of SNPs rs2294008 and rs2976392 in cases and controls and their associations with GCa risk are presented in Table 1. The variant rs2294008 T genotypes were associated with an increased risk of GCa (CT vs CC, OR=1.59, 95% CI=1.33-1.89; CT+TT vs CC, OR=1.38, 95% CI=1.17-1.62). For SNP rs2976392, the variant A genotypes were also associated with an increased GCa risk (AG vs GG, OR=1.61, 95% CI=1.35-1.91; AG+AA vs GG, OR=1.47, 95% CI=1.25-1.74). When the two SNPs were combined, subjects who carried more than one risk allele had a significantly increased risk of GCa (OR=1.35, 95% CI=1.14-1.59) compared with those who did not carry any risk allele.
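As an illustration of how the genotype contrasts above are formed, the sketch below computes crude (unadjusted) ORs with Wald 95% CIs for the heterozygous, homozygous and dominant models. The genotype counts and function names are hypothetical, and the estimates reported in this study were additionally adjusted for age, sex, smoking and drinking status by logistic regression, which this sketch does not do.

```python
import math


def odds_ratio_ci(case_exp, case_ref, ctrl_exp, ctrl_ref, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table."""
    counts = (case_exp, case_ref, ctrl_exp, ctrl_ref)
    if min(counts) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (case_exp * ctrl_ref) / (case_ref * ctrl_exp)
    # Standard error of log(OR) (Woolf's method).
    se = math.sqrt(sum(1.0 / c for c in counts))
    low = math.exp(math.log(odds_ratio) - z * se)
    high = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, low, high


def genetic_model_tables(cases, controls, ref="CC", het="CT", hom="TT"):
    """Build (case_exp, case_ref, ctrl_exp, ctrl_ref) for each genetic model."""
    return {
        "heterozygous": (cases[het], cases[ref], controls[het], controls[ref]),
        "homozygous": (cases[hom], cases[ref], controls[hom], controls[ref]),
        "dominant": (cases[het] + cases[hom], cases[ref],
                     controls[het] + controls[hom], controls[ref]),
    }


# Hypothetical genotype counts (NOT the counts observed in this study).
cases = {"CC": 100, "CT": 150, "TT": 50}
controls = {"CC": 150, "CT": 120, "TT": 30}

results = {model: odds_ratio_ci(*table)
           for model, table in genetic_model_tables(cases, controls).items()}
for model, (odds_ratio, low, high) in results.items():
    print(f"{model}: OR={odds_ratio:.2f}, 95% CI={low:.2f}-{high:.2f}")
```

In every model the wild-type homozygotes serve as the reference group, which is why the dominant model simply pools the heterozygous and variant homozygous carriers.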
In the stratified analysis presented in Table 2, we found that the associations between the SNP rs2294008 and GCa risk remained significant in dominant models for subgroups of <=59 years (OR=1.53, 95%CI=1. 22 Table 2).
Then, we performed a mini meta-analysis, including the present study, of 19 studies [17,18,26-42]. Pooled data indicated that both the PSCA rs2294008 and rs2976392 SNPs were strongly associated with an increased GCa risk (Figure 2), without significant publication bias. However, significant heterogeneity across studies was present in these genetic models. Thus, we performed a sensitivity analysis to assess the effect of each study on the pooled results. The pooled ORs were not affected by omitting one study at a time (data not shown), which suggests that the results are robust. The required information size was estimated with 80% power, a 5% two-sided alpha, a 10% relative risk reduction and a heterogeneity correction based on the results of the traditional meta-analysis. The trial sequential analysis (TSA) suggested that the cumulative information size of the dominant model for rs2294008 (28,259) had reached the required information size (18,726). The other meta-analyses did not reach, but were not far from, the required information size. Moreover, the Lan-DeMets sequential monitoring boundary for benefit of wild-type alleles was crossed by the cumulative results in all genetic models except for the heterozygous model for rs2294008, indicating that the results of our meta-analysis are conclusive and reliable.

Table 1: Logistic regression analysis of associations between PSCA genotypes and gastric cancer risk in an eastern Chinese population
CI, confidence interval; OR, odds ratio
a Chi-square test for genotype distributions between cases and controls
b Adjusted for age, sex, smoking and drinking status in logistic regression models
c For additive genetic models
d For dominant genetic models
In addition, the sequential meta-analysis showed that the results of the present study increased the cumulative Z score and therefore strengthened the evidence that the rs2294008 T and rs2976392 A variants have an effect on GCa risk (Figure 3A and 3B).
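The idea of a cumulative Z score can be sketched as follows, assuming a simple fixed-effect inverse-variance model in which studies are added one at a time. This is a simplification for illustration, with hypothetical effect sizes and a hypothetical function name; the TSA software additionally applies a heterogeneity correction and monitoring boundaries that are not reproduced here.

```python
import math


def cumulative_z(log_ors, ses):
    """Cumulative Z score after each study is added (inverse-variance weights)."""
    z_scores = []
    sum_w = 0.0    # running sum of weights
    sum_wy = 0.0   # running sum of weight * log(OR)
    for log_or, se in zip(log_ors, ses):
        weight = 1.0 / se ** 2
        sum_w += weight
        sum_wy += weight * log_or
        pooled = sum_wy / sum_w
        # Z = pooled estimate / its standard error, where SE = 1 / sqrt(sum_w).
        z_scores.append(pooled * math.sqrt(sum_w))
    return z_scores


# Hypothetical studies in chronological order (assumed values, not real data).
study_log_ors = [math.log(1.2), math.log(1.3), math.log(1.4)]
study_ses = [0.1, 0.1, 0.1]

zs = cumulative_z(study_log_ors, study_ses)
for k, z in enumerate(zs, start=1):
    print(f"after study {k}: Z = {z:.2f}")
```

A new study raises the cumulative Z score when it adds information in the same direction as the accumulated evidence, which is the sense in which the present study strengthened the pooled result.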

DISCUSSION
In addition to environmental and lifestyle factors, genetic factors are also important for identifying at-risk populations for GCa prevention. Although PSCA is mainly expressed in the isthmus/neck region of the stomach, where GCa often occurs [13], it is interesting that PSCA expression is suppressed in GCa tumor tissues. Moreover, it was reported that PSCA expression may originate from some proliferating precursor cells [43]. These findings suggest that PSCA is a potential tumor suppressor in GCa. Therefore, it is biologically plausible that SNPs that down-regulate PSCA expression predispose individuals to GCa. This hypothesis is consistent with our results that the PSCA rs2294008 T and rs2976392 A alleles were associated with an increased GCa risk in the study population. More importantly, these associations were further supported by our meta-analysis of pooled data from all the published studies. In addition, most pooled results were considered robust by sequential meta-analysis, except for rs2294008, for which results were heterogeneous across studies (I² = 95.90%).

Figure 3: A. Sequential meta-analysis for dominant models for rs2294008, with relative risk reduction of 10%, power of 80%, alpha of 5%, and heterogeneity correction of 78.75%. Sequential boundary for benefit has been crossed and the required information size was satisfied. B. Sequential meta-analysis for dominant models for rs2294008, with relative risk reduction of 10%, power of 80%, alpha of 5%, and heterogeneity correction of 84.70%. Sequential boundary for benefit has been crossed.
However, it was surprising that the GCa risk for the subgroups of PSCA variant homozygotes (TT for rs2294008 and AA for rs2976392) was not statistically significant. This may be explained by a co-dominant genetic model, in which only imbalanced pairing of the protein subunits encoded by the two different alleles affects protein function. Alternatively, the variant homozygotes may have been subject to embryonic lethality, leading to a high rate of abortions. Lastly, this finding may simply be due to chance, to a systematic error in genotyping, or to the small sample size of the subgroup, which may have provided insufficient statistical power to detect a weak effect or generated an unstable risk estimate. All these speculations should be explored further in larger and mechanistic studies. Therefore, our results should be interpreted with caution.
There are some limitations in the present study. First, although age, sex, smoking and drinking status, and tumor site were taken into consideration in the subgroup analysis, other important risk factors such as diet and HP infection, for which data were missing in this study, might also contribute to the etiology of GCa. Second, a newer classification of GCa tumor types, which was not available for patients diagnosed years ago, is also important, because different tumor types may have different genetic bases. Third, the number of cases in each subgroup was substantially reduced in the stratified analysis, which may have limited the statistical power of the subsequent analyses.
In summary, our results indicated that the PSCA rs2294008 T and rs2976392 A alleles may be low-penetrance risk factors for GCa. However, future studies should incorporate diet, HP infection status and Lauren classification to better understand the associations between the PSCA SNPs and GCa risk.

Study subjects
This study included GCa patients and cancer-free controls who were part of our ongoing molecular epidemiology study, as described previously [19-21]. Briefly, 1,125 unrelated ethnic Han Chinese patients with newly diagnosed and histopathologically confirmed primary GCa were recruited from Fudan University Shanghai Cancer Center (FUSCC) in Eastern China between January 2009 and March 2011. Patients without histopathologically confirmed primary GCa were excluded. In addition, 1,196 age- and sex-matched cancer-free ethnic Han Chinese controls were recruited from the Taizhou Longitudinal (TZL) study, conducted during the same time period in Eastern China, as described previously [22]. Blood samples of GCa patients and cancer-free controls were provided by the tissue banks of FUSCC and the TZL study, respectively. All subjects had signed a written informed consent for donating their biological samples to the tissue banks for scientific research. Demographic data and environmental exposure history were collected for each subject. The overall response rate was approximately 91% for cases and 90% for controls. This research protocol was approved by the FUSCC institutional review board.

SNP genotyping
Using a standard protocol, we extracted genomic DNA from peripheral blood samples. The rs2294008 and rs2976392 SNPs were genotyped by the TaqMan assay with the ABI7900HT real-time PCR system, as reported previously [19]. Genotyping was performed blinded to the subjects' case-control status. As recommended by the company, four negative controls (without a DNA template) and two duplicated samples were included in each 384-well plate for quality control. The assays were repeated for 5% of the samples, and the results were 100% concordant.

Statistical methods
An individual who had never smoked cigarettes was defined as a never smoker; one who had smoked cigarettes but quit more than one year before diagnosis (for cases) or before the interview (for controls) was defined as a former smoker; and one who currently smoked or had quit within one year before diagnosis (for cases) or before the interview (for controls) was defined as a current smoker. Ever smokers included both former and current smokers. Those who drank alcoholic beverages at least once a week for one year or more were defined as drinkers, while the others were non-drinkers. The χ² test was used to assess differences in the distributions of demographic characteristics between cases and controls. The associations between SNPs and GCa risk were assessed by odds ratios (ORs) and 95% confidence intervals (CIs) in heterozygous, homozygous, and dominant models. ORs were calculated with univariate and multivariate logistic regression models, the latter with adjustment for age, sex, smoking and drinking status. The combined effect of the tested SNPs on GCa risk was also evaluated in logistic regression models. Furthermore, associations between the PSCA rs2294008 C>T and rs2976392 G>A SNPs and GCa risk were stratified by age, sex, smoking or drinking status, and primary tumor site. These statistical analyses were performed with SAS software (version 9.1; SAS Institute, Cary, NC).

To further validate our results, we also performed a mini meta-analysis of studies identified by searching Medline, PubMed and Embase. The search terms and the inclusion and exclusion criteria were essentially in accordance with previous studies [23,24]. All primary reports were carefully reviewed, and the relevant references in these papers were also manually searched and reviewed by two independent authors.
Then, data were retrieved from the included studies, and pooled ORs for heterozygous, homozygous, and dominant models were calculated. Heterogeneity among studies was estimated by the chi-square-based Q test. When the P value for the Q test was greater than 0.10, indicating a lack of heterogeneity among studies, the pooled OR was calculated with the fixed-effects model (the Mantel-Haenszel method); otherwise, the random-effects model (the DerSimonian and Laird method) was used [25]. To validate the stability of the pooled results and to find the sources of heterogeneity, we performed a leave-one-out sensitivity analysis. Publication bias was assessed with the funnel plot, the asymmetry of which was evaluated by Egger's linear regression test; as suggested by Egger, P<0.05 in the t test was considered to indicate statistically significant publication bias. These analyses were performed with STATA version 10.0 (Stata Corporation, College Station, TX).

Traditional meta-analysis cannot establish whether results are conclusive, because it does not control and balance type I and type II errors. To address this problem, the studies included in this meta-analysis were treated as analogous to interim analyses of a randomized controlled clinical trial. Sequential meta-analysis (SMA) was conducted to calculate the required information size for this meta-analysis. The Lan-DeMets sequential monitoring boundary was established to control type I and type II errors by the alpha- and beta-spending function methods. Whether the monitoring boundary was crossed was used to assess the reliability of the results obtained from the traditional meta-analysis. All steps were performed with the TSA software version 0.9.
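The pooling rule and the leave-one-out check described in this section can be sketched as follows. This is an illustrative approximation rather than the STATA procedure used in the study: the fixed-effects estimate uses inverse-variance weights instead of the Mantel-Haenszel method, the function names are hypothetical, and the input log ORs and standard errors are made-up values.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range((df - 1) // 2):
        total += term
        term *= x / (2 * j + 3)
    return total


def pool(log_ors, ses, alpha_het=0.10):
    """Pool log ORs; use random effects when the Q-test P value <= alpha_het."""
    if len(log_ors) < 2:
        raise ValueError("at least two studies are required")
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))  # Cochran's Q
    df = len(log_ors) - 1
    p_het = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    if p_het > alpha_het:
        model, est, var = "fixed", fixed, 1.0 / sw
    else:
        # DerSimonian and Laird between-study variance.
        c = sw - sum(wi ** 2 for wi in w) / sw
        tau2 = max(0.0, (q - df) / c)
        w_re = [1.0 / (se ** 2 + tau2) for se in ses]
        est = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
        model, var = "random", 1.0 / sum(w_re)
    half_width = 1.96 * math.sqrt(var)
    return {"model": model, "pooled_or": math.exp(est),
            "ci_low": math.exp(est - half_width),
            "ci_high": math.exp(est + half_width),
            "q": q, "p_het": p_het, "i2": i2}


def leave_one_out(log_ors, ses):
    """Pooled OR after omitting each study in turn (needs >= 3 studies)."""
    return [pool(log_ors[:i] + log_ors[i + 1:], ses[:i] + ses[i + 1:])["pooled_or"]
            for i in range(len(log_ors))]


# Hypothetical inputs (assumed values, not data from the included studies).
het = pool([math.log(1.0), math.log(2.0)], [0.1, 0.1])   # discordant studies
hom_log_ors = [math.log(1.5)] * 3                        # concordant studies
hom = pool(hom_log_ors, [0.1, 0.2, 0.1])
loo = leave_one_out(hom_log_ors, [0.1, 0.2, 0.1])
print(het["model"], hom["model"], loo)
```

When the leave-one-out estimates stay close to the overall pooled OR, no single study dominates the result, which is the sense in which the pooled estimates were described as robust.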