Risk prediction for early-onset gastric carcinoma: a case-control study of polygenic gastric cancer in Han Chinese with hereditary background

Recent genome-wide studies have identified several germline variations associated with gastric cancer. The aim of the present study was to identify, in a Chinese Han population, the individual and combined effects of those single nucleotide polymorphisms (SNPs) that increase the risk of early-onset gastric cancer. We conducted a case-control study comprising 116 patients with gastric cancer and 102 sex- and age-matched controls and confirmed that the SNPs MUC1 (mucin 1) rs4072037 and ZBTB20 (zinc finger and BTB domain containing 20) rs9841504 were associated with an increased gastric cancer risk. Of the 116 patients diagnosed with cancer, 65 had at least 1 direct lineal relative with carcinoma of the digestive system or breast/ovarian cancer. In these 65 patients, another 4 SNPs were associated with gastric cancer susceptibility: PSCA (prostate stem cell antigen) rs2294008, PLCE1 (phospholipase C epsilon 1) rs2274223, PTGER4/PRKAA1 (prostaglandin E receptor 4/protein kinase AMP-activated catalytic subunit alpha 1) rs13361707, and TYMS (thymidylate synthetase) rs2790. However, each of these low-penetrance susceptibility polymorphisms alone is not considered influential enough to predict the absolute risk of early-onset gastric cancer. We therefore studied the effects of different combinations of these polymorphisms in our population. Subjects carrying the risk alleles of both MUC1 rs4072037 and ZBTB20 rs9841504 had a greater than 3-fold increased risk of gastric cancer. In addition, those with a hereditary background who carried the risk alleles of PLCE1 rs2274223 and PTGER4/PRKAA1 rs13361707 were 3 times more susceptible to cardia cancer than those without. These findings show that the study of combined polymorphisms, instead of single low-penetrance variations in susceptibility, may lead to a high-risk classification for a specific population.

since the 1990s. [7] There are several reasons for this paradoxical phenomenon. First, although factors like H. pylori have recently been better recognized and understood, other environmental factors, such as air pollution and climate change, can induce effects that seem relatively small but that accumulate yearly and thus affect these specific generations. [8,9] Second, after the decline of H. pylori infection, other risk factors, such as Epstein-Barr virus, which was unmasked by the eradication of H. pylori, can also increase the risk of carcinogenesis. [6] Third, the accumulated genetic variations in carcinogenesis have now become more marked, leading to earlier onset of disease. [10,11] Genetic variations in breast cancer (e.g., BRCA1 and BRCA2) are highly penetrant, suggesting a strong linkage with family history and genetic susceptibility. [12] Similarly, a relationship between germline alterations in CDH1 (E-cadherin) and hereditary diffuse gastric cancer with family clustering has been observed in Western countries [13]; however, rare families have also been reported in Asian countries. [14][15][16] Finally, in sporadic gastric cancer, genetic susceptibility to the SNP CDH1 rs16260 was reported at odds ratios of 1.20 in European and 0.93 in Asian populations. [17,18] We undertook further study of genetically related gastric cancer in an Asian population, much as in the genome-wide association studies (GWAS). The latter identified several risk-associated loci with genetic susceptibility, including the SNPs PSCA rs2976392 (in strong linkage disequilibrium with rs2294008), PLCE1 rs2274223, ZBTB20 rs9841504, and PTGER4/PRKAA1 rs13361707. [19][20][21] MUC1 rs4072037 and TYMS rs2790 have been recognized as risk alleles in similar studies of gastric cancer. [25,26] However, results have not always been consistent, possibly owing to varying hereditary traits. [22-25]
Polygenic approaches have been attempted to predict and prevent breast and bladder cancers stemming from low-penetrance mutations. [27,28] Recently several genetic susceptibility loci associated with gastric cancer risk have been identified and verified, and it was suggested that "sporadic" cancer be called "polygenic" instead of "nonhereditary." [29] Although twin studies have suggested that many "sporadic" cancers show little or no heritability, Lu et al. have demonstrated that several "sporadic" cancers have a significant inherited component. [29] We refer to this inherited component as "hereditary background" in this paper. In our research involving Chinese Han individuals aged 50 years or below with a hereditary background of malignancy, we were able to identify a number of potential risk alleles in polygenic gastric cancer. The primary purpose of our study was to elucidate the combined effect of such early-onset risk alleles.

RESULTS

Characteristics of study subjects
This study included 116 Chinese Han individuals less than 50 years of age with gastric cancer and 102 healthy sex- and age-matched controls. All were retrospectively chosen between March 2005 and June 2014 from the Department of Gastrointestinal Oncology of the Peking University Cancer Hospital (Table 1). Sixty-five individuals in our study who had already been diagnosed with cancer had at least 1 direct lineal relative with carcinoma of the digestive system or breast/ovarian cancer; therefore these subjects were assumed to have a hereditary background of malignancy (Figure 1).

The risk of individual loci for early-onset gastric carcinoma
We investigated the SNPs MUC1 rs4072037, ZBTB20 rs9841504, PSCA rs2294008, PLCE1 rs2274223, PTGER4/PRKAA1 rs13361707, and TYMS rs2790. The Hardy-Weinberg equation was used to compare the observed and expected genotype frequencies (Supplementary Table 1). The frequencies of these loci in the general population were similar to those found by the Human Genome Project (Supplementary Table 2). Compared with the low-risk allele, the high-risk allele of SNP rs4072037 in MUC1, with a frequency of 89% in the group of all gastric cancer cases under 50 years of age, had a per-allele risk of 1.76 (95% CI 1.01-3.05, P = 0.045*) adjusted for sex and age in an unconditional logistic regression model (Table 2). Similarly, SNP rs9841504 in ZBTB20 had a per-allele risk of 2.21 (95% CI 1.20-4.05, P = 0.011*). However, for SNPs rs2294008, rs2274223, rs13361707, and rs2790, a more obvious difference was observed in a comparison between groups of gastric cancer patients with or without hereditary background in the allele-specific model (Table 2). Similar results were obtained in the codominant, dominant, and recessive models (Supplementary Table 3). According to the multiplicative polygenic model applied in breast cancer, [30] we calculated that these 6 variants together account for 32% of the genetic risk of gastric cancer (Supplementary Table 4; see Supplementary Materials for detail).
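As a rough illustration of what a multiplicative model implies, the following sketch combines per-allele odds ratios by multiplying them, which is equivalent to adding them on the log scale. It assumes that the loci act independently and without interaction, and the function name `combined_odds_ratio` is hypothetical. It is not the calculation behind the 32% estimate, which is described in the Supplementary Materials.

```python
import math


def combined_odds_ratio(per_allele_ors, risk_allele_counts):
    """Combine per-allele odds ratios under a multiplicative model.

    Each odds ratio is raised to the number of risk alleles carried at that
    locus (0, 1, or 2), and the results are multiplied together.
    Assumption: loci are independent and do not interact.
    """
    if len(per_allele_ors) != len(risk_allele_counts):
        raise ValueError("one risk-allele count is needed per odds ratio")
    log_or = sum(n * math.log(o) for o, n in zip(per_allele_ors, risk_allele_counts))
    return math.exp(log_or)


# Per-allele odds ratios reported in the text for
# MUC1 rs4072037 (1.76) and ZBTB20 rs9841504 (2.21).
ors = [1.76, 2.21]

one_each = combined_odds_ratio(ors, [1, 1])
print(round(one_each, 2))  # 3.89
```

Under these assumptions, a carrier of one risk allele at each locus would have an odds ratio of about 3.89 relative to a carrier of none. Estimates for combined genotypes that are obtained directly from the observed data need not equal this product.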

Subgroup analysis
In our subgroup analysis, we divided those SNPs into particular groups according to the Lauren classification and tumor location (Table 3, Supplementary Table 5). The SNPs rs9841504, rs2294008, and rs2790 increased the risk of noncardia gastric cancer, whereas rs2274223 increased the risk of cardia cancer. In contrast, rs4072037 increased the risk of diffuse-type gastric cancer, while rs2294008 increased the risk of intestinal-type gastric cancer. The age difference was not significant; however, rs9841504, rs2274223, and rs2790 increased the risk of gastric cancer in males.

Polygenic analysis
Building on the results of the allele-specific and subgroup analyses, we studied those SNPs polygenically. Because MUC1 rs4072037 and ZBTB20 rs9841504 increased gastric cancer risk in the whole population, they were used to predict the risk of gastric cancer among the Han Chinese. Those with the AA-GG and AA-GC genotypes had a 2.93- and 6.18-fold higher risk, respectively, compared with those who had only the GA-GG genotype (P = 0.0046 †; P = 0.0003 ‡) (Table 4). Similarly, PLCE1 rs2274223 and PTGER4/PRKAA1 rs13361707 were used to predict the risk of cardia cancer in populations with a hereditary background, who faced a greater than 3-fold higher risk (P < 0.05*) (Table 5). More interestingly, MUC1 rs4072037, ZBTB20 rs9841504, and TYMS rs2790 together were suggested to increase the risk of noncardia gastric cancer (mainly the diffuse type) by 5- to 8-fold (P < 0.05*) (Table 6).
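A comparison of combined genotypes against a reference genotype can be sketched as an unadjusted odds ratio with a Woolf (log-scale) 95% confidence interval. The counts below are invented purely for illustration and are not the study data, the function name is hypothetical, and the estimates reported in Tables 4 to 6 were additionally adjusted in logistic regression models, which this sketch does not do.

```python
import math


def odds_ratio_vs_reference(cases, controls, ref_cases, ref_controls, z=1.96):
    """Unadjusted odds ratio and Woolf 95% CI for one genotype group
    compared with a reference genotype group."""
    odds_ratio = (cases * ref_controls) / (controls * ref_cases)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / cases + 1 / controls + 1 / ref_cases + 1 / ref_controls)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# HYPOTHETICAL counts (cases, controls) per combined genotype; not study data.
counts = {
    "GA-GG": (10, 20),  # reference group
    "AA-GG": (60, 40),
    "AA-GC": (30, 10),
}

ref_cases, ref_controls = counts["GA-GG"]
for genotype in ("AA-GG", "AA-GC"):
    cases, controls = counts[genotype]
    or_, lo, hi = odds_ratio_vs_reference(cases, controls, ref_cases, ref_controls)
    print(f"{genotype}: OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

With small cell counts the Woolf interval becomes wide, which is one reason why combined-genotype estimates from a modest sample should be read with caution.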

DISCUSSION
Our study explored the field of hereditary gastric carcinoma in a polygenic way, and the multiplicative model showed the importance of genetic variants in early-onset gastric cancer susceptibility. By distinguishing high-risk from low-risk populations, we hoped to develop more economical and efficient screening programs, especially in developing countries with many different populations.
In the West, hereditary gastric cancer was first related to the CDH1 mutation. Since then, according to the guidelines of Oliveira and colleagues, [31] it has been suggested that diffuse familial gastric cancer or hereditary diffuse gastric cancer is similar to gastric cancer due to the CDH1 mutation. Because few families with the CDH1 mutation have been reported in Asian countries, less attention was paid there to possible hereditary factors than to environmental factors (such as Helicobacter pylori infection and personal lifestyle). However, different incidence rates of gastric cancer were observed under the same personal and environmental conditions, suggesting that low-penetrance genes other than CDH1 might play a role in gastric cancer susceptibility.
A number of genetic loci for gastric cancer susceptibility, such as MUC1 rs4072037, ZBTB20 rs9841504, PSCA rs2294008, PLCE1 rs2274223, and PTGER4/PRKAA1 rs13361707, were recently discovered by GWAS. [19][20][21] To avoid false-positive results, a large number of confirmation studies and meta-analyses followed. [22,23,25,32,33] However, the results have not always been consistent. Two crucial factors should be taken into account. First, populations of different ethnicities were enrolled and compared, which is of little value for risk prediction. The Human Genome Project pointed out that the frequencies of the major and minor alleles of the same SNP vary greatly among such populations. Because the baselines are not consistent across different populations, the role of each SNP in risk prediction should not be equally weighted. Second, the assignment of a variety of weights would induce different ages of onset when the existence of these SNPs was discovered. Because the loci used in our calculations were selected from previous studies and the percentages were consistent with the results of the Human Genome Project, the supportive evidence is strong. Although many studies related to gastric cancer polymorphisms have recently been published, the application of their results in clinical and preventive medicine remains to be explored. The reasons for this are complicated. Instead of affecting protein function in carcinogenesis directly, genes such as PSCA may inhibit the growth of differentiated epithelial cells [21]; furthermore, SNP studies are often conducted at the genetic level, which makes the relationship between a single molecule and changes in the stomach difficult to explain. Also, studies have shown large variations with the same SNP because of varying genetic backgrounds (e.g., involving ethnicity and gender). Our study showed that males were more susceptible to gastric cancer than females. Thus a better way to apply our findings clinically would be to classify each population in terms of the predicted percentile risk for individuals within that population.

* Odds ratios were adjusted for age and sex in unconditional logistic regression models. † Odds ratios were adjusted for age in unconditional logistic regression models. ‡ Odds ratios were adjusted for sex in unconditional logistic regression models.
www.impactjournals.com/oncotarget
We assumed a 40% reduction in gastric cancer risk with gastroscopic examination, but the reality is not that simple. We expect that a greater number of loci predicting the risk of gastric cancer will be discovered. The present risk estimation, however, still supports a reasonably accurate calculation. The true benefit depends on a complex interaction between absolute risk and the gastroscopy operator. In China, public awareness of gastric cancer in high-risk populations should be promoted. In addition, professional education would be necessary for the concept of a multiplicative genetic model to be accepted. If populations at varying risk were grouped, a special screening process (e.g., gastroscopy, abdominal CT, and/or PET-CT) to be administered at given intervals could be designed for different groups. However, we must still face the fact that most of the risk factors for gastric carcinoma are yet to be discovered. At least 2 different methods need to be improved. First, technological improvements in the detection of SNPs, with both higher sensitivity and specificity, will enable more reliable predictions. Second, and importantly, a large sample size is necessary, not only for confirmation but for new findings as well.
With the discovery of more susceptibility loci for gastric cancer in the future, our understanding of hereditary polygenic gastric cancer will become more complete. Disease prediction and prevention will enter a new era: the genetic era.

Inclusion criteria for study subjects
Data on the characteristics of study subjects (e.g., age, sex, and family history) were collected from the medical records. Histology was confirmed on the basis of biopsy specimens in the Department of Pathology at the same hospital. The gastric carcinomas were all adenocarcinomas. In this study, the inclusion criterion for diffuse-type gastric cancer was Lauren's diffuse type with poorly differentiated or signet-ring cell histology in the World Health Organization (WHO) classification or linitis plastica. The inclusion criterion for intestinal-type gastric cancer was Lauren's intestinal type with papillary, well-differentiated, or moderately differentiated histology by the WHO classification. The study was approved by the Ethics Committee of Peking University Cancer Hospital, and informed consent was obtained from all subjects.

Genotyping
Genomic DNA was extracted from venous blood with the QIAamp Blood Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions and stored at -20 °C for future use. The polymerase chain reaction (PCR) was used to perform the genotyping. [34] PCR was conducted on the GeneAmp PCR System 9700 Thermal Cycler (Applied Biosystems, Foster City, CA, USA). Each reaction had a total volume of 20 μL containing 2 μL genomic DNA (around 40 ng/μL), 2 μL 10x LA PCR buffer, 0.5 μL of each primer (10 μM), 2 μL 10 mmol/L dNTP, 0.2 μL Taq DNA polymerase (DRR200A, TAKARA), and 13.3 μL ddH2O. The cycling parameters were 94 °C for 5 minutes, followed by 35 cycles of 94 °C for 30 seconds, 57 to 62 °C (depending on the primers) for 45 seconds, and 72 °C for 20 seconds, and a final extension step at 72 °C for 7 minutes. The PCR products were checked by 2% agarose gel electrophoresis and sequenced by an Invitrogen 3730XL genetic analyzer. The sequencing results were analyzed with Chromas software under the condition of signal/noise > 98%.
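The arithmetic implied by the reaction mix and cycling program can be checked with a short sketch. It uses the stated 20 μL total volume, assumes that the 10 mmol/L dNTP figure is the stock concentration of the mix as added, and ignores thermal-cycler ramp times, so the run time is a lower bound. The helper name `final_concentration` is hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution formula C1 * V1 = C2 * V2, solved for the final concentration C2."""
    return stock_conc * stock_volume_ul / total_volume_ul


TOTAL_UL = 20.0  # total reaction volume stated in the text

primer_um = final_concentration(10.0, 0.5, TOTAL_UL)  # 10 uM stock, 0.5 uL each
dntp_mm = final_concentration(10.0, 2.0, TOTAL_UL)    # 10 mmol/L stock, 2 uL
dna_ng = 40.0 * 2.0                                   # ~40 ng/uL x 2 uL template

# Cycling program: initial denaturation, 35 three-step cycles, final extension.
initial_s = 5 * 60
per_cycle_s = 30 + 45 + 20
final_extension_s = 7 * 60
total_s = initial_s + 35 * per_cycle_s + final_extension_s

print(primer_um)                # 0.25 (uM per primer)
print(dntp_mm)                  # 1.0 (mmol/L)
print(dna_ng)                   # 80.0 (ng template DNA)
print(round(total_s / 60, 1))   # 67.4 (minutes, excluding ramp times)
```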

Statistical analysis
Statistical analysis was performed using the STATA 13 software package (StataCorp LP, College Station, Texas, USA). The Hardy-Weinberg equation was used to compare the observed and expected genotype frequencies.
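A Hardy-Weinberg comparison of observed and expected genotype frequencies can be sketched as a chi-square goodness-of-fit test with 1 degree of freedom. The analysis in this study was run in STATA; the Python function below is only an illustration with a hypothetical name and invented genotype counts, and it does not handle a monomorphic locus.

```python
import math


def hardy_weinberg_test(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of observed genotype counts against
    Hardy-Weinberg expectations. Returns (chi2, p_value)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# HYPOTHETICAL genotype counts (AA, AB, BB); not study data.
chi2, p_value = hardy_weinberg_test(81, 18, 1)
print(chi2 < 1e-9, p_value > 0.999)  # counts match expectation almost exactly
```

A large P value indicates no detectable departure from Hardy-Weinberg equilibrium, which is the usual precondition for the allele-based association models.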
The genotype distributions were compared with two-sided contingency tables using the χ2 test. The odds ratio (OR) and 95% confidence interval (CI) were calculated using an unconditional logistic regression model. A P value of less than 0.05 was considered significant.
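The χ2 comparison of genotype distributions between cases and controls can be sketched for a 2 x 3 table (two groups, three genotypes), which has 2 degrees of freedom. The function name and counts are hypothetical, and the sketch covers only the contingency-table test, not the adjusted logistic regression performed in STATA.

```python
import math


def chi_square_2x3(case_counts, control_counts):
    """Pearson chi-square test for a 2 x 3 genotype table.
    Returns (chi2, p_value) with 2 degrees of freedom."""
    if len(case_counts) != 3 or len(control_counts) != 3:
        raise ValueError("expected three genotype counts per group")
    rows = [case_counts, control_counts]
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    chi2 = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            chi2 += (observed - expected) ** 2 / expected
    # For 2 degrees of freedom the survival function is exp(-chi2 / 2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


# HYPOTHETICAL genotype counts (AA, AB, BB); not study data.
chi2, p_value = chi_square_2x3((30, 20, 10), (10, 20, 30))
print(round(chi2, 2), p_value < 0.05)  # 20.0 True
```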