The association of six polymorphisms of five genes involved in three steps of nucleotide excision repair pathways with hepatocellular cancer risk

Background Hundreds of single nucleotide polymorphisms (SNPs) of the genes encoding nucleotide excision repair (NER) proteins are involved in every step of the DNA recognition–unwinding–incision process, which may affect cancer risk. However, only a limited number of studies have examined the association of NER SNPs with hepatocellular cancer (HCC) risk. Results In the screening stage, single-locus analysis showed that six SNPs in five genes were associated with HCC risk, including three risk SNPs (XPA rs10817938, XPC rs1870134 and ERCC2 rs238417) and three protective SNPs (ERCC1 rs2298881 and rs3212961, and ERCC5 rs873601). In the verification stage, only XPC rs1870134 was verified to be associated with HCC risk (P = 4.7 × 10⁻⁴). Furthermore, multivariate logistic regression and multifactor dimensionality reduction (MDR) analysis consistently revealed a gene–gene interaction between the ERCC1 rs2298881 and XPC rs1870134 SNPs associated with HCC risk (Pinteraction = 0.023). When analyzing the effect of the positive SNP on mRNA expression, we found that the XPC rs1870134 GG genotype, which was associated with an increased HCC risk, showed lower XPC mRNA expression. Methods This study was designed as a “screening-verification” experiment and included a total of 1472 participants (570 HCC patients vs. 902 controls). We explored 39 SNPs in eight genes involved in NER pathways, including XPA, XPC, DDB2, ERCC3, ERCC2, ERCC1, ERCC4 and ERCC5, using the Sequenom MassARRAY and KASPar platforms. Eighty-six HCC tissues and the neighboring noncancerous tissues were subjected to measurement of the mRNA expression level of the promising gene. Conclusions The XPC promoter rs1870134 SNP and a SNP-SNP interaction were associated with HCC risk.


INTRODUCTION
Hepatocellular cancer (HCC) is the sixth most common type of cancer and the third most frequent cause of cancer death worldwide [1,2]. The incidence of HCC is associated with environmental and hereditary factors; therefore, the risk of developing the disease varies between individuals. To date, several single nucleotide polymorphisms (SNPs) in genes involved in oxidative stress, metabolism and inflammation pathways have been shown to be associated with HCC risk [3]. These SNPs are of great significance for selecting individuals who would benefit from specific preventative measures [3].
DNA repair systems include nucleotide excision repair (NER), base excision repair, mismatch repair, and double-strand break repair [4]. Among these repair systems, the NER system is most frequently associated with cancer [5]. Ishikawa et al. previously reported that the DNA repair system, especially the NER pathway, played a vital role in protection against human cancer [6].
The NER pathway is composed of DNA recognition-related proteins, including XPA, XPC, and DDB2; DNA unwinding-related proteins, such as XPB (ERCC3) and XPD (ERCC2); and DNA incision-related proteins, for instance ERCC1, XPF (ERCC4), and XPG (ERCC5). Hundreds of SNP variants of the genes encoding these NER proteins are involved in every step of the DNA recognition-unwinding-incision process, and may increase or decrease protein expression and function [7,8]. However, only a limited number of studies have examined the association of NER SNPs with HCC risk, although a few studies focused on single exon SNPs such as XRCC1 Arg399Gln, XRCC3 Thr241Met, and XPD Lys751Gln have been reported [9][10][11][12]. A meta-analysis investigated the association of NER SNPs with the risk of several kinds of cancer [13] but did not include hepatocellular cancer, possibly because few studies have addressed the association of NER SNPs with HCC risk. Thus, a systematic and comprehensive evaluation of the relationship between these SNPs and HCC risk is urgently required. Such an evaluation could provide a comprehensive understanding of the role of NER biological pathways in hepatocarcinogenesis, and could help to screen for the most significant functional SNP variants and potential biomarkers for predicting HCC risk.
In the present study, we adopted a candidate gene association strategy and selected 39 potentially functional tag SNPs (tagSNPs) in eight genes involved in NER pathways: XPA, XPC, DDB2, ERCC1, ERCC2, ERCC3, ERCC4 and ERCC5. We then determined whether these genes were associated with HCC in a second, verification stage, and for the promising SNPs we investigated their effects on the mRNA expression of the corresponding genes. We aimed to identify predictive biomarkers for HCC risk and to establish an experimental basis for improving understanding of the etiology and mechanism of HCC.
Among these 39 SNPs, six SNPs in five genes were associated with HCC risk in stage 1, including three risk SNPs (XPA rs10817938, XPC rs1870134 and ERCC2 rs238417) and three protective SNPs (ERCC1 rs2298881 and rs3212961, and ERCC5 rs873601). We further analyzed these promising SNPs and found that the XPA rs10817938 variant CC genotype showed an increased risk of HCC (odds ratio [OR] = 2.52 and 2.66, respectively; Table 1) when compared with the TT wild-type and the TT + TC genotype. The ERCC2 rs238417 variant CC genotype also showed an increased risk (OR = 1.77 and 1.33, respectively) under the allelic model. The XPC rs1870134 variant GG + GC genotype showed an increased risk of HCC (OR = 2.78) when compared with the CC genotype. By contrast, the ERCC5 rs873601 variant AA genotype had a decreased risk of HCC (OR = 0.58 and 0.59, respectively) when compared with the GG wild-type and under the recessive model. Two positive SNPs were identified in ERCC1, rs2298881 and rs3212961, which were associated with a decreased HCC risk (OR = 0.64 and 0.60, respectively; Table 1) when the heterozygote was compared with the wild-type.
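As a rough illustration of what an odds ratio measures, the following sketch computes an unadjusted OR with a Woolf-type 95% confidence interval from a 2×2 table. It is not the authors' method: the study reports ORs from multivariate logistic regression adjusted for age and sex, so this calculation would not reproduce the published values. The function name `odds_ratio_ci` and the genotype counts are hypothetical and are not taken from Table 1.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-based) confidence interval.

    "Exposed" here means carrying the genotype of interest (e.g. a variant
    genotype); "unexposed" means carrying the reference genotype.
    """
    cells = (case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed)
    if any(c <= 0 for c in cells):
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (case_exposed * ctrl_unexposed) / (case_unexposed * ctrl_exposed)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not study data).
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An OR above 1 indicates that the genotype is more frequent among cases than controls; an interval that includes 1 would not support an association at the conventional 5% level.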
In the verification stage, with the significance cut-off set at P < 0.00128, only the XPC rs1870134 GG genotype showed a significantly increased risk of HCC (P = 4.7 × 10⁻⁴, OR = 1.67) when compared with the CC + GC genotype. We merged the two stages for a meta-analysis, and again found that the XPC rs1870134 GG genotype showed a significantly increased risk of HCC (P = 0.001, OR = 1.45, Table 1).
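The cut-off of 0.00128 corresponds to dividing 0.05 by the 39 SNPs tested, a Bonferroni-style correction. A minimal sketch of that arithmetic, applied to the two P values quoted above, is given here; the helper names `bonferroni_cutoff` and `passes_cutoff` are illustrative only.

```python
def bonferroni_cutoff(alpha, n_tests):
    """Per-test significance cut-off: alpha divided by the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passes_cutoff(p_value, cutoff):
    """True when the P value is strictly below the corrected cut-off."""
    return p_value < cutoff


cutoff = bonferroni_cutoff(0.05, 39)
print(f"cut-off = {cutoff:.5f}")      # cut-off = 0.00128
print(passes_cutoff(4.7e-4, cutoff))  # verification stage: True
print(passes_cutoff(0.001, cutoff))   # merged analysis: True
```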
We also analyzed the association of the positive XPC rs1870134 SNP with the clinical features of HCC, including smoking, drinking, family history, HBV and HCV infection status and histopathological classification, but found no significant association (Supplementary Table S3).

The association of haplotype in NER pathway genes with hepatocellular cancer risk
Haplotypes with a frequency below 0.03 were excluded from the analysis. Six haplotypes in four genes were found to be associated with HCC risk.
Linkage disequilibrium data (D′ and r²) for these eight polymorphisms are shown in Supplementary Figure S1.
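The frequency filter described above (haplotypes below 0.03 excluded) can be sketched as follows. This is only an illustration of the exclusion rule; the haplotype labels and frequencies are hypothetical, and in the study the haplotype estimation itself was done with SHEsis rather than with code like this.

```python
MIN_HAPLOTYPE_FREQ = 0.03  # threshold stated in the text


def filter_haplotypes(frequencies, min_freq=MIN_HAPLOTYPE_FREQ):
    """Keep haplotypes whose estimated frequency is at least min_freq."""
    return {hap: freq for hap, freq in frequencies.items() if freq >= min_freq}


# Hypothetical haplotype frequencies, for illustration only (not study data).
example_frequencies = {
    "hap_1": 0.520,
    "hap_2": 0.310,
    "hap_3": 0.145,
    "hap_4": 0.025,  # below 0.03, so it is dropped
}

kept = filter_haplotypes(example_frequencies)
print(sorted(kept))  # ['hap_1', 'hap_2', 'hap_3']
```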

Differences of XPC rs1870134 gene mRNA levels in different genotypes in hepatocellular cancer and non-cancer tissues
Regarding the XPC mRNA expression level in non-cancerous and cancerous tissues, we found that XPC expression tended to decrease from non-cancer tissues to cancer tissues (1.50 ± 4.32 vs. 3.58 ± 25.77, Table 5), although the difference did not reach statistical significance (P = 0.456). We further explored the potential biological significance of the XPC rs1870134 polymorphism at the mRNA level (Table 5). In the cancerous group, XPC mRNA levels were significantly lower in subjects carrying the XPC rs1870134 GG genotype than in patients with the CC genotype (P = 5 × 10⁻⁵).

DISCUSSION
The NER pathway is divided into three steps: recognition, unwinding and incision. Briefly, in the recognition step, the XPC-RAD23B complex or UV-damaged DNA-binding protein 2 (DDB2) recognizes DNA damage. In the unwinding step, the helicase subunits XPB (ERCC3) and XPD (ERCC2) are activated and open the DNA duplex, and XPA is then recruited. XPB is a 3′ to 5′ translocase and XPD is a 5′ to 3′ translocase. In the incision step, XPD (ERCC2) remains in the 3′ region of the damaged DNA, and then XPG (ERCC5) cuts on the 3′ side while the ERCC1-XPF (ERCC4) complex cuts on the 5′ side. The damaged DNA is then repaired. Although several studies have reported an association between a single exon SNP in an NER gene (such as XRCC1 Arg399Gln) and HCC, none have presented a systematic and comprehensive analysis of polymorphisms in every step of the NER pathway genes and HCC risk. We preliminarily screened the NER pathway SNPs for HCC risk and identified three risk SNPs and three protective SNPs in five genes. These six positive SNPs comprised two in the recognition step (XPA rs10817938 and XPC rs1870134), one in the unwinding step (ERCC2 rs238417) and three in the incision step (ERCC5 rs873601, ERCC1 rs2298881 and rs3212961). We also identified one gene-gene interaction model (the ERCC1 rs2298881-XPC rs1870134 pair) associated with HCC risk. Further functional experiments confirmed that one positive SNP associated with HCC risk, XPC rs1870134, had an effect on XPC mRNA expression.

XPA rs10817938 and XPC rs1870134 in recognition step
XPA was the first human NER protein shown to bind preferentially to damaged DNA. It is also a zinc-binding protein with affinity for various types of DNA damage [14] that functions at the core of the NER system [15]. XPA is located on chromosome 9q22.3, and the promoter SNP rs10817938 that we studied is located −2718 bp from the transcription start site. This polymorphism has not previously been reported to be associated with the risk of any disease, but another XPA promoter SNP located at −4 bp might change XPA mRNA tertiary structure and stability, and might play a role in susceptibility to cancer [16]. The association between XPA polymorphisms and HCC risk is biologically plausible, since XPA plays an important role in the NER pathway and XPA protein defects were previously shown to lead to HCC susceptibility in mouse model experiments [6]. The XPC protein also plays an important role early in the DNA damage recognition step. It binds tightly to a distorted region and changes the structure of the DNA to allow other components of the repair apparatus to enter [15]. The XPC gene is located on chromosome 3q25, and the promoter SNP rs1870134 that we studied is located at +149 bp; it has not previously been reported to be associated with disease. In this study, we found that two promoter SNPs, one in XPA and one in XPC, were associated with HCC risk. As mentioned above, a promoter SNP could change the gene's function, including mRNA structure or stability and protein expression or function, which could explain our observed association of these two promoter SNPs with HCC susceptibility.
Because the XPC promoter SNP rs1870134 was confirmed in the verification stage, we further studied XPC mRNA expression. We found that XPC expression tended to be decreased in cancer tissues compared with non-cancer tissues, which suggests that the XPC protein functions as a protective protein in hepatocellular cancer. We then analyzed the effect of this SNP on XPC mRNA expression in order to clarify the possible mechanism underlying the polymorphism. We found that the GG genotype, which was associated with an increased HCC risk, showed lower XPC mRNA expression. Since XPC is a protective protein, the decreased mRNA expression in individuals carrying the risk GG genotype would lower the expression of this protective protein, which might be the mechanism underlying the high risk of the GG genotype. A similar study reported that the rs2298881 G allele of the ERCC1 gene, located in the 5′-flanking region, was associated with down-regulated protein expression, and a subsequent functional experiment showed that this SNP could decrease the gene's promoter activity and transcription factor binding activity [17]. Thus, further functional experiments, such as promoter activity and transcription factor binding activity assays, should be performed to clarify the associated mechanism.

ERCC2 rs238417 in unwinding step
ERCC2, also called XPD, is an unwinding protein in NER that unwinds DNA from 5′ to 3′ [15]. The ERCC2 gene is located on chromosome 19q13.3, and includes 21 SNP sites in the HapMap database. ERCC2 SNP rs238417 is located in intron 18, which is short (only 90 bp in length). Thus, any polymorphism within intron 18 could substantially change the secondary structure and mRNA stability; the rs238417 G-to-C variation changed the predicted minimum free energy by 3 kcal/mol, from stable to unstable, according to the RNAfold prediction software (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi). Peethambaram P et al. studied 11 SNPs in the ERCC2 gene and found that rs238417 was the polymorphism most significantly associated with the outcome of ovarian cancer [18]. In the present study, we demonstrated that the rs238417 variant CC genotype increased the HCC risk by 1.77-fold. As a previous study showed, an intronic polymorphism can also play an important role in mRNA splicing or protein expression [19]. Given that intron 18 is short, this polymorphism may be a functional SNP. These considerations might explain our observed association of rs238417 with HCC susceptibility. Further experiments are required to confirm this observation.

ERCC1 rs2298881 and rs3212961 in incision step
ERCC1 is an incision and repair protein that binds first to ERCC4 (XPF), then to ERCC3 (XPB). This ERCC1-ERCC4-ERCC3 complex cuts DNA from 5′ to 3′ [15]. The ERCC1 gene is located on chromosome 19q13.32, and any variation in ERCC1 might alter the ERCC1 protein, the ERCC1-ERCC4-ERCC3 complex and even the whole NER pathway. Yin J et al. studied several SNPs in ERCC1 and ERCC2 and found that both the rs2298881 and rs3212961 SNPs interacted with smoking in lung cancer patients [20]. We also found that ERCC1 rs2298881 and rs3212961 were significantly associated with HCC risk. Several studies have previously reported an association between the ERCC1 promoter SNP rs2298881 and cancer risk. Indeed, Yu et al. found that this rs2298881 SNP was associated with a decreased risk of lung cancer [17]. Another study showed that this SNP was associated with an increased prostate cancer risk under high fonofos exposure compared with controls [21]. In this study, we found that the ERCC1 promoter rs2298881 CA heterozygote was associated with a decreased HCC risk (OR = 0.64), which was consistent with the report by Yu et al. It was reported that the variant allele of rs2298881 down-regulated ERCC1 promoter activity and transcription factor binding activity, and thus decreased ERCC1 protein expression, with a subsequent decrease in cancer risk [17]. Another HCC-associated ERCC1 SNP, rs3212961, located in intron 3 immediately 3′ to exon 3, could affect transcript splicing, because ERCC1 has several splice variants. Although many studies have shown an association of this SNP with cancer risk, the results were inconsistent. Shen et al. found that rs3212961 decreased the risk of lung cancer under a dominant model [22], while others showed that it could increase the risk of bladder cancer under a dominant model [23], as well as colorectal cancer [24]. In this study, we demonstrated that rs3212961 decreased HCC risk by 0.66-fold under the dominant model, but the detailed underlying mechanism should be investigated, especially whether this polymorphism influences the expression of ERCC1 exon 3 or alternative splicing.

ERCC5 rs873601 in incision step
ERCC5 (XPG) is activated and binds the 3′ region of DNA damage. This is followed by ERCC2 and ERCC5 (XPG) binding. The ERCC5 rs873601 SNP is located in the 3′-UTR. A previous report showed no association of this SNP with the risk of esophageal squamous cell carcinoma [25]. In single-locus analysis, we found that this SNP could decrease HCC risk. We speculate that there are two possible mechanisms: 1) an effect through ERCC5 mRNA and protein, or 2) an effect through a third party. It is possible that this polymorphism affects protein expression and/or activity, or that it involves a miRNA that binds the 3′-UTR. Based on this result, we recently performed an mRNA expression experiment to further analyze the effects of the SNP genotypes on mRNA expression. The ERCC5 rs873601 variant genotypes associated with a decreased HCC risk showed higher ERCC5 mRNA expression (unpublished data). The mechanism by which the rs873601 SNP affects mRNA expression is not yet clear, so further functional studies are required to verify this finding.

Gene haplotype with HCC risk
Combined haplotype and LD association analyses for multiple SNPs are more sensitive and powerful than single-SNP analysis [26]. We showed that the DDB2 A-C-A-T haplotype of rs2029298-rs830083-rs3781619-rs326222 increased the HCC risk (OR = 2.29), while these SNPs alone had no significant association with HCC risk, indicating that the haplotype was more sensitive than a single SNP. We re-analyzed the positive SNPs detected in the single-site analysis of ERCC1, and observed that the two-site combination haplotype showed similar results to the entire six-site combination haplotype for ERCC1. This simplification showed that the ERCC1 C-C haplotype of rs2298881-rs3212961 increased HCC risk, while the ERCC1 A-C haplotype decreased HCC risk.

SNP-SNP interaction with HCC risk
One of the most significant findings of our study was the SNP-SNP interaction between the ERCC1 rs2298881 and XPC rs1870134 polymorphisms, which was consistently identified by two different statistical approaches: multivariate logistic regression and MDR analysis. We found that the P value for the four-way "ERCC1 rs2298881-XPC rs1870134-ERCC2 rs238417-ERCC5 rs873601" combination was more significant than that for the two-way interaction of "ERCC1 rs2298881 and XPC rs1870134", but the four-way interaction was not verified by multivariate logistic regression, which might be due to the larger number of subgroups producing rare genotype combinations. Several studies have shown that the combined effect of multiple SNPs in several genes in one or more relevant DNA repair pathways could have a greater impact on pathological phenotypes than SNPs in single genes [27]. We found that the OR of the "ERCC1 rs2298881 and XPC rs1870134" interaction was higher than the single-locus OR (OR interaction: 2.11 vs. OR XPC: 1.67), which suggests that this two-way interaction is a superior combination model for the prediction of HCC risk. As the mechanism linking these two SNPs is not yet clear, further functional studies are required to verify this finding.

Limitations
This study had some limitations. First, the sample size was limited, which restricted the subgroup analysis of variant genotypes and also limited the interaction analysis. Second, the HCC samples were obtained only from surgically resected patients and did not include patients undergoing radiotherapy or chemotherapy. However, this may also have avoided the heterogeneity that would be introduced by samples from patients receiving various treatments. Third, the expression of the promising XPC gene was studied only at the mRNA level, and protein expression should also be examined in future research.

Conclusion
In summary, we preliminarily explored gene polymorphisms in the NER pathway for predicting the risk of HCC. In the screening stage, six SNPs of five genes in three steps of the NER pathway were associated with HCC risk, including three risk SNPs (XPA rs10817938, XPC rs1870134 and ERCC2 rs238417) and three protective SNPs (ERCC1 rs2298881 and rs3212961, and ERCC5 rs873601). In the verification stage, only XPC rs1870134 was verified to be associated with an increased HCC risk. Furthermore, multivariate logistic regression and MDR

Statistical analysis
Between-group differences in sex distribution, as well as Hardy-Weinberg equilibrium, were assessed by the χ² test. Analysis of variance was used to compare age between groups. Multivariate logistic regression with adjustment for age and sex was used to assess the association between the selected gene polymorphisms and HCC risk. The haplotypes of each gene were analyzed using SHEsis software [34]. The best models of gene-gene interaction among the NER gene polymorphisms were identified using MDR software (version 3.0.2) and MDR permutation testing software (version 1.0 beta 2) [35]. The combined effect of the selected SNP-SNP interactions in the best model was determined by multivariate logistic regression adjusted for age and sex. Differences in relative mRNA levels between two groups were tested by Student's t-test. In the screening stage, a P value < 0.05 was considered significant. Significance values for the stage 2 analysis and the merged meta-analysis data were adjusted for multiple testing: the significance cut-off for the verification stage was P < 0.00128 (0.05 ÷ 39 = 0.00128).
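The Hardy-Weinberg check mentioned above can be illustrated with a goodness-of-fit χ² test on genotype counts. The text names only the χ² test, so the one-degree-of-freedom goodness-of-fit form used here is an assumption about the exact procedure, and the function name `hwe_chi_square` and the genotype counts are hypothetical. For one degree of freedom the P value can be obtained from the complementary error function, so the sketch needs only the standard library.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed counts of the two homozygotes and the
    heterozygote at a biallelic SNP. Returns (chi2, p_value) with 1 df.
    """
    n = n_aa + n_ab + n_bb
    if n <= 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control genotype counts, for illustration only (not study data).
chi2, p_value = hwe_chi_square(30, 50, 20)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above the chosen threshold (commonly 0.05) would indicate no detectable departure from equilibrium among the controls.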

accuracy of 0.6001 and the maximal CV consistency of 10/10 (significant test P = 0.0010, and P for permutation test = 0.0010-0.0020). The second interaction model selected was the two-factor model including ERCC1 rs2298881-XPC rs1870134, which yielded the highest testing accuracy of 0.5977 and the maximal CV consistency of 10/10 (significant test P = 0.0107, and P for permutation test = 0.0010-0.0020). To further validate the MDR results, we conducted analyses of both four-factor and two-factor models using

Table 4: The genotype combinations of the SNP-SNP interaction between two polymorphisms and the risk of hepatocellular cancer (a)
Note: (a) P for interaction was calculated by logistic regression adjusted for sex and age. CON: controls; HCC: hepatocellular cancer.