Association of four genetic variants with colorectal cancer in Kazakhstan population

This study searched for polymorphisms on chromosome 10 associated with colorectal adenocarcinoma in the Kazakhstan population. The study included 282 colorectal cancer (CRC) patients and 159 controls. SNPs were genotyped by QuantStudio 12K Flex PCR. Inheritance model analysis was performed for four significant SNPs. An increased risk of CRC was found in the log-additive model for rs10795668 (OR = 1.45, 95% CI: 1.05–1.99, p = 0.023), rs1035209 (OR = 1.79, 95% CI: 1.18–2.72, p = 0.003), and rs11190164 (OR = 1.67, 95% CI: 1.17–2.38, p = 0.004). A decreased risk of CRC was found in the log-additive model for rs10506868 (OR = 0.56, 95% CI: 0.37–0.85, p = 0.006). We detected SNPs that are associated with CRC risk in the Kazakhstan population.


INTRODUCTION
Worldwide, colorectal cancer (CRC) affects more than 1.9 million patients annually, and more than 935,000 die from the disease [1], which ranks CRC third by incidence and second by mortality among cancers. Incidence and mortality rates differ greatly between countries and regions. These differences can be regarded as a marker of socioeconomic development, since rates tend to rise with the human development index (HDI), presumably through changes in lifestyle factors and diet. Access to affordable healthcare is another factor that distinguishes countries.
In Kazakhstan, a country-wide colorectal screening program was started in 2011 and was followed by a gradual rise in incidence and a decline in mortality. The distribution of stages I-II, III, and IV among incident cases also changed over the years, with the disease predominantly detected at early stages [2].
According to Globocan (2020), Kazakhstan has the highest estimated age-standardized incidence (13.5 per 100 000) and mortality (7.5 per 100 000) rates in the South-Central Asia region [3]. This discrepancy could be explained by the lack of population-based screening programs in the region: most countries have no such programs, and some run only pilot projects.
Despite strong hereditary components, most cases of CRC are sporadic and develop slowly over several years.

Research Paper www.oncotarget.com
Genetic variation in the form of single nucleotide polymorphisms (SNPs) and environmental risk factors are currently considered to contribute to the etiology of CRC [4]. However, SNPs vary between ethnic groups and between individuals, each adding a share to the probability of developing the disease.
This study searched for polymorphisms associated with the development of colorectal adenocarcinoma in the Kazakhstan population. A significant number of polymorphisms associated with colorectal cancer are located on chromosome 10. The aim was to determine the association between SNPs on chromosome 10 and CRC risk in the Kazakhstan population.

Study participants
Selected characteristics of all subjects are presented in Table 1. We enrolled 441 participants, including 282 patients (149 males, 133 females) and 159 controls (83 males, 76 females). The mean age was 64.97 ± 10.77 years for cases and 46.59 ± 13.58 years for controls. All cancers were adenocarcinomas.

Hardy-Weinberg equilibrium and alleles of selected SNPs
A total of 12 SNPs were successfully genotyped in all participants, but not all of them were in Hardy-Weinberg equilibrium (HWE) in the control group. Four SNPs, rs11255841 (HWE p = 0.017), rs4948317 (HWE p < 0.001), rs704017 (HWE p = 0.031), and rs11196172 (HWE p = 0.045), did not meet the threshold of p ≥ 0.05 for HWE in controls and were excluded from further analysis. Table 2 lists the studied polymorphisms, their location and position on the chromosome, their possible functions in genes, and the significance level of the exact test for HWE in the control group.
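The HWE filter described above can be illustrated with a minimal sketch of an exact test, assuming the standard conditional probability of the heterozygote count given the allele counts. The function name `hwe_exact_p` and the genotype counts in the usage lines are hypothetical illustrations; this is not the SNPstats implementation and the counts are not data from this study.

```python
from math import exp, lgamma, log


def _log_fact(k):
    """Natural log of k! computed via the log-gamma function."""
    return lgamma(k + 1)


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact HWE p value from genotype counts (AA, AB, BB).

    Sums the probabilities of all heterozygote counts that are no more
    likely than the observed one, conditional on the allele counts.
    """
    n = n_aa + n_ab + n_bb          # number of individuals
    n_a = 2 * n_aa + n_ab           # copies of allele A
    n_b = 2 * n_bb + n_ab           # copies of allele B

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * log(2) + _log_fact(n)
                - _log_fact(hom_a) - _log_fact(het) - _log_fact(hom_b)
                + _log_fact(n_a) + _log_fact(n_b) - _log_fact(2 * n))

    # Heterozygote counts must share the parity of the allele count.
    hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = [exp(log_prob(h)) for h in hets]
    p_obs = exp(log_prob(n_ab))
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))


# Hypothetical control-group genotype counts, for illustration only.
p_balanced = hwe_exact_p(25, 50, 25)   # counts matching HWE expectations
p_skewed = hwe_exact_p(50, 0, 50)      # no heterozygotes at all
keep_snp = p_balanced >= 0.05          # exclusion rule used in the text
```

Under this sketch a SNP would be retained when the p value in controls is at least 0.05, which mirrors the exclusion rule stated above.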
Minor allele frequencies, effect alleles, and the associations between SNPs and CRC risk in the allelic model analysis are shown in Table 3.
Comparison of the minor allele frequency (MAF) distribution in the Kazakhstan population with other populations showed a distribution similar to the East Asian population for rs10795668 (Kazakhstan control group 0.39, EAS 0.37) and for rs4919687 (Kazakhstan control group 0.24, EAS 0.25). For rs1035209 (Kazakhstan control group 0.12) and rs10506868 (Kazakhstan control group 0.16), the MAF was close to that of the South Asian population (SAS 0.09 and 0.21, respectively). In two cases, rs12412391 (Kazakhstan control group 0.35) and rs11190164 (Kazakhstan control group 0.18), the MAF was close to that of the American population (AMR 0.33 and 0.17, respectively). For rs12241008, the MAF was 0.18 in the Kazakhstan control group, close to that of the African population (AFR 0.21).
We assumed that the minor allele of a SNP is not always a risk factor for CRC. For rs12412391, rs1035209, rs11190164, rs12241008, rs10506868 and rs1665650, the minor allele coincides with the effect allele. Therefore, in the allelic model and inheritance model analyses, the non-effect alleles were used as references [5][6][7][8][9][10].
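To make the role of the reference allele concrete, the sketch below computes an allelic odds ratio with the non-effect allele as reference. It assumes a Woolf (log-scale) 95% confidence interval, which is one common choice and is not stated in the text; the function name `allelic_or` and the allele counts are hypothetical, not data from this study.

```python
from math import exp, log, sqrt


def allelic_or(case_effect, case_ref, ctrl_effect, ctrl_ref, z=1.96):
    """Odds ratio for the effect allele versus the reference allele.

    Inputs are allele counts (two alleles per person). Returns the OR and
    an approximate 95% CI from the Woolf log-scale standard error.
    """
    odds_ratio = (case_effect * ctrl_ref) / (case_ref * ctrl_effect)
    se_log_or = sqrt(1 / case_effect + 1 / case_ref
                     + 1 / ctrl_effect + 1 / ctrl_ref)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, for illustration only.
or_value, ci_low, ci_high = allelic_or(case_effect=120, case_ref=444,
                                       ctrl_effect=40, ctrl_ref=278)
# An OR above 1 with a CI excluding 1 would indicate increased risk for
# the effect allele; swapping the reference allele inverts the OR.
```

Because the OR is defined relative to the reference allele, choosing the non-effect allele as reference keeps the direction of effect comparable across SNPs whether or not the effect allele is the minor allele.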
The following SNPs were found to be associated with CRC and its risk using the FDR adjustment for p values of χ²

Inheritance models
Inheritance model analysis was performed for the four polymorphisms that were significant in the allelic model analysis.
The data indicate a decreased risk of CRC in the log-additive model for rs10506868 (OR = 0.56, 95% CI: 0.37–0.85, p = 0.006).

DISCUSSION
In Kazakhstan, CRC ranks third among cancers by estimated age-standardized incidence and mortality rates in both females (14.3 and 7.6 per 100 000, respectively) and males (18 and 11.7 per 100 000, respectively) [3].
The genetic component of CRC is widely studied. According to published data, chromosome 10 carries polymorphisms associated with CRC [11,12].
We successfully genotyped 12 SNPs on chromosome 10 and found evidence that 4 of them (rs10795668, rs1035209, rs11190164 and rs10506868) are associated with variation in CRC risk according to inheritance models.
Few studies describe these SNPs; nevertheless, published research indicates that they play a role in modifying the risk of CRC.
These SNPs are likely involved in more sophisticated mechanisms of cancer development. A more detailed understanding of these patterns could contribute to a holistic view of the biological and genetic mechanisms associated with the initiation and progression of CRC.
The variant rs10795668 leads to aberrant expression of lncRNA by disrupting its regulatory region and modulating the binding of transcription factors to its promoter region. Through this mechanism, the expression level of target genes is regulated [13].
Recent research has related rs10795668 to a family history of CRC. A study by Gargallo et al. [14] showed that rs10795668 is associated with susceptibility to colorectal adenomas and that this association is modified by the presence of a family history of CRC. The authors demonstrate that, because of their low individual contribution, the identified genetic variants associated with CRC do not provide clinically relevant information by themselves, but that a combination of risk-associated alleles can increase the risk of CRC in a polygenic model [14,15].
More recently, some researchers have used genetic risk scores, which combine risk variants to predict CRC. Research in Korea used seven SNPs, including rs10795668, to develop a genetic risk score for prediction; the score was significantly associated with CRC and rectal cancer but showed no clear association with colon cancer [16].
Win et al. [17] observed that heterozygous and homozygous carriage of the G allele of rs10795668 was associated with decreased CRC risk only among PMS2 mutation carriers. The authors also pointed out that the direction of this effect was opposite to that observed in the general population [17]. A meta-analysis of genetic associations between SNPs and colorectal cancer risk published by Hong et al. [18] confirmed a significant association for a risk variant at rs6983267 (OR: 1.388, 95% CI: 1.180-1.8633), which had the highest OR in the homogeneous model. Our findings also indicate an increased risk of CRC in the presence of the G allele of rs10795668.
The variant rs1035209 is located in an intergenic region between NKX2-3 and SLC25A28. Lu et al. [19] demonstrated its role in regulating cell differentiation pathways in CRC and pointed out that an imbalance between proliferation and differentiation in colorectal cells can cause cancer [20]. Whiffin and coauthors identified rs1035209 at 10q24.2 as a novel CRC susceptibility SNP associated with CRC risk in Europeans at genome-wide levels of significance [21]. A recent large-scale GWAS by an international group on 70,506 East Asians (22,775 cases and 47,731 controls) identified the risk allele T of rs1035209 with an OR from 1.04 to 1.12 (p = 3.6 × 10⁻⁴) [22]. In our study of the Kazakhstan population, a pronounced risk of CRC was found in the presence of the effect allele T of rs1035209.
The variant rs11190164 is located in a region containing multiple genes, including SLC25A28, ENTPD7, COX15, CUTC, and ABCC2. A Genome-wide association study   [8]. Our results also revealed a significant association with increased risk of CRC in the presence of the G allele of rs11190164. Associations were found between rs11190164 and both rs1035209 and rs3740078, which were also significantly associated with CRC risk [21]. In addition, rs11190164 is in strong linkage disequilibrium with rs3740078 (r² = 0.71, CEU population; P = 3.2E-05), which causes a synonymous mutation in the ENTPD7 gene leading to intestinal epithelial inflammation [23].
The variant rs10506868 is located in intron 6 of the VTI1A gene, a fusion partner of the CRC susceptibility candidate gene TCF7L2 [9]. It was found to be associated with the risk of colorectal cancer in East Asians, and this association needs further evaluation in other ethnic groups [9]. In our study, the presence of the effect allele T of rs10506868 was associated with a decreased risk of colorectal cancer in the population of the Republic of Kazakhstan. The study by Zhang et al. revealed strong evidence of association with cancer risk for four variants (rs12241008, rs10506868, rs7086803 and rs11196067) in the VTI1A gene. The results demonstrate a weak correlation of rs10506868 with rs12241008 in Europeans (r² = 0.201), a moderate correlation in East Asians (r² = 0.647), and no correlation in Africans (r² = 0.014). These data support the view that the functional interactions of several SNPs associated with the risk of colorectal cancer may be opposite in different ethnic groups [24].
The study demonstrated a significant number of polymorphisms on chromosome 10 associated with colorectal cancer in the Kazakhstan population. However, the risk of CRC is known to vary among individuals depending on diet, lifestyle and hereditary factors [25]. According to the 2020 Global Nutrition Report [26], consumption of both red (59.9 g) and processed (19.3 g) meat in Kazakhstan exceeds both regional (25.8 and 5.4 g, respectively) and global values (31.4 and 12.2 g, respectively). The unequal distribution of CRC cases across the country could be related to differing dietary patterns in different parts of the country. According to Zhylkaidarova et al. [2], the northern, eastern, and central regions, which have higher mortality rates, also have frequent consumption of red meat. Dietary patterns in Western Kazakhstan include both meat and fish, with medium mortality rates. Low rates were found in southern regions with traditional consumption of fruits and vegetables.
Thus, our findings demonstrate the prevalence of four polymorphisms on chromosome 10 associated with sporadic CRC development in the Kazakhstan population. SNP interactions presumably vary between ethnic groups and, together with environmental and personal risk factors, may lead to alternative SNP patterns and functional roles in the pathology. The influence of diet and environment on the distribution of polymorphisms in the Kazakhstan population requires more detailed analysis.

MATERIALS AND METHODS
The study was performed with 282 CRC patients and 159 cancer-free controls living in the Karaganda Region, Kazakhstan. Each subject was interviewed by a doctor and signed written informed consent, after which participants donated 3-5 ml of venous blood. All participants underwent a general clinical examination at the outpatient stage, a rapid fecal occult blood test, and fibrocolonoscopy as part of the national screening. The clinical diagnosis of CRC was established according to ICD 10. Inclusion criteria: age 18-80 years with colorectal adenocarcinoma. Exclusion criteria: a hereditary form of CRC, somatic diseases in the stage of decompensation, the presence of genetic diseases.
The research protocol was approved by the Committee on Bioethics of KSMU (No. 305, dated May 19, 2017). For genetic research, patients' peripheral blood was collected into vacuum tubes with EDTA. Genomic DNA was extracted from whole blood using the commercial GeneJET Genomic DNA Purification Kit (Thermo Scientific, Vilnius, Lithuania) in accordance with the manufacturer's instructions. DNA concentration was measured using a NanoPhotometer (Implen, Germany). The isolated DNA was stored at −20°C.

SNP selection and genotyping
Selection and genotyping of SNPs were made using the Ensembl (https://www.ensembl.org/). We

Statistical analyses
Statistical analysis was performed using the online resource SNPstats (https://www.snpstats.net/) and R version 3.6.3. Based on the genotyping results, the percentages of major and minor alleles, the minor allele frequency (MAF), and the relative genotype frequencies were calculated for each polymorphism in both groups. Each SNP was tested for Hardy-Weinberg equilibrium (HWE) among controls using the exact test. Genotype distributions and allele frequencies were compared between groups using Pearson's χ² test. To estimate the association between chromosome 10 SNPs and CRC risk, we calculated odds ratios (ORs) and 95% confidence intervals (CIs). Differences were considered statistically significant at p < 0.05. Genotype-phenotype associations between SNPs on chromosome 10 and CRC were evaluated using multiple inheritance models (dominant, recessive, and log-additive). False discovery rate (FDR) adjustment was applied to p values.
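The text does not specify which FDR procedure was applied. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure, which is the most widely used FDR adjustment; the function name `bh_adjust` and the p values are hypothetical and are not results from this study.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in input order.

    Each raw p value is scaled by m / rank (rank in ascending order) and
    the results are made monotone by taking a running minimum from the
    largest p value downward.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four SNPs, for illustration only.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = bh_adjust(raw_p)
significant = [p < 0.05 for p in adj_p]
```

Under this assumption, a SNP would be reported as associated with CRC when its adjusted p value stays below 0.05.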