Contribution of MSMB promoter region gene polymorphism to early-onset prostate cancer risk in Mexican males

Sexually transmitted infections and their contribution to prostate cancer (PC) development have been relevant in different populations. The MSMB gene polymorphism rs10993994 has been associated both with PC and with susceptibility to sexually transmitted infections. To date, these associations have not been studied in Mexico, nor has it been examined whether sexually transmitted infections modify the association between MSMB and PC. Herein, socio-demographic features, sexually transmitted infection records, reproductive background, and genetic characterisation were analysed in 322 incident PC cases and 628 healthy population controls from Mexico City. Whole PC, early-onset PC (PC at < 60 years old), late-onset PC (≥ 60 years old), and PC aggressiveness were used to evaluate the contribution of the genetic variants to PC risk using unconditional logistic regression models. Overall, no association was found between the allelic variants of the rs10993994 polymorphism and whole PC or PC aggressiveness. However, TT genotype carriers showed a higher susceptibility to early-onset PC than CC+CT carriers (OR = 2.66; 95% CI = 1.41, 5.04; p = 0.003) under both codominant and recessive models. Although no association between whole PC and the MSMB gene polymorphism was found, our results are consistent with prior studies in populations of European descent, suggesting a contribution of rs10993994 to early-onset PC development.


INTRODUCTION
Prostate cancer (PC) is the second most common malignant neoplasm in males worldwide and the fifth leading cause of death from cancer in men [1]. Risk factors such as age, diet, sexually transmitted infections (STIs), smoking, and obesity have been associated with PC development [2,3]. In turn, ethnicity, family history and inherited gene changes also play a meaningful role in this cancer [4,5].
Polymorphisms in matrix metalloproteinase and vitamin D receptor genes, as well as those associated with androgen metabolism, have been critical traits in genotype-phenotype association studies [6][7][8]. Recent studies suggest that gene polymorphisms related to prostatic function may participate in prostatic carcinogenesis and seem to confer susceptibility to STIs [9,10]. Genome-wide association studies have reported a polymorphic variant located in the promoter region of the β-microseminoprotein (MSMB) gene (rs10993994; C>T) associated with higher PC risk [11][12][13][14]. The T allele has also been associated with the modification of transcription factor binding sites (i.e., cAMP response element-binding protein), which reduces the promoter activity of the MSMB gene.

Research Paper
Oncotarget, www.oncotarget.com
The T allele alters the production of MSMB, a prostate secretory protein of 94 amino acids (also named PSP94) and one of the most abundant proteins secreted by the prostate [15][16][17]. This protein, present in blood and seminal plasma, regulates prostate growth by inducing apoptosis; it also shows antimicrobial activity under the post-coital conditions of vaginal pH and low calcium concentration [18][19][20].
The Mexican population presents a complex genetic architecture in which the gene frequencies of rs10993994 are still unknown. As far as we know, this genetic variant has not been studied in terms of PC association in Latino populations. In the present study, the genotype-phenotype association between rs10993994 and whole PC, early-onset PC (PC at < 60 years old), late-onset PC (≥ 60 years old), and PC aggressiveness was evaluated. We also evaluated whether a history of STIs modifies this association.

RESULTS
Regarding sexual history, 15.82% of the men reported at least one STI event. Gonorrhoea was the most frequent (10.4%), whereas chancre (1.58%), acquired syphilis (1.05%), genital warts (1.05%) and herpes (0.95%) were the least frequent. Notably, a history of at least one STI of any kind was more frequent in cases than in controls (OR = 2.55; 95% CI = 1.79-3.64; p ≤ 0.01); a history of gonorrhoea in particular (OR = 3.80; 95% CI = 2.47-5.86; p ≤ 0.01) was associated with a higher frequency of PC. Having six or more lifetime sexual partners was associated with about twice the frequency of PC (Table 1).

Genetic statistical analysis
Allele and genotype frequencies were similarly distributed between cases and controls. Likewise, a significant Hardy-Weinberg departure (HWD) related to homozygote excess was found both in cases (FIS = 0.112; p = 0.03) and in controls (FIS = 0.125; p = 0.002). Among controls, the T allele frequency differed according to birthplace (Supplementary Table 1); these differences were most prominent in the Central-West (10%) and South states (4.3%).
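The homozygote excess described above is usually summarised by the inbreeding coefficient, FIS = 1 - Hobs/Hexp, where Hobs is the observed heterozygosity and Hexp the heterozygosity expected under Hardy-Weinberg proportions. The following is a minimal Python sketch for a biallelic locus. The function names and the genotype counts are hypothetical and are not the study's data; Arlequin, which was used in the study, may apply additional corrections and permutation-based tests that are not reproduced here.

```python
def allele_freq_t(n_cc: int, n_ct: int, n_tt: int) -> float:
    """Frequency of the T allele computed from genotype counts."""
    n = n_cc + n_ct + n_tt
    return (2 * n_tt + n_ct) / (2 * n)


def f_is(n_cc: int, n_ct: int, n_tt: int) -> float:
    """Inbreeding coefficient F_IS = 1 - H_obs / H_exp for a biallelic locus.

    Positive values indicate homozygote excess (heterozygote deficit)
    relative to Hardy-Weinberg expectations; negative values the opposite.
    """
    n = n_cc + n_ct + n_tt
    q = allele_freq_t(n_cc, n_ct, n_tt)
    h_obs = n_ct / n              # observed heterozygosity
    h_exp = 2 * q * (1 - q)       # expected heterozygosity under HWE
    return 1 - h_obs / h_exp


# Hypothetical genotype counts, chosen for illustration only.
n_cc, n_ct, n_tt = 300, 240, 88
print(round(allele_freq_t(n_cc, n_ct, n_tt), 3))
print(round(f_is(n_cc, n_ct, n_tt), 3))
```

A sample in exact Hardy-Weinberg proportions gives FIS = 0, and a sample with no heterozygotes gives FIS = 1.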
Regarding the relationship between the T allele and STIs, relevant differences in frequency were found in men with a history of herpes. Carriers of two copies of the T allele presented a significantly higher, about seven-fold, frequency of herpes history than CC+CT genotype carriers (2.56 vs 0.36; p = 0.02). Nonetheless, given the small number of individuals with this condition (just four men reported a history of herpes), these results should be interpreted with caution. The remaining STIs did not show a phenotype-genotype interaction.
On the other hand, no association was found between the risk allele (T) and whole PC (Table 2) or PC aggressiveness (Table 3). We also did not observe an interaction between STI history and the risk allele T (Supplementary Table 2).
Under recessive and codominant inheritance models, carriers of the TT genotype had almost three times the odds of early-onset PC (OR = 2.66; 95% CI = 1.41, 5.04; p = 0.003) compared with the rest of the genotypes (Table 4). This association persisted even after considering variables such as age at interview, birthplace, body mass index (two years before PC diagnosis or interview), and family history of PC.
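As a consistency check, an odds ratio and its 95% confidence interval can be converted back into an approximate Wald p-value. The sketch below assumes the interval was built as exp(log(OR) ± 1.96 × SE), which is the usual output of logistic regression software but is not stated explicitly in the text; the function name is illustrative.

```python
import math


def wald_p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Approximate two-sided Wald p-value from an OR and its 95% CI.

    Assumes the interval is symmetric on the log scale:
    exp(log(OR) +/- 1.96 * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Values reported for the TT genotype and early-onset PC.
p = wald_p_from_or_ci(2.66, 1.41, 5.04)
print(f"{p:.4f}")
```

With the reported estimate and interval, the Wald statistic is close to 3, which corresponds to a p-value of about 0.003.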

Comparisons with other populations
The rs10993994 gene frequencies were compared with 1000 Genomes data from related descendant populations: Mexican ancestry in Los Angeles, California (MXL); Utah residents with Northern and Western European ancestry from the Centre d'Etude du Polymorphisme Humain collection (CEU); Iberian population in Spain (IBS); and Yoruba in Ibadan, Nigeria (YRI). The comparison depicted a complex genetic architecture. The first dimension separated Cases, Controls, and MXL from the rest of the populations; the second set apart Cases, Controls, and YRI from MXL, IBS, and CEU, suggesting a possible ancestral relation with African descendants (Figure 1).

Y-chromosome analysis
The risk allele of the rs10993994 locus has previously been related to elevated PC risk, mainly in African-derived populations. Thus, a possible relation between ancestral lineages and early-onset PC was also examined. European (37%; G2a, I2b1, and R1b) and Native American (38%; Q lineage) ancestries were found in similar proportions. North African and Middle Eastern lineages (E1b1b, J1, J2a1 x J1a1-bh, J2a1h, J2b, and T; Table 5) were present in 25% of the early-onset PC cases. The results were compared with a previous report on the Central Valley of Mexico (CVM) population [32]; the proportions in the two populations (early-onset PC in the present study vs CVM) were analogous.

DISCUSSION
Herein, a susceptibility pattern between the TT genotype and early-onset PC development was found under a recessive heritability model, independent of family inheritance (PC history in first-degree relatives) and STI history. In contrast to prior reports, no association was found between the rs10993994 polymorphism and whole PC or PC aggressiveness in Mexican males. Of note, the studies in which whole PC was associated with the rs10993994 polymorphism showed only a subtle contribution despite their substantial sample sizes. Likewise, it is worth mentioning that this association was found in European-derived men, in whom the T allele frequency is the most prominent. Thus, it is likely that the lack of association in our study is related to its modest sample size.
Studies reporting an association between the risk variant and age at cancer diagnosis have been scarce to date, and most of them considered only dominant and additive inheritance models. However, our findings are consistent with some studies in European-derived populations in which the T allele was associated with early-onset PC (< 55 years old) [13,33].
Likewise, a study in a Scottish population in which age at diagnosis was not considered found that PC risk was about two-fold higher in TT carriers than in CC+CT genotypes (OR = 1.87; 95% CI = 1.26-2.77), using models identical to those used herein [28]. Notably, akin to the results obtained herein, that study did not find any association using the dominant model.
In relation to PC aggressiveness, the studies of Fitzgerald et al. [23] and Chang et al. [22] suggested a possible association between the T allele and low-grade PC (Gleason ≤ 6). Nevertheless, our findings did not show any association with PC aggressiveness, regardless of the heritability model used. In contrast to reports from other populations worldwide, where population screening programmes exist and the low-grade PC frequency ranges from 50 to 84%, the low-grade PC frequency observed in Mexico was low (26%). Hence, it is possible that the association between the MSMB polymorphism and low-grade PC observed in those populations is a proxy for age at diagnosis. Moreover, the results of Stott-Miller et al. [26] supported that the increase in PC among T allele carriers was independent of both family history of PC and personal STI history, which is consistent with our results. On the other hand, the concordance of the phenotype-genotype association across different ethnicities, as well as the congruence in gene frequencies with other Hispanic populations, reinforces our results. However, the similar gene distributions in cases and controls, as well as the HWD related to homozygote excess, should not be overlooked. Of note, this inbreeding was not related to the risk allele. This remarkable inbreeding could be associated with the youth of the Mexican mestizo population, which spans at most fifteen generations [34]. Populations such as IBS and even MXL presented FIS values related to a homozygote deficit; CEU and YRI displayed genetic homogeneity at this locus.
Previous studies in the Mexican Mestizo population [32] have reported small proportions of African-derived lineages (7%). These findings contrast with the African proportions found in the early-onset PC group (25%). This finding is consistent with prior studies in which the macro-haplogroup DE and sub-haplogroups within E1b1b (i.e., E1b1b1c) were associated with PC in Japanese and Ashkenazi populations, respectively [35,36]. Although these lineages have exhibited higher risk than others in previous findings, our results should be interpreted with caution given that they represent only a subsample of 49 unrelated individuals.
The decrease in PSP94 production resulting from the T variant has been well documented in previous reports and across different ethnicities [10]. PSP94 has been related to prostate tumour growth suppression, showing lower expression in advanced PC stages such as androgen-refractory cases [37]. Given that the association was related to early-onset PC, it is likely that this mechanism does not explain our results. Instead, the T allele might alter the antimicrobial activity of PSP94, contributing to chronic inflammatory processes in the prostate [19]. Nonetheless, this mechanism has not been elucidated yet.
Like any other study, this one has weaknesses that should be considered, such as the possibility of misclassification of STI history, because it was evaluated using a questionnaire. This strategy, recurrent in different studies, could skew the reported history of STIs. In particular, STIs caused by protozoa (e.g., trichomoniasis), viruses (e.g., human papillomavirus), and fungi and yeast (C. albicans) can remain silent or present unspecific symptoms [19,38]. This possible bias could reduce the power to detect an interaction between the rs10993994 polymorphism and STIs (both overall and individually). Besides, previous reports have suggested that the antimicrobial capability of MSMB could be specific to the biological agent. In this setting, a marginal association between the homozygous state of the T allele and a history of genital herpes was found. However, the sample size did not allow us to detect a possible interaction between the gene polymorphism and herpes history.
Other sources of bias were the admixture of the Mexican population and linkage disequilibrium (LD). On the one hand, the FIS found was related to inbreeding rather than to heterozygote excess, which is one of the main signatures of population stratification [39]. Likewise, the AMOVA test did not show any difference between cases and controls (p ≥ 0.05); the primary source of variation was within individuals, probably related to the birthplaces of the men studied. Adjusting for the birthplace variable might diminish this bias significantly. In light of this evidence, birthplace has been considered a valid proxy for ancestry markers [40]. Nevertheless, correction for population stratification using ancestry markers is the best strategy to avoid population stratification bias [41]; consequently, residual confounding is likely in our results. However, examining the association of the risk allele with the ancestral lineages could be a suitable approach to avoid spurious associations. On the other hand, the LD found between rs10993994 and different polymorphisms on 10q11 (i.e., rs7071471, rs7081532, rs11006207, rs3123078, rs7075697, rs7077830, rs2611489) could also introduce bias. These genetic markers have not yet been studied in the Mexican population; therefore, it is not possible to rule out this confounding factor. However, studies in Asian and European populations suggest that the genetic marker evaluated herein is the one most strongly related to PC (OR = 1.64; 95% CI = 1.47-1.82), supporting our findings. Hence, our results should be interpreted in light of these limitations.
To our knowledge, this is the first study in which the T variant of the rs10993994 polymorphism and its relation to PC have been evaluated in a Latino population. Our findings suggest that, under recessive and codominant heritability models, the rs10993994-T allele could contribute to early-onset prostate cancer development in Mexican men. Although the findings of the present study are consistent with previous reports among different ethnicities, further population-based genetic studies in a larger sample of Mexican men (Native Americans and Mestizos) should be done. These studies should improve the measurement of STIs and confirm the risk allele frequency while avoiding type II statistical errors. The results found herein could inform further PC analyses, in which age at onset ought to be considered. Ultimately, our study could also reinforce the ethnic and even inter-ethnic variation, which was exhibited, indirectly, through the dissimilar frequencies found between geographic regions (Central-West vs South regions), possibly indicative of a genetic susceptibility.

MATERIALS AND METHODS

Study subject selection
A population-based case-control study was carried out from November 2011 to August 2014, including unrelated men resident in Mexico City without any record of other types of cancer (age range 42-94 years old). Cases were subjects with an incident PC diagnosis identified at six specialty hospitals (four third-level hospitals and two second-level hospitals), without any restriction regarding clinical stage. All cases were histologically confirmed and, based on the Gleason score [30] at diagnosis, were classified according to aggressiveness as: well differentiated, ≤ 6; moderately differentiated, = 7; and poorly differentiated, ≥ 8. Individuals whose PC was diagnosed before the age of 60 were designated as early-onset PC; if diagnosis occurred at or after the age of 60, it was considered late-onset PC. Out of 468 eligible cases, 402 agreed to participate (85.9%). Each case was matched by age (± 5 years) with two healthy men without any history of PC, symptoms associated with probable benign or malignant prostate disease (e.g., hematuria, dysuria, among others), or a previously reported prostate-specific antigen (PSA) result ≥ 4 ng/mL. Using the Master Sampling Frame (National Health Survey), we identified 920 potential controls, and 805 of them agreed to participate (85.5%). For control identification, 33 basic geo-statistical areas (BGAs) were selected. From each BGA in the sample, ten blocks were chosen; finally, starting from the northeast corner of the block, we visited each household to determine whether a male met the eligibility criteria, and if there were two or more eligible subjects, one was randomly chosen. Based on those subjects with a DNA sample, our final sample size was 322 cases and 628 controls, with a control-to-case ratio of 1.96.
All participants signed an informed consent letter approved by the Ethics Committee of the Mexican National Institute of Public Health (CI: 980); this study was conducted in agreement with the principles established by the Declaration of Helsinki.
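The case classification rules described in this section (Gleason-based aggressiveness and age at onset) can be sketched as two small Python functions. This is only an illustrative sketch: the function names are hypothetical, and the validation range for the Gleason score (2 to 10) is an assumption based on the conventional scale rather than something stated in the text.

```python
def gleason_category(score: int) -> str:
    """Aggressiveness category from the Gleason score at diagnosis."""
    if not 2 <= score <= 10:  # assumed valid range of the Gleason score
        raise ValueError("Gleason score must be between 2 and 10")
    if score <= 6:
        return "well differentiated"
    if score == 7:
        return "moderately differentiated"
    return "poorly differentiated"


def onset_category(age_at_diagnosis: float) -> str:
    """Early-onset if diagnosed before age 60, otherwise late-onset."""
    if age_at_diagnosis < 60:
        return "early-onset"
    return "late-onset"


# Hypothetical case: Gleason 7 diagnosed at 58 years of age.
print(gleason_category(7), "|", onset_category(58))
```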

Face-to-face interviews
Face-to-face interviews were carried out by highly trained staff. Socio-demographic features (i.e., age, place of birth, occupation, and schooling), first-degree family history of PC, as well as STI, physical activity, and smoking histories were determined. Regarding place of birth, six geographic regions were studied (Figure 2). History of chronic diseases (i.e., type 2 diabetes, hypercholesterolemia, and hypertension) was also recorded in all subjects as presence or absence. Sexual history information included age at initiation of sexual activity, sexual relationships with sex workers and/or males, and the number of sexual partners, which was categorized according to its tertile distribution among controls. The number of episodes and the age at onset of STIs such as gonorrhoea, syphilis, genital warts, herpes or chancroid ulcers were also recorded.
Smoking information (age of initiation, frequency and number of cigarettes) was collected for three life stages (≤ 20 yr, 21-30 yr, and ≥ 31 yr). If participants did not smoke at the time of the interview, the age at which they quit smoking was also obtained. According to the smoking index, calculated for each life stage, two smoking patterns were identified [31]: pattern A, individuals with low but constant smoking intensity, and pattern B, individuals whose smoking intensity increased from 30 years old.
A similar procedure was followed regarding the history of physical activity. Through a validated
Anthropometric measurements such as weight (kilograms, kg), height (metres, m), and abdominal circumference (centimetres, cm) were taken at interview. Because body mass index (BMI) could have been affected by the disease under study, this variable was estimated based on the reported weight two years before diagnosis or interview. Dietary habit surveys were obtained taking as time frame the three years before diagnosis for PC cases and the three years before the interview for controls.
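For reference, BMI is weight in kilograms divided by the square of height in metres. The sketch below assumes that the reported weight from two years earlier is combined with the height measured at interview; the text does not state which height value was used, and the participant values are hypothetical.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical participant: weight reported for two years before the
# interview, height measured at interview (assumed combination).
print(round(bmi(weight_kg=80.0, height_m=1.70), 1))
```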
Allelic discrimination of the rs10993994 polymorphism was performed using a TaqMan Genotyping assay (Applied Biosystems, Carlsbad, CA, USA) on a ViiA 7 Real-Time PCR System (Applied Biosystems, Carlsbad, CA, USA) following the manufacturer's instructions. All experiments were performed in duplicate.
To evaluate the possible contribution of ancestral background to PC development, Y-chromosome haplogroups were determined in 49 early-onset PC individuals. The haplotypes were determined from 17 hypervariable markers with the Y-Filer kit (Applied Biosystems, Carlsbad, CA, USA); the resulting amplicons were analysed by capillary electrophoresis (ABI 3130XL Genetic Analyzer, Applied Biosystems, Carlsbad, CA, USA). The haplogroups were assigned using a fitness score; the probability of each ancestral lineage was estimated using the haplogroup predictor software (http://www.hprg.com/hapest5/).

Statistical analysis
Cases and controls were compared according to selected characteristics using the chi-square test or Student's t-test, depending on the variable. Allelic and genotypic frequencies of the MSMB polymorphism, Hardy-Weinberg expectations, and analysis of molecular variance (AMOVA) were obtained with Arlequin v3.5. Multidimensional scaling (MDS) plots were obtained using SPSS v11; this analysis included related ancestral populations obtained from the 1000 Genomes Project (http://www.internationalgenome.org/).
The genetic associations (allelic and genotypic) between the MSMB polymorphism and PC were evaluated using unconditional logistic regression; the genotypic association was evaluated using dominant (CC vs CT+TT), co-dominant (CC vs CT and vs TT), and recessive (CC+CT vs TT) models. Independent models were used to determine the association between the risk variant and PC aggressiveness (well, moderately, and poorly differentiated), as well as between the risk variant and age at onset (early- and late-onset).
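The three inheritance models differ only in how each genotype is coded before it enters the regression. The following Python sketch shows one common indicator coding with CC as the reference; the function name and the tuple representation are illustrative assumptions, not the coding scheme actually used in STATA for this study.

```python
from typing import Tuple


def encode_genotype(genotype: str, model: str) -> Tuple[int, ...]:
    """Indicator coding of an rs10993994 genotype (C = reference, T = risk).

    dominant:   (1,) if CT or TT, else (0,)
    recessive:  (1,) if TT, else (0,)
    codominant: (is_CT, is_TT), with CC as the reference category
    """
    g = genotype.upper()
    if len(g) != 2 or set(g) - {"C", "T"}:
        raise ValueError(f"unexpected genotype: {genotype!r}")
    t_count = g.count("T")
    if model == "dominant":
        return (int(t_count >= 1),)
    if model == "recessive":
        return (int(t_count == 2),)
    if model == "codominant":
        return (int(t_count == 1), int(t_count == 2))
    raise ValueError(f"unknown model: {model!r}")


for gt in ("CC", "CT", "TT"):
    print(gt, {m: encode_genotype(gt, m) for m in ("dominant", "codominant", "recessive")})
```

Under this coding, the heterozygote CT counts as exposed in the dominant model but as unexposed in the recessive model, which is why an effect confined to TT carriers can be diluted under the dominant model.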
Age (at interview) was included as a continuous variable in the bivariate and multivariate models. We evaluated the following features as potential confounders: birthplace, first-degree family history of PC, history of chronic diseases and STIs, smoking habits, BMI two years before diagnosis or interview, and physical activity. In the final models for whole PC, only birthplace and first-degree family history of PC remained; the early-onset PC model was also adjusted for BMI. Effect modification between history of STIs and the allelic variants (rs10993994) was evaluated by including an interaction term between both variables; p ≤ 0.10 was considered statistically significant. All analyses were carried out using the STATA v14 software package (StataCorp, College Station, TX, USA).
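An unadjusted analogue of the interaction test described above can be sketched with the standard library alone. For two binary variables, the interaction coefficient of a saturated logistic model equals the log of the ratio of the stratum-specific odds ratios, and its Wald standard error is the square root of the summed reciprocals of the eight cell counts. The sketch below ignores the covariates used in the study (age, birthplace, family history, BMI), and all counts are hypothetical.

```python
import math
from typing import Tuple

# (exposed cases, exposed controls, unexposed cases, unexposed controls)
Table = Tuple[int, int, int, int]


def log_or_and_se(table: Table) -> Tuple[float, float]:
    """Log odds ratio and its Woolf standard error for a 2x2 table."""
    a, b, c, d = table
    if min(table) <= 0:
        raise ValueError("all cells must be positive")
    return math.log((a * d) / (b * c)), math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


def or_with_ci(table: Table, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a Wald confidence interval (95% by default)."""
    log_or, se = log_or_and_se(table)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def interaction_p(table_sti: Table, table_no_sti: Table) -> float:
    """Two-sided Wald p-value comparing the genotype OR between STI strata."""
    log_or_1, se_1 = log_or_and_se(table_sti)
    log_or_0, se_0 = log_or_and_se(table_no_sti)
    z = (log_or_1 - log_or_0) / math.sqrt(se_1 ** 2 + se_0 ** 2)
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: (TT cases, TT controls, CC+CT cases, CC+CT controls).
with_sti = (12, 8, 70, 60)
without_sti = (30, 50, 210, 510)

print(or_with_ci(with_sti))
print(or_with_ci(without_sti))
p_int = interaction_p(with_sti, without_sti)
# The study's threshold for the interaction term was p <= 0.10.
print(p_int, "flagged" if p_int <= 0.10 else "not flagged")
```

The more permissive threshold of p ≤ 0.10 is commonly used for interaction terms because such tests have lower power than tests of main effects.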

Author contributions
Silvia Juliana Trujillo-Cáceres: performed the statistical analysis of the data and wrote the first draft of the manuscript. Luisa Torres-Sánchez: conceptualized the study and participated in the interpretation of data and the writing of the manuscript. Rocío Gómez: participated in the interpretation of data and the writing of the manuscript. Ana I. Burguete-García, Yaneth Citlalli Orbe-Orihuela, Ruth Argelia Vázquez-Salas, Esmeralda Álvarez-Topete: made a critical revision of the manuscript.