Transcriptomic features of primary prostate cancer and their prognostic relevance to castration-resistant prostate cancer

Although various mechanisms of castration-resistant prostate cancer (CRPC) have been discovered, reliable biomarkers for monitoring CRPC progression are lacking. We sought to identify molecules that predict the progression of advanced prostate cancer (AdvPC) into CRPC. The study used primary-site samples (N=45 for next-generation sequencing (NGS); N=243 for real-time polymerase chain reaction) from patients with prostate cancer (PC). Five public databases containing microarray data of AdvPC and CRPC samples were also analyzed. The NGS data showed that each progression step in PC was associated with a distinct gene expression profile. Androgen receptor (AR) was associated with tumorigenesis, advanced progression, and progression into CRPC. Analysis of the paired and unpaired AdvPC and CRPC samples in the NGS cohort showed that 15 genes were associated with progression into CRPC. This was validated by cohort-1 and public database analyses. Analysis of the third cohort with AdvPC showed that higher serine peptidase inhibitor, Kazal type 1 (SPINK1) and lower Sp8 transcription factor (SP8) expression were associated with progression into CRPC (log-rank test, both P<0.05). Multivariate regression analysis showed that higher SPINK1 (hazard ratio (HR)=4.506, 95% confidence interval (CI)=1.175–17.29, P=0.028) and lower SP8 (HR=0.199, 95% CI=0.063–0.632, P=0.006) expression independently predicted progression into CRPC. Gene network analysis showed that CRPC progression may be mediated through the AR-SPINK1 pathway by an HNF1A-based gene network. Taken together, our results suggest that SPINK1 and SP8 may be useful for identifying patients with AdvPC who have a higher risk of progressing to CRPC.
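The reported hazard ratios, confidence intervals, and P values can be cross-checked against one another. The following is a minimal sketch that assumes the intervals are Wald-type (symmetric on the log-HR scale) and that the P values come from a two-sided normal approximation; the source does not state this, and the function name `wald_p_from_hr` is hypothetical.

```python
import math
from statistics import NormalDist


def wald_p_from_hr(hr, ci_low, ci_high, level=0.95):
    """Recover the Wald z-statistic and two-sided P from an HR and its CI.

    Assumption (not stated in the source): the CI was computed as
    log(HR) +/- z_crit * SE, i.e. it is symmetric on the log scale.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Values taken from the abstract.
spink1_z, spink1_p = wald_p_from_hr(4.506, 1.175, 17.29)
sp8_z, sp8_p = wald_p_from_hr(0.199, 0.063, 0.632)

print(f"SPINK1: z={spink1_z:.2f}, P={spink1_p:.3f}")
print(f"SP8:    z={sp8_z:.2f}, P={sp8_p:.3f}")
```

Under these assumptions the recovered P values agree with the reported 0.028 and 0.006 to about three decimal places, which suggests that the reported statistics are internally consistent.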


Public gene-expression datasets used for validation
Analysis of the NGS cohort and the four paired advanced prostate cancer (PC):castration-resistant PC (CRPC) NGS samples showed that 12 genes were differentially expressed between advanced PC (AdvPC) and CRPC. To validate these findings, five gene-expression datasets based on PC samples at various disease stages were selected from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO). The five datasets used were GSE28403, GSE32269, GSE35988, GSE37199, and GSE70768. GSE28403 contained the gene-expression data of four cases of AdvPC and nine cases of CRPC. All AdvPC and two CRPC samples were collected from distant metastatic sites; the remaining seven CRPC samples were acquired from the primary site in the prostate [1]. GSE32269 contained the gene-expression data of 22 localized PCs (hormone-sensitive) and 29 prostate-to-bone metastatic CRPCs [2]. GSE35988 consisted of 59 localized PCs and 35 metastatic CRPCs whose gene-expression data were generated by two custom microarray platforms [3]. The gene-expression dataset in GSE37199 was generated from 107 PC samples obtained from 92 PC patients (31 good prognosis PCs and 63 advanced CRPCs) and included nine biological and four technical replicates. Thus, in total, GSE37199 contained 39 good prognosis PCs and 68 CRPCs [4]. Lastly, GSE70768 contained gene-expression data from 13 CRPCs and 113 PCs from the prostate that were collected by robotic radical prostatectomy [5]. The GSE28403, GSE32269, and GSE37199 gene-expression datasets were generated by using the Affymetrix Human Genome U133A and U133 Plus 2.0 arrays. The GSE35988 dataset was generated by using two customized Agilent platforms (Agilent Whole Human Genome Microarray G4112F and G4112A). The GSE70768 dataset was created by using an Illumina HumanHT-12 V4.0 Expression BeadChip. It should be noted that, even after exploring various PC datasets in the public database, only a few were fully compatible with our unique RNA-Seq dataset.
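For orientation, the sample composition of the five validation datasets can be tabulated as follows. This is only a bookkeeping sketch: the counts are those stated in the text, whereas the dictionary layout, the short control-group labels, and the helper name `totals` are our own illustrative choices.

```python
# Sample counts as stated in the text: (control group, n_control, n_crpc).
DATASETS = {
    "GSE28403": ("AdvPC", 4, 9),
    "GSE32269": ("localized PC", 22, 29),
    "GSE35988": ("localized PC", 59, 35),
    "GSE37199": ("good-prognosis PC", 39, 68),
    "GSE70768": ("PC from the prostate", 113, 13),
}


def totals(datasets):
    """Total number of samples (control + CRPC) per GEO accession."""
    return {acc: n_ctrl + n_crpc for acc, (_, n_ctrl, n_crpc) in datasets.items()}


sample_totals = totals(DATASETS)
n_control_all = sum(n_ctrl for _, n_ctrl, _ in DATASETS.values())
n_crpc_all = sum(n_crpc for _, _, n_crpc in DATASETS.values())

for acc, (label, n_ctrl, n_crpc) in DATASETS.items():
    print(f"{acc}: {n_ctrl} {label} vs {n_crpc} CRPC (total {sample_totals[acc]})")
print(f"All datasets: {n_control_all} controls, {n_crpc_all} CRPCs")
```

The GSE37199 row adds up to the 107 samples mentioned in the text (39 good prognosis PCs plus 68 CRPCs, replicates included).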

Biological insights into the signature gene profile of progression to CRPC
To identify signaling pathways that participate in the progression of AdvPC to CRPC, functional enrichment tests and gene-to-gene network analyses were performed with the 90 genes that were differentially expressed between AdvPC and CRPC in the NGS analysis (Figure 1). For this, the Ingenuity Pathway Analysis (IPA) tool was used. As expected, when we searched for enriched functions, there was significant enrichment of genes involved in cancer, cellular growth and proliferation, the cell cycle, and cell death and survival. We also found high enrichment of genes involved in immunological and inflammatory disease (Supplementary Figure 7).
Interestingly, the vast majority of enriched functions contained AR or SPINK1, the two oncogenic molecules that were found to be up-regulated in CRPC relative to AdvPC when we compared the paired and unpaired AdvPC:CRPC samples in our NGS cohort. This suggests that AR and SPINK1 may play a crucial role in the progression of AdvPC to CRPC.
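The idea behind a functional enrichment test can be illustrated with a generic over-representation calculation. The sketch below computes a right-tailed hypergeometric P value, which is equivalent to a one-sided Fisher's exact test. It is not the IPA implementation, and all numbers other than the 90 selected genes are hypothetical values chosen for illustration.

```python
from math import comb


def enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """Right-tailed hypergeometric P value.

    Probability that a random selection of n_selected genes from a background
    of n_background genes contains at least n_overlap of the n_annotated
    genes that carry a given functional annotation.
    """
    upper = min(n_annotated, n_selected)
    favourable = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_background, n_selected)


# Hypothetical example: 20,000 background genes, 500 of which are annotated
# to a function; 10 of the 90 differentially expressed genes carry that
# annotation. Only the number 90 comes from the text.
p_example = enrichment_p(n_background=20000, n_annotated=500,
                         n_selected=90, n_overlap=10)
print(f"Enrichment P value: {p_example:.2e}")
```

With these illustrative numbers, about two annotated genes would be expected by chance, so an overlap of 10 yields a small P value. In practice, P values from many tested functions would also need correction for multiple testing.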
An exploration of gene-to-gene networks revealed functional connectivity between AR and SPINK1 in a network in which SPINK1 is a downstream effector of AR [6] (Figure 4). SPINK1 plays a number of roles in the cell, including roles in abnormal morphology and proliferation [7,8]. It also regulates many matrix metalloproteinase family genes, including MMP13, which participates in the identified network (Figure 4). Among the 90 candidate genes identified by the NGS analysis, we also found putative activation of the oncogenic transcription regulator HNF1A (Figure 4). HNF1A formed the primary hub of the gene network we identified because it regulates many downstream effectors, including UDP glucuronosyltransferase family members (i.e., UGT1A1, UGT1A3, and UGT2B15), VIL1, AKR1C1/AKR1C2, and ANPEP. HNF1A is well known to be involved in several functional and disease categories, including cancer, cell proliferation, cell or tumor morphology, and gastrointestinal disease.
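The regulatory relationships named in this paragraph can be written down as a small directed graph, which makes the hub role of HNF1A explicit. This is a minimal sketch that contains only the edges mentioned in the text; Figure 4 may contain additional nodes and edges, and treating AKR1C1 and AKR1C2 as two separate targets is our assumption.

```python
# Directed regulator -> target edges as described in the text (Figure 4).
# Assumption: AKR1C1/AKR1C2 is split into two separate target nodes.
NETWORK = {
    "AR": ["SPINK1"],
    "SPINK1": ["MMP13"],
    "HNF1A": ["UGT1A1", "UGT1A3", "UGT2B15", "VIL1",
              "AKR1C1", "AKR1C2", "ANPEP"],
}


def out_degree(network):
    """Number of direct targets per regulator."""
    return {regulator: len(targets) for regulator, targets in network.items()}


def downstream(network, start):
    """All nodes reachable from start by following regulator -> target edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for target in network.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


degrees = out_degree(NETWORK)
hub = max(degrees, key=degrees.get)  # regulator with the most direct targets
print(f"Hub: {hub} ({degrees[hub]} direct targets)")
print(f"Downstream of AR: {sorted(downstream(NETWORK, 'AR'))}")
```

Ranking regulators by out-degree is only one simple way of identifying a hub; it is used here because the text defines the hub by the number of downstream effectors.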
To confirm the activities of these AR- and HNF1A-connected candidate genes, we examined their expression levels in CRPC relative to those in controls by using five publicly available datasets that are based on independent patient cohorts, namely, GSE28403, GSE32269, GSE35988, GSE37199, and GSE70768. The controls in these comparisons were AdvPC (GSE28403), localized PC (GSE32269 and GSE35988), PC with a good prognosis (GSE37199), and primary-site PC (GSE70768). In these analyses, we focused on the network molecules that our NGS analysis showed were up-regulated in CRPC relative to AdvPC (i.e., AR, SPINK1, S100A8, HNF1A, VIL1, and MMP13). In total, 54 comparisons were made in these five datasets. All six genes were generally up-regulated in CRPC relative to the controls, and many of these up-regulations were significant. Down-regulation was observed in only three cases (once for AR and twice for HNF1A) (Supplementary Figures 8-12). These data support the notion that gene networks that are mediated by AR or HNF1A participate in the development of CRPC.
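A single CRPC-versus-control comparison of this kind can be sketched as follows. The text does not state which statistical test was used, so the Welch t-statistic below is an assumption, as is the premise that the input values are log2-scale intensities. The expression values are made up for illustration and are not taken from any GEO dataset.

```python
from math import sqrt
from statistics import mean, variance


def compare_groups(crpc, control):
    """Return (log2 difference, Welch t-statistic, direction) for one probe.

    Assumes the inputs are log2-scale expression values, so the difference
    of the group means is a log2 fold change.
    """
    diff = mean(crpc) - mean(control)
    se = sqrt(variance(crpc) / len(crpc) + variance(control) / len(control))
    t = diff / se
    direction = "up" if diff > 0 else "down" if diff < 0 else "flat"
    return diff, t, direction


# Made-up log2 intensities for illustration; NOT values from any GEO dataset.
toy_crpc = [8.1, 8.4, 7.9, 8.6]
toy_control = [7.0, 7.2, 6.9, 7.1]

diff, t_stat, direction = compare_groups(toy_crpc, toy_control)
print(f"log2 difference={diff:.2f}, Welch t={t_stat:.2f}, direction={direction}")
```

Repeating such a comparison for every gene (or probe) in every dataset and tallying the directions would give a summary of the form reported above, in which most comparisons show up-regulation in CRPC and a few show down-regulation.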