DNA-damage related genes and clinical outcome in hormone receptor positive breast cancer

Background: Control of DNA damage is frequently deregulated in solid tumors. Upregulation of genes within this process can indicate a more aggressive phenotype and is linked with worse outcome. In the present article we identify DNA damage-related genes associated with worse outcome in breast cancer. Results: A total of 2286 genes were differentially expressed between normal breast tissue and basal-like tumors, and 62 of them were included in the DNA metabolic process function. Expression of RAD51, GINS1, TRIP13 and MCM2 was associated with detrimental relapse-free survival (RFS) and overall survival (OS) in luminal tumors. In ER+/HER2- tumors, the combined analysis of TRIP13+RAD51+MCM2 showed the worst association for RFS (HR 2.25 (1.51-3.35), log-rank p=4.1e-05) and that of TRIP13+RAD51 for OS (HR 5.13 (0.6-44.17), log-rank p=0.098). TRIP13 is amplified in 3.1% of breast cancers. Methods: Transcriptomic analyses of public datasets comparing expression values between normal breast tissue and TNBC identified upregulated genes. Genes included in the DNA metabolic process were selected and confirmed using data contained at Oncomine (www.oncomine.org). The association of the selected genes with RFS and OS was evaluated using the KM Plotter Online Tool (http://www.kmplot.com). Molecular alterations were evaluated using cBioPortal (www.cbioportal.org). Conclusions: Expression of the DNA metabolic process genes RAD51, GINS1, TRIP13 and MCM2 is associated with poor outcome. Combinations of some of these genes are linked to poor RFS or OS in luminal A, luminal B and ER+/HER2- tumors. Evaluation of their predictive capacity in prospective studies is required.


INTRODUCTION
Breast cancer is a heterogeneous disease with several molecular alterations [1,2]. This heterogeneity has been characterized with different methods, including immunohistochemistry (IHC) and transcriptomic analyses [1,2]. Whole sequencing studies have also shown heterogeneity at the molecular level, including mutations and copy number modifications [3]. In addition, this biological diversity correlates with different clinical behavior and patterns of relapse, which helps to select among therapeutic strategies [4].
Tumors that express the estrogen receptor and lack HER2 overexpression have been classified by genomic expression as luminal [2], and are treated with antihormonal therapy [2,5]. Within this group, those enriched in genes associated with proliferation, and therefore linked with a slightly worse outcome, have been termed luminal B tumors [2]. This subgroup can be identified by high expression of KI67 measured by IHC [6]. However, although the luminal A group has a better outcome, some of these tumors show a different clinical behavior and are associated with a more aggressive phenotype.
Triple negative breast cancer (TNBC) is a molecular subtype defined by the lack of expression of the estrogen and progesterone receptors and the absence of HER2 overexpression [1]. It accounts for around 15% of tumors and is associated with worse outcome [1,4]. Among the molecular functions that are altered in breast cancer, "cell division" and "DNA damage response" are some of the most modified and relevant, particularly in the TNBC subgroup [5]. In this context, genes that participate in the regulation of these two functions can be the target of novel compounds, such as the PARP inhibitor olaparib or agents under development against mitotic kinases [7,8].
In this project we hypothesized that deregulation of genes involved in "DNA damage metabolism" is not restricted to TNBC and could also be present in other breast cancer subtypes, such as estrogen receptor-positive tumors. In this context, estrogen receptor-positive tumors that overexpress some of these genes could be those associated with a more aggressive phenotype.
In the present article we use gene expression analyses to explore genes related to DNA repair mechanisms that are deregulated in luminal tumors and associated with poor outcome. We identified a set of genes that predict detrimental outcome only in luminal tumors. In addition, the fact that this set is associated with DNA damage suggests that these genes should be evaluated in future studies as potential predictors of the efficacy of DNA-damaging agents.

RESULTS

Identification of DNA metabolic process genes linked to worse outcome
To identify functions and relevant genes that could predict worse outcome, we first compared transcriptomic data from normal breast tissue with data from basal-like tumors using a public dataset (GEO DataSet accession number: GDS2250). Using a fold-change cut-off of four or more, we selected 2286 genes that were differentially expressed between the two groups. Functional analyses identified several functions (Figure 1A), and we focused on the DNA metabolic process, in which 62 genes were deregulated. The genes included in this function, as provided by DAVID Bioinformatics Resources 6.7, are shown in Table 1. Forty-nine of these genes were upregulated (Figure 1B).
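The fold-change selection described above can be illustrated with a minimal sketch. It is not the Affymetrix Transcriptome Analysis Console procedure used in the study: it assumes linear-scale expression values, a simple ratio of group means, and hypothetical gene names and numbers chosen only for illustration.

```python
def select_by_fold_change(normal, tumor, min_fold=4.0):
    """Return {gene: signed fold change} for genes with |fold change| >= min_fold.

    normal, tumor: dicts mapping gene -> list of linear-scale expression values.
    Positive values mean higher expression in tumor, negative values lower.
    """
    selected = {}
    for gene in normal.keys() & tumor.keys():
        mean_normal = sum(normal[gene]) / len(normal[gene])
        mean_tumor = sum(tumor[gene]) / len(tumor[gene])
        if mean_normal <= 0 or mean_tumor <= 0:
            continue  # a ratio is undefined for non-positive means
        ratio = mean_tumor / mean_normal
        # Express downregulation as a negative fold change (e.g. 0.2 -> -5).
        fold = ratio if ratio >= 1 else -1 / ratio
        if abs(fold) >= min_fold:
            selected[gene] = fold
    return selected


# Hypothetical toy data, not taken from GDS2250.
normal = {
    "GENE_UP": [10.0, 12.0, 8.0],
    "GENE_FLAT": [50.0, 55.0, 45.0],
    "GENE_DOWN": [80.0, 100.0, 120.0],
}
tumor = {
    "GENE_UP": [40.0, 60.0, 50.0],
    "GENE_FLAT": [60.0, 50.0, 70.0],
    "GENE_DOWN": [20.0, 25.0, 15.0],
}

hits = select_by_fold_change(normal, tumor)
upregulated = sorted(g for g, f in hits.items() if f > 0)
```

In this toy input only `GENE_UP` and `GENE_DOWN` pass the four-fold threshold, and `upregulated` keeps the subset with higher expression in tumor, mirroring the split between deregulated and upregulated genes reported above.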
As we hypothesized that upregulated genes within the DNA metabolic process could be involved in the oncogenesis of other breast cancer subtypes, we next explored which of the identified genes were associated with worse relapse-free survival (RFS) and overall survival (OS) in luminal tumors. We used the KM Plotter tool, as described in Materials and Methods, which contains information from datasets grouping more than 3500 patients. As can be seen in Figure 1B, we identified RAD51, GINS1, TRIP13 and MCM2 as genes associated with worse outcome in luminal A and luminal B tumors, for both RFS and OS (Supplementary Tables S1 and S2). We also confirmed the upregulation of these genes using data available at Oncomine (Figure 1B).

Molecular alterations in identified genes and biological functions
Finally, we evaluated whether the identified genes have any relevant molecular alteration in breast tumors by using information contained at cBioPortal [9]. No mutations or deletions were identified in a relevant proportion of patients. Amplification of TRIP13 was observed in 3.1% of the 974 breast invasive carcinoma samples (Table 4). Functions of the genes are described in Supplementary Table S3.

DISCUSSION
In the present article we describe a set of genes included within the DNA metabolic process that are linked with worse outcome in luminal breast tumors.
Luminal tumors are a breast cancer subtype that expresses hormone receptors and is usually treated with hormonal therapy. In addition, this subtype is associated with good clinical outcome. However, not all patients within this group have the same clinical behavior, and some have poor outcome. In this context, it is relevant to identify subgroups of patients whose tumors express the estrogen receptor and are associated with a detrimental outcome, in order to optimize therapeutic strategies. In our article we describe a set of genes that discriminates among luminal tumors, identifying a subset linked with poor prognosis.
The DNA repair machinery is usually hyperactivated in tumors, and on many occasions deficiencies in some of its components have been linked with the genesis and maintenance of some tumors [8]. In TNBC, molecular alterations at the genomic level are implicated in oncogenesis and can furthermore help to identify patients who would respond to some therapies [5,8]. In this project we used basal-like tumors as a model to identify genes that were upregulated in this disease. These genes, as demonstrated later, could also be expressed in other tumor types and be linked with a more aggressive phenotype.
In our work, we have identified four genes that are associated with poor outcome in luminal tumors. These genes code for different proteins. MCM2 is a DNA replication licensing factor that has been described as associated with worse outcome in ovarian carcinoma [10]. TRIP13 promotes early steps of DNA double-strand break repair, and its presence was associated with progression in prostate cancer [11]. GINS1 plays a role in the initiation of DNA replication and has been part of a gene set associated with outcome in early stage non-small cell lung cancer [12]. Finally, overexpression of RAD51 has also been linked with poor outcome in colorectal carcinomas [13]. Of note, the involvement of these genes in breast cancer has not been fully explored.
Our work has limitations. This is an in silico analysis based on functional genomics using data from several sources, so evaluation of these findings in a homogeneous dataset is mandatory. In addition, no multivariable analysis is included, so it is not possible to identify any potential association with other prognostic factors. Finally, the data should be considered hypothesis-generating and should be confirmed in independent datasets and in homogeneous prospective and retrospective studies.
In conclusion, we describe a gene signature linked with DNA metabolic process that is associated with poor outcome in luminal tumors. Our findings have potential to discriminate patients with higher risk of relapse, therefore helping to customize therapies.

MATERIALS AND METHODS

Identification of upregulated genes by transcriptomic analyses
We extracted mRNA level data of normal breast tissue and basal-like cancers from a public dataset (GEO DataSet accession number: GDS2250). Affymetrix CEL files were downloaded and analyzed with Affymetrix Transcriptome Analysis Console 3.0. We selected genes with a minimum 4-fold difference in expression between the two groups. Gene set enrichment analysis with DAVID Bioinformatics Resources 6.7 (https://david.ncifcrf.gov/) was used to analyze the list of genes and to identify their functions, using an adjusted p-value <0.05. Data contained at Oncomine (www.oncomine.org) were used to independently confirm the differences in expression of the selected genes.
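As an illustration of what an enrichment analysis with an adjusted p-value threshold involves, the following sketch computes a generic over-representation p-value with the hypergeometric distribution and adjusts a set of p-values with the Benjamini-Hochberg procedure. This is an assumption made for illustration only: DAVID applies its own statistic and correction options, and all counts below are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of seeing at least k annotated genes in a list of n
    genes drawn from a background of N genes, K of which carry the annotation."""
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: background of 1000 genes, 50 annotated to each term,
# a list of 100 selected genes containing 15 or 5 annotated genes.
p_enriched = hypergeom_sf(k=15, N=1000, K=50, n=100)
p_background = hypergeom_sf(k=5, N=1000, K=50, n=100)

adjusted = benjamini_hochberg([p_enriched, p_background])
significant = [p < 0.05 for p in adjusted]
```

With these toy counts the first term (15 annotated genes where about 5 are expected by chance) remains significant after adjustment, while the second does not.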

Analyses of outcome for RFS and OS
To evaluate the relationship between the expression of different genes and patient clinical outcome, we used the KM Plotter Online Tool (http://www.kmplot.com) in different breast cancer subtypes [14]. This is a public database containing information from 3500 patients that permits investigation of the association of genes with overall survival (OS) and relapse-free survival (RFS).
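The log-rank p-values reported for these comparisons come from the online tool. As a rough illustration of the underlying test, the sketch below implements a standard two-group log-rank statistic in plain Python. It assumes that patients have already been divided into high and low expression groups (the cut-off rule is not specified here), and it uses hypothetical follow-up times; it does not reproduce KM Plotter's internal implementation.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*: follow-up times; events_*: 1 if the event (relapse or death)
    was observed, 0 if the patient was censored.
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        at_risk_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        deaths_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        deaths_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b
        observed_a += deaths_a
        expected_a += d * at_risk_a / n
        if n > 1:
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical follow-up times (months) for high and low expression groups.
high_times, high_events = [2, 3, 4, 5, 6], [1, 1, 1, 1, 1]
low_times, low_events = [8, 9, 10, 11, 12], [1, 0, 1, 0, 0]

chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
```

In this toy example every event in the high expression group occurs before any event in the low expression group, so the test reports a clear difference between the two survival curves. A hazard ratio with its confidence interval, as reported in this article, would require an additional model such as Cox regression, which this sketch does not cover.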