Identification of evolutionarily conserved DNA damage response genes that alter sensitivity to cisplatin

Ovarian, head and neck, and other cancers are commonly treated with cisplatin and other DNA-damaging cytotoxic agents. An altered DNA damage response (DDR) contributes to the resistance of these tumors to chemotherapies, some targeted therapies, and radiation. DDR involves multiple protein complexes and signaling pathways, some of which are evolutionarily ancient and involve protein orthologs conserved from yeast to humans. To identify new regulators of cisplatin resistance in human tumors, we integrated high-throughput and curated datasets describing yeast genes that regulate sensitivity to cisplatin and/or ionizing radiation. Next, we clustered highly validated genes based on chemogenomic profiling, and then mapped orthologs of these genes in expanded genomic networks for multiple metazoans, including humans. This approach identified an enriched candidate set of genes involved in the regulation of resistance to radiation and/or cisplatin in humans. Direct functional assessment of selected candidate genes using RNA interference confirmed their activity in influencing cisplatin resistance, the degree of γH2AX focus formation, and ATR phosphorylation in ovarian and head and neck cancer cell lines, suggesting impaired DDR signaling as the driving mechanism. This work enlarges the set of genes that may contribute to chemotherapy resistance and provides a new contextual resource for interpreting next-generation sequencing (NGS) genomic profiling of tumors.


Identification of genes regulating radiation and/or cisplatin resistance in yeast and other lower eukaryotes
The Saccharomyces Genome Database (SGD, [1]; data current as of Dec-28-2015) was searched for a list of phenotypes including "gamma ray resistance", "X ray resistance", "ionizing radiation resistance", and "UV resistance". Identifiers for genes in which mutations resulted in altered phenotypes matching these search terms were extracted and pooled. In parallel, SGD was searched for the phenotype of altered "resistance to chemicals (cisplatin)", and identifiers for genes in which an inactivating mutation or deletion resulted in increased sensitivity to cisplatin were extracted. An initial assessment of high-throughput studies (defined here as those screening > 15% of the genome) reported in SGD revealed inconsistencies in the ways their results were transferred to SGD. Therefore, we also reviewed a number of primary papers reporting high-throughput studies from PubMed [2][3][4][5][6][7][8][9][10][11][12][13]. Data available in these papers were extracted manually and integrated with the results reported in SGD.
To sort chemogenomics (CGS) data, the sensitization gradient observed for cisplatin in each of 3 independent screens was aligned with the set of high-confidence hits determined through binary screens (HTS and/or LTS reported in at least two independent studies). For each independent screen, the set of CGS genes ranked from most to least sensitizing, with a cut-off at the point where >65% of high-confidence binary genes had been observed in the CGS analysis, was taken for further consideration as a high-confidence CGS set. Subsequently, intersections between each pair of high-confidence subsets from the 3 screens were established: genes found in at least 2 screens were considered reproducible. A hypergeometric test was used to establish the statistical significance of each intersection (p < 1E-10). The three candidate lists were then merged into one high-confidence CGS candidate set.
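The cut-off, intersection, and hypergeometric steps described above can be illustrated with a minimal sketch. It is not the code used in this study: the function names, the toy gene labels, and the choice of the full ranked list as the hypergeometric background are assumptions made only for illustration.

```python
from collections import Counter
from math import comb


def cgs_cutoff(ranked_genes, binary_hits, fraction=0.65):
    """Top of a ranked CGS list, cut where more than `fraction` of binary hits are covered."""
    needed = fraction * len(binary_hits)
    seen = 0
    for idx, gene in enumerate(ranked_genes):
        if gene in binary_hits:
            seen += 1
            if seen > needed:
                return list(ranked_genes[: idx + 1])
    return list(ranked_genes)  # threshold never reached: keep the whole list


def reproducible(screens, min_screens=2):
    """Genes present in at least `min_screens` of the high-confidence subsets."""
    counts = Counter(gene for subset in screens for gene in set(subset))
    return {gene for gene, n in counts.items() if n >= min_screens}


def hypergeom_tail(overlap, population, size_a, size_b):
    """P(X >= overlap) for two gene sets of size_a and size_b drawn from `population` genes."""
    upper = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(population - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(population, size_b)


# Toy data with hypothetical gene labels (most sensitizing first).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
binary_hits = {"g1", "g3", "g4"}
subset = cgs_cutoff(ranked, binary_hits)  # 2 of 3 binary hits (>65%) reached at g3
shared = reproducible([subset, ["g1", "g3", "g9"], ["g9", "g10"]])
p_value = hypergeom_tail(overlap=3, population=10, size_a=3, size_b=3)
```

In a real analysis the background population would be the number of genes assayed in both screens, and the tail probability for each pair of subsets would be compared with the 1E-10 threshold.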
For characterization of yeast clusters of interest, a t-test was used to identify which screening modalities (i.e., responses to which individual drugs) differed significantly in level of sensitivity between the selected cluster of interest and the rest of the yeast clones, using data from Hillenmeyer et al. [18]. For each statistically significant category (drug), the number of experiments showing a difference in sensitivity was extracted separately and sorted into two bins: highly significant (t-test p < 1.0E-7) and moderately significant (t-test p between 1.0E-2 and 1.0E-7). The mechanism of action of each drug was extracted from online resources: the NCI Drug Dictionary (http://www.cancer.gov/publications/dictionaries/cancerdrug) and DrugBank (http://www.drugbank.ca).
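The binning step can be sketched as follows. This is an illustration only: it assumes the two thresholds are read as 1.0E-7 and 1.0E-2, treats the lower boundary of the moderate bin as inclusive, and uses hypothetical drug labels and p-values rather than data from [18].

```python
def bin_experiments(p_values_by_drug, high=1.0e-7, moderate=1.0e-2):
    """Count experiments per drug in the 'highly' and 'moderately' significant bins."""
    summary = {}
    for drug, p_values in p_values_by_drug.items():
        highly = sum(1 for p in p_values if p < high)
        moderately = sum(1 for p in p_values if high <= p < moderate)
        if highly or moderately:  # keep only drugs with a significant difference
            summary[drug] = {"highly": highly, "moderately": moderately}
    return summary


# Hypothetical t-test p-values, one per experiment.
example = {
    "drug_A": [1e-9, 3e-8, 1e-5, 0.5],
    "drug_B": [0.2, 0.7],
}
bins = bin_experiments(example)
```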

siRNA screening
Human genes to be assessed for modulation of cisplatin sensitivity were depleted using two pooled siRNAs per gene from Qiagen (Hilden, Germany). siRNAs targeting polo-like kinase 1 (PLK1) were used as a positive control for transfection, and scrambled siRNAs targeting the firefly luciferase gene (GL2) were included as a negative control for normalization (Dharmacon, Pittsburgh, PA). siRNAs were introduced into cells by reverse transfection, using DharmaFECT-1 (Dharmacon, Pittsburgh, PA) diluted in reduced-serum media (OptiMEM, Invitrogen) and dispensed with a bulk reagent microplate dispenser into V-bottom 96-well dilution plates containing siRNA pools. After 30 min at room temperature, each siRNA-lipid complex was aliquoted into 96-well flat-bottom test plates using a CyBio Vario liquid handler, followed by addition of cells in normal growth media lacking antibiotics (10,000 cells/well for SCC61 and SCC25, 4,000 cells/well for OVCAR-8; 90 μl final volume/well).
After 24 hours of recovery, cells were treated with cisplatin or vehicle for 72 hours, and cell viability was then measured using a CellTiter-Blue assay (Promega, Madison, WI), with signal quantified after 3 hours using an EnVision (Perkin Elmer, Waltham, MA, USA) multi-label microplate reader. To calculate cell viability following siRNA treatment, the fluorescence intensity (FI) value from each well targeted by gene-specific siRNAs was divided by the mean FI value from three reference wells containing the non-targeting negative control GL2 siRNA on each plate, yielding a viability score (V) for each gene, defined as V = (fluorescence intensity, query gene-specific siRNA)/(mean fluorescence intensity, GL2 siRNA). The sensitization index (SI) of each siRNA was then defined as the viability of cells in the presence of siRNA and drug divided by the viability of cells in the presence of siRNA and vehicle: SI = V(siRNA + drug)/V(siRNA + vehicle). Biological significance was defined as a decrease or increase in the SI greater than 15%, as in previous studies [19]. These experiments were performed at least 3 times independently for each cell line. Four siRNAs were tested for each gene; for validation of on-target activity, the two siRNAs with the most robust sensitizing phenotype were tested for depletion of each gene by RT-PCR, and were then pooled for further experiments (Supp Tables S8, S9).
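The viability score and sensitization index defined above can be computed as in the following sketch. The function names, the fluorescence values, and the "sensitizing"/"desensitizing" labels are hypothetical; the sketch also assumes that drug- and vehicle-treated wells are each normalized to GL2 wells from their own plate, and that the 15% criterion is measured relative to SI = 1.

```python
from statistics import mean


def viability(fi_query, fi_gl2_wells):
    """V = FI of the gene-specific siRNA well / mean FI of the GL2 reference wells."""
    return fi_query / mean(fi_gl2_wells)


def sensitization_index(v_drug, v_vehicle):
    """SI = V(siRNA + drug) / V(siRNA + vehicle)."""
    return v_drug / v_vehicle


def classify(si, threshold=0.15):
    """Label an SI value using a 15% change from 1.0 as the significance cut-off (assumed)."""
    if si < 1.0 - threshold:
        return "sensitizing"
    if si > 1.0 + threshold:
        return "desensitizing"
    return "no significant change"


# Hypothetical fluorescence readings for one gene on a drug plate and a vehicle plate.
v_drug = viability(3000, [6000, 6200, 5800])        # 3000 / 6000 = 0.5
v_vehicle = viability(8000, [10000, 9800, 10200])   # 8000 / 10000 = 0.8
si = sensitization_index(v_drug, v_vehicle)         # 0.5 / 0.8 = 0.625
label = classify(si)
```

Because V is normalized to GL2 within each condition, an SI below 1 reflects a loss of viability specific to the combination of gene depletion and drug, not a general toxicity of the siRNA.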

Interaction networks of yeast genes and their human orthologs
Human orthologs for yeast genes of interest were identified using Ensembl BioMart (http://useast.ensembl.org/biomart/martview/) [20], with an orthology confidence cutoff of 1 (maximum stringency). Gene homology was further verified using P-POD, the Princeton Protein Orthology Database (http://ppod.princeton.edu) [21]. For both yeast and human proteins, interaction networks were built using Cytoscape [22] with the GeneMANIA plugin [23], with the types of interactions restricted to physical and genetic. For human genes, settings allowed retrieval of up to five additional genes to provide biological context.
Comprehensive genomic profiles, including mutational landscape and gene amplification/deletion profiles, are available for the SCC25 and OVCAR-8 cell lines at the following links: http://www.cbioportal.org/case.do?cancer_study_id=cellline_ccle_broad&case_id=OVCAR-8 and http://www.cbioportal.org/case.do?cancer_study_id=cellline_ccle_broad&case_id=SCC-25. Only limited details are available for SCC61, as described in [24] and at http://cancer.sanger.ac.uk/cosmic/sample/overview?id=1122673. In general, comparison of the genomic profiles of HNSCC cell lines with those of HNSCC tumors indicates a good correlation [25].

Rank calculation for human orthologs of genes modulating UV_rad and/or cisplatin resistance in model organisms
The rank for each human candidate gene (Table S7) was calculated based on two criteria: 1) the confidence in identification of a candidate gene in a model organism, and 2) the confidence in identification of the corresponding human ortholog(s). Confidence in identification of a UV_rad candidate gene in a model organism was calculated based on the number of independent publications implicating the gene in both LTS and HTS datasets, as follows: 0.5*(n(LTS)_i/N(LTS) + n(HTS)_i/N(HTS)), where n_i is the number of publications supporting candidate i, and N is the maximum number of publications supporting any candidate in the gene set. Where the assignment of the gene was also supported by data from at least one other model organism, the confidence was set to 1. For the cisplatin set, the confidence was calculated in the same way, except that "binary" and "CGS" counts were used instead of LTS and HTS counts, i.e., 0.5*(n(binary)_i/N(binary) + n(CGS)_i/N(CGS)). Thus, genes implicated by only one HTS or CGS publication coupled with functional clustering with better-characterized genes have the lowest confidence.
Calculation of the confidence in identification of appropriate human ortholog(s) for yeast genes incorporated the overall degree of evolutionary conservation (e.g., existence of orthologs in D. melanogaster and C. elegans) and the percent identity between the genes in model organisms and humans. The formula (0.1*C(hs) + 0.1*C(dm) + 0.1*C(ce) + 0.65*P(hom)) was used, where C(hs), C(dm), and C(ce) are confidence scores for H. sapiens, D. melanogaster, and C. elegans orthologs (either 1 or 0), as retrieved using Ensembl BioMart [20], and P(hom) is the percent sequence identity between the model organism and human proteins. The final rank was assigned by combining the confidence and orthology scores, and subsequently calculating the rank for each value in the set.
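The two scores and the final ranking can be sketched as below, under several assumptions that the text does not specify: each publication count is normalized by the maximum for its own dataset type (by analogy with the cisplatin formula), the orthology formula is read as a weighted sum of four terms, P(hom) is expressed as a fraction between 0 and 1, the two scores are combined by simple addition, and ties are broken by input order. Gene labels and values are hypothetical.

```python
def model_confidence(n_a, n_b, max_a, max_b, other_organism=False):
    """0.5*(n_a/N_a + n_b/N_b); set to 1 if another model organism supports the gene."""
    if other_organism:
        return 1.0
    return 0.5 * (n_a / max_a + n_b / max_b)


def orthology_score(c_hs, c_dm, c_ce, p_hom):
    """0.1*C(hs) + 0.1*C(dm) + 0.1*C(ce) + 0.65*P(hom), with P(hom) as a fraction (assumed)."""
    return 0.1 * c_hs + 0.1 * c_dm + 0.1 * c_ce + 0.65 * p_hom


def rank_candidates(combined_scores):
    """Rank 1 = highest combined score; combination by addition is an assumption."""
    ordered = sorted(combined_scores, key=combined_scores.get, reverse=True)
    return {gene: position + 1 for position, gene in enumerate(ordered)}


# Hypothetical candidates: (LTS or binary count, HTS or CGS count, other-organism support,
#                           C(hs), C(dm), C(ce), fractional identity)
candidates = {
    "GENE_A": (2, 1, True, 1, 1, 1, 0.40),
    "GENE_B": (1, 1, False, 1, 0, 0, 0.20),
    "GENE_C": (2, 2, False, 1, 1, 0, 0.40),
}
max_a = max(c[0] for c in candidates.values())
max_b = max(c[1] for c in candidates.values())
combined = {
    gene: model_confidence(n_a, n_b, max_a, max_b, other)
    + orthology_score(hs, dm, ce, p_hom)
    for gene, (n_a, n_b, other, hs, dm, ce, p_hom) in candidates.items()
}
ranks = rank_candidates(combined)
```

Note that the weights in the orthology formula sum to 0.95, so a fully conserved gene with 100% identity would score 0.95 under this reading.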