Modulation of proliferation factors in lung adenocarcinoma with an analysis of the transcriptional consequences of genomic EGFR activation

Genes of the pre-replication, pre-initiation, and replisome complexes duplicate the genome from many sites once per normal cell cycle. This study closely examines components of these complexes in lung adenocarcinoma (LUAD), correlating changes in the genome and transcriptome with proliferation and overall survival. Molecular subtypes (The Cancer Genome Atlas (TCGA), 2014) based on copy number, DNA methylation, and mRNA expression had variable proliferation levels, the highest correlating with decreased survival. A pattern of increased expression typified by POLE2 and POLQ was found for multiple replication factors across thirty-seven tumor types. Unexpectedly, EGFR altered cases correlated inversely with proliferation factor expression in LUAD, colon adenocarcinoma, and Cancer Cell Line Encyclopedia cell lines, but not in glioblastoma or breast cancer. Activating mutations did not uniformly correlate with proliferation, and most cases were pre-metastatic. A gene expression profile was identified, and pathway involvement considered. Significantly, the results suggest that EGFR over expression and activation are early alterations that likely stall the replication complex through PCNA phosphorylation, creating replication stress responsible for the DNA damage response and further mutation, but do not themselves promote increased proliferation. An argument is presented that the mechanism driving lethality in this tumor cohort could differ from the over proliferation seen in other LUAD.


INTRODUCTION
It is well established that cancer is the result of accumulated genetic changes to tumor suppressor genes or oncogenes, and that these changes lead to uncontrolled cellular proliferation. This feature is important clinically, and many well established chemotherapeutic agents are designed to directly or indirectly inhibit DNA synthesis. Targeting the DNA is often effective, but has major disadvantages of toxicity due to non-specificity and predisposition to develop resistant clones. Cisplatin and carboplatin are two examples, often used in neoadjuvant and adjuvant chemotherapy regimens when treating patients with lung adenocarcinoma (LUAD). These reagents crosslink purine bases in DNA, preventing replication and repair and promoting cell death. The choice to use them is made after careful initial assessment that takes into consideration risk factors, radiological appearance, tumor histology, and node involvement. Targeted therapies are now also available for several molecular classes of LUAD including EGFR mutation positive, ALK rearrangement positive, ROS1 rearrangement positive, BRAF V600E mutation positive, NTRK gene fusion positive, as well as anti-PD-L1 therapy [1].
Hypothetically, therapeutic reagents targeting DNA should be most effective for tumors that are highly proliferating and undergoing trans-lesion DNA synthesis. While uncontrolled replication can be initiated by changes involving cell cycle regulation, mitosis, and apoptosis among others, continued activity of genes in cellular proliferation is critical for the neoplastic state. This study focuses on genomic and transcriptional changes to proliferation genes across a LUAD cohort created by The Cancer Genome Atlas (TCGA) [2,3], previously subtyped by them on the basis of copy number, DNA methylation, and mRNA expression. Selected genes coding for proteins involved in the pre-replication, pre-initiation, and replisome complexes were compared to evaluate the proliferation status of each of the subtypes. Since approximately 14% of the LUAD tumor cohort had Epidermal Growth Factor Receptor (EGFR) activation, and EGFR plays a known role in the replication of embryonic epidermal cells, it was included in the study as a LUAD proliferation factor.
The initial part of this study finds that subtype 2, 3, and 6 cases have the highest expression of replication components and subtypes 1, 4, and 5 the lowest; subtypes with the highest expression have decreased survival. Evidence is presented that POLE and POLQ expression is elevated in subtype 2, and polymerase accessory subunits POLA2, POLD2, and POLE2 in subtypes 2 and 3. Comparing polymerase components across thirty-seven tumor types revealed a common subset with an elevated transcriptional pattern typified by POLE2 and POLQ that included EXO1, MCM10, GINS2, CDT1, ORC6L, and BLM. Subtype 1 cases had increased expression of POLI, POLK, and POLL.
The second part of this study unexpectedly found that levels of EGFR expression overall were inversely proportional to the expression levels of multiple other important proliferation factors across the TCGA LUAD cohort. The same inverse relationship was found when examining EGFR expression in colon adenocarcinoma (COAD) and a cohort of cancer cell lines across several tissue types, but not in breast cancer (BRCA) or glioblastoma (GBB). To better understand the molecular process of EGFR activated LUAD, a search was made for genes altered in at least 50% of the activated cases. Twenty-one genes met the criterion, fifteen of which were found on the short arm of chromosome 7 proximal to EGFR. They included YKT6, MCPH1, UBE2D4, TP53, WIPI2, NUDCD3, PBXIP1, KLHL7, MRM2, HERPUD2, RNF216, FBXO42, FAM220A, URGCP, ZNF12, USP42, EXOC3, C7ORF26, VOPP1, ZDHHC4, and CLPTM1L.
Within this group, CLPTM1L, PBXIP1, and URGCP, like EGFR, showed increased expression over the EGFR cohort but correlated inversely with the expression of multiple key replication proteins over total LUAD. YKT6, KLHL7, FAM220A, and VOPP1 also had increased expression over the EGFR cohort but correlated directly with high expression of multiple proliferation genes. Altered cellular processes occurring with EGFR activation included increased PI3K/AKT/mTOR signaling and autophagy, cytokinesis and apoptosis impairment, exosome production, cytoskeletal changes, and multiple changes to proteins of the plasma membrane.

Differential expression and overall survival
The status of forty genes known to be involved in cellular proliferation was examined for genomic mutations and changes in expression over the LUAD tumor cohort (Figure 1) (Table 1). Alterations are presented in the context of six LUAD subtypes defined by cluster analysis based on copy number, DNA methylation, and mRNA expression (Figure 2A) [4]. Expression values in this image were calculated by cBioPortal [5,6], which uses the value of the diploid fraction of the LUAD cohort as an estimated normal reference.
LUAD subtypes with the highest differential expression of replication factors (clusters 2, 3, and 6) had decreased survival compared with those with the lowest expression (clusters 1, 4, and 5) in Kaplan Meier survival plots (P = 0.0463) (Figure 2B). The most prominent feature distinguishing the groups is over expression of pre-replication and pre-initiation complex components and POLQ, with relative under expression of POLI, POLK, POLL, and POLM, in clusters 2, 3, and 6, implying these clusters have more "licensing", origin firing, and micro-homology end-joining. Subtypes 1, 4, and 5 had increased expression of POLH, POLI, POLK, POLL, and POLM, polymerases involved in trans-lesion DNA synthesis, double stranded break repair, and abasic site repair, suggesting the ability to carry out error prone DNA synthesis.

Bimodal distribution and survival
mRNA levels for the forty genes were also examined for bimodal distribution above and below the LUAD tumor cohort average. Data are presented in the context of functional replication complexes (Tables 2-6). Mini-Chromosome Maintenance (MCM) helicase proteins contribute to the pre-replication, pre-initiation, and replisome complexes. Components MCM 2, 3, 4, 6, and 7 were expressed above average consistently in subtype 2, differing significantly from subtypes 1, 4, 5, and 6 (Table 2). MCM5 expression also was above average in subtype 2, but differed significantly only from subtype 1. The pre-initiation complex components CDC45, GINS1, GINS2, GINS3, GINS4, and MCM10 were found above the tumor average in subtype 2, differing significantly from subtypes 1, 4, 5, and 6. Decreased survival was observed in Kaplan Meier survival curves for cases with MCM 2, 4, and 5, CDC45, GINS1, and MCM10 expression above the tumor cohort average (Table 3), in agreement with the general concept that high expression of proliferation genes correlates with decreased survival.
Replisome complexes duplicate DNA on leading and lagging strands (Figure 1). PCNA is a major component of this complex, forming a sliding ring structure that attracts and tethers many other replicative proteins,   4); accessory subunits (POLA2, POLD2, and POLE2) were significantly elevated in subtypes 2 and 3, potentially signifying a function other than structural for their respective polymerase complexes in cancer (Table 5).

Proliferation genes highly transcribed in cancer
A common pattern of increased expression for POLE2 and POLQ in LUAD was found across thirty-seven tumor types [7]. A search for other proliferation genes found that EXO1, MCM10, GINS2, CDT1, ORC6L, and BLM had the same pattern, indicating that high transcription of these genes is most likely necessary in cancer (Figure 2C and 2D).

EGFR expression and proliferation
EGFR activated cases were found in multiple LUAD subtypes, with low differential expression of pre-replication, pre-initiation, and replisome complex factors (Figure 2A). Comparing EGFR and PCNA mRNA heatmaps (exonic level) side by side suggested an inverse relationship from one subtype to the next (Figure 3A), an impression supported by cBioPortal differential expression data. These findings motivated arranging cases by PCNA mRNA expression (minimum to maximum) and interrogating the expression of other genes relative to the curve. FEN1, POLD1, POLE2, POLQ, and MCM4, proteins that physically interact with PCNA or function close by, had expression curves in direct correlation to the DNA clamp. EGFR mRNA expression was inversely proportional to all (Figure 3B). PDGFRA (4q12), another cell surface tyrosine kinase receptor similar to EGFR, also had expression inversely correlated to the PCNA curve (data not shown). A Kaplan Meier survival plot comparing all cases with putative EGFR driver mutations to cases without EGFR alteration showed significantly decreased survival for patients with putative driver mutations (Figure 3C) (Supplementary Table 1). Cases with EGFR missense mutations that were putative drivers did not cluster at any one point when arranged from lowest to highest proliferation markers (Figure 3D), suggesting EGFR activation does not directly lead to increased proliferation in LUAD. This is supported by the fact that the twenty-eight EGFR tyrosine kinase activated cases also localize to multiple LUAD subtypes, predominantly 3, 5, and 6 (Figure 2A). Eight LUAD cases with distant metastasis tended towards higher placement on the PCNA proliferation curve, but not the highest (Figure 3E); only one of these cases was also EGFR activated. Altogether, these results seem to support the idea that the lethality in LUAD achieved through the EGFR activation pathway and the lethality in LUAD achieved by increased levels of proliferation are mechanistically different.
Examining the behavior of EGFR differential expression relative to proliferation markers in COAD, BRCA, GBB, and the Cancer Cell Line Encyclopedia (from the Broad Institute and Novartis, 877 samples) [8,9] (Supplementary Table 2) permitted identification of the inverse relationship in COAD and in the cell lines, but not in GBB or BRCA (Figure 4A). GBB had higher EGFR expression levels overall compared to LUAD, COAD, BRCA, and the cell line cohort.

Genes involved in EGFR activated cases
Twenty-one genes with genomic and expression alterations in at least 50% of EGFR tyrosine kinase activated cases were identified (Table 7). They were examined over LUAD subtypes 1-6, and in cases with distant metastases (Figure 5). Clusters 3 and 5, containing most of the EGFR tyrosine kinase activated cases, were most similar; metastatic cases were less so. TP53 and MCPH1 both had low differential expression. TP53 additionally had frequent genomic mutation (at the DNA binding domain, 46.5%); MCPH1 did not (0.9%). MCPH1 expression was altered across the entire LUAD cohort (28%), including cases with increased expression. Mutual exclusivity testing for MCPH1 (low expression) and EGFR revealed they co-occurred significantly (P = 0.004). YKT6, UBE2D4, WIPI2, NUDCD3, PBXIP1, KLHL7, MRM2, HERPUD2, RNF216, FBXO42, FAM220A, URGCP, ZNF12, USP42, EXOC3, C7ORF26, VOPP1, ZDHHC4, and CLPTM1L were more highly expressed. Fifteen of these had cytological locations on the short arm of chromosome 7 (Table 7). The dysregulation of so many other genes proximal to EGFR's cytological location suggests an epigenetic event affecting transcription as an early alteration in these tumors.
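The co-occurrence result reported above can be illustrated with a one-sided Fisher exact test on a 2x2 table of cases. This is a minimal sketch only: the function name and the toy counts are hypothetical, and whether this matches the exact procedure of the cBioPortal version used in the study is an assumption.

```python
from math import comb

def fisher_cooccurrence_p(both, only_a, only_b, neither):
    """One-sided Fisher exact p-value for co-occurrence of alterations A and B.

    Arguments are case counts in a 2x2 table (hypothetical inputs):
    both = altered in A and B, only_a / only_b = altered in one gene only,
    neither = altered in neither gene.
    """
    row_a = both + only_a            # cases altered in A
    col_b = both + only_b            # cases altered in B
    n = both + only_a + only_b + neither
    max_both = min(row_a, col_b)
    total = comb(n, col_b)
    # Sum hypergeometric probabilities of overlaps at least as large as observed.
    tail = sum(comb(row_a, k) * comb(n - row_a, col_b - k)
               for k in range(both, max_both + 1))
    return tail / total

# Toy table: perfect overlap of 3 cases out of 6.
print(fisher_cooccurrence_p(3, 0, 0, 3))
```

A small p-value indicates that the two alterations appear together more often than expected by chance given their individual frequencies.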

Genes involved in EGFR activated cases in relation to proliferation
Comparing EGFR relevant gene expression to the PCNA expression curve, URGCP, PBXIP1, and CLPTM1L have inverse relationships to proliferation, while VOPP1, YKT6, KLHL7, and FAM220A have direct relationships (Table 7, Figure 4B). VOPP1 (also known as Vesicular, Overexpressed in cancer, Prosurvival Protein 1 or EGFR-Co-amplified and Overexpressed Protein) expression is shown first in relation to EGFR expression, then to PCNA expression. VOPP1 is overexpressed in the most highly proliferative cases, which have the lowest EGFR expression; the finding was confirmed in cBioPortal. CLPTM1L, a gene coding for a membrane protein that causes apoptosis when over expressed in cisplatin-sensitive cells, is over expressed in the EGFR activated cases that are the least proliferative.

Survival correlations for EGFR relevant genes
Genes implicated in the EGFR subtype were examined for overall survival across the LUAD tumor cohort (n = 230) using the Kaplan-Meier tool in cBioPortal (Figure 6). Cases with alteration in the genome and/or expression levels of TP53, YKT6, UBE2D4, and MCPH1 had decreased survival when compared to cases without such alteration. Cases with EGFR alteration in addition to each of these genes showed increased significance (Log rank) for YKT6, UBE2D4, and TP53. MCPH1 had decreased significance (Log rank) when in combination with EGFR.
A matrix table was set up to identify combinatorial alterations to TP53, YKT6, UBE2D4, and MCPH1 for individual EGFR activated cases (Table 8). Kaplan-Meier survival curves were calculated for each combination over the entire LUAD cohort, and sorted on the basis of Log-rank outcome. MCPH1 under expression correlated with loss of significance between curves for all gene permutations, with a decrease in median survival seen in the "unaltered" curve. Among EGFR activated cases alone, those with reduced MCPH1 expression (n = 11) were compared with those with normal MCPH1 expression (n = 12) where survival data were available; the curves were not significantly different from each other (P = 0.4264) (Figure 6).

Pathway alteration in EGFR activated LUAD
Overall, EGFR tyrosine kinase activated cases did not have distant metastases, prompting examination of markers of epithelial to mesenchymal transition (EMT) (Figure 7). In agreement, appreciable alteration was not found for CDH1, VIM, SNAI1, SNAI2, TWIST1, ZEB1, or ZEB2. EGFR activation did correlate with activation of HUS1, RAD1, and several other components of the 9-1-1 DNA damage response pathway, suggesting the major replication complexes were under replication stress. Important representative genes comprising the MAPK/ERK pathway revealed some activation of the pathway. The PI3K/AKT/mTOR pathway, particularly several components of autophagy, had high activation over a high frequency of cases. Examining the SHH pathway independently from other genes found none of the components altered in 50% or more of cases. However, GNA12 (7p22.3-p22.2) expression was upregulated and significantly co-expressed with EGFR (P = 0.001) in 36% of the EGFR activated cases, and Gli3 (7p14.1) expression in 39%.

DISCUSSION
This study shows that expression of pre-replication and pre-initiation complex and replisome components varies between LUAD subtypes reported by TCGA [2,3]. Subtype 2 and 3 tumors have the highest expression, which potentiates the highest proliferation. Subtype 1 tumors have the lowest expression, and represent the opposite boundary. While cytotoxicity of cisplatin and carboplatin may be the result of complex cellular processes, it follows that DNA targeting agents would be most effective on highly proliferating tumors. And if so, the development of (or absence of expression for some genes) would be advantageous. An estimated measurement of the kinetics of proliferation could be investigated for correlation with responsiveness to DNA targeting reagents, possibly enabling prediction of response prior to use and permitting therapeutic decisions based on a score. Cases from LUAD subtype 1 and some patients with high EGFR expression and tyrosine kinase activating mutations, for example, might benefit from less toxic therapies without prior cisplatin/carboplatin treatment. Both metastatic [10] and drug resistant tumor cells [11,12] undergo a low/non proliferative phase that reactivates after extravasation in the case of metastasis, or clonal outgrowth in drug resistant cells. A standard form of measurement could be of help in defining both processes.
EGFR is not a known component of the pre-replication and pre-initiation complexes or the replisome. It is a receptor tyrosine kinase (RTK) in the plasma membrane and functions in the regulation of the ERK/MAPK pathway kinase cascade and IGF-1 mTOR pathway [13]. Activating mutations are often found in the tyrosine kinase domain; RTK inhibitors are effective in LUAD treatment (for a recent review, see [14]). That EGFR activation contributes to cancer is implicit from decades of research [15][16][17][18]. It is difficult to fathom that this gene's over expression would not clearly correlate with increased replication processing. In actuality, despite the large body of work that exists, a full understanding of how or if the receptor functions in the proliferation process is lacking. EGFR is known to phosphorylate PCNA, stabilizing the chromatin bound form [19]; however, the functional significance of the modification is not definitively clear. Some evidence exists that mismatch repair (MMR) is inhibited by EGFR phosphorylation of PCNA [20]. In the present study, the low proliferation status of highly expressed EGFR cases suggests PCNA phosphorylation by EGFR may actually play a role in slowing replication complexes. An alteration leading to over transcription of EGFR and surrounding chromosome 7 genes would be a germline or early event in these tumors, and PCNA phosphorylation would hypothetically result in "replication stress", an early and strong driving force in tumorigenesis, which would then invoke the DNA damage response. Interestingly, transcription and replication start sites often coincide at nucleosome free regions, with some transcription factors playing a role in proliferation [21]. The activation/overexpression of EGFR may initiate a transcriptional feedback inhibition mechanism for EGFR and replication genes. This study shows that tumors with missense activating EGFR mutations did not fall at any one place on the PCNA proliferation curve, and (with one exception) EGFR activated cases were pre-metastatic. Nonetheless, these untreated cases where EGFR activation was determined to be a putative driver had significantly decreased survival on Kaplan Meier plots compared to cases without it, suggesting that over proliferation and metastasis are not driving their lethality. In an attempt to create a narrative of EGFR activated tumors, the remainder of this discussion goes into some detail explaining how the receptor was first associated with proliferation and then examines the functions of genes and pathways found altered in EGFR activated cases.
The earliest report pertinent to EGFR was written in 1947 on "urogastrone", known at the time as an inhibitor of gastric secretion and later identified as EGF [22,23]. Initially isolated from the submaxillary glands of male mice, EGF elicited early eyelid opening and tooth eruption [24]. Embryonal chick epidermal sections incubated with purified EGF and radioactively labeled thymidine had increased nucleotide incorporation and keratinization, establishing EGF's identity as a proliferation factor. Curiously, control embryonal epidermal sections grown without EGF were capable of generating feathers while sections grown in the presence of purified EGF could not. Also, EGF stimulated basal cell production but the columnar orientation of the basal cells was not maintained. EGF did not stimulate the periderm, a structure present in chick embryonic skin made up of squamous cells linked closely together by junctional complexes and sloughed off at hatching [25]. Cohen [24] reported that periderm contact with the supporting filter led to migration of cells with no proliferation. In retrospect, the loss of columnar orientation observed in this early study is suggestive of an EGF/EGFR relationship to the cytoskeleton, abnormal keratinization hints at Wnt signaling involvement, and the disparity in feather formation and periderm behavior suggests that the EGF/EGFR relationship to proliferation is cell type dependent and complex. Molecular mechanisms of chick feather formation involve sonic hedgehog (Shh) [26], and the pathway is also thought to be involved in human lung development [27]; canonical activation is not thought to occur generally in LUAD. In this study EGFR activated LUAD had non-canonical Shh pathway activation through increased GNA12 and Gli3 expression (not Shh, Ptch, and Smo). Both of these genes are found on the short arm of chromosome 7, where fifteen other transcriptionally dysregulated genes were identified, suggesting the region is specifically important in EGFR activated lung cancer.
Recent reports implicate the region in LUAD [28,29]. This study found that, with the exception of TP53 and MCPH1, genes altered in EGFR cases were over expressed, NUDCD3 with the highest frequency. NUDCD3 protein localizes to the cytoskeleton and is important in cytokinesis, mitosis, and dynein (cytoskeletal motor protein) stability. Over expression causes defects in cytokinesis that inhibit proliferation and induce the formation of binucleated cells, multipolar spindles, and lagging chromosomes [30]. Low TP53 expression and TP53 mutation were found in these tumors, as was overexpression of FBXO42 [31] and UBE2D4 [32]. Both FBXO42 and UBE2D4 facilitate TP53 protein ubiquitination and degradation. TP53 absence would eliminate apoptosis, contribute to cell cycle arrest, alter angiogenesis, affect DNA repair, increase IGF-1/mTOR pathway signaling, and change exosome mediated secretion [33]. This study finds an increase in mTOR signaling in these cases, conjecturally a result of TP53 elimination and non-canonical hedgehog signaling (via GSK3β, also a member of Wnt signaling). The PI3K/AKT/mTOR pathway favors tumor growth and cell size over proliferation and regulates autophagy. EGFR activated tumors have been suggested to be oncogene addicted to and dependent on this pathway [34,35]. Autophagy, the endomembrane degradation process necessary for cellular homeostasis [36], has similarities and connections to apoptosis through BCL-2, an inhibitor of both. High EGFR expressing tumors had low BCL-2 expression in this study (data not shown), enabling increased autophagy in the absence of TP53 mediated apoptosis. Dysregulation of autophagy occurs in cancer, neurodegeneration, and microbial infection. This investigation observed autophagy components WIPI2 and YKT6 over expressed in EGFR activated tumors. WIPI proteins normally form a "propeller" that binds specifically to phosphatidylinositol 3-phosphate (PIP3, upregulated in mTOR/AKT/PI3K signaling) responsible for lipidation [37]. YKT6, a member of the SNARE proteins, is important in vesicular trafficking between the endoplasmic reticulum and the Golgi apparatus and in neurotransmitter exocytosis [36]. Proteins involved in autophagy and exosome production, a process related through TP53, paradoxically contribute to tumor suppression in normal cells and tumor promotion in cancer cells. Tumors with TP53 mutation and EGFR activation have altered exosome cargo, currently being examined for biomarker use in liquid biopsies, effect on immune response, and cell-cell intercommunication. The dysregulation of these processes suggests possible involvement in the lethality of the EGFR activated phenotype.
Interestingly, WIPI2, like MCPH1, has a central nervous system (CNS) phenotype when mutated in humans [38,39]. MCPH1 also functions in chromosome condensation [40], the DNA damage response, and regulation of CHK1 and BRCA1 [41], localizing to the centrosome in neurons [42]. Null MCPH1 mutants in Drosophila undergo mitotic arrest with spindles that lack chromosomes, a phenotype that can be suppressed by Chk2 mutation [43]. MCPH1 is synonymous with BRIT1 [44], and is another post-transcriptional regulator of TP53 through ubiquitination. It is implicated in cancer as a tumor suppressor [45,46,47]. This study also suggests MCPH1 is a tumor suppressor in EGFR activated LUAD. However, the lack of increasing significance in Kaplan Meier plots with additional affected genes (including TP53), and the decrease in median survival of the unaltered curves over total LUAD, is perplexing (Table 8, Figure 6), possibly suggesting that, as with NUDCD3, a narrow window of expression is necessary to maintain mitotic capability, or that DDR capability is necessary for EGFR activated tumor cells to replicate and survive.
When EGFR activated cases do metastasize, they target the brain at a much higher rate than non-EGFR activated cases. These tumor cells may bear some functional resemblance to the environment they are capable of metastasizing to, with both tumor and neuron being low proliferating with high EGFR expression and increased membrane trafficking. Some of the membrane trafficking components are known to play a role in neural synaptic transmission. These thoughts raise speculative questions: do unappreciated CNS symptoms exist in pre-metastatic EGFR activated LUAD patients? Could exosomes originating from EGFR activated tumors cross the blood brain barrier and interfere with neural synaptic transmission? Another question that may address EGFR activation lethality: do exosomes affect immune response in the lung?

Tumor cohort, cluster and mutational analysis
The tumor cohort used throughout this study was created and examined by TCGA [2,3]; patient sample information is in Supplementary Table 3. Whole Exome Sequencing was performed on tumor and germline DNA. Cluster analysis based on copy number, DNA methylation, and mRNA expression revealed six subtypes (1-6) [4]. Mutational findings, also based upon data generated by the TCGA Research Network, are found in cBioPortal [5,6,48]. the exonStartStop.txt file were examined in Integrative Genomics Viewer (IGV) [51,52] using RNA-Seq data from the same case to confirm the authenticity of the exon. Z scores were calculated for each exon of each gene by mean-centering the log2 transformed RPKM values on the average level and dividing by the standard deviation. Heat maps visualized high (red), no change/no expression (white), and low (blue) values, with data arranged by LUAD cluster assignments (1-6).
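The exon-level Z score described above can be sketched as follows. This is an illustrative calculation only, not the pipeline used in the study: the function name is hypothetical, and the pseudocount added before the log2 transform (to avoid taking the log of zero) and the use of the sample standard deviation are assumptions not stated in the text.

```python
import math

def exon_z_scores(rpkm_by_case, pseudocount=1.0):
    """Z scores of log2-transformed RPKM values for one exon across cases.

    rpkm_by_case: list of RPKM values, one per tumor case (needs >= 2 cases).
    pseudocount: assumed offset so that RPKM = 0 can be log-transformed.
    """
    logs = [math.log2(v + pseudocount) for v in rpkm_by_case]
    mean = sum(logs) / len(logs)
    # Sample standard deviation (n - 1 in the denominator) is an assumption.
    variance = sum((x - mean) ** 2 for x in logs) / (len(logs) - 1)
    sd = math.sqrt(variance)
    if sd == 0:
        # No variation across cases: report "no change" for every case.
        return [0.0] * len(logs)
    return [(x - mean) / sd for x in logs]

# Toy example: three cases with RPKM 0, 1, and 3 for a single exon.
print(exon_z_scores([0.0, 1.0, 3.0]))
```

Positive values would map to the red end of the heat map and negative values to the blue end, with values near zero shown as white.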
The "Firebrowse" tool in Firehose [7] was used to examine differential expression levels of a specific gene across thirty-seven tumor types.

Clinical data and survival analysis
The TCGA LUAD cohort was made up of two hundred-thirty matched tumor and normal samples from patients that did not have previous treatment. Appropriate informed consent was obtained. All major histologic types of lung adenocarcinoma were represented: 5% lepidic, 33% acinar, 9% papillary, 14% micropapillary, 25% solid, 4% invasive mucinous, 0.4% colloid and 8% unclassifiable adenocarcinoma. A full description of the pathological and histological assessment can be found in the Supplementary Materials of the TCGA report [2] (see also Supplementary Table 3). Kaplan Meier survival plots were constructed for "overall survival" and created either using the cBioPortal survival tool, or by using GraphPad Prism 6.0 software where indicated. In all cases the Log-rank (Mantel-Cox) and Hazard Ratio tests were used to determine significance. In some instances the Gehan-Breslow-Wilcoxon test, which gives weight to deaths at early time points of the survival curve, was also observed.
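For readers unfamiliar with the survival statistics named above, the following is a minimal sketch of a Kaplan Meier estimate and a two-group Log-rank (Mantel-Cox) test. It is a textbook implementation written for illustration, not the cBioPortal or GraphPad Prism code; the function names and toy data are hypothetical.

```python
import math

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    events[i] is 1 if the case died at times[i], 0 if it was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) with 1 d.f."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    death_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in death_times:
        n_a = sum(1 for x in times_a if x >= t)   # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)   # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

# Toy data: group A dies early, group B dies late (all deaths observed).
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(chi2, p)
```

The Gehan-Breslow-Wilcoxon variant mentioned in the text differs by weighting each death time by the number of cases at risk, which emphasizes early deaths; that weighting is not implemented in this sketch.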

Replication component expression as reference curve
To examine subtle relationships between gene expression and replication over the LUAD cohort, Proliferating Cell Nuclear Antigen (PCNA) and other components of the replisome were arranged from minimum to maximum expression and used as a reference curve. Test gene Z scores were arranged according to the reference gene case order using GraphPad Prism 6.0 software; data were obtained from cBioPortal unless otherwise indicated. The rationale for using PCNA in this manner stems from its integral role as a clamp in the replication process, to which many other proliferation factors bind [53]. It is used in this study as a replication marker.
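The reference curve approach above amounts to sorting cases by the reference gene and reading the test gene in the same case order. A minimal sketch follows; the function names and Z score values are hypothetical. The Spearman rank correlation is added here only as one possible numerical summary of a direct or inverse relationship and is an assumption, since the study describes comparing the curves rather than a specific statistic.

```python
def order_by_reference(reference, test):
    """Sort cases by reference-gene expression (minimum to maximum) and
    return the reference and test values in that shared case order."""
    order = sorted(range(len(reference)), key=lambda i: reference[i])
    return [reference[i] for i in order], [test[i] for i in order]

def _average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation (assumed summary statistic, see lead-in)."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

# Hypothetical Z scores for four cases.
pcna = [0.5, -1.0, 2.0, 0.0]
egfr = [-0.2, 1.5, -1.1, 0.3]
print(order_by_reference(pcna, egfr))
print(spearman(pcna, egfr))
```

A coefficient near -1 corresponds to the inverse pattern described for EGFR against PCNA, and a coefficient near +1 to the direct pattern described for genes such as FEN1.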

Identification of altered genes in EGFR activated cases
A list for "total genes" was generated [54] (approximately 22,165). Protein coding genes were screened for mutations, copy number alterations, mRNA expression (RNA Seq V2 RSEM), and protein expression (RPPA) in cBioPortal. Genes with alterations in at least 50% of EGFR activated cases were identified. This gave extensive but not exhaustive results; RNA genes were not available for observation.
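The 50% screen described above can be expressed as a simple frequency filter. The sketch below assumes the per-gene alteration calls have already been exported as sets of case identifiers; the function name, the data layout, and the case identifiers are hypothetical and do not reflect the cBioPortal export format.

```python
def genes_altered_in_fraction(alterations, cases, min_fraction=0.5):
    """Return {gene: fraction} for genes altered in at least min_fraction
    of the given cases.

    alterations: dict mapping gene symbol -> set of case IDs carrying any
                 alteration (mutation, copy number, mRNA, or protein).
    cases: iterable of case IDs defining the cohort of interest,
           for example the EGFR activated cases.
    """
    cohort = set(cases)
    if not cohort:
        return {}
    hits = {}
    for gene, altered_cases in alterations.items():
        fraction = len(set(altered_cases) & cohort) / len(cohort)
        if fraction >= min_fraction:
            hits[gene] = fraction
    return hits

# Toy example with four hypothetical EGFR activated cases.
egfr_cases = ["case1", "case2", "case3", "case4"]
calls = {
    "GENE_A": {"case1", "case2", "case3"},   # 75% of cases
    "GENE_B": {"case4"},                     # 25% of cases
    "GENE_C": {"case1", "case2", "case9"},   # 50%; case9 is outside the cohort
}
print(genes_altered_in_fraction(calls, egfr_cases))
```

Cases outside the cohort of interest are ignored when computing each fraction, so a gene frequently altered elsewhere in LUAD is not counted unless it is also altered in the EGFR activated cases.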

Pathway analysis and reference sources
The Kyoto Encyclopedia of Genes and Genomes (KEGG) resource was used to examine the placement of specific genes in pathways [36]. Canonical pathways and networks were also examined using Metacore [55]; references were examined in Metacore and PubMed.
by The Cancer Genome Atlas (TCGA) Data Access Committee. Participants agreed to participate and gave informed consent. Human Subjects Protection, Data Access Policies, and HIPAA Privacy Rule compliance were developed by the NCI and NHGRI to protect their privacy. This study is compliant with the TCGA "exclusivity period" for publication.

Figure 2: Replication genes examined for genomic and transcriptomic alteration in the context of LUAD subtypes, with survival analysis. (A) Cluster analysis is based on copy number, DNA methylation, and mRNA expression ([2, 3]); tumors with EGFR kinase activation are indicated. (B) Kaplan Meier survival plot, high versus low proliferating clusters. (C) Differential expression of polymerase components in thirty-seven tumor types. (D) Replication genes found with common increased expression, thirty-seven tumors.

Figure 5: Genes found with genomic and transcriptomic alterations in EGFR activated LUAD, with high frequency. (A) Compared to cases with distant metastasis. (B) Compared to TCGA subtypes.

Figure 7: EGFR activated cases compared to pathway markers.