Relationship between EGFR expression and subcellular localization with cancer development and clinical outcome

Epidermal growth factor receptor (EGFR), a prevalent oncogene, regulates proliferation, apoptosis and differentiation and thereby contributes to carcinogenesis. Even so, the documentation of its clinical relevance is surprisingly heterogeneous in the scientific literature. Here, we systematically investigated the correlation of EGFR mRNA expression with survival time and pathological parameters by analyzing 30 datasets in silico. Furthermore, the prognostic value of membrane-bound and cytoplasmic (mcEGFR) and nuclear (nEGFR) EGFR expression was experimentally analyzed by immunohistochemical staining of 502 biopsies from 27 tumor types. We found that protein expression of EGFR showed better prognostic efficiency than mRNA expression, and that mcEGFR expression was positively correlated with nEGFR expression (p < 0.001). Unexpectedly, both mcEGFR and nEGFR expression were associated with low T stage (p < 0.001 and p = 0.004, respectively). Moreover, positive mcEGFR was significantly related to high differentiation (p = 0.027). No significant correlation was found with any other pathological parameter. Collectively, our results imply that the oncogenic function of EGFR may be more related to nascent stages of carcinogenesis than to advanced and progressive tumors, which may also explain, at least partially, the occurrence of secondary resistance against EGFR-directed therapy.


INTRODUCTION
Epidermal growth factor receptor (EGFR), also known as ErbB1 or HER1, together with three homologues (HER2, HER3 and HER4) composes the ErbB family of tyrosine kinase receptors (TRKs). EGFR is a transmembrane receptor with a molecular weight of 175 kDa. Upon binding to its ligands such as epidermal growth factor (EGF) or transforming growth factor-α (TGF-α), EGFR homo- or hetero-dimerizes with its counterparts [1]. Such dimerization stimulates auto-phosphorylation of several tyrosine residues in its intracellular kinase domain, which further activates downstream transduction cascades, e.g. PI3K/AKT, MAPK/ERK and PLCγ1/PKC, to promote cell proliferation and differentiation [2].
Signal transduction of EGFR is ordinarily under tight control in human beings. However, tumor patients tend to display deregulated EGFR activity, mostly due to point mutations, exon 8 deletion or gene amplification [3][4][5]. Abnormal enhancement of EGFR activity represents a carcinogenesis initiator. In this context, anti-EGFR strategies, e.g. small-molecule tyrosine kinase inhibitors (TKIs) such as gefitinib or monoclonal antibodies such as panitumumab, have gained great clinical success in the past years [6].
Besides functioning as a carcinogenesis initiator, excessive EGFR activity is also considered to affect subsequent malignant development. Despite its unambiguous role as an oncogene, the documentation of its clinical relevance is surprisingly heterogeneous in the scientific literature [7]. In the present investigation, we estimated the association of EGFR with clinical outcomes and pathological parameters at both mRNA and protein levels. We assessed EGFR mRNA expression and its correlation with overall survival (OS), TNM stage and grade of patients from 30 datasets covering 15 cancer types, and compared 30 studies in this regard. We also performed immunohistochemical analysis on 502 human cases covering 27 tumor types and studied the correlation between EGFR protein expression and clinical outcomes or pathological characteristics. The expression pattern (membranous and cytoplasmic, or nuclear) served as explanatory variable, because granular EGFR expression in the nucleus has been described as a factor of resistance to chemo- and radiotherapy [8][9][10]. Here, we integrated this information and considered how it might best be applied in routine clinical diagnosis.

Correlation of EGFR mRNA expression and clinical outcomes
Thirty datasets were screened with filters in the Oncomine database. The filter flow is shown in Figure 1. Among the 30 datasets (Tables 1-3), 23 (76.7%) did not show any significant association between EGFR mRNA level and clinical outcome or pathological characteristics of patients. Datasets GSE22226 and GSE10846 showed significant associations between high EGFR mRNA expression levels and poor overall survival (cutoff mean, p = 0.03 in both datasets) (Table 1). However, the opposite effect was documented with statistical significance in datasets GSE4412 and GSE15081 (cutoff median = mean, p = 0.02; cutoff median, probe AGhsB031519, p = 0.04), which indicated that a high EGFR mRNA expression level was correlated with better overall survival.
Since EGFR mRNA expression did not correlate with survival times of patients, we were interested in analyzing whether EGFR protein expression was of prognostic value.

Survey of immunohistochemical studies
Thirty studies filtered with the keywords "EGFR", "expression", "predictor", "biomarker" and "prognosis/prognostic" were included in our survey (Table 4). Eighteen studies (60%) revealed that high EGFR protein expression significantly correlated with poor clinical outcome parameters, e.g. overall survival (OS), progression-free survival (PFS) and disease-free survival (DFS), as well as poor pathological characteristics, e.g. TNM stage, grade or overall stage of patients. The other studies reported no significant correlations. Compared to the rather poor prognostic value of EGFR mRNA expression, EGFR protein expression was of superior utility. Likewise, more significant associations with pathological characteristics were observed.

Correlation of EGFR protein expression and pathological parameters
To validate whether EGFR protein expression and specifically its expression pattern as mcEGFR or nEGFR may provide paired associations with pathological characteristics, we conducted immunohistochemistry on a total number of 502 cases covering 27 tumor types.
Among all cases, the frequency of negative, weak, moderate and strong staining was 36.25%, 30.08%, 27.89% and 5.78% for mcEGFR, and 48.24%, 26.13%, 15.08% and 10.55% for nEGFR, respectively (Figure 2). In our investigation, higher expression levels of both mcEGFR and nEGFR occurred less frequently (Figure 3). In other words, extremely high EGFR expression, regardless of membrane-bound or nuclear expression pattern, was rather rare among the tumors investigated. Furthermore, we identified the distribution of mcEGFR and nEGFR expression in different tumor types (Figure 4). As shown in Figure 4A, mcEGFR was most highly expressed in brain tumors, followed by lung tumors. Compared to lung tumors, expression in brain tumors tended to be more intense if the whisker range was taken into consideration. Uterus, colorectal and kidney tumors expressed mcEGFR in a similar manner. Breast, ovary, pancreas and prostate tumors revealed comparatively low expression levels. Notably, a few breast tumors showed strong mcEGFR expression that exceeded the whisker range. Tumor types comprising fewer than 5 cases were classified as "others" (Figure 4B), among which fallopian tube tumors ranked highest, while parotid and testis tumors ranked lowest. However, these results cannot provide accurate information due to the limited case numbers. In the case of nEGFR, brain tumors were excluded from analysis due to the difficulty in determining nEGFR in this tumor entity. nEGFR was most frequently found in lung tumors, followed by kidney, colorectum, pancreas, ovary and uterus tumors (Figure 4C). In addition, stomach tumors also expressed high nEGFR (Figure 4D). However, nEGFR expression in breast and prostate tumors was comparatively rare.
To explore the relationship between mcEGFR and nEGFR, we performed independent t-tests with negative or positive expression of nEGFR as grouping variable. Furthermore, we categorized the H-score into four levels as described in the methods. Pearson's χ2-test was applied to assess the independence between H-score levels and nEGFR levels (Table 5). The result provided a compelling argument that mcEGFR and nEGFR are dependent factors (p < 0.001). In addition, there was a significant difference in mean H-score between the negative and positive nEGFR groups (p < 0.001), which indicated that cases with negative nEGFR also showed lower mcEGFR expression than cases with positive nEGFR.
To further explore the correlation of EGFR protein expression and pathological characteristics, we first ran one-way ANOVA mean comparison tests of the mcEGFR H-score across TNM stage and grade, respectively. Then, we used Pearson's χ2-test to determine the independence of the H-score, grouped as negative and positive, from TNM stage and grade, respectively. Unexpectedly, there was an inverse association between mcEGFR and T stage in the mean comparison (Figure 5, p < 0.001). In addition, H-score group and T stage were dependent in an inverse manner as well (p < 0.001). Moreover, positive mcEGFR was associated with low grade (p = 0.027) in Pearson's χ2-test. The same trend was also found in the one-way ANOVA mean comparison test, but without significance (p = 0.233). However, no significant difference was found for any other pathological parameter, nor were any dependent relationships found between these parameters (Table 6). Interestingly, nEGFR revealed consistent results: its expression and T stage were inversely dependent (p = 0.004) by Pearson's χ2-test (Table 6).
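As an illustration of the independence tests reported above, Pearson's χ2-test on a contingency table can be sketched as follows. This is a minimal standard-library sketch and not the SPSS procedure used in the study; the function names `chi2_sf` and `pearson_chi2` and the example counts are hypothetical, no continuity correction is applied, and all row and column totals are assumed to be positive.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution for integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even degrees of freedom.
        total = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * total
    # Closed form for odd degrees of freedom.
    total = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def pearson_chi2(table):
    """Pearson chi-square test of independence for an r x c table of counts.

    Returns (statistic, degrees of freedom, p-value).
    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, chi2_sf(stat, df)


# Hypothetical counts (not study data): rows = EGFR negative / positive,
# columns = low / high category of a pathological parameter.
example_table = [[10, 20], [20, 10]]
statistic, dof, p_value = pearson_chi2(example_table)
print(f"chi2 = {statistic:.3f}, df = {dof}, p = {p_value:.4f}")
```

The same function accepts larger tables, for example four H-score levels against four nEGFR levels, in which case the degrees of freedom are (4 - 1) × (4 - 1) = 9.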

DISCUSSION
EGFR is well known as an oncogenic signal regulating proliferation, apoptosis and differentiation, and thereby contributes to carcinogenesis. The development of specific small molecules and antibodies targeted to EGFR represents an attractive clinical implementation [11,12].
The reasons for EGFR overexpression are related to EGFR gene amplification, receptor-activating mutations, or deficiency of negative regulatory mechanisms [13].
Here, we investigated the prognostic value of EGFR mRNA expression by mining the data deposited in the GEO and Oncomine databases. Although there are studies revealing that high EGFR mRNA expression [13-18] or even gene copy number [19] was correlated with poor clinical outcomes or pathological characteristics, a more systematic evaluation of published studies did not validate the proposed impact of EGFR mRNA expression. The inconsistency may partially be attributed to the choice of the EGFR probe. Microarray chips normally provide several probes targeting the same gene. Expression intensities measured by different probes can differ greatly, which may even lead to completely opposite conclusions. We used the optimal probe for our analysis based on the jetset concept [20], which means that only probes providing comparatively better overall specificity, coverage and robustness were chosen. Since no correlation was found based on mRNA expression, we assessed 30 independent studies, assuming that EGFR protein expression might be a more promising prognostic factor than EGFR mRNA expression.
As demonstrated by elegant analyses, there exist two distinct patterns of EGFR expression. Upon stimulation with ligands, mcEGFR undergoes COPI-mediated retrograde trafficking from the Golgi apparatus to the endoplasmic reticulum. With the help of importin β1 and Sec61β, mcEGFR can be shuttled from the outer nuclear membrane to the inner nuclear membrane and finally released into the nucleoplasm to become nEGFR [21,22]. Therefore, we went one step further and investigated whether the protein expression pattern (membranous and cytoplasmic, or nuclear) would make a difference with regard to clinical outcomes or pathological characteristics.
Although it has been reported that, once it has entered the nucleus, nEGFR functions in a manner distinct from its cytoplasmic membrane counterpart [9,10,23-27], we primarily focused on clarifying the relationship between mcEGFR and nEGFR. In the current study, we observed a clear positive correlation between mcEGFR and nEGFR (p < 0.001). Furthermore, both mcEGFR and nEGFR expression were unexpectedly associated with T stage in an inverse manner (p < 0.001 and p = 0.004, respectively). Positive mcEGFR was related to high differentiation (p = 0.027). We also revealed the diverse distribution patterns of both mcEGFR and nEGFR within different tumor types.
Taken together, our results indicated that protein rather than mRNA expression reflects the prognostic value of EGFR. This may have important implications, since results based on EGFR expression obtained by mRNA microarray and next-generation sequencing technologies may be less informative than those resulting from protein arrays or immunohistochemical analyses. Recently, the nuclear expression of EGFR has come more into the focus of attention; it can only be monitored by methods based on protein visualization and localization. Furthermore, both mcEGFR and nEGFR expression were associated with low T stage, and positive mcEGFR was related to low grade and thus high tissue differentiation. This may imply that the oncogenic function of EGFR is more related to nascent stages of carcinogenesis than to advanced and progressive tumors, which may also explain, at least partially, the occurrence of secondary resistance against EGFR-directed therapy.

Tumor cases
A total of 502 formalin-fixed and paraffin-embedded tumor cases covering 27 tumor types were obtained from different sources. Ovarian and endometrial carcinoma biopsies were provided by Prof. Jose Schneider and belong to the tumor banks of Hospital Universitario de Cruces, Bilbao, Spain and Hospital Universitario Valdecilla, Santander, Spain, respectively, and were to a large extent used in previous studies on oncogenic activation in gynecologic tumors [28,29]. Relevant data and ethical approval by the Wandsworth Ethics Committee (Wandsworth, UK, Ref: 08/H0803/3) regarding colon cancer have been published by us [30].

Statistical evaluation of the GEO and Oncomine databases
EGFR mRNA expression data and the corresponding overall survival time, TNM stage and grade information were obtained from the GEO (https://www.ncbi.nlm.nih.gov/geo/) and Oncomine (https://www.oncomine.org/) databases. Normalized and log2-transformed EGFR mRNA expression values of jetset probes were classified as "low" or "high" using both the median and the mean as cut-off values. Thirty datasets covering 15 cancer types were analyzed for time-to-event distributions estimated with Kaplan-Meier curves, with the log-rank test used to assess significance. Associations of EGFR mRNA expression level with pathological characteristics were determined by Pearson's χ2-test. The above-mentioned statistical analyses were performed using IBM SPSS Statistics version 23 (IBM, USA). Statistical differences with p-values less than 0.05 were considered significant.
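The grouping and survival comparison described above can be sketched with the Python standard library. This is an illustrative sketch only, not the SPSS workflow used in the study: the function names `dichotomize` and `logrank_test` and the toy data are hypothetical, values strictly above the cut-off are assumed to count as "high", and the two-group log-rank statistic is referred to a χ2 distribution with one degree of freedom.

```python
import math
from statistics import mean, median


def dichotomize(values, cutoff="median"):
    """Label each expression value 'high' or 'low' relative to the median or mean."""
    if cutoff == "median":
        threshold = median(values)
    elif cutoff == "mean":
        threshold = mean(values)
    else:
        raise ValueError("cutoff must be 'median' or 'mean'")
    # Assumption: values strictly above the cut-off are 'high'.
    return ["high" if v > threshold else "low" for v in values]


def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times  : follow-up time per patient
    events : 1 if the event (death) was observed, 0 if censored
    groups : 'high' or 'low' per patient
    """
    observed_minus_expected = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = [(g, tt, e) for g, tt, e in zip(groups, times, events) if tt >= t]
        n = len(at_risk)
        n_high = sum(1 for g, _, _ in at_risk if g == "high")
        d = sum(1 for _, tt, e in at_risk if tt == t and e == 1)
        d_high = sum(1 for g, tt, e in at_risk if g == "high" and tt == t and e == 1)
        # Observed minus expected events in the 'high' group at this time point.
        observed_minus_expected += d_high - d * n_high / n
        if n > 1:
            # Hypergeometric variance of the number of events in the 'high' group.
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, df = 1
    return chi2, p_value


# Toy data (not study data): log2 expression, survival time in months, event flag.
expression = [9.1, 8.7, 6.2, 5.9]
months = [1, 2, 3, 4]
died = [1, 1, 1, 1]
labels = dichotomize(expression, cutoff="median")
print(labels, logrank_test(months, died, labels))
```

Running the same comparison with `cutoff="mean"` mirrors the paper's use of both cut-offs; the two groupings coincide only when the median and mean split the cohort identically.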

Search strategy
Thirty independent studies [14,19, based on immunohistochemical EGFR determination were identified in PubMed (https://www.ncbi.nlm.nih.gov/pubmed) by combining the search terms "EGFR", "expression", "predictor", "biomarker" and "prognosis/prognostic". These studies were used to estimate EGFR protein expression and its correlation with clinical outcomes, in comparison to the mRNA-based analyses derived from the GEO and Oncomine databases.

Immunohistochemistry and statistical application
Immunohistochemistry was performed on 502 biopsies using an EGFR rabbit monoclonal antibody (Clone EP38Y; Thermo Fisher Scientific, Dreieich, Germany) as primary antibody. The staining procedure has been previously published by us [59]. Quantification of immunostainings was performed using Panoramic Desk (3D Histotech Panoramic digital slide scanner, Budapest, Hungary). Membranous and cytoplasmic EGFR (mcEGFR) was quantified as H-score using the MembraneQuant software. A minimum of three representative areas per tumor was scanned, and the mean values together with standard deviations were calculated. One hundred and four cases were excluded from the nuclear EGFR (nEGFR) analysis because strongly positive mcEGFR staining could not be reliably distinguished from nEGFR. The other 398 cases were manually graded regarding nEGFR expression.
We used one-way ANOVA to compare mean mcEGFR H-scores across cancer types, TNM stages and grades, respectively. The independent t-test was used to determine differences in the distribution of the mcEGFR H-score between nEGFR-negative and nEGFR-positive groups. mcEGFR and nEGFR were further categorized either into four degrees or into negative and positive groups according to expression intensity. For mcEGFR, H-scores below 20 were grouped as negative, H-scores ranging from 20 to 115 as weakly positive, from 115 to 210 as moderately positive and above 210 as strongly positive. The latter three groups were all considered positive. The signal-to-noise cutoff of the mcEGFR H-score was determined by the H-score obtained from negative controls (omission of the primary antibody during the staining procedure). nEGFR was similarly grouped as negative, weak, moderate and strong positive immunostaining, or as negative and positive groups. As categorical data, both mcEGFR and nEGFR and their association with pathological TNM stage and grade were assessed by Pearson's χ2-test. The above statistical analyses were performed using IBM SPSS Statistics version 23 (IBM, USA). Statistical differences with p-values less than 0.05 were considered significant. Notably, for grade-relevant analyses, cases graded as G0 were excluded; well differentiated to moderately differentiated cases were grouped as low grade, while moderate-to-poorly differentiated to poorly differentiated cases were grouped as high grade.
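The per-tumor summary of scanned areas can be illustrated with a short sketch. The function name `summarize_h_scores` and the example values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was calculated.

```python
from statistics import mean, stdev


def summarize_h_scores(area_scores):
    """Return (mean, standard deviation) of the H-scores of one tumor.

    area_scores : H-scores of the scanned representative areas.
    At least three areas are required, following the minimum stated in the text.
    """
    if len(area_scores) < 3:
        raise ValueError("at least three representative areas are required")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return mean(area_scores), stdev(area_scores)


# Hypothetical H-scores of three areas from one tumor (not study data).
tumor_mean, tumor_sd = summarize_h_scores([100.0, 110.0, 120.0])
print(tumor_mean, tumor_sd)
```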
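The H-score grouping described above can be written as a small helper. This is a sketch under stated assumptions: the function names `categorize_h_score` and `is_positive` are hypothetical, and because the text gives the ranges "20 to 115" and "115 to 210" without saying how the shared boundary values were assigned, the boundaries 115 and 210 are assumed here to fall into the lower category.

```python
def categorize_h_score(h_score):
    """Map an mcEGFR H-score to one of the four intensity levels in the text."""
    # Assumption: boundary values 115 and 210 belong to the lower category.
    if h_score < 20:
        return "negative"
    if h_score <= 115:
        return "weak"
    if h_score <= 210:
        return "moderate"
    return "strong"


def is_positive(h_score):
    """Weak, moderate and strong staining are all treated as positive."""
    return categorize_h_score(h_score) != "negative"


# Hypothetical H-scores (not study data).
for score in (12.0, 20.0, 150.0, 250.0):
    print(score, categorize_h_score(score), is_positive(score))
```

The four-level output corresponds to the categories used for the χ2-test against nEGFR levels, while the binary output corresponds to the negative and positive groups tested against TNM stage and grade.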

Figure 1: Filter flow for dataset screening.

Figure 3: Distribution of mcEGFR and nEGFR among all tumor types. (A) Histogram of the H-score distribution, as indicator of mcEGFR expression, among all 502 biopsies. (B) Histogram of the nEGFR distribution among all 398 biopsies. The values 0, 1, 2 and 3 on the x-axis of the nEGFR histogram indicate negative, weak, moderate and strong expression, respectively.

Figure 4: Distribution of EGFR in different tumor tissue types. (A) H-score distribution, as indicator of mcEGFR expression, in different tumor types. All tumor types comprising fewer than 5 cases were grouped as "others". Tissue types are color coded as shown in the legend. (B) H-score distribution among "others", i.e. tumor types with fewer than 5 cases each. The plot was drawn according to H-score and tumor type. Tissue types are color coded as shown in the legend. (C) Distribution of nEGFR among different tumor types. nEGFR levels were classified as "Negative", "Weak", "Moderate" and "Strong", and the levels are coded in green, light yellow, yellow and orange, respectively. (D) nEGFR expression among tumor types with fewer than 5 cases. The heat map was drawn according to nEGFR level and tissue type. The 3-color scale indicates the frequency of nEGFR expression, where green represents 0 cases, yellow 1 case and orange 2 cases. Detailed information about "others" is given in Supplementary Table 2.

Figure 5: Correlation between H-score and T stage. T stage is color coded: green represents T1, light yellow T2, yellow T3 and orange T4.

Table 1: Correlation of EGFR mRNA expression and overall survival
Columns: Cancer type; GEO accession; Jetset probe; OS (p value) by Median and by Mean.
P value < 0.05 was labeled with an asterisk. OS, overall survival. Median, EGFR mRNA expression grouped as "high" and "low" by median. Mean, EGFR mRNA expression grouped as "high" and "low" by mean.

Table 2: Correlation of EGFR mRNA expression and grade
P value < 0.05 was labeled with an asterisk. Median, EGFR mRNA expression grouped as "high" and "low" by median. Mean, EGFR mRNA expression grouped as "high" and "low" by mean.

Table 3: Correlation of EGFR mRNA expression and TNM stage
P value < 0.05 was labeled with an asterisk. T, N and M represent T stage, N stage and M stage, respectively. Median, EGFR mRNA expression grouped as "high" and "low" by median. Mean, EGFR mRNA expression grouped as "high" and "low" by mean.

Table 4: Survey of immunohistochemical studies
Abbreviations: OS, overall survival; PFS, progression-free survival; DFS, disease-free survival; NS, not significant, but the article did not provide exact data. P value < 0.05 was labeled with an asterisk.

Table 6: Correlation of EGFR protein expression and pathological characteristics
Columns: mcEGFR, No. patients (% within pathological parameters), with one-way ANOVA/independent t-test and Pearson's χ2-test; nEGFR, No. patients (% within pathological parameters), with Pearson's χ2-test.
P value < 0.05 was labeled with an asterisk. G0 and N3 cases were excluded from analysis. Well differentiated to moderately differentiated cases were grouped as low grade, while moderate-to-poorly differentiated to poorly differentiated cases were grouped as high grade.
Further tumor biopsies were obtained from Dr. Zahir Yassin (Tayba Cancer Centre, Khartoum, Sudan) with ethical approval from the National Medicines and Poisons Board, Sudan (dated: September 20, 2015; Ref.: TQM/Pir-F/4). In addition, two tissue microarrays (TMAs), BC000119 (Biomax Inc., Derwood, USA) and T8235713 (Biocat, Heidelberg, Germany), were obtained commercially. Three further TMAs were provided by the Tissue Bank of the Institute of Pathology, University Medical Center, Mainz, Germany, with ethical approval from the Ethics Committee of the State Authorization Association for Medical Issues (Landesärztekammer) Rheinland Pfalz (dated: March 22, 2018; Ref. 2018-13179). All patients gave informed consent prior to participation. Information on all tumor cases is given in Supplementary Table 1.