Prognostic gene expression profiling in esophageal cancer: a systematic review

Background: Individual variability in prognosis of esophageal cancer highlights the need for advances in personalized therapy. This systematic review aimed at elucidating the prognostic role of gene expression profiles and at identifying gene signatures to predict clinical outcome.
Methods: A systematic search of the Medline, Embase and the Cochrane library databases (2000-2015) was performed. Articles associating gene expression profiles in patients with esophageal adenocarcinoma or squamous cell carcinoma to survival, response to chemo(radio)therapy and/or lymph node metastasis were identified. Differentially expressed genes and gene signatures were extracted from each study and combined to construct a list of prognostic genes per outcome and histological tumor type.
Results: This review includes a total of 22 studies. Gene expression profiles were related to survival in 9 studies, to response to chemo(radio)therapy in 7 studies, and to lymph node metastasis in 9 studies. The studies proposed many differentially expressed genes. However, the findings were heterogeneous and only 12 (ALDH1A3, ATR, BIN1, CSPG2, DOK1, IFIT1, IFIT3, MAL, PCP4, PHB, SPP1) of the 1,112 reported genes were identified in more than 1 study. Overall, 16 studies reported a prognostic gene signature, which was externally validated in 10 studies.
Conclusion: This systematic review shows heterogeneous findings in associating gene expression with clinical outcome in esophageal cancer. Larger validated studies employing RNA next-generation sequencing are required to establish gene expression profiles to predict clinical outcome and to select optimal personalized therapy.


INTRODUCTION
Esophageal cancer is the eighth most common cancer worldwide, with 450,000 new cases and 400,000 estimated deaths per year. [1,2] The two main types of esophageal cancer, squamous cell carcinoma (SCC) and adenocarcinoma (AC), differ in pathogenesis, epidemiology, tumor biology, prognosis and treatment strategies. [3,4] Multimodality treatment, combining esophagectomy with perioperative chemotherapy or neoadjuvant chemoradiotherapy, has been shown to improve patients' survival and is therefore the standard treatment for curable esophageal cancer. [5][6][7] However, due to the aggressive character of the tumor and the lack of effective individualized treatment, survival remains poor, with 5-year survival rates of merely 36-47%. [5][6][7][8] Moreover, large individual differences in survival, treatment response and metastasis emphasize the need for more personalized therapy. Existing histopathological terms, such as the pathologic TNM classification, are insufficient to accurately predict these individual differences in outcome and to inform personalized treatment. [9][10][11] Evidence for the potential prognostic role of gene expression profiles is accumulating. Gene signatures may find clinical application in predicting survival, response to neoadjuvant treatment and metastatic potential. This would enable individualized targeted therapy in order to avoid unnecessary treatment and improve quality of life and longevity.
Review Oncotarget 5567 www.impactjournals.com/oncotarget
Prognostic pretreatment gene expression profiles have already been identified through genome-wide microarray analysis for rectal adenocarcinoma [12,13] and breast cancer [14][15][16]. With regard to esophageal cancer, multiple studies have suggested a clear association between gene expression and clinical outcome. This systematic review aims to provide an overview of the results, in order to outline the current understanding of the predictive potential of gene expression for survival, response to chemo(radio)therapy, and lymph node metastasis.

Adenocarcinoma
Of the 6 studies [25,27,29,30,36,37] conducted on AC, 5 studies [27,29,30,36,37] reported gene signatures associated with survival. One study [36] reported and validated a prognostic 4-gene signature and showed that underexpression of DCK, PAPSS2 and SIRT2 in combination with overexpression of TRIM44 was associated with a decrease in 5-year survival from 58% to 14%. A second study [29] discovered clusters of patients with differential gene expression profiles and further investigated genes with overexpression in the poor prognosis cluster. External validation of these genes showed a significant association between a 2-gene signature, with combined overexpression of SPARC and SPP1, and poor survival. A third study [30] also performed cluster analysis and found that another validated 4-gene signature (EGFR, MTMR9, NEIL2, and WT1) was able to stratify patients into 5 survival clusters. A similar study [37] identified a 165-gene signature to classify patients into a good survival cluster and a poor survival cluster. The last study [27] took another approach: it divided a cohort of patients into a good survival group and a poor survival group and compared gene expression between both groups to find a 59-gene signature predictive of survival.

Squamous cell carcinoma
As for SCC, only 1 [38] of the 3 studies [19,38,40] identified a prognostic gene signature. This study [38] found an association between overexpression of the randomly selected gene CTTN and shorter disease-free survival in an external validation cohort.

Gene expression and response to chemo(radio)therapy
The 7 studies [24,27,28,32,34,35,39] that analyzed gene expression in association with response to chemo(radio)therapy are summarized in Table 3. Patients received neoadjuvant chemotherapy in 3 studies [27,28,35] and neoadjuvant chemoradiotherapy in 4 studies [24,32,34,39], following varying regimens as specified in Table 2. All studies obtained fresh frozen pretreatment endoscopic biopsies for microarray analysis. Response to chemo(radio)therapy was defined differently in each study and evaluated on the basis of resection specimens in 4 studies [32,34,35,39] and on the basis of (a combination of) medical imaging techniques in 3 studies [24,27,28]. All studies proposed a gene signature prognostic for response to chemo(radio)therapy, which was externally validated in 3 studies [28,34,39].

Adenocarcinoma
Two studies [27,35] found prognostic genes and a gene signature for AC only. One study [35] compared gene expression between responders and non-responders to chemotherapy. The results demonstrated a significant correlation between the overexpression of Ephrin B3 and response. Another study [27] identified, on the basis of endoscopic ultrasound (EUS) only, a 113-gene signature correlated with chemotherapy response.

Squamous cell carcinoma
Two studies [28,39] were conducted on SCC only. One study [28] documented a 199-gene signature and showed in external validation that 1 gene (PERP) was underexpressed and 4 genes (DAD1, PRDX6, SERPINB6 and SRF) were overexpressed in non-responders compared to responders to chemotherapy. Similarly, another study [39] identified a 3-gene signature and found that a combination of underexpression of C1orf226 and LIMCH1, and overexpression of MMP1 was predictive for responders.

Adenocarcinoma and squamous cell carcinoma
Three studies [24,32,34] obtained biopsies of both AC and SCC. The first study [34] found differentially expressed genes, of which 12 genes were selected for external validation. In a model of 5 genes, underexpression of 4 genes (EPB41L3, NMES1, RNPC1, STAT5B) and overexpression of 1 gene (RTKN) was able to distinguish responders from non-responders to chemoradiotherapy. The second study [32] showed that overexpression of 3 genes (PERP, S100A2 and SPRR3) was able to characterize complete responders to chemoradiotherapy. The third study [24] investigated differential expression in both AC and SCC and identified a 32-gene signature that was predictive for response to chemoradiotherapy in SCC only.
... 25,26,31,36] the International Union Against Cancer (UICC) TNM classification was used by pathologists to assess lymph node metastasis and in 3 studies [20,21,33] no form of classification was stated. All 9 studies identified prognostic genes with differential expression between patient groups with node-negative (N0) versus node-positive (N+) tumors. Of these, 4 studies [20,22,26,31] reported a gene signature, which was validated in an external cohort in 3 studies [20,26,31].

Adenocarcinoma
In AC, 3 studies [25,33,36] identified differentially expressed genes but none reported a gene signature to predict N+ tumors.

Overlap in gene expression
Supplementary Table S3 combines the available results of all reviewed studies into a list of genes for each of the 3 outcomes per histological tumor type. In association with survival, the 9 studies identified a total of 1,337 genes (excluding 1 study [19] that did not specify the number of genes identified with differential expression), of which 277 genes (21%) (range 1-113 per study) were reported. Only for AC, overlap between the studies was found with respect to 9 genes (ALDH1A3, BIN1, CSPG2, DOK1, IFIT1, IFIT3, PHB, SPP1) which were described by 2 of the 5 studies. For response to chemo(radio)therapy, the 7 studies identified a total of 19,726 genes (excluding 1 study [28] that identified 19,166 genes with differential expression), of which 158 genes (1%) (range 1-49 per study) were reported. No overlap in genes was found between the studies. The 9 studies that studied gene expression in relation to lymph node metastasis identified a total of 1,001 genes and reported a total of 677 genes (68%) (range 5-252 per study). Of these, 3 genes (ATR, MAL, PCP4) were described in 2 studies investigating SCC.
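The reported percentages can be checked with simple arithmetic. The following is a minimal sketch in Python, using only the counts stated in the paragraph above; the variable and function names are illustrative and not part of the review's methods.

```python
# Counts taken from the paragraph above: (genes identified, genes reported) per outcome.
counts = {
    "survival": (1337, 277),
    "response to chemo(radio)therapy": (19726, 158),
    "lymph node metastasis": (1001, 677),
}


def reported_percentage(identified: int, reported: int) -> int:
    """Share of identified genes that were reported, rounded to a whole percent."""
    return round(100 * reported / identified)


# Percentage of identified genes that were reported, per outcome.
percentages = {
    outcome: reported_percentage(identified, reported)
    for outcome, (identified, reported) in counts.items()
}

# Sum of reported genes over the three outcomes.
total_reported = sum(reported for _, reported in counts.values())

for outcome, pct in percentages.items():
    print(f"{outcome}: {pct}%")
print(f"total reported genes: {total_reported}")
```

The per-outcome reported counts sum to the total number of reported genes included in Supplementary Table S3.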

DISCUSSION
This systematic review aimed to identify prognostic genetic profiles in esophageal carcinoma. The results demonstrate a large heterogeneity in gene expression profiles and gene signatures predicting survival, response to chemo(radio)therapy, and lymph node metastasis. None of the identified gene signatures is directly applicable in clinical practice at present. However, microarray analysis and genome sequencing might provide valuable information for predicting individual variations in clinical outcome and establishing personalized therapy for esophageal cancer patients in the near future.
A careful literature search and quality assessment were directed at including all studies of moderate and high quality relevant to prognostic gene expression. The 22 included studies were assessed on the same criteria and genes with prognostic relevance were retrieved from the article or through additional contact with study authors if possible. In support of a prognostic role, all studies, except 2 [30,38], identified genes with differential expression with regard to the investigated outcome and 16 studies [20,22,24,26-32,34-39] identified and reported a gene signature. Consistent with the distinct epidemiology and tumor biology of AC and SCC [4,11], most studies conducted in East Asia included exclusively SCC and most studies in western countries focused on AC only. Of the 3 studies [24,32,34] investigating both AC and SCC, 2 studies [32,34] described a gene signature for both tumor types, while the third study [24] found a gene signature that was predictive for SCC only.
Despite the thorough review, studies to date have been heterogeneous in both methods and results. Therefore, comparison of the different studies has been unable to establish a repeatedly identified gene signature with clinical relevance. Studies differed largely in the documented number of prognostic genes, and genes were neither reported nor provided after contacting authors in 8 articles [26-28, 32, 35, 37, 39, 40]. Although the 22 studies identified a large number of prognostic genes (Tables 2, 3 and 4), only 1,112 genes were reported and included in Supplementary Table S3. Comparison of data per outcome and histological subtype showed that only 12 genes recurred in more than 1 study, suggesting a high false positive rate of identified genes. Only 9 genes (ALDH1A3, BIN1, CSPG2, DOK1, IFIT1, IFIT3, PHB, SPP1) were described in 2 studies on survival in AC and only 3 genes (ATR, MAL, PCP4) in 2 studies on lymph node metastasis in SCC. There was no overlap between studies investigating response to chemo(radio)therapy.
The heterogeneity in identified genes can be attributed to several factors. Firstly, the studies used limited sample sizes and thus individual genetic variability may have largely impacted the identified genes. The studies included 8 to 89 patients, with only 3 studies [25,33,36] investigating 75 or more patients.
Moreover, only 10 [20, 26, 28-31, 34, 36, 38, 39] of the 16 reported gene signatures were validated in external cohorts. Validation in independent cohorts increases reliability of the gene signatures and is required to achieve clinical implementation. In addition, the included studies showed large methodological variation in treatment of patients, definition and evaluation of outcome, employed microarray analysis and chosen cut-off point for differential expression.
The studies investigating survival showed heterogeneity in whether gene expression was correlated to continuous survival [25,36,38], compared between differently defined poor and good survival groups [27,40] or used to create patient clusters [19,29,30,37]. More importantly, response to chemo(radio)therapy was assessed using computerized tomography (CT) scans, fluorodeoxyglucose positron emission tomography (FDG-PET), or endoscopic ultrasound (EUS) with or without biopsy in 3 studies [24,27,28]. Accuracy of these imaging techniques in evaluating response is suboptimal compared to histopathology, which is the gold standard [41][42][43]. Conclusions on response to chemo(radio)therapy were further complicated by the use of different chemotherapy regimens and varying definitions of response. Response was defined as either a 50% reduction of viable tumor cells [28,35] or the absence of residual tumor cells [24,32,34,39]. These are 2 distinct predictive categories of partial response and complete response, respectively. [44,45] Similarly, the studies investigating lymph node metastasis differed in the classification system used and the lymph node stage compared.
Despite these limitations, the current findings show that gene signatures can be of great prognostic value for clinical outcomes and are therefore paramount in understanding pathogenesis and selecting optimal personalized therapy for the individual patient. A gene signature to predict survival may be able to explain why some patients with good tumor characteristics show shorter disease-free survival than expected, and vice versa, thus offering information that is not accurately provided by the pathologic TNM classification. [46,47] Moreover, patients who are unlikely to benefit from chemo(radio)therapy could be selected to receive direct surgical resection, avoiding unnecessary toxicity [7,48] and delay in surgical treatment with risk of disease progression. Conversely, a restrictive and non-surgical approach with less comorbidity might be considered for patients who are likely to be complete responders to chemo(radio)therapy. In addition, the identification of a gene signature to predict lymph node metastasis would be a powerful diagnostic tool. Lymph node-negative patients could receive limited-field radiotherapy with reduced treatment-related toxicities. [49] Furthermore, an invasive extended lymphadenectomy might be avoided in these patients, limiting the risk of postoperative morbidity. [50] Although the included studies did not investigate other prognostic variables such as tumor differentiation, perineural growth and angioinvasive growth, gene signatures for these variables can be of great value in the future as well.
This systematic review shows potential for prognostic gene expression analysis and future research should aim at translation to clinical practice. An international consortium dedicated to large-scale data sharing and using a clear methodological 'gold standard' to perform analyses can resolve inconsistencies among the reported gene expression profiles. [51] In addition, future studies can yield more reliable results by conducting gene expression profiling on larger samples and by validating signatures in independent patient cohorts. This would allow for the development of a gene signature with direct clinical relevance, similar to the 70-gene signature 'MammaPrint' to predict survival of patients with breast cancer [14][15][16] and the 42-gene 'ColoPrint' to predict disease relapse in early stage colon cancer [12,13]. In addition to microarray analysis, 1 study [40] employed RNA next-generation sequencing. This emerging technique can provide additional information on gene fusions and alternative splicing with high accuracy and sensitivity [52,53]. Advances in microarray analysis and next-generation sequencing will form valuable tools to realize personalized medicine for patients with esophageal cancer in the near future.

Search strategy
A systematic literature search was conducted of the Medline (via PubMed), Embase and Cochrane library databases, using the limits 'human', 'English language' and 'publication date 2000-2015'. The medical subject headings (MeSH) and their synonyms concerned 'esophageal neoplasm', in combination with 'DNA/RNA sequence analysis' or 'gene expression', in relation to 'response to chemo(radio)therapy', 'metastasis', 'survival', 'prognosis' or 'recurrence'. The complete search strategy is provided in Supplementary Table S1. The search was last updated on June 30, 2016.

Study selection
Studies identified by the search strategy were evaluated for eligibility by independent dual author review (EV and IF). Discrepancies between the two reviewers were resolved by consensus. After removal of duplicates, studies that seemed unrelated to the study aims were excluded in title screening. The remaining articles underwent subsequent abstract and full text screening, based on carefully constructed inclusion and exclusion criteria (Supplementary Table S2). These criteria aimed at inclusion of original studies that conducted microarray analysis or genome sequencing on untreated cancer biopsies or resection specimens of AC or SCC, and associated the genetic profile to survival, response to chemo(radio)therapy and/or lymph node metastasis. A manual cross-reference search was performed in the reference lists of the eligible articles to ensure that relevant related articles were included in this study.

Quality assessment
Studies eligible on the basis of the inclusion and exclusion criteria were subsequently evaluated in a critical appraisal, using the Quality in Prognostic Studies (QUIPS) tool. [17] The quality of studies was assessed on the basis of bias in study participation, study attrition, prognostic factor measurement, outcome measurement, study confounding and statistical analysis plus reporting. Studies were assigned low, moderate or high risk of bias in each of these 6 domains. In case of discrepancies in quality assessment between the two authors (EV and IF), a consensus was reached through discussion. The overall quality of each study was scored using a three-point scale (low, moderate and high quality). Low quality studies, defined as high bias in at least 2 of the 6 domains, were excluded from this study. Studies were defined to be of high quality if they scored low bias on at least 4 domains in the absence of any high bias score. Both high quality studies and the remaining studies of moderate quality were included in this study.
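The scoring rule described above can be written out explicitly. The following is a minimal sketch in Python, assuming each study has exactly one rating ('low', 'moderate' or 'high' risk of bias) per domain; the function name and data layout are illustrative assumptions, not part of the QUIPS tool itself.

```python
from collections import Counter

# The 6 QUIPS domains as listed in the text above.
DOMAINS = (
    "study participation",
    "study attrition",
    "prognostic factor measurement",
    "outcome measurement",
    "study confounding",
    "statistical analysis and reporting",
)
RATINGS = {"low", "moderate", "high"}


def overall_quality(ratings: dict[str, str]) -> str:
    """Map six per-domain risk-of-bias ratings to an overall quality score."""
    if set(ratings) != set(DOMAINS):
        raise ValueError("expected one rating for each of the 6 domains")
    if not set(ratings.values()) <= RATINGS:
        raise ValueError("ratings must be 'low', 'moderate' or 'high'")
    tally = Counter(ratings.values())
    # Low quality: high risk of bias in at least 2 of the 6 domains (excluded).
    if tally["high"] >= 2:
        return "low"
    # High quality: low risk of bias in at least 4 domains and no high-bias domain.
    if tally["low"] >= 4 and tally["high"] == 0:
        return "high"
    # Everything else is of moderate quality (included).
    return "moderate"


# Hypothetical example: 4 low-bias domains and 2 moderate-bias domains.
example = dict(zip(DOMAINS, ["low", "low", "low", "low", "moderate", "moderate"]))
print(overall_quality(example))
```

Under this reading, a study with a single high-bias domain can never be of high quality, however many low-bias domains it has, and is scored as moderate.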

Data extraction
Data were collected independently by two authors (EV and IF). The following information was obtained from the included studies: first author's name, publication year, country of origin, sample size, histological tumor type, treatment, study material, definition of outcome, sequencing method or microarray analysis, cutoff value for expression, method of analysis, identified prognostic genes and/or gene signature, and validation. Gene signatures consisting of fewer than 10 genes were mentioned in the text. If data on any of the above items were not reported in the study, items were indicated as "not reported". Authors were contacted for important information that was missing or unclear.
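One way to picture the extraction form is as a record with one field per item, where unfilled items default to "not reported". The sketch below is illustrative only: the class, field names and example values are assumptions made for this example and do not reproduce the authors' actual extraction sheet.

```python
from dataclasses import dataclass, fields

NOT_REPORTED = "not reported"


@dataclass
class ExtractionRecord:
    """One row of a hypothetical extraction form; every item defaults to 'not reported'."""
    first_author: str = NOT_REPORTED
    publication_year: str = NOT_REPORTED
    country_of_origin: str = NOT_REPORTED
    sample_size: str = NOT_REPORTED
    histological_tumor_type: str = NOT_REPORTED
    treatment: str = NOT_REPORTED
    study_material: str = NOT_REPORTED
    definition_of_outcome: str = NOT_REPORTED
    sequencing_or_microarray: str = NOT_REPORTED
    cutoff_value: str = NOT_REPORTED
    method_of_analysis: str = NOT_REPORTED
    prognostic_genes_or_signature: str = NOT_REPORTED
    validation: str = NOT_REPORTED

    def missing_items(self) -> list[str]:
        """Names of items still marked 'not reported' (candidates for contacting authors)."""
        return [f.name for f in fields(self) if getattr(self, f.name) == NOT_REPORTED]


# Hypothetical, made-up study entry used only to show the mechanics.
record = ExtractionRecord(first_author="Example", sample_size="50", histological_tumor_type="AC")
print(record.missing_items())
```

Listing the missing items per record makes it straightforward to see which studies would need follow-up with the authors.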

Data presentation
The above-mentioned data were presented per study in tables, making a distinction between survival, response to chemo(radio)therapy and metastasis and separating AC and SCC. Studies were compared on the basis of treatment and study material as well as methods used to identify genes, such as chosen microarray analysis, and validation. When available, the identified prognostic genes and gene signatures were described. A list of prognostic genes was constructed, combining the results of all studies per outcome and histological tumor type. Genes reported in more than 1 study were highlighted. When studies investigated differential gene expression on multiple cutoff points, genes identified by the lowest cut-off point were included in the list of genes.
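The construction of the combined gene list can be sketched as follows. This is a minimal Python illustration under stated assumptions: the study labels, gene names and cut-off values are placeholders invented for the example, and "lowest cut-off point" is read as the numerically smallest threshold.

```python
from collections import defaultdict

# Hypothetical records: (study, outcome, tumor type, cut-off point, genes).
# All names and values are placeholders, not data from the reviewed studies.
records = [
    ("study_A", "survival", "AC", 2.0, {"GENE_1", "GENE_2"}),
    ("study_A", "survival", "AC", 1.5, {"GENE_1", "GENE_2", "GENE_3"}),
    ("study_B", "survival", "AC", 2.0, {"GENE_2", "GENE_4"}),
    ("study_C", "lymph node metastasis", "SCC", 2.0, {"GENE_2"}),
]


def build_gene_lists(records):
    """Return {(outcome, tumor type): {gene: set of studies reporting it}}."""
    # Per study and stratum, keep only the gene set from the lowest cut-off point.
    chosen = {}
    for study, outcome, tumor, cutoff, genes in records:
        key = (study, outcome, tumor)
        if key not in chosen or cutoff < chosen[key][0]:
            chosen[key] = (cutoff, genes)
    # Collect the studies reporting each gene within each stratum.
    lists = defaultdict(lambda: defaultdict(set))
    for (study, outcome, tumor), (_, genes) in chosen.items():
        for gene in genes:
            lists[(outcome, tumor)][gene].add(study)
    return lists


def recurrent_genes(lists):
    """Genes reported in more than 1 study, per outcome and tumor type."""
    return {
        stratum: sorted(gene for gene, studies in genes.items() if len(studies) > 1)
        for stratum, genes in lists.items()
    }


gene_lists = build_gene_lists(records)
highlighted = recurrent_genes(gene_lists)
print(highlighted)
```

Because overlap is counted within each combination of outcome and tumor type, a gene reported for survival in AC and for lymph node metastasis in SCC is not highlighted as recurring.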