Integrated multigene expression panel to prognosticate patients with gastric cancer

Most of the proposed individual markers had limited clinical utility due to the inherent biological and genetic heterogeneity of gastric cancer. We aimed to build a new molecular-based model to predict prognosis in patients with gastric cancer. A total of 200 patients who underwent gastric resection for gastric cancer were divided into learning and validation cohorts using a table of random numbers in a 1:1 ratio. In the learning cohort, mRNA expression levels of 15 molecular markers in gastric tissues were analyzed and concordance index (C-index) values of all single and combinations of the 15 candidate markers for overall survival were calculated. The multigene expression panel was designed according to C-index values and the subpopulation index. Expression scores were determined with weighting according to the coefficient of each constituent. The reproducibility of the panel was evaluated in the validation cohort. C-index values of the 15 single candidate markers ranged from 0.506–0.653. Among 32,767 combinations, the optimal and balanced expression panel comprised four constituents (MAGED2, SYT8, BTG1, and FAM46) and the C-index value was 0.793. Using this panel, patients were provisionally categorized with scores of 1–3, and clearly stratified into favorable, intermediate, and poor overall survival groups. In the validation cohort, both overall and disease-free survival rates decreased incrementally with increasing expression scores. Multivariate analysis revealed that the expression score was an independent prognostic factor for overall survival after curative gastrectomy. We developed an integrated multigene expression panel that simply and accurately stratified risk of patients with gastric cancer.


INTRODUCTION
Gastric cancer is still a severe public health problem worldwide, particularly in Eastern Asia [1]. While stage I gastric cancer may be curable by surgery alone, patients with advanced gastric cancer are at risk of death due to disease recurrence after initial tumor resection and failure to respond to subsequent chemotherapy [2,3]. This underscores the importance of building a new risk stratification model for accurate prediction of prognosis, disease monitoring, and evaluation of treatment response.
Currently, endoscopy and enhanced computed tomography remain the standard tests for diagnosing and staging gastric cancer [4,5]. However, these are invasive procedures that carry a significant cost for patients. Noninvasive serum tumor markers such as carcinoembryonic antigen (CEA) and carbohydrate antigen (CA) 19-9 are widely used in clinical practice but have limited sensitivity and specificity, which limits their utility in decision making and management of patients with gastric cancer [6][7][8]. With the development of genomics, proteomics, and metabolomics, an increasing number of biomarkers have been identified and studied [9]. This holds promise that novel noninvasive markers with potential clinical value will be discovered to improve the management of gastric cancer [10]. However, due to the inherent heterogeneity of gastric cancer in terms of its biological and genetic characteristics, most individual markers have shown limited value in predicting differences in the biology of individual tumors and, ultimately, in predicting clinical outcomes.
www.oncotarget.com Oncotarget, 2018, Vol. 9, (No. 27), pp: 18775-18785 Research Paper
Recently, the concept of combining multiple markers has shifted the paradigm away from single gene analysis, providing more reliable insight into tumor biology and yielding more robust oncological information. The Oncotype DX® Colon Cancer Assay (Genomic Health, Redwood City, CA, USA), for example, is a quantitative reverse transcription-polymerase chain reaction (RT-PCR)-based panel test using 12 molecular markers and has been validated in large clinical trials as a significant predictor of recurrence in stage II colon cancer [11]. It is a good example of success in demonstrating that comprehensive characterization of individual patients' tumors is key to realizing the potential of personalized therapeutic strategies [12]. Still, there is room to simplify such assays by reducing technical requirements, cost, and time. For clinical application, an ideal assay strikes a balance between accuracy and simplicity.
These realities prompted us to build a predictive model for gastric cancer risk assessment. The aim of this study was to develop a simple and accurate integrated multigene expression panel that can provide clinical guidance in determining the optimal treatment for gastric cancer.

Development of an integrated multigene expression panel
After randomized assignment of patients, there were no significant differences in patient characteristics between the learning and validation cohorts (Figure 1A and Supplementary Table 1). Concordance index (C-index) values of the 15 single candidate markers ranged from 0.506 to 0.653, and those of preoperative serum CEA (cutoff 5 ng/ml) and CA19-9 (cutoff 37 IU/ml) were 0.545 and 0.561, respectively (Figure 1B). C-index values for overall survival were calculated for every single marker and every combination of the 15 candidate markers (neither CEA nor CA19-9 included), giving 32,767 patterns. The highest C-index value among all combinations was 0.840, obtained with an expression panel consisting of 13 markers (Figure 1C). The larger the number of markers included in the panel, the greater the number of subpopulations into which patients were clustered, with a corresponding decrease in the minimal number of patients in a subpopulation. The subpopulation index decreased rapidly once the number of markers was ≥5 (Figure 1C). We decided that the optimal and balanced number of markers was four (C-index >0.75 and subpopulation index >45). The expression panel with the greatest C-index among combinations of four constituents comprised MAGED2, SYT8, BTG1, and FAM46, and its C-index value was 0.793 (Supplementary Table 2). The expression index was determined by weighting each marker using its coefficient and was then provisionally categorized into score 1 (expression index <40), score 2 (index 41-80), and score 3 (index ≥81). The scoring system clearly stratified patients into favorable, intermediate, and poor overall survival groups (Figure 2A), and none of the individual constituents of the expression panel (MAGED2, SYT8, BTG1, and FAM46) matched the stratifying performance of the multigene expression panel (Figure 2B). Based on these findings, the scoring system proceeded to the validation stage using another cohort.
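The figure of 32,767 patterns corresponds to the number of non-empty subsets of 15 markers (2^15 − 1). The short sketch below checks that count and the number of four-marker panels that had to be compared. The marker labels are placeholders introduced for illustration only; they are not the study's candidate genes.

```python
from itertools import combinations
from math import comb

N_MARKERS = 15  # number of candidate markers reported in the text

# Number of distinct panels for each panel size k = 1..15.
counts_by_size = {k: comb(N_MARKERS, k) for k in range(1, N_MARKERS + 1)}

# Total number of single markers plus all combinations (non-empty subsets).
total_patterns = sum(counts_by_size.values())

# Cross-check the four-marker count by explicit enumeration.
# The labels below are hypothetical placeholders, not the real gene names.
placeholder_markers = [f"marker_{i}" for i in range(1, N_MARKERS + 1)]
four_marker_panels = list(combinations(placeholder_markers, 4))

print(total_patterns)           # 32767
print(len(four_marker_panels))  # 1365
```

Under this reading, the selected four-marker panel was the best of 1,365 candidate four-marker combinations.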

Clinical significance of the integrated multigene expression panel
The reproducibility of the panel was evaluated in the validation cohort. The overall survival curves of patients with expression scores 1, 2, and 3 were clearly separated from each other (Figure 3A). No significant differences were found with respect to histology, tumor depth differentiation. In contrast, higher expression scores were significantly associated with larger tumor size, lymph node metastasis, peritoneal metastasis, hepatic metastasis, and advanced disease stage (Table 1).
Among patients who underwent curative gastrectomy (stage I-III gastric cancer), overall (Figure 3B) and disease-free survival rates (Figure 3C) gradually decreased with increasing expression score. Multivariable analysis revealed that expression score 3 was an independent prognostic factor for overall survival after curative gastrectomy (hazard ratio 3.18, 95% confidence interval 1.19-8.62, P = 0.021; Table 2). Overall recurrence rates and the frequency of each recurrence pattern according to the expression score are depicted in Figure 3D. No patients with score 1 experienced peritoneal or hepatic recurrence. In contrast, the prevalence of peritoneal recurrence increased stepwise with the expression score (Figure 3D).

DISCUSSION
In this study, we analyzed 32,767 patterns and built a new prognostic model, an integrated multigene expression panel that can clearly stratify patients into low, intermediate, and high risk after gastrectomy for gastric cancer. The panel has the following advantages: it is a novel panel comprising original molecular markers, its results are presented using a simple scoring system, it has high predictive value with respect to overall and disease-free survival, and its reproducibility was confirmed in both the learning and validation cohorts.
A growing body of evidence has demonstrated that gastric cancer is a complex and heterogeneous disease with substantial variation in its molecular and clinical characteristics [13,14]. Since it is unlikely that a single molecular marker can faithfully represent the various oncological signatures, more reliable and convenient prognostic models are required to enhance the long-term survival of patients with gastric cancer [9,15]. Combining multiple independently predictive markers has been demonstrated to improve accuracy in large clinical trials for breast, prostate, and colorectal cancer; however, few studies have investigated the diagnostic efficacy of three-dimensional combined biomarkers for gastric cancer [16][17][18].
Given that our aim was to develop a simple and high-performance multigene expression panel, certain procedures were required to be followed. The larger the number of markers included in the panel, the greater the number of subpopulations into which patients were clustered with a corresponding decrease in the minimal number of patients in any given subpopulation. The subpopulation index was used to optimize the number of markers included in the panel, and inclusion of four markers was found to be the most objectively balanced system. To maximize performance of the expression panel, a weighting using the coefficient of each constituent was employed to determine the expression index for all patients [19]. Thereafter, patients were stratified based on their expression score (1 to 3) according to the expression index, which was a more straightforward patient stratification method compared with using continuous numeric variables. Considering that our attempt was certainly exploratory, the validation process was necessary   [20,21]. We reported that increased levels of tissue and serum MAGED2 were associated with distant metastasis in gastric cancer [22]. SYT8 encodes a single-pass membrane protein involved in membrane trafficking [23]. Elevated SYT8 levels were significantly and specifically associated with peritoneal metastasis, and intraperitoneal administration of an SYT8-specific small interfering RNA inhibited the growth of peritoneal nodules and prolonged survival in mouse xenograft models [24]. BTG1 reportedly is a mediator of B-cell differentiation and may act as a tumor suppressor because of its inhibitory effects on proliferation and cell cycle progression [25,26]. In our previous study, downregulation of BTG1 was associated with larger tumor size and lymph node metastasis [27]. FAM46C is a signal transducer that stabilizes mRNA and is frequently mutated and downregulated in gastric cancer tissues [28]. 
We found that downregulation of FAM46C served as a predictive marker of hepatic recurrence after curative gastrectomy [29]. Since the four biomarkers have distinguishing features and contribute to different metastatic patterns, they complemented one another within the expression panel, which gave it improved predictive performance in gastric cancer even though it is an extremely heterogeneous disease. Furthermore, our study concept can leverage current knowledge of single molecular markers and bring them to the next stage, which would be an important step forward in the realization of precision medicine.

Table column headings (table body not included): Variables; Score 1 (n = 11); Score 2 (n = 45); Score 3 (n = 44); P
Translating the results of the present study to the clinic will involve discussion about how best to use the expression panel. Our findings highlight that the integrated multigene expression panel enables physicians to easily identify individuals expected to have an excellent prognosis (low risk), and conversely those expected to have an adverse outcome (high risk). For patients at low risk, avoidance of excessive intervention in both disease monitoring and treatment can reduce the burden on patients, as well as medical costs. In contrast, intensive systemic surveillance, including enhanced computed tomography to detect signs of peritoneal, nodal, or hepatic recurrence, and aggressive adjuvant therapy could be considered for patients at high risk. For patients at intermediate risk, standard management in accordance with the treatment guidelines is recommended [30]. Patients who undergo curative gastrectomy are recognized as a delicate population with a varied prognosis (ranging from complete cure to early recurrence) that will likely benefit from accurate risk stratification. Therefore, for patients with resectable gastric cancer, the expression panel might merit inclusion as an adjustment factor or one of the endpoints in prospective clinical trials evaluating the survival benefit of systemic adjuvant chemotherapy in gastric cancer [31]. In this study, expression levels were determined using surgically resected gastric tissues. Since endoscopic biopsy samples are also available for mRNA analysis, expression scores can be determined before surgery and may contribute to decision-making regarding the indication for perioperative treatment or surgery. Because clinical utility in accurately predicting patient outcomes is the ultimate goal of the expression panel, the present work should be viewed as an important first step but not as the definitive answer.
This study had some limitations. Despite an effort to reduce selection bias using a 2-step evaluation, the retrospective nature of the study, the relatively small cohort size, the use of some old samples, and the long study period may have biased the data. Although we designed a 2-step evaluation protocol for the predictive value of our integrated multigene expression panel, external validation and a prospective large-scale observational study will be required as the next step toward translation to clinical practice. Although mRNA expression levels were used because they are easy to quantify objectively, the use of immunohistochemistry (IHC) could be considered given that it is a readily accessible and commonly used technique in clinical practice. Taken together, we developed an integrated multigene expression panel for patients with gastric cancer that may maximize the predictive performance of each single marker, enable accurate risk stratification, and eventually contribute to personalized medicine in the field of surgical oncology.

Patients, sample collection, and randomization
Primary gastric cancer tissues and corresponding noncancerous adjacent tissues were collected from 200 gastric cancer patients who underwent gastric resection without preoperative treatment at the Department of Gastroenterological Surgery, Nagoya University Hospital between 2001 and 2014. Tissue samples were collected,

Development and validation of the integrated multigene expression panel
To build an integrated multigene expression panel, the following processes were carried out in the learning cohort. The study flowchart is shown in Figure 1A. First, C-index values for overall survival were calculated for every single marker and every combination of the 15 candidate markers. Second, the best C-index value for each number of markers in a combination (1-15) was identified. The larger the number of markers included in the panel, the greater the number of subpopulations into which patients were clustered, with a corresponding decrease in the minimal number of patients in a subpopulation. Thus, third, we used the subpopulation index, calculated as the number of constituents × the minimal patient number in a subpopulation, for each number of markers to determine the most well-balanced number of markers to be included in the expression panel [19]. Fourth, the expression index was determined by weighting each constituent according to its coefficient in a Cox regression. Fifth, provisional cutoffs for the scoring (score 1 to 3) were determined in the discovery set based on the following concept. The lower cutoff line was set strictly to achieve careful selection of patients with excellent postoperative outcomes, even if the population became small. Similarly, the upper cutoff line was set to select patients at very high risk. Sixth, patients were classified as having expression scores of 1, 2, or 3 according to the cutoff lines of the expression index. Last, the reproducibility of the integrated multigene expression panel was evaluated in the validation cohort.
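The third to sixth steps can be illustrated with a minimal sketch. Several points are assumptions rather than details given in the text: how patients are grouped into subpopulations is not described here, so the function simply takes one subpopulation label per patient; the expression index is assumed to be a plain weighted sum; and the handling of index values falling exactly at 40, or between 80 and 81, is not specified, so the sketch assigns them to score 2. All marker names, weights, and patient values are hypothetical.

```python
from collections import Counter
from typing import Hashable, Mapping, Sequence


def subpopulation_index(n_constituents: int, labels: Sequence[Hashable]) -> int:
    """Number of constituents x size of the smallest subpopulation.

    `labels` holds one subpopulation label per patient. Only subpopulations
    that actually occur are counted (an assumption of this sketch).
    """
    if not labels:
        raise ValueError("at least one patient label is required")
    return n_constituents * min(Counter(labels).values())


def expression_index(values: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of marker values; weights stand in for Cox coefficients."""
    return sum(weights[marker] * values[marker] for marker in weights)


def expression_score(index: float) -> int:
    """Map an expression index to score 1-3 using the provisional cutoffs."""
    if index < 40:
        return 1
    if index >= 81:
        return 3
    return 2  # includes boundary values not covered by the text (assumption)


# Hypothetical example: three subpopulations of 12, 30, and 58 patients.
labels = ["A"] * 12 + ["B"] * 30 + ["C"] * 58
print(subpopulation_index(4, labels))  # 48

# Hypothetical weights and marker values for one patient.
weights = {"gene_a": 10.0, "gene_b": 20.0}
patient = {"gene_a": 2.0, "gene_b": 1.5}
index = expression_index(patient, weights)
print(index, expression_score(index))  # 50.0 2
```

With four constituents, a subpopulation index above 45 implies that the smallest subpopulation contains at least 12 patients, which is why the index falls quickly as more markers split the cohort into smaller groups.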

Statistical analysis
The χ² test and the Mann-Whitney test were used to compare qualitative and quantitative variables, respectively, between the two groups. Survival rates were calculated using the Kaplan-Meier method, and the difference between curves was analyzed using the log-rank test. The Cox regression model was used to evaluate the overall survival hazard ratio associated with each variable. The prediction score was internally validated using the C-index. The C-index is the probability of concordance between predicted and observed survival, with C = 0.5 for random predictions and C = 1 for a perfectly discriminating score. The C-index was evaluated on the discovery set using bootstrapping with 10,000 resamples [43]. Statistical analysis was performed using JMP 10 software and SAS 9.4 (SAS Institute Inc., Cary, NC, USA). P <0.05 was considered to indicate a statistically significant difference.
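The C-index and its bootstrap evaluation can be sketched as follows. The text does not state which estimator was used, so this sketch assumes Harrell's pairwise concordance for right-censored data and a simple percentile bootstrap; the analysis itself was run in JMP and SAS, not with this code. The function names and the toy data are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def c_index(times: Sequence[float], events: Sequence[int], risk: Sequence[float]) -> float:
    """Harrell-type concordance: a higher risk score should mean shorter survival."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's last follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk: Sequence[float],
    n_resamples: int = 10_000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile 95% interval of the C-index over patient-level resamples."""
    rng = random.Random(seed)
    n = len(times)
    estimates: List[float] = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        try:
            estimates.append(
                c_index([times[k] for k in idx],
                        [events[k] for k in idx],
                        [risk[k] for k in idx])
            )
        except ValueError:
            continue  # skip resamples without any comparable pair
    if not estimates:
        raise ValueError("no usable resamples")
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return lower, upper


# Toy data: survival time, event indicator (1 = death, 0 = censored), risk score.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 1]
risk = [5, 4, 3, 2, 1]
print(c_index(times, events, risk))  # 1.0 (risk ordering matches survival exactly)
```

In the toy data the risk scores are perfectly ordered against survival time, so every comparable pair is concordant; reversing the scores gives 0, and a constant score gives 0.5.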

Author contributions
MK, KM: acquisition of data, analysis and interpretation of data, drafting of the manuscript. HT, TM, SU, MH, NH: acquisition and interpretation of data, manuscript revision. CT, DK, MS, SY, GN, MF: material support, generation of data. YK: study concept and design, study supervision, interpretation of data, revision of the manuscript. MK had full access to all of the data and takes full responsibility for the veracity of the data and statistical analysis.

CONFLICTS OF INTEREST
The authors declare no conflicts of interest.