Serum metabolomic profiling of human gastric cancer and its relationship with the prognosis

Objective This study aimed to investigate serum metabolites in gastric cancer (GC) patients and their relationship with the prognosis of GC in order to find potential specific serum biomarkers for GC. Methods Blood samples from 125 patients with unifocal GC at the initial stage and 38 healthy people recruited in our hospital from September 2008 to August 2009 were analyzed by high performance liquid chromatography coupled with electrospray ionization/quadrupole-time-of-flight mass spectrometry (HPLC-ESI/Q-TOFMS). Multiple statistical methods, including principal component analysis (PCA), hierarchical clustering analysis, partial least squares discriminant analysis (PLS-DA), multivariate COX regression analysis, variance analysis and Kaplan-Meier (K-M) survival curves, were applied to the raw mass data in order to identify the independent prognostic factors of GC. The structures of the metabolites were confirmed by comparing their m/z ratio and ion mode with the data published in the HMDB (www.hmdb.ca) database. Results In the PLS-DA, 16 serum metabolites in ESI+ mode with VIP>1 in both the test group and the validation group could clearly distinguish GC patients from healthy people (p<0.05). Multivariate COX regression analysis showed that TNM staging, 2,4-hexadienoic acid, 4-methylphenyl dodecanoate and glycerol tributanoate were independent prognostic factors of GC (p<0.05). In the K-M survival analysis, the survival rate in the high-level group of the 3 selected serum metabolites, together or alone, was significantly lower than that in the low-level group (p<0.05). Conclusion Low serum levels of 2,4-hexadienoic acid, 4-methylphenyl dodecanoate and glycerol tributanoate may be important independent prognostic factors of GC.


INTRODUCTION
Arising from malignant cells in the inner lining of the stomach, gastric cancer (GC) has high mortality worldwide and is particularly common in Eastern countries such as China, Korea and Japan [1][2][3]. Current therapies for GC consist of chemotherapy, surgery, radiation and targeted therapy [4][5][6]. As the underlying molecular mechanism of GC is still unknown and the clinical symptoms of early gastric cancer are usually not obvious, effective therapy for GC is still lacking.
Endoscopy is the most common diagnostic method for early GC, but its efficiency is inconsistent among different endoscopists and pathologists [6,7]. Although early detection and management at the initial stage can decrease GC incidence, the prognosis of GC remains poor and the overall 5-year survival rate is still less than 40% [4][5][6]. The current prognostic indicators include pathology findings (histological type, invasion and metastasis), imaging classifications and other clinical characteristics such as age and underlying diseases, and further therapy should be determined based on all these indicators. However, these traditional indicators still have many limitations. Currently, biomarkers such as p27, cyclin E, E-cadherin, c-erbB2, c-myc and the tumor suppressor gene p53 have been reported to be effective prognostic factors for GC patients. Furthermore, serum metabolomic markers are increasingly recommended as prognostic indicators in developed countries because of their high specificity and sensitivity, and they play a key role in GC therapy [8,9].
www.impactjournals.com/oncotarget/ Oncotarget, 2017, Vol. 8, (No. 66), pp: 110000-110015 Research Paper
Many biological studies show that metabolites in human fluid samples (such as serum, bile, sputum and aqueous humor) can be important downstream or endpoint biomarkers of gene mutations caused by endogenous substances or xenobiotics and are more specific and sensitive to different disease stages [10][11][12][13][14]. As a result, metabolomics plays an important role in current biological research, especially in clarifying the mechanisms underlying carcinogenesis and proliferation and thereby establishing useful biomedical indicators for early diagnosis and management in cancers such as breast cancer, prostate cancer, lung cancer, colorectal cancer, pancreatic cancer, esophageal cancer, ovarian cancer, bladder cancer and renal cancer [10][11][12][13][14]. With the application of modern chromatography and mass spectrometry or other detection techniques such as nuclear magnetic resonance, many biological metabolites have been shown by a number of statistical methods (such as the t-test, discriminant analysis, principal component analysis and cluster analysis) to be specific and sensitive biomarkers in current cancer research [10][11][12][13][14]. Recently, some clinical studies showed that biological metabolites in fluid or tissue samples were of great benefit in the early diagnosis and management of GC [15][16][17][18][19]. However, few studies have investigated whether serum metabolites might be novel diagnostic indicators for gastrointestinal cancer [20][21][22][23].
In this study, we aimed to compare serum metabolites between GC patients and healthy people and to analyze their relationship with the prognosis of GC by using high performance liquid chromatography coupled with electrospray ionization/quadrupole-time-of-flight mass spectrometry (HPLC-ESI/Q-TOFMS), in order to find potential specific and sensitive serum biomarkers that may be of clinical benefit for the early diagnosis and management of GC.

RESULTS
Total ion current spectra between GC patients and healthy people in serum samples
In this study, all the biological metabolites in serum samples from both GC patients and healthy people were detected by HPLC-ESI/Q-TOFMS, providing a large amount of information for clarifying unknown molecular mechanisms of GC. There were obvious differences in the total ion current spectra of serum samples between GC patients and healthy people under the full-scan mode (retention time of 11 minutes), which suggested that there might be some important metabolic changes in GC patients (Figure 1). By t-test comparison, a total of 87 metabolites in ESI+ mode were found to be statistically different between GC patients and healthy people (p<0.05) (Table 1). However, no serum metabolites in ESI- mode were statistically different between GC patients and healthy people (figure not shown). The clear identification of each biological metabolite and its related effect in biological processes is critical for metabolomics research. By comparison with the data obtained from the HMDB (www.hmdb.ca) database, the 87 metabolites in ESI+ mode were structurally confirmed.
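The per-metabolite group comparison described above can be illustrated with a minimal sketch. It assumes Welch's unequal-variance form of the two-sample t-test, which is an assumption because the text does not state which variant was used. The peak areas are hypothetical and are not data from this study. Only the t statistic and degrees of freedom are computed; converting them to a p-value would require a t-distribution, which a statistics package such as SPSS provides.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a), variance(group_b)  # sample variances (n - 1)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical peak areas for one metabolite (illustration only).
gc_patients = [5.1, 4.8, 5.6, 5.3, 4.9]
healthy = [3.9, 4.2, 4.0, 4.4, 3.8]
t_stat, dof = welch_t(gc_patients, healthy)
```

In an untargeted screen this comparison would be repeated for every detected feature, so some form of multiple-testing control is usually worth considering.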

Principal component analysis (PCA)
PCA was adopted as a statistical tool to reduce the detected serum metabolites to a smaller number of principal components (PCs), in order to find the specific metabolic differences between GC patients and healthy controls in the test group (24 GC patients and 24 healthy controls) and then to identify outliers or dispersion trends in GC patients. In the current study, almost all the samples were clearly grouped or separated in the PCA plots, indicating that GC patients and healthy people were properly classified by their serum metabolites (Figure 2). There were obvious differences between GC patients and healthy people under the full-scan mode, but the responses within the GC patients or within the healthy people did not vary much. By two-sample t-test, a total of 39 metabolites in ESI+ mode were found to be statistically different between GC patients and healthy people, and these might have the power to distinguish GC patients from healthy controls (p<0.05) (Table 1).
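To make the idea of a PCA score plot concrete, the following is a minimal sketch that extracts only the first principal component by power iteration on the covariance matrix. It is not the software used in this study. The function name, the tiny data table and the iteration count are all hypothetical, and real analyses would normally scale the features and use a dedicated package.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loading vector, scores) of PC1 for a samples x features table."""
    n, p = len(rows), len(rows[0])
    # Mean-center each feature (column).
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges to the dominant eigenvector of cov.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered sample onto the loading vector.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Hypothetical responses: rows 0-2 mimic patients, rows 3-5 mimic controls.
rows = [
    [5.0, 2.0, 1.0],
    [5.2, 2.1, 0.9],
    [4.9, 1.9, 1.1],
    [3.0, 1.0, 1.0],
    [3.1, 1.1, 0.9],
    [2.9, 0.9, 1.1],
]
loadings, scores = first_principal_component(rows)
```

With this toy table the two groups fall on opposite sides of zero along PC1, which is the kind of separation a score plot is meant to reveal.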

Partial least squares discriminant analysis (PLS-DA)
To find the most important serum metabolites differing between GC patients and healthy controls, partial least squares discriminant analysis (PLS-DA) (Figure 3) was performed in the test group (24 GC patients and 24 healthy controls). The PLS-DA results showed that most of the serum metabolites were clearly clustered in the PLS-DA plot, with a sensitivity and specificity of 100%. This was consistent with the PCA results, indicating that the 39 serum metabolites could be of statistical importance in separating GC patients from healthy controls. Subsequently, 16 serum metabolites in ESI+ mode were found to be highly significant between GC patients and healthy volunteers according to the VIP (Variable Importance in Projection) plots of the PLS-DA (VIP>1). Based on the above two-sample t-tests, a hierarchical clustering analysis of the selected serum metabolites was implemented to visualize the relatively significant serum metabolites. Statistically, 24 serum metabolites in ESI+ mode in the test group could clearly distinguish GC patients from healthy people (Figure 4A); furthermore, 16 serum metabolites in ESI+ mode with VIP>1 in the validation group could clearly distinguish GC patients from healthy people (Figure 4B) (p<0.05).
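For readers unfamiliar with the VIP>1 criterion, the commonly used definition of the VIP score is given below. The text does not state the formula, so this is the textbook form rather than a description of what the software computed, and the symbols are introduced here for illustration: $p$ is the number of variables, $A$ the number of PLS components, $w_{ja}$ the weight of variable $j$ on component $a$, and $SS_a$ the sum of squares of the response explained by component $a$.

```latex
\mathrm{VIP}_j = \sqrt{\, p \cdot
  \frac{\sum_{a=1}^{A} SS_a \left( \dfrac{w_{ja}}{\lVert w_a \rVert} \right)^{2}}
       {\sum_{a=1}^{A} SS_a} \,}
```

Because the squared VIP scores average to 1 over all $p$ variables, a threshold of VIP>1 selects variables that contribute more than average to the model.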

Relationships between serum metabolites and the clinicopathologic features of GC patients
According to the 16 serum metabolites selected from the PLS-DA results, all 125 GC patients were divided into 3 groups (Group 1, Group 2 and Group 3) (Figure 5). Within the same group of GC patients, responses of the 16 selected serum metabolites were similar. Differences in clinicopathological parameters such as tumor differentiation, age, vascular invasion, TNM staging, tumor position and the expression of Ki-67 and P53 were observed among the subgroups by chi-square test (Figure 6). In Group 3, there were significantly higher proportions of older patients (>60 years), TNM stage III, poorly differentiated gastric cancer and upper gastric cancer (p<0.05). In Group 1, there were significantly higher proportions of vascular invasion and upper gastric cancer (p<0.05). By variance analysis, an increasing trend in the expression of Ki-67 and P53 was observed across the groups, with the highest level in Group 3 (p<0.05). The Kaplan-Meier (K-M) survival curve of each group was plotted and the survival rates were computed correspondingly (Figure 5). Based on the K-M survival curves, the survival rate varied significantly among the three groups, and significant differences in survival time were observed between Group 1 and Group 2 or Group 3 (p<0.05) (Figure 7).
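The K-M survival rates mentioned above come from the product-limit estimator, sketched minimally below. The function name and the follow-up data are hypothetical and are not taken from this study. Comparing curves between groups would additionally need a test such as the log-rank test, which is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical follow-up in months (illustration only).
months = [6, 12, 12, 20, 30, 60]
died = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(months, died)
```

Censored patients (those with `0` in `died`) leave the risk set without lowering the curve, which is what distinguishes this estimator from a simple proportion of survivors.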

DISCUSSION
A growing number of biological studies have shown that biological metabolites are "downstream" of genes or "endpoint markers" of disease, and metabolomics is therefore attracting global attention in research on breast cancer, prostate cancer, lung cancer, colorectal cancer, pancreatic cancer, esophageal cancer, ovarian cancer, bladder cancer and renal cancer, both for early diagnosis and effective management and for molecular mechanisms. Detected by modern chromatography and mass spectrometry or other detection techniques, many biological metabolites have been shown by a number of statistical methods (such as the t-test, partial least squares discriminant analysis, principal component analysis and cluster analysis) to be potentially specific and sensitive biomarkers for different cancers [10][11][12][13][14]. Arising from malignant cells in the inner lining of the stomach, gastric cancer (GC) is a major cause of cancer-related death today. Traditional diagnostic methods consist of biopsy, endoscopy and pathological examination [1][2][3]. However, these diagnostic methods have significant limitations, and their efficiency is inconsistent among different endoscopists and pathologists. Recently, some clinical studies showed that biological metabolites in fluid or tissue samples were of great benefit in the early diagnosis and management of GC [15][16][17][18][19]. However, few studies have recommended using serum metabolites as a novel diagnostic approach for GC [20][21][22][23]. In this study, serum metabolites were compared between GC patients and healthy people, and their relationships with the prognosis of GC were investigated, in order to find potential specific serum biomarkers for GC by using high performance liquid chromatography coupled with electrospray ionization/quadrupole-time-of-flight mass spectrometry (HPLC-ESI/Q-TOFMS). Statistically, a total of 87 metabolites (Table 1) in ESI+ mode were found to be statistically different between GC patients and healthy people, including 16 serum metabolites in ESI+ mode with VIP>1 in both the test group and the validation group that could clearly distinguish GC patients from healthy people (p<0.05) (Figure 4). According to the 16 serum metabolites selected from the PLS-DA results (Table 2), all 125 GC patients were divided into 3 groups (Group 1, Group 2 and Group 3) (Figure 5).
Note: Values of response were expressed as mean ± SD. NA: not available.
Serum metabolites detected in this study with statistically different responses between GC patients and healthy people reveal several important metabolic or molecular pathways in GC. Firstly, most gastric cancer cells produce energy primarily through the Warburg effect (aerobic glycolysis) instead of the citric acid cycle, which changes the serum levels of citric acid cycle metabolites [21]. Secondly, disorders of serum amino acids can influence cell growth, metastasis and apoptosis, because amino acids are the raw materials for protein and nucleic acid synthesis in cancer cells [24]. Additionally, disorders of serum fatty acids can also affect cell growth, metastasis and apoptosis as well as tumor angiogenesis, because fatty acids may be under-used or over-consumed during cancer cell proliferation and growth, or their synthesis may be inhibited [9]. Furthermore, other substances such as lactate, creatine and succinate were also involved in the metabolic pathways in GC patients compared with healthy controls [25].
Clinicopathological parameters such as tumor differentiation, age, vascular invasion, TNM staging, survival rate, tumor position and the expression of Ki-67 and P53 were statistically different among the subgroups defined by the 16 serum metabolites selected from the PLS-DA results, as shown by the chi-square test or Kaplan-Meier (K-M) survival curves (p<0.05) (Figure 6). These results were consistent with some previous studies showing that proteins including receptors, membrane channel proteins and enzymes, such as SRY (sex determining region Y)-box 2, serum gastrin, pepsinogen I and octamer-binding protein-4 (OCT4), play a vital role in gastric cancer metastasis or differentiation and thereby in the TNM staging of GC [26,27]. Ki-67 and P53 were highly expressed in GC, and factors that influence the expression of Ki-67 and P53 might be important for regulating cell growth, metastasis and apoptosis in GC [28].
The current study has some limitations. Firstly, more clinicopathological parameters such as gender, weight, BMI and eating habits, as well as other biomarkers for GC such as related miRNA or RNA levels (e.g., let-7), matrix metalloproteinase levels (e.g., MMP-3, MMP-7 and MMP-13) and COX-2 levels, should also be observed. Secondly, the groups should also be divided into more detailed subgroups for each clinicopathological parameter. Finally, other online compound databases besides HMDB, including METLIN, LIPID MAPS and CEU Mass Mediator, should be used to confirm the chemical structures of the serum metabolites. All these limitations might cause some variation in the results.
To conclude, with the application of modern chromatography and detection techniques as well as different statistical methods, genomics, transcriptomics, proteomics and metabolomics are of key importance in biological studies, especially in establishing specific potential biomarkers for early diagnosis and effective management or finding novel molecular mechanisms of cell growth, metastasis, apoptosis and tumor angiogenesis in cancers such as breast cancer, prostate cancer, lung cancer, colorectal cancer, pancreatic cancer, esophageal cancer, ovarian cancer, bladder cancer and renal cancer. A growing number of biological studies have shown that biological metabolites are highly related to clinicopathological parameters such as tumor differentiation, age, vascular invasion, TNM staging, survival rate and tumor position, as well as to the prognosis of cancer. In this metabolomics study, 16 serum metabolites were found to be able to distinguish GC patients from healthy controls, and 3 serum metabolites (2,4-hexadienoic acid, 4-methylphenyl dodecanoate and glycerol tributanoate) of fatty acid pathways may be independent prognostic factors of GC.

MATERIALS AND METHODS
Study design
Blood samples from 125 patients with unifocal GC at the initial stage and 38 healthy people recruited in our hospital from September 2008 to August 2009 were analyzed in this study. The blood samples of all patients and healthy people were collected in the morning, and the basal metabolic rate (BMR) was in the normal range. All participants were divided into 3 groups: the test group (24 GC patients and 24 healthy controls), the validation group (14 GC patients and 14 healthy controls) and the additional group (87 GC patients). There were no significant differences in basic clinicopathological factors such as age, sex and BMI between GC patients and healthy people in either the test or the validation group. Both the test and the validation group were investigated to compare the differences in serum metabolites between GC patients and healthy controls in order to find potential specific biomarkers for GC. In addition, blood samples of all 125 patients were analyzed to find the relationship between biological metabolites and clinical parameters of GC and to find specific prognostic factors for GC.
All the included GC patients had a complete 5-year follow-up record and had not received any hormone therapy or chemotherapy before; all had no significant acute inflammatory disease, normal liver and kidney function, routine physical status, and normal results of biochemical tests and electrocardiography (ECG). The patients should not have had conditions such as burns, severe trauma or septic shock within the last 2 weeks, metabolic diseases such as diabetes, severe heart, lung, liver or kidney disease, neurological and psychiatric diseases, blood diseases such as leukemia and anemia, chronic inflammatory diseases, infectious diseases such as HIV, hepatitis and active tuberculosis, or any acute illnesses or stress reactions. Lactating, pregnant or possibly pregnant women, drinkers, drug addicts, and long-term users of proton pump inhibitors, hormones or non-steroidal anti-inflammatory agents were also excluded. All the included healthy people were in good health with no obvious abnormalities in routine physical examinations. The study protocol was approved, and all procedures performed in this study involving human participants were in strict accordance with the ethical standards of the Ethics Committee at the xxx Hospital and with the 1964 Helsinki Declaration and its later amendments. Written informed consent was obtained from all participants prior to their enrollment.

Sample processing and detection method
A volume of 100 μl of each serum sample was thawed, deproteinized with 400 μl of acetonitrile, and centrifuged at 14,000 r/min for 5 minutes. Each sample was analyzed on an Agilent 1200 high performance liquid chromatography system combined with a 6520 accurate-mass electrospray ionization/quadrupole-time-of-flight mass spectrometry system (Agilent Technologies, California, USA). Serum samples were separated on an Eclipse Plus C18 column (2.1 × 150 mm, 3.5 μm, Agilent Technologies, USA) with an injection volume of 180 μl, a flow rate of 0.8 ml/min and a column temperature of 45 °C. Mobile phase A was 0.1% formic acid solution (ESI+) or water (ESI-), and mobile phase B was acetonitrile with 0.1% formic acid (ESI+) or acetonitrile (Merck, Darmstadt, Germany) (ESI-). The gradient program started at 20% B for 0-1.5 min, increased linearly from 20 to 95% B over 1.5-7 min, stayed at 95% B for 7-9.9 min, then decreased linearly from 95 to 20% B over 9.9-10 min and equilibrated at 20% B for 10-11 min. To avoid cross-contamination from GC patients, all the serum samples of healthy people were injected at the end.
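The gradient program described above can be written down as a small lookup, which makes it easy to check the mobile phase composition at any point in the 11-minute run. This is only an illustrative sketch: the breakpoints are taken from the text, while the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % mobile phase B) breakpoints taken from the gradient program.
GRADIENT = [
    (0.0, 20.0),
    (1.5, 20.0),
    (7.0, 95.0),
    (9.9, 95.0),
    (10.0, 20.0),
    (11.0, 20.0),
]


def percent_b(t):
    """Linearly interpolate % B at time t (minutes) within the 11-min run."""
    if not GRADIENT[0][0] <= t <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-11 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
```

For example, `percent_b(4.25)` falls halfway along the 1.5-7 min ramp and returns 57.5, while any time between 7 and 9.9 min returns 95.0.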
All the data were collected by electrospray ionization quadrupole-time-of-flight mass spectrometry in both positive (ESI+) and negative (ESI-) full-scan modes to find either basic or acidic biological compounds in human serum, which may be specific and sensitive biomarkers for GC. The mass spectrometry conditions were as follows: capillary voltage, 3.2 kV; cone voltage, 35 V; desolvation temperature, 350 °C; source temperature, 100 °C; desolvation gas (nitrogen) flow rate, 650 L/h; cone gas (nitrogen) flow rate, 50 L/h; mass range, 50 to 1000; scan time, 1 s; and inter-scan delay, 0.02 s.

Data processing and statistical analysis
Firstly, both full-scan ESI+ and ESI- raw mass spectra were acquired by using the data-acquisition software Analyst TF 1.5.1 (AB Sciex, California, USA). Then, data such as retention time, peak area and m/z ratio were generated by using MarkerView 1.2 (AB Sciex, California, USA), and the related serum metabolites were structurally confirmed by comparing their m/z ratio and ion mode with the data in the HMDB (www.hmdb.ca) database. Subsequently, principal component analysis (PCA) score plots were used to assess the similarity or difference between GC patients and healthy controls, and a two-sample t-test in SPSS 22.0 (SPSS Inc., Chicago, USA) was then performed to select potential biological variables that were statistically significantly different between GC patients and healthy controls. Additionally, hierarchical clustering analysis was implemented by using BRB-ArrayTools (Dr. Richard Simon & BRB-ArrayTools Development Team, USA) to discriminate the subgroups, and partial least squares discriminant analysis (PLS-DA) was applied by using SIMCA-P software (Umetrics AB, CA, USA) to identify the statistically important serum metabolites between GC patients and healthy controls. VIP plots of the PLS-DA were drawn to confirm the potential serum biomarkers distinguishing GC patients from healthy controls. The intensity of the background interference was normalized by using the global median subtraction method.
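Matching an observed m/z value against database entries, as described above for HMDB, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: it assumes protonated [M+H]+ ions in ESI+ mode and a 10 ppm tolerance, neither of which is given in the text. The function names are hypothetical, the candidate table is a local stand-in for a database query, and the monoisotopic masses were computed from the molecular formulas shown.

```python
PROTON_MASS = 1.007276  # Da, mass added for an [M+H]+ ion in ESI+ mode


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_candidates(observed_mz, candidates, tolerance_ppm=10.0):
    """Return names whose [M+H]+ m/z lies within tolerance of observed_mz."""
    hits = []
    for name, monoisotopic_mass in candidates.items():
        theoretical = monoisotopic_mass + PROTON_MASS
        if abs(ppm_error(observed_mz, theoretical)) <= tolerance_ppm:
            hits.append(name)
    return hits


# Illustrative candidate list (assumed, not exported from HMDB).
candidates = {
    "2,4-hexadienoic acid (C6H8O2)": 112.05243,
    "4-methylphenyl dodecanoate (C19H30O2)": 290.22458,
    "glycerol tributanoate (C15H26O6)": 302.17294,
}

# Hypothetical observed m/z for one feature (illustration only).
hits = match_candidates(113.0600, candidates)
```

A match on m/z alone is only a putative annotation, since isomers share the same mass; retention time or fragmentation data would be needed for a firmer assignment.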
All 125 GC patients were clustered according to the serum metabolites selected from the PLS-DA, and differences in clinicopathological parameters such as tumor differentiation, age, vascular invasion, TNM staging, survival rate, tumor position and the expression levels of Ki-67 and P53 were assessed among the subgroups by chi-square test or variance analysis. The Kaplan-Meier (K-M) survival curve of each group was plotted based on different thresholds of sensitivity and specificity of survival time in order to find the most promising serum biomarkers for GC. Finally, multivariate COX regression analysis, variance analysis and K-M survival curves were used to find the independent prognostic factors for GC in human serum. Statistical significance was set at p<0.05.