A prospective study of serum metabolites and glioma risk

Malignant glioma is one of the most lethal adult cancers, yet its etiology remains largely unknown. We conducted a prospective serum metabolomic analysis of glioma based on 64 cases and 64 matched controls selected from the Alpha-Tocopherol, Beta-Carotene Cancer Prevention (ATBC) Study. Median time from collection of baseline fasting serum to diagnosis was nine years (inter-decile range 3-20 years). LC-MS/MS identified 730 known metabolites, and conditional logistic regression models estimated odds ratios for one-standard-deviation differences in log-metabolite signals. Forty-three metabolites were associated with glioma at P<0.05. 2-Oxoarginine, cysteine, alpha-ketoglutarate, chenodeoxycholate and argininate yielded the strongest metabolite signals and were inversely related to overall glioma risk (0.0065≤P<0.0083). In addition, seven xanthine metabolites related to caffeine metabolism were higher in cases than in controls (0.017≤P<0.042). Findings were mostly similar in high-grade glioma cases, although prominent inversely associated metabolites included the secondary bile acids glycocholenate sulfate and 3β-hydroxy-5-cholenoic acid, xenobiotic methyl 4-hydroxybenzoate sulfate, sex steroid 5alpha-pregnan-3beta, 20beta-diol-monosulfate, and cofactor/vitamin oxalate (0.0091≤P<0.021). A serum metabolomic profile of glioma identified years in advance of clinical diagnosis is characterized by altered signals in arginine/proline, antioxidant, and coffee-related metabolites. The observed pattern provides new potential leads regarding the molecular basis relevant to etiologic or sub-clinical biomarkers for glioma.


INTRODUCTION
Brain cancer is one of the most fatal and devastating malignancies, given its poor prognosis and adverse impact on quality of life, particularly cognitive function. Malignant glioma accounts for 80% of adult brain cancers [1], and its etiology remains largely unknown, with the exception of ionizing radiation and family history, and evidence pointing to inverse associations with asthma and allergies [1][2][3][4][5][6][7]. A spectrum of genetic alterations has been characterized for glioma, including germline and somatic mutations, recurrent translocations, and copy number variations [8,9], yet these do not account for all the underlying biology. Rapid development of technologies in liquid and gas chromatography, mass spectrometry and nuclear magnetic resonance has facilitated the measurement of a broad array of low molecular weight metabolites in biospecimens such as plasma, serum, urine and tissue. Quantification of the metabolome provides an integrated snapshot of exogenous and endogenous exposures, and may thus help to identify novel disease associations and point to biochemical pathways involved in disease pathogenesis. Chinnaiyan and colleagues demonstrated unique tumor metabolomic signatures, involving cellular energy, anabolism and phospholipid pathways, that distinguished low-grade from high-grade gliomas and had prognostic relevance [10]. The significance of these metabolic differences to the etiology, early detection, and prevention of the disease remains to be established, however, including through prospective investigations [11,12].
To address the potential role of altered metabolites and their related biological pathways in glioma tumorigenesis, we conducted a prospective case-control serologic analysis including 64 glioma cases nested within the Alpha-Tocopherol, Beta-Carotene Cancer Prevention (ATBC) Study cohort. According to the World Health Organization (WHO) grade for glioma (I-IV) [13], the present study includes 41 high-grade gliomas (grade IV), 19 lower-to-intermediate-grade gliomas (grades II and III; subsequently referred to as "lower-grade"), and 4 cases of unknown grade [14].

RESULTS
Cases were similar to controls with respect to baseline characteristics, with the exception that they had lower average body mass index (BMI) (P=0.025) and appeared to consume more coffee, although the latter difference was not statistically significant (P=0.21) (Table 1). The median time from serum collection to glioma diagnosis was nine years (inter-decile range=3.0-20.0 years). Based on the quality control samples, the median coefficient of variation (CV) across all the metabolites was 7% (interquartile range=4%-14%) and the median intraclass correlation coefficient (ICC) was 0.97 (interquartile range=0.89-0.99).
Stratifying cases and their controls by median time from blood draw to diagnosis showed that several lysoplasmalogen, sphingolipid, and three of four benzoate metabolites were positively related to glioma within nine years of blood collection (Supplementary Table 1). By contrast, abundant diacylglycerol, monoacylglycerol, phospholipid, and sphingolipid metabolites were prominently and inversely related to cases that were diagnosed at least nine years after blood collection (Supplementary Table 2). We found that caffeine intake did not modify the association between identified caffeine related metabolites and glioma risk (Supplementary Table 3).
None of the top 10 principal components were significantly associated with glioma risk, with all tests showing P≥0.034 (P<0.005 being the threshold). The Gene-Set Analysis (GSA) pathway analysis revealed primary bile acid, urea cycle/arginine and proline, tocopherol, and glycolysis/gluconeogenesis/pyruvate associations with overall glioma risk (0.005≤P<0.048; Table 4). Ascorbate and aldarate metabolites appeared to be related to high-grade glioma (P=0.02), while glutamate, glycolysis/gluconeogenesis/pyruvate (i.e., cellular energy), eicosanoid and alanine/aspartate metabolites were related to lower-grade glioma (P=0.006, 0.017, 0.03, 0.04, respectively; Table 4). The sub-pathway analysis stratified by time from blood collection to diagnosis revealed that primary bile acid metabolites were related to cases diagnosed within nine years of blood collection (P=0.034), whereas urea cycle/arginine and proline, glycolysis/gluconeogenesis/pyruvate, monoacylglycerol, histidine, food component/plant, and diacylglycerol metabolites were related to risk of glioma after nine years (0.003≤P<0.045, Supplementary Table 4). These pathway associations did not, however, reach the stringent Bonferroni significance threshold for correction of multiple comparisons [i.e., eight tests for super-pathways (P=0.0063) and 70 tests for sub-pathways (P=0.00071)].
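The Bonferroni thresholds quoted above can be reproduced with simple arithmetic. The following is a minimal sketch, assuming a nominal two-sided alpha of 0.05 (inferred from the reported thresholds, not stated explicitly for the pathway analysis); the function and variable names are illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


ALPHA = 0.05  # assumed nominal significance level

# Eight super-pathways and 70 sub-pathways, as reported in the text.
super_pathway_threshold = bonferroni_threshold(ALPHA, 8)
sub_pathway_threshold = bonferroni_threshold(ALPHA, 70)

print(f"super-pathways: {super_pathway_threshold:.4f}")
print(f"sub-pathways:   {sub_pathway_threshold:.5f}")
```

Under this assumption the two values are 0.05/8 = 0.00625 and 0.05/70 ≈ 0.000714, which agree with the reported 0.0063 and 0.00071 to the precision given.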

DISCUSSION
Our study identified the amino acids 2-oxoarginine, cysteine and argininate, energy metabolite alpha-ketoglutarate, and secondary bile acid chenodeoxycholate, as well as some other compounds, as being lower in circulation of glioma cases years in advance of diagnosis compared to study-based controls. By contrast, we show several xanthine metabolites of caffeine to be positively related to glioma risk. A possible role for coffee consumption in the etiology of glioma has long been hypothesized, although a meta-analysis of four prospective and two retrospective studies showed no association [15]. Previous research has identified several coffee-related metabolites in circulation, including quinate, 1-methylurate, 1-methylxanthine, paraxanthine, theobromine, 5-acetylamino-6-amino-3-methyluracil, theophylline, 7-methylxanthine, and trigonelline [16], with many of these compounds showing substantial positive associations with overall glioma risk in the present investigation. Whether these xanthine metabolites have a direct causal role in glioma, or are elevated years in advance of diagnosis because of increased coffee consumption in response to tumor-related neurologic changes or cancer-related fatigue, for example, will require additional clinical and prospective studies.
The arginine/proline metabolic pathway compounds 2-oxoarginine, a guanidino metabolite of arginine, and argininate, the conjugate base of arginine, were substantially lower in men years in advance -and  [17]. Earlier studies have indicated that arginine/proline metabolites are involved in tumorigenesis (including glioblastoma), exogenous arginine is required for tumor growth, and arginine deprivation leads to impairment of glioma cell motility, invasiveness, and adhesion [18][19][20][21]. A large proportion (i.e., 60-90%) of low-grade gliomas harbor a heterozygous mutation (R132H) in the gene encoding the cytosolic isoform of isocitrate dehydrogenase (IDH1) [22]. Data from genome-wide association studies of glioma have identified single nucleotide polymorphisms (SNP) associated with altered risk of IDH-mutated glioma, including rs55705857 in 8q24.21, rs4295627 in CCDC26, and rs498872 in PHLDB1 [23,24], and in a recent case-control study of 285 gliomas, 316 healthy controls, and 531 other types of cancers, the authors showed that SNP rs55705857 was strongly associated with altered risk of IDH-mutant glioma, but not with other cancers [23]. The wild-type IDH1 catalyzes the oxidative decarboxylation of isocitrate to generate alpha-ketoglutarate (α-KG), whereas the mutant enzyme is able to convert α-KG into molecules of 2-HG, which is an "oncometabolite" that may mediate several tumorigenic events [25][26][27]. In the present study, we observed increased pre-diagnostic serum 2-HG and decreased α-KG in glioma cases. We could not, however, evaluate IDH1 mutation status and its correlation with serum 2-HG, although previous studies indicated no correlation between serum 2-HG and IDH1/2 status or tumor size [17,28]. On the other hand, it is unlikely that the serum 2-HG and α-KG reflect early cellular changes in transformed astrocytes or emerging gliomas, but rather etiologic biomarkers that must be considered and further evaluated in other studies. 
It is of note that we observed that another TCA cycle metabolite, aconitate (related to cis-aconitate), was elevated in high-grade cases (OR=1.7, P=0.05).
Being a highly metabolically active organ, the brain generates substantial reactive oxygen species (ROS) and is slower to neutralize these free radicals compared to other tissues [29], possibly leading to DNA damage, genomic instability and tumor development. We found several antioxidant pathway metabolites inversely associated with risk that may be indicative of serum antioxidant depletion resulting from increased tumor ROS. For example, both cysteine and cysteine-S-sulfate were lower in cases, and this might be associated with increased cysteine uptake, and modulating of redox status, in the central nervous system. The conditionally essential amino acid cysteine is a rate-limiting precursor for the antioxidant glutathione, one of the most abundant antioxidants in the central nervous system [30,31]. Whereas cysteine is an extracellular antioxidant, glutathione acts intracellularly and may play a role in glioma cell survival under redox stress and hypoxic conditions [30][31][32]. Consistent with this is a metabolomic study of patient-derived glioma tissue that found cysteine catabolism and cysteine sulfinic acid accumulation in the high-grade glioblastoma cases [33]. Higher circulating cysteine has also been related to lower risks of several other cancers including colon [34], esophagus, and stomach [35]. Also relevant to ROS and glioma risk is the inverse association we observed for alpha-tocopherol, the most biologically active form of vitamin E and a potent inhibitor of lipid peroxidation [36]. Two previous studies had similar findings for glioma and glioblastoma [37,38], while a recent report showed positive associations for several antioxidants including alpha-tocopherol [39]. The latter finding was, however, restricted to risk 10-22 years after blood collection [39]. By contrast, and also relevant to the metabolite-ROS-glioma associations, we showed two of three compounds in the ascorbate/aldarate pathway were inversely associated with high-grade disease.
5Alpha-pregnan-3beta, 20beta-diol monosulfate and pregnenolone sulfate were reduced in high-grade glioma cases. The latter is considered a neurosteroid that can be synthesized in the central nervous system, is present in higher concentrations in brain tissue than in plasma [40,41], and is a precursor of some neurosteroids characterized as having neuroprotective effects [42,43]. Experimental data also indicate that pregnenolone can regulate glioma cell death through extrinsic and intrinsic apoptotic pathways in a caspase-dependent manner [44].
Altered energy metabolism has been considered one of the hallmarks of cancer [45]. Glycolysis, as a highly conserved metabolic process, is essential for energy production in normal mammalian cells, and impairment of glycolysis has been depicted as a feature of cancer metabolism; i.e., the Warburg effect [46][47][48]. Impairment of glycolysis in glioma was found in the present study (especially in low-grade tumors), as well as in previous studies (distinguishing low- and high-grade tumors) [10,17]. One of the previous studies [10], which identified a total of 308 known biochemicals and used fresh-frozen tumor tissue, found alterations in glucose metabolism as a function of tumor grade, especially distinguishing grade IV tumors from grade II (33 grade IV tumors and 18 grade II tumors). Another study [17], which used serum and detected a total of 224 known metabolites from 25 key metabolic pathways, reported that the carbohydrate pathway (e.g., glycolysis or gluconeogenesis) ranked 6th of 18 metabolic pathways that significantly differed between high- and low-grade glioma. That study defined low-grade glioma (N=42) as grades I and II, and high-grade glioma (N=45) as grades III and IV. Given the differences in study designs, including biospecimen type (serum vs. tissue), reference groups (healthy controls in the present study vs. low-grade glioma cases in previous clinical studies), number of identified metabolites (730 known metabolites in the present study vs. 200-300 metabolites in the previous studies), and definition of low- and high-grade glioma (grades II and III as low-grade and grade IV as high-grade in the present study vs. grades I/II as low-grade and grades III/IV as high-grade), a direct comparison across the studies is difficult and necessarily imprecise. Our findings suggest that impairment of the glycolysis pathway may be an early event during glioma development, likely to support cell proliferation and tumor anabolic activity.
Of note, impairment of glycolysis has been shown in earlier studies to be associated with activated oncogenes (such as RAS, or MYC) and mutant tumor suppressors (such as TP53) [45,49,50] and such mutations have been reported to occur more frequently in low-grade rather than high-grade gliomas [51][52][53].
Notable strengths of our investigation include assaying of overnight fasting serum samples that were obtained up to two decades prior to the diagnosis of glioma. Cases were ascertained from census-based population registers with high accuracy. Untargeted metabolomic profiling was able to identify several hundred metabolites using a high-quality platform with careful quality control and laboratory blinding to case-control status. Study limitations include the homogeneous nature of the Finnish male smoker population, which limits the generalizability of our findings to other populations. Also, serum metabolites were measured at only one blood sampling time-point, whereas additional time-points would have provided a more robust reflection of one's usual or average profile. We were unable to access tumor tissue for glioma genotyping, such as for IDH mutation, and thus could not classify gliomas based on this factor. Our study sample size was relatively small for an agnostic investigation, although glioma is a relatively rare malignancy and all available cases were studied. Although none of our findings met the statistical significance thresholds corrected for multiple comparisons, a large number of the metabolites were highly correlated with one another (e.g., fatty acids), which makes the Bonferroni threshold particularly stringent in that the tests were not completely independent. Nonetheless, our findings should be considered preliminary and hypothesis-generating, even with respect to top signals that were consistent with data from previous studies.
In conclusion, the present study finds a serum metabolomic profile of glioma up to 20 years prior to clinical diagnosis that is characterized by altered molecular signals in arginine/proline, antioxidant, and coffee-related metabolites. Ascorbate/aldarate and steroid hormone metabolites were found to be associated with high-grade glioma. The observed profiles provide evidence regarding the molecular basis relevant to etiologic or sub-clinical biomarkers for glioma. Further prospective metabolomic studies are needed to re-examine the findings in larger and more diverse populations.

Study population
The ATBC Study was a 2×2 factorial, randomized, double-blind, placebo-controlled primary prevention trial originally conducted to examine whether α-tocopherol and β-carotene supplementation could reduce incidence of cancer [54]. Details of the study have been described [54]. Briefly, the trial enrolled Caucasian male smokers (n=29,133), aged 50-69 years from 1985 to 1988 in southwest Finland. The participants were randomly assigned to receive one of four supplements: α-tocopherol (50 mg/day), β-carotene (20 mg/day), both or placebo for 5-8 years (median=6.1) through the end of the trial (April 30, 1993). Pre-supplementation fasting blood samples from all participants (years prior to cancer diagnoses) were collected during the baseline visit, and stored at -70 °C until assessment. At enrollment, self-reported questionnaires were completed with information regarding general health, behavioral and lifestyle factors, and height and weight were measured. Participants were followed from date of enrollment, to date of glioma diagnosis (Finnish Cancer Registry), date of death (Finnish Register of Causes of Death), or censor date (December 31, 2012), whichever occurred first.
All participants provided written informed consent at enrollment. The ATBC Study was approved by institutional review boards at the U.S. National Cancer Institute and the Finnish National Institute for Health and Welfare.

Case ascertainment and control selection
A total of 64 glioma cases (ICD-9 191, ICD-O morphology code 9380-9481), including 41 high-grade, 19 lower-grade and 4 unknown grade, ascertained during the follow-up period are included in the present analysis. Using incidence-density sampling without replacement, we randomly selected 64 matched controls based on age (± 1 year) and date of blood collection (± 30 days).
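The matching scheme described above can be sketched in code. This is an illustration under stated assumptions, not the study's actual selection procedure: the `Participant` record, the `select_controls` function, and the rule that a control must still be under follow-up at the case's diagnosis date are all hypothetical simplifications of incidence-density sampling without replacement.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass
class Participant:
    pid: int
    age_at_baseline: float  # years
    blood_date: date        # baseline serum collection
    exit_date: date         # diagnosis, death, or censoring, whichever came first
    is_case: bool


def select_controls(cohort, rng):
    """Map each case pid to one matched control pid (or None if no one is eligible)."""
    used = set()  # controls already selected (sampling without replacement)
    matches = {}
    cases = sorted((p for p in cohort if p.is_case), key=lambda p: p.exit_date)
    for case in cases:
        risk_set = [
            p for p in cohort
            if p.pid != case.pid
            and p.pid not in used
            and p.exit_date > case.exit_date  # still at risk at the case's diagnosis
            and abs(p.age_at_baseline - case.age_at_baseline) <= 1      # age within 1 year
            and abs((p.blood_date - case.blood_date).days) <= 30        # blood draw within 30 days
        ]
        control = rng.choice(risk_set) if risk_set else None
        if control is not None:
            used.add(control.pid)
        matches[case.pid] = control.pid if control is not None else None
    return matches


# Toy cohort with invented values: only participant 2 satisfies every criterion.
cohort = [
    Participant(1, 57.0, date(1986, 3, 1), date(1995, 6, 1), True),
    Participant(2, 57.5, date(1986, 3, 15), date(2012, 12, 31), False),
    Participant(3, 60.0, date(1986, 3, 10), date(2012, 12, 31), False),  # age differs by 3 years
    Participant(4, 57.2, date(1986, 6, 1), date(2012, 12, 31), False),   # blood draw 92 days apart
    Participant(5, 56.8, date(1986, 3, 5), date(1990, 1, 1), False),     # left follow-up too early
]
matches = select_controls(cohort, random.Random(0))
print(matches)
```

Processing cases in order of diagnosis date keeps each risk set defined at the time the case occurred, which is the defining feature of incidence-density sampling.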

Metabolite assessment
Ultrahigh performance liquid chromatography/tandem mass spectrometry (LC-MS/MS), a high resolution accurate mass (HRAM) platform at Metabolon Inc., was used to determine serum metabolomic profiles as previously described [55,56]. To summarize, samples were extracted using an automated liquid handling robot (Hamilton LabStar, Hamilton Robotics, Inc., Reno, NV), and 450 μl of methanol was added to 100 μl of sample to precipitate proteins. To confirm extraction efficiency, four recovery standards were added to the methanol, including DL-2-fluorophenylglycine, tridecanoic acid, d6-cholesterol and 4-chlorophenylalanine. Four aliquots from each sample were obtained and dried. For the negative ion analysis, two aliquots of each serum sample were reconstituted in 50 μl of 6.5 mM ammonium bicarbonate in water with a pH of 8. For the positive ion analysis, another two aliquots of each serum sample were reconstituted using 50 μl of 0.1% formic acid in water with a pH of 3.5. The procedures further included raw data extraction, peak identification and quality control (QC) inclusion in each assay. The assays were run in four batches, each containing 16 case-control pairs and two QC samples. We identified a total of 1,064 metabolites, of which 311 were unknown and 753 were known molecules. We excluded 23 metabolites that were missing (i.e., below the limit of detection) in >110 study individuals (86%), leaving a total of 730 identified compounds in the final analysis. Metabolites were categorized into one of eight mutually exclusive chemical classes: amino acids and amino acid derivatives (subsequently referred to as "amino acids"), carbohydrates, cofactors and vitamins, energy metabolites, lipids, nucleotides, peptides or xenobiotics. CVs and ICCs were used to assess the data reliability. The CV is defined as the square root of the within-subject variance divided by the mean value. The lower the CV, the better the assay repeatability.
The ICC is defined as the between-individual variance divided by the total variance. ICC values range from 0 to 1; values close to 0 suggest little or no reproducibility, whereas values close to 1 indicate good reproducibility. Rosner has proposed the classification of ICCs as poor (<0.4), fair-to-good (0.4-0.75), and excellent (≥0.75) [57].
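The CV and ICC definitions above can be made concrete with a small calculation. The sketch below assumes a balanced design (the same number of replicate measurements per subject) and uses one-way random-effects ANOVA variance components; the estimator actually used in the study is not specified, so this choice and the function name `cv_and_icc` are assumptions for illustration.

```python
import math
from statistics import mean


def cv_and_icc(replicates):
    """replicates: list of per-subject lists, each holding k repeated measurements."""
    n = len(replicates)
    k = len(replicates[0])
    subject_means = [mean(r) for r in replicates]
    grand_mean = mean(subject_means)

    # Within-subject mean square: pooled variance of replicates around their subject mean.
    ss_within = sum((x - m) ** 2 for r, m in zip(replicates, subject_means) for x in r)
    ms_within = ss_within / (n * (k - 1))

    # Between-subject mean square and the implied between-subject variance component.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    var_between = max((ms_between - ms_within) / k, 0.0)

    cv = math.sqrt(ms_within) / grand_mean               # sqrt(within variance) / mean
    icc = var_between / (var_between + ms_within)        # between variance / total variance
    return cv, icc


# Invented duplicate measurements for two subjects.
cv, icc = cv_and_icc([[9.0, 11.0], [19.0, 21.0]])
print(f"CV = {cv:.3f}, ICC = {icc:.3f}")
```

In this toy example the within-subject variance is 2 and the between-subject variance is 49, giving a CV of about 0.094 and an ICC of about 0.961, which would fall in Rosner's "excellent" category.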

Statistical analyses
We used either the Wilcoxon rank sum test (for continuous variables) or Fisher's exact test (for categorical variables) to compare the demographic characteristics of cases and controls. To standardize the batch variability, we normalized each metabolite within a given batch by dividing by the batch mean of all non-missing values. Metabolites with missing values were imputed to the minimum of the observed values. Metabolite levels were then log-transformed and normalized to have mean = 0 and variance = 1. We modeled the association between glioma and each normalized log-transformed metabolite level by conditional logistic regression and report the ORs for a 1-standard deviation (SD) increase and their 95% CIs, with only matching factors adjusted in the final model. The threshold for statistical significance was defined by Bonferroni correction in the primary analysis with 730 tests (P=0.000068). We next performed principal component analysis [58] and repeated the conditional logistic regression for each of the top 10 principal components. We then used GSA, a standard pathway method, to examine whether any of the pre-defined sub- or super-pathways were associated with glioma status [59].
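The preprocessing steps and the matched-pair model described above can be illustrated for a single metabolite. This is a minimal sketch, not the SAS code used in the study: `preprocess` and `paired_conditional_logit` are hypothetical names, imputation is assumed to use the minimum of the batch-normalized values, the population SD is assumed for standardization, and the model relies on the fact that for 1:1 matched pairs the conditional likelihood depends only on the case-minus-control difference.

```python
import math
from statistics import mean, pstdev


def preprocess(values, batches):
    """values: raw signals (None = below detection); batches: batch label per sample."""
    # 1. Divide each value by the mean of the non-missing values in its batch.
    batch_means = {
        b: mean(v for v, bb in zip(values, batches) if bb == b and v is not None)
        for b in set(batches)
    }
    scaled = [None if v is None else v / batch_means[b] for v, b in zip(values, batches)]
    # 2. Impute missing values to the minimum observed (normalized) value.
    floor = min(v for v in scaled if v is not None)
    filled = [floor if v is None else v for v in scaled]
    # 3. Log-transform, then 4. standardize to mean 0 and variance 1.
    logged = [math.log(v) for v in filled]
    mu, sd = mean(logged), pstdev(logged)
    return [(x - mu) / sd for x in logged]


def paired_conditional_logit(case_x, control_x, n_iter=25):
    """Return (OR, lower 95% CI, upper 95% CI) per one-unit increase for 1:1 pairs."""
    d = [a - b for a, b in zip(case_x, control_x)]  # within-pair differences
    beta = 0.0
    for _ in range(n_iter):  # Newton-Raphson on the conditional log-likelihood
        p = [1.0 / (1.0 + math.exp(-beta * di)) for di in d]
        score = sum(di * (1.0 - pi) for di, pi in zip(d, p))
        info = sum(di * di * pi * (1.0 - pi) for di, pi in zip(d, p))
        beta += score / info
    p = [1.0 / (1.0 + math.exp(-beta * di)) for di in d]
    se = 1.0 / math.sqrt(sum(di * di * pi * (1.0 - pi) for di, pi in zip(d, p)))
    return math.exp(beta), math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)


# Invented example: four samples in two batches, one below the detection limit.
z = preprocess([1.0, 3.0, None, 4.0], ["a", "a", "b", "b"])

# Invented standardized levels for three matched pairs (case, control).
odds_ratio, ci_low, ci_high = paired_conditional_logit([1.0, 0.5, -0.2], [0.0, -0.5, 0.8])
print(f"OR per SD = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

Because the input to the model is already standardized, the exponentiated coefficient is interpretable as the odds ratio per 1-SD increase in the log-metabolite level. The sketch does not guard against perfectly separated pairs or a zero SD, which a production analysis would need to handle.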
Sensitivity analyses were performed, including restriction to high-grade case-control pairs, and stratification by age at enrollment (<56 vs. ≥56 years, median age as cutoff point), time to diagnosis (<9 vs. ≥9 years, median time as cutoff point), and caffeine intake (<560 vs. ≥560 gram, median as cutoff point). Additional analyses adjusted (separately) for BMI (continuous), height (continuous), history of diabetes (yes or no), physical activity (no activity vs. at least light activity), and number of daily cigarettes, red meat consumption, and alcohol consumption (all continuous).
The R statistical language version 3.2.3 (Vienna, Austria) was used for GSA analysis, and SAS software version 9.3 (SAS Institute, Cary, NC) was used for other analyses. All presented P-values are two-sided.

CONFLICTS OF INTEREST
EDK is employed by Metabolon, Inc. JH, SJW, CMK, JNS and DA declare no conflicts of interest.