Prognostic value of 18F-FDG-PET/CT in patients with nasopharyngeal carcinoma: a systematic review and meta-analysis

Background The prognostic role of 18F-fluorodeoxyglucose positron emission tomography CT (18F-FDG PET/CT) parameters is still controversial in nasopharyngeal carcinoma patients. We sought to perform a systematic review and meta-analysis to explore the prognostic value of maximal standardized uptake value (SUVmax), metabolic tumor volume (MTV) and total lesion glycolysis (TLG) on event-free survival (EFS) and overall survival (OS) in nasopharyngeal carcinoma patients.
Results Fifteen studies comprising 1,938 patients were included in this study. The combined hazard ratios (HRs) for EFS were 2.63 (95%CI 1.71-4.05) for SUVmax, 2.55 (95%CI 1.49-4.35) for MTV, and 3.32 (95%CI 1.23-8.95) for TLG. The pooled HRs for OS were 2.07 (95%CI 1.54-2.79) for SUVmax, 3.86 (95%CI 1.85-8.06) for MTV, and 2.60 (95%CI 1.55-4.34) for TLG. The prognostic role of SUVmax, MTV and TLG remained similar in the sub-group analyses.
Methods A systematic literature search was performed to identify studies which associated 18F-FDG PET/CT to clinical survival outcomes of nasopharyngeal carcinoma patients. The summarized HRs for EFS and OS were estimated by using fixed- or random-effect models according to heterogeneity between trials.
Conclusions The present meta-analysis confirms that high values of SUVmax, MTV and TLG predicted a higher risk of adverse events or death in patients with nasopharyngeal carcinoma, despite clinically heterogeneous nasopharyngeal carcinoma patients and the various methods adopted between these studies.


INTRODUCTION
Nasopharyngeal carcinoma (NPC) is a cancer arising from the epithelial cells that cover the surface of and line the nasopharynx [1,2]. Worldwide, 52.7% of new NPC cases were in the World Health Organization (WHO) Western Pacific Region; the remainder were in the WHO South-East Asia and Africa Regions [3]. The age-standardized incidence is reported to be higher in some ethnic groups than in others, e.g., the Hmong in China, the Bidayuh in Borneo, the Inuit in the Arctic, the Nagas in northern India and the Chamorro ethnic Polynesians [4]. The prognosis of NPC is related to a number of conventional prognostic factors, such as TNM stage classification, history of smoking, and clinical and molecular prognostic variables; raised plasma Epstein-Barr virus DNA is also one of the highlighted determinants of prognosis [2]. However, none of them can accurately assess the prognosis of patients in clinical practice.
In the early nineties, 18F-fluorodeoxyglucose positron emission tomography (18F-FDG PET) entered clinical use as a practical imaging technique in the management of neoplastic disorders, and it has also been applied in oncologic procedures such as TNM staging, restaging at progression and assessment of treatment efficacy across different therapeutic processes [5,6]. In addition, various FDG parameters measured during or after chemotherapy and radiotherapy have been discussed as independent prognostic factors for outcome in numerous malignant tumors [6][7][8].
Standardized uptake value (SUV), a semi-quantitative parameter in 18F-FDG-PET/CT, is calculated as the ratio of the FDG concentration in a region of interest (ROI) to the weight-standardized injected dose [9]. The most widely used parameter is SUVmax, defined as the maximal SUV in the ROI, which is thought to be a prognostic marker in some malignancies [6,10,11]. Apart from SUVmax, metabolic tumor volume (MTV) and total lesion glycolysis (TLG), which are metabolic and volumetric tumor parameters, have recently become more widely applied in 18F-FDG-PET/CT [12]. MTV is the volume of tumor tissue with active 18F-FDG uptake, and TLG is the mean SUV within that volume multiplied by the MTV [13][14][15]. MTV and TLG might be used to represent the burden of metabolically active lesions and tumor invasiveness in some malignancies [16].
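The definitions above can be made concrete with a minimal sketch. It assumes the common body-weight normalization of SUV, segmentation with a fixed SUV threshold of 2.5, and the usual definition of TLG as the mean SUV of the segmented voxels multiplied by the MTV. The function names, activity concentrations, injected dose, body weight and voxel volume are hypothetical and chosen only for illustration.

```python
def suv(activity_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """Body-weight SUV: tissue concentration / (injected dose / body weight)."""
    dose_kbq = injected_dose_mbq * 1000.0
    weight_g = body_weight_kg * 1000.0
    return activity_kbq_per_ml / (dose_kbq / weight_g)


def volumetric_parameters(voxel_suvs, voxel_volume_ml, threshold=2.5):
    """Return (SUVmax, MTV in mL, TLG) using a fixed SUV threshold (assumed)."""
    suv_max = max(voxel_suvs)
    active = [v for v in voxel_suvs if v >= threshold]
    if not active:
        return suv_max, 0.0, 0.0
    mtv = len(active) * voxel_volume_ml      # metabolically active volume
    suv_mean = sum(active) / len(active)     # mean SUV inside that volume
    return suv_max, mtv, suv_mean * mtv      # TLG = SUVmean x MTV


# Hypothetical example: 370 MBq injected into a 70 kg patient,
# five voxels of 0.5 mL each with the listed activity concentrations.
voxels = [suv(c, 370.0, 70.0) for c in (5.0, 15.0, 30.0, 45.0, 60.0)]
suv_max, mtv, tlg = volumetric_parameters(voxels, voxel_volume_ml=0.5)
print(f"SUVmax={suv_max:.2f}  MTV={mtv:.1f} mL  TLG={tlg:.2f}")
```

In this toy case one voxel falls below the threshold, so only four voxels contribute to MTV and TLG, while SUVmax depends on a single voxel.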
However, a number of studies have reported conflicting results on the prognostic value of SUVmax, MTV and TLG in NPC patients [17][18][19]. Thus, this systematic review and meta-analysis aimed to evaluate the prognostic value of 18F-FDG-PET/CT for survival outcomes in patients with NPC.

Search results
For primary retrieval, 603 articles were identified through 4 databases: 336 articles from Embase, 169 from Web of Science, 98 from PubMed, and none from the Cochrane Library. We first excluded duplicates (n = 340) and conference abstracts (n = 131). Of the remaining articles, 105 were excluded on the basis of titles and abstracts, leaving 27 potentially eligible articles whose full text was reviewed. Of these, 7 were eliminated because the ln(hazard ratio (HR)) and its variance for 18F-FDG-PET/CT parameters in NPC patients could not be extracted or calculated [20][21][22][23][24][25]; 4 were excluded because two authors had published 4 and 2 reports on the same population, respectively [26][27][28][29]; and 1 article with overlapping patients was also excluded [18]. Finally, 15 studies comprising 1,938 patients, published from 2008 to 2016, were eligible for this study (Figure 1) [17,19,30-42]. Table 1 shows the principal characteristics of the included studies. Nearly all of them were conducted in Asia: 6 studies in China, 4 in Taiwan, 3 in Korea, 1 in South Korea, and 1 in Egypt. Two were prospective and the remaining 13 were retrospective. Of these studies, 14 reported a sample size, which ranged from 40 to 449 (median 70). The follow-up duration varied from 13.6 to 84.5 months (median 40.0 months). Table 2 shows the patterns of 18F-FDG PET scanning. Scanners and scanning protocols differed between studies. The duration of fasting varied from 4 h to 8 h and was not reported in 1 study. Serum blood glucose before injection ranged from 144 to 200 mg/dL and was not reported in 6 studies. The injected dose varied from 296 to 555 MBq and the post-injection interval ranged from 45 to 70 min.
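As a consistency check, the selection counts reported above can be tallied in a few lines. All numbers are taken from the text; the dictionary keys are shorthand labels introduced here.

```python
# Records identified per database, as reported in the search results.
by_database = {"Embase": 336, "Web of Science": 169, "PubMed": 98, "Cochrane Library": 0}
identified = sum(by_database.values())

# Exclusions before full-text review.
screening_exclusions = {"duplicates": 340, "conference abstracts": 131, "title/abstract": 105}
full_text_reviewed = identified - sum(screening_exclusions.values())

# Exclusions at full-text review.
full_text_exclusions = {
    "HR not extractable": 7,
    "repeated reports on the same population": 4,
    "overlapping patients": 1,
}
included = full_text_reviewed - sum(full_text_exclusions.values())

print(identified, full_text_reviewed, included)  # 603 27 15
```

The tallies reproduce the 603 identified records, the 27 full-text articles and the 15 included studies.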
Four threshold methods were used to calculate the cut-off values: receiver-operating characteristic (ROC) curves in 10 studies, the minimum P value in 1 study, the median value in 1 study, and Contal and O'Quigley's method in 1 study; the method was not reported in 2 studies. Two threshold methods were applied to MTV and TLG for segmentation of the primary NPC lesions: a fixed SUV of 2.5 was used in 4 articles [30,38-40] and the isocontour method was used in 1 study [19]. The median cut-off point for SUVmax was 8.78 (5.0 to 15.6). The cut-off values of MTV varied from 28.9 to 110 cm3, and those of TLG were between 249.1 and 764. The Newcastle-Ottawa Scale (NOS) scores are shown in Supplement Table 1; all of the included studies scored more than 6.

Primary outcome: EFS
Eleven studies were included to determine the association between SUVmax and event-free survival (EFS), and the combined data revealed that high SUVmax was associated with worse EFS (HR = 2.63; 95% CI = 1.71-4.05). A symmetrical funnel plot was obtained after the trim and fill analysis (Figure 3). When the hypothesized missing studies were added, the result of this sensitivity analysis (pooled HR = 1.88; 95% CI = 1.52-2.33, P < 0.0001) still indicated a significant correlation between SUVmax and EFS. We also conducted a sensitivity analysis to further estimate the impact of individual studies on the combined HR. When one study [35] was omitted, the HR was 1.94 (1.56-2.43), with I2 decreased to 21%, using a fixed-effect model. For MTV, 2 studies were included to analyze its prognostic value for EFS. Since no significant heterogeneity (χ2 = 1.88, P = 0.39; I2 = 0%) was found among these studies, the fixed-effect model was used and the HR was 2.55 (95% CI = 1.49-4.35, P = 0.0006) (Figure 2C). For TLG, 3 studies were combined in the analysis for EFS. Significant heterogeneity (χ2 = 4.74, P = 0.09; I2 = 58%) was found among these studies, so we used the random-effect model to calculate the HR (3.32, 95% CI = 1.23-8.95, P = 0.02) (Figure 2E). When the study of Yang, Z. et al. [38] was excluded, heterogeneity fell from 58% to 36% (P = 0.21) and the pooled HR reached 4.41 (95% CI = 2.36-8.26).
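The relation between the reported chi-square statistic and I2 can be checked with a short sketch. It assumes the standard definition I2 = max(0, (Q - df)/Q), with df equal to the number of pooled studies minus one, and it reads the model-selection rule as choosing the random-effect model whenever P ≤ 0.1 or I2 ≥ 50%, which is one possible reading of the rule used in this study. The function names are hypothetical.

```python
def i_squared(q, n_studies):
    """Higgins I^2 (%) from Cochran's Q and the number of pooled studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def choose_model(p_heterogeneity, i2_percent):
    """Assumed rule: random effects when P <= 0.1 or I^2 >= 50%, else fixed."""
    if p_heterogeneity <= 0.1 or i2_percent >= 50.0:
        return "random"
    return "fixed"


# TLG and EFS: three studies, chi-square = 4.74, P = 0.09 (values from the text).
i2_tlg = i_squared(4.74, 3)
model_tlg = choose_model(0.09, i2_tlg)
print(f"I2 = {i2_tlg:.0f}%, model = {model_tlg}")
```

For the TLG analysis this gives an I2 of about 58% and selects the random-effect model, in line with the values reported above.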

Publication bias
Begg's and Egger's tests were conducted to assess publication bias.

DISCUSSION
Physicians sometimes face the difficult situation that standard therapies applied in a number of tumors, including NPC, are not effective, so reducing the toxicity associated with treatment failure and avoiding unnecessary treatment become critical [43]. According to the literature of recent years, the metabolic parameters of 18F-FDG PET/CT (SUVmax, MTV and TLG) can not only reflect tumor biologic characteristics but also be used to evaluate clinical prognosis [18,38]. At present, SUVmax is considered the most frequently used value in diagnosis and therapeutic evaluation because of its high practicability, sensitivity and efficiency [44][45][46]. Meanwhile, a poor prognostic value of SUVmax for head and neck cancer was reported in populations with different stages and treatments [47]. As is generally known, NPC is one of the most common types of head and neck cancer. Some studies have suggested that SUVmax is one of the most important prognostic factors in NPC patients [34]. However, SUVmax only reflects a simple measure of tumor glucose metabolism within the lesion and cannot evaluate the heterogeneity of total tumor uptake. Recently, the prognostic value of MTV and TLG, which are volumetric parameters, has also been pointed out in conference literature [48][49][50]. Accordingly, we conducted a meta-analysis and found that higher values of SUVmax, MTV and TLG could predict a poor prognosis in NPC patients.
In this meta-analysis, the combined results demonstrated that SUVmax was a significant prognostic factor for EFS and OS. However, the association between SUVmax and survival outcomes may be affected by several confounding factors, so a subgroup analysis by statistical analysis method was conducted to assess whether SUVmax is an independent prognostic factor. Multivariate analysis is an effective method that uses the Cox proportional hazards model or a logistic regression model to reduce bias from major confounders [51]. In our study, both the univariate and the multivariate subgroups of SUVmax were significant, so it could be presumed that SUVmax might be one of the independent prognostic factors for survival outcomes. In addition, the methods used to determine cut-off values varied among the included studies, such as the ROC curve, the minimal P-value approach and the median value method. Of these, ROC was the most frequent and reasonable method for calculating cut-off values in our meta-analysis. Although other approaches, including the minimal P-value approach, might result in a high rate of false positives, they have also been widely applied in previous studies [52]. Subgroup analyses stratified by the method used to determine the cut-off values were therefore conducted.
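As an illustration of ROC-based cut-off selection, the following minimal sketch picks the threshold that maximizes Youden's J (sensitivity + specificity - 1). This criterion is an assumption made here; the included studies may have chosen their ROC cut-offs differently. The SUVmax values and event indicators are hypothetical.

```python
def youden_cutoff(values, events):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    values: parameter values (e.g. SUVmax); events: 1 if the event occurred.
    A patient is classified as high risk when value >= cutoff.
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, e in zip(values, events) if v >= cutoff and e == 1)
        tn = sum(1 for v, e in zip(values, events) if v < cutoff and e == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # keep the first cut-off that reaches the maximum
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical SUVmax values and event indicators, for illustration only.
suv_max = [4.2, 5.1, 6.3, 7.0, 8.8, 9.5, 11.2, 12.4, 14.0, 15.6]
event = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(suv_max, event)
print(cutoff, round(j, 2))
```

A cut-off chosen this way is optimized on the same data it is later tested on, which is one reason data-driven thresholds differ between studies and can overstate the separation between groups.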
Whether traditional imaging techniques can predict the survival of NPC patients is still controversial, because they focus only on tumor size. In contrast, MTV and TLG, which are volumetric parameters, can be used for metabolic analysis of radiotracer activity in tumor tissues and reflect the tumor burden more accurately. Our study confirmed that high values of the volumetric parameters indicated poor EFS and OS, suggesting that 18F-FDG-PET/CT holds considerable promise for predicting survival outcomes of NPC patients. To our knowledge, some articles have studied PET parameters of the tumor or lymph nodes, but our study focused only on parameters of the tumor. Although 3 included articles [31,33,42] reported that SUVmax of lymph nodes was an independent predictor of EFS or OS, there were no further data on MTV and TLG of lymph nodes for survival, so we could not analyse them systematically. More studies are needed to further validate the findings.
We identified 22 previous meta-analyses assessing the clinical application of 18F-FDG-PET/CT in NPC and head and neck cancer by electronic search of PubMed (Table 4, Supplement Table 1). Only 4 of these were about NPC, and they all analysed the accuracy of PET for residual and recurrent NPC or for the detection of lymph node and distant metastases [53][54][55][56]. To the best of our knowledge, our meta-analysis is the first to assess the prognostic value of 18F-FDG PET/CT parameters in NPC patients. Of the remaining studies on head and neck cancers, 14 analysed the diagnostic performance of PET for NPC [57][58][59], distant metastasis [60][61][62][63][64][65][66][67], and residual or recurrent disease [68][69][70] in head and neck cancers; 4 evaluated PET parameters for EFS, OS, disease-free survival (DFS) or loco-regional control using HRs, odds ratios or risk ratios [47,71-73]. Pak et al. suggested that the associations between high volumetric PET parameters (MTV and TLG) and the risk of adverse events, disease progression, or death were significant (i.e., an approximately 3-fold increase in the HR). In addition, they also demonstrated that high SUVmax was associated with worse EFS (HR = 1.83; 95% CI: 1.39-2.42) and worse OS (HR = 2.36; 95% CI: 1.48-3.77).
Heterogeneity was found in some analyses. On the one hand, some aspects of the 18F-FDG-PET/CT imaging process are significant contributors to heterogeneity, e.g., fasting duration, pre-injection blood glucose level, post-injection interval and FDG dose. Guidelines and protocols for 18F-FDG PET imaging [74][75][76] recommend that the duration of fasting should be at least 4 h, the pre-injection blood glucose level should be less than 200 mg/dL and the post-injection interval should be less than 75 min. The heterogeneity of the results was acceptable since the values were within the recommended ranges. On the other hand, the obvious differences in PET imaging thresholds between the studies can also induce heterogeneity, which could be explained by various factors, such as PET machine types, treatment protocol variations, different scanning procedures, diversity of patient cohorts and variations in institutional technique [77][78][79]. A subgroup analysis of SUVmax was performed based on median values; however, the cut-off values and 18F-FDG PET scanning techniques used in these studies differed, and the number of studies was too small to form groups.
Moreover, this study has a few limitations. Firstly, the quality of the included studies is a limitation. Although all of the included studies were evaluated by NOS scores and considered to be of high quality, only 2 were prospective, and some studies lacked details on patients and on the 18F-FDG PET scan. Further prospective studies relating the survival of NPC patients to PET parameters are needed. Secondly, we included only English-language articles, so the potential effect of language bias should not be ignored. Thirdly, only published studies were included when we searched the electronic databases, so publication bias could not be excluded, even though Begg's test did not suggest clear evidence of it. Moreover, the final result of our trim and fill sensitivity analysis was not affected by incorporating the hypothetical missing studies, which supports the reliability of our analysis. In addition, almost all of the included studies were conducted in Asia, with only one [41] in Africa and none in Europe or other continents. This reflects the high incidence of NPC in these regions and countries, and it may introduce ethnic bias. Finally, the use of Engauge Digitizer to extract HR data indirectly from survival curves may lead to imprecision. Nonetheless, some recent clinical studies [79,80] support the validity of the main results of our study.

Search strategies
We systematically searched PubMed, Embase, Cochrane Library and Web of Science with no restriction on language and date of publication. The last search was conducted on July 4, 2016, using the following terms: ("nasopharynx cancer" or "nasopharyngeal carcinoma" or "nasopharyngeal cancer" or "nasopharynx carcinoma") and ("positron emission tomography" or "positron emission tomography-computed tomography" or "positron emission tomography computed tomography" or "PET" or "PET-CT" or "PET CT" or "PET/CT" or "fluorodeoxyglucose" or "FDG") and ("prognostic" or "prognosis" or "predictive" or "survival" or "outcome").

Inclusion and exclusion criteria
All studies in the meta-analysis had to meet the following criteria: (1) patients with pathologically diagnosed nasopharyngeal carcinoma; (2) case-control or cohort study design; (3) at least one 18F-FDG PET scan before and/or during treatment; (4) reporting of the prognostic value of PET-CT for outcomes such as OS, DFS, EFS, progression-free survival (PFS) and disease metastasis-free survival (DMFS); (5) provision of HRs and 95% CIs and other useful information; (6) publication in English. Articles were excluded according to the following criteria: (1) studies based on animals or cells; (2) comment letters, case reports or conference abstracts; (3) insufficient data to calculate the HRs and 95% CIs; (4) research limited to PET-CT for diagnosis and tumor staging, without prognostic parameters; (5) fewer than 10 patients. When articles recruiting overlapping patients were detected, only the most complete or most recent study was included. Two authors (J Lin and MH Yan) independently evaluated the literature for eligibility. Disagreements were discussed and adjudicated by the corresponding author (GZ Xie).

Data extraction
Two authors (J Lin and H Li) independently extracted the data from the publications. A Microsoft Excel sheet was designed to collect the following items: (1) basic study information, including author names, year of publication, study period, follow-up duration and study design; (2) patient and tumor details, including patient source, number of patients, median age, TNM staging and end points provided; (3) 18F-FDG-PET scan data and parameters, including PET scanners, duration of fasting before FDG injection, pre-injection blood glucose test, injected dose of FDG, post-injection interval, method used to determine cut-off values, PET parameters, tumor delineation and cut-off values of SUVmax, MTV and TLG.

Quality assessment
According to the Newcastle-Ottawa Scale criteria (http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp), two investigators (J Lin and GX Liao) independently assessed the quality of the potentially included studies. The NOS criteria are scored on three items: subject selection, comparability of subjects, and outcome (cohort studies) or exposure (case-control studies). For quality assessment, each item had three scores and the total score varied from 0 (lowest) to 9 (highest). Studies with scores ≥6 were rated as high quality, and studies with scores below 6 were excluded from this meta-analysis; discrepancies were resolved by consensus (Supplement Table 1).

Statistical analysis
In this meta-analysis, disease-free survival, progression-free survival and disease metastasis-free survival in the included studies were merged and redefined as EFS.
The primary endpoint was EFS, defined as the time from initiation of therapy until recurrence or metastasis [43]. The secondary outcome was OS, measured from the date of initiation of therapy to the date of death from any cause. The impact of 18F-FDG PET parameters on survival outcomes was measured by the effect size of the HR. HR values of the included studies were extracted using the methodology suggested by Parmar et al. [81] and Tierney et al. [82]. HRs and their 95% CIs were extracted directly when they were reported by the authors. Otherwise, P values of the log-rank test, the number of events and the total number of patients in each group were extracted to estimate the HR indirectly, or the HRs were extracted from survival curves. We presumed that patients were censored at a constant rate during follow-up, and the Kaplan-Meier curves were read with Engauge Digitizer (version 8.2 for Mac; http://digitizer.sourceforge.net) to reconstruct the HR estimate and its variance. An observed HR > 1 indicated a worse prognosis in patients with a high parameter value, and an HR < 1 suggested a better prognosis. Heterogeneity between studies was evaluated by the Chi-square test and the I2 statistic, following the recommendations of the Cochrane Handbook (http://handbook.cochrane.org/). If the P-value was >0.1 and/or I2 < 50%, indicating no or moderate heterogeneity, a fixed-effects model was used; otherwise, the random-effects model was used. The analyses described above were conducted with Review Manager (RevMan, version 5.3; The Nordic Cochrane Centre, The Cochrane Collaboration). Begg's funnel test and Egger's test for publication bias were performed with STATA version 12.0 (STATA Corp., College Station, TX). A P-value of less than 0.05 was considered statistically significant.
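The pooling step described above can be sketched as follows. This is a minimal illustration, not the RevMan implementation: it assumes that each study reports an HR with a symmetric 95% CI on the log scale, uses inverse-variance weights for the fixed-effect estimate, and uses the DerSimonian-Laird moment estimator for the between-study variance in the random-effects estimate. The function names and the three example studies are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_hr_and_se(hr, ci_low, ci_high):
    """ln(HR) and its standard error recovered from a reported 95% CI."""
    return math.log(hr), (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)


def pool(studies):
    """Inverse-variance pooling of (HR, CI low, CI high) tuples.

    Returns fixed-effect and random-effects (DerSimonian-Laird, assumed)
    summaries as (HR, CI low, CI high), plus Cochran's Q and I^2 (%).
    """
    logs, ses = zip(*(log_hr_and_se(*s) for s in studies))
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * y for wi, y in zip(w, logs)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, logs))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Between-study variance tau^2 (moment estimator, truncated at zero).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    random_ = sum(wi * y for wi, y in zip(w_re, logs)) / sum(w_re)

    def summary(mean, weights):
        se = math.sqrt(1.0 / sum(weights))
        return tuple(math.exp(v) for v in (mean, mean - Z_95 * se, mean + Z_95 * se))

    return {"fixed": summary(fixed, w), "random": summary(random_, w_re), "Q": q, "I2": i2}


# Hypothetical per-study HRs with 95% CIs, for illustration only.
example = [(2.1, 1.2, 3.7), (1.6, 0.9, 2.8), (3.0, 1.5, 6.0)]
result = pool(example)
print({k: result[k] for k in ("fixed", "random")}, round(result["I2"], 1))
```

When the estimated between-study variance is zero, the two models give the same pooled HR; the random-effects interval widens only when Q exceeds its degrees of freedom.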

CONCLUSION
This meta-analysis demonstrated that NPC patients with a high SUVmax, MTV or TLG on 18F-FDG-PET/CT are at higher risk of adverse events or death, despite clinically heterogeneous NPC patients and the various methods adopted between studies. 18F-FDG-PET/CT can be used for risk stratification with respect to disease control and survival. Future multi-center studies are needed to validate our findings and to further explore the prognostic value of other 18F-FDG PET/CT parameters for the survival of NPC patients.