Prognostic significance of total metabolic tumor volume on 18F-fluorodeoxyglucose positron emission tomography/computed tomography in patients with diffuse large B-cell lymphoma receiving rituximab-containing chemotherapy

Purpose: The purpose of this study was to determine the prognostic significance of metabolic parameters on pre-treatment 18F-fluorodeoxyglucose positron emission tomography/computed tomography (FDG PET/CT) in patients with diffuse large B-cell lymphoma (DLBCL) receiving rituximab-containing therapy.
Materials and Methods: From September 2009 to December 2014, DLBCL patients who had received FDG PET/CT scans for staging were enrolled. The maximal standardized uptake value of tumor (SUVt) was recorded. The metabolic tumor volume (MTV) was the volume of lesion with an SUV greater than 2.5. The total lesion glycolysis (TLG) was the sum of the products of MTV and mean SUV in all measured lesions. Univariate and multivariate analyses were used to assess the prognostic significance of maximal SUVt, total MTV, TLG and other clinical parameters.
Results: There were 118 patients enrolled in this study. The median follow-up time was 28.7 months. The 5-year progression-free survival (PFS) for patients with higher and lower total MTV was 32.3% and 66.0% respectively (p = 0.0001). The 5-year overall survival (OS) for patients with higher and lower total MTV was 34.3% and 69.9% respectively (p < 0.0001). Multivariate analysis revealed that, besides IPI, total MTV was independently predictive for PFS (HR: 2.31, 95% CI: 1.16-4.60, p = 0.0180) and OS (HR: 2.38, 95% CI: 1.12-5.04, p = 0.024). TLG and maximal SUV of tumor were not independent prognostic factors.
Conclusions: An elevated total MTV was a predictor for shorter PFS and OS in patients with DLBCL receiving rituximab-containing therapy, independent of IPI.


INTRODUCTION
Diffuse large B-cell lymphoma (DLBCL), accounting for about one-third of all non-Hodgkin's lymphoma (NHL), is the most common type of NHL [1]. The international prognostic index (IPI) has been a powerful prognostic tool for stratifying patient risk for more than 20 years [2]. Immuno-chemotherapy combining rituximab with cyclophosphamide, doxorubicin, vincristine, and prednisone (R-CHOP) has resulted in a significant improvement in survival [1]. However, a subset of patients is not cured with R-CHOP, owing either to primary refractory disease or to late relapse following an initial response. Efforts have been made to improve the risk stratification model, including regrouping the IPI score (revised IPI) [3], initial hematological index [4], type of bone marrow involvement [5] and tumor bulk [6]. Nonetheless, these efforts have resulted in only an incremental improvement. New prognostic biomarkers for the rituximab era are needed.
Over the past decade, 18F-fluorodeoxyglucose (FDG) positron emission tomography combined with computed tomography (PET/CT) has been widely used for the management of DLBCL [7][8][9][10][11]. The standardized uptake value (SUV) is the most commonly used semi-quantitative parameter in FDG PET/CT. A higher maximal SUV in the lesion has been shown to be of prognostic significance in patients with DLBCL [12,13]. Beyond SUV, with the development of software programs, metabolic tumor volume (MTV) and total lesion glycolysis (TLG) have recently been found to play an important role in the prediction of patient outcomes. However, recent studies evaluating the prognostic value of total MTV and TLG in DLBCL have shown inconclusive and contradictory results [14][15][16][17][18][19][20].
Therefore, the aim of the current study was to determine the prognostic value of total MTV and TLG measured on pre-treatment FDG PET/CT, and to compare MTV and TLG with other clinical prognostic factors, in patients with newly diagnosed DLBCL receiving R-CHOP therapy.

Correlation between MTV, TLG and clinical prognostic parameters
Correlations between metabolic parameters from FDG PET/CT scans and clinical prognostic parameters are listed in Table 3. Using Spearman's correlation test, total MTV was positively and significantly correlated with LDH level, creatinine level, GOT level, β2-microglobulin level, clinical stage, IPI score, revised IPI, bone marrow status, maximal SUVt and TLG. Significant inverse correlations were seen between total MTV and both Hb and albumin levels. TLG, in turn, was positively and significantly correlated with LDH level, GOT level, β2-microglobulin level, clinical stage, IPI score, revised IPI and maximal SUVt. Significant inverse correlations were also seen between TLG and both Hb and albumin levels.

Comparison between metabolic parameters measured in patients with different clinical outcomes
During follow-up, patients who had progressive disease or had died were grouped as progression (n = 55), and compared with patients in complete or partial remission (n = 63). After a median follow-up period of 28.7 months, 69 (58.5%) patients were alive and 49 (41.5%) patients had died at the end of the study. The comparisons of maximal SUVt, total MTV and TLG in patients with different clinical outcomes are shown in Table 4. There were no significant differences in maximal SUVt between patients with progression and those in remission, or between patients who had died and those who were alive. However, patients with disease progression had significantly higher total MTV and TLG than patients with partial or complete remission (MTV, p = 0.0005; TLG, p = 0.0021). Patients who had died had significantly higher total MTV and TLG than patients who were alive at the end of the study (MTV, p < 0.0001; TLG, p = 0.0004).

Identification of the most discriminative cut-off values
Receiver-operating characteristic (ROC) curve analysis was used to identify the optimal cut-off values for distinguishing high from low levels of MTV and TLG (Figure 1). For progression-free survival (PFS), the estimated areas under the ROC curve (AUCs) of MTV and TLG were 0.687 (p = 0.0001) and 0.665 (p = 0.001) respectively (Figure 1A). The best cut-off value for dividing high and low MTV status was 165.4 cm³, with 76.5% sensitivity and 58.7% specificity (Youden index 0.35). The best cut-off value for dividing high and low TLG status was 1204.9 cm³, with 70.9% sensitivity and 60.3% specificity (Youden index 0.31).
For overall survival (OS), the estimated AUCs of MTV and TLG were 0.723 (p < 0.0001) and 0.691 (p = 0.0001) respectively (Figure 1B). The best cut-off value for dividing high and low MTV status was 190.2 cm³, with 77.6% sensitivity and 62.3% specificity (Youden index 0.40). The best cut-off value for dividing high and low TLG status was 1480.8 cm³, with 69.4% sensitivity and 62.3% specificity (Youden index 0.32).
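The Youden index quoted with each cut-off is sensitivity + specificity - 1, and the optimal cut-off is the candidate value that maximizes it. The following is a minimal sketch of that selection, not the MedCalc procedure used in the study. The function names, the rule that a value at or above the cut-off counts as "high", and the toy data are all assumptions made for illustration.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def youden_cutoff(values, events):
    """Return (cutoff, sensitivity, specificity, J) for the cut-off maximizing J.

    values: marker values (e.g. total MTV); events: 1 if the event occurred, else 0.
    Assumption: a patient is classified as 'high' when value >= cutoff.
    Ties in J are resolved in favor of the lowest cut-off.
    """
    positives = sum(events)
    negatives = len(events) - positives
    best = None
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, e in zip(values, events) if e == 1 and v >= cutoff)
        true_neg = sum(1 for v, e in zip(values, events) if e == 0 and v < cutoff)
        sens = true_pos / positives
        spec = true_neg / negatives
        j = youden_index(sens, spec)
        if best is None or j > best[3]:
            best = (cutoff, sens, spec, j)
    return best


# Hypothetical toy data, not patient data from the study.
toy_mtv = [20.0, 50.0, 90.0, 150.0, 170.0, 200.0, 300.0, 450.0]
toy_progressed = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, sens, spec, j = youden_cutoff(toy_mtv, toy_progressed)

# Consistency check against the reported PFS cut-off for MTV
# (76.5% sensitivity, 58.7% specificity).
reported_j = round(youden_index(0.765, 0.587), 2)
```

Applied to the reported sensitivity and specificity pairs, the same formula reproduces the published indices after rounding to two decimals.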

Clinical outcomes according to cut-off values of MTV and TLG
In Kaplan-Meier survival analysis, patients with high MTV had poorer survival than patients with low MTV (PFS, cut-off value 165.4 cm³, p = 0.0001; OS, cut-off value 190.2 cm³, p < 0.0001; Figure 2A and 2B). Similarly, patients with high TLG had inferior survival compared with patients with low TLG (PFS, cut-off value 1204.9 cm³, p = 0.0008; OS, cut-off value 1480.8 cm³, p = 0.0002; Figure 2C and 2D). The 5-year PFS for patients with high TLG (n = 65) and low TLG (n = 53) was 34.3% and 61.8% respectively. The 5-year OS for patients with high TLG (n = 60) and low TLG (n = 58) was 41.3% and 59.5% respectively. The median OS time for the patients with higher TLG (≥ 1480.8 cm³, n = 60) was also 17.0 months (95% CI: 10.0-35.0).
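For readers unfamiliar with how the survival percentages are obtained, the Kaplan-Meier (product-limit) estimate multiplies, at each observed event time, the fraction of at-risk patients who survive that time. The sketch below is illustrative only: the function name and the six-patient toy data are assumptions, and the study itself used MedCalc.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier curve as a list of (time, survival_probability).

    times: follow-up time per patient (e.g. months).
    events: 1 if the event (progression or death) was observed, 0 if censored.
    A step is recorded only at times where at least one event occurred.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        n_leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical toy cohort, not data from the study.
toy_times = [5, 8, 12, 12, 20, 30]
toy_events = [1, 0, 1, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients (event flag 0) leave the at-risk set without lowering the curve, which is why the estimate can use patients with incomplete follow-up.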

Clinical outcomes in patients with different subgroups
Patients were divided into early-stage (stage I and II, n = 48) and late-stage (stage III and IV, n = 70) groups. In the early-stage group, patients with higher total MTV had poorer clinical outcomes (PFS, cut-off value 77.7 cm³, log-rank p = 0.0033; OS, cut-off value 77.7 cm³, log-rank p = 0.0193). Higher TLG also correlated with poorer clinical outcomes (PFS, cut-off value 475.6 cm³, log-rank p = 0.0095; OS, cut-off value 587.0 cm³, log-rank p = 0.0419).
In the late-stage group, patients with higher total MTV or TLG had poorer PFS and OS. However, a significant difference in survival was shown only for OS using dichotomized total MTV (cut-off value 190.2 cm³, log-rank p = 0.0153).
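The log-rank p values above compare the survival curves of the high and low groups. As a rough illustration of the test, the sketch below accumulates observed minus expected events in one group across all event times and refers the resulting statistic to a chi-square distribution with one degree of freedom. The function name and toy data are assumptions, and results from statistical software may differ slightly in how ties are handled.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value) with 1 degree of freedom.

    times_*: follow-up times; events_*: 1 = event observed, 0 = censored.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti in times_a if ti >= t)  # at risk in group A
        n_b = sum(1 for ti in times_b if ti >= t)  # at risk in group B
        d_a = sum(1 for ti, ei in zip(times_a, events_a) if ti == t and ei == 1)
        d_b = sum(1 for ti, ei in zip(times_b, events_b) if ti == t and ei == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical toy groups, not data from the study.
high_times, high_events = [3, 5, 7, 9], [1, 1, 1, 1]
low_times, low_events = [10, 12, 15, 20], [1, 0, 1, 0]
chi_square, p_value = logrank_test(high_times, high_events, low_times, low_events)
```

The statistic is symmetric in the two groups, so swapping the arguments gives the same chi-square and p value.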

DISCUSSION
FDG PET/CT has been widely used in oncology for several years. Its clinical roles in diagnosis, staging, treatment monitoring and prediction of prognosis in patients with lymphoma have been reported [7][8][9][10][11]. The maximal SUV of the primary tumor, which is easy to obtain and highly reproducible, has previously been shown to be of prognostic value [12,13]. However, maximal SUV records only the intensity of FDG uptake in the most aggressive cells and does not reflect tumor volume. Volumetric analysis of MTV and TLG provides more information than maximal SUV, and evidence of its clinical value is increasing. Meignan et al. pooled data from three clinical trials in follicular lymphoma and found that higher MTV was associated with poorer PFS [21]. Cottereau et al. reported that higher MTV predicted poor survival in patients with peripheral T-cell lymphoma [22]. Kanoun et al. [23] and Ceriani et al. [24] reported similar findings in Hodgkin's lymphoma and primary mediastinal (thymic) large B-cell lymphoma respectively.
In DLBCL, Song et al. conducted a retrospective analysis of 169 patients with nodal stage II and III DLBCL, in which MTV had more potential predictive power than Ann Arbor stage [25]. Sasanelli et al. had similar results, suggesting that pre-therapy total MTV is an independent predictor of outcome in patients of all stages [17]. In patients with bone marrow involvement, Song et al. concluded that high total MTV predicted a worse prognosis [26]. Another article by Song and colleagues concluded that high MTV is an independent factor for predicting survival in primary gastrointestinal DLBCL [27]. Combined with early PET/CT response or molecular characteristics, MTV also improved the predictive power and defined a poor-prognosis group [20], and allowed accurate selection of patients for tailored therapy [28]. However, conflicting results have also been reported. Some articles found that TLG, but not MTV, was the better predictor and correlated well with patient outcomes [15,16,18]. Others reported that neither total MTV nor TLG on FDG PET/CT was an independent predictor [14,19,29]. [25][26][27], in spite of different patient populations. In the current study, TLG was statistically significant in univariate analysis, but failed to be an independent factor in multivariate analysis. We speculate that the cause may be related to different definitions of the marginal threshold used when measuring the MTV and mean SUV of a lesion. Most studies in which MTV was more predictive of patient outcomes used an absolute SUV cut-off (more than 2.5) as the threshold to define MTV [20,25,26,27], while only one article used a threshold of 41% of maximal SUV to calculate MTV [17]. The similarities and differences between the current study and previous similar studies in DLBCL are summarized in Table 7.
In the literature review, we found that total MTV and TLG varied widely among earlier reports. One reason may be differences in patient characteristics, with a wide range of age, clinical stage and disease subtype. Another important reason is the use of different software and different ways of defining the marginal threshold of abnormal FDG uptake. Multiple programs from different vendors have been used to calculate the MTV, e.g. Syngo TrueD (Siemens Healthcare) [14,15], Planet Onco (DOSISoft) [28], PET-VCAR program (GE Healthcare) [21,29], Imagys (Keosys, Saint-Herblain, France) [17,21] and so on. Data on the correlation and discrepancy between these programs are scarce.
As to the methodology, there are three basic methods to evaluate the MTV. The first is based on a threshold percentage of the maximal SUV in a lesion [30]. Some authors adopted this method with different thresholds, ranging from 40% to 42% [14,17,19,28,29]. One article compared 3 marginal threshold settings (i.e. 25%, 50% and 75%) to find an optimal one [16]. We think that this methodology has drawbacks. If the maximal SUV of a lesion is relatively high, the metabolic volume will be underestimated. For example, if we use a threshold of 40% to estimate the volume of a lesion with a maximal SUV of 18, the portion with SUV below 7.2 will not be included in the calculation. That is why the ideal threshold differed according to maximal SUV in the earlier articles. The second method defines the threshold as the mean SUV of normal liver plus 3 standard deviations (SD) [18,31]. This method is patient-based and is able to reduce the influence of different PET/CT systems and of technical or artificial factors. However, the mean SUV of normal liver should be carefully defined, especially in patients who present with hepatic involvement by lymphoma at diagnosis. In the current study, we used the third method, in which lesions with an absolute SUV cut-off value of more than 2.5 were incorporated into the calculation of total MTV, as suggested by Freudenberg et al. [32]. This method was also adopted in several articles [20,25,26,27]. With this method it is important to keep the imaging protocols, including patient preparation, as consistent as possible. When the images are read by experienced nuclear medicine physicians, the advantage of this method is that the lesion is easy to define with a clear-cut value. Several other methods for auto-segmentation of PET volumes exist, such as gradient-based, statistical- and texture-based methods. Every method has its specific advantages and disadvantages.
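The worked example in this paragraph (a 40% threshold applied to a lesion with a maximal SUV of 18 gives a cut-off of 7.2) can be made concrete with a small sketch comparing a relative threshold with the absolute SUV 2.5 threshold on the same lesion. The voxel SUV list, the voxel volume of 0.5 cm³ and the function name are hypothetical, and a strictly-greater-than comparison is assumed for both methods.

```python
def volume_above(voxel_suvs, threshold, voxel_volume_cm3):
    """Volume (cm3) of voxels whose SUV is strictly greater than the threshold."""
    return sum(1 for suv in voxel_suvs if suv > threshold) * voxel_volume_cm3


# Hypothetical lesion: one SUV value per voxel, maximal SUV of 18.
lesion_suvs = [18.0, 12.0, 9.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.5]
voxel_volume_cm3 = 0.5  # assumed voxel size, for illustration only

# Method 1: relative threshold, 40% of the maximal SUV (0.40 * 18 = 7.2).
relative_threshold = 0.40 * max(lesion_suvs)
volume_relative = volume_above(lesion_suvs, relative_threshold, voxel_volume_cm3)

# Method 3: absolute threshold of SUV 2.5.
volume_absolute = volume_above(lesion_suvs, 2.5, voxel_volume_cm3)
```

In this toy lesion the relative threshold keeps only the three hottest voxels, whereas the absolute threshold keeps eight, which illustrates the underestimation described for lesions with a high maximal SUV.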
To the best of our knowledge, there is no published technical standard to confirm complete accuracy in measuring the metabolic volumes in all organs and settings. However, a normalized and standardized method to calculate the metabolic volume is necessary, because baseline metabolic tumor volume values were significantly influenced by the choice of the method used for determination of volume [33].
To dichotomize total MTV and TLG, most articles used a retrospective ROC analysis to determine the optimal cut-off values. Only one article used X-tile analysis to determine the value [21]. Some authors did not mention the dichotomizing method in their articles [14,29]. More reliable analytic methods are available. X-tile is a graphical method that illustrates the presence of substantial tumor sub-populations and shows the relationship between a biomarker and outcome by constructing a two-dimensional projection of every possible subpopulation [34]. The time-dependent ROC curve is another method, which allows for time-varying marker effects and accommodates censored failure time outcomes [35,36]. Further validation with these more sophisticated analytic methods may be needed.
Although the current study was relatively small and retrospective in design, the results underline that higher total MTV on the pre-treatment FDG PET/CT scan predicts poor PFS and OS in DLBCL patients. In addition to the IPI score, higher total MTV helped to identify high-risk patients. Early identification of high-risk patients allows clinicians to pay more attention to treatment strategies and follow-up [37]. A further prospective study with a larger patient population and more specific histological subtypes is warranted.

CONCLUSION
Our study indicated that total MTV on pre-treatment FDG PET/CT scans was an independent predictor for survival in patients with DLBCL receiving R-CHOP therapy. An elevated total MTV was associated with poorer PFS and OS.

Patient population
We performed a retrospective analysis of patients with DLBCL who were diagnosed between September 2009 and December 2014 and received treatment in Kaohsiung Medical University Hospital. Patient consent was waived because all the clinical data were retrospectively collected via medical chart review. However, informed consent before every examination including FDG PET/CT scan was required. The inclusion criteria for this study were as follows: (a) the diagnosis of DLBCL was pathologically proved,

FDG PET/CT acquisition
All the FDG PET/CT images were acquired using the Discovery ST 16 PET/CT scanner (GE Medical System, Waukesha, Wisconsin, USA). Every patient was asked to fast for at least 6 hours prior to the examination. The blood glucose level was measured before the tracer injection to ensure that it was no more than 150 mg/dl. After intravenous injection of 370-555 MBq (10-15 mCi) of 18F-FDG, patients were asked to lie comfortably to reduce muscular uptake. The mean uptake time was 55 ± 5 minutes. A spiral low-dose CT scan (140 kV, 80 mA, 3.75 mm section thickness) was acquired in the craniocaudal direction with the arms up, followed by the emission acquisition in the reverse direction. The emission scan time per bed position was 4 minutes. PET images were reconstructed iteratively (ordered subset expectation maximization) with CT data for attenuation correction. The Xeleris Functional Imaging Workstation (GE Medical System, Waukesha, Wisconsin, USA) was used for image display and interpretation.

FDG PET/CT analysis
The image interpretation and SUV measurement were performed by two nuclear medicine physicians, who were blinded to the patients' clinical outcomes. A positive lesion on PET/CT was defined as focal or diffuse FDG uptake above the background that was not compatible with normal physiological uptake [38]. Disagreements were resolved by discussion to reach a consensus interpretation. Using CT images from the FDG PET/CT, the maximal SUVt was collected by drawing a region of interest (ROI) over the most intense slice of the primary lesions. The MTV was defined as the volume of hyper-metabolic lesion with an SUV greater than a threshold of 2.5, as suggested in previous literature [32]. To measure MTV values, PET/CT data were transferred in DICOM format to an OsiriX workstation (OsiriX MD 8.0, Pixmeo SARL, Bernex, Switzerland). Using 3-dimensional segmentation, a 3-dimensional ROI and a contour including each previously recognized hyper-metabolic lesion were automatically produced. The voxels with SUV values of more than 2.5 within the contour margin were then incorporated in order to calculate the tumor volumes. The mean SUV of the delineated volume was also provided, using the in-house SUV-based automated contouring program. The total MTV of each patient was defined as the sum of the MTVs of all focal lesions selected. The TLG of each focal lesion was obtained by multiplying its MTV by the corresponding mean SUV. The whole-body TLG of each patient was determined as the sum of the TLGs of all focal lesions selected.
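The definitions above amount to simple voxel arithmetic: per lesion, MTV is the number of voxels with SUV above 2.5 times the voxel volume, TLG is MTV times the mean SUV of those voxels, and the patient totals are sums over lesions. The sketch below illustrates this under stated assumptions. It is not the OsiriX or in-house contouring program, and the function names, voxel volume and SUV lists are hypothetical.

```python
SUV_THRESHOLD = 2.5  # absolute threshold used in the study


def lesion_metrics(voxel_suvs, voxel_volume_cm3, threshold=SUV_THRESHOLD):
    """Return (mtv, mean_suv, tlg) for one lesion.

    voxel_suvs: SUV of each voxel inside the lesion contour.
    Only voxels with SUV strictly greater than the threshold are counted.
    """
    included = [suv for suv in voxel_suvs if suv > threshold]
    mtv = len(included) * voxel_volume_cm3
    mean_suv = sum(included) / len(included) if included else 0.0
    return mtv, mean_suv, mtv * mean_suv


def patient_totals(lesions, voxel_volume_cm3):
    """Sum MTV and TLG over all selected lesions of one patient."""
    total_mtv = 0.0
    total_tlg = 0.0
    for voxel_suvs in lesions:
        mtv, _, tlg = lesion_metrics(voxel_suvs, voxel_volume_cm3)
        total_mtv += mtv
        total_tlg += tlg
    return total_mtv, total_tlg


# Hypothetical patient with two lesions and an assumed voxel volume of 0.5 cm3.
toy_lesions = [[3.0, 5.0, 2.0], [10.0, 6.0]]
total_mtv, total_tlg = patient_totals(toy_lesions, voxel_volume_cm3=0.5)
```

Because TLG weights each lesion's volume by its mean uptake, two patients with the same total MTV can have different TLG values.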

Treatment and clinical course
PFS was defined as the time from diagnosis to disease relapse, progression or death. OS was defined as the time from diagnosis to death from any cause. All patients received 6 or 8 cycles of R-CHOP as initial therapy. Involved-field radiation therapy was administered to patients in whom it was clinically indicated, i.e. those with initial bulky disease (≥ 10 cm) or residual tumor after completion of chemotherapy. Complete remission (CR) was defined by follow-up imaging, either FDG PET/CT or CT scan, according to published criteria [38]. Patients with refractory or relapsed disease were treated with salvage chemotherapy or received autologous stem cell transplantation (ASCT) with high-dose chemotherapy, if clinically indicated. The observation period was from September 2009 to January 2016.

Statistical analysis
Continuous variables were presented as mean (SD) and categorical data were given as frequencies (percentages). A Kolmogorov-Smirnov test was used to determine whether each variable was normally distributed. Spearman's rank correlation test was used to analyze the correlation between metabolic parameters from FDG PET/CT and clinical prognostic factors. The Mann-Whitney test was used to compare metabolic parameters measured in patients with different clinical outcomes. The optimal cut-off values for total MTV and TLG were determined by ROC curve analysis. Survival curves were obtained by Kaplan-Meier analysis in the groups dichotomized by the optimal cut-off values of the metabolic parameters. The survival difference between groups was evaluated by the log-rank test. A Cox proportional hazards model with univariate and multivariate analysis was used to evaluate the impact of every clinical and metabolic parameter on patient survival. The HR and its 95% CI, calculated by the Cox proportional hazards model, were presented. All these analyses were performed using MedCalc Statistical Software version 17.4.4 (MedCalc Software bvba, Ostend, Belgium; http://www.medcalc.