Signature of survival: a 18F-FDG PET based whole-liver radiomic analysis predicts survival after 90Y-TARE for hepatocellular carcinoma

Purpose To generate a predictive whole-liver radiomics scoring system for progression-free survival (PFS) and overall survival (OS) in patients undergoing transarterial radioembolization using Yttrium-90 (90Y-TARE) for unresectable hepatocellular carcinoma (uHCC).
Methods Pretreatment 18F-FDG PET scans of 47 consecutive patients undergoing 90Y-TARE for uHCC (31 resin spheres, 16 glass spheres) were retrospectively analyzed. For each patient, based on the PET radiomics signature from whole-liver semi-automatic segmentation, PFS and OS predictive PET-radiomics scores (pPET-RadScores) were obtained using LASSO Cox regression. Using X-tile software, the optimal score to predict PFS (PFS-pPET-RadScore) and OS (OS-pPET-RadScore) served as the cutoff to separate high-risk and low-risk patients. Survival curves were estimated using the Kaplan-Meier method. The prognostic value of PFS and OS-pPET-RadScore, the Barcelona-Clinic Liver Cancer staging system and the serum alpha-fetoprotein level was analyzed to predict PFS and OS in multivariate analysis.
Results The generated pPET-RadScores were significantly correlated with survival for PFS (median of 11.4 mo [95% confidence interval (CI): 6.3–16.5 mo] in the low-risk group [PFS-pPET-RadScore < 0.09] vs. 4.0 mo [95% CI: 2.3–5.7 mo] in the high-risk group [PFS-pPET-RadScore > 0.09]; P = 0.0004) and OS (median of 20.3 mo [95% CI: 5.7–35 mo] in the low-risk group [OS-pPET-RadScore < 0.11] vs. 7.7 mo [95% CI: 6.0–9.5 mo] in the high-risk group [OS-pPET-RadScore > 0.11]; P = 0.007). The multivariate analysis confirmed PFS-pPET-RadScore (P = 0.006) and OS-pPET-RadScore (P = 0.001) as independent negative predictors.
Conclusion A pretreatment 18F-FDG PET whole-liver radiomics signature appears to be an independent negative predictor of PFS and OS in patients undergoing 90Y-TARE for uHCC.


INTRODUCTION
Hepatocellular carcinoma (HCC) is responsible for significant morbidity and mortality. It is the most common primary liver cancer and the second most common cause of cancer mortality worldwide [1]. The identification of accurate predictive factors to guide therapy has been the subject of numerous studies, and several robust predictors of death have been described, such as portal vein invasion (PVI), tumor size, serum alpha-fetoprotein (AFP) level, Child-Pugh class [2], the tumor-node-metastasis (TNM), Okuda [3] and Barcelona-Clinic Liver Cancer (BCLC) systems [4], and the Cancer of the Liver Italian Program (CLIP) score [5]. Imaging is essential to many of these factors and therefore plays an important role in management. Multiple studies have shown a correlation between the standardized uptake value (SUV) of HCC on 18F-fluorodeoxyglucose positron emission tomography (18F-FDG PET) and outcomes following different systemic and locoregional treatments [6][7][8][9][10][11][12][13][14][15][16], including, more recently, transarterial radioembolization with Yttrium-90 (90Y-TARE) [17][18][19].
Recently, radiomics has been introduced in the field of oncology [20]. Radiomics is a fast-evolving medical field consisting of the extraction of high-throughput quantitative imaging features that may quantify, in vivo and noninvasively, intra- and inter-tissue textural heterogeneity [21]. Indeed, radiomics allows virtual biopsies [20] that capture the inner organization of an entire volume together with the surrounding tissue, without being limited to a sampling site as conventional biopsies are. Additionally, virtual biopsies are noninvasive and instantaneous, can be repeated over time, and permit monitoring of the host-tumor relationship and of the treatment sequence. Radiomics does not have a consensual definition, but its aim is to characterize image phenotypes [22] using parameters extracted from medical images (often more than 200 features [21]) that can be used as biomarkers. These may include first-order statistics (intensity, histogram analysis), shape (such as sphericity), textural features (intensity or shape features are sometimes confounded with textural features) or wavelet decompositions. The emerging field of radiomics has sparked great interest over the past few years for different imaging modalities (computed tomography [CT], magnetic resonance imaging [MRI], PET) and many cancers, such as esophageal cancer, non-small cell lung cancer [21][22][23] or breast cancer [24].
In HCC, the value of radiomics has already been reported. Using an integrated imaging-genomic approach with semiquantitative CT features related to the poorly defined tumor margin, Kuo et al. were able to identify HCC imaging phenotypes at CT that correlate with a doxorubicin drug response gene expression program [25]. In another study, Segal et al. [26] demonstrated that combinations of 28 imaging phenotypes can reconstruct 78% of the global gene expression programs of primary human liver cancer.
Despite the growing evidence for radiomics, no predictive studies in HCC using this technique exist. The aim of the current study was to generate a predictive PET radiomics scoring system for progression-free survival (PFS-pPET-RadScore) and overall survival (OS-pPET-RadScore) in patients undergoing 90Y-TARE for unresectable HCC (uHCC) using a pretreatment 18F-FDG PET whole-liver radiomics signature. Compared with the previous studies listed above, we used intensity and texture analyses of the entire liver volume, providing an advanced signature of the metabolic heterogeneity and morphology for the subtle distinction of HCC and liver cirrhosis.

Patient and subgroup characteristics
The characteristics of the whole population and of the low-risk and high-risk groups are given in Table 1. Data on 90Y-TARE and associated treatments are given in Table 2. The mean interval between 18F-FDG PET/CT and 90Y-TARE was 18 days (range, 1-85 days). Patients did not receive any treatment between 18F-FDG PET/CT and 90Y-TARE. Using the BCLC staging system, 3 patients (6.5%) were stage A, 18 (38.5%) stage B and 26 (55%) stage C. Three patients had normal livers; all others (94%) had cirrhotic liver disease, including 36 patients with Child-Pugh A and 8 patients with Child-Pugh B (≤ B7) disease. Two patients had periportal lymphadenopathy. Among the 47 patients, 19 (40%) were treatment-naïve and 28 (60%) had already received various procedures before 90Y-TARE, including targeted therapy with Sorafenib or Everolimus, with a combination of 2 or more treatment modalities in 7 patients (15%). Comparing the low-risk and high-risk groups, tumor size was significantly larger in the high-risk group for OS-pPET-RadScore (P = 0.02), with a trend toward larger tumor size for PFS-pPET-RadScore (P = 0.05). The hepatic control rate at 6 months of lesions treated by 90Y-TARE was higher in the low-risk group than in the high-risk group for both PFS-pPET-RadScore (76 vs. 60%; P = 0.43) and OS-pPET-RadScore (79 vs. 64%; P = 0.29), although the difference was not statistically significant, probably because of the limited number of patients.

Construction of PFS and OS-pPET-RadScores
As shown in Figure 1, of a total of 108 radiomics features, 69 were highly correlated and were excluded from the analysis. On the 39 remaining features, a LASSO Cox regression analysis was performed to identify variables with non-zero coefficients. The contribution of the selected parameters to the radiomics signature is illustrated in Figure 2 by a histogram showing the magnitude of each regression coefficient used to generate the PFS and OS-pPET-RadScores. The PFS and OS-pPET-RadScores were calculated using the regression coefficients from the LASSO regression as follows:
PFS-pPET-RadScore = 0.1201883 * Strength
OS-pPET-RadScore = 0.13444452 * Variance + 0.12018832 * Strength - 0.01887273 * Low Intensity Run Short Emphasis - 0.01046038 * Contrast

Survival analysis
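The median duration of follow-up was 11.1 mo (range, 2.2-53.7 mo). Hepatic relapse occurred in 30 patients (64%) at a median of 6.9 mo (range, 0.7-31.1 mo) after 90Y-TARE, and 33 patients (70%) died from tumor progression. As shown in Figure 3, the generated pPET-RadScores were significantly correlated with survival for PFS (median of 11.4 mo [95% confidence interval (CI): 6.3–16.5 mo] in the low-risk group [PFS-pPET-RadScore < 0.09] vs. 4.0 mo [95% CI: 2.3–5.7 mo] in the high-risk group [PFS-pPET-RadScore > 0.09]; P = 0.0004) and OS (median of 20.3 mo [95% CI: 5.7–35 mo] in the low-risk group [OS-pPET-RadScore < 0.11] vs. 7.7 mo [95% CI: 6.0–9.5 mo] in the high-risk group [OS-pPET-RadScore > 0.11]; P = 0.007). The prognostic value of the generated PFS and OS-pPET-RadScores did not differ when stratified by BCLC staging system or tumor size (Table 3).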

DISCUSSION
The aim of the current study was to generate a predictive radiomics scoring system based on whole-liver (tumor and non-tumoral liver) segmentation of 18F-FDG PET in patients undergoing 90Y-TARE for uHCC. Whole-liver radiomics might be an interesting concept for integrating liver function and tumor biology. This integrative model may be able to separate patients into low-risk and high-risk groups and to predict survival, which is of interest since 90Y-TARE is costly and sometimes associated with side effects in this vulnerable patient population. By introducing the whole liver rather than isolated tumors into the radiomics model, we aimed to represent liver biology in one system. In our view, this approach might capture the fragile balance between HCC and liver cirrhosis. Two predictive radiomics scores were generated to predict survival. These scores successfully classified patients into low-risk and high-risk groups for both PFS and OS and remained statistically significant in the multivariate analysis, independently of the BCLC staging system, which includes variables related to tumor stage, liver functional status, performance status and cancer-related symptoms [4]. Our scores furthermore remained independent predictors with respect to tumor size and AFP level in our patient population.
Comparison between the low-risk and high-risk groups revealed a larger tumor size in the high-risk group for the PFS-pPET-RadScore and the OS-pPET-RadScore. Indeed, tumor size is a well-known factor associated with outcome. Interestingly, our approach replaces the tumor size and the functional parameters of the BCLC classification, such as performance status, PVI and Child-Pugh score, with a whole-liver radiomics approach that takes into account the lesion size as well as the metabolic activity of the non-tumoral but cirrhotic liver. The radiomics scores generated in this study remained significant in the multivariate analysis, supporting the independent value of our mathematical model. Furthermore, a whole-liver radiomics model is less prone to failure due to lesion interpretation by the radiologist or nuclear physician and can capture much more complex patterns than those reported by the BCLC scoring system. The required segmentation can be performed easily on the CT scan and then transferred to the 18F-FDG PET images. In the future, this segmentation process will become even faster, more reproducible and more user-friendly as fully automated liver segmentation is integrated into clinical routine [27]. Finally, the segmentation of whole-liver metabolism could have additional clinical significance if a predictive model of toxicity were identified in future studies.
The main textural features in our predictive radiomics scoring system were Strength and Variance. The presence of Strength in both the PFS and OS models confirmed the relevant predictive value of this parameter. Strength is a textural feature based on the neighborhood gray-tone difference matrix, which was first described in 1989 by Amadasun et al. [28], and reflects whether a pattern within the texture is perceivable and can be recognized. Variance is a textural feature derived from the texture feature coding method and describes the deviation from the mean of the texture feature numbers (a transformation of the image voxels that represents a certain type of local texture). Variance is one of the textural features that was initially used by Horng et al. to classify ultrasonic liver images into 3 liver states (normal liver, hepatitis and cirrhosis) with a correct classification rate of 86.7% and a false-negative rate of 4.4% [29]. We believe that this publication strengthens our integrative whole-liver approach using radiomics and emphasizes once more the importance of not only including tumor lesions to predict outcome but also having a tool to assess the non-tumoral liver.
Our current analysis also has shortcomings, the most important of which is the lack of an external cohort to verify our findings. This criticism is certainly justified; however, we see the current work as hypothesis-generating: the reading of imaging, especially in the fragile context of liver function versus tumor control, could be performed on a much more complex level than, for example, BCLC staging. A further shortcoming is that some patients received prior treatment such as Sorafenib™ (Bayer, Leverkusen, Germany), which might influence PFS. The preceding treatments are summarized in Table 2 and reflect a standard population receiving radioembolization, where this treatment tends to be used in later therapy lines. However, prospective studies have shown the feasibility and tolerability of anti-angiogenic treatment such as Sorafenib followed by radioembolization [30]. The presented analysis is, to our knowledge, the first whole-liver radiomics approach representing the fragile balance between liver function and tumor burden, which is the clinical reality in these patients. However, our results have to be verified in future prospective studies.

Patient characteristics
All pretreatment 18F-FDG PET images of patients undergoing 90Y-TARE for uHCC between December 2010 and December 2015 were retrospectively analyzed. The American Association for the Study of Liver Diseases (AASLD) guidelines [31] were used to diagnose HCC and the BCLC staging system was used to stage HCC [4]. Patients included in the study had unresectable HCC because of a locally advanced tumor, multifocal disease or PVI. Further inclusion criteria were liver-dominant or liver-only disease, adequate hematologic, renal and hepatic function, a good Eastern Cooperative Oncology Group performance status (ECOG PS) < 2, a life expectancy > 3 months and a Child-Pugh score ≤ B7. Exclusion criteria were an inadequate liver reserve (bilirubin > 34 µmol/L, ascites), a Child-Pugh score > B7, a poor ECOG PS ≥ 2, distant metastases, a lung shunt fraction > 20%, an estimated lung absorbed dose of > 30 Gray per session and 50 Gray in total, and an uncorrectable extrahepatic flow on the pretreatment 99m technetium-macroaggregated albumin single-photon emission computed tomography (99mTc-MAA SPECT/CT). All patients underwent imaging procedures and 90Y-TARE as standard care. The local Ethics Research Committee of the State of Vaud took into account the retrospective analysis of our database, approved the protocol (Number 2016-00640) and waived the need for patient informed consent for the study analysis.

18F-FDG PET
All patients underwent 18F-FDG PET/CT on a Discovery D690 TOF (GE HealthCare, Waukesha, WI) 50-70 minutes after a planned intravenous injection of 3.6 ± 0.4 MBq/kg of 18F-FDG. All patients fasted for at least 6 hours and blood glucose levels were less than 140 mg/dL before administration of 18F-FDG. A low-dose helical CT (120 kV, 80-200 mA) was first performed for anatomical correlation and attenuation correction. Then, whole-body emission images were acquired using 7 to 9 overlapping bed positions of 2 min each (starting from the top of the skull and ending at the mid-thigh). Images were reconstructed using iterative protocols with body weight-normalized SUV computation.

Radiomics features segmentation and extraction
All livers were semi-automatically segmented on CT using the Medical Imaging Interaction Toolkit (MITK) workbench software [32] to generate a three-dimensional mask that was then transferred to the 18F-FDG PET images. Three-dimensional texture analysis was applied to the pretreatment 18F-FDG PET study using the open-source Chang Gung Image Texture Analysis toolbox (CGITA) [33] implemented in Matlab 2015b (Mathworks Inc., Natick, MA). A total of 108 radiomics features were extracted from the three-dimensional segmented livers on the 18F-FDG PET images, from the following categories: SUV statistics, co-occurrence matrix, voxel alignment matrix, neighborhood intensity difference matrix, intensity size zone matrix, normalized co-occurrence matrix, voxel statistics, texture spectrum, texture feature coding co-occurrence matrix and neighborhood gray level dependence. The steps of the radiomics process are illustrated in Figure 1.

90Y-TARE procedure
The 90Y-TARE planning and procedure were performed as previously described [34]. Briefly, before 90Y-TARE, all patients underwent a pretherapy SPECT/CT with intraarterial administration of 120-180 MBq of 99mTc-MAA. The required 90Y administered activity was calculated from partition model dosimetry as reported by Gnesin et al. [35]. 90Y-resin (SIR-Spheres™; SIRTex Medical, Sydney, Australia) or 90Y-glass (TheraSphere™; BTG Biocompatibles Ltd, Farnham, UK) microspheres were injected by a nuclear physician through a percutaneous catheter inserted into the femoral artery and directed to the selected hepatic artery. Patients with small tumor volumes were preferentially assigned to 90Y-glass microspheres because of their higher specific 90Y activity and lower particle number, with the aim of avoiding lesion saturation and consecutive reflux to non-target volumes. A post-90Y-TARE SPECT/CT was performed to confirm the distribution of 90Y microspheres.

Study endpoints
Study endpoints were PFS and OS. PFS was defined as the time from the date of 90Y-TARE until the first occurrence of hepatic tumor progression (based on contrast-enhanced CT or MRI using the Response Evaluation Criteria in Solid Tumors), distant recurrence or death; patients without any of these events were censored at the last known consultation. OS was defined as the time from the date of 90Y-TARE until death from any cause; surviving patients were censored at the last known consultation.
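The endpoint definitions above can be sketched as a small Python helper that turns dates into a follow-up time and an event indicator. This is a minimal illustration, not the study's code: the function names, the example dates and the conversion of days to months (365.25 / 12 days per month) are assumptions.

```python
from datetime import date

# Assumption: one month is taken as 365.25 / 12 days.
DAYS_PER_MONTH = 30.4375


def months_between(start, end):
    """Elapsed time between two dates, in months."""
    return (end - start).days / DAYS_PER_MONTH


def pfs_endpoint(tare, last_consultation, hepatic_progression=None,
                 distant_recurrence=None, death=None):
    """Return (time in months, event observed) for PFS.

    The event is the earliest of hepatic progression, distant recurrence or
    death; without any event the patient is censored at the last consultation.
    """
    event_dates = [d for d in (hepatic_progression, distant_recurrence, death)
                   if d is not None]
    if event_dates:
        return months_between(tare, min(event_dates)), True
    return months_between(tare, last_consultation), False


def os_endpoint(tare, last_consultation, death=None):
    """Return (time in months, event observed) for OS (death from any cause)."""
    if death is not None:
        return months_between(tare, death), True
    return months_between(tare, last_consultation), False


# Hypothetical patient: progression six months after treatment, still alive.
tare_date = date(2012, 1, 1)
print(pfs_endpoint(tare_date, date(2013, 1, 1),
                   hepatic_progression=date(2012, 7, 1)))
print(os_endpoint(tare_date, date(2013, 1, 1)))
```

Returning the time together with a boolean event flag matches the input expected by standard survival estimators, where censored observations still contribute follow-up time.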

Statistical analysis
The statistical analysis was performed with R software (The R Project for Statistical Computing, www.r-project.org, version 3.3.2) [32]. The R packages used in the present study were "glmnet" [36], "Survival" [37], "ggplot2" [38], "caret" [39] and "matcor" [40]. All continuous variables were checked for normality and described with conventional statistics. All continuous numeric data were centered and scaled using the mean and standard deviation. According to the Harrell guideline, the number of events should exceed the number of included covariates by at least 10 times in a multivariate analysis [41]; an initial reduction of variables was therefore necessary. To address this issue, highly correlated variables (defined as a Spearman's correlation coefficient > 0.9) were removed. On the remaining variables, the least absolute shrinkage and selection operator (LASSO) Cox regression model [42,43], which is suitable for the regression of high-dimensional data, was used to select the most useful prognostic features in the data set. The selected imaging features were then combined into a radiomics signature. For each patient, PFS and OS predictive scores based on the 18F-FDG PET radiomics signature (pPET-RadScore) were computed as a linear combination of the selected features weighted by their respective coefficients. Using X-tile software version 3.6.1 (Yale University School of Medicine, New Haven, Conn) [44], the optimal pPET-RadScore value to predict PFS and OS served as the cutoff to separate high- and low-risk patients. Survival curves of the high-risk and low-risk groups were estimated using the Kaplan-Meier method and differences between subgroups were compared with the log-rank test. Using SPSS software (version 23, SPSS Inc., Chicago, IL, USA), the differences in demographic, clinical, pathological and treatment data between these two groups were compared using the χ2 test with Pearson's correction for discrete variables and the t test or Mann-Whitney test for continuous variables.
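To make the correlation-based reduction step concrete, the following Python sketch removes features whose Spearman correlation with an already retained feature exceeds 0.9. The analysis itself was carried out in R, so this is not the code used in the study; the greedy keep-first rule, the use of the absolute correlation, the function names and the toy data are all assumptions.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def drop_correlated(features, threshold=0.9):
    """Greedy filter (assumption): keep a feature only if its absolute
    Spearman correlation with every feature kept so far is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: 'b' increases monotonically with 'a' and is therefore dropped.
toy_features = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],
    "c": [5, 1, 4, 2, 3],
}
print(drop_correlated(toy_features))
```

Which member of a correlated pair is retained depends on the order of the features in this greedy scheme; other selection rules are possible and would change the retained set.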
The influence of PFS and OS-pPET-RadScores, BCLC staging system and serum AFP level on survival was investigated using a multivariate Cox proportional hazards model. Stratified analyses were performed to explore the potential association of the radiomics signature with PFS and OS in subgroups defined by clinical-pathologic risk factors within the whole data set. For all statistical analyses, P values < 0.05 were considered statistically significant.
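For readers less familiar with the Kaplan-Meier method used to estimate the survival curves, a minimal Python sketch of the product-limit estimator and of the median survival time follows. It is an illustration under stated assumptions rather than the study's R code: the function names and the toy follow-up data are hypothetical, and no confidence intervals or log-rank test are computed.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up times (e.g. in months)
    events : 1/True if the event was observed, 0/False if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events at time t
        leaving = 0   # events plus censored observations at time t
        while i < len(data) and data[i][0] == t:
            observed += int(data[i][1])
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """Earliest time at which the estimated survival falls to 0.5 or below;
    None if the curve never reaches 0.5."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data for five patients (months, event flag).
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
print(median_survival(toy_curve))
```

Censored patients leave the risk set without lowering the curve, which is why they still influence later steps through the shrinking denominator.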