MRI texture analysis in predicting treatment response to neoadjuvant chemoradiotherapy in rectal cancer

This study evaluated the value of MRI texture analysis for predicting and assessing early treatment response, before and early during neoadjuvant chemoradiotherapy (nCRT), in patients with locally advanced rectal cancer (LARC). This retrospective study comprised 59 patients. Tumoral texture parameters were compared between pre- and early nCRT. Areas under the receiver operating characteristic (ROC) curve (AUCs) were used to compare the diagnostic performance of the parameters that differed significantly and of the predicted probabilities from logistic regression analysis for discriminating responders from nonresponders. Standard deviation (SD), kurtosis, and uniformity differed significantly between pre- and early nCRT (p = 0.0012, 0.0001, and < 0.0001, respectively). In the pathological complete response (pCR) group, pre-uniformity and pre-energy were significantly higher in responders than in nonresponders (p = 0.03 and p < 0.01, respectively), whereas pre-entropy was significantly lower (p = 0.01). Pre-kurtosis in the tumor regression grade (TRG) group and pre-energy in the pCR group showed the highest diagnostic performance (AUC = 0.67 and 0.73, respectively). The predicted probabilities from logistic regression analysis (AUC = 0.76) did not significantly improve the discrimination of responders from nonresponders compared with pre-uniformity, pre-energy, and pre-entropy alone in the pCR group (p = 0.2794, 0.4222, and 0.3512, respectively). Texture parameters have potential as imaging biomarkers for the prediction and early assessment of tumoral response to nCRT in patients with LARC.


INTRODUCTION
Neoadjuvant chemoradiotherapy (nCRT) followed by total mesorectal excision (TME) is the recommended standard therapy for patients with locally advanced rectal cancer (LARC) [1][2][3]. This treatment strategy has improved locoregional control and rates of sphincter preservation [1,2] and leads to a pathologic complete response (pCR), defined as the absence of viable tumor cells after full pathologic examination of the resected specimen (ypT0N0M0), in a significant proportion of patients [4,5]. For patients with pCR after nCRT, some investigations have indicated that surgery can be omitted and that a non-operative strategy with strict follow-up (watch-and-wait strategy) may be safe and associated with good survival rates [4][5][6][7]. Accurate assessment of response to nCRT before and early during treatment can improve clinical management by enabling treatment plans to be personalized according to the predicted outcome.

Research Paper
Oncotarget www.impactjournals.com/oncotarget

Magnetic resonance imaging (MRI) has been the most extensively studied modality for evaluating response to nCRT in patients with LARC. Different MRI biomarkers, including tumor volume, apparent diffusion coefficient (ADC) values, perfusion parameters of dynamic contrast-enhanced MRI (DCE-MRI), and parameters derived from intravoxel incoherent motion diffusion-weighted imaging (IVIM-DWI), have been investigated [8][9][10][11]. However, these imaging markers have limitations in predicting treatment response. Tumor volume measurement is not practically feasible because it is time-consuming. DWI is a functional imaging technique that analyzes differences in the random Brownian motion of water protons in the intracellular and extracellular spaces to discriminate between tissues of varying cellularity. By measuring ADC values, DWI has been shown to be more valuable than morphologic MRI for monitoring tumor response before and after treatment, but there is no consensus on its diagnostic accuracy in rectal cancer, with reported performance ranging widely from 0.51 to 0.85 [8,12,13]. Although studies have shown that DCE-MRI and IVIM-DWI are useful for assessing the treatment response of rectal cancer, these studies are still at a very preliminary stage [11,14].
Interest in radiomics has increased because of the limitations of existing imaging modalities and the concept that radiological images hold more information than is currently being utilized. Radiomics is defined as the high-throughput extraction of quantitative imaging features or texture parameters from images to decode tissue pathology, creating a high-dimensional data set for feature extraction [15]. Recently, the assessment of tumor heterogeneity in relation to treatment response by extracting textural features has emerged as a potential imaging biomarker [16][17][18]. Texture analysis (TA) is a noninvasive method of assessing intratumoral heterogeneity. To date, very little research has assessed whether TA of MRI in rectal cancer can be used as an imaging biomarker for early response to nCRT [19,20]. The first study demonstrated that pre-treatment kurtosis was the best predictor for distinguishing pCR from partial response (PR) and nonresponse (NR), with a diagnostic performance of 0.86. However, both studies included some patients with stage T1-2N+M0 disease, and no validation was performed, particularly in T3-T4 rectal cancer.
The aim of this study was to investigate whether TA of rectal cancer based on T2-weighted MRI can predict and provide an early assessment of tumoral response in patients with LARC treated with nCRT.

RESULTS

Patient population
The study cohort consisted of 59 consecutive patients (39 males and 20 females; mean age, 54 years; age range, 46-62 years). According to the reference standards, TRG 1-2, pCR, and T-downstaging were found in 30 (50.8%), 15 (25.4%), and 28 (47.5%) patients, respectively. Baseline characteristics of the patient population and the pathologic findings of the surgical specimen are summarized in Table 1.

Interobserver agreement
Interobserver agreement for the histogram and first-order texture metrics was moderate to excellent, with intraclass correlation coefficients ranging from 0.60 to 0.99. Full results are listed in Table 2.

Differences in TA between pre- and early nCRT
SD and energy tended to be lower at pre-nCRT than at early nCRT, whereas mean value, skewness, kurtosis, uniformity, and entropy showed the opposite trend. Only SD, kurtosis, and uniformity differed significantly between pre- and early nCRT (p = 0.0012, 0.0001, and < 0.0001, respectively) (Figure 1). There was no significant difference in mean value, skewness, energy, or entropy between pre- and early nCRT.
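The paired pre- versus early-nCRT comparison relies on the Wilcoxon signed-rank test named in the Statistical analysis section. The following is a minimal pure-Python sketch of that test, not the authors' implementation: the function names and the sample values are hypothetical, zero differences are dropped (one common convention, assumed here), and the exact two-sided p-value is obtained by enumerating all sign assignments, which is only practical for small samples.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(pre, early):
    """Return (statistic, exact two-sided p) for paired samples."""
    diffs = [e - p for p, e in zip(pre, early) if e != p]  # drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    statistic = min(w_plus, w_minus)
    total = sum(ranks)
    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total - wp) <= statistic:
            extreme += 1
    return statistic, extreme / 2 ** len(ranks)


# Hypothetical paired values for five patients (not study data).
pre_values = [10, 12, 9, 14, 11]
early_values = [11, 14, 12, 18, 16]
statistic, p_value = wilcoxon_signed_rank(pre_values, early_values)
```

For a cohort of 59 patients the enumeration above is not feasible; a library routine such as `scipy.stats.wilcoxon`, which switches to an approximation for larger samples, would normally be used instead.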

Responder and nonresponder parameters
Pre-kurtosis was significantly higher in responders than in nonresponders in the TRG group (3.57 vs 3.24, p = 0.02). In the pCR group, there were significant differences between responders and nonresponders in pre-uniformity (0.82 vs 0.79, p = 0.03), pre-energy (0.95 vs 0.50, p < 0.01), and pre-entropy (0.22 vs 1.39, p = 0.01). Full results are listed in Tables 3-4.

Diagnostic performance
The AUC for discriminating responders from nonresponders was 0.67 for pre-kurtosis in the TRG group. This allowed prediction of response with a sensitivity of 55.17% and a specificity of 73.33% at an optimal cutoff value of ≤ 3.29. For pre-uniformity, pre-energy, and pre-entropy in the pCR group, ROC curve analysis showed AUCs of 0.69, 0.73, and 0.72 at optimal cutoff values of ≤ 0.79, ≤ 0.93, and > 0.22, respectively. This allowed prediction of response with sensitivities of 54.55%, 84.09%, and 86.36% and specificities of 93.33%, 53.33%, and 53.33%, respectively. Logistic regression analysis of the combined parameters (pre-uniformity, pre-energy, and pre-entropy) achieved an AUC of 0.76 (cutoff value > 0.64; sensitivity, 79.55%; specificity, 66.67%) in the pCR group. This was not a significant improvement over pre-uniformity, pre-energy, or pre-entropy alone in the pCR group (p = 0.2794, 0.4222, and 0.3512, respectively). Full results are listed in Table 5 and Figure 2.
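To make the quantities in this paragraph concrete, the sketch below computes an AUC and the sensitivity and specificity at a fixed cutoff. It is an illustration under stated assumptions rather than the authors' analysis: the function names and the sample values are hypothetical, and treating nonresponders as the "positive" class is an assumption.

```python
def auc_from_groups(positives, negatives):
    """AUC as the probability that a random positive case scores above a
    random negative case (ties count one half). This equals the
    Mann-Whitney U statistic divided by n_pos * n_neg."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(positives, negatives, cutoff, positive_if_low=False):
    """Call a case positive when its value is > cutoff
    (or <= cutoff when positive_if_low is True)."""
    if positive_if_low:
        true_pos = sum(v <= cutoff for v in positives)
        true_neg = sum(v > cutoff for v in negatives)
    else:
        true_pos = sum(v > cutoff for v in positives)
        true_neg = sum(v <= cutoff for v in negatives)
    return true_pos / len(positives), true_neg / len(negatives)


# Hypothetical pre-entropy values, for illustration only (not study data).
nonresponders = [1.4, 0.9, 0.3, 1.1, 0.2]  # assumed "positive" class
responders = [0.1, 0.2, 0.6]

auc = auc_from_groups(nonresponders, responders)
sens, spec = sensitivity_specificity(nonresponders, responders, cutoff=0.22)
```

Read this way, the reported percentages are arithmetically consistent with 44 patients without pCR and 15 with pCR: for example, 37/44 = 84.09% and 8/15 = 53.33%. That correspondence is an inference from the numbers, not something the text states.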

DISCUSSION
In this study, we demonstrated that TA parameters extracted from conventional T2WI can be used reliably for the prediction and early assessment of the treatment response of LARC to nCRT according to two different pathological reference standards.
Our results showed that most of the texture parameters (mean value, skewness, kurtosis, uniformity, and energy) decreased at the third week of CRT, with the exception of SD and entropy. In particular, mean pre-kurtosis in the TRG group and pre-uniformity and pre-energy in the pCR group were significantly higher in responders than in nonresponders, whereas pre-entropy in the pCR group was significantly lower in responders than in nonresponders. The AUCs of pre-kurtosis in the TRG group and of the predicted probabilities derived by logistic regression analysis in the pCR group were 0.67 and 0.76, respectively.
Kurtosis reflects the peakedness and tailedness of the histogram; it is inversely related to the number of features highlighted [21]. De Cecco et al., studying the correlation among DWI, DCE-MRI, and TA parameters, reported a significant negative correlation between kurtosis and ADC [20]. This correlation may explain the trend toward higher kurtosis at pre-CRT than at early CRT.
A preliminary study recently demonstrated the importance of MRI TA parameters in predicting the treatment response of rectal cancer to nCRT [19]. Based on the Dworak tumor regression grade, the study demonstrated that kurtosis was the best predictor of tumor response. Pre-kurtosis with medium filtration was significantly lower in patients with pCR than in those with partial response (PR) + nonresponse (NR). Pretreatment AUC for kurtosis using the best medium texture to discriminate  [22,23].
In our study, logistic regression analysis was used to derive predicted probabilities for analysis. The AUC of predicted probabilities derived by logistic regression analysis before nCRT in the pCR group was more favorable than   [8,24]. TA parameters extracted from MRI hold more information and generate more meaningful data than existing imaging modalities for predicting response before nCRT.
There were some limitations to our study. First, this retrospective analysis was not validated by other centers; a prospective investigation with a larger patient database is necessary to ascertain the diagnostic performance of TA parameters for response to nCRT. Second, tumor delineation was performed using a single-slice method. Multislice delineation of the tumor area is representative of the whole tumor, but it is not clinically feasible because it is time-consuming. Moreover, according to Ng et al. [25], a large cross-sectional area of the tumor is sufficiently representative and provides results comparable to whole-tumor analysis. Third, no comparison with other imaging markers was made in this study. Finding the optimal imaging marker is essential to ensure sufficient predictive accuracy. A comprehensive imaging model would be needed to assess the combined efficacy of these markers in predicting treatment response. Lastly, we did not investigate the correlation of TA parameters with the corresponding histopathology.
In conclusion, our preliminary study indicates that TA based on T2WI holds promise for the prediction and early assessment of response and nonresponse to nCRT in patients with LARC.

MATERIALS AND METHODS

Study population
This retrospective study was approved by the Medical Ethics Committee of National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College. Informed consent was obtained from all participants. This study was carried out in accordance with the Declaration of Helsinki. The methods were carried out in accordance with the approved guidelines.
From October 2010 to December 2013, all patients with histologically proven locally advanced rectal adenocarcinoma (≥ T3 or lymph node positive) originating within 15 cm of the anal verge who were treated with nCRT before TME at our institution were enrolled in this study. MRI of the pelvis and computed tomography of the chest, abdomen, and pelvis were performed for pre-nCRT tumor staging. The exclusion criteria were a history of other malignant tumors, previous pelvic radiotherapy, contraindication to MRI examination, and image quality insufficient for analysis.

Study protocol
All patients underwent three MR examinations. The first MR examination (pre-nCRT MRI) was performed for tumor staging before treatment, and the second one (early nCRT MRI, 10-15 fractions after initiation) was used to assess early treatment response at the third week of CRT. The third MRI examination was performed 4-6 weeks after nCRT to monitor the response to treatment. Between 6 and 8 weeks after nCRT, TME was performed and the gross specimen was analyzed by a dedicated gastrointestinal (GI) pathologist. As our study aimed at the prediction and early assessment of tumoral treatment response to nCRT, we focused the analyses on the pre- and early nCRT MR examinations.

MRI data acquisitions
All MR imaging was performed on a 3T scanner (Signa HDx, General Electric, Milwaukee, WI, USA) with a phased-array body coil. The routine clinical imaging protocol included a small field of view (FOV) (16 cm × 16 cm), high-resolution, two-dimensional T2-weighted spin-echo (SE) sequence (repetition time msec/echo time msec, 5160/151; flip angle, 90°; echo train length, 19; slice thickness, 3 mm; matrix, 512 × 512) acquired in three orientations: sagittal, oblique coronal (parallel to the long axis of the rectum), and oblique axial (perpendicular to the long axis of the tumor). Next, an axial SE echo-planar DWI sequence with background body signal suppression was acquired at b values of 0 and 800 sec/mm² (repetition time msec/echo time msec, 4925/68; NEX, 4; slice thickness, 4 mm; matrix, 128 × 128). Subsequently, axial three-dimensional LAVA DCE-MRI images were acquired. Only the oblique axial T2-weighted sequence was used for analysis. Patients underwent bowel preparation with antispasmodic medication before the MRI examinations. All sequences were obtained during free breathing. The mean intervals between the initiation of nCRT and the first and second MR examinations were 13 ± 7 days (range, 1-29 days) and 15 ± 2 days (range, 13-21 days), respectively.

Imaging segmentation and textural features calculation
The Omni-Kinetics software (v. 2.06, GE Healthcare) was used to obtain first-order and histogram texture metrics, including mean value, SD, skewness, kurtosis, uniformity, energy, and entropy. One GI radiologist (15 years of experience in interpreting rectal MR images) reviewed the images of all patients on a local picture archiving and communication system (PACS; v. 3.1.S08.1, 2006 Carestream Corporation) and selected the section showing the largest tumor area on the oblique axial T2WI images for analysis. Pre- and early nCRT MRI images were analyzed in random order by two GI radiologists (10 and 2 years of experience in interpreting rectal MR images, respectively) who were blinded to each other's results and to the clinical and histopathological data related to tumoral treatment response. Regions of interest (ROIs) were drawn manually on the selected section of the largest tumor area. The entire area of tumor was included within the ROI, including any viable tumor. The corresponding oblique axial T2WI, DWI, and DCE-MRI images were at the readers' disposal as a reference. The pre- and early nCRT texture parameter values were then calculated automatically. To reduce MRI noise and improve the reliability of the parameters, voxel intensities were resampled into equally spaced bins. This discretization step not only reduces image noise but also normalizes intensities across all patients, allowing direct comparison of all calculated textural features between patients.
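The discretization step described above can be illustrated with a minimal Python sketch. The text does not state the number of bins or the exact binning rule used by the software, so the fixed-bin-count scheme, the function name `discretize`, and the sample intensities below are all assumptions made for illustration.

```python
def discretize(intensities, n_bins):
    """Map each intensity to one of n_bins equally spaced bins (0 .. n_bins-1)
    spanning the ROI's own minimum-to-maximum intensity range."""
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        # A constant ROI has no spread; place every voxel in the first bin.
        return [0] * len(intensities)
    width = (hi - lo) / n_bins
    # The maximum value would fall on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in intensities]


# Hypothetical ROI intensities in arbitrary scanner units (not study data).
roi = [112, 130, 95, 160, 141, 95]
levels = discretize(roi, n_bins=4)
```

Because each ROI is rescaled to its own intensity range in this variant, the resulting levels are comparable between patients even when the raw signal scale differs between examinations. A fixed bin width would be the alternative design choice.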
In this study, we explored a feature-based approach to extract and quantify meaningful and reliable information from MR images. This section describes in detail the imaging traits assessed in our study that were used to derive textural features. First-order and histogram statistics describe the distribution of voxel intensities within the MR image through commonly used, basic metrics. Let $X$ denote the three-dimensional image matrix with $N$ voxels and $P$ the first-order histogram divided into $N_l$ discrete intensity levels. The following first-order and histogram statistics were extracted:

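Standard deviation:
$$\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(X(i)-\bar{X}\right)^{2}}$$
where $X(i)$ is the intensity of voxel $i$, $N$ is the number of voxels, and $\bar{X}$ is the mean of $X$.

Skewness:
$$\mathrm{skewness} = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(X(i)-\bar{X}\right)^{3}}{\left(\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X(i)-\bar{X}\right)^{2}}\right)^{3}}$$
where $\bar{X}$ is the mean of $X$.

Kurtosis:
$$\mathrm{kurtosis} = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(X(i)-\bar{X}\right)^{4}}{\left(\frac{1}{N}\sum_{i=1}^{N}\left(X(i)-\bar{X}\right)^{2}\right)^{2}}$$
where $\bar{X}$ is the mean of $X$. The mean value of the absolute deviations of all voxel intensities around the mean intensity value. The standard deviation is a measure of histogram dispersion, that is, of how much the gray levels differ from the mean. Skewness and kurtosis are the most frequently used central moments. Skewness measures the degree of histogram asymmetry around the mean, and kurtosis is a measure of histogram sharpness. As measures of histogram randomness, we computed the uniformity and entropy of the image histogram.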
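The statistics named in this section can be sketched in a few lines of Python. This is an illustration using conventional definitions, not the software's documented formulas. The function names are hypothetical. The n - 1 denominator for SD, the non-excess form of kurtosis, and the histogram definitions of uniformity (sum of squared bin probabilities) and entropy (Shannon entropy in bits) are assumptions.

```python
import math
from collections import Counter


def moment_stats(x):
    """Mean, SD, skewness, and non-excess kurtosis of voxel intensities x.
    Requires at least two voxels that are not all equal."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    return {
        "mean": mean,
        "sd": math.sqrt(m2 * n / (n - 1)),  # sample SD (assumed n - 1 denominator)
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # equals 3 for a normal distribution
    }


def histogram_stats(levels):
    """Uniformity and entropy of the histogram of discretized intensity levels."""
    counts = Counter(levels)
    n = len(levels)
    probs = [c / n for c in counts.values()]
    return {
        "uniformity": sum(p * p for p in probs),
        "entropy": -sum(p * math.log2(p) for p in probs),
    }


# Hypothetical inputs (not study data).
stats = moment_stats([1, 2, 3, 4, 5])
hist = histogram_stats([0, 0, 1, 1])
```

Under these definitions a perfectly homogeneous region has uniformity 1 and entropy 0, and the two measures move in opposite directions as the histogram spreads over more levels. The kurtosis values above 3 reported in the Results are consistent with the non-excess form, although the text does not say which form was used.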

Neoadjuvant chemoradiotherapy
All patients were treated with a long course of radiation therapy (RT) at a dose of 50 Gy (in 25 daily fractions of 2 Gy given in 5 weeks) to the whole pelvis.
Chemotherapy consisted of an oxaliplatin infusion of 50 mg/m² on the first day of each week of RT and the oral 5-FU derivative capecitabine, 1650 mg/m² bid, from the first day to the end of nCRT. Dose reduction of oxaliplatin and capecitabine was not planned.

Surgical approach
All patients underwent standard TME surgery performed by experienced surgeons specialized in colorectal oncology [26]. The surgical approach was chosen by the surgeon based on tumor location and the results of post-nCRT restaging MRI.

Reference standards of treatment response
The resected specimens were processed and evaluated by a single pathologist (15 years of experience in interpreting rectal cancer pathology) who was not aware of the clinical and MRI findings. Two different pathological reference standards were used to assess tumor treatment response. Resected specimens were examined according to the Union for International Cancer Control (UICC)/American Joint Committee on Cancer (AJCC) TNM system. The tumor regression grade (TRG) was assessed according to Mandard et al. [23]. TRG 1 (complete regression) indicates the absence of histologically identifiable residual cancer, with fibrosis extending through the wall, with or without granuloma. TRG 2 is characterized by the presence of rare residual cancer cells scattered throughout the fibrosis. TRG 3 corresponds to an increase in the number of residual cancer cells, but fibrosis still predominates. TRG 4 indicates residual cancer outgrowing fibrosis. TRG 5 is the absence of regressive changes. Patients with TRG 1 or 2 were considered responders, whereas the remaining patients (TRG 3, 4, or 5) were classified as nonresponders. Pathological complete response (pCR) was defined as the absence of any residual tumor cells in the surgical specimen (ypT0N0). Patients with ypT0N0 were assigned to the responder group, whereas patients without ypT0N0 were assigned to the nonresponder group.
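The two reference standards amount to two simple labeling rules, sketched below in Python. The function names and the integer encoding of TRG and ypT/ypN stages are hypothetical choices for illustration; only the grouping rules themselves come from the text.

```python
def trg_responder(trg):
    """Mandard TRG 1-2 -> responder (True); TRG 3-5 -> nonresponder (False)."""
    if trg not in (1, 2, 3, 4, 5):
        raise ValueError(f"TRG must be an integer from 1 to 5, got {trg!r}")
    return trg <= 2


def pcr_responder(yp_t, yp_n):
    """pCR (ypT0N0) -> responder (True); anything else -> nonresponder (False)."""
    return yp_t == 0 and yp_n == 0


# Hypothetical patients encoded as (TRG, ypT, ypN); not study data.
patients = [(1, 0, 0), (2, 1, 0), (3, 3, 1), (1, 0, 1)]
labels = [(trg_responder(t), pcr_responder(yt, yn)) for t, yt, yn in patients]
```

The second hypothetical patient shows how the standards can disagree: TRG 2 counts as a responder under the TRG rule but not under the pCR rule, which is consistent with the TRG responder group (30 patients) being larger than the pCR group (15 patients).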

Statistical analysis
Interobserver agreement was characterized by using the intraclass correlation coefficient (ICC) for continuous variables (0-0.20, poor agreement; 0.21-0.40, fair agreement; 0.41-0.60, moderate agreement; 0.61-0.80, good agreement; and 0.81-1.00, excellent agreement). First-order texture parameters (mean value, SD) and histogram texture parameters (skewness, kurtosis, uniformity, energy, and entropy) were compared between pre- and early nCRT in terms of averages using the Wilcoxon signed-rank test. Responder and nonresponder groups were compared using the Mann-Whitney test. The backward method was used in the logistic regression analysis. The TA parameters