Inter- and intra-observer reproducibility of ADC measurements in esophageal carcinoma primary tumors

The apparent diffusion coefficient (ADC) may correlate with the treatment response to chemotherapy/radiotherapy in solid tumors. Our aim was to determine the inter- and intra-observer reproducibility of ADC measurements in primary esophageal squamous cell carcinoma (ESCC). In 31 patients diagnosed with ESCC, two observers blindly measured ADCs before treatment (pre-ADC) and after the 5th fraction of radiotherapy (intra-ADC), each twice with a 2-week interval. For observer A, the mean pre-ADC of primary tumors was 1.25±0.22 and 1.27±0.23 (in 10⁻³ mm²/s) for measurements 1 and 2, respectively, and the intra-observer bias was -0.02 with limits of agreement of -0.13 to 0.09. For observer B, the mean pre-ADC was 1.25±0.23 and 1.27±0.23 (in 10⁻³ mm²/s) for measurements 1 and 2, respectively, and the intra-observer bias was -0.02 with limits of agreement of -0.17 to 0.16. The mean pre-ADC of primary tumors was 1.26±0.24 (in 10⁻³ mm²/s) from observers A and B, and the inter-observer bias was 0.01 with limits of agreement of -0.09 to 0.09, revealing a low inter-observer variance. Similar measurements of the intra-SD parameters showed that the pre- and intra-ADC of primary tumors differed significantly. Thus, ADC measurements may have sufficient inter-observer and intra-observer reproducibility to measure primary tumor responses to treatment, and the ADCs before and during treatment differed.


INTRODUCTION
Treatment response of esophageal carcinoma (EC) is affected by many factors, including the oxygenation status of cancer cells, gene mutation, and the distribution of microvascular vessels [1][2][3]. There is also variability in radiotherapy dosage among different treatment centers. For example, the RTOG 94-05 phase III trial demonstrated that survival and local/regional control in the higher radiation dose group (64.8 Gy) were not improved compared with those in the lower radiation dose group (50.4 Gy) [4]. Moreover, in the CROSS phase III trial, disease-free survival (DFS) and overall survival were improved in patients who underwent preoperative chemoradiotherapy (CRT) with a radiation dose of 41.4 Gy compared to patients who underwent surgery alone [5]. If the sensitivity to chemotherapy/radiotherapy is monitored early, the effectiveness of these treatment regimens can be better predicted.
Diffusion-weighted imaging (DWI) is a functional approach that detects water molecule diffusion in the body, and the apparent diffusion coefficient (ADC) has been utilized clinically to evaluate the treatment response to CRT in many cancers [6]. Clinically valid use of DWI requires that measurement variation in a given patient be less than that observed between different observers or measurements. The inter- and intra-observer reproducibility of ADC measurement is rarely reported for primary tumors of EC patients, and contouring of the measurement region of interest (ROI) is not standardized. Because the protocols, b-values, and calculations used to obtain ADC values vary among institutions, the reported ADC values and cut-off values cannot be compared and cannot be utilized clinically [7,8]. Therefore, the broad application of DWI in the prediction of treatment response is dependent on the accuracy and reproducibility of the measurements.
Oncotarget, 2017, Vol. 8, (No. 54), pp: 92880-92889, Research Paper, www.impactjournals.com/oncotarget
We estimated the reproducibility of two measurements of ADC (at baseline and at the 5th fraction of RT) via a designated method and explored the change in ADC during the early stages of treatment.

General clinical data of 31 patients
A total of 31 patients (20 men, 11 women; mean age 64.5±8.7 years) were diagnosed with esophageal squamous cell carcinoma (ESCC). By T stage, 0 patients were T1, 3 were T2, 24 were T3, and 4 were T4. Three cases scored a performance status (PS) of 0, 16 cases scored a PS of 1, and 12 cases scored a PS of 2. Primary tumors were located in the neck in 3 cases, the upper thoracic esophagus in 4 cases, the middle thoracic esophagus in 17 cases, and the lower thoracic esophagus in 7 cases. The mean RT dose was 5800.65±647.94 cGy.

Measurement and reproducibility of ADC and SD during treatment
At the 5th RT fraction, the mean intra-ADC of primary tumors was 1.57±0.32 and 1.59±0.30 (in 10⁻³ mm²/s) for measurements 1 and 2, respectively, from observer A, and 1.60±0.34 and 1.58±0.33 from observer B (Table 4). The Bland-Altman plots showing the bias and limits of agreement for the intra-observer measurements from observers A and B are displayed in Figures 1C and 2C, respectively. The inter-observer measurements of intra-ADC showed a bias of -0.03 with limits of agreement of -0.15 to 0.10 (Figure 3B). The mean intra-SD of primary tumors from observers A and B is summarized in Table 5 for measurements 1 and 2, and the intra-observer bias and limits of agreement are displayed in Figures 1D and 2D. The inter-observer bias and limits of agreement for the intra-ADC and intra-SD are shown in Figures 3B and 3D, respectively.

Differential analysis of ADC and SD measurements
The intra-ADC was significantly higher than the pre-ADC (P<0.05, Figure 4A). Although the intra-SD was also higher than the pre-SD, the difference was not significant (P>0.05, Figure 4B).
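
The comparison above relies on an analysis of variance between two groups of measurements. As a rough illustration only, the sketch below computes the one-way ANOVA F statistic in plain Python. The function name `one_way_anova_f` and the input values are hypothetical and are not the study data or the SPSS procedure; converting F to a P value would additionally require the F distribution with the returned degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    Hypothetical helper for illustration; not the SPSS routine used in the study.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and more values than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up ADC values (in 10^-3 mm^2/s), for illustration only.
pre_adc = [1.20, 1.31, 1.18, 1.40, 1.25]
intra_adc = [1.52, 1.60, 1.47, 1.71, 1.58]
f_stat, df_b, df_w = one_way_anova_f(pre_adc, intra_adc)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With exactly two groups, this F statistic equals the square of the independent two-sample t statistic, so either test leads to the same conclusion.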

DISCUSSION
Functional imaging such as DWI is increasingly prominent in the treatment response evaluation of esophageal carcinoma due to the recent widespread application of MR for esophagus examination. However, a major challenge to the interpretation of functional metabolic imaging-generated parameters, including the ADC value of DWI, is the inherent physiologic heterogeneity within a tumor. To the best of our knowledge, there is no standard protocol for performing ADC measurements of esophageal carcinoma.
Notably, few published studies have investigated the clinical value of DWI in evaluating esophageal carcinoma. Several studies have used "whole tumor" ROI data to differentiate malignant and benign nodes in esophageal carcinoma [10,11] and to predict RT response [12,13]. Some groups have advocated assessment of only the most enhanced voxels within a tumor, based on the result that the most enhanced ROIs provided more statistically significant differences between responders and non-responders to CRT than the whole tumor ROI [14]. Many studies neglect to illustrate the delineation of the ROI and do not report the intra- and inter-observer reproducibility of the ADC measurement [15,16].
Our study specifically addressed ROI selection strategies to estimate intra- and inter-observer reproducibility. The method of ROI contouring in our study relied on the following strategies: (1) placing the ROI in the slice containing the most enhanced voxels on contrast-enhanced T1WI and excluding necrotic areas to avoid intra-tumoral variation [9,17,18], an approach widely recommended for the measurement of ADC; and (2) selecting three continuous sections, including the largest slice, to determine the average ADC of the tumor (Figure 5). Our method was derived from previous studies [19,20] in which the delineation was based on the largest slice, but the ROI in our study was not confined to the largest slice, in order to assure low variance during ROI delineation. Our data suggest that this is an appropriate strategy to assure intra-observer and inter-observer reproducibility. Furthermore, the resulting bias and limits of agreement were acceptable, and low variance in ADC measurements was indicated by the parameter SD. Our results were consistent with Kwee et al. [21], who determined that semi-automated volumetric ADC measurements were more reproducible than manual ADC measurements. However, Kwee et al. [22] revealed that despite good inter- and intra-observer reproducibilities, the ADC value was not always sufficiently reproducible to discriminate malignant from non-malignant lymph nodes.

Figure 3: Bland-Altman plots of the difference in ADC or SD measurements (y-axis) vs. the mean ADC measurement (x-axis), with the mean difference (bias) (continuous line) and the 95% limits of agreement (dashed lines except the zero line). Inter-observer reproducibility was acceptable, with most points distributed between the limits of agreement. (A) The measurement of pre-ADC between observers A and B, (B) the measurement of intra-ADC between observers A and B, (C) the measurement of pre-SD between observers A and B, and (D) the measurement of intra-SD between observers A and B.

Figure 4: The comparison of ADC and SD parameters between pre-treatment and the 5th RT fraction. (A) The value of intra-ADC was higher than that of pre-ADC (*: P<0.05); (B) the value of intra-SD was higher than that of pre-SD, but the difference was not statistically significant (P>0.05).
Interestingly, we also found that the ADC had changed significantly by the 5th RT fraction, which indicates that the change in functional parameters preceded the change in anatomical morphology. This result suggests that ADC has the potential to predict the treatment response of esophageal carcinoma at an earlier stage. The check-point for monitoring response may be shifted earlier to avoid the measurement error caused by tumor shrinkage. The optimal check-point for assessing treatment response is still controversial [8,14,15], so the method in our study may be an alternative for monitoring early treatment response. Studies are ongoing in our center.
In conclusion, the ADC measurement from DWI is highly reproducible in esophageal carcinoma via our method and could predict treatment response.

Figure 5 (partial caption): A region of interest (ROI) was placed manually for observer A in the selected section, on the image obtained at a b-value of 1000 s/mm², and the ROI was then copied and pasted onto the ADC map (D), and the ADC and SD of the selected section were automatically calculated. (E) A region of interest (ROI) was placed manually for observer B in the selected section, on the image obtained at a b-value of 1000 s/mm², and the ROI was then copied and pasted onto the ADC map (F), and the ADC and SD were also automatically calculated.

Patient selection
Thirty-one patients (20 men, 11 women; mean age 64.5±8.7 years; age range 41-79 years) diagnosed with ESCC by pathology and treated at Zhejiang Provincial Cancer Hospital between January 2015 and November 2016 were enrolled in this study. The study was approved by the institutional review board, and written informed consent was obtained from each participant before MRI examination. All subjects met the following criteria: (1) Eastern Cooperative Oncology Group performance status score ≤ 2; (2) adequate organ function; (3) no concomitant malignancy; (4) good compliance; (5) no contraindication to MRI examination; (6) no surgical indications or patient refusal of surgery; and (7) completion of the entire course of radiotherapy. The stage of disease was classified according to the 7th edition of the Union for International Cancer Control (UICC) and the American Joint Committee on Cancer (AJCC) staging system. The clinical characteristics of all patients are listed in Table 1.

Imaging analysis
Both sets of MR images were transferred to a workstation (ViewForum; Philips Medical Systems, Best, The Netherlands). Two board-certified radiologists (observer 1, Tieming Xie, with 13 years of experience in MR imaging; observer 2, Mingxiang Jiang, with 12 years of experience in MR imaging) independently and blindly reviewed the images and recorded the locations and slice numbers of the primary tumor site, and then performed ADC measurements of the selected tumor by contouring a region of interest (ROI). The size and shape of each ROI could vary, provided that the two observers obeyed the following stipulations: (1) The ROI was placed in the slice containing the most enhancing voxels on contrast-enhanced T1WI [9]. (2) Non-enhancing or necrotic areas were excluded from the ROIs. (3) Three continuous slices were selected, comprising the slice of maximal tumor diameter and the adjacent slices above and below it within the tumor parenchyma, as judged on the sagittal and horizontal views, and the ADC values were averaged over the three slices. (4) All measurements were performed twice by each observer, with a wash-out period of at least two weeks between the first and second series of measurements. The pre-treatment and 5th RT fraction ADCs were labeled as pre-ADC and intra-ADC, respectively.
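
The three-slice averaging in stipulation (3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the workstation procedure: the function name `three_slice_adc`, the per-slice inputs, and the handling of a largest slice at the edge of the stack (the window is shifted inward) are all hypothetical choices made for the example.

```python
from statistics import mean


def three_slice_adc(slice_adc, slice_diameter):
    """Average the per-slice ROI ADC over the largest slice and its two neighbours.

    slice_adc      -- mean ROI ADC of each contiguous slice (10^-3 mm^2/s)
    slice_diameter -- tumor diameter measured on the same slices
    Hypothetical helper; both lists are assumed to be ordered by slice position.
    """
    if len(slice_adc) != len(slice_diameter):
        raise ValueError("one ADC value and one diameter are needed per slice")
    if len(slice_adc) < 3:
        raise ValueError("at least three contiguous slices are required")

    # Index of the slice with the maximal tumor diameter.
    centre = max(range(len(slice_diameter)), key=slice_diameter.__getitem__)
    # Assumption: if the largest slice is first or last, shift the window inward
    # so that three contiguous slices are still averaged.
    start = min(max(centre - 1, 0), len(slice_adc) - 3)
    return mean(slice_adc[start:start + 3])


# Made-up values for five contiguous slices, for illustration only.
adc_per_slice = [1.10, 1.20, 1.30, 1.40, 1.00]
diameter_mm = [10, 25, 32, 28, 12]
print(round(three_slice_adc(adc_per_slice, diameter_mm), 3))
```

Averaging over the neighbouring slices, rather than reading the largest slice alone, is what makes the result less sensitive to small differences in how each observer draws a single contour.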

Statistical analysis
First, the mean ADC±SD of primary tumors, including pre-ADC, pre-SD, intra-ADC, and intra-SD, was acquired by each observer for each series of measurements. Second, the inter- and intra-observer reproducibility of primary tumor ADC measurements was determined by the mean difference (bias) and the 95% limits of agreement according to the methods of Bland and Altman. Bland-Altman plots were constructed with GraphPad Prism 5 software. Statistical analyses were executed using SPSS 16.0 software (SPSS, Chicago, IL, USA), and analysis of variance (ANOVA) was performed to compare the continuous variables between two groups.
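
For readers unfamiliar with the Bland-Altman approach, the sketch below shows how a bias and limits of agreement can be computed from paired measurements. It assumes the conventional definition (bias = mean of the paired differences; limits = bias ± 1.96 × SD of the differences); the function name `bland_altman` and the input values are hypothetical and are not the study data or the GraphPad Prism implementation.

```python
from statistics import mean, stdev


def bland_altman(first, second, z=1.96):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Assumes the conventional 95% limits of agreement: bias +/- 1.96 * SD
    of the paired differences. Hypothetical helper for illustration.
    """
    if len(first) != len(second):
        raise ValueError("paired series must have equal length")
    if len(first) < 2:
        raise ValueError("need at least two pairs")

    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)        # signed mean difference
    sd = stdev(diffs)         # sample SD (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Made-up ADC readings (in 10^-3 mm^2/s), for illustration only.
reading_1 = [1.20, 1.35, 1.10, 1.42, 1.28]
reading_2 = [1.24, 1.33, 1.15, 1.40, 1.31]
bias, lower, upper = bland_altman(reading_1, reading_2)
print(f"bias = {bias:.3f}, limits of agreement = {lower:.3f} to {upper:.3f}")
```

The same function applies to both comparisons in this study design: passing the two reading series of one observer gives the intra-observer agreement, and passing the readings of observers A and B gives the inter-observer agreement.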

Author contributions
Zhun Wang and Zhenfu Fu designed the study; Zhimin Ye and Jun Fang wrote the manuscript. Tieming Xie and Kai Li conducted the MRI examinations and ROI contouring; Shujun Dai collected the data and performed the statistical analysis. Fangzheng Wang and Yuezhen Wang supervised the analysis of imaging and assisted with manuscript preparation.

CONFLICTS OF INTEREST
On behalf of all authors of this paper, I declare that this study will not lead to any financial or other kinds of conflicts of interest.