Predicting clinically significant prostate cancer using DCE-MRI habitat descriptors

Prostate cancer diagnosis and treatment continues to be a major public health challenge. The heterogeneity of the disease is one of the major factors leading to imprecise diagnosis and suboptimal disease management. The improved resolution of functional multi-parametric magnetic resonance imaging (mpMRI) has shown promise to improve detection and characterization of the disease. Regions that subdivide the tumor based on Dynamic Contrast Enhancement (DCE) of mpMRI are referred to as DCE-Habitats in this study. The DCE defined perfusion curve patterns on the identified tumor habitat region are used to assess clinical significance. These perfusion curves were systematically quantified using seven features in association with the patient biopsy outcome and classifier models were built to find the best discriminating characteristics between clinically significant and insignificant prostate lesions defined by Gleason score (GS). Multivariable analysis was performed independently on one institution and validated on the other, using a multi-parametric feature model, based on DCE characteristics and ADC features. The models had an intra institution Area under the Receiver Operating Characteristic (AUC) of 0.82. Trained on Institution I and validated on the cohort from Institution II, the AUC was also 0.82 (sensitivity 0.68, specificity 0.95).


INTRODUCTION
Prostate cancer is the second largest cause for cancer deaths among men in the US with an estimated 21% of newly diagnosed cancers [1]. Over-diagnosis and resulting overtreatment of the disease is a major concern, some of which is attributed to the traditional screening procedures, including prostate specific antigen (PSA) [2][3][4]. mpMRI has improved the detection of clinically significant lesions [5] impacting the staging, diagnosis and follow-up of patients with prostate cancer [6]. Dynamic Contrast Enhancement (DCE) imaging is routinely www.oncotarget.com Oncotarget, 2018, Vol. 9, (No. 98), pp: 37125-37136 Research Paper www.oncotarget.com included in prostate MRI exams along with T2-weighted imaging (T2W), and diffusion-weighted imaging (DWI). DCE imaging shows the dynamics of the administered contrast agent, while Apparent Diffusion Coefficient (ADC) qualifies tissue density by measuring diffusion of water molecules. Clinical assessment of lesions on MRI is guided by the Prostate Imaging Reporting and Data System, version 2 (PIRADSv2) standard [7]. The standard reporting restricts use of DCE to peripheral zone (PZ) when clinical significance of the lesion in DWI is equivocal (DWI PIRADSv2 score of three). DCE analysis can be broadly divided in two approaches: Quantitative, where a model is used to determine the rate of contrast transfer from blood into tissue, and semiquantitative, where the analysis uses time activity curve characteristics to describe different contrast absorption patterns. The difficulty in establishing consistent features from the DCE curves, as well as the high inter-observer variability affects the use of DCE in a quantitative fashion. Even though semi-quantitative DCE analysis lacks a formal modeling of contrast uptake, it provides a more general characterization of the activity curves. This characterization has been used in the automatic detection and quantitative scoring of prostate cancer aggressiveness [8][9][10]. Quantitative models study the pharmacokinetics of the contrast agent in the prostate, especially with the use of arterial input function (AIF) [11]. The model proposed by Tofts et al [12] provides an elegant model to characterize contrast absorption. It has been shown that the pharmacokinetic analysis parameters correlate with lesion aggressiveness but with high inter-institution variability [13]. Major constraint in the identification of a region to derive a reference AIF has prevented these features from being used in practice [14].
In this study, we examine the utility of DCE derived quantitative characteristics on a habitat region co-localized by ADC to discriminate clinically significant prostate cancer. We also show ways to improve discriminatory ability by adding radiomics features derived on an ADC region. The identified model was validated in an independent cohort obtained from a different institution.

RESULTS
Data set from Institution I consisted of 173 positive for cancer biopsies from 57 patients; data set from Institution II consisted of 51 biopsies from 39 patients. Biopsies without assigned Gleason Score (GS) were discarded, such as those labeled by the clinical pathologist as benign prostatic tissue, or as benign prostatic hyperplasia. Biopsies with an assigned GS sum of 6 or above were included for analysis. The average interval between imaging and biopsy sampling was 12 days for Institution I and 27 days for Institution II. The data from Institution I consisted of 116 clinically insignificant and 57 clinically significant biopsies. The data from Institution II consisted of 22 clinically insignificant and 29 clinically significant biopsies. Patients with temporal resolution larger than or equal to 15 sec were excluded (Institution I, n=14; Institution II, n=6). Patients with DCE motion artifacts were also excluded (Institution I, n=5; Institution II, n=3). The final analysis included 38 patients (99 biopsies; 84 clinically insignificant, 15 clinically significant: nine with GS 3+4, four with GS 4+3, and two with GS 5+3) for Institution I and 30 patients (42 biopsies; 17 clinically insignificant, 25 clinically significant: sixteen with GS 3+4, six with GS 4+3, two with GS 4+4 and one GS 4+5) for Institution II. Prior preliminary study had shown detrimental effects of using low temporal resolution on the estimated curve characteristics [15]. The intra-modality temporal alignment of DCE was measured as the percentage difference between the mean prostate time activity curve and its fitted model. Before registration, the mean difference was 11.17% (standard deviation, 7.64%). After registration the mean difference was 7.77% (standard deviation, 2.58%).
In this study, classifier models using features on the perfusion characteristics were used to discriminate between clinically insignificant and significant prostate cancer (see Table 1). The highest predictive DCE and ADC features were used to develop a multivariable predictor model. The wash-in slope habitat and the radiologist contours had a Dice score of 0.21 suggesting that this habitat was exploring the peritumoral region, adding information from the surrounding environment to the model. Intra-institution analysis of DCE features (diagonal,  Sensitivity, specificity and AUC for classification between clinically insignificant and significant cancer is shown, based on MRI-guided biopsies. Decision trees were used as classifiers. Leave-one-out(LOO) cross validation was used. The diagonal corresponds to the univariate case. Each DCE feature generates a 3D map that is thresholded to converge to a 3D DCE volume. For each feature, it is shown the 2D Dice coefficient between each converged volume and the manual radiologist contour of the finding, for the slice with the largest manual volume.
performing features ( Table 4). The same intra-institution analysis of DCE features was performed for Institution II without image registration (diagonal, Table 5). It showed that the AUC was in the range 0.44 to 0.59, with the top predictor being initialAUC. Pairwise analysis of DCE features (off diagonal, Table 5) showed that for Institution II, the AUC increased for the pair of features (time-topeak, initialAUC) to 0.73, with sensitivity of 0.84 and specificity of 0.63. ADC features were ranked by the intra institution AUC and the top five pairs (Table 6, description of features in Table 7) were considered for inter-institution analysis. AUC for both Institution I and for Institution II were in the range 0.71 to 0.82. The top performing ADC features were associated with histogram gradient, volume/ intensity fraction difference and habitat volume. Multivariable analysis was performed by joining pairs of DCE features (Table 2) with the top five performing ADC pairs (from Table 6) as predictors and evaluating their predictive power. The quadruples were ranked by the cumulative inter institution AUC. The top performing couples corresponded to the same ADC feature pair: (MaxHistGrad, MinorAxisL). For intra-institution analysis ( Table 8) the AUC for Institution I was in the range 0.75 to 0.88, and for Institution II it was in the range 0.45 to 0.76. For inter-institution analysis (Table 9) the AUC for Institution I was in the range 0.71 to 0.82, and for Institution II it was in the range 0.54 to 0.70.

DISCUSSION
DCE features show promise in discriminating between normal appearing versus tumor tissue: In a recent study [9], characterization of the prostate region (radiomics) in MRI showed predictive of cancer tissue, with an AUC of 0.71 for PZ and of 0.68 for TZ. DCE features have also shown to be discriminant between clinically significant and insignificant prostate cancer: wash-in and wash-out slope were two of the parameters in a three-variable linear models that showed a classification AUC of 0.85 for PZ and 0.92 for TZ in an intra-institution setting using whole-mount histopathology contours registered unto T2W for lesion characterization [10].
The discriminatory power of DCE shows promise in this work; only if the procedure could be translated in clinical practice to obtain better risk stratification therefore avoiding aggressive treatment in patients with non-significant cancer. It was already shown that DCEbased habitats provide significant correlation between clinically insignificant and significant lesions in [8] where the AUC for the significant quantitative features reached 0.88 and 0.95. This previous work supports the underlying hypothesis for this study: that DCE features are able to differentiate clinical significance of identified lesions. It is shown in this paper that DCE and ADC radiomics features from a wash-in slope induced habitat differentiate clinically significant vs insignificant cancer with an AUC of 0.88 and 0.82 for intra and inter-institution analysis respectively (Tables  5 and 6) showing similar discriminating power than whole-mount histopathology-based regions of interest [10].
The intent of this paper is to show utility of DCE habitats accurate predict cancer status and to show adding multiple modality information (ADC metrics) shown improvement in the predictability. The analysis presented here did not break down the tumors by prostatic zone because of the small sample size for clinically significant lesions, but the segmentation step was aware of the prostate zone containing the largest percentage of the lesion.
Although DCE plays a minimal role in PIRADSv2, finer quantification of perfusion characteristics may have a greater role. As shown in this study, pairs of DCE features had an AUC of 0.71 for Institution I, and 0.82 for Institution II, for intra-institution analysis. For interinstitution analysis (Table 6), it can be seen that training on Institution I had better performance than training on Institution II. This might be due to the difference in training size (99 and 42 biopsies, respectively) suggesting than the radiomics approach requires a larger training set. The best performing features in this inter-institution analysis ( Table 6) for training on Institution I were the tuple (AUCi, wo, MaxHistGrad, MinorAxisL) with an AUC of 0.82. This tuple had both features from early uptake (AUCi) and late uptake (wo) suggesting that including descriptors for the whole DCE curve improves performance. The ADC features suggest that abrupt intensity changes and the volume of the ADC habitat play an important role in improving classification. Perfusion characteristics associated with early enhancement (peak enhancement, time-to-peak, start of enhancement) continued to be some of the top predictors of clinical significance. In addition, it was found that the rate of contrast activity at early and late absorption (wash-in slope, wash-out slope, and slope product) were consistently top candidates related to Gleason tumor grades. A meta-analysis of various DCE publications in prostate cancer [16] showed that the forward volume transfer (Ktrans) and reverse reflux (Kep) are consistently related to tumor aggressiveness and these measures were valuable for differential diagnosis of prostate cancer. This study found that feature descriptors related to perfusion peak, rate and wash-out characteristics were predictive of clinical significance. Multicenter validation studies in breast cancer finds variation in concordance between participants estimate of Ktrans, ranging from 0.047 to 0.92 [17].
There are a few studies using quantitative imaging in prostate cancer relating features to aggressiveness. Some top features correspond to gradients, Gabor filters, etc [18]. The predictors showed high specificity (>95%) but a low level of sensitivity (≤ 42%). In a recent review on prostate cancer, the concern of over-diagnosis was addressed by a suggestion to exploit quantitative imaging metrics to offset the need for invasive biopsies [19]. A quantitative imaging approach such as the one presented in this current study has the potential to significantly reduce the number of biopsies and associated morbidity. The presented approach of using a sphere around the lesion to find an appropriate habitat can easily be adapted to a deep-learning framework to identify DCE habitats in a data-driven fashion that shows promising in imaging but requires larger data sets. Center of mass of manually drawn contour was used, to co-localized ADC map to converge on DCE habitats. Small changes in lesion contours will have minimal impact on the habitat region.
The need for registration between different modalities of medical imaging has been well documented [20]. The measurement of registration accuracy is still challenging for 4D DCE data. DCE intensity variances over time have been used as a similarity measure in registration of DCE data [21]. We used MIM PACS registration modules (FDA approved package) accessed iteratively using custom routines to minimize discrepancy in mpMRI modality alignment. Based on our preliminary analysis to study the influence of modality alignment to downstream analysis, we find time-topeak and initialAUC are early enhancement features, that are probably not affected by patient movement which predominantly happens during the later parts of the scan. The aim of the work presented here is not to identify nor delineate suspicious regions in the prostate. Our goal is to provide the radiologists and oncologists with an accurate prediction of the clinical significance of identified lesions.
The American College of Radiology recommends use of high DCE temporal resolution (10 seconds or less) for characterizing prostatic vasculature [22]. In a recent  Pairings of 90 ADC features for intensity statistics, histogram and shape were used for classification between clinically insignificant and significant cancer. Sensitivity, specificity and AUC were computed. The cumulative AUC between institutions was used for ranking. The top five pairings are shown below. Decision trees were used as classifiers. Leave-oneout (LOO) cross validation was used for intra-institution evaluation    (Table 3). Sensitivity, specificity and AUC were computed. The cumulative inter-institution AUC was used for ranking. Leave-one-out (LOO) cross validation was used for decision tree classifiers. All top DCE-ADC tuples had the same ADC pair (MaxHistGrad, MinorAxisL) www.oncotarget.com   The top performing tuples in the intra-institution DCE and ADC feature evaluation (Table 5) are independently tested between institutions. Sensitivity, specificity and AUC were computed. Decision trees were used as classifiers.
study, a sampling resolution of 15 sec and above resulted in a statistical insignificance compared to higher resolutions [15]. Due to retrospective nature of the study, data sets with temporal sampling larger than 15 seconds were removed to compromise on the sample size between two centers.

Patient data
This study evaluated the performance of features using two independent data sets: First cohort was collected at the University of Miami (Institution I) under approved Institutional Review Board (IRB) protocol and deidentified for retrospective analysis. An additional cohort was collected at H. Lee Moffitt Cancer Center (Institution II), under protocol approved by the University of South Florida's IRB. The patient's informed consent was waived for retrospective access of de-identified patient records. The methods were performed in accordance with the approved guidelines. Data consisted of histopathology analysis of prostate biopsies acquired with either template or targeted biopsy from pre-treatment MRI acquired by fusing the MRI and real-time ultrasound images using Uronav (Invivo Corporation, Gainesville, FL), which allows for accurate measurement of needle location. For this study, GS values were grouped in two categories: clinically insignificant cancer (=GS6) and clinically significant cancer (GS ≥ 7). All statistics were performed using this grouping.
All modalities were registered locally to the prostate using the T2W image as reference. We used gradient descent of mutual information on the space spanned by 3D affine transformations, using a combination of native and custom routines on the MIM PACS software (MIM Corporation, Cleveland, OH). Manual contours of the prostate, PZ, and the radiologist finding in the pre-biopsy MRI were stored as RT-DICOM structures. The peak-absorption time point S peak was identified in DCE using the AIF signal as reference. All other DCE time points were registered to S peak . ADC were standardized within the prostate, i.e., ADCz = (ADCmean(ADC(prostate)))/std(ADC(prostate)), which has been shown to be used to standardize data variability [23]. DCE data was normalized using an automatically segmented arterial contour as described in [24], which makes the signal proportional to the change in relaxation rate caused by the contrast agent weighted by the initial spin-lattice relaxation time [25]. Image analysis was performed using custom routines written in Matlab (Mathworks, Natick, MA) which were accessed directly from the PACS (MIM Corporation, Cleveland, OH, USA).

Image registration in mpMRI
In order to minimize the effects of patient movement during the long period of mpMRI scan on the downstream analysis has motivated to use image registration to align modalities [20]. The measurement of registration accuracy is still challenging for 4D DCE data. In prior studies intensity variances over time have been used as a similarity measure in registration of DCE data [21]. This variance was quantified by measuring the percentage difference between the mean prostate time activity curve and its fitted model. The mean percentage different of the signal intensity decreased from 11.17% without any to 7.77% after image registration. The standard deviation also reduced, from 7.64% to 2.58%, showing a larger decrease in the distribution of motion artifacts after registration. It was found that the performance of DCE features as predictors of accuracy is sensitive to patient motion artifacts. We evaluated the performance of single DCE features as predictor with respect to registration. We find the AUC range shifted from 0.37 to 0.71 (with registration) to the range 0.44 to 0.64 (without registration). Pairwise analysis showed that the feature pair (time-to-peak, initialAUC) was not affected by the registration process. www.oncotarget.com

Study design
The overall methodology of this study is shown in Figure 1. A wash-in slope map was generated by estimating the wash-in slope (Figure 2) of the time activity curves associated with the voxels within the prostate. The tumor region that was based on the radiologist finding on the T2W images was obtained. This region was centered based on the TRUS biopsy location that was imported directly from the fused TRUS/MRI system. This lesion boundary was initialized with a uniform 3D volume (extended region of fixed diameter) around the biopsy location and converged automatically into a wash-in slope habitat based on the upper quartile of the wash-in slope map ( Figure 3A). The habitat's average absorption at each DCE time point was analyzed and used to generate time activity curves that were characterized by computing quantitative descriptors. These descriptors were then used in a classifier model to find features that discriminate clinically significant tumors. Intra-institution classification was used to select a set of DCE and ADC features that were analyzed in an inter-institution setting, where the model was build using the cohort from one institution and validated on the other institution. Inter-institution analysis of this subset of features was performed and the top performing features are shown in Table 5.

Feature extraction
In this study, seven features were extracted from the DCE time activity curves, which describe both early and late enhancement (see Table 1). DCE curves were fitted using a bi-exponential model [26]. This semi-quantitative model has five parameters: initial static intensity s 0 , plateau s m , start of enhancement t 0 , time-to-peak tau, and wash-out slope, wo. Figure 2 shows an example DCE curve along with these parameters. Peak enhancement s p =s m -s 0 ; washin slope wi=s p /tau. In addition, we computed two features that describe the area under the DCE curve between a time intervals, namely: AUC t1-t2 is the area under the biexponential fitted DCE curve between time, t1 and t2. AUCi = AUC t0-t0+60 measures the early wash-in uptake curve and AUCf = AUC t0+240-t0+270 measures the late washout curve. The seventh feature computes the multiplicative effect of wash-in and wash-out slopes and was computed as m io = wi * wo. On the localized region, a set of 90 ADC features were computed consisting of intensity statistics, histogram and volume features. A subset of pairs of ADC features was obtained from the top performing pair-wise features selecting those with largest AUC.

Computation of the wash-in slope habitat
The wash-in slope has been useful for cancer detection and localization [27], as well as in discriminating aggressive versus non-aggressive lesions [28]. It also differentiates prostate cancer from non-neoplastic lesions [29]. In [28] manual contours on whole mount histopathology after prostatectomy were mapped to T2 and the DCE wash-in slope was significantly different between these two groups for both the mean and the 75 th percentile within the mapped contour. In recent work [30], wash-in slope along with time-to-peak induced the highest sensitivity (0.89 for linear discriminant analysis, and 0.97 for SVM) for ovarian cancer. In this study, the wash-in slope parameter was used to converge to an intra and peritumoral region (habitat) around the biopsy location to characterize the surroundings of the biopsied lesion. This was done by first forming a sphere (radius r = 15 mm) around the given biopsy location to account for TRUS/MRI registration error. This region was bounded by the prostatic zone, either PZ or transition zone (TZ) allocating the largest lesion volume. The values for the wash-in slope within the localized sphere were used to obtain the region defined by the upper quartile. The corresponding DCE region will be our consensus tumor habitat region of interest. The mean DCE signal value at the consensus region at each sampling time was used as a representative perfusion curve for the patient biopsy. The definition of the wash-in slope habitat is shown in Figure 3

Statistical analysis
Univariate analysis of the seven DCE features was performed to evaluate the overall discrimination of clinically significant to non-significant cancers using decision trees [31]. Sensitivity, specificity and AUC were computed on the features (see Table 1). Pair-wise multivariable analysis was performed by exhaustive comparison of all possible DCE feature pairs.
The underrepresented GS class was over-sampled using SMOTE [32], calibrated so that both classes had exactly the same size. Intra institution classifier performance was evaluated using leave-one-out (LOO) cross validation. For inter institution validation, a training model was built using the whole balanced data set in one institution, and tested using the unbalanced data set from the other institution. Data from different institutions were not mixed to build the classification models.
Pair-wise comparison of AUC was performed using DeLong test [33]. False discovery rate [34] (FDR) was used to correct for multiple comparisons.
Image processing and segmentations were performed on MIM Imaging PACS workstation (MIM Corporation, Cleveland, OH, USA). The feature computations were developed using custom code written in C++ and Matlab (Mathworks Inc., Natick, MA). Classifiers were implemented in Matlab. DeLong and FDR tests were performed in R.

CONCLUSIONS
This paper describes a systematic approach to quantifying the clinical significance of lesions identified by radiology using a DCE-based habitat and evaluating both DCE and ADC features. Our approach identifies reproducible features for inter-institution prediction and can be translated seamlessly into clinical practice to guide radiologists and oncologists in the assessment of clinically significant prostate cancer.