Nomograms for predicting Gleason upgrading in a contemporary Chinese cohort receiving radical prostatectomy after extended prostate biopsy: development and internal validation

The current strategy for the histological assessment of prostate cancer (PCa) is mainly based on the Gleason score (GS). However, 30-40% of patients who undergo radical prostatectomy (RP) are misclassified at biopsy. Thus, we developed and validated nomograms for the prediction of Gleason score upgrading (GSU) in Chinese patients who underwent RP after extended prostate biopsy. This retrospective study included 411 patients who underwent RP at our institute after prostate biopsy between 2011 and 2015. The final pathologic GS was upgraded in 151 (36.74%) of all patients and in 92 (60.13%) of the men with GS=6. In multivariate analyses, the primary biopsy GS, secondary biopsy GS and obesity were predictive of GSU in the overall cohort. In patients with GS=6, the significant predictors of GSU included the body mass index (BMI), prostate-specific antigen density (PSAD) and percentage of positive cores. The area under the curve (AUC) of the prediction models was 0.753 for the entire patient population and 0.727 for the patients with GS=6. Both nomograms were well calibrated, and decision curve analysis demonstrated a high net benefit across a wide range of threshold probabilities. This study may be relevant for improved risk assessment and clinical decision-making in PCa patients.


INTRODUCTION
The Gleason score (GS) is the most widely accepted system for the grading of prostate cancer (PCa) [1]. However, because of limitations inherent in needle biopsy sampling, the GS at biopsy often differs from the GS at radical prostatectomy (RP). According to the available reports, approximately 30-40% of patients who undergo RP are misclassified during the pathological review of the biopsy [2].
The choice among treatment options for patients with PCa, such as active surveillance, radical prostatectomy, brachytherapy or cryosurgery, relies heavily on the GS. Thus, precise determination of the GS at biopsy is of particular interest for clinical decision-making. For instance, GS upgrading (GSU) after RP is highly associated with a risk of extracapsular extension (ECE) and biochemical recurrence (BCR) [2]. In addition, increasing quantities of Gleason pattern 4 are associated with an increased risk of biochemical disease recurrence, a need for adjuvant therapy, and cancer-specific mortality [3]. Importantly, there are substantial differences in the outcomes between patients with a GS of 3+4 and those with a GS of 4+3 or higher [4].
Although several nomograms for predicting GSU after RP have been reported [5][6][7], there is evidence of substantial racial variation in upgrading and upstaging [8]. In particular, the epidemiology and patient spectrum of PCa in China and other Asian countries in similar situations differ considerably from those of Western populations [9,10]. First, more Chinese patients are diagnosed with a higher GS; for example, patients diagnosed with a GS of ≥7 accounted for 80% of the total patients in a collaborative report in Asia [11]. We therefore designed this study for the Chinese population by introducing two nomograms: one for the overall patient population and one for patients with a GS=6. Second, a comparative study showed that the PCa prevalence on autopsy was similar between unscreened Caucasian and unscreened Asian men; nevertheless, the rate of high-grade prostate cancer (HGPCa) was higher in unscreened Asian men than in unscreened Caucasian men, even after adjusting for age and prostate weight [12], suggesting that grading differs between the two populations. Third, reports from Korea and Japan have illustrated that Western population-derived prediction models perform poorly in Asian populations [13,14]. Pilot studies have been performed in Asian populations [13,15,16,17]; however, the sample sizes and the factors involved were limited, and the nomograms derived from these studies lacked validation.
Thus, we performed this retrospective study to better predict GSU in a contemporary Chinese cohort of patients who underwent transrectal ultrasound (TRUS)-guided biopsy and subsequent RP. Nomograms predicting significant GSU and any upgrading were established and internally validated.

RESULTS
Patient characteristics
The clinical and pathologic characteristics of all included patients and of those with a GS=6 are shown in Table 1. The final pathologic GS was upgraded in 151 (36.74%) cases in the overall patient population and in 92 (60.13%) cases among men with a GS=6. A total of 61 (39.87%) men with a GS=6 and 168 (40.88%) of the overall patients had the same GS at biopsy and RP. Nevertheless, 92 (22.38%) patients in the overall patient population showed downgrading of the GS from biopsy to RP (Table 2).
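The reported percentages can be checked directly from the counts. The short sketch below does this arithmetic; the dictionary and function names are illustrative, and the GS=6 subgroup size of 153 is inferred here from the sum of upgraded and unchanged cases rather than quoted from the text.

```python
def pct(count, total):
    """Percentage of total, rounded to two decimals as in the text."""
    return round(100.0 * count / total, 2)

# Counts as reported for the overall cohort (biopsy GS vs. RP GS).
overall = {"upgraded": 151, "unchanged": 168, "downgraded": 92}
# Counts as reported for men with biopsy GS=6.
gs6 = {"upgraded": 92, "unchanged": 61}

n_overall = sum(overall.values())  # cohort size implied by the three counts
n_gs6 = sum(gs6.values())          # subgroup size inferred from the two counts

overall_pct = {k: pct(v, n_overall) for k, v in overall.items()}
gs6_pct = {k: pct(v, n_gs6) for k, v in gs6.items()}

print(n_overall, overall_pct)
print(n_gs6, gs6_pct)
```

The three overall counts sum to the 411 analysed patients, which is consistent with every patient being classified as upgraded, unchanged or downgraded.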

Predictors of GSU in the overall patient population
In univariate logistic regression analyses of potential preoperative predictors of GSU in the overall patient population, the primary biopsy GS (P<0.001) and secondary biopsy GS (P<0.001 in patients with GS=4) were statistically significant predictors of GSU (Table 3). These were the only informative predictors in the overall patient group (AUC 0.66 and 0.70, respectively), and both were negatively correlated with GSU. Although obesity reached only borderline significance (P=0.089) in the overall cohort, obese patients were estimated to have a 2.6-fold higher risk of GSU than non-obese patients. Thus, we tested the performance of this variable in the multivariate prediction model. In the multivariate logistic regression analysis, the predictors retained by the backward elimination procedure were primary biopsy GS=4 (odds ratio [OR] 0.53; 95%CI, 0.32 to 0.87), primary biopsy GS=5 (OR 0.14; 95%CI, 0.03 to 0.62), secondary biopsy GS=4 (OR 0.30; 95%CI, 0.19 to 0.48), secondary biopsy GS=5 (OR 0.00; 95%CI, 0.00 to 0.00) and obesity (OR 1.72; 95%CI, 1.08 to 2.74). The AUC of the prediction model reached 0.753 (95%CI, 0.706 to 0.800) (Table 4, Figure 1).
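In a logistic model without interaction terms, the odds ratios of individual predictors combine multiplicatively, because their logarithms (the regression coefficients) add. The sketch below illustrates this with the multivariate odds ratios reported above. The feature names are hypothetical, and the reference categories (biopsy pattern 3, non-obese) are an assumption. The degenerate estimate for secondary biopsy GS=5 is left out, and since the model intercept is not reported here, only relative odds can be computed, not absolute probabilities.

```python
import math

# Multivariate odds ratios as reported in the text.
# Keys are hypothetical labels; reference levels are assumed, not stated.
ODDS_RATIOS = {
    "primary_gs_4": 0.53,
    "primary_gs_5": 0.14,
    "secondary_gs_4": 0.30,
    "obese": 1.72,
}

def combined_odds_ratio(features):
    """Odds of GSU relative to a reference patient, assuming no interactions.

    Sums the log odds ratios (regression coefficients) of the features that
    are present and exponentiates the result.
    """
    log_or = sum(math.log(ODDS_RATIOS[name]) for name in features)
    return math.exp(log_or)

# Example: an obese patient with primary biopsy pattern 4.
print(combined_odds_ratio(["primary_gs_4", "obese"]))
```

A nomogram expresses the same additive structure graphically: each predictor contributes points in proportion to its coefficient, and the total maps to a predicted probability once the intercept is known.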

Predictors of GSU in men with GS=6
Univariate logistic regression analyses indicated that the preoperative variables significantly associated with GSU included PSA (P=0.  Table 4). The accuracy of this prediction model was relatively high, with an AUC of 0.727 (95%CI, 0.647 to 0.808) (Table 4, Figure 1).

Calibration curve analyses
In addition to the AUC, calibration is an important indicator of nomogram performance. Thus, calibration analysis was applied to measure how far the predictions were from the actual outcomes. The bias-corrected calibration plots showed only a limited departure from the ideal predictions. The mean absolute error was 1.6% in the overall patient group and 2.6% in the GS=6 cohort (Figure 2).

Decision curve analyses
In the decision curve analyses, the net benefit at the highest threshold was higher for the two models than for any single predictor of GSU, both in the overall patient group and in those with a GS=6 (Figure 3). Higher positive net benefits were observed over most threshold probabilities from 0.4 to 0.7 for any GSU and from just over 0.3 to 0.8 for GSU in patients with a GS=6, suggesting a benefit in men with a probability in these ranges. In the overall patient population, the primary biopsy GS showed the highest performance among the single predictors, followed by the secondary biopsy GS and obesity. In men with a GS=6, Model 2 showed better performance at most thresholds, from 0.3 to 0.8.
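For readers unfamiliar with decision curve analysis, the net benefit at a threshold probability p_t is commonly computed as TP/n - (FP/n) x p_t/(1 - p_t), where TP and FP are the true and false positives when every patient with a predicted probability of at least p_t is treated as positive. The following is a minimal sketch of that standard formula. The function names and the toy data are illustrative and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Toy data (hypothetical): 1 = upgraded at RP, 0 = not upgraded.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

nb_model = net_benefit(y_true, y_prob, 0.5)
nb_all = net_benefit_treat_all(y_true, 0.5)
print(nb_model, nb_all)
```

A decision curve plots these quantities over a range of thresholds; a model is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy (net benefit 0).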

DISCUSSION
In this study, we found that the primary biopsy GS and secondary biopsy GS values were significant predictors of GSU in all patients undergoing RP, while the PSA, PSAD, number of positive biopsy cores, percentage of positive cores, and obesity were predictors of GSU in patients with GS=6. A prediction model including the primary biopsy GS, secondary biopsy GS and obesity was constructed for the overall patient population, while the prediction model for patients with GS=6 included the PSAD, percentage of positive cores and obesity (Figure 4). Both models had a relatively high accuracy and were well fitted in the calibration analysis; decision curve analysis also confirmed their clinical usefulness.
Several models predicting GSU after RP have been constructed, mostly in Western patient populations. For instance, the primary biopsy GS, secondary biopsy GS, preoperative PSA and clinical stage were identified as predictors of GSU in a study by Chun FK et al. [5]. In particular, patients with a primary biopsy GS ≤3 or secondary biopsy GS ≤3 were likely to have GSU, and a higher preoperative PSA level was correlated with an increased likelihood of GSU.
However, the PSA level was not identified as a significant predictor of GSU in this study. We suggest that this may be due to the differences in the PSA distribution between the two studies. In this study, 25% of patients had a PSA level over 19.3 ng/ml, while only a small proportion of patients (7.6%) had a PSA level over 20 ng/ml in the study conducted by Chun FK et al. This fact further illustrates the necessity of establishing new prediction models in the Chinese population, which appears to be quite different from the Western population.
Furthermore, a study in a Japanese population also confirmed that men with a primary biopsy GS of 4-5 show a much lower likelihood of GSU [17]. The primary biopsy GS was found to be more accurate than the secondary biopsy GS in the Japanese study. However, the predictive accuracy of secondary biopsy GS was higher than that of
Studies focused on patients with GS=6, especially those with the potential to meet the inclusion criteria for active surveillance or other conservative treatment options such as brachytherapy and cryosurgery (CSAP), are therefore important. Truong M et al. [6] developed a prediction model in a total of 1,961 patients with a GS=6 who underwent RP. PSAD, obesity (BMI>30), and maximum core involvement were correlated with GSU. Similarly, PSAD was considered to be a predictor of GSU in several previous studies [19][20][21][22]. Tilki D et al. [21] found that PSAD was a strong predictor of GSU, and our results confirm this finding. Nevertheless, we found that neither the PSA nor the prostate volume was significantly associated with GSU. We therefore speculate that the effect of these two predictors may be enhanced when they are combined. Previous reports have suggested that the prostate volume is a significant predictor of GSU [17,19]. Data from this study suggest that patients with a medium prostate size (30-45 ml) may have a lower risk of GSU than patients with smaller prostates (<30 ml), both in the overall patient population (OR=0.65, 95%CI, 0.40 to 1.06, P=0.085) and in patients with GS=6 (OR=0.60, 95%CI, 0.26 to 1.36, P=0.222). However, the trend was reversed when the prostate volume continued to increase (30-45 ml vs. >45 ml). We therefore suspect that there are confounding factors in the relationship between prostate volume and GSU, although the limited sample size of this study makes it difficult to determine the exact influence of this variable.
This study supports the recent findings that BMI or obesity may have a significant correlation with GSU, as patients with a higher BMI show a higher likelihood of GSU [23]. Although the BMI was not significantly associated with GSU (P=0.534 in overall patients and P=0.170 in patients with GS=6), we found that obese patients showed a higher likelihood of GSU. Although there are racial differences in BMI between populations, this study suggests that we should be cautious about GSU in men with a higher BMI in the Chinese population.
Age has been considered an important predictor of GSU in Western studies [8,18,21]. Nevertheless, we found that age was not a significant predictor in this study. Such variation may be due to the limited sample size and different composition of the included patients. The mean age of this Chinese cohort was 67 years, which was much higher than that of previous Western reports (58.6-64.3 years) [6,7,19,21]. Other factors, such as the tumor volume at biopsy and the prostate weight at RP, have also been shown to be associated with GSU in previous reports [6,19]; nevertheless, these factors were not included in this study.
On the whole, there was a relatively large difference between this Chinese nomogram and the Western nomogram in predicting GSU in the overall patient population. However, the nomograms predicting GSU in patients with GS=6 were similar between the two populations, which may be due to the differences in the characteristics of PCa patients in the two populations. Currently, most PCa patients in Western countries are diagnosed at an earlier stage and with a lower GS (a high proportion of patients with GS=6) due to the implementation of PSA-based screening. In China, however, a high proportion of patients are diagnosed with a GS of 3+4, 4+3 or even higher. Thus, we need to develop prediction models for these special situations, as the prediction of GSU in these patients would influence decision-making regarding RP and other treatments.
This study presents the first prediction tool for GSU in Chinese patients with a GS=6 who underwent RP after biopsy. The nomogram for the overall patient population was based on more clinical predictors and a larger sample size than a previous study [16]. These strengths should enhance its performance in future clinical practice.
Nevertheless, there were some drawbacks to this study. First, this was a retrospective study and thus suffers from the limitations associated with this type of study. Furthermore, some predictors were not included in the analyses, such as the percentage of tumor in each core and the tumor size. Although the prostate volume of most patients was assessed by TRUS, a few patients lacked these data, and the prostate volume for these patients was instead obtained from the RP specimens. Lastly, the sample size of this study was limited, and only single-center data were involved. As the Chinese Prostate Cancer Consortium was established to facilitate multi-center studies, a PCa database based on a browser/server schema was created [24]. Importantly, the external validation of these nomograms in a multi-center population is scheduled.

MATERIALS AND METHODS
Patient population
From 2011 to 2015, a total of 642 consecutive patients who underwent radical prostatectomy at our institute after TRUS-guided prostate biopsies were retrospectively included in this study. Of these, 231 (35.98%) were excluded because of missing information (n=129), inadequate prostate sampling (<10 cores, n=95) or sarcoma (n=7). Further analysis targeted the remaining 411 patients. The clinical and pathologic characteristics of the included patients, including age, body mass index (BMI), preoperative PSA, prostate volume measured by TRUS, prostate-specific antigen density (PSAD), clinical stage, and biopsy and RP specimen features, are shown in Table 1. GSU was defined as a biopsy GS changing from 6 to 7 or from 7 to a higher GS, as well as a GS changing from 3+4 to 4+3. GSU in patients with GS=6 was defined as a change in the GS from 6 to 7 or more.
Both biopsy and prostatectomy specimens were evaluated by the same uropathology group. Nine a priori defined preoperative risk factors, including patient age, BMI, obesity, PSA, prostate volume, PSAD, clinical stage, number of positive cores, and percentage of positive cores, were assessed for their ability to predict GSU in patients with GS=6. Another two factors, the primary and secondary biopsy GS values, were added to the 9 factors for the assessment of GSU in all cases. We were unable to assess the percentage of cancer tissue involved in each biopsy core due to incomplete data.
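The definition of GSU above can be written as a small classification rule on the primary and secondary Gleason patterns. The sketch below is one possible reading and not the authors' code: the function name is hypothetical, treating any increase in the Gleason sum (for example 8 to 9) as upgrading generalizes the stated definition, and treating 4+3 to 3+4 as downgrading is likewise an assumption.

```python
def classify_gs_change(biopsy, rp):
    """Compare biopsy and RP Gleason patterns given as (primary, secondary)."""
    biopsy, rp = tuple(biopsy), tuple(rp)
    b_sum, r_sum = sum(biopsy), sum(rp)
    if r_sum > b_sum:
        # Covers 6 to 7 and 7 to higher; any other sum increase is an assumption.
        return "upgraded"
    if r_sum < b_sum:
        return "downgraded"
    if b_sum == 7:
        if biopsy == (3, 4) and rp == (4, 3):
            return "upgraded"    # 3+4 to 4+3 counts as GSU per the definition
        if biopsy == (4, 3) and rp == (3, 4):
            return "downgraded"  # assumption: mirror of the rule above
    return "unchanged"

examples = [
    ((3, 3), (3, 4)),  # 6 to 7
    ((3, 4), (4, 3)),  # 3+4 to 4+3
    ((4, 3), (4, 3)),  # identical
    ((4, 4), (4, 3)),  # 8 to 7
]
for biopsy, rp in examples:
    print(biopsy, rp, classify_gs_change(biopsy, rp))
```

Making the rule explicit in this way also makes it easy to see which transitions the written definition leaves open.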

Statistical analyses
Statistical analysis was performed in the overall patient population and in the patients with GS=6. First, univariate logistic regression was performed to investigate the association of clinical and pathological variables with GSU and with GSU from GS=6. Second, variables with P<0.1 in the univariate analysis were further tested in multivariable logistic regression analysis to identify independent predictors of GSU or of GSU from GS=6. We chose P<0.1 as the criterion because we intended to expand the inclusion of variables. Third, receiver-operating characteristic (ROC) curve analysis was applied to calculate the area under the curve (AUC) of the prediction models. Fourth, calibration curves were constructed to assess the agreement between the actual rate of GSU and the probabilities of GSU predicted by these two models. The bias-corrected calibrated values were generated from internal validation based on 200 bootstrap resamples. The ideal curve is characterized by an intercept close to 0 and a slope close to 1. Finally, the decision curve analysis described by Vickers et al. [18] was performed to assess the clinical utility of these two models and of the single predictors by quantifying the net benefits over a spectrum of threshold probabilities. All of the P values were two-sided, and P<0.05 was considered to be statistically significant. ROC curve analysis was performed using MedCalc v.10.4.7.0 (MedCalc Software bvba, Mariakerke, Belgium), and other analyses were performed with R version 3.1.3 (R Foundation for Statistical Computing, Vienna, Austria).
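To make two of these steps concrete, the sketch below computes the AUC as the probability that a randomly chosen upgraded case receives a higher predicted probability than a randomly chosen non-upgraded case (ties counted as one half), and then draws 200 bootstrap resamples. It is an illustration only: the study used MedCalc and R, the function names and toy data are hypothetical, and the bootstrap here yields a percentile interval for the AUC, which is a different use of resampling from the bias-corrected calibration described above.

```python
import random

def auc(y_true, y_prob):
    """Rank-based AUC: P(score of a positive > score of a negative)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties contribute one half
    return wins / (len(pos) * len(neg))

def bootstrap_auc_interval(y_true, y_prob, n_boot=200, alpha=0.05, seed=0):
    """Percentile interval for the AUC from n_boot bootstrap resamples."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples containing a single class
            stats.append(auc(ys, [y_prob[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * (n_boot - 1))]
    upper = stats[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper

# Toy data (hypothetical): 1 = upgraded at RP, 0 = not upgraded.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

point_estimate = auc(y_true, y_prob)
interval = bootstrap_auc_interval(y_true, y_prob)
print(point_estimate, interval)
```

With only eight toy observations the interval is wide and mainly serves to show the mechanics of resampling.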

CONCLUSION
Obesity and the primary and secondary biopsy GS values were identified as predictors of GSU in the overall patient population. Obesity, PSAD and the percentage of positive cores served as predictors of GSU in patients with a GS=6. Nomograms for predicting GSU were established in Chinese PCa patients, with a relatively high accuracy in internal validation. This study may therefore be of relevance in the risk assessment and clinical decision-making for PCa patients.