Comparison of DWI and 18F-FDG PET/CT for assessing preoperative N-staging in gastric cancer: evidence from a meta-analysis

The diagnostic values of diffusion weighted imaging (DWI) and 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) for N-staging of gastric cancer (GC) were identified and compared. After a systematic search to identify relevant articles, meta-analysis was used to summarize the sensitivities, specificities, and areas under curves (AUCs) for DWI and PET/CT. To better understand the diagnostic utility of DWI and PET/CT for N-staging, the performance of multi-detector computed tomography (MDCT) was used as a reference. Fifteen studies were analyzed. The pooled sensitivity, specificity, and AUC with 95% confidence intervals of DWI were 0.79 (0.73–0.85), 0.69 (0.61–0.77), and 0.81 (0.77–0.84), respectively. For PET/CT, the corresponding values were 0.52 (0.39–0.64), 0.88 (0.61–0.97), and 0.66 (0.62–0.70), respectively. Comparison of the two techniques revealed DWI had higher sensitivity and AUC, but no difference in specificity. DWI exhibited higher sensitivity but lower specificity than MDCT, and 18F-FDG PET/CT had lower sensitivity and equivalent specificity. Overall, DWI performed better than 18F-FDG PET/CT for preoperative N-staging in GC. When the efficacy of MDCT was taken as a reference, DWI represented a complementary imaging technique, while 18F-FDG PET/CT had limited utility for preoperative N-staging.


INTRODUCTION
Although its incidence and mortality have decreased dramatically over the past 50 years, gastric cancer (GC) remains the fourth most common cancer and the second leading cause of cancer-related death worldwide, with a poor prognosis [1,2]. The variety of therapeutic options available for GC, such as radical resection, endoscopic submucosal dissection, and neoadjuvant chemotherapy [3], makes accurate preoperative TNM staging a necessity [4][5][6]. Lymph node assessment is crucial to treatment strategy and to determining prognosis in GC patients [7,8]. In cases without distant metastases, extended lymphadenectomy based on precise lymph node staging is an important procedure in radical gastrectomy, which could improve the outcome for GC patients [9,10]. According to the Japanese Gastric Cancer Association, for differentiated T1a early GC without lymph node metastasis, endoscopic resection or partial resection plus D1/D1+ lymphadenectomy is indicated, but patients with lymph node metastasis need a standard D2 lymphadenectomy [11]. Closely correlated with tumor size, depth of infiltration, and vascular tumor thrombus, lymph node metastasis is regarded as a key independent predictor of recurrence and is one of the indications for adjuvant chemotherapy in GC patients [10,12]. The 5-year survival rate after surgical treatment in patients with N0 GC is 86.1%, whereas the survival rates in patients with N1, N2, and N3 GC decrease dramatically to 58.1%, 23.3%, and 5.9%, respectively [13]. Therefore, accurate preoperative lymph node assessment might facilitate the selection of candidates for neoadjuvant chemotherapy, optimize radical surgery strategy, and predict the prognosis of GC [14].
Several tools are available to diagnose lymph node metastasis of GC, such as multi-detector computed tomography (MDCT), endoscopic ultrasonography (EUS), positron emission tomography/computed tomography (PET/CT), and magnetic resonance imaging (MRI) [15]. MDCT is the most widely used tool for lymph node staging of GC patients, mainly on the basis of lymph node size [16,17], but its limited sensitivity results in false-negative findings [18][19][20][21]. EUS provides good information on lymph node status around lesions but is inadequate for predicting extra-perigastric and distant lymph node metastasis because of the limited penetration range of the ultrasound beam [15,22]. Therefore, finding more accurate imaging techniques for N-staging of GC is essential.
Diffusion weighted imaging (DWI) and 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) are relatively new imaging techniques used for preoperative staging of numerous cancers. Studies have suggested that diffusion MRI helps distinguish malignant from benign lesions by use of apparent diffusion coefficient (ADC) measurements [23][24][25][26][27]. The rationale is that malignant tumors show restricted diffusion whereas benign lesions do not [28,29]. Although the value of this imaging modality in differentiating metastatic from non-metastatic lymph nodes has been shown in patients with neck, lung, prostate, and colorectal cancers [30][31][32][33], there is not enough evidence to support the general use of DWI in nodal staging of GC patients. 18F-FDG PET/CT, which integrates the anatomical details from CT with the functional status from PET, facilitates early detection of primary lesions and differentiation of metastases in various cancers, including GC [34]. PET/CT has several advantages over PET alone or CT alone and is increasingly used in diagnostic staging, treatment decisions, and prognosis evaluation [35][36][37]. However, the usefulness of PET/CT in the assessment of preoperative lymph node involvement is hindered by unsatisfactory sensitivity compared with contrast-enhanced CT, despite its better specificity [38,39]. Furthermore, the few published studies on the subject reported a wide range of sensitivities and specificities for 18F-FDG PET/CT in preoperative nodal assessment of GC [40,41].
The value of conventional imaging techniques, such as MDCT, EUS, MRI, and PET, has been investigated by meta-analyses [42][43][44][45]. However, the efficacy of DWI and 18F-FDG PET/CT in lymph node staging has not been determined, and no relevant meta-analyses have been performed. Therefore, we performed a systematic review and meta-analysis to assess and compare the diagnostic values of DWI and 18F-FDG PET/CT for lymph node staging in GC patients.
The principal characteristics of the 15 selected articles are listed in Table 1. Of these articles, 12 were retrospective and three were prospective. Patients were Asian in 11 articles and Caucasian in the other four. All reference standards were based on pathological analysis after surgery, although the operation methods differed. Considering the complexity of the MRI technique, Table 2 summarizes the field strength, imaging evaluation, b value, number of reporting radiologists, pulse sequence, and diagnostic criteria of DWI in each study. Similarly, the characteristics of 18F-FDG PET/CT in the nine studies are displayed in Table 3. Figure 2 shows the methodological quality assessment for the six studies of DWI and the nine studies of 18F-FDG PET/CT. All the included studies used pathological diagnosis as a reference. Three of six DWI studies and only one of nine 18F-FDG PET/CT studies reported the time interval between examination and pathological confirmation. Six of six DWI studies and eight of nine 18F-FDG PET/CT studies had the same reference standard. Two of six DWI studies reported that the reference standard was interpreted blind to the MRI results, and no study described interpreting the reference test without knowledge of the 18F-FDG PET/CT results. Six of six DWI studies and six of nine 18F-FDG PET/CT studies provided clinical data when interpreting the two imaging techniques.

Diagnostic accuracy of DWI and 18F-FDG PET/CT
The pooled results are shown in Figure 3. To compare the summary estimates of the two imaging techniques in the evaluation of nodal staging of GC patients, we compared DWI and 18F-FDG PET/CT on pooled sensitivity, specificity, and AUC using the Z test. The results indicated that DWI had an advantage over 18F-FDG PET/CT in sensitivity (0.79 vs. 0.52, P < 0.001) and AUC (0.81 vs. 0.66, P < 0.001), and no difference in specificity between the two imaging examinations was detected (0.69 vs. 0.88, P = 0.06).
To better understand the clinical diagnostic performance of the two imaging techniques, we used the corresponding values of MDCT from Wang's meta-analysis, published in 2015, as a reference [43].

Heterogeneity analysis
Our analysis disclosed strong heterogeneity in both sensitivity (I² = 89.7%, P < 0.001) and specificity (I² = 98.4%, P < 0.001) among the 18F-FDG PET/CT studies. The univariable meta-regression and subgroup analyses of sensitivity and specificity of 18F-FDG PET/CT are presented in Figure 5 and Table 5. Eight studies that utilized qualitative analyses showed much lower sensitivity than quantitative analyses (0.47 vs. 0.84, P < 0.001), but the analysis method failed to explain the heterogeneity of specificity. Six studies that utilized GE equipment exhibited a higher specificity (0.96 vs. 0.47, P < 0.001) than a study that utilized non-GE equipment. Seven studies with fewer than 100 subjects showed higher specificity than studies with more than 100 subjects (0.93 vs. 0.25, P < 0.001). The ethnicity of participants failed to explain the heterogeneity (P = 0.44 for sensitivity and P = 0.83 for specificity). Deeks' funnel plots provided evidence of publication bias for PET/CT studies (P < 0.001) but not for DWI studies (P = 0.58) (Figure 6).

DISCUSSION
The treatment strategies and prognoses of GC subjects are heavily dependent on accurate staging before surgery. Generally, preoperative N-staging assessment based on imaging modalities, compared with T-staging, remains less precise and leaves much room for improvement [56][57][58]. Among the conventional imaging modalities for lymph node evaluation of GC patients, the value of MDCT, EUS, MRI, and PET has been investigated by meta-analyses [42][43][44]. DWI and PET/CT are newer imaging techniques, but their diagnostic efficacy for lymph node involvement in GC has been inconsistently reported [40,41,49,51]. We performed this systematic review and meta-analysis to provide evidence to guide the selection of imaging modalities for assessing metastatic lymph nodes in patients with GC. Among the 15 DWI and 18F-FDG PET/CT studies included in our meta-analysis, DWI achieved a higher sensitivity than PET/CT for lymph node staging in GC patients (0.79 vs. 0.52, respectively, P < 0.001). However, no difference in specificity between DWI and 18F-FDG PET/CT was detected (0.69 vs. 0.88, respectively, P = 0.06). Consequently, the superiority of DWI can be explained by the observation that DWI produced fewer false-negative results (1 − sensitivity) for N-staging of GC. However, the specificity of DWI was not fully satisfactory, and thus overtreatment and an excessive resection range might occur because of the relatively high rate of false-positive results (1 − specificity). The poor sensitivity of 18F-FDG PET/CT resulted in a high number of false-negative findings (1 − sensitivity), which was similar to the results of Yun et al. [59] and Yang et al. [55], suggesting that positive lymph nodes would be missed and potentially resectable GC patients would receive inappropriate therapy.
The sROC curve and its AUC are used to describe the relation between sensitivity and specificity across studies and to give an overall estimate of test performance [60]. A preferred test has an AUC close to 1, whereas a poor test has an AUC close to 0.5. The AUC for DWI is significantly higher than that for 18F-FDG PET/CT (0.81 vs. 0.66, P < 0.001), indicating that DWI might be more accurate for nodal staging in GC patients. However, neither AUC is high enough for the technique to be sufficient for nodal staging of GC patients in clinical practice.
Currently, MDCT is the most frequently used imaging modality for GC staging before surgery [61]. To better understand the clinical value of DWI and 18F-FDG PET/CT for N-staging of GC patients, we compared the summarized sensitivities, specificities, and AUCs of the two imaging modalities with those of MDCT in a previous meta-analysis performed by Wang et al. [43]. That meta-analysis, covering 6,726 subjects, estimated the sensitivity, specificity, and AUC to be 0.67, 0.84, and 0.83, respectively. The sensitivity of MDCT is inadequate for detecting metastatic lymph nodes, so it is essential to determine the accuracy of other imaging techniques for N-staging of GC patients and to assess whether they could replace MDCT. In this study, DWI achieved higher sensitivity but lower specificity, and 18F-FDG PET/CT had lower sensitivity and equivalent specificity, when compared with MDCT (data are shown in Table 4). DWI and 18F-FDG PET/CT had no obvious advantage in AUC over MDCT in preoperative lymph node assessment of GC patients, and the two techniques are more costly and require longer scanning times than MDCT. Thus, DWI and 18F-FDG PET/CT are unlikely to take the place of MDCT in the short term for lymph node staging of GC patients. Nevertheless, the higher sensitivity and lower specificity of DWI indicate that DWI and MDCT could be complementary imaging modalities and that the combined use of the two techniques might improve the accuracy of lymph node staging [62].
DWI, a magnetic resonance imaging (MRI) technique, detects the restricted diffusion of water molecules in tissues at the cellular level through measurement of the ADC value [23,63]. DWI has increasingly been used to characterize various diseases and diseased lymph nodes, including alimentary tract cancers such as gastric or colorectal cancers, and has shown promising results [27,[64][65][66]. However, the value of DWI in the detection and characterization of lymph nodes in GC remains controversial [48,49,64]. In the past, DWI of the abdomen and pelvis was easily distorted by respiratory motion and gastrointestinal peristalsis [67,68]. Recent technological developments in MRI, including echo-planar imaging sequences, multichannel coils, parallel imaging, high-field magnets, and volumetric acquisition of T1-weighted images, allow the acquisition of DWI that is largely free of motion artifacts and provides excellent anatomical detail [23,69,70]. In this meta-analysis, we found that DWI displays an acceptable sensitivity and moderate specificity for N-staging, but based on the AUC value, DWI is not adequate for nodal staging of GC patients in clinical practice.
In the N-staging of GC patients, the accuracy of DWI is poor when based only on lymph node size on imaging, but the detection rate is much improved when the ADC value is integrated as a diagnostic criterion [23,50,71]. Zhou et al. [50] reported that the mean ADC value of metastatic lymph nodes (1.059 × 10⁻³ mm²/s) was lower than that of non-metastatic lymph nodes (1.4029 × 10⁻³ mm²/s). The overall accuracy was higher when the diagnostic criterion was based on ADC (ADC < 1.189 × 10⁻³ mm²/s) than when it was based on the short axis diameter (SAD) (SAD > 5.05 mm) [50]. A study by Giganti et al. showed that the ADC value differed significantly according to local invasion, nodal involvement, and the AJCC Cancer Staging Manual, 7th Edition TNM stage groups for GC, indicating that the ADC was potentially useful in the staging and risk stratification of GC patients [46]. Although Hasbahceci et al. [49] reported that the ADC value did not aid in distinguishing metastatic lymph nodes, this contrary conclusion was based on a study of only 23 GC subjects and was not convincing. In addition, the ADC value correlates with histological features, response to treatment, and long-term prognosis [72][73][74][75]. An increased ADC has been associated with long-term survival [72]. Thus, quantitative analysis based on the ADC value is a promising method for N-staging assessment in the future.
Although the I² test showed no substantial heterogeneity among the selected DWI studies, imaging techniques still varied widely, including preparation (gastric emptying, reduction of peristalsis, and filling and distension of the stomach), instruments (field strength, pulse sequence, b value), and procedures (breath-holding, method of measuring the ADC value) [47][48][49][50][51]. These inconsistencies could limit the accuracy of DWI for staging [76,77]. However, because of the limited number of included DWI studies, no subgroup analyses were carried out to explore their impact on the diagnostic performance of DWI. As a result, large-scale, high-quality studies are still needed.
Integrated PET/CT directly combines PET data on metabolic changes with highly detailed anatomic CT information, which helps detect lesions earlier and provides more precise location information than CT or PET alone in numerous cancers [78]. However, 18F-FDG PET/CT achieved inadequate sensitivity for evaluating lymph node metastasis in GC patients. On the one hand, physiologic uptake in the stomach is inherently high; thus, when primary tumor uptake is not dramatically increased, the detection of lymph node metastasis is difficult [79,80]. On the other hand, most of the included studies adopted only qualitative analysis by visual image reading, without incorporating the maximum standardized uptake value (SUVmax) [21, 38-41, 52, 53, 55]. In our subgroup analysis, quantitative analysis based on SUVmax displayed a much higher sensitivity than qualitative analysis (0.84 vs. 0.47), and the method of imaging analysis was regarded as a potential source of heterogeneity in our meta-analysis [54]. In fact, unified criteria for diagnosing lymph node metastasis are lacking, and the cutoff values of SUVmax differ among quantitative analyses [54,81,82].
When its long acquisition time and expense are also taken into account, 18F-FDG PET/CT is not recommended as the first choice for clinically assessing lymph node staging in GC patients [38,41,80]. Finding more sensitive imaging agents and establishing criteria for N-staging have been proposed to improve the performance of PET/CT [54,83,84].
The present meta-analysis has several limitations. First and foremost, no head-to-head comparison between MRI and 18F-FDG PET/CT was done in a single study, which might cause some bias in patient selection or adjustment. Second, the assessment of the two techniques for lymph node staging in some included studies was patient-based. A region-by-region or node-by-node comparison, which could provide crucial information and a more accurate assessment, was not performed in this study. Third, a wide variation in imaging techniques, which is a potential source of heterogeneity, likely influenced the assessment of the diagnostic accuracy of 18F-FDG PET/CT and DWI. Fourth, no single reference standard strategy for the histopathologic analyses was applied, and a wide variation in histopathologic types of GC was found across the studies. This factor was not analyzed because it was too mixed to classify. Finally, potential publication bias was found in the 18F-FDG PET/CT studies by use of Deeks' funnel plot.
In conclusion, DWI achieved higher sensitivity than 18F-FDG PET/CT, with equivalent specificity, in preoperative N-staging of GC patients. When the efficacy of MDCT was taken as a reference, DWI represented a complementary imaging technique and 18F-FDG PET/CT had limited usefulness in the preoperative assessment of N-staging. Therefore, large-scale randomized controlled trials are needed to confirm their clinical value and to establish reference standards for measurement, analysis, and cutoff values of lymph node diagnosis for both DWI and 18F-FDG PET/CT.

Search strategy
A comprehensive computer-aided literature search of the PubMed, Cochrane Library, and Embase databases was carried out to find relevant articles about DWI or PET/CT for N-staging in GC subjects (last updated July 12, 2017). We used a search algorithm based on a combination of the following terms: ("DW-MRI" OR "diffusion-weighted magnetic resonance imaging") OR ("FDG" OR "18F-FDG" OR "FDG-18F" OR "fluorodeoxyglucose" OR "PET/CT" OR "positron emission tomography/computed tomography") AND ("stomach cancer" OR "gastric cancer" OR "stomach carcinoma" OR "gastric carcinoma" OR "GC") AND ("lymph node metastasis" OR "nodal metastases" OR "lymphatic metastasis" OR "lymph node involvement" OR "nodal involvement" OR "lymph node status" OR "lymph node staging" OR "N staging" OR "TNM").

Inclusion and exclusion criteria
The inclusion criteria were as follows: (i) Studies investigating the diagnostic value of DWI or 18F-FDG PET/CT ...
The exclusion criteria were as follows: (i) Studies that focused on DWI or 18F-FDG PET/CT in monitoring chemoradiotherapy response or prognosis rather than on lymph node diagnosis. (ii) Studies that included subjects who received preoperative radiotherapy or chemotherapy, which might cause tumor down-staging. (iii) Case reports, reviews, meeting abstracts, in vitro studies, or animal experiments for GC, or studies with fewer than 20 samples. (iv) Studies with data errors in statistical analyses.

Data extraction and quality assessment
Two reviewers (XZ and YL) independently reviewed the titles and abstracts of the retrieved articles according to the above-mentioned selection criteria. Articles were excluded if clearly ineligible. The full-text versions of the selected articles were then evaluated to determine their eligibility for inclusion. Finally, the two reviewers cross-checked each independently selected study. Any disagreement was resolved by consultation with a third author (BC). For each eligible study, the following information was extracted: first author, year of publication, country and ethnicity of the study subjects, study design, technique characteristics for DWI and 18F-FDG PET/CT, reference standard, and diagnostic criteria. The numbers of true-positive, false-positive, true-negative, and false-negative results were also extracted. The methodological quality was assessed according to the revised tool of the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2), which consists of 11 question items with responses "yes," "no," or "not available" [85]. The two reviewers (XZ and YL) independently extracted the relevant data and assessed the methodological quality of each included study. Any discrepancies were resolved by discussion.
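As a minimal illustration of how the extracted counts translate into per-study accuracy, the sketch below computes sensitivity and specificity from a 2×2 table. The class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Hypothetical container for the four counts extracted per study."""
    tp: int  # true positives: metastatic nodes called positive on imaging
    fp: int  # false positives: non-metastatic called positive
    fn: int  # false negatives: metastatic called negative
    tn: int  # true negatives: non-metastatic called negative

    def sensitivity(self) -> float:
        # Proportion of pathologically positive cases detected by imaging.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of pathologically negative cases correctly ruled out.
        return self.tn / (self.tn + self.fp)


# Illustrative counts only (assumption, not study data).
example = TwoByTwo(tp=40, fp=10, fn=10, tn=40)
print(example.sensitivity(), example.specificity())
```

The false-negative and false-positive rates discussed in this paper are simply 1 − sensitivity and 1 − specificity of such a table.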

Statistical analysis
For patient-based analyses, we calculated the pooled sensitivities and specificities of DWI and PET/CT, as well as their 95% CIs, using the weighted average method. The sROC curve was constructed for the included studies, and the AUC was calculated to estimate the overall accuracy. Comparison between the two techniques was performed by use of the Z test, which can detect differences in sensitivity, specificity, and AUC between the two imaging modalities. The following formula was used: $Z = (\mathrm{VAL}_1 - \mathrm{VAL}_2) / \sqrt{\mathrm{SE}_1^2 + \mathrm{SE}_2^2}$, where VAL denotes the pooled estimate of sensitivity, specificity, or AUC for each modality, and SE is the standard error of the corresponding estimate.
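The Z formula above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' Stata procedure: the helper names `se_from_ci` and `z_test` are hypothetical, and each SE is approximated from the reported 95% CI by assuming a symmetric normal interval, which is crude for asymmetric intervals such as the PET/CT specificity (0.61–0.97).

```python
import math


def se_from_ci(lower: float, upper: float) -> float:
    """Approximate a standard error from a 95% CI.

    Assumption: the interval is symmetric and normal, so its width
    equals 2 * 1.96 * SE.
    """
    return (upper - lower) / (2 * 1.96)


def z_test(val1: float, se1: float, val2: float, se2: float):
    """Return (Z, two-sided P) for the difference of two pooled estimates."""
    z = (val1 - val2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Pooled estimates and 95% CIs as reported for DWI vs. 18F-FDG PET/CT.
z_sens, p_sens = z_test(0.79, se_from_ci(0.73, 0.85),
                        0.52, se_from_ci(0.39, 0.64))
z_spec, p_spec = z_test(0.69, se_from_ci(0.61, 0.77),
                        0.88, se_from_ci(0.61, 0.97))
print(f"sensitivity: Z = {z_sens:.2f}, P = {p_sens:.4f}")
print(f"specificity: Z = {z_spec:.2f}, P = {p_spec:.4f}")
```

Under these assumptions, the sketch gives a two-sided P value below 0.001 for sensitivity and about 0.06 for specificity, in line with the reported comparison.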
To better understand the diagnostic performance of the two imaging techniques, we took the performance of MDCT for nodal staging of GC as a reference. The pooled estimates of sensitivity, specificity, and AUC with 95% CIs were derived from Wang's meta-analysis, which was published in 2015 [43].
Heterogeneity among the eligible studies was assessed by the I² test, with I² > 50% suggesting substantial heterogeneity among studies. When the I² index was higher than 50%, a random-effects model was used; otherwise, a fixed-effect model was used. If substantial heterogeneity existed among the included studies, the potential sources of heterogeneity were explored by meta-regression and subgroup analyses. The threshold effect is an important additional source of variation in meta-analysis. To assess whether a threshold effect existed, Spearman's correlation test was used.
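A common way to obtain I² is from Cochran's Q under inverse-variance weighting. The sketch below illustrates that calculation and the model-selection rule described above; the paper does not state exactly how its software derived I², so the function names and the input values are hypothetical.

```python
def i_squared(estimates, variances):
    """I² (in percent) from Cochran's Q with inverse-variance weights.

    Assumption: `estimates` are per-study effect estimates on a common
    scale and `variances` are their within-study variances.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0
    # I² is the share of total variation beyond what chance would explain.
    return max(0.0, (q - df) / q) * 100.0


def choose_model(i2_percent: float) -> str:
    """Apply the 50% rule described in the text."""
    return "random-effects" if i2_percent > 50.0 else "fixed-effect"


# Illustrative inputs only (assumption, not study data).
homogeneous = i_squared([0.5, 0.5, 0.5], [0.01, 0.01, 0.01])
heterogeneous = i_squared([0.2, 0.8], [0.01, 0.01])
print(homogeneous, choose_model(homogeneous))
print(round(heterogeneous, 1), choose_model(heterogeneous))
```

Truncating negative values at zero follows the usual convention that I² cannot fall below 0%.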
Deeks' funnel plots were used to assess potential publication bias for DWI and 18F-FDG PET/CT in assessing preoperative N-staging of primary GC subjects. Stata 14.0 software was used to run all the statistical analyses. Values of P < 0.05 were considered statistically significant.

Author contributions
ML and BC contributed to conception and design of the study. XZ and YL contributed to the data acquisition, analysis and interpretation of the data. ML and HS contributed to writing and editing of the manuscript. All authors commented on drafts of the paper and approved the final draft of the manuscript.