Efficacy of D5F3 IHC for detecting ALK gene rearrangement in NSCLC patients: a systematic review and meta-analysis

We conducted a pooled analysis comparing the efficacy of an immunohistochemistry (IHC) assay using the D5F3 antibody with that of fluorescence in situ hybridization (FISH) for detecting ALK gene rearrangement in NSCLC patients. A total of 32 studies involving 5805 samples were included in this review. Pooled sensitivity for D5F3 IHC was 0.97 (95%CI: 0.93-0.98), specificity was 0.99 (95%CI: 0.98-1.00), PLR was 119.20 (95%CI: 57.79-245.89), NLR was 0.03 (95%CI: 0.02-0.07), DOR was 3526.66 (95%CI: 1344.71-9249.03), and AUROC was 1.00 (95%CI: 0.99-1.00). Meta-regression revealed that specimen type was a source of heterogeneity for specificity, and specimen type and FISH signal distance were sources of heterogeneity in the joint model. Subgroup analysis revealed that sensitivity and specificity were higher when the FISH signal distance standard was ≥ 2 than when it was ≥ 1. Sensitivity was higher for tumor specimens than for cell specimens; specificity was higher for cell specimens than for tumor specimens. In conclusion, the D5F3 IHC assay was nearly as effective as FISH for detection of ALK gene rearrangement in NSCLC patients.


INTRODUCTION
Lung cancer is one of the most frequently diagnosed and deadly cancers worldwide, and non-small cell lung cancer (NSCLC) accounts for more than 85% of lung cancer cases [1,2]. Anaplastic lymphoma kinase (ALK) gene rearrangement is responsible for approximately 3%-5% of NSCLC cases [3,4]. Studies have reported that ALK tyrosine kinase inhibitors (ALK-TKIs) increase response rates [5][6][7] and progression-free survival times [8] in ALK fusion-positive NSCLC patients. NCCN guidelines thus recommend detection of ALK gene fusion in metastatic NSCLC, and the use of the ALK-TKI crizotinib as a first-line treatment in ALK-positive patients [9].
It is therefore crucial to assess the efficacy of different methods for detecting ALK rearrangement. At present, fluorescence in situ hybridization (FISH), polymerase chain reaction (PCR), and immunohistochemistry (IHC) are commonly used to detect ALK fusion. NCCN guidelines recommend FISH as the gold standard for detecting ALK fusion [10], but FISH is expensive and labor-intensive. Studies examining PCR for detection of ALK rearrangement found that PCR had high diagnostic performance compared to FISH [11,12]. However, PCR also resulted in a high false positive rate, suggesting that high-quality RNA is needed for this method [13]. Recent studies [14][15][16] have also examined the clinical use of IHC with D5F3, 5A4, and ALK1 antibodies, the most cost-effective method, for detecting ALK rearrangement. Jiang et al. [17] conducted a meta-analysis of the diagnostic operating characteristics of IHC and concluded that IHC assays using D5F3 and 5A4 antibodies reliably detected ALK rearrangement in NSCLC. However, that study did not specifically examine the diagnostic value of IHC screening methods using the D5F3 antibody. Although the D5F3 antibody is commonly used in the clinical setting, its efficacy remains largely unknown; we therefore conducted a systematic review and meta-analysis to assess the diagnostic accuracy of the D5F3 antibody in detecting ALK rearrangement.

Selection of studies
A total of 352 literature citations were identified in database searches and 1 citation was identified from reference lists. Ultimately, 32 studies [12,14,15, containing 37 trials and 5805 samples that met the inclusion criteria were included in this meta-analysis. Of these samples, 833 were positive and 4845 were negative for ALK gene rearrangement. Shen et al. [24] examined automated or manual detection of a D5F3 clone (Ventana) to detect ALK rearrangement; the rest of the studies used a different D5F3 clone (CST). Zhou et al. [12] used two different samples, while Fu et al. [28] used EML4-ALK and ALK probes, to detect ALK rearrangement by FISH. Ying et al. [41] used two IHC positive standards. Figure 1 shows a flow diagram of the literature search process.

Characteristics of included studies
All eligible studies were published between 2012 and 2015, and 19 of the studies were from China. Each study included one trial, with the following exceptions: Shen et al. [24] included 3 trials, Zhou et al. [12] included 2 trials, Fu et al. [28] included 2 parallel trials, and Ying et al. [41] included 2 trials. Of the 32 studies, 23 examined NSCLC specifically and 9 examined lung adenocarcinoma. Tumor tissues or cell blocks were used as FISH specimens, and details of the FISH and IHC procedures differed among the studies. The main characteristics of the included studies are shown in Supplementary Data 1.

Quality assessment
The methodological quality of the 32 studies was assessed using the QUADAS-2 tool. Risk of bias analysis revealed that 21 studies had a high risk of bias in flow and timing, 8 studies in patient selection, 5 studies in the index test, and 1 study in the reference standard. Regarding applicability, 31 studies raised low concern for the reference standard, 7 studies for the index test, and 30 studies for patient selection. Overall, the quality of the included studies was acceptable. The details of the methodological quality analysis are summarized in Figures 2 and 3.

Meta-regression analysis
The overall I² was 96.50, and the boxplot (Figure 6) showed that heterogeneity existed in the studies. Therefore, meta-regression was used to investigate potential sources of heterogeneity. Sample size, country, histological type, cells counted using FISH, FISH signal distance, supplier, manual or automated counting, specimen type, and IHC positive standard were included in the meta-regression analysis of sensitivity, specificity, and the joint model. Meta-regression results are shown in Table 1 and indicated that specimen type was a likely source of heterogeneity for specificity; specimen type and FISH signal distance were likely sources of heterogeneity for the joint model.

Subgroup analysis
The results of subgroup analysis are shown in Table 2. Different FISH standards influenced the sensitivity and specificity of IHC. When the FISH signal distance standard was ≥ 2, the sensitivity was 0.987 (95%CI: 0.983-0.991) and the specificity was 0.983 (95%CI: 0.978-0.987); when the standard was ≥ 1, the sensitivity was 0.952 (95%CI: 0.881-0.987) and the specificity was 0.963 (95%CI: 0.933-0.982). Regarding sample type, the sensitivity and specificity were 0.984 (95%CI: 0.960-0.996) and 0.965 (95%CI: 0.951-0.976) for tumor samples and 0.936 (95%CI: 0.914-0.954) and 0.987 (95%CI: 0.983-0.991) for cell samples, respectively. In summary, sensitivity and specificity were higher when the FISH standard was at least 2 than when it was at least 1. Additionally, sensitivity was higher for tumor specimens than for cell specimens, while specificity was higher for cell specimens than for tumor specimens.

DISCUSSION
Soda et al. first reported the presence of ALK rearrangement in NSCLC in 2007 [51]. While ALK fusion was previously found to occur in approximately 3%-5% of NSCLC patients [3,4], we found ALK rearrangement rates of nearly 14%. This may be partially due to the large number of Chinese patients in our analysis; more studies are needed to examine possible differences among patient populations. Crizotinib and the newer ALK-TKIs ceritinib and alectinib are commonly used in clinical practice. While crizotinib improves progression-free survival (PFS) and response rates in NSCLC patients [52], almost all patients treated with crizotinib eventually experience progression. Ceritinib, with an overall response rate of 56% and median PFS of 7 months, is effective in ALK-positive metastatic NSCLC patients who progress during or are intolerant to crizotinib treatment [5]. Alectinib is also effective in those who progressed during crizotinib treatment, with a response rate of 50% and an 11.2-month median duration of response; additionally, it is effective for treating CNS disease [53].
FISH, IHC, and PCR are currently used to detect ALK gene fusion in NSCLC patients. FISH, a molecular diagnostic test, has been approved by the FDA for detecting ALK rearrangements and is regarded as the gold standard by most researchers. However, FISH is expensive and relatively labor-intensive. PCR, another method for detecting ALK fusion, is associated with high false positive rates, limiting its clinical utility. Some researchers have reported that IHC, a cost-effective and simple assay, can be used to screen for ALK rearrangements [40,54], and three different antibodies are used for this purpose in clinical practice. D5F3, one of these antibodies, is widely used in clinics; we therefore conducted this systematic review and meta-analysis to assess the diagnostic accuracy of D5F3 IHC assays in detecting ALK rearrangement in NSCLC.
We examined 32 studies including 5805 samples among the 353 initially identified literature citations in this study. High pooled sensitivity and specificity values indicated that D5F3 IHC had high diagnostic accuracy in detecting ALK rearrangement. The pooled PLR and NLR values further indicated high diagnostic accuracy for D5F3 IHC in clinical practice, and the pooled DOR and AUROC values indicated that D5F3 IHC had near-perfect discriminating ability. Although significant heterogeneity existed in our analysis, meta-regression revealed specimen type as a source of heterogeneity for specificity and specimen type and FISH signal distance as sources of heterogeneity for the joint model. Subgroup analysis revealed that sensitivity and specificity were higher when the FISH signal distance standard was ≥ 2 compared to ≥ 1. Finally, regarding specimen type, sensitivity was higher for tumor specimens than for cell specimens, while specificity was higher for cell specimens than for tumor specimens.
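To illustrate what the pooled likelihood ratios imply for an individual patient, the arithmetic behind a Fagan diagram can be sketched as follows. This is a minimal sketch rather than the analysis performed in this study: the function name is hypothetical, and the 5% pre-test probability is an assumption taken from the upper end of the 3%-5% rearrangement rate cited above, not a value estimated here.

```python
def post_test_probability(pre_test: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability.

    Uses the odds form of Bayes' theorem:
    post-test odds = pre-test odds * likelihood ratio.
    """
    if not 0.0 < pre_test < 1.0:
        raise ValueError("pre_test must lie strictly between 0 and 1")
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pooled estimates reported in this review.
POOLED_PLR = 119.20
POOLED_NLR = 0.03

# Assumed pre-test probability (illustrative only).
ASSUMED_PRE_TEST = 0.05

after_positive = post_test_probability(ASSUMED_PRE_TEST, POOLED_PLR)
after_negative = post_test_probability(ASSUMED_PRE_TEST, POOLED_NLR)

print(f"Probability of ALK rearrangement after a positive D5F3 IHC: {after_positive:.3f}")
print(f"Probability of ALK rearrangement after a negative D5F3 IHC: {after_negative:.4f}")
```

Under this assumed pre-test probability, a positive result raises the probability of rearrangement to roughly 0.86, and a negative result lowers it to below 0.002; other pre-test probabilities would give different values.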
The present study expands upon the findings of Jiang et al. [17]. We searched more databases to identify studies and included more samples (5805 vs. 3754 patients), allowing our pooled analysis to more reliably evaluate the diagnostic value of IHC with the D5F3 antibody. Moreover, analysis with the QUADAS-2 tool indicated that the overall quality of the included studies was good. We also used the GRADE system to assess levels of evidence, and meta-regression and subgroup analysis to investigate sources of heterogeneity in our meta-analysis.
However, the limitations of this study should also be considered when interpreting these results. First, the included studies used different standards for positive IHC and FISH results, possibly reducing the diagnostic accuracy. Second, the publication bias of the studies included in this investigation was significant. Moreover, our review only included studies published in English or Chinese; potentially relevant studies in other languages were excluded. Finally, although nonsmokers had a higher incidence of ALK gene rearrangement, we were unable to conduct subgroup analyses based on smoking status due to the limited availability of relevant information in the included studies.
In conclusion, our results indicate that IHC assays using the D5F3 antibody are nearly as effective as FISH in the detection of ALK gene rearrangement. Because IHC is more cost-effective and less labor-intensive than FISH, it might be a better method for primary ALK rearrangement screening in NSCLC patients.

MATERIALS AND METHODS
Ethical approval and informed consent were not necessary for this meta-analysis study, which was conducted according to the guidelines of the

Data extraction and quality assessment
Hu Ma and Wen-Xiu Yao independently extracted the following information: study features (last name of the first author, year of publication, and country); number of samples; IHC and FISH details; type of specimen and outcome data (TP, FP, FN, and TN). The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool [18] and Review Manager 5.3 (The Nordic Cochrane Centre, The Cochrane Collaboration, 2014) were used to evaluate the methodological quality of selected studies. We assigned low, high, or unclear risk of bias values to the patient selection, index test, reference standard, and flow and timing domains; applicability concerns were also evaluated in the first three domains.
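As an illustration of how the extracted outcome data (TP, FP, FN, and TN) map onto per-trial accuracy estimates, one possible record layout is sketched below. The class name, field names, and counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Outcome data for one trial: D5F3 IHC (index test) versus FISH (reference)."""

    study: str
    tp: int  # IHC positive, FISH positive
    fp: int  # IHC positive, FISH negative
    fn: int  # IHC negative, FISH positive
    tn: int  # IHC negative, FISH negative

    @property
    def n(self) -> int:
        """Total number of samples in the trial."""
        return self.tp + self.fp + self.fn + self.tn

    @property
    def sensitivity(self) -> float:
        """Proportion of FISH-positive samples that were IHC positive."""
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        """Proportion of FISH-negative samples that were IHC negative."""
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only.
trial = TwoByTwo(study="Hypothetical trial", tp=48, fp=2, fn=2, tn=148)
print(trial.n, round(trial.sensitivity, 3), round(trial.specificity, 3))
```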

Level of evidence
Jian-Guo Zhou used GRADEpro GDT (available at http://gdt.guidelinedevelopment.org/central_prod/_design/client/index.html), an all-in-one web solution for summarizing and presenting health care decision-making information, to evaluate level of evidence, and an evidence profile was generated to summarize the results. The GRADE system identified the following four rating grades of evidence quality [19]: High: further research is very unlikely to change our confidence in the effect estimate; Moderate: further research is likely to have an important impact on our confidence in the effect estimate and may change the estimate; Low: further research is very likely to have an important impact on our confidence in the effect estimate and is likely to change the estimate; and Very Low: any effect estimate is very uncertain.

Statistical analysis
A bivariate regression model [20] was used to calculate the pooled sensitivity, specificity, PLR, NLR, DOR, and AUROC and associated 95% confidence intervals (CIs). A bivariate boxplot, the Chi-square test, and the inconsistency index (I²) were used to assess heterogeneity; an I² greater than 50% indicated significant heterogeneity [21]. Meta-regression and subgroup analysis were also used to investigate potential sources of heterogeneity. In addition, a likelihood ratio scattergram was used to evaluate the exclusion and confirmation capacities of the index test. Finally, clinical utility and publication bias were assessed by a Fagan diagram and Deeks' funnel plot, respectively. STATA version 12.0 (Stata Corp, College Station, TX) was used for statistical analyses.
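The summary measures named above follow standard definitions, which can be sketched as below. This is an illustrative sketch only: it applies the textbook formulas to a single pair of hypothetical sensitivity and specificity values and to a hypothetical Cochran's Q, and it does not reproduce the bivariate model fitted in STATA. The function names are assumptions introduced for this example.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float, float]:
    """Return (PLR, NLR, DOR) from a sensitivity/specificity pair.

    PLR = sens / (1 - spec), NLR = (1 - sens) / spec, DOR = PLR / NLR.
    Both inputs must lie strictly between 0 and 1 so that no ratio is undefined.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    plr = sensitivity / (1.0 - specificity)
    nlr = (1.0 - sensitivity) / specificity
    return plr, nlr, plr / nlr


def i_squared(q: float, df: int) -> float:
    """Inconsistency index in percent: I² = max(0, (Q - df) / Q) * 100."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical inputs, for illustration only.
plr, nlr, dor = likelihood_ratios(sensitivity=0.96, specificity=0.98)
heterogeneity = i_squared(q=100.0, df=36)

print(f"PLR={plr:.1f}, NLR={nlr:.3f}, DOR={dor:.0f}, I²={heterogeneity:.0f}%")
```

Because the pooled estimates in this study come from a bivariate model, applying these formulas to the rounded pooled sensitivity and specificity would not be expected to reproduce the reported PLR, NLR, and DOR exactly.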

CONFLICTS OF INTEREST
There is no conflict of interest for any author regarding the publication of this manuscript.