Quantitation of cell-free DNA in blood is a potential screening and diagnostic marker of breast cancer: a meta-analysis

Introduction Increased cell-free DNA (cfDNA) levels in circulating blood have been associated with a higher likelihood of breast cancer; however, researchers have not reached agreement on its diagnostic value. Materials and Methods We conducted a meta-analysis of 12 retrospective studies to clarify the value of cfDNA quantification in the screening and diagnosis of breast cancer. PubMed, EMBASE, Web of Science and the Cochrane Library were searched from January 2000 to October 2016. Pooled estimates were calculated using a random effects model. Results In total, 1003 primary breast cancer patients, 283 patients with benign breast disease and 575 healthy individuals were included. The pooled diagnostic odds ratio (DOR) was 27.63 (95% confidence interval [CI]: 10.96~69.61, I2 = 86.2%, P < 0.001) for discriminating between breast cancer patients and healthy controls; the area under the summary receiver operating characteristic (SROC) curve was 0.91 (95% CI: 0.17~1.00). Analysis of the available data on distinguishing breast cancer from benign breast disease showed a pooled DOR of 35.30 (95% CI: 7.58~164.39, I2 = 79.9%, P = 0.002) with an area under the SROC curve of 0.91 (95% CI: 0.89~0.93). Meta-regression and subgroup analyses suggested that geographical region, which reflects ethnic group distribution, explained most of the heterogeneity. Conclusions Quantification of cfDNA is a promising test for the screening and diagnosis of breast cancer, but population-based standardization of test methods is required before clinical use.


INTRODUCTION
Breast cancer is the most frequently diagnosed cancer and the leading cause of cancer death among females worldwide [1]. The incidence of breast cancer is still rising, especially in South America, Africa, and Asia [1][2]. Early diagnosis is usually considered central to reducing breast cancer mortality. Although population-based mammographic screening [3] has contributed to the reduction in the breast cancer death rate in North America and some well-developed European countries, its cost limits its application in developing countries [4]. Therefore, relatively inexpensive tumor biomarkers are needed.

Meta-Analysis www.impactjournals.com/oncotarget
Circulating cell-free DNA (cfDNA) is tumor-derived, fragmented extracellular DNA that has been detected in human body fluids. As early as 1977, Leon et al. reported that breast cancer patients had increased cfDNA in their serum [5]. With advances in knowledge and technology, detection of cfDNA has been applied in prenatal diagnosis [6], disease surveillance, and tumor diagnosis [7]. From cfDNA, we can obtain information regarding cancer, including gene mutations, copy number variation and DNA integrity [8][9][10][11].
Quantification of cfDNA has emerged as a possible tool for the early diagnosis of cancers, and its value has been confirmed in liver cancer and non-small cell lung cancer [12][13].
Numerous clinical studies [14][15][16][17] have emphasized that the concentration of cfDNA can be used to distinguish between malignant breast cancer and benign breast nodules. However, these results remain inconsistent, and a systematic analysis is required to confirm the diagnostic accuracy of cfDNA quantification. Thus, this meta-analysis was designed to investigate the value of cfDNA quantification as a biomarker for breast cancer and to assess the factors that may influence its diagnostic efficiency.

Data sources and searches
Four main databases were searched for related studies: PubMed, EMBASE, Web of Science and the Cochrane Library (from 2000 to October 2016), without language limitations. The combinations of search terms included "breast neoplasms," "cell-free," "DNA," and all of their possible variations. To improve sensitivity, the search was supplemented manually using the citation lists of retrieved articles.

Study selection
Studies were included if they were conducted in breast cancer and provided diagnostic data, such as the area under the receiver operating characteristic (ROC) curve, specificity and sensitivity. No restrictions on methodology or study type were imposed. Case reports, reviews, conference presentations and duplicate publications were excluded. Two independent reviewers, Z. Liu and H. Wang, evaluated the eligibility of studies. Disagreements were resolved by consensus.

Data extraction
All the data analyzed were from published papers. Predesigned forms were used for data collection. The following details were recorded: first author, year of publication, country, type of study, and numbers of cases categorized by age, estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER-2), tumor stage, and lymph node metastasis. Sample materials, testing methods, and reference genes were collected as well. Diagnostic data were extracted directly from the articles or estimated from ROC curves based on the Youden index (sensitivity + specificity - 1), as others have published [13,18]. When nuclear DNA and mitochondrial DNA were both measured, the former was used in the analysis.
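The Youden-index step described above can be illustrated with a minimal sketch. It assumes the ROC curve has already been read off as a list of (sensitivity, specificity) pairs; the function name `youden_optimal_point` and the example points are hypothetical and are not taken from any included study.

```python
from typing import List, Tuple


def youden_optimal_point(
    roc_points: List[Tuple[float, float]]
) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, J) for the ROC point that
    maximizes the Youden index J = sensitivity + specificity - 1."""
    if not roc_points:
        raise ValueError("roc_points must contain at least one point")
    sens, spec = max(roc_points, key=lambda p: p[0] + p[1] - 1.0)
    return sens, spec, sens + spec - 1.0


# Hypothetical points digitized from a published ROC curve.
example_points = [(0.95, 0.40), (0.85, 0.80), (0.60, 0.95)]
best_sens, best_spec, best_j = youden_optimal_point(example_points)
print(f"sensitivity={best_sens:.2f}, specificity={best_spec:.2f}, J={best_j:.2f}")
```

The selected sensitivity and specificity, together with the reported group sizes, are what allow a fourfold table to be rebuilt for a study that reports only a ROC curve.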

Quality assessment
Quality assessment was conducted according to the Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) [19,20], which is composed of four domains: patient selection, index test, reference standard, and flow and timing. For each domain, the risk of bias and concerns about applicability were evaluated and rated (low risk, high risk, or unclear). The results of the quality assessment were used to investigate potential sources of heterogeneity. Two reviewers scored all the studies independently. Differences of opinion were discussed until agreement was reached. If the two reviewers could not reach consensus because the quality of an article was unsatisfactory, the article was excluded.

Statistical analysis
Fourfold (2 × 2) tables for the diagnostic test were rebuilt from the primary publications. Pooled estimates of sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and area under the ROC curve (AUC), with their corresponding 95% confidence intervals (95% CI), were calculated using the bivariate random effects model. The diagnostic accuracy for discriminating between breast cancer patients and healthy individuals or patients with benign breast disease was presented as diagnostic odds ratios (DORs) with 95% CI. Statistical heterogeneity was tested with the Q statistic, and the variation in OR attributable to heterogeneity was evaluated with the I2 statistic. The source of heterogeneity was further investigated using meta-regression and subgroup analyses based on regions, time points of sample collection, sample materials, test methods, and reference genes. Publication bias was tested according to the methods described by Deeks et al. [21]. Sensitivity analysis was carried out using the leave-one-out method, also known as influence analysis. All statistical analyses were performed in STATA v14.0 (Stata Corporation, TX). Statistical significance was defined as P < 0.05.
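For readers less familiar with these measures, the following sketch shows how the per-study quantities are commonly derived from a single fourfold table, and how I2 follows from Cochran's Q. It is an illustration only and not the STATA bivariate model used in this analysis; the class and function names, the 0.5 continuity correction for zero cells, and the example counts are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class FourfoldTable:
    tp: float  # diseased, test positive
    fp: float  # non-diseased, test positive
    fn: float  # diseased, test negative
    tn: float  # non-diseased, test negative


def accuracy_measures(t: FourfoldTable, z: float = 1.96) -> dict:
    """Sensitivity, specificity, PLR, NLR and DOR (with a log-scale CI)."""
    tp, fp, fn, tn = t.tp, t.fp, t.fn, t.tn
    # Assumed convention: add 0.5 to every cell if any cell is zero.
    if min(tp, fp, fn, tn) == 0:
        tp, fp, fn, tn = tp + 0.5, fp + 0.5, fn + 0.5, tn + 0.5
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    dor = (tp * tn) / (fp * fn)
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    ci = (math.exp(math.log(dor) - z * se_log_dor),
          math.exp(math.log(dor) + z * se_log_dor))
    return {
        "sensitivity": sens,
        "specificity": spec,
        "plr": sens / (1 - spec),
        "nlr": (1 - sens) / spec,
        "dor": dor,
        "dor_ci": ci,
    }


def i_squared(q: float, df: int) -> float:
    """I2 (%) = max(0, (Q - df) / Q) * 100."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical single study: 100 patients and 100 controls.
measures = accuracy_measures(FourfoldTable(tp=84, fp=15, fn=16, tn=85))
print(measures)
print(i_squared(q=10.0, df=4))
```

Note that the DOR equals PLR divided by NLR, so a single number summarizes both likelihood ratios; the confidence interval is computed on the log scale because the log DOR is approximately normally distributed.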

Study characteristics and quality assessment
A total of 1385 records were retrieved and 12 studies [14][15][16][22][23][24][25][26][27][28][29][30] involving 1807 people met the eligibility criteria (Figure 1). The characteristics of the included studies are summarized in Table 1. All were retrospective studies, conducted in 8 countries (Portugal, Germany, United Kingdom, Switzerland, Egypt, Israel, China, and Thailand) and 3 regions (Europe, Middle East, and East Asia). Across the 12 studies, 1003 primary breast cancer patients and 575 healthy individuals were included; 283 patients with benign breast disease were involved in 8 studies. Eight of the included studies used quantitative PCR-based methods, though they varied in the DNA extraction process and the choice of reference genes. The other four used fluorescence quantitative analyses. Of the 12 studies, 8 collected plasma or serum samples before treatment (surgery or chemotherapy), and the others collected samples after surgery. Most breast cancer patients were diagnosed in their fifties and were in stage II~III. Details of reference genes, cut-off values and AUCs in each study are summarized in Table 2. Information on the pre-analytical procedures of cfDNA quantification is listed in Supplementary Table 1.
The evaluation of the risk of bias and concerns regarding applicability is graphically displayed in Supplementary Table 2. Gal et al. [26] arranged four groups of breast cancer patients; all the other studies reported that patients were enrolled consecutively or randomly within a certain period of time. All included patients had a clear diagnosis supported by pathological evidence. In general, all studies met the predefined criteria for our review questions and were high in applicability.
The concentration of cfDNA was also associated with the molecular subtypes of breast cancer and with nodal status (Supplementary Table 3). Although cfDNA levels were not associated with ER or PR status, they were significantly higher in HER-2-positive patients than in HER-2-negative patients [15,28]. Most researchers [15][25][26][27][28][30][31] found a higher level of cfDNA in node-positive patients than in node-negative patients, and the more lymph nodes were involved, the more cfDNA could be detected in circulation [28]. Also, two studies [15,30] reported significant differences in the level of cfDNA between node-positive and node-negative patients, which suggests that the concentration of cfDNA might be a marker of early lymph node metastasis in breast cancer.

Major clinical heterogeneity sources
To find the source of heterogeneity, random effects meta-regression analysis was first used to assess the covariates involved in these studies. The factors "regions (Europe, Middle East, and East Asia)", "sample materials (plasma or serum)", "test methods (polymerase chain reaction (PCR)-based or not)", "time of sample collection (before or after treatment)", and "method of extraction (QIAamp DNA Blood Mini Kit, Qiagen DNA Blood Mini Kit, NucleoSpin Plasma XS Kit, and Other)" were included in univariate analysis. The results suggested that "regions" accounted for most of the heterogeneity and explained 82.54% of the between-study variance. Although "test methods" alone was not responsible for the heterogeneity (Coef. = 0.70, 95% CI, -1.64~3.04, P = 0.53), in multivariate analysis "test methods" and "regions" together explained 87.09% of the between-study variance. The other factors could not explain the heterogeneity; details of the calculation are shown in Table 3.
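As an aid to interpreting these percentages, the sketch below shows one common way the proportion of between-study variance explained by covariates is computed in random effects meta-regression: the relative reduction in the between-study variance (tau squared) when covariates are added. This is an assumption about the usual definition, not a statement of the exact routine used here; the function name and the tau squared values are hypothetical, and truncating negative values at zero is a simplification.

```python
def variance_explained(tau2_null: float, tau2_model: float) -> float:
    """Percentage of between-study variance explained by covariates.

    tau2_null:  between-study variance from the model without covariates.
    tau2_model: residual between-study variance with covariates included.
    Negative values (covariates that explain nothing) are truncated to 0.
    """
    if tau2_null <= 0:
        raise ValueError("tau2_null must be positive")
    return max(0.0, (tau2_null - tau2_model) / tau2_null) * 100.0


# Hypothetical values chosen so that the result is about 82.5%.
print(round(variance_explained(tau2_null=1.0, tau2_model=0.1746), 2))
```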
The geographical grouping (Europe, Middle East, and East Asia) reflects the differences in ethnic and genetic characteristics among these regions; thus, subgroup analysis was used to further evaluate the accuracy of the diagnostic test in each region (Figure 4).
Leave-one-out sensitivity analysis (Supplementary Figure 2) with a random effects model revealed that the pooled effect estimates were not influenced by any single study and remained stable.
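The leave-one-out procedure can be sketched as follows. This is a simplified, univariate illustration that pools log DORs with the DerSimonian-Laird random effects estimator; it is not the bivariate STATA analysis reported here, and the function names and input values are hypothetical.

```python
import math
from typing import List, Tuple


def dersimonian_laird(effects: List[float],
                      variances: List[float]) -> Tuple[float, float, float]:
    """Pool effects (e.g. log DORs); return (pooled, standard error, tau2)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, math.sqrt(1.0 / sum(w_re)), tau2


def leave_one_out(effects: List[float], variances: List[float],
                  z: float = 1.96) -> List[Tuple[int, float, float, float]]:
    """For each omitted study i, return (i, pooled DOR, CI low, CI high)."""
    if len(effects) < 2:
        raise ValueError("at least two studies are required")
    results = []
    for i in range(len(effects)):
        sub_e = effects[:i] + effects[i + 1:]
        sub_v = variances[:i] + variances[i + 1:]
        pooled, se, _ = dersimonian_laird(sub_e, sub_v)
        results.append((i, math.exp(pooled),
                        math.exp(pooled - z * se), math.exp(pooled + z * se)))
    return results


# Hypothetical log DORs and their variances for four studies.
log_dors = [3.1, 2.4, 4.0, 3.3]
var_log_dors = [0.30, 0.25, 0.60, 0.40]
for idx, dor, low, high in leave_one_out(log_dors, var_log_dors):
    print(f"omit study {idx}: DOR={dor:.2f} (95% CI {low:.2f}~{high:.2f})")
```

If the pooled DOR and its interval change little regardless of which study is omitted, no single study drives the result, which is the sense in which the estimates are described as stable.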

DISCUSSION
The incidence and mortality of breast cancer in less developed countries are still increasing. Early screening and reliable diagnosis are essential for breast cancer treatment. Various investigations [14][15][16][22][23][24][25][26][27][28][29][30][31][32][33][34] have suggested that the quantification of cfDNA is potentially an effective biomarker for breast cancer diagnosis. In this meta-analysis, quantification of cfDNA as a screening tool for breast cancer had a pooled sensitivity of 84% and a pooled specificity of 85%. The DOR was 27.63 and the AUC was 0.91, both of which indicate a high degree of overall diagnostic accuracy. The cfDNA level was reported to be higher in HER-2-positive or node-positive patients but was not associated with ER and PR status.
Subgroup and meta-regression analyses found that "regions" was the main source of heterogeneity, revealing the heterogeneity of breast cancer among ethnic groups. Additionally, as a diagnostic tool for distinguishing breast cancer from benign breast disease, a high level of cfDNA also pointed to a higher risk of breast cancer, with a DOR of 35.30 and an AUC of 0.91, indicating the diagnostic value of cfDNA quantification for breast cancer.
Sources of heterogeneity were evaluated by meta-regression and subgroup analyses. Nearly 90 percent of the heterogeneity could be explained by the combined effect of "regions" and "test method", with the former being the main factor. Subgroup analysis provided more details on the ethnicity-based regional grouping. The heterogeneity among the three studies from East Asia [23,27,29] decreased to a low level (I2 < 30.0%), and the studies from China [23,27], both of which used real-time quantitative PCR, showed no heterogeneity (I2 = 0%). The four studies from the Middle East were conducted in Egypt and Israel, and moderate heterogeneity remained within this subgroup. The major covariate affecting these trials was "testing method": Hadshad et al. [15] and Mahmoud et al. [28] quantified cfDNA with PCR-based detection  [35] to measure cfDNA directly in the diluted samples. Pre-analytical procedures of cfDNA quantification do add heterogeneity according to the information we collected (Supplementary Table 1); however, only 14.50% of the between-study variance was explained by "method of extraction". Missing information and varying methods hindered further analyses, and excessive confounding factors can also lead to unreliable results. Generally, PCR is regarded as a more sensitive approach to the quantification of cfDNA; however, the standardization of cfDNA quantification methods remains one of the obstacles to further clinical application. Therefore, we recommend a unified technique in future studies of cfDNA, at least within a specific region, in order to guarantee the sensitivity of detection and establish guidelines in this area.
Recently, Lin et al. [36] published a meta-analysis to comprehensively evaluate cfDNA-based early detection methods for breast cancer (BC). They included literature measuring cfDNA quantification, integrity, methylation, loss of heterozygosity and other features, which resulted in heterogeneity between studies. In this meta-analysis, we included only studies containing DNA quantification results. Measured by the same inclusion criteria, more studies could be included in our study (n = 12 vs. n = 9). The comprehensiveness of our literature search was supported by a lack of evidence for publication bias in the Deeks and Harbord tests. Although significant heterogeneity was observed in their study, Lin et al. reported that none of the methodological covariates ("country" and "assay methods") produced major heterogeneity (P > 0.05) in meta-regression analysis. This may be caused by the loose inclusion criteria, which introduced too many influencing factors. Moreover, among the data from Kohler et al.'s study [16], Lin et al. mistook the results of the screening test using mitochondrial DNA (mtDNA) quantification (cutoff: 463282 GE/ml; sensitivity: 53%; specificity: 87%; P < 0.001) for results distinguishing BC patients from patients with benign breast diseases.
However, there remain limitations in our meta-analysis. Firstly, owing to the constraints of systematic review and meta-analysis, only population-level data could be extracted, so the associations with subtypes defined by ER, PR or HER-2 and with lymph node metastasis could not be analyzed further. Secondly, about 10% of the heterogeneity remained without a clear source. At the pre-analytical phase of cfDNA quantification, the methods of cfDNA extraction and the subsequent assessment lacked standard protocols, which may lead to heterogeneity [37].
In conclusion, this meta-analysis suggests that the quantification of cfDNA can be a potential biomarker for accurately discriminating BC patients from healthy individuals. Sensitivity analyses using various criteria to improve the quality of included studies or reduce systematic errors in the calculation did not alter the results substantially, suggesting that the results of our meta-analysis are robust. Combining cfDNA quantification with other biomarkers may be a future direction in the early diagnosis of BC.