Mutational spectrum of acute myeloid leukemia patients with double CEBPA mutations based on next-generation sequencing and its prognostic significance

The aim of this study was to profile the spectrum of genetic mutations co-occurring with CEBPA double mutations (CEBPAdm) in patients with acute myeloid leukemia (AML). Between January 1, 2012, and June 30, 2017, 553 consecutive patients with de novo AML were screened for CEBPA mutations. Of these, 81 patients classified as CEBPAdm were analyzed further by a sensitive next-generation sequencing assay for mutations in 112 candidate genes. Within the CEBPA gene itself, we found 164 mutations. The most commonly mutated sites were c.936_937insGAG (n = 11/164, 6.71%) and c.939_940insAAG (n = 11/164, 6.71%), followed by c.68dupC (n = 10/164, 6.10%). The most common co-occurring mutations were found in the CSF3R (n = 16/81, 19.75%), WT1 (n = 15/81, 18.52%), and GATA2 (n = 13/81, 16.05%) genes. Patients with CSF3R mutations had a lower four-year relapse-free survival (RFS) than those with the wild-type gene (15.3% versus 46.8%, respectively; P = 0.021). Patients with WT1 mutations had a lower five-year RFS than those without such mutations (0% versus 26.6%, respectively; P = 0.003). However, GATA2, CSF3R, and WT1 mutations had no significant influence on overall survival. There were some differences in the location of mutational hotspots within the CEBPA gene, as well as in the hotspots of other co-occurring genetic mutations, between AML patients from Chinese and Caucasian populations. Some co-occurring mutations may be potential candidates for refining the prognoses of AML patients with CEBPAdm in the Chinese population.


INTRODUCTION
Mutations in the CCAAT/enhancer binding protein α (CEBPA) gene occur in 7%-15% of all acute myeloid leukemia (AML) cases. AML with biallelic CEBPA mutations has now been acknowledged in 'The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia' as a definite entity, given its distinct biological and clinical features, as well as its prognostic significance [1]. CEBPA belongs to the basic-leucine zipper (b-ZIP) family of transcription factors, whose C-terminal regions contain two highly conserved motifs: a DNA-binding motif rich in basic amino acids and a leucine zipper dimerization motif. These proteins also contain two less conserved N-terminal transactivation domains (TADs) [2]. CEBPA mutations can occur across the whole gene, but cluster in two main hotspots: N-terminal frame-shift insertions/deletions, which cause translation, from an internal ATG start site, of a 30 kDa protein that lacks transactivation domain 1 and has a dominant negative effect over the full-length p42 protein; and C-terminal mutations, which are generally in-frame insertions/deletions in the DNA-binding or leucine zipper domains that disrupt binding to DNA or dimerization [3].
AML patients with double CEBPA mutations (CEBPAdm) show a favorable outcome, which was also observed in our previous study [4]. Both our studies and those of others suggest that the frequency of CEBPA mutations (17.1%-21.6%) may be higher in Chinese patients with AML than has been reported for populations of Western countries [4,5]. We also noticed some genetic differences between patients with AML from China and those from Western countries [4,6,7]. Although the genetic profiling of AML patients with CEBPAdm has been reported in previous studies [8,9], no data are available for Chinese patients. Furthermore, the prognostic significance of co-occurring mutations remains unclear in patients with CEBPAdm. In this study, we screened 553 patients with de novo AML and profiled genetic mutations in those with CEBPAdm (n = 81) by a sensitive next-generation sequencing assay. The prognostic significance of the top three co-occurring genetic mutations was also evaluated.

Patients' characteristics
Of the 553 consecutive patients with de novo AML, CEBPA mutations were detected in 105 patients (18

Treatment response and long-term outcome
Of the 67 patients who received induction therapy, 50 achieved complete remission (CR), 14 achieved partial remission (PR), and the remaining three were classified as non-remission (NR) after one course of chemotherapy. CSF3R, WT1, and GATA2 mutations had no influence on the CR rate (P = 0.320, P = 0.130, and P = 0.158, respectively). Finally, 66 patients who achieved CR entered long-term follow-up. The follow-up time ranged from 2 to 66 months (median: 8 months). In total, 18 patients relapsed, and 13 patients died. The five-year relapse-free survival (RFS) (Figure 4A) and overall survival (OS) (Figure 4B) rates were 20.7% and 47.0%, respectively.
We also evaluated the prognostic significance of CSF3R, WT1, and GATA2 mutations in patients with CEBPAdm. The four-year RFS of patients with CSF3R mutations was 15.3%, which was lower than that of patients with wild-type CSF3R (46.8%) (P = 0.021). The median RFS of patients with mutated and wild-type CSF3R was 10 and 43 months, respectively. Patients with WT1 mutations had a lower five-year RFS than those without the mutations (0% versus 26.6%, P = 0.003). The median RFS of patients with mutated and wild-type WT1 was 10 and 64 months, respectively. The five-year RFS rates were 38.1% and 46.4% in patients with mutated and wild-type GATA2, respectively (P = 0.641). GATA2, CSF3R, and WT1 mutations had no significant influence on OS in this study (Figure 5).

DISCUSSION
AML is a heterogeneous disease, and DNA sequencing can offer clues to its etiology and help predict the prognoses of patients with AML. With the development of new sequencing technology, more and more genetic mutations are being identified in AML patients [10]. In our previous studies, we observed some differences in genetic alterations between AML patients from China and those from Western countries [4,6,7]. The frequencies of NPM1 (15.4%) and FLT3-ITD mutations (14%) are lower in AML patients from China [7], whereas the frequency of CEBPA mutations is higher (17.1%) [4]. These results accord with the literature published by others from China [5,11,12]. It is known that AML with CEBPAdm indicates a favorable outcome, which was also confirmed in our cohort of patients [4]. However, it is unknown whether there are genetic differences among the geographic or ethnic subgroups of AML patients with CEBPAdm.
The present subset of AML patients was derived from 553 consecutive patients with de novo diagnoses, which avoided selection bias. The majority (60.49%) of patients presented with M1 and M2 subtypes, according to the FAB classification system. A normal karyotype was present in 92.31% of the patients, while aberrant karyotypes included del(9q) and trisomy 8. Only two patients with NPM1 mutations were detected in this study. All these features are consistent with previous reports [3,8].
We found that the most common CEBPA mutation types were frame-shift insertions or deletions, followed by in-frame insertions or deletions, which is in accord with previous studies [3,8]. A combination of an N-terminal frame-shift and a C-terminal in-frame mutation was present in the majority of patients in this study, as also reported previously [8,9]. Fasen et al. reported that the most frequent mutation site was p.Lys313del, followed by p.His24Alafs and p.Gln312del [8]. However, we observed a different result. The most common mutation site in the present study was p.Pro23fs, followed by p.Gln312_Lys313insGln and p.Lys313_Val314insLys. We also profiled the genetic mutations co-occurring in CEBPAdm AML patients. Interestingly, we observed that the percentage of patients with three or more co-occurring molecular mutations was higher in this study than in a previous study (25.93% versus 2.88%, respectively; χ² = 21.412, P < 0.001; [8]). We hypothesize that these differences between AML patients from Chinese and Caucasian populations may be due to their differing ethnic backgrounds.
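As a rough check on the comparison above, the Pearson χ² statistic for a 2 × 2 table can be recomputed by hand. The sketch below is illustrative only: the counts 21/81 and 3/104 are not stated in the text but are back-calculated from the reported percentages (25.93% and 2.88%), and the function name is hypothetical. Under that assumption, and without a continuity correction, the statistic comes out close to the reported value.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# ASSUMED counts, inferred from the reported percentages:
#   this study:       21 of 81 patients with >= 3 co-occurring mutations (25.93%)
#   comparison study:  3 of 104 patients (2.88%)
chi2 = chi_square_2x2(21, 81 - 21, 3, 104 - 3)
print(f"chi-square = {chi2:.3f}")
```

With one degree of freedom, the 0.001 critical value of the χ² distribution is about 10.83, so a statistic above 21 is consistent with the reported P < 0.001.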
The frequency of GATA2 mutations in CEBPAdm patients in this study (16.05%) was lower than that reported in previous studies [9,13]. There is still some controversy regarding the prognostic significance of GATA2 mutations in patients with CEBPAdm [9,13-15]. Grossmann et al. reported that GATA2-mutated patients show a longer OS than GATA2 wild-type cases (n = 95; [9]). Hou and colleagues observed that among patients with CEBPAdm, those with GATA2 mutations showed a trend toward better OS and RFS than those without (n = 62; [13]). In a univariate analysis in another study, GATA2 mutations were associated with better event-free survival (EFS) and OS (P = 0.03 and P = 0.041, respectively; n = 98; [14]). However, a further study observed no significant difference in CR rate, RFS, or OS between CEBPAdm patients with and without GATA2 mutations (n = 113; [15]). In the present study, we found that GATA2 mutations had no influence on CR, RFS, or OS. Given the relatively small number of patients in these studies, further research is still needed to evaluate the prognostic significance of GATA2 mutations in patients with CEBPAdm. Furthermore, we argue that ethnicity should also be taken into account when conducting analyses.
Recently, Lavallée et al. from Canada reported that CSF3R mutations were the most frequent mutations (29%) in AML patients with CEBPAdm [16]. Maxson and colleagues confirmed those findings in a cohort of pediatric patients with AML. They found a significant enrichment of CSF3R mutations (46%) among the CEBPA-mutated AML patients in America [17]. A high frequency of CSF3R mutations was also observed in our cohort of AML patients. In accordance with a previous study (n = 11/19, 57.89%) [17], we also found that the majority of CSF3R mutations (n = 11/16, 68.75%) were p.T618I. Collectively, these findings suggest that CEBPAdm AML patients may benefit from treatment with Janus kinase inhibitors.
Although AML with CEBPAdm indicates a favorable outcome, recent data show that more than 50% of the patients eventually relapsed when consolidated with chemotherapy alone [18]. Hence, a new marker is needed to stratify patients with CEBPAdm. Patients with CSF3R and WT1 mutations showed inferior RFS compared with those with the wild-type genes. As a result, WT1 and CSF3R mutations may be adopted as potential markers to stratify patients with CEBPAdm in the Chinese population.
Consistent with a previous study [19], we also found that the most frequent mutations in patients with CEBPAdm occurred in the tyrosine kinase signaling pathway. Exploration or evaluation of drugs targeting these pathways, and translational research integrating these molecular findings, may improve the treatment of patients with CEBPAdm.
In summary, we found some differences in the hotspots of CEBPA mutations, and in the hotspots of co-occurring genetic mutations, between AML patients from Chinese and Caucasian populations. Some of the co-occurring mutations may even be potential candidates for guiding the treatment of patients with CEBPAdm in the Chinese population. The continuation of such studies may uncover more mutational differences based on ethnicity, which may in turn reveal information pertinent to research into the etiology of AML and the treatment of AML patients with CEBPAdm.

Patients and treatment
From January 1, 2012, to June 30, 2017, 553 consecutive patients with de novo AML at our center and the Chinese People's Liberation Army (PLA) General Hospital were screened for CEBPA mutations. They were categorized into FAB subtypes (M0-M7) based on morphological diagnoses [20] (Supplementary S3).
Patients in this study were treated with the standard '3+7' regimen (daunorubicin/idarubicin + cytarabine) or, for some elderly patients, the CAG regimen (aclarubicin + cytarabine + G-CSF) for induction therapy. The response was assessed by bone marrow aspiration performed on days 14 and 28. The first consolidation therapy was the same as that generally used to achieve CR. Three to four courses of scheduled high-dose cytarabine, at 2-3 g/m², were administered for consolidation therapy. Five patients with CEBPAdm received allo-HSCT. All of the participating patients gave informed consent prior to enrolment in the study. This study was approved by the ethics committees of Jilin University and the Chinese PLA General Hospital, and was conducted in accordance with the Declaration of Helsinki.

Cytogenetic analysis
Standard culturing and chromosome-banding techniques were used to analyze the karyotypes. Clonal abnormalities were defined and described according to the International System for Human Cytogenetic Nomenclature [21].

Molecular mutation screening by next-generation sequencing
Eighty-one patients with CEBPAdm were analyzed by a sensitive next-generation sequencing (NGS) assay for 112 genes (see Table 3). The NGS assay was performed as previously described [22], covering 654 coding regions and approximately 2,610,000 base pairs. A NimbleGen SeqCap EZ Choice kit was used according to the manufacturer's protocol with some modifications. Multiplexed libraries were sequenced using 75-bp paired-end runs on an Illumina NextSeq 550AR system. Reads were aligned using the Burrows-Wheeler alignment (BWA) tool (version 0.7.5a) against human genomic reference sequences (hg19, NCBI build 37). To identify single nucleotide polymorphisms (SNPs) and short insertions and deletions (INDELs), MuTect2 was run with the recommended parameters. All mutations were annotated by the ANNOVAR software. A subset of somatic mutations was randomly selected for validation using Sanger sequencing. Cell line dilutions were prepared for evaluation of sensitivity and specificity. For AML patients in this study, the SCARF file was converted to the FASTQ format by the CASAVA software (version 1.8, Illumina). Raw sequence reads were filtered with an in-house program. Reads with more than 5% N bases, or in which at least 50% of bases had Q ≤ 5, were eliminated. The remaining reads were aligned using the BWA tool to the human genomic reference sequences (hg19, NCBI build 37) with the parameters mem -t 10 -k 32 -M. To decrease PCR duplication bias, the resulting BAM files were processed with SAMtools. Only unique reads were passed on for analysis. For identification of SNPs and INDELs, MuTect2 was run with the recommended parameters.
All mutations were annotated by the ANNOVAR software using the following resources: all annotated transcripts in RefSeq Gene; known constitutional polymorphisms as reported in human variation databases, such as 1000 Genomes (release date 20130308), the Exome Aggregation Consortium (ExAC, release date 20151129), and dbSNP (version 135), which were downloaded from ANNOVAR; and known somatic variations in myeloid and other malignancies as reported in COSMIC (version 70). To identify high-confidence somatic variants in AML samples in the absence of matched control samples, the following criteria were used: removal of all variants within intronic, UTR, and intergenic regions, and retention of only nonsynonymous, frame-shift, and stop-gain mutations in exonic regions; removal of all variants present in at least one of 81 healthy individuals; and removal of all variants with one of the following features in the MuTect results: mutation depth of less than four, Phred-scaled p-value using Fisher's exact test to detect strand bias of more than 60, or mapping quality lower than 40. Because we lacked matched normal samples, somatic mutations could not be selected by comparing a tumor with a matched normal sample. Thus, a series of steps was used to remove germline mutations and harmless mutations. Mutations were removed unless they satisfied all of the following conditions: the mutation depth was more than four; the mutation occurred in an exonic region; the mutation function was not "synonymous SNV"; the annotation from ClinVar was not "benign" or the mutation did not appear in the dbSNP135 or the 1000 Genomes Project (2012 Feb) database.
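The read-level quality filter described above (discard reads with more than 5% N bases, or with at least 50% of bases at Q ≤ 5) can be sketched as follows. This is not the program used in the study, which is not available here; the function names are hypothetical, and decoding base qualities as Phred+33 is an assumption about the FASTQ encoding.

```python
def phred33_to_scores(quality_string):
    """Decode a FASTQ quality string, ASSUMING Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def keep_read(sequence, qualities, max_n_fraction=0.05,
              low_quality=5, low_quality_fraction=0.5):
    """Return True if the read survives both filters stated in the text.

    A read is eliminated when more than 5% of its bases are N, or when
    at least 50% of its bases have a quality score of 5 or lower.
    """
    if not sequence or len(sequence) != len(qualities):
        return False  # malformed record; dropping it is a choice made for this sketch
    n_fraction = sequence.upper().count("N") / len(sequence)
    low_fraction = sum(q <= low_quality for q in qualities) / len(qualities)
    return n_fraction <= max_n_fraction and low_fraction < low_quality_fraction


# Hypothetical 20-base reads for illustration.
good = keep_read("ACGT" * 5, phred33_to_scores("I" * 20))           # Q40 throughout
too_many_n = keep_read("NN" + "ACGT" * 4 + "AC", [30] * 20)         # 10% N bases
low_quality_read = keep_read("ACGT" * 5, [5] * 10 + [30] * 10)      # 50% at Q <= 5
print(good, too_many_n, low_quality_read)
```

The boundary handling follows the wording of the text: a read with exactly 5% N bases is kept, whereas a read with exactly 50% low-quality bases is removed.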
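The variant-level criteria above can be expressed as a single predicate. The sketch below is one possible reading of those criteria, not the pipeline used in the study: the `Variant` class and its field names are hypothetical, the depth cut-off follows the "more than four" wording, and the final ClinVar/database condition is implemented literally as an "or", which is an interpretation of an ambiguous sentence.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical record holding the annotations the filter needs."""
    region: str                 # e.g. "exonic", "intronic", "UTR3", "intergenic"
    effect: str                 # e.g. "nonsynonymous SNV", "synonymous SNV", "stopgain"
    mutation_depth: int         # reads supporting the variant
    strand_bias: float          # Phred-scaled Fisher's exact test p-value
    mapping_quality: float
    in_healthy_controls: bool   # seen in at least one of the 81 healthy individuals
    clinvar: str                # e.g. "benign", "pathogenic", "" if absent
    in_dbsnp_or_1000g: bool     # present in dbSNP135 or 1000 Genomes


def is_candidate_somatic(v):
    """Return True if the variant passes every filter described in the text."""
    if v.region != "exonic" or v.effect == "synonymous SNV":
        return False
    # Depth must be more than four; strand bias above 60 or MQ below 40 is removed.
    if v.mutation_depth <= 4 or v.strand_bias > 60 or v.mapping_quality < 40:
        return False
    if v.in_healthy_controls:
        return False
    # Literal reading (assumption): keep if not ClinVar-benign OR not a known polymorphism.
    return v.clinvar.lower() != "benign" or not v.in_dbsnp_or_1000g


# Hypothetical variants for illustration only.
kept = Variant("exonic", "frameshift insertion", 25, 3.0, 60.0, False, "", False)
dropped = Variant("exonic", "nonsynonymous SNV", 25, 3.0, 60.0, False, "benign", True)
print(is_candidate_somatic(kept), is_candidate_somatic(dropped))
```

Keeping each criterion as a separate early return makes it straightforward to tighten or relax one threshold without touching the others.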

Statistics
The Statistical Package for the Social Sciences (SPSS) software (Version 17.0, SPSS Inc., Chicago, IL, USA) was used for statistical analysis. For categorical variables, the Chi-square test or Fisher's exact test was used to assess the statistical significance of differences between groups. The independent-samples t-test or the Mann-Whitney U-test was used to compare differences between groups for continuous variables. The Kaplan-Meier method was employed for survival analysis, and the log-rank test was used to compare differences between groups. P < 0.05 was considered significant in all tests.
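For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be written in a few lines. The sketch below is purely illustrative: the analyses in this study were run in SPSS, the function name is hypothetical, and the follow-up times are made-up toy values rather than patient data.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient (e.g. months)
    events : 1 if the event (relapse or death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        events_at_t = 0
        leaving_at_t = 0
        # Group all patients sharing the same follow-up time.
        while i < len(records) and records[i][0] == t:
            events_at_t += records[i][1]
            leaving_at_t += 1
            i += 1
        if events_at_t > 0:
            survival *= 1 - events_at_t / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= leaving_at_t
    return curve


# Toy data (hypothetical): five patients, two of them censored.
toy_curve = kaplan_meier([2, 5, 5, 8, 12], [1, 1, 0, 1, 0])
for month, probability in toy_curve:
    print(month, round(probability, 3))
```

Censored patients contribute to the number at risk up to their last follow-up but never trigger a drop in the curve, which is why a short median follow-up leaves the late part of the curve resting on few patients.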

Author contributions
SL, TYH, GSJ and YL conceived the study. SL, TYH, GSJ, LXL, and LW designed the study and analyzed the data. SL, LH, LXL, SJN, JFY, BO, YY, YL and LCS recruited the patients. LSS, YYP, YL and LQJ performed the next-generation sequence analysis. SL, TYH, and GSJ wrote this manuscript. All authors discussed and revised the manuscript before submission.