Clinical mutational profiling of 1006 lung cancers by next generation sequencing

Analysis of lung adenocarcinomas for actionable mutations has become the standard of care. Here, we report our experience using next generation sequencing (NGS) to examine the AKT1, BRAF, EGFR, ERBB2, KRAS, NRAS, and PIK3CA genes in 1006 non-small cell lung cancers in a clinical diagnostic setting. NGS demonstrated high sensitivity. Among 760 mutations detected, the variant allele frequency (VAF) was 2–5% in 33 (4.3%) mutations and 2–10% in 101 (13%) mutations. A single bioinformatics pipeline using Torrent Variant Caller, however, missed a variety of EGFR mutations. Mutations were detected in KRAS (36% of tumors), EGFR (19%) including 8 (0.8%) within the extracellular domain (4 at codon 108 and 4 at codon 289), BRAF (6.3%), and PIK3CA (3.7%). With a broader reportable range, exon 19 deletion and p.L858R accounted for only 36% and 26% of EGFR mutations and p.V600E accounted for only 24% of BRAF mutations. NGS provided accurate sequencing of complex mutations seen in 19% of EGFR exon 19 deletion mutations. Doublet (compound) EGFR mutations were observed in 29 (16%) of 187 EGFR-mutated tumors, including 69% with two non-p.L858R missense mutations and 24% with p.L858R and non-p.L858R missense mutations. Concordant VAFs suggest that doublet EGFR mutations were present in a dominant clone and cooperated in oncogenesis. Mutants with predicted impaired kinase activity, observed in 25% of BRAF-mutated tumors, were associated with a higher incidence of concomitant activating KRAS mutations. NGS demonstrates high analytic sensitivity, a broad reportable range, quantitative VAF measurement, single molecule sequencing to resolve complex deletion mutations, and simultaneous detection of concomitant mutations.


INTRODUCTION
Approximately 30-40% of Asian patients and 10-15% of Caucasian patients with lung adenocarcinoma harbor activating mutations in the epidermal growth factor receptor (EGFR) gene. Gefitinib, erlotinib and afatinib are tyrosine kinase inhibitors (TKI) approved by the Food and Drug Administration (FDA) of the United States for treatment of patients with EGFR-mutated lung cancers [1][2][3]. In 2011, a provisional clinical opinion from the American Society of Clinical Oncology recommended testing for EGFR mutations in patients with metastatic lung cancer to predict response to TKI therapy [4]. Molecular testing guidelines for selection of lung cancer patients for TKI therapy have been published and are currently under revision by the College of American Pathologists, International Association for the Study of Lung Cancer, and Association for Molecular Pathology [5].
A variety of molecular diagnostic assays have been clinically validated for detection of EGFR mutations [6]. Although the prior gold standard of Sanger sequencing covers all EGFR mutations within exons 18-21, its analytic sensitivity (20-40% tumor cellularity) may not be adequate in the clinical diagnostic setting, where specimens with low tumor cellularity are not uncommon [7,8]. The analytic sensitivity can be improved to approximately 5% variant allele frequency (VAF) (10% tumor cellularity) using pyrosequencing, 1% VAF using mutation-specific real-time PCR assays, or even less than 1% by droplet digital PCR. Currently, there are two assays approved by the FDA for testing EGFR mutations in lung cancers: the cobas EGFR mutation test (Roche Molecular Systems, Branchburg, NJ) and the therascreen EGFR RGQ PCR Kit (Qiagen, Hilden, Germany) [9][10][11][12][13]. Both assays detect hot spot EGFR mutations by multiple separate runs of mutation-specific real-time PCR assays. These assays are not able to detect less common mutations outside their reportable ranges. A total of 150 ng DNA is needed for the cobas test. DNA input has not been quantified for the therascreen test. We have shown that 44% of specimens submitted for clinical mutational profiling were taken by biopsy or fine needle aspiration [14]. DNA isolated from biopsy or fine needle aspiration specimens containing limited tissue may not be sufficient for these assays.
Multiplexed genotyping platforms are replacing the traditional "one test-one drug" paradigm, not only because of the continuous expansion of predictive markers for targeted therapeutics but also because of the often limited tissue submitted to clinical diagnostic laboratories. Primer extension-based assays with a multiplex design, such as the Sequenom MassARRAY system, detect multiple hotspots within a panel of genes including EGFR in a single reaction while retaining an analytic sensitivity of 5% or less VAF [15,16]. In the era of precision cancer medicine, molecular diagnostic assays with a higher analytic sensitivity and a broader reportable range are warranted to provide more comprehensive mutational profiling. Next generation sequencing (NGS) technology has led to a revolution in genome discovery and will soon become the most cost-effective multiplexed sequencing platform in the setting of clinical care [7,17]. We have previously validated an NGS platform using the AmpliSeq Cancer Hotspot Panel and Personal Genome Machine in a Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory [18]. In this retrospective analysis for quality assessment, we surveyed our experience with clinical mutation detection of AKT1, BRAF, ERBB2, EGFR, KRAS, NRAS, and PIK3CA genes in 1006 lung cancer specimens using this NGS assay, including false negative calling, the capability of detecting complex deletion mutations within exon 19 of the EGFR gene, a high frequency of doublet (compound) EGFR mutations with concordant VAFs, and the association of kinase-impaired BRAF mutations with activating RAS mutations.

Positive and negative controls
No mutations were detected in 88 runs of the negative control specimen while all mutations in the positive control specimens were detected. The observed VAFs in the positive controls were highly consistent throughout the test period, demonstrating that NGS is a precise quantification assay for VAF (Supplementary Table 1).

Mutations missed by Torrent Variant Caller
Specimens with prior TKI therapy were not included in this study. One mutation was detected in 560 of 1006 tumors, 2 mutations in 70 tumors, and 3 mutations in 3 tumors. Our analysis pipeline included both Torrent Variant Caller and direct visual inspection of all amplicons within the reportable range using Integrative Genomics Viewer (IGV). A total of 15 mutations detected by IGV inspection were missed by Torrent Variant Caller (Table 1). These included 12 EGFR mutations (3 missense mutations of exon 18, 3 deletion mutations of exon 19, 3 insertion/duplication mutations of exon 20 and 3 p.L858R mutations of exon 21), 2 PIK3CA missense mutations and one ERBB2 duplication mutation. VAFs ranged from 3.2% to 65% (Figure 1A). Eleven of the 15 false negative calls by Torrent Variant Caller occurred within the first year. The false negative calls were most likely related to the bioinformatics pipeline of Torrent Variant Caller, which has been improved in updated versions. Recent versions of Torrent Variant Caller did not miss mutations in lung cancers. All insertion/deletion (indel) mutations detected within EGFR exons 19 and 20 by a capillary electrophoresis sizing assay were also detected by IGV inspection. Accordingly, sizing by capillary electrophoresis was discontinued in April 2014.
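The dual review described above can be sketched as a comparison of two call sets. The following is a minimal illustration only: the `Call` record, the function name and the example calls are hypothetical and do not reflect the output format of Torrent Variant Caller or IGV.

```python
from typing import NamedTuple


class Call(NamedTuple):
    specimen: str  # hypothetical specimen identifier
    gene: str      # gene symbol, e.g. "EGFR"
    change: str    # protein-level annotation, e.g. "p.L858R"


def missed_by_caller(caller_calls, review_calls):
    """Return calls seen on visual review but not reported by the automated caller."""
    return sorted(set(review_calls) - set(caller_calls))


# Made-up example data, for illustration only.
caller = [Call("S1", "KRAS", "p.G12C"), Call("S2", "EGFR", "p.L858R")]
review = caller + [Call("S3", "EGFR", "p.L858R")]

for call in missed_by_caller(caller, review):
    print(f"{call.specimen}: {call.gene} {call.change} not reported by the automated caller")
```

Keeping the comparison at the level of (specimen, gene, change) makes discordant calls easy to tabulate by gene and by date, which is how the false negative calls are summarized in this section.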

EGFR mutations
A total of 49 unique EGFR variants/mutations were detected in 187 (19%) tumors, including 29 tumors with 2 EGFR mutations and 10 variants/mutations outside exons 18-21 involving the extracellular domain of EGFR (four codon 108 mutations, four codon 289 mutations and two possible germline variants, p.E282K and p.H584R) (Table 2 and Supplementary Table 2). All except p.L858fs*45 caused
At the level of cDNA, there were 5 unique simple exon 19 deletion mutations (defined as an exon 19 mutation with no additional nucleotide changes within 1-2 adjacent codons) and 11 unique complex exon 19 deletion mutations (defined as an exon 19 deletion accompanied by one or more nucleotide changes in cis position within 1-2 adjacent codons) (Table 3). Complex mutations were seen in 15 (19%) of 78 exon 19 deletion mutations, including 12 with one deletion and 1 or 2 single nucleotide changes, 2 with two deletions, and one with two deletions and one single nucleotide change (Figure 1B). All the accompanying single nucleotide changes were located 3' to the deletion. All 5 simple deletion mutations were observed in 2 or more tumors while most complex deletion mutations were observed in one tumor.

Concomitant EGFR and PIK3CA mutations
There were 7 tumors with both EGFR and PIK3CA mutations. In contrast to doublet EGFR mutations, EGFR VAFs were not concordant with those of the coexisting PIK3CA mutations (Figure 2B, r = 0.07). VAFs of the EGFR mutations were equivalent to or higher than those of the coexisting PIK3CA mutations (P = 0.009 by paired Student's t-test).

Concomitant BRAF and KRAS mutations
Activating KRAS mutation was detected in 11 (18%) of 63 BRAF-mutated tumors, including one (3.0%) of 33 tumors with a kinase-activated BRAF mutation, 5 (31%) of 16 with a kinase-impaired BRAF mutation, and 5 (36%) of 14 with a BRAF mutation of unknown kinase activity (Table 5) [19][20][21][22]. None of the tumors with concomitant BRAF and KRAS mutations had the p.V600E mutation. The incidence of concomitant activating KRAS mutations was significantly lower in tumors with kinase-activated BRAF mutants than in those with kinase-impaired (P = 0.01) or kinase-unknown mutants (P = 0.006).
[7,23,24]. We have previously shown a test feasibility of 94% among the first 625 lung cancer specimens submitted for NGS testing: approximately 3% of specimens were rejected as inadequate and approximately 3% failed the NGS assay [14]. This retrospective analysis of 1006 lung cancers reaffirms the strength of NGS in clinical mutational profiling. NGS demonstrates high analytic sensitivity, broad reportable ranges, and accurate detection and annotation of complex indel mutations. With an analytic sensitivity of 10-20% VAF, Sanger sequencing would have missed the 7.8% or 27% of EGFR mutations with VAFs below 10% or 20%, respectively, in this series. The analytic sensitivity can be improved to approximately 1-5% VAF by mutation-specific real-time PCR assays such as the cobas EGFR mutation test and the therascreen EGFR RGQ PCR Kit, which were designed to detect only hot spot EGFR mutations [9][10][11][12][13].
NGS detected a variety of uncommon mutations located outside the reportable ranges of cobas and therascreen tests, including 4 codon 108 mutations  [26,27]. Complex exon 19 deletion mutations may not be accurately characterized by Sanger sequencing [28], partly because the accompanied single nucleotide change or second deletion change may be difficult to interpret in the presence of an underlying deletion mutation, especially at a lower level of VAF. Furthermore, Sanger sequencing cannot distinguish if the two adjacent sequence changes are located within the same allele or different alleles without laborious cloning of the PCR amplicons for sequencing. NGS platforms provide individual sequencing information from a single molecule and, therefore, can confirm that the two sequence changes are always located within the same allele to form a complex exon 19 deletion. With accurate detection and annotation of the complex exon 19 deletion, further studies are warranted to elucidate if the point mutation component of the complex exon 19 deletion mutations may decrease TKI efficacy.
Application of assays with broader reportable ranges may shed light on the significance of uncommon mutations. For example, mutations involving codon 33 of the KRAS gene were recently found to be oncogenic [29]. NGS detected 17 unique PIK3CA mutations including 3 novel ones. Commonly reported mutations involving codons 542, 545 and 1047 accounted for only 65% of the PIK3CA mutations. A total of 24 unique BRAF mutations including 4 novel ones were detected among 63 BRAF-mutated lung cancers. The p.V600E mutant accounted for only 24% of BRAF mutations while kinase-impaired BRAF mutants involving codons 466 and 594 were seen in 25% of BRAF mutations. In a previous study of combined lung cancer, colorectal cancer and melanoma specimens, kinase-impaired BRAF mutants were associated with a higher incidence of a concomitant activating KRAS/NRAS mutation [22]. This is confirmed by the larger cohort of lung cancer specimens in this study. In vitro studies have shown that in the presence of oncogenic RAS proteins, kinase-impaired BRAF forms a complex with CRAF and leads to hyperactivation of the CRAF/MEK/ERK cascade, suggesting that MEK inhibitors or CRAF inhibitors may benefit patients with concomitant kinase-impaired BRAF mutation and activating RAS mutation [20,21]. Dabrafenib (a selective BRAF inhibitor) alone or combined with trametinib (a MEK inhibitor) has shown efficacy in p.V600E-mutated lung cancers [30,31]. The NCI-Molecular Analysis for Therapy Choice (NCI-MATCH) Trial also includes an arm for targeting tumors with non-p.V600E BRAF mutations with trametinib (https://www.cancer.gov/about-cancer/treatment/clinicaltrials/nci-supported/nci-match, accessed 1/19/2017).
False negative results were a major concern when NGS platforms were initially implemented in the clinical diagnostic setting. During our clinical validation of this NGS platform, we found that Torrent Variant Caller alone may miss EGFR p.L858R [18]. Therefore, we included IGV inspection of each amplicon as a second analysis pipeline. We also examined indel mutations within EGFR exons 19 and 20 by a sizing assay. All EGFR mutations missed by older versions of Torrent Variant Caller were observed by IGV. Although recent versions did not miss mutations in lung cancers, version 5.0.2.1 (since December 2015) did miss an 8.4% KIT p.K558_D572del (45-base deletion mutation) in a gastrointestinal stromal tumor specimen and version 5.0.4.0 (since September 2016) missed a 2.6% BRAF p.V600E in a melanoma specimen with less than 10% estimated tumor cellularity (unpublished data). All exon 19 and 20 indel mutations detected by the sizing assay were also observed by IGV inspection. The results indicate that indel mutations of 18 or fewer bases can be reliably detected by Torrent Suite analysis combined with direct visual inspection of the binary sequence alignment/map file using IGV. Longer indel mutations such as internal tandem duplication mutations of the FLT3 gene may not be detected by NGS assays without bioinformatics pipelines designed for longer indel mutations [32,33].
Consistent VAFs over a 4-year period in positive control specimens highlighted the precise quantitative nature of NGS assays [27,34]. Analysis of VAF may yield important information regarding mutant allele-specific imbalance (such as gene amplification or loss of heterozygosity), tumor heterogeneity, and germline mutations [7,34,35]. We have shown that a lower than expected VAF indicated tumor heterogeneity while a higher than expected VAF indicated mutant allele-specific imbalance [34,35]. In this study, EGFR VAFs were equivalent to or higher than the VAFs of the coexisting PIK3CA mutations. The results suggest a higher incidence of mutant allele-specific imbalance (most likely duplication or amplification) of the EGFR gene or the presence of the PIK3CA mutation in a subpopulation of the EGFR-mutated tumors, which may contribute to TKI resistance [27]. This was confirmed by subarea analysis of a specimen containing a 2.8% PIK3CA mutation and a 12% EGFR p.L858R mutation. Only one of 8 fragments showed both mutations with concordant VAFs (data not shown). In contrast, VAFs of the doublet EGFR mutations were highly concordant except for one specimen with 7.6% p.L858R and 65% p.T790M in the context of 11-30% estimated tumor cellularity, suggesting a germline p.T790M mutation [36].
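The comparison of observed and expected VAF can be illustrated with a short sketch. It assumes, as a simplification that is not stated in this form in the text, that a heterozygous somatic mutation in a diploid region without copy number change has an expected VAF of half the tumor cellularity. The function names and category labels are hypothetical.

```python
def expected_vaf_range(cellularity_low, cellularity_high):
    """Expected VAF range for a heterozygous somatic mutation, assuming a
    diploid locus without copy number change (VAF = tumor cellularity / 2)."""
    return cellularity_low / 2, cellularity_high / 2


def interpret_vaf(vaf, cellularity_low, cellularity_high):
    """Compare an observed VAF with the range expected from estimated cellularity."""
    low, high = expected_vaf_range(cellularity_low, cellularity_high)
    if vaf < low:
        return "lower than expected"    # compatible with a subclonal mutation
    if vaf > high:
        return "higher than expected"   # compatible with allelic imbalance or a germline variant
    return "within expected range"


# The discordant doublet described above: 11-30% estimated tumor cellularity.
print("p.L858R at 7.6%:", interpret_vaf(0.076, 0.11, 0.30))
print("p.T790M at 65%:", interpret_vaf(0.65, 0.11, 0.30))
```

Under this assumption the expected range is 5.5-15%, so the 7.6% p.L858R falls within it while the 65% p.T790M lies well above it, in line with the germline interpretation given in the text.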
Doublet EGFR mutations are not uncommon (1.6% to 14%) [37][38][39][40]. A summary from 66 publications showed 96 (6%) doublets in 1621 EGFR-mutated lung cancers [41], including several tumors which were in fact complex exon 19 deletions. The incidence of doublet EGFR mutations is higher when sensitive assays with broader reportable ranges, such as NGS assays, are used. We found two EGFR mutations in 29 (16%) of 187 EGFR-mutated tumors while 11 tumors with complex exon 19 deletion were excluded. Concordant VAFs between two mutations suggest one dominant tumor population with two mutations rather than two tumor populations each containing a mutation. Both mutations may be needed to initiate the founder clone or an EGFR-mutated subclone that has become the dominant tumor population after acquiring the second mutation. Concomitant mutations between exon 19 deletion, exon 20 insertion and p.L858R are uncommon [41]. Only one doublet consisting of exon 19 deletion and p.L858R was reported among 1621 EGFR-mutated lung cancers [41]. In contrast, non-p.L858R missense mutations were often seen in doublet mutations, suggesting a lower oncogenic potential of these mutations. In vitro studies have shown a comparable or higher level of catalytic phosphorylating activity in mutants involving conservative codons at 709, 719, 768, 790 and 861 compared to the wild-type, but a lower level of kinase activity with respect to exon 19 deletion or p.L858R mutants [42][43][44][45]. Significant enhancement of kinase activity observed in doublets with p.T790M and p.L858R or exon 19 deletion suggests an additive oncogenic effect from p.T790M [43,44].
NGS demonstrates a high analytic sensitivity, quantitative measurement of VAF, single molecule sequencing of complex exon 19 deletion, and broad reportable ranges with simultaneous detection of doublet EGFR mutations and concomitant BRAF and KRAS mutations in the clinical diagnostic setting. Further studies are warranted to elucidate the clinical and/or biological significance of uncommon mutations, doublet non-p.L858R missense mutations of EGFR, and concomitant kinase-impaired BRAF mutations in lung cancers.

Materials
NGS was conducted on 1103 formalin-fixed paraffin-embedded (FFPE) specimens with a diagnosis of adenocarcinoma in situ (5 specimens), invasive adenocarcinoma (1033 specimens), adenosquamous carcinoma (10 specimens) or non-small cell carcinoma (55 specimens) of the lung submitted to the Molecular Diagnostics Laboratory at The Johns Hopkins Hospital between April 2013 and June 2016. Specimens with prior TKI therapy were not included. There were 499 (45%) resection specimens, 341 (31%) biopsy specimens, 204 (19%) fine needle aspiration specimens, 55 (5.0%) pleural or pericardial effusion specimens, 2 bronchial brushing specimens, one bronchoalveolar lavage specimen and one curetting of bone (Supplementary Table 6). Twenty-nine (2.6%) specimens failed (Supplementary Table 6). The remaining 1074 specimens with successful NGS were submitted from 1006 tumors of 987 patients (Supplementary Table 7). Tissue blocks with adequate tumor cellularity were selected by the pathologists who made the diagnosis. One hematoxylin & eosin (H&E) slide followed by 5-10 unstained slides and one additional H&E slide were prepared with PCR precautions. The H&E slide was examined and marked by PI (pulmonary pathologist), MTL and/or GZ (molecular pathologists) for subsequent macro-dissection of FFPE neoplastic tissues from 3-10 unstained slides of 5- or 10-micron thick sections. DNA was isolated from the designated area(s) using the Pinpoint DNA Isolation System (Zymo Research, Irvine, CA), followed by further purification via the QIAamp Mini Kit (Qiagen, Valencia, CA) [46]. DNA concentration was measured by Qubit 2.0 Fluorometer (Life Technologies, Carlsbad, California). After April 2015, specimens with a concentration of less than 10 ng/µl and bony specimens were concentrated with Amicon Ultra 0.5 ml centrifugal filters with ultracel-30K membrane (Millipore Sigma, Darmstadt, Germany). The Johns Hopkins Medicine institutional review board granted approval for this study.

Next generation sequencing (NGS)
NGS was conducted using AmpliSeq Cancer Hotspot Panel (v2) for targeted multi-gene amplification, as described previously [18,34]. Briefly, we used the Ion AmpliSeq Library Kit 2.0 for library preparation, Ion OneTouch 200 Template Kit v2 DL (later replaced by Ion Personal Genome Machine Hi-Q OT2 Kit) and Ion OneTouch 2 Instrument for emulsion PCR and template preparation, and the Ion Personal Genome Machine 200 Sequencing Kit (later replaced by Ion Personal Genome Machine Hi-Q Sequencing Kit) with the Ion 318 Chip and Personal Genome Machine as the sequencing platform (Life Technologies). The DNA input ranged from 1 ng to 30 ng, as measured by Qubit 2.0 Fluorometer (Life Technologies). Up to 8 specimens were barcoded using Ion Xpress Barcode Adapters (Life Technologies) for each Ion 318 chip. One to three controls (a non-template control, a normal peripheral blood control from a male, and/or positive control specimens) were included in each chip. Positive controls were mixed DNA specimens from several cell lines with known mutations as reported previously [27,34].
Redundant bioinformatics pipelines are essential for NGS analysis, as a single analysis pipeline may give false negative and false positive results. Direct visual inspection of the binary sequence alignment/map file using the Broad Institute's Integrative Genomics Viewer (IGV) (http://www.broadinstitute.org/igv/) was implemented during the validation processes of this assay after we found that Torrent Variant Caller missed the most common EGFR point mutation (p.L858R) in a lung cancer specimen [18,47]. In this study, sequencing data were analyzed using Torrent Suite (Life Technologies) as described previously [18]. Mutations were identified and annotated through both Torrent Variant Caller (Life Technologies) and direct visual inspection of the binary sequence alignment/map file using IGV. All specimens were analyzed for AKT1 (NM_005163), BRAF (NM_004333), EGFR (NM_005228), ERBB2 (NM_004448), KRAS (NM_033360), NRAS (NM_002524) and PIK3CA (NM_006218) genes. During our validation of this NGS assay, a cutoff of background noise at 2% was chosen for single nucleotide variants according to a study of 16 non-neoplastic FFPE tissues [18]. We also developed a statistical model to determine the read depth needed for a given percent tumor cellularity and number of functional genomes. With sufficient DNA input, the limit of detection is dictated by the depth of coverage (or number of sequencing reads). Approximately 150 and 500 reads are needed to detect a heterozygous mutation at a 99% confidence in a specimen with 20% and 10% tumor cellularity, respectively. The reportable ranges and reference ranges for the 7 genes have been reported previously [27,34]. BRAF mutation data before September 2014 have also been published previously together with colorectal cancer specimens and melanoma specimens [22].
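The dependence of detection confidence on read depth can be illustrated with a simple binomial sketch. This is not the statistical model referred to above, which also accounts for the number of functional genomes and is described in [18]; it assumes only that each read independently carries the mutation with probability equal to half the tumor cellularity, and it uses a hypothetical minimum number of mutant reads (`MIN_ALT_READS`) as the calling criterion. It is therefore not expected to reproduce the published figures of 150 and 500 reads.

```python
from math import comb


def expected_vaf(tumor_cellularity):
    """Expected VAF of a heterozygous mutation in a diploid locus (assumption)."""
    return tumor_cellularity / 2


def detection_probability(depth, vaf, min_alt_reads):
    """P(at least min_alt_reads mutant reads) under Binomial(depth, vaf)."""
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


def min_depth(vaf, min_alt_reads, confidence=0.99):
    """Smallest read depth at which the detection probability reaches the confidence level."""
    depth = min_alt_reads
    while detection_probability(depth, vaf, min_alt_reads) < confidence:
        depth += 1
    return depth


MIN_ALT_READS = 10  # hypothetical calling criterion, not taken from the assay

for cellularity in (0.20, 0.10):
    vaf = expected_vaf(cellularity)
    depth = min_depth(vaf, MIN_ALT_READS)
    print(f"cellularity {cellularity:.0%}, expected VAF {vaf:.0%}: depth >= {depth}")
```

The sketch shows the qualitative behavior stated in the text: halving the tumor cellularity roughly doubles the depth required for the same confidence.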

Alternative assays
Insertion/deletion mutations within exons 19 and 20 of the EGFR gene were also examined by sizing using capillary electrophoresis. PCR followed by capillary electrophoresis was conducted as described previously [18]. This was discontinued after April 2014. Mutations not reported in the COSMIC database were confirmed by either Sanger sequencing or pyrosequencing. This policy was discontinued after April 2015.

Statistical analysis
Student's t-test, χ² test or Fisher exact test was performed to calculate P values. Correlation of frequencies between two mutations was examined by Spearman's rank correlation coefficient (denoted as r) using the GraphPad Prism software (GraphPad Software, ver5, La Jolla, CA) as described previously [48].
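As a worked illustration of the Fisher exact test, the sketch below recomputes the comparison of concomitant KRAS mutations between BRAF kinase classes from the counts given in the Results (1 of 33 kinase-activated, 5 of 16 kinase-impaired, 5 of 14 kinase-unknown). The analysis in this study was performed with GraphPad Prism; the two-sided convention used here (summing all tables no more probable than the observed one) is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed table.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Rows: BRAF class; columns: KRAS-mutated, KRAS wild-type.
p_impaired = fisher_exact_two_sided(1, 32, 5, 11)  # kinase-activated vs kinase-impaired
p_unknown = fisher_exact_two_sided(1, 32, 5, 9)    # kinase-activated vs kinase-unknown

print(f"activated vs impaired: P = {p_impaired:.4f}")
print(f"activated vs unknown:  P = {p_unknown:.4f}")
```

Under these assumptions the two P values round to 0.01 and 0.006, consistent with the values reported for concomitant BRAF and KRAS mutations.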