Targeted next generation sequencing identified a high frequency genetic mutated profile in wood smoke exposure-related lung adenocarcinoma patients

Background Wood smoke exposure (WSE) has been associated with an increased risk of lung cancer development. WSE has been related to a high frequency of EGFR mutations and a low frequency of KRAS mutations. The aim of this study was to evaluate large-scale genomic alterations in lung adenocarcinomas associated with WSE using targeted next generation sequencing. Methods DNA multi-targeted sequencing was performed in 42 fresh-frozen samples of advanced lung adenocarcinomas. The TruSeq Cancer Panel (Illumina) was used for genomic library construction and sequencing assays. Results The WSE rate was higher in women (p=0.037) and non-smokers (p=0.001). WSE correlated with mutations in the genes SMARCB1 (p=0.002), Ataxia telangiectasia mutated (ATM; p=0.004) and Kinase Insert Domain Receptor (KDR; p=0.006), and the correlation was borderline significant for RET and EGFR exon 7. Genomic alterations in the tumor suppressor gene ATM significantly co-occurred with alterations in the following genes: SMARCB1, EGFR exon 7, RET and KDR. Clinical factors associated with poor prognosis were ECOG ≥ 2 (p=0.014) and mutations in the KDR (p=0.004) and APC (p<0.001) genes. Conclusions Lung adenocarcinoma patients with WSE showed a distinctive mutation profile for the SMARCB1, ATM, EGFR exon 7, RET and KDR genes. ECOG status and KDR gene mutations were significantly associated with poor prognosis.


Oncotarget, 2018, Vol. 9, (No. 55), pp: 30499-30512. Research Paper. www.oncotarget.com

INTRODUCTION
Lung cancer is the leading cause of cancer-related deaths worldwide, with 1.6 million deaths per year [1]. In México, lung cancer accounts for 10% of all cancer-related mortality [2]. The most common etiological factor of non-small cell lung cancer (NSCLC) is cigarette smoking, which is present in almost 90% of patients in the United States [3]. In México, only 56.5% of NSCLC cases have a history of tobacco smoking, and in women this figure is only 33% [4][5][6]. This suggests that other environmental factors have a greater impact on the development of lung cancer, such as asbestos exposure, arsenic, hydrocarbons, metals, ionizing radiation, air pollution, tuberculosis and wood smoke exposure (WSE) [7,8].
Currently about 3 billion people, particularly women, use biomass and coal as indoor fuels for domestic cooking, exposing themselves to wood smoke [9]. WSE in women is considered a risk factor for lung cancer independently of smoking status [10]. Wood combustion releases polycyclic aromatic hydrocarbons such as naphthalene, retene, and phenanthrene. In vitro, these carcinogens cause DNA strand breaks, epithelial-mesenchymal transition, cell proliferation and inflammation [11], and they induce lung adenocarcinoma in mice [12].
Our group has previously reported that WSE is related to 35% of NSCLC cases in México [5]. Patients with WSE have shown a better response to treatment with tyrosine kinase inhibitors (TKI) targeting EGFR mutations [7]. Furthermore, it has been described that WSE is associated with adenocarcinoma histology, a higher incidence of EGFR mutations (up to 50% of cases) and a low frequency of KRAS mutations (6.7%) [8]. Moreover, we reported a gene expression profile of WSE-related NSCLC in which 37 genes were significantly altered and closely related to UBC and GABARAPL1, affecting the PI3K/AKT and MAPK pathways [13]. WSE is related to high levels of phosphorylated TP53, as well as promoter methylation in genes such as p16 and GATA4 [14]. However, a comprehensive genetic mutation profile in WSE-NSCLC patients and its relation to their clinical outcomes remain unexplored. The aim of the present work was to study somatic mutations and their prognostic value through genomic profiling by targeted next generation sequencing (NGS) of tumor samples from lung adenocarcinoma patients with WSE.

RESULTS
Patient selection for this study is outlined in Figure 1. Of the patients with lung adenocarcinoma, 71.4% were women. Median age was 67 years, 69% of the patients were over 60 years old and 85.7% had an ECOG of 0-1. Forty-five percent had a history of WSE and 38.1% had a history of tobacco smoking, but only 12.5% (2 cases) had both exposures (Table 1).

DISCUSSION
Around 40% of the world population uses solid fuels, including wood, for cooking and heating homes. In Mexico, 15% of households use wood as fuel for cooking, particularly those in rural areas (40.5%) and those with low socioeconomic status. The development of chronic obstructive pulmonary disease in 62% of women is not attributable to tobacco, and could be related to long-term WSE. This is associated with the observed twofold increase in lung cancer, particularly in nonsmoking Mexican women [15]. Previous reports have shown the association between WSE and lung cancer development, mainly in women [7,8,13].
Exposure to carcinogenic compounds of wood smoke produces alterations in TP53, phospho-TP53, and MDM2 expression, increasing lung cancer risk [8]. Previously, our group reported a relation between WSE, female gender, EGFR mutations and different gene expression profiles [8,13]. Our population is a complex admixture of races and ethnic groups and is difficult to characterize; therefore, we do not make distinctions according to race in our study. This could be accomplished more accurately by genetic ancestry testing [16,17]. In this study we describe a landscape

Some of these genomic alterations are not reported in the Catalogue of Somatic Mutations in Cancer (COSMIC), and may have a prognostic value for lung adenocarcinoma patients. Additionally, a comprehensive search across major genomic studies in lung adenocarcinoma revealed that these WSE-related genes are not associated with smoking history [18], showing a distinct mutation profile (Figure 4).
In the present study, we report mutations in known tumor suppressor genes such as SMARCB1 and ATM. Truncating SMARCB1 mutations were detected in 14 patients with a history of WSE and were indicative of poor prognosis. SMARCB1 is a member of the SWI/SNF chromatin remodeling complex, which is involved in DNA repair and replication and thereby controls cell growth and differentiation [19]. Truncating forms of SMARCB1 are linked to an aggressive tumor phenotype and are frequent in malignant rhabdoid tumors and epithelioid sarcomas, but are rarely found in NSCLC [20]. Loss of function in the SWI/SNF complex activates EGFR-related pathways and represents a resistance mechanism to MET and ALK inhibitors; therefore, this could be a suitable target for combined inhibition with TKIs [21].
Our findings also show frequent frameshift mutations in the ATM tumor suppressor gene leading to protein truncation in patients with lung adenocarcinoma. This is consistent with the fact that ATM mutations represent an early event in NSCLC pathogenesis and that over 40% of lung adenocarcinomas are negative for ATM protein expression [22]. ATM deficiency has been found to be an independent prognostic factor associated with worse survival in stages II/III and with chemotherapy resistance [22]. Moreover, several ATM polymorphisms are risk factors for developing lung cancer in never smokers with low levels of carcinogen exposure [23]. Loss of ATM function leads to genomic instability that can be targeted through inhibition of alternative DNA repair mechanisms in combination with TKIs, which results in better response and overall survival [22,23].
Furthermore, we report genomic alterations in the oncogenes EGFR, RET and KDR. The average frequency  [24,25]. Roughly 90% of these mutations are exon 19 deletions and the L858R mutation in exon 21. We have also reported the presence of rare EGFR mutations in exons 18-21 of the tyrosine kinase domain in 20.5% of the patients [25].
In the present study, we report novel mutations in exon 7 of EGFR, which encodes an extracellular portion of the receptor. Alterations in this region could affect ligand binding and the activation of intracellular pathways, as well as the response to antibody-based therapies such as cetuximab [26].
ATM mutations are an early event in NSCLC pathogenesis, are mutually exclusive with TP53 mutations, and may substitute for the functional role of TP53 in cancer initiation. Loss of ATM function contributes to genomic instability by impairing double-strand break (DSB) DNA repair; therefore, combined treatments with inhibitors of alternative DNA repair mechanisms have been tested. ATM-deficient NSCLC cells showed greater sensitization to ionizing radiation after cisplatin treatment, and in vivo studies showed increased sensitivity to cisplatin and AZD6738 [27].
Furthermore, we describe the presence of missense mutations close to the tyrosine kinase domain of the RET oncogene. There is a 2.5% incidence of RET missense mutations in NSCLC. These mutations, which span the extracellular cadherin-like and the intracellular tyrosine kinase domains, affect downstream signaling pathways and promote tumorigenesis [18,20]. However, the most studied RET alterations in NSCLC are gene fusions, which are mutually exclusive with EGFR mutations. NSCLC patients with RET rearrangements are generally young never smokers with high-grade, small tumors of solid subtype. RET translocations are currently targeted with different TKIs, but to date there are no therapies available for RET mutations.
In addition, we detected missense mutations in the tyrosine kinase domain of the KDR gene, which encodes the vascular endothelial growth factor receptor 2 (VEGFR-2), and these mutations were associated with shorter overall survival. The cBioPortal database reports KDR mutations in 8% and amplifications in 1% of NSCLC cases; in addition, the expression level of the VEGFR-2 protein defines molecular subsets of this malignancy. VEGFR-2 mediates the activation of EGFR-related pathways and its high expression is correlated with poor prognosis, which makes it a clinically attractive target for treatment with multiple VEGFR TKIs [18]. However, responses to anti-VEGFR-2 antibodies or TKIs are still limited, with better response rates and PFS than conventional therapies but no significant improvements in OS [28].
Some of the genomic alterations detected in WSE-related NSCLC in our study were concurrent, represented mostly by ATM mutations in combination with another tumor suppressor, such as SMARCB1, and with oncogenes such as RET, KDR and EGFR exon 7 [29]. We hypothesize that carcinogens released by WSE produce frameshift truncations, resulting in loss of protein function in the tumor suppressors ATM and SMARCB1, and subsequent mutations in the oncogenes RET, KDR and EGFR exon 7, among others involved in the development of lung adenocarcinoma. The association between these three oncogenes may highlight the activation of several signaling pathways associated with tyrosine kinase receptors, suggesting that the use of TKI combinations could be a suitable therapeutic strategy, and it would explain the better response rates observed in NSCLC patients with WSE [8,24]. Patients with driver alterations in major oncogenes, such as ALK, ROS1 and EGFR, can benefit from targeted therapies; however, the presence of concurrent mutations in tumor suppressor genes can alter the course and prognosis of the disease [26,27]. Our study is based on a small cohort, and due to the limited number of patients these results should be taken with caution, since there is always a small probability of false positives. This could be clarified in further studies that focus on the role of these genes in NSCLC associated with WSE.

Patient selection
A prospective cohort study was conducted in patients diagnosed with lung adenocarcinoma from 2014 to 2017 at the Thoracic Oncology Clinic of the Instituto Nacional de Cancerología. The protocol was approved by the scientific and ethics institutional committees (15/049/ICI and CEI/1023/15, respectively). A total of 42 patients participated in the project after signing informed consent. A detailed medical history was registered, including patient characteristics such as age, gender, smoking status, WSE, disease stage, histological classification and clinical outcome. WSE was defined as exposure to fumes resulting from burning wood in fireplaces and wood stoves for at least four hours a day over five years. The WSE index was calculated as the average number of hours spent cooking daily per total number of years, as reported previously by Behera [30].
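The exposure definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the index is read here as an hour-years product (average daily hours multiplied by years of exposure), and "at least four hours a day over five years" is read as two inclusive thresholds. The class and function names are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class ExposureHistory:
    """Hypothetical record of a patient's self-reported wood smoke exposure."""
    hours_per_day: float  # average daily hours spent cooking with wood
    years: float          # total number of years of exposure


def wse_index(history: ExposureHistory) -> float:
    # Assumption: the index is expressed in hour-years,
    # i.e. average daily hours multiplied by years of exposure.
    return history.hours_per_day * history.years


def meets_wse_definition(history: ExposureHistory,
                         min_hours: float = 4.0,
                         min_years: float = 5.0) -> bool:
    # Assumption: both thresholds are inclusive lower bounds.
    return history.hours_per_day >= min_hours and history.years >= min_years


if __name__ == "__main__":
    example = ExposureHistory(hours_per_day=6, years=20)  # illustrative values only
    print(wse_index(example), meets_wse_definition(example))
```

Keeping the index and the inclusion criterion as separate functions makes it easy to change either reading without touching the other.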

Sample processing
Tissue samples were obtained by tru-cut needle biopsies from primary tumors and they were immediately frozen in liquid nitrogen prior to storage until DNA extraction and library preparation. The pathology department performed the histologic diagnosis and quantification of the percentage of neoplastic cellularity. The procedure for DNA extraction and purification was carried out using the Genomic DNA Wizard kit (Promega, Madison, WI, USA). DNA purity was assessed by a NanoDrop-1000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA), concentration was measured using a Quantus fluorimeter, and the DNA integrity was tested by agarose electrophoresis.

Library preparation and sequencing
The commercial TruSeq Cancer Panel (Illumina), covering 48 cancer-related genes in 212 amplicons, was used (FC-130-1008, Illumina; San Diego, CA, USA). Targeted sequencing was performed on a MiSeq instrument, with an average sequencing depth per base of 1000×. ALK fusions were detected by fluorescence in situ hybridization.

Sequence analysis and variant calls
The bioinformatic workflow used for sequence analysis was the following: FASTQ files generated in the sequencer were processed with the FastQC program. Sequences were filtered with the Trimmomatic software to remove adapters. Sequences with Phred quality scores over Q30, i.e., with a base-calling accuracy of 99.9%, were aligned with BWA using hg19 as the reference genome. They were subsequently processed with the Picard tools package to prepare the alignments for GATK analysis. Genomic sites with a high propensity for insertions or deletions were realigned. The quality of reads and alignments was recalibrated and variants were called with MuTect. Statistical filters were applied to the variants obtained to distinguish actual mutations from possible artifacts. All filtered variants were annotated for their possible functional consequences with snpEff and VariantStudio, and the alignments and variants were visualized in the Integrative Genomics Viewer (Broad Institute, USA).
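The correspondence between Q30 and 99.9% accuracy follows from the standard Phred definition, Q = -10 log10(P), where P is the probability of an incorrect base call. The short Python sketch below illustrates that conversion and the decoding of a FASTQ quality string, assuming the Phred+33 ASCII offset used by current Illumina instruments. The function names are illustrative and are not part of the pipeline described above.

```python
def phred_to_error_probability(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def phred_to_accuracy(q: float) -> float:
    """Base-calling accuracy implied by Phred score q."""
    return 1.0 - phred_to_error_probability(q)


def decode_phred33(quality_string: str) -> list[int]:
    # Assumption: qualities are encoded as ASCII characters with an offset of 33.
    return [ord(ch) - 33 for ch in quality_string]


if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        print(f"Q{q}: accuracy = {phred_to_accuracy(q):.4%}")
    print(decode_phred33("?I"))  # '?' and 'I' decode to Q30 and Q40
```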

Statistical tests
Continuous variables were summarized as arithmetic means with standard deviations or medians with interquartile ranges for descriptive analysis, while categorical variables were expressed as frequencies and percentages. Either Student's t or Mann-Whitney U tests were used for two-group comparisons, according to the data distribution evaluated by the Kolmogorov-Smirnov test. Comparisons between categorical variables were assessed by Fisher's exact or χ2 tests. A p-value < 0.05 was accepted as statistically significant for two-tailed tests. All variables were dichotomized for survival curve analysis. Overall survival (OS) was measured from the day of diagnosis to the date of death or last follow-up, and comparisons among survival times were performed with the log-rank test. Data were analyzed using the SPSS software package, version 22 (SPSS, Inc., Chicago, IL, USA).
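As an illustration of how a categorical comparison such as exposure status versus mutation status can be tested, the following Python sketch computes a two-sided Fisher's exact p-value for a 2x2 table from the hypergeometric distribution. It is a minimal standard-library version and not the SPSS procedure used in the study; the counts in the usage example are hypothetical and are not study data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts: rows = exposed / unexposed, columns = mutated / wild type.
    print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```

The exact test is usually preferred over χ2 when expected cell counts are small, which is common in cohorts of this size.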

CONCLUSIONS
WSE-related lung adenocarcinoma presents genomic alterations in SMARCB1, ATM, EGFR exon 7, RET and KDR that are not associated with smoking history. Genomic changes in some of these genes had a relevant impact on overall survival in lung adenocarcinoma patients and could represent novel therapeutic targets. Further studies are required to elucidate the functional role of these genomic alterations in early events of WSE-related carcinogenesis and the implications of loss-of-function mutations in these tumor suppressor genes.