Gene expression changes in tumor free tongue tissue adjacent to tongue squamous cell carcinoma

Due to the high frequency of loco-regional recurrences, which could be explained by changes in the field surrounding the tumor, patients with squamous cell carcinoma of head and neck show poor survival. Here we identified a total of 554 genes as dysregulated in clinically tumor free tongue tissue in patients with tongue tumors when compared to healthy control tongue tissue. Among the top dysregulated genes when comparing control and tumor free tissue were those involved in apoptosis (CIDEC, MUC1, ZBTB16, PRNP, ECT2), immune response (IFI27) and differentiation (KRT36). Data suggest that these are important findings which can aid in earlier diagnosis of tumor development, a relapse or a novel squamous cell carcinoma of the tongue, in the absence of histological signs of a tumor.


INTRODUCTION
Tongue squamous cell carcinoma (TSCC) is a subgroup of squamous cell carcinoma of the head and neck (SCCHN) the sixth most common cancer in the world. SCCHN is often classified as one disease although the region anatomically consists of several distinct structures. For the whole group of SCCHN the overall 5-year survival has not improved significantly over the last decades [1] and still is relatively low, around 60% [2].
The poor survival rate has been ascribed to late detection, a high frequency of relapses and death due to comorbidity. Relapses can be divided into tumor recurrences when tumor cells are not successfully eliminated by treatment and second primary tumors (SPTs) caused by an independent carcinogenetic process [3]. Numerous factors have been suggested as predictive of recurrence such as tumor stage, nodal status, tumor thickness/diameter and positive surgical margins [4][5][6].
One reason for development of recurrence is thought to be existence of transformed cells in areas adjacent to the primary tumor. Presence of molecular changes in the area around tumors was suggested already six decades ago as field cancerization [7]. These genetic alterations related to the neoplastic process are detectable within 7 cm from the tumor [8]. The field cancerization concept has been supported by findings of histological and molecular changes in clinically normal tissue adjacent to tumors [9][10][11][12][13]. In its original concept, field cancerization was defined as an extended region of tissue containing a limited number of oncogenic mutations from which a clone of malignant cells arose due to additional mutations. In a recently expanded "etiologic field effect" model, it was suggested that various etiological factors and their interactions generate a field of tissue changes favoring development of cancers [14].
Due to the rapid growth of TSCC and its potential to spread to the surrounding tissue where 20 to 40% of patients already have occult metastasis at diagnosis, early detection of the tumor is of utmost importance [15,16]. Numerous studies have been performed searching for molecular differences between TSCC and normal tissue and many genes have been found to be differentially expressed and involved in tumor development, such as matrix metalloproteinases (MMPs) and keratins [17,18]. In keeping with studies of other cancers that aim to identify tumor-specific alterations for diagnosis, classification and prognosis, these studies on oral SCC, including our own, have compared tumor to adjacent clinically normal tissue from the same patient and have successfully identified useful tumor-associated alterations [12,13,19]. Clinically normal tissue in the tumor proximity as control might, however, risk masking any field effects present. Theoretically such field changes may represent a pre-neoplastic condition in which identical oncogenic events occur before overt tumor development. Alternatively, gene expression changes may represent a tissue response to either the tumor itself or to the damaging environment from which the oncogenic events arise. In either scenario, gene expression alterations in clinically normal tissue from patients with oral cancer could be useful markers for early detection of novel or relapse tumors.
Aiming at characterizing changes indicative of tumor presence in clinically tumor free tissue, we mapped and compared changes in tissue adjacent to TSCC, the corresponding TSCC and tongue samples from healthy individuals.

Multivariate data analysis of gene expression profiles of individual patients and healthy volunteers
A principal component analysis (PCA) model was constructed to identify outliers and clusters within the sample population ( Figure 1). The PCA showed that tumor samples from TSCC patients, T (18 samples),  cluster together to the left while clinically normal tumor  free tongue adjacent to TSCC, TF (12 samples), and healthy control tongue tissue, C (14 samples), cluster to the right. Additionally a difference between TF and C can be observed with control samples clustering furthest away from tumors. By coloring samples according to different factors such as sex, age and RIN value separation of clusters was found to be dependent on biological differences between samples. Also the comparison of samples based on RNA extraction methods used showed no difference in expression.

Differently expressed genes in tumor free tongue tissue adjacent tumor
By comparing TF (12 samples) with C (14 samples) using the illuminaHT-12 bead chip containing 47231 probes, 614 probes, representing 554 genes, were found to be differentially expressed ( Figure 2A). The expression profiles showed four different patterns. Genes that were either progressively up-regulated (93) or progressively down-regulated (105) from C → TF → T, and genes showing a non-progressive pattern with either upregulation from C → TF and down-regulation from TF → T (168), or down-regulation from C → TF and upregulation from TF → T (188) ( Figure 2B). Of the 554 genes, 142 were not significantly differently expressed when comparing T to C, and thus represented altered gene expression specific for TF. The remaining 412 genes were, on the other hand significantly differently expressed when comparing T to C based on p < 0.01, and therefore represented tumor related changes.

Tumor free specific gene changes
Analyzing gene ontology terms for the 142 genes not significantly differently expressed between tumor and controls using WebGestalt showed cell division to be the most enriched ontology term including 10 genes, adjusted p-value 0.1542 (Table 1). By sorting the 142 genes by log fold-change (FC) between T and TF a list of the top 20 up-and down regulated genes was generated ( Table 2). Of the top 20 down-regulated genes two are involved in programmed cell death, prion protein (PRNP) and epithelial cell transforming 2 (ECT2). Of the top 20 up-regulated genes three are also involved in programmed cell death, Mucin 1 (MUC1), Cell Death-Inducing DFFA-Like Effector C (CIDEC), zinc finger and BTB domain containing 16 (ZBTB16). qRT-PCR validation for CIDEC shows a good correlation with illumina expression values, Pearson´s correlation was 0.66 with p-value < 0.001.

Progressive gene changes from control to tumor free to tumor
Of the 93 genes with progressively up-regulated expression from control to tumor free to tumor the two most enriched ontology terms were cellular response to type I interferon and type I interferon-mediated signaling pathway ( Table 3). The top gene based on FC when comparing tumor free tissue and control was interferon alpha-inducible protein 27 (IFI27) which is involved in the type I interferon signaling pathway ( Table 3).
The two most enriched ontology terms for the progressively down-regulated genes (105 genes) were keratinization and epidermal cell differentiation (Table 1), with keratin associated protein 13 (KRTAP13-1, KRTAP13-2) and keratin 36 (KRT36) as the top deregulated genes. The top 20 progressively up-and downregulated genes based on logFC are presented in Table 3, where mean expression values for all samples are also included. Validation of IFI27, KRT36 and KRTAP13-1 show a good correlation between illumina expression values and qRT-PCR, pearson´s correlation were 0.97;0.91 and 0.71 respectively, with p-values < 0.001 for all three genes.

IFI27, KRT36, CIDEC, MUC1 and ZBTB16 expression in individual patients
Among the top genes listed, CIDEC, MUC1, ZBTB16, IFI27 and KRT36, were selected for further analysis. Plotting RNA expression for each patient, 18 tumors (T), 12 tumor free samples (TF) and 14 controls (C) showed interindividual variations, especially in tumor free tissue ( Figure 3). Comparing tumor free tissue with controls showed higher expression of CIDEC, MUC1 and ZBTB16 in the majority of tumor free samples.
Four of the 5 patients that were still alive after 5 years had lower levels of IFI27 in tumor free tissue samples using LIMMA statistics, p-value < 0.01 and FC > 2 showed 554 genes to be significantly differently expressed. Of these, 412 genes were also significantly differently expressed between TSCC and C, and thus called tumor related changes. The 142 genes not significantly differently expressed between T and C, represented microenvironmental changes. B. The expression profiles showed four different patterns, progressive up-regulation from C to TF to T, progressive down-regulation from C to TF to T, up-regulation from C to TF and downregulation from TF to T and down-regulation from C to TF and up-regulation from TF to T. www.impactjournals.com/oncotarget compared to the seven patients that were dead. For KRT36, three of five patients alive expressed high levels while five out of seven dead patients expressed low levels. As Keratin 36 is not previously known as an oral keratin, its expression was evaluated in SCC, TF and C of different locations within the oral cavity using PCR. Results showed KRT36 to be tongue specific and highly expressed in normal tongue but not in normal buccal mucosa or tumor free tissue from floor of the mouth, gingiva and tonsil ( Figure 4).

DISCUSSION
Previous studies suggested that gene expression in histologically normal oral mucosa might be useful in predicting outcome and recurrence of OSCC [13,20]. While these studies included tissue collected from oral mucosa from different sites, tongue, gingiva and floor of the mouth [9][10][11] we here analyzed tissue from mobile tongue only.
Our genome-wide expression profiling revealed dysregulation of 554 genes in clinically tumor-free tongue compared to control tongue from healthy individuals. By focusing on genes that were dysregulated in uninvolved oral mucosa compared with normal oral mucosa and also dysregulated in cancer tissue a previous study identified 71 transcripts as being important in the progression of OSCC [13]. In the present study, all changes found in clinically tumor-free tongue compared to healthy controls were included irrespective of status in tumors, as tumor free   tissue not necessarily displays the same molecular features as the tumor [14]. The altered gene expression seen in tumor free tongue only and not in tumor was interpreted as microenvironmental changes in response to various exposures, whereas the progressive gene expression changes seen from controls to tumor free to tumor were classified as tumor related changes in tumor free tissue. A total of 142 genes were significantly dysregulated in tumor free tongue compared to controls, whereas their expression in controls and tumor was similar. Of the top 20 most dysregulated genes, five are involved in regulation of apoptosis. Of the three upregulated genes, (CIDEC, MUC1 and ZBTB16) CIDEC seems to have a pro-apoptotic role [21] while MUC1 is anti-apoptotic [22,23] and ZBTB16 can both be pro-and anti-apoptotic [24,25].
CIDEC, or Fsp27, is one of three members of the CIDE-family (cell death-inducing DFF45-like effector) which is important in regulation of energy homeostasis and also linked to development of different metabolic disorders like obesity and diabetes [26]. Looking at levels of CIDEC all but two of the tumor free samples showed higher levels than controls, and seven of the 12 tumor free samples had levels higher than the tumor. Whether the pro-apoptotic or the energy regulatory role was the prime activity of CIDEC cannot be judged based on the present data. MUC1 is a well-known response gene to low nutrient and hypoxia in the microenvironment and the increased expression of MUC1 could, apart from inhibiting apoptosis, be a protective response against the stress created in the field. Notably, MUC1 has been reported to be overexpressed in a variety of epithelial cancers, including SCCHN, and also plays a key role in cancer [27]. Overexpression of MUC1 in clinically tumor free tissue thus could play a dual role in cancer development. ZBTB16, zinc finger and BTB domain containing 16 (also known as PLZF or ZNF145), is a transcription factor involved in balancing stem cell self-renewal and differentiation and is tightly regulated in cell-type and developmental stage-specific manners [28]. Reactivation of factors involved in development is often seen in tumors, accordingly this factor could also have a dual tumor stimulatory role through stem cell renewal and/ or inhibition of apoptosis.
Among the tumor related factors seen in tumor free tissue, the gene ontology terms cellular response to type  I interferon and type I interferon -mediated signaling pathway including IFI27 are seen. IFI27 is suggested to be involved in proliferation of skin keratinocytes [29] and up-regulated in breast cancer, SCC of the skin and ovarian cancer [30][31][32]. In ovarian cancer IFI27 is associated with patient survival where high IFI27 expression correlates with poor disease free survival [30]. A similar trend was seen in our data, where the four patients with highest IFI27 levels died within 5 years from diagnosis while the four with lowest expression are still alive 5 years after diagnosis. Low expression of IFI27 in tumor free tongue tissue thus indicates better survival. Importantly, this correlation was only seen in tumor-free tongue tissue and not in TSCC, emphasizing the importance to evaluate also corresponding tumor free tissue. The progressively down-regulated genes showed keratinization and epidermal cell differentiation as the most enriched ontology terms. One of the identified genes with a role in epidermal cell differentiation is KRT36, a type I hair keratin with an unknown role in the oral cavity. By analyzing mRNA levels in different locations in the oral cavity, we found KRT36 to be tongue specific. Levels of KRT36 go from very high in normal control tongue tissue to extremely low in tumors, and tumor free tissue showed a big variation in expression with patients having levels  comparable to either control tongue or TSCC. So far, there is no evidence that degree of differentiation or keratinization plays a role in TSCC, but the fact that KRT36 is uniquely expressed in healthy tongue and that expression decreases and almost disappears in the majority of tumor free samples indicates that it could have a role in development of TSCC.
In summary, results show upregulation of IFI27, CIDEC, MUC1 and ZBTB16, genes important in proliferation and apoptosis respectively, in clinically normal tongue adjacent to TSCC. Furthermore, data indicate a role in tumor development for KRT36, a novel keratin in the development of TSCC.
These are important findings which can aid in earlier diagnosis of tumor development, a relapse or a novel TSCC, in the absence of histological signs of a tumor.

Study design and tissue acquisition
Between May 2002 and December 2010, 20 patients undergoing treatment for TSCC consented to a tumor biopsy (T) as well as a biopsy from clinically normal tumor free tissue adjacent to the tumor (TF) for research studies. All biopsies were collected from tongue tissue and were collected before treatment of the patients. For specific location of tongue biopsies see Table 4. At the same time a biopsy for diagnostic use was taken, certifying the diagnosis squamous cell carcinoma in all cases studied. Clinically normal tumor free tissue was always taken from the opposite side and the same location as the tumor. The biopsies called tumour free tissue, were just judged clinically, and not analysed histologically before they were subjected to RNA extraction. All patients had at least 5 years follow up time. Biopsies were collected from 20 patients in total, however, in this study material from 18 T and 12 TF (10 pairs with T and corresponding TF) was used. In addition, 14 tumor free volunteers provided a biopsy from lateral border of the tongue, denominated healthy control tongue tissue (C). These biopsies were taken at the same session by an experienced ENT-surgeon. For clinical data see Table 4. Samples had been consecutively collected and several of the patients are included in previous studies with different objectives [9,19,33]. RNA from the biopsies thus has been extracted using two different methods, RNA only (Trizol) and RNA and protein (kit from Norgen). The project was approved by the local Ethical Committee (dnr 08-003M).

RNA extraction
The fresh frozen biopsies were homogenized in either trizol or lysis buffer from RNA/protein purification kit (Norgen, Canada) using a precellys (Bertin Technologies, Artigus Pres Boreaux, France).
After homogenization, samples were treated according to protocols provided by the supplier. For samples extracted with trizol, chloroform was added to the homogenized sample and phase separated. RNA was precipitated by adding isopropanol followed by wash in ethanol. All RNA samples were dissolved or eluted in water and quality and quantity measured using nano-drop and Agilent RNA 6000 Nano kit (Agilent 2100 Bioanalyzer, Agilent Technologies, Santa Clara, CA, USA).

Illumina HT-12 bead chip array
After extraction and quality control 200 ng of RNA was labelled with TargetAmp™-Nano Labelling Kit for illumina Expression BeadChip (Epicenter) to produce cRNA, which was purified using Qiagen RNeasy MinElute Cleanup kit (Qiagen). Purified cRNA, 750 ng, was hybridized to illumina HumanHT-12 v4 bead chip and analyzed with an iScan system, according to the manufacturer's manuals. For raw data see http://www.ebi. ac.uk/arrayexpress/help/FAQ.html#cite, ArrayExpress accession E-MTAB-4678.

Data pre-processing and statistics
Raw data was exported from genome studio to R, where necq normalization was performed using the BioConductor package (http://www.bioconductor.org/). After normalization MeV software and Limma statistics (http://www.tm4.org/mev.html) were used to calculate differences between groups. For significance the criteria of p < 0.01 and FC > 2 should be fulfilled. For gene ontology enrichment terms WEB-based Gene SeT AnaLysis Toolkit (WebGestalt) (http://bioinfo.vanderbilt.edu/webgestalt/) was used.

Multivariate data analysis
Gene expression data from microarray was analyzed with an unsupervised regression method, principal component analysis, PCA [34], to provide an overview of variation in the data and detect trends and clusters in samples and variables. Data was normalized to unit variance. SIMCA 14 software (Umetrics, Umea, Sweden) was used for multivariate data analysis.

Confirmation of array data with PCR
Real-time quantitative PCR was used to confirm array results. For cDNA synthesis, 500 ng of total RNA was used in a 20 µl reaction with RevertAid H minus first strand cDNA synthesis kit (Thermo Scientific). cDNA was diluted 5x and 2.5µl used in each reaction with a total reaction volume of 10µl. For PCR amplification of cDNA, IQ sybr green supermix (Bio-Rad) was used in combination with www.impactjournals.com/oncotarget