Identification and characterization of the gene expression profiles for protein coding and non-coding RNAs of pancreatic ductal adenocarcinomas.

Significant advances have been achieved in recent years in the identification of the genetic and the molecular alterations of pancreatic ductal adenocarcinoma (PDAC). Despite this, at present the understanding of the precise mechanisms involved in the development and malignant transformation of PDAC remain relatively limited. Here, we evaluated for the first time, the molecular heterogeneity of PDAC tumors, through simultaneous assessment of the gene expression profile (GEP) for both coding and non-coding genes of tumor samples from 27 consecutive PDAC patients. Overall, we identified a common GEP for all PDAC tumors, characterized by an increased expression of genes involved in PDAC cell proliferation, local invasion and metastatic capacity, together with a significant alteration of the early steps of the cellular immune response. At the same time, we confirm and extend on previous observations about the genetic complexity of PDAC tumors as revealed by the demonstration of two clearly distinct and unique GEPs (e.g. epithelial-like vs. mesenchymal-like) reflecting the alteration of different signaling pathways involved in the oncogenesis and progression of these tumors. Our results also highlight the potential role of the immune system microenvironment in these tumors, with potential diagnostic and therapeutic implications.


INTRODUCTION
In recent years, important advances have been achieved in the identification of the genetic and molecular alterations of pancreatic ductal adenocarcinoma (PDAC).Such studies have also shown that PDAC is a genetically highly-heterogeneous and complex group of tumors [1][2][3][4][5][6].However, the knowledge about the precise mechanisms underlying the development and malignant transformation of PDAC, still remain largely unknown.In this regard, global gene expression profiling (GEP) at both the mRNA and the protein levels has proven to allow identification of distinct molecular tumor subtypes in many different human cancer types [7][8][9].In recent years, many GEP studies have been also reported in PDAC [4,[10][11][12][13][14][15][16][17][18][19][20][21][22][23][24]; such studies, have mainly focused on the definition of molecular signatures associated with disease progression; but, to the best of our knowledge, only Collisson et al. [25] described (three) PDAC subtypes based on microarray analysis of GEP which were associated with different clinical outcomes and therapeutic responses: the classical, quasi-mesenchymal and exocrine-like subtypes of PDAC tumors.Currently, it is well-established that the cellular mechanisms involved in tumor genesis and progression depend, not only on the protein-coding GEP, but also on the expression profile of post-transcriptional regulators such as the miRNAs.Thus, simultaneous assessment of the mRNA and the non-coding RNA gene expression profiles may contribute to a better understanding of the molecular pathways of PDAC and a more accurate definition of the distinct molecular subtypes of these tumors.To date, only a few studies by Donahue et al. [26] and Frampton et al. [27] have combined global mRNA and miRNA expression analysis of PDAC tumors.In the former study, combined GEP data and DNA copy number alterations were investigated in a cohort of 25 primary PDAC tumors in an attempt to identify tumoral molecular profiles associated with a distinct patient survival.By contrast, Frampton et al. [27] analysed the impact of miRNA expression on the whole mRNA GEP in a small cohort of PDAC tumors (n = 9) and cell lines (n = 2) aiming at the identification of functional miRNA-mRNA interactions that could contribute to PDAC growth.
Here we evaluate the molecular heterogeneity of PDAC tumors based on simultaneous assessment of the overall GEP of both coding mRNA and non-coding RNA genes -including miRNA, small nucleolar and large intergenic RNAs-in primary tumor samples from 27 consecutive PDAC patients vs. non-tumoral pancreatic tissue.Overall, our results define a common GEP for all PDAC tumors, at the same time they confirm and extend on previous observations about the existence of two clearly distinct molecular subtypes of PDAC.

The gene expression profiling of PDAC vs. nontumoral pancreatic tissues defines two molecular subgroups of PDAC tumors
Despite there were global differences in the GEP of PDAC vs. non-tumoral pancreatic tissues (Table 1; Supplementary Tables 2 and 3), both unsupervised PCA (Figure 1A) and HCA (Figure 1B), showed two welldefined subgroups of PDAC tumor samples with distinct GEP: 1) a major group consisting of 24/27 PDAC samples (GEP-A subgroup of tumors) and 2) a minor subgroup of three PDAC tumors which clustered together, clearly apart from the GEP-A PDAC tumors (GEP-B subgroup of PDAC).Of note both the GEP-A and GEP-B subgroups of PDAC tumors also clustered separately from the nontumoral pancreatic tissue samples (n = 5; Figure 1).
Taking in account these GEP-based subgroups of PDAC tumors, supervised analysis showed a total of 2,594 mRNA and 214 small RNA altered genes among GEP-A and GEP-B tumors vs. non-tumoral pancreatic tissue samples (Table 1; Supplementary Tables 2 and 3).Upon comparing the GEP of the GEP-A and GEP-B subgroups of PDAC tumors: 1,605/2,594 (62%) and 181/214 (85%) differentially expressed mRNA and small RNA genes were associated with the GEP-A cluster, respectively, while 1,522/2,594 (59%) and 103/214 (48%) mRNA and small RNA genes were associated with the GEP-B cluster, respectively; a total of 533 (21%) mRNA and 70 (33%) small RNA transcripts were simultaneously altered in the two subgroups of PDAC tumors (Supplementary Tables 2 and 3).The altered gene profile common to the GEP-A and GEP-B tumors included increased expression of mRNA coding for the RAC1 and RHOC GTP-binding proteins, the insulin-like growth factor binding protein 3 (IGFBP3), several members of the S100A and the MMP gene families (e.g.: S100A6, 11 and 16, and MMP2, 11 and 14), as well as the PDAC-associated miRNAs miR-155 and miR-203, which are known to be typically altered in PDAC; in addition, both subgroups of PDAC tumors also showed loss of expression of normal pancreatic genes such as the CELA2A (pancreatic elastase), the CEL, PNLIP, PNLIPRP1 and PNLIPRP2 genes (pancreatic lipases and related proteins), the SERPINI2 serin peptidase inhibitor gene and the miR-216, miR-217 and miR-148 miRNAs.In turn, those genes which were found to be differentially altered in the GEP-A and GEP-B tumor subgroups, included, among other, the KRAS oncogene, the CEACAM1 and CEACAM5 epithelial marker carcinoembrionary antigens, the SERPINB5 gene, as well as the miR-21, miR-221 and miR-222 miRNAs which were all overexpressed in GEP-A vs. GEP-B tumors (Supplementary Tables 2 and 3).
Supervised analysis further showed differential expression for another 20 genes in GEP-A vs. GEP-B PDAC tumors (Supplementary Table 2).Among other altered genes, these included greater expression in GEP-A (vs.GEP-B) PDAC of the CEACAM6 gene, as well as of genes associated with the inflammatory response and chronic pancreatic diseases such as the integrin β4 and β6 genes (ITGB4 and ITGB6), the cytochrome b-245 beta polypeptide (CYBB), lysozyme (LZY), the SERPINA1 antiproteinase and the antitrypsin serpin peptidase inhibitor genes, together with genes involved in tumor metastasis and invasion -e.g. the MMP7 matrix metalloproteinase and the tetraspanin 8 (TSPAN8) genes-(Supplementary Table 2).

Functional characterization of deregulated GEP in PDAC tumors
Analysis of the biological and functional significance of the deregulated GEPs observed in our PDAC tumors, revealed > 55 significantly altered canonical pathways vs. non-tumoral pancreatic tissues.Among those pathways more commonly altered in PDAC tumors we observed increased expression of genes involved in axonal guidance, the actin cytoskeleton and/or endocytosis processes such as integrins (TGA5, ITGB1), GTP-ases (RRAS and RAC1) and actin-related proteins (ACTR3); in addition, genes that participate in the early steps of inflammatory cell responses, such as genes associated with leukocyte extravasation, cell adhesion and diapedesis, and with IL-8 signaling, together with genes involved in cell motility, were altered in tumoral tissues from both groups of PDAC, as reflected by an increased expression of the   4).Both subgroups of tumors also displayed increased expression of genes involved in bladder cancer signaling pathways and glioma invasiveness (Figure 2).

Functional characterization of GEP differentially altered in GEP-A and GEP-B PDAC
Canonical pathways found to be deregulated in GEP-A vs. GEP-B PDAC (Figure 2B and 2C) included multiple genes involved in innate and adaptive cellular and humoral immune responses.Among others, these included interleukin 18 (IL18), several IL receptors (IL2RA, IL2RG, IL10RA) and the IL1RN IL-1 antagonist, the CD80 receptor gene, major histocompatibility complex class I (HLA-A, HLA-B, HLA-E and HLA-F) and class II (HLA-DRA, HLA-DMA, HLA-DMB, HLA-DPA1 and -DQB1) molecules, toll-like receptors 4 and 6 (TLR-4 and TLR-6) and both the janus kinase family members 1 and 2 genes (JAK1 and JAK2) and their signal transducer and activator of transcription 2 gene (STAT2).In contrast to GEP-B cases, GEP-A tumors also displayed an altered expression of genes involved in cell stress, injury responses and chronic inflammatory disease pathways; this included overexpression of the COL3A1 and COL10A1 collagen genes, the PLA2G7, 10 and 16 phospholipases, the APOL1 and APOC1 apolipoproteins and the PLAT and PLAU plasminogen activatorassociated kinase genes (Figure 2B and Supplementary Table 5).Conversely, GEP-B tumors displayed a less altered GEP, which consisted of decreased expression of genes related with cell junction and intercellular adhesion -e.g. the E-cadherin (CDH1), OCLN (occludin) and CGN (cingulin) genes, and several members of the claudin gene family (CLDN1, CLDN4, CLDN7 and CLDN10)together with increased expression of the ILK signaling pathway, due to overexpression of the ILK gene and of other genes involved in the ephitelial-to-mesenchymal transition (EMT) such as SNAI1, SNAI2 and vimentin (VIM) (Figure 2C and Supplementary Table 6).Of note, GEP-B tumors also showed a GEP which was associated with other key elements of the GEP signature of EMT; thus, they showed overexpression of the N-cadherin (CDH2), TWIST1 and S100A4 mesenchymal phenotypeassociated markers, together with decreased expression of epithelial phenotype markers such as the CDH1, cytokeratins (KRT8 and KRT18), desmoplakin (DSP), the chymotrypsinogen B1 (CTRB1), insulin (INS) and GCG genes.
From all differentially expressed RNA transcripts, a combination of 63 mRNA genes overexpressed in GEP-A and 97 mRNA genes overexpressed in GEP-B tumors (vs.non-tumoral pancreatic tissues) allowed for a clear cut discrimination of these two subgroups of PDAC tumors (Supplementary Table 7).A list of those highly-discriminant genes which were found to be most differentially expressed (≥10 fold difference) in GEP-A and GEP-B tumors, with a power to classify them with a 100% accuracy, are shown in Table 3.These genes included PDAC epithelial markers (e.g., CEACAM5 and SERPINB5) for the definition of GEP-A tumors and the SNAI2 mesenchymal marker for GEP-B tumors.

miRNAs genes which may inhibit gene expression in PDAC
In order to determine the impact of the miRNAs signature on the GEP of PDAC tumors, both the miRNA and mRNA gene expression data sets were combined to investigate potential correlations between miRNAs and mRNA genes which are altered in PDAC.Evaluation of each pair of potential miRNA-mRNA interacting genes identified potential interactions for 51 inversely correlated and 139 positively correlated (absolute value of R ≥0.7; p < .0001)pairs of miRNA-mRNA genes.Based on currently available miRNA target prediction and database tools, such interactions corresponded to 27 predictable and 1 experimentally validated (miR-30a-star/SLC7A6) interactions for the negatively correlated miRNA-mRNA pairs (Table 4).Of note, both the experimentally validated pair of mRNA/miRNA genes and other 4 predicted miRNA-mRNA interactions (miR-130b-star/TSHZ3, miR-148a/BBS7, miR-148a/LIMA1 and miR-30a/PLAUR) were systematically altered in the 27 PDAC samples analyzed; in turn, another 10 predicted miRNA-mRNA pairs were specifically altered in GEP-A cases (miR-148a stem loop transcript/ACSL5, CTSE, SLC44A4, TNFRSF21 or TSPAN15, and miR-23a/COQ10A) or in GEP-B tumors (miR-1180/BMPER, miR-1244/C10orf118, miR-362-5p/FGD1 and the miR-423 stem loop transcript/ RSPO3).

DISCUSSION
PDAC is currently recognized as a genetically heterogeneous group of tumors, but limited information exists about the biological significance of such variability.In order to gain insight into the genetic heterogeneity of PDAC, here we analyzed for the first time, the global coding and non-coding GEP of a relatively large cohort of PDAC tumors vs. non-tumoral pancreatic tissues.Overall, our results showed two clearly defined subtypes of PDAC which shared a GEP clearly distinct from that of non-tumoral pancreatic tissues.Globally, this included increased expression of genes linked to PDAC cell proliferation, local invasion and metastatic capacity.Thus, the most top-ranked altered networks (e.g.: axonal guidance, inhibition of matrix metalloproteinases, semaphorin, epithelial adherent junction and Rho family of GTPases signaling pathways) are directly involved in cell-cell and cell-matrix adhesion, extracellular matrix degradation and tissue remodeling, angiogenesis, tumor cell migration and invasiveness [28][29][30][31].In addition, cytoskeleton remodeling which is essential for cell movement and growth, is also altered in PDAC tumor cells as reflected by the alteration of axonal guidance, actin cytoskeleton, virus-entry via endocytosis, clathrinmediated endocytosis, macropinocytosis signaling, as well as signaling pathways activated by the Rho family of GTPases [28][29][30]32]; of note, many of such processes had been previously described to be altered in PDAC [27,29,30,33].PDAC tumors also showed a significant alteration of the early steps of cellular immune responses; this is possibly due to a host response against the tumor [34], as reflected by the alteration of cell adhesion, diapedesis and extravasation, IL8 signaling and antigen presentation via macropinocytosis signaling [35].However, since the tumors here analyzed represented relatively advanced stages of the disease, alteration of such pathways could also be due to inflammation-mediated cell migration mechanisms [36].Altogether, these processes found to be altered in PDAC encompass a pro-tumoral scenario; in such scenario PDAC tumor cells secrete factors that actively enhance recruitment of immune cells, while activated immune cells, produce cytokines and growth factors that may exert a direct effect on the tumor cells and the stroma [37].This hypothesis was fully supported by the observation of areas containing significant leucocyte infiltrates in the tumoral vs. non-tumoral pancreatic tissues, through immunostainings for CD45 and CD15 of formalin-fixed, paraffin-embedded tissues from the same cases (data not shown).
Interestingly, in addition to the common GEP, the two subgroups of PDAC here identified also showed clearly different GEPs.Thus, enrichment in genes involved in the innate and adaptative immune response was predominantly detected in GEP-A vs. GEP-B cases, even when both subgroups of tumors presented similar levels of infiltration by inflammatory cells (data not shown).These findings, together with the increased expression of genes correlated to immune and chronic pancreatic diseases, cellular stress and injury conditions, among GEP-A vs. GEP-B cases, point out the potential involvement of immune selection mechanisms (e.g.: selection of non-immunogenic tumor-cell variants) in the former subgroup of PDAC [30].Additionally, GEP-A tumors also showed an altered expression of genes involved in cell proliferation, angiogenesis, cell motility, invasion and tumor progression (e.g.genes involved in the MSP-RON, actin nucleation by the AR-WASP complex and by the Rho, Rac, PAK, Cdc42, integrin, ERK/MAPK, Paxilin, FAK, NF-KB, calpain protease and glioma tumor invasiveness pathways [28,[38][39][40][41][42][43][44][45][46][47][48], among other genes [31,49,50]), would confer a highly-aggressive phenotype to GEP-A tumor cells.Of note, GEP-A tumors retained an epithelial GEP phenotype which includes an increased expression of epithelial markers, carcinoembrionary antigens (CEACAM1, CEACAM6 and CEACAM 5) and cytokeratins (KRT7 and KRT19).
In contrast to GEP-A tumors, GEP-B PDAC cases showed fewer specifically altered canonical pathways, despite an overall similar number of altered genes was found in both subgroups of tumors (1,183 vs. 1,012 altered genes in GEP-A vs. GEP-B cases, respectively).Of note, GEP-B cases showed no specific GEPs associated to tumor cell proliferation; moreover, they had decreased expression of genes linked to canonical pathways associated with immune responses.Thus, GEP-B tumors had: i) enhanced self-defense mechanisms against complement-dependent cytotoxicity, as reflected by overexpression of the KIT mast cellassociated molecule [51]; ii) defective expression of major histocompatibility complex (MHC) molecules which are that frequently involved in tumor immune escape [52], and/or; iii) greater cancer-driven immunosuppression as a consequence of increased expression of the programmed cell death 1 ligand 2 (PDCD1LG2) [53] and the VGFC [51] genes.Most interestingly, our results indicate activation of epithelial-mesenchymal transition (EMT) genes in GEP-B tumors as depicted by their higher expression of mesenchymal signature genes (e.g: CDH2, SNAI1, SNAI2 and VIM) and other EMT-related genes (e.g.S100A4), together with decreased expression of epithelial markers (e.g: CEACAM6, EPCAM, CDH1, KRT8 and KRT18) [54][55][56], which activate the integrin linked kinase (ILK) signaling pathway [57], inhibit genes involved in cell-cell www.impactjournals.com/oncotargetjunction signaling pathways and expression of adhesion molecules (e.g: DSG2, DSC2 and PKP2 genes) [58].Altogether, these results suggest that in GEP-B tumors, immunosuppression linked to an EMT phenotype could be involved in the pathogenesis of PDAC.Whether immunosuppression precedes or develops after acquisition of an EMT phenotype, remains to be determined.
Overall, the above results confirm and extend on previous observations about the existence of distinct molecular subgroups of PDAC tumors as identified by GEP, including a "classical epithelial" and a "quiasimesenchymal" subtype of PDAC [25].However, despite this, we failed to detect a third subtype of PDAC tumors with an exocrine-like phenotype, as previously described by Collisson et al. in a larger patient cohort [25].Such apparently discrepant results could potentially be due to differences in the size of the cohort analyzed (27 tumoral samples in our study vs. 63 PDAC samples in the series of Collisson et al.), the methodology used (e.g.macrodissected freshly-frozen PDAC tissues vs. a mixture of formalin-fixed paraffin-embedded and freshly-frozen PDAC tissues, with or without microdissection), and/or the comparison against non-pancreatic reference tissues done in our series, but not in the study by Collisson et al. [25].Of note, we also failed to confirm the previously reported association between specific GEP and the clinical and histopathological features of the disease (e.g.: the association between a mesenchymal phenotypes and both adverse tumor features and a poorer prognosis) [25,56,59].Independently of the pathogenic significance of the distinct GEP and tumor phenotypes here described, the understanding of such biological pathways may contribute to better identify more efficient treatment strategies and to e.g.avoid standard PDAC therapy with gemcitabine and 5-fluorouracil in patients with GEP-B, due to the high chemoresistance of PDAC cells with an EMT phenotype to these treatments [56,60].Despite all the above, a major concern remains regarding the functional effect of microRNA expression levels on the mRNA transcript expression.Here we identified several miRNAs to be significantly correlated with expression of specific genes at the mRNA level.Among other miRNA-mRNA pairs, the miR-30a-star emerged in our series, as significantly correlated with an increased expression of the SLC7A6 gene transcript.The SLC7A6 (solute carrier 7 member of this family of genes) has known functions in the transport of leucin, being involved in promoting cell growth in many cancers [61,62] and podocyte development [63].Furthermore, expression of the miR-30 family of miRNAs is a key element during embryonic pancreatic development to maintain the epithelial phenotype of pancreatic tissues [64], their inhibition mediating an EMT phenotype in several types of cancer [65,66].Although, we were not able to detect any other (validated) inverse correlation for other miR-30 elements-genes, decreased expression of miR-30a, miR-30c and miR-30d was found in both GEP-A and GEP-B tumors with an epithelial vs. EMT phenotype, respectively; these results suggest that the EMT phenotype is potentially promoted in all PDAC tumors, but only those tumors carrying additional molecular/genomic alterations associated with immunosuppression and/or activation of ILK signaling could more clearly acquire a mesenchymal phenotype.Other miRNAs found to be altered in PDAC were exclusively deregulated among GEP-A or GEP-B tumors.Interestingly deregulated miRNA genes in GEP-A tumors included the stem loop transcript of miR-148a.The miR-148a miRNA possibly mediates overexpression of genes involved in tumor cell growth (e.g.acetyl-CoA sintetase, ACSL5), migration (e.g. the TSPAN15 tetraspanin) with an effect also on both apoptosis and immune responses (e.g. the TNFRSF21 tumor necrosis factor receptor); in turn miR-23a inhibits the antioxidative effect of the coenzyme Q10 homologe A (COQ10A) gene.In contrast, those miRNA genes which were overexpressed in GEP-B tumors included the miR-1180, miR-362-5p and the miR-423, all of which promote tumor cell proliferation and invasion through e.g. the BMP binding endothelial regulator (BMPER), the FYVE Rho GEF and PH domain containing 1 (FGD1) and the R-spondin 3 (RSPO3) genes.
Interestingly, clear cut discrimination between GEP-A and GEP-B tumors carrying an epithelial vs. mesenchymal-like molecular profile could be obtained via a set of 63 and 97 mRNA genes overexpressed in GEP-A and GEP-B tumors, respectively, as also confirmed in an external series of 27 PDAC patients [25].These results indicate that these gene signatures could potentially serve in the future as prior knowledge for the discovery of biomarker candidates (i.e: CEACAM5, GPX2, MUC13, S100P and TMEM45B for GEP-A cases, and PAPPA and VGLL3 for GEP-B tumors) that may contribute to more efficient treatment and/or monitoring of both subtypes of PDAC tumors.In addition, in our series a small panel of 5 overexpressed PDAC markers (S100A11, GPR137B, SULF1, POSTN and miR-155) would allow precise distinction between PDAC and non-tumoral pancreatic tissues.In line with this hypothesis, strong expression of the S100A11 and GPR137B genes has been reported at the protein level in PDAC tissues, while SULF1 and POSTN are expressed at more variable patterns [67][68][69]; of note all four proteins have been also found to be secreted and present in both tumor tissues and the plasma [67,68,70] from PDAC patients.Altogether, secretion of these proteins outside the tumor cell, supports the potential utility of these genes as candidate markers for the diagnosis and monitoring of PDAC patients.
In summary, the present study provides evidence for a common GEP of tumor cells in PDAC, at the same time it confirms the genetic complexity and heterogeneity of these tumors with at least two clearly distinct and unique GEPs (e.g.epithelial-like vs. mesenchymal-like genomic profiles), potentially reflecting different pathways involved www.impactjournals.com/oncotarget in the oncogenesis and progression of PDAC.In addition, our results also highlight the potential role of the tumor microenvironment, particularly of the immune system, in PDAC, with potential diagnostic and therapeutic implications.

Patients and samples
Tumor tissue specimens were obtained at diagnostic surgery from 27 consecutive sporadic PDAC patients (18 males and 9 females; mean age of 67 years, ranging from 41 to 79 years); in addition, non-tumoral pancreatic tissue specimens were also collected from another 5 patients each having a different pancreatic disease (pancreatic fibrosis with inflammation, chronic pancreatitis, an ampullary tumor, a neuroendocrine tumor and a PDAC, respectively).All PDAC patients underwent surgical tumor resection at the Division of Hepatobiliary and Pancreatic Surgery of the University Hospital of Salamanca (Salamanca, Spain).PDAC tumors were diagnosed and classified according to Adsay et al. [71] with the following distribution: 8 cases corresponded to well-differentiated/grade I tumors; 11 to moderately-differentiated/grade II, and; 8 to poorlydifferentiated/grade III PDAC.Histopathological grade was confirmed in all cases in a second independent evaluation by an experienced pathologist.Most tumors (21/27, 78%) were localized in the head of the pancreas, while the remaining six cases were localized in the pancreatic body (1/27, 4%), the tail (3/27, 11%) and the pancreatic body/tail (2/27, 7%).Mean tumor size at diagnostic surgery was of 3.0±0.82cm, 6 cases corresponding to TNM stage IIA tumors and 21 to TNM stage IIB.The most relevant clinical and laboratory patient characteristics are summarized in Supplementary Table 1.
Pancreatic tissue samples were collected immediately after surgical resection, snap frozen and stored in OCT at -80°C (Tumor Biobank of the University Hospital of Salamanca, Red de Bancos de Tumores de Castilla y León, Salamanca, Spain).The study was approved by the local ethics committee of the University Hospital of Salamanca (Salamanca, Spain) and informed consent was given by each individual prior to entering the study, according to the Declaration of Helsinki.Once the histopathological diagnosis had been established, sections from the paraffin-embedded tissue samples were cut from three different areas representative of the tumoral tissue with > 70% tumor cell infiltration by hematoxylin-eosin staining, excluding stroma-enriched tumor areas.Selection of the neighbour areas of the tumor containing ≥70% tumor cells was performed on dissected samples stored in OCT.

RNA extraction and gene expression profiling (GEP) microarray studies
For GEP, sample preparation was performed as described in the Affymetrix GeneChip Expression Analysis Manual (Santa Clara, CA, USA).Briefly, each frozen tissue (≥0.3 g) was crushed to powder at cryogenic temperatures and homogeneized in Trizol (Life Technologies, Inc., Rockville, MD).Total RNA was then extracted using the miRNeasy mini kit according to the manufacturer's protocol (Qiagen, Valencia, CA); subsequently, the quality and integrity of the RNA was evaluated in an Agilent 2100 Bioanalyzer (Agilent Technologies Inc., Santa Clara, CA, USA).Total RNA (100-1,000ng) from both tumoral and non-tumoral pancreatic tissues was hybridized to both the Affymetrix Human Gene ST 1.0 Expression and the microRNA 2.0 Expression arrays, according to the instructions of the manufacturer.Fluorescence signals were detected using the GeneChip Scanner 3000 7G (Affymetrix) and data stored as .CEL files.
For data analysis, GEP raw data derived from the Affymetrix Human Gene Expression ST 1.0 microarray and the microRNA 2.0 microarray, was normalized with the Robust Multi-array Average (RMA) algorithm; this included sequentially background correction, intraand inter-microarray well normalization, probe set summarization and calculation of expression signals, respectively [72].Unsupervised classification of samples and genes -28,869 mRNA and 4,544 human small noncoding RNA transcripts-was performed by principal component (PCA) and hierarchical clustering analyses (HCA) using the expression signal detected for each gene for each probe set, and the MultiExperiment Viewer (MeV, version 4.8.1)[73] and Cluster 3.0 software programs (PAM software; http://www-stat.stanford.edu/~tibs/PAM).Clustering was run using an Euclidean correlation metric and the average linkage method.For visualization of dendograms, the TreeView software (version 1.0.4)[74] was used.Differentially expressed genes between all tumor samples or GEP-defined subgroups of PDAC samples vs. non-tumoral samples were identified by supervised two-class unpaired Significance Analysis of Microarray (SAM; MeV software) [75] based on a false discovery rate (FDR) cut off of < .0001and an absolute fold change cutoff of ≥2.0.
In order to identify the best combination of genes for the discrimination between the GEP of PDAC tumors and non-neoplastic pancreatic tissues, a two-step strategy was used.In the first step, five prediction algorithms were used: 1) PAM (PAM software v 2.1; University of Stanford, CA) [76], 2) Partial Least Squares algorithms (PLS; SIMFIT software v.6.9.9; www.simfit.org.uk), 3) Support Vector Machines (SVM), 4) K-Nearest Neigbour (KNN) and, 5) Random Forest algorithms; the latter three algorithms are implemented in the Babelomics suite (http://babelomics.bioinfo.cipf.es/)[77].For this purpose, GEP data from two-thirds of the tumoral samples was randomly selected as a training dataset, while the remaining were used to build the validation dataset.In this first step, informative genes were defined as those represented in ≥ 4/5 analyses.In the second step, the discriminative power of each informative gene was assessed by receiver operating curve (ROC) analysis (SPSS 15.0 Inc, Chicago, IL, USA).Finally, those genes which depicted a high predictive power -area under the curve (AUC) ≥0.96-together with an expression fold change (vs.non-tumoral tissues) > 4, were selected.Validation of genes was performed in the same pancreatic sample series (27 tumoral plus 5 pancreatic non-tumoral samples) applying the PAM and SVM models, using a 10-fold and a leave-one-out-cross validation method, respectively.
For the identification of miRNA candidates acting as gene-regulators in PDAC samples, Spearman correlation analyses were performed to identify significant correlations between individual miRNA and mRNA gene transcripts across tumoral (n = 27) and non-tumoral (n = 5) samples.Each miRNA-mRNA interaction identified was subsequently evaluated with the Ingenuity Pathway Analysis software (IPA, Ingenuity Systems, www.ingenuity.com),as well as with available databases of experimentally validated miRNA interactions (TarBase 6.0 and miRWalk-database) and miRNA target prediction tools (DIANA-microT-CDS v5.0, miRWalk-database and miRecords) [78,79].Functional enrichment analysis of deregulated genes, analysis of canonical pathways, correlation networks, as well as gene-gene and gene-miRNA interactions were defined using the IPA software.

Validation of gene expression profiles by quantitative real-time PCR assays
TaqMan Gene Expression Assays were used to validate GEP in the same samples used for microarray studies via the Step One Plus Real-Time PCR System -Applied Biosystems (ABI; Foster City, CA, USA) according to the manufacturer's instructions.The assays ID for the genes studied were as follows: Hs_00429010_m1 (PDIA2), Hs_00170815_m1 (POSTN), Hs_00418420_m1 (SCYN), 002220 (hsa-miR-216a), 002337 (hsa-miR-217), 002623 (hsa-miR-155) and 000507 (hsa-miR-203).Each PCR was carried out in duplicate in a final volume of 10 uL using the TaqMan Fast Universal Mastermix (ABI) and the following cycling parameters: incubation at 95ºC (20 s), followed by 50 cycles at 95ºC (1s) and an incubation at 60ºC (20s).GEP and miRNA expression data was normalized against the GAPDH internal housekeeping gene and the RNU43 internal control, and it was further analyzed using the StepOne software (v2.0;ABI).The relative amounts of the quantified genes were calculated using the following equation: 2 -ΔCT (ΔC T = C T GENE-C T GAPDH or RNU43) expressed as arbitrary units (AU); results showed a high degree of correlation between data from both microarrays and RQ-PCR methods, for all genes evaluated (r 2 ≥ 0.66, p < .0001;Supplementary Figure 1).

External validation series of PDAC tumors
External validation of the predictive value of the differentially expressed genes that discriminated between the distinct GEP-defined subgroups of PDAC tumors found in our series, was performed in a group of previously reported PDAC patients (n = 27).GEP array data files (Affymetrix Human Genome U133 Plus 2.0 Array) are publicly available at the GEO database (accession number GSE17891) [25].Downloaded data CEL files were normalized using the RMA algorithm and overlapping probe sets were defined on the basis of probe specificity, using the GATExplorer server [80].Probe sets with the best specificity to the interrogated genes (see Supplementary Table 7) were selected, and the expression signals detected for each gene for each probe set were further analyzed using the column metric preserving biplot assay [81] implemented in the SIMFIT statistical software (http://www.simfit.org.uk/).

Other statistical methods
The Mann-Whitney U test and a linear regression model were used to evaluate the statistical significance of differences observed between groups and to explore the degree of correlation between different variables, respectively (SPSS 15.0 Inc.).P-values ≤.05 were considered to be associated with statistical significance.

Figure 1 :
Figure 1: Classification of PDAC tumors and non-tumoral pancreatic tissues based on coding (mRNA) and noncoding (small nuclear and microRNA) gene expression profiles (GEP).Both principal component (Panel A) and unsupervised hierarchical clustering (Panel B) analyses differentiated tumoral vs. non-tumoral tissues (n = 5; color coded in green), at the same time they showed the existence of two major subgroups of PDAC tumors: GEP group A (n = 24; color coded in red) and GEP group B (n = 3; color coded in purple).Case ID of tumors are shown inside the colored bars.

Figure 2 :
Figure 2: Most representative canonical pathways involved in PDAC tumors as defined by their GEP for both coding and non-coding RNAs (n = 27; p < .05).Shared canonical pathways by the two GEP-A and GEP-B subgroups of PDAC tumors are shown in panel A, while those pathways specific for the GEP-A and GEP-B subgroups of PDAC tumors are displayed in panels B and C, respectively.

Figure 3 :
Figure 3: Biplot analysis of 27 PDAC tumors from an independent external validation dataset [25] evaluated for the expression of GEP-A and GEP-B overexpressed tumor markers identified in our series.PDAC samples previously classified by Collisson et al. as "classical PDAC" tumors (grey dots) were mostly represented by the expression of GEP-A associated genes (red vectors) while "quasi-mesenchymal PDAC" tumors (light blue dots) were grouped by the expression of GEP-B associated markers (purple vectors).