Over-expression of AURKA, SKA3 and DSN1 contributes to colorectal adenoma to carcinoma progression

Development of colorectal cancer (CRC) involves sequential transformation of normal mucosal tissues into benign adenomas and then adenomas into malignant tumors. The identification of genes crucial for malignant transformation in colorectal adenomas (CRAs) has been based primarily on cross-sectional observations. In this study, we identified relevant genes using autologous samples. By performing genome-wide SNP genotyping and RNA sequencing analysis of adenocarcinomas, adenomatous polyps, and non-neoplastic colon tissues (referred to as tri-part samples) from individual patients, we identified 68 genes with differential copy number alterations and progressively dysregulated expression. Aurora A, SKA3, and DSN1 protein levels were sequentially up-regulated in the samples, and this overexpression was associated with chromosome instability (CIN). Knockdown of SKA3 in CRC cells dramatically reduced cell growth rates and increased apoptosis. Depletion of SKA3 or DSN1 induced G2/M arrest and decreased migration, invasion, and anchorage-independent growth. AURKA and DSN1 are thus critical for chromosome 20q amplification-associated malignant transformation in CRA. Moreover, SKA3 at chromosome 13q was identified as a novel gene involved in promoting malignant transformation. Evaluating the expression of these genes may help identify patients with progressive adenomas and thereby improve treatment.


Microsatellite instability assay
Fifty ng of genomic DNA from 106 carcinomas, 99 polyps, and 106 paired non-neoplastic colon tissues was used for each PCR reaction. PCR products amplified from 5 microsatellite loci (BAT25, BAT26, D2S123, D5S346, and D17S250) were analyzed by capillary electrophoresis using an ABI 3730 DNA analyzer (Life Technologies, Grand Island, NY). The consensus guideline established by the National Cancer Institute for determining MSI status was used to determine microsatellite instability of each tumor tissue. DNA samples with microsatellite instability were categorized as follows: MSI-high (MSI-H), instability at ≥2 loci; MSI-low (MSI-L), instability at one locus; microsatellite stable (MSS), no detectable instability.
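The three-way categorization above can be sketched as a small function. This is a minimal illustration, assuming the number of unstable loci per sample has already been determined from the electrophoresis traces; the function name `classify_msi` and its parameters are hypothetical and not part of the original workflow.

```python
def classify_msi(unstable_loci: int, total_loci: int = 5) -> str:
    """Map the number of unstable microsatellite loci to an MSI category.

    Thresholds follow the text: >= 2 unstable loci is MSI-H,
    exactly 1 is MSI-L, and 0 is MSS.
    """
    if not 0 <= unstable_loci <= total_loci:
        raise ValueError("unstable_loci must be between 0 and total_loci")
    if unstable_loci >= 2:
        return "MSI-H"
    if unstable_loci == 1:
        return "MSI-L"
    return "MSS"
```

For example, a tumor showing instability only at BAT26 would be passed as `classify_msi(1)` and fall into the MSI-L category.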

Chromosomal aberration detection
A total of 500 ng of genomic DNA each from 76 carcinoma, 67 polyp, and 76 paired non-neoplastic colon tissue samples was subjected to SNP genotyping using Genome-wide Human Array SNP6.0 (Affymetrix, CA, USA) according to the manufacturer's instructions. Genotyping was performed by the National Genotyping Center at Academia Sinica, Taipei, Taiwan (http://ngc.sinica.edu.tw). Copy number estimation for carcinoma tissue and polyp tissue was performed using Partek Genomics Suite (Partek Inc., MO, USA) in paired mode: probe intensity data from neoplastic tissues were compared with those from the corresponding normal colon tissues to filter out germ-line copy number variations (CNV). Regions with at least 50 consecutive probes with inferred copy numbers > 2.2 or < 1.8 were defined as having a copy number alteration (CNA). A given chromosome arm with CNAs detected in more than 50% of its total length was defined as having amplification or deletion of the whole chromosome arm. In this report, the severity of chromosome instability (CIN) was classified into five categories: CIN-stable (no altered chromosome arms), low degree of CIN (≥ 1 but ≤ 5 altered chromosome arms), medium degree of CIN (> 5 but ≤ 10 altered chromosome arms), high degree of CIN (> 10 but ≤ 20 altered chromosome arms), and ultra-high degree of CIN (> 20 altered chromosome arms).
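The probe-run rule and the five CIN categories can be illustrated with a short sketch. It is not the Partek Genomics Suite procedure. It assumes an ordered list of per-probe inferred copy numbers for one chromosome arm, and it treats gains and losses as separate runs, which the text does not state explicitly. All function names are hypothetical.

```python
from itertools import groupby

GAIN_THRESHOLD = 2.2   # inferred copy number above this counts as a gain
LOSS_THRESHOLD = 1.8   # inferred copy number below this counts as a loss
MIN_PROBES = 50        # minimum number of consecutive probes for a CNA


def probe_state(copy_number: float) -> int:
    """Return 1 for gain, -1 for loss, 0 for no change."""
    if copy_number > GAIN_THRESHOLD:
        return 1
    if copy_number < LOSS_THRESHOLD:
        return -1
    return 0


def call_cna_runs(copy_numbers, min_probes=MIN_PROBES):
    """Return (start, end_exclusive, state) for each qualifying probe run.

    Assumption: a run is a stretch of consecutive probes sharing the same
    non-zero state (all gain or all loss).
    """
    runs = []
    index = 0
    for state, group in groupby(copy_numbers, key=probe_state):
        length = sum(1 for _ in group)
        if state != 0 and length >= min_probes:
            runs.append((index, index + length, state))
        index += length
    return runs


def classify_cin(altered_arms: int) -> str:
    """Map the number of altered chromosome arms to a CIN category."""
    if altered_arms < 0:
        raise ValueError("altered_arms must be non-negative")
    if altered_arms == 0:
        return "CIN-stable"
    if altered_arms <= 5:
        return "low"
    if altered_arms <= 10:
        return "medium"
    if altered_arms <= 20:
        return "high"
    return "ultra-high"
```

In a full pipeline, the lengths of the called runs on each arm would be summed and compared with 50% of the arm length before counting that arm as altered; that step is omitted here because it depends on probe coordinates not described in the text.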

RNA sequencing and data processing
Tri-part samples from 10 patients were subjected to massively parallel sequencing for expression profiling and quantification. RNA quality and quantity were checked using an Agilent Bioanalyzer. Quality control of the raw sequence data was performed in two steps. First, adapters and low-quality reads with Phred quality scores (Q score) < 13 were trimmed. Then, processed reads with lengths < 25 bp were removed by Solexa QA (version 2.5) [1]. The qualified paired-end reads were mapped to the human reference genome hg19 using Bowtie (version 1.0.1) in parallel with Tophat (version 2.0.11) to analyze the mapping results for splice junction identification between exons [2,3]. The Sequence Alignment/Map (SAM) file for each sample was quantified and normalized to estimate gene expression levels using Cufflinks (version 2.2.1) with default options [4]. Gene expression levels in fragments per kilobase of transcript per million mapped reads (FPKM values) were used for further statistical analysis.
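For readers unfamiliar with the FPKM unit, the following sketch shows the textbook definition only. It is not how Cufflinks computes its estimates, which rely on a statistical model of fragment assignment; the function name and arguments are hypothetical.

```python
def fpkm(fragments: float, transcript_length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    Textbook definition: fragments / (length in kb) / (mapped fragments in millions).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)
```

Under this definition, a 2,000 bp transcript with 500 assigned fragments in a library of 25 million mapped fragments has an FPKM of 10.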

Statistical analysis of RNA-seq data
Since the polyp sample from patient CRC10 did not pass FASTQC, tri-part samples from this patient were excluded from downstream analysis. A total of 23,615 genes were reported by Cuffmerge and were further filtered by expression abundance. Only genes with detectable expression in more than 50% of samples and with average FPKM ≥ 1 were included in statistical analysis. A total of 14,516 genes passed this filter. The Multi-Omics On Line Analysis System (MOLAS) (http://molas.iis.sinica.edu.tw/) was used to perform K-means clustering analysis for the 14,516 genes with the following parameters: p-value ≤ 0.001, fold change ≥ 2, clustering # = 9, min_row_sum # = 1, and dispersion # = 0.001. Genes in the clusters showing progressive expression level increases or decreases from non-neoplastic colon tissue through polyps to carcinomas, and with mean FPKM ≤ 1 for non-neoplastic tissue or mean FPKM ≥ 1 for carcinoma, were considered candidate genes. The Wilcoxon Signed Rank test was performed for the same 14,516 genes in a pair-wise manner using SAS version 9.3 (SAS Institute, Cary, NC, USA). Genes with p-values ≤ 0.01 and fold change > 2 or < -2 in carcinoma compared to paired non-neoplastic tissue, and with p-values ≤ 0.01 and fold change > 1.5 or < -1.5 in carcinoma compared to paired polyp, were considered differentially expressed. Pathway enrichment analysis of the candidate genes was performed using Ingenuity Pathway Analysis software (Qiagen, Valencia, CA, USA).
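The abundance filter and the differential-expression thresholds can be sketched as follows. Two points are assumptions rather than statements from the text: "detectable expression" is taken to mean FPKM > 0, and the signed fold change is taken to be the ratio when it is ≥ 1 and the negative reciprocal otherwise. The p-values are assumed to come from an external test such as the Wilcoxon Signed Rank test; all function names are hypothetical.

```python
from statistics import mean


def passes_abundance_filter(fpkm_values, min_mean=1.0):
    """Keep a gene detected in more than 50% of samples with mean FPKM >= min_mean.

    Assumption: 'detected' means FPKM > 0.
    """
    detected = sum(1 for value in fpkm_values if value > 0)
    return detected > 0.5 * len(fpkm_values) and mean(fpkm_values) >= min_mean


def signed_fold_change(numerator, denominator):
    """Return ratio if >= 1, else the negative reciprocal (assumed convention)."""
    if numerator <= 0 or denominator <= 0:
        # Zero values would need a pseudocount, which the text does not specify.
        raise ValueError("both expression values must be positive")
    ratio = numerator / denominator
    return ratio if ratio >= 1 else -1 / ratio


def is_differentially_expressed(p_vs_normal, fc_vs_normal, p_vs_polyp, fc_vs_polyp):
    """Apply both thresholds: carcinoma vs. normal and carcinoma vs. polyp."""
    normal_ok = p_vs_normal <= 0.01 and abs(fc_vs_normal) > 2
    polyp_ok = p_vs_polyp <= 0.01 and abs(fc_vs_polyp) > 1.5
    return normal_ok and polyp_ok
```

Note that the sketch does not require the two fold changes to point in the same direction, because the text does not state such a requirement.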

Genomic real-time quantitative PCR
Real-time quantitative PCR was performed using Power SYBR Green master mix (Applied Biosystems, Foster City, CA, USA). Briefly, PCR reactions with 2 ng of genomic DNA template were run on an ABI PRISM 7900 using the default two-step thermal cycle, comprising a heat activation step at 95ºC for 10 min, 40 cycles of 95ºC for 15 s and 60ºC for 1 min, and a dissociation stage at the end. The NSE1 gene located on chromosome 2, which was the least frequently altered among all tissue samples, served as an internal control. Two primer pairs specific to exon 4 and exon 6 of SKA3 were used to evaluate SKA3 copy number. The primer sequences were as follows: All reactions were carried out in triplicate. SKA3 copy numbers for each tumor sample are expressed as 2 × 2^(-ΔΔCt) compared to the paired non-tumor sample. Copy numbers higher than 2.2 indicated gene amplification. The SKA3 gene was considered amplified when amplification was detected by both primer pairs. Samples with suspected NSE1 amplification detected by the SNP genotyping array were excluded from statistical analysis.
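The copy number calculation can be sketched as below. The sketch assumes the conventional ΔΔCt definition (ΔCt = Ct of SKA3 minus Ct of the NSE1 control, ΔΔCt = ΔCt of tumor minus ΔCt of paired non-tumor tissue) and assumes that triplicate Ct values are averaged; neither detail is spelled out in the text. Function names are hypothetical.

```python
from statistics import mean

AMPLIFICATION_CUTOFF = 2.2  # copy numbers above this indicate amplification


def copy_number(ct_target_tumor, ct_control_tumor, ct_target_normal, ct_control_normal):
    """Estimate copy number as 2 * 2^(-ddCt) from triplicate Ct values.

    Assumption: triplicates are averaged before computing delta Ct.
    """
    delta_tumor = mean(ct_target_tumor) - mean(ct_control_tumor)
    delta_normal = mean(ct_target_normal) - mean(ct_control_normal)
    ddct = delta_tumor - delta_normal
    return 2.0 * 2.0 ** (-ddct)


def is_amplified(copy_numbers_by_primer_pair):
    """Call amplification only if every primer pair exceeds the cutoff."""
    return all(value > AMPLIFICATION_CUTOFF for value in copy_numbers_by_primer_pair)
```

With identical ΔCt values in tumor and paired non-tumor tissue, the estimate is 2 copies; a tumor ΔCt one cycle lower than the non-tumor ΔCt gives 4 copies.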

Apoptosis analysis
Cells were plated in 6-well cell culture plates (3 × 10^5 cells per well) and subjected to siRNA transfection.
N, non-neoplastic tissue; A, polyp; C, carcinoma. Signal intensity was scored as follows: 0 (no staining), 1 (weak staining), 2 (moderate staining), and 3 (strong staining). *, Hyperplastic polyps; all others were adenomatous polyps. #, Patients with MSI in either carcinoma or polyp tissues were excluded from the statistical analysis; patients with adenocarcinoma nodules detected in polyp tissues and those with extreme alteration of protein expression were also excluded from the statistical analysis. *, CIN index 0 corresponds to CIN-stable, 1 corresponds to low degree of CIN, 2 corresponds to medium degree of CIN, 3 corresponds to high degree of CIN, and 4 corresponds to ultra-high degree of CIN. &, Samples with suspected amplification of the NSE1 gene, the normalization control for genomic qPCR, identified by SNP genotyping array were excluded from statistical analysis.