Genomic variants link to hepatitis C racial disparities

Chronic liver diseases are a major public health issue in the United States, and there are substantial racial disparities in liver cancer-related mortality. We previously identified racially distinct alterations in the expression of transcripts and proteins in hepatitis C virus (HCV)-induced hepatocellular carcinoma (HCC) between Caucasian American (CA) and African American (AA) subgroups. Here, we performed a comparative genome-wide analysis of normal vs. HCV+ (cirrhotic state) and normal adjacent tissues (HCCN) vs. HCV+HCC (tumor state) of CA at the gene and alternative splicing levels using the Affymetrix Human Transcriptome Array (HTA2.0). More genes and splice variants were abnormally expressed in the HCV+ state than in the HCV+HCC state, relative to the corresponding control tissues. Known biological pathways related to cell cycle regulation were altered in HCV+HCC, whereas acute phase reactants were deregulated in the HCV+ state. We confirmed by quantitative RT-PCR that SAA1, PCNA-AS1, DAB2, and IFI30 are differentially deregulated, especially in AA compared with CA samples. Likewise, IHC staining analysis revealed altered expression patterns of SAA1 and HNF4α isoforms in HCV+ liver samples of AA compared with CA. These results show that several splice variants are primarily deregulated in the normal vs. HCV+ stage, which is in line with recent observations that the pre-mRNA splicing machinery may be profoundly remodeled during disease progression and may, therefore, play a major role in HCV racial disparity. The confirmation that certain genes are deregulated in AA compared to CA tissues also suggests that there is a biological basis for the observed racial disparities.


Research Paper. www.impactjournals.com/oncotarget/ Oncotarget, 2017, Vol. 8, (No. 35), pp: 59455-59475

INTRODUCTION
Hepatocellular carcinoma (HCC) is one of the few malignancies whose incidence is rising worldwide, especially in the US [1]. The increasing incidence of HCC in the US is associated with the rise in hepatitis C virus (HCV) infection [2]. It is estimated that 3.2 million people in the US are infected with HCV, a blood-borne disease linked to 12,000 US deaths a year [3]. Even with the availability of new oral direct-acting antiviral drugs [4], it is anticipated that 320,000 patients will die from HCV, 157,000 will develop HCC, and 203,000 will develop cirrhosis in the next 35 years [5]. Inequalities in disease prevalence, treatment, and outcome make HCC an important health problem among minority groups [6]. First, there are disparities in the prevalence of HCV infection, with African Americans (AA) being twice as likely to have been infected as Caucasian Americans (CA) [7]. Second, there are significant racial/ethnic disparities in access to HCV care [8]. Third, African Americans are also less likely to respond to the new anti-HCV therapy than Caucasian Americans, possibly due to a lower rate of sustained virologic response (SVR) [9], and have a considerably lower likelihood of receiving liver transplantation [10]. While much of the existing literature has focused on noting the presence of these disparities, little is known about the specific biological or genetic factors involved. Therefore, there is a clear need for molecular/biological approaches to understand the molecular basis of HCV health and racial disparities. Ultimately, positive outcomes would allow for the development of novel, affordable, and much needed next-generation therapeutic care management based on HCV disease state and the racial/ethnic background of patients [11]. We recently reported that racially distinct alterations in the expression of transcripts and proteins exist between CA and AA individuals infected with HCV, as measured by proteomics-based analysis [12]. For example, we showed that the mRNA levels of transferrin (TF), apolipoprotein A1 (APOA1), and hepatocyte nuclear factor 4-alpha (HNF4α) were significantly altered in AA liver (cirrhotic) and tumor samples compared to CA. It is known that AA with chronic HCV commonly have elevated levels of serum markers of iron stores and altered cholesterol and triglyceride levels [13,14]. The expression of TF and APOA1 (involved in iron homeostasis and lipid metabolic processes, respectively) is transcriptionally regulated by HNF4α [15,16]. Furthermore, HNF4α is also known to be involved in the pathogenesis of HCC [17,18]. To the best of our knowledge, that was the first study to demonstrate a possible link between deregulation of the expression of specific transcripts and proteins and HCV racial disparity between AA and CA subgroups.
This finding prompted us to further investigate whether alternative splicing (AS) of genes could be involved in the transcriptome diversity seen between these two populations. AS is a post-transcriptional event whereby exons are joined in different combinations, generating various isoforms from a single gene [19][20][21]. It has been shown that most genes have at least two alternative isoforms [22,23], contributing to both transcriptome and proteome diversity in various pathophysiological situations, including HCV infection and HCC [24,25].
In this study, we performed a genome-wide transcriptomic analysis at the gene and splice variant levels in liver and tumor tissue samples of HCV-infected individuals using the Affymetrix GeneChip Human Transcriptome Array (HTA2.0). The array is specifically designed to allow expression profiling of transcript splice variants. It contains >6.0 million probes covering coding transcripts (70%) and exon-exon splice junctions and non-coding transcripts (30%). Herein, we describe our methods for expression microarray analysis at the gene and splice variant levels using Transcriptome Analysis Console (TAC2.0) software, coupled with validation studies by real-time qRT-PCR and immunohistochemistry, using sixty liver and tumor tissue samples, to confirm disease-specific splice variants of genes that could be involved in the racial disparity of HCV-induced HCC.

RESULTS

Clinical characteristics of tissue samples
A total of 36 snap-frozen liver and tumor samples from CA and AA populations were used in this study. The clinicopathologic characteristics of the samples are presented in Supplementary Table 2. As reported in our previous study [12], there were no significant differences in age or sex between samples in the two groups. However, the cirrhotic HCV+ liver samples of the AA group differed significantly from those of the CA group in laboratory results for aspartate aminotransferase (AST) and alanine aminotransferase (ALT) (p<0.05). There were no significant differences in the laboratory values for albumin, total albumin and hemoglobin between samples in the two groups.

Identification of differentially expressed genes and splice variants based on diseased states of Caucasian American (CA) population
Gene-level differential expression profiles of 12 CA tissue samples (3 normal livers, 3 HCV+ livers, 3 HCV+/HCC+ tumors, and 3 HCCN) were determined using HTA2.0 GeneChip arrays (Affymetrix®), which contain 70,523 detectable transcripts, and TAC2.0 software (for filtering criteria see Materials and methods). For normal vs. HCV+, 636 genes were differentially expressed: 350 genes were up-regulated in HCV+ compared to normal (coding 235; non-coding 103; other 12), as shown in Table 1A, whereas 286 genes were down-regulated in HCV+ compared to normal (coding 209; non-coding 73; other 4), as shown in Table 1B. For HCCN vs. HCV+HCC, only 61 genes were differentially expressed, as shown in Table 2, using the same algorithm options and filter criteria (see Materials and methods): 47 genes were up-regulated in HCV+HCC compared to HCCN (coding 23; non-coding 6; other 18) and 14 genes were down-regulated in HCV+HCC compared to HCCN (coding 5; non-coding 1; other 8). The small number of differentially expressed genes in this comparison suggests that tumor-adjacent tissue (HCCN) shares much of the biology of the tumors themselves. Figure 1 shows the scatter plots (log2 scale of expression values) for differentially expressed genes (DEGs) in the normal vs. HCV+ state (Figure 1A) and the HCCN vs. HCV+HCC state (Figure 1B). In both cases, most of the genes lie along the diagonal and can be considered common genes, expressed similarly in both conditions, whereas differentially expressed genes with fold-change values <-2.0 or >+2.0 are scattered away from the diagonal. Examples of these scattered genes (arrows) are shown in Figure 1A (inset 1C) and Figure 1B (inset 1D). No overlap of genes (marked) was detected between the two disease stages, which suggests that these genes are differentially expressed based on disease state (normal vs. HCV+ cirrhotic livers; HCCN vs. HCV+/HCC cirrhotic tumors).
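The gene counts quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python that only re-adds the numbers reported in the text; the dictionary layout and the function name `tally` are illustrative choices, not part of the original analysis.

```python
# Counts as reported in the text: (coding, non-coding, other) per direction.
reported = {
    "normal vs. HCV+": {"up": (235, 103, 12), "down": (209, 73, 4), "total": 636},
    "HCCN vs. HCV+HCC": {"up": (23, 6, 18), "down": (5, 1, 8), "total": 61},
}


def tally(entry):
    """Return (up, down, up + down) by summing the reported subcategories."""
    up = sum(entry["up"])
    down = sum(entry["down"])
    return up, down, up + down


for name, entry in reported.items():
    up, down, total = tally(entry)
    # 'matches' is True when the subcategories add up to the reported total.
    print(f"{name}: up={up}, down={down}, total={total}, "
          f"matches={total == entry['total']}")
```

The subcategory counts add up to the reported totals for both comparisons (350 + 286 = 636 and 47 + 14 = 61).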
For alternative splicing analysis, based on the algorithm options and filter criteria stated in the Materials and methods, we were able to detect splice variant events only in the normal vs. HCV+ stage (cirrhotic) and not in the HCCN vs. HCV+HCC stage (tumor). This could be due to the low number of DEGs detected in the tumor state (61 genes) and/or the cut-off and filter criteria. In the normal vs. HCV+ stage, about 12,650 coding genes were expressed in both conditions, and only 15% of them had at least one PSR or junction with SI (linear) <-2.0 or >+2.0, indicating alternative splicing. For non-coding genes, about 2,943 were expressed in both conditions, and only 2.7% of them had at least one PSR or junction with SI (linear) <-2.0 or >+2.0. Table 3 shows various alternative splicing events (coding) for the top 30 genes identified in normal vs. HCV+ livers.

Differentially expressed genes are involved in a number of pathways and networks associated with disease state
To gain insight into the molecular pathways involving the identified differentially expressed genes, Ingenuity Pathway Analysis (IPA) of the experimental data was performed with Ingenuity software, as we previously reported [12]. Using the list of 636 genes involved in normal vs. HCV+ (cirrhotic) events and the 61 genes involved in HCCN vs. HCV+HCC (tumor) events, IPA identified several pathways and functions that might be relevant for each disease stage, as shown in Tables 4A and 4B, respectively. The top associated network functions for differentially expressed genes in the HCV+ cirrhotic state (Table 4A) were: 1) Hepatic fibrosis/hepatic stellate cell activation, 2) Antigen presentation pathway, 3) Graft-versus-host disease signaling, 4) Inhibition of matrix metalloproteases, and 5) T-helper cell differentiation. These data suggest that an acute inflammatory phase is involved in the HCV+ cirrhotic state as a result of HCV-induced oxidative stress. Genes such as SAA1, SAA2, and LGALS4, known to be involved in the acute inflammatory phase, were detected in this disease state (Tables 1A and 1B; Figure 1A). For HCCN vs. HCV+HCC (tumor stage), the top associated network functions for differentially expressed genes (Table 4B) were: 1) GADD45 signaling, 2) Cell cycle control of chromosomal replication, 3) Estrogen-mediated S-phase entry, 4) Cell cycle: G2/M DNA damage checkpoint regulation, and 5) Cyclins and cell cycle regulation. These data suggest that cell cycle signaling pathways are involved in HCV-induced HCC (tumor phase). Genes such as PCNA-AS1 and HIST1H2BK, known to be involved in cell cycle regulation pathways, were detected in this disease stage (Table 2; Figure 1B).

Target validation of gene expression and splice variants in Caucasian and African Americans tissue samples
To determine whether the racial disparity seen in HCV-associated HCC is partly due to diversity in gene expression and splice variant events between CA and AA, we selected a representative group of genes for qRT-PCR cross-validation analysis. For normal vs. HCV+ (cirrhotic state), we selected the following genes: SAA1, AOX1, and SLC13A5. Representative examples of the amplicon binding sites for the PCR primer sequences are shown in Supplementary Figures 1 and 2. For HCCN vs. HCV+HCC (tumor stage), the following genes were selected: PCNA-AS1, IFI30, DAB2, ROBO1, and SNORD82. The expression of these eight genes was validated by qRT-PCR using an independent test set of 24 liver and tumor tissue samples (12 CA and 12 AA). The qRT-PCR results are shown in Tables 5A and 5B. The data show good concordance between the HTA2.0 array and qRT-PCR results. However, there is a distinct difference in SAA1 expression level between CA and AA samples (Table 5A). The overall fold change (FC) of SAA1 in CA samples is positive because the overall gene expression in HCV+ cirrhotic liver is down compared to normal (Table 1A). Although the overall FC (qRT-PCR) in AA samples (Table 5A) is also positive, it is lower than in CA, because the overall gene expression in HCV+ cirrhotic liver is higher in CA, and thus a lower FC value is seen. A similar profile is seen for genes expressed in HCCN vs. HCV+HCC (tumor state): PCNA-AS1, ROBO1, DAB2, and IFI30 (Table 5A, lower part). As shown in Table 5B, SAA1 has an overall positive SI value in both HTA2.0 and qRT-PCR analyses. However, the SI value in AA samples (qRT-PCR) is lower than in CA. This relates to the overall gene signal being higher in HCV+ cirrhotic liver (Table 5A, upper part), and thus more spliced out (higher signal) compared to normal.
These data suggest that the observed disparity in HCV-induced HCC seen in CA and AA tissue samples could be due, in part, to transcriptome diversity of specific genes such as SAA1, PCNA-AS1, IFI30, DAB2, and ROBO1.

Hepatocyte nuclear factor 4α (HNF4α) and serum amyloid A1 (SAA1)-associated protein staining patterns in liver and tumor tissue samples
Since SAA1 is transcriptionally regulated by HNF4α [26], we examined the staining patterns of both proteins in 72 tissue sections for CA and AA using immunohistochemical analysis (Figures 2 and 3). Intense staining for SAA1 and P1/P2-HNF4α was observed in normal liver tissues for both CA (Figure 2Aa and 2Ad) and AA (Figure 2Ba and 2Bd). In contrast, the staining reactivity for both proteins showed a tendency to decrease in HCV+ cirrhotic livers of AA (Figure 2Bb and 2Be) compared to CA (Figure 2Ab and 2Ae). As shown in Figure 2C and 2D, the percentage reactivity for SAA1 and P1/P2-HNF4α is 6.5 and 40 in AA, whereas in CA it is 25 and 50, respectively. Likewise, the staining patterns for both SAA1 and P1/P2-HNF4α in HCC differ between AA and CA samples. In AA tumor samples, no staining was detected for SAA1 (Figure 2Bc), whereas intense staining was detected for P1/P2-HNF4α (Figure 2Bf). For CA tumor samples, staining was detected for both proteins, although less than in normal tissues (Figure 2Ac and 2Af). Figure 3A illustrates the staining pattern of P1-HNF4α in tissue samples for both CA and AA. In HCV+ tissues, the percentage reactivity of P1-HNF4α is higher in CA (125%) and lower in AA (50%). There is no clear difference in HCC staining reactivity of P1-HNF4α between CA and AA.

DISCUSSION
We previously showed [12] that distinct alterations in the expression of transcripts and proteins exist in CA liver and tumor tissue samples based on HCV disease state. However, the levels of expression were different when the results were cross-validated on tissue samples from an AA cohort. The aim of the current study was to follow up on these findings and investigate, at the whole-transcriptome level, the extent to which splice variant events may play a role in this genomic diversity of HCV disease state and racial disparity. Alternative splicing of mRNA is a major mechanism that generates diverse mRNA transcript isoforms from a single gene and, in turn, proteins with differing cellular functions [19][20][21][22][23]. These variants are targeted as biomarkers in disease diagnosis, prognosis, and treatment [27][28][29].
In the present study, genome-wide analyses of genes and alternative splicing events in human liver and tumor tissues were performed using the newly developed Affymetrix Human Transcriptome 2.0 arrays (HTA 2.0). With a high density of oligonucleotide probes, these arrays cover the exonic regions of the human genome as well as junction regions between adjacent exons. Many changes were apparent in HCV+ cirrhotic vs. normal livers, even more so than in HCV+HCC vs. HCCN. This may indicate that HCV+ cirrhotic livers, as a type of intermediary lesion in HCV disease progression, already exhibit strong signs of alteration. From the molecular changes evidenced in HCV+ (Figure 1A), it is clear that HCV+ cirrhotic livers are not merely accumulating alterations that will be found in HCV+HCC (Figure 1B). Possibly, the evolution to HCC follows a more strictly clonal expansion, which may select for gene changes important for clonal growth while eliminating less relevant modifications. According to this hypothesis, HCV+ cirrhotic livers may have different outcomes, some evolving toward cancer (HCC), whereas others could be prone to disappearance. In this case, we identified more differentially expressed genes in normal vs. HCV+ (636 DEGs), whereas only 61 DEGs were detected in HCCN vs. HCV+HCC. No overlap of genes was detected between the two disease states.
Tables 1A and 1B show specific gene expression alterations in normal vs. HCV+. The signature of 350 probes corresponding to downregulated genes in HCV+ compared to normal is shown in Table 1A. Among the most strongly downregulated genes are AVR1A, SAA2, MT1F, CFHR5, SLITRK3, CLEC4M, SAA1, CPN1, TIMD4, GPR125, and AOX1. Most of these genes have not been described as associated with HCV+ cirrhotic livers, although several of the changes agree with previous reports, including variations in the expression levels of SAA1, SAA2, or MT1F [30][31][32][33]. For example, SAA1 and SAA2 are well-known acute phase reactants, and their serum levels were shown to be downregulated in HBV-associated HCC patients compared to healthy individuals [34]. In our study, both SAA1 and SAA2 are downregulated in HCV+ liver compared to normal (Figure 1A). As a tumor suppressor, metallothionein 1F (MT1F) has been shown to be downregulated in several tumors as part of cancer initiation and/or progression [35]. The signature of 286 probes corresponding to upregulated genes in HCV+ compared to normal is shown in Table 1B. Among the most strongly upregulated genes are AKR1B10, IFI27, IL8, VTRNA1-1, SPP1, GDF15, CXCL10, IGLC7, and LGALS4. The expression of these genes is known to be strongly associated with HCV-induced liver cirrhosis and/or HCC [36][37][38][39][40][41][42][43][44][45]. In Figure 1A, both SPP1 and IL8 are upregulated in HCV+ cirrhotic liver compared to normal. The signature of 61 probes corresponding to genes showing expression alterations in HCCN vs. HCV+HCC is shown in Table 2. In this disease state, 47 genes (77%) are upregulated, whereas 14 genes (23%) are downregulated. Among the top deregulated probes, PCNA-AS1 was the most upregulated in HCV+HCC compared to HCCN, whereas SNORD82 was among the downregulated probes (Figure 1B). Both genes are considered long non-coding RNAs (lncRNAs), which are well recognized to play major regulatory roles in disease development.
For example, PCNA-AS1 was shown to act as an upstream regulator in HCC [46], and SNORD82 has been found to be involved in the development of prostate and breast cancers [47,48]. Ingenuity Pathway Analysis (IPA) was performed using Ingenuity software, as we reported previously [12], to understand the correlation between the canonical biological pathways and the deregulated genes identified in this study. Among the top 5 canonical pathways for the normal vs. HCV+ state (Table 4A) was Hepatic Fibrosis/Hepatic Stellate Cell Activation (p=4.25E-04). In hepatic fibrosis, hepatotoxins like HCV initiate a cascade of stress-related pro-inflammatory events, which eventually activate hepatic stellate cells (HSCs). Activated HSCs secrete cytokines that perpetuate their activated state. Continued liver injury results in an accumulation of activated HSCs, which in turn synthesize large amounts of extracellular matrix (ECM) proteins, leading to severe fibrosis and eventually liver cirrhosis. SAA1 and SAA2 are among the molecules activated in this disease state (acute phase reactants), and both are downregulated, indicating a possible involvement in disease initiation to HCC. For the HCCN vs. HCV+HCC state (Table 4B), GADD45 Signaling was the top pathway identified (p=2.93E-06). It has been implicated in the stress signaling response that can result in cell cycle arrest, DNA repair, cell survival, senescence, and apoptosis. This response is mediated via complex binding to several proteins involved in these processes, including PCNA, and thus PCNA-AS1 was found to be upregulated in HCC (Figure 1B).

[Figure 2 caption, partial: ... (a and d, respectively), HCV+ cirrhotic (b and e, respectively), and HCV+/HCC cirrhotic (c and f, respectively) in AA. Bar graphs = % staining reactivity (Y-axis) vs. disease state (X-axis) for SAA1 (C) and P1/P2-HNF4α (D). Black bar = CA; gray bar = AA (n = 3-4 tissue sections from 24 paraffin-embedded tissue blocks ± S.E.; *p<0.05; **p<0.001).]
We next validated the expression of 8 DEGs by real-time qRT-PCR using independent samples for CA and AA, as shown in Table 5A. Although this table shows good concordance between the results obtained with both platforms, the level of SAA1 in AA samples (normal vs. HCV+ state) is significantly lower than that in CA (p<0.05). Thus, the immune response to chronic HCV infection may play a crucial role in HCV racial disparities. Four (PCNA-AS1, ROBO1, DAB2, and IFI30) out of 5 transcripts with increased expression in the HCCN vs. HCV+HCC state (Table 2) were found to be significantly lower (p<0.05) in AA compared to CA samples. Thus, in addition to the immune response-associated genes, these genes could also play a role in the HCV/HCC racial disparities seen between CA and AA samples, and might be valuable markers for early diagnosis of the disease based on the racial background of patients. Since SAA1 (an acute phase reactant) is transcriptionally regulated by HNF4α [49], we validated the expression of both using immunohistochemical analysis. HNF4α is a member of the superfamily of ligand-dependent transcription factors (TFs) and a master regulator of tissue-specific gene expression in the liver [50]. It inhibits progression of HCC in mice [17,18]. Two alternative promoters (P1 and P2) drive expression of the HNF4α gene and give rise to HNF4α isoforms that differ by 16-38 amino acids in their terminal region [51]. While the different isoforms have identical DNA- and ligand-binding domains, there are subtle yet significant functional differences between them. Both P1- and P2-driven HNF4α are expressed in the fetal liver, but only P1-HNF4α is expressed in the normal adult liver [52], and P1-HNF4α is downregulated in human HCC while P2-HNF4α is upregulated [51]. Furthermore, P1-HNF4α is known to repress the activation of the P2 promoter [51], which could explain the switch between the two isoforms.
In this study, we used the H1415 and K9218 monoclonal antibodies to detect P1/P2- and P1-promoter-driven HNF4α, respectively, in the liver and tumor samples to determine how the expression of these two isoforms may play a role in SAA1 expression patterns. Our data in Figure 2 indicate that the staining reactivity of SAA1 and P1/P2-HNF4α is altered based on HCV disease state and race. For example, staining reactivity (%) for SAA1 (Figure 2C) in CA is 25% for both HCV+ cirrhotic and HCC states, whereas in AA samples it is only 6.5% and 0.0%, respectively. This indicates that the marker for the "acute inflammatory phase" is much lower in HCV+ of the AA cohort compared to the CA cohort. As shown in Figure 2D, the staining reactivity of P1/P2-HNF4α, which is a measure of both isoforms, is lower in HCV+ for both CA and AA tissue samples. However, Figure 3B shows that the low staining reactivity is related to the P1-HNF4α isoform, and mainly in AA tissue samples. These data indicate that the acute inflammatory phase, as measured by SAA1 level, is severely compromised in AA compared to CA, which may result from dysregulation of HNF4α isoforms. Our results also show that changes in splicing profiles in the normal vs. HCV+ state could contribute to the observed HCV disease state racial disparity (Table 3). The alternative splicing events of three genes (SAA1, AOX1, and SLC13A5) from the 28-gene set (Table 3) were confirmed by real-time qRT-PCR in the normal vs. HCV+ state. Specifically, we validated exon 1 to 2 and exon 1 to 3 for SAA1 (Supplementary Figure 1), exon 4 to 5 and exon 12 to 13 for AOX1, and exon 10 to 12 for SLC13A5 (Supplementary Figure 2). We found that the splicing index (SI) of SAA1 is significantly lower (p<0.05) in AA compared to CA (Table 5B). This suggests that splicing events occurred mainly in a specific disease state (HCV+ cirrhotic), predominantly in the AA cohort.
The role played by these alternative splice products in HCV+ will thus require further investigation, together with the other alternative transcripts detected. In sum, our study suggests that altered gene expression and splice variants are important events in HCV racial disparities between Caucasian and African Americans.
In conclusion, our genomic variants study showed that genes were differentially expressed between HCCN and HCV+HCC but also, to a large extent, between the normal and HCV+ (cirrhotic) states. Many of these genes are involved in biological pathways pertinent to the overall pathophysiological response to HCV infection. The observation that several splice variants were deregulated in normal vs. HCV+ is in line with recent observations that the pre-mRNA splicing machinery may be profoundly remodeled during HCV disease progression and may, therefore, play a major role in the disease outcome. Target validation analyses showed that some of these genes are significantly deregulated, especially in AA compared to CA tissue samples. These observations suggest that socioeconomic factors may not fully explain HCV racial disparity, and that biological/genetic factors should also be considered. Further analyses will be required to determine whether these gene variants are predictive markers of the pathophysiological evolution in HCV disease progression. It would be of great interest to determine whether our differentially expressed genes and splice variants are under some kind of coordinated control. Such knowledge could support the development of next-generation therapeutic care management for HCV disease state based on the racial/ethnic backgrounds of patients.

MATERIALS AND METHODS

Sample preparation and data analysis
Total RNA was extracted from 12 tissue samples of Caucasian individuals (3 normal livers, 3 HCV+/HCC- (cirrhotic livers), 3 HCV+/HCC+ (cirrhotic tumors), and 3 normal adjacent tissue matched pairs, HCCN) using the RNeasy mini kit (Qiagen, Valencia, CA, USA) and quantified using a Nanodrop ND-100 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA), as previously reported [12]. RNA samples were then subjected to RNA amplification using the SensationPlus FFPE Amplification and WT Labeling Kit (Affymetrix Inc., Santa Clara, CA, USA), as previously reported [53,54]. The biotin-labeled double-stranded cDNA products were hybridized to Affymetrix HTA 2.0 arrays using an Affymetrix hybridization kit. Hybridized HTA 2.0 arrays were scanned with an Affymetrix GeneChip® 3000 fluorescent scanner. Image generation and feature extraction were performed using Affymetrix GeneChip Command Console Software. The raw data (*.CEL) were analyzed using the Transcriptome Analysis Console (TAC) 2.0 software, which allows for the identification of differentially expressed genes (DEGs) and exons and the visualization of alternative splicing events for determining possible transcript isoforms that may exist in samples.
For microarray data analysis, two parallel analyses (gene level and alternative splicing level) were performed. Data were normalized using quantile normalization, and background noise was detected using the Detection Above Background (DABG) algorithm. Only the probesets characterized by a DABG p-value <0.05 in at least 50% of the samples were considered for statistical analysis. We performed an unpaired Student's t-test to compare gene intensities between normal vs. HCV+ and HCCN vs. HCV+HCC. Genes were considered significantly regulated when the fold change (FC, linear) was <-2.0 or >+2.0 and the ANOVA p-value (condition pair) was <0.05. Analysis at the splicing level was also performed using TAC 2.0 software, which determines, among other parameters, the Splicing Index (SI) of a gene. The SI corresponds to a comparison of gene-normalized exon-intensity values between the two analyzed experimental conditions [55]. Additional criteria used besides SI were: q-value <0.05; the gene is expressed in both conditions (normal vs. HCV+, and HCCN vs. HCV+HCC); a Probe Selection Region (PSR)/junction must be expressed in at least one condition; and the gene must contain at least one PSR value.
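The gene-level filtering rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the TAC 2.0 implementation: the function names are hypothetical, the inputs are assumed to be log2 group means and precomputed p-values, and the signed fold-change convention (ratios below 1 reported as negative reciprocals) is an assumption based on the thresholds <-2.0 or >+2.0 quoted in the text.

```python
def signed_fold_change(log2_mean_a, log2_mean_b):
    """Linear fold change of condition A over B, given log2 group means.

    Assumed convention: ratios below 1 are reported as negative reciprocals,
    so a two-fold decrease is -2.0 rather than 0.5.
    """
    ratio = 2.0 ** (log2_mean_a - log2_mean_b)
    return ratio if ratio >= 1.0 else -1.0 / ratio


def passes_dabg(dabg_pvalues, alpha=0.05, min_fraction=0.5):
    """Keep a probeset if its DABG p-value is < alpha in at least
    min_fraction of the samples."""
    detected = sum(p < alpha for p in dabg_pvalues)
    return detected >= min_fraction * len(dabg_pvalues)


def is_deg(fold_change, p_value, fc_cutoff=2.0, alpha=0.05):
    """Apply the stated criteria: FC < -2.0 or > +2.0, and p-value < 0.05."""
    return abs(fold_change) > fc_cutoff and p_value < alpha


# Hypothetical probeset: log2 mean of 10.0 in normal and 8.0 in HCV+.
fc = signed_fold_change(10.0, 8.0)
print(fc, is_deg(fc, p_value=0.01))
```

Under this convention a positive fold change for normal vs. HCV+ means higher expression in normal tissue, which is how the fold changes for SAA1 are described in the Results.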

Reverse transcription PCR validation
Validation of 8 selected differentially expressed genes (DEGs) and splice variants was performed on 24 independent tissue samples (12 CA and 12 AA) at various disease states (normal, HCV+, and HCC). mRNA levels were measured using the SYBR Green quantitative RT-PCR (qRT-PCR) method, as previously reported [12], on the ABI 7900HT Fast Real-Time PCR System (Applied Biosystems). cDNAs were amplified using the specific primers indicated in Supplementary Table 1; results were normalized against alpha-ACTIN (ACTIN1), beta-2-microglobulin (B2M), and glyceraldehyde 3-phosphate dehydrogenase (GAPDH). Relative RNA levels of genes were calculated using the comparative Ct method, 2^(-ΔΔCt) [56]. For splice variants, alternatively spliced (A) and constitutive (C) exons were identified in TAC 2.0, and qRT-PCR primer sets were designed using Primer3 (http://www.ncbi.nlm.nih.gov/tools/primer-blast/), as shown in Supplementary Table 1. By designing specific primer pairs for constitutively expressed flanking exons (Supplementary Figures 1 and 2), it is possible to simultaneously amplify isoforms that include or skip the target exon [57]. The identities of variant-specific amplicons were verified and quantitated by melt curve analysis, and the products were confirmed as either present or absent using agarose gel electrophoresis. The Splicing Index (SI) was calculated for (A) by normalizing its fold change (FC) to the average FC of (C) for each splicing event. For the amplicon spanning exons 4-5 in AOX1 (Supplementary Table 1), the calculated FC (A)/average FC (C) value is less than 1 (0.47), indicating decreased exon 5 inclusion in normal vs. HCV+. This is reported as a negative number, -1/0.47 = -2.1 (Table 5B). For SAA1, the reported positive SI (9.12) indicates increased exon 3 inclusion in normal vs. HCV+. Each sample was measured in triplicate, and values were reported as the average.
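The two calculations described above, the comparative Ct method and the qRT-PCR splicing index, can be illustrated with a short Python sketch. The function names and the Ct values are hypothetical; a single reference Ct per sample is assumed here, since the text does not specify how the three reference genes were combined. Only the AOX1 ratio of 0.47 is taken from the text.

```python
def relative_expression(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Comparative Ct method: 2 ** -(delta_Ct_test - delta_Ct_ctrl)."""
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_test - delta_ctrl)


def splicing_index(fc_alt, fc_constitutive):
    """FC of the alternative exon (A) divided by the average FC of the
    constitutive exons (C); ratios below 1 are reported as -1/ratio."""
    ratio = fc_alt / (sum(fc_constitutive) / len(fc_constitutive))
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Hypothetical Ct values: the target is 2 cycles earlier in the test sample
# after reference normalization, which gives a 4-fold relative increase.
print(relative_expression(24.0, 18.0, 26.0, 18.0))

# AOX1 exons 4-5: an FC(A)/average FC(C) ratio of 0.47 is reported as -2.1.
print(round(splicing_index(0.47, [1.0]), 1))
```

The sign convention makes decreased and increased exon inclusion symmetric around zero, so that -2.1 and +2.1 denote changes of the same magnitude in opposite directions.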

Pathways, functional enrichment and interactive network analysis
Gene networks and canonical pathways representing key genes were identified using QIAGEN's Ingenuity Pathway Analysis software (IPA, QIAGEN Redwood City, www.qiagen.com/ingenuity, content version 18841524, release date 06/26/2014), as previously reported [12]. Briefly, the data sets containing gene identifiers and corresponding fold changes and p-values were uploaded into the web-delivered application, and each gene identifier was mapped to its corresponding gene object in the IPA software. Fisher's exact test was performed to calculate a p-value assigning a probability of enrichment to each biological function and canonical pathway within the IPA library.
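As an illustration of how an enrichment p-value of this kind can be obtained, the sketch below computes a right-tailed Fisher's exact test from the hypergeometric distribution in plain Python. It is not the IPA implementation; the function name, the choice of a one-sided (right-tailed) test, and the numbers in the example are assumptions made for illustration.

```python
from math import comb


def fisher_right_tail(overlap, list_size, pathway_size, background_size):
    """Probability of seeing at least `overlap` pathway genes in a gene list
    of `list_size` drawn at random from `background_size` genes, of which
    `pathway_size` belong to the pathway (hypergeometric upper tail)."""
    total = comb(background_size, list_size)
    upper = min(list_size, pathway_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 background genes, 4 in the pathway, a list of 3 genes
# with 2 of them in the pathway.
print(fisher_right_tail(overlap=2, list_size=3, pathway_size=4, background_size=10))
```

A small p-value means the overlap between the gene list and the pathway is larger than expected by chance given the sizes of the list, the pathway, and the background.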

Statistical analysis
The data were expressed as mean±SE, and analyzed with the Student's t-test between two groups. Changes