Alcohol and hepatitis virus-dysregulated lncRNAs as potential biomarkers for hepatocellular carcinoma

Hepatocellular carcinoma (HCC) is one of the leading causes of cancer-related deaths because of frequent late detection and poor therapeutic outcomes, creating a need for effective biomarkers for early diagnosis and new therapeutic targets for treatment. Long noncoding RNAs (lncRNAs) have emerged as promising molecular markers for diagnosis and treatment. Through analysis of patient samples from The Cancer Genome Atlas database, we identified putative lncRNAs dysregulated in HCC and by its risk factors, hepatitis infection and alcohol consumption. We identified 184 lncRNAs dysregulated in HCC tumors versus paired normal samples, 53 lncRNAs dysregulated in alcohol-drinking patients with hepatitis B, and 5,456 lncRNAs dysregulated in patients with hepatitis infection. The expression of a panel of these candidate lncRNAs correlated significantly with patient survival, clinical variables, and known genomic alterations in HCC. Two of the most significantly dysregulated lncRNAs in our computational analysis, lnc-CFP-1:1 and lnc-CD164L2-1:1, were validated in vitro to be dysregulated by alcohol. Our findings suggest that lncRNAs dysregulated by different etiologies of HCC may serve as disease markers and can be further investigated to develop personalized prevention, diagnosis, and treatment strategies.


INTRODUCTION
Hepatocellular carcinoma (HCC) is the most common class of liver cancer, accounting for 70-90% of primary liver cancer cases [1]. Because of limitations in diagnostic methods, HCC is often diagnosed late, when intrahepatic and extrahepatic metastases are likely to have already occurred, leading to poor clinical outcome and therapeutic response [1,2]. HCC causes 750,000 deaths annually worldwide, the second-highest total mortality among all human cancers [3]. The five-year survival rate for HCC has remained below 20% [4]. Therefore, there is an urgent need for biomarkers that allow early diagnosis of HCC and prediction of metastasis risk.
Long noncoding RNAs (lncRNAs) are noncoding RNAs over 200 bases in length that serve a variety of roles in regulating protein levels and gene expression [1]. LncRNAs have been extensively documented as important regulatory molecules involved in tumorigenesis [5]. A number of lncRNAs have been shown to have significant functions in the pathogenesis of HCC and can serve either tumor-suppressing or oncogenic roles. Active pathological mechanisms in HCC such as the Wnt signaling pathway, the STAT3 signaling pathway, and epithelial-to-mesenchymal transition (EMT) were shown to be regulated by lncRNAs such as DANCR, ATB, and MALAT1, respectively [6][7][8]. Other lncRNAs involved in HCC include HOTAIR (HOX transcript antisense RNA) and HULC (Highly Upregulated in Liver Cancer), both of which are also involved in multiple cancers besides HCC [3,9]. Because of their extensive involvement in tumorigenesis, lncRNAs are promising candidates as biomarkers for prediction of prognosis in HCC.
Research Paper www.impactjournals.com/oncotarget
Important risk factors in the development of HCC are alcohol consumption and hepatitis virus infection. An estimated half of HCC patients have a hepatitis B virus (HBV) infection, while thirty to forty percent of HBV patients ultimately develop HCC [10,11]. Hepatitis C virus (HCV) infection causes an estimated seventeen-fold increase in the risk of developing HCC compared to individuals without hepatitis [12]. Among people with heavy alcohol consumption, ten to thirty percent develop alcoholic steatohepatitis, while ten to twenty percent develop liver cirrhosis [13]. Both alcoholic steatohepatitis, a type of fatty liver disease, and liver cirrhosis can lead to development of HCC. Because HCC is a heterogeneous disease with various prognostic factors and outcomes, the risk factors involved in the development of HCC for specific patients must be considered in order to develop personalized diagnostic and treatment methods.
Most studies investigating lncRNAs involved in HCC were limited by the small size of patient cohorts and the lack of focus on specific HCC etiologies, with many studies comparing only tumor samples with normal samples [1]. While there have been studies exploring differential expression of coding genes in HCC caused by different risk factors, to the best of our knowledge, no study has compared the role of lncRNAs in HBV/HCV-related HCC to their role in alcohol-related HCC. We downloaded RNA-sequencing data for 222 HCC patients from The Cancer Genome Atlas (TCGA) database to obtain a substantial cohort for analysis. We then compared lncRNA expression levels in normal samples with those in three tumor sample cohorts defined by HBV infection, HCV infection, and history of alcohol consumption. Finally, the expression of selected lncRNAs was verified in vitro.

Identification of alcohol and hepatitis-dysregulated lncRNAs
Clinical and RNA-sequencing data for 222 HCC patients and 50 normal liver tissues were obtained, and the patients were sorted into four cohorts based on their clinical history of viral hepatitis infection and alcohol consumption: (1) HBV+ HCV− drinker (n = 34); (2) HBV+ HCV− non-drinker (n = 109); (3) HBV− HCV+ non-drinker (n = 32); and (4) HBV+ HCV+ non-drinker (n = 47). To identify lncRNAs dysregulated by alcohol in the context of HBV, three differential expression analyses were performed using the Bioconductor package edgeR: (a) HBV+ HCV− drinker versus normal liver; (b) HBV+ HCV− non-drinker versus normal liver; and (c) HBV+ HCV− drinker versus HBV+ HCV− non-drinker (Figure 1A). A total of 53 lncRNA transcripts were found to be significantly dysregulated due to alcohol in the context of HBV (FDR < 0.05) (Figure 1B), and the top 10 most dysregulated lncRNAs due to alcohol use are shown in Table 1. To identify lncRNAs dysregulated by hepatitis virus, three differential expression analyses were performed: (d) HBV+ HCV− non-drinker versus normal liver; (e) HBV− HCV+ non-drinker versus normal liver; and (f) HBV+ HCV+ non-drinker versus normal liver. In total, 12,328, 20,873, and 19,194 lncRNA transcripts were differentially expressed in patients with HBV, HCV, and both HBV and HCV, respectively, compared to normal liver. Of these, 5,456 lncRNA transcripts were found to be commonly dysregulated in all three comparisons above (FDR < 0.05) (Figure 1C). The top 10 most dysregulated lncRNAs due to hepatitis virus are shown in Table 2. In addition, a pairwise differential expression analysis was performed on 50 tumor and adjacent normal pairs. A total of 184 lncRNAs were found to be significantly dysregulated (FDR < 0.05) in HCC tumors compared to adjacent normal tissue, 32 of which were implicated in the previous alcohol and/or hepatitis analyses. The top 10 most dysregulated lncRNAs in tumor samples are shown in Table 3.
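The selection of transcripts that are commonly dysregulated across several comparisons can be illustrated with a short Python sketch. This is not the pipeline used in the study (the differential expression itself was run in edgeR); it only shows the set logic of keeping transcripts that pass FDR < 0.05 in every comparison. The function names, transcript identifiers, and FDR values below are hypothetical and chosen for illustration only.

```python
def significant_ids(fdr_by_transcript, threshold=0.05):
    """Return the transcripts whose FDR is strictly below the threshold."""
    return {t for t, fdr in fdr_by_transcript.items() if fdr < threshold}


def commonly_dysregulated(*comparisons, threshold=0.05):
    """Return transcripts that are significant in every supplied comparison."""
    sets = [significant_ids(c, threshold) for c in comparisons]
    return set.intersection(*sets) if sets else set()


# Hypothetical FDR tables, one per comparison against normal liver.
hbv_vs_normal = {"lnc-A": 0.001, "lnc-B": 0.030, "lnc-C": 0.200}
hcv_vs_normal = {"lnc-A": 0.010, "lnc-B": 0.600, "lnc-C": 0.004}
both_vs_normal = {"lnc-A": 0.020, "lnc-B": 0.010, "lnc-C": 0.001}

common = commonly_dysregulated(hbv_vs_normal, hcv_vs_normal, both_vs_normal)
print(sorted(common))
```

A transcript missing from any one table is treated as not significant in that comparison, which matches the requirement that a transcript be dysregulated in all three analyses.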

In vitro validation of alcohol-dysregulated lncRNAs in liver cell lines
We chose two lncRNAs, lnc-CFP-1:1 and lnc-CD164L2-1:1, for further in vitro validation because of their strong correlations with both clinical outcomes and genomic alterations (Figures 2-5). Both of these lncRNAs were consistently downregulated in drinker HCC patients compared to normal livers (Figure 6A), with clusters of drinker HCC patients observed at low expression levels for both lncRNAs (Figure 6B). lnc-CFP-1:1, a 569-nt transcript on chromosome X, and lnc-CD164L2-1:1, a 1,000-nt transcript on chromosome 1, have not been previously characterized. To validate their dysregulation by alcohol, we treated the non-cancerous liver cell line L02 and the human hepatoma cell line Hep3B with 0.1% (17 mM), 0.3% (51 mM), and 1% (170 mM) ethanol for 7 days. Upon treatment with ethanol, both lncRNAs were significantly downregulated in a dose-dependent fashion in both cell lines, with lnc-CFP-1:1 reduced to less than 20% of its original expression at 170 mM ethanol.

DISCUSSION
We investigated lncRNAs dysregulated in alcohol-related HCC and HBV/HCV-related HCC to explore unique disease markers for different etiologies. lncRNAs have been documented as important functional molecules in the development of HBV/HCV-related HCC [17]. Our study identified 12,328 differentially expressed lncRNAs between HBV-infected patient samples and normal samples, 20,873 between HCV-positive samples and normal samples, and 19,194 between samples positive for both HBV and HCV and normal samples. The 5,456 lncRNAs that displayed significant differential expression in all three comparisons above were most likely to be highly involved in hepatitis induction of HCC. Despite the large numbers of lncRNAs that are dysregulated in patients with HBV/HCV-related HCC, the mechanisms of only a select few lncRNAs have been extensively studied [6,17-20]. None of the most significantly dysregulated lncRNAs we found has been previously studied in HCC, suggesting that our current picture of lncRNA dysregulation in HCC is incomplete.
[Figure caption fragment: ... on relative high and low expression of candidate lncRNAs proposed to be dysregulated by alcohol and/or viral hepatitis (Kaplan-Meier, p < 0.05).]
From our analysis, 53 lncRNAs were associated with alcohol-related HCC. We found no previous study that explored the relationship between lncRNAs and alcohol consumption in HCC. The reason may be that alcohol intake often leads to genetic dysregulation indirectly (by causing elevated acetaldehyde levels, accumulation of iron, chronic liver inflammation, or liver fibrosis/cirrhosis), whereas hepatitis viruses integrate viral DNA directly into the host genome to cause dysregulation of genetic mechanisms [21,22]. For example, alcohol consumption can lead to liver fibrosis, which can progress to liver cirrhosis. Approximately 90% of HCC cases arise in cirrhotic livers [23]. Knowledge of dysregulated lncRNAs in alcohol-related HCC may lead to development of prevention or treatment strategies for alcoholic cirrhosis, thereby arresting disease progression to HCC.
We correlated the expression of survival-related lncRNAs with mutation status and copy number variation to gain an understanding of their putative involvement in dysregulated genetic pathways in HCC. lnc-FABP6-4:1 upregulation correlated with the presence of TP53 and RB1 mutations. IGF2R mutation correlated with downregulation of lnc-CFP-1:1 and lnc-CD164L2-1:1. The lnc-CD164L2 gene locus resides in the intronic region of the protein-coding gene CD164, which has been found to act as a metastasis promoter in prostate cancer [24]. CTNNB1 mutation correlated with upregulation of lnc-HPS3-2:3. TP53, a tumor suppressor gene, is the most commonly mutated gene in cancer, including in HCC, while CTNNB1 is the most commonly mutated proto-oncogene in HCC [25]. CTNNB1 is an integral element of the Wnt signaling pathway, which is activated in HCC and has been shown to be partly regulated by lncRNAs such as lnc-DANCR [6]. RB1 is part of the RB1 tumor-suppressing pathway, which is commonly inactivated in many cancers, including HCC [26]. IGF2R is also a tumor suppressor involved in multiple cancers and functions in HCC by inhibiting liver cell invasion [27].
To validate our correlations, we treated cells with different concentrations of alcohol in vitro and measured expression levels of lnc-CD164L2-1:1 and lnc-CFP-1:1, which correlated strongly with patient survival and clinical variables. The decrease in their expression with increasing alcohol concentration in both HCC and normal cell lines matches the direction of dysregulation in our statistical analysis and supports our hypothesis that alcohol dysregulates lncRNA expression.
No previous study, to the best of our knowledge, has explored the mechanism of possible synergism between the risk factors of hepatitis infection and alcohol intake. Studies from Italy, Taiwan, Japan, and the United States have found increased risk of HCC development when patients with HCV and HBV infections regularly drink alcohol, compared to non-drinking patients with hepatitis [28][29][30][31]. Further investigation of the mechanisms of lncRNAs we identified to be involved in alcohol-related HCC and HBV/HCV-related HCC may provide insight into this mechanism of synergism, knowledge that can lead to better identification of high-risk individuals and provide focus for development of preventative methods.
The early diagnosis of hepatocellular carcinoma is critical to more effective treatment and better prognosis, since the few current treatment options for potentially curing HCC, such as liver transplantation, ablation, or resection, are effective only for the early stages of HCC [32]. Ultrasonography can be used for early diagnosis but has a sensitivity of only 60-80%, despite a specificity of 94% [33]. Biomarkers such as serum alpha-fetoprotein (AFP) and des-gamma carboxyprothrombin (DCP) have not proven useful as diagnostic tools because their sensitivity is similar to that of ultrasonography [1]. In contrast, studies using combinations of circulating lncRNAs, or lncRNAs with other types of RNAs, as diagnostic biomarkers for HCC achieved sensitivities of over 90% [34][35][36][37]. These studies suggest that lncRNAs are useful diagnostic markers, although the great majority compared an HCC cohort with a healthy liver cohort instead of taking specific risk factors into account [1]. Because different etiologies of HCC can result in dysregulation of different lncRNAs and disparate prognoses, our analysis of lncRNAs dysregulated in HCC induced by different risk factors can provide useful information about biomarkers present at the early stages of HCC.
Besides serving as diagnostic markers, lncRNAs can potentially be directly involved in the treatment of HCC. Direct delivery of tumor-suppressing lncRNAs into liver cells or reducing expression of oncogenic lncRNAs through siRNA interference can be potential treatment strategies [1]. These options provide alternatives to the aforementioned invasive procedures for curing HCC. Furthermore, current treatment practices for HCC only consider the stage of disease, not its molecular etiology [38]. The use of lncRNAs for treatment may also lead to more personalized treatments of HCC for better therapeutic response.

lncRNA differential expression analyses
lncRNA read counts were generated from RNA-sequencing datasets via BEDtools coverageBed (https://github.com/arq5x/bedtools2) [39] using lncRNA annotation files obtained from LNCipedia 3.0 (http://lncipedia.org/) [40], a database curating 113,438 lncRNA transcripts from sources including the Broad Institute, Ensembl, Gencode, Refseq, and NONCODE. The read count tables were imported into edgeR v3.0 (http://www.bioconductor.org/packages/release/bioc/html/edgeR.html) [41], and lowly expressed lncRNAs (counts-per-million <1 in more than one-half of samples) were filtered from the analysis. Following TMM normalization, pairwise designs were applied to identify significantly differentially expressed lncRNAs in (1) HBV+ HCV− drinker versus normal liver;

Association of lncRNA expression with patient survival and other clinical outcomes
Survival analyses were performed using the Kaplan-Meier method, with lncRNA expression in HCC tumors designated as a binary variable based on expression above or below the median. One patient with no clinical information was removed, and 221 HCC patients were retained for the analysis. Employing the Kruskal-Wallis test and lncRNA expression values (counts-per-million), we investigated the association of lncRNA expression with patient vital status, tumor grade, and pathologic stage. HCC patients in all four cohorts were used in this analysis.
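The median split and the Kaplan-Meier estimate described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the software used in the study: the function names are hypothetical, patients whose expression equals the median are assigned to the low group (the text does not specify how ties were handled), and no significance test between the two curves is included.

```python
from statistics import median


def median_split(expression):
    """Label each patient 'high' or 'low' relative to the median expression.

    Assumption: values equal to the median are labelled 'low'.
    """
    cutoff = median(expression.values())
    return {p: ("high" if v > cutoff else "low") for p, v in expression.items()}


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one death.

    times  : follow-up time per patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical counts-per-million values and follow-up data.
groups = median_split({"p1": 1.0, "p2": 5.0, "p3": 3.0, "p4": 9.0})
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(groups)
print(curve)
```

In practice, one curve would be computed for the high group and one for the low group, and the two would then be compared with a test chosen by the analyst.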

Association of lncRNA expression with tumor mutations and copy number aberrations
Mutation calls for the HCC tumors were obtained from mutation annotation format (MAF) files generated by the Broad Institute GDAC Firehose on 5 September 2016. We focused our analysis on the 10 most frequently mutated genes in HCCs, as determined by Debuire et al. [16]. Wilcoxon rank sum tests were employed to test for significant associations between lncRNA expression level (counts-per-million) and mutational status. Copy number variations for the TCGA tumors were obtained from the GISTIC2 pipeline in Broad GDAC Firehose on 6 July 2017. All significant (99% confidence) focal amplifications and deletions were analyzed for correlation to lncRNA expression level using Wilcoxon rank sum tests, followed by Benjamini-Hochberg correction of lncRNA p-values.
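The Benjamini-Hochberg correction mentioned above can be written in a few lines of Python. The sketch below is a generic implementation of the standard step-up procedure, given for illustration; it is not taken from the study's code, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four rank sum tests.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(adjusted)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller raw p-value never receives a larger adjusted value than a bigger one.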

Cell culture and treatments with ethanol
The non-cancerous liver cell line L02 and the human hepatoma cell line Hep3B were gifts from the Wang lab at University of Hong Kong. The cells were cultured in DMEM supplemented with 10% fetal bovine serum, 2% penicillin/streptomycin, and 2% L-glutamate (GIBCO) and maintained at 37°C in a humidified 5% CO2/95% air atmosphere. These cells were exposed to ethanol for 7 days. The doses used for ethanol treatment were 0.1%, 0.3%, and 1% by volume (approximate concentrations 17 mM, 51 mM, and 170 mM, respectively). We chose the 0.1% (17 mM) dose to represent social drinking habits, as 0.1% is the blood alcohol level constituting legal intoxication in the U.S. [42]. The 0.3% (51 mM) ethanol dose was used to simulate binge drinking habits, as it is representative of the blood alcohol levels of moderate to heavy drinkers [43]. The 1% (170 mM) ethanol dose, while potentially lethal in humans, was employed as an upper limit control. Treatment media was replaced every 24 hours with fresh media containing the stated ethanol concentration. The tissue culture plates were sealed with paraffin film to reduce evaporative loss of ethanol from the media.
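The correspondence between percent by volume and millimolar concentration quoted above can be checked with a short calculation. The sketch below assumes an ethanol density of 0.789 g/mL (near room temperature) and a molar mass of 46.07 g/mol; these constants and the function name are not stated in the text and are supplied here for illustration.

```python
ETHANOL_DENSITY_G_PER_ML = 0.789      # assumed density near 20 degrees C
ETHANOL_MOLAR_MASS_G_PER_MOL = 46.07  # assumed molar mass of C2H5OH


def percent_v_v_to_millimolar(percent):
    """Convert an ethanol concentration in % (v/v) to millimolar."""
    ml_ethanol_per_litre = percent * 10.0  # 1% v/v = 10 mL per litre
    grams_per_litre = ml_ethanol_per_litre * ETHANOL_DENSITY_G_PER_ML
    return grams_per_litre / ETHANOL_MOLAR_MASS_G_PER_MOL * 1000.0


for dose in (0.1, 0.3, 1.0):
    print(f"{dose}% v/v is about {percent_v_v_to_millimolar(dose):.0f} mM")
```

Under these assumptions the three doses come out near 17 mM, 51 mM, and 171 mM, consistent with the approximate values given in the text.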

RNA isolation and cDNA synthesis
Upon completion of alcohol treatments, cells were harvested, and total cell lysates were collected. RNA was extracted using SurePrep RNA Isolation kit (Thermo Fisher Scientific, Inc.). Complementary DNA was synthesized according to the manufacturer's protocol, using LncProfiler qPCR Array kit (catalogue no. RA900A-1; System Biosciences, Mountain View, CA, USA).